
Kang Feng
Mengzhao Qin

Symplectic Geometric Algorithms
for Hamiltonian Systems

With 62 Figures

ZHEJIANG PUBLISHING UNITED GROUP


ZHEJIANG SCIENCE AND TECHNOLOGY PUBLISHING HOUSE
Authors

Kang Feng (1920–1993)
Institute of Computational Mathematics and Scientific/Engineering Computing
Beijing 100190, China

Mengzhao Qin
Institute of Computational Mathematics and Scientific/Engineering Computing
Beijing 100190, China
Email: [email protected]

ISBN 978-7-5341-3595-8
Zhejiang Publishing United Group, Zhejiang Science and Technology Publishing House,
Hangzhou

ISBN 978-3-642-01776-6
ISBN 978-3-642-01777-3 (eBook)


Springer Heidelberg Dordrecht London New York

Library of Congress Control Number: 2009930026

© Zhejiang Publishing United Group, Zhejiang Science and Technology Publishing House,
Hangzhou and Springer-Verlag Berlin Heidelberg 2010
This work is subject to copyright. All rights are reserved, whether the whole or part of the material
is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilm or in any other way, and storage in data banks.
Duplication of this publication or parts thereof is permitted only under the provisions of the
German Copyright Law of September 9, 1965, in its current version, and permission for use must
always be obtained from Springer. Violations are liable to prosecution under the German Copyright
Law.
The use of general descriptive names, registered names, trademarks, etc. in this publication does not
imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.

Cover design: Frido Steinen-Broo, EStudio Calamar, Spain

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


“. . . In the late 1980s Feng Kang proposed and developed so-called symplectic algorithms for solving equations in Hamiltonian form. Combining theoretical analysis and computer experimentation, he showed that such methods, over long times, are much superior to standard methods. At the time of his death, he was at work on extensions of this idea to other structures . . . ”

Peter Lax

Cited from SIAM News November 1993


Kang Feng giving a talk at an international conference

“A basic idea behind the design of numerical schemes is that they can preserve the properties of the original problems as much as possible . . . Different representations for the same physical law can lead to different computational techniques in solving the same problem, which can produce different numerical results . . .”
Kang Feng (1920 – 1993)

Cited from a paper entitled “How to compute properly Newton’s equation of motion”
Prize certificate

Author’s photograph taken in Xi’an in 1989


Foreword

Kang Feng (1920–1993), Member of the Chinese Academy of Sciences, Professor and
Honorary Director of the Computing Center of the Chinese Academy of Sciences, was a
famous applied mathematician and the founder and pioneer of computational mathematics
and scientific computing in China.
It has been 16 years since my brother Kang Feng passed away. His scientific
achievements have been recognized more and more clearly over time, and his contri-
butions to various fields have become increasingly outstanding. In the spring of 1997,
Professor Shing-Tung Yau, a winner of the Fields Medal and a foreign member of the
Chinese Academy of Sciences, mentioned in a presentation at Tsinghua University,
entitled “The development of mathematics in China in my view”, that “there are three
main achievements through which modern Chinese mathematics has gone beyond, or hand
in hand with, the West. Of course, I am not saying that there are no other works, but I
mainly talk about the mathematics that is well known historically: Professor Shiingshen
Chern’s work on characteristic classes, Luogeng Hua’s work on the theory of functions of
several complex variables, and Kang Feng’s work on finite elements.” This high evaluation of
Kang Feng as a mathematician (not just a computational mathematician) sounds so
refreshing that many people talked about it and strongly agreed with it. At the end
of 1997, the Chinese National Natural Science Foundation presented Kang Feng et al.
with the first-class prize for his other major work, on symplectic algorithms for Hamiltonian
systems, a further recognition of his scientific achievements (see the certificate on the
previous page). As his brother, I am very pleased.
Achieving a major scientific breakthrough is a rare event. It requires vision, ability
and opportunity, all of which are indispensable. Kang Feng achieved two major
scientific breakthroughs in his life, both of which are very valuable and worthy of
mention. Firstly, from 1964 to 1965, he independently proposed the finite element
method and laid the foundation for its mathematical theory. Secondly, in 1984, he
proposed a symplectic algorithm for Hamiltonian systems. At present, scientific inno-
vation has become the focus of discussion. Kang Feng’s two scientific breakthroughs
may be treated as case studies in scientific innovation. It is worth emphasizing that
these breakthroughs were achieved in China by Chinese scientists. Careful study of
these has yet to be carried out by experts. Here I just describe some of my personal
feelings.
It should be noted that these breakthroughs resulted not only from the profound
mathematical knowledge of Kang Feng, but also from his expertise in classical physics
and engineering technology closely related to the projects. Scientific breakthroughs
are often cross-disciplinary. In addition, there is often a long period of time before a
breakthrough is made, not unlike the long time it takes for a baby to be born; it
requires the accumulation of results in small steps.

The opportunity for inventing the finite element method came from a national research
project, a computational problem in the design of the Liu Jia Xia dam. For
such a concrete problem, Kang Feng found a basis for solving the problem using
his sharp insight. In his view, a discrete computing method for a mathematical and
physical problem is usually carried out in four steps. Firstly, one needs to know and
define the physical mechanism. Secondly, one writes the appropriate differential equations
accordingly. Thirdly, one designs a discrete model. Finally, one develops the
numerical algorithm. However, due to the complexity of the geometry and physical
conditions, conventional methods are not always effective. Nonetheless, starting
from the physical conservation law or the variational principle of the matter, one can
pass directly to an appropriate discrete model. Combining the variational principle
with the spline approximation leads to the finite element method, which has a wide
range of adaptability and is particularly suited to dealing with the complex geometric
and physical conditions of engineering computation problems. In 1965, Kang Feng
published his paper entitled “Difference schemes based on the variational principle”,
which solved the basic theoretical issues of the finite element method, such as conver-
gence, error estimation, and stability. It laid the mathematical foundation for the finite
element method. This paper is the main evidence for recognition by the international
academic community of our independent development of the finite element method.
After the Chinese Cultural Revolution, he continued his research in finite elements
and related areas. During this period, he made several great achievements. I remember
that he talked with me about other issues, such as Thom’s catastrophe theory,
Prigogine’s theory of dissipative structures, solitons in water waves, the Radon transform,
and so on. These problems are related to physics and engineering technology.
Clearly he was exploring new areas and seeking a breakthrough. In the 1970s,
Arnold’s “Mathematical Methods of Classical Mechanics” came out. It described the
symplectic structure of Hamiltonian equations, which proved to be a great inspiration
to him and led to a breakthrough. Through his long-term experience in mathematical
computation, he fully realized that different mathematical expressions for the
same physical law, which are physically equivalent, can perform different functions
in scientific computing (his students later called this “Feng’s major theorem”).
In this way, for classical mechanics, Newton’s equations, Lagrangian equations and
Hamiltonian equations show different patterns of calculation after discretization.
Because the Hamiltonian formulation has a symplectic structure, he was keenly
aware that, if an algorithm maintains the geometric symmetry of symplecticity, it
will be possible to avoid the artificial dissipation that flaws conventional algorithms
and to design high-fidelity algorithms. Thus, he opened up a broad way for computational
methods for Hamiltonian systems. He called this way the “Hamiltonian way”.
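The contrast can be illustrated on the harmonic oscillator H(p, q) = (p² + q²)/2, the simplest Hamiltonian system (a minimal sketch in Python, added here for illustration, not taken from the book): the ordinary explicit Euler scheme multiplies the energy by (1 + h²) at every step, an artificial growth, while the symplectic Euler scheme keeps the energy bounded near its initial value for all time.

```python
# Harmonic oscillator H(p, q) = (p^2 + q^2)/2; the exact flow is a rotation.

def explicit_euler(p, q, h):
    # Non-symplectic: the step matrix has determinant 1 + h^2,
    # so the energy grows by the factor (1 + h^2) every step.
    return p - h * q, q + h * p

def symplectic_euler(p, q, h):
    # Symplectic: update p with the old q, then q with the NEW p;
    # the step matrix has determinant exactly 1.
    p_new = p - h * q
    return p_new, q + h * p_new

def energy(p, q):
    return 0.5 * (p * p + q * q)

h, steps = 0.1, 1000
pe, qe = 0.0, 1.0   # explicit Euler state
ps, qs = 0.0, 1.0   # symplectic Euler state
for _ in range(steps):
    pe, qe = explicit_euler(pe, qe, h)
    ps, qs = symplectic_euler(ps, qs, h)

print(energy(pe, qe))  # 0.5 * (1 + h**2)**steps: enormous artificial drift
print(energy(ps, qs))  # stays within a few percent of 0.5 forever
```

The symplectic step exactly conserves the modified quantity p² + q² − h·p·q, so its energy merely oscillates in a band of width O(h) around the true value instead of drifting; this bounded long-time behavior is the “high fidelity” referred to above.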
This computational method has been used in the calculation of orbits in celestial
mechanics, in calculations of particle paths in accelerators, as well as in molecular
dynamics. Later, the scope of its application expanded. For example, it has also
been widely used in studies of the atmosphere and earth sciences and elsewhere. It
has been effectively applied in solving the GPS observation operator, indicating that
Global Positioning System data can be dealt with in a timely manner; this algorithm
is 400 times more efficient than the traditional method. In addition, symplectic
algorithms have been successfully used in oil and gas exploration. Under the
influence of Kang Feng, international research on symplectic algorithms has flourished;
nearly 300 papers have been published in this field to date.
Kang Feng’s research work on symplectic algorithms has been well known and
recognized internationally for its uniqueness, innovation, systematic nature and wide
applicability, and for its theoretical integrity and fruitful results.
J.-L. Lions, a former President of the International Mathematical Union, said at
a workshop celebrating his 60th birthday: “This is another major innovation
made by Kang Feng, independent of the West, after the finite element method.” In
1993 one of the world’s leading mathematicians, P.D. Lax, a member of the US
National Academy of Sciences, wrote a memorial article dedicated to Kang Feng in SIAM
News, stating that “In the late 1980s, Kang Feng proposed and developed so-called
symplectic algorithms for solving evolution equations . . . Such methods, over a long
period, are much superior to standard methods.” J.E. Marsden, an internationally well-known
applied mathematician, visited the computing institute in the late 1980s and
had a long conversation with Kang Feng. Soon after Kang Feng’s death, he proposed
the multi-symplectic algorithm and extended the long-time stability properties of the
symplectic algorithm to infinite-dimensional Hamiltonian systems.
On the occasion of the commemoration of the 16th anniversary of Kang Feng’s
death and the 89th anniversary of his birth, I think it is especially worthwhile to praise
and promote what was embodied in the lifetime’s work of Kang Feng: “independence
in spirit, freedom in thinking”.¹ Now everyone is talking about scientific innovation,
which needs talented people to accomplish it. What type of person is needed
most? A person who is just a parrot, or one with “independence in spirit, freedom in
thinking”? The conclusion is self-evident. Scientific innovation also requires a strong
academic atmosphere. Is it determined by only one person or by all of the team members?
This is also self-evident. From Kang Feng’s scientific career, we can easily find that the key
to scientific innovation is “independence in spirit, freedom in thinking”,
and that this needs to be allowed to develop and expand.
Kang Feng had planned to write a monograph on symplectic algorithms for
Hamiltonian systems. He had accumulated some manuscripts, but was unable to complete
the book before his untimely death from illness. Fortunately, his students and Professor
Mengzhao Qin (see the photo on the previous page), one of his early collaborators,
spent 15 years and finally completed this book based on Kang Feng’s plan, realizing
his wish. It is not only an authoritative exposition of this research field, but also an

¹ Engraved by Yinke Chen on a stele in 1929 in memory of Guowei Wang, on the campus of
Tsinghua University.

exposition of the academic thought of a master of science, which gives an example of
how an original and innovative scientific discovery is initiated and developed from
beginning to end in China.
We would also like to thank Zhejiang Science and Technology Publishing House,
which made a great contribution to the Chinese scientific cause through the publication
of this manuscript.
Although Kang Feng died 16 years ago, his scientific legacy has been inherited and
developed by the younger generation of scientists. His scientific spirit and thought still
elicit care, thought, and resonance in us. He still lives in our hearts.

Duan, Feng
Member of Chinese
Academy of Sciences
Nanjing University
Nanjing
September 20, 2009
Preface

It has been 16 years since Kang Feng passed away. It is our honor to publish the En-
glish version of Symplectic Algorithm for Hamiltonian Systems, so that more readers
can see the history of the development of symplectic algorithms. In particular, after
the death of Kang Feng, the development of symplectic algorithms became more so-
phisticated and there have been a series of monographs published in this area, e.g.,
J.M. Sanz-Serna and M.P. Calvo’s Numerical Hamiltonian Problems, published in 1994 by
Chapman and Hall; E. Hairer, C. Lubich and G. Wanner’s Geometric
Numerical Integration, published in 2001 by Springer-Verlag; and B. Leimkuhler
and S. Reich’s Simulating Hamiltonian Dynamics, published in 2004 by Cambridge
University Press. The symplectic algorithm has been developed from ordinary differential
equations to partial differential equations, and from the symplectic structure to the
multi-symplectic structure. This is largely due to the promotion of this work by J.
Marsden in the USA and by T. Bridges and others in Britain. Starting from the symplectic
structure, J. Marsden first developed the Lagrangian symplectic structure, and then
the multi-symplectic structure. He finally proposed a symplectic structure that meets
the requirement of the Lagrangian form from the variational principle by giving up
the boundary conditions. On the other hand, T. Bridges and others used the multi-symplectic
structure to derive directly the multi-symplectic Hamilton equations, and
then constructed difference schemes that preserve the symplectic structure in both
time and space. The two approaches can be regarded as equivalent in the algorithmic sense.
Now, in this monograph, most of the content refers only to ordinary differential
equations. Kang Feng and his research group working on symplectic algorithms
did some foundational work. In particular, I would like to point out three negative
theorems: “non-existence of energy-preserving schemes”, “non-existence of multistep
linear symplectic schemes”, and “non-existence of volume-preserving schemes
in the form of rational fractions”. In addition, generating function theory is not only
rich in content for analytical mechanics and Hamilton–Jacobi equations; at the same time,
it provides a tool for constructing symplectic difference schemes of any order of accuracy.
The formal power series proposed by Kang Feng had a profound impact on
the later work on “backward error analysis”, the “modified equation” and the “modified
integrator”.
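The flavor of the “modified equation” idea can be shown on the scalar test equation y′ = −y (a minimal Python sketch, added here for illustration, not from the book): the explicit Euler points do not lie on a solution of the original equation, but they lie exactly on a solution of the nearby modified equation y′ = λ̃y, where λ̃ = ln(1 − h)/h ≈ −1 − h/2.

```python
import math

# Explicit Euler for y' = -y multiplies y by (1 - h) each step, so
#   y_n = (1 - h)**n * y0 = exp(n * h * lam_mod) * y0,
# with lam_mod = log(1 - h) / h: the numerical points solve the
# modified equation y' = lam_mod * y *exactly*.

h, n, y0 = 0.1, 50, 1.0

y = y0
for _ in range(n):
    y -= h * y                      # one explicit Euler step

lam_mod = math.log(1.0 - h) / h     # about -1.0536 for h = 0.1
y_mod = y0 * math.exp(lam_mod * n * h)

print(abs(y - y_mod))               # zero up to rounding error
```

For a symplectic scheme applied to a Hamiltonian system, the analogous statement is that the numerical flow is, formally, the exact flow of a modified Hamiltonian; this is the “formal energy” studied later in the book.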
The symplectic algorithm developed very quickly and was soon extended to geometric
methods. Structure-preserving algorithms (preserving not only geometric
structure, but also physical structure, etc.) include algorithms that preserve an algebraic
structure, such as Lie group algorithms, and algorithms that preserve the differential complex.
Many other prominent people have contributed to the symplectic method in addition
to those mentioned above. For the various methods related to structure-preserving
algorithms and other important contributions, readers are referred to R. McLachlan
and G.R.W. Quispel’s “Geometric integration for ODEs” and T. Bridges and S. Reich’s
“Numerical methods for Hamiltonian PDEs”.
The book describes the symplectic geometric algorithms and theoretical basis for
a number of related algorithms. Most of the contents are a collection of lectures given
by Kang Feng at Beijing University. Most of the other sections are based on papers
written by group members.
Compared to the previous Chinese version, the present English one has been improved
in the following respects. First of all, a number of errors and mistakes contained
in the Chinese version have been corrected. In addition, parts of Chapters 1 and 2
were removed, while some new content was added to Chapters 4, 7, 8, 9 and 10.
More importantly, four new chapters, Chapters 13 to 16, were added. Chapter 13 is
devoted to the KAM theorem for symplectic algorithms. We invited Professor Zaijiu
Shang, a former PhD student of Kang Feng,
to compose this chapter. Chapter 14 is called Variational Integrators. It reflects
the work of the Nobel Prize winner Professor Zhengdao Li (T.D. Lee), who in the 1980s
proposed energy-preserving variational integrators, although it was not explained at that
time that they possess a Lagrangian symplectic structure. This line of work connects
with the variational integrators proposed by J. Marsden, who hoped in this way
to link up with the finite element method. Chapter 15 is about Birkhoffian Systems,
describing a class of dissipative structures for Birkhoffian systems and schemes that preserve
the dissipation of the Birkhoff structure. Chapter 16 is devoted to Multisymplectic
and Variational Integrators, providing a summary of the widespread applications of
multisymplectic integrators to infinite-dimensional Hamiltonian systems.
We would also like to thank every member of Kang Feng’s research group
for symplectic algorithms: Huamo Wu, Daoliu Wang, Zaijiu Shang, Yifa Tang, Jialin
Hong, Wangyao Li, Min Chen, Shuanghu Wang, Pingfu Zhao, Jingbo Chen, Yushun
Wang, Yajuan Sun, Hongwei Li, Jianqiang Sun, Tingting Liu, Hongling Su, Yimin
Tian; and those who have been to the USA: Zhong Ge, Chunwang Li, Yuhua Wu,
Meiqing Zhang, Wenjie Zhu, Shengtai Li, Lixin Jiang, and Haibin Shu. They made
contributions to the symplectic algorithm over different periods of time.
The authors would also like to thank the National Natural Science Foundation, the
National Climbing Program projects, and the State’s Key Basic Research Projects for
their financial support. Finally, the authors would also like to thank the Mathematics
and Systems Science Research Institute of the Chinese Academy of Sciences, the
Computational Mathematics and Computational Science and Engineering Institute,
and the State Key Laboratory of Computational Science and Engineering for their
support.
The editors of this book have received help from E. Hairer, who provided a tem-
plate from Springer publishing house. I would also like to thank F. Holzwarth at
Springer publishing house and Linbo Zhang of our institute, and others who helped
me successfully publish this book.
For the English translation, I thank Dr. Shengtai Li for comprehensive proofreading
and polishing, and Miss Yi Jin for editing. For the publication of the English version,
I would also like to thank the Institute of Mathematics of the Chinese Academy of
Sciences for its help. Because Kang Feng has passed away, it may not be possible
to provide a comprehensive representation of his academic thought, and the book will
inevitably contain some errors. I accept responsibility for any errors and welcome
criticism and corrections.

We would also like to thank Springer Beijing Representation Office and Zhejiang
Science and Technology Publishing House, which made a great contribution to the
Chinese scientific cause through the publication of this manuscript. We are especially
grateful to Lisa Fan, W. Y. Zhou, L. L. Liu and X. M. Lu for carefully reading the
manuscript and finding some misprints, wrong signs and other mistakes.
This book was supported by the National Natural Science Foundation of China
(Grant No. G10871099), by the Project of the National 863 Plan of China (Grant
No. 2006AA09A102-08), and by the National Basic Research Program of China
(973 Program, Grant No. 2007CB209603).

Mengzhao Qin
Institute of Computational
Mathematics and Scientific
Engineering Computing
Beijing
September 20, 2009
Contents

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

1. Preliminaries of Differentiable Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . 39


1.1 Differentiable Manifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1.1.1 Differentiable Manifolds and Differentiable Mapping . . . . 40
1.1.2 Tangent Space and Differentials . . . . . . . . . . . . . . . . . . . . . . 43
1.1.3 Submanifolds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1.1.4 Submersion and Transversal . . . . . . . . . . . . . . . . . . . . . . . . . 51
1.2 Tangent Bundle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
1.2.1 Tangent Bundle and Orientation . . . . . . . . . . . . . . . . . . . . . . 56
1.2.2 Vector Field and Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
1.3 Exterior Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
1.3.1 Exterior Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
1.3.2 Exterior Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
1.4 Foundation of Differential Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
1.4.1 Differential Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
1.4.2 The Behavior of Differential Forms under Maps . . . . . . . . 80
1.4.3 Exterior Differential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
1.4.4 Poincaré Lemma and Its Inverse Lemma . . . . . . . . . . . . . . . 84
1.4.5 Differential Form in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
1.4.6 Hodge Duality and Star Operators . . . . . . . . . . . . . . . . . . . . 88
1.4.7 Codifferential Operator δ . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
1.4.8 Laplace–Beltrami Operator . . . . . . . . . . . . . . . . . . . . . . . . . . 90
1.5 Integration on a Manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
1.5.1 Geometrical Preliminary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
1.5.2 Integration and Stokes Theorem . . . . . . . . . . . . . . . . . . . . . . 93
1.5.3 Some Classical Theories on Vector Analysis . . . . . . . . . . . 96
1.6 Cohomology and Homology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
1.7 Lie Derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
1.7.1 Vector Fields as Differential Operator . . . . . . . . . . . . . . . . . 99
1.7.2 Flows of Vector Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
1.7.3 Lie Derivative and Contraction . . . . . . . . . . . . . . . . . . . . . . . 103
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

2. Symplectic Algebra and Geometry Preliminaries . . . . . . . . . . . . . . . . . . 113


2.1 Symplectic Algebra and Orthogonal Algebra . . . . . . . . . . . . . . . . . . . 113
2.1.1 Bilinear Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
2.1.2 Sesquilinear Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
2.1.3 Scalar Product, Hermitian Product . . . . . . . . . . . . . . . . . . . . 117
2.1.4 Invariant Groups for Scalar Products . . . . . . . . . . . . . . . . . . 119
2.1.5 Real Representation of Complex Vector Space . . . . . . . . . 121
2.1.6 Complexification of Real Vector Space and Real Linear Transformation . . . . . . . . . . . . . . . 123
2.1.7 Lie Algebra for GL(n, F) . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
2.2 Canonical Reductions of Bilinear Forms . . . . . . . . . . . . . . . . . . . . . . 128
2.2.1 Congruent Reductions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
2.2.2 Congruence Canonical Forms of Conformally Symmetric and Hermitian Matrices . . . . . . . . . . . . . . . 130
2.2.3 Similar Reduction to Canonical Forms under Orthogonal Transformation . . . . . . . . . . . . . . . 134
2.3 Symplectic Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
2.3.1 Symplectic Space and Its Subspace . . . . . . . . . . . . . . . . . . . 137
2.3.2 Symplectic Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
2.3.3 Lagrangian Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
2.3.4 Special Types of Sp(2n) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
2.3.5 Generators of Sp(2n) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
2.3.6 Eigenvalues of Symplectic and Infinitesimal Matrices . . . 158
2.3.7 Generating Functions for Lagrangian Subspaces . . . . . . . . 160
2.3.8 Generalized Lagrangian Subspaces . . . . . . . . . . . . . . . . . . . 162
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164

3. Hamiltonian Mechanics and Symplectic Geometry . . . . . . . . . . . . . . . . . 165


3.1 Symplectic Manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
3.1.1 Symplectic Structure on Manifolds . . . . . . . . . . . . . . . . . . . 165
3.1.2 Standard Symplectic Structure on Cotangent Bundles . . . . 166
3.1.3 Hamiltonian Vector Fields . . . . . . . . . . . . . . . . . . . . . . . . . . 167
3.1.4 Darboux Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
3.2 Hamiltonian Mechanics on R2n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
3.2.1 Phase Space on R2n and Canonical Systems . . . . . . . . . . . 169
3.2.2 Canonical Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 172
3.2.3 Poisson Bracket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
3.2.4 Generating Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
3.2.5 Hamilton–Jacobi Equations . . . . . . . . . . . . . . . . . . . . . . . . . 182
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

4. Symplectic Difference Schemes for Hamiltonian Systems . . . . . . . . . . . . 187


4.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
4.1.1 Element and Notation for Hamiltonian Mechanics . . . . . . 187

4.1.2 Geometrical Meaning of Preserving Symplectic Structure ω . . . . . . . . . . . . . . . 189
4.1.3 Some Properties of a Symplectic Matrix . . . . . . . . . . . . . . . 190
4.2 Symplectic Schemes for Linear Hamiltonian Systems . . . . . . . . . . . 192
4.2.1 Some Symplectic Schemes for Linear Hamiltonian Systems . . . . . . . . . . . . . . . 192
4.2.2 Symplectic Schemes Based on Padé Approximation . . . . . 193
4.2.3 Generalized Cayley Transformation and Its Application . . 197
4.3 Symplectic Difference Schemes for a Nonlinear Hamiltonian System . . . . . . . . . . . . . . . 200
4.4 Explicit Symplectic Scheme for Hamiltonian System . . . . . . . . . . . . 203
4.4.1 Systems with Nilpotent of Degree 2 . . . . . . . . . . . . . . . . . . 204
4.4.2 Symplectically Separable Hamiltonian Systems . . . . . . . . . 205
4.4.3 Separability of All Polynomials in R2n . . . . . . . . . . . . . . . 207
4.5 Energy-conservative Schemes by Hamiltonian Difference . . . . . . . . 209
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211

5. The Generating Function Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213


5.1 Linear Fractional Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
5.2 Symplectic, Gradient Mapping and Generating Function . . . . . . . . 215
5.3 Generating Functions for the Phase Flow . . . . . . . . . . . . . . . . . . . . . 221
5.4 Construction of Canonical Difference Schemes . . . . . . . . . . . . . . . . . 226
5.5 Further Remarks on Generating Function . . . . . . . . . . . . . . . . . . . . . . 231
5.6 Conservation Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
5.7 Convergence of Symplectic Difference Schemes . . . . . . . . . . . . . . . . 239
5.8 Symplectic Schemes for Nonautonomous System . . . . . . . . . . . . . . . 242
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247

6. The Calculus of Generating Functions and Formal Energy . . . . . . . . . . 249


6.1 Darboux Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
6.2 Normalization of Darboux Transformation . . . . . . . . . . . . . . . . . . . . . 251
6.3 Transform Properties of Generator Maps and Generating Functions 255
6.4 Invariance of Generating Functions and Commutativity of Generator Maps . . . . . . . . . . . . . . . 261
6.5 Formal Energy for Hamiltonian Algorithm . . . . . . . . . . . . . . . . . . . . 264
6.6 Ge–Marsden Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275

7. Symplectic Runge–Kutta Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277


7.1 Multistage Symplectic Runge–Kutta Method . . . . . . . . . . . . . . . . . . . 277
7.1.1 Definition and Properties of Symplectic R–K Method . . . . 277
7.1.2 Symplectic Conditions for R–K Method . . . . . . . . . . . . . . . 281
7.1.3 Diagonally Implicit Symplectic R–K Method . . . . . . . . . . 284
7.1.4 Rooted Tree Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
7.1.5 Simplified Conditions for Symplectic R–K Method . . . . . 297

7.2 Symplectic P–R–K Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302


7.2.1 P–R–K Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302
7.2.2 Simplified Order Conditions of Explicit Symplectic R–K
Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
7.3 Symplectic R–K–N Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
7.3.1 Order Conditions for Symplectic R–K–N Method . . . . . . . 319
7.3.2 The 3-Stage and 4th Order Symplectic R–K–N Method . 323
7.3.3 Simplified Order Conditions for Symplectic R–K–N
Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
7.4 Formal Energy for Symplectic R–K Method . . . . . . . . . . . . . . . . . . . 333
7.4.1 Modified Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
7.4.2 Formal Energy for Symplectic R–K Method . . . . . . . . . . . 339
7.5 Definition of a(t) and b(t) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
7.5.1 Centered Euler Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
7.5.2 Gauss–Legendre Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
7.5.3 Diagonal Implicit R–K Method . . . . . . . . . . . . . . . . . . . . . . 347
7.6 Multistep Symplectic Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
7.6.1 Linear Multistep Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
7.6.2 Symplectic LMM for Linear Hamiltonian Systems . . . . . . 348
7.6.3 Rational Approximations to Exp and Log Function . . . . . . 352
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 357

8. Composition Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365


8.1 Construction of Fourth Order with 3-Stage Scheme . . . . . . . . . . . . . 365
8.1.1 For Single Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
8.1.2 For System of Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
8.2 Adjoint Method and Self-Adjoint Method . . . . . . . . . . . . . . . . . . . . . 372
8.3 Construction of Higher Order Schemes . . . . . . . . . . . . . . . . . . . . . . . 377
8.4 Stability Analysis for Composition Scheme . . . . . . . . . . . . . . . . . . . . 388
8.5 Application of Composition Schemes to PDE . . . . . . . . . . . . . . . . . . 396
8.6 H-Stability of Hamiltonian System . . . . . . . . . . . . . . . . . . . . . . . . . . . 401
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405

9. Formal Power Series and B-Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407


9.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
9.2 Near-0 and Near-1 Formal Power Series . . . . . . . . . . . . . . . . . . . . . . . 409
9.3 Algorithmic Approximations to Phase Flows . . . . . . . . . . . . . . . . . . . 414
9.3.1 Approximations of Phase Flows and Numerical Method . 414
9.3.2 Typical Algorithm and Step Transition Map . . . . . . . . . . . . 415
9.4 Related B-Series Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
9.4.1 The Composition Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
9.4.2 Substitution Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 432
9.4.3 The Logarithmic Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441

10. Volume-Preserving Methods for Source-Free Systems . . . . . . . . . . . . . . 443


10.1 Liouville’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443
10.2 Volume-Preserving Schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
10.2.1 Conditions for Centered Euler Method to be Volume
Preserving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 444
10.2.2 Separable Systems and Volume-Preserving Explicit Meth-
ods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
10.3 Source-Free System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
10.4 Obstruction to Analytic Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 450
10.5 Decompositions of Source-Free Vector Fields . . . . . . . . . . . . . . . . . . 452
10.6 Construction of Volume-Preserving Schemes . . . . . . . . . . . . . . . . . . . 454
10.7 Some Special Discussions for Separable Source-Free Systems . . . . 458
10.8 Construction of Volume-Preserving Scheme via Generating Func-
tion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
10.8.1 Fundamental Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
10.8.2 Construction of Volume-Preserving Schemes . . . . . . . . . . . 464
10.9 Some Volume-Preserving Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . 467
10.9.1 Volume-Preserving R–K Methods . . . . . . . . . . . . . . . . . . . . 467
10.9.2 Volume-Preserving 2-Stage P–R–K Methods . . . . . . . . . . . 471
10.9.3 Some Generalizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 473
10.9.4 Some Explanations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 476

11. Contact Algorithms for Contact Dynamical Systems . . . . . . . . . . . . . . . 477


11.1 Contact Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 477
11.1.1 Basic Concepts of Contact Geometry . . . . . . . . . . . . . . . . . 477
11.1.2 Contact Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
11.2 Contactization and Symplectization . . . . . . . . . . . . . . . . . . . . . . . . . . 484
11.3 Contact Generating Functions for Contact Maps . . . . . . . . . . . . . . . . 488
11.4 Contact Algorithms for Contact Systems . . . . . . . . . . . . . . . . . . . . . . 492
11.4.1 Q Contact Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
11.4.2 P Contact Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
11.4.3 C Contact Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 493
11.5 Hamilton–Jacobi Equations for Contact Systems . . . . . . . . . . . . . . . 494
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497

12. Poisson Bracket and Lie–Poisson Schemes . . . . . . . . . . . . . . . . . . . . . . . . 499


12.1 Poisson Bracket and Lie–Poisson Systems . . . . . . . . . . . . . . . . . . . . . 499
12.1.1 Poisson Bracket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
12.1.2 Lie–Poisson Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501
12.1.3 Introduction of the Generalized Rigid Body Motion . . . . . 505
12.2 Constructing Difference Schemes for Linear Poisson Systems . . . . 507
12.2.1 Constructing Difference Schemes for Linear Poisson
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 508

12.2.2 Construction of Difference Schemes for General Pois-
son Manifold . . . . . . . . . . . . . . . . . . . . . . . . . . . 509
12.2.3 Answers of Some Questions . . . . . . . . . . . . . . . . . . . . . . . . . 511
12.3 Generating Function and Lie–Poisson Scheme . . . . . . . . . . . . . . . . . 514
12.3.1 Lie–Poisson–Hamilton–Jacobi (LPHJ) Equation and Gen-
erating Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
12.3.2 Construction of Lie–Poisson Schemes via Generating
Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
12.4 Construction of Structure Preserving Schemes for Rigid Body . . . . 523
12.4.1 Rigid Body in Euclidean Space . . . . . . . . . . . . . . . . . . . . . . 523
12.4.2 Energy-Preserving and Angular Momentum-Preserving
Schemes for Rigid Body . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525
12.4.3 Orbit-Preserving and Angular-Momentum-Preserving Ex-
plicit Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 527
12.4.4 Lie–Poisson Schemes for Free Rigid Body . . . . . . . . . . . . 530
12.4.5 Lie–Poisson Scheme on Heavy Top . . . . . . . . . . . . . . . . . . . 535
12.4.6 Other Lie–Poisson Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 538
12.5 Relation Among Some Special Group and Its Lie Algebra . . . . . . . . 543
12.5.1 Relation Among SO(3), so(3) and SH1, SU(2) . . . . . . 543
12.5.2 Representations of Some Functions in SO(3) . . . . . . . . . . 545
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547

13. KAM Theorem of Symplectic Algorithms . . . . . . . . . . . . . . . . . . . . . . . . 549


13.1 Brief Introduction to Stability of Geometric Numerical Algorithms 549
13.2 Mapping Version of the KAM Theorem . . . . . . . . . . . . . . . . . . . . . . 551
13.2.1 Formulation of the Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 551
13.2.2 Outline of the Proof of the Theorems . . . . . . . . . . . . . . . . . 554
13.2.3 Application to Small Twist Mappings . . . . . . . . . . . . . . . . . 558
13.3 KAM Theorem of Symplectic Algorithms for Hamiltonian Systems . . 559
13.3.1 Symplectic Algorithms as Small Twist Mappings . . . . . . . 560
13.3.2 Numerical Version of KAM Theorem . . . . . . . . . . . . . . . . . 564
13.4 Resonant and Diophantine Step Sizes . . . . . . . . . . . . . . . . . . . . . . . . . 568
13.4.1 Step Size Resonance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
13.4.2 Diophantine Step Sizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 569
13.4.3 Invariant Tori and Further Remarks . . . . . . . . . . . . . . . . . . . 574
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 578

14. Lee-Variational Integrator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581


14.1 Total Variation in Lagrangian Formalism . . . . . . . . . . . . . . . . . . . . . . 581
14.1.1 Variational Principle in Lagrangian Mechanics . . . . . . . . . 581
14.1.2 Total Variation for Lagrangian Mechanics . . . . . . . . . . . . . 583
14.1.3 Discrete Mechanics and Variational Integrators . . . . . . . . . 586
14.1.4 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 591
14.2 Total Variation in Hamiltonian Formalism . . . . . . . . . . . . . . . . . . . . . 591
14.2.1 Variational Principle in Hamiltonian Mechanics . . . . . . . . 591

14.2.2 Total Variation in Hamiltonian Mechanics . . . . . . . . . . . . . 593


14.2.3 Symplectic-Energy Integrators . . . . . . . . . . . . . . . . . . . . . . . 596
14.2.4 High Order Symplectic-Energy Integrator . . . . . . . . . . . . . 600
14.2.5 An Example and an Optimization Method . . . . . . . . . . . . . 603
14.2.6 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
14.3 Discrete Mechanics Based on Finite Element Methods . . . . . . . . . . . 606
14.3.1 Discrete Mechanics Based on Linear Finite Element . . . . . 606
14.3.2 Discrete Mechanics with Lagrangian of High Order . . . . . 608
14.3.3 Time Steps as Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 613
14.3.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 614
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 615

15. Structure Preserving Schemes for Birkhoff Systems . . . . . . . . . . . . . . . . 617


15.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
15.2 Birkhoffian Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 618
15.3 Generating Functions for K(z, t)-Symplectic Mappings . . . . . . . . . 621
15.4 Symplectic Difference Schemes for Birkhoffian Systems . . . . . . . . . 625
15.5 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
15.6 Numerical Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 639

16. Multisymplectic and Variational Integrators . . . . . . . . . . . . . . . . . . . . . . 641


16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
16.2 Multisymplectic Geometry and Multisymplectic Hamiltonian Sys-
tems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 642
16.3 Multisymplectic Integrators and Composition Methods . . . . . . . . . . 646
16.4 Variational Integrators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 652
16.5 Some Generalizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 658

Symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 669
Introduction

The main theme of modern scientific computing is the numerical solution of the various
differential equations of mathematical physics bearing names such as Newton, Euler,
Lagrange, Laplace, Navier–Stokes, Maxwell, Boltzmann, Einstein, Schrödinger,
Yang–Mills, etc. At the top of the list stands the most celebrated of all, Newton's
equation of motion. The historical, theoretical, and practical importance of Newton's
equation hardly needs any comment, and the same is true of its numerical solution.
Starting from Euler and continuing right down to the present computer age, a
great wealth of scientific literature on numerical methods for differential equations
has accumulated, and a great variety of algorithms, software packages, and even
expert systems has been developed. With the development of modern mechanics,
physics, chemistry, and biology, it is now undisputed that almost all physical processes,
whether classical, quantum, or relativistic, can be represented as Hamiltonian
systems. It is therefore important to solve Hamiltonian systems correctly.

1. Numerical Method for the Newton Equation of Motion
In the spring of 1991, the first author [Fen92b] presented a plenary talk at the Annual
Physics Conference of China in Beijing on how to compute accurate numerical
solutions of Newton's classical equation of motion.
It is well known that the numerical solution of the equations of mathematical
physics has become a main topic of modern scientific computation, and the Newton
equation of motion is among the most prominent of these equations. It can be formulated
as a system of second-order ordinary differential equations, f = ma = mẍ. Computational
methods for differential equations advanced slowly in earlier times, owing to the
limitations of the historical conditions, but great progress has been made since Euler,
with contributions from Adams, Runge, Kutta, Störmer, and others. This is especially
true since the introduction of the modern computer, for which many algorithms and
software packages have been developed. It is said that the three-body problem is no
longer a challenging problem and can be easily computed. Nevertheless, we propose
the following two questions:
1◦ Are the current numerical algorithms suitable for solving the Newton equation
of motion?
2◦ How can one solve the Newton equation of motion more accurately?
It seems that nobody has thought seriously about the first question, which may be
why the second has never been studied systematically. In this book we study mainly
the fundamental, but more difficult, Newton equation of motion in conservative form.
First, the conservative Newton equation has two equivalent mathematical
representations: a Lagrangian variational form and a Hamiltonian form. The
latter transforms the second-order differential equations in physical space into a system
of first-order canonical equations in phase space. Different representations of
the same physical law can lead to different computational techniques for the same
problem, which in turn can produce different numerical results. Making a wise and
reasonable choice among the various equivalent mathematical representations is
therefore extremely important for solving the problem correctly.
We choose the Hamiltonian formulation as our basic form, first because the
Hamiltonian equations have a clean and symmetric form in which the physical laws
of motion are easily expressed. Secondly, the Hamiltonian formulation is more general
and universal than the Newtonian one: it covers classical, relativistic, and quantum
processes, in finite or infinite dimensions, whenever dissipation can be neglected.
Successful numerical methods for Hamiltonian equations therefore promise broad
development and application. It is thus very surprising that, after searching the
literature, we found numerical algorithms designed specifically for Hamiltonian
equations to be almost nonexistent. This motivated us to study the problem carefully
and to seek answers to the two questions above.
Our approach uses symplectic geometry, the geometry of phase space. It is based
on an antisymmetric area metric, in contrast to the symmetric length metrics of
Euclidean and Riemannian geometry. The basic theorem of classical mechanics can
be stated as follows: the dynamical evolution of every Hamiltonian system preserves
the symplectic metric, i.e., it is a symplectic (canonical) transformation. Hence a
correct discretization of a Hamiltonian system should itself be a symplectic
transformation. Such algorithms are called symplectic (canonical) algorithms, or
Hamiltonian algorithms. We have deliberately developed and analyzed Hamiltonian
algorithms within the symplectic structure, and experience has proved this approach
correct and fruitful. We have derived a series of symplectic algorithms, established
their properties, laid out their theoretical foundation, and tested them with
extremely demanding numerical experiments.
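A symplectic map Φ is characterized by (DΦ)ᵀJ(DΦ) = J, where J is the standard antisymmetric matrix. For the centered Euler (implicit midpoint) scheme applied to a linear Hamiltonian system ż = JSz with symmetric S, the step map is the Cayley-type matrix M = (I − (h/2)JS)⁻¹(I + (h/2)JS), and the symplecticity condition MᵀJM = J can be checked numerically. A minimal sketch (the choices S = I and h = 0.4 below are ours, for illustration):

```python
import numpy as np

# Standard symplectic structure matrix J in R^2 (one degree of freedom).
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

# Quadratic Hamiltonian H(z) = 0.5 * z^T S z; S = I gives the harmonic oscillator.
S = np.eye(2)
h = 0.4  # step size (illustrative)

# Centered Euler: z1 = z0 + h*J*S*(z0 + z1)/2, i.e. z1 = M z0 with
# M = (I - h/2 J S)^{-1} (I + h/2 J S), a Cayley transform.
A = J @ S
I = np.eye(2)
M = np.linalg.solve(I - 0.5 * h * A, I + 0.5 * h * A)

# Symplecticity check: M^T J M = J, up to rounding.
print(np.allclose(M.T @ J @ M, J))  # True
```

The same check fails for a generic one-step map, e.g. the explicit Euler matrix I + hJS, whose transformed form differs from J at order h².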
To compare symplectic and non-symplectic algorithms, we carried out eight numerical
experiments: the harmonic oscillator, the nonlinear Duffing oscillator, the Huygens
oscillator, the Cassini oscillator, two-dimensional multi-crystal and semi-crystal
lattice steady flow, Lissajous figures, geodesic flow on an ellipsoidal surface, and
Kepler motion. The experiments demonstrate the superiority of the symplectic
algorithms: all the traditional non-symplectic algorithms fail without exception,
regardless of their accuracy, especially in preserving global and structural properties
and in long-term tracking, whereas all the symplectic algorithms passed the tests
with stable long-term tracking capability.
Almost all traditional algorithms are non-symplectic, with few exceptions. They are
designed for asymptotically stable systems, which possess a dissipation mechanism
that maintains stability, whereas Hamiltonian systems are not asymptotically stable.
Hence these algorithms inevitably contain artificial numerical dissipation, fake
attractors, and other parasitic effects foreign to Hamiltonian systems, all of which
seriously twist and distort the numerical results. They can be used in short-term
transient simulation, but are not suitable, and can lead to wrong
conclusions, for long-term tracking and the study of global structural properties.
Since the Newton equation is equivalent to a Hamiltonian equation, the answer to
the first question is "No", which is quite contrary to expectation.
The symplectic algorithm contains no artificial dissipation, so it inherently avoids
all non-symplectic pollution; it is a "clean" algorithm. A Hamiltonian system has
two types of conservation laws: one is area invariance in phase space, i.e., the
Liouville–Poincaré conservation law; the other is the invariance of the motion, which
includes conservation of energy, momentum, angular momentum, etc. We have proved
that every symplectic algorithm has its own invariants, which converge to the original
theoretical invariants with the same order as the convergence order of the numerical
algorithm. We have also proved that the majority of the invariant tori of a nearly
integrable system are preserved, which is a new formulation of the famous KAM
(Kolmogorov–Arnold–Moser) theorem[Kol54b,Kol54a,Arn63,Mos62] .
All these results demonstrate that the structure of the discrete Hamiltonian algorithm
completely parallels the conservation laws and stays very close to the original
form of the Hamiltonian system; moreover, theoretically speaking, it has unlimited
long-term tracking capability. Hence a correct numerical method for the Newton
equation is to Hamiltonize the equation first and then use a Hamiltonian algorithm.
This is the answer to the second question. We describe the KAM theory of symplectic
algorithms for Hamiltonian systems in more detail in Chapter 13. In the following we
present some examples comparing symplectic and non-symplectic algorithms in
solving the Newton equation of motion.
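The recipe "Hamiltonize first, then apply a Hamiltonian algorithm" can be sketched in a few lines. Assuming a conservative Newton equation m q̈ = −∇V(q), one sets p = m q̇ and H = p²/(2m) + V(q), then advances the canonical equations with a symplectic scheme; below we use the Störmer–Verlet (leapfrog) scheme and, purely as an illustration of ours, the pendulum potential V(q) = −cos q (not one of the book's experiments):

```python
import numpy as np

def verlet_step(q, p, h, grad_V, m=1.0):
    """One Stormer-Verlet (leapfrog) step for H = p^2/(2m) + V(q)."""
    p_half = p - 0.5 * h * grad_V(q)          # half kick
    q_new = q + h * p_half / m                # drift
    p_new = p_half - 0.5 * h * grad_V(q_new)  # half kick
    return q_new, p_new

# Pendulum: Newton equation q'' = -sin(q), Hamiltonized with
# V(q) = -cos(q), H = p^2/2 - cos(q).
grad_V = np.sin
H = lambda q, p: 0.5 * p**2 - np.cos(q)

q, p, h = 1.0, 0.0, 0.1
H0 = H(q, p)
drift = 0.0
for _ in range(10000):
    q, p = verlet_step(q, p, h, grad_V)
    drift = max(drift, abs(H(q, p) - H0))

print(drift)  # remains small: the energy oscillates but shows no secular drift
```

The energy error stays bounded for all time, in agreement with the invariance results stated above, rather than drifting as it would for a dissipative scheme.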

(1) Calculation of the Harmonic oscillator’s elliptic orbit

Fig. 0.1(a) shows the elliptic orbit of the harmonic oscillator computed with the
Runge–Kutta (R–K) method with step size 0.4, output at 3,000 steps. It shows
artificial dissipation: the orbit shrinks. Fig. 0.1(b) shows the results of the Adams
method with step size 0.2; it is anti-dissipative, and the orbit spirals outward.
Fig. 0.1(c) shows the results of the two-step central difference (leapfrog) scheme,
which is symplectic for linear equations, with step size 0.1. Three stages of a
10,000,000-step run are plotted: the initial 1,000 steps, the middle 1,000 steps, and
the final 1,000 steps. They are in complete agreement.
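This experiment can be reproduced qualitatively in a few lines. A sketch assuming the classical fourth-order R–K method and, as a simple symplectic counterpart, the first-order symplectic Euler scheme (the figure itself uses the leapfrog scheme); the step size 0.4 and 3,000 steps follow the text:

```python
import numpy as np

# Harmonic oscillator: q' = p, p' = -q, energy H = (p^2 + q^2)/2.
def f(z):                      # z = (q, p)
    return np.array([z[1], -z[0]])

def rk4_step(z, h):            # classical 4th-order Runge-Kutta
    k1 = f(z)
    k2 = f(z + 0.5 * h * k1)
    k3 = f(z + 0.5 * h * k2)
    k4 = f(z + h * k3)
    return z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def symplectic_euler_step(z, h):
    q, p = z
    p = p - h * q              # kick first ...
    q = q + h * p              # ... then drift with the new momentum
    return np.array([q, p])

H = lambda z: 0.5 * (z[0] ** 2 + z[1] ** 2)

h, n = 0.4, 3000
z_rk = np.array([1.0, 0.0])    # initial energy H0 = 0.5
z_se = z_rk.copy()
for _ in range(n):
    z_rk = rk4_step(z_rk, h)
    z_se = symplectic_euler_step(z_se, h)

print(H(z_rk))   # noticeably below 0.5: the R-K orbit has shrunk
print(H(z_se))   # stays near 0.5: no secular energy drift
```

The R–K energy decays by a fixed factor per step, so the orbit spirals inward, while the symplectic scheme conserves a nearby modified energy exactly and its computed energy merely oscillates around the true value.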

(2) The elliptic orbit for the nonlinear oscillator

Fig. 0.2 shows the results of the two-step central difference scheme, which is
non-symplectic for nonlinear equations, with step size 0.2 over 10,000 steps:
Fig. 0.2(a) shows the initial 1,000 steps, and Fig. 0.2(b) shows steps 9,000 to
10,000. Both show distortion of the orbit. Fig. 0.2(c) is for the second-order
symplectic algorithm with step size 0.1 over 1,000 steps.

Fig. 0.1. Calculation of the Harmonic oscillator’s elliptic orbit

Fig. 0.2. Calculation of the nonlinear oscillator’s elliptic orbit



Fig. 0.3. Calculation of the nonlinear Huygens oscillator

(3) The oval orbit of the Huygens oscillator

With the R–K method, the two fixed points on the horizontal axis become two fake
attractors, and the phase point is equally likely to end up near either of them: the
same initial point outside the separatrix is attracted at random either to the left
or to the right. Fig. 0.3(a) shows the results with step size 0.10000005 and 900,000
steps, approaching the left attractor. Fig. 0.3(b) shows the results with step size
0.10000004 and 900,000 steps, approaching the right attractor. Fig. 0.3(c) shows
the results of the second-order symplectic algorithm with step size 0.1. Four typical
orbits are plotted, each over 100,000,000 steps; for every orbit the first 500 steps,
the middle 500 steps, and the final 500 steps are shown, in complete agreement.

(4) The dense orbit of the geodesic on the ellipsoidal surface

Geodesics on the ellipsoidal surface with irrational frequency ratio have a dense
orbit. The square of the frequency ratio is 5/16, the step size is 0.05658, over
10,000 steps. Fig. 0.4(a) is for the R–K method, whose orbit does not become dense;
Fig. 0.4(b) is for the symplectic algorithm, whose orbit does.


Fig. 0.4. Geodesics on ellipsoid, frequency ratio √5 : 4, non-dense orbit (a), dense orbit (b)

(5) The closed orbit of the geodesic on the ellipsoidal surface

Geodesics on the ellipsoidal surface with rational frequency ratio have a closed
orbit. The frequency ratio is 11/16, the step size is 0.033427, over 100,000 steps
and 25 cycles. Fig. 0.5(a) is for the R–K method, which fails to produce the closed
orbit; Fig. 0.5(b) is for the symplectic algorithm, which produces it.

Fig. 0.5. Geodesics on ellipsoid, frequency ratio 11:16, non-closed orbit (a), closed orbit (b)

(6) The closed orbit of the Keplerian motion

The Keplerian motion with rational frequency ratio has a closed orbit. The frequency
ratio is 11/20, the step size is 0.01605, over 240,000 steps and 60 cycles. Fig. 0.6(a)
is for the R–K method, which fails to produce the closed orbit; Fig. 0.6(b) is for
the symplectic method, which produces it.
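The qualitative difference can be checked with a minimal sketch, assuming the planar Kepler Hamiltonian H = |p|²/2 − 1/|q| and, as the symplectic scheme, the first-order symplectic Euler method (the text does not specify which symplectic scheme was used for this figure). Because the force is central and the kick uses the old position, symplectic Euler conserves the angular momentum L = q₁p₂ − q₂p₁ exactly, up to rounding:

```python
import numpy as np

def grad_V(q):
    """Gradient of the Kepler potential V(q) = -1/|q|."""
    return q / np.linalg.norm(q) ** 3

def symplectic_euler_step(q, p, h):
    # Kick with the old position, then drift with the new momentum.
    p = p - h * grad_V(q)
    q = q + h * p
    return q, p

q = np.array([1.0, 0.0])
p = np.array([0.0, 1.0])           # circular orbit, angular momentum L = 1
ang = lambda q, p: q[0] * p[1] - q[1] * p[0]

L0, h = ang(q, p), 0.05
max_dev = 0.0
for _ in range(5000):
    q, p = symplectic_euler_step(q, p, h)
    max_dev = max(max_dev, abs(ang(q, p) - L0))

print(max_dev)   # at the level of rounding error
```

Exactness follows in one line: the new angular momentum is q × (p − h∇V(q)) shifted by a term parallel to q × q, both of which vanish, so only floating-point rounding remains.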

Fig. 0.6. Keplerian motion, frequency ratio 11:20, non-closed orbit (a), closed orbit (b)

2. History of the Hamiltonian Mechanics


We first consider the three formulations of classical mechanics. Assume a motion
with n degrees of freedom, position q = (q1 , · · · , qn ), and potential function
V = V (q). Then we have

m \frac{\mathrm{d}^2 q}{\mathrm{d} t^2} = -\frac{\partial V}{\partial q},
which is the standard formulation of the equation of motion. It is a system of
second-order differential equations in the space R^n, usually called the Newtonian
formulation of classical mechanics.
Euler and Lagrange introduced an action whose integrand is the difference between
the kinetic energy and the potential energy,
L(q, \dot q) = T(\dot q) - V(q) = \frac{1}{2}(\dot q, M \dot q) - V(q).
Applying the variational principle to this action yields
\frac{\mathrm{d}}{\mathrm{d} t} \frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q} = 0,
which is called the variational form of classical mechanics, i.e., the Lagrangian form.
In the 19th century, Hamilton proposed another formulation. He used the momen-
tum p = M q̇ and the total energy H = T + V to formulate the equation of motion
as
\dot p = -\frac{\partial H}{\partial q}, \qquad \dot q = \frac{\partial H}{\partial p},
which are called Hamilton's canonical equations. This is a system of first-order
differential equations in the 2n-dimensional phase space (p1 , · · · , pn , q1 , · · · , qn ).
It has a simple and symmetric form.
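As a worked illustration of the three formulations side by side, take the one-dimensional harmonic oscillator with mass m and spring constant k (this example is ours, not from the text):

```latex
% Newtonian form: second-order ODE in configuration space
m\,\ddot q = -k q, \qquad V(q) = \tfrac{1}{2} k q^2.

% Lagrangian form: L = T - V and the Euler--Lagrange equation
L(q,\dot q) = \tfrac{1}{2} m \dot q^2 - \tfrac{1}{2} k q^2,
\qquad
\frac{\mathrm{d}}{\mathrm{d} t}\frac{\partial L}{\partial \dot q}
  - \frac{\partial L}{\partial q} = m\ddot q + k q = 0.

% Hamiltonian form: p = m\dot q, H = T + V, canonical equations
H(p,q) = \frac{p^2}{2m} + \tfrac{1}{2} k q^2,
\qquad
\dot p = -\frac{\partial H}{\partial q} = -k q,
\quad
\dot q = \frac{\partial H}{\partial p} = \frac{p}{m}.
```

All three describe the same motion; only the Hamiltonian form is first-order and lives in phase space, which is the form the symplectic algorithms of this book discretize.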

These three basic formulations of classical mechanics are described in almost all
textbooks on theoretical physics or theoretical mechanics. The different mathematical
formulations describe the same physical law but offer different approaches to problem
solving; thus equivalent mathematical formulations can differ in their effectiveness
as a basis for computational methods. We have verified this in our own simulations.
The first author did extensive research on the Finite Element Method (FEM) in the
1960s [Fen65] , a systematic algorithmic framework for solving equilibrium problems.
Physical problems of this type have two equivalent formulations: the Newtonian one,
i.e., solving second-order elliptic equations, and the variational one, i.e., the
minimization principle for the energy functional. The key to the success of FEM, in
both theory and computation, lies in taking a reasonable variational formulation as
the basic principle. He later attempted to apply the FEM idea to the dynamic problems
of continuum mechanics, but the corresponding success was not achieved, and it
appears difficult to accomplish even today. The reasonable choice for computing
dynamic problems might therefore be the Hamiltonian formulation. Initially this was
a conjecture, requiring verification by computational experiments. We also
investigated how the Hamiltonian formalism had been evaluated in history.
First we should point out that Hamilton himself built his theory on geometric optics
and only then extended it to mechanics, an apparently very different field. In 1834
Hamilton said, "This set of ideas and methods has been applied to optics and
mechanics. It seems it can be applied to other areas and developed into an
independent body of knowledge by the mathematicians"[Ham34] . This was just his
expectation; his contemporaries seemed indifferent to the theory, which to them was
"beautiful but useless"[Syn44] . The famous mathematician Klein, while giving high
appreciation to the mathematical elegance of the theory, doubted its applicability,
saying: ". . . a physicist, for his problems, can extract from these theories only
very little, and an engineer nothing"[Kle26] . Later history proved this claim wrong,
at least on the physics side: quantum mechanics developed in the 1920s within the
framework of the Hamiltonian formalism. Schrödinger, one of its founders, said,
"Hamiltonian principle has been the foundation for modern physics . . . If you want
to solve any physics problem using the modern theory, you must represent it using
the Hamiltonian formulation"[Sch44] .

3. The Importance of the Hamiltonian System

The Hamiltonian system is one of the most important classes of dynamical systems:
every real physical process in which dissipation can be neglected can be formulated
as a Hamiltonian system. Hamiltonian systems have broad applications, including but
not limited to structural biology, pharmacology, semiconductors, superconductivity,
plasma physics, celestial mechanics, materials mechanics, and partial differential
equations. The first five topics have been listed as "Grand Challenges" in research
programs of the American government.

The development of physics confirms the importance of Hamiltonian systems. To date,
it is undisputed that all real physical processes in which dissipation can be
neglected can be written in Hamiltonian form, whether they have finitely or
infinitely many degrees of freedom.
Problems with finitely many degrees of freedom include celestial and artificial-satellite
mechanics, rigid-body and multi-body dynamics (including robots), geometric optics,
geometric asymptotic methods (including the ray-tracing approximation for wave
equations and the WKB method of quantum mechanics), plasma confinement, the design
of high-speed accelerators, automatic control, etc.
Problems with infinitely many degrees of freedom include ideal fluid dynamics,
elasticity, electrodynamics, quantum mechanics and field theory, general relativity,
solitons and nonlinear waves, etc.
All the above examples show the ubiquitous nature of Hamiltonian systems, whose
advantage is that different physical laws can be represented by the same mathematical
formalism. Thus we are confident that successful development of numerical methods
for Hamiltonian systems will find extremely broad applications.
We now discuss the status of numerical methods for Hamiltonian systems.
Hamiltonian systems, of finite or infinite dimension, are Ordinary Differ-
ential Equations (ODEs) or Partial Differential Equations (PDEs) of a special form.
Research on numerical methods for differential equations started in the 18th
century and has produced abundant publications. However, few of them dis-
cuss numerical methods specifically for Hamiltonian systems. This status is in sharp
contrast with the importance of Hamiltonian systems. Therefore, it is appealing and
worthwhile to investigate and develop numerical methods for this virgin field.

4. Technical Approach — Symplectic Geometry Method


The foundation of the Hamiltonian system is symplectic geometry, which is increas-
ingly flourishing in both theory and practice. The history of symplectic geometry can
be traced back to the astronomer Hamilton in the 19th century. In order to study New-
tonian mechanics, he introduced generalized coordinates and generalized momenta to
represent the energy of the system, which is now called the Hamiltonian function. For
a system with n degrees of freedom, the n generalized coordinates and momenta
span a 2n-dimensional phase space; Newtonian mechanics thus becomes geometry
in phase space. In terms of modern concepts, this is a kind of symplectic geometry.
Later, Jacobi, Darboux, Poincaré, Cartan, and Weyl did a great deal of research on this topic
from different points of view (algebraic and geometric). However, the major develop-
ment of modern symplectic geometry started with the discovery of the KAM theorem
(from the 1950s to the beginning of the 1960s). In the 1970s, in order to study Fourier integral
operators, quantum representations of geometry, group representation theory, the classi-
fication of critical points, Lie algebras, etc., much work was done on symplectic
geometry (e.g., Arnold[Arn89] , Guillemin[GS84] , Weinstein[Wei77] , Marsden[AM78] , etc.),
which promoted development in these areas. In the 1980s, research on global
symplectic geometry emerged, such as research on “hard” sym-
plectic geometry (e.g., Gromov et al.), fixed points of symplectic maps (e.g., Conley and
Zehnder's work on the Arnold conjecture), and the convexity of the moment map (e.g., Atiyah, Guillemin,
Sternberg et al.). Research on symplectic geometry is not only extremely rich
and vital; it is also widely applied in different areas, such as celes-
tial mechanics, geometric optics, plasma physics, the design of high-speed accelerators, fluid
dynamics, elasticity, optimal control, etc.
Weyl[Wey39] said the following in his monograph, on the history of the symplectic
group: “I called it the complex group initially. Because this name can be confused with the
complex numbers, I suggest using ‘symplectic’, a Greek word with the same meaning.”
An unwritten law of modern numerical methods is that the discretized
problem should preserve the properties of the original problem as much as possible.
To achieve this goal, the discretization should be performed in the same framework
as the original problem. For example, the finite element method treats the discretized
and original problems in the same framework of Sobolev spaces, so that the basic
properties of the original problem, such as symmetry, positivity, and conservativity,
are all preserved. This not only ensures effectiveness and reliability in
practice, but also provides a theoretical foundation.
Based on the above principle, numerical methods constructed for the Hamil-
tonian system should preserve the Hamiltonian structure; we call such methods
“Hamiltonian algorithms”. A Hamiltonian algorithm must be constructed in the same framework
as the Hamiltonian system. In the following, we describe the basic mathematical
framework of the Hamiltonian system and derive the Hamiltonian algorithm from the
same framework. This is our approach.
We will use Euclidean geometry as an analogy to describe symplectic geome-
try. The structure of a Euclidean space R^n lies in the bilinear, symmetric, non-degenerate
inner product

    (x, y) = ⟨x, Iy⟩,     I = I_n.

Since the inner product is positive definite, (x, x) is always positive when x ≠ 0. Therefore we can
define the length of a vector x as ‖x‖ = √(x, x) > 0. All linear operators that
preserve the inner product, i.e., satisfy A^T I A = I, form a group O(n), called the
orthogonal group, which is a typical Lie group. The corresponding Lie algebra o(n)
consists of all transformations A satisfying A^T I + I A = A^T + A = 0, the
infinitesimal orthogonal transformations.
Symplectic geometry is the geometry of phase space. The symplectic
structure of phase space lies in a bilinear, anti-symmetric,
non-degenerate inner product

    [x, y] = ⟨x, Jy⟩,     J = J_2n = (  O    I_n )
                                     ( −I_n   O  ),

which is called the symplectic inner product. When n = 1,

    [x, y] = | x_1  y_1 |
             | x_2  y_2 |,
which is the area of the parallelogram with the vectors x and y as edges. Generally
speaking, the symplectic inner product is an area metric. Due to the anti-symmetry
of the inner product, [x, x] = 0 holds for any vector x. Thus it is impos-
sible to derive the concept of the length of a vector from the symplectic inner product.
This is the fundamental difference between symplectic geometry and Euclidean ge-
ometry. All transformations that preserve the symplectic inner product form a group,
called the symplectic group Sp(2n), which is also a typical Lie group. Its corresponding
Lie algebra sp(2n) consists of all infinitesimal symplectic transformations B, which satisfy
B^T J + JB = 0. Since non-degenerate anti-symmetric
matrices exist only in even dimensions, a symplectic space must be even-dimen-
sional. The phase space exactly satisfies this condition.
Overall, Euclidean geometry is a geometry for studying length, while sym-
plectic geometry is for studying area.
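As a concrete illustration (our own sketch, not from the book; the dimension, matrices, and random data are arbitrary choices), the defining relations of the symplectic product, Sp(2n), and sp(2n) can be checked numerically:

```python
import numpy as np

n = 2
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])   # the standard matrix J_2n

def symp(x, y):
    """Symplectic inner product [x, y] = <x, J y>."""
    return x @ J @ y

rng = np.random.default_rng(0)
x, y = rng.standard_normal(2 * n), rng.standard_normal(2 * n)
assert np.isclose(symp(x, y), -symp(y, x))   # anti-symmetry
assert np.isclose(symp(x, x), 0.0)           # hence no notion of length

# B in sp(2n): B = J^{-1} C with C symmetric satisfies B^T J + J B = 0.
C = rng.standard_normal((2 * n, 2 * n)); C = 0.1 * (C + C.T)
B = np.linalg.solve(J, C)
assert np.allclose(B.T @ J + J @ B, 0)

def expm(M, terms=40):
    """Truncated Taylor series for the matrix exponential (adequate for small M)."""
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

# Exponentiating an element of sp(2n) lands in Sp(2n): A^T J A = J.
A = expm(B)
assert np.allclose(A.T @ J @ A, J)
```

This mirrors the Lie group/Lie algebra relation described above: infinitesimal symplectic transformations exponentiate to symplectic ones.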
The one-to-one nonlinear transformations in symplectic geometry are called sym-
plectic transformations, or canonical transformations; these are the transformations whose Jaco-
bian is everywhere a symplectic matrix, and they play a major role in symplectic geometry. For
a Hamiltonian system, if we represent a pair of n-dimensional vectors by a 2n-dimensional vector
z = (p, q), the Hamiltonian equations become

    dz/dt = J^{-1} ∂H/∂z.

Under symplectic transformations, the canonical form of the Hamiltonian equations
is invariant. The basic principle of Hamiltonian mechanics is: for any Hamiltonian
system there exists a group of symplectic transformations (the phase flow) G_H^{t_1,t_0},
depending on H and the times t_0, t_1, such that

    z(t_1) = G_H^{t_1,t_0} z(t_0),

which means that G_H^{t_1,t_0} transforms the state at t = t_0 to the state at t = t_1. Therefore,
every evolution of a Hamiltonian system is an evolution by symplectic trans-
formations. This is a general mathematical principle of classical mechanics. When
H is independent of t, G_H^{t_1,t_0} = G_H^{t_1−t_0,0}, i.e., the phase flow depends only on the
difference t_1 − t_0, and we may write G_H^t = G_H^{t,0}.
One of the most important issues for Hamiltonian systems is stability. The geometric fea-
ture of this type of problem is that its solutions preserve the metric, so
the eigenvalues are always purely imaginary. Therefore, we can-
not use the asymptotic stability theory of Poincaré and Liapunov; the KAM theorem
must be used instead. This is a theory of total stability and is the most important break-
through for Newtonian mechanics. The application of symplectic geometry to nu-
merical analysis was first proposed by K. Feng[Fen85] in 1984 at the international con-
ference on differential geometry and differential equations held in Beijing. It is based on a basic
principle of analytical mechanics: the solution of the system is a one-parameter group of
volume-preserving (i.e., symplectic) transformations² on the symplectic space.
²
Before K. Feng's work, there existed works on symplectic integration by de Vogelaere[Vog56] , Ruth[Rut83] and
Menyuk[Men84] .

Since then, new computational methods for Hamiltonian systems have
been developed, and we have studied the numerical solution of Hamiltonian systems
from this perspective. The new methods make the discretized equations preserve the
symplectic structure of the original system, i.e., they restore the basic principle of
Hamiltonian mechanics at the discrete level: the discretized phase flow can be regarded as a
series of discrete symplectic transformations, which preserve a series of phase areas
and the phase volume. In 1988, K. Feng described his research work on the symplectic
algorithm during his visit to Western Europe and gained recognition from many
prominent mathematicians. His presentation on “Symplectic Geometry and Compu-
tational Hamiltonian Mechanics” received consistently high praise at the workshop
celebrating the 60th birthday of the famous French mathematician Lions. Lions remarked
that K. Feng founded the symplectic algorithm for Hamiltonian systems after he devel-
oped the finite element method independently of the efforts in the West. The prominent
German numerical mathematician Stoer said, “This is a new method that has been
overlooked for a long time but should not be overlooked.”
We know that we cannot study Hamiltonian mechanics without symplectic
geometry. Likewise, the computational treatment of Hamiltonian mechanics
does not work without symplectic difference schemes. The classical R–K method is
not suitable for this type of problem, because it cannot preserve long-term
stability. For example, the fourth-order R–K method gives a completely distorted
result after 200,000 steps with step size 0.1, because it is not a symplectic algorithm
but a dissipative one.
We will describe the theory of symplectic geometry and symplectic
algebra in more detail in Chapters 1, 2 and 3.

5. The Symplectic Schemes


Every scheme, whether explicit or implicit, can be treated as a mapping from one
time level to the next. If this mapping is symplectic, we call it a symplectic geometric
scheme, or for short, a symplectic scheme.
We first examine the classical difference schemes. The well-known Euler midpoint
scheme, with time step τ,

    z^{n+1} = z^n + τ J^{-1} H_z( (z^{n+1} + z^n)/2 ),

is a symplectic scheme. A symplectic scheme is usually implicit; only for a separable Hamiltonian system
can an explicit scheme be obtained in practice, by alternating the explicit and implicit
stages. Its accuracy is only of first order. Symmetrizing this first-order scheme yields
a second-order (so-called reversible) scheme. There exist multi-stage symplectic R–K
schemes among the series of R–K schemes; it has been proved that the s-stage
Gauss R–K scheme, of order 2s, is symplectic. We will give more details on these top-
ics in Chapters 4, 7 and 8. The theoretical analysis and a priori error analysis will be
described in Chapters 6 and 9.
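For a concrete feel of the midpoint scheme (a minimal sketch of our own; the harmonic oscillator and step size are arbitrary choices), note that for a linear system the implicit midpoint equation can be solved in closed form, giving a Cayley transform, which is an exactly symplectic matrix:

```python
import numpy as np

tau = 0.1
# z' = A z for H = (p^2 + q^2)/2 with z = (p, q): p' = -q, q' = p.
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
I = np.eye(2)

# Midpoint step z^{n+1} = z^n + tau*A*(z^n + z^{n+1})/2, solved exactly:
M = np.linalg.solve(I - tau / 2 * A, I + tau / 2 * A)

# M is a symplectic matrix: M^T J M = J.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
assert np.allclose(M.T @ J @ M, J)

z = np.array([1.0, 0.0])
H0 = 0.5 * z @ z
for _ in range(100_000):          # very long integration
    z = M @ z
# For this linear problem the energy is conserved to machine precision.
assert np.isclose(0.5 * z @ z, H0)
```

The bounded energy over 100,000 steps is exactly the long-term behavior the text attributes to symplectic schemes.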

In addition, the first author and his group constructed various symplectic schemes
with arbitrary order of accuracy using generating function theory, from the ana-
lytical mechanics perspective. In the meantime, he extended the generating function
theory and the Hamilton–Jacobi equations by constructing all types of generating functions
and the corresponding Hamilton–Jacobi equations. The generating function theory and
the construction of symplectic schemes will be introduced in Chapter 5.

6. The Volume-Preserving Scheme for Source-free Systems

Among the various dynamical systems there is a class called source-free dynamical
systems, in which the divergence of the vector field is zero:

    dx/dt = f(x),     div f(x) = 0.

The phase flow of such a system is volume-preserving, i.e., det(∂x^{n+1}/∂x^n) = 1 for the
map from x^n to x^{n+1}; therefore, the numerical solution should also be volume-preserving.
We know that Hamiltonian systems are even-dimensional, but a source-free
system can have either even or odd dimension. For an odd-dimensional system,
the Euler midpoint scheme may not be volume-preserving. The ABC (Arnold–Beltrami–
Childress) flow is one example. Its vector field has the form

    ẋ = A sin z + C cos y,
    ẏ = B sin x + A cos z,
    ż = C sin y + B cos x,

which is a source-free system whose phase flow is volume-preserving. This is a separable
system, and constructing a volume-preserving scheme for it is easy. Numerical experi-
ments show that the volume-preserving scheme captures the topological structure
accurately, whereas the traditional schemes do not[FS95,QZ93] . We will give more de-
tails in Chapter 10.
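A minimal sketch of such a construction (our own code and parameter choices, following the splitting idea mentioned above): split the ABC field into three shear fields, each of which freezes its driving variable, so its exact flow is a volume-preserving translation; the composed step then has unit Jacobian determinant.

```python
import numpy as np

A, B, C, tau = 1.0, 0.8, 0.6, 0.05

# Exact flows of the three shear fields: in each, the variable that drives
# the motion stays constant, so the map is an exact translation.
def phi_A(x, y, z, t):      # field (A sin z, A cos z, 0): z is frozen
    return x + t * A * np.sin(z), y + t * A * np.cos(z), z

def phi_B(x, y, z, t):      # field (0, B sin x, B cos x): x is frozen
    return x, y + t * B * np.sin(x), z + t * B * np.cos(x)

def phi_C(x, y, z, t):      # field (C cos y, 0, C sin y): y is frozen
    return x + t * C * np.cos(y), y, z + t * C * np.sin(y)

def step(x, y, z):          # first-order composition; symmetrizing gives order 2
    return phi_C(*phi_B(*phi_A(x, y, z, tau), tau), tau)

# Check det(d x^{n+1} / d x^n) = 1 by central finite differences.
def jac(f, p, h=1e-6):
    p = np.asarray(p, float)
    cols = []
    for i in range(3):
        dp = np.zeros(3); dp[i] = h
        cols.append((np.array(f(*(p + dp))) - np.array(f(*(p - dp)))) / (2 * h))
    return np.stack(cols, axis=1)

det = np.linalg.det(jac(step, [0.3, -0.7, 1.1]))
assert abs(det - 1.0) < 1e-6
```

Each substep is a unit-triangular map, so the determinant of the composition is exactly one; the finite-difference check confirms this numerically.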

7. The Contact Schemes for Contact System

There exists a special type of dynamical systems with odd dimensions. They have
similar symplectic structure as the systems of even dimensions. We call them contact
systems. The reader can find more details in Chapter 11.
Consider a contact system in the space R^{2n+1}. Its state is the (2n+1)-dimensional
vector (x, y, z), where x = (x_1, …, x_n)^T, y = (y_1, …, y_n)^T, and z is a scalar;
its vector field is the (2n+1)-dimensional field (a(x, y, z), b(x, y, z), c(x, y, z)),
where a = (a_1, …, a_n)^T, b = (b_1, …, b_n)^T, and c is a scalar.

A contact system can be generated from a contact Hamiltonian function K(x, y, z):

    dx/dt = −K_y + K_z x = a,
    dy/dt = K_x = b,
    dz/dt = K_e = c,
    K_e(x, y, z) = K(x, y, z) − (x, K_x(x, y, z)).

The contact structure in the space R^{2n+1} is defined by the 1-form

    α = x dy + dz = [0, x, 1] (dx, dy, dz)^T.

A transformation f is called a contact transformation if it preserves the contact
structure up to a pre-factor μ_f. A scheme which preserves the contact structure is
called a contact scheme[FW94,Shu93] .
Contact schemes have potential applications in the propagation of wave
fronts[MF81,QC00] , in thermodynamics[MNSS91,EMvdS07] , and in the charac-
teristic method for first-order differential equations[Arn88] .
The symplectic algorithm, the volume-preserving algorithm, the contact algo-
rithm, and the Lie–Poisson algorithm are all schemes that preserve the geometric
structure of phase space. We call these methods “geometric integration for dy-
namical systems”[FW94,LQ95a] . Geometric integration was first introduced by the first
author[FW94] and has been widely accepted and used by the international community.
The 1996 workshop on advances in numerical methods, held in England, stressed
the importance of structure-preserving schemes for dynamical systems. At
that workshop, a series of high-order structure-preserving schemes was proposed
via the multiplicative extrapolation method[QZ92,QZ94] . We have extended the explicit
schemes of Yoshida[Yos90] to all self-adjoint schemes: by composing schemes
with their adjoints, we have constructed self-adjoint schemes of very high order.
The details are described in Chapter 8. The Lie–Poisson algorithm is
described in more detail in Chapter 12.
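The composition idea can be sketched as follows (our own minimal example; the oscillator test problem is an arbitrary choice). Yoshida's "triple jump" composes the self-adjoint order-2 Störmer–Verlet scheme with step sizes w1·τ, w0·τ, w1·τ to obtain a fourth-order scheme:

```python
import numpy as np

def verlet(q, p, tau, grad_V):
    """One step of the self-adjoint order-2 Stormer-Verlet scheme."""
    p = p - tau / 2 * grad_V(q)
    q = q + tau * p
    p = p - tau / 2 * grad_V(q)
    return q, p

# Yoshida's triple jump: S_4(tau) = S_2(w1*tau) S_2(w0*tau) S_2(w1*tau).
w1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
w0 = 1.0 - 2.0 * w1

def yoshida4(q, p, tau, grad_V):
    for w in (w1, w0, w1):
        q, p = verlet(q, p, w * tau, grad_V)
    return q, p

# Order check on the harmonic oscillator V(q) = q^2/2, whose exact
# solution from (q, p) = (1, 0) is q = cos t, p = -sin t.
grad_V = lambda q: q
def err(tau):
    q, p, t = 1.0, 0.0, 0.0
    while t < 1.0 - 1e-12:
        q, p = yoshida4(q, p, tau, grad_V); t += tau
    return abs(q - np.cos(1.0)) + abs(p + np.sin(1.0))

r = err(0.1) / err(0.05)
assert 10.0 < r < 24.0   # error ratio near 2^4 = 16 for a 4th-order scheme
```

Each Verlet step is symplectic, so the composition is symplectic as well; the halving test confirms the raised order of accuracy.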

8. Applications of the Symplectic Algorithms for Dynamical Systems

(1) Applications of symplectic algorithms to large-time-scale systems
Nearly all systems in celestial mechanics and dynamical astronomy are Hamiltonian,
or approximately Hamiltonian with small dissipation. Such systems can be described
by the canonical form of Hamiltonian systems, which has now become one of the most
important research areas of dynamical systems. However, due to the complicated non-
linearity of these canonical Hamiltonian systems, few analytic solutions are available.
Although approximate analytic solutions in the form of power series can sometimes be
obtained by perturbation methods, such solutions overlook the long-time dynamics, the
quantitative properties, and the intrinsic nonlinearity. Thus, numerical
methods are required to obtain more accurate and quantitative
solutions, which not only provide information and images on the whole
phase space of the given mechanical system for further qualitative analysis, but also
lead to important results for the system. There are two ways to analyze a Hamil-
tonian system qualitatively. One is to compute the numerical solution of the canoni-
cal Hamiltonian system directly by numerical methods; the other is to apply a simpler
discretization to the equations of motion, turning them into a mapping
problem that is easier to compute. The latter method reduces the computational
effort, so that ordinary computers can be used to study the large-time-scale
evolution of dynamical systems.
Traditional numerical methods for dynamical systems can be categorized into single-
step methods, e.g. the R–K methods, and multi-step methods, e.g. the widely used
Adams methods for first-order differential equations and Cowell methods for
second-order differential equations. However, all these methods have artificial nu-
merical dissipation, so the total energy of the Hamiltonian system
changes linearly. This distorts the basic properties of the Hamiltonian system and leads
to wrong results in long-time computations. Quantitative analysis shows that
the dissipation of the total energy accumulates errors in the numerical trajectories
of the celestial bodies; these errors grow at least quadratically with the
integration time.
In the 1980s, the first author and his group established the theory of symplec-
tic algorithms for Hamiltonian systems. The significance of this theory is not only to
present a new kind of algorithm, but also to elucidate the reason for the false dis-
sipation of the traditional methods, namely that the main truncation error terms of those
non-symplectic methods are dissipative, whereas the main truncation error terms
of symplectic algorithms are not. Thus the numerical energy of the
system does not decrease linearly, but changes periodically. Due to the conservation of
the symplectic structure of the system, which is its basic property, symplectic al-
gorithms have the long-time capacity to simulate the evolution of celestial bodies.
As the energy is a very important parameter of such a system, the numerical results
of symplectic algorithms, which preserve the energy approximately, are more
reasonable. Furthermore, because the errors of the energy are controlled, the errors of the
numerical trajectories of celestial bodies no longer grow rapidly following a (t − t_0)²
law, but only linearly as t − t_0, which is extremely advantageous for long-arc
computations.
Because of these advantages, symplectic algorithms are nowadays widely
used in dynamical astronomy, especially in the qualitative analysis of the
evolution of the solar system, e.g. to analyze the stable motion regions, spatial distributions
and orbital resonances of little planets, the long-time evolution of the large planets and
extra-planets, and other hot topics in dynamical astronomy.

(2) Applications of symplectic algorithms to qualitative analysis

We first use two simple examples to illustrate the special effects of symplectic schemes
on qualitative analysis in dynamical astronomy[LZL93,JLL02,LL95,LL94,Lia97,LLZW94] .
Example I. The Keplerian motion, i.e., the elliptical motion of the two-body problem.
The corresponding Hamiltonian function is

    H(p, q) = T(p) + V(q),

where q and p are the generalized coordinates and generalized momenta, and T and V
are the kinetic and potential energies. The analytic solution is a fixed ellipse. When
we simulate this problem by the R–K methods and by symplectic algorithms, the former
shrink the ellipse gradually, whereas the latter preserve the shape and size
of the ellipse (see the numerical trajectories after 150 and 1000 steps, respectively, in
Fig. 0.7(a), e = 0.7, and Fig. 0.7(b), e = 0.9, where e is the eccentricity of the ellipse).
This means the non-symplectic R–K methods suffer a false energy dissipation, while
the symplectic algorithms preserve the main character of the Kepler problem because
of the conservation of the symplectic structure.
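The Kepler experiment is easy to reproduce in spirit (our own sketch; the eccentricity, step size, and integration length are arbitrary choices, not the book's settings): the energy error of the first-order symplectic Euler scheme stays bounded, while that of the classical fourth-order R–K method accumulates over time.

```python
import numpy as np

def acc(q):                      # Kepler acceleration with GM = 1
    return -q / np.linalg.norm(q) ** 3

def energy(q, p):
    return 0.5 * p @ p - 1.0 / np.linalg.norm(q)

def sympl_euler(q, p, tau):      # order-1 symplectic Euler
    p = p + tau * acc(q)
    q = q + tau * p
    return q, p

def rk4(q, p, tau):              # classical 4th-order Runge-Kutta
    def f(y):
        return np.concatenate([y[2:], acc(y[:2])])
    y = np.concatenate([q, p])
    k1 = f(y); k2 = f(y + tau/2*k1); k3 = f(y + tau/2*k2); k4 = f(y + tau*k3)
    y = y + tau/6*(k1 + 2*k2 + 2*k3 + k4)
    return y[:2], y[2:]

e = 0.5                                           # eccentricity
q0 = np.array([1.0 - e, 0.0])                     # start at perihelion, a = 1
p0 = np.array([0.0, np.sqrt((1 + e) / (1 - e))])
tau, steps = 2 * np.pi / 500, 20 * 500            # 20 orbital periods
H0 = energy(q0, p0)

qs, ps = q0.copy(), p0.copy()
qr, pr = q0.copy(), p0.copy()
drift_rk = []
for n in range(steps):
    qs, ps = sympl_euler(qs, ps, tau)
    qr, pr = rk4(qr, pr, tau)
    if (n + 1) % 500 == 0:                        # sample once per period
        drift_rk.append(abs(energy(qr, pr) - H0))

assert abs(energy(qs, ps) - H0) < 0.2             # symplectic: bounded error
assert drift_rk[-1] > drift_rk[0]                 # RK4: error accumulates
```

Even though RK4 is far more accurate per step, its energy error grows with time, which is exactly the shrinking-ellipse effect described above.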
Example II. Stellar motion in an axially symmetric galaxy. Its simplified
dynamical model corresponds to the Hamiltonian function

    H(p, q) = 1/2 (p_1² + p_2²) + 1/2 (q_1² + q_2²) + q_1² q_2 − 1/3 q_2³.

To obtain the basic dynamical character of this system, we compute it with the
order 7 and order 8 Runge–Kutta–Fehlberg methods (denoted RKF(7) and RKF(8),
respectively), as well as the order 6 explicit symplectic algorithm (SY6). The numerical re-
sults are shown in Fig. 0.8 to Fig. 0.10. In Fig. 0.8 and Fig. 0.9, we see that
the symplectic algorithm preserves the energy H very well in both cases
(ordered region, LCN = 0, and disordered region, LCN > 0), while the RKF methods increase the
energy error ΔH with the evolution of time. In Fig. 0.10 (a) and (b), the symplec-
tic algorithms reproduce numerically the basic characters of the system: the invariant curve
in the case LCN = 0 and the chaotic behavior in the case LCN > 0.

Fig. 0.7. Comparison of calculation of Keplerian motion by R–K and symplectic methods.

Fig. 0.8. Curves of ΔH obtained by RKF(8) [left] and SY6 [right], both with H0 = 0.553,
LCN = 0.

Fig. 0.9. Curves of ΔH obtained by RKF(8) [left] and SY6 [right], both with H0 = 0.0148,
LCN > 0.

The symplectic algorithms can preserve the symplectic structure of Hamiltonian
systems and the basic evolutionary properties of such dynamical systems.

Fig. 0.10. Poincaré sections obtained by RKF(8) [left] with H0 = 0.553, LCN = 0, and SY6
[right] with H0 = 0.0148, LCN > 0.

Therefore, symplectic algorithms are widely used in dynamical astronomy. Cur-
rently, the dynamical evolution of the solar system is a hot topic, including
the long-term trajectory evolution of the large planets and extra-planets, the spatial dis-
tribution of little planets in the main zone (the Kirkwood gap phenomenon), trajectory
resonances, the evolution of the satellite systems of the large planets, the birth and evolution of
planetary rings, and the trajectory evolution of little planets near the Earth. All these prob-
lems require numerical simulation over very long times, e.g. 10^9 years or more for the
solar system. Thus, the time steps of the numerical methods must be large enough,
given the limitations of our computers, while the basic properties of the system should
still be preserved. This excludes all non-symplectic methods, and only low-order sym-
plectic algorithms are feasible for the task. In recent years, many astronomers in Japan
and America, e.g. Kinoshita[KYN91] , Bretit[GDC91] and Wisdom[WHT96,WH91] , have done
a large amount of research on the evolution of the solar system. The following con-
tribution of Wisdom has been widely cited. He wrote the Hamiltonian function of the
solar system in Jacobi coordinates as


    H(p, q) = Σ_{i=1}^{n−1} H_i(p, q) + ε ΔH(p, q),

where H_i(p, q) is the Hamiltonian function of a two-body system and ε ≪ 1
is a small parameter. By splitting the Hamiltonian function in this way, explicit symplectic al-
gorithms of different orders can be constructed. The advantage of these symplec-
tic algorithms is that their truncation errors are smaller, by a factor of order ε, than those of
algorithms constructed from the ordinary splitting of the Hamiltonian function (i.e.,
H(p, q) = T(p) + V(q)). Even the low-order symplectic algorithms obtained by
this splitting method are very effective in the study of the evolution of the solar system.
Since the 1980s, Chinese astronomers have also made progress in the applica-
tion of symplectic algorithms to the research of dynamical astronomy, for example[WH91] :

1◦ For the restricted three-body system consisting of the Sun, a major planet
and a planetoid, new results have been obtained from the study of the correspond-
ing 1:1 orbital resonance and the triangular libration points. These results success-
fully explain the distribution of the stability region[ZL94,ZLL92] of the Trojan planetoids, as well
as the actual size of the stable regions of the triangular libration points corresponding
to several related major planets.
2◦ Adopting Wisdom's splitting of the Hamiltonian function to
study the long-term trajectory evolution of some little planets:

    H(p, q) = H_0(p, q) + ε H_1(q),

where H_0(p, q) is the Hamiltonian function of an integrable system and ε ≪ 1 is a
small parameter. The numerical results obtained using this splitting method are
very reasonable, because the energy is preserved within a controlled range and no false
dissipation occurs.
3◦ Application of symplectic algorithms to galaxy systems. The bar phenomenon
and the evolution of stars in the galaxy NGC4736 were simulated successfully by
symplectic algorithms.
4◦ Useful results on how to describe the evolutionary features of celes-
tial dynamical systems were obtained by further study of symplectic integrators,
the existence of their formal integrals, and the behavior of various
conservation laws.
Besides the research on Hamiltonian systems in dynamical astronomy mentioned
above, the case of small dissipation has also been discussed and applied. Since
the dissipative factor is relatively weak, a mixed symplectic algorithm is used, combining
an explicit scheme for the conservative part (the main part, corresponding to the mechanical
system) with the centered Euler scheme for the dissipative part. This is remarkably
effective, because it maintains the Hamiltonian features of the main part of the
system.

(3) Applications of symplectic algorithms to quantitative computations
Because symplectic algorithms preserve the structure, the errors of
their numerical energy do not accumulate linearly. When celestial systems are inte-
grated by symplectic algorithms, the trajectory errors grow only linearly as t − t_0,
whereas the errors of non-symplectic methods grow rapidly as (t − t_0)². We show
some examples next.
Taking the trajectory of the LAGEOS satellite as background, we consider two mechan-
ical systems of Earth-perturbation problems. The first takes into account only the
nonspherical perturbation of the Earth, and the second takes into account both the non-
spherical perturbation of the Earth and the perturbation of atmospheric resistance. The
former problem corresponds to a Hamiltonian system, while the latter corresponds
to a quasi-Hamiltonian system, because the dissipation is very small. We use the RKF7(8)
method and a revised order 6 symplectic algorithm (denoted SY6) to compute the

two problems, respectively. The numerical results for the errors Δ(M + ω) of the
trajectory over the main 1000 cycles are listed in Table 0.1 and Table 0.2. From the two tables, we
can clearly see that the errors of the non-symplectic method, though very small at the
beginning, increase rapidly as (t − t_0)², whereas the errors of the symplectic algorithm
increase only linearly as t − t_0. The results of the symplectic algorithms are much better. This
indicates that although the order of accuracy of symplectic algorithms is the same as that of
other methods, they have more value in quantitative computations. We
also improve RKF7(8) into an energy-conserving method by compensating the en-
ergy at every time step; we denote this method as RKH, and its numerical
results are also listed in the two tables. From the results, we can see that this modification
improves the scheme considerably: the energy errors of RKH
are almost the same as those of the symplectic algorithm. Thus the RKH method not
only has high-order accuracy, but also preserves the energy approximately, like the
symplectic algorithms.

Table 0.1. Errors of trajectories with nonspherical perturbation of the Earth, Δ(M + ω)

method     N of steps / circle    100 circles    1000 circles    10000 circles
RKF7(8)    100                    1.5 E−10       1.4 E−08        1.3 E−06
SY6        50                     0.5 E−09       0.6 E−08        1.0 E−07
RKH        100                    0.9 E−11       0.9 E−10        0.9 E−09

Table 0.2. Errors of trajectories with perturbation of atmospheric resistance, Δ(M + ω)

method     N of steps / circle    100 circles    1000 circles    10000 circles
RKF7(8)    100                    1.4 E−10       1.3 E−08        1.3 E−06
SY6        50                     0.6 E−09       0.7 E−08        1.0 E−07
RKH        100                    2.1 E−11       3.5 E−10        6.2 E−09

(4) Applications of symplectic algorithms to quantum systems

The governing equation of the time evolution of a quantum system is the Schrödinger
equation

    i ∂ψ/∂t = Ĥψ,     Ĥ = Ĥ0(r) + V̂(t, r),                    (0.1)

where the operator Ĥ is Hermitian.
According to the basic theory of quantum mechanics, the initial state of a quantum
system uniquely determines all states after the initial moment of time. That is to say,
if the state function ψ(t_1, r) is given at time t_1, then the solution (the so-called wave
function) of Equation (0.1) is determined as

    ψ(t, r) = a(t, r) + i b(t, r),

where the functions a and b are real.


Such a solution can be generated by a group of time evolution operators
{U_Ĥ^{t_1,t_2}}, i.e.,

    ψ(t_2, r) = U_Ĥ^{t_1,t_2} ψ(t_1, r).

Every operator is unitary, depends on t_1, t_2 and Ĥ, and is independent of the
state ψ(t_1, r) at time t_1. Therefore, the time evolution of the quantum system is,
in this sense, an evolution by unitary transformations. Every such operator induces an operator
acting on a real function vector whose two components are the real part and
the imaginary part of the wave function, i.e.,

    ( b(t_2, r), a(t_2, r) )^T = S_Ĥ^{t_1,t_2} ( b(t_1, r), a(t_1, r) )^T.
The operator S_Ĥ^{t_1,t_2} preserves the inner product and the symplectic wedge product of
any two real function vectors; it is simply called a norm-preserving symplectic evolu-
tion. The quantum system is an (infinite-dimensional) Hamiltonian system, and the
time evolution of the Schrödinger equation can be rewritten as a canonical Hamilto-
nian system, with the two real components of the wave function as the generalized mo-
menta and generalized coordinates. The norm of the wave function is a conservation
law of this canonical system. Thus it is reasonable to integrate such a system by
norm-preserving symplectic numerical methods. To apply such a method to the in-
finite-dimensional system, we first discretize the system in space into a finite-
dimensional canonical Hamiltonian system, which also preserves the norm of the wave
function. Suppose the characteristic functions of the operator Ĥ0(r), for the evolution-
ary Schrödinger equation with some given boundary conditions, contain both discrete
states and continuous states.
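The passage from the complex Schrödinger equation to a real canonical system can be sketched as follows (our own illustration; the random symmetric matrix Hm merely stands in for a spatially discretized Ĥ). Writing ψ = a + ib turns iψ̇ = Ĥψ into ȧ = Hb, ḃ = −Ha, a linear Hamiltonian system, and the midpoint rule then preserves both the norm and the energy:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6
Hm = rng.standard_normal((N, N)); Hm = (Hm + Hm.T) / 2  # symmetric stand-in for H

# Real form of i dpsi/dt = H psi with psi = a + i b:
#   da/dt =  H b,   db/dt = -H a,
# i.e. dw/dt = L w with w = (a, b) and skew-symmetric L.
Z = np.zeros((N, N))
L = np.block([[Z, Hm], [-Hm, Z]])

# Midpoint rule: w^{n+1} = (I - tau/2 L)^{-1} (I + tau/2 L) w^n.
tau = 0.01
I = np.eye(2 * N)
M = np.linalg.solve(I - tau / 2 * L, I + tau / 2 * L)

a = rng.standard_normal(N); b = rng.standard_normal(N)
w = np.concatenate([a, b])
norm0 = w @ w                        # squared norm |psi|^2 = |a|^2 + |b|^2
energy0 = a @ Hm @ a + b @ Hm @ b    # energy <psi|H|psi> in real variables

for _ in range(5000):
    w = M @ w
a, b = w[:N], w[N:]
assert np.isclose(w @ w, norm0)                                # norm preserved
assert np.isclose(a @ Hm @ a + b @ Hm @ b, energy0, atol=1e-6) # energy preserved
```

Since L is skew-symmetric, the midpoint map M is orthogonal (norm-preserving) and commutes with L, so the quadratic energy is conserved exactly for this linear problem.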
When the Hamiltonian Ĥ does not depend on time explicitly, the energy of the quan-
tum system ⟨ψ|Ĥ|ψ⟩ = Z^T H Z is a conservation law both for the canonical system
and for the norm-preserving symplectic algorithm. Such a norm-preserving symplectic algo-
rithm of fourth-order accuracy can be constructed from the order 4 diagonal Padé
approximation to the exponential function e^λ.
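A quick check of this construction (our own sketch): the diagonal (2,2) Padé approximant R(λ) = (1 + λ/2 + λ²/12)/(1 − λ/2 + λ²/12) has modulus one on the imaginary axis (hence the induced stepping is norm-preserving, like the exact unitary propagator) and agrees with e^λ to order 4:

```python
import numpy as np

def pade22(lam):
    """Diagonal (2,2) Pade approximant of exp(lam), accurate to order 4."""
    num = 1 + lam / 2 + lam ** 2 / 12
    den = 1 - lam / 2 + lam ** 2 / 12
    return num / den

# For purely imaginary lam the numerator is the complex conjugate of the
# denominator, so |R(i*theta)| = 1: exact norm preservation.
theta = 0.37
assert np.isclose(abs(pade22(1j * theta)), 1.0)

# R(lam) - e^lam = O(lam^5): halving lam shrinks the error by about 2^5 = 32.
e1 = abs(pade22(0.2) - np.exp(0.2))
e2 = abs(pade22(0.1) - np.exp(0.1))
assert 25 < e1 / e2 < 40
```

Applied to λ = −iĤτ eigenvalue by eigenvalue, this is why the resulting scheme is both fourth-order accurate and norm-preserving.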
In the following, we take an example to introduce the method of discretizing the time-
dependent Schrödinger equation into a canonical system[LQHD07,QZ90a] .
Consider the time evolution of an atom moving in one-dimensional space under the
action of a strong field V(t, x):

    i ∂ψ/∂t = Ĥψ,     Ĥ = Ĥ0(x) + V̂(t, x),
    Ĥ0 = −(1/2) ∂²/∂x² + V0(x),
    V0(x) = 0 for 0 < x < 1,     V0(x) = ∞ for x ≤ 0 or x ≥ 1.

In contrast to the characteristic-function expansion method, we make no trun-
cation of the wave function when discretizing the Schrödinger equation. Therefore,
the resulting canonical system contains all the characteristic states of Ĥ0.
The numerical conservation laws of explicit symplectic algorithms converge
to the corresponding conservation laws of the system as the time step tends to zero.
Thus, although the numerical energy and the norm of the wave function computed by explicit
symplectic algorithms are not preserved exactly, they converge to the true
energy and norm of the wave function of the system as the time step decreases.
The time-dependent Schrödinger equation (TDSE) in one-dimensional space under the action of some strong field V (t, x) is

$$ \mathrm{i}\,\frac{\partial \psi}{\partial t} = \hat H \psi, \qquad \hat H = \hat H_0(x) + \varepsilon \hat V(t,x), $$

$$ \hat H_0 = -\frac{1}{2}\,\frac{\partial^2}{\partial x^2} + V_0(x), $$

$$ V_0(x) = \begin{cases} 0, & 0 < x < 1, \\ \infty, & x \le 0 \ \text{or}\ x \ge 1, \end{cases} \qquad
V(x) = \begin{cases} 2x, & 0 < x < 0.5, \\ 2 - 2x, & 0.5 \le x \le 1, \\ 0, & x \le 0 \ \text{or}\ x \ge 1. \end{cases} $$

Using a similar method as before, we expand the wave function in the characteristic functions {X_n(x) = √2 sin nπx, n = 1, 2, ⋯} of Ĥ0 to discretize the TDSE. Because the Hamiltonian operator is real, the discrete TDSE is a separable linear canonical Hamiltonian system with the following parameters:

$$ S = (S_{mn}), \qquad S_{mn} = \frac{n^2\pi^2}{2}\,\delta_{mn} + \varepsilon v_{mn}, $$

$$ v_{mn} = \begin{cases}
\dfrac{1}{2} + \dfrac{1-(-1)^n}{n^2\pi^2}, & m = n, \\[6pt]
0, & |m-n| = 1, 3, 5, \cdots, \\[6pt]
\dfrac{-16mn\left(1-(-1)^{(m-n)/2}\right)}{(m^2-n^2)^2\pi^2}, & |m-n| = 2, 4, 6, \cdots,\ n = 2, 4, 6, \cdots, \\[6pt]
\dfrac{-8\left|\,2mn-(-1)^{(m-n)/2}(m^2-n^2)\right|}{(m^2-n^2)^2\pi^2}, & |m-n| = 2, 4, 6, \cdots,\ n = 1, 3, 5, \cdots.
\end{cases} $$

The initial state is taken as

$$ \psi(0,x) = \frac{1+\mathrm{i}}{2}\,\bigl(X_1(x) + X_2(x)\bigr), \qquad \varepsilon = 5\pi^2. $$
The energy of the system is conserved because the Hamiltonian does not depend on time explicitly: E(b, a) = e0 = 42.0110165. The norm of the wave function remains unity, i.e., N (b, a) = n0 = 1. We take the Euler midpoint rule, the order 2 explicit symplectic algorithm and the order 2 R–K method to compute the problem with the same time step h = 10⁻³. The numerical results are as follows:
1° The R–K method cannot preserve the energy and the norm of the wave function, as shown by ER–K in Fig. 0.11(left) and NR–K in Fig. 0.11(right).
2° The Euler midpoint rule can preserve the energy and norm, as shown by EE in Fig. 0.11(left) and NE in Fig. 0.11(right). Note that for EE in Fig. 0.11(left), there is a very small increase at some times because of the implicitness of the Euler scheme.

Fig. 0.11. Energy [left] and norm [right] comparison among the 3 difference schemes

3° The explicit symplectic algorithms can preserve the energy Ẽ(b_k, a_k; h) and norm Ñ(b_k, a_k; h) exactly, as shown by ES in Fig. 0.11(left) and NS in Fig. 0.11(right). If we look more closely at these conservation laws on smaller scales, we find that as the time step gets smaller, the numerical energy of the symplectic algorithm converges to the true energy of the system e0 = 42.0110165, and the numerical norm converges to unity, n0 = 1. See Table 0.3 for the numerical energy and norm as well as their errors. The errors are defined as

$$ C_E(h) = \max_k \bigl|E_S^k - e_0\bigr|, \qquad C_N(h) = \max_k \bigl|N_S^k - n_0\bigr|. $$

Actually, the numerical energy and norm obtained by the symplectic algorithm oscillate slightly, as shown by ES and NS in Fig. 0.12, but the amplitude of these oscillations converges to zero as the time step tends to zero:

$$ e(h) \longrightarrow e_0, \qquad C_E(h) = \max_k \bigl|E_S^k - e_0\bigr| \longrightarrow 0, $$
$$ n(h) \longrightarrow n_0, \qquad C_N(h) = \max_k \bigl|N_S^k - n_0\bigr| \longrightarrow 0. $$


Table 0.3. The change of energy and norm of the wave function with the step size

    h        e(h)         C_E(h)      n(h)        C_N(h)
    10⁻³     42.0169964   0.0445060   0.9996509   0.0003106
    10⁻⁴     42.0110763   0.0004195   0.9999965   0.0000030
    10⁻⁵     42.0110171   0.0000018   0.9999990   0.0000000
    10⁻⁶     42.0110165   0.0000000   1.0000000   0.0000000
    10⁻⁷     42.0110165   0.0000000   1.0000000   0.0000000
    exact    42.0110165   0.0000000   1.0000000   0.0000000

Fig. 0.12. Energy E and norm N obtained from explicit symplectic scheme

In summary, for a quantum system whose real Hamiltonian function does not depend on time explicitly, the explicit symplectic algorithms can preserve the energy and norm of the wave function to any given accuracy. They overcome the main disadvantages of the traditional numerical methods.
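These observations can be reproduced on a small model problem. In the sketch below (a hypothetical random symmetric matrix H, not the actual matrix S of the example above), the Euler midpoint rule applied to the linear canonical system dZ/dt = AZ reduces to the Cayley transform, which is exactly orthogonal for skew-symmetric A, while the explicit order 2 R–K method lets the norm drift:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
H = rng.standard_normal((n, n))
H = (H + H.T) / 2
# Canonical form of i psi' = H psi with psi = a + i b: dZ/dt = A Z, A skew-symmetric.
A = np.block([[np.zeros((n, n)), H], [-H, np.zeros((n, n))]])

h, steps = 1e-2, 2000
I = np.eye(2 * n)
# Euler midpoint rule for a linear system: Z_{k+1} = (I - hA/2)^{-1} (I + hA/2) Z_k.
cayley = np.linalg.solve(I - h / 2 * A, I + h / 2 * A)

z0 = rng.standard_normal(2 * n)
z0 /= np.linalg.norm(z0)
z_mid, z_rk2 = z0.copy(), z0.copy()
drift_mid = drift_rk2 = 0.0
for _ in range(steps):
    z_mid = cayley @ z_mid
    # Explicit order 2 R-K (explicit midpoint) step, for comparison.
    z_rk2 = z_rk2 + h * (A @ (z_rk2 + h / 2 * (A @ z_rk2)))
    drift_mid = max(drift_mid, abs(np.linalg.norm(z_mid) - 1.0))
    drift_rk2 = max(drift_rk2, abs(np.linalg.norm(z_rk2) - 1.0))

print(drift_mid)   # roundoff level: the midpoint rule preserves the norm
print(drift_rk2)   # visibly larger: the explicit R-K norm drifts
```

The contrast mirrors the behavior of NE versus NR–K described in the text, on this toy problem.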
Next, we look at a quantum system with a real Hamiltonian function that does depend on time explicitly. In this case, the system resulting from semi-discretization is an m-dimensional, separable, linear, canonical Hamiltonian system. The energy of the system is no longer conserved, but the norm of the wave function is still a quadratic conservation law.
The TDSE for an atom in one-dimensional space under the action of the strong field V (t, x) = εx sin(ωt) is

$$ \mathrm{i}\,\frac{\partial \psi}{\partial t} = \hat H \psi, \qquad \hat H = \hat H_0(x) + \varepsilon \hat V(t,x), $$

$$ \hat H_0 = -\frac{1}{2}\,\frac{\partial^2}{\partial x^2} + V_0(x). $$
Fig. 0.13. ω = 3π²/2, ε = π²/2: Graph of norm [left]; graph of probability [right]

By a similar method as before, we expand the wave function in the characteristic functions X_n(x) = √2 sin nπx (n = 1, 2, ⋯) of Ĥ0 to discretize the TDSE. Because the Hamiltonian operator is real, the discrete TDSE is a separable linear canonical Hamiltonian system with the following parameters:

$$ S(t) = \left(s_{mn}(t)\right), \qquad s_{mn}(t) = \frac{n^2\pi^2}{2}\,\delta_{mn} + \varepsilon v_{mn}(t); $$

$$ v_{mn}(t) = \begin{cases}
\sin(\omega t), & m = n, \\[4pt]
0, & |m-n| = 2, 4, 6, \cdots, \\[4pt]
\dfrac{8mn\sin(\omega t)}{(m^2-n^2)^2\pi^2}, & |m-n| = 1, 3, 5, \cdots.
\end{cases} $$

The initial state is taken as ψ(0, x) = X₁(x) = √2 sin(πx). The energy of the system is not conserved in this case because the Hamiltonian depends on time explicitly. The norm of the wave function remains unity, i.e., N (b, a) = n0 = 1. We take the Euler midpoint rule scheme, the order 2 explicit symplectic algorithm and the order 2 R–K method to compute the problem with the same time step h = 4 × 10⁻³. The numerical results are as follows:
1° The R–K method increases the norm of the wave function rapidly, see NR–K in Fig. 0.13(left). It leads to unreasonable results, see Fig. 0.13(right).
2° The Euler midpoint rule scheme can preserve the norm, see NE in Fig. 0.13(left). These results are in good agreement with the theoretical results. See Fig. 0.13(right) for the results for the weak field ε = π²/2. When ω = ΔE₁ₙ, i.e., resonance occurs, the basic state and the first excited state intermix, and the variation period of the energy is identical to the period of intermixing. See the corresponding results in Fig. 0.14(left) and Fig. 0.14(right). When ω ≠ ΔE₁ₙ, there is no intermixing. See the corresponding numerical results in Fig. 0.15(left) and Fig. 0.15(right), where O is the basic state. When the field is strong, the selection rule is untenable and no resonance occurs, but the basic state intermixes with the first, second, … excited states. See the results for ω = 5π²/4 in Fig. 0.16(left) and Fig. 0.16(right), and for ω = 3π²/2 = ΔE₁₂ in Fig. 0.17(left) and Fig. 0.17(right).
Fig. 0.14. ω = 3π 2 /2, ε = π 2 /2: Graph of probability[left]; graph of norm[right]

Fig. 0.15. ω = 5π 2 /4, ε = 3π 2 /2: Graph of probability[left]; graph of norm[right]

3° The order 2 explicit symplectic algorithms cannot preserve the norm exactly. The numerical norms oscillate near unity; see NS in Fig. 0.13. The changes of the numerical energy and the states of intermixing obtained by the symplectic algorithms are similar to the results of the Euler midpoint rule scheme.
We can conclude that for this system the R–K method cannot preserve the norm of the wave function and its results are unreasonable; the Euler scheme can preserve the norm and its results agree with the theoretical results; the order 2 explicit scheme yields a numerical norm that oscillates near unity, and its energy and states of intermixing are the same as the results of the Euler scheme. Thus, the Euler scheme (an implicit symplectic scheme) and the order 2 explicit symplectic algorithm are good choices for studying a quantum system whose Hamiltonian depends on time explicitly. They overcome the drawbacks of the traditional R–K methods.
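The norm preservation of the midpoint rule extends to the time-dependent case: for dZ/dt = A(t)Z with A(t) skew-symmetric, evaluating A at the interval midpoint still gives an exactly orthogonal step. A sketch with hypothetical dimensions and a hypothetical coupling matrix (stand-ins for the truncated S(t) above):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
# Unperturbed particle-in-a-box levels n^2 pi^2 / 2 and a hypothetical
# symmetric coupling matrix V (illustrative assumptions, not the exact v_mn(t)).
H0 = np.diag([(k * np.pi) ** 2 / 2 for k in range(1, n + 1)])
V = rng.standard_normal((n, n))
V = (V + V.T) / 2
omega, eps = 3 * np.pi ** 2 / 2, 0.5

def A(t):
    # Skew-symmetric generator of the canonical system for the real,
    # time-dependent Hamiltonian matrix S(t) = H0 + eps * sin(omega t) * V.
    S = H0 + eps * np.sin(omega * t) * V
    Z = np.zeros((n, n))
    return np.block([[Z, S], [-S, Z]])

h, steps = 4e-3, 5000
I = np.eye(2 * n)
z = rng.standard_normal(2 * n)
z /= np.linalg.norm(z)
drift = 0.0
for k in range(steps):
    Ak = A((k + 0.5) * h)   # midpoint evaluation of the time-dependent generator
    z = np.linalg.solve(I - h / 2 * Ak, (I + h / 2 * Ak) @ z)
    drift = max(drift, abs(np.linalg.norm(z) - 1.0))

print(drift)   # roundoff level: the norm is preserved although A depends on t
```

Each step is a Cayley transform of a skew-symmetric matrix, hence orthogonal, so the norm is conserved exactly even though the energy is not.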

(5) Applications to computation of classical trajectories


Applications of symplectic algorithms to the computation of classical trajectories of the A2B molecular reacting system [LDJW00] .
The classical (or semi-classical) trajectory method is an effective theoretical tool for studying the dynamics of microscopic chemical reactions.
Fig. 0.16. ω = 5π 2 /4, ε = 50π 2 : Graph of probability[left]; graph of norm[right]

Fig. 0.17. ω = 3π 2 /2, ε = 50π 2 : Graph of probability[left]; graph of norm[right]

The classical trajectory method regards each atom approximately as a point mass and the system as a system of such points, and treats the reaction process as the classical motion of this point system on the electronic potential energy surface. It was Bunker who first applied the R–K method to computations of classical trajectories of molecular reacting systems. Karplus et al. performed a large number of computations with all kinds of numerical methods and singled out the R–K–G (Runge–Kutta–Gear) method, prolonging the computation time from 10⁻¹⁵ s to 10⁻¹² s. The R–K–G method brought rapid progress in the theoretical study of the reaction dynamics of microscopic chemistry and was widely used for the computation of classical trajectories. However, its valid computation time is much less than the 10⁻⁸ s necessary for the study of chemical reactions. Moreover, there were many differences between the numerical and theoretical values of some parameters. The classical trajectory method describes the microscopic reaction system approximately as a Hamiltonian system, which naturally has a symplectic structure. Thus, it is expected that symplectic algorithms will overcome the shortcomings of the R–K–G method and improve the numerical results.
Here we take the mass of the proton as the unit mass and 4.45 × 10−14 s as unit
time.
Consider the classical motions of A2B type molecules like H2O and SO2 moving on the electronic potential energy surface of the reaction system and preserving the C2v symmetry. Set the masses of A and B to be mA = 1 and mB = 2 resp., let the center of mass of the molecule be the origin of the coordinate system, the C2 axis be the z axis, and the coordinates of the two atoms A and the atom B be (y1, z1), (y2, z2) and (y3, z3) resp. in the fixed coordinate system. By Banerjee's coordinate separating method, we can get the generalized coordinates of the A2B molecule as

$$ q_1 = z_1 + z_2 - 2z_3, \qquad q_2 = y_2 - y_1, $$

the generalized masses M1 = 0.25, M2 = 0.5, the generalized momenta

$$ p_1 = 0.25\,\frac{\mathrm{d}q_1}{\mathrm{d}t}, \qquad p_2 = 0.5\,\frac{\mathrm{d}q_2}{\mathrm{d}t}, $$

and the kinetic energy of the system

$$ K(p) = 2p_1^2 + p_2^2. $$

The potential energy suggested by Banerjee, who introduced the C2v symmetry and the notation D = √(q1² + q2²), is

$$ V(q) = 5\pi^2 \left(D^2 - 5D + 6.5\right) + 4D^{-1} + 0.5\pi^2 \bigl(|q_2| - 1.5\bigr)^2 + |q_2|^{-1}. $$

The Hamiltonian function for the A2B molecular system is

$$ H(p,q) = K(p) + V(q), $$

and the canonical equations for the classical trajectories are

$$ \frac{\mathrm{d}p_1}{\mathrm{d}t} = -\frac{\partial V}{\partial q_1} = -f_1(q), \qquad \frac{\mathrm{d}q_1}{\mathrm{d}t} = \frac{\partial K}{\partial p_1} = g_1(p), $$
$$ \frac{\mathrm{d}p_2}{\mathrm{d}t} = -\frac{\partial V}{\partial q_2} = -f_2(q), \qquad \frac{\mathrm{d}q_2}{\mathrm{d}t} = \frac{\partial K}{\partial p_2} = g_2(p). $$

This is a separable Hamiltonian system, which can be integrated by explicit symplectic algorithms. We can obtain its numerical solutions for given initial values as

$$ t_k = kh, \quad p_1^k = p_1(t_k), \quad q_1^k = q_1(t_k), \quad p_2^k = p_2(t_k), \quad q_2^k = q_2(t_k), $$

and further the classical trajectories of the A2B system, together with the changes of kinetic energy, potential energy and total energy with time, from the relations

$$ y_3 = 0, \quad z_3 = -\frac{q_1}{4}; \qquad y_2 = -y_1 = \frac{q_2}{2}, \quad z_2 = z_1 = \frac{q_1}{4}. $$

The initial values are taken as

$$ q_1(0) = 3, \quad q_2(0) = \frac{3}{2}; \qquad p_1(0) = 0, \quad p_2(0) = 0. $$
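A minimal sketch of an order 2 explicit symplectic (leapfrog) integrator for this separable system. Two assumptions are made here: D is read as the distance √(q1² + q2²), and a smaller step than the text's h = 0.01 is used; the step count and tolerance are likewise illustrative:

```python
import numpy as np

M1, M2 = 0.25, 0.5  # generalized masses

def V(q1, q2):
    # Banerjee-type potential with D read as the radial distance (assumption).
    D = np.sqrt(q1 ** 2 + q2 ** 2)
    return (5 * np.pi ** 2 * (D ** 2 - 5 * D + 6.5) + 4 / D
            + 0.5 * np.pi ** 2 * (abs(q2) - 1.5) ** 2 + 1 / abs(q2))

def gradV(q1, q2):
    D = np.sqrt(q1 ** 2 + q2 ** 2)
    # Radial part: dV_rad/dq_i = R'(D) * q_i / D with R(D) = 5pi^2(D^2-5D+6.5)+4/D.
    c = (5 * np.pi ** 2 * (2 * D - 5) - 4 / D ** 2) / D
    f1 = c * q1
    f2 = (c * q2 + np.pi ** 2 * (abs(q2) - 1.5) * np.sign(q2)
          - np.sign(q2) / q2 ** 2)
    return f1, f2

def energy(p1, p2, q1, q2):
    return 2 * p1 ** 2 + p2 ** 2 + V(q1, q2)   # H = K(p) + V(q), K = 2 p1^2 + p2^2

h, steps = 5e-4, 10000
p1 = p2 = 0.0
q1, q2 = 3.0, 1.5
E0 = energy(p1, p2, q1, q2)
max_err = 0.0
for _ in range(steps):
    f1, f2 = gradV(q1, q2)
    p1 -= h / 2 * f1; p2 -= h / 2 * f2   # half kick: dp/dt = -grad V
    q1 += h * 4 * p1; q2 += h * 2 * p2   # drift: dq/dt = dK/dp
    f1, f2 = gradV(q1, q2)
    p1 -= h / 2 * f1; p2 -= h / 2 * f2   # half kick
    max_err = max(max_err, abs(energy(p1, p2, q1, q2) - E0))

print(max_err / abs(E0))   # small, bounded relative energy error (no secular drift)
```

The "kick–drift–kick" splitting is symplectic because each substep is an exact flow of either K(p) or V(q) alone, which is what makes the long-time energy error bounded rather than drifting.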
Fig. 0.18. The potential energy curve of the electronic potential function in phase space

We compute this system with order 4 explicit symplectic algorithm and R–K
method. The time step is taken as h = 0.01 for both. The numerical classical tra-
jectories, kinetic energy, potential energy and total energy are recorded. Fig. 0.18
shows the potential energy curve of the electronic potential function in phase space. If
|q1 | → +∞, then V (q) → +∞; if |q2 | → 0 or |q2 | → +∞, then V (q) → +∞. By
the theoretical analysis, we know that the total energy of the system will be conserved
all the time, the three atoms will oscillate nearly periodically, and the whole geometry
structure of the system may be reversed but kept periodic. The changes of the total
energy with time are shown in Fig. 0.19, where we can see that the total energy obtained by the symplectic algorithms is preserved up to 6.23 × 10⁻⁹ s, whereas the R–K method reduces it rapidly with time. The motion trajectories of the system in the plane obtained by the symplectic algorithms and the R–K method are shown in Fig. 0.20 (a), (c), (e) and (b), (d), (f) resp., where we can see that the numerical results of the symplectic algorithms are consistent with the theoretical results but those of the R–K method are not. We also applied the order 1 and 2 symplectic algorithms, the Euler method and the
revised Euler method to compute the same problem. The conclusions are almost the
same. Because all the traditional methods such as R–K methods, Adams methods and Euler methods cannot preserve the symplectic structure of this microscopic system, they inevitably introduce spurious dissipation, which makes their numerical results meaningless after long-term computations. On the contrary, symplectic algorithms preserve the structure and do not introduce any spurious dissipation. Therefore, they are suitable for long-term computations and greatly improve the classical trajectory method for studying the microscopic dynamical reactions of chemical systems.
Fig. 0.19. The changes of the total energy with the time

(6) Applications to computation of classical trajectories of diatomic system [Dea94,DLea96]

Consider the classical motion of an AB diatomic molecular system on the electronic potential energy surface. Set the masses of A and B to be m1 and m2 resp., the center of mass to be the origin of a coordinate system with fixed axis Ox, and the coordinates of the two atoms A and B to be −x1 and x2 resp. Then the generalized coordinate is q = x2 + x1, and the generalized mass is

$$ M = \frac{m_1 m_2}{m_1 + m_2}. $$

Further, the generalized momentum is p = M dq/dt, and the generalized kinetic energy is

$$ U(p) = \frac{p^2}{2M}. $$

Take the potential function to be the Morse potential

$$ V(q) = D\left\{ e^{-2a(q-q_e)} - 2e^{-a(q-q_e)} \right\}, $$
where the parameters D, a, qe were derived recently by E. Ley-Koo. Thus, the total energy of such a system is H(p, q) = U(p) + V(q), and the canonical Hamiltonian system for the classical trajectory is

$$ \frac{\mathrm{d}p}{\mathrm{d}t} = -\frac{\mathrm{d}V(q)}{\mathrm{d}q} = -f(q), \qquad \frac{\mathrm{d}q}{\mathrm{d}t} = \frac{\mathrm{d}U(p)}{\mathrm{d}p} = g(p). $$

This is a separable system. By explicit symplectic algorithms, we can obtain its numerical solutions

$$ t_k = kh, \quad p^k = p(t_k), \quad q^k = q(t_k), $$

and recover the classical trajectories of the AB diatomic system from

$$ x_1 = \frac{m_2 q}{m_1 + m_2}, \qquad x_2 = \frac{m_1 q}{m_1 + m_2}, $$
Fig. 0.20. The motion trajectories of the system in the plane. (a) and (b): period range from 4.45 × 10⁻¹⁰ s to (4.45 × 10⁻¹⁰ + 4.45 × 10⁻¹³) s; (c) and (d): period range from 6.23 × 10⁻⁹ s to (6.23 × 10⁻⁹ + 4.45 × 10⁻¹³) s; (e) and (f): period range from 6.23 × 10⁻⁹ s to (6.23 × 10⁻⁹ + 4.45 × 10⁻¹³) s. (a), (c), (e): the symplectic algorithm path; (b), (d), (f): the R–K method path

as well as the changes of kinetic energy, potential energy and total energy with time.
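A sketch of the order 2 explicit symplectic (Störmer–Verlet/leapfrog) scheme for this separable system. Dimensionless parameters M = D = a = qe = 1 are illustrative assumptions; the spectroscopic constants quoted below would require unit conversion:

```python
import numpy as np

# Illustrative dimensionless parameters -- assumptions for this sketch only.
M, D, a, qe = 1.0, 1.0, 1.0, 1.0

def V(q):
    # Morse potential V(q) = D { exp(-2a(q-qe)) - 2 exp(-a(q-qe)) }.
    return D * (np.exp(-2 * a * (q - qe)) - 2 * np.exp(-a * (q - qe)))

def dV(q):
    # f(q) = dV/dq.
    return -2 * a * D * (np.exp(-2 * a * (q - qe)) - np.exp(-a * (q - qe)))

def H(p, q):
    return p * p / (2 * M) + V(q)

h, steps = 0.01, 50000
q = qe
p = np.sqrt(2 * M * D) - 0.1   # a bound state just below the dissociation energy
E0 = H(p, q)
max_err = 0.0
for _ in range(steps):
    p -= h / 2 * dV(q)   # half kick: dp/dt = -dV/dq
    q += h * p / M       # drift:     dq/dt = p/M = dU/dp
    p -= h / 2 * dV(q)   # half kick
    max_err = max(max_err, abs(H(p, q) - E0))

print(max_err)   # bounded oscillation, no secular energy drift over 5*10^4 steps
```

The bounded energy error over many oscillation periods is exactly the qualitative behavior reported below for Li2, in contrast to the secular energy loss of the R–K methods.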
We compute some states of the two homonuclear molecules Li2 and N2 and the two heteronuclear molecules CO and CN by using the order 1, 2 and 4 explicit symplectic algorithms, and compare the numerical results for the total energy and the classical trajectories with those of the Euler method and the order 2 and order 4 R–K methods. In Fig. 0.21, Fig. 0.22 and Fig. 0.23, we show the numerical results for the classical trajectories, the total energy and the trajectories in the p–q phase space obtained by the order 4 explicit symplectic algorithm and the order 4 R–K method respectively.

Fig. 0.21. Classical orbit of two homonuclear molecules Li2

Fig. 0.22. Comparison of energy of two homonuclear molecules Li2

The parameters in those computations are taken as the time step h = 0.005, the initial values q(0) = qe, p(0) = √(2MD) − 0.0001, and D = 8541 cm⁻¹, qe = 2.67328 Å, a = 0.867 Å⁻¹, where Å = 0.1 nm. The results show that the symplectic algorithms can preserve
the energy after 10⁶ time steps; the facts that the two Li atoms oscillate periodically and that their trajectories in phase space remain invariant are well reproduced by the symplectic algorithm. The results are the opposite for the R–K method: the numerical total energy and the oscillation period and amplitude of the two atoms were reduced after 3000 time steps. Furthermore, the trajectories in phase space flattened toward the q axis after 50000 time steps and entirely lost the shape established by theoretical analysis and experiments (Fig. 0.21, Fig. 0.22, Fig. 0.23). The results for the other molecules N2, CO and CN are similar. Thus, we can draw the conclusion that the symplectic algorithms preserve the symplectic structure and the basic properties of the microscopic system. Therefore, they are capable of long-time computations for such systems.
Fig. 0.23. The trajectories in p − q phase space

(7) Applications to atmospheric and geophysical science

Recently, symplectic algorithms have been applied to study the observation operator of the global positioning system (GPS) by the Institute of Atmospheric Physics of the Chinese Academy of Sciences[WZJ95,WJX01] . Numerical weather forecasting needs a very large amount of atmospheric information from GPS. One of the key problems in this field is how to greatly reduce the computational cost while computing accurately over long times. Symplectic algorithms provide rapid and accurate numerical methods for dealing with GPS information efficiently: their computational cost is one four-hundredth of that of traditional algorithms. For the complicated nonlinear system of the atmosphere and ocean, symplectic algorithms preserve its total energy, total mass and total potential so well that the relative error of the potential height stays below 0.0006 (see Fig. 0.24).
Another application of symplectic algorithms to geophysics is carried out by the Institute of Geophysics in prospecting for oil and natural gas[GLCY00,LLL01a,LLL01b,LLL99] , and has yielded several major achievements. For example, the propagation of seismic waves has been investigated within the framework of Hamiltonian systems together with the corresponding symplectic algorithms. Moreover, "the information of oil reserves and geophysics and its process system" has been produced, and the task of prospecting for 10¹⁰ m³ of natural gas has been accomplished. Fig. 0.25 shows the numerical results of prestack depth migration in the area of Daqing Xujiaweizi obtained by applying symplectic algorithms to the Marmousi model. Recently, Liu Hong et al. proposed a new method[LYC06] to calculate the depth extrapolation operator via the exponential of a pseudo-differential operator in laterally varying media. The method obtains the phase of the depth extrapolation operator by introducing a lateral derivative of the velocity, and is in fact an application of the Lie group method.
Fig. 0.24. The relative errors of potential height is below 0.0006 after 66.5 days

Fig. 0.25. Numerical results of prestack depth migration in the area of Daqing Xujiaweizi
obtained by applying symplectic algorithms to Marmousi model
Bibliography

[AM78] R. Abraham and J. E. Marsden: Foundations of Mechanics. Addison-Wesley, Reading,


MA, Second edition, (1978).
[Arn63] V. I. Arnold: Small denominators and problems of stability of motion in classical and
celestial mechanics. Russian Math. Surveys, 18:85–191, (1963).
[Arn88] V. I. Arnold: Geometrical Methods In The Theory Of Ordinary Differential Equations.
Springer-Verlag, Berlin , (1988).
[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. Springer-Verlag, GTM
60, Berlin Heidelberg, Second edition, (1989).
[Dea94] P. Z. Ding and et al: Symplectic method of time evolution problem for atomic system.
Atomic and Molecular Physics (in Chinese), 6:440, (1994).
[DLea96] P. Z. Ding, Y. Li and et al: Symplectic method of calculating classical trajectories for microscopic chemical reaction. Chinese Academic Journals Science and Technology Abstracts (Express), 2(2):111, (1996).
[DLM97a] A. Dullweber, B. Leimkuhler, and R. McLachlan: Symplectic splitting methods for
rigid body molecular dynamics. J. Chem. Phys., 107:5840–5851, (1997).
[DLM97b] A. Dullweber, B. Leimkuhler, and R. I. McLachlan: Split-Hamiltonian Methods for
Rigid Body Molecular Dynamics. Technical Report 1997/NA11, Department of Applied
Mathematics and Theoretical Physics, University of Cambridge, (1997).
[EMvdS07] D. Eberard, B.M. Maschkea, and A.J. van der Schaftb: An extension of Hamil-
tonian systems to the thermodynamic phase space: Towards a geometry of nonreversible
processes. Reports on Mathematical Physics, 60(2):175–198, (2007).
[Fen65] K. Feng: Difference schemes based on variational principle. J. of Appl. and Comput.
Math.in Chinese, 2(4):238–262, (1965).
[Fen85] K. Feng: On difference schemes and symplectic geometry. In K. Feng, editor, Pro-
ceedings of the 1984 Beijing Symposium on Differential Geometry and Differential Equa-
tions, pages 42–58. Science Press, Beijing, (1985).
[Fen92] K. Feng: How to compute property Newton’s equation of motion. In L. A. Ying, B.Y.
Guo, and I. Gladwell, editors, Proc of 2nd conf. on numerical method for PDE’s, pages 15–
22. World Scientific, Singapore. (1992). Also see Collected Works of Feng Kang. Volume
I, II. National Defence Industry Press, Beijing, (1995).
[FQ87] K. Feng and M.Z. Qin: The symplectic methods for the computation of Hamiltonian
equations. In Y. L. Zhu and B. Y. Guo, editors, Numerical Methods for Partial Differential
Equations, Lecture Notes in Mathematics 1297, pages 1–37. Springer, Berlin, (1987).
[FQ91a] K. Feng and M.Z. Qin: Hamiltonian Algorithms for Hamiltonian Dynamical Systems.
Progr. Natur. Sci., 1(2):105–116, (1991).
[FQ91b] K. Feng and M.Z. Qin: Hamiltonian algorithms for Hamiltonian systems and a com-
parative numerical study. Comput. Phys. Comm., 65:173–187, (1991).
[FQ03] K. Feng and M. Q. Qin: Symplectic Algorithms for Hamiltonian Systems. Zhejiang
Press for Science and Technology, Hangzhou, in Chinese, First edition, (2003).
[FS95] K. Feng and Z. J. Shang: Volume-preserving algorithms for source-free dynamical
systems. Numer. Math., 71:451–463, (1995).
[FW94] K. Feng and D.L. Wang: Dynamical systems and geometric construction of algo-
rithms. In Z. C. Shi and C. C. Yang, editors, Computational Mathematics in China, Con-
temporary Mathematics of AMS Vol 163, pages 1–32. AMS, (1994).
[GDC91] B. Gladman, M. Duncan, and J. Candy: Symplectic integrators for long-term inte-
gration in celestial mechanics. Celest. Mech., 52:221–240, (1991).
[GLCY00] L. Gao, Y. Li, X. Chen, and H. Yang: An attempt to seismic ray tracing with
symplectic algorithm. Chinese Journal Geophys., 43(3):402–409, (2000).
[Gol80] H. Goldstein: Classical Mechanics. Addison-Wesley Reading, Massachusetts, (1980).
[GS84] V. Guillemin and S. Sternberg: Symplectic Techniques in Physics. Cambridge Univer-
sity Press, Cambridge, (1984).
[GS94a] Z. Ge and C. Scovel: Hamiltonian truncation of shallow water equations. Letters in
Mathematical Physics, 31:1–13, (1994).
[Ham34] Sir W. R. Hamilton: On a general method in dynamics; by which the study of the
motions of all free systems of attracting or repelling points is reduced to the search and
differentiation of one central relation, or characteristic function. Phil. Trans. Roy. Soc.
Part II for 1834, 247–308; Math. Papers, Vol. II, 103–161, Second edition, (1834).
[Ham40] W.R. Hamilton: General methods in dynamics, volume I,II. Cambridge Univ. Press,
(1940).
[HIWZ95] T. Y. Huang, K. A. Innanen, C. B. Wang, and Z. Y. Zhao: Symplectic methods
and their application to the motion of small bodies in the solar system. Earth, Moon, and
Planets,, 71(3):179–183, (1995).
[HKRS97] M. Hankel, B. Karasözen, P. Rentrop, and U. Schmitt: A Molecular Dynamics
Model for Symplectic Integrators. Mathematical Modelling of Systems, 3(4):282–296,
(1997).
[HL97a] E. Hairer and P. Leone: Order barriers for symplectic multi-value methods. In D. F. Griffiths, D. F. Higham, and G. A. Watson, editors, Numerical Analysis 1997, Proc. of the 17th Dundee Biennial Conference, June 24–27, 1997, Pitman Research Notes in Math. Series 380, pages 133–149, (1997).
[IMKNZ00] A. Iserles, H. Z. Munthe-Kaas, S. P. Nørsett, and A. Zanna: Lie-group methods.
Acta Numerica, 9:215–365, (2000).
[JLL02] J. Ji, G. Li, and L. Liu: The dynamical simulations of the planets orbiting GJ 876. Astrophys. J., 572:1041–1047, (2002).
[Kle26] F. Klein: Vorlesungen über die Entwicklung der Mathematik in 19 Jahrhundert. Teub-
ner, (1926).
[Kol54a] A. N. Kolmogorov: General theory of dynamical systems and classical mechanics.
In Proc. Inter. Congr. Math., volume 1, pages 315–333, (1954).
[Kol54b] A. N. Kolmogorov: On conservation of conditionally periodic motions under small
perturbations of the Hamiltonian. Dokl. Akad. Nauk SSSR, 98:527–530, (1954).
[KYN91] H. Kinoshita, H. Yoshida, and H. Nakai: Symplectic integrators and their application
to dynamical astronomy. Celest. Mech. and Dyn. Astro., 50:59–71, (1991).
[LDJW00] Y. X. Li, P. Z. Ding, M. X. Jin, and C. X. Wu: Computing classical trajectories of
model molecule A2 B by symplectic algorithm. Chemical Journal of Chinese Universities,
15(8):1181–1186, (2000).
[Lia97] X. Liao: Symplectic integrator for general near-integrable Hamiltonian systems. Ce-
lest. Mech. and Dyn. Astro., 66:243–253, (1997).
[LL94] L. Liu and X. H. Liao: Numerical calculations in the orbital determination of an artifi-
cial satellite for a long arc. Celest. Mech., 59:221–235, (1994).
[LL95] L. Liu and X. H. Liao: Existence of formal integrals of symplectic integrators. Celest.
Mech., 63(1):113–123, (1995).
[LL99] L. D. Landau and E. M. Lifshitz: Mechanics, Volume I of Course of Theoretical
Physics. Corp. Butterworth, Heinemann, New York, Third edition, (1999).
[LLL99] M. Luo, Y. Li, and H. Lin: The symplectic geometric description and algorithm of seismic wave propagation. In The 69th Ann. SEG Meeting, volume 199, pages 1825–1828, (1999).
[LLL01a] Y. M. Li, H. Liu, and M. Q. Luo: Seismic wave modeling with implicit symplectic
method based on spectral factorization on helix. Chinese Journal Geophys., 44(3):379–388,
(2001).
[LLL01b] M. Q. Luo, H. Liu, and Y. M. Li: Hamiltonian description and symplectic method
of seismic wave propagation. Chinese Journal Geophys., 44(1):120–128, (2001).
[LLZD01] X. S. Liu, X. M. Liu, Z. Y. Zhao, and P. Z. Ding: Numerical solution of the 2-D time-independent Schrödinger equation. Int. J. Quant. Chem., 83:303–309, (2001).
[LLZW94] L. Liu, X. Liao, Z. Zhao, and C. Wang: Application of symplectic integrators to
dynamical astronomy(3). Acta Astronomica Sinica, 35:1, (1994).
[LQ88] C.W. Li and M.Z. Qin: A symplectic difference scheme for the infinite dimensional
Hamiltonian system. J. Comput. Appl. Math., 6:164–174, (1988).
[LQ95a] S. T. Li and M. Qin: Lie–Poisson integration for rigid body dynamics. Computers
Math. Applic., 30:105–118, (1995).
[LQ95b] S. T. Li and M. Qin: A note for Lie–Poisson Hamilton-Jacobi equation and Lie-
Poisson integrator. Computers Math. Applic., 30:67–74, (1995).
[LQHD07] X.S. Liu, Y.Y. Qi, J. F. He, and P. Z. Ding: Recent progress in symplectic algorithms
for use in quantum systems. Communications in Computational Physics, 2(1):1–53, (2007).
[LSD02a] X. S. Liu, L. W. Su, and P. Z. Ding: Symplectic algorithm for use in computing the
time independent Schrodinger equation. Int. J. Quant. Chem., 87:1–11, (2002).
[LSD02b] X.S. Liu, L.W. Su, and P. Z. Ding: Symplectic algorithm for use in computing the
time independent schrödinger equation. Int. J. Quant. Chem., 87(1):1–11, (2002).
[LYC06] H. Liu, J.H. Yuan, J.B. Chen, H. Shou, and Y.M. Li: Theory of large-step depth
extrapolation. Chinese Journal Geophys., 49(6):1779–1793, (2006).
[LZL93] X. Liao, Z. Zhao, and L. Liu: Application of symplectic algorithms in computation
of LCN. Acta Astronomica Sinica, 34(2):201–207, (1993).
[Men84] C.R. Menyuk: Some properties of the discrete Hamiltonian method. Physica D,
11:109–129, (1984).
[MF81] V.P. Maslov and M. V. Fedoriuk: Semi-classical approximation in quantum mechanics.
D. Reidel Publishing Company, Dordrecht Holland, First edition, (1981).
[MK95] H. Munthe-Kaas: Lie–Butcher theory for Runge–Kutta methods. BIT, 35(4):572–587, (1995).
[MK98] H. Munthe-Kaas: Runge–Kutta methods on Lie groups. BIT, 38(1):92–111, (1998).
[MK99] H. Munthe-Kaas: High order Runge–Kutta methods on manifolds. Appl. Numer.
Math., 29:115–127, (1999).
[MKO99] H. Munthe-Kaas and B. Owren: Computations in a free Lie algebra. Phil. Trans.
Royal Soc. A, 357:957–981, (1999).
[MKQZ01] H. Munthe-Kaas, G. R. W. Quispel, and A. Zanna: Generalized polar decom-
positions on Lie groups with involutive automorphisms. Foundations of Computational
Mathematics, 1(3):297–324, (2001).
[MKZ97] H. Munthe-Kaas and A. Zanna: Numerical integration of differential equations on
homogeneous manifolds. In F. Cucker and M. Shub, editors, Foundations of Computational
Mathematics, pages 305–315. Springer Verlag, (1997).
[MM05] K.W. Morton and D.F. Mayers: Numerical Solution of Partial Differential Equations:
an introduction. Cambridge University Press, Cambridge, Second edition, (2005).
[MR99] J. E. Marsden and T. S. Ratiu: Introduction to Mechanics and Symmetry. Number 17
in Texts in Applied Mathematics. Springer-Verlag, second edition, (1999).
[MNSS91] R. Mrugała, J.D. Nulton, J.C. Schon, and P. Salamon: Contact structure in thermodynamic theory. Reports on Mathematical Physics, 29:109–121, (1991).
[Mos62] J. Moser: On invariant curves of area-preserving mappings of an annulus. Nachr.
Akad. Wiss. Gottingen, II. Math.-Phys., pages 1–20, (1962).
[QC00] M. Z. Qin and J.B. Chen: Maslov asymptotic theory and symplectic algorithm. Chi-
nese Journal Geophys., 43(4):522–533, (2000).
[Qin89] M. Z. Qin: Canonical difference scheme for the Hamiltonian equation. Mathematical Methods in the Applied Sciences, 11:543–557, (1989).
[Qin97a] M. Z. Qin: A symplectic scheme for the PDEs. AMS/IP Studies in Advanced Mathematics, 5:349–354, (1997).
[QT90] G. D. Quinlan and S. Tremaine: Symmetric multistep methods for the numerical inte-
gration of planetary orbits. Astron. J., 100:1694–1700, (1990).
[QZ90] M. Z. Qin and M. Q. Zhang: Explicit Runge–Kutta–like Schemes to Solve Certain
Quantum Operator Equations of Motion. J. Stat. Phys., 60(5/6):839–843, (1990).
[QZ92] M. Z. Qin and W.J. Zhu: Construction of Higher Order Symplectic Schemes by Com-
position. Computing, 47:309–321, (1992).
[QZ93a] M. Z. Qin and W. J. Zhu: Volume-preserving schemes and numerical experiments.
Computers Math. Applic., 26:33–42, (1993).
[QZ93b] M. Z. Qin and W. J. Zhu: Volume-preserving schemes and applications. Chaos,
Soliton & Fractals, 3(6):637–649, (1993).
[QZ94] M. Z. Qin and W. J. Zhu: Multiplicative extrapolation method for constructing higher
order schemes for ode’s. J. Comput. Math., 12:352–356, (1994).
[Rut83] R. Ruth: A canonical integration technique. IEEE Trans. Nucl. Sci., 30:26–69, (1983).
[Sch44] E. Schrödinger: Scripta mathematica, 10:92–94, (1944).
[Shu93] H.B. Shu: A new approach to generating functions for contact systems. Computers
Math. Applic., 25:101–106, (1993).
[ST92a] P. Saha and S. Tremaine: Symplectic integrators for solar system dynamics. Astron.
J., 104:1633–1640, (1992).
[Syn44] J.L. Synge: Scripta mathematica, 10:13–24, (1944).
[Vog56] R. de Vogelaere: Methods of integration which preserve the contact transformation
property of the Hamiltonian equations. Report No. 4, Dept. Math., Univ. of Notre Dame,
Notre Dame, Ind., Second edition, (1956).
[War83] F. W. Warner: Foundations of Differentiable Manifolds and Lie Groups. GTM 94.
Springer-Verlag, Berlin, (1983).
[Wei77] A. Weinstein: Lectures on symplectic manifolds. In CBMS Regional Conference, 29.
American Mathematical Society,Providence,RI, (1977).
[wey39] H. Weyl: The Classical Groups. Princeton Univ. Press, Princeton, Second edition, (1939).
[WH91] J. Wisdom and M. Holman: Symplectic maps for the N -body problem. Astron. J.,
102:1528–1538, (1991).
[WHT96] J. Wisdom, M. Holman, and J. Touma: Symplectic Correctors. In Jerrold E. Mars-
den, George W. Patrick, and William F. Shadwick, editors, Integration Algorithms and
Classical Mechanics, volume 10 of Fields Institute Communications, pages 217–244.
Fields Institute, American Mathematical Society, July (1996).
[WJX01] B. Wang, Z. Ji, and Q. Xiao: The Hamiltonian algorithm of the equations of atmospheric dynamics. Chinese Journal of Computational Physics, 18(1):289–297, (2001).
[WZJ95] B. Wang, Q. Zhen, and Z. Ji: The system of square conservation and Hamiltonian
systems. Science in China (Series A), 25(7):765–770, (1995).
[Yos90] H. Yoshida: Construction of higher order symplectic integrators. Physics Letters A,
150:262–268, (1990).
[ZL93] Z. Zhao and L. Liu: The stable regions of triangular libration points of the planets I.
Acta Astronomica Sinica, 34(1):56–65, (1993).
[ZL94] Z. Zhao and L. Liu: The stable regions of triangular libration points of the planets II.
Acta Astronomica Sinica, 35(1):76–83, (1994).
[ZLL92] Z. Zhao, X. Liao, and L. Liu: Application of symplectic integrators to dynamical
astronomy. Acta Astronomica Sinica, 33(1):33–41, (1992).
Chapter 1.
Preliminaries of Differentiable Manifolds

Before introducing the concept of a differentiable manifold, we first explain what a mapping is. Given two sets X, Y and a correspondence rule, if for any x ∈ X there exists a unique y = f (x) ∈ Y corresponding to it, then f is a mapping of the set X into the set Y , denoted f : X → Y . X is said to be the domain of definition of f , and f (X) = {f (x) | x ∈ X} ⊂ Y is said to be the image of f . If f (X) = Y , then f is said to be surjective or onto; if f (x) = f (x′ ) ⇒ x = x′ , then f is said to be injective (one-to-one); if f is both surjective and injective (i.e., X and Y are in one-to-one correspondence under f ), f is said to be bijective. For a bijective mapping f , if we define x = f −1 (y), then f −1 : Y → X is said to be the inverse mapping of f . In abstract algebra, a homomorphism is a structure-preserving map between two algebraic structures of the same type (such as groups, rings, or vector spaces), i.e., a map that preserves properties such as the identity element, inverse elements, and the binary operations. For example, for two groups G and G′ and a mapping f : G → G′ , a → f (a), if f (a · b) = f (a) · f (b), ∀ a, b ∈ G, then f is said to be a homomorphism from G to G′ . An isomorphism is a bijective homomorphism: if f is a homomorphism G → G′ and also a one-to-one mapping from G onto G′ , then f is said to be an isomorphism from G to G′ . An epimorphism is a surjective homomorphism; a monomorphism is an injective homomorphism. A homomorphism from an object to itself is said to be an endomorphism, and an endomorphism that is also an isomorphism is said to be an automorphism. Given two topological spaces (X, τ ) and (Y, τ ′ ), if the mapping f : X → Y is one-to-one and onto, and both f and its inverse mapping f −1 : Y → X are continuous, then f is said to be a homeomorphism; if f and f −1 are also differentiable, then f is said to be a diffeomorphism. More precisely, given two differentiable manifolds M and N , a bijective mapping f from M to N is called a diffeomorphism if both f : M → N and its inverse f −1 : N → M are differentiable (if these mappings are r times continuously differentiable, f is said to be a C r -diffeomorphism).
Many mathematical methods and concepts of a differential nature are used in classical mechanics and modern physics: differential equations, phase flows, smooth mappings, manifolds, Lie groups and Lie algebras, and symplectic geometry. If one would like to construct a new numerical method, one needs to understand these basic theories and concepts. In this book, we briefly explain manifolds, symplectic algebra, and symplectic geometry. These materials can be found in a series of books[AM78,Che53,Arn89,LM87,Ber00,Wes81] .
1.1 Differentiable Manifolds


The concept of a manifold is an extension of Euclidean space. Roughly speaking, a
manifold is an abstract mathematical space in which every point has a neighborhood
that resembles Euclidean space (i.e., is homeomorphic to an open subset of it). A
differentiable manifold is a manifold equipped with a differentiable structure.

1.1.1 Differentiable Manifolds and Differentiable Mapping


Definition 1.1. A Hausdorff space M with countable bases is called an n-dimensional
topological manifold, if for any point in M there exists an open neighborhood home-
omorphic to an open subset of Rn .
Remark 1.2. Let (U, ϕ), (V, ψ) be two local coordinate systems (usually called
charts) on the topological manifold M . (U, ϕ) and (V, ψ) are said to be compatible if
U ∩ V = Ø, or if the changes of coordinates ϕ ◦ ψ −1 and ψ ◦ ϕ−1 are smooth when
U ∩ V ≠ Ø.

Definition 1.3. A chart is a domain U ⊂ Rn together with a one-to-one mapping
ϕ : W → U of a subset W of the manifold M onto U . ϕ(x) is said to be the image of
the point x ∈ W ⊂ M on the chart U .
Definition 1.4. A collection of charts ϕi : Wi → Ui is an atlas on M if
1◦ Any two charts are compatible.
2◦ Any point x ∈ M has an image on at least one chart.
Remark 1.5. If a smooth atlas on a topological manifold M contains all the local
coordinate systems (charts) compatible with it, then this smooth atlas is called the
maximal atlas.
Definition 1.6. If an n-dimensional topological manifold M is equipped with the
maximal smooth atlas A, then (M, A) is called the n-dimensional differentiable man-
ifold, and A is called the differentiable structure on M .
Definition 1.7. Two atlases on M are equivalent if their union is also an atlas (i.e., if
any chart of the first atlas is compatible with any chart of the second).
Remark 1.8. Suppose M is the n-dimensional topological manifold, A = {(Uλ , ϕλ )}
is a smooth atlas on M . Then there exists a unique differentiable structure A∗ , which
contains A. Hence, a smooth atlas determines a unique differentiable structure on M .
The local coordinate system will be called (coordinate) chart subsequently.
Definition 1.9. A differentiable manifold structure on M is a class of equivalent at-
lases.
Definition 1.10. A differentiable manifold M is a set M together with a differentiable
manifold structure on it. A differentiable manifold structure is induced on the set M
once an atlas consisting of compatible charts is prescribed.
Below are examples of differentiable manifolds.


Example 1.11. Rn is an n-dimensional differentiable manifold.
Let A ={(Rn , I)}, where I is the identity mapping.

Example 1.12. S n is an n-dimensional differentiable manifold.


We only discuss the n = 1 case. Let

U1 = {(u1 , u2 ) ∈ S 1 |u1 > 0}, U2 = {(u1 , u2 ) ∈ S 1 |u1 < 0},


U3 = {(u1 , u2 ) ∈ S 1 |u2 > 0}, U4 = {(u1 , u2 ) ∈ S 1 |u2 < 0}.

Define ϕi : Ui → (−1, 1), such that (s.t.)

ϕi (u1 , u2 ) = u2 , i = 1, 2; ϕi (u1 , u2 ) = u1 , i = 3, 4.

Note that on ϕ1 (U1 ∩ U3 ),
$$\varphi_3\circ\varphi_1^{-1}\colon\; u^2 \longmapsto \Big(\sqrt{1-(u^2)^2},\; u^2\Big) \longmapsto \sqrt{1-(u^2)^2}$$
is smooth, and similarly for the other overlaps; hence A = {(Uk , ϕk )} is a smooth atlas on S 1 .
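The smoothness of this change of coordinates can also be checked numerically. The following Python sketch (the function names are our own illustration, not part of the text) evaluates ϕ3 ◦ ϕ1⁻¹ on the overlap and confirms that it agrees with √(1 − t²) there:

```python
import math

# Charts on the unit circle S^1 (names phi1, phi3 are ours):
# phi1 sends a point with u1 > 0 to its u2-coordinate,
# phi3 sends a point with u2 > 0 to its u1-coordinate.
def phi1_inv(t):
    # inverse chart: t in (-1, 1) -> the point on S^1 with u1 > 0
    return (math.sqrt(1.0 - t * t), t)

def phi3(p):
    # chart on the half circle u2 > 0: projection onto u1
    return p[0]

def transition(t):
    # phi3 o phi1^{-1}, defined on the overlap t in (0, 1)
    return phi3(phi1_inv(t))

# On the overlap the transition map equals sqrt(1 - t^2), a smooth function.
for t in [0.1, 0.5, 0.9]:
    assert abs(transition(t) - math.sqrt(1.0 - t * t)) < 1e-12
```

The same pattern verifies the other pairwise overlaps by symmetry.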

Example 1.13. RP n is an n-dimensional differentiable manifold.


Let

Uk = {[(u1 , · · · , un+1 )] | (u1 , · · · , un+1 ) ∈ S n , uk ≠ 0}, k = 1, · · · , n + 1,

and define ϕk : Uk → Int B n (1), s.t.

ϕk ([(u1 , · · · , un+1 )]) = uk |uk |−1 (u1 , · · · , uk−1 , uk+1 , · · · , un+1 ),

where $B^n(1) = \Big\{ (u^1,\cdots,u^n)\in \mathbf{R}^n \;\Big|\; \sum_{i=1}^{n} (u^i)^2 \le 1 \Big\}$. It is easy to prove that A = {(Uk , ϕk )} is a smooth atlas on RP n .

Example 1.14. Let M, N be m- and n-dimensional differentiable manifolds, respectively; then M × N is an (m + n)-dimensional differentiable manifold (the product manifold).
Suppose A = {(Uα , ϕα )}, B = {(Vλ , ψλ )} are smooth atlases on M, N , respectively. Denote A × B = {(Uα × Vλ , ϕα × ψλ )}, where ϕα × ψλ : Uα × Vλ → ϕα (Uα ) × ψλ (Vλ ), (ϕα × ψλ )(p, q) = (ϕα (p), ψλ (q)), (p, q) ∈ Uα × Vλ ; then A × B is a smooth atlas on M × N .

Definition 1.15. Let M, N be m- and n-dimensional differentiable manifolds, respectively. A continuous mapping f : M → N is called C k differentiable at p ∈ M if, for charts (U, ϕ), (V, ψ) about p and f (p) with f (U ) ⊂ V , the local representation f̂ = ψ ◦ f ◦ ϕ−1 : ϕ(U ) → ψ(V ) is C k differentiable. If f is C k differentiable at each p ∈ M , then f is called C k differentiable, or a C k mapping. See Fig. 1.1.
Fig. 1.1. A differentiable mapping

Example 1.16. Let M1 , M2 be m- and n-dimensional differentiable manifolds, re-


spectively. Define θ1 : M1 × M2 → M1 , θ2 : M1 × M2 → M2 , such that
θ1 (p, q) = p, θ2 (p, q) = q, ∀ (p, q) ∈ M1 × M2 ,
then θ1 , θ2 are all smooth mappings.
If the charts on M1 , M2 are denoted by (U, ϕ), (V, ψ), then it is easy to show that (U × V, ϕ × ψ) is a chart on M1 × M2 . Thus, the local coordinate expression of θ1 ,

θ̂1 = ϕ ◦ θ1 ◦ (ϕ × ψ)−1 : (ϕ × ψ)(U × V ) −→ ϕ(U ), θ̂1 (u, v) = u,

is a smooth mapping. Therefore, θ1 is a smooth mapping. Likewise, θ2 is also a smooth mapping.
Example 1.17. Let M, N1 , N2 be differentiable manifolds, and let

f1 : M −→ N1 , f2 : M −→ N2

be C k -mappings. Define

f : M −→ N1 × N2 , f (p) = (f1 (p), f2 (p)), ∀ p ∈ M ;

then f is a C k -mapping.
∀ p0 ∈ M , let (V, ψ) be a chart of N1 about f1 (p0 ), let (W, χ) be a chart of N2 about f2 (p0 ), and let (U, ϕ) be a chart of M about p0 . Assume f1 (U ) ⊂ V , f2 (U ) ⊂ W , and that

f̂1 = ψ ◦ f1 ◦ ϕ−1 : ϕ(U ) −→ ψ(V ),
f̂2 = χ ◦ f2 ◦ ϕ−1 : ϕ(U ) −→ χ(W )

are C k -mappings. Then (V × W, ψ × χ) is a chart of the product manifold N1 × N2 that contains (f1 (p0 ), f2 (p0 )) = f (p0 ) and satisfies f (U ) ⊂ V × W . We have

f̂ = (ψ × χ) ◦ f ◦ ϕ−1 : ϕ(U ) −→ (ψ × χ)(V × W ), f̂ = (f̂1 , f̂2 ),

i.e., f is a C k -mapping.
Remark 1.18. According to the definition, if f : M → N, g : N → L are C k -


mappings, then g ◦ f : M → L is also a C k -mapping.

Definition 1.19. Let M, N be differentiable manifolds and f : M → N a homeomorphism. If f and f −1 are smooth, then f is called a diffeomorphism from M to N . If there exists a diffeomorphism between differentiable manifolds M and N , then M and N are said to be diffeomorphic, denoted by M ≅ N .
If we define two smooth atlases (R, I), (R, ϕ) on R, with ϕ : R → R, ϕ(u) = u3 , then, because the change of coordinates I ◦ ϕ−1 (u) = ∛u is not differentiable at u = 0, (R, I) and (R, ϕ) determine two different differentiable structures A, A′ on R. However, if we define f : (R, A) → (R, A′ ) by {f (u)}3 = u, then (R, A) ≅ (R, A′ ).
In fact, there exist differentiable manifolds that are homeomorphic but not diffeomorphic, such as the famous Milnor exotic spheres.

1.1.2 Tangent Space and Differentials


In order to establish the concept of differential for differentiable mappings on a differentiable manifold, we first need to extend the concepts of the tangent line of a curve and the tangent plane of a surface in Euclidean space. If we regard a tangent vector in Euclidean space not simply as a quantity with magnitude and direction, but as a linear mapping from the space of differentiable functions to R satisfying the Leibniz rule, then the definition of tangent vector can be given similarly for a manifold.
Let M be an m-dimensional differentiable manifold and p ∈ M a fixed point. Let
C ∞ (p) be the set of all smooth functions that are defined in some neighborhood of p.
Define operations on C ∞ (p) with the following properties:

(f + g)(p) = f (p) + g(p),


(αf )(p) = αf (p),
(f g)(p) = f (p)g(p).

Definition 1.20. A tangent vector Xp at the point p ∈ M is a mapping

Xp : C ∞ (p) −→ R,

that has the following properties:


1◦ Xp (f ) = Xp (g), if f, g ∈ C ∞ (p) agree in some neighborhood of the point p.
2◦ Xp (αf + βg) = αXp (f ) + βXp (g), ∀ f, g ∈ C ∞ (p), ∀ α, β ∈ R.
3◦ Xp (f g) = f (p)Xp (g) + g(p)Xp (f ), ∀ f, g ∈ C ∞ (p) (the Leibniz rule, as for the usual derivative).
Denote Tp M ={All tangent vectors at the point p ∈ M } and define operation:

(Xp + Yp )(f ) = Xp (f ) + Yp (f ),
(kXp )(f ) = kXp (f ), ∀ f ∈ C ∞ (p).
It is easy to verify that, with the above operations, Tp M becomes a vector space,
called the tangent space of the differentiable manifold M at the point p.

Remark 1.21. By definition of the tangent vector, it is easy to know that if f is the
constant function, Xp (f ) = 0 for Xp ∈ Tp M .

Lemma 1.22. Let (U, ϕ) be a chart containing p ∈ M , let x1 , · · · , xm be its coordinate functions, and write ϕ(p) = (a1 , · · · , am ). If f ∈ C ∞ (p), then there exist functions gi in some neighborhood W of p ∈ M , such that
$$f(q) = f(p) + \sum_{i=1}^{m}\big(x^i(q)-a^i\big)\,g_i(q), \qquad \forall\, q\in W,$$
and
$$g_i(p) = \frac{\partial f}{\partial x^i}\bigg|_p, \qquad\text{where}\qquad \frac{\partial f}{\partial x^i}\bigg|_p = \frac{\partial}{\partial x^i}\bigg|_p(f) = \frac{\partial\,(f\circ\varphi^{-1})}{\partial u^i}\bigg|_{\varphi(p)}.$$

Proof. Assume ϕ(p) = O ∈ Rm , and that f is well defined in some neighborhood of p. Let W = ϕ−1 (B m ). Then ∀ q ∈ W we have

f (q) − f (p) = f ◦ ϕ−1 (u) − f ◦ ϕ−1 (O).

After calculation, we obtain
$$f(q)-f(p) = \sum_{i=1}^{m} u^i\, \bar g_i(u),$$
where
$$\bar g_i(u) = \int_0^1 \frac{\partial\,(f\circ\varphi^{-1})}{\partial u^i}\big(su^1,\cdots,su^m\big)\,\mathrm{d}s \qquad (i=1,\cdots,m).$$
Let gi (q) = ḡi (ϕ(q)); then gi is smooth on W (note that here ai = 0, since ϕ(p) = O) and satisfies
$$f(q) = f(p) + \sum_{i=1}^{m} x^i(q)\, g_i(q), \qquad g_i(p) = \bar g_i(O) = \frac{\partial\,(f\circ\varphi^{-1})}{\partial u^i}\bigg|_{O} = \frac{\partial f}{\partial x^i}\bigg|_p.$$
Hence the lemma is proved. □


  
Theorem 1.23. Define $\dfrac{\partial}{\partial x^i}\Big|_p : C^\infty(p)\to \mathbf{R}$ by $\dfrac{\partial}{\partial x^i}\Big|_p(f) = \dfrac{\partial\,(f\circ\varphi^{-1})}{\partial u^i}\Big|_{\varphi(p)}$, ∀ f ∈ C ∞ (p); then $\dfrac{\partial}{\partial x^i}\Big|_p$ (i = 1, · · · , m) form a basis of Tp M . Therefore dim Tp M = m, and for Xp ∈ Tp M we have
$$X_p = \sum_{i=1}^{m} X_p(x^i)\, \frac{\partial}{\partial x^i}\bigg|_p.$$
Proof. ∀ Xp ∈ Tp M and f ∈ C ∞ (p), by Lemma 1.22 we know
$$f = f(p) + \sum_{i=1}^{m} (x^i - a^i)\, g_i,$$
and then
$$X_p(f) = \sum_{i=1}^{m} X_p\big[(x^i-a^i)g_i\big] = \sum_{i=1}^{m} X_p(x^i)\,\frac{\partial f}{\partial x^i}\bigg|_p = \sum_{i=1}^{m} X_p(x^i)\,\frac{\partial}{\partial x^i}\bigg|_p(f).$$
The coefficients {Xp (xi )} of this decomposition of Xp with respect to (w.r.t.) the basis $\frac{\partial}{\partial x^i}\big|_p$ (i = 1, · · · , m) are called the coordinates of the tangent vector Xp w.r.t. the chart (U, ϕ). □

Remark 1.24. By Theorem 1.23 we know: if the coordinates of Xp w.r.t. the chart (U, ϕ) are defined as (Xp (x1 ), · · · , Xp (xm )), then Tp M and Rm are isomorphic, and the basis of Tp M corresponds exactly to the standard basis of Rm , i.e., $\frac{\partial}{\partial x^i}\big|_p \mapsto e_i = (0,\cdots,0,1,0,\cdots,0)$, with 1 in the i-th place.
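The action of a tangent vector as a derivation can be illustrated numerically: in the coordinates of Remark 1.24, a tangent vector with components (a1 , · · · , am ) acts on a function as a directional derivative, and the Leibniz rule 3◦ of Definition 1.20 holds. The following Python sketch (the names and the finite-difference check are our own illustration) demonstrates this in R²:

```python
import math

# A tangent vector at p in R^2 with coordinates (a1, a2) acts on a smooth
# function f as the directional derivative a1*df/dx1 + a2*df/dx2.
def tangent_apply(a, f, p, h=1e-6):
    # central finite differences for df/dx^i at p
    out = 0.0
    for i in range(len(p)):
        q_plus = list(p); q_plus[i] += h
        q_minus = list(p); q_minus[i] -= h
        out += a[i] * (f(q_plus) - f(q_minus)) / (2 * h)
    return out

f = lambda x: x[0] ** 2 * x[1]        # f(x, y) = x^2 y
g = lambda x: math.sin(x[0])          # g(x, y) = sin x
p, a = [1.0, 2.0], [3.0, -1.0]

# Leibniz rule (property 3 of Definition 1.20): X(fg) = f(p)X(g) + g(p)X(f)
fg = lambda x: f(x) * g(x)
lhs = tangent_apply(a, fg, p)
rhs = f(p) * tangent_apply(a, g, p) + g(p) * tangent_apply(a, f, p)
assert abs(lhs - rhs) < 1e-4
```

The linearity property 2◦ can be checked in exactly the same way.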

1. Definition and properties of differentials of mappings


The definition of differentials of a mapping is as follows:
Definition 1.25. Let f : M → N be a smooth mapping. ∀ p ∈ M, Xp ∈ Tp M, we
define f∗p : Tp M → Tf (p) N that satisfies:

f∗p (Xp )(g) = Xp (g ◦ f ), ∀ g ∈ C ∞ (f (p)).

This linear mapping f∗p is called the differential of f at the p ∈ M .

Definition 1.26. The differential of the identity mapping I is an identity mapping,


i.e., I∗p : Tp M → Tp M .

Remark 1.27. Let M, N, L be differentiable manifolds, p ∈ M , and f : M → N, g :


N → L are smooth mappings, then (g ◦ f )∗p = g∗f (p) ◦ f∗p .

Remark 1.28. If f : M → N is a diffeomorphism, then f∗p : Tp M → Tf (p) N is an
isomorphism.

Proposition 1.29. Let x1 , · · · , xm and y 1 , · · · , y n be the coordinate functions of (U, ϕ) and (V, ψ), respectively; then
$$f_{*p}\bigg(\frac{\partial}{\partial x^i}\bigg|_p\bigg) = \sum_{j=1}^{n} \frac{\partial f_j}{\partial x^i}\bigg|_p\, \frac{\partial}{\partial y^j}\bigg|_{f(p)},$$
where fj = y j ◦ f .
Proof. Since
$$f_{*p}\bigg(\frac{\partial}{\partial x^i}\bigg|_p\bigg)(y^k) = \frac{\partial}{\partial x^i}\bigg|_p\,(y^k\circ f) = \frac{\partial f_k}{\partial x^i}\bigg|_p,$$
by Theorem 1.23, applied to the tangent vector $f_{*p}\big(\frac{\partial}{\partial x^i}\big|_p\big)\in T_{f(p)}N$, we have
$$f_{*p}\bigg(\frac{\partial}{\partial x^i}\bigg|_p\bigg) = \sum_{j=1}^{n} f_{*p}\bigg(\frac{\partial}{\partial x^i}\bigg|_p\bigg)(y^j)\;\frac{\partial}{\partial y^j}\bigg|_{f(p)} = \sum_{j=1}^{n} \frac{\partial f_j}{\partial x^i}\bigg|_p\,\frac{\partial}{\partial y^j}\bigg|_{f(p)}.$$
Therefore the proposition is proved. □



Let
$$X_p = \sum_{i=1}^{m} \alpha^i\,\frac{\partial}{\partial x^i}\bigg|_p, \qquad f_{*p}(X_p) = \sum_{j=1}^{n} \beta^j\,\frac{\partial}{\partial y^j}\bigg|_{f(p)};$$
by Proposition 1.29, we have
$$\begin{pmatrix} \beta^1 \\ \vdots \\ \beta^n \end{pmatrix} = \begin{pmatrix} \dfrac{\partial f_1}{\partial x^1} & \cdots & \dfrac{\partial f_1}{\partial x^m} \\ \vdots & & \vdots \\ \dfrac{\partial f_n}{\partial x^1} & \cdots & \dfrac{\partial f_n}{\partial x^m} \end{pmatrix} \begin{pmatrix} \alpha^1 \\ \vdots \\ \alpha^m \end{pmatrix}.$$
The matrix $\Big(\dfrac{\partial f_i}{\partial x^j}\Big)_{n\times m}$ is the Jacobian matrix of f at p w.r.t. the charts (U, ϕ), (V, ψ). Its rank rkp f is called the rank of f : M → N at p. From the above equations, we can easily observe that, under the isomorphisms of Remark 1.24, f∗p corresponds to Df̂(ϕ(p)), the differential at ϕ(p) of the local representation f̂ = ψ ◦ f ◦ ϕ−1 of f .
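In this matrix representation, the composition rule of Remark 1.27 becomes the familiar chain rule for Jacobian matrices, J_{g◦f}(p) = J_g(f(p)) J_f(p). The following Python sketch (the maps f, g and all names are our own illustrative choices) checks this identity numerically with finite differences:

```python
import math

# Numerically verify J_{g o f}(p) = J_g(f(p)) * J_f(p) for smooth maps
# f: R^2 -> R^3 and g: R^3 -> R^2 (illustrative choices).
def jacobian(F, p, h=1e-6):
    # central-difference Jacobian of F at p, one column per coordinate of p
    n = len(F(p))
    J = []
    for i in range(n):
        row = []
        for j in range(len(p)):
            qp = list(p); qp[j] += h
            qm = list(p); qm[j] -= h
            row.append((F(qp)[i] - F(qm)[i]) / (2 * h))
        J.append(row)
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

f = lambda p: [p[0] * p[1], p[0] + p[1], math.exp(p[0])]
g = lambda q: [q[0] + q[1] * q[2], q[1] * q[2]]
p = [0.5, -1.0]

lhs = jacobian(lambda x: g(f(x)), p)                  # J of the composition
rhs = matmul(jacobian(g, f(p)), jacobian(f, p))       # product of Jacobians
for i in range(2):
    for j in range(2):
        assert abs(lhs[i][j] - rhs[i][j]) < 1e-4
```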
2. Geometrical meaning of differentials of mappings
A smooth curve on M is a smooth mapping c : (a, b) → M . The tangent vector $c_{*t_0}\big(\frac{\mathrm{d}}{\mathrm{d}t}\big|_{t_0}\big) \in T_{c(t_0)}M$ is called the velocity vector of c at t0 . Let f : M → N be a smooth mapping, and let p0 = c(t0 ). Then f ◦ c is a smooth curve on N that passes through f (p0 ). By composite differentiation, we have
$$(f\circ c)_{*t_0}\bigg(\frac{\mathrm{d}}{\mathrm{d}t}\bigg|_{t_0}\bigg) = f_{*p_0}\bigg(c_{*t_0}\bigg(\frac{\mathrm{d}}{\mathrm{d}t}\bigg|_{t_0}\bigg)\bigg),$$
i.e., f∗p0 transforms the velocity vector of c at t0 into the velocity vector of f ◦ c at t0 .
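This geometric statement can be tested numerically: the pushforward of the velocity vector of c equals the velocity vector of f ◦ c. A short Python sketch (the curve c, the map f, and all names are our own illustrative choices):

```python
import math

# Check: f_* maps the velocity of c at t0 to the velocity of f o c at t0.
def velocity(curve, t0, h=1e-6):
    a, b = curve(t0 + h), curve(t0 - h)
    return [(a[i] - b[i]) / (2 * h) for i in range(len(a))]

c = lambda t: [math.cos(t), math.sin(t)]        # a curve in R^2
f = lambda p: [p[0] * p[1], p[0] - p[1]]        # a smooth map R^2 -> R^2
t0 = 0.3

p = c(t0)
v = velocity(c, t0)
# pushforward of v: directional derivative of each component of f along v
Jv = []
for i in range(2):
    qp = [p[k] + 1e-6 * v[k] for k in range(2)]
    qm = [p[k] - 1e-6 * v[k] for k in range(2)]
    Jv.append((f(qp)[i] - f(qm)[i]) / 2e-6)

w = velocity(lambda t: f(c(t)), t0)             # velocity of f o c at t0
assert all(abs(Jv[i] - w[i]) < 1e-4 for i in range(2))
```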

1.1.3 Submanifolds
Submanifolds extend curves and surfaces in Euclidean space to differentiable manifolds. In the following section, we focus on the definitions of three kinds of submanifolds and their relationships. First, we state a theorem.
1. Inverse function theorem

Theorem 1.30. Let M, N be m-dimensional differentiable manifolds, f : M → N
a smooth mapping, and p ∈ M . If f∗p : Tp M → Tf (p) N is an isomorphism, then there
exists a neighborhood W of p in M , such that
1◦ f (W ) is a neighborhood of f (p) in N .
2◦ f |W : W → f (W ) is a diffeomorphism (this theorem extends the
inverse function theorem to manifolds).

Proof. Consider charts (U, ϕ) on M about p ∈ M and (V, ψ) on N about f (p) ∈ N , so that f (U ) ⊂ V . Then the local representation f̂ = ψ ◦ f ◦ ϕ−1 : ϕ(U ) → ψ(V ) is a smooth mapping. Since f∗p : Tp M → Tf (p) N is an isomorphism, Df̂(ϕ(p)) : Rm → Rm is also an isomorphism. By the inverse function theorem, there exists a neighborhood O of ϕ(p) ∈ Rm such that f̂(O) is a neighborhood of ψ(f (p)) in Rm , and f̂ : O → f̂(O) is a diffeomorphism. Choosing O appropriately, we may assume O ⊂ ϕ(U ) and f̂(O) ⊂ ψ(V ). Let W = ϕ−1 (O). Then W is a neighborhood of p that meets our requirements. □

Remark 1.31. Given a chart (V, ψ) on N with f (p) ∈ V , choose a neighborhood U of p such that f (U ) ⊂ V , and set ϕ = ψ ◦ (f |U ). By 2◦ of Theorem 1.30, f |U : U → f (U ) is a diffeomorphism (shrinking U if necessary), so (U, ϕ) is a chart on M . Hence f̂ = ψ ◦ f ◦ ϕ−1 = I is an identity mapping from ϕ(U ) into ψ(V ).

Example 1.32. Suppose f : R → S 1 is defined by f (t) = (cos t, sin t). Using the charts of Example 1.12, we obtain
$$\hat f'(t) = \begin{cases} \cos t, & t \in \big(k\pi - \tfrac{\pi}{2},\; k\pi + \tfrac{\pi}{2}\big), \\ -\sin t, & t \in \big(k\pi,\; (k+1)\pi\big). \end{cases}$$
Obviously, f̂ ′ (t) ≠ 0, ∀ t ∈ R. However, f : R → S 1 is not injective. Thus, f is not a diffeomorphism. This example shows that "f∗p : Tp M → Tf (p) N is an isomorphism" and "f : M → N is a homeomorphism on some neighborhood of p" are only local properties.

We have discussed the case where f∗p : Tp M → Tf (p) N is an isomorphism. In


the following section we turn to the case when f∗p is injective.
2. Immersion
Definition 1.33. Let M, N be differentiable manifolds, and f : M → N a smooth
mapping, and p ∈ M . If f∗p : Tp M → Tf (p) N is injective (i.e., rkp f = m), then f
is said to immerse at p. If f immerses at every p ∈ M , then f is called an immersion.

Below are some examples of immersion.

Example 1.34. Let U ⊂ Rm be an open subset, and let α : U → Rn , α(u1 , · · · , um ) =
(u1 , · · · , um , 0, · · · , 0).

By definition, α is obviously an immersion, and is often called a model immersion.


Proposition 1.35. Let M, N be m- and n-dimensional differentiable manifolds, respectively, f : M → N a smooth mapping, and p ∈ M . If f immerses at p, then there exist charts (U, ϕ) on M about p ∈ M and (V, ψ) on N about f (p) ∈ N in which the coordinate description f̂ = ψ ◦ f ◦ ϕ−1 : ϕ(U ) → ψ(V ) has the form

f̂(u1 , · · · , um ) = (u1 , · · · , um , 0, · · · , 0).

Proof. Choose charts (U1 , ϕ1 ) and (V1 , ψ1 ) appropriately so that p ∈ U1 , f (p) ∈ V1 , ϕ1 (p) = O ∈ Rm , ψ1 (f (p)) = O ∈ Rn and f (U1 ) ⊂ V1 . Since f immerses at p, the Jacobian $J_{\hat f}(O) = \Big(\dfrac{\partial \hat f_i}{\partial u^j}\Big|_O\Big)$ has rank m, where f̂ = (f̂1 , · · · , f̂n ). We can assume that the first m rows of the Jacobian matrix Jf̂ (O) are linearly independent. Then define a mapping g : ϕ1 (U1 ) × Rn−m → Rn = Rm × Rn−m by

g(u, v) = f̂(u) + (0, v).

It is easy to prove that g(u, 0) = f̂(u), that g maps the origin of Rn to itself, and that the rank of
$$J_g(O') = \begin{pmatrix} J_{\hat f}(O) & \begin{matrix} 0 \\ I_{n-m} \end{matrix} \end{pmatrix}$$
is n, where 0 denotes an m × (n − m) zero matrix and O′ is the origin of Rn . By the inverse function theorem, g is a diffeomorphism from a neighborhood of the origin of Rn onto a neighborhood of the origin of Rn . Shrinking U1 , V1 to U, V , let ϕ = ϕ1 |U and ψ = g −1 ◦ (ψ1 |V ). Since f̂(u) = g(u, 0), we have ψ ◦ f ◦ ϕ−1 = g −1 ◦ ψ1 ◦ f ◦ ϕ1 −1 = g −1 ◦ f̂, i.e., ψ ◦ f ◦ ϕ−1 (u) = (u, 0), and the proposition is proved. □

Remark 1.36. By definition of immersion, if f : M → N immerses at p ∈ M , then


f immerses in some neighborhood of p.

Remark 1.37. By Proposition 1.35, f restricted to some neighborhood of p has the
injective local expression f̂(u1 , · · · , um ) = (u1 , · · · , um , 0, · · · , 0). Hence f restricted to some neighborhood of p is injective. Note that this is only local injectivity, not global injectivity.

Definition 1.38. Let N, N  be differentiable manifolds, N  ⊂ N . If the inclusion


map i : N  → N is an immersion, then N  is said to be an immersed submanifold of
N.

Remark 1.39. Suppose f is an immersion and injective (such an f will henceforth be called an injective immersion), and let A = {(Uα , ϕα )} be a smooth atlas of M . Denote f A = {(f (Uα ), ϕα ◦ f −1 )}. Then it is easy to prove that {f (M ), f A} is a differentiable manifold. Since f has the local expression f̂ = ϕα ◦ f −1 ◦ f ◦ ϕα −1 = I : ϕα (Uα ) → ϕα (Uα ), f : M → f (M ) is a diffeomorphism, i.e., f∗p : Tp M → Tf (p) f (M ) is an isomorphism. Since f is an immersion, the inclusion map i : f (M ) → N is also an immersion. Hence, f (M ) is an immersed submanifold of N .

From the following example, we can see that the manifold topology of an immersed
submanifold may be inconsistent with its subspace topology and can be very complex.
Example 1.40. T 2 = S 1 × S 1 = {(z1 , z2 ) ∈ C × C | |z1 | = |z2 | = 1}. Define f : R → T 2 , s.t. f (t) = (e2πit , e2πiαt ), where α is an irrational number. We can prove that f (R), with the differentiable manifold structure derived from f , is an immersed submanifold of T 2 , and that f (R) is dense in T 2 .
We may regard T 2 as the unit square in the plane R2 with opposite sides identified, which is also a 2-dimensional manifold. It can be represented by ordered pairs of real numbers (x, y), where x, y are real numbers mod Z. Define
$$\varphi : \mathbf{R}^2 \longrightarrow S^1\times S^1, \qquad \varphi(u^1, u^2) = \big(e^{2\pi i u^1},\, e^{2\pi i u^2}\big),$$
define "∼": (u1 , u2 ) ∼ (v 1 , v 2 ) ⇔ u1 = v 1 (mod Z), u2 = v 2 (mod Z), and let
$$W = \Big(u_0^1 - \tfrac{1}{2},\; u_0^1 + \tfrac{1}{2}\Big)\times\Big(u_0^2 - \tfrac{1}{2},\; u_0^2 + \tfrac{1}{2}\Big).$$
Then (ϕ(W ), ϕ−1 ) is a chart of T 2 = S 1 × S 1 that contains f (t0 ). Choose a neighborhood U of t0 ∈ R so that f (U ) ⊂ ϕ(W ). Then the local expression of f , f̂(t) = (t, αt), is an immersion at t0 . It is easy to prove that ϕ−1 f (R) is dense in R2 , so that f (R) is dense in T 2 = S 1 × S 1 . By definition, f is injective. It follows that the subspace topology which f (R) inherits from T 2 is different from its manifold topology derived from f .
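The density claim of this example can be made plausible numerically: read in the chart above, density of f (R) on the torus reduces to the fractional parts {nα} filling [0, 1). A Python sketch (the slope α, the sample size, and the tolerance are our own choices):

```python
import math

# Example 1.40, numerically: for irrational alpha, the fractional parts of
# n*alpha come within any prescribed tolerance of every point of [0, 1).
alpha = math.sqrt(2)            # an irrational slope
points = sorted((n * alpha) % 1.0 for n in range(2000))

# largest gap between consecutive fractional parts (wrapping around)
gaps = [b - a for a, b in zip(points, points[1:])]
gaps.append(points[0] + 1.0 - points[-1])
assert max(gaps) < 0.01         # the orbit is 0.01-dense in [0, 1)
```

Increasing the sample size makes the maximal gap as small as desired, in line with the density of the winding line.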
3. Regular submanifolds
The type of submanifold given below has a special relationship to its parent differential
manifold, which is similar to that of Euclidean space and its subspace.
Definition 1.41. Let M ′ ⊂ M have the subspace topology, and let k be a nonnegative integer, 0 ≤ k ≤ m. Suppose that for every p ∈ M ′ there exists a chart (U, ϕ) of M containing p, such that
1◦ ϕ(p) = O ∈ Rm .
2◦ ϕ(U ∩ M ′ ) = {(u1 , · · · , um ) ∈ ϕ(U ) | uk+1 = · · · = um = 0}.
Then M ′ is said to be a k-dimensional regular submanifold of M , and such a chart is
called a submanifold chart.
Let A = {(Uα , ϕα )} be the set of all submanifold charts on M . Denote Ã = {(Ũα , ϕ̃α )}, where Ũα = Uα ∩ M ′ , ϕ̃α = π ◦ (ϕα |Ũα ), and π : Rk × Rm−k → Rk is the projection. Since M ′ has the subspace topology, Ũα is an open set of M ′ , and ϕ̃α : Ũα → ϕ̃α (Ũα ) ⊂ Rk is a homeomorphism. Moreover, ∪α Ũα = M ′ , and hence Ã is an atlas of M ′ . ∀ (Ũα , ϕ̃α ), (Ũβ , ϕ̃β ) ∈ Ã with Ũα ∩ Ũβ ≠ Ø, we have

ϕ̃β ◦ ϕ̃α −1 (u1 , · · · , uk ) = π ◦ ϕβ ◦ ϕα −1 (u1 , · · · , uk , 0, · · · , 0).

Obviously, Ã is a smooth atlas of M ′ , which determines a differentiable structure on M ′ . Thus, M ′ is a k-dimensional differentiable manifold.
Below is an example of a regular submanifold.
Example 1.42. Let M, N be m- and n-dimensional differentiable manifolds respec-
tively, and f : M → N be a smooth mapping. Then, the graph of f
gr(f ) = {(p, f (p)) ∈ M × N | p ∈ M }
is an m-dimensional closed regular submanifold of M × N .
Proof. Consider charts (U, ϕ), (V, ψ) with p0 ∈ U , f (p0 ) ∈ V , ϕ(p0 ) = O ∈ Rm , ψ(f (p0 )) = O′ ∈ Rn , f (U ) ⊂ V , and define G : ϕ(U ) × ψ(V ) → Rm+n = Rm × Rn by

G(u, v) = (u, v − f̂(u)).

It is easy to prove that G maps the graph of f̂ onto {(u, O′ ) | u ∈ ϕ(U )}, and that the rank of
$$J_G(O, O') = \begin{pmatrix} I_m & O \\ -D\hat f(O) & I_n \end{pmatrix}$$
is m + n. Since G(O, O′ ) = (O, O′ ), G homeomorphically maps some neighborhood Ũ of (O, O′ ) in ϕ(U ) × ψ(V ) onto some neighborhood Ṽ of (O, O′ ) in Rm+n . Denote

W = (ϕ × ψ)−1 (Ũ ), χ = G ◦ (ϕ × ψ)|W.

Then (W, χ) is a chart of M × N that contains (p0 , f (p0 )), and

χ(p0 , f (p0 )) = (O, O′ ) ∈ Rm+n ,
χ(W ∩ gr(f )) = {(u, v) ∈ χ(W ) | v = 0}.

This completes the proof. □
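The idea of the proof is that G straightens the graph of f onto the coordinate slice {v = 0}. A minimal Python illustration of this straightening, with an arbitrary smooth f of our own choosing:

```python
# G(u, v) = (u, v - f(u)) maps the graph of f onto the slice {v = 0}.
f = lambda u: u * u + 1.0       # an illustrative smooth function R -> R

def G(u, v):
    return (u, v - f(u))

# points on the graph land on the slice v = 0; points off the graph do not
for u in [-1.0, 0.0, 2.5]:
    assert G(u, f(u))[1] == 0.0
assert G(1.0, 3.0)[1] != 0.0    # (1, 3) is off the graph since f(1) = 2
```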

Remark 1.43. If N ′ is a regular submanifold of N , f : M → N is a smooth mapping, and f (M ) ⊂ N ′ , then f : M → N ′ is also a smooth mapping. Indeed, let (U, ϕ) be a chart of M , let (V, ψ) be a submanifold chart of N , and let (Ṽ , ψ̃) be the chart of N ′ induced from (V, ψ). Then, by the fact that N ′ is a regular submanifold of N , we know ψ ◦ f ◦ ϕ−1 (u) = (ψ̃ ◦ f ◦ ϕ−1 (u), 0). Hence the smoothness of f : M → N leads to the smoothness of f : M → N ′ .

Remark 1.44. Let M ′ be a k-dimensional regular submanifold of M , and let i : M ′ → M be the inclusion mapping. Take a submanifold chart (U, ϕ) of M and the chart (Ũ , ϕ̃) of M ′ it induces. Then î = ϕ ◦ i ◦ ϕ̃−1 : ϕ̃(Ũ ) → ϕ(U ) has the form

î(u1 , · · · , uk ) = (u1 , · · · , uk , 0, · · · , 0).

Thus, i∗p : Tp M ′ → Tp M is injective, which means that a regular submanifold is
automatically an immersed submanifold.

4. Embedded submanifolds
Definition 1.45. Let f : M → N be an injective immersion. If f : M → f (M ) is
a homeomorphism, where f (M ) has the subspace topology of N , then f (M ) is an
embedded submanifold of N , and f is called an embedding.

Proposition 1.46. Suppose f : M → N is an embedding, then f (M ) is a regular


submanifold of N , and f : M → f (M ) is a diffeomorphism.
Proof. Since f is an embedding, f is an immersion, and ∀ q ∈ f (M ) there exists p ∈ M with f (p) = q. Choose charts (U, ϕ), (V, ψ) with p ∈ U , f (p) ∈ V , ϕ(p) = O ∈ Rm , ψ(q) = O′ ∈ Rn , f (U ) ⊂ V , and f̂(u1 , · · · , um ) = (u1 , · · · , um , 0, · · · , 0). Since f : M → f (M ) is a homeomorphism, if U is an open subset of M , then f (U ) is an open subset of f (M ), and there exists an open subset W1 ⊂ N such that f (U ) = W1 ∩ f (M ). Denote W = V ∩ W1 , χ = ψ|W . Then χ(q) = O′ ∈ Rn , and

χ(W ∩ f (M )) = {(u1 , · · · , un ) ∈ χ(W ) | um+1 = · · · = un = 0},

i.e., (W, χ) is a submanifold chart of N that contains q, which means that f (M ) is a regular submanifold of N . Let (W̃ , χ̃) be the chart of f (M ) induced from (W, χ). Then, from χ ◦ f ◦ ϕ−1 (u) = (χ̃ ◦ f ◦ ϕ−1 (u), 0), we conclude that f : M → f (M ) is a diffeomorphism. □

Remark 1.47. If f is an immersion, then we can choose the charts of M, N appropriately so that f has the local expression f̂(u1 , · · · , um ) = (u1 , · · · , um , 0, · · · , 0). Therefore, f is an injective immersion in some neighborhood U of p, and f : U → f (U ) is a homeomorphism, where f (U ) has the subspace topology induced from N . Therefore, f |U : U → N is an embedding.
Definition 1.48. Let X, Y be two topological spaces, and let f : X → Y be continuous.
If for every compact subset K of Y the preimage f −1 (K) is a compact subset of X,
then f is said to be a proper mapping.
Proposition 1.49. Let f : M → N be an injective immersion. If f is a proper map-
ping, then f is an embedding.
Proof. It suffices to prove that f −1 : f (M ) → M is continuous. Suppose not; then there exist an open set W of M and a sequence of points {qi } of f (M ) s.t. qi ∉ f (W ), but {qi } converges to some point q0 ∈ f (W ). Denote pi = f −1 (qi ), p0 = f −1 (q0 ), p0 ∈ W . Since {q0 , q1 , q2 , · · ·} is compact and f is a proper mapping, {p0 , p1 , p2 , · · ·} lies in a compact subset of M . Let p′ ∈ M be the limit of a convergent subsequence of {pi }. Since f is continuous, the corresponding subsequence of {f (pi )} = {qi } converges to f (p′ ), i.e., f (p′ ) = q0 = f (p0 ). By injectivity, p′ = p0 ∈ W . Therefore, for i large enough along this subsequence, pi ∈ W and qi = f (pi ) ∈ f (W ). This contradicts qi ∉ f (W ). □
Remark 1.50. Let f be an injective immersion. If M is compact, then f is a proper
mapping. By Proposition 1.49, f is an embedding.

1.1.4 Submersion and Transversal


Below we discuss the local property of f when f∗p : Tp M → Tf (p) N is surjective.
Definition 1.51. Let f : M → N be smooth and p ∈ M . If f∗p : Tp M → Tf (p) N is surjective, then f is a submersion at p (or f submerses at p); if f is a submersion at every p ∈ M , then f is said to be a submersion.
Similar to the proposition for a mapping that immerses at p, we have the following proposition.
Proposition 1.52. Given a smooth f and p ∈ M , if f submerses at p, then there exist charts (U, ϕ) on M about p and (V, ψ) on N about f (p) ∈ N in which f̂ = ψ ◦ f ◦ ϕ−1 : ϕ(U ) → ψ(V ) has the form

f̂(u1 , · · · , um ) = (u1 , · · · , un ).

Proof. Take charts (U1 , ϕ1 ), (V, ψ) with p ∈ U1 , f (p) ∈ V , ϕ1 (p) = O ∈ Rm , ψ(f (p)) = O′ ∈ Rn and f (U1 ) ⊂ V . Since f submerses at p, the Jacobian $J_{\hat f}(O) = \Big(\dfrac{\partial \hat f_i}{\partial u^j}\Big|_O\Big)$ has rank n, where f̂ = (f̂1 , · · · , f̂n ). We may assume that the first n columns of Jf̂ (O) are linearly independent. Let g : ϕ1 (U1 ) → ψ(V ) × Rm−n satisfy

g(u1 , · · · , um ) = (f̂(u1 , · · · , um ), un+1 , · · · , um ).

Then g(O) = (O′ , 0), and
$$J_g(O) = \begin{pmatrix} J_{\hat f}(O) \\ O \quad I_{m-n} \end{pmatrix}$$
has rank m. By the inverse function theorem, g maps some neighborhood W of O diffeomorphically onto a neighborhood g(W ) ⊂ ψ(V ) × Rm−n . Let U = ϕ1 −1 (W ) and ϕ = g ◦ (ϕ1 |U ). Since f̂ = β ◦ g, where β : Rm → Rn , β(u1 , · · · , um ) = (u1 , · · · , un ) is the projection, we obtain ψ ◦ f ◦ ϕ−1 = f̂ ◦ g −1 = β ◦ g ◦ g −1 = β. □
Remark 1.53. By Definition 1.51, if f : M → N is a submersion at p ∈ M , then f
is a submersion in some neighborhood of p.
Remark 1.54. If f : M → N is a submersion, then f is an open mapping (i.e., f maps open sets to open sets). In particular, f (M ) is an open subset of N .
Indeed, let G be an open subset of M and q ∈ f (G); there exists p ∈ G s.t. f (p) = q. Since f is a submersion, there exist charts (U, ϕ), (V, ψ) with p ∈ U , q ∈ V , U ⊂ G, and f̂ : ϕ(U ) → ψ(V ), f̂(u1 , · · · , um ) = (u1 , · · · , un ). Let H = β(ϕ(U )), where β(u1 , · · · , um ) = (u1 , · · · , un ), so that H ⊂ ψ(V ) is open. Thus, ψ −1 (H) is a neighborhood of q in N and ψ −1 (H) ⊂ f (G), i.e., f (G) is an open subset of N .
Next, for a fixed q0 ∈ N , we consider under what condition f −1 (q0 ) is a regular
submanifold of M .
Definition 1.55. Let f : M → N be smooth and p ∈ M . If f∗p : Tp M → Tf (p) N is surjective, then p is said to be a regular point of f (i.e., f submerses at p); otherwise, p is said to be a critical point of f . A point q ∈ N is called a regular value of f if q ∉ f (M ), or if q ∈ f (M ) and each p ∈ f −1 (q) is a regular point of f ; otherwise, q is called a critical value of f .
Remark 1.56. When dim M < dim N , since dim Tp M = dim M < dim N =
dim Tf (p) N , no p ∈ f −1 (q) with q ∈ f (M ) can be a regular point of f . Hence,
q ∈ N is a regular value of f ⇔ q ∉ f (M ).
Theorem 1.57. Let f : M → N be smooth and q ∈ N . If q is a regular value of f
and f −1 (q) ≠ Ø, then f −1 (q) is an (m − n)-dimensional regular submanifold of M .
Moreover, ∀ p ∈ f −1 (q),
Tp {f −1 (q)} = ker f∗p .
1.1 Differentiable Manifolds 53

Proof. Since q is a regular value, ∀ p ∈ f −1 (q), f submerses at p by definition. By
Proposition 1.52, there exist charts (U, ϕ), (V, ψ), p ∈ U, f (p) = q ∈ V, ϕ(p) =
O ∈ Rm , ψ(q) = O ∈ Rn , and ψ ◦ f ◦ ϕ−1 (u1 , · · · , um ) = f̂(u1 , · · · , um ) =
(u1 , · · · , un ). Hence, ϕ{U ∩ f −1 (q)} = {(u1 , · · · , um ) ∈ ϕ(U ) | u1 = · · · = un = 0},
i.e., (U, ϕ) is a submanifold chart of M that contains p. Therefore, f −1 (q) is a regular
submanifold of M , and dim f −1 (q) = m − n.
    Note that f |f −1 (q) = f ◦ i : f −1 (q) → N , where i : f −1 (q) → M is the inclu-
sion mapping. Since f |f −1 (q) ≡ q is a constant mapping, f∗p ◦ i∗p = (f |f −1 (q) )∗p = 0,
i.e., i∗p (Tp {f −1 (q)}) ⊂ ker f∗p . Furthermore, because q is a regular value of f ,
f∗p (Tp M ) = Tq N , and dim ker f∗p = dim Tp M − dim f∗p (Tp M ) = m − n =
dim f −1 (q). Therefore, we have Tp {f −1 (q)} = ker f∗p . □

Remark 1.58. Given f : M → N smooth, dim M = dim N , and M compact. If
q ∈ N is a regular value of f , then f −1 (q) = Ø or f −1 (q) consists of finitely many points.
    By Theorem 1.57, if f −1 (q) ≠ Ø, then f −1 (q) is a 0-dimensional regular sub-
manifold of M . By definition, we have ϕ(U ∩ f −1 (q)) = {O} ⊂ Rm , i.e., every point
in f −1 (q) is an isolated point. Moreover, due to the compactness of f −1 (q) (a closed
subset of the compact M ), f −1 (q) must consist of finitely many points.

Below, we give some applications of Theorem 1.57.

Example 1.59. Let f : Rn+1 → R, and f (u1 , · · · , un+1 ) = Σi (ui )2 , the sum running
over i = 1, · · · , n + 1.
    From the Jacobian matrix of f at (u1 , · · · , un+1 ), we know that f is not a submersion
at (u1 , · · · , un+1 ) ⇔ u1 = · · · = un+1 = 0. Therefore, any non-zero real number is
a regular value of f . According to Theorem 1.57, the n-dimensional unit sphere
S n = f −1 (1) is an n-dimensional regular submanifold of Rn+1 .
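Example 1.59 can also be checked symbolically; a sketch for n = 2 (the point p on S 2 is an arbitrary choice for illustration):

```python
import sympy as sp

u = sp.symbols('u1 u2 u3')
f = sum(ui**2 for ui in u)                 # f(u) = (u1)^2 + (u2)^2 + (u3)^2
grad = [sp.diff(f, ui) for ui in u]        # Jacobian row (2u1, 2u2, 2u3)
# f fails to be a submersion exactly where all partials vanish, i.e. at O:
assert sp.solve(grad, list(u)) == {u[0]: 0, u[1]: 0, u[2]: 0}
# So every non-zero value, in particular 1, is a regular value; at a point
# of S^2 = f^{-1}(1) the Jacobian is onto R:
p = {u[0]: sp.Rational(2, 3), u[1]: sp.Rational(2, 3), u[2]: sp.Rational(1, 3)}
assert f.subs(p) == 1
assert any(g.subs(p) != 0 for g in grad)
```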

Example 1.60. Let f : R3 → R, and f (u1 , u2 , u3 ) = (a − √((u1 )2 + (u2 )2 ))2 +
(u3 )2 , a > 0.
    A computation of the Jacobian as above shows that every b2 with 0 < b2 < a2 is a
regular value of f . Therefore, by Theorem 1.57, the torus T 2 = f −1 (b2 ) is a
2-dimensional regular submanifold of R3 .
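A numerical sketch of Example 1.60, with the assumed sample values a = 2, b = 1 and the standard torus parametrization (both are illustrative choices, not from the text):

```python
import numpy as np

a, b = 2.0, 1.0

def f(u):
    return (a - np.hypot(u[0], u[1]))**2 + u[2]**2

def grad_f(u):
    r = np.hypot(u[0], u[1])
    return np.array([-2*(a - r)*u[0]/r, -2*(a - r)*u[1]/r, 2*u[2]])

# Sample points of f^{-1}(b^2) via the torus parametrization and check
# that the gradient never vanishes there, i.e. f submerses at each point.
for th in np.linspace(0, 2*np.pi, 12, endpoint=False):
    for ph in np.linspace(0, 2*np.pi, 12, endpoint=False):
        p = np.array([(a + b*np.cos(ph))*np.cos(th),
                      (a + b*np.cos(ph))*np.sin(th),
                      b*np.sin(ph)])
        assert abs(f(p) - b**2) < 1e-9            # p lies on f^{-1}(b^2)
        assert np.linalg.norm(grad_f(p)) > 1e-8   # b^2 is a regular value
```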
If M  is a regular submanifold of M , then dim M − dim M  =codim M  is called
the M -codimension of M  . Denote M  = {p ∈ M | fi (p) = 0 (i = 1, · · · , k)} and
consider the mapping

F : M −→ Rk , F (p) = (f1 (p), · · · , fk (p)).

If fi : M → R is smooth, then F is smooth too, and M  = F −1 (O).

Proposition 1.61. Suppose M ′ is a subset of M . Then, M ′ is a k-codimensional reg-
ular submanifold of M if and only if for all q ∈ M ′ , there exist a neighborhood U of
q in M and a smooth mapping F : U → Rk , s.t.
    1◦ U ∩ M ′ = F −1 (O);
    2◦ F : U → Rk is a submersion.

Proof. Necessity. By the definition of the regular submanifold, if M ′ is a k-codimen-
sional regular submanifold of M , then ∀p ∈ M ′ , there exists a submanifold chart
(U, ϕ) of M that contains p, s.t. ϕ(p) = O ∈ Rm , and ϕ(U ∩ M ′ ) = {(u1 , · · · , um ) ∈
ϕ(U ) | um−k+1 = · · · = um = 0}. Denote the projection by π : Rm =
Rm−k × Rk → Rk , and let F = π ◦ ϕ : U → Rk . Then, F is smooth, and F −1 (O) =
(π ◦ ϕ)−1 (O) = U ∩ M ′ , F∗q = π∗ϕ(q) ◦ ϕ∗q . Since ϕ∗q is an isomorphism and π∗ϕ(q)
is surjective, F submerses at every q ∈ U .
    Sufficiency. If F : U → Rk is a submersion, then O ∈ Rk is a regular value
of F . By Theorem 1.57, F −1 (O) = U ∩ M ′ is a k-codimensional regular submanifold of U , i.e.,
M ′ is a k-codimensional regular submanifold of M . □

    We know that if q ∈ N is a regular value of f : M → N and f −1 (q) ≠ Ø, then
f −1 (q) is a regular submanifold of M . Assume now that Z is a regular submanifold of N .
Then, under what condition would f −1 (Z) be a regular submanifold of M ? For this
question, we have the following definition.
Definition 1.62. Suppose Z is a regular submanifold of N , f : M → N is smooth,
p ∈ M . Then, we say f is transversal to Z at p if f (p) ∉ Z, or f (p) ∈ Z and

    f∗p Tp M + Tf (p) Z = Tf (p) N,

denoted by f ⋔p Z. If f ⋔p Z, ∀p ∈ M , then f is transversal to Z, denoted by f ⋔ Z.

Remark 1.63. If dim M + dim Z < dim N , then f ⋔ Z ⇔ f (M ) ∩ Z = Ø; if q ∈ N
is a regular value of f , then f submerses at each p ∈ f −1 (q), so f ⋔p Z at such p for
any regular submanifold Z of N ; if f : M → N is a submersion,
then for any regular submanifold Z of N , f ⋔ Z.

    For transversality, let us look at its geometric meaning through an example.

Example 1.64. M = R, N = R2 , Z is the x-axis in R2 , f : M → N, f (t) = (t, t2 ).
    When t ≠ 0, as a result of f (t) ∉ Z, f ⋔t Z.
    When t = 0, Jf (0) = (1, 0)ᵀ ; note that

    f∗0 (d/dt |0 ) = ∂/∂u1 |(0,0) ,

so f∗0 T0 M = T(0,0) Z. Therefore, f∗0 T0 M + T(0,0) Z = T(0,0) N cannot hold. Thus,
f is not transversal to Z at 0.
    However, if we change f to f (t) = (t, t2 − 1), we obtain: when t ≠ ±1, f (t) ∉
Z, so f ⋔t Z; when t = 1, Jf (1) = (1, 2)ᵀ , therefore

    f∗1 (d/dt |1 ) = ∂/∂u1 |(1,0) + 2 ∂/∂u2 |(1,0) ,

i.e., f∗1 T1 M + T(1,0) Z = T(1,0) N , and hence f ⋔1 Z. Similarly, we have
f ⋔−1 Z. Thus, f ⋔ Z.
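The two curves of Example 1.64 can be tested mechanically: at each parameter where f (t) lies on Z, check whether f ′(t) together with the direction of Z spans R2 . A sketch (the helper name is illustrative):

```python
import sympy as sp

t = sp.symbols('t')
Z_dir = sp.Matrix([1, 0])          # tangent direction of Z (the x-axis)

def transversal_to_x_axis(curve):
    # Transversal iff at every intersection with Z the velocity f'(t)
    # and e1 span R^2 (nonzero 2x2 determinant).
    roots = sp.solve(curve[1], t)  # parameters where f(t) lies on Z
    d = sp.diff(curve, t)
    return all(sp.Matrix.hstack(d.subs(t, r), Z_dir).det() != 0 for r in roots)

assert not transversal_to_x_axis(sp.Matrix([t, t**2]))      # tangent at t = 0
assert transversal_to_x_axis(sp.Matrix([t, t**2 - 1]))      # crosses at t = ±1
```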
    Submanifold transversality: let Z, Z ′ be two regular submanifolds of N , and i : Z ′ → N
the inclusion mapping. If i ⋔ Z, then the submanifold Z ′ is said to be transversal to Z, denoted by
Z ′ ⋔ Z. If Z ′ ⋔ Z, then ∀p ∈ Z ∩ Z ′ , by definition, we have

    i∗p (Tp Z ′ ) + Tp Z = Tp N,

i.e.,

    Tp Z ′ + Tp Z = Tp N.
    We assume that f : M → N is smooth, and Z is a k-codimensional regular
submanifold of N , p ∈ M, f (p) = q ∈ Z. According to Proposition 1.61, there
exists a submanifold chart (V, ψ) of N that contains q, s.t. π ◦ ψ : V → Rk is a
submersion, and Z ∩ V = (π ◦ ψ)−1 (O). Now, take a neighborhood U of p in M , s.t.
f (U ) ⊂ V ; then π ◦ ψ ◦ f : U → Rk .
Proposition 1.65. f ⋔p Z ⇔ π ◦ ψ ◦ f : U → Rk submerses at p.

Proof. Since π ◦ ψ submerses at f (p), O ∈ Rk is a regular value of π ◦ ψ. From
Z ∩ V = (π ◦ ψ)−1 (O), we know that for every q ∈ Z ∩ V , (π ◦ ψ)∗q Tq N =
TO Rk . By Theorem 1.57, ker(π ◦ ψ)∗q = Tq Z. Therefore, f∗p Tp M + Tq Z = Tq N ⇔
(π ◦ ψ)∗q (f∗p Tp M ) = TO Rk ⇔ (π ◦ ψ ◦ f )∗p (Tp M ) = TO Rk , i.e., π ◦ ψ ◦ f submerses
at p. □

Remark 1.66. Extending the conclusion of Proposition 1.65, we have f ⋔ Z ⇔
O ∈ Rk is a regular value of π ◦ ψ ◦ f : U → Rk .

Remark 1.67. If f ⋔p Z, i.e., π ◦ ψ ◦ f : U → Rk submerses at p, then by Proposition
1.52 we can choose a coordinate chart s.t. π ◦ ψ ◦ f ◦ ϕ−1 : ϕ(U ) → Rk has the form

    (π ◦ ψ ◦ f ◦ ϕ−1 )(u1 , · · · , um ) = (um−k+1 , · · · , um ).

Then, f̂ = ψ ◦ f ◦ ϕ−1 can be represented by

    f̂ = (η1 (u1 , · · · , um ), · · · , ηn−k (u1 , · · · , um ), um−k+1 , · · · , um ).

Theorem 1.68 (Extension of Theorem 1.57). Suppose f : M → N is smooth, Z is
a k-codimensional regular submanifold of N . If f ⋔ Z and f −1 (Z) ≠ Ø, then f −1 (Z)
is a k-codimensional regular submanifold of M , and ∀p ∈ f −1 (Z),

    Tp {f −1 (Z)} = (f∗p )−1 {Tf (p) Z}.

Proof. ∀ p ∈ f −1 (Z), let q = f (p) ∈ Z. Since Z is a k-
codimensional regular submanifold of N , there exists a submanifold chart (V, ψ) of
N that contains q. Let U = f −1 (V ). From f ⋔ Z, we know that O ∈ Rk is a regular
value of π ◦ ψ ◦ f , and

    U ∩ f −1 (Z) = (π ◦ ψ ◦ f )−1 (O).

By Theorem 1.57, U ∩ f −1 (Z) is a k-codimensional regular submanifold of U , and

    Tp {f −1 (Z)} = ker(π ◦ ψ ◦ f )∗p
                 = (f∗p )−1 {((π ◦ ψ)∗q )−1 (O)}
                 = (f∗p )−1 {Tf (p) Z}.

The theorem is proved. □



1.2 Tangent Bundle


The tangent bundle of a differentiable manifold M is the disjoint union of the tangent
spaces of M . It is useful, in distinguishing between the tangent space and the tangent bundle, to
consider their dimensions, n and 2n respectively. In other words, the tangent bundle
accounts for the positions in the manifold as well as the directions tangent
to it. Since there is a projection map sending each element of the tangent bundle to the
point of the manifold in whose tangent space that element lies, the
tangent bundle is also a fiber bundle.

1.2.1 Tangent Bundle and Orientation


In this section, we will discuss two invariable properties under diffeomorphism–
tangent bundle and orientation.
1. Tangent Bundle
Definition 2.1. The triple (T M, M, π) is called the tangent bundle of the differentiable man-
ifold M (often simply written T M ), where T M = ⊔p∈M Tp M , and the projection map
π : T M → M satisfies π(Xp ) = p for every Xp ∈ Tp M . For every p ∈ M, π −1 (p) = Tp M
is called the fiber of the tangent bundle T M at p.

Proposition 2.2. Let M be an m-dimensional differentiable manifold. Then, T M is a
2m-dimensional differentiable manifold, and π : T M → M is a submersion.

Proof. Let (U, ϕ) be a chart on M , with coordinate functions x1 , · · · , xm . Then,
∀ Xp ∈ π −1 (U ), Xp = Σi ai ∂/∂xi |p . Define ϕU : π −1 (U ) → ϕ(U ) × Rm , s.t.

    ϕU (Xp ) = (ϕ(p); a1 , · · · , am );

obviously, ϕU is a one-to-one mapping.
    Note that as (U, ϕ) ranges over all the charts on M , the corresponding π −1 (U ) con-
stitute a covering of T M . The topology of π −1 (U ) is given as follows: a subset of
π −1 (U ) is open iff its image under ϕU is an open set of ϕ(U ) × Rm ; that is, through
the one-to-one correspondence ϕU , the topology of the subspace ϕ(U ) × Rm of Rm × Rm = R2m
is lifted to π −1 (U ). The topology on T M can then be defined as follows:
W is called an open subset of T M iff W ∩ π −1 (U ) is an open subset of π −1 (U ) for every chart. It
is easy to deduce that T M constitutes a topological space that satisfies the following
conditions:
    1◦ T M is a Hausdorff space that has countable bases.
    2◦ π −1 (U ) is an open subset of T M , and ϕU is a homeomorphism from π −1 (U )
to an open subset of R2m .
    Furthermore, the manifold structure on T M can be naturally
induced from the manifold structure on M . We show that A = {(π −1 (U ), ϕU )} is a
smooth atlas of T M . Take any two charts (π −1 (U ), ϕU ), (π −1 (V ), ψV ) ∈ A with
π −1 (U ) ∩ π −1 (V ) ≠ Ø. Let x1 , · · · , xm and y 1 , · · · , y m be the coordinate
functions of the charts (U, ϕ), (V, ψ). Then,

    ψV ◦ ϕU−1 (u; a1 , · · · , am ) = ψV ( Σi ai ∂/∂xi |ϕ−1 (u) )
        = ψV ( Σj ( Σi ai ∂y j /∂xi |ϕ−1 (u) ) ∂/∂y j |ϕ−1 (u) )
        = ( ψ ◦ ϕ−1 (u); Σi ai ∂y 1 /∂xi |ϕ−1 (u) , · · · , Σi ai ∂y m /∂xi |ϕ−1 (u) ).

It is easy to conclude that T M is a 2m-dimensional manifold and A is a differentiable
structure on T M . From the charts (U, ϕ) of M and (π −1 (U ), ϕU ) of T M , we know
that π̂ = ϕ ◦ π ◦ ϕU−1 : ϕ(U ) × Rm → ϕ(U ) has the form

    π̂(u; a1 , · · · , am ) = u.

By the definition of submersion, π is a submersion. □
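The chart-transition formula in the proof, specialized to m = 1, sends (u; a) to (h(u); h′(u)a), where h = ψ ◦ ϕ−1 . A sketch with a hypothetical transition h(u) = u3 + u, checked against the chain rule:

```python
import sympy as sp

u, a = sp.symbols('u a')
# Hypothetical chart transition h = psi o phi^{-1} on a 1-dimensional chart.
h = u**3 + u
# Induced transition on T M sends (u; a) to (h(u); h'(u) a).
fiber = sp.diff(h, u) * a
# Consistency with the chain rule: the vector a d/dx applied to the test
# function sin(y), y = h(x), must agree with (h'(u) a) sin'(y).
lhs = a * sp.diff(sp.sin(h), u)
rhs = fiber * sp.cos(h)
assert sp.simplify(lhs - rhs) == 0
```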
    Given below are two examples of trivial tangent bundles (T M is called trivial if there exists a diffeo-
morphism from T M to M × Rm whose restriction to
each fiber Tp M of T M is a linear isomorphism from Tp M to {p} × Rm ).
Example 2.3. Let U be an open subset of Rm . Then T U ≅ U × Rm .
    ∀ Xu ∈ T U , Xu = Σi ai ∂/∂ui |u , where {∂/∂ui |u } (i = 1, · · · , m) is the basis of
Tu U . Then, it is easy to prove that

    Xu −→ (u; a1 , · · · , am )

is a diffeomorphism from T U to U × Rm . Moreover, since each fiber Tu U of T U is
a linear space, the map restricted to Tu U is a linear isomorphism from Tu U to {u} × Rm ,
i.e., T U is a trivial tangent bundle.

Example 2.4. T S 1 is a trivial tangent bundle, i.e., T S 1 ≅ S 1 × R.
    Let A = {(U, ϕ), (V, ψ)} be a smooth atlas on S 1 , where

    U = {(cos θ, sin θ) | 0 < θ < 2π},     ϕ(cos θ, sin θ) = θ,
    V = {(cos θ, sin θ) | −π < θ < π},     ψ(cos θ, sin θ) = θ,

    ψ ◦ ϕ−1 (θ) = { θ,         0 < θ < π,
                  { θ − 2π,    π < θ < 2π.

Define f : T S 1 → S 1 × R, s.t.

    f (Xp ) = { (p; a),   p ∈ U,  Xp = a ∂/∂x |p ,
              { (p; b),   p ∈ V,  Xp = b ∂/∂y |p ,

where x, y are the coordinate functions on (U, ϕ), (V, ψ) respectively. When p ∈ U ∩
V , we have

    ∂/∂x |p = (∂y/∂x) |p ∂/∂y |p = ∂/∂y |p .

Therefore, f is well defined and is a one-to-one correspondence. Moreover, f and f −1
are smooth. Hence, T S 1 is a trivial tangent bundle.
    Apart from trivial tangent bundles, there exists a broad class of nontrivial tangent
bundles. For example, T S 2 is a nontrivial tangent bundle.

Definition 2.5. Let f : M → N be smooth. Define a mapping T f : T M → T N , s.t.

T f |Tp M = f∗p , ∀ p ∈ M,

then T f is called the tangent mapping of f .

Remark 2.6. ∀ Xp ∈ Tp M , there exist charts (U, ϕ) on M about p and (V, ψ)
on N about f (p), s.t. f (U ) ⊂ V . From π1 : T M → M, π2 : T N → N , it
is naturally derived that (π1−1 (U ), ϕU ), (π2−1 (V ), ψV ) are charts on T M, T N , and
T f (π1−1 (U )) ⊂ π2−1 (V ). Note that

    ψV ◦ T f ◦ ϕU−1 (u; a1 , · · · , am )
        = ( ψ ◦ f ◦ ϕ−1 (u); Σi ai ∂ f̂1 /∂xi |ϕ−1 (u) , · · · , Σi ai ∂ f̂n /∂xi |ϕ−1 (u) ),

which may be simplified as

    ψV ◦ T f ◦ ϕU−1 (u; α) = ( f̂(u); Df̂(u)α ),

where α = (a1 , · · · , am )ᵀ and f̂ = ψ ◦ f ◦ ϕ−1 with components f̂1 , · · · , f̂n . There-
fore, T f is a smooth mapping.

Remark 2.7. Let M, N, L be the differentiable manifolds. By the definition of tan-


gent mapping, if f : M → N and g : N → L are smooth, then

T (g ◦ f ) = T g ◦ T f.

Remark 2.8. If f : M → N is a diffeomorphism, then T f : T M → T N is also a


diffeomorphism.

2. Orientation
Next, we introduce the concept of orientation for differentiable manifolds.
    Let V be an m-dimensional vector space, and {e1 , · · · , em }, {e′1 , · · · , e′m } two
ordered bases of V . If e′j = a1j e1 + · · · + amj em (j = 1, · · · , m), then

    (e′1 , · · · , e′m ) = (e1 , · · · , em )A,

where A = (aij )m×m . If det A > 0, we call {ei } and {e′j } concurrent; otherwise,
if det A < 0, we call {ei } and {e′j } reverse. Then, a direction μ of V can be ex-
pressed by the concurrence class [{ej }] of bases concurrent with {ej }. The other direction −μ
is the class of the bases reverse to {ej }. (V, μ) is called an
orientable vector space.
    Let (V, μ), (W, ν) be two orientable vector spaces and A : V → W a linear isomor-
phism from V to W . If the orientation of W induced by A is consistent with
ν, i.e., Aμ = ν, then A preserves orientation; otherwise, A reverses orientation. In
the section below, we extend the orientation concept to differentiable manifolds.

Definition 2.9. Let M be an m-dimensional differentiable manifold, and suppose that
for each p ∈ M an orientation μp of Tp M is given, s.t. every p lies in a chart (U, ϕ) for which

    ϕ∗q : (Tq M, μq ) −→ (Tϕ(q) Rm , νϕ(q) ),   ∀ q ∈ U,

are all orientation-preserving linear isomorphisms, where

    νϕ(q) = [{ ∂/∂u1 |ϕ(q) , · · · , ∂/∂um |ϕ(q) }].

Then, μ = {μp | p ∈ M } is an orientation on M , and (M, μ) is called an orientable
differentiable manifold.

Remark 2.10. Definition 2.9 shows that if (M, μ) is an orientable differentiable
manifold and W is an open subset of M , then every p ∈ W carries the orientation μp
of Tp M . This gives an orientation on W , denoted by μ|W . Then, (W, μ|W ) is also an
orientable differentiable manifold. Specifically, if (U, ϕ) is a chart on M , then (U, μ|U )
is an orientable differentiable manifold.

Remark 2.10 shows that M may be locally orientable. Next, we discuss how to
construct a global orientation.

Proposition 2.11. Let M be an m-dimensional differentiable manifold. Then, M is
orientable iff there exists a smooth atlas A = {(Uα , ϕα )} on M , s.t. ∀ (Uα , ϕα ),
(Uβ , ϕβ ) ∈ A, if Uα ∩ Uβ ≠ Ø, then

    det Jϕβ ◦ϕα−1 (ϕα (q)) > 0,   ∀ q ∈ Uα ∩ Uβ ,

where Jϕβ ◦ϕα−1 (ϕα (q)) is the Jacobian matrix of ϕβ ◦ ϕα−1 at ϕα (q).

Proof. Necessity. Since M is orientable, select one of the orientations of M , μ =
{μp | p ∈ M }. According to Definition 2.9, ∀p ∈ M , there exists a chart (U, ϕ) on
M about p, s.t. ∀q ∈ U ,

    ϕ∗q μq = [{ ∂/∂u1 |ϕ(q) , · · · , ∂/∂um |ϕ(q) }].

Denote the set consisting of all such charts by A. Then, A is a smooth atlas of M ,
and the properties of A described in the proposition are easy to prove.

    Sufficiency. Let A be an atlas that satisfies all the properties of the proposition.
Choose (Uα , ϕα ), (Uβ , ϕβ ) ∈ A with Uα ∩ Uβ ≠ Ø, and use x1 , · · · , xm and
y 1 , · · · , y m to represent the coordinate functions of (Uα , ϕα ), (Uβ , ϕβ ) respectively.
Note that, at any q ∈ Uα ∩ Uβ ,

    ( ∂/∂x1 |q , · · · , ∂/∂xm |q ) = ( ∂/∂y 1 |q , · · · , ∂/∂y m |q ) Jϕβ ◦ϕα−1 (ϕα (q)),

and by the supposition det Jϕβ ◦ϕα−1 (ϕα (q)) > 0, we have

    [{ ∂/∂x1 |q , · · · , ∂/∂xm |q }] = [{ ∂/∂y 1 |q , · · · , ∂/∂y m |q }],

i.e., the orientations determined chart by chart agree on overlaps, and M is orientable. □

Remark 2.12. If f : M → N is a diffeomorphism, then f A = {(f (Uα ), ϕα ◦ f −1 )} is
a smooth atlas on N . Picking two charts on N , (f (Uα ), ϕα ◦ f −1 ), (f (Uβ ), ϕβ ◦ f −1 ), we
have det Jϕβ ◦f −1 ◦f ◦ϕα−1 (ϕα (q)) = det Jϕβ ◦ϕα−1 (ϕα (q)), ∀ q ∈ Uα ∩ Uβ . If M is
orientable, then N is orientable too, which means that orientability is an invariant property under
diffeomorphism.

Proposition 2.13. Let M be a connected differentiable manifold; if M is orientable,
then M has exactly two orientations.

Proof. If μ = {μp | p ∈ M } is an orientation of M , then −μ is also an orientation.
Therefore, M has at least two orientations. Assume there exists another orientation,
denoted as ν = {νp | p ∈ M }. Let S = {p ∈ M | μp = νp }. ∀p ∈ S, take charts
(U, ϕ), (V, ψ) of M about p, s.t. μ, ν satisfy the requirements of Definition 2.9 in
(U, ϕ) and (V, ψ) respectively. As a result of μp = νp , we have

    det Jψ◦ϕ−1 (ϕ(p)) > 0.

By continuity, there exists a neighborhood W ⊂ ϕ(U ∩ V ) of ϕ(p), s.t.

    det Jψ◦ϕ−1 (u) > 0,   ∀ u ∈ W.

Denote O = ϕ−1 (W ). Then, O is a neighborhood of p in M , and O ⊂ S, i.e., S is an
open subset of M . Similarly, M \S is also an open subset of M . Since M is connected,
we have either S = Ø or S = M . If S = Ø, then μ = −ν; if S = M , then μ = ν. □

Remark 2.14. By the Proposition 2.13, any connected open set on an orientable dif-
ferentiable manifold M has two and only two orientations.

Remark 2.15. Let (U, ϕ), (V, ψ) be two charts on an orientable differentiable manifold M ,
with U and V connected. If U ∩ V ≠ Ø, then det Jψ◦ϕ−1 keeps a constant sign on ϕ(U ∩ V ).

Example 2.16. S 1 is an orientable differentiable manifold. Let the smooth atlas of S 1
be A = {(U+ , ϕ+ ), (U− , ϕ− )}, where

    U+ = S 1 \{(0, −1)},     U− = S 1 \{(0, 1)},

ϕ± : U± → R, s.t.

    ϕ+ (u1 , u2 ) = u1 /(1 + u2 ),     ϕ− (u1 , u2 ) = u1 /(u2 − 1).

Since

    ϕ+ ◦ (ϕ− )−1 (u) = −1/u,   ∀ u ∈ ϕ− (U+ ∩ U− ),

we have

    det Jϕ+ ◦(ϕ− )−1 (u) = 1/u2 > 0,   ∀ u ∈ ϕ− (U+ ∩ U− ).

Similarly,

    det Jϕ− ◦(ϕ+ )−1 (u) > 0,   ∀ u ∈ ϕ+ (U+ ∩ U− ),

i.e., S 1 is orientable.
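The transition-map computation of Example 2.16 can be reproduced with sympy; a sketch, assuming ϕ− (u1 , u2 ) = u1 /(u2 − 1) (the sign that makes the transition equal −1/u):

```python
import sympy as sp

u, th = sp.symbols('u theta')
# Points of S^1 are parametrized as (cos(theta), sin(theta)).
phi_plus = sp.cos(th) / (1 + sp.sin(th))
phi_minus = sp.cos(th) / (sp.sin(th) - 1)
# phi+ o phi-^{-1} = -1/u, i.e. phi+ = -1/phi- along the overlap:
assert sp.simplify(phi_plus + 1 / phi_minus) == 0
# Its Jacobian (derivative) 1/u^2 is positive wherever defined.
assert sp.diff(-1 / u, u) == u**(-2)
```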
Example 2.17. The Möbius strip is a non-orientable surface. Define an equivalence relation “∼”
on [0, 1] × (0, 1):

    (u, v) ∼ (u, v),         0 < u < 1, 0 < v < 1,
    (0, v) ∼ (1, 1 − v),     0 < v < 1.

M = [0, 1] × (0, 1)/∼ is a Möbius strip, and A = {(U, ϕ), (V, ψ)} is its smooth atlas, where

    U = M \({0} × (0, 1)),     V = M \({1/2} × (0, 1)),

    ϕ : U −→ (0, 1) × (0, 1),     ψ : V −→ (−1/2, 1/2) × (0, 1),

which satisfy

    ϕ(u, v) = (u, v),

    ψ(u, v) = { (u, v),            0 ≤ u < 1/2,
              { (u − 1, 1 − v),    1/2 < u ≤ 1,

    ψ ◦ ϕ−1 (u, v) = { (u, v),            (u, v) ∈ (0, 1/2) × (0, 1),
                     { (u − 1, 1 − v),    (u, v) ∈ (1/2, 1) × (0, 1),

i.e.,

    det Jψ◦ϕ−1 (u, v) = {  1,   (u, v) ∈ (0, 1/2) × (0, 1),
                        { −1,   (u, v) ∈ (1/2, 1) × (0, 1).

By Remark 2.15, the Möbius strip is a non-orientable surface.

Definition 2.18. Let M, N be two orientable differentiable manifolds, and f : M → N
be a local diffeomorphism (a diffeomorphism in some neighborhood of each p ∈ M ). If
for every p ∈ M , f∗p : Tp M → Tf (p) N preserves (or reverses) the
orientation, then f is said to preserve the orientation (or reverse the orientation).

Proposition 2.19. Let f : M → N be a diffeomorphism; if M is connected, then f
either preserves the orientation or reverses the orientation.

Proof. Let S = {p ∈ M | f∗p : Tp M → Tf (p) N preserves the orientation}. ∀ p ∈ S,
because f∗p preserves orientation, det Jf (p) > 0 in orientation-compatible charts. By
continuity, there exists a neighborhood U of p, s.t. det Jf (q) > 0, ∀ q ∈ U , i.e., U ⊂ S,
so S is an open subset of M . Similarly, M \S is also an open subset of M .
Since M is connected, S = Ø or S = M . When S = Ø, f reverses
the orientation; otherwise (S = M ) f preserves the orientation. □

1.2.2 Vector Field and Flow


Similar to the Euclidean space case, a differentiable manifold also carries the concepts
of vector field and solution curve.
Definition 2.20. Let M be a differentiable manifold. If map X : M → T M has the
property π ◦ X = I : M → M , then X is said to be a vector field of M , and is also
called a section in the tangent bundle T M , where π : T M → M is a projection. If
the map X is smooth, then X is called a smooth vector field.

Proposition 2.21. X is a smooth vector field on M iff Xf ∈ C ∞ (M ) for every
f ∈ C ∞ (M ), where C ∞ (M ) = {all smooth functions on M } and Xf :
M → R is defined by Xf (p) = Xp (f ), ∀ p ∈ M, f ∈ C ∞ (M ).
Proof. Necessity. Suppose (U, ϕ) is a chart of M , and (π −1 (U ), ϕU ) is the induced
natural chart of T M . Suppose X can be expressed as

    Xp = Σi ai (p) ∂/∂xi |p ,   ∀ p ∈ U.

By

    X̂ = ϕU ◦ X ◦ ϕ−1 : ϕ(U ) −→ ϕ(U ) × Rm ,
    X̂(u) = (u; a1 ◦ ϕ−1 (u), · · · , am ◦ ϕ−1 (u)),

we know that if X is smooth, then a1 , · · · , am are smooth too. Since

    (Xf )(p) = Xp f = Σi ai (p) ∂f /∂xi |p ,   ∀ p ∈ U,

(Xf )|U is smooth.
    Sufficiency. ∀ p ∈ M , let (U, ϕ) be a chart on M about p, whose coordinate
functions x1 , · · · , xm may be extended to smooth functions x̃1 , · · · , x̃m on the entire M ,
satisfying x̃i = xi on some neighborhood V ⊂ U of p. Then,

    Xq = Σi Xq (x̃i ) ∂/∂xi |q ,   ∀ q ∈ V,

and by the supposition each X(x̃i ) is smooth, i.e., X is also smooth. □

Definition 2.22. Let X be a smooth vector field on the differentiable manifold M .
A solution curve of X through p refers to a smooth mapping c : J → M s.t.
c(0) = p, and

    c∗t (d/dt |t ) = Xc(t) ,   ∀ t ∈ J,

i.e., the velocity vector of the smooth curve c at t is exactly the value of the vector
field X at c(t).

Proposition 2.23. Let f : M → N be a diffeomorphism, X be a smooth vector field
on M . If we denote f∗ X = T f ◦ X ◦ f −1 : N → T N , then f∗ X is a smooth vector
field on N , and c is a solution curve of X through p ∈ M iff f ◦ c is a solution curve
of f∗ X through f (p).

Proof. By the definition of the tangent mapping, we have

    π2 ◦ (f∗ X) = π2 ◦ (T f ◦ X ◦ f −1 )
                = f ◦ (π1 ◦ X) ◦ f −1 = I.

Since f −1 , X, T f are smooth, f∗ X is a smooth vector field on N .
    If c : J → M is a solution curve of X through p, then f ◦ c(0) = f (c(0)) = f (p),
and

    (f ◦ c)∗t (d/dt |t ) = f∗c(t) ◦ c∗t (d/dt |t )
                        = f∗c(t) (Xc(t) ) = (f∗ X)f ◦c(t) ,   ∀ t ∈ J.

Therefore, the proposition is proved. □

Remark 2.24. Let X be a smooth vector field on the differentiable manifold M , and
(U, ϕ) be a chart on M . By Proposition 2.23, ϕ∗ (X |U ) is a smooth
vector field on ϕ(U ).

Remark 2.25. If ϕ∗ (X |U ) has the expression

    {ϕ∗ (X |U )}u = Σi ai (u) ∂/∂ui |u ,   ∀ u ∈ ϕ(U ),

with the sum over i = 1, · · · , m, then

    (ϕ ◦ c)∗t (d/dt |t ) = Σi (d(ϕ ◦ c)i /dt) |t ∂/∂ui |ϕ◦c(t) ,   ∀ t ∈ J,

where (ϕ ◦ c)i is the i-th component of ϕ ◦ c.
    Therefore, according to Proposition 2.23, c : J → U is a solution curve
of X through p iff ϕ ◦ c is a solution of

    { dui /dt = ai (u1 , · · · , um ),   i = 1, · · · , m,
    { u(0) = ϕ(p).
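The coordinate form of a solution curve is thus an initial value problem, which can be integrated numerically. A sketch with the rotation field a(u) = (−u2 , u1 ) on R2 , whose solution through (1, 0) is (cos t, sin t), using classical RK4 (an illustrative choice, not a method from the text):

```python
import numpy as np

def a(u):
    # Rotation vector field a(u) = (-u2, u1) in a chart on R^2.
    return np.array([-u[1], u[0]])

def rk4(u, h, n):
    # Classical 4th-order Runge-Kutta integration of du/dt = a(u).
    for _ in range(n):
        k1 = a(u)
        k2 = a(u + h/2 * k1)
        k3 = a(u + h/2 * k2)
        k4 = a(u + h * k3)
        u = u + h/6 * (k1 + 2*k2 + 2*k3 + k4)
    return u

u_end = rk4(np.array([1.0, 0.0]), 1e-3, 1000)   # integrate to t = 1
assert np.allclose(u_end, [np.cos(1.0), np.sin(1.0)], atol=1e-9)
```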

    Strictly speaking, a vector field on Rn is a mapping A : Rn → T (Rn ), i.e.,

    ∀ x ∈ Rn ,   A(x) ∈ Tx Rn .

Since (e1 )x , · · · , (en )x form a basis of Tx Rn (we may also write them as
∂/∂x1 |x , · · · , ∂/∂xn |x ), we have

    A(x) = Σi Ai (x)(ei )x ,   i = 1, · · · , n.

If Ai (x) ∈ C ∞ , then A(x) is called a smooth vector field on Rn . The set of all smooth
vector fields on Rn is denoted by X (Rn ). For any vector fields A(x), B(x) ∈ X (Rn ),
define

    (αA + βB)(x) = αA(x) + βB(x),   α, β ∈ R,
    (f A)(x) = f (x)A(x),   f (x) ∈ C ∞ (Rn ).

Then, X (Rn ) is a module over C ∞ (Rn ).
    If we denote A(x) ∈ X (Rn ) by (A1 (x), · · · , An (x))ᵀ , i.e.,

    A(x) = Σi Ai (x)(ei )x = (A1 (x), · · · , An (x))ᵀ ,

then A(x) is a smooth Rn -valued function.

1.3 Exterior Product


The exterior product is an algebraic operation with quite an interesting geomet-
ric background. In this section, we construct from an original linear space a new
linear space which carries not only the linear space structure, but also a new algebraic
operation, the exterior product. This forms the basis for the differential forms
introduced later. These materials can be found in a series of
books[AM78,Che53,Arn89,Ede85,Fla] .
In R3 , let the vectors be

a1 = a11 i + a12 j + a13 k,


a2 = a21 i + a22 j + a23 k,
a3 = a31 i + a32 j + a33 k,

where a1 , a2 , a3 are linearly independent. Then,


    V = { x ∈ R3 | x = α1 a1 + α2 a2 + α3 a3 , 0 ≤ α1 , α2 , α3 ≤ 1 }

is the parallelepiped spanned by the vectors a1 , a2 , a3 . We introduce a new operation ∧ be-
tween a1 , a2 , a3 as follows:

                     | a11 a12 a13 |
    a1 ∧ a2 ∧ a3 =   | a21 a22 a23 | .
                     | a31 a32 a33 |

The geometric meaning of a1 ∧ a2 ∧ a3 is the oriented volume of V , where orientation
means the sign of the volume is positive or negative. If the right-hand rule is followed,
the volume has the plus sign; otherwise, it has the minus sign. It is easy to see that the
operation ∧ satisfies the following laws:
1◦ Multilinear. Let a2 = βb + γc, b, c be vectors, β, γ be real numbers. Then,
a1 ∧ (βb + γc) ∧ a3 = β(a1 ∧ b ∧ a3 ) + γ(a1 ∧ c ∧ a3 ).
2◦ Anti-commute
a1 ∧ a2 ∧ a3 = −a2 ∧ a1 ∧ a3 ,
a1 ∧ a2 ∧ a3 = −a3 ∧ a2 ∧ a1 ,
a1 ∧ a2 ∧ a3 = −a1 ∧ a3 ∧ a2 .
From 2◦ we know that if a1 , a2 , a3 has two identical vectors, then a1 ∧a2 ∧a3 = 0.
Example 3.1. Let e1 , e2 , e3 be a basis in R3 , which are not necessarily orthogonal,
and let a1 , a2 , a3 be three vectors in R3 , which can be represented by
a1 = a11 e1 + a12 e2 + a13 e3 ,
a2 = a21 e1 + a22 e2 + a23 e3 ,
a3 = a31 e1 + a32 e2 + a33 e3 .
By the multilinearity and anti-commutativity of ∧, after the computation we have

                     | a11 a12 a13 |
    a1 ∧ a2 ∧ a3 =   | a21 a22 a23 | e1 ∧ e2 ∧ e3 .
                     | a31 a32 a33 |

Example 3.2. Let e1 , e2 , e3 be a basis in R3 , and


a1 = a11 e1 + a12 e2 + a13 e3 ,
a2 = a21 e1 + a22 e2 + a23 e3 .
Then,

                | a11 a12 |           | a12 a13 |           | a13 a11 |
    a1 ∧ a2 =   | a21 a22 | e1 ∧ e2 + | a22 a23 | e2 ∧ e3 + | a23 a21 | e3 ∧ e1 .

The geometric significance of this formula is that the oriented areas of the projections
of the parallelogram spanned by the pair of vectors a1 and a2 onto the coordinate planes
e1 e2 , e2 e3 , e3 e1 are equal to the three determinant coefficients A12 , A23 , and A31
respectively. Abstracting the multilinearity and the anti-commutativity satisfied by the
operation ∧, we obtain the concept of the exterior product.
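For R3 the three coefficients A12 , A23 , A31 of a1 ∧ a2 coincide, up to ordering, with the components of the cross product. A numerical sketch with arbitrary sample vectors (the helper name is illustrative):

```python
import numpy as np

a1 = np.array([1.0, 2.0, 3.0])
a2 = np.array([-1.0, 0.5, 4.0])

def wedge2(a, b):
    # 2x2 minors of the pair (a, b): the coefficients of a ∧ b on
    # e1∧e2, e2∧e3, e3∧e1.
    minor = lambda i, j: a[i]*b[j] - a[j]*b[i]
    return minor(0, 1), minor(1, 2), minor(2, 0)   # A12, A23, A31

A12, A23, A31 = wedge2(a1, a2)
c = np.cross(a1, a2)                                # equals (A23, A31, A12)
assert np.allclose([A23, A31, A12], c)
# Anti-commutativity: a2 ∧ a1 = -(a1 ∧ a2).
assert wedge2(a2, a1) == tuple(-x for x in (A12, A23, A31))
```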

1.3.1 Exterior Form


1. 1-Forms
In this section, Rn is an n-dimensional real vector space, where the vectors are de-
noted by ξ, η, · · · ∈ Rn .

Definition 3.3. A form of degree 1 (or a 1-form) on Rn is a linear function ω : Rn →


R, i.e.,

ω(λ1 ξ1 + λ2 ξ2 ) = λ1 ω(ξ1 ) + λ2 ω(ξ2 ), λ1 , λ2 ∈ R, ξ1 , ξ2 ∈ Rn .

The set of all 1-forms on Rn is denoted by Λ1 (Rn ). For ω1 , ω2 ∈ Λ1 (Rn ), define

(λ1 ω1 + λ2 ω2 )(ξ) = λ1 ω1 (ξ) + λ2 ω2 (ξ), λ1 , λ2 ∈ R.

Then, Λ1 (Rn ) becomes a vector space, namely the dual space (Rn )∗ of Rn .


    Let

    e1 = (1, 0, · · · , 0)ᵀ ,   e2 = (0, 1, · · · , 0)ᵀ ,   · · · ,   en = (0, 0, · · · , 1)ᵀ

be the standard basis of Rn . x1 , x2 , · · · , xn form the coordinate system on Rn , i.e.,
if ξ = a1 e1 + a2 e2 + · · · + an en , then xi (ξ) = ai ; in particular, xi (ej ) = δij . Obviously,
xi ∈ Λ1 (Rn ). For any ω ∈ Λ1 (Rn ),

    ω(ξ) = ω( Σi ai ei ) = Σi ai ω(ei ) = Σi ω(ei )xi (ξ),

and so

    ω = ω(e1 )x1 + ω(e2 )x2 + · · · + ω(en )xn .

Thus, x1 , · · · , xn form a basis of Λ1 (Rn ): Λ1 (Rn ) = span{xi }i=1,···,n .
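The expansion ω = Σi ω(ei )xi can be checked numerically by representing a 1-form through its values on the standard basis; a sketch with arbitrary sample values:

```python
import numpy as np

# A 1-form on R^3 is determined by its values on the standard basis.
omega_vals = np.array([2.0, -1.0, 3.0])     # omega(e1), omega(e2), omega(e3)
omega = lambda xi: omega_vals @ xi          # the linear functional itself
xi = np.array([1.0, 4.0, 0.5])
# Expansion through the coordinate 1-forms x^i(xi) = xi_i:
assert np.isclose(omega(xi), sum(omega_vals[i] * xi[i] for i in range(3)))
# Linearity check.
eta = np.array([0.0, 1.0, -2.0])
assert np.isclose(omega(2*xi + 3*eta), 2*omega(xi) + 3*omega(eta))
```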

Example 3.4. If F is a uniform force field on a Euclidean space R3 , then its work A
on a displacement ξ is a 1-form acting on ξ,

ωF (ξ) = (F, ξ) = F1 a1 + F2 a2 + F3 a3 , ξ = a1 e1 + a2 e2 + a3 e3

or
ωF = F1 x1 + F2 x2 + F3 x3 .

2. 2-Forms
Definition 3.5. An exterior form of degree 2 (or a 2-form) is a bilinear, skew-
symmetric function ω 2 : Rn × Rn → R, i.e.,

ω 2 (λ1 ξ1 + λ2 ξ2 , ξ3 ) = λ1 ω 2 (ξ1 , ξ3 ) + λ2 ω 2 (ξ2 , ξ3 ),


ω 2 (ξ1 , ξ2 ) = −ω 2 (ξ2 , ξ1 ), ξ1 , ξ2 , ξ3 ∈ Rn , λ1 , λ2 ∈ R.

The set of all 2-forms on Rn is denoted by Λ2 (Rn ) = Λ2 .


Similarly, if we define the sum of two 2-forms ω12 , ω22 and scalar multiplication as
follows
(λ1 ω12 + λ2 ω22 )(ξ1 , ξ2 ) = λ1 ω12 (ξ1 , ξ2 ) + λ2 ω22 (ξ1 , ξ2 ),
ω12 , ω22 ∈ Λ2 (Rn ), λ1 , λ2 ∈ R,

then Λ2 (Rn ) becomes a vector space on R.

Property 3.6. The skew-symmetry condition ω 2 (ξ1 , ξ2 ) = −ω 2 (ξ2 , ξ1 ) is equivalent to
ω 2 (ξ, ξ) = 0, ∀ ξ ∈ Rn , since from the latter it follows that

    0 = ω 2 (ξ1 + ξ2 , ξ1 + ξ2 )
      = ω 2 (ξ1 , ξ1 ) + ω 2 (ξ1 , ξ2 ) + ω 2 (ξ2 , ξ1 ) + ω 2 (ξ2 , ξ2 )
      = ω 2 (ξ1 , ξ2 ) + ω 2 (ξ2 , ξ1 ),

i.e., ω 2 (ξ1 , ξ2 ) = −ω 2 (ξ2 , ξ1 ); conversely, skew-symmetry gives ω 2 (ξ, ξ) = −ω 2 (ξ, ξ),
hence ω 2 (ξ, ξ) = 0.

Example 3.7. Let S(ξ1 , ξ2 ) be the oriented area of the parallelogram constructed on
the vectors ξ1 and ξ2 of the oriented Euclidean plane R2 , i.e.,

                  | ξ11 ξ12 |
    S(ξ1 , ξ2 ) = |         | ,
                  | ξ21 ξ22 |

where
    ξ1 = ξ11 e1 + ξ12 e2 ,   ξ2 = ξ21 e1 + ξ22 e2 .

Example 3.8. Let v be a given vector in the oriented Euclidean space R3 . The triple
scalar product with two other vectors ξ1 and ξ2 is a 2-form:

                                    | v1  v2  v3  |
    ω(ξ1 , ξ2 ) = (v, [ξ1 , ξ2 ]) = | ξ11 ξ12 ξ13 | ,
                                    | ξ21 ξ22 ξ23 |

where v = Σi vi ei , ξj = Σi ξji ei (j = 1, 2), the sums running over i = 1, 2, 3.
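Example 3.8 can be checked numerically: the triple scalar product is bilinear and skew-symmetric in ξ1 , ξ2 . A sketch with arbitrary sample vectors:

```python
import numpy as np

# The 2-form omega(xi1, xi2) = det[v; xi1; xi2] = (v, [xi1, xi2]).
v = np.array([1.0, -2.0, 0.5])
omega = lambda x1, x2: np.linalg.det(np.array([v, x1, x2]))

x1 = np.array([3.0, 0.0, 1.0])
x2 = np.array([0.0, 2.0, -1.0])
# Agrees with the inner product of v with the cross product [xi1, xi2].
assert np.isclose(omega(x1, x2), v @ np.cross(x1, x2))
# Skew-symmetric and bilinear, as a 2-form must be.
assert np.isclose(omega(x1, x2), -omega(x2, x1))
assert np.isclose(omega(2*x1 + x2, x2), 2*omega(x1, x2))
```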

3. k-Forms
We denote the set of all permutations of the set {1, 2, · · · , k} by Sk , and write its elements as

    σ = (σ(1), σ(2), · · · , σ(k)) = (i1 , i2 , · · · , ik ) ∈ Sk ,

    ε(σ) = {  1,   if σ ∈ Sk is even,
           { −1,   if σ ∈ Sk is odd.

Definition 3.9. An exterior form of degree k (or a k-form) is a function of k vectors
that is k-linear and skew-symmetric:

    ω(λ1 ξ1′ + λ2 ξ1″ , ξ2 , · · · , ξk ) = λ1 ω(ξ1′ , ξ2 , · · · , ξk ) + λ2 ω(ξ1″ , ξ2 , · · · , ξk ),

    ω(ξi1 , ξi2 , · · · , ξik ) = ε(σ)ω(ξ1 , ξ2 , · · · , ξk ),

    ξ1′ , ξ1″ , ξ1 , ξ2 , · · · , ξk ∈ Rn ,   λ1 , λ2 ∈ R,

where σ = (i1 , i2 , · · · , ik ) ∈ Sk .

Example 3.10. The oriented volume of the parallelepiped with edges ξ1 , ξ2 , · · · , ξn
in the oriented Euclidean space Rn is an n-form:

                                 | ξ11 · · · ξ1n |
                                 | ξ21 · · · ξ2n |
    V (ξ1 , ξ2 , · · · , ξn ) =  |  ..       ..  | ,
                                 | ξn1 · · · ξnn |

where ξi = ξi1 e1 + · · · + ξin en .

    The set of all k-forms on Rn is denoted by Λk (Rn ). It forms a real vector space
under the operations

    (λ1 ω1k + λ2 ω2k )(ξ1 , ξ2 , · · · , ξk ) = λ1 ω1k (ξ1 , ξ2 , · · · , ξk ) + λ2 ω2k (ξ1 , ξ2 , · · · , ξk ),

    ω1k , ω2k ∈ Λk (Rn ),   λ1 , λ2 ∈ R.

Question 3.11. Show that if ηj = Σi aji ξi (j = 1, · · · , k), with the sum over i = 1, · · · , k, then

    ω k (η1 , η2 , · · · , ηk ) = det (aji ) ω k (ξ1 , ξ2 , · · · , ξk ).
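A numerical spot-check of Question 3.11 (not a proof), taking the determinant n-form of Example 3.10 as ω k with k = n = 3 and random data:

```python
import numpy as np

rng = np.random.default_rng(0)
xi = rng.standard_normal((3, 3))     # rows xi_1, xi_2, xi_3
a = rng.standard_normal((3, 3))      # coefficient matrix (a_ji)
eta = a @ xi                         # eta_j = sum_i a_ji xi_i
omega = np.linalg.det                # the n-form of Example 3.10
# omega(eta_1, ..., eta_k) = det(a_ji) * omega(xi_1, ..., xi_k):
assert np.isclose(omega(eta), np.linalg.det(a) * omega(xi))
```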

1.3.2 Exterior Algebra


1. The exterior product of two 1-forms
In the previous section, we have defined various exterior forms. We now introduce one
more operation: exterior multiplication of forms. As a matter of fact, these forms can
be generated from the 1-forms by an operation called exterior product.

Definition 3.12. For ω1 , ω2 ∈ Λ1 (Rn ), the exterior product of ω1 and ω2 , denoted
by ω1 ∧ ω2 , is defined by the formula

                           | ω1 (ξ1 ) ω2 (ξ1 ) |
    (ω1 ∧ ω2 )(ξ1 , ξ2 ) = |                   | ,   ξ1 , ξ2 ∈ Rn ,
                           | ω1 (ξ2 ) ω2 (ξ2 ) |

which is the oriented area of the image of the parallelogram with sides ξ1 and ξ2
under the mapping ξ → (ω1 (ξ), ω2 (ξ)) in the (ω1 , ω2 )-plane.
    It is not hard to verify that ω1 ∧ ω2 really is a 2-form and has the properties

    ω1 ∧ ω2 = −ω2 ∧ ω1 ,
    (λ1 ω1′ + λ2 ω1″ ) ∧ ω2 = λ1 ω1′ ∧ ω2 + λ2 ω1″ ∧ ω2 .
Now suppose we have chosen a system of linear coordinates on Rn , i.e., we are given
n independent 1-forms x1 , x2 , · · · , xn . We will call these forms basic. The exterior
products of the basic forms are the 2-forms xi ∧ xj . By skew-symmetry,

    xi ∧ xi = 0,     xi ∧ xj = −xj ∧ xi ,

and

                           | xi (ξ1 ) xj (ξ1 ) |   | ai aj |
    (xi ∧ xj )(ξ1 , ξ2 ) = |                   | = |       | = ai bj − aj bi ,
                           | xi (ξ2 ) xj (ξ2 ) |   | bi bj |

where ξ1 = Σi ai ei , ξ2 = Σi bi ei . It is the oriented area of the parallelogram with
sides (xi (ξ1 ), xj (ξ1 )) and (xi (ξ2 ), xj (ξ2 )) in the (xi , xj )-plane.
For any ω ∈ Λ²(Rⁿ),

\[
\omega(\xi_1, \xi_2) = \omega\Big(\sum_{i=1}^{n} a_i e_i,\ \sum_{j=1}^{n} b_j e_j\Big)
= \sum_{i,j=1}^{n} a_i b_j\, \omega(e_i, e_j)
= \sum_{i<j} (a_i b_j - a_j b_i)\, \omega(e_i, e_j)
= \sum_{i<j} \omega(e_i, e_j)\,(x_i \wedge x_j)(\xi_1, \xi_2),
\]

where $\xi_1 = \sum_i a_i e_i$, $\xi_2 = \sum_i b_i e_i$. Thus,

\[
\omega = \sum_{i<j} \omega(e_i, e_j)\, x_i \wedge x_j,
\]

i.e., {xi ∧ xj}_{i<j} generate Λ²(Rⁿ). In addition, if

\[
\sum_{i<j} a_{ij}\, x_i \wedge x_j = 0,
\]

then acting on (e_l, e_k) with l < k, we get

\[
a_{lk} = \sum_{i<j} a_{ij}\,(x_i \wedge x_j)(e_l, e_k) = 0.
\]

Thus, {xi ∧ xj}_{i<j} are linearly independent, and they form a basis of Λ²(Rⁿ), which implies that the dimension of Λ²(Rⁿ) is $\binom{n}{2}$.

In the oriented Euclidean space R³, the basis of Λ²(R³) is x1 ∧ x2, x2 ∧ x3, and x3 ∧ x1 (= −x1 ∧ x3). Any 2-form ω ∈ Λ²(R³) can be represented as

\[
\omega = P\, x_2 \wedge x_3 + Q\, x_3 \wedge x_1 + R\, x_1 \wedge x_2.
\]

2. Exterior monomials

Definition 3.13. For ω1, · · ·, ωk ∈ Λ¹(Rⁿ), we define their exterior product ω1 ∧ · · · ∧ ωk as follows:

\[
(\omega_1 \wedge \cdots \wedge \omega_k)(\xi_1, \cdots, \xi_k) =
\begin{vmatrix}
\omega_1(\xi_1) & \cdots & \omega_1(\xi_k) \\
\vdots & & \vdots \\
\omega_k(\xi_1) & \cdots & \omega_k(\xi_k)
\end{vmatrix}.
\]

In other words, the value of a product of 1-forms on the parallelepiped ξ1, · · ·, ξk is equal to the oriented volume of the image of the parallelepiped in the oriented Euclidean coordinate space Rᵏ under the mapping ξ ↦ (ω1(ξ), · · ·, ωk(ξ)).

Question 3.14. Prove that ω1 ∧ · · · ∧ ωk really is a k-form.

Property 3.15. We have the following properties:

1◦ (λ′ω1′ + λ″ω1″) ∧ ω2 ∧ · · · ∧ ωk = λ′ ω1′ ∧ ω2 ∧ · · · ∧ ωk + λ″ ω1″ ∧ ω2 ∧ · · · ∧ ωk.
2◦ ωσ(1) ∧ ωσ(2) ∧ · · · ∧ ωσ(k) = ε(σ) ω1 ∧ ω2 ∧ · · · ∧ ωk.
3◦ If i ≠ j and ωi = ωj, then ω1 ∧ · · · ∧ ωk = 0.
4◦ If ω1, · · ·, ωk are linearly dependent, then ω1 ∧ · · · ∧ ωk = 0.
5◦ If $\beta_i = \sum_{j=1}^{k} a_{ij}\,\omega_j$ (i = 1, · · ·, k), then

\[
\beta_1 \wedge \cdots \wedge \beta_k = \det(a_{ij})\, \omega_1 \wedge \cdots \wedge \omega_k.
\]

Proof. Here we only prove 5◦; the others are easy. By the linearity of the exterior product,

\[
\beta_1 \wedge \cdots \wedge \beta_k
= \Big(\sum_{i_1=1}^{k} a_{1 i_1}\omega_{i_1}\Big) \wedge \cdots \wedge \Big(\sum_{i_k=1}^{k} a_{k i_k}\omega_{i_k}\Big)
= \sum_{i_1, \cdots, i_k = 1}^{k} a_{1 i_1} \cdots a_{k i_k}\, \omega_{i_1} \wedge \cdots \wedge \omega_{i_k}
\]
\[
= \sum_{(i_1, \cdots, i_k) \in \nu_k} a_{1 i_1} \cdots a_{k i_k}\, \varepsilon(i_1, \cdots, i_k)\, \omega_1 \wedge \cdots \wedge \omega_k
\quad (\text{by } 2^\circ)
= \det(a_{ij})\, \omega_1 \wedge \cdots \wedge \omega_k,
\]

where νk denotes the set of permutations of (1, · · ·, k); the terms with repeated indices vanish by 3◦. This completes the proof. □



Theorem 3.16. $\{x_{i_1} \wedge \cdots \wedge x_{i_k}\}_{i_1 < \cdots < i_k}$ form a basis of Λᵏ(Rⁿ), and so the dimension of Λᵏ(Rⁿ) is $\binom{n}{k}$.

Proof. For $\xi_i = \sum_{j=1}^{n} \xi_{ij} e_j$ (i = 1, · · ·, k), ξi ∈ Rⁿ, we have

\[
\omega(\xi_1, \cdots, \xi_k)
= \omega\Big(\sum_{j_1=1}^{n} \xi_{1 j_1} e_{j_1}, \cdots, \sum_{j_k=1}^{n} \xi_{k j_k} e_{j_k}\Big)
= \sum_{j_1, \cdots, j_k=1}^{n} \xi_{1 j_1} \cdots \xi_{k j_k}\, \omega(e_{j_1}, \cdots, e_{j_k})
\]
\[
= \sum_{i_1 < \cdots < i_k}\ \sum_{(j_1, \cdots, j_k) \in \nu_k(i_1, \cdots, i_k)} \varepsilon(j_1, \cdots, j_k)\, \xi_{1 j_1} \cdots \xi_{k j_k}\, \omega(e_{i_1}, \cdots, e_{i_k})
= \sum_{i_1 < \cdots < i_k} \omega(e_{i_1}, \cdots, e_{i_k})\, x_{i_1} \wedge \cdots \wedge x_{i_k}(\xi_1, \cdots, \xi_k),
\]

where νk(i1, · · ·, ik) denotes the set of all permutations of (i1, · · ·, ik). So

\[
\omega = \sum_{i_1 < \cdots < i_k} \omega(e_{i_1}, \cdots, e_{i_k})\, x_{i_1} \wedge \cdots \wedge x_{i_k}.
\]

Thus, $\{x_{i_1} \wedge \cdots \wedge x_{i_k}\}_{i_1 < \cdots < i_k}$ generate Λᵏ(Rⁿ). Obviously, they are linearly independent. Consequently, they form a basis of Λᵏ(Rⁿ). In particular,

\[
\dim \Lambda^k(R^n) = \binom{n}{k}.
\]

If k = n, then $\dim \Lambda^n(R^n) = \binom{n}{n} = 1$, so every ω ∈ Λⁿ(Rⁿ) must be of the form ω = a x1 ∧ · · · ∧ xn for some number a ∈ R. For k > n,

\[
\dim \Lambda^k(R^n) = 0, \qquad \Lambda^k(R^n) = \{0\}.
\]

This completes the proof. □
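The dimension count can be illustrated by enumerating the index sets i1 < · · · < ik directly; a small sketch in Python (the choice n = 5 is arbitrary):

```python
from itertools import combinations
from math import comb

n = 5
for k in range(n + 2):
    # index sets i1 < ... < ik labelling the basis forms x_{i1} ∧ ... ∧ x_{ik}
    basis = list(combinations(range(1, n + 1), k))
    # dim Λ^k(R^n) = C(n, k); for k > n the list is empty, matching Λ^k = {0}
    assert len(basis) == comb(n, k)

# summing over k gives the dimension 2^n of the Grassmann algebra Λ
print(sum(comb(n, k) for k in range(n + 1)))  # 32
```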

Definition 3.17. Let ω k = ω1 ∧· · ·∧ωk , ω l = ωk+1 ∧· · ·∧ωk+l , ωi ∈ Λ1 (Rn ) (i =


1, · · · , k + l). Define their product ω k ∧ ω l to be the monomial

ωk ∧ ωl = (ω1 ∧ · · · ∧ ωk ) ∧ (ωk+1 ∧ · · · ∧ ωk+l )


= ω1 ∧ · · · ∧ ωk ∧ ωk+1 ∧ · · · ∧ ωk+l .

Property 3.18. If ω1, ω2, and ω3 are monomials, then

1◦ (λ1ω1 + λ2ω2) ∧ ω3 = λ1 ω1 ∧ ω3 + λ2 ω2 ∧ ω3.
2◦ ω1 ∧ ω2 = (−1)^{kl} ω2 ∧ ω1, ω1 ∈ Λᵏ, ω2 ∈ Λˡ.
3◦ (ω1 ∧ ω2) ∧ ω3 = ω1 ∧ (ω2 ∧ ω3), ω1 ∈ Λᵏ, ω2 ∈ Λˡ, ω3 ∈ Λᵐ.

3. Exterior product of forms


We now turn to define the exterior product of an arbitrary k-form ω k with an arbitrary
l-form ω l .

Definition 3.19. The exterior product ωᵏ ∧ ωˡ of a k-form ωᵏ on Rⁿ with an l-form ωˡ on Rⁿ is the (k + l)-form on Rⁿ defined by the formula

\[
(\omega^k \wedge \omega^l)(\xi_1, \cdots, \xi_{k+l})
= \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l} \varepsilon(\sigma)\, \omega^k(\xi_{i_1}, \cdots, \xi_{i_k})\, \omega^l(\xi_{j_1}, \cdots, \xi_{j_l}),
\tag{3.1}
\]

where σ = (i1, i2, · · ·, ik, j1, · · ·, jl) is a permutation of the numbers (1, · · ·, k + l).

Example 3.20. For k = l = 1,

\[
(\omega_1 \wedge \omega_2)(\xi_1, \xi_2) = \omega_1(\xi_1)\omega_2(\xi_2) - \omega_1(\xi_2)\omega_2(\xi_1)
= \begin{vmatrix}
\omega_1(\xi_1) & \omega_2(\xi_1) \\
\omega_1(\xi_2) & \omega_2(\xi_2)
\end{vmatrix},
\]

which agrees with the definition of multiplication of 1-forms.

Example 3.21. For ω1 ∈ Λ² and ω2 ∈ Λ¹,

\[
(\omega_1 \wedge \omega_2)(\xi_1, \xi_2, \xi_3) = \omega_1(\xi_1, \xi_2)\omega_2(\xi_3) - \omega_1(\xi_1, \xi_3)\omega_2(\xi_2) + \omega_1(\xi_2, \xi_3)\omega_2(\xi_1).
\]
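Formula (3.1) can be implemented verbatim as a sum over increasing index pairs; the following Python sketch (representing a k-form as a plain function of k vectors, an illustrative encoding rather than the text's notation) reproduces the patterns of Examples 3.20 and 3.21:

```python
import itertools

def sign(perm):
    # parity epsilon(sigma) of a tuple of distinct integers, by counting inversions
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def wedge(w_k, k, w_l, l):
    # exterior product via formula (3.1): sum over i1<...<ik; j1<...<jl
    def result(*xs):
        total = 0
        for I in itertools.combinations(range(k + l), k):
            J = tuple(m for m in range(k + l) if m not in I)
            total += sign(I + J) * w_k(*(xs[i] for i in I)) * w_l(*(xs[j] for j in J))
        return total
    return result

w1 = lambda v: v[0]                  # the basic 1-form x1
w2 = lambda v: v[1]                  # the basic 1-form x2
w12 = wedge(w1, 1, w2, 1)
print(w12((1, 0, 0), (0, 1, 0)))     # 1, the value of x1 ∧ x2 on (e1, e2)
```

Skew-symmetry and the (k = 2, l = 1) case of Example 3.21 follow from the same function, e.g. `wedge(w12, 2, w3, 1)` for a third 1-form `w3`.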

Proposition 3.22. ωᵏ ∧ ωˡ defined above is actually a (k + l)-form.

The linearity of ωᵏ ∧ ωˡ follows from the linearity of ωᵏ and ωˡ. The skew-symmetry is based on the following lemma.

Lemma 3.23. If ω is a k-linear function on Rⁿ, then the following conditions are equivalent.

1◦ ω(ξσ(1), · · ·, ξσ(k)) = ε(σ) ω(ξ1, · · ·, ξk), ∀ σ ∈ νk.
2◦ ω(ξ1, · · ·, ξi, · · ·, ξj, · · ·, ξk) = −ω(ξ1, · · ·, ξj, · · ·, ξi, · · ·, ξk), ∀ i ≠ j.
3◦ ω(ξ1, · · ·, ξk) = 0 if ξi = ξj for some i ≠ j.
4◦ ω(ξ1, · · ·, ξk) = 0 whenever ξi = ξi+1 for some i (1 ≤ i ≤ k − 1).
5◦ ω(ξ1, · · ·, ξk) = 0 whenever ξ1, · · ·, ξk are linearly dependent.

The proof is left to the reader. We now turn to prove Proposition 3.22; for this we only need to verify condition 4◦ of Lemma 3.23.

Proof. By 4◦ of Lemma 3.23, we only need to prove that if ξ_{i′} = ξ_{i′+1}, then (ωᵏ ∧ ωˡ)(ξ1, · · ·, ξ_{i′}, ξ_{i′+1}, · · ·, ξ_{k+l}) = 0.

Consider the terms on the right-hand side of (3.1). If i′, i′+1 ∈ (i1, · · ·, ik), then ωᵏ(ξ_{i1}, · · ·, ξ_{ik}) = 0, and therefore

\[
\omega^k(\xi_{i_1}, \cdots, \xi_{i_k})\, \omega^l(\xi_{j_1}, \cdots, \xi_{j_l}) = 0.
\]

When i′, i′+1 ∈ (j1, · · ·, jl), the case is similar. If i′ ∈ (i1, · · ·, ik) and i′+1 ∈ (j1, · · ·, jl), i.e.,

i1 < · · · < ih < i′ < i_{h+1} < · · · < ik and j1 < · · · < jm < i′+1 < j_{m+1} < · · · < jl,

then there is another term with

i1 < · · · < ih < i′+1 < i_{h+1} < · · · < ik, j1 < · · · < jm < i′ < j_{m+1} < · · · < jl.

The condition ξ_{i′+1} = ξ_{i′} implies that

\[
\omega^k(\xi_{i_1}, \cdots, \xi_{i'}, \cdots, \xi_{i_k})\, \omega^l(\xi_{j_1}, \cdots, \xi_{i'+1}, \cdots, \xi_{j_l})
= \omega^k(\xi_{i_1}, \cdots, \xi_{i'+1}, \cdots, \xi_{i_k})\, \omega^l(\xi_{j_1}, \cdots, \xi_{i'}, \cdots, \xi_{j_l}).
\]

However, ε(σ) = −ε(σ′), where

σ = (i1, · · ·, i′, · · ·, ik, j1, · · ·, i′+1, · · ·, jl),  σ′ = (i1, · · ·, i′+1, · · ·, ik, j1, · · ·, i′, · · ·, jl).

Thus such terms cancel in pairs, and the right-hand side of (3.1) is equal to 0, i.e.,

\[
(\omega^k \wedge \omega^l)(\xi_1, \cdots, \xi_{k+l}) = 0.
\]

Hence the proposition is proved. □


Theorem 3.24. The exterior product of forms defined above is skew-commutative, distributive, and associative, i.e.,

1◦ skew-commutative: ωᵏ ∧ ωˡ = (−1)^{kl} ωˡ ∧ ωᵏ.
2◦ distributive: (λ1ω1ᵏ + λ2ω2ᵏ) ∧ ωˡ = λ1 ω1ᵏ ∧ ωˡ + λ2 ω2ᵏ ∧ ωˡ.
3◦ associative: (ωᵏ ∧ ωˡ) ∧ ωᵐ = ωᵏ ∧ (ωˡ ∧ ωᵐ).

For monomials, it coincides with the exterior product defined above.

Distributivity follows from the fact that every term in Equation (3.1) is linear with respect to ωᵏ and ωˡ. Since

\[
(\omega^k \wedge \omega^l)(\xi_1, \cdots, \xi_{k+l})
= \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l} \varepsilon(i_1, \cdots, i_k, j_1, \cdots, j_l)\, \omega^k(\xi_{i_1}, \cdots, \xi_{i_k})\, \omega^l(\xi_{j_1}, \cdots, \xi_{j_l})
\]
\[
= (-1)^{kl} \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l} \varepsilon(j_1, \cdots, j_l, i_1, \cdots, i_k)\, \omega^l(\xi_{j_1}, \cdots, \xi_{j_l})\, \omega^k(\xi_{i_1}, \cdots, \xi_{i_k})
= (-1)^{kl} (\omega^l \wedge \omega^k)(\xi_1, \cdots, \xi_{k+l}),
\]



we get skew-commutativity.
In order to prove associativity, we first prove that for monomials the exterior product of Definition 3.19 coincides with that of Definition 3.17. Since we have not yet proved the equivalence of Definition 3.17 of the exterior product of k 1-forms with Definition 3.19, we will temporarily denote the former by the symbol ∧̄, so that our monomials have the form

\[
\omega^k = \omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k
\quad \text{and} \quad
\omega^l = \omega_{k+1} \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l},
\]

where ω1, · · ·, ω_{k+l} are 1-forms.

Lemma 3.25. The exterior product of two monomials is a monomial:

\[
(\omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k) \wedge (\omega_{k+1} \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l})
= \omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k \,\bar\wedge\, \omega_{k+1} \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l}.
\]

Proof.

\[
\big((\omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k) \wedge (\omega_{k+1} \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l})\big)(\xi_1, \cdots, \xi_{k+l})
\]
\[
= \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l} \varepsilon(i_1, \cdots, i_k, j_1, \cdots, j_l)\,
(\omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k)(\xi_{i_1}, \cdots, \xi_{i_k})
\cdot (\omega_{k+1} \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l})(\xi_{j_1}, \cdots, \xi_{j_l})
\]
\[
= \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l} \varepsilon(i_1, \cdots, i_k, j_1, \cdots, j_l)\,
\det_{1 \le i \le k;\ 1 \le m \le k} \omega_i(\xi_{i_m})
\cdot \det_{k+1 \le j \le k+l;\ 1 \le m \le l} \omega_j(\xi_{j_m})
\]
\[
= \det_{1 \le i, j \le k+l} \omega_i(\xi_j)
= (\omega_1 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_{k+l})(\xi_1, \cdots, \xi_{k+l}),
\]

the last step being the Laplace expansion of the determinant by complementary minors. This proves the lemma. □

Thus,

\[
\omega_1 \,\bar\wedge\, \omega_2 = \omega_1 \wedge \omega_2,
\]
\[
\omega_1 \,\bar\wedge\, \omega_2 \,\bar\wedge\, \omega_3 = (\omega_1 \,\bar\wedge\, \omega_2) \wedge \omega_3 = (\omega_1 \wedge \omega_2) \wedge \omega_3,
\]
\[
\omega_1 \,\bar\wedge\, \omega_2 \,\bar\wedge\, \omega_3 = \omega_1 \wedge (\omega_2 \,\bar\wedge\, \omega_3) = \omega_1 \wedge (\omega_2 \wedge \omega_3).
\]

It follows that (ω1 ∧ ω2) ∧ ω3 = ω1 ∧ (ω2 ∧ ω3), which we denote by ω1 ∧ ω2 ∧ ω3. Thus ω1 ∧̄ ω2 ∧̄ ω3 = ω1 ∧ ω2 ∧ ω3. In general, we have

\[
\omega_1 \,\bar\wedge\, \omega_2 \,\bar\wedge\, \cdots \,\bar\wedge\, \omega_k = (\cdots((\omega_1 \wedge \omega_2) \wedge \omega_3) \wedge \cdots \wedge \omega_k).
\]

We now prove associativity. By Theorem 3.16, ωᵏ ∈ Λᵏ, ωˡ ∈ Λˡ, and ωᵐ ∈ Λᵐ can be represented by the following formulae, respectively:

\[
\omega^k = \sum a_{i_1 \cdots i_k}\, x_{i_1} \wedge \cdots \wedge x_{i_k}, \qquad
\omega^l = \sum b_{j_1 \cdots j_l}\, x_{j_1} \wedge \cdots \wedge x_{j_l}, \qquad
\omega^m = \sum c_{h_1 \cdots h_m}\, x_{h_1} \wedge \cdots \wedge x_{h_m}.
\]

By distributivity and associativity for monomials, writing abc for $a_{i_1 \cdots i_k} b_{j_1 \cdots j_l} c_{h_1 \cdots h_m}$ and summing over all three families of indices,

\[
(\omega^k \wedge \omega^l) \wedge \omega^m
= \sum abc\, \big((x_{i_1} \wedge \cdots \wedge x_{i_k}) \wedge (x_{j_1} \wedge \cdots \wedge x_{j_l})\big) \wedge (x_{h_1} \wedge \cdots \wedge x_{h_m})
\]
\[
= \sum abc\, (x_{i_1} \wedge \cdots \wedge x_{i_k}) \wedge \big((x_{j_1} \wedge \cdots \wedge x_{j_l}) \wedge (x_{h_1} \wedge \cdots \wedge x_{h_m})\big)
= \omega^k \wedge (\omega^l \wedge \omega^m).
\]

Based on the linear spaces Λᵏ (k = 0, 1, 2, · · ·, n), we may construct a bigger linear space Λ, the direct sum of Λ⁰, Λ¹, · · ·, Λⁿ, i.e.,

\[
\Lambda = \Lambda^0 \,\dot+\, \Lambda^1 \,\dot+\, \cdots \,\dot+\, \Lambda^n.
\]

Each element ω may be represented as

\[
\omega = \omega_0 + \omega_1 + \cdots + \omega_n, \qquad \omega_i \in \Lambda^i,
\]

and this expression is unique. Λ carries not only the algebraic structure of a linear space but also the exterior product. The direct sum Λ, equipped with these two structures, is the Grassmann algebra over the real number field generated by the 1-forms. The elements

\[
1,\ x_1, \cdots, x_n,\ x_i \wedge x_j\ (i < j),\ \cdots,\ x_1 \wedge x_2 \wedge \cdots \wedge x_n
\]

form a basis of Λ, whose dimension is $\dim(\Lambda) = \sum_{i=0}^{n} \binom{n}{i} = 2^n$.

1.4 Foundation of Differential Form

Classical mathematical analysis gives no strict definition of df for a smooth function f, nor of dx1, · · ·, dxn. There the differential of the independent variable is identified with its increment, which is improper in a general sense. Here, we always regard dx1, · · ·, dxn as a basis of a linear space, which is called the differential space.

1.4.1 Differential Form


A vector ξ on Rⁿ is a vector from 0 to ξ. A tangent vector (x, ξ) ∈ TxRⁿ to Rⁿ at x is a vector from x to x + ξ, i.e., a fixed vector starting at x and ending at x + ξ. The tangent vector (x, ξ) is usually denoted by ξx.
For (x, ξ), (x, η) ∈ TxRⁿ define

\[
\alpha(x, \xi) + \beta(x, \eta) = (x, \alpha\xi + \beta\eta), \qquad \alpha, \beta \in R,
\]

or αξx + βηx = (αξ + βη)x. Then the tangent space TxRⁿ to Rⁿ at x forms a vector space, and (e1)x, · · ·, (en)x is its standard basis. The set

\[
T(R^n) = \bigcup_{x \in R^n} T_x R^n
\]

is called the tangent bundle on Rⁿ, see Section 1.2. Notice that the tangent bundle T(Rⁿ) consists of all fixed vectors on Rⁿ.
The mapping π: TRⁿ → Rⁿ defined by the formula

\[
\pi(\xi_x) = x, \qquad \forall\, \xi_x \in T_x R^n,
\]

is called the tangent bundle projection. π⁻¹(x) = TxRⁿ is called the fiber of the tangent bundle over the point x. The dual space of TxRⁿ, denoted by Tx∗Rⁿ, is called the cotangent vector space to Rⁿ at x; it consists of all linear functions from TxRⁿ into R. Its elements are called covectors (covariant vectors) or cotangent vectors to Rⁿ at x, and

\[
T^*(R^n) = \bigcup_{x \in R^n} T_x^* R^n
\]

is called the cotangent bundle. The cotangent bundle projection

\[
\pi^*: T^*(R^n) \longrightarrow R^n
\]

is similarly defined:

\[
\pi^*(\omega_x) = x, \qquad \forall\, \omega_x \in T_x^* R^n.
\]

We now introduce a natural topological structure into T∗Rⁿ. An element of T∗Rⁿ can be represented as a vector consisting of 2n components (q1, · · ·, qn, p1, · · ·, pn), where (q1, · · ·, qn) ∈ Rⁿ and (p1, · · ·, pn) ∈ Tq∗Rⁿ, viewed as Rⁿ. Thus, the topology in T∗Rⁿ is the product topology in Rⁿ × Rⁿ. T∗Rⁿ equipped with such a topology forms a 2n-dimensional manifold.
For the tangent space TxRⁿ = {(x, ξ) | ξ ∈ Rⁿ} to Rⁿ at x, the set of all k-forms on TxRⁿ is denoted by Λᵏ(TxRⁿ). The set

\[
\Lambda^k(R^n) = \bigcup_{x \in R^n} \Lambda^k(T_x R^n)
\]

is called the exterior k-form bundle on Rⁿ. The direct sum Λ(Rⁿ) of the k-form bundles Λᵏ(Rⁿ) (k = 0, 1, · · ·, n),

\[
\Lambda(R^n) = \Lambda^0(R^n) \,\dot+\, \Lambda^1(R^n) \,\dot+\, \cdots \,\dot+\, \Lambda^n(R^n),
\]

is called the bundle of exterior forms on Rⁿ (where Λ⁰(TxRⁿ) = R).

Definition 4.1. A differential k-form is a mapping ω: Rⁿ → Λᵏ(Rⁿ) such that ω(x) ∈ Λᵏ(TxRⁿ). If φ1(x), · · ·, φn(x) is the dual basis to (e1)x, · · ·, (en)x on Rⁿ, then

\[
\omega(x) = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(x)\, \varphi_{i_1}(x) \wedge \cdots \wedge \varphi_{i_k}(x),
\]

where a_{i1···ik}(x) ∈ F(Rⁿ), the totality of functions on Rⁿ.

ω is called continuous, differentiable, etc., if the a_{i1···ik} are continuous, differentiable, etc., respectively. From now on, "differentiable" will always mean C∞.
The set of all differentiable k-forms on Rⁿ is denoted by Ωᵏ(Rⁿ); in particular Ω⁰(Rⁿ) = C∞(Rⁿ). For all ω1, ω2 ∈ Ωᵏ(Rⁿ) and f ∈ C∞(Rⁿ), define

\[
(\omega_1 + \omega_2)(x) = \omega_1(x) + \omega_2(x), \qquad (f\omega)(x) = f(x) \cdot \omega(x).
\]

Then Ωᵏ(Rⁿ) forms a C∞(Rⁿ)-module, i.e., a vector space over a ring. For ωᵏ ∈ Ωᵏ(Rⁿ) and ωˡ ∈ Ωˡ(Rⁿ), define their exterior product ωᵏ ∧ ωˡ ∈ Ω^{k+l}(Rⁿ) by

\[
(\omega^k \wedge \omega^l)(x) = \omega^k(x) \wedge \omega^l(x).
\]

By this formula, we have

\[
f \cdot \omega = f \wedge \omega, \qquad f \in C^\infty(R^n)\ (= \Omega^0(R^n)).
\]

By Theorem 3.24, the exterior product of differential forms defined above is distributive, skew-symmetric, and associative.
If f: Rⁿ → R is differentiable, then Df(x) ∈ Λ¹(Rⁿ), where Df is the derivative of f at x. Thus, we get a differential 1-form df ∈ Ω¹(Rⁿ), defined as

\[
df(\xi_x) = Df(x)\xi = \sum_{i=1}^{n} D_i f(x)\, \xi_i, \qquad \xi = \sum_{i=1}^{n} \xi_i e_i.
\]

Replacing f with xi for x = (x1, · · ·, xi, · · ·, xn) yields

\[
dx^i(\xi_x) = Dx^i(\xi) = \xi_i,
\]

or

\[
dx^i((e_j)_x) = Dx^i(e_j) = \delta_{ij}.
\]

Thus, dx1, · · ·, dxn form the dual basis to (e1)x, · · ·, (en)x. Every ω ∈ Ωᵏ(Rⁿ) can be written as

\[
\omega(x) = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(x)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k},
\]

where a_{i1···ik}(x) ∈ C∞(Rⁿ).
In terms of dx1, · · ·, dxn, the differential 1-form df is



\[
df = D_1 f\, dx^1 + \cdots + D_n f\, dx^n,
\]

or in classical notation

\[
df = \frac{\partial f}{\partial x^1}\, dx^1 + \cdots + \frac{\partial f}{\partial x^n}\, dx^n,
\]

since

\[
df(\xi_x) = Df(x)(\xi) = \sum_{i=1}^{n} \xi_i\, D_i f = \sum_{i=1}^{n} D_i f\, dx^i(\xi_x), \qquad \forall\, \xi_x \in T_x R^n.
\]

Theorem 4.2. Every differential k-form on the space Rⁿ with a given coordinate system x1, · · ·, xn can be represented uniquely in the form

\[
\omega^k = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(x)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k},
\]

where the a_{i1···ik}(x) are smooth functions on Rⁿ.

As a particular case of Theorem 4.2, let k = 1. Thus, we have:

Theorem 4.3. Every differential 1-form on the space Rⁿ with a given coordinate system x1, · · ·, xn can be represented uniquely with smooth functions ai(x) as follows:

\[
\omega = a_1(x)\, dx^1 + \cdots + a_n(x)\, dx^n.
\]

Example 4.4. Calculate the values of the forms ω1, ω2, ω3, where ω3 = dr² (r² = (x1)² + (x2)²), on the vectors ξ1, ξ2, ξ3 of Fig. 4.1; the results are given in full in Table 4.1.

[Fig. 4.1. Example 4.4 graphical representations]

Table 4.1. Example 4.4 table representations

         ξ1    ξ2    ξ3
    ω1    0    −1     1
    ω2    0    −2    −2
    ω3    0    −8     0

For example, the value of ω3 on the vectors ξ1, ξ2, ξ3 is calculated as follows:

\[
\omega_3 = dr^2 = 2x^1\, dx^1 + 2x^2\, dx^2,
\]
\[
\omega_3(x)(\xi_1) = 2 \cdot 0 \cdot dx^1(\xi_1) + 2 \cdot 0 \cdot dx^2(\xi_1) = 0,
\]
\[
\omega_3(x)(\xi_2) = 2 \cdot 2 \cdot dx^1(\xi_2) + 2 \cdot 2 \cdot dx^2(\xi_2) = 4 \times (-1) + 4 \times (-1) = -8,
\]
\[
\omega_3(x)(\xi_3) = 2 \cdot 2 \cdot dx^1(\xi_3) + 2 \cdot 2 \cdot dx^2(\xi_3) = 4 \times 1 + 4 \times (-1) = 0.
\]
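The evaluations above can be reproduced symbolically; a sketch with SymPy (the base points (0, 0) and (2, 2) and the vector components are read off from the computation above):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
r2 = x1**2 + x2**2
dr2 = [sp.diff(r2, x1), sp.diff(r2, x2)]   # dr^2 = 2 x1 dx1 + 2 x2 dx2

def eval_form(coeffs, point, xi):
    # value of the 1-form sum(a_i dx^i) at `point` on the tangent vector `xi`
    subs = {x1: point[0], x2: point[1]}
    return sum(c.subs(subs) * v for c, v in zip(coeffs, xi))

print(eval_form(dr2, (0, 0), (1, 0)))      # 0
print(eval_form(dr2, (2, 2), (-1, -1)))    # -8
print(eval_form(dr2, (2, 2), (1, -1)))     # 0
```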
Example 4.5. Calculate ω1 = dx1 ∧ dx2, ω2 = x1 dx3 ∧ dx2 − x2 dx2 ∧ dx1, and ω3 = r dr ∧ dφ (x1 = r cos φ, x2 = r sin φ) on the pairs of vectors (ξ1, η1), (ξ2, η2), and (ξ3, η3) of Fig. 4.2. The results are given in Table 4.2.

Table 4.2. Example 4.5 table representations

         (ξ1, η1)   (ξ2, η2)   (ξ3, η3)
    ω1       1          1         −1
    ω2       2          1         −3
    ω3       1          1         −1

[Fig. 4.2. Example 4.5 graphical representations]

Example 4.6. Calculate the values of the forms ω1 = dx2 ∧ dx3, ω2 = x1 dx3 ∧ dx2, ω3 = dx3 ∧ dr2 on the vectors ξ, η at the point x, where r2 = (x1)2 + (x2)2 + (x3)2, ξ = (1, 1, 1), η = (1, 2, 3), x = (2, 0, 0).
The detailed calculation is as follows:

\[
\omega_1(\xi, \eta) = dx^2 \wedge dx^3(\xi, \eta) =
\begin{vmatrix} dx^2(\xi) & dx^3(\xi) \\ dx^2(\eta) & dx^3(\eta) \end{vmatrix}
= \begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} = 1,
\]
\[
\omega_2(\xi, \eta) = 2\, dx^3 \wedge dx^2(\xi, \eta) = -2\,\omega_1(\xi, \eta) = -2,
\]
\[
\omega_3(\xi, \eta) = dx^3 \wedge dr^2(\xi, \eta)
= dx^3 \wedge (2x^1\, dx^1 + 2x^2\, dx^2 + 2x^3\, dx^3)(\xi, \eta)
= 2x^1\, dx^3 \wedge dx^1(\xi, \eta) - 2x^2\, dx^2 \wedge dx^3(\xi, \eta)
\]
\[
= 2 \cdot 2\, dx^3 \wedge dx^1(\xi, \eta) - 2 \cdot 0 \cdot dx^2 \wedge dx^3(\xi, \eta)
= 4 \begin{vmatrix} dx^3(\xi) & dx^1(\xi) \\ dx^3(\eta) & dx^1(\eta) \end{vmatrix}
= 4 \begin{vmatrix} 1 & 1 \\ 3 & 1 \end{vmatrix} = -8.
\]

1.4.2 The Behavior of Differential Forms under Maps


We now consider the behavior of differential forms under maps. Let f: Rⁿ → Rᵐ be a differentiable mapping. Df(x), the derivative of f, is the linear transformation from Rⁿ to Rᵐ, and $Df(x) = \left(\frac{\partial f^i}{\partial x^j}\right)$ is the Jacobian of f at x. It induces a linear transformation f∗ from the tangent space TxRⁿ to Rⁿ at x into the tangent space T_{f(x)}Rᵐ to Rᵐ at f(x), i.e.,

\[
f_*(\xi_x) = (Df(x)(\xi))_{f(x)}, \qquad \forall\, \xi_x \in T_x R^n.
\]

Definition 4.7. Let f∗: Λᵏ(T_{f(x)}Rᵐ) → Λᵏ(TxRⁿ) be the linear map

\[
(f^*\omega(x))(\xi_1, \cdots, \xi_k) = \omega(f(x))(f_*\xi_1, \cdots, f_*\xi_k), \qquad \xi_i \in T_x R^n.
\]

f∗ can be extended to Ωᵏ(Rᵐ):

\[
(f^*\omega)(x) = f^*\omega(x).
\]

f∗ω is called the pull-back of ω under f; it is the dual transformation of f∗.

Theorem 4.8. Let f: Rⁿ → Rᵐ, h: Rᵐ → Rˡ, f, h ∈ C∞. Then,

1◦ $f^*(dy^i) = \sum_{j=1}^{n} \dfrac{\partial f^i}{\partial x^j}\, dx^j$.
2◦ f∗(ω1 + ω2) = f∗ω1 + f∗ω2.
3◦ f∗(ω ∧ η) = f∗ω ∧ f∗η.
4◦ f∗(g · ω) = (g ◦ f) · f∗ω, ∀ g ∈ C∞(Rᵐ).
5◦ (h ◦ f)∗ = f∗ ◦ h∗.

Proof. The proof is not difficult; we only prove 1◦, 4◦ and 5◦. For all ξx ∈ TxRⁿ,

\[
f^*(dy^i)(\xi_x) = dy^i(f_*\xi_x) = (f_*\xi)^i
= \sum_{j=1}^{n} \frac{\partial f^i}{\partial x^j}\, \xi_j
= \sum_{j=1}^{n} \frac{\partial f^i}{\partial x^j}\, dx^j(\xi_x).
\]

Thus, we have

\[
f^*(dy^i) = \sum_{j=1}^{n} \frac{\partial f^i}{\partial x^j}\, dx^j,
\]

which proves 1◦. □

Example 4.9. Let y = f(x1, x2) = (x1)² + (x2)², ω = dy. Then

\[
f^*\omega = f^*(dy) = \frac{\partial f}{\partial x^1}\, dx^1 + \frac{\partial f}{\partial x^2}\, dx^2
= 2x^1\, dx^1 + 2x^2\, dx^2.
\]
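The pull-back in Example 4.9 is just Theorem 4.8, 1◦, applied to the single component of f; a one-line SymPy check:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 + x2**2                     # the map y = f(x1, x2) of Example 4.9
# f*(dy) = (∂f/∂x1) dx1 + (∂f/∂x2) dx2, stored as a coefficient list
pullback = [sp.diff(f, x1), sp.diff(f, x2)]
print(pullback)                       # [2*x1, 2*x2]
```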

Before proving 4◦, let us first establish the following theorem.

Theorem 4.10. Let f = (f¹, · · ·, fᵐ) = (y¹, · · ·, yᵐ), f: Rⁿ → Rᵐ, be a differentiable mapping with Jacobian $\left(\frac{\partial f^i}{\partial x^j}\right)_{m \times n} = \left(\frac{\partial y^i}{\partial x^j}\right)_{m \times n}$. For ω ∈ Ωᵏ(Rᵐ),

\[
\omega = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(y)\, dy^{i_1} \wedge \cdots \wedge dy^{i_k},
\]

we have

\[
f^*\omega = \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_k} a_{i_1 \cdots i_k}(f(x))\, \Delta\binom{i_1 \cdots i_k}{j_1 \cdots j_k}\, dx^{j_1} \wedge \cdots \wedge dx^{j_k},
\]

where $\Delta\binom{i_1 \cdots i_k}{j_1 \cdots j_k}$ is the $\binom{i_1 \cdots i_k}{j_1 \cdots j_k}$-minor of the matrix $\left(\frac{\partial y^i}{\partial x^j}\right)$, i = 1, · · ·, m; j = 1, · · ·, n.

Proof.

\[
f^*\omega = f^*\Big(\sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(y)\, dy^{i_1} \wedge \cdots \wedge dy^{i_k}\Big)
= \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(f(x))\, f^*(dy^{i_1}) \wedge \cdots \wedge f^*(dy^{i_k})
\]
\[
= \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(f(x)) \Big(\sum_{j_1=1}^{n} \frac{\partial y^{i_1}}{\partial x^{j_1}}\, dx^{j_1}\Big) \wedge \cdots \wedge \Big(\sum_{j_k=1}^{n} \frac{\partial y^{i_k}}{\partial x^{j_k}}\, dx^{j_k}\Big)
\]
\[
= \sum_{i_1 < \cdots < i_k}\ \sum_{j_1, \cdots, j_k = 1}^{n} a_{i_1 \cdots i_k}(f(x))\, \frac{\partial y^{i_1}}{\partial x^{j_1}} \cdots \frac{\partial y^{i_k}}{\partial x^{j_k}}\, dx^{j_1} \wedge \cdots \wedge dx^{j_k}
\]
\[
= \sum_{i_1 < \cdots < i_k}\ \sum_{j_1 < \cdots < j_k} a_{i_1 \cdots i_k}(f(x)) \Big(\sum_{(j_1', \cdots, j_k') \in \nu_k(j_1, \cdots, j_k)} \varepsilon\binom{j_1 \cdots j_k}{j_1' \cdots j_k'}\, \frac{\partial y^{i_1}}{\partial x^{j_1'}} \cdots \frac{\partial y^{i_k}}{\partial x^{j_k'}}\Big)\, dx^{j_1} \wedge \cdots \wedge dx^{j_k}
\]
\[
= \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_k} a_{i_1 \cdots i_k}(f(x))\, \Delta\binom{i_1 \cdots i_k}{j_1 \cdots j_k}\, dx^{j_1} \wedge \cdots \wedge dx^{j_k}.
\]

This completes the proof. □


Proof of 4◦. For any ξ1, · · ·, ξk ∈ TxRⁿ,

\[
(f^*(g \cdot \omega))(x)(\xi_1, \cdots, \xi_k)
= (g \cdot \omega)(f(x))(f_*\xi_1, \cdots, f_*\xi_k)
= g(f(x)) \cdot \omega(f(x))(f_*\xi_1, \cdots, f_*\xi_k)
= ((g \circ f) \cdot f^*\omega)(x)(\xi_1, \cdots, \xi_k). \qquad \Box
\]

In particular, combining 4◦ with Theorem 4.10, if f: Rⁿ → Rⁿ is differentiable, then

\[
f^*(g\, dx^1 \wedge \cdots \wedge dx^n) = (g \circ f)\, \det(f')\, dx^1 \wedge \cdots \wedge dx^n,
\]

where $f' = Df = \left(\frac{\partial f^i}{\partial x^j}\right)$.

For the proof of 5◦:

Proof.

\[
[f^*(h^*\omega)](x)(\xi_1, \cdots, \xi_p)
= (h^*\omega)(f(x))(f_*\xi_1, \cdots, f_*\xi_p)
= \omega(h(f(x)))(h_* f_* \xi_1, \cdots, h_* f_* \xi_p)
\]
\[
= \omega((h \circ f)(x))((h \circ f)_* \xi_1, \cdots, (h \circ f)_* \xi_p)
= [(h \circ f)^*\omega](x)(\xi_1, \cdots, \xi_p).
\]

Therefore, (h ◦ f)∗ = f∗ ◦ h∗. □

1.4.3 Exterior Differential


We now define an operator similar to differentiation in classical mathematical analysis. We have introduced the differential of a function: if f ∈ C∞(Rⁿ), then df|x ∈ Tx∗Rⁿ, and df is a 1-form on Rⁿ. Therefore, the operator d: Ω⁰(Rⁿ) → Ω¹(Rⁿ) maps the 0-forms on Rⁿ to the 1-forms on Rⁿ. We now extend this operator to the whole exterior algebra Ω(Rⁿ).

Definition 4.11. The exterior differential operator d on the exterior algebra Ω(Rⁿ) is a mapping

\[
d: \Omega^k(R^n) \longrightarrow \Omega^{k+1}(R^n),
\]

where k = 0, 1, · · ·, n.
A k-form ω may be represented in local coordinates as

\[
\omega = \sum_{1 \le i_1 < \cdots < i_k \le n} a_{i_1 \cdots i_k}\, dx^{i_1} \wedge \cdots \wedge dx^{i_k}.
\]

Then,

\[
d\omega = \sum_{1 \le i_1 < \cdots < i_k \le n} da_{i_1 \cdots i_k} \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k}
= \sum_{1 \le i_1 < \cdots < i_k \le n}\ \sum_{j=1}^{n} \frac{\partial a_{i_1 \cdots i_k}}{\partial x^j}\, dx^j \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k}.
\]

Here d is called an exterior differential operator.
In particular, if ω = f ∈ Ω⁰(Rⁿ) = C∞(Rⁿ), then

\[
d\omega = \sum_{j=1}^{n} \frac{\partial f}{\partial x^j}\, dx^j.
\]

From this, we can see that dω = 0 whenever ω ∈ Ωⁿ(Rⁿ).



Theorem 4.12. The exterior differential operator d has the following properties:

1◦ d(ω + η) = dω + dη.
2◦ For all ωᵏ ∈ Ωᵏ, ωˡ ∈ Ωˡ: d(ωᵏ ∧ ωˡ) = dωᵏ ∧ ωˡ + (−1)ᵏ ωᵏ ∧ dωˡ.
3◦ d(dω) = 0, or in short form d²ω = 0.
4◦ f∗(dω) = d(f∗ω).

Proof. 1◦ The proof is obvious.

2◦ Let

\[
\omega^k(x) = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}\, dx^{i_1} \wedge \cdots \wedge dx^{i_k},
\qquad
\omega^l(x) = \sum_{j_1 < \cdots < j_l} b_{j_1 \cdots j_l}\, dx^{j_1} \wedge \cdots \wedge dx^{j_l}.
\]

Then,

\[
\omega^k \wedge \omega^l = \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l} a_{i_1 \cdots i_k}(x)\, b_{j_1 \cdots j_l}(x)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k} \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_l}.
\]

By definition (writing a for $a_{i_1 \cdots i_k}$ and b for $b_{j_1 \cdots j_l}$),

\[
d(\omega^k \wedge \omega^l)
= \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l}\ \sum_{i=1}^{n} \frac{\partial (ab)}{\partial x^i}\, dx^i \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k} \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_l}
\]
\[
= \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l}\ \sum_{i=1}^{n} \Big(b\, \frac{\partial a}{\partial x^i} + a\, \frac{\partial b}{\partial x^i}\Big)\, dx^i \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k} \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_l}
\]
\[
= \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l}\ \sum_{i=1}^{n} b\, \frac{\partial a}{\partial x^i}\, dx^i \wedge dx^{i_1} \wedge \cdots \wedge dx^{j_l}
+ (-1)^k \sum_{i_1 < \cdots < i_k;\ j_1 < \cdots < j_l}\ \sum_{i=1}^{n} a\, \frac{\partial b}{\partial x^i}\, dx^{i_1} \wedge \cdots \wedge dx^{i_k} \wedge dx^i \wedge dx^{j_1} \wedge \cdots \wedge dx^{j_l}
\]
\[
= d\omega^k \wedge \omega^l + (-1)^k\, \omega^k \wedge d\omega^l.
\]

3◦ The proof is as follows:

\[
d(d\omega) = \sum_{i_1 < \cdots < i_k}\ \sum_{i,j=1}^{n} \frac{\partial^2 a}{\partial x^i \partial x^j}\, dx^i \wedge dx^j \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k}
= \sum_{i_1 < \cdots < i_k}\ \sum_{i<j} \Big(\frac{\partial^2 a}{\partial x^i \partial x^j} - \frac{\partial^2 a}{\partial x^j \partial x^i}\Big)\, dx^i \wedge dx^j \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k}
= 0,
\]

since

\[
\frac{\partial^2 a}{\partial x^i \partial x^j} = \frac{\partial^2 a}{\partial x^j \partial x^i}.
\]
4◦ By Theorem 4.8, 1◦, for any ω = g ∈ C∞(Rᵐ) we have

\[
f^* dg = f^*\Big(\sum_{i=1}^{m} \frac{\partial g}{\partial y^i}\, dy^i\Big)
= \sum_{i=1}^{m} \Big(\frac{\partial g}{\partial y^i} \circ f\Big) \cdot f^*(dy^i)
= \sum_{j=1}^{n} \sum_{i=1}^{m} \frac{\partial g(f(x))}{\partial y^i} \cdot \frac{\partial f^i}{\partial x^j}\, dx^j
= \sum_{j=1}^{n} \frac{\partial (g \circ f)(x)}{\partial x^j}\, dx^j
= d(g \circ f) = d(f^* g).
\]

Furthermore, for all ω ∈ Ωᵏ(Rᵐ),

\[
f^*(d\omega) = f^*\Big(\sum_{i_1 < \cdots < i_k}\ \sum_{i=1}^{m} \frac{\partial a}{\partial y^i}\, dy^i \wedge dy^{i_1} \wedge \cdots \wedge dy^{i_k}\Big)
= f^*\Big(\sum_{i_1 < \cdots < i_k} da \wedge dy^{i_1} \wedge \cdots \wedge dy^{i_k}\Big)
\]
\[
= \sum_{i_1 < \cdots < i_k} f^*(da) \wedge f^*(dy^{i_1}) \wedge \cdots \wedge f^*(dy^{i_k})
= \sum_{i_1 < \cdots < i_k} d(f^* a) \wedge f^*(dy^{i_1}) \wedge \cdots \wedge f^*(dy^{i_k})
\]
\[
= d\Big(\sum_{i_1 < \cdots < i_k} f^* a \wedge f^*(dy^{i_1}) \wedge \cdots \wedge f^*(dy^{i_k})\Big)
\quad (\text{by } 2^\circ \text{ and } 3^\circ, \text{ since each } f^*(dy^{i_j}) = d(f^* y^{i_j}) \text{ is closed})
\]
\[
= d(f^*\omega).
\]

This completes the proof. □
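Properties 1◦–3◦ can be tested mechanically on coordinate representations. A sketch (a form is encoded as a dictionary mapping increasing index tuples to SymPy coefficients, an illustrative encoding rather than the text's notation):

```python
import sympy as sp

xs = sp.symbols('x1 x2 x3')

def ext_d(form):
    # exterior derivative; a form is a dict {(i1 < ... < ik): coefficient},
    # with 0-based indices into the coordinates xs
    out = {}
    for idx, coeff in form.items():
        for i, xi in enumerate(xs):
            if i in idx:
                continue                      # dx^i would repeat: term vanishes
            new = tuple(sorted(idx + (i,)))
            sign = (-1) ** new.index(i)       # move dx^i into sorted position
            out[new] = out.get(new, 0) + sign * sp.diff(coeff, xi)
    return {k: sp.simplify(v) for k, v in out.items()}

# a sample 1-form ω = x2 dx1 + x1*x3 dx3
omega = {(0,): xs[1], (2,): xs[0] * xs[2]}
d_omega = ext_d(omega)
dd_omega = ext_d(d_omega)
print(all(v == 0 for v in dd_omega.values()))  # True: d(dω) = 0
```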

1.4.4 Poincaré Lemma and Its Inverse Lemma


Definition 4.13. A differential form ω is closed if dω = 0, and exact if there exists a differential form η such that ω = dη.

Clearly, if ω is exact, then it is closed, by the formula d²ω = 0. However, the converse is not always true. The following Poincaré lemma asserts that in a neighbourhood of each point, closedness is equivalent to exactness. Before stating the Poincaré lemma, we introduce a notion.

Definition 4.14. An open set A ⊂ Rⁿ is star-shaped with respect to 0 if for any x ∈ A the set {αx | α ∈ [0, 1]} ⊂ A.

Evidently, Rn is a star-shaped open set, and every convex set containing 0 is a


star-shaped open set with respect to 0.

Theorem 4.15 (Poincaré lemma). Let A ⊂ Rn be an open star-shaped set with


respect to 0. Then, every closed form on A is exact.

Proof. We will construct an R-linear mapping H: Ωᵏ(A) → Ω^{k−1}(A) such that d ◦ H + H ◦ d = id: Ωᵏ(A) → Ωᵏ(A), i.e., ω = d(H(ω)) + H(dω). Then, from dω = 0, it follows that ω = d(H(ω)). Taking η = H(ω), we get

\[
\omega = d\eta.
\]

Let

\[
\omega = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(x)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k}.
\]

Define

\[
H(\omega)(x) = \sum_{i_1 < \cdots < i_k}\ \sum_{j=1}^{k} (-1)^{j-1} \Big(\int_0^1 t^{k-1} a_{i_1 \cdots i_k}(tx)\, dt\Big)\, x^{i_j}\, dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_j}} \wedge \cdots \wedge dx^{i_k},
\]

where the hat over $dx^{i_j}$ indicates that it is omitted. Then (writing a for $a_{i_1 \cdots i_k}$)

\[
dH(\omega) = \sum_{i_1 < \cdots < i_k}\ \sum_{j=1}^{k} (-1)^{j-1} \Big(\int_0^1 t^{k-1} a(tx)\, dt\Big)\, dx^{i_j} \wedge dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_j}} \wedge \cdots \wedge dx^{i_k}
\]
\[
\quad + \sum_{i_1 < \cdots < i_k}\ \sum_{j=1}^{k} (-1)^{j-1}\ \sum_{i=1}^{n} \Big(\int_0^1 t^k\, \frac{\partial a(tx)}{\partial x^i}\, dt\Big)\, x^{i_j}\, dx^i \wedge dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_j}} \wedge \cdots \wedge dx^{i_k}
\]
\[
= k \sum_{i_1 < \cdots < i_k} \Big(\int_0^1 t^{k-1} a(tx)\, dt\Big)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k}
\]
\[
\quad + \sum_{i_1 < \cdots < i_k}\ \sum_{j=1}^{k} (-1)^{j-1}\ \sum_{i=1}^{n} \Big(\int_0^1 t^k\, \frac{\partial a(tx)}{\partial x^i}\, dt\Big)\, x^{i_j}\, dx^i \wedge dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_j}} \wedge \cdots \wedge dx^{i_k}. \tag{4.1}
\]

On the other hand,

\[
d\omega = \sum_{i_1 < \cdots < i_k}\ \sum_{i=1}^{n} \frac{\partial a}{\partial x^i}\, dx^i \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_k},
\]
\[
H(d\omega) = \sum_{i_1 < \cdots < i_k} \bigg\{ \sum_{i=1}^{n} \Big(\int_0^1 t^k\, \frac{\partial a(tx)}{\partial x^i}\, dt\Big)\, x^i\, dx^{i_1} \wedge \cdots \wedge dx^{i_k}
\]
\[
\quad + \sum_{i=1}^{n}\ \sum_{j=1}^{k} (-1)^{j} \Big(\int_0^1 t^k\, \frac{\partial a(tx)}{\partial x^i}\, dt\Big)\, x^{i_j}\, dx^i \wedge dx^{i_1} \wedge \cdots \wedge \widehat{dx^{i_j}} \wedge \cdots \wedge dx^{i_k} \bigg\}.
\]

The second term on the right-hand side of this equality coincides with the second term in Equation (4.1) except for the sign. Adding them together, we get

\[
dH(\omega) + H(d\omega) = k \sum_{i_1 < \cdots < i_k} \Big(\int_0^1 t^{k-1} a(tx)\, dt\Big)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k}
+ \sum_{i_1 < \cdots < i_k}\ \sum_{i=1}^{n} \Big(\int_0^1 t^k\, \frac{\partial a(tx)}{\partial x^i}\, x^i\, dt\Big)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k}
\]
\[
= \sum_{i_1 < \cdots < i_k} \int_0^1 \Big(k t^{k-1} a(tx) + t^k \sum_{i=1}^{n} \frac{\partial a(tx)}{\partial x^i}\, x^i\Big)\, dt\ \, dx^{i_1} \wedge \cdots \wedge dx^{i_k}.
\]
Notice that

\[
k t^{k-1} a(tx) + t^k \sum_{i=1}^{n} x^i\, \frac{\partial a(tx)}{\partial x^i} = \frac{d}{dt}\big(t^k a(tx)\big).
\]

Thus,

\[
\int_0^1 \Big(k t^{k-1} a(tx) + t^k \sum_{i=1}^{n} x^i\, \frac{\partial a(tx)}{\partial x^i}\Big)\, dt
= \int_0^1 \frac{d}{dt}\big(t^k a(tx)\big)\, dt = a(x).
\]

Then, we have

\[
dH(\omega) + H(d\omega) = \sum_{i_1 < \cdots < i_k} a_{i_1 \cdots i_k}(x)\, dx^{i_1} \wedge \cdots \wedge dx^{i_k} = \omega,
\]

i.e., dH + Hd = id. This completes the proof. □
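For k = 1 the operator H reduces to H(ω)(x) = ∫₀¹ Σᵢ aᵢ(tx) xⁱ dt, and the identity ω = d(Hω) for a closed ω can be verified symbolically; a sketch with SymPy (the sample closed 1-form on R² is an illustrative choice):

```python
import sympy as sp

x1, x2, t = sp.symbols('x1 x2 t')

# a closed 1-form on R^2: ω = (2*x1*x2) dx1 + (x1**2 + 3*x2**2) dx2
a = [2*x1*x2, x1**2 + 3*x2**2]
assert sp.simplify(sp.diff(a[0], x2) - sp.diff(a[1], x1)) == 0   # dω = 0

# homotopy operator for k = 1: H(ω)(x) = ∫_0^1 Σ_i a_i(t x) x^i dt
integrand = sum(ai.subs({x1: t*x1, x2: t*x2}) * xi
                for ai, xi in zip(a, [x1, x2]))
h = sp.integrate(integrand, (t, 0, 1))
print(sp.expand(h))        # the potential: x1**2*x2 + x2**3
```

Differentiating `h` recovers the coefficients of ω, confirming ω = d(Hω) on this example.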

1.4.5 Differential Form in R3


We now assume that R3 is a three-dimensional oriented Euclidean space. The square
of the length element in R3 has the form

\[
ds^2 = (dx^1)^2 + (dx^2)^2 + (dx^3)^2.
\]

For any vector A ∈ R³, we define a corresponding 1-form ω¹_A and a 2-form ω²_A by the formulae

\[
\omega_A^1(\xi) = (A, \xi), \qquad \omega_A^2(\xi, \eta) = (A, [\xi, \eta]), \qquad \forall\, \xi, \eta \in R^3,
\]

where ( , ) stands for the usual inner product and (A, [ξ, η]) for the triple scalar product.
Let A = A₁e₁ + A₂e₂ + A₃e₃ and ω¹_A = a₁dx¹ + a₂dx² + a₃dx³. Then by definition, on one hand, $\omega_A^1(e_j) = \sum_{i=1}^{3} a_i\, dx^i(e_j) = a_j$; on the other hand, ω¹_A(e_j) = (A, e_j) = A_j. Thus a_j = A_j, i.e.,

\[
\omega_A^1 = A_1\, dx^1 + A_2\, dx^2 + A_3\, dx^3.
\]

Similarly, we can get

\[
\omega_A^2 = A_1\, dx^2 \wedge dx^3 + A_2\, dx^3 \wedge dx^1 + A_3\, dx^1 \wedge dx^2.
\]

It is easy to observe that

\[
\omega_A^2 = *(\omega_A^1), \qquad \omega_A^1 = \flat(A).
\]

Here ∗ denotes the Hodge star operator and ♭ the flat operator, namely ∗: Λᵏ(Rⁿ) → Λ^{n−k}(Rⁿ) and ♭: Rⁿ → (Rⁿ)∗.
We now introduce three operators that play an important role in classical vector analysis: gradient, curl, and divergence.

Definition 4.16. Let f ∈ C∞(R³) and A ∈ X(R³). Then grad f, curl A ∈ X(R³) and div A ∈ C∞(R³) are defined by

\[
\omega_{\mathrm{grad}\, f}^1 = df, \qquad \omega_{\mathrm{curl}\, A}^2 = d\omega_A^1, \qquad (\mathrm{div}\, A)\, \omega^3 = d\omega_A^2,
\]

where ω³ = dx¹ ∧ dx² ∧ dx³ is the volume element in R³.

By this definition,

\[
\omega_{\mathrm{grad}\, f}^1 = df = \frac{\partial f}{\partial x^1}\, dx^1 + \frac{\partial f}{\partial x^2}\, dx^2 + \frac{\partial f}{\partial x^3}\, dx^3.
\]

Thus,

\[
\mathrm{grad}\, f = \frac{\partial f}{\partial x^1}\, e_1 + \frac{\partial f}{\partial x^2}\, e_2 + \frac{\partial f}{\partial x^3}\, e_3
= \Big(\frac{\partial f}{\partial x^1}, \frac{\partial f}{\partial x^2}, \frac{\partial f}{\partial x^3}\Big) = \frac{\partial f}{\partial x},
\]

and so
88 1. Preliminaries of Differentiable Manifolds

\[
\omega_{\mathrm{curl}\, A}^2 = d\omega_A^1 = d\big(A_1(x)\, dx^1 + A_2(x)\, dx^2 + A_3(x)\, dx^3\big)
\]
\[
= \Big(\frac{\partial A_1}{\partial x^1} dx^1 + \frac{\partial A_1}{\partial x^2} dx^2 + \frac{\partial A_1}{\partial x^3} dx^3\Big) \wedge dx^1
+ \Big(\frac{\partial A_2}{\partial x^1} dx^1 + \frac{\partial A_2}{\partial x^2} dx^2 + \frac{\partial A_2}{\partial x^3} dx^3\Big) \wedge dx^2
+ \Big(\frac{\partial A_3}{\partial x^1} dx^1 + \frac{\partial A_3}{\partial x^2} dx^2 + \frac{\partial A_3}{\partial x^3} dx^3\Big) \wedge dx^3
\]
\[
= \Big(\frac{\partial A_3}{\partial x^2} - \frac{\partial A_2}{\partial x^3}\Big) dx^2 \wedge dx^3
+ \Big(\frac{\partial A_1}{\partial x^3} - \frac{\partial A_3}{\partial x^1}\Big) dx^3 \wedge dx^1
+ \Big(\frac{\partial A_2}{\partial x^1} - \frac{\partial A_1}{\partial x^2}\Big) dx^1 \wedge dx^2,
\]

where

\[
\mathrm{curl}\, A = \Big(\frac{\partial A_3}{\partial x^2} - \frac{\partial A_2}{\partial x^3}\Big) e_1
+ \Big(\frac{\partial A_1}{\partial x^3} - \frac{\partial A_3}{\partial x^1}\Big) e_2
+ \Big(\frac{\partial A_2}{\partial x^1} - \frac{\partial A_1}{\partial x^2}\Big) e_3
= \begin{vmatrix}
e_1 & e_2 & e_3 \\
\dfrac{\partial}{\partial x^1} & \dfrac{\partial}{\partial x^2} & \dfrac{\partial}{\partial x^3} \\
A_1 & A_2 & A_3
\end{vmatrix},
\]

and

\[
\omega_{\mathrm{div}\, A}^3 = d\omega_A^2 = d\big(A_1\, dx^2 \wedge dx^3 + A_2\, dx^3 \wedge dx^1 + A_3\, dx^1 \wedge dx^2\big)
= \Big(\frac{\partial A_1}{\partial x^1} + \frac{\partial A_2}{\partial x^2} + \frac{\partial A_3}{\partial x^3}\Big)\, dx^1 \wedge dx^2 \wedge dx^3.
\]

Therefore,

\[
\mathrm{div}\, A = \frac{\partial A_1}{\partial x^1} + \frac{\partial A_2}{\partial x^2} + \frac{\partial A_3}{\partial x^3}.
\]

Since

\[
\omega_{\mathrm{curl\, grad}\, f}^2 = d\,\omega_{\mathrm{grad}\, f}^1 = d(df) = 0,
\qquad
(\mathrm{div\, curl}\, A)\, \omega^3 = d\,\omega_{\mathrm{curl}\, A}^2 = d(d\omega_A^1) = 0,
\]

we easily get two equalities of classical vector analysis:

\[
\mathrm{curl\ grad} = 0, \qquad \mathrm{div\ curl} = 0.
\]
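Both identities can be confirmed symbolically; a sketch with SymPy (the sample fields f and A are arbitrary illustrative choices):

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
xs = (x1, x2, x3)

def grad(f):
    return [sp.diff(f, v) for v in xs]

def curl(A):
    return [sp.diff(A[2], x2) - sp.diff(A[1], x3),
            sp.diff(A[0], x3) - sp.diff(A[2], x1),
            sp.diff(A[1], x1) - sp.diff(A[0], x2)]

def div(A):
    return sum(sp.diff(A[i], xs[i]) for i in range(3))

f = sp.sin(x1) * x2 + sp.exp(x3) * x1      # arbitrary smooth scalar field
A = [x1 * x2, x2 * x3**2, sp.cos(x1)]      # arbitrary smooth vector field

print([sp.simplify(c) for c in curl(grad(f))])   # [0, 0, 0]: curl grad = 0
print(sp.simplify(div(curl(A))))                 # 0: div curl = 0
```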

1.4.6 Hodge Duality and Star Operators


Let us introduce the Hodge operator as a linear transformation:

∗ : Λp −→ Λn−p .

Definition 4.17. For υ ∈ Λᵖ, denote by ∗υ the element of Λ^{n−p} such that, for all u ∈ Λᵖ,

\[
u \wedge *\upsilon = (u, \upsilon)\, e^n.
\]

For brevity, we write u ∧ ∗υ as u ∗ υ.
If υ is a scalar, u must also be a scalar. By the above formula, we get ∗υ = υeⁿ.

Example 4.18. If

\[
\alpha = \sum_{i_1 < \cdots < i_p} a_{i_1 \cdots i_p}\, dx^{i_1} \wedge \cdots \wedge dx^{i_p},
\]

then

\[
*\alpha = \sum_{j_1 < \cdots < j_{n-p}} b_{j_1 \cdots j_{n-p}}\, dx^{j_1} \wedge \cdots \wedge dx^{j_{n-p}},
\]

where

\[
b_{j_1 \cdots j_{n-p}} = \sum_{i_1 < \cdots < i_p} \varepsilon_{i_1 \cdots i_p j_1 \cdots j_{n-p}}\, a_{i_1 \cdots i_p},
\]

and $\varepsilon_{i_1 \cdots i_p j_1 \cdots j_{n-p}}$ is the generalized Kronecker symbol.


Star operators in 3-dimensional space have the following properties:

Property 4.19.
1◦ ∗dx = dy ∧ dz.
2◦ ∗dy = dz ∧ dx.
3◦ ∗dz = dx ∧ dy.
4◦ ∗(dx ∧ dy ∧ dz) = 1.
5◦ Let ω = a₁dx¹ + a₂dx² + a₃dx³; then

\[
*d\omega = \Big(\frac{\partial a_3}{\partial x^2} - \frac{\partial a_2}{\partial x^3}\Big) dx^1
+ \Big(\frac{\partial a_1}{\partial x^3} - \frac{\partial a_3}{\partial x^1}\Big) dx^2
+ \Big(\frac{\partial a_2}{\partial x^1} - \frac{\partial a_1}{\partial x^2}\Big) dx^3.
\]

6◦ $*d*\omega = \dfrac{\partial a_1}{\partial x^1} + \dfrac{\partial a_2}{\partial x^2} + \dfrac{\partial a_3}{\partial x^3}$.
7◦ grad = d (operating on Λ⁰(R³)).
8◦ rot = ∗d (operating on Λ¹(R³)).
9◦ div = ∗d∗ (operating on Λ¹(R³)).
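Several of these properties can be checked on coordinate representations; the following sketch computes ∗ on 1-forms in R³ via the Levi-Civita symbol and then verifies that ∗d∗ yields the divergence as in 6◦ (the dictionary encoding of 2-forms and the sample 1-form are illustrative choices):

```python
import sympy as sp
from sympy import LeviCivita

x1, x2, x3 = sp.symbols('x1 x2 x3')
xs = (x1, x2, x3)

def star1(a):
    # Hodge star of the 1-form a1 dx1 + a2 dx2 + a3 dx3, returned as a
    # dict {(j, k): coefficient} of the 2-form, j < k (0-based indices)
    out = {}
    for i in range(3):
        for (j, k) in [(0, 1), (0, 2), (1, 2)]:
            out[(j, k)] = out.get((j, k), 0) + LeviCivita(i, j, k) * a[i]
    return out

def d2(two_form):
    # exterior derivative of a 2-form in R^3: coefficient of dx1∧dx2∧dx3
    total = 0
    for (j, k), c in two_form.items():
        i = 3 - j - k                        # the missing index
        total += LeviCivita(i, j, k) * sp.diff(c, xs[i])
    return sp.simplify(total)

a = [x1 * x2, sp.sin(x3), x1 + x3**2]        # a sample 1-form
# *d* on a 1-form gives the divergence of the coefficient field (6°)
print(d2(star1(a)))                          # x2 + 2*x3
```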

1.4.7 Codifferential Operator δ


We know that the exterior differential operator d: Λᵏ(M) → Λ^{k+1}(M) is a linear differential operator of first order which raises the degree of a form by one, while the Hodge star operator ∗: Λᵏ(M) → Λ^{n−k}(M) is essentially a duality operator. One may therefore ask whether it is possible to define a linear differential operator δ: Λᵏ(M) → Λ^{k−1}(M) that lowers the degree by one. The answer is yes.

Definition 4.20. The codifferential operator δ: Λᵏ(M) → Λ^{k−1}(M) is the linear differential operator that lowers the degree by one, represented by

\[
\delta = -(-1)^g (-1)^{n(k+1)} * d *.
\]

If our manifold is an oriented Riemann manifold, then g = 1.

Definition 4.21. The k-form ω is called coclosed if δω = 0; it is called coexact if there exists a θ ∈ Λ^{k+1}(M) such that ω = δθ.

The relations between the operators δ, d, and ∗ lead to the following theorem.

Theorem 4.22. The codifferential operator δ has the following properties:

1◦ δ² = 0.
2◦ ∗δd = dδ∗, ∗dδ = δd∗.
3◦ d∗δ = δ∗d = 0.
4◦ ∗(δω) = (−1)ᵏ d(∗ω), ω ∈ Λᵏ(M).
5◦ δ(∗ω) = (−1)^{n−k+1} ∗(dω), ω ∈ Λᵏ(M).

1.4.8 Laplace–Beltrami Operator


Definition 4.23. The linear mapping

\[
\Delta = d\delta + \delta d: \Lambda^k(M) \longrightarrow \Lambda^k(M)
\]

is called the Laplace–Beltrami operator on a Riemann manifold.

If k = 0, then for f ∈ Λ⁰(M) we have δf = 0, and hence

\[
\Delta f = \delta\, df.
\]

Theorem 4.24. The Laplace–Beltrami operator obeys the following rule:


1◦ Δ = (d + δ)2 .
2◦ d · Δ = Δ · d = d · δ · d.
3◦ δ · Δ = Δ · δ = δ · d · δ.
4◦ ∗ Δ = Δ ∗ .

Example 4.25. Let f ∈ Λ⁰(R³). In a rectangular coordinate system, we have

\[
\Delta f = \delta\, df = \delta\Big(\sum_i \frac{\partial f}{\partial x^i}\, dx^i\Big).
\]

Since δ = −(−1)^g(−1)^{n(k+1)} ∗d∗ = ∗d∗ here,

\[
\Delta f = * d * \Big(\sum_i \frac{\partial f}{\partial x^i}\, dx^i\Big) = \sum_i \frac{\partial^2 f}{(\partial x^i)^2}.
\]

Obviously, for R³ the Laplace–Beltrami operator Δ: Λ⁰(R³) → Λ⁰(R³) is the usual Laplace operator

\[
\Delta = \sum_{i=1}^{3} \frac{\partial^2}{(\partial x^i)^2}.
\]

The correspondence between the operators d and δ acting on forms and the operators of classical vector analysis acting on their coefficients can be summarized as follows:

                  d           d           d
    form:    Λ⁰   ⇄    Λ¹    ⇄    Λ²    ⇄    Λ³
                  δ           δ           δ

                 grad        curl        div
    coef.: scalar  ⇄  vector  ⇄  vector  ⇄  scalar
                 −div        curl       −grad

We can easily obtain two equations of classical vector analysis:

    d d = 0 :   rot grad = 0,   div rot = 0;
    δ δ = 0 :  −rot grad = 0,  −div rot = 0.

1.5 Integration on a Manifold


The integral of an n-form on an n-manifold is defined in terms of integrals over sets in Rⁿ by means of a partition of unity subordinate to an atlas.

1.5.1 Geometrical Preliminary

Chains. A singular k-cube in M ⊂ Rⁿ is a continuous function c: [0, 1]ᵏ → M, where [0, 1]ᵏ = [0, 1] × · · · × [0, 1] (k factors) and R⁰ = [0, 1]⁰ = {0}.
A singular 0-cube in M is a function c: [0, 1]⁰ = {0} → M, i.e., a point in M. A singular 1-cube in M is a usual curve. The standard n-cube in Rⁿ is the identity mapping Iⁿ: [0, 1]ⁿ → Rⁿ, Iⁿ(x) = x, ∀ x ∈ [0, 1]ⁿ.

Definition 5.1. A k-chain c in M is a linear combination of finitely many singular k-cubes ci in M , i.e.,

c = α1 c1 + · · · + αr cr ,  αi ∈ R, i = 1, · · · , r.

The set of all k-chains in M is denoted by Ck (M ). Ck (M ) forms a vector space over R if we introduce in Ck (M ) addition and multiplication by scalars by the following formulae:

c^1 + c^2 = Σ_{i=1}^{r} α_i^1 ci + Σ_{i=1}^{r} α_i^2 ci = Σ_{i=1}^{r} (α_i^1 + α_i^2) ci ,
αc^1 = α Σ_{i=1}^{r} α_i^1 ci = Σ_{i=1}^{r} (αα_i^1) ci ,

where c^j = Σ_{i=1}^{r} α_i^j ci (j = 1, 2) are two k-chains in M . Without loss of generality, we assume that the two chains c^1 and c^2 are generated by the same set of k-cubes {c1 , · · · , cr }. For example, let c^1 = c1 + 2c2 and c^2 = c1 + c3 , where c3 ≠ c2 . We only need to rewrite them as c^1 = c1 + 2c2 + 0 · c3 , c^2 = c1 + 0 · c2 + c3 .
Boundary of Chains. Corresponding to the exterior operator d : Ωk → Ωk+1 , there is a boundary operator ∂ : Ck (M ) → Ck−1 (M ), defined as follows:

∂[0, 1] = {1} − {0},
∂[0, 1]2 = ∂([0, 1] × [0, 1]) = {1} × [0, 1] − {0} × [0, 1] − [0, 1] × {1} + [0, 1] × {0}.

For the general [0, 1]k and ∀ x ∈ [0, 1]k−1 , denote

I^k_{(i,0)}(x) = I^k(x1 , · · · , xi−1 , 0, xi , · · · , xk−1 ),
I^k_{(i,1)}(x) = I^k(x1 , · · · , xi−1 , 1, xi , · · · , xk−1 ).

We call I^k_{(i,0)} and I^k_{(i,1)} the (i, 0)- and (i, 1)-surface respectively, and

∂I^k = ∂[0, 1]^k = Σ_{i=1}^{k} Σ_{α=0,1} (−1)^{i+α} I^k_{(i,α)} .

For any k-cube c : [0, 1]k → M , the (i, α)-surface is defined as

c_{(i,α)} = c ◦ I^k_{(i,α)} .

The boundary ∂c of the k-cube c is

∂c = Σ_{i=1}^{k} Σ_{α=0,1} (−1)^{i+α} c ◦ I^k_{(i,α)} = Σ_{i=1}^{k} Σ_{α=0,1} (−1)^{i+α} c_{(i,α)} .

The boundary of any k-chain c = Σ_j αj cj is

∂c = Σ_j αj ∂cj .

Theorem 5.2. For any k-chain c, ∂(∂c) = 0, or briefly ∂2 = 0.

Proof. First, assume i ≤ j and consider (I^k_{(i,α)})_{(j,β)}. For x ∈ [0, 1]^{k−2}, we have

(I^k_{(i,α)})_{(j,β)}(x) = I^k_{(i,α)}(x1 , · · · , xj−1 , β, xj , · · · , xk−2 )
= I^k(x1 , · · · , xi−1 , α, xi , · · · , xj−1 , β, xj , · · · , xk−2 ).

Similarly, we have

(I^k_{(j+1,β)})_{(i,α)}(x) = I^k_{(j+1,β)}(x1 , · · · , xi−1 , α, xi , · · · , xk−2 )
= I^k(x1 , · · · , xi−1 , α, xi , · · · , xj−1 , β, xj , · · · , xk−2 ).

Thus, if i ≤ j, (I^k_{(i,α)})_{(j,β)} = (I^k_{(j+1,β)})_{(i,α)}, and it is easy to see that for any k-cube c, (c_{(i,α)})_{(j,β)} = (c_{(j+1,β)})_{(i,α)} for i ≤ j. Now,

∂(∂c) = ∂( Σ_{i=1}^{k} Σ_{α=0,1} (−1)^{i+α} c_{(i,α)} )
= Σ_{i=1}^{k} Σ_{α=0,1} Σ_{j=1}^{k−1} Σ_{β=0,1} (−1)^{i+α+j+β} (c_{(i,α)})_{(j,β)} .

In this sum, (c_{(i,α)})_{(j,β)} and (c_{(j+1,β)})_{(i,α)} occur simultaneously with opposite signs. Hence all terms cancel in pairs and ∂(∂c) = 0. Consequently, for any k-chain c = Σ_{i=1}^{r} αi ci , where the ci (i = 1, · · · , r) are k-cubes,

∂(∂c) = ∂( Σ_{i=1}^{r} αi ∂ci ) = Σ_{i=1}^{r} αi ∂(∂ci ) = 0.

The theorem is proved. □
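The signed-face combinatorics of Theorem 5.2 can be checked mechanically for the standard cube. In the sketch below (our own illustration; the tuple encoding of faces is an assumption, not the book's notation) a face of [0,1]^k is a tuple whose entries are either 'x' (a free coordinate) or 0/1 (a frozen coordinate), and ∂ applies the formula for ∂I^k:

```python
from collections import Counter

def boundary(chain):
    """Boundary operator on formal chains of faces of the standard cube.
    Freezing the i-th free coordinate of a face at alpha contributes the
    (i, alpha)-face with sign (-1)**(i + alpha)."""
    out = Counter()
    for face, coef in chain.items():
        free = [p for p, val in enumerate(face) if val == 'x']
        for i, pos in enumerate(free, start=1):
            for alpha in (0, 1):
                sub = list(face)
                sub[pos] = alpha
                out[tuple(sub)] += coef * (-1) ** (i + alpha)
    return Counter({f: c for f, c in out.items() if c != 0})

cube3 = Counter({('x', 'x', 'x'): 1})   # the standard 3-cube I^3
print(len(boundary(cube3)))             # → 6 signed 2-faces
print(boundary(boundary(cube3)))        # → Counter() : ∂(∂c) = 0
```

The empty second result reflects exactly the pairwise cancellation used in the proof.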

Definition 5.3. A k-chain c is called a cycle if ∂c = 0. A k-chain c is called a boundary if there is a (k + 1)-chain c1 such that c = ∂c1 . Obviously, every boundary is a cycle; however, the converse is not always true.

1.5.2 Integration and Stokes Theorem


For any ω ∈ Ωk ([0, 1]k ), there is an f ∈ C∞([0, 1]k ) such that ω = f dx1 ∧ · · · ∧ dxk . Define the integral of ω on [0, 1]k as

∫_{[0,1]^k} ω = ∫_{[0,1]^k} f,

i.e.,

∫_{[0,1]^k} ω = ∫_{[0,1]^k} f dx1 · · · dxk ,

where the right-hand side is a Riemann integral of f on [0, 1]k .
If ω ∈ Ωk (M ) and c is a k-cube in M , we define the integral of ω on c as

∫_c ω = ∫_{[0,1]^k} c∗ω.

In particular,

∫_{I^k} f dx1 ∧ · · · ∧ dxk = ∫_{[0,1]^k} (I k )∗(f dx1 ∧ · · · ∧ dxk ) = ∫_{[0,1]^k} f (x1 , · · · , xk ) dx1 · · · dxk .

If c is a 0-cube, we define

∫_c ω = ω(c(0)).

The integral of ω on a k-chain c = Σ_i αi ci is

∫_c ω = Σ_i αi ∫_{ci} ω,

where the ci are k-cubes in M .

Example 5.4. For k = 1, let c : [0, 1] → R2 be the curve defined by x = cos 2πθ, y = sin 2πθ (0 ≤ θ ≤ 1), i.e., a circle in the (x, y)-plane. Let

ω = P (x, y)dx + Q(x, y)dy.

Then the integral of ω on c is

∫_c ω = ∫_c P (x, y)dx + Q(x, y)dy
= ∫_{[0,1]} c∗( P (x, y)dx + Q(x, y)dy )
= ∫_0^1 [ −P (cos 2πθ, sin 2πθ) 2π sin 2πθ + Q(cos 2πθ, sin 2πθ) 2π cos 2πθ ] dθ,

which is the usual integral along a curve.
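The pullback computation of Example 5.4 is easy to reproduce symbolically. The helper `integrate_1form` below is our own hypothetical sketch; for the sample form ω = −y dx + x dy (our own choice) the circle integral comes out to 2π:

```python
import sympy as sp

theta, x, y = sp.symbols('theta x y')

# The 1-cube c(theta) = (cos 2*pi*theta, sin 2*pi*theta): the unit circle.
cx, cy = sp.cos(2 * sp.pi * theta), sp.sin(2 * sp.pi * theta)

def integrate_1form(P, Q):
    """Integral of omega = P dx + Q dy over c, computed as the pullback
    integral over [0, 1], exactly as in the example."""
    integrand = (P.subs({x: cx, y: cy}) * sp.diff(cx, theta)
                 + Q.subs({x: cx, y: cy}) * sp.diff(cy, theta))
    return sp.integrate(integrand, (theta, 0, 1))

# omega = -y dx + x dy; d(omega) = 2 dx ∧ dy, so the integral is twice
# the enclosed area.
print(integrate_1form(-y, x))   # → 2*pi
```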

Theorem 5.5 (Stokes theorem). If ω is a (k − 1)-form on an open set M ⊂ Rn and c is a k-chain in M , then

∫_c dω = ∫_{∂c} ω.

Proof. 1◦ First, we assume c = I k and ω is a (k − 1)-form on [0, 1]k . Then ω is a sum of (k − 1)-forms of the type

f dx1 ∧ · · · ∧ d̂xi ∧ · · · ∧ dxk ,

where the hat indicates that dxi is omitted. Thus, it suffices to prove the theorem for each of these. By the fundamental theorem of calculus in the variable xi,

∫_{I^k} dω = ∫_{I^k} d( f dx1 ∧ · · · ∧ d̂xi ∧ · · · ∧ dxk )
= ∫_{[0,1]^k} (∂f/∂xi) dxi ∧ dx1 ∧ · · · ∧ d̂xi ∧ · · · ∧ dxk
= (−1)^{i−1} ∫_{[0,1]^k} (∂f/∂xi)(x1 , · · · , xk ) dx1 · · · dxk
= (−1)^{i−1} ∫_{[0,1]^{k−1}} [ f (x1 , · · · , 1, · · · , xk ) − f (x1 , · · · , 0, · · · , xk ) ] dx1 · · · d̂xi · · · dxk .

Notice that

∫_{[0,1]^{k−1}} I^{k∗}_{(j,α)}( f dx1 ∧ · · · ∧ d̂xi ∧ · · · ∧ dxk )
= { 0, if i ≠ j;
  ∫_{[0,1]^{k−1}} f (x1 , · · · , α, · · · , xk ) dx1 · · · d̂xi · · · dxk , if i = j, }

since for j ≠ i the pullback of the factor dxj under I^k_{(j,α)} vanishes. Thus,

∫_{∂I^k} f dx1 ∧ · · · ∧ d̂xi ∧ · · · ∧ dxk
= Σ_{j=1}^{k} Σ_{α=0,1} (−1)^{j+α} ∫_{[0,1]^{k−1}} I^{k∗}_{(j,α)}( f dx1 ∧ · · · ∧ d̂xi ∧ · · · ∧ dxk )
= (−1)^{i+1} ∫_{[0,1]^{k−1}} [ f (x1 , · · · , 1, · · · , xk ) − f (x1 , · · · , 0, · · · , xk ) ] dx1 · · · d̂xi · · · dxk .

In other words,

∫_{I^k} dω = ∫_{∂I^k} ω.

2◦ For a singular k-cube c, since

∂c = Σ_{i=1}^{k} Σ_{α=0,1} (−1)^{i+α} c_{(i,α)} = Σ_{i=1}^{k} Σ_{α=0,1} (−1)^{i+α} c ◦ I^k_{(i,α)} ,

by the definition of integration,

∫_{∂c} ω = Σ_{i=1}^{k} Σ_{α=0,1} (−1)^{i+α} ∫_{c ◦ I^k_{(i,α)}} ω = Σ_{i=1}^{k} Σ_{α=0,1} (−1)^{i+α} ∫_{I^k_{(i,α)}} c∗ω = ∫_{∂I^k} c∗ω,

∫_c dω = ∫_{I^k} c∗dω = ∫_{I^k} d(c∗ω) = ∫_{∂I^k} c∗ω,

and so ∫_c dω = ∫_{∂c} ω for any singular k-cube c.
Finally, if c is a k-chain, i.e., c = Σ_{i=1}^{r} αi ci , where the ci are singular k-cubes, then

∫_c dω = Σ_{i=1}^{r} αi ∫_{ci} dω = Σ_{i=1}^{r} αi ∫_{∂ci} ω = ∫_{∂c} ω.

This completes the proof. □

Example 5.6. Consider the 1-form

ω1 = p1 dq1 + · · · + pn dqn

on R2n with the coordinates p1 , · · · , pn , q1 , · · · , qn ; dω1 = dp1 ∧ dq1 + · · · + dpn ∧ dqn = dp ∧ dq. Thus,

∫_{ci} dp ∧ dq = ∫_{∂ci} p dq.

In particular, if ci is a cycle, i.e., ∂ci = 0, then ∫_{ci} dp ∧ dq = 0.

1.5.3 Some Classical Theories on Vector Analysis


Here, we assume R3 is the oriented 3-dim Euclidean space. By Subsection 1.4.5, every vector field A on R3 corresponds to a 1-form ω_A^1 and a 2-form ω_A^2 :

A = (A1 (x), A2 (x), A3 (x)),
ω_A^1 = A1 (x)dx1 + A2 (x)dx2 + A3 (x)dx3 ,
ω_A^2 = A1 (x)dx2 ∧ dx3 + A2 (x)dx3 ∧ dx1 + A3 (x)dx1 ∧ dx2 .

Suppose a 1-chain c1 represents a curve l (with the same orientation). Then,

∫_{c1} ω_A^1 = ∫_{c1} A1 dx1 + A2 dx2 + A3 dx3 = ∫_l A · dl,

which shows that the integral of ω_A^1 on a 1-chain c1 representing a curve l is the circulation of the field A along the curve l. If A is a force field, then the integral of ω_A^1 on a 1-chain c1 is the work done by A along the curve l.
Suppose a 2-chain c2 represents an oriented surface S. Then,

∫_{c2} ω_A^2 = ∫_{c2} A1 dx2 ∧ dx3 + A2 dx3 ∧ dx1 + A3 dx1 ∧ dx2
= ∫_S A1 dx2 dx3 + A2 dx3 dx1 + A3 dx1 dx2
= ∫_S A · dn.

In other words, the integral of ω_A^2 on a 2-chain c2 representing an oriented surface S is the flux of the field A through the surface S.
Applying the Stokes theorem to different cases, we obtain three important theorems of classical calculus: the Green theorem, the Gauss theorem, and the classical Stokes theorem.
Theorem 5.7 (Green theorem). Let c2 represent a 2-dim domain D, ∂c2 the boundary l of D, and ω = P (x, y)dx + Q(x, y)dy ∈ Ω1 (R2 ). Then,

∫_{c2} dω = ∫_{c2} d(P dx + Qdy),   ∫_{∂c2} ω = ∫_l P dx + Qdy,

and

∫_{c2} dω = ∫_{c2} (∂Q/∂x − ∂P/∂y) dx ∧ dy = ∫_D (∂Q/∂x − ∂P/∂y) dxdy.

Proof. Since

d(P dx + Qdy) = (∂Q/∂x − ∂P/∂y) dx ∧ dy,

the Stokes theorem ∫_{∂c} ω = ∫_c dω implies

∫_D (∂Q/∂x − ∂P/∂y) dxdy = ∫_l P dx + Qdy,

which is the classical Green theorem. □
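For a concrete check, the two sides of the Green theorem can be compared on the unit square. The sample form ω = −y² dx + x² dy and the edge parametrization below are our own illustration:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')
P, Q = -y**2, x**2   # a sample 1-form omega = P dx + Q dy

# Area side: double integral of dQ/dx - dP/dy over D = [0,1]^2.
rhs = sp.integrate(sp.diff(Q, x) - sp.diff(P, y), (x, 0, 1), (y, 0, 1))

# Boundary side: the four edges of the positively oriented boundary,
# each parametrized over t in [0,1] with constant velocity (dx, dy).
edges = [((t, 0), (1, 0)),        # bottom, left to right
         ((1, t), (0, 1)),        # right, upwards
         ((1 - t, 1), (-1, 0)),   # top, right to left
         ((0, 1 - t), (0, -1))]   # left, downwards
lhs = sum(sp.integrate(P.subs({x: cx, y: cy}) * dx
                       + Q.subs({x: cx, y: cy}) * dy, (t, 0, 1))
          for (cx, cy), (dx, dy) in edges)

print(lhs, rhs)   # → 2 2
```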



Theorem 5.8 (Gauss theorem). If a 3-chain c3 represents a domain D in R3 , ∂c3 represents the boundary S = ∂D of D, and ω_A^2 = Ax dy ∧ dz + Ay dz ∧ dx + Az dx ∧ dy is a 2-form on R3 , then

∫_{c3} dω_A^2 = ∫_{c3} (∂Ax/∂x + ∂Ay/∂y + ∂Az/∂z) dx ∧ dy ∧ dz = ∫_D div A dxdydz,

∫_{∂c3} ω_A^2 = ∫_S Ax dydz + Ay dzdx + Az dxdy.

Thus, by the Stokes theorem, we obtain the Gauss theorem

∫_D div A dxdydz = ∫_S Ax dydz + Ay dzdx + Az dxdy.

Theorem 5.9 (Classical Stokes theorem). If a 2-chain c2 in R3 represents an oriented surface S, its boundary is l = ∂S = ∂c2 , and ω_A^1 = Ax dx + Ay dy + Az dz is a 1-form on R3 , then the Stokes theorem shows

∫_l Ax dx + Ay dy + Az dz = ∫_S curl A · ds.

Here we use the equality dω_A^1 = ω_{curl A}^2 .
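Similarly, the Gauss theorem can be checked on the unit cube D = [0,1]³. The sample field A = (x, y², z³) is our own choice; on each pair of opposite faces only one component of A contributes to the flux:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
coords = (x, y, z)
A = (x, y**2, z**3)   # a sample vector field on R^3

# Volume side: integral of div A over D = [0,1]^3.
div_A = sum(sp.diff(Ai, c) for Ai, c in zip(A, coords))
vol = sp.integrate(div_A, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# Flux side: on the faces where the i-th coordinate is 1 or 0, only the
# i-th component of A contributes, with outward normals +e_i and -e_i.
flux = 0
for i, comp in enumerate(A):
    rest = [c for j, c in enumerate(coords) if j != i]
    flux += sp.integrate(comp.subs(coords[i], 1) - comp.subs(coords[i], 0),
                         (rest[0], 0, 1), (rest[1], 0, 1))

print(vol, flux)   # → 3 3
```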

1.6 Cohomology and Homology


The set of all closed k-forms, denoted by Z^k , forms a subspace of Ωk , and the set of all exact k-forms, denoted by B^k , also forms a subspace of Ωk . The quotient space

H^k (M, R) = Z^k / B^k ≡ ker d_k / im d_{k−1}

is called the k-th cohomology space. An element of H^k is an equivalence class of closed forms, in which two closed forms ω1 and ω2 are equivalent if there is a (k − 1)-form θ such that ω1 − ω2 = dθ, where d is the exterior differential operator. It is a mapping

d : Ωk (M ) −→ Ωk+1 (M ).

The kernel of d is the subspace Z^k of closed k-forms; the image of d is the space B^k of exact forms. For every exact k-form ω = dθ, we have dω = d(dθ) = 0. Then

B^k ⊆ Z^k .

Similarly, the set of all k-cycles, denoted by Z_k , and the set of all k-boundaries, denoted by B_k , form subspaces of the vector space C_k . The corresponding quotient space

H_k (M, R) = Z_k / B_k = ker ∂_k / im ∂_{k+1}

is called the k-th homology space. An element of H_k is a class of cycles differing from one another only by a boundary.
Definition 6.1. The dimension of H^k (resp. H_k ), denoted by b^k (resp. b_k ), is called the k-th Betti number.

Theorem 6.2 (De Rham theorem). The two Betti numbers are the same, i.e.,

b^k = b_k .

Example 6.3. Let M ⊂ Rn be an open set. We consider H^0 (M, R) and H_0 (M, R). B^0 = {0}, for there is no form of degree smaller than 0. If ω ∈ Ω0 (M ), then ω = f ∈ C∞(M ), and df = 0 means that f is locally constant. If M has m path-connected components, then

Z^0 (M ) = R × · · · × R (m factors).

Since H^0 (M, R) = Z^0 (M ), b^0 = dim H^0 (M, R) = m. On the other hand, Z_0 (M ) is generated by all points in M , and two points m1 , m2 ∈ M ⊂ Z_0 (M ) are equivalent iff they lie in the same path-connected component. Thus, H_0 (M, R) = R × · · · × R (m factors) and b_0 = dim H_0 (M, R) = m. We have b^0 = b_0 : the De Rham theorem holds for k = 0 in this case.
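For k = 0 the Betti number is simply the number of path components, which can be counted combinatorially. The union-find sketch below uses a finite set of points joined by edges as a stand-in for M (our own illustration, not the book's setting):

```python
def betti0(points, edges):
    """Number of path components via union-find; for an open M in R^n
    this equals b^0 = b_0 = dim H^0 = dim H_0."""
    parent = {p: p for p in points}
    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]   # path halving
            p = parent[p]
        return p
    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(p) for p in points})

# Two disjoint 4-cycles: m = 2 components, so b^0 = b_0 = 2.
pts = list(range(8))
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6), (6, 7), (7, 4)]
print(betti0(pts, edges))   # → 2
```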

1.7 Lie Derivative


The Lie derivative may be defined in several equivalent ways. In this section, to keep things simple, we begin by defining the Lie derivative acting on scalar functions and vector fields; it can also be defined to act on general tensors, as discussed later.

1.7.1 Vector Fields as Differential Operator



Let X(x) = Σ_{i=1}^{n} Xi (x)ei = (X1 (x), · · · , Xn (x)) be a vector field on Rn , and ω1 = Σ_{i=1}^{n} ai (x) dxi be a 1-form on Rn . Then ⟨ω1 , X(x)⟩ = ⟨ω1 (x), X(x)⟩ is a function on Rn , where ⟨ , ⟩ is the dual bracket between Tx Rn and T∗x Rn . This defines a natural bilinear mapping

Λ1 (Rn ) × X (Rn ) −→ C∞(Rn ).
If ω1 = df = Σ_i (∂f/∂xi) dxi , then

⟨df, X⟩ = Σ_{i=1}^{n} Xi (∂f/∂xi).

Denote ⟨df, X⟩ = LX f , i.e., LX f = Σ_{i=1}^{n} Xi (∂f/∂xi), ∀ f ∈ C∞(Rn ). Thus, any smooth vector field may be viewed as a linear partial differential operator on Rn of order 1, without zero-order term and with smooth coefficients, i.e., there is a correspondence between X(x) ∈ X (Rn ) and LX :

X(x) = Σ_{i=1}^{n} Xi (x)ei −→ LX = Σ_{i=1}^{n} Xi (x) ∂/∂xi .

It is one to one, and so hereafter we may also write X(x) = Σ_{i=1}^{n} Xi (x)ei as

X(x) = Σ_{i=1}^{n} Xi (x) ∂/∂xi .

Definition 7.1. For any two vector fields X, Y ∈ X (Rn ), define

[X, Y ] = XY − Y X,

i.e.,

[X, Y ]f = X(Y f ) − Y (Xf ), ∀ f ∈ C∞(Rn ),

where Xf is viewed as LX f ; [X, Y ] is called the commutator or the Poisson bracket of X and Y .


Proposition 7.2. Let X = Σ_{i=1}^{n} Xi (x) ∂/∂xi , Y = Σ_{i=1}^{n} Yi (x) ∂/∂xi . Then

[X, Y ] = Σ_{i=1}^{n} Σ_{k=1}^{n} ( Xk ∂Yi/∂xk − Yk ∂Xi/∂xk ) ∂/∂xi .

Proof. ∀ f ∈ C∞(Rn ),

[X, Y ]f = X(Y f ) − Y (Xf )
= X( Σ_{i=1}^{n} Yi ∂f/∂xi ) − Y ( Σ_{i=1}^{n} Xi ∂f/∂xi )
= Σ_{k=1}^{n} Σ_{i=1}^{n} ( Xk (∂Yi/∂xk)(∂f/∂xi) + Xk Yi ∂2f/∂xi∂xk − Yk (∂Xi/∂xk)(∂f/∂xi) − Yk Xi ∂2f/∂xi∂xk )
= Σ_{i=1}^{n} Σ_{k=1}^{n} ( Xk ∂Yi/∂xk − Yk ∂Xi/∂xk ) ∂f/∂xi .

Thus,

[X, Y ] = Σ_{i=1}^{n} Σ_{k=1}^{n} ( Xk ∂Yi/∂xk − Yk ∂Xi/∂xk ) ∂/∂xi ,

and the i-th component of [X, Y ] is

[X, Y ]i = Σ_{k=1}^{n} ( Xk ∂Yi/∂xk − Yk ∂Xi/∂xk ).

In this manner, [X, Y ] may be represented by (∂Y/∂x)X − (∂X/∂x)Y , where ∂Y/∂x is the Jacobian of Y . □
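The component formula of Proposition 7.2 is easy to evaluate with sympy. The `bracket` function below is our own sketch; as a sanity check it is compared against the operator definition [X, Y]f = X(Yf) − Y(Xf):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
coords = (x1, x2)

def bracket(X, Y):
    """Components of [X, Y]: sum_k (X_k dY_i/dx^k - Y_k dX_i/dx^k)."""
    return [sum(X[k] * sp.diff(Y[i], coords[k]) - Y[k] * sp.diff(X[i], coords[k])
                for k in range(len(coords)))
            for i in range(len(coords))]

X = [x2, -x1]   # rotation field
Y = [x1, x2]    # radial (Euler) field
print(bracket(X, Y))   # → [0, 0] : the two fields commute

# Sanity check against the operator definition [X, Y]f = X(Yf) - Y(Xf).
apply_field = lambda V, g: sum(V[k] * sp.diff(g, coords[k]) for k in range(2))
f = x1**2 * x2
lhs = sp.expand(apply_field(X, apply_field(Y, f)) - apply_field(Y, apply_field(X, f)))
rhs = sp.expand(apply_field(bracket(X, Y), f))
print(lhs == rhs)   # → True
```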

Theorem 7.3. Let X1 , X2 , X3 ∈ X (Rn ) and f, g ∈ C∞(Rn ). Then,

1◦ [X1 , X2 ] = −[X2 , X1 ].
2◦ [α1 X1 + α2 X2 , X3 ] = α1 [X1 , X3 ] + α2 [X2 , X3 ].
3◦ [f X1 , gX2 ] = f (X1 g)X2 − g(X2 f )X1 + f g[X1 , X2 ].
4◦ [X1 , [X2 , X3 ]] + [X2 , [X3 , X1 ]] + [X3 , [X1 , X2 ]] = 0.

The equality 4◦ is called the Jacobi identity.

Proof. 1◦ and 2◦ are evident, and 4◦ follows by a direct expansion. For 3◦,

[f X1 , gX2 ] = (f X1 )(gX2 ) − (gX2 )(f X1 )
= f (X1 g)X2 + f gX1 X2 − g(X2 f )X1 − f gX2 X1
= f (X1 g)X2 − g(X2 f )X1 + f g[X1 , X2 ]. □

1.7.2 Flows of Vector Fields


In Subsection 1.2.2, we have already discussed vector fields and their flows on a general differentiable manifold. Now, we focus on applications to dynamical systems.

(1) Phase space: The set of all possible states of a process is called the phase space. For example, consider the motion of a system in classical mechanics, whose future and past are uniquely determined by the initial positions and initial velocities of all points of the system. The phase space of a mechanical system is the set whose elements are the sets of positions and velocities of all points of the given system. This is exactly the tangent bundle discussed earlier.
(2) Differentiable process: A process is called differentiable if its phase space has
the structure of a differentiable manifold, and the change of state with time is described
by differentiable functions. Let M be the phase space; a point of this space defines a state of the process. Assume that at the instant t = 0 the process was in state x. At another instant t, the process will be in another state; denote this new state by g t x. We have thus defined, for each t, a mapping

g t : M −→ M.

This is called the transformation in time t, which takes the state at the instant 0 to the state at the instant t:

g 0 = id, g t+s = g t · g s .

Let y = g s x be the state after time s, and z = g t y the state after a further time t. The effect is the same as advancing x through time t + s, i.e.,

z = g t+s x.

Thus, we can define a one-parameter transformation group.


Definition 7.4. The mapping family {g t } of the set M to itself is called a one-parameter transformation group on M if ∀ s, t ∈ R it satisfies

g t+s = g t · g s , g 0 = I.

Definition 7.5. The set M together with a one-parameter transformation group {g t } mapping M to itself composes the pair (M, {g t }), which is called a phase flow; the set M is called the phase space, and its elements are called phase points.

(3) Diffeomorphism: If there exists a 1–1 map f : U → V such that f and f −1 : V → U are both differentiable, then f is said to be a diffeomorphism.
(4) One-parameter diffeomorphism groups: A one-parameter diffeomorphism group {g t } on a manifold M is a collection of mappings from the direct product R × M to M :

g : R × M −→ M, g(t, x) = g t · x, t ∈ R, x ∈ M,

which satisfies:
1◦ g is a differentiable mapping.
2◦ ∀ t ∈ R, g t : M → M is a diffeomorphism.
3◦ The family {g t , t ∈ R} is a one-parameter transformation group of M .
Dynamical system: The autonomous differential equation defined by a vector field X is the equation

dx(t)/dt = X(x(t)), x|_{t=0} = x0 (initial value).

The image of the mapping x(t) is called a phase curve, and the graph of the mapping x(t) is called an integral curve. The integral curve lies in the direct product of the time axis and the phase space, M × R, which is called the extended phase space. Such an equation is called a dynamical system, whose phase flow g t : M → M , {g t , t ∈ R}, composes a group:

g t · g s = g t+s , g 0 = id, g −t = (g t )−1 .

Let X be a smooth vector field on Rn . The solution curve through x0 is a differentiable mapping t → x(t) : I → Rn , where I is an interval in R with 0 ∈ I, such that

dx(t)/dt = ẋ(t) = X(x(t)), t ∈ I, x(0) = x0 . (7.1)

It can be represented by its components as

dxi (t)/dt = Xi (x1 (t), · · · , xn (t)), t ∈ I; i = 1, · · · , n, xi (0) = xi0 ,

where x(t) = (x1 (t), · · · , xn (t)). It is well known that there is a unique differentiable solution x(t, x0 ) of the above system, which depends differentiably on the initial value x0 in some neighborhood.
The mapping φtX from (some neighborhood of) Rn to itself is defined by

φtX (x0 ) = x(t, x0 ).

For t small enough, φtX is a diffeomorphism of (some neighborhood of) Rn . It has the following properties:
1◦ φ^{t1+t2}_X = φ^{t1}_X · φ^{t2}_X .
2◦ φ0X = id.
3◦ φ−tX = (φtX )−1 .
Thus, the class {φtX } of the mappings φtX forms a group, called a local one-parameter transformation group in Rn , a local dynamical system in Rn , or a flow in Rn .
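These group laws can be verified exactly for a linear field X(x) = Ax, whose flow is the matrix exponential φᵗ = e^{tA}. In the sketch below A is chosen nilpotent (our own simplifying assumption), so the exponential series truncates and everything stays polynomial in t:

```python
import sympy as sp

t, s = sp.symbols('t s')
A = sp.Matrix([[0, 1], [0, 0]])   # linear vector field X(x) = A x

def phi(tau):
    """Flow of dx/dt = A x, i.e. phi^tau = exp(tau*A).
    Since A is nilpotent (A**2 = 0), the exponential series truncates."""
    return sp.eye(2) + tau * A

# The one-parameter group laws:
print(sp.simplify(phi(t + s) - phi(t) * phi(s)))   # → Matrix([[0, 0], [0, 0]])
print(phi(0) == sp.eye(2))                          # → True
print(sp.simplify(phi(-t) - phi(t).inv()))          # → Matrix([[0, 0], [0, 0]])
```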

1.7.3 Lie Derivative and Contraction


Let ϕ : Rn → Rn be a diffeomorphism and Y ∈ X (Rn ) be a smooth vector field on Rn . The pullback ϕ∗Y of Y is a smooth vector field on Rn , defined by the formula

(ϕ∗Y )(x) = (Dϕ−1 )(y)Y (y) = ϕ−1∗ (y)Y (y), y = ϕ(x).

Definition 7.6. Let X and Y be two vector fields. The Lie derivative LX Y of Y with respect to X is defined by

LX Y = d/dt ( φ^{t∗}_X Y ) |_{t=0} ,

where φtX is the flow of X.

Theorem 7.7. Let X and Y be two vector fields. Then we have

1◦ (φ^{t∗}_X Y )f = φ^{t∗}_X ( Y (φ^{−t∗}_X f ) ), ∀ f ∈ C∞(Rn ).
2◦ d/dt (φ^{t∗}_X f ) = φ^{t∗}_X (Xf ).
3◦ LX Y = [X, Y ].

Proof. 1◦ By definition, with y = φtX (x),

(φ^{t∗}_X Y )f (x) = ⟨df (x), (φ^{t∗}_X Y )(x)⟩ = ⟨df (x), Dφ^{−t}_X (y)Y (y)⟩
= ⟨(Dφ^{−t}_X (y))⊤ df (x), Y (y)⟩
= ⟨d(f ◦ φ^{−t}_X )(y), Y (y)⟩
= ⟨df̃ (y), Y (y)⟩  ( f̃ = f ◦ φ^{−t}_X = φ^{−t∗}_X f )
= (Y f̃ )(y) = (Y f̃ )(φtX (x))
= φ^{t∗}_X (Y f̃ )(x) = ( φ^{t∗}_X ( Y (φ^{−t∗}_X f ) ) )(x).

2◦ The proof is as follows:

d/dt (φ^{t∗}_X f ) = d/dt f (φtX (x)) = Σ_{i=1}^{n} (∂f/∂xi)(φtX (x)) Xi (φtX (x))
= (Xf )(φtX (x)) = φ^{t∗}_X (Xf ).

3◦ The proof is as follows:

(LX Y )f = d/dt ( (φ^{t∗}_X Y )f ) |_{t=0} = d/dt ( φ^{t∗}_X ( Y (φ^{−t∗}_X f ) ) ) |_{t=0}
= ( φ^{t∗}_X ( XY (φ^{−t∗}_X f ) ) − φ^{t∗}_X ( Y (φ^{−t∗}_X (Xf )) ) ) |_{t=0}
= XY f − Y Xf = [X, Y ]f, ∀ f.

Thus, we get

LX Y = [X, Y ].

Therefore, the theorem is completed. □
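For linear fields X(x) = Ax and Y(x) = Bx the pullback in Definition 7.6 has the closed form (φ^{t∗}_X Y)(x) = e^{−tA} B e^{tA} x, and 3◦ reduces to a matrix commutator. A sympy sketch (our own illustration, with a nilpotent A so that e^{tA} = I + tA exactly):

```python
import sympy as sp

t = sp.Symbol('t')
x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])
A = sp.Matrix([[0, 1], [0, 0]])   # X(x) = A x; A**2 = 0, so e^{tA} = I + tA
B = sp.Matrix([[1, 0], [0, 0]])   # Y(x) = B x

exp_tA = sp.eye(2) + t * A
exp_mtA = sp.eye(2) - t * A       # e^{-tA}

# Pullback of Y along the flow of X, and its t-derivative at t = 0.
pullback = exp_mtA * B * exp_tA * x
LXY = pullback.diff(t).subs(t, 0)

print(LXY.T)                      # → Matrix([[x2, 0]])
print(((B * A - A * B) * x).T)    # → Matrix([[x2, 0]]) : same as [X, Y]
```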
By the equality 3◦ and the Jacobi identity, the operator LX is a derivation on the algebra X (Rn ) with the binary operation [ , ], i.e.,

LX [X1 , X2 ] = [LX X1 , X2 ] + [X1 , LX X2 ].
Definition 7.8. ∀ ω ∈ Ωk (Rn ), the Lie derivative LX ω of ω with respect to a vector field X ∈ X (Rn ) is defined by

LX ω = d/dt ( φ^{t∗}_X ω ) |_{t=0} = lim_{t→0} (1/t) ( φ^{t∗}_X ω − ω ).

Theorem 7.9. The Lie derivative LX with respect to X ∈ X (Rn ) has the following properties:

1◦ LX f = Xf = Σ_{i=1}^{n} Xi ∂f/∂xi , f ∈ C∞(Rn ), i.e., the Lie derivative of a function f with respect to X is the directional derivative in the direction X(x).
2◦ LX is a Λ-derivation, i.e., LX is R-linear,

LX (αω1 + βω2 ) = αLX ω1 + βLX ω2 ,
LX (ω1 ∧ ω2 ) = LX ω1 ∧ ω2 + ω1 ∧ LX ω2 .

3◦ LX d = d LX .

Proof. 1◦ We have d/dt (φ^{t∗}_X f ) = φ^{t∗}_X (Xf ), and so

LX f = d/dt (φ^{t∗}_X f ) |_{t=0} = ( φ^{t∗}_X (Xf ) ) |_{t=0} = Xf.

2◦ It is obvious that LX is R-linear, and

LX (ω1 ∧ ω2 ) = d/dt ( φ^{t∗}_X (ω1 ∧ ω2 ) ) |_{t=0}
= d/dt ( φ^{t∗}_X ω1 ∧ φ^{t∗}_X ω2 ) |_{t=0}
= ( d/dt φ^{t∗}_X ω1 ) |_{t=0} ∧ ω2 + ω1 ∧ ( d/dt φ^{t∗}_X ω2 ) |_{t=0}
= LX ω1 ∧ ω2 + ω1 ∧ LX ω2 .

3◦ The proof is as follows:

LX dω = d/dt ( φ^{t∗}_X dω ) |_{t=0} = d/dt ( d φ^{t∗}_X ω ) |_{t=0}
= d ( d/dt φ^{t∗}_X ω ) |_{t=0} = d LX ω.

Therefore, the theorem is completed. □

Definition 7.10. Let X ∈ X (Rn ) and ω ∈ Ωk (Rn ). The contraction iX ω of X and ω is defined by

iX ω(ξ1 , · · · , ξk−1 ) = ω(X(x), ξ1 , · · · , ξk−1 ), ξi ∈ Tx Rn , i = 1, · · · , k − 1,

and iX f = 0 for f ∈ C∞(Rn ) = Ω0 (Rn ). Thus iX maps k-forms into (k − 1)-forms, i.e.,

iX : Ωk (Rn ) −→ Ωk−1 (Rn ).


Theorem 7.11. Let ω1 , ω2 ∈ Ωk (Rn ), ω3 ∈ Ωl (Rn ). Then,

1◦ iX is a Λ-antiderivation, i.e.,
R-linear: iX (α1 ω1 + α2 ω2 ) = α1 iX ω1 + α2 iX ω2 , α1 , α2 ∈ R;
antiderivation: iX (ω1 ∧ ω3 ) = iX ω1 ∧ ω3 + (−1)^k ω1 ∧ iX ω3 .
2◦ i_{fX+gY} = f iX + g iY , ∀ f, g ∈ C∞(Rn ), X, Y ∈ X (Rn ).
3◦ iX df = LX f, f ∈ C∞(Rn ).
4◦ LX = iX d + d iX (Cartan's magic formula).
5◦ L_{fX} = f LX + df ∧ iX .
Proof. 1◦ The R-linearity of iX is evident. For the antiderivation rule, denote ξ1 = X(x). Then

iX (ω1 ∧ ω3 )(ξ2 , · · · , ξk+l ) = (ω1 ∧ ω3 )(X(x), ξ2 , · · · , ξk+l ) = (ω1 ∧ ω3 )(ξ1 , ξ2 , · · · , ξk+l )
= Σ_{σ(1)<···<σ(k); σ(k+1)<···<σ(k+l)} ε(σ) ω1 (ξσ(1) , · · · , ξσ(k) ) ω3 (ξσ(k+1) , · · · , ξσ(k+l) )
= Σ_{1 ∈ {σ(1),···,σ(k)}} + Σ_{1 ∈ {σ(k+1),···,σ(k+l)}} .

In the first part it must be σ(1) = 1, since σ(1) < · · · < σ(k); similarly, in the second part σ(k + 1) = 1.
Set

σ′ = (σ(2), · · · , σ(k), σ(k + 1), · · · , σ(k + l)),
σ′′ = (σ(1), · · · , σ(k), σ(k + 2), · · · , σ(k + l)).

Then

ε(σ) = ε(σ′) if σ(1) = 1,   ε(σ) = (−1)^k ε(σ′′) if σ(k + 1) = 1.

Thus,

iX (ω1 ∧ ω3 )(ξ2 , · · · , ξk+l )
= Σ ε(σ′) ω1 (ξ1 , ξσ(2) , · · · , ξσ(k) ) ω3 (ξσ(k+1) , · · · , ξσ(k+l) )
 + (−1)^k Σ ε(σ′′) ω1 (ξσ(1) , · · · , ξσ(k) ) ω3 (ξ1 , ξσ(k+2) , · · · , ξσ(k+l) )
= (iX ω1 ∧ ω3 )(ξ2 , · · · , ξk+l ) + (−1)^k (ω1 ∧ iX ω3 )(ξ2 , · · · , ξk+l ).

Thus, we get the equality

iX (ω1 ∧ ω3 ) = iX ω1 ∧ ω3 + (−1)^k ω1 ∧ iX ω3 .

2◦ The proof is as follows:

(i_{fX+gY} ω)(ξ1 , · · · , ξk−1 ) = ω(f X + gY, ξ1 , · · · , ξk−1 )
= f (x)ω(X(x), ξ1 , · · · , ξk−1 ) + g(x)ω(Y (x), ξ1 , · · · , ξk−1 )
= f (x)iX ω(ξ1 , · · · , ξk−1 ) + g(x)iY ω(ξ1 , · · · , ξk−1 )
= ( (f iX + g iY )ω )(ξ1 , · · · , ξk−1 ).

This is the equality 2◦.
3◦ iX df = df (X) = Xf = LX f .
4◦ By induction with respect to k; the case k = 0 is just 3◦.
Suppose that 4◦ holds for k. Then for k + 1, ω can be written as a sum of forms of the type ω1 ∧ df , where ω1 ∈ Ωk (Rn ) and f ∈ C∞(Rn ). By the linearity of LX , without loss of generality we may assume ω = ω1 ∧ df . Then,

LX (ω1 ∧ df ) = LX ω1 ∧ df + ω1 ∧ LX df
= (iX dω1 + diX ω1 ) ∧ df + ω1 ∧ dLX f.

On the other hand,

(iX d + diX )(ω1 ∧ df )
= iX d(ω1 ∧ df ) + diX (ω1 ∧ df )
= iX (dω1 ∧ df ) + d( iX ω1 ∧ df + (−1)^k ω1 ∧ iX df )
= iX dω1 ∧ df + (−1)^{k+1} dω1 ∧ iX df + diX ω1 ∧ df + (−1)^k dω1 ∧ iX df + ω1 ∧ dLX f
= (iX d + diX )ω1 ∧ df + ω1 ∧ dLX f.

Thus LX (ω1 ∧ df ) = (iX d + diX )(ω1 ∧ df ), i.e.,

LX = iX d + diX .

5◦ The proof is as follows:

L_{fX} ω = (di_{fX} + i_{fX} d)ω
= d(f iX ω) + f iX dω
= df ∧ iX ω + f diX ω + f iX dω
= (f LX + df ∧ iX )ω,

so

L_{fX} = f LX + df ∧ iX .

Therefore, the theorem is completed. □
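The antiderivation rule 1◦ can be spot-checked numerically by treating forms as alternating multilinear functions of vectors. The `wedge` and `contract` helpers below are our own minimal implementation of the shuffle formula used in the proof (the test values are arbitrary):

```python
import itertools

def sign(perm):
    """Sign of a permutation given as a list of distinct integers."""
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def wedge(w1, k, w3, l):
    """Exterior product of a k-form w1 and an l-form w3, both given as
    alternating multilinear functions, via the (k, l)-shuffle sum."""
    def w(*vs):
        total = 0
        for comb in itertools.combinations(range(k + l), k):
            rest = [j for j in range(k + l) if j not in comb]
            total += (sign(list(comb) + rest)
                      * w1(*[vs[j] for j in comb])
                      * w3(*[vs[j] for j in rest]))
        return total
    return w

def contract(w, X):
    """i_X w: plug the vector X into the first slot of w."""
    return lambda *vs: w(X, *vs)

# A 1-form and a 2-form on R^3 and sample vectors (so k = 1, l = 2).
w1 = lambda a: a[0]                            # dx^1
w3 = lambda a, b: a[1] * b[2] - a[2] * b[1]    # dx^2 ∧ dx^3
X, u, v = (1, 2, 3), (1, 0, 2), (0, 1, 1)

lhs = contract(wedge(w1, 1, w3, 2), X)(u, v)
# i_X w1 is the 0-form w1(X), and wedging a 0-form is scalar multiplication.
rhs = w1(X) * w3(u, v) + (-1) ** 1 * wedge(w1, 1, contract(w3, X), 1)(u, v)
print(lhs, rhs)   # → -1 -1
```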

Every k-form ω on Rn can also be considered as a function on X^k (Rn ), namely a mapping X^k (Rn ) → C∞(Rn ):

ω(X1 , · · · , Xk )(x) = ω(x)(X1 (x), · · · , Xk (x)), Xi ∈ X (Rn ), i = 1, · · · , k.

It is multilinear and skew-symmetric.

Theorem 7.12. Let ω ∈ Ωk (Rn ) and let Xi (i = 0, 1, · · · , k) be vector fields on Rn . Then,

1◦ (LX ω)(X1 , · · · , Xk ) = LX (ω(X1 , · · · , Xk )) − Σ_{i=1}^{k} ω(X1 , · · · , LX Xi , · · · , Xk ).

2◦ dω(X0 , · · · , Xk ) = Σ_{i=0}^{k} (−1)^i L_{Xi}( ω(X0 , · · · , X̂i , · · · , Xk ) )
 + Σ_{i<j} (−1)^{i+j} ω(L_{Xi} Xj , X0 , · · · , X̂i , · · · , X̂j , · · · , Xk ),

where X̂i indicates that Xi is omitted.

Proof. 1◦ By the definition of LX and Theorem 7.7, since

(φ^{t∗}_X ω)(X1 , · · · , Xk ) = φ^{t∗}_X ( ω(φ^{−t∗}_X X1 , · · · , φ^{−t∗}_X Xk ) ),

we have

(LX ω)(X1 , · · · , Xk ) = d/dt ( (φ^{t∗}_X ω)(X1 , · · · , Xk ) ) |_{t=0}
= d/dt ( φ^{t∗}_X ( ω(φ^{−t∗}_X X1 , · · · , φ^{−t∗}_X Xk ) ) ) |_{t=0}
= LX ( ω(X1 , · · · , Xk ) ) − Σ_{i=1}^{k} ω(X1 , · · · , LX Xi , · · · , Xk ).

2◦ By induction with respect to k. The case k = 0 is evident:

df (X) = LX f, f ∈ C∞(Rn ) = Ω0 (Rn ).

Suppose 2◦ holds for k − 1. Then for k, by 1◦ and Theorem 7.11,

dω(X0 , X1 , · · · , Xk ) = (i_{X0} dω)(X1 , · · · , Xk )
= (L_{X0} ω)(X1 , · · · , Xk ) − (di_{X0} ω)(X1 , · · · , Xk )
= L_{X0}( ω(X1 , · · · , Xk ) ) − Σ_{i=1}^{k} ω(X1 , · · · , L_{X0} Xi , · · · , Xk ) − (di_{X0} ω)(X1 , · · · , Xk ),

where i_{X0} ω ∈ Ω^{k−1}(Rn ). By the inductive hypothesis,

(di_{X0} ω)(X1 , · · · , Xk )
= Σ_{i=1}^{k} (−1)^{i−1} L_{Xi}( i_{X0} ω(X1 , · · · , X̂i , · · · , Xk ) )
 + Σ_{1≤i<j≤k} (−1)^{i+j} i_{X0} ω(L_{Xi} Xj , X1 , · · · , X̂i , · · · , X̂j , · · · , Xk )
= Σ_{i=1}^{k} (−1)^{i−1} L_{Xi}( ω(X0 , X1 , · · · , X̂i , · · · , Xk ) )
 + Σ_{1≤i<j≤k} (−1)^{i+j−1} ω(L_{Xi} Xj , X0 , X1 , · · · , X̂i , · · · , X̂j , · · · , Xk ).

Thus, we get

dω(X0 , · · · , Xk )
= L_{X0}( ω(X1 , · · · , Xk ) ) + Σ_{j=1}^{k} (−1)^j ω(L_{X0} Xj , X1 , · · · , X̂j , · · · , Xk )
 + Σ_{i=1}^{k} (−1)^i L_{Xi}( ω(X0 , · · · , X̂i , · · · , Xk ) )
 + Σ_{1≤i<j≤k} (−1)^{i+j} ω(L_{Xi} Xj , X0 , · · · , X̂i , · · · , X̂j , · · · , Xk )
= Σ_{i=0}^{k} (−1)^i L_{Xi}( ω(X0 , · · · , X̂i , · · · , Xk ) )
 + Σ_{i<j} (−1)^{i+j} ω(L_{Xi} Xj , X0 , · · · , X̂i , · · · , X̂j , · · · , Xk ).

Finally, the theorem is proved. □


Chapter 2. Symplectic Algebra and Geometry Preliminaries

In order to understand Hamiltonian mechanics deeply, it is necessary to know the basic concepts of symplectic algebra and geometry.

2.1 Symplectic Algebra and Orthogonal Algebra


Symplectic algebra and orthogonal algebra share several similar concepts. We start with the bilinear form.

2.1.1 Bilinear Form


1. Bilinear form

Definition 1.1 (Bilinear Form). Let Fn be an n-dimensional linear space over a field F. A bilinear form on Fn is a mapping ϕ : Fn × Fn → F that satisfies:
1◦ ϕ(αu + βv, y) = αϕ(u, y) + βϕ(v, y).
2◦ ϕ(x, αu + βv) = αϕ(x, u) + βϕ(x, v), ∀ α, β ∈ F, u, v, x, y ∈ Fn .

It is obvious that there exists a 1–1 correspondence between the matrix space M (n, F) and the space of bilinear forms on Fn . As a matter of fact, given a matrix A ∈ M (n, F), there is a bilinear form ϕA on Fn corresponding to it:

ϕA (x, y) = x⊤Ay = Σ_{i,j=1}^{n} aij xi yj .

Conversely, given a bilinear form ϕ on Fn , there is also a matrix A ∈ M (n, F) corresponding to it:

A = Aϕ = [aij ] = [ϕ(ei , ej )] ∈ M (n, F),

such that ϕ(x, y) = x⊤Ay, where e1 , · · · , en is a basis of Fn .
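The correspondence A ↦ ϕ_A is immediate to write down numerically. The numpy sketch below (our own illustration; the sample matrices are arbitrary) also checks the symmetric/antisymmetric dichotomy on random vectors:

```python
import numpy as np

def phi(A, x, y):
    """The bilinear form phi_A(x, y) = x^T A y associated with a matrix A."""
    return x @ A @ y

rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)

S = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 1.]])     # symmetric
K = np.array([[0., 1., -2.], [-1., 0., 3.], [2., -3., 0.]])  # antisymmetric

print(np.isclose(phi(S, x, y), phi(S, y, x)))    # → True  (phi_S symmetric)
print(np.isclose(phi(K, x, y), -phi(K, y, x)))   # → True  (phi_K antisymmetric)
```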

Definition 1.2 (Symmetric or Antisymmetric). A bilinear form ϕ = ϕA is called symmetric or antisymmetric if

ϕ(x, y) = ϕ(y, x) or ϕ(x, y) = −ϕ(y, x), ∀ x, y ∈ Fn ,

i.e., A⊤ = A or A⊤ = −A, respectively.



Definition 1.3 (Conformally Symmetric). A bilinear form ϕ is called conformally symmetric if

ϕ(x, y) = 0 ⇐⇒ ϕ(y, x) = 0,

i.e.,

{(x, y) ∈ Fn × Fn | ϕ(x, y) = 0} = {(x, y) ∈ Fn × Fn | ϕ(y, x) = 0}.

A matrix A is called conformally symmetric if x⊤Ay = 0 ⇔ x⊤A⊤y = 0, which is equivalent to saying that ϕA is conformally symmetric.
ϕ is a degenerate map (or singular) if ∃ x ≠ 0 s.t. ϕ(x, y) = 0, ∀ y ∈ Fn ; otherwise ϕ is non-degenerate (non-singular).
Proposition 1.4. It is evident that the following claims are equivalent:
1◦ ϕA is non-degenerate.
2◦ If ϕA (x, y) = 0 for all y ∈ Fn , then x = 0.
3◦ A is non-degenerate.

Definition 1.5 (Conformally Identical). A bilinear form ϕA is called conformally identical to ϕB if

{(x, y) ∈ Fn × Fn | ϕA (x, y) = 0} = {(x, y) ∈ Fn × Fn | ϕB (x, y) = 0}.

Proposition 1.6. The following claims are equivalent:
1◦ ϕA is conformally identical to ϕB .
2◦ ϕA (x, y) = 0, iff ϕB (x, y) = 0.
3◦ x′Ay = 0, iff x′By = 0.
4◦ ∃ μ ∈ F, μ ≠ 0, such that A = μB.

The equivalence of 1◦ , 2◦ and 3◦ is trivial. Next, we prove the equivalence
between 1◦ and 4◦ .
Theorem 1.7. {(x, y) ∈ Fn × Fn | ϕA (x, y) = 0} = {(x, y) ∈ Fn × Fn | ϕB (x, y) = 0},
iff ∃ μ ∈ F, μ ≠ 0, such that A = μB.

Proof. The sufficiency is trivial. We only need to prove the necessity. Without loss of
generality, we can assume F = R. Then, we have

    ker(A) = {y ∈ Rn | Ay = 0} = {y ∈ Rn | x′Ay = 0, ∀ x ∈ Rn },
    ker(B) = {y ∈ Rn | By = 0} = {y ∈ Rn | x′By = 0, ∀ x ∈ Rn }.

By our assumption, x′Ay = 0 ⇔ x′By = 0. Hence, ker(A) = ker(B), denoted as V ,
i.e., V = ker(A) = ker(B). Then ∀ v ∈ Rn ,

    Av = 0 ⇐⇒ Bv = 0.

Since x′Ay = 0 ⇔ x′By = 0, {Av}⊥ = {Bv}⊥ , and so


2.1 Symplectic Algebra and Orthogonal Algebra 115

    {Av} = {Bv}.

This shows that there exists μ(v) ∈ R, μ(v) ≠ 0, such that Av = μ(v)Bv.
Next, we show μ(v) is a non-zero constant.
Take a basis {v1 , · · · , vr , vr+1 , · · · , vn } of Rn , such that

    {vr+1 , · · · , vn } = V = ker(A) = ker(B).

Thus, Avi ≠ 0 (i = 1, · · · , r), A(v1 + v2 + · · · + vr ) ≠ 0.
The above shows that there exist μ1 , · · · , μr , μ (all of which are non-zero), such
that
    Avi = μi Bvi , i = 1, · · · , r,
    A(v1 + v2 + · · · + vr ) = μB(v1 + v2 + · · · + vr ).

Then,
μ1 Bv1 + · · · + μr Bvr = μB(v1 + v2 + · · · + vr )
= μBv1 + · · · + μBvr .
After manipulation, we get (μ1 − μ)Bv1 + · · · + (μr − μ)Bvr = 0, i.e.,

B ((μ1 − μ)v1 + · · · + (μr − μ)vr ) = 0.

Since (μ1 − μ)v1 + · · · + (μr − μ)vr lies in both {v1 , · · · , vr } and ker(B) =
{vr+1 , · · · , vn }, it must be

    (μ1 − μ)v1 + · · · + (μr − μ)vr = 0.

Then by the linear independence of v1 , · · · , vr , we have

μ = μ1 = · · · = μr .

Therefore,
Avi = μBvi , i = 1, · · · , r.
Similarly for i = r + 1, · · · , n, Avi = 0 = μBvi . Thus, we have obtained A = μB. 

From Theorem 1.7, we can easily derive the following theorems.


Theorem 1.8. ϕA is conformally symmetric, iff ∃ μ ∈ F, μ ≠ 0, such that A′ = μA.

Theorem 1.9. A ∈ M (n, R) is conformally symmetric, iff A′ = ±A, i.e., A is
symmetric or antisymmetric.

2. Quadratic forms induced by bilinear forms

Given a bilinear form ϕA (x, y) = ∑_{i,j=1}^n aij xi yj , we can get a quadratic form

    ϕA (x, x) = ∑_{i,j=1}^n aij xi xj .

Obviously, we have the following propositions:

Proposition 1.10. ϕA (x + y, x + y) − ϕA (x, x) − ϕA (y, y) = ϕA (x, y) + ϕA (y, x).

Proposition 1.11. ϕA is antisymmetric, iff ϕA (x, x) = 0, ∀x ∈ Fn .

Proposition 1.12. ∀ x ∈ Fn , ϕA (x, x) = ϕB (x, x), iff A + A′ = B + B′ .

Proposition 1.13. If A = A′ , B = B′ , then ∀ x ∈ Fn , ϕA (x, x) = ϕB (x, x) ⇔
A = B.

Proposition 1.14. The following assertions are equivalent:
1◦ ϕA = ϕB .
2◦ {(x, y) ∈ Fn × Fn | ϕA (x, y) = 1} = {(x, y) ∈ Fn × Fn | ϕB (x, y) = 1}.
3◦ ϕA (x, y) = 1 ⇔ ϕB (x, y) = 1.
4◦ x′Ay = 1 ⇔ x′By = 1.
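Propositions 1.10 and 1.12 can be spot-checked numerically; the following is a small sketch with random data (NumPy assumed, names illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
x, y = rng.standard_normal(n), rng.standard_normal(n)

phi = lambda u, v: u @ A @ v
# Proposition 1.10: the polarization-type identity.
lhs = phi(x + y, x + y) - phi(x, x) - phi(y, y)
rhs = phi(x, y) + phi(y, x)
assert np.isclose(lhs, rhs)

# Proposition 1.12: adding an antisymmetric part leaves the quadratic form
# phi_A(x, x) unchanged, since A + A' = B + B'.
K = rng.standard_normal((n, n)); K = K - K.T   # antisymmetric
B = A + K
assert np.allclose(A + A.T, B + B.T)
assert np.isclose(phi(x, x), x @ B @ x)
```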

2.1.2 Sesquilinear Form

1. Sesquilinear form
In the complex field C, there is an automorphism

    C ∋ z = x + i y −→ z̄ = x − i y ∈ C,

which maps z1 · z2 to z̄1 · z̄2 and z1 + z2 to z̄1 + z̄2 .
This leads to a new kind of binary form.
Definition 1.15 (Sesquilinear). A sesquilinear form on Cn is a mapping φ : Cn ×
Cn → C, such that for all u, v, x ∈ Cn , α, β ∈ C, we have
1◦ φ(αu + βv, x) = ᾱ φ(u, x) + β̄ φ(v, x).
2◦ φ(x, αu + βv) = α φ(x, u) + β φ(x, v).
Similarly, there exists a 1-1 correspondence between the complex matrix space M (n, C)
and the space of sesquilinear forms on Cn .
In fact, a complex matrix A ∈ M (n, C) has a natural correspondence to a
sesquilinear form φA , which satisfies

    φA (x, y) = x∗ Ay = ∑_{i,j=1}^n aij x̄i yj .

Conversely, a sesquilinear form φ has a natural correspondence to a complex matrix

    A = Aφ = [aij ] = [φ(ei , ej )] ∈ M (n, C),

which satisfies φ(x, y) = x∗ Aφ y.


Definition 1.16 (Hermitian). φ = φA is Hermitian or anti-Hermitian, if ∀x, y ∈
Cn , φ(x, y) = φ(y, x)‾ or φ(x, y) = −φ(y, x)‾ (the bar denoting complex conjugation),
i.e., A∗ = A or A∗ = −A. Such a complex matrix A is called Hermitian or anti-Hermitian.

Definition 1.17 (Conformally Hermitian). φ = φA is called conformally Hermitian,
if φ(x, y) = 0 ⇔ φ(y, x) = 0, i.e.,

    {(x, y) ∈ Cn × Cn | φ(x, y) = 0} = {(x, y) ∈ Cn × Cn | φ(y, x) = 0},

or
    x∗ Ay = 0 ⇐⇒ x∗ A∗ y = 0.

The matrix A satisfying the above condition is called conformally Hermitian.
φ = φA is non-degenerate iff there is no x ∈ Cn , x ≠ 0, such that φA (x, y) = 0,
∀y ∈ Cn ; or iff φ(x, y) = 0, ∀y ∈ Cn , implies x = 0; or iff A is non-degenerate.
2. Hermitian forms induced by sesquilinear forms
From a sesquilinear form on Cn , φA (x, y) = x∗ Ay = ∑_{i,j=1}^n aij x̄i yj , we can
induce a corresponding Hermitian form on Cn ,

    φA (x, x) = x∗ Ax = ∑_{i,j=1}^n aij x̄i xj .

If A is Hermitian, then ∀x ∈ Cn , φA (x, x) ∈ R.


Remark 1.18. Hermitian forms have properties similar to Propositions 1.10 – 1.14,
and Theorems 1.7 – 1.8.
Remark 1.19. The Hermitian analogue of Theorem 1.9 is as follows.
Theorem 1.20. A ∈ M (n, C) is conformally Hermitian, iff ∃μ ∈ C, |μ| = 1, such
that A∗ = μA; or iff ∃ θ ∈ R, such that A∗ = eiθ A.
Proof. If A∗ = μA, then A = (A∗ )∗ = (μA)∗ = μ̄A∗ = μ̄μA = |μ|2 A. Thus, |μ|2 = 1. 

2.1.3 Scalar Product, Hermitian Product


A scalar product on Fn is a non-degenerate conformally symmetric bilinear form
ϕG (x, y), where G′ = ±G.
Symmetric product in Rn : (x, y)S = ϕS (x, y) = x′Sy, S′ = S, |S| ≠ 0.
Anti-symmetric product in Rn : [x, y]K = ϕK (x, y) = x′Ky, K′ = −K, |K| ≠ 0.
Remark 1.21. There does not exist any anti-symmetric scalar product in F2n+1 , since
|K| = 0 whenever K′ = −K has odd order.
Hermitian products in Cn are non-degenerate Hermitian forms in Cn , i.e.,

    ⟨x, y⟩ = x∗ Gy = φG (x, y), G∗ = G, |G| ≠ 0.

Typical examples are given below:



Example 1.22 (Symmetric case). Euclidean scalar product (Euclidean form) in Rn :

    (x, y) = (x, y)I = x′y = ∑_{i=1}^n xi yi , I′ = I.

This induces the Euclidean length measure |x|2 = (x, x) = ∑_{i=1}^n xi2 .

Example 1.23 (Anti-symmetric case). Standard symplectic scalar product (symplectic
form) in R2n :

    [x, y] = [x, y]J = x′Jy = ∑_{i=1}^n (xi yn+i − xn+i yi ),

where
    J = ⎡ 0   In ⎤ = J2n ,  J′ = J −1 = −J.
        ⎣ −In 0  ⎦

As n = 1, we have

    [x, y] = x1 y2 − x2 y1 = | x1  y1 |
                             | x2  y2 | ,

which represents the oriented area of the parallelogram formed by the vectors x, y in
R2 . [Figure: the parallelogram spanned by x and y in the plane.]
For general n, we get

    [x, y] = x′Jy = ∑_{i=1}^n | xi    yi   |
                              | xn+i  yn+i | ,

which represents a sum of oriented areas of the parallelograms formed by projecting
the vectors x, y ∈ R2n to the (xi , xn+i ) coordinate planes.
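The area interpretation can be verified directly: x′Jy coincides with the sum of the 2 × 2 projection determinants. A minimal numerical sketch (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
# standard symplectic matrix J = [[0, I], [-I, 0]]
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
x, y = rng.standard_normal(2 * n), rng.standard_normal(2 * n)

sympl = x @ J @ y
# sum of oriented areas of the projections onto the (x_i, x_{n+i}) planes
areas = sum(x[i] * y[n + i] - x[n + i] * y[i] for i in range(n))
assert np.isclose(sympl, areas)

# J' = J^{-1} = -J
assert np.allclose(J.T, -J) and np.allclose(J.T, np.linalg.inv(J))
```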

Remark 1.24 (Pfaffian theorem). For any n, there exists a polynomial Pn (xij ) with
integer coefficients in the variables xij (i < j), such that

    det K = det [kij ] = [Pn (kij )]2 , for every anti-symmetric matrix K of order 2n.

Example 1.25 (Hermitian case). Unitary product in Cn :

    ⟨w, z⟩ = ⟨w, z⟩I = w∗ Iz = w∗ z,  w, z ∈ Cn , I ∗ = I.

⟨z, z⟩ induces ‖z‖2 = z ∗ z, which is the Euclidean length measure in Cn .



2.1.4 Invariant Groups for Scalar Products


This topic is discussed in detail in books[Wey39,Wey40,Art57] .
General linear group: GL(n, F) = {A ∈ M (n, F) | det A ≠ 0}.
Special linear group: SL(n, F) = {A ∈ GL(n, F) | det A = 1}.
Orthogonal group: invariant group for the Euclidean scalar product (x, y) = ϕI (x, y) = x′y,

    O(n, F) = {A ∈ GL(n, F) | (Ax, Ay) = (x, y), ∀x, y ∈ Fn }.

From the definition, we have

    A ∈ O(n, F) ⇐⇒ A′IA = A′A = I.

In particular, we denote O(n, R) as O(n), i.e., O(n, R) ≡ O(n).
Symplectic group: invariant group for the anti-symmetric scalar product [x, y] =
ϕJ (x, y) = x′Jy,

    Sp(2n, F) = {A ∈ GL(2n, F) | [Ax, Ay] = [x, y], ∀x, y ∈ F2n }.

From the definition, we have

    A ∈ Sp(2n, F) ⇐⇒ A′JA = J.

We denote Sp(2n, R) ≡ Sp(2n).
Unitary group: invariant group for the Hermitian scalar product ⟨x, y⟩ = x∗ y,

    U (n, C) = {A ∈ GL(n, C) | ⟨Ax, Ay⟩ = ⟨x, y⟩, ∀x, y ∈ Cn }.

From the definition, we have

    A ∈ U (n, C) ⇐⇒ A∗ IA = I.

Similarly, we denote U (n, C) ≡ U (n).
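The defining identities A′A = I, A′JA = J and A∗A = I are easy to exercise numerically. A small sketch (NumPy assumed; the constructions below are standard examples, not the book's):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

# Orthogonal: QR of a random real matrix gives Q with Q'Q = I.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
assert np.allclose(Q.T @ Q, np.eye(n))

# Symplectic: the shear [[I, S], [0, I]] with S symmetric satisfies A'JA = J.
S = rng.standard_normal((n, n)); S = S + S.T
A = np.block([[np.eye(n), S], [np.zeros((n, n)), np.eye(n)]])
assert np.allclose(A.T @ J @ A, J)

# Unitary: QR of a random complex matrix gives U with U*U = I.
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
assert np.allclose(U.conj().T @ U, np.eye(n))
```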


Invariant Group for the Scalar Products ϕG , φG .
Here:
    ϕG (x, y) = x′Gy, G′ = ±G, |G| ≠ 0,
    φG (x, y) = x∗ Gy, G∗ = G, |G| ≠ 0.
The invariant group for ϕG is

    G(G, n, F) = {A ∈ GL(n, F) | ϕG (Ax, Ay) = ϕG (x, y), ∀ x, y ∈ Fn }
               = {A ∈ GL(n, F) | A′GA = G}
               = {A ∈ GL(n, F) | A−1 = G−1 A′G}.

In the symmetric case G = S, S′ = S, ϕS (x, y) = (x, y)S = x′Sy, and G(S, n, F) is
called an S-orthogonal group.

    O(S, n, F) = G(S, n, F)
               = {A ∈ GL(n, F) | (Ax, Ay)S = (x, y)S , ∀ x, y ∈ Fn }
               = {A ∈ GL(n, F) | A′SA = S}
               = {A ∈ GL(n, F) | A−1 = S −1 A′S}.

Special case: O(I, n, F) ≡ O(n, F).
Anti-symmetric case: G = K, K′ = −K, ϕK (x, y) = [x, y]K = x′Ky, where
G(K, 2n, F) is called a K-symplectic group.

    Sp(K, 2n, F) = G(K, 2n, F)
                 = {A ∈ GL(2n, F) | [Ax, Ay]K = [x, y]K , ∀ x, y ∈ F2n }
                 = {A ∈ GL(2n, F) | A′KA = K}
                 = {A ∈ GL(2n, F) | A−1 = K −1 A′K}.

Special case: Sp(J, 2n, F) ≡ Sp(2n, F).
Hermitian case:

    G = H ∈ M (n, C), H ∗ = H, |H| ≠ 0, φH (x, y) = ⟨x, y⟩H = x∗ Hy,

    U (H, n, C) = {A ∈ GL(n, C) | ⟨Ax, Ay⟩H = ⟨x, y⟩H , ∀ x, y ∈ Cn }
                = {A ∈ GL(n, C) | A∗ HA = H}
                = {A ∈ GL(n, C) | A−1 = H −1 A∗ H}.

Special case: U (I, n, C) ≡ U (n, C) ≡ U (n).


Conformally Invariant Group for the Scalar Product ϕG .

    CG(G, n, F) = {A ∈ GL(n, F) | ϕG (Ax, Ay) = 0 ⇐⇒ ϕG (x, y) = 0}
                = {A ∈ GL(n, F) | ∃ μ ∈ F, μ ≠ 0, such that A′GA = μG}
                = {A ∈ GL(n, F) | ∃ μ ∈ F, μ ≠ 0, A−1 = μ−1 G−1 A′G}.

When G = S, S′ = S, we denote CG(S, n, F) as CO(S, n, F).
When G = K, K′ = −K, we denote CG(K, n, F) as CSp(K, n, F).
When G = H, H ∗ = H, we have

    CU (H, n, C) = {A ∈ GL(n, C) | ⟨Ax, Ay⟩H = 0 ⇐⇒ ⟨x, y⟩H = 0}
                 = {A ∈ GL(n, C) | ∃ μ ∈ C, μ ≠ 0, such that A∗ HA = μH}
                 = {A ∈ GL(n, C) | ∃ μ ∈ C, μ ≠ 0, A−1 = μ−1 H −1 A∗ H}.

2.1.5 Real Representation of Complex Vector Space

Consider the mapping from Cn to R2n , ρ : z = x + i y → ρ(z) = [x′, y′]′ , z ∈
Cn , x, y ∈ Rn . Evidently ρ : Cn → R2n is injective, and it satisfies the following
properties:

Property 1.26.
1◦ ρ(z + w) = ρ(z) + ρ(w), ∀z, w ∈ Cn .
2◦ ρ(αz) = αρ(z), ∀α ∈ R.
3◦ ρ(iz) = ρ(−y + ix) = ⎡ −y ⎤ = ⎡ 0 −I ⎤ ⎡ x ⎤ = −Jρ(z).
                         ⎣  x ⎦   ⎣ I  0 ⎦ ⎣ y ⎦
4◦ ρ((α + iβ)z) = (αI − βJ)ρ(z), α + iβ ∈ C.
5◦ ρ(0) = 0 ∈ R2n , for 0 ∈ Cn .

For C = A + iB ∈ M (n, C), set R(C) = ⎡ A −B ⎤ ∈ M (2n, R). Similarly,
                                       ⎣ B  A ⎦
R : C → R(C), M (n, C) → M (2n, R) is injective.
Assume C = A + iB ∈ M (n, C), w = Cz. Then,

    w = u + iv = (A + iB)(x + iy) = (Ax − By) + i(Bx + Ay),

i.e.,
    ⎡ u ⎤ = ⎡ A −B ⎤ ⎡ x ⎤ ,
    ⎣ v ⎦   ⎣ B  A ⎦ ⎣ y ⎦
or
    ρ(w) = R(C)ρ(z) = ρ(Cz).
Analogously, R satisfies the following properties:

Property 1.27.
1◦ R(On ) = O2n , On ∈ M (n, C).
2◦ R(In ) = I2n , In ∈ M (n, C).
3◦ R(αC) = αR(C), ∀ α ∈ R.
4◦ R( i C) = R( iA − B) = ⎡ −B −A ⎤ = −JR(C).
                           ⎣  A −B ⎦
5◦ R(C1 + C2 ) = R(C1 ) + R(C2 ), ∀ C1 , C2 ∈ M (n, C).
6◦ R(C1 · C2 ) = R(C1 )R(C2 ).

C invertible ⇐⇒ R(C) invertible.

The last assertion follows from the theorem below.

Theorem 1.28.
    det(A + iB) ≠ 0 ⇐⇒ det ⎡ A −B ⎤ ≠ 0.
                            ⎣ B  A ⎦
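These identities are easy to spot-check. Below is a minimal numerical sketch (NumPy assumed; `rho` and `R` are our names for the maps ρ and R):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 3
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)
C = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

rho = lambda w: np.concatenate([w.real, w.imag])              # Cn -> R2n
R = lambda M: np.block([[M.real, -M.imag], [M.imag, M.real]]) # M(n,C) -> M(2n,R)
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

assert np.allclose(rho(C @ z), R(C) @ rho(z))   # rho(Cz) = R(C) rho(z)
assert np.allclose(rho(1j * z), -J @ rho(z))    # rho(iz) = -J rho(z)
assert np.allclose(R(C @ C), R(C) @ R(C))       # R(C1 C2) = R(C1) R(C2)
# Theorem 1.28: here both determinants are nonzero for a generic C.
assert abs(np.linalg.det(C)) > 1e-12 and abs(np.linalg.det(R(C))) > 1e-12
```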

Real Representation of the Unitary Group:

H = P + iQ ∈ M (n, C) with det(H) ≠ 0 is Hermitian, iff P′ = P, Q′ = −Q.
For w = u + iv, z = x + iy ∈ Cn , define ⟨w, z⟩H = w∗ Hz. Then

    w∗ Hz = (u + iv)∗ H(x + iy)
          = (u + iv)∗ (P + iQ)(x + iy)
          = (u′, v′) ⎡ P −Q ⎤ ⎡ x ⎤ + i (u′, v′) ⎡  Q P ⎤ ⎡ x ⎤ .
                     ⎣ Q  P ⎦ ⎣ y ⎦             ⎣ −P Q ⎦ ⎣ y ⎦

The above equation shows that the Hermitian scalar product ⟨w, z⟩H of w and z
consists of two parts: its real part is a Euclidean scalar product in R2n (whose measure
is a symmetric matrix S), denoted by the round bracket, and its imaginary part can be
taken as a new scalar product in R2n (whose measure is an anti-symmetric matrix K),
denoted by the square bracket. Therefore, we have ⟨w, z⟩H = (W, Z)S + i [W, Z]K ,
where

    W = ρ(w) = ⎡ u ⎤ ,  Z = ρ(z) = ⎡ x ⎤ ∈ R2n ,
               ⎣ v ⎦               ⎣ y ⎦

    H = P + i Q,  P′ = P,  Q′ = −Q,

    S = S′ = ⎡ P −Q ⎤ ,  K = −K′ = ⎡  Q P ⎤ .
             ⎣ Q  P ⎦              ⎣ −P Q ⎦

Let T = R(C) = R(A + iB). Then,

    ⟨w, z⟩H = ⟨Cw, Cz⟩H ⇐⇒ (W, Z)S + i [W, Z]K = (T W, T Z)S + i [T W, T Z]K .

From this, we can derive the following equivalent conditions:
Proposition 1.29.
1◦ C = A + i B ∈ U (H, n, C)
   = {C ∈ GL(n, C) | ⟨Cw, Cz⟩H = ⟨w, z⟩H , ∀ w, z ∈ Cn }.
2◦ T = ⎡ A −B ⎤ , det T ≠ 0,
       ⎣ B  A ⎦
   (W, Z)S = (T W, T Z)S , [W, Z]K = [T W, T Z]K .
3◦ T ∈ GL(n, C), T ∈ O(S, 2n, R) ∩ Sp(K, 2n, R), where K = SJ.
4◦ T J = JT, det T ≠ 0, and T′ST = S, T′SJT = SJ.

Hence, GL(n, C) is identified with its real image {T ∈ GL(2n, R) | T J = JT },
and U (H, n, C) is identified with its real image in GL(2n, R).
Since H = P + i Q is non-degenerate, S = R(H) and K = SJ are also non-
degenerate, and we have: if T J = JT and T′ST = S, then T′SJT = T′ST J = SJ;
if T J = JT and T′SJT = SJ, then T′ST = T′SJ(−J)T = −T′SJT J = −SJ 2 = S;
if T′ST = S and T′SJT = SJ, then (det T )2 = 1, T is invertible, and ST = T′−1 S;
hence SJT = T′−1 SJ = ST J, and therefore JT = T J.
Therefore, we have

    U (H, n, C) = GL(n, C) ∩ O(S, 2n, R) ∩ Sp(K, 2n, R)
                = GL(n, C) ∩ O(S, 2n, R)
                = GL(n, C) ∩ Sp(K, 2n, R)
                = O(S, 2n, R) ∩ Sp(K, 2n, R),

where H = P + iQ, P′ = P, Q′ = −Q, S = R(H), K = SJ, |H| ≠ 0.
In particular, if H = I, S = I2n , K = J2n , then

    U (n, C) = GL(n, C) ∩ O(2n, R) ∩ Sp(2n, R)
             = GL(n, C) ∩ O(2n, R)
             = GL(n, C) ∩ Sp(2n, R)
             = O(2n, R) ∩ Sp(2n, R).
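The "two out of three" identity U(n) = O(2n, R) ∩ Sp(2n, R) can be observed numerically: the real image T of a unitary matrix is orthogonal, symplectic, and commutes with J. A sketch (NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3
# a unitary matrix from the QR factorization of a random complex matrix
U, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))

T = np.block([[U.real, -U.imag], [U.imag, U.real]])   # real image R(U)
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

assert np.allclose(T.T @ T, np.eye(2 * n))   # T in O(2n, R)
assert np.allclose(T.T @ J @ T, J)           # T in Sp(2n, R)
assert np.allclose(T @ J, J @ T)             # T in the real image of GL(n, C)
```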

2.1.6 Complexification of Real Vector Space and Real Linear


Transformation
In a complex vector space V , we not only have the additive operation, but also the scalar
multiplication by complex numbers: for u ∈ V and α + iβ ∈ C, (α + iβ)u ∈ V . This
can be realized by the scalar multiplication by real numbers (u ∈ V, α ∈ R ⇒ αu ∈ V )
plus a single scalar multiplication by the imaginary unit i (u ∈ V ⇒ iu ∈ V ), which
can be seen as an operator

    l : V −→ V, u −→ i u = l(u).

l is a real linear transformation of V , i.e.,

l(αu + βv) = αl(u) + βl(v), u, v ∈ V, α, β ∈ R.

Moreover, a complex linear transformation T of V satisfies:


the additive property: T (u + v) = T (u) + T (v),
the multiplication property: T ((α + i β)u) = (α + i β)T (u).
The latter is simply the commutativity of complex linear transformation and complex
scalar multiplication, which can be realized by

T (i u) = i T (u), i.e., T l = lT.

Let

L(V, R) = {T | T (u + v) = T (u) + T (v), T (αu) = αT (u), ∀ u, v ∈ V, α ∈ R},


L(V, C) = {T | T (u + v) = T (u) + T (v), T ((α + i β)u)
= (α + iβ)T (u), ∀ u, v ∈ V, α, β ∈ R}.

Evidently, T ∈ L(V, C) iff T ∈ L(V, R) and T (i u) = i T (u). Thus, the
operator l satisfies
    l ∈ L(V, R), l2 = −I,
and
    T ∈ L(V, C) iff T ∈ L(V, R) and T l = lT.
These observations form the basis for the following definitions and the method of
complexification of real vector space.

Definition 1.30. A real vector space V = V (R) is complexifiable if there exists an


operator l in L(V, R) such that l2 = −I.

We can easily see that V is complexifiable iff dimV (R) = 2n. This is because the
operator equation l2 = −I in L(V, R) corresponds to the matrix equation X 2 = −I
in M (m, R), m =dimR (V ), which has no real solution for m = 2n + 1. (Since
from X 2 = −I it follows that all eigenvalues of X are ±i, while for m = 2n + 1,
X ∈ M (m, R) has at least one real eigenvalue.) When m = 2n, there is a special
solution
X = ±J, (±J)2 = −I.
If we introduce an isomorphism J on R2n which satisfies J 2 = −I, then we
say that R2n is equipped with a complex structure. Hence, we can define the operation
(a + ib)u = au + bJu, and R2n becomes a complex n-dimensional space, identified
with Cn ; this is called the complexification of R2n .

2.1.7 Lie Algebra for GL(n, F)


1. Lie algebra
Definition 1.31. For B1 , B2 ∈ M (n, F), we define a commutator of B1 , B2 as fol-
lows:
{B1 , B2 } = B1 B2 − B2 B1 ,
which satisfies the following properties:
1◦ {B1 , B2 } = −{B2 , B1 }.
2◦ {B1 , {B2 , B3 }} + {B2 , {B3 , B1 }} + {B3 , {B1 , B2 }} = 0.
The identity 2◦ is called the Jacobi identity.

Definition 1.32. A Lie algebra is a vector space L equipped with a bilinear, antisymmetric
binary operation { , } : L × L → L that satisfies the Jacobi identity.
Hence, M (n, F), equipped with the above commutator, becomes a Lie algebra,
denoted as gl(n, F). Since gl(n, F) is the tangent vector space to GL(n, F) at I,
gl(n, F) is called the Lie algebra of the Lie group GL(n, F).
Definition 1.33. The Lie algebra of the Lie group SL(n, F) is defined as follows:
    sl(n, F) = {B ∈ gl(n, F) | trB = 0}.
Remark 1.34. If trB1 = trB2 = 0, then tr{B1 , B2 } = tr(B1 B2 − B2 B1 ) = 0.
Therefore, sl(n, F) is closed under { , }. As a matter of fact, for any A, B, tr{A, B}
is always equal to 0.
Definition 1.35. The Lie algebra of the Lie group G(G, n, F) is defined as follows:
    g(G, n, F) = {B ∈ gl(n, F) | B′G + GB = 0}.
Remark 1.36. g(G, n, F) is closed under { , }, i.e.,
    {B1 , B2 } ∈ g(G, n, F), ∀ B1 , B2 ∈ g(G, n, F).
Indeed, if Bi′G = −GBi (i = 1, 2), then
    {B1 , B2 }′G = (B1 B2 − B2 B1 )′G
                 = (B2′ B1′ − B1′ B2′ )G
                 = B2′ B1′ G − B1′ B2′ G
                 = B2′ (−GB1 ) − B1′ (−GB2 )
                 = G(B2 B1 − B1 B2 )
                 = −G{B1 , B2 }.
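The closure of g(G, n, F) under the commutator can be illustrated with G = J. A small numerical sketch (NumPy assumed; `random_element` is an illustrative constructor, using that B = JS with S symmetric satisfies B′J + JB = 0):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

def random_element():
    # B = J S with S symmetric lies in g(J, 2n, R): B'J + JB = 0
    S = rng.standard_normal((2 * n, 2 * n))
    return J @ (S + S.T)

B1, B2 = random_element(), random_element()
assert np.allclose(B1.T @ J + J @ B1, 0)
C = B1 @ B2 - B2 @ B1          # the commutator {B1, B2}
assert np.allclose(C.T @ J + J @ C, 0)
```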

2. Exponential matrix transform


Definition 1.37. For B ∈ M (n, C), define

    exp(B) = ∑_{k=0}^∞ (1/k!) B^k = lim_{m→∞} ∑_{k=0}^m (1/k!) B^k .

Take the Chebyshev matrix norm

    ||B|| = max_i ∑_{j=1}^n |bij |.

Then, |bij | ≤ ||B|| and ||B^k || ≤ ||B||^k . Hence, the n2 scalar series
( ∑_{k=0}^∞ (1/k!) B^k )ij are always convergent, and uniformly convergent for
||B|| ≤ β, ∀ β > 0.

Proposition 1.38. We have the following results:
1◦ exp (T −1 BT ) = T −1 exp (B)T, ∀ T ∈ GL(n, C).
2◦ If B has eigenvalues λ1 , · · · , λn (with multiplicities), then exp B has the
eigenvalues exp λ1 , · · · , exp λn .
3◦ det (exp B) = exp (tr B), tr B = ∑_{i=1}^n bii .
4◦ exp B ∈ GL(n, C), ∀ B ∈ M (n, C).
5◦ If B1 , B2 ∈ M (n, C), B1 B2 = B2 B1 , then

    exp (B1 + B2 ) = exp B1 · exp B2 = exp B2 · exp B1 .

6◦ exp (B′) = (exp B)′ , exp B̄ = (exp B)‾ .
7◦ ∀ t, t1 , t2 ∈ R, exp (tB) ∈ GL(n, C), and

    exp ((t1 + t2 )B) = exp (t1 B) · exp (t2 B),
    exp (0 · B) = I,
    exp (−tB) = (exp (tB))−1 .

Therefore, the mapping t ∈ R → exp (tB) ∈ GL(n, C) for a given B is a group


homomorphism of the additive group R into the multiplicative group GL(n, C).
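Properties 1◦–3◦ and 7◦ can be checked numerically by forming exp B through the eigendecomposition B = V diag(λ) V −1. A sketch (NumPy assumed; a generic random matrix is diagonalizable):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
B = rng.standard_normal((n, n))
lam, V = np.linalg.eig(B)
# exp(V L V^{-1}) = V exp(L) V^{-1}  (properties 1 and 2)
expB = (V * np.exp(lam)) @ np.linalg.inv(V)

# property 3: det(exp B) = exp(tr B)
assert np.allclose(np.linalg.det(expB), np.exp(np.trace(B)))
# property 7: exp(-B) = (exp B)^{-1}
expBinv = (V * np.exp(-lam)) @ np.linalg.inv(V)
assert np.allclose(np.linalg.inv(expB), expBinv)
```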
Theorem 1.39. There exists a neighborhood W of On in M (n, C) such that the ex-
ponential mapping exp : M (n, C) → GL(n, C) is a diffeomorphism on W .


Proof. The series ∑_{k=0}^∞ B^k/k! is uniformly convergent in ||B|| ≤ ρ (ρ > 0).
Therefore, Fij = (exp B)ij are entire analytic functions of the complex variables
bkl (k, l = 1, · · · , n),

    Fij (bkl ) = δij + bij + (terms of degree ≥ 2).

Therefore, the Jacobian matrix at B = On is

    ∂(exp B)ij /∂bkl = ∂Fij (bkl )/∂bkl = δ^{kl}_{ij} ,

where δ^{kl}_{ij} = 1 if k = i, l = j, and 0 otherwise.
Since det (δ^{kl}_{ij} ) = 1 ≠ 0, by the implicit function theorem, there exists a neighborhood
W ⊂ {||B|| < ρ}, such that B → exp B is a diffeomorphism on W . 

Remark 1.40. The inverse function to B → exp B = A is

    A −→ log A = log (I + (A − I)) = ∑_{k=1}^∞ ((−1)^{k−1}/k) (A − I)^k ,

which is uniformly convergent for ||A − I|| ≤ ρ < 1.



Proposition 1.41. The function A(t) = exp (tB), ∀ t ∈ R satisfies the following
properties:
    A(0) = I,
    A(t1 + t2 ) = A(t1 )A(t2 ) = A(t2 )A(t1 ),
    A(−t) = (A(t))−1 ,
    (d/dt) A(t) = BA(t),  (d/dt) A(t)|t=0 = B.

Proposition 1.42. For A1 , A2 ∈ GL(n, C), the commutator in the Lie group is defined by
{A1 , A2 }G = A1 A2 A1−1 A2−1 . Then for Ai (t) = exp tBi (i = 1, 2), ∀ t ∈ R, we have

    {A1 (t), A2 (t)}G = I + t2 {B1 , B2 }g + O(t3 ),
    {A1 (t), A2 (t)}G |t=0 = In ,
    (d/dt) {A1 (t), A2 (t)}G |t=0 = 0,
    (1/2) (d2/dt2 ) {A1 (t), A2 (t)}G |t=0 = {B1 , B2 }g ,

where {B1 , B2 }g = B1 B2 − B2 B1 is the commutator in the Lie algebra.

Proposition 1.43. If B ∈ g(G, n, F) and f (λ) = ∑_{k=0}^n αk λk , αk ∈ F, then
(B^k )′G = G(−B)^k , (f (B))′G = Gf (−B).
Theorem 1.44. A(t) = exp (tB) ∈ G(G, n, F), ∀ t ∈ R iff B ∈ g(G, n, F).
Proof. Let C(t) = A′(t)GA(t); then C(0) = G, and

    (d/dt) C(t) = (dA′(t)/dt) GA(t) + A′(t)G (dA(t)/dt)
                = A′(t)B′GA(t) + A′(t)GBA(t)                        (1.1)
                = A′(t)(B′G + GB)A(t).

Thus, A(t) ∈ G(G, n, F), ∀t ∈ R, iff C(t) ≡ G, i.e., (d/dt) C(t) ≡ 0. Then, in order
to prove the theorem, we only need to show that the latter condition is equivalent to
B ∈ g(G, n, F), i.e., B′G + GB = 0.
If (d/dt) C(t) ≡ 0, then (d/dt) C(t)|t=0 = B′G + GB = 0, and so
B ∈ g(G, n, F). Conversely, if B′G + GB = 0, then by (1.1),

    (d/dt) C(t) ≡ 0,

so C(t) = C(0) = G, i.e., A′(t)GA(t) = G, ∀ t ∈ R. 
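Theorem 1.44 can be spot-checked with G = J: A(t) = exp(tB) satisfies A′JA = J whenever B′J + JB = 0. A numerical sketch (NumPy assumed; a truncated series stands in for exp):

```python
import numpy as np

def expm_series(B, terms=60):
    # truncated exponential series, adequate for the small matrix below
    E, T = np.eye(B.shape[0]), np.eye(B.shape[0])
    for k in range(1, terms):
        T = T @ B / k
        E = E + T
    return E

rng = np.random.default_rng(8)
n = 2
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
S = rng.standard_normal((2 * n, 2 * n)); S = 0.2 * (S + S.T)
B = J @ S                                  # B in g(J, 2n, R): B'J + JB = 0
for t in (0.5, 1.0, 2.0):
    A = expm_series(t * B)
    assert np.allclose(A.T @ J @ A, J)     # A(t) is symplectic
```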

3. Lie algebra of conformally invariant groups CG(G, n, F)


Definition 1.45. Define

    cg(G, n, F) = {B ∈ gl(n, F) | B′G + GB = βG, for some β ∈ F}.

If B1 , B2 ∈ cg(G, n, F), so that Bi′G = βi G − GBi (i = 1, 2), then

    {B1 , B2 }′G = (B1 B2 − B2 B1 )′G = B2′ B1′ G − B1′ B2′ G
                 = B2′ (β1 G − GB1 ) − B1′ (β2 G − GB2 )
                 = β1 β2 G − β1 GB2 − (β2 G − GB2 )B1
                   − β1 β2 G + β2 GB1 + (β1 G − GB1 )B2
                 = G(B2 B1 − B1 B2 ) = −G{B1 , B2 },

i.e., {B1 , B2 }′G + G{B1 , B2 } = 0 · G, so {B1 , B2 } ∈ cg(G, n, F). This shows that
cg(G, n, F) is closed under { , }. Therefore, cg(G, n, F) is a subalgebra of gl(n, F)
equipped with the induced binary operation { , }, called the Lie algebra of the
conformally invariant group CG(G, n, F).
Theorem 1.46. Let A(t) = exp(tB). Then A(t) ∈ CG(G, n, F), ∀ t ∈ R, iff B ∈ cg(G, n, F).

2.2 Canonical Reductions of Bilinear Forms


In Section 2.1, we have seen that for a given bilinear form ϕ on Fn , there is a matrix
G in M (n, F) such that under the natural basis, ϕ can be represented by G:

    ϕ(x, y) = x′Gy, Gij = ϕ(ei , ej ).

The representative matrix G will change as the basis changes. In this section, we
determine how the matrix G changes.
Let Fn = {e1 , · · · , en } = {f¯1 , · · · , f¯n }, where e1 , · · · , en is the standard basis and
f¯1 , · · · , f¯n is a new basis. Then,

    f¯i = ∑_{j=1}^n tji ej ,  T = [tij ] ∈ GL(n, F),

    x = ∑_{i=1}^n xi ei = ∑_{i=1}^n ui f¯i ,  x = [x1 , · · · , xn ]′ = T u,  u = [u1 , · · · , un ]′ ,

    y = ∑_{i=1}^n yi ei = ∑_{i=1}^n vi f¯i ,  y = [y1 , · · · , yn ]′ = T v,  v = [v1 , · · · , vn ]′ .

Assume that under the new basis f¯1 , · · · , f¯n , the representative matrix of ϕ is Ḡ. Then,

    Ḡij = ϕ(f¯i , f¯j ) = ϕ( ∑_{k=1}^n tki ek , ∑_{l=1}^n tlj el )
        = ∑_{k,l=1}^n tki ϕ(ek , el )tlj = (T′GT )ij ,

i.e.,
    Ḡ = T′GT,

and

    ϕ(x, y) = ϕ( ∑_{i=1}^n xi ei , ∑_{j=1}^n yj ej ) = ∑_{i,j=1}^n xi ϕ(ei , ej )yj
            = ∑_{i,j=1}^n xi Gij yj = x′Gy
            = ϕ( ∑_{i=1}^n ui f¯i , ∑_{j=1}^n vj f¯j ) = ∑_{i,j=1}^n ui ϕ(f¯i , f¯j )vj
            = ∑_{i,j=1}^n ui Ḡij vj = u′Ḡv.

2.2.1 Congruent Reductions

Definition 2.1. Let G1 , G2 ∈ M (n, F). G1 is congruent to G2 if there exists a
non-singular matrix T ∈ GL(n, F), such that T′G1 T = G2 , denoted by G1 ∼ G2 .

Thus, the representative matrices of a bilinear form ϕ on Rn under different bases
are congruent to one another.
If G1 is congruent to G2 , then rank G1 = rank G2 , and the equality G1′ = εG1
implies the equality G2′ = εG2 with the same parity ε.
Let φ be a sesquilinear form on Cn , and let G be the representative matrix of φ
under the standard basis, i.e.,

    φ(x, y) = x∗ Gy,  x = [x1 , · · · , xn ]′ ,  y = [y1 , · · · , yn ]′ ∈ Cn .

If f¯1 , · · · , f¯n is another basis of Cn and f¯j = ∑_{i=1}^n tij ei , T = [tij ] ∈ GL(n, C),
then similarly we can get

    Ḡ = T ∗ GT,

where Ḡ is the representative matrix of φ under the basis f¯1 , · · · , f¯n .

Definition 2.2. Let G1 , G2 ∈ M (n, C). G1 is conjugate congruent to G2 if there
exists a matrix T ∈ GL(n, C), such that T ∗ G1 T = G2 , denoted by G1 ∼c G2 .
If G1 ∼c G2 , then rank G1 = rank G2 , and the equality G1∗ = εG1 implies the
equality G2∗ = εG2 with the same factor ε.

Remark 2.3. G is a conformally Hermitian matrix, i.e., G∗ = εG with G ∈ M (n, C),
ε ∈ C, and |ε| = 1, if and only if ε = eiθ and G∗ = eiθ G, if and only if eiθ/2 G is a
Hermitian matrix.

Definition 2.4. Let ϕ(x, y) = ϕG (x, y) = x′Gy be a bilinear form induced by G.
For a subspace U ⊂ Rn /Cn , the subspace U ϕ ⊂ Rn /Cn defined by

    U ϕ = {x ∈ Rn /Cn | ϕ(x, y) = x′Gy = 0, ∀y ∈ U }

is called the G-orthogonal complement of U .

2.2.2 Congruence Canonical Forms of Conformally Symmetric


and Hermitian Matrices

We list congruence canonical forms of conformally symmetric and Hermitian matrices


in Table 2.1 as a comparison.
F = R/C.

1. Alternative canonical forms

Let
    T = (1/√2) ⎡ I  I ⎤ ,  then  T′ ⎡ O I ⎤ T = ⎡ I  O ⎤ ,
               ⎣ I −I ⎦          ⎣ I O ⎦        ⎣ O −I ⎦

where Fn = V + U, U = {x ∈ Fn | ϕ(x, y) = 0, ∀ y ∈ Fn } = (Fn )ϕ , dim V = r,
dim U = n − r, ϕ restricted to V is non-singular, and J = ( 0 1 ; −1 0 ) is the
canonical symplectic block.
Let T = [tij ] with

    tij = δk,j    for i = 2k − 1,
    tij = δk+n,j  for i = 2k,      i, j = 1, · · · , 2n.

Then,
    T′ diag ( ⎡ 0  1 ⎤ , · · · , ⎡ 0  1 ⎤ ) T = ⎡ O    In ⎤ .
              ⎣ ±1 0 ⎦           ⎣ ±1 0 ⎦       ⎣ ±In  O  ⎦

Thus, the canonical forms listed in Table 2.1 have the following alternative forms:

Table 2.1. Congruence canonical forms of conformally symmetric and Hermitian matrices

(1) G′ = −G, in R or C. Canonical form: diag(J, · · · , J, On−r ) with s blocks
    J = ( 0 1 ; −1 0 ), 2s = r. Reduction step: ∃ u, v ∈ Fr with ϕ(u, v) ≠ 0; let
    a1 = u, a2 = v/ϕ(u, v); then ϕ(a1 , a1 ) = ϕ(a2 , a2 ) = 0, ϕ(a1 , a2 ) =
    −ϕ(a2 , a1 ) = 1, {a1 , a2 } ∩ {a1 , a2 }ϕ = {0}, and G ∼ ( J O ; O G1 ).

(2) G′ = G, in C. Canonical form: diag(Ir , On−r ). Reduction step: ∃ u ∈ Cr with
    ϕ(u, u) ≠ 0; let a1 = u/√ϕ(u, u); then ϕ(a1 , a1 ) = 1, {a1 } ∩ {a1 }ϕ = {0},
    {a1 }ϕ = {a2 , · · · , ar }, and G ∼ ( 1 0 ; 0 G1 ).

(3) G′ = G, in R. Canonical form: diag(Ip , −Iq , On−r ), p + q = r. Reduction step:
    ∃ u ∈ Rr with ϕ(u, u) ≠ 0; let a1 = u/√|ϕ(u, u)|; then ϕ(a1 , a1 ) = ±1,
    {a1 } ∩ {a1 }ϕ = {0}, {a1 }ϕ = {a2 , · · · , ar }, and G ∼ ( ±1 0 ; 0 G1 ).

(4) G∗ = G, or conformally Hermitian G∗ = eiθ G, in C. Canonical forms:
    diag(Ip , −Iq , On−r ), respectively diag(e−iθ/2 Ip , −e−iθ/2 Iq , On−r ), p + q = r.
    Reduction step: ∃ u ∈ Cr with φ(u, u) ≠ 0; let a1 = u/√|φ(u, u)|; then
    φ(a1 , a1 ) = sign φ(u, u) = ±1, {a1 } ∩ {a1 }φ = {0}, and G ∼ ( ±1 0 ; 0 G1 ).

Alternative forms:

    ( 0 Is ; −Is 0 ) ⊕ On−r ∼ G,                     G′ = −G, in R or C (2s = r),
    Ir ⊕ On−r ∼ G,                                   G′ = G, in C,
    ( 0 Is ; Is 0 ) ⊕ σId ⊕ On−r ∼ G,                G′ = G in R (or G∗ = G in C),
    ( 0 Is ; e−iθ Is 0 ) ⊕ σe−iθ/2 Id ⊕ On−r ∼ G,    G∗ = eiθ G, in C,

where s = min(p, q), d = |p − q|, σ = sign(p − q), p + q = r = 2s + d, p − q = σd,
p = (r + σd)/2 = s + (1 + σ)d/2, q = (r − σd)/2 = s + (1 − σ)d/2.
2. Invariants under congruences
Theorem 2.5. Let G be a conformally symmetric matrix over F (= R or C), i.e.,
G′ = εG. Then, the quantities ε(G), r(G) and s(G) are invariants under congruences.
Moreover, if ε = −1, then r = 2s; if ε = 1, then p(G), q(G), d(G) and σ(G)
are invariants under congruences.
If G is conformally Hermitian, i.e., G∗ = εG with ε = eiθ , then the quantities
ε(G), r(G), s(G), p(G), q(G) and σ(G) are invariants under congruences.

Theorem 2.6 (Sylvester’s law of inertia). Let ϕ(x) be a quadratic form in Rn and
x = T y, det(T ) = 0. If

ϕ(x) = x21 + x22 + · · · + x2p − x2p+1 − · · · − x2n


= y12 (x) + y22 (x) + · · · + yq2 (x) − yq+1
2
(x) − · · · − yn2 (x), (2.1)

then p = q.
Similarly, let φ(x) be a quadratic form in Cn and x = T y, det T = 0. If

φ(x) = |x1 |2 + |x2 |2 + · · · + |xp |2 − |xp+1 |2 − · · · − |xn |2


= |y1 |2 + |y2 |2 + · · · + |yq |2 − |yq+1 |2 − · · · − |yn |2 ,

then p = q.

Proof. If p > q, then p + (n − q) > n. Thus, the equations x1 = 0, · · · , xp =


0, yq+1 (x) = 0, · · · , yn (x) = 0 has a non-zero solution ξ = 0. By (2.1),

y12 (ξ) + · · · + yq2 (ξ) + ξp+1


2
+ · · · + ξn2 = ξ12 + · · · + ξp2 + yq+1
2
(ξ) + · · · + yn2 (ξ) = 0,

and thus ξp+1 = 0, . . . , ξn = 0. Then, we have ξ = 0, which is a contradiction. This


shows that p ≤ q. Similarly, q ≤ p, then p = q. 
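Sylvester's law can be observed numerically: the signature (p, q), read from eigenvalue signs, is invariant under congruence. A sketch (NumPy assumed; `inertia` is our helper):

```python
import numpy as np

def inertia(G, tol=1e-9):
    # (p, q) = numbers of positive and negative eigenvalues of symmetric G
    w = np.linalg.eigvalsh(G)
    return (int(np.sum(w > tol)), int(np.sum(w < -tol)))

rng = np.random.default_rng(10)
n = 5
G = rng.standard_normal((n, n)); G = G + G.T   # real symmetric
T = rng.standard_normal((n, n))                # generically invertible
assert inertia(G) == inertia(T.T @ G @ T)      # signature unchanged by T'GT
```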
!
G0 0
Theorem 2.7 (Witt). If a non-singular Hermitian matrix is conjugate
0 G1
!
G0 0
congruent to , then G1 is conjugate congruent to G2 .
0 G2
! !
1 0 1 0
Proof. We first prove the case G0 = 1, i.e., ∼ .
0 G1 0 G2
! ! !
a b̄ 1 0 1 0
Let T = and T ∗ T = . Then,
c d 0 G1 0 G2
! ! ! ! 5 aa + c∗ G c ab̄ + c∗ G d 6
1 0 a c∗ 1 0 a b̄ 1 1
= ∗ =
0 G2 b̄ d 0 G 1 c d ∗  ∗
b̄a + d G1 c b̄b̄ + d G1 d
,i.e.,
2.2 Canonical Reductions of Bilinear Forms 133

aa + c∗ G1 c = 1, ab̄ + c∗ G1 d = 0,
b̄a + d∗ G1 c = 0, b̄b̄ + d∗ G1 d = G2 .

Let dλ = d + λcb ,

d∗λ G1 dλ = (d∗ + λb̄c∗ )G1 (d + λcb̄ )


= d∗ G1 d + λb̄c∗ G1 d + λd∗ G1 cb̄ + |λ|2 b̄c∗ G1 cb̄
= G2 − b̄b̄ − λb̄ab̄ − λb̄ab̄ + |λ|2 (b̄b̄ − aab̄b̄ )
= G2 − (1 + λa + aλ − (1 − |a|2 )|λ|2 )b̄b̄ .

If λ satisfies the equation


1 + λa + aλ − (1 − |a|2 )|λ|2 = 0,
then
d∗λ G1 dλ = G2 .
It only needs to take
⎧ 1
⎨ − , if a = 1,
λ= 2
⎩ 1 , if a = 1.
1−a

If the order r of G0 is larger than 1, then

    G0 ∼c diag(ε1 , · · · , εr ),  εi = ±1,

and thus

    diag(ε1 , · · · , εr , G1 ) ∼c diag(ε1 , · · · , εr , G2 ).

We denote

    Ĝ1 = diag(ε2 , · · · , εr , G1 ),  Ĝ2 = diag(ε2 , · · · , εr , G2 );

then by the result just proved above, applied to the leading entry ε1 (for ε1 = −1,
multiply through by −1 and apply the same argument), Ĝ1 ∼c Ĝ2 . By recursion, we
can finally get

    G1 ∼c G2 .

The Witt theorem gives another proof of the invariance of the index p. If

    ⎡ Ip  0     ⎤ ∼c ⎡ Ip′  0      ⎤ ,
    ⎣ 0  −In−p ⎦     ⎣ 0   −In−p′ ⎦

and p > p′ , then by the Witt theorem,

    ⎡ Ip−p′  0     ⎤ ∼c −In−p′ ,
    ⎣ 0     −In−p ⎦

i.e., there exists a matrix T ∈ GL(n − p′ , C), such that

    ⎡ Ip−p′  0     ⎤ = T ∗ (−In−p′ )T = −T ∗ T.
    ⎣ 0     −In−p ⎦

The (1,1) element of the left-hand matrix is then

    1 = −(|t11 |2 + · · · + |tn−p′ ,1 |2 ),

which is a contradiction. 

2.2.3 Similar Reduction to Canonical Forms under Orthogonal


Transformation

For comparison, we list the canonical forms of Hermitian, conformal Hermitian ma-
trices, real symmetric, and anti-symmetric matrices under unitary or orthogonal trans-
formations in the Table 2.2. The content can be found in any standard textbook.

Table 2.2. H, S and K under unitary or orthogonal transformations

(1) H ∗ = H. Eigenvalues λk ∈ R; canonical form diag(λ1 , · · · , λr , 0, · · · , 0).
    Derivation: if λ1 is an eigenvalue of H, then ∃ w1 ≠ 0, s.t. Hw1 = λ1 w1 ,
    ⟨w1 , w1 ⟩ = 1; ⟨w1 , Hw1 ⟩ = ⟨w1 , λ1 w1 ⟩ = λ1 , and ⟨w1 , Hw1 ⟩ ∈ R ⇒ λ1 ∈ R.
    z ∈ {w1 }⊥ ⇒ Hz ∈ {w1 }⊥ . So ∃ T0 ∈ U (n, C), s.t.
    T0−1 HT0 = diag(λ1 , H1 ), where H1∗ = H1 ; the form follows by recursion.

(2) H ∗ = eiθ H. Then eiθ/2 H is Hermitian (Remark 2.3), so the canonical form is
    diag(e−iθ/2 λ1 , · · · , e−iθ/2 λr , 0, · · · , 0) with λk ∈ R.

(3) S real, S′ = S. Then S is a Hermitian matrix, its eigenvalues λk are real, and
    the canonical form is diag(λ1 , · · · , λr , 0, · · · , 0). If λ1 is an eigenvalue, then
    ∃ w1 ∈ Rn , s.t. Sw1 = λ1 w1 , (w1 , w1 ) = 1; x ∈ {w1 }⊥ ⇒ Sx ∈ {w1 }⊥ .
    Analogously, ∃ T0 ∈ O(n, R), s.t. T0−1 ST0 = diag(λ1 , S1 ), where S1′ = S1 .

(4) K′ = −K. Canonical form diag( ( 0 λ1 ; −λ1 0 ), · · · , ( 0 λs ; −λs 0 ), On−r )
    with λk > 0, 2s = r; equivalently ( 0 A ; −A 0 ) ⊕ On−r with
    A = diag{λ1 , λ2 , · · · , λs }.
    Derivation: iK is a Hermitian matrix, so its eigenvalues are real. If λ1 ≠ 0 is an
    eigenvalue of iK, then ∃ w1 = u − i v ≠ 0, s.t. i K(u − i v) = λ1 (u − i v);
    therefore (1) Kv = λ1 u, (2) Ku = −λ1 v. Hence λ1 (u, v) = (v, Kv) = 0; since
    λ1 ≠ 0, we have (u, v) = 0, and from (1) and (2), u ≠ 0, v ≠ 0. Normalize
    (u, u) = (v, v) = 1. Then (u, Ku) = 0, (u, Kv) = λ1 , (v, Kv) = 0,
    (v, Ku) = −λ1 . If x ∈ {u, v}⊥ , then Kx ∈ {u, v}⊥ . Thus ∃ T0 ∈ O(n, R),
    s.t. T0−1 KT0 = diag( ( 0 λ1 ; −λ1 0 ), K1 ), where K1′ = −K1 .

Next, we consider Jordan canonical forms. Let us first recall the Jordan canonical form
for a general real matrix A ∈ M (n, R). A Jordan canonical form viewed in real space
is different from the Jordan canonical form viewed in complex space.
1. Elementary divisors in complex space
In complex space, the elementary divisors corresponding to a pair of complex conjugate
eigenvalues α ± iβ, β ≠ 0, are of the form

    [λ − (α + iβ)]p , [λ − (α − iβ)]p .

The corresponding Jordan blocks are the p × p matrices

    ⎡ α + iβ  1               ⎤       ⎡ α − iβ  1               ⎤
    ⎢         α + iβ  ⋱       ⎥       ⎢         α − iβ  ⋱       ⎥
    ⎢                 ⋱    1  ⎥   ,   ⎢                 ⋱    1  ⎥   .
    ⎣                 α + iβ ⎦p×p     ⎣                 α − iβ ⎦p×p

The elementary divisor corresponding to a real eigenvalue γ is of the form

    (λ − γ)q .

Its Jordan block is

    ⎡ γ  1         ⎤
    ⎢    γ  ⋱      ⎥
    ⎢       ⋱   1  ⎥   ∼ (λ − γ)q .
    ⎣           γ  ⎦q×q

2. Elementary divisors in real space
In real space, the elementary divisor corresponding to a pair of complex conjugate
eigenvalues α ± iβ, β ≠ 0, is of the form

    [λ2 − 2αλ + (α2 + β 2 )]p .

Its Jordan block is the 2p × 2p matrix

    ⎡ 0            1                                         ⎤
    ⎢ −(α2 + β 2 ) 2α  1                                     ⎥
    ⎢                  0            1                        ⎥
    ⎢                  −(α2 + β 2 ) 2α  1                    ⎥
    ⎢                                   ⋱                    ⎥
    ⎢                                      ⋱   1             ⎥
    ⎢                                      0            1    ⎥
    ⎣                                      −(α2 + β 2 ) 2α   ⎦ 2p×2p .

The elementary divisor and the Jordan block corresponding to a real eigenvalue γ
is the same as in complex space, i.e.,
2.3 Symplectic Space 137

⎡ ⎤
γ 1
⎢ .. ⎥
⎢ γ . ⎥
⎢ ⎥ ∼ (λ − γ)q .
⎢ .. ⎥
⎣ . 1 ⎦
γ q×q
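As a quick numerical sanity check (our own illustration, not from the text), the 2p × 2p real block above can be assembled in code; its minimal polynomial is exactly [λ² − 2αλ + (α² + β²)]^p, which we verify for sample values of α, β, p:

```python
import numpy as np

# Real Jordan block for the conjugate pair alpha +/- i*beta, assembled from
# 2x2 companion blocks C on the diagonal and a single 1 linking consecutive
# blocks (entry (2,1) of the off-diagonal block N).
alpha, beta, p = 0.5, 2.0, 3
c = alpha**2 + beta**2
C = np.array([[0.0, 1.0], [-c, 2 * alpha]])
N = np.array([[0.0, 0.0], [1.0, 0.0]])

H = np.zeros((2 * p, 2 * p))
for k in range(p):
    H[2*k:2*k+2, 2*k:2*k+2] = C
    if k < p - 1:
        H[2*k:2*k+2, 2*k+2:2*k+4] = N

# q(H)^p = 0 but q(H)^(p-1) != 0, where q(x) = x^2 - 2*alpha*x + c:
# the elementary divisor of H is exactly q(lambda)^p.
qH = H @ H - 2 * alpha * H + c * np.eye(2 * p)
ok = (np.allclose(np.linalg.matrix_power(qH, p), 0)
      and not np.allclose(np.linalg.matrix_power(qH, p - 1), 0))
```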

2.3 Symplectic Space


A symplectic vector space is a vector space V equipped with a nondegenerate, skew-
symmetric, bilinear form ω called the symplectic form.
Explicitly, a symplectic form is a bilinear form ω : V × V → R that is
1° Skew-symmetric: ω(u, v) = −ω(v, u), ∀u, v ∈ V.
2° Nondegenerate: if ω(u, v) = 0, ∀v ∈ V, then u = 0.
Working in a fixed basis, ω can be represented by a matrix. The two conditions
above imply that this matrix must be skew-symmetric and non-singular. This is not
the same as a symplectic matrix, which represents a symplectic transformation of the
space.

2.3.1 Symplectic Space and Its Subspace


1. Comparison between symplectic and Euclidean space[Tre75,LM87,FQ91a,Wei77]

In this section, we restrict ourselves to R^{2n}. The symbol ⇔, which stands for "if and only if", will be widely adopted, both for J-orthogonality and for I-orthogonality.

Symplectic structure (J the symplectic matrix) — Euclidean structure (I the unit matrix):

[x, y] = x'Jy;  (x, y) = x'Iy;
[y, x] = −[x, y];  (y, x) = (x, y);
[x, x] = 0, ∀x;  (x, x) > 0, ∀x ≠ 0;
[x, y] = (x, Jy);  (x, y) = [x, J⁻¹y].

J-orthogonality: x ⊥_J y ⇐⇒ [x, y] = 0 ⇐⇒ y ⊥_J x.  I-orthogonality: x ⊥ y ⇐⇒ (x, y) = 0 ⇐⇒ y ⊥ x.

U ⊥_J V ⇐⇒ [x, y] = 0, ∀x ∈ U, y ∈ V;  U ⊥ V ⇐⇒ (x, y) = 0, ∀x ∈ U, y ∈ V.
U ⊥_J V ⇐⇒ U ⊥ JV;  U ⊥ V ⇐⇒ U ⊥_J J⁻¹V.

Definition 3.1. V^{⊥_J} = {x ∈ R^{2n} | [x, y] = 0, ∀y ∈ V} (the J-orthogonal complement of V); V^⊥ = {x ∈ R^{2n} | (x, y) = 0, ∀y ∈ V}.

By definition, we have V^{⊥_J} = (JV)^⊥, V^⊥ = (J⁻¹V)^{⊥_J}.

Proposition 3.2. Let U ⊂ R^{2n}, V ⊂ R^{2n}. Then,

U ⊂ V ⇐⇒ U^{⊥_J} ⊃ V^{⊥_J},  U ⊂ V ⇐⇒ U^⊥ ⊃ V^⊥;
(U ∩ V)^{⊥_J} = U^{⊥_J} + V^{⊥_J},  (U ∩ V)^⊥ = U^⊥ + V^⊥;
(U + V)^{⊥_J} = U^{⊥_J} ∩ V^{⊥_J},  (U + V)^⊥ = U^⊥ ∩ V^⊥;
dim U + dim U^{⊥_J} = 2n,  dim U + dim U^⊥ = 2n;
(U^{⊥_J})^{⊥_J} = U,  (U^⊥)^⊥ = U;
∃ U with U ∩ U^{⊥_J} ≠ {0},  whereas always U ∩ U^⊥ = {0};
∃ U with U + U^{⊥_J} ≠ R^{2n},  whereas always U + U^⊥ = R^{2n}.

Definition 3.3. For U ⊂ R^{2n}, define U^0 = U ∩ U^{⊥_J}, called the radical of U.
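The contrast in the last two lines of Proposition 3.2 can be seen numerically. The following is a minimal sketch (our own, not from the text): for U = span{e_1}, the J-orthogonal complement contains e_1 itself, while the Euclidean complement never does.

```python
import numpy as np

n = 2
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])
e1 = np.eye(2 * n)[:, [0]]                # U = span{e1}

def complement(B, G):
    # {x : x' G y = 0 for all y in span(B)}, i.e. null space of (G B)',
    # returned as an orthonormal basis via SVD.
    _, s, vt = np.linalg.svd((G @ B).T)
    rank = int((s > 1e-10).sum())
    return vt[rank:].T

Uj = complement(e1, J)                    # J-orthogonal complement of U
Ue = complement(e1, np.eye(2 * n))        # Euclidean complement of U

dims_ok = e1.shape[1] + Uj.shape[1] == 2 * n   # dim U + dim U^{perp_J} = 2n
# [e1, e1] = 0, so e1 lies in its own J-orthogonal complement ...
isotropic_ok = np.allclose(Uj @ (Uj.T @ e1), e1)
# ... but never in its Euclidean complement.
euclid_ok = np.allclose(Ue.T @ e1, 0)
```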

2. Special classes of subspaces[HW63,Tre75,LM87,Wei77]

(1) V degenerate subspace: V^0 = V ∩ V^{⊥_J} ≠ {0}
⇐= dim V is odd.

(2) V isotropic subspace: V ⊂ V^{⊥_J}
⇐⇒ V ∩ V^{⊥_J} = V^0 = V
⇐⇒ [x, y] = 0 on V
=⇒ dim V ≤ n
⇐⇒ V^{⊥_J} coisotropic
⇐= dim V = 1.

(3) V coisotropic subspace: V^{⊥_J} ⊂ V
⇐⇒ V ∩ V^{⊥_J} = V^0 = V^{⊥_J}
⇐⇒ [x, y] = 0 on V^{⊥_J}
=⇒ dim V ≥ n
⇐⇒ V^{⊥_J} isotropic
⇐= dim V = 2n − 1.

(4) V Lagrangian: V^{⊥_J} = V
⇐⇒ V is isotropic and coisotropic
⇐⇒ V is isotropic and dim V = n
⇐⇒ V is coisotropic and dim V = n
⇐⇒ V maximally isotropic
⇐⇒ V minimally coisotropic.

(5) V non-degenerate: V ∩ V^{⊥_J} = {0}
⇐⇒ V + V^{⊥_J} = R^{2n}
⇐⇒ [x, y] is non-degenerate on V
⇐⇒ (if x ∈ V and [x, y] = 0, ∀y ∈ V, then x = 0)
⇐⇒ V^{⊥_J} non-degenerate
=⇒ dim V is even.

(6) Coordinate subspaces.

Define ν = {1, 2, · · · , n} with the natural order. If α = {i_1, i_2, · · · , i_k} ⊂ ν (i_1 < · · · < i_k), the total number of such α's is 2^n. Denote ᾱ = ν\α = {j_1, · · · , j_{n−k}} (j_1 < · · · < j_{n−k}); then ν̄ = ∅, ᾱ ∪ α = ν, ᾱ ∩ α = ∅.

Definition 3.4. Define a coordinate subspace R^{α,β} = {e_i, f_j}_{i∈α, j∈β}, where e_1, · · · , e_n, f_1, · · · , f_n are the standard unit column vectors of R^{2n}:

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \ \cdots, \ e_n = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \qquad f_1 = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \ \cdots, \ f_n = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix},$$

with e_i carrying its 1 in position i and f_j in position n + j. The total number of coordinate subspaces R^{α,β} is 2^{2n}.

Problem 3.5. We have the following issues to address:
1° Under what conditions on α, β is R^{α,β} isotropic, coisotropic, Lagrangian?
2° Define I_α = [d_i δ_{ij}], d_i = 1 if i ∈ α, d_i = 0 if i ∈ ᾱ. Show that

I_ν = I_n,  I_{ν̄} = I_∅ = O_n,  I_α² = I_α,
I_α I_ᾱ = O_n,  I_α I_β = I_{α∩β},  I_α + I_ᾱ = I,
I_α + I_β = I_{α∩β} + I_{α∪β} = I_{α∪β} + I_α I_β.

3° Show that the subspaces D = {p_i = ±q_i, i = 1, 2, · · · , n} are Lagrangian and transversal to all coordinate Lagrangian subspaces.
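The I_α identities of 2° are immediate to confirm numerically. A minimal sketch (our own illustration; 0-based indices are used in code):

```python
import numpy as np

n = 5

def I_(idx, n):
    # diagonal indicator matrix I_alpha = [d_i delta_ij]
    d = np.zeros(n)
    d[list(idx)] = 1.0
    return np.diag(d)

nu = set(range(n))
alpha, beta = {0, 2, 3}, {1, 2}     # sample index sets
abar = nu - alpha

ok = (np.allclose(I_(nu, n), np.eye(n))
      and np.allclose(I_(alpha, n) @ I_(alpha, n), I_(alpha, n))       # idempotent
      and np.allclose(I_(alpha, n) @ I_(abar, n), np.zeros((n, n)))    # disjoint supports
      and np.allclose(I_(alpha, n) @ I_(beta, n), I_(alpha & beta, n))
      and np.allclose(I_(alpha, n) + I_(abar, n), np.eye(n))
      and np.allclose(I_(alpha, n) + I_(beta, n),
                      I_(alpha & beta, n) + I_(alpha | beta, n)))
```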

Theorem 3.6. Let W be non-degenerate and W = U ⊕ V, where ⊕ denotes the J-orthogonal sum. Then U, V both are non-degenerate.

Proof. Let x ∈ U such that [x, y] = 0, ∀y ∈ U. Then, from U ⊂ V^{⊥_J}, it follows that [x, z] = 0, ∀z ∈ V. Therefore, [x, w] = 0, ∀w ∈ W. By assumption, W is non-degenerate, and so x = 0. This shows U is non-degenerate. Similarly, V is non-degenerate too. □

Theorem 3.7. If U is isotropic, then there exists a Lagrangian subspace V ⊃ U.

Proof. Without loss of generality, we can assume that dim U = k < n. Since dim U^{⊥_J} = 2n − k > k, the set U^{⊥_J}\U is non-empty; take a vector a_{k+1} ∈ U^{⊥_J}\U. Then V = U + {a_{k+1}} ⊃ U is isotropic. □

By repeating this procedure, we can get a Lagrangian subspace V ⊃ U.


Theorem 3.8. For a given Lagrangian subspace U = {a_1, a_2, · · · , a_n}, there exists another Lagrangian subspace V = {b_1, b_2, · · · , b_n}, such that

R^{2n} = P_1 ⊕ P_2 ⊕ · · · ⊕ P_n = U +̇ V (the sums ⊕ being J-orthogonal),

where P_i = {a_i, b_i}, [a_i, a_j] = [b_i, b_j] = 0, [a_i, b_j] = δ_{ij}.

Proof. We proceed by induction with respect to dimension n. For n = 1, by the non-degeneracy of R², there is a vector b_1 satisfying [a_1, b_1] = 1. Of course, {b_1} is Lagrangian. Thus, this theorem is true for n = 1.

Assume that for n − 1 the theorem is true. Then, for R^{2n}, {a_2, a_3, · · · , a_n}^{⊥_J} ⊄ {a_1}^{⊥_J}, and so {a_2, a_3, · · · , a_n}^{⊥_J}\{a_1}^{⊥_J} ≠ ∅; there is a vector b_1 ∈ {a_2, a_3, · · · , a_n}^{⊥_J}\{a_1}^{⊥_J} such that, after normalization,

[a_1, b_1] = 1, [a_2, b_1] = · · · = [a_n, b_1] = 0.

Set P_1 = {a_1, b_1}. By the above subspace class (5) and Theorem 3.6,

R^{2n} = P_1 ⊕ P_1^{⊥_J}, P_1 ∩ P_1^{⊥_J} = {0},

dim P_1^{⊥_J} = 2(n − 1), and {a_2, a_3, · · · , a_n} ⊂ P_1^{⊥_J} is maximally isotropic in P_1^{⊥_J}, i.e., a Lagrangian subspace of P_1^{⊥_J}. By the inductive assumption, there exist b_2, · · · , b_n such that {b_2, · · · , b_n} is a Lagrangian subspace of P_1^{⊥_J}. Moreover,

[a_i, a_j] = [b_i, b_j] = 0, [a_i, b_j] = δ_{ij}, i, j = 2, · · · , n.

Therefore, for all i, j, we have

[a_i, a_j] = [b_i, b_j] = 0, [a_i, b_j] = δ_{ij}.

Set P_i = {a_i, b_i}; then R^{2n} = P_1 ⊕ P_2 ⊕ · · · ⊕ P_n. □

Theorem 3.9. Let U, V be two Lagrangian subspaces, U = {a_1, a_2, · · · , a_n}, U ∩ V = {0}. Then, there exists a unique basis b_1, b_2, · · · , b_n such that V = {b_1, b_2, · · · , b_n} and

[a_i, a_j] = [b_i, b_j] = 0, [a_i, b_j] = δ_{ij}.

Proof. In a similar manner, by induction with respect to dimension n, the proof can be obtained. □

Theorem 3.10. If U is not isotropic, then there exists a non-degenerate subspace N ⊂ U, N ≠ {0}, such that

U = U^0 ⊕ N, U^0 = U ∩ U^{⊥_J}.

Proof. By assumption, U^0 = {x ∈ U | [x, y] = 0, ∀y ∈ U} ≠ U. Therefore, there is a subspace N ⊂ U, N ≠ {0}, such that N ∩ U^0 = {0} and U = N +̇ U^0.

Since U^0 ⊥_J U, of course U^0 ⊥_J N. If x ∈ N and [x, y] = 0, ∀y ∈ N, then by U^0 ⊥_J N we have [x, z] = 0, ∀z ∈ U^0, i.e., [x, y] = 0, ∀y ∈ U. Thus, x ∈ U^0 ∩ N = {0}, and x must be 0. Therefore, N is non-degenerate. □

Theorem 3.11. Let V_1, V_2 be two disjoint isotropic subspaces. Then, there exist two Lagrangian subspaces W_1, W_2 with W_1 ∩ W_2 = {0} such that W_1 ⊃ V_1, W_2 ⊃ V_2.

Proof. Let U = V_1 +̇ V_2, U^0 = U ∩ U^{⊥_J} = (V_1 +̇ V_2) ∩ (V_1 +̇ V_2)^{⊥_J}.

1° If U is isotropic. Assume

V_1 = {a_1, · · · , a_r}, V_2 = {b_{r+1}, · · · , b_{r+s}}.

By Theorem 3.7, there exists V_3 = {a_{r+s+1}, · · · , a_n} such that

V_1 +̇ V_2 +̇ V_3 = {a_1, · · · , a_r, b_{r+1}, · · · , b_{r+s}, a_{r+s+1}, · · · , a_n}

is Lagrangian. Moreover, by Theorem 3.8, there exist b_1, · · · , b_r, a_{r+1}, · · · , a_{r+s}, b_{r+s+1}, · · · , b_n ∈ R^{2n} such that

[a_i, a_j] = [b_i, b_j] = 0, [a_i, b_j] = δ_{ij}.

Set W_1 = {a_1, · · · , a_r, a_{r+1}, · · · , a_{r+s}, a_{r+s+1}, · · · , a_n}, W_2 = {b_1, · · · , b_n}. Obviously, W_1, W_2 are Lagrangian and W_1 ∩ W_2 = {0}, V_1 ⊂ W_1, V_2 ⊂ W_2.

2° If U is not isotropic. Set U_1^0 = V_1 ∩ V_2^{⊥_J}, U_2^0 = V_2 ∩ V_1^{⊥_J}; then

U^0 = U_1^0 +̇ U_2^0, U_i^0 = V_i ∩ U^0, i = 1, 2.

Evidently,

U_i^0 = V_i ∩ U^0 = V_i ∩ U ∩ U^{⊥_J} = V_i ∩ (V_1 +̇ V_2)^{⊥_J} = V_i ∩ (V_1^{⊥_J} ∩ V_2^{⊥_J}), i = 1, 2.

By assumption, V_i (i = 1, 2) are isotropic, i.e., V_i ⊂ V_i^{⊥_J} (i = 1, 2). By the following Lemma 3.12,

U_1^0 +̇ U_2^0 = V_1 ∩ (V_1^{⊥_J} ∩ V_2^{⊥_J}) +̇ V_2 ∩ (V_1^{⊥_J} ∩ V_2^{⊥_J}) = (V_1 +̇ V_2) ∩ (V_1^{⊥_J} ∩ V_2^{⊥_J}) = U ∩ U^{⊥_J} = U^0.

Take N_i ⊂ V_i (i = 1, 2) such that V_i = U_i^0 +̇ N_i. Set N = N_1 +̇ N_2; then

U = V_1 +̇ V_2 = U_1^0 +̇ N_1 +̇ U_2^0 +̇ N_2 = U^0 +̇ N.

Similar to the proof of Theorem 3.10, we can see that N is non-degenerate and U = U^0 ⊕ N (a J-orthogonal sum). Of course, N_i ⊂ N ∩ V_i (i = 1, 2), and in fact N_1 = N ∩ V_1: if x ∈ N ∩ V_1, then x = x_1 + x_2, x_1 ∈ N_1, x_2 ∈ N_2; moreover, by x ∈ V_1, we have x_2 = x − x_1 ∈ V_1. Thus, x_2 ∈ V_1 ∩ V_2 = {0}, x_2 = 0. This shows that x = x_1 ∈ N_1.

As N is non-degenerate, R^{2n} = N ⊕ N^{⊥_J}, and U^0 = U_1^0 +̇ U_2^0 ⊂ N^{⊥_J}. By N = N_1 +̇ N_2 and N_1, N_2 being isotropic, we know that N_1, N_2 both are maximally isotropic, i.e., Lagrangian in N.

For the non-degenerate subspace N^{⊥_J}, since U^0 = U_1^0 +̇ U_2^0 ⊂ N^{⊥_J} is isotropic, applying the result 1° of Theorem 3.11, there exist two Lagrangian subspaces W̄_1, W̄_2 in N^{⊥_J} such that W̄_i ⊃ U_i^0 (i = 1, 2) and W̄_1 ∩ W̄_2 = {0}. Set W_1 = W̄_1 +̇ N_1, W_2 = W̄_2 +̇ N_2. Then W_i (i = 1, 2) are Lagrangian in R^{2n}, V_i ⊂ W_i (i = 1, 2), and W_1 ∩ W_2 = {0}. □
Lemma 3.12. If A ⊂ A_1, B ⊂ B_1, then

(A + B) ∩ (A_1 ∩ B_1) = A ∩ (A_1 ∩ B_1) + B ∩ (A_1 ∩ B_1) = A ∩ B_1 + B ∩ A_1.

Proof. The inclusion (A + B) ∩ (A_1 ∩ B_1) ⊃ A ∩ (A_1 ∩ B_1) + B ∩ (A_1 ∩ B_1) is trivial. Let x ∈ (A + B) ∩ (A_1 ∩ B_1). Since x ∈ A + B, there exists a decomposition

x = a + b, a ∈ A ⊂ A_1, b ∈ B ⊂ B_1.

Since x ∈ A_1 ∩ B_1, a = x − b ∈ B_1 and b = x − a ∈ A_1. Thus, a ∈ A ∩ B_1, b ∈ B ∩ A_1, i.e., x = a + b ∈ A ∩ B_1 + B ∩ A_1. □
Lemma 3.13. Let U, V, W ⊂ R^m, V +̇ W = R^m. Define a linear projection π_V^W onto V along W as π_V^W : R^m → R^m,

π_V^W x = x, ∀x ∈ V;  π_V^W x = 0, ∀x ∈ W.

Then, U +̇ W = R^m iff π_V^W : U → V is non-singular and onto.

Proof. Assume U +̇ W = R^m, x ∈ U, π_V^W x = 0. From V +̇ W = R^m, it follows that there exists a decomposition

x = v + w, where v ∈ V and w ∈ W.

Thus,

0 = π_V^W x = π_V^W(v + w) = π_V^W v + π_V^W w = π_V^W v = v.

Then, x = w ∈ W, i.e., x ∈ U ∩ W = {0}. We get x = 0, so π_V^W : U → V is non-singular; it is onto since dim U = m − dim W = dim V.

Conversely, if x ∈ U ∩ W, then x ∈ W, x ∈ U. By the definition of π_V^W, we have π_V^W x = 0. Since π_V^W : U → V is non-singular, x = 0, i.e.,

W ∩ U = {0}, and, by the dimension count above, U +̇ W = R^m.

Therefore, the lemma is completed. □


Theorem 3.14. For every Lagrangian subspace L in R^{2n}, there exists a coordinate Lagrangian subspace R^{α,ᾱ} transversal to L, i.e., ∃α ⊂ ν such that R^{α,ᾱ} +̇ L = R^{2n}.

Proof. Since (R^{ν,0} ∩ L) ⊂ R^{ν,0}, ∃α ⊂ ν such that

R^{α,0} ∩ (R^{ν,0} ∩ L) = {0}, R^{α,0} + (R^{ν,0} ∩ L) = R^{ν,0}.

Hence R^{ν,0} = (R^{ν,0})^{⊥_J} = R^{α,0} + (R^{ν,0} ∩ L) ⊂ R^{α,ᾱ} + L and, since R^{α,ᾱ} and L are Lagrangian,

R^{α,ᾱ} ∩ L = (R^{α,ᾱ} + L)^{⊥_J} ⊂ (R^{ν,0})^{⊥_J} = R^{ν,0}.

Therefore,

R^{α,ᾱ} ∩ L = (R^{α,ᾱ} ∩ R^{ν,0}) ∩ (R^{ν,0} ∩ L) = R^{α,0} ∩ (R^{ν,0} ∩ L) = {0}.

Then, counting dimensions,

R^{α,ᾱ} +̇ L = R^{2n}.

The theorem is proved. □
3. Matrix representation of subspaces in R^{2n}

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & & \vdots \\ a_{2n,1} & \cdots & a_{2n,k} \end{pmatrix} = [a_1, a_2, \cdots, a_k] = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix},$$

where A ∈ M(2n, k), a_i ∈ M(2n, 1), A_j ∈ M(n, k).

Definition 3.15. A ∈ M(2n, k) is non-singular if rank A = k.

Let A ∈ M(2n, k), B ∈ M(2n, l). Then, [A, B] ∈ M(2n, k + l). If [A, B] is non-singular, then both A and B are non-singular.

G_{2n,k} = {all k-dimensional subspaces in R^{2n}}, called the Grassmann manifold.

If A ∈ M(2n, k) is non-singular, we define {A} = {a_1, · · · , a_k} to be the k-dimensional subspace in R^{2n} generated by the k column vectors a_1, · · · , a_k of A.

Proposition 3.16. Let A, B ∈ M(2n, k) be non-singular. Then {A} = {B} iff ∃Q ∈ GL(k) such that AQ = B, i.e.,

$$\begin{pmatrix} A_1 Q \\ A_2 Q \end{pmatrix} = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}.$$

Definition 3.17. Let A, B ∈ M (2n, k). If there is a matrix Q ∈ GL(k), and AQ =


B, then we say that A is equivalent to B, denoted by A ∼ B.

Proposition 3.18. G2n,k consists of equivalent classes of non-singular elements un-


der M (2n, k) i.e.,

G2n,k ≈ {equivalent classes of non-singular elements under “∼” in M (2n, k)}.

Definition 3.19. Λn = { all Lagrangian subspaces in R2n } ⊂ G2n,n .


Definition 3.20. $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} \in M(2n, n)$ is called a symmetric pair of square matrices if

A'JA = O_n,

i.e., A_1'A_2 − A_2'A_1 = O_n, where $J = \begin{pmatrix} O & I_n \\ -I_n & O \end{pmatrix}$.

Evidently, $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$ is a symmetric pair iff A_1'A_2 ∈ SM(n), or equivalently A_2'A_1 ∈ SM(n). The set of all symmetric pairs of square matrices is denoted by SM^{2n,n}.

In particular, $A = \begin{pmatrix} S \\ I \end{pmatrix}$ or $\begin{pmatrix} I \\ S \end{pmatrix} \in SM^{2n,n}$ iff S ∈ SM(n).

For $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$:
if |A_1| ≠ 0, then A ∈ SM^{2n,n} ⇐⇒ A_2A_1^{-1} ∈ SM(n);
if |A_2| ≠ 0, then A ∈ SM^{2n,n} ⇐⇒ A_1A_2^{-1} ∈ SM(n).

Definition 3.21. Let A, B ∈ M(2n, n). A is conjugate to B if A'JB = I_n. A is conformally conjugate to B if ∃μ = μ(A, B) ≠ 0 such that A'JB = μI_n.

Obviously, A is conjugate to B iff −B is conjugate to A; A is conformally conjugate to B iff B is conformally conjugate to A, with μ(B, A) = −μ(A, B).

2.3.2 Symplectic Group

Let

$$A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} = [a_1, \cdots, a_n] \in M(2n, n), \qquad B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} = [b_1, \cdots, b_n] \in M(2n, n),$$

$$M = [A, B] = \begin{pmatrix} A_1 & B_1 \\ A_2 & B_2 \end{pmatrix} = [a_1, \cdots, a_n, b_1, \cdots, b_n] \in M(2n).$$

Then

M ∈ Sp(2n) ⇐⇒ M'JM = J
⇐⇒ A, B are symmetric pairs and A is conjugate to B
⇐⇒ A'JA = O_n = B'JB, A'JB = I_n
⇐⇒ A_1'A_2 − A_2'A_1 = O_n = B_1'B_2 − B_2'B_1, A_1'B_2 − A_2'B_1 = I_n
⇐⇒ a_i'Ja_j = b_i'Jb_j = 0, a_i'Jb_j = δ_{ij},
i.e., a_1, · · · , a_n, b_1, · · · , b_n is a symplectic basis

$$\Longleftrightarrow M' = \begin{pmatrix} A_1' & A_2' \\ B_1' & B_2' \end{pmatrix} \in Sp(2n)$$

$$\Longleftrightarrow \begin{pmatrix} A_1' \\ B_1' \end{pmatrix}, \ \begin{pmatrix} A_2' \\ B_2' \end{pmatrix} \ \text{are symmetric pairs and are conjugate to each other}$$

⇐⇒ A_1B_1' − B_1A_1' = O_n = A_2B_2' − B_2A_2'; A_1B_2' − B_1A_2' = I_n.

M ∈ CSp(2n) ⇐⇒ ∃μ ≠ 0, M'JM = μJ
⇐⇒ A, B are symmetric pairs and A is conformally conjugate to B
⇐⇒ A'JA = O_n = B'JB, ∃μ ≠ 0, s.t. A'JB = μI_n
⇐⇒ A_1'A_2 − A_2'A_1 = O_n = B_1'B_2 − B_2'B_1, A_1'B_2 − A_2'B_1 = μI_n, μ ≠ 0

$$\Longleftrightarrow M' = \begin{pmatrix} A_1' & A_2' \\ B_1' & B_2' \end{pmatrix} \in CSp(2n)$$

$$\Longleftrightarrow \begin{pmatrix} A_1' \\ B_1' \end{pmatrix}, \ \begin{pmatrix} A_2' \\ B_2' \end{pmatrix} \ \text{are symmetric pairs and are conformally conjugate to each other}$$

⇐⇒ A_1B_1' − B_1A_1' = O_n = A_2B_2' − B_2A_2'; A_1B_2' − B_1A_2' = μI_n, μ ≠ 0.
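The block characterization above is easy to confirm numerically. A minimal sketch (our own illustration, not from the text): the matrix M below is a product of symplectic generators, and its n × n blocks satisfy the three conditions for a symplectic matrix.

```python
import numpy as np

n = 2
rng = np.random.default_rng(1)
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])

S = rng.standard_normal((n, n)); S = S + S.T   # symmetric
T = rng.standard_normal((n, n)); T = T + T.T   # symmetric
# product of two elementary symplectic matrices (types (II) and (II') below)
M = np.block([[I, S], [O, I]]) @ np.block([[I, O], [T, I]])

A1, B1 = M[:n, :n], M[:n, n:]
A2, B2 = M[n:, :n], M[n:, n:]

is_symplectic = np.allclose(M.T @ J @ M, J)
cond1 = np.allclose(A1.T @ A2, A2.T @ A1)        # A = (A1; A2) is a symmetric pair
cond2 = np.allclose(B1.T @ B2, B2.T @ B1)        # B = (B1; B2) is a symmetric pair
cond3 = np.allclose(A1.T @ B2 - A2.T @ B1, I)    # A is conjugate to B
```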

Proposition 3.22. If M'JM = J, then MJM' = J. More generally, if K² = ±I, then M'KM = K iff MKM' = K.

Proof. If M'KM = K, then K = (M')^{-1}KM^{-1}, so

K^{-1} = ((M')^{-1}KM^{-1})^{-1} = MK^{-1}M'.

By assumption, K² = ±I, and so K^{-1} = ±K. Therefore, MKM' = K. □
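For K = J we have J² = −I, so Proposition 3.22 applies; the equivalence of the two conditions can be checked numerically. A minimal sketch (our own, using a hypothetical M built from symplectic generators):

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])

S = rng.standard_normal((n, n)); S = S + S.T            # symmetric
Q = rng.standard_normal((n, n)) + 5 * np.eye(n)         # non-singular (diag. dominant)
M2 = np.block([[I, S], [O, I]])                          # type (II) generator
M1 = np.block([[Q, O], [O, np.linalg.inv(Q).T]])         # type (I) generator
M = M1 @ M2                                              # symplectic product

# K = J satisfies K^2 = -I, so M'JM = J should hold iff M J M' = J.
forward = np.allclose(M.T @ J @ M, J)
backward = np.allclose(M @ J @ M.T, J)
```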

If M ∈ Sp(2n), then A is a symmetric pair iff MA is a symmetric pair, and A is conjugate to B iff MA is conjugate to MB.

If M ∈ CSp(2n), then A is a symmetric pair iff MA is a symmetric pair, and A is conformally conjugate to B iff MA is conformally conjugate to MB.

M = [A, B] ∈ O_{2n} ⇐⇒ M'M = I_{2n} ⇐⇒ A'A = B'B = I_n, A'B = O_n.

M = [A, B] ∈ U_n = Sp(2n) ∩ O_{2n} = GL(n, C) ∩ Sp(2n) = GL(n, C) ∩ O_{2n}
⇐⇒ A'JA = B'JB = O_n, A'B = O_n, A'JB = A'A = B'B = I_n
⇐⇒ A is a non-singular symmetric pair, A'A = I_n, B = J^{-1}A
(see Theorem 3.34).

Theorem 3.23. Let M be non-singular. Then M ∈ CSp(2n) iff

MZ ∈ SM^{2n,n}, ∀Z ∈ SM^{2n,n}, Z non-singular.

Proof. We only need to prove the sufficiency. Let $M = [A, B] = \begin{pmatrix} A_1 & B_1 \\ A_2 & B_2 \end{pmatrix}$.

1° Take $Z = \begin{pmatrix} I \\ O \end{pmatrix} \in SM^{2n,n}$; then $MZ = [A, B]\begin{pmatrix} I \\ O \end{pmatrix} = A$. By assumption, MZ ∈ SM^{2n,n}, i.e., (MZ)'J(MZ) = A'JA = O.

2° Take $Z = \begin{pmatrix} O \\ I \end{pmatrix} \in SM^{2n,n}$; similarly, B'JB = O.

3° Take $Z = \begin{pmatrix} S \\ I \end{pmatrix}$, S' = S, and so Z ∈ SM^{2n,n}. By (MZ)'J(MZ) = O, we have

(AS + B)'J(AS + B) = S'A'JAS + B'JB + S'A'JB + B'JAS = S'A'JB + B'JAS = O.

Let C = A'JB; since B'JA = −C', this reads SC = C'S, ∀S' = S. Take S = I; then C' = C, and CS = SC, ∀S' = S. This shows that C must be μI, i.e., A'JB = μI.

That μ ≠ 0 follows from |M| ≠ 0. In fact, if μ = 0, then A'JB = O. Hence, A'JA = A'JB = O. Thus, A'J[A, B] = A'JM = O. This leads to A'J = O, and therefore A = O, a contradiction. Therefore, M ∈ CSp(2n). □

Remark 3.24. If CS = SC, ∀S  = S, then C = μI.



2.3.3 Lagrangian Subspaces

Theorem 3.25. {A} ∈ Λ_n ⇔ A is a non-singular symmetric pair. M ∈ Sp(2n) or CSp(2n) implies that {A} ∈ Λ_n iff M{A} ∈ Λ_n.

Examples of Lagrangian subspaces — coordinate Lagrangian subspaces[Arn89,Wei77,AM78,HW63]:

$R^{\nu,0} = R^{\nu,\bar\nu} = \left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\}$, in which $\begin{pmatrix} I \\ O \end{pmatrix}$ is a non-singular symmetric pair;

$R^{0,\nu} = R^{\bar\nu,\nu} = \left\{\begin{pmatrix} O \\ I \end{pmatrix}\right\}$, in which $\begin{pmatrix} O \\ I \end{pmatrix}$ is a non-singular symmetric pair;

$R^{\alpha,\bar\alpha} = \{I^{\alpha,\bar\alpha}\}$, in which $I^{\alpha,\bar\alpha} = \begin{pmatrix} I_\alpha \\ I_{\bar\alpha} \end{pmatrix}$ is a non-singular symmetric pair.
Proposition 3.26. We have the following results:
1° Let {A} be k-dim, {B} be l-dim. Then, {A} ⊂ {B}^{⊥_J} iff A'JB = O_{k×l}.
2° Let A, B be non-singular. Then, {A} ∩ {B} = {0} iff [A, B] is non-singular.
3° If dim{A} = dim{B} = n, then {A} ∩ {B} = {0} iff

det [A, B] ≠ 0.

4° {A} is isotropic of dim k ⇔ A'JA = O_k.
5° {A} is Lagrangian ⇔ A'JA = O_n.
6° A k-dimensional subspace {A} is non-degenerate iff |A'JA| ≠ 0, k = 2s, iff ∃B such that {B} = {A}, B'JB = J_{2s}.
7° {A} is degenerate ⇔ |A'JA| = 0.
Theorem 3.7 to Theorem 3.14 can be restated in matrix language as follows:

Theorem 3.27. If A'JA = O_k and A is non-singular, then there exists B ∈ M(2n, n − k) such that [A, B] is a non-singular symmetric pair.

Theorem 3.28. If A ∈ M(2n, n) is a non-singular symmetric pair, then there exists a matrix B ∈ M(2n, n) such that [A, B] ∈ Sp(2n).

Theorem 3.29. If A, C ∈ M(2n, n) are two non-singular symmetric pairs and det [A, C] ≠ 0, then there exists a unique non-singular symmetric pair B such that B ∼ C and [A, B] ∈ Sp(2n).

Theorem 3.30. Let A ∈ M(2n, k), B ∈ M(2n, l), A'JA = O_k, B'JB = O_l, and let [A, B] be non-singular. Then, there exist C, D such that [A, C], [B, D] are non-singular symmetric pairs and det [A, C, B, D] ≠ 0.

Theorem 3.31. If $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$ is a non-singular symmetric pair, then ∃α ⊂ ν such that

|I_α A_1 + I_{ᾱ} A_2| ≠ 0.

Theorem 3.32. For two mutually transversal Lagrangian subspaces {A}, {B}, there always exists a third Lagrangian subspace {C}, transversal to {A} and {B}.

Proof. Take {A} = {a_1, a_2, · · · , a_n}, {B} = {b_1, b_2, · · · , b_n}, such that

A'JA = O, B'JB = O, A'JB = I_n, B'JA = −I_n.

Set C = A + B. Then,

(A + B)'J(A + B) = A'JA + A'JB + B'JA + B'JB = O;

and

det [A, A + B] = det [A, B] ≠ 0,
det [B, A + B] = det [B, A] ≠ 0.

The theorem is proved. □
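The construction C = A + B of Theorem 3.32 can be illustrated numerically with the coordinate Lagrangian pair R^{ν,0}, R^{0,ν}. A minimal sketch (our own, not from the text):

```python
import numpy as np

n = 3
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])
A = np.vstack([I, O])          # basis of R^{nu,0}
B = np.vstack([O, I])          # basis of R^{0,nu}
C = A + B                      # the third Lagrangian subspace of Theorem 3.32

lagrangian = lambda X: np.allclose(X.T @ J @ X, np.zeros((n, n)))
transversal = lambda X, Y: abs(np.linalg.det(np.hstack([X, Y]))) > 1e-9

ok = (lagrangian(A) and lagrangian(B) and lagrangian(C)
      and transversal(A, B) and transversal(A, C) and transversal(B, C))
```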
Theorem 3.33. For any two Lagrangian subspaces {A}, {B}, there exists another Lagrangian subspace {C}, transversal to {A} and {B}.

Proof. Assume U^0 = {A} ∩ {B} ≠ {0} (otherwise Theorem 3.32 applies). Take

{a_1, · · · , a_k} ⊂ {A}, {b_1, · · · , b_k} ⊂ {B}, {c_1, · · · , c_{n−k}} = U^0,

such that

{a_1, · · · , a_k, c_1, · · · , c_{n−k}} = {A},
{b_1, · · · , b_k, c_1, · · · , c_{n−k}} = {B},
[a_i, a_j] = 0 = [b_i, b_j],
[a_i, b_j] = δ_{ij}, i, j = 1, · · · , k.

Set C = {a_1 + b_1, · · · , a_k + b_k, d_1, · · · , d_{n−k}}, where

[d_i, d_j] = 0, [d_i, c_j] = δ_{ij},
[d_i, a_j] = [d_i, b_j] = 0.

Then, {C} is what we want to find. □

2.3.4 Special Types of Sp(2n)

Set $M = [A, B] = \begin{pmatrix} A_1 & B_1 \\ A_2 & B_2 \end{pmatrix} \in Sp(2n)$. The following are special types of Sp(2n).
(0) M ∈ Sp(2n) ∩ O_{2n} = U_n

=⇒ M = [A, J^{-1}A], A'JA = O_n, A'A = I_n.

(I) $Sp(2n) \ni M = \begin{pmatrix} A_1 & O \\ O & B_1 \end{pmatrix}$ (diagonal blocks)

$$\Longrightarrow M = \begin{pmatrix} A_1 & O \\ O & A_1'^{-1} \end{pmatrix}, \qquad A_1 \in GL(n, \mathbf{R}).$$

(II) $Sp(2n) \ni M = \begin{pmatrix} I & B_1 \\ O & B_2 \end{pmatrix} \Longrightarrow M = \begin{pmatrix} I & S \\ O & I \end{pmatrix}$, S' = S.

(II') $Sp(2n) \ni M = \begin{pmatrix} A_1 & O \\ A_2 & I \end{pmatrix} \Longrightarrow M = \begin{pmatrix} I & O \\ S & I \end{pmatrix}$, S' = S.

(III) $Sp(2n) \ni M = J_\alpha = \begin{pmatrix} I_{\bar\alpha} & I_\alpha \\ -I_\alpha & I_{\bar\alpha} \end{pmatrix}$, α ⊂ ν, and the symplectic substitution

$$J_{\bar\alpha} = \begin{pmatrix} I_\alpha & I_{\bar\alpha} \\ -I_{\bar\alpha} & I_\alpha \end{pmatrix}.$$

1. Several special types

(1) Sp_{2n}(0).

Sp_{2n}(0) = Sp_{2n} ∩ O_{2n} = U_n.

Theorem 3.34. M = [A, B] ∈ Sp_{2n}(0) = U_n ⇔ B = J^{-1}A, A'A = I, A'JA = O.

Proof. Evidently, if A'A = I, A'JA = O, then [A, J^{-1}A] ∈ Sp_{2n}(0). Conversely,

M = [A, B] ∈ Sp_{2n} =⇒ A'JA = O, A'JB = I, B'JA = −I;
M = [A, B] ∈ O_{2n} =⇒ A'B = O, B'B = I.

From A'JA = O and A'B = O, we get A'(JA + B) = O; from B'JA = −I and B'B = I, we get B'(JA + B) = O. Hence

M'(JA + B) = O =⇒ JA + B = O =⇒ B = −JA = J^{-1}A.

The theorem is proved. □

Lemma 3.35. There exist polynomials φ_n(A) = φ_n(a_{jk}), ψ_n(A) = ψ_n(a_{jk}) in the 2n × n variables a_{11}, · · · , a_{2n,n} with integer coefficients such that

$$\det [A, J^{-1}A] = \det \begin{pmatrix} A_1 & -A_2 \\ A_2 & A_1 \end{pmatrix} = (\varphi_n(A))^2 + (\psi_n(A))^2 \geq 0,$$

where

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{2n,1} & \cdots & a_{2n,n} \end{pmatrix} = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}.$$

Proof.

$$\begin{vmatrix} A_1 & -A_2 \\ A_2 & A_1 \end{vmatrix} = \begin{vmatrix} A_1 + iA_2 & -A_2 + iA_1 \\ A_2 & A_1 \end{vmatrix} = \begin{vmatrix} A_1 + iA_2 & -A_2 + iA_1 - i(A_1 + iA_2) \\ A_2 & A_1 - iA_2 \end{vmatrix}$$
$$= \begin{vmatrix} A_1 + iA_2 & O \\ A_2 & A_1 - iA_2 \end{vmatrix} = |A_1 + iA_2|\,|A_1 - iA_2| = |A_1 + iA_2|\,\overline{|A_1 + iA_2|}$$
$$= (\mathrm{Re}\,|A_1 + iA_2|)^2 + (\mathrm{Im}\,|A_1 + iA_2|)^2 = (\varphi_n(A))^2 + (\psi_n(A))^2 \geq 0.$$

Therefore, the lemma is completed. □
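Lemma 3.35 amounts to det[A, J⁻¹A] = |det(A₁ + iA₂)|². A minimal numerical sanity check (our own, for a random A):

```python
import numpy as np

n = 3
rng = np.random.default_rng(2)
A = rng.standard_normal((2 * n, n))
A1, A2 = A[:n, :], A[n:, :]
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

Jinv_A = np.linalg.solve(J, A)                     # J^{-1} A = (-A2; A1)
lhs = np.linalg.det(np.hstack([A, Jinv_A]))        # det [A, J^{-1}A]
rhs = abs(np.linalg.det(A1 + 1j * A2)) ** 2        # |det(A1 + i A2)|^2
```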
Theorem 3.36. M ∈ Sp_{2n}(0) = U_n ⇒ |M| = 1.

Proof. From M'M = I we get |M|² = 1, so |M| = ±1. By Lemma 3.35, M = [A, J^{-1}A] implies |M| ≥ 0. Therefore |M| = 1. □
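Theorems 3.34 and 3.36 can be checked together numerically. The following sketch is our own illustration: a matrix A with A'A = I and A'JA = O is produced from a random unitary X + iY (a standard construction, not taken from the text), and M = [A, J⁻¹A] is verified to be orthogonal, symplectic, and of determinant +1.

```python
import numpy as np

n = 3
rng = np.random.default_rng(3)
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(Z)                 # U = X + iY unitary
X, Y = U.real, U.imag
A = np.vstack([X, Y])                  # then A'A = I and A'JA = O

J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
M = np.hstack([A, np.linalg.solve(J, A)])   # M = [A, J^{-1}A]

orth = np.allclose(M.T @ M, np.eye(2 * n))
sympl = np.allclose(M.T @ J @ M, J)
detM = np.linalg.det(M)
```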
(2) Sp_{2n}(I).

$$Sp_{2n}(\mathrm{I}) \ni M = \begin{pmatrix} A_1 & O \\ O & A_1'^{-1} \end{pmatrix}, \qquad A_1 \in GL(n, \mathbf{R}), \qquad |M| = |A_1|\,|A_1|^{-1} = 1.$$

Every A_1 ∈ GL(n, R) has the polar decomposition A_1 = TP, T'T = I_n, P' = P > 0 (positive definite). M acts on R^{2n} by

$$\begin{pmatrix} p \\ q \end{pmatrix} \longmapsto M\begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} A_1 p \\ (A_1'^{-1}) q \end{pmatrix},$$

and

$$\begin{pmatrix} A_1 & O \\ O & A_1'^{-1} \end{pmatrix} = \begin{pmatrix} T & O \\ O & T \end{pmatrix}\begin{pmatrix} P & O \\ O & P^{-1} \end{pmatrix}, \qquad \begin{pmatrix} T & O \\ O & T \end{pmatrix} \in O_n \subset U_n,$$

where

$$O_n = \left\{\begin{pmatrix} T & O \\ O & T \end{pmatrix},\ T'T = I_n\right\} = U_n \cap GL(n, \mathbf{R}) = Sp_{2n}(0) \cap Sp_{2n}(\mathrm{I}),$$

$$\left\{\begin{pmatrix} P & O \\ O & P^{-1} \end{pmatrix},\ P' = P > 0\right\} \ \text{is not a group},$$

$$\left\{\begin{pmatrix} P & O \\ O & P^{-1} \end{pmatrix},\ P' = P > 0\right\} \cap Sp_{2n}(0) = \{I_{2n}\}.$$

(3) Sp_{2n}(II).

$$Sp_{2n}(\mathrm{II}) = \left\{\begin{pmatrix} I & S \\ O & I \end{pmatrix},\ S' = S\right\} \simeq Sp_{2n}(\mathrm{II}') = \left\{\begin{pmatrix} I & O \\ S & I \end{pmatrix},\ S' = S\right\},$$

each a multiplicative group, isomorphic to SM(n) = {S, S' = S}, which is an additive group.

(4) Sp_{2n}(I, II).

$$Sp_{2n}(\mathrm{I,II}) = \left\{\begin{pmatrix} Q & QS \\ O & Q'^{-1} \end{pmatrix} = \begin{pmatrix} Q & O \\ O & Q'^{-1} \end{pmatrix}\begin{pmatrix} I & S \\ O & I \end{pmatrix},\ Q \in GL(n, \mathbf{R}),\ S' = S\right\}.$$

M ∈ Sp_{2n}(I, II) =⇒ |M| = 1.
Sp_{2n}(II) ∩ Sp_{2n}(0) = Sp_{2n}(II') ∩ Sp_{2n}(0) = {I}.

(5) Sp_{2n}(III).

$$J_\alpha = \begin{pmatrix} I_{\bar\alpha} & I_\alpha \\ -I_\alpha & I_{\bar\alpha} \end{pmatrix}, \qquad J_\alpha' = J_\alpha^{-1} = \begin{pmatrix} I_{\bar\alpha} & -I_\alpha \\ I_\alpha & I_{\bar\alpha} \end{pmatrix}, \qquad \alpha \subset \nu.$$

In particular, J_ν = J_{2n}, J_∅ = I_{2n}, |J_α| = 1. {Symplectic substitutions} = {J_α, J_α'} is not a group. The total number of J_α is 2^n; the total number of {J_α, J_α'} is 2^{n+1} − 1, and

$$\begin{pmatrix} I & O \\ -S & I \end{pmatrix} = J_{2n}'\begin{pmatrix} I & S \\ O & I \end{pmatrix}J_{2n},$$

which gives the bijection from Sp_{2n}(II) to Sp_{2n}(II').

2. Some theorems about Sp(2n)
Theorem 3.37. If {A}, {B} ∈ Λn , then there exists M ∈ Sp(2n), such that M A =
B. Moreover, M {A} = {B}.

Proof. By Theorem 3.28, there exist C, D ∈ M(2n, n) such that

[A, C] = M_A ∈ Sp(2n), [B, D] = M_B ∈ Sp(2n).

Set M = M_B M_A^{-1}. Then M M_A = M_B, i.e., M[A, C] = [B, D]. Therefore, MA = B. □

Theorem 3.38. If {A}, {B} ∈ Λn , then there exists

M ∈ Sp2n (0) = Un = Sp2n ∩ O2n ,

such that
M {A} = {B}.

Proof. By the Gram–Schmidt orthonormalization procedure, we can get two matrices C and D of order 2n × n, i.e., C, D ∈ M(2n, n), such that

{C} = {A}, C'JC = O, C'C = I;
{D} = {B}, D'JD = O, D'D = I.

Set M_C = [C, J^{-1}C], M_D = [D, J^{-1}D]. Obviously,

M_C, M_D ∈ Sp_{2n}(0) = U_n.

Set M = M_D M_C^{-1}. Then M M_C = M_D, i.e.,

M[C, J^{-1}C] = [D, J^{-1}D].

Thus, MC = D. We obtain M{C} = {MC} = {D}, i.e., M{A} = {B}. □

Theorem 3.39. Let G = CSp(2n), Sp(2n), or Sp_{2n}(0) = U_n. Then the action of G on Λ_n defined by M{A} = {MA} ({A} ∈ Λ_n, M ∈ G) is
1° Transitive, i.e., ∀{A}, {B} ∈ Λ_n, ∃M ∈ G such that M{A} = {B}.
2° If M{A} = {A} for every {A} ∈ Λ_n, then M = ±I_{2n} when G = Sp(2n) or Sp_{2n}(0), and M = μI_{2n} when G = CSp(2n).

Proof. 1° can be obtained by Theorem 3.37 and Theorem 3.38.

2° Assume $M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$. Taking $\{A\} = \left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\}, \left\{\begin{pmatrix} O \\ I \end{pmatrix}\right\} \in \Lambda_n$, respectively, we have

$$M\left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\} \quad \text{and} \quad M\left\{\begin{pmatrix} O \\ I \end{pmatrix}\right\} = \left\{\begin{pmatrix} O \\ I \end{pmatrix}\right\},$$

i.e., for some Q, Q̃ ∈ GL(n),

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} I \\ O \end{pmatrix} = \begin{pmatrix} A \\ C \end{pmatrix} = \begin{pmatrix} I \\ O \end{pmatrix}Q = \begin{pmatrix} Q \\ O \end{pmatrix},$$

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} O \\ I \end{pmatrix} = \begin{pmatrix} B \\ D \end{pmatrix} = \begin{pmatrix} O \\ I \end{pmatrix}\tilde{Q} = \begin{pmatrix} O \\ \tilde{Q} \end{pmatrix},$$

and so B = C = O. Again, take $\{A\} = \left\{\begin{pmatrix} I \\ P \end{pmatrix}\right\}$, P' = P. Then, from the equality

$$M\left\{\begin{pmatrix} I \\ P \end{pmatrix}\right\} = \left\{\begin{pmatrix} I \\ P \end{pmatrix}\right\},$$

it follows that

$$\begin{pmatrix} A & O \\ O & D \end{pmatrix}\begin{pmatrix} I \\ P \end{pmatrix} = \begin{pmatrix} A \\ DP \end{pmatrix} = \begin{pmatrix} I \\ P \end{pmatrix}Q = \begin{pmatrix} Q \\ PQ \end{pmatrix},$$

for some Q ∈ GL(n) and ∀P' = P, i.e., A = Q and DP = PQ = PA. Setting P = I gives D = A. Then, we get

AP = PA, ∀P' = P.

This implies that A = μI (μ ≠ 0, since |M| ≠ 0), and M = μI_{2n}. Besides, if M ∈ Sp(2n) or Sp_{2n}(0), then μ²I = I, i.e., μ² = 1. Therefore, we have M = ±I. □
Theorem 3.40. Λ_n ≅ U_n/O_n = Sp_{2n}(0)/(Sp_{2n}(0) ∩ Sp_{2n}(I)); that is,

$$\varphi : U_n/O_n \longrightarrow \Lambda_n, \qquad \varphi(M O_n) = \left\{M\begin{pmatrix} I \\ O \end{pmatrix}\right\}, \quad M \in U_n,$$

is a bijection from U_n/O_n to Λ_n.

Proof. First, φ is well-defined. In fact, if M_1 O_n = M_2 O_n, or M_1 ∈ M_2 O_n, then $M_1 = M_2\begin{pmatrix} T & O \\ O & T \end{pmatrix}$ for some T'T = I. For any Q ∈ GL(n),

$$M_1\begin{pmatrix} I \\ O \end{pmatrix}Q = M_2\begin{pmatrix} T & O \\ O & T \end{pmatrix}\begin{pmatrix} I \\ O \end{pmatrix}Q = M_2\begin{pmatrix} T \\ O \end{pmatrix}Q = M_2\begin{pmatrix} I \\ O \end{pmatrix}TQ,$$

so $\left\{M_1\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{M_2\begin{pmatrix} I \\ O \end{pmatrix}\right\}$.

It follows from Theorem 3.38 that φ is surjective. We will prove that φ is injective. If $\left\{M_1\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{M_2\begin{pmatrix} I \\ O \end{pmatrix}\right\}$, then $\left\{M_2^{-1}M_1\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\}$.

Set $M = M_2^{-1}M_1 = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in Sp_{2n}(0)$. Then $M\left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\}$, i.e.,

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} I \\ O \end{pmatrix} = \begin{pmatrix} A \\ C \end{pmatrix} = \begin{pmatrix} I \\ O \end{pmatrix}Q.$$

Thus, C = O.

Since $\begin{pmatrix} A & B \\ C & D \end{pmatrix} \in Sp_{2n}(0)$, by Theorem 3.34:

$$\begin{pmatrix} B \\ D \end{pmatrix} = J^{-1}\begin{pmatrix} A \\ O \end{pmatrix} = \begin{pmatrix} O & -I \\ I & O \end{pmatrix}\begin{pmatrix} A \\ O \end{pmatrix} = \begin{pmatrix} O \\ A \end{pmatrix}$$

and

$$[A', O]\begin{pmatrix} A \\ O \end{pmatrix} = A'A = I.$$

This means that $M = \begin{pmatrix} A & O \\ O & A \end{pmatrix}$, A'A = I, i.e., M ∈ O_n, and hence M_1 O_n = M_2 O_n. □

Theorem 3.41. Λ_n ≅ Sp(2n)/Sp_{2n}(I, II); that is, the mapping

$$\varphi : Sp(2n)/Sp_{2n}(\mathrm{I,II}) \longrightarrow \Lambda_n, \qquad \varphi(M\, Sp_{2n}(\mathrm{I,II})) = \left\{M\begin{pmatrix} I \\ O \end{pmatrix}\right\}, \quad M \in Sp(2n),$$

is a bijection.

Proof. If M_1 Sp_{2n}(I, II) = M_2 Sp_{2n}(I, II), then M_1 ∈ M_2 Sp_{2n}(I, II), i.e.,

$$M_1 = M_2\begin{pmatrix} Q & QS \\ O & Q'^{-1} \end{pmatrix} \quad \text{for some } Q \in GL(n) \text{ and } S \in SM(n).$$

Thus, for any P ∈ GL(n),

$$M_1\begin{pmatrix} I \\ O \end{pmatrix}P = M_2\begin{pmatrix} Q & QS \\ O & Q'^{-1} \end{pmatrix}\begin{pmatrix} I \\ O \end{pmatrix}P = M_2\begin{pmatrix} Q \\ O \end{pmatrix}P = M_2\begin{pmatrix} I \\ O \end{pmatrix}QP,$$

so $\left\{M_1\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{M_2\begin{pmatrix} I \\ O \end{pmatrix}\right\}$. This implies that φ is well-defined.

By Theorem 3.37, we know that φ is surjective.

Last, φ is injective too. As a matter of fact, if $\left\{M_1\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{M_2\begin{pmatrix} I \\ O \end{pmatrix}\right\}$, then $\left\{M_2^{-1}M_1\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\}$.

Set $M = M_2^{-1}M_1 = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$; then $M\left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\} = \left\{\begin{pmatrix} I \\ O \end{pmatrix}\right\}$, i.e.,

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} I \\ O \end{pmatrix} = \begin{pmatrix} A \\ C \end{pmatrix} = \begin{pmatrix} I \\ O \end{pmatrix}P = \begin{pmatrix} P \\ O \end{pmatrix}, \quad \text{for some } P \in GL(n).$$

Thus, C = O, and

$$M = \begin{pmatrix} A & B \\ O & D \end{pmatrix} = \begin{pmatrix} A & O \\ O & A'^{-1} \end{pmatrix}\begin{pmatrix} I & B_1 \\ O & D_1 \end{pmatrix} \in Sp(2n),$$

where B_1 = A^{-1}B, D_1 = A'D. Since M ∈ Sp(2n) and $\begin{pmatrix} A & O \\ O & A'^{-1} \end{pmatrix} \in Sp(2n)$, the factor $\begin{pmatrix} I & B_1 \\ O & D_1 \end{pmatrix}$ must be symplectic too.

By definition, $\begin{pmatrix} I & B_1 \\ O & D_1 \end{pmatrix} \in Sp_{2n}(\mathrm{II})$. Therefore, M ∈ Sp_{2n}(I, II). □

2.3.5 Generators of Sp(2n)

Theorem 3.42. Every symplectic matrix M can be decomposed as the product of three kinds of special symplectic matrices:

M = M_0 M_1 M_2,

where

M_0 ∈ Sp_{2n}(0), M_1 ∈ Sp_{2n}(I), M_2 ∈ Sp_{2n}(II).

Proof. Let M = [A, B] ∈ Sp_{2n}; then {A} ∈ Λ_n. By Theorem 3.38, there exists M_0 ∈ Sp_{2n}(0) = U_n such that, for some Q ∈ GL(n),

$$A = M_0\begin{pmatrix} I \\ O \end{pmatrix}Q = M_0\begin{pmatrix} Q \\ O \end{pmatrix} = M_0\begin{pmatrix} Q & O \\ O & Q'^{-1} \end{pmatrix}\begin{pmatrix} I \\ O \end{pmatrix}.$$

Let

$$\begin{pmatrix} C_1 \\ C_2 \end{pmatrix} = \begin{pmatrix} Q & O \\ O & Q'^{-1} \end{pmatrix}^{-1}M_0^{-1}B, \quad \text{i.e.,} \quad B = M_0\begin{pmatrix} Q & O \\ O & Q'^{-1} \end{pmatrix}\begin{pmatrix} C_1 \\ C_2 \end{pmatrix}.$$

Thus,

$$M = [A, B] = M_0\begin{pmatrix} Q & O \\ O & Q'^{-1} \end{pmatrix}\begin{pmatrix} I & C_1 \\ O & C_2 \end{pmatrix} = M_0 M_1 M_2,$$

where $M_1 = \begin{pmatrix} Q & O \\ O & Q'^{-1} \end{pmatrix} \in Sp_{2n}(\mathrm{I})$, and $M_2 = \begin{pmatrix} I & C_1 \\ O & C_2 \end{pmatrix} = M_1^{-1}M_0^{-1}M$ is symplectic; a symplectic matrix of this form necessarily has C_2 = I and C_1' = C_1, so M_2 ∈ Sp_{2n}(II). □

Proposition 3.43. |M | = 1, ∀ M ∈ Sp(2n).

Theorem 3.44. The following decomposition of a symplectic matrix M ∈ Sp(2n) is unique:

M = M_0 M_1 M_2,

where $M_0 \in Sp_{2n}(0)$, $M_1 = \begin{pmatrix} P & O \\ O & P^{-1} \end{pmatrix}$, P' = P > 0, M_2 ∈ Sp_{2n}(II).

Proof. A non-singular matrix Q ∈ GL(n) can be uniquely decomposed as the product of an orthogonal matrix T and a positive definite matrix P: Q = TP. By Theorem 3.42, we have a decomposition

$$M = \bar{M}_0\bar{M}_1\bar{M}_2 = \bar{M}_0\begin{pmatrix} Q & O \\ O & Q'^{-1} \end{pmatrix}\bar{M}_2 = \bar{M}_0\begin{pmatrix} T & O \\ O & T \end{pmatrix}\begin{pmatrix} P & O \\ O & P^{-1} \end{pmatrix}\bar{M}_2 = M_0 M_1 M_2,$$

where

$$M_0 = \bar{M}_0\begin{pmatrix} T & O \\ O & T \end{pmatrix} \in Sp_{2n}(0), \quad M_2 = \bar{M}_2 \in Sp_{2n}(\mathrm{II}), \quad M_1 = \begin{pmatrix} P & O \\ O & P^{-1} \end{pmatrix}, \ P' = P > 0.$$

We need to prove that such a decomposition is unique.

Suppose M = M_{01}M_{11}M_{21} is another decomposition, where

$$M_{11} = \begin{pmatrix} P_1 & O \\ O & P_1^{-1} \end{pmatrix}, \ P_1' = P_1 > 0, \qquad M_{21} = \begin{pmatrix} I & S_1 \\ O & I \end{pmatrix}.$$

By the equality M_0M_1M_2 = M_{01}M_{11}M_{21}, we have

$$Sp_{2n}(0) \ni M_{01}^{-1}M_0 = M_{11}M_{21}M_2^{-1}M_1^{-1} = \begin{pmatrix} P_1 & O \\ O & P_1^{-1} \end{pmatrix}\begin{pmatrix} I & S_1 \\ O & I \end{pmatrix}\begin{pmatrix} I & -S \\ O & I \end{pmatrix}\begin{pmatrix} P^{-1} & O \\ O & P \end{pmatrix}$$
$$= \begin{pmatrix} P_1 & O \\ O & P_1^{-1} \end{pmatrix}\begin{pmatrix} I & S_1 - S \\ O & I \end{pmatrix}\begin{pmatrix} P^{-1} & O \\ O & P \end{pmatrix} = \begin{pmatrix} P_1P^{-1} & P_1(S_1 - S)P \\ O & P_1^{-1}P \end{pmatrix} \in Sp_{2n}(0).$$

Thus, by Theorem 3.34,

$$\begin{pmatrix} P_1(S_1 - S)P \\ P_1^{-1}P \end{pmatrix} = J^{-1}\begin{pmatrix} P_1P^{-1} \\ O \end{pmatrix} = \begin{pmatrix} O & -I \\ I & O \end{pmatrix}\begin{pmatrix} P_1P^{-1} \\ O \end{pmatrix} = \begin{pmatrix} O \\ P_1P^{-1} \end{pmatrix}.$$

Then, P_1(S_1 − S)P = O and P_1^{-1}P = P_1P^{-1}, i.e., S_1 = S and P_1² = P². Therefore, P_1 = P, since P_1, P are positive definite. It follows that M_{01}^{-1}M_0 = I, i.e., M_0 = M_{01}. This is what we need to prove. □
!
A1 B1
Theorem 3.45. If M = ∈ Sp(2n) with |A1 | = 0, then it can be de-
A2 B2
composed as
M = M 2  M1 M2 ,
where M2 ∈ Sp2n (II ), M1 ∈ Sp2n (I), M2 ∈ Sp2n (II).
This leads to |M | = 1.
Proof. By Gauss elimination,
! ! !
A1 B1 I −A−1 1 B1 A1 O
= ,
A2 B2 O I A2 B2 − A2 A−1
1 B1
! ! !
I O A1 O A1 O
= .
−A2 A−1
1 I A2 B2 − A2 A−1
1 B1 O B2 − A2 A−1 1 B1

M = [A, B] ∈ Sp(2n), and so A JB = I and A is a symmetric pair, i.e.,

A1 B2 − A2 B1 = I, A2 A−1


1 ∈ SM .

Thus,
 
A−1
1 = B2 − A−1  −1  −1
1 A2 B1 = B2 − (A2 A1 ) B1 = B2 − A2 A1 B1 .

Obvioualy,
! !
A1 O A1 O
M1 = −1 =  ∈ Sp2n (I),
O B2 − A2 A1 B1 O A−1
1
!−1
I O
M2 = ∈ Sp2n (II ),
−A2 A−1
1 I
!−1
I −A−1 1 B1
M2 = ∈ Sp2n (II),
O I

by which we get the decomposition M = M2 M1 M2 . 
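The explicit factors of Theorem 3.45 can be verified numerically. The following is our own sketch: M is a hypothetical symplectic matrix assembled from generators (so |A₁| ≠ 0 holds for this example), and the three factors built from its blocks recombine to M.

```python
import numpy as np

n = 2
rng = np.random.default_rng(4)
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])

S = rng.standard_normal((n, n)); S = S + S.T
T = rng.standard_normal((n, n)); T = T + T.T
Q = rng.standard_normal((n, n)) + 4 * np.eye(n)      # non-singular
M = (np.block([[I, O], [T, I]])
     @ np.block([[Q, O], [O, np.linalg.inv(Q).T]])
     @ np.block([[I, S], [O, I]]))                   # symplectic, A1 = Q invertible

A1, B1 = M[:n, :n], M[:n, n:]
A2 = M[n:, :n]
M2p = np.block([[I, O], [A2 @ np.linalg.inv(A1), I]])   # type (II') factor
M1 = np.block([[A1, O], [O, np.linalg.inv(A1).T]])      # type (I) factor
M2 = np.block([[I, np.linalg.inv(A1) @ B1], [O, I]])    # type (II) factor

ok = np.allclose(M2p @ M1 @ M2, M)
ok_sympl = np.allclose(M.T @ J @ M, J)
ok_det = np.isclose(np.linalg.det(M), 1.0)
```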



Theorem 3.46. Every symplectic matrix M can be decomposed as

M = M_3 M_{2'} M_1 M_2,

where M_3 ∈ Sp_{2n}(III), M_{2'} ∈ Sp_{2n}(II'), M_1 ∈ Sp_{2n}(I), M_2 ∈ Sp_{2n}(II), and this yields |M| = 1 too.

Proof. It follows from Theorem 3.31 that there exists α ⊂ ν such that $J_\alpha^{-1}\begin{pmatrix} A_1 & B_1 \\ A_2 & B_2 \end{pmatrix} = \begin{pmatrix} C_1 & D_1 \\ C_2 & D_2 \end{pmatrix}$ with |C_1| ≠ 0; then apply Theorem 3.45 to J_α^{-1}M and take M_3 = J_α. □

2.3.6 Eigenvalues of Symplectic and Infinitesimal Matrices


Definition 3.47. A real polynomial Pm (λ) = a0 λm + a1 λm−1 + · · · + am is called
reflective if Pm (λ) = λm Pm (1/λ).

It is easy to see that Pm is reflective if and only if ai = am−i (i = 0, · · · , m).


Lemma 3.48. We have the following results:
1◦ Q(λ) = b0 λ2 + b1 λ + b2 is reflective iff b0 = b2 , i.e., Q(λ) = b0 (λ − α)(λ −
1/α).
2◦ L(λ) = c0 λ + c1 is reflective iff c0 = c1 , i.e., L(λ) = c0 (λ + 1).

Property 3.49. We have the following properties:


1◦ P1 , P2 reflective ⇒ P1 · P2 reflective.
2° P = P_1 · P_2, P, P_1 reflective ⇒ P_2 reflective.
3° m = 2n, P_m reflective ⇒ $P_m = \prod_{i=1}^{n} Q_i(\lambda)$, Q_i reflective of order 2.
4° m = 2n + 1, P_m reflective ⇒ $P_m = L(\lambda)\prod_{i=1}^{n} Q_i(\lambda)$, Q_i, L reflective of order 2 and 1, respectively.

Lemma 3.50. The characteristic polynomial of a symplectic matrix M ∈ Sp(2n),


P (λ) = |M − λI| is reflective.

Theorem 3.51. [Arn89,AM78] Let λ_0 be an eigenvalue of a symplectic matrix M with multiplicity k. Then, $\lambda_0^{-1}$, $\bar\lambda_0$, $\bar\lambda_0^{-1}$ are also eigenvalues of M with the same multiplicity. If ±1 are eigenvalues of M, then their multiplicity is even.
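Lemma 3.50 and Theorem 3.51 can be observed numerically: the characteristic polynomial of a symplectic matrix has palindromic coefficients, and its eigenvalues come in reciprocal pairs. A minimal sketch (our own, with a hypothetical M built from symplectic generators):

```python
import numpy as np

n = 2
rng = np.random.default_rng(5)
I, O = np.eye(n), np.zeros((n, n))
S = rng.standard_normal((n, n)); S = S + S.T
T = rng.standard_normal((n, n)); T = T + T.T
M = np.block([[I, S], [O, I]]) @ np.block([[I, O], [T, I]])   # symplectic

# Lemma 3.50: the characteristic polynomial is reflective (palindromic).
coeffs = np.poly(M)                     # coefficients, highest degree first
ok_reflective = np.allclose(coeffs, coeffs[::-1])

# Theorem 3.51: for each eigenvalue, 1/lambda is also an eigenvalue.
ev = np.linalg.eigvals(M)
ok_recip = all(np.min(np.abs(ev - 1.0 / w)) < 1e-6 for w in ev)
```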

Possible cases of distribution of eigenvalues of a symplectic matrix of order 4 are


depicted in Fig. 3.1.

Definition 3.52. A real polynomial P2n (λ) = a0 λ2n + a1 λ2n−1 + · · · + a2n is even
if P (λ) = P (−λ).
[Figure: nine panels of eigenvalue configurations in the complex plane — complex saddle, saddle center, real saddle, generic center, degenerate saddle, identity, degenerate center — with multiplicities (2) and (4) as marked.]

Fig. 3.1. Distribution of eigenvalues of a symplectic matrix of Sp(4)

Obviously, P_{2n}(λ) is even iff a_{2i+1} = 0 (i = 0, 1, · · · , n − 1). Every even polynomial P_{2n}(λ) can be rewritten in the following form:

$$P_{2n}(\lambda) = a_0\prod_{i=1}^{n}(\lambda^2 - c_i).$$

Lemma 3.53. The characteristic polynomial of every infinitesimal symplectic matrix


is even.

Theorem 3.54. [Arn89,AM78] If λ_0 is an eigenvalue of an infinitesimal symplectic matrix B with multiplicity k, then $-\lambda_0$, $\bar\lambda_0$ and $-\bar\lambda_0$ are eigenvalues of B with the same multiplicity. If 0 is an eigenvalue of B, then its multiplicity is even.
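Lemma 3.53 and Theorem 3.54 have a direct numerical counterpart: for an infinitesimal symplectic matrix B (B'J + JB = 0, e.g. B = JC with C symmetric), the characteristic polynomial is even. A minimal sketch (our own illustration):

```python
import numpy as np

n = 2
rng = np.random.default_rng(6)
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])
C = rng.standard_normal((2 * n, 2 * n)); C = C + C.T   # symmetric
B = J @ C                                              # infinitesimal symplectic

ok_inf = np.allclose(B.T @ J + J @ B, np.zeros((2 * n, 2 * n)))
coeffs = np.poly(B)                      # degree-2n characteristic polynomial
ok_even = np.allclose(coeffs[1::2], 0)   # all odd-degree coefficients vanish
```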

The possible cases of distribution of eigenvalues of an infinitesimal symplectic


matrix of order 4 are depicted in Fig. 3.2.
[Figure: eigenvalue configurations on the real and imaginary axes, with multiplicities (2) and (4) as marked.]

Fig. 3.2. Distribution of eigenvalues of an infinitesimal symplectic matrix of sp(4)

2.3.7 Generating Functions for Lagrangian Subspaces

Special cases:

$$L = \left\{\begin{pmatrix} S \\ I \end{pmatrix}\right\} = \left\{\begin{pmatrix} p \\ q \end{pmatrix} \in \mathbf{R}^{2n}\ \Big|\ (p', q')J\begin{pmatrix} S \\ I \end{pmatrix} = O\right\} \quad (S' = S)$$

$$= \left\{\begin{pmatrix} p \\ q \end{pmatrix} \in \mathbf{R}^{2n}\ \Big|\ (p', q')\begin{pmatrix} I \\ -S \end{pmatrix} = O\right\} = \left\{\begin{pmatrix} p \\ q \end{pmatrix} \in \mathbf{R}^{2n}\ \Big|\ p = Sq\right\} = \left\{\begin{pmatrix} p \\ q \end{pmatrix} \in \mathbf{R}^{2n}\ \Big|\ p = \frac{\partial\varphi}{\partial q}\right\},$$

where we define $\varphi(q) = \frac{1}{2}q'Sq$, called a generating function [Wei77,FWQW89,Fen95,Ge91] for L.
Remark 3.55. There exists a generating function for every Lagrangian subspace transversal to R^{ν,0} or R^{0,ν}.
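The special case above is easy to verify numerically: for symmetric S, the columns of (S; I) span a Lagrangian subspace, and every point of it satisfies p = ∇φ(q) = Sq. A minimal sketch (our own, not from the text):

```python
import numpy as np

n = 3
rng = np.random.default_rng(7)
I, O = np.eye(n), np.zeros((n, n))
J = np.block([[O, I], [-I, O]])
S = rng.standard_normal((n, n)); S = S + S.T
A = np.vstack([S, I])                     # basis of L = {(p;q) : p = S q}

# A'JA = S' - S = O, so {A} is Lagrangian.
ok_lagrangian = np.allclose(A.T @ J @ A, np.zeros((n, n)))

q = rng.standard_normal(n)
p = S @ q                                 # p = grad phi(q) for phi(q) = q'Sq/2
x = np.concatenate([p, q])
ok_member = np.allclose(A @ q, x)         # (p;q) lies in the span of A
```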
Theorem 3.56. For a non-singular symmetric pair A, there is α ⊂ ν such that det(J_α^{-1}A)_1 ≠ 0 and det(J_{ᾱ}^{-1}A)_2 ≠ 0, where J_α^{-1}A and J_{ᾱ}^{-1}A are non-singular symmetric pairs.

Proof. By Theorem 3.14, there exists α ⊂ ν such that {A} +̇ {R^{α,ᾱ}} = R^{2n}. This shows that the matrix $\begin{pmatrix} I_\alpha & A_1 \\ I_{\bar\alpha} & A_2 \end{pmatrix}$ is non-singular.

Multiplying by J_α^{-1}, we have

$$J_\alpha^{-1}\begin{pmatrix} I_\alpha & A_1 \\ I_{\bar\alpha} & A_2 \end{pmatrix} = \begin{pmatrix} I_{\bar\alpha} & -I_\alpha \\ I_\alpha & I_{\bar\alpha} \end{pmatrix}\begin{pmatrix} I_\alpha & A_1 \\ I_{\bar\alpha} & A_2 \end{pmatrix} = \begin{pmatrix} O & (J_\alpha^{-1}A)_1 \\ I & (J_\alpha^{-1}A)_2 \end{pmatrix}.$$

Therefore, det(J_α^{-1}A)_1 ≠ 0.

If we replace J_α^{-1} with J_{ᾱ}^{-1}, then det(J_{ᾱ}^{-1}A)_2 ≠ 0. □
For the general case, we have the following theorem:

Theorem 3.57. For every Lagrangian subspace L = {A} = {(A1; A2)}, there exist an α ⊂ ν and a generating function ϕ, a quadratic form in the n variables {pi, qj}i∈α, j∈ᾱ, such that

    L = { (p; q) ∈ R2n | pi = ∂ϕ/∂qi, i ∈ ᾱ;  qi = −∂ϕ/∂pi, i ∈ α }.

Proof. Taking α ⊂ ν as in Theorem 3.56, the matrix

    B = (B1; B2) = Jα−1A = (Iᾱ A1 − Iα A2; Iα A1 + Iᾱ A2)

is Lagrangian with |B2| ≠ 0. Define

    (u; v) = Jα−1 (p; q) = (Iᾱ p − Iα q; Iα p + Iᾱ q);

then

    ui = pi for i ∈ ᾱ,  ui = −qi for i ∈ α;   vi = pi for i ∈ α,  vi = qi for i ∈ ᾱ,

and, since Jα−1 = Jα′ ∈ Sp(2n),

    [p′, q′] J (A1; A2) = (Jα−1(p; q))′ J (Jα−1A) = [u′, v′] J (B1; B2) = [u′, v′] J (S; I) B2,

where S = B1B2−1, S′ = S. Thus, if we define ϕ(v) = (1/2) v′Sv, then

    L = { (p; q) ∈ R2n | [p′, q′] J (A1; A2) = O }
      = { (p; q) ∈ R2n | [u′, v′] J (S; I) = O }
      = { (u; v) ∈ R2n | u = Sv } = { (u; v) ∈ R2n | u = ∂ϕ/∂v }
      = { (p; q) ∈ R2n | (p; q) = Jα(u; v), u = ∂ϕ/∂v }
      = { (p; q) ∈ R2n | pi = ∂ϕ/∂qi, i ∈ ᾱ;  qi = −∂ϕ/∂pi, i ∈ α }.

Therefore, the theorem is proved. □



2.3.8 Generalized Lagrangian Subspaces


In the previous sections, we have considered in detail the special symplectic space
with the special symplectic structure
    ωJ(x, y) = ϕJ(x, y) = x′Jy,    J = ( O  I; −I  O ).

For every non-singular anti-symmetric matrix K of order 2n, ωK(x, y) = ϕK(x, y) = x′Ky is a symplectic structure on R2n, and so (R2n, ωK) is also a symplectic space. Several previous results can be directly applied to this case. Here, we only give a few different theorems [FW91, FWQW89].
Definition 3.58. {A} (or A) is called a K-Lagrangian subspace if A′KA = O and A is non-singular.
Let us denote:

    Λn(K) = { {A} | {A} is a K-Lagrangian subspace },
    Sp(K, 2n) = { M ∈ M(2n) | M′KM = K },
    CSp(K, 2n) = { M ∈ M(2n) | M′KM = μK, μ ≠ 0 }.

Elements of Sp(K, 2n) and CSp(K, 2n) are called K-symplectic matrices and conformally K-symplectic matrices, respectively.
Theorem 3.59. Let M be non-singular.
1◦ M ∈ CSp(K, 2n) iff A ∈ Λn(K) ⇒ MA ∈ Λn(K).
2◦ M ∈ Sp(K, 2n) iff (MA)′K(MB) = A′KB, i.e.,

    ϕK(MA, MB) = ϕK(A, B),   ∀ A, B ∈ Λn(K).

Proof. 1◦ "⇒" is trivial. For the converse, we know that for any non-singular anti-symmetric matrix K there exists Q ∈ GL such that K = Q′JQ, where J = (O I; −I O). Thus,

    A′KA = A′Q′JQA = (QA)′J(QA),
    (MA)′K(MA) = A′M′Q′JQMA
               = A′Q′ (Q′)−1M′Q′ J QMQ−1 QA
               = (QA)′(QMQ−1)′J(QMQ−1)(QA).

Hence, the implication A ∈ Λn(K) ⇒ MA ∈ Λn(K) is equivalent to: QA ∈ Λn(J) ⇒ (QMQ−1)(QA) ∈ Λn(J). By Theorem 3.23, we get QMQ−1 ∈ CSp(J), i.e.,

    (QMQ−1)′J(QMQ−1) = μJ,   μ ≠ 0.

This leads to M′Q′JQM = μQ′JQ, i.e., M′KM = μK.


2◦ Similarly, we only need to prove the sufficiency. By 1◦, we have M′KM = μK, μ ≠ 0. Thus, ∀ A, B ∈ Λn(K), we have

    A′KB = μ−1A′M′KMB = μ−1(MA)′K(MB) = μ−1A′KB.

Taking A, B ∈ Λn(K) such that A′KB ≠ 0, we get μ = 1. Therefore, M′KM = K.

The theorem is proved. □
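The reduction K = Q′JQ used in the proof can be illustrated numerically; a hedged sketch (Q and A below are arbitrary invertible choices of ours): conjugating a J-symplectic matrix by Q produces a K-symplectic one.

```python
import numpy as np

# Write K = Q'JQ with Q invertible; then M = Q^{-1} M0 Q is K-symplectic
# whenever M0 is J-symplectic. Q and A are arbitrary invertible choices.
n = 2
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])
Q = np.triu(np.ones((2 * n, 2 * n))) + np.eye(2 * n)  # triangular, diag 2 => invertible
K = Q.T @ J @ Q                                       # non-singular anti-symmetric
assert np.allclose(K, -K.T)

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])
M0 = np.block([[A, np.zeros((n, n))],
               [np.zeros((n, n)), np.linalg.inv(A).T]])  # M0 in Sp(J)
assert np.allclose(M0.T @ J @ M0, J)

M = np.linalg.inv(Q) @ M0 @ Q                            # conjugated back
assert np.allclose(M.T @ K @ M, K)                       # M in Sp(K, 2n)
```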

Definition 3.60. Let K1 and K2 be two non-singular anti-symmetric matrices of order 2n. Define

    Sp(K1, K2) = { M ∈ M(2n) | M′K1M = K2 },
    CSp(K1, K2) = { M ∈ M(2n) | ∃ μ ≠ 0, s.t. M′K1M = μK2 }.

Remark 3.61. Sp(K1, K2) and CSp(K1, K2) are not groups. However, Sp(K2, K1) and Sp(J), as well as CSp(K2, K1) and CSp(J), have the same cardinality. In fact, there exist Q1, Q2 ∈ GL such that K1 = Q1′JQ1, K2 = Q2′JQ2, and

    M′K2M = K1 ⇐⇒ M′Q2′JQ2M = Q1′JQ1,

i.e.,

    (Q1−1)′M′Q2′ J Q2MQ1−1 = J,

which is equivalent to Q2MQ1−1 ∈ Sp(J). Hence, the mapping M ∈ Sp(K2, K1) → Q2MQ1−1 ∈ Sp(J) is a one-to-one correspondence. It is also a one-to-one correspondence between CSp(K2, K1) and CSp(J).
In addition, M can be viewed as a mapping from Λn (K2 ) to Λn (K1 ):

M : Λn (K2 ) −→ Λn (K1 ), A ∈ Λn (K2 ) −→ M A ∈ Λn (K1 ).

We have the following theorem similar to Theorem 3.23 and Theorem 3.59.
Theorem 3.62. Let M be non-singular. Then,
1◦ M ∈ CSp(K2 , K1 ), iff M A ∈ Λn (K1 ), ∀A ∈ Λn (K2 ).
2◦ M ∈ Sp(K2 , K1 ), iff (M A) K1 (M B) = A K2 B, ∀A, B ∈ Λn (K2 ).

Proof. The proof is omitted, as it is similar to the proof of Theorem 3.59. 


Bibliography

[AM78] R. Abraham and J.E. Marsden: Foundations of Mechanics. Addison-Wesley, Reading, MA, Second edition, (1978).
[Arn89] V.I. Arnold: Mathematical Methods of Classical Mechanics. Springer-Verlag, GTM 60, Berlin Heidelberg, Second edition, (1989).
[Art57] E. Artin: Geometric Algebra. Interscience Publishers, New York, Second edition, (1957).
[Car65] C. Carathéodory: Calculus of Variations and Partial Differential Equations of First Order, Vol. I. Holden-Day, San Francisco, (1965).
[Fen95] K. Feng: Collected Works of Feng Kang, Volume I, II. National Defence Industry Press, Beijing, (1995).
[FQ91] K. Feng and M.Z. Qin: Hamiltonian Algorithms for Hamiltonian Dynamical Systems. Progr. Natur. Sci., 1(2):105–116, (1991).
[FW91] K. Feng and D.L. Wang: Symplectic Difference Schemes for Hamiltonian Systems in General Symplectic Structures. J. Comput. Math., 9(1):86–96, (1991).
[FWQW89] K. Feng, H.M. Wu, M.Z. Qin, and D.L. Wang: Construction of canonical difference schemes for Hamiltonian formalism via generating functions. J. Comput. Math., 7:71–96, (1989).
[Ge91] Z. Ge: Equivariant symplectic difference schemes and generating functions. Physica D, 49:376–386, (1991).
[HW63] L.G. Hua and Z.X. Wan: Classical Groups. Shanghai Science and Technology Publishing House, in Chinese, Shanghai, (1963).
[LM87] P. Libermann and C.M. Marle: Symplectic Geometry and Analytical Mechanics. Reidel Publishing Company, Boston, First edition, (1987).
[Tre75] F. Treves: Pseudo-Differential Operators. Academic Press, New York, First edition, (1975).
[Wei77] A. Weinstein: Lectures on symplectic manifolds. In CBMS Regional Conference Series in Mathematics, 29. American Mathematical Society, Providence, RI, (1977).
[Wey39] H. Weyl: The Classical Groups. Princeton Univ. Press, Princeton, Second edition, (1939).
[Wey40] H. Weyl: The method of orthogonal projection in potential theory. Duke Math. J., 7:411–444, (1940).
Chapter 3. Hamiltonian Mechanics and Symplectic Geometry

Hamiltonian mechanics is geometry in phase space. Phase space has the structure of a symplectic manifold.
3.1 Symplectic Manifold


A symplectic manifold is a smooth manifold M equipped with a closed, nondegenerate 2-form called the symplectic form. On a symplectic manifold, as on a Riemannian manifold, there is a natural isomorphism between vector fields and 1-forms. Symplectic manifolds arise naturally in abstract formulations of classical mechanics and analytical mechanics as the cotangent bundles of manifolds, e.g., in the Hamiltonian formulation of classical mechanics, which provides one of the major motivations for the field: the set of all possible configurations of a system is modelled as a manifold, and this manifold's cotangent bundle describes the phase space of the system.

Any real-valued differentiable function H on a symplectic manifold can serve as an energy function or Hamiltonian. Associated to any Hamiltonian is a Hamiltonian vector field; the integral curves of the Hamiltonian vector field are solutions to Hamilton's equations. The Hamiltonian vector field defines a flow on the symplectic manifold, called a Hamiltonian flow or symplectomorphism. By Liouville's theorem, Hamiltonian flows preserve the volume form on the phase space. The vector fields on a manifold form a Lie algebra; the Hamiltonian vector fields on a symplectic manifold form a Lie algebra as well, and the operation in this algebra is called the Poisson bracket.

3.1.1 Symplectic Structure on Manifolds


Definition 1.1. Let M2n be an even-dimensional differential manifold. A symplectic structure on M2n is a closed nondegenerate differential 2-form ω on M2n [AM78, Arn89]:
1◦ dω = 0 (ω is closed);
2◦ ∀ x ∈ M and ξ ∈ TxM: if ω(ξ, η) = 0 for all η ∈ TxM, then ξ = 0 (ω is nondegenerate).
The pair (M, ω) is called a symplectic manifold. We call it a presymplectic (resp. almost symplectic) manifold if only condition 1◦ (resp. 2◦) is satisfied.

Example 1.2. Consider the vector space R2n with coordinates (pi, qi). Let ω = Σ_{i=1}^{n} dpi ∧ dqi. Then, ω defines a symplectic structure. Given two tangent vectors

    ξj = (ξ1j, · · · , ξnj; η1j, · · · , ηnj),   j = 1, 2,

we have

    ω(ξ1, ξ2) = Σ_{i=1}^{n} (ηi1 ξi2 − ηi2 ξi1).

This example shows that every symplectic manifold carries the standard symplectic structure at least locally.

Exercise 1.3. Verify that (R2n, ω) is a symplectic manifold. For n = 1, ω becomes an area measure on the plane.

3.1.2 Standard Symplectic Structure on Cotangent Bundles


Let M be an n-dimensional differential manifold. A 1-form on TxM, the tangent space to M at a point x, is called a cotangent vector to M at x. The set of all cotangent vectors to M at x forms an n-dimensional vector space, dual to the tangent space TxM. We will denote this vector space of cotangent vectors by Tx∗M and call it the cotangent space to M at x. The union of the cotangent spaces to the manifold at all of
its points is called the cotangent bundle, denoted T∗M. Let (q1, · · · , qn) be the local coordinates of M and (p1, · · · , pn) be the coordinates of the fiber. Then (q, p) becomes a local coordinate of T∗M, and T∗M is equipped with a structure of differential manifold. Locally, ω = Σ_{i=1}^{n} dpi ∧ dqi is a natural symplectic structure of T∗M [AM78, Arn89]. In order to give a coordinate-free definition of the form ω, we first define a distinguished 1-form on T∗M. Let π : T∗M → M be the natural projection, and let ξ ∈ Tp(T∗M) be a vector tangent to the cotangent bundle at the point p ∈ Tx∗M. The tangent mapping π∗ : Tp(T∗M) → Tπ(p)M of the natural projection π takes ξ to a vector π∗(ξ) tangent to M at x = π(p). We define the 1-form σ on T∗M by the relation σ(ξ) = p(π∗(ξ)), which under the local coordinates reads as follows:

    ξ = Σ_{i=1}^{n} ( ai ∂/∂pi + bi ∂/∂qi ),    π∗(ξ) = Σ_{i=1}^{n} bi ∂/∂qi.

Therefore, p(π∗(ξ)) = Σ_{i=1}^{n} pi bi, which results in

    σ = Σ_{i=1}^{n} pi dqi,    ω = dσ = Σ_{i=1}^{n} dpi ∧ dqi.
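The coordinate formula ω = dσ can be verified by finite differences; a hedged sketch (the evaluation point z0 is an arbitrary choice of ours): treating σ through its components in the basis dz1, · · · , dz2n with z = (p, q), the antisymmetrized Jacobian of those components must equal the matrix of ω, which is J.

```python
import numpy as np

# σ = Σ p_i dq_i has components σ_k in z = (p, q); the exterior derivative
# dσ has matrix (dσ)_{kl} = ∂σ_l/∂z_k − ∂σ_k/∂z_l, which should equal J,
# the matrix of ω = Σ dp_i ∧ dq_i.
n = 2

def sigma(z):
    """Components of σ = Σ p_i dq_i in the basis dz_1..dz_2n, z = (p, q)."""
    comp = np.zeros(2 * n)
    comp[n:] = z[:n]            # the coefficient of dq_i is p_i
    return comp

z0 = np.array([0.3, -1.2, 0.7, 2.0])
eps = 1e-6
Dsigma = np.array([(sigma(z0 + eps * e) - sigma(z0 - eps * e)) / (2 * eps)
                   for e in np.eye(2 * n)])   # Dsigma[k, l] = ∂σ_l/∂z_k
d_sigma = Dsigma - Dsigma.T                    # matrix of the 2-form dσ

J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])
assert np.allclose(d_sigma, J, atol=1e-8)      # dσ = Σ dp_i ∧ dq_i
```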

3.1.3 Hamiltonian Vector Fields

As we pointed out in Section 2.3 of Chapter 2, the symplectic structure of a symplectic space is similar to a Euclidean structure in some respects. The symplectic structure on a symplectic manifold is likewise similar to a Riemannian structure, which defines a Euclidean structure on each tangent space so that the tangent space becomes isomorphic to the cotangent space. The same is true for a symplectic structure. Let (M, ω) be a symplectic manifold; ∀ η ∈ TxM, there is a linear form on TxM: TxM ∋ ξ → ω(ξ, η); therefore, ω( · , η) defines an element of the cotangent space. Thus, we get a linear mapping Ω : TxM → Tx∗M, η → ω( · , η). The non-degeneracy of ω shows that Ω is injective. Since TxM and Tx∗M have the same dimension, Ω must be an isomorphism, i.e.,

    ω(ξ, η) = (Ωη)ξ.

Using local coordinates, we can set (M, ω) = (R2n, dp ∧ dq). For ξ = (q1, · · · , qn; p1, · · · , pn) and η = (q̃1, · · · , q̃n; p̃1, · · · , p̃n), we have (Ωη)ξ = ω(ξ, η) = Σ_{i=1}^{n} (pi q̃i − p̃i qi), so Ω has the matrix representation (O −I; I O), i.e., Ω = −J and Ω−1 = J.

Although the above results are stated for the tangent and cotangent space at a specific point x ∈ M, they extend easily to the entire tangent and cotangent bundles. Let θ be a 1-form on M, i.e., a C∞ section of T∗M. Then Ω−1θ is a vector field on M, i.e., a section of TM. One of the most important cases is when θ = dH is an exact differential form, i.e., θ is the total differential of a C∞ function on M. We denote the isomorphism Ω−1 by J and say that JdH is the Hamiltonian vector field [AM78, Arn89] with Hamiltonian function H; in local coordinates we may take (M, ω) = (R2n, dp ∧ dq).
We will use J : T∗M → TM to denote the above isomorphism. Let H be a function on the symplectic manifold M2n. Then dH is a 1-form on M, and at every point there is a tangent vector associated with it. Thus, we obtain a vector field JdH on M. From

    J = ( O  I; −I  O ),    dH = Hq dq + Hp dp,

we obtain the Hamiltonian vector field

    JdH = ( O  I; −I  O ) ( Hq; Hp ) = ( Hp; −Hq ),

which under the local basis of the tangent field has the expression

    Hp ∂/∂q − Hq ∂/∂p.

3.1.4 Darboux Theorem


Symplectic geometry arises from the globalization of the symplectic algebra considered in the previous chapter. First, we prove Darboux's theorem, according to which every symplectic manifold has local coordinates p, q in which the symplectic structure can be written in the simplest way: ω = dp ∧ dq.

Theorem 1.4 (Darboux theorem). Let ω be a non-degenerate 2-form on a manifold M2n. Then, dω = 0 iff for every m ∈ M there exists a local coordinate system (U, ϕ) with m ∈ U, such that

    ϕ(m) = 0,    ϕ(u) = (x1(u), · · · , xn(u), y1(u), · · · , yn(u)),

and

    ω|U = Σ_{i=1}^{n} dxi ∧ dyi.


Proof. The sufficiency can be easily derived, since Σ_{i=1}^{n} dxi ∧ dyi is a closed form.

Necessity. We first assume that M = E is a linear space and m = 0 ∈ E. Let ω1 be the constant form ω(0), and set ω̃ = ω1 − ω, ωt = ω + tω̃ (0 ≤ t ≤ 1). For each t, ωt(0) = ω(0) is nondegenerate. Hence, by the openness of the set of isomorphisms from E to E∗, there exists a neighborhood of 0 on which ωt is nondegenerate for all 0 ≤ t ≤ 1. We can assume that this neighborhood is a ball. Thus, by the Poincaré lemma, there exists a 1-form θ s.t. ω̃ = dθ. Without loss of generality, we assume θ(0) = 0. Since ωt is nondegenerate, there exists a smooth time-dependent vector field Xt, s.t. iXtωt = −θ. Since Xt(0) = 0, from the local existence theory of ODEs there is a sufficiently small ball on which the integral curves of Xt are well defined for t ∈ [0, 1]. Let Ft be the flow of Xt starting at F0 = identity. By the Lie derivative formula for a time-dependent vector field, we have

    d/dt (Ft∗ωt) = Ft∗(LXtωt) + Ft∗(d/dt ωt)
                 = Ft∗ d iXtωt + Ft∗ ω̃ = Ft∗(−dθ + ω̃) = 0.

Therefore, F1∗ω1 = F0∗ω0 = ω. So F1 provides the chart transforming ω to the constant form ω1. □

3.2 Hamiltonian Mechanics on R2n


The Darboux theorem shows that every symplectic manifold of dimension 2n can be locally identified with the standard symplectic manifold (R2n, ω). Thus, the results obtained in (R2n, ω) can be locally transferred to any finite-dimensional symplectic manifold. Therefore, in this section we only consider Hamiltonian systems in R2n with the standard symplectic structure ω = dp ∧ dq.

3.2.1 Phase Space on R2n and Canonical Systems


1. 1-form and 2-form in R2n

In R2n, we denote

    z = (z1, · · · , zn, zn+1, · · · , z2n)′ = (p1, · · · , pn, q1, · · · , qn)′ = (p; q) ∈ R2n,

where the prime ′ indicates the matrix transpose.

Definition 2.1. The fundamental differential 1-form and 2-form in R2n are defined by the following formulae:

    1-form:  θ = Σ_{i=1}^{n} pi dqi = Σ_{i=1}^{n} zi dzn+i;
    2-form:  ω = dθ = dp ∧ dq = Σ_{i=1}^{n} dpi ∧ dqi = Σ_{i=1}^{n} dzi ∧ dzn+i.

Thus, it can be seen that ω satisfies the following properties:
1◦ Closed: dω = ddθ = 0.
2◦ Non-degenerate: ω(ξ, η) = 0, ∀ η ∈ TzR2n ⇒ ξ = 0, ∀ z ∈ R2n.
3◦ Anti-symmetric: ω(ξ, η) = −ω(η, ξ), ∀ ξ, η ∈ TzR2n, z ∈ R2n.
Any differential 2-form satisfying the conditions 1◦, 2◦ and 3◦ is called a symplectic structure in R2n; ω is called the standard symplectic structure in R2n, and θ the standard 1-form. R2n equipped with ω is called the symplectic space, or symplectic manifold, denoted by (R2n, ω), or briefly, R2n.
Let ξ = Σ_{i=1}^{2n} ξi ∂/∂zi, η = Σ_{i=1}^{2n} ηi ∂/∂zi ∈ TzR2n. Then,

    ω(ξ, η) = Σ_{i=1}^{n} dzi ∧ dzn+i (ξ, η) = Σ_{i=1}^{n} (ξi ηn+i − ηi ξn+i)
            = (ξ1, · · · , ξ2n) ( O  In; −In  O ) (η1, · · · , η2n)′
            = ξ′Jη,                                                      (2.1)

where J = ( O  In; −In  O ), and (ξ1, · · · , ξ2n)′ represents the vector ξ ∈ TzR2n consisting of the components ξi. Equation (2.1) is the matrix representation of the 2-form ω on TzR2n.

2. Hamiltonian vector fields on R2n



To each vector ξ, tangent to the symplectic manifold (R2n, ω) at the point z, we associate a 1-form ω1ξ on TzR2n by the formula

    ω1ξ(η) = ω2(η, ξ),   ∀ η ∈ TzR2n.

We denote this correspondence by

    Ω : TzR2n −→ Tz∗R2n,

i.e.,

    ω1ξ(η) = Ωξ(η) = ω2(η, ξ) = (−iξω)(η),   ∀ η ∈ TzR2n,

or

    ω1ξ = Ωξ = −iξω.

From the equation ω(η, ξ) = η′Jξ = (Jξ)′η, it follows that

    ω1ξ = Ωξ = Jξ,

or

    ω1ξ = Ωξ = Σ_{i=1}^{2n} (Jξ)i dzi = ξn+1 dz1 + · · · + ξ2n dzn − ξ1 dzn+1 − · · · − ξn dz2n.

Obviously, Ω is an isomorphism from the tangent space TzR2n onto the cotangent space Tz∗R2n. This naturally induces a mapping from X(R2n) into Ω1(R2n):

    ω1X = Ω(X)(z) = Ω(X(z)) = −iXω,   ∀ X ∈ X(R2n).

In particular, if H ∈ C∞(R2n), then dH ∈ Ω1(R2n), and Ω−1dH is a vector field on R2n, which we denote by XH.

Definition 2.2. The vector field XH = Ω−1dH is called a Hamiltonian vector field, and H is called the Hamiltonian function.

If we write dH = (Hz1, · · · , Hz2n)′ = Hz, then

    XH = J−1Hz,    dH = ΩXH = −iXHω.
3. Canonical systems
Now, we consider a canonical equation in R2n .
Definition 2.3. The system

    dz/dt = J−1Hz                                      (2.2)

or

    dp/dt = −Hq,    dq/dt = Hp,                        (2.3)

is called a canonical system.

Since J−1Hz is the matrix representation of the Hamiltonian vector field Ω−1dH = XH, Equation (2.2) can be rewritten as

    dz/dt = XH(z).                                      (2.4)

The phase flow of the Hamiltonian vector field is denoted as φtH, and called the Hamiltonian phase flow.
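A minimal numerical sketch of the canonical system (2.2)-(2.3) (the pendulum Hamiltonian H = p²/2 − cos q is our own example, not taken from the text):

```python
import numpy as np

# For H(p, q) = p²/2 − cos q (a planar pendulum), the canonical system is
# dz/dt = J^{-1} H_z with z = (p, q), i.e. dp/dt = −H_q = −sin q,
# dq/dt = H_p = p.
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def H(z):
    p, q = z
    return 0.5 * p**2 - np.cos(q)

def grad(f, z, eps=1e-6):
    # central-difference gradient
    return np.array([(f(z + eps * e) - f(z - eps * e)) / (2 * eps)
                     for e in np.eye(len(z))])

z0 = np.array([0.4, 1.1])              # (p, q)
XH = np.linalg.inv(J) @ grad(H, z0)    # X_H = J^{-1} H_z
p, q = z0
assert np.allclose(XH, [-np.sin(q), p], atol=1e-6)
```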

Theorem 2.4. A Hamiltonian phase flow preserves the symplectic structure:

    (φtH)∗ω = ω.

Proof. Since

    d/dt (φtH)∗ω = d/ds|s=0 (φt+sH)∗ω = (φtH)∗ d/ds|s=0 (φsH)∗ω = (φtH)∗LXHω,

and

    LXHω = (iXH d + d iXH)ω = iXH dω + d iXHω = 0 + (−d(dH)) = 0,

we have

    d/dt (φtH)∗ω = 0,

i.e.,

    (φtH)∗ω = (φtH)∗ω|t=0 = ω.

The theorem is proved. □
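Theorem 2.4 can be verified directly in the simplest nontrivial case; a hedged sketch (the harmonic oscillator H = (p² + q²)/2 is our own example): its exact phase flow is a rotation of the (p, q)-plane, whose Jacobian satisfies M′JM = J for every t.

```python
import numpy as np

# For H = (p² + q²)/2 the canonical equations are dp/dt = −q, dq/dt = p,
# so the exact flow is z(t) = M(t) z(0) with M(t) a rotation matrix.
# Check that the flow Jacobian preserves ω = dp ∧ dq: M'JM = J.
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def M(t):
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

for t in (0.3, 1.0, 2.7):
    Mt = M(t)
    assert np.allclose(Mt.T @ J @ Mt, J)       # the flow preserves ω
    assert np.isclose(np.linalg.det(Mt), 1.0)  # hence the area (Liouville)
```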

4. Integral invariants [Arn89]

Let g : R2n → R2n be a differentiable mapping.

Definition 2.5. A differential k-form ωk is called an integral invariant of the map g if the integrals of ωk on any k-chain c and on its image under g are the same, i.e.,

    ∫_{gc} ωk = ∫_{c} ωk.

Example 2.6. If n = 1 and ω2 = dp ∧ dq is the area element, then ω2 is an integral invariant of any map whose Jacobian determinant is equal to 1.

Theorem 2.7. A k-form ω k is an integral invariant of a map g if and only if

g∗ ωk = ωk .

The proof is left to the reader as an exercise.

Theorem 2.8. If the forms ω k and ω l are integral invariants of the map g, then the
form ω k ∧ ω l is also an integral invariant of g.

This follows immediately from Theorem 2.7.

Theorem 2.9. Let ω2 be the standard symplectic structure. Then ω2, (ω2)2 = ω2 ∧ ω2, (ω2)3 = ω2 ∧ ω2 ∧ ω2, · · · are all integral invariants of a Hamiltonian phase flow.

We define a volume element on R2n using (ω2)n. Then, a Hamiltonian phase flow preserves volume, and we obtain Liouville's theorem from Theorem 2.4. Since the form (ω2)k is proportional to

    ω2k = Σ_{i1<···<ik} dpi1 ∧ · · · ∧ dpik ∧ dqi1 ∧ · · · ∧ dqik,

the integral of ω2k is equal to the sum of the oriented volumes of the projections onto the coordinate planes (pi1, · · · , pik, qi1, · · · , qik). Therefore, a Hamiltonian phase flow preserves the sum of the oriented volumes of the projections onto the coordinate planes (pi1, · · · , pik, qi1, · · · , qik) (1 ≤ k ≤ n).

3.2.2 Canonical Transformation


Definition 2.10 [Arn89]. A diffeomorphism g : R2n → R2n, z̃ = g(z), is called a canonical transformation on R2n if, for every z ∈ R2n, M = ∂z̃/∂z ∈ Sp(2n).

It is easy to see that a linear canonical transformation is a symplectic transformation.

Theorem 2.11. A diffeomorphism g is canonical if and only if g preserves ω, i.e., g∗ω = ω. In other words, if we denote z̃ = g(z) = (P(z); Q(z)), i.e.,

    z = (p; q)  −g→  (P(z); Q(z)) = z̃,

then g is canonical iff dP ∧ dQ = dp ∧ dq.

Thus, a Hamiltonian phase flow φtH is a one-parameter group of canonical transformations on R2n.

Proof. For every ξ, η ∈ TzR2n,

    (g∗ω)(ξ, η) = ω(g∗ξ, g∗η) = ξ′M′JMη,

where M = g∗ = ∂g/∂z is the Jacobian of g. Hence,

    g canonical ⇐⇒ M′JM = J, ∀ z ∈ R2n,
                ⇐⇒ ξ′M′JMη = ξ′Jη, ∀ ξ, η ∈ TzR2n, z ∈ R2n,
                ⇐⇒ g∗ω(ξ, η) = ω(ξ, η), ∀ ξ, η ∈ TzR2n,
                ⇐⇒ g∗ω = ω.

Therefore, the theorem is completed. □
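A hedged numerical check of Definition 2.10 and Theorem 2.11 on a nonlinear map (the "momentum kick" g(p, q) = (p − sin q, q) is our own example, not from the text): its Jacobian lies in Sp(2) at every point, so g is canonical.

```python
import numpy as np

# g(p, q) = (p − sin q, q) has Jacobian M = [[1, −cos q], [0, 1]].
# Checking M'JM = J at sample points verifies that g is a canonical
# transformation, i.e. dP ∧ dQ = dp ∧ dq.
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])

def jacobian(q):
    return np.array([[1.0, -np.cos(q)],
                     [0.0, 1.0]])

for q in (-1.2, 0.0, 0.8, 2.5):
    Mq = jacobian(q)
    assert np.allclose(Mq.T @ J @ Mq, J)   # M ∈ Sp(2) pointwise
```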

Definition 2.12. A diffeomorphism g : R2n → R2n is conformally canonical if its Jacobian M(z) = ∂g(z)/∂z ∈ CSp(2n), ∀ z ∈ R2n.

Besides the characterization above, a canonical transformation g(z) can be characterized by whether or not it transforms every canonical equation into a canonical equation. We first consider a conformally canonical transformation.

Let z̃ = g(z, t) be a time-dependent transformation and M(z, t) = ∂z̃/∂z = ∂g(z, t)/∂z the Jacobian of g(z, t) with respect to z.

Theorem 2.13. The time-dependent transformation z = g−1(z̃, t) : R2n → R2n transforms every canonical equation

    dz̃/dt = J−1H̃z̃(z̃),

with Hamiltonian H̃(z̃), into a canonical equation

    dz/dt = J−1Hz(z),

with some Hamiltonian H(z), iff M(z, t) = ∂z̃/∂z satisfies

    M′JM = μJ,

where μ ≠ 0 is independent of z and t.

Proof. Differentiating z̃ = g(z, t) along a solution gives

    dz̃/dt = (∂z̃/∂z)(dz/dt) + ∂g/∂t = M (dz/dt) + ∂g/∂t.

Set Ĥ(z) = H̃(g(z, t), t) = H̃ ∘ g; then Ĥz = M′H̃z̃. Thus, from the equation

    dz̃/dt = J−1H̃z̃,

we have

    M (dz/dt) + ∂g/∂t = J−1(M′)−1Ĥz,

i.e.,

    dz/dt = M−1J−1(M′)−1Ĥz − M−1(∂g/∂t)
          = J−1( JM−1J−1(M′)−1Ĥz − JM−1(∂g/∂t) )
          = J−1(u + v),

where u = BĤz with B = JM−1J−1(M′)−1, and v = C(∂g/∂t) with C = −JM−1; u depends on the Hamiltonian H̃ as well as on z, while v depends only on z.

For every H̃ ∈ C∞(R2n), there exists a function H(z) such that

    dz/dt = J−1Hz

iff there exists a function H(z) ∈ C∞(R2n) such that

    u + v = ∂H/∂z,

i.e., iff u + v is a gradient transformation. This holds iff the Jacobian matrix ∂(u + v)/∂z is symmetric, i.e.,

    ∂ui/∂zk + ∂vi/∂zk = ∂uk/∂zi + ∂vk/∂zi,   ∀ H̃, i, k = 1, · · · , 2n.

Taking H̃ to be a constant in the above equation, we get

    ∂vi/∂zk = ∂vk/∂zi,   i, k = 1, · · · , 2n.                         (2.5)

Consequently,

    ∂ui/∂zk = ∂uk/∂zi,   ∀ H̃, i, k = 1, · · · , 2n.                   (2.6)

Noticing that ui = (BĤz)i = Σ_{j=1}^{2n} Bij Ĥzj, (2.6) becomes

    ∂/∂zk ( Σ_{j=1}^{2n} Bij ∂Ĥ/∂zj ) = ∂/∂zi ( Σ_{j=1}^{2n} Bkj ∂Ĥ/∂zj ).   (2.7)

Expanding it, we get

    Σ_{j=1}^{2n} (∂Bij/∂zk)(∂Ĥ/∂zj) + Σ_{j=1}^{2n} Bij ∂²Ĥ/∂zk∂zj
        = Σ_{j=1}^{2n} (∂Bkj/∂zi)(∂Ĥ/∂zj) + Σ_{j=1}^{2n} Bkj ∂²Ĥ/∂zi∂zj.     (2.8)

Taking H̃(z̃) = zl ∘ g−1 (l = 1, · · · , 2n), so that Ĥ(z) = zl (l = 1, · · · , 2n), Equation (2.8) splits into two classes of equations:

    ∂Bij/∂zk = ∂Bkj/∂zi,   i, k, j = 1, · · · , 2n,                    (2.9)

    Σ_{j=1}^{2n} Bij ∂²Ĥ/∂zk∂zj = Σ_{j=1}^{2n} Bkj ∂²Ĥ/∂zi∂zj,   i, k = 1, · · · , 2n.   (2.10)

Set A = (∂²Ĥ/∂zk∂zj). Obviously, A is symmetric, A′ = A. Then, (2.10) indicates

    BA = (BA)′ = A′B′ = AB′,   ∀ A′ = A.

Since |B| ≠ 0, this implies

    B = μ(z, t)I,   μ ≠ 0,                                              (2.11)

or

    Bij = μ(z, t)δij.

Substituting it into (2.9), we get

    (∂μ/∂zk)δij = (∂μ/∂zi)δkj,   i, j, k = 1, · · · , 2n.               (2.12)

From this, it follows that

    ∂μ/∂zi = 0,   i = 1, · · · , 2n,

i.e., μ = μ(t) is independent of z. Thus, JM−1J−1(M′)−1 = B = μ(t)I, i.e., M′JM = μ−1(t)J: M is conformally symplectic with multiplier μ−1(t).

We now prove that μ is independent of t. Since

    JM−1J−1(M′)−1 = μ(t)I,    C = −JM−1 = −μ(t)M′J,

we have

    Cij = −μ(t) Σ_l (∂z̃l/∂zi) Jlj,    v = C(∂g/∂t),

and hence

    vi = Σ_j Cij (∂gj/∂t) = −μ(t) Σ_{l,j} (∂z̃l/∂zi) Jlj (∂gj/∂t),

    ∂vi/∂zk = −μ(t) Σ_{l,j} ( (∂²z̃l/∂zk∂zi) Jlj (∂gj/∂t) + (∂z̃l/∂zi) Jlj (∂²gj/∂zk∂t) ).

Then, the system (2.5), ∂vi/∂zk = ∂vk/∂zi (i, k = 1, · · · , 2n), is equivalent to

    Σ_{l,j} (∂z̃l/∂zi) Jlj ∂/∂t(∂z̃j/∂zk) = Σ_{l,j} (∂z̃l/∂zk) Jlj ∂/∂t(∂z̃j/∂zi),

because the terms ∂²z̃l/∂zk∂zi are symmetric in i and k; that is,

    (M′JMt)ik = (M′JMt)ki,

which shows that M′JMt is a symmetric matrix. Therefore,

    M′JMt = (M′JMt)′ = Mt′J′M = −Mt′JM.

Then, we have

    (M′JM)t = Mt′JM + M′JMt = Mt′JM − Mt′JM = 0.

However, M′JM = μ−1(t)J, and so μ−1(t) is a constant; relabeling this constant as μ, we conclude M′JM = μJ, with μ ≠ 0 independent of z and t. □

In particular, if g is independent of t, then v = 0 and u = μ(H̃ ∘ g)z. Thus, we obtain the following Theorem 2.14.

Theorem 2.14. A transformation z̃ = g(z) : R2n → R2n is conformally canonical with multiplier μ independent of z iff z = g−1(z̃) transforms every canonical system

    dz̃/dt = J−1H̃z̃

with Hamiltonian H̃(z̃) into a canonical system

    dz/dt = J−1Hz(z),

with Hamiltonian H(z) = μH̃(g(z)) = μH̃ ∘ g.

Specializing to μ = 1, we obtain Theorem 2.15.

Theorem 2.15. A transformation z̃ = g(z) : R2n → R2n is canonical iff g−1 transforms every canonical system

    dz̃/dt = J−1H̃z̃

with Hamiltonian H̃(z̃) into a canonical system

    dz/dt = J−1Hz,

with Hamiltonian H(z) = H̃(g(z)) = H̃ ∘ g.

3.2.3 Poisson Bracket


1. Poisson bracket

Definition 2.16. The Poisson bracket {φ(z), ψ(z)} of smooth functions φ(z) and ψ(z) on R2n is the smooth function on R2n defined by the formula

    {φ, ψ}(z) = φz′J−1ψz = [φp′, φq′] ( O  −I; I  O ) ( ψp; ψq ) = −(φp′ψq − φq′ψp).

Property 2.17. Let φ, ψ, χ be smooth functions on R2n. Then the Poisson bracket has the following basic properties:
1◦ anti-symmetry: {φ, ψ} = −{ψ, φ};
2◦ bilinearity: {αφ + βψ, χ} = α{φ, χ} + β{ψ, χ}, α, β ∈ R;
3◦ Jacobi identity: {{φ, ψ}, χ} + {{ψ, χ}, φ} + {{χ, φ}, ψ} = 0.

1◦ and 2◦ are self-evident. The Jacobi identity can be proved by direct computation, but it also follows from the proposition below and the corresponding Jacobi identity for vector fields.
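The direct computation can be delegated to a computer algebra system; a minimal sketch for n = 1 (the test functions φ, ψ, χ are arbitrary choices of ours):

```python
import sympy as sp

# Poisson bracket from Definition 2.16 for n = 1:
# {f, g} = −(f_p g_q − f_q g_p). Verify anti-symmetry and the Jacobi
# identity symbolically for sample functions.
p, q = sp.symbols('p q')

def pb(f, g):
    return -(sp.diff(f, p) * sp.diff(g, q) - sp.diff(f, q) * sp.diff(g, p))

phi = p**2 * q
psi = sp.sin(q) + p
chi = p * q**3

jac = pb(pb(phi, psi), chi) + pb(pb(psi, chi), phi) + pb(pb(chi, phi), psi)
assert sp.simplify(jac) == 0                          # Jacobi identity
assert sp.simplify(pb(phi, psi) + pb(psi, phi)) == 0  # anti-symmetry
```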

Proposition 2.18. Let φ and ψ be smooth functions on R2n. Then,
1◦ {φ, ψ} = −ω(Xφ, Xψ).
2◦ {φ, ψ} = dφ(Xψ) = −iXφω(Xψ).
3◦ {φ, ψ} = iXφiXψω.
4◦ {φ, ψ}(z) = d/dt|t=0 φ(φtψ(z)) = LXψφ(z).
5◦ Ω−1d{φ, ψ} = −[Ω−1dφ, Ω−1dψ] = −[Xφ, Xψ], i.e., X{φ,ψ} = −[Xφ, Xψ], where Xφ = Ω−1dφ is the Hamiltonian vector field of the Hamiltonian function φ.

Each of the equalities 1◦, 2◦, 3◦ and 4◦ can serve as a definition of the Poisson bracket of functions.

Proof. By definition,

    {φ, ψ}(z) = φz′J−1ψz = φz′J−1JJ−1ψz
              = −(J−1φz)′J(J−1ψz) = −ω(Xφ, Xψ)                (2.13)
              = −iXφω(Xψ) = dφ(Xψ),                            (2.14)

where Xφ = Ω−1dφ = Σ_{i=1}^{2n} (J−1φz)i ∂/∂zi. Equations (2.13) and (2.14) are just 1◦ and 2◦ of Proposition 2.18, respectively. Moreover,

    −ω(Xφ, Xψ) = ω(Xψ, Xφ) = iXψω(Xφ) = iXφiXψω,

and so {φ, ψ} = iXφiXψω, which is 3◦. For 4◦, by Equation (2.14),

    {φ, ψ} = dφ(Xψ) = iXψdφ = LXψφ,

since for φ ∈ C∞(R2n), LXφ = iXdφ. Finally, for 5◦, we have

    [Xφ, Xψ] = (J−1ψz)z J−1φz − (J−1φz)z J−1ψz
             = J−1ψzz J−1φz − J−1φzz J−1ψz,

and

    {φ, ψ}z = (φz′J−1ψz)z = φzz J−1ψz − ψzz J−1φz,
    Ω−1d{φ, ψ} = J−1{φ, ψ}z = J−1φzz J−1ψz − J−1ψzz J−1φz = −[Xφ, Xψ].

Therefore, the proposition is proved. □


Exercise 2.19. Show that the map g : R2n → R2n sending (p, q) → (P(p, q), Q(p, q)) is canonical iff the Poisson brackets of any two functions in the variables (p, q) and (P, Q) coincide:

    {φ ∘ g−1, ψ ∘ g−1} = {φ, ψ} ∘ g−1,   ∀ φ, ψ ∈ C∞(R2n),

i.e.,

    {φ, ψ}p,q = (∂ψ/∂p)′(∂φ/∂q) − (∂ψ/∂q)′(∂φ/∂p)
              = (∂ψ/∂P)′(∂φ/∂Q) − (∂ψ/∂Q)′(∂φ/∂P) = {φ, ψ}P,Q.

Theorem 2.20. A function F is a first integral of the phase flow with the Hamiltonian H iff its Poisson bracket with H is identically zero:

    {F, H} = 0.

Proof. By 4◦ of the proposition above,

    LXH F = d/dt|t=0 (φtH)∗F = {F, H} = 0.

Thus,

    d/dt F(φtH(z)) = d/dt (φtH)∗F(z) = d/ds|s=0 (φt+sH)∗F(z)
                   = d/ds|s=0 (φtH ∘ φsH)∗F(z)
                   = (φtH)∗ d/ds|s=0 (φsH)∗F(z) = (φtH)∗LXH F(z) = 0,

i.e., F is a first integral of the phase flow with the Hamiltonian H. The necessity is evident. □
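A hedged symbolic check of Theorem 2.20 (the central-force system is our own example): the angular momentum F = q1p2 − q2p1 has vanishing Poisson bracket with H = (p1² + p2²)/2 + V(q1² + q2²), hence it is a first integral.

```python
import sympy as sp

# Two degrees of freedom; {f, g} = −(f_p g_q − f_q g_p) summed over
# the pairs (p1, q1), (p2, q2), per Definition 2.16.
p1, p2, q1, q2 = sp.symbols('p1 p2 q1 q2')
V = sp.Function('V')

def pb(f, g):
    return sum(-(sp.diff(f, pi) * sp.diff(g, qi)
                 - sp.diff(f, qi) * sp.diff(g, pi))
               for pi, qi in [(p1, q1), (p2, q2)])

H = (p1**2 + p2**2) / 2 + V(q1**2 + q2**2)  # rotation-invariant Hamiltonian
F = q1 * p2 - q2 * p1                        # angular momentum
assert sp.simplify(pb(F, H)) == 0            # F is a first integral
```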

From Theorem 2.20, we immediately obtain the following.

Theorem 2.21. H is a first integral of the phase flow with Hamiltonian function H.

Theorem 2.22 (E. Noether theorem). If a Hamiltonian H is a first integral of the phase flow with a Hamiltonian function F, then F is also a first integral of the phase flow with the Hamiltonian function H.

This follows immediately from Theorem 2.20 and the fact that {F, H} = −{H, F}.

Theorem 2.23 (Poisson theorem). The Poisson bracket of two first integrals F1, F2 of a system with a Hamiltonian function H is again a first integral.

Proof. By the Jacobi identity,

    {{F1, F2}, H} = {F1, {F2, H}} + {F2, {H, F1}} = 0 + 0 = 0,

which is what we require. □

2. Lie algebras of Hamiltonian vector fields and functions


Definition 2.24. A Lie algebra is a vector space L, together with a bilinear skew-symmetric operation [ , ] : L × L → L, which satisfies the Jacobi identity.

The operation [ , ] is usually called the commutator.

Therefore, the set X(R2n) of all vector fields on R2n, together with the commutator [ , ], forms a Lie algebra; the set C∞(R2n) of all smooth functions on R2n, together with the Poisson bracket { , }, forms a Lie algebra too.

Definition 2.25. A linear subspace of a Lie algebra is called a subalgebra if the subspace is closed under the commutator, i.e., the commutator of any two elements of the
subspace belongs to it.

Evidently, a subalgebra of a Lie algebra is itself a Lie algebra with the original
commutator.
By the proposition and theorems above, we have:

Corollary 2.26. The Hamiltonian vector fields on R2n form a subalgebra of the Lie
algebra of all vector fields.

Corollary 2.27. The first integrals of the Hamiltonian phase flow with a Hamiltonian function H form a subalgebra of the Lie algebra of all functions.

3.2.4 Generating Functions


Let a subset S ⊂ R2n be an r-dimensional submanifold of R2n. For any fixed point s ∈ S, there exists an open set U ⊂ Rr and a diffeomorphism ϕ : U → S such that s ∈ ϕ(U). For simplicity, we consider only the local case.

Definition 2.28. A subset S ⊂ R2n is an r-dim submanifold if there exists a one-to-one smooth map Z : U ⊂ Rr → R2n such that

    S = { z = Z(x) ∈ R2n | x ∈ U ⊂ Rr }.
The tangent space TzS to S at z = Z(x) is spanned by the columns of the Jacobian

    ∂Z/∂x = ( ∂Zi/∂xj )  (i = 1, · · · , 2n; j = 1, · · · , r) = ( ∂P/∂x; ∂Q/∂x ),

i.e., TzS = { ∂Z/∂x } ⊂ TzR2n, where Z(x) = (P(x); Q(x)).
Let f : Rn → Rn be a smooth function. The graph of f in the product space Rn × Rn = R2n,

    gr(f) := Gf = { (f(q); q) ∈ R2n | q ∈ Rn },

is an n-dim submanifold of R2n, and its tangent space

    TzGf = { ( ∂f/∂q; I ) }

is transversal to Rnp = { (I; O) } in TzR2n for any z ∈ Gf.
Theorem 2.29. Let S be an n-dim submanifold of R2n. S is the graph of some function f iff for any z ∈ S, its tangent space TzS to S at z is transversal to Rnp in TzR2n.

Proof. We only need to prove the sufficiency. By definition, there are two functions P(x) and Q(x) such that

    S = { (P(x); Q(x)) | x ∈ U ⊂ Rn }.

Since TzS = { (∂P/∂x; ∂Q/∂x) } is transversal to Rnp = { (I; O) } in TzR2n, we have |∂Q/∂x| ≠ 0. By the inverse function theorem, q = Q(x) has an inverse function x = X(q) with Jacobian ∂X/∂q = (∂Q/∂x)−1. This implies that

    TzS = { ( ∂P/∂x; ∂Q/∂x ) } = { ( (∂P/∂x)(∂Q/∂x)−1; I ) } = { ( (∂P/∂x)(∂X/∂q); I ) } = { ( ∂(P ∘ X)/∂q; I ) }.

Setting f = P ∘ X, we get

    S = { (P(x); Q(x)) } = { (P ∘ X(q); q) } = { (f(q); q) },    TzS = { ( ∂f/∂q; I ) },

i.e., S is the graph of the function f(q) = P ∘ X(q). □


Let W : R2n → R2n, z̃ = W(z) = (U(p, q); V(p, q)), z = (p; q), be a diffeomorphism with the Jacobian

    ∂W/∂z = ( ∂U/∂p  ∂U/∂q; ∂V/∂p  ∂V/∂q ) = ( A  B; C  D ).

If f(q) is a function with Jacobian M = ∂f/∂q, and S = Gf is the graph of f, then

    W(S) = { z̃ = W(z) | z = (f(q); q) ∈ S }

is an n-dim submanifold of R2n with tangent space

    Tz̃W(S) = { (∂W/∂z)( ∂f/∂q; I ) } = { ( A  B; C  D )( M; I ) } = { ( AM + B; CM + D ) }.


By Theorem 2.29, $W(S)$ is a graph of some function $g$ iff $T_{\bar z} W(S) = \begin{bmatrix} A M + B \\ C M + D \end{bmatrix}$ is transversal to $\mathbf{R}^n_p$ on $W(S)$, i.e., $|C M + D| \neq 0$. Thus, we obtain the following theorem.

Theorem 2.30. Let $W(z) : \mathbf{R}^{2n} \to \mathbf{R}^{2n}$ be a diffeomorphism with Jacobian $\dfrac{\partial W}{\partial z} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, and $f(q)$ be a function with Jacobian $M = \dfrac{\partial f}{\partial q}$. Then $M$ satisfies the transversality condition $|C M + D| \neq 0$ iff there exists a function $g(q)$ with Jacobian $N = \dfrac{\partial g}{\partial q} = (A M + B)(C M + D)^{-1}$ such that $W(G_f) = G_g$, i.e., $W$ transforms the graph of $f$ into the graph of $g$.
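As a quick illustration (our own toy example, not from the book), take $n = 1$, the rotation $W(p, q) = (-q, p)$, so $A = 0$, $B = -1$, $C = 1$, $D = 0$, and the linear graph $f(q) = Mq$ with constant $M \neq 0$. The transversality condition $|CM + D| = |M| \neq 0$ holds, and Theorem 2.30 predicts

```latex
N = (AM + B)(CM + D)^{-1} = -\frac{1}{M}, \qquad
W \colon \begin{pmatrix} Mq \\ q \end{pmatrix} \longmapsto
\begin{pmatrix} -q \\ Mq \end{pmatrix}
= \begin{pmatrix} -\bar q / M \\ \bar q \end{pmatrix}, \quad \bar q := Mq .
```

Writing the image as a graph over $\bar q$ gives $g(\bar q) = -\bar q / M$, whose Jacobian is indeed $N = -1/M$.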

Definition 2.31. Let f : Rn → Rn be a transformation and ϕ : Rn → R be a scalar


function; if f =grad ϕ = ϕq (q), then ϕ is called a generating function of f and f
called a gradient transformation[AM78,Fen86,FWQW89] .

Given an $\mathbf{R}^n$-valued function $f$ on $\mathbf{R}^n$, we may construct a differential 1-form $\omega^1 = f \,\mathrm{d} q = f_1 \,\mathrm{d} q_1 + \cdots + f_n \,\mathrm{d} q_n$. If there exists a 0-form $\varphi$ such that $\omega^1 = f \,\mathrm{d} q = \mathrm{d} \varphi$, i.e., $\omega^1$ is exact, then $f = \varphi_q$. In $\mathbf{R}^n$, by the Poincaré Lemma 4.15 in Subsection 1.4.4, the only requirement is that $\omega^1$ is closed, i.e., that $\dfrac{\partial f}{\partial q}$ is symmetric. Thus, any transformation from $\mathbf{R}^n$ into itself with a symmetric Jacobian may be called a locally gradient transformation.
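The Jacobian symmetry is just equality of mixed partials, and is easy to test in practice. A minimal sketch (the potential $\varphi$ below is our own illustrative choice, not from the book):

```python
# f = grad(phi) for phi(q1, q2) = q1^2*q2 + sin(q2); its Jacobian must be
# symmetric, i.e. the mixed partials d f1/d q2 and d f2/d q1 must agree.
import math

def f(q1, q2):
    return (2*q1*q2, q1*q1 + math.cos(q2))   # (phi_q1, phi_q2)

h = 1e-6
q1, q2 = 0.7, -0.4
df1_dq2 = (f(q1, q2 + h)[0] - f(q1, q2 - h)[0]) / (2*h)   # = 2*q1
df2_dq1 = (f(q1 + h, q2)[1] - f(q1 - h, q2)[1]) / (2*h)   # = 2*q1
asym = abs(df1_dq2 - df2_dq1)                              # ~ 0
```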

Definition 2.32. Let S be an r-dim submanifold in R2n . S is called an isotropic,


coisotropic, Lagrangian, or K-Lagrangian submanifold if for any z ∈ S, Tz S is an
isotropic, coisotropic, Lagrangian, or K-Lagrangian subspace of Tz R2n respectively.

It is obvious that the graph of any gradient transformation is Lagrangian.

Corollary 2.33. A Lagrangian submanifold S in R2n is the graph of some gradient


transformation f : Rn → Rn , S = Gf , iff its tangent space Tz S is transversal to Rnp
in Tz R2n for any z ∈ S.

Corollary 2.34. A transformation $W(z) : \mathbf{R}^{2n} \to \mathbf{R}^{2n}$ is a conformally canonical transformation iff $W(S)$ is Lagrangian for any Lagrangian submanifold $S$.

3.2.5 Hamilton–Jacobi Equations


Consider a canonical system

d z(t)
= J −1 Hz , z(0) = z0 , (2.15)
dt

with the Hamiltonian H(z) = H(p, q). Let z(t) = (p(t), q(t)) be its solution and
G(t) the 1-parameter group of diffeomorphisms in R2n .
$$
G(t) : z_0 = \begin{pmatrix} p_0 \\ q_0 \end{pmatrix} \longrightarrow z(t) = G(t) z_0 = \begin{pmatrix} p(t) \\ q(t) \end{pmatrix}, \qquad G(0) = I.
$$
Let $M_0 = \begin{pmatrix} p_0 \\ q_0 \end{pmatrix}$ be an $n$-dim initial manifold, $p_0$ a function of $q_0$, and let $M_0$ form a Lagrangian manifold, i.e., $M_0 = \begin{pmatrix} p_0(q_0) \\ q_0 \end{pmatrix}$ and $\dfrac{\partial p_0}{\partial q_0} \in Sm$. Since $G(t) : \begin{pmatrix} p_0 \\ q_0 \end{pmatrix} \to \begin{pmatrix} p(t) \\ q(t) \end{pmatrix}$ is a canonical transformation for a fixed $t$ in some neighbourhood of $\mathbf{R}^{2n}$, and
$$
G_*(t) = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
= \begin{bmatrix} \dfrac{\partial p}{\partial p_0} & \dfrac{\partial p}{\partial q_0} \\[4pt] \dfrac{\partial q}{\partial p_0} & \dfrac{\partial q}{\partial q_0} \end{bmatrix}
$$
is a symplectic matrix,
$$
\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = G_* \begin{bmatrix} \dfrac{\partial p_0}{\partial q_0} \\[4pt] I \end{bmatrix}
= \begin{bmatrix} A \dfrac{\partial p_0}{\partial q_0} + B \\[4pt] C \dfrac{\partial p_0}{\partial q_0} + D \end{bmatrix}
$$
is a symmetric pair. If $\left| C \dfrac{\partial p_0}{\partial q_0} + D \right| \neq 0$, then $\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} \sim \begin{bmatrix} N \\ I \end{bmatrix}$, where
$$
N = \left( A \frac{\partial p_0}{\partial q_0} + B \right) \left( C \frac{\partial p_0}{\partial q_0} + D \right)^{-1} \in Sm(n).
$$

By Theorem 2.29, $p$ can be represented as a function of $q$, i.e., $p(t) = p(q, t)$.

Let $\widetilde H(q, t) = -H(p, q)\big|_{p = p(q, t)} = -H(p(q, t), q)$. Consider a 1-form in $\mathbf{R}^{n+1}$:
$$
\omega^1 = p \,\mathrm{d} q + \widetilde H \,\mathrm{d} t.
$$
There is a scalar function $\varphi(q, t)$ such that
$$
\omega^1 = p \,\mathrm{d} q + \widetilde H \,\mathrm{d} t = \mathrm{d} \varphi
$$
iff $\omega^1$ is closed, i.e., the following matrix
$$
\begin{bmatrix}
\dfrac{\partial p_1}{\partial q_1} & \cdots & \dfrac{\partial p_1}{\partial q_n} & \dfrac{\partial p_1}{\partial t} \\
\vdots & & \vdots & \vdots \\
\dfrac{\partial p_n}{\partial q_1} & \cdots & \dfrac{\partial p_n}{\partial q_n} & \dfrac{\partial p_n}{\partial t} \\
\dfrac{\partial \widetilde H}{\partial q_1} & \cdots & \dfrac{\partial \widetilde H}{\partial q_n} & \dfrac{\partial \widetilde H}{\partial t}
\end{bmatrix}
= \begin{bmatrix}
\dfrac{\partial p}{\partial q} & \dfrac{\partial p}{\partial t} \\[4pt]
\dfrac{\partial \widetilde H}{\partial q} & \dfrac{\partial \widetilde H}{\partial t}
\end{bmatrix}
$$
is symmetric. We know that the matrix $\dfrac{\partial p}{\partial q}$ is symmetric, and so we need only prove
$$
\frac{\partial p}{\partial t} = \frac{\partial \widetilde H}{\partial q}.
$$
By the canonical Equations (2.15),
$$
-H_q = \frac{\mathrm{d} p(q, t)}{\mathrm{d} t} = \frac{\partial p}{\partial q} \frac{\mathrm{d} q}{\mathrm{d} t} + \frac{\partial p}{\partial t} = \frac{\partial p}{\partial q} H_p + \frac{\partial p}{\partial t}.
$$
In addition,
$$
\frac{\partial \widetilde H}{\partial q} = -\frac{\partial H(p(q, t), q)}{\partial q} = -\left( \frac{\partial p}{\partial q} \right)^{\!\top} H_p - H_q.
$$
Thus,
$$
\frac{\partial p}{\partial t} = -H_q - \frac{\partial p}{\partial q} H_p = \frac{\partial \widetilde H}{\partial q}.
$$
Consequently, there exists a scalar function $\varphi(q, t)$ such that
$$
p \,\mathrm{d} q + \widetilde H \,\mathrm{d} t = \mathrm{d} \varphi.
$$
Thus, it follows that
$$
p = \frac{\partial \varphi}{\partial q}, \qquad \varphi_t = \widetilde H = -H(p, q) = -H(\varphi_q, q),
$$
or
$$
\varphi_t + H(\varphi_q, q) = 0,
$$
which is called the Hamilton–Jacobi equation.
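As a sanity check (our own worked example, not from the book), take the harmonic oscillator $H(p, q) = \frac{1}{2}(p^2 + q^2)$ and the ansatz $\varphi(q, t) = \frac{q^2}{2} \tan(c - t)$ with a constant $c$:

```latex
\varphi_t = -\frac{q^2}{2}\,\sec^2(c - t), \qquad
H(\varphi_q, q) = \frac{1}{2}\left( q^2 \tan^2(c - t) + q^2 \right)
               = \frac{q^2}{2}\,\sec^2(c - t),
```

so $\varphi_t + H(\varphi_q, q) = 0$ holds identically, and $p = \varphi_q = q \tan(c - t)$ recovers a family of solution curves.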
Bibliography

[AA88] D.V. Anosov and V.I. Arnold: Dynamical Systems I. Springer, Berlin, (1988).
[AA89] V. I. Arnold and A. Avez: Ergodic Problems of Classical Mechanics. Addison-Wesley,
New York, (1989).
[Abd02] S. S. Abdullaev: The Hamilton-Jacobi method and Hamiltonian maps. J. Phys. A:
Math. Gen., 35(12):2811–2832, (2002).
[AKN78] V. I. Arnold, V. V. Kozlov, and A. I. Neishtadt: Mathematical Aspects of Classical
and Celestial Mechanics. Springer, Berlin, Second edition, (1978).
[AM78] R. Abraham and J. E. Marsden: Foundations of Mechanics. Reading, MA: Addison-
Wesley, Second edition, (1978).
[AMR88] R. Abraham, J. E. Marsden, and T. Ratiu: Manifolds, Tensor Analysis, and Applica-
tions. AMS 75. Springer-Verlag, Berlin, Second edition, (1988).
[AN90] V. I. Arnold and S. P. Novikov: Dynamical Systems IV. Springer-Verlag, Berlin, (1990).
[Arn88] V. I. Arnold: Geometrical Methods in The Theory of Ordinary Differential Equations.
Springer-Verlag, Berlin, (1988).
[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. Berlin Heidelberg:
Springer-Verlag, GTM 60, Second edition, (1989).
[Ber00] R. Berndt: An Introduction to Symplectic Geometry. AMS Providence, Rhode Island,
(2000).
[Bir23] G. D. Birkhoff: Relativity and Modern Physics. Harvard Univ. Press, Cambridge,
Mass., Second edition, (1923).
[BK89] G.W. Bluman and S. Kumei: Symmetries and differential equations. AMS 81.
Springer-Verlag, New York, (1989).
[Car65] C. Carathéodory: Calculus of Variations and Partial Differential Equations of First Order, Vol. 1. Holden-Day, San Francisco, (1965).
[Car70] H. Cartan: Differential Forms. Houghton-Mifflin, Boston, (1970).
[CH53] R. Courant and D. Hilbert: Methods of Mathematical Physics. Interscience, New York,
Second edition, (1953).
[Che53] S. S. Chern: Differential Manifolds. University of Chicago, (1953). Lecture notes.
[Fen86] K. Feng: Difference schemes for Hamiltonian formalism and symplectic geometry. J.
Comput. Math., 4:279–289, (1986).
[Fla] H. Flanders: Differential Forms. Academic Press, New York, Second edition, (1963).
[FQ87] K. Feng and M.Z. Qin: The symplectic methods for the computation of Hamiltonian
equations. In Y. L. Zhu and B. Y. Guo, editors, Numerical Methods for Partial Differential
Equations, Lecture Notes in Mathematics 1297, pages 1–37. Berlin, Springer, (1987).
[FQ91a] K. Feng and M.Z. Qin: Hamiltonian Algorithms for Hamiltonian Dynamical Systems.
Progr. Natur. Sci., 1(2):105–116, (1991).
[FQ91b] K. Feng and M.Z. Qin: Hamiltonian algorithms for Hamiltonian systems and a com-
parative numerical study. Comput. Phys. Comm., 65:173–187, (1991).
[FQ03] K. Feng and M. Z. Qin: Symplectic Algorithms for Hamiltonian Systems. Zhejiang
Science and Technology Publishing House, Hangzhou, in Chinese, First edition, (2003).

[FWQW89] K. Feng, H. M. Wu, M.Z. Qin, and D.L. Wang: Construction of canonical dif-
ference schemes for Hamiltonian formalism via generating functions. J. Comput. Math.,
7:71–96, (1989).
[Gol80] H. Goldstein: Classical Mechanics. Addison-Wesley Reading, Massachusetts, (1980).
[GS84] V. Guillemin and S. Sternberg: Symplectic Techniques in Physics. Cambridge Univer-
sity Press, Cambridge, (1984).
[Lan95] S. Lang: Differential and Riemannian Manifolds. Springer-Verlag, Berlin, (1995).
[LL99] L. D. Landau and E. M. Lifshitz: Mechanics, Volume I of Course of Theoretical
Physics. Corp. Butterworth, Heinemann, New York, Third edition, (1999).
[LM87] P. Libermann and C.M. Marle: Symplectic Geometry and Analytical Mechanics. Rei-
del Pub. Company, Boston, First edition, (1987).
[Mac70] S. MacLane: Hamiltonian mechanics and geometry. Amer. Math. Mon., 77(6):570–586, (1970).
[Sie43] C.L. Siegel: Symplectic geometry. Amer. J. Math, 65:1–86, (1943).
[Tre75] F. Treves: Pseudo-Differential Operators. N.Y.: Acad. Press, First edition, (1975).
[Wei77] A. Weinstein: Lectures on symplectic manifolds. In CBMS Regional Conference, 29.
American Mathematical Society, Providence, RI , (1977).
[Wes81] C. Von. Westenholz: Differential Forms in Mathematical Physics. North-Holland,
Amsterdam, Second edition, (1981).
Chapter 4.
Symplectic Difference Schemes for
Hamiltonian Systems

The canonicity of the phase flow for time-independent Hamiltonian systems is one of
the most important properties. It ensures the preservation of phase areas and the phase
volume. Thus, preserving the canonicity of transition of difference schemes from one
time step to the next is also important in the numerical solutions of Hamiltonian sys-
tems. The goal of this chapter is to find some simple symplectic schemes, i.e., to identify which ones, among the existing difference schemes, are symplectic.

4.1 Background
It is well known that Hamiltonian systems have many intrinsic properties: the preser-
vation of phase areas of even dimension and the phase volume, the conservation laws
of energy and momentum, and other symmetries.

4.1.1 Element and Notation for Hamiltonian Mechanics


Let H be a smooth function of 2n variables p1 , · · · , pn , q1 , · · · , qn . Then, the Hamil-
tonian canonical systems are of the form :

ṗ = −Hq , q̇ = Hp , (1.1)
where $p = (p_1, \cdots, p_n)^{\top}$, $q = (q_1, \cdots, q_n)^{\top}$. Let $z = \begin{pmatrix} p \\ q \end{pmatrix}$, and the standard symplectic matrix be
$$
J = \begin{bmatrix} O & I_n \\ -I_n & O \end{bmatrix}, \tag{1.2}
$$
where $I_n$ is the $n \times n$ identity matrix, and $J$ has the properties $J^{-1} = J^{\top} = -J$. Then,
system (1.1) can be written in a compact form:

$$
\dot z = J^{-1} H_z, \tag{1.3}
$$
where $H_z = \begin{pmatrix} H_p \\ H_q \end{pmatrix}$; $H$ is called the Hamiltonian function of the system. The phase flow of system (1.1) can be represented as $g_H^t$. According to the fundamental theorem

of a Hamiltonian system, the solution of a canonical system is a one-parameter group $G^t$ of symplectic (canonical) transformations; the corresponding Jacobians lie in the symplectic group $Sp(2n)$. Therefore, symplectic geometry serves as the
mathematical foundation of Hamiltonian mechanics. For simplicity, we consider only
the classical phase space R2n = Rnp × Rnq , where Rnp is called the momentum space,
and Rnq the configuration space. Locally, every 2n-dimensional manifold is diffeo-
morphic to a neighborhood of a point on R2n . The phase space R2n is equipped with
a standard symplectic structure defined by

$$
\omega_J = \sum_{i=1}^{n} \mathrm{d} z_i \wedge \mathrm{d} z_{n+i} = \sum_{i=1}^{n} \mathrm{d} p_i \wedge \mathrm{d} q_i, \tag{1.4}
$$

i.e., for each $z$ of $\mathbf{R}^{2n}$, it is a bilinear antisymmetric form
$$
\omega_J(\xi, \eta) = \xi^{\top} J \eta, \qquad \forall\, \xi, \eta \in T_z \mathbf{R}^{2n},
$$
for each pair of tangent vectors $\xi, \eta$ at the point $z$, where $J$ is the standard symplectic matrix of Equation (1.2).
Let w : R2n → R2n be a differential mapping, z ∈ R2n → w(z) ∈ R2n ; the
corresponding Jacobian matrix is denoted by
$$
\frac{\partial w}{\partial z} =
\begin{bmatrix}
\dfrac{\partial w_1}{\partial z_1} & \cdots & \dfrac{\partial w_1}{\partial z_{2n}} \\
\vdots & & \vdots \\
\dfrac{\partial w_{2n}}{\partial z_1} & \cdots & \dfrac{\partial w_{2n}}{\partial z_{2n}}
\end{bmatrix}.
$$

The mapping $w$ induces, for each $z \in \mathbf{R}^{2n}$, a linear mapping $w_*(z)$ from the tangent space at $z$ into the tangent space at $w(z)$:
$$
\xi = (\xi_1, \cdots, \xi_{2n}) \longmapsto w_* \xi = \frac{\partial w}{\partial z}\, \xi.
$$

Each 2-form $\omega$ on $\mathbf{R}^{2n}$ also induces a 2-form $w^* \omega$ on $\mathbf{R}^{2n}$ by the formula
$$
w^* \omega(\xi, \eta)_z \equiv \omega\!\left( \frac{\partial w}{\partial z}\, \xi,\; \frac{\partial w}{\partial z}\, \eta \right)_{w(z)}.
$$
If $\omega(\xi, \eta)_z = \xi^{\top} A(z)\, \eta$, $A^{\top}(z) = -A(z)$, then $w^* \omega(\xi, \eta) = \xi^{\top} B(z)\, \eta$, i.e.,
$$
B(z) = \left( \frac{\partial w}{\partial z} \right)^{\!\top} A(w(z))\, \frac{\partial w}{\partial z}.
$$
Refer to Definition 4.7 from Chapter 1.

Definition 1.1 (Diff). A diffeomorphism $w$ (a differentiable one-to-one onto mapping) of $\mathbf{R}^{2n}$ is called a canonical transformation if $w$ preserves the standard symplectic structure, i.e., $w^* \omega_J = \omega_J$,
$$
\left( \frac{\partial w}{\partial z} \right)^{\!\top} J\, \frac{\partial w}{\partial z} = J, \tag{1.5}
$$
i.e., the Jacobian $\dfrac{\partial w}{\partial z}$ is a symplectic matrix for every $z$.
Its geometric meaning is depicted in Fig. 1.1. According to the general theory of ODEs, for each Hamiltonian system (1.1), there corresponds a one-parameter group of diffeomorphisms $g^t$, at least locally in $t$ and $z$, of $\mathbf{R}^{2n}$ such that
$$
g^0 = \mathrm{id}, \qquad g^{t_1 + t_2} = g^{t_1} \cdot g^{t_2}.
$$
[Fig. 1.1. Geometric meaning of preserving symplectic structure: the parallelogram spanned by tangent vectors $\xi, \eta$ and its image under $\partial w / \partial z$ enclose equal areas.]

4.1.2 Geometrical Meaning of Preserving Symplectic Structure ω


If z(0) is taken as an initial value, then the solution of (1.1) can be written as

z(t) = g t z(0).

The basic property of a Hamiltonian system is that g t is a canonical transformation,


i.e.,
(g t )∗ ωJ = ωJ ,
for all $t$. This leads to the following class of phase-area conservation laws:
$$
\int_{g^t \sigma^2} \omega_J = \int_{\sigma^2} \omega_J, \qquad \text{every 2-chain } \sigma^2 \subset \mathbf{R}^{2n},
$$
$$
\int_{g^t \sigma^4} \omega_J \wedge \omega_J = \int_{\sigma^4} \omega_J \wedge \omega_J, \qquad \text{every 4-chain } \sigma^4 \subset \mathbf{R}^{2n},
$$
$$
\cdots\cdots
$$
$$
\int_{g^t \sigma^{2n}} \omega_J \wedge \cdots \wedge \omega_J = \int_{\sigma^{2n}} \omega_J \wedge \cdots \wedge \omega_J, \qquad \text{every } 2n\text{-chain } \sigma^{2n} \subset \mathbf{R}^{2n},
$$
where the last one is Liouville's phase-volume conservation law. Another class of conservation laws is related to the energy and all the first integrals. A smooth function $\varphi(z)$ is said to be a first integral if $\varphi(g^t z) = \varphi(z)$ for all $t, z$. The latter is equivalent to the condition $\{\varphi, H\} = 0$; $H$ usually represents the energy, which is a first integral itself.

The above situations can be generalized. A symplectic structure in $\mathbf{R}^{2n}$ is specified by a non-degenerate, closed 2-form
$$
\omega_K = \sum_{i < j} K_{ij}(z)\, \mathrm{d} z_i \wedge \mathrm{d} z_j, \tag{1.6}
$$
i.e.,
$$
\omega_K(\xi, \eta)_z = \frac{1}{2}\, \xi^{\top} K(z)\, \eta, \qquad K^{\top}(z) = -K(z), \qquad \det K(z) \neq 0.
$$
A differentiable mapping $w : \mathbf{R}^{2n} \to \mathbf{R}^{2n}$ is called K-symplectic if $w^* \omega_K = \omega_K$, i.e.,
$$
\left( \frac{\partial w}{\partial z} \right)^{\!\top} K\big( w(z) \big)\, \frac{\partial w}{\partial z} = K(z). \tag{1.7}
$$
The Darboux theorem establishes the equivalence between all symplectic structures. Every non-singular closed 2-form $\omega_K$ can be brought to the standard form
$$
\sum_{i < j} K_{ij}(z)\, \mathrm{d} z_i \wedge \mathrm{d} z_j = \sum_{i=1}^{n} \mathrm{d} w_i \wedge \mathrm{d} w_{n+i}
$$
locally by a suitable coordinate transformation $z \to w(z)$.

4.1.3 Some Properties of a Symplectic Matrix

From Subsection 2.3.2, a matrix $S$ of order $2n$ is called a symplectic matrix if it satisfies
$$
S^{\top} J S = J, \tag{1.8}
$$
where $S^{\top}$ is the transpose of $S$. All symplectic matrices form the symplectic group $Sp(2n)$.

Definition 1.2. A matrix $B$ of order $2n$ is called infinitesimal symplectic if
$$
J B + B^{\top} J = O. \tag{1.9}
$$

All infinitesimal symplectic matrices form a Lie algebra under the commutation operation $[A, B] = AB - BA$, denoted $sp(2n)$; $sp(2n)$ is the Lie algebra of the Lie group $Sp(2n)$. We have the following well-known propositions[FWQ90], which can be found in Chapter 2. Here, we omit the proofs.

Proposition 1.3. det S = 1, if S ∈ Sp(2n).

Proposition 1.4. S −1 = −JS  J = J −1 S  J, if S ∈ Sp(2n).

Proposition 1.5. SJS  = J, if S ∈ Sp(2n).


4.1 Background 191

Proposition 1.6. Let $S = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, where $A, B, C, D$ are $n \times n$ matrices; then $S \in Sp(2n)$ iff
$$
A B^{\top} - B A^{\top} = O, \quad C D^{\top} - D C^{\top} = O, \quad A D^{\top} - B C^{\top} = I,
$$
$$
A^{\top} C - C^{\top} A = O, \quad B^{\top} D - D^{\top} B = O, \quad A^{\top} D - C^{\top} B = I.
$$

Proposition 1.7. The matrices
$$
\begin{bmatrix} I & B \\ O & I \end{bmatrix}, \qquad \begin{bmatrix} I & O \\ D & I \end{bmatrix}
$$
are symplectic iff $B^{\top} = B$, $D^{\top} = D$.

Proposition 1.8. The matrix
$$
\begin{bmatrix} A & O \\ O & D \end{bmatrix} \in Sp(2n) \quad \text{iff} \quad A = (D^{\top})^{-1}.
$$

Proposition 1.9. The matrix $S = M^{-1} N \in Sp(2n)$ iff $M J M^{\top} = N J N^{\top}$.

Proposition 1.10. The matrix
$$
\begin{bmatrix} Q & I - Q \\ -(I - Q) & Q \end{bmatrix} \in Sp(2n) \quad \text{iff} \quad Q^2 = Q, \; Q^{\top} = Q.
$$

Proposition 1.11. If B ∈ sp(2n), then exp (B) ∈ Sp(2n).

Proposition 1.12. If $B \in sp(2n)$ and $|I + B| \neq 0$, then $F = (I + B)^{-1}(I - B) \in Sp(2n)$, the Cayley transform of $B$.

Proposition 1.13. If $B \in sp(2n)$, then $(B^{2m})^{\top} J = J B^{2m}$.

Proposition 1.14. If $B \in sp(2n)$, then $(B^{2m+1})^{\top} J = -J B^{2m+1}$.

Proposition 1.15. If $f(x)$ is an even polynomial and $B \in sp(2n)$, then $f(B^{\top}) J = J f(B)$.

Proposition 1.16. If $g(x)$ is an odd polynomial and $B \in sp(2n)$, then $g(B) \in sp(2n)$, i.e.,
$$
g(B^{\top}) J + J g(B) = O.
$$
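For a concrete check (our own sketch with $n = 1$), Propositions 1.11 and 1.12 can be verified numerically; for $2 \times 2$ matrices, $B \in sp(2)$ reduces to $\operatorname{tr} B = 0$ and $S \in Sp(2)$ to $\det S = 1$:

```python
# If B is infinitesimal symplectic (here: a traceless 2x2 matrix), then
# exp(B) (Proposition 1.11) and the Cayley transform (I+B)^{-1}(I-B)
# (Proposition 1.12) should both be symplectic, i.e. have determinant 1.

def mat_mul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_exp(B, terms=30):
    S = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):               # T accumulates B^k / k!
        T = [[sum(T[i][m]*B[m][j] for m in range(2))/k for j in range(2)] for i in range(2)]
        S = [[S[i][j] + T[i][j] for j in range(2)] for i in range(2)]
    return S

def det(X):
    return X[0][0]*X[1][1] - X[0][1]*X[1][0]

B = [[0.3, 0.7], [0.2, -0.3]]               # tr B = 0, so B is in sp(2)
E = mat_exp(B)

I = [[1.0, 0.0], [0.0, 1.0]]
Minus = [[I[i][j] - B[i][j] for j in range(2)] for i in range(2)]
Plus  = [[I[i][j] + B[i][j] for j in range(2)] for i in range(2)]
d = det(Plus)
Plus_inv = [[Plus[1][1]/d, -Plus[0][1]/d], [-Plus[1][0]/d, Plus[0][0]/d]]
Cayley = mat_mul(Plus_inv, Minus)           # (I+B)^{-1}(I-B)

err_exp = abs(det(E) - 1.0)
err_cayley = abs(det(Cayley) - 1.0)
```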

4.2 Symplectic Schemes for Linear Hamiltonian


Systems
A Hamiltonian system (1.1) is called linear if the Hamiltonian is a quadratic form of $z$:
$$
H(z) = \frac{1}{2} z^{\top} C z, \qquad C^{\top} = C,
$$
and $J$ is the standard antisymmetric matrix:
$$
J = \begin{bmatrix} O & I_n \\ -I_n & O \end{bmatrix}, \qquad J^{\top} = -J = J^{-1}, \qquad \det J = 1.
$$

Then, the canonical system (1.1), (1.3) become:


$$
\frac{\mathrm{d} z}{\mathrm{d} t} = B z, \qquad B = J^{-1} C, \qquad C^{\top} = C, \tag{2.1}
$$
where $B = J^{-1} C$ is infinitesimal symplectic. The solution of (2.1) is
$$
z(t) = g^t z(0), \qquad g^t = \exp(t B), \tag{2.2}
$$
where $g^t$, as the exponential transformation of the infinitesimal symplectic matrix $t B$, is symplectic (Proposition 1.11).
Consider now a quadratic form $F(z) = \frac{1}{2} z^{\top} A z$. The Poisson bracket of two quadratic forms $H, F$ is also a quadratic form:
$$
\{H, F\} = \frac{1}{2} z^{\top} (A J C - C J A)\, z.
$$
Theorem 2.1. The condition for the quadratic form $F$ to be an invariant integral of the linear Hamiltonian system (2.1) can be expressed in any one of the following equivalent ways:
$$
F\big( \exp(t J^{-1} C)\, z \big) \equiv F(z), \tag{2.3}
$$
$$
\{H, F\} = 0, \tag{2.4}
$$
$$
\big( \exp(t J^{-1} C) \big)^{\top} A\, \exp(t J^{-1} C) = A, \tag{2.5}
$$
$$
A J C = C J A. \tag{2.6}
$$

4.2.1 Some Symplectic Schemes for Linear Hamiltonian Systems


Some types of symplectic schemes for system (1.1) have been proposed[Fen85] ; the first of these is the time-centered Euler scheme (or midpoint Euler scheme)
$$
\frac{z^{n+1} - z^n}{\tau} = B\, \frac{z^{n+1} + z^n}{2}. \tag{2.7}
$$
The transition z n → z n+1 is given by

 
$$
z^{n+1} = F_\tau z^n, \qquad F_\tau = \phi\!\left( -\frac{\tau}{2} B \right), \qquad \phi(\lambda) = \frac{1 - \lambda}{1 + \lambda}, \tag{2.8}
$$
where $F_\tau$ is the Cayley transform of the infinitesimal symplectic matrix $-\dfrac{\tau}{2} B$, and is symplectic according to Proposition 1.12.
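The scheme (2.7)–(2.8) is easy to exercise numerically. A minimal sketch with $n = 1$ for the harmonic oscillator $H(p, q) = \frac{1}{2}(p^2 + q^2)$, i.e. $C = I$ and $B = J^{-1} C$ (our own illustrative example):

```python
# Midpoint (time-centered) Euler for the linear system z' = Bz with
# B = [[0,-1],[1,0]] (so p' = -q, q' = p).  The transition matrix
# F_tau = (I - tau*B/2)^{-1} (I + tau*B/2) is the Cayley transform of
# -tau*B/2; for 2x2 matrices, symplecticity is just det F_tau = 1, and
# the quadratic invariant H is preserved exactly.

def mat_mul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_inv(X):
    d = X[0][0]*X[1][1] - X[0][1]*X[1][0]
    return [[ X[1][1]/d, -X[0][1]/d],
            [-X[1][0]/d,  X[0][0]/d]]

def midpoint_transition(B, tau):
    I = [[1.0, 0.0], [0.0, 1.0]]
    M_minus = [[I[i][j] - tau/2*B[i][j] for j in range(2)] for i in range(2)]
    M_plus  = [[I[i][j] + tau/2*B[i][j] for j in range(2)] for i in range(2)]
    return mat_mul(mat_inv(M_minus), M_plus)

B = [[0.0, -1.0], [1.0, 0.0]]            # J^{-1}C with C = I, n = 1
F = midpoint_transition(B, tau=0.1)
det_F = F[0][0]*F[1][1] - F[0][1]*F[1][0]

p, q = 1.0, 0.0
H0 = (p*p + q*q)/2
for _ in range(1000):
    p, q = F[0][0]*p + F[0][1]*q, F[1][0]*p + F[1][1]*q
H1 = (p*p + q*q)/2                       # equals H0 up to roundoff
```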
The second scheme we consider is the staggered explicit scheme for a separable Hamiltonian. For a separable Hamiltonian $H(p, q) = U(p) + V(q)$,
$$
H(p, q) = \frac{1}{2} [p^{\top}, q^{\top}]\, S \begin{pmatrix} p \\ q \end{pmatrix} = \frac{1}{2} p^{\top} U p + \frac{1}{2} q^{\top} V q = U(p) + V(q), \tag{2.9}
$$
where
$$
S = \begin{bmatrix} U & O \\ O & V \end{bmatrix},
$$
$U^{\top} = U$ is positive definite and $V^{\top} = V$, the canonical Equation (1.1) becomes
$$
\frac{\mathrm{d} p}{\mathrm{d} t} = -V q, \qquad \frac{\mathrm{d} q}{\mathrm{d} t} = U p. \tag{2.10}
$$
The staggered explicit scheme is
$$
\frac{1}{\tau} (p^{n+1} - p^n) = -V q^{n + \frac{1}{2}}, \tag{2.11}
$$
$$
\frac{1}{\tau} \left( q^{n + \frac{1}{2} + 1} - q^{n + \frac{1}{2}} \right) = U p^{n+1}. \tag{2.12}
$$
The $p$'s are defined at integer times $t = n \tau$, and the $q$'s at half-integer times $t = \left( n + \frac{1}{2} \right) \tau$. The transition
$$
w^n = \begin{pmatrix} p^n \\ q^{n + \frac{1}{2}} \end{pmatrix} \longrightarrow \begin{pmatrix} p^{n+1} \\ q^{n + \frac{1}{2} + 1} \end{pmatrix} = w^{n+1}
$$
is given by $w^{n+1} = F_\tau w^n$, where
$$
F_\tau = \begin{bmatrix} I & O \\ -\tau U & I \end{bmatrix}^{-1} \begin{bmatrix} I & -\tau V \\ O & I \end{bmatrix}, \tag{2.13}
$$
as the product of two symplectic matrices, is symplectic (Proposition 1.7), and the scheme has second order accuracy.
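A minimal sketch of (2.11)–(2.13) for the scalar case $U(p) = p^2/2$, $V(q) = q^2/2$, i.e. $U = V = 1$ in (2.9) (our own illustrative example):

```python
# Staggered explicit (leapfrog) scheme: p lives on integer times, q on
# half-integer times.  The per-step transition matrix is the product
# [[1,0],[tau,1]] @ [[1,-tau],[0,1]] of two symplectic shears
# (Proposition 1.7), hence symplectic: det F_tau = 1.  Energy is not
# conserved exactly, but remains bounded near its initial value.

def leapfrog_step(p, q_half, tau):
    p_new = p - tau * q_half            # (2.11): V q = q
    q_half_new = q_half + tau * p_new   # (2.12): U p = p
    return p_new, q_half_new

tau = 0.1
F = [[1.0, -tau], [tau, 1.0 - tau*tau]]        # the product above
det_F = F[0][0]*F[1][1] - F[0][1]*F[1][0]

p, q_half = 1.0, 0.0
for _ in range(10000):
    p, q_half = leapfrog_step(p, q_half, tau)
H = (p*p + q_half*q_half)/2     # oscillates near 1/2, no secular drift
```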

4.2.2 Symplectic Schemes Based on Padé Approximation


We know that the trajectory $z(t) = g^t z_0$ is the solution satisfying the initial condition $z(0) = z_0$. In a linear system, $g^t$ coincides with its own Jacobian. One might ask how to approximate $\exp(t B)$. This can be simply described in terms of Padé rational approximation[FWQ90,Qin89]. Here, we consider the rational approximation to $\exp(x)$ defined by

$$
\exp(x) \sim \frac{n_{lm}(x)}{d_{lm}(x)} = g_{lm}(x), \tag{2.14}
$$
where
$$
n_{lm}(x) = \sum_{k=0}^{m} \frac{(l+m-k)!\, m!}{(l+m)!\, k!\, (m-k)!}\, x^k, \tag{2.15}
$$
$$
d_{lm}(x) = \sum_{k=0}^{l} \frac{(l+m-k)!\, l!}{(l+m)!\, k!\, (l-k)!}\, (-x)^k. \tag{2.16}
$$

For each pair of nonnegative integers $l$ and $m$, the Taylor series expansion of $\dfrac{n_{lm}(x)}{d_{lm}(x)}$ about the origin satisfies
$$
\exp(x) - \frac{n_{lm}(x)}{d_{lm}(x)} = O(|x|^{m+l+1}), \qquad |x| \longrightarrow 0, \tag{2.17}
$$
and the resulting $(l+m)$-th order Padé approximation of $\exp(x)$ is denoted by $g_{lm}$.
Theorem 2.2. Let B be an infinitesimal symplectic; then, for sufficiently small |t|,
glm (tB) is symplectic iff l = m, i.e., gll (x) is the (l, l) diagonal Padé approximant to
exp (x).

Proof. Sufficiency. Let $n_{ll}(x) = f(x) + g(x)$, $d_{ll}(x) = f(x) - g(x)$, where $f(x)$ is an even polynomial and $g(x)$ is an odd one. In order to prove $g_{ll}(t B) \in Sp(2n)$, by Proposition 1.9 we only need to verify
$$
\big( f(tB) + g(tB) \big) J \big( f(tB) + g(tB) \big)^{\top} = \big( f(tB) - g(tB) \big) J \big( f(tB) - g(tB) \big)^{\top}. \tag{2.18}
$$
By Propositions 1.15 and 1.16, the L.H.S. of Equation (2.18) is
$$
\big( f(tB) + g(tB) \big) J \big( f(tB^{\top}) + g(tB^{\top}) \big) = \big( f(tB) + g(tB) \big) \big( f(tB) - g(tB) \big) J. \tag{2.19}
$$
Similarly, for the R.H.S. of Equation (2.18), we have
$$
\big( f(tB) - g(tB) \big) J \big( f(tB^{\top}) - g(tB^{\top}) \big) = \big( f(tB) - g(tB) \big) \big( f(tB) + g(tB) \big) J. \tag{2.20}
$$
Comparing Equations (2.19) and (2.20) completes the proof of the "if" part of the theorem.

The "only if" part. Without loss of generality, we may take $l > m$. We only need to notice that in Equation (2.18), the order of the polynomial on the right-hand side is higher than that on the left-hand side. $\square$

From Theorem 2.2, we can obtain a sequence of symplectic difference schemes based
on the diagonal (k, k) Padé table. In Table 2.1, the element of l-th row, m-th column is
denoted by (l, m). For the (1,1) approximation (i.e., l = 1, m = 1), we have the Euler
centered scheme

$$
z^{n+1} = z^n + \frac{\tau B}{2} (z^n + z^{n+1}), \tag{2.21}
$$
$$
F_\tau^{(1,1)} = \phi^{(1,1)}(\tau B), \qquad \phi^{(1,1)}(\lambda) = \frac{1 + \dfrac{\lambda}{2}}{1 - \dfrac{\lambda}{2}}. \tag{2.22}
$$
This scheme has second order accuracy.
For the (2,2) Padé approximation, we have
$$
z^{n+1} = z^n + \frac{\tau B}{2} (z^n + z^{n+1}) + \frac{\tau^2 B^2}{12} (z^n - z^{n+1}), \tag{2.23}
$$
whose transition is
$$
F_\tau^{(2,2)} = \phi^{(2,2)}(\tau B), \qquad \phi^{(2,2)}(\lambda) = \frac{1 + \dfrac{\lambda}{2} + \dfrac{\lambda^2}{12}}{1 - \dfrac{\lambda}{2} + \dfrac{\lambda^2}{12}}. \tag{2.24}
$$
This scheme has fourth order accuracy.


For the (3,3) approximation, we have
$$
z^{n+1} = z^n + \frac{\tau B}{2} (z^n + z^{n+1}) + \frac{\tau^2 B^2}{10} (z^n - z^{n+1}) + \frac{\tau^3 B^3}{120} (z^n + z^{n+1}), \tag{2.25}
$$
$$
F_\tau^{(3,3)} = \phi^{(3,3)}(\tau B), \qquad \phi^{(3,3)}(\lambda) = \frac{1 + \dfrac{\lambda}{2} + \dfrac{\lambda^2}{10} + \dfrac{\lambda^3}{120}}{1 - \dfrac{\lambda}{2} + \dfrac{\lambda^2}{10} - \dfrac{\lambda^3}{120}}. \tag{2.26}
$$
This scheme has sixth order accuracy.
For the (4,4) approximation, we have
$$
z^{n+1} = z^n + \frac{\tau B}{2} (z^n + z^{n+1}) + \frac{3 \tau^2 B^2}{28} (z^n - z^{n+1}) + \frac{\tau^3 B^3}{84} (z^n + z^{n+1}) + \frac{\tau^4 B^4}{1680} (z^n - z^{n+1}), \tag{2.27}
$$
$$
F_\tau^{(4,4)} = \phi^{(4,4)}(\tau B), \qquad \phi^{(4,4)}(\lambda) = \frac{1 + \dfrac{\lambda}{2} + \dfrac{3 \lambda^2}{28} + \dfrac{\lambda^3}{84} + \dfrac{\lambda^4}{1680}}{1 - \dfrac{\lambda}{2} + \dfrac{3 \lambda^2}{28} - \dfrac{\lambda^3}{84} + \dfrac{\lambda^4}{1680}}. \tag{2.28}
$$
This scheme has eighth order accuracy.
Theorem 2.3. The difference schemes
$$
z^{n+1} = g_{ll}(\tau B)\, z^n, \qquad l = 1, 2, \cdots,
$$
for a linear Hamiltonian system (2.1) are symplectic, of $2l$-th order accuracy.
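A minimal numerical sketch of Theorems 2.2–2.3 with $n = 1$, using the rotation generator $B = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$ (harmonic oscillator, our own example), for which $\exp(\tau B)$ is the rotation by angle $\tau$:

```python
# Build the (2,2) diagonal Pade transition g_22(tau*B) and check that it
# is symplectic (det = 1 for 2x2) and fourth-order accurate, i.e. the
# error against the exact rotation exp(tau*B) is O(tau^5).
import math

def mat_poly(coeffs, B, tau):
    """Evaluate sum_k coeffs[k] * (tau*B)^k for a 2x2 matrix B."""
    P = [[0.0, 0.0], [0.0, 0.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for c in coeffs:
        P = [[P[i][j] + c*T[i][j] for j in range(2)] for i in range(2)]
        T = [[sum(tau*B[i][k]*T[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return P

def mat_inv(X):
    d = X[0][0]*X[1][1] - X[0][1]*X[1][0]
    return [[X[1][1]/d, -X[0][1]/d], [-X[1][0]/d, X[0][0]/d]]

def mat_mul(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

B = [[0.0, -1.0], [1.0, 0.0]]
tau = 0.1

num = mat_poly([1.0,  0.5, 1/12], B, tau)   # n_22(x) = 1 + x/2 + x^2/12
den = mat_poly([1.0, -0.5, 1/12], B, tau)   # d_22(x) = 1 - x/2 + x^2/12
G22 = mat_mul(mat_inv(den), num)

det_G = G22[0][0]*G22[1][1] - G22[0][1]*G22[1][0]   # symplectic: = 1
err = abs(G22[0][0] - math.cos(tau))                 # tiny for tau = 0.1
```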
Table 2.1. Padé approximation table: the entry in the $m$-th row and $l$-th column is $g_{lm}(x) = n_{lm}(x)/d_{lm}(x)$ from (2.15)–(2.16), for $l, m = 0, 1, \ldots, 4$. For instance, the first two rows read
$$
m = 0: \quad 1, \quad \frac{1}{1-x}, \quad \frac{1}{1-x+\frac{x^2}{2}}, \quad \frac{1}{1-x+\frac{x^2}{2}-\frac{x^3}{6}}, \quad \frac{1}{1-x+\frac{x^2}{2}-\frac{x^3}{6}+\frac{x^4}{24}};
$$
$$
m = 1: \quad 1+x, \quad \frac{1+\frac{x}{2}}{1-\frac{x}{2}}, \quad \frac{1+\frac{x}{3}}{1-\frac{2x}{3}+\frac{x^2}{6}}, \quad \frac{1+\frac{x}{4}}{1-\frac{3x}{4}+\frac{x^2}{4}-\frac{x^3}{24}}, \quad \frac{1+\frac{x}{5}}{1-\frac{4x}{5}+\frac{3x^2}{10}-\frac{x^3}{15}+\frac{x^4}{120}};
$$
the diagonal entries $g_{ll}$ are the functions $\phi^{(l,l)}$ of (2.22)–(2.28).

4.2.3 Generalized Cayley Transformation and Its Application


Definition 2.4. A matrix $B$ is called non-exceptional if
$$
\det(I + B) \neq 0. \tag{2.29}
$$

Let B be non-exceptional; let us introduce a matrix S by

I + S = 2(I + B)−1 , (2.30)

whose inversion is
I + B = 2(I + S)−1 . (2.31)
[FWQ90]
Therefore S is non-exceptional, and we have the Cayley transformation :

S = (I − B)(I + B)−1 = (I + B)−1 (I − B), (2.32)

and
B = (I − S)(I + S)−1 = (I + S)−1 (I − S). (2.33)
Let A be an arbitrary matrix. The equation

S  AS = A (2.34)

expresses the condition that the substitution of S into both variables z, w leaves in-
variant the bilinear form z  Aw.

Lemma 2.5. [Wey39] If the non-exceptional matrices B and S are connected by (2.32)
and (2.33), and A is an arbitrary matrix, then

S  AS = A (2.35)

iff
B  A + AB = O. (2.36)

Proof. Taking the transpose of (2.33), we obtain

B  (I + S  ) = I − S  .

Right multiplying by AS on both sides, and from (2.35), we obtain

A(S − I) = B  A(S + I).

Right multiplying by (S + I)−1 again on both sides, we obtain

−AB = B  A.

Conversely, by assuming (2.36) and right multiplying the transposed equation

S  (I + B  ) = I − B 

of (2.32) by $A$ on both sides, we have
$$
S^{\top} A (I - B) = A (I + B),
$$
which yields (2.35) on post-multiplication by $(I + B)^{-1}$. $\square$

Let $\phi(\lambda) = (1 - \lambda)/(1 + \lambda)$; then the Cayley transform of $B$ is denoted by $\phi(B) = (I + B)^{-1} (I - B)$. By taking successively $A = J$ and $A$ an arbitrary symmetric matrix in Lemma 2.5, the following theorem is obtained.

Theorem 2.6. The Cayley transform of a non-exceptional infinitesimal symplectic (symplectic) matrix is a non-exceptional symplectic (infinitesimal symplectic) matrix. If $B = J^{-1} C$, $C^{\top} = C$, $B \in sp(2n)$, $\det(I + \tau B) \neq 0$, and $A^{\top} = A$, then
$$
\big( \phi(\tau B) \big)^{\top} A\, \phi(\tau B) = A \tag{2.37}
$$
iff
$$
B^{\top} A + A B = O.
$$
In other words, a quadratic form $F(z) = \frac{1}{2} z^{\top} A z$ is invariant under the symplectic transformation $\phi(\tau B)$ iff $F(z)$ is an invariant integral of the Hamiltonian system (2.1).

Theorem 2.7. [FWQ90] Let $\psi(\lambda)$ be a function of a complex variable $\lambda$ satisfying:

1° $\psi(\lambda)$ is analytic with real coefficients in a neighborhood $D$ of $\lambda = 0$;

2° $\psi(\lambda)\, \psi(-\lambda) = 1$ in $D$;

3° $\psi_\lambda(0) \neq 0$.

Let $A, B$ be matrices of order $2n$. Then,
$$
\big( \psi(\tau B) \big)^{\top} A\, \psi(\tau B) = A
$$
for all $\tau$ with sufficiently small $|\tau|$, iff
$$
B^{\top} A + A B = O.
$$
We call these $\psi(\lambda)$ generalized Cayley transformations.

Proof. Condition 2° implies $\psi^2(0) = 1$; thus $\psi(0) \neq 0$. Suppose
$$
\big( \psi(\tau B) \big)^{\top} A\, \psi(\tau B) = A
$$
for all $\tau$ with $|\tau|$ sufficiently small. Differentiating both sides of this equation with respect to $\tau$, we get
$$
B^{\top} \psi_\lambda(\tau B^{\top})\, A\, \psi(\tau B) + \psi(\tau B^{\top})\, A B\, \psi_\lambda(\tau B) = O.
$$
Setting $\tau = 0$, it becomes
$$
(B^{\top} A + A B)\, \psi(0)\, \psi_\lambda(0) = O.
$$
Since $\psi(0)\, \psi_\lambda(0) \neq 0$ by conditions 2° and 3°, we get
$$
B^{\top} A + A B = O.
$$

Conversely, if $B^{\top} A + A B = O$, then it is not difficult to verify that the equations
$$
\psi_\lambda(\tau B^{\top})\, A = A\, \psi_\lambda(-\tau B), \qquad \psi(\tau B^{\top})\, A = A\, \psi(-\tau B)
$$
hold for any analytic function $\psi$. From condition 2°, it follows that
$$
\psi_\lambda(\lambda)\, \psi(-\lambda) - \psi(\lambda)\, \psi_\lambda(-\lambda) = 0.
$$
Therefore,
$$
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d} \tau} \Big( \big( \psi(\tau B) \big)^{\top} A\, \psi(\tau B) \Big)
&= \frac{\mathrm{d}}{\mathrm{d} \tau} \big( \psi(\tau B^{\top})\, A\, \psi(\tau B) \big) \\
&= B^{\top} \psi_\lambda(\tau B^{\top})\, A\, \psi(\tau B) + \psi(\tau B^{\top})\, A B\, \psi_\lambda(\tau B) \\
&= B^{\top} A\, \psi_\lambda(-\tau B)\, \psi(\tau B) + A B\, \psi_\lambda(-\tau B)\, \psi(\tau B) \\
&= (B^{\top} A + A B)\, \psi_\lambda(-\tau B)\, \psi(\tau B) = O,
\end{aligned}
$$
i.e.,
$$
\psi(\tau B^{\top})\, A\, \psi(\tau B) = \psi(0)\, A\, \psi(0) = A\, \psi^2(0) = A.
$$
The proof is completed. $\square$
By taking successively $A = J$ and $A$ an arbitrary symmetric matrix in Theorem 2.7, and using (2.3)–(2.6), we obtain the following theorems.
Theorem 2.8. Take $|\tau|$ sufficiently small so that $\tau B$ has no eigenvalue at a pole of the function $\psi(\lambda)$ of Theorem 2.7. Then, $\psi(\tau B) \in Sp(2n)$ iff $B \in sp(2n)$. Let $B = J^{-1} C$, $C^{\top} = C$, $A^{\top} = A$; then
$$
\big( \psi(\tau J^{-1} C) \big)^{\top} A\, \psi(\tau J^{-1} C) = A \tag{2.38}
$$
iff
$$
A J C = C J A.
$$
In other words, a quadratic form $F(z) = \frac{1}{2} z^{\top} A z$ is invariant under the symplectic transformation $\psi(\tau B)$ iff $F(z)$ is an invariant integral of the system (2.1).
The transformations $\psi(\tau B)$ based on Theorem 2.7 include the exponential transformation $\exp(\tau B)$, the Cayley transformation $\phi(-\tau B / 2)$, and the diagonal Padé transformations as special cases. Taking $\psi(\lambda)$ in Theorem 2.7 as a rational function, then necessarily $\psi(\lambda) = \dfrac{P(\lambda)}{P(-\lambda)}$, where $P(\lambda)$ is a polynomial, often normalized by setting $P(0) = 1$, $P'(0) \neq 0$.

Theorem 2.9. Let $P(\lambda)$ be a polynomial with $P(0) = 1$, $P'(0) \neq 0$, and
$$
\exp(\lambda) - \frac{P(\lambda)}{P(-\lambda)} = O(|\lambda|^{2k+1}). \tag{2.39}
$$

Then
$$
P(-\tau B)\, z^{m+1} = P(\tau B)\, z^m,
$$
i.e.,
$$
z^{m+1} = \big( P(-\tau B) \big)^{-1} P(\tau B)\, z^m \tag{2.40}
$$
is a symplectic scheme of order $2k$ for the linear system (2.1). This difference scheme and the original system (2.1) have the same set of quadratic invariants.
In order to find approximations to $\exp(x)$ of the form $\dfrac{P(x)}{P(-x)}$, we may express $\exp(x)$ in various rational-fraction ways. The following are examples:
$$
(1)\ \exp(x) \sim \frac{n_{ll}(x)}{n_{ll}(-x)} = \frac{d_{ll}(-x)}{d_{ll}(x)}. \qquad
(2)\ \exp(x) \sim g_{lm}\!\left( \tfrac{1}{2} x \right) \cdot g_{ml}\!\left( \tfrac{1}{2} x \right).
$$
$$
(3)\ \exp(x) = \frac{1 + \tanh \dfrac{x}{2}}{1 - \tanh \dfrac{x}{2}}. \qquad
(4)\ \exp(x) = \frac{\mathrm{e}^{\frac{x}{2}}}{\mathrm{e}^{-\frac{x}{2}}}. \qquad
(5)\ \exp(x) = \frac{\tfrac{1}{2} (1 + \mathrm{e}^x)}{\tfrac{1}{2} (1 + \mathrm{e}^{-x})}.
$$
Each denominator and numerator in the above expressions can be expanded about the origin in a Taylor series. The first terms of the expansions give the function $\psi(x) = \left( 1 + \dfrac{x}{2} \right) \Big/ \left( 1 - \dfrac{x}{2} \right)$, which yields the Euler centered scheme. Keeping $m\,(>1)$ terms in the expansions of both the denominator and the numerator, we get functions $\psi(x)$ that extend the Euler centered scheme. The schemes obtained in this way are all symplectic; however, the order of accuracy of the first and third kinds is higher than that of the last two kinds. For example, if in formula (5) the first three terms of the expansions of the denominator and numerator are retained, then a 4-th order symplectic scheme is obtained, whereas the same kind of truncation gives 6-th order schemes from (1) and (3).
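A quick numerical check of the claimed order for example (3) (our own sketch; "three terms" read as the first three nonzero Taylor terms of $\tanh(x/2)$):

```python
# psi(x) = P(x)/P(-x) with P(x) = 1 + x/2 - x^3/24 + x^5/240, i.e. 1 plus
# the first three nonzero Taylor terms of tanh(x/2).  Since exp(x) equals
# (1 + tanh(x/2)) / (1 - tanh(x/2)) exactly, the truncation error is
# O(x^7), consistent with a 6th order symplectic scheme (Theorem 2.9).
import math

def psi(x):
    P = lambda y: 1 + y/2 - y**3/24 + y**5/240
    return P(x) / P(-x)

errs = [abs(math.exp(x) - psi(x)) for x in (0.2, 0.1)]
order = math.log(errs[0] / errs[1], 2)   # observed exponent ~ 7 (error ~ C x^7)
```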

4.3 Symplectic Difference Schemes for a Nonlinear


Hamiltonian System
For a nonlinear Hamiltonian system, we give some simple symplectic difference schemes.

Centered Euler scheme. For Equation (1.3), we have the Euler centered scheme[Fen85]:
$$
\frac{1}{\tau} (z^{m+1} - z^m) = J^{-1} H_z\!\left( \frac{z^{m+1} + z^m}{2} \right), \tag{3.1}
$$
where the mapping $F_\tau : z^m \to z^{m+1}$ is nonlinear. By differentiation,



$$
\frac{\partial z^{m+1}}{\partial z^m} = I + \tau J^{-1} H_{zz}\!\left( \frac{z^{m+1} + z^m}{2} \right) \frac{1}{2} \left( \frac{\partial z^{m+1}}{\partial z^m} + I \right),
$$
where $H_{zz}$ is the Hessian matrix of the function $H(z)$ at the point $z = \dfrac{z^{m+1} + z^m}{2}$, and $\dfrac{\partial z^{m+1}}{\partial z^m}$ is the Jacobian matrix of $F_\tau$. We have
$$
\frac{\partial z^{m+1}}{\partial z^m}
= \left( I - \frac{\tau}{2} J^{-1} H_{zz}\!\left( \frac{z^{m+1} + z^m}{2} \right) \right)^{-1}
  \left( I + \frac{\tau}{2} J^{-1} H_{zz}\!\left( \frac{z^{m+1} + z^m}{2} \right) \right).
$$
When $z$ remains bounded and $\tau$ is taken to be sufficiently small, the infinitesimally symplectic matrix $\dfrac{\tau}{2} J^{-1} H_{zz}\!\left( \dfrac{z^{m+1} + z^m}{2} \right)$ remains non-exceptional. Then the Jacobian of $F_\tau$, as a Cayley transform, is symplectic. Thus, all the conservation laws for phase areas remain true. However, unlike the linear case, the first integrals $\varphi(z)$, including $H$ itself, are in general not conserved exactly; they satisfy the conservation law only approximately:
$$
\varphi(z^{m+1}) = \varphi(z^m) \mod O(\tau^3).
$$
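A minimal sketch of (3.1) for a concrete nonlinear system (the pendulum below is our own illustrative choice), solving the implicit step by fixed-point iteration:

```python
# Implicit midpoint (centered Euler) scheme for the pendulum
# H(p, q) = p^2/2 - cos q, so -H_q = -sin q and H_p = p.  The scheme is
# symplectic; the energy H is conserved only approximately, with a
# bounded (non-drifting) error of size O(tau^2).
import math

def midpoint_step(p, q, tau, iters=50):
    pn, qn = p, q                        # initial guess for z^{m+1}
    for _ in range(iters):               # Picard iteration on (3.1)
        pm, qm = (p + pn)/2, (q + qn)/2  # midpoint
        pn = p - tau*math.sin(qm)
        qn = q + tau*pm
    return pn, qn

tau = 0.05
p, q = 0.0, 2.0
H0 = p*p/2 - math.cos(q)
for _ in range(5000):
    p, q = midpoint_step(p, q, tau)
H1 = p*p/2 - math.cos(q)
drift = abs(H1 - H0)                     # small and bounded
```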


Property 3.1. Let $f(z) = \frac{1}{2} z^{\top} B z$ be a conservation law for the Hamiltonian system (1.3). Then, it is also a conservation law of the Euler centered scheme for system (1.3).

Proof. Since $f$ is a conservation law of the original system, $\langle B z,\, J^{-1} H_z(z) \rangle = 0$ for all $z$. Hence, by (3.1),
$$
\left\langle \frac{z^{k+1} - z^k}{\tau},\; B (z^{k+1} + z^k) \right\rangle
= \left\langle J^{-1} H_z\!\left( \frac{z^{k+1} + z^k}{2} \right),\; B (z^{k+1} + z^k) \right\rangle
= \left\langle (z^{k+1} + z^k),\; B J^{-1} H_z\!\left( \frac{z^{k+1} + z^k}{2} \right) \right\rangle = 0,
$$
and so
$$
\langle B z^k, z^k \rangle = \langle B z^{k+1}, z^{k+1} \rangle.
$$
The proof is completed. $\square$

Remark 3.2. Like the Euler centered scheme, the high-order schemes constructed from the diagonal elements of the Padé table preserve all quadratic first integrals of the original Hamiltonian system.

It is worth pointing out that the trapezoidal scheme
$$
\frac{1}{\tau} (z^{m+1} - z^m) = \frac{1}{2} J^{-1} \big( H_z(z^{m+1}) + H_z(z^m) \big) \tag{3.2}
$$
is non-symplectic, because the transition Jacobian
$$
\frac{\partial z^{m+1}}{\partial z^m}
= \left( I - \frac{\tau}{2} J^{-1} H_{zz}(z^{m+1}) \right)^{-1}
  \left( I + \frac{\tau}{2} J^{-1} H_{zz}(z^m) \right)
$$
is non-symplectic in general. However, by the nonlinear transformation[Dah59,QZZ95]
$$
\xi^k = \rho(z^k) = z^k + \frac{h}{2} f(z^k), \qquad \xi^{k+1} = \rho(z^{k+1}) = z^{k+1} + \frac{h}{2} f(z^{k+1}) \tag{3.3}
$$
(here $f(z) = J^{-1} H_z(z)$ and $h = \tau$), the trapezoidal scheme can be transformed into the symplectic Euler centered scheme. Indeed,
$$
\xi^k + \xi^{k+1} = z^k + z^{k+1} + \frac{h}{2} \big( f(z^k) + f(z^{k+1}) \big).
$$
Applying (3.2) to the above formula, we get
$$
\xi^k + \xi^{k+1} = z^k + z^{k+1} + z^{k+1} - z^k = 2 z^{k+1}.
$$
By taking $z^{k+1} = \dfrac{\xi^k + \xi^{k+1}}{2}$ in the second equation of (3.3), we obtain
$$
\xi^{k+1} = \frac{\xi^k + \xi^{k+1}}{2} + \frac{h}{2} f\!\left( \frac{\xi^k + \xi^{k+1}}{2} \right),
$$
i.e.,
$$
\xi^{k+1} = \xi^k + h f\!\left( \frac{\xi^k + \xi^{k+1}}{2} \right),
$$
which is the Euler centered scheme.
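The conjugacy can be checked numerically; a sketch (our own example system, writing the canonical equations as $z' = f(z)$ for the pendulum):

```python
# One trapezoidal step z^k -> z^{k+1} for z' = f(z), f(p,q) = (-sin q, p).
# The transformed points xi = z + (h/2) f(z) should satisfy the midpoint
# rule  xi^{k+1} = xi^k + h f((xi^k + xi^{k+1})/2)  exactly, per (3.3).
import math

def f(z):
    p, q = z
    return (-math.sin(q), p)

def trapezoidal_step(z, h, iters=60):
    zn = z
    fz = f(z)
    for _ in range(iters):                    # solve the implicit relation
        fn = f(zn)
        zn = (z[0] + h/2*(fz[0] + fn[0]), z[1] + h/2*(fz[1] + fn[1]))
    return zn

h = 0.05
z0 = (0.3, 1.2)
z1 = trapezoidal_step(z0, h)

xi0 = (z0[0] + h/2*f(z0)[0], z0[1] + h/2*f(z0)[1])
xi1 = (z1[0] + h/2*f(z1)[0], z1[1] + h/2*f(z1)[1])

mid = ((xi0[0] + xi1[0])/2, (xi0[1] + xi1[1])/2)
resid = max(abs(xi1[0] - xi0[0] - h*f(mid)[0]),
            abs(xi1[1] - xi0[1] - h*f(mid)[1]))   # ~ 0: xi obeys the midpoint rule
```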

Theorem 3.3. The trapezoidal scheme (3.2) preserves the following symplectic structure[WT03]:
$$
J + \frac{h^2}{4} H_{zz}(z)\, J\, H_{zz}(z), \tag{3.4}
$$
i.e.,
$$
\left( \frac{\partial z^{k+1}}{\partial z^k} \right)^{\!\top} \left( J + \frac{h^2}{4} H_{zz}(z^{k+1})\, J\, H_{zz}(z^{k+1}) \right) \frac{\partial z^{k+1}}{\partial z^k}
= J + \frac{h^2}{4} H_{zz}(z^k)\, J\, H_{zz}(z^k).
$$
Proof. The proof can be obtained by direct calculation, using the nonlinear transformation (3.3) in (1.7). $\square$

Remark 3.4. For the canonical system with a general separable Hamiltonian $H(p, q) = U(p) + V(q)$, we have
$$
\frac{\mathrm{d} p}{\mathrm{d} t} = -V_q(q), \qquad \frac{\mathrm{d} q}{\mathrm{d} t} = U_p(p), \tag{3.5}
$$
and the staggered scheme
$$
\frac{1}{\tau} (p^{m+1} - p^m) = -V_q\big( q^{m + \frac{1}{2}} \big), \qquad
\frac{1}{\tau} \big( q^{m + 1 + \frac{1}{2}} - q^{m + \frac{1}{2}} \big) = U_p(p^{m+1}). \tag{3.6}
$$

The transition $F_\tau : \begin{pmatrix} p^m \\ q^{m + \frac{1}{2}} \end{pmatrix} \to \begin{pmatrix} p^{m+1} \\ q^{m + 1 + \frac{1}{2}} \end{pmatrix}$ has the Jacobian
$$
\begin{bmatrix} I & O \\ -\tau M & I \end{bmatrix}^{-1} \begin{bmatrix} I & -\tau L \\ O & I \end{bmatrix},
$$
with $M = U_{pp}(p^{m+1})$ and $L = V_{qq}\big( q^{m + \frac{1}{2}} \big)$. From Proposition 1.7, it is symplectic.

Property 3.5. Let \(f(p, q) = p^T B q\) be a conservation law of (3.5). Then \((p^{k+1})^T B q^{k+\frac32} = (p^k)^T B q^{k+\frac12}\) is a conservation law of the difference scheme (3.6) as well.

Proof. Indeed, because \(f(p, q)\) is a conservation law of the original Hamiltonian system,
\[
  \bigl\langle Bp,\, U_p(p) \bigr\rangle = 0, \qquad \bigl\langle Bq,\, V_q(q) \bigr\rangle = 0,
\]
we get
\[
  \Bigl\langle \frac{q^{k+\frac32} - q^{k+\frac12}}{\tau},\, Bp^{k+1} \Bigr\rangle
  = \bigl\langle U_p(p^{k+1}),\, Bp^{k+1} \bigr\rangle = 0,
\]
\[
  \Bigl\langle \frac{p^{k+1} - p^k}{\tau},\, Bq^{k+\frac12} \Bigr\rangle
  = -\bigl\langle V_q(q^{k+\frac12}),\, Bq^{k+\frac12} \bigr\rangle = 0.
\]
Subtracting the two equations above, we get
\[
  \bigl( Bp^{k+1},\, q^{k+\frac32} \bigr) = \bigl( Bp^k,\, q^{k+\frac12} \bigr),
\]
which completes the proof. \(\Box\)
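A quick numerical sketch (our own test problem, not the book's): for \(n = 2\) with \(U(p) = |p|^2/2\), \(V(q) = |q|^2/2\), the angular momentum \(f(p,q) = p_1 q_2 - p_2 q_1\) (i.e., \(p^T B q\) with \(B\) antisymmetric) is a conservation law of (3.5), and the staggered scheme (3.6) conserves it exactly, as Property 3.5 asserts.

```python
import numpy as np

# Sketch (ours) of Property 3.5: the staggered scheme (3.6) exactly conserves
# the bilinear invariant p^T B q. Test case: U(p)=|p|^2/2, V(q)=|q|^2/2,
# B = [[0,1],[-1,0]] so p^T B q = p1*q2 - p2*q1 (angular momentum).
B = np.array([[0.0, 1.0], [-1.0, 0.0]])
tau = 0.1
p = np.array([0.3, -0.7])          # p^0
q = np.array([1.1, 0.5])           # q^{1/2}
inv0 = p @ B @ q                   # (p^0)^T B q^{1/2}
for _ in range(1000):
    p = p - tau * q                # p^{k+1} = p^k - tau * V_q(q^{k+1/2})
    q = q + tau * p                # q^{k+3/2} = q^{k+1/2} + tau * U_p(p^{k+1})
inv1 = p @ B @ q
drift = abs(inv1 - inv0)
assert drift < 1e-12               # conserved up to roundoff
```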

4.4 Explicit Symplectic Scheme for Hamiltonian System

The oldest and simplest difference scheme is the explicit Euler method. Usually, it is not symplectic for general Hamiltonian systems. It is interesting to ask: under what conditions on the Hamiltonian system does the explicit Euler method become symplectic? In fact, the explicit Euler scheme is symplectic precisely when it coincides with the phase flow of the system (i.e., the exact solution). Most of the important Hamiltonian systems can be decomposed into a sum of such simple systems; the composition of the Euler methods for these subsystems then yields a symplectic method, which is still explicit. Such systems are called symplectically separable. In particular, classical separable Hamiltonian systems are symplectically separable. In this section, we will prove that any polynomial Hamiltonian is symplectically separable.

4.4.1 Systems with Nilpotent of Degree 2

For a Hamiltonian system (1.3), the oldest and simplest scheme is the explicit Euler scheme
\[
  \hat z = E_H^\tau z := z + \tau J H_z(z), \tag{4.1}
\]
where \(E_H^\tau = 1 + \tau J H_z\). Usually, the scheme (4.1) is non-symplectic. However, it is symplectic for a specific kind of Hamiltonian system, called a system with nilpotent of degree 2.

Definition 4.1. [FW98] A Hamiltonian system is nilpotent of degree 2 if it satisfies
\[
  J H_{zz}(z)\, J H_z(z) = 0, \qquad \forall\, z \in \mathbf{R}^{2n}. \tag{4.2}
\]

Evidently, \(H(p, q) = \phi(p)\) and \(H(p, q) = \psi(q)\), which represent inertial flow and stagnant flow respectively, are nilpotent of degree 2, since for \(H(p, q) = \phi(p)\),
\[
  H_{zz}(z)\, J H_z(z)
  = \begin{bmatrix} \phi_{pp} & O \\ O & O \end{bmatrix}
    \begin{bmatrix} O & -I \\ I & O \end{bmatrix}
    \begin{bmatrix} \phi_p \\ O \end{bmatrix}
  = \begin{bmatrix} \phi_{pp} & O \\ O & O \end{bmatrix}
    \begin{bmatrix} O \\ \phi_p \end{bmatrix} = O,
\]
and for \(H(p, q) = \psi(q)\),
\[
  H_{zz}(z)\, J H_z(z)
  = \begin{bmatrix} O & O \\ O & \psi_{qq} \end{bmatrix}
    \begin{bmatrix} O & -I \\ I & O \end{bmatrix}
    \begin{bmatrix} O \\ \psi_q \end{bmatrix}
  = \begin{bmatrix} O & O \\ O & \psi_{qq} \end{bmatrix}
    \begin{bmatrix} -\psi_q \\ O \end{bmatrix} = O.
\]
Theorem 4.2. If \(H\) is nilpotent of degree 2, then the explicit Euler scheme \(E_H^\tau\) is the exact phase flow of the Hamiltonian system, and hence symplectic.

Proof. Let \(z = z(0)\). From the condition (4.2), it follows that
\[
  \ddot z(t) = \frac{d}{dt}\, J H_z(z(t)) = J H_{zz}(z(t))\, \dot z(t)
  = J H_{zz}(z(t))\, J H_z(z(t)) = 0,
\]
and therefore
\[
  \dot z(t) = \dot z(0) = J H_z(z(0)).
\]
Hence,
\[
  z(t) = z(0) + t\, J H_z(z(0)) = z + t\, J H_z(z) = E_H^t(z).
\]
This is just the explicit Euler scheme \(E_H^t\). This shows that for such a system the explicit Euler scheme \(E_H^\tau\) is the exact phase flow, and therefore symplectic. \(\Box\)
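Theorem 4.2 can be checked directly on a small example of our own: for the stagnant flow \(H(p, q) = \cos q\), the explicit Euler scheme is the exact phase flow, so one Euler step of size \(t\) equals \(k\) steps of size \(t/k\) for any \(k\).

```python
import numpy as np

# A sketch (our example) of Theorem 4.2: for H(p, q) = cos q, a system that is
# nilpotent of degree 2, explicit Euler IS the exact flow, so step sizes compose.
def euler_step(z, tau):
    # H(p, q) = cos q: pdot = -H_q = sin q, qdot = H_p = 0
    p, q = z
    return np.array([p + tau * np.sin(q), q])

z0 = np.array([0.2, 1.3])
big = euler_step(z0, 1.0)          # one step of size 1
small = z0.copy()
for _ in range(1000):              # 1000 steps of size 0.001
    small = euler_step(small, 0.001)
err = np.max(np.abs(big - small))
assert err < 1e-11                 # agree up to roundoff: the flow is exact
```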
Theorem 4.3. Let \(\phi(u): \mathbf{R}^n \to \mathbf{R}\) be a function of the \(n\) variables \(u = (u_1, u_2, \cdots, u_n)\). Let \(C_{n \times 2n} = (A, B)\) be a linear transformation from \(\mathbf{R}^{2n}\) to \(\mathbf{R}^n\). Then the Hamiltonian \(H(z) = \phi(Cz)\) satisfies
\[
  J H_{zz}(z)\, J H_z(z) = O, \qquad \forall\, \phi,\, z, \tag{4.3}
\]
iff
\[
  C J C^T = O. \tag{4.4}
\]

Proof. Since
\[
  J H_{zz}(z)\, J H_z(z) = J C^T \phi_{uu}(Cz)\, C J C^T \phi_u(Cz), \tag{4.5}
\]
the sufficiency is trivial.

We now prove the necessity. If
\[
  J H_{zz}(z)\, J H_z(z) = O, \qquad \forall\, \phi,\, z,
\]
then from (4.5) it follows that
\[
  J C^T \phi_{uu}(Cz)\, C J C^T \phi_u(Cz) = O, \qquad \forall\, \phi,\, z.
\]
In particular, take \(\phi(u) = \frac{1}{2} u^T u\); then
\[
  J C^T C J C^T C z = O, \qquad \forall\, z,
\]
i.e.,
\[
  J C^T C J C^T C = O.
\]
Multiplying this equation by \(C\) on the left and by \(J C^T\) on the right, we get
\[
  (C J C^T)^3 = O.
\]
The anti-symmetry of \(C J C^T\) implies \(C J C^T = O\). \(\Box\)

Lemma 4.4. Let \(C = (A, B)\); then \(C J C^T = O\) if and only if \(A B^T = B A^T\).

Theorem 4.5. Consider a Hamiltonian system with
\[
  H(z) = H(p, q) = \phi(Cz) = \phi(Ap + Bq), \qquad A B^T = B A^T,
\]
where \(\phi(u)\) is any function of \(n\) variables. The explicit Euler method
\[
  \hat z = E_H^\tau z = E_\phi^\tau z = z + \tau J H_z(z) = z + \tau J C^T \phi_u(Cz)
\]
is the exact phase flow, i.e.,
\[
  e_\phi^\tau := E_\phi^\tau = 1 + \tau J H_z = 1 + \tau J C^T \phi_u \circ C,
\]
hence \(E_\phi^\tau\) is symplectic.

4.4.2 Symplectically Separable Hamiltonian Systems

Definition 4.6. [FW98,FQ91] A Hamiltonian \(H(z)\) is symplectically separable if
\[
  H(z) = \sum_{i=1}^m H_i(z), \qquad H_i(z) = \phi_i(C_i z) = \phi_i(A_i p + B_i q), \tag{4.6}
\]
where the \(\phi_i\) are functions of \(n\) variables and each \(C_i = (A_i, B_i)\) satisfies the condition \(A_i B_i^T = B_i A_i^T\) \((i = 1, \cdots, m)\). Obviously, we have the following proposition.

Proposition 4.7. A linear combination of symplectically separable Hamiltonians is symplectically separable.
For a symplectically separable Hamiltonian (4.6), the explicit composition scheme
\[
  g_H^\tau = E_m^\tau \circ E_{m-1}^\tau \circ \cdots \circ E_2^\tau \circ E_1^\tau
  := E_{H_m}^\tau \circ E_{H_{m-1}}^\tau \circ \cdots \circ E_{H_2}^\tau \circ E_{H_1}^\tau \tag{4.7}
\]
is symplectic and of order 1. As a matter of fact,
\[
\begin{aligned}
  E_{H_2}^\tau \circ E_{H_1}^\tau
  &= (1 + \tau J H_{2,z}) \circ (1 + \tau J H_{1,z}) \\
  &= 1 + \tau J H_{2,z} + \tau J H_{1,z} + O(\tau^2)
   = 1 + \tau J (H_{2,z} + H_{1,z}) + O(\tau^2),
\end{aligned}
\]
and, inductively,
\[
\begin{aligned}
  g_H^\tau &= E_m^\tau \circ E_{m-1}^\tau \circ \cdots \circ E_2^\tau \circ E_1^\tau \\
  &= (1 + \tau J H_{m,z}) \circ \Bigl( 1 + \tau J \sum_{i=1}^{m-1} H_{i,z} + O(\tau^2) \Bigr) \\
  &= 1 + \tau J \sum_{i=1}^m H_{i,z} + O(\tau^2)
   = 1 + \tau J H_z + O(\tau^2).
\end{aligned}
\]
The symplecticity of \(g_H^\tau\) follows from the fact that symplectic maps on \(\mathbf{R}^{2n}\) form a group under composition. Similarly,
\[
  \check g_H^\tau = E_1^\tau \circ E_2^\tau \circ \cdots \circ E_{m-1}^\tau \circ E_m^\tau
\]
is symplectic and of order 1.


More discussion on how to construct separable schemes of higher order is provided in Chapter 8.
Example 4.8. The Hamiltonians [FW98,FQ91]
\[
  H_k(p, q) = \sum_{i=0}^{k-1} \cos\Bigl( p \cos\frac{2\pi i}{k} + q \sin\frac{2\pi i}{k} \Bigr),
\]
with \(k\)-fold rotational symmetry in the phase plane [2,4], are not separable in the conventional sense unless \(k = 1, 2, 4\). Nevertheless, they are always symplectically separable, since every term \(\cos\bigl( p \cos\frac{2\pi i}{k} + q \sin\frac{2\pi i}{k} \bigr)\) is nilpotent of degree 2 according to Theorem 4.3. For example, for \(k = 3\),
\[
\begin{aligned}
  H_3(p, q) &= \cos p + \cos\Bigl( p \cos\frac{2\pi}{3} + q \sin\frac{2\pi}{3} \Bigr)
             + \cos\Bigl( p \cos\frac{4\pi}{3} + q \sin\frac{4\pi}{3} \Bigr) \\
  &= \cos p + \cos\Bigl( \frac{1}{2} p - \frac{\sqrt{3}}{2} q \Bigr)
            + \cos\Bigl( -\frac{1}{2} p - \frac{\sqrt{3}}{2} q \Bigr),
\end{aligned}
\]

and the explicit symplectic scheme of order 1 is
\[
\begin{aligned}
  q^1 &= q - \frac{\tau}{2} \sin\Bigl( \frac{1}{2} p + \frac{\sqrt{3}}{2} q \Bigr), &
  p^1 &= p + \frac{\sqrt{3}}{2}\, \tau \sin\Bigl( \frac{1}{2} p + \frac{\sqrt{3}}{2} q \Bigr), \\
  q^2 &= q^1 - \frac{\tau}{2} \sin\Bigl( \frac{1}{2} p^1 - \frac{\sqrt{3}}{2} q^1 \Bigr), &
  p^2 &= p^1 - \frac{\sqrt{3}}{2}\, \tau \sin\Bigl( \frac{1}{2} p^1 - \frac{\sqrt{3}}{2} q^1 \Bigr), \\
  \hat q &= q^2 - \tau \sin p^2, & \hat p &= p^2.
\end{aligned}
\]
Using the composition theory discussed in Chapter 8, we can construct explicit symplectic schemes with higher-order accuracy.
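The scheme for \(H_3\) above can be tested numerically (our own sketch): each sub-step is the exact flow of one nilpotent term, so the composition is symplectic; for one degree of freedom this means the Jacobian of one step has determinant 1.

```python
import numpy as np

# Sketch (ours) checking the explicit order-1 scheme for H_3 above: the
# composition of three exact nilpotent flows is symplectic, i.e. for n = 1
# the Jacobian determinant of one step equals 1.
s3 = np.sqrt(3.0)

def step(z, tau):
    p, q = z
    u = 0.5 * p + 0.5 * s3 * q             # flow of cos(p/2 + sqrt(3)q/2)
    q = q - 0.5 * tau * np.sin(u)
    p = p + 0.5 * s3 * tau * np.sin(u)
    u = 0.5 * p - 0.5 * s3 * q             # flow of cos(p/2 - sqrt(3)q/2)
    q = q - 0.5 * tau * np.sin(u)
    p = p - 0.5 * s3 * tau * np.sin(u)
    q = q - tau * np.sin(p)                # flow of cos(p)
    return np.array([p, q])

tau, z = 0.1, np.array([0.3, 0.8])
eps = 1e-6
Jac = np.zeros((2, 2))                     # Jacobian by central differences
for j in range(2):
    e = np.zeros(2); e[j] = eps
    Jac[:, j] = (step(z + e, tau) - step(z - e, tau)) / (2 * eps)
defect = abs(np.linalg.det(Jac) - 1.0)
assert defect < 1e-8
```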

4.4.3 Separability of All Polynomials in \(\mathbf{R}^{2n}\)

Theorem 4.9. [FW98] Every monomial \(x^{n-k} y^k\) of degree \(n\) in the 2 variables \(x\) and \(y\), \(n \geq 2\), \(0 \leq k \leq n\), can be expanded as a linear combination of the \(n + 1\) terms
\[
  \{ (x + y)^n,\ (x + 2y)^n,\ \cdots,\ (x + 2^{n-2} y)^n,\ x^n,\ y^n \}.
\]

Proof. Using the binomial expansion,
\[
  (x + y)^n = x^n + \binom{n}{1} x^{n-1} y + \binom{n}{2} x^{n-2} y^2 + \cdots
            + \binom{n}{2} x^2 y^{n-2} + \binom{n}{1} x y^{n-1} + y^n.
\]
Define
\[
  P_1(x, y) := (x + y)^n - x^n - y^n
  = \binom{n}{1} x^{n-1} y + \binom{n}{2} x^{n-2} y^2 + \cdots + \binom{n}{2} x^2 y^{n-2} + \binom{n}{1} x y^{n-1},
\]
which consists of the mixed terms; \(P_1\) is a linear combination of the 3 terms \((x+y)^n\), \(x^n\), and \(y^n\). Since
\[
\begin{aligned}
  P_1(x, 2y) &= 2 \binom{n}{1} x^{n-1} y + 2^2 \binom{n}{2} x^{n-2} y^2 + \cdots
              + 2^{n-2} \binom{n}{2} x^2 y^{n-2} + 2^{n-1} \binom{n}{1} x y^{n-1}, \\
  2 P_1(x, y) &= 2 \binom{n}{1} x^{n-1} y + 2 \binom{n}{2} x^{n-2} y^2 + \cdots
              + 2 \binom{n}{2} x^2 y^{n-2} + 2 \binom{n}{1} x y^{n-1},
\end{aligned}
\]
we may define
\[
  P_2(x, y) := P_1(x, 2y) - 2 P_1(x, y)
  = (2^2 - 2) \binom{n}{2} x^{n-2} y^2 + \cdots + (2^{n-2} - 2) \binom{n}{2} x^2 y^{n-2}
  + (2^{n-1} - 2) \binom{n}{1} x y^{n-1},
\]
which is a linear combination of the 4 terms \((x+y)^n\), \((x+2y)^n\), \(x^n\), and \(y^n\). Next,
\[
  P_3(x, y) := P_2(x, 2y) - 2^2 P_2(x, y)
  = (2^3 - 2^2)(2^3 - 2) \binom{n}{3} x^{n-3} y^3 + \cdots
  + (2^{n-1} - 2^2)(2^{n-1} - 2) \binom{n}{1} x y^{n-1},
\]
which is a linear combination of the 5 terms \((x+y)^n\), \((x+2y)^n\), \((x+2^2 y)^n\), \(x^n\), and \(y^n\). Continuing in this way, define
\[
  P_{n-2}(x, y) := P_{n-3}(x, 2y) - 2^{n-3} P_{n-3}(x, y)
  = (2^{n-2} - 2^{n-3}) \cdots (2^{n-2} - 2) \binom{n}{2} x^2 y^{n-2}
  + (2^{n-1} - 2^{n-3}) \cdots (2^{n-1} - 2) \binom{n}{1} x y^{n-1},
\]
which is a linear combination of the \(n\) terms \((x+y)^n\), \((x+2y)^n\), \(\cdots\), \((x+2^{n-3} y)^n\), \(x^n\), and \(y^n\). Finally, we get
\[
  P_{n-1}(x, y) := P_{n-2}(x, 2y) - 2^{n-2} P_{n-2}(x, y)
  = (2^{n-1} - 2^{n-2})(2^{n-1} - 2^{n-3}) \cdots (2^{n-1} - 2) \binom{n}{1} x y^{n-1}
  = \gamma_{n-1}\, x y^{n-1}, \qquad \gamma_{n-1} \neq 0,
\]
a linear combination of the \(n + 1\) terms \((x+y)^n\), \((x+2y)^n\), \(\cdots\), \((x+2^{n-2} y)^n\), \(x^n\), and \(y^n\). Hence the mixed term \(x y^{n-1}\) is expressible in these \(n + 1\) terms. Then, from the expressibility of \(P_{n-2}(x, y)\) and \(x y^{n-1}\), we know that \(x^2 y^{n-2}\) is expressible in the same \(n + 1\) terms. Similarly, \(x^3 y^{n-3}\), \(x^4 y^{n-4}\), \(\cdots\), \(x^{n-2} y^2\), and \(x^{n-1} y\) are all expressible in these \(n + 1\) terms. \(\Box\)
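The expansion of Theorem 4.9 can also be obtained by solving a small linear system, as in the following sketch (our own code, with \(n = 5\) and target monomial \(x y^{n-1}\) as an example): match monomial coefficients of the \(n+1\) basis terms against the target.

```python
import numpy as np
from math import comb

# Sketch (ours) of Theorem 4.9: expand a mixed monomial x^{n-k} y^k in the
# n+1 terms (x+y)^n, (x+2y)^n, ..., (x+2^{n-2}y)^n, x^n, y^n.
n = 5
cs = [2**j for j in range(n - 1)]              # 1, 2, 4, ..., 2^{n-2}
A = np.zeros((n + 1, n + 1))
for col, c in enumerate(cs):                   # column for (x + c*y)^n in the
    for k in range(n + 1):                     # monomial basis {x^{n-k} y^k}
        A[k, col] = comb(n, k) * c**k
A[0, n - 1] = 1.0                              # column for x^n
A[n, n] = 1.0                                  # column for y^n
target = np.zeros(n + 1)
target[n - 1] = 1.0                            # the monomial x y^{n-1}
coef = np.linalg.solve(A, target)
resid = np.max(np.abs(A @ coef - target))
assert resid < 1e-9                            # exact expansion exists
```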

Remark 4.10. We can also work with the following formulae:
\[
\begin{aligned}
  \frac{1}{2}(x + y)^{2m+1} + \frac{1}{2}(x - y)^{2m+1} - x^{2m+1}
  &= \binom{2m+1}{2} x^{2m-1} y^2 + \binom{2m+1}{4} x^{2m-3} y^4 + \cdots + \binom{2m+1}{2m} x y^{2m}, \\
  \frac{1}{2}(x + y)^{2m+1} - \frac{1}{2}(x - y)^{2m+1} - y^{2m+1}
  &= \binom{2m+1}{1} x^{2m} y + \binom{2m+1}{3} x^{2m-2} y^3 + \cdots + \binom{2m+1}{2m-1} x^2 y^{2m-1}, \\
  \frac{1}{2}(x + y)^{2m} + \frac{1}{2}(x - y)^{2m} - x^{2m} - y^{2m}
  &= \binom{2m}{2} x^{2m-2} y^2 + \binom{2m}{4} x^{2m-4} y^4 + \cdots + \binom{2m}{2m-2} x^2 y^{2m-2}, \\
  \frac{1}{2}(x + y)^{2m} - \frac{1}{2}(x - y)^{2m}
  &= \binom{2m}{1} x^{2m-1} y + \binom{2m}{3} x^{2m-3} y^3 + \cdots + \binom{2m}{2m-1} x y^{2m-1},
\end{aligned}
\]
by means of elimination, to get more economical expansions, e.g.,
\[
  x y = \frac{1}{2}(x + y)^2 - \frac{1}{2} x^2 - \frac{1}{2} y^2
      = \frac{1}{4}(x + y)^2 - \frac{1}{4}(x - y)^2.
\]

Theorem 4.11. Every polynomial \(P(x, y)\) of degree \(n\) in the variables \(x\) and \(y\) can be expanded as a sum of \(n + 1\) terms \(P_1(x, y), P_2(x, y), \cdots, P_{n-1}(x, y), P_n(x), P_{n+1}(y)\), where each term is a polynomial of degree at most \(n\) in one linear combination of the variables. Generally, every polynomial \(P(p, q)\) can be expanded as
\[
  P(p, q) = \sum_{i=1}^m P_i(a_i p + b_i q), \qquad m \leq n + 1,
\]
where the \(P_i(u)\) are polynomials of degree at most \(n\) in one variable.


Theorem 4.12. Every monomial in \(2n\) variables is of the form
\[
  f(p, q) = \bigl( p_1^{m_1 - k_1} q_1^{k_1} \bigr)\bigl( p_2^{m_2 - k_2} q_2^{k_2} \bigr) \cdots \bigl( p_n^{m_n - k_n} q_n^{k_n} \bigr)
\]
and can be expanded as a linear combination of terms of the form
\[
  \phi(Ap + Bq) = (a_1 p_1 + b_1 q_1)^{m_1} (a_2 p_2 + b_2 q_2)^{m_2} \cdots (a_n p_n + b_n q_n)^{m_n},
\]
where \(\phi(u) = \phi(u_1, \cdots, u_n) = u_1^{m_1} u_2^{m_2} \cdots u_n^{m_n}\) is the monomial in \(n\) variables with total degree \(m = \sum_{i=1}^n m_i\) and with degree \(m_i\) in the variable \(u_i\), and \(A\) and \(B\) are diagonal matrices of order \(n\):
\[
  A = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix}, \qquad
  B = \begin{pmatrix} b_1 & 0 & \cdots & 0 \\ 0 & b_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & b_n \end{pmatrix},
\]
which automatically satisfy \(A B^T = B A^T\). The elements \(a_i, b_i\) can be chosen as integers.
Theorem 4.13. Every polynomial \(P(p_1, q_1, \cdots, p_n, q_n)\) of degree \(m\) in \(2n\) variables can be expanded as [FW98]
\[
  P(p, q) = \sum_{i=1}^{m} P_i(A_i p + B_i q),
\]
where each \(P_i\) is a polynomial of degree \(m\) in \(n\) variables, and the \(A_i, B_i\) are diagonal matrices (satisfying \(A_i B_i^T = B_i A_i^T\)). Thus, for a polynomial Hamiltonian, symplectic explicit Euler composite schemes of order 1, 2, or 4 can easily be constructed.

4.5 Energy-conservative Schemes by Hamiltonian Difference
Now, we consider energy-conservative schemes obtained by Hamiltonian differencing, which were first proposed by A. J. Chorin [CHMM78] and later considered by K. Feng [Fen85]. However, these schemes are not symplectic. For simplicity, we illustrate only the case \(n = 2\). Let \(z = z^m\), \(\bar z = z^{m+1}\).
\[
\begin{aligned}
  \frac{1}{\tau}(\bar p_1 - p_1) &= -\frac{H(p_1, p_2, \bar q_1, q_2) - H(p_1, p_2, q_1, q_2)}{\bar q_1 - q_1}, \\
  \frac{1}{\tau}(\bar p_2 - p_2) &= -\frac{H(\bar p_1, p_2, \bar q_1, \bar q_2) - H(\bar p_1, p_2, \bar q_1, q_2)}{\bar q_2 - q_2}, \\
  \frac{1}{\tau}(\bar q_1 - q_1) &= \frac{H(\bar p_1, p_2, \bar q_1, q_2) - H(p_1, p_2, \bar q_1, q_2)}{\bar p_1 - p_1}, \\
  \frac{1}{\tau}(\bar q_2 - q_2) &= \frac{H(\bar p_1, \bar p_2, \bar q_1, \bar q_2) - H(\bar p_1, p_2, \bar q_1, \bar q_2)}{\bar p_2 - p_2}.
\end{aligned}
\tag{5.1}
\]
By addition and cancellation, we have energy conservation, \(H(\bar p_1, \bar p_2, \bar q_1, \bar q_2) = H(p_1, p_2, q_1, q_2)\), for an arbitrary Hamiltonian.
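The implicit scheme (5.1) can be tried out numerically; the following is a sketch with our own test Hamiltonian, solving each step by Gauss–Seidel fixed-point iteration and checking that the energy is conserved to the accuracy of the iteration.

```python
import numpy as np

# Sketch (ours) of the energy-conserving scheme (5.1) for n = 2; the coupled
# oscillator Hamiltonian below is our own choice of test problem.
def H(p1, p2, q1, q2):
    return 0.5 * (p1**2 + p2**2 + q1**2 + q2**2) + 0.5 * q1 * q2

tau = 0.05
p1, p2, q1, q2 = 0.3, -0.4, 0.7, 0.2
E0 = H(p1, p2, q1, q2)
for _ in range(5):                          # a few steps (increments stay nonzero)
    # initial guess from explicit Euler
    P1, P2 = p1 - tau * (q1 + 0.5 * q2), p2 - tau * (q2 + 0.5 * q1)
    Q1, Q2 = q1 + tau * p1, q2 + tau * p2
    for _ in range(200):                    # fixed-point sweeps for (5.1)
        P1 = p1 - tau * (H(p1, p2, Q1, q2) - H(p1, p2, q1, q2)) / (Q1 - q1)
        P2 = p2 - tau * (H(P1, p2, Q1, Q2) - H(P1, p2, Q1, q2)) / (Q2 - q2)
        Q1 = q1 + tau * (H(P1, p2, Q1, q2) - H(p1, p2, Q1, q2)) / (P1 - p1)
        Q2 = q2 + tau * (H(P1, P2, Q1, Q2) - H(P1, p2, Q1, Q2)) / (P2 - p2)
    p1, p2, q1, q2 = P1, P2, Q1, Q2
drift = abs(H(p1, p2, q1, q2) - E0)
assert drift < 1e-10                        # energy conserved exactly (up to iteration error)
```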
Since the proposed energy-conservative schemes based on Hamiltonian differencing have only first-order accuracy, Qin [Qin87] proposed another, more symmetric form in 1987, which possesses second-order accuracy. Independently, Itoh and Abe [IA88] proposed the same schemes in 1988. For simplicity, we again consider only the case \(n = 2\); the following difference schemes are given:
\[
\begin{aligned}
  \frac{dp_1}{dt} = {}& -\frac{1}{4} \frac{H(p_1, p_2, \bar q_1, q_2) - H(p_1, p_2, q_1, q_2)}{\Delta q_1}
                      -\frac{1}{4} \frac{H(p_1, \bar p_2, \bar q_1, \bar q_2) - H(p_1, \bar p_2, q_1, \bar q_2)}{\Delta q_1} \\
                    & -\frac{1}{4} \frac{H(\bar p_1, p_2, \bar q_1, q_2) - H(\bar p_1, p_2, q_1, q_2)}{\Delta q_1}
                      -\frac{1}{4} \frac{H(\bar p_1, \bar p_2, \bar q_1, \bar q_2) - H(\bar p_1, \bar p_2, q_1, \bar q_2)}{\Delta q_1}, \\
  \frac{dq_1}{dt} = {}& \frac{1}{4} \frac{H(\bar p_1, p_2, \bar q_1, q_2) - H(p_1, p_2, \bar q_1, q_2)}{\Delta p_1}
                      +\frac{1}{4} \frac{H(\bar p_1, \bar p_2, \bar q_1, \bar q_2) - H(p_1, \bar p_2, \bar q_1, \bar q_2)}{\Delta p_1} \\
                    & +\frac{1}{4} \frac{H(\bar p_1, p_2, q_1, q_2) - H(p_1, p_2, q_1, q_2)}{\Delta p_1}
                      +\frac{1}{4} \frac{H(\bar p_1, \bar p_2, q_1, \bar q_2) - H(p_1, \bar p_2, q_1, \bar q_2)}{\Delta p_1}, \\
  \frac{dp_2}{dt} = {}& -\frac{1}{4} \frac{H(\bar p_1, p_2, \bar q_1, \bar q_2) - H(\bar p_1, p_2, \bar q_1, q_2)}{\Delta q_2}
                      -\frac{1}{4} \frac{H(p_1, p_2, q_1, \bar q_2) - H(p_1, p_2, q_1, q_2)}{\Delta q_2} \\
                    & -\frac{1}{4} \frac{H(\bar p_1, \bar p_2, \bar q_1, \bar q_2) - H(\bar p_1, \bar p_2, \bar q_1, q_2)}{\Delta q_2}
                      -\frac{1}{4} \frac{H(p_1, \bar p_2, q_1, \bar q_2) - H(p_1, \bar p_2, q_1, q_2)}{\Delta q_2}, \\
  \frac{dq_2}{dt} = {}& \frac{1}{4} \frac{H(\bar p_1, \bar p_2, \bar q_1, \bar q_2) - H(\bar p_1, p_2, \bar q_1, \bar q_2)}{\Delta p_2}
                      +\frac{1}{4} \frac{H(p_1, \bar p_2, q_1, \bar q_2) - H(p_1, p_2, q_1, \bar q_2)}{\Delta p_2} \\
                    & +\frac{1}{4} \frac{H(\bar p_1, \bar p_2, \bar q_1, q_2) - H(\bar p_1, p_2, \bar q_1, q_2)}{\Delta p_2}
                      +\frac{1}{4} \frac{H(p_1, \bar p_2, q_1, q_2) - H(p_1, p_2, q_1, q_2)}{\Delta p_2},
\end{aligned}
\]
where \(\Delta p_i = \bar p_i - p_i\), \(\Delta q_i = \bar q_i - q_i\), and \(dp_i/dt\), \(dq_i/dt\) stand for the difference quotients \((\bar p_i - p_i)/\tau\), \((\bar q_i - q_i)/\tau\).

From the first two equations, we have
\[
  \frac{1}{2}\bigl( H(\bar p_1, p_2, \bar q_1, q_2) + H(\bar p_1, \bar p_2, \bar q_1, \bar q_2) \bigr)
  = \frac{1}{2}\bigl( H(p_1, \bar p_2, q_1, \bar q_2) + H(p_1, p_2, q_1, q_2) \bigr).
\]
From the last two equations, we have
\[
  \frac{1}{2}\bigl( H(\bar p_1, \bar p_2, \bar q_1, \bar q_2) + H(p_1, \bar p_2, q_1, \bar q_2) \bigr)
  = \frac{1}{2}\bigl( H(\bar p_1, p_2, \bar q_1, q_2) + H(p_1, p_2, q_1, q_2) \bigr).
\]
Adding these equations, we observe that the schemes conserve the Hamiltonian \(H\) exactly. Further research on energy-conservative schemes can be found in recent studies [WWM08].
Bibliography

[Car65] C. Carathéodory: Calculus of Variations and Partial Differential Equations of First Order, Vol. 1. Holden-Day, San Francisco, (1965).
[CHMM78] A. Chorin, T. J. R. Hughes, J. E. Marsden, and M. McCracken: Product formulas and numerical algorithms. Comm. Pure and Appl. Math., 31:205–256, (1978).
[Dah59] G. Dahlquist: Stability and error bounds in the numerical integration of ordinary
differential equations. Trans. of the Royal Inst. of Techn., Stockholm, Sweden, 130:87,
(1959).
[Fen85] K. Feng: On difference schemes and symplectic geometry. In K. Feng, editor, Pro-
ceedings of the 1984 Beijing Symposium on Differential Geometry and Differential Equa-
tions, pages 42–58. Science Press, Beijing, (1985).
[FQ87] K. Feng and M.Z. Qin: The symplectic methods for the computation of Hamiltonian
equations. In Y. L. Zhu and B. Y. Guo, editors, Numerical Methods for Partial Differential
Equations, Lecture Notes in Mathematics 1297, pages 1–37. Springer, Berlin, (1987).
[FQ91] K. Feng and M.Z. Qin: Hamiltonian algorithms for Hamiltonian systems and a com-
parative numerical study. Comput. Phys. Comm., 65:173–187, (1991).
[FW98] K. Feng and D.L. Wang: On variation of schemes by Euler. J. Comput. Math., 16:97–
106, (1998).
[FWQ90] K. Feng, H.M. Wu, and M.Z. Qin: Symplectic Difference Schemes for Linear Hamil-
tonian Canonical Systems. J. Comput. Math., 8(4):371–380, (1990).
[IA88] T. Itoh and K. Abe: Hamiltonian-conserving discrete canonical equations based on
variational difference quotients. J. Comp. Phys., 76:85–102, (1988).
[Men84] C.R. Menyuk: Some properties of the discrete Hamiltonian method. Physica D,
11:109–129, (1984).
[Qin87] M. Z. Qin: A symplectic scheme for the Hamiltonian equations. J. Comput. Math.,
5:203–209, (1987).
[Qin89] M. Z. Qin: Canonical difference scheme for the Hamiltonian equation. Mathematical Methods in the Applied Sciences, 11:543–557, (1989).
[QZZ95] M. Z. Qin, W. J. Zhu, and M. Q. Zhang: Construction of symplectic three-stage difference schemes for ODEs. J. Comput. Math., 13:206–210, (1995).
[Wey39] H. Weyl: The Classical Groups. Princeton Univ. Press, Princeton, Second edition,
(1939).
[WT03] D. L. Wang and H. W. Tam: A symplectic structure preserved by the trapezoidal rule.
J. of Phys. Soc. of Japan, 72(9):2193–2197, (2003).
[WWM08] Y. S. Wang, B. Wang, and M. Z. Qin: Local structure-preserving algorithms for partial differential equations. Science in China (Series A), 51(11):2115–2136, (2008).
Chapter 5.
The Generating Function Method

This chapter discusses the construction of symplectic difference schemes via generating functions, and their conservation laws.

5.1 Linear Fractional Transformation

Definition 1.1. Let \(\alpha = \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix} \in GL(2m)\). A linear fractional transformation [Sie43,Hua44,FWQW89,Fen86] is defined by
\[
  \sigma_\alpha: M(m) \longrightarrow M(m), \qquad
  M \longmapsto N = \sigma_\alpha(M) = (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1}, \tag{1.1}
\]
under the transversality condition
\[
  |C_\alpha M + D_\alpha| \neq 0. \tag{1.2}
\]

Proposition 1.2. Let \(\alpha \in GL(2m)\) with inverse \(\alpha^{-1} = \begin{pmatrix} A^\alpha & B^\alpha \\ C^\alpha & D^\alpha \end{pmatrix}\). Then
\[
  |C_\alpha M + D_\alpha| \neq 0 \ \text{iff}\ |M C^\alpha - A^\alpha| \neq 0, \qquad
  |A_\alpha M + B_\alpha| \neq 0 \ \text{iff}\ |B^\alpha - M D^\alpha| \neq 0. \tag{1.3}
\]
Thus the linear fractional transformation \(\sigma_\alpha\) in (1.1) can be represented as
\[
  \sigma_\alpha(M) = (M C^\alpha - A^\alpha)^{-1}(B^\alpha - M D^\alpha). \tag{1.4}
\]

Proof. From the relation
\[
  \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix}
  \begin{pmatrix} A^\alpha & B^\alpha \\ C^\alpha & D^\alpha \end{pmatrix}
  =
  \begin{pmatrix} A^\alpha & B^\alpha \\ C^\alpha & D^\alpha \end{pmatrix}
  \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix}
  = I_{2m},
\]
i.e.,
\[
\begin{aligned}
  A^\alpha A_\alpha + B^\alpha C_\alpha &= A_\alpha A^\alpha + B_\alpha C^\alpha = I_m, &
  C^\alpha B_\alpha + D^\alpha D_\alpha &= C_\alpha B^\alpha + D_\alpha D^\alpha = I_m, \\
  A^\alpha B_\alpha + B^\alpha D_\alpha &= A_\alpha B^\alpha + B_\alpha D^\alpha = O, &
  C^\alpha A_\alpha + D^\alpha C_\alpha &= C_\alpha A^\alpha + D_\alpha C^\alpha = O,
\end{aligned}
\tag{1.5}
\]
we obtain the following identities:
\[
\begin{aligned}
  \begin{pmatrix} I & -M \\ C_\alpha & D_\alpha \end{pmatrix}
  \begin{pmatrix} A^\alpha & B^\alpha \\ C^\alpha & D^\alpha \end{pmatrix}
  &= \begin{pmatrix} A^\alpha - M C^\alpha & B^\alpha - M D^\alpha \\ O & I \end{pmatrix}, \\
  \begin{pmatrix} I & -M \\ A_\alpha & B_\alpha \end{pmatrix}
  \begin{pmatrix} A^\alpha & B^\alpha \\ C^\alpha & D^\alpha \end{pmatrix}
  &= \begin{pmatrix} A^\alpha - M C^\alpha & B^\alpha - M D^\alpha \\ I & O \end{pmatrix}.
\end{aligned}
\tag{1.6}
\]
In addition, we have
\[
\begin{aligned}
  \begin{pmatrix} I & -M \\ C_\alpha & D_\alpha \end{pmatrix}
  &= \begin{pmatrix} I & O \\ C_\alpha & I \end{pmatrix}
     \begin{pmatrix} I & -M \\ O & C_\alpha M + D_\alpha \end{pmatrix}, \\
  \begin{pmatrix} I & -M \\ A_\alpha & B_\alpha \end{pmatrix}
  &= \begin{pmatrix} I & O \\ A_\alpha & I \end{pmatrix}
     \begin{pmatrix} I & -M \\ O & A_\alpha M + B_\alpha \end{pmatrix}.
\end{aligned}
\tag{1.7}
\]
Inserting (1.7) into (1.6) and taking determinants, we obtain
\[
  |C_\alpha M + D_\alpha|\, |\alpha|^{-1} = |A^\alpha - M C^\alpha|, \qquad
  |A_\alpha M + B_\alpha|\, |\alpha|^{-1} = (-1)^m |B^\alpha - M D^\alpha|. \tag{1.8}
\]
Since \(\alpha\) is a non-singular matrix, (1.3) is valid.

By (1.8), Equation (1.4) is well defined. The only remaining step is to verify the equation
\[
  (M C^\alpha - A^\alpha)^{-1}(B^\alpha - M D^\alpha) = (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1},
\]
i.e.,
\[
  (B^\alpha - M D^\alpha)(C_\alpha M + D_\alpha) = (M C^\alpha - A^\alpha)(A_\alpha M + B_\alpha).
\]
Expanding it and using the conditions (1.5), we see that it holds. \(\Box\)

Proposition 1.3. We have the following well-known relation
\[
  (C^\alpha N + D^\alpha)(C_\alpha M + D_\alpha) = I, \tag{1.9}
\]
hence
\[
  |C^\alpha N + D^\alpha| \neq 0 \quad \text{iff} \quad |C_\alpha M + D_\alpha| \neq 0,
\]
where \(N = \sigma_\alpha(M)\). Under the transversality condition (1.2), \(\sigma_\alpha\) has the inverse linear fractional transformation \(\sigma_\alpha^{-1} = \sigma_{\alpha^{-1}}\),
\[
  M = \sigma_{\alpha^{-1}}(N) = (A^\alpha N + B^\alpha)(C^\alpha N + D^\alpha)^{-1}
  = (N C_\alpha - A_\alpha)^{-1}(B_\alpha - N D_\alpha). \tag{1.10}
\]

Proof.
\[
\begin{aligned}
  (C^\alpha N + D^\alpha)(C_\alpha M + D_\alpha)
  &= \bigl( C^\alpha (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1} + D^\alpha \bigr)(C_\alpha M + D_\alpha) \\
  &= (C^\alpha A_\alpha + D^\alpha C_\alpha) M + C^\alpha B_\alpha + D^\alpha D_\alpha \\
  &= I \qquad \text{(by (1.5))},
\end{aligned}
\]
which is (1.9). The first equation of (1.10) can be obtained from (1.4), and the second can be derived from (1.1). \(\Box\)

Combining (1.2) and (1.3), we obtain the following four mutually equivalent transversality conditions:
\[
\begin{aligned}
  |C_\alpha M + D_\alpha| &\neq 0, & (1.11)\\
  |M C^\alpha - A^\alpha| &\neq 0, & (1.12)\\
  |C^\alpha N + D^\alpha| &\neq 0, & (1.13)\\
  |N C_\alpha - A_\alpha| &\neq 0, & (1.14)
\end{aligned}
\]
where
\[
  N = \sigma_\alpha(M) = (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1}, \qquad
  M = \sigma_{\alpha^{-1}}(N) = (A^\alpha N + B^\alpha)(C^\alpha N + D^\alpha)^{-1}.
\]
Moreover, the linear fractional transformation \(\sigma_\alpha\) from \(\{ M \in M(m) \mid |C_\alpha M + D_\alpha| \neq 0 \}\) to \(\{ N \in M(m) \mid |C^\alpha N + D^\alpha| \neq 0 \}\) is a bijection.
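As a numerical sketch (our own code, not from the book), the two representations (1.1) and (1.4) of \(\sigma_\alpha\) and the relation (1.9) can be checked for a random \(\alpha \in GL(2m)\):

```python
import numpy as np

# Sketch (ours) of Propositions 1.2-1.3: forms (1.1) and (1.4) of sigma_alpha
# agree, and (1.9) holds, for a random alpha in GL(2m).
rng = np.random.default_rng(0)
m = 3
alpha = rng.standard_normal((2 * m, 2 * m))
ainv = np.linalg.inv(alpha)
Aa, Ba = alpha[:m, :m], alpha[:m, m:]          # blocks of alpha
Ca, Da = alpha[m:, :m], alpha[m:, m:]
Au, Bu = ainv[:m, :m], ainv[:m, m:]            # blocks A^alpha, B^alpha of alpha^{-1}
Cu, Du = ainv[m:, :m], ainv[m:, m:]
M = rng.standard_normal((m, m))
N = (Aa @ M + Ba) @ np.linalg.inv(Ca @ M + Da)               # (1.1)
N2 = np.linalg.inv(M @ Cu - Au) @ (Bu - M @ Du)              # (1.4)
err_forms = np.max(np.abs(N - N2))
err_rel = np.max(np.abs((Cu @ N + Du) @ (Ca @ M + Da) - np.eye(m)))   # (1.9)
assert err_forms < 1e-10 and err_rel < 1e-10
```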

5.2 Symplectic, Gradient Mapping and Generating Function
To study the symplectic structure and Hamiltonian systems in the phase space \(\mathbf{R}^{2n}\), we need a symplectic structure on \(\mathbf{R}^{4n}\), regarded as a product of two copies of \(\mathbf{R}^{2n}\). One symplectic structure comes from the product of the original symplectic structure on \(\mathbf{R}^{2n}\):
\[
  \widetilde\Omega = \sum_{i=1}^n d\hat z_i \wedge d\hat z_{i+n} - \sum_{i=1}^n dz_i \wedge dz_{i+n}, \tag{2.1}
\]
where the corresponding matrix is given by \(\widetilde J_{4n} = \begin{pmatrix} J_{2n} & O \\ O & -J_{2n} \end{pmatrix}\). We denote \(\widetilde{\mathbf{R}}^{4n} = (\mathbf{R}^{4n}, \widetilde J_{4n})\).

On the other hand, \(\mathbf{R}^{4n}\) has its standard symplectic structure:
\[
  \Omega = \sum_{i=1}^{2n} d\hat w_i \wedge dw_i, \tag{2.2}
\]
where \((\hat w_1, \cdots, \hat w_{2n}, w_1, \cdots, w_{2n})^T\) represents its coordinates. The corresponding matrix is given by
\[
  J_{4n} = \begin{pmatrix} O & I_{2n} \\ -I_{2n} & O \end{pmatrix}.
\]
We denote the manifold \(\mathbf{R}^{4n} = (\mathbf{R}^{4n}, J_{4n})\).
Now we first review some notation and facts from symplectic algebra. Every \(4n \times 2n\) matrix of rank \(2n\),
\[
  A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix} \in M(4n, 2n), \qquad A_1, A_2 \in M(2n),
\]
defines a \(2n\)-dimensional subspace \(\{A\}\) of \(\mathbf{R}^{4n}\), spanned by its \(2n\) column vectors. Evidently, \(\{A\} = \{B\}\) iff there exists \(P \in GL(2n)\) such that
\[
  AP = B, \qquad \text{i.e.,} \qquad
  \begin{pmatrix} A_1 P \\ A_2 P \end{pmatrix} = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}.
\]
A \(2n\)-dimensional subspace \(\{X\} = \Bigl\{ \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \Bigr\}\) of \(\mathbf{R}^{4n}\), \(X_1, X_2 \in M(2n)\), is \(J_{4n}\)-Lagrangian if
\[
  X^T J_{4n} X = O,
\]
i.e.,
\[
  X_1^T X_2 - X_2^T X_1 = O \qquad \text{or} \qquad X_1^T X_2 \in Sm(2n).
\]
Following Siegel [Sie43], we call such a \(4n \times 2n\) matrix \(X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}\) a symmetric pair. Moreover, if \(|X_2| \neq 0\), then \(X_1 X_2^{-1} = N \in Sm(2n)\) and \(\Bigl\{ \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \Bigr\} = \Bigl\{ \begin{pmatrix} N \\ I \end{pmatrix} \Bigr\}\).

Similarly, a \(2n\)-dimensional subspace \(\{Y\} = \Bigl\{ \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \Bigr\}\) is \(\widetilde J_{4n}\)-Lagrangian if
\[
  Y^T \widetilde J_{4n} Y = O,
\]
i.e.,
\[
  Y_1^T J_{2n} Y_1 = Y_2^T J_{2n} Y_2;
\]
the \(4n \times 2n\) matrix \(Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}\) is called a symplectic pair. \(|Y_2| \neq 0\) implies \(Y_1 Y_2^{-1} = M \in Sp(2n)\) and \(\Bigl\{ \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \Bigr\} = \Bigl\{ \begin{pmatrix} M \\ I \end{pmatrix} \Bigr\}\).
Theorem 2.1. A transformation \(\alpha = \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix} \in GL(4n)\) carries every \(\widetilde J_{4n}\)-Lagrangian subspace into a \(J_{4n}\)-Lagrangian subspace if and only if \(\alpha \in CSp(\widetilde J_{4n}, J_{4n})\), i.e.,
\[
  \alpha^T J_{4n} \alpha = \mu \widetilde J_{4n}, \qquad \text{for some } \mu = \mu(\alpha) \neq 0. \tag{2.3}
\]

Proof. The “if” part is obvious; we need only prove the “only if” part.

Taking \(\alpha_0 \in Sp(\widetilde J_{4n}, J_{4n})\) (which always exists), e.g.,
\[
  \alpha_0 = \begin{pmatrix} -J_{2n} & J_{2n} \\ \frac{1}{2} I_{2n} & \frac{1}{2} I_{2n} \end{pmatrix},
\]
we have
\[
  CSp(\widetilde J_{4n}, J_{4n}) = CSp(4n) \cdot \alpha_0.
\]
Therefore, it suffices to show that if \(\alpha\) carries every \(J_{4n}\)-Lagrangian subspace into a \(J_{4n}\)-Lagrangian subspace, then \(\alpha \in CSp(4n)\), i.e.,
\[
  \alpha^T J_{4n} \alpha = \mu J_{4n} \qquad \text{for some } \mu \neq 0.
\]
1° Take the symmetric pair \(X = \begin{pmatrix} I_{2n} \\ O_{2n} \end{pmatrix}\). By assumption,
\[
  \alpha X = \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix}
             \begin{pmatrix} I \\ O \end{pmatrix}
           = \begin{pmatrix} A_\alpha \\ C_\alpha \end{pmatrix}
\]
is also a symmetric pair, i.e., \(A_\alpha^T C_\alpha - C_\alpha^T A_\alpha = O\). Similarly, \(B_\alpha^T D_\alpha - D_\alpha^T B_\alpha = O\).

2° Take the symmetric pair \(X = \begin{pmatrix} S \\ I \end{pmatrix}\), \(S \in Sm(2n)\). Then every
\[
  \alpha X = \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix}
             \begin{pmatrix} S \\ I \end{pmatrix}
           = \begin{pmatrix} A_\alpha S + B_\alpha \\ C_\alpha S + D_\alpha \end{pmatrix}
\]
is also a symmetric pair, i.e.,
\[
\begin{aligned}
  O &= (\alpha X)^T J_{4n} (\alpha X) \\
  &= \bigl( S^T A_\alpha^T + B_\alpha^T,\ S^T C_\alpha^T + D_\alpha^T \bigr)
     \begin{pmatrix} O & I \\ -I & O \end{pmatrix}
     \begin{pmatrix} A_\alpha S + B_\alpha \\ C_\alpha S + D_\alpha \end{pmatrix} \\
  &= S (A_\alpha^T C_\alpha - C_\alpha^T A_\alpha) S + S (A_\alpha^T D_\alpha - C_\alpha^T B_\alpha)
     - (D_\alpha^T A_\alpha - B_\alpha^T C_\alpha) S + B_\alpha^T D_\alpha - D_\alpha^T B_\alpha \\
  &= S (A_\alpha^T D_\alpha - C_\alpha^T B_\alpha) - (A_\alpha^T D_\alpha - C_\alpha^T B_\alpha)^T S,
  \qquad \forall\, S \in Sm(2n).
\end{aligned}
\]
Set \(P = A_\alpha^T D_\alpha - C_\alpha^T B_\alpha\); then the above equation becomes
\[
  S P = P^T S, \qquad \forall\, S \in Sm(2n).
\]
It follows that \(P = \mu I\), i.e.,
\[
  A_\alpha^T D_\alpha - C_\alpha^T B_\alpha = \mu I.
\]
So
\[
\begin{aligned}
  \alpha^T J_{4n} \alpha
  &= \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix}^{\!T}
     \begin{pmatrix} O & I \\ -I & O \end{pmatrix}
     \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix} \\
  &= \begin{pmatrix}
       A_\alpha^T C_\alpha - C_\alpha^T A_\alpha & A_\alpha^T D_\alpha - C_\alpha^T B_\alpha \\
       B_\alpha^T C_\alpha - D_\alpha^T A_\alpha & B_\alpha^T D_\alpha - D_\alpha^T B_\alpha
     \end{pmatrix}
   = \mu \begin{pmatrix} O & I \\ -I & O \end{pmatrix} = \mu J_{4n},
\end{aligned}
\]
and \(\alpha \in GL(4n)\) implies \(\mu \neq 0\). \(\Box\)
The inverse matrix of \(\alpha\) is denoted by \(\alpha^{-1} = \begin{pmatrix} A^\alpha & B^\alpha \\ C^\alpha & D^\alpha \end{pmatrix}\). By (2.3), we have
\[
\begin{aligned}
  A_\alpha^T C_\alpha - C_\alpha^T A_\alpha &= \mu J, &
  A_\alpha^T D_\alpha - C_\alpha^T B_\alpha &= O, \\
  B_\alpha^T C_\alpha - D_\alpha^T A_\alpha &= O, &
  B_\alpha^T D_\alpha - D_\alpha^T B_\alpha &= -\mu J,
\end{aligned}
\tag{2.4}
\]
\[
\begin{aligned}
  A^\alpha &= \mu^{-1} J C_\alpha^T, & B^\alpha &= -\mu^{-1} J A_\alpha^T, \\
  C^\alpha &= -\mu^{-1} J D_\alpha^T, & D^\alpha &= \mu^{-1} J B_\alpha^T.
\end{aligned}
\tag{2.5}
\]
Theorem 2.2. Let \(\alpha = \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix} \in CSp(\widetilde J_{4n}, J_{4n})\). The linear fractional transformation \(\sigma_\alpha: \{ M \in Sp(2n) \mid |C_\alpha M + D_\alpha| \neq 0 \} \to \{ N \in Sm(2n) \mid |C^\alpha N + D^\alpha| \neq 0 \}\) is one-to-one and onto.

Proof. From the above we know that \(|C_\alpha M + D_\alpha| \neq 0\) iff \(|C^\alpha N + D^\alpha| \neq 0\). We need only prove that \(M \in Sp(2n)\) iff \(N = \sigma_\alpha(M) \in Sm(2n)\). This follows from a direct calculation, since \(N \in Sm(2n)\) iff
\[
  \bigl( (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1} \bigr)^T
  = (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1},
\]
i.e.,
\[
  (M^T A_\alpha^T + B_\alpha^T)(C_\alpha M + D_\alpha) = (M^T C_\alpha^T + D_\alpha^T)(A_\alpha M + B_\alpha).
\]
Expanding and collecting terms, we obtain
\[
\begin{aligned}
  O &= M^T (A_\alpha^T C_\alpha - C_\alpha^T A_\alpha) M + B_\alpha^T D_\alpha - D_\alpha^T B_\alpha
     + M^T (A_\alpha^T D_\alpha - C_\alpha^T B_\alpha) + (B_\alpha^T C_\alpha - D_\alpha^T A_\alpha) M \\
  &= \mu\, (M^T J M - J),
\end{aligned}
\]
using (2.4); this vanishes iff \(M \in Sp(2n)\). \(\Box\)
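As a small numerical sketch (ours, for \(n = 1\), where \(Sp(2) = SL(2)\)), the statements above can all be checked with the matrix \(\alpha_0\) from the proof of Theorem 2.1: it satisfies (2.3) with \(\mu = 1\), its inverse blocks obey (2.5), and \(\sigma_{\alpha_0}\) carries a random symplectic \(M\) to a symmetric \(N\), with \(\sigma_{\alpha_0^{-1}}\) mapping back.

```python
import numpy as np

# Sketch (ours, n = 1) of Theorem 2.1, relations (2.5), and Theorem 2.2,
# using alpha_0 = [[-J, J], [I/2, I/2]].
J2, I2 = np.array([[0.0, 1.0], [-1.0, 0.0]]), np.eye(2)
O2 = np.zeros((2, 2))
J4 = np.block([[O2, I2], [-I2, O2]])
Jt4 = np.block([[J2, O2], [O2, -J2]])
alpha = np.block([[-J2, J2], [0.5 * I2, 0.5 * I2]])
err_csp = np.max(np.abs(alpha.T @ J4 @ alpha - Jt4))      # (2.3) with mu = 1

Aa, Ba, Ca, Da = -J2, J2, 0.5 * I2, 0.5 * I2
Au, Bu = J2 @ Ca.T, -J2 @ Aa.T                            # inverse blocks via (2.5)
Cu, Du = -J2 @ Da.T, J2 @ Ba.T
err_inv = np.max(np.abs(np.linalg.inv(alpha) - np.block([[Au, Bu], [Cu, Du]])))

rng = np.random.default_rng(1)
M = rng.standard_normal((2, 2))
M /= np.sqrt(abs(np.linalg.det(M)))
if np.linalg.det(M) < 0:
    M[:, 0] *= -1.0                                       # force det M = 1, i.e. M in Sp(2)
N = (Aa @ M + Ba) @ np.linalg.inv(Ca @ M + Da)            # sigma_alpha(M)
sym_err = np.max(np.abs(N - N.T))                         # N is symmetric
back = (Au @ N + Bu) @ np.linalg.inv(Cu @ N + Du)         # sigma_{alpha^{-1}}(N)
err_back = np.max(np.abs(back - M))
assert max(err_csp, err_inv, sym_err, err_back) < 1e-10
```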


Definition 2.3. A mapping \(w \to \hat w = f(w): \mathbf{R}^{2n} \to \mathbf{R}^{2n}\) is called a gradient mapping if its Jacobian \(N(w) = f_w(w) \in Sm(2n)\) everywhere.

Definition 2.4. A \(2n\)-dimensional submanifold \(U\) of \(\mathbf{R}^{4n}\) is a \(\widetilde J_{4n}\)-Lagrangian submanifold (resp. a \(J_{4n}\)-Lagrangian submanifold) if its tangent plane \(T_z U\) at every \(z \in U\) is a \(\widetilde J_{4n}\)-Lagrangian (resp. \(J_{4n}\)-Lagrangian) subspace.

For a symplectic mapping \(z \to \hat z = g(z)\), the graph [Fen86,FWQW89,Ge91]
\[
  \Gamma_g = \mathrm{gr}(g) := \Bigl\{ \begin{pmatrix} \hat z \\ z \end{pmatrix} \in \mathbf{R}^{4n} \ \Big|\ \hat z = g(z),\ z \in \mathbf{R}^{2n} \Bigr\}
\]
is always a \(\widetilde J_{4n}\)-Lagrangian submanifold. For every gradient mapping \(w \to \hat w = f(w)\), its graph
\[
  \Gamma_f = \mathrm{gr}(f) := \Bigl\{ \begin{pmatrix} \hat w \\ w \end{pmatrix} \in \mathbf{R}^{4n} \ \Big|\ \hat w = f(w),\ w \in \mathbf{R}^{2n} \Bigr\}
\]
is always a \(J_{4n}\)-Lagrangian submanifold.

Let \(\alpha = \begin{pmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{pmatrix} \in CSp(\widetilde J_{4n}, J_{4n})\); it defines a linear transformation
\[
  \begin{pmatrix} \hat w \\ w \end{pmatrix} = \alpha \begin{pmatrix} \hat z \\ z \end{pmatrix}, \qquad
  \begin{pmatrix} \hat z \\ z \end{pmatrix} = \alpha^{-1} \begin{pmatrix} \hat w \\ w \end{pmatrix},
\]
i.e.,
\[
\begin{aligned}
  \hat w &= A_\alpha \hat z + B_\alpha z, & \hat z &= A^\alpha \hat w + B^\alpha w, \\
  w &= C_\alpha \hat z + D_\alpha z, & z &= C^\alpha \hat w + D^\alpha w.
\end{aligned}
\tag{2.6}
\]

Theorem 2.5. Let \(\alpha \in CSp(\widetilde J_{4n}, J_{4n})\), and let \(z \to \hat z = g(z): \mathbf{R}^{2n} \to \mathbf{R}^{2n}\) be a canonical mapping whose Jacobian \(M(z) = g_z(z) \in Sp(2n)\) satisfies (1.2) in (some neighborhood of) \(\mathbf{R}^{2n}\). Then there exists a gradient mapping \(w \to \hat w = f(w)\) in (some neighborhood of) \(\mathbf{R}^{2n}\) with Jacobian \(N(w) = f_w(w) \in Sm(2n)\), and a scalar function — the generating function — \(\phi(w)\) (depending on \(\alpha\) and \(g\)), such that
\[
\begin{aligned}
  1^\circ\quad & f(w) = \nabla \phi(w); & (2.7)\\
  2^\circ\quad & A_\alpha g(z) + B_\alpha z = f(C_\alpha g(z) + D_\alpha z) = \nabla \phi(C_\alpha g(z) + D_\alpha z); & (2.8)\\
  3^\circ\quad & N = \sigma_\alpha(M) = (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1}, \\
               & M = \sigma_{\alpha^{-1}}(N) = (A^\alpha N + B^\alpha)(C^\alpha N + D^\alpha)^{-1}; & (2.9)\\
  4^\circ\quad & \Gamma_f = \alpha(\Gamma_g), \qquad \Gamma_g = \alpha^{-1}(\Gamma_f). & (2.10)
\end{aligned}
\]

Proof. Under the linear transformation \(\alpha\), the image of \(\Gamma_g\) is
\[
  \alpha(\Gamma_g) = \Bigl\{ \begin{pmatrix} \hat w \\ w \end{pmatrix} \in \mathbf{R}^{4n} \ \Big|\
  \hat w = A_\alpha g(z) + B_\alpha z,\ w = C_\alpha g(z) + D_\alpha z \Bigr\}.
\]
Since \(\Gamma_g\) is a \(\widetilde J_{4n}\)-Lagrangian submanifold and \(\alpha \in CSp(\widetilde J_{4n}, J_{4n})\), the tangent plane of \(\alpha(\Gamma_g)\), spanned by
\[
  \begin{pmatrix} A_\alpha M(z) + B_\alpha \\ C_\alpha M(z) + D_\alpha \end{pmatrix},
\]
is a \(J_{4n}\)-Lagrangian subspace. So \(\alpha(\Gamma_g)\) is a \(J_{4n}\)-Lagrangian submanifold. By assumption, \(|C_\alpha M + D_\alpha| \neq 0\), so by the implicit function theorem \(w = C_\alpha g(z) + D_\alpha z\) is invertible; denote its inverse by \(z = z(w)\). Set
\[
  \hat w = f(w) = \bigl( A_\alpha g(z) + B_\alpha z \bigr)\big|_{z = z(w)}
  = A_\alpha g(z(w)) + B_\alpha z(w); \tag{2.11}
\]
obviously, such an \(f(w)\) satisfies the identity
\[
  A_\alpha g(z) + B_\alpha z \equiv f(C_\alpha g(z) + D_\alpha z). \tag{2.12}
\]
The Jacobian of \(f(w)\) is
\[
  N(w) = f_w(w) = \frac{\partial \hat w}{\partial w}
  = \frac{\partial \hat w}{\partial z} \Bigl( \frac{\partial w}{\partial z} \Bigr)^{-1}
  = (A_\alpha M(z) + B_\alpha)(C_\alpha M(z) + D_\alpha)^{-1} = \sigma_\alpha(M(z)). \tag{2.13}
\]
By Theorem 2.2 it is symmetric, so \(f(w)\) is a gradient map. By the Poincaré lemma, there exists a scalar function \(\phi(w)\), such that
\[
  f(w) = \nabla \phi(w).
\]
In addition, we have
\[
  \Gamma_f = \Bigl\{ \begin{pmatrix} \hat w \\ w \end{pmatrix} \in \mathbf{R}^{4n} \ \Big|\
  \hat w = f(w) = A_\alpha g(z(w)) + B_\alpha z(w) \Bigr\} = \alpha(\Gamma_g).
\]
Therefore, the theorem is proved. \(\Box\)

This theorem tells us that, for a fixed \(\alpha\), the symplectic mapping determines the gradient mapping uniquely, and hence the generating function up to an additive constant.
Theorem 2.6. Let \(\alpha \in CSp(\widetilde J_{4n}, J_{4n})\). Let \(\phi(w)\) be a scalar function, let \(w \to \hat w = f(w) = \nabla \phi(w)\) be its induced gradient mapping, and let \(N(w) = f_w(w) = \phi_{ww}(w)\), the Hessian matrix of \(\phi(w)\), satisfy (1.13) in (some neighborhood of) \(\mathbf{R}^{2n}\). Then there exists a canonical map \(z \to \hat z = g(z)\) with Jacobian \(M(z) = g_z(z)\) satisfying (1.11), such that
\[
\begin{aligned}
  1^\circ\quad & A^\alpha f(w) + B^\alpha w = g(C^\alpha f(w) + D^\alpha w), \ \text{identically in } w; \\
  2^\circ\quad & M = \sigma_{\alpha^{-1}}(N) = (A^\alpha N + B^\alpha)(C^\alpha N + D^\alpha)^{-1}, \\
               & N = \sigma_\alpha(M) = (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1}; \\
  3^\circ\quad & \Gamma_g = \alpha^{-1}(\Gamma_f), \qquad \Gamma_f = \alpha(\Gamma_g).
\end{aligned}
\]
In this way we recover the original symplectic mapping once \(f(w)\) or \(\phi(w)\) is obtained from Theorem 2.5.

5.3 Generating Functions for the Phase Flow

Consider the Hamiltonian system
\[
  \frac{dz}{dt} = J_{2n}^{-1} \nabla H(z), \qquad z \in \mathbf{R}^{2n}, \tag{3.1}
\]
where \(H(z)\) is a Hamiltonian function. Its phase flow is denoted by \(g^t(z) = g(z, t) = g_H(z, t)\); it is a one-parameter group of canonical maps, i.e.,
\[
  g^0 = \text{identity}, \qquad g^{t_1 + t_2} = g^{t_1} \circ g^{t_2},
\]
and if \(z_0\) is taken as an initial condition, then \(z(t) = g^t(z_0)\) is the solution of (3.1) with the initial value \(z_0\).

Theorem 3.1. Let \(\alpha \in CSp(\widetilde J_{4n}, J_{4n})\). Let \(z \to \hat z = g(z, t)\) be the phase flow of the Hamiltonian system (3.1) and \(M_0 \in Sp(2n)\). Set \(G(z, t) = g(M_0 z, t)\), with Jacobian \(M(z, t) = G_z(z, t)\); it is a time-dependent canonical map. If \(M_0\) satisfies the transversality condition (1.2), i.e.,
\[
  |C_\alpha M_0 + D_\alpha| \neq 0, \tag{3.2}
\]
then there exists, for sufficiently small \(|t|\) and in (some neighborhood of) \(\mathbf{R}^{2n}\), a time-dependent gradient map \(w \to \hat w = f(w, t)\) with Jacobian \(N(w, t) = f_w(w, t) \in Sm(2n)\) satisfying the transversality condition (1.13), and a time-dependent generating function \(\phi_{\alpha,H}(w, t) = \phi(w, t)\), such that
\[
\begin{aligned}
  1^\circ\quad & f(w, t) = \nabla \phi(w, t); & (3.3)\\
  2^\circ\quad & \frac{\partial}{\partial t} \phi(w, t) = -\mu H\bigl( A^\alpha \nabla \phi(w, t) + B^\alpha w \bigr); & (3.4)\\
  3^\circ\quad & A_\alpha G(z, t) + B_\alpha z \equiv f(C_\alpha G(z, t) + D_\alpha z,\, t) \equiv \nabla \phi(C_\alpha G(z, t) + D_\alpha z,\, t); & (3.5)\\
  4^\circ\quad & N = \sigma_\alpha(M) = (A_\alpha M + B_\alpha)(C_\alpha M + D_\alpha)^{-1}, \\
               & M = \sigma_{\alpha^{-1}}(N) = (A^\alpha N + B^\alpha)(C^\alpha N + D^\alpha)^{-1}. & (3.6)
\end{aligned}
\]
Equation (3.4) is the most general Hamilton–Jacobi equation for the Hamiltonian system (3.1) under the linear transformation \(\alpha\).

Proof. Since \(g(z, t)\) is differentiable with respect to \(z\) and \(t\), so is \(G(z, t)\). Condition (3.2) implies that, for sufficiently small \(|t|\) and in some neighborhood of \(\mathbf{R}^{2n}\),
\[
  |C_\alpha M(z, t) + D_\alpha| \neq 0. \tag{3.7}
\]
Thus, by Theorem 2.5, there exists a time-dependent gradient map \(\hat w = f(w, t)\) satisfying (3.5) and (3.6).

Set
\[
  \widetilde H(w, t) = -\mu H(\hat z)\big|_{\hat z = A^\alpha \hat w(w, t) + B^\alpha w}
  = -\mu H\bigl( A^\alpha \hat w(w, t) + B^\alpha w \bigr). \tag{3.8}
\]
Consider the differential 1-form
\[
  \omega^1 = \sum_{i=1}^{2n} \hat w_i\, dw_i + \widetilde H(w, t)\, dt;
\]
then
\[
\begin{aligned}
  d\omega^1 &= \sum_{i,j=1}^{2n} \frac{\partial \hat w_i}{\partial w_j}\, dw_j \wedge dw_i
  + \sum_{i=1}^{2n} \frac{\partial \hat w_i}{\partial t}\, dt \wedge dw_i
  + \sum_{i=1}^{2n} \frac{\partial \widetilde H}{\partial w_i}\, dw_i \wedge dt \\
  &= \sum_{i<j} \Bigl( \frac{\partial \hat w_i}{\partial w_j} - \frac{\partial \hat w_j}{\partial w_i} \Bigr) dw_j \wedge dw_i
  + \sum_{i=1}^{2n} \Bigl( \frac{\partial \hat w_i}{\partial t} - \frac{\partial \widetilde H}{\partial w_i} \Bigr) dt \wedge dw_i. \tag{3.9}
\end{aligned}
\]
Since \(N(w, t) = f_w(w, t) = \dfrac{\partial \hat w}{\partial w}\) is symmetric, the first sum of (3.9) is zero.

Notice that \(\hat z = G(z, t) = g(M_0 z, t)\), so
\[
  \frac{d\hat z}{dt} = \frac{d\, g(M_0 z, t)}{dt} = J^{-1} \nabla H(G(z, t)). \tag{3.10}
\]
Hence \(G(z, t)\) is the solution of the following initial-value problem:
\[
  \frac{d\hat z}{dt} = J^{-1} \nabla H(\hat z), \qquad \hat z(0) = M_0 z.
\]
Therefore, from the equations
\[
  \hat w = A_\alpha G(z, t) + B_\alpha z, \qquad w = C_\alpha G(z, t) + D_\alpha z,
\]
it follows that
\[
  \frac{d\hat w}{dt} = A_\alpha J^{-1} \nabla H(\hat z), \qquad
  \frac{dw}{dt} = C_\alpha J^{-1} \nabla H(\hat z).
\]
Since \(\dfrac{d\hat w}{dt} = \dfrac{\partial \hat w}{\partial w} \dfrac{dw}{dt} + \dfrac{\partial \hat w}{\partial t}\), combining these equations we obtain
\[
  \frac{\partial \hat w}{\partial t} = \Bigl( A_\alpha - \frac{\partial \hat w}{\partial w} C_\alpha \Bigr) J^{-1} \nabla H(\hat z).
\]
On the other hand,
\[
\begin{aligned}
  \nabla_w \widetilde H(w, t) = \widetilde H_w(w, t)
  &= -\mu \Bigl( \Bigl( \frac{\partial \hat w}{\partial w} \Bigr)^{\!T} (A^\alpha)^T + (B^\alpha)^T \Bigr) \nabla H(\hat z) \\
  &= \Bigl( A_\alpha - \frac{\partial \hat w}{\partial w} C_\alpha \Bigr) J^{-1} \nabla H(\hat z)
  \qquad \text{(by (2.5) and } N \in Sm(2n)\text{)} \\
  &= \frac{\partial \hat w}{\partial t}.
\end{aligned}
\]
So \(d\omega^1 = 0\). By the Poincaré lemma, there exists, in some neighborhood of \(\mathbf{R}^{2n+1}\), a scalar function \(\phi(w, t)\), such that
\[
  \omega^1 = \hat w\, dw + \widetilde H\, dt = d\phi(w, t),
\]
i.e.,
\[
  f(w, t) = \nabla_w \phi(w, t), \qquad
  \frac{\partial}{\partial t} \phi(w, t) = -\mu H\bigl( A^\alpha \nabla_w \phi_{\alpha,H}(w, t) + B^\alpha w \bigr).
\]
Therefore, the theorem is proved. \(\Box\)

Examples of generating functions are:

(I)
\[
  \alpha = \begin{bmatrix}
    O & O & -I_n & O \\
    I_n & O & O & O \\
    O & O & O & I_n \\
    O & I_n & O & O
  \end{bmatrix}, \qquad \mu = 1, \quad M_0 = J, \quad |C_\alpha M_0 + D_\alpha| \neq 0;
\]
\[
  w = \begin{pmatrix} q \\ \hat q \end{pmatrix}, \qquad \phi = \phi(q, \hat q, t);
\]
\[
  \hat w = \begin{pmatrix} -p \\ \hat p \end{pmatrix} = \begin{pmatrix} \phi_q \\ \phi_{\hat q} \end{pmatrix}, \qquad
  \phi_t = -H(\phi_{\hat q}, \hat q).
\]
This is the generating function and the Hamilton–Jacobi equation of the first kind.

(II)
\[
  \alpha = \begin{bmatrix}
    O & O & -I_n & O \\
    O & -I_n & O & O \\
    O & O & O & I_n \\
    I_n & O & O & O
  \end{bmatrix}, \qquad \mu = 1, \quad M_0 = I, \quad |C_\alpha M_0 + D_\alpha| \neq 0;
\]
\[
  w = \begin{pmatrix} q \\ \hat p \end{pmatrix}, \qquad \phi = \phi(q, \hat p, t);
\]
\[
  \hat w = -\begin{pmatrix} p \\ \hat q \end{pmatrix} = \begin{pmatrix} \phi_q \\ \phi_{\hat p} \end{pmatrix}, \qquad
  \phi_t = -H(\hat p, -\phi_{\hat p}).
\]
This is the generating function and the Hamilton–Jacobi equation of the second kind.

(III)
\[
  \alpha = \begin{bmatrix}
    -J_{2n} & J_{2n} \\
    \frac{1}{2} I_{2n} & \frac{1}{2} I_{2n}
  \end{bmatrix}, \qquad \mu = 1, \quad M_0 = I, \quad |C_\alpha M_0 + D_\alpha| \neq 0;
\]
\[
  w = \frac{1}{2}(\hat z + z), \qquad \phi = \phi(w, t);
\]
\[
  \hat w = J(z - \hat z) = \nabla \phi, \qquad
  \phi_t = -H\Bigl( w - \frac{1}{2} J^{-1} \nabla \phi \Bigr).
\]
This is Poincaré's generating function [Wei72] and the corresponding Hamilton–Jacobi equation.

If the Hamiltonian function \(H(z)\) depends analytically on \(z\), then we can derive an explicit expression for the corresponding generating function via recursion.

Theorem 3.2. Let \(H(z)\) depend analytically on \(z\). Then \(\phi_{\alpha,H}(w, t)\) is expressible as a convergent power series in \(t\) for sufficiently small \(|t|\), with recursively determined coefficients:
\[
  \phi(w, t) = \sum_{k=0}^{\infty} \phi^{(k)}(w)\, t^k, \tag{3.11}
\]
\[
  \phi^{(0)}(w) = \frac{1}{2} w^T N_0 w, \qquad N_0 = (A_\alpha M_0 + B_\alpha)(C_\alpha M_0 + D_\alpha)^{-1}, \tag{3.12}
\]
\[
  \phi^{(1)}(w) = -\mu(\alpha) H(E_0 w), \qquad
  E_0 = A^\alpha N_0 + B^\alpha = M_0 (C_\alpha M_0 + D_\alpha)^{-1}. \tag{3.13}
\]
If \(k \geq 1\),
\[
  \phi^{(k+1)}(w) = -\frac{\mu(\alpha)}{k+1} \sum_{m=1}^{k} \frac{1}{m!}
  \sum_{i_1, \cdots, i_m = 1}^{2n}\ \sum_{\substack{j_1 + \cdots + j_m = k \\ j_l \geq 1}}
  H_{z_{i_1}, \cdots, z_{i_m}}(E_0 w)\,
  \bigl( A^\alpha \nabla \phi^{(j_1)} \bigr)_{i_1} \cdots \bigl( A^\alpha \nabla \phi^{(j_m)} \bigr)_{i_m}, \tag{3.14}
\]
where \(H_{z_{i_1}, \cdots, z_{i_m}}(E_0 w)\) is the \(m\)-th partial derivative of \(H(z)\) with respect to \(z_{i_1}, \cdots, z_{i_m}\), evaluated at \(z = E_0 w\), and \(\bigl( A^\alpha \nabla \phi^{(j_l)}(w) \bigr)_{i_l}\) is the \(i_l\)-th component of the column vector \(A^\alpha \nabla \phi^{(j_l)}(w)\).

Proof. Under our assumption, the generating function φα,H (w, t) depends analyti-
cally on w and t in some neighborhood of R2n and for small |t|. Expand it as a power
series as follows:
∞
φ(w, t) = φ(k) (w)tk .
k=0

Differentiating it with respect to w and t, we get



∇φ(w, t) = ∇φ(k) (w)tk , (3.15)
k=0
∞

φ(w, t) = (k + 1)tk φ(k+1) (w). (3.16)
∂t
k=0

By (3.15),
∇φ(0) (w) = ∇φ(w, 0) = f (w, 0) = N0 w.
1
So we can take φ(0) (w) = wT N0 w. We denote E0 = Aα N0 + B α . Then
2


Aα ∇φ(w, t) + B α w = E0 w + Aα ∇φ(k) (w)tk .
k=1

 
Substituting this into H\big(A^\alpha\nabla\phi(w,t) + B^\alpha w\big) and expanding at z = E_0 w, we get

    H\big(A^\alpha\nabla\phi(w,t) + B^\alpha w\big)
      = H\Big(E_0 w + \sum_{k=1}^{\infty} A^\alpha\nabla\phi^{(k)}(w)\, t^k\Big)
      = H(E_0 w) + \sum_{m=1}^{\infty} \frac{1}{m!} \sum_{i_1,\cdots,i_m=1}^{2n} \sum_{j_1,\cdots,j_m=1}^{\infty} t^{\,j_1+\cdots+j_m}\, H_{z_{i_1},\cdots,z_{i_m}}(E_0 w)\, \big(A^\alpha\nabla\phi^{(j_1)}(w)\big)_{i_1} \cdots \big(A^\alpha\nabla\phi^{(j_m)}(w)\big)_{i_m}
      = H(E_0 w) + \sum_{k=1}^{\infty} t^k \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1,\cdots,i_m=1}^{2n} \sum_{j_1+\cdots+j_m=k,\ j_l\ge 1} H_{z_{i_1},\cdots,z_{i_m}}(E_0 w)\, \big(A^\alpha\nabla\phi^{(j_1)}\big)_{i_1} \cdots \big(A^\alpha\nabla\phi^{(j_m)}\big)_{i_m}.

Substituting this formula into the R.H.S. of (3.4), and (3.16) into the L.H.S. of (3.4), then comparing the coefficients of t^k on both sides, we obtain the recursion formulas (3.13) and (3.14).

In the next section, when we use generating functions \phi_{\alpha,H} to construct difference schemes, we always assume M_0 = I. For convenience, we restate Theorem 3.1 and Theorem 3.2 as follows.

Theorem 3.3. Let α ∈ CSp(J˜4n , J4n ). Let z → z = g(z, t) be the phase flow of the
Hamiltonian system (3.1) with Jacobian M (z, t) = gz (z, t). If

|C_\alpha + D_\alpha| \neq 0,

then there exists, for sufficiently small |t| and in (some neighborhood of) R^{2n}, a time-dependent gradient map w \to \hat w = f(w,t) with Jacobian N(w,t) = f_w(w,t) \in Sm(2n) satisfying the transversality condition (1.13), and a time-dependent generating function \phi_{\alpha,H}(w,t) = \phi(w,t), such that

    f(w,t) = \nabla\phi(w,t);                                                                     (3.17)

    \frac{\partial\phi}{\partial t} = -\mu H\big(A^\alpha\nabla\phi(w,t) + B^\alpha w\big);       (3.18)
Aα g(z, t) + Bα z ≡ f (Cα g(z, t) + Dα z, t)
≡ ∇φ(Cα g(z, t) + Dα z, t); (3.19)
N = σα (M ) = (Aα M + Bα )(Cα M + Dα )−1 ; (3.20)
M = σα−1 (N ) = (Aα N + B α )(C α N + Dα )−1 . (3.21)

Theorem 3.4. Let H(z) depend analytically on z. Then φα,H (w, t) is expressible as
a convergent power series in t for sufficiently small |t|, with the recursively determined
coefficients:


    \phi(w,t) = \sum_{k=0}^{\infty} \phi^{(k)}(w)\, t^k;                                          (3.22)

    \phi^{(0)}(w) = \tfrac{1}{2} w^T N_0 w, \quad N_0 = (A_\alpha + B_\alpha)(C_\alpha + D_\alpha)^{-1};   (3.23)

    \phi^{(1)}(w) = -\mu(\alpha) H(E_0 w), \quad E_0 = (C_\alpha + D_\alpha)^{-1}.                (3.24)

If k >= 1,

    \phi^{(k+1)}(w) = -\frac{\mu(\alpha)}{k+1} \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1,\cdots,i_m=1}^{2n} \sum_{j_1+\cdots+j_m=k,\ j_l\ge 1} H_{z_{i_1},\cdots,z_{i_m}}(E_0 w)\, \big(A^\alpha\nabla\phi^{(j_1)}\big)_{i_1} \cdots \big(A^\alpha\nabla\phi^{(j_m)}\big)_{i_m}.   (3.25)

5.4 Construction of Canonical Difference Schemes


In this section, we consider the construction of canonical difference schemes for the
Hamiltonian system (3.1). By Theorem 3.1, for a given time-dependent scalar function
ψ(w, t) : R2n × R → R, we can get a time-dependent canonical map g̃(z, t). If
ψ(w, t) approximates some generating function φα,H (w, t) of the Hamiltonian system
(3.1), then g̃(z, t) approximates the phase flow g(z, t). Then, fixing t as a time step,
we can get a difference scheme —the canonical difference scheme—whose transition
from one time-step to the next is canonical. By Theorem 3.4, the generating functions
φ(w, t) can be expressed as a power series. So a natural way to approximate φ(w, t)
is to take the truncation of the series. More precisely, we have:

Theorem 4.1. Using Theorems 3.3 and 3.4, for sufficiently small τ > 0 as the time-
step, we define

    \psi^{(m)}(w,\tau) = \sum_{i=0}^{m} \phi^{(i)}(w)\, \tau^i, \quad m = 1, 2, \cdots.           (4.1)

Then the gradient mapping

    w \to \hat w = \tilde f(w,\tau) = \nabla\psi^{(m)}(w,\tau)                                    (4.2)

defines an implicit canonical difference scheme z = z^k \to z^{k+1} = \hat z,

    A_\alpha z^{k+1} + B_\alpha z^k = \nabla\psi^{(m)}(C_\alpha z^{k+1} + D_\alpha z^k, \tau)     (4.3)

of m-th order of accuracy.



Proof. Since \psi^{(m)}(w,0) = \phi(w,0), we have \psi^{(m)}_{ww}(w,0) = \phi_{ww}(w,0) = f_w(w,0) = N(w,0), which satisfies the transversality condition (1.13), i.e., |C^\alpha N(w,0) + D^\alpha| \neq 0. Thus, for sufficiently small \tau and in some neighborhood of R^{2n}, N^{(m)}(w,\tau) = \psi^{(m)}_{ww}(w,\tau) also satisfies the transversality condition (1.13), i.e., |C^\alpha N^{(m)}(w,\tau) + D^\alpha| \neq 0. By Theorem 3.1, the gradient mapping w \to \hat w = \tilde f(w,\tau) = \nabla\psi^{(m)}(w,\tau) implicitly defines a time-dependent canonical mapping z \to \hat z = \tilde g(z,\tau) by the equation

    A_\alpha \hat z + B_\alpha z = \nabla\psi^{(m)}(C_\alpha \hat z + D_\alpha z, \tau).

Thus, the equation

    A_\alpha z^{k+1} + B_\alpha z^k = \nabla\psi^{(m)}(C_\alpha z^{k+1} + D_\alpha z^k, \tau)

is an implicit canonical difference scheme. Since \psi^{(m)}(w,\tau) is an m-th order approximation to \phi(w,\tau), so is \tilde f(w,\tau) = \nabla\psi^{(m)}(w,\tau) to f(w,\tau); it follows that the canonical difference scheme (4.3) is of m-th order of accuracy.
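The scheme (4.3) with m = 1 can be exercised numerically. The following is our own worked instance (not from the book): we pick the non-separable Hamiltonian H = p^2/2 + q^2/2 + eps*p^2*q, for which H_q depends on p and the p-update of (4.4) is genuinely implicit, solve it by fixed-point iteration, and verify that the resulting map is canonical by checking that its numerically differentiated Jacobian has determinant 1 (for one degree of freedom this is equivalent to symplecticity). The names and the tolerance choices are ours.

```python
# Worked instance of the implicit canonical scheme (4.4); H and eps are our
# choices, not from the book:  H(p, q) = p^2/2 + q^2/2 + eps*p^2*q.
EPS = 0.3

def H_q(p, q): return q + EPS * p * p          # depends on p -> implicit step
def H_p(p, q): return p + 2.0 * EPS * p * q

def step(p, q, tau):
    """One step of (4.4): p' = p - tau*H_q(p', q), q' = q + tau*H_p(p', q)."""
    pn = p                                     # fixed-point iteration for p'
    for _ in range(100):
        pn_new = p - tau * H_q(pn, q)
        if abs(pn_new - pn) < 1e-15:
            pn = pn_new
            break
        pn = pn_new
    return pn, q + tau * H_p(pn, q)

def jacobian(p, q, tau, h=1e-6):
    """Numerical Jacobian of (p,q) -> (p',q') by central differences."""
    cols = []
    for dp, dq in ((h, 0.0), (0.0, h)):
        pp, qp = step(p + dp, q + dq, tau)
        pm, qm = step(p - dp, q - dq, tau)
        cols.append(((pp - pm) / (2 * h), (qp - qm) / (2 * h)))
    (m11, m21), (m12, m22) = cols              # columns d/dp and d/dq
    return m11, m12, m21, m22

m11, m12, m21, m22 = jacobian(0.4, 0.7, 0.1)
det = m11 * m22 - m12 * m21                    # = 1 for a canonical map
```

The determinant equals 1 up to finite-difference error, for any smooth H, which is exactly the canonicity asserted by Theorem 4.1.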

Therefore, for every \alpha \in CSp(\tilde J_{4n}, J_{4n}), we can construct a hierarchy of symplectic schemes of arbitrarily high order of accuracy.
Examples of canonical difference schemes:
Type (I). Constructing symplectic schemes from the generating function of the first kind. By Theorem 3.2, with \mu = 1,
    \phi^{(0)}(w) = \tfrac{1}{2} w^T N_0 w, \quad N_0 = (A_\alpha + B_\alpha)(C_\alpha + D_\alpha)^{-1},
    \phi^{(1)}(w) = -H(E_0 w), \quad E_0 = (C_\alpha + D_\alpha)^{-1},
    \phi^{(2)}(w) = \tfrac{1}{2}\big((\nabla H)^T A^\alpha E_0^T (\nabla H)\big)(E_0 w),
    \phi^{(3)}(w) = -\tfrac{1}{3}(\nabla H)^T A^\alpha \nabla_w\phi^{(2)} - \tfrac{1}{6}\big(A^\alpha\nabla\phi^{(1)}\big)^T H_{zz}\big(A^\alpha\nabla\phi^{(1)}\big)
                 = -\tfrac{1}{6}(\nabla H)^T A^\alpha\big(E_0^T H_{zz} A^\alpha E_0^T \nabla H + E_0^T H_{zz} E_0 A^{\alpha T}\nabla H\big) - \tfrac{1}{6}(\nabla H)^T E_0 A^{\alpha T} H_{zz} A^\alpha E_0^T \nabla H
                 = -\tfrac{1}{6}\big\{(\nabla H)^T A^\alpha E_0^T H_{zz}\big(A^\alpha E_0^T + E_0 A^{\alpha T}\big)\nabla H + (\nabla H)^T E_0 A^{\alpha T} H_{zz} A^\alpha E_0^T \nabla H\big\}.

Here we use the matrix notation instead of the component notation in Theorem 3.4.
Hzz denotes the Hessian matrix of H, and all derivatives of H are evaluated at z =
E0 w.
Type (II). Constructing symplectic schemes from the generating function of the second kind. Here
    \alpha = \begin{bmatrix} O & O & -I_n & O \\ O & -I_n & O & O \\ O & O & O & I_n \\ I_n & O & O & O \end{bmatrix}, \qquad \alpha^T = \alpha^{-1} = \begin{bmatrix} O & O & O & I_n \\ O & -I_n & O & O \\ -I_n & O & O & O \\ O & O & I_n & O \end{bmatrix},

    w = \begin{pmatrix} q \\ \hat p \end{pmatrix}, \qquad \hat w = -\begin{pmatrix} p \\ \hat q \end{pmatrix},

    N_0 = -\begin{pmatrix} O & I \\ I & O \end{pmatrix}, \quad E_0 = \begin{pmatrix} O & I \\ I & O \end{pmatrix}, \quad A^\alpha E_0^T = -\begin{pmatrix} O & O \\ I & O \end{pmatrix},

    \phi^{(1)}(w) = -H(\hat p, q),
    \phi^{(2)}(w) = -\tfrac{1}{2}\sum_{i=1}^{n}\big(H_{q_i}H_{p_i}\big)(\hat p, q),
    \phi^{(3)}(w) = -\tfrac{1}{6}\sum_{i,j=1}^{n}\big(H_{p_i p_j}H_{q_i}H_{q_j} + H_{q_i q_j}H_{p_i}H_{p_j} + H_{q_i p_j}H_{p_i}H_{q_j}\big)(\hat p, q),

where H(z) = H(p_1,\cdots,p_n,q_1,\cdots,q_n) and H_{z_i} = \partial H/\partial z_i.
a. The first order scheme.

    \psi^{(1)}(w,\tau) = \phi^{(0)}(w) + \tau\phi^{(1)}(w).

The equation \hat w = \nabla\psi^{(1)}(w,\tau) defines a first order canonical difference scheme

    p_i^{k+1} = p_i^k - \tau H_{q_i}(p^{k+1}, q^k),
    q_i^{k+1} = q_i^k + \tau H_{p_i}(p^{k+1}, q^k),        i = 1,\cdots,n.    (4.4)

When H is separable, i.e., H = U(p) + V(q), we have

    H_{q_i}(p^{k+1}, q^k) = V_{q_i}(q^k), \qquad H_{p_i}(p^{k+1}, q^k) = U_{p_i}(p^{k+1}).

In this case, (4.4) becomes

    p_i^{k+1} = p_i^k - \tau V_{q_i}(q^k),
    q_i^{k+1} = q_i^k + \tau U_{p_i}(p^{k+1}),             i = 1,\cdots,n.    (4.5)

Evidently, (4.5) is an explicit difference scheme of first order accuracy. If we set the q's at half-integer times t = (k + \tfrac{1}{2})\tau, then (4.4) becomes

    p_i^{k+1} = p_i^k - \tau V_{q_i}(q^{k+\frac{1}{2}}),
    q_i^{k+\frac{3}{2}} = q_i^{k+\frac{1}{2}} + \tau U_{p_i}(p^{k+1}),        i = 1,\cdots,n.    (4.6)

(4.6) is a staggered explicit scheme of second order accuracy.
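The explicit scheme (4.5) can be tried on a concrete separable Hamiltonian. The following sketch (our worked instance, not from the book) takes the harmonic oscillator U(p) = p^2/2, V(q) = q^2/2 and contrasts the bounded energy error of (4.5) with the unbounded drift of the ordinary non-symplectic Euler method; all function names are ours.

```python
# Scheme (4.5) on H = U(p) + V(q) with U = p^2/2, V = q^2/2 (our test case).

def symplectic_euler(p, q, tau, n_steps):
    """Explicit first-order scheme (4.5): update p first, then q with new p."""
    for _ in range(n_steps):
        p = p - tau * q            # V_q(q) = q
        q = q + tau * p            # U_p(p^{k+1}) = p^{k+1}
    return p, q

def explicit_euler(p, q, tau, n_steps):
    """Ordinary (non-symplectic) Euler for comparison."""
    for _ in range(n_steps):
        p, q = p - tau * q, q + tau * p
    return p, q

def energy(p, q):
    return 0.5 * (p * p + q * q)

p0, q0, tau = 0.0, 1.0, 0.1
e0 = energy(p0, q0)
ps, qs = symplectic_euler(p0, q0, tau, 10000)
pe, qe = explicit_euler(p0, q0, tau, 1000)
drift_sympl = abs(energy(ps, qs) - e0)   # stays bounded, O(tau)
drift_expl = energy(pe, qe) - e0         # grows without bound
```

Even after ten thousand steps the symplectic scheme keeps the energy within an O(tau) band, while the conventional Euler method leaves it after a fraction of that time.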


b. The second order scheme.

ψ (2) (w, τ ) = ψ (1) (w) + τ 2 φ(2) (w).


The induced gradient map is

    \hat w = \nabla_w\psi^{(2)} = -\begin{pmatrix} \hat p \\ q \end{pmatrix} - \tau\begin{pmatrix} \nabla_q H \\ \nabla_p H \end{pmatrix} - \frac{\tau^2}{2}\begin{pmatrix} \nabla_q\big(\sum_{i=1}^{n} H_{q_i}H_{p_i}\big) \\ \nabla_p\big(\sum_{i=1}^{n} H_{q_i}H_{p_i}\big) \end{pmatrix}.

So the second order scheme is

    p_i^{k+1} = p_i^k - \tau H_{q_i}(p^{k+1}, q^k) - \frac{\tau^2}{2}\Big(\sum_{j=1}^{n} H_{q_j}H_{p_j}\Big)_{q_i}(p^{k+1}, q^k),
    q_i^{k+1} = q_i^k + \tau H_{p_i}(p^{k+1}, q^k) + \frac{\tau^2}{2}\Big(\sum_{j=1}^{n} H_{q_j}H_{p_j}\Big)_{p_i}(p^{k+1}, q^k),        i = 1,\cdots,n.

This scheme is already implicit even when H(z) is separable.
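For the harmonic oscillator H = (p^2 + q^2)/2 (our worked instance, not from the book) we have \sum_j H_{q_j}H_{p_j} = qp, so the correction terms evaluated at (p^{k+1}, q^k) are p^{k+1} and q^k, the implicit p-update becomes linear, and the step can be written in closed form. The sketch below checks the advertised second order of accuracy by halving the step size.

```python
# Second order type (II) scheme specialized to H = (p^2 + q^2)/2 (our choice).
import math

def step2(p, q, tau):
    pn = (p - tau * q) / (1.0 + 0.5 * tau * tau)  # p' = p - tau*q - (tau^2/2)p'
    qn = q + tau * pn + 0.5 * tau * tau * q       # q' = q + tau*p' + (tau^2/2)q
    return pn, qn

def global_error(tau, T=1.0):
    p, q = 0.0, 1.0                  # exact flow: p(t) = -sin t, q(t) = cos t
    for _ in range(round(T / tau)):
        p, q = step2(p, q, tau)
    return math.hypot(p + math.sin(T), q - math.cos(T))

e1 = global_error(0.02)
e2 = global_error(0.01)
ratio = e1 / e2                      # should approach 2^2 = 4 for order 2
```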


c. The third order scheme is

    p_i^{k+1} = p_i^k - \tau H_{q_i}(p^{k+1}, q^k) - \frac{\tau^2}{2}\Big(\sum_{j=1}^{n} H_{q_j}H_{p_j}\Big)_{q_i}(p^{k+1}, q^k)
                - \frac{\tau^3}{6}\Big(\sum_{l,j=1}^{n} H_{p_l p_j}H_{q_l}H_{q_j} + H_{q_l q_j}H_{p_l}H_{p_j} + H_{p_l q_j}H_{q_l}H_{p_j}\Big)_{q_i}(p^{k+1}, q^k),

    q_i^{k+1} = q_i^k + \tau H_{p_i}(p^{k+1}, q^k) + \frac{\tau^2}{2}\Big(\sum_{j=1}^{n} H_{q_j}H_{p_j}\Big)_{p_i}(p^{k+1}, q^k)
                + \frac{\tau^3}{6}\Big(\sum_{l,j=1}^{n} H_{p_l p_j}H_{q_l}H_{q_j} + H_{q_l q_j}H_{p_l}H_{p_j} + H_{p_l q_j}H_{q_l}H_{p_j}\Big)_{p_i}(p^{k+1}, q^k),

where i = 1,\cdots,n.
Type (III). Constructing symplectic schemes from the Poincaré-type generating function. Here

    \alpha = \begin{bmatrix} -J_{2n} & J_{2n} \\ \tfrac{1}{2}I_{2n} & \tfrac{1}{2}I_{2n} \end{bmatrix}, \qquad \alpha^{-1} = \begin{bmatrix} \tfrac{1}{2}J_{2n} & I_{2n} \\ -\tfrac{1}{2}J_{2n} & I_{2n} \end{bmatrix}.   (4.7)

    w = \tfrac{1}{2}(\hat z + z), \qquad \hat w = J(z - \hat z).                                  (4.8)
    N_0 = 0, \quad E_0 = I, \quad A^\alpha E_0^T + E_0 A^{\alpha T} = 0.                          (4.9)
    \phi^{(0)} = \phi^{(2)} = \phi^{(4)} = 0,                                                     (4.10)
    \phi^{(1)}(w) = -H\big(\tfrac{1}{2}(\hat z + z)\big),                                         (4.11)
    \phi^{(3)}(w) = \tfrac{1}{24}(\nabla H)^T J H_{zz} J \nabla H,                                (4.12)
    \psi^{(2)}(w,\tau) = -\tau H,                                                                 (4.13)
    \psi^{(4)}(w,\tau) = -\tau H + \tfrac{\tau^3}{24}(\nabla H)^T J H_{zz} J \nabla H.            (4.14)

a. The second order scheme is

    J(z - \hat z) = \hat w = \nabla_w\psi^{(2)}(w,\tau) = -\tau\nabla H\big(\tfrac{1}{2}(\hat z + z)\big),

i.e.,

    z^{k+1} = z^k + \tau J^{-1}\nabla H\Big(\frac{z^{k+1} + z^k}{2}\Big).                         (4.15)

This is the centered Euler scheme.
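For a quadratic Hamiltonian the centered Euler scheme (4.15) is linear in z^{k+1} and can be solved exactly per step; being a centered scheme, it conserves the quadratic Hamiltonian exactly (cf. Theorem 6.7 with B = O). The following sketch (our worked instance for H = (p^2 + q^2)/2, not from the book) checks this conservation to machine precision.

```python
# Centered Euler (implicit midpoint) for dp/dt = -q, dq/dt = p (our example).

def midpoint_step(p, q, tau):
    a = 0.5 * tau
    det = 1.0 + a * a
    rp, rq = p - a * q, q + a * p          # right-hand side (I + (tau/2)J) z
    return (rp - a * rq) / det, (a * rp + rq) / det   # solve (I - (tau/2)J) z'

def energy(p, q):
    return 0.5 * (p * p + q * q)

p, q = 0.3, 0.8
e0 = energy(p, q)
for _ in range(5000):
    p, q = midpoint_step(p, q, 0.1)
drift = abs(energy(p, q) - e0)             # exactly conserved up to roundoff
```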
b. The fourth order scheme is

    J(z - \hat z) = \hat w = \nabla_w\psi^{(4)}(w,\tau) = -\tau\nabla H\big(\tfrac{1}{2}(\hat z + z)\big) + \frac{\tau^3}{24}\nabla_z\big((\nabla H)^T J H_{zz} J \nabla H\big),   (4.16)

i.e.,

    z^{k+1} = z^k + \tau J^{-1}\nabla H\Big(\frac{z^{k+1} + z^k}{2}\Big) - \frac{\tau^3}{24}J^{-1}\nabla_z\big((\nabla H)^T J H_{zz} J \nabla H\big)\Big(\frac{z^{k+1} + z^k}{2}\Big).

It is not difficult to show that the generating function \phi(w,t) of type (III) is odd in t. Hence, Theorem 4.1 leads to a family of canonical difference schemes of arbitrary even order of accuracy.
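For H = |z|^2/2 the tau^3 correction in the fourth order scheme reduces (a worked computation of ours, not from the book) to a modified step length: grad H = z and H_zz = I give (grad H)^T J H_zz J grad H = -|z|^2, whose gradient is -2z, so the scheme is the centered Euler scheme with effective step tau_e = tau + tau^3/12. The sketch below checks the fourth order of accuracy.

```python
# Fourth order type (III) scheme specialized to H = |z|^2/2 (our choice):
# centered Euler applied with step tau_e = tau + tau^3/12.
import math

def step4(p, q, tau):
    a = 0.5 * (tau + tau**3 / 12.0)        # effective midpoint half-step
    det = 1.0 + a * a
    rp, rq = p - a * q, q + a * p
    return (rp - a * rq) / det, (a * rp + rq) / det

def global_error(tau, T=1.0):
    p, q = 0.0, 1.0                        # exact: p = -sin t, q = cos t
    for _ in range(round(T / tau)):
        p, q = step4(p, q, tau)
    return math.hypot(p + math.sin(T), q - math.cos(T))

e1, e2 = global_error(0.2), global_error(0.1)
ratio = e1 / e2                            # ~ 2^4 = 16 for a 4th order scheme
```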
Theorem 4.2. Let \alpha = \begin{bmatrix} -J_{2n} & J_{2n} \\ \tfrac{1}{2}I_{2n} & \tfrac{1}{2}I_{2n} \end{bmatrix}. For sufficiently small \tau > 0 as the time-step, we define

    \psi^{(2m)}(w,\tau) = \sum_{i=1}^{m} \phi^{(2i-1)}(w)\, \tau^{2i-1}, \quad m = 1, 2, \cdots.   (4.17)

Then the gradient map

    w \to \hat w = \tilde f(w,\tau) = \nabla\psi^{(2m)}(w,\tau)

defines implicitly canonical difference schemes z = z^k \to z^{k+1} = \hat z,

    z^{k+1} = z^k - J^{-1}\nabla\psi^{(2m)}\Big(\frac{z^{k+1} + z^k}{2}, \tau\Big)                (4.18)

of 2m-th order of accuracy. The case m = 1 is the centered Euler scheme.

Remark 4.3. We have the following commutative diagram:

    phase flow              gradient transf.          generating function
    g(z,t)    --- \sigma_\alpha --->    f(w,t)    --- \nabla\phi --->    \phi(w,t)
                                                                           |  truncation o(t^{m+1})
    \tilde g^m(z,t) <--- \sigma_\alpha^{-1} ---  \tilde f(w,t)  <--- \nabla\psi ---  \psi(w)

5.5 Further Remarks on Generating Function


Now we want to construct unconditional Hamiltonian algorithms, i.e., algorithms that are symplectic for all Hamiltonian systems.
First we consider the one-leg weighted Euler schemes

    \hat z = E^s_{H,c}\, z: \quad \hat z = z + sJH_z\big(c\hat z + (1-c)z\big),                   (5.1)

with a real number c. Such a scheme is unconditionally symplectic if and only if c = \tfrac{1}{2}, which corresponds to the centered Euler scheme

    \hat z = z + sJH_z\Big(\frac{\hat z + z}{2}\Big).                                             (5.2)

These simple propositions illustrate a general situation: apart from some very rare exceptions, the vast majority of conventional schemes are non-symplectic. However, if we allow c in (5.1) to be a real matrix of order 2n, we get a far-reaching generalization: (5.1) is symplectic iff

    c = \tfrac{1}{2}(I_{2n} + J_{2n}B), \quad B^T = B, \quad c^T J + Jc = J.                      (5.3)
2

The simplest and most important cases are[FQ91]:

    C:  c = \tfrac{1}{2}I_{2n}, \qquad \hat z = z + sJH_z\Big(\dfrac{\hat z + z}{2}\Big);

    P:  c = \begin{bmatrix} I & O \\ O & O \end{bmatrix}, \qquad \hat p = p - sH_q(\hat p, q), \quad \hat q = q + sH_p(\hat p, q);   (5.4)

    Q:  c = \begin{bmatrix} O & O \\ O & I \end{bmatrix}, \qquad \hat p = p - sH_q(p, \hat q), \quad \hat q = q + sH_p(p, \hat q).

For H(p,q) = \phi(p) + \psi(q), the schemes P and Q above reduce to explicit schemes.
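The "symplectic iff c = 1/2" claim for scalar c can be verified numerically. The following sketch (our own, for the linear Hamiltonian H = z^T z/2 with one degree of freedom) builds the linear one-leg map M = (I - scJ)^{-1}(I + s(1-c)J) and checks that det M = 1 (equivalent to symplecticity in one degree of freedom) exactly when c = 1/2; names are ours.

```python
# One-leg weighted Euler (5.1) for H = z^T z / 2, J = [[0,-1],[1,0]] (our test).

def one_leg_matrix(s, c):
    a = s * c
    det = 1.0 + a * a                    # inverse of I - s c J = [[1,a],[-a,1]]
    b = s * (1.0 - c)
    r = ((1.0, -b), (b, 1.0))            # I + s(1-c)J
    m = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):                   # M = (I - scJ)^{-1} (I + s(1-c)J)
        x, y = r[0][j], r[1][j]
        m[0][j] = (x - a * y) / det
        m[1][j] = (a * x + y) / det
    return m

def sympl_defect(m):
    return abs(m[0][0] * m[1][1] - m[0][1] * m[1][0] - 1.0)

s = 0.3
d_half = sympl_defect(one_leg_matrix(s, 0.5))   # centered Euler: symplectic
d_expl = sympl_defect(one_leg_matrix(s, 0.0))   # explicit Euler: not
d_impl = sympl_defect(one_leg_matrix(s, 1.0))   # implicit Euler: not
```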
A matrix \alpha of order 4n is called a Darboux matrix if

    \alpha^T J_{4n}\alpha = \tilde J_{4n},
    J_{4n} = \begin{bmatrix} O & -I_{2n} \\ I_{2n} & O \end{bmatrix}, \qquad \tilde J_{4n} = \begin{bmatrix} J_{2n} & O \\ O & -J_{2n} \end{bmatrix},
    \alpha = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \qquad \alpha^{-1} = \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix}.

Every Darboux matrix induces a (linear) fractional transform between symplectic and symmetric matrices

    \sigma_\alpha: Sp(2n) \to Sm(2n), \qquad \sigma_\alpha(S) = (aS + b)(cS + d)^{-1} = A \quad \text{for } |cS + d| \neq 0,
with the inverse transform \sigma_\alpha^{-1} = \sigma_{\alpha^{-1}}:

    \sigma_\alpha^{-1}: Sm(2n) \to Sp(2n), \qquad \sigma_\alpha^{-1}(A) = (a_1 A + b_1)(c_1 A + d_1)^{-1} = S \quad \text{for } |c_1 A + d_1| \neq 0,

where Sp(2n) = \{S \in GL(2n, R) \mid S^T J_{2n} S = J_{2n}\} is the group of symplectic matrices.
The above mechanism can be extended to generally non-linear operators on R^{2n}. Let SpD_{2n} denote the totality of symplectic operators, and symm(2n) the totality of symmetric operators (not necessarily one-to-one). Every f \in symm(2n) corresponds, at least locally, to a real function \phi (unique up to a constant) such that f is the gradient of \phi: f(w) = \nabla\phi(w), where \nabla\phi(w) = (\phi_{w_1}(w), \cdots, \phi_{w_{2n}}(w)) = \phi_w(w). Then we have

    \sigma_\alpha: SpD_{2n} \to symm(2n), \qquad \sigma_\alpha(g) = (a \circ g + b) \circ (c \circ g + d)^{-1} = \nabla\phi \quad \text{for } |cg_z + d| \neq 0,

or alternatively

    a\,g(z) + bz = (\nabla\phi)\big(cg(z) + dz\big),

where \phi is called the generating function of Darboux type \alpha for the symplectic operator g.[FQ91] Then

    \sigma_\alpha^{-1}: symm(2n) \to SpD_{2n}, \qquad \sigma_\alpha^{-1}(\nabla\phi) = (a_1 \circ \nabla\phi + b_1) \circ (c_1 \circ \nabla\phi + d_1)^{-1} = g, \quad \text{for } |c_1\phi_{ww} + d_1| \neq 0,   (5.5)

or alternatively

    a_1\nabla\phi(w) + b_1 w = g\big(c_1\nabla\phi(w) + d_1 w\big),                               (5.6)

where g is called the symplectic operator of Darboux type \alpha for the generating function \phi.
For the study of symplectic difference schemes, we may narrow the class of Darboux matrices down to the subclass of normal Darboux matrices, i.e., those satisfying a + b = 0, c + d = I_{2n}. Normal Darboux matrices \alpha can be characterized as

    \alpha = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} J & -J \\ c & I - c \end{bmatrix}, \qquad c = \tfrac{1}{2}(I + JB), \quad B^T = B,   (5.7)

    \alpha^{-1} = \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} = \begin{bmatrix} (c - I)J & I \\ cJ & I \end{bmatrix}.   (5.8)

The fractional transform induced by a normal Darboux matrix establishes a 1-1 correspondence between symplectic operators near identity and symmetric operators near nullity. Then the determinantal conditions can be taken for granted. The B's listed above correspond to the most important normal Darboux matrices. For
every Hamiltonian H with its phase flow e^t_H and for every normal Darboux matrix \alpha, we get the generating function \phi(w,t) = \phi^t_H(w) = \phi^t_{H,\alpha}(w) of normal Darboux type \alpha for the phase flow of H by

    \nabla\phi^t_{H,\alpha} = (Je^t_H - J) \circ (ce^t_H + I - c)^{-1} \quad \text{for small } |t|.   (5.9)

\phi^t_{H,\alpha} satisfies the Hamilton--Jacobi equation

    \frac{\partial}{\partial t}\phi(w,t) = -H\big(w + a_1\nabla\phi(w,t)\big) = -H\big(w + c_1\nabla\phi(w,t)\big)   (5.10)

and can be expressed as a Taylor series in t:

    \phi(w,t) = \sum_{k=1}^{\infty} \phi^{(k)}(w)\, t^k, \quad |t| \text{ small enough}.           (5.11)

The coefficients can be determined recursively:

    \phi^{(1)}(w) = -H(w), \quad \text{and for } k \ge 1, \text{ with } a_1 = (c - I)J,

    \phi^{(k+1)}(w) = \frac{-1}{k+1} \sum_{m=1}^{k} \frac{1}{m!} \sum_{j_1+\cdots+j_m=k,\ j_l\ge 1} D^m H(w)\big(a_1\nabla\phi^{(j_1)}(w), \cdots, a_1\nabla\phi^{(j_m)}(w)\big),   (5.12)

where we use the notation of the m-linear form

    D^m H(w)\big(a_1\nabla\phi^{(j_1)}(w), \cdots, a_1\nabla\phi^{(j_m)}(w)\big) := \sum_{i_1,\cdots,i_m=1}^{2n} H_{z_{i_1}\cdots z_{i_m}}(w)\, \big(a_1\nabla\phi^{(j_1)}(w)\big)_{i_1} \cdots \big(a_1\nabla\phi^{(j_m)}(w)\big)_{i_m}.
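The fractional transform \sigma_\alpha of normal Darboux type can be illustrated numerically. The following sketch (our own 2x2, n = 1 instance with B = O, i.e. a = J, b = -J, c = d = I/2) checks that \sigma_\alpha maps a symplectic matrix (here a rotation) to a symmetric matrix, and that \sigma_\alpha^{-1} with a_1 = (c - I)J = -J/2, b_1 = I, c_1 = cJ = J/2, d_1 = I recovers it; the helper names are ours.

```python
# sigma_alpha(S) = (JS - J)(S/2 + I/2)^{-1} and its inverse, 2x2 case (ours).
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def mat_inv(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def scale(A, t):
    return [[t * A[i][j] for j in range(2)] for i in range(2)]

J = [[0.0, -1.0], [1.0, 0.0]]
I = [[1.0, 0.0], [0.0, 1.0]]

theta = 0.7                       # any rotation is symplectic for n = 1
S = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]

N = mat_mul(mat_add(mat_mul(J, S), scale(J, -1.0)),
            mat_inv(mat_add(scale(S, 0.5), scale(I, 0.5))))
asym = abs(N[0][1] - N[1][0])     # symmetry defect of sigma_alpha(S)

S_back = mat_mul(mat_add(mat_mul(scale(J, -0.5), N), I),
                 mat_inv(mat_add(mat_mul(scale(J, 0.5), N), I)))
err = max(abs(S_back[i][j] - S[i][j]) for i in range(2) for j in range(2))
```

Here N comes out as a multiple of the identity (hence symmetric), and the inverse transform reproduces S to machine precision.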

By (5.9), the phase flow \hat z = e^t_H z satisfies

    \hat z - z = -J\nabla\phi^t_{H,\alpha}\big(c\hat z + (I - c)z\big) = -\sum_{j=1}^{\infty} t^j J\nabla\phi^{(j)}\big(c\hat z + (I - c)z\big).   (5.13)

Let \psi^s be a truncation of \phi^s_{H,\alpha} up to a certain power, e.g., s^m. Using the inverse transform \sigma_\alpha^{-1}, we obtain the symplectic operator

    g^s = \sigma_\alpha^{-1}(\nabla\psi^s), \quad |s| \text{ small enough},                       (5.14)

which depends on s, H, \alpha (or equivalently B) and the mode of truncation. It is a symplectic approximation to the phase flow e^s_H and can serve as the transition operator of a symplectic difference scheme for the Hamiltonian system (3.1):

    z \to \hat z = g^s z: \quad \hat z = z - J\nabla\psi^s\big(c\hat z + (1 - c)z\big), \qquad c = \tfrac{1}{2}(I + JB).   (5.15)

Thus, using the technique of phase-flow generating functions, we have constructed, for every H and every normal Darboux matrix, a hierarchy of symplectic schemes by truncation. The simple symplectic schemes (5.4) correspond to the lowest truncation.

5.6 Conservation Laws


The conservation laws we refer to here[FQ91,FW91a,GF88,Ge91] have two meanings. As is well known, the Hamiltonian system (3.1) itself has first integrals which are conserved in the time evolution; e.g., the Hamiltonian is always a first integral. Hence, the first question is how many first integrals of the Hamiltonian system (3.1) can be preserved by symplectic algorithms. The second question is whether symplectic algorithms possess first integrals of their own when the original first integrals cannot be preserved.
We first consider the preservation of first integrals of Hamiltonian systems by symplectic algorithms; for a detailed discussion see[FQ91,Fen93b,GF88,Wan94].
Consider the Hamiltonian system

dz
= J∇H(z). (6.1)
dt
Suppose
z = gH
s
(z) (6.2)

is a symplectic algorithm. Under a symplectic transformation z = S(y), system (6.1)


can be transformed into
dy
= J∇H̃(y), (6.3)
dt
where H̃(y) = H(S(y)), and scheme (6.2) can be transformed into

y = S −1 ◦ gH
s
◦ S(y). (6.4)

On the other hand, the algorithm g s can be applied to system (6.3) directly and the
corresponding scheme is
y = gH̃
s
(y). (6.5)
Naturally, one can ask if (6.4) and (6.5) are the same. This introduces the following
concept.

Definition 6.1. A symplectic algorithm g^s is invariant under a group G of symplectic transformations, or G-invariant, for the Hamiltonian H if

    S^{-1} \circ g^s_H \circ S = g^s_{H \circ S}, \quad \forall S \in G;

g^s is symplectic invariant for the Hamiltonian H if

    S^{-1} \circ g^s_H \circ S = g^s_{H \circ S}, \quad \forall S \in Sp(2n).

In practice, the second case is the more common one. Generally speaking, numerical algorithms depend on the coordinates, i.e., they are locally represented; but many numerical algorithms are independent of linear coordinate transformations.

Theorem 6.2.[FW91a,GF88,Coo87] Suppose F is a first integral of the Hamiltonian system (6.1) and e^t_F is the corresponding phase flow. Then F is conserved up to a constant by the symplectic algorithm g^s_H,

    F \circ g^s_H = F + c, \quad c \text{ a constant},                                            (6.6)

if and only if g^s_H is e^t_F-invariant.

Proof. We first assume that the symplectic algorithm g^s_H is e^t_F-invariant, i.e.,

    e^{-t}_F \circ g^s_H \circ e^t_F = g^s_{H \circ e^t_F}, \quad \forall t \in R.                (6.7)

Since F is a first integral of the Hamiltonian system (6.1) with the Hamiltonian H, H is also a first integral of the Hamiltonian system with the Hamiltonian F, i.e.,

    H \circ e^t_F = H.                                                                            (6.8)

It follows from (6.7) and (6.8) that

    e^{-t}_F \circ g^s_H \circ e^t_F = g^s_H,

i.e.,

    e^t_F = (g^s_H)^{-1} \circ e^t_F \circ g^s_H.                                                 (6.9)

Differentiating (6.9) with respect to t at t = 0 and noticing that

    \frac{d\, e^t_F}{dt}\Big|_{t=0} = J\nabla F,

we get

    J\nabla F = (g^s_H)_*^{-1}\, J\nabla F \circ g^s_H.                                           (6.10)

Since g^s_H is symplectic, i.e., (g^s_H)_*^{-1} J = J (g^s_H)_*^T, we have

    J\nabla F = J(g^s_H)_*^T\, \nabla F \circ g^s_H = J\nabla(F \circ g^s_H),

and hence

    \nabla F = (g^s_H)_*^T\, \nabla F \circ g^s_H = \nabla(F \circ g^s_H).

It follows that

    F \circ g^s_H = F + c.                                                                        (6.11)

We now assume that F is conserved by g^s_H, i.e., (6.6) is valid. Noticing that the phase flows of the vector fields J\nabla F and (g^s_H)_*^{-1} J\nabla F \circ g^s_H are e^t_F and (g^s_H)^{-1} \circ e^t_F \circ g^s_H respectively, we can obtain (6.7) similarly, i.e., g^s_H is e^t_F-invariant.

Symplectic invariant algorithms are invariant under the symplectic group Sp(2n) and hence invariant under the phase flow of any quadratic Hamiltonian.

Corollary 6.3. Symplectic invariant algorithms for Hamiltonian systems preserve all
quadratic first integrals of the original Hamiltonian systems up to a constant.
If a symplectic scheme has a fixed point, i.e., a point z such that g^s_H(z) = z, then the constant c = 0 and the first integral is conserved exactly. Since linear schemes always have the fixed point 0, we have the following result.

Corollary 6.4. Linear symplectic invariant algorithms for linear Hamiltonian sys-
tems preserve all quadratic first integrals of the original Hamiltonian systems.

Example 6.5. The centered Euler scheme and symplectic Runge–Kutta methods are symplectic invariant. Hence they preserve all quadratic first integrals of system (6.1) up to a constant.

Example 6.6. The explicit symplectic scheme (4.5), and the other explicit symplectic schemes (2.1)–(2.4) considered in Chapter 8, are invariant under linear symplectic transformations of the form diag(A^{-T}, A), A \in GL(n). Thus they preserve the angular momenta p^T B q of the original Hamiltonian systems, since the corresponding infinitesimally symplectic matrices are diag(-B^T, B), B \in gl(n).
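Example 6.6 can be observed numerically on the planar Kepler problem H = |p|^2/2 - 1/|q| (our choice of test problem, not from the book): the explicit scheme (4.5) applies a purely radial kick to p and then moves q along the new p, and this leaves the angular momentum q_1 p_2 - q_2 p_1 unchanged at every step, up to roundoff.

```python
# Scheme (4.5) on the planar Kepler problem (our worked instance).
import math

def kepler_step(p, q, tau):
    r3 = math.hypot(q[0], q[1]) ** 3
    p = (p[0] - tau * q[0] / r3, p[1] - tau * q[1] / r3)  # V_q = q / |q|^3
    q = (q[0] + tau * p[0], q[1] + tau * p[1])            # U_p = p
    return p, q

def ang_mom(p, q):
    return q[0] * p[1] - q[1] * p[0]

p, q = (0.0, 1.1), (1.0, 0.0)        # a mildly eccentric bound orbit
L0 = ang_mom(p, q)
for _ in range(20000):
    p, q = kepler_step(p, q, 0.01)
drift = abs(ang_mom(p, q) - L0)      # exactly conserved in exact arithmetic
```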

In fact, these results can be improved. Symplectic Runge–Kutta methods preserve


all quadratic first integrals of system (6.1) exactly. For generating function methods,
we have the following result[FW91a,GF88,FQ87] .
Theorem 6.7. Let g^s_{H,\alpha} be a symplectic method constructed by the generating function method with Darboux type \alpha. If F(z) = \tfrac{1}{2} z^T A z, A \in Sm(2n), is a quadratic first integral of the Hamiltonian system (6.1) and

    AJB - BJA = O,                                                                                (6.12)

then F(z) is conserved by g^s_{H,\alpha}, i.e.,

    F(\hat z) = F(z), \quad \text{or} \quad F \circ g^s_{H,\alpha} = F.                           (6.13)

For B = O, i.e., for centered symplectic difference schemes, (6.12) always holds. So all centered symplectic difference schemes preserve all quadratic first integrals of the Hamiltonian system (6.1) exactly.

Proof. Since F(z) is a first integral of system (6.1),

    \tfrac{1}{2}\hat z^T A \hat z = \tfrac{1}{2} z^T A z, \qquad \hat z = e^t_H z.

It can be rewritten as

    \tfrac{1}{2}(\hat z + z)^T A (\hat z - z) = 0, \qquad \hat z = e^t_H z.                       (6.14)

From (6.12), it follows that

    \tfrac{1}{2}\big(JB(\hat z - z)\big)^T A (\hat z - z) = \tfrac{1}{4}(\hat z - z)^T (AJB - BJA)(\hat z - z) = 0, \quad \forall z, \hat z \in R^{2n}.

Combining this with (6.14), we have

    \big(c\hat z + (I - c)z\big)^T A (\hat z - z) = 0.

Using (5.13), it becomes

    \big(c\hat z + (I - c)z\big)^T A J \sum_{j=1}^{\infty} t^j \nabla\phi^{(j)}\big(c\hat z + (I - c)z\big) = 0.

From this, we get

    w^T A J \nabla\phi^{(j)}(w) = 0, \quad \forall j \ge 1, \ \forall w \in R^{2n}.

Now take w = c\hat z + (I - c)z, where

    \hat z = g^s_{H,\alpha} z = z - J\nabla\psi^{(m)}\big(c\hat z + (I - c)z\big) = z - \sum_{j=1}^{m} s^j J\nabla\phi^{(j)}\big(c\hat z + (I - c)z\big).

Then

    w^T A (\hat z - z) = -\sum_{j=1}^{m} s^j\, w^T A J \nabla\phi^{(j)}(w) = -w^T A J \nabla\psi^{(m)}(w) = 0,

and since

    w^T A (\hat z - z) = \tfrac{1}{2}\hat z^T A \hat z - \tfrac{1}{2} z^T A z + \tfrac{1}{4}(\hat z - z)^T (AJB - BJA)(\hat z - z) = \tfrac{1}{2}\hat z^T A \hat z - \tfrac{1}{2} z^T A z,

it follows that F(\hat z) = F(z). This completes the proof.
We list some of the most important normal Darboux matrices c, the type matrices B, together with the corresponding form of symmetric matrices A of the conserved quadratic invariants F(z) = \tfrac{1}{2} z^T A z:

    c = I - c = \tfrac{1}{2}I, \quad B = O:  A arbitrary;

    c = \begin{bmatrix} I_n & O \\ O & O \end{bmatrix}, \ B = \begin{bmatrix} O & -I_n \\ -I_n & O \end{bmatrix}; \quad c = \begin{bmatrix} O & O \\ O & I_n \end{bmatrix}, \ B = \begin{bmatrix} O & I_n \\ I_n & O \end{bmatrix}:  A = \begin{bmatrix} O & b \\ b^T & O \end{bmatrix}, b arbitrary (angular momentum);

    c = \tfrac{1}{2}\begin{bmatrix} I_n & \pm I_n \\ \mp I_n & I_n \end{bmatrix}, \ B = \mp I_{2n}:  A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix}, a^T = a, b^T = -b (Hermitian type);

    c = \tfrac{1}{2}\begin{bmatrix} I_n & \pm I_n \\ \pm I_n & I_n \end{bmatrix}, \ B = \pm\begin{bmatrix} I_n & O \\ O & -I_n \end{bmatrix}:  A = \begin{bmatrix} a & b \\ -b & -a \end{bmatrix}, a^T = a, b^T = -b.

Apart from the first integrals of the original Hamiltonian systems, a linear symplectic algorithm has its own quadratic first integrals. For the linear Hamiltonian system

    \frac{dz}{dt} = Lz, \qquad L = JA \in sp(2n)                                                  (6.15)

with a quadratic Hamiltonian H(z) = \tfrac{1}{2} z^T A z, A^T = A, let us denote its linear symplectic algorithm by

    \hat z = g^s_H(z) = G(s, A)z, \qquad G \in Sp(2n).                                            (6.16)

Let us assume that the scheme (6.16) is of order r. Then G(s) has the form

    G(s) = I + sL(s), \qquad L(s) = L + \frac{s}{2!}L^2 + \frac{s^2}{3!}L^3 + \cdots + \frac{s^{r-1}}{r!}L^r + O(s^r).

For sufficiently small time step size s, G(s) can be represented as

    G(s) = e^{s\tilde L(s)}, \qquad \tilde L(s) = L + O(s^r), \quad \tilde L(s) \in sp(2n).

So (6.16) becomes

    \hat z = e^{s\tilde L(s)} z.

This is the solution z(t) of the linear Hamiltonian system

    \frac{dz}{dt} = \tilde L(s)z, \qquad \tilde L(s) \in sp(2n),                                  (6.17)

with the initial value z(0) = z^0, evaluated at time s. The symplectic numerical solution

    z^k = G^k(s)z^0 = e^{ks\tilde L(s)} z^0

is just the solution of system (6.17) at the discrete points ks, k = 0, \pm 1, \pm 2, \cdots. Hence, for sufficiently small s, scheme (6.16) corresponds to a perturbed linear Hamiltonian system (6.17) with the Hamiltonian

    \tilde H(z,s) = \tfrac{1}{2}\big(z, J^{-1}\tilde L(s)z\big) = \tfrac{1}{2} z^T J^{-1} L z + O(s^r) = H(z) + O(s^r).   (6.18)
It is well known that a linear Hamiltonian system has n functionally independent quadratic first integrals, and so does the scheme (6.16). The functions

    \tilde H_i(z,s) = \tfrac{1}{2} z^T J^{-1} \tilde L^{2i-1}(s)z, \qquad i = 1, 2, \cdots, n,    (6.19)

are first integrals of the perturbed system (6.17), and therefore of scheme (6.16); they approximate the first integrals of system (6.15)

    H_i(z) = \tfrac{1}{2} z^T J^{-1} L^{2i-1} z, \qquad i = 1, 2, \cdots, n,

up to O(s^r). Another group of first integrals of (6.16) is

    \check H_i(z,s) = z^T J^{-1} G^i(s)z, \qquad i = 1, 2, \cdots, n.

They can be checked easily. The first one is[FW94]

    \check H_1(z,s) = z^T J^{-1} G(s)z = z^T J^{-1}(I + sL(s))z = s\, z^T J^{-1} L(s)z = 2sH(z) + O(s^3).
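The invariant z^T J^{-1} G(s) z can be checked directly on a concrete linear scheme. The following sketch (our own 2x2 example, not from the book) takes G to be the centered Euler matrix for H = |z|^2/2 and verifies both that this quadratic form is conserved along the iteration z -> Gz and that its value agrees with 2sH(z) up to O(s^3).

```python
# First integral z^T J^{-1} G(s) z of a linear symplectic scheme (our example).

def make_G(s):
    a = 0.5 * s
    det = 1.0 + a * a
    # centered Euler matrix (I - aJ)^{-1}(I + aJ), J = [[0,-1],[1,0]]
    return [[(1.0 - a * a) / det, -2.0 * a / det],
            [2.0 * a / det, (1.0 - a * a) / det]]

def H1(z, G):
    # J^{-1} = -J = [[0,1],[-1,0]], so z^T J^{-1} G z = z . (J^{-1} G z)
    Gz = (G[0][0] * z[0] + G[0][1] * z[1], G[1][0] * z[0] + G[1][1] * z[1])
    JinvGz = (Gz[1], -Gz[0])
    return z[0] * JinvGz[0] + z[1] * JinvGz[1]

s = 0.2
G = make_G(s)
z = (0.7, -0.4)
v0 = H1(z, G)
drift = 0.0
for _ in range(1000):
    z = (G[0][0] * z[0] + G[0][1] * z[1], G[1][0] * z[0] + G[1][1] * z[1])
    drift = max(drift, abs(H1(z, G) - v0))
```

Here H(z) = 0.325 at the initial point, so v0 should be close to 2*s*H = 0.13.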

5.7 Convergence of Symplectic Difference Schemes


We consider the Hamiltonian system

    \frac{dz}{dt} = JH_z, \qquad z \in U \subset R^{2n}.                                          (7.1)

In this section, we shall prove that all symplectic schemes for Hamiltonian systems constructed by generating functions are convergent as \tau \to 0.
A normal Darboux matrix, which will be introduced in the next chapter, has the
form
    \alpha = \begin{bmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{bmatrix} = \begin{bmatrix} J & -J \\ \tfrac{1}{2}(I + JB) & \tfrac{1}{2}(I - JB) \end{bmatrix}, \quad B^T = B,

    \alpha^{-1} = \begin{bmatrix} A^\alpha & B^\alpha \\ C^\alpha & D^\alpha \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2}(JBJ - J) & I \\ \tfrac{1}{2}(JBJ + J) & I \end{bmatrix},   (7.2)

which defines a linear transformation in the product space R^{2n} \times R^{2n}:

    \begin{bmatrix} \hat w \\ w \end{bmatrix} = \alpha\begin{bmatrix} \hat z \\ z \end{bmatrix}, \qquad \begin{bmatrix} \hat z \\ z \end{bmatrix} = \alpha^{-1}\begin{bmatrix} \hat w \\ w \end{bmatrix},

i.e.,

    \hat w = J\hat z - Jz, \qquad w = \tfrac{1}{2}(I + JB)\hat z + \tfrac{1}{2}(I - JB)z, \quad B^T = B.   (7.3)

Let z \to \hat z = g(z,t) be the phase flow of the Hamiltonian system (7.1); it is a time-dependent canonical map. There exists, for sufficiently small |t| and in (some neighborhood of) R^{2n}, a time-dependent gradient map w \to \hat w = f(w,t) with Jacobian f_w(w,t) \in Sm(2n) (i.e., everywhere symmetric) and a time-dependent generating function \phi = \phi_{\alpha,H}, such that

    f(w,t) = \nabla\phi_{\alpha,H}(w,t), \qquad A_\alpha g(z,t) + B_\alpha z = \nabla\phi\big(C_\alpha g(z,t) + D_\alpha z, t\big).   (7.4)



On the other hand, for a given time-dependent scalar function \psi(w,t): R^{2n} \times R \to R, we can obtain a time-dependent canonical map \tilde g(z,t). If \psi(w,t) approximates the generating function \phi_{\alpha,H}(w,t) of the Hamiltonian system (7.1), then \tilde g(z,t) approximates the phase flow g(z,t). For sufficiently small \tau > 0 as the time step, define

    \psi^{(m)}(w,\tau) = \sum_{k=1}^{m} \phi^{(k)}(w)\, \tau^k,                                   (7.5)

where \phi^{(1)}(w) = -H(w), and for k \ge 1, with A^\alpha = \tfrac{1}{2}(JBJ - J),

    \phi^{(k+1)}(w) = \frac{-1}{k+1} \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1,\cdots,i_m=1}^{2n} \sum_{j_1+\cdots+j_m=k,\ j_l\ge 1} H_{z_{i_1}\cdots z_{i_m}}(w)\, \big(A^\alpha\nabla\phi^{(j_1)}(w)\big)_{i_1} \cdots \big(A^\alpha\nabla\phi^{(j_m)}(w)\big)_{i_m}.   (7.6)

Then \psi^{(m)}(w,\tau) is the m-th order approximation of \phi_{\alpha,H}(w,\tau), and the gradient map

    w \to \hat w = \tilde f(w,\tau) = \nabla\psi^{(m)}(w,\tau)                                    (7.7)

defines a canonical map z \to \hat z = \tilde g(z,\tau) implicitly by the equation

    A_\alpha \hat z + B_\alpha z = (\nabla\psi^{(m)})(C_\alpha \hat z + D_\alpha z, \tau).        (7.8)

The implicit canonical difference scheme of m-th order accuracy

    z = z^k \to \hat z = z^{k+1} = \tilde g(z^k, \tau)                                            (7.9)

for system (7.1) is thus obtained.


For the sake of simplicity, we denote \tilde g_\tau(z) = \tilde g(z,\tau). Then

    \tilde g_0(z) = z, \qquad \frac{d^i \tilde g_\tau(z)}{d\tau^i}\Big|_{\tau=0} = \frac{d^i g_\tau(z)}{d\tau^i}\Big|_{\tau=0}, \quad i = 1, \cdots, m,   (7.10)

where g_\tau(z) = g(z,\tau) is the phase flow.


Theorem 7.1. If H is analytic in U \subset R^{2n}, then the scheme (7.9) is convergent with m-th order accuracy[CHMM78,QZ93].

Proof. For the step-forward operator \tilde g_\tau, we set

    z_1 = \tilde g_\tau(z), \quad z_2 = \tilde g_\tau(z_1), \quad \cdots, \quad z_k = \tilde g_\tau(z_{k-1}),

so that z^k = \tilde g^k_\tau(z).

First, we prove that the convergence holds locally. We begin by showing that, for any z_0, the iterates \tilde g^n_{t/k}(z) (n \le k) are defined if t is sufficiently small. Indeed, in a neighborhood of z_0 we have \tilde g_\tau(z) = z + O(\tau); thus, if \tilde g^l_{t/k}(z) (l = 1, 2, \cdots, n-1) is defined for z in a neighborhood of z_0, then by telescoping,

    \tilde g^n_{t/k}(z) - z = \big(\tilde g^n_{t/k}(z) - \tilde g^{n-1}_{t/k}(z)\big) + \big(\tilde g^{n-1}_{t/k}(z) - \tilde g^{n-2}_{t/k}(z)\big) + \cdots + \big(\tilde g_{t/k}(z) - z\big) = \underbrace{O(t/k) + \cdots + O(t/k)}_{n} = O(t),

which is small and independent of k for sufficiently small t. So \tilde g^n_{t/k}(z) (n \le k) is defined and remains in U_{z_0} for z near z_0.

Since H is analytic, for any z_1, z_2 \in U_{z_0} there exists a constant C such that

    \|JH_z(z_1) - JH_z(z_2)\| \le \|J\|\, \|H_z(z_1) - H_z(z_2)\| \le C\|z_1 - z_2\|.

Let F(t) = \|g(z_1,t) - g(z_2,t)\|, where g(z_i,t) = z_i + \int_0^t JH_z\big(g(z_i,s)\big)\, ds (i = 1, 2). Then

    F(t) = \Big\|\int_0^t JH_z\big(g(z_1,s)\big)\, ds - \int_0^t JH_z\big(g(z_2,s)\big)\, ds + z_1 - z_2\Big\| \le \|z_1 - z_2\| + C\int_0^t F(s)\, ds,

so by the Gronwall inequality,

    F(t) = \|g(z_1,t) - g(z_2,t)\| \le e^{C|t|}\|z_1 - z_2\|.

Writing y_0 = z and y_l = \tilde g^l_{t/k}(z), we telescope again:

    g_t(z) - \tilde g^k_{t/k}(z) = g^k_{t/k}(z) - \tilde g^k_{t/k}(z) = \sum_{l=1}^{k}\Big(g^{k-l}_{t/k}\big(g_{t/k}(y_{l-1})\big) - g^{k-l}_{t/k}\big(\tilde g_{t/k}(y_{l-1})\big)\Big),

hence, by the Gronwall estimate applied to each term,

    \|g_t(z) - \tilde g^k_{t/k}(z)\| \le \sum_{l=1}^{k} \exp\Big(C\frac{(k-l)|t|}{k}\Big)\big\|g_{t/k}(y_{l-1}) - \tilde g_{t/k}(y_{l-1})\big\| \le k\exp(C|t|)\, O\big((t/k)^{m+1}\big) \to 0 \quad \text{as } k \to \infty.

Here we use the consistency assumption g_\tau(z) - \tilde g_\tau(z) = O(\tau^{m+1}) for an m-th order scheme.

Now assume that g(z,t) is defined for 0 \le t \le T. We show that \tilde g^k_{t/k} converges to g(z,t). By the local result above and the compactness of the solution curve, if N is large enough, g_{t/N} = \lim_{k\to\infty}\tilde g^k_{t/(kN)}, uniformly on a neighborhood of the curve t \to g_t(z). Thus, for 0 \le t \le T, g_t(z) = g^N_{t/N}(z) = \lim_{k\to\infty}\big(\tilde g^k_{t/(kN)}\big)^N(z). By the uniformity in t,

    g_t(z) = \lim_{k\to\infty}\tilde g^k_{t/k}(z).

From this proof one can see that if H is not analytic but H_z satisfies a local Lipschitz condition, then the scheme (7.9) is convergent with order m = 1.
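The convergence statement can be observed numerically. The following sketch (our own experiment) applies the first order scheme (m = 1) of Section 5.4 to the harmonic oscillator and checks that the error of \tilde g^k_{t/k} against the phase flow g_t decays like 1/k, i.e., halving the step halves the error.

```python
# First order convergence of the m = 1 scheme on dp/dt = -q, dq/dt = p (ours).
import math

def tilde_g(p, q, tau):              # first order symplectic scheme
    p = p - tau * q
    q = q + tau * p
    return p, q

def error_at(T, k):
    p, q = 0.0, 1.0                  # exact: p = -sin t, q = cos t
    for _ in range(k):
        p, q = tilde_g(p, q, T / k)
    return math.hypot(p + math.sin(T), q - math.cos(T))

e1 = error_at(1.0, 200)
e2 = error_at(1.0, 400)
ratio = e1 / e2                      # ~2 for first order convergence
```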

5.8 Symplectic Schemes for Nonautonomous System


We consider the following system of canonical equations:

    \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}, \qquad \frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad i = 1, 2, \cdots, n,   (8.1)

with Hamiltonian function H(p_1, p_2, \cdots, p_n, q_1, q_2, \cdots, q_n, t). This is a nonautonomous Hamiltonian system. The approach here, applied particularly to nonautonomous systems, is to treat the time t as an additional dependent variable and to choose a parameter \tau as the new independent variable. The original problem then becomes one of finding p_1, \cdots, p_n, q_1, \cdots, q_n and t as functions of the independent variable \tau. Hence, we extend the coordinates by setting t = q_{n+1}.

The corresponding phase space must have 2n + 2 dimensions, z = (p_1, p_2, \cdots, p_n, h, q_1, q_2, \cdots, q_n, t), where t and h are merely alternative notations for q_{n+1} and p_{n+1}. The new momentum h associated with the time t is, in its physical interpretation, the negative of the total energy. We call this new space the extended phase space.

An advantage of adding another degree of freedom to the analysis is that the system now becomes an autonomous system[Arn89,Qin96,Gon96] in the (2n+2)-dimensional extended phase space, because its Hamiltonian is not an explicit function of \tau.
In the extended phase space, (8.1) becomes

    \frac{dz}{d\tau} = J\nabla K(z), \qquad J = J_{2n+2} = \begin{bmatrix} O & -I_{n+1} \\ I_{n+1} & O \end{bmatrix},   (8.2)

where K(z) = h + H(p_1, p_2, \cdots, p_n, q_1, q_2, \cdots, q_n, t), which we call the "extended Hamiltonian function". We write (8.2) in component form:

    \frac{dp_i}{d\tau} = -\frac{\partial H}{\partial q_i}, \qquad \frac{dq_i}{d\tau} = \frac{\partial H}{\partial p_i}, \qquad i = 1, 2, \cdots, n,   (8.3)

    \frac{dp_{n+1}}{d\tau} = -\frac{\partial H}{\partial q_{n+1}},                                (8.4)

    \frac{dq_{n+1}}{d\tau} = 1.                                                                   (8.5)

Equation (8.5) shows that the normalized parameter becomes equal to q_{n+1}, which is the time t. Equations (8.3) are the original canonical equations. Equation (8.4) gives the law according to which the negative of the total energy, p_{n+1}, changes with the time.
The general form of the canonical equations (8.2) has great theoretical advantages: it shows the role of conservative systems in a new light. We notice that, after adding the time t to the mechanical variables, every system becomes conservative. The extended Hamiltonian function K does not depend on the variable \tau explicitly, and thus our system is a conservative system in the extended phase space. The method of generating functions plays a central role in the construction of symplectic schemes. In [Fen86] a constructive general theory of generating functions is developed, which roughly reads as follows. Let a normal Darboux matrix be
    \alpha = \begin{bmatrix} A_\alpha & B_\alpha \\ C_\alpha & D_\alpha \end{bmatrix} = \begin{bmatrix} J & -J \\ \tfrac{1}{2}(I + JB) & \tfrac{1}{2}(I - JB) \end{bmatrix}, \quad B^T = B,

    \alpha^{-1} = \begin{bmatrix} A^\alpha & B^\alpha \\ C^\alpha & D^\alpha \end{bmatrix} = \begin{bmatrix} \tfrac{1}{2}(JBJ - J) & I \\ \tfrac{1}{2}(JBJ + J) & I \end{bmatrix}.

Define a linear transformation in the product space R^{2n+2} \times R^{2n+2} by

    \begin{bmatrix} \hat w \\ w \end{bmatrix} = \alpha\begin{bmatrix} \hat z \\ z \end{bmatrix}, \qquad \begin{bmatrix} \hat z \\ z \end{bmatrix} = \alpha^{-1}\begin{bmatrix} \hat w \\ w \end{bmatrix},   (8.6)

i.e.,

    \hat w = J\hat z - Jz, \qquad w = \tfrac{1}{2}(I + JB)\hat z + \tfrac{1}{2}(I - JB)z, \quad B^T = B.   (8.7)
2 2
Let z \to \hat z = g(z,\tau) be the phase flow of the Hamiltonian system (8.2). It is a time-dependent canonical map. There exists, for sufficiently small \tau and in (some neighborhood of) R^{2n+2}, a time-dependent gradient mapping w \to \hat w = f(w,\tau) with Jacobian f_w(w,\tau) \in Sm(2n+2) (i.e., symmetric everywhere) and a time-dependent generating function \phi = \phi_{\alpha,K}(w,\tau), such that

    \frac{\partial\phi}{\partial\tau} = -K\big(A^\alpha\nabla\phi(w) + B^\alpha w\big),
    f(w,\tau) = \nabla\phi_{\alpha,K}(w,\tau),                                                    (8.8)
    A_\alpha g(z,\tau) + B_\alpha z \equiv (\nabla\phi)\big(C_\alpha g(z,\tau) + D_\alpha z, \tau\big).

On the other hand, for a given time-dependent scalar function \psi(w,\tau): R^{2n+2} \times R \to R, we can get a time-dependent canonical map \tilde g(z,\tau). If \psi(w,\tau) approximates the generating function \phi_{\alpha,K}(w,\tau) of the Hamiltonian system (8.2), then \tilde g(z,\tau) approximates the phase flow g(z,\tau).
For a sufficiently small s > 0 as the time-step, define


    \psi^{(m)}(w,s) = \sum_{k=1}^{m} \phi^{(k)}(w)\, s^k,                                         (8.9)

where \phi^{(1)}(w) = -K(w) and A^\alpha = \tfrac{1}{2}(JBJ - J). For k \ge 1,

    \phi^{(k+1)}(w) = \frac{-1}{k+1} \sum_{m=1}^{k} \frac{1}{m!} \sum_{i_1,\cdots,i_m=1}^{2n+2} \sum_{j_1+\cdots+j_m=k,\ j_l\ge 1} K_{z_{i_1}\cdots z_{i_m}}(w)\, \big(A^\alpha\nabla\phi^{(j_1)}(w)\big)_{i_1} \cdots \big(A^\alpha\nabla\phi^{(j_m)}(w)\big)_{i_m}.   (8.10)

Then \psi^{(m)}(w,s) is the m-th order approximation of \phi_{\alpha,K}(w,s), and the gradient mapping

    w \to \hat w = \tilde f(w,s) = \nabla\psi^{(m)}(w,s)                                          (8.11)

defines a canonical map z \to \hat z = \tilde g(z,s) implicitly by the equation

    A_\alpha \hat z + B_\alpha z = (\nabla\psi^{(m)})(C_\alpha \hat z + D_\alpha z, s).           (8.12)

An implicit canonical difference scheme

    z = z^k \to \hat z = z^{k+1} = \tilde g(z^k, s)                                               (8.13)

for system (8.2) is obtained[Qin96], and this scheme is of m-th order of accuracy.
Let B = 0; then
$$\phi^{(1)}(w) = -K(w), \qquad \phi^{(2)} = \phi^{(4)} = 0, \qquad \phi^{(3)} = \frac{1}{24}(\nabla K)^{T} J K_{zz} J \nabla K.$$
We have a scheme of second order:
$$J(\tilde{z} - z) = \nabla\psi^{(2)}(w, s) = -s\,\nabla K\Bigl(\frac{\tilde{z} + z}{2}\Bigr),$$
$$z^{k+1} = z^{k} + sJ\nabla K\Bigl(\frac{z^{k+1} + z^{k}}{2}\Bigr),$$
i.e.,
$$p_i^{k+1} = p_i^{k} - sH_{q_i}\Bigl(\frac{p^{k+1}+p^{k}}{2}, \frac{q^{k+1}+q^{k}}{2}, \frac{t^{k+1}+t^{k}}{2}\Bigr),$$
$$q_i^{k+1} = q_i^{k} + sH_{p_i}\Bigl(\frac{p^{k+1}+p^{k}}{2}, \frac{q^{k+1}+q^{k}}{2}, \frac{t^{k+1}+t^{k}}{2}\Bigr), \tag{8.14}$$
$$h^{k+1} = h^{k} - sH_{t}\Bigl(\frac{p^{k+1}+p^{k}}{2}, \frac{q^{k+1}+q^{k}}{2}, \frac{t^{k+1}+t^{k}}{2}\Bigr),$$
$$t^{k+1} = t^{k} + s.$$
This is the time-centered Euler scheme.
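As a runnable sketch (not from the text), the time-centered Euler scheme above can be implemented for an autonomous K with the implicit midpoint equation solved by fixed-point iteration; the Hamiltonian, step size, and solver are illustrative assumptions:

```python
import numpy as np

# Structure matrix for z = (p, q): z' = J grad K(z) gives p' = -K_q, q' = K_p.
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])

def midpoint_step(z, s, grad_K, tol=1e-14, max_iter=200):
    """One step of z^{k+1} = z^k + s J grad K((z^{k+1}+z^k)/2),
    solved by fixed-point iteration."""
    z_new = z.copy()
    for _ in range(max_iter):
        z_prev = z_new
        z_new = z + s * J @ grad_K(0.5 * (z_prev + z))
        if np.linalg.norm(z_new - z_prev) < tol:
            break
    return z_new

# Harmonic oscillator K = (p^2 + q^2)/2: the scheme is a phase-space rotation
# and conserves K exactly (up to the fixed-point tolerance).
grad_K = lambda z: z
z = np.array([1.0, 0.0])
for _ in range(500):
    z = midpoint_step(z, 0.1, grad_K)
energy_drift = abs(0.5 * z @ z - 0.5)
```

For the linear test problem the map is exactly a rotation, so the energy drift observed is only the accumulated fixed-point tolerance.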
Scheme of the fourth order:
$$J(z^{k+1} - z^{k}) = \nabla\psi^{(4)}(w, s) = -s\,\nabla K(w) + \frac{s^{3}}{24}\,\nabla_{z}\bigl((\nabla K)^{T} J K_{zz} J \nabla K\bigr)(w), \qquad w = \frac{z^{k+1}+z^{k}}{2},$$
i.e.,
$$z^{k+1} = z^{k} + sJ\nabla K\Bigl(\frac{z^{k+1}+z^{k}}{2}\Bigr) - \frac{s^{3}}{24}\, J\,\nabla_{z}\bigl((\nabla K)^{T} J K_{zz} J \nabla K\bigr)\Bigl(\frac{z^{k+1}+z^{k}}{2}\Bigr), \tag{8.15}$$
i.e., with all derivatives of H evaluated at ((p^{k+1}+p^{k})/2, (q^{k+1}+q^{k})/2, (t^{k+1}+t^{k})/2) and summation over the repeated indices j, l:
$$p_i^{k+1} = p_i^{k} - sH_{q_i} - \frac{s^{3}}{24}\bigl(H_{p_jp_lq_i}H_{q_j}H_{q_l} + 2H_{p_jp_l}H_{q_jq_i}H_{q_l} - 2H_{q_jp_lq_i}H_{p_j}H_{q_l} - 2H_{q_jp_l}H_{p_jq_i}H_{q_l} - 2H_{q_jp_l}H_{p_j}H_{q_lq_i} + 2H_{q_jq_l}H_{p_lq_i}H_{p_j} + H_{q_jq_lq_i}H_{p_l}H_{p_j} - 2H_{q_jq_i}H_{p_jt} - 2H_{q_j}H_{p_jq_it} + 2H_{p_jq_i}H_{q_jt} + 2H_{p_j}H_{q_jq_it} + H_{q_itt}\bigr),$$
$$q_i^{k+1} = q_i^{k} + sH_{p_i} + \frac{s^{3}}{24}\bigl(H_{p_jp_lp_i}H_{q_j}H_{q_l} + 2H_{p_jp_l}H_{q_jp_i}H_{q_l} - 2H_{q_jp_lp_i}H_{p_j}H_{q_l} - 2H_{q_jp_l}H_{p_jp_i}H_{q_l} - 2H_{q_jp_l}H_{p_j}H_{q_lp_i} + 2H_{q_jq_l}H_{p_lp_i}H_{p_j} + H_{q_jq_lp_i}H_{p_l}H_{p_j} - 2H_{q_jp_i}H_{p_jt} - 2H_{q_j}H_{p_jp_it} + 2H_{p_jp_i}H_{q_jt} + 2H_{p_j}H_{q_jp_it} + H_{p_itt}\bigr),$$
$$h^{k+1} = h^{k} - sH_{t} - \frac{s^{3}}{24}\bigl(H_{p_jp_lt}H_{q_j}H_{q_l} + 2H_{p_jp_l}H_{q_jt}H_{q_l} - 2H_{q_jp_lt}H_{p_j}H_{q_l} - 2H_{q_jp_l}H_{p_jt}H_{q_l} - 2H_{q_jp_l}H_{p_j}H_{q_lt} + 2H_{q_jq_l}H_{p_lt}H_{p_j} + H_{q_jq_lt}H_{p_l}H_{p_j} - 2H_{q_jt}H_{p_jt} - 2H_{q_j}H_{p_jtt} + 2H_{p_jt}H_{q_jt} + 2H_{p_j}H_{q_jtt} + H_{ttt}\bigr),$$
$$t^{k+1} = t^{k} + s. \tag{8.16}$$
Let
$$B = -\begin{bmatrix} O & I \\ I & O \end{bmatrix}, \qquad w = \begin{pmatrix} p \\ q \end{pmatrix}.$$
We have
$$\phi^{(1)} = -K(w), \qquad \phi^{(2)} = -\frac{1}{2}(K_{q_i}K_{p_i})(w),$$
$$\phi^{(3)} = -\frac{1}{6}(K_{p_jp_l}K_{q_j}K_{q_l} + K_{q_jq_l}K_{p_j}K_{p_l} + K_{q_jp_l}K_{p_j}K_{q_l})(w),$$
or, in terms of H,
$$\phi^{(2)} = -\frac{1}{2}(H_{q_i}H_{p_i} + H_t)(w), \tag{8.17}$$
$$\phi^{(3)} = -\frac{1}{6}\bigl(H_{p_jp_l}H_{q_j}H_{q_l} + H_{q_l}H_{q_jp_l}H_{p_j} + H_{q_l}H_{p_lt} + H_{q_jq_l}H_{p_j}H_{p_l} + 2H_{p_j}H_{q_jt} + H_{tt}\bigr).$$

Scheme of the first order:
$$p_i^{k+1} = p_i^{k} - sH_{q_i}(p^{k+1}, q^{k}, t^{k}),$$
$$q_i^{k+1} = q_i^{k} + sH_{p_i}(p^{k+1}, q^{k}, t^{k}),$$
$$h^{k+1} = h^{k} - sH_{t}(p^{k+1}, q^{k}, t^{k}), \tag{8.18}$$
$$t^{k+1} = t^{k} + s.$$
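A runnable sketch of this first-order scheme follows. The test problem — the driven oscillator H(p, q, t) = p²/2 + q²/2 − q sin 2t — is an illustrative assumption, not from the text; since H_q does not depend on p, the implicit p-update becomes explicit:

```python
import numpy as np

def solve(s, T):
    """First-order scheme (8.18) for H = p^2/2 + q^2/2 - q*sin(2t)."""
    p = q = t = h = 0.0
    for _ in range(int(round(T / s))):
        p_new = p - s * (q - np.sin(2 * t))    # H_q at (p^{k+1}, q^k, t^k)
        q_new = q + s * p_new                  # H_p at (p^{k+1}, q^k, t^k)
        h = h - s * (-2 * q * np.cos(2 * t))   # H_t at (p^{k+1}, q^k, t^k)
        p, q, t = p_new, q_new, t + s
    return p, q, h

def exact(t):
    # closed-form solution of q'' + q = sin(2t), q(0) = p(0) = 0
    return ((2 / 3) * np.cos(t) - (2 / 3) * np.cos(2 * t),
            (2 / 3) * np.sin(t) - (1 / 3) * np.sin(2 * t))

T = 1.0
p_ex, q_ex = exact(T)
errs = [abs(solve(s, T)[0] - p_ex) + abs(solve(s, T)[1] - q_ex)
        for s in (0.01, 0.005)]
```

Halving the step size should roughly halve the error, consistent with first-order accuracy.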

Scheme of the second order:
$$p_i^{k+1} = p_i^{k} - sH_{q_i}(p^{k+1}, q^{k}, t^{k}) - \frac{s^{2}}{2}\bigl(H_{q_it} + H_{q_jq_i}H_{p_j} + H_{q_j}H_{p_jq_i}\bigr)(p^{k+1}, q^{k}, t^{k}),$$
$$q_i^{k+1} = q_i^{k} + sH_{p_i}(p^{k+1}, q^{k}, t^{k}) + \frac{s^{2}}{2}\bigl(H_{p_it} + H_{q_jp_i}H_{p_j} + H_{q_j}H_{p_jp_i}\bigr)(p^{k+1}, q^{k}, t^{k}),$$
$$h^{k+1} = h^{k} - sH_{t}(p^{k+1}, q^{k}, t^{k}) - \frac{s^{2}}{2}\bigl(H_{tt} + H_{q_jt}H_{p_j} + H_{q_j}H_{p_jt}\bigr)(p^{k+1}, q^{k}, t^{k}), \tag{8.19}$$
$$t^{k+1} = t^{k} + s.$$
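A runnable sketch of this second-order scheme, specialized (as an illustrative assumption) to H(p, q, t) = p²/2 + q²/2 + qt, for which H_q = q + t, H_p = p, H_{qt} = H_{qq} = H_{pp} = 1, H_{pq} = H_{pt} = 0; the s²/2 correction then makes the p-update a linear equation for p^{k+1}:

```python
import numpy as np

def solve(s, T):
    """Second-order scheme (8.19) for H = p^2/2 + q^2/2 + q*t."""
    p = q = t = 0.0
    for _ in range(int(round(T / s))):
        # p^{k+1} = p - s(q+t) - (s^2/2)(H_qt + H_qq H_p + H_q H_pq),
        # evaluated at (p^{k+1}, q^k, t^k): correction = (s^2/2)(1 + p^{k+1}).
        p_new = (p - s * (q + t) - 0.5 * s**2) / (1.0 + 0.5 * s**2)
        # q^{k+1} = q + s p^{k+1} + (s^2/2)(H_pt + H_qp H_p + H_q H_pp)
        q = q + s * p_new + 0.5 * s**2 * (q + t)
        p, t = p_new, t + s
    return p, q

T = 1.0
# exact solution of q'' + q = -t with q(0) = p(0) = 0
p_ex, q_ex = np.cos(T) - 1.0, np.sin(T) - T
errs = [abs(solve(s, T)[0] - p_ex) + abs(solve(s, T)[1] - q_ex)
        for s in (0.02, 0.01)]
```

Halving the step size should reduce the error by roughly a factor of four, consistent with second-order accuracy.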
Scheme of the third order (all derivatives of H evaluated at (p^{k+1}, q^{k}, t^{k})):
$$p_i^{k+1} = p_i^{k} - sH_{q_i} - \frac{s^{2}}{2}\bigl(H_{q_it} + H_{q_jq_i}H_{p_j} + H_{q_j}H_{p_jq_i}\bigr) - \frac{s^{3}}{6}\bigl(H_{tt} + 2H_{p_j}H_{q_jt} + H_{q_l}H_{p_lt} + H_{p_jp_l}H_{q_j}H_{q_l} + H_{q_l}H_{q_jp_l}H_{p_j} + H_{q_jq_l}H_{p_j}H_{p_l}\bigr)_{q_i},$$
$$q_i^{k+1} = q_i^{k} + sH_{p_i} + \frac{s^{2}}{2}\bigl(H_{p_it} + H_{q_jp_i}H_{p_j} + H_{q_j}H_{p_jp_i}\bigr) + \frac{s^{3}}{6}\bigl(H_{tt} + 2H_{p_j}H_{q_jt} + H_{q_l}H_{p_lt} + H_{p_jp_l}H_{q_j}H_{q_l} + H_{q_l}H_{q_jp_l}H_{p_j} + H_{q_jq_l}H_{p_j}H_{p_l}\bigr)_{p_i},$$
$$h^{k+1} = h^{k} - sH_{t} - \frac{s^{2}}{2}\bigl(H_{tt} + H_{q_jt}H_{p_j} + H_{q_j}H_{p_jt}\bigr) - \frac{s^{3}}{6}\bigl(H_{tt} + 2H_{p_j}H_{q_jt} + H_{q_l}H_{p_lt} + H_{p_jp_l}H_{q_j}H_{q_l} + H_{q_l}H_{q_jp_l}H_{p_j} + H_{q_jq_l}H_{p_j}H_{p_l}\bigr)_{t},$$
$$t^{k+1} = t^{k} + s. \tag{8.20}$$
Bibliography

[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. Springer-Verlag, GTM


60, Berlin Heidelberg, Second edition, (1989).
[CHMM78] A. Chorin, T. J. R. Hughes, J. E. Marsden, and M. McCracken: Product formulas
and numerical algorithms. Comm. Pure and Appl. Math., 31:205–256, (1978).
[Coo87] G. J. Cooper: Stability of Runge–Kutta methods for trajectory problems. IMA J.
Numer. Anal., 7:1–13, (1987).
[Fen86] K. Feng: Difference schemes for Hamiltonian formalism and symplectic geometry. J.
Comput. Math., 4:279–289, (1986).
[Fen93b] K. Feng: Symplectic, contact and volume preserving algorithms. In Z. C. Shi and
T. Ushijima, editors, Proc. 1st China–Japan Conf. on Computation of Differential Equations
and Dynamical Systems, pages 1–28. World Scientific, Singapore, (1993).
[FQ91] K. Feng and M. Z. Qin: Hamiltonian algorithms for Hamiltonian systems and a
comparative numerical study. Comput. Phys. Comm., 65:173–187, (1991).
[FW91] K. Feng and D.L. Wang: A Note on Conservation Laws of Symplectic Difference
Schemes for Hamiltonian Systems. J. Comput. Math., 9(3):229–237, (1991).
[FW94] K. Feng and D.L. Wang: Dynamical systems and geometric construction of algo-
rithms. In Z. C. Shi and C. C. Yang, editors, Computational Mathematics in China, Con-
temporary Mathematics of AMS Vol 163, pages 1–32. AMS, (1994).
[FWQW89] K. Feng, H. M. Wu, M.Z. Qin, and D.L. Wang: Construction of canonical dif-
ference schemes for Hamiltonian formalism via generating functions. J. Comput. Math.,
7:71–96, (1989).
[Ge91] Z. Ge: Equivariant symplectic difference schemes and generating functions. Physica
D, 49:376–386, (1991).
[GF88] Z. Ge and K. Feng: On the Approximation of Linear Hamiltonian Systems. J. Comput.
Math., 6(1):88–97, (1988).
[Gon96] O. Gonzalez: Time integration and discrete Hamiltonian systems. J. Nonlinear. Sci.,
6:449–467, (1996).
[Hua44] L. K. Hua: On the theory of automorphic function of a matrix I, II. Amer. J. Math.,
66:470–488, (1944).
[LL99] L. D. Landau and E. M. Lifshitz: Mechanics, Volume I of Course of Theoretical
Physics. Corp. Butterworth, Heinemann, New York, Third edition, (1999).
[Qin96] M. Z. Qin: Symplectic difference schemes for nonautonomous Hamiltonian systems.
Acta Applicandae Mathematicae, 12(3):309–321, (1996).
[Qin97a] M. Z. Qin: A symplectic scheme for the PDEs. AMS/IP Studies in Advanced
Mathematics, 5:349–354, (1997).
[QZ93] M. Z. Qin and W. J. Zhu: A note on stability of three stage difference schemes for
ODE’s. Computers Math. Applic., 25:35–44, (1993).
[Sie43] C.L. Siegel: Symplectic geometry. Amer. J. Math., 65:1–86, (1943).
[Wan94] D. L. Wang: Some aspects of Hamiltonian systems and symplectic difference
methods. Physica D, 73:1–16, (1994).
[Wei72] A. Weinstein: The invariance of Poincaré's generating function for canonical
transformations. Inventiones Math., 16:202–213, (1972).
Chapter 6.
The Calculus of Generating Functions and
Formal Energy

In the previous chapter, we constructed symplectic schemes of arbitrary order via
generating functions. However, the construction of generating functions depends
on the chosen coordinates, and one would like to know under what circumstances the
construction of generating functions is independent of the coordinates. Generating
functions are also deeply connected with conservation laws, so it is important to study
their properties and computation.

6.1 Darboux Transformation


Consider a cotangent bundle T*R^n ≅ R^{2n} with the natural symplectic structure [Fen98a]:
$$J_{2n} = \begin{bmatrix} O & I_{n} \\ -I_{n} & O \end{bmatrix}. \tag{1.1}$$

Now we consider R^{4n} and the product of cotangent bundles T*R^n × T*R^n ≅ R^{4n}
with the natural product symplectic structure:
$$\tilde{J}_{4n} = \begin{bmatrix} -J_{2n} & O \\ O & J_{2n} \end{bmatrix}. \tag{1.2}$$

Correspondingly, we consider the product space R^n × R^n ≅ R^{2n}. Its cotangent
bundle T*(R^n × R^n) = T*R^{2n} ≅ R^{4n} has a natural symplectic structure:
$$J_{4n} = \begin{bmatrix} O & I_{2n} \\ -I_{2n} & O \end{bmatrix}. \tag{1.3}$$
Choose symplectic coordinates z = (p, q) on the symplectic manifold. Then for a
symplectic transformation g : T*R^n → T*R^n, its graph
$$\mathrm{gr}(g) = \left\{ \begin{pmatrix} gz \\ z \end{pmatrix},\ z \in T^{*}\mathbf{R}^{n} \right\} \tag{1.4}$$
is a Lagrangian submanifold of T*R^n × T*R^n in R̃^{4n} = (R^{4n}, J̃_{4n}). Note that
R^{4n} here also carries the standard symplectic structure (R^{4n}, J_{4n}). A generating map

α : T ∗ Rn × T ∗ Rn −→ T ∗ (Rn × Rn )

maps the symplectic structure (1.2) to the standard one (1.3). In particular, α maps La-
grangian submanifolds in (R4n , J"4n ) to Lagrangian submanifolds Lg in (R4n , J4n ).
Suppose that α satisfies the transversality condition of g (Chapter 5, Equation (1.2));
then
$$L_{g} = \left\{ \begin{pmatrix} d\phi_{g}(\omega) \\ \omega \end{pmatrix},\ \omega \in T^{*}\mathbf{R}^{2n} \right\}, \tag{1.5}$$
where φ_g is called the generating function of g. We call this generating map α (linear
case) or α* (nonlinear case) a Darboux transformation; in other words, we have the
following definition.
Definition 1.1. A linear map
$$\alpha = \begin{bmatrix} A_{\alpha} & B_{\alpha} \\ C_{\alpha} & D_{\alpha} \end{bmatrix}, \tag{1.6}$$
which acts as
$$\mathbf{R}^{4n} \ni \begin{pmatrix} z_{0} \\ z_{1} \end{pmatrix} \longrightarrow \alpha\begin{pmatrix} z_{0} \\ z_{1} \end{pmatrix} = \begin{pmatrix} A_{\alpha}z_{0} + B_{\alpha}z_{1} \\ C_{\alpha}z_{0} + D_{\alpha}z_{1} \end{pmatrix} = \begin{pmatrix} w_{0} \\ w_{1} \end{pmatrix} \in \mathbf{R}^{4n},$$
is called a Darboux transformation if
$$\alpha^{T} J_{4n}\,\alpha = \tilde{J}_{4n}. \tag{1.7}$$

Denote
$$E_{\alpha} = C_{\alpha} + D_{\alpha}, \qquad F_{\alpha} = A_{\alpha} + B_{\alpha}; \tag{1.8}$$
then we have:

Definition 1.2. Let α be a Darboux transformation. Then we define
$$\mathrm{Sp}(\tilde{J}_{4n}, J_{4n}) = \{\alpha \in GL(4n) \mid \alpha^{T} J_{4n}\alpha = \tilde{J}_{4n}\} = \mathrm{Sp}(\tilde{J}, J),$$
$$\mathrm{Sp}(J_{4n}) = \{\beta \in GL(4n) \mid \beta^{T} J_{4n}\beta = J_{4n}\} = \mathrm{Sp}(4n), \tag{1.9}$$
$$\mathrm{Sp}(\tilde{J}_{4n}) = \{\gamma \in GL(4n) \mid \gamma^{T}\tilde{J}_{4n}\gamma = \tilde{J}_{4n}\} = \tilde{\mathrm{Sp}}(4n).$$

Definition 1.3. The special case of Darboux transformation
$$\alpha_{0} = \begin{bmatrix} J_{2n} & -J_{2n} \\[2pt] \dfrac{1}{2}I_{2n} & \dfrac{1}{2}I_{2n} \end{bmatrix}$$
is called the Poincaré transformation.



Remark 1.4. From the definition above, we know α₀ ∈ Sp(J̃_{4n}, J_{4n}).

Proposition 1.5. If α ∈ Sp(J̃_{4n}, J_{4n}), β ∈ Sp(4n), γ ∈ S̃p(4n), then βαγ ∈ Sp(J̃_{4n}, J_{4n}).

Proposition 1.6. ∀ α ∈ Sp(J̃_{4n}, J_{4n}), we have
$$\mathrm{Sp}(\tilde{J}_{4n}, J_{4n}) = \mathrm{Sp}(4n)\,\alpha_{0} = \alpha_{0}\,\tilde{\mathrm{Sp}}(4n), \qquad \mathrm{Sp}(\tilde{J}_{4n}, J_{4n}) = \mathrm{Sp}(4n)\,\alpha = \alpha\,\tilde{\mathrm{Sp}}(4n).$$
Proposition 1.7. Let α = [A_α B_α; C_α D_α] ∈ Sp(J̃_{4n}, J_{4n}); then
$$\alpha^{-1} = \begin{bmatrix} -J_{2n}C_{\alpha}^{T} & J_{2n}A_{\alpha}^{T} \\ J_{2n}D_{\alpha}^{T} & -J_{2n}B_{\alpha}^{T} \end{bmatrix} = \begin{bmatrix} A_{\alpha^{-1}} & B_{\alpha^{-1}} \\ C_{\alpha^{-1}} & D_{\alpha^{-1}} \end{bmatrix}.$$
Hint: use the first equation of (1.9) in Definition 1.2.
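A small numerical check (n = 1, so 2n = 2 and 4n = 4) of Eq. (1.7) and of the inverse formula of Proposition 1.7, applied to the Poincaré transformation of Definition 1.3; the matrix sizes are illustrative:

```python
import numpy as np

I2, O2 = np.eye(2), np.zeros((2, 2))
J2 = np.array([[0.0, 1.0], [-1.0, 0.0]])
J4 = np.block([[O2, I2], [-I2, O2]])
Jt4 = np.block([[-J2, O2], [O2, J2]])            # J~_{4n}

A, B, C, D = J2, -J2, 0.5 * I2, 0.5 * I2         # Poincare transformation
alpha0 = np.block([[A, B], [C, D]])

lhs = alpha0.T @ J4 @ alpha0                     # should equal J~_{4n}
inv_formula = np.block([[-J2 @ C.T, J2 @ A.T],   # Proposition 1.7
                        [J2 @ D.T, -J2 @ B.T]])
```

Both identities can be confirmed to machine precision against a direct matrix inverse.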

Theorem 1.8. If α ∈ Sp(J̃_{4n}, J_{4n}) satisfies the transversality condition |C_α + D_α| ≠ 0,
then for every symplectic diffeomorphism z → g(z), g ∼ I_{2n} (near identity), g_z ∈
Sp(2n), there exists a generating function
$$\phi_{\alpha,g} : \mathbf{R}^{2n} \longrightarrow \mathbf{R}$$
such that
$$A_{\alpha}g(z) + B_{\alpha}z = \nabla\phi_{\alpha,g}\bigl(C_{\alpha}g(z) + D_{\alpha}z\bigr),$$
i.e.,
$$(A_{\alpha}g + B_{\alpha})(C_{\alpha}g + D_{\alpha})^{-1}z = \nabla\phi_{\alpha,g}(z).$$

6.2 Normalization of Darboux Transformation


Denote by M ≡ Sp(J̃_{4n}, J_{4n}) a submanifold in GL(4n), dim M = ½·4n(4n + 1) =
8n² + 2n. Denote by M* ≡ {α ∈ M | |E_α| ≠ 0} an open submanifold of M, dim M* =
dim M. Denote M° ≡ {α ∈ M* | E_α = I_{2n}, F_α = O} ⊂ M* ⊂ M.

Definition 2.1. A Darboux transformation is called a normalized Darboux transformation [Fen98a] if
$$E_{\alpha} = I_{2n}, \qquad F_{\alpha} = O_{2n}.$$

The following theorem answers the question on how to normalize a given Darboux
transformation.

Theorem 2.2. ∀ α ∈ M*, there exist
$$\beta_{1} = \begin{bmatrix} I_{2n} & P \\ O & I_{2n} \end{bmatrix} \in \mathrm{Sp}(4n), \qquad \beta_{2} = \begin{bmatrix} (T^{-1})^{T} & O \\ O & T \end{bmatrix} \in \mathrm{Sp}(4n), \quad |T| \neq 0,$$
such that β₂β₁α ∈ M°.
Proof. We need only take P = −F_αE_α^{-1} = −(A_α + B_α)(C_α + D_α)^{-1} and T = E_α^{-1}; then
$$\beta_{2}\beta_{1}\alpha = \begin{bmatrix} E_{\alpha}^{T} & O \\ O & E_{\alpha}^{-1} \end{bmatrix}\begin{bmatrix} I & -F_{\alpha}E_{\alpha}^{-1} \\ O & I \end{bmatrix}\begin{bmatrix} A_{\alpha} & B_{\alpha} \\ C_{\alpha} & D_{\alpha} \end{bmatrix} = \begin{bmatrix} A_{\beta_{2}\beta_{1}\alpha} & B_{\beta_{2}\beta_{1}\alpha} \\ C_{\beta_{2}\beta_{1}\alpha} & D_{\beta_{2}\beta_{1}\alpha} \end{bmatrix} = \alpha_{1} = \begin{bmatrix} E_{\alpha}^{T}(A_{\alpha} - F_{\alpha}E_{\alpha}^{-1}C_{\alpha}) & E_{\alpha}^{T}(B_{\alpha} - F_{\alpha}E_{\alpha}^{-1}D_{\alpha}) \\ E_{\alpha}^{-1}C_{\alpha} & E_{\alpha}^{-1}D_{\alpha} \end{bmatrix}. \tag{2.1}$$
It is easy to verify that
$$A_{\beta_{2}\beta_{1}\alpha} + B_{\beta_{2}\beta_{1}\alpha} = O_{2n}, \qquad C_{\beta_{2}\beta_{1}\alpha} + D_{\beta_{2}\beta_{1}\alpha} = I_{2n}.$$
The theorem is proved. □
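A numerical illustration (2n = 2) of this normalization: starting from α = βα₀ ∈ M* (β below is an arbitrary, illustrative product of elementary Sp(4n) maps), the maps β₁, β₂ built from P = −F_αE_α^{-1} and T = E_α^{-1} bring α into M°:

```python
import numpy as np

n2 = 2
I2, O2 = np.eye(n2), np.zeros((n2, n2))
J2 = np.array([[0.0, 1.0], [-1.0, 0.0]])
J4 = np.block([[O2, I2], [-I2, O2]])
Jt4 = np.block([[-J2, O2], [O2, J2]])
alpha0 = np.block([[J2, -J2], [0.5 * I2, 0.5 * I2]])

P0 = np.array([[1.0, 0.3], [0.3, -0.5]])          # symmetric
Q0 = np.array([[0.2, -0.1], [-0.1, 0.7]])         # symmetric
M = np.array([[2.0, 0.5], [0.0, 1.0]])            # invertible
beta = (np.block([[I2, P0], [O2, I2]])
        @ np.block([[np.linalg.inv(M).T, O2], [O2, M]])
        @ np.block([[I2, O2], [Q0, I2]]))         # beta in Sp(4n)
alpha = beta @ alpha0                             # in Sp(J~, J) by Prop. 1.6

A, B = alpha[:n2, :n2], alpha[:n2, n2:]
C, D = alpha[n2:, :n2], alpha[n2:, n2:]
E, F = C + D, A + B
beta1 = np.block([[I2, -F @ np.linalg.inv(E)], [O2, I2]])
T = np.linalg.inv(E)
beta2 = np.block([[np.linalg.inv(T).T, O2], [O2, T]])
alpha_n = beta2 @ beta1 @ alpha                   # normalized candidate
En = alpha_n[n2:, :n2] + alpha_n[n2:, n2:]        # should be I
Fn = alpha_n[:n2, :n2] + alpha_n[:n2, n2:]        # should be O
```

The normalized α remains a Darboux transformation, since β₁, β₂ ∈ Sp(4n).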


From now on we will assume α is a normalized Darboux transformation unless it
is specified otherwise.
Theorem 2.3. A normalized Darboux transformation can be written in the standard form
$$\alpha = \begin{bmatrix} J_{2n} & -J_{2n} \\[2pt] \dfrac{1}{2}(I + V) & \dfrac{1}{2}(I - V) \end{bmatrix}, \qquad V \in sp(2n).$$
Proof. It is not difficult to show that
$$\forall\, \alpha_{1} \in M^{\circ} \Longrightarrow \exists\, \beta \in \mathrm{Sp}(4n), \quad \beta = \begin{bmatrix} A_{\beta} & B_{\beta} \\ C_{\beta} & D_{\beta} \end{bmatrix},$$
such that α₁ = βα₀, where α₀ is the Poincaré transformation. After computation, we get
$$\alpha_{1} = \begin{bmatrix} A_{\beta}J_{2n} + \dfrac{1}{2}B_{\beta} & -A_{\beta}J_{2n} + \dfrac{1}{2}B_{\beta} \\[2pt] C_{\beta}J_{2n} + \dfrac{1}{2}D_{\beta} & -C_{\beta}J_{2n} + \dfrac{1}{2}D_{\beta} \end{bmatrix}.$$
Because α₁ ∈ M°, we have D_β = I_{2n}, B_β = O, i.e., β = [A_β O; C_β I_{2n}]. Since
β ∈ Sp(4n), we have β = [I_{2n} O; Q I_{2n}], Q ∈ Sm(2n). Thus
$$\alpha_{1} = \begin{bmatrix} I_{2n} & O \\ Q & I_{2n} \end{bmatrix}\begin{bmatrix} J_{2n} & -J_{2n} \\ \dfrac{1}{2}I_{2n} & \dfrac{1}{2}I_{2n} \end{bmatrix} = \begin{bmatrix} J_{2n} & -J_{2n} \\ QJ_{2n} + \dfrac{1}{2}I_{2n} & -QJ_{2n} + \dfrac{1}{2}I_{2n} \end{bmatrix} = \begin{bmatrix} J_{2n} & -J_{2n} \\ \dfrac{1}{2}(I_{2n} + V) & \dfrac{1}{2}(I_{2n} - V) \end{bmatrix},$$
where Q^T = Q, V = 2QJ. We shall write
$$\alpha_{V} = \begin{bmatrix} J_{2n} & -J_{2n} \\ \dfrac{1}{2}(I_{2n} + V) & \dfrac{1}{2}(I_{2n} - V) \end{bmatrix}, \qquad \alpha_{V}^{-1} = \begin{bmatrix} -\dfrac{1}{2}(I_{2n} - V)J_{2n} & I_{2n} \\ \dfrac{1}{2}(I_{2n} + V)J_{2n} & I_{2n} \end{bmatrix}.$$
Therefore, the theorem is completed. □
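A quick numerical check (2n = 2) that α_V is a Darboux transformation for any V ∈ sp(2n), and of the explicit inverse α_V^{-1} just written (the factors J are consistent with Proposition 1.7); the particular V is an illustrative choice:

```python
import numpy as np

I2, O2 = np.eye(2), np.zeros((2, 2))
J2 = np.array([[0.0, 1.0], [-1.0, 0.0]])
J4 = np.block([[O2, I2], [-I2, O2]])
Jt4 = np.block([[-J2, O2], [O2, J2]])

S = np.array([[0.4, -0.2], [-0.2, 1.1]])  # symmetric  =>  V = J S lies in sp(2n)
V = J2 @ S
alphaV = np.block([[J2, -J2],
                   [0.5 * (I2 + V), 0.5 * (I2 - V)]])
inv_formula = np.block([[-0.5 * (I2 - V) @ J2, I2],
                        [0.5 * (I2 + V) @ J2, I2]])
```

Both α_V^T J₄ α_V = J̃₄ and the inverse formula can be confirmed to machine precision.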
Corollary 2.4. Every α = [A_α B_α; C_α D_α] ∈ M* has a normalized Darboux form
$$\alpha_{V} \in M^{\circ}, \qquad V = (C_{\alpha} + D_{\alpha})^{-1}(C_{\alpha} - D_{\alpha}) \in sp(2n).$$
This result can be derived from (2.1).

From the following theorem, we can show that the normalization condition is nat-
ural.

Theorem 2.5. G^τ is a consistent difference scheme for the equation ż = J^{-1}H_z, i.e.,
1° G^τ(z)|_{τ=0} = z, ∀ z, H;
2° ∂G^τ(z)/∂τ |_{τ=0} = J^{-1}H_z(z), ∀ z, H,
iff the generating Darboux transformation is normalized with A = −J.

Proof. We take the symplectic difference scheme of first order via a generating function of type α = [A B; C D]; we have
$$AG^{\tau}(z) + Bz = -\tau H_{z}\bigl(CG^{\tau}(z) + Dz\bigr). \tag{2.2}$$


We first prove the "only if" part of the theorem. Taking τ = 0, we have
$$AG^{0}(z) + Bz = (A + B)z = 0, \quad \forall z \Longrightarrow A + B = O.$$
Differentiating (2.2) yields
$$A\,\frac{\partial G^{\tau}(z)}{\partial \tau}\Big|_{\tau=0} = -H_{z}((C + D)z).$$
Since
$$\frac{\partial G^{\tau}(z)}{\partial \tau}\Big|_{\tau=0} = J^{-1}H_{z}(z),$$
we have
$$AJ^{-1}H_{z}(z) = -H_{z}((C + D)z), \qquad \forall H, z.$$
Take the special form H(z) = z^T b and substitute it into the above equation; we get
$$AJ^{-1}b = -b, \quad \forall b,$$
which results in A = −J. On the other hand, since then H_z(z) = H_z((C + D)z), ∀ H, z, take the special form H = ½z^T z and substitute it into the above equation; we get
$$z = (C + D)z, \quad \forall z \Longrightarrow C + D = I.$$
Now we prove the "if" part. Take
$$A + B = O, \qquad A = -J, \qquad C + D = I; \tag{2.3}$$
then
$$A\bigl(G^{\tau}(z) - z\bigr) = -\tau H_{z}\bigl(CG^{\tau}(z) + Dz\bigr),$$
and A = −J, τ = 0 give G^τ(z)|_{τ=0} = z. On the other hand, we have
$$A\,\frac{\partial G^{\tau}(z)}{\partial \tau}\Big|_{\tau=0} = -H_{z}((C+D)z) \Longrightarrow \frac{\partial G^{\tau}(z)}{\partial \tau}\Big|_{\tau=0} = J^{-1}H_{z}(z), \quad \forall z, H.$$
Therefore, the theorem is completed. □

Theorem 2.6. A normalized Darboux transformation with A = −J can be written in the standard form
$$\alpha = \begin{bmatrix} -J & J \\[2pt] \dfrac{1}{2}(I - V) & \dfrac{1}{2}(I + V) \end{bmatrix}, \qquad \forall\, V \in sp(2n).$$
6.3 Transform Properties of Generator Maps and Generating Functions
Let
$$\alpha = \begin{bmatrix} A_{\alpha} & B_{\alpha} \\ C_{\alpha} & D_{\alpha} \end{bmatrix} \in \mathrm{Sp}(\tilde{J}_{4n}, J_{4n}),$$
and denote E_α = C_α + D_α, F_α = A_α + B_α. Let g ∈ Sp-diff. From now on, we always
assume that the transversality condition is satisfied, i.e., |E_α| ≠ 0.
Theorem 3.1. ∀ T ∈ GL(2n), let
$$\beta_{T} = \begin{bmatrix} (T^{-1})^{T} & O \\ O & T \end{bmatrix} \in \mathrm{Sp}(4n), \qquad \beta_{T}\alpha \in \mathrm{Sp}(\tilde{J}_{4n}, J_{4n});$$
we have
$$\phi_{\beta_{T}\alpha,g} \cong \phi_{\alpha,g} \circ T^{-1}. \tag{3.1}$$

Proof. Since
$$\beta_{T}\alpha = \begin{bmatrix} (T^{-1})^{T}A_{\alpha} & (T^{-1})^{T}B_{\alpha} \\ TC_{\alpha} & TD_{\alpha} \end{bmatrix} = \begin{bmatrix} A_{\beta_{T}\alpha} & B_{\beta_{T}\alpha} \\ C_{\beta_{T}\alpha} & D_{\beta_{T}\alpha} \end{bmatrix},$$
we have
$$A_{\alpha}g(z) + B_{\alpha}z = \nabla\phi_{\alpha,g} \circ (C_{\alpha}g(z) + D_{\alpha}z), \tag{3.2}$$
and
$$(T^{-1})^{T}A_{\alpha}g(z) + (T^{-1})^{T}B_{\alpha}z = \nabla\phi_{\beta_{T}\alpha,g} \circ (TC_{\alpha}g(z) + TD_{\alpha}z)$$
$$\Longleftrightarrow A_{\alpha}g(z) + B_{\alpha}z = T^{T}(\nabla\phi_{\beta_{T}\alpha,g}) \circ T(C_{\alpha}g(z) + D_{\alpha}z) = \nabla(\phi_{\beta_{T}\alpha,g} \circ T)(C_{\alpha}g(z) + D_{\alpha}z). \tag{3.3}$$
Comparing (3.2) with (3.3) for all z, we find
$$\nabla\phi_{\alpha,g}(C_{\alpha}g(z) + D_{\alpha}z) = \nabla(\phi_{\beta_{T}\alpha,g} \circ T)(C_{\alpha}g(z) + D_{\alpha}z).$$
Thus we obtain φ_{α,g} ≅ φ_{β_Tα,g} ∘ T, or φ_{α,g} ∘ T^{-1} ≅ φ_{β_Tα,g}. The theorem is proved. □
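For a linear symplectic g near the identity, the generating function is the quadratic form φ_{α,g}(w) = ½w^T N w with N = (A_αg + B_α)(C_αg + D_α)^{-1}, so Theorem 3.1 can be checked numerically: the N-matrix of β_Tα equals (T^{-1})^T N T^{-1}. The choices of g, T below (2n = 2, α = α₀) are illustrative:

```python
import numpy as np

I2 = np.eye(2)
J2 = np.array([[0.0, 1.0], [-1.0, 0.0]])

# g: Cayley transform of a Hamiltonian matrix X = c J S, exactly symplectic.
S = np.array([[0.8, 0.1], [0.1, 0.5]])
X = 0.3 * J2 @ S
g = np.linalg.inv(I2 - X / 2) @ (I2 + X / 2)

A, B, C, D = J2, -J2, 0.5 * I2, 0.5 * I2          # Poincare transformation
N = (A @ g + B) @ np.linalg.inv(C @ g + D)        # Hessian of phi_{alpha,g}

T = np.array([[1.5, 0.2], [0.0, 0.8]])            # any T in GL(2n)
Ti = np.linalg.inv(T)
N2 = (Ti.T @ A @ g + Ti.T @ B) @ np.linalg.inv(T @ C @ g + T @ D)
```

N should be symmetric (it is the Hessian of a generating function), and N2 = (T^{-1})^T N T^{-1}.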
Theorem 3.2. ∀ S ∈ Sp(2n), define
$$\gamma_{S} = \begin{bmatrix} S & O \\ O & S \end{bmatrix} \in \tilde{\mathrm{Sp}}(4n).$$
Then we have
$$\phi_{\alpha\gamma_{S},g} \cong \phi_{\alpha,\,S \circ g \circ S^{-1}}. \tag{3.4}$$

Proof. Since
$$\alpha\gamma_{S} = \begin{bmatrix} A_{\alpha}S & B_{\alpha}S \\ C_{\alpha}S & D_{\alpha}S \end{bmatrix} = \begin{bmatrix} A_{\alpha\gamma_{S}} & B_{\alpha\gamma_{S}} \\ C_{\alpha\gamma_{S}} & D_{\alpha\gamma_{S}} \end{bmatrix},$$
we have
$$A_{\alpha}S \circ g \circ S^{-1}z + B_{\alpha}z = \nabla\phi_{\alpha,S \circ g \circ S^{-1}}\bigl(C_{\alpha}S \circ g \circ S^{-1}z + D_{\alpha}z\bigr).$$
Since S is nonsingular, replacing z with Sz results in
$$A_{\alpha}S \circ g(z) + B_{\alpha}Sz = \nabla\phi_{\alpha,S \circ g \circ S^{-1}}\bigl(C_{\alpha}S \circ g(z) + D_{\alpha}Sz\bigr), \quad \forall z. \tag{3.5}$$
On the other hand,
$$(A_{\alpha}S)g(z) + (B_{\alpha}S)z = \nabla\phi_{\alpha\gamma_{S},g}\bigl((C_{\alpha}S)g(z) + (D_{\alpha}S)z\bigr), \quad \forall z. \tag{3.6}$$
Compare (3.5) with (3.6) and note that
$$|C_{\alpha} + D_{\alpha}| \neq 0 \Longleftrightarrow |C_{\alpha}S + D_{\alpha}S| \neq 0 \Longleftrightarrow |C_{\alpha}Sg_{z}(z) + D_{\alpha}S| \neq 0.$$
Finally, we obtain ∇φ_{αγ_S,g} = ∇φ_{α,S∘g∘S^{-1}}, i.e., φ_{αγ_S,g} ≅ φ_{α,S∘g∘S^{-1}}. This completes the proof. □
Theorem 3.3. Take
$$\beta = \begin{bmatrix} I_{2n} & P \\ O & I_{2n} \end{bmatrix} \in \mathrm{Sp}(4n), \quad P \in Sm(2n), \quad \alpha \in \mathrm{Sp}(\tilde{J}_{4n}, J_{4n});$$
then
$$\phi_{\beta\alpha,g} \cong \phi_{\alpha,g} + \psi_{P}, \tag{3.7}$$
where ψ_P = ½w^T P w (a function independent of g).
Proof. Since
$$\beta\alpha = \begin{bmatrix} I_{2n} & P \\ O & I_{2n} \end{bmatrix}\begin{bmatrix} A_{\alpha} & B_{\alpha} \\ C_{\alpha} & D_{\alpha} \end{bmatrix} = \begin{bmatrix} A_{\alpha} + PC_{\alpha} & B_{\alpha} + PD_{\alpha} \\ C_{\alpha} & D_{\alpha} \end{bmatrix},$$
$$E_{\beta\alpha} = E_{\alpha}, \qquad F_{\beta\alpha} = F_{\alpha} + PE_{\alpha},$$
obviously
$$A_{\beta\alpha}g(z) + B_{\beta\alpha}z = \nabla\phi_{\beta\alpha,g}(C_{\beta\alpha}g(z) + D_{\beta\alpha}z), \tag{3.8}$$
$$A_{\alpha}g(z) + B_{\alpha}z + \bigl(PC_{\alpha}g(z) + PD_{\alpha}z\bigr) = \nabla\phi_{\beta\alpha,g}(C_{\alpha}g(z) + D_{\alpha}z). \tag{3.9}$$
On the other hand,
$$\nabla\psi_{P}(C_{\alpha}g(z) + D_{\alpha}z) = P\bigl(C_{\alpha}g(z) + D_{\alpha}z\bigr). \tag{3.10}$$
Inserting (3.10) into (3.9), we obtain
$$A_{\alpha}g(z) + B_{\alpha}z = \nabla\phi_{\beta\alpha,g}(C_{\alpha}g(z) + D_{\alpha}z) - \nabla\psi_{P}(C_{\alpha}g(z) + D_{\alpha}z) = \nabla(\phi_{\beta\alpha,g} - \psi_{P})(C_{\alpha}g(z) + D_{\alpha}z). \tag{3.11}$$
Comparing (3.11) with
$$A_{\alpha}g(z) + B_{\alpha}z = \nabla\phi_{\alpha,g}(C_{\alpha}g(z) + D_{\alpha}z),$$
we obtain φ_{βα,g} − ψ_P ≅ φ_{α,g}. □
Analogously, we have:
Theorem 3.4. If we take
$$\beta = \begin{bmatrix} I_{2n} & O \\ Q & I_{2n} \end{bmatrix} \in \mathrm{Sp}(4n), \quad Q \in Sm(2n),$$
then
$$\phi_{\alpha,g}(w) + \frac{1}{2}\bigl(\nabla_{w}\phi_{\alpha,g}(w)\bigr)^{T}Q\bigl(\nabla_{w}\phi_{\alpha,g}(w)\bigr) \cong \phi_{\beta\alpha,g}\bigl(w + Q\nabla\phi_{\alpha,g}(w)\bigr). \tag{3.12}$$
Theorem 3.5. We have the following relation:
$$\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g^{-1}} \cong -\phi_{\left[\begin{smallmatrix} -B & -A \\ D & C \end{smallmatrix}\right],\,g}. \tag{3.13}$$
Proof. Since
$$A_{\alpha}g^{-1}(z) + B_{\alpha}z = \nabla\phi_{\alpha,g^{-1}}\bigl(C_{\alpha}g^{-1}(z) + D_{\alpha}z\bigr),$$
replacing z with g(z) yields
$$A_{\alpha}z + B_{\alpha}g(z) = \nabla\phi_{\alpha,g^{-1}}\bigl(C_{\alpha}z + D_{\alpha}g(z)\bigr). \tag{3.14}$$
Comparing (3.14) with
$$-B_{\alpha}g(z) - A_{\alpha}z = \nabla\phi_{\left[\begin{smallmatrix} -B & -A \\ D & C \end{smallmatrix}\right],\,g}\bigl(D_{\alpha}g(z) + C_{\alpha}z\bigr),$$
the proof is complete. □

Theorem 3.6. If
$$\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g^{-1}} \cong -\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g}, \qquad \forall g,$$
then
$$A + B = O, \qquad C = D.$$

Proof. By Theorem 3.5 and the uniqueness of the Darboux transformation in Theorem 4.2, we have
$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \pm\begin{bmatrix} -B & -A \\ D & C \end{bmatrix}.$$
We only consider the case "+", where we have A + B = O, C = D. □
Remark 3.7. For the Poincaré map
$$\alpha_{0} = \begin{bmatrix} J & -J \\ \dfrac{1}{2}I & \dfrac{1}{2}I \end{bmatrix},$$
we have φ_{α₀,g^{-1}} ≅ −φ_{α₀,g}, ∀ g ∈ Sp-diff.
Theorem 3.8. Let g^t_H be the phase flow of the Hamiltonian system H(z). Then under
the Poincaré map α₀ the generating function for g^t_H is an odd function w.r.t. t, i.e.,
$$\phi_{\alpha_{0},g_{H}^{t}}(w, t) = -\phi_{\alpha_{0},g_{H}^{t}}(w, -t), \qquad \forall\, w \in \mathbf{R}^{2n},\ t \in \mathbf{R}.$$
Proof. By the properties of the generating function for g^t_H, we have g^{−t}_H = (g^t_H)^{-1}, so by Remark 3.7
$$\phi_{\alpha_{0},g_{H}^{t}}(w, -t) = \phi_{\alpha_{0},g_{H}^{-t}}(w, t) = \phi_{\alpha_{0},(g_{H}^{t})^{-1}}(w, t) = -\phi_{\alpha_{0},g_{H}^{t}}(w, t).$$
The theorem is proved. □


Theorem 3.9. If S ∈ Sp(2n), α ∈ Sp(J̃_{4n}, J_{4n}), and
$$\gamma_{1} = \begin{bmatrix} S & O \\ O & I \end{bmatrix}, \qquad \alpha\gamma_{1} = \begin{bmatrix} A_{\alpha}S & B_{\alpha} \\ C_{\alpha}S & D_{\alpha} \end{bmatrix},$$
and we assume |E_{αγ₁}| = |C_αS + D_α| ≠ 0, then
$$\phi_{\alpha,\,S \circ g} \cong \phi_{\alpha\gamma_{1},g}, \tag{3.15}$$
i.e.,
$$\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,S \circ g} \cong \phi_{\left[\begin{smallmatrix} AS & B \\ CS & D \end{smallmatrix}\right],\,g}.$$
Theorem 3.10. If
$$\gamma_{2} = \begin{bmatrix} I & O \\ O & S \end{bmatrix}, \qquad \alpha \in \mathrm{Sp}(\tilde{J}_{4n}, J_{4n}), \qquad \alpha\gamma_{2} = \begin{bmatrix} A_{\alpha} & B_{\alpha}S \\ C_{\alpha} & D_{\alpha}S \end{bmatrix},$$
and we assume |C_α + D_αS| ≠ 0, then
$$\phi_{\alpha,\,g \circ S^{-1}} \cong \phi_{\alpha\gamma_{2},g}, \tag{3.16}$$
i.e.,
$$\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g \circ S^{-1}} \cong \phi_{\left[\begin{smallmatrix} A & BS \\ C & DS \end{smallmatrix}\right],\,g}.$$

Proof. Since
$$Ag(S^{-1}z) + Bz = \nabla\phi_{\alpha,g \circ S^{-1}}\bigl(Cg(S^{-1}z) + Dz\bigr), \quad \forall z,$$
replacing z with Sz yields
$$Ag(z) + BSz = \nabla\phi_{\alpha,g \circ S^{-1}}(Cg(z) + DSz) = \nabla\phi_{\left[\begin{smallmatrix} A & BS \\ C & DS \end{smallmatrix}\right],\,g}(Cg(z) + DSz),$$
hence
$$\phi_{\left[\begin{smallmatrix} A & BS \\ C & DS \end{smallmatrix}\right],\,g} \cong \phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g \circ S^{-1}}.$$
Therefore, the theorem is completed. The proof of Theorem 3.9 is similar. □

Theorem 3.11. If
$$\beta = \begin{bmatrix} \lambda I_{2n} & O \\ O & I_{2n} \end{bmatrix} \in C\mathrm{Sp}(4n), \qquad \alpha \in \mathrm{Sp}(\tilde{J}_{4n}, J_{4n}), \qquad \lambda \neq 0,$$
$$\beta\alpha = \begin{bmatrix} \lambda A & \lambda B \\ C & D \end{bmatrix} \in C\mathrm{Sp}(\tilde{J}_{4n}, J_{4n}), \qquad \mu(\beta\alpha) = \lambda,$$
then we have
$$\phi_{\left[\begin{smallmatrix} \lambda A & \lambda B \\ C & D \end{smallmatrix}\right],\,g} \cong \lambda\,\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g}. \tag{3.17}$$
Proof. Since α ∈ Sp(J̃_{4n}, J_{4n}) implies
$$Ag(z) + Bz = \nabla\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g}(Cg(z) + Dz),$$
and βα ∈ CSp(J̃_{4n}, J_{4n}) implies
$$\lambda Ag(z) + \lambda Bz = \nabla\phi_{\left[\begin{smallmatrix} \lambda A & \lambda B \\ C & D \end{smallmatrix}\right],\,g}(Cg(z) + Dz),$$
the left-hand side of the latter equals λ∇φ_{[A B; C D],g}(Cg(z) + Dz); hence
$$\nabla\phi_{\left[\begin{smallmatrix} \lambda A & \lambda B \\ C & D \end{smallmatrix}\right],\,g}(Cg(z) + Dz) = \lambda\,\nabla\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g}(Cg(z) + Dz),$$
i.e., φ_{[λA λB; C D],g} ≅ λφ_{[A B; C D],g}. The theorem is proved. □


Theorem 3.12. Let
$$\beta = \begin{bmatrix} I_{2n} & O \\ O & \lambda I_{2n} \end{bmatrix} \in C\mathrm{Sp}(J_{4n}), \qquad \lambda \neq 0, \qquad \alpha \in \mathrm{Sp}(\tilde{J}_{4n}, J_{4n}),$$
$$\beta\alpha = \begin{bmatrix} A & B \\ \lambda C & \lambda D \end{bmatrix} \in C\mathrm{Sp}(\tilde{J}_{4n}, J_{4n}), \qquad \mu(\beta\alpha) = \lambda;$$
then we have
$$\phi_{\left[\begin{smallmatrix} A & B \\ \lambda C & \lambda D \end{smallmatrix}\right],\,g} \cong \lambda\,\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g} \circ \lambda^{-1}I_{2n}. \tag{3.18}$$
Proof. Since [A B; λC λD] ∈ CSp(J̃_{4n}, J_{4n}),
$$Ag(z) + Bz = \nabla\phi_{\left[\begin{smallmatrix} A & B \\ \lambda C & \lambda D \end{smallmatrix}\right],\,g}\bigl(\lambda Cg(z) + \lambda Dz\bigr),$$
while
$$Ag(z) + Bz = \nabla\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g}(Cg(z) + Dz).$$
Writing w' = Cg(z) + Dz, and noting that
$$\nabla\bigl(\lambda\,\phi_{\alpha,g} \circ \lambda^{-1}I_{2n}\bigr)(\lambda w') = \nabla\phi_{\alpha,g}(w'),$$
we see that the two gradient relations agree exactly when
$$\phi_{\left[\begin{smallmatrix} A & B \\ \lambda C & \lambda D \end{smallmatrix}\right],\,g} \cong \lambda\,\phi_{\left[\begin{smallmatrix} A & B \\ C & D \end{smallmatrix}\right],\,g} \circ \lambda^{-1}I_{2n}.$$
Therefore, the theorem is completed. □

Before finishing this section, we give two concluding theorems which subsume the
seven theorems above. They are easy to prove, and the proofs are omitted here. Let
$$\alpha \in C\mathrm{Sp}(\tilde{J}_{4n}, J_{4n}), \qquad \beta \in C\mathrm{Sp}(J_{4n}), \qquad \beta = \begin{bmatrix} a & b \\ c & d \end{bmatrix};$$
obviously
$$\beta\alpha \in C\mathrm{Sp}(\tilde{J}_{4n}, J_{4n}), \qquad \mu(\beta\alpha) = \lambda(\beta)\mu(\alpha),$$
and we have the following theorem.
Theorem 3.13. For φ_{βα,g}, we have
$$\phi_{\beta\alpha,g}\bigl(c\,\nabla_{w}\phi_{\alpha,g}(w) + dw\bigr) \cong \lambda(\beta)\,\phi_{\alpha,g}(w) + \frac{1}{2}w^{T}(d^{T}b)w + \bigl(\nabla_{w}\phi_{\alpha,g}(w)\bigr)^{T}(c^{T}b)w + \frac{1}{2}\bigl(\nabla_{w}\phi_{\alpha,g}(w)\bigr)^{T}(c^{T}a)\bigl(\nabla_{w}\phi_{\alpha,g}(w)\bigr). \tag{3.19}$$
We now formulate the other one. Let α ∈ CSp(J̃_{4n}, J_{4n}) and γ ∈ CSp(J̃_{4n}), i.e.,
$$\gamma^{T}\tilde{J}_{4n}\gamma = \nu(\gamma)\tilde{J}_{4n} \Longrightarrow \alpha\gamma \in C\mathrm{Sp}(\tilde{J}_{4n}, J_{4n}), \qquad \mu(\alpha\gamma) = \mu(\alpha)\nu(\gamma), \qquad \gamma = \begin{bmatrix} a & b \\ c & d \end{bmatrix}.$$
We have the following theorem.
Theorem 3.14. For φ_{αγ,g}, we have
$$\phi_{\alpha\gamma,g} \cong \phi_{\alpha,\,(ag+b)(cg+d)^{-1}}. \tag{3.21}$$

6.4 Invariance of Generating Functions and


Commutativity of Generator Maps
First we present the uniqueness theorem of the linear fractional transformation.
Theorem 4.1. Let
$$\alpha = \begin{bmatrix} A_{\alpha} & B_{\alpha} \\ C_{\alpha} & D_{\alpha} \end{bmatrix}, \quad \bar{\alpha} = \begin{bmatrix} A_{\bar{\alpha}} & B_{\bar{\alpha}} \\ C_{\bar{\alpha}} & D_{\bar{\alpha}} \end{bmatrix} \in \mathrm{Sp}(\tilde{J}_{4n}, J_{4n}), \qquad |E_{\alpha}| \neq 0, \quad |E_{\bar{\alpha}}| \neq 0.$$
If
$$(A_{\alpha}M + B_{\alpha})(C_{\alpha}M + D_{\alpha})^{-1} = (A_{\bar{\alpha}}M + B_{\bar{\alpha}})(C_{\bar{\alpha}}M + D_{\bar{\alpha}})^{-1}, \qquad \forall\, M \sim I_{2n},\ M \in \mathrm{Sp}(2n),$$
then
$$\bar{\alpha} = \pm\alpha.$$

Proof. Let
$$N_{0} = (A_{\alpha}I + B_{\alpha})(C_{\alpha}I + D_{\alpha})^{-1} = (A_{\bar{\alpha}}I + B_{\bar{\alpha}})(C_{\bar{\alpha}}I + D_{\bar{\alpha}})^{-1}.$$
Suppose β ∈ Sp(4n); first we prove that if
$$(A_{\beta}N + B_{\beta})(C_{\beta}N + D_{\beta})^{-1} = N, \qquad \forall\, N \sim N_{0},\ N \in Sm(2n),$$
then β = ±I_{4n}. We have two cases:
1° (A_βN₀ + B_β)(C_βN₀ + D_β)^{-1} = N₀ ⇒ A_βN₀ + B_β = N₀C_βN₀ + N₀D_β.
2° Take N = N₀ + εI ⇒ A_β(N₀ + εI) + B_β = (N₀ + εI)C_β(N₀ + εI) + (N₀ + εI)D_β.
From 1°, 2° ⇒ εA_β = εN₀C_β + εC_βN₀ + εD_β + ε²C_β for all ε, which gives
$$A_{\beta} - D_{\beta} - N_{0}C_{\beta} - C_{\beta}N_{0} = \varepsilon C_{\beta}, \quad \forall \varepsilon \Longrightarrow C_{\beta} = O,$$
and thus A_β = D_β. From 1° and β ∈ Sp(4n) we then have
$$\beta = \begin{bmatrix} A_{\beta} & B_{\beta} \\ O & A_{\beta} \end{bmatrix},$$
and the fixed-point condition becomes
$$A_{\beta}NA_{\beta}^{-1} = N - B_{\beta}A_{\beta}^{-1}.$$
Subtracting from this the same formula with N₀, i.e., A_βN₀A_β^{-1} = N₀ − B_βA_β^{-1}, yields
$$A_{\beta}(N - N_{0}) = (N - N_{0})A_{\beta}.$$
Take N − N₀ = εS, ∀ S ∈ Sm(2n) ⇒ A_βS = SA_β, ∀ S ∈ Sm(2n) ⇒ A_β = λI_{2n}
(this can be proved by mathematical induction). Then from 1°, A_βN₀ + B_β = N₀A_β ⇒ B_β = O, and
$$\beta = \begin{bmatrix} A_{\beta} & O \\ O & A_{\beta} \end{bmatrix} = \lambda I_{4n} \in \mathrm{Sp}(4n) \Longrightarrow \lambda = \pm 1.$$
Now let β = ᾱα^{-1}; then the fractional transformation of β fixes all symmetric N ∼ N₀. Because ᾱ ∈ Sp(J̃, J) and α^{-1} ∈ Sp(J, J̃), we have ᾱα^{-1} ∈ Sp(J, J) = Sp(4n).
The theorem is proved. □

We now present the uniqueness theorem for Darboux transformations.


" J), then
Theorem 4.2. Suppose α, α ∈ Sp(J,

φα,g ∼
= φα,g , ∀ g ∈ Sp-diff, g ∼ I2n =⇒ α = ±α.

Proof. From the hypothesis, we have
$$\phi_{\alpha,g} \cong \phi_{\bar{\alpha},g} \Longrightarrow \mathrm{Hessian}(\phi_{\alpha,g}) = (\phi_{\alpha,g})_{ww} = (A_{\alpha}g_{z}(z) + B_{\alpha})(C_{\alpha}g_{z}(z) + D_{\alpha})^{-1},$$
$$(\phi_{\bar{\alpha},g})_{ww} = (A_{\bar{\alpha}}g_{z}(z) + B_{\bar{\alpha}})(C_{\bar{\alpha}}g_{z}(z) + D_{\bar{\alpha}})^{-1}, \qquad \forall\, g_{z}(z) \in \mathrm{Sp}(2n),\ g_{z} \sim I.$$
Then by the uniqueness theorem of the linear fractional transformation, ᾱ = ±α. From the above proof we also get
$$\mathrm{Hessian}(\phi_{\alpha,g}) = \mathrm{Hessian}(\phi_{-\alpha,g}), \qquad \forall\, g \sim I,\ \alpha. \qquad \square$$
The generating function φ_{α,g} depends on the Darboux transformation α, the symplectic
diffeomorphism g, and the coordinates. If we make a symplectic coordinate transformation by S, the symplectic diffeomorphism g is represented in the new coordinates as S^{-1} ∘ g ∘ S. For the invariance of the generating function under S, one would like to have
$$\phi_{\alpha,\,S^{-1} \circ g \circ S} = \phi_{\alpha,g} \circ S, \qquad \forall\, g \sim I.$$
This is not true in the general case. We shall study under what condition this holds for the normalized Darboux transformation α_V. The following theorem answers this question.
Theorem 4.3. Let
$$\alpha = \alpha_{V} = \begin{bmatrix} J_{2n} & -J_{2n} \\ \dfrac{1}{2}(I + V) & \dfrac{1}{2}(I - V) \end{bmatrix}, \quad \forall\, V \in sp(2n), \quad \alpha_{V} \in M^{\circ},$$
$$S \in \mathrm{Sp}(2n), \qquad \beta_{S} = \begin{bmatrix} (S^{-1})^{T} & O \\ O & S \end{bmatrix} \in \mathrm{Sp}(J_{4n}), \qquad \gamma_{S} = \begin{bmatrix} S & O \\ O & S \end{bmatrix} \in \mathrm{Sp}(\tilde{J}_{4n}).$$
Then the following conditions are equivalent:
1° φ_{α_V, S∘g∘S^{-1}} = φ_{α_V,g} ∘ S^{-1}, ∀ g ∼ I.
2° φ_{α_Vγ_S,g} = φ_{β_Sα_V,g}, ∀ g ∼ I.
3° α_Vγ_S = β_Sα_V.
4° SV = VS.
Proof. 1° ⇔ 2° follows from Theorem 3.1 and Theorem 3.2. 2° ⇒ 3° uses the uniqueness theorem on Darboux transformations, Theorem 4.2, which gives
$$\alpha_{V}\gamma_{S} = \pm\beta_{S}\alpha_{V};$$
since JS = (S^{-1})^T J, the sign "−" case is excluded. The rest of the proof is trivial. □
There is a deep connection between the symmetry of a symplectic difference
scheme and the conservation of first integrals. Let F be the set of smooth functions defined on R^{2n}.

Theorem 4.4. If the Hamiltonian function H is invariant under the phase flow g^t_F with
Hamiltonian function F, then F is a first integral of the system with Hamiltonian function H. Let H, F ∈ F; then
$$F \circ g_{H}^{t} = F \Longleftrightarrow \{F, H\} = 0 \Longleftrightarrow H \circ g_{F}^{t} = H \Longleftrightarrow g_{H}^{t} = g_{F}^{-s} \circ g_{H}^{t} \circ g_{F}^{s}.$$
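A concrete check of these equivalences for two degrees of freedom, z = (p₁, p₂, q₁, q₂): the Hamiltonian H = (|p|² + |q|²)/2 and the angular momentum F = q₁p₂ − q₂p₁ (an illustrative pair) satisfy {F, H} = 0, and their exact flows — both linear rotations, since the generator matrices square to −I — commute:

```python
import numpy as np

I4 = np.eye(4)
O2 = np.zeros((2, 2))
J = np.block([[O2, np.eye(2)], [-np.eye(2), O2]])   # z' = J z generates g_H^t
R = np.array([[0.0, 1.0], [-1.0, 0.0]])
K = np.block([[R, O2], [O2, R]])                    # z' = K z generates g_F^s

# J^2 = K^2 = -I, so both flows have closed form cos()*I + sin()*generator.
gH = lambda t: np.cos(t) * I4 + np.sin(t) * J
gF = lambda s: np.cos(s) * I4 + np.sin(s) * K

t, s = 0.7, 1.3
commute_err = np.max(np.abs(gF(-s) @ gH(t) @ gF(s) - gH(t)))

Fval = lambda z: z[2] * z[1] - z[3] * z[0]          # F = q1 p2 - q2 p1
z0 = np.array([0.3, -1.2, 0.8, 0.5])
F_drift = abs(Fval(gH(t) @ z0) - Fval(z0))          # F is a first integral of H
```

Both the commutator of the flows and the drift of F along g_H^t vanish to machine precision.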

Theorem 4.5. Let F be a conservation law of the Hamiltonian system. The phase flow g^t_H (or a symplectic scheme for H) keeps the phase flow g^t_F invariant iff F ∘ g_H = F + C. More generally, let F ∈ F and g ∈ Sp-diff; then
$$g = g_{F}^{-t} \circ g \circ g_{F}^{t} \quad \bigl(\text{equivalently } g_{F}^{t}(z) = g^{-1}\bigl(g_{F}^{t}(g(z))\bigr)\bigr) \Longleftrightarrow F \circ g = F + c.$$
Proof. For the "if" part: since
$$F \circ g = F + c \Longrightarrow \nabla F(z) = g_{*}^{T}(z)\,\nabla F(g(z)),$$
and g_*(z) ∈ Sp, g_*^{-1}J^{-1} = J^{-1}g_*^{T}, we get
$$J^{-1}\nabla F(z) = g_{*}^{-1}(z)J^{-1}\nabla F(g(z)),$$
i.e., the Hamiltonian vector field of F is g-related to itself; hence g_F^t = g^{-1} ∘ g_F^t ∘ g, that is, g = g_F^{-t} ∘ g ∘ g_F^t. On the other hand, take the derivative of both sides of the equation
$$g_{F}^{t}(z) = g^{-1}\bigl(g_{F}^{t}(g(z))\bigr)$$
w.r.t. t at t = 0, and notice that g_*(z) ∈ Sp, g_*^{-1}J^{-1} = J^{-1}g_*^{T}; we get
$$J^{-1}\nabla F(z) = g_{*}^{-1}(z)J^{-1}\nabla F(g(z)) = J^{-1}g_{*}^{T}(z)\nabla F(g(z)),$$
so ∇F(z) = g_*^T(z)∇F(g(z)) = ∇(F ∘ g)(z), and therefore F ∘ g = F + c. Therefore, the theorem is completed. □

6.5 Formal Energy for Hamiltonian Algorithm


Let F^s be an analytic canonical transformation in s, i.e.,
1° F^s ∈ Sp-diff;
2° F^0 = id;
3° F^s is analytic if |s| is small enough.
Then there exists a "formal" energy, i.e., a formal power series in s,
$$h^{s}(z) = h(s, z) = \sum_{i=0}^{\infty} s^{i}h^{(i)}(z),$$
with the following property: if h^s(z) converges, the phase flow g^t_{h^s} is a canonical transformation with Hamiltonian function h^s(z), which is considered as a time-independent Hamiltonian with s as a parameter and satisfies the "equivalence condition"
$$g_{h^{s}}^{t}\big|_{t=s} = F^{s}. \tag{5.1}$$
Therefore h^s(z) = h^s(F^s z), ∀ z ∈ R^{2n}; thus h^s(z) is invariant under F^s (for those
s, z in the domain of convergence of h^s(z)).
The generating function associated with F^s, the new Hamiltonian function, and the Darboux transformation α is
$$\phi_{F^{s},\alpha}(w) :\quad \psi(s, w) = \sum_{k=1}^{\infty} s^{k}\psi^{(k)}(w). \tag{5.2}$$

Introduce the formal power series
$$h^{s}(z) = h(s, w) = \sum_{i=0}^{\infty} s^{i}h^{(i)}(w).$$
Assuming it converges, we associate with its phase flow the generating function
$$h^{s}(z) \longrightarrow \phi_{g_{h^{s}}^{t},\alpha}(w) :\quad \chi(t, s, w) = \sum_{k=1}^{\infty} t^{k}\chi^{(k)}(s, w), \qquad \chi^{(1)}(s, w) = -h(s, w). \tag{5.3}$$
For k ≥ 1,
$$\chi^{(k+1)}(s, w) = -\frac{1}{k+1}\sum_{m=1}^{k}\frac{1}{m!}\sum_{l_1,\cdots,l_m=1}^{2n}\ \sum_{k_1+\cdots+k_m=k} h_{w_{l_1}\cdots w_{l_m}}(s, w)\,\bigl(A_{1}\chi_{w}^{(k_1)}(s, w)\bigr)_{l_1}\cdots\bigl(A_{1}\chi_{w}^{(k_m)}(s, w)\bigr)_{l_m}$$
$$= \frac{1}{k+1}\sum_{m=1}^{k}\frac{1}{m!}\sum_{l_1,\cdots,l_m=1}^{2n}\ \sum_{k_1+\cdots+k_m=k} \chi_{w_{l_1}\cdots w_{l_m}}^{(1)}(s, w)\,\bigl(A_{1}\chi_{w}^{(k_1)}(s, w)\bigr)_{l_1}\cdots\bigl(A_{1}\chi_{w}^{(k_m)}(s, w)\bigr)_{l_m}. \tag{5.4}$$

Let χ^{(k)}(s, w) = Σ_{i=0}^∞ s^i χ^{(k,i)}(w); then χ(t, s, w) = Σ_{k≥1} t^k Σ_{i=0}^∞ s^i χ^{(k,i)}(w), and
$$\sum_{i=0}^{\infty} s^{i}\chi^{(k+1,i)}(w) = \sum_{i=0}^{\infty} s^{i}\sum_{m=1}^{k}\frac{1}{(k+1)m!}\sum_{\substack{i_0+i_1+\cdots+i_m=i \\ k_1+\cdots+k_m=k}}\ \sum_{l_1,\cdots,l_m=1}^{2n} \chi_{w_{l_1}\cdots w_{l_m}}^{(1,i_0)}(w)\,\bigl(A_{1}\chi_{w}^{(k_1,i_1)}(w)\bigr)_{l_1}\cdots\bigl(A_{1}\chi_{w}^{(k_m,i_m)}(w)\bigr)_{l_m}. \tag{5.5}$$

Thus
$$\chi^{(k+1,i)}(w) = \sum_{m=1}^{k}\frac{1}{(k+1)m!}\sum_{\substack{i_0+i_1+\cdots+i_m=i \\ k_1+\cdots+k_m=k}}\ \sum_{l_1,\cdots,l_m=1}^{2n} \chi_{w_{l_1}\cdots w_{l_m}}^{(1,i_0)}(w)\,\bigl(A_{1}\chi_{w}^{(k_1,i_1)}(w)\bigr)_{l_1}\cdots\bigl(A_{1}\chi_{w}^{(k_m,i_m)}(w)\bigr)_{l_m}. \tag{5.6}$$

Let
$$\chi^{(1)}(s, w) = \sum_{i=0}^{\infty} s^{i}\chi^{(1,i)}(w) = -h(s, w) = -\sum_{i=0}^{\infty} s^{i}h^{(i)}(w),$$
so the coefficients χ^{(k+1,i)} can be obtained by recursion, starting from
$$\chi^{(1,i)} = -h^{(i)}, \qquad i = 0, 1, 2, \cdots. \tag{5.7}$$


 
Note that χ^{(k+1,i)} is determined only by the values of χ^{(k',i')} (k' ≤ k, i' ≤ i):
$$\begin{array}{cccccc} \chi^{(1,0)} & \chi^{(1,1)} & \chi^{(1,2)} & \cdots & \chi^{(1,i)} & \chi^{(1,i+1)} \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ \chi^{(k,0)} & \chi^{(k,1)} & \chi^{(k,2)} & \cdots & \chi^{(k,i)} & \chi^{(k,i+1)} \\ \chi^{(k+1,0)} & \chi^{(k+1,1)} & \chi^{(k+1,2)} & \cdots & \chi^{(k+1,i)} & \chi^{(k+1,i+1)} \end{array} \tag{5.8}$$

The condition (5.1) can now be reexpressed as
$$\chi(t, s, w)\big|_{t=s} = \chi(s, s, w) = \psi(s, w),$$
i.e.,
$$\sum_{k\geq 1} s^{k}\sum_{i=0}^{\infty} s^{i}\chi^{(k,i)}(w) = \sum_{k\geq 1} s^{k}\psi^{(k)}(w),$$
and comparing powers of s gives
$$\chi^{(1,0)} = \psi^{(1)}, \qquad \sum_{j=0}^{k-1}\chi^{(k-j,j)}(w) = \psi^{(k)}(w), \quad k = 2, 3, \cdots. \tag{5.9}$$

So
$$\begin{array}{c|ccccc} & -h^{(0)} & -h^{(1)} & -h^{(2)} & \cdots & -h^{(k)} \\ \hline \psi^{(1)} & \chi^{(1,0)} & \chi^{(1,1)} & \chi^{(1,2)} & \cdots & \chi^{(1,k)} \\ \psi^{(2)} & \chi^{(2,0)} & \chi^{(2,1)} & \chi^{(2,2)} & \cdots & \\ \vdots & \vdots & \vdots & & & \\ \psi^{(k)} & \chi^{(k,0)} & \chi^{(k,1)} & & & \\ \psi^{(k+1)} & \chi^{(k+1,0)} & & & & \end{array} \tag{5.10}$$
and
$$\begin{aligned} \chi^{(1,0)} &= \psi^{(1)}, \\ \chi^{(2,0)} + \chi^{(1,1)} &= \psi^{(2)}, \\ \chi^{(3,0)} + \chi^{(2,1)} + \chi^{(1,2)} &= \psi^{(3)}, \\ &\ \vdots \\ \chi^{(k+1,0)} + \chi^{(k,1)} + \cdots + \chi^{(2,k-1)} + \chi^{(1,k)} &= \psi^{(k+1)}. \end{aligned} \tag{5.11}$$
Now if ψ^{(1)}, ψ^{(2)}, ⋯, ψ^{(k)}, ψ^{(k+1)}, ⋯ are known, then h^{(0)} = −χ^{(1,0)}, h^{(1)} = −χ^{(1,1)}, ⋯, h^{(k)} = −χ^{(1,k)}, ⋯ can be determined. We get
$$\begin{aligned} h^{(0)} &= -\psi^{(1)}, \\ h^{(1)} &= -\psi^{(2)} + \chi^{(2,0)}, \\ h^{(2)} &= -\psi^{(3)} + (\chi^{(3,0)} + \chi^{(2,1)}), \\ &\ \vdots \\ h^{(k)} &= -\psi^{(k+1)} + (\chi^{(k+1,0)} + \chi^{(k,1)} + \cdots + \chi^{(2,k-1)}), \\ &\ \vdots \end{aligned} \tag{5.12}$$
So h^{(0)}, h^{(1)}, h^{(2)}, ⋯ can be recursively determined by ψ^{(1)}, ψ^{(2)}, ⋯. Thus we get the formal power series h^s = Σ_{i=0}^∞ s^i h^{(i)}(z), and in case of convergence it satisfies
$$g_{h^{s}}^{t}\big|_{t=s} = F^{s}.$$

We now give a special example to show how to calculate the formal energy. Let us consider the normalized Darboux transformation with
$$V = -E = \begin{bmatrix} O & -I \\ -I & O \end{bmatrix}, \qquad \alpha_{V}^{-1} = \begin{bmatrix} A_{1} & B_{1} \\ C_{1} & D_{1} \end{bmatrix},$$
where
$$A_{1} = \frac{1}{2}(JVJ - J) = \begin{bmatrix} O & O \\ -I & O \end{bmatrix}.$$
Suppose we just use the first term of the generating function of the generating map α_V, i.e., we just consider the first-order scheme
$$F^{s} \sim \psi(s, w) = -sH(w) = \sum_{k=1}^{\infty} s^{k}\psi^{(k)}.$$

That is, ψ^{(1)} = −H(w) and ψ^{(2)} = ψ^{(3)} = ⋯ = 0; then χ^{(1,0)} = ψ^{(1)} = −H. We need to calculate χ^{(2,0)}. Since
$$\chi_{z}^{(1,0)} = -\begin{pmatrix} H_{p} \\ H_{q} \end{pmatrix}, \qquad A_{1}\chi_{z}^{(1,0)} = \begin{bmatrix} O & O \\ -I & O \end{bmatrix}\begin{pmatrix} -H_{p} \\ -H_{q} \end{pmatrix} = \begin{pmatrix} O \\ H_{p} \end{pmatrix}, \qquad \chi_{zz}^{(1,0)} = -\begin{bmatrix} H_{pp} & H_{pq} \\ H_{qp} & H_{qq} \end{bmatrix},$$
by formula (5.6) we get
$$\chi^{(2,0)} = \frac{1}{2 \times 1!}\sum_{i_0+i_1=0}\sum_{k_1=1}\sum_{l_1=1}^{2n}\chi_{z_{l_1}}^{(1,i_0)}\bigl(A_{1}\chi_{z}^{(k_1,i_1)}\bigr)_{l_1} = \frac{1}{2}\bigl(\chi_{z}^{(1,0)}\bigr)^{T}A_{1}\chi_{z}^{(1,0)} = -\frac{1}{2}\begin{pmatrix} H_{p} \\ H_{q} \end{pmatrix}^{T}\begin{pmatrix} O \\ H_{p} \end{pmatrix} = -\frac{1}{2}H_{q}^{T}H_{p}.$$

From formula (5.11), we get
$$\chi^{(2,0)} + \chi^{(1,1)} = \psi^{(2)} = 0 \Longrightarrow \chi^{(1,1)} = -\chi^{(2,0)} = \frac{1}{2}H_{q}^{T}H_{p}.$$

In order to obtain χ^{(1,2)}, we first determine χ^{(3,0)} and χ^{(2,1)}, and for the latter we need to calculate
$$\chi_{z}^{(1,1)} = \frac{1}{2}\begin{pmatrix} \dfrac{\partial}{\partial p}\displaystyle\sum_{j=1}^{n} H_{q_j}H_{p_j} \\[6pt] \dfrac{\partial}{\partial q}\displaystyle\sum_{j=1}^{n} H_{q_j}H_{p_j} \end{pmatrix} = \frac{1}{2}\begin{pmatrix} H_{pq}H_{p} + H_{pp}H_{q} \\ H_{qq}H_{p} + H_{qp}H_{q} \end{pmatrix},$$
$$A_{1}\chi_{z}^{(1,1)} = \begin{bmatrix} O & O \\ -I & O \end{bmatrix}\chi_{z}^{(1,1)} = -\frac{1}{2}\begin{pmatrix} O \\ H_{pq}H_{p} + H_{pp}H_{q} \end{pmatrix},$$
$$A_{1}\chi_{z}^{(2,0)} = -A_{1}\chi_{z}^{(1,1)} = \frac{1}{2}\begin{pmatrix} O \\ H_{pq}H_{p} + H_{pp}H_{q} \end{pmatrix}.$$

For k = 2, i = 0, we have
$$\chi^{(3,0)} = \frac{1}{3}\Bigl[\frac{1}{1!}\sum_{k_1=2}\sum_{l_1=1}^{2n}\chi_{z_{l_1}}^{(1,0)}\bigl(A_{1}\chi_{z}^{(k_1,0)}\bigr)_{l_1} + \frac{1}{2!}\sum_{k_1+k_2=2}\sum_{l_1,l_2=1}^{2n}\chi_{z_{l_1},z_{l_2}}^{(1,0)}\bigl(A_{1}\chi_{z}^{(k_1,0)}\bigr)_{l_1}\bigl(A_{1}\chi_{z}^{(k_2,0)}\bigr)_{l_2}\Bigr]$$
$$= \frac{1}{3}\bigl(\chi_{z}^{(1,0)}\bigr)^{T}A_{1}\chi_{z}^{(2,0)} + \frac{1}{6}\bigl(A_{1}\chi_{z}^{(1,0)}\bigr)^{T}\chi_{zz}^{(1,0)}\,A_{1}\chi_{z}^{(1,0)}$$
$$= -\frac{1}{6}\bigl(H_{q}^{T}H_{pq}H_{p} + H_{q}^{T}H_{pp}H_{q} + H_{p}^{T}H_{qq}H_{p}\bigr).$$

For k = 1, i = 1, we have
$$\chi^{(2,1)} = \frac{1}{2}\sum_{i_0+i_1=1}\sum_{k_1=1}\sum_{l_1=1}^{2n}\chi_{z_{l_1}}^{(1,i_0)}\bigl(A_{1}\chi_{z}^{(k_1,i_1)}\bigr)_{l_1} = \frac{1}{2}\Bigl[\bigl(\chi_{z}^{(1,0)}\bigr)^{T}A_{1}\chi_{z}^{(1,1)} + \bigl(\chi_{z}^{(1,1)}\bigr)^{T}A_{1}\chi_{z}^{(1,0)}\Bigr]$$
$$= \frac{1}{4}\bigl(H_{q}^{T}H_{pp}H_{q} + H_{p}^{T}H_{qq}H_{p}\bigr) + \frac{1}{2}H_{q}^{T}H_{pq}H_{p}.$$

From (5.11), we have
$$\chi^{(3,0)} + \chi^{(2,1)} + \chi^{(1,2)} = \psi^{(3)} = 0 \Longrightarrow \chi^{(1,2)} = -(\chi^{(3,0)} + \chi^{(2,1)}).$$
Hence for k = 1, i = 2,
$$\chi^{(1,2)} = -\Bigl[-\frac{1}{6}\bigl(H_{q}^{T}H_{pp}H_{q} + H_{p}^{T}H_{qq}H_{p}\bigr) - \frac{1}{6}H_{q}^{T}H_{pq}H_{p} + \frac{1}{4}\bigl(H_{q}^{T}H_{pp}H_{q} + H_{p}^{T}H_{qq}H_{p}\bigr) + \frac{1}{2}H_{q}^{T}H_{pq}H_{p}\Bigr]$$
$$= -\frac{1}{12}\bigl(H_{q}^{T}H_{pp}H_{q} + H_{p}^{T}H_{qq}H_{p} + 4H_{q}^{T}H_{pq}H_{p}\bigr).$$

Finally, we get the formal power series of the energy
$$h(s, z) = -\bigl(\chi^{(1,0)} + s\chi^{(1,1)} + s^{2}\chi^{(1,2)}\bigr) + O(s^{3}) = H(z) - \frac{s}{2}H_{q}^{T}H_{p} + \frac{s^{2}}{12}\bigl(H_{q}^{T}H_{pp}H_{q} + H_{p}^{T}H_{qq}H_{p} + 4H_{q}^{T}H_{pq}H_{p}\bigr) + O(s^{3}).$$
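A numerical illustration of this formal energy, assuming (as in (8.18) with autonomous H) that the first-order map in question is the symplectic Euler scheme p' = p − sH_q(p', q), q' = q + sH_p(p', q). For H = (p² + q²)/2 one has H_q^T H_p = pq, H_pp = H_qq = 1, H_pq = 0, so the truncated formal energy is h = (p² + q²)/2 − (s/2)pq + (s²/12)(p² + q²), which should be conserved up to O(s³) along the orbit while H itself oscillates at O(s):

```python
s = 0.1
h = lambda p, q: (0.5 * (p * p + q * q) - 0.5 * s * p * q
                  + s**2 / 12 * (p * p + q * q))

p, q = 1.0, 0.0
H0, h0 = 0.5, h(p, q)
dH = dh = 0.0
for _ in range(1000):
    p = p - s * q          # H_q(p', q) = q   (explicit for this H)
    q = q + s * p          # H_p(p', q) = p'
    dH = max(dH, abs(0.5 * (p * p + q * q) - H0))   # drift of H
    dh = max(dh, abs(h(p, q) - h0))                 # drift of truncated h
```

For this linear problem the map exactly conserves p² + q² − spq, so the truncated formal energy drifts only at O(s³), roughly three orders below the O(s) oscillation of H.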
2 12 q

Now, let H(z) be a time-independent Hamiltonian, let its phase flow be g^t_H, and let its generating function be
$$\phi_{g_{H}^{t}}(w) = \phi(t, w) = \sum_{k=1}^{\infty} t^{k}\phi^{(k)}(w).$$
Then we have φ^{(1)}(w) = −H(w), and for k ≥ 1,
$$\phi^{(k+1)}(w) = \frac{1}{k+1}\sum_{m=1}^{k}\frac{1}{m!}\sum_{l_1,\cdots,l_m=1}^{2n}\ \sum_{k_1+\cdots+k_m=k}\phi_{w_{l_1}\cdots w_{l_m}}^{(1)}(w)\,\bigl(A_{1}\phi_{w}^{(k_1)}(w)\bigr)_{l_1}\cdots\bigl(A_{1}\phi_{w}^{(k_m)}(w)\bigr)_{l_m}. \tag{5.13}$$

Theorem 5.1. [Fen98a] Let F^s be a Sp-diff operator of order m for the Hamiltonian H, i.e., φ(s, w) − ψ(s, w) = O(|s|^{m+1}), so that
$$\psi^{(1)}(w) = \phi^{(1)}(w) = -H(w), \qquad \psi^{(2)}(w) = \phi^{(2)}(w), \quad \cdots, \quad \psi^{(m)}(w) = \phi^{(m)}(w).$$
Then
$$h^{(0)}(w) = H(w), \qquad h^{(1)}(w) = h^{(2)}(w) = \cdots = h^{(m-1)}(w) = 0,$$
i.e.,
$$h(s, w) - H(w) = O(|s|^{m}),$$
and
$$h^{(m)}(w) = \phi^{(m+1)}(w) - \psi^{(m+1)}(w).$$
(The sign of h^{(m)} follows from (5.7), (5.11), and χ^{(m+1,0)} = φ^{(m+1)}; cf. the first-order example above, where h^{(1)} = −χ^{(1,1)} = −½H_q^TH_p = φ^{(2)}.)
 
Proof. First we show that χ(k+1,i) depends only on derivatives of χ(k ,i ) (k  ≤
k, i ≤ i). The recursion for i = 0 is the same as the recursion of phase flow generat-
 
ing function with Hamiltonian χ(1,0) (w). For i ≥ 1, χ(k+1,i) = 0, if χ(k ,i ) = 0 for
all i , k  , such that 1 ≤ i ≤ i, 1 ≤ k  ≤ k. We have

$$\begin{aligned}
\psi^{(1)} &= \chi^{(1,0)} \;\Longrightarrow\; \chi^{(1,0)} \xrightarrow{\text{recursion}} \chi^{(2,0)}, \chi^{(3,0)}, \chi^{(4,0)}, \cdots \\
\psi^{(2)} &= \chi^{(1,1)} + \chi^{(2,0)} \;\Longrightarrow\; \chi^{(1,1)} \xrightarrow{\text{recursion}} \chi^{(2,1)}, \chi^{(3,1)}, \chi^{(4,1)}, \cdots \\
\psi^{(3)} &= \chi^{(1,2)} + \chi^{(2,1)} + \chi^{(3,0)} \;\Longrightarrow\; \chi^{(1,2)} \xrightarrow{\text{recursion}} \chi^{(2,2)}, \chi^{(3,2)}, \chi^{(4,2)}, \cdots \\
\psi^{(4)} &= \chi^{(1,3)} + \chi^{(2,2)} + \chi^{(3,1)} + \chi^{(4,0)} \;\Longrightarrow\; \chi^{(1,3)} \xrightarrow{\text{recursion}} \chi^{(2,3)}, \chi^{(3,3)}, \chi^{(4,3)}, \cdots \\
&\;\;\vdots \\
\psi^{(k)} &= \chi^{(1,k-1)} + \chi^{(2,k-2)} + \cdots + \chi^{(k,0)} \;\Longrightarrow\; \chi^{(1,k-1)} \xrightarrow{\text{recursion}} \chi^{(2,k-1)}, \chi^{(3,k-1)}, \chi^{(4,k-1)}, \cdots
\end{aligned} \qquad (5.14)$$

So the $\chi^{(k,i)}$ can be generated successively through (5.9), (5.6). Then

$$h(s, w) = \sum_{i=0}^{\infty} s^i \chi^{(1,i)}(w).$$

Using the equation $H = \psi^{(1)} = \phi^{(1)} = \chi^{(1,0)}$ and (5.9), (5.14), we get

$$\chi^{(2,0)} = \phi^{(2)}, \quad \chi^{(3,0)} = \phi^{(3)}, \quad \cdots, \quad \chi^{(k,0)} = \phi^{(k)}, \quad \cdots$$

Using Equation (5.14), we get

$$\psi^{(2)} = \phi^{(2)} = \chi^{(1,1)} + \phi^{(2)} \;\Longrightarrow\; \chi^{(1,1)} = 0.$$

Applying Equations (5.9) and (5.14), we get

$$\chi^{(2,1)} = 0 \;\Longrightarrow\; \chi^{(3,1)} = \chi^{(4,1)} = \cdots = \chi^{(k,1)} = \cdots = 0.$$

Applying the equation

$$\psi^{(3)} = \phi^{(3)} = \chi^{(1,2)} + \chi^{(2,1)} + \chi^{(3,0)} = \chi^{(1,2)} + 0 + \phi^{(3)} \;\Longrightarrow\; \chi^{(1,2)} = 0,$$

we then obtain

$$\chi^{(2,2)} = \chi^{(3,2)} = \chi^{(4,2)} = \cdots = \chi^{(k,2)} = \cdots = 0.$$

Finally,

$$\psi^{(m)} = \phi^{(m)} = \chi^{(1,m-1)} + \chi^{(2,m-2)} + \cdots + \chi^{(m-1,1)} + \phi^{(m)} \;\Longrightarrow\; \chi^{(1,m-1)} = 0,$$

so that

$$\chi^{(2,m-1)} = \chi^{(3,m-1)} = \chi^{(4,m-1)} = \cdots = \chi^{(k,m-1)} = \cdots = 0.$$

Since $\chi^{(k,i)} = 0$ for all $i = 1, 2, \cdots, m-1$ and $k = 1, 2, 3, \cdots$, the equation

$$\psi^{(m+1)} = \chi^{(1,m)} + \chi^{(2,m-1)} + \cdots + \chi^{(m,1)} + \chi^{(m+1,0)}$$

gives

$$\chi^{(1,m)} = \psi^{(m+1)} - \phi^{(m+1)},$$

so we finally get

$$h(s, z) = \sum_{i=0}^{\infty} s^i \chi^{(1,i)} = H(z) + s^m\left(\psi^{(m+1)} - \phi^{(m+1)}\right) + O(|s|^{m+1}),$$

i.e.,

$$h(s, z) - H(z) = s^m\left(\psi^{(m+1)}(z) - \phi^{(m+1)}(z)\right) + O(|s|^{m+1}).$$

In particular, if $F^s \sim \psi(s, w)$ is given by the truncation of the phase-flow generating function, i.e.,

$$\psi^{(1)} = \phi^{(1)} = H, \quad \psi^{(2)} = \phi^{(2)}, \quad \cdots, \quad \psi^{(m)} = \phi^{(m)}, \quad \psi^{(m+1)} = 0,$$

then

$$h(s, z) = H(z) - s^m \phi^{(m+1)}(z) + O(|s|^{m+1}).$$

This completes the proof of the theorem. □



6.6 Ge–Marsden Theorem


In this section, we describe the result of Ge and Marsden, which concerns the nonexistence of symplectic schemes that preserve the energy exactly. Because energy preservation is so desirable for a numerical method, many people made extensive efforts to find an energy-preserving symplectic scheme, yet none succeeded. Ge Zhong, a former Ph.D. student of Prof. Feng, first proved the nonexistence of energy-preserving symplectic schemes in his thesis [Ge88]. The result was published later by him and Marsden in [GM88], and is now called the Ge–Marsden theorem.

Let $H$ be a Hamiltonian function such that, in a neighborhood of some level surface (energy surface), there exists no conservation law other than the energy itself. In other words, given a function $f$ defined in a neighborhood of the energy surface $H = c_0$, if $\{f, H\} = 0$, then $f = g(H)$, where $g$ is a function on $\mathbf{R}^1$.
A symplectic scheme can be regarded as a one-parameter family of symplectic transformations $\phi^\tau$ ($\tau \geq 0$). A well-posed difference scheme should satisfy the consistency condition, which ensures that $\phi^\tau$ depends smoothly on the parameter $\tau$.

Now suppose that we have a symplectic scheme which preserves the energy, i.e.,

$$H \circ \phi^\tau = H,$$

so that $\phi^\tau$ maps each energy surface $H = c$ to itself. We denote the restrictions of the mappings $g_H^\tau$ and $\phi^\tau$ to the level surface $H = c$ by

$$g_H^\tau\big|_{H=c}, \qquad \phi^\tau\big|_{H=c},$$

respectively.

Theorem 6.1 (G–M theorem). [Ge88] There exists a function $\tau = \tau(c, t)$ defined on a neighborhood of $0 \in \mathbf{R}$, such that

$$\phi^{\tau(c,t)}\big|_{H=c} = g_H^t\big|_{H=c}.$$

This means that if we could find a symplectic scheme preserving the energy, we could solve the original Hamiltonian system exactly, up to a reparametrization of the time parameter in the phase flow $g_H^t$. In general this is impossible.
The proof of Theorem 6.1 is based on the following Lemma 6.2.

Lemma 6.2. Let $g_{A_1}^t$, $g_{A_2}^t$ be the solutions of the following two systems of ODEs, respectively:

$$\frac{dx}{dt} = A_1(x, t), \qquad \frac{dx}{dt} = c(t)A_1(x, t),$$

where $c(t)$ is a function of $t$. Then

$$g_{A_1}^t = g_{A_2}^{\tau(t)},$$

where $\tau(t)$ is the solution of the initial value problem

$$\frac{d\tau}{dt} = c(t), \qquad \tau(0) = 0.$$

Proof. Omitted. □

Now we give a proof of Theorem 6.1.

Proof. Let $F(z, \tau)$ be a Hamiltonian function whose phase flow is $\phi^\tau$. Then from $H \circ \phi^\tau = H$ and Theorem 2.20 of Chapter 3 we get $\{F(\cdot, \tau), H\} = 0$. According to the assumption, there exists an $F_1$ such that

$$F(z, \tau) = F_1\big(H(z), \tau\big),$$

and the vector field generated by the Hamiltonian function $F(z, \tau)$ is

$$J^{-1} F_1'\big(H(z), \tau\big) H_z,$$

which is tangent to the energy surface $H = c$. Its restriction to the level surface $H = c$ is

$$J^{-1} F_1'(c, \tau)\, H_z\big|_{H=c}.$$

It differs from the restriction of the vector field $J^{-1} H_z$ to the level surface $H = c$ only by the scalar factor $F_1'(c, \tau)$. By Lemma 6.2 the proof is completed. □
All symplectic transformations that keep $H$ invariant form a group $S(H)$. Let $S_0(H)$ denote the connected component of the identity transformation in $S(H)$. Restricting the transformations of $S_0(H)$ to the level surface $H = c$ gives the set denoted by $S_0(H)|_{H=c}$. Then $S_0(H)|_{H=c}$ is a curve:

$$S_0(H)\big|_{H=c} = \left\{\, g_H^t\big|_{H=c},\; t \in \mathbf{R} \,\right\}.$$

Note that it is exactly this rigidity of $S_0(H)$ that rules out the existence of energy-preserving symplectic schemes.
Bibliography

[Fen98] K. Feng: The calculus of generating functions and the formal energy for Hamiltonian
systems. J. Comput. Math., 16:481–498, (1998).
[FQ87] K. Feng and M.Z. Qin: The symplectic methods for the computation of Hamiltonian
equations. In Y. L. Zhu and B. Y. Guo, editors, Numerical Methods for Partial Differential
Equations, Lecture Notes in Mathematics 1297, pages 1–37. Springer, Berlin, (1987).
[FQ91a] K. Feng and M.Z. Qin: Hamiltonian Algorithms for Hamiltonian Dynamical Systems.
Progr. Natur. Sci., 1(2):105–116, (1991).
[FQ91b] K. Feng and M.Z. Qin: Hamiltonian algorithms for Hamiltonian systems and a com-
parative numerical study. Comput. Phys. Comm., 65:173–187, (1991).
[FQ03] K. Feng and M. Z. Qin: Symplectic Algorithms for Hamiltonian Systems. Zhejiang
Press for Science and Technology, Hangzhou, in Chinese, First edition, (2003).
[Ge88] Z. Ge: Symplectic geometry and its application in numerical analysis. PhD thesis,
Computer Center, CAS, (1988).
[Ge91] Z. Ge: Equivariant symplectic difference schemes and generating functions. Physica
D, 49:376–386, (1991).
[GM88] Z. Ge and J. E. Marsden: Lie–Poisson Hamilton–Jacobi theory and Lie–Poisson inte-
grators. Physics Letters A, pages 134–139, (1988).
[GW95] Z. Ge and D.L. Wang: On the invariance of generating functions for symplectic trans-
formations. Diff. Geom. Appl., 5:59–69, (1995).
Chapter 7.
Symplectic Runge–Kutta Methods

In this chapter we consider symplectic Runge–Kutta (R–K) methods.

7.1 Multistage Symplectic Runge–Kutta Method


We now study the multistage symplectic Runge–Kutta method. A key feature of R–K methods is the use of linear combinations of first-order derivatives of the numerical solution of the differential equations to achieve higher-order approximation.

7.1.1 Definition and Properties of Symplectic R–K Method


Consider the following Hamiltonian system:

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}, \qquad \frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad i = 1, 2, \cdots, n, \qquad (1.1)$$

where $H = H(p_1, \cdots, p_n, q_1, \cdots, q_n)$ is a Hamiltonian function independent of $t$. For a $t$-dependent Hamiltonian (e.g., a nonautonomous system), we can introduce two new variables and transform the system into one of the form (1.1)$^{[\mathrm{Qin96,Gon96}]}$.
To simplify the exposition, we denote

$$z = \begin{bmatrix} p_1 \\ \vdots \\ p_n \\ q_1 \\ \vdots \\ q_n \end{bmatrix} = \begin{bmatrix} z_1 \\ \vdots \\ z_n \\ z_{n+1} \\ \vdots \\ z_{2n} \end{bmatrix}, \qquad H_z = \begin{bmatrix} H_{z_1} \\ \vdots \\ H_{z_{2n}} \end{bmatrix}, \qquad J = J_{2n} = \begin{bmatrix} O & I_n \\ -I_n & O \end{bmatrix}, \quad J^{T} = J^{-1} = -J,$$

where $I_n$ is the $n \times n$ identity matrix and $J$ is the standard symplectic matrix. Using this notation, we can rewrite Equation (1.1) as

$$\frac{dz}{dt} = J^{-1} H_z, \qquad (1.2)$$

or

$$\frac{dz}{dt} = J^{-1} H_z = f(z). \qquad (1.3)$$

The $s$-stage R–K method for (1.3) has the following form:

$$z^{k+1} = z^k + h \sum_{i=1}^{s} b_i f(Y_i), \qquad Y_i = z^k + h \sum_{j=1}^{s} a_{ij} f(Y_j), \quad 1 \leq i \leq s, \qquad (1.4)$$

where $h = t_{k+1} - t_k$ ($k \geq 0$), and $b_i$, $a_{ij}$ ($i, j = 1, 2, \cdots, s$) are real parameters. The properties of an R–K method (consistency, accuracy, stability, etc.) are determined completely by these parameters. In scheme (1.4), if $a_{ij} = 0$ for $j > i$, then each $Y_i$,

$$Y_i = z^k + h \sum_{j=1}^{i-1} a_{ij} f(Y_j) + h\, a_{ii} f(Y_i), \quad 1 \leq i \leq s, \qquad (1.5)$$

can be computed in sequence from $Y_1, Y_2, \cdots, Y_{i-1}$. If in addition $a_{ii} = 0$ for all $i$, each $Y_i$ is obtained explicitly; such a scheme is called an explicit R–K scheme. If $a_{ij} = 0$ for $j > i$ ($1 \leq i < s$) but some diagonal entries $a_{ii} \neq 0$ ($1 \leq i \leq s$), the scheme is called semi-implicit; each $Y_i$ is then defined implicitly by a $2n$-dimensional equation. The importance of semi-implicit methods is that the computation of $Y_1, Y_2, \cdots, Y_s$ can be carried out in sequence, as $s$ systems of $2n$ algebraic equations each, rather than as one system of $s \times 2n$ equations. Such a scheme is also referred to as diagonally implicit. If the method is neither explicit nor diagonally implicit, it is simply called implicit, and all the $Y_i$ must be computed simultaneously. Explicit methods are much easier to apply than implicit ones; on the other hand, implicit methods possess good stability properties.
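As a concrete illustration of how the stage equations in (1.4) can be evaluated in practice, the following sketch (our own, not from the book; the function name `irk_step` and the test problem are our choices) solves the stages of a fully implicit method by fixed-point iteration, which converges for sufficiently small $h$:

```python
# A minimal sketch of one step of the implicit s-stage R-K scheme (1.4).
# The stage equations Y_i = z^k + h*sum_j a_ij f(Y_j) are solved by
# fixed-point iteration.
import math

def irk_step(f, z, h, A, b, tol=1e-14, max_iter=200):
    s, n = len(b), len(z)
    Y = [list(z) for _ in range(s)]                 # initial guess Y_i = z^k
    for _ in range(max_iter):
        F = [f(Yi) for Yi in Y]
        Ynew = [[z[d] + h * sum(A[i][j] * F[j][d] for j in range(s))
                 for d in range(n)] for i in range(s)]
        done = max(abs(Ynew[i][d] - Y[i][d])
                   for i in range(s) for d in range(n)) < tol
        Y = Ynew
        if done:
            break
    F = [f(Yi) for Yi in Y]
    return [z[d] + h * sum(b[i] * F[i][d] for i in range(s)) for d in range(n)]

# Demo: the 1-stage Gauss method (the midpoint rule, a11 = 1/2, b1 = 1) on
# the harmonic oscillator H = (p^2 + q^2)/2, i.e. f(p, q) = (-q, p).
f = lambda z: [-z[1], z[0]]
z = [0.0, 1.0]
for _ in range(100):                                # integrate to t = 1
    z = irk_step(f, z, 0.01, [[0.5]], [1.0])
```

After 100 steps the result is close to the exact value $(-\sin 1, \cos 1)$; for this quadratic Hamiltonian the midpoint rule also conserves $H$ up to the iteration tolerance.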
Butcher [But87] proposed the so-called Butcher array, which provides a condensed representation of the R–K method (1.4):

$$\begin{array}{c|ccc}
c_1 & a_{11} & \cdots & a_{1s} \\
c_2 & a_{21} & \cdots & a_{2s} \\
\vdots & \vdots & & \vdots \\
c_s & a_{s1} & \cdots & a_{ss} \\
\hline
 & b_1 & \cdots & b_s
\end{array}
\qquad \text{abbreviated} \qquad
\begin{array}{c|c}
c & A \\
\hline
 & b^{T}
\end{array}
\qquad (1.6)$$

where $c_i = \sum_{j=1}^{s} a_{ij}$ ($i = 1, 2, \cdots, s$). Thus an $s$-stage R–K method is determined completely by the mathematical tableau (1.6). This kind of representation is therefore often called the Butcher tableau (or form).

We regard a single-step difference scheme as a transition mapping from time $t_k$ to $t_{k+1}$.

Definition 1.1. A symplectic R–K method is an R–K method whose transition mapping for (1.4) is symplectic, i.e., whose Jacobian matrix $\dfrac{\partial z^{k+1}}{\partial z^k}$ is a symplectic matrix.

Definition 1.2. An $s$-stage R–K method is said to satisfy the simplifying conditions if

$$\begin{aligned}
B(p):&\quad \sum_{i=1}^{s} b_i c_i^{k-1} = \frac{1}{k}, & k &= 1(1)p, \\
C(\eta):&\quad \sum_{j=1}^{s} a_{ij} c_j^{k-1} = \frac{c_i^{k}}{k}, & i &= 1(1)s,\; k = 1(1)\eta, \\
D(\zeta):&\quad \sum_{i=1}^{s} b_i c_i^{k-1} a_{ij} = \frac{b_j\left(1 - c_j^{k}\right)}{k}, & j &= 1(1)s,\; k = 1(1)\zeta,
\end{aligned}$$

where $A$ is the $s \times s$ coefficient matrix, and $b$ and $c$ are the $s \times 1$ vectors of weights and abscissae, respectively.

In 1964 Butcher proved the following fundamental theorem$^{[\mathrm{But87}]}$:

Theorem 1.3. If the coefficients $A$, $b$, $c$ of an R–K method satisfy $B(p)$, $C(\eta)$, $D(\zeta)$ with $p \leq \eta + \zeta + 1$ and $p \leq 2\eta + 2$, then the R–K method is of order $p$$^{[\mathrm{HNW93}]}$.
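The conditions $B$, $C$, $D$ are easy to test numerically. The following sketch (our own; the coefficients below are the standard 2-stage Gauss–Legendre ones, which reappear later as scheme (1.13)) checks them:

```python
# Checkers for the simplifying conditions B(p), C(eta), D(zeta) of
# Definition 1.2, applied to the 2-stage Gauss-Legendre coefficients.
import math

def cond_B(b, c, p):
    return all(abs(sum(bi * ci ** (k - 1) for bi, ci in zip(b, c)) - 1.0 / k) < 1e-12
               for k in range(1, p + 1))

def cond_C(A, c, eta):
    s = len(c)
    return all(abs(sum(A[i][j] * c[j] ** (k - 1) for j in range(s)) - c[i] ** k / k) < 1e-12
               for i in range(s) for k in range(1, eta + 1))

def cond_D(A, b, c, zeta):
    s = len(c)
    return all(abs(sum(b[i] * c[i] ** (k - 1) * A[i][j] for i in range(s))
                   - b[j] * (1 - c[j] ** k) / k) < 1e-12
               for j in range(s) for k in range(1, zeta + 1))

r3 = math.sqrt(3.0)
c = [(3 - r3) / 6, (3 + r3) / 6]
A = [[0.25, (3 - 2 * r3) / 12], [(3 + 2 * r3) / 12, 0.25]]
b = [0.5, 0.5]
```

Here `cond_B(b, c, 4)`, `cond_C(A, c, 2)` and `cond_D(A, b, c, 2)` all hold, so Theorem 1.3 with $p = 4$, $\eta = \zeta = 2$ gives order 4, consistent with Table 1.1.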

R–K methods of this type are based on high-order quadrature rules. One can derive an R–K method of order $s$ for any set of distinct abscissae $c_i$ ($i = 1, \cdots, s$). A higher order can be obtained for the following special sets of abscissae:

1° Using the shifted zeros of the Gauss–Legendre polynomial to obtain the $c_i$, and condition $C(s)$ of Definition 1.2 to obtain the coefficients, gives the Gauss–Legendre method.

2° Using the zeros of the Radau polynomials

$$\text{(1)}\quad \frac{d^{s-1}}{dx^{s-1}}\left[x^{s}(x-1)^{s-1}\right] \quad \text{(left Radau)}, \qquad \text{(2)}\quad \frac{d^{s-1}}{dx^{s-1}}\left[x^{s-1}(x-1)^{s}\right] \quad \text{(right Radau)},$$

together with condition $D(s)$ of Definition 1.2 gives the Radau I A method, or together with condition $C(s)$ the Radau II A method.

3° Using the zeros of the Lobatto polynomial

$$\frac{d^{s-2}}{dx^{s-2}}\left[x^{s-1}(x-1)^{s-1}\right]$$

with the coefficients $b_i$ satisfying condition $B(2s-2)$ of Definition 1.2 gives:

(1) Lobatto III A if $a_{ij}$ is determined by $C(s)$;
(2) Lobatto III B if $D(s)$ is satisfied;
(3) Lobatto III C if $a_{i1} = b_1$ for all $i = 1, \cdots, s$, and the rest of the $a_{ij}$ are determined by $C(s-1)$.
Radau I A ($s = 2$):

$$\begin{array}{c|cc}
0 & 1/4 & -1/4 \\
2/3 & 1/4 & 5/12 \\
\hline
 & 1/4 & 3/4
\end{array}$$

Radau II A ($s = 2$):

$$\begin{array}{c|cc}
1/3 & 5/12 & -1/12 \\
1 & 3/4 & 1/4 \\
\hline
 & 3/4 & 1/4
\end{array}$$

Lobatto III A ($s = 2$ and $s = 3$):

$$\begin{array}{c|cc}
0 & 0 & 0 \\
1 & 1/2 & 1/2 \\
\hline
 & 1/2 & 1/2
\end{array}
\qquad
\begin{array}{c|ccc}
0 & 0 & 0 & 0 \\
1/2 & 5/24 & 1/3 & -1/24 \\
1 & 1/6 & 2/3 & 1/6 \\
\hline
 & 1/6 & 2/3 & 1/6
\end{array}$$

Lobatto III B ($s = 2$ and $s = 3$):

$$\begin{array}{c|cc}
0 & 1/2 & 0 \\
1 & 1/2 & 0 \\
\hline
 & 1/2 & 1/2
\end{array}
\qquad
\begin{array}{c|ccc}
0 & 1/6 & -1/6 & 0 \\
1/2 & 1/6 & 1/3 & 0 \\
1 & 1/6 & 5/6 & 0 \\
\hline
 & 1/6 & 2/3 & 1/6
\end{array}$$

Lobatto III C ($s = 2$ and $s = 3$):

$$\begin{array}{c|cc}
0 & 1/2 & -1/2 \\
1 & 1/2 & 1/2 \\
\hline
 & 1/2 & 1/2
\end{array}
\qquad
\begin{array}{c|ccc}
0 & 1/6 & -1/3 & 1/6 \\
1/2 & 1/6 & 5/12 & -1/12 \\
1 & 1/6 & 2/3 & 1/6 \\
\hline
 & 1/6 & 2/3 & 1/6
\end{array}$$
We summarize these conditions for the methods based on high-order quadrature rules in Table 1.1.

Table 1.1. The simplified conditions for s-stage method based on high order quadrature rule

method simplified condition order of accuracy Padé approx


Gauss–Legendre B(2s) C(s) D(s) 2s (s, s)
Radau I A B(2s − 1) C(s − 1) D(s) 2s − 1 (s − 1, s)
Radau II A B(2s − 1) C(s) D(s − 1) 2s − 1 (s − 1, s)
Lobatto III A B(2s − 2) C(s) D(s − 2) 2s − 2 (s − 1, s − 1)
Lobatto III B B(2s − 2) C(s − 2) D(s) 2s − 2 (s − 1, s − 1)
Lobatto III C B(2s − 2) C(s − 1) D(s − 1) 2s − 2 (s − 2, s)

7.1.2 Symplectic Conditions for R–K Method


In this subsection a sufficient condition for an R–K method to be symplectic is given. Let $B = \operatorname{diag}[b_1, b_2, \cdots, b_s]$ be a diagonal matrix, and set $M = BA + A^{T}B - bb^{T}$. The following condition was first proposed by Sanz-Serna during his visit to China$^{[\mathrm{SS88}]}$.
Theorem 1.4. If M = 0, then an s-stage R–K method (1.4) is symplectic[SS88,Las88,Sur88] .

Proof. Here we give our own proof$^{[\mathrm{QZ92a}]}$. To prove that the scheme (1.4) is symplectic when $M = 0$, we only need to verify that the Jacobian matrix is symplectic. From the scheme (1.4) we have

$$\frac{\partial z^{k+1}}{\partial z^k} = I + h \sum_{i=1}^{s} b_i\, Df(Y_i)\, \frac{\partial Y_i}{\partial z^k}, \qquad (1.7)$$

$$\frac{\partial Y_i}{\partial z^k} = I + h \sum_{j=1}^{s} a_{ij}\, Df(Y_j)\, \frac{\partial Y_j}{\partial z^k}, \quad 1 \leq i \leq s, \qquad (1.8)$$

where $Df$ is the derivative of the function $f$.

Denote $D_i = Df(Y_i)$ and $X_i = \dfrac{\partial Y_i}{\partial z^k}$ ($i = 1, 2, \cdots, s$), and let $f = J^{-1}H_z$; then

$$J D_i + D_i^{T} J = 0, \qquad (1.9)$$

and

$$\left[\frac{\partial z^{k+1}}{\partial z^k}\right]^{T} J \left[\frac{\partial z^{k+1}}{\partial z^k}\right]
= J + h \sum_{i=1}^{s} b_i \left[(D_i X_i)^{T} J + J D_i X_i\right]
+ h^2 \left(\sum_{i=1}^{s} b_i D_i X_i\right)^{T} J \left(\sum_{i=1}^{s} b_i D_i X_i\right).$$

It follows from (1.8) that

$$(D_i X_i)^{T} J X_i = (D_i X_i)^{T} J + h \sum_{j=1}^{s} a_{ij} (D_i X_i)^{T} J D_j X_j,$$

$$X_i^{T} J D_i X_i = J D_i X_i + h \sum_{j=1}^{s} a_{ij} (D_j X_j)^{T} J D_i X_i.$$

Substituting these and using Equation (1.9), which gives $X_i^{T}\left(D_i^{T} J + J D_i\right) X_i = 0$, we obtain

$$\begin{aligned}
\left[\frac{\partial z^{k+1}}{\partial z^k}\right]^{T} J \left[\frac{\partial z^{k+1}}{\partial z^k}\right]
&= J + h^2 \left(\sum_{i=1}^{s} b_i D_i X_i\right)^{T} J \left(\sum_{i=1}^{s} b_i D_i X_i\right)
+ h \sum_{i=1}^{s} b_i \left[X_i^{T} D_i^{T} J X_i + X_i^{T} J D_i X_i\right] \\
&\quad - h^2 \sum_{i=1}^{s} \sum_{j=1}^{s} b_i a_{ij} \left[(D_i X_i)^{T} J D_j X_j + (D_j X_j)^{T} J D_i X_i\right] \\
&= J + h^2 \sum_{i=1}^{s} \sum_{j=1}^{s} \left(b_i b_j - b_i a_{ij} - b_j a_{ji}\right) (D_i X_i)^{T} J D_j X_j.
\end{aligned}$$

It is easy to see that if $M = 0$, then

$$\left[\frac{\partial z^{k+1}}{\partial z^k}\right]^{T} J \left[\frac{\partial z^{k+1}}{\partial z^k}\right] = J,$$

i.e., the Jacobian matrix $\dfrac{\partial z^{k+1}}{\partial z^k}$ of the transition mapping is symplectic. □
Remark 1.5. If the R–K method is non-reducible, then the condition $M = 0$ is also necessary.

From Subsection 7.1.1 we know that an R–K method is determined completely by the coefficients $c_i$, $a_{ij}$, $b_i$ ($i, j = 1, \cdots, s$). Now we introduce the Gauss–Legendre method: let $c_i$ ($i = 1, \cdots, s$) be the zeros of the shifted Legendre polynomial $Q_s(x)$, where the Legendre polynomials are defined as

$$P_s(x) = \frac{1}{2^s s!} \frac{d^s}{dx^s}\left\{(x^2 - 1)^s\right\}, \qquad (1.10)$$

$$Q_s(x) = P_s(2x - 1). \qquad (1.11)$$

Let the method satisfy the simplifying conditions $B(s)$ and $C(s)$: solve the equations $\sum_{i=1}^{s} b_i c_i^{k-1} = \dfrac{1}{k}$ ($1 \leq k \leq s$) for the $b_i$ ($i = 1, \cdots, s$), and solve the equations $\sum_{j=1}^{s} a_{ij} c_j^{k-1} = \dfrac{c_i^k}{k}$ ($1 \leq k \leq s$, $1 \leq i \leq s$) for the $a_{ij}$ ($i, j = 1, \cdots, s$). Then the scheme determined by these $b_i$ and $a_{ij}$ is the unique R–K method achieving order $2s$ of accuracy. We list the Butcher tableaus for $s \leq 2$ as follows.
$s = 1$:

$$\begin{array}{c|c}
\dfrac{1}{2} & \dfrac{1}{2} \\
\hline
 & 1
\end{array} \qquad (1.12)$$

$s = 2$:

$$\begin{array}{c|cc}
\dfrac{3-\sqrt{3}}{6} & \dfrac{1}{4} & \dfrac{3-2\sqrt{3}}{12} \\[4pt]
\dfrac{3+\sqrt{3}}{6} & \dfrac{3+2\sqrt{3}}{12} & \dfrac{1}{4} \\
\hline
 & \dfrac{1}{2} & \dfrac{1}{2}
\end{array} \qquad (1.13)$$

It is easy to see that $s = 1$ is exactly the case of the Euler centered scheme:

$$z^{k+1} = z^k + h f\!\left(\frac{1}{2}\left(z^k + z^{k+1}\right)\right). \qquad (1.14)$$
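To see the practical effect of the symplecticity of the Euler centered scheme, one can integrate a nonlinear oscillator over a long time. The following experiment (our own sketch; the pendulum Hamiltonian and all parameter values are arbitrary test choices) shows that the energy error stays bounded instead of drifting:

```python
# The Euler centered (implicit midpoint) scheme (1.14) on the pendulum
# H(p, q) = p^2/2 - cos q.  The scheme is symplectic, so the energy error
# oscillates at size O(h^2) but does not grow secularly.
import math

def midpoint_step(f, z, h, tol=1e-14):
    """Solve z_new = z + h f((z + z_new)/2) by fixed-point iteration."""
    znew = list(z)
    for _ in range(100):
        mid = [(a + b) / 2 for a, b in zip(z, znew)]
        nxt = [a + h * fa for a, fa in zip(z, f(mid))]
        if max(abs(x - y) for x, y in zip(nxt, znew)) < tol:
            return nxt
        znew = nxt
    return znew

f = lambda z: [-math.sin(z[1]), z[0]]         # (dp/dt, dq/dt)
H = lambda z: z[0] ** 2 / 2 - math.cos(z[1])

z, h = [0.0, 1.5], 0.1
H0 = H(z)
drift = 0.0
for _ in range(5000):                         # integrate to t = 500
    z = midpoint_step(f, z, h)
    drift = max(drift, abs(H(z) - H0))
```

In this run `drift` stays at roughly the $O(h^2)$ level even over 5000 steps, which is exactly the bounded-energy behavior that the formal energy $h(s, z) = H(z) + O(s^2)$ of Section 6.5 predicts.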

It is not difficult to verify that both schemes (1.12) and (1.13) satisfy the condition $M = 0$, and hence are symplectic. Furthermore, we have the following conclusion:

Theorem 1.6. The $s$-stage Gauss–Legendre method is a symplectic scheme with order $2s$ of accuracy.
Proof. Since the scheme satisfies the conditions $D(s)$, $C(s)$, $B(2s)$, we have

$$\sum_{i=1}^{s} b_i c_i^{l-1} a_{ij} = \frac{1}{l} b_j\left(1 - c_j^{l}\right) = \frac{1}{l} b_j - \frac{1}{l} b_j c_j^{l}
= \sum_{i=1}^{s} b_i b_j c_i^{l-1} - \sum_{i=1}^{s} b_j a_{ji} c_i^{l-1},$$

which results in

$$\sum_{i=1}^{s} \left(b_i a_{ij} + b_j a_{ji} - b_i b_j\right) c_i^{l-1} = 0, \qquad l, j = 1, 2, \cdots, s.$$

Since $c_1, c_2, \cdots, c_s$ are mutually distinct, we obtain $M = 0$. □
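The condition of Theorem 1.4 can also be verified entry by entry. A small numerical check (our own sketch) for the Gauss–Legendre tableaus (1.12) and (1.13):

```python
# Verify M = BA + A^T B - b b^T = 0 entrywise for (1.12) and (1.13),
# and show that explicit Euler fails the condition.
import math

def symplecticity_defect(A, b):
    """max |b_i a_ij + b_j a_ji - b_i b_j| over all i, j."""
    s = len(b)
    return max(abs(b[i] * A[i][j] + b[j] * A[j][i] - b[i] * b[j])
               for i in range(s) for j in range(s))

defect_1 = symplecticity_defect([[0.5]], [1.0])          # midpoint rule (1.12)

r3 = math.sqrt(3.0)
A2 = [[0.25, (3 - 2 * r3) / 12], [(3 + 2 * r3) / 12, 0.25]]
defect_2 = symplecticity_defect(A2, [0.5, 0.5])          # scheme (1.13)

defect_euler = symplecticity_defect([[0.0]], [1.0])      # explicit Euler
```

Here `defect_1` and `defect_2` vanish (up to roundoff), while `defect_euler` equals 1, consistent with Corollary 1.8 below: explicit methods never satisfy $M = 0$.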

7.1.3 Diagonally Implicit Symplectic R–K Method


In this subsection, we give some diagonally implicit symplectic R–K formulas. These schemes not only have the advantages of computational convenience and good stability, but are also symplectic.

Let us consider a diagonally implicit $s$-stage R–K method that satisfies $M = 0$. Without loss of generality we assume that $b_i \neq 0$ ($i = 1, 2, \cdots, s$). The condition $M = 0$ reads

$$b_i b_j - b_i a_{ij} - b_j a_{ji} = 0, \qquad i, j = 1, 2, \cdots, s. \qquad (1.15)$$

Indeed, if some $b_k = 0$, then $b_i a_{ik} = 0$ ($i = 1, 2, \cdots, s$), and the method is equivalent to a method with fewer stages.
The following theorem was first proposed by the authors; see the literature [QZ92a, SA91].
Theorem 1.7. If an $s$-stage diagonally implicit method satisfies $M = 0$, then the method can be written in the following form:

$$\begin{array}{c|ccccc}
c_1 & \dfrac{b_1}{2} & & & & \\[2pt]
c_2 & b_1 & \dfrac{b_2}{2} & & & \\[2pt]
c_3 & b_1 & b_2 & \dfrac{b_3}{2} & & \\
\vdots & \vdots & \vdots & & \ddots & \\
c_s & b_1 & b_2 & b_3 & \cdots & \dfrac{b_s}{2} \\
\hline
 & b_1 & b_2 & b_3 & \cdots & b_s
\end{array} \qquad (1.16)$$

where $c_i = \sum_{j=1}^{i} b_{j-1} + \dfrac{b_i}{2}$ ($i = 1, \cdots, s$, $b_0 = 0$).

Proof. Since the scheme is diagonally implicit, $a_{ij} = 0$ for $j > i$; to satisfy $M = 0$ we need $b_i b_j - b_i a_{ij} - b_j a_{ji} = 0$ ($i, j = 1, 2, \cdots, s$), which results in

$$a_{ij} = b_j \;\;(i > j), \qquad a_{ii} = \frac{b_i}{2}, \qquad i = 1, \cdots, s.$$

The theorem is proved. □

Corollary 1.8. An explicit R–K method of any order cannot satisfy the condition $M = 0$.

Remark 1.9. Cooper$^{[\mathrm{Coo87}]}$ has discussed the condition (1.15) and constructed a method of the family (1.16) with $s = 3$ and order 3.

Below we give diagonally implicit symplectic R–K methods for $s \leq 3$:

$s = 1$:

$$\begin{array}{c|c}
\dfrac{1}{2} & \dfrac{1}{2} \\
\hline
 & 1
\end{array} \qquad (1.17)$$

$s = 2$:

$$\begin{array}{c|cc}
\dfrac{1}{4} & \dfrac{1}{4} & 0 \\[2pt]
\dfrac{3}{4} & \dfrac{1}{2} & \dfrac{1}{4} \\
\hline
 & \dfrac{1}{2} & \dfrac{1}{2}
\end{array} \qquad (1.18)$$

$s = 3$:

$$\begin{array}{c|ccc}
\dfrac{a}{2} & \dfrac{a}{2} & & \\[2pt]
\dfrac{3a}{2} & a & \dfrac{a}{2} & \\[2pt]
\dfrac{1}{2} + a & a & a & \dfrac{1}{2} - a \\
\hline
 & a & a & 1 - 2a
\end{array} \qquad (1.19)$$

where $a = 1.351207\cdots$ is the real root of the polynomial $6x^3 - 12x^2 + 6x - 1$$^{[\mathrm{Coo87}]}$.

The above three schemes have orders of accuracy 2, 2, and 3, respectively.

Corollary 1.10. Let $s = 3$ and take the elements in the Butcher tableau in the symmetric version ($a_{11} = a_{33}$):

$$\begin{array}{c|ccc}
\dfrac{a}{2} & \dfrac{a}{2} & & \\[2pt]
\dfrac{1}{2} & a & \dfrac{1}{2} - a & \\[2pt]
1 - \dfrac{a}{2} & a & 1 - 2a & \dfrac{a}{2} \\
\hline
 & a & 1 - 2a & a
\end{array} \qquad (1.20)$$

Then this scheme has 4th-order accuracy.


In Chapter 8 we will see that this is a typical example of using the Euler centered scheme together with composition (multiplicative extrapolation) to achieve 4th-order accuracy.
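The 4th-order behavior of the symmetric scheme (1.20) can be observed numerically. The sketch below (our own; the harmonic oscillator is chosen because each midpoint substep can then be solved in closed form) composes three implicit midpoint substeps of sizes $ah$, $(1-2a)h$, $ah$ and measures how the error shrinks when $h$ is halved:

```python
# Order check for the symmetric composition of Corollary 1.10 on the
# harmonic oscillator f(p, q) = (-q, p), whose exact flow is a rotation.
import math

def midpoint_substep(z, h):
    """Closed-form solve of z' = z + h f((z + z')/2) for this linear f:
    a Cayley-transform rotation."""
    p, q = z
    t = h / 2
    d = 1 + t * t
    return [((1 - t * t) * p - h * q) / d, (h * p + (1 - t * t) * q) / d]

a = 1.3512071919596578            # real root of 6x^3 - 12x^2 + 6x - 1

def composed_step(z, h):
    for w in (a, 1 - 2 * a, a):
        z = midpoint_substep(z, w * h)
    return z

def error(h):
    z, n = [0.0, 1.0], int(round(1.0 / h))
    for _ in range(n):
        z = composed_step(z, h)
    return math.hypot(z[0] + math.sin(1.0), z[1] - math.cos(1.0))

ratio = error(0.02) / error(0.01)   # about 2^4 = 16 for a 4th-order method
```

The substep coefficients $(a, 1-2a, a)$ sum to 1 and satisfy $2a^3 + (1-2a)^3 = 0$, which is exactly why the $O(h^3)$ error terms of the midpoint substeps cancel, leaving a 4th-order symmetric method.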
Now we consider $s = 4$; the Butcher tableau can be represented as follows:

$$\begin{array}{c|cccc}
\dfrac{b_1}{2} & \dfrac{b_1}{2} & & & \\[2pt]
b_1 + \dfrac{b_2}{2} & b_1 & \dfrac{b_2}{2} & & \\[2pt]
b_1 + b_2 + \dfrac{b_3}{2} & b_1 & b_2 & \dfrac{b_3}{2} & \\[2pt]
b_1 + b_2 + b_3 + \dfrac{b_4}{2} & b_1 & b_2 & b_3 & \dfrac{b_4}{2} \\
\hline
 & b_1 & b_2 & b_3 & b_4
\end{array} \qquad (1.21)$$

We expect this method to have 4th-order accuracy. According to the Taylor expansion, the coefficients of the method must satisfy the system of equations:

$$\sum_{i=1}^{s} b_i = 1, \qquad (1.22)$$
$$\sum_{i=1}^{s} b_i c_i = \frac{1}{2}, \qquad (1.23)$$
$$\sum_{i=1}^{s} b_i c_i^2 = \frac{1}{3}, \qquad (1.24)$$
$$\sum_{i,j=1}^{s} b_i a_{ij} c_j = \frac{1}{6}, \qquad (1.25)$$
$$\sum_{i=1}^{s} b_i c_i^3 = \frac{1}{4}, \qquad (1.26)$$
$$\sum_{i,j=1}^{s} b_i c_i a_{ij} c_j = \frac{1}{8}, \qquad (1.27)$$
$$\sum_{i,j=1}^{s} b_i a_{ij} c_j^2 = \frac{1}{12}, \qquad (1.28)$$
$$\sum_{i,j,k=1}^{s} b_i a_{ij} a_{jk} c_k = \frac{1}{24}. \qquad (1.29)$$

Now we have 8 equations with 4 unknowns. Fortunately, a set of solutions was found by computer, namely

$$b_1 = -2.70309412, \quad b_2 = -0.53652708, \quad b_3 = 2.37893931, \quad b_4 = 1.8606818856.$$

Perhaps one can reduce the system to 4 equations with 4 unknowns and obtain the exact solution. For example, using $\sum_{i=1}^{s} b_i = 1$ and $b_i b_j - b_i a_{ij} - b_j a_{ji} = 0$ ($i, j = 1, 2, \cdots, s$), we have $\sum_{i,j=1}^{s} b_i a_{ij} = \sum_{i=1}^{s} b_i c_i = \dfrac{1}{2}$, so Equation (1.23) can be removed from the system. In an implementation of this R–K method, we rewrite it in the following form:
following form:

b1 h
Y1 = z k + f (Y1 ),
2
b h
Y2 = 2Y1 − z k + 2 f (Y2 ),
2
b h (1.30)
Y3 = 2Y2 − (2Y1 − z k ) + 3 f (Y3 ),
2
b h
Y4 = 2Y3 − (2Y2 − 2Y1 + z k ) + 4 f (Y4 ),
2
z k+1 = 2Y4 − (2Y3 − 2Y2 + 2Y1 − z k ).

Corollary 1.11. The scheme (1.30) can be obtained by applying the implicit midpoint scheme over 4 consecutive substeps of lengths $b_1h$, $b_2h$, $b_3h$, $b_4h$. It has 4th-order accuracy.

Let

$$\begin{aligned}
z^{\frac14} &= z^0 + b_1 h\, f\!\left(\frac{z^{\frac14} + z^0}{2}\right), \\
z^{\frac24} &= z^{\frac14} + b_2 h\, f\!\left(\frac{z^{\frac24} + z^{\frac14}}{2}\right), \\
z^{\frac34} &= z^{\frac24} + b_3 h\, f\!\left(\frac{z^{\frac34} + z^{\frac24}}{2}\right), \\
z^{1} &= z^{\frac34} + b_4 h\, f\!\left(\frac{z^{1} + z^{\frac34}}{2}\right),
\end{aligned} \qquad (1.31)$$

and rewrite it in the following form:

$$\begin{aligned}
\frac{z^{\frac14} + z^0}{2} &= z^0 + \frac{b_1 h}{2} f\!\left(\frac{z^{\frac14} + z^0}{2}\right), \\
\frac{z^{\frac24} + z^{\frac14}}{2} &= z^{\frac14} + \frac{b_2 h}{2} f\!\left(\frac{z^{\frac24} + z^{\frac14}}{2}\right), \\
\frac{z^{\frac34} + z^{\frac24}}{2} &= z^{\frac24} + \frac{b_3 h}{2} f\!\left(\frac{z^{\frac34} + z^{\frac24}}{2}\right), \\
\frac{z^{1} + z^{\frac34}}{2} &= z^{\frac34} + \frac{b_4 h}{2} f\!\left(\frac{z^{1} + z^{\frac34}}{2}\right).
\end{aligned} \qquad (1.32)$$

Let

$$Y_1 = \frac{z^0 + z^{\frac14}}{2}, \quad Y_2 = \frac{z^{\frac14} + z^{\frac24}}{2}, \quad Y_3 = \frac{z^{\frac24} + z^{\frac34}}{2}, \quad Y_4 = \frac{z^{\frac34} + z^{1}}{2};$$

then (1.32) becomes scheme (1.30).
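The equivalence asserted in Corollary 1.11 can also be checked directly. In the sketch below (our own), $f$ is the scalar linear test equation $dz/dt = \lambda z$, so each implicit stage is solvable in closed form; the $b_i$ are the computed values quoted above:

```python
# Running the recursion (1.30) and composing four implicit midpoint steps
# of sizes b_i*h as in (1.31) give identical results.
b = [-2.70309412, -0.53652708, 2.37893931, 1.8606818856]
lam = -1.0                       # test equation dz/dt = lam*z
h = 0.1

def stage(zprev, bi):
    # solves Y = zprev + (bi*h/2)*lam*Y  (scalar linear stage equation)
    return zprev / (1 - bi * h * lam / 2)

# scheme (1.30): acc carries 2Y_{i-1} - 2Y_{i-2} + ... +/- z^k
z0 = 1.0
acc = z0
for bi in b:
    Yi = stage(acc, bi)
    acc = 2 * Yi - acc
z30 = acc                        # z^{k+1} = 2Y4 - (2Y3 - 2Y2 + 2Y1 - z^k)

# composition (1.31): four midpoint steps of sizes b_i*h
z = z0
for bi in b:
    mid = stage(z, bi)           # midpoint value (z + z_new)/2
    z = 2 * mid - z
z31 = z
```

The two loops perform the identical sequence of operations, so `z30` and `z31` agree to machine precision, which is the content of the substitution $Y_i = (z^{(i-1)/4} + z^{i/4})/2$ above.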


There are similar results for s ≤ 3. More detail can be seen later in Section 8.1.
All schemes proposed in this section can be applied to general ODE’s as well.

Exercise 1.12. Does there exist 5-stage diagonally implicit R–K method with 5th-
order accuracy?

7.1.4 Rooted Tree Theory

1. High order derivatives and rooted tree theory


The basic tool for constructing numerical schemes for ordinary differential equations is Taylor expansion. If only a single (scalar) equation is considered, Taylor expansion can be used to study the convergence, consistency, and order conditions of R–K methods. However, if a system of differential equations is considered, direct Taylor expansion becomes intractable. Consider the system of ODEs:

$$\dot{y} = f(y), \qquad y(0) = \eta, \qquad f: \mathbf{R}^m \to \mathbf{R}^m, \quad m > 1. \qquad (1.33)$$



For brevity, let $m = 2$ and $y = ({}^1y, {}^2y)^{T}$, $f = ({}^1f, {}^2f)^{T}$, and introduce the notations

$${}^{i}f_j := \frac{\partial\, {}^{i}f}{\partial ({}^{j}y)}, \qquad {}^{i}f_{jk} := \frac{\partial^2 ({}^{i}f)}{\partial ({}^{j}y)\,\partial({}^{k}y)}.$$

We have

$$\begin{aligned}
{}^1y^{(1)} &= {}^1f, & {}^2y^{(1)} &= {}^2f, \\
{}^1y^{(2)} &= {}^1f_1({}^1f) + {}^1f_2({}^2f), & {}^2y^{(2)} &= {}^2f_1({}^1f) + {}^2f_2({}^2f).
\end{aligned} \qquad (1.34)$$

Using matrix and vector symbols, we have

$$y^{(2)} = \begin{bmatrix} {}^1f_1 & {}^1f_2 \\ {}^2f_1 & {}^2f_2 \end{bmatrix} f. \qquad (1.35)$$

The second-order derivative can thus be expressed via the Jacobian matrix. However, the third-order derivative $y^{(3)}$ can no longer be expressed via matrix and vector symbols, not to mention the higher-order derivatives. This has motivated people to study the structure of the Taylor expansions of high-order derivatives and to search for a better notation to simplify them. Thus the rooted tree theory$^{[\mathrm{But87,Lam91,HNW93,SSC94}]}$ (which uses rooted trees to express high-order derivatives) emerged. Take $y^{(3)}$ as an example:

$$\begin{aligned}
{}^1y^{(3)} &= {}^1f_{11}({}^1f)^2 + {}^1f_{12}({}^1f)({}^2f) + {}^1f_1\left({}^1f_1({}^1f) + {}^1f_2({}^2f)\right) \\
&\quad + {}^1f_{21}({}^2f)({}^1f) + {}^1f_{22}({}^2f)^2 + {}^1f_2\left({}^2f_1({}^1f) + {}^2f_2({}^2f)\right), \\
{}^2y^{(3)} &= {}^2f_{11}({}^1f)^2 + {}^2f_{12}({}^1f)({}^2f) + {}^2f_1\left({}^1f_1({}^1f) + {}^1f_2({}^2f)\right) \\
&\quad + {}^2f_{21}({}^2f)({}^1f) + {}^2f_{22}({}^2f)^2 + {}^2f_2\left({}^2f_1({}^1f) + {}^2f_2({}^2f)\right).
\end{aligned} \qquad (1.36)$$

Definition 1.13. Let $z, f(z) \in \mathbf{R}^m$, and let $f^{(M)}(z)$ be the $M$-th Frechet derivative of $f$. It is an operator on $\mathbf{R}^m \times \mathbf{R}^m \times \cdots \times \mathbf{R}^m$ ($M$ times), linear in each operand:

$$f^{(M)}(z)(K_1, K_2, \cdots, K_M) = \sum_{i=1}^{m} \sum_{j_1=1}^{m} \sum_{j_2=1}^{m} \cdots \sum_{j_M=1}^{m} {}^{i}f_{j_1 j_2 \cdots j_M}\; {}^{j_1}K_1\; {}^{j_2}K_2 \cdots {}^{j_M}K_M \cdot e_i, \qquad (1.37)$$

where $z$ is the argument, $K_1, K_2, \cdots, K_M$ are the operands,

$$K_t = \left[{}^1K_t, {}^2K_t, \cdots, {}^mK_t\right]^{T} \in \mathbf{R}^m, \quad t = 1, 2, \cdots, M, \qquad
{}^{i}f_{j_1 j_2 \cdots j_M} = \frac{\partial^{M}\, {}^{i}f(z)}{\partial({}^{j_1}z)\,\partial({}^{j_2}z) \cdots \partial({}^{j_M}z)}, \qquad (1.38)$$

and $e_i = [0, \cdots, 0, 1, 0, \cdots, 0]^{T} \in \mathbf{R}^m$ (with the 1 in the $i$-th position).

We have the following comments:
(1) The value of $f^{(M)}(z)(\cdots)$ is a vector in $\mathbf{R}^m$.
(2) Repeated subscripts are permitted in (1.37), so that all possible partial derivatives of order $M$ are involved. Thus, if $M = 3$, $m = 2$, the following partial derivatives appear ($i = 1, 2$):

$${}^{i}f_{111} = \frac{\partial^3({}^{i}f)}{\partial({}^1z)^3}; \quad
{}^{i}f_{112} = {}^{i}f_{121} = {}^{i}f_{211} = \frac{\partial^3({}^{i}f)}{\partial({}^1z)^2\,\partial({}^2z)}; \quad
{}^{i}f_{122} = {}^{i}f_{212} = {}^{i}f_{221} = \frac{\partial^3({}^{i}f)}{\partial({}^1z)\,\partial({}^2z)^2}; \quad
{}^{i}f_{222} = \frac{\partial^3({}^{i}f)}{\partial({}^2z)^3}.$$

(3) The argument $z$ simply denotes the vector with respect to whose components we perform the partial differentiations.
(4) An $M$-times Frechet derivative has $M$ operands. This is an important property to note.

Take $m = 2$; we have the following cases.

Case $M = 1$:
$$f^{(1)}(z)(K_1) = \sum_{i=1}^{2} \sum_{j_1=1}^{2} {}^{i}f_{j_1}({}^{j_1}K_1)\, e_i
= \begin{bmatrix} {}^1f_1({}^1K_1) + {}^1f_2({}^2K_1) \\ {}^2f_1({}^1K_1) + {}^2f_2({}^2K_1) \end{bmatrix}, \qquad (1.39)$$

where

$${}^{i}f_1 = \frac{\partial({}^{i}f)}{\partial({}^1z)}, \qquad {}^{i}f_2 = \frac{\partial({}^{i}f)}{\partial({}^2z)}, \qquad i = 1, 2.$$

Replacing $z$ with $y$ and $K_1$ with $f$, (1.39) becomes

$$f^{(1)}(y)(f(y)) = \begin{bmatrix} {}^1f_1({}^1f) + {}^1f_2({}^2f) \\ {}^2f_1({}^1f) + {}^2f_2({}^2f) \end{bmatrix} = y^{(2)}. \qquad (1.40)$$

(1.40) can be briefly denoted as

$$y^{(2)} = f^{(1)}(f). \qquad (1.41)$$

Case $M = 2$:

$$f^{(2)}(z)(K_1, K_2) = \sum_{i=1}^{2} \sum_{j_1=1}^{2} \sum_{j_2=1}^{2} {}^{i}f_{j_1 j_2}\,({}^{j_1}K_1)({}^{j_2}K_2)\, e_i
= \begin{bmatrix} {}^1f_{11}({}^1K_1)({}^1K_2) + {}^1f_{12}({}^1K_1)({}^2K_2) + {}^1f_{21}({}^2K_1)({}^1K_2) + {}^1f_{22}({}^2K_1)({}^2K_2) \\ {}^2f_{11}({}^1K_1)({}^1K_2) + {}^2f_{12}({}^1K_1)({}^2K_2) + {}^2f_{21}({}^2K_1)({}^1K_2) + {}^2f_{22}({}^2K_1)({}^2K_2) \end{bmatrix}.$$

Using $y$ in place of $z$ and letting $K_1 = K_2 = f$, we obtain

$$f^{(2)}(y)(f(y), f(y)) = \begin{bmatrix} {}^1f_{11}({}^1f)^2 + 2({}^1f_{12})({}^1f)({}^2f) + {}^1f_{22}({}^2f)^2 \\ {}^2f_{11}({}^1f)^2 + 2({}^2f_{12})({}^1f)({}^2f) + {}^2f_{22}({}^2f)^2 \end{bmatrix}, \qquad (1.42)$$

which is only part of the right side of (1.36), but not all. The absent terms are

$$\begin{bmatrix} {}^1f_1\left({}^1f_1({}^1f) + {}^1f_2({}^2f)\right) + {}^1f_2\left({}^2f_1({}^1f) + {}^2f_2({}^2f)\right) \\ {}^2f_1\left({}^1f_1({}^1f) + {}^1f_2({}^2f)\right) + {}^2f_2\left({}^2f_1({}^1f) + {}^2f_2({}^2f)\right) \end{bmatrix}. \qquad (1.43)$$

Now if we replace the operand $f(y)$ in (1.40) with $f^{(1)}(y)(f(y))$, the result is exactly (1.43). Hence, shortening the notation as in (1.41), (1.36) can be written as

$$y^{(3)} = f^{(2)}(f, f) + f^{(1)}\left(f^{(1)}(f)\right). \qquad (1.44)$$
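Identities (1.41) and (1.44) can be verified numerically for a concrete $f$. In the following sketch (our own; $f$ and $y_0$ are arbitrary choices), the elementary-differential formulas are compared with divided differences of an accurately computed solution:

```python
# Check y'' = f'(f) (1.41) and y''' = f''(f,f) + f'(f'(f)) (1.44) for
# f(y) = (y2^2, y1*y2), with hand-coded first and second Frechet derivatives.
f = lambda y: [y[1] * y[1], y[0] * y[1]]

def df(y, k):          # f'(y) k   (one operand)
    return [2 * y[1] * k[1], y[1] * k[0] + y[0] * k[1]]

def d2f(y, k1, k2):    # f''(y)(k1, k2)  (two operands, symmetric)
    return [2 * k1[1] * k2[1], k1[0] * k2[1] + k1[1] * k2[0]]

y0 = [0.3, 0.7]
F = f(y0)
y2 = df(y0, F)                                    # (1.41)
y3 = [a + b for a, b in zip(d2f(y0, F, F),        # (1.44)
                            df(y0, df(y0, F)))]

def rk4_solution(t, n=2000):
    """Accurate reference solution y(t) by classical RK4 with tiny steps."""
    y, h = list(y0), t / n
    for _ in range(n):
        k1 = f(y); k2 = f([a + h/2*v for a, v in zip(y, k1)])
        k3 = f([a + h/2*v for a, v in zip(y, k2)])
        k4 = f([a + h*v for a, v in zip(y, k3)])
        y = [a + h/6*(p + 2*q + 2*r + s) for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y

eps = 1e-2   # central divided differences for y''(0) and y'''(0)
ym2, ym1 = rk4_solution(-2 * eps), rk4_solution(-eps)
yp1, yp2 = rk4_solution(eps), rk4_solution(2 * eps)
num_y2 = [(a - 2 * b + c) / eps ** 2 for a, b, c in zip(yp1, y0, ym1)]
num_y3 = [(a - 2 * b + 2 * c - d) / (2 * eps ** 3)
          for a, b, c, d in zip(yp2, yp1, ym1, ym2)]
```

Up to the divided-difference truncation error, `num_y2` and `num_y3` reproduce `y2` and `y3`, i.e., the elementary-differential expressions of the solution's derivatives.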

Thus we have seen that $y^{(2)}$ is a single Frechet derivative of order 1, and that $y^{(3)}$ is a linear combination of Frechet derivatives of orders 1 and 2. In general, $y^{(p)}$ turns out to be a linear combination of Frechet derivatives of orders up to $p - 1$. The components of such linear combinations are called elementary differentials.
Definition 1.14. The elementary differentials $F_s: \mathbf{R}^m \to \mathbf{R}^m$ of $f$ and their orders are defined recursively as follows:
1° $f$ is the only elementary differential of order 1;
2° if $F_s$ ($s = 1, 2, \cdots, M$) are elementary differentials of orders $r_s$, then the Frechet derivative

$$f^{(M)}(F_1, F_2, \cdots, F_M) \qquad (1.45)$$

is an elementary differential of order

$$1 + \sum_{s=1}^{M} r_s. \qquad (1.46)$$

Remark 1.15. We have the following:
1° The elementary differentials $F_1, F_2, \cdots, F_M$ appearing as operands in (1.45) need not be distinct.
2° We write

$$\{F_1, F_2, \cdots, F_M\} := f^{(M)}(F_1, F_2, \cdots, F_M). \qquad (1.47)$$

3° The order of the elementary differential (1.47) is, by (1.46), the sum of the orders of its operand elementary differentials plus 1, i.e., $1 + \sum_{s=1}^{M} r_s$, where the 1 is "for the brackets".

Order 1 has only one elementary differential, namely $f$.
Order 2 has only one elementary differential, namely $f^{(1)}(f) = \{f\}$.
Order 3 has two elementary differentials:

$$M = 2 \Longrightarrow \text{operands } f, f \Longrightarrow \text{elementary differential } \{f^2\};$$
$$M = 1 \Longrightarrow \text{operand } f^{(1)}(f) = \{f\} \Longrightarrow \text{elementary differential } \{\{f\}\} = \{^2f\}_2.$$

Order 4 has four elementary differentials:

$$M = 3 \Longrightarrow \text{operands } f, f, f \Longrightarrow \{f^3\};$$
$$M = 2 \Longrightarrow \text{operands } f, \{f\} \Longrightarrow \{f\{f\}\};$$
$$M = 1 \Longrightarrow \text{operand } \{f^2\} \Longrightarrow \{^2f^2\}_2; \qquad \text{operand } \{^2f\}_2 \Longrightarrow \{^3f\}_3.$$

2. Labeled graphs

Let $n$ be a positive integer. A labeled $n$-graph $g$ is a pair $\{V, E\}$ formed by a set $V$ with $\mathrm{card}(V) = n$ and a set $E$ of unordered pairs $(v, w)$ of elements $v, w \in V$ with $v \neq w$; $E$ may be empty. The elements of $V$ and $E$ are called vertices and edges, respectively. Two vertices $v, w$ are said to be adjacent if $(v, w) \in E$. Fig. 1.1 shows labeled graphs for $n = 2, 3, 4$.

Fig. 1.1. Labeled graphs for n = 2, 3, 4

A graph can have many different labelings. However, for the same graph there exists an isomorphic mapping $\chi$ between any two of its labelings. For example, for one of the graphs in Fig. 1.1, depicted again in Fig. 1.2, we can take two labelings, as shown in Fig. 1.3.

Fig. 1.2. A labeled graph with 4 vertices (labels i, j, k, l)

Fig. 1.3. Two labelings of the same graph (labels i, j, k, l and m, n, p, q)



We take the mapping $\chi$ to be

$$\chi: i \mapsto m, \quad j \mapsto n, \quad k \mapsto p, \quad l \mapsto q,$$

i.e., $\chi: V_1 \to V_2$, $V_1 = \{i, j, k, l\}$, $V_2 = \{m, n, p, q\}$; then

$$\chi: (i, j) \mapsto (m, n), \quad (j, k) \mapsto (n, p), \quad (k, l) \mapsto (p, q),$$

i.e., $\chi: E_1 \to E_2$, with $E_1 = \{(i,j), (j,k), (k,l)\}$ and $E_2 = \{(m,n), (n,p), (p,q)\}$. Therefore $\chi: Lg_1 \to Lg_2$, where $Lg_1 = \{V_1, E_1\}$ and $Lg_2 = \{V_2, E_2\}$ are the two labelings in Fig. 1.3. In fact, if there exists an isomorphic mapping between two labeled graphs $g_1, g_2$, they can be regarded as two labelings of the same graph. Therefore a graph is an equivalence class, consisting of the various labeled graphs corresponding to its different labelings; these labeled graphs are equivalent, i.e., there exists an isomorphic mapping between any two of them.
3. Relationship between rooted trees and elementary differentials

Next we show that there is a one-to-one correspondence between elementary differentials and rooted trees.
(1) Let $f$ be the unique elementary differential of order 1. Then $f$ corresponds to the unique tree of order 1, which consists of a single vertex.
(2) If the elementary differentials $F_s$ of orders $r_s$ ($s = 1, 2, \cdots, M$) correspond to trees $t_s$ of orders $r_s$, then the elementary differential $\{F_1, F_2, \cdots, F_M\}$ of order $1 + \sum_{s=1}^{M} r_s$ corresponds to the tree of order $1 + \sum_{s=1}^{M} r_s$ obtained by grafting the $M$ trees $t_s$ ($s = 1, 2, \cdots, M$) onto a new root.
Example 1.16. If $F_1 \sim t_1$, $F_2 \sim t_2$, $F_3 \sim t_3$, then $\{F_1, F_2, F_3\}$ corresponds to the tree obtained by grafting $t_1$, $t_2$, and $t_3$ onto a new common root.
We need a notation for trees similar to that for elementary differentials. All trees can be written as combinations of the symbol $\tau$, for the unique tree of order 1 (consisting of a single node), and the bracket symbol $[\cdots]$, meaning that the trees appearing between the brackets are grafted onto a new root. We shall denote $n$ copies of a tree $t_1$ by $t_1^n$, $k$-fold opening brackets $[[[\cdots[$ by $[_k$, and $k$-fold closing brackets $]\cdots]]]$ by $]_k$. For example,


$$t_1 = [\tau], \qquad t_2 = [\tau[\tau]] = [[\tau]\tau] = [\tau[\tau]_2, \qquad t_3 = [t_1\, t_2^2] = \left[\,[\tau]\,[\tau[\tau]]\,[\tau[\tau]]\,\right] = [_2\tau][\tau[\tau]_2[\tau[\tau]_3.$$

Definition 1.17. The order $r(t)$, symmetry $\sigma(t)$, and density (tree factorial) $\gamma(t)$ are defined by $r(\tau) = \sigma(\tau) = \gamma(\tau) = 1$ and

$$\begin{aligned}
r\left([t_1^{n_1} t_2^{n_2} \cdots]\right) &= 1 + n_1 r(t_1) + n_2 r(t_2) + \cdots \quad \text{(the number of vertices)}, \\
\sigma\left([t_1^{n_1} t_2^{n_2} \cdots]\right) &= n_1!\, n_2! \cdots \left(\sigma(t_1)\right)^{n_1}\left(\sigma(t_2)\right)^{n_2} \cdots, \\
\gamma\left([t_1^{n_1} t_2^{n_2} \cdots]\right) &= r\left([t_1^{n_1} t_2^{n_2} \cdots]\right)\left(\gamma(t_1)\right)^{n_1}\left(\gamma(t_2)\right)^{n_2} \cdots.
\end{aligned}$$

Let $\alpha(t)$ (the tree multiplicity) be the number of essentially different ways of labeling the vertices of the tree $t$ with the integers $1, 2, \cdots, r(t)$ such that the labels increase monotonically along every path from the root. Essentially different labelings are illustrated in the following examples.

Example 1.18. Let $t_1 = [\tau^3]$, a root with three leaf children. Labelings that differ only by a permutation of the three children (e.g., children labeled $2, 3, 4$ versus $4, 2, 3$, with the root labeled 1) are not regarded as essentially different, hence $\alpha(t_1) = 1$.

Example 1.19. Let $t_2 = [\tau[\tau]]$. The three labelings in which the root's leaf child receives the label 2, 3, or 4 are regarded as essentially different, so $\alpha(t_2) = 3$.
From the above, we have an easy way of computing $\alpha(t)$, namely

$$\alpha(t) = \frac{r(t)!}{\sigma(t)\gamma(t)}. \qquad (1.48)$$
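Definition 1.17 and formula (1.48) translate directly into a short recursion. In the sketch below (our own), trees are coded as nested tuples with $\tau = ()$:

```python
# r, sigma, gamma of Definition 1.17 and alpha of (1.48), on trees coded
# as nested tuples: the single-vertex tree tau is (), and [t1 t2 ...] is
# the tuple of its subtrees.
from math import factorial
from collections import Counter

def r(t):                       # order: number of vertices
    return 1 + sum(r(s) for s in t)

def sigma(t):                   # symmetry
    out = 1
    for sub, n in Counter(t).items():
        out *= factorial(n) * sigma(sub) ** n
    return out

def gamma(t):                   # density ("tree factorial")
    out = r(t)
    for sub in t:
        out *= gamma(sub)
    return out

def alpha(t):                   # number of monotone labelings, (1.48)
    return factorial(r(t)) // (sigma(t) * gamma(t))

tau = ()
t1 = (tau, tau, tau)            # [tau^3], Example 1.18
t2 = (tau, (tau,))              # [tau [tau]], Example 1.19
```

For $t_1 = [\tau^3]$ this gives $r = 4$, $\sigma = 6$, $\gamma = 4$, $\alpha = 1$, and for $t_2 = [\tau[\tau]]$ it gives $r = 4$, $\sigma = 1$, $\gamma = 8$, $\alpha = 3$, in agreement with Examples 1.18 and 1.19.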

4. Order conditions for multi-stage R–K method


Definition 1.20. The function $F$ is defined on the set $T$ of all trees by

$$F(\tau) = f, \qquad F([t_1, t_2, \cdots, t_M]) = \{F(t_1), F(t_2), \cdots, F(t_M)\}. \qquad (1.49)$$

The proofs of the following two theorems were established by Butcher; see [But87].

Theorem 1.21. Let $\dot{y} = f(y)$, $f: \mathbf{R}^m \to \mathbf{R}^m$. Then

$$y^{(q)} = \sum_{r(t) = q} \alpha(t) F(t), \qquad (1.50)$$

where $F(t)$ is defined by (1.49) and $\alpha(t)$ by (1.48).

Below we apply this theorem for $q \leq 4$ to obtain $y^{(q)}$:

$$y^{(2)} = \{f\}, \qquad y^{(3)} = \{f^2\} + \{^2f\}_2, \qquad y^{(4)} = \{f^3\} + 3\{f\{f\}\} + \{^2f^2\}_2 + \{^3f\}_3.$$

Let $y_n(h)$ denote the numerical solution defined by the right side of Equation (1.4), expanded as a Taylor series about $h = 0$, and compare with the expansion of the exact solution

$$y(x_{n+1}) = y(x_n) + h\, y^{(1)}(x_n) + \frac{1}{2} h^2 y^{(2)}(x_n) + \cdots, \qquad (1.51)$$

where

$$\left.\frac{d^q}{dh^q}\, y_n(h)\right|_{h=0} = \sum_{r(t)=q} \alpha(t)\gamma(t)\phi(t)F(t). \qquad (1.52)$$

We first slightly modify the notation in the Butcher array: let $a_{s+1,i} = b_i$ ($i = 1, 2, \cdots, s$). The resulting quantities are listed in Table 1.2.

Definition 1.22. 1° For $i = 1, 2, \cdots, s, s+1$, define the functions $\phi_i$ on the set $T$ of all trees by

$$\phi_i(\tau) = \sum_{j=1}^{s} a_{ij}, \qquad \phi_i([t_1, t_2, \cdots, t_M]) = \sum_{j=1}^{s} a_{ij}\, \phi_j(t_1)\phi_j(t_2) \cdots \phi_j(t_M).$$

2° Define $\phi(t) = \phi_{s+1}(t)$.


296 7. Symplectic Runge–Kutta Methods

Table 1.2. Trees and elementary differentials up to order 4

tree t     F(t)                   r(t)  σ(t)  γ(t)  α(t)  φ_i(t), i = 1, · · · , s       φ(t)
τ          f                       1     1     1     1    Σ_j a_ij (= c_i)               Σ_i b_i
[τ]        {f} = f′f               2     1     2     1    Σ_j a_ij c_j                   Σ_i b_i c_i
[τ²]       {f²} = f″(f, f)         3     2     3     1    Σ_j a_ij c_j²                  Σ_i b_i c_i²
[[τ]]      {2 f}2 = f′f′f          3     1     6     1    Σ_{j,k} a_ij a_jk c_k          Σ_{i,j} b_i a_ij c_j
[τ³]       {f³} = f‴(f, f, f)      4     6     4     1    Σ_j a_ij c_j³                  Σ_i b_i c_i³
[τ[τ]]     {f{f}2} = f″(f, f′f)    4     1     8     3    Σ_{j,k} a_ij c_j a_jk c_k      Σ_{i,j} b_i c_i a_ij c_j
[[τ²]]     {2 f²}2 = f′f″(f, f)    4     2     12    1    Σ_{j,k} a_ij a_jk c_k²         Σ_{i,j} b_i a_ij c_j²
[[[τ]]]    {3 f}3 = f′f′f′f        4     1     24    1    Σ_{j,k,n} a_ij a_jk a_kn c_n   Σ_{i,j,k} b_i a_ij a_jk c_k

Remark 1.23. The functions φ_i have the following representations on the set T:

    φ_i(τ) = Σ_j a_ij = c_i,    ∀ i = 1, 2, · · · , s,
    φ(τ) = Σ_j a_{s+1,j} = Σ_j b_j = Σ_i b_i,
    φ_i([τ]) = Σ_j a_ij φ_j(τ) = Σ_j a_ij c_j,    ∀ i = 1, 2, · · · , s,
    φ([τ]) = Σ_j a_{s+1,j} c_j = Σ_j b_j c_j = Σ_i b_i c_i,
    φ_i([[τ]]) = Σ_j a_ij φ_j([τ]) = Σ_{j,k} a_ij a_jk c_k,    ∀ i = 1, 2, · · · , s,
    φ([[τ]]) = Σ_{j,k} a_{s+1,j} a_jk c_k = Σ_{j,k} b_j a_jk c_k = Σ_{i,j} b_i a_ij c_j.

Theorem 1.24. An R–K method has order p if

    φ(t) = Σ_i b_i φ_i(t) = 1/γ(t),    ∀ r(t) ≤ p, t ∈ T,    (1.53)

and (1.53) does not hold for some tree of order p + 1.

Table 1.3 gives the number of trees of each order; the resulting number of order conditions
for the multi-stage R–K method up to order 10 is shown in Table 1.4.

Table 1.3. Number of trees up to order 10

order p              1  2  3  4  5   6   7    8    9    10
number of trees Tp   1  1  2  4  9  20  48  115  286  719

Table 1.4. Number of conditions up to order 10

order p                 1  2  3  4   5   6   7    8    9    10
number of conditions    1  2  4  8  17  37  85  200  486  1205
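As a sketch of ours (not part of the book's text), Theorem 1.24 can be verified numerically for a concrete method: encoding each tree of Table 1.2 as a tuple of its sub-trees and computing the elementary weights φ(t) recursively from Definition 1.22, the classical 4-stage R–K method satisfies φ(t) = 1/γ(t) for every tree of order ≤ 4 (the tableau and the γ values are the standard ones from Table 1.2).

```python
import numpy as np

def phi_vec(t, A):
    """Stage weights (phi_1(t), ..., phi_s(t)) of Definition 1.22;
    a tree is a tuple of its sub-trees, tau = ()."""
    prod = np.ones(A.shape[0])
    for child in t:
        prod = prod * phi_vec(child, A)
    return A @ prod

def phi(t, A, b):
    """Elementary weight phi(t) = phi_{s+1}(t)."""
    prod = np.ones(len(b))
    for child in t:
        prod = prod * phi_vec(child, A)
    return b @ prod

# classical 4th-order Runge-Kutta tableau
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1/6, 1/3, 1/3, 1/6])

tau = ()
trees_and_gamma = [  # the eight trees of Table 1.2 with their densities gamma(t)
    (tau, 1), ((tau,), 2), ((tau, tau), 3), (((tau,),), 6),
    ((tau, tau, tau), 4), ((tau, (tau,)), 8), (((tau, tau),), 12),
    ((((tau,),),), 24),
]
for t, g in trees_and_gamma:
    assert abs(phi(t, A, b) - 1 / g) < 1e-14
print("all order conditions up to order 4 hold")
```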

7.1.5 Simplified Conditions for Symplectic R–K Method


There are four types of trees, which can be defined as follows[SA91] :
(1) A labeled n-tree λτ is a labeled n-graph {V, E} such that for any pair of
distinct vertices v and w there exists a unique path joining v and w.
(2) Two labeled n-trees {V1, E1}, {V2, E2} are said to be isomorphic if there exists a
bijection of V1 onto V2 that transforms the edges in E1 into the edges in E2. An n-tree τ
is an equivalence class that consists of all labeled n-trees isomorphic to a given one. Each
of the labeled n-trees that represents τ is called a labeling of τ.
(3) A rooted labeled n-tree ρλτ is a labeled n-tree in which one of the vertices
r, called the root, has been highlighted. The vertices adjacent to the root are called the
sons of the root; the sons of the remaining vertices are defined in an obvious recursive
way. In fact, once some vertex is designated as the root, the tree becomes a directed graph,
i.e., any edge (v, w) in the set E acquires a direction representing the relationship between
father and son. Let T be the mapping from son to father. Since any vertex v has a path to
the root, e.g. v = v0, v1, · · · , vm = r, the root r may be obtained through the sequential action
of T on v. Therefore a direction can be defined from v to r, and the entire tree
becomes oriented.
(4) Two rooted labeled n-trees {V1, E1, r1}, {V2, E2, r2} are said to be root-isomorphic
if a bijection of V1 onto V2 exists that transforms edges in E1 onto E2 and maps
r1 onto r2. A rooted n-tree ρτ is an equivalence class that comprises a rooted
labeled n-tree and all rooted labeled n-trees root-isomorphic to it. Fig. 1.4 is an example
of a rooted tree.
Fig. 1.5 shows that there is only one unlabeled 3-tree for n = 3, τ31, which represents
three different labeled trees denoted by (A, B, C). Each labeled tree represents
three rooted labeled trees denoted by lower-case letters (a, b, · · ·). The 9 rooted labeled
trees can be classified into two rooted trees ρτ31, ρτ32. The tree τ31 in the last row
can be considered as the result of the identification of ρτ31 with ρτ32. In general, trees
can be considered to be equivalence classes of rooted trees, because a root isomorphism
is an isomorphism. For each rooted tree ρτ, we denote by α(ρτ) the number of its
monotonic rooted labeled trees; these allow only the so-called monotonic rooted labelings,
in which each vertex is labeled with an integer (≤ n) smaller than the labels of all
its sons.

Fig. 1.4. A rooted tree

Unless otherwise specified, it is assumed that the set of vertices of a labeled n-graph
is always {1, 2, · · · , n}. In order to clarify the above four types of trees, we use
Fig. 1.5 as an illustration.

Fig. 1.5. The 3-tree τ31, its labeled 3-trees (A)–(C), rooted labeled 3-trees (a)–(i), and the
rooted 3-trees ρτ31, ρτ32


Fig. 1.6. Rooted n-trees (n = 1, 2)

Fig. 1.7. Rooted n-trees (n = 3, 4)

Fig. 1.8. The four rooted trees of Lemma 1.25

Superfluous trees. Let τ be an n-tree and choose one of its labelings λτ. This
labeling gives rise to n different rooted labeled trees ρλτ1, · · · , ρλτn, where ρλτi has
its root at the integer i (1 ≤ i ≤ n). If for each edge (i, j) in λτ the trees ρτi and ρτj represent
different rooted trees, then τ is called non-superfluous. Consider the 3-tree τ31 in
Fig. 1.5. Choosing the labeled 3-tree A, we see that for the edge 1–2, choosing 1
as the root leads to ρτ31 and choosing 2 as the root leads to ρτ32; for the edge 2–3,
choosing 2 as the root leads to ρτ32 and choosing 3 as the root leads to ρτ31. Therefore
τ31 is non-superfluous. On the other hand, the 4-tree with the labeling 1–2–3–4 is superfluous (see
Figs. 1.6 and 1.7), since changing the root from 2 to the adjacent 3 does not result in
different rooted trees.
In order to simplify the order conditions for symplectic R–K methods, we need some lemmas.
Before introducing them, let us first look at the four rooted trees of Fig. 1.8:
the rooted tree ρτi (i.e., rooted at i), the rooted tree ρτj (rooted at j), and the rooted
trees ρτI, ρτJ, which have their roots at the vertices i and j respectively and arise when
the edge joining i and j is removed from the graph in the top left-hand corner.

Lemma 1.25. With the above notations,

1°
    1/γ(ρτi) + 1/γ(ρτj) = (1/γ(ρτI)) · (1/γ(ρτJ)).    (1.54)

2° For a symplectic R–K method, the weighted coefficients of the elementary differentials
satisfy
    φ(ρτi) + φ(ρτj) = φ(ρτI)φ(ρτJ).    (1.55)

3° For a symplectic R–K method of order ≥ (r − 1),

    φ(ρτi) + φ(ρτj) = 1/γ(ρτi) + 1/γ(ρτj).    (1.56)

Therefore the order condition for ρτi holds iff the order condition for ρτj holds.

Proof. By the definition of γ, we have

    γ(ρτi) = r γ(ρτJ) γ(ρτI)/r(ρτI),    (1.57)

    γ(ρτj) = r γ(ρτI) γ(ρτJ)/r(ρτJ),    (1.58)

where r(ρτI) and r(ρτJ) are the orders of ρτI and ρτJ. Inserting (1.57) and (1.58) and using
r(ρτI) + r(ρτJ) = r, we obtain (1.54). Rewrite the left side of formula (1.55) as

    φ(ρτi) + φ(ρτj) = Σ_{ij···} b_i a_ij ∏ + Σ_{ij···} b_j a_ji ∏,    (1.59)

where ∏ represents a product of r − 2 factors a_kl. Equality (1.55) can then be obtained
using the symplecticity condition (1.15) of the symplectic R–K method. ∎
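As a small numerical illustration of ours, identity (1.54) can be checked on the 3-chain of Fig. 1.5: rooting the chain at an end vertex gives [[τ]] with γ = 6, rooting at the middle vertex gives [τ²] with γ = 3, and deleting the edge between these two vertices leaves the pieces τ (γ = 1) and [τ] (γ = 2), so that 1/6 + 1/3 = (1/1)·(1/2).

```python
from fractions import Fraction

def r(t):       # order of a rooted tree encoded as a tuple of sub-trees
    return 1 + sum(r(c) for c in t)

def gamma(t):   # density gamma(t) = r(t) * product of gamma over sub-trees
    g = r(t)
    for c in t:
        g *= gamma(c)
    return g

tau = ()
chain = ((tau,),)              # [[tau]]: 3-chain rooted at an end vertex
star = (tau, tau)              # [tau^2]: 3-chain rooted at its middle vertex
piece1, piece2 = tau, (tau,)   # components left after deleting the edge (i, j)

lhs = Fraction(1, gamma(chain)) + Fraction(1, gamma(star))
rhs = Fraction(1, gamma(piece1)) * Fraction(1, gamma(piece2))
print(lhs, rhs)   # -> 1/2 1/2
```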

Example 1.26. See the simple examples below. For the rooted trees ρτ^v and ρτ^w of Fig. 1.8,
where v carries the sons 1, 2 and w the sons 3, 4, we have

    φ(ρτ^v) = Σ_{v,w,1,···,4} b_{iv} a_{iv iw} a_{iv i1} a_{iv i2} a_{iw i3} a_{iw i4},

    φ(ρτ^w) = Σ_{v,w,1,···,4} b_{iw} a_{iw iv} a_{iw i3} a_{iw i4} a_{iv i1} a_{iv i2}.

From Fig. 1.8 we also have, for the trees obtained by deleting the edge (v, w),

    φ(ρτv) = Σ_{v,1,2} b_{iv} a_{iv i1} a_{iv i2},

    φ(ρτw) = Σ_{w,3,4} b_{iw} a_{iw i3} a_{iw i4}.
Theorem 1.27. [SA91] Assume that a symplectic R–K method satisfies the order conditions
for order ≥ (r − 1), with r ≥ 2. Then, to ensure that the method has
order ≥ r, it is sufficient that, for each non-superfluous tree τ with r vertices, there is
one rooted tree ρτ associated with τ for which

    φ(ρτ) = 1/γ(ρτ).    (1.60)

Proof. Choose first a non-superfluous tree τ and assume that condition (1.60) is satisfied
for a suitable rooted tree ρτi of τ. By Lemma 1.25, choosing j as any of the
vertices adjacent to i, condition (1.56) shows that the order condition (1.60) is also satisfied
for ρτj. Since any two vertices of a tree can be joined through a chain of pairwise
adjacent vertices, iterating this argument leads to the conclusion that the method
satisfies the order conditions that arise from any rooted tree in τ. In the case of a
superfluous tree τ, it is by definition possible to choose adjacent vertices i, j such
that ρτi and ρτj are in fact the same rooted tree. Then condition (1.56) shows that
(1.60) holds for the rooted tree ρτi. Therefore (1.60) holds for all rooted trees in τ. ∎

Example 1.28. For r = 2, there is only one tree, τ21; it is a superfluous tree.

Example 1.29. For r = 3, there is again only one tree, τ31. It has two rooted trees
ρτ31, ρτ32. Hence the order conditions become

    Σ_{i,j,k=1}^{3} b_i a_ij a_ik = 1/3,    or    Σ_{i,j,k=1}^{3} b_i a_ij a_jk = 1/6.

Example 1.30. For r = 4, there is only one non-superfluous tree, τ42. We impose
either the order conditions for ρτ43 or the order conditions for ρτ44.

We see that for symplectic R–K methods it is sufficient to impose the order conditions
only for non-superfluous trees rather than for every rooted tree. The resulting reduction
in the number of order conditions is given in Table 1.5, which compares the order conditions
of the R–K (R–K–N) methods with those of their symplectic counterparts.

Table 1.5. Numbers of order conditions: R–K (R–K–N) methods versus symplectic R–K (R–K–N) methods


Order R–K method symp. R–K method R–K–N method symp. R–K–N method
1 1 1 1 1
2 2 1 2 2
3 4 2 4 4
4 8 3 7 6
5 17 6 13 10
6 37 10 23 15
7 85 21 43 25
8 200 40 79 39

Example 1.31. For the diagonal symplectic R–K method, see tableau (1.16).
If r = 3, according to Theorem 1.27 and Table 1.5, the symplectic R–K method has
only two conditions. One condition is for r = 1,

    b1 + b2 + b3 = 1.    (1.61)

The other condition is for r = 3, which has only one non-superfluous tree with two
rooted trees, ρτ31, ρτ32. Choosing one of them,

    r(ρτ31)φ(ρτ31) = 1,    i.e.,    Σ_{i=1}^{3} b_i c_i² = 1/3.

After simplification, we obtain

    b1³ + b2³ + b3³ = 0.    (1.62)

Since we have two equations with three unknowns, one of them can be freely
chosen, for example:

    b1 = b3 = 1/(2 − ∛2),    b2 = −∛2/(2 − ∛2),    see (1.20).    (1.63)
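The coefficients (1.63) can be verified directly; the following sketch (ours, not the book's) checks conditions (1.61) and (1.62) in floating point.

```python
# Coefficients (1.63) of the 3-stage diagonal symplectic R-K method
cbrt2 = 2.0 ** (1.0 / 3.0)
b1 = b3 = 1.0 / (2.0 - cbrt2)
b2 = -cbrt2 / (2.0 - cbrt2)

# condition (1.61): b1 + b2 + b3 = 1
assert abs(b1 + b2 + b3 - 1.0) < 1e-12
# condition (1.62): b1^3 + b2^3 + b3^3 = 0
assert abs(b1 ** 3 + b2 ** 3 + b3 ** 3) < 1e-12
print(b1, b2)
```

Both conditions hold exactly in exact arithmetic, since 1 + 1 + (−∛2)³ = 2 − 2 = 0 in the numerator of (1.62).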

7.2 Symplectic P–R–K Method

In this section we study the symplectic Partitioned Runge–Kutta method (P–R–K method).

7.2.1 P–R–K Method

In this subsection we focus on a special class of Hamiltonian systems, namely separable
systems:
    H(p, q) = u(p) + v(q).    (2.1)
Its corresponding Hamiltonian equations are

    dp/dt = −v_q(q) = f(q),
    dq/dt = u_p(p) = g(p).    (2.2)

Let us suppose that the component p in the first equation of system (2.2) is integrated by
one R–K method and the component q in the second equation is integrated with
a different R–K method. The overall scheme is called a Partitioned Runge–Kutta
method, or P–R–K method for short. It can be specified by two Butcher tableaux:

    c1 | a11 · · · a1s        C1 | A11 · · · A1s
    ·· | ···  ···  ···        ·· | ···  ···  ···
    cs | as1 · · · ass        Cs | As1 · · · Ass
       | b1  · · · bs            | B1  · · · Bs
                                                    (2.3)
The application of (2.3) to the system (2.2) results in

    P_i = p^n + h Σ_{j=1}^{s} a_ij f(Q_j),
    Q_i = q^n + h Σ_{j=1}^{s} A_ij g(P_j),        i = 1, · · · , s,
                                                                    (2.4)
    p^{n+1} = p^n + h Σ_{i=1}^{s} b_i f(Q_i),
    q^{n+1} = q^n + h Σ_{i=1}^{s} B_i g(P_i).

The entries of these tableaux are the coefficients of the P–R–K method.
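A minimal sketch of one step of (2.4), ours rather than the book's: the implicit stage equations are solved here by plain fixed-point iteration, which suffices for small h (an implicit tableau would normally warrant a Newton solver). The 1-stage pair a = [[1]], A = [[0]], b = B = [1] used below is the symplectic Euler method written as a P–R–K pair; it satisfies condition (2.5) of Theorem 2.1.

```python
import numpy as np

def prk_step(p, q, h, f, g, a, b, A, B, sweeps=50):
    """One step of the P-R-K scheme (2.4) for dp/dt = f(q), dq/dt = g(p).
    Stage values P_i, Q_i are found by fixed-point iteration (a sketch)."""
    s = len(b)
    P = np.tile(p, (s, 1))
    Q = np.tile(q, (s, 1))
    for _ in range(sweeps):
        fQ = np.array([f(Qi) for Qi in Q])
        gP = np.array([g(Pi) for Pi in P])
        P = p + h * (a @ fQ)
        Q = q + h * (A @ gP)
    p_new = p + h * (b @ np.array([f(Qi) for Qi in Q]))
    q_new = q + h * (B @ np.array([g(Pi) for Pi in P]))
    return p_new, q_new

# harmonic oscillator H = (p^2 + q^2)/2: f(q) = -q, g(p) = p
a = np.array([[1.0]]); A = np.array([[0.0]])   # symplectic Euler as a P-R-K pair
b = B = np.array([1.0])
p, q = np.array([0.0]), np.array([1.0])
p, q = prk_step(p, q, 0.1, lambda q: -q, lambda p: p, a, b, A, B)
print(p, q)   # one symplectic-Euler step: p = -0.1, q = 0.99
```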

Theorem 2.1. If the coefficients of the P–R–K method (2.4) satisfy the conditions

    M = b_i A_ij + B_j a_ji − b_i B_j = 0,    i, j = 1, · · · , s,    (2.5)

then this P–R–K method is symplectic[AS93,Sur90,Sun93b,SSM92].

Proof. Let
    K_i = f(Q_i),    l_i = g(P_i).    (2.6)
Then

    d p^{n+1} ∧ d q^{n+1} − d p^n ∧ d q^n
      = h Σ_{i=1}^{s} ( b_i d K_i ∧ d Q_i + B_i d P_i ∧ d l_i )
      − h² Σ_{i,j=1}^{s} ( b_i A_ij + B_j a_ji − b_i B_j ) d K_i ∧ d l_j.    (2.7)

Note that the first term on the right side of Equation (2.7) vanishes,

    Σ_{i=1}^{s} ( b_i d K_i ∧ d Q_i + B_i d P_i ∧ d l_i )
      = Σ_{i=1}^{s} ( −b_i v_qq d Q_i ∧ d Q_i + B_i u_pp d P_i ∧ d P_i ) = 0,

by the symmetry of v_qq and u_pp. Hence, for the right side of (2.7) to vanish, it is
sufficient to require

    b_i A_ij + B_j a_ji − b_i B_j = 0,

which is condition (2.5). Therefore the theorem is proved. ∎
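Condition (2.5) is easy to test numerically. The sketch below (ours) checks it for the 2-stage Lobatto IIIA–IIIB pair quoted later in Example 2.14.

```python
import numpy as np

# 2-stage Lobatto IIIA tableau (a, b) and Lobatto IIIB tableau (A, B)
a = np.array([[0.0, 0.0], [0.5, 0.5]])
A = np.array([[0.5, 0.0], [0.5, 0.0]])
b = B = np.array([0.5, 0.5])

# M_ij = b_i A_ij + B_j a_ji - b_i B_j, condition (2.5)
M = b[:, None] * A + B[None, :] * a.T - np.outer(b, B)
print(np.allclose(M, 0))   # -> True
```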

The W-transformation (defined below), proposed by Hairer and Wanner in 1981, is intended
to simplify the order conditions C(·) and D(·) as well as their relationship. Through
the W-transformation it is easy to construct higher-order R–K methods[HW81]. Let us suppose
the polynomials p_i (0 ≤ i ≤ s − 1) are orthonormal with respect to the inner product

    (p, q) = Σ_{i=1}^{s} b_i p(c_i) q(c_i).

Introducing the matrix

    W = [ p0(c1)  p1(c1)  · · ·  p_{s−1}(c1)
          p0(c2)  p1(c2)  · · ·  p_{s−1}(c2)
          · · ·   · · ·   · · ·  · · ·
          p0(cs)  p1(cs)  · · ·  p_{s−1}(cs) ],

we have, by the orthonormality of the p_i (i = 0, 1, · · · , s − 1),

    WᵀBW = I.

We take p_k(x) to be the standard shifted Legendre polynomials, defined by

    p_k(x) = √(2k + 1) Σ_{i=0}^{k} (−1)^{k+i} binom(k, i) binom(k + i, i) x^i,    k = 0, 1, · · · , s − 1.

For an s-stage R–K method (A, b, c), let X = W⁻¹AW = WᵀBAW; then, for the
high-order R–K methods based on the high-order quadrature formulas, the transformation
matrix X is given in Table 2.1[Sun93a,Sun94,Sun95].
Table 2.1. The form of the matrix X

method          X_{s,s−1}     X_{s−1,s}     X_{s,s}          symplectic
Gauss           ξ_{s−1}       −ξ_{s−1}      0                yes
Lobatto IIIA    ξ_{s−1}u      0             0                no
Lobatto IIIB    0             −ξ_{s−1}u     0                no
Lobatto IIIC    ξ_{s−1}u      −ξ_{s−1}u     u²/(2(2s − 1))   no
Lobatto IIIS    ξ_{s−1}uσ     −ξ_{s−1}uσ    0                yes
Radau IA        ξ_{s−1}       −ξ_{s−1}      1/(4s − 2)       no
Radau IIA       ξ_{s−1}       −ξ_{s−1}      1/(4s − 2)       no
Radau IB        ξ_{s−1}       −ξ_{s−1}      0                yes
Radau IIB       ξ_{s−1}       −ξ_{s−1}      0                yes

For the Gauss–Legendre method,

    X_GL = [ 1/2    −ξ1
             ξ1      0     −ξ2                    O
                     ξ2     0      ···
                           ···     ···      −ξ_{s−1}
              O                   ξ_{s−1}      0      ]

(a tridiagonal matrix, zeros elsewhere), where ξ_k = 1/(2√(4k² − 1)), k = 1, · · · , s − 1.
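As a sketch of ours, the statements above can be verified for the 2-stage Gauss method: with the shifted Legendre polynomials p0(x) = 1 and p1(x) = √3(2x − 1), one finds WᵀBW = I and X = W⁻¹AW = [[1/2, −ξ1], [ξ1, 0]] with ξ1 = 1/(2√3).

```python
import numpy as np

# 2-stage Gauss method
c = np.array([0.5 - np.sqrt(3) / 6, 0.5 + np.sqrt(3) / 6])
b = np.array([0.5, 0.5])
Amat = np.array([[0.25, 0.25 - np.sqrt(3) / 6],
                 [0.25 + np.sqrt(3) / 6, 0.25]])

# W_{ik} = p_k(c_i) with shifted Legendre p0(x) = 1, p1(x) = sqrt(3)(2x - 1)
W = np.column_stack([np.ones(2), np.sqrt(3) * (2 * c - 1)])
Bmat = np.diag(b)

xi1 = 1 / (2 * np.sqrt(3))
X_expected = np.array([[0.5, -xi1], [xi1, 0.0]])

print(np.allclose(W.T @ Bmat @ W, np.eye(2)))                # -> True
print(np.allclose(np.linalg.inv(W) @ Amat @ W, X_expected))  # -> True
```

The zero in the (2, 2) entry of X is exactly the "symplectic: yes" signature of the Gauss row in Table 2.1.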
Corollary 2.2. If the coefficients of a P–R–K method satisfy b_i = B_i, c_i = C_i, with c_i and
b_i ≠ 0 (i = 1, · · · , s), and

    WᵀMW = X + X̄ᵀ − e1 e1ᵀ = 0,

where X and X̄ denote the transformation matrices of the two tableaux, then the P–R–K
method is symplectic.


Corollary 2.3. (Sun)[Sun93b] Given an s-stage R–K method with coefficients (A, b, c).
If the coefficients c_i, b_i ≠ 0 (i = 1, · · · , s) satisfy the order conditions of the R–K method
B(p), C(η) and D(ζ), then the P–R–K method produced by the coefficients

    ( a_ij,  ā_ij = b_j (1 − a_ji/b_i),  b_i,  c_i )

is symplectic, with order r = min(p, 2η + 2, 2ζ + 2, η + ζ + 1).
Corollary 2.4. The Radau IA method has order 2s − 1; by Corollary 2.3, Radau IA together
with the method constructed from it forms a symplectic P–R–K method of order 2s − 1.

Example 2.5. P–R–K method of first-order accuracy based on Radau IA (the tableau to
the right of =⇒ is the averaged R–K method a*_ij = (a_ij + A_ij)/2 of Corollary 2.15):

    0 | 1        0 | 0              1/2 | 1/2
      | 1          | 1      =⇒          | 1

Example 2.6. P–R–K method of third-order accuracy based on Radau IA:

    0   | 1/4  −1/4        0   | 0    0               1/8   −1/8
    2/3 | 1/4   5/12       2/3 | 1/3  1/3     =⇒      7/24   3/8
        | 1/4   3/4            | 1/4  3/4             1/4    3/4

Corollary 2.7. A symplectic P–R–K method based on Radau IIA is constructed by the
same procedure.

Example 2.8. P–R–K method of first-order accuracy based on Radau IIA:

    1 | 1        1 | 0              1/2 | 1/2
      | 1          | 1      =⇒          | 1

Example 2.9. P–R–K method of third-order accuracy based on Radau IIA:

    1/3 | 5/12  −1/12       1/3 | 1/3  0              3/8  −1/24
    1   | 3/4    1/4        1   | 1    0      =⇒      7/8   1/8
        | 3/4    1/4            | 3/4  1/4            3/4   1/4

Corollary 2.10. Using the same construction, one obtains a symplectic P–R–K method
based on Lobatto IIIC with order of accuracy 2s − 2.

Example 2.11. Symplectic P–R–K method of second-order accuracy based on Lobatto IIIC;
its coefficients are

    0 | 1/2  −1/2       0 | 0  0                1/4  −1/4
    1 | 1/2   1/2       1 | 1  0        =⇒      3/4   1/4
      | 1/2   1/2         | 1/2 1/2             1/2   1/2

Example 2.12. Symplectic P–R–K method based on Lobatto IIIC (s = 3); its coefficients are

    0   | 1/6  −1/3    1/6       0   | 0    0    0            1/12  −1/6   1/12
    1/2 | 1/6   5/12  −1/12      1/2 | 1/4  1/4  0    =⇒      5/24   1/3  −1/24
    1   | 1/6   2/3    1/6       1   | 0    1    0            1/12   5/6   1/12
        | 1/6   2/3    1/6           | 1/6  2/3  1/6          1/6    2/3   1/6

Corollary 2.13. The s-stage P–R–K Lobatto IIIA–IIIB method is symplectic with
order of accuracy 2s − 2.

Example 2.14. Symplectic P–R–K Lobatto IIIA–IIIB methods; their coefficients are

    0 | 0    0         0 | 1/2  0               1/4  0
    1 | 1/2  1/2       1 | 1/2  0       =⇒      1/2  1/4
      | 1/2  1/2         | 1/2  1/2             1/2  1/2

    0   | 0     0     0        0   | 1/6  −1/6  0            1/12  −1/12   0
    1/2 | 5/24  1/3  −1/24     1/2 | 1/6   1/3  0    =⇒      3/16   1/3   −1/48
    1   | 1/6   2/3   1/6      1   | 1/6   5/6  0            1/6    3/4    1/12
        | 1/6   2/3   1/6          | 1/6   2/3  1/6          1/6    2/3    1/6

With the help of the symplecticity conditions of P–R–K methods, we can construct symplectic
R–K methods. We have the following corollary:

Corollary 2.15. (Sun)[Sun00] The s-stage R–K method with coefficients a*_ij = (a_ij + A_ij)/2,
b*_i = b_i = B_i and c*_i = c_i is symplectic and satisfies at least B(p), C(ξ) and
D(ξ), i.e., its order is
r = min(p, 2ξ + 1), where ξ = min(η, ζ).
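A sketch illustrating Corollary 2.15 (ours, using the 2-stage Lobatto IIIA–IIIB pair of Example 2.14 as input): the averaged tableau a* = (a + A)/2 satisfies the standard R–K symplecticity condition b_i a*_ij + b_j a*_ji − b_i b_j = 0.

```python
import numpy as np

# 2-stage Lobatto IIIA and IIIB tableaux with shared weights b
a = np.array([[0.0, 0.0], [0.5, 0.5]])
A = np.array([[0.5, 0.0], [0.5, 0.0]])
b = np.array([0.5, 0.5])

a_star = (a + A) / 2   # averaged coefficients of Corollary 2.15

# R-K symplecticity condition: b_i a*_ij + b_j a*_ji - b_i b_j = 0
M = b[:, None] * a_star + b[None, :] * a_star.T - np.outer(b, b)
print(a_star)
print(np.allclose(M, 0))   # -> True
```

The resulting tableau [[1/4, 0], [1/2, 1/4]] is exactly the right-hand tableau of the first block in Example 2.14.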
Example 2.16. If we take the coefficients in Example 2.14, the tableau on the right of the
table is a special case of the second-order accurate Lobatto IIIS R–K methods, namely the
case σ = 1/2 of the literature [Chi97]; see Table 2.1.

7.2.2 Simplified Order Conditions of the Explicit Symplectic R–K Method

The following s-stage scheme is well known[ZQ95b]:
    p_i = p_{i−1} + c_i h f(q_{i−1}),
    q_i = q_{i−1} + d_i h g(p_i),        i = 1, 2, · · · , s,    (2.8)

where f = −v_q(q), g = u_p(p). We can regard f and g as functions of z = (p, q), with f
(resp. g) depending on the p (resp. q) variables with coefficient 0, i.e., f(q, 0·p) (resp.
g(p, 0·q)). In order to facilitate writing in a unified form, we set

    p = y_a,  q = y_b,  f = f_a,  g = f_b,  y_{a,0} = p_0,  y_{b,0} = q_0,

and y_{a,1} = p_{s−1}, y_{b,1} = q_{s−1}; then Equation (2.8) is transformed into an s-stage
P–R–K form:
    g_{1,a} = y_{a,0} = p_0,
    g_{1,b} = y_{b,0} = q_0,
    g_{2,a} = y_{a,0} + c_1 h f_a(q_0) = y_{a,0} + c_1 h f_a(g_{1,b}) = p_1,
    g_{2,b} = y_{b,0} + d_1 h f_b(p_1) = y_{b,0} + d_1 h f_b(g_{2,a}) = q_1,
        · · ·                                                               (2.9)
    g_{s,a} = y_{a,0} + h Σ_{j=1}^{s−1} c_j f_a(g_{j,b}) = p_{s−1},
    g_{s,b} = y_{b,0} + h Σ_{j=1}^{s−1} d_j f_b(g_{j+1,a}) = q_{s−1}.

(2.9) is equivalent to

    g_{i,a} = y_{a,0} + h Σ_{j=1}^{i−1} c_j f_a(g_{j,b}),
    g_{i,b} = y_{b,0} + h Σ_{j=1}^{i−1} d_j f_b(g_{j+1,a}),        i = 2, · · · , s,
                                                                                    (2.10)
    y_{a,1} = y_{a,0} + h Σ_{j=1}^{s−1} c_j f_a(g_{j,b}),
    y_{b,1} = y_{b,0} + h Σ_{j=1}^{s−1} d_j f_b(g_{j+1,a}).
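As a numerical illustration of ours (not the book's), scheme (2.8) with s = 1 and c1 = d1 = 1 is the symplectic Euler method; applied to the harmonic oscillator H = (p² + q²)/2 it keeps the energy error bounded over long times.

```python
def scheme_28(p, q, h, f, g, c, d):
    """One step of scheme (2.8): p_i = p_{i-1} + c_i h f(q_{i-1}),
    q_i = q_{i-1} + d_i h g(p_i), i = 1, ..., s."""
    for ci, di in zip(c, d):
        p = p + ci * h * f(q)
        q = q + di * h * g(p)
    return p, q

f = lambda q: -q   # f = -v_q with v(q) = q^2/2
g = lambda p: p    # g = u_p with u(p) = p^2/2
p, q, h = 0.0, 1.0, 0.01
H0 = 0.5 * (p * p + q * q)
max_err = 0.0
for _ in range(100_000):
    p, q = scheme_28(p, q, h, f, g, c=[1.0], d=[1.0])
    max_err = max(max_err, abs(0.5 * (p * p + q * q) - H0))
print(max_err < 0.01)   # -> True: the energy error stays bounded
```

An explicit (non-symplectic) Euler step would instead show a steady energy drift over the same time span.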

And (2.2) can be rewritten in the new variables as

    ( ẏ_a ; ẏ_b ) = ( f_a(y_b) ; f_b(y_a) ).    (2.11)

Let
    a_1 = c_1,  a_2 = c_2,  · · · ,  a_{s−1} = c_{s−1},  a_s = 0,
    b_1 = 0,    b_2 = d_1,  · · · ,  b_{s−1} = d_{s−2},  b_s = d_{s−1};    (2.12)
then scheme (2.10) becomes

    g_{i,a} = y_{a,0} + h Σ_{j=1}^{i−1} a_j f_a(g_{j,b}) = y_{a,0} + h Σ_{j=1}^{i−1} a_j R_{j,a},
    g_{i,b} = y_{b,0} + h Σ_{j=1}^{i} b_j f_b(g_{j,a}) = y_{b,0} + h Σ_{j=1}^{i} b_j R_{j,b},        i = 2, · · · , s,
                                                                                                      (2.13)
    y_{a,1} = y_{a,0} + h Σ_{j=1}^{s} a_j R_{j,a},
    y_{b,1} = y_{b,0} + h Σ_{j=1}^{s} b_j R_{j,b},

where
    R_{i,a} = f_a(g_{i,b}),    R_{i,b} = f_b(g_{i,a}).    (2.14)
Now we just need to study the order conditions of scheme (2.13) when a_s = b_1 = 0.
Notice that a_s = b_1 = 0 is necessary for (2.13) to be canonical and is also crucial for
simplifying the order conditions, as we will see later.

A P-graph (denoted by PG) is a special graph which satisfies the following conditions:
(i) its vertices are divided into two classes, "white" ◦ and "black" •, sometimes
called "meagre" and "fat" instead; (ii) two adjacent vertices of a PG cannot be of the
same class. If we give the vertices of a PG an arbitrary set of labels, we get a labeled P-graph,
and we say the labeled P-graph ∈ PG. Two labeled P-graphs are said to be isomorphic
labeled P-graphs if they are just two different labelings of the same P-graph.

A simple path joins a pair of vertices v and w, v ≠ w, and is a sequence of pairwise
distinct vertices v = v_0, v_1, · · · , v_{n−1} = w, where (v_{i−1}, v_i) ∈ E. Fig.
2.1 shows an example of a simple path from v to w for n = 4.

Fig. 2.1. A simple path from v to w, with alternating black • and white ◦ vertices, for n = 4

(1) The definitions of a P-tree Pτ, a labeled P-tree (denoted by λPτ), and a rooted
labeled P-tree ρλPτ of the same order n are just as those of the tree τ, labeled tree λτ,
rooted tree ρτ, and rooted labeled tree ρλτ, where the general graph is replaced by
a P-graph.
(2) We define the isomorphism of two labeled P-trees below. In general, both tree
isomorphism and P-tree isomorphism are in accordance with the type of
labeling used; this has been described before. Here we give the precise definition for
a P-tree.
Two labeled P-trees {V1, E1} and {V2, E2} are called isomorphic if the orders
of the trees are the same and there exists a bijective mapping χ from V1 to V2 and E1
to E2 satisfying

    K(χ(v1)) = K(v1),

where v1 ∈ V1, and

    K(v) = 1 for v black,  K(v) = 0 for v white.

A P-tree (of order n) is the equivalence class consisting of a labeled P-tree and
all of its isomorphic images. Below we use the P-series and tree methods of [HNW93] to derive
the order conditions of Equation (2.13). We first introduce some definitions and notations.
(3) Two rooted labeled P-trees of the same order, {V1, E1, r1} and {V2, E2, r2}
(where r_i (i = 1, 2) denotes the root label), are called root-isomorphic if there
exists a χ satisfying the conditions of (2) such that χ(r1) = r2.
A rooted P-tree, denoted by ρPτ, is an equivalence class which contains a labeled
P-tree and all of its root-isomorphic P-trees. We denote by ρPτ_a (resp. ρPτ_b) a rooted
P-tree ρPτ with a white (resp. black) root, and by ρλPτ a rooted labeled P-tree, obtained
by adding labels to ρPτ; thus ρλPτ ∈ ρPτ. If we give the vertices of a rooted P-tree
ρPτ a set of labels such that the label of a father vertex is always smaller than those of
its sons, we get a monotonically labeled rooted P-tree MρλPτ. We denote by α(ρPτ) the
number of possible different monotonic labelings of ρPτ when the labels are chosen from
the set A_q = {the first q letters of i < j < k < l < · · ·}, where q is the order of ρPτ.
The set of all rooted P-trees of order n with a white (resp. black) root is denoted by
TP_n^a (resp. TP_n^b). Let us denote by λTP_n^a (resp. λTP_n^b) the set of all rooted
labeled P-trees of order n with a white (resp. black) root vertex, and by MλTP_n^a (resp.
MλTP_n^b) the set of all monotonically labeled rooted P-trees of order n with a white
(resp. black) root vertex, the labels being chosen from the set A_n.
(4) The density γ(ρλτ) of a rooted P-tree ρλτ is defined recursively as

    γ(ρλτ) = r(ρλτ)γ(ρλτ¹) · · · γ(ρλτ^m),

where r(ρλτ) is the order of ρλτ, and ρλτ¹, · · · , ρλτ^m are the sub-trees which arise when
the root of ρλτ is removed from the tree. The density of a rooted P-tree ρPτ is thus calculated
by regarding it as a general rooted tree, neglecting the difference between the
black and white vertices.
(5) Let ρPτ¹, · · · , ρPτ^m be rooted P-trees. We denote by ρPτ = a[ρPτ¹, · · · ,
ρPτ^m] the unique rooted P-tree that arises when the roots of ρPτ¹, · · · , ρPτ^m are
all attached to a white root vertex, and similarly by b[ρPτ¹, · · · , ρPτ^m] when
the root of the P-tree is black. We say ρPτ¹, · · · , ρPτ^m are the sub-trees of ρPτ. We
further denote the rooted P-tree of order 1, which has a white (resp. black) root vertex,
by t_a (resp. t_b).
(6) The elementary differential F associated with (2.11) is defined recursively as:

    F(t_a)(y) = f_a(y),    F(t_b)(y) = f_b(y),

    F(ρPτ)(y) = ( ∂^m f_{w(ρPτ)}(y) / ∂y_{w(ρPτ¹)} · · · ∂y_{w(ρPτ^m)} ) ( F(ρPτ¹)(y), · · · , F(ρPτ^m)(y) ),    (2.15)

where y = (y_a, y_b), ρPτ = a[ρPτ¹, · · · , ρPτ^m] or ρPτ = b[ρPτ¹, · · · , ρPτ^m],
and w(ρPτ) is defined by:

    w(ρPτ) = a if the root vertex of ρPτ is white,  b if it is black.

We see that F(ρPτ) is independent of the labeling. Here, and in the remainder of this
book, in order to avoid sums and unnecessary indices, we assume that y_a and y_b are
scalar quantities, and f_a, f_b scalar functions. All subsequent formulas remain valid for
vectors if the derivatives are interpreted as multi-linear mappings.

Lemma 2.17. The derivatives of the exact solution of (2.11) satisfy:

    y_a^(q) = Σ_{ρλPτ ∈ MλTP_q^a} F(ρλPτ)(y_a, y_b) = Σ_{ρPτ ∈ TP_q^a} α(ρPτ)F(ρPτ)(y_a, y_b),

    y_b^(q) = Σ_{ρλPτ ∈ MλTP_q^b} F(ρλPτ)(y_a, y_b) = Σ_{ρPτ ∈ TP_q^b} α(ρPτ)F(ρPτ)(y_a, y_b),

                                                                                        (2.16)
where q = 1, 2, · · · .
It is convenient to introduce two new rooted P-trees of order 0: ∅_a and ∅_b. The
corresponding elementary differentials are F(∅_a) = y_a, F(∅_b) = y_b.
We further set

    TP^a = ∅_a ∪ TP_1^a ∪ TP_2^a ∪ · · · ,
    TP^b = ∅_b ∪ TP_1^b ∪ TP_2^b ∪ · · · ,
    λTP^a = ∅_a ∪ λTP_1^a ∪ λTP_2^a ∪ · · · ,    (2.17)
    λTP^b = ∅_b ∪ λTP_1^b ∪ λTP_2^b ∪ · · · ,
    MλTP^a = ∅_a ∪ MλTP_1^a ∪ MλTP_2^a ∪ · · · ,
    MλTP^b = ∅_b ∪ MλTP_1^b ∪ MλTP_2^b ∪ · · · .

P-series: let c(∅_a), c(∅_b), c(t_a), c(t_b), · · · be real coefficients defined for all P-trees,
c : TP^a ∪ TP^b → R. The series p(c, y) = (p_a(c, y), p_b(c, y)) is defined as

    p_a(c, y) = Σ_{ρλPτ ∈ MλTP^a} ( h^{r(ρλPτ)} / r(ρλPτ)! ) c(ρλPτ)F(ρλPτ)(y)
              = Σ_{ρPτ ∈ TP^a} α(ρPτ) ( h^{r(ρPτ)} / r(ρPτ)! ) c(ρPτ)F(ρPτ)(y),
                                                                                    (2.18)
    p_b(c, y) = Σ_{ρλPτ ∈ MλTP^b} ( h^{r(ρλPτ)} / r(ρλPτ)! ) c(ρλPτ)F(ρλPτ)(y)
              = Σ_{ρPτ ∈ TP^b} α(ρPτ) ( h^{r(ρPτ)} / r(ρPτ)! ) c(ρPτ)F(ρPτ)(y).
Notice that c is defined on TP^a ∪ TP^b, and for any labeling of ρPτ (in particular for a
monotonic labeling ρλPτ), we have c(ρλPτ) = c(ρPτ). Lemma 2.17 states simply
that the exact solution is a P-series,

    ( y_a(t_0 + h), y_b(t_0 + h) )ᵀ = p( Y, (y_a(t_0), y_b(t_0)) ),

where Y(ρPτ) = 1 for all rooted P-trees ρPτ. The following theorem is from the
book[HNW93].

Theorem 2.18. Let c : TP^a ∪ TP^b → R be a sequence of coefficients such that
c(∅_a) = c(∅_b) = 1. Then

    h ( f_a(p(c, (y_a, y_b))) ; f_b(p(c, (y_a, y_b))) ) = p( c′, (y_a, y_b) ),

with

    c′(∅_a) = c′(∅_b) = 0,    c′(t_a) = c′(t_b) = 1,
    c′(ρPτ) = r(ρPτ)c(ρPτ¹) · · · c(ρPτ^m),
    ρPτ = a[ρPτ¹, · · · , ρPτ^m], or ρPτ = b[ρPτ¹, · · · , ρPτ^m].

Let

    R_{i,a} = p_a( K_i, (y_{a,0}, y_{b,0}) ),
    R_{i,b} = p_b( K_i, (y_{a,0}, y_{b,0}) ),
                                                   i = 1, · · · , s,    (2.19)
    g_{i,a} = p_a( G_i, (y_{a,0}, y_{b,0}) ),
    g_{i,b} = p_b( G_i, (y_{a,0}, y_{b,0}) ),

where K_i (i = 1, · · · , s) : TP^a ∪ TP^b → R and G_i (i = 1, · · · , s) : TP^a ∪ TP^b → R
are two sets of P-series coefficients. From (2.10), we have G_i(∅_a) = G_i(∅_b) = 1. Hence, from
(2.14), we have

    p( K_i, (y_{a,0}, y_{b,0}) ) = ( p_a(K_i, (y_{a,0}, y_{b,0})) ; p_b(K_i, (y_{a,0}, y_{b,0})) ) = ( R_{i,a} ; R_{i,b} )
      = h ( f_a(g_{i,b}) ; f_b(g_{i,a}) )
      = h ( f_a(p_b(G_i, (y_{a,0}, y_{b,0}))) ; f_b(p_a(G_i, (y_{a,0}, y_{b,0}))) )
      = h ( f_a(p(G_i, (y_{a,0}, y_{b,0}))) ; f_b(p(G_i, (y_{a,0}, y_{b,0}))) )
      = p( G_i′, (y_{a,0}, y_{b,0}) ).

Then from Theorem 2.18, we get

    K_i = G_i′,    ∀ i = 1, · · · , s.

But from (2.13), we have

    p( G_i, (y_{a,0}, y_{b,0}) ) = ( p_a(G_i, (y_{a,0}, y_{b,0})) ; p_b(G_i, (y_{a,0}, y_{b,0})) )
      = ( y_{a,0} + h Σ_{j=1}^{i−1} a_j R_{j,a} ; y_{b,0} + h Σ_{j=1}^{i} b_j R_{j,b} )
      = ( y_{a,0} + h Σ_{j=1}^{i−1} a_j p_a(K_j, (y_{a,0}, y_{b,0})) ; y_{b,0} + h Σ_{j=1}^{i} b_j p_b(K_j, (y_{a,0}, y_{b,0})) ).

Thus:

    G_i(ρPτ_a) = Σ_{j=1}^{i−1} a_j K_j(ρPτ_a),
                                                    ∀ r(ρPτ) ≥ 1.    (2.20)
    G_i(ρPτ_b) = Σ_{j=1}^{i} b_j K_j(ρPτ_b),

From (2.13), we also have

    y_{a,1} = y_{a,0} + h Σ_{i=1}^{s} a_i p_a( K_i, (y_{a,0}, y_{b,0}) ),
                                                                          (2.21)
    y_{b,1} = y_{b,0} + h Σ_{i=1}^{s} b_i p_b( K_i, (y_{a,0}, y_{b,0}) ).

Comparing the numerical solution obtained from (2.13) with the exact solution of
(2.11), we get the order conditions for scheme (2.13).

Theorem 2.19. Scheme (2.13) has p-th order accuracy iff its coefficients a_i, b_i satisfy:

    Σ_{i=1}^{s} a_i K_i(ρPτ_a) = 1,    1 ≤ r(ρPτ_a) ≤ p,
                                                              (2.22)
    Σ_{i=1}^{s} b_i K_i(ρPτ_b) = 1,    1 ≤ r(ρPτ_b) ≤ p,

where K_i (i = 1, · · · , s) are defined recursively by

    K_i = G_i′,
    G_i(∅_a) = G_i(∅_b) = 1,
    G_i(ρPτ_a) = Σ_{j=1}^{i−1} a_j K_j(ρPτ_a),        r(ρPτ_a), r(ρPτ_b) ≥ 1.    (2.23)
    G_i(ρPτ_b) = Σ_{j=1}^{i} b_j K_j(ρPτ_b),

From the first and second equations of (2.23) we know K_i(∅_a) = K_i(∅_b) = 0 and
K_i(t_a) = K_i(t_b) = 1; from the last two equations of (2.23) we can obtain
G_i(t_a), G_i(t_b). Repeating this procedure, we can obtain K_i(ρPτ_a), K_i(ρPτ_b), by
P-tree order from low to high.
Next we rewrite equations (2.23) in a more intuitive form. From (2.23), we have

    K_i(ρPτ_a) = r(ρPτ_a) ( Σ_{j=1}^{i} b_j K_j(ρPτ_b¹) ) · · · ( Σ_{j=1}^{i} b_j K_j(ρPτ_b^{m1}) ),
                                                                                    i = 2, 3, · · · , s,
    K_i(ρPτ_b) = r(ρPτ_b) ( Σ_{j=1}^{i−1} a_j K_j(ρPτ_a¹) ) · · · ( Σ_{j=1}^{i−1} a_j K_j(ρPτ_a^{m2}) ),
                                                                                                    (2.24)
where
    ρPτ_a = a[ρPτ_b¹, · · · , ρPτ_b^{m1}],
    ρPτ_b = b[ρPτ_a¹, · · · , ρPτ_a^{m2}].    (2.25)
We now define the elementary weight φ(ρPτ) for a rooted P-tree. Choose any labeling
ρλPτ for ρPτ; without loss of generality we choose a monotonic one with labels
i < j < k < l < · · · , where the root label is i. Then φ can be obtained
recursively (note the difference in computing φ between the original tree and its sub-trees):

    φ(ρPτ_a) = Σ_{i=1}^{s−1} a_i φ(ρPτ_b¹) · · · φ(ρPτ_b^{m1}),
    φ(ρPτ_b) = Σ_{i=1}^{s} b_i φ(ρPτ_a¹) · · · φ(ρPτ_a^{m2}),        r(ρPτ_a), r(ρPτ_b) ≥ 1,    (2.26)
    φ(∅_a) = φ(∅_b) = 1,

where ρPτ_a, ρPτ_b are as in (2.25). Here, notice that i is the root of ρPτ_a or
ρPτ_b, and s is the label of an imaginary father vertex attached to the root i. The summation is
always with respect to the label of the son vertex, running from 1 up to the father's label,
or up to the father's label minus 1; here s is the stage number of scheme (2.13). We do this
only for ease of the recursive definition; otherwise the root vertex i has no father vertex and
the summation limit could not be determined. For the sub-trees of ρPτ_a, ρPτ_b, the father
vertex of their root label is i, so it is not necessary to add an extra father vertex. Hence
the weight of a P-tree regarded as the original tree is different from its weight regarded as
a sub-tree of another tree. From (2.26) we see that the elementary weight φ of a tree does not
depend on its labeling, as long as the imaginary father of the root keeps the label s, the stage
number of scheme (2.13).
Theorem 2.20. [AS93,ZQ95b,SSC94] The order conditions in Theorem 2.19 are equivalent to

    φ(ρPτ) = 1/γ(ρPτ),    ∀ ρPτ ∈ TP^a ∪ TP^b, r(ρPτ) ≤ p.    (2.27)
Proof. We just need to prove

    φ(ρλPτ_a)γ(ρλPτ_a) = Σ_{j=1}^{s−1} a_j K_j(ρλPτ_a),
                                                            (2.28)
    φ(ρλPτ_b)γ(ρλPτ_b) = Σ_{j=1}^{s} b_j K_j(ρλPτ_b).
From (2.23), we have

    K_i(ρλPτ_a) = r(ρλPτ_a) ( Σ_{j1=1}^{i} b_{j1} K_{j1}(ρλPτ_b¹) ) · · · ( Σ_{j_{m1}=1}^{i} b_{j_{m1}} K_{j_{m1}}(ρλPτ_b^{m1}) ),

    K_i(ρλPτ_b) = r(ρλPτ_b) ( Σ_{j1=1}^{i−1} a_{j1} K_{j1}(ρλPτ_a¹) ) · · · ( Σ_{j_{m2}=1}^{i−1} a_{j_{m2}} K_{j_{m2}}(ρλPτ_a^{m2}) ),
                                                                                                    (2.29)
where i = 2, 3, · · · , s, and

    ρλPτ_a = a[ρλPτ_b¹, · · · , ρλPτ_b^{m1}],
    ρλPτ_b = b[ρλPτ_a¹, · · · , ρλPτ_a^{m2}],    (2.30)

while j1, · · · , j_{m1} and j1, · · · , j_{m2} are the labels of the roots of ρλPτ_b¹, · · · , ρλPτ_b^{m1}
and ρλPτ_a¹, · · · , ρλPτ_a^{m2}, respectively. Due to
    R.S. of (2.28) ⇐⇒
        Σ_{i=1}^{s−1} a_i r(ρλPτ_a) ( Σ_{j1=1}^{i} b_{j1} K_{j1}(ρλPτ_b¹) ) · · · ( Σ_{j_{m1}=1}^{i} b_{j_{m1}} K_{j_{m1}}(ρλPτ_b^{m1}) ),
        Σ_{i=1}^{s} b_i r(ρλPτ_b) ( Σ_{j1=1}^{i−1} a_{j1} K_{j1}(ρλPτ_a¹) ) · · · ( Σ_{j_{m2}=1}^{i−1} a_{j_{m2}} K_{j_{m2}}(ρλPτ_a^{m2}) ),

    L.S. of (2.28) ⇐⇒
        Σ_{i=1}^{s−1} a_i r(ρλPτ_a) ( φ(ρλPτ_b¹)γ(ρλPτ_b¹) ) · · · ( φ(ρλPτ_b^{m1})γ(ρλPτ_b^{m1}) ),
        Σ_{i=1}^{s} b_i r(ρλPτ_b) ( φ(ρλPτ_a¹)γ(ρλPτ_a¹) ) · · · ( φ(ρλPτ_a^{m2})γ(ρλPτ_a^{m2}) ),

so we have to prove

    φ(ρλPτ_b^n)γ(ρλPτ_b^n) = Σ_{jn=1}^{i} b_{jn} K_{jn}(ρλPτ_b^n)    for n = 1, · · · , m1,

and

    φ(ρλPτ_a^n)γ(ρλPτ_a^n) = Σ_{jn=1}^{i−1} a_{jn} K_{jn}(ρλPτ_a^n)    for n = 1, · · · , m2.

Continuing this process, we finally see that it is enough to prove

    φ(t_a)γ(t_a) = Σ_{l=1}^{f(l)−1} a_l K_l(t_a),    φ(t_b)γ(t_b) = Σ_{l=1}^{f(l)} b_l K_l(t_b),    (2.31)

where l is the label of t_a or t_b and f(l) is the label of its father. Since

    φ(t_a)γ(t_a) = Σ_{l=1}^{f(l)−1} a_l · 1 = Σ_{l=1}^{f(l)−1} a_l K_l(t_a),
    φ(t_b)γ(t_b) = Σ_{l=1}^{f(l)} b_l · 1 = Σ_{l=1}^{f(l)} b_l K_l(t_b),

and K_l(t_a) = 1, K_l(t_b) = 1, the theorem is proved. ∎
316 7. Symplectic Runge–Kutta Methods

Let Pτ be a P-tree of order p (p ≥ 2). Choose any labeling to obtain λPτ. Let v
and w be two adjacent vertices. We consider four rooted P-trees. Denote by ρPτ^v (resp.
ρPτ^w) the rooted P-tree obtained by regarding the vertex v (resp. w) as the root of
Pτ. Denote by ρPτ_v (resp. ρPτ_w) the rooted P-tree with root v (resp. w) which arises
when the edge (v, w) is deleted from Pτ. Without loss of generality, let v be
white and w be black.

Theorem 2.21. [AS93,ZQ95b,SSC94] With the above notations, we have:

    1°  1/γ(ρPτ^v) + 1/γ(ρPτ^w) = 1/(γ(ρPτ_v)γ(ρPτ_w)).

    2°  φ(ρPτ^v) + φ(ρPτ^w) = φ(ρPτ_v)φ(ρPτ_w), when a_s = b_1 = 0.

Proof. By the definition of γ, we have

    γ(ρPτ^v) = n γ(ρPτ_w) γ(ρPτ_v)/r(ρPτ_v),
    γ(ρPτ^w) = n γ(ρPτ_v) γ(ρPτ_w)/r(ρPτ_w).

Since r(ρPτ_v) + r(ρPτ_w) = n, we therefore have

    1/γ(ρPτ^v) + 1/γ(ρPτ^w) = r(ρPτ_v)/(n γ(ρPτ_w)γ(ρPτ_v)) + r(ρPτ_w)/(n γ(ρPτ_w)γ(ρPτ_v))
                             = 1/(γ(ρPτ_w)γ(ρPτ_v)),

i.e., 1°.
Moreover,

    φ(ρPτ^v) = Σ_{iv=1}^{s−1} a_{iv} ∏_1^{iv} Σ_{iw=1}^{iv} b_{iw} ∏_2^{iw},
    φ(ρPτ^w) = Σ_{iw=1}^{s} b_{iw} ∏_2^{iw} Σ_{iv=1}^{iw−1} a_{iv} ∏_1^{iv},

where ∏_1^{iv} (resp. ∏_2^{iw}) is the product of the remaining factors of φ(ρPτ^v) (resp.
φ(ρPτ^w)), and iv, iw are the labels of v and w, respectively. ∏_1^{iv} (resp. ∏_2^{iw})
varies only with iv (resp. iw); therefore

    φ(ρPτ_v) = Σ_{iv=1}^{s−1} a_{iv} ∏_1^{iv},   φ(ρPτ_w) = Σ_{iw=1}^{s} b_{iw} ∏_2^{iw},

then

    φ(ρPτ_v)φ(ρPτ_w) = Σ_{iv=1}^{s−1} a_{iv} ∏_1^{iv} Σ_{iw=1}^{s} b_{iw} ∏_2^{iw}
                     = Σ_{iv=1}^{s−1} a_{iv} ∏_1^{iv} ( Σ_{iw=1}^{iv} b_{iw} ∏_2^{iw} + Σ_{iw=iv+1}^{s} b_{iw} ∏_2^{iw} )
                     = Σ_{iv=1}^{s−1} a_{iv} ∏_1^{iv} Σ_{iw=1}^{iv} b_{iw} ∏_2^{iw} + Σ_{iv=1}^{s−1} a_{iv} ∏_1^{iv} Σ_{iw=iv+1}^{s} b_{iw} ∏_2^{iw}.

After manipulation, we can get

    Σ_{iv=1}^{s−1} a_{iv} ∏_1^{iv} Σ_{iw=iv+1}^{s} b_{iw} ∏_2^{iw} = Σ_{iw=2}^{s} b_{iw} ∏_2^{iw} Σ_{iv=1}^{iw−1} a_{iv} ∏_1^{iv}
                     = Σ_{iw=1}^{s} b_{iw} ∏_2^{iw} Σ_{iv=1}^{iw−1} a_{iv} ∏_1^{iv},   since b_1 = 0.

From this, 2° holds. □


Corollary 2.22. Let scheme (2.13) be of order at least p − 1, where p ≥ 2. Then the order
condition

    φ(ρPτ^v) = 1/γ(ρPτ^v)

is satisfied iff

    φ(ρPτ^w) = 1/γ(ρPτ^w).

Proof. Because scheme (2.13) is of order at least p − 1, by Theorem 2.20 we know that
the following two relations hold:

    φ(ρPτ_v) = 1/γ(ρPτ_v),   φ(ρPτ_w) = 1/γ(ρPτ_w).

By Theorem 2.21, we have

    φ(ρPτ^v) + φ(ρPτ^w) = 1/γ(ρPτ^v) + 1/γ(ρPτ^w).

The corollary is obviously established. □
So far we can draw the following conclusion for this section:

Theorem 2.23. [AS93,ZQ95b,SSC94] The symplectic scheme (2.13) (a_s = b_1 = 0) is of order p
if and only if for every P-tree Pτ with r(Pτ) ≤ p there is a rooted P-tree ρPτ ∈ Pτ
such that

    φ(ρPτ) = 1/γ(ρPτ).

Proof. By Corollary 2.22, we know that any two ways of choosing the root of Pτ lead
to equivalent conditions. Therefore, we only need to take one of them to obtain
the order conditions. □

By Theorem 2.23, we simplify the order conditions. Originally, every rooted P-
tree had a corresponding order condition. Now every P-tree, no matter how the
root is chosen, has one corresponding order condition. For the 4th-order case, the
number of order conditions reduces from 16 to 8, and the corresponding 8 P-trees are
as follows:

[Fig. 2.2. The 8 P-trees: τa, τb, a[τb], b[τa, τa], b[τb, τb], b[a[τb, τb]], a[b[τa, τa]], a[a[a[τb]]].]

Finally, according to Theorem 2.23, we can simplify the order conditions for the
P–R–K method, as given in Table 2.2. Calvo and Hairer [CH95] further reduced
the number of independent conditions for the P–R–K method; see Table 2.3. For a general
Hamiltonian, the corresponding values, obtained by Murua [Mur97], are given in Table 2.4.

Table 2.2. Order conditions for the P–R–K method and the symplectic P–R–K method in the
separable case

Order P–R–K method Symplectic P–R–K method


1 2 2
2 4 3
3 8 5
4 16 8
5 34 14
6 74 24
7 170 46
8 400 88

Table 2.3. Further reduction of the order conditions for the P–R–K method in the separable case

Order P–R–K method Symp. P–R–K method expl. Symp. P–R–K method
1 2 1 1
2 2 1 1
3 4 2 2
4 8 3 3
5 18 6 6
6 40 10 9
7 96 22 18
8 230 42 30

Table 2.4. Order conditions for the P–R–K method and the symplectic P–R–K method in the general case

Order P–R–K method Symplectic P–R–K method


1 2 1
2 4 1
3 14 3
4 52 8
5 214 27
6 916 91
7 4116 350
8 18996 1376

7.3 Symplectic R–K–N Method


The symplectic Runge–Kutta–Nyström method is abbreviated as the symplectic R–K–N method.
The main purpose of this section is to develop and simplify the order conditions for R–
K–N methods; the simplified order conditions for canonical R–K–N methods,
which apply to a special kind of ODEs, are also obtained here. Then, using the
simplified order conditions, we construct some 5-stage, fifth-order symplectic R–K–N
schemes.

7.3.1 Order Conditions for Symplectic R–K–N Method


We consider a special kind of second-order ODEs:

    ÿ^J = f^J(y^1, y^2, ···, y^n),   J = 1, ···, n,   (y^1, ···, y^n) ∈ Rⁿ.   (3.1)

We can transform (3.1) into a system of first-order ODEs,

    d/dt (y, ẏ)ᵀ = (ẏ, f(y))ᵀ,                                               (3.2)

by adding another group of variables ẏ^J (J = 1, ···, n). Since canonical difference
schemes are meaningful only for Hamiltonian systems, we assume that (3.2) can be
written as

    ẏ = ∂H(y, ẏ)/∂ẏ,   ÿ = −∂H(y, ẏ)/∂y,                                     (3.3)

where H(y, ẏ) is a scalar function that satisfies

    ∂H(y, ẏ)/∂ẏ = ẏ,   −∂H(y, ẏ)/∂y = f(y).

So H must be of the form H = (1/2)ẏᵀẏ − u(y), with ∂u/∂y = f(y). Therefore only
when f(y) is the gradient of some scalar function is its symplectic algorithm meaningful. A
general s-stage R–K–N method can be written as


    gi = y0 + ci h ẏ0 + h² Σ_{j=1}^{s} a_ij f(gj),   i = 1, ···, s,

    y1 = y0 + h ẏ0 + h² Σ_{j=1}^{s} b̄j f(gj),                                (3.4)

    ẏ1 = ẏ0 + h Σ_{j=1}^{s} bj f(gj).

The corresponding Butcher tableau is

    c1 | a11  ···  a1s
    c2 | a21  ···  a2s
    ⋮  |  ⋮         ⋮
    cs | as1  ···  ass
    ---+--------------
       | b̄1   ···  b̄s
       | b1   ···  bs
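As a sketch of how (3.4) is evaluated in practice (our own code, not from the text; the names `rkn_step`, `f`, etc. are ours), one step with a strictly lower-triangular A reads:

```python
import numpy as np

def rkn_step(f, y0, ydot0, h, A, b_bar, b, c):
    """One step of the s-stage R-K-N scheme (3.4) for y'' = f(y),
    assuming A is strictly lower triangular (explicit method)."""
    s = len(b)
    k = []                                   # k[j] stores f(g_j)
    for i in range(s):
        gi = y0 + c[i] * h * ydot0 + h**2 * sum(A[i][j] * k[j] for j in range(i))
        k.append(f(gi))
    y1    = y0 + h * ydot0 + h**2 * sum(b_bar[j] * k[j] for j in range(s))
    ydot1 = ydot0 + h * sum(b[j] * k[j] for j in range(s))
    return y1, ydot1
```

For instance, with the trivial one-stage tableau c = (0), A = (0), b = b̄ = (1) applied to ÿ = −y, one step gives y₁ = y₀ + hẏ₀ − h²y₀, as expected from (3.4).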

Theorem 3.1. If the coefficients of scheme (3.4) satisfy

    b̄j = bj(1 − cj),   1 ≤ j ≤ s,                                            (3.5)
    bi aij − bj aji + b̄i bj − bi b̄j = 0,   1 ≤ i, j ≤ s,                     (3.6)

then the scheme is symplectic [Sur90,Sur89,CS93,ZQ93].

Proof. The proof of Theorem 3.1 can be found in [Sur89,OS92]. Here, we only point out
that under condition (3.5), (3.6) is equivalent to

    bi aij − bj aji + bi bj(cj − ci) = 0,   1 ≤ i, j ≤ s.                    (3.7)

Therefore, the theorem is completed. □
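The two conditions of Theorem 3.1 can be verified mechanically for any given tableau; a minimal sketch (the function name is ours), checking (3.5) together with the equivalent form (3.7):

```python
import numpy as np

def is_symplectic_rkn(A, b_bar, b, c, tol=1e-12):
    """Check conditions (3.5) and the equivalent form (3.7) of Theorem 3.1
    for an R-K-N tableau (a sketch; the function name is ours)."""
    A, b_bar, b, c = (np.asarray(x, float) for x in (A, b_bar, b, c))
    cond35 = np.max(np.abs(b_bar - b * (1 - c))) < tol
    # (3.7): b_i a_ij - b_j a_ji + b_i b_j (c_j - c_i) = 0 for all i, j
    M = b[:, None] * A - (b[:, None] * A).T + np.outer(b, b) * (c[None, :] - c[:, None])
    return bool(cond35 and np.max(np.abs(M)) < tol)
```

Applied to a tableau satisfying (3.5) and (3.7), the function returns `True`; perturbing any weight breaks the conditions and it returns `False`.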
Similar to Section 7.1, we first introduce some necessary definitions and notations,
and then derive the order conditions. Some definitions of Section 7.1 can still be used;
here we only introduce the special ones.

(1) S-graph. An S-graph, denoted by S-g, is a special P-graph in which any two
adjacent vertices belong to different categories: “white (meagre)” or “black (fat)”.
The labeled S-graph has a definition similar to the labeled P-graph.

(2) S-tree. An S-tree, denoted by Sτ, has a definition similar to the P-tree of
Section 7.2: replacing the P-graph in the definition of the P-tree with an S-graph gives
the definition of an S-tree. The labeled S-tree, λSτ, rooted S-tree, ρSτ, rooted
labeled S-tree, ρλSτ, isomorphic labeled S-trees, and root-isomorphic labeled S-trees
are defined in the same way as for P-trees, labeled P-trees, etc. We should point out
that in this section we consider only S-trees with a “black” root vertex; when we
refer to a rooted S-tree, we mean that its S-tree has a “black” root.
Moreover, the order r and density γ also have definitions similar to those of Section 7.1.
But the elementary weight is defined completely differently, as we now describe.
Definition 3.2. We define the elementary weight φ(ρλSτ) corresponding to a rooted
labeled S-tree. At first, for convenience, we assume ρλSτ is monotonically labeled;
later we will see this is unnecessary. In the remainder of this section, unless otherwise
specified, the labels of the vertices are always j < k < l < m < ···. For a monotonic
labeling, the label of the root is j. Then φ(ρλSτ) is a sum over the labels of all fat
vertices of ρλSτ; the general term of the sum is a product of:

    1° bj (j is the root vertex);
    2° a_kl, if the fat vertex k is connected via a meagre son with another fat vertex l;
    3° c_k^m, if the fat vertex k has m meagre end-vertices as its sons, where an end-
vertex is a vertex which has no son.

We see that, for two different labelings ρλSτ¹ and ρλSτ² of the same rooted S-tree, we
have φ(ρλSτ¹) = φ(ρλSτ²); thus the choice of a monotonic labeling is unnecessary.
For example, for two root-isomorphic labelings ρλSτ¹ and ρλSτ², we have

    φ(ρλSτ¹) = Σ_{j,m} bj cj a_jm = Σ_{j,k} bk a_kj ck = φ(ρλSτ²) = φ(ρSτ).

Because ρλSτ¹ and ρλSτ² are root-isomorphic, they belong to one rooted tree ρSτ.
Therefore, they form an equivalence class. The following theorem can be found in
the literature [HNW93]; we omit the proof here.

Theorem 3.3. The R–K–N method (3.4) is of order p iff:

    φ(ρSτ) = 1/γ(ρSτ)                  for every rooted S-tree ρSτ, r(ρSτ) ≤ p,      (3.8)
    φ̄(ρSτ) = 1/(γ(ρSτ)(r(ρSτ) + 1))    for every rooted S-tree ρSτ, r(ρSτ) ≤ p − 1.  (3.9)

The explanation of φ̄(ρSτ) is similar to that of φ(ρSτ): one only needs to substitute bj
in φ(ρSτ) (where j is the label of the root, corresponding to a certain choice of
labeling) by b̄j. Because φ and φ̄ are independent of the chosen labeling, (3.8) and
(3.9) can be computed with any labeling.

We find that (3.8) and (3.9) are not independent under the symplectic conditions.

Theorem 3.4. Under symplectic condition (3.5), order condition (3.8) implies condi-
tion (3.9) [ZQ95b].

Proof. Let ρSτ be a rooted S-tree of order ≤ p − 1, and let ρSu be the rooted S-tree of
order r(ρSτ) + 1 obtained from ρSτ by attaching a new branch with a meagre vertex
to the root of ρSτ. From the definition of φ, we have

    φ(ρSu) = Σ_j bj cj ∏,   φ(ρSτ) = Σ_j bj ∏,

where we assume that ρSτ and ρSu have monotonic labels j < k < l < ···. Then for
ρSu, apart from the added meagre leaf at the root, the remaining vertices have
the same labeling as in ρSτ; Σ runs over the fat vertices, and ∏ is the product of the
factors a_ij and c_i contained in ρSτ and ρSu. From the definition of γ, we have

    γ(ρSu) = (r(ρSτ) + 1)γ(ρSτ)/r(ρSτ),

therefore

    φ̄(ρSτ) = Σ_j b̄j ∏ = Σ_j bj(1 − cj) ∏
            = Σ_j bj ∏ − Σ_j bj cj ∏ = φ(ρSτ) − φ(ρSu)
            = 1/γ(ρSτ) − 1/γ(ρSu) = 1/((r(ρSτ) + 1)γ(ρSτ)).                  (3.10)

Since (3.8) holds for S-trees of order ≤ p while (3.9) is required only for S-trees of
order ≤ p − 1, the tree obtained by adding a leaf at the root of any tree of order
≤ p − 1 has order ≤ p, so (3.8) applies to it. Therefore the final equality in (3.10)
holds. Thus we reach the conclusion. □

Theorem 3.5. The R–K–N method (3.4) is symplectic and of order p iff:

    bi aij − bj aji + bi bj(cj − ci) = 0,   1 ≤ i, j ≤ s,                    (3.11)
    b̄i = bi(1 − ci),   1 ≤ i ≤ s,                                           (3.12)
    φ(ρSτ) = 1/γ(ρSτ)   for every rooted S-tree ρSτ, r(ρSτ) ≤ p.            (3.13)

Note that the conditions given here are necessary and sufficient. However, some of
the conditions in (3.13) are still redundant, which means some conditions are
mutually equivalent. We will see more details about this in Section 7.4.

7.3.2 The 3-Stage and 4th-Order Symplectic R–K–N Method

For convenience, we construct only explicit schemes here [QZ91]. Suppose the parame-
ters a_ij of an R–K–N method form a strictly lower triangular matrix A:

    A = [ 0     0     ···  0        0
          a21   0     ···  0        0
          a31   a32   ···  0        0
          ⋮     ⋮          ⋮        ⋮
          as1   as2   ···  as,s−1   0 ].

By the symmetry of Equations (3.11) in Theorem 3.5, we have

    bi aij − bj aji + bi bj(cj − ci) = 0  ⇐⇒  bj aji − bi aij + bj bi(ci − cj) = 0,

hence Equations (3.11) can be simplified to

    bi aij − bj aji + bi bj(cj − ci) = 0,   1 ≤ j < i ≤ s.

Since aji = 0 when j < i, the above formula can be written as

    bi aij + bi bj(cj − ci) = 0,   1 ≤ j < i ≤ s.

For a 3-stage, 3rd-order symplectic R–K–N method, we get the following equations for the
parameters:

    b2 a21 + b2 b1(c1 − c2) = 0,
    b3 a31 + b3 b1(c1 − c3) = 0,
    b3 a32 + b3 b2(c2 − c3) = 0,
    b1 + b2 + b3 = 1,                                                        (3.14)
    b1 c1 + b2 c2 + b3 c3 = 1/2,
    b1 c1² + b2 c2² + b3 c3² = 1/3,
    b2 a21 + b3 a31 + b3 a32 = 1/6,

and

    b̄i = bi(1 − ci),   i = 1, 2, 3.                                          (3.15)

Direct verification shows that

    A = [ 0     0     0
          7/36  0     0
          0    −1/2   0 ],

    b1 = 7/24,  b2 = 3/4,  b3 = −1/24,  c1 = c3 = 0,  c2 = 2/3

is a set of solutions of system (3.14).
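This verification can be reproduced with exact rational arithmetic; a short sketch (our own code, not from the text):

```python
from fractions import Fraction as F

b1, b2, b3 = F(7, 24), F(3, 4), F(-1, 24)
c1, c2, c3 = F(0), F(2, 3), F(0)
a21, a31, a32 = F(7, 36), F(0), F(-1, 2)

# symplecticity conditions of (3.14)
assert b2*a21 + b2*b1*(c1 - c2) == 0
assert b3*a31 + b3*b1*(c1 - c3) == 0
assert b3*a32 + b3*b2*(c2 - c3) == 0
# third-order conditions
assert b1 + b2 + b3 == 1
assert b1*c1 + b2*c2 + b3*c3 == F(1, 2)
assert b1*c1**2 + b2*c2**2 + b3*c3**2 == F(1, 3)
assert b2*a21 + b3*a31 + b3*a32 == F(1, 6)
# (3.15): the weights for the y-update
bb = [bi*(1 - ci) for bi, ci in zip((b1, b2, b3), (c1, c2, c3))]
assert bb == [F(7, 24), F(1, 4), F(-1, 24)]
```

All assertions pass, confirming both the symplecticity and the third-order conditions.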


The number of order conditions becomes 7 for a scheme of order 4. In addition,
if the scheme is of 3 stages, there are 3 equations for canonicity. Therefore we have
a total of 10 equations in 9 unknown variables (system (3.14) plus the parameters
b̄i (i = 1, 2, 3)). So we first construct a 4-stage scheme of 4th order, which requires
13 equations in 14 variables:
A 4-stage symplectic scheme needs to satisfy the following 6 conditions:

    b2 a21 + b2 b1(c1 − c2) = 0,
    b3 a31 + b3 b1(c1 − c3) = 0,
    b3 a32 + b3 b2(c2 − c3) = 0,
    b4 a41 + b4 b1(c1 − c4) = 0,
    b4 a42 + b4 b2(c2 − c4) = 0,
    b4 a43 + b4 b3(c3 − c4) = 0,

and the equations for the 4th-order conditions are as follows:

    b1 + b2 + b3 + b4 = 1,
    b1 c1 + b2 c2 + b3 c3 + b4 c4 = 1/2,
    b1 c1² + b2 c2² + b3 c3² + b4 c4² = 1/3,
    b2 a21 + b3 a31 + b3 a32 + b4 a41 + b4 a42 + b4 a43 = 1/6,
    b1 c1³ + b2 c2³ + b3 c3³ + b4 c4³ = 1/4,
    b2 c2 a21 + b3 c3 a31 + b3 c3 a32 + b4 c4 a41 + b4 c4 a42 + b4 c4 a43 = 1/8,
    b2 a21 c1 + b3 a31 c1 + b3 a32 c2 + b4 a41 c1 + b4 a42 c2 + b4 a43 c3 = 1/24.

Setting c4 = 0, we have

    b2 a21 + b2 b1(c1 − c2) = 0,                                             (3.16)
    b3 a31 + b3 b1(c1 − c3) = 0,                                             (3.17)
    b3 a32 + b3 b2(c2 − c3) = 0,                                             (3.18)
    b4 a41 + b4 b1 c1 = 0,                                                   (3.19)
    b4 a42 + b4 b2 c2 = 0,                                                   (3.20)
    b4 a43 + b4 b3 c3 = 0,                                                   (3.21)
    b1 + b2 + b3 + b4 = 1,                                                   (3.22)
    b1 c1 + b2 c2 + b3 c3 = 1/2,                                             (3.23)
    b1 c1² + b2 c2² + b3 c3² = 1/3,                                          (3.24)
    b2 a21 + b3 a31 + b3 a32 + b4 a41 + b4 a42 + b4 a43 = 1/6,               (3.25)
    b1 c1³ + b2 c2³ + b3 c3³ = 1/4,                                          (3.26)
    b2 c2 a21 + b3 c3 a31 + b3 c3 a32 = 1/8,                                 (3.27)
    b2 a21 c1 + b3 a31 c1 + b3 a32 c2 + b4 a41 c1 + b4 a42 c2 + b4 a43 c3 = 1/24.   (3.28)
We obtain a set of numerical solutions of (3.16) – (3.28):

    a21 = 0.2232896E−01,   a31 = 0.2822977E−08,
    a32 = 0.2886753,       a41 = 0.3053789E−01,
    a42 = −0.1057342,      a43 = −0.4251137,
    b1 = −0.3867491E−01,   b2 = 0.5000003,
    b3 = 0.5386746,        b4 = −0.9129767E−07,
    c1 = 0.7886753,        c2 = 0.2113249,
    c3 = 0.7886752.

Guessing from these numerical solutions, we set

    a31 = 0,  b2 = 1/2,  b3 = 1/2 − b1,  b4 = 0,  c1 = c3,  c2 = 1 − c1.
Inserting these into (3.16) – (3.28), we have

    (3.16) ⇐⇒ a21 + b1(2c1 − 1) = 0,
    (3.17) ⇐⇒ 0 = 0,
    (3.18) ⇐⇒ (1/2 − b1)a32 + (1/2)(1/2 − b1)(1 − 2c1) = 0,
    (3.19), (3.20), (3.21), (3.22), (3.23) ⇐⇒ 0 = 0, or 1 = 1,
    (3.24) ⇐⇒ (1/2)c1² + (1/2)(1 − c1)² = 1/3,
    (3.25) ⇐⇒ (1/2)a21 + (1/2 − b1)a32 = 1/6,
    (3.26) ⇐⇒ c1² − c1 + 1/6 = 0,
    (3.27) ⇐⇒ (1/2)(1 − c1)a21 + c1 a32(1/2 − b1) = 1/8,
    (3.28) ⇐⇒ (1/2)a21 c1 + (1/2 − b1)a32(1 − c1) = 1/24.

So we obtain a system of equations in the variables a21, a32, b1, c1:

    a21 + b1(2c1 − 1) = 0,                                                   (3.29)
    (1/2 − b1)a32 + (1/2)(1/2 − b1)(1 − 2c1) = 0,                            (3.30)
    (1/2)c1² + (1/2)(1 − c1)² = 1/3,                                         (3.31)
    (1/2)a21 + (1/2 − b1)a32 = 1/6,                                          (3.32)
    (1/2)(1 − c1)a21 + c1 a32(1/2 − b1) = 1/8,                               (3.33)
    (1/2)a21 c1 + (1/2 − b1)a32(1 − c1) = 1/24.                              (3.34)
From (3.31), we have c1² − c1 + 1/6 = 0, which leads to c1 = (3 + √3)/6 or c1 = (3 − √3)/6.
Suppose b1 ≠ 1/2; from (3.30) we obtain

    a32 = (1/2)(2c1 − 1),                                                    (3.35)

so a32 = √3/6 or a32 = −√3/6. Under (3.35), Equation (3.32) becomes

    (1/2)a21 + (1/2 − b1) · (1/2)(2c1 − 1) = 1/6.                            (3.36)

Multiplying (3.36) by 2 and adding (3.29), we find

    2a21 = 1/3 − (1/2)(2c1 − 1),   i.e.,  a21 = (2 − √3)/12 or (2 + √3)/12.
From (3.29), we obtain b1 = (3 − 2√3)/12 or (3 + 2√3)/12. Therefore, we have reason to
speculate that

    a21 = (2 − √3)/12,  a32 = √3/6,   b1 = (3 − 2√3)/12,  c1 = (3 + √3)/6

and

    a21 = (2 + √3)/12,  a32 = −√3/6,  b1 = (3 + 2√3)/12,  c1 = (3 − √3)/6

are two sets of solutions of (3.29) – (3.34). Direct verification shows that they are
indeed solutions of equations (3.16) – (3.28). Thus we obtain two sets of analytic
solutions of the original system (3.16) – (3.28).

Solution 1:  a21 = (2 − √3)/12,  a31 = 0,  a32 = √3/6,  a41, a42, a43 arbitrary;

    b1 = (3 − 2√3)/12,  b2 = 1/2,  b3 = (3 + 2√3)/12,  b4 = 0,
    c1 = (3 + √3)/6,  c2 = (3 − √3)/6,  c3 = (3 + √3)/6,  c4 = 0.

Solution 2:  a21 = (2 + √3)/12,  a31 = 0,  a32 = −√3/6,  a41, a42, a43 arbitrary;

    b1 = (3 + 2√3)/12,  b2 = 1/2,  b3 = (3 − 2√3)/12,  b4 = 0,
    c1 = (3 − √3)/6,  c2 = (3 + √3)/6,  c3 = (3 − √3)/6,  c4 = 0.

Since b4 = c4 = 0, and b̄4 = b4(1 − c4) = 0 in the two solutions, we obtain two
3-stage symplectic explicit R–K–N methods of order 4. They are

Scheme 1:

    ci                 aij
    (3 + √3)/6   |   0            0     0
    (3 − √3)/6   |   (2 − √3)/12  0     0
    (3 + √3)/6   |   0            √3/6  0
    -------------+--------------------------------------
    b̄i           |   (5 − 3√3)/24  (3 + √3)/12  (1 + √3)/24
    bi           |   (3 − 2√3)/12  1/2          (3 + 2√3)/12

Scheme 2:

    ci                 aij
    (3 − √3)/6   |   0            0      0
    (3 + √3)/6   |   (2 + √3)/12  0      0
    (3 − √3)/6   |   0            −√3/6  0
    -------------+--------------------------------------
    b̄i           |   (5 + 3√3)/24  (3 − √3)/12  (1 − √3)/24
    bi           |   (3 + 2√3)/12  1/2          (3 − 2√3)/12
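Scheme 1 can be checked directly against the symplecticity relations and the order conditions (3.22) – (3.28); a numerical sketch (our own code, not from the text):

```python
import numpy as np

s3 = np.sqrt(3.0)
b = np.array([(3 - 2*s3)/12, 1/2, (3 + 2*s3)/12])
c = np.array([(3 + s3)/6, (3 - s3)/6, (3 + s3)/6])
A = np.array([[0.0, 0.0, 0.0],
              [(2 - s3)/12, 0.0, 0.0],
              [0.0, s3/6, 0.0]])

# symplecticity: a_ij = b_j (c_i - c_j) for i > j, and b̄_j = b_j (1 - c_j)
for i in range(3):
    for j in range(i):
        assert abs(A[i, j] - b[j]*(c[i] - c[j])) < 1e-14
b_bar = b * (1 - c)
assert np.allclose(b_bar, [(5 - 3*s3)/24, (3 + s3)/12, (1 + s3)/24])

# fourth-order conditions (3.22)-(3.28), with b4 = c4 = 0 dropped
assert abs(b.sum() - 1) < 1e-14
assert abs(b @ c - 1/2) < 1e-14
assert abs(b @ c**2 - 1/3) < 1e-14
assert abs(b @ c**3 - 1/4) < 1e-14
assert abs(b @ (A @ np.ones(3)) - 1/6) < 1e-14
assert abs((b*c) @ (A @ np.ones(3)) - 1/8) < 1e-14
assert abs(b @ (A @ c) - 1/24) < 1e-14
```

The same script, with √3 replaced by −√3 throughout, verifies Scheme 2.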

Remark 3.6. We can obtain the required solutions more easily by solving only the first 6
equations arising from the simplified order conditions for symplectic R–K–N methods.

7.3.3 Simplified Order Conditions for the Symplectic R–K–N Method

In Subsection 7.3.1, we gave a preliminary account of the simplified order conditions for
the symplectic R–K–N methods. In this subsection, we will simplify them further. The
key is to make full use of the symplectic conditions [ZQ95b].

Let Sτ be an S-tree of order n ≥ 3; it has at least two fat vertices. Let λSτ be a
labeling, and let v and w be two fat vertices connected via a meagre vertex u. (For
S-trees of order ≤ 2 there is only one rooted S-tree of first order and one of second
order, so the question of order conditions for different trees of the same order being
related does not arise.) We consider six rooted S-trees. Let us denote by ρSτ^v
(resp. ρSτ^w) the rooted S-tree obtained by regarding the vertex v (resp. w) as the
root of Sτ. Let us denote by ρSτ^{vu} (resp. ρSτ^{wu}) the rooted S-tree with root
v (resp. w) that arises when the edge (u, w) (resp. (v, u)) is deleted from Sτ. Let us
denote by ρSτ_v and ρSτ_w the rooted S-trees with roots v and w, respectively, which
arise when both edges (u, v) and (u, w) are deleted from Sτ. Fig. 3.1 shows the rooted
trees of Theorem 3.7.

[Fig. 3.1. The rooted S-trees Sτ, ρSτ^v, ρSτ^w, ρSτ^{vu}, ρSτ^{wu}, ρSτ_v, ρSτ_w.]

Theorem 3.7. With the above notations, we have:

    1°  1/γ(ρSτ^v) − 1/γ(ρSτ^w) = 1/(γ(ρSτ^{vu})γ(ρSτ_w)) − 1/(γ(ρSτ^{wu})γ(ρSτ_v)).

And if the R–K–N method (3.4) satisfies (3.7), then

    2°  φ(ρSτ^v) − φ(ρSτ^w) = φ(ρSτ^{vu})φ(ρSτ_w) − φ(ρSτ^{wu})φ(ρSτ_v).
Proof. Let

    r(ρSτ_v) = x,   r(ρSτ_w) = y,   n = r(Sτ) = x + y + 1.

By the definition of γ, we have

    γ(ρSτ^v) = n (y + 1) ∏_1 γ(ρSτ_w),
                                                                             (3.37)
    γ(ρSτ^w) = n (x + 1) ∏_2 γ(ρSτ_v),

where ∏_1 (resp. ∏_2) denotes the product of the densities γ(τ_i) of the sub-trees τ_i
which arise when v (resp. w) is chopped from ρSτ_v (resp. ρSτ_w). Notice that γ is
calculated here as for a general tree τ, the difference between black and white vertices
being neglected. Then from (3.37), we have

    1/γ(ρSτ^v) − 1/γ(ρSτ^w)
        = (1/n) · [ (x + 1)∏_2 γ(ρSτ_v) − (y + 1)∏_1 γ(ρSτ_w) ] / [ (x + 1)∏_1 (y + 1)∏_2 γ(ρSτ_v)γ(ρSτ_w) ].   (3.38)

Because

    γ(ρSτ^{vu}) = (x + 1) ∏_1,   γ(ρSτ^{wu}) = (y + 1) ∏_2,

and

    γ(ρSτ_v) = x ∏_1,   γ(ρSτ_w) = y ∏_2,

we have

    1/γ(ρSτ^v) − 1/γ(ρSτ^w) = (1/n) · ∏_1 ∏_2 (x² − y² + x − y) / [ γ(ρSτ^{vu})γ(ρSτ^{wu})γ(ρSτ_v)γ(ρSτ_w) ].

However,

    1/(γ(ρSτ^{vu})γ(ρSτ_w)) − 1/(γ(ρSτ^{wu})γ(ρSτ_v))
        = n[ γ(ρSτ^{wu})γ(ρSτ_v) − γ(ρSτ^{vu})γ(ρSτ_w) ] / ( n γ(ρSτ^{vu})γ(ρSτ_w)γ(ρSτ^{wu})γ(ρSτ_v) )
        = n[ (y + 1)∏_2 · x∏_1 − (x + 1)∏_1 · y∏_2 ] / ( n γ(ρSτ^{vu})γ(ρSτ^{wu})γ(ρSτ_v)γ(ρSτ_w) )
        = (x + y + 1) ∏_1 ∏_2 [ x(y + 1) − (x + 1)y ] / ( n γ(ρSτ^{vu})γ(ρSτ^{wu})γ(ρSτ_v)γ(ρSτ_w) )
        = ∏_1 ∏_2 (x² − y² + x − y) / ( n γ(ρSτ^{vu})γ(ρSτ^{wu})γ(ρSτ_v)γ(ρSτ_w) ).                     (3.39)

Thus, we get 1°.

By the definition of φ, we have

    φ(ρSτ^{vu}) = Σ_{iv} b_{iv} c_{iv} ∏^v,   φ(ρSτ_v) = Σ_{iv} b_{iv} ∏^v,
    φ(ρSτ^{wu}) = Σ_{iw} b_{iw} c_{iw} ∏^w,   φ(ρSτ_w) = Σ_{iw} b_{iw} ∏^w,

and

    φ(ρSτ^v) = Σ_{iv,iw} b_{iv} a_{iv iw} ∏^v ∏^w,
                                                                             (3.40)
    φ(ρSτ^w) = Σ_{iv,iw} b_{iw} a_{iw iv} ∏^v ∏^w,

where Σ ∏^v (resp. Σ ∏^w) denotes the part of φ(ρSτ^v) (resp. φ(ρSτ^w)) which is
summed over the black vertices of ρSτ_v (resp. ρSτ_w). From the symplectic condition
(3.11), we have

    φ(ρSτ^v) − φ(ρSτ^w) = Σ_{iv,iw} (b_{iv} a_{iv iw} − b_{iw} a_{iw iv}) ∏^v ∏^w
                        = Σ_{iv,iw} b_{iv} b_{iw} (c_{iv} − c_{iw}) ∏^v ∏^w
                        = Σ_{iv} b_{iv} c_{iv} ∏^v · Σ_{iw} b_{iw} ∏^w − Σ_{iw} b_{iw} c_{iw} ∏^w · Σ_{iv} b_{iv} ∏^v
                        = φ(ρSτ^{vu})φ(ρSτ_w) − φ(ρSτ^{wu})φ(ρSτ_v).         (3.41)

Thus, the second part 2° of Theorem 3.7 holds. The following corollary is then immediate.

Corollary 3.8. Suppose that the R–K–N method (3.4), satisfying (3.7), has order at
least n − 1, with n ≥ 3, and that ρSτ^v and ρSτ^w are two different rooted S-trees of
order n obtained from the same S-tree Sτ as in Theorem 3.7. Then the order condition

    φ(ρSτ^v) = 1/γ(ρSτ^v)

holds, iff

    φ(ρSτ^w) = 1/γ(ρSτ^w)

is satisfied.

Proof. Because the R–K–N method (3.4) is of order at least n − 1, by Theorem 3.5 we
have

    φ(ρSτ^{vu}) = 1/γ(ρSτ^{vu}),   φ(ρSτ^{wu}) = 1/γ(ρSτ^{wu}),

and

    φ(ρSτ_v) = 1/γ(ρSτ_v),   φ(ρSτ_w) = 1/γ(ρSτ_w);

then by Theorem 3.7, the corollary is proved. □

So we have the conclusion of this subsection.

Theorem 3.9. [SSC94,ZQ95b] An R–K–N method (3.4) that satisfies the symplectic conditions
is of order p, iff for every S-tree Sτ there exists a rooted S-tree ρSτ^v, which arises
when a black vertex v of Sτ is lifted to the root, such that

    φ(ρSτ^v) = 1/γ(ρSτ^v).

Proof. By Corollary 3.8, we know that any two different choices of the root lead to
equivalent order conditions. Hence the theorem is proved. □

As an application of Theorem 3.9, we consider the explicit R–K–N method, i.e.,
a_ij = 0 for j ≥ i (i, j = 1, 2, ···, s), in the non-redundant case, i.e., b_i ≠ 0 (i =
1, 2, ···, s); see [OS92]. Then we have a_ij = b_j(c_i − c_j) for i ≥ j (i, j = 1, 2, ···, s).
So we obtain the following corollary.

Corollary 3.10. A non-redundant R–K–N method (3.4) is explicit, symplectic, and of
order p, iff:

    1°  a_ij = b_j(c_i − c_j),  1 ≤ j < i ≤ s;
    2°  b̄_j = b_j(1 − c_j),  1 ≤ j ≤ s;
    3°  for every S-tree Sτ there exists a rooted S-tree ρSτ^v, which arises when a
black vertex v of Sτ is lifted to the root, such that

        φ(ρSτ^v) = 1/γ(ρSτ^v).

To obtain a 5-stage, fifth-order non-redundant symplectic explicit R–K–N method,
the following equations must be satisfied:

    a_ij = b_j(c_i − c_j),   1 ≤ j < i ≤ s,                                  (3.42)
    b̄_j = b_j(1 − c_j),   1 ≤ j ≤ s,                                        (3.43)

and

    Σ_{j=1}^{5} bj = 1,                                                      (3.44)
    Σ_{j=1}^{5} bj cj = 1/2,                                                 (3.45)
    Σ_{j=1}^{5} bj cj² = 1/3,                                                (3.46)
    Σ_{j,l=1}^{5} bj a_jl = 1/6,                                             (3.47)
    Σ_{j=1}^{5} bj cj³ = 1/4,                                                (3.48)
    Σ_{j,m=1}^{5} bj cj a_jm = 1/8,                                          (3.49)
    Σ_{j=1}^{5} bj cj⁴ = 1/5,                                                (3.50)
    Σ_{j,p=1}^{5} bj cj² a_jp = 1/10,                                        (3.51)
    Σ_{j,l,p=1}^{5} bj a_jl a_jp = 1/20,                                     (3.52)
    Σ_{j,l=1}^{5} bj cj a_jl cl = 1/30.                                      (3.53)

Replacing a_ij in equations (3.44) – (3.53) by (3.42), we get a system of
10 equations for the parameters b_i, c_i (i = 1, ···, 5). Every order condition of the
system (3.44) – (3.53) corresponds to the S-tree with the same number in Fig. 3.2.
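Under the substitution (3.42), the ten conditions become residuals in (b, c) alone. A sketch of such an evaluator (the function name is ours, not from the text); applied, for example, to the 3-stage fourth-order Scheme 1 of Subsection 7.3.2, the conditions up to order 4 vanish while the genuinely fifth-order ones do not:

```python
import numpy as np

def order_condition_residuals(b, c):
    """Residuals of the conditions (3.44)-(3.53) for a non-redundant explicit
    symplectic R-K-N tableau with a_ij = b_j (c_i - c_j) for i > j.
    (A sketch; the function name is ours.)"""
    b, c = np.asarray(b, float), np.asarray(c, float)
    s = len(b)
    A = np.zeros((s, s))
    for i in range(s):
        for j in range(i):
            A[i, j] = b[j] * (c[i] - c[j])
    e = np.ones(s)
    return np.array([
        b @ e - 1,                  # (3.44)
        b @ c - 1/2,                # (3.45)
        b @ c**2 - 1/3,             # (3.46)
        b @ (A @ e) - 1/6,          # (3.47)
        b @ c**3 - 1/4,             # (3.48)
        (b*c) @ (A @ e) - 1/8,      # (3.49)
        b @ c**4 - 1/5,             # (3.50)
        (b*c**2) @ (A @ e) - 1/10,  # (3.51)
        b @ (A @ e)**2 - 1/20,      # (3.52)
        (b*c) @ (A @ c) - 1/30,     # (3.53)
    ])

r = order_condition_residuals(
        [(3 - 2*np.sqrt(3))/12, 1/2, (3 + 2*np.sqrt(3))/12],
        [(3 + np.sqrt(3))/6, (3 - np.sqrt(3))/6, (3 + np.sqrt(3))/6])
```

For this fourth-order tableau the first six residuals are zero to round-off, and the last four are not; a 5-stage, fifth-order solution must annihilate all ten.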
[Fig. 3.2. The ten rooted S-trees corresponding to the order conditions (3.44) – (3.53).]

For the sake of convenience, we choose monotonic labelings for the trees in Fig. 3.2,
and thereby obtain equations (3.44) – (3.53). In the following list we provide four sets
of numerical solutions, whose underlying structure is yet to be studied further.

    i          1           2           3           4           5
    ------------------------------------------------------------
    1   bi   0.396826   −0.824374    0.204203    1.002182    0.221161
        ci   0.961729    0.866475    0.127049    0.754358    0.229296
    2   bi   0.221160    1.002182    0.204203   −0.824375    0.396827
        ci   0.770703    0.245641    0.872950    0.133524    0.038270
    3   bi  −1.670799    1.221431    0.088495    0.959970    0.400902
        ci   0.694313    0.637071   −0.020556    0.795861    0.301165
    4   bi   0.400902    0.959969    0.088495    1.221434   −1.670802
        ci   0.698834    0.204138    1.020556    0.362928    0.305086

Remark 3.11. The R–K, P–R–K, and R–K–N methods each have corresponding order condi-
tions. The order conditions for the symplectic R–K, symplectic P–R–K, and symplectic
R–K–N methods can be simplified using the symplectic conditions. The order conditions
for orders 1 to 8 have already been listed in Table 1.4. Calvo and Hairer [CH95] further
reduced the number of independent conditions for the R–K–N method; see Table 3.1.

Table 3.1. Order conditions for the R–K–N method and the symplectic R–K–N method in the general case
Order R–K–N method Symplectic R–K–N method
1 1 1
2 1 1
3 2 2
4 3 2
5 6 4
6 10 5
7 20 10
8 36 14

7.4 Formal Energy for Symplectic R–K Method


The energy H(z) of a Hamiltonian system is an invariant of the system. How-
ever, under normal circumstances, no symplectic scheme can preserve the original
Hamiltonian energy exactly [Fen98a]. On the other hand, any symplectic scheme preserves
a formal Hamiltonian energy, which approximates the original Hamiltonian energy to
the accuracy of the numerical scheme. The calculation of the formal energy can be
done in many ways. First, a theoretically complete method has been obtained for the
formal energy of a symplectic difference scheme constructed by generating functions
[Fen98a]. Yoshida [Yos90] uses the Lie series of the BCH formula to determine the
formal energy of separable Hamiltonians. What is unsatisfactory is that the existing
computational methods for the formal energy of a symplectic R–K method mostly use
the Poincaré lemma and then quadrature. Although theoretically the primitive function
(total differential) does exist, obtaining it through integration is not that easy. There-
fore, we attempt to calculate the formal energy of a symplectic R–K method in an easy
way that needs neither integration nor differentiation.

7.4.1 Modified Equation


Consider the numerical solution of ODEs

ż = f (z), z ∈ Rn . (4.1)

The R–K method for Equation (4.1) is defined as follows:


    ki = f(z0 + h Σ_{j=1}^{s} a_ij kj),                                      (4.2)

    z1 = z0 + h Σ_{i=1}^{s} bi ki.                                           (4.3)

Since the fundamental work of Butcher, the numerical solution z1 can be written as
(supposing f is sufficiently differentiable):

    z1 = z0 + Σ_{t∈T} (h^{r(t)}/r(t)!) α(t) γ(t) (Σ_{i=1}^{s} bi φi(t)) F(t)(z0).   (4.4)

Definition 4.1. [Hai94] Let t be a rooted tree. A partition of t into k subtrees {s1 , . . . , sk }
is a set S, consisting of k − 1 branches of t such that the trees s1 , . . . , sk are obtained
when the branches of S are removed from t. Such a partition is denoted by (t, S). We
further denote α(t, S) as the number of possible monotonic labelings of t such that
the vertices of the subtrees sj are labeled consecutively.
Example 4.2. All partitions of t = [[τ], [τ]] into k subtrees, with the numbers
α(t, S):

    [Diagrams of the ten partitions for k = 1, ···, 5, with α(t, S) = 3, 2, 1, 1, 1, 2, 1, 3, 2, 3, respectively.]

Suppose a numerical method can be expressed as a formal series

    z1 = z0 + Σ_{t∈T} (h^{r(t)}/r(t)!) α(t) a(t) F(t)(z0),                   (4.5)

Table 4.1. Relation between coefficients a(t) and b(t) (1)

    t = •:       a(•) = b(•)
    t = [τ]:     a(t) = b(t) + b(•)²
    t = [τ,τ]:   a(t) = b(t) + (3/2)b([τ])b(•) + b(•)³
    t = [[τ]]:   a(t) = b(t) + 3b([τ])b(•) + b(•)³
where a : T → R is an arbitrary function. Such a series is called a B-series. If the
function f(z) is only N-times continuously differentiable, then the series (4.5) has to
be interpreted as a finite sum over t ∈ T with r(t) ≤ N.
Theorem 4.3. [Hai94] Let a : T → R be an arbitrary mapping, and let the right-hand side
f(z) of equation (4.1) be N-times continuously differentiable. The numerical solution
given by (4.5) satisfies

    z1 = z̃(t0 + h) + O(h^{N+1}).                                            (4.6)

Here, z̃(t) is the exact solution of the modified equation

    ż̃ = Σ_{r(t)≤N} (h^{r(t)−1}/r(t)!) α(t) b(t) F(t)(z̃),                    (4.7)

where the coefficients b(t) can be defined recursively by

    a(t) = Σ_{k=1}^{r(t)} (1/k!) Σ_{(t,S)} ( r(t)!/(r(s1)! ··· r(sk)!) ) (α(t, S)/α(t)) b(s1) ··· b(sk).   (4.8)

The second sum in (4.8) is over all partitions of t into k subtrees {s1, ···, sk}.
By (4.8), we can determine the relations between the coefficients a(t) and b(t); see
Table 4.1 to Table 4.4 [LQ01]. According to Tables 4.1 – 4.4, we can determine the
modified equation for an R–K method up to order 5; it is clear that, by continuing to
add the trees of order 6, 7, ···, the modified equation can be obtained to any order.
Remark 4.4. If the numerical method is symmetric (time-reversible), then b(t) = 0
whenever r(t) is even.

Remark 4.5. If the numerical method is of order p, in other words a(t) = 1 for r(t) ≤ p,
then b(•) = 1 and b(t) = 0 for 2 ≤ r(t) ≤ p, while b(t) = a(t) − 1 if r(t) = p + 1.

Table 4.2. Relation between coefficients a(t) and b(t) (2)

    t = [τ,τ,τ]:  a(t) = b(t) + 2b([τ,τ])b(•) + 2b([τ])b(•)² + b(•)⁴
    t = [[τ],τ]:  a(t) = b(t) + (4/3)b([τ,τ])b(•) + (2/3)b([[τ]])b(•) + b([τ])² + (10/3)b([τ])b(•)² + b(•)⁴
    t = [[τ,τ]]:  a(t) = b(t) + 2b([τ,τ])b(•) + 2b([[τ]])b(•) + 4b([τ])b(•)² + b(•)⁴
    t = [[[τ]]]:  a(t) = b(t) + 4b([[τ]])b(•) + 3b([τ])² + 6b([τ])b(•)² + b(•)⁴
Example 4.6. Centered Euler scheme:

    z_{n+1} = z_n + h f((z_n + z_{n+1})/2).

Its modified equation can be determined as¹

    ż̃ = f(z̃) − (h²/24)[ f⁽²⁾(f, f) − 2f⁽¹⁾f⁽¹⁾f ]
        − (h⁴/120)[ (7/48)f⁽⁴⁾(f, f, f, f) + (1/4)f⁽³⁾(f, f, f⁽¹⁾f) − (1/4)f⁽²⁾(f, f⁽²⁾(f, f))
        − (3/2)f⁽²⁾(f, f⁽¹⁾f⁽¹⁾f) + (3/4)f⁽²⁾(f⁽¹⁾f, f⁽¹⁾f) − (7/12)f⁽¹⁾f⁽³⁾(f, f, f)
        − (1/2)f⁽¹⁾f⁽²⁾(f, f⁽¹⁾f) + (1/4)f⁽¹⁾f⁽¹⁾f⁽²⁾(f, f) + (3/2)f⁽¹⁾f⁽¹⁾f⁽¹⁾f⁽¹⁾f ] + O(h⁶).

Example 4.7. 2-stage Gauss–Legendre method [HNW93]:

¹ b(t) computed as in Section 7.5.

Table 4.3. Relation between coefficients a(t) and b(t) (3)

    t = [τ,τ,τ,τ]:  a(t) = b(t) + (5/2)b([τ,τ,τ])b(•) + (10/3)b([τ,τ])b(•)² + (5/2)b([τ])b(•)³ + b(•)⁵
    t = [[τ,τ,τ]]:  a(t) = b(t) + (5/2)b([τ,τ,τ])b(•) + (5/2)b([[τ,τ]])b(•) + (10/3)b([τ,τ])b(•)²
                           + (10/3)b([[τ]])b(•)² + 5b([τ])b(•)³ + b(•)⁵
    t = [[[[τ]]]]:  a(t) = b(t) + 5b([[[τ]]])b(•) + 10b([[τ]])b([τ]) + 10b([[τ]])b(•)²
                           + 15b([τ])²b(•) + 10b([τ])b(•)³ + b(•)⁵

    1/2 − √3/6 | 1/4          1/4 − √3/6
    1/2 + √3/6 | 1/4 + √3/6   1/4
    -----------+------------------------
               | 1/2          1/2

Its modified equation is²

    ż̃ = f(z̃) − (37h⁴/8840)[ f⁽⁴⁾(f, f, f, f) − 4f⁽¹⁾f⁽³⁾(f, f, f) ]
        − (h⁴/720)[ f⁽³⁾(f, f, f⁽¹⁾f) − f⁽²⁾(f, f⁽²⁾(f, f)) − 2f⁽¹⁾f⁽²⁾(f, f⁽¹⁾f) + f⁽¹⁾f⁽¹⁾f⁽²⁾(f, f) ]
        − (37h⁴/2880)[ f⁽²⁾(f⁽¹⁾f, f⁽¹⁾f) − 2f⁽²⁾(f, f⁽¹⁾f⁽¹⁾f) + 2f⁽¹⁾f⁽¹⁾f⁽¹⁾f⁽¹⁾f ] + O(h⁶).
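Although no symplectic scheme preserves a general Hamiltonian exactly, symplectic R–K methods such as the Gauss scheme above satisfy bᵢaᵢⱼ + bⱼaⱼᵢ = bᵢbⱼ and hence preserve quadratic first integrals; for the harmonic oscillator the energy is therefore conserved to round-off. A numerical check (our own sketch, not from the text; for linear f(z) = Lz the stage equations form a linear system that can be solved exactly):

```python
import numpy as np

# 2-stage Gauss-Legendre tableau from Example 4.7
s3 = np.sqrt(3.0)
A = np.array([[1/4, 1/4 - s3/6],
              [1/4 + s3/6, 1/4]])
b = np.array([1/2, 1/2])

# harmonic oscillator: z = (p, q), z' = L z, H(z) = (p^2 + q^2)/2
L = np.array([[0.0, -1.0],
              [1.0,  0.0]])

def gauss_step(z, h):
    """For linear f(z) = L z the stages satisfy (I - h A (x) L) k = 1 (x) (L z),
    which is solved here exactly."""
    n = len(z)
    M = np.eye(2 * n) - h * np.kron(A, L)
    k = np.linalg.solve(M, np.kron(np.ones(2), L @ z)).reshape(2, n)
    return z + h * (b[:, None] * k).sum(axis=0)

z = np.array([1.0, 0.0])
H0 = 0.5 * z @ z
for _ in range(1000):
    z = gauss_step(z, 0.1)
assert abs(0.5 * z @ z - H0) < 1e-12   # quadratic invariant preserved
```

For a non-quadratic Hamiltonian the exact energy drifts, and it is the formal energy of Section 7.4 that is (formally) preserved instead.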

Example 4.8. 2nd-order diagonally implicit R–K method (its tableau is given below):

Modified equation:

² b(t) computed as in Section 7.5.

Table 4.4. Relation between coefficients a(t) and b(t) (4)
(Expansions of a(t) in terms of the values b(u) on subtrees for the remaining order-5 rooted trees; the rooted-tree diagrams are omitted.)

ż(t) = f(z̃) − (h^2/96)( f^(2)(f,f) − 2f^(1)f^(1)f )
  − (17h^4/10240)( f^(4)(f,f,f,f) − 4f^(1)f^(3)(f,f,f) )
  − (13h^4/2560)( f^(3)(f,f,f^(1)f) − f^(2)(f,f^(2)(f,f)) − 2f^(1)f^(2)(f,f^(1)f) + f^(1)f^(1)f^(2)(f,f) )
  − (h^4/2560)( f^(2)(f^(1)f,f^(1)f) − 2f^(2)(f,f^(1)f^(1)f) + 2f^(1)f^(1)f^(1)f^(1)f ) + O(h^6).

1/4 | 1/4   0
3/4 | 1/2   1/4
----+-----------
    | 1/2   1/2

7.4.2 Formal Energy for Symplectic R–K Method


Let T be the set of all rooted trees. On T we define a relation ∼ as follows:
1◦ t ∼ t.
2◦ u ◦ v ∼ v ◦ u.
3◦ If u_1 ◦ v_1 ∼ u_2 ◦ v_2 and u_2 ◦ v_2 ∼ u_3 ◦ v_3, then u_1 ◦ v_1 ∼ u_3 ◦ v_3.
Here u ◦ v and v ◦ u are defined by: if u = [u_1, · · · , u_m] and v = [v_1, · · · , v_l], then
u ◦ v = [u_1, · · · , u_m, v],   v ◦ u = [v_1, · · · , v_l, u].
Obviously, “∼” is an equivalence relation. Classifying the rooted-tree collection T by this equivalence relation, we obtain a quotient set, denoted by TE. We may then construct another set T̃E by selecting one representative from each equivalence class of TE. The selection rule is as follows: if t̄ ∈ TE, then t̄ is a class of rooted trees; we order the elements of t̄ according to σ(t) and choose a t for which σ(t) is largest. In general such t is not unique, and we may choose any one of them. For each element in T̃E we define a quasi elementary differential

F*(t) = f^(m−1)(F(t_1), F(t_2), · · · , F(t_m)).  (4.9)
We call (4.9) a quasi elementary differential because, under normal circumstances, the elementary differential is defined as

F(t) = f^(m)(F(t_1), F(t_2), · · · , F(t_m)),

where f^(m)(K_1, K_2, · · · , K_m) is the m-th order Fréchet derivative.


Here, we regard f^(m−1)(K_1, K_2, · · · , K_m) as a formal definition. Obviously, when f is the differential of some function, f^(m−1)(K_1, K_2, · · · , K_m) becomes the m-th order Fréchet derivative of its primitive function. For example, for t = [τ, τ, τ, τ] we have F*(t) = f^(3)(f, f, f, f). Let f = JH_z; then F*(t) becomes JH^(4)(JH_z, JH_z, JH_z, JH_z), which is obviously a fourth-order Fréchet derivative. We use L to express this mapping, namely

L: f^(m−1)(K_1(f), K_2(f), · · · , K_m(f)) −→ JH^(m)(K_1(JH_z), K_2(JH_z), · · · , K_m(JH_z)).


Obviously, L is a one-to-one mapping. From now on we will always use L to express this mapping unless specified otherwise. We will no longer distinguish F*(t) from L(F*(t)); an operation on F*(t) is always understood as an operation on L(F*(t)).
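The grafting operations u ◦ v and v ◦ u are easy to experiment with if rooted trees are encoded as nested tuples. The sketch below (our own encoding, not from the book; all names are illustrative) builds u ◦ v and v ◦ u for u = [τ, τ], v = [τ] and checks that they are distinct rooted trees on the same number of vertices:

```python
# Rooted trees encoded as sorted tuples of subtrees; the one-vertex tree tau is ().

def circ(u, v):
    """u ∘ v = [u_1, ..., u_m, v]: graft v onto the root of u."""
    return tuple(sorted(u + (v,)))

def order(t):
    """Number of vertices r(t) of the rooted tree t."""
    return 1 + sum(order(c) for c in t)

tau = ()
u = (tau, tau)                        # u = [tau, tau]
v = (tau,)                            # v = [tau]
uv, vu = circ(u, v), circ(v, u)

assert uv != vu                       # different rooted trees ...
assert order(uv) == order(vu) == 5    # ... on the same number of vertices
```

The two trees uv and vu are exactly a pair u ◦ v ∼ v ◦ u: the same unrooted tree with two different choices of root.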


Lemma 4.9. Let (A, b) be a symplectic R–K method and a(t) = γ(t) Σ_{j=1}^{s} b_j φ_j(t); then

a(u◦v)/γ(u◦v) + a(v◦u)/γ(v◦u) = (a(u)/γ(u)) · (a(v)/γ(v)),   u, v ∈ TP.  (4.10)

Proof. See [CS94]. □
Lemma 4.10. Let H̃(z, h) be the formal energy of the symplectic R–K method (A, b); then the corresponding modified equation is (possibly differing by an arbitrary constant)

ż = f̃(z̃) = J H̃_z̃.  (4.11)

Conversely, if the modified equation of the symplectic R–K method (A, b) is ż = f̃(z̃), then we have

H̃_z = −J f̃(z).  (4.12)

Proof. Note that both the modified equation and the formal energy are obtained essentially via series expansion; Lemma 4.10 is then obvious. □

Lemma 4.11. Let (A, b) be a symplectic R–K method and a(t) = γ(t) Σ_{j=1}^{s} b_j φ_j(t); then

b(u◦v)/γ(u◦v) + b(v◦u)/γ(v◦u) = 0,   u, v ∈ TP,  u ≠ v,  (4.13)

where b(t) is determined recursively by (4.8).

Proof. The proof follows by slightly modifying the argument for Lemma 10 in [Hai94]; we omit the details. □

Remark 4.12. By (4.13) and the relation α(t)γ(t)σ(t) = r(t)! we obtain

(α(u◦v) b(u◦v)) / (α(v◦u) b(v◦u)) + σ(v◦u)/σ(u◦v) = 0.  (4.14)

We now need another rooted-tree coefficient ν(t). Note first that the rooted trees u◦v and v◦u represent the same unrooted tree with different roots, i.e., selecting the other vertex of u◦v as root leads to the tree v◦u. Thus, within an equivalence class, one can always transform one rooted tree into another by selecting a different root. Take t̄ ∈ TE; then t̄ is an equivalence class of rooted trees (t denotes a representative element). Let u ∈ t̄; then u can be obtained by selecting some vertex of t as the root node. We denote by ν(u) the number of vertices of t that may be so selected.
Example 4.13. Let t = [τ, τ, [τ]]; then t̄ = {t_1, t_2, t_3, t_4}, where t_1, t_2, t_3, t_4 are the rooted trees obtained by choosing the inequivalent vertices of the underlying unrooted tree as roots (tree diagrams omitted). We have

ν(t1 ) = 1, ν(t2 ) = 1, ν(t3 ) = 1, ν(t4 ) = 2.


Then the following relation holds for ν(t):

ν(u ◦ v) · σ(u ◦ v) = ν(v ◦ u) · σ(v ◦ u). (4.15)

Lemma 4.14. Let u, v be rooted trees with u ≠ v; then

F ∗ (u ◦ v) = −F ∗ (v ◦ u). (4.16)

Proof. Let α = J^{-1}F(u) and β = J^{-1}F(v), where F(u), F(v) are elementary differentials. If u = [u_1, u_2, · · · , u_m], then

F(u) = f^(m)(F(u_1), F(u_2), · · · , F(u_m)),

where f = JH_z, and F(v) is defined similarly.

By the properties of elementary differentials (multilinear mappings), α and β are 2n-dimensional vectors, and

F*(u◦v) = J α^T J β = −J β^T J α = −F*(v◦u).

This completes the proof. □


Lemma 4.15. Let t* ∈ T̃E; then

α(t*) b(t*) ∇F*(t*) = Σ_{t∈t̄*} α(t) b(t) F(t).  (4.17)

Proof. First, the right side of (4.17) is uniquely determined, while on the left side the selection of t* may not be unique. Therefore, it is required to prove that the left side of (4.17) is independent of the selection of t*. We explain this as follows: given t*_1 ∈ t̄* with σ(t*) = σ(t*_1), there exists a chain

t* = u_1◦v_1 ∼ v_1◦u_1 = u_2◦v_2 ∼ v_2◦u_2 = · · · ∼ u_m◦v_m = t*_1.  (4.18)

Let m = 2, i.e., t* = u_1◦v_1, t*_1 = v_1◦u_1; then by (4.14)

α(t*) b(t*) = −α(t*_1) b(t*_1),

and by Lemma 4.14

F*(u◦v) = −F*(v◦u) =⇒ ∇F*(u◦v) = −∇F*(v◦u),

therefore

α(t*) b(t*) ∇F*(t*) = α(t*_1) b(t*_1) ∇F*(t*_1).  (4.19)

For m > 2, t*_1 is reached through intermediate members of the chain (4.18), so (4.19) also holds.
Next, ∇F*(t*) must be a linear combination of the elementary differentials in the same class, i.e.,

∇F*(t*) = Σ_{t∈t̄*} β(t) F(t).

Consider a special case: let t* = [τ, τ, [τ]], i.e., the rooted tree with root i, two leaves l and m attached to the root, and a vertex j attached to the root carrying the single leaf k. Then

F*(t*) = f^(2)(f, f, f^(1)f).


The differentiation of F*(t*) can be seen as the following process: first differentiate w.r.t. the root node once, obtaining f^(3)(f, f, f^(1)f); then differentiate w.r.t. each vertex once, i.e., add 1 to the superscript attached to that vertex and move it, together with its subtree, in front of its father, continuing the moves until it reaches the front of the root node. In this process, according to (4.4), every move is accompanied by a change of sign.
the above example, differentiating w.r.t. the point i, j, k, l, m respectively, we obtain

m −→ −f (1) f (2) (f, f (1) f ), l −→ −f (1) f (2) (f, f (1) f ),

j −→ −f (2) (f, f (2) (f, f )), k −→ f (1) f (1) f (2) (f, f ).

Thus, we get:

∇F ∗ (t∗ ) = f (3) (f, f, f (1) f ) − 2f (1) f (2) (f, f (1) f )

−f (2) (f, f (2) (f, f )) + f (1) f (1) f (2) (f, f ).

It is easy to see that

β(t) = ± ν(t)/ν(t*),

where the sign “±” is selected by the following rule: let d(·) denote the distance of a vertex from the root node, i.e., the number of vertices passed on the path from this vertex to the root node; then sign(β(t)) = (−1)^{d(t)}. In the above example,

d(i) = 0,   d(m) = d(l) = d(j) = 1,   d(k) = 2.

By sign(b(u◦v)) = −sign(b(v◦u)), we have

(−1)^{d(t)} = sign(b(t)) / sign(b(t*)),

therefore:

∇F*(t*) = Σ_{t∈t̄*} ( α(t) b(t) / (α(t*) b(t*)) ) F(t).
t∈t

Thus, we get

α(t*) b(t*) ∇F*(t*) = Σ_{t∈t̄*} α(t) b(t) F(t).

Therefore, the lemma is completed. 

With the above results, we describe the main result of this section[Hai94] .

Theorem 4.16. Given an R–K method (A, b), A = (a_{ij})_{s×s}, b = (b_1, b_2, · · · , b_s)^T, its formal energy is

H̃(z, h) = −J Σ_{ρ(t)≤N} ( h^{ρ(t)−1} / ρ(t)! ) α(t) b(t) F*(t),   t ∈ T̃E,  (4.20)

where b(t) is determined by a(t) (according to Tables 4.1–4.4), i.e.,

a(t) = γ(t) Σ_{j=1}^{s} b_j φ_j(t).

Proof. Let the modified equation be

ż = Σ_{t∈TP} ( h^{r(t)−1} / r(t)! ) α(t) b(t) F(t)(z̃);

then

ż = Σ_{t*∈T̃E} ( h^{r(t*)−1} / r(t*)! ) Σ_{t∈t̄*} α(t) b(t) F(t)(z̃)
  = Σ_{t*∈T̃E} ( h^{r(t*)−1} / r(t*)! ) α(t*) b(t*) ∇F*(t*).

By Lemma 4.10,

H̃_z = −J Σ_{t*∈T̃E} ( h^{r(t*)−1} / r(t*)! ) α(t*) b(t*) ∇F*(t*),

which leads to (up to an arbitrary constant)

H̃(z, h) = −J Σ_{t*∈T̃E} ( h^{r(t*)−1} / r(t*)! ) α(t*) b(t*) F*(t*).

The theorem is proved. 

Remark 4.17. It was pointed out in [Tan94] that each term of the series expansion of the formal energy of a symplectic R–K scheme is in one-to-one correspondence with the collection of unrooted trees. Theorem 4.16 makes this correspondence explicit.

Finally, we sum up the method for constructing the formal energy of a symplectic R–K method: given a symplectic R–K method (A, b, c), let a(t) = γ(t) Σ_j b_j φ_j(t). Then, according to Tables 4.1–4.4, identify the corresponding b(t) for each rooted tree in T̃E. Using (4.20), we can directly write down the formal energy. Without loss of generality, in practice we can take T̃E to consist of one representative rooted tree from each equivalence class of order ≤ 5 (tree diagrams omitted). If the order of the method is known, or the method is time-reversible (symmetric), then by Remark 4.4 and Remark 4.5 many calculations can be left out.

Example 4.18. Centered Euler scheme

z_{n+1} = z_n + h f((z_n + z_{n+1})/2).
Its formal energy [Tan94] is

H̃(z, h) = H(z) + (h^2/24) Jf^(2)(f, f) − (7h^4/5760) Jf^(3)(f, f, f, f)
         − (h^4/480) Jf^(2)(f, f, f^(1)f) − (h^4/160) Jf^(2)(f^(1)f, f^(1)f) + O(h^6).
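What possessing a formal energy means in practice can be illustrated numerically. The following sketch (our own, not from the book; the quartic test Hamiltonian H = p^2/2 + q^4/4, the step size, and all names are assumptions) integrates a nonlinear system with the centered Euler scheme and checks that the energy error stays bounded at O(h^2) with no secular drift, as expected when a nearby conserved quantity H̃ exists:

```python
import numpy as np

def f(z):
    # J * grad H for the assumed test Hamiltonian H = p^2/2 + q^4/4
    q, p = z
    return np.array([p, -q**3])

def H(z):
    q, p = z
    return 0.5 * p**2 + 0.25 * q**4

def midpoint_step(z, h, iters=40):
    # centered Euler scheme z' = z + h f((z + z')/2), by fixed-point iteration
    znew = z + h * f(z)
    for _ in range(iters):
        znew = z + h * f(0.5 * (z + znew))
    return znew

h = 0.05
z = np.array([1.0, 0.0])
H0 = H(z)
errs = []
for _ in range(4000):
    z = midpoint_step(z, h)
    errs.append(abs(H(z) - H0))
errs = np.array(errs)

assert errs.max() < 1e-2                                   # bounded O(h^2) error
assert errs[2000:].max() < 1.5 * errs[:2000].max() + 1e-8  # no secular drift
```

A non-symplectic scheme of the same order would typically show a linearly growing energy error instead.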
Example 4.19. Gauss–Legendre method

1/2 − √3/6 | 1/4          1/4 − √3/6
1/2 + √3/6 | 1/4 + √3/6   1/4
-----------+------------------------
           | 1/2          1/2

Its formal energy:

H̃(z, h) = H(z) + (37h^4/8840) Jf^(3)(f, f, f, f) + (h^4/720) Jf^(2)(f, f, f^(1)f)
         + (37h^4/2880) Jf^(1)(f^(1)f, f^(1)f) + O(h^6).
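The tableau above can also be exercised numerically. The sketch below (our own illustration, not the book's code; the harmonic-oscillator test problem and all names are assumptions) solves the linear stage equations of the 2-stage Gauss–Legendre method by a direct linear solve and checks that the quadratic energy is preserved to roundoff, as it must be for a symplectic R–K method applied to a quadratic invariant:

```python
import numpy as np

s3 = np.sqrt(3.0)
A_rk = np.array([[0.25, 0.25 - s3 / 6.0],
                 [0.25 + s3 / 6.0, 0.25]])
b_rk = np.array([0.5, 0.5])
L = np.array([[0.0, 1.0], [-1.0, 0.0]])   # z' = L z for H = (p^2 + q^2)/2, z = (q, p)
h = 0.1

def step(z):
    # stage equations k_i = L(z + h * sum_j a_ij k_j): a 4x4 linear system
    M = np.eye(4) - h * np.kron(A_rk, L)
    k = np.linalg.solve(M, np.concatenate([L @ z, L @ z])).reshape(2, 2)
    return z + h * (b_rk[0] * k[0] + b_rk[1] * k[1])

z = np.array([1.0, 0.3])
H0 = 0.5 * z @ z
for _ in range(1000):
    z = step(z)
assert abs(0.5 * z @ z - H0) < 1e-10
```

For a linear problem the implicit stage equations are solved exactly in one linear solve, so the only energy error left is floating-point roundoff.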
Example 4.20. Diagonally implicit R–K method

1/4 | 1/4   0
3/4 | 1/2   1/4
----+-----------
    | 1/2   1/2

Its formal energy:

H̃(z, h) = H(z) + (h^2/96) Jf^(1)(f, f) + (17h^4/10240) Jf^(3)(f, f, f, f)
         + (13h^4/2560) Jf^(2)(f, f, f^(1)f) + (h^4/2560) Jf^(1)(f^(1)f, f^(1)f) + O(h^6).

7.5 Definition of a(t) and b(t)


We consider the following schemes.

7.5.1 Centered Euler Scheme

1/2 | 1/2
----+----
    | 1

(Values of a(t) and b(t) for the rooted trees of order ≤ 5 for the centered Euler scheme, e.g. a(•) = 1, b(•) = 1; the rooted-tree diagrams are omitted.)

7.5.2 Gauss–Legendre Method

1/2 − √3/6 | 1/4          1/4 − √3/6
1/2 + √3/6 | 1/4 + √3/6   1/4
-----------+------------------------
           | 1/2          1/2
 
(Values of a(t) and b(t) for the rooted trees of order ≤ 5 for the Gauss–Legendre method, e.g. a(•) = 1, b(•) = 1; the rooted-tree diagrams are omitted.)

7.5.3 Diagonal Implicit R–K Method

1/4 | 1/4   0
3/4 | 1/2   1/4
----+-----------
    | 1/2   1/2
 
(Values of a(t) and b(t) for the rooted trees of order ≤ 5 for the diagonally implicit R–K method, e.g. a(•) = 1, b(•) = 1; the rooted-tree diagrams are omitted.)

7.6 Multistep Symplectic Method


In this section we present multistep methods for Hamiltonian systems.

7.6.1 Linear Multistep Method


Consider the autonomous ODEs on R^n

dz/dt = a(z),  (6.1)

where z = (z_1, · · · , z_n) and a(z) = (a_1(z), · · · , a_n(z)) is a smooth vector field on R^n. For Equation (6.1) we define a linear m-step method (LMM) in standard form by

Σ_{j=0}^{m} α_j z_j = τ Σ_{j=0}^{m} β_j a(z_j),  (6.2)

where α_j and β_j are constants subject to the conditions

α_m = 1,   |α_0| + |β_0| ≠ 0.
348 7. Symplectic Runge–Kutta Methods

If m = 1, we call (6.2) a single-step method; otherwise we call it a multistep method.

The linearity means that the right-hand side of (6.2) depends linearly on the values of a(z) at the grid points. For compatibility of (6.2) with Equation (6.1), the method must be of at least order one and thus satisfies
(1) α_0 + α_1 + · · · + α_m = 0;
(2) β_0 + β_1 + · · · + β_m = Σ_{j=0}^{m} j α_j ≠ 0.
The LMM (6.2) has two characteristic polynomials

ξ(λ) = Σ_{i=0}^{m} α_i λ^i,   σ(λ) = Σ_{i=0}^{m} β_i λ^i.  (6.3)

Equation (6.2) can be written as

ξ(E) z_n = τ σ(E) a(z_n).  (6.4)
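The consistency conditions can be checked mechanically for a concrete scheme. The sketch below (our own, illustrative) writes the leap-frog scheme z_{n+2} − z_n = 2τ a(z_{n+1}) in the α, β form of (6.2) and verifies the conditions:

```python
import numpy as np

alpha = np.array([-1.0, 0.0, 1.0])   # alpha_0, alpha_1, alpha_2 for leap-frog
beta = np.array([0.0, 2.0, 0.0])     # beta_0,  beta_1,  beta_2

j = np.arange(alpha.size)
assert alpha[-1] == 1.0                      # normalization alpha_m = 1
assert alpha.sum() == 0.0                    # xi(1) = 0
assert beta.sum() == (j * alpha).sum() != 0  # sigma(1) = xi'(1) != 0
```

Here ξ(λ) = λ^2 − 1 and σ(λ) = 2λ, which will reappear in Section 7.6.3.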

In the next subsection, we propose a new definition for symplectic multistep methods.

This new definition differs from the old ones given for single-step methods. It is defined directly on M, corresponding to the m-step scheme defined on M, while the old definitions are given by defining a corresponding one-step method on M × M × · · · × M = M^m with a set of new variables. The new definition introduces a step transition operator g : M → M. Under our new definition, the leap-frog method is symplectic only for linear Hamiltonian systems. The transition operator g will be constructed via continued fractions and rational approximation.

7.6.2 Symplectic LMM for Linear Hamiltonian Systems


First we consider a linear Hamiltonian system
dz/dt = az,  (6.5)
where a is an infinitesimal 2n × 2n symplectic matrix a ∈ sp(2n). Its phase flow is
z(t) = exp (ta)z0 . The LMM for (6.5) is

αm zm + · · · + α1 z1 + α0 z0 = τ a(βm zm + · · · + β1 z1 + β0 z0 ). (6.6)

Our goal is to find a matrix g, i.e., a linear transformation g : R2n → R2n which
can satisfy (6.6)
 
αm g m (z0 ) + · · · + α1 g(z0 ) + α0 z0 = τ a βm g m (z0 ) + · · · + β1 g(z0 ) + β0 z0 . (6.7)

Such a map g exists for sufficiently small τ and can be represented by continued
fractions and rational approximations. We call this transformation step transition
operator[Fen98b] .
Definition 6.1. If g is a symplectic transformation, then its corresponding LMM (6.6)
is symplectic (we simply call the method SLMM).
7.6 Multistep Symplectic Method 349

From (6.7), we have

τa = (α_0 I + α_1 g + · · · + α_m g^m)(β_0 I + β_1 g + · · · + β_m g^m)^{-1}.  (6.8)

The characteristic equation for LMM is

ξ(λ) = τ μσ(λ), (6.9)

where μ is the eigenvalue of the infinitesimal symplectic matrix a and λ is the eigen-
value of g.
Let

ψ(λ) = ξ(λ)/σ(λ);  (6.10)

then (6.9) can be written as

τμ = ψ(λ).  (6.11)

Its inverse function is

λ = φ(τμ).  (6.12)
To study the symplecticity of the LMM, one only needs to study the properties
of functions φ and ψ. We will see that if φ is of the exponential form or ψ is of
logarithmic form, the corresponding LMM is symplectic. We first study the properties
of the exponential functions and logarithmic functions.
Explike and loglike functions
First we describe the properties of the exponential function:
(1) exp(x)|_{x=0} = 1.
(2) (d/dx) exp(x)|_{x=0} = 1.
(3) exp(x + y) = exp(x) · exp(y).
If we substitute −x for y, we have

exp(x) exp(−x) = 1.  (6.13)

Definition 6.2. If a function φ(x) satisfies φ(0) = 1, φ′(0) = 1 and φ(x)φ(−x) = 1, we call it an explike function.

It is well known that the inverse function of the exponential function is the logarithmic function x → log x. It has the following properties:
(1) log x|_{x=1} = 0;
(2) (d/dx) log x|_{x=1} = 1;
(3) log xy = log x + log y.
If we take y = 1/x, we get

log x + log(1/x) = 0.  (6.14)
350 7. Symplectic Runge–Kutta Methods

Definition 6.3. If a function ψ satisfies ψ(1) = 0, ψ′(1) = 1, and

ψ(x) + ψ(1/x) = 0,  (6.15)

we call it a loglike function.
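As a concrete instance of Definition 6.3, the one-step midpoint rule z_{n+1} − z_n = τ a((z_n + z_{n+1})/2) has ξ(λ) = λ − 1 and σ(λ) = (λ + 1)/2. The following sketch (our own check, illustrative) confirms numerically that ψ = ξ/σ is loglike:

```python
from math import isclose

def psi(lam):
    # psi(lam) = xi(lam)/sigma(lam) = (lam - 1) / ((lam + 1)/2) for the midpoint rule
    return 2.0 * (lam - 1.0) / (lam + 1.0)

assert psi(1.0) == 0.0                                   # psi(1) = 0
for lam in (0.5, 1.3, 2.0):
    assert isclose(psi(lam) + psi(1.0 / lam), 0.0, abs_tol=1e-12)  # psi(x)+psi(1/x)=0
# psi'(1) = 1, checked by a central difference
d = (psi(1.0 + 1e-6) - psi(1.0 - 1e-6)) / 2e-6
assert isclose(d, 1.0, rel_tol=1e-6)
```

In fact ψ(λ) = 2(λ − 1)/(λ + 1) is exactly the function that will reappear below as the first diagonal approximant to log λ.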

Obviously, polynomials can not be explike functions or loglike functions, so we


try to find explike and loglike functions in the form of rational functions.
Theorem 6.4. [Fen98b] An LMM is symplectic for linear Hamiltonian systems iff its step transition operator g = φ(τa) is explike, i.e., φ(μ)·φ(−μ) = 1, φ(0) = 1, φ′(0) = 1.

Theorem 6.5. [Fen98b] An LMM is symplectic for linear Hamiltonian systems iff ψ(λ) = ξ(λ)/σ(λ) is a loglike function, i.e., ψ(λ) + ψ(1/λ) = 0, ψ(1) = 0, ψ′(1) = 1.

Proof. From Theorem 6.4 we have φ(μ)φ(−μ) = 1, so λ = φ(μ) and 1/λ = φ(−μ). The inverse function of φ satisfies ψ(λ) = μ, ψ(1/λ) = −μ, i.e., ψ(λ) + ψ(1/λ) = 0; ψ(1) = 0 and ψ′(1) = 1 follow from the consistency conditions (1), (2).
Conversely, if ψ(λ) = −ψ(1/λ), let ψ(λ) = μ; then its inverse function satisfies φ(μ) = λ and φ(−μ) = 1/λ, and we have φ(μ)·φ(−μ) = 1. □
Theorem 6.6. If ξ(λ) is an antisymmetric polynomial and σ(λ) a symmetric one, then ψ(λ) = ξ(λ)/σ(λ) satisfies

ψ(1) = 0,   ψ(1/λ) + ψ(λ) = 0.

Proof. Since

λ^m ξ(1/λ) = Σ_{i=0}^{m} α_{m−i} λ^i = −Σ_{i=0}^{m} α_i λ^i = −ξ(λ),
λ^m σ(1/λ) = Σ_{i=0}^{m} β_{m−i} λ^i = Σ_{i=0}^{m} β_i λ^i = σ(λ),

we have

ψ(1/λ) = ξ(1/λ)/σ(1/λ) = (λ^m ξ(1/λ)) / (λ^m σ(1/λ)) = −ξ(λ)/σ(λ) = −ψ(λ),

so ψ(λ) + ψ(1/λ) = 0. Now ξ(1) = Σ_{k=0}^{m} α_k = 0 while σ(1) = Σ_{k=0}^{m} β_k ≠ 0, hence

ψ(1) = ξ(1)/σ(1) = 0. □
7.6 Multistep Symplectic Method 351

Corollary 6.7. If the above generating polynomials are consistent with ODE (6.1), then ψ(λ) is a loglike function, i.e., ψ(1/λ) + ψ(λ) = 0, ψ(1) = 0, ψ′(1) = 1.

Proof. ψ′(1) = (ξ′σ − ξσ′)/σ² |_{λ=1} = ξ′(1)/σ(1) = 1, since ξ(1) = 0; this is just the consistency condition. □
Theorem 6.8. Let ψ(λ) = ξ(λ)/σ(λ) be an irreducible loglike function; then ξ(λ) is an antisymmetric polynomial while σ(λ) is a symmetric one.

Proof. We write formally

ξ(λ) = α_m λ^m + α_{m−1} λ^{m−1} + · · · + α_1 λ + α_0,
σ(λ) = β_m λ^m + β_{m−1} λ^{m−1} + · · · + β_1 λ + β_0.

If deg ξ(λ) = p < m, set α_i = 0 for i > p; if deg σ(λ) = q < m, set β_i = 0 for i > q. From ψ(1) = 0 it follows that ξ(1) = 0, since otherwise ψ(1) = ξ(1)/σ(1) ≠ 0. Moreover, ξ(1) = 0 forces σ(1) ≠ 0, since otherwise ξ(λ) and σ(λ) would have the common factor λ − 1. So we have

ξ(1) = Σ_{k=0}^{m} α_k = Σ_{k=0}^{p} α_k = 0,
σ(1) = Σ_{k=0}^{m} β_k = Σ_{k=0}^{q} β_k ≠ 0.

If m = deg ξ = p, then α_m = α_p ≠ 0. Set ξ̃(λ) = λ^m ξ(1/λ) and σ̃(λ) = λ^m σ(1/λ); then

ψ(1/λ) = ξ(1/λ)/σ(1/λ) = (λ^m ξ(1/λ)) / (λ^m σ(1/λ)) = ξ̃(λ)/σ̃(λ).

Since ψ(λ) + ψ(1/λ) = 0, we have

ξ̃(λ)/σ̃(λ) = −ξ(λ)/σ(λ) ⇐⇒ ξ̃(λ)σ(λ) = −ξ(λ)σ̃(λ) =⇒ ξ(λ) | ξ̃(λ)σ(λ) and σ(λ) | σ̃(λ)ξ(λ).

Since ξ(λ) and σ(λ) have no common factor, ξ(λ) | ξ̃(λ) and σ(λ) | σ̃(λ). If m = deg ξ(λ), then deg ξ̃ ≤ deg ξ, so there exists a constant c with

ξ̃(λ) = c ξ(λ) =⇒ σ̃(λ) = −c σ(λ).

Comparing coefficients gives α_{m−j} = c α_j; in particular α_m = c α_0 and α_0 = c α_m, so α_m = c² α_m, therefore c² = 1, c = ±1. Suppose c = +1; then σ̃(λ) = −σ(λ). But σ̃(1) = σ(1), so σ(1) = −σ(1), i.e., σ(1) = 0, a contradiction to σ(1) ≠ 0. Therefore c = −1, i.e.,

ξ̃(λ) = −ξ(λ),   α_j = −α_{m−j},   j = 0, 1, · · · , m,
σ̃(λ) = σ(λ),   β_j = β_{m−j},   j = 0, 1, · · · , m.

The theorem is proved. □

The proof for the case m = deg σ(λ) proceeds in exactly the same manner.

7.6.3 Rational Approximations to Exp and Log Function


1. Leap-frog scheme
We first study a simple example:

z_2 = z_0 + 2τa z_1.  (6.16)

Let z_1 = c z_0; then z_0 = c^{−1} z_1. Inserting this into (6.16), we get

z_2 = 2τa z_1 + (1/c) z_1 = (2τa + 1/c) z_1 = d_1 z_1,   z_1 = z_2/d_1,

z_3 = z_1 + 2τa z_2 = (2τa + 1/(2τa + 1/c)) z_2 = d_2 z_2,   z_2 = z_3/d_2,

z_4 = (2τa + 1/(2τa + 1/(2τa + 1/c))) z_3 = d_3 z_3,   · · · .

Here d_k can be written in the form of a continued fraction

d_k = 2τa + 1/(2τa + 1/(2τa + · · ·)),  (6.17)

lim_{k→∞} d_k = g = τa + √(1 + (τa)²).  (6.18)

We assume the transition operator of the leap-frog scheme to be g; from (6.16) we have

g² − 1 = 2τa g,

so g = τa ± √(1 + (τa)²). Here only the sign + is meaningful; thus g = τa + √(1 + (τa)²), which is just the limit of the continued fraction (6.17). It is easy to verify that g is explike, i.e., g(μ)g(−μ) = 1. So the leap-frog scheme is symplectic for linear Hamiltonian systems according to our new definition.
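For a concrete linear Hamiltonian, the operator g of (6.18) can be formed in closed form. In the sketch below (our own, illustrative; H = (p² + 4q²)/2 is chosen because (τa)² is then a multiple of the identity, so the matrix square root is scalar) we verify the leap-frog relation, the explike property, and symplecticity:

```python
import numpy as np

tau = 0.1
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
a = np.array([[0.0, 1.0], [-4.0, 0.0]])   # z' = a z for H = (p^2 + 4q^2)/2, z = (q, p)

# (tau*a)^2 = -4*tau^2*I, so sqrt(I + (tau*a)^2) = sqrt(1 - 4*tau^2) * I
g = tau * a + np.sqrt(1.0 - 4.0 * tau**2) * np.eye(2)

# g solves the leap-frog relation g^2 - I = 2*tau*a*g ...
assert np.allclose(g @ g - np.eye(2), 2.0 * tau * a @ g)
# ... is explike: g(tau*a) * g(-tau*a) = I ...
g_minus = -tau * a + np.sqrt(1.0 - 4.0 * tau**2) * np.eye(2)
assert np.allclose(g @ g_minus, np.eye(2))
# ... and is symplectic: g^T J g = J
assert np.allclose(g.T @ J @ g, J)
```

For a 2×2 matrix, g^T J g = J is equivalent to det g = 1, which here reduces to (1 − 4τ²) + 4τ² = 1.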
7.6 Multistep Symplectic Method 353

2. Exponential function

exp(z) = 1 + Σ_{k=1}^{∞} z^k/k!.  (6.19)

We have Lagrange's continued fraction

exp(z) = b_0 + a_1/(b_1 + a_2/(b_2 + a_3/(b_3 + · · ·))),  (6.20)

where

a_1 = z, a_2 = −z, · · · , a_{2n−1} = z, a_{2n} = −z, n ≥ 1,
b_0 = 1, b_1 = 1, b_2 = 2, · · · , b_{2n−1} = 2n − 1, b_{2n} = 2, n ≥ 1,

and Euler's contracted expansion

exp(z) = B_0 + A_1/(B_1 + A_2/(B_2 + · · · + A_n/(B_n + · · ·))),  (6.21)

where

A_1 = 2z, A_2 = z², · · · , A_n = z², n ≥ 2,
B_0 = 1, B_1 = 2 − z, B_2 = 6, · · · , B_n = 2(2n − 1), n ≥ 2.

We have the convergents

P_0/Q_0 = p_0/q_0 = 1,   p_1/q_1 = (1 + z)/1,   p_2/q_2 = P_1/Q_1 = (2 + z)/(2 − z),
p_3/q_3 = (6 + 4z + z²)/(6 − 2z),   p_4/q_4 = P_2/Q_2 = (12 + 6z + z²)/(12 − 6z + z²),  · · · .  (6.22)
In general, p_{2n−1}(z) is a polynomial of degree n and q_{2n−1} a polynomial of degree n − 1, so p_{2n−1}/q_{2n−1} is not explike, while p_{2n} = P_n(z) and q_{2n} = Q_n(z) are both polynomials of degree n and satisfy the recursions

P_0 = 1,   P_1 = 2 + z,   P_n = z² P_{n−2} + 2(2n − 1) P_{n−1},
Q_0 = 1,   Q_1 = 2 − z,   Q_n = z² Q_{n−2} + 2(2n − 1) Q_{n−1}.  (6.23)

It is easy to see that for n = 0, 1, · · · ,

Q_n(z) = P_n(−z),   P_n(0) > 0.

So the rational function

φ_n(z) = P_n(z)/Q_n(z) = P_n(z)/P_n(−z)

is explike and

φ_n(z) − exp(z) = O(|z|^{2n+1}),

where

P_0 = 1,   P_1 = 2 + z,   P_n(z) = z² P_{n−2}(z) + 2(2n − 1) P_{n−1}(z),   n ≥ 2.  (6.24)

This is just the diagonal Padé approximation.
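The recursion (6.23)/(6.24) is easy to run. The following sketch (our own code, names illustrative) builds P_n, forms the explike function φ_n(z) = P_n(z)/P_n(−z), and checks both the explike property and the closeness to exp:

```python
import numpy as np

def P(n):
    # P_0 = 1, P_1 = 2 + z, P_n = z^2 P_{n-2} + 2(2n-1) P_{n-1}
    polys = [np.poly1d([1.0]), np.poly1d([1.0, 2.0])]   # highest-degree coeff first
    for k in range(2, n + 1):
        polys.append(np.poly1d([1.0, 0.0, 0.0]) * polys[k - 2]
                     + 2 * (2 * k - 1) * polys[k - 1])
    return polys[n]

def phi(n, z):
    Pn = P(n)
    return Pn(z) / Pn(-z)

assert np.allclose(P(2).coeffs, [1.0, 6.0, 12.0])       # P_2 = z^2 + 6z + 12
assert abs(phi(2, 0.1) * phi(2, -0.1) - 1.0) < 1e-12    # explike by construction
assert abs(phi(3, 0.5) - np.exp(0.5)) < 1e-6            # O(z^7)-accurate for exp
```

Since φ_n(z)φ_n(−z) = (P_n(z)/P_n(−z))·(P_n(−z)/P_n(z)), the explike identity holds exactly; the assertion only checks floating-point evaluation.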


3. Logarithmic function

log w = Σ_{k=1}^{∞} (w − 1)^k/(k w^k);  (6.25)

we have Lagrange's continued fraction

log w = a_1/(b_1 + a_2/(b_2 + a_3/(b_3 + · · ·))),  (6.26)

where

a_1 = w − 1, a_2 = w − 1, a_3 = w − 1, a_4 = 2(w − 1), · · · ,
b_0 = 0, b_1 = 1, b_2 = 2, b_3 = 3, b_4 = 2, · · · ,

and

a_{2n−1} = (n − 1)(w − 1), a_{2n} = n(w − 1), n ≥ 2,
b_{2n−1} = 2n − 1, b_{2n} = 2, n ≥ 2,
and Euler's contracted expansion

log w = A_1/(B_1 + A_2/(B_2 + A_3/(B_3 + · · ·))),  (6.27)

where

A_1 = 2(w − 1), A_2 = −2(w − 1)², · · · , A_n = −(2(n − 1)(w − 1))², n ≥ 3,
B_0 = 0, B_1 = w + 1, B_2 = 6(w + 1), · · · , B_n = 2(2n − 1)(w + 1), n ≥ 2.

The following can be obtained by recursion:

P_0/Q_0 = p_0/q_0 = 0,   p_1/q_1 = (w − 1)/1,   p_2/q_2 = P_1/Q_1 = 2(w − 1)/(w + 1),
p_3/q_3 = (w² + 4w − 5)/(4w + 2),   p_4/q_4 = P_2/Q_2 = 3(w² − 1)/(w² + 4w + 1).  (6.28)
In general,

p_{2n−1}(w)/q_{2n−1}(w) − log(w) = O(|w − 1|^{2n}),   p_{2n}(w)/q_{2n}(w) − log(w) = O(|w − 1|^{2n+1}).

The rational function p_{2n−1}(w)/q_{2n−1}(w) approximates log w only to the odd order 2n − 1; it does not reach the even order 2n and is not loglike. However,

R_n = ψ_n(w) = p_{2n}(w)/q_{2n}(w) = P_n(w)/Q_n(w)

is a loglike function. In fact, by recursion it is easy to see that

P_n(w) = −w^n P_n(1/w),
Q_n(w) = w^n Q_n(1/w),  (6.29)

and Q_n(1) ≠ 0 for all n. We also have

P_0 = 0,   P_1(w) = 2(w − 1),   P_2(w) = 3(w² − 1),
Q_0 = 1,   Q_1(w) = w + 1,   Q_2(w) = w² + 4w + 1,

and for n ≥ 3,

P_n(w) = −(2(n − 1)(w − 1))² P_{n−2}(w) + 2(2n − 1)(w + 1) P_{n−1}(w),
Q_n(w) = −(2(n − 1)(w − 1))² Q_{n−2}(w) + 2(2n − 1)(w + 1) Q_{n−1}(w).  (6.30)

So we see that R_1(λ) is just the Euler midpoint rule and R_2(λ) = 3(λ² − 1)/(λ² + 4λ + 1) is just the Simpson scheme.
Conclusion: The odd truncations of the Lagrange continued fractions for exp(x) and log(x) are neither explike nor loglike, while the even truncations are. The truncations of the continued fractions obtained from Euler's contracted expansions are explike and loglike, respectively.
4. Obreschkoff formula
Another rational approximation to a given function is the Obreschkoff formula [Obr40]:

R_{m,n}(x) = Σ_{k=0}^{n} (C_k^n / C_k^{m+n}) ((x_0 − x)^k / k!) f^(k)(x) − Σ_{k=0}^{m} (C_k^m / C_k^{m+n}) ((x − x_0)^k / k!) f^(k)(x_0)
           = (1/(m + n)!) ∫_{x_0}^{x} (x − t)^m (x_0 − t)^n f^(m+n+1)(t) dt.  (6.31)
(1) Take f(x) = e^x, x_0 = 0; we obtain the Padé approximation exp(x) ≐ R_{m,n}(x). If m = n, we obtain the diagonal Padé approximation R_{m,m}(x).
(2) Take f(x) = log(x), x_0 = 1; we obtain log(x) ≐ R_{m,n}(x). If m = n, we obtain the loglike function R_m(λ),

R_m(λ) = (1/λ^m) Σ_{k=1}^{m} (C_k^m / (k C_k^{2m})) (λ − 1)^k (λ^{m−k} + (−1)^{k−1} λ^m),

i.e.,

R_m(λ) + R_m(1/λ) = 0.
We have

R_m(λ) − log(λ) = O(|λ − 1|^{2m+1}),

R_1 = (λ² − 1)/(2λ),
R_2 = (1/(12λ²)) (−λ⁴ + 8λ³ − 8λ + 1),
R_3 = (1/(60λ³)) (λ⁶ − 9λ⁵ + 45λ⁴ − 45λ² + 9λ − 1),
· · · ,

where R_1(λ) is just the leap-frog scheme.
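The diagonal approximants R_m can be evaluated directly. The sketch below (our own; the 1/k factor in the summand is our reading of the closed formula, chosen because it reproduces the listed forms R_1, R_2, R_3) checks the loglike property and the agreement with log:

```python
from math import comb, log

def R(m, lam):
    # diagonal Obreschkoff loglike approximant to log(lam)
    s = 0.0
    for k in range(1, m + 1):
        s += (comb(m, k) / (k * comb(2 * m, k))
              * (lam - 1.0)**k * (lam**(m - k) + (-1)**(k - 1) * lam**m))
    return s / lam**m

assert abs(R(1, 2.0) - (2.0**2 - 1.0) / (2.0 * 2.0)) < 1e-12          # R_1 = (l^2-1)/(2l)
assert abs(R(2, 2.0) - (-16.0 + 64.0 - 16.0 + 1.0) / (12.0 * 4.0)) < 1e-12
assert abs(R(3, 1.2) + R(3, 1.0 / 1.2)) < 1e-12                       # loglike
assert abs(R(3, 1.2) - log(1.2)) < 1e-5                               # O(|l-1|^7)
```

The first two assertions match the explicit polynomials for R_1 and R_2 listed above; the last two exhibit the loglike identity and the high-order agreement with log λ near λ = 1.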

5. Nonexistence of SLMM for Nonlinear Hamiltonian Systems (Tang Theorem)

For nonlinear Hamiltonian systems, there exists no symplectic LMM. When Equation (6.1) is nonlinear, how do we define a symplectic LMM? The answer is again to find the step-transition operator g : R^n → R^n. Let

z = g^0(z),
z_1 = g(z),
z_2 = g(g(z)) = g ∘ g(z) = g²(z),  (6.32)
· · ·
z_n = g(g(· · · (g(z)) · · ·)) = g ∘ g ∘ · · · ∘ g(z) = g^n(z);

then we get from (6.2)

Σ_{i=0}^{m} α_i g^i(z) = τ Σ_{i=0}^{m} β_i f ∘ g^i(z).  (6.33)

It is easy to prove that if the LMM (6.33) is consistent with Equation (6.1), then for smooth f and sufficiently small step-size τ, the operator g defined by (6.32) exists and can be represented as a power series in τ with first term equal to the identity. Consider the case where Equation (6.1) is a Hamiltonian system, i.e., a(z) = J∇H(z); we have the following definition.

Definition 6.9. An LMM is symplectic if the transition operator g defined by (6.32) is symplectic for all H(z) and all step-sizes τ, i.e.,

g_*(z)^T J g_*(z) = J.  (6.34)

This definition is a completely different criterion, which includes the symplectic condition for one-step methods in the usual sense. But Tang [Tan93a] has proved that no multistep method can satisfy such a strict criterion for nonlinear Hamiltonian systems. Numerical experiments of Li [Fen92b] show that the explicit 3-level centered method (leap-frog method) is symplectic for the linear Hamiltonian system H = (1/2)(p² + 4q²) (see Fig. 0.2 in the introduction of this book) but is non-symplectic for the nonlinear Hamiltonian system H = (1/2)(p² + q²) + (2/3)q⁴ (see Fig. 0.3 (a, b, c) in the introduction of this book).
2 3
Bibliography

[AS93] L. Abia and J.M. Sanz-Serna: Partitioned Runge–Kutta methods for separable Hamil-
tonian problems. Math. Comp., 60:617–634, (1993).
[But87] J.C. Butcher: The Numerical Analysis of Ordinary Differential Equations. John Wiley,
Chichester, (1987).
[CH95] M.P. Calvo and E. Hairer: Further reduction in the number of independent order condi-
tions for symplectic, explicit partitioned Runge-Kutta and Runge–Kutta–Nyström methods.
Appl. Numer. Math., 18:107–114, (1995).
[Chi97] S. A. Chin: Symplectic integrators from composite operator factorization. Physics
Letters A, 226:344–348, (1997).
[Coo87] G. J. Cooper: Stability of Runge–Kutta methods for trajectory problems. IMA J.
Numer. Anal., 7:1–13, (1987).
[CS93] M.P. Calvo and J.M. Sanz-Serna: High-order symplectic Runge-Kutta-Nyström meth-
ods. SIAM J. Sci. Comput., 114:1237–1252, (1993).
[CS94] M.P. Calvo and J.M. Sanz-Serna: Canonical B-Series. Numer. Math., 67:161–175,
(1994).
[DV84] K. Dekker and J.G. Verwer: Stability of Runge-Kutta Methods for Stiff Initial Value
Problems. North-Holland, Amsterdam, (1984).
[Fen65] K. Feng: Difference schemes based on variational principle. J. of Appl. and Comput. Math. (in Chinese), 2(4):238–262, (1965).
[Fen85] K. Feng: On difference schemes and symplectic geometry. In K. Feng, editor, Pro-
ceedings of the 1984 Beijing Symposium on Differential Geometry and Differential Equa-
tions, pages 42–58. Science Press, Beijing, (1985).
[Fen86a] K. Feng: Canonical Difference Schemes for Hamiltonian Canonical Differential
Equations. In International Workshop on Applied Differential Equations (Beijing, 1985),
pages 59–73. World Sci. Publishing, Singapore, (1986).
[Fen86b] K. Feng: Difference schemes for Hamiltonian formalism and symplectic geometry.
J. Comput. Math., 4:279–289, (1986).
[Fen86c] K. Feng: Symplectic geometry and numerical methods in fluid dynamics. In F.G.
Zhuang and Y.L. Zhu, editors, Tenth International Conference on Numerical Methods in
Fluid Dynamics, Lecture Notes in Physics, pages 1–7. Springer, Berlin, (1986).
[Fen91] K. Feng: The Hamiltonian Way for Computing Hamiltonian Dynamics. In R. Spigler,
editor, Applied and Industrial Mathematics, pages 17–35. Kluwer, The Netherlands, (1991).
[Fen92a] K. Feng: Formal power series and numerical methods for differential equations. In
T. Chan and Z.C. Shi, editors, International conf. on scientific computation, pages 28–35.
World Scientific, Singapore, (1992).
[Fen92b] K. Feng: How to compute properly Newton's equation of motion. In L. A. Ying, B.Y. Guo, and I. Gladwell, editors, Proc of 2nd Conf. on Numerical Method for PDE's, pages 15–22. World Scientific, Singapore, (1992). Also see Collected Works of Feng Kang.
Volume I, II. National Defence Industry Press, Beijing, (1995).
[Fen93a] K. Feng: Formal dynamical systems and numerical algorithms. In K. Feng and Z.C. Shi, editors, International conf. on computation of differential equations and dynamical systems, pages 1–10. World Scientific, Singapore, (1993).

[Fen93b] K. Feng: Symplectic, contact and volume preserving algorithms. In Z.C. Shi and T. Ushijima, editors, Proc. 1st China-Japan conf. on computation of differential equations and dynamical systems, pages 1–28. World Scientific, Singapore, (1993).
[Fen95] K. Feng: Collected Works of Feng Kang. volume I,II. National Defence Industry
Press, Beijing, (1995).
[Fen98a] K. Feng: The calculus of generating functions and the formal energy for Hamiltonian
systems. J. Comput. Math., 16:481–498, (1998).
[Fen98b] K. Feng: The step-transition operator for multi-step methods of ODEs. J. Comput.
Math., 16(3), (1998).
[FQ87] K. Feng and M.Z. Qin: The symplectic methods for the computation of Hamiltonian
equations. In Y. L. Zhu and B. Y. Guo, editors, Numerical Methods for Partial Differential
Equations, Lecture Notes in Mathematics 1297, pages 1–37. Springer, Berlin, (1987).
[FQ91a] K. Feng and M.Z. Qin: Hamiltonian Algorithms for Hamiltonian Dynamical Systems.
Progr. Natur. Sci., 1(2):105–116, (1991).
[FQ91b] K. Feng and M.Z. Qin: Hamiltonian algorithms for Hamiltonian systems and a com-
parative numerical study. Comput. Phys. Comm., 65:173–187, (1991).
[FQ03] K. Feng and M.Z. Qin: Symplectic Algorithms for Hamiltonian Systems. Zhejiang
Science and Technology Publishing House, Hangzhou, in Chinese, First edition, (2003).
[FW91a] K. Feng and D.L. Wang: A note on conservation laws of symplectic difference
schemes for Hamiltonian systems. J. Comput. Math., 9(3):229–237, (1991).
[FW91b] K. Feng and D.L. Wang: Symplectic difference schemes for Hamiltonian systems in
general symplectic structure. J. Comput. Math., 9(1):86–96, (1991).
[FW94] K. Feng and D.L. Wang: Dynamical systems and geometric construction of algo-
rithms. In Z. C. Shi and C. C. Yang, editors, Computational Mathematics in China, Con-
temporary Mathematics of AMS Vol 163, pages 1–32. AMS, (1994).
[FW98] K. Feng and D.L. Wang: On variation of schemes by Euler. J. Comput. Math., 16:97–
106, (1998).
[FWQ90] K. Feng, H.M. Wu, and M.Z. Qin: Symplectic difference schemes for linear Hamil-
tonian canonical systems. J. Comput. Math., 8(4):371–380, (1990).
[FWQW89] K. Feng, H.M. Wu, M.Z. Qin and D.L. Wang: Construction of canonical dif-
ference schemes for Hamiltonian formalism via generating functions. J. Comput. Math.,
7:71–96, (1989).
[Ge88] Z. Ge: Symplectic geometry and its application in numerical analysis. PhD thesis,
Computer Center, CAS, (1988).
[Ge90] Z. Ge: Generating functions, Hamilton–Jacobi equations and symplectic groupoids on
Poisson manifolds. Indiana Univ. Math. J., 39:859, (1990).
[Ge91] Z. Ge: Equivariant symplectic difference schemes and generating functions. Physica
D, 49:376–386, (1991).
[Ge95] Z. Ge: Symplectic integrators for Hamiltonian systems. In W. Cai et al., editor, Nu-
merical Methods in Applied Sciences, pages 97–108, Science Press, New York, (1995).
[Gon96] O. Gonzalez: Time integration and discrete Hamiltonian systems. J. Nonlinear. Sci.,
6:449–467, (1996).
[Hai94] E. Hairer: Backward analysis of numerical integrators and symplectic methods. Annals
of Numer. Math., 1:107–132, (1994).
[Hai97b] E. Hairer: Variable time step integration with symplectic methods. Appl. Numer.
Math., 25:219–227, (1997).
[Hai99] E. Hairer: Backward error analysis for multistep methods. Numer. Math., 84:199–232,
(1999).
[Hai00] E. Hairer: Symmetric projection methods for differential equations on manifolds. BIT,
40:726–734, (2000).
[Hai01] E. Hairer: Geometric integration of ordinary differential equations on manifolds. BIT,
41:996–1007, (2001).
[Hai03] E. Hairer: Global modified Hamiltonian for constrained symplectic integrators. Nu-
mer. Math., 95:325–336, (2003).
[Hen62] P. Henrici: Discrete Variable Methods in Ordinary Differential Equations. John Wiley
& Sons, Inc., New York, Second edition, (1962).
[HL97a] E. Hairer and P. Leone: Order barriers for symplectic multi-value methods. In D.F.
Griffiths, D.J. Higham, and G.A. Watson, editors, Numerical Analysis 1997, Proc. of the
17th Dundee Biennial Conference, June 24–27, 1997, Pitman Research Notes in Mathematics
Series 380, pages 133–149, (1997).
[HL97b] E. Hairer and Ch. Lubich: The life-span of backward error analysis for numerical
integrators. Numer. Math., 76:441–462, (1997).
[HL97c] M. Hochbruck and Ch. Lubich: On Krylov subspace approximations to the matrix
exponential operator. SIAM J. Numer. Anal., 34(5), (1997).
[HL97d] W. Huang and B. Leimkuhler: The adaptive Verlet method. SIAM J. Sci. Comput.,
18(1):239, (1997).
[HL99a] E. Hairer and Ch. Lubich: Invariant tori of dissipatively perturbed Hamiltonian sys-
tems under symplectic discretization. Appl. Numer. Math., 29:57–71, (1999).
[HL99b] M. Hochbruck and Ch. Lubich: Exponential integrators for quantum-classical molec-
ular dynamics. BIT, 39:620–645, (1999).
[HL00a] E. Hairer and P. Leone: Some properties of symplectic Runge–Kutta methods. New
Zealand J. of Math., 29:169–175, (2000).
[HL00b] E. Hairer and Ch. Lubich: Energy conservation by Störmer-type numerical inte-
grators. In D.F. Griffiths and G.A. Watson, editors, Numerical Analysis 1999, pages
169–190. CRC Press LLC, (2000).
[HL00c] E. Hairer and Ch. Lubich: Long-time energy conservation of numerical methods for
oscillatory differential equations. SIAM J. Numer. Anal., 38:414–441, (2000).
[HL00d] J. L. Hong and Y. Liu: Symplectic integration of linear discontinuous Hamiltonian
systems. Neural Parallel Sci Comput., 8:317–325, (2000).
[HL03] M. Hochbruck and C. Lubich: On Magnus integrators for time-dependent Schrödinger
equations. SIAM J. Numer. Anal., 41:945–963, (2003).
[HL04a] E. Hairer and C. Lubich: Symmetric multistep methods over long times. Numer.
Math., 97:699–723, (2004).
[HLR01] T. Holder, B. Leimkuhler, and S. Reich: Explicit variable step-size and time-
reversible integration. Appl. Numer. Math., 39:367–377, (2001).
[HLS98] M. Hochbruck, C. Lubich, and H. Selhofer: Exponential integrators for large systems
of differential equations. SIAM J. Sci. Comput., 19(5):1552–1574, (1998).
[HLW02] E. Hairer, Ch. Lubich, and G. Wanner: Geometric Numerical Integration. Num-
ber 31 in Springer Series in Computational Mathematics. Springer-Verlag, (2002).
[HLW03] E. Hairer, C. Lubich and G. Wanner: Geometric integration illustrated by the
Störmer-Verlet method. Acta Numerica, pages 399–450, (2003).
[HM04] P. Hydon and E.L. Mansfield: A variational complex for difference equations. Foun-
dations of Computational Mathematics, 4:187–217, (2004).
[HMM95] P. Hut, J. Makino and S. McMillan: Building a better leapfrog. Astrophys. J.,
443:L93–L96, (1995).
[HMSS93] E. Hairer, A. Murua and J.M. Sanz-Serna: The non-existence of symplectic multi-
derivative Runge–Kutta methods. Preprint, (1993).
[HNW93] E. Hairer, S. P. Nørsett, and G. Wanner: Solving Ordinary Differential Equations I,
Nonstiff Problems. Springer-Verlag, Second revised edition, (1993).
[HOS99] D.J. Hardy, D.I. Okunbor, and R.D. Skeel: Symplectic variable step size integration
for n-body problems. Appl. Numer. Math., 29:19–30, (1999).
[HS81] W. H. Hundsdorfer and M. N. Spijker: A note on B-stability of Runge–Kutta methods.
Numer. Math., 36:319–331, (1981).
[HS94] A. R. Humphries and A. M. Stuart: Runge-Kutta methods for dissipative and gradient
dynamical systems. SIAM J. Numer. Anal., 31(5):1452–1485, (1994).
[HS97a] E. Hairer and D. Stoffer: Reversible long-term integration with variable stepsizes.
SIAM J. Sci. Comput., 18:257–269, (1997).
[HS05] E. Hairer and G. Söderlind: Explicit time reversible adaptive step size control. SIAM
J. Sci. Comput., 26:1838–1851, (2005).
[HW74] E. Hairer and G. Wanner: On the Butcher group and general multivalue methods.
Computing, 13:1–15, (1974).
[HW81] E. Hairer and G. Wanner: Algebraically stable and implementable Runge–Kutta meth-
ods of high order. SIAM J. Numer. Anal., 18:1098–1108, (1981).
[HW91] E. Hairer and G. Wanner: Solving Ordinary Differential Equations II, Stiff and
Differential-Algebraic Problems. Springer, Berlin, (1991).
[HW94] E. Hairer and G. Wanner: Symplectic Runge-Kutta methods with real eigenvalues.
BIT, 34:310–312, (1994).
[HW96] E. Hairer and G. Wanner: Solving Ordinary Differential Equations II. Stiff and
Differential-Algebraic Problems, 2nd edition, Springer Series in Computational Mathemat-
ics 14. Springer-Verlag Berlin, Second edition, (1996).
[IA88] T. Itoh and K. Abe: Hamiltonian-conserving discrete canonical equations based on
variational difference quotients. J. of Comp. Phys., 76:85–102, (1988).
[Jay96] L. O. Jay: Symplectic partitioned Runge–Kutta methods for constrained Hamiltonian
systems. SIAM J. Numer. Anal., 33:368–387, (1996).
[Jay97] L. O. Jay: Lagrangian integration with symplectic methods. Technical Report AH-
PCRC Preprint 97-009, University of Minnesota, (1997).
[Jay99] L. O. Jay: Structure preservation for constrained dynamics with super partitioned
additive Runge–Kutta methods. SIAM J. Sci. Comput., 20(2):416–446, (1999).
[Jim94] S. Jiménez: Derivation of the discrete conservation laws for a family of finite differ-
ence schemes. Applied Mathematics and Computation, 64:13–45, (1994).
[JL06] Z. Jia and B. Leimkuhler: Geometric integrators for multiple time-scale simulation. J.
Phys. A: Math. Gen., 39:5379–5403, (2006).
[Kar96a] B. Karasözen: Comparison of reversible integrators for a Hamiltonian in normal
form. In E. Kreuzer and O. Mahrenholz, editors, Proceedings of the Third International
Congress on Industrial and Applied Mathematics, ICIAM 95, Issue 4: Applied Sciences,
especially Mechanics (Minisymposia), pages 563–566, (1996).
[Kar96b] B. Karasözen: Composite integrators for Bi-Hamiltonian systems. Comp. & Math.
with Applic., 32:79–86, (1996).
[Kar96c] B. Karasözen: Numerical Studies on a Bi-Hamiltonian Hénon-Heiles System. Tech-
nical Report No 133, Middle East Technical University, Department of Mathematics,
Ankara, Turkey, (1996).
[Kar97] B. Karasözen: Reflexive methods for dynamical systems with conserved quantities.
Technical Report Nr. 1897, Technische Hochschule Darmstadt, FB Mathematik, (1997).
[KHL08] L. H. Kong, J. L. Hong, and R. X. Liu: Long-term numerical simulation of the
interaction between a neutron field and meson field by a symplectic-preserving scheme. J.
Phys. A: Math. Theor., 41:255207, (2008).
[Kir86] U. Kirchgraber: Multi-step methods are essentially one-step methods. Numer. Math.,
48:85–90, (1986).
[Lam91] J.D. Lambert: Numerical Methods for Ordinary Differential Equations, The Initial
Value Problem. Wiley, Chichester, (1991).
[Las88] F.M. Lasagni: Canonical Runge–Kutta methods. Z. Angew. Math. Phys., 39:952–953,
(1988).
[LDJW00] Y.X. Li, P. Z. Ding, M. X. Jin, and C. X. Wu: Computing classical trajectories of
model molecule A2 B by symplectic algorithm. Chemical Journal of Chinese Universities,
15(8):1181–1186, (2000).
[Lei99] B. J. Leimkuhler: Reversible adaptive regularization: Perturbed Kepler motion and
classical atomic trajectories. Phil. Trans. Royal Soc. A, 357:1101, (1999).
[Leo00] P. Leone: Symplecticity and Symmetry of General Integration Methods. Thèse, Section
de Mathématiques, Université de Genève, Second edition, (2000).
[LP96] B. J. Leimkuhler and G. W. Patrick: A symplectic integrator for Riemannian manifolds.
J. Nonlinear. Sci., 6(4):367–384, (1996).
[LP01] L. Lopez and T. Politi: Applications of the Cayley approach in the numerical solution
of matrix differential systems on quadratic groups. Appl. Numer. Math., 36:35–55, (2001).
[LQ01] H. W. Li and M. Z. Qin: On the formal energy of symplectic R–K method. Math.
Num. Sinica, 23:75–92, (2001).
[LQHD07] X.S. Liu, Y.Y. Qi, J. F. He, and P. Z. Ding: Recent progress in symplectic algorithms
for use in quantum systems. Communications in Computational Physics, 2(1):1–53, (2007).
[LR94a] B. Leimkuhler and S. Reich: Symplectic integration of constrained Hamiltonian sys-
tems. Math. Comp., 63:589–605, (1994).
[LR05] B. Leimkuhler and S. Reich: Simulating Hamiltonian Dynamics. Cambridge Univer-
sity Press, Cambridge, First edition, (2005).
[LvV97] B. J. Leimkuhler and E. S. van Vleck: Orthosymplectic integration of linear Hamil-
tonian systems. Numer. Math., 77:269–282, (1997).
[LW76] J. D. Lambert and I. A. Watson: Symmetric multistep methods for periodic initial
value problems. J. Inst. Maths. Applics., 18:189–202, (1976).
[LYC06] H. Liu, J.H. Yuan, J.B. Chen, H. Shou, and Y.M. Li: Theory of large-step depth
extrapolation. Chinese Journal of Geophysics, 49(6):1779–1793, (2006).
[McL95c] R. I. McLachlan: On the numerical integration of ODEs by symmetric composition
methods. SIAM J. Sci. Comput., 16:151–168, (1995).
[McL95d] R. I. McLachlan: On the numerical integration of ordinary differential equations by
symmetric composition methods. SIAM J. Sci. Comput., 16:151–168, (1995).
[McL96] R. I. McLachlan: More on Symplectic Correctors. In Jerrold E. Marsden, George
W. Patrick, and William F. Shadwick, editors, Integration Algorithms and Classical Me-
chanics, volume 10 of Fields Institute Communications. Fields Institute, American Mathe-
matical Society, July (1996).
[McL02] R. McLachlan: Splitting methods. Acta Numerica, 11:341–434, (2002).
[Mie89] S. Miesbach: Symplektische Phasenflußapproximation zur numerischen Integration
kanonischer Differentialgleichungen. Master's thesis, Technische Universität München,
(1989).
[MP92] S. Miesbach and H.J. Pesch: Symplectic phase flow approximation for the numerical
integration of canonical systems. Numer. Math., 61:501–521, (1992).
[MPQ04] R.I. McLachlan, M. Perlmutter, and G.R.W. Quispel: On the nonlinear stability of
symplectic integrators. BIT, 44:99–117, (2004).
[MQ98a] R. I. McLachlan and G. R. W. Quispel: Generating functions for dynamical systems
with symmetries, integrals, and differential invariants. Physica D, 112:298–309, (1998).
[MQ98b] R.I. McLachlan and G.R.W. Quispel: Numerical integrators that preserve symme-
tries and reversing symmetries. SIAM J. Numer. Anal., 35:586–599, (1998).
[MQ02] R. I. McLachlan and G. R. W. Quispel: Splitting methods. Acta Numerica, 11:341–
434, (2002).
[MQ03] R.I. McLachlan and G.R.W. Quispel: Geometric integration of conservative polyno-
mial ODEs. Appl. Numer. Math., 45:411–418, (2003).
[MQ04] D.I. McLaren and G.R.W. Quispel: Integral-preserving integrators. J. Phys. A: Math.
Gen., 37:L489–L495, (2004).
[MQR98] R. I. McLachlan, G. R. W. Quispel, and N. Robidoux: A unified approach to Hamil-
tonian systems, Poisson systems, gradient systems, and systems with Lyapunov functions
and/or first integrals. Physical Review Letters, 81:2399–2403, (1998).
[MQR99] R. I. McLachlan, G. R. W. Quispel, and N. Robidoux: Geometric integration using
discrete gradients. Phil. Trans. Royal Soc. A, 357:1021–1046, (1999).
[MQT98] R. I. McLachlan, G. R. W. Quispel, and G. S. Turner: Numerical integrators that
preserve symmetries and reversing symmetries. SIAM J. Numer. Anal., 35(2):586–599,
(1998).
[MS95c] R. I. McLachlan and C. Scovel: Equivariant constrained symplectic integration. J.
Nonlinear. Sci., 5:233–256, (1995).
[MS96] R. I. McLachlan and C. Scovel: A Survey of Open Problems in Symplectic Integration.
In J. E. Mardsen, G. W. Patrick, and W. F. Shadwick, editors, Integration Algorithms and
Classical Mechanics, pages 151–180. American Mathematical Society, (1996).
[Mur97] A. Murua: On order conditions for partitioned symplectic methods. SIAM J. Numer.
Anal., 34:2204–2211, (1997).
[Mur99] A. Murua: Formal series and numerical integrators, part I: Systems of odes and sym-
plectic integrators. Appl. Numer. Math., 29:221–251, (1999).
[Obr40] N. Obreschkoff: Neue Quadraturformeln. Abh. Preuss. Akad. Wiss., Math.-
Naturwiss. Klasse, 1–20, (1940).
[Oku93] D. Okunbor: Variable step size does not harm second-order integrators for Hamilto-
nian systems. J. Comput. Appl. Math, 47:273–279, (1993).
[Oku95] D. I. Okunbor: Energy conserving, Liouville, and symplectic integrators. J. of Comp.
Phys., 120(2):375–378, (1995).
[OS92] D. Okunbor and R.D. Skeel: Explicit canonical methods for Hamiltonian systems.
Math. Comp., 59:439–455, (1992).
[QD97] G. R. W. Quispel and C. Dyt: Solving ODE’s numerically while preserving symme-
tries, Hamiltonian structure, phase space volume, or first integrals. In A. Sydow, editor,
Proceedings of the 15th IMACS World Congress, pages 601–607. Wissenschaft & Technik,
Berlin, (1997).
[Qin87] M. Z. Qin: A symplectic scheme for the Hamiltonian equations. J. Comput. Math.,
5:203–209, (1987).
[Qin89] M. Z. Qin: Canonical difference scheme for the Hamiltonian equation. Mathematical
Methods in the Applied Sciences, 11:543–557, (1989).
[Qin90] M. Z. Qin: Multi-stage symplectic schemes of two kinds of Hamiltonian systems of
wave equations. Computers Math. Applic., 19:51–62, (1990).
[Qin96] M. Z. Qin: Symplectic difference schemes for nonautonomous Hamiltonian systems.
Acta Applicandae Mathematicae, 12(3):309–321, (1996).
[Qin97a] M. Z. Qin: A symplectic scheme for PDEs. AMS/IP Studies in Advanced
Mathematics, 5:349–354, (1997).
[QT90a] G. D. Quinlan and S. Tremaine: Symmetric multistep methods for the numerical
integration of planetary orbits. Astron. J., 100:1694–1700, (1990).
[QT90b] G.D. Quinlan and S. Tremaine: Symmetric multistep methods for the numerical
integration of planetary orbits. Astron. J., 100:1694–1700, (1990).
[QT96] G. R. W. Quispel and G. S. Turner: Discrete gradient methods for solving ODEs
numerically while preserving a first integral. J. Phys. A: Math. Gen., 29:L341–L349, (1996).
[QWZ91] M. Z. Qin, D. L. Wang, and M. Q. Zhang: Explicit symplectic difference schemes
for separable Hamiltonian systems. J. Comput. Math., 9(3):211–221, (1991).
[QZ90a] M. Z. Qin and M. Q. Zhang: Explicit Runge-Kutta-like schemes to solve certain
quantum operator equations of motion. J. Stat. Phys., 60(5/6):839–843, (1990).
[QZ91] M. Z. Qin and W. J. Zhu: Canonical Runge–Kutta–Nyström (RKN) methods for second
order ODEs. Computers Math. Applic., 22:85–95, (1991).
[QZ92a] M. Z. Qin and M. Q. Zhang: Symplectic Runge-Kutta Schemes for Hamiltonian
System. J. Comput. Math., Supplementary Issue: pages 205–215, (1992).
[QZ92b] M. Z. Qin and W. J. Zhu: Construction of higher order symplectic schemes by com-
position. Computing, 47:309–321, (1992).
[QZ94] M. Z. Qin and W. J. Zhu: Multiplicative extrapolation method for constructing higher
order schemes for ODEs. J. Comput. Math., 12:352–356, (1994).
[QZZ95] M. Z. Qin, W. J. Zhu, and M. Q. Zhang: Construction of a symplectic three-stage
difference scheme for ODEs. J. Comput. Math., 13:206–210, (1995).
[Rei94a] S. Reich: Momentum conserving symplectic integrators. Physica D, 76:375–383,
(1994).
[Rei95c] S. Reich: Smoothed dynamics of highly oscillatory Hamiltonian systems. Physica
D, 89:28–42, (1995).
[Rei96a] S. Reich: Enhancing energy conserving methods. BIT, 1:122–134, (1996).
[Rei96b] S. Reich: Symplectic integration of constrained Hamiltonian systems by composition
methods. SIAM J. Numer. Anal., 33:475–491, (1996).
[Rei96c] S. Reich: Symplectic Methods for Conservative Multibody Systems. In J. E. Mard-
sen, G. W. Patrick, and W. F. Shadwick, editors, Integration Algorithms and Classical Me-
chanics, pages 181–192. American Mathematical Society, (1996).
[Rei97] S. Reich: On higher-order semi-explicit symplectic partitioned Runge–Kutta methods
for constrained Hamiltonian systems. Numer. Math., 76(2):249–263, (1997).
[Rei99] S. Reich: Backward error analysis for numerical integrators. SIAM J. Numer. Anal.,
36:475–491, (1999).
[Rut83] R. Ruth: A canonical integration technique. IEEE Trans. Nucl. Sci., 30:2669–2671, (1983).
[SA91] J.M. Sanz-Serna and L. Abia: Order conditions for canonical Runge-Kutta schemes.
SIAM J. Numer. Anal., 28:1081–1096, (1991).
[SS88] J. M. Sanz-Serna: Runge–Kutta schemes for Hamiltonian systems. BIT, 28:877–883,
(1988).
[SSC94] J. M. Sanz-Serna and M. P. Calvo: Numerical Hamiltonian Problems. AMMC 7.
Chapman & Hall, (1994).
[SSM92] S. Saito, H. Sugiura, and T. Mitsui: Family of symplectic implicit Runge-Kutta
formulae. BIT, 32:539–543, (1992).
[SSM92b] S. Saito, H. Sugiura, and T. Mitsui: Butcher’s simplifying assumption for symplec-
tic integrators. BIT, 32:345–349, (1992).
[Sto93b] D.M. Stoffer: Variable step size destabilizes the Störmer/leap-frog/Verlet method.
BIT, 33:172–175, (1993).
[Sto95] D. Stoffer: Variable steps for reversible integration methods. Computing, 55:1–22,
(1995).
[Sto97] D. Stoffer: On the Qualitative Behaviour of Symplectic Integrators Part I: Perturbed
Linear Systems. Numer. Math., 77(4):535–548, (1997).
[Sto98a] D. Stoffer: On the qualitative behaviour of symplectic integrators. II: Integrable systems.
J. of Math. Anal. and Applic., 217:501–520, (1998).
[Sto98b] D. Stoffer: On the qualitative behaviour of symplectic integrators. III: Perturbed
integrable systems. J. of Math. Anal. and Appl., 217:521–545, (1998).
[Sun93a] G. Sun: Construction of high order symplectic Runge-Kutta methods. J. Comput.
Math., 11(3):250–260, (1993).
[Sun93b] G. Sun: Symplectic partitioned Runge–Kutta methods. J. Comput. Math., 11:365–
372, (1993).
[Sun94] G. Sun: Characterization and construction of linear symplectic R–K methods. J.
Comput. Math., 12(2):101–112, (1994).
[Sun95] G. Sun: Construction of high order symplectic Partitioned–Runge–Kutta methods. J.
Comput. Math., 13(1):40–50, (1995).
[Sun00] G. Sun: A simple way constructing symplectic Runge–Kutta methods. J. Comput.
Math., 18:61–68, (2000).
[Sur88] Y.B. Suris: On the conservation of the symplectic structure in the numerical solu-
tion of Hamiltonian systems(in Russian), In: Numerical Solution of Ordinary Differential
Equations, ed. S.S. Filippov, Keldysh Institute of Applied Mathematics. USSR Academy
of Sciences, Moscow, Second edition, (1988).
[Sur89] Y.B. Suris: The canonicity of mappings generated by Runge–Kutta type methods
when integrating the systems ẍ = −∂U/∂x. U.S.S.R. Comput. Maths. Math. Phys., 29:138–
144, (1989).
[Sur90] Y.B. Suris: Hamiltonian methods of Runge–Kutta type and their variational interpre-
tation. Math. Model., 2:78–87, in Russian, (1990).
[Tan93a] Y. F. Tang: The symplecticity of multi-step methods. Computers Math. Applic.,
25:83–90, (1993).
[Tan93b] Y. F. Tang: The necessary condition for Runge–Kutta schemes to be symplectic for
Hamiltonian systems. Computers Math. Applic., 26:13–20, (1993).
[Tan94] Y. F. Tang: Formal energy of a symplectic scheme for Hamiltonian systems and its
applications. Computers Math. Applic., 27:31–39, (1994).
[Vog56] R. de Vogelaere: Methods of integration which preserve the contact transformation
property of the Hamiltonian equations. Report No. 4, Dept. Math., Univ. of Notre Dame,
Notre Dame, Ind., Second edition, (1956).
[Wan91a] D. L. Wang: Semi-discrete Fourier spectral approximations of infinite dimensional
Hamiltonian systems and conservations laws. Computers Math. Applic., 21:63–75, (1991).
[Wan91b] D. L. Wang: Symplectic difference schemes for Hamiltonian systems on Poisson
manifolds. J. Comput. Math., 9(2):115–124, (1991).
[Wan91c] D. L. Wang: Poisson difference schemes for Hamiltonian systems on Poisson man-
ifolds. J. Comput. Math., 9:115–124, (1991).
[Wan93] D. L. Wang: Decomposition vector fields and composition of algorithms. In Proceed-
ings of International Conference on computation of differential equations and dynamical
systems, Beijing, 1993. World Scientific, (1993).
[Wan94] D. L. Wang: Some aspects of Hamiltonian systems and symplectic difference meth-
ods. Physica D, 73:1–16, (1994).
[Yos90] H. Yoshida: Conserved quantities of symplectic integrators for Hamiltonian systems.
Preprint, (1990).
[ZQ93a] M. Q. Zhang and M. Z. Qin: Explicit symplectic schemes to solve vortex systems.
Comp. & Math. with Applic., 26(5):51, (1993).
[ZQ93b] W. Zhu and M. Qin: Application of higher order self-adjoint schemes for PDEs. Com-
puters Math. Applic., 26(3):15–26, (1993).
[ZQ93c] W. Zhu and M. Qin: Constructing higher order schemes by formal power series. Com-
puters Math. Applic., 25(12):31–38, (1993).
[ZQ93] W. Zhu and M. Qin: Order conditions of two kinds of canonical difference schemes.
Computers Math. Applic., 25(6):61–74, (1993).
[ZQ94] W. Zhu and M. Qin: Poisson schemes for Hamiltonian systems on Poisson manifolds.
Computers Math. Applic., 27:7–16, (1994).
[ZQ95a] W. Zhu and M. Qin: Reply to “comment on Poisson schemes for Hamiltonian systems
on Poisson manifolds”. Computers Math. Applic., 29(7):1, (1995).
[ZQ95b] W. Zhu and M. Qin: Simplified order conditions of some canonical difference
schemes. J. Comput. Math., 13(1):1–19, (1995).
[ZS75] K. Zare and V. Szebehely: Time transformations in the extended phase-space. Celest.
Mech., 11:469–482, (1975).
[ZS95] M. Q. Zhang and R. D. Skeel: Symplectic integrators and the conservation of angular
momentum. J. Comput. Chem., 16:365–369, (1995).
[ZS97] M. Q. Zhang and R. D. Skeel: Cheap implicit symplectic integrators. Appl. Numer.
Math., 25(2):297, (1997).
[ZW99] H. P. Zhu and J. K. Wu: Generalized canonical transformations and symplectic algo-
rithm of the autonomous Birkhoffian systems. Progr. Natur. Sci., 9:820–828, (1999).
[ZzT96] W. Zhu, X. Zhao, and Y. Tang: Numerical methods with a high order of accuracy
applied in the quantum system. J. Chem. Phys., 104(6):2275–2286, (1996).
Chapter 8.
Composition Scheme

In this chapter, we consider a class of reversible schemes, also called symmetric
schemes; in algebraic language, these are precisely the self-adjoint schemes. Here we
deal only with one-step reversible schemes. We will introduce the concept of the adjoint
method and some of its properties, and show that every self-adjoint scheme is of even
order. Using self-adjoint schemes of lower order, we can construct higher order schemes
by "composing" a method, and this construction can be repeated to obtain schemes of
arbitrary even order. The composition technique presented here can be used for both
non-symplectic and symplectic schemes. In [Yos90], H. Yoshida proposed a method to
obtain multistage higher order explicit symplectic schemes for separable Hamiltonian
systems by composing lower order ones. As shown in [QZ92, Wru96, Suz92], the method
can also be applied to non-symplectic schemes, for both Hamiltonian and
non-Hamiltonian systems.
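The adjoint construction can be illustrated with a short numerical sketch (Python; the test problem and helper names are our own choices, not from the text). The adjoint of the explicit Euler method is the implicit Euler method, and composing an Euler half-step with its adjoint half-step yields the self-adjoint trapezoidal rule: each factor is of order 1, while the self-adjoint composition has even order 2.

```python
import math

def euler(f, z, h):
    # explicit Euler step: Z1 = Z0 + h*f(Z0), order 1
    return z + h * f(z)

def euler_adjoint(f, z, h, iters=60):
    # adjoint of explicit Euler = implicit Euler: Z1 = Z0 + h*f(Z1),
    # solved here by fixed-point iteration (adequate for small h)
    z1 = z
    for _ in range(iters):
        z1 = z + h * f(z1)
    return z1

def composed(f, z, h):
    # Euler half-step followed by its adjoint half-step:
    # the result is the self-adjoint trapezoidal rule, order 2
    return euler_adjoint(f, euler(f, z, h / 2), h / 2)

def global_error(step, h, t_end=1.0):
    # integrate dZ/dt = -Z, Z(0) = 1, with exact solution exp(-t)
    f = lambda z: -z
    z = 1.0
    for _ in range(round(t_end / h)):
        z = step(f, z, h)
    return abs(z - math.exp(-t_end))

# halving h divides the error by ~2 for Euler (order 1),
# and by ~4 for the self-adjoint composition (order 2)
r_euler = global_error(euler, 0.01) / global_error(euler, 0.005)
r_comp = global_error(composed, 0.01) / global_error(composed, 0.005)
```

The jump from order 1 to order 2 under composition with the adjoint is the one-step analogue of the even-order property discussed above.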

8.1 Construction of Fourth Order with 3-Stage Scheme


In this section, we construct a 3-stage difference scheme of fourth order by symmetrically
composing second order schemes.

8.1.1 For Single Equation


We know that the trapezoidal scheme
$$Z_{k+1} = Z_k + \frac{h}{2}\,\bigl[f(Z_k) + f(Z_{k+1})\bigr] \qquad (1.1)$$
with step length $h$, is of order 2 for the ODE
$$\dot Z = f(Z). \qquad (1.2)$$
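A quick numerical check of this statement (our own sketch; the logistic test equation is an arbitrary choice): one trapezoidal step has local error O(h³), consistent with global order 2.

```python
import math

def trapezoid_step(f, z0, h, iters=80):
    # one step of (1.1); the implicit equation is solved by
    # fixed-point iteration (adequate for small h)
    z1 = z0
    for _ in range(iters):
        z1 = z0 + 0.5 * h * (f(z0) + f(z1))
    return z1

def local_error(h, z0=0.25):
    # logistic equation dZ/dt = Z*(1 - Z); its exact flow over time h is
    # Z(h) = z0 / (z0 + (1 - z0)*exp(-h))
    f = lambda z: z * (1.0 - z)
    exact = z0 / (z0 + (1.0 - z0) * math.exp(-h))
    return abs(trapezoid_step(f, z0, h) - exact)

# local error is O(h^3): halving h divides it by about 2**3 = 8
ratio = local_error(0.02) / local_error(0.01)
```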

We expect that the 3-stage method of the form
$$\begin{aligned}
Z_1 &= Z_0 + c_1 h\,\bigl[f(Z_0) + f(Z_1)\bigr],\\
Z_2 &= Z_1 + c_2 h\,\bigl[f(Z_1) + f(Z_2)\bigr], \qquad (1.3)\\
Z_3 &= Z_2 + c_3 h\,\bigl[f(Z_2) + f(Z_3)\bigr]
\end{aligned}$$

would be of order 4 (i.e., $Z_3 - Z(t+h) = O(h^5)$) when the parameters $c_1$, $c_2$, and $c_3$
are chosen properly, where $Z_0 = Z(t)$, $Z(t+h)$ is the exact solution at $t+h$, and $Z_3$
is the numerical one.
We will use the method of Taylor expansion to deal with the simple case of a single
ordinary differential equation (ODE). For systems of ODEs the Taylor expansions become
very complicated, although they can certainly be applied and the same conclusion can be
derived. For the latter case we introduce another method [HNW93], known as "trees and
elementary differentials". In fact, the essence of the two methods is the same; they are
just two different ways of expression.

In this section, unless stated otherwise, the values of all functions and their
derivatives are taken at $Z_0$, and we keep only the terms up to $O(h^4)$, omitting
higher order terms in $h$:
$$f = f(Z_0),\quad f' = f'(Z_0),\quad \ldots$$

First, we calculate the Taylor expansion of the exact solution. Since
$$\begin{cases}
\dot Z = f,\\
\ddot Z = f' \cdot \dot Z = f' f,\\
Z^{(3)} = f'' f^2 + f'^2 f,\\
Z^{(4)} = f''' f^3 + 4 f'' f' f^2 + f'^3 f,
\end{cases} \qquad (1.4)$$
we have, with $Z_0 = Z(t)$,
$$Z(t+h) = Z_0 + h f + \frac{h^2}{2!} f' f + \frac{h^3}{3!}\bigl(f'' f^2 + f'^2 f\bigr) + \frac{h^4}{4!}\bigl(f''' f^3 + 4 f'' f' f^2 + f'^3 f\bigr) + O(h^5). \qquad (1.5)$$
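The expansion (1.5) is easy to sanity-check numerically (our own sketch). For f(Z) = Z² we have f' = 2Z, f'' = 2, f''' = 0, and the exact solution of Ż = Z², Z(0) = Z₀, is Z(h) = Z₀/(1 − Z₀h); the truncated series (1.5) must agree with this flow up to O(h⁵).

```python
def series_15(z0, h):
    # right-hand side of (1.5) for f(Z) = Z**2:
    # here f = z0**2, f' = 2*z0, f'' = 2, f''' = 0
    f, f1, f2, f3 = z0 ** 2, 2.0 * z0, 2.0, 0.0
    return (z0 + h * f
            + h ** 2 / 2.0 * f1 * f
            + h ** 3 / 6.0 * (f2 * f ** 2 + f1 ** 2 * f)
            + h ** 4 / 24.0 * (f3 * f ** 3 + 4.0 * f2 * f1 * f ** 2 + f1 ** 3 * f))

def remainder(h, z0=0.5):
    exact = z0 / (1.0 - z0 * h)      # exact flow of dZ/dt = Z**2
    return abs(exact - series_15(z0, h))

# the remainder is O(h^5): halving h divides it by about 2**5 = 32
ratio = remainder(0.02) / remainder(0.01)
```

For this particular f the series terms reproduce the geometric series Z₀ + Z₀²h + Z₀³h² + Z₀⁴h³ + Z₀⁵h⁴ exactly, so the remainder is Z₀⁶h⁵/(1 − Z₀h).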
Now we turn to the Taylor expansion of the numerical solution. We can rewrite (1.3) as
$$\begin{aligned}
Z_3 &= Z_0 + c_1 h\,\bigl[f(Z_0)+f(Z_1)\bigr] + c_2 h\,\bigl[f(Z_1)+f(Z_2)\bigr] + c_3 h\,\bigl[f(Z_2)+f(Z_3)\bigr]\\
&= Z_0 + h\,\bigl[c_1 f(Z_0) + (c_1+c_2) f(Z_1) + (c_2+c_3) f(Z_2) + c_3 f(Z_3)\bigr]. \qquad (1.6)
\end{aligned}$$

We need the Taylor expansions of $f(Z_1)$, $f(Z_2)$, $f(Z_3)$ up to the terms of order 3
in $h$. Note that
$$Z_2 - Z_0 = c_1 h\,\bigl[f(Z_1)+f(Z_0)\bigr] + c_2 h\,\bigl[f(Z_2)+f(Z_1)\bigr] = c_1 h f(Z_0) + (c_1+c_2) h f(Z_1) + c_2 h f(Z_2), \qquad (1.7)$$
and expand each stage by
$$f(Z_i) = f(Z_0) + f'\,(Z_i - Z_0) + \frac{f''}{2!}(Z_i - Z_0)^2 + \frac{f'''}{3!}(Z_i - Z_0)^3 + O(h^4). \qquad (1.8)$$

Since $Z_1 = Z_0 + c_1 h\,\bigl[f(Z_0) + f(Z_1)\bigr]$, by (1.8) we have
$$f(Z_1) = f(Z_0) + f'\,\bigl[c_1 h f(Z_0) + c_1 h f(Z_1)\bigr] + \frac{f''}{2!}\bigl[c_1 h f(Z_0) + c_1 h f(Z_1)\bigr]^2 + \frac{f'''}{3!}\bigl[c_1 h f(Z_0) + c_1 h f(Z_1)\bigr]^3 + O(h^4). \qquad (1.9)$$
Inserting the Taylor expansion of $f(Z_1)$ repeatedly into the right side of (1.9) and
collecting powers of $h$, we get
$$f(Z_1) = f(Z_0) + 2 c_1 h\, f' f(Z_0) + (c_1 h)^2\,\bigl[2 f'^2 f(Z_0) + 2 f'' f^2(Z_0)\bigr] + (c_1 h)^3\,\Bigl[2 f'^3 f(Z_0) + 6 f'' f' f^2(Z_0) + \frac{4}{3} f''' f^3(Z_0)\Bigr] + O(h^4). \qquad (1.10)$$
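Formula (1.10) can be verified numerically (our own sketch): solve Z₁ = Z₀ + c₁h[f(Z₀) + f(Z₁)] by fixed-point iteration for a smooth f and compare f(Z₁) with the right side of (1.10); the difference must be O(h⁴). The expansion holds for any c₁; below we use the c₁ of the scheme (1.22), and f = sin is our own test choice.

```python
import math

def f(z):                # a smooth test right-hand side (our choice)
    return math.sin(z)

C1 = 1.0 / (2.0 * (2.0 - 2.0 ** (1.0 / 3.0)))   # c1 of (1.22); any c1 works

def f_z1_implicit(z0, h, iters=80):
    # solve Z1 = Z0 + c1*h*(f(Z0) + f(Z1)) by fixed-point iteration
    z1 = z0
    for _ in range(iters):
        z1 = z0 + C1 * h * (f(z0) + f(z1))
    return f(z1)

def f_z1_series(z0, h):
    # right-hand side of (1.10); derivatives of sin evaluated at Z0
    fv, f1, f2, f3 = math.sin(z0), math.cos(z0), -math.sin(z0), -math.cos(z0)
    a = C1 * h
    return (fv + 2.0 * a * f1 * fv
            + a ** 2 * (2.0 * f1 ** 2 * fv + 2.0 * f2 * fv ** 2)
            + a ** 3 * (2.0 * f1 ** 3 * fv + 6.0 * f2 * f1 * fv ** 2
                        + 4.0 / 3.0 * f3 * fv ** 3))

def err(h, z0=0.7):
    return abs(f_z1_implicit(z0, h) - f_z1_series(z0, h))

# the truncation error is O(h^4): halving h divides it by about 16
ratio = err(0.01) / err(0.005)
```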
Similarly, to develop $f(Z_2)$ and $f(Z_3)$: since
$$Z_2 - Z_0 = c_1 h\,\bigl[f(Z_1)+f(Z_0)\bigr] + c_2 h\,\bigl[f(Z_2)+f(Z_1)\bigr] = c_1 h f(Z_0) + (c_1+c_2) h f(Z_1) + c_2 h f(Z_2),$$
we have by (1.8)
$$f(Z_2) = f(Z_0) + h f'\,\bigl[c_1 f(Z_0) + (c_1+c_2) f(Z_1) + c_2 f(Z_2)\bigr] + h^2 \frac{f''}{2!}\bigl[c_1 f(Z_0) + (c_1+c_2) f(Z_1) + c_2 f(Z_2)\bigr]^2 + h^3 \frac{f'''}{3!}\bigl[c_1 f(Z_0) + (c_1+c_2) f(Z_1) + c_2 f(Z_2)\bigr]^3 + O(h^4). \qquad (1.11)$$

Similarly, inserting the Taylor expansion of $f(Z_1)$ into (1.11) and reducing, we get
$$\begin{aligned}
f(Z_2) = f(Z_0) &+ 2(c_1+c_2) h\, f' f(Z_0) + h^2 (c_1+c_2)^2\,\bigl[2 f'^2 f(Z_0) + 2 f'' f^2(Z_0)\bigr]\\
&+ h^3\Bigl[ 2(c_1+c_2)(c_1^2+c_1 c_2+c_2^2)\, f'^3 f(Z_0)\\
&\qquad + \bigl(2 c_1^2 (c_1+c_2) + 2 c_2 (c_1+c_2)^2 + 4 (c_1+c_2)^3\bigr)\, f'' f' f^2(Z_0)\\
&\qquad + \frac{4}{3}(c_1+c_2)^3\, f''' f^3(Z_0)\Bigr] + O(h^4). \qquad (1.12)
\end{aligned}$$
Using the identical method, we have
$$\begin{aligned}
f(Z_3) = f(Z_0) &+ h f'\,\bigl[c_1 f(Z_0) + (c_1+c_2) f(Z_1) + (c_2+c_3) f(Z_2) + c_3 f(Z_3)\bigr]\\
&+ h^2 \frac{f''}{2!}\bigl[c_1 f(Z_0) + (c_1+c_2) f(Z_1) + (c_2+c_3) f(Z_2) + c_3 f(Z_3)\bigr]^2\\
&+ h^3 \frac{f'''}{3!}\bigl[c_1 f(Z_0) + (c_1+c_2) f(Z_1) + (c_2+c_3) f(Z_2) + c_3 f(Z_3)\bigr]^3 + O(h^4). \qquad (1.13)
\end{aligned}$$
Inserting the Taylor expansions of $f(Z_1)$ and $f(Z_2)$ into (1.13),
$$\begin{aligned}
f(Z_3) = f(Z_0) &+ 2(c_1+c_2+c_3) h\, f' f(Z_0) + h^2 (c_1+c_2+c_3)^2\,\bigl[2 f'^2 f(Z_0) + 2 f'' f^2(Z_0)\bigr]\\
&+ h^3\Bigl[ \bigl((c_1+c_2) c_1^2 + (c_2+c_3)(c_1+c_2)^2 + c_3 (c_1+c_2+c_3)^2\bigr)\, 2 f'^3 f(Z_0)\\
&\qquad + \bigl((c_1+c_2) c_1^2 + (c_2+c_3)(c_1+c_2)^2 + c_3 (c_1+c_2+c_3)^2 + 2(c_1+c_2+c_3)^3\bigr)\, 2 f'' f' f^2(Z_0)\\
&\qquad + \frac{4}{3}(c_1+c_2+c_3)^3\, f''' f^3(Z_0)\Bigr] + O(h^4). \qquad (1.14)
\end{aligned}$$
Let $c_1 = c_3 = \frac{w_1}{2}$, $c_2 = \frac{w_0}{2}$. Taking (1.5) and (1.6) into
account, compare the Taylor expansion (1.5) of the exact solution with the expansion of
$Z_3$ obtained above. In order to get a fourth order accurate scheme (1.3), we need to
solve the following equations for the coefficients $c_1$, $c_2$, $c_3$:

hf  : c1 + (c1 + c2 ) + (c2 + c3 ) + c3 = 1 =⇒ 2w1 + w0 = 1, (1.15)


2 
h f f : (c1 + c2 )2c1 + (c2 + c3 )2(c1 + c2 ) + c3 2(c1 + c2 + c3 )
1
= , (1.16)
2
2
h3 f  f 2 , h3 f  f : (c1 + c2 )2c21 + (c2 + c3 )2(c1 + c2 )2 + c3 2(c1 + c2 + c3 )2
1
= , (1.17)
6
4 4 4
h4 f  f 3 : (c1 + c2 ) c31 + (c2 + c3 ) (c1 + c2 )3 + c3 (c1 + c2 + c3 )3
3 3 3
1
= , (1.18)
24
3
h4 f  f : (c1 + c2 )2c31 + (c1 + c2 )2(c21 + c22 + c1 c2 )(c2 + c3 )
 
+c3 2 (c1 + c2 )c21 + (c2 + c3 )(c1 + c2 )2 + c3 (c1 + c2 + c3 )2
1
= , (1.19)
24
4   2

h f ff : (c1 + c2 )6c31
+ (c2 + c3 ) 4(c1 + c2 ) + 3
+ c2 ) 2c21 (c1

+2c2 (c1 + c2 ) + c3 2(c1 (c1 + c2 ) + (c2 + c3 )(c1 + c2 )2
2 2

 1
+c3 (c1 + c2 + c3 )2 + 2(c1 + c2 + c3 )3 = . (1.20)
24
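These conditions can be spot-checked numerically with the coefficients that will be obtained in (1.22) (a Python sketch of ours). Comparing with (1.5), the exact coefficient of h⁴f''f'f² is 4/4! = 1/6; accordingly the left side of the f''f'f² condition evaluates to 1/6, while the left sides of (1.15)–(1.19) evaluate to 1, 1/2, 1/6, 1/24, and 1/24.

```python
# c1 = c3 = w1/2, c2 = w0/2 with w1 = 1/(2 - 2^(1/3)), w0 = 1 - 2*w1
w1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
w0 = 1.0 - 2.0 * w1
c1, c2, c3 = w1 / 2.0, w0 / 2.0, w1 / 2.0
u = c1 + c2 + c3                      # equals 1/2 by (1.15)

lhs_15 = c1 + (c1 + c2) + (c2 + c3) + c3
lhs_16 = 2 * ((c1 + c2) * c1 + (c2 + c3) * (c1 + c2) + c3 * u)
lhs_17 = 2 * ((c1 + c2) * c1 ** 2 + (c2 + c3) * (c1 + c2) ** 2 + c3 * u ** 2)
lhs_18 = 4.0 / 3.0 * ((c1 + c2) * c1 ** 3 + (c2 + c3) * (c1 + c2) ** 3 + c3 * u ** 3)
Q = (c1 + c2) * c1 ** 2 + (c2 + c3) * (c1 + c2) ** 2 + c3 * u ** 2  # recurring bracket
lhs_19 = (2 * (c1 + c2) * c1 ** 3
          + 2 * (c1 + c2) * (c1 ** 2 + c1 * c2 + c2 ** 2) * (c2 + c3)
          + 2 * c3 * Q)
lhs_20 = (6 * (c1 + c2) * c1 ** 3
          + (c2 + c3) * (2 * c1 ** 2 * (c1 + c2)
                         + 2 * c2 * (c1 + c2) ** 2 + 4 * (c1 + c2) ** 3)
          + c3 * (2 * Q + 4 * u ** 3))
```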
1 1
When 2w1 + w0 = 1 holds, the Equation (1.16) becomes = , i.e., identity, and
2 2
the Equations (1.17) – (1.20) become the same, i.e.,

6w13 − 12w12 + 6w1 − 1 = 0.

Thus, we get the conditions for the difference scheme (1.3) to be of order 4:

2w_1 + w_0 = 1,
6w_1^3 − 12w_1^2 + 6w_1 − 1 = 0.        (1.21)

Thus we get

w_0 = −2^{1/3}/(2 − 2^{1/3}),    w_1 = 1/(2 − 2^{1/3}).

Now, scheme (1.3) becomes

Z_1 = Z_0 + (1/(2(2 − 2^{1/3}))) h [ f(Z_0) + f(Z_1) ],
Z_2 = Z_1 + (−2^{1/3}/(2(2 − 2^{1/3}))) h [ f(Z_1) + f(Z_2) ],        (1.22)
Z_3 = Z_2 + (1/(2(2 − 2^{1/3}))) h [ f(Z_2) + f(Z_3) ].
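The fourth order of scheme (1.22) is easy to confirm numerically. The sketch below (the test problem y' = −y^3, the fixed-point solver for the implicit stages, and all tolerances are our own choices, not from the text) halves the step size and checks that the error drops by about 2^4 = 16:

```python
import math

# Weights from (1.21): w1 = 1/(2 - 2^(1/3)), w0 = 1 - 2*w1
w1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
w0 = 1.0 - 2.0 * w1

def trap_step(f, y, h):
    """One implicit trapezoidal substep, solved by fixed-point iteration."""
    z = y
    for _ in range(100):
        z = y + 0.5 * h * (f(y) + f(z))
    return z

def composed_step(f, y, h):
    """Scheme (1.22): trapezoidal substeps of lengths w1*h, w0*h, w1*h."""
    for w in (w1, w0, w1):
        y = trap_step(f, y, w * h)
    return y

def integrate(h, t_end=1.0, y0=1.0):
    y = y0
    for _ in range(round(t_end / h)):
        y = composed_step(lambda u: -u ** 3, y, h)
    return y

exact = 1.0 / math.sqrt(3.0)     # y' = -y^3, y(0) = 1, at t = 1
e1 = abs(integrate(0.02) - exact)
e2 = abs(integrate(0.01) - exact)
print(e1 / e2)                   # close to 16 for a 4th order scheme
```

Any smooth test problem with a known solution works equally well here; the ratio only reflects the leading error term.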
8.1.2 For System of Equations

We use the “method of tree and elementary differentials” [HNW93] given in Chapter 7.
We first rewrite scheme (1.3) in R–K form:

Z_1 = Z_0 + h c_1 [ f(Z_0) + f(Z_1) ],
Z_2 = Z_0 + h [ c_1 f(Z_0) + (c_1+c_2) f(Z_1) + c_2 f(Z_2) ],        (1.23)
Z_3 = Z_0 + h [ c_1 f(Z_0) + (c_1+c_2) f(Z_1) + (c_2+c_3) f(Z_2) + c_3 f(Z_3) ].

Obviously, the above equations are equivalent to the following:

g_1 = Z_0,
g_2 = Z_0 + c_1 h f(g_1) + c_1 h f(g_2),
g_3 = Z_0 + c_1 h f(g_1) + (c_1+c_2) h f(g_2) + c_2 h f(g_3),
g_4 = Z_0 + c_1 h f(g_1) + (c_1+c_2) h f(g_2) + (c_2+c_3) h f(g_3) + c_3 h f(g_4),
Z = Z_0 + h [ c_1 f(g_1) + (c_1+c_2) f(g_2) + (c_2+c_3) f(g_3) + c_3 f(g_4) ],        (1.24)

where g_2 = Z_1, g_3 = Z_2, g_4 = Z_3, and Z = Z_3. Thus, the Butcher tableau

c | A
  | b^T

takes the following form:

0               | 0    0        0        0
2c_1            | c_1  c_1      0        0
2(c_1+c_2)      | c_1  c_1+c_2  c_2      0
2(c_1+c_2+c_3)  | c_1  c_1+c_2  c_2+c_3  c_3
----------------+---------------------------
                | c_1  c_1+c_2  c_2+c_3  c_3
From the previous chapter, we have the following order condition for R–K methods:

Theorem 1.1. The R–K method

g_i^J = Z_0^J + h Σ_{j=1}^s a_{ij} f^J(g_j^1, ···, g_j^n),
Z_1^J = Z_0^J + h Σ_{j=1}^s b_j f^J(g_j^1, ···, g_j^n)

is of order p iff

Σ_{j=1}^s b_j φ_j(ρt) = 1/γ(ρt)

for all rooted trees ρt with r(ρt) ≤ p, where Z_0 = (Z_0^1, ···, Z_0^n)^T, f = (f^1, f^2, ···, f^n)^T.

Since the rooted trees ρt of Theorem 1.1 are defined in Chapter 7, we only recall the definition of φ_j(ρt):

φ_j(ρt) = Σ_{k,l,···} a_{jk} a_{kl} ···,

where ρt is a labelled tree with root j, and the sum runs over the r(ρt) − 1 remaining indices k, l, ···. The summand is a product of r(ρt) − 1 factors a, in which each father stands paired with its sons as indices.
Σ_j b_j = 1,                                Σ_{j,k} b_j a_{jk} = 1/2,
Σ_{j,k,l} b_j a_{jk} a_{jl} = 1/3,          Σ_{j,k,l} b_j a_{jk} a_{kl} = 1/6,
Σ_{j,k,l,m} b_j a_{jk} a_{jl} a_{jm} = 1/4, Σ_{j,k,l,m} b_j a_{jk} a_{kl} a_{jm} = 1/8,
Σ_{j,k,l,m} b_j a_{jk} a_{kl} a_{km} = 1/12, Σ_{j,k,l,m} b_j a_{jk} a_{kl} a_{lm} = 1/24,
                                                                        (1.25)

where all indices run from 1 to 4.

From the previous chapter, on the simplifying conditions for symplectic R–K methods, we know that the system of Equations (1.25) contains only 3 independent conditions. Among the last four conditions above there is only one independent condition, and by the symmetric choice c_1 = c_3 it is satisfied automatically. Taking c_1 = c_3 = w_1/2, c_2 = w_0/2, the first two conditions of (1.25) give the same equation

2w_1 + w_0 = 1.        (1.26)

Substituting the relation (1.26) into the order conditions (1.25), we get

2w_1^3 + w_0^3 = 0.        (1.27)

Equations (1.26) and (1.27) are just the same as those obtained in Subsection 8.1.1 for a single equation.

From the literature [Fen85], we know that the centered Euler (midpoint) scheme is symplectic, while the trapezoidal scheme (1.1) is not symplectic; however, by a nonlinear transformation [Dah75,QZZ95,Fen92], scheme (1.1) can be transformed into the centered Euler scheme, so the trapezoidal scheme is nonstandard symplectic, as discussed in Section 4.3 of Chapter 4. Just as with the trapezoidal scheme, the centered Euler scheme may also be used to construct higher order schemes. Because

Z_1 = Z_0 + d_1 h f( (Z_0 + Z_1)/2 ),
Z_2 = Z_1 + d_2 h f( (Z_1 + Z_2)/2 ),
Z_3 = Z_2 + d_3 h f( (Z_2 + Z_3)/2 ),

it corresponds to the R–K method with the following Butcher tableau:

d_1/2              | d_1/2  0      0
d_1 + d_2/2        | d_1    d_2/2  0
d_1 + d_2 + d_3/2  | d_1    d_2    d_3/2
-------------------+---------------------
                   | d_1    d_2    d_3

Using the same method, we can prove that when d_1 = d_3 = 1/(2 − 2^{1/3}), d_2 = −2^{1/3}/(2 − 2^{1/3}), the above scheme is of fourth order, and the coefficients are entirely the same as in the trapezoidal composition.

8.2 Adjoint Method and Self-Adjoint Method


Here, we introduce the concepts of adjoint scheme and self-adjoint scheme. These two kinds of schemes are the foundation for constructing higher order schemes later. First, we look at several higher order schemes as examples, and seek the common character that may supply a method for constructing higher order schemes. In Section 4.4 of Chapter 4, we discussed an explicit scheme for the separable Hamiltonian system:

p^{n+1} = p^n − τ V_q(q^n),
q^{n+1} = q^n + τ U_p(p^{n+1}),        (2.1)

(where τ is the step size, and p^n, q^n are the numerical solutions at step n), which is of order 1. We shall compose this scheme (2.1) into a 2nd order scheme by choosing suitable coefficients of τ:

p^{n+1/2} = p^n − (τ/2) V_q(q^n),
q^{n+1/2} = q^n + τ U_p(p^{n+1/2}),
p^{n+1} = p^{n+1/2} − (τ/2) V_q(q^{n+1/2}),
q^{n+1} = q^{n+1/2}.

This scheme is equal to the following:

p^{n+1/2} = p^n − (τ/2) V_q(q^n),
q^{n+1/2} = q^n + (τ/2) U_p(p^{n+1/2}),
q^{n+1} = q^{n+1/2} + (τ/2) U_p(p^{n+1/2}),        (2.2)
p^{n+1} = p^{n+1/2} − (τ/2) V_q(q^{n+1}).

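As a quick numerical illustration of scheme (2.2) (the harmonic oscillator V(q) = q^2/2, U(p) = p^2/2 is our own choice of test problem, not from the text), the second order can be checked by halving the step size:

```python
import math

def step_22(p, q, tau, Vq=lambda q: q, Up=lambda p: p):
    """One step of scheme (2.2): half kick, full drift, half kick."""
    p -= 0.5 * tau * Vq(q)
    q += 0.5 * tau * Up(p)
    q += 0.5 * tau * Up(p)
    p -= 0.5 * tau * Vq(q)
    return p, q

def solve(tau, t_end=1.0):
    p, q = 0.0, 1.0                       # exact solution: q(t) = cos t
    for _ in range(round(t_end / tau)):
        p, q = step_22(p, q, tau)
    return q

e1 = abs(solve(0.02) - math.cos(1.0))
e2 = abs(solve(0.01) - math.cos(1.0))
print(e1 / e2)                            # close to 4 for a 2nd order scheme
```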
This 2nd order scheme can also be defined as a self-adjoint scheme, see also [Yos90]. Ruth [Rut83], using scheme (2.1), constructed a 3rd order scheme via the composition method:

p_1 = p^k − c_1 τ V_q(q^k),       q_1 = q^k + d_1 τ U_p(p_1),
p_2 = p_1 − c_2 τ V_q(q_1),       q_2 = q_1 + d_2 τ U_p(p_2),        (2.3)
p^{k+1} = p_2 − c_3 τ V_q(q_2),   q^{k+1} = q_2 + d_3 τ U_p(p^{k+1}).

When c_1 = 7/24, c_2 = 3/4, c_3 = −1/24, d_1 = 2/3, d_2 = −2/3, d_3 = 1, this scheme is of 3rd order. We may construct multistage schemes, in order to achieve the higher order
precision. In the literature [QZ92,Fen86,Fen91,FR90,Rut83], we may see the following 4th order form:

p_1 = p^k − c_1 τ V_q(q^k),       q_1 = q^k + d_1 τ U_p(p_1),
p_2 = p_1 − c_2 τ V_q(q_1),       q_2 = q_1 + d_2 τ U_p(p_2),
p_3 = p_2 − c_3 τ V_q(q_2),       q_3 = q_2 + d_3 τ U_p(p_3),        (2.4)
p^{k+1} = p_3 − c_4 τ V_q(q_3),   q^{k+1} = q_3 + d_4 τ U_p(p^{k+1}),

where

c_1 = 0,   c_2 = c_4 = (2 + α)/3,   c_3 = −(1 + 2α)/3,
d_1 = d_4 = (2 + α)/6,   d_2 = d_3 = (1 − α)/6,   α = 2^{1/3} + 2^{−1/3},

or

c_1 = c_4 = (2 + α)/6,   c_2 = c_3 = (1 − α)/6,
d_1 = d_3 = (2 + α)/3,   d_2 = −(1 + 2α)/3,   d_4 = 0,   α = 2^{1/3} + 2^{−1/3}.
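These composition schemes are easy to test. The sketch below (the harmonic oscillator H = p^2/2 + q^2/2, for which V_q(q) = q and U_p(p) = p, and all tolerances are our own choices) runs Ruth's 3rd order scheme (2.3) and checks the expected error ratio 2^3 = 8; the 4th order scheme (2.4) can be checked the same way:

```python
import math

# Ruth's 3rd order coefficients for scheme (2.3)
cs = (7 / 24, 3 / 4, -1 / 24)
ds = (2 / 3, -2 / 3, 1.0)

def ruth3(p, q, tau):
    """One step of (2.3) for the harmonic oscillator: Vq = q, Up = p."""
    for c, d in zip(cs, ds):
        p -= c * tau * q
        q += d * tau * p
    return p, q

def solve(tau, t_end=1.0):
    p, q = 0.0, 1.0                       # exact: q(t) = cos t
    for _ in range(round(t_end / tau)):
        p, q = ruth3(p, q, tau)
    return q

e1 = abs(solve(0.02) - math.cos(1.0))
e2 = abs(solve(0.01) - math.cos(1.0))
print(e1 / e2)                            # close to 8 for a 3rd order scheme
```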

The above examples, up to the 3-stage fourth-order scheme, give us an understanding: we can construct higher order schemes from low-order schemes, and this method is not limited to the symplectic type. Because the usual construction uses Taylor expansion, the comparison of coefficients for higher order schemes becomes cumbersome; therefore, in this section and the next we will use the Lie series method. This method is already widely used; for example, S. Sternberg, A. J. Dragt and F. Neri used Lie series to study differential equations, see the literature [Ste84,DF76,DF83,Ner87]. Using Lie series for our question is convenient, as there is no need to extract which concrete form each term of the Lie series corresponds to; we only use its formal structure. We will see this later.
We know that each scheme can always be formally expressed as

y_{n+1} = S(τ) y_n,        (2.5)

where τ is the step size, S(τ) is called the integrator, and y_{n+1}, y_n are the numerical solutions of the equation at steps n + 1 and n.
Definition 2.1. An integrator S*(τ) is called the adjoint of the integrator S(τ) if

S*(−τ)S(τ) = I,        (2.6)
S(τ)S*(−τ) = I.        (2.7)

That means y_{n+1} = S(τ)y_n, y_n = S*(−τ)y_{n+1}, or y_{n+1} = S*(−τ)y_n, y_n = S(τ)y_{n+1}. In fact, Equations (2.6) – (2.7) are equivalent to

S(−τ)S*(τ) = I,        (2.8)
S*(τ)S(−τ) = I.        (2.9)

In order to prove this, replace τ by −τ; then (2.6) – (2.7) become

S*(τ)S(−τ) = I,
S(−τ)S*(τ) = I.

Since τ denotes an arbitrary step length, the above equations are exactly the formulas (2.8), (2.9). Further, we point out that the two conditions (2.6) and (2.7) are the same. From S*(−τ)S(τ) = I, we get

S*(−τ) = S^{−1}(τ),

where S^{−1}(τ) is the inverse of the integrator S(τ); here we always assume S(τ) is invertible. So we have

S(τ)S*(−τ) = S(τ)S^{−1}(τ) = I.

Thus (2.7) results from (2.6), and vice versa. But note the difference between S*(τ) and S^{−1}(τ): it is S*(−τ) that equals S^{−1}(τ). Here S*(τ) and S(τ) are both push-forward mappings, while S^{−1}(τ) is the pull-back mapping.

For convenient deduction in Section 8.3, we give here another definition of an adjoint method [HNW93], and show that it is equivalent to Definition 2.1. We rewrite (2.5) as follows:

y_{n+1} = y_n + τ φ(x, y_n, τ).        (2.10)

Here y_n, y_{n+1} are the numerical solutions of the equation y' = f(x, y) at steps n and n + 1, and φ is the increment function corresponding to the scheme (2.5).

Definition 2.2. The scheme y_{n+1} = y_n + τ φ*(x, y_n, τ) is the adjoint scheme of (2.10) if it satisfies

B = A − τ φ(x + τ, A, −τ),        (2.11)
A = B + τ φ*(x, B, τ).        (2.12)

Theorem 2.3. Definition 2.1 and Definition 2.2 are equivalent.

Proof. Since (2.8) – (2.9) and (2.6) – (2.7) are equivalent, (2.6) and (2.9) are also equivalent. It is enough to prove that (2.9) is equivalent to (2.11) – (2.12). Let

S*(τ)y_n = y_n + τ φ*(x, y_n, τ),
S(τ)y_n = y_n + τ φ(x, y_n, τ).

First we prove (2.9) ⟹ (2.11) – (2.12). Let

A = y_{n+1},
B = A − τ φ(x + τ, A, −τ),        (2.13)

and prove (2.12). Due to

S*(τ)S(−τ)y_{n+1} = S*(τ)[ y_{n+1} − τ φ(x + τ, y_{n+1}, −τ) ]
  = y_{n+1} − τ φ(x + τ, y_{n+1}, −τ) + τ φ*( x, y_{n+1} − τ φ(x + τ, y_{n+1}, −τ), τ )
  = y_{n+1} − τ φ(x + τ, y_{n+1}, −τ) + τ φ*(x, B, τ)
  = B + τ φ*(x, B, τ),

and because S*(τ)S(−τ) = I, we have

B + τ φ*(x, B, τ) = A,

which is the formula (2.12).

Now we prove (2.11) – (2.12) ⟹ (2.9). Let

A = y_{n+1},
B = A − τ φ(x + τ, A, −τ).

Since

S*(τ)S(−τ)y_{n+1} = S*(τ)[ y_{n+1} − τ φ(x + τ, y_{n+1}, −τ) ]
  = A − τ φ(x + τ, A, −τ) + τ φ*(x, B, τ),

from (2.12), we have

S*(τ)S(−τ)A = B + τ φ*(x, B, τ) = A = IA,

i.e., S*(τ)S(−τ) = I.  □

Definition 2.4. An integrator S(τ) is self-adjoint if

S*(τ) = S(τ),  i.e.,  S(−τ)S(τ) = I.

In [Yos90], H. Yoshida called an integrator with the property S(−τ)S(τ) = I reversible. We see that time reversibility and self-adjointness are the same. The time reversible (i.e., self-adjoint) integrator plays an important role in this chapter due to its special properties.
       
Theorem 2.5. For every integrator S(τ), S*(τ/2)S(τ/2) or S(τ/2)S*(τ/2) is a self-adjoint integrator [QZ92,Str68].

Proof. We must prove that S*(τ)S(τ) is self-adjoint, i.e.,

( S*(τ)S(τ) )* = S*(τ)S(τ).

By Definition 2.1, we have S*(τ)S(−τ) = I; then

( S*(τ)S(τ) )* = [ S*(−τ)S(−τ) ]^{−1}
              = S^{−1}(−τ) [ S*(−τ) ]^{−1}
              = S*(τ) [ S*(−τ) ]^{−1}.

Because we also have

S*(−τ)S(τ) = I,

i.e.,

[ S*(−τ) ]^{−1} = S(τ),

therefore

( S*(τ)S(τ) )* = S*(τ)S(τ).

Therefore, the theorem is proved.  □

Theorem 2.6. If S_1(τ) and S_2(τ) are self-adjoint integrators, then the symmetric composition S_1(τ)S_2(τ)S_1(τ) is self-adjoint [QZ92].

Proof. Since S_1(τ) and S_2(τ) are self-adjoint integrators,

( S_1(τ)S_2(τ)S_1(τ) )* = [ S_1(−τ)S_2(−τ)S_1(−τ) ]^{−1}
  = S_1(−τ)^{−1} S_2(−τ)^{−1} S_1(−τ)^{−1}
  = S_1*(τ) S_2*(τ) S_1*(τ) = S_1(τ)S_2(τ)S_1(τ).

The theorem is proved.  □

We point out that, in general, a composition of self-adjoint operators is not necessarily self-adjoint. A simple example is

( S_1(τ)S_2(τ) )* = S_2(τ)S_1(τ) ≠ S_1(τ)S_2(τ),

where S_1(τ) and S_2(τ) are self-adjoint operators that do not commute. We will construct higher order schemes in the next section.

8.3 Construction of Higher Order Schemes


We will first give constructed method for the higher difference scheme, and will further
prove that the Gauss–Legendre method is a self-adjoint method. We have given some
example for structured higher order schemes.
In this section, we will introduce first-order differential operators, Lie series and
some of their properties, all these are the basis of further deduction.
Denote:
f = (f1 , f2 , · · · , fn )T
g = (g1 , g2 , · · · , gn )T
% &T
d d
D= ,···, ,
d y1 d yn

where f_1, f_2, ···, f_n and g_1, g_2, ···, g_n are scalar functions. Let

L_f = f^T D = Σ_{i=1}^n f_i ∂/∂y_i        (3.1)

be a first-order differential operator. The action of L_f on a scalar function φ is

L_f φ = ( Σ_{i=1}^n f_i ∂/∂y_i ) φ = f^T Dφ(y).

It is linear and satisfies the Leibniz formula, i.e., for two scalar functions φ_1 and φ_2,

(1) L_f(λ_1 φ_1 + λ_2 φ_2) = λ_1 L_f φ_1 + λ_2 L_f φ_2,  ∀ λ_1, λ_2 ∈ R,        (3.2)
(2) L_f(φ_1 φ_2) = φ_1 L_f φ_2 + φ_2 L_f φ_1.        (3.3)

Definition 3.1. The commutator of two first-order differential operators L_f and L_g is defined by

[L_f, L_g] = L_f L_g − L_g L_f.        (3.4)

The commutator of two first-order differential operators is still a first-order differential operator, since

L_f L_g φ = f^T D( g^T Dφ ) = Σ_{i,j=1}^n f_j (∂g_i/∂y_j)(∂φ/∂y_i) + Σ_{i,j=1}^n f_j g_i ∂²φ/(∂y_j ∂y_i),
L_g L_f φ = g^T D( f^T Dφ ) = Σ_{i,j=1}^n g_j (∂f_i/∂y_j)(∂φ/∂y_i) + Σ_{i,j=1}^n g_j f_i ∂²φ/(∂y_j ∂y_i),

therefore

( L_f L_g − L_g L_f )φ = Σ_{i,j=1}^n ( f_j ∂g_i/∂y_j − g_j ∂f_i/∂y_j ) ∂φ/∂y_i,

which means

[L_f, L_g] = L_c,  c = (c_1, c_2, ···, c_n),  c_i = Σ_{j=1}^n ( f_j ∂g_i/∂y_j − g_j ∂f_i/∂y_j ),

and L_c is still a first-order differential operator. It is very easy to prove the following properties of the bracket:

[L_f, L_f] = 0,        (3.5)
[λ_1 L_{f_1} + λ_2 L_{f_2}, L_g] = λ_1 [L_{f_1}, L_g] + λ_2 [L_{f_2}, L_g],  ∀ λ_1, λ_2 ∈ R.        (3.6)

The commutator also satisfies the Jacobi identity, i.e., if L_f, L_g, L_h are three first-order differential operators, then

[[L_f, L_g], L_h] + [[L_g, L_h], L_f] + [[L_h, L_f], L_g] = 0.        (3.7)

(3.7) is very easy to prove; for the detailed proof see [Arn89]. From the above, we know that the first-order differential operators form a Lie algebra.
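The identity [L_f, L_g] = L_c can be checked symbolically. The sketch below (sympy, with arbitrarily chosen vector fields f, g and a test function φ — all our own choices, not from the text) verifies that L_fL_g − L_gL_f acts as the first-order operator L_c:

```python
import sympy as sp

y1, y2 = sp.symbols('y1 y2')
ys = (y1, y2)
f = (y2, -sp.sin(y1))          # sample vector field (pendulum-like)
g = (y1 * y2, y2 ** 2)         # another sample vector field

def L(vec, phi):
    """Action of the first-order operator L_vec on a scalar phi, as in (3.1)."""
    return sum(v * sp.diff(phi, y) for v, y in zip(vec, ys))

phi = sp.exp(y1) * y2          # arbitrary test scalar function

# Left side: the commutator (3.4) applied to phi
lhs = sp.simplify(L(f, L(g, phi)) - L(g, L(f, phi)))

# Right side: the first-order operator L_c with c_i = sum_j (f_j dg_i/dy_j - g_j df_i/dy_j)
c = tuple(sum(f[j] * sp.diff(g[i], ys[j]) - g[j] * sp.diff(f[i], ys[j])
              for j in range(2)) for i in range(2))
rhs = sp.simplify(L(c, phi))

print(sp.simplify(lhs - rhs))  # 0: second-order terms cancel
```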

Definition 3.2. A Lie series is an exponential of a first-order linear differential operator:

e^{tL_f} = Σ_{n=0}^∞ t^n L_f^n / n!.        (3.8)

The action of a Lie series on a scalar function φ(y) is given by

e^{tL_f} φ = Σ_{k=0}^∞ (t^k/k!) L_f^k φ(y) = Σ_{k=0}^∞ (t^k/k!) ( f^T(y)D )^k φ(y)
  = φ(y) + t f^T(y) Dφ(y) + (t²/2) f^T(y)D( f^T(y)Dφ(y) ) + ···.        (3.9)

Taylor expansion gives an elementary example of a Lie series:

e^{t[1,1,···,1]D} φ(y) = Σ_{k=0}^∞ (t^k/k!) ( Σ_{i=1}^n ∂/∂y_i )^k φ(y) = φ( y + t(1, 1, ···, 1)^T ).        (3.10)

We now list several properties of Lie series, similar to those in [Ste84]. Let

f = ( f_1(y), f_2(y), ···, f_n(y) )^T,
g = ( g_1(y), g_2(y), ···, g_n(y) )^T,

and

e^{tf^T D} g = ( e^{tf^T D} g_1, e^{tf^T D} g_2, ···, e^{tf^T D} g_n )^T;

then we have the following:
Property 3.3. The Lie series has the compositionality

e^{tL_f} g(y) = g( e^{tL_f} y ).        (3.11)

Proof. It is enough to prove

e^{tL_f} g_m(y) = g_m( e^{tL_f} y ).

Since

e^{tL_f} y = Σ_{k=0}^∞ (t^k/k!) L_f^k y = Σ_{k=0}^∞ (t^k/k!) ( Σ_{i=1}^n f_i ∂/∂y_i )^k y,

considering the j-th component (1 ≤ j ≤ n) of e^{tL_f} y, we have

e^{tL_f} y_j = Σ_{k=0}^∞ (t^k/k!) ( Σ_{i=1}^n f_i ∂/∂y_i )^k y_j = y_j + Σ_{k=1}^∞ (t^k/k!) L_f^{k−1} f_j,

then

g_m( e^{tL_f} y ) = g_m( y_1 + Σ_{k=1}^∞ (t^k/k!) L_f^{k−1} f_1(y), ···, y_n + Σ_{k=1}^∞ (t^k/k!) L_f^{k−1} f_n(y) )
  = Σ_{k=0}^∞ (t^k/k!) ( Σ_{i=1}^n f_i(y) ∂/∂y_i )^k g_m(y) = e^{tL_f} g_m(y).

This completes the proof.  □

Property 3.4. Product preservation:

e^{tL_f}(pq) = ( e^{tL_f} p )( e^{tL_f} q ),        (3.12)

where p(y), q(y) are scalar functions.

Proof. By (3.2) – (3.3) and (3.8), (3.12) can be obtained by direct computation.  □

Property 3.5. Baker–Campbell–Hausdorff formula (BCH formula for short). All first-order differential operators constitute a Lie algebra; therefore, we have the following BCH formula:

e^{tL_f} e^{tL_g} = e^{ t(L_f + L_g) + t²w_2 + t³w_3 + t⁴w_4 + ··· },        (3.13)

where

w_2 = (1/2)[L_f, L_g],
w_3 = (1/12)[[L_f, L_g], L_g] + (1/12)[[L_g, L_f], L_f],
w_4 = (1/24)[ L_f, [L_g, [L_g, L_f]] ],
···.
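The leading BCH correction term is easy to observe numerically. Since matrices form a Lie algebra under the same bracket, the sketch below (matrices A, B standing in for L_f, L_g; the particular matrices and step sizes are our own choices) checks that e^{tA}e^{tB} agrees with the exponential of the series truncated after the t² term, up to an O(t³) defect:

```python
import numpy as np

def mexp(M, terms=30):
    """Matrix exponential via truncated power series (fine for small ||M||)."""
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
comm = A @ B - B @ A                      # [A, B] != 0

def bch_defect(t):
    """|| e^{tA} e^{tB} - exp(t(A+B) + (t^2/2)[A,B]) ||"""
    lhs = mexp(t * A) @ mexp(t * B)
    rhs = mexp(t * (A + B) + 0.5 * t**2 * comm)
    return np.linalg.norm(lhs - rhs)

r1, r2 = bch_defect(0.1), bch_defect(0.05)
print(r1 / r2)   # close to 8: the defect is O(t^3)
```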

Property 3.6. If

y(t) = e^{tf^T D} y,  y = y(0),

then

ẏ(t) = f(y(t)).

Proof. Since y_i(t) = e^{tf^T(y)D} y_i(0), then

(d/dt) y_i(t) = e^{tf^T(y)D} f^T(y) D y_i(0) = e^{tf^T D} f_i(y).

From Property 3.3, we have

(d/dt) y_i(t) = f_i( e^{tf^T D} y ) = f_i( y_1(t), y_2(t), ···, y_n(t) ) = f_i( y(t) ).

This completes the proof.  □

From Property 3.6, we know that the solution of the equation ẏ = f(y) can be expressed as y(t) = e^{tL_f} y(0). Section 8.2 discussed that the integrator S(τ) can also be represented in this form. If S(τ) had the group property in τ, it would be the phase flow of the autonomous ODE dy/dt = f(y). However, in our problem there is just a one-parameter family S(τ) without the group property. So there exists only a formal vector field f_τ(y), which defines the formal autonomous system

dy/dt = f_τ(y).

Its formal phase flow, depending on the two parameters τ, t, can be expressed as e^{tL_{f_τ}}. Taking the diagonal of the phase flow,

e^{tL_{f_τ}} |_{t=τ} = e^{τL_{f_τ}};

this is just the Lie series expression of S(τ). See the next chapter for more about formal vector fields and formal phase flows. Since f_τ(y) is a formal vector field, it is a formal power series in τ. Thus, the exponential representation of S(τ) takes the following form:

S(τ) = exp( τA + τ²B + τ³C + τ⁴D + τ⁵E + ··· ),

where A, B, C, D, E, ··· are first-order differential operators, and the series τA + τ²B + τ³C + τ⁴D + τ⁵E + ··· need not converge. Therefore, we have:

Theorem 3.7. Every integrator S(τ) has a formal Lie series expression [QZ92].
We use Theorem 3.7 to derive an important conclusion.

Theorem 3.8. Every self-adjoint integrator has an even order of accuracy [QZ92].

Proof. Let S(τ) be a self-adjoint integrator. Expand S(τ) in the exponential form

S(τ) = exp( τw_1 + τ²w_2 + τ³w_3 + ··· ).

Suppose S(τ) is of order n; then

S(τ)y(0) = e^{τL_f} y(0) + O(τ^{n+1})

when the ODE to be solved is ẏ = f(y). Since

e^{τL_f} + O(τ^{n+1}) = e^{τL_f + O(τ^{n+1})},

then

S(τ) = e^{τL_f + O(τ^{n+1})}.

We must show that n is an even number. This means that we have to prove

w_2 = w_4 = w_6 = w_8 = ··· = 0.

Since

S(−τ) = exp( −τw_1 + τ²w_2 − τ³w_3 + ··· ),

using the BCH formula, we get

S(τ)S(−τ) = exp( 2τ²w_2 + O(τ³) ).        (3.14)

Since S(τ) is self-adjoint, i.e., S(τ)S(−τ) = I, (3.14) implies w_2 = 0, and (3.14) becomes

S(τ)S(−τ) = exp( 2τ⁴w_4 + O(τ⁵) ).

This leads to w_4 = 0. Continuing this process, we have

w_2 = w_4 = w_6 = ··· = w_{2k} = ··· = 0.

Thus S(τ) becomes

S(τ) = exp( τw_1 + τ³w_3 + τ⁵w_5 + ··· ).

Therefore, if S(τ) is of order n, then n must be an even number. Since S(τ) is at least of order 1, we have w_1 = L_f, because the Lie series expression of S(τ) is unique.  □

Now, we provide a corollary on the construction of higher order schemes.

Corollary 3.9. Let S(τ) be a self-adjoint integrator of order 2n. If c_1, c_2 satisfy

2c_1^{2n+1} + c_2^{2n+1} = 0,   2c_1 + c_2 = 1,

then the composed integrator S(c_1τ)S(c_2τ)S(c_1τ) is of order 2n + 2. Solving the above equations, we get [QZ92]

c_1 = 1/(2 − 2^{1/(2n+1)}),   c_2 = −2^{1/(2n+1)}/(2 − 2^{1/(2n+1)}).

Proof. From Theorem 2.6, we know S(c_1τ)S(c_2τ)S(c_1τ) is a self-adjoint operator, and Theorem 3.8 shows it has even order. Since S(τ) is of order 2n, the expansions in exponential form of S(c_1τ), S(c_2τ) are

S(c_1τ) = exp( c_1τ w_1 + c_1^{2n+1} τ^{2n+1} w_{2n+1} + O(τ^{2n+3}) ),
S(c_2τ) = exp( c_2τ w_1 + c_2^{2n+1} τ^{2n+1} w_{2n+1} + O(τ^{2n+3}) ).

Again, by the BCH formula, we get

S(c_1τ)S(c_2τ)S(c_1τ)
  = exp( (2c_1 + c_2) τ w_1 + (2c_1^{2n+1} + c_2^{2n+1}) τ^{2n+1} w_{2n+1} + O(τ^{2n+3}) )
  = exp( τ w_1 + O(τ^{2n+3}) ).

This completes the proof.  □
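Corollary 3.9 can be demonstrated numerically. In the sketch below (our own setup: the self-adjoint base integrator is a leapfrog step for the harmonic oscillator, which is of order 2n = 2), the composition S(c_1τ)S(c_2τ)S(c_1τ) raises the order to 4:

```python
import math

def base_step(state, tau):
    """Self-adjoint base integrator S(tau) of order 2:
    a leapfrog step for p' = -q, q' = p."""
    p, q = state
    p -= 0.5 * tau * q
    q += tau * p
    p -= 0.5 * tau * q
    return p, q

n = 1                                    # base order 2n = 2
c1 = 1.0 / (2.0 - 2.0 ** (1.0 / (2 * n + 1)))
c2 = -2.0 ** (1.0 / (2 * n + 1)) * c1

def jump(state, tau):
    """Corollary 3.9: S(c1 tau) S(c2 tau) S(c1 tau), order 2n + 2 = 4."""
    for c in (c1, c2, c1):
        state = base_step(state, c * tau)
    return state

def err(tau):
    s = (0.0, 1.0)                       # exact: q(t) = cos t
    for _ in range(round(1.0 / tau)):
        s = jump(s, tau)
    return abs(s[1] - math.cos(1.0))

print(err(0.02) / err(0.01))             # close to 16: order 4
```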

H. Yoshida in [Yos90] obtained the same result for explicit symplectic integrators used to solve separable systems. The result here applies also to non-Hamiltonian systems and non-symplectic integrators. In this chapter, we extend these results to general autonomous systems. Some examples of adjoint schemes and their construction are given below; a concrete method to construct the adjoint of any given scheme is also given, which can be found in the literature [HNW93]. If the numerical solution is y_τ, then any given scheme may be expressed as

y_τ(x + τ) = y_τ(x) + τ φ(x, y_τ(x), τ),        (3.15)

where φ is the increment function corresponding to the scheme and τ is the step size. Substituting −τ for τ in (3.15), we get

y_{−τ}(x − τ) = y_{−τ}(x) − τ φ( x, y_{−τ}(x), −τ ),

and replacing x by x + τ,

y_{−τ}(x) = y_{−τ}(x + τ) − τ φ( x + τ, y_{−τ}(x + τ), −τ ).        (3.16)

For sufficiently small τ, Equation (3.16) possesses a unique solution for y_{−τ}(x + τ) (by the implicit function theorem), which can be expressed in the following form:

y_{−τ}(x + τ) = y_{−τ}(x) + τ φ*( x, y_{−τ}(x), τ ),        (3.17)

and (3.17) is just the adjoint scheme of (3.15); y_{−τ} is the numerical solution of the adjoint scheme, and φ* is its increment function (the above process amounts to: first form S(−τ), then invert to get S^{−1}(−τ)). In fact, let y_{−τ}(x + τ) = A, y_{−τ}(x) = B; from (3.16) and (3.17), we have

B = A − τ φ(x + τ, A, −τ),
A = B + τ φ*(x, B, τ),

as in Equations (2.11) and (2.12) of Definition 2.2. Next, we consider self-adjointness conditions for R–K methods. Since most one-step multistage methods can be written in R–K form, we now turn to the R–K methods to get some useful results. The general s-stage R–K method has the form
k_i = f( x_0 + c_i τ, y_0 + τ Σ_{j=1}^s a_{ij} k_j ),  i = 1, ···, s,
y_1 = y_0 + τ Σ_{i=1}^s b_i k_i,        (3.18)

where y_0 is the numerical solution at x_0 and y_1 is the numerical value at x_0 + τ, and

c_i = Σ_{j=1}^s a_{ij};        (3.19)

the coefficients

may be expressed in a Butcher tableau:

c_1 | a_11  a_12  ···  a_1s
c_2 | a_21  a_22  ···  a_2s
 ⋮  |  ⋮     ⋮    ⋱    ⋮
c_s | a_s1  a_s2  ···  a_ss
----+----------------------
    | b_1   b_2   ···  b_s

The proof of the following Lemma 3.10 can be found in [HNW93]; it can also be proved directly by Definition 2.2, since we have proved the equivalence between Definitions 2.1 and 2.2.

Lemma 3.10. Every R–K method has an adjoint method, whose coefficients a*_{ij}, b*_j, c*_i (i, j = 1, ···, s) can be written as follows:

c*_i = 1 − c_{s+1−i},
a*_{ij} = b_{s+1−j} − a_{s+1−i, s+1−j},        (3.20)
b*_j = b_{s+1−j}.

Lemma 3.11. If a_{s−i+1, s−j+1} + a_{ij} = b_{s−j+1} = b_j, then the corresponding R–K method (3.18) is self-adjoint.
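Formula (3.20) is straightforward to code. The sketch below (a direct transcription, with the implicit midpoint rule and explicit Euler as our own test cases) computes the adjoint tableau and confirms that the midpoint rule is self-adjoint, while the adjoint of explicit Euler is implicit Euler:

```python
import numpy as np

def adjoint_rk(A, b):
    """Adjoint coefficients from (3.20):
    a*_ij = b_{s+1-j} - a_{s+1-i,s+1-j},  b*_j = b_{s+1-j}."""
    s = len(b)
    A_star = np.empty_like(A)
    for i in range(s):
        for j in range(s):
            A_star[i, j] = b[s - 1 - j] - A[s - 1 - i, s - 1 - j]
    return A_star, b[::-1]

# Implicit midpoint rule: the 1-stage Gauss method
A_mid, b_mid = np.array([[0.5]]), np.array([1.0])
A_star, b_star = adjoint_rk(A_mid, b_mid)
print(np.allclose(A_star, A_mid) and np.allclose(b_star, b_mid))  # True

# Explicit Euler (A = 0, b = 1): its adjoint is implicit Euler (A = 1, b = 1)
A_e, b_e = adjoint_rk(np.array([[0.0]]), np.array([1.0]))
print(A_e[0, 0], b_e[0])
```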
Concentrating on semi-explicit symplectic R–K methods, we have:

Theorem 3.12. A semi-explicit symplectic R–K method for autonomous systems is self-adjoint if its Butcher tableau is of the form of Table 3.1 [QZ92].

Table 3.1. Butcher tableau in Theorem 3.12

b_1/2
b_1   b_2/2
 ⋮     ⋮    ⋱
b_1   b_2   ···  b_2/2
b_1   b_2   ···  b_2   b_1/2
----------------------------
b_1   b_2   ···  b_2   b_1

Proof. We know that the Butcher tableau of a semi-explicit symplectic R–K method must be of the form

b_1/2
b_1   b_2/2
 ⋮     ⋮    ⋱
b_1   b_2   ···  b_{s−1}/2
b_1   b_2   ···  b_{s−1}   b_s/2
--------------------------------
b_1   b_2   ···  b_{s−1}   b_s

By Lemma 3.11, self-adjointness requires b_j = b_{s+1−j}, so that the tableau must take the form of Table 3.1; this is evident. For a non-self-adjoint symplectic integrator S(τ), the composition S*(τ)S(τ) is self-adjoint and symplectic. To prove that it is symplectic, it is enough to prove that S*(τ) is symplectic: if S*(τ)S(−τ) = I, then S*(τ) = S^{−1}(−τ); as S(τ) is symplectic, S(−τ) and S^{−1}(−τ) are symplectic integrators, and therefore S*(τ) is also symplectic. The two lemmas given below can be found in [HNW93]; Lemma 3.13 is derived from Theorem 1.24 of Chapter 7 and Theorem 1.1 of this chapter.  □
Lemma 3.13. If in an implicit s-stage R–K method all c_i (i = 1, ···, s) are distinct and the method is at least of order s, then it is a collocation method iff it satisfies

Σ_{j=1}^s a_{ij} c_j^{q−1} = c_i^q / q,  i = 1, ···, s,  q = 1, ···, s.        (3.21)

Lemma 3.14. A collocation method based on symmetrically distributed nodes is self-adjoint.
As the Legendre polynomials are orthogonal, the coefficients c_i (i = 1, ···, s) of the Gauss–Legendre method, taken as the roots of the Legendre polynomial P_s(2c − 1), are mutually distinct. Moreover, the coefficients a_{ij} (i, j = 1, ···, s) of the Gauss–Legendre method satisfy formula (3.21), and the Gauss–Legendre method is of order 2s; therefore it is a collocation method. For details, see [HNW93]. We have:

Theorem 3.15. Gauss–Legendre methods are self-adjoint.

Proof. Since the Gauss–Legendre method is a collocation method, we only need to prove c_i = 1 − c_{s+1−i}, i.e., that the c_i are symmetrically distributed, where c_1, c_2, ···, c_s are the roots of the Legendre polynomial P_s(2c − 1) (the lower index denotes the degree of the polynomial). If the roots of P_s(w) are w_1, w_2, ···, w_s, then

c_i = 1/2 − w_i/2,  i = 1, ···, s,

i.e., c_i = 1 − c_{s+1−i} and w_i + w_{s+1−i} = 0 are equivalent.
The Legendre polynomials can be constructed by the recurrence

q_0(w) = 1,  q_{−1}(w) = 0,
q_{i+1}(w) = ( w − δ_{i+1} ) q_i(w) − γ_{i+1}² q_{i−1}(w),  i = 0, 1, ···,

where

δ_{i+1} = (w q_i, q_i)/(q_i, q_i),  i ≥ 0,
γ_{i+1}² = 0 for i = 0,   γ_{i+1}² = (q_i, q_i)/(q_{i−1}, q_{i−1}) for i ≥ 1,

and

(q_i, q_j) = ∫_{−1}^{1} q_i(w) q_j(w) dw.

We obtain q_1 = w, q_2 = w² − 1/3. We claim that q_{2n}(w) is an even function and q_{2n−1}(w) is an odd function, and proceed by induction on n; for n = 1 this has been established.

Suppose q_{2n} is an even function and q_{2n−1} is an odd function; we prove the claim for n + 1. Since

δ_{2n+1} = (w q_{2n}, q_{2n}) / (q_{2n}, q_{2n}) = ( ∫_{−1}^{1} odd function dw ) / (q_{2n}, q_{2n}) = 0,

we get q_{2n+1} = w q_{2n} − γ_{2n+1}² q_{2n−1}(w) = odd function − odd function = odd function. Likewise

δ_{2n+2} = (w q_{2n+1}, q_{2n+1}) / (q_{2n+1}, q_{2n+1}) = 0,

so q_{2n+2} = w q_{2n+1} − γ_{2n+2}² q_{2n}(w) is an even function. We have proved the conclusion for n + 1. From this, the roots of P_{2n}(w) may be written in the following sequence:

−w_1, −w_2, ···, −w_n, w_n, ···, w_2, w_1,

while the roots of P_{2n+1}(w) have the form

−w_1, −w_2, ···, −w_n, 0, w_n, ···, w_2, w_1,
where w_i > 0, w_i > w_{i+1} (i = 1, ···, n); therefore w_i + w_{s+1−i} = 0. A direct proof of Theorem 3.15 is also possible, though the computation is tedious. The Gauss–Legendre coefficients a_{ij}, b_j satisfy the following equations:

Σ_{j=1}^s a_{ij} c_j^{q−1} = c_i^q / q,  i = 1, ···, s,  q = 1, ···, s,        (3.22)
Σ_{j=1}^s b_j c_j^{q−1} = 1/q,  q = 1, ···, s.        (3.23)

Using linear algebra, we have

a_{ij} = Σ_{k=1}^s (−1)^{s+k} ( c_i^k / k ) φ_{kj} / Π_{l≠j} (c_j − c_l),
b_j = Σ_{k=1}^s (−1)^{s+k} ( 1/k ) φ_{kj} / Π_{l≠j} (c_j − c_l),  i, j, l = 1, ···, s,

where

φ_{kj} = Σ_{ {t_1, t_2, ···, t_{s−k}} ⊂ {1, 2, ···, j−1, j+1, ···, s} } c_{t_1} c_{t_2} ··· c_{t_{s−k}},  k < s,
φ_{sj} = 1.

The direct calculation results in

Σ_{k=1}^s (−1)^{s+k} ( c_i^k / k ) φ_{kj} / Π_{l≠j}(c_j − c_l) = ∫_0^{c_i} l_j(t) dt,  j = 1, ···, s,        (3.24)
Σ_{k=1}^s (−1)^{s+k} ( 1/k ) φ_{kj} / Π_{l≠j}(c_j − c_l) = ∫_0^{1} l_j(t) dt,  j = 1, ···, s,        (3.25)

where l_j(t) = Π_{k≠j}(t − c_k) / Π_{k≠j}(c_j − c_k). When c_i = 1 − c_{s+1−i}, we have l_j(t) = l_{s+1−j}(1 − t), and then

a_{s+1−i, s+1−j} + a_{ij} = b_{s+1−j} = b_j

is easy to prove from (3.24) and (3.25).  □
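Both the node symmetry c_i = 1 − c_{s+1−i} and the self-adjointness condition of Lemma 3.11 can be verified numerically for the Gauss–Legendre method. The sketch below (s = 3 is our own choice; nodes come from numpy's leggauss, and the collocation weights a_ij = ∫₀^{c_i} l_j(t) dt are integrated exactly from the polynomial coefficients):

```python
import numpy as np

s = 3
w, _ = np.polynomial.legendre.leggauss(s)   # roots of P_s on [-1, 1]
c = np.sort((1.0 + w) / 2.0)                # collocation nodes on [0, 1]

print(np.allclose(c, 1.0 - c[::-1]))        # True: c_i = 1 - c_{s+1-i}

def lagrange_coeffs(j):
    """Ascending coefficients of the Lagrange basis polynomial l_j(t)."""
    coef = np.array([1.0])
    for k in range(s):
        if k != j:
            coef = np.convolve(coef, np.array([-c[k], 1.0])) / (c[j] - c[k])
    return coef

def integral(coef, upper):
    """Exact integral of the polynomial from 0 to upper."""
    return sum(a * upper ** (m + 1) / (m + 1) for m, a in enumerate(coef))

L = [lagrange_coeffs(j) for j in range(s)]
A = np.array([[integral(L[j], c[i]) for j in range(s)] for i in range(s)])
b = np.array([integral(L[j], 1.0) for j in range(s)])

# Self-adjointness condition of Lemma 3.11: a_{s+1-i,s+1-j} + a_ij = b_{s+1-j}
print(np.allclose(A[::-1, ::-1] + A, b[::-1]))   # True
```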

Below is an example of constructing a self-adjoint scheme from a given scheme [QZ92].

Example 3.16. It is well known that scheme (2.1) is of first order. By the above method, its adjoint scheme is

q^{n+1} = q^n + τ U_p(p^n),
p^{n+1} = p^n − τ V_q(q^{n+1}).        (3.26)

The composition scheme (2.2), built from (2.1) and (3.26), is of order 2. In order to maintain the overall step size, the original τ is halved, because the composed scheme is S*(τ/2)S(τ/2); since it is self-adjoint, the overall step length is maintained as τ.

Example 3.17. It is easy to prove that the trapezoidal scheme

y_1 = y_0 + (τ/2) [ f(y_1) + f(y_0) ]

is self-adjoint and of order 2. The symmetric composition scheme (1.22) is the 3-stage scheme of 4th order, and is also self-adjoint.

Example 3.18. The explicit 4th order symplectic scheme (2.4) can be composed as S_2(x_1τ) S_2(x_2τ) S_2(x_1τ) and, merging adjacent factors, developed as follows:

S_V(c_1τ)S_U(d_1τ) · S_V(c_2τ)S_U(d_2τ) · S_V(c_3τ)S_U(d_3τ) · S_V(c_4τ)S_U(d_4τ),

where the factors group into S_2(x_1τ), S_2(x_2τ), S_2(x_1τ), respectively. Noting Corollary 3.9, we get

c_1 = c_4 = x_1/2 = 1/(2(2 − 2^{1/3})) = 0.6756036,
d_1 = d_3 = x_1 = 1/(2 − 2^{1/3}) = 1.3512072,
d_2 = x_2 = −2^{1/3}/(2 − 2^{1/3}) = −1.7024144,  d_4 = 0,
c_2 = c_3 = (x_1 + x_2)/2 = (1 − 2^{1/3})/(2(2 − 2^{1/3})) = −0.1756036.
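The composed explicit scheme of this example is easy to run. The sketch below (the harmonic oscillator H = p^2/2 + q^2/2 is our own choice of test problem) applies the kicks c_i and drifts d_i in sequence and confirms fourth order:

```python
import math

x1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
x2 = -2.0 ** (1.0 / 3.0) * x1
cs = (x1 / 2, (x1 + x2) / 2, (x1 + x2) / 2, x1 / 2)   # kick coefficients
ds = (x1, x2, x1, 0.0)                                 # drift coefficients

def step(p, q, tau):
    """One step of the composed scheme of Example 3.18 for V_q = q, U_p = p."""
    for c, d in zip(cs, ds):
        p -= c * tau * q      # S_V(c tau)
        q += d * tau * p      # S_U(d tau)
    return p, q

def err(tau):
    p, q = 0.0, 1.0           # exact: q(t) = cos t
    for _ in range(round(1.0 / tau)):
        p, q = step(p, q, tau)
    return abs(q - math.cos(1.0))

print(err(0.02) / err(0.01))  # close to 16: fourth order
```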

Example 3.19. From the literature [FQ91], we know that one class of symplectic schemes for the equation dy/dt = J∇H is

y^{k+1} = y^k + τ J(∇H)( (1/2)(I + JB) y^{k+1} + (1/2)(I − JB) y^k ),  B^T = B,        (3.27)

J = ( O  −I_n
      I_n  O ),    I = ( I_n  O
                         O   I_n ).

This scheme is of order 1 if B ≠ O; if B = O, the scheme is of order 2, and in that case we can prove it is self-adjoint.

When B ≠ O, if the integrator of scheme (3.27) is S(τ, H, B), then the adjoint integrator of scheme (3.27) is

S(τ, H, −B) = S*(τ, H, B).

The composed integrator built from S(τ/2) and S*(τ/2) is self-adjoint and of second order; the concrete scheme is

y_1 = y^k + (τ/2) J(∇H)( (1/2)(I − JB) y_1 + (1/2)(I + JB) y^k ),
y^{k+1} = y_1 + (τ/2) J(∇H)( (1/2)(I + JB) y^{k+1} + (1/2)(I − JB) y_1 ).

8.4 Stability Analysis for Composition Scheme


In this section, we discuss the stability of the three-stage, fourth order scheme [QZ93] constructed in Section 8.1:

y_{n+1/3} = y_n + (1/(2(2 − 2^{1/3}))) τ [ f(y_n) + f(y_{n+1/3}) ],
y_{n+2/3} = y_{n+1/3} + (−2^{1/3}/(2(2 − 2^{1/3}))) τ [ f(y_{n+1/3}) + f(y_{n+2/3}) ],        (4.1)
y_{n+1} = y_{n+2/3} + (1/(2(2 − 2^{1/3}))) τ [ f(y_{n+2/3}) + f(y_{n+1}) ].

We will prove that although the trapezoidal method is A-stable, scheme (4.1) is not A-stable. Fortunately, the unstable region is very small, as Fig. 4.2 shows (Fig. 4.1 is an enlargement), and scheme (4.1) is still useful for solving stiff systems. Judging from the size and location of the unstable region of scheme (4.1), the scheme is safe for systems whose eigenvalues are not very close to the real axis, while some other methods whose unstable regions lie near the imaginary axis, such as Gear's methods, are safe for systems whose eigenvalues are not very close to the imaginary axis.
Fig. 4.1. Closed curve S which contains all zero points of scheme (4.1)

Fig. 4.2. Stability region size and position of (4.1)

Just as for scheme (4.1), the Euler midpoint rule can also be used to construct a scheme:

y_{n+1/3} = y_n + (1/(2 − 2^{1/3})) τ f( (y_n + y_{n+1/3})/2 ),
y_{n+2/3} = y_{n+1/3} + (−2^{1/3}/(2 − 2^{1/3})) τ f( (y_{n+1/3} + y_{n+2/3})/2 ),        (4.2)
y_{n+1} = y_{n+2/3} + (1/(2 − 2^{1/3})) τ f( (y_{n+2/3} + y_{n+1})/2 ).

Scheme (4.2) is symplectic, but scheme (4.1) is non-symplectic. We now study the stability of scheme (4.1). Note that scheme (4.1) is not A-stable, whereas the trapezoidal method is. To show this, we apply scheme (4.1) to the test equation

ẏ = λy,  y(0) = y_0,  λ ∈ C,  Re λ < 0,        (4.3)

which yields

y_{n+1/3} = y_n + c_1 (τ/2)( λy_n + λy_{n+1/3} ),
y_{n+2/3} = y_{n+1/3} + c_2 (τ/2)( λy_{n+1/3} + λy_{n+2/3} ),        (4.4)
y_{n+1} = y_{n+2/3} + c_1 (τ/2)( λy_{n+2/3} + λy_{n+1} ),

i.e.,

y_{n+1/3} = ( 1 + c_1λτ/2 ) / ( 1 − c_1λτ/2 ) · y_n,
y_{n+2/3} = ( 1 + c_2λτ/2 ) / ( 1 − c_2λτ/2 ) · y_{n+1/3},        (4.5)
y_{n+1} = ( 1 + c_1λτ/2 ) / ( 1 − c_1λτ/2 ) · y_{n+2/3},

where c_1 = 1/(2 − 2^{1/3}), c_2 = −2^{1/3}/(2 − 2^{1/3}). Let λτ/2 = z, z ∈ C; we have

y_{n+1} = (1 + c_1z)(1 + c_2z)(1 + c_1z) / [ (1 − c_1z)(1 − c_2z)(1 − c_1z) ] · y_n.        (4.6)

Definition 4.1. The stability region R of scheme (4.1) is

R = { λτ ∈ C : | (1 + c_1λτ/2)(1 + c_2λτ/2)(1 + c_1λτ/2) / ((1 − c_1λτ/2)(1 − c_2λτ/2)(1 − c_1λτ/2)) | < 1,  Re(λτ) < 0 },

i.e.,

R = { z ∈ C : | (1 + c_1z)(1 + c_2z)(1 + c_1z) / ((1 − c_1z)(1 − c_2z)(1 − c_1z)) | < 1,  Re z < 0 }.        (4.7)

Obviously, as z → 1/c_2 (< 0), we have

| (1 + c_1z)(1 + c_2z)(1 + c_1z) / ((1 − c_1z)(1 − c_2z)(1 − c_1z)) | → ∞.
8.4 Stability Analysis for Composition Scheme 391

This means that scheme (4.1) cannot be stable in a neighborhood of $1/c_2$. Thus, we obtain
the following theorem:
Theorem 4.2. Scheme (4.1) is not A-stable.
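This blow-up of the amplification factor near $z = 1/c_2$ is easy to check numerically. A small illustrative sketch (the sample points are my own choices, not from the text):

```python
# Amplification factor of (4.6): its modulus blows up near z = 1/c2 < 0,
# so scheme (4.1) cannot be A-stable, although e.g. z = -0.5 is damped.
c1 = 1.0 / (2.0 - 2.0**(1.0/3.0))
c2 = -2.0**(1.0/3.0) / (2.0 - 2.0**(1.0/3.0))

def V(z):
    num = (1 + c1*z) * (1 + c2*z) * (1 + c1*z)
    den = (1 - c1*z) * (1 - c2*z) * (1 - c1*z)
    return abs(num / den)

pole = 1 / c2          # about -0.587, on the negative real axis
grow = V(pole + 1e-6)  # very large: unstable neighborhood of 1/c2
damp = V(-0.5)         # less than 1: a stable point in the left half-plane
```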
Since scheme (4.1) is not A-stable, we now determine its stable region. To do
this, we first study the roots of the following equation:
\[
\left|\frac{(1 + c_1 z)(1 + c_2 z)(1 + c_1 z)}{(1 - c_1 z)(1 - c_2 z)(1 - c_1 z)}\right| = 1.
\tag{4.8}
\]

Once the roots of (4.8) are known, it is not difficult to obtain the stable region of (4.1).
Note that Equation (4.8) is equivalent to
\[
\frac{(1 + c_1 z)(1 + c_2 z)(1 + c_1 z)}{(1 - c_1 z)(1 - c_2 z)(1 - c_1 z)} = e^{i\theta}, \quad 0 \le \theta < 2\pi.
\tag{4.9}
\]

From (4.9), we get the following polynomial equation:
\[
c_1^2 c_2 (1 + e^{i\theta}) z^3 + (2c_1 c_2 + c_1^2)(1 - e^{i\theta}) z^2
+ (2c_1 + c_2)(1 + e^{i\theta}) z + (1 - e^{i\theta}) = 0, \quad 0 \le \theta < 2\pi.
\tag{4.10}
\]
Since $2c_1 + c_2 = 1$, setting $a = c_1^2 c_2$ and $b = 2c_1 c_2 + c_1^2$, (4.10) becomes
\[
a(1 + e^{i\theta}) z^3 + b(1 - e^{i\theta}) z^2 + (1 + e^{i\theta}) z + (1 - e^{i\theta}) = 0, \quad 0 \le \theta < 2\pi.
\tag{4.11}
\]

Consider the roots of (4.11) in two cases.

Case 4.3. $1 + e^{i\theta} \ne 0$ (i.e., $0 \le \theta < 2\pi$, $\theta \ne \pi$).

By computing the roots of polynomial (4.11), we get
\[
z_1 = x + yi, \quad z_2 = -x + yi, \quad z_3 = wi, \quad x, y, w \in \mathbb{C},
\tag{4.12}
\]

when $\theta$ is chosen as a definite value. The roots $z_1, z_2, z_3$ satisfy
\[
a(1 + e^{i\theta})(z - z_1)(z - z_2)(z - z_3)
= a(1 + e^{i\theta})z^3 + b(1 - e^{i\theta})z^2 + (1 + e^{i\theta})z + (1 - e^{i\theta}),
\]
which, by Vieta's formulas, is equivalent to
\[
\begin{cases}
z_1 + z_2 + z_3 = -\dfrac{b}{a}\,\dfrac{1 - e^{i\theta}}{1 + e^{i\theta}},\\[2mm]
z_1 z_2 + z_1 z_3 + z_2 z_3 = \dfrac{1}{a},\\[2mm]
z_1 z_2 z_3 = -\dfrac{1}{a}\,\dfrac{1 - e^{i\theta}}{1 + e^{i\theta}}.
\end{cases}
\tag{4.13}
\]

Since
\[
\frac{1 - e^{i\theta}}{1 + e^{i\theta}} = \frac{(1 - \cos\theta) - i\sin\theta}{(1 + \cos\theta) + i\sin\theta}
= -\frac{\sin\theta}{1 + \cos\theta}\, i,
\]
then
\[
\begin{cases}
z_1 + z_2 + z_3 = \dfrac{b}{a}\,\dfrac{\sin\theta}{1 + \cos\theta}\, i = i p_1,\\[2mm]
z_1 z_2 + z_1 z_3 + z_2 z_3 = \dfrac{1}{a} = p_2,\\[2mm]
z_1 z_2 z_3 = \dfrac{1}{a}\,\dfrac{\sin\theta}{1 + \cos\theta}\, i = i p_3,
\end{cases}
\tag{4.14}
\]
where $p_1, p_2, p_3$ are real numbers. From (4.12) and (4.14), we obtain the following equations for
$x, y, w$:
\begin{align}
2y + w &= p_1, \tag{4.15}\\
x^2 + y^2 + 2yw &= -p_2, \tag{4.16}\\
x^2 w + y^2 w &= -p_3. \tag{4.17}
\end{align}

Now, we prove that the system (4.15)–(4.17) has a set of real solutions.
In fact, from (4.16) and (4.17), we get
\[
p_2 w + 2yw^2 = p_3.
\tag{4.18}
\]
From (4.15) and (4.18), we have
\[
w^3 - p_1 w^2 - p_2 w + p_3 = 0.
\tag{4.19}
\]

Since $p_1, p_2, p_3$ are all real, (4.19) is a polynomial with real coefficients, so it has at least one
real root. Using a real root $w$ of (4.19), we get a real value of $y$ from (4.15), and from (4.16) and (4.17) we find that $x^2$ is real, so
$x$ is either real or purely imaginary. If $x$ were purely imaginary, then by (4.12), $z_1, z_2, z_3$ would all be
purely imaginary, so all the roots of (4.11) would lie on the imaginary axis. This would mean
that if we set
\[
V(z) = \left|\frac{(1 + c_1 z)(1 + c_2 z)(1 + c_1 z)}{(1 - c_1 z)(1 - c_2 z)(1 - c_1 z)}\right|,
\]
then either $V(z) > 1$ for all $z$ with $\operatorname{Re}(z) < 0$, or $V(z) < 1$ for all such $z$. Since scheme (4.1)
cannot be A-stable, $V(z) > 1$ must hold somewhere in the left half-plane; but $V(-0.5) < 1$,
and $V(z)$ is continuous except at $1/c_2$. Thus $x$ cannot be purely imaginary, so
it must be real. Since polynomial (4.11) has only three roots, we obtain the same
$z_1, z_2, z_3$ if we use another value of $w$, real or complex.

Case 4.4. $1 + e^{i\theta} = 0$ (i.e., $\theta = \pi$).

When $\theta = \pi$, (4.11) becomes $z^2 = -\dfrac{1}{b} > 0$, so it has two real roots $\pm\sqrt{-\dfrac{1}{b}}$.

Eventually, we get the following:

Theorem 4.5. The three roots of polynomial (4.11) have the form
\[
z_1 = -x + yi, \quad z_2 = x + yi, \quad z_3 = wi,
\]
where $x, y, w$ are all real.

Theorem 4.5 tells us that one root of (4.10) lies on the imaginary axis, and that the two
other roots are located symmetrically with respect to the imaginary axis. Thus, there
is only one root in the open left half-plane. Computation shows that these roots form
a closed curve S (as $\theta$ varies from 0 to $2\pi$), as in Fig. 4.1.
From (4.15)–(4.17), we get the equation for S:
\[
\begin{cases}
x^2 - 3y^2 + 2p_1 y + p_2 = 0,\\
2yx^2 + 2y^3 - p_1 x^2 - p_1 y^2 - p_3 = 0,
\end{cases}
\quad 0 \le \theta < 2\pi,\ \theta \ne \pi,
\tag{4.20}
\]
and
\[
x = \pm\sqrt{-\frac{1}{b}}, \quad y = 0, \quad \text{for } \theta = \pi,
\tag{4.21}
\]
where $z = -x + iy$, $p_1 = \dfrac{b\sin\theta}{a(1 + \cos\theta)}$, $p_2 = \dfrac{1}{a}$, $p_3 = \dfrac{\sin\theta}{a(1 + \cos\theta)}$.
Since $V(z) > 1$ inside S, and $V(z) \to 1$ as $z \to \infty$, we can now describe the
stability of scheme (4.1).
Theorem 4.6. The stable region $R$ of scheme (4.1) is [QZ93]:
\[
R = \{\, z \in \mathbb{C} \mid z \text{ lies outside } S \text{ and } \operatorname{Re} z < 0 \,\},
\]
i.e.,
\[
R = \{\, \lambda\tau \in \mathbb{C} \mid \lambda\tau \text{ lies outside } S^* \text{ and } \operatorname{Re}(\lambda\tau) < 0 \,\},
\quad \text{where } S^* = \left\{ z \in \mathbb{C} \;\middle|\; \frac{z}{2} \in S \right\}.
\]
Scheme (4.1) is not A-stable, but the stable region outside S is infinite and the unstable
region is very small. The unstable region is blackened in Fig. 4.2; it is a small “disk”
around −1.18 on the real axis. For every given λ, we can choose a step
length τ such that λτ does not fall into S*, and τ need not be very small
even when λ has a large modulus. For linear problems, scheme (4.2) is equivalent to
(4.1), so scheme (4.2) has exactly the same stable region as (4.1). Thus, we conclude
that schemes (4.1) and (4.2) are still useful for solving stiff problems, as the following
examples show.
The following are some numerical tests for schemes (4.1) and (4.2).

Example 4.7. Numerical test of the orders of schemes (4.1) and (4.2).

To test the orders of schemes (4.1) and (4.2), we apply them to the following Hamiltonian system:
\[
\begin{cases}
\dfrac{dp}{dt} = -\dfrac{\partial H}{\partial q} = -w^2 q - q^3,\\[2mm]
\dfrac{dq}{dt} = \dfrac{\partial H}{\partial p} = p,
\end{cases}
\tag{4.22}
\]
where the Hamiltonian is $H = \dfrac{1}{2}p^2 + \dfrac{1}{2}w^2 q^2 + \dfrac{1}{4}q^4$, and compare the numerical solutions with those of the trapezoid method and the centered Euler scheme.
For convenience, the numerical solutions of $p$ and $q$ are denoted as follows:
1° by (4.1): T4p, T4q;
2° by (4.2): E4p, E4q;
3° by the trapezoid scheme: T2p, T2q;
4° by the centered Euler scheme: E2p, E2q.
We use double precision in all computations. Consider the following explicit scheme:
\[
\begin{cases}
p_{n+1/2} = p_n - \dfrac{\tau}{2}\, V_q(q_n),\\[1mm]
q_{n+1/2} = q_n + \dfrac{\tau}{2}\, U_p(p_{n+1/2}),\\[1mm]
q_{n+1} = q_{n+1/2} + \dfrac{\tau}{2}\, U_p(p_{n+1/2}),\\[1mm]
p_{n+1} = p_{n+1/2} - \dfrac{\tau}{2}\, V_q(q_{n+1}),
\end{cases}
\tag{4.23}
\]
with the coefficients
\[
\begin{aligned}
c_1 &= c_4 = \frac{x_1}{2} = \frac{1}{2(2 - \sqrt[3]{2})} = 0.6756035,\\
d_1 &= d_3 = x_1 = \frac{1}{2 - \sqrt[3]{2}} = 1.35120719,\\
d_2 &= x_2 = \frac{-\sqrt[3]{2}}{2 - \sqrt[3]{2}} = -1.7024142, \qquad d_4 = 0,\\
c_2 &= c_3 = \frac{x_1 + x_2}{2} = \frac{1 - \sqrt[3]{2}}{2(2 - \sqrt[3]{2})} = -0.1756036.
\end{aligned}
\]
For a separable system with $H = V(q) + U(p)$, this substep is self-adjoint, so it can be composed to
construct fourth-order schemes analogous to (4.1) and (4.2). From Sections 8.2 and 8.3,
the simplified fourth-order scheme can be written by taking
$c_1 = c_3 = \dfrac{1}{2 - 2^{1/3}}$, $c_2 = \dfrac{-2^{1/3}}{2 - 2^{1/3}}$, $x_1 = x_3 = 1.35120719$, $x_2 = -1.7024142$. For details see Example 3.18.
\[
\begin{cases}
p_{n+1/4} = p_n - x_1\, \dfrac{\tau}{2}\, V_q(q_n),\\[1mm]
q_{n+1/3} = q_n + x_1 \tau\, U_p(p_{n+1/4}),\\[1mm]
p_{n+1/2} = p_{n+1/4} - \dfrac{x_1 + x_2}{2}\, \tau\, V_q(q_{n+1/3}),\\[1mm]
q_{n+2/3} = q_{n+1/3} + x_2 \tau\, U_p(p_{n+1/2}),\\[1mm]
p_{n+3/4} = p_{n+1/2} - \dfrac{x_2 + x_3}{2}\, \tau\, V_q(q_{n+2/3}),\\[1mm]
q_{n+1} = q_{n+2/3} + x_3 \tau\, U_p(p_{n+3/4}),\\[1mm]
p_{n+1} = p_{n+3/4} - x_3\, \dfrac{\tau}{2}\, V_q(q_{n+1}),
\end{cases}
\tag{4.24}
\]
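Scheme (4.24) is easy to code directly for system (4.22). The following sketch (my own test driver, using the parameter values w = 2, τ = 0.1, p₀ = q₀ = 0.5 of Table 4.1) monitors the energy, which a symplectic scheme keeps nearly constant over long times:

```python
# Explicit fourth-order scheme (4.24) for system (4.22),
# H = p^2/2 + w^2 q^2/2 + q^4/4, with V_q(q) = w^2 q + q^3, U_p(p) = p.
x1 = x3 = 1.0 / (2.0 - 2.0**(1.0/3.0))
x2 = -2.0**(1.0/3.0) / (2.0 - 2.0**(1.0/3.0))

def Vq(q, w): return w*w*q + q**3
def Up(p): return p

def step_424(p, q, tau, w):
    p = p - x1*tau/2 * Vq(q, w)           # p_{n+1/4}
    q = q + x1*tau * Up(p)                # q_{n+1/3}
    p = p - (x1 + x2)/2 * tau * Vq(q, w)  # p_{n+1/2}
    q = q + x2*tau * Up(p)                # q_{n+2/3}
    p = p - (x2 + x3)/2 * tau * Vq(q, w)  # p_{n+3/4}
    q = q + x3*tau * Up(p)                # q_{n+1}
    p = p - x3*tau/2 * Vq(q, w)           # p_{n+1}
    return p, q

def H(p, q, w): return 0.5*p*p + 0.5*w*w*q*q + 0.25*q**4

w, tau = 2.0, 0.1
p, q = 0.5, 0.5
H0 = H(p, q, w)
for _ in range(1000):
    p, q = step_424(p, q, tau, w)
drift = abs(H(p, q, w) - H0)   # stays small: bounded energy error
```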

where $p_{n+1/4}$, $p_{n+1/2}$, $p_{n+3/4}$ and $q_{n+1/3}$, $q_{n+2/3}$ denote the numerical solutions of the
intermediate stages at every step. Scheme (4.24) was proved by H. Yoshida to be a fourth-order
scheme in [Yos90]. We can apply scheme (4.24) to Equation (4.22) and compare

Table 4.1. Numerical comparison between several schemes


Step N Numerical solution and exact solution
EXp = −1.131 156 917 000000 EXq = −0.021 512 660 000000
T 4p − EXp = 0.000 014 55000 T 4q − EXq = −0.000 003 728
N = 10 E4p − EXp = 0.000 068 24300 E4q − EXq = −0.000 029 687
T 2p − EXp = 0.000 641 21600 T 2q − EXq = 0.003 917 96400
E2p − EXp = 0.000 025 85700 E2q − EXq = 0.004 206 14700
EXp = −0.578 997 162 000000 EXq = −0.479 477 967 00000
T 4p − EXp = 0.000 004 11500 T 4q − EXq = −0.000 002 660
N = 20 E4p − EXp = −0.000 116 088 E4q − EXq = −0.000 029 838
T 2p − EXp = −0.014 158 525 T 2q − EXq = −0.003 977 057
E2p − EXp = −0.015 197 562 E2q − EXq = −0.004 255 307
EXp = −1.083 692 040 00000 EXq = 0.163 258 193 0000000
T 4p − EXp = −0.000 104 873 T 4q − EXq = −0.000 195 865
N = 100 E4p − EXp = 0.000 145 7860 E4q − EXq = 0.000 131 2730
T 2p − EXp = 0.024 490 7400 T 2q − EXq = 0.036 283 1300
E2p − EXp = 0.027 254 9000 E2q − EXq = 0.039 223 1760
EXp = −1.089 537 517 000000 EXq = −0.153 288 801 000000
T 4p − EXp = 0.000 560 51300 T 4q − EXq = −0.001 139 354
N = 500 E4p − EXp = −0.000 250 063 E4q − EXq = 0.000 559 4940
T 2p − EXp = −0.040 591 714 T 2q − EXq = 0.188 655 9980
E2p − EXp = −0.037 488 191 E2q − EXq = 0.204 743 2350
EXp = −0.966 531 326 000000 EXq = −0.293 028 275 000000
T 4p − EXp = 0.002 470 90100 T 4q − EXq = 0.002 014 58300
N = 1000 E4p − EXp = −0.001 281 080 E4q − EXq = −0.000 988 873
T 2p − EXp = −0.603 588 331 T 2q − EXq = −0.233 974 665
E2p − EXp = −0.668 484 708 E2q − EXq = −0.243 402 518

the results with those of the schemes mentioned above. We denote by EXp and EXq
the exact solutions of $p$ and $q$ for system (4.22), and present our results for
$w = 2$, $\tau = 0.1$, $p_0 = 0.5$, $q_0 = 0.5$ in Table 4.1. From Table 4.1, we can see that
T4p, T4q and E4p, E4q approximate EXp, EXq more closely than T2p, T2q and
E2p, E2q. Thus, we conclude that schemes (4.1) and (4.2) have a higher order than the
trapezoid method and the centered Euler scheme. Table 4.1 also shows that although the
composed trapezoid scheme (4.1) is non-symplectic, it can be used to solve a Hamiltonian system
with results as satisfactory as those of the centered Euler scheme, since the latter can be obtained
from the former by a nonlinear transformation; see Section 8.1.
Example 4.8. Numerical test for stability of schemes (4.1) and (4.2). To consider
the unstable case, we take λ = −11.8, τ = 0.1, and initial value y0 = 1.0 in the
test equation, so λτ falls into the unstable region. While the exact solution decreases
quickly, the numerical solution obtained by scheme (4.1) grows to infinity as shown
in Table 4.2.

Example 4.9. For the stable case, we consider a linear stiff system

\[
\begin{cases}
\dot{y}_1 = -501 y_1 + 500 y_2,\\
\dot{y}_2 = 500 y_1 - 501 y_2,
\end{cases}
\tag{4.25}
\]

which has eigenvalues λ1 = −1001, λ2 = −1. The exact solution is



Table 4.2. Stability test


Step number Numerical and exact solution
Step1 0.576776990×101 0.307278738
Step10 0.407404568×108 0.000007504
Step50 0.112235299×1039 0.000000000
Step100 0.816583328×1075 0.000000000

Table 4.3. Test for stiff system


Step N Numerical solution and exact solution
EXY 1 = 0.998364638 EXY 2 = 0.991660285
N = 10 T 4Y 1 = 0.998453117 T 4Y 2 = 0.991571619
T 4Y 1 − EXY 1 = 0.000088478 T 4Y 2 − EXY 2 = −0.000088666
EXY 1 = 0.985112102 EXY 2 = 0.985111801
N = 30 T 4Y 1 = 0.985111988 T 4Y 2 = 0.985111662
T 4Y 1 − EXY 1 = −0.000000114 T 4Y 2 − EXY 2 = −0.000000138
EXY 1 = 0.975309908 EXY 2 = 0.975309908
N = 50 T 4Y 1 = 0.975309788 T 4Y 2 = 0.975309788
T 4Y 1 − EXY 1 = −0.000000120 T 4Y 2 − EXY 2 = −0.000000120
EXY 1 = 0.006571583 EXY 2 = 0.006571583
N = 100 T 4Y 1 = 0.006571770 T 4Y 2 = 0.006571771
T 4Y 1 − EXY 1 = −0.000000186 T 4Y 2 − EXY 2 = −0.000000188
EXY 1 = 0.000000298 EXY 2 = 0.000000298
N = 200 T 4Y 1 = 0.000000298 T 4Y 2 = 0.000000298
T 4Y 1 − EXY 1 = −0.000000000 T 4Y 2 − EXY 2 = −0.000000000

\[
\begin{cases}
y_1(t) = 0.5\big(y_1(0) - y_2(0)\big)e^{-1001 t} + 0.5\big(y_1(0) + y_2(0)\big)e^{-t},\\[1mm]
y_2(t) = -0.5\big(y_1(0) - y_2(0)\big)e^{-1001 t} + 0.5\big(y_1(0) + y_2(0)\big)e^{-t},
\end{cases}
\tag{4.26}
\]
where $y_1(0)$, $y_2(0)$ denote the initial values. Since system (4.25) is linear, schemes
(4.1) and (4.2) are equivalent in this case. We present a numerical solution using
scheme (4.1) here. In Table 4.3, we denote the numerical solutions of $y_1$ and $y_2$ obtained by
(4.1) by T4Y1, T4Y2, and the exact solutions of $y_1$ and $y_2$ by EXY1 and EXY2. We
take $y_1(0) = 1.5$, $y_2(0) = 0.5$ in Table 4.3, with $\tau = 0.0005$
in the first 50 steps and $\tau = 0.1$ in the remaining steps.
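For this linear system, the three midpoint substeps of (4.1)–(4.2) are small linear solves, so the run is easy to reproduce. A sketch of the first 50 steps (following the step schedule of Table 4.3, with the driver itself my own):

```python
import numpy as np

# Composed trapezoid/midpoint scheme applied to the stiff system (4.25);
# for linear problems schemes (4.1) and (4.2) coincide.
L = np.array([[-501.0, 500.0], [500.0, -501.0]])
c1 = 1.0 / (2.0 - 2.0**(1.0/3.0))
c2 = -2.0**(1.0/3.0) / (2.0 - 2.0**(1.0/3.0))
I2 = np.eye(2)

def step(y, tau):
    for c in (c1, c2, c1):
        y = np.linalg.solve(I2 - c*tau/2*L, (I2 + c*tau/2*L) @ y)
    return y

y = np.array([1.5, 0.5])
tau = 0.0005
for _ in range(50):
    y = step(y, tau)

# Exact solution (4.26) with y1(0)-y2(0) = 1, y1(0)+y2(0) = 2
t = 50 * tau
exact = np.array([0.5*np.exp(-1001*t) + 1.0*np.exp(-t),
                  -0.5*np.exp(-1001*t) + 1.0*np.exp(-t)])
```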

8.5 Application of Composition Schemes to PDE


When solving partial differential equations (PDEs), there are several methods, such as
spectral methods and finite difference methods, which can be used to achieve high-order
accuracy in the space direction, while it is difficult to obtain high-order accuracy
in the time direction. The overall accuracy is therefore often strongly limited
by the relatively unsatisfactory approximation in the time direction. Though
self-adjoint schemes (also called symmetric schemes or reversible schemes) are
well known, such as the composed Strang scheme [Str68], which is of order 2, their
advantage for constructing higher-order schemes was long neglected. In this section,
we use scheme (4.1) to solve two kinds of PDEs in order to
show that the technique introduced in the previous sections can be used to overcome the
deficiency in the time direction, since, theoretically, we can construct schemes of arbitrary even
order in the time direction [ZQ93b].
Let us first consider the following one-dimensional first-order wave equation
\[
\begin{cases}
u_t + u_x = 0,\\
u(x, 0) = f(x), \quad 0 \le x \le 2\pi,
\end{cases}
\tag{5.1}
\]
with the periodic boundary condition
\[
u(0, t) = u(2\pi, t).
\]

Since collocation, Galerkin, and tau methods are identical in the absence of essential
boundary conditions, we analyze the Fourier collocation or pseudospectral
method. Let us introduce the collocation points $x_n = 2\pi n/(2N)$ $(n = 0, \cdots, 2N - 1)$,
and let $u = (u_0, \cdots, u_{2N-1})$, where $u_n = u(x_n, t)$. The collocation equation that
approximates (5.1) is
\[
\frac{\partial u}{\partial t} = C^{-1} D C u,
\tag{5.2}
\]
where $C$ and $D$ are $2N \times 2N$ matrices whose entries are
\[
c_{kl} = \frac{1}{\sqrt{2N}} \exp\big(i(k - N)x_l\big),
\tag{5.3}
\]
\[
d_{kl} = -ik^* \delta_{kl},
\tag{5.4}
\]
where $k^* = k - N$ $(1 \le k \le 2N - 1)$, and $k^* = 0$ if $k = 0$. For the process of
the discretization, see also [GO77]; we leave out the derivation here and quote the result
directly. Let us consider Equation (5.1) with initial value $f(x) = \sin x$, and
compare the numerical solution with the exact solution $u(x, t) = \sin(x - t)$; we use
scheme (4.1) and the trapezoid (Crank–Nicolson) scheme to solve Equation (5.2) (with $N = 5$).
All values of $u$ are calculated at the collocation points, taking the time step sizes
$\tau = 0.1$ and $0.01$ and computing 100 steps in double precision.
ORD.4 and ORD.2 represent the numerical solutions obtained using (4.1) and the trapezoid scheme,
respectively. ERR.4 and ERR.2 represent the errors between the numerical
solutions ORD.4, ORD.2 and the exact solution at collocation point
$n$. We list $u(x, t)$ at the collocation points $n = 0, 5, 9$ at selected steps. The exact
solution is denoted by EX. From Table 5.1 and Table 5.2 we can see that the solution
of the 4th-order scheme is more precise than the solution of the 2nd-order scheme:
by about 2 digits when $\tau = 0.1$ and by about 4 digits when $\tau = 0.01$.
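The experiment can be reproduced with FFT-based pseudospectral differentiation, which is equivalent to applying $C$ and $C^{-1}$; each Crank–Nicolson substep is diagonal in Fourier space and is solved exactly there. A sketch (the FFT wavenumber layout and the driver are standard choices of mine, not from the text):

```python
import numpy as np

# Fourier pseudospectral semi-discretization of u_t + u_x = 0 on [0, 2pi),
# advanced by the composition of Crank-Nicolson substeps as in scheme (4.1).
N = 5
M = 2 * N
x = 2*np.pi*np.arange(M)/M
k = np.fft.fftfreq(M, d=1.0/M)      # integer wavenumbers 0..4, -5..-1

c1 = 1.0 / (2.0 - 2.0**(1.0/3.0))
c2 = -2.0**(1.0/3.0) / (2.0 - 2.0**(1.0/3.0))

def cn_substep(u, dt):
    """Crank-Nicolson substep for u_t = -u_x, diagonal in Fourier space."""
    uh = np.fft.fft(u)
    uh *= (1 - 1j*k*dt/2) / (1 + 1j*k*dt/2)
    return np.real(np.fft.ifft(uh))

def step(u, tau):
    for c in (c1, c2, c1):
        u = cn_substep(u, c*tau)
    return u

u = np.sin(x)
tau = 0.1
for _ in range(100):
    u = step(u, tau)
err = np.max(np.abs(u - np.sin(x - 100*tau)))   # small fourth-order error
```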

Table 5.1. Comparison between numerical and exact solution when τ =0.1
Step N n EX ORD.4 ORD.2 ERR.4 ERR.2
0 −0.099833416647 −0.099832763924 −0.099750623437 0.000000652723 0.000082793209
N =1 5 0.099833416647 0.099832763924 0.099750623438 −0.000000652723 −0.000082793209
9 −0.665615704994 −0.66561545443 −0.665553604585 0.000000489551 0.000062100409
0 −0.841470984808 −0.841467440655 −0.841021115481 0.000003544153 0.000449869327
N = 10 5 0.841470984808 0.841467440655 0.841021115481 −0.000003544153 −0.000449869327
9 −0.998346054152 −0.998346431587 −0.998393545150 −0.000000377435 −0.000047490998
0 0.544021110889 0.543966068061 0.537020563223 −0.000055042829 −0.007000547666
N = 100 5 −0.544021110889 −0.543966068061 −0.537020563223 0.000055042829 0.007000547666
9 0.933316194418 0.933292641025 0.930296266090 −0.000023553213 −0.003019928328

Table 5.2. Comparison between numerical and exact solution when τ = 0.01
Step N n EX ORD.4 ORD.2 ERR.4 ERR.2
0 0.009999833340 −0.099998333280 −0.009999750000 0.000000000007 0.000000083334
N =1 5 0.009999833340 0.009999833280 0.009999750000 −0.000000000007 −0.000000083334
9 −0.595845898383 −0.595845898378 −0.595845831454 0.000000000005 0.000000066929
0 −0.099833416647 −0.099833416582 −0.099832587427 0.000000000065 0.000000829220
N = 10 5 0.099833416647 0.099833416582 0.099832587427 0.000000000042 −0.000000829220
9 −0.665615704994 −0.665615704952 −0.665615083044 0.000000000003 0.000000621950
0 −0.841470984808 −0.841470984547 −0.841466481987 0.000000000261 −0.000004502821
N = 100 5 0.841470984808 0.841470984547 0.841466481987 −0.000000000267 −0.000004502871
9 −0.998346054152 −0.998346054304 −0.998346533230 −0.000000000152 −0.000000479078

Similarly, for a second-order PDE, the result of the 4th-order scheme is 2 to 4 digits more
precise than that of the 2nd-order scheme.
Let us take the second-order heat conduction equation
\[
\begin{cases}
\dfrac{\partial u(x, t)}{\partial t} = \dfrac{\partial^2 u(x, t)}{\partial x^2}, & 0 < x < \pi,\ t \ge 0,\\[1mm]
u(0, t) = u(\pi, t) = 0, & t > 0,\\[1mm]
u(x, 0) = f(x), & 0 \le x \le \pi.
\end{cases}
\tag{5.5}
\]

By applying the Fourier sine approximation to Equation (5.5), we get
\[
u_N(x, t) = \sum_{n=1}^{N} a_n(t) \sin nx,
\tag{5.6}
\]
and
\[
\begin{cases}
\dfrac{d a_n}{d t} = -n^2 a_n,\\[2mm]
a_n(0) = \dfrac{2}{\pi} \displaystyle\int_0^{\pi} f(x) \sin nx \, dx.
\end{cases}
\tag{5.7}
\]

Table 5.3. Comparison between numerical and exact solution when τ =0.1
Step N n EX ORD.4 ORD.2 ERR.4 ERR.2
1 0.531850090044 0.5318500444815 0.531805704455 0.0000003547710 −0.000044385589
N =1 2 0.860551522611 0.8605520966420 0.860479705219 0.0000005740310 −0.000071817391
3 0.860551522611 0.8605520966420 0.860479705219 0.0000005740310 −0.000071817391
4 0.531850090044 0.5318504448150 0.531805704455 0.0000003547710 −0.000443855890
1 0.216234110142 0.2162355525360 0.216053719560 0.0000001442394 −0.000180390582
N = 10 2 0.349814139737 0.3498764735800 0.349582261644 0.0000023338430 −0.000291878093
3 0.349814139737 0.3498764735800 0.349582269644 0.0000123338430 −0.000291878093
4 0.216234110142 0.2162355522536 0.216053719560 0.0000014423940 −0.000180390582
1 0.003960465877 0.0039605979700 0.003943973573 0.0000001320940 −0.000164923040
N = 50 2 0.006408168400 0.0064083821320 0.006381483292 0.0000002137320 −0.000026685108
3 0.006408168400 0.0064083821320 0.006381483292 0.0000002137320 −0.000026685108
4 0.003960465877 0.0039605979700 0.003943973573 0.0000001320940 −0.000164923040

The initial values $a_n(0)$ can also be represented in another form. Let $x_j = \dfrac{\pi j}{N+1}$ $(j =
1, \cdots, N)$ be collocation points; from the collocation equation
\[
\sum_{n=1}^{N} a_n \sin\frac{\pi j n}{N+1} = u(x_j), \quad j = 1, \cdots, N,
\tag{5.8}
\]
we get the explicit solution
\[
a_n = \frac{2}{N+1} \sum_{j=1}^{N} u(x_j) \sin\frac{\pi j n}{N+1}, \quad n = 1, \cdots, N.
\]
Since $u(x, 0) = f(x)$, we get
\[
a_n(0) = \frac{2}{N+1} \sum_{j=1}^{N} f(x_j) \sin\frac{\pi j n}{N+1}, \quad n = 1, \cdots, N.
\tag{5.9}
\]
The exact solution of Equation (5.5) with initial condition $f(x) = \sin x$ is
$e^{-t}\sin x$. In Table 5.3 and Table 5.4, all symbols carry the same meaning as before. We
take $N = 4$ for the computation.
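A compact sketch of this computation (N = 4, f(x) = sin x, with composed Crank–Nicolson substeps advancing each mode of (5.7); the time-stepping driver is my own):

```python
import numpy as np

# Sine collocation (5.8)-(5.9) for the heat equation (5.5), each mode
# advanced by composed Crank-Nicolson substeps of da_n/dt = -n^2 a_n.
N = 4
j = np.arange(1, N + 1)
n = np.arange(1, N + 1)
xj = np.pi * j / (N + 1)
S = np.sin(np.pi * np.outer(j, n) / (N + 1))  # S[j-1, n-1] = sin(pi*j*n/(N+1))

a = (2.0 / (N + 1)) * (S.T @ np.sin(xj))      # a_n(0) via (5.9)

c1 = 1.0 / (2.0 - 2.0**(1.0/3.0))
c2 = -2.0**(1.0/3.0) / (2.0 - 2.0**(1.0/3.0))
tau = 0.01
for _ in range(100):                           # advance to t = 1
    for c in (c1, c2, c1):
        a *= (1 - n**2 * c * tau / 2) / (1 + n**2 * c * tau / 2)

u = S @ a                                      # values at the collocation points
err = np.max(np.abs(u - np.exp(-1.0) * np.sin(xj)))
```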

To solve the semi-discrete spectral approximation
\[
u_t = L_N u,
\tag{5.10}
\]
of the differential equation
\[
u_t = L u,
\tag{5.11}
\]
where $L$ denotes the spatial operator, we often use the Crank–Nicolson scheme, the backward
Euler scheme, and the leap-frog scheme. However, the backward and forward Euler schemes
are not self-adjoint, and neither is the leap-frog scheme. But the first two
schemes are adjoint to each other, and their composition is the Crank–Nicolson scheme
\[
u^{n+1} - u^n = \frac{1}{2}\,\Delta t\,\big(L_N u^{n+1} + L_N u^n\big),
\tag{5.12}
\]
which is self-adjoint and of order 2. We can construct a fourth-order scheme by composition:
\[
\begin{aligned}
u^{n+1/3} &= u^n + \frac{1}{2(2 - 2^{1/3})}\,\Delta t\,\big(L_N u^n + L_N u^{n+1/3}\big),\\
u^{n+2/3} &= u^{n+1/3} - \frac{2^{1/3}}{2(2 - 2^{1/3})}\,\Delta t\,\big(L_N u^{n+1/3} + L_N u^{n+2/3}\big),\\
u^{n+1} &= u^{n+2/3} + \frac{1}{2(2 - 2^{1/3})}\,\Delta t\,\big(L_N u^{n+2/3} + L_N u^{n+1}\big).
\end{aligned}
\tag{5.13}
\]

Finally, we point out that scheme (5.13) is unstable for some special step sizes
$\Delta t$. Since the diameter of the unstable region is very small, we can always avoid those
step sizes $\Delta t$ which make $\lambda\Delta t$ (where $\lambda$ denotes an eigenvalue of the system to be
solved) fall into the unstable region. Fig. 5.1 shows the solution of the heat equation
when we use scheme (5.13) to solve (5.11), taking $\Delta t = 0.0097$ and $N = 24$. We
can see that while the Crank–Nicolson scheme remains stable, scheme (5.13) does
not, and its solution tends to overflow. For a detailed numerical test of this problem,
see [ZQ93b].

Fig. 5.1. Stability comparison between schemes of Crank–Nicolson (L), (5.13) (M) and exact
solution (R) of the heat equation

8.6 H-Stability of Hamiltonian System


We know that a Hamiltonian system always appears in a space of even dimension. A
more important fact is that there is no asymptotically stable linear Hamiltonian system:
such systems are either Liapunov stable or unstable, and the same holds for linear symplectic algorithms.
Therefore, the usual stability concepts of numerical methods for ODEs, such as A-stability and
A(α)-stability with α ≤ π/2, are not suitable for symplectic algorithms applied to Hamiltonian systems:
the usual A(α)-stability is useless for α ≤ π/2, and A-stability
needs to be modified. Here, we introduce a new test system and a new concept, H-stability
(Hamiltonian stability), for symplectic algorithms, and we discuss the H-stability
of symplectic invariant algorithms and the H-stability intervals of explicit symplectic
algorithms.
For the linear Hamiltonian system
\[
\frac{dz}{dt} = Lz, \quad L = JA \in sp(2n), \quad H = \frac{1}{2}(z, Az), \quad A^T = A,
\tag{6.1}
\]
a linear symplectic algorithm
\[
z^{k+1} = g_H^s(z^k) = G(s, A) z^k, \quad k \ge 0,
\tag{6.2}
\]
is stable if there exists $C > 0$ such that
\[
\|z^k\| = \|G^k(s, A) z^0\| \le C \|z^0\|, \quad \forall k > 0,
\tag{6.3}
\]
where $\|\cdot\|$ is a well-defined norm, such as the Euclidean norm. Evidently, this is equivalent
to $G^k(s)$ being bounded, i.e., the eigenvalues of $G(s)$ lie in the unit disk and the elementary
divisors corresponding to the eigenvalues on the unit circle are linear. Since $G(s)$ is
symplectic,
\[
G^{-1}(s) = J^{-1} G(s)^T J.
\tag{6.4}
\]
Hence, if $\lambda$ is an eigenvalue of $G(s)$, so is $\lambda^{-1}$, and they have the same elementary
divisors. Therefore, an eigenvalue with modulus less than 1 is always accompanied
by an eigenvalue with modulus greater than 1. This implies that the linear
symplectic method (6.2) cannot be asymptotically stable. We have:

Theorem 6.1. The linear symplectic method (6.2) is stable iff the eigenvalues of $G(s)$ are
unimodular and their elementary divisors are linear [Wan94].

Here, we introduce the test Hamiltonian system
\[
\frac{dz}{dt} = \alpha J z, \quad \alpha \in \mathbb{R},
\tag{6.5}
\]
with
\[
H(z) = H(p, q) = \frac{\alpha}{2} z^T z = \frac{\alpha}{2}(p^2 + q^2), \quad A = \alpha I.
\]

Definition 6.2. A symplectic difference method is H-stable at $\mu = \alpha s$ if it is stable
for the test Hamiltonian system (6.5) with the given $\mu$; such a $\mu$ is called a stable point.
The maximal interval that contains the origin and in which every point is stable is called
the H-stability interval of the method. A symplectic difference method
is H-stable if its H-stability interval is the whole real axis. In this case, its numerical
solutions are bounded for (6.5) with every $\alpha \in \mathbb{R}$.

Remark 6.3. It is reasonable to choose (6.5) as the model equation because any linear
Hamiltonian system may be brought into the standard form
\[
H(p, q) = \frac{1}{2} \sum_{i=1}^{n} \alpha_i (p_i^2 + q_i^2).
\]

Applied to the test system (6.5), the method (6.2) becomes
\[
z^{k+1} = G(\mu) z^k,
\tag{6.6}
\]
where $G(\mu)$ is a $2 \times 2$ symplectic matrix. If
\[
G(\mu) = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix},
\]
then $\det G(\mu) = a_1 a_4 - a_2 a_3 = 1$. Its characteristic polynomial is
\[
|G(\mu) - \lambda I| = \begin{vmatrix} a_1 - \lambda & a_2 \\ a_3 & a_4 - \lambda \end{vmatrix}
= \lambda^2 - (a_1 + a_4)\lambda + 1.
\]
So its eigenvalues are
\[
\lambda_{\pm} = \frac{a_1 + a_4}{2} \pm \sqrt{\left(\frac{a_1 + a_4}{2}\right)^2 - 1}.
\tag{6.7}
\]

Lemma 6.4. Scheme (6.6) is stable at $\mu \ne 0$ iff
\[
\left(\frac{a_1 + a_4}{2}\right)^2 < 1, \quad \text{i.e.,} \quad -1 < \frac{a_1 + a_4}{2} < 1.
\tag{6.8}
\]

Example 6.5. Applying the centered Euler scheme to the test system (6.5), we get
\[
\hat{z} = z + \frac{1}{2}\mu J(\hat{z} + z), \quad \mu = \alpha s,
\]
i.e.,
\[
\hat{z} = \left(I - \frac{1}{2}\mu J\right)^{-1}\left(I + \frac{1}{2}\mu J\right) z = G(\mu) z,
\]
where
\[
G(\mu) = \frac{1}{1 + \frac{1}{4}\mu^2}
\begin{bmatrix}
1 - \frac{1}{4}\mu^2 & -\mu\\[1mm]
\mu & 1 - \frac{1}{4}\mu^2
\end{bmatrix};
\tag{6.9}
\]
therefore
\[
\left(\frac{a_1 + a_4}{2}\right)^2 = \left(\frac{1 - \frac{1}{4}\mu^2}{1 + \frac{1}{4}\mu^2}\right)^2 < 1, \quad \forall \mu \ne 0.
\]
By Lemma 6.4, the centered Euler scheme is stable for all $\mu \ne 0$; it is certainly also
stable for $\mu = 0$. Therefore, the centered Euler scheme is H-stable.
For the stability regions of certain explicit schemes, see [Wan94, QZ90].
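This H-stability is easy to verify numerically: the eigenvalues of (6.9) stay on the unit circle for every real μ. A small sketch (the grid of μ values is my own choice):

```python
import numpy as np

# Eigenvalues of the centered Euler step matrix (6.9) are unimodular for
# every real mu, which illustrates H-stability.
def G(mu):
    return np.array([[1 - mu**2/4, -mu],
                     [mu, 1 - mu**2/4]]) / (1 + mu**2/4)

maxdev = 0.0
for mu in np.linspace(-50.0, 50.0, 101):
    lam = np.linalg.eigvals(G(mu))
    maxdev = max(maxdev, float(np.max(np.abs(np.abs(lam) - 1.0))))
# maxdev stays at roundoff level: |lambda| = 1 identically
```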
In Section 8.2, we constructed difference schemes from 1st order to 4th order. We now
discuss their stability by applying these schemes to the model equation (6.5), which gives
\[
z^{k+1} = G_i(\mu) z^k, \quad \mu = \alpha s, \quad i = 1, 2, 3, 4,
\]
where $G_i$ is the step transition matrix:
\[
G_1(\mu) = \begin{bmatrix} 1 & -\mu \\ \mu & 1 - \mu^2 \end{bmatrix},
\]
\[
G_2(\mu) = \begin{bmatrix}
1 - \frac{1}{2}\mu^2 & -\mu \\[1mm]
\mu\left(1 - \frac{1}{4}\mu^2\right) & 1 - \frac{1}{2}\mu^2
\end{bmatrix},
\]
\[
G_3(\mu) = \begin{bmatrix}
1 - \frac{1}{2}\mu^2 + \frac{1}{72}\mu^4 & -\mu\left(1 - \frac{1}{6}\mu^2 + \frac{7}{1728}\mu^4\right)\\[1mm]
\mu\left(1 - \frac{1}{6}\mu^2 + \frac{1}{72}\mu^4\right) & 1 - \frac{1}{2}\mu^2 + \frac{5}{72}\mu^4 - \frac{7}{1728}\mu^6
\end{bmatrix},
\]
\[
G_4(\mu) = \begin{bmatrix} a_1 & a_2 \\ a_3 & a_4 \end{bmatrix},
\]
with
\[
\begin{aligned}
a_1 &= 1 - \frac{1}{2}\mu^2 + \frac{1}{24}\mu^4 + \frac{1}{144}(1 + \beta)^2 \mu^6,\\
a_2 &= -\mu\left(1 - \frac{1}{6}\mu^2 - \frac{1}{216}(2 + \beta)(1 + 2\beta)^2 \mu^4\right),\\
a_3 &= \mu\left(1 - \frac{1}{6}\mu^2 - \frac{1}{216}(2 + \beta)(1 - \beta)\mu^4 + \frac{1}{864}(2 + \beta)(1 + \beta^2)\mu^6\right),\\
a_4 &= 1 - \frac{1}{2}\mu^2 + \frac{1}{24}\mu^4 + \frac{1}{144}(1 + \beta)^2 \mu^6.
\end{aligned}
\]

Theorem 6.6. For the explicit schemes above, the H-stability intervals are $(-2, 2)$,
$(-2, 2)$, $(-2.507, 2.507)$, and $(-1.573, 1.573)$, respectively.

Proof. The proof of this theorem can be found in the papers of Daoliu Wang [Wan94]
and of Mengzhao Qin and Meiqing Zhang [QZ90].
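The first and third intervals of Theorem 6.6 can be recovered numerically from Lemma 6.4 by scanning the half-trace condition $|(a_1+a_4)/2| < 1$ on a grid (the grid and tolerances are my own choices):

```python
import numpy as np

# Half traces of G_1 and G_3; Lemma 6.4 gives stability where |.| < 1.
def half_trace_G1(mu):
    return (1 + (1 - mu**2)) / 2

def half_trace_G3(mu):
    a1 = 1 - mu**2/2 + mu**4/72
    a4 = 1 - mu**2/2 + 5*mu**4/72 - 7*mu**6/1728
    return (a1 + a4) / 2

mus = np.linspace(0.0, 3.0, 30001)
stab1 = mus[np.abs(half_trace_G1(mus)) < 1]
stab3 = mus[np.abs(half_trace_G3(mus)) < 1]
end1 = stab1.max()   # about 2, matching the interval (-2, 2)
end3 = stab3.max()   # about 2.507, matching (-2.507, 2.507)
```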
Bibliography

[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. Springer-Verlag, GTM


60, Berlin, Second edition, (1989).
[Dah75] G. Dahlquist: Error analysis for a class of methods for stiff nonlinear initial value
problems. In G.A. Watson, editor, Lecture Notes in Mathematics, Vol. 506, Numerical
Analysis, Dundee, pages 60–74. Springer, Berlin, (1975).
[DF76] A. J. Dragt and J. M. Finn: Lie series and invariant functions for analytic symplectic
maps. J. of Math. Phys., 17:2215–2227, (1976).
[DF83] A.J. Dragt and E. Forest: Computation of nonlinear behavior of Hamiltonian systems
using Lie algebraic methods. J. of Math. Phys., 24(12):2734–2744, (1983).
[Fen85] K. Feng: On difference schemes and symplectic geometry. In K. Feng, editor, Pro-
ceedings of the 1984 Beijing Symposium on Differential Geometry and Differential Equa-
tions, pages 42–58. Science Press, Beijing, (1985).
[Fen86] K. Feng: Difference schemes for Hamiltonian formalism and symplectic geometry. J.
Comput. Math., 4:279–289, (1986).
[Fen91] K. Feng: The Hamiltonian Way for Computing Hamiltonian Dynamics. In R. Spigler,
editor, Applied and Industrial Mathematics, pages 17–35. Kluwer, The Netherlands, (1991).
[Fen92] K. Feng: Formal power series and numerical methods for differential equations. In
T. Chan and Z.C. Shi, editors, International conf. on scientific computation, pages 28–35.
World Scientific, Singapore, (1992).
[For92] E. Forest: Sixth-order Lie group integrators. J. of Comp. Phys., 99:209–213, (1992).
[For06] E. Forest. Geometric integration for particle accelerators. J. Phys. A: Math. Gen.,
39:5321–5377, (2006).
[FQ91] K. Feng and M.Z. Qin: Hamiltonian algorithms for Hamiltonian systems and a com-
parative numerical study. Comput. Phys. Comm., 65:173–187, (1991).
[FR90] E. Forest and R. D. Ruth: Fourth-order symplectic integration. Physica D, 43:105–117,
(1990).
[GO77] D. Gottlieb and S. A. Orszag: Numerical Analysis of Spectral Methods: Theory and
Applications. SIAM, Philadelphia, (1977).
[HNW93] E. Hairer, S. P. Nørsett, and G. Wanner: Solving Ordinary Differential Equations I,
Nonstiff Problems. Springer-Verlag, Berlin, Second revised edition, (1993).
[McL95a] R. I. McLachlan: Comment on “ Poisson schemes for Hamiltonian systems on Pois-
son manifolds”. Computers Math. Applic., 29:1, (1995).
[McL95b] R. I. McLachlan: Composition methods in the presence of small parameters. BIT,
35:258–268, (1995).
[McL95c] R. I. McLachlan: On the numerical integration of ODE’s by symmetric composition
methods. SIAM J. Numer. Anal., 16:151–168, (1995).
[McL95d] R. I. McLachlan: On the numerical integration of ordinary differential equations by
symmetric composition methods. SIAM J. Sci. Comput., 16:151–168, (1995).
[MSS99] A. Murua and J. M. Sanz-Serna: Order conditions for numerical integrators obtained
by composing simpler integrators. Phil. Trans. Royal Soc. A, 357:1079–1100, (1999).

[MSSS97] A. Murua, J. M. Sanz-Serna, and R. D. Skeel: Order conditions for numerical inte-
grators obtained by composing simpler methods. Technical Report 1997/7, Departemento
de Matemática Aplicada y Computatión, Universidad de Valladolid, Spain, (1997).
[Mur97] A. Murua: On order conditions for partitioned symplectic methods. SIAM J. Numer.
Anal., 34:2204–2211, (1997).
[Mur99] A. Murua: Formal series and numerical integrators, part I: Systems of ODEs and
symplectic integrators. Appl. Numer. Math., 29:221–251, (1999).
[Mur06] A. Murua: The Hopf algebra of rooted trees, free Lie algebras, and Lie series. Founda-
tions of Computational Mathematics, 6(4):387–426, (2006).
[Ner87] F. Neri: Lie algebras and canonical integration. University of Maryland Tech. report,
(1987).
[QWZ91] M. Z. Qin, D. L. Wang, and M. Q. Zhang: Explicit symplectic difference schemes
for separable Hamiltonian systems. J. Comput. Math., 9(3):211–221, (1991).
[QZ90] M. Z. Qin and M. Q. Zhang: Multi-stage symplectic schemes of two kinds of Hamil-
tonian systems for wave equations. Computers Math. Applic., 19:51–62, (1990).
[QZ90a] M. Z. Qin and M. Q. Zhang: Explicit Runge–Kutta–like schemes to solve certain
quantum operator equations of motion. J. Stat. Phys., 60(5/6):839–843, (1990).
[QZ92] M. Z. Qin and W. J. Zhu: Construction of higher order symplectic schemes by com-
position. Computing, 47:309–321, (1992).
[QZ93] M. Z. Qin and W. J. Zhu: A note on stability of three stage difference schemes for
ODEs. Computers Math. Applic., 25:35–44, (1993).
[QZZ95] M. Z. Qin, W. J. Zhu, and M. Q. Zhang: Construction of a symplectic three-stage
difference scheme for ODEs. J. Comput. Math., 13:206–210, (1995).
[Rut83] R. Ruth: A canonical integration technique. IEEE Trans. Nucl. Sci., 30:2669–2671, (1983).
[Ste84] S. Steinberg: Lie series and nonlinear ordinary equations. J. of Math. Anal. and Appl.,
101:39–63, (1984).
[Str68] G. Strang: On the construction and comparison of difference schemes. SIAM J. Numer.
Anal., 5:506–517, (1968).
[Suz77] M. Suzuki: On the convergence of exponential operators: the Zassenhaus formula,
BCH formula and systematic approximants. Communications in Mathematical Physics,
57:193–200, (1977).
[Suz90] M. Suzuki: Fractal decomposition of exponential operators with applications to many-
body theories and Monte Carlo simulations. Physics Letters A, 146:319–323, (1990).
[Suz92] M. Suzuki: General theory of higher-order decomposition of exponential operators
and symplectic integrators. Physics Letters A, 165:387–395, (1992).
[Wan94] D. L. Wang: Some aspects of Hamiltonian systems and symplectic difference meth-
ods. Physica D, 73:1–16, (1994).
[Wru96] O. Wrubel: Qin-Kompositionen mit Lie-Reihen. Diplomarbeit Uni Karlsruhe (TH),
(1996).
[Yos90] H. Yoshida: Construction of higher order symplectic integrators. Physics Letters A,
150:262–268, (1990).
[ZQ93] W. Zhu and M. Qin: Application of higher order self-adjoint schemes for PDEs. Comput-
ers Math. Applic., 26(3):15–26, (1993).
Chapter 9.
Formal Power Series and B-Series

We study vector fields, their associated dynamical systems and phase flows, together
with their algorithmic approximations in $\mathbb{R}^N$, from the formal power series approach
[Fen93a, Fen92].

9.1 Notation
Our considerations will be local in both space and time; all related objects are $C^\infty$
smooth. We use coordinate descriptions and matrix notation; the coordinate vectors
in $\mathbb{R}^N$ and vector functions $a : \mathbb{R}^N \to \mathbb{R}^N$ are denoted by column matrices.
The identity vector function $1_N$ is given by $1_N(x) = x$. For a vector function
$a = (a_1, \cdots, a_N)^T : \mathbb{R}^N \to \mathbb{R}^N$,
\[
a_* := \left(\frac{\partial a_i}{\partial x_j}\right) = \text{the Jacobian matrix of } a,
\]
\[
a^* := \sum_i a_i \frac{\partial}{\partial x_i} = \text{the linear differential operator of first order associated with } a.
\]
The association $a \to a^*$ is linear; $a^*$ operates on scalar functions $\varphi : \mathbb{R}^N \to \mathbb{R}$ by
\[
a^* \varphi = \sum_i a_i \frac{\partial \varphi}{\partial x_i},
\]
and on vector functions $b : \mathbb{R}^N \to \mathbb{R}^N$ by
\[
a^* b = a^*(b_1, \cdots, b_N)^T = (a^* b_1, \cdots, a^* b_N)^T = b_* a, \qquad a^* 1_N = a.
\]
Multiple applications of linear differential operators are naturally defined, such as
$a^* b^*$, $(a^* b^*) c^*$, $a^*(b^* c^*)$, etc. The operations are multilinear and associative but non-
commutative; thus, powers can be defined as
\[
a^{*k} = \underbrace{a^* a^* \cdots a^*}_{k \text{ times}}, \qquad a^k := a^{*k} 1_N.
\]
The identity operator $I$ operates on scalar and vector functions $\varphi$ and $b$ as $I\varphi = \varphi$,
$Ib = b$.

We identify vector functions a : R^N → R^N with vector fields. All vector fields in R^N form an (∞-dimensional) real Lie algebra V_N under the Lie bracket

    [a, b] := a_* b − b_* a = b^* a − a^* b = (b^* a^* − a^* b^*) 1_N.

The Lie algebra V_N is associated with the (∞-dimensional) local Lie group D_N of near-identity diffeomorphisms (or simply near-1 maps) of R^N.
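As a concrete check of the bracket formula, the following sketch evaluates [a, b] = a_* b − b_* a for two hand-picked fields on R^2 (the fields, the evaluation point, and the helper names are illustrative assumptions, not from the text), using the fact that a_* b is the directional derivative of a along b:

```python
# Concrete check of the Lie bracket [a, b] = a_* b - b_* a on R^2,
# where a_* is the Jacobian matrix of a.  The fields are hand-picked
# for illustration: a(x) = (x2, 0), b(x) = (0, x1^2), so analytically
# [a, b](x) = (x1^2, -2 x1 x2).

def a(x):
    return [x[1], 0.0]

def b(x):
    return [0.0, x[0] ** 2]

def jac_times(f, x, v, eps=1e-6):
    # f_*(x) v  ==  directional derivative of f at x along v (central difference)
    xp = [x[i] + eps * v[i] for i in range(2)]
    xm = [x[i] - eps * v[i] for i in range(2)]
    fp, fm = f(xp), f(xm)
    return [(fp[i] - fm[i]) / (2 * eps) for i in range(2)]

def bracket(f, g, x):
    fb = jac_times(f, x, g(x))   # f_* g
    gf = jac_times(g, x, f(x))   # g_* f
    return [fb[i] - gf[i] for i in range(2)]

x0 = [2.0, 3.0]
ab = bracket(a, b, x0)   # analytic value: (4, -12)
ba = bracket(b, a, x0)   # antisymmetry: -(ab)
```

Antisymmetry [a, b] = −[b, a] comes out automatically from the difference of the two directional derivatives.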
Consider the dynamical system in R^N

    dx/dt = a(x),    (1.1)

defined by a vector field a. It possesses a phase flow e_a^t = e^t, which is a one-parameter (in t) group of near-1 maps of R^N,

    e^0 = 1_N,    e^{t+s} = e^t ∘ e^s,

and generates the solution by x(0) → e_a^t x(0) = x(t). The phase flow is expressible as a convergent power series in t:

    e_a^t = 1_N + Σ_{k=1}^∞ t^k e_k,
    e_0 = 1_N,    e_k = (1/k) a^* e_{k−1} = (1/k!) (a^*)^k 1_N = (1/k!) a^k.
We define

    Exp ta^* := I + Σ_{k=1}^∞ (1/k!) (ta^*)^k,    I the identity operator.

This is an operator power series operating on scalar and vector functions. Define

    exp ta := (Exp ta^*) 1_N = 1_N + Σ_{k=1}^∞ (1/k!) (ta^*)^k 1_N = 1_N + Σ_{k=1}^∞ (t^k/k!) a^k;    (1.2)

then

    e_a^t = (Exp ta^*) 1_N = exp ta;    (1.3)

for scalar functions,

    φ ∘ e_a^t = φ ∘ exp ta = (Exp ta^*) φ,

and for vector functions,

    b ∘ e_a^t = b ∘ exp ta = (Exp ta^*) b = (Exp ta^*) b^* 1_N.
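For a linear field a(x) = Ax the series collapses to the matrix exponential, a^k(x) = A^k x, which makes the one-parameter group property e^{t+s} = e^t ∘ e^s easy to test numerically. A minimal sketch (the 2 × 2 matrix A and the truncation length are illustrative choices):

```python
# For a linear field a(x) = A x we have a^k(x) = A^k x, so the series
# e_a^t = 1_N + sum_k (t^k/k!) a^k is the matrix exponential e^{tA} applied
# to x.  We verify the group property e^{(t+s)a} = e^{ta} o e^{sa}.

def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def expm(A, t, terms=30):
    # truncated series sum_k (t^k/k!) A^k; 30 terms suffice at these t
    S = [[1.0, 0.0], [0.0, 1.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    fact = 1.0
    for k in range(1, terms):
        P = mat_mul(P, A)
        fact *= k
        S = [[S[i][j] + (t ** k / fact) * P[i][j] for j in range(2)]
             for i in range(2)]
    return S

A = [[0.0, 1.0], [-1.0, 0.0]]   # rotation field: e^{tA} is a rotation matrix
x0 = [1.0, 0.0]
lhs = mat_vec(expm(A, 0.7), x0)                        # e^{(t+s)a} x0
rhs = mat_vec(expm(A, 0.3), mat_vec(expm(A, 0.4), x0)) # e^{ta}(e^{sa} x0)
```

Since this A generates rotations, the flow also preserves the Euclidean norm, which the test checks as well.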

Each numerical algorithm solving the system (1.1) possesses a step transition map f_a^s, which is a one-parameter (in step size s) family (in general not a one-parameter group in s) of near-1 maps on R^N, expressible as a convergent power series in s:

    f_a^s = 1_N + Σ_{k=1}^∞ s^k f_k;    (1.4)

the coefficients can be determined recursively from the defining difference equation. The transition map generates the numerical solution x(0) → (f_a^s)^N x(0) ≈ x(Ns) by iteration, with step size s in general chosen fixed.
The main problem is to construct and analyze the algorithmic approximations f_a^s ≈ e_a^t|_{t=s} = e_a^s in a proper way. For this purpose, we propose a unified framework based on the apparatus of formal power series, the Lie algebra of vector fields, and the corresponding Lie group of diffeomorphisms [Lie88, Olv93].

9.2 Near-0 and Near-1 Formal Power Series




Among the formal power series Σ_{k=0}^∞ s^k a_k, a_k : R^N → R^N, we pick out two special classes. The first class consists of those with a_0 = 0, called near-0 formal vector fields, and the second class consists of those with a_0 = 1_N, called near-1 formal maps (diffeomorphisms).
All near-0 formal vector fields a^s = Σ_{k=1}^∞ s^k a_k form an (∞-dimensional) real Lie algebra FV_N under the Lie bracket

    [a^s, b^s] = [Σ_{k=1}^∞ s^k a_k, Σ_{k=1}^∞ s^k b_k] := Σ_{k=2}^∞ s^k Σ_{i+j=k} [a_i, b_j].

The associated near-0 formal differential operators and their products are

    (a^s)_* := (Σ_{k=1}^∞ s^k a_k)_* := Σ_{k=1}^∞ s^k (a_k)_*,
    a^{s*} := (Σ_{k=1}^∞ s^k a_k)^* := Σ_{k=1}^∞ s^k a_k^*,
    a^{s*} b^{s*} := Σ_{k=2}^∞ s^k Σ_{i+j=k} a_i^* b_j^*,    (a^{s*})^2 := a^{s*} a^{s*},    etc.

For any vector function a = (a_1, ···, a_N)^T : R^N → R^N and any near-1 formal map g^s = 1_N + Σ_{k=1}^∞ s^k g_k, we define the composition

    (a ∘ g^s)(x) = a(g^s(x)) = a(x) + Σ_{k=1}^∞ s^k (a ∘ g)_k(x),

    (a ∘ g)_k = Σ_{m=1}^k Σ_{k_1+···+k_m=k} (1/m!) (D^m a)(g_{k_1}, ···, g_{k_m}),

where

    D^m a = (D^m a_1, ···, D^m a_N)^T,
    D^m a_i (v_1, ···, v_m) = Σ_{j_1,···,j_m=1}^N (∂^m a_i / ∂x_{j_1} ··· ∂x_{j_m}) v_{1 j_1} ··· v_{m j_m},

is the usual m-th differential, a multilinear form in the m tangent vectors v_i = (v_{i1}, ···, v_{iN})^T (i = 1, ···, m) at the point x ∈ R^N, invariant under permutation of the vectors. Using the identities

    (D^1 a)(b) = b^* a,
    (D^2 a)(b, c) = (c^* b^* − (c^* b)^*) a,
    (D^3 a)(b, b, b) = (b^{*3} + 2 b^{3*} − 3 b^* b^{2*}) a,

we get in particular

    (a ∘ g)_1 = g_1^* a,
    (a ∘ g)_2 = g_2^* a + (1/2)(g_1^{*2} − g_1^{2*}) a,
    (a ∘ g)_3 = g_3^* a + (g_2^* g_1^* − (g_2^* g_1)^*) a + (1/3!)(g_1^{*3} + 2 g_1^{3*} − 3 g_1^* g_1^{2*}) a.

For any two near-1 formal maps

    f^s = 1_N + Σ_{k=1}^∞ s^k f_k,    g^s = 1_N + Σ_{k=1}^∞ s^k g_k,

the composition f^s ∘ g^s is defined term by term:

    (f^s ∘ g^s)(x) = f^s(g^s(x)) = 1_N(g^s(x)) + Σ_{k=1}^∞ s^k f_k(g^s(x)) =: 1_N(x) + Σ_{k=1}^∞ s^k (f ∘ g)_k(x),

    (f ∘ g)_1 = f_1 + g_1,
    (f ∘ g)_k = f_k + g_k + δ(f_1, ···, f_{k−1}; g_1, ···, g_{k−1}),    k ≥ 2,

    δ(f_1, ···, f_{k−1}; g_1, ···, g_{k−1}) = Σ_{i=1}^{k−1} Σ_{m=1}^i Σ_{i_1+···+i_m=i} (1/m!) (D^m f_{k−i})(g_{i_1}, ···, g_{i_m}).
9.2 Near-0 and Near-1 Formal Power Series 411

In particular we get

    (f ∘ g)_2 = f_2 + g_2 + g_1^* f_1,
    (f ∘ g)_3 = f_3 + g_3 + g_1^* f_2 + g_2^* f_1 + (1/2)(g_1^{*2} − g_1^{2*}) f_1,
    (f ∘ g)_4 = f_4 + g_4 + g_1^* f_3 + g_2^* f_2 + g_3^* f_1 + (1/2)(g_1^{*2} − g_1^{2*}) f_2
              + (g_2^* g_1^* − (g_2^* g_1)^*) f_1 + (1/3!)(g_1^{*3} + 2 g_1^{3*} − 3 g_1^* g_1^{2*}) f_1.

Under this composition rule, all near-1 formal maps f^s = 1_N + Σ_{k=1}^∞ s^k f_k form an (∞-dimensional) formal Lie group FD_N. In the group FD_N, inverse elements, square roots, rational powers, etc., always exist, and their coefficients can always be determined recursively from the defining composition relations. For example, the inverse (f^s)^{−1} := 1_N + Σ_{k=1}^∞ s^k h_k = h^s is defined by f^s ∘ h^s = 1_N, hence

    f_1 + h_1 = 0,    f_k + h_k + δ(f_1, ···, f_{k−1}; h_1, ···, h_{k−1}) = 0,    k ≥ 2.

In particular,

    h_1 = −f_1,    h_2 = −f_2 + f_1^2,
    h_3 = −f_3 + f_1^* f_2 + (f_2^* − f_1^{2*}) f_1 − (1/2) f_1^3 + (1/2) f_1^{2*} f_1.

There is an obvious one-one correspondence between the Lie algebra FV_N and the Lie group FD_N, established simply by adding or subtracting 1_N. A more significant one-one correspondence between them, however, is given by exp and its inverse log:

    exp : FV_N → FD_N,
    a^s = Σ_{k=1}^∞ s^k a_k ⟼ exp a^s := 1_N + Σ_{m=1}^∞ (1/m!) (a^{s*})^m 1_N =: 1_N + Σ_{k=1}^∞ s^k f_k = f^s.    (2.1)

Note that

    (a^{s*})^m = (Σ_{k_1=1}^∞ s^{k_1} a_{k_1}^*) ··· (Σ_{k_m=1}^∞ s^{k_m} a_{k_m}^*) = Σ_{k_1,···,k_m=1}^∞ s^{k_1+···+k_m} a_{k_1}^* ··· a_{k_m}^*,

so we easily get

    f_k = Σ_{m=1}^k (1/m!) Σ_{k_1+···+k_m=k} a_{k_1}^* ··· a_{k_m}^* 1_N,    k ≥ 1,    f_1 = a_1;
    f_k = a_k + Σ_{m=2}^k (1/m!) Σ_{k_1+···+k_m=k} a_{k_1}^* ··· a_{k_m}^* 1_N,    k ≥ 2,    f_2 = a_2 + (1/2) a_1^2.    (2.2)

Note that (2.2) provides a two-way recursion between a_1, ···, a_k and f_1, ···, f_k. Therefore exp maps FV_N one-one onto FD_N, and its inverse, i.e., log, is defined by the same (2.2):

    log = (exp)^{−1} : FD_N → FV_N,
    log exp a^s = a^s,    exp log f^s = f^s.

In particular,

    a_1 = f_1,    a_2 = f_2 − (1/2) a_1^2,    a_3 = f_3 − (1/2)(a_1^* a_2 + a_2^* a_1) − (1/3!) a_1^3,
    a_4 = f_4 − (1/2)(a_1^* a_3 + a_2^2 + a_3^* a_1) − (1/3!)(a_1^* a_1^* a_2 + a_1^* a_2^* a_1 + a_2^* a_1^* a_1) − (1/4!) a_1^4,
    a_k = f_k − Σ_{m=2}^{k−1} (1/m!) Σ_{k_1+···+k_m=k} a_{k_1}^* ··· a_{k_m}^* 1_N − (1/k!) a_1^k,    k ≥ 3.
An equivalent construction of log f^s = a^s is

    log f^s = Σ_{m=1}^∞ ((−1)^{m−1}/m) h_m^s,    (2.3)

where

    h_1^s = f^s − 1_N,    h_m^s = h_{m−1}^s ∘ f^s − h_{m−1}^s.

It is easy to compute

    h_1^s = Σ_{k=1}^∞ s^k f_k = Σ_{k_1=1}^∞ s^{k_1} (1_N ∘ f)_{k_1},
    h_2^s = Σ_{k_1,k_2=1}^∞ s^{k_1+k_2} ((1_N ∘ f)_{k_1} ∘ f)_{k_2},
    h_3^s = Σ_{k_1,k_2,k_3=1}^∞ s^{k_1+k_2+k_3} (((1_N ∘ f)_{k_1} ∘ f)_{k_2} ∘ f)_{k_3},
    ···.

Substituting into (2.3) and equating with Σ_{k=1}^∞ s^k a_k, we get

    a_k = Σ_{m=1}^k ((−1)^{m−1}/m) Σ_{k_1+···+k_m=k} (···((1_N ∘ f)_{k_1} ∘ f)_{k_2} ··· ∘ f)_{k_m}.    (2.4)

It is easy to verify log exp a^s = a^s for this log, so it is precisely the inverse of exp, in agreement with the previous construction.
We use the above construction (2.4) to establish the formal Baker–Campbell–Hausdorff formula [Bak05, Hau06]. For arbitrary near-1 formal maps f^s, g^s,

    log (f^s ∘ g^s) = log f^s + log g^s + Σ_{k=1}^∞ d_k(log f^s, log g^s),

where, with log f^s = a^s and log g^s = b^s [Dyn46],

    d_k(a^s, b^s) = (1/k) Σ_{m=1}^k ((−1)^{m−1}/m) Σ_{p_1+q_1+···+p_m+q_m=k, p_i+q_i≥1, p_i≥0, q_i≥0} [(a^s)^{p_1} (b^s)^{q_1} ··· (a^s)^{p_m} (b^s)^{q_m}] / (p_1! q_1! ··· p_m! q_m!),

where

    (x)^p = x x ··· x (p times),    [x_1 x_2 x_3 ··· x_n] = [[···[[x_1, x_2], x_3], ···], x_n].

In particular,

    d_1 = (1/2)[a^s, b^s],    d_2 = (1/12)([a^s b^s b^s] + [b^s a^s a^s]),    d_3 = −(1/24)[a^s b^s b^s a^s].

Let log (f^s ∘ g^s) = c^s = Σ_{k=1}^∞ s^k c_k; then

    c_1 = a_1 + b_1,    c_2 = a_2 + b_2 + (1/2)[a_1 b_1],
    c_3 = a_3 + b_3 + (1/2)([a_1 b_2] + [a_2 b_1]) + (1/12)([a_1 b_1 b_1] + [b_1 a_1 a_1]),
    c_4 = a_4 + b_4 + (1/2)([a_1 b_3] + [a_2 b_2] + [a_3 b_1])
        + (1/12)([a_1 b_1 b_2] + [a_1 b_2 b_1] + [a_2 b_1 b_1] + [b_1 a_1 a_2] + [b_1 a_2 a_1] + [b_2 a_1 a_1])
        − (1/24)[a_1 b_1 b_1 a_1],    etc.

Note that the classical BCH formula is restricted to the composition of two one-parameter groups, where log f^s = s a_1 and log g^s = s b_1.
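The leading terms can be checked in a matrix Lie algebra, where exp and log are computable by truncated series: for 2 × 2 matrices X, Y one has (log(e^{sX} e^{sY}) − s(X + Y))/s^2 → (1/2)[X, Y] as s → 0. A sketch under illustrative choices of X, Y, and s:

```python
# Numerical check of the leading BCH term: for matrices X, Y,
# log(e^{sX} e^{sY}) = s(X+Y) + (s^2/2)[X,Y] + O(s^3).
# X, Y are small 2x2 examples chosen for illustration.

def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, c=1.0):
    return [[A[i][j] + c * B[i][j] for j in range(2)] for i in range(2)]

def expm(A, terms=25):
    S = [[1.0, 0.0], [0.0, 1.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    fact = 1.0
    for k in range(1, terms):
        P = mm(P, A)
        fact *= k
        S = add(S, [[P[i][j] / fact for j in range(2)] for i in range(2)])
    return S

def logm_near_id(M, terms=40):
    # log(I + E) = E - E^2/2 + E^3/3 - ..., valid since M is near the identity
    E = add(M, [[1.0, 0.0], [0.0, 1.0]], -1.0)
    S = [[0.0, 0.0], [0.0, 0.0]]
    P = [[1.0, 0.0], [0.0, 1.0]]
    for m in range(1, terms):
        P = mm(P, E)
        S = add(S, [[P[i][j] * ((-1.0) ** (m - 1)) / m for j in range(2)]
                    for i in range(2)])
    return S

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
s = 1e-3
sX = [[s * v for v in row] for row in X]
sY = [[s * v for v in row] for row in Y]
L = logm_near_id(mm(expm(sX), expm(sY)))
D = add(L, add(sX, sY), -1.0)                  # L - s(X+Y) ~ (s^2/2)[X,Y]
comm = add(mm(X, Y), mm(Y, X), -1.0)           # [X, Y] = diag(1, -1) here
approx = [[D[i][j] / s ** 2 for j in range(2)] for i in range(2)]
```

The residual in the comparison is the O(s) contribution of the cubic BCH terms, about s/6 for these X, Y.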
The log transform reduces matters at the Lie group level to those at the easier level of the Lie algebra. All properties of near-1 formal maps have their logarithmic interpretations.

Proposition 2.1 [Fen93a, Fen92, Fen93b]. We list some of them. Let log f^s = a^s = Σ_{k=1}^∞ s^k a_k:
1° f^s is a phase flow, i.e., f^{s+t} = f^s ∘ f^t ⇔ log f^s = s a_1.
2° f^s is revertible, i.e., f^s ∘ f^{−s} = 1_N ⇔ log f^s is odd in s.
3° f^s raised to the real μ-th power (f^s)^μ ⇔ log (f^s)^μ = μ log f^s. In particular, log (f^s)^{−1} = − log f^s, log √(f^s) = (1/2) log f^s.
4° f^s scaled to f^{αs} ⇔ log (f^{αs}) = (log f)^{αs}. In particular, log (f^{−s}) = (log f)^{−s}.
5° f^s − g^s = O(s^{p+1}) ⇔ log f^s − log g^s = O(s^{p+1}).
6° f^s ∘ g^s = g^s ∘ f^s ⇔ [log f^s, log g^s] = 0 ⇔ log (f^s ∘ g^s) = log f^s + log g^s.
7° f^s ∘ g^s = h^s ⇔ log h^s = log (f^s ∘ g^s) = log f^s + log g^s + Σ_{k=1}^∞ d_k(log f^s, log g^s).
8° f^s is symplectic ⇔ all a_k are Hamiltonian fields (see Chapter 5).
9° f^s is contact ⇔ all a_k are contact fields (see Chapter 11).
10° f^s is volume-preserving ⇔ all a_k are source-free fields (see Chapter 10).
The log transform has an important bearing on dynamical systems with Lie algebra structure. The structure-preserving property of maps f^s at the Lie group (G ⊂ D_N) level can be characterized through their logarithms at the associated Lie algebra (L ⊂ V_N) level.

9.3 Algorithmic Approximations to Phase Flows


9.3.1 Approximations of Phase Flows and Numerical Methods
We return to the main problem of approximating the phase flow of the dynamical system dx/dt = a(x):

    f_a^s = f^s = 1_N + Σ_{k=1}^∞ s^k f_k ≈ e_a^s = 1_N + Σ_{k=1}^∞ s^k e_k,    e_k = a^k/k!.

If f_k = e_k (1 ≤ k ≤ p), we say f_a^s is accurate to order ≥ p; if, moreover, f_{p+1} ≠ e_{p+1}, we say it is accurate to order p exactly.
Let log f^s = a^s = Σ_{k=1}^∞ s^k a_k. Note that the first p + 1 equations in (2.2) completely determine a_1, a_2, ···, a_{p+1} and f_1, f_2, ···, f_{p+1} from each other. It is then easy to establish

    f_k = e_k, 1 ≤ k ≤ p; f_{p+1} ≠ e_{p+1}  ⟺  a_1 = a = e_1; a_k = 0, 1 < k ≤ p; a_{p+1} = f_{p+1} − e_{p+1} ≠ 0.    (3.1)

So the orders of approximation for f_a^s ≈ e_a^s and for log f_a^s ≈ sa are the same.
Moreover, note that we have a formal field

    s^{−1} log f^s = s^{−1} a^s = a + Σ_{k=1}^∞ s^k a_{k+1} = a + O(s^p),

which is the original field a up to a near-0 perturbation and defines a formal dynamical system

    dx/dt = (s^{−1} log f^s)(x) = a(x) + Σ_{k=1}^∞ s^k a_{k+1}(x)

having a formal phase flow e^t_{s^{−1}a^s} = exp t s^{−1} a^s (in the two parameters t and s, with group property in t) whose diagonal formal flow e^t_{s^{−1}a^s}|_{t=s} is exactly f^s. This means that any compatible algorithm f_a^s of order p gives a perturbed solution of the right equation with field a; however, it gives the right solution of a perturbed equation with field s^{−1} log f_a^s = a + O(s^p). There can be many methods with the same formal order of accuracy but with quite different qualitative behavior. The problem is to choose among them those leading to allowable perturbations in the equation. For systems with geometric structure, properties 8°, 9°, 10° of Proposition 2.1 provide guidelines for a proper choice. The structure-preservation requirement for the algorithms precludes all unallowable perturbations alien to the pertinent type of dynamics. Take, for example, Hamiltonian systems. A transition map f_a^s for a Hamiltonian field a is symplectic if and only if all fields a_k are Hamiltonian, i.e., the induced perturbations in the equation are Hamiltonian. So symplectic algorithms are clean, inherently free from all kinds of perturbations alien to Hamiltonian dynamics (such as the artificial dissipation inherent in the vast majority of conventional methods); this accounts for their superior performance. The situation is the same for contact and volume-preserving algorithms.
Proposition 2.1 had a profound impact on the later developments known as backward error analysis, modified equations, and modified integrators [Hai94, CHV05, CHV07].
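For the linear equation ẋ = λx this statement is exact and elementary: the explicit Euler iterates x_n = (1 + sλ)^n x_0 are the exact flow, at times t = ns, of the modified field μx with μ = s^{−1} log(1 + sλ) = λ − sλ^2/2 + ···, in agreement with the expansion log f_E^s = sa − (s^2/2) a^2 + O(s^3). A sketch (λ, s, and the horizon are illustrative choices):

```python
# Backward error analysis is exact for explicit Euler on x' = lam*x:
# the iterates x_n = (1 + s*lam)^n x0 coincide with the exact flow, at
# t = n*s, of the modified field mu*x with mu = log(1 + s*lam)/s
# = lam - s*lam^2/2 + O(s^2).

import math

lam, s, x0 = -0.8, 0.05, 1.0
n = 40

x = x0
for _ in range(n):                         # n explicit Euler steps
    x = x + s * lam * x

mu = math.log(1.0 + s * lam) / s           # modified growth rate
x_modified = x0 * math.exp(mu * n * s)     # exact solution of x' = mu*x at t = n*s
x_exact = x0 * math.exp(lam * n * s)       # exact solution of the original equation
```

The numerical trajectory agrees with the modified equation to machine precision, while it differs from the original one at O(s).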

9.3.2 Typical Algorithms and Step Transition Maps

Finally we give, as an illustration, the four simplest methods together with their step transition maps and logarithms. Recall

    e^{sa} = 1_N + sa + (1/2) s^2 a^2 + (1/3!) s^3 a^3 + O(s^4).

(1) Explicit Euler method (E):

    x_1 − x_0 = s a(x_0),
    f^s − 1_N = sa,
    f_E^s = 1_N + sa,
    log f_E^s = sa − (s^2/2) a^2 + O(s^3);

non-revertible, order = 1.
(2) Implicit Euler method (I):

    x_1 − x_0 = s a(x_1),
    f^s − 1_N = s a ∘ f^s,
    f_I^s = (1_N − sa)^{−1} = (f_E^{−s})^{−1} = 1_N + sa + s^2 a^2 + O(s^3),
    log f_I^s = sa + (s^2/2) a^2 + O(s^3);

non-revertible, order = 1.
(3) Trapezoidal method (T):

    x_1 − x_0 = (s/2)(a(x_1) + a(x_0)),
    f^s − 1_N = (s/2)(a ∘ f^s + a),
    f_T^s = (1_N − (s/2) a)^{−1} ∘ (1_N + (s/2) a) = f_I^{s/2} ∘ f_E^{s/2}
          = (f_E^{s/2})^{−1} ∘ f_C^s ∘ f_E^{s/2} = 1_N + sa + (s^2/2) a^2 + (s^3/4) a^3 + O(s^4),
    log f_T^s = sa + (s^3/12) a^3 + O(s^5);

revertible, order = 2, symplectic for linear but non-symplectic for nonlinear Hamiltonian systems, where f_C^s denotes the centered Euler scheme below.
(4) Centered Euler method (C):

    x_1 − x_0 = s a((x_1 + x_0)/2),
    f^s − 1_N = s a ∘ ((f^s + 1_N)/2);

2-stage version recommended for implementation:

    x̄ = x_0 + (s/2) a(x̄),    x_1 = 2x̄ − x_0,

i.e.,

    x̄ = f_I^{s/2}(x_0),    x_1 = 2 f_I^{s/2}(x_0) − 1_N(x_0),

    f_C^s = 2 f_I^{s/2} − 1_N = (1_N + (s/2) a) ∘ (1_N − (s/2) a)^{−1} = f_E^{s/2} ∘ f_I^{s/2}
          = 1_N + sa + (s^2/2) a^2 + (s^3/8)(a_* a^2 + a^3) + O(s^4),
    log f_C^s = sa + s^3 ((1/8) a_* a^2 − (1/24) a^3) + O(s^5);

revertible, order = 2, unconditionally symplectic, with preservation of all quadratic invariants for Hamiltonian systems.
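The preservation of quadratic invariants can be seen directly on the harmonic oscillator q̇ = p, ṗ = −q: the centered Euler step is the Cayley transform (I − (s/2)A)^{−1}(I + (s/2)A) with A = [[0,1],[−1,0]], an orthogonal matrix, so H = (q^2 + p^2)/2 is conserved exactly, while explicit Euler inflates it by the factor 1 + s^2 per step. A sketch with illustrative step size and step count:

```python
# Centered Euler (implicit midpoint) on the harmonic oscillator q' = p,
# p' = -q is the Cayley transform (I - (s/2)A)^{-1}(I + (s/2)A), which is
# orthogonal: H = (q^2 + p^2)/2 is preserved exactly.  Explicit Euler
# multiplies H by (1 + s^2) every step.

s = 0.1
q, p = 1.0, 0.0
H0 = 0.5 * (q * q + p * p)

d = 1.0 + (s / 2) ** 2
for _ in range(1000):
    # closed-form midpoint update for this linear system
    q, p = (((1 - (s / 2) ** 2) * q + s * p) / d,
            (-s * q + (1 - (s / 2) ** 2) * p) / d)
H_mid = 0.5 * (q * q + p * p)

q, p = 1.0, 0.0
for _ in range(1000):
    q, p = q + s * p, p - s * q           # explicit Euler
H_euler = 0.5 * (q * q + p * p)
```

After 1000 steps the midpoint energy matches H0 to roundoff, while the explicit Euler energy has grown by a factor of roughly (1 + s^2)^1000.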
Note the similarities and delicate differences between C and T: both can be composed of an implicit s/2-stage and an explicit s/2-stage, but in opposite orderings. Moreover, they are conjugate to each other. C is far less known than T; it became prominent only after the recent development of symplectic algorithms [Fen85]. In crucial aspects, C is superior.
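The stated orders can be confirmed empirically: halving s should divide the global error by about 2 for the order-1 schemes and by about 4 for the order-2 schemes. The sketch below uses the test equation ẋ = −x^3 with exact solution x(t) = (1 + 2t)^{−1/2} (the test problem, step sizes, and the fixed-point solver for the implicit stages are illustrative choices):

```python
# Empirical order check for the four schemes on x' = f(x) = -x^3, x(0) = 1,
# exact solution x(t) = (1 + 2t)^{-1/2}.  Halving the step size should divide
# the global error by ~2 for E, I (order 1) and by ~4 for T, C (order 2).

def f(x):
    return -x ** 3

def solve_fp(g, x0):
    # fixed-point iteration for the implicit stages (contractive for small s)
    x = x0
    for _ in range(60):
        x = g(x)
    return x

def step(method, x, s):
    if method == "E":
        return x + s * f(x)
    if method == "I":
        return solve_fp(lambda y: x + s * f(y), x)
    if method == "T":
        return solve_fp(lambda y: x + s / 2 * (f(x) + f(y)), x)
    if method == "C":
        return solve_fp(lambda y: x + s * f((x + y) / 2), x)

def global_error(method, s):
    x = 1.0
    for _ in range(round(1.0 / s)):       # integrate to t = 1
        x = step(method, x, s)
    return abs(x - 3.0 ** -0.5)           # exact value x(1) = 1/sqrt(3)

rates = {m: global_error(m, 0.02) / global_error(m, 0.01) for m in "EITC"}
```

The ratio rates[m] approximates 2^p for a method of order p.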
Remark 3.1. The above log f_C^s is nothing but the formal vector field of the centered Euler scheme, in present terminology the modified field of backward error analysis:

    f̄ = f + (s^2/12)(f′ f′ f − (1/2) f″(f, f)).

9.4 Related B-Series Works


Consider the numerical solution of the ODE

    ż = f(z),    z ∈ R^n.    (4.1)

B-series methods: B-series were introduced by Hairer and Wanner [HW74]. The Taylor series of the exact solution of (4.1) with initial value z(0) = z can be written as

    z(h) = z + h f(z) + (h^2/2!) f′(z)f(z) + (h^3/3!)(f″(f(z), f(z)) + f′(z)f′(z)f(z)) + ···.    (4.2)

B-series methods are numerical integrators z_{n+1} = Φ_h(z_n) whose Taylor series have the same structure, with real coefficients a(τ):

    Φ_h(z) = z + h a(•) f(z) + h^2 a([•]) f′(z)f(z) + h^3 ((a([•,•])/2!) f″(f(z), f(z)) + a([[•]]) f′(z)f′(z)f(z)) + ···,    (4.3)

where the coefficients a(τ) are defined for all rooted trees τ (written here in the bracket notation introduced below) and characterize the integrator.
Every numerical integrator (including Runge–Kutta methods) can be expanded into a B-series, as introduced and studied in [HW74].
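Expansion (4.2) can be checked term by term on the scalar field f(z) = z^2, whose exact flow is the geometric series z(h) = z/(1 − hz): here f′f = 2z^3 and f″(f, f) + f′f′f = 6z^4, so z(h) = z + hz^2 + h^2 z^3 + h^3 z^4 + ···. A sketch (the values of z and h are illustrative):

```python
# Check of expansion (4.2) for the scalar field f(z) = z^2, with exact flow
# z(h) = z/(1 - h z) = z + h z^2 + h^2 z^3 + h^3 z^4 + ...
# Term by term: (h^2/2!) f'f = h^2 z^3 and (h^3/3!)(f''(f,f) + f'f'f) = h^3 z^4.

z, h = 0.5, 1e-2
exact = z / (1.0 - h * z)
taylor3 = z + h * z ** 2 + h ** 2 * z ** 3 + h ** 3 * z ** 4
residual = abs(exact - taylor3)   # should be O(h^4 z^5), here ~3e-10
```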
Definition 4.1 (rooted trees and forests). The set T of rooted trees and the set F of forests are defined recursively by:
1° the tree •, with only one vertex, belongs to T;
2° if τ_1, ···, τ_n are trees of T, then u = τ_1 ··· τ_n, the commutative product of τ_1, ···, τ_n, is a forest of F;
3° if u is a forest of F, then τ = [u] is a tree of T.
Let T = {•, [•], [•,•], ···} be the set of rooted trees, and let ∅ be the empty tree. For τ_1, ···, τ_n ∈ T, we denote by τ = [τ_1, ···, τ_n] the tree obtained by grafting the roots of τ_1, ···, τ_n to a new vertex, which becomes the root of τ. Elementary differentials F_f(τ) are defined by induction as

    F_f(•)(z) = f(z),    F_f(τ)(z) = f^{(m)}(z)(F_f(τ_1)(z), ···, F_f(τ_m)(z)),    τ = [τ_1, ···, τ_m].    (4.4)

For real coefficients a(∅) and a(τ), τ ∈ T, a B-series is a series of the form

    B(f, a, z) = a(∅) Id + Σ_{τ∈T} (h^{|τ|}/σ(τ)) a(τ) F_f(τ)(z)    (4.5)
               = a(∅) Id + h a(•) f + h^2 a([•]) f′f + h^3 (a([•,•])/2) f″(f, f) + ···,    (4.6)

where Id stands for the identity, Id(z) = z, and the scalars σ(τ) are the known normalization (symmetry) coefficients [BSS96]. We now give the following examples.
Example 4.2. The Taylor series of the exact solution of (4.1) can be written as a B-series z(h) = B(f, e)(z_0) with coefficients a(τ) = e(τ) = 1/γ(τ), ∀ τ ∈ T.
Example 4.3. The B-series coefficients of the explicit Euler scheme are a(τ) = 0, ∀ τ ∈ T, except a(•) = 1.
Example 4.4. The B-series coefficients of the implicit Euler scheme are a(τ) = 1, ∀ τ ∈ T.
Example 4.5. The B-series coefficients of the centered Euler scheme are a(τ) = (1/2)^{|τ|−1}, ∀ τ ∈ T.
Example 4.6. The B-series coefficients of the trapezoidal scheme are a(•) = 1, a([•]) = 1/2, a([•,•]) = 1/2, a([[•]]) = 1/4, ···.
Example 4.7. The B-series coefficients of a Runge–Kutta method (A, b, c) are a(τ) = b^T φ(τ), ∀ τ ∈ T.
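Examples 4.5 and 4.6 can be probed numerically. For f(z) = z^2 at z_0 = 1 one has f = 1, f′ = 2, f″ = 2, so the h^3 coefficient of one step is a([•,•])/2 · f″(f, f) + a([[•]]) · f′f′f, i.e. 1.5 for the trapezoidal rule and 1.25 for the centered Euler rule. A sketch extracting these coefficients by divided differences (step size, tolerances, and the fixed-point solver are illustrative choices):

```python
# Numerical probe of Examples 4.5-4.6 for f(z) = z^2 at z0 = 1 (f = 1,
# f' = 2, f'' = 2).  The h^3 term of one step is
#   a([.,.])/2 * f''(f,f) + a([[.]]) * f'f'f,
# i.e. (1/2)/2*2 + (1/4)*4 = 1.5 (trapezoidal) and
#      (1/4)/2*2 + (1/4)*4 = 1.25 (centered Euler).
# Both schemes share a(.) = 1 and a([.]) = 1/2, fixing the h and h^2 terms.

def f(z):
    return z * z

def fixed_point(g, z):
    for _ in range(80):
        z = g(z)
    return z

def trapezoid(z, h):
    return fixed_point(lambda y: z + h / 2 * (f(z) + f(y)), z)

def midpoint(z, h):
    return fixed_point(lambda y: z + h * f((z + y) / 2), z)

def h3_coeff(step, z0=1.0, h=1e-3):
    # subtract the known h and h^2 terms, divide by h^3; leftover error is O(h)
    known = z0 + h * f(z0) + h * h * 0.5 * 2.0 * z0 ** 3   # a([.]) * f'f = z0^3
    return (step(z0, h) - known) / h ** 3

c3_T = h3_coeff(trapezoid)   # ~ 1.5
c3_C = h3_coeff(midpoint)    # ~ 1.25
```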
Partitions and skeletons: A partition p^τ of a tree τ is obtained by cutting some of its edges [CHV07]. The resulting list of trees is denoted by P(p^τ), and the set of all partitions p^τ of τ is denoted by P(τ). Given a partition p^τ, the corresponding skeleton χ(p^τ), as introduced in [CHV07], is the tree obtained by contracting each tree of P(p^τ) to a single vertex and re-establishing the cut edges (see Tables 4.1 – 4.25). We observe that a tree τ ∈ T has exactly 2^{|τ|−1} partitions p^τ and that different partitions may lead to the same P(p^τ). An admissible partition is a partition with at most one cut along any path from the root to any terminal vertex. We denote by AP^τ the set of admissible partitions of τ; by convention, ∅ ∈ AP^τ. We denote by #(p^τ) the number of subtrees of a partition. Among the trees of P(p^τ) we distinguish the one whose root coincides with the root of τ; it is usually referred to as a subtree of τ and is denoted by R(p^τ) (or r_p, or v_p). We denote by P^*(p^τ) = P(p^τ) \ R(p^τ) (or v_p^*) the forest of trees that do not contain the root of τ. These notions are illustrated in Tables 4.1 – 4.25.

9.4.1 The Composition Laws


The following result on the composition of B-series was obtained in [HW74]. We formulate the theorem in the form of [CHV07]:
Theorem 4.8. Let a, b : T ∪ {∅} → R be two mappings with a(∅) = 1. Then the B-series B(f, a)(z) inserted into B(f, b)(·) is again a B-series,

    B(f, b)(B(f, a)(z)) = B(f, a · b)(z),    (4.7)

where a · b : T ∪ {∅} → R is defined by

    (a · b)(∅) = b(∅),    (a · b)(τ) = Σ_{p∈AP(τ)} b(r_p) a(v_p^*),    ∀ τ ∈ T,    (4.8)

and a is extended to F as follows:

    ∀ u = τ_1 ··· τ_n ∈ F,    a(u) = Π_{i=1}^n a(τ_i).    (4.9)
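A minimal instance of Theorem 4.8 is the composition of explicit Euler with itself (a = b, with a(•) = 1 and all other tree coefficients zero). Working out (4.8) over the admissible partitions of the trees up to order 3 gives (a·a)(•) = 2, (a·a)([•]) = 1, (a·a)([•,•]) = 1, (a·a)([[•]]) = 0, which for f(z) = z^2 at z = 1 predicts the step pair exactly (since f‴ = 0 all higher trees vanish):

```python
# Minimal instance of the composition law: two explicit Euler steps of size h.
# The law gives (a.a)(.) = 2, (a.a)([.]) = 1, (a.a)([.,.]) = 1, (a.a)([[.]]) = 0,
# so for f(z) = z^2 at z = 1 (f = 1, f'f = 2, f''(f,f) = 2, f'f'f = 4):
#   z + 2h*f + h^2*f'f + h^3*((a.a)([.,.])/2 * f''(f,f)) = 1 + 2h + 2h^2 + h^3,
# which is exact here because f''' = 0.

def euler(z, h):
    return z + h * z * z

h = 0.01
z2 = euler(euler(1.0, h), h)
predicted = 1.0 + 2 * h + 2 * h ** 2 + h ** 3
```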

Table 4.1. The partitions of a tree of order 2 with associated skeleton and forest

• ·•
·
•·
τ
p •



χ(pτ ) •
' /  

τ ••
P (p ) •



R(pτ ) •
 
 
∗ τ
∅ •
P (p )

#(pτ ) 1 2

pτ ∈ AP τ yes yes
420 9. Formal Power Series and B-Series

Table 4.2. The partitions of a tree of order 3 with associated skeleton and forest

• • •· • • ·• •· ·•
· · · ·
pτ • ·• •· ·•·

• • • •

χ(pτ ) • • •
'• •/  •  •  
P (p ) τ
• •• •• •••

• • • •
τ •
R(p ) • • •
 
     
∗ τ
∅ • • ••
P (p )

#(pτ ) 1 2 2 3

pτ ∈ AP τ yes yes yes yes

Table 4.3. The partitions of a tree of order 3 with associated skeleton and forest

• • • •·
·· ··
• • •· •
·· ··
pτ • •· • •·

• • •
τ •
χ(p ) • • •
'• /
•  •  •  
• • •••
P (pτ ) • • •


• •
• •
R(pτ ) • •
  ' /
•    
∗ τ
∅ • ••
P (p ) •
#(pτ ) 1 2 2 3
p ∈ AP
τ τ
yes yes yes no
9.4 Related B-Series Works 421

Table 4.4. The partitions of a tree of order 4 with associated skeleton and forest

• • • • • • • •
·· ·· ·· ··
• • •· • • ·• •· ·• · •· ·• •· ··•
·· ·• •
· ·
• ·• · · ·
· ·
pτ • • •· • ·•· •· ·• ·•·

• •
• • • • • • • • • •
τ •
χ(p ) • • • • • • •
' • / ' • / ' /' /' / ' / ' / 
• •
• • • • • • • ••

••

••

••••
P (pτ ) • • • • • •
• •
• •
• • • • • • • •
τ • •
R(p ) • • • • • •
 
   •     •      

∅ • • • •• •• •••
P (p ) τ
• •
#(pτ ) 1 2 2 2 3 3 3 4
p ∈ AP τ
τ
yes yes yes yes yes no yes no

Table 4.5. The partitions of a tree of order 4 with associated skeleton and forest

• • • • • • •· • •· ·• • • •· • •· ·•
·· ·· ·· ·· ·· ·· ·· ··
• •· •· • • •·· •· •·
pτ · · · ·
• •· • • • •· •· •·
• • • •
• • • • • • • • •
χ(pτ ) • • • • • • •
'• /' • / ' • / '
•/ ' / ' / ' /
• • • • •  
τ
• • • • • •• •• •• ••••
P (p ) • • • • • • • •
• • • •
• • • • • • •

R(pτ ) • • •

  ' /      ' /' /
• • • •  
∅ • • • , • •, •, •,•, •
P ∗ (pτ ) • • •

#(pτ ) 1 2 2 2 3 3 3 4

pτ ∈ AP τ yes yes yes yes yes no no no


422 9. Formal Power Series and B-Series

Table 4.6. The partitions of a tree of order 4 with associated skeleton and forest
• • •· • •· •· • •·
·· ·· ·· ··
• • • • • • • •
·· ·· ·· · ··
• •· • •· •· •·· •·· •·
pτ • ··• • • • ·• ··• ··•

• • •

• • • • • • •

χ(p ) τ
• • • • • • •

' / ' • / ' • / ' /' /' /' / 
• • • • • •
• • • • • •• •• •• ••••
P (pτ ) • • • • • • • •
• •

• • • •
• • • •
R(pτ ) • • • •
' • /
 
•    •    •  •  
∅ • •• • • •••
P ∗ (pτ ) • • • •
#(pτ ) 1 2 2 2 3 3 3 4
p ∈ AP τ
τ
yes yes yes yes no no no no

Table 4.7. The partitions of a tree of order 5 with associated skeleton and forest

• • • •· • • • •· • • • ·• •· •· • •· • ·• • •· ·• •· •· ·•
·· ·· · ·· ·· ·· · · ·· · ·· ·· · ·
•· •·
τ
p • • • • • •

• • • • • • • • • • • •
τ •
χ(p ) • • • • • • •

' • /' /' /' /' / ' / ' / ' /


• • • • • • • • • • •
• • • •• •• •• ••••
P (pτ ) • • • • • • •

• • • • • • • • • • • • •
R(p ) τ
• • • • • • •

' / ' / ' / ' / ' /' /' /' /


∅ • • • • • • • • • •••
P ∗ (pτ )

#(pτ ) 1 2 2 2 3 3 3 4

pτ ∈ AP τ yes yes yes yes yes yes yes yes


9.4 Related B-Series Works 423

Table 4.8. The partitions of a tree of order 5 with associated skeleton and forest
• • • • • • • •
·· ·· ··
• • •· •· • •·· • •·
·· ··
• • • • • • • •
·· ··
• •· • • •· • •·· •·

• ··• • • • • ··• ··•
• • •
• • • •
• • • •
χ(pτ ) • • • • • • •

'• /' • /' • /'
• • • •/' • • /' • /' • /' •/
• • • • • • • • • •• • •• • •• •
• • • • • •
P (p ) τ
• • •



• • •

• • • • • •
• •
R(p ) τ
• • • • •

    ' / ' •/   ' •/ ' •/
• •
∅ • • • •• •• ••
P ∗ (pτ ) •
• • • •
#(pτ ) 1 2 2 2 2 3 3 3
p ∈ AP τ
τ
yes yes yes yes yes no no no

Table 4.9. Continuous partitions of the above tree of order 5 with associated skeleton and
forest
• • • • • • • •
·· ·· ·· ·· ··
•· •· •· •·· •· •·· •· •··
·· ·· ·· ·· ·· ··
• • • • • • • •
·· ·· ·· · ·· ·· ···
•· •· •· • · •· •· •·· •·
pτ ··• • • • ··• ··• ··• ··•

• • • • •
• • • • • • •

• • • • • • • •
τ
χ(p ) • • • • • • • •
'• • /'• • /'• • /' • /' •/ ' •/ ' • / ' /
• • • ••• ••• ••• ••• •• •• •
•• •• •• • • • •
P (pτ )
• • •
• • • • •
R(p ) τ • • •
'• • /' • /' • / ' •/ ' •/ ' • /  
• • ••• •• •• •• ••••
P ∗ (pτ ) •• • • • • •
#(pτ ) 3 3 3 4 4 4 4 5
p ∈ AP τ
τ
no no no no no no no no
424 9. Formal Power Series and B-Series

Table 4.10. The partitions of a tree of order 5 with associated skeleton and forest

• • • • • • • • • • • • • • • •
• • •· • • ·· • • ·· • • ·• •· ···· • • ··· ··· • • ·····•
·· · · ·
•· ·• •·
τ
p • • • • •

• • • • • • • • • •

χ(p ) τ
• • • • • • •

 • •   •   •   •   •   • •  • •  • •
• • • • • • • • • •
•• •• ••
P (p ) τ
• • • • • • • • • • • •

• • • • • • • • • • • •
• • • • • • • • • •
R(p ) τ
• • • • • • • •
               
∅ • • • • •• •• ••
P ∗ (pτ )

#(pτ ) 1 2 2 2 2 3 3 3

pτ ∈ AP τ yes yes yes yes yes yes yes yes

Table 4.11. Continuous partitions of the above tree of order 5 with associated skeleton and
forest

• • • • • • • • • • • • • • • •
• ·· ·• •· ·· • •· • •· ·· ·· • • ·· ·· ·• •· ·· ·• •· ·· ·• •· ·· ·· ·•
τ · ·· ·· · · · ··· · ·· · · ··· · · ··· · ·· ·· · ·· ···
p • • • • • • • •

• • • • • • • • • • • • • • • • • • •
• •

τ
p • • • • • • • •

 • ' • / ' • / ' • / ' • / 


•  • •  • •
•• •• •• ••• ••• ••• ••• • • • • •
P (pτ ) • • • • • • •

• • • • • • • • • •

R(pτ ) • • • • • • •
           
•• •• •• ••• ••• ••• ••• ••••
P ∗ (pτ )

#(pτ ) 3 3 3 4 4 4 4 5

pτ ∈ AP τ yes yes yes yes yes yes yes yes


9.4 Related B-Series Works 425

Table 4.12. The partitions of a tree of order 5 with associated skeleton and forest

• • • • • • • • · • • •·· • • • • • • • • •·· • • • · •
· · ·· ·· · ·
• •· •· • ·• ·•
·· •· •· ·
·· · ·
pτ • • • • • •· •· •·
• • •
• • • • • •
• •
χ(pτ ) • • • • • • •
'• • •/ '• •
/'• •
/'• •
/
 •   • •  • •  • •
• • • • • •
• • • •• •• ••
P (pτ ) • • • • • • • • •
• • • • • • • • •
• • • •
• • • •
R(pτ ) • • • •

  ' • /       '• •
/'
• •
/'
• •
/
• •
∅ • • • • • •
P ∗ (pτ ) • • • •
3 1 2 2 2 2 3 3 4
p ∈ AP τ
τ
yes yes yes yes yes no no no

Table 4.13. Continuous partitions of the above tree of order 5 with associated skeleton and
forest

• •·· • • • · • • •·· · • • •·· · • • •·· · • • • ·• • •·· • • •·· · •


·· · ·· · ·· ·· · · ·· ·· · · ·· · ·· · ·
·• · •· •· · •· •· · · •· ·• · •·
·· ·
· ·
·· ··
pτ • • • • • • • •·
• • • • • • • • •
• • • • • • • • • • • • •
χ(pτ ) • • • • • • • •
' •/ ' •/ ' /
• ' / ' / ' / ' / 

••• ••• ••• • • • •
••• ••• ••• ••• •••••
P (pτ ) • • • • • • •

• • • •
• • • • • • •

R(pτ ) • • •
' • /' • /' • /
      
• • • • • • ••• •• •• •• ••••
• • •
P ∗ (pτ )
#(pτ ) 3 3 3 4 4 4 4 5
pτ ∈ AP τ yes yes yes yes no no no no
426 9. Formal Power Series and B-Series

Table 4.14. The partitions of a tree of order 5 with associated skeleton and forest
• • • • • • • •
·· ·· ··
• •· •· • • •·· • •·
• • • • • ··• • • •· • • ··• •· ·• • ·•
·· ·· ·· ·· ·
pτ • • • •· • • • •·
• •
• • • • • • • •

χ(pτ ) • • • •
• • •
' • ' / ' / ' • • ' / ' • ' /
/ • / ' / /
• •• • • • • • • • • •
• • •• • • • •• • •• • • ••
• •• ••
P (pτ ) • •
• •
• • • • •
• • • • • • • • • •
• • • •
R(pτ ) • • •

    '• / ' • /    ' •/' • /


∅ • • • • • • • •
• •
P ∗ (pτ ) • •
#(pτ ) 1 2 2 2 2 3 3 3
pτ ∈ AP τ yes yes yes yes yes no yes no

Table 4.15. Continuous partitions of the above tree of order 5 with associated skeleton and
forest
• • • • • • • •
·· ·· ·· ·· ··
•· •· •· •·· •·· •· •· •··
•· • • ···• •· ··• • ···• •· ··• •· ·• •· ···• •· ···•
·· · ·· · ·· ·· ·· ·· ·· ·· ··
pτ • •· • •· • • • •

• • •
• • • • • • • • •
• • • • • • • • •
• ··
χ(pτ ) • • • • • • • •·
' • / ' • • /' • • /' /' /' /' / 
•• • • • • • • •••••
• • • • • • • • • • • • •
•• •• • • • •
P (pτ ) •

• • • • • •
• •
R(pτ ) • • • • •

  ' • / ' • /    ' • / ' • / 


• • • • ••• ••• •• •• ••••
• • • •
P ∗ (pτ )
#(pτ ) 3 3 3 4 4 4 4 5
p ∈ AP τ
τ
yes no yes no no no no no
9.4 Related B-Series Works 427

Table 4.16. The partitions of a tree of order 5 with associated skeleton and forest

• • • •· •· • • • • • •· •· • • • •·
· · · · ·
• • • •· •· • • • •· • •· •· •· ·• • •·
·· ·· ·· ·· ·· ·
pτ • • • •· • • • •

• • • • • • • • •

χ(pτ ) • • • • • • •

'• • / ' • /' • / ' •/ ' •/ ' /' • • / ' • /


• • • • • • • • • •• •
•• • •• • •• ••
P (pτ ) • • • • • • •
• •
• • • • • • •
• • • • •
• • • • • • •
R(pτ ) • • •
• • • •
      •  •   '
 •• /  
∅ • • • • • •
P ∗ (pτ ) • • ••
#(pτ ) 1 2 2 2 2 3 3 3
p ∈ AP τ
τ
yes yes yes yes yes yes yes no

Table 4.17. Continuous partitions of the above tree of order 5 with associated skeleton and
forest

•· • • •· •· • •· • • •· •· •· •· •· •· •·
· · · · · · · · · · ·
•·· • •· •· •· • •·· ·• •· ·•· •· •· •·· •· •·· ·•·
·· ·· ·· ·· ·· ·· ·· ·· · ·· ·· ··
pτ • • •· • • • • •
• • • • • • •
• • • • • • • • • • • • • • •
χ(pτ ) • • • • • • • •
' • /'
• • / ' • • / ' • / ' • /' • / ' • /  
•• • • • ••• ••• ••• ••• •••••
•• •• • • • •
P (pτ ) •

• • • •
• • • •
τ • • • •
R(p ) •
' / /
 ' • / ' • / • ' •      
• • ••• • • • ••• • • • • • • •
• • • • • •
P ∗ (pτ )
#(pτ ) 3 3 3 4 4 4 4 5
p ∈ AP τ
τ
no yes yes no no no no no
428 9. Formal Power Series and B-Series

Table 4.18. The partitions of a tree of order 5 with associated skeleton and forest
• •· • • • •· • •
• • ·· • • • • ·· • •
• • • • •· • • · • • · ·• • • •· · • •· •
·· ·· · ·· ·· ·· ·· ·· · · ·
pτ • • • • • •· • •

• • • • • • • • •

χ(pτ ) • • • • • • •

' • / ' • /' • /' /' • /'


 •  /' •• /
• • • • • •• • •• • • • • • • •• •

•• ••
P (pτ ) • • • •• • •
• •
• • • •
• • • • • • • • • • • • • • • •
R(pτ ) • • • • • • • •
        ' /     ' • /

∅ • • • • • • • •
P ∗ (pτ ) • •

#(pτ ) 1 2 2 2 2 3 3 3
pτ ∈ AP τ yes yes yes yes yes no yes yes

Table 4.19. Continuous partitions of the above tree of order 5 with associated skeleton and
forest

•· •· • • •· •· •· •·
· · · · · ·
• •· ·• •· • ·• • •· ·• •· •· ·• • •· ··• •· • ··• •· •· ·• •· •· ··•
·· ·· ·· · ·· ·· ··· ·· · ·· · · ·
·· · ·
·· · · ·
pτ • • •· • •· • • •
• • •
• • • • • • • • • • • •· • • • • • • •
·
χ(pτ ) • • • • • ·• • •
' /' /' ' /' /' /' / 
• • • • •• / • • • • 
•• •• • • • • • • • • • • • • • • • • • •
P (pτ ) • • •• • • • •

• • • • • • • •
• •
R(pτ ) • • • • • •
  ' • / ' • /      
• • • • • •• ••• ••• ••• ••••
P ∗ (pτ ) • •

#(pτ ) 3 3 3 4 4 4 4 5
pτ ∈ AP τ yes yes yes yes no no yes no
9.4 Related B-Series Works 429

Table 4.20. The partitions of a tree of order 5 with associated skeleton and forest

• • • • •· • • • • • •· ·• • • •· •
·· · · · ·
• • • •· • ·• • ·• •· • • ·•· •· ·• •· ·•
·· ·· ·· ·· ··
pτ • • • • • • • •

• • • • • • • • • •

χ(pτ ) • • • • • • •

' • /•' • /' • /' •/


/ '• •/ ' /' /'
• • •• • •• • • • • • • • • • • •
•• •• ••
P (pτ ) • • • •• • • • •

• • • • • • •
• • • • • • • • • •
• •
R(pτ ) • • • • • •

      ' /    ' • /
•  
• •
∅ • • • • • • • •
• •
P ∗ (pτ )
#(pτ ) 1 2 2 2 2 3 3 3
p ∈ AP τ
τ
yes yes yes yes yes yes yes yes

Table 4.21. Continuous partitions of the above tree of order 5 with associated skeleton and
forest

•· • • ·• • ·• •· ·• • ·• •· • •· ·• •· ·•
·· ·· • •·· • ··•·· • •·· • ··• ·· ·· ·· ··
• ·• • ·• · · · · • • • •
· · ·· · ··· ·· ··· ·· ··· ··· ···
pτ •· •· • •· • • • •
• • • • • • • •
• • • • • • • • • • • •
• •
χ(pτ ) • • • • • • • •
' • /
••  •• 
•• •
 •  •  •  •  
• • ••• ••• ••• ••• •••••
P (pτ ) •• •• • • • • •

• • • • •
• • •
R(pτ ) • • • • •
 •  •     •  •    
• • • • ••• •• •• ••• ••••

P (p ) τ • • • •
τ
#(p ) 3 3 3 4 4 4 4 5
p ∈ AP
τ τ
no no yes no no no no no
430 9. Formal Power Series and B-Series

Table 4.22. The partitions of a tree of order 5 with associated skeleton and forest
• • • • • • •· • • • • • •· ·• •· •
·· ·· ·· ·· ··
• • •· • •· •· • •·
·· ·· ··
• • • • • • • •
pτ ··· · ··
• • • • • • • •
• •
• • • • • • • •

χ(pτ ) • • • • • • •
• • • •
' • / ' • •/ ' /' • /' • /' /' • / ' /
• • • • • ••
• • • • • • • •• •• • •
• •• • ••
P (pτ ) • • • •
• • • •
• • • •
• • • • • • • •
R(pτ ) • • • • •

  ' • •/ ' /     ' /   ' • /
• • • • •
∅ • • • •• •
• •
P ∗ (pτ ) • •
#(pτ ) 1 2 2 2 2 3 3 3
pτ ∈ AP τ yes yes yes yes yes no yes no

Table 4.23. Continuous partitions of the above tree of order 5 with associated skeleton and
forest
•· • • ·• • ·• •· ·• •· ·• •· • • ·• •· ·•
·· · · ·· ·· ·· ·· ·· · ·· ··
• •·· •· •· • •· •·· •·
·· ·· ·· ·· ··
• • • • • • • •
pτ ·· ·· ·· ·· ·· ··
•· • •· • •· •· •· •·
• • • •
• • • • • • • • • •
• • • • • • • •
τ
χ(p ) • • • • • • • •
' •/  ' •/
••  •  •  •  •  
• • •
•• • •• •• ••• ••• ••• ••• •••••
P (pτ ) • • • • •

• •
• • • • • •
R(pτ ) • •
' •/  •  ' •/   •  •  •  
• • • • • ••• •• •• •• ••• •
• • • •
P ∗ (pτ ) • •
#(pτ ) 3 3 3 4 4 4 4 5
p ∈ AP τ
τ
no no no no no no no no
9.4 Related B-Series Works 431

Table 4.24. The partitions of a tree of order 5 with associated skeleton and forest
[The rows pτ, χ(pτ), P(pτ), R(pτ), P∗(pτ) consist of rooted-tree diagrams lost in this rendering; the remaining rows read #(pτ) = 1, 2, 2, 2, 2, 3, 3, 3 and pτ ∈ APτ: yes, yes, yes, yes, yes, no, yes, no.]

Table 4.25. Continuous partitions of the above tree of order 5 with associated skeleton and forest
[The rows pτ, χ(pτ), P(pτ), R(pτ), P∗(pτ) consist of rooted-tree diagrams lost in this rendering; the remaining rows read #(pτ) = 3, 3, 3, 4, 4, 4, 4, 5 and pτ ∈ APτ: no, no, yes, no, no, no, no, no.]

9.4.2 Substitution Law


In [CHV07, CHV05], Chartier, Hairer, and Vilmart introduce a new composition law on B-series, denoted by ∗ and called the substitution law; it is obtained by substituting the vector field g(z) = (1/h) B(f, b)(z), with b(∅) = 0, into another B-series B(g, a)(z). They give the following theorem:
Theorem 4.9. For b(∅) = 0, the vector field h−1 B(f, b) gives a B-series

B(h−1 B(f, b), a) = B(f, b ∗ a). (4.10)

We have (b ∗ a)(∅) = a(∅) and, for all τ ∈ T,

(b ∗ a)(τ) = Σ_{p∈P(τ)} a(χ_p) b(v_p),    (4.11)

where b is extended multiplicatively to the set of forests F as follows:

∀ u = τ1 · · · τn ∈ F,    b(u) = Π_{i=1}^{n} b(τi).    (4.12)

Remark 4.10. The composition law for the trees of order ≤ 5 is listed in Example 4.22.
Remark 4.11. The substitution law for the (backward error) trees of order ≤ 5 is listed in Example 4.24.
Remark 4.12. The substitution law for the trees of order ≤ 5 is listed in Example 4.23.
Modified integrators (also called the generating function method, or preprocessed vector field integrators): Let Ψf,h be the exact h-flow for Equation (4.1), which is a B-series with coefficients e(τ) = 1/γ(τ). Consequently, the coefficient b̆(τ) of the modified differential equation for Φf,h = B(f, a) is obtained from

(b̆ ∗ a)(τ ) = e(τ ), ∀ τ ∈ T. (4.13)

Backward error analysis (also called the formal vector field, modified equation, or postprocessed vector field): The modified differential equation of a method Φf,h = B(f, a) is obtained by setting the exact flow of the modified field equal to the method. Its coefficient b(τ) is therefore obtained from
(b ∗ e)(τ ) = a(τ ), ∀ τ ∈ T. (4.14)
Remark 4.13. Substituting the expression given in (4.13) into (4.14) gives b ∗ b̆ ∗ a = a. Therefore, b̆ and b are inverse elements for the substitution law ∗:

b̆(τ) ∗ b(τ) = b(τ) ∗ b̆(τ) = δ•(τ).    (4.15)



Proposition 4.14. Using formulae (4.13) and (4.11) in Example 4.23, we easily obtain the modified centered Euler scheme of sixth order, first found in [CHV07]:

ż = f(z) + (h²/24)[ f^(2)(f, f) − 2f^(1)f^(1)f ]
    + (h⁴/120)[ (3/48) f^(4)(f, f, f, f) − (1/4) f^(3)(f, f, f^(1)f) + (1/4) f^(2)(f, f^(2)(f, f))
    − f^(2)(f, f^(1)f^(1)f) + (2/4) f^(2)(f^(1)f, f^(1)f) − (3/12) f^(1)f^(3)(f, f, f)
    + (1/2) f^(1)f^(2)(f, f^(1)f) − (1/4) f^(1)f^(1)f^(2)(f, f) + f^(1)f^(1)f^(1)f^(1)f ] + O(h⁶).

Proof. First, we must point out that b̆(τ) = 0 whenever |τ| is even. We calculate the coefficient b̆( ) as follows:

b̆( ) + 2a( )b̆( )2 b̆( ) + a( )b̆( )2 b̆( ) + a( )b̆( )2 b̆( ) + a( )b̆( )5 = e( ).

Noting the formula in Example 4.23, the coefficients a(τ) in Example 4.5, and γ(τ) in Table 4.26, we have b̆( ) = 1/40 + 1/24 − 1/16 = 1/240.
The proof of the others is left to the reader.
Proposition 4.15. In 2001, the author first obtained the modified equation for the centered Euler scheme given in Example 4.6 of Chapter 7. Using formulas (4.14) and (4.23) in Example 4.24, we can obtain this formula again.

Proof. First, we must point out that b(τ) = 0 whenever |τ| is even. We calculate the coefficient b( ) as follows:

b( ) + 6e( )b( )2 b( ) + (1/γ( )) = a( ).

Noting the formula in Example 4.22, the coefficients a(τ) in Example 4.5, and γ(τ) in Table 4.26, we have b( ) = 1/16 + 1/6 − 1/5 = 7/240. The proof of the others is left to the reader.
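Both rational computations above can be verified with exact arithmetic. The following is a quick sketch using Python's fractions module; only the three quoted terms of each sum are used, and the tree-dependent coefficients themselves are taken from the text:

```python
from fractions import Fraction as F

# Proposition 4.14: 1/40 + 1/24 - 1/16 should equal 1/240
b_breve = F(1, 40) + F(1, 24) - F(1, 16)

# Proposition 4.15: 1/16 + 1/6 - 1/5 should equal 7/240
b_coeff = F(1, 16) + F(1, 6) - F(1, 5)

print(b_breve, b_coeff)  # 1/240 7/240
```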

Remark 4.16. After calculating the coefficients b̆(τ) and b(τ), we list them in Table 4.26.

Remark 4.17. One can directly verify the equation

b̆(τ) ∗ b(τ) = δ•(τ),    (4.16)

via the ∗ operation formula (4.11).
Remark 4.18. For the relation of the previous laws to the two Hopf algebras introduced, respectively, by Connes and Kreimer [CK98] and by Calaque, Ebrahimi-Fard, and Manchon [CEFM08], see the papers [Bro00], [CHV08].

Table 4.26. Coefficients σ(τ), γ(τ), b̆(τ), and b(τ) for trees of order ≤ 5
[The row of trees τ consists of rooted-tree diagrams lost in this rendering.]

First block (τ = ∅, • and the trees of order ≤ 4):
σ(τ):  1, 1, 2, 1, 6, 1, 2, 1
γ(τ):  1, 2, 3, 6, 4, 8, 12, 24
b̆(τ):  0, 1, 0, 1/12, −1/12, 0, 0, 0, 0
b(τ):  0, 1, 0, −1/12, 1/12, 0, 0, 0, 0

Second block (the nine trees of order 5):
σ(τ):  24, 2, 2, 6, 1, 1, 2, 1, 2
γ(τ):  5, 10, 20, 20, 40, 30, 60, 120, 15
b̆(τ):  1/80, −1/240, 1/120, −1/80, 1/240, −1/120, −1/240, 1/120, 1/240
b(τ):  7/240, 1/240, 1/80, −7/240, −1/240, −1/80, 1/240, 1/80, −1/240

9.4.3 The Logarithmic Map


The coefficients ω(τ) can be interpreted as the coefficients of the modified field obtained by backward error analysis for the explicit Euler method z1 = z0 + hf(z0), corresponding to a = δ∅ + δ•. They can be computed by formula (4.11) or (4.22). Murua in [Mur06] gives the following formula:

log (a) = (a − δ∅) ∗ ω.    (4.17)

Properties of the logarithmic map have been discussed in Proposition 2.1. Using the formulas of Example 4.24, ω(τ) (= b(τ)) is determined recursively, because a(τ) = 0 for all τ ∈ T except a(•) = 1.
For example, from the 14th formula of Example 4.24, inserting the already known coefficients for the lower-order trees gives a linear equation for b( ); solving it, we get

ω( ) = b( ) = 1/20.

The verification of the other ω(τ) is left to the reader.


We give the following Table 4.27 (compare with [Mur06], [CHV05], [CHV08], [CEFM08]).

Table 4.27. Coefficients ω(τ) for trees of order ≤ 5
[The row of trees τ consists of rooted-tree diagrams lost in this rendering.]

τ = ∅, • and trees of order ≤ 4:
ω(τ): 0, 1, −1/2, 1/6, 1/3, 0, −1/12, −1/6, −1/4

Trees of order 5:
ω(τ): −1/30, −1/60, 1/30, 1/30, 1/10, 1/20, 3/20, 1/5, 1/60

Definition 4.19 (Lie derivative of B-series). Let b(τ) with b(∅) = 0 and a(τ) be the coefficients of two B-series, and let z(t) be a formal solution of the differential equation hż(t) = B(b, z(t)). The Lie derivative of the function B(a, z(t)) with respect to the vector field B(b, z(t)) is again a B-series:

h (d/dt) B(a, z(t)) = B(∂b a, z(t)).    (4.18)

Its coefficients are given by ∂b a(∅) = 0 and, for |τ| ≥ 1, by

∂b a(τ) = Σ_{θ∈SP(τ)} a(θ) b(τ \ θ),    ∂b a(τ) = Σ_{p∈P(τ)} a(χ_p) b(v_p).    (4.19)

Exercise 4.20. [HLW02] Prove that the coefficients of the modified differential equation are recursively defined by b(∅) = 0, b(·) = 1, and

b(τ) = a(τ) − Σ_{j=2}^{|τ|} (1/j!) ∂b^{j−1} b(τ),    (4.20)

where ∂b^{j−1} b(τ) is the (j − 1)-th iterate of the Lie derivative ∂b.

Proposition 4.21. The above-mentioned formula (4.20) is just the formula (b ∗ e)(τ) = a(τ) with e(τ) = 1/γ(τ); namely,

Σ_{j=1}^{|τ|} (1/j!) ∂b^{j−1} b(τ) = b(τ) ∗ (1/γ(τ)).    (4.21)

Proof. Using formula (4.23) in Example 4.24 and Tables 4.1 – 4.25, we can obtain this result directly.

For example, from the 4th formula of Example 4.24, we have

e( )b( ) + 2e( )b( )b( ) + e( )b( )3 = a( ),

b( ) + b( )b( ) + (1/6)b( )3 = a( ),

b( ) = a( ) − b( )b( ) − (1/6)b( )3.

Example 4.22. The composition laws for the trees of order ≤ 5 are

a · b( ) = b(∅) · a( ) + b( )
a · b( ) = b(∅) · a( ) + b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + b( ) · a( )2 + 2b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + b( ) · a( ) + b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + b( ) · a( )3 + 3b( ) · a( )2 + 3b( ) · a( )
+b( )
a · b( ) = b(∅) · a( ) + b( ) · a( )a( ) + b( ) · a( ) + b( ) · a( )2
+b( ) · a( ) + b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + b( ) · a( ) + b( ) · a( )2 + 2b( ) · a( ) + b( )

a · b( ) = b(∅) · a( ) + b( ) · a( ) + b( ) · a( ) + b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + b( ) · a( )4 + 4b( ) · a( )3 + 6b( ) · a( )2
+4b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + b( ) · a( )3 + b( ) · a( ) · a( )2
+2b( ) · a( ) · a( ) + b( ) · a( ) + b( ) · a( )2
+2b( ) · a( )2 + 2b( ) · a( ) + b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + 2b( ) · a( ) · a( ) + b( ) · a( ) + b( ) · a( )2
2

+2b( ) · a( ) + 2b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + b( ) · a( )3 + 3b( ) · a( )2 + 3b( ) · a( )
+b( ) · a( ) + b( )

a · b( ) = b(∅) · a( ) + b( ) · a( ) · a( ) + b( ) · a( )2 + b( ) · a( )

+b( ) · a( ) + b( ) · a( ) + b( ) · a( ) + b( )

a · b( ) = b(∅) · a( ) + b( ) · a( ) · a( ) + b( ) · a( ) · a( ) + b( ) · a( )

+b( ) · a( ) + b( ) · a( )2 + b( ) · a( ) + b( ) · a( ) + b( )
9.4 Related B-Series Works 437

a · b( ) = b(∅) · a( ) + b( ) · a( ) + b( ) · a( ) + b( ) · a( )2

+2b( ) · a( ) + b( )

a · b( ) = b(∅) · a( ) + b( ) · a( ) + b( ) · a( ) + b( ) · a( )

+b( ) · a( ) + b( )
a · b( ) = b(∅) · a( ) + 2b( ) · a( )2 + b( ) · a( ) · a( ) + b( ) · a( )2
+b( ) · a( ) + 2b( ) · a( ) + b( ) · a( ) + b( )

Example 4.23. The substitution law ∗ defined in (4.11), for the trees of order ≤ 5:

b ∗ a( ) = a( )b( )
b ∗ a( ) = a( )b( ) + a( )b( )2
b ∗ a( ) = a( )b( ) + 2a( )b( )b( ) + a( )b( )3
b ∗ a( ) = a( )b( ) + 2a( )b( )b( ) + a( )b( )3
b ∗ a( ) = a( )b( ) + 3a( )b( )b( ) + 3a( )b( )2 b( ) + a( )b( )4
b ∗ a( ) = a( )b( ) + a( )b( )b( ) + a( )b( )2 + a( )b( )b( )
+2a( )b( )2 b( ) + a( )b( )2 b( ) + a( )b( )4
b ∗ a( ) = a( )b( ) + a( )b( )b( ) + 2a( )b( )a( ) + a( )b( )2 b( )
+2a( )b( )2 b( ) + a( )b( )4

b ∗ a( ) = a( )b( ) + 2a( )b( )b( ) + a( )b( )2 + 3a( )b( )2 b( )

+a( )b( )4
b ∗ a( ) = a( )b( ) + 4a( )b( )b( ) + 6a( )b( )2 b( )
+4a( )b( )3 b( ) + a( )b( )5
b ∗ a( ) = a( )b( ) + a( )b( )b( ) + 2a( )b( )b( ) + a( )b( )b( )
+a( )b( )2 b( ) + 2a( )b( )2 b( ) + a( )b( )2 b( )
+2a( )b( )b( )2 + 2a( )b( )3 b( ) + 2a( )b( )3 b( )
+a( )b( )5
b ∗ a( ) = a( )b( ) + 2a( )b( )b( ) + 2a( )b( )b( )
+a( )b( ) b( ) + 2a( )b( )2 b( ) + 3a( )b( )b( )2
2

+4a( )b( )3 b( ) + a( )b( )5



b ∗ a( ) = a( )b( ) + a( )b( )b( ) + 3a( )b( )b( )


+3a( )b( )2 b( ) + 3a( )b( )2 b( ) + a( )b( )3 b( )

+3a( )b( )3 b( ) + a( )b( )5

b ∗ a( ) = a( )b( ) + a( )b( )b( ) + a( )b( )b( ) + a( )b( )b( )

+a( )b( )b( ) + 2a( )b( )2 b( ) + a( )b( )b( )2

+a( )b( )b( )2 + a( )b( )2 b( ) + a( )b( )2 b( )

+a( )b( )3 b( ) + a( )b( )3 b( ) + 2a( )b( )3 b( )

+a( )b( )5

b ∗ a( ) = a( )b( ) + a( )b( )b( ) + a( )b( )b( ) + a( )b( )b( )

+a( )b( )b( ) + a( )b( )2 b( ) + 2a( )b( )b( )2

+a( )b( )b( )2 + 2a( )b( )2 b( ) + a( )b( )3 b( )

+3a( )b( )3 b( ) + a( )b( )5

b ∗ a( ) = a( )b( ) + a( )b( )b( ) + 2a( )b( )b( ) + a( )b( )b( )

+2a( )b( )b( )2 + 2a( )b( )2 b( )

+a( )b( )2 b( )

+a( )b( )2 b( ) + 2a( )b( )3 b( ) + 2a( )b( )3 b( )

+a( )b( )5

b ∗ a( ) = a( )b( ) + 2a( )b( )b( ) + 2a( )b( )b( ) + 3a( )b( )2 b( )

+3a( )b( )b( )2 + 4a( )b( )3 b( ) + a( )b( )5

b ∗ a( ) = a( )b( ) + 2a( )b( )b( ) + a( )b( )b( ) + a( )b( )b( )

+2a( )b( )2 b( ) + 2a( )b( )2 b( ) + 2a( )b( )b( )2

+a( )b( )3 b( ) + 2a( )b( )3 b( )

+a( )b( )3 b( ) + a( )b( )5


(4.22)

Example 4.24. The substitution law ∗ defined in (4.11), applied with the exact-flow coefficients e, for the trees of order ≤ 5:

b ∗ e( ) = e( )b( )
b ∗ e( ) = e( )b( ) + e( )b( )2

b ∗ e( ) = e( )b( ) + 2e( )b( )b( ) + e( )b( )3

b ∗ e( ) = e( )b( ) + 2e( )b( )b( ) + e( )b( )3

b ∗ e( ) = e( )b( ) + 3e( )b( )b( ) + 3e( )b( )2 b( ) + e( )b( )4

b ∗ e( ) = e( )b( ) + e( )b( )b( ) + e( )b( )2 + e( )b( )b( )

+2e( )b( )2 b( ) + e( )b( )2 b( ) + e( )b( )4

b ∗ e( ) = e( )b( ) + e( )b( )b( ) + 2e( )b( )e( ) + e( )b( )2 b( )

+2e( )b( )2 b( ) + e( )b( )4

b ∗ e( ) = e( )b( ) + 2e( )b( )b( ) + e( )b( )2 + 3e( )b( )2 b( )

+e( )b( )4
(4.23)

b ∗ e( ) = e( )b( ) + 4e( )b( )b( ) + 6e( )b( )2 b( )

+4e( )b( )3 b( ) + e( )b( )5

b ∗ e( ) = e( )b( ) + e( )b( )b( ) + 2e( )b( )b( ) + e( )b( )b( )

+e( )b( )2 b( ) + 2e( )b( )2 b( ) + e( )b( )2 b( )

+2e( )b( )b( )2 + 2e( )b( )3 b( ) + 2e( )b( )3 b( )

+e( )b( )5

b ∗ e( ) = e( )b( ) + 2e( )b( )b( ) + 2e( )b( )b( ) + e( )b( )2 b( )

+2e( )b( )2 b( ) + 3e( )b( )b( )2 + 4e( )b( )3 b( )

+e( )b( )5

b ∗ e( ) = e( )b( ) + e( )b( )b( ) + 3e( )b( )b( )

+3e( )b( )2 b( ) + 3e( )b( )2 b( ) + e( )b( )3 b( )

+3e( )b( )3 b( ) + e( )b( )5



b ∗ e( ) = e( )b( ) + e( )b( )b( ) + e( )b( )b( ) + e( )b( )b( )

+e( )b( )b( ) + 2e( )b( )2 b( ) + e( )b( )b( )2

+e( )b( )b( )2 + e( )b( )2 b( ) + e( )b( )2 b( )

+e( )b( )3 b( ) + e( )b( )3 b( ) + 2e( )b( )3 b( ) + e( )b( )5

b ∗ e( ) = e( )b( ) + e( )b( )b( ) + e( )b( )b( ) + e( )b( )b( )

+e( )b( )b( ) + e( )b( )2 b( ) + 2e( )b( )b( )2

+e( )b( )b( )2 + 2e( )b( )2 b( ) + e( )b( )3 b( )

+3e( )b( )3 b( ) + e( )b( )5

b ∗ e( ) = e( )b( ) + e( )b( )b( ) + 2e( )b( )b( ) + e( )b( )b( )

+2e( )b( )b( )2 + 2e( )b( )2 b( ) + e( )b( )2 b( )

+e( )b( )2 b( ) + 2e( )b( )3 b( ) + 2e( )b( )3 b( ) + e( )b( )5

b ∗ e( ) = e( )b( ) + 2e( )b( )b( ) + 2e( )b( )b( ) + 3e( )b( )2 b( )

+3e( )b( )b( )2 + 4e( )b( )3 b( ) + e( )b( )5

b ∗ e( ) = e( )b( ) + 2e( )b( )b( ) + e( )b( )b( ) + e( )b( )b( )

+2e( )b( )2 b( ) + 2e( )b( )2 b( ) + 2e( )b( )b( )2

+e( )b( )3 b( ) + 2e( )b( )3 b( ) + e( )b( )3 b( ) + e( )b( )5


Bibliography

[Bak05] H. F. Baker: Alternants and continuous groups. Proc. London Math. Soc., 3:24–47,
(1905).
[Bro00] Ch. Brouder: Runge–Kutta methods and renormalization. Euro. Phys. J. C, 12:521–
534, (2000).
[BSS96] J. C. Butcher and J. M. Sanz-Serna: The number of conditions for a Runge–Kutta
method to have effective order p. Appl. Numer. Math., 22:103–111, (1996).
[CEFM08] D. Calaque, K. Ebrahimi-Fard, and D. Manchon: Two Hopf algebra of trees inter-
acting. arXiv: 0806.2238 v 2, (2008).
[CHV05] P. Chartier, E. Hairer, and G. Vilmart: A substitution law for B-series vector fields.
Technical Report 5498, INRIA, (2005).
[CHV07] P. Chartier, E. Hairer, and G. Vilmart: Numerical integration based on modified
differential equations. Math. Comp., 76(260):1941–1953, (2007).
[CHV08] P. Chartier, E. Hairer, and G. Vilmart: Composing and substituting S-series and B-series of integrators and vector fields. Preprint, www.irisa.fr/ipso/fichiers/algebraic.pdf,
(2008).
[CK98] A. Connes and D. Kreimer: Hopf algebras, renormalization and noncommutative geometry. Communications in Mathematical Physics, 199:203–242, (1998).
[Dyn46] E. B. Dynkin: Normed Lie algebra and analytic groups, volume 1. Amer. Math. Soc.
(translation), (1946).
[Fen85] K. Feng: On difference schemes and symplectic geometry. In K. Feng, editor, Pro-
ceedings of the 1984 Beijing Symposium on Differential Geometry and Differential Equa-
tions, pages 42–58. Science Press, Beijing, (1985).
[Fen92] K. Feng: Formal power series and numerical methods for differential equations. In
T. Chan and Z. C. Shi, editors, International conf. on scientific computation, pages 28–35.
World Scientific, Singapore, (1992).
[Fen93a] K. Feng: Formal dynamical systems and numerical algorithms. In K. Feng and Z.
C. Shi, editors, International conf. on computation of differential equations and dynamical
systems, pages 1–10. World Scientific, Singapore, (1993).
[Fen93b] K. Feng: Symplectic, contact and volume preserving algorithms. In Z.C. Shi and
T. Ushijima, editors, Proc.1st China-Japan conf. on computation of differential equations
and dynamical systems, pages 1–28. World Scientific, Singapore, (1993).
[Hai94] E. Hairer: Backward analysis of numerical integrators and symplectic methods. Annals
of Numer. Math., 1:107–132, (1994).
[Hau06] F. Hausdorff: Die symbolische exponentialformel in der gruppentheorie. Berichte der
Sachsischen Akad. der Wissensch., 58:19–48, (1906).
[HLW02] E. Hairer, Ch. Lubich, and G. Wanner: Geometric Numerical Integration. Num-
ber 31 in Springer Series in Computational Mathematics. Springer-Verlag, Berlin, (2002).
[HW74] E. Hairer and G. Wanner: On the Butcher group and general multivalue methods.
Computing, 13:1–15, (1974).
[Lie88] S. Lie: Zur Theorie der Transformationsgruppen. Christiania, Gesammelte Abh., Christ. Forh. Aar., 13, (1888).

[Mur06] A. Murua: The Hopf algebra of rooted trees, free Lie algebra, and Lie series. Foun-
dations of Computational Mathematics, 6(4):387–426, (2006).
[Olv93] P. J. Olver: Applications of Lie Groups to Differential Equations. GTM 107. Springer-
Verlag, Berlin, Second edition, (1993).
[Ote91] J. A. Oteo: The Baker–Campbell–Hausdorff formula and nested commutator identi-
ties. J. of Math. Phys., 32(2):419–424, (1991).
[OW00] B. Owren and B. Welfert: The Newton iteration on Lie groups. BIT, 40(1):121–145,
(2000).
[Owr06] B. Owren: Order conditions for commutator-free Lie group methods. J. Phys. A:
Math. Gen., 39:5585–5599, (2006).
[Rei99] S. Reich: Backward error analysis for numerical integrators. SIAM J. Numer. Anal.,
36:475–491, (1999).
[SS96] J. M. Sanz-Serna: Backward Error Analysis for Symplectic Integrators. In J. E. Marsden, G. W. Patrick, and W. F. Shadwick, editors, Integration Algorithms and Classical Mechanics, pages 193–206. American Mathematical Society, New York, (1996).
[SS97] J. M. Sanz-Serna: Geometric integration. In The State of the Art in Numerical Analysis
(York, 1996), volume 63 of Inst. Math. Appl. Conf. Ser. New Ser., pages 121–143, Oxford
Univ. Press, New York, (1997).
Chapter 10.
Volume-Preserving Methods for Source-Free
Systems

Source-free dynamical systems are important in modern mechanics and physics and have broad applications. Designing a proper numerical method for such systems is therefore significant. It is well known that the phase flow of a source-free system is a volume-preserving transformation, so the transition operator of the numerical method we design should also be volume-preserving. We call such a method a volume-preserving algorithm.

10.1 Liouville’s Theorem


Let x = (x1, x2, · · · , xN)^T and f(x) = (f1(x), f2(x), · · · , fN(x))^T : R^N → R^N; then the dynamical system

dx/dt = f(x)    (1.1)

is source-free (i.e., divergence-free) when Σ_{i=1}^{N} ∂fi/∂xi = 0 (i.e., div f(x) = 0). The flow of a source-free system is volume-preserving, i.e.,

det (e^{tf}(x))∗ = 1,    ∀ x, t,

where e^{tf} denotes the flow of system (1.1) and (e^{tf}(x))∗ the Jacobian of e^{tf} at x. Thus, volume-preserving schemes are required for computing the numerical solution of (1.1). If det (∂x_{n+1}/∂x_n) = 1, we call the scheme volume-preserving, where x_n denotes the numerical solution at step n.
We know that the phase flow of a Hamiltonian system preserves phase volume. Since source-free systems are more general than Hamiltonian systems, we must prove that their phase flow also preserves phase volume. Considering the dynamical system (1.1), its phase flow is

g^t(x) = x + f(x) t + O(t²).    (1.2)

Let D(0) be a region in x-space and V(0) its volume; then

V(t) = volume of D(t),    D(t) = g^t D(0).



Theorem 1.1 (Liouville's Theorem). If div f = 0, then g^t preserves volume: V(t) = V(0).
Proof. First we prove

dV(t)/dt |_{t=0} = ∫_{D(0)} div f dx.    (1.3)

For any t, using the formula for changing variables in a multiple integral gives

V(t) = ∫_{D(0)} det (∂g^t x / ∂x) dx.

Calculating ∂g^t x / ∂x by formula (1.2), we find

∂g^t x / ∂x = E + (∂f/∂x) t + O(t²), as t → 0,

but

det (E + At) = 1 + t tr A + O(t²), t → 0,

where tr A = Σ_{i=1}^{n} a_ii. Therefore

V(t) = ∫_{D(0)} [1 + t div f + O(t²)] dx,    (1.4)

dV(t)/dt |_{t=0} = ∫_{D(0)} div f dx.

This proves Equation (1.3). The choice t = t0 is no worse than t = 0, so for any t0

dV(t)/dt |_{t=t0} = ∫_{D(t0)} div f dx,

and if div f = 0, then dV(t)/dt = 0. This completes the proof.
In particular, for the Hamiltonian equations,

div f = (∂/∂p)(−∂H/∂q) + (∂/∂q)(∂H/∂p) = 0,

so Liouville's theorem holds in the Hamiltonian case as a special instance.
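For a linear source-free system ẋ = Ax with tr A = 0, Liouville's theorem reduces to the identity det(exp(tA)) = e^{t·tr A} = 1, which can be checked numerically. A minimal sketch; the traceless matrix A and the truncated-series matrix exponential are our own illustrative choices, not from the text:

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=30):
    """exp(A) by truncated Taylor series (adequate for small ||A||)."""
    n = len(A)
    E = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    P = [row[:] for row in E]
    fact = 1.0
    for k in range(1, terms):
        P = mat_mul(P, A)          # P = A^k
        fact *= k                  # fact = k!
        for i in range(n):
            for j in range(n):
                E[i][j] += P[i][j] / fact
    return E

def det3(M):
    return (M[0][0]*(M[1][1]*M[2][2]-M[1][2]*M[2][1])
          - M[0][1]*(M[1][0]*M[2][2]-M[1][2]*M[2][0])
          + M[0][2]*(M[1][0]*M[2][1]-M[1][1]*M[2][0]))

# a traceless matrix: tr A = 0.3 + 0.1 - 0.4 = 0
A = [[0.3, 1.0, 0.2],
     [0.5, 0.1, -0.7],
     [0.4, 0.6, -0.4]]
t = 0.8
tA = [[t * x for x in row] for row in A]
print(det3(mat_exp(tA)))  # close to 1.0
```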

10.2 Volume-Preserving Schemes


10.2.1 Conditions for the Centered Euler Method to be Volume-Preserving

Let us consider the centered Euler scheme

x_{n+1} = x_n + τ f((x_{n+1} + x_n)/2),    (2.1)

where τ is the step size in t. We then have

∂x_{n+1}/∂x_n = I_N + τ Df((x_{n+1} + x_n)/2) [ (1/2)(∂x_{n+1}/∂x_n) + (1/2) I_N ],

so

∂x_{n+1}/∂x_n = [I_N − (τ/2) Df(x*)]^{−1} [I_N + (τ/2) Df(x*)].

Here Df = f_x = ∂f/∂x evaluated at x* = (x_{n+1} + x_n)/2, and we write Df(x*) ≡ B = (b_ij). The condition det(∂x_{n+1}/∂x_n) = 1 now requires

det[I_N + (τ/2) Df(x*)] / det[I_N − (τ/2) Df(x*)] = 1.

Let P(λ) = |Df(x*) − λ I_N| be the characteristic polynomial of Df(x*). Since

|I_N + (τ/2) Df(x*)| = (τ/2)^N |Df(x*) + (2/τ) I_N| = (τ/2)^N P(−2/τ),
|I_N − (τ/2) Df(x*)| = (−1)^N (τ/2)^N |Df(x*) − (2/τ) I_N| = (−1)^N (τ/2)^N P(2/τ),

we then get the condition for scheme (2.1) to be volume-preserving[QZ93], i.e.,

P(λ) = (−1)^N P(−λ).

Let us consider some particular cases of N to show that scheme (2.1) is not always volume-preserving.
Case 2.1 (N = 2). In this case, we have

P(λ) = λ² + (b11 + b22)λ + b11 b22 − b12 b21.    (2.2)

Since Σ_{i=1}^{N} ∂fi/∂xi = 0, i.e., tr B = 0, we have P(λ) = λ² + b11 b22 − b12 b21, and

P(−λ) = P(λ).

Thus, scheme (2.1) is always volume-preserving for source-free systems of dimension 2.
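This two-dimensional conclusion is easy to test numerically: for any divergence-free planar field, one centered Euler step should have Jacobian determinant 1. A sketch using an illustrative stream function ψ = x1²x2 (our own choice, not from the text); the implicit step is solved by fixed-point iteration and the Jacobian is estimated by central differences:

```python
def f(x):
    # stream function psi = x1^2 * x2  =>  f = (-x1^2, 2*x1*x2), div f = 0
    x1, x2 = x
    return (-x1 * x1, 2.0 * x1 * x2)

def midpoint_step(x, tau, tol=1e-14):
    """One centered Euler (implicit midpoint) step via fixed-point iteration."""
    y = x
    for _ in range(100):
        fx = f(((x[0] + y[0]) / 2.0, (x[1] + y[1]) / 2.0))
        y_new = (x[0] + tau * fx[0], x[1] + tau * fx[1])
        if max(abs(y_new[0] - y[0]), abs(y_new[1] - y[1])) < tol:
            return y_new
        y = y_new
    return y

def jacobian_det(x, tau, h=1e-6):
    # central differences of the one-step map, column by column
    cols = []
    for k in range(2):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        yp = midpoint_step(tuple(xp), tau)
        ym = midpoint_step(tuple(xm), tau)
        cols.append(((yp[0] - ym[0]) / (2 * h), (yp[1] - ym[1]) / (2 * h)))
    return cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1]

print(jacobian_det((0.4, 0.7), 0.1))  # close to 1.0
```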
Case 2.2 (N = 3). Here

P(λ) = −λ³ + (b11 + b22 + b33)λ² − cλ + |B| = −λ³ − cλ + |B|,    (2.3)

where

c = det[ b11 b12 ; b21 b22 ] + det[ b22 b23 ; b32 b33 ] + det[ b11 b13 ; b31 b33 ].

The volume-preserving condition for the centered Euler method is now |B| = 0. For example, when system (1.1) takes the form

dx/dt = cy − bz,
dy/dt = az − cx,        a, b, c ∈ R,
dz/dt = bx − ay,

we have |B| = 0. For this dynamical system, the centered Euler method is volume-preserving.
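For this linear example, the one-step map of scheme (2.1) is G_τ = (I − (τ/2)B)⁻¹(I + (τ/2)B), so volume preservation amounts to det(I + (τ/2)B) = det(I − (τ/2)B). A sketch with arbitrary sample values of a, b, c (our choices, not from the text):

```python
def det3(M):
    return (M[0][0]*(M[1][1]*M[2][2]-M[1][2]*M[2][1])
          - M[0][1]*(M[1][0]*M[2][2]-M[1][2]*M[2][0])
          + M[0][2]*(M[1][0]*M[2][1]-M[1][1]*M[2][0]))

a, b, c, tau = 0.3, -0.8, 1.1, 0.25
B = [[0.0,  c,  -b],
     [-c,  0.0,  a],
     [ b,  -a, 0.0]]          # skew-symmetric, so det B = 0
I = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
P = [[I[i][j] + tau / 2 * B[i][j] for j in range(3)] for i in range(3)]
M = [[I[i][j] - tau / 2 * B[i][j] for j in range(3)] for i in range(3)]
print(det3(P) / det3(M))  # 1.0 up to rounding: the scheme is volume-preserving
```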
Lemma 2.3. Let P(λ) be the characteristic polynomial of a matrix A_{N×N}; then

P(λ) = |A − λI_N| = (−1)^N [ λ^N − P1 λ^{N−1} + P2 λ^{N−2} − · · · + (−1)^N P_N ],    (2.4)

where

P1 = Σ_{i=1}^{N} a_ii = tr A,
P2 = Σ_{i<j} det[ a_ii a_ij ; a_ji a_jj ],
P3 = Σ_{i<j<k} det[ a_ii a_ij a_ik ; a_ji a_jj a_jk ; a_ki a_kj a_kk ],    (2.5)
· · ·
P_N = |A|,

the sums running over the principal minors of orders 1, 2, 3, · · · , N.
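Lemma 2.3 can be sanity-checked numerically by evaluating |A − λI| directly and through the principal-minor sums P_k at a sample λ; the 4 × 4 matrix and the value of λ below are arbitrary test choices, not from the text:

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion (adequate for small matrices)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def principal_minor_sum(A, k):
    """P_k: sum of all k-by-k principal minors of A."""
    n = len(A)
    return sum(det([[A[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), k))

A = [[2.0, -1.0, 0.5, 3.0],
     [1.0, 0.0, -2.0, 1.5],
     [0.5, 2.5, 1.0, -1.0],
     [-3.0, 1.0, 0.0, 2.0]]
N, lam = 4, 0.7
P = [principal_minor_sum(A, k) for k in range(1, N + 1)]
direct = det([[A[i][j] - lam * (i == j) for j in range(N)] for i in range(N)])
via_minors = (-1) ** N * (lam ** 4 - P[0] * lam ** 3 + P[1] * lam ** 2
                          - P[2] * lam + P[3])
print(abs(direct - via_minors))  # essentially zero
```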
Using Lemma 2.3, we can discuss the case N = 4.
Case 2.4 (N = 4). At this time,
P(λ) = λ⁴ − P1 λ³ + P2 λ² − P3 λ + |B|.
Since P1 = tr B = 0, the condition P(−λ) = (−1)⁴ P(λ) requires P3 = 0.
It must be pointed out that as N increases, an increasing number of conditions is required for scheme (2.1) to be volume-preserving, and it seems impossible to satisfy all these conditions. Fortunately, in the special case when system (1.1) is Hamiltonian, i.e.,

f = J∇H,    J = [ O  −I_k ; I_k  O ],    N = 2k,

scheme (2.1) is volume-preserving. This is because the Hamiltonian system is source-free and Df is an infinitesimal symplectic matrix; we have the following lemma.

Lemma 2.5. Let M be an infinitesimal symplectic matrix. If λ is an eigenvalue of M, so are −λ, λ̄, and −λ̄.

From Lemma 2.5, we know that P(−λ) = (−1)^{2k} P(λ) is valid when system (1.1) is Hamiltonian, so the centered Euler method is volume-preserving for Hamiltonian systems. In fact, the method is even symplectic for Hamiltonian systems; that is, it also preserves the symplectic structure of Hamiltonian systems, which is a much stronger property than volume preservation.

10.2.2 Separable Systems and Volume-Preserving Explicit Methods

In this section, we consider a special kind of source-free systems called separable systems. System (1.1) is separable if

dx_i/dt = f_i(x1, x2, · · · , x_{i−1}, x_{i+1}, · · · , xN),    i = 1, 2, · · · , N.    (2.6)

We can divide the above system into N source-free systems:

dx1/dt = f1(x2, · · · , xN),    dx2/dt = 0,    · · ·,    dxN/dt = 0;    (2.7)

dx1/dt = 0,    dx2/dt = f2(x1, x3, · · · , xN),    · · ·,    dxN/dt = 0;    (2.8)

· · ·

dx1/dt = 0,    · · ·,    dx_{N−1}/dt = 0,    dxN/dt = fN(x1, · · · , x_{N−1}).    (2.9)

The first-order explicit Euler method can be applied to each of them to obtain its exact solution, i.e., its phase flow. Using the composition method [QZ92], we can construct a first-order explicit volume-preserving scheme for system (2.6). The adjoint of this scheme is obtained from the implicit Euler method and is also explicit. Composing these two schemes, we get a reversible explicit scheme. This process can be expressed by formal power series as shown below.
From Chapter 9, we know the flow of (1.1) can be represented by a power series:

e^τ_f = 1_N + Σ_{k=1}^{∞} τ^k e_{k,f},    e_{k,f} : R^N → R^N,    e_{k,f} = (1/k!) f^{∗k} 1_N,

where f^∗ denotes the first-order differential operator f^∗ = Σ_{i=1}^{N} f_i ∂/∂x_i, f^{∗2} = f^∗ × f^∗, f^{∗3} = f^∗ × f^∗ × f^∗, · · ·, and 1_N is the identity vector function, 1_N(x) = x. For simplicity, we just write out

e^τ_A · e^τ_B = e^τ_{c_τ},    (2.10)

whose first several terms are

c_τ = A + B + (τ/2)[A, B] + o(τ²),

where [A, B] = A_∗B − B_∗A is the Lie bracket of A and B, and A_∗, B_∗ denote the Jacobian matrices of A and B.
We now rewrite the system of Equations (2.7) – (2.9) in compact form as

dx/dt = a_i(x),    a_i = (0, · · · , 0, f_i, 0, · · · , 0)^T,    i = 1, 2, · · · , N.    (2.11)
These integrable systems have flows

e^τ_{a_i} = 1_N + Σ_{k=1}^{∞} τ^k e_{k,a_i},    i = 1, 2, · · · , N.    (2.12)

Since a_i^{∗k} 1_N(x) = 0 when k ≥ 2, we have

e^τ_{a_i}(x) = x + Σ_{k=1}^{∞} τ^k e_{k,a_i}(x) = x + Σ_{k=1}^{∞} (τ^k/k!) a_i^{∗k} 1_N(x) = x + τ a_i(x).    (2.13)

Using the formula (2.10), we find

eτaN × eτaN −1 × · · · × eτa2 × eτa1 = eτf +o(τ ) . (2.14)

This means the concatenation eτaN × eτaN −1 × · · · × eτa1 approximates the flow eτf to
the first order of τ .
Because the equations in system (2.11) are all source-free, their flows are all volume-preserving, and their concatenation remains volume-preserving, so

det ((e^τ_{aN} × e^τ_{aN−1} × · · · × e^τ_{a1})(x))∗
= det (e^τ_{aN}(x_{N−1}))∗ × det (e^τ_{aN−1}(x_{N−2}))∗ × · · · × det (e^τ_{a1}(x_0))∗ = 1,

where x_0 = x, x_1 = e^τ_{a1}(x_0), · · ·, x_{N−1} = e^τ_{aN−1}(x_{N−2}), x_N = e^τ_{aN}(x_{N−1}).


Thus, from system (2.6) we get a volume-preserving scheme of first order. This is an explicit scheme, since the e^τ_{a_i} (i = 1, · · · , N) are flows of integrable systems and can be written as (2.13). From [QZ92], we know that composing the concatenation e^τ_{aN} × e^τ_{aN−1} × · · · × e^τ_{a1} with its adjoint e^τ_{a1} × e^τ_{a2} × · · · × e^τ_{aN} produces the reversible scheme

e^{τ/2}_{aN} × e^{τ/2}_{aN−1} × · · · × e^τ_{a1} × · · · × e^{τ/2}_{aN−1} × e^{τ/2}_{aN}

of second order, which is still explicit. We can use the theory of composition [QZ92] to construct volume-preserving schemes of arbitrary order.
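The construction above can be sketched concretely. For a separable three-dimensional system, each substep of the first-order scheme updates one coordinate using only the others (an explicit shear with Jacobian determinant exactly 1), so the composed step is volume-preserving. A minimal sketch with an illustrative separable field (our own example, not from the text):

```python
import math

def step(x, tau):
    """First-order explicit volume-preserving splitting step."""
    x1, x2, x3 = x
    x1 = x1 + tau * math.sin(x2) * x3      # f1(x2, x3)
    x2 = x2 + tau * (x1 * x1 - x3)         # f2(x1, x3), uses updated x1
    x3 = x3 + tau * math.cos(x1 + x2)      # f3(x1, x2), uses updated x1, x2
    return (x1, x2, x3)

def jacobian_det(x, tau, h=1e-6):
    # estimate the Jacobian of the one-step map by central differences
    cols = []
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        yp, ym = step(tuple(xp), tau), step(tuple(xm), tau)
        cols.append([(yp[i] - ym[i]) / (2 * h) for i in range(3)])
    J = [[cols[j][i] for j in range(3)] for i in range(3)]
    return (J[0][0]*(J[1][1]*J[2][2]-J[1][2]*J[2][1])
          - J[0][1]*(J[1][0]*J[2][2]-J[1][2]*J[2][0])
          + J[0][2]*(J[1][0]*J[2][1]-J[1][1]*J[2][0]))

print(jacobian_det((0.3, -0.5, 1.2), 0.1))  # close to 1.0
```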

10.3 Source-Free System


Source-free dynamical systems on the Euclidean space Rn are defined by source-free
(or divergence-free) vector fields a : Rn → Rn ,


n
∂ ai (x)
div a(x) = = 0, ∀x ∈ Rn , (3.1)
i=1
∂ xi

through equations
dx
= ẋ = a(x), (3.2)
dt
here and hereafter, we use the coordinate description and matrix notation

x = (x1 , · · · , xn )T , a(x) = (a1 (x), · · · , an (x))T , (3.3)

where T denotes the transpose of a matrix.


In this subsection, we mainly analyze and construct numerical algorithms proper for source-free systems. Such systems constitute one of the most important classical cases of dynamical systems preserving a certain geometric structure, and they arise in many physical problems such as particle tracking in incompressible fluids and toroidal magnetic surface generation in stellarators. Because of the difficulty, and even impossibility, of solving the equations by quadrature, numerical methods certainly play an important role in understanding the dynamical behavior of a system and in solving physical and engineering problems. On the other hand, whether a numerical algorithm is proper for a system is closely related to whether the algorithmic approximation to the corresponding phase flow approximates it well in some sense, and even strictly preserves the structure of the system itself if the system has such structure. It has been evidenced with some typical examples in the Hamiltonian case that "nonproper" algorithms will result in essentially wrong approximations to the solutions of systems, while "proper" algorithms may generate remarkably right ones.
But how does one evaluate a numerical algorithm to be proper for source-free
systems? It is well known that intrinsic to all source-free systems there is a volume
form of the phase space Rn , say

α = dx1 ∧ dx2 ∧ · · · ∧ dxn (3.4)

such that the evolution of dynamics preserves this form. In other words, the phase flow
eta , of source-free system (3.2), satisfies the volume-preserving condition

(eta )∗ α = α, (3.5)

or equivalently,

det (∂e^t_a(x)/∂x) = 1,    ∀ x ∈ R^n, t ∈ R.    (3.6)
In addition to this, e^t_a satisfies the group property in t,

e^0_a = identity,    e^{t+s}_a = e^t_a ◦ e^s_a.    (3.7)

In fact, (3.5) and (3.7) completely describe the properties of the most general source-free dynamical systems. This fact suggests that a proper algorithmic approximation g^s_a to the phase flow e^s_a of a source-free vector field a : R^n → R^n should satisfy these two requirements. However, the group property (3.7) is in general too stringent for algorithmic approximations, because only phase flows satisfy it. Instead, a weaker requirement, i.e.,

g^0_a = identity,    g^s_a ◦ g^{−s}_a = identity,    (3.8)

is reasonable and practicable for all vector fields a : R^n → R^n. We call such algorithmic approximations revertible; this means g^s_a always generates coincident forward and backward orbits.
As for the volume-preserving property (3.5), it characterizes the geometric structure (the volume-preserving structure) of source-free systems. Our aim here is precisely to construct difference schemes preserving this structure, which we call volume-preserving schemes, in the sense that the algorithmic approximations to the phase flows satisfy (3.5) for the most general source-free systems.

10.4 Obstruction to Analytic Methods


We note that for n = 2, source-free vector fields coincide with Hamiltonian fields and area-preserving maps with symplectic maps, so the problem of area-preserving algorithms has been solved in principle.
But for n ≥ 3 the problem is new, since all the conventional methods, and even the symplectic methods, are generally not volume-preserving, even for linear source-free systems. As an illustration, see the example and lemma of Feng and Shang [FS95].

Example 4.1. Solve on R³

dx/dt = a(x) = Ax,    tr A = 0,    (4.1)

by the centered Euler method; we get the algorithmic approximation G_s to e^s_a = exp (sA) with

G_s = (I − (s/2)A)^{−1} (I + (s/2)A).    (4.2)

Simple calculations show that in 3 dimensions, if tr A = 0, then det G_s = 1 ⇔ det A = 0, which is exceptional. A more general conclusion in the linear case is:
Lemma 4.2. Let sl(n) denote the set of all n × n real matrices with trace equal to zero, and SL(n) the set of all n × n real matrices with determinant equal to one. Let φ(z) be a real analytic function defined in a neighborhood of z = 0 in C satisfying the conditions:
1° φ(0) = 1;
2° φ̇(0) = 1.
Then φ(sl(n)) ⊂ SL(n) for some n ≥ 3 if and only if φ(z) = exp (z).
Proof. The "if" part is a known conclusion; for the "only if" part it suffices to show it for n = 3. For this, we consider matrices of the diagonal form

D(s, t) = diag (s, t, −(s + t)) ∈ sl(3),    s, t ∈ R.    (4.3)

Since φ is analytic in a neighborhood of the origin in C, we have

φ(D(s, t)) = diag (φ(s), φ(t), φ(−(s + t))),    s, t near 0.    (4.4)
By assumption, det φ(D(s, t)) = 1 for s, t near 0. So

φ(s)φ(t)φ(−(s + t)) = 1,    s, t near 0.    (4.5)

Together with the condition φ(0) = 1, this gives

φ(s)φ(−s) = 1,    s near 0.    (4.6)

Multiplying both sides of Equation (4.5) by φ(s + t) and using (4.6), we get

φ(s)φ(t) = φ(s + t),    s, t near 0.    (4.7)

This, together with conditions 1° and 2° of the lemma, implies

φ(z) = exp (z),

which completes the proof.

Lemma 4.2 says that, other than the exponential itself, there is no consistent analytic approximation to the exponential function sending sl(n) into SL(n). This shows that it is impossible to construct volume-preserving algorithms depending analytically on source-free vector fields. Thus we have:
Theorem 4.3 (Feng–Shang). All the conventional methods, including the well-known Runge–Kutta methods, linear multistep methods, and Euler methods (explicit, implicit and centered), are non-volume-preserving.
The above lemma tell us we cannot construct volume-preserving scheme for all
source-free system. But we can split class sl(n) to subclass and perhaps in subclass,
there exists volume-preserving scheme.
In Subsection 10.2.1, we get some condition for centered Euler scheme to be
volume-preserving scheme. It is the best elucidation.
Consequently, to construct volume-preserving algorithms for source-free systems,
we must break through the conventional model and explore new ways.
10.5 Decompositions of Source-Free Vector Fields

In R^2, every source-free field a = (a_1, a_2)^T corresponds to a stream function, or 2-dimensional Hamiltonian, ψ, unique up to a constant:

    a_1 = −∂ψ/∂x_2,   a_2 = ∂ψ/∂x_1.   (5.1)
In R^3, every source-free field a = (a_1, a_2, a_3)^T corresponds to a vector potential b = (b_1, b_2, b_3)^T, unique up to a gradient:

    a = curl b:   a_1 = ∂b_3/∂x_2 − ∂b_2/∂x_3,
                  a_2 = ∂b_1/∂x_3 − ∂b_3/∂x_1,   (5.2)
                  a_3 = ∂b_2/∂x_1 − ∂b_1/∂x_2;

then we get the source-free decomposition

    a = (a_1, a_2, a_3)^T
      = (0, ∂b_1/∂x_3, −∂b_1/∂x_2)^T + (−∂b_2/∂x_3, 0, ∂b_2/∂x_1)^T + (∂b_3/∂x_2, −∂b_3/∂x_1, 0)^T
      = a^(1) + a^(2) + a^(3).   (5.3)

As a generalization of the cases n = 2, 3, on R^n we have[FS95]:

Lemma 5.1. To every source-free field a = (a_1, a_2, ···, a_n)^T, there corresponds a skew-symmetric tensor field of order 2, b = (b_ik)_{1≤i,k≤n}, b_ik = −b_ki, so that

    a_i = Σ_{k=1}^n ∂b_ik/∂x_k,   i = 1, 2, ···, n.   (5.4)
Proof. With the given a = (a_1, ···, a_n)^T, we define the 1-form on R^n

    α = Σ_{i=1}^n a_i(x) dx_i.   (5.5)

Since a is source-free, we have

    δα = −Σ_{i=1}^n ∂a_i/∂x_i = −div a = 0,

where δ is the codifferential operator. The above equation means that α is δ-closed. By Poincaré's lemma, there exists a 2-form, say β, so that

    α = δβ.   (5.6)

But for the 2-form β, there exists a skew-symmetric tensor of order 2, b = (b_ik)_{1≤i,k≤n}, b_ik = −b_ki, so that

    β = Σ_{i,k=1}^n b_ik dx_i ∧ dx_k.   (5.7)

Taking the codifferential of (5.7),

    δβ = Σ_{i=1}^n ( Σ_{k=1}^n ∂b_ik/∂x_k ) dx_i,   (5.8)

and from Equations (5.5) and (5.6), we get (5.4). The proof is completed. □

By (5.4), we can decompose

    a = Σ_{1≤i<k≤n} a^(ik),   a^(ik) = (0, ···, 0, ∂b_ik/∂x_k, 0, ···, 0, −∂b_ik/∂x_i, 0, ···, 0)^T,   i < k,   (5.9)

where the two nonzero entries stand in the i-th and k-th positions. Every vector field a^(ik) in (5.9) is a 2-dimensional Hamiltonian field on the x_i-x_k plane and zero in the other dimensions. We call such decompositions essentially Hamiltonian decompositions.
We note that the tensor potential b = (b_ik)_{1≤i,k≤n} determined by Equation (5.4) from a given source-free field a = (a_1, ···, a_n)^T is far from unique. For uniqueness, one may impose normalizing conditions in many different ways. One way is to impose, as done by H. Weyl[Wey40] in the 3-dimensional case:

    N_0:   b_ik = 0,   |i − k| ≥ 2;   (5.10)

this condition is ineffective for n = 2. The nonzero components are then

    b_12 = −b_21,   b_23 = −b_32,   ···,   b_{n−1,n} = −b_{n,n−1}.   (5.11)

Further impose

    N_k:   b_{k,k+1}|_{x_{k+1}=0} = 0,   1 ≤ k ≤ n − 2   (5.12)

(this condition is ineffective for n = 2), and

    N_{n−1}:   b_{n−1,n}|_{x_{n−1}=x_n=0} = 0.   (5.13)

Then, simple calculations show that all bk,k+1 are uniquely determined by quadra-
ture
- x2
b12 = a1 d x2 , (5.14)
-0 xk+1  
∂b
bk,k+1 = ak + k−1,k d xk+1 , 2 ≤ k ≤ n − 2, (5.15)
∂ xk−1
-0 xn % & - xn−1
∂b
bn−1,n = an−1 + n−2,n−1 d xn − an |xn =0 d xn−1 . (5.16)
0 ∂ xn−2 0

So, one gets an essentially Hamiltonian decomposition for a as


n−1 % &T
∂ bk,k+1 ∂b
a= a(k) , a(k) = 0, · · · , 0, , − k,k+1 , 0, · · · , 0 , (5.17)
∂ xk+1 ∂ xk
k=1

or in components,

    a_1 = ∂b_12/∂x_2,
    a_2 = −∂b_12/∂x_1 + ∂b_23/∂x_3,
    ···   (5.18)
    a_{n−1} = −∂b_{n−2,n−1}/∂x_{n−2} + ∂b_{n−1,n}/∂x_n,
    a_n = −∂b_{n−1,n}/∂x_{n−1}.
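The quadratures (5.14)–(5.16) are easy to carry out concretely. Below is a sketch (the example field is our own choice, not from the text): for n = 3 and the source-free field a = (x_2 + x_3², x_3 + x_1², x_1 + x_2²), the quadratures give b_12 and b_23 in closed form, and a finite-difference check confirms the component relations (5.18).

```python
import numpy as np

# example source-free field (our own choice): div a = 0
def a(x):
    x1, x2, x3 = x
    return np.array([x2 + x3 ** 2, x3 + x1 ** 2, x1 + x2 ** 2])

# tensor potentials obtained by carrying out (5.14) and (5.16) by hand:
def b12(x):
    x1, x2, x3 = x
    return 0.5 * x2 ** 2 + x3 ** 2 * x2          # = int_0^{x2} a1 dx2

def b23(x):
    x1, x2, x3 = x
    # = int_0^{x3} (a2 + db12/dx1) dx3 - int_0^{x2} a3|_{x3=0} dx2
    return 0.5 * x3 ** 2 + x1 ** 2 * x3 - x1 * x2 - x2 ** 3 / 3.0

def d(f, x, i, h=1e-6):
    """Central finite difference of f with respect to x_i."""
    e = np.zeros(3)
    e[i] = h
    return (f(x + e) - f(x - e)) / (2.0 * h)

x = np.array([0.3, -0.7, 1.1])
# reconstruct a from the potentials via (5.18)
recon = np.array([
    d(b12, x, 1),                    # a1 =  db12/dx2
    -d(b12, x, 0) + d(b23, x, 2),    # a2 = -db12/dx1 + db23/dx3
    -d(b23, x, 1),                   # a3 = -db23/dx2
])
```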

10.6 Construction of Volume-Preserving Schemes


In this section, we give a general way to construct volume-preserving difference schemes for source-free systems by means of the essentially Hamiltonian decompositions of source-free vector fields and symplectic difference schemes for 2-dimensional Hamiltonian systems. With this aim, we first prove:
Lemma 6.1. Let a be a smooth vector field on R^n with decomposition

    a = Σ_{i=1}^m a^(i),   (6.1)

with smooth fields a^(i): R^n → R^n (i = 1, ···, m). Suppose that, for each i = 1, ···, m, G_i^τ is an approximation of order p to e^{τa^(i)}, the phase flow of the system associated to the field a^(i), in the sense that lim_{τ→0} τ^{−p} (G_i^τ(x) − e^{τa^(i)}(x)) = 0 for all x ∈ R^n, with some p ≥ 1. Then we have:
1° For any permutation (i_1 i_2 ··· i_m) of (1 2 ··· m), the compositions

    1G^τ_{i_1 i_2 ··· i_m} := G^τ_{i_m} ∘ ··· ∘ G^τ_{i_2} ∘ G^τ_{i_1},   1Ĝ^τ_{i_1 i_2 ··· i_m} := (1G^{−τ}_{i_1 i_2 ··· i_m})^{−1}   (6.2)

are approximations of order one to e^{τa}; and the compositions

    2g^τ_{i_1 i_2 ··· i_m} := 1Ĝ^{τ/2}_{i_1 i_2 ··· i_m} ∘ 1G^{τ/2}_{i_1 i_2 ··· i_m},   2ĝ^τ_{i_1 i_2 ··· i_m} := 1G^{τ/2}_{i_1 i_2 ··· i_m} ∘ 1Ĝ^{τ/2}_{i_1 i_2 ··· i_m}   (6.3)

are revertible approximations of order 2 to e^{τa};

2° If, for each i = 1, 2, ···, m, G_i^τ is an approximation of order 2 to e^{τa^(i)}, then

    2G^τ_{i_1 i_2 ··· i_m} := G^{τ/2}_{i_m} ∘ ··· ∘ G^{τ/2}_{i_2} ∘ G^{τ/2}_{i_1} ∘ G^{τ/2}_{i_1} ∘ G^{τ/2}_{i_2} ∘ ··· ∘ G^{τ/2}_{i_m}   (6.4)

is an approximation of order 2 to e^{τa}; and it is revertible if each G_i^τ is revertible;

3° If 2G^τ is a revertible approximation of order 2 to e^{τa}, then the symmetric composition[QZ92]

    4G^τ = 2G^{α_1 τ} ∘ 2G^{β_1 τ} ∘ 2G^{α_1 τ}   (6.5)

with

    α_1 = (2 − 2^{1/3})^{−1},   β_1 = 1 − 2α_1 < 0,   (6.6)

is a revertible approximation of order 4 to e^{τa}; and generally the symmetric composition, defined recursively by

    2(l+1)G^τ = 2lG^{α_l τ} ∘ 2lG^{β_l τ} ∘ 2lG^{α_l τ},   (6.7)

with

    α_l = (2 − 2^{1/(2l+1)})^{−1},   β_l = 1 − 2α_l < 0,   (6.8)

is a revertible approximation of order 2(l + 1) to e^{τa}.

Proof. It suffices to prove the statements for (i_1 i_2 ··· i_m) = (1 2 ··· m).

1° It is easy to prove that the phase flow e^{ta} has the series expansion

    e^{ta}(x) = x + Σ_{k=1}^∞ (t^k/k!) a_k(x),   x ∈ R^n,   t ∼ 0,   (6.9)

where

    a_1(x) = a(x),   a_2(x) = (∂a_1(x)/∂x) a(x),   ···,   a_k(x) = (∂a_{k−1}(x)/∂x) a(x),   k = 2, 3, ···.   (6.10)

The assumption that, for i = 1, 2, ···, m, G_i^τ are approximations of order p ≥ 1 to e^{τa^(i)} implies that, for all x ∈ R^n,

    G_i^τ(x) = x + τ a^(i)(x) + O(τ²),   τ ∼ 0,   i = 1, 2, ···, m.   (6.11)

So, by Taylor expansion, we have for x ∈ R^n

    (G_2^τ ∘ G_1^τ)(x) = G_2^τ(G_1^τ(x)) = x + τ (a^(1)(x) + a^(2)(x)) + O(τ²),   τ ∼ 0.   (6.12)

By induction on m, we get

    1G^τ_{(12···m)}(x) = (G_m^τ ∘ ··· ∘ G_2^τ ∘ G_1^τ)(x)
                       = x + τ (a^(1)(x) + a^(2)(x) + ··· + a^(m)(x)) + O(τ²)
                       = x + τ a(x) + O(τ²),   τ ∼ 0.   (6.13)

This implies that 1G^τ_{(12···m)} is an approximation of order one to e^{τa}, which is what we needed. It is shown in[QZ92] that 2g^τ_{i_1 i_2 ··· i_m} and 2ĝ^τ_{i_1 i_2 ··· i_m}, defined by Equation (6.3), are revertible approximations of order 2 to e^{τa}; thus conclusion 1° of Lemma 6.1 is proved.
2° By assumption, we have for x ∈ R^n and τ ∼ 0

    G_i^τ(x) = x + τ a^(i)(x) + (1/2) τ² (a^(i))²(x) + O(τ³),   i = 1, 2, ···, m.   (6.14)

Taylor expansion of the right hand side of Equation (6.4) with (i_1 i_2 ··· i_m) = (1 2 ··· m) yields

    2G^τ_{(12···m)}(x) = x + τ Σ_{i=1}^m a^(i)(x) + (1/2) τ² ( Σ_{i,j=1}^m a^(i) a^(j) )(x) + O(τ³),   τ ∼ 0.   (6.15)

Here we have used the convention

    (ab)(x) = (a_* b)(x) = a_*(x) b(x),   a_*(x) = ∂a(x)/∂x,   (6.16)

for a, b: R^n → R^n. However, we have

    a² = a_* a = ( Σ_{i=1}^m a^(i) )_* ( Σ_{j=1}^m a^(j) ) = Σ_{i,j=1}^m (a^(i))_* a^(j) = Σ_{i,j=1}^m a^(i) a^(j).   (6.17)

So

    e^{τa}(x) = x + τ a(x) + (1/2) τ² a²(x) + O(τ³) = 2G^τ_{(12···m)}(x) + O(τ³),   τ ∼ 0.

This shows that 2G^τ_{(12···m)} is an approximation of order 2 to e^{τa}. By direct verification, it is revertible if each component G_i^τ is revertible.

Conclusion 3° follows directly from the paper of Qin and Zhu[QZ93]. □
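The order-raising recursion (6.7)–(6.8) is straightforward to implement. Below is a sketch (our own code, not from the text); the implicit midpoint (centered Euler) rule for the harmonic oscillator, applied in closed form, stands in for a revertible second-order scheme 2G^τ.

```python
import numpy as np

def triple_jump(l):
    # (6.8): alpha_l = (2 - 2^{1/(2l+1)})^{-1}, beta_l = 1 - 2*alpha_l < 0
    alpha = 1.0 / (2.0 - 2.0 ** (1.0 / (2 * l + 1)))
    return alpha, 1.0 - 2.0 * alpha

def raise_order(step, l):
    """Symmetric composition (6.7): turns an order-2l scheme into order 2(l+1)."""
    alpha, beta = triple_jump(l)
    def composed(z, tau):
        z = step(z, alpha * tau)
        z = step(z, beta * tau)
        return step(z, alpha * tau)
    return composed

# stand-in revertible order-2 scheme: implicit midpoint for z' = M z
M = np.array([[0.0, 1.0], [-1.0, 0.0]])
I2 = np.eye(2)

def midpoint(z, tau):
    return np.linalg.solve(I2 - 0.5 * tau * M, (I2 + 0.5 * tau * M) @ z)

step4 = raise_order(midpoint, 1)   # order 4 by conclusion 3 of Lemma 6.1

def integrate(step, z0, T, n):
    z = np.array(z0, dtype=float)
    for _ in range(n):
        z = step(z, T / n)
    return z
```

Halving the step size should shrink the error of `step4` by roughly 2⁴ = 16, which is easy to confirm numerically against the exact solution (cos t, −sin t).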
Lemma 6.2. Given the system

    ẋ = a^(k)(x),   a^(k)(x) = (0, ···, 0, ∂b_{k,k+1}/∂x_{k+1}(x), −∂b_{k,k+1}/∂x_k(x), 0, ···, 0)^T,   (6.18)

with x = (x_1, ···, x_k, x_{k+1}, ···, x_n)^T and smooth function b_{k,k+1}: R^n → R, any symplectic difference scheme of order p ≥ 1 for the Hamiltonian system on the x_k-x_{k+1} plane

    ẋ_k = ∂b_{k,k+1}/∂x_{k+1},   ẋ_{k+1} = −∂b_{k,k+1}/∂x_k,   (6.19)

with x_j, j ≠ k, k+1, as parameters naturally gives a volume-preserving difference scheme of order p for the source-free system (6.18) on the n-dimensional (x_1, ···, x_n)^T-space, by simply freezing the coordinates x_j, j ≠ k, k+1, and transforming x_k and x_{k+1} according to the symplectic difference scheme for (6.19) in which x_j, j ≠ k, k+1, are considered as frozen parameters.

Proof. It is obvious that the so-constructed difference scheme is of order p. As to the volume-preserving property, it is easily verified by direct calculation of the determinant of the Jacobian of the step-transition map of the scheme, noting that the determinant of the Jacobian of a symplectic map is equal to one. □
Now we construct volume-preserving difference schemes for source-free systems. Let a = (a_1, ···, a_n)^T be a source-free field. As was proved in Section 10.5, we have the essentially Hamiltonian decomposition (5.17) for a, with the functions b_{k,k+1} obtained from a by (5.14) – (5.16). We denote by S_k^τ the step-transition map of a volume-preserving difference scheme with step size τ, as constructed in Lemma 6.2, associated to the vector field a^(k) = (0, ···, 0, ∂b_{k,k+1}/∂x_{k+1}, −∂b_{k,k+1}/∂x_k, 0, ···, 0)^T for k = 1, 2, ···, n − 1. Then by Lemma 6.1, we have:
Theorem 6.3.[FW94] 1° A simple composition of the n − 1 components S_1^τ, S_2^τ, ···, S_{n−1}^τ, say

    1G^τ := S_{n−1}^τ ∘ ··· ∘ S_2^τ ∘ S_1^τ,

is a volume-preserving algorithmic approximation of order one to e^{τa}; and

    2g^τ := 1Ĝ^{τ/2} ∘ 1G^{τ/2},   2ĝ^τ := 1G^{τ/2} ∘ 1Ĝ^{τ/2}

are revertible volume-preserving algorithmic approximations of order 2.

2° If each S_k^τ is an approximation of order 2 to e^{τa^(k)}, then the symmetric composition

    2G^τ = S_{n−1}^{τ/2} ∘ ··· ∘ S_2^{τ/2} ∘ S_1^{τ/2} ∘ S_1^{τ/2} ∘ S_2^{τ/2} ∘ ··· ∘ S_{n−1}^{τ/2}   (6.20)

is a volume-preserving approximation of order 2 to e^{τa}.

3° If each S_k^τ is revertible, then the so-constructed 2G^τ is revertible too.

4° From the above constructed revertible algorithmic approximations 2g^τ or 2G^τ, we can further recursively construct revertible approximations of all even orders to e^{τa} according to the process of Lemma 6.1.
Remark 6.4. If a has essentially Hamiltonian decompositions other than (5.17) and
(5.14) – (5.16), then one can construct volume-preserving difference schemes corre-
sponding to these decompositions in a similar way to the above.

10.7 Some Special Discussions for Separable Source-Free Systems

For a source-free field a = (a_1, ···, a_n)^T with essentially Hamiltonian decomposition (5.17), we take S_k^τ: x = (x_1, ···, x_n)^T → x̂ = (x̂_1, ···, x̂_n)^T as determined from the following:

    x̂_j = x_j,   j ≠ k, k+1,
    x̂_k = x_k + τ ∂b_{k,k+1}/∂x_{k+1}(x_1, ···, x_{k−1}, x̂_k, x_{k+1}, ···, x_n),   (7.1)
    x̂_{k+1} = x_{k+1} − τ ∂b_{k,k+1}/∂x_k(x_1, ···, x_{k−1}, x̂_k, x_{k+1}, ···, x_n).
Then simple calculations show that 1G^τ = S_{n−1}^τ ∘ ··· ∘ S_2^τ ∘ S_1^τ is given by

    x̂_1 = x_1 + τ a_1(x̂_1, x_2, ···, x_n),

    x̂_j = x_j + τ a_j(x̂_1, ···, x̂_j, x_{j+1}, ···, x_n)
              + τ ∫_{x_j}^{x̂_j} Σ_{l=1}^{j−1} (∂a_l/∂x_l)(x̂_1, ···, x̂_{j−1}, t, x_{j+1}, ···, x_n) dt,   j = 2, ···, n − 1,   (7.2)

    x̂_n = x_n + τ a_n(x̂_1, ···, x̂_{n−1}, x_n),

and 1Ĝ^τ = (1G^{−τ})^{−1} is given by

    x̂_n = x_n + τ a_n(x_1, ···, x_{n−1}, x̂_n),

    x̂_j = x_j + τ a_j(x_1, ···, x_j, x̂_{j+1}, ···, x̂_n)
              − τ ∫_{x_j}^{x̂_j} Σ_{l=1}^{j−1} (∂a_l/∂x_l)(x_1, ···, x_{j−1}, t, x̂_{j+1}, ···, x̂_n) dt,   j = 2, ···, n − 1,   (7.3)

    x̂_1 = x_1 + τ a_1(x_1, x̂_2, ···, x̂_n).
Both (7.2) and (7.3) are volume-preserving difference schemes of order 1 for the source-free system associated to the field a, with step-transition maps 1G^τ and 1Ĝ^τ. They can be composed into revertible volume-preserving schemes of order 2; say, the 2-stage scheme with step-transition map 2g^τ = 1Ĝ^{τ/2} ∘ 1G^{τ/2}: x = (x_1, ···, x_n)^T → x̂ = (x̂_1, ···, x̂_n)^T reads as follows:
    x̂_n^{1/2} = x_n + (τ/2) a_n(x_1, ···, x_{n−1}, x̂_n^{1/2}),

    x̂_i^{1/2} = x_i + (τ/2) a_i(x_1, ···, x_i, x̂_{i+1}^{1/2}, ···, x̂_n^{1/2})
              − (τ/2) ∫_{x_i}^{x̂_i^{1/2}} Σ_{l=1}^{i−1} (∂a_l/∂x_l)(x_1, ···, x_{i−1}, t, x̂_{i+1}^{1/2}, ···, x̂_n^{1/2}) dt,   i = 2, ···, n − 1,

    x̂_1^{1/2} = x_1 + (τ/2) a_1(x_1, x̂_2^{1/2}, ···, x̂_n^{1/2}),

    x̂_1 = x̂_1^{1/2} + (τ/2) a_1(x̂_1, x̂_2^{1/2}, ···, x̂_n^{1/2}),

    x̂_j = x̂_j^{1/2} + (τ/2) a_j(x̂_1, ···, x̂_j, x̂_{j+1}^{1/2}, ···, x̂_n^{1/2})
              + (τ/2) ∫_{x̂_j^{1/2}}^{x̂_j} Σ_{l=1}^{j−1} (∂a_l/∂x_l)(x̂_1, ···, x̂_{j−1}, t, x̂_{j+1}^{1/2}, ···, x̂_n^{1/2}) dt,   j = 2, ···, n − 1,

    x̂_n = x̂_n^{1/2} + (τ/2) a_n(x̂_1, ···, x̂_{n−1}, x̂_n^{1/2}).   (7.4)
Either (7.2) or (7.3) generally contains n − 1 implicit equations. But for fields a with certain specific properties, they become explicit. For example, if

    ∂a_i/∂x_i = 0,   i = 1, ···, n   (7.5)

(i.e., a_i does not depend on x_i), then (7.2) turns into the explicit scheme[QZ93]

    x̂_1 = x_1 + τ a_1(x_2, ···, x_n),
    x̂_j = x_j + τ a_j(x̂_1, ···, x̂_{j−1}, x_{j+1}, ···, x_n),   j = 2, ···, n − 1,   (7.6)
    x̂_n = x_n + τ a_n(x̂_1, ···, x̂_{n−1}, x_n).

For details, see Section 10.2.
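As an illustration of (7.6), here is a sketch for the classical ABC flow ẋ = A sin z + C cos y, ẏ = B sin x + A cos z, ż = C sin y + B cos x (the choice of this field and of its coefficients is ours, not from the text). The field is source-free and satisfies condition (7.5), so the scheme is explicit; each of the three substeps is a shear, so the composed map has Jacobian determinant exactly 1.

```python
import numpy as np

Ac, Bc, Cc = 1.0, 0.7, 0.4   # arbitrary illustrative ABC coefficients

def step(v, tau):
    """One step of the explicit volume-preserving scheme (7.6) for the ABC flow."""
    x, y, z = v
    x1 = x + tau * (Ac * np.sin(z) + Cc * np.cos(y))    # a1 independent of x
    y1 = y + tau * (Bc * np.sin(x1) + Ac * np.cos(z))   # uses updated x
    z1 = z + tau * (Cc * np.sin(y1) + Bc * np.cos(x1))  # uses updated x, y
    return np.array([x1, y1, z1])
```

A finite-difference Jacobian at any point confirms the unit determinant.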
We note that, for a = (a_1, ···, a_n)^T,

    a = Σ_{k=1}^n a^{k},   a^{k} = (0, ···, 0, a_k, 0, ···, 0)^T,   k = 1, 2, ···, n.   (7.7)

It is easy to verify that, if a = (a_1, ···, a_n)^T satisfies condition (7.5), then the scheme (7.6) is just the result of composing the explicit Euler schemes of the systems associated to the fields a^{k} (k = 1, ···, n), i.e., we have

    1G^τ = E^τ_{a^{n}} ∘ ··· ∘ E^τ_{a^{2}} ∘ E^τ_{a^{1}},   (7.8)

where

    E^τ_{a^{k}} = I + τ a^{k},   k = 1, 2, ···, n,   I = identity.   (7.9)
In fact, the E^τ_{a^{k}} are the phase flows e^{τ a^{k}}, since (a^{k})_* a^{k} = 0 for k = 1, 2, ···, n, which is implied by condition (7.5). According to Theorem 6.3, we then get a 2nd order explicit revertible volume-preserving scheme, with step-transition map

    2G^τ = E^{τ/2}_{a^{n}} ∘ ··· ∘ E^{τ/2}_{a^{2}} ∘ E^{τ/2}_{a^{1}} ∘ Ê^{τ/2}_{a^{1}} ∘ Ê^{τ/2}_{a^{2}} ∘ ··· ∘ Ê^{τ/2}_{a^{n}} = 1G^{τ/2} ∘ 1Ĝ^{τ/2} = 2ĝ^τ.   (7.10)

10.8 Construction of Volume-Preserving Schemes via Generating Functions

Not only symplectic schemes can be constructed via generating functions; volume-preserving schemes can be constructed via generating functions as well. A. Thyagaraja and F.A. Haas[TH85,Sco91] gave an important type of generating function for volume-preserving mappings in 3 dimensions. It is, however, complete neither in generality nor in systematization. The complete results were given by Z.J. Shang[Sha94a,Sha94b].

10.8.1 Fundamental Theorem

Theorem 8.1. Let α = ( A_α  B_α ; C_α  D_α ) ∈ GL(2n) and α^{−1} = ( A^α  B^α ; C^α  D^α ). Assume that g: R^n → R^n, ẑ = g(z), is a given differentiable mapping satisfying at some point z_0 ∈ R^n the transversality condition

    | C_α ∂g(z)/∂z + D_α | ≠ 0.   (8.1)

Then in R^n there exist a neighborhood W of the point w_0 = C_α g(z_0) + D_α z_0 and a unique differentiable mapping f(w) = f_{α,g}(w) = (f_1(w), f_2(w), ···, f_n(w)): W → R^n satisfying the condition

    | ∂f(w)/∂w C_α − A_α | ≠ 0,   | B_α − ∂f(w)/∂w D_α | ≠ 0,   (8.2)

such that the mapping ẑ = g(z) can be reconstructed in a neighborhood V of the point z = z_0 from f by the relation

    A_α ẑ + B_α z = f(C_α ẑ + D_α z).   (8.3)

Conversely, for any differentiable mapping f(w) = (f_1(w), ···, f_n(w)): R^n → R^n satisfying condition (8.2) at a point w_0 ∈ R^n, relation (8.3) determines a unique differentiable mapping ẑ = g(z) in some neighborhood V of the point z_0 = C^α f(w_0) + D^α w_0. Moreover, the transversality condition (8.1) is satisfied for the mapping g at the point z_0 = C^α f(w_0) + D^α w_0.
Remark 8.2. Generally speaking, the mapping f = f_{α,g} is uniquely determined by the matrix α ∈ GL(2n) and the mapping g as above through relation (8.3). We call f = f_{α,g} the generating mapping depending on α and g.

Remark 8.3. We only consider certain typical types of generating mappings, of type

    α_{(s,s)} = ( I_n − E_ss   E_ss ; E_ss   I_n − E_ss ),   1 ≤ s ≤ n,   (8.4)

where E_ss denotes the n × n matrix of which only the entry at the s-th row and the s-th column is equal to 1 and all other entries are 0. In this case, (8.2) and (8.3) take much simpler forms. For example, for α = α_{(1,1)}, (8.2) turns into

    ∂f_1/∂w_1 ≠ 0,   | ∂(f_2, ···, f_n)/∂(w_2, ···, w_n) | ≠ 0,   (8.5)

and (8.3) turns into

    z_1 = f_1(ẑ_1, z_2, ···, z_n),
    ẑ_2 = f_2(ẑ_1, z_2, ···, z_n),
    ···   (8.6)
    ẑ_n = f_n(ẑ_1, z_2, ···, z_n).

The same situation also applies for α_{(s,s)}.

Remark 8.4. For the matrix α_{(1,1)}, a generating mapping f(w) of type α_{(1,1)} has n components f(w) = (f_1(w), f_2(w), ···, f_n(w)), in which the n − 1 components f_2(w), ···, f_n(w) are functionally independent, satisfying the condition

    | ∂(f_2, ···, f_n)/∂(w_2, ···, w_n) | ≠ 0;

then we can express the first component f_1 of f through the other n − 1 components:

    f_1(w_1, w_2, ···, w_n) = C(w_2, ···, w_n) + ∫_{w_{1,0}}^{w_1} | ∂(f_2, ···, f_n)/∂(w_2, ···, w_n) | (ξ, w_2, ···, w_n) dξ,   (8.7)

where C is a scalar function of the n − 1 variables w_2, ···, w_n.
Theorem 8.5. Let α = ( A_α  B_α ; C_α  D_α ) ∈ GL(2n) and α^{−1} = ( A^α  B^α ; C^α  D^α ). Suppose |C_α + D_α| ≠ 0. Then, for the phase flow g^t of the system

    ż = a(z),   a(z) = (a_1(z), ···, a_n(z))^T,   z = (z_1, ···, z_n)^T,   (8.8)

there exists a generating mapping f(w, t) = f_{α,a}(w, t), depending on t and α, such that

    ∂f/∂t = ( A_α − ∂f/∂w C_α ) a(A^α f + B^α w),   (8.9)

    f(w, 0) = (A_α + B_α)(C_α + D_α)^{−1} w.   (8.10)

We call (8.9) a Hamilton–Jacobi equation. The proofs of Theorems 8.1 and 8.5 can be found in [Sha94b].

Remark 8.6. If α = α_{(1,1)}, then relations (8.9) and (8.10) turn into

    ∂f_1/∂t = −a_1(w_1, f_2, ···, f_n) ∂f_1/∂w_1,   (8.11)

    ∂f_k/∂t = a_k(w_1, f_2, ···, f_n) − a_1(w_1, f_2, ···, f_n) ∂f_k/∂w_1,   k = 2, ···, n,   (8.12)

    f_k(w_1, ···, w_n, 0) = w_k,   k = 1, 2, ···, n.   (8.13)

When a is source-free, i.e.,

    div a(z) = Σ_{k=1}^n ∂a_k/∂z_k (z) = 0,   z ∈ R^n,   (8.14)

the phase flow g^t is volume-preserving, and we get

    ∂f_1/∂w_1 (w, t) = | ∂(f_2, ···, f_n)/∂(w_2, ···, w_n) | (w, t).   (8.15)

From (8.11), (8.13), and (8.15), we get

    f_1(w, t) = w_1 − ∫_0^t a_1(w_1, f_2(w, τ), ···, f_n(w, τ)) | ∂(f_2, ···, f_n)/∂(w_2, ···, w_n) | (w, τ) dτ,   (8.16)

while f_2, ···, f_n are determined independently by (8.12) and (8.13) (for k = 2, ···, n). We call these the generating functions of type α_{(1,1)} for the source-free system (8.8).

Theorem 8.7. Suppose the vector field a is an analytic function of z. Then f(w, t) = f_{α,a}(w, t), the solution of the Cauchy problem (8.9) and (8.10), is expressible as a convergent power series in t for sufficiently small |t|, with recursively determined coefficients:

    f(w, t) = Σ_{k=0}^∞ f^(k)(w) t^k,   (8.17)

    f^(0)(w) = N_0 w,   N_0 = (A_α + B_α)(C_α + D_α)^{−1},   (8.18)

    f^(1)(w) = L_0 a(E_0 w),   E_0 = (C_α + D_α)^{−1},   L_0 = A_α − N_0 C_α,   (8.19)

and for k ≥ 1,

    f^(k+1)(w) = − (1/(k+1)) (∂f^(k)(w)/∂w) C_α a(E_0 w)
        − (1/(k+1)) Σ_{m=1}^{k} Σ_{j=1}^{m} (1/j!) Σ_{i_1+···+i_j=m, i_p≥1} (∂f^(k−m)(w)/∂w) C_α D^j_{a,E_0 w}(A^α f^(i_1)(w), ···, A^α f^(i_j)(w))
        + (1/(k+1)) Σ_{m=1}^{k} (1/m!) Σ_{i_1+···+i_m=k, i_p≥1} A_α D^m_{a,E_0 w}(A^α f^(i_1)(w), ···, A^α f^(i_m)(w)),   (8.20)

where, for ξ^(k) = (ξ_1^(k), ···, ξ_n^(k))^T ∈ R^n (k = 1, 2, ···, m), D^m_{a,w} is the multilinear operator whose i-th component is

    ( D^m_{a,w}(ξ^(1), ···, ξ^(m)) )_i = Σ_{α_1,···,α_m=1}^{n} (∂^m a_i(w)/∂z_{α_1}···∂z_{α_m}) ξ^(1)_{α_1} ··· ξ^(m)_{α_m},   i = 1, ···, n.   (8.21)

Proof. Since the generating function f(w, t) = f_{α,a}(w, t) depends analytically on w and t in some neighborhood in R^n for sufficiently small t, it can be expressed as a power series

    f(w, t) = Σ_{k=0}^∞ f^(k)(w) t^k.

Differentiating it with respect to w and t, we get

    ∂f/∂w (w, t) = Σ_{k=0}^∞ (∂f^(k)(w)/∂w) t^k,   (8.22)

    ∂f/∂t (w, t) = Σ_{k=0}^∞ (k + 1) f^(k+1)(w) t^k.   (8.23)

By (8.10),

    f^(0)(w) = f(w, 0) = N_0 w.

This is (8.18). Denote E_0 = A^α N_0 + B^α = (C_α + D_α)^{−1}; then

    A^α f(w, t) + B^α w = E_0 w + Σ_{k=1}^∞ A^α f^(k)(w) t^k.

Expanding a(z) at z = E_0 w, we get

    a(A^α f(w, t) + B^α w) = a( E_0 w + Σ_{k=1}^∞ A^α f^(k)(w) t^k )
        = a(E_0 w) + Σ_{k=1}^∞ t^k Σ_{m=1}^{k} (1/m!) Σ_{i_1+···+i_m=k, i_p≥1} D^m_{a,E_0 w}(A^α f^(i_1)(w), ···, A^α f^(i_m)(w)).   (8.24)

Here D^m_{a,E_0 w} is the multilinear operator defined by (8.21). Substituting (8.22) and (8.24) into the right hand side of Equation (8.9) and (8.23) into the left hand side of (8.9), and then comparing the coefficients of t^k on both sides, we get the recursions (8.18) – (8.20). The proof is completed. □

Remark 8.8. Let α = α_{(1,1)}; then (8.18) – (8.20) turn into

    f^(0)(w) = w,   (8.25)

    f^(1)(w) = ã(w),   ã(w) = (−a_1(w), a_2(w), ···, a_n(w))^T,   (8.26)

and for k ≥ 1,

    f_i^(k+1)(w) = (1/(k+1)) (∂f_i^(k)(w)/∂w_1) ã_1(w)
        + (1/(k+1)) Σ_{m=1}^{k−1} Σ_{j=1}^{m} Σ_{i_1+···+i_j=m, i_p≥1} Σ_{α_1,···,α_j=2}^{n} (1/j!) (∂f_i^(k−m)(w)/∂w_1) (∂^j ã_1(w)/∂w_{α_1}···∂w_{α_j}) f_{α_1}^(i_1)(w) ··· f_{α_j}^(i_j)(w)
        + (1/(k+1)) Σ_{m=1}^{k} (1/m!) Σ_{i_1+···+i_m=k, i_p≥1} Σ_{α_1,···,α_m=2}^{n} (∂^m ã_i(w)/∂w_{α_1}···∂w_{α_m}) f_{α_1}^(i_1)(w) ··· f_{α_m}^(i_m)(w),
        i = 1, 2, ···, n.   (8.27)
10.8.2 Construction of Volume-Preserving Schemes

In this subsection, we consider the construction of volume-preserving schemes[Sha94a] for the source-free system (8.8). By Remark 8.3 on Theorem 8.1, for given time-dependent scalar functions φ_2(w, t), ···, φ_n(w, t): R^n × R → R and C̃(w, t): R^{n−1} × R → R, we can get a time-dependent volume-preserving mapping g̃(z, t). If φ_2(w, t), ···, φ_n(w, t) approximate the generating functions f_2(w, t), ···, f_n(w, t) of type α_{(1,1)} of the source-free system (8.8), then, for a suitable choice of C̃(w, t), g̃(z, t) approximates the phase flow g^t(z) = g(z, t). Fixing t as a time step, we get a difference scheme (a volume-preserving scheme) whose transition from one time step to the next is volume-preserving. By Remark 8.8 on Theorem 8.7, the generating functions f_2(w, t), ···, f_n(w, t) can be expressed as power series. So a natural way to approximate f_2(w, t), ···, f_n(w, t) is to take truncations of the series. However, we have to choose a suitable C̃(w, t) in (8.7) to guarantee the accuracy of the scheme.
Assume that

    φ_i^(m)(w, t) = Σ_{k=0}^m f_i^(k)(w) t^k,   i = 2, ···, n,   (8.28)

and

    ψ_1^(m)(w, t) = Σ_{k=0}^m f_1^(k)(w) t^k.   (8.29)

For some fixed value w_{1,0}, let

    C^(m)(w_2, ···, w_n, t) = ψ_1^(m)(w_{1,0}, w_2, ···, w_n, t)   (8.30)

and

    φ_1^(m)(w, t) = C^(m)(w_2, ···, w_n, t) + ∫_{w_{1,0}}^{w_1} | ∂(φ_2^(m), ···, φ_n^(m))/∂(w_2, ···, w_n) | (ξ, w_2, ···, w_n, t) dξ;   (8.31)

then we have:
Theorem 8.9. Under Theorems 8.5 and 8.7, for sufficiently small τ > 0 as the time step, define the mapping φ^(m)(w, τ) = (φ_1^(m)(w, τ), φ_2^(m)(w, τ), ···, φ_n^(m)(w, τ))^T with the components φ_i^(m)(w, τ) (i = 1, 2, ···, n) given as above for m = 1, 2, ···. Then the mapping

    w → ŵ = φ^(m)(w, τ)   (8.32)

defines a volume-preserving scheme z = z^k → z^{k+1} = ẑ,

    z_1^k = φ_1^(m)(z_1^{k+1}, z_2^k, ···, z_n^k, τ),
    z_i^{k+1} = φ_i^(m)(z_1^{k+1}, z_2^k, ···, z_n^k, τ),   i = 2, ···, n,   (8.33)

of m-th order of accuracy for the source-free system (8.8).
Proof. Since φ_i^(m)(w, 0) = f_i^(0)(w) = w_i (i = 2, ···, n),

    | ∂(φ_2^(m), ···, φ_n^(m))/∂(w_2, ···, w_n) | (w, 0) = 1.

Therefore, for sufficiently small τ and in some neighborhood in R^n,

    | ∂(φ_2^(m), ···, φ_n^(m))/∂(w_2, ···, w_n) | (w, τ) ≠ 0.

By Theorem 8.1, Remark 8.3, Remark 8.4, and Equation (8.31), relation (8.33) defines a time-dependent volume-preserving mapping z = z^k → z^{k+1} = ẑ = g̃(z, τ). That is, (8.33) is a volume-preserving scheme.

Noting that

    φ_i^(m)(w, τ) = f_i(w, τ) + O(τ^{m+1}),   i = 2, ···, n,
    ψ_1^(m)(w, τ) = f_1(w, τ) + O(τ^{m+1}),

for sufficiently small τ, and that

    f_1(w, τ) = f_1(w_{1,0}, w_2, ···, w_n, τ) + ∫_{w_{1,0}}^{w_1} | ∂(f_2, ···, f_n)/∂(w_2, ···, w_n) | (ξ, w_2, ···, w_n, τ) dξ,

we have from (8.31)

    φ_1^(m)(w, τ) = f_1(w, τ) + O(τ^{m+1}).

So φ^(m)(w, τ) = (φ_1^(m)(w, τ), ···, φ_n^(m)(w, τ)) is an m-th order approximant to f(w, τ) = (f_1(w, τ), ···, f_n(w, τ)), the generating function of type α_{(1,1)} of g^τ, and hence the volume-preserving scheme (8.33) is of m-th order of accuracy. The proof is completed. □
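The mechanism behind Theorem 8.9 can be seen in a minimal 2-dimensional instance (a toy example of our own, not from the text): choose f_2 freely, define f_1 by (8.7) with C = 0 and w_{1,0} = 0, and the map reconstructed from the relations (8.6) is volume-preserving. With f_2(w_1, w_2) = w_2·e^{w_1} one gets f_1(w_1, w_2) = e^{w_1} − 1, and the resulting map can even be written in closed form.

```python
import numpy as np

# type-(1,1) construction, n = 2: f2(w1, w2) = w2*exp(w1), hence by (8.7)
# f1(w1, w2) = int_0^{w1} exp(xi) dxi = exp(w1) - 1   (taking C = 0, w_{1,0} = 0)
def the_map(z1, z2):
    zh1 = np.log(1.0 + z1)       # solves z1 = f1(zh1, z2) = exp(zh1) - 1
    zh2 = z2 * np.exp(zh1)       # zh2 = f2(zh1, z2) = z2*(1 + z1)
    return zh1, zh2

def jac_det(z1, z2, h=1e-6):
    """Finite-difference Jacobian determinant of the_map."""
    d11 = (the_map(z1 + h, z2)[0] - the_map(z1 - h, z2)[0]) / (2 * h)
    d12 = (the_map(z1, z2 + h)[0] - the_map(z1, z2 - h)[0]) / (2 * h)
    d21 = (the_map(z1 + h, z2)[1] - the_map(z1 - h, z2)[1]) / (2 * h)
    d22 = (the_map(z1, z2 + h)[1] - the_map(z1, z2 - h)[1]) / (2 * h)
    return d11 * d22 - d12 * d21
```

Here the defining relation z_1 = f_1(ẑ_1, z_2) is solved explicitly; in general it is an implicit equation for ẑ_1, exactly as in scheme (8.33).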

Remark 8.10. We note that the volume-preserving scheme z = z^k → z^{k+1} given by (8.33) is implicit in only one new variable, z_1^{k+1}, and explicit in all the other new variables z_i^{k+1} (i = 2, ···, n) in terms of the old variables z_i^k (i = 2, ···, n).

Remark 8.11. We can get volume-preserving schemes similar to the above one if we consider the types α = α_{(s,s)} (2 ≤ s ≤ n) instead of α = α_{(1,1)}.

Example 8.12. First order scheme:

    z_1^k = φ_1^(1)(z_1^{k+1}, z_2^k, ···, z_n^k, τ),
    z_i^{k+1} = φ_i^(1)(z_1^{k+1}, z_2^k, ···, z_n^k, τ),   i = 2, ···, n,

where

    φ_1^(1)(w, τ) = −τ a_1(0, w_2, ···, w_n) + ∫_0^{w_1} det( δ_ij + τ ∂a_i/∂w_j )_{2≤i,j≤n} (ξ, w_2, ···, w_n) dξ,

    φ_i^(1)(w, τ) = w_i + τ a_i(w),   i = 2, ···, n;

here the determinant is that of the (n − 1) × (n − 1) matrix with entries δ_ij + τ ∂a_i/∂w_j, i, j = 2, ···, n.
Second order scheme:

    z_1^k = φ_1^(2)(z_1^{k+1}, z_2^k, ···, z_n^k, τ),
    z_i^{k+1} = φ_i^(2)(z_1^{k+1}, z_2^k, ···, z_n^k, τ),   i = 2, ···, n,

where

    φ_1^(2)(w, τ) = ψ_1^(2)(0, w_2, ···, w_n, τ) + ∫_0^{w_1} | ∂(ψ_2^(2), ···, ψ_n^(2))/∂(w_2, ···, w_n) | (ξ, w_2, ···, w_n) dξ,

    φ_i^(2)(w, τ) = ψ_i^(2)(w, τ),   i = 2, ···, n,

and

    ψ^(2)(w, τ) = (ψ_1^(2)(w, τ), ···, ψ_n^(2)(w, τ))^T = w + τ ã(w) + (1/2) τ² (∂ã(w)/∂w) ã(w),

    ã(w) = (−a_1(w), a_2(w), ···, a_n(w))^T.

10.9 Some Volume-Preserving Algorithms

In this section, we analyze under what conditions a source-free system admits volume-preserving R–K schemes.
10.9.1 Volume-Preserving R–K Methods

Consider the system

    dz/dt = a(z),

where

    z = [x; y],   x ∈ R^p,   y ∈ R^q,   a(z) = [g(y); f(x)].   (9.1)

Obviously, this is a source-free system. Its phase flow in R^{p+q} preserves the phase volume, i.e., the (p + q)-form

    dx_1 ∧ dx_2 ∧ ··· ∧ dx_p ∧ dy_1 ∧ dy_2 ∧ ··· ∧ dy_q.
Only R–K and P–R–K methods are to be discussed; we wish to know when they preserve the phase volume. The formula of a general m-stage P–R–K method with time step h applied to system (9.1) reads

    ξ_i = x_n + h Σ_{j=1}^m d_ij g(η_j),   η_i = y_n + h Σ_{j=1}^m c_ij f(ξ_j),   1 ≤ i ≤ m,

    x_{n+1} = x_n + h Σ_{j=1}^m δ_j g(η_j),   (9.2)

    y_{n+1} = y_n + h Σ_{j=1}^m γ_j f(ξ_j);

here ξ_i ∈ R^p, η_i ∈ R^q (1 ≤ i ≤ m) are auxiliary vectors used to compute the updates (x_{n+1}, y_{n+1}). Suppose (9.2) is irreducible[DV84], that is, if i ≠ j, then ξ_i ≠ ξ_j or η_i ≠ η_j.
We have the following lemma of Y.B. Suris[Sur96].

Lemma 9.1. Let δ = [δ_1, δ_2, ···, δ_m]^T, γ = [γ_1, γ_2, ···, γ_m]^T, D = (d_ij), C = (c_ij), and let e = [1, 1, ···, 1]^T be the m-dimensional vector of ones; set D^− = eδ^T − D, C^− = eγ^T − C. The P–R–K method preserves the phase volume for system (9.1) in arbitrary dimensions iff

    d_{k_1 l_1} c_{l_1 k_2} ··· d_{k_{r−1} l_{r−1}} c_{l_{r−1} k_r} d_{k_r l_r} c_{l_r k_1} = d^−_{k_1 l_1} c^−_{l_1 k_2} ··· d^−_{k_{r−1} l_{r−1}} c^−_{l_{r−1} k_r} d^−_{k_r l_r} c^−_{l_r k_1}

for arbitrary 1 ≤ r ≤ m and arbitrary ordered sets (k_1, ···, k_r) and (l_1, ···, l_r) of distinct natural numbers from {1, ···, m}; here d^−_ij and c^−_ij are the (i, j) entries of the matrices D^− and C^−.
Next, for system (9.1), we construct some volume-preserving methods of P–R–K type, using the above criterion. First we consider volume preservation by R–K methods for linear systems. A linear system of ODEs reads

    ẏ = M y,   (9.3)

where M is an n × n matrix with tr M = 0. If det M = 0, the system (9.3) can be reduced to one of lower dimension, so we assume det M ≠ 0. We also assume that M is a constant matrix. An R–K method (A, b, c) applied to system (9.3) takes the form

    Y_i = y_n + h Σ_{j=1}^s a_ij M Y_j,
    y_{n+1} = y_n + h Σ_{j=1}^s b_j M Y_j,   (9.4)

where A = (a_ij)_{s×s}, b = [b_1, b_2, ···, b_s]^T.

Here we discuss only R–K methods, and, according to Lemma 4.2, we cannot find a generally volume-preserving R–K method. So our hope is to divide the matrices M into different classes and find out whether there are volume-preserving R–K methods in some class.
Now we need the following notations:

    𝐀 = A ⊗ E_n,   𝐌 = diag(M, M, ···, M) = E_s ⊗ M,
    𝐛 = b^T ⊗ E_n,   𝐘 = [Y_1, Y_2, ···, Y_s]^T,   (9.5)
    𝐲_n = [y_n, y_n, ···, y_n]^T,   𝐞 = e ⊗ E_n,

where E_n is the n × n identity matrix and e = [1, 1, ···, 1]^T is the s-dimensional vector of ones. For the R–K method to be volume-preserving, we have the equivalent condition det(∂y_{n+1}/∂y_n) ≡ 1; so we need to calculate the matrix ∂y_{n+1}/∂y_n. In matrix notation, the R–K method (9.4) reads

    y_{n+1} = y_n + h 𝐛𝐌𝐘,
    𝐘 = (I − h𝐌𝐀)^{−1} 𝐲_n.   (9.6)

So

    y_{n+1} = ( E_n + h 𝐛𝐌(I − h𝐌𝐀)^{−1} 𝐞 ) y_n   ⟹   ∂y_{n+1}/∂y_n = E_n + h 𝐛𝐌(I − h𝐌𝐀)^{−1} 𝐞.   (9.7)
Lemma 9.2. Let A, D be non-degenerate m × m and n × n matrices respectively and
B an m × n and C an n × m matrix, then
det A det (D + CA−1 B) = det D det (A + BD−1 C). (9.8)
The proof can be found in any textbook of linear algebra.
By Lemma 9.2, it is easy to get from (9.7)

    det( ∂y_{n+1}/∂y_n ) = det(I − h𝐌𝐀 + h𝐞𝐛𝐌) / det(I − h𝐌𝐀).
Additionally, we define the notations

    A^− = (a^−_ij),   a^−_ij = a_ij − b_j,   𝐍 = A ⊗ M,   𝐍^− = A^− ⊗ M.   (9.9)

In these notations, the above determinant reads

    det( ∂y_{n+1}/∂y_n ) = det(I − h𝐍^−) / det(I − h𝐍).   (9.10)

Now, requiring (9.10) to be identically 1, we arrive at the criterion for the R–K method (9.4) to be a volume-preserving scheme:

    det(λI − 𝐍^−) = det(λI − 𝐍),   ∀ λ ∈ R.   (9.11)
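Criterion (9.11) is easy to test numerically. The sketch below is our own check for the 2-stage Gauss–Legendre method: it compares the characteristic polynomials of 𝐍 = A ⊗ M and 𝐍^− = A^− ⊗ M for an M whose spectrum is paired as ±λ (criterion holds) and for a traceless M with unpaired spectrum (criterion fails); the example matrices are arbitrary illustrations.

```python
import numpy as np

# 2-stage Gauss-Legendre Butcher matrix and weights
A = np.array([[0.25, 0.25 - np.sqrt(3) / 6.0],
              [0.25 + np.sqrt(3) / 6.0, 0.25]])
b = np.array([0.5, 0.5])
Am = A - np.outer(np.ones(2), b)   # A^-: entries a_ij - b_j

def charpoly(R, M):
    """Coefficients of det(lambda*I - R (x) M), cf. criterion (9.11)."""
    return np.poly(np.kron(R, M))

# traceless M with spectrum {1, 1, -2}: not symmetric about 0
M_unpaired = np.array([[1.0, 1.0, 0.0],
                       [0.0, 1.0, 1.0],
                       [0.0, 0.0, -2.0]])

# M with spectrum {1, -1}: paired, as in Theorem 9.4 below
M_paired = np.array([[0.0, 1.0],
                     [1.0, 0.0]])
```

`charpoly(A, M_paired)` agrees with `charpoly(Am, M_paired)`, while for `M_unpaired` the two polynomials differ, so the method cannot preserve volume for that system.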
Theorem 9.3. If the dimension of M is odd, then all the R–K methods based on high order quadrature formulas, such as Gauss–Legendre, Radau, and Lobatto, are not volume-preserving.

Proof. Note that 𝐍 = A ⊗ M and 𝐍^− = A^− ⊗ M. If the method were volume-preserving, then

    det 𝐍 = det 𝐍^− ⟺ det(A ⊗ M) = det(A^− ⊗ M)
           ⟺ (det A)^n (det M)^s = (det A^−)^n (det M)^s
           ⟺ (det A)^n = (det A^−)^n
           ⟺ det A = det A^−   (n odd).   (9.12)
Now we use the W-transformation proposed by Hairer and Wanner[HW81]. They introduced an s × s matrix W defined by

    W = (p_0(c), p_1(c), ···, p_{s−1}(c)),   (9.13)

where the normalized shifted Legendre polynomials are defined by

    p_k(x) = √(2k+1) Σ_{i=0}^k (−1)^{k+i} C(k, i) C(k+i, i) x^i,   k = 0, 1, ···, s − 1,   (9.14)

with C(·,·) denoting binomial coefficients.

For Gauss–Legendre, let X = W^{−1} A W; then X is the tridiagonal matrix

    X = ( 1/2  −ξ_1 ; ξ_1  0  −ξ_2 ; ξ_2  0  ⋱ ; ⋱  ⋱  −ξ_{s−1} ; ξ_{s−1}  0 ),   (9.15)

where

    ξ_k = 1 / (2 √(4k² − 1)),   k = 1, ···, s − 1.
⎡ 1 ⎤
− −ξ1
⎢ 2 ⎥
⎢ ξ 0 −ξ2 ⎥
⎢ 1 ⎥
− ⎢ ξ ⎥
X =⎢ 2 ⎥.
⎢ .. .. ⎥
⎢ . . −ξs−1 ⎥
⎣ ⎦
ξs−1 0

It is easy to verify that det X = det(X − ) ⇒ det A = det(A− ). So, Gauss–Legendre


method is not volume-preserving.
Using the Table 2.1 of Chapter 7, the remaining part of the proof is similar, where
σ ∈ R and uσ = 0. 
Theorem 9.4.[QL00] If the dimension of M is even, then the R–K methods based on high order quadrature formulas, such as Gauss–Legendre, Lobatto IIIA, Lobatto IIIB, Lobatto IIIS, Radau IB, and Radau IIB, are volume-preserving iff

    λ(M) = (λ_1, λ_2, ···, λ_{n/2}, −λ_1, −λ_2, ···, −λ_{n/2}).

Proof. Assume A and B are n × n and m × m matrices respectively, with eigenvalues {λ_1, λ_2, ···, λ_n} and {μ_1, μ_2, ···, μ_m}. Then, by the properties of the Kronecker product, λ(A ⊗ B) = {λ_i μ_j, i = 1, ···, n; j = 1, ···, m}. For an R–K method to be a volume-preserving scheme, according to (9.11), 𝐍 and 𝐍^− must have the same eigenvalues, that is, A ⊗ M and A^− ⊗ M must have the same eigenvalues. For example, for the Gauss–Legendre method, λ(A) = λ(X) and λ(A^−) = λ(X^−), and it is easy to see that λ(X^−) = −λ(X); so, when λ(M) is symmetric about 0 as above, the properties of the Kronecker product show that A ⊗ M and A^− ⊗ M have the same eigenvalues. □

Remark 9.5. If (9.3) is a Hamiltonian system, that is to say, M = J⁻¹S, where
$$J=\begin{pmatrix}0&I_{n/2}\\-I_{n/2}&0\end{pmatrix}$$
and Sᵀ = S is an n×n invertible matrix, then
$$\lambda(M)=\bigl(\lambda_1,\lambda_2,\cdots,\lambda_{\frac n2},\,-\lambda_1,-\lambda_2,\cdots,-\lambda_{\frac n2}\bigr).$$
So the R–K methods based on high-order quadrature formulas (Gauss–Legendre, Lobatto IIIA, Lobatto IIIB, Lobatto IIIS, Radau IB, and Radau IIB) are volume-preserving. Theorem 9.4 says that, for these methods to preserve volume, the system must in some sense be similar to a Hamiltonian system. If the matrix M is similar to an infinitesimally symplectic matrix, i.e., there is an invertible matrix P such that P⁻¹MP = JS with Sᵀ = S, then we can transform the system into a Hamiltonian system by a coordinate transformation. In this situation, the volume-preserving R–K methods and the symplectic R–K methods hardly differ: if P is a symplectic matrix, then the volume-preserving R–K methods are equivalent to symplectic R–K methods, and in this case they can be transformed into one another by a linear transformation.
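The observation in Remark 9.5 is easy to test: for M = J⁻¹S the eigenvalues occur in ± pairs, so det(I + (h/2)M) = det(I − (h/2)M), and one step of the centered Euler (midpoint) scheme for ẏ = My has Jacobian determinant exactly 1. A small sketch with n = 2 (the matrix S and step size h are our arbitrary choices):

```python
def det2(M):
    return M[0][0]*M[1][1] - M[0][1]*M[1][0]

def mul2(P, Q):
    return [[sum(P[i][k]*Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Jinv = [[0, -1], [1, 0]]   # inverse of J = [[0, 1], [-1, 0]] (n = 2)
S = [[2, 1], [1, 3]]       # symmetric, invertible
M = mul2(Jinv, S)          # infinitesimally symplectic matrix

h = 0.1
Iplus  = [[1 + h/2*M[0][0],     h/2*M[0][1]], [    h/2*M[1][0], 1 + h/2*M[1][1]]]
Iminus = [[1 - h/2*M[0][0],    -h/2*M[0][1]], [   -h/2*M[1][0], 1 - h/2*M[1][1]]]
# Jacobian determinant of the midpoint step y1 = (I - hM/2)^{-1}(I + hM/2) y0
jac_det = det2(Iplus) / det2(Iminus)
```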

10.9.2 Volume-Preserving 2-Stage P–R–K Methods

In the case r = 1, if the necessary and sufficient conditions of Lemma 9.5 are satisfied, then a 2-stage P–R–K method is volume-preserving. This condition coincides with the symplecticity condition on the class of separable Hamiltonian systems. Thus, for system (9.3), all 2-stage P–R–K methods proposed in [Sun95] are volume-preserving algorithms[QL00].

Example 9.6. 3rd order Radau IA–IA method

     0  |  1/4   -1/4           0  |  0    0
    2/3 |  1/4    5/12         2/3 |  1/3  1/3
    ----+-------------         ----+-----------
        |  1/4    3/4              |  1/4  3/4

Example 9.7. 3rd order Radau IIA–IIA method

    1/3 |  5/12  -1/12         1/3 |  1/3  0
     1  |  3/4    1/4           1  |  1    0
    ----+--------------        ----+----------
        |  3/4    1/4              |  3/4  1/4

Example 9.8. 2nd order Lobatto IIIC–IIIC method

     0  |  1/2  -1/2            0  |  0    0
     1  |  1/2   1/2            1  |  1    0
    ----+------------          ----+-----------
        |  1/2   1/2               |  1/2  1/2

Example 9.9. 4th order Gauss IA–IA method

    1/2-√3/6 | (1+2σ)/4           (1-2σ)/4 - √3/6
    1/2+√3/6 | (1-2σ)/4 + √3/6    (1+2σ)/4
    ---------+-----------------------------------
             |  1/2                1/2

    1/2-√3/6 | (1-2σ)/4           (1+2σ)/4 - √3/6
    1/2+√3/6 | (1+2σ)/4 + √3/6    (1-2σ)/4
    ---------+-----------------------------------
             |  1/2                1/2
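The tableau pairs in the examples above can be checked in exact arithmetic against the pairing condition γᵢdᵢⱼ + δⱼcⱼᵢ − γᵢδⱼ = 0, i.e., the symplecticity condition for separable Hamiltonian systems. A sketch for Examples 9.6 and 9.8 (the function name is ours):

```python
from fractions import Fraction as F

def is_vp_pair(d, c, gamma, delta):
    """Check gamma_i d_ij + delta_j c_ji - gamma_i delta_j == 0
    for a pair of tableaux (d, gamma) and (c, delta)."""
    s = len(gamma)
    return all(gamma[i]*d[i][j] + delta[j]*c[j][i] - gamma[i]*delta[j] == 0
               for i in range(s) for j in range(s))

# Example 9.6: Radau IA paired with its adjoint tableau
d6 = [[F(1, 4), F(-1, 4)], [F(1, 4), F(5, 12)]]
c6 = [[F(0), F(0)], [F(1, 3), F(1, 3)]]
b6 = [F(1, 4), F(3, 4)]

# Example 9.8: Lobatto IIIC paired with its adjoint tableau
d8 = [[F(1, 2), F(-1, 2)], [F(1, 2), F(1, 2)]]
c8 = [[F(0), F(0)], [F(1), F(0)]]
b8 = [F(1, 2), F(1, 2)]
```

By contrast, pairing Radau IA with itself violates the condition, which is why the partner tableau is needed.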

10.9.3 Some Generalizations

The method in [Sur96] can be extended to the following system:
$$\dot x=g(y),\qquad \dot y=h(z),\qquad \dot z=f(x),\qquad x,y,z\in\mathbb{R}^{p}.\tag{9.16}$$
For this system, we consider the multi-stage P–R–K method
$$\xi_i=x_n+h\sum_{j=1}^{m}d_{ij}\,g(\eta_j),\qquad \eta_i=y_n+h\sum_{j=1}^{m}c_{ij}\,h(w_j),\qquad w_i=z_n+h\sum_{j=1}^{m}e_{ij}\,f(\xi_j),\tag{9.17}$$
$$x_{n+1}=x_n+h\sum_{j=1}^{m}\alpha_j\,g(\eta_j),\qquad y_{n+1}=y_n+h\sum_{j=1}^{m}\beta_j\,h(w_j),\qquad z_{n+1}=z_n+h\sum_{j=1}^{m}\gamma_j\,f(\xi_j).$$
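A direct implementation of one step of (9.17) for scalar x, y, z might look as follows. This is a sketch only: the stage equations are solved by fixed-point iteration, which converges for sufficiently small h, and all names are our own (`hfun` stands for the right-hand side h(·), renamed to avoid clashing with the step size):

```python
def prk3_step(x, y, z, h, g, hfun, f, d, c, e, alpha, beta, gamma, iters=50):
    """One step of the 3-part P-R-K method (9.17) for
    x' = g(y), y' = hfun(z), z' = f(x)."""
    m = len(alpha)
    xi, eta, w = [x]*m, [y]*m, [z]*m
    for _ in range(iters):                      # fixed-point iteration on stages
        xi  = [x + h*sum(d[i][j]*g(eta[j])  for j in range(m)) for i in range(m)]
        eta = [y + h*sum(c[i][j]*hfun(w[j]) for j in range(m)) for i in range(m)]
        w   = [z + h*sum(e[i][j]*f(xi[j])   for j in range(m)) for i in range(m)]
    xn = x + h*sum(alpha[j]*g(eta[j])  for j in range(m))
    yn = y + h*sum(beta[j]*hfun(w[j])  for j in range(m))
    zn = z + h*sum(gamma[j]*f(xi[j])   for j in range(m))
    return xn, yn, zn
```

With all inner coefficients zero and unit weights (m = 1), the step reduces to the explicit Euler scheme.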

Theorem 9.10. A multi-stage P–R–K method is volume-preserving for a system of type (9.16) iff
$$d_{k_1l_1}c_{l_1m_1}e_{m_1k_2}\,d_{k_2l_2}c_{l_2m_2}e_{m_2k_3}\cdots d_{k_rl_r}c_{l_rm_r}e_{m_rk_1}=d^{-}_{k_1l_1}c^{-}_{l_1m_1}e^{-}_{m_1k_2}\,d^{-}_{k_2l_2}c^{-}_{l_2m_2}e^{-}_{m_2k_3}\cdots d^{-}_{k_rl_r}c^{-}_{l_rm_r}e^{-}_{m_rk_1}\tag{9.18}$$
for arbitrary 1 ≤ r ≤ m and arbitrary ordered sets (k₁, ..., k_r), (l₁, ..., l_r), and (m₁, ..., m_r) of distinct natural numbers from {1, ..., m}; here d_ij, c_ij, e_ij, d⁻_ij, c⁻_ij, e⁻_ij are defined as in Lemma 9.5.

Example 9.11. A multi-stage P–R–K method

     0  |  1/2  -1/2        0  |  0   0        a  |  a   0
     1  |  1/2   1/2        1  |  1   0       b+c |  b   c
    ----+------------      ----+---------    -----+--------
        |  1/2   1/2           | 1/2  1/2         |  b   2c

With a suitable choice of a, b, and c, the method attains a global truncation error of order O(h²).

Remark 9.12. The dimensions of x, y, and z may be different.

Remark 9.13. Theorem 9.10 can be extended with no difficulty to the following system:
$$\dot x_1=f_2(x_2),\qquad \dot x_2=f_3(x_3),\qquad \cdots,\qquad \dot x_n=f_1(x_1).\tag{9.19}$$

10.9.4 Some Explanations


It is often stated that symplectic methods are volume-preserving schemes. But this statement is somewhat loose, because a scheme is symplectic (i.e., satisfies the symplecticity condition) only when it is applied to a Hamiltonian system. For a P–R–K method (d_ij, δ_i, c_ij, γ_j), if it satisfies
$$\gamma_i d_{ij}+\delta_j c_{ji}-\gamma_i\delta_j=0,$$
we may call the integrator symplectic. If the system is not Hamiltonian, we cannot call the P–R–K method symplectic: we call a scheme symplectic because it preserves the symplectic structure of a given system, and only Hamiltonian systems possess a symplectic structure. Consequently, we cannot say that "volume-preserving P–R–K methods form a subset of symplectic ones".
So far, we have given some criteria for volume preservation by R–K and P–R–K methods. In practice, however, it is almost impossible to construct high-order volume-preserving algorithms from these criteria alone; indeed, we cannot even guarantee that schemes satisfying these criteria exist. These problems remain far from resolved.
It should be noted that, in the above discussion, we always supposed the system to be irreducible, in other words, det M ≠ 0. But in practice, some systems are reducible, for example
$$\dot x=cy-bz,\qquad \dot y=az-cx,\qquad \dot z=bx-ay,\qquad a,b,c\in\mathbb{R}.$$

In fact, for this system the centered Euler scheme is volume-preserving. Furthermore, Lobatto IIIA, Lobatto IIIB, Lobatto IIIS, Radau IB, Radau IIB, etc., are also volume-preserving. By a detailed analysis of the process in Subsection 10.9.2, it is easy to obtain the following[QL00].

Theorem 9.14. If the dimension of M is odd, then the R–K methods based on high-order quadrature formulae, such as Lobatto IIIA, Lobatto IIIB, Lobatto IIIS, Radau IB, Radau IIB, etc., are volume-preserving iff
$$\lambda(M)=\Bigl(\lambda_1,\lambda_2,\cdots,\lambda_{\frac{n-1}2},\,0,\,-\lambda_1,-\lambda_2,\cdots,-\lambda_{\frac{n-1}2}\Bigr).$$

We also find that, in Theorem 9.4, the assumption det M ≠ 0 is not necessary.

As for nonlinear systems, we cannot yet give satisfactory results. A nonlinear system
$$\dot y=f(y),\qquad t\in\mathbb{R},\quad y\in\mathbb{R}^{n},$$
is said to be source-free if $\operatorname{div}f=\sum_{i=1}^{n}\dfrac{\partial f_i(y)}{\partial y_i}=0$. Such a system preserves the phase volume on the phase space Rⁿ. For these systems, we only point out that the centered Euler scheme is volume-preserving iff the Jacobian ∂fᵢ/∂yⱼ = M is, in some sense, similar to an infinitesimally symplectic matrix; that is, the eigenvalues of M can be specified as
$$\lambda(M)=\Bigl(\lambda_1,\lambda_2,\cdots,\lambda_{\frac n2},\,-\lambda_1,-\lambda_2,\cdots,-\lambda_{\frac n2}\Bigr),$$
or
$$\lambda(M)=\Bigl(\lambda_1,\lambda_2,\cdots,\lambda_{\frac{n-1}2},\,0,\,-\lambda_1,-\lambda_2,\cdots,-\lambda_{\frac{n-1}2}\Bigr).$$
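For the reducible linear example above, the volume preservation of the centered Euler scheme can be seen directly: M is skew-symmetric, so det(I + (h/2)M) = det(I − (h/2)M), and the Jacobian determinant of one step is 1 even though det M = 0 and λ(M) = {0, ±i√(a²+b²+c²)}. A small numerical check (helper names are ours):

```python
def det3(M):
    return (M[0][0]*(M[1][1]*M[2][2] - M[1][2]*M[2][1])
          - M[0][1]*(M[1][0]*M[2][2] - M[1][2]*M[2][0])
          + M[0][2]*(M[1][0]*M[2][1] - M[1][1]*M[2][0]))

a, b, c, h = 1.0, 2.0, 3.0, 0.2
# x' = cy - bz, y' = az - cx, z' = bx - ay  => skew-symmetric M
M = [[0.0,   c,  -b],
     [ -c, 0.0,   a],
     [  b,  -a, 0.0]]

def shift(sign):
    # I + sign*(h/2)*M
    return [[(1.0 if i == j else 0.0) + sign*(h/2)*M[i][j] for j in range(3)]
            for i in range(3)]

# Jacobian determinant of the midpoint step y1 = (I - hM/2)^{-1}(I + hM/2) y0
ratio = det3(shift(+1)) / det3(shift(-1))
```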
Bibliography

[DV84] K. Dekker and J.G. Verwer: Stability of Runge–Kutta Methods for Stiff Initial Value Problems. Elsevier Science Pub. B.V., North-Holland, Amsterdam, (1984).
[FS95] K. Feng and Z. J. Shang: Volume-preserving algorithms for source-free dynamical
systems. Numer. Math., 71:451–463, (1995).
[FW94] K. Feng and D.L. Wang: Dynamical systems and geometric construction of algo-
rithms. In Z. C. Shi and C. C. Yang, editors, Computational Mathematics in China, Con-
temporary Mathematics of AMS, Vol. 163, pages 1–32. AMS, (1994).
[HW81] E. Hairer and G. Wanner: Algebraically stable and implementable Runge–Kutta meth-
ods of high order. SIAM J. Numer. Anal., 18:1098–1108, (1981).
[MQ04] R.I. McLachlan and G.R.W. Quispel: Explicit geometric integration of polynomial
vector fields. BIT, 44:513–538, (2004).
[QD97] G. R. W. Quispel and C. P. Dyt: Solving ODE’s numerically while preserving sym-
metries, Hamiltonian structure, phase space volume, or first integrals. In A. Sydow, editor,
Proceedings of the 15th IMACS World Congress, pages 601–607. Wissenschaft & Technik,
Berlin, (1997).
[QD98] G. R. W. Quispel and C. P. Dyt: Volume-preserving integrators have linear error
growth. Physics Letters A, 202:25–30, (1998).
[QL00] M. Z. Qin and H. W. Li: Volume preserving R–K methods for linear systems. Acta
Applicandae Mathematicae, 16:430–434, (2000).
[QM03] G. R. W. Quispel and D. I. McLaren: Explicit volume-preserving and symplectic
integrators for trigonometric polynomial flows. J. of Comp. Phys., 186(1):308–316, (2003).
[Qui95] G. R. W. Quispel: Volume-preserving integrators. Physics Letters A, 206:26–30,
(1995).
[QZ92] M. Z. Qin and W. J. Zhu: Construction of higher order symplectic schemes by com-
position. Computing, 47:309–321, (1992).
[QZ93] M. Z. Qin and W. J. Zhu: Volume-preserving schemes and numerical experiments.
Computers Math. Applic., 26:33–42, (1993).
[Sco91] C. Scovel: Symplectic numerical integration of Hamiltonian systems. In T. Ratiu,
editor, The Geometry of Hamiltonian Systems, pages 463–496. Springer, New York, (1991).
[Sha94a] Z. Shang: Construction of volume-preserving difference schemes for source-free systems via generating functions. J. Comput. Math., 12:265–272, (1994).
[Sha94b] Z. Shang: Generating functions for volume-preserving mappings and Hamilton–
Jacobi equations for source-free dynamical systems. Science in China (series A), 37:1172–
1188, (1994).
[Sun95] G. Sun: Construction of high order symplectic Partitioned–Runge–Kutta methods. J.
Comput. Math., 13(1):40–50, (1995).
[Sur96] Y. B. Suris: Partitioned Runge–Kutta methods as phase volume preserving integrators.
Physics Letters A, 220:63–69, (1996).
[TH85] A. Thyagaraja and F.A. Haas: Representation of volume-preserving maps induced by
solenoidal vector fields. Phys. Fluids, 28:1005, (1985).
[Wey40] H. Weyl: The method of orthogonal projection in potential theory. Duke Math. J.,
7:411–444, (1940).
Chapter 11. Contact Algorithms for Contact Dynamical Systems

An odd-dimensional manifold cannot admit a symplectic structure. The analogue of a symplectic structure for odd-dimensional manifolds is a little less symmetric, but it is also a very interesting structure: the contact structure. In this chapter, we apply to contact dynamical systems the ideas of preserving the Lie group and Lie algebra structure of dynamical systems that underlie the construction of symplectic algorithms for Hamiltonian systems, and present so-called contact algorithms, i.e., algorithms preserving the contact structure, for solving contact systems numerically.

11.1 Contact Structure


The source of contact structures are manifolds of contact element of configuration
spaces. It is also of basic importance in physical and engineering sciences. Contact
geometry has – as does symplectic geometry – broad applications in physics, e.g.
geometrical optics, classical mechanics, thermodynamics, geometric quantization, and
applied mathematics such as control theory.

11.1.1 Basic Concepts of Contact Geometry


Contact geometry[Arn89,Arn88] is the study of a geometric structure on smooth manifolds given by a hyperplane distribution in the tangent bundle, specified by a one-form, satisfying a "maximum non-degeneracy" condition called "complete non-integrability".
The integration of first-order partial differential equations reduces to the integration of a system of ordinary differential equations, the so-called characteristic equations. The basis of this reduction is a simple geometric analysis of the family of characteristic curves. Let M be a smooth manifold and let V be a direction field on M.
Definition 1.1. N ⊂ M is called an integral surface of V if the tangent plane of N contains the direction of V at every point (Fig. 1.1).
Let Γ be a k-dimensional submanifold of an n-dimensional manifold M (Fig. 1.2); Γ is called a hypersurface if k = n − 1.
The Cauchy problem for the direction field V with initial manifold Γ is the problem of finding a (k + 1)-dimensional integral submanifold of V containing the initial submanifold Γ.

Fig. 1.1. Meaning of definition
Fig. 1.2. Integral surface with initial manifold Γ

At every point of n-dimensional space there is an (n − 1)-dimensional hyperplane, giving a field of hyperplanes of codimension 1. Such a field of tangent hyperplanes can be described locally by a 1-form
$$\alpha=\sum_{i=1}^{n}\alpha_i\,dx_i,\qquad \sum_{i=1}^{n}\alpha_i^{2}(x)\neq 0,\quad\forall\,x\in\mathbb{R}^{n}.$$
The hyperplane in Fig. 1.3 is the null space of the 1-form α. The relation between a hyperplane field and its 1-form is not a one-to-one correspondence: the forms may differ by multiplication by a nonzero factor, and this multiplier may depend on the point.
We consider what a field of hyperplanes looks like in general in a neighborhood of a point of an n-dimensional manifold. For example, let n = 2. Then the manifold is a surface and a field of hyperplanes is a field of straight lines. Such a field in a neighborhood of a point is always constructed very simply, namely, as the field of tangents to a family of parallel lines in the plane. More precisely, one of the basic results of the local theory of ODEs is that any smooth field of tangent lines on a manifold can be changed into the field of tangents to a family of parallel straight lines in Euclidean space by a diffeomorphism of a sufficiently small neighborhood of any point of the manifold.
Fig. 1.3. Hyperplane

If n > 2, then a hyperplane is not a line. For example, if n = 3, most fields of 2-dimensional tangent planes in ordinary 3-dimensional space cannot be mapped diffeomorphically onto a field of parallel planes. The reason is that there exist fields of tangent planes for which it is impossible to find integral surfaces, i.e., surfaces which have the prescribed tangent plane at each point.
A 1-form in 3-dimensional space can be written in the following standard form:
$$\alpha=x\,dy+dz.$$
The tangent hyperplane at a point, denoted by Π, consists of the vectors η = (η_x, η_y, η_z)ᵀ satisfying
$$(\eta_x,\eta_y,\eta_z)\begin{pmatrix}0\\x\\1\end{pmatrix}=x\,\eta_y+\eta_z=0.$$
Since the entries of (0, x, 1) are not all zero, this defines a field of 2-dimensional hyperplanes. When x = 0, the plane is given by η_z = 0, i.e., it is horizontal. Intersecting the hyperplane at each point with a wall defines a direction field; see Fig. 1.4 and Fig. 1.5.
Next, we show that in the space R³ there does not exist an integral surface of the field of planes given by the 1-form α = x dy + dz, where x, y are horizontal coordinates and z is the vertical coordinate; see Fig. 1.6.
Consider curves emanating from the origin (0,0,0) and tangent to the field: one integral curve runs from (0,0,0) to (0,1,0), and then from (0,1,0) to (1,1,0); another integral curve runs from (0,0,0) to (1,0,0), and then from (1,0,0) to (1,1,−1). As a result, these two curves cannot close up. The difference in the heights of their endpoints is 1; this difference can be considered as a measure of the nonintegrability of the field.
We have four direction fields from the origin 0 to the walls in the east, south, west, and north, respectively, as depicted in Fig. 1.5.
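The height difference of the two endpoints can be reproduced by integrating dz = −x dy along the two polygonal paths (a sketch; the helper function is ours, and the trapezoidal formula is exact on straight segments):

```python
def lift(path):
    """Lift a horizontal polygonal path of (x, y) points to the plane field
    alpha = x dy + dz = 0, i.e. integrate dz = -x dy, starting from z = 0."""
    z = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        z += -0.5 * (x0 + x1) * (y1 - y0)   # -integral of x dy on the segment
    return z

# the two integral curves from the text
z_first  = lift([(0, 0), (0, 1), (1, 1)])   # (0,0,0) -> (0,1,0) -> (1,1, 0)
z_second = lift([(0, 0), (1, 0), (1, 1)])   # (0,0,0) -> (1,0,0) -> (1,1,-1)
```

The two lifts end at heights 0 and −1 over the same horizontal point (1,1), so the field of planes admits no integral surface.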

Fig. 1.4. The field of 2n-dimensional planes α = 0 in R^{2n+1}

Fig. 1.5. Direction fields in each wall (north, west, east, south)

11.1.2 Contact Structure


A contact element of an n-dimensional smooth manifold at some point is an (n − 1)-dimensional plane tangent to the manifold at that point, i.e., an (n − 1)-dimensional subspace of the n-dimensional tangent space at that point. Thus, at each point of the n-dimensional space there is an (n − 1)-dimensional hyperplane, and the dimension of the planes of this hyperplane field is n − 1. We note first that a field of hyperplanes can be given locally by a differential 1-form: a plane in the tangent space determines a 1-form up to multiplication by a nonzero constant. We choose this constant so that the value of the form on a vertical basis vector is equal to 1. The hyperplanes of the field are the null spaces of the 1-form[Arn89,Arn88].

Definition 1.2. A field of hyperplanes is said to be nondegenerate at a point if the rank of the 2-form dα|_{α=0} on the plane of the field passing through this point is equal to the dimension of the plane.

Definition 1.3. A differential 1-form α which is nowhere equal to the zero form on
a manifold M is called a contact form if the exterior derivative dα of α defines a
nondegenerate exterior 2-form in every plane α = 0.

Fig. 1.6. Integral curves constructed for a non-integrable field of planes

Example 1.4. Consider the space R^{2n+1} with the contact structure given by the 1-form α = du + p dq, where q = (q₁, ..., qₙ), p = (p₁, ..., pₙ), and u is a scalar. The form α is not equal to the zero form at any point of R^{2n+1}, and consequently defines a field of 2n-dimensional planes α = 0 in R^{2n+1}.

Example 1.5. The form constructed in Example 1.4 is a contact form; the exterior derivative of α restricted to the planes α = 0 is equal to
$$d\alpha|_{\alpha=0}=dq_1\wedge dp_1+\cdots+dq_n\wedge dp_n.$$
In the plane α = 0, (q₁, ..., qₙ; p₁, ..., pₙ) may serve as coordinates. The matrix of the form ω = dα|_{α=0} has the form
$$\begin{pmatrix}O&-I\\I&O\end{pmatrix},$$
where I is the identity matrix of order n. The determinant of this matrix is equal to 1. Consequently, the 2-form ω is nondegenerate. In other words, the rank of this form is 2n, so our field is nondegenerate at the origin and thus also in a neighborhood of the origin (in fact, this field of planes is nondegenerate at all points of the space).
Definition 1.6. A contact structure on a manifold M is a field of tangent hyperplanes which is given locally as the set of zeros of a contact 1-form. The hyperplanes of the field are called contact hyperplanes; we denote by Π_x the contact hyperplane at the point x. Put briefly, a contact structure on a manifold is a nondegenerate field of tangent hyperplanes.

Definition 1.7. A field of planes on a manifold is called nondegenerate if it is nondegenerate at every point of the manifold.
It should be noted that on an even-dimensional manifold there cannot be a nondegenerate field of hyperplanes: on such a manifold a hyperplane is odd-dimensional, and the rank of every skew-symmetric bilinear form on an odd-dimensional space is less than the dimension of the space. Nondegenerate fields of hyperplanes do exist on odd-dimensional manifolds.
Definition 1.8. A hyperplane (of dimension n − 1) tangent to a manifold at some point is called a contact element, and this point is called the point of contact.
The set of all contact elements of an n-dimensional manifold has the structure of a smooth manifold of dimension 2n − 1. The manifold of all contact elements of an n-dimensional manifold is a fiber bundle whose base is our manifold and whose fiber is (n − 1)-dimensional projective space.

Theorem 1.9. The bundle of contact elements is the projectivization of the cotangent bundle: it can be obtained from the cotangent bundle by changing every n-dimensional cotangent vector space into an (n − 1)-dimensional projective space (a point of which is a line passing through the origin in the cotangent space).
Proof. A contact element is given by a 1-form on the tangent space for which the element is the null space, and the form is determined up to multiplication by a nonzero number. But a form on the tangent space is a vector of the cotangent space. Therefore, a contact element is a nonzero vector of the cotangent space, determined up to multiplication by a nonzero number, i.e., a point of the projectivized cotangent space. □
In this chapter, we simply consider the Euclidean space R^{2n+1} of 2n + 1 dimensions as our basic manifold, with the contact structure given by the normal form
$$\alpha=\sum_{i=1}^{n}x_i\,dy_i+dz=x^{T}dy+dz=(0,\,x^{T},\,1)\begin{pmatrix}dx\\dy\\dz\end{pmatrix};\tag{1.1}$$
here we have used the 3-symbol notation to denote the coordinates and vectors on R^{2n+1}:
$$x=(x_1,\cdots,x_n)^{T},\qquad y=(y_1,\cdots,y_n)^{T},\qquad z=(z).\tag{1.2}$$
A contact dynamical system on R^{2n+1} is governed by a contact vector field f = (aᵀ, bᵀ, c)ᵀ : R^{2n+1} → R^{2n+1} through the equations
$$\dot x=a(x,y,z),\qquad \dot y=b(x,y,z),\qquad \dot z=c(x,y,z),\qquad \dot{\ }:=\frac{d}{dt},\tag{1.3}$$
where the contactivity condition on the vector field f is
$$L_f\alpha=\lambda_f\,\alpha,\tag{1.4}$$
with some function λ_f : R^{2n+1} → R, called the multiplier of f. In (1.4), L_fα denotes the Lie derivative of α with respect to f and is usually calculated by the formula (see Chapter 1)[Arn88]
$$L_f\alpha=i_f\,d\alpha+d\,i_f\alpha.\tag{1.5}$$
It is easy to show from (1.4) and (1.5) that to any contact vector field f on R^{2n+1} there corresponds a function K(x, y, z), called the contact Hamiltonian, such that
$$a=-K_y+K_z\,x,\qquad b=K_x,\qquad c=K-x^{T}K_x=:K_e.\tag{1.6}$$
In fact, (1.6) represents the general form of a contact vector field. Its multiplier, denoted λ_f from now on, is equal to K_z.
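The correspondence (1.6) and the contactivity condition (1.4) can be verified numerically for a sample contact Hamiltonian. A sketch for n = 1 with finite-difference derivatives (the Hamiltonian K and all helper names are our own illustrative choices):

```python
def grad(F, u, h=1e-6):
    # central-difference gradient of a scalar function F at the point u
    g = []
    for i in range(len(u)):
        up, um = list(u), list(u)
        up[i] += h; um[i] -= h
        g.append((F(up) - F(um)) / (2*h))
    return g

def K(u):
    # a sample contact Hamiltonian, coordinates u = (x, y, z), n = 1
    x, y, z = u
    return x*y + z*z

def f(u):
    # contact vector field (1.6): a = -K_y + K_z x, b = K_x, c = K - x K_x
    x, y, z = u
    Kx, Ky, Kz = grad(K, u)
    return [-Ky + Kz*x, Kx, K(u) - x*Kx]

def lie_derivative_components(u):
    # L_f alpha = d(i_f alpha) + i_f d(alpha) with alpha = x dy + dz (n = 1):
    # i_f alpha = x*b + c, and i_f d(alpha) = i_f(dx ^ dy) = a dy - b dx
    def i_f_alpha(v):
        a, b, c = f(v)
        return v[0]*b + c
    dI = grad(i_f_alpha, u, h=1e-4)
    a, b, c = f(u)
    return [dI[0] - b, dI[1] + a, dI[2]]   # coefficients of dx, dy, dz
```

Note that i_f α = x·K_x + (K − x·K_x) = K, so the computation also illustrates that the contact Hamiltonian is exactly i_f α, and the result agrees with K_z·α = K_z·(0, x, 1).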

Definition 1.10. A contact transformation g is a diffeomorphism on R^{2n+1},
$$g:\ (x,y,z)\ \longmapsto\ \bigl(\tilde x(x,y,z),\,\tilde y(x,y,z),\,\tilde z(x,y,z)\bigr),$$
conformally preserving the contact structure, i.e., g*α = μ_g α, which means
$$\sum_{i=1}^{n}\tilde x_i\,d\tilde y_i+d\tilde z=\mu_g\Bigl(\sum_{i=1}^{n}x_i\,dy_i+dz\Bigr),\tag{1.7}$$
for some everywhere non-vanishing function μ_g : R^{2n+1} → R, called the multiplier of g.

The explicit expression of (1.7) is
$$(0,\,\tilde x^{T},\,1)\begin{pmatrix}\tilde x_x&\tilde x_y&\tilde x_z\\ \tilde y_x&\tilde y_y&\tilde y_z\\ \tilde z_x&\tilde z_y&\tilde z_z\end{pmatrix}=\mu_g\,(0,\,x^{T},\,1).$$
A fundamental fact is that the phase flow g_K^t of a contact dynamical system associated with a contact Hamiltonian K : R^{2n+1} → R is a one-parameter (local) group of contact transformations on R^{2n+1}, i.e., g_K^t satisfies
$$g_K^{0}=\text{identity map on }\mathbb{R}^{2n+1};\tag{1.8}$$
$$g_K^{t+s}=g_K^{t}\circ g_K^{s},\qquad\forall\,t,s\in\mathbb{R};\tag{1.9}$$
$$(g_K^{t})^{*}\alpha=\mu_{g_K^{t}}\,\alpha,\tag{1.10}$$
for some everywhere non-vanishing function μ_{g_K^t} : R^{2n+1} → R. Moreover, we have the following relation between μ_{g_K^t} and the Hamiltonian K:
$$\mu_{g_K^{t}}=\exp\int_{0}^{t}(K_z\circ g_K^{s})\,ds.\tag{1.11}$$
For general contact systems, condition (1.10) is stringent for algorithmic approximations to phase flows, because only the phase flows themselves satisfy it. We will construct algorithms for contact systems such that the corresponding algorithmic approximations to the phase flows satisfy condition (1.10), probably with multipliers different from μ_{g_K^t} but still everywhere non-vanishing. We call such algorithms contact algorithms.



11.2 Contactization and Symplectization


There is a well-known correspondence between contact geometry on R^{2n+1} and conic (or homogeneous) symplectic geometry on R^{2n+2}. To establish this correspondence, we introduce two spaces, R₊^{2n+2} and R₊ × R^{2n+1}.
a. We use the 4-symbol notation for the coordinates on R^{2n+2}:
$$(p_0,\,p_1,\,q_0,\,q_1),\qquad p_0=(p_0),\quad q_0=(q_0),\quad p_1=(p_{11},\cdots,p_{1n})^{T},\quad q_1=(q_{11},\cdots,q_{1n})^{T}.\tag{2.1}$$
Consider
$$\mathbb{R}_{+}^{2n+2}=\bigl\{(p_0,p_1,q_0,q_1)\in\mathbb{R}^{2n+2}\ \big|\ p_0>0\bigr\}\tag{2.2}$$
as a conic symplectic space with the standard symplectic form
$$\omega=dp_0\wedge dq_0+dp_1\wedge dq_1.\tag{2.3}$$

Definition 2.1. A function φ : R₊^{2n+2} → R is called a conic function if it satisfies
$$\phi(p_0,p_1,q_0,q_1)=p_0\,\phi\Bigl(1,\frac{p_1}{p_0},q_0,q_1\Bigr),\qquad\forall\,p_0>0.\tag{2.4}$$
So a conic function on R₊^{2n+2} depends essentially on only 2n + 1 variables.

Definition 2.2. F : R₊^{2n+2} → R₊^{2n+2} is called a conic map if
$$F\circ T_\lambda=T_\lambda\circ F,\qquad\forall\,\lambda>0,\tag{2.5}$$
where T_λ is the linear transformation on R^{2n+2}
$$T_\lambda\begin{pmatrix}p\\q\end{pmatrix}=\begin{pmatrix}\lambda p\\q\end{pmatrix},\qquad p=\begin{pmatrix}p_0\\p_1\end{pmatrix},\quad q=\begin{pmatrix}q_0\\q_1\end{pmatrix}.\tag{2.6}$$

The conic condition (2.5) for the mapping F : (p₀, p₁, q₀, q₁) → (P₀, P₁, Q₀, Q₁) can be expressed as follows: for all p₀ > 0,
$$P_0(p_0,p_1,q_0,q_1)=p_0\,P_0\Bigl(1,\frac{p_1}{p_0},q_0,q_1\Bigr)>0,\qquad P_1(p_0,p_1,q_0,q_1)=p_0\,P_1\Bigl(1,\frac{p_1}{p_0},q_0,q_1\Bigr),$$
$$Q_0(p_0,p_1,q_0,q_1)=Q_0\Bigl(1,\frac{p_1}{p_0},q_0,q_1\Bigr),\qquad Q_1(p_0,p_1,q_0,q_1)=Q_1\Bigl(1,\frac{p_1}{p_0},q_0,q_1\Bigr).\tag{2.7}$$
So a conic map depends essentially on only 2n + 2 functions of 2n + 1 variables. It should be noted that, in some cases, we also consider conic functions and conic maps defined on the whole Euclidean space. The following lemma gives a criterion for a conic symplectic map.
Lemma 2.3. F : (p₀, p₁, q₀, q₁) → (P₀, P₁, Q₀, Q₁) is a conic symplectic map if and only if
$$(0,0,P_0^{T},P_1^{T})\,F_*-(0,0,p_0^{T},p_1^{T})=0,\tag{2.8}$$
where F_* is the Jacobi matrix of F at the point (p₀, p₁, q₀, q₁).

Proof. For F : (p₀, p₁, q₀, q₁) → (P₀, P₁, Q₀, Q₁), the condition (2.8) is equivalent to
$$P_0\,dQ_0+P_1\,dQ_1=p_0\,dq_0+p_1\,dq_1,\quad\text{or}\quad P\,dQ=p\,dq,\tag{2.9}$$
where P = (P₀, P₁), Q = (Q₀, Q₁), p = (p₀, p₁), q = (q₀, q₁). Hence, in matrix form, it can be written as
$$Q_p^{T}P=0,\qquad Q_q^{T}P=p.\tag{2.10}$$
Notice that a function f(x₁, x₂, ..., xₙ) is homogeneous of degree k, i.e.,
$$f(\lambda x_1,\lambda x_2,\cdots,\lambda x_n)=\lambda^{k}f(x_1,x_2,\cdots,x_n),$$
if and only if
$$\sum_{i}x_i f_{x_i}(x_1,x_2,\cdots,x_n)=k\,f(x_1,x_2,\cdots,x_n).$$
Therefore, the condition (2.7) is equivalent to
$$P_p(p,q)\,p=P(p,q),\qquad Q_p(p,q)\,p=0.\tag{2.11}$$
If F is conic symplectic, then
$$Q_p^{T}P_p-P_p^{T}Q_p=O,\qquad Q_q^{T}P_q-P_q^{T}Q_q=O,\qquad Q_q^{T}P_p-P_q^{T}Q_p=I.\tag{2.12}$$
Combining with (2.11), we get
$$p=Q_q^{T}P_p\,p-P_q^{T}Q_p\,p=Q_q^{T}P,\qquad O=Q_p^{T}P_p\,p-P_p^{T}Q_p\,p=Q_p^{T}P.\tag{2.13}$$
This proves the "only if" part.
Conversely, if F satisfies the condition (2.8), then it satisfies (2.9), which means that it is symplectic. We know that if a matrix is symplectic, then its transpose is also symplectic. Therefore,
$$P_qP_p^{T}-P_pP_q^{T}=O,\qquad Q_qQ_p^{T}-Q_pQ_q^{T}=O,\qquad P_pQ_q^{T}-P_qQ_p^{T}=I.\tag{2.14}$$
Combining with (2.10), we get
$$P=P_pQ_q^{T}P-P_qQ_p^{T}P=P_p\,p,\qquad 0=Q_qQ_p^{T}P-Q_pQ_q^{T}P=-Q_p\,p.\tag{2.15}$$
This means that F is conic, by (2.11). This finishes the proof. □

b. Consider R₊ × R^{2n+1} as the product of the positive real line R₊ and the contact space R^{2n+1}. We use (w, x, y, z) to denote the coordinates of R₊ × R^{2n+1}, with w > 0 and with x, y, z as before.

Definition 2.4. A map G : R₊ × R^{2n+1} → R₊ × R^{2n+1} is called a positive product map if it is composed of a map g : R^{2n+1} → R^{2n+1} and a positive function γ : R^{2n+1} → R₊ in the form
$$(w,x,y,z)\ \longmapsto\ (W,X,Y,Z),\qquad W=w\,\gamma(x,y,z),\qquad (X,Y,Z)=g(x,y,z).\tag{2.16}$$
We denote by γ ⊗ g the positive product map composed of the map g and the function γ.


c. Define the mapping S : R₊ × R^{2n+1} → R₊^{2n+2} by
$$S:\ (w,x,y,z)\ \longmapsto\ (p_0,p_1,q_0,q_1)=(w,\ wx,\ z,\ y).\tag{2.17}$$
Then the inverse S⁻¹ : R₊^{2n+2} → R₊ × R^{2n+1} is given by
$$S^{-1}:\ (p_0,p_1,q_0,q_1)\ \longmapsto\ (w,x,y,z)=\Bigl(p_0,\ \frac{p_1}{p_0},\ q_1,\ q_0\Bigr),\qquad p_0\neq 0.\tag{2.18}$$
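The maps (2.17) and (2.18) are simple enough to code directly; the following sketch (names ours) checks that S and S⁻¹ are mutually inverse on R₊ × R^{2n+1}:

```python
def S(w, x, y, z):
    # symplectization (2.17): (w, x, y, z) -> (p0, p1, q0, q1) = (w, w*x, z, y)
    return (w, [w*xi for xi in x], z, y)

def S_inv(p0, p1, q0, q1):
    # contactization (2.18): requires p0 != 0
    return (p0, [pi/p0 for pi in p1], q1, q0)

w, x, y, z = 2.0, [0.5, -1.0], [3.0, 4.0], 7.0
p0, p1, q0, q1 = S(w, x, y, z)     # (2.0, [1.0, -2.0], 7.0, [3.0, 4.0])
back = S_inv(p0, p1, q0, q1)
```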

Lemma 2.5. [Fen93b,Fen95] Given a transformation F : (p₀, p₁, q₀, q₁) → (P₀, P₁, Q₀, Q₁) on R₊^{2n+2}, let G = S⁻¹ ∘ F ∘ S. Then we have:
1° F is a conic map on R₊^{2n+2} if and only if G is a positive product map on R₊ × R^{2n+1}; in this case, if we write G = γ ⊗ g, then
$$\gamma(x,y,z)=P_0(1,x,z,y),\tag{2.19}$$
and g : (x, y, z) → (X, Y, Z) is given by
$$X=\frac{P_1(1,x,z,y)}{P_0(1,x,z,y)},\qquad Y=Q_1(1,x,z,y),\qquad Z=Q_0(1,x,z,y).\tag{2.20}$$
2° F is a conic symplectic map if and only if G is a positive product map, say γ ⊗ g, on R₊ × R^{2n+1} with g also a contact map on R^{2n+1}. Moreover, in this case, the multiplier of the contact map g is equal to γ⁻¹ = P₀⁻¹(1, x, z, y).

Proof. The conclusion 1° is easily proved by some simple calculations. Below we prove 2°. Let F send (p₀, p₁, q₀, q₁) → (P₀, P₁, Q₀, Q₁) and let G send (w, x, y, z) → (W, X, Y, Z). Then, using conclusion 1°, we have
$$P_0\circ S=wP_0(1,x,z,y)=w\gamma,\qquad P_1\circ S=wP_1(1,x,z,y)=w\gamma\,X(x,y,z),$$
$$S_*=\begin{pmatrix}1&0&0&0\\ x&wI_n&0&0\\ 0&0&0&1\\ 0&0&I_n&0\end{pmatrix},\qquad
G_*=\frac{\partial(W,X,Y,Z)}{\partial(w,x,y,z)}=\begin{pmatrix}\gamma&w\gamma_x\ \ w\gamma_y\ \ w\gamma_z\\ \begin{matrix}0\\0\\0\end{matrix}&g_*\end{pmatrix},\qquad
S_*\circ G=\begin{pmatrix}1&0&0&0\\ X&WI_n&0&0\\ 0&0&0&1\\ 0&0&I_n&0\end{pmatrix},$$
and compute
$$\bigl((0,0,P_0^{T},P_1^{T})F_*-(0,0,p_0^{T},p_1^{T})\bigr)\circ S\cdot S_*$$
$$=\bigl((0,0,P_0^{T},P_1^{T})\circ S\bigr)(F_*\circ S)S_*-\bigl((0,0,p_0^{T},p_1^{T})\circ S\bigr)S_*$$
$$=(0,0,w\gamma,w\gamma X^{T})(F_*\circ S)S_*-(0,0,w,wx^{T})S_*$$
$$=(0,0,w\gamma,w\gamma X^{T})(S_*\circ G)G_*-(0,0,w,wx^{T})S_*$$
$$=w\gamma\bigl(0,\,(0,X^{T},1)g_*\bigr)-w\gamma\bigl(0,\,\gamma^{-1}(0,x^{T},1)\bigr).$$
Noting that S is a diffeomorphism, S_* is non-singular, w > 0, and γ > 0, we obtain
$$(0,0,P_0^{T},P_1^{T})F_*-(0,0,p_0^{T},p_1^{T})\equiv 0\iff(0,X^{T},1)g_*-\gamma^{-1}(0,x^{T},1)\equiv 0,$$
which proves the conclusion 2°. □

Lemma 2.5 establishes correspondences between the conic symplectic space and the contact space, and between conic symplectic maps and contact maps. We call the transform from F to G = S⁻¹ ∘ F ∘ S = γ ⊗ g the contactization of conic symplectic maps, the transform from G = γ ⊗ g to F = S ∘ G ∘ S⁻¹ the symplectization of contact maps, the transform S : R₊ × R^{2n+1} → R₊^{2n+2} the symplectization of the contact space, and the transform C = S⁻¹ : R₊^{2n+2} → R₊ × R^{2n+1} the contactization of the conic symplectic space.

11.3 Contact Generating Functions for Contact Maps


With the preliminaries of the last section, it is natural to derive a contact generating function theory for contact maps from the well-known symplectic analogue[Fen93b,Fen95,Shu93]. The following two lemmas can be proved easily[Fen95].

Lemma 3.1. A Hamiltonian φ : R^{2n+2} → R is a conic function if and only if the associated Hamiltonian vector field a_φ = J∇φ is conic, i.e., a(T_λ z) = T_λ a(z) for λ > 0, z ∈ R^{2n+2}, where
$$J=\begin{pmatrix}O&-I_{n+1}\\I_{n+1}&O\end{pmatrix}.$$

Lemma 3.2. A linear map (p, q) → C(p, q) is a conic transformation on R^{2n+2}, i.e., C ∘ T_λ = T_λ ∘ C, if and only if the matrix C has the block-diagonal form
$$C=\begin{pmatrix}C_0&O\\O&C_1\end{pmatrix}$$
with (n + 1) × (n + 1) matrices C₀ and C₁.

Noting that the matrix in gl(2n + 2)
$$C=\frac12(I+JB),\qquad B=B^{T}\in Sm(2n+2),\tag{3.1}$$
establishes a 1-1 correspondence between near-zero Hamiltonian vector fields z → a(z) ≡ J∇φ(z) and near-identity symplectic maps z → g(z) via the generating relation
$$g(z)-z=J\nabla\phi\bigl(Cg(z)+(I-C)z\bigr),\tag{3.2}$$
and combining Lemma 3.1 and Lemma 3.2, we find that the matrix
$$C=\begin{pmatrix}C_0&O\\O&I-C_0^{T}\end{pmatrix},\qquad C_0\in gl(n+1),\tag{3.3}$$
establishes a 1-1 correspondence between near-zero conic Hamiltonian vector fields z → a(z) = J∇φ(z) and near-identity conic symplectic maps z → g(z) via the generating relation (3.2). Write
$$C_0=\begin{pmatrix}\alpha&\beta^{T}\\\gamma&\delta\end{pmatrix}$$
with α ∈ R, β, γ ∈ Rⁿ, and δ ∈ gl(n). Then the generating relation (3.2), with generating matrix C given by (3.3), can be expressed as
$$\hat p_0-p_0=-\phi_{q_0}(\bar p,\bar q),\qquad \hat p_1-p_1=-\phi_{q_1}(\bar p,\bar q),\qquad \hat q_0-q_0=\phi_{p_0}(\bar p,\bar q),\qquad \hat q_1-q_1=\phi_{p_1}(\bar p,\bar q),\tag{3.4}$$
11.3 Contact Generating Functions for Contact Maps 489

5 6 5 6
p0 q0
where p = and q = are given by
p1 q1


⎪ p0 + (1 − α)p0 + β T (
p0 = α p1 − p1 ),




⎨ p1 = δ
p1 + (I − δ)p1 + γ(
p0 − p0 ),
(3.5)

⎪ q 0 = (1 − α)q0 + αq0 − γ T (q1 − q1 ),





q 1 = (I − δ T )
q1 + δ T q1 − β(
q0 − q0 ).

Every conic function $\phi$ can be contactized to an (arbitrary) function $\psi(x,y,z)$ as
follows:
$$\psi(x,y,z) = \phi(1, x, z, y),\tag{3.6}$$
i.e.,
$$\phi(p_0,p_1,q_0,q_1) = p_0\,\phi(1, p_1/p_0, q_0, q_1) = p_0\,\psi(p_1/p_0, q_1, q_0),
\qquad p_0\neq 0,$$

and we have the partial derivative relations:
$$\left\{\begin{aligned}
\phi_{q_0}(p_0,p_1,q_0,q_1) &= p_0\,\psi_z(x,y,z),\\
\phi_{q_1}(p_0,p_1,q_0,q_1) &= p_0\,\psi_y(x,y,z),\\
\phi_{p_0}(p_0,p_1,q_0,q_1) &= \psi(x,y,z) - x^T\psi_x(x,y,z) = \psi_e(x,y,z),\\
\phi_{p_1}(p_0,p_1,q_0,q_1) &= \psi_x(x,y,z),
\end{aligned}\right.\tag{3.7}$$
where $x = \dfrac{p_1}{p_0}$, $y = q_1$, $z = q_0$ on the right hand side. So, under the
contactizing transforms
$$S:\ \begin{pmatrix} w\\ x\\ y\\ z\end{pmatrix}\longrightarrow
\begin{pmatrix} p_0\\ p_1\\ q_0\\ q_1\end{pmatrix} =
\begin{pmatrix} w\\ wx\\ z\\ y\end{pmatrix},\qquad
\begin{pmatrix} \hat w\\ \hat x\\ \hat y\\ \hat z\end{pmatrix}\longrightarrow
\begin{pmatrix} \hat p_0\\ \hat p_1\\ \hat q_0\\ \hat q_1\end{pmatrix} =
\begin{pmatrix} \hat w\\ \hat w\hat x\\ \hat z\\ \hat y\end{pmatrix},\qquad
\begin{pmatrix} \bar w\\ \bar x\\ \bar y\\ \bar z\end{pmatrix}\longrightarrow
\begin{pmatrix} \bar p_0\\ \bar p_1\\ \bar q_0\\ \bar q_1\end{pmatrix} =
\begin{pmatrix} \bar w\\ \bar w\bar x\\ \bar z\\ \bar y\end{pmatrix},\tag{3.8}$$

the generating relation (3.4) turns into
$$\left\{\begin{aligned}
\hat w - w &= -\bar w\,\psi_z(\bar x,\bar y,\bar z),\\
\hat w\hat x - wx &= -\bar w\,\psi_y(\bar x,\bar y,\bar z),\\
\hat z - z &= \psi_e(\bar x,\bar y,\bar z),\\
\hat y - y &= \psi_x(\bar x,\bar y,\bar z),
\end{aligned}\right.\tag{3.9}$$
and Equation (3.5) turns into
$$\left\{\begin{aligned}
\bar w &= \alpha\hat w + (1-\alpha)w + \beta^T(\hat w\hat x - wx),\\
\bar w\bar x &= \delta\hat w\hat x + (I-\delta)wx + \gamma(\hat w - w),\\
\bar z &= (1-\alpha)\hat z + \alpha z - \gamma^T(\hat y - y),\\
\bar y &= (I-\delta^T)\hat y + \delta^T y - \beta(\hat z - z).
\end{aligned}\right.\tag{3.10}$$

Since the $p_0$-axis is distinguished for the contactization, in which we should always
take $p_0\neq 0$, it is natural to require $\beta = 0$ in (3.5). Let
$\hat\mu = \dfrac{w}{\hat w} = \dfrac{p_0}{\hat p_0}$ and
$\bar\mu = \dfrac{w}{\bar w} = \dfrac{p_0}{\bar p_0}$; then we obtain from Equations (3.9)
and (3.10)
$$\hat\mu = \frac{1 + \alpha\psi_z(\bar x,\bar y,\bar z)}{1 - (1-\alpha)\psi_z(\bar x,\bar y,\bar z)},\qquad
\bar\mu = 1 + \alpha\psi_z(\bar x,\bar y,\bar z),\tag{3.11}$$

and the induced contact transformation on the contact $(x,y,z)$ space $\mathbf{R}^{2n+1}$ is
$$\left\{\begin{aligned}
\hat x - x &= -\psi_y(\bar x,\bar y,\bar z) + \psi_z(\bar x,\bar y,\bar z)\bigl((1-\alpha)\hat x + \alpha x\bigr),\\
\hat y - y &= \psi_x(\bar x,\bar y,\bar z),\\
\hat z - z &= \psi_e(\bar x,\bar y,\bar z),
\end{aligned}\right.\tag{3.12}$$
with the bar variables on the right hand side given in terms of $x, y, z$ and
$\hat x,\hat y,\hat z$ by
$$\left\{\begin{aligned}
\bar x &= d_1\hat x + d_2 x + d_0,\\
\bar y &= (I-\delta^T)\hat y + \delta^T y,\\
\bar z &= (1-\alpha)\hat z + \alpha z - \gamma^T(\hat y - y),
\end{aligned}\right.\tag{3.13}$$
where
$$\left\{\begin{aligned}
d_1 &= \bigl(I - (1-\alpha)\psi_z(\bar x,\bar y,\bar z)\bigr)\delta,\\
d_2 &= \bigl(I + \alpha\psi_z(\bar x,\bar y,\bar z)\bigr)(I-\delta),\\
d_0 &= -\psi_z(\bar x,\bar y,\bar z)\,\gamma.
\end{aligned}\right.\tag{3.14}$$
Summarizing the above discussions, we have:

Theorem 3.3. Relations (3.12)--(3.14) give a contact map $(x,y,z)\to(\hat x,\hat y,\hat z)$
via the contact generating function $\psi(x,y,z)$ under the type
$C_0 = \begin{pmatrix} \alpha & O\\ \gamma & \delta\end{pmatrix}$, and vice versa.

However, a difficulty in the algorithmic implementation lies in the following fact. Unlike
$\bar y$ and $\bar z$, which are linear combinations of $\hat y, y$ and $\hat z, z$ with
constant matrix coefficients, the combination $\bar x = d_1\hat x + d_2 x + d_0$ of
$\hat x$ and $x$ is not explicitly given, since $d_1, d_2$ are matrices with coefficients
depending on $\bar\psi_z = \psi_z(\bar x,\bar y,\bar z)$, which in turn depends on
$(\bar x,\bar y,\bar z)$; thus the entire system of equations for solving
$\hat x,\hat y,\hat z$ in terms of $x, y, z$ is highly implicit. The exceptional cases are
the following:
(E1) $\alpha = 0$, $\delta = O_n$, $\gamma = O$:
$$\hat\mu = \bigl(1 - \psi_z(x,\hat y,\hat z)\bigr)^{-1},\qquad \bar\mu = 1,\tag{3.15}$$
$$\left\{\begin{aligned}
\hat x - x &= -\psi_y(x,\hat y,\hat z) + \hat x\,\psi_z(x,\hat y,\hat z),\\
\hat y - y &= \psi_x(x,\hat y,\hat z),\\
\hat z - z &= \psi_e(x,\hat y,\hat z) = \psi(x,\hat y,\hat z) - x^T\psi_x(x,\hat y,\hat z).
\end{aligned}\right.\tag{3.16}$$
(E2) $\alpha = 1$, $\delta = I_n$, $\gamma = O$:
$$\hat\mu = \bar\mu = 1 + \psi_z(\hat x,y,z),\tag{3.17}$$
$$\left\{\begin{aligned}
\hat x - x &= -\psi_y(\hat x,y,z) + x\,\psi_z(\hat x,y,z),\\
\hat y - y &= \psi_x(\hat x,y,z),\\
\hat z - z &= \psi_e(\hat x,y,z) = \psi(\hat x,y,z) - \hat x^T\psi_x(\hat x,y,z).
\end{aligned}\right.\tag{3.18}$$
(E3) $\alpha = \dfrac{1}{2}$, $\delta = \dfrac{1}{2}I_n$, $\gamma = O$:
$$\hat\mu = \frac{1 + \frac{1}{2}\psi_z(\bar x,\bar y,\bar z)}{1 - \frac{1}{2}\psi_z(\bar x,\bar y,\bar z)},\qquad
\bar\mu = 1 + \frac{1}{2}\psi_z(\bar x,\bar y,\bar z),\tag{3.19}$$
$$\left\{\begin{aligned}
\hat x - x &= -\psi_y(\bar x,\bar y,\bar z) + \psi_z(\bar x,\bar y,\bar z)\,\frac{\hat x + x}{2},\\
\hat y - y &= \psi_x(\bar x,\bar y,\bar z),\\
\hat z - z &= \psi_e(\bar x,\bar y,\bar z) = \psi(\bar x,\bar y,\bar z) - \bar x^T\psi_x(\bar x,\bar y,\bar z),
\end{aligned}\right.\tag{3.20}$$
with
$$\bar x = \frac{\hat x + x}{2} - \frac{1}{4}\psi_z(\bar x,\bar y,\bar z)(\hat x - x),\qquad
\bar y = \frac{\hat y + y}{2},\qquad \bar z = \frac{\hat z + z}{2}.\tag{3.21}$$

For $\psi_z \equiv \lambda = $ constant, the case (E3) reduces to
$$\hat\mu = \frac{1 + \frac{1}{2}\lambda}{1 - \frac{1}{2}\lambda},\qquad
\bar\mu = 1 + \frac{1}{2}\lambda,\tag{3.22}$$
$$\left\{\begin{aligned}
\hat x - x &= -\psi_y(\bar x,\bar y,\bar z) + \lambda\,\frac{\hat x + x}{2},\\
\hat y - y &= \psi_x(\bar x,\bar y,\bar z),\\
\hat z - z &= \psi_e(\bar x,\bar y,\bar z) = \psi(\bar x,\bar y,\bar z) - \bar x^T\psi_x(\bar x,\bar y,\bar z),
\end{aligned}\right.\tag{3.23}$$
with
$$\bar x = \frac{\hat x + x}{2} - \frac{1}{4}\lambda(\hat x - x),\qquad
\bar y = \frac{\hat y + y}{2},\qquad \bar z = \frac{\hat z + z}{2}.\tag{3.24}$$

Note that the symplectic map induced by a generating function $\phi$ from the relation
(3.2) can be represented as the composition of the maps, non-symplectic in general,
$z\to\bar z$ and $\bar z\to\hat z$:
$$\bar z = z + CJ\nabla\phi(\bar z),\qquad \hat z = \bar z + (I-C)J\nabla\phi(\bar z).$$

Theorem 3.4. The contact map $(x,y,z)\to(\hat x,\hat y,\hat z)$ induced by a contact
generating function $\psi$ from the relations (3.12)--(3.14) can be represented as the
composition of the maps $(x,y,z)\to(\bar x,\bar y,\bar z)$ and
$(\bar x,\bar y,\bar z)\to(\hat x,\hat y,\hat z)$, which are not contact in general and are
given, respectively, as follows:
$$\left\{\begin{aligned}
\bar x - x &= -\delta\psi_y(\bar x,\bar y,\bar z) + \alpha\psi_z(\bar x,\bar y,\bar z)\,x - \gamma\psi_z(\bar x,\bar y,\bar z),\\
\bar y - y &= (I-\delta^T)\psi_x(\bar x,\bar y,\bar z),\\
\bar z - z &= (1-\alpha)\psi_e(\bar x,\bar y,\bar z) - \gamma^T\psi_x(\bar x,\bar y,\bar z)
\end{aligned}\right.\tag{3.25}$$
and
$$\left\{\begin{aligned}
\hat x - \bar x &= -(I-\delta)\psi_y(\bar x,\bar y,\bar z) + (1-\alpha)\psi_z(\bar x,\bar y,\bar z)\,\hat x + \gamma\psi_z(\bar x,\bar y,\bar z),\\
\hat y - \bar y &= \delta^T\psi_x(\bar x,\bar y,\bar z),\\
\hat z - \bar z &= \alpha\psi_e(\bar x,\bar y,\bar z) + \gamma^T\psi_x(\bar x,\bar y,\bar z).
\end{aligned}\right.\tag{3.26}$$

(3.25) and (3.26) are the 2-stage form of the generating relation (3.12) of the contact map
induced by the generating function $\psi$ under the type
$C_0 = \begin{pmatrix}\alpha & O\\ \gamma & \delta\end{pmatrix}$. Corresponding to the
exceptional cases (E1), (E2) and (E3), the above 2-stage representation has simpler forms,
which we do not spell out here.

11.4 Contact Algorithms for Contact Systems

Consider the contact system (1.3) with vector field $a$ defined by a contact Hamiltonian
$K$ according to Equation (1.6). Taking $\psi(x,y,z) = sK(x,y,z)$ in (3.12)--(3.14) as the
generating function, we obtain contact difference schemes with first order of accuracy for
the contact system (1.3), associated with all possible types
$C_0 = \begin{pmatrix}\alpha & O\\ \gamma & \delta\end{pmatrix}$. The simplest and most
important cases are the following (write $\bar K_x = K_x(\bar x,\bar y,\bar z)$, etc.)[Fen95].
11.4.1 Q Contact Algorithm

Q. Contact analog of the symplectic method $(p,\hat Q)^1$ ($\alpha = 0$, $\delta = O_n$,
$\gamma = O$):
$$\text{1-stage form:}\quad\left\{\begin{aligned}
\hat x &= x + s\bigl(-K_y(x,\hat y,\hat z) + \hat x\,K_z(x,\hat y,\hat z)\bigr),\\
\hat y &= y + sK_x(x,\hat y,\hat z),\\
\hat z &= z + sK_e(x,\hat y,\hat z);
\end{aligned}\right.$$
$$\text{2-stage form:}\quad
\bar x = x,\quad \bar y = y + s\bar K_x,\quad \bar z = z + s\bar K_e;\qquad
\hat x = \bar x + s(-\bar K_y + \hat x\bar K_z),\quad \hat y = \bar y,\quad \hat z = \bar z.
\tag{4.1}$$
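To make the structure of the Q scheme concrete, here is a minimal numerical sketch (not from the book; the sample Hamiltonian $K = z$, the fixed-point solver, and all names are our own assumptions). Reading the derivatives of $K$ at the mixed point $(x,\hat y,\hat z)$, in analogy with the $(p,\hat Q)$ method, makes the $\hat y,\hat z$ equations implicit; since $K_z$ is a scalar, the $\hat x$ equation is then solved in closed form.

```python
import numpy as np

def q_contact_step(x, y, z, s, K, Kx, Ky, Kz, iters=50):
    """One step of a Q-type contact scheme by fixed-point iteration.

    Derivatives of K are evaluated at the mixed point (x, y_new, z_new);
    since Kz is scalar, x_new = x + s*(-Ky + x_new*Kz) has a closed form.
    """
    y_new, z_new = y.copy(), z
    for _ in range(iters):
        kx = Kx(x, y_new, z_new)
        ky = Ky(x, y_new, z_new)
        kz = Kz(x, y_new, z_new)
        ke = K(x, y_new, z_new) - x @ kx     # Ke = K - x^T Kx
        y_new = y + s * kx
        z_new = z + s * ke
    x_new = (x - s * ky) / (1.0 - s * kz)
    return x_new, y_new, z_new

# toy contact Hamiltonian K(x, y, z) = z with hand-coded derivatives
K  = lambda x, y, z: z
Kx = lambda x, y, z: np.zeros_like(x)
Ky = lambda x, y, z: np.zeros_like(x)
Kz = lambda x, y, z: 1.0

x1, y1, z1 = q_contact_step(np.array([1.0]), np.array([2.0]), 3.0,
                            0.1, K, Kx, Ky, Kz)
# fixed point for this K: x_new = x/(1-s), y_new = y, z_new = z/(1-s)
```

For this toy $K$ the iteration converges geometrically with rate $s$, so a few dozen iterations reach machine precision.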

11.4.2 P Contact Algorithm

P. Contact analog of the symplectic method $(\hat P, q)$ ($\alpha = 1$, $\delta = I_n$,
$\gamma = O$):
$$\text{1-stage form:}\quad\left\{\begin{aligned}
\hat x &= x + s\bigl(-K_y(\hat x,y,z) + x\,K_z(\hat x,y,z)\bigr),\\
\hat y &= y + sK_x(\hat x,y,z),\\
\hat z &= z + sK_e(\hat x,y,z);
\end{aligned}\right.$$
$$\text{2-stage form:}\quad
\bar x = x + s(-\bar K_y + x\bar K_z),\quad \bar y = y,\quad \bar z = z;\qquad
\hat x = \bar x,\quad \hat y = \bar y + s\bar K_x,\quad \hat z = \bar z + s\bar K_e.
\tag{4.2}$$

11.4.3 C Contact Algorithm

C. Contact version of the Poincaré generating function method, similarly to the symplectic
case ($\alpha = \dfrac{1}{2}$, $\delta = \dfrac{1}{2}I_n$, $\gamma = O$).
2-stage form:
$$\bar x = x + \frac{s}{2}(-\bar K_y + x\bar K_z),\qquad
\bar y = y + \frac{s}{2}\bar K_x,\qquad \bar z = z + \frac{s}{2}\bar K_e,$$
$$\hat x = \bar x + \frac{s}{2}(-\bar K_y + \hat x\bar K_z)
= \Bigl(\bar x - \frac{s}{2}\bar K_y\Bigr)\Bigl(1 - \frac{s}{2}\bar K_z\Bigr)^{-1},\tag{4.3}$$
$$\hat y = \bar y + \frac{s}{2}\bar K_x = 2\bar y - y,\qquad
\hat z = \bar z + \frac{s}{2}\bar K_e = 2\bar z - z.$$
2 2
One might suggest, for example, the following scheme for (1.3):
$$\hat x = x + s\,a(\hat x,y,z),\qquad \hat y = y + s\,b(\hat x,y,z),\qquad
\hat z = z + s\,c(\hat x,y,z).$$
It differs from (4.2) only in one term for $\hat x$, i.e., $\hat x K_z(\hat x,y,z)$ instead
of $xK_z(\hat x,y,z)$. This minute, but delicate, difference makes (4.2) contact and the
other non-contact!

$^1$ For the Hamiltonian system $\dot p = -H_q(p,q)$, $\dot q = H_p(p,q)$, the difference
scheme $\hat p = p - sH_q(p,\hat q)$, $\hat q = q + sH_p(p,\hat q)$ is symplectic, and we
call it the $(p,\hat Q)$ method because the pair $(p,\hat q)$, composed of the old variable
$p$ and the new variable $\hat q$, enters the Hamiltonian. The $(\hat P, q)$ method has a
similar meaning.
It should be noted that the Q and P methods are of first order of accuracy and the C method
is of second order; the proof is similar to that for the symplectic case. In principle, one
can construct contact difference schemes of arbitrarily high order of accuracy for contact
systems, as was done for Hamiltonian systems, by suitably composing the Q, P or C methods
with their respective reversible counterparts[QZ92]. Another general method for the
construction of contact difference schemes is based on the generating functions for phase
flows of contact systems, which will be developed in the next section.
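The effect of such a composition can be seen on a one-dimensional toy problem (a sketch under our own assumptions, with contact Hamiltonian $K = z$, for which the Q and P updates of $x$ reduce to multiplication by a scalar factor while the exact growth factor over a step $s$ is $e^s$):

```python
import math

# For K(x, y, z) = z in one dimension, Kx = Ky = 0 and Kz = 1, so the
# Q scheme gives x_new = x + s*x_new and the P scheme gives x_new = x + s*x.
def q_factor(s):
    return 1.0 / (1.0 - s)       # first-order approximation of e^s

def p_factor(s):
    return 1.0 + s               # first-order approximation of e^s

def qp_factor(s):
    return p_factor(s / 2) * q_factor(s / 2)   # composed half steps: second order

s = 0.1
err_q  = abs(q_factor(s) - math.exp(s))
err_qp = abs(qp_factor(s) - math.exp(s))
```

The composed factor $(1+s/2)/(1-s/2) = 1 + s + s^2/2 + O(s^3)$ matches $e^s$ through second order, while each single scheme only matches through first order.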

11.5 Hamilton–Jacobi Equations for Contact Systems


We recall that a near-identity contact map $g:(x,y,z)\to(\hat x,\hat y,\hat z)$ can be
generated from the so-called generating function $\psi(x,y,z)$, associated with a matrix
$C_0 = \begin{pmatrix}\alpha & O\\ \gamma & \delta\end{pmatrix}$, by the relations
(3.12)--(3.14). Accordingly, to the phase flow $e^{tK}$ of a contact system with contact
Hamiltonian $K$ there corresponds a time-dependent generating function $\psi^t(x,y,z)$ such
that the map $e^{tK}:(x,y,z)\to(\hat x,\hat y,\hat z)$ is generated from $\psi^t$ by the
relations (3.12)--(3.14), in which $\psi$ is replaced by $\psi^t$ and $C_0$ is given in
advance as above. The function $\psi^t$ is determined by $K$ and $C_0$; below we derive
the relevant relations between them.
Let $H(p_0,p_1,q_0,q_1) = p_0 K\Bigl(\dfrac{p_1}{p_0}, q_1, q_0\Bigr)$, $p_0\neq 0$. With
this conic Hamiltonian and with normal Darboux matrices
$C = \begin{pmatrix} C_0 & O\\ O & I - C_0^T\end{pmatrix}$, where
$C_0 = \begin{pmatrix}\alpha & O\\ \gamma & \delta\end{pmatrix}$, we get the
Hamilton--Jacobi equation
$$\frac{\partial}{\partial t}\phi^t(u) = H\bigl(u + (I-C)J\nabla\phi^t(u)\bigr),\qquad
u = (p_0, p_1, q_0, q_1)^T,\tag{5.1}$$
satisfied by the generating function $\phi^t(u)$ of the phase flow $g_H^t$ of the
Hamiltonian system associated with the Hamiltonian $H$, while the phase flow $g_H^t$ is
generated from $\phi^t$ by the relation
$$g_H^t(u) - u = J\nabla\phi^t\bigl(Cg_H^t(u) + (I-C)u\bigr).\tag{5.2}$$

On the other hand, according to the discussion in Section 11.3, we have
$$\phi^t(p_0,p_1,q_0,q_1) = p_0\,\psi^t(x,y,z),\qquad\text{with}\quad
x = \frac{p_1}{p_0},\quad y = q_1,\quad z = q_0.$$
So, by simple calculations, we have

$$u + (I-C)J\nabla\phi^t(u) =
\begin{pmatrix}
p_0 - (1-\alpha)\phi_{q_0}\\
p_1 + \gamma\phi_{q_0} - (I-\delta)\phi_{q_1}\\
q_0 + \alpha\phi_{p_0} + \gamma^T\phi_{p_1}\\
q_1 + \delta^T\phi_{p_1}
\end{pmatrix} =
\begin{pmatrix}
p_0\bigl(1 - (1-\alpha)\psi_z\bigr)\\
p_0\bigl(x + \gamma\psi_z - (I-\delta)\psi_y\bigr)\\
z + \alpha\psi_e + \gamma^T\psi_x\\
y + \delta^T\psi_x
\end{pmatrix}$$
and
$$H\bigl(u + (I-C)J\nabla\phi^t(u)\bigr)
= p_0\bigl(1 - (1-\alpha)\psi_z\bigr)\,
K\Bigl(\frac{x - (I-\delta)\psi_y + \gamma\psi_z}{1 - (1-\alpha)\psi_z},\;
y + \delta^T\psi_x,\; z + \alpha\psi_e + \gamma^T\psi_x\Bigr).$$
Therefore, from Equation (5.1), $\psi^t(x,y,z)$ satisfies
$$\frac{\partial}{\partial t}\psi^t = \bigl(1 - (1-\alpha)\psi_z\bigr)\,
K\Bigl(\frac{x - (I-\delta)\psi_y + \gamma\psi_z}{1 - (1-\alpha)\psi_z},\;
y + \delta^T\psi_x,\; z + \alpha\psi_e + \gamma^T\psi_x\Bigr).\tag{5.3}$$
Now we claim$^1$ that the following equality is valid:
$$H\bigl(u + (I-C)J\nabla\phi^t(u)\bigr) = H\bigl(u - CJ\nabla\phi^t(u)\bigr).\tag{5.4}$$

So, replacing $C$ by $C - I$ in the above discussion or, equivalently, replacing $\alpha$
and $\delta$ by $\alpha - 1$ and $\delta - I$ with $\gamma$ unchanged in (5.3), we can
derive the equation satisfied by the same $\psi^t$:
$$\frac{\partial}{\partial t}\psi^t = (1 + \alpha\psi_z)\,
K\Bigl(\frac{x + \delta\psi_y + \gamma\psi_z}{1 + \alpha\psi_z},\;
y + (\delta^T - I)\psi_x,\; z + (\alpha-1)\psi_e + \gamma^T\psi_x\Bigr).\tag{5.5}$$
(5.3) and (5.5) define the same function $\psi^t$. When $t = 0$, $e^{tK} = I$, so we should
impose the initial condition
$$\psi^0(x,y,z) = 0\tag{5.6}$$
for solving the first order partial differential equation (5.3) or (5.5). We call both
equations the Hamilton--Jacobi equations of the contact system associated with the contact
Hamiltonian $K$ and the matrix
$C_0 = \begin{pmatrix}\alpha & O\\ \gamma & \delta\end{pmatrix}$.
Specifically, we have the Hamilton--Jacobi equations for the particular cases:

(E1) $\alpha = 0$, $\delta = O$, $\gamma = O$:
$$\frac{\partial}{\partial t}\psi^t = (1 - \psi_z^t)\,
K\Bigl(\frac{x - \psi_y^t}{1 - \psi_z^t},\; y,\; z\Bigr)
= K\bigl(x,\; y - \psi_x^t,\; z - \psi_e^t\bigr).\tag{5.7}$$
(E2) $\alpha = 1$, $\delta = I_n$, $\gamma = O$:
$$\frac{\partial}{\partial t}\psi^t = K\bigl(x,\; y + \psi_x^t,\; z + \psi_e^t\bigr)
= (1 + \psi_z^t)\,K\Bigl(\frac{x + \psi_y^t}{1 + \psi_z^t},\; y,\; z\Bigr).\tag{5.8}$$

$^1$ Proof of the claim: let $\hat u = u + (I-C)J\nabla\phi^t(u)$ and
$\bar u = u - CJ\nabla\phi^t(u)$; then $u = C\hat u + (I-C)\bar u$, so
$\hat u - \bar u = J\nabla\phi^t(u) = J\nabla\phi^t\bigl(C\hat u + (I-C)\bar u\bigr)$, and
from (5.2) it follows that $\hat u = g_H^t(\bar u)$. The claim is then proved, since
$H(g_H^t(\bar u)) = H(\bar u)$ for all $u$.

(E3) $\alpha = \dfrac{1}{2}$, $\delta = \dfrac{1}{2}I_n$, $\gamma = O$:
$$\frac{\partial}{\partial t}\psi^t
= \Bigl(1 - \frac{1}{2}\psi_z^t\Bigr)\,
K\Bigl(\frac{x - \frac{1}{2}\psi_y^t}{1 - \frac{1}{2}\psi_z^t},\;
y + \frac{1}{2}\psi_x^t,\; z + \frac{1}{2}\psi_e^t\Bigr)
= \Bigl(1 + \frac{1}{2}\psi_z^t\Bigr)\,
K\Bigl(\frac{x + \frac{1}{2}\psi_y^t}{1 + \frac{1}{2}\psi_z^t},\;
y - \frac{1}{2}\psi_x^t,\; z - \frac{1}{2}\psi_e^t\Bigr).\tag{5.9}$$
Remark 5.1. On the construction of high order contact difference schemes.
If $K$ is analytic, then one can solve $\psi^t(x,y,z)$ from the above Hamilton--Jacobi
equations in the form of a power series in the time $t$, whose coefficients are recursively
determined by $K$ and the related matrix $C_0$. The power series is simply obtained from
the corresponding conic Hamiltonian generating function $\phi^t(p_0,p_1,q_0,q_1)$ by
$\psi^t(x,y,z) = \phi^t(1,x,z,y)$, since the power series expression of $\phi^t$ with
respect to $t$ for the conic Hamiltonian
$H(p_0,p_1,q_0,q_1) = p_0 K\Bigl(\dfrac{p_1}{p_0}, q_1, q_0\Bigr)$ has been given
in[FW94]. Taking a finite truncation of the power series up to an arbitrary order $m$ with
respect to the time $t$, and replacing the generating function $\psi$ in (3.12)--(3.14) by
the truncation, one obtains a contact difference scheme of order $m$ for the contact
system defined by the contact Hamiltonian $K$. The proofs of these assertions are similar
to those in the Hamiltonian system case, and hence are omitted here.
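For instance (a sketch of the first terms only, for the case (E2) equation (5.8)), substituting the ansatz $\psi^t = t\psi_1 + t^2\psi_2 + O(t^3)$ into (5.8) and matching powers of $t$ gives

```latex
\frac{\partial}{\partial t}\psi^t = K\bigl(x,\; y+\psi_x^t,\; z+\psi_e^t\bigr)
\;\Longrightarrow\;
\psi^t(x,y,z) = t\,K + \frac{t^2}{2}\bigl(K_y^{T}K_x + K_z K_e\bigr) + O(t^3),
\qquad K_e = K - x^{T}K_x,
```

with $K$ and its derivatives evaluated at $(x,y,z)$; truncating after the $t^2$ term and inserting the truncation into (3.12)--(3.14) yields a contact scheme of order two.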
Bibliography

[Arn78] V. I. Arnold: Ordinary Differential Equations. The MIT Press, New York, (1978).
[Arn88] V. I. Arnold: Geometrical Methods in the Theory of Ordinary Differential Equations.
Springer-Verlag, Berlin, (1988).
[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. Springer-Verlag, GTM
60, Berlin Heidelberg, Second edition, (1989).
[Etn03] J. Etnyre: Introductory lectures on contact geometry. In Proc. Sympos. Pure Math.,
volume 71, pages 81–107. Math.SG/0111118, (2003).
[Fen93b] K. Feng: Symplectic, contact and volume preserving algorithms. In Z. C. Shi and
T. Ushijima, editors, Proc. 1st China–Japan Conf. on Computation of Differential Equations
and Dynamical Systems, pages 1–28. World Scientific, Singapore, (1993).
[Fen95] K. Feng: Collected Works of Feng Kang, volumes I, II. National Defence Industry
Press, Beijing, (1995).
[FW94] K. Feng and D. L. Wang: Dynamical systems and geometric construction of algorithms.
In Z. C. Shi and C. C. Yang, editors, Computational Mathematics in China, Contemporary
Mathematics of AMS, Vol. 163, pages 1–32. AMS, (1994).
[Gei03] H. Geiges: Contact geometry. Math.SG/0307242, (2003).
[MNSS91] R. Mrugała, J. D. Nulton, J. C. Schön, and P. Salamon: Contact structure in
thermodynamic theory. Reports on Mathematical Physics, 29:109–121, (1991).
[QZ92] M. Z. Qin and W. J. Zhu: Construction of higher order symplectic schemes by
composition. Computing, 47:309–321, (1992).
[Shu93] H. B. Shu: A new approach to generating functions for contact systems. Computers
Math. Applic., 25:101–106, (1993).
Chapter 12.
Poisson Bracket and Lie–Poisson Schemes

In this chapter, a clear Lie–Poisson Hamilton–Jacobi theory is presented. It is also shown
how to construct a Lie–Poisson scheme (integrator) by generating functions, which is
different from the Ge–Marsden method[GM88].

12.1 Poisson Bracket and Lie–Poisson Systems


Before introducing the Lie–Poisson system, let us first review the more general Poisson
system.

12.1.1 Poisson Bracket

Take a finite-dimensional system as an example. Given a manifold $M$ and two smooth
functions $F, G$ on $M$, i.e., $F, G\in C^\infty(M)$, if an operation $\{\cdot,\cdot\}$
defined on $C^\infty(M)$ satisfies the following four properties, then $\{\cdot,\cdot\}$
is called a Poisson bracket, and $(M, \{\cdot,\cdot\})$ is called a Poisson
manifold[Olv93].

1. Bilinearity

{aF1 + bF2 , H} = a{F1 , H} + b{F2 , H},


{F, aH1 + bH2 } = a{F, H1 } + b{F, H2 }.

2. Skew-Symmetry
{F, H} = −{H, F }.
3. Jacobi Identity

{{F, H}, G} + {{H, G}, F } + {{G, F }, H} = 0.

4. Leibniz Rule

{F1 · F2 , H} = F1 {F2 , H} + F2 {F1 , H}.

Given a Hamiltonian function H ∈ C ∞ (M ), the induced equation

Ḟ = {F, H}, ∀ F ∈ C ∞ (M )
is called the generalized Hamiltonian equation. The most typical case of a Hamiltonian
system is the one with the canonical symplectic structure, whose equations have the form
$$\dot z = JH_z,\qquad J = \begin{pmatrix} O & -I\\ I & O\end{pmatrix},\qquad
z = \begin{pmatrix} p\\ q\end{pmatrix}.$$
According to the Darboux theorem, a general Poisson system with finite dimensions can be
transformed into a local coordinate form, whose equations may be written as
$$\dot z = K(z)H_z,\tag{1.1}$$
and the corresponding Poisson bracket is
$$\{F, H\} = (\nabla_z F(z))^T K(z)\,\nabla_z H(z),\qquad \forall F, H\in C^\infty(M).$$
The bracket defined by $K(z)$ satisfies the above four properties if and only if
$K(z) = (k_{ij}(z))$ satisfies
$$\sum_{i=1}^n\Bigl(k_{ij}(z)\frac{\partial k_{lm}(z)}{\partial z_i}
+ k_{il}(z)\frac{\partial k_{mj}(z)}{\partial z_i}
+ k_{im}(z)\frac{\partial k_{jl}(z)}{\partial z_i}\Bigr) = 0,\qquad
j, l, m = 1, 2, \cdots, n.\tag{1.2}$$
We remark that any antisymmetric constant matrix satisfies (1.2) and hence is a
Hamiltonian operator, and the bracket defined by it is a Poisson bracket. We will discuss
its algorithm in more detail in the next section.
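As a concrete check (our own illustration, not from the book): for the linear structure matrix $k_{ij}(x)=\sum_l \varepsilon_{ijl}x_l$ built from the structure constants of so(3) (a Lie–Poisson structure of the kind treated below), the cyclic sum in condition (1.2) vanishes identically:

```python
import numpy as np

# structure constants of so(3): C^l_ij = Levi-Civita symbol eps_{ijl}
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0
    eps[j, i, k] = -1.0

def structure_matrix(x):
    # k_ij(x) = sum_l eps[i, j, l] * x_l  (linear in x)
    return np.einsum('ijl,l->ij', eps, x)

x = np.array([0.3, -1.2, 0.7])
Kmat = structure_matrix(x)

# since k is linear, d k_lm / d x_i = eps[l, m, i]; evaluate the cyclic
# sum of condition (1.2) for all index triples (j, l, m)
lhs = (np.einsum('ij,lmi->jlm', Kmat, eps)
       + np.einsum('il,mji->jlm', Kmat, eps)
       + np.einsum('im,jli->jlm', Kmat, eps))
residual = np.abs(lhs).max()
```

The residual is zero up to floating-point roundoff, confirming that this linear structure matrix defines a Poisson bracket.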
Definition 1.1. A diffeomorphism $z\to\hat z = g(z): M\to M$ is called a Poisson mapping
if it preserves the Poisson bracket, i.e.,
$$\{F\circ g, H\circ g\} = \{F, H\}\circ g,\qquad \forall F, H\in C^\infty(M).\tag{1.3}$$

Theorem 1.2. For a Poisson manifold with structure matrix $K(z)$, Equation (1.3) is
equivalent to
$$g_z K(z) g_z^T = K(\hat z),$$
where $g_z$ is the Jacobian matrix of $g$ with respect to $z$.

Proof.
$$\begin{aligned}
\{F\circ g, H\circ g\} &= \bigl(\nabla(F\circ g)\bigr)^T K(z)\,\nabla(H\circ g)\\
&= (\nabla F\circ g)^T\,\frac{\partial g}{\partial z}\,K(z)\,
\Bigl(\frac{\partial g}{\partial z}\Bigr)^T (\nabla H\circ g),
\end{aligned}$$
and
$$\{F, H\}\circ g = (\nabla F\circ g)^T K(g(z))\,(\nabla H\circ g).$$
By comparison, we get
$$g_z(z)\,K(z)\bigl(g_z(z)\bigr)^T = K(g(z)) = K(\hat z).$$
The theorem is proved. $\square$
A Hamiltonian system on a Poisson manifold usually refers to the following ODEs:
$$\frac{dz}{dt} = K(z)\nabla H(z),\tag{1.4}$$
where $H(z)$ is a Hamiltonian function.
The phase flow of Equation (1.4), which is expressed as $g^t(z) = g(t,z) = g_H(t,z)$, is a
one parameter diffeomorphism group (at least locally), i.e.,
$$g^0 = \text{identity},\qquad g^{t_1+t_2} = g^{t_1}\circ g^{t_2}.$$

Theorem 1.3. The phase flow $g_H(z,t)$ of the Hamiltonian system (1.4) is a one parameter
group of Poisson maps, i.e.,
$$\{F\circ g(z,t), G\circ g(z,t)\} = \{F, G\}\circ g(z,t).\tag{1.5}$$
Proof. See[Olv93]. $\square$

By Theorem 1.2, we get
$$g_z(z,t)\,K(z)\bigl(g_z(z,t)\bigr)^T = K(g(z,t)).\tag{1.6}$$

Definition 1.4. A smooth function $C(z)$ is called a Casimir function if
$$\{C(z), F(z)\} = 0,\qquad \forall F\in C^\infty(M).$$

Definition 1.5. $F(z)\in C^\infty(M)$ is a first integral of the Hamiltonian system iff
$\{F, H\} = 0$. Obviously, every Casimir function is a first integral.

12.1.2 Lie–Poisson Systems

The Lie–Poisson system is a common type[MW83,MR99] of Poisson system. Its structure space
is the dual space of a Lie algebra, and its bracket is called the Lie–Poisson bracket.
There are two types of definition for the Lie–Poisson bracket: one relies on coordinates,
and the other is coordinate-free.

Lie–Poisson bracket. Let $\mathfrak g$ be an $r$-dimensional Lie algebra and $C_{ij}^k$
$(i,j,k = 1,2,\cdots,r)$ be the structure constants of $\mathfrak g$ w.r.t. a basis
$v_1, v_2, \cdots, v_r$. Let $V$ be another $r$-dimensional linear space, with coordinates
$x = (x_1, x_2, \cdots, x_r)$. Then the Lie–Poisson bracket is defined by
$$\{F, H\} = \sum_{i,j,k=1}^r C_{ij}^k x_k\,
\frac{\partial F}{\partial x_i}\frac{\partial H}{\partial x_j},\qquad
\forall F, H\in C^\infty(V).\tag{1.7}$$
According to the notation of the Poisson system,
$$k_{ij}(x) = \sum_{l=1}^r C_{ij}^l x_l.$$
It is easy to verify that $\{F, H\}$ satisfies the four properties of a Poisson bracket.
For infinite dimensional evolution equations, there exists a corresponding coordinate
definition; see the literature[Arn89,MR99].
Lie group action and momentum mapping. The Lie–Poisson system is closely related to the
Hamiltonian system with symmetry.

Definition 1.6. The invariance of a Hamiltonian system under a one parameter
diffeomorphism group is called a symmetry of the Hamiltonian system. Under certain
circumstances, the corresponding conserved quantity is called a momentum, and the
corresponding mapping is called a momentum mapping.

A Lie group $G$ acts on a manifold $M$: every $g\in G$ corresponds to a diffeomorphism
$\varphi_g$ of $M$. Below, we consider only the translation action of $G$ on itself and
the induced actions on $TG$ and $T^*G$.

Definition 1.7 (Infinitesimal generator vector field). Let $\mathfrak g$ be the Lie
algebra of $G$ and $\xi\in\mathfrak g$; then $\exp t\xi\in G$, and
$$\xi_M = \frac{d}{dt}\Big|_{t=0}\varphi_{\exp t\xi}(x),\qquad x\in M,$$
is called the infinitesimal generator vector field of the flow $F_t = \varphi_{\exp t\xi}$.

Definition 1.8 (Lifted action). The action $\varphi_g: M\to M$ induces an action
$\tilde\varphi_g: T^*M\to T^*M$, which is defined as follows:
$$\tilde\varphi_g(\alpha) = T^*\varphi_{g^{-1}}(\alpha) = (T^*\varphi_g)^{-1}(\alpha),
\qquad \alpha\in T^*M.$$
One can prove that the lifted mapping of a diffeomorphism is symplectic.


Definition 1.9 (Momentum mapping). Let $(P,\omega)$ be a connected symplectic manifold,
$G$ a Lie group, and $\varphi_g: P\to P$ a symplectic action. We call
$\mathcal J: P\to\mathfrak g^*$ ($\mathfrak g^*$ is the dual space of $\mathfrak g$) a
momentum mapping if $\mathcal J$ satisfies
$$\forall \xi\in\mathfrak g,\qquad d\,J(\xi) = i_{\xi_P}\omega,$$
where $J(\xi)$ is defined by $J(\xi)(x) = \langle\mathcal J(x), \xi\rangle$,
$\langle\cdot,\cdot\rangle$ denotes the pairing, and $\xi_P$ is the infinitesimal
generator of the action corresponding to $\xi$.

Theorem 1.10 (Generalized Noether theorem). Let $\varphi$ be a symplectic action of $G$ on
$(P,\omega)$ with a momentum mapping $\mathcal J$. Suppose $H: P\to\mathbf{R}$ is
$G$-invariant, i.e.,
$$H(x) = H(\varphi_g(x)),\qquad \forall x\in P,\ g\in G,\tag{1.8}$$
then $\mathcal J$ is a first integral of $X_H$, i.e., if $F_t$ is the phase flow of $X_H$,
then
$$\mathcal J(F_t(x)) = \mathcal J(x).$$
Proof. See[MW83]. $\square$

Definition 1.11. A momentum mapping $\mathcal J$ is called Ad$^*$-equivariant if
$$\mathcal J(\varphi_g(x)) = \mathrm{Ad}^*_{g^{-1}}\mathcal J(x),\qquad \forall g\in G,$$
that is, the following diagram commutes:
$$\begin{array}{ccc}
P & \xrightarrow{\ \varphi_g\ } & P\\
{\scriptstyle\mathcal J}\downarrow & & \downarrow{\scriptstyle\mathcal J}\\
\mathfrak g^* & \xrightarrow{\ \mathrm{Ad}^*_{g^{-1}}\ } & \mathfrak g^*
\end{array}$$
and we call such a group action a Poisson action[AN90].


Theorem 1.12$^{[MR99]}$. $\mathcal J$ is an Ad$^*$-equivariant momentum mapping iff
$$\{J(\xi), J(\eta)\} = J([\xi,\eta]),$$
i.e., $J$ is a Lie algebra homomorphism.

Corollary 1.13. Let $\varphi$ be a Poisson action of $G$ on the manifold $M$, and
$\tilde\varphi$ the lifted action on $T^*(M) = P$. Then the action $\tilde\varphi$ is
symplectic and has an Ad$^*$-equivariant momentum mapping given by
$$\mathcal J: P\to\mathfrak g^*,\qquad J(\xi)(\alpha(q)) = \alpha(q)\cdot\xi_M(q),\qquad
q\in M,\ \alpha(q)\in T^*M,$$
where $\xi_M$ is the infinitesimal generator of $\varphi$ on $M$.

Below, we will discuss the translation action of a Lie group on itself using the above
theorem and corollary.
Let $G$ be a Lie group and $\varphi: G\times G\to G$ the left translation action
$(g,h)\to gh$. Then its infinitesimal generator is
$$\xi_G(g) = T_e R_g\,\xi = R_{g*}\,\xi.$$
Because the lifted action is symplectic, by Corollary 1.13 we can obtain the momentum
mapping:
$$\mathcal J(\alpha_q)(\xi) = \alpha_q\,T_e R_g\,\xi = \alpha_q\,R_{g*}\,\xi
\ \Longrightarrow\ \mathcal J(\alpha_q) = T_e R_g^*\,\alpha_q = R_g^*\,\alpha_q,$$
or, rewritten,
$$J_L(\alpha_q) = R_g^*\,\alpha_q.$$
Likewise, we can obtain the similar result for the right translation:
$$J_R(\alpha_q) = L_g^*\,\alpha_q.$$

Lie–Poisson bracket and equation of motion. In the previous sections, we introduced the
Lie–Poisson bracket and equations expressed in local coordinates. Below, we introduce an
intrinsic definition of the Lie–Poisson bracket and its induced equation of motion.
Let $\langle\cdot,\cdot\rangle$ be the pairing between $\mathfrak g^*$ and $\mathfrak g$.
For $F:\mathfrak g^*\to\mathbf{R}$ and $\mu\in\mathfrak g^*$, the element
$\dfrac{\delta F}{\delta\mu}\in\mathfrak g$ is defined by
$$DF(\mu)\,\gamma = \Bigl\langle\gamma, \frac{\delta F}{\delta\mu}\Bigr\rangle,\qquad
\gamma\in\mathfrak g^*.$$
If we regard $\mathfrak g^{**}\simeq\mathfrak g$, then $DF(\mu)\in\mathfrak g^{**}$
becomes an element of $\mathfrak g$, and we set
$$\{F, G\}_-(\mu) = -\Bigl\langle\mu, \Bigl[\frac{\delta F}{\delta\mu},
\frac{\delta G}{\delta\mu}\Bigr]\Bigr\rangle,$$
where $[\cdot,\cdot]$ is the Lie bracket on $\mathfrak g$. It is easy to verify that
$\{\cdot,\cdot\}_-$ satisfies the four properties of a Poisson bracket; it is often called
the $(-)$ Lie–Poisson bracket. It was first proposed by Lie[Lie88] and was rediscovered by
Berezin and others thereafter. One can prove that $\{\cdot,\cdot\}_-$ can be derived from
the left translation reduction of the canonical Poisson bracket on $T^*G$. If the right
translation reduction is used, we have the $(+)$ Lie–Poisson bracket:
$$\{F, G\}_+(\mu) = \Bigl\langle\mu, \Bigl[\frac{\delta F}{\delta\mu},
\frac{\delta G}{\delta\mu}\Bigr]\Bigr\rangle.$$

Given a Lie–Poisson bracket, we can define the Lie–Poisson equation. Take
$\{\cdot,\cdot\}_-$ as an example.

Proposition 1.14. If $H\in C^\infty(\mathfrak g^*)$ is a Hamiltonian function, then the
evolution equation on $\mathfrak g^*$ is
$$\dot F = \{F, H\}_-,$$
i.e.,
$$\dot\mu = X_H(\mu) = \mathrm{ad}^*_{\frac{\delta H}{\delta\mu}}\,\mu.\tag{1.9}$$
Proof. Because
$$\dot F(\mu) = DF(\mu)\cdot\dot\mu
= \Bigl\langle\dot\mu, \frac{\delta F}{\delta\mu}\Bigr\rangle$$
and
$$\{F, H\}_-(\mu) = -\Bigl\langle\mu, \Bigl[\frac{\delta F}{\delta\mu},
\frac{\delta H}{\delta\mu}\Bigr]\Bigr\rangle
= \Bigl\langle\mu, \mathrm{ad}_{\frac{\delta H}{\delta\mu}}\frac{\delta F}{\delta\mu}\Bigr\rangle
= \Bigl\langle \mathrm{ad}^*_{\frac{\delta H}{\delta\mu}}\mu,
\frac{\delta F}{\delta\mu}\Bigr\rangle,$$
and since $F$ is arbitrary, we obtain
$$\dot\mu = \mathrm{ad}^*_{\frac{\delta H}{\delta\mu}}\,\mu.$$
Likewise, for the right invariant system, the equation is
$$\dot\mu = -\mathrm{ad}^*_{\frac{\delta H}{\delta\mu}}\,\mu.$$
Henceforth, we will denote the system of left translation reduction by $\mathfrak g^*_-$
and that of right translation reduction by $\mathfrak g^*_+$. Generally speaking, the
rigid body and the heavy top belong to the left invariant system $\mathfrak g^*_-$, while
continuous systems, such as plasmas and the incompressible flow, are right invariant
systems $\mathfrak g^*_+$. $\square$

Lemma 1.15. $J_L$, $J_R$ are Poisson mappings.

Proof. See[MW83] . 
From this lemma, we can obtain the following reduction theorem (it will be used in the
generating function theory later).

Theorem 1.16. $1^\circ$ For the left invariant system $\mathfrak g^*_-$, the following
diagram commutes:
$$\begin{array}{ccc}
T^*G & \xrightarrow{\ G^t_{H\circ J_R}\ } & T^*G\\
{\scriptstyle J_R}\downarrow & & \downarrow{\scriptstyle J_R}\\
\mathfrak g^*_- & \xrightarrow{\ G^t_H\ } & \mathfrak g^*_-
\end{array}$$
where $H:\mathfrak g^*\to\mathbf{R}$ is a Hamiltonian function on $\mathfrak g^*_-$,
$G^t_H$ is the phase flow of the Hamiltonian function $H$ on $\mathfrak g^*_-$, and
$G^t_{H\circ J_R}$ is the phase flow of the Hamiltonian function $H\circ J_R$ on $T^*G$.
$2^\circ$ Similarly, for the right invariant system $\mathfrak g^*_+$, we have
$$\begin{array}{ccc}
T^*G & \xrightarrow{\ G^t_{H\circ J_L}\ } & T^*G\\
{\scriptstyle J_L}\downarrow & & \downarrow{\scriptstyle J_L}\\
\mathfrak g^*_+ & \xrightarrow{\ G^t_H\ } & \mathfrak g^*_+
\end{array}$$

Theorem 1.17. The solutions of a Lie–Poisson system are a bundle of coadjoint orbits. Each
coadjoint orbit is a symplectic manifold and is called a symplectic leaf of the
Lie–Poisson system.
This theorem is from the literature[AM78]. For Lie–Poisson systems such as the heavy top
and the compressible flows, a similar set of theories can be established. The reader can
refer to the literature[MRW90] for more details.

12.1.3 Introduction of the Generalized Rigid Body Motion

Let $G$ be a (finite dimensional) Lie group and $g(t)$ a motion on $G$. We define:
Velocity: $V(t) = \dot g(t)\in T_{g(t)}G$;
Angular velocity in the body description: $W_B(t) = TL_{g(t)^{-1}}(\dot g(t))\in\mathfrak g$;
Angular velocity in the space description: $W_S(t) = TR_{g(t)^{-1}}(\dot g(t))\in\mathfrak g$;
Momentum: $M(t) = A_g\dot g$, where $A_g: T_gG\to T^*_gG$ is called the moment of inertia
operator; it relates to the kinetic energy by
$$T = \frac{1}{2}(\dot g,\dot g)_g = \frac{1}{2}(W_B, W_B)
= \frac{1}{2}\langle AW_B, W_B\rangle = \frac{1}{2}\langle A_g\dot g, \dot g\rangle,$$
where $A:\mathfrak g\to\mathfrak g^*$ is the value of $A_g$ at $g = e$;
Angular momentum in the body description: $M_B(t) = T^*L_{g(t)}(M(t))\in\mathfrak g^*$;
Angular momentum in the space description: $M_S(t) = T^*R_{g(t)}(M(t))\in\mathfrak g^*$.
From the above definitions, we can obtain the following conclusions:
$$W_S(t) = \mathrm{Ad}_{g(t)}W_B(t),\qquad
M_S(t) = \mathrm{Ad}^*_{g(t)^{-1}}M_B(t),\qquad M_B(t) = AW_B(t).$$
By Theorem 1.10, we get:

Theorem 1.18 (Conservation of spatial angular momentum).
$$\frac{d}{dt}M_S(t) = 0.\tag{1.10}$$
Because the system that takes the kinetic energy $T$ as the Hamiltonian function is left
invariant, $M_S(t)$ is exactly the momentum mapping.

Corollary 1.19 (Euler equation).
$$\frac{d}{dt}M_B(t) = \{W_B(t), M_B(t)\} = \{A^{-1}M_B(t), M_B(t)\},\tag{1.11}$$
where $\{\cdot,\cdot\}$ is defined by
$$\{\xi, a\} = \mathrm{ad}^*_\xi\,a,\qquad \forall \xi\in\mathfrak g,\ a\in\mathfrak g^*.$$
Given below are two different proofs of the Euler equation.

Proof. $1^\circ$ From the Lie–Poisson equation of motion
$\dot\mu = \mathrm{ad}^*_{\frac{\delta H}{\delta\mu}}\mu$, we obtain it directly, since
$$H = \frac{1}{2}(W_B(t), M_B(t)) = \frac{1}{2}\bigl\langle A^{-1}M_B(t), M_B(t)\bigr\rangle,
\qquad \frac{\delta H}{\delta M_B} = A^{-1}M_B(t) = W_B(t).$$
$2^\circ$ By the conservation of spatial angular momentum,
$$M_S(t) = M_S(0)\ \Longrightarrow\ \mathrm{Ad}^*_{g(t)^{-1}}M_B(t)
= \mathrm{Ad}^*_{g(0)^{-1}}M_B(0) = \eta,$$
and hence
$$M_B(t) = \mathrm{Ad}^*_{g(t)}\mathrm{Ad}^*_{g(0)^{-1}}M_B(0)
= \mathrm{Ad}^*_{g(t)}\eta.\tag{1.12}$$
This also indicates that the trajectory of the Lie–Poisson equation lies in some coadjoint
orbit. From
$$\langle M_B(t), \xi\rangle = \langle\mathrm{Ad}^*_{g(t)}\eta, \xi\rangle
= \langle\eta, \mathrm{Ad}_{g(t)}\xi\rangle,\qquad \forall \xi\in\mathfrak g,$$
taking time derivatives on the two sides and using
$$\frac{d}{dt}\mathrm{Ad}_{g(t)}\xi = [TR_{g(t)^{-1}}\dot g(t), \mathrm{Ad}_{g(t)}\xi]$$
(see[AM78]), we get
$$\begin{aligned}
\Bigl\langle\frac{dM_B(t)}{dt}, \xi\Bigr\rangle
&= \bigl\langle\eta, [TR_{g(t)^{-1}}\dot g(t), \mathrm{Ad}_{g(t)}\xi]\bigr\rangle\\
&= \bigl\langle\eta, \mathrm{Ad}_{g(t)}[TL_{g(t)^{-1}}\dot g(t), \xi]\bigr\rangle\\
&= \bigl\langle\mathrm{Ad}^*_{g(t)}\eta, \mathrm{ad}_{TL_{g(t)^{-1}}\dot g(t)}\xi\bigr\rangle
= \bigl\langle M_B(t), \mathrm{ad}_{TL_{g(t)^{-1}}\dot g(t)}\xi\bigr\rangle
= \bigl\langle\mathrm{ad}^*_{TL_{g(t)^{-1}}\dot g(t)}M_B(t), \xi\bigr\rangle,
\end{aligned}$$
and therefore
$$\frac{dM_B(t)}{dt} = \mathrm{ad}^*_{TL_{g(t)^{-1}}\dot g(t)}M_B(t)
= \{W_B(t), M_B(t)\}.$$
The proof is complete. $\square$

Generally speaking, an equation of motion on $T^*G$ with Hamiltonian function $H = T$ can
be expressed by
$$\dot g(t) = TL_{g(t)}\frac{\partial H}{\partial\mu}
= L_{g(t)*}\frac{\partial H}{\partial\mu},\tag{1.13}$$
$$\dot\mu(t) = \mathrm{ad}^*_{\frac{\partial H}{\partial\mu}}\,\mu(t).\tag{1.14}$$
Its solution is $\mu(t) = \mathrm{Ad}^*_{g(t)}\mathrm{Ad}^*_{g(0)^{-1}}\mu(0)$. Equation
(1.14) is called the Lie–Poisson equation.
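As a numerical illustration of (1.11) and its Casimir structure (a sketch with assumed inertia values, step size, and solver, not from the book): writing the free rigid body equation as $\dot m = m\times\Omega$, $\Omega = A^{-1}m$ in $\mathbf{R}^3$, the quantity $|m|^2$ is a Casimir, and the implicit midpoint rule preserves this quadratic invariant up to solver tolerance:

```python
import numpy as np

I_diag = np.array([1.0, 2.0, 3.0])        # assumed principal moments of inertia

def f(m):
    # Euler equation in body coordinates: m' = m x Omega, Omega = A^{-1} m
    return np.cross(m, m / I_diag)

def midpoint_step(m, tau, iters=60):
    # implicit midpoint m_new = m + tau*f((m + m_new)/2), by fixed-point iteration
    m_new = m.copy()
    for _ in range(iters):
        m_new = m + tau * f(0.5 * (m + m_new))
    return m_new

m = np.array([1.0, 0.5, -0.2])
casimir0 = m @ m                           # |m|^2, a Casimir of the bracket
for _ in range(200):
    m = midpoint_step(m, 0.05)
drift = abs(m @ m - casimir0)
```

Conservation holds because the midpoint increment is orthogonal to the midpoint value: $|m^+|^2-|m|^2 = 2\tau\,\bar m\cdot(\bar m\times\bar\Omega) = 0$.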

12.2 Constructing Difference Schemes for Linear Poisson Systems

Since the phase flow of a Hamiltonian system on a Poisson manifold is a Poisson flow,
which preserves the Poisson structure, it is important to construct difference schemes for
system (1.4) that preserve the same property. A difference scheme that preserves the
Poisson bracket is called a Poisson difference scheme.
One special case of the Poisson phase flow is the symplectic phase flow. How to construct
symplectic difference schemes has already been described in the previous chapters; the
reader can also refer to the literature[Fen85,FWQW89,FQ87,CS90] for more details. However,
numerical algorithms for a general Poisson phase flow are still in their infancy. So far
the results are limited to the cases where the structure matrix $K$ is
constant[Wan91,ZQ94,AKW93,Kar04] or $K(z)$ is linear (Lie–Poisson). We will discuss the
results for the Lie–Poisson case in the next section. In this section, we will discuss the
results when $K$ is a constant matrix.
12.2.1 Constructing Difference Schemes for Linear Poisson Systems

Without loss of generality, we assume that $K$ is an odd-dimensional matrix. Because an
odd-dimensional antisymmetric matrix is necessarily degenerate, there exists a coordinate
transformation $P\in GL(n)$ such that
$$PKP^T = \begin{pmatrix} J_{2r} & O\\ O & O_s\end{pmatrix}.$$

Definition 2.1. A difference scheme $\hat z = g^\tau_H(z)$ is called a Poisson scheme if
and only if $g_z K g_z^T = K$.
Next, we have:

Definition 2.2. $S_K(n) = \{A\in GL(n)\mid AKA^T = K\}$. The set $S_K(n)$ has the
following properties:
$1^\circ$ When the rank of $K$ is even and $K$ is non-singular, the elements of $S_K(n)$
have all the properties of symplectic matrices.
$2^\circ$ When the dimension of $K$ is odd, $K$ must be degenerate. It is easy to verify
that $S_K(n)$ is a group, and we call it the $K$-symplectic group. Its Lie algebra is
$$s_K(n) = \{a\in gl(n)\mid aK + Ka^T = 0\}.$$
According to Feng et al.[FWQ90], we can establish the relationship between $S_K(n)$ and
$s_K(n)$ via the Cayley transformation. If $A\in S_K(n)$, namely if $AKA^T = K$, then
$$B = (I - A)(I + A)^{-1} = (I + A)^{-1}(I - A)$$
is an element of $s_K(n)$. Conversely, if $B\in s_K(n)$, then
$$A = (I - B)(I + B)^{-1} = (I + B)^{-1}(I - B)$$
is an element of $S_K(n)$.
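A quick numerical check of this correspondence (our own sketch; the particular $K$ and the random symmetric $S$ are assumptions): for a constant antisymmetric $K$, any $B = KS$ with symmetric $S$ lies in $s_K(n)$, and its Cayley transform lies in $S_K(n)$:

```python
import numpy as np

K = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])            # constant antisymmetric, degenerate (n = 3)

rng = np.random.default_rng(1)
S = rng.standard_normal((3, 3))
S = 0.1 * (S + S.T)                        # small symmetric matrix (keeps I + B invertible)
B = K @ S                                  # B K + K B^T = KSK - KSK = 0, so B is in s_K(3)

A = (np.eye(3) - B) @ np.linalg.inv(np.eye(3) + B)   # Cayley transform

lie_residual   = np.abs(B @ K + K @ B.T).max()
group_residual = np.abs(A @ K @ A.T - K).max()
```

Both residuals vanish to roundoff, confirming $B\in s_K(3)$ and $A = (I-B)(I+B)^{-1}\in S_K(3)$.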
For a generalized Cayley transformation, we have the following similar result:

Theorem 2.3. Given $\varphi(\lambda) = \dfrac{p(\lambda)}{p(-\lambda)}$, where $p$ is a
polynomial that satisfies $p(0) = 1$, $\dot p(0)\neq 0$: if $B\in s_K(n)$, then
$$A = \varphi(B)\in S_K(n).$$

Therefore, we may use the Padé approximation and the pseudo-spectral method (the Chebyshev
spectral method) to construct Poisson schemes for the linear Poisson system. The Padé
approximation has been described in the literature[Qin89,ZQ94,FWQ90] in detail. Below, we
briefly describe the Chebyshev spectral method for constructing Poisson schemes. The
Chebyshev spectral method is a highly effective method to approximate $e^A$; a detailed
explanation is given in the literature[TF85]. Here, we give only the result.
The Chebyshev spectral method is an approach based on a series expansion in Chebyshev
polynomials, i.e.,
$$e^x = \sum_{k=0}^{\infty} C_k J_k(R)\, Q_k\Bigl(\frac{x}{R}\Bigr),\qquad
|x| < R,\tag{2.1}$$
where $x$ is a real number and $Q_k$ is the Chebyshev complex orthogonal polynomial, which
satisfies the recurrence relation
$$Q_0(x) = 1,\qquad Q_1(x) = x,\qquad Q_{k+1}(x) = Q_{k-1}(x) + 2xQ_k(x),$$
where $C_0 = 1$ and $C_k = 2$ for $k > 0$, and $J_k$ denotes the $k$-th order Bessel
function. $R$ is chosen arbitrarily. During the computation, we calculate $J_k(R)$ first,
and then calculate $Q_k$ using the above recursive procedure.
Using the generalized Cayley transformation and
$$e^A = \frac{e^{A/2}}{e^{-A/2}},$$
and applying the Chebyshev spectral method to the numerator and denominator respectively,
we can obtain the Poisson algorithm.
It was pointed out in the literature[TF85] that when $k > R$, the series converges
exponentially. Therefore, the summation in (2.1) is always finite in practice; where to
truncate the series is determined by the size of $J_k(R)$. Since $J_k(R)$ also decays
exponentially, only a few terms are enough. Numerical tests show that this method has high
accuracy and efficiency, especially when $A$ is a dense matrix. The above method can be
applied only to the linear dynamical system, where $H$ is a quadratic form of $z$:
$$\dot z = KBz.$$

12.2.2 Construction of Difference Schemes for General Poisson Manifold
For a general H, there are other methods to construct Poisson integrators, such as the method of generating functions; the reader can refer to the literature[Fen85,FWQW89,FQ87] for more details. A low-order scheme can be constructed directly from the implicit Euler scheme and verified by criterion (1.3).
Let

    ż = K ∂H/∂z,

and construct

    z^{k+1} = z^k + τ K ∇H( (1/2)(I + B) z^{k+1} + (1/2)(I − B) z^k ).    (2.2)
Taking the derivative of (2.2) with respect to z^k,

    ẑ_z = I + τ K H_zz ( (1/2)(I + B) ẑ_z + (1/2)(I − B) ),
510 12. Poisson Bracket and Lie–Poisson Schemes

i.e.,

    ( I − (τ/2) K H_zz (I + B) ) ẑ_z = I + (τ/2) K H_zz (I − B),

where ẑ = z^{k+1}, z = z^k, and x_y = ∂x/∂y; therefore,

    ẑ_z = ( I − (τ/2) K H_zz (I + B) )^{-1} ( I + (τ/2) K H_zz (I − B) ).

To be a Poisson scheme, it should satisfy

    ẑ_z K ẑ_z^T = K,

i.e.,

    ( I − (τ/2) K H_zz (I + B) )^{-1} ( I + (τ/2) K H_zz (I − B) ) K
      · ( I + (τ/2) K H_zz (I − B) )^T ( I − (τ/2) K H_zz (I + B) )^{-T} = K.

After manipulation, we obtain

    K H_zz ( K B^T + B K ) H_zz K = O.

Therefore, if K B^T + B K = O, i.e., B^T ∈ sK(n), this scheme is a Poisson scheme. When B = O, the scheme becomes the Euler midpoint scheme.
Denote

    z^{k+1} = G^τ_{H,B} z^k;

then for B = O the scheme is of second order, while for B ≠ O it is only of first order.
Using

    z^{k+1} = G^τ_{H,±B} z^k,

we can construct composite schemes,

    G^τ_{H,±B} = G^{τ/2}_{H,B} ∘ G^{τ/2}_{H,−B},
    G^τ_{H,∓B} = G^{τ/2}_{H,−B} ∘ G^{τ/2}_{H,B}.

Proposition 2.4. The above composite scheme has second-order accuracy, and the following can easily be derived: if φ_A(z) = (1/2) z^T A z, where A^T = A, is a conserved quantity of the Hamiltonian system dz/dt = K ∂H/∂z, and if A satisfies B^T A + A B = 0, then φ_A(z) is also a conserved quantity of the difference scheme G^τ_{H,−B}.

Proof. Because φ_A(z) is a conserved quantity of the Hamiltonian system,

    w^T A K H_z(w) = 0,    ∀ w ∈ R^n.

From B^T A + A B = 0, we obtain

    (1/2)( B(ẑ − z) )^T A (ẑ − z) = (1/4)(ẑ − z)^T ( B^T A + A B )(ẑ − z) = 0.

Hence, with ẑ − z = τ K ∇H(w) and

    w = (1/2)(z^{k+1} + z^k) + (1/2) B (z^{k+1} − z^k),

we have

    (1/2)(ẑ + z)^T A (ẑ − z) = ( (1/2)(ẑ + z) + (1/2) B(ẑ − z) )^T A (ẑ − z)
                             = τ w^T A K H_z(w) = 0,

and therefore

    (1/2)(z^{k+1})^T A z^{k+1} = (1/2)(z^k)^T A z^k.

This completes the proof. □
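For the linear example ż = Kz with H = |z|²/2 and B = O, scheme (2.2) reduces to the midpoint rule z^{k+1} = (I − (τ/2)K)^{-1}(I + (τ/2)K) z^k. The sketch below (our own illustrative Python; the entries a, b, c and τ are assumptions) verifies that the update matrix G satisfies G K G^T = K and preserves the quadratic invariant |z|², in line with Proposition 2.4:

```python
# Midpoint scheme for the linear Poisson system z' = K z  (H = |z|^2/2):
#   (I - tau/2 K) z_{k+1} = (I + tau/2 K) z_k,  i.e. z_{k+1} = G z_k.
# Checks: G K G^T = K (Poisson map) and |z|^2 conservation.
a, b, c, tau = 0.4, -0.7, 1.1, 0.05
K = [[0.0, -c, b], [c, 0.0, -a], [-b, a, 0.0]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_inv3(X):  # 3x3 inverse via the adjugate
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = X
    det = (a11 * (a22 * a33 - a23 * a32) - a12 * (a21 * a33 - a23 * a31)
           + a13 * (a21 * a32 - a22 * a31))
    adj = [[a22*a33 - a23*a32, a13*a32 - a12*a33, a12*a23 - a13*a22],
           [a23*a31 - a21*a33, a11*a33 - a13*a31, a13*a21 - a11*a23],
           [a21*a32 - a22*a31, a12*a31 - a11*a32, a11*a22 - a12*a21]]
    return [[adj[i][j] / det for j in range(3)] for i in range(3)]

I3 = [[float(i == j) for j in range(3)] for i in range(3)]
P = [[I3[i][j] - 0.5 * tau * K[i][j] for j in range(3)] for i in range(3)]
Q = [[I3[i][j] + 0.5 * tau * K[i][j] for j in range(3)] for i in range(3)]
G = mat_mul(mat_inv3(P), Q)

GT = [[G[j][i] for j in range(3)] for i in range(3)]
GKGT = mat_mul(mat_mul(G, K), GT)
poisson_defect = max(abs(GKGT[i][j] - K[i][j]) for i in range(3) for j in range(3))

z = [1.0, 2.0, -1.0]
z1 = [sum(G[i][j] * z[j] for j in range(3)) for i in range(3)]
norm_defect = abs(sum(v * v for v in z1) - sum(v * v for v in z))
```

Here G is the Cayley transform of the antisymmetric matrix (τ/2)K, hence orthogonal; both defects are at roundoff level.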

12.2.3 Answers to Some Questions
1. Explicit Euler scheme[LQ95a]
For a separable Hamiltonian H in a standard Hamiltonian system, there exists an explicit symplectic Euler scheme. A similar question arises for the Poisson system: does there exist an explicit Euler-type Poisson scheme for a separable H? The answer is "not necessarily". We take n = 3 as an example to explain this point. Let
    K = [  0  −c   b
           c   0  −a
          −b   a   0 ],

    H = (1/2)(z_1² + z_2² + z_3²),
then

    ż = K ∂H/∂z = Kz.

To make the scheme

    z^{k+1} = z^k + τ K ∇H( (1/2)(I + B) z^{k+1} + (1/2)(I − B) z^k )

a Poisson scheme, we should have

    K H_zz ( K B^T + B K ) H_zz K = K² B^T K + K B K² = O,

i.e., K² B^T K ∈ Sm(n) (the set of symmetric matrices); here K B^T ∈ Sm(n), i.e., B K ∈ Sm(n). Expand the scheme

    z^{k+1} = z^k + τ K ∇H(w),    w = (1/2)(I + B) z^{k+1} + (1/2)(I − B) z^k,
into

    z_1^{k+1} = z_1^k − cτ w_2 + bτ w_3,
    z_2^{k+1} = z_2^k + cτ w_1 − aτ w_3,
    z_3^{k+1} = z_3^k − bτ w_1 + aτ w_2.
To make the scheme explicit, w_2 and w_3 have to be functions of z^k only. From

    w_2 = (1/2)(z_2^{k+1} + z_2^k) + (1/2) b_{21}(z_1^{k+1} − z_1^k)
          + (1/2) b_{23}(z_3^{k+1} − z_3^k) + (1/2) b_{22}(z_2^{k+1} − z_2^k),

we obtain b_{21} = b_{23} = 0, b_{22} = −1. Likewise, b_{31} = b_{32} = 0, b_{33} = −1. Then B has the form

    B = [ b_1  b_2  b_3
           0   −1    0
           0    0   −1 ];
substituting it into the condition B K ∈ Sm(n), we find that only when a = 0 does the scheme become explicit. Note that when a = 0, K degenerates to the symplectic case. Therefore, in many situations a separable system does not have an explicit scheme. Here the explicit scheme refers to a low-order finite-difference scheme, not an explicit analytic solution.
2. Midpoint scheme and Euler scheme
Below, we answer the questions whether the midpoint scheme is a Lie–Poisson scheme for the Euler equation, and whether there exists a Lie–Poisson scheme of generalized Euler form[LQ95a,LQ95b].
We already know that the answer to the first question is "no". Now we turn to the second question. The Euler equation has the form

    ż = J(z) H_z = f(z).
For the case n = 3,

    J(z) = [  0   −z_3   z_2
              z_3    0  −z_1
             −z_2   z_1    0 ],

    H = (1/2)( z_1²/I_1 + z_2²/I_2 + z_3²/I_3 ).

We construct a generalized Euler scheme:

    ẑ = z + τ J(w) H_z(w) = z + τ f(w),

where

    w = (1/2)(ẑ + z) + (1/2) B(ẑ − z) = (1/2)(I + B) ẑ + (1/2)(I − B) z.
The Jacobian matrix of the map z → ẑ is

    A = ∂ẑ/∂z = I + τ D∗f(w) ∂w/∂z,
where

    D∗f(z) = D∗( J(z) H_z ) =
      [           0              ((I_2−I_3)/(I_2 I_3)) z_3   ((I_2−I_3)/(I_2 I_3)) z_2
        ((I_3−I_1)/(I_1 I_3)) z_3            0               ((I_3−I_1)/(I_1 I_3)) z_1
        ((I_1−I_2)/(I_1 I_2)) z_2  ((I_1−I_2)/(I_1 I_2)) z_1            0             ],

    ∂w/∂z = (1/2)(I + B) A + (1/2)(I − B),
therefore

    A = ( I − τ D∗f(w)(I + B) )^{-1} ( I + τ D∗f(w)(I − B) ).
For the Euler scheme to be a Poisson scheme, it must satisfy A J(z) A^T = J(ẑ); therefore

    A J(z) A^T = ( I − τ D∗f(w)(I + B) )^{-1} ( I + τ D∗f(w)(I − B) ) J(z)
                 · ( I + τ (I − B^T)(D∗f(w))^T ) ( I − τ (I + B^T)(D∗f(w))^T )^{-1}
               = J(ẑ),

i.e.,

    ( I + τ D∗f(w)(I − B) ) J(z) ( I + τ (I − B^T)(D∗f(w))^T )
    = ( I − τ D∗f(w)(I + B) ) J(ẑ) ( I − τ (I + B^T)(D∗f(w))^T );

after manipulation,

    J(ẑ) − J(z) + τ² D∗f(w) [ (I + B) J(ẑ)(I + B^T) − (I − B) J(z)(I − B^T) ] (D∗f(w))^T
    = τ [ J(z)(I − B^T) + J(ẑ)(I + B^T) ] (D∗f(w))^T + τ D∗f(w) [ (I − B) J(z) + (I + B) J(ẑ) ],

where the left-hand side equals

    J(ẑ − z) + τ² D∗f(w) [ J(ẑ − z) + B J(ẑ + z) + J(ẑ + z) B^T + B J(ẑ − z) B^T ] (D∗f(w))^T.

Because τ is arbitrary and ẑ − z = O(τ), we must have

    J(ẑ − z) = τ J(ẑ + z)(D∗f(w))^T + τ D∗f(w) J(ẑ + z),

    τ² D∗f(w) [ B J(ẑ + z) + J(ẑ + z) B^T ] (D∗f(w))^T
        = τ J(ẑ − z) B^T (D∗f(w))^T + τ D∗f(w) B J(ẑ − z),

    D∗f(w) [ J(ẑ − z) + B J(ẑ − z) B^T ] (D∗f(w))^T = O.

When B = O, the above scheme is the midpoint scheme, w = (1/2)(ẑ + z). It is easy to verify that the last equality in the above equations does not hold; hence the midpoint scheme is not a Poisson scheme. When B ≠ O, after a lengthy computation, we can show similarly that there does not exist any B ∈ gl(n) satisfying the above three formulas. Therefore, there does not exist a Poisson scheme of generalized Euler form.
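The failure of the midpoint scheme to be a Poisson map can also be observed numerically. The sketch below (our own illustrative Python; the inertia values, step size, and finite-difference Jacobian are assumptions) takes one midpoint step of the Euler equation and measures the defect A J(z) A^T − J(ẑ), which stays well above finite-difference noise, while the quadratic Casimir |z|² is still preserved:

```python
# One midpoint step for the Euler equation z' = f(z) = z x (I^{-1} z),
# then a finite-difference check that the step Jacobian A does NOT satisfy
# A J(z) A^T = J(z_new).  Illustrative inertia I and step tau.
I1, I2, I3 = 1.0, 2.0, 3.0
tau = 0.2

def f(z):
    w = [z[0] / I1, z[1] / I2, z[2] / I3]       # I^{-1} z
    return [z[1] * w[2] - z[2] * w[1],
            z[2] * w[0] - z[0] * w[2],
            z[0] * w[1] - z[1] * w[0]]

def midpoint_step(z):
    zn = list(z)
    for _ in range(200):                         # fixed-point iteration
        m = [(z[i] + zn[i]) / 2.0 for i in range(3)]
        fm = f(m)
        zn = [z[i] + tau * fm[i] for i in range(3)]
    return zn

def J(z):
    return [[0.0, -z[2], z[1]], [z[2], 0.0, -z[0]], [-z[1], z[0], 0.0]]

z = [1.0, 1.0, 1.0]
znew = midpoint_step(z)

h = 1e-6                                         # central-difference Jacobian
A = [[0.0] * 3 for _ in range(3)]
for j in range(3):
    zp, zm = list(z), list(z)
    zp[j] += h
    zm[j] -= h
    fp, fm = midpoint_step(zp), midpoint_step(zm)
    for i in range(3):
        A[i][j] = (fp[i] - fm[i]) / (2.0 * h)

AJAT = [[sum(A[i][k] * J(z)[k][l] * A[j][l] for k in range(3) for l in range(3))
         for j in range(3)] for i in range(3)]
defect = max(abs(AJAT[i][j] - J(znew)[i][j]) for i in range(3) for j in range(3))
casimir_drift = abs(sum(v * v for v in znew) - sum(v * v for v in z))
```

The defect is O(τ³) but nonzero, while |z|² (a quadratic first integral) is conserved by the midpoint rule to solver tolerance.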

12.3 Generating Function and Lie–Poisson Scheme

The generating function method plays a crucial role in constructing symplectic schemes (see the literature[FWQW89,CS90,CG93] for details). Therefore, how to use the generating function method to construct Lie–Poisson schemes has become an active research topic; relevant literature includes[GM88,Ge91,CS91]. We have also investigated the generating function for the Lie–Poisson system in detail, and discovered that the Ge–Marsden method needs further improvement. Below is our understanding and derivation of the generating function and the Hamilton–Jacobi theory[LQ95b].

12.3.1 Lie–Poisson–Hamilton–Jacobi (LPHJ) Equation and Generating Function

According to the diagram in Section 12.1 (for the left-invariant system),

              G^t_{H∘J_R} = S
    T*G  ────────────────────→  T*G
     │ J_R                        │ J_R
     ↓           G^t_H = P        ↓
     g*  ────────────────────→   g*
the phase flow determined by H on g∗ can induce a phase flow on T ∗ G determined
by H ◦ JR . Let ut (q, q0 ) be a first kind generating function of the symplectic map S.
Then we have the following properties.
Property 3.1. If u : G × G → R is invariant under the left action of G, i.e.,

    u_t(gq, gq_0) = u_t(q, q_0),    (3.1)

then the symplectic mapping generated by u, S : (q_0, p_0) → (q, p), where

    p_0 = −∂u_t(q, q_0)/∂q_0,    p = ∂u_t(q, q_0)/∂q,    (3.2)

preserves the momentum mapping J_L. That is to say,

    J_L ∘ S = J_L.
For the right-invariant translation on G,
JR ◦ S = JR .

Definition 3.2. If G acts on the configuration space without fixed point, then we say
G acts on G freely.

Property 3.3. If G acts on G freely, and its induced symplectic mapping S preserves
the momentum mapping JL , then the first-kind generating function of S is left invari-
ant.

Proof. See[GM88] . 

For a left-invariant system, such as a generalized rigid body, the Hamiltonian func-
tion is left invariant, the phase flow is also left invariant, the momentum mapping JL is
a first integral for this dynamics, i.e., JL is invariant under the phase flow of GtH◦JR .
Therefore, if the action is free (generally speaking, the action is locally free), the first-
kind generating function is left invariant.
Let u_t(q, q_0) be the first-kind generating function of S; then by the left invariance,

    u_t(q, q_0) = u_t(e, q^{-1}q_0) = ũ_t(g),    g = q^{-1}q_0.

By Equation (3.2), we have

    p_0 = −∂u_t(q, q_0)/∂q_0 = −∂ũ_t(q^{-1}q_0)/∂q_0 = −∂ũ_t(L_{q^{-1}} q_0)/∂q_0
        = −L*_{q^{-1}} ∂ũ/∂g |_{g=q^{-1}q_0},

    p = ∂u_t(q, q_0)/∂q = ∂ũ(q^{-1}q_0)/∂q = ∂ũ(R_{q_0} V(q))/∂q
      = V* R*_{q_0} ∂ũ/∂g |_{g=q^{-1}q_0},

    V(q) = q^{-1},    V* = −L*_{q^{-1}} R*_{q^{-1}},

then

    p = −L*_{q^{-1}} R*_{q^{-1}} R*_{q_0} ∂ũ/∂g |_{g=q^{-1}q_0};
therefore,

    μ_0 = L*_{q_0} p_0 = −L*_{q_0} L*_{q^{-1}} ∂ũ/∂g |_{g=q^{-1}q_0}
        = −L*_{q^{-1}q_0} ∂ũ/∂g |_{g=q^{-1}q_0} = −L*_g ∂ũ/∂g |_{g=q^{-1}q_0},

and

    μ = L*_q p = −L*_q L*_{q^{-1}} R*_{q^{-1}} R*_{q_0} ∂ũ/∂g |_{g=q^{-1}q_0}
      = −R*_{q^{-1}q_0} ∂ũ/∂g |_{g=q^{-1}q_0} = −R*_g ∂ũ/∂g |_{g=q^{-1}q_0}.

Through the above derivation, it is easy to prove Property 3.1:

    M_0 = R*_{q_0} p_0 = −R*_{q_0} L*_{q^{-1}} ∂ũ/∂g |_{g=q^{-1}q_0},

    M = R*_q p = −R*_q L*_{q^{-1}} R*_{q^{-1}} R*_{q_0} ∂ũ/∂g |_{g=q^{-1}q_0}
      = −L*_{q^{-1}} R*_{q_0} ∂ũ/∂g |_{g=q^{-1}q_0}
      = −R*_{q_0} L*_{q^{-1}} ∂ũ/∂g |_{g=q^{-1}q_0} = M_0,

i.e.,

    J_L ∘ S = J_L.

Take g = q^{-1}q_0; then

    μ_0 = −L*_g ∂ũ(g)/∂g,
    μ = −R*_g ∂ũ(g)/∂g = Ad*_{g^{-1}} μ_0,    (3.3)

therefore u_t(q, q_0) = ũ_t(q^{-1}q_0) = ũ_t(g) defines a Poisson mapping:

    μ_0 → μ = Ad*_{g^{-1}} μ_0.

We now derive the conditions that u_t(q, q_0) must satisfy.
u_t(q, q_0) generates a symplectic map S = G^t_{H∘J} = G^t_{H̃}, where H̃ = H ∘ J, and

    S : (p_0, q_0) → (p, q),
    p_0 = −∂u/∂q_0,    p = ∂u/∂q.    (3.4)

Because

    p dq − p_0 dq_0 = (∂u/∂q) dq + (∂u/∂q_0) dq_0,
    du = (∂u/∂q) dq + (∂u/∂q_0) dq_0 + (∂u/∂t) dt = p dq − p_0 dq_0 + (∂u/∂t) dt,

we have

    d( p dq − p_0 dq_0 + (∂u/∂t) dt ) = 0.
Note that

    d(p dq − p_0 dq_0) = dp ∧ dq − dp_0 ∧ dq_0
    = ( (∂p/∂p_0) dp_0 + (∂p/∂q_0) dq_0 + (∂p/∂t) dt )
        ∧ ( (∂q/∂p_0) dp_0 + (∂q/∂q_0) dq_0 + (∂q/∂t) dt ) − dp_0 ∧ dq_0
    = ( (∂p/∂p_0)(∂q/∂q_0) − (∂p/∂q_0)(∂q/∂p_0) − 1 ) dp_0 ∧ dq_0
      + [ (∂p/∂p_0)(∂q/∂t) − (∂p/∂t)(∂q/∂p_0) ] dp_0 ∧ dt
      + [ (∂p/∂q_0)(∂q/∂t) − (∂p/∂t)(∂q/∂q_0) ] dq_0 ∧ dt
    = f_1 + f_2 + f_3.
Since (p_0, q_0) → (p, q) is symplectic, we have

    g_z J g_z^T = J  ⟹  f_1 = 0.

Because

    ∂q/∂t = ∂H/∂p,    ∂p/∂t = −∂H/∂q
    ⟹  f_2 = (∂H/∂p) dp ∧ dt,    f_3 = (∂H/∂q) dq ∧ dt,

therefore

    dp ∧ dq − dp_0 ∧ dq_0 = (∂H/∂p) dp ∧ dt + (∂H/∂q) dq ∧ dt = dH ∧ dt
    ⟹  dH ∧ dt + d(∂u/∂t) ∧ dt = 0.

We have

    d( H + ∂u/∂t ) ∧ dt = 0.

Therefore,

    ∂u/∂t + H(p, q, t) = c.

Taking a proper initial value, we obtain

    ∂u/∂t + H(p, q, t) = 0,
i.e.,

    ∂u_t(p, q)/∂t + H ∘ J_R(p, q, t) = 0.

Therefore we obtain the LPHJ equation

    ∂u(g)/∂t + H( −R*_g ∂u(g)/∂g ) = 0,    g = q^{-1}q_0.    (3.5)

Remark 3.4. If we can construct a generating function u(g), we then have u(q_0, q). This function generates a symplectic mapping on T*G; by the commutative diagram, a Poisson mapping on g* is also induced. This is the key point of constructing a Lie–Poisson integrator by generating functions.

Remark 3.5. In order that the induced phase flow be a Poisson phase flow, the phase flow on T*G should be symplectic. Therefore, the condition g = q^{-1}q_0 cannot be discarded; namely, as t → 0, g = q^{-1}q_0 → e (the unit element).

Remark 3.6. Only when g = q^{-1}q_0 is satisfied is the momentum mapping invariant. This is because the momentum mapping is

    J_L(p, q) = R*_q p = Ad*_{q^{-1}} J_R(p, q).

To make sure

    J_L(p_0, q_0) = J_L(p, q)  ⟹  Ad*_{q_0^{-1}} J_R(p_0, q_0) = Ad*_{q^{-1}} J_R(p, q)
    ⟹  J_R(p, q) = Ad*_q Ad*_{q_0^{-1}} J_R(p_0, q_0)
                 = Ad*_{(q^{-1}q_0)^{-1}} J_R(p_0, q_0) = Ad*_{g^{-1}} J_R(p_0, q_0).

If g = q^{-1}q_0, deriving backwards, we obtain that the momentum mapping is invariant.

Remark 3.7. The above generating function theory can be transformed into a generating function theory on g (for details see the literature[CS90]). That is to say, the above generating function theory on T*G can be reformulated by the exponential mapping in terms of algebra variables, which has been done by Channell and Scovel[CS90]. Below, we list only some of their results.
For g ∈ G, choose ξ ∈ g so that g = exp(ξ). Then the LPHJ equation can be transformed into

    ∂s/∂t + H( −ds · ψ(ad_ξ) ) = 0,
    M_0 = −ds · χ(ad_ξ),                          (3.6)
    M = −ds · ψ(ad_ξ),

where

    χ(ad_ξ) = id + (1/2) ad_ξ + (1/12) ad²_ξ + ···,
    ψ(ad_ξ) = χ(ad_ξ) · e^{−ad_ξ} = χ(ad_ξ) − ad_ξ,    (3.7)

and the condition g = q^{-1}q_0 is transformed into

    s(ξ, 0) = s_0(ξ) = s_0(I),    (3.8)

i.e.,

    ξ|_{t=0} = id.

12.3.2 Construction of Lie–Poisson Schemes via Generating Function
The generating function theory for constructing symplectic schemes has been described in detail in the literature[LQ95a,Fen86,FWQW89]. The next step is to use generating function theory to construct Lie–Poisson schemes. As we know, the generating function must generate the identity transformation at time zero; from the previous section, it should satisfy condition (3.8), i.e., the group element becomes the unit element at t = 0. After a long pursuit, we have not been able to find a generating function universally applicable to a general Lie–Poisson system. Scovel[MS96] once suggested a possible solution using Morse bundle theory. However, for a Hamiltonian function of quadratic form, we can find low-order generating functions. Below, we give a brief description.
The Hamiltonian for so(3)* is H(M) = (1/2) M I^{-1} M. From (3.6) and (3.7), using u as the generating function, we have

    M = −du · ψ(ad_ξ) = −du · ( 1 − (1/2) ad_ξ + (1/12) ad²_ξ + O(ad³_ξ) )
      = −du + (1/2) du · ad_ξ + O(ad²_ξ).                          (3.9)
After substituting H into Equation (3.6) and using the expansion of ψ, we have

    ∂u/∂t + H( −du + (1/2) du · ad_ξ + O(ad²_ξ) )
    = ∂u/∂t + (1/2)( −du + (1/2) du · ad_ξ + O(ad²_ξ) ) I^{-1} ( −du + (1/2) du · ad_ξ + O(ad²_ξ) )
    = ∂u/∂t + (1/2) I^{-1} du du − (1/2) I^{-1} du du (ad_ξ) + O(ad²_ξ).    (3.10)
Because ξ and the time step τ have the same order of magnitude, Equation (3.10) can be simplified to

    ∂u/∂t + H(M) = ∂u/∂t + (1/2)(∂u/∂ξ) I^{-1} (∂u/∂ξ) − (1/2)(∂u/∂ξ) I^{-1} (∂u/∂ξ) ad_ξ + O(τ²)
                 = ∂u/∂t + (1/2)(∂u/∂ξ) I^{-1} (∂u/∂ξ) + O(τ²)
                 = 0.

From this, we can obtain a first-order generating function by taking

    u = (Iξ · ξ)/2.                          (3.11)

It can be easily verified that (3.11) satisfies (3.10) to the stated order. Therefore, we can use u to construct a Lie–Poisson scheme.

We first calculate ξ by solving

    M_0 = −Iξ · χ(ξ),                        (3.12)

and then substitute it into Equation (3.6). Next, we calculate M = exp(ad_ξ) M_0. Repeating this procedure, we obtain a Lie–Poisson algorithm.
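A heavily simplified sketch of this procedure follows (our own illustrative Python, not the text's full algorithm): we approximate χ(ξ) ≈ id, so that (3.12) gives ξ proportional to −I^{-1}M_0, and fold the step size τ into ξ; the update M ↦ exp(ξ̂)M is then a rotation computed by the Rodrigues formula, so the coadjoint orbit |M| = const is preserved exactly and the step is first-order consistent with the Euler equation:

```python
import math

# Simplified first-order sketch: chi(xi) ~ id gives xi = -tau * I^{-1} M_n
# (the tau-scaling is our own assumption), then M_{n+1} = exp(ad_xi) M_n,
# a rotation evaluated by the Rodrigues formula.
I_diag = (1.0, 2.0, 3.0)      # illustrative inertia tensor
tau = 1e-3

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0]]

def rot(xi, v):
    """exp(xi^) v via the Rodrigues formula."""
    th = math.sqrt(sum(x * x for x in xi))
    if th < 1e-14:
        return list(v)
    c1, c2 = math.sin(th) / th, (1.0 - math.cos(th)) / th**2
    xv = cross(xi, v)
    xxv = cross(xi, xv)
    return [v[i] + c1 * xv[i] + c2 * xxv[i] for i in range(3)]

def step(M):
    xi = [-tau * M[i] / I_diag[i] for i in range(3)]   # xi = -tau I^{-1} M
    return rot(xi, M)

M = [1.0, 0.5, -0.2]
M1 = step(M)
orbit_drift = abs(math.sqrt(sum(v*v for v in M1)) - math.sqrt(sum(v*v for v in M)))
w = [M[i] / I_diag[i] for i in range(3)]
euler = [M[i] + tau * cross(M, w)[i] for i in range(3)]
consistency = max(abs(M1[i] - euler[i]) for i in range(3))
```

The orbit drift vanishes to roundoff, while the difference from the forward-Euler step M + τ(M × I^{-1}M) is O(τ²).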
Below, we apply this algorithm to the free rigid body. For the motion of the rigid body, χ(ξ) has a closed expression (see Subsection 12.5.2). Solving the nonlinear equation (3.12) for ξ becomes the key point, and it is necessary to linearize (3.12). The iterative formula for ξ is

    ( 1 + τ [ c_1 ξ̂ − c_3 ξ̂(ξ̂ + c_4) ] (I^{-1}M_0 × ξ) + c_2 (I^{-1}p̂_0) ) δξ = ξ_{k+1} − ξ_k,

where

    c_1 = (2 − |ξ| sin|ξ| − 2 cos|ξ|) / |ξ|⁴,        c_2 = (cos|ξ| − 1) / |ξ|²,
    c_3 = (−2|ξ| − |ξ| cos|ξ| + 3 sin|ξ|) / |ξ|⁵,    c_4 = (2|ξ| − sin|ξ|) / |ξ|³.

In fact, the above algorithm is applicable even when H is a polynomial.
Ge and Marsden[GM88] have proposed an algorithm which neglects the generating function condition (3.8); it therefore has a certain flaw, which we explain below from the theoretical point of view.
First, we should point out that the Ge–Marsden algorithm can only give a first-order scheme for simple systems such as the free rigid body. Its second-order scheme, however, is not a second-order approximation to the original system, as we prove below.
Generally speaking, a generating function can be given as

    u = u_0 + Σ_{n=1}^∞ ((δt)^n / n!) u_n,                  (3.13)

where u_0 = (ξ, ξ)/2 generates the identity transformation at time t = 0. Substituting (3.13) into the LPHJ equation, we have

    u_1 = −H(V),    u_2 = (∂H/∂V) · du_1 · ψ(ad_ξ),    ···.    (3.14)

Below, we take so(3)* as an example to explain the flaw of this algorithm.
For so(3)*, u_0 = ξ²/2, and hence V = ξ. The first-order scheme is

    S_1 = u_0 + τ u_1 = ξ²/2 − τ H(ξ) = ξ²/2 − (τ/2) ξ I^{-1} ξ.

The generating function for the second-order scheme is

    S_2 = S_1 + (τ²/2) u_2 = ξ²/2 − τ H(ξ) + (τ²/2)(∂H/∂V) · du_1 · ψ(ad_ξ)
        = ξ²/2 − (τ/2) ξ I^{-1} ξ − (τ²/2) I^{-1}ξ ( I^{-1}ξ · ψ(ξ) ).

Using the system of Equations (3.6) (for SO(3), M and M_0 denote the angular momentum), we get

    M − M_0 = −du · ad_ξ.                                   (3.15)

Next, we prove that S_1 indeed generates a first-order Lie–Poisson scheme for the Euler equation, whereas S_2 is actually not a second-order approximation to the Euler equation. Furthermore, we will find that with this algorithm it is impossible to construct a difference scheme that preserves the momentum mapping.
Because

    dS_1 = d( ξ²/2 − (τ/2) ξ · I^{-1}ξ ) = ξ − τ I^{-1}ξ

and M_0 = −dS_1 · χ(ad_ξ) = (−ξ + τ I^{-1}ξ) · χ(ξ), we have ξ = −M_0 + O(τ). By Equation (3.15), applying ξ · ad_ξ = 0, we obtain

    M − M_0 = (ξ − τ I^{-1}ξ) · ad_ξ = −τ I^{-1}ξ · ad_ξ
            = τ [ξ, I^{-1}ξ] = τ [ −M_0 + O(τ), I^{-1}(−M_0 + O(τ)) ]
            = τ [M_0, I^{-1}M_0] + O(τ²).

This is a first-order approximation to the Euler equation

Ṁ = [M, I −1 M ]. (3.16)

For the second-order generating function S_2, we first calculate χ(ξ). Let χ(ξ) = 1 + a_1 ξ̂ + a_2 ξ̂², where a_1, a_2 have the closed analytical expressions (see Subsection 12.5.2)

    a_1 = (1 − cos|ξ|) / ( sin²|ξ| + (1 − cos|ξ|)² ),

    a_2 = [ (cos|ξ| − 1)²/|ξ|² + (sin|ξ| − |ξ|)/|ξ| + (sin|ξ| − |ξ|)|ξ| ]
          / ( sin²|ξ| + (1 − cos|ξ|)² );

therefore

    u_2 = −I^{-1}ξ ( I^{-1}ξ · ψ(ξ) )
        = −⟨ I^{-1}ξ, I^{-1}ξ ⟩ − a_2 I^{-1}ξ ( I^{-1}ξ · ξ̂² ),

then

    dS_2 = ξ − τ I^{-1}ξ − τ² (I^{-1})² ξ − (τ²/2) d( a_2 I^{-1}ξ · ( I^{-1}ξ · ξ̂² ) ),

by

    M_0 = −dS_2 · χ(ξ) = −ξ + τ I^{-1}ξ · χ(ξ) + O(τ²),

we have

    ξ = −M_0 + τ I^{-1}ξ · χ(ξ) + O(τ²).
From Equation (3.15), we get

    M − M_0 = −dS_2 · ad_ξ
            = −( ξ − τ I^{-1}ξ − τ² (I^{-1}ξ)² − (τ²/2) d( a_2 I^{-1}ξ · ( I^{-1}ξ · ξ̂² ) ) ) · ad_ξ
            = τ [M_0, I^{-1}M_0] + a_1 τ² [[M_0, I^{-1}M_0], I^{-1}M_0]
              + [M_0, I^{-1}[M_0, I^{-1}M_0]]
              + a_2 ( I^{-1}M_0 ( I^{-1}M_0 · M̂_0² ) + I^{-1}( I^{-1}M_0 · M̂_0² ) M_0 )
              − (τ²/2) d( a_2 · I^{-1}ξ ( I^{-1}ξ · ξ̂² ) ) · ξ̂ + O(τ³).    (3.17)
According to the Euler equation (3.16), the second-order approximation should be

    M − M_0 = τ [M_0, I^{-1}M_0] + (τ²/2)( [[M_0, I^{-1}M_0], I^{-1}M_0]
              + [M_0, I^{-1}[I^{-1}M_0, M_0]] ) + O(τ³).        (3.18)

As t → 0, ξ → M_0; by comparison, we find that Equation (3.17) is not an approximation to Equation (3.18). Therefore, the generating function S_2 cannot generate a second-order approximation to the Euler equation.
We have shown that S_1 generates a first-order Lie–Poisson scheme. However, a momentum-mapping-preserving scheme should satisfy J_L(q, M) = J_L(q_0, M_0). For T*SO(3), this becomes qM = q_0 M_0, and hence M = q^{-1}q_0 M_0. Therefore, it is necessary to estimate q ∈ G. If we had the formula M = g M_0, a very natural idea is to take g = q^{-1}q_0, which leads to q = q_0 g^{-1}. An algorithm well constructed on so(3)* should lead to a good approximation of q ∈ SO(3) to the equation of motion. The scheme generated by our generating function theory, which satisfies Equations (3.6) and condition (3.8), belongs to this type. However, the scheme constructed via the algorithm[GM88] does not: because condition (3.8) is neglected, it is impossible to construct the algorithm on G using the algorithm[GM88]. This can be illustrated as follows.
Using another representation of (3.6),

    M_0 = −du · χ(ad_ξ),    M = exp(ad_ξ) M_0,               (3.19)

and ξ = (−M_0 + τ I^{-1}ξ) · χ(ξ), if we let q = q_0 g^{-1} = q_0 exp(−ξ) = q_0 exp( (M_0 − τ I^{-1}ξ) · χ(ξ) ), then q is not a first-order approximation to the equation of motion q̇ = q I^{-1}M̂. In fact, the algorithm[GM88] cannot produce a form of q alone to construct a momentum-mapping-preserving scheme.
12.4 Construction of Structure Preserving Schemes for Rigid Body

We have already introduced the equation of motion for the generalized rigid body in the previous section. In this section, we take SO(3) as an example to explain how to construct structure-preserving schemes.

12.4.1 Rigid Body in Euclidean Space

Let Λ(t) ∈ SO(3), so that Λ(t)Λ(t)^T = I, |Λ(t)| = 1. Then the equation of motion for the free rigid body can be formulated as

    Λ̇(t) = Λ(t) Ŵ(t),                        (4.1)

where Ŵ(t) ∈ so(3), and so(3) is the Lie algebra of SO(3). The isomorphism so(3) ≅ R³ can be realized through the following identifications:

    Ŵ(t) ↔ W(t) ∈ R³,

    [  0   −w_3   w_2          [ w_1
       w_3    0  −w_1    ↔       w_2
      −w_2   w_1    0 ]          w_3 ],

    Ŵ(t) · a = W × a,    a ∈ R³.

The Ŵ(t) in Equation (4.1) is called the angular velocity in the body description. Ŵ(t) = Λ(t)^{-1} Λ̇(t) is consistent with the definition of the generalized rigid body. The corresponding Euler equation is

    Ṁ = [M, W],    M = JW,                    (4.2)

where J is called the inertia operator and M the body angular momentum. The body variables and the spatial variables have the following relations:

    m = ΛM,    ω̂ = Λ Ŵ Λ^T  ⟹  ω = ΛW,    a = ΛA,

here A is an acceleration.
The operator "ˆ" satisfies the following identities:

    (u × v)ˆ = [û, v̂],
    û · v = u × v,
    [û, v̂] · w = (u × v) × w,
    u · v = (1/2) tr( û^T v̂ ).

The equation of motion of the rigid body may also be expressed on the space SU(2) or SH₁ (the unit quaternions). Applying their equivalence (their Lie algebras are isomorphic), we may obtain different forms of Equation (4.1) on SU(2) and SH₁.
SU(2): U ∈ SU(2) satisfies

    U U* = I,    |U| = 1.

The equation of motion becomes

    U̇ = U Ω_u,

where Ω_u = U* U̇ ∈ su(2) satisfies Ω_u + Ω*_u = 0, tr Ω_u = 0. In su(2), we choose

    { −iσ_1, −iσ_2, −iσ_3 }

as a basis, where

    σ_1 = [ 0 1    σ_2 = [ 0 −i    σ_3 = [ 1  0    σ_0 = [ 1 0
            1 0 ],         i  0 ],         0 −1 ],         0 1 ]

are the four Pauli matrices.
It is easy to see that

    Σ_{i=1}^3 ω_i (−iσ_i) = [   −iω_3      −ω_2 − iω_1
                              ω_2 − iω_1       iω_3    ] ∈ su(2).

Hence

    Ω_u = (ω_1, ω_2, ω_3) ∈ su(2) ≅ R³ ≅ so(3);

using matrix notation, the equation is rewritten as

    [ σ̇ β̇    =  [ σ β   [   −iω_3      −ω_2 − iω_1
      γ̇ δ̇ ]       γ δ ]   ω_2 − iω_1       iω_3    ].

For any Q ∈ SH₁, |Q| = 1, Q = (q_0, q_1, q_2, q_3) ∈ H (the set of all quaternions), the equation of motion becomes Q̇ = Q Ω_h, where Ω_h = Q̄ Q̇ = Q^{-1} Q̇ ∈ sh₁ (the quaternions with zero real part). Let
12.4 Construction of Structure Preserving Schemes for Rigid Body 525

Ωh = ω1 i + ω2 j + ω3 k = (0, ω1 , ω2 , ω3 ), ωh = (ω1 , ω2 , ω3 ).

Rewrite the equation of motion into the quaternion form

(q̇0 , q̇1 , q̇2 , q̇3 ) = (q0 , q1 , q2 , q3 ) · (0, ω1 , ω2 , ω3 )



q̇0 = −qω = −(q, ω),
=⇒ q = (q1 , q2 , q3 ).
q̇ = q0 ω T + q ω,

The Euler equation of motion becomes

    so(3)*:  dM/dt = [M, W];
    su(2)*:  dM_u/dt = [M_u, W_u] = (1/2)[M_u, W],    M_u = 2M,  ω_u = (1/2)W;
    sh₁*:    dM_h/dt = [M_h, W_h] = (1/2)[M_h, W],    M_h = 2M,  ω_h = (1/2)W.

If the unified Euler equation of motion is used, we have

    dM/dt = [M, W].
If ω is assigned the values from SO(3), then the corresponding equations of motion become

    Λ̇ = Λ Ŵ(t),    W(t) = (ω_1, ω_2, ω_3),
    Q̇ = Q Ω_h,     Ω_h = (0, ω_1/2, ω_2/2, ω_3/2),
    U̇ = U Ω_u,     Ω_u = (ω_1/2, ω_2/2, ω_3/2).

After the above transformation, the equation of motion becomes much simpler. The number of unknowns decreases from the original 9 (SO(3)) to 4 complex variables (SU(2)), and then to 4 real variables (SH₁). The computational storage and operation count may be sharply reduced in large-scale scientific computations.
More details about the relations among SO(3), SU(2) and SH₁ will be given in Section 12.5.
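The quaternion kinematics Q̇ = QΩ_h with Ω_h = (0, ω/2) can be sketched as follows (our own illustrative Python; for a constant ω the exact solution Q(t) = Q(0)·exp(tΩ_h) is compared against the axis–angle form):

```python
import math

# Quaternion form of the rigid-body kinematics: Q' = Q * Omega_h,
# Omega_h = (0, w1/2, w2/2, w3/2).  For constant w, Q(t) = Q(0) exp(t Omega_h),
# and exp of a pure quaternion (0, u) is (cos|u|, (u/|u|) sin|u|).
# Values of w and t are illustrative assumptions.

def qmul(a, b):
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qexp_pure(u):
    n = math.sqrt(sum(x * x for x in u))
    if n < 1e-14:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(n) / n
    return (math.cos(n), s*u[0], s*u[1], s*u[2])

w = (0.3, -0.4, 1.2)          # constant angular velocity
t = 0.7
Q = qmul((1.0, 0.0, 0.0, 0.0), qexp_pure(tuple(0.5 * t * wi for wi in w)))

# reference: rotation by angle theta = t|w| about the axis w/|w|
wn = math.sqrt(sum(x * x for x in w))
theta = t * wn
Q_ref = (math.cos(theta / 2.0),) + tuple(math.sin(theta / 2.0) * wi / wn for wi in w)
err = max(abs(Q[i] - Q_ref[i]) for i in range(4))
```

The halved angular velocity in Ω_h is exactly what makes the quaternion trace out the rotation by angle |ω|t.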

12.4.2 Energy-Preserving and Angular-Momentum-Preserving Schemes for Rigid Body

With the equation of motion of the rigid body, we may construct corresponding difference schemes[LQ95a]. One important type is the structure-preserving scheme. Structure preservation may mean different things for different systems; for example, it could mean preserving the original system's physical structure, its symmetry, or its invariant physical quantities.

The total energy and the angular momentum, especially the latter, are important invariants of rigid motion. Many experiments indicate that when the energy and the angular momentum are well maintained, the computer simulation gives a good approximation to the real motion.
The equation of motion for the rigid body is

    Λ̇(t) = Λ(t) Ŵ(t),
    Ṁ(t) = M(t) × W(t),    M(t) = I · W(t)    ⟹    I · Ẇ(t) = I W(t) × W(t),

where I is the inertia operator.


Note that the energy function H = (1/2)(M(t), W(t)) = (1/2) Ŵ(t) J Ŵ(t) is a Hamiltonian function, and the spatial angular momentum m(t) = Λ M(t) ⇔ m̂(t) = Λ M̂(t) Λ(t)^T becomes the momentum mapping. To keep the energy and the angular momentum invariant is precisely to keep the Hamiltonian function and the momentum mapping of the Lie–Poisson system invariant.
The energy invariance mainly concerns solving the Euler equations, while the angular momentum invariance concerns the equations of motion on SO(3). Using the relation Λ_{n+1} M_{n+1} = Λ_n M_n, we can derive the formula that Λ_{n+1} should satisfy. For the Euler equation Ṁ(t) = M(t) × W(t) = M(t) × I^{-1}M(t), the midpoint scheme preserves the Hamiltonian function, i.e., it is energy-preserving (in fact, the midpoint scheme preserves all quadratic-form invariants).
The midpoint scheme for the Euler equation is

    (M_{n+1} − M_n)/δt = ((M_{n+1} + M_n)/2) × I^{-1} ((M_{n+1} + M_n)/2).    (4.3)

Multiplying both sides by I^{-1}(M_{n+1} + M_n)/2 via the inner product,

    (M_{n+1} − M_n) · ( I^{-1}(M_{n+1} + M_n) ) = 0  ⟹  I^{-1}M_{n+1} · M_{n+1} = I^{-1}M_n · M_n,

i.e., H_{n+1} = H_n. Since I^{-1} is a symmetric operator, we have

    M_n · I^{-1}M_{n+1} = M_{n+1} · I^{-1}M_n.

Rewrite scheme (4.3) as

    M_{n+1} = M_n + (δt/4)(M_{n+1} + M_n) × I^{-1}(M_{n+1} + M_n)

    ⟹  ( I + (δt/4)( I^{-1}(M_{n+1} + M_n) )^ ) M_{n+1}
          = ( I − (δt/4)( I^{-1}(M_{n+1} + M_n) )^ ) M_n

    ⟹  M_{n+1} = ( I + (δt/4)( I^{-1}(M_{n+1} + M_n) )^ )^{-1}
                  ( I − (δt/4)( I^{-1}(M_{n+1} + M_n) )^ ) M_n
               = Λ^{-1}_{n+1} Λ_n M_n.

By conservation of angular momentum,

    Λ_{n+1} M_{n+1} = Λ_n M_n.

By comparison, we obtain

    Λ_{n+1} = Λ_n ( I − (δt/4)( I^{-1}(M_{n+1} + M_n) )^ )^{-1}
                  ( I + (δt/4)( I^{-1}(M_{n+1} + M_n) )^ ).

Since

    (δt/4) I^{-1}(M_{n+1} + M_n) = (δt/2) W_n + O(δt²),

from the Cayley transformation we know this is a second-order approximation to the equation Λ̇ = Λ Ŵ.
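The two stages above — an energy-preserving midpoint step for M on so(3)* followed by the Cayley-type update of Λ that enforces Λ_{n+1}M_{n+1} = Λ_nM_n — can be sketched as follows (our own illustrative Python; inertia, step size, and the initial attitude Λ_0 = I are assumptions):

```python
# 1) midpoint step for M' = M x I^{-1} M (preserves H and |M|^2 exactly),
# 2) Lambda_{n+1} = Lambda_n (I - X)^{-1}(I + X),
#    X = (dt/4) (I^{-1}(M_{n+1}+M_n))^,  so Lambda_{n+1} M_{n+1} = Lambda_n M_n.
Ii = (1.0, 2.0, 3.0)
dt = 0.1

def cross(u, v):
    return [u[1]*v[2]-u[2]*v[1], u[2]*v[0]-u[0]*v[2], u[0]*v[1]-u[1]*v[0]]

def f(M):
    return cross(M, [M[i] / Ii[i] for i in range(3)])

def midpoint(M):
    Mn = list(M)
    for _ in range(100):
        mid = [(M[i] + Mn[i]) / 2.0 for i in range(3)]
        fm = f(mid)
        Mn = [M[i] + dt * fm[i] for i in range(3)]
    return Mn

def hat(w):
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]

def mm(X, Y):
    return [[sum(X[i][k]*Y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def inv3(X):  # 3x3 inverse via the adjugate
    (a, b, c), (d, e, g), (h, i, j) = X
    det = a*(e*j - g*i) - b*(d*j - g*h) + c*(d*i - e*h)
    adj = [[e*j - g*i, c*i - b*j, b*g - c*e],
           [g*h - d*j, a*j - c*h, c*d - a*g],
           [d*i - e*h, b*h - a*i, a*e - b*d]]
    return [[adj[r][s] / det for s in range(3)] for r in range(3)]

M0 = [1.0, 0.7, -0.3]
M1 = midpoint(M0)
X = hat([dt / 4.0 * (M0[i] + M1[i]) / Ii[i] for i in range(3)])
I3 = [[float(r == s) for s in range(3)] for r in range(3)]
Imin = [[I3[r][s] - X[r][s] for s in range(3)] for r in range(3)]
Ipls = [[I3[r][s] + X[r][s] for s in range(3)] for r in range(3)]
Lam0 = I3
Lam1 = mm(Lam0, mm(inv3(Imin), Ipls))

H = lambda M: 0.5 * sum(M[i] * M[i] / Ii[i] for i in range(3))
energy_drift = abs(H(M1) - H(M0))
m0 = [sum(Lam0[r][s] * M0[s] for s in range(3)) for r in range(3)]
m1 = [sum(Lam1[r][s] * M1[s] for s in range(3)) for r in range(3)]
momentum_drift = max(abs(m1[r] - m0[r]) for r in range(3))
```

Both the energy and the spatial angular momentum ΛM are conserved to solver tolerance, as the derivation above predicts.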
In brief, if we construct an energy-preserving scheme on so(3)*, we may obtain a scheme approximating the equation of motion by using the conservation of angular momentum. We remark that this depends strongly on the scheme constructed on so(3)*: not every scheme on so(3)* corresponds to a good approximation scheme for the equation of motion on SO(3). The Ge–Marsden algorithm for the Lie–Poisson system is a typical example.

12.4.3 Orbit-Preserving and Angular-Momentum-Preserving Explicit Schemes

Orbit-preserving[LQ95a] here means that the motion trajectory remains on the coadjoint orbit. For the rigid body this means that at every time step

    M_{n+1} = Λ_n M_n,    ∃ Λ_n ∈ SO(3).

The midpoint scheme constructed in Subsection 12.4.2 is a kind of implicit orbit-preserving scheme. Below, we derive explicit orbit-preserving schemes.
The equation is

    Ṁ = M × W = −W × M = −Ŵ · M,    Ŵ ∈ so(3),    Ŵ = (I^{-1}M)^.

Assume the difference scheme to be constructed has the form

    M_{n+1} = exp( b(δt) ) M_n.                (4.4)

It is easy to see that when b(δt) = −δt Ŵ_n = −δt (I^{-1}M_n)^, (4.4) is a first-order scheme. Expanding the scheme (4.4), we obtain

    M_{n+1} = M_n + b(δt) M_n + (1/2) b(δt)² M_n + (1/3!) b(δt)³ M_n + ···.    (4.5)

Using the Taylor expansion, we have

    M_{n+1} = M_n − δt Ŵ_n M_n + (δt²/2) M̈_n + (δt³/3!) M_n^{(3)} + ···
            = M_n − δt Ŵ_n M_n + (δt²/2)( (M_n × W_n) × W_n )
              + (δt²/2)( M_n × I^{-1}(M_n × W_n) ) + ···.        (4.6)

Let

    b(δt) = δt B_1 + δt² B_2 + δt³ B_3 + ···;

substituting into (4.5) and retaining only the first two terms,

    M_{n+1} = M_n + δt B_1 M_n + δt² B_2 M_n + (1/2)(δt B_1 + δt² B_2)² M_n + o(δt³)
            = M_n + δt B_1 M_n + δt² B_2 M_n + (1/2) δt² B_1² M_n + o(δt³).    (4.7)
Comparing the coefficients of Equation (4.6) with those of (4.7), we have

    B_1 = −Ŵ_n,
    (B_1² + 2B_2) M_n = (M_n × W_n) × W_n + M_n × I^{-1}(M_n × W_n)
                      = Ŵ_n² M_n − ( I^{-1}(M_n × W_n) )^ M_n,

then

    B_1 = −Ŵ_n,    B_2 = −(1/2)( I^{-1}(M_n × W_n) )^.

Likewise, we can construct third- or fourth-order schemes. Here we give only the result

    B_3 = (1/6)[ Ŵ ( I^{-1}(M × W) )^ + 2 ( I^{-1}(M × W) )^ Ŵ + ( I^{-1}(M × W × W) )^
                 + ( I^{-1}( M × I^{-1}(M × W) ) )^ ] − (1/2) B_1 B_2 − (1/2) B_2 B_1.
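The second-order explicit scheme M_{n+1} = exp(b(δt))M_n with b = δt B_1 + δt² B_2 can be sketched as follows (our own illustrative Python; since B_1 and B_2 are hat matrices, b is the hat of an axis vector, and exp(b)M_n is evaluated by the Rodrigues formula, so |M| is preserved exactly; inertia and step values are assumptions):

```python
import math

# Explicit orbit-preserving step M_{n+1} = exp(b) M_n,
#   b^ = dt*B1 + dt^2*B2,  B1 = -W_n^,  B2 = -(1/2)(I^{-1}(M_n x W_n))^,
# with exp(b^) applied via the Rodrigues formula.
Ii = (1.0, 2.0, 3.0)
dt = 1e-2

def cross(u, v):
    return [u[1]*v[2]-u[2]*v[1], u[2]*v[0]-u[0]*v[2], u[0]*v[1]-u[1]*v[0]]

def rodrigues(b, v):
    """exp(b^) v for the axis vector b."""
    th = math.sqrt(sum(x * x for x in b))
    if th < 1e-14:
        return list(v)
    c1, c2 = math.sin(th) / th, (1.0 - math.cos(th)) / th**2
    bv = cross(b, v)
    return [v[i] + c1 * bv[i] + c2 * cross(b, bv)[i] for i in range(3)]

def step(M):
    W = [M[i] / Ii[i] for i in range(3)]
    MxW = cross(M, W)
    B1 = [-W[i] for i in range(3)]                 # axis of -W^
    B2 = [-0.5 * MxW[i] / Ii[i] for i in range(3)] # axis of -(1/2)(I^{-1}(MxW))^
    b = [dt * B1[i] + dt * dt * B2[i] for i in range(3)]
    return rodrigues(b, M)

M = [1.0, 0.4, -0.8]
M1 = step(M)
orbit_drift = abs(math.sqrt(sum(v*v for v in M1)) - math.sqrt(sum(v*v for v in M)))
```

Because the update is a single rotation, |M_{n+1}| = |M_n| to roundoff, and the step agrees with Ṁ = M × W to first order.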

Another way to construct an orbit-preserving scheme is the modified R–K method, which can be described as follows. If the initial value M_0 is known, let

    μ_0 = M_0,
    μ_1 = e^{τ c_{10} (−I^{-1}μ_0)^} M_0,
    μ_2 = e^{τ c_{21} (−I^{-1}μ_1)^} e^{τ c_{20} (−I^{-1}μ_0)^} M_0,
    ···
    μ_r = e^{τ c_{r,r−1} (−I^{-1}μ_{r−1})^} e^{τ c_{r,r−2} (−I^{-1}μ_{r−2})^} ··· e^{τ c_{r,0} (−I^{-1}μ_0)^} M_0;

then the (r + 1)-th order approximation of the equation is

    M = e^{τ c_r (−I^{-1}μ_r)^} e^{τ c_{r−1} (−I^{-1}μ_{r−1})^} ··· e^{τ c_0 (−I^{-1}μ_0)^} M_0.
Comparing the coefficients between the above equation and the Taylor expansion
(4.6), we obtain cij and cs . Take r = 1 as an example.

    μ_1 = e^{τ c_{10} (−I^{-1}μ_0)^} M_0 = e^{−τ c_{10} (I^{-1}M_0)^} M_0
        = ( 1 − τ c_{10} (I^{-1}M_0)^ + (τ²/2) c_{10}² ((I^{-1}M_0)^)² + O(τ³) ) M_0,

    M = e^{τ c_1 (−I^{-1}μ_1)^} e^{τ c_0 (−I^{-1}μ_0)^} M_0
      = ( 1 − τ c_1 (I^{-1}μ_1)^ + (τ²/2) c_1² ((I^{-1}μ_1)^)² + O(τ³) )
        ( 1 − τ c_0 (I^{-1}M_0)^ + (τ²/2) c_0² ((I^{-1}M_0)^)² + O(τ³) ) M_0
      = ( 1 − τ c_0 (I^{-1}M_0)^ − τ c_1 (I^{-1}μ_1)^ + (τ²/2) c_0² ((I^{-1}M_0)^)²
          + (τ²/2) c_1² ((I^{-1}μ_1)^)² + τ² c_0 c_1 (I^{-1}μ_1)^ (I^{-1}M_0)^ + O(τ³) ) M_0
      = M_0 − τ (c_0 + c_1) (I^{-1}M_0)^ M_0 + τ² c_1 c_{10} ( I^{-1}(M_0 × I^{-1}M_0) )^ M_0
        + (τ²/2)( c_0² + c_1² + 2 c_0 c_1 ) ((I^{-1}M_0)^)² M_0 + O(τ³).

By the Taylor expansion, we have

    c_0 + c_1 = 1,
    c_1 c_{10} = 1/2,                 ⟹    c_0 + c_1 = 1,    c_1 c_{10} = 1/2.
    c_0² + c_1² + 2 c_0 c_1 = (c_0 + c_1)² = 1,
Setting c_0 = c_1 = 1/2, c_{10} = 1, or c_0 = 0, c_1 = 1, c_{10} = 1/2, we obtain a second-order modified R–K method.
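The r = 1 case with c_0 = 0, c_1 = 1, c_{10} = 1/2 can be sketched as follows (our own illustrative Python; each factor e^{τc(−I^{-1}μ)^} is a rotation evaluated by the Rodrigues formula, so the scheme is orbit-preserving by construction):

```python
import math

# Modified R-K (r = 1, c0 = 0, c1 = 1, c10 = 1/2):
#   mu1 = exp(tau*c10*(-I^{-1}mu0)^) M0,
#   M   = exp(tau*c1*(-I^{-1}mu1)^) exp(tau*c0*(-I^{-1}mu0)^) M0.
Ii = (1.0, 2.0, 3.0)          # illustrative inertia
tau = 1e-2

def cross(u, v):
    return [u[1]*v[2]-u[2]*v[1], u[2]*v[0]-u[0]*v[2], u[0]*v[1]-u[1]*v[0]]

def rot(axis, v):
    """exp(axis^) v via the Rodrigues formula."""
    th = math.sqrt(sum(x * x for x in axis))
    if th < 1e-14:
        return list(v)
    c1, c2 = math.sin(th) / th, (1.0 - math.cos(th)) / th**2
    av = cross(axis, v)
    return [v[i] + c1 * av[i] + c2 * cross(axis, av)[i] for i in range(3)]

def minus_w(mu):
    return [-mu[i] / Ii[i] for i in range(3)]      # -I^{-1} mu

def rk2_step(M0, c0=0.0, c1=1.0, c10=0.5):
    mu1 = rot([tau * c10 * x for x in minus_w(M0)], M0)
    M = rot([tau * c0 * x for x in minus_w(M0)], M0)
    return rot([tau * c1 * x for x in minus_w(mu1)], M)

M0 = [0.9, -0.4, 0.6]
M1 = rk2_step(M0)
drift = abs(math.sqrt(sum(v*v for v in M1)) - math.sqrt(sum(v*v for v in M0)))
```

Since every stage is a rotation, |M| is conserved exactly; the coefficients make the composition second-order accurate.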
The literature[CG93] gives modified R–K methods for general dynamical systems. The scheme on so(3)* constructed via the above methods can be written as M_{n+1} = Λ M_n. Taking Λ^{-1}_{n+1} Λ_n = Λ, we obtain Λ_{n+1} = Λ_n Λ^{-1}. It is easy to verify that Λ_{n+1} = Λ_n Λ^{-1} approximates Λ̇ = Λ Ŵ to the same order of accuracy as the scheme M_{n+1} = Λ M_n.

12.4.4 Lie–Poisson Schemes for Free Rigid Body

We have mentioned how to construct schemes that preserve the angular momentum and the Lie–Poisson structure. Free rigid motion is a simple Lie–Poisson system. Among existing methods, the Ge–Marsden algorithm is a first-order method preserving the Lie–Poisson structure (we have proved that this method is unable to maintain the angular momentum). In Section 12.3, we introduced a generating-function method, which is slow. In this section we introduce a fast method: a split Lie–Poisson method[LQ95a], which also preserves the angular momentum.
Because the Hamiltonian function of free rigid motion is separable, we can use the composition method to construct a Lie–Poisson scheme following the procedure for separable systems. McLachlan introduced an explicit method[McL93] which requires the analytic solution of each split subsystem at every step. The midpoint method proposed below is also an explicit Lie–Poisson method, but with fewer computations.
The rigid motion’s Lie–Poisson equation is
⎡ ∂H ⎤
⎡ ⎤ ⎡ ⎤
ẋ1 0 −x3 x2 ⎢ ∂x1 ⎥
⎢ ⎥
⎢ ⎥ ⎢ ⎥ ⎢ ⎥
⎢ ⎥ ⎢ ⎥ ⎢ ∂H ⎥
⎢ ẋ2 ⎥=⎢ x3 0 −x1 ⎥ ⎢ ⎥, (4.8)
⎣ ⎦ ⎣ ⎦ ⎢ ∂ x2 ⎥
⎢ ⎥
ẋ3 −x2 x1 0 ⎣ ⎦
∂H
∂ x3
1
where x = (x1 , x2 , x3 )T ∈ R3 is an angular momentum, H = I −1 x, x is a
2
Hamiltonian function and energy function of system.
For a separable system, $I$ is usually a diagonal matrix. Without loss of generality, let $H = \frac{1}{2}(a_1 x_1^2 + a_2 x_2^2 + a_3 x_3^2)$. According to the decomposition rule, the fewer the split steps the better. We can use a Casimir function of the Lie–Poisson equation to rewrite the system's Hamiltonian function and obtain an equivalent system. Note that $|x|^2 = \sum_{i=1}^{3} x_i^2$ is a first integral of the system. Let
\[
\widetilde{H} = H - \frac{1}{2} a_1 |x|^2 = \frac{1}{2}(a_2 - a_1) x_2^2 + \frac{1}{2}(a_3 - a_1) x_3^2 = H_1 + H_2,
\]
where $H_1 = \frac{1}{2}(a_2 - a_1)x_2^2$, $H_2 = \frac{1}{2}(a_3 - a_1)x_3^2$.
2 2
12.4 Construction of Structure Preserving Schemes for Rigid Body 531

Substituting $H_1$ into the Lie–Poisson equation (4.8), we have
\[
\dot{x} = J(x)\frac{\partial H_1}{\partial x} =
\begin{pmatrix} -(a_2 - a_1) x_2 x_3 \\ 0 \\ (a_2 - a_1) x_1 x_2 \end{pmatrix},
\tag{4.9}
\]
where
\[
J(x) = \begin{pmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{pmatrix}.
\]
This equation can be simplified to a standard symplectic system
\[
\begin{cases}
\dot{x}_1 = -(a_2 - a_1) x_2 x_3,\\
\dot{x}_3 = (a_2 - a_1) x_1 x_2,
\end{cases}
\tag{4.10}
\]
where $x_2$ is a constant.
Among symplectic difference schemes for the standard symplectic system (4.10), only a few can preserve the Lie–Poisson structure of the original system (4.9).

Theorem 4.1. For the system (4.9), the midpoint scheme is a Lie–Poisson scheme[LQ95a] .

In order to prove Theorem 4.1, we first need the following lemma.

Lemma 4.2. For the system (4.9), a symplectic algorithm for the standard symplectic system (4.10) preserves the Poisson structure if and only if the following three conditions are satisfied:
\[
\begin{cases}
x_{11} x_3 - x_{13} x_1 = \hat{x}_3,\\
x_{31} x_3 - x_{33} x_1 = -\hat{x}_1,\\
x_{12}\hat{x}_1 + x_{32}\hat{x}_3 = 0,
\end{cases}
\tag{4.11}
\]
where $x_i = x_i^n$, $\hat{x}_i = x_i^{n+1}$, $x_{ij} = \dfrac{\partial \hat{x}_i}{\partial x_j}$.

Proof. By Theorem 1.2, a scheme is Poisson if and only if
\[
\frac{\partial \hat{x}}{\partial x}\, J(x)\, \left(\frac{\partial \hat{x}}{\partial x}\right)^{T} = J(\hat{x}).
\]
Expanding this equation, we get
\[
\begin{pmatrix} x_{11} & x_{12} & x_{13} \\ 0 & 1 & 0 \\ x_{31} & x_{32} & x_{33} \end{pmatrix}
\begin{pmatrix} 0 & -x_3 & x_2 \\ x_3 & 0 & -x_1 \\ -x_2 & x_1 & 0 \end{pmatrix}
\begin{pmatrix} x_{11} & 0 & x_{31} \\ x_{12} & 1 & x_{32} \\ x_{13} & 0 & x_{33} \end{pmatrix}
=
\begin{pmatrix} 0 & -\hat{x}_3 & \hat{x}_2 \\ \hat{x}_3 & 0 & -\hat{x}_1 \\ -\hat{x}_2 & \hat{x}_1 & 0 \end{pmatrix},
\]
i.e.,
\[
\begin{pmatrix}
0 & -x_{11} x_3 + x_{13} x_1 & a_{13} \\
x_{11} x_3 - x_{13} x_1 & 0 & x_{31} x_3 - x_{33} x_1 \\
-a_{13} & x_{33} x_1 - x_{31} x_3 & 0
\end{pmatrix}
=
\begin{pmatrix} 0 & -\hat{x}_3 & \hat{x}_2 \\ \hat{x}_3 & 0 & -\hat{x}_1 \\ -\hat{x}_2 & \hat{x}_1 & 0 \end{pmatrix},
\]
where $a_{13} = (x_{12} x_3 - x_{13} x_2) x_{31} + (x_{13} x_1 - x_{11} x_3) x_{32} + (x_{11} x_2 - x_{12} x_1) x_{33}$.
Since the scheme is symplectic for (4.10), we have
\[
-x_{13} x_{31} + x_{11} x_{33} = 1,
\]
so $a_{13}$ can be simplified to
\[
a_{13} = (x_3 x_{31} - x_1 x_{33}) x_{12} + (x_{13} x_1 - x_{11} x_3) x_{32} + x_2.
\]
Comparing the corresponding elements of the matrices on both sides and using the condition $\hat{x}_2 = x_2$, we have
\[
\begin{cases}
x_{11} x_3 - x_{13} x_1 = \hat{x}_3,\\
x_{31} x_3 - x_{33} x_1 = -\hat{x}_1,\\
x_{12}\hat{x}_1 + x_{32}\hat{x}_3 = 0.
\end{cases}
\]
Thus the lemma is proved. $\square$
Now we prove Theorem 4.1.
Proof. The midpoint scheme for the system (4.9) is (here $I = (I_1, I_2, I_3) = (a_1, a_2, a_3)$)
\[
\begin{cases}
\hat{x}_1 = x_1 + \tau (I_1 - I_2)\, \dfrac{\hat{x}_3 + x_3}{2}\, x_2,\\[4pt]
\hat{x}_2 = x_2,\\[4pt]
\hat{x}_3 = x_3 + \tau (I_2 - I_1)\, \dfrac{\hat{x}_1 + x_1}{2}\, x_2.
\end{cases}
\]
Its Jacobian matrix is
\[
\begin{pmatrix} x_{11} & x_{12} & x_{13} \\ 0 & 1 & 0 \\ x_{31} & x_{32} & x_{33} \end{pmatrix},
\]
where
\[
\begin{cases}
x_{11} = 1 + \dfrac{\tau}{2}(I_1 - I_2) x_2 x_{31},\\[4pt]
x_{12} = \dfrac{\tau}{2}(I_1 - I_2)(\hat{x}_3 + x_3) + \dfrac{\tau}{2}(I_1 - I_2) x_2 x_{32},\\[4pt]
x_{13} = \dfrac{\tau}{2}(I_1 - I_2) x_2 x_{33} + \dfrac{\tau}{2}(I_1 - I_2) x_2,\\[4pt]
x_{31} = \dfrac{\tau}{2}(I_2 - I_1) x_2 x_{11} + \dfrac{\tau}{2}(I_2 - I_1) x_2,\\[4pt]
x_{32} = \dfrac{\tau}{2}(I_2 - I_1)(\hat{x}_1 + x_1) + \dfrac{\tau}{2}(I_2 - I_1) x_2 x_{12},\\[4pt]
x_{33} = 1 + \dfrac{\tau}{2}(I_2 - I_1) x_2 x_{13}.
\end{cases}
\]
Solving the above equations, we get
\[
\begin{cases}
x_{11} = x_{33} = \dfrac{1 - a^2}{1 + a^2},\\[6pt]
x_{12} = \dfrac{\tau (I_1 - I_2)\, \hat{x}_3}{1 + a^2},\\[6pt]
x_{13} = -x_{31} = -\dfrac{2a}{1 + a^2},\\[6pt]
x_{32} = \dfrac{\tau (I_2 - I_1)\, \hat{x}_1}{1 + a^2},
\end{cases}
\tag{4.12}
\]
where
\[
a = \frac{\tau}{2}(I_2 - I_1) x_2.
\tag{4.13}
\]
Substituting the system of Equations (4.12) into condition (4.11), we find that all conditions are satisfied. Therefore, by Lemma 4.2, the scheme is Poisson. $\square$
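As a quick numerical illustration (our own sketch, not from the original text; the function name is ours), the implicit midpoint equations above can be solved in closed form, giving a Cayley-type rotation in the $(x_1, x_3)$ plane, and one can verify that the Casimir $|x|^2$ of the system (4.9) is preserved to round-off:

```python
import numpy as np

def midpoint_step(x, tau, I1, I2):
    """One midpoint step for the subsystem (4.10): x2 is frozen,
    (x1, x3) are updated; a = (tau/2)*(I2 - I1)*x2 as in (4.13)."""
    x1, x2, x3 = x
    a = 0.5 * tau * (I2 - I1) * x2
    # Solving the implicit midpoint equations explicitly gives a rotation:
    x1n = ((1 - a**2) * x1 - 2 * a * x3) / (1 + a**2)
    x3n = (2 * a * x1 + (1 - a**2) * x3) / (1 + a**2)
    return np.array([x1n, x2, x3n])

x = np.array([0.3, 0.7, -0.2])
c0 = x @ x                       # Casimir |x|^2
for _ in range(1000):
    x = midpoint_step(x, 0.05, 1.0, 2.0)
print(abs(x @ x - c0))           # stays at round-off level
```

The rotation form makes the exact conservation of $x_1^2 + x_3^2$ (and hence of $|x|^2$, since $x_2$ is frozen) transparent.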
Lemma 4.3 ([FQ91]). Consider the dynamical system $\dot{x} = a(x)$. If $a$ can be split as $a = a_1 + a_2 + \cdots + a_k$, and $g^s \approx e^{sa}$ denotes the phase flow of a dynamical system, then
\[
g_i^{s} \approx e^{s a_i}\ \text{to 2nd order},\ \forall i
\quad\Longrightarrow\quad
g_1^{s/2} \circ \cdots \circ g_k^{s/2} \circ g_k^{s/2} \circ \cdots \circ g_1^{s/2} \approx e^{sa}\ \text{to 2nd order}.
\]
Proof. For the standard symplectic system (4.10), the generalized Euler scheme
\[
\hat{x} = x + \tau J \nabla H\big(B\hat{x} + (I - B)x\big)
\]
is symplectic iff
\[
B = \frac{1}{2}(I + C), \qquad JC + C^{T} J = O.
\tag{4.14}
\]
It is natural to ask what kind of symplectic difference scheme for the system (4.10) is also a Poisson scheme for the system (4.9). Below we restrict our discussion to the generalized Euler scheme (4.14).
Let $C = \begin{pmatrix} c_1 & c_2 \\ c_3 & c_4 \end{pmatrix}$; then the symplectic condition (4.14) becomes $c_4 = -c_1$. Therefore,
\[
B = \frac{1}{2}\begin{pmatrix} 1 + c_1 & c_2 \\ c_3 & 1 - c_1 \end{pmatrix},
\]
and then
\[
B\hat{x} + (I - B)x
= \frac{1}{2}\begin{pmatrix}
(1 + c_1)\hat{x}_1 + (1 - c_1) x_1 + c_2(\hat{x}_3 - x_3)\\[2pt]
c_3(\hat{x}_1 - x_1) + (1 - c_1)\hat{x}_3 + (1 + c_1) x_3
\end{pmatrix}
= \frac{1}{2}\begin{pmatrix} z_1 \\ z_3 \end{pmatrix},
\tag{4.15}
\]
so the Euler scheme becomes
\[
\begin{cases}
\hat{x}_1 = x_1 - a z_3,\\
\hat{x}_3 = x_3 + a z_1,
\end{cases}
\tag{4.16}
\]
where $a$ is defined by Equation (4.13) and $z_1$, $z_3$ are defined by Equation (4.15).
After complex computations, the elements of the Jacobian matrix of the solution are
\[
x_{11} = \frac{(1 + a c_3)(1 - a c_2) - a^2 (1 - c_1)^2}{(1 + a c_3)(1 - a c_2) + a^2 (1 - c_1^2)},
\qquad
x_{13} = \frac{-2a(1 - a c_2)}{(1 + a c_3)(1 - a c_2) + a^2 (1 - c_1^2)},
\]
\[
\hat{x}_3 = \frac{\big((1 + a c_3)(1 - a c_2) - a^2 (1 - c_1)^2\big) x_3 + 2a(1 + a c_3) x_1}{(1 + a c_3)(1 - a c_2) + a^2 (1 - c_1^2)},
\]
\[
x_{11} x_3 - x_{13} x_1 = \hat{x}_3 \quad (\text{see } (4.11)).
\]
Since $x_1$, $x_3$ are arbitrary real numbers, we get
\[
c_1 = 0, \qquad c_2 = -c_3.
\tag{4.17}
\]
Substituting Equation (4.17) into (4.16) and recalculating the Jacobian matrix, we have
\[
x_{31} = \frac{2a(1 - a c_2)}{a^2 + (1 - a c_2)^2},
\qquad
x_{33} = \frac{(1 - a c_2)^2 - a^2}{a^2 + (1 - a c_2)^2},
\]
\[
\hat{x}_1 = \frac{\big((1 - a c_2)^2 - a^2\big) x_1 - 2a(1 - a c_2) x_3}{a^2 + (1 - a c_2)^2}.
\]
It is easy to see that one of the conditions (4.11),
\[
x_{31} x_3 - x_{33} x_1 = -\hat{x}_1,
\]
is satisfied. Likewise, we can prove that the remaining condition of (4.11) is also satisfied. From (4.17), we have $C = cJ$, where $c$ is an arbitrary constant and
\[
J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.
\]
This completes the proof of the lemma. $\square$

12.4.5 Lie–Poisson Scheme on Heavy Top


The Lie–Poisson algorithms discussed in the previous sections are based on the dual space of a semisimple Lie algebra. In practice, we often encounter Lie–Poisson systems whose configuration space is based not on a semisimple Lie algebra, but on the dual space of the semidirect product of a Lie algebra and a linear space. Such systems include, but are not limited to, the heavy top and compressible hydrodynamic flows. The reader can refer to the literature [MRW90] for a more detailed discussion. In such configuration spaces, there exists no momentum mapping as discussed in the previous sections; the angular momentum is preserved only along a specific direction. Therefore, the generating function theory is no longer valid. However, using the Lie–Poisson equations in local coordinates, we can construct Lie–Poisson algorithms and angular-momentum-preserving schemes. We illustrate this with the heavy top as an example.
A heavy top is a rigid body with a fixed point, moving under the action of gravity. The free rigid body is a heavy top whose fixed point is at the center of mass. Its configuration space is the 3-dimensional Euclidean group $E(3)$. Its Lie algebra is no longer semisimple. Its phase space $e^*(3)$ has 6 coordinates $\{x_1, x_2, x_3, p_1, p_2, p_3\}$. The Poisson bracket operation on $e^*(3)$ is
\[
\{x_i, x_j\} = \varepsilon_{ijk} x_k, \qquad \{x_i, p_j\} = \varepsilon_{ijk} p_k, \qquad \{p_i, p_j\} = 0,
\tag{4.18}
\]
where
\[
\varepsilon_{ijk} =
\begin{cases}
\operatorname{sgn}(i, j, k), & i, j, k \ \text{all distinct},\\
0, & \text{otherwise}.
\end{cases}
\]
There are two independent Casimir functions for the bracket (4.18):
\[
f_1 = \sum_{i=1}^{3} p_i^2, \qquad f_2 = \sum_{i=1}^{3} p_i x_i.
\]
Let $H(x, p)$ be the Hamiltonian function of the system, and introduce the notation $u_i = \dfrac{\partial H}{\partial p_i}$, $\Omega_i = \dfrac{\partial H}{\partial x_i}$. Then the Lie–Poisson equation has the form of the Kirchhoff equation
\[
\dot{p} = [p, \Omega], \qquad \dot{x} = [x, \Omega] + [p, u],
\tag{4.19}
\]
where the square bracket denotes the cross product. $H$ is the system's energy; $x$ and $p$ are the angular momentum and the momentum in body coordinates. In the general case, the energy $H$ is a positive definite quadratic form in $x$ and $p$, which can be given as follows:
\[
2H = \sum_{i=1}^{3} a_i x_i^2 + \sum_{i,j=1}^{3} b_{ij}(p_i x_j + x_i p_j) + \sum_{i,j=1}^{3} c_{ij} p_i p_j.
\tag{4.20}
\]

For the heavy top, the energy is often expressed as the sum of the kinetic and potential energies, i.e.,
\[
H(x, p) = \frac{x_1^2}{2I_1} + \frac{x_2^2}{2I_2} + \frac{x_3^2}{2I_3} + \gamma_1 p_1 + \gamma_2 p_2 + \gamma_3 p_3,
\tag{4.21}
\]
where $I_i$ are the principal moments of inertia of the rigid body and $\gamma_i$ $(i = 1, 2, 3)$ are the three coordinates of the center of mass. It is easy to see that this is a separable system.
The structure matrix of the Lie–Poisson system is
\[
\begin{pmatrix} J(x) & J(p) \\ J(p) & O \end{pmatrix},
\]
where $J(x)$ is defined in Subsection 12.4.4.


It is more difficult to construct a Lie–Poisson algorithm for the heavy top than for the free rigid body, because the generating function methods are no longer suitable. However, it becomes easier with the composition method and Lemma 4.3.
We first split the Hamiltonian function $H$ of the heavy top system into six parts, $H = \sum_{i=1}^{6} H_i$, where
\[
H_i = \frac{x_i^2}{2I_i}, \qquad H_{i+3} = \gamma_i p_i, \qquad i = 1, 2, 3.
\]
We take $H_1$ and $H_4$ as examples to construct the Lie–Poisson scheme.


First, take $H_1$ as the Hamiltonian function; then
\[
\begin{pmatrix} \dot{x} \\ \dot{p} \end{pmatrix}
=
\begin{pmatrix} J(x) & J(p) \\ J(p) & 0 \end{pmatrix}
\begin{pmatrix} \dfrac{\partial H_1}{\partial x} \\[6pt] 0 \end{pmatrix};
\]
after manipulating, we get
\[
\begin{cases}
\dot{x}_1 = 0,\\[2pt]
\dot{x}_2 = \dfrac{x_3 x_1}{I_1},\\[6pt]
\dot{x}_3 = -\dfrac{x_2 x_1}{I_1},\\[6pt]
\dot{p}_1 = 0,\\[2pt]
\dot{p}_2 = \dfrac{x_1 p_3}{I_1},\\[6pt]
\dot{p}_3 = -\dfrac{x_1 p_2}{I_1}.
\end{cases}
\tag{4.22}
\]

Theorem 4.4. The midpoint scheme of (4.22) is a Poisson scheme for the heavy top.
Proof. By Theorem 1.2, the midpoint scheme is a Poisson scheme iff the mapping
\[
(x, p) \longrightarrow (\hat{x}, \hat{p})
\]
satisfies
\[
\begin{pmatrix}
\dfrac{\partial \hat{x}}{\partial x} & \dfrac{\partial \hat{x}}{\partial p} \\[8pt]
\dfrac{\partial \hat{p}}{\partial x} & \dfrac{\partial \hat{p}}{\partial p}
\end{pmatrix}
\begin{pmatrix} J(x) & J(p) \\ J(p) & O \end{pmatrix}
\begin{pmatrix}
\left(\dfrac{\partial \hat{x}}{\partial x}\right)^{T} & \left(\dfrac{\partial \hat{p}}{\partial x}\right)^{T} \\[8pt]
\left(\dfrac{\partial \hat{x}}{\partial p}\right)^{T} & \left(\dfrac{\partial \hat{p}}{\partial p}\right)^{T}
\end{pmatrix}
=
\begin{pmatrix} J(\hat{x}) & J(\hat{p}) \\ J(\hat{p}) & O \end{pmatrix}.
\tag{4.23}
\]
Denote $\dfrac{\partial \hat{y}}{\partial z} = \hat{y}_z$; since $\hat{x}$ in (4.22) is independent of $p$, we have $\hat{x}_p = 0$, and expanding Equation (4.23) gives
\[
\begin{cases}
\hat{x}_x J(x) \hat{x}_x^{T} = J(\hat{x}),\\[2pt]
\hat{x}_x J(x) \hat{p}_x^{T} + \hat{x}_x J(p) \hat{p}_p^{T} = J(\hat{p}),\\[2pt]
\hat{p}_x J(x) \hat{p}_x^{T} + \hat{p}_p J(p) \hat{p}_x^{T} + \hat{p}_x J(p) \hat{p}_p^{T} = 0.
\end{cases}
\tag{4.24}
\]
From the results of Subsection 12.4.4, the first equation of the system (4.24) obviously holds. Note also
\[
\hat{p}_x = \begin{pmatrix} 0 & 0 & 0 \\ p_{21} & 0 & 0 \\ p_{31} & 0 & 0 \end{pmatrix},
\qquad
\hat{p}_p = \begin{pmatrix} 1 & 0 & 0 \\ 0 & p_{22} & p_{23} \\ 0 & p_{32} & p_{33} \end{pmatrix},
\]
where $p_{21} = \dfrac{\partial \hat{p}_2}{\partial x_1}$, $p_{31} = \dfrac{\partial \hat{p}_3}{\partial x_1}$, $p_{ij} = \dfrac{\partial \hat{p}_i}{\partial p_j}$ $(i, j = 2, 3)$. Through computation, we have
\[
p_{22} = x_{22}, \quad p_{23} = x_{23}, \quad p_{32} = x_{32}, \quad p_{33} = x_{33},
\qquad
p_{21} = \frac{\tau p_3/I_1}{1 + a^2}, \quad p_{31} = -\frac{\tau p_2/I_1}{1 + a^2},
\tag{4.25}
\]
where $a$ is defined by the Euler scheme (4.16).
Substituting (4.25) into Equation (4.24), the 2nd and 3rd equations of (4.24) also hold. $\square$
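The composition approach of Lemma 4.3 is easy to try numerically. The sketch below is our own (function names and the Strang-type ordering are ours; we assume $[\,\cdot\,,\cdot\,]$ in (4.19) is the ordinary cross product and use the exact sub-flows of the six split Hamiltonians instead of midpoint substeps, for simplicity). Both Casimir functions $f_1 = \sum p_i^2$ and $f_2 = \sum p_i x_i$ are then conserved to round-off:

```python
import numpy as np

def rot(i, angle):
    """Rotation by `angle` about the i-th coordinate axis (Rodrigues formula)."""
    e = np.zeros(3); e[i] = 1.0
    ehat = np.array([[0.0, -e[2], e[1]],
                     [e[2], 0.0, -e[0]],
                     [-e[1], e[0], 0.0]])
    return np.eye(3) + np.sin(angle) * ehat + (1 - np.cos(angle)) * (ehat @ ehat)

def flow_kinetic(x, p, tau, i, I):
    """Exact flow of H_i = x_i^2/(2 I_i): x and p rotate about axis i; x_i is frozen."""
    R = rot(i, -tau * x[i] / I[i])
    return R @ x, R @ p

def flow_potential(x, p, tau, i, gamma):
    """Exact flow of H_{i+3} = gamma_i p_i: p is constant, x drifts linearly."""
    e = np.zeros(3); e[i] = 1.0
    return x + tau * gamma[i] * np.cross(p, e), p

def strang_step(x, p, tau, I, gamma):
    # Symmetric composition of the six sub-flows, in the spirit of Lemma 4.3.
    for i in range(3):
        x, p = flow_kinetic(x, p, tau / 2, i, I)
    for i in range(3):
        x, p = flow_potential(x, p, tau / 2, i, gamma)
    for i in reversed(range(3)):
        x, p = flow_potential(x, p, tau / 2, i, gamma)
    for i in reversed(range(3)):
        x, p = flow_kinetic(x, p, tau / 2, i, I)
    return x, p

I = np.array([1.0, 2.0, 3.0]); gamma = np.array([0.0, 0.0, 1.0])
x = np.array([0.4, -0.3, 0.8]); p = np.array([0.1, 0.2, -0.1])
f1, f2 = p @ p, p @ x            # the two Casimir functions
for _ in range(500):
    x, p = strang_step(x, p, 0.01, I, gamma)
print(abs(p @ p - f1), abs(p @ x - f2))   # both at round-off level
```

Conservation holds exactly because each rotation sub-flow preserves $|p|$ and $p \cdot x$, while each drift sub-flow moves $x$ orthogonally to $p$.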

If $H_4$ is taken as the Hamiltonian function of the system, the equation degenerates into one with constant right-hand side, and constructing the Lie–Poisson scheme is trivial.
For a Hamiltonian function of general form, we need to perform a transformation so that the equation is easier to construct the Lie–Poisson scheme for. Take the free rigid body as an example.
For a quadratic form, we have
\[
H = \sum H_i + \sum H_{ij} = \sum \frac{1}{2} a_i x_i^2 + \sum \frac{1}{2} a_{ij}(x_i + x_j)^2,
\]
i.e., we can eliminate the mixed terms and transform it into a sum of squares. Next, we construct the Lie–Poisson scheme for the system with $H_{ij}$ as the Hamiltonian function. Take $H_{12}$ as an example:
\[
\dot{x} = J(x)\frac{\partial H_{12}}{\partial x}.
\tag{4.26}
\]
It is easy to see that $x_1 + x_2$ is a Casimir function of the system. Expanding Equation (4.26) yields
\[
\begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{pmatrix}
=
\begin{pmatrix} -a_{12} x_3 (x_1 + x_2) \\ a_{12} x_3 (x_1 + x_2) \\ a_{12}(x_1^2 - x_2^2) \end{pmatrix}.
\tag{4.27}
\]
Since $x_1 + x_2$ is a constant, denote $c = x_1 + x_2$; then Equation (4.27) becomes
\[
\begin{cases}
\dot{x}_1 = -c\, a_{12} x_3,\\
\dot{x}_2 = c\, a_{12} x_3,\\
\dot{x}_3 = c\, a_{12}(c - 2x_2).
\end{cases}
\tag{4.28}
\]
The midpoint scheme for the above equations is no longer a Lie–Poisson scheme. However, we can solve the system of Equations (4.28) analytically without difficulty.
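To make the last remark concrete, here is our own sketch of the analytic solution (the substitution and function name are ours): with $c_a = c\,a_{12}$ and $y = c - 2x_2$, the system (4.28) reduces to $\dot{y} = -2 c_a x_3$, $\dot{x}_3 = c_a y$, a harmonic oscillator with frequency $\omega = \sqrt{2}\,|c_a|$.

```python
import numpy as np

def exact_flow_h12(x, t, a12):
    """Closed-form flow of (4.28); c = x1 + x2 is a constant of motion."""
    x1, x2, x3 = x
    c = x1 + x2
    ca = c * a12
    if ca == 0.0:
        return np.array([x1, x2, x3])
    w = np.sqrt(2.0) * abs(ca)            # oscillation frequency
    y = c - 2.0 * x2                      # satisfies y'' = -2*ca^2*y
    # exp(tA) = cos(wt) I + (sin(wt)/w) A for A = [[0, -2ca], [ca, 0]]:
    y_t = y * np.cos(w * t) - (2.0 * ca / w) * x3 * np.sin(w * t)
    x3_t = x3 * np.cos(w * t) + (ca / w) * y * np.sin(w * t)
    x2_t = (c - y_t) / 2.0
    return np.array([c - x2_t, x2_t, x3_t])

x = np.array([0.7, -0.2, 0.5])
xt = exact_flow_h12(x, 1.3, 0.8)
# Both the Casimir x1 + x2 and |x|^2 are conserved by this sub-flow.
print(xt[0] + xt[1] - (x[0] + x[1]), xt @ xt - x @ x)
```

Since the sub-flow is exact, it preserves both $x_1 + x_2$ and $|x|^2$, so it can be used directly as a building block in the composition of Lemma 4.3.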

12.4.6 Other Lie–Poisson Algorithm


Apart from the Lie–Poisson algorithms described above, there are other Lie–Poisson algorithms, including but not limited to the constrained Hamiltonian algorithm of McLachlan and Scovel [MS95], the discrete Lagrangian system approach of Veselov [Ves91, Ves88], as well as the reduction method mentioned before. Below, we give a brief introduction to these methods.
1. Constrained Hamiltonian algorithm
A detailed description of the constrained Hamiltonian algorithm can be found in the literature [MS95] and its references. Here we apply it to rigid motion only. The configuration space for rigid motion is $N = SO(n)$. Take a larger linear space $M = gl(n)$. Then the constraint function of $N$ in $M$ is
\[
\phi(q) = q^T q - 1, \qquad \forall q \in M.
\]
Note that $\phi(q) = 0$ on $N$, and $d\phi(q): T_q(M) \rightarrow \mathbf{R}$ is a differential mapping.


Assume that on $T^*M$ there exists an unconstrained system of Hamiltonian equations
\[
\dot{p} = -\partial_q H, \qquad \dot{q} = \partial_p H.
\]
Then on $T^*N$, if the local coordinates $(p, q)$ on $T^*M$ are still used, we should have
\[
\phi(q) = 0 \;\Longrightarrow\; d\phi \cdot \dot{q} = d\phi \cdot \partial_p H = \{H, \phi\}.
\]
Therefore, on $T^*M$ there exists an embedded submanifold
\[
C_M = \{(p, q) \in T^*M : \phi(q) = 0,\ \{H, \phi\} = 0\},
\]
which induces a mapping
\[
\psi: C_M \longrightarrow T^*N, \qquad (p, q) \mapsto (p^-, q),
\]
where $p^- = \psi(p, q)$. It is easy to verify that this is an isomorphic mapping preserving the symplectic structure.
There exist constrained equations of the dynamical system on $C_M$:
\[
\begin{cases}
\dot{q} = \partial_p H,\\
\dot{p} = -\partial_q H + d\phi \cdot \mu.
\end{cases}
\tag{4.29}
\]

If it is easy to construct a structure-preserving scheme for Equation (4.29) (e.g., when (4.29) is a separable system), then we can use the map $\psi$ to induce the algorithm on $T^*N$.
Take $SO(n)$ as an example. On $TM$ we have a Lagrangian function $L(q, \dot{q}) = \frac{1}{2}\mathrm{tr}(\dot{q} J \dot{q}^T)$. Using the Legendre transformation, we can obtain the Hamiltonian function $H(p, q) = \frac{1}{2}\mathrm{tr}(p J^{-1} p^T)$ on $T^*M$. Therefore, using (4.29), we can obtain the constrained Hamiltonian equations of the dynamical system:
\[
\begin{cases}
\dot{q} = p J^{-1},\\
\dot{p} = d\phi \cdot \mu = 2 q \mu,
\end{cases}
\tag{4.30}
\]
which is obviously a separable Hamiltonian system; it is easy to construct an explicit symplectic difference scheme for it. But on $T^*N$, the Hamiltonian function becomes $H(p, q) = \frac{1}{4}\mathrm{tr}\big(I^{-1}(q^T p)(q^T p)^T\big)$, and its Hamiltonian equations are
\[
\begin{cases}
\dot{q} = q I^{-1}(q^T p) = \dfrac{\partial H}{\partial p},\\[8pt]
\dot{p} = p I^{-1}(q^T p) = -\dfrac{\partial H}{\partial q},
\end{cases}
\tag{4.31}
\]
where $q \in SO(n)$, $q^T p \in so(n)$. This is not a separable Hamiltonian system; therefore, constructing its symplectic difference method will be difficult and computationally complicated. However, using (4.30) and the map $\psi$, we can construct the algorithm for $SO(n)$ easily. Note that $\psi(p) = \frac{1}{2}(p - q p^T q)$ in this case.
McLachlan and Scovel [MS95] proved that this algorithm preserves the momentum mapping. We remark that the constrained Hamiltonian system has an advantage only when the extended system is separable; otherwise, this algorithm is impractical. Take the rigid body as an example: on $T^*SO(3)$, if the Euler equation is to be solved, there are only 6 unknowns; if we extend it to $T^*GL(3)$, the number of unknowns becomes 18. If the system is not separable, then the computational cost will definitely increase.
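A small numerical illustration (our own sketch, with our own variable names) of the projection $\psi(p) = \frac{1}{2}(p - q p^T q)$: for $q \in SO(3)$ it maps an arbitrary ambient momentum $p$ so that $q^T \psi(p) = \frac{1}{2}(q^T p - p^T q)$ is antisymmetric, i.e., the result is a valid momentum over $SO(3)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random rotation q in SO(3), built from a QR decomposition.
q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(q) < 0:
    q[:, 0] = -q[:, 0]           # flip a column to get determinant +1

p = rng.standard_normal((3, 3))          # arbitrary "ambient" momentum
p_minus = 0.5 * (p - q @ p.T @ q)        # the projection psi(p)

# q^T p_minus = (q^T p - p^T q)/2 is antisymmetric by construction.
S = q.T @ p_minus
print(np.max(np.abs(S + S.T)))           # round-off level
```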
2. Veselov–Moser algorithm
The Veselov–Moser algorithm [MV91] first discretizes the Lagrange function and then applies the Legendre transformation to the discrete Lagrange function. The constructed algorithm preserves the discrete symplectic structure, and thus also preserves the system's Lie–Poisson structure. The concrete procedure is as follows:
1° First discretize the Lagrange function.
2° Add the constraint and find the solution of $\delta S = 0$.
3° Obtain the discrete equation.
4° Solve this equation.
For $SO(n)$, $S = \sum_{k=1}^{n} \mathrm{tr}(X_k J X_{k+1}^T)$. The constrained Lagrange function is
\[
L = S + \sum_{k=1}^{n} \mathrm{tr}\big(\Lambda_k(X_k X_k^T - 1)\big),
\]
then
\[
\delta L = 0 \;\Longrightarrow\; X_{k+1} J + X_{k-1} J = \Lambda_k X_k, \qquad \forall k \in \mathbf{Z};
\]
from this, we can have a system of equations
\[
\begin{cases}
M_{k+1} = w_k M_k w_k^{-1},\\
M_k = w_k^T J - J w_k,
\end{cases}
\qquad w_k \in O(n),
\tag{4.32}
\]
where $w_k = X_{k+1}^{-1} X_k$. It is easy to prove that this discrete system of equations converges to the continuous system of Euler–Arnold equations:
\[
\begin{cases}
\dot{M} = [M, \Omega],\\
M = J\Omega + \Omega J,
\end{cases}
\qquad \Omega \in o(n).
\tag{4.33}
\]

To solve Equation (4.32), the key lies in solving for $w_k$. In order to make the iteration $(X_k, Y_k) \rightarrow (X_{k+1}, Y_{k+1})$, $Y_k = X_{k+1}$, symplectic, we need
\[
Y_{k+1} J + X_k J = \Lambda_k X_{k+1}, \qquad \Lambda_k \in Sm(n).
\]
This is because
\[
Y_{k+1} J + X_k J = X_{k+1} w_{k+1}^T J + X_k J
= X_{k+1}(w_{k+1}^T J + X_{k+1}^T X_k J)
= X_{k+1}(w_{k+1}^T J + w_k J).
\]
See also
\[
J w_k^T - w_k J = M_{k+1} = w_{k+1}^T J - J w_{k+1},
\]
then
\[
J w_k^T + J w_{k+1} = w_{k+1}^T J + w_k J,
\]
i.e., $w_{k+1}^T J + w_k J$ is symmetric. Thus there exists $\Lambda_k$, $\Lambda_k = \Lambda_k^T$, such that
\[
X_{k+1}(w_{k+1}^T J + w_k J) = \Lambda_k X_{k+1}.
\]
Therefore,
\[
Y_{k+1} J + X_k J = \Lambda_k X_{k+1}
\]
satisfies the symplectic condition.
The next question is how to solve $w_k^T J - J w_k = M_k$ for $w_k$. The numerical experiments show that not all solutions $w_k$ satisfying Equations (4.32) are the solutions we want. To solve $w_k$ quickly, we propose to use the quaternion method: $w \in SO(3)$ corresponds to an element $q = (q_0, q_1, q_2, q_3)$ in $SH_1$; their relations will be given in Section 12.5. Then the second equation in Equation (4.32) becomes
\[
\begin{cases}
2(\alpha_2 - \alpha_1) q_2 q_1 + 2(\alpha_1 + \alpha_2) q_3 q_0 = -\delta t\, m_3,\\
2(\alpha_3 - \alpha_1) q_3 q_1 - 2(\alpha_3 + \alpha_1) q_2 q_0 = \delta t\, m_2,\\
2(\alpha_3 - \alpha_2) q_3 q_2 + 2(\alpha_3 + \alpha_2) q_1 q_0 = -\delta t\, m_1;
\end{cases}
\]
in addition,
\[
q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1.
\]
Solving the above nonlinear equations for $(q_0, q_1, q_2, q_3)$ is not an easy task. We found that when the iteration step size is small, $q_0, q_1, q_2, q_3$ behave reasonably; however, when the step size is large, the solution behaves erratically. Numerical experiments show that solving these nonlinear equations is quite time-consuming, and hence this method is not recommended in practice.
3. Reduction method
The reduction method is based on the momentum mapping discussed in the previous sections. We mentioned in Section 12.2 that the solution of a Lie–Poisson system lies in a coadjoint orbit, and this orbit has a non-degenerate symplectic structure. If we can construct a symplectic algorithm on this reduced orbit, then this algorithm is naturally Lie–Poisson; moreover, it preserves the Casimir function and also preserves the orbit. Below, we take $SO(3)$ as an example to illustrate this method.
The coadjoint orbit of $SO(3)$ is a two-dimensional sphere $S_r^2$. On $S_r^2$, we have a symplectic structure
\[
\omega_\mu\big(\xi_{g^*}(\mu), \eta_{g^*}(\mu)\big) = -\mu[\xi, \eta],
\]
and a Hamiltonian function
\[
H_\mu(\mathrm{Ad}^*_{g^{-1}}\mu) = \frac{1}{2}\big\langle I^{-1}\,\mathrm{Ad}^*_{g^{-1}}\mu,\ \mathrm{Ad}^*_{g^{-1}}\mu\big\rangle,
\]
where $\mathrm{Ad}^*_{g^{-1}}\mu$ denotes an element of $S_r^2$.


How to choose the chart and local coordinates on $S_r^2$, so that the symplectic structure becomes simple, is very important. We once selected spherical coordinates as local coordinates, and the corresponding symplectic structure and Hamiltonian function became very complicated. However, if the Euler angle coordinates are used, the equations become very simple.
Let
\[
S_r^2 = \{(x, y, z) \mid x^2 + y^2 + z^2 = r^2\},
\]
where $x, y, z$ are the three angular momenta in the body description. Use the Euler angle coordinates $\theta$, $\varphi$ to perform the coordinate transformation
\[
\begin{cases}
x = r\sin\theta\cos\varphi,\\
y = r\sin\theta\sin\varphi,\\
z = r\cos\theta.
\end{cases}
\]
The Lie–Poisson (Euler) equation then becomes the Hamiltonian equations
\[
\begin{cases}
\dot{\theta} = -\dfrac{1}{r\sin\theta}\dfrac{\partial H}{\partial \varphi},\\[8pt]
\dot{\varphi} = \dfrac{1}{r\sin\theta}\dfrac{\partial H}{\partial \theta},
\end{cases}
\tag{4.34}
\]
where
\[
H = \frac{1}{2}\left(\frac{r^2\sin^2\theta\cos^2\varphi}{I_1} + \frac{r^2\sin^2\theta\sin^2\varphi}{I_2} + \frac{r^2\cos^2\theta}{I_3}\right).
\]

We can construct a non-standard symplectic algorithm for Equations (4.34), or we can simplify the problem further by the transformation
\[
(\theta, \varphi) \rightarrow (\cos\theta, \varphi) = (x_1, x_2),
\]
then
\[
\begin{cases}
\dfrac{d x_1}{dt} = \dfrac{1}{r}\dfrac{\partial H}{\partial x_2},\\[8pt]
\dfrac{d x_2}{dt} = -\dfrac{1}{r}\dfrac{\partial H}{\partial x_1}.
\end{cases}
\]
This is a Hamiltonian system with the standard symplectic structure, and its symplectic algorithm is easy to construct.
To sum up, there are three methods for constructing a Lie–Poisson scheme for a Lie–Poisson system. The first is to lift it to $T^*G$ and construct the symplectic algorithm there (including the constrained Hamiltonian method). The second is direct construction on $\mathfrak{g}^*$ (the generating function method and the composition method). The third is to construct the symplectic algorithm on the reduced coadjoint orbit.
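As one concrete choice on the reduced orbit (our own sketch; the implicit midpoint rule and all function names are ours, and the gradient is worked out from the Hamiltonian $H$ above with $x_1 = \cos\theta$), a symplectic step in the coordinates $(x_1, x_2) = (\cos\theta, \varphi)$ can be written as:

```python
import numpy as np

r = 1.0
I1, I2, I3 = 1.0, 2.0, 3.0

def H(x1, x2):
    # Reduced Hamiltonian, with x = r sin(theta) cos(phi) etc. and x1 = cos(theta).
    s2 = 1.0 - x1**2
    return 0.5 * r**2 * (s2 * np.cos(x2)**2 / I1 + s2 * np.sin(x2)**2 / I2 + x1**2 / I3)

def grad_H(x1, x2):
    dH1 = r**2 * x1 * (1.0 / I3 - np.cos(x2)**2 / I1 - np.sin(x2)**2 / I2)
    dH2 = r**2 * (1.0 - x1**2) * np.sin(x2) * np.cos(x2) * (1.0 / I2 - 1.0 / I1)
    return dH1, dH2

def midpoint_step(z, tau):
    """Implicit midpoint rule for dx1/dt = (1/r) dH/dx2, dx2/dt = -(1/r) dH/dx1,
    solved by fixed-point iteration."""
    z = np.asarray(z, dtype=float)
    znew = z.copy()
    for _ in range(50):
        m1, m2 = (z + znew) / 2.0
        dH1, dH2 = grad_H(m1, m2)
        znew = z + tau * np.array([dH2 / r, -dH1 / r])
    return znew

z = np.array([0.3, 0.5])          # (cos(theta), phi), away from the poles
h0 = H(*z)
for _ in range(200):
    z = midpoint_step(z, 0.01)
print(abs(H(*z) - h0))            # small: energy is nearly conserved
```

Being symplectic in the standard structure, the scheme is automatically Lie–Poisson when pulled back to the sphere, and the energy error stays bounded rather than drifting.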

12.5 Relations Among Some Special Groups and Their Lie Algebras
In this section, we present relations among some special groups and their Lie algebras.

12.5.1 Relation Among SO(3), so(3) and SH₁, SU(2)

Let
\[
\Lambda \in SO(3): \quad \Lambda \Lambda^T = 1,\ |\Lambda| = 1;
\qquad
\hat{\xi} \in so(3) \;\Longrightarrow\; \hat{\xi} + \hat{\xi}^T = 0;
\]
and let $q \in SH_1$ be a unit quaternion $q = (q_0, \mathbf{q}) = (q_0, q_1, q_2, q_3)$, $\mathbf{q} = (q_1, q_2, q_3)$, with
\[
q_0^2 + \|\mathbf{q}\|^2 = 1 = q_0^2 + q_1^2 + q_2^2 + q_3^2.
\]
We assume
\[
\forall\, \xi \in \mathbf{R}^3,\ \xi = (\xi_1, \xi_2, \xi_3)
\;\Longrightarrow\;
\hat{\xi} = \begin{pmatrix} 0 & -\xi_3 & \xi_2 \\ \xi_3 & 0 & -\xi_1 \\ -\xi_2 & \xi_1 & 0 \end{pmatrix} \in so(3);
\]
$\xi$ is called the axial vector of $\hat{\xi}$. When $\hat{A} \in so(3)$, $A$ denotes its axial vector.

1. Transformation between SO(3) and SH₁
For all $q \in SH_1$ and $x \in \mathbf{R}^3$ (identified with the pure quaternion $(0, x)$), the product $q x q^{-1}$ represents a rotation of $x$. Using the isomorphic mapping
\[
A(q) = \begin{pmatrix} q_0 + q_1 i & q_2 + q_3 i \\ -q_2 + q_3 i & q_0 - q_1 i \end{pmatrix},
\qquad H \simeq \mathbf{C}^2,
\]
we can obtain: $\forall q \in H$, $\exists \Lambda \in SO(3)$ such that $\forall x \in \mathbf{R}^3$,
\[
A(q x q^{-1}) = A\big((0, \Lambda x)\big).
\]
From this we can get $\Lambda$. Given $q = (q_0, q_1, q_2, q_3)$, then
\[
\Lambda = 2\begin{pmatrix}
q_0^2 + q_1^2 - \frac{1}{2} & q_1 q_2 - q_0 q_3 & q_0 q_2 + q_1 q_3 \\[2pt]
q_1 q_2 + q_0 q_3 & q_0^2 + q_2^2 - \frac{1}{2} & q_2 q_3 - q_0 q_1 \\[2pt]
q_1 q_3 - q_0 q_2 & q_0 q_1 + q_2 q_3 & q_0^2 + q_3^2 - \frac{1}{2}
\end{pmatrix},
\]
or, in simplified form,
\[
\Lambda = (2 q_0^2 - 1)\,1 + 2 q_0 \hat{\mathbf{q}} + 2\,\mathbf{q} \otimes \mathbf{q}.
\]
It is easy to see that, if $\Lambda = (\Lambda_{ij})$ is known, then
\[
q_0 = \frac{1}{2}\sqrt{1 + \mathrm{tr}\,\Lambda},
\]
\[
q_1 = \frac{\Lambda_{32} - \Lambda_{23}}{4 q_0}, \qquad
q_2 = \frac{\Lambda_{13} - \Lambda_{31}}{4 q_0}, \qquad
q_3 = \frac{\Lambda_{21} - \Lambda_{12}}{4 q_0}
\qquad\Longrightarrow\qquad
\hat{\mathbf{q}} = \frac{1}{4 q_0}(\Lambda - \Lambda^T).
\]
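The two conversions above are easy to check numerically (our own sketch; function names are ours):

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix from a unit quaternion q = (q0, q1, q2, q3)."""
    q0, q1, q2, q3 = q
    return 2.0 * np.array([
        [q0*q0 + q1*q1 - 0.5, q1*q2 - q0*q3,       q0*q2 + q1*q3],
        [q1*q2 + q0*q3,       q0*q0 + q2*q2 - 0.5, q2*q3 - q0*q1],
        [q1*q3 - q0*q2,       q0*q1 + q2*q3,       q0*q0 + q3*q3 - 0.5],
    ])

def rot_to_quat(L):
    """Inverse conversion (valid when 1 + tr(L) > 0)."""
    q0 = 0.5 * np.sqrt(1.0 + np.trace(L))
    q1 = (L[2, 1] - L[1, 2]) / (4.0 * q0)
    q2 = (L[0, 2] - L[2, 0]) / (4.0 * q0)
    q3 = (L[1, 0] - L[0, 1]) / (4.0 * q0)
    return np.array([q0, q1, q2, q3])

q = np.array([0.8, 0.2, -0.4, 0.1])
q = q / np.linalg.norm(q)                        # normalize to SH1
L = quat_to_rot(q)
print(np.max(np.abs(L @ L.T - np.eye(3))))       # orthogonality, round-off
print(np.max(np.abs(rot_to_quat(L) - q)))        # round trip, round-off
```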
2. Relation between so(3) and SO(3)
The relation between $so(3)$ and $SO(3)$ is the relation between a Lie algebra and its Lie group. Let $\hat{\xi}$ be the antisymmetric matrix with axial vector $\xi$; then $\exp(\hat{\xi})$ denotes a rotation in $SO(3)$. We have the expansion
\[
\exp(\hat{\xi}) = \sum_{n=0}^{\infty} \frac{1}{n!}\,\hat{\xi}^n \in SO(3).
\]
According to the properties of $SO(3)$, this expansion has a closed form, i.e., the Rodrigues formula
\[
\Lambda = \exp(\hat{\xi}) = 1 + \frac{\sin\|\xi\|}{\|\xi\|}\,\hat{\xi}
+ \frac{1}{2}\left(\frac{\sin\frac{1}{2}\|\xi\|}{\frac{1}{2}\|\xi\|}\right)^{2}\hat{\xi}^{2}.
\]
There are two proofs of the above formula: one from the geometric point of view and the other from the algebraic point of view. Below, we give details of the algebraic proof.
Let $n = \xi/\|\xi\|$; after simple calculations, the following results hold:
\[
\hat{n}^3 = -\hat{n}, \qquad \hat{n}^4 = -\hat{n}^2, \qquad |n| = 1.
\]
Substituting them into the above series expansion,
\[
\exp(\hat{\xi}) = \exp\big(\|\xi\|\,\hat{n}\big), \qquad n = \frac{\xi}{\|\xi\|},
\]
\[
= 1 + \sin\|\xi\|\,\hat{n} + (1 - \cos\|\xi\|)\,\hat{n}^2
= 1 + \frac{\sin\|\xi\|}{\|\xi\|}\,\hat{\xi}
+ \frac{1}{2}\left(\frac{\sin\frac{1}{2}\|\xi\|}{\frac{1}{2}\|\xi\|}\right)^{2}\hat{\xi}^{2}.
\]
We can prove that $\|\xi\|$ is the angle of the rotation $\exp(\hat{\xi})$.
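The Rodrigues closed form can be checked against the truncated exponential series (our own sketch; function names are ours):

```python
import numpy as np

def hat(xi):
    """Antisymmetric matrix with axial vector xi."""
    return np.array([[0.0, -xi[2], xi[1]],
                     [xi[2], 0.0, -xi[0]],
                     [-xi[1], xi[0], 0.0]])

def rodrigues(xi):
    """Closed-form exp(hat(xi)) via the Rodrigues formula."""
    t = np.linalg.norm(xi)
    X = hat(xi)
    if t < 1e-12:
        return np.eye(3) + X
    return np.eye(3) + (np.sin(t) / t) * X + 0.5 * (np.sin(t/2) / (t/2))**2 * (X @ X)

def exp_series(xi, terms=30):
    """Truncated matrix exponential series, for comparison."""
    X = hat(xi)
    A = np.eye(3)
    term = np.eye(3)
    for n in range(1, terms):
        term = term @ X / n
        A = A + term
    return A

xi = np.array([0.3, -1.1, 0.7])
print(np.max(np.abs(rodrigues(xi) - exp_series(xi))))   # round-off level
```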
12.5 Relation Among Some Special Group and Its Lie Algebra 545

3. Transformation between so(3) and SH₁
The relation between $SO(3)$ and $SH_1$ is manifested by the relation between $so(3)$ and $SH_1$. For a rotation in $SO(3)$, $(\theta, n) \leftrightarrow (\xi, \hat{\xi})$:
\[
\forall\,(\xi, \hat{\xi}) \;\Longrightarrow\; \hat{\xi} \in so(3)
\;\Longrightarrow\;
q_0 = \cos\frac{\|\xi\|}{2},
\qquad
\mathbf{q} = \frac{1}{2}\left(\frac{\sin\frac{1}{2}\|\xi\|}{\frac{1}{2}\|\xi\|}\right)\xi.
\]
When $\|\xi\| \ll 1$, we use
\[
\frac{\sin x}{x} = 1 - \frac{x^2}{6} + \frac{x^4}{120} - \frac{x^6}{5040} + \cdots
\]
to deal with the singularity. If $q_0^2 + \|\mathbf{q}\|^2 \neq 1$, normalization is needed, which is simply division by $\sqrt{q_0^2 + \|\mathbf{q}\|^2}$.
Given $(q_0, \mathbf{q}) \in SH_1$, we need to solve for $\hat{\xi}$. Since $\xi$ has the same direction as $\mathbf{q}$, we have
\[
\xi = \|\xi\|\,\frac{\mathbf{q}}{\|\mathbf{q}\|},
\]
where $\|\xi\|$ can be given by $\|\xi\| = 2\sin^{-1}(\|\mathbf{q}\|)$.
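These formulas, including the series treatment near $\|\xi\| = 0$, translate directly into code (our own sketch; function names and the series cutoff are ours):

```python
import numpy as np

def xi_to_quat(xi):
    """(q0, q) from a rotation vector xi, using the sin(x)/x series near 0."""
    t = np.linalg.norm(xi)
    h = t / 2.0
    if h < 1e-4:
        sinc = 1.0 - h**2 / 6.0 + h**4 / 120.0   # sin(h)/h, truncated series
    else:
        sinc = np.sin(h) / h
    return np.cos(h), 0.5 * sinc * xi

def quat_to_xi(q0, qv):
    """Rotation vector from a unit quaternion; |xi| = 2*asin(|q|)."""
    nq = np.linalg.norm(qv)
    if nq < 1e-12:
        return np.zeros(3)
    return 2.0 * np.arcsin(nq) * qv / nq

xi = np.array([0.2, -0.5, 0.1])
q0, qv = xi_to_quat(xi)
print(abs(q0**2 + qv @ qv - 1.0))                # unit quaternion, round-off
print(np.max(np.abs(quat_to_xi(q0, qv) - xi)))   # round trip, round-off
```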

12.5.2 Representations of Some Functions in SO(3)

By definition, we have
\[
\mathrm{iex}(\hat{\xi}) = \sum_{n=0}^{\infty} \frac{\hat{\xi}^n}{(n+1)!},
\qquad
\chi(\hat{\xi})\,\mathrm{iex}(-\hat{\xi}) = \mathrm{Id}.
\]
For $\hat{\xi} \in so(3)$, from $\left(\dfrac{\hat{\xi}}{\|\xi\|}\right)^3 = -\dfrac{\hat{\xi}}{\|\xi\|}$, we have
\[
\mathrm{iex}(-\hat{\xi}) = \sum_{n=0}^{\infty} \frac{(-\hat{\xi})^n}{(n+1)!}
= \sum_{k=0}^{\infty} \frac{\hat{\xi}^{2k}}{(2k+1)!} + \sum_{k=0}^{\infty} \frac{(-\hat{\xi})^{2k+1}}{(2k+2)!}
\]
\[
= 1 + \sum_{k=1}^{\infty} \frac{(-1)^{k+1}\|\xi\|^{2k}}{(2k+1)!}\left(\frac{\hat{\xi}}{\|\xi\|}\right)^{2}
+ \sum_{k=0}^{\infty} \frac{(-1)^{k+1}\|\xi\|^{2k+1}}{(2k+2)!}\cdot\frac{\hat{\xi}}{\|\xi\|}
\]
\[
= 1 + \frac{\|\xi\| - \sin\|\xi\|}{\|\xi\|^3}\,\hat{\xi}^2 + \frac{\cos\|\xi\| - 1}{\|\xi\|^2}\,\hat{\xi}
= 1 + c_1\hat{\xi} + c_2\hat{\xi}^2,
\]
where $c_1 = \dfrac{\cos\|\xi\| - 1}{\|\xi\|^2}$, $c_2 = \dfrac{\|\xi\| - \sin\|\xi\|}{\|\xi\|^3}$. $\chi(\hat{\xi})$ can be obtained from the formula
\[
\chi(\hat{\xi})\,\mathrm{iex}(-\hat{\xi}) = \mathrm{Id}.
\]
Let $\chi(\hat{\xi}) = 1 + a_1\hat{\xi} + a_2\hat{\xi}^2$; then
\[
\chi(\hat{\xi})\,\mathrm{iex}(-\hat{\xi}) = (1 + a_1\hat{\xi} + a_2\hat{\xi}^2)(1 + c_1\hat{\xi} + c_2\hat{\xi}^2)
\]
\[
= 1 + (a_1 + c_1)\hat{\xi} + (a_1 c_1 + c_2 + a_2)\hat{\xi}^2 + (a_1 c_2 + a_2 c_1)\hat{\xi}^3 + a_2 c_2\hat{\xi}^4
\]
\[
= 1 + \big(a_1 + c_1 - (a_1 c_2 + a_2 c_1)\|\xi\|^2\big)\hat{\xi}
+ \big(c_2 + a_2 + a_1 c_1 - a_2 c_2\|\xi\|^2\big)\hat{\xi}^2
= \mathrm{Id},
\]
therefore
\[
\begin{cases}
a_1 + c_1 - (a_1 c_2 + a_2 c_1)\|\xi\|^2 = 0,\\
a_1 c_1 + c_2 + a_2 - a_2 c_2\|\xi\|^2 = 0.
\end{cases}
\]
Solving the above equations, we have
\[
a_1 = \frac{-c_1}{(1 - c_2\|\xi\|^2)^2 + c_1^2\|\xi\|^2}
= \frac{1 - \cos\|\xi\|}{\sin^2\|\xi\| + (1 - \cos\|\xi\|)^2},
\]
\[
a_2 = \frac{-c_2 + c_2^2\|\xi\|^2 + c_1^2}{(1 - c_2\|\xi\|^2)^2 + c_1^2\|\xi\|^2}
= \frac{\dfrac{(\cos\|\xi\| - 1)^2}{\|\xi\|^2} + \dfrac{\sin\|\xi\| - \|\xi\|}{\|\xi\|} + \dfrac{(\sin\|\xi\| - \|\xi\|)^2}{\|\xi\|^2}}{\sin^2\|\xi\| + (1 - \cos\|\xi\|)^2}.
\]
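The closed forms for $\mathrm{iex}(-\hat{\xi})$ and $\chi(\hat{\xi})$ can be verified numerically (our own sketch; function names are ours):

```python
import numpy as np

def hat(xi):
    return np.array([[0.0, -xi[2], xi[1]],
                     [xi[2], 0.0, -xi[0]],
                     [-xi[1], xi[0], 0.0]])

def iex_series(X, terms=30):
    """iex(X) = sum_{n>=0} X^n / (n+1)!, truncated."""
    A = np.zeros_like(X)
    P = np.eye(3)
    fact = 1.0                       # (n+1)! for n = 0
    for n in range(terms):
        A = A + P / fact
        P = P @ X
        fact *= (n + 2)              # advance to (n+2)!
    return A

xi = np.array([0.4, -0.9, 0.3])
t = np.linalg.norm(xi)
X = hat(xi)
c1 = (np.cos(t) - 1.0) / t**2
c2 = (t - np.sin(t)) / t**3
closed = np.eye(3) + c1 * X + c2 * (X @ X)
print(np.max(np.abs(closed - iex_series(-X))))   # matches the series

d = np.sin(t)**2 + (1.0 - np.cos(t))**2
a1 = (1.0 - np.cos(t)) / d
a2 = ((np.cos(t) - 1.0)**2 / t**2 + (np.sin(t) - t) / t + (np.sin(t) - t)**2 / t**2) / d
chi = np.eye(3) + a1 * X + a2 * (X @ X)
print(np.max(np.abs(chi @ closed - np.eye(3))))  # chi * iex(-X) = Id
```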
Bibliography

[AKW93] M. Austin, P. S. Krishnaprasad, and L.-S. Wang: Almost Poisson integration of rigid
body systems. J. of Comp. Phys., 107:105–117, (1993).
[AM78] R. Abraham and J. E. Marsden: Foundations of Mechanics. Addison-Wesley, Reading,
MA, Second edition, (1978).
[AN90] V. I. Arnold and S. P. Novikov: Dynamical Systems IV. Springer-Verlag, Berlin Heidelberg, (1990).
[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. Springer-Verlag, GTM
60, Berlin Heidelberg, Second edition, (1989).
[CFSZ08] E. Celledoni, F. Fassò, N. Säfström, and A. Zanna: The exact computation of the
free rigid body motion and its use in splitting methods. SIAM J. Sci. Comput., 30(4):2084–
2112, (2008).
[CG93] P. E. Crouch and R. Grossman: Numerical integration of ordinary differential equa-
tions on manifolds. J. Nonlinear. Sci., 3:1–33, (1993).
[CS90] P. J. Channell and C. Scovel: Symplectic integration of Hamiltonian systems. Nonlin-
earity, 3:231–259, (1990).
[CS91] P. J. Channell and C. Scovel: Integrators for Lie–Poisson dynamical systems. Physica D, 50:80–88, (1991).
[Fen85] K. Feng: On difference schemes and symplectic geometry. In K. Feng, editor, Pro-
ceedings of the 1984 Beijing Symposium on Differential Geometry and Differential Equa-
tions, pages 42–58. Science Press, Beijing, (1985).
[Fen86] K. Feng: Symplectic geometry and numerical methods in fluid dynamics. In F. G.
Zhuang and Y. L. Zhu, editors, Tenth International Conference on Numerical Methods in
Fluid Dynamics, Lecture Notes in Physics, pages 1–7. Springer, Berlin, (1986).
[FQ87] K. Feng and M.Z. Qin: The symplectic methods for the computation of Hamiltonian
equations. In Y. L. Zhu and B. Y. Guo, editors, Numerical Methods for Partial Differential
Equations, Lecture Notes in Mathematics 1297, pages 1–37. Springer, Berlin, (1987).
[FQ91] K. Feng and M.Z. Qin: Hamiltonian algorithms for Hamiltonian systems and a com-
parative numerical study. Comput. Phys. Comm., 65:173–187, (1991).
[FWQ90] K. Feng, H.M. Wu, and M.Z. Qin: Symplectic difference schemes for linear Hamil-
tonian canonical systems. J. Comput. Math., 8(4):371–380, (1990).
[FWQW89] K. Feng, H. M. Wu, M.Z. Qin, and D.L. Wang: Construction of canonical dif-
ference schemes for Hamiltonian formalism via generating functions. J. Comput. Math.,
7:71–96, (1989).
[Ge91] Z. Ge: Equivariant symplectic difference schemes and generating functions. Physica
D, 49:376–386, (1991).
[GM88] Z. Ge and J. E. Marsden: Lie–Poisson–Hamilton–Jacobi theory and Lie–Poisson in-
tegrators. Physics Letters A, pages 134–139, (1988).
[HV06] E. Hairer and G. Vilmart: Preprocessed discrete Moser–Veselov algorithm for the full
dynamics of the rigid body. J. Phys. A, 39:13225–13235, (2006).
[Kar04] B. Karasözen: Poisson integrator. Math. Comput. Modelling, 40:1225–1244, (2004).
[Lie88] S. Lie: Zur Theorie der Transformationsgruppen. Christiania, Gesammelte Abh., Christ. Forh. Aar., 13, (1888).

[LQ95a] S. T. Li and M. Qin: Lie–Poisson integration for rigid body dynamics. Computers
Math. Applic., 30:105–118, (1995).
[LQ95b] S. T. Li and M. Qin: A note for Lie–Poisson– Hamilton–Jacobi equation and Lie–
Poisson integrator. Computers Math. Applic., 30:67–74, (1995).
[McL93] R.I. McLachlan: Explicit Lie–Poisson integration and the Euler equations. Physical
Review Letters, 71:3043–3046, (1993).
[MR99] J. E. Marsden and T. S. Ratiu: Introduction to Mechanics and Symmetry. Number 17
in Texts in Applied Mathematics. Springer-Verlag, Berlin, Second edition, (1999).
[MRW90] J. E. Marsden, T. Ratiu, and A. Weinstein: Reduction and Hamiltonian structures on duals of semidirect product Lie algebras. Contemporary Mathematics, 28:55–100, (1990).
[MS95] R. I. McLachlan and C. Scovel: Equivariant constrained symplectic integration. J.
Nonlinear. Sci., 5:233–256, (1995).
[MS96] R. I. McLachlan and C. Scovel: A Survey of Open Problems in Symplectic Integration.
In J. E. Mardsen, G. W. Patrick, and W. F. Shadwick, editors, Integration Algorithms and
Classical Mechanics, pages 151–180. American Mathematical Society, New York, (1996).
[MV91] J. Moser and A. P. Veselov: Discrete versions of some classical integrable systems and
factorization of matrix polynomials. Communications in Mathematical Physics, 139:217–
243, (1991).
[MW83] J.E. Marsden and A. Weinstein: Coadjoint orbits, vortices and Clebsch variables for
incompressible fluids. Phys D, 7: (1983).
[MZ05] R.I. McLachlan and A. Zanna: The discrete Moser–Veselov algorithm for the free
rigid body. Foundations of Computational Mathematics, 5(1):87–123, (2005).
[Olv93] P. J. Olver: Applications of Lie Groups to Differential Equations. GTM 107. Springer-
Verlag, Berlin, Second edition, (1993).
[Qin89] M. Z. Qin: Canonical difference scheme for the Hamiltonian equation. Mathematical Methods in the Applied Sciences, 11:543–557, (1989).
[TF85] H. Tal-Ezer: Spectral methods in time for hyperbolic equations. SIAM J. Numer. Anal., 23(1):11–26, (1985).
[Ves88] A.P. Veselov: Integrable discrete-time systems and difference operators. Funkts. Anal.
Prilozhen, 22:1–33, (1988).
[Ves91] A. P. Veselov: Integrable maps. Russian Math. Surveys, 46:1–51, (1991).
[Wan91] D. L. Wang: Symplectic difference schemes for Hamiltonian systems on Poisson
manifolds. J. Comput. Math., 9(2):115–124, (1991).
[ZQ94] W. Zhu and M. Qin: Poisson schemes for Hamiltonian systems on Poisson manifolds.
Computers Math. Applic., 27:7–16, (1994).
[ZS07a] R. van Zon and J. Schofield: Numerical implementation of the exact dynamics of free rigid bodies. J. of Comp. Phys., 221(1):145–164, (2007).
[ZS07b] R. van Zon and J. Schofield: Symplectic algorithms for simulations of rigid body systems using the exact solution of free motion. Physical Review E, 50:5607, (2007).
Chapter 13.
KAM Theorem of Symplectic Algorithms

Numerical results have shown the overwhelming superiority of symplectic algorithms over conventional non-symplectic methods, especially in simulating the global and structural dynamic behavior of Hamiltonian systems. In the class of Hamiltonian systems, the most important and best-understood systems are the completely integrable ones. Completely integrable systems exhibit regular dynamic behavior, which corresponds to periodic and quasi-periodic motions in the phase spaces. In this chapter, we study whether and to what extent symplectic algorithms can simulate qualitatively and approximate quantitatively the periodic and quasi-periodic phase curves of integrable Hamiltonian systems.

13.1 Brief Introduction to Stability of Geometric


Numerical Algorithms

Among the various kinds of equations of mathematical physics, only a few can be
integrated exactly by quadrature and the rest are unsolvable. However, even an ap-
proximate solution is also valuable in many scientific and engineering problems. In a
wide range of applications, the most powerful and perhaps the only practically feasible
approximation is the numerical method — this is the case, especially in the computer
era. A question arises accordingly: Whether a numerical method can reflect the real
information of exact solutions of original problems properly or simulate accurately?
To a problem described by time evolutionary equations, the solutions can often be
represented by a flow (or semi-flow), which is locally defined on a phase space. Curves
on the phase space which are invariant under the action of the flow (or semi-flow) are
called invariant curves (or positively invariant curves) of the flow (or semi-flow). There
is a natural correspondence between the solutions of the equations and the invariant
curves (or positively invariant curves) of the flow (or semi-flow). The invariant curves,
or positively invariant curves, are called solution curves of the equations. Qualitative
analysis concerns understanding the topological structures of the solution curves and
their limit sets, which are often sub-manifolds of the phase space. In principle, the aim
of a numerical method is not only to pursue an optimal quantitative approximation to
the true solution of the problem locally, but also to preserve as well as possible the
topological and even geometric properties of the solution curves and their limit sets
globally. The latter constitutes the main content of the qualitative analysis of
numerical methods.
Qualitative analysis becomes important in the study of numerical methods be-
cause instability phenomena take place very often even in the numerical simulations
of very stable systems. Numerical treatments of stiff problems show that explicit
methods have a severe time-step restriction, which led to Dahlquist’s pioneering
work about A-stability[Dah63] . Various notions of stability for numerical methods have
been established since then, classifying different types of stable methods for differ-
ent problems. The celebrated linear stability theory (A-stability, A(α)-stability, and
L-stability)[Wid76,Ehl69] is based on the scalar linear equation¹

ẏ = λy (1.1)

and turns out to be powerful for the numerical study of all linear-dissipation-dominated
problems. G-stability, which was also developed by Dahlquist[Dah75] , is characterized
by retaining the contractivity property of any two solutions of nonlinear “contractive”
systems
ẏ = f (y). (1.2)
Here, the “contractivity” of the system (1.2) is defined by the condition

⟨f(u) − f(v), u − v⟩ ≤ 0,        (1.3)

which implies that (d/dt)‖u(t) − v(t)‖ ≤ 0 for any two solutions u(t) and v(t) of (1.2).
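The implication follows from a one-line computation (a standard argument, sketched here in the notation of (1.2) and (1.3)):

```latex
\frac{d}{dt}\,\lVert u(t)-v(t)\rVert^{2}
  = 2\,\bigl\langle \dot{u}(t)-\dot{v}(t),\; u(t)-v(t)\bigr\rangle
  = 2\,\bigl\langle f(u(t))-f(v(t)),\; u(t)-v(t)\bigr\rangle \le 0,
```

so ‖u(t) − v(t)‖ is nonincreasing in t.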
It is remarkable that for linear multistep one-leg methods, G-stability is equivalent
to A-stability[Dah78] . Butcher extended Dahlquist’s idea and developed B-stability the-
ory for Runge–Kutta methods[But75] . An elegant algebraic criterion of B-stability for
Runge–Kutta methods was given by Burrage and Butcher, who further suggested the
notion of algebraic stability of Runge–Kutta methods by only using the algebraic con-
ditions of the criterion[BB79] . The algebraically stable Runge–Kutta methods can in-
herit very important dynamic properties of dissipative systems[HS94] . In many cases,
algebraic stability is equivalent to B-stability[HS81] . Many notions and results about
stability have also been generalized to general linear methods[HLW02,HNW93]. Throughout
much of the latter half of the twentieth century, one of the central tasks of numerical
analysis was the construction and analysis of numerical methods satisfying these
various stability conditions.
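To make the linear stability notions concrete, here is a small sketch (an illustration, not from the book) comparing explicit and implicit Euler on the Dahlquist test equation (1.1) with λ < 0: the explicit growth factor 1 + tλ exceeds 1 in modulus once tλ is large, while the implicit factor 1/(1 − tλ) stays below 1 for every step size, which is exactly A-stability.

```python
# Sketch (not from the book): explicit vs. implicit Euler on y' = lam*y.
# For lam < 0 the exact solution decays; explicit Euler is stable only if
# |1 + t*lam| <= 1, while implicit Euler's growth factor 1/(1 - t*lam)
# has modulus < 1 for every step size t > 0 (A-stability).
lam, t = -100.0, 0.1                          # stiff decay rate, "large" step

explicit_factor = abs(1.0 + t * lam)          # = 9.0  -> unstable
implicit_factor = abs(1.0 / (1.0 - t * lam))  # = 1/11 -> stable

y_exp = y_imp = 1.0
for _ in range(10):
    y_exp = (1.0 + t * lam) * y_exp           # explicit Euler step
    y_imp = y_imp / (1.0 - t * lam)           # implicit Euler step

print(explicit_factor, abs(y_exp))            # 9.0, grows like 9^10
print(implicit_factor, abs(y_imp))            # ~0.09, decays toward zero
```

The same step size that makes explicit Euler blow up is entirely harmless for the A-stable implicit method.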
Stable methods have no stringent step-size restriction in their own applicable
ranges. They can preserve the dynamic stability properties (fixed points, periodic
curves, and attractors) of most dissipative systems[SH96]. Even explicit methods can
properly reflect the key dynamics of dissipative systems if sufficiently small step
sizes are used[Bey87,SH96] . It was also proved that Runge–Kutta methods (including Eu-
ler methods), with small step sizes, can preserve the topological structures of dynamic
trajectories of many structure-stable systems[Li99,Gar96] .
The application of a numerical method to a generic system definitely changes the
structure of the system. On the other hand, conventional methods do not change the
¹ Dahlquist test equation.

topological structures of most dynamic trajectories of typical stable systems (e.g., dis-
sipative systems having motion stability and Morse–Smale systems and Axiom A sys-
tems having structure-stability)[SH96,Li99] . However, this remarkable advantage of the
conventional methods does not carry over to conservative systems. Most of the stable
methods introduce artificial dissipation into conservative systems. They produce
spurious attractors and therefore destroy the qualitative character of the conservative
systems, even if very small step sizes are used. The approximate solution of
conservative systems therefore calls for new numerical methods satisfying more
stringent stability requirements.
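The phenomenon can be seen on the smallest conservative example (a sketch, not from the book): for the harmonic oscillator, the explicit Euler map has Jacobian determinant 1 + t² > 1 and so distorts the energy monotonically at every step, however small t is, while a symplectic method keeps the energy bounded for all times.

```python
# Sketch (not from the book): the harmonic oscillator H(x, y) = (x^2+y^2)/2
# with x' = y, y' = -x.  Explicit Euler inflates H at every step; the
# symplectic Euler map conserves a nearby quadratic invariant, so H stays
# within O(t) of its initial value forever.
def explicit_euler(x, y, t):
    return x + t * y, y - t * x

def symplectic_euler(x, y, t):
    y_new = y - t * x                 # update y with the old x ...
    return x + t * y_new, y_new       # ... then x with the new y

def energy(x, y):
    return 0.5 * (x * x + y * y)

t, steps = 0.05, 4000                 # integrate up to time 200
xe, ye = 1.0, 0.0
xs, ys = 1.0, 0.0
for _ in range(steps):
    xe, ye = explicit_euler(xe, ye, t)
    xs, ys = symplectic_euler(xs, ys, t)

print(energy(xe, ye))                 # grows like 0.5*(1 + t^2)^steps
print(energy(xs, ys))                 # stays near the initial value 0.5
```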
Geometric numerical integration theory for conservative systems has developed
rapidly over the past twenty years. The monographs[SSC94,HLW02,FQ03,LR05] summarize
the main developments and important results of this theory. Qualitative behavior of ge-
ometric integrators has been investigated by many authors[Sha99,Sha00b,HL97,Sto98a,HLW02] .
For symplectic integrators applied to Hamiltonian systems, some stability results, ei-
ther in the spirits of the KAM theory or based on the backward analysis, have been
well established[Sha99,Sha00b,HL97,CFM06,DF07] . The typical stable dynamics of Hamilto-
nian systems, e.g., quasi-periodic motions and their limit sets — minimal invariant
tori, can be topologically preserved and quantitatively approximated by symplectic
discretizations. In this chapter, we give a review of these results. For more details,
readers may refer to the relevant references[Sha99,Sha00b,HLW02,CFM06,DF07].

13.2 Mapping Version of the KAM Theorem


In this section, we introduce the mapping version of the celebrated KAM theorem.
The main results of the theorem stem from answering a question about the stability
of motions of planets in the solar system. This question attracted many great scien-
tists in history, and the culminating breakthrough was achieved by Kolmogorov (1954),
Arnold (1963) and Moser (1962). The monograph[HLW02] gives a nice introduction to
the KAM theorem based on the Hamiltonian perturbation theory. We present some re-
sults about differentiable Cantorian foliation structures of invariant tori in phase space
of an integrable symplectic mapping under perturbations. We give relevant estimates
explicitly in terms of the diophantine constant and nondegeneracy parameters of the
frequency map of the integrable system. As a direct application of these estimates,
we state a generalization of Moser’s small twist theorem to higher dimensions, which
can be applied to prove a numerical version of the KAM theorem for symplectic algo-
rithms.

13.2.1 Formulation of the Theorem

Consider an exact symplectic mapping S : (p, q) → (p̂, q̂) defined in the phase
space I × Tn by

p̂ = p − ∂2 H(p̂, q),   q̂ = q + ∂1 H(p̂, q),        (2.1)

where H : I × Tn → R is the generating function, I is an open and usually
bounded subset of Rn, and Tn is the standard n-torus. In (2.1), ∂1 and ∂2 denote the
gradient operators with respect to the first n and the last n variables, respectively.
When H(p, q) = H0(p) does not depend on q, (2.1) represents an integrable mapping
S = S0 : (p, q) → (p̂, q̂) = (p, q + ω(p)) with the frequency map

ω(p) = ∂H0 (p), p ∈ I, (2.2)

where ∂ denotes the gradient operator with respect to p. Under the mapping S0 ,
the phase space I × Tn is completely foliated into invariant n-tori {p} × Tn ,
p ∈ I. On each torus, the iterations of S0 are linear with frequencies ω = ω(p).
This is a typical integrable case. When a perturbation h(p, q) is added to H0 , i.e.,
H(p, q) = H0 (p) + h(p, q), (2.1) does not define an integrable mapping generally.
However, the KAM theorem shows that the perturbed mapping S still exhibits, to a large
extent, integrable behavior in the phase space if the frequency map ω is nondegenerate
in some sense (see[Arn63,AA89,Arn89,Kol54b,Mos62] for Kolmogorov’s nondegeneracy
and[CS94,Rüs90] for weak nondegeneracy) and the perturbation h is sufficiently small in
some function space. In this chapter, we consider the following nondegeneracy condi-
tion for ω : I → Ω:

θ |p1 − p2 | ≤ |ω(p1 ) − ω(p2 )| ≤ Θ |p1 − p2 | (2.3)

for some 0 < θ ≤ Θ. Here I and Ω are the domains of action variables and the
corresponding frequency values respectively. We always assume that I and Ω are open
in Rn and ω is analytic and can be analytically extended to some complex domain,
say I + r, of the real domain I, where r is the extension radius. We assume (2.3) is
satisfied for p1 , p2 ∈ I + r with |p1 − p2 | ≤ r. Note that this nondegeneracy condition
implies that the frequency map ω is invertible in any ball of radius r and centered in I,
which is stronger than Kolmogorov’s standard nondegeneracy assumption below
(this was already noticed by Pöschel in[Pös82]):

θ|d p| ≤ |d ω(p)| ≤ Θ|d p| for p ∈ I + r. (2.4)

An invariant torus of the integrable system is naturally specified by its frequency
vector. The tori with rationally dependent and even with Liouville frequency vectors
are generally destroyed by perturbations (Poincaré and Mather[Mat88]). The invariant
tori of KAM type are specified by the so-called diophantine frequency vectors
ω = (ω1 , · · · , ωn ),
|e^{i⟨k,ω⟩} − 1| ≥ γ/|k|^τ   for 0 ≠ k = (k1, · · · , kn) ∈ Zn,        (2.5)


with some constants γ > 0, τ > 0, where ⟨k, ω⟩ = Σ_{j=1}^{n} k_j ω_j and
|k| = Σ_{j=1}^{n} |k_j| for integer vectors k ∈ Zn.
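For n = 1, a classical diophantine frequency is the golden-mean rotation ω = 2π(√5 − 1)/2. The sketch below (an illustration, not from the book) checks (2.5) numerically for this ω, confirming that |k| · |e^{ikω} − 1| stays bounded away from zero, so that (2.5) holds here even with the exponent τ = 1 and a suitable γ.

```python
import cmath, math

# Check the small-divisor bound (2.5) for n = 1 with the golden-mean
# frequency omega = 2*pi*g, g = (sqrt(5) - 1)/2.  For this omega the
# quantity |k| * |e^{i k omega} - 1| is bounded away from zero, so (2.5)
# holds with tau = 1 and any gamma below the minimum printed here.
g = (math.sqrt(5.0) - 1.0) / 2.0
omega = 2.0 * math.pi * g

gamma = min(k * abs(cmath.exp(1j * k * omega) - 1.0)
            for k in range(1, 5001))

print(gamma)   # about 1.86, attained at k = 1
```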
We introduce some notation. For an open or closed set I ⊂ Rn and for a ≥ 0,
denote by C^a(I × Tn) the class of isotropic differentiable functions of order a defined
on I × Tn in the sense of Whitney. The norm of a function u ∈ C^a(I × Tn) is denoted
by ‖u‖_{a, I×Tn}. Since we also obtain anisotropic differentiability of the foliations of
invariant tori, we need the class C^{ν1,ν2}(I × Tn) of anisotropic differentiable functions
of order (ν1, ν2), with the norm denoted by ‖u‖_{ν1,ν2; I×Tn} for a function u in the class.
These two classes, endowed with the corresponding norms, are both Banach spaces.
We also use another norm ‖·‖_{ν1,ν2; I×Tn, ρ} for ρ > 0, defined by

‖u‖_{ν1,ν2; I×Tn, ρ} = ‖u ◦ σρ‖_{ν1,ν2; σρ^{−1}(I×Tn)}        (2.6)

for u ∈ C^{ν1,ν2}(I × Tn), where σρ denotes the partial stretching (x, y) → (ρx, y) for
(x, y) ∈ I × Tn. Note that the following relation between the two norms holds for
0 < ρ ≤ 1:

‖u‖_{ν1,ν2; ρ} ≤ ‖u‖_{ν1,ν2} ≤ ρ^{−ν1} ‖u‖_{ν1,ν2; ρ},        (2.7)

where we have dropped the domains to simplify the notation.
Take Ω = ω(I) and denote by Ωγ the set of those frequencies in Ω which satisfy
the diophantine condition (2.5) for given γ > 0 and whose distance to the boundary
of Ω is at least 2γ. The set Ωγ is a Cantor set² and the difference Ω \ ⋃_{γ>0} Ωγ
is a zero set if τ > n + 1. Therefore Ωγ is large for small γ.
The main results of this section are stated as follows:
Theorem 2.1. Given a positive integer n and a real number τ > n + 1, consider the
mapping S defined in the phase space I × Tn by (2.1) with H(p̂, q) = H0(p̂) + h(p̂, q),
where H0 is analytic in I + r with r > 0 and h(p̂, q) belongs to the Whitney class
C^{αλ+λ+τ}(I × Tn) for some λ > τ + 1 and α > 1,

α ∉ Λ = { i/λ + j : i, j ≥ 0 integers }.
Suppose the frequency map ω = ∂H0 : I → Ω satisfies the nondegeneracy condition
(2.3) for p1, p2 ∈ I + r with |p1 − p2| ≤ r, where the constants θ and Θ satisfy
0 < θ ≤ Θ. Then there exists a positive constant δ0, depending only on n, τ, λ and α,
such that for any 0 < γ ≤ min{1, rΘ/2}, if

‖h‖_{αλ+λ+τ, I×Tn; γΘ^{−1}} ≤ δ0 γ^2 θ Θ^{−2},        (2.8)

then there exists a Cantor set Iγ ⊂ I, a surjective map ωγ : Iγ → Ωγ of class C^{α+1},
and a symplectic injection Φ : Iγ × Tn → Rn × Tn of class C^{α,αλ} in the Whitney
sense, such that:
1◦ Φ is a conjugation from S to R. That is, the following equation holds:

S ◦ Φ = Φ ◦ R, (2.9)

where R is the integrable rotation on Iγ × Tn with frequency map ωγ, i.e., R(P, Q) =
(P, Q + ωγ(P)) for (P, Q) ∈ Iγ × Tn. Moreover, Equation (2.9) may be differentiated
as often as Φ allows.
² A subset of Rn is called a Cantor set if it is nowhere dense and complete in Rn.

2◦ If Ω is a bounded open set of type D in Arnold’s sense³, then we have the
following measure estimate

m Eγ ≥ (1 − c4 (θΘ^{−1})^{−n} γ) m E,        (2.10)

where Eγ = Φ(Iγ × Tn) is the union of the invariant tori Φ({P} × Tn), P ∈ Iγ, of S,
and m denotes the invariant Liouville measure on the phase space E = I × Tn; c4 is
a positive constant depending on n, τ, a and the geometric properties of the domain Ω.
3◦ If h is of class C^{βλ+λ+τ} with α ≤ β not in Λ, then furthermore
ωγ ∈ C^{β+1}(Iγ) and Φ ∈ C^{β,βλ}(Iγ × Tn). Moreover,

‖σ_{γΘ^{−1}}^{−1} ◦ (Φ − I)‖_{β,βλ; γΘ^{−1}},  γ^{−1} ‖ωγ − ω‖_{β+1; γΘ^{−1}} ≤ c5 γ^{−2} Θ ‖h‖_{βλ+λ+τ; γΘ^{−1}},        (2.11)

with a constant c5 depending on n, τ, λ and β; here we have dropped the domains in
the notation of the norms.
4◦ For each ω^∗ ∈ Ωγ, there exist p^∗ ∈ I and P^∗ ∈ Iγ such that ω(p^∗) =
ωγ(P^∗) = ω^∗ and

|P^∗ − p^∗| ≤ c6 (γθΘ^{−1})^{−1} ‖h‖_{αλ+λ+τ, I×Tn; γΘ^{−1}},        (2.12)

where c6 is a positive constant depending on n, τ , λ and α.

Theorem 2.2. If the frequency map ω satisfies the nondegeneracy condition (2.4),
then the conclusions of Theorem 2.1 are still true with the same estimates (2.10) –
(2.12) under the following smallness condition for h,

‖h‖_{αλ+λ+τ, I×Tn; γΘ^{−1}} ≤ δ0 γ^2 θ^2 Θ^{−3},        (2.13)

where δ0 > 0 depends only on n, τ, λ and α and is sufficiently small.

Remark 2.3. The above two theorems are stated for the case when h is finitely many
times differentiable. If h is infinitely many times differentiable or analytic, we have the
following conclusions, which are easily derived by similar remarks to those of[Pös82] .
1◦ If h ∈ C^∞(I × Tn), then ωγ ∈ C^∞(Iγ) and Φ ∈ C^∞(Iγ × Tn), and the
estimates (2.11) hold for any β ≥ α.
2◦ If h ∈ C^ω(I × Tn), then we have, further, Φ ∈ C^{∞,ω}(Iγ × Tn) under an
additional smallness condition for δ0 which also depends on the radius of analyticity of
h with respect to angle variables. Here C ω denotes the class of real analytic functions.

13.2.2 Outline of the Proof of the Theorems


In this section, we give an outline of the proof of Theorem 2.1; for detailed arguments,
we refer to[Sha00a].
³ For example, a domain with piecewise smooth boundary is of type D in Arnold’s sense.

a. We transform the mapping S, by the partial coordinate stretching σρ : (x, y) →
(p, q) = (ρx, y), to T = σρ^{−1} ◦ S ◦ σρ : (x, y) → (x̂, ŷ). The mapping T is determined
by

x̂ = x − ∂2 F(x̂, y),   ŷ = y + ∂1 F(x̂, y),        (2.14)
where
F (x, y) = F0 (x) + f (x, y) (2.15)
is well defined on Iρ × Tn with

F0 (x) = ρ−1 H0 (ρx), f (x, y) = ρ−1 h(ρx, y) (2.16)

and
Iρ = ρ−1 I = {x ∈ Rn |ρx ∈ I}. (2.17)
For the time being, ρ is regarded as a free parameter. F0 (x) is real analytic in
Iρ + rρ and f belongs to the class C a (Iρ × Tn ) where rρ = ρ−1 r, a = αλ + λ + τ .
So the new mapping T satisfies the assumptions of Theorem 2.1 in which only I,
r, H, H0 , and h are replaced by Iρ , rρ , F , F0 , and f respectively. Accordingly, the
frequency map of the integrable mapping associated to the generating function F0
turns into
ω̃(x) = ∂F0(x),   x ∈ Iρ,
and the nondegeneracy condition for the mapping turns out to be

ρθ |x1 − x2| ≤ |ω̃(x1) − ω̃(x2)| ≤ ρΘ |x1 − x2|        (2.18)

for x1 , x2 ∈ Iρ + rρ with |x1 − x2 | ≤ rρ . In addition, from (2.16), we have

‖f‖_{a, Iρ×Tn} = ρ^{−1} ‖h‖_{a, I×Tn; ρ}.

From now on, we fix ρ = γΘ^{−1}. Then the assumption 0 < γ ≤ rΘ/2 in Theorem
2.1 implies that 0 < ρ ≤ r/2. Hence rρ ≥ 2. Let Iρ∗ be the set of points in Iρ with
distance to its boundary at least one, and let

Iρ;γ = ω̃^{−1}(Ωγ) ∩ Iρ.        (2.19)

Then, from (2.18) and the definition of Ωγ it follows that

(Iρ;γ + 1) ∩ Rn ⊂ Iρ∗ ⊂ (Iρ∗ + 1) ∩ Rn ⊂ Iρ , (2.20)

and

γμ |x1 − x2| ≤ |ω̃(x1) − ω̃(x2)| ≤ γ |x1 − x2|,   μ = θΘ^{−1}        (2.21)

for x1 , x2 ∈ Iρ + 2 with |x1 − x2 | ≤ 2.


b. We approximate f by real analytic functions. Let

sj = s0 4^{−j},   rj = sj^λ,   j = 0, 1, 2, · · ·        (2.22)



with fixed λ > τ + 1 and s0 > 0. Let

Uj = Iρ × Tn + (4sj , 4sj )

be the complex extended domain of Iρ × Tn with extension widths 4sj for Iρ and
Tn respectively[Pös82]. By an approximation lemma[Pös82], there exist real analytic
functions fj defined on U0 with f0 = 0 such that, in case f ∈ C^b(I × Tn) with b ≥ a,

|fj − fj−1|_{Uj} ≤ cb sj^b ‖f‖_{b; Iρ×Tn},   j = 1, 2, · · · ,

‖f − fj‖_{b′, Iρ∗×Tn} −→ 0  (j −→ ∞)  for 0 < b′ < b,        (2.23)

where cb is a positive constant only depending on b, n and s0 but not depending on the
domain Iρ and hence not depending on the parameter ρ. Moreover, we may require fj
to be 2π-periodic in the last n variables. In (2.23), | · |Uj denotes the maximum norm
of analytic functions on the complex domain Uj .
c. We give the KAM iteration process which essentially follows Pöschel [Pös82] in
the Hamiltonian system case. For each fj, we define a mapping Tj : (x, y) → (x̂, ŷ) by

x̂ = x − ∂2 Fj(x̂, y),   ŷ = y + ∂1 Fj(x̂, y)        (2.24)
with Fj(x̂, y) = F0(x̂) + fj(x̂, y). For each j, the function Fj(x̂, y) is well-defined and
real analytic on Uj if 4sj ≤ rρ = ρ^{−1}r; this inequality is satisfied for j = 0, 1, · · ·
if we choose 0 < s0 ≤ 4^{−1}, noting that 0 < γ < rΘ/2. We can show that each Tj
for j ≥ 0 is a well-defined analytic mapping on a domain of complex extension of
the phase space Iρ × Tn , which is appropriate for the KAM iterations if h is bounded
by (2.8) with a sufficiently small δ0 > 0. It follows from (2.23) that Tj converges to
T in C b−1−κ -norm for any κ > 0 on some subdomain Iρ∗ × Tn of the phase space
Iρ × Tn , where T is well-defined. The central problem is to find transformations
Φj and integrable rotations Rj, defined on a sequence of nested complex domains
that intersect a nonempty Cantor set, say Ĩρ;γ × Tn, such that the following holds as
j → ∞ on Ĩρ;γ × Tn in some Whitney classes:

Cj = Rj^{−1} ◦ Φj^{−1} ◦ Tj ◦ Φj −→ identity,   Φj −→ Φ̃,   Rj −→ R̃,        (2.25)

where Φ̃ and R̃ are well-defined on Ĩρ;γ × Tn. In this case, we have

T ◦ Φ̃ = Φ̃ ◦ R̃   on Ĩρ;γ × Tn.        (2.26)

Transforming the mapping T back to S by the partial coordinate stretching σρ,
and meanwhile transforming Φ̃ and R̃ to Φ and R respectively, we have

S ◦ Φ = Φ ◦ R   on Iγ × Tn,

where
Iγ = ρ Ĩρ;γ = {p ∈ Rn | ρ^{−1} p ∈ Ĩρ;γ}

is a Cantor set of I. In fact, due to the nondegeneracy of the frequency map ω in the
sense of (2.3), we may keep the frequencies prescribed by (2.5) fixed in the above
approximation process. As a result, we have ωγ (Iγ ) = Ωγ , where ωγ is the frequency
map of the integrable rotation R on Iγ × Tn. This is just conclusion 1◦ of Theorem
2.1.
The construction of Φj and Rj uses the KAM iteration, which is described as
follows.
Assume
|fj − fj−1 | ∼ εj , j = 1, 2, · · · , (2.27)
where εj is a decreasing sequence of positive numbers. Suppose we have already
found a transformation Φj and a rotation Rj with frequency map ω (j) such that

Cj = Rj^{−1} ◦ Φj^{−1} ◦ Tj ◦ Φj        (2.28)

satisfies
|Cj − I| ∼ εj+1 . (2.29)
Then, we construct a transformation Ψj and a new rotation Rj+1 with frequency map
ω (j+1) such that
Φj+1 = Φj ◦ Ψj (2.30)
and (2.29) is also true for the next index j + 1 with Cj+1 defined by (2.28) in which
j is replaced by j + 1. As was remarked by Pöschel in[Pös82] , “for this procedure
to be successful it is essential to have precise control over the various domains of
definition”.
We define the transformation Ψj : (ξ, η) → (x, y) implicitly with the help of a
generating function ψj by

x = ξ + ∂2 ψj (ξ, y), y = η − ∂1 ψj (ξ, y). (2.31)

To define ψj, we consider the mapping

Bj = Rj^{−1} ◦ Φj^{−1} ◦ Tj+1 ◦ Φj.        (2.32)

Bj is near the identity and is assumed to be given implicitly by its generating function
bj through

x̂ = x − ∂2 bj(x̂, y),   ŷ = y + ∂1 bj(x̂, y).        (2.33)
The function ψj is then determined from bj by the following homological equation

ψj(x, y + ω^{(j)}(x)) − ψj(x, y) + b̃j(x, y) = 0,        (2.34)

where b̃j(x, y) = bj(x, y) − [bj](x), with [bj] being the mean value of bj with respect
to the angle variables over Tn . Define

ω (j+1) (x) = ω (j) (x) + ∂[bj ](x). (2.35)

Then Rj+1 : (x, y) → (x̂, ŷ) is just given by

x̂ = x,   ŷ = y + ω^{(j+1)}(x).        (2.36)

With the so defined Ψj and Rj+1 , we easily show that, formally,

Ψj^{−1} ◦ Rj ◦ Bj ◦ Ψj = Rj+1 ◦ Cj+1.

Formal calculations similar to those in[Pös82] show that (2.29) is valid if we replace j
by j + 1.
We do not solve Equation (2.34) exactly. Instead, we solve an approximate
equation obtained by truncating the Fourier expansion of b̃j with respect to the angle
variables at some finite order, so that “only finitely many resonances remain, and we
obtain a real analytic solution ψj defined on an open set”[Pös82]. This idea was first
successfully used by Arnold[Arn63,AA89].
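For n = 1, the truncated solution can be written down mode by mode. In the small sketch below (not from the book; the right-hand side b̃(y) = cos y and the golden-mean frequency ω are hypothetical choices for illustration), each Fourier mode of (2.34) gives ψ_k = −b̃_k/(e^{ikω} − 1), and the diophantine condition is what keeps the divisors e^{ikω} − 1 away from zero.

```python
import cmath, math

# Sketch: solve  psi(y + omega) - psi(y) + b(y) = 0  (n = 1) by Fourier
# modes.  For b(y) = sum_k b_k e^{iky} with zero mean, each mode gives
# psi_k = -b_k / (e^{i k omega} - 1); the small divisors are controlled
# by the diophantine condition on omega.
omega = 2.0 * math.pi * (math.sqrt(5.0) - 1.0) / 2.0   # golden-mean frequency

# Fourier coefficients of b(y) = cos(y): b_1 = b_{-1} = 1/2 (zero mean).
b = {1: 0.5, -1: 0.5}
psi = {k: -bk / (cmath.exp(1j * k * omega) - 1.0) for k, bk in b.items()}

def eval_series(coeffs, y):
    return sum(c * cmath.exp(1j * k * y) for k, c in coeffs.items()).real

# Check the homological equation on a grid of angles.
residual = max(abs(eval_series(psi, y + omega) - eval_series(psi, y)
                   + math.cos(y))
               for y in [0.1 * j for j in range(63)])
print(residual)   # ~ 1e-15: exact for this finitely supported b
```

For a general b̃j the same formula is applied to the truncated Fourier series, which is the truncation step described above.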
For a rigorous proof, we need more precise arguments, carefully controlling the
domains of definition of functions and mappings in the iterative process. Readers can
refer to[Sha00a] for details.

13.2.3 Application to Small Twist Mappings


In this section, we state a theorem of KAM type for small twist mappings. Such
a theorem first appeared in Moser’s celebrated paper[Mos62] for 2-dimensional area-
preserving mappings. Its generalization to higher dimensions was given in[Sha00a] , as a
direct application of the theorems of the last section. The result may be formulated as
follows:

Theorem 2.4. Consider a one-parameter family of mappings St : (p, q) → (p̂, q̂),
with S0 = I and S1 = S, defined in the phase space I × Tn by

p̂ = p − t ∂2 H(p̂, q) = p − t ∂2 h(p̂, q),
q̂ = q + t ∂1 H(p̂, q) = q + t ω(p̂) + t ∂1 h(p̂, q),        (2.37)

where H(p̂, q) = H0(p̂) + h(p̂, q) and ω(p̂) = ∂H0(p̂). Under the assumptions of
Theorem 2.1 (Theorem 2.2) for H, if h satisfies the smallness condition of Theorem
2.1 (Theorem 2.2), then the corresponding conclusions of the theorem are still valid
for St (0 < t ≤ 1), only with the following remarks:
1◦ Ωγ is replaced by
Ωt,γ = { ω ∈ Ω∗ : |e^{i⟨k,tω⟩} − 1| ≥ tγ/|k|^τ  for k ∈ Zn \ {0} },        (2.38)

where Ω∗ denotes the set of points in Ω with distance to its boundary at least equal to
2γ. Accordingly, Iγ, ωγ, Φ, and R are replaced by It,γ, ωt,γ, Φt and Rt, respectively.
2◦ If Ω is a bounded open set of type D in Arnold’s sense[Arn63], then we have
the following Lebesgue measure estimate

m(Ω \ Ωt,γ ) ≤ DγmΩ (2.39)



for t ∈ (0, 1], with constant D only depending on n, τ and the geometry of Ω. So in
this case, Ωt,γ is still a large Cantor set in Ω when γ is small enough.
3◦ If h ∈ C ∞ (B × Tn ), then ωγ,t ∈ C ∞ (Bγ,t ) and Φt ∈ C ∞ (Bγ,t × Tn )
which satisfy the estimates (2.11) for any β ≥ α.
4◦ If h is analytic with the domain of analyticity containing

S(r, ρ) = {(p, q) ∈ C2n : |p − p′| < r, |Im q| < ρ with p′ ∈ B and Re q ∈ Tn}

for some r > 0 and ρ > 0 (Re q and Im q denote the real and imaginary parts of q,
respectively), and if h satisfies
‖h‖_{r,ρ} = sup_{(p,q)∈S(r,ρ)} |h(p, q)| ≤ δ0 γ^2 θ^2 Θ^{−3}        (2.40)

for some sufficiently small δ0 > 0 depending on n, τ , r and ρ, then all the con-
clusions of Theorem 2.1 (Theorem 2.2) are still true with ωγ,t ∈ C ∞ (Bγ,t ), Φt ∈
C ∞,ω (Bγ,t × Tn ) and the estimate (2.11) holds for any β ≥ 0.
We have presented the results about the existence of differentiable foliation struc-
tures in the sense of Whitney of invariant tori for nearly integrable symplectic map-
pings and for mappings with small twists. Such a result was proved first by Lazutkin
in 1974[Laz74] for planar twist maps and was generalized to higher dimensions by
Svanidze in 1980[Sva81] . For the case of Hamiltonian flows of arbitrary dimensions,
the generalizations were given by J. Pöschel in 1982[Pös82] , Chierchia and Gallavotti
in 1982[CG82] . The perturbation and measure estimates in terms of γ were studied
by Rüssmann in 1981[Rüs81], Svanidze in 1981[Sva81], Neishtadt in 1982[Nei82] and
Pöschel in 1982[Pös82] . The estimates in terms of θ and Θ were given by Shang in
2000[Sha00a] , which are also crucial in the small twist mapping case.

13.3 KAM Theorem of Symplectic Algorithms for Hamiltonian Systems

In this section, we study stability of symplectic algorithms when applied to typical
nonlinear Hamiltonian systems. We introduce a numerical version of the KAM the-
orem. Such a theorem was already suggested by Channell and Scovel in 1990[CS90],
Kang Feng in 1991[Fen91], and Sanz-Serna and Calvo in 1994[SSC94]. Its rigorous for-
mulation and proof were given by Shang in 1999 and 2000[Sha99,Sha00b] based on the
thesis[Sha91] . The main results consist of the existence of invariant tori, with a smooth
foliation structure, of a symplectic numerical algorithm when it is applied to a generic
integrable Hamiltonian system with arbitrarily many degrees of freedom, provided the
time-step size of the algorithm is sufficiently small and falls in a Cantor set of large measure.
This existence result also implies that the algorithm, when it is applied to a generic
integrable system of n degrees of freedom, possesses n independent smooth invariant
functions which are in involution and well-defined on the set filled by the invariant
tori in the sense of Whitney. The invariant tori are just the level sets of these functions.
Quantitative analysis shows that the numerical invariant tori of a symplectic algorithm
can approximate the corresponding exact invariant tori of the systems.

13.3.1 Symplectic Algorithms as Small Twist Mappings


We consider a Hamiltonian system with n degrees of freedom in canonical form
ẋ = −∂K/∂y (x, y),   ẏ = ∂K/∂x (x, y),   (x, y) ∈ D,        (3.1)

where D is a connected bounded, open subset of R2n ; x and y are both n-dimensional
Euclidean coordinates with ẋ and ẏ the derivatives of x and y with respect to the time
“t” respectively; K : D → R1 is the Hamiltonian.
A symplectic algorithm that is compatible with the system (3.1) is a discretization
scheme such that, when applied to the system (3.1), it uniquely determines one param-
eter family of symplectic step-transition maps GtK that approximates the phase flow
t
gK in the sense that
1 
lim s GtK (z) − gK t
(z) = 0 for any z = (x, y) ∈ D (3.2)
t→0 t

for some s ≥ 1, here t > 0 is the time-step size of the algorithm and s, the largest
integer such that (3.2) holds, is the order of accuracy of the algorithm approximating
the continuous systems. Note that the domain in which GtK is well-defined, say D " t,
depends on t generally and converges to D as t → 0 — this means that any z ∈ D
may be contained in D " t when t is sufficiently close to zero.
From (3.2), we may write

G_K^t(z) = g_K^t(z) + t^s R_K^t(z),        (3.3)

where

R_K^t(z) = (1/t^s) (G_K^t(z) − g_K^t(z))

is well-defined for z ∈ D̃t ⊂ D and tends to zero as t → 0 for z ∈ D. Below, we
prove the main results of this chapter by simply regarding an approximation G_K^t to
the phase flow g_K^t of the above form as a symplectic discretization scheme of order s.
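The order condition (3.2) can be observed numerically. The sketch below (an illustration, not from the book) applies the implicit midpoint rule, a symplectic scheme with s = 2, to K = (x² + y²)/2, whose exact flow is the rotation by angle t, and checks that halving t reduces the one-step error |G_K^t(z) − g_K^t(z)| by a factor of about 2³, consistent with a local error of order t^{s+1}.

```python
import math

# Sketch (not from the book): implicit midpoint rule on x' = -y, y' = x
# (K = (x^2 + y^2)/2).  The exact flow is the rotation by angle t; the
# midpoint map is the Cayley transform, a rotation by 2*atan(t/2), so the
# one-step error behaves like t^3/12.
def exact_flow(x, y, t):
    c, s = math.cos(t), math.sin(t)
    return c * x - s * y, s * x + c * y

def midpoint_step(x, y, t):
    # Solve z' = z + t*A*(z + z')/2 with A = [[0, -1], [1, 0]] in closed form.
    h = t / 2.0
    bx, by = x - h * y, y + h * x                 # (I + h*A) z
    d = 1.0 + h * h
    return (bx - h * by) / d, (h * bx + by) / d   # (I - h*A)^{-1} applied

def one_step_error(t, x=1.0, y=0.0):
    xe, ye = exact_flow(x, y, t)
    xn, yn = midpoint_step(x, y, t)
    return math.hypot(xn - xe, yn - ye)

ratio = one_step_error(0.1) / one_step_error(0.05)
print(ratio)   # close to 2^3 = 8: local error O(t^{s+1}) with s = 2
```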
We assume that the system (3.1) is integrable. That is, there exists a system of
action-angle coordinates (p, q) in which the domain D can be expressed in the form
B × Tn and the Hamiltonian depends only on the action variables, where B is a
connected bounded, open subset of Rn and Tn the standard n-dimensional torus. Let
us denote by Ψ : B × Tn → D the coordinate transformation from (p, q) to (x, y),
then Ψ is a symplectic diffeomorphism from B × Tn onto D. The new Hamiltonian

K ◦ Ψ(p, q) = H(p), (p, q) ∈ B × Tn (3.4)

only depends on p. Therefore, in the action-angle coordinates (p, q), (3.1) takes the
simple form

ṗ = 0,   q̇ = ω(p) = ∂H/∂p (p)        (3.5)

and the phase flow g_H^t is just the one-parameter group of rotations (p, q) → (p, q +
tω(p)), which leaves every torus {p} × Tn invariant.
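A concrete instance of (3.5) (an illustration, not from the book): for the harmonic oscillator K = (x² + y²)/2, the chart p = (x² + y²)/2, q = atan2(y, x) is the standard action-angle transformation, the exact flow rotates (x, y), p stays fixed on each torus, and q advances with frequency ω(p) = 1.

```python
import math

# Sketch: action-angle coordinates for the harmonic oscillator.  The exact
# flow of x' = -y, y' = x is a rotation, so the action p = (x^2 + y^2)/2 is
# constant (each torus {p} x T^1 is invariant) and the angle q = atan2(y, x)
# advances by t*omega with omega(p) = 1.
def flow(x, y, t):
    c, s = math.cos(t), math.sin(t)
    return c * x - s * y, s * x + c * y

def action_angle(x, y):
    return 0.5 * (x * x + y * y), math.atan2(y, x)

x0, y0, t = 0.6, 0.8, 0.37
p0, q0 = action_angle(x0, y0)
p1, q1 = action_angle(*flow(x0, y0, t))

err_q = abs(math.remainder(q1 - q0 - t, 2.0 * math.pi))
print(abs(p1 - p0), err_q)   # both ~ 0: p fixed, angle advances linearly
```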

Assume K is analytic and, without loss of generality, assume the domain of
analyticity of K contains the following open subset of C2n:

Dα0 = {z = (x, y) ∈ C2n : d(z, D) < α0},        (3.6)

with some α0 > 0, where

d(z, D) = inf_{z′ ∈ D} |z − z′|

denotes the distance from the point z ∈ C2n to the set D ⊂ C2n in which |z| =
max |zj | for z = (z1 , · · · , z2n ). Also, we assume that Ψ extends analytically to the
1≤j≤2n
following complex domain
( )
S(r0 , ρ0 ) = (p, q) ∈ C2n : d(p, B) < r0 , Re q ∈ Tn , |Im q| < ρ0 (3.7)

with r0 > 0, ρ0 > 0, and has period 2π in each component of q. In (3.7), B is
considered as a subset of C2n. Without loss of generality, we suppose D̃(r0, ρ0) =
Ψ(S(r0, ρ0)) ⊂ Dα0 and, further, that Ψ is a diffeomorphism from S(r0, ρ0) onto
D̃(r0, ρ0). So Equation (3.4) is valid for (p, q) ∈ S(r0, ρ0) and

Ψ^{−1} ◦ g_K^t ◦ Ψ = g_H^t        (3.8)

on this complex domain of coordinates (p, q).


Checking the existing symplectic algorithms, we find that G_K^t is always
analytic if the Hamiltonian K is analytic. Note that the domain in which G_K^t is
well-defined converges to the domain of definition of g_K^t as t approaches zero. We
may assume, without loss of generality, that G_K^t is well-defined and analytic in the
complex domain Dα0 for t sufficiently close to zero. Moreover, in the analytic case,
we have

|G_K^t(z) − g_K^t(z)| ≤ t^{s+1} M(z, t)

with an everywhere positive continuous function M : Dα0 × [0, δ1] → R for some
sufficiently small δ1 > 0.
Lemma 3.1. There exists δ2 > 0 such that for t ∈ [0, δ2], G̃_K^t = Ψ^{−1} ◦ G_K^t ◦ Ψ is
well-defined and real analytic on the closed complex domain S(r0/2, ρ0/2), and

|G̃_K^t(p, q) − g_H^t(p, q)| ≤ M t^{s+1},   (p, q) ∈ S(r0/2, ρ0/2),  t ∈ [0, δ2],        (3.9)

where M is a positive constant depending on r0 , ρ0 , α0 , δ1 , Ψ and K, not on t.


    
Proof. Let U1 = S(r0/2, ρ0/2) and V1 = Ψ(S(r0/2, ρ0/2)). Since U1 is a closed subset
of S(r0, ρ0) and Ψ is a diffeomorphism from S(r0, ρ0) onto Dα0, V1 is closed in Dα0.
Let ξ be the distance from V1 to the boundary of Dα0; then ξ > 0. The compactness
of V1 implies that there exists 0 < δ1′ < δ1 such that g_K^t maps V1 into V1 + ξ/2 for

t ∈ [0, δ1′], where V1 + ξ/2 denotes the union of all complex open balls centered in V1
with radius ξ/2. Since M(z, t) is continuous and positive for (z, t) ∈ V1 × [0, δ1′],
there exists a constant M0 > 0 which is an upper bound of M(z, t) on V1 × [0, δ1′]. Let
δ2 = min{1, δ1′, ξ/(4M0)}. Then for t ∈ [0, δ2], G_K^t maps V1 into Dα0 and hence
G̃_K^t = Ψ^{−1} ◦ G_K^t ◦ Ψ is well-defined on U1. The real analyticity of the map
follows from the real analyticity of Ψ and K. To verify Equation (3.9), we first note that the

analyticity of Ψ^{−1} on V1 + ξ/4 ⊂ Dα0 implies that

|∂Ψ^{−1}/∂z (z)| ≤ M1

for all z ∈ V1 + ξ/4 with some constant M1 > 0, and the Taylor formula then applies.
Moreover,
Ψ(p, q) ∈ V1 and

|t^s R_K^t(Ψ(p, q))| = |G_K^t(Ψ(p, q)) − g_K^t(Ψ(p, q))| ≤ M0 t^{s+1} ≤ ξ/4

for (p, q) ∈ U1 and t ∈ [0, δ2]. Therefore,

|G̃_K^t(p, q) − g_H^t(p, q)| = |Ψ^{−1}(g_K^t(Ψ(p, q)) + t^s R_K^t(Ψ(p, q))) − Ψ^{−1}(g_K^t(Ψ(p, q)))| ≤ 2nM1M0 t^{s+1}.

Let M = 2nM1M0; then (3.9) is verified. □

The above lemma shows that G̃_K^t is an approximant to the one-parameter group
of integrable rotations g_H^t up to order t^{s+1} as t approaches zero. To apply Theorem
2.4, we need to verify the exact symplecticity of G̃_K^t so that it can be expressed by a
globally defined generating function. Because Ψ is not necessarily exact symplectic,
the exact symplecticity of G̃_K^t = Ψ^{−1} ◦ G_K^t ◦ Ψ is not trivially observed.

Lemma 3.2. Let G be an exact symplectic mapping of class C 1 from D into R2n
where D is an open subset of R2n and let Ψ be a symplectic diffeomorphism from
B × Tn onto D. Then Ψ−1 ◦ G ◦ Ψ is an exact symplectic mapping in the domain in
which it is well-defined.

Proof. Let (p̂, q̂) = Ψ^{−1} ◦ G ◦ Ψ(p, q) and let γ be any given closed curve in the
domain of definition of G̃ := Ψ^{−1} ◦ G ◦ Ψ, which is an open subset of B × Tn. The
exact symplecticity of G̃ will be implied by[Arn89]

I(γ) = ∮_γ p̂ dq̂ − ∮_γ p dq = 0.        (3.10)

Now we verify (3.10). Let (x, y) = Ψ(p, q) and (x̂, ŷ) = Ψ(p̂, q̂). Then (x̂, ŷ) =
G(x, y). Since G is exact symplectic, we have ∮_γ x̂ dŷ − ∮_γ x dy = 0, where x, y,
x̂, ŷ are considered as functions of (p, q), which vary over γ. Therefore, with these
conventions and with γ′ = Ψ^{−1} ◦ G ◦ Ψ(γ),

I(γ) = ∮_γ p̂ dq̂ − ∮_γ x̂ dŷ + ∮_γ x dy − ∮_γ p dq
     = ∮_{γ′} p dq − ∮_{Ψ(γ′)} x dy + ∮_{Ψ(γ)} x dy − ∮_γ p dq
     = ∮_{γ′−γ} p dq − ∮_{Ψ(γ′)−Ψ(γ)} x dy.        (3.11)

Note that G is exact and hence it is homotopic to the identity. This implies that Ψ−1 ◦
G ◦ Ψ is homotopic to the identity too. So γ  and γ belong to the same homological
class in the fundamental group of the manifold B × Tn . Therefore, one may find a
2-dimensional surface, say σ, in the phase space B × Tn , which is bounded by γ  and
γ. Ψ(σ) is then a 2-dimensional surface in D bounded by Ψ(γ  ) and Ψ(γ). By stokes
formula and from (3.11), we get
- -
I(γ) = dp ∧ dq − d x ∧ d y,
σ Ψ(σ)

which is equal to zero because Ψ preserves the two form d p ∧ d q. 


Checking the existing available symplectic algorithms, we observe that they are generally constructed by discretizing Hamiltonian systems; therefore, they generate exact symplectic step-transition maps. In our case, this means that G_K^t is a one-parameter family of exact symplectic mappings. By Lemma 3.2, so is G̃_K^t. As a result, G̃_K^t can be re-expressed by a generating function. On the other hand, by Lemma 3.1, we see that G̃_K^t is near the identity and approximates g_H^t up to order t^{s+1} on S(r_0/2, ρ_0/2) for t ∈ [0, δ_2]. A simple argument via the implicit function theorem, taking notice of the exact symplecticity of G̃_K^t, shows the following:
K

Lemma 3.3. There exists a function h^t, depending on the time step t, which is well-defined and real analytic on the domain S(r_0/4, ρ_0/4) for t ∈ [0, δ_3], with δ_3 a sufficiently small positive number, such that G̃_K^t : (p, q) → (p̃, q̃) can be expressed by h^t as follows:

    p = p̃ − t^{s+1} (∂h^t/∂q)(p̃, q),   q̃ = q + tω(p̃) + t^{s+1} (∂h^t/∂p̃)(p̃, q).  (3.12)

Proof. It follows immediately from Lemma 3.1 and the representation (3.12) that

    ‖∂h^t/∂p̃‖_{(r_0/4, ρ_0/4)} ≤ M,   ‖∂h^t/∂q‖_{(r_0/4, ρ_0/4)} ≤ M.

Fix (p_0, q_0) ∈ D and let h^t(p_0, q_0) = 0. For any (p̃, q̃) ∈ S(r_0/4, ρ_0/4), integrating the exact differential one-form (∂h^t/∂p̃) dp̃ + (∂h^t/∂q) dq along one of the shortest curves from
564 13. KAM Theorem of Symplectic Algorithms
(p_0, q_0) to (p̃, q̃) in S(r_0/4, ρ_0/4), and then taking the maximal norm of the integral for (p̃, q̃) over S(r_0/4, ρ_0/4), we obtain the estimate

    ‖h^t‖_{(r_0/4, ρ_0/4)} ≤ 2nML,  for t ∈ [0, δ_3],  (3.13)

where M is the constant in Lemma 3.1 and L is an upper bound of the length of the shortest curves from (p_0, q_0) to points of S(r_0/4, ρ_0/4), which is clearly a finite positive number. Note that B is a connected, bounded, open subset of R^n; therefore, S(r_0/4, ρ_0/4) is bounded too. □

13.3.2 Numerical Version of KAM Theorem

We formulate the main result of this chapter as follows.


Theorem 3.4. Let the Hamiltonian system (3.1) be integrable in a connected, bounded, open domain D of R^{2n}, and let K be real analytic and nondegenerate in the sense of Kolmogorov after being expressed in action-angle variables. For an analytic symplectic algorithm^4 compatible with the system, as long as the time step t of the algorithm is small enough, most nonresonant invariant tori of the integrable system do not vanish but are only slightly deformed, so that in the phase space D the symplectic algorithm also has invariant tori densely filled with phase orbits winding around them quasi-periodically, with a number of independent frequencies equal to the number of degrees of freedom. These invariant tori are all analytic manifolds and form a Cantor set, say D_t. The Lebesgue measure mD_t of the Cantor set D_t tends to mD as t tends to zero. Moreover, on D_t, the algorithm is conjugate to a one-parameter family of rotations of the form (p, q) → (p, q + tω_t(p)) by a C^∞-symplectic conjugation Ψ_t : B_t × T^n → D_t, where (p, q) are action-angle coordinates and ω_t is the frequency map defined on a Cantor set B_t ⊂ R^n of actions.

More quantitative results hold. For any given and sufficiently small γ > 0, if the time step t is sufficiently small, then there exist closed subsets B_{γ,t} of B_t and D_{γ,t} of D_t such that D_{γ,t} = Ψ_t(B_{γ,t} × T^n) and the following hold:

1° mD_{γ,t} ≥ (1 − c_1 γ) mD, where c_1 is a positive constant not depending on t and γ;

2° ‖Ψ_t − Ψ‖_{β,βλ; B_{γ,t}×T^n}, ‖ω_t − ω‖_{β+1; B_{γ,t}} ≤ c_2 γ^{-(2+β)} · t^s for any β ≥ 0, where s is the accuracy order of the algorithm, λ > n + 2, and c_2 is a positive constant not depending on γ and t. The norms here are understood in the sense of Whitney;

3° every numerical invariant torus in D_{γ,t} is t^s-close to the invariant torus with the same frequencies of the original integrable system (3.1), in the sense of Hausdorff^5.
4 An analytic algorithm is an algorithm generating an analytic step-transition map whenever the Hamiltonian is analytic. Note that all the existing available symplectic algorithms are analytic in this sense.
5 The Hausdorff distance of two sets A and B is defined as d(A, B) = max{ sup_{x∈A} dist(x, B), sup_{y∈B} dist(y, A) }, where dist(x, B) = inf_{y∈B} |x − y|.[Bey87]
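For finite point sets, the Hausdorff distance of footnote 5 can be computed directly from the definition. The following sketch is our own illustration (the function names are not from the text):

```python
import math

def hausdorff(A, B):
    """Hausdorff distance of two finite point sets in R^n, following footnote 5:
    d(A, B) = max( sup_{x in A} dist(x, B), sup_{y in B} dist(y, A) ),
    where dist(x, B) = inf_{y in B} |x - y| (Euclidean norm)."""
    def dist(x, S):
        # dist(x, S) = min over y in S of |x - y|
        return min(math.dist(x, y) for y in S)
    return max(max(dist(x, B) for x in A),
               max(dist(y, B=A) if False else dist(y, A) for y in B))
```

For example, hausdorff([(0, 0), (1, 0)], [(0, 0), (0, 2)]) compares the two one-sided suprema (1 and 2) and returns 2.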

Proof. Now the analytic version of Theorem 2.4 can be applied to S_t = G̃_K^t. The conditions required by Theorem 2.4 are satisfied clearly according to the assumptions of Theorem 2.1. For example, the nondegeneracy of the integrable system in the sense of Kolmogorov means that the frequency map ω : B → R^n is nondegenerate, and therefore there exist positive constants θ ≤ Θ such that ω satisfies (2.4) with some positive numbers r ≤ r_0. We assume r = r_0 here without loss of generality. In Theorem 2.4, the function h is replaced by t^s h^t, which satisfies the estimate (2.40) with r = r_0/4 and ρ = ρ_0/4, if we choose

    γ = γ_t := Γ t^d,  with 0 < d ≤ s/2 and Γ = (2nML/δ_0)^{1/2} θ^{-1} Θ^{3/2}  (3.14)

and if t is sufficiently small, where δ_0 is the bound given by (2.40) of Theorem 2.4. It is clear that the γ so chosen satisfies the condition γ ≤ min{1, rΘ/2} required by Theorem 2.4 for t sufficiently close to zero. By Theorem 2.4, we then have the Cantor sets B_t = B_{γ,t} ⊂ B and Ω_t = Ω_{γ,t} ⊂ ω(B), a surjective map ω_t = ω_{γ,t} : B_t → Ω_t of class C^∞ and a symplectic mapping Φ_t : B_t × T^n → R^n × T^n of class C^{∞,ω}, in the sense of Whitney, such that the conclusions (1) – (4) of Theorem 2.4 hold with γ = Γ t^d. From (2.10), the invariant tori of G̃_K^t fill out a set E_t = E_{γ,t} = Φ_t(B_t × T^n) in the phase space E = B × T^n with the measure estimate

    mE_t ≥ (1 − c_4 Γ (θΘ^{-1})^{-n} t^d) mE.  (3.15)

From (2.11), with the notice of (2.7) and the fact that

    ‖h^t‖_{βλ+λ+τ} ≤ c_7 ‖h^t‖_{(r_0/4, ρ_0/4)}

by Cauchy's estimate for the derivatives of an analytic function, we have

    ‖Φ_t − I‖_{β,βλ; B_t×T^n} ≤ (γΘ^{-1})^{-β} ‖σ_{γΘ^{-1}}^{-1} ∘ (Φ_t − I)‖_{β,βλ; γΘ^{-1}}
                              ≤ c_5 c_7 γ^{-(2+β)} Θ^{1+β} t^s ‖h^t‖_{(r_0/4, ρ_0/4)}
                              ≤ c_8 θ^{2+β} Θ^{-(2+β/2)} · t^{s−(2+β)d}  (3.16)

for t sufficiently close to zero, where

    c_8 = c_5 c_7 (2nML)^{-β/2} δ_0^{1+β/2}.

In the last inequality of (3.16), we have used the estimate (3.13) for h^t. From (2.11), we also get

    ‖ω_t − ω‖_{β+1; B_t} ≤ (γΘ^{-1})^{-(β+1)} ‖ω_t − ω‖_{β+1; γΘ^{-1}}
                         ≤ c_8 θ^{2+β} Θ^{-(1+β/2)} · t^{s−(2+β)d}.  (3.17)

Let Ψ_t = Ψ ∘ Φ_t and D_t = Ψ(E_t); then G_K^t ∘ Ψ_t = Ψ_t ∘ R_t, which means that Ψ_t realizes the conjugation from G_K^t to R_t : (p, q) → (p, q + tω_t(p)), and for any

fixed P ∈ B_t, Ψ_t(P, T^n) is an invariant torus of G_K^t, which is an analytic Lagrangian manifold since Ψ_t is a symplectic diffeomorphism and analytic with respect to the angle variables. On the torus, the iterations of G_K^t starting from any fixed point are quasi-periodic with frequencies tω_t(p), which are rationally independent and satisfy the diophantine condition (4.3) with ω = ω_t(p) and γ = Γ t^d. These invariant tori distribute C^∞-smoothly in the phase space due to the C^∞-smoothness of the conjugation Ψ_t. Moreover, we have the same estimates for the measure of D_t and for the closeness of Ψ_t to Ψ as (3.15) and (3.16), with larger constants c_4 and c_8, in which E_t, E, Φ_t and I are replaced by D_t, D, Ψ_t and Ψ, respectively. For β ≥ 0, if we choose d satisfying

    0 < d < s/(2 + β),  (3.18)

then, from the above estimates, we see that Ψ_t, with the domain of definition B_t × T^n, converges to Ψ in the C^{β,βλ}-norm, and ω_t, with the domain of definition B_t, converges to ω with respect to the C^{β+1}-norm as t tends to zero; the measure of D_t, the union of invariant tori of G_K^t, tends to the measure of the phase space D. These arguments complete the proof of the first part of Theorem 3.4.
Now, we prove the remainder of Theorem 3.4. From the estimates (3.15) – (3.17) and the uniform boundedness of the diffeomorphism Ψ and its inverse, as well as their derivatives, we see that if we choose γ fixed in advance and not depending on the time-step size t of the algorithm, then we have

    mD_{γ,t} ≥ (1 − c̃_4 (θΘ^{-1})^{-n} γ) mD,  (3.19)

with a constant c̃_4 > 0 not depending on γ and t, where D_{γ,t} = Ψ(E_{γ,t}) with E_{γ,t} = Φ_t(B_{γ,t} × T^n) and with B_{γ,t} being the subset of B as indicated above. Note that B_{γ,t} is a closed subset of B_t and D_{γ,t} a closed subset of D_t if t is sufficiently small. Moreover, the estimates

    ‖Ψ_t − Ψ‖_{β,βλ; B_{γ,t}×T^n} ≤ c̃_8 γ^{-(2+β)} Θ^{1+β} · t^s  (3.20)

and

    ‖ω_t − ω‖_{β+1; B_{γ,t}} ≤ c̃_8 γ^{-(2+β)} Θ^{2+β} · t^s  (3.21)

hold for any β ≥ 0, with c̃_8 > 0 not depending on γ and t. The conclusions (1) and (2) of the last part of Theorem 3.4 are proved if we set c_1 = c̃_4 (θΘ^{-1})^{-n} and c_2 = c̃_8 · max(Θ^{1+β}, Θ^{2+β}). From (3.19), it follows that for a sufficiently small γ > 0, D_{γ,t} has positive Lebesgue measure. From (2.12), it follows that for any ω* ∈ Ω_{γ,t}, there exist p* ∈ B and P* ∈ B_{γ,t} such that ω(p*) = ω_t(P*) = ω* and

    |P* − p*| ≤ 2nML c_6 c_7 (γθΘ^{-1})^{-1} · t^s,

which implies that

    |Ψ(P*, q) − Ψ(p*, q)| ≤ 4n²ML M̃_1 c_6 c_7 (γθΘ^{-1})^{-1} · t^s,

uniformly for q ∈ T^n, where M̃_1 is an upper bound of the norm of (∂Ψ/∂z)(p, q) for (p, q) ∈ S(r_0/2, ρ_0/2). This estimate, together with (3.20), proves the third conclusion of the second part of Theorem 3.4. Theorem 3.4 is completely proved. □

A natural corollary of the above theorem is:

Corollary 3.5. Under the assumptions of the above theorem, there exist n functions F_1^t, ..., F_n^t which are defined on the Cantor set D_t and are of class C^∞ in the sense of Whitney, such that:

1° F_1^t, ..., F_n^t are functionally independent and in involution (i.e., the Poisson bracket of any two of the functions vanishes on D_t);

2° every F_j^t (j = 1, ..., n) is invariant under the difference scheme, and the invariant tori are just the intersections of the level sets of these functions;

3° F_j^t (j = 1, ..., n) approximate n independent integrals in involution of the integrable system, with a suitable order of accuracy with respect to the time step t, which will be explained in the proof.

Proof. By Theorem 3.4, we have

    G_K^t ∘ Ψ_t(p, q) = Ψ_t ∘ R_t(p, q),  for (p, q) ∈ B_t × T^n,  (3.22)

where R_t is the integrable rotation (p, q) → (p, q + tω_t(p)) and admits n invariant functions, say, p_1, ..., p_n, analytically defined on B_t × T^n. Let

    F_i^t = p_i ∘ Ψ_t^{-1},  i = 1, ..., n;

then they are well-defined on the Cantor set D_t and of class C^∞ in the sense of Whitney, due to the C^∞-smoothness of Ψ_t^{-1} on D_t. Moreover, we easily verify by (3.22) that

    F_i^t ∘ G_K^t = F_i^t,  i = 1, ..., n,

and this means that the F_i^t (i = 1, ..., n) are n invariant functions of G_K^t. These n invariant functions are functionally independent because the p_i (i = 1, ..., n) are functionally independent and Ψ_t is a diffeomorphism. The claim that F_i^t and F_j^t are in involution for 1 ≤ i, j ≤ n simply follows from the fact that p_i and p_j are in involution and Ψ_t is symplectic; note that the Poisson bracket is invariant under symplectic coordinate transformations. Finally, it is observed from the proof of Theorem 3.4 that, for each j = 1, ..., n, F_j^t approximates

    F_j = p_j ∘ Ψ^{-1}

as t → 0, with the order of accuracy equal to t^{s−(2+β)d} (where 0 < d < s/(2+β) is given) on the set D_t (note that this set depends also on d by definition), and equal to t^s on D_{γ,t}, a subset of D_t, in the norm of the class C^β for any given β ≥ 0. It is clear that the functions F_j (j = 1, ..., n) are integrals of the integrable system and that any two of them are in involution by the symplecticity of Ψ^{-1}. Corollary 3.5 is then proved. □

13.4 Resonant and Diophantine Step Sizes


It is observed from the proof of Theorem 3.4 that the preserved invariant tori have frequencies of the form ω_t = tω, where t is the step size of the algorithm and ω belongs to the frequency domain of the system to which the algorithm is applied. The frequency tω is required to satisfy the diophantine condition

    |exp(i⟨k, tω⟩) − 1| ≥ tγ/|k|^τ,  0 ≠ k ∈ Z^n  (4.1)

with some γ > 0 and τ > 0, where ⟨u, v⟩ denotes the inner product of the vectors u and v in R^n. Note that t > 0 may be arbitrarily small.

For any fixed ω ∈ R^n, even if it is a diophantine vector, there exists some t in any small neighborhood of the origin such that (4.1) does not hold for any γ > 0 and any τ > 0. In fact, one can choose t to satisfy the resonance relation

    exp(i⟨k, tω⟩) = 1  (4.2)

for some 0 ≠ k ∈ Z^n. In the next subsection, we will show that such t form a dense set in R.

We note that a one-step algorithm, when applied to a system of differential equations, can be regarded as a perturbation of the phase flow of the system. On the other hand, according to Poincaré, arbitrarily small perturbations in the generic case may destroy the resonant invariant tori of an integrable system. Therefore, to simulate the invariant torus with a given frequency of some Hamiltonian system by symplectic algorithms, one is forced to be very careful in selecting step sizes, say, to keep them away from some dense set.

Some questions arise: is it possible to simulate an invariant torus of an integrable system by symplectic algorithms? If possible, how does one select the step sizes, and what structure does the set of those admitted step sizes have? In this section, we try to answer these questions.

13.4.1 Step Size Resonance


For any frequency vector, step size resonance may take place very often.

Lemma 4.1. For any ω ∈ R^n, there exists a dense subset, say D(ω), of R such that for any t ∈ D(ω), the resonance relation (4.2) holds for some 0 ≠ k ∈ Z^n.

Proof. If ⟨k, ω⟩ = 0 for some 0 ≠ k ∈ Z^n, then D(ω) = R. If ⟨k, ω⟩ ≠ 0 for any 0 ≠ k ∈ Z^n, then

    D(ω) = { t = 2πl/⟨k, ω⟩ : 0 ≠ k ∈ Z^n, l ∈ Z },  (4.3)

which is clearly dense in R, and the resonance relation (4.2) holds for any t ∈ D(ω). The proof of the lemma is completed. □

Definition 4.2. D(ω) is called the resonant set of step sizes with respect to the frequency ω ∈ R^n. Any t ∈ D(ω) is called a resonant step size with respect to ω.

From Lemma 4.1, if ω ∈ R^n is a resonant frequency, i.e., ⟨k, ω⟩ = 0 for some 0 ≠ k ∈ Z^n, then D(ω) = R. In other words, each step size is resonant with respect to a resonant frequency. If ω ∈ R^n is a nonresonant frequency, i.e., ⟨k, ω⟩ ≠ 0 for any 0 ≠ k ∈ Z^n, then D(ω) is a countable and dense subset of R. Because a resonant torus may be destroyed by arbitrarily small Hamiltonian perturbations (Poincaré), any invariant torus with frequency ω of a generic integrable system may not be preserved by symplectic algorithms with step sizes in D(ω). To simulate an invariant torus of the frequency ω, it is natural to consider those step sizes which are far away from the resonant set D(ω). Note that if ω is of at least 2 dimensions, the resonant set D(ω) is "denser" than the rational numbers in R, because the set D(ω) consists of all real numbers in the case when ω is resonant, and consists of all numbers of the form α_k r in the case when ω is nonresonant, where r takes any rational value and, for k ∈ Z^n \ {0}, α_k = 2π/⟨k, ω⟩, which may be arbitrarily small and arbitrarily large; moreover, there are arbitrarily many pairs of rationally independent numbers among the α_k. Anyway, for nonresonant ω, D(ω) is countable.

13.4.2 Diophantine Step Sizes


Even though the step size may encounter resonance densely, there is still a large possibility of selecting step sizes that stay away from resonance. We discuss this as follows.

Definition 4.3. A number t ∈ R is said to be of diophantine type with respect to the nonresonant frequency ω ∈ R^n if

    |t − 2πl/⟨k, ω⟩| ≥ λ/(l^μ |k|^τ),  0 ≠ k ∈ Z^n, 0 < l ∈ Z  (4.4)

for some constants λ > 0, μ and τ > 0.
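Condition (4.4) ranges over infinitely many (k, l), so numerically one can only test a truncated family; passing such a test is necessary, but not sufficient, for membership in the set I_{λ,μ,τ}(ω) defined below. A minimal sketch for n = 2 (function name and truncation bounds are our own, not from the text):

```python
import math

def passes_truncated_44(t, omega, lam, mu, tau, k_max=10, l_max=10):
    """Check the diophantine step-size condition (4.4),
        |t - 2*pi*l/<k, omega>| >= lam / (l**mu * |k|**tau),
    for all 0 != k in Z^2 with |k_i| <= k_max and 1 <= l <= l_max,
    where |k| is the 1-norm |k_1| + |k_2|.  Only a truncated (hence
    necessary, not sufficient) test of (4.4)."""
    for k1 in range(-k_max, k_max + 1):
        for k2 in range(-k_max, k_max + 1):
            if (k1, k2) == (0, 0):
                continue
            denom = k1 * omega[0] + k2 * omega[1]
            if denom == 0:
                return False            # resonant frequency: (4.4) cannot hold
            knorm = abs(k1) + abs(k2)
            for l in range(1, l_max + 1):
                if abs(t - 2 * math.pi * l / denom) < lam / (l**mu * knorm**tau):
                    return False
    return True
```

For instance, with ω = (1, √2), λ = 0.01, μ = 2, τ = 4, every t in (−0.1, 0.1) passes the truncated test (the resonant step size of smallest modulus reachable with |k_i| ≤ 10 is about 0.26), while a resonant step size such as t = 2π (from k = (1, 0), l = 1) fails.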

We denote by I_{λ,μ,τ}(ω) the set of numbers t satisfying (4.4) for given constants λ > 0, μ and τ > 0. Then I_{λ,μ,τ}(ω) is a subset of R which is far away from resonance with respect to ω. For this set, we have:

Lemma 4.4. For any nonresonant frequency ω ∈ R^n, and for any λ > 0, any μ and any τ > 0, the set I_{λ,μ,τ}(ω) is nowhere dense and closed in R. Moreover, if μ > 1 and τ > n, then we have

    meas(R \ I_{λ,μ,τ}(ω)) ≤ cλ,  (4.5)

where c is a positive number depending only on n, μ and τ.

Proof. The nowhere denseness and the closedness of I_{λ,μ,τ}(ω) follow from the fact that the complement of the set is both open and dense in R for any λ > 0, μ and τ > 0. It remains to prove (4.5). Since

    R \ I_{λ,μ,τ}(ω) = ⋃_{0<l∈Z, 0≠k∈Z^n} { t ∈ R : |t − 2πl/⟨k, ω⟩| < λ/(l^μ |k|^τ) },

we have

    meas(R \ I_{λ,μ,τ}(ω)) ≤ Σ_{0<l∈Z, 0≠k∈Z^n} 2λ/(l^μ |k|^τ) ≤ 2λ · Σ_{l=1}^∞ 1/l^μ · Σ_{0≠k∈Z^n} 1/|k|^τ.

Define

    c_μ = Σ_{l=1}^∞ 1/l^μ.

Then c_μ < ∞ when μ > 1, and

    Σ_{0≠k∈Z^n} |k|^{-τ} = Σ_{m=1}^∞ (1/m^τ) · #{k ∈ Z^n : |k| = m} ≤ 2^n Σ_{m=1}^∞ (1/m^τ) C^m_{n+m−1}
                         ≤ 2^{2n−1} Σ_{m=1}^∞ 1/m^{τ−n+1} = 2^{2n−1} c_{τ−n+1} < ∞

when τ > n, where #S denotes the number of elements of the set S and the C_s^k are binomial coefficients. (4.5) is verified with

    c = 4^n c_μ c_{τ−n+1}.  (4.6)

Therefore, the lemma is proved. □
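The counting estimate #{k ∈ Z^n : |k| = m} ≤ 2^n C^m_{n+m−1} used in the proof (with |k| the 1-norm) can be checked by brute force for small n and m. An illustrative sketch of our own, not from the text:

```python
from itertools import product
from math import comb

def count_lattice_points(n, m):
    """Brute-force count of #{k in Z^n : |k|_1 = m}, scanning the box [-m, m]^n."""
    return sum(1 for k in product(range(-m, m + 1), repeat=n)
               if sum(abs(ki) for ki in k) == m)

def binomial_bound(n, m):
    """The bound 2^n * C(n+m-1, m) from the proof of Lemma 4.4: the moduli |k_i|
    form a composition of m into n nonnegative parts, with at most 2^n sign patterns."""
    return 2**n * comb(n + m - 1, m)
```

For instance, in Z² there are 4 lattice points with |k|_1 = 1, against the bound 4·C(2, 1) = 8.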
Remark 4.5. We may define I_{λ,μ,τ}(ω) to be empty for any resonant frequency ω and any λ > 0, any μ and any τ > 0, because no number t satisfies (4.4) in this case. The set I_{λ,μ,τ}(ω) may still be empty even for a nonresonant frequency ω if the numbers μ and τ are not properly chosen. Anyway, the above lemma shows that if μ > 1 and τ > n, then the set I_{λ,μ,τ}(ω) has positive Lebesgue measure and hence is nonempty for any λ > 0.

Remark 4.6. If λ_1 > λ_2 > 0, then I_{λ_1,μ,τ}(ω) ⊂ I_{λ_2,μ,τ}(ω). Therefore, if ω is a nonresonant frequency and μ > 1 and τ > n, then the set of all real numbers t satisfying (4.4) for some λ > 0 has full Lebesgue measure in any measurable subset of R. It should be an interesting number-theoretic problem to study the cases when μ ≤ 1 or τ ≤ n.

In numerical analysis, the step sizes are usually considered only in a bounded interval. We take the interval [−1, 1] as illustration, without loss of generality.

Lemma 4.7. For a nonresonant frequency ω = (ω_1, ω_2, ..., ω_n), assume 0 < λ < 2π/|ω| with |ω| = max_{1≤j≤n} |ω_j|. If −1 ≤ μ ≤ 1 and μ + τ > n + 1, then we have

    meas([−1, 1] \ I_{λ,μ,τ}(ω)) ≤ c̃λ,  (4.7)

where c̃ is a positive number depending not only on n, μ and τ but also on |ω|.

Proof. The set [−1, 1] \ I_{λ,μ,τ}(ω) is contained in the union of all subintervals

    [ 2πl/⟨k, ω⟩ − λ/(l^μ |k|^τ), 2πl/⟨k, ω⟩ + λ/(l^μ |k|^τ) ]

for those l ∈ Z, l > 0 and k ∈ Z^n \ {0} such that

    2πl/|⟨k, ω⟩| < 1 + λ/(l^μ |k|^τ).  (4.8)

Since −1 ≤ μ ≤ 1, we have that τ > n + 1 − μ ≥ n, and (4.8) implies that |k| > (2π/|ω| − λ) l. Therefore,

    meas([−1, 1] \ I_{λ,μ,τ}(ω)) ≤ Σ_{l=1}^∞ Σ_{|k|>N_{l,λ}} 2λ/(l^μ |k|^τ) ≤ 4^n λ Σ_{l=1}^∞ (1/l^μ) Σ_{m>N_{l,λ}} 1/m^{τ−n+1},

where N_{l,λ} = (2π/|ω| − λ) l, which is positive for positive l. We will use the following estimate, which is easy to prove:

    Σ_{m>N} 1/m^{τ−n+1} ≤ { c_{τ−n+1},                       0 < N ≤ 1,
                          { 1/((τ − n)(N − 1)^{τ−n}),        N > 1.       (4.9)

Assume l_λ is the integer such that N_{l_λ,λ} ≤ 1 and N_{l_λ+1,λ} > 1. Then (4.7) is verified with

    c̃ = 4^n ( c_{τ−n+1} Σ_{l=1}^{l_λ} 1/l^μ + Σ_{l=l_λ+1}^∞ 1/( l^μ (τ − n) ((2π/|ω| − λ) l − 1)^{τ−n} ) ),  (4.10)

which is finite because the conditions μ + τ > n + 1 and 0 < λ < 2π/|ω| guarantee the convergence of the infinite summation in (4.10). If l_λ = 0, then we take Σ_{l=1}^{l_λ} 1/l^μ = 0, and hence the first term in the bracket of Equation (4.10) disappears in this case. Note that here the number c̃ depends also on λ, but this dependence is essentially harmless, because the only harmful case is when λ is close to 2π/|ω|. However, this case is not of interest and may always be avoided; for example, we may simply assume 0 < λ ≤ π/|ω| in the lemma. The proof of Lemma 4.7 is completed. □

Therefore, to guarantee the positiveness of the Lebesgue measure of the set I_{λ,μ,τ}(ω), it is not necessary to assume μ > 1; one may only require μ ≥ −1. In the case −1 ≤ μ ≤ 1, however, one has to additionally require μ + τ > n + 1, which automatically implies that τ > n. One may also consider how big the set I_{λ,μ,τ}(ω) is in other unit intervals with integer endpoints, but we do not go further in this direction.
Remark 4.8. It remains to study the set I_{λ,μ,τ}(ω) in the other cases: μ < −1, or τ ≤ n, or μ + τ ≤ n + 1. I believe the Lebesgue measure of the set is zero in each of these cases. It is also an interesting problem to calculate the Hausdorff dimensions of the set I_{λ,μ,τ}(ω) in all of these cases. The cases when −1 ≤ μ ≤ 1 and τ = n − μ + 1, and when μ > 1 and τ = n, should be particularly interesting. In all other cases, I am inclined to believe the set is empty. Note that the special case when n = 1, μ = 0 and τ = n − μ + 1 = 2 with ω = 2π just corresponds to the classical diophantine problem of approximating an irrational number by rationals.
To any nonresonant frequency ω in R^n, we have associated a 3-parameter family of sets I_{λ,μ,τ}(ω) on the real line. The set I_{λ,μ,τ}(ω) has positive Lebesgue measure, and hence is nonempty, if μ ≥ −1, τ > n, μ + τ > n + 1 and λ > 0 is suitably small (in the case when μ > 1 and τ > n, I_{λ,μ,τ}(ω) has positive Lebesgue measure for any λ > 0). But to guarantee an invariant torus of the frequency tω for symplectic algorithms with the step size t, it seems that the only way is to require that tω satisfy a diophantine condition of the type (1.1); indeed, J. Mather showed in [Mat88] that, for any exact area-preserving twist mapping, an invariant circle with any Liouville frequency can be destroyed by arbitrarily small perturbations in the C^∞-topology. This is the case when one requires both that ω be a diophantine frequency and that t be a diophantine step size with respect to ω, as the following lemma shows.
Lemma 4.9. Let γ > 0 and 0 < λ ≤ 1. Then for any ω ∈ Ω_γ(τ_1)^6 and any t ∈ [−1, 1] ∩ I_{λ,μ,τ_2}(ω), we have

    |e^{i⟨k, tω⟩} − 1| ≥ |t| γ̃ / |k|^{μ+τ_1+τ_2},  0 ≠ k ∈ Z^n,  (4.11)

where

    γ̃ = 2λγ / ( π (1 + |ω|/(2π))^μ ).  (4.12)

6 We denote by Ω_γ(τ) the set of all vectors ω ∈ R^n satisfying the diophantine condition of the form |⟨k, ω⟩| ≥ γ/|k|^τ, 0 ≠ k ∈ Z^n.

Proof. It is easy to prove that for k ∈ Z^n, k ≠ 0, there exists l ∈ Z such that

    |e^{i⟨k, tω⟩} − 1| ≥ (2/π) |⟨k, tω⟩ + 2πl|.

We have two cases:

1° l = 0. Since ω ∈ Ω_γ(τ_1),

    |e^{i⟨k, tω⟩} − 1| ≥ (2/π) |⟨k, tω⟩| ≥ 2|t|γ / (π|k|^{τ_1});

2° l ≠ 0. Since t ∈ I_{λ,μ,τ_2}(ω) and ω ∈ Ω_γ(τ_1),

    |e^{i⟨k, tω⟩} − 1| ≥ (2/π) |⟨k, (t + 2πl/⟨k, ω⟩) ω⟩| = (2/π) |t + 2πl/⟨k, ω⟩| · |⟨k, ω⟩| ≥ (2/π) · λγ / (l^μ |k|^{τ_1+τ_2}).

But

    |2πl| ≤ |⟨k, tω⟩ + 2πl| + |⟨k, tω⟩| ≤ (π/2) |e^{i⟨k, tω⟩} − 1| + |t| |⟨k, ω⟩| ≤ π + |t| |ω| |k|;

therefore,

    |e^{i⟨k, tω⟩} − 1| ≥ 2λγ / ( π (1/2 + (|t|/(2π)) |ω|)^μ |k|^{μ+τ_1+τ_2} ).

Combining the two cases, (4.11) is verified and hence Lemma 4.9 is proved. □
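The proof opens with the elementary inequality |e^{iθ} − 1| ≥ (2/π)|θ + 2πl| for the integer l minimizing the right-hand side, i.e. (2/π) times the distance from θ to the lattice 2πZ; it reduces to sin x ≥ (2/π)x on [0, π/2]. A quick numerical sanity check of this inequality (our own illustration, not from the text):

```python
import cmath
import math

def lower_bound_holds(theta):
    """Check |e^{i*theta} - 1| >= (2/pi) * dist(theta, 2*pi*Z), the elementary
    inequality used at the start of the proof of Lemma 4.9."""
    lhs = abs(cmath.exp(1j * theta) - 1)
    nearest = 2 * math.pi * round(theta / (2 * math.pi))
    rhs = (2 / math.pi) * abs(theta - nearest)
    # small tolerance for rounding: equality occurs at theta = pi (mod 2*pi)
    return lhs >= rhs - 1e-12
```

Equality is attained exactly at θ ≡ π (mod 2π), where both sides equal 2, and at θ ∈ 2πZ, where both sides vanish.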
From the above lemmas and the fact that meas(R^n \ ⋃_{γ>0} Ω_γ(τ)) = 0 for τ > n − 1, we conclude that for almost all ω ∈ R^n and almost all t ∈ [−1, 1], tω satisfies a diophantine condition of the mapping type (2.5). As the step size of a difference scheme, however, t may fall into an arbitrarily small neighbourhood of the origin. The next lemma shows that for a nonresonant frequency ω ∈ R^n and for μ ≥ −1, τ > n + 1, μ + τ > n + 1 and 0 < λ < 2π/|ω|, the set I_{λ,μ,τ}(ω) has large measure near the origin of the real line.

Lemma 4.10. Let ω be a nonresonant frequency of R^n. Let λ > 0, μ ≥ −1, τ > n and μ + τ > n + 1. For any δ > 0, let

    J^δ_{λ,μ,τ}(ω) = (−δ, δ) \ I_{λ,μ,τ}(ω).

If λ + δ < 2π/|ω|, then

    meas J^δ_{λ,μ,τ}(ω) ≤ d δ^{τ−n},  (4.13)

where

    d = (4^n λ/(τ − n)) Σ_{l=1}^∞ 1/( l^μ ((2π/|ω| − λ) l − δ)^{τ−n} ) < ∞.  (4.14)

Consequently, if in addition τ > n + 1, then

    lim_{δ→0+} meas(I_{λ,μ,τ}(ω) ∩ (−δ, δ)) / meas(−δ, δ) = 1.  (4.15)

Proof. Let t ∈ J^δ_{λ,μ,τ}(ω). By definition, we have

    −δ − λ/(l^μ |k|^τ) ≤ 2πl/⟨k, ω⟩ ≤ δ + λ/(l^μ |k|^τ)  (4.16)

for some k ∈ Z^n and 0 < l ∈ Z. Fix l ∈ Z, l ≠ 0, and denote by K_l^δ the set of k ∈ Z^n satisfying (4.16). If k ∈ K_l^δ, then

    2πl / (δ + λ/(l^μ |k|^τ)) ≤ |⟨k, ω⟩| ≤ |k| |ω|,

which implies that

    |k| > ((2π/|ω| − λ)/δ) l = N_l^δ,

since μ ≥ −1. This shows that K_l^δ ⊂ { k ∈ Z^n : |k| > N_l^δ }, and therefore

    meas J^δ_{λ,μ,τ}(ω) ≤ Σ_{l=1}^∞ Σ_{k∈Z^n, |k|>N_l^δ} 2λ/(l^μ |k|^τ) ≤ 4^n λ Σ_{l=1}^∞ (1/l^μ) Σ_{m>N_l^δ} 1/m^{τ−n+1}.

Because 0 < δ + λ < 2π/|ω|, we have N_l^δ > 1. (4.13) follows from (4.9) with the constant d defined by (4.14), which is finite because τ > n and μ + τ − n > 1. (4.15) is true if, in addition, τ > n + 1. □

13.4.3 Invariant Tori and Further Remarks


Now, we summarize the main result of this section as follows.
Theorem 4.11. Given an analytic, nondegenerate, integrable Hamiltonian system of n degrees of freedom, and given a frequency ω in the domain of the frequencies of the system which satisfies the diophantine condition of the form

    |⟨k, ω⟩| ≥ γ/|k|^τ,  0 ≠ k = (k_1, ..., k_n) ∈ Z^n  (4.17)

for some γ > 0 and τ > 0, there exists a Cantor set I(ω) of R such that, for any symplectic algorithm applied to the system, there exists a positive number δ_0 with the following property: if the step size t of the algorithm falls into the set (−δ_0, δ_0) ∩ I(ω), then the algorithm, applied to the integrable system, has an invariant torus of frequency tω. The invariant torus of the algorithm approximates the invariant torus of the system in the sense of Hausdorff, with order equal to the order of accuracy of the algorithm. The Cantor set I(ω) has density one at the origin, in the sense that

    lim_{δ→0+} m((−δ, δ) ∩ I(ω)) / m(−δ, δ) = 1.  (4.18)

Proof. For the given ω, we define I(ω) = I_{λ,μ,τ̃}(ω) for some λ > 0, μ > 1 and τ̃ > n + 1. By Lemma 4.9, we have, for any t ∈ [−1, 1] ∩ I(ω),

    |e^{i⟨k, tω⟩} − 1| ≥ |t| γ̃ / |k|^{μ+τ+τ̃},  0 ≠ k ∈ Z^n

with γ̃ given by (4.12). The analytic version of Theorem 2.4 may be applied, and therefore, for a symplectic algorithm applied to the given system^7, we can find a positive number δ_0, which depends on the numbers n, γ, τ, λ, μ, τ̃ and |ω| and on the nondegeneracy and the analyticity of the system and, of course, also on the algorithm, such that the algorithm has an invariant torus of the frequency tω, with the required approximating property to the corresponding invariant torus of the system, if the step size t falls into the set [−δ_0, δ_0] ∩ I(ω). It follows from Lemma 4.10 that the set I(ω) has density one at the origin, because we have chosen μ > 1 and τ̃ > n + 1. □

Remark 4.12. In practical computations, one would like to choose big step sizes. It is interesting to look at how the δ_0 in Theorem 4.11 depends on the nonresonance property of the frequency ω, and how δ_0 relates to the size of the diophantine set I(ω) of step sizes. The parameters γ and τ describe the nonresonance property of the frequency ω, and the parameters λ, μ and τ̃ determine the size of the set I(ω). Among them, the most interesting are γ and λ, because we may fix all the others in advance without loss of generality. For a given ω, we define γ to be the biggest number such that (4.17) holds for a fixed τ > n − 1. It is easy to see, from Lemma 4.9 and Theorem 2.4, that δ_0 may be chosen proportional to (γλ)^{2/s}, where s is the order of accuracy of the algorithm considered in Theorem 4.11. Note that the more nonresonant ω is, the bigger γ will be, and therefore the bigger δ_0 is admitted. On the other hand, for a given ω, the bigger the step size taken, the bigger λ has to be chosen, and in this case the set I(ω) turns out to be smaller. But in any case, the set I(ω) is of density one at the origin. Consequently, to simulate an invariant torus, one has many more possibilities to select available small step sizes than to select available big ones.

Remark 4.13. It is interesting to make some comparisons between Theorem 3.4 and Theorem 4.11. Theorem 3.4 shows that a symplectic algorithm applied to an analytic nondegenerate integrable Hamiltonian system has so many invariant tori that they form a set of positive Lebesgue measure in the phase space, provided the step size of the algorithm is sufficiently small and fixed in an arbitrary way; no additional nonresonance or diophantine condition is imposed on the step size. But the set of frequencies of the invariant tori depends on the step size and, therefore, changes in general as the step size changes. It is a fact that the measure of the set of frequencies of the invariant tori becomes larger and larger as the step size gets smaller and smaller. These sets, however, may not intersect at all for step sizes taken over any interval near the origin. Therefore, invariant tori of prescribed frequencies may not be guaranteed for a symplectic algorithm with step size randomly taken in any neighbourhood of the origin. Theorem 4.11 shows that an invariant torus with any fixed diophantine frequency of an analytic nondegenerate integrable Hamiltonian system can always be simulated very well by symplectic algorithms, for any step size in a Cantor set of positive Lebesgue measure near the origin. The following theorem shows that one can simultaneously simulate any finitely many invariant tori of given diophantine frequencies by symplectic algorithms, with a sufficiently big probability of selecting available step sizes. The step sizes, of course, also have to be restricted to a Cantor set.

7 So far, the available symplectic algorithms are exact symplectic when applied to global Hamiltonian systems, and analytic when applied to analytic systems.

Theorem 4.14. Given an analytic, nondegenerate, integrable Hamiltonian system of n degrees of freedom and N diophantine frequencies ω^j (j = 1, 2, ..., N) in the domain of the frequencies of the system, there exists a Cantor set I of R, depending on the N frequencies, such that for any symplectic algorithm applied to the system, there exists a positive number δ_0 with the following property: if the step size t of the algorithm falls into the set (−δ_0, δ_0) ∩ I, then the algorithm has N invariant tori of the frequencies tω^j (j = 1, 2, ..., N) when it is applied to the integrable system. These invariant tori approximate the corresponding ones of the system in the sense of Hausdorff, with order equal to the order of accuracy of the algorithm. The Cantor set I has density one at the origin.

Proof. The proof of Theorem 4.14 follows from Theorem 4.11 and Lemma 4.15 below. □

Lemma 4.15. For any integer N ≥ 1, any ω^j ∈ Ω_γ(τ) (j = 1, 2, ..., N) and any δ > 0, put A^N = (ω^1, ω^2, ..., ω^N) and

    I^N_{λ,μ,τ̃}(A^N) = ⋂_{j=1}^N I_{λ,μ,τ̃}(ω^j),   J^{δ,N}_{λ,μ,τ̃}(A^N) = (−δ, δ) \ I^N_{λ,μ,τ̃}(A^N)

with given λ > 0, μ ≥ −1, τ̃ > n + 1 and μ + τ̃ > n + 1. Then we have

    meas J^{δ,N}_{λ,μ,τ̃}(A^N) ≤ N d δ^{τ̃−n}

if λ + δ < 2π/|A^N|, where |A^N| = max_{1≤j≤N} |ω^j| and d is defined by (4.14) with τ replaced by τ̃ and |ω| replaced by |A^N|. Consequently, the set I^N_{λ,μ,τ̃}(A^N) has density one at the origin. Moreover, for any t ∈ [−1, 1] ∩ I^N_{λ,μ,τ̃}(A^N), we have

    |e^{i⟨k, tω^j⟩} − 1| ≥ |t| γ̃ / |k|^{μ+τ+τ̃},  0 ≠ k ∈ Z^n, j = 1, 2, ..., N

with γ̃ given by (4.12), where |ω| is replaced by |A^N|.

Proof. Lemma 4.15 is a natural corollary of Lemmas 4.9 and 4.10. □

Remark 4.16. There have been several works on the exponential stability of symplectic algorithms in simulating invariant tori with given diophantine frequencies of integrable or nearly integrable systems (Benettin and Giorgilli (1994)[BG94], Hairer and Lubich (1997)[HL97], and Stoffer (1998)[Sto98b]). The result of Hairer and Lubich[HL97], for example, shows that over a very long interval of iteration steps (exponentially long in 1/t), the numerical orbits of a symplectic algorithm approximate the exact orbits of some perturbed Hamiltonian system⁸ with a very small error (exponentially small in 1/t), provided the starting values of the numerical orbits and the exact ones coincide and are taken on the invariant torus of the perturbed system (the invariant torus is guaranteed by the KAM theorem)[HL97] (Corollary 7), or are taken in a neighbourhood of the invariant torus of radius of order t^{2n+2} (this follows easily from Hairer and Lubich (1997, Corollary 8)); here n is the number of degrees of freedom of the Hamiltonian system and t is the step size of the algorithm, which is assumed to be sufficiently small. Theorems 4.11 and 4.14 show that one may generate quasi-periodic (and therefore perpetually stable) numerical orbits with a symplectic algorithm which approximate exact quasi-periodic orbits of an analytic nondegenerate integrable Hamiltonian system, provided the step sizes of the algorithm fall into a Cantor set of large density near the origin. As the step size in this Cantor set gets smaller and smaller, more and more stable numerical orbits appear. For such stability considerations, Theorem 3.4 shows much more: the perpetually stable numerical orbits take up a large set of the phase space, whose Lebesgue measure approaches the Lebesgue measure of the phase space as the step size approaches zero. Due to the well-known topological confinement of the phase plane between invariant closed curves, this implies the perpetual stability of symplectic algorithms applied to one-degree-of-freedom systems for any initial values if the step size is small.
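The qualitative difference described above can be seen already for the simplest integrable system. The following sketch is not from the book: the harmonic oscillator, step size, and iteration count are illustrative choices. It compares the energy error of the symplectic Euler method with that of the non-symplectic explicit Euler method:

```python
H = lambda q, p: 0.5 * (q**2 + p**2)     # harmonic oscillator, one degree of freedom

def max_energy_drift(step, q, p, h, n):
    """Largest deviation of H from its initial value along n steps."""
    e0, drift = H(q, p), 0.0
    for _ in range(n):
        q, p = step(q, p, h)
        drift = max(drift, abs(H(q, p) - e0))
    return drift

def explicit_euler(q, p, h):             # not symplectic: Jacobian determinant 1 + h^2
    return q + h * p, p - h * q

def symplectic_euler(q, p, h):           # symplectic: update p first, then q
    p = p - h * q
    return q + h * p, p

h, n = 0.1, 1000
d_sym = max_energy_drift(symplectic_euler, 1.0, 0.0, h, n)   # stays bounded, O(h)
d_eul = max_energy_drift(explicit_euler, 1.0, 0.0, h, n)     # grows like (1 + h^2)^n
```

For h = 0.1 the symplectic orbit stays on a closed invariant curve (the drift is bounded, of order h), while the explicit Euler orbit spirals outward and its energy grows without bound, illustrating the perpetual stability discussed above.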

Remark 4.17. Generally speaking, it is difficult to check the diophantine condition for a step size with respect to a nonresonant frequency vector. Fortunately, however, an obvious fact is that step sizes N^{-1}, with N an integer, satisfy the diophantine condition (4.4) with respect to frequency vectors satisfying the diophantine condition (2.8). This fact was verified by Dujardin and Faou (2007)[DF07] for the 1 + 1 dimensional linear Schrödinger equation with a periodic potential, where a spatially periodic solution can be stably simulated using the nonresonant step size t = 1/5 = 0.2, but is quickly destroyed using the resonant step size t = 2π/(6² − 2²) ≈ 0.196.
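The small-divisor mechanism behind this example can be checked numerically. The following sketch is illustrative and not from [DF07]; it assumes the frequencies ω_j = j² of the free Schrödinger operator and a small truncation J, and evaluates the divisors |e^{it(j²−l²)} − 1| for the two step sizes:

```python
import numpy as np

def min_divisor(t, J):
    """Smallest |exp(i*t*(j^2 - l^2)) - 1| over 1 <= l < j <= J.

    With frequencies omega_j ~ j^2, these are the small divisors that
    control the long-time stability of a splitting scheme with step size t.
    """
    best = np.inf
    for j in range(2, J + 1):
        for l in range(1, j):
            best = min(best, abs(np.exp(1j * t * (j**2 - l**2)) - 1.0))
    return best

d_nonres = min_divisor(0.2, 10)           # nonresonant step size t = 1/5
d_res = min_divisor(2 * np.pi / 32, 10)   # resonant: t*(6^2 - 2^2) = 2*pi
```

The nonresonant step size keeps all divisors bounded away from zero, while the resonant one annihilates the divisor for the mode pair (j, l) = (6, 2).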

⁸ The perturbed Hamiltonian system approximates the symplectic algorithm and is determined uniquely, in the setting of backward analysis, by the algorithm and the Hamiltonian system to which the algorithm is applied[Hai94].
Bibliography

[AA89] V. I. Arnold and A. Avez: Ergodic Problems of Classical Mechanics. Addison-Wesley and Benjamin Cummings, New York, (1989).
[Arn63] V. I. Arnold: Proof of A. N. Kolmogorov's theorem on the preservation of quasi-periodic motions under small perturbations of the Hamiltonian. Russian Math. Surveys, 18(5):9–36, (1963).
[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. Springer-Verlag, GTM
60, Berlin, Heidelberg, Second edition, (1989).
[BB79] K. Burrage and J.C. Butcher: Stability criteria for implicit Runge–Kutta methods.
SIAM J. Numer. Anal., 16:46–57, (1979).
[Bey87] W. J. Beyn: On invariant closed curves for one-step methods. Numer. Math., 51:103–
122, (1987).
[BG94] G. Benettin and A. Giorgilli: On the Hamiltonian interpolation of near to the identity
symplectic mappings with application to symplectic integration algorithms. J. Stat. Phys.,
74:1117–1143, (1994).
[But75] J. C. Butcher: A stability property of implicit Runge–Kutta methods. BIT, 15(3):358–361, (1975).
[CD09] F. Castella and G. Dujardin: Propagation of Gevrey regularity over long times for the fully discrete Lie–Trotter splitting scheme applied to the linear Schrödinger equation. ESAIM: Mathematical Modelling and Numerical Analysis, 43(4):651–676, (2009).
[CFM06] P. Chartier, E. Faou, and A. Murua: An algebraic approach to invariant preserving
integrators; the case of quadratic and Hamiltonian invariants. Numer. Math., 103(4):575–
590, (2006).
[CG82] L. Chierchia and G. Gallavotti: Smooth prime integrals for quasi-integrable Hamiltonian systems. Il Nuovo Cimento, 67:277–295, (1982).
[CS90] P.J. Channell and C. Scovel: Symplectic integration of Hamiltonian systems. Nonlin-
earity, 3:231–259, (1990).
[CS94] C. G. Cheng and Y. S. Sun: Existence of KAM tori in degenerate Hamiltonian systems.
J. Differential Equations, 114(1):288–335, (1994).
[Dah63] G. Dahlquist: A special stability problem for linear multistep methods. BIT, 3(1):27–
43, (1963).
[Dah75] G. Dahlquist: Error analysis for a class of methods for stiff nonlinear initial value
problems. In G.A. Watson, editor, Lecture notes in Mathematics, Vol. 506, Numerical
Analysis, Dundee, pages 60–74. Springer, Berlin, (1975).
[Dah78] G. Dahlquist: G-stability is equivalent to A-stability. BIT, 18:384–401, (1978).
[DF07] G. Dujardin and E. Faou: Normal form and long time analysis of splitting schemes
for the linear Schrödinger equation with small potential. Numer. Math., 108(2):223–262,
(2007).
[Ehl69] B. L. Ehle: On Padé approximations to the exponential function and A-stable methods
for the numerical solution of initial value problems. Technical Report, Research Rep. No.
CSRR 2010, University of Waterloo Dept. of Applied Analysis and Computer Science,
(1969).

[Fen91] K. Feng: The Hamiltonian Way for Computing Hamiltonian Dynamics. In R. Spigler, editor, Applied and Industrial Mathematics, pages 17–35. Kluwer, The Netherlands, (1991).
[FQ03] K. Feng and M. Z. Qin: Symplectic Algorithms for Hamiltonian Systems. Zhejiang Science and Technology Publishing House, Hangzhou, in Chinese, First edition, (2003).
[Gar96] B. M. Garay: On structural stability of ordinary differential equations with respect to
discretization methods. Numer. Math., 72:449–479, (1996).
[Hai94] E. Hairer: Backward analysis of numerical integrators and symplectic methods. Annals
of Numer. Math., 1:107–132, (1994).
[HL97] E. Hairer and Ch. Lubich: The life-span of backward error analysis for numerical
integrators. Numer. Math., 76:441–462, (1997).
[HLW02] E. Hairer, Ch. Lubich, and G. Wanner: Geometric Numerical Integration. Num-
ber 31 in Springer Series in Computational Mathematics. Springer-Verlag, Berlin, (2002).
[HNW93] E. Hairer, S. P. Nørsett, and G. Wanner: Solving Ordinary Differential Equations I,
Nonstiff Problems. Springer-Verlag, Berlin, Second revised edition, (1993).
[HS81] W. H. Hundsdorfer and M. N. Spijker: A note on B-stability of Runge–Kutta methods.
Numer. Math., 36:319–331, (1981).
[HS94] A. R. Humphries and A. M. Stuart: Runge–Kutta methods for dissipative and gradient
dynamical systems. SIAM J. Numer. Anal., 31(5):1452–1485, (1994).
[Kol54b] A. N. Kolmogorov: On conservation of conditionally periodic motions under small perturbations of the Hamiltonian. Dokl. Akad. Nauk SSSR, 98:527–530, (1954).
[Laz74] V. F. Lazutkin: On Moser's theorem on invariant curves. In Voprosy raspr. seism. voln. vyp. Nauka, Leningrad, 14:105–120, (1974).
[Li99] M. C. Li: Structural stability for Euler method. SIAM J. Math. Anal., 30(4):747–755,
(1999).
[LR05] B. Leimkuhler and S. Reich: Simulating Hamiltonian Dynamics. Cambridge Univer-
sity Press, Cambridge, First edition, (2005).
[Mat88] J. Mather: Destruction of invariant circles. Ergod. Theory & Dynam. Sys, 8:199–214,
(1988).
[Mos62] J. Moser: On invariant curves of area-preserving mappings of an annulus. Nachr.
Akad. Wiss. Gottingen, II. Math.-Phys., pages 1–20, (1962).
[Nei82] A. I. Neishtadt: Estimates in the Kolmogorov theorem on conservation of condition-
ally periodic motions. J. Appl. Math. Mech., 45(6):766–772, (1982).
[Pös82] J. Pöschel: Integrability of Hamiltonian systems on Cantor sets. Comm. Pure and
Appl. Math., 35:653–695, (1982).
[Rüs81] H. Rüssmann: On the existence of invariant curves of twist mappings of an annulus. In J. Palis, editor, Geometric Dynamics, Lecture Notes in Math. 1007, pages 677–718. Springer-Verlag, Berlin, (1981).
[Rüs90] H. Rüssmann: On twist Hamiltonians. In Colloque International: Mécanique céleste et systèmes hamiltoniens, Marseille, (1990).
[SH96] A.M. Stuart and A.R. Humphries: Dynamical Systems and Numerical Analysis. Cam-
bridge University Press, Cambridge, Second edition, (1996).
[Sha91] Z. J. Shang: On the KAM theorem of symplectic algorithms for Hamiltonian systems. Ph.D. thesis (in Chinese), Computing Center, Academia Sinica, (1991).
[Sha99] Z. Shang: KAM theorem of symplectic algorithms for Hamiltonian systems. Numer.
Math., 83:477–496, (1999).
[Sha00a] Z. J. Shang: A note on the KAM theorem for symplectic mappings. J. Dynam.
Differential eqns., 12(2):357–383, (2000).
[Sha00b] Z. J. Shang: Resonant and diophantine step sizes in computing invariant tori of
Hamiltonian systems. Nonlinearity, 13:299–308, (2000).
[SSC94] J. M. Sanz-Serna and M. P. Calvo: Numerical Hamiltonian Problems. AMMC 7.
Chapman & Hall, London, (1994).
[Sto98a] D. Stoffer: On the qualitative behaviour of symplectic integrators. II: Integrable systems. J. of Math. Anal. and Appl., 217:501–520, (1998).

[Sto98b] D. Stoffer: On the qualitative behaviour of symplectic integrators. III: Perturbed integrable systems. J. of Math. Anal. and Appl., 217:521–545, (1998).
[Sva81] N. V. Svanidze: Small perturbations of an integrable dynamical system with an integral
invariant. In Proceedings of the Steklov Institute of Mathematics, Issue 2, pages 127–151,
(1981).
[Wid76] O. B. Widlund: A note on unconditionally stable linear multistep methods. BIT,
7(1):65–70, (1976).
Chapter 14.
Lee-Variational Integrator

In the 1980s, Lee proposed an energy-preserving discrete mechanics with variable time steps by taking (discrete) time as a dynamical variable[Lee82,Lee87]. On the other hand, motivated by the symplectic property of Lagrangian mechanics, a version of discrete Lagrangian mechanics has been developed, and variational integrators that preserve a discrete symplectic 2-form have been obtained[MPS98,MV91,Ves88,Ves91a,WM97]; but the variational integrators obtained in this way fix the time steps and, consequently, are not energy-preserving in general.
Clearly, an energy-preserving discrete mechanics with symplectic variational integrators is preferable, since solutions of the Euler–Lagrange equations of conservative continuous systems are not only symplectic but also energy-preserving. To attain this goal, we should study a discrete mechanics with discrete energy conservation and symplectic variational integrators. Recently, Kane, Marsden, and Ortiz have employed appropriately chosen time steps to conserve a defined discrete energy and developed what they called symplectic energy-momentum-preserving variational integrators in [KMO99]. Although their approach is more or less related to Lee's discrete mechanics, their discrete energy-preserving condition is not derived from a variational principle.

14.1 Total Variation in Lagrangian Formalism


The purpose of this section is to generalize and improve the above-mentioned approaches, as well as to explore the relations among the discrete total variation, Lee's discrete mechanics, and the Kane–Marsden–Ortiz integrators. We will present a discrete total variation calculus with variable time steps and a discrete mechanics that is discretely symplectic, energy-preserving, and has the correct continuous limit. In fact, this discrete variational calculus and mechanics is a generalization of Lee's discrete mechanics in the symplectic-preserving sense, and it directly yields the variational symplectic-energy-momentum integrators of Kane, Marsden, and Ortiz.

14.1.1 Variational Principle in Lagrangian Mechanics


Before beginning this section, we recall very briefly the ordinary variational principle in Lagrangian mechanics for later use. Suppose Q denotes the extended configuration space with coordinates (t, q^i), and Q^(1) the first prolongation of Q with coordinates (t, q^i, q̇^i)[Olv93]. Here t denotes time and q^i (i = 1, 2, ..., n) denote the positions. Consider a Lagrangian L : Q^(1) → R. The corresponding action functional is defined by

$$ S(q^i(t)) = \int_a^b L\big(t, q^i(t), \dot q^i(t)\big)\,\mathrm{d}t, \tag{1.1} $$

where q^i(t) is a C² curve in Q.
Hamilton’s principle seeks a curve q i (t) denoted by Cab with endpoints a and b,
for which the action functional S is stationary under variations of q i (t) with fixed
endpoints. Let

V = φi (t, q) i (1.2)
∂q
be a vertical vector field on Q, here q = (q 1 , · · · , q n ). By a vertical vector field we

mean a vector field on Q which does not involve terms of form ξ(t, q) , for example,
∂t
time t does not undergo variation.
Let F ε be the flow of V , i.e., a one-parameter group of transformations on Q :
F (t, q i ) = (t̃, q̃ i ).
ε

t̃ = t, (1.3)
q̃ i = g i (ε, t, q), (1.4)

where
d 
 g i (ε, t, q) = φi (t, q) := δq i (t). (1.5)
d ε ε=0
In other words, the deformation (1.3)–(1.4) transforms the curve q^i(t) into a family of curves q̃^i(ε, t̃) in Q, denoted by C_ε^{ab}, which are determined by

$$ \tilde t = t, \tag{1.6} $$
$$ \tilde q^i = g^i(\varepsilon, t, q(t)). \tag{1.7} $$

Thus, we obtain a (sufficiently small) set of curves C_ε^{ab} around C_{ab}. Corresponding to this set of curves there is a set of Lagrangians and action functionals

$$ S(q^i(t)) \longrightarrow S(\tilde q^i(\varepsilon,\tilde t)) = \int_a^b L\Big(\tilde q^i(\varepsilon,\tilde t),\ \frac{\mathrm{d}}{\mathrm{d}\tilde t}\tilde q^i(\varepsilon,\tilde t)\Big)\,\mathrm{d}\tilde t. \tag{1.8} $$

Now we can calculate the variation of S at q(t) as follows:

$$ \delta S = \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0} S(\tilde q^i(\varepsilon,\tilde t)) = \int_a^b \left( \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} \right)\phi^i\,\mathrm{d}t + \left[ \frac{\partial L}{\partial \dot q^i}\,\phi^i \right]_a^b. \tag{1.9} $$

For fixed endpoints, φ^i(a, q(a)) = φ^i(b, q(b)) = 0, and the requirement of Hamilton's principle, δS = 0, yields the Euler–Lagrange equation for q(t):

$$ \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} = 0. \tag{1.10} $$

If we drop the requirement φ^i(a, q(a)) = φ^i(b, q(b)) = 0, we naturally obtain the Lagrangian 1-form on Q^(1) from the second term in (1.9):

$$ \theta_L = \frac{\partial L}{\partial \dot q^i}\,\mathrm{d}q^i, \tag{1.11} $$

where the dq^i are dual to the ∂/∂q^j, i.e., ⟨∂/∂q^j, dq^i⟩ = δ^i_j. Furthermore, it can be proved that the solution of (1.10) preserves the Lagrangian 2-form

$$ \omega_L := \mathrm{d}\theta_L. \tag{1.12} $$

On the other hand, introducing the Euler–Lagrange 1-form

$$ E(q^i, \dot q^i) = \left( \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} \right)\mathrm{d}q^i, \tag{1.13} $$

the nilpotency of d leads to

$$ \mathrm{d}E(q^i, \dot q^i) + \frac{\mathrm{d}}{\mathrm{d}t}\,\omega_L = 0, \tag{1.14} $$

namely, the necessary and sufficient condition for preservation of the symplectic structure is that the Euler–Lagrange 1-form be closed[GLW01a,GLWW01,GW03].
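For a concrete Lagrangian, the Euler–Lagrange equation (1.10) can be produced symbolically. The following is a minimal sketch (not from the book) using SymPy's `euler_equations` helper; the quartic potential V(q) = q⁴/4 is an illustrative choice:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.Symbol('t')
q = sp.Function('q')(t)

# L = qdot^2/2 - V(q), with the illustrative choice V(q) = q^4/4
L = sp.Rational(1, 2) * q.diff(t)**2 - q**4 / 4

# euler_equations returns dL/dq - d/dt(dL/dqdot) = 0, i.e. equation (1.10)
eq = euler_equations(L, [q], [t])[0]
# for this L the equation reads -q(t)**3 - q''(t) = 0
```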

14.1.2 Total Variation for Lagrangian Mechanics


Consider a general vector field on Q,

$$ V = \xi(t, q)\frac{\partial}{\partial t} + \phi^i(t, q)\frac{\partial}{\partial q^i}, \tag{1.15} $$

where q = (q^1, ..., q^n). Let F^ε be the flow of V. The variations of (t, q^i) ∈ Q are described by

$$ (t, q^i) \longrightarrow F^\varepsilon(t, q^i) = (\tilde t, \tilde q^i), \tag{1.16} $$

where

$$ \tilde t = f(\varepsilon, t, q), \qquad \tilde q^i = g^i(\varepsilon, t, q) \tag{1.17} $$

with

$$ \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0} f(\varepsilon, t, q) = \xi(t, q) := \delta t, \qquad \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0} g^i(\varepsilon, t, q) = \phi^i(t, q) := \delta q^i. \tag{1.18} $$

The deformations (1.17) transform a curve q^i(t) in Q, denoted by C_{ab}, into a set of curves q̃^i(ε, t̃) in Q, denoted by C_ε^{ãb̃}, determined by

$$ \tilde t = f(\varepsilon, t, q(t)), \qquad \tilde q^i = g^i(\varepsilon, t, q(t)). \tag{1.19} $$

Before calculating the total variation of S, we introduce the first-order prolongation of V, denoted pr¹V:

$$ \mathrm{pr}^1 V = \xi(t,q)\frac{\partial}{\partial t} + \phi^i(t,q)\frac{\partial}{\partial q^i} + \alpha^i(t,q,\dot q)\frac{\partial}{\partial \dot q^i}, \tag{1.20} $$

where

$$ \alpha^i(t,q,\dot q) = D_t \phi^i(t,q) - \dot q^i D_t \xi(t,q), \tag{1.21} $$

and D_t denotes the total derivative with respect to t; for example,

$$ D_t \phi^k(t, q^i) = \phi^k_t + \phi^k_{q^i}\dot q^i, \qquad \phi^k_t = \frac{\partial \phi^k}{\partial t}. $$
For prolongations of the vector field and the related formulae, refer to[Olv93] .
Now let us calculate the total variation of S straightforwardly:

$$
\begin{aligned}
\delta S &= \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0} S(\tilde q^i(\varepsilon,\tilde t)) = \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0} \int_{\tilde a}^{\tilde b} L\Big(\tilde t, \tilde q^i(\varepsilon,\tilde t), \frac{\mathrm{d}}{\mathrm{d}\tilde t}\tilde q^i(\varepsilon,\tilde t)\Big)\,\mathrm{d}\tilde t \\
&= \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0} \int_a^b L\Big(\tilde t, \tilde q^i(\varepsilon,\tilde t), \frac{\mathrm{d}}{\mathrm{d}\tilde t}\tilde q^i(\varepsilon,\tilde t)\Big)\frac{\mathrm{d}\tilde t}{\mathrm{d}t}\,\mathrm{d}t \qquad \big(\tilde t = f(\varepsilon,t,q(t))\big) \\
&= \int_a^b \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0} L\Big(\tilde t, \tilde q^i(\varepsilon,\tilde t), \frac{\mathrm{d}}{\mathrm{d}\tilde t}\tilde q^i(\varepsilon,\tilde t)\Big)\,\mathrm{d}t + \int_a^b L\big(t, q^i(t), \dot q^i(t)\big)\,D_t\xi\,\mathrm{d}t \\
&= \int_a^b \left( \frac{\partial L}{\partial t}\xi + \frac{\partial L}{\partial q^i}\phi^i + \frac{\partial L}{\partial \dot q^i}\big(D_t\phi^i - \dot q^i D_t\xi\big) \right)\mathrm{d}t + \int_a^b L\,D_t\xi\,\mathrm{d}t \\
&= \int_a^b \left\{ \left[ \frac{\partial L}{\partial t} + \frac{\mathrm{d}}{\mathrm{d}t}\Big(\frac{\partial L}{\partial \dot q^i}\dot q^i - L\Big) \right]\xi + \left[ \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} \right]\phi^i \right\}\mathrm{d}t \\
&\quad + \left[ \Big(L - \frac{\partial L}{\partial \dot q^i}\dot q^i\Big)\xi + \frac{\partial L}{\partial \dot q^i}\phi^i \right]_a^b. \tag{1.22}
\end{aligned}
$$

Here we have made use of (1.18), (1.20), (1.21) and

$$ \left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0} \frac{\mathrm{d}\tilde t}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t}\left.\frac{\mathrm{d}}{\mathrm{d}\varepsilon}\right|_{\varepsilon=0}\tilde t = D_t\xi. $$
If ξ(a, q(a)) = ξ(b, q(b)) = 0 and φ^i(a, q(a)) = φ^i(b, q(b)) = 0, the requirement δS = 0 yields, from ξ, the variation along the base manifold (i.e., time), the equation

$$ \frac{\partial L}{\partial t} + \frac{\mathrm{d}}{\mathrm{d}t}\Big( \frac{\partial L}{\partial \dot q^i}\dot q^i - L \Big) = 0, \tag{1.23} $$

and, from φ^i, the variation along the fiber (i.e., the configuration space), the Euler–Lagrange equation

$$ \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} = 0. \tag{1.24} $$
Here ξ and φ^i are regarded as independent components of the total variation. However, there is another decomposition into independent components, namely the vertical and horizontal variations; see Remark 1.2 below.

If L does not depend on t explicitly, i.e., L is conservative, ∂L/∂t = 0, then (1.23) becomes the energy conservation law

$$ \frac{\mathrm{d}}{\mathrm{d}t} H = 0, \qquad H := \frac{\partial L}{\partial \dot q^i}\dot q^i - L. \tag{1.25} $$

By expanding the left-hand side of (1.25), we obtain

$$ \frac{\mathrm{d}}{\mathrm{d}t}\Big( \frac{\partial L}{\partial \dot q^i}\dot q^i - L \Big) = -\Big( \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} \Big)\dot q^i. \tag{1.26} $$

Thus, for a conservative L, energy conservation is a consequence of the Euler–Lagrange equation. This agrees with Noether's theorem, which states that the characteristic of an infinitesimal symmetry of the action functional S is that of a conservation law for the Euler–Lagrange equation. For a conservative L, ∂/∂t is an infinitesimal symmetry of the action functional S, and its characteristic is −q̇^i. By Noether's theorem, there exists a corresponding conservation law in the characteristic form

$$ -\Big( \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} \Big)\dot q^i = 0. \tag{1.27} $$
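The identity (1.26) can be verified symbolically for a sample conservative Lagrangian. A small sketch (not from the book; SymPy, with the quartic potential again an illustrative choice):

```python
import sympy as sp

t = sp.Symbol('t')
q = sp.Function('q')(t)
qd = q.diff(t)

L = sp.Rational(1, 2) * qd**2 - q**4 / 4       # conservative L: no explicit t

H = qd * sp.diff(L, qd) - L                    # the energy, as in (1.25)
lhs = H.diff(t)                                # d/dt (qdot*dL/dqdot - L)
el = sp.diff(L, q) - sp.diff(L, qd).diff(t)    # Euler-Lagrange expression
rhs = -el * qd                                 # right-hand side of (1.26)
# lhs - rhs simplifies to zero, confirming (1.26) for this L
```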

If we drop the requirement

$$ \xi(a, q(a)) = \xi(b, q(b)) = 0, \qquad \phi^i(a, q(a)) = \phi^i(b, q(b)) = 0, \tag{1.28} $$

we can define the extended Lagrangian 1-form on Q^(1) from the second term in (1.22):

$$ \vartheta_L := \Big( L - \frac{\partial L}{\partial \dot q^i}\dot q^i \Big)\,\mathrm{d}t + \frac{\partial L}{\partial \dot q^i}\,\mathrm{d}q^i. \tag{1.29} $$

Suppose g^i(t, v_q^i) is a solution of (1.24) depending on the initial condition v_q^i ∈ Q^(1). Restricting q̃^i(ε, t̃) to the solution space of (1.24) and using the same method as in [MPS98], it can be proved that the extended symplectic 2-form is preserved:

$$ (\mathrm{pr}^1 g^i)^*\,\Omega_L = \Omega_L, \qquad \Omega_L := \mathrm{d}\vartheta_L, \tag{1.30} $$

where pr¹g^i(s, v_q^i) = ( s, g^i(s, v_q^i), (d/ds) g^i(s, v_q^i) ) denotes the first-order prolongation of g^i(s, v_q^i)[Olv93].

Remark 1.1. If ξ in (1.15) is independent of q, the deformations in (1.17) are called fiber-preserving. In this case, the domain of definition of q̃^i(ε, t̃) depends only on the deformations in (1.17), while in the general case it depends not only on the deformations in (1.17) but also on q^i(t).
Remark 1.2. Using the identity

$$ \frac{\partial L}{\partial t} + \frac{\mathrm{d}}{\mathrm{d}t}\Big( \frac{\partial L}{\partial \dot q^i}\dot q^i - L \Big) = -\Big( \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} \Big)\dot q^i, \tag{1.31} $$

Equation (1.22) becomes

$$ \delta S = \int_a^b \Big( \frac{\partial L}{\partial q^i} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q^i} \Big)\big(\phi^i - \xi\dot q^i\big)\,\mathrm{d}t + \Big[ \frac{\partial L}{\partial \dot q^i}\big(\phi^i - \xi\dot q^i\big) \Big]_a^b + \big[ L\xi \big]_a^b. \tag{1.32} $$

According to (1.18), φ^i = δq^i should be regarded as the total variation of q^i, δq^i = δ_V q^i + δ_H q^i, since the variation of t also induces a variation of q^i, denoted δ_H q^i, the horizontal variation of q^i. Substituting ξ = δt from (1.18), the horizontal variation of q^i is δ_H q^i = ξq̇^i, and consequently φ^i − ξq̇^i is interpreted as the vertical variation δ_V q^i, i.e., the variation of q^i(t) at the moment t (see, e.g., [CH53]). Therefore, the first two terms in (1.32) come from the vertical variation δ_V q^i, and the last term comes from the horizontal variation δt. The horizontal variation of S with respect to δ_H q^i = ξq̇^i gives rise to the identity (1.31).

14.1.3 Discrete Mechanics and Variational Integrators


In this subsection, by calculus of discrete total variations, we will develop a discrete
Lagrangian mechanics, which includes the boundary terms in Lee’s discrete mechan-
ics that give rise to the discrete version of symplectic preserving. The discrete varia-
tion calculus is mainly analog to Lee’s idea that time (discrete) is regarded as a dy-
namical variable, i.e., the time steps are variable[Lee82,Lee87] . The vertical part of this
discrete variation calculus is similar to the one in[KMO99,MV91,Ves88,Ves91a,WM97] . Using
this calculus for discrete total variations we naturally derive the Kane–Marsden–Ortiz
integrators.
We use Q × Q to denote the discrete version of the first prolongation for the
extended configuration space Q. A point (t0 , q0 ; t1 , q1 ) ∈ Q × Q 1 , corresponds to a
q − q0
tangent vector 1 . A discrete Lagrangian is defined to be L : Q × Q → R and
t 1 − t0
the corresponding action as

N −1
S= L(tk , qk , tk+1 , qk+1 )(tk+1 − tk ). (1.33)
k=0

The discrete variational principle in total variation is to extremize S under variations of both q_k and t_k with fixed endpoints (t₀, q₀) and (t_N, q_N). This discrete variational principle determines a discrete flow Φ : Q × Q → Q × Q by

$$ \Phi(t_{k-1}, q_{k-1}, t_k, q_k) = (t_k, q_k, t_{k+1}, q_{k+1}). \tag{1.34} $$

Here (t_{k+1}, q_{k+1}) are calculated from the following discrete Euler–Lagrange equation, i.e., the variational integrator, and the discrete energy conservation law (for a conservative L):
¹ In this section, q is an abbreviation of (q^1, q^2, ..., q^n).
$$ (t_{k+1} - t_k)\,D_2 L(t_k, q_k, t_{k+1}, q_{k+1}) + (t_k - t_{k-1})\,D_4 L(t_{k-1}, q_{k-1}, t_k, q_k) = 0, \tag{1.35} $$

and

$$ (t_{k+1} - t_k)\,D_1 L(t_k, q_k, t_{k+1}, q_{k+1}) + (t_k - t_{k-1})\,D_3 L(t_{k-1}, q_{k-1}, t_k, q_k) $$
$$ \qquad -\,L(t_k, q_k, t_{k+1}, q_{k+1}) + L(t_{k-1}, q_{k-1}, t_k, q_k) = 0, \tag{1.36} $$

for all k ∈ {1, 2, ..., N − 1}. Here D_i denotes the partial derivative of L with respect to the i-th argument. Equation (1.35) is the discrete Euler–Lagrange equation, and Equation (1.36) is the discrete energy conservation law for a conservative L. The scheme (1.35)–(1.36) is the Kane–Marsden–Ortiz integrator.
In terms of the discrete flow Φ, Equations (1.35) and (1.36) become

$$ (t_{k+1} - t_k)\,D_2 L \circ \Phi + (t_k - t_{k-1})\,D_4 L = 0, \tag{1.37} $$
$$ \big( (t_{k+1} - t_k)\,D_1 L - L \big) \circ \Phi + (t_k - t_{k-1})\,D_3 L + L = 0, \tag{1.38} $$

respectively. If (t_{k+1} − t_k)D_2L and (t_{k+1} − t_k)D_1L − L are invertible, Equations (1.37) and (1.38) determine the discrete flow Φ under the consistency condition

$$ \big( (t_{k+1}-t_k)\,D_1L - L \big)^{-1} \circ \big( (t_k - t_{k-1})\,D_3L + L \big) = \big( (t_{k+1}-t_k)\,D_2L \big)^{-1} \circ (t_k - t_{k-1})\,D_4L. \tag{1.39} $$

Now we prove that the discrete flow Φ preserves a discrete version of the extended Lagrange 2-form Ω_L. As in the continuous case, we calculate dS for variations with variable endpoints:

$$
\begin{aligned}
&\mathrm{d}S(t_0, q_0, \ldots, t_N, q_N)\cdot(\delta t_0, \delta q_0, \ldots, \delta t_N, \delta q_N) \\
&= \sum_{k=0}^{N-1} \big( D_2L(t_k,q_k,t_{k+1},q_{k+1})\,\delta q_k + D_4L(t_k,q_k,t_{k+1},q_{k+1})\,\delta q_{k+1} \big)(t_{k+1}-t_k) \\
&\quad + \sum_{k=0}^{N-1} \big( D_1L(t_k,q_k,t_{k+1},q_{k+1})\,\delta t_k + D_3L(t_k,q_k,t_{k+1},q_{k+1})\,\delta t_{k+1} \big)(t_{k+1}-t_k) \\
&\quad + \sum_{k=0}^{N-1} L(t_k,q_k,t_{k+1},q_{k+1})\,(\delta t_{k+1} - \delta t_k) \\
&= \sum_{k=0}^{N-1} D_2L(t_k,q_k,t_{k+1},q_{k+1})(t_{k+1}-t_k)\,\delta q_k + \sum_{k=1}^{N} D_4L(t_{k-1},q_{k-1},t_k,q_k)(t_k-t_{k-1})\,\delta q_k \\
&\quad + \sum_{k=0}^{N-1} D_1L(t_k,q_k,t_{k+1},q_{k+1})(t_{k+1}-t_k)\,\delta t_k + \sum_{k=1}^{N} D_3L(t_{k-1},q_{k-1},t_k,q_k)(t_k-t_{k-1})\,\delta t_k \\
&\quad + \sum_{k=0}^{N-1} L(t_k,q_k,t_{k+1},q_{k+1})\,(-\delta t_k) + \sum_{k=1}^{N} L(t_{k-1},q_{k-1},t_k,q_k)\,\delta t_k \\
&= \sum_{k=1}^{N-1} \big( D_2L(t_k,q_k,t_{k+1},q_{k+1})(t_{k+1}-t_k) + D_4L(t_{k-1},q_{k-1},t_k,q_k)(t_k-t_{k-1}) \big)\,\delta q_k \\
&\quad + \sum_{k=1}^{N-1} \big( D_1L(t_k,q_k,t_{k+1},q_{k+1})(t_{k+1}-t_k) + D_3L(t_{k-1},q_{k-1},t_k,q_k)(t_k-t_{k-1}) \\
&\qquad\qquad + L(t_{k-1},q_{k-1},t_k,q_k) - L(t_k,q_k,t_{k+1},q_{k+1}) \big)\,\delta t_k \tag{1.40} \\
&\quad + D_2L(t_0,q_0,t_1,q_1)(t_1-t_0)\,\delta q_0 + D_4L(t_{N-1},q_{N-1},t_N,q_N)(t_N-t_{N-1})\,\delta q_N \\
&\quad + \big( D_1L(t_0,q_0,t_1,q_1)(t_1-t_0) - L(t_0,q_0,t_1,q_1) \big)\,\delta t_0 \\
&\quad + \big( D_3L(t_{N-1},q_{N-1},t_N,q_N)(t_N-t_{N-1}) + L(t_{N-1},q_{N-1},t_N,q_N) \big)\,\delta t_N.
\end{aligned}
$$

The last four terms in (1.40) come from the boundary variations. Based on these boundary variations, we define two 1-forms on Q × Q:

$$ \theta_L^-(t_k, q_k, t_{k+1}, q_{k+1}) = \big( D_1L(t_k,q_k,t_{k+1},q_{k+1})(t_{k+1}-t_k) - L(t_k,q_k,t_{k+1},q_{k+1}) \big)\,\mathrm{d}t_k $$
$$ \qquad +\, D_2L(t_k,q_k,t_{k+1},q_{k+1})(t_{k+1}-t_k)\,\mathrm{d}q_k, \tag{1.41} $$

and

$$ \theta_L^+(t_k, q_k, t_{k+1}, q_{k+1}) = \big( D_3L(t_k,q_k,t_{k+1},q_{k+1})(t_{k+1}-t_k) + L(t_k,q_k,t_{k+1},q_{k+1}) \big)\,\mathrm{d}t_{k+1} $$
$$ \qquad +\, D_4L(t_k,q_k,t_{k+1},q_{k+1})(t_{k+1}-t_k)\,\mathrm{d}q_{k+1}, \tag{1.42} $$

following the notation of [MPS98]. We regard the pair (θ_L^−, θ_L^+) as the discrete version of the extended Lagrange 1-form ϑ_L defined in (1.29).
Now we parameterize the solutions of the discrete variational principle by the initial condition (t₀, q₀, t₁, q₁) and restrict S to this solution space. Then Equation (1.40) becomes

$$
\begin{aligned}
&\mathrm{d}S(t_0, q_0, \ldots, t_N, q_N)\cdot(\delta t_0, \delta q_0, \ldots, \delta t_N, \delta q_N) \\
&= \theta_L^-(t_0,q_0,t_1,q_1)\cdot(\delta t_0,\delta q_0,\delta t_1,\delta q_1) + \theta_L^+(t_{N-1},q_{N-1},t_N,q_N)\cdot(\delta t_{N-1},\delta q_{N-1},\delta t_N,\delta q_N) \\
&= \theta_L^-(t_0,q_0,t_1,q_1)\cdot(\delta t_0,\delta q_0,\delta t_1,\delta q_1) + (\Phi^{N-1})^*\theta_L^+(t_0,q_0,t_1,q_1)\cdot(\delta t_0,\delta q_0,\delta t_1,\delta q_1). \tag{1.43}
\end{aligned}
$$

From (1.43), we obtain

$$ \mathrm{d}S = \theta_L^- + (\Phi^{N-1})^*\theta_L^+. \tag{1.44} $$

Equation (1.44) holds for arbitrary N > 1. Taking N = 2, we get
$$ \mathrm{d}S = \theta_L^- + \Phi^*\theta_L^+. \tag{1.45} $$

By exterior differentiation of (1.45), we obtain

$$ \Phi^*(\mathrm{d}\theta_L^+) = -\mathrm{d}\theta_L^-. \tag{1.46} $$

From the definitions of θ_L^− and θ_L^+, we know that

$$ \theta_L^- + \theta_L^+ = \mathrm{d}\big( L\,(t_{k+1} - t_k) \big). \tag{1.47} $$

By exterior differentiation of (1.47), we obtain dθ_L^+ = −dθ_L^−. Define

$$ \Omega_L \equiv \mathrm{d}\theta_L^+ = -\mathrm{d}\theta_L^-. \tag{1.48} $$

Finally, we have shown that the discrete flow Φ preserves the discrete extended Lagrange 2-form Ω_L:

$$ \Phi^*(\Omega_L) = \Omega_L. \tag{1.49} $$
Moreover, the variational integrator (1.35), the discrete energy conservation law (1.36), and the discrete extended Lagrange 2-form Ω_L converge to their continuous counterparts as t_{k+1} → t_k and t_{k−1} → t_k.
Consider a conservative Lagrangian L(q, q̇). For simplicity, we choose the discrete Lagrangian as

$$ L(t_k, q_k, t_{k+1}, q_{k+1}) = L\Big( q_k,\ \frac{q_{k+1}-q_k}{t_{k+1}-t_k} \Big). \tag{1.50} $$

The variational integrator (1.35) becomes

$$ \frac{\partial L}{\partial q_k}(q_k, \Delta_t q_k) - \frac{1}{t_{k+1}-t_k}\left( \frac{\partial L}{\partial \Delta_t q_k}(q_k, \Delta_t q_k) - \frac{\partial L}{\partial \Delta_t q_{k-1}}(q_{k-1}, \Delta_t q_{k-1}) \right) = 0, \tag{1.51} $$

where Δ_t q_k = (q_{k+1} − q_k)/(t_{k+1} − t_k) and Δ_t q_{k−1} = (q_k − q_{k−1})/(t_k − t_{k−1}).

It is easy to see that, as t_{k+1} → t_k and t_{k−1} → t_k, Equation (1.51) converges to

$$ \frac{\partial L}{\partial q_k} - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial \dot q_k} = 0. \tag{1.52} $$
The discrete energy conservation law (1.36) becomes

$$ \frac{E_{k+1} - E_k}{t_{k+1} - t_k} = 0, \tag{1.53} $$

where

$$ E_{k+1} = \frac{\partial L}{\partial \Delta_t q_k}\,\Delta_t q_k - L\Big( q_k,\ \frac{q_{k+1}-q_k}{t_{k+1}-t_k} \Big), \qquad E_k = \frac{\partial L}{\partial \Delta_t q_{k-1}}\,\Delta_t q_{k-1} - L\Big( q_{k-1},\ \frac{q_k-q_{k-1}}{t_k-t_{k-1}} \Big). $$

Equation (1.53) converges to

$$ \frac{\mathrm{d}}{\mathrm{d}t}\Big( \frac{\partial L}{\partial \dot q_k}\dot q_k - L \Big) = 0 \tag{1.54} $$

as t_{k+1} → t_k, t_{k−1} → t_k.
Now consider the discrete extended Lagrange 2-form Ω_L defined by (1.48). With the discretization (1.50), the discrete extended Lagrange 1-form θ_L^+ defined in (1.42) becomes

$$ \theta_L^+ = \Big( L(q_k, \Delta_t q_k) - \frac{\partial L}{\partial \Delta_t q_k}\,\Delta_t q_k \Big)\,\mathrm{d}t_{k+1} + \frac{\partial L}{\partial \Delta_t q_k}\,\mathrm{d}q_{k+1}. \tag{1.55} $$

From (1.55), we deduce that θ_L^+ converges to the continuous Lagrangian 1-form ϑ_L defined by (1.29) as t_{k+1} → t_k, t_{k−1} → t_k. Thus, we obtain

$$ \Omega_L = \mathrm{d}\theta_L^+ \longrightarrow \mathrm{d}\vartheta_L = \Omega_L, \qquad t_{k+1} \to t_k,\ t_{k-1} \to t_k. \tag{1.56} $$
In general, the variational integrator (1.35) with fixed time steps does not exactly conserve the discrete energy, although the computed energy exhibits no secular variation[GM88,SSC94]. In some cases, such as in the discrete mechanics proposed by Lee[Lee82,Lee87], the integrator (1.35) is required to conserve the discrete energy (1.36) by varying the time steps. In other words, the time steps can be chosen according to (1.36) so that the integrator (1.35) conserves the discrete energy. The resulting integrator also conserves the discrete extended Lagrange 2-form dθ_L^+, a fact that had not been discussed in Lee's discrete mechanics.
Example 1.3. For the classical Lagrangian

$$ L(t, q, \dot q) = \frac{1}{2}\dot q^2 - V(q), \tag{1.57} $$

we choose the discrete Lagrangian L(t_k, q_k, t_{k+1}, q_{k+1}) as

$$ L(t_k, q_k, t_{k+1}, q_{k+1}) = \frac{1}{2}\Big( \frac{q_{k+1}-q_k}{t_{k+1}-t_k} \Big)^2 - V\Big( \frac{q_{k+1}+q_k}{2} \Big). \tag{1.58} $$

The discrete Euler–Lagrange equation (1.35) becomes

$$ \frac{q_{k+1}-q_k}{t_{k+1}-t_k} - \frac{q_k-q_{k-1}}{t_k-t_{k-1}} + \frac{V'(\bar q_k)(t_{k+1}-t_k) + V'(\bar q_{k-1})(t_k-t_{k-1})}{2} = 0, \tag{1.59} $$

which preserves the Lagrange 2-form

$$ \Big( \frac{1}{t_{k+1}-t_k} + \frac{t_{k+1}-t_k}{4}\,V''(\bar q_k) \Big)\,\mathrm{d}q_{k+1}\wedge\mathrm{d}q_k, \tag{1.60} $$

where q̄_k = (q_k + q_{k+1})/2 and q̄_{k−1} = (q_{k−1} + q_k)/2.

If we take fixed time steps t_{k+1} − t_k = t_k − t_{k−1} = h, then (1.59) becomes

$$ \frac{q_{k+1} - 2q_k + q_{k-1}}{h^2} + \frac{V'(\bar q_k) + V'(\bar q_{k-1})}{2} = 0, $$

which preserves the Lagrange 2-form

$$ \Big( \frac{1}{h} + \frac{h}{4}\,V''(\bar q_k) \Big)\,\mathrm{d}q_{k+1}\wedge\mathrm{d}q_k. $$
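For this example, the pair consisting of (1.59) and the discrete energy constraint can be solved step by step as an initial value problem, in the spirit of the Kane–Marsden–Ortiz integrator. The following is a minimal sketch, not from the book: the harmonic potential V(q) = q²/2, the initial step size, and the use of `scipy.optimize.fsolve` to solve the two nonlinear equations for (q_{k+1}, h_k) at each step are all illustrative choices.

```python
import numpy as np
from scipy.optimize import fsolve

V = lambda q: 0.5 * q**2      # illustrative potential; exact orbit is q(t) = sin t
dV = lambda q: q

def kmo_step(tkm1, qkm1, tk, qk, E0):
    """Advance one step by solving (1.59) together with the discrete
    energy constraint (1/2)*Delta_k^2 + V(qbar_k) = E0."""
    hkm1 = tk - tkm1
    dkm1 = (qk - qkm1) / hkm1

    def residual(x):
        qk1, hk = x
        dk = (qk1 - qk) / hk
        qb, qbm = 0.5 * (qk + qk1), 0.5 * (qkm1 + qk)
        r1 = dk - dkm1 + 0.5 * (dV(qb) * hk + dV(qbm) * hkm1)  # equation (1.59)
        r2 = 0.5 * dk**2 + V(qb) - E0                          # discrete energy
        return [r1, r2]

    qk1, hk = fsolve(residual, [2 * qk - qkm1, hkm1])
    return tk + hk, qk1

# start away from a turning point; E0 = 1/2 matches q(t) = sin t
E0, t0, q0, h0 = 0.5, 0.0, 0.0, 0.05
q1 = fsolve(lambda x: 0.5 * ((x - q0) / h0)**2 + V(0.5 * (q0 + x)) - E0, h0)[0]
ts, qs = [t0, t0 + h0], [q0, q1]
for _ in range(20):
    tn, qn = kmo_step(ts[-2], qs[-2], ts[-1], qs[-1], E0)
    ts.append(tn); qs.append(qn)
```

By construction, the discrete energy (1/2)Δ_k² + V(q̄_k) is conserved on every interval up to the solver tolerance, while the step sizes h_k adjust themselves; near turning points (Δ_k → 0) the two equations degenerate, which is the known limitation of such schemes.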

14.1.4 Concluding Remarks


We have presented in this section a calculus of discrete total variations for discrete mechanics with variable time steps, modelled on continuous mechanics. Using this calculus, we have proved that Lee's discrete mechanics is symplectic and have derived the Kane–Marsden–Ortiz integrators. It is well known that an energy-preserving variational integrator is a preferable and natural candidate for approximating a conservative Euler–Lagrange equation, since the solution of a conservative Euler–Lagrange equation is not only symplectic but also energy-preserving.

As mentioned, the Kane–Marsden–Ortiz integrators are closely related to the discrete mechanics proposed by Lee[Lee82,Lee87]. In Lee's discrete mechanics, the difference equations are the same as the Kane–Marsden–Ortiz integrators. However, Lee's difference equations are solved as boundary value problems, while the Kane–Marsden–Ortiz integrators are solved as initial value problems.

Finally, it should be mentioned that in very recent works[GLW01a,GLWW01,GW03], two of the authors (HYG and KW) and their collaborators have presented a difference discrete variational calculus and a discrete version of Euler–Lagrange cohomology for vertical variation problems, in both the Lagrangian and Hamiltonian formalisms, for discrete mechanics and field theory. In their approach, the difference operator with fixed step-length is regarded as an entire geometric object. The advantages of this approach have already been seen in the preceding subsection in the course of taking continuous limits, although the difference operator Δ_t in (1.50) has variable step-length. This approach may be generalized to discrete total variation problems.

14.2 Total Variation in Hamiltonian Formalism


In this section we present a discrete total variation calculus in the Hamiltonian formalism. Using this discrete variational calculus and generating functions for the flows of Hamiltonian systems, we derive symplectic-energy integrators of any finite order for Hamiltonian systems from a variational perspective. The relationship between the symplectic integrators derived directly from the Hamiltonian systems and the variationally derived symplectic-energy integrators is explored.

14.2.1 Variational Principle in Hamiltonian Mechanics


Let us begin by recalling the ordinary variational principle in the Hamiltonian formalism. Suppose Q denotes the configuration space with coordinates q^i, and T*Q the phase space with coordinates (q^i, p_i) (i = 1, 2, ..., n). Consider a Hamiltonian H : T*Q → R. The corresponding action functional is defined by

$$ S\big( (q^i(t), p_i(t)) \big) = \int_a^b \big( p_i\,\dot q^i - H(q^i, p_i) \big)\,\mathrm{d}t, \tag{2.1} $$

where (q^i(t), p_i(t)) is a C² curve in the phase space T*Q.


592 14. Lee-Variational Integrator

The variational principle in Hamiltonian formalism seeks the curves (q i (t), pi (t))
for which the action functional S is stationary under variations of (q i (t), pi (t)) with
fixed end points. We will first define the variation of (q i (t), pi (t)).
Let
 n
∂ 
n

V = φi (qq , p ) i + ψ i (qq , p ) i , (2.2)
∂q ∂p
i=1 i=1

be a vector field on T ∗ Q, here q = (q 1 , · · · , q n ), p = (p1 , · · · , pC


n
). For simplicity, we
will use Einstein convention and omit the summation notation in the following.
Let us denote the flow of V by F^ε: F^ε(q, p) = (q̃, p̃), written in components as

    q̃^i = f^i(ε, q, p),    (2.3)
    p̃_i = g_i(ε, q, p),    (2.4)

where (q, p) ∈ T*Q and

    d/dε|_{ε=0} f^i(ε, q, p) = φ^i(q, p),
    d/dε|_{ε=0} g_i(ε, q, p) = ψ_i(q, p).

Let (q^i(t), p_i(t)) be a curve in T*Q. The transformations (2.3) and (2.4) transform (q^i(t), p_i(t)) into a family of curves

    ( q̃^i(t), p̃_i(t) ) = ( f^i(ε, q(t), p(t)), g_i(ε, q(t), p(t)) ).

Next, we define the variation of (q^i(t), p_i(t)):

    δ( q^i(t), p_i(t) ) := d/dε|_{ε=0} ( q̃^i(t), p̃_i(t) ) = ( φ^i(q, p), ψ_i(q, p) ).    (2.5)
 
Next, we calculate the variation of S at (q^i(t), p_i(t)):

    δS = d/dε|_{ε=0} S( q̃^i(t), p̃_i(t) )
       = d/dε|_{ε=0} S( f^i(ε, q(t), p(t)), g_i(ε, q(t), p(t)) )
       = d/dε|_{ε=0} ∫_a^b [ g_i(ε, q(t), p(t)) (d/dt) f^i(ε, q(t), p(t)) − H( f^i(ε, q(t), p(t)), g_i(ε, q(t), p(t)) ) ] dt
       = ∫_a^b [ ( q̇^i − ∂H/∂p_i ) ψ_i + ( −ṗ_i − ∂H/∂q^i ) φ^i ] dt + p_i φ^i |_a^b.    (2.6)
If φ^i(q(a), p(a)) = φ^i(q(b), p(b)) = 0, the requirement δS = 0 yields the Hamilton equations for (q^i(t), p_i(t)):

    q̇^i = ∂H/∂p_i,    ṗ_i = −∂H/∂q^i.    (2.7)
   
If we drop the requirement φ^i(q(a), p(a)) = φ^i(q(b), p(b)) = 0, we naturally obtain the canonical 1-form on T*Q from the second term in (2.6): θ = p_i dq^i. Furthermore, restricting (q̃^i(t), p̃_i(t)) to the solution space of (2.7), we can prove that the solution of (2.7) preserves the canonical 2-form ω = dθ = dp_i ∧ dq^i.
On the other hand, it is not necessary to restrict (q̃^i(t), p̃_i(t)) to the solution space of (2.7). Introducing the Euler–Lagrange 1-form

    E(q^i, p_i) = ( q̇^i − ∂H/∂p_i ) dp_i + ( −ṗ_i − ∂H/∂q^i ) dq^i,    (2.8)

the nilpotency of d leads to

    dE(q^i, p_i) + (d/dt) ω = 0,    (2.9)

namely, the necessary and sufficient condition for preservation of the symplectic structure is that the Euler–Lagrange 1-form (2.8) be closed [GLW01a, GLWW01, GLW01b, GW03].
Based on the above-given variational principle in Hamiltonian formalism and
using the ideas of discrete Lagrange mechanics[Ves88,Ves91b,MPS98,WM97] , we can de-
velop a natural version of discrete Hamilton mechanics with fixed time steps and
derive symplectic integrators for Hamilton canonical equations from a variational
perspective[GLWW01] .
However, the symplectic integrators obtained in this way are, in general, not energy-preserving because of their fixed time steps [GM88]. An energy-preserving symplectic integrator is a preferable and natural candidate for approximating conservative Hamiltonian equations, since the solutions of such equations are not only symplectic but also energy-preserving. To attain this goal, we use variable time steps and the discrete total variation calculus developed in [Lee82, Lee87, KMO99, CGW03]. The basic idea is to construct a discrete action functional with variable time steps and then apply the discrete total variation calculus. In this way, we can derive symplectic integrators together with their associated energy conservation laws. These variationally derived symplectic integrators are two-step integrators. With fixed time steps, the resulting integrators are, in some special cases, equivalent to the symplectic integrators derived directly from the Hamiltonian systems.

14.2.2 Total Variation in Hamiltonian Mechanics


In order to discuss total variation in Hamiltonian formalism, we will work with ex-
tended phase space R × T ∗ Q with coordinates (t, q i , pi ). Here t denotes time. For
details, see [Arn89,GPS02] . By total variation, we refer to variations of both (q i , pi ) and t.
Consider a vector field on R × T*Q,

    V = ξ(t, q, p) ∂/∂t + φ^i(t, q, p) ∂/∂q^i + ψ_i(t, q, p) ∂/∂p_i.    (2.10)

Let F ε be the flow of V . For (t, q i , pi ) ∈ R × T ∗ Q, we have F ε (t, q i , pi ) = (t̃, q̃ i , p̃i ):

t̃ = h(ε, t, q , p ), (2.11)
q̃ i = f i (ε, t, q , p ), (2.12)
p̃i = g i (ε, t, q , p ), (2.13)

where

    d/dε|_{ε=0} h(ε, t, q, p) = ξ(t, q, p),    (2.14)
    d/dε|_{ε=0} f^i(ε, t, q, p) = φ^i(t, q, p),    (2.15)
    d/dε|_{ε=0} g_i(ε, t, q, p) = ψ_i(t, q, p).    (2.16)

The transformation (2.11) – (2.13) transforms a curve (q i (t), pi (t)) into a family of
curves (q̃ i (ε, t̃), p̃i (ε, t̃)) determined by
 
t̃ = h ε, t, q (t), p (t) , (2.17)
i i
 
q̃ = f ε, t, q (t), p (t) , (2.18)
i i
 
p̃ = g ε, t, q (t), p (t) . (2.19)

Suppose we can solve (2.17) for t: t = h^{−1}(ε, t̃). Then,

    q̃^i(ε, t̃) = f^i(ε, h^{−1}(ε, t̃), q(h^{−1}(ε, t̃)), p(h^{−1}(ε, t̃))),    (2.20)
    p̃_i(ε, t̃) = g_i(ε, h^{−1}(ε, t̃), q(h^{−1}(ε, t̃)), p(h^{−1}(ε, t̃))).    (2.21)
Before calculating the variation of S directly, we first consider the first-order prolongation of V,

    pr¹V = ξ(t, q, p) ∂/∂t + φ^i(t, q, p) ∂/∂q^i + ψ_i(t, q, p) ∂/∂p_i + α^i(t, q, p, q̇, ṗ) ∂/∂q̇^i + β_i(t, q, p, q̇, ṗ) ∂/∂ṗ_i,    (2.22)

where pr¹V denotes the first-order prolongation of V and

    α^i(t, q, p, q̇, ṗ) = D_t φ^i(t, q, p) − q̇^i D_t ξ(t, q, p),    (2.23)
    β_i(t, q, p, q̇, ṗ) = D_t ψ_i(t, q, p) − ṗ_i D_t ξ(t, q, p),    (2.24)

where D_t denotes the total derivative; for example,

    D_t φ^i(t, q, p) = φ^i_t + φ^i_q q̇ + φ^i_p ṗ.

For the prolongation of vector fields and the formulae (2.23) and (2.24), refer to [Olv93].
Now, let us calculate the variation of S directly:

    δS = d/dε|_{ε=0} S( q̃^i(ε, t̃), p̃_i(ε, t̃) )

       = d/dε|_{ε=0} ∫_ã^b̃ [ p̃_i(ε, t̃) (d/dt̃) q̃^i(ε, t̃) − H( q̃^i(ε, t̃), p̃_i(ε, t̃) ) ] dt̃

       = d/dε|_{ε=0} ∫_a^b [ p̃_i(ε, t̃) (d/dt̃) q̃^i(ε, t̃) − H( q̃^i(ε, t̃), p̃_i(ε, t̃) ) ] (dt̃/dt) dt,    t̃ = h(ε, t, q(t), p(t)),

       = ∫_a^b d/dε|_{ε=0} [ p̃_i(ε, t̃) (d/dt̃) q̃^i(ε, t̃) − H( q̃^i(ε, t̃), p̃_i(ε, t̃) ) ] dt
         + ∫_a^b [ p_i(t) q̇^i(t) − H(q^i(t), p_i(t)) ] D_t ξ dt    (2.25)

       = ∫_a^b [ (d/dt) H(q^i(t), p_i(t)) ξ + ( −ṗ_i − ∂H/∂q^i ) φ^i + ( q̇^i − ∂H/∂p_i ) ψ_i ] dt
         + [ p_i φ^i − H(q^i, p_i) ξ ]_a^b.    (2.26)
In (2.25), we have used (2.14) and the fact

    d/dε|_{ε=0} (dt̃/dt) = (d/dt) d/dε|_{ε=0} t̃ = D_t ξ.

In (2.26), we have used the prolongation formula (2.23).
   
If ξ(a, q(a), p(a)) = ξ(b, q(b), p(b)) = 0 and φ^i(a, q(a), p(a)) = φ^i(b, q(b), p(b)) = 0, the requirement δS = 0 yields the Hamilton canonical equations

    q̇^i = ∂H/∂p_i,    ṗ_i = −∂H/∂q^i    (2.27)

from the variations φ^i, ψ_i, and the energy conservation law

    (d/dt) H(q^i, p_i) = 0    (2.28)

from the variation ξ. Since

    (d/dt) H(q^i, p_i) = (∂H/∂q^i) q̇^i + (∂H/∂p_i) ṗ_i,

we can readily see that the energy conservation law (2.28) is a natural consequence of the Hamilton canonical equations (2.27).
If we drop the requirements

    ξ(a, q(a), p(a)) = ξ(b, q(b), p(b)) = 0,
    φ^i(a, q(a), p(a)) = φ^i(b, q(b), p(b)) = 0,

we can define the extended canonical 1-form on R × T*Q from the second term in (2.26):

    θ = p_i dq^i − H(q^i, p_i) dt.    (2.29)
Furthermore, restricting (q̃^i(t), p̃_i(t)) to the solution space of (2.27), we can prove that the solution of (2.27) preserves the extended canonical 2-form

    ω = dθ = dp_i ∧ dq^i − dH(q^i, p_i) ∧ dt    (2.30)

by the same method as in [MPS98].

14.2.3 Symplectic-Energy Integrators


In this section, we will develop a discrete version of total variation in Hamiltonian for-
malism. Using this discrete total variation calculus, we will derive symplectic-energy
integrators.
Let

    L(q^i, p_i, q̇^i, ṗ_i) = p_i q̇^i − H(q^i, p_i)

be a function from R × T(T*Q) to R. Here L does not depend on t explicitly.
We use P × P as the discrete version of R × T(T*Q), where P is the discrete version of R × T*Q. A point (t_0, q_0, p_0; t_1, q_1, p_1) ∈ P × P corresponds to a tangent vector

    ( (q_1 − q_0)/(t_1 − t_0), (p_1 − p_0)/(t_1 − t_0) ).

For simplicity, the vector symbols q = (q^1, ..., q^n) and p = (p_1, ..., p_n) are used throughout this section. A discrete Lagrangian is defined to be a map L: P × P → R, and the corresponding discrete action is

    S = Σ_{k=0}^{N−1} L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1})(t_{k+1} − t_k),    (2.31)

where t_0 = a, t_N = b. The discrete variational principle in total variation is to extremize S under variations of both (q_k, p_k) and t_k, holding the end points (t_0, q_0, p_0) and (t_N, q_N, p_N) fixed. This discrete variational principle determines a discrete flow Φ: P × P → P × P by

    Φ(t_{k−1}, q_{k−1}, p_{k−1}, t_k, q_k, p_k) = (t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}).    (2.32)

Here, (t_{k+1}, q_{k+1}, p_{k+1}) for k = 1, 2, ..., N−1 are found from the following discrete Hamilton canonical equation
    (t_{k+1} − t_k) D_2 L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) + (t_k − t_{k−1}) D_5 L(t_{k−1}, q_{k−1}, p_{k−1}, t_k, q_k, p_k) = 0,
    (t_{k+1} − t_k) D_3 L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) + (t_k − t_{k−1}) D_6 L(t_{k−1}, q_{k−1}, p_{k−1}, t_k, q_k, p_k) = 0    (2.33)

and the discrete energy conservation law

    (t_{k+1} − t_k) D_1 L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) + (t_k − t_{k−1}) D_4 L(t_{k−1}, q_{k−1}, p_{k−1}, t_k, q_k, p_k)
    − L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) + L(t_{k−1}, q_{k−1}, p_{k−1}, t_k, q_k, p_k) = 0.    (2.34)
Here D_i denotes the partial derivative of L with respect to its ith argument. Equation (2.33) is the discrete Hamilton canonical equation (variational integrator), and Equation (2.34) is the discrete energy conservation law associated with (2.33). Unlike the continuous case, the variational integrator (2.33) does not, in general, satisfy (2.34) for arbitrarily given t_{k+1}. Therefore, we need to solve (2.33) and (2.34) simultaneously, with q_{k+1}, p_{k+1} and t_{k+1} taken as unknowns.
Now, we will prove that the discrete flow determined by (2.33) and (2.34) preserves a discrete version of the extended canonical 2-form ω defined in (2.30), so that we may call (2.33) and (2.34) a symplectic-energy integrator. We will do this directly from the variational point of view, consistent with the continuous case [MPS98].
As in the continuous case, we calculate dS for variations with varied end points:

    dS(t_0, q_0, p_0, ..., t_N, q_N, p_N) · (δt_0, δq_0, δp_0, ..., δt_N, δq_N, δp_N)

    = Σ_{k=0}^{N−1} [ D_2 L(v_k) δq_k + D_5 L(v_k) δq_{k+1} + D_3 L(v_k) δp_k + D_6 L(v_k) δp_{k+1} ](t_{k+1} − t_k)
      + Σ_{k=0}^{N−1} [ D_1 L(v_k) δt_k + D_4 L(v_k) δt_{k+1} ](t_{k+1} − t_k) + Σ_{k=0}^{N−1} L(v_k)(δt_{k+1} − δt_k)

    = Σ_{k=1}^{N−1} [ D_2 L(v_k)(t_{k+1} − t_k) + D_5 L(v_{k−1})(t_k − t_{k−1}) ] δq_k
      + Σ_{k=1}^{N−1} [ D_3 L(v_k)(t_{k+1} − t_k) + D_6 L(v_{k−1})(t_k − t_{k−1}) ] δp_k
      + Σ_{k=1}^{N−1} [ D_1 L(v_k)(t_{k+1} − t_k) + D_4 L(v_{k−1})(t_k − t_{k−1}) + L(v_{k−1}) − L(v_k) ] δt_k
      + D_2 L(v_0)(t_1 − t_0) δq_0 + D_3 L(v_0)(t_1 − t_0) δp_0 + [ D_1 L(v_0)(t_1 − t_0) − L(v_0) ] δt_0
      + D_5 L(v_{N−1})(t_N − t_{N−1}) δq_N + D_6 L(v_{N−1})(t_N − t_{N−1}) δp_N
      + [ D_4 L(v_{N−1})(t_N − t_{N−1}) − L(v_{N−1}) ] δt_N,    (2.35)
where v_k = (t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) (k = 0, 1, ..., N−1). The last six terms in (2.35) come from the boundary variations. Based on the boundary variations, we can define two 1-forms on P × P:

    θ_L^−(v_k) = D_2 L(v_k)(t_{k+1} − t_k) dq_k + D_3 L(v_k)(t_{k+1} − t_k) dp_k + [ D_1 L(v_k)(t_{k+1} − t_k) − L(v_k) ] dt_k    (2.36)

and

    θ_L^+(v_k) = D_5 L(v_k)(t_{k+1} − t_k) dq_{k+1} + D_6 L(v_k)(t_{k+1} − t_k) dp_{k+1} + [ D_4 L(v_k)(t_{k+1} − t_k) − L(v_k) ] dt_{k+1}.    (2.37)
Here, we have used the notation in [MPS98] . We regard the pair (θL− , θL+ ) as being the
discrete version of the extended canonical 1-form θ defined in (2.29).
Now, we parametrize the solutions of the discrete variational principle by (t_0, q_0, p_0, t_1, q_1, p_1) and restrict S to that solution space. Then, Equation (2.35) becomes

    dS(t_0, q_0, p_0, ..., t_N, q_N, p_N) · (δt_0, δq_0, δp_0, ..., δt_N, δq_N, δp_N)
    = θ_L^−(t_0, q_0, p_0, t_1, q_1, p_1) · (δt_0, δq_0, δp_0, δt_1, δq_1, δp_1)
      + θ_L^+(t_{N−1}, q_{N−1}, p_{N−1}, t_N, q_N, p_N) · (δt_{N−1}, δq_{N−1}, δp_{N−1}, δt_N, δq_N, δp_N)
    = θ_L^−(t_0, q_0, p_0, t_1, q_1, p_1) · (δt_0, δq_0, δp_0, δt_1, δq_1, δp_1)
      + (Φ^{N−1})^* θ_L^+(t_0, q_0, p_0, t_1, q_1, p_1) · (δt_0, δq_0, δp_0, δt_1, δq_1, δp_1).    (2.38)

From (2.38), we can obtain

d S = θL− + (ΦN −1 )∗ θL+ . (2.39)

Equation (2.39) holds for arbitrary N > 1. Taking N = 2, we obtain

d S = θL− + Φ∗ θL+ . (2.40)

By exterior differentiation of (2.40), we obtain

Φ∗ (d θL+ ) = −d θL− . (2.41)

From the definition of θL− and θL+ , we know that

θL− + θL+ = d L. (2.42)

By exterior differentiation of (2.42), we obtain dθL+ = −d θL− . Define

ωL ≡ d θL+ = −d θL− . (2.43)

Finally, we have shown that the discrete flow Φ preserves the discrete extended canon-
ical 2-form ωL :
Φ∗ (ωL ) = ωL . (2.44)
We can now call the coupled difference system (2.33) and (2.34) a symplectic-
energy integrator in the sense that it satisfies the discrete energy conservation law
(2.34) and preserves the discrete extended canonical 2-form ωL .
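As a concrete numerical illustration of such a preservation property (our own sketch, not part of the book's argument), one can check that a linear symplectic map A satisfies AᵀJA = J. The implicit midpoint map for the harmonic oscillator H = (p² + q²)/2 is linear (a Cayley transform) and passes this check:

```python
import numpy as np

h = 0.1
# For H = (p^2 + q^2)/2 with z = (q, p), Hamilton's equations read dz/dt = S z.
S = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
# The implicit midpoint rule z' = z + h*S*(z + z')/2 gives the linear map
# z' = A z, with A the Cayley transform of (h/2) S.
A = np.linalg.solve(np.eye(2) - (h / 2) * S, np.eye(2) + (h / 2) * S)

J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])          # canonical structure matrix in (q, p)
assert np.allclose(A.T @ J @ A, J)   # symplecticity: A^T J A = J
```

The same finite check applies, via the Jacobian, to nonlinear one-step maps.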
To illustrate the above-mentioned discrete total variation calculus, we present an example. We choose L in (2.31) as

    L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) = p_{k+1/2} (q_{k+1} − q_k)/(t_{k+1} − t_k) − H(q_{k+1/2}, p_{k+1/2}),    (2.45)

where

    p_{k+1/2} = (p_k + p_{k+1})/2,    q_{k+1/2} = (q_k + q_{k+1})/2.
Using (2.33), we obtain the corresponding discrete Hamilton equations

    (q_{k+1} − q_{k−1})/2 − (1/2)[ (t_{k+1} − t_k) (∂H/∂p)(q_{k+1/2}, p_{k+1/2}) + (t_k − t_{k−1}) (∂H/∂p)(q_{k−1/2}, p_{k−1/2}) ] = 0,
    (p_{k+1} − p_{k−1})/2 + (1/2)[ (t_{k+1} − t_k) (∂H/∂q)(q_{k+1/2}, p_{k+1/2}) + (t_k − t_{k−1}) (∂H/∂q)(q_{k−1/2}, p_{k−1/2}) ] = 0,    (2.46)

where p_{k−1/2} = (p_k + p_{k−1})/2, q_{k−1/2} = (q_k + q_{k−1})/2.
Using (2.34), we obtain the corresponding discrete energy conservation law

    H(q_{k+1/2}, p_{k+1/2}) = H(q_{k−1/2}, p_{k−1/2}).    (2.47)

The symplectic-energy integrator (2.46) and (2.47) preserves the discrete 2-form

    (1/2)( dp_k ∧ dq_{k+1} + dp_{k+1} ∧ dq_k ) − dH(q_{k+1/2}, p_{k+1/2}) ∧ (dt_k + dt_{k+1})/2.    (2.48)
If we take fixed time steps t_{k+1} − t_k = h (h a constant), then (2.46) becomes

    (q_{k+1} − q_{k−1})/(2h) = (1/2)[ (∂H/∂p)(q_{k+1/2}, p_{k+1/2}) + (∂H/∂p)(q_{k−1/2}, p_{k−1/2}) ],
    (p_{k+1} − p_{k−1})/(2h) = −(1/2)[ (∂H/∂q)(q_{k+1/2}, p_{k+1/2}) + (∂H/∂q)(q_{k−1/2}, p_{k−1/2}) ].    (2.49)

Now, we will explore the relationship between (2.49) and the midpoint integrator for the Hamiltonian system

    q̇ = ∂H/∂p,    ṗ = −∂H/∂q.    (2.50)
The midpoint symplectic integrator for (2.50) is

    (q_{k+1} − q_k)/h = (∂H/∂p)(q_{k+1/2}, p_{k+1/2}),
    (p_{k+1} − p_k)/h = −(∂H/∂q)(q_{k+1/2}, p_{k+1/2}).    (2.51)

Replacing k by k − 1 in (2.51), we obtain

    (q_k − q_{k−1})/h = (∂H/∂p)(q_{k−1/2}, p_{k−1/2}),
    (p_k − p_{k−1})/h = −(∂H/∂q)(q_{k−1/2}, p_{k−1/2}).    (2.52)

Adding (2.52) to (2.51) results in (2.49). Therefore, if we use (2.51) to obtain p_k, q_k, the two-step integrator (2.49) is equivalent to the midpoint integrator (2.51). However, the equivalence does not hold in general. For example, choose L in (2.31) as
    L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) = p_k (q_{k+1} − q_k)/(t_{k+1} − t_k) − H(q_{k+1/2}, p_{k+1/2}),    (2.53)
and take fixed time steps t_{k+1} − t_k = h. Then (2.33) becomes

    (q_{k+1} − q_k)/h = (1/2)[ (∂H/∂p)(q_{k+1/2}, p_{k+1/2}) + (∂H/∂p)(q_{k−1/2}, p_{k−1/2}) ],
    (p_k − p_{k−1})/h = −(1/2)[ (∂H/∂q)(q_{k+1/2}, p_{k+1/2}) + (∂H/∂q)(q_{k−1/2}, p_{k−1/2}) ].    (2.54)

The integrator (2.54) is a two-step integrator which preserves dp_k ∧ dq_{k+1}. In this case, we cannot find a one-step integrator equivalent to (2.54). In conclusion, using the discrete total variation calculus, we have derived two-step symplectic-energy integrators. When taking fixed time steps, some of them are equivalent to one-step integrators derived directly from the Hamiltonian system, while others do not have this equivalence.
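To make the preceding equivalence concrete, here is a hedged sketch (ours, not the authors' code) of the midpoint integrator (2.51) for H = p²/2 + (q⁴ − q²)/2 (the Hamiltonian used in Sect. 14.2.5), solved by fixed-point iteration; the half-sum of two consecutive steps reproduces the two-step form (2.49):

```python
import numpy as np

dHdq = lambda q: 2.0 * q**3 - q     # for H = p^2/2 + (q^4 - q^2)/2
dHdp = lambda p: p

def midpoint_step(q, p, h, iters=50):
    """One step of the implicit midpoint integrator (2.51), by fixed-point iteration."""
    qn, pn = q, p
    for _ in range(iters):
        qm, pm = (q + qn) / 2, (p + pn) / 2
        qn = q + h * dHdp(pm)
        pn = p - h * dHdq(qm)
    return qn, pn

h = 0.01
qs, ps = [0.77], [0.0]
for _ in range(3):
    qn, pn = midpoint_step(qs[-1], ps[-1], h)
    qs.append(qn)
    ps.append(pn)

# Two-step form (2.49) at k = 1: it is the half-sum of (2.51) and (2.52).
k = 1
lhs = (qs[k + 1] - qs[k - 1]) / (2 * h)
rhs = (dHdp((ps[k] + ps[k + 1]) / 2) + dHdp((ps[k - 1] + ps[k]) / 2)) / 2
assert abs(lhs - rhs) < 1e-10
```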

14.2.4 High Order Symplectic-Energy Integrator


In this subsection, we will develop high order symplectic-energy integrators by using
the generating function of the flow of the Hamiltonian system

    ż = J ∇H(z),    (2.55)

where

    z = (p, q)^T,    J = [ O, −I; I, O ].

Let us first recall the generating function with a normal Darboux matrix of a symplectic transformation. For details, see Chapters 5 and 6, or [Fen86, FWQW89].
Suppose α is a 4n × 4n nonsingular matrix of the form

    α = [ A, B; C, D ],

where A, B, C and D are all 2n × 2n matrices.


We denote the inverse of α by

    α^{−1} = [ A_1, B_1; C_1, D_1 ],

where A_1, B_1, C_1 and D_1 are all 2n × 2n matrices. A 4n × 4n matrix α is a Darboux matrix if

    α^T J_{4n} α = J̃_{4n},    (2.56)

where

    J_{4n} = [ O, −I_{2n}; I_{2n}, O ],    J̃_{4n} = [ J_{2n}, O; O, −J_{2n} ],    J_{2n} = [ O, −I_n; I_n, O ],

and I_n is the n × n identity matrix, I_{2n} the 2n × 2n identity matrix.


Every Darboux matrix induces a fractional transform between symplectic and symmetric matrices,

    σ_α: Sp(2n) → Sm(2n),
    σ_α(S) = (AS + B)(CS + D)^{−1} = M,    for S ∈ Sp(2n), det(CS + D) ≠ 0,

with the inverse transform σ_α^{−1} = σ_{α^{−1}},

    σ_{α^{−1}}: Sm(2n) → Sp(2n),
    σ_{α^{−1}}(M) = (A_1 M + B_1)(C_1 M + D_1)^{−1} = S,

where Sp(2n) is the group of symplectic matrices and Sm(2n) the set of symmetric matrices.

We can generalize the above discussion to nonlinear transformations on R^{2n}. Let us denote the set of symplectic transformations on R^{2n} by SpD(2n) and the set of symmetric transformations (i.e., transformations with symmetric Jacobian) on R^{2n} by Symm(2n). Every f ∈ Symm(2n) corresponds, at least locally, to a real function φ (unique up to a constant) such that f is the gradient of φ,

    f(w) = ∇φ(w),    (2.57)

where ∇φ(w) = ( φ_{w_1}(w), ..., φ_{w_{2n}}(w) ) and w = (w_1, w_2, ..., w_{2n}).
Then, we have

    σ_α: SpD(2n) → Symm(2n),
    σ_α(g) = (A ∘ g + B) ∘ (C ∘ g + D)^{−1} = ∇φ,    for g ∈ SpD(2n), det(C g_z + D) ≠ 0,

or alternatively

    A g(z) + B z = (∇φ)(C g(z) + D z),

where ∘ denotes composition of transformations, the 2n × 2n constant matrices A, B, C and D are regarded as linear transformations, and g_z denotes the Jacobian of the symplectic transformation g. We call φ the generating function of Darboux type α for the symplectic transformation g.
Conversely, we have

    σ_{α^{−1}}: Symm(2n) → SpD(2n),
    σ_{α^{−1}}(∇φ) = (A_1 ∘ ∇φ + B_1) ∘ (C_1 ∘ ∇φ + D_1)^{−1} = g,    for det(C_1 φ_{ww} + D_1) ≠ 0,

or alternatively

    A_1 ∇φ(w) + B_1 w = g( C_1 ∇φ(w) + D_1 w ),

where g is called the symplectic transformation of Darboux type α for the generating function φ.
For the study of integrators, we restrict ourselves to normal Darboux matrices, i.e., those satisfying A + B = 0, C + D = I_{2n}. The normal Darboux matrices can be characterized as

    α = [ J_{2n}, −J_{2n}; E, I_{2n} − E ],    E = (1/2)(I_{2n} + J_{2n} F),    F^T = F,    (2.58)

and

    α^{−1} = [ (E − I_{2n}) J_{2n}, I_{2n}; E J_{2n}, I_{2n} ].    (2.59)

The fractional transform induced by a normal Darboux matrix establishes a one-to-one correspondence between symplectic transformations near the identity and symmetric transformations near nullity. For simplicity, we take F = 0; then E = (1/2) I_{2n} and

    α = [ J_{2n}, −J_{2n}; (1/2) I_{2n}, (1/2) I_{2n} ].    (2.60)
Now, we consider the generating function of the flow of (2.55), which we denote by e^{tH}. The generating function φ(w, t) for the flow e^{tH} of Darboux type (2.60) is given by

    ∇φ = ( J_{2n} ∘ e^{tH} − J_{2n} ) ∘ ( (1/2) e^{tH} + (1/2) I_{2n} )^{−1},    for small |t|,    (2.61)

where φ(w, t) satisfies the Hamilton–Jacobi equation

    (∂/∂t) φ(w, t) = −H( w + (1/2) J_{2n} ∇φ(w, t) )    (2.62)
and can be expressed as a Taylor series in t,

    φ(w, t) = Σ_{k=1}^{∞} φ_k(w) t^k,    for small |t|.    (2.63)

The coefficients φ_k(w) can be determined recursively as

    φ_1(w) = −H(w),

    φ_{k+1}(w) = −1/(k+1) Σ_{m=1}^{k} (1/m!) Σ_{j_1+···+j_m=k, j_l ≥ 1} D^m H( (1/2) J_{2n} ∇φ_{j_1}, ..., (1/2) J_{2n} ∇φ_{j_m} ),    (2.64)

where k ≥ 1 and we use the notation of the m-linear form

    D^m H( (1/2) J_{2n} ∇φ_{j_1}, ..., (1/2) J_{2n} ∇φ_{j_m} )
    := Σ_{i_1,...,i_m=1}^{2n} H_{z^{i_1}···z^{i_m}}(w) ( (1/2) J_{2n} ∇φ_{j_1}(w) )_{i_1} ··· ( (1/2) J_{2n} ∇φ_{j_m}(w) )_{i_m}.

From (2.61), we can see that the phase flow ẑ := e^{tH} z satisfies

    J_{2n}( ẑ − z ) = ∇φ( (ẑ + z)/2 ) = Σ_{j=1}^{∞} t^j ∇φ_j( (ẑ + z)/2 ).    (2.65)
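As an aside (our own symbolic check, with sympy assumed available), the k = 1 step of the recursion (2.64) gives φ_2 = −(1/2) DH((1/2) J ∇φ_1) with φ_1 = −H, i.e. φ_2 = (1/4) ∇H·J∇H, which vanishes identically because J is antisymmetric, so the expansion (2.63) has no t² term:

```python
import sympy as sp

q, p = sp.symbols('q p')
H = p**2 / 2 + (q**4 - q**2) / 2             # sample Hamiltonian, z = (p, q)

J = sp.Matrix([[0, -1], [1, 0]])             # J_{2n} for n = 1
gradH = sp.Matrix([H.diff(p), H.diff(q)])    # gradient in the (p, q) ordering

phi1 = -H                                    # first line of (2.64)
grad_phi1 = sp.Matrix([phi1.diff(p), phi1.diff(q)])

# k = 1 term of (2.64): phi_2 = -(1/2) * ∇H · ((1/2) J ∇φ_1)
phi2 = -sp.Rational(1, 2) * (gradH.T * (J * grad_phi1 / 2))[0]
assert sp.simplify(phi2) == 0                # antisymmetry of J kills the t^2 term
```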

Now, we will choose L in (2.31) as

    L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) = p_{k+1/2} (q_{k+1} − q_k)/(t_{k+1} − t_k) − ψ^m(q_{k+1/2}, p_{k+1/2}),    (2.66)

where

    ψ^m(q_{k+1/2}, p_{k+1/2}) = Σ_{j=1}^{m} t^j φ_j(q_{k+1/2}, p_{k+1/2}).    (2.67)
The corresponding symplectic-energy integrator (2.33) and (2.34) is

    (q_{k+1} − q_{k−1})/2 − (1/2)[ (t_{k+1} − t_k) (∂ψ^m/∂p)(q_{k+1/2}, p_{k+1/2}) + (t_k − t_{k−1}) (∂ψ^m/∂p)(q_{k−1/2}, p_{k−1/2}) ] = 0,
    (p_{k+1} − p_{k−1})/2 + (1/2)[ (t_{k+1} − t_k) (∂ψ^m/∂q)(q_{k+1/2}, p_{k+1/2}) + (t_k − t_{k−1}) (∂ψ^m/∂q)(q_{k−1/2}, p_{k−1/2}) ] = 0,    (2.68)
    ψ^m(q_{k+1/2}, p_{k+1/2}) = ψ^m(q_{k−1/2}, p_{k−1/2}),

which preserves the discrete extended canonical 2-form

    (1/2)( dp_k ∧ dq_{k+1} + dp_{k+1} ∧ dq_k ) − dψ^m(q_{k+1/2}, p_{k+1/2}) ∧ (dt_k + dt_{k+1})/2.    (2.69)
The integrator (2.68) is a two-step symplectic-energy integrator of 2m-th order accuracy.

14.2.5 An Example and an Optimization Method


In this subsection, we present an example. We take the Hamiltonian

    H(q, p) = (1/2) p² + (1/2)(q⁴ − q²),    (2.70)

where q and p are scalars. Corresponding to (2.70), the discrete Lagrangian in (2.31) is chosen as

    L(t_k, q_k, p_k, t_{k+1}, q_{k+1}, p_{k+1}) = p_{k+1/2} (q_{k+1} − q_k)/(t_{k+1} − t_k) − (1/2) p²_{k+1/2} − (1/2)(q⁴_{k+1/2} − q²_{k+1/2}).    (2.71)
The corresponding symplectic-energy integrator (2.33) and (2.34) becomes

    (q_{k+1} − q_{k−1})/2 − (1/2)[ (t_{k+1} − t_k) p_{k+1/2} + (t_k − t_{k−1}) p_{k−1/2} ] = 0,
    (p_{k+1} − p_{k−1})/2 + (1/2)[ (t_{k+1} − t_k)(2q³_{k+1/2} − q_{k+1/2}) + (t_k − t_{k−1})(2q³_{k−1/2} − q_{k−1/2}) ] = 0,    (2.72)
    (1/2) p²_{k+1/2} + (1/2)(q⁴_{k+1/2} − q²_{k+1/2}) = (1/2) p²_{k−1/2} + (1/2)(q⁴_{k−1/2} − q²_{k−1/2}),

where t_{k−1}, q_{k−1}, p_{k−1} and t_k, q_k, p_k are given, and t_{k+1}, q_{k+1}, p_{k+1} are the unknowns.
In the following numerical experiment, we use a robust optimization method suggested in [KMO99] to solve (2.72). Concretely, let

    A = (q_{k+1} − q_{k−1})/2 − (1/2)[ (t_{k+1} − t_k) p_{k+1/2} + (t_k − t_{k−1}) p_{k−1/2} ],
    B = (p_{k+1} − p_{k−1})/2 + (1/2)[ (t_{k+1} − t_k)(2q³_{k+1/2} − q_{k+1/2}) + (t_k − t_{k−1})(2q³_{k−1/2} − q_{k−1/2}) ],
    C = (1/2) p²_{k+1/2} + (1/2)(q⁴_{k+1/2} − q²_{k+1/2}) − (1/2) p²_{k−1/2} − (1/2)(q⁴_{k−1/2} − q²_{k−1/2}).

Then, we minimize the quantity

    F = A² + B² + C²    (2.73)

Fig. 2.1. The orbits calculated by (2.72) and (2.74): left, q_0 = 0.77, p_0 = 0; right, q_0 = 0.99, p_0 = 0

Fig. 2.2. The energy evolution for (2.72) and (2.74): left, q_0 = 0.77, p_0 = 0; right, q_0 = 0.99, p_0 = 0

over qk+1 , pk+1 and tk+1 under the constraint tk+1 > tk . This constraint guarantees
that no singularities occur in choosing time steps.
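A hedged sketch of this optimization approach (our own illustration; the solver choice, tolerances, and starting values are assumptions, not the authors' implementation):

```python
import numpy as np
from scipy.optimize import minimize

def F_obj(x, prev, cur):
    """F = A^2 + B^2 + C^2 from (2.73), for H = p^2/2 + (q^4 - q^2)/2."""
    t2, q2, p2 = x                             # unknowns t_{k+1}, q_{k+1}, p_{k+1}
    t0, q0, p0 = prev                          # t_{k-1}, q_{k-1}, p_{k-1}
    t1, q1, p1 = cur                           # t_k, q_k, p_k
    qp, pp = (q1 + q2) / 2, (p1 + p2) / 2      # (k+1/2)-midpoints
    qm, pm = (q0 + q1) / 2, (p0 + p1) / 2      # (k-1/2)-midpoints
    A = (q2 - q0) / 2 - ((t2 - t1) * pp + (t1 - t0) * pm) / 2
    B = (p2 - p0) / 2 + ((t2 - t1) * (2 * qp**3 - qp)
                         + (t1 - t0) * (2 * qm**3 - qm)) / 2
    C = (pp**2 + qp**4 - qp**2 - pm**2 - qm**4 + qm**2) / 2
    return A * A + B * B + C * C

def step(prev, cur, h_guess=0.1):
    """Advance (2.72) one step: minimize F subject to t_{k+1} > t_k."""
    t1 = cur[0]
    x0 = np.array([t1 + h_guess, cur[1], cur[2]])
    res = minimize(F_obj, x0, args=(prev, cur), method='SLSQP', tol=1e-14,
                   constraints=[{'type': 'ineq',
                                 'fun': lambda x: x[0] - t1 - 1e-8}])
    return res.x

prev = (0.0, 0.77, 0.0)        # (t_{k-1}, q_{k-1}, p_{k-1})
cur = (0.1, 0.7693, -0.0143)   # illustrative (t_k, q_k, p_k), roughly one midpoint step
nxt = step(prev, cur)
assert nxt[0] > cur[0]         # the time step stays positive
```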
We will compare (2.72) with the following integrator with fixed time steps:

    (q_{k+1} − q_{k−1})/(2h) − (1/2)( p_{k+1/2} + p_{k−1/2} ) = 0,
    (p_{k+1} − p_{k−1})/(2h) + (1/2)( (2q³_{k+1/2} − q_{k+1/2}) + (2q³_{k−1/2} − q_{k−1/2}) ) = 0.    (2.74)

In our numerical experiment, we use the two initial conditions q_0 = 0.77, p_0 = 0, t = 0 and q_0 = 0.99, p_0 = 0, t = 0. To obtain q_1 and p_1, we apply the midpoint integrator with t_1 = 0.1. In Fig. 2.1, the orbits calculated by (2.72) and (2.74) are shown for the two initial conditions; for each initial condition the two orbits are almost indistinguishable. In Fig. 2.2, we plot the evolution of the energy H(q_{k+1/2}, p_{k+1/2}) for both (2.72) and (2.74): the oscillating curve is for (2.74) and the lower line for (2.72). For more numerical examples, see [KMO99] in the Lagrangian setting. In principle, the results in [KMO99] apply to the Hamiltonian setting of the present method as well, taking (q_{k+1} − q_k)/h = p_{k+1/2}. Our purpose here is to develop a discrete total variation calculus in the Hamiltonian setting and to obtain symplectic-energy integrators; a comprehensive implementation of the obtained integrators is not the subject of the present work and will be a topic for future research.

14.2.6 Concluding Remarks


We have developed a discrete total variation calculus in Hamiltonian formalism in this section. This calculus provides a new method for constructing structure-preserving integrators for Hamiltonian systems from a variational point of view. Using this calculus, we have derived the energy conservation laws associated with the integrators. The coupled integrators are two-step integrators and preserve a discrete version of the extended canonical 2-form. If we take fixed time steps, the resulting integrators are equivalent to the symplectic integrators derived directly from the Hamiltonian systems only in special cases; thus, new two-step symplectic integrators are obtained variationally. Using the generating function method, we have also obtained higher-order symplectic-energy integrators.
In principle, our discussions can be generalized to the multisymplectic Hamiltonian system

    M z_t + K z_x = ∇_z S(z),    z ∈ R^n,    (2.75)

where M and K are skew-symmetric matrices on R^n (n ≥ 3) and S: R^n → R is a smooth function [Bri97, BD01]. We call this a multisymplectic Hamiltonian system since it possesses the multisymplectic conservation law

    ∂ω/∂t + ∂κ/∂x = 0,    (2.76)

where ω and κ are the presymplectic forms

    ω = (1/2) dz ∧ M dz,    κ = (1/2) dz ∧ K dz.
The associated action functional is

    S = ∫ [ (1/2) z^T ( M z_t + K z_x ) − S(z) ] dx ∧ dt.    (2.77)

Performing total variation on (2.77), we can obtain the multisymplectic Hamiltonian system (2.75), the corresponding local energy conservation law

    (∂/∂t)( S(z) − (1/2) z^T K z_x ) + (∂/∂x)( (1/2) z^T K z_t ) = 0,    (2.78)

and the local momentum conservation law

    (∂/∂t)( (1/2) z^T M z_x ) + (∂/∂x)( S(z) − (1/2) z^T M z_t ) = 0.    (2.79)
In the same way, we can develop a discrete total variation in the multisymplectic form
and obtain multisymplectic-energy-momentum integrators. This will be discussed in
detail in Chapter 16.

14.3 Discrete Mechanics Based on Finite Element Methods

Now, we will consider mechanics based on finite element methods. Let us go back to the variational problem for the action functional (1.1). The finite element method is an approximate method for solving this variational problem. Instead of solving the variational problem in the space C²([a, b]), the finite element method solves it in a subspace V_h^m([a, b]) of C²([a, b]), which consists of piecewise polynomials of degree m interpolating the curves q(t) ∈ C²([a, b]).

14.3.1 Discrete Mechanics Based on Linear Finite Element


First, let us consider piecewise linear interpolation. Given a partition of [a, b],

    a = t_0 < t_1 < ··· < t_k < ··· < t_{N−1} < t_N = b,

the intervals I_k = [t_k, t_{k+1}] are called elements, and h_k = t_{k+1} − t_k. V_h([a, b]) consists of piecewise linear functions interpolating q(t) at (t_k, q_k) (k = 0, 1, ..., N). Now, we will derive the expression of q_h(t) ∈ V_h([a, b]). First, we construct the basis functions ϕ_k(t), which are piecewise linear functions on [a, b] satisfying ϕ_k(t_i) = δ_{ki} (i, k = 0, 1, ..., N).
    ϕ_0(t) = 1 − (t − t_0)/h_0  for t_0 ≤ t ≤ t_1,  and 0 otherwise;
    ϕ_N(t) = 1 + (t − t_N)/h_{N−1}  for t_{N−1} ≤ t ≤ t_N,  and 0 otherwise;    (3.1)

and for k = 1, 2, ..., N−1,

    ϕ_k(t) = 1 + (t − t_k)/h_{k−1}  for t_{k−1} ≤ t ≤ t_k;
             1 − (t − t_k)/h_k  for t_k ≤ t ≤ t_{k+1};
             0  otherwise.    (3.2)

Using these basis functions, we obtain the expression of q_h ∈ V_h([a, b]):

    q_h(t) = Σ_{k=0}^{N} q_k ϕ_k(t).    (3.3)
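A minimal sketch (our own illustration; the function names are assumptions) of the hat basis (3.1)–(3.2) and the interpolant (3.3) on a nonuniform partition:

```python
import numpy as np

def hat(k, t, nodes):
    """Piecewise linear basis phi_k on the partition `nodes`,
    with phi_k(t_i) = delta_{ki} as in (3.1)-(3.2)."""
    t = np.asarray(t, dtype=float)
    out = np.zeros_like(t)
    if k > 0:                                   # rising branch on [t_{k-1}, t_k]
        m = (t >= nodes[k - 1]) & (t <= nodes[k])
        out[m] = (t[m] - nodes[k - 1]) / (nodes[k] - nodes[k - 1])
    if k < len(nodes) - 1:                      # falling branch on [t_k, t_{k+1}]
        m = (t >= nodes[k]) & (t <= nodes[k + 1])
        out[m] = 1 - (t[m] - nodes[k]) / (nodes[k + 1] - nodes[k])
    return out

def q_h(qs, t, nodes):
    """Interpolant q_h(t) = sum_k q_k phi_k(t), equation (3.3)."""
    return sum(qk * hat(k, t, nodes) for k, qk in enumerate(qs))

nodes = np.array([0.0, 0.3, 0.7, 1.0])           # nonuniform partition of [a, b]
qs = np.sin(nodes)                                # nodal values q_k
assert np.allclose(q_h(qs, nodes, nodes), qs)     # q_h interpolates the nodes
# linear in between: at the midpoint of the first element
assert np.isclose(q_h(qs, np.array([0.15]), nodes)[0], (qs[0] + qs[1]) / 2)
```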

In the space V_h([a, b]), the action functional (1.1) becomes

    S(t, q_h(t)) = ∫_a^b L(t, q_h(t), q̇_h(t)) dt
                 = Σ_{k=0}^{N−1} ∫_{t_k}^{t_{k+1}} L( t, Σ_{i=0}^{N} q_i ϕ_i(t), (d/dt) Σ_{i=0}^{N} q_i ϕ_i(t) ) dt
                 = Σ_{k=0}^{N−1} L(t_k, q_k, t_{k+1}, q_{k+1})(t_{k+1} − t_k),    (3.4)

where

    L(t_k, q_k, t_{k+1}, q_{k+1}) = 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} L( t, Σ_{i=0}^{N} q_i ϕ_i(t), (d/dt) Σ_{i=0}^{N} q_i ϕ_i(t) ) dt
                                  = 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} L( t, Σ_{i=k}^{k+1} q_i ϕ_i(t), (d/dt) Σ_{i=k}^{k+1} q_i ϕ_i(t) ) dt,    (3.5)

since only ϕ_k and ϕ_{k+1} are nonzero on the element I_k.

Therefore, restricted to the subspace V_h([a, b]) of C²([a, b]), the original variational problem reduces to the extremum problem of the function (3.4) in q_k (k = 0, 1, ..., N). Note that (3.4) is one of the discrete actions (1.33). Thus, what remains is simply to perform the same calculation on (3.4) as on (1.33); we then obtain the discrete Euler–Lagrange equation (1.35), which preserves the discrete Lagrange 2-form (1.48). Discrete mechanics based on finite element methods therefore consists of two steps: first, use finite element methods to obtain a discrete Lagrangian; second, use the method of Veselov mechanics to obtain the variational integrators.
Let us consider the previous example again. For the classical Lagrangian (1.57), we choose the discrete Lagrangian L(t_k, q_k, t_{k+1}, q_{k+1}) as

    L(t_k, q_k, t_{k+1}, q_{k+1})
    = 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} [ (1/2) ( (d/dt) Σ_{i=0}^{N} q_i ϕ_i(t) )² − V( Σ_{i=0}^{N} q_i ϕ_i(t) ) ] dt
    = 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} [ (1/2) ( (q_{k+1} − q_k)/(t_{k+1} − t_k) )² − V( ((t_{k+1} − t)/(t_{k+1} − t_k)) q_k + ((t − t_k)/(t_{k+1} − t_k)) q_{k+1} ) ] dt
    = (1/2) ( (q_{k+1} − q_k)/(t_{k+1} − t_k) )² − F(q_k, q_{k+1}),    (3.6)

where

    F(q_k, q_{k+1}) = 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} V( ((t_{k+1} − t)/(t_{k+1} − t_k)) q_k + ((t − t_k)/(t_{k+1} − t_k)) q_{k+1} ) dt.    (3.7)

The discrete Euler–Lagrange equation (1.35) becomes

    ( (q_{k+1} − q_k)/(t_{k+1} − t_k) − (q_k − q_{k−1})/(t_k − t_{k−1}) ) + (t_{k+1} − t_k) ∂F(q_k, q_{k+1})/∂q_k + (t_k − t_{k−1}) ∂F(q_{k−1}, q_k)/∂q_k = 0,    (3.8)

which preserves the Lagrange 2-form

    ( 1/(t_{k+1} − t_k) + (t_{k+1} − t_k) ∂²F(q_k, q_{k+1})/(∂q_k ∂q_{k+1}) ) dq_{k+1} ∧ dq_k.    (3.9)

Again, if we take fixed time steps t_{k+1} − t_k = t_k − t_{k−1} = h, (3.8) becomes

    (q_{k+1} − 2q_k + q_{k−1})/h² + ∂F(q_k, q_{k+1})/∂q_k + ∂F(q_{k−1}, q_k)/∂q_k = 0,

which preserves the Lagrange 2-form

    ( 1/h + h ∂²F(q_k, q_{k+1})/(∂q_k ∂q_{k+1}) ) dq_{k+1} ∧ dq_k.

Suppose q_k is the solution of (3.8) and q(t) is the solution of

    d²q/dt² + ∂V(q)/∂q = 0;    (3.10)

then, from the convergence theory of finite element methods [Cia78, Fen65], we have

    ‖q(t) − q_h(t)‖ ≤ C h²,    (3.11)

where ‖·‖ is the L² norm, q_h(t) = Σ_{k=0}^{N} q_k ϕ_k(t), h = max_k{h_k}, and C is a constant independent of h.
If we use the midpoint quadrature rule in (3.7), we obtain

    F(q_k, q_{k+1}) = 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} V( ((t_{k+1} − t)/(t_{k+1} − t_k)) q_k + ((t − t_k)/(t_{k+1} − t_k)) q_{k+1} ) dt ≈ V( (q_k + q_{k+1})/2 ).

In this case, (3.8) is the same as (1.59). We can also integrate (3.7) numerically with the trapezoidal rule, Simpson's rule, and so on, to obtain other kinds of discrete Lagrangians.
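A small sketch (our own illustration) comparing the midpoint, trapezoidal and Simpson approximations of F in (3.7) against adaptive quadrature, for the quartic potential V(q) = (q⁴ − q²)/2 from (2.70):

```python
from scipy.integrate import quad

V = lambda q: (q**4 - q**2) / 2

def F_exact(qk, qk1, tk, tk1):
    """F(q_k, q_{k+1}) from (3.7), via adaptive quadrature."""
    path = lambda t: ((tk1 - t) * qk + (t - tk) * qk1) / (tk1 - tk)
    val, _ = quad(lambda t: V(path(t)), tk, tk1)
    return val / (tk1 - tk)

F_mid = lambda qk, qk1: V((qk + qk1) / 2)
F_trap = lambda qk, qk1: (V(qk) + V(qk1)) / 2
F_simp = lambda qk, qk1: (V(qk) + 4 * V((qk + qk1) / 2) + V(qk1)) / 6

qk, qk1 = 0.77, 0.80
ref = F_exact(qk, qk1, 0.0, 0.1)
err = lambda F: abs(F(qk, qk1) - ref)
# Simpson (exact for cubics) beats midpoint, which beats trapezoid, on this quartic
assert err(F_simp) < err(F_mid) < err(F_trap)
```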

14.3.2 Discrete Mechanics with Lagrangian of High Order


Now, we will consider piecewise quadratic polynomial interpolation, which results in a kind of higher-order discrete Lagrangian. To this aim, we add an auxiliary node t_{k+1/2} to each element I_k = [t_k, t_{k+1}]. There are two kinds of quadratic basis functions: φ_k(t) for the nodes t_k and φ_{k+1/2}(t) for t_{k+1/2}, satisfying

    φ_k(t_i) = δ_{ik},    φ_k(t_{i+1/2}) = 0,
    φ_{k+1/2}(t_{i+1/2}) = δ_{ik},    φ_{k+1/2}(t_i) = 0,    i, k = 0, 1, ..., N.

We have the basis functions as follows:

    φ_0(t) = ( 2(t − t_0)/h_0 − 1 )( (t − t_0)/h_0 − 1 )  for t_0 ≤ t ≤ t_1,  and 0 otherwise;    (3.12)
    φ_N(t) = ( 2(t_N − t)/h_{N−1} − 1 )( (t_N − t)/h_{N−1} − 1 )  for t_{N−1} ≤ t ≤ t_N,  and 0 otherwise;    (3.13)

and for k = 1, 2, ..., N−1,

    φ_k(t) = ( 2(t_k − t)/h_{k−1} − 1 )( (t_k − t)/h_{k−1} − 1 )  for t_{k−1} ≤ t ≤ t_k;
             ( 2(t − t_k)/h_k − 1 )( (t − t_k)/h_k − 1 )  for t_k ≤ t ≤ t_{k+1};
             0  otherwise;    (3.14)

and for k = 0, 1, ..., N−1,

    φ_{k+1/2}(t) = 4 ( (t − t_k)/h_k )( 1 − (t − t_k)/h_k )  for t_k ≤ t ≤ t_{k+1},  and 0 otherwise.    (3.15)

Using these basis functions, we construct the subspace V_h^2([a, b]) of C²([a, b]):

    q_h^2(t) = Σ_{k=0}^{N} q_k φ_k(t) + Σ_{k=0}^{N−1} q_{k+1/2} φ_{k+1/2}(t),    q_h^2(t) ∈ V_h^2([a, b]).    (3.16)

In the space V_h^2([a, b]), the action functional (1.1) becomes

    S(t, q_h^2(t)) = ∫_a^b L(t, q_h^2(t), q̇_h^2(t)) dt
                   = Σ_{k=0}^{N−1} ∫_{t_k}^{t_{k+1}} L(t, q_h^2(t), q̇_h^2(t)) dt
                   = Σ_{k=0}^{N−1} L(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1})(t_{k+1} − t_k),    (3.17)

where

    L(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1}) = 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} L(t, q_h^2(t), q̇_h^2(t)) dt.    (3.18)

For the discrete action (3.17), we have

    dS(q_0, q_{1/2}, q_1, ..., q_{N−1+1/2}, q_N) · (δq_0, δq_{1/2}, δq_1, ..., δq_{N−1+1/2}, δq_N)

    = Σ_{k=0}^{N−1} ( D_2 L(w_k) δq_k + D_3 L(w_k) δq_{k+1/2} + D_5 L(w_k) δq_{k+1} )(t_{k+1} − t_k)

    = Σ_{k=0}^{N−1} ( D_2 L(w_k) δq_k + D_3 L(w_k) δq_{k+1/2} )(t_{k+1} − t_k) + Σ_{k=1}^{N} D_5 L(w_{k−1})(t_k − t_{k−1}) δq_k

    = Σ_{k=1}^{N−1} ( D_2 L(w_k)(t_{k+1} − t_k) + D_5 L(w_{k−1})(t_k − t_{k−1}) ) δq_k
      + Σ_{k=0}^{N−1} D_3 L(w_k)(t_{k+1} − t_k) δq_{k+1/2}
      + D_2 L(w_0)(t_1 − t_0) δq_0 + D_5 L(w_{N−1})(t_N − t_{N−1}) δq_N,    (3.19)

where w_k = (t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1}) (k = 0, 1, ···, N − 1). From (3.19), we obtain the discrete Euler–Lagrange equations

D_2 L(w_k)(t_{k+1} − t_k) + D_5 L(w_{k−1})(t_k − t_{k−1}) = 0,   (3.20)
D_3 L(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1}) = 0,   (3.21)
D_3 L(t_{k−1}, q_{k−1}, q_{k−1/2}, t_k, q_k) = 0.   (3.22)

From (3.21) and (3.22), we can solve for q_{k+1/2} and q_{k−1/2} respectively, then substitute them into (3.20) and finally solve for q_{k+1}. Therefore, the discrete Euler–Lagrange equations (3.20) – (3.22) determine a discrete flow

Ψ : M × M −→ M × M,
Ψ(t_{k−1}, q_{k−1}, t_k, q_k) = (t_k, q_k, t_{k+1}, q_{k+1}).

From (3.19), we can define two 1-forms

Θ_L^{v−}(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1}) = D_2 L(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1})(t_{k+1} − t_k) dq_k,

and

Θ_L^{v+}(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1}) = D_5 L(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1})(t_{k+1} − t_k) dq_{k+1}.

Using the same method as before, we can prove that

Ψ^∗(dΘ_L^{v+}) = −dΘ_L^{v−}.   (3.23)

From the definitions of Θ_L^{v−} and Θ_L^{v+}, we have

Θ_L^{v−} + Θ_L^{v+} = d( (t_{k+1} − t_k) L ) − D_3 L(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1}) dq_{k+1/2}.   (3.24)

From (3.21), we obtain D_3 L(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1}) = 0. Thus

Θ_L^{v−} + Θ_L^{v+} = d( (t_{k+1} − t_k) L ),

which means

dΘ_L^{v+} = −dΘ_L^{v−}.   (3.25)

From (3.23) and (3.25), we arrive at

Ψ^∗(Ω_L^v) = Ω_L^v,   (3.26)

where Ω_L^v = dΘ_L^{v+}.
For the classical Lagrangian (1.57), from (3.16) and (3.18), we obtain

L(t_k, q_k, q_{k+1/2}, t_{k+1}, q_{k+1})
= 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} [ 1/2 ( q̇_h^2(t) )² − V( q_h^2(t) ) ] dt
= 1/2 [ 1/3 a²( t_{k+1}² + t_k t_{k+1} + t_k² ) + ab( t_k + t_{k+1} ) + b² ] − G(q_k, q_{k+1/2}, q_{k+1}),   (3.27)

where

a = 4/h_k² ( q_k + q_{k+1} − 2q_{k+1/2} ),
b = 1/h_k² ( 4(t_k + t_{k+1}) q_{k+1/2} − (3t_k + t_{k+1}) q_{k+1} − (t_k + 3t_{k+1}) q_k ),

and

G(q_k, q_{k+1/2}, q_{k+1}) = 1/(t_{k+1} − t_k) ∫_{t_k}^{t_{k+1}} V( q_k f_k(t) + q_{k+1} f_{k+1}(t) + q_{k+1/2} f_{k+1/2}(t) ) dt,

where

f_k(t) = ( 2(t − t_k)/h_k − 1 )( (t − t_k)/h_k − 1 ),
f_{k+1}(t) = ( 2(t_{k+1} − t)/h_k − 1 )( (t_{k+1} − t)/h_k − 1 ),
f_{k+1/2}(t) = 4 (t − t_k)/h_k ( 1 − (t − t_k)/h_k ).
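The closed-form kinetic part of (3.27) can be spot-checked against direct quadrature: with a and b as above, q̇_h^2(t) = at + b on the element, so the averaged kinetic energy has the stated polynomial expression. A minimal check with arbitrary test values (ours, not the book's):

```python
import numpy as np

# Test element and coefficients (arbitrary values for the spot check)
tk_, tk1 = 0.3, 0.8
h = tk1 - tk_
qk, qmid, qk1 = 1.0, 0.3, -0.2

a = 4.0/h**2 * (qk + qk1 - 2*qmid)
b = 1.0/h**2 * (4*(tk_ + tk1)*qmid - (3*tk_ + tk1)*qk1 - (tk_ + 3*tk1)*qk)

# Closed form: (1/h) * integral of (1/2) qdot^2 with qdot = a t + b
kinetic_closed = 0.5*(a*a*(tk1**2 + tk_*tk1 + tk_**2)/3.0 + a*b*(tk_ + tk1) + b*b)

# Quadrature of the same quantity from the quadratic interpolant itself
x, w = np.polynomial.legendre.leggauss(3)
t = tk_ + 0.5*h*(x + 1.0)
s = (t - tk_)/h
dq = (qk*(4*s - 3) + qk1*(4*s - 1) + qmid*(4 - 8*s))/h   # derivative of q_h^2
kinetic_quad = 0.5*np.sum(w*0.5*dq**2)                    # element average
```

The pointwise identity dq = a t + b is itself a useful check of the coefficient formulas for a and b.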
For the discrete Lagrangian (3.27), the discrete Euler–Lagrange equations (3.20) – (3.22) become

a_1 q_{k−1} + a_2 q_k + a_3 q_{k+1} + a_4 q_{k−1/2} + a_5 q_{k+1/2} − d_1 h_k − d_2 h_{k−1} = 0,   (3.28)
− 8/(3h_k²) ( q_k + q_{k+1} − 2q_{k+1/2} ) − ∂G(q_k, q_{k+1/2}, q_{k+1})/∂q_{k+1/2} = 0,   (3.29)
− 8/(3h_{k−1}²) ( q_{k−1} + q_k − 2q_{k−1/2} ) − ∂G(q_{k−1}, q_{k−1/2}, q_k)/∂q_{k−1/2} = 0,   (3.30)

where

a_1 = 1/(3h_{k−1}),   a_2 = 7/3 ( 1/h_{k−1} + 1/h_k ),   a_3 = 1/(3h_k),
a_4 = − 8/(3h_{k−1}),   a_5 = − 8/(3h_k),
d_1 = ∂G(q_k, q_{k+1/2}, q_{k+1})/∂q_k,   d_2 = ∂G(q_{k−1}, q_{k−1/2}, q_k)/∂q_k.

The solution of (3.28) – (3.30) preserves the Lagrange 2-form

( 1/(3h_k) − h_k ∂²G(q_k, q_{k+1/2}, q_{k+1})/∂q_k ∂q_{k+1} − M ) dq_k ∧ dq_{k+1},   (3.31)

where

M = ( 16/(3h_k) + h_k ∂²G/∂q_{k+1/2}∂q_k )( 16/(3h_k) + h_k ∂²G/∂q_{k+1/2}∂q_{k+1} ) / ( 32/(3h_k) − h_k ∂²G/∂q_{k+1/2}² ),

with all second derivatives of G evaluated at (q_k, q_{k+1/2}, q_{k+1}).

If we take fixed time steps h_{k−1} = h_k = h, then (3.28) – (3.30) become

( q_{k−1} − 8q_{k−1/2} + 14q_k − 8q_{k+1/2} + q_{k+1} )/(3h) − (d_1 + d_2) h = 0,   (3.32)
− 8/(3h²) ( q_k + q_{k+1} − 2q_{k+1/2} ) − ∂G(q_k, q_{k+1/2}, q_{k+1})/∂q_{k+1/2} = 0,   (3.33)
− 8/(3h²) ( q_{k−1} + q_k − 2q_{k−1/2} ) − ∂G(q_{k−1}, q_{k−1/2}, q_k)/∂q_{k−1/2} = 0,   (3.34)
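A sketch of the fixed-step scheme (3.32) – (3.34) for the test potential V(q) = q²/2 (our choice; the harmonic oscillator, with exact solution q = cos t). G and its partial derivatives are evaluated by Gauss quadrature and central differences, and each step solves the two equations for (q_{k+1/2}, q_{k+1}) with a small Newton iteration; all helper names are ours:

```python
import numpy as np

V = lambda q: 0.5*q*q                   # test potential (our choice)
h = 0.1
x3, w3 = np.polynomial.legendre.leggauss(3)
s3 = 0.5*(x3 + 1.0)                     # Gauss points mapped to s in [0, 1]

def G(qk, qm, qk1):
    # element-averaged potential (1/h) * integral of V(q_h^2); in s the 1/h cancels
    q = qk*(2*s3 - 1)*(s3 - 1) + qk1*(2*s3*s3 - s3) + qm*4*s3*(1 - s3)
    return 0.5*np.sum(w3*V(q))

def dG(qk, qm, qk1, i, eps=1e-6):
    # central-difference partial of G with respect to argument i
    ap, am = [qk, qm, qk1], [qk, qm, qk1]
    ap[i] += eps; am[i] -= eps
    return (G(*ap) - G(*am))/(2*eps)

def residual(u, qkm1, qmm1, qk):
    qm, qk1 = u
    d1 = dG(qk, qm, qk1, 0)             # dG(q_k, q_{k+1/2}, q_{k+1})/dq_k
    d2 = dG(qkm1, qmm1, qk, 2)          # dG(q_{k-1}, q_{k-1/2}, q_k)/dq_k
    r1 = (qkm1 - 8*qmm1 + 14*qk - 8*qm + qk1)/(3*h) - (d1 + d2)*h   # (3.32)
    r2 = -8/(3*h*h)*(qk + qk1 - 2*qm) - dG(qk, qm, qk1, 1)          # (3.33)
    return np.array([r1, r2])

def step(qkm1, qmm1, qk):
    u = np.array([qk, qk])              # initial guess
    for _ in range(50):
        r = residual(u, qkm1, qmm1, qk)
        J = np.zeros((2, 2))
        for j in range(2):
            e = np.zeros(2); e[j] = 1e-7
            J[:, j] = (residual(u + e, qkm1, qmm1, qk) - r)/1e-7
        u = u - np.linalg.solve(J, r)
        if np.max(np.abs(r)) < 1e-12:
            break
    return u

# exact solution q = cos t seeds the first element
qs = [np.cos(0.0), np.cos(0.1)]
ms = [np.cos(0.05)]
for k in range(1, 10):
    qm, qk1 = step(qs[k-1], ms[k-1], qs[k])
    ms.append(qm); qs.append(qk1)
```

For this quadratic V the per-step system is linear, so the Newton iteration converges immediately; the computed nodal values track cos t closely on [0, 1].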
which preserve

( 1/(3h) − h ∂²G(q_k, q_{k+1/2}, q_{k+1})/∂q_k ∂q_{k+1} − M ) dq_k ∧ dq_{k+1},   (3.35)

where

M = ( 16/(3h) + h ∂²G/∂q_{k+1/2}∂q_k )( 16/(3h) + h ∂²G/∂q_{k+1/2}∂q_{k+1} ) / ( 32/(3h) − h ∂²G/∂q_{k+1/2}² ),

with the second derivatives of G evaluated at (q_k, q_{k+1/2}, q_{k+1}).

Suppose q_k is the solution of (3.28) – (3.30) and q(t) is the solution of (3.10); then, from the convergence theory of finite element methods [Cia78,Fen65], we have

‖ q(t) − q_h^2(t) ‖ ≤ C h³,   (3.36)

where

q_h^2(t) = Σ_{k=0}^{N} q_k φ_k(t) + Σ_{k=0}^{N−1} q_{k+1/2} φ_{k+1/2}(t),

h = max_k {h_k}, and C is a constant independent of h.
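The cubic rate in (3.36) can be illustrated for pure interpolation: interpolating a smooth function by the piecewise quadratic (3.16) and halving h should reduce the maximum error by a factor of about 2³ = 8. A minimal check with q(t) = sin t (our choice):

```python
import numpy as np

def interp_error(n):
    """Max error of piecewise-quadratic interpolation of sin on [0, 1], n elements."""
    nodes = np.linspace(0.0, 1.0, n + 1)
    h = nodes[1] - nodes[0]
    err = 0.0
    for k in range(n):
        s = np.linspace(0.0, 1.0, 101)
        t = nodes[k] + s*h
        qk  = np.sin(nodes[k])
        qm  = np.sin(nodes[k] + h/2)
        qk1 = np.sin(nodes[k+1])
        qh = qk*(2*s - 1)*(s - 1) + qk1*(2*s*s - s) + qm*4*s*(1 - s)
        err = max(err, np.max(np.abs(qh - np.sin(t))))
    return err

e1, e2 = interp_error(10), interp_error(20)
rate = np.log2(e1/e2)    # observed convergence order, should approach 3
```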

14.3.3 Time Steps as Variables

In the above section, the time nodes t_k play the role of parameters: they are determined beforehand according to some requirements. In fact, we can also regard the t_k as variables; the variation of the discrete action with respect to t_k then yields the discrete energy conservation law. This fact was first observed by Lee[Lee82,Lee87]. The symplecticity of the resulting integrators was investigated in [CGW03,KMO99]. These results also apply to the discrete mechanics based on finite element methods.

We regard t_k as variables and calculate the variation of the discrete action (1.33) as follows:

d S(t_0, q_0, ···, t_N, q_N) · (δt_0, δq_0, ···, δt_N, δq_N)
= d/dε |_{ε=0} S(t_0 + εδt_0, q_0 + εδq_0, ···, t_N + εδt_N, q_N + εδq_N)
= Σ_{k=1}^{N−1} [ D_2 L(w_k)(t_{k+1} − t_k) + D_4 L(w_{k−1})(t_k − t_{k−1}) ] δq_k
+ Σ_{k=1}^{N−1} [ D_1 L(w_k)(t_{k+1} − t_k) + D_3 L(w_{k−1})(t_k − t_{k−1}) + L(w_{k−1}) − L(w_k) ] δt_k
+ D_2 L(w_0)(t_1 − t_0) δq_0 + D_4 L(w_{N−1})(t_N − t_{N−1}) δq_N
+ [ D_1 L(w_0)(t_1 − t_0) − L(w_0) ] δt_0
+ [ D_3 L(w_{N−1})(t_N − t_{N−1}) + L(w_{N−1}) ] δt_N,   (3.37)

where w_k = (t_k, q_k, t_{k+1}, q_{k+1}) (k = 0, 1, ···, N − 1). Thus we have the discrete Euler–Lagrange equation from the variation δq_k,

D_2 L(w_k)(t_{k+1} − t_k) + D_4 L(w_{k−1})(t_k − t_{k−1}) = 0,   (3.38)

and the discrete energy evolution equation from the variation δt_k,

D_1 L(w_k)(t_{k+1} − t_k) + D_3 L(w_{k−1})(t_k − t_{k−1}) + L(w_{k−1}) − L(w_k) = 0,   (3.39)

which is a discrete version of (1.23). For a conservative L, (3.39) becomes the discrete energy conservation law.
From the boundary terms in (3.37), we can define two 1-forms

θ_L^−(w_k) = ( D_1 L(w_k)(t_{k+1} − t_k) − L(w_k) ) dt_k + D_2 L(w_k)(t_{k+1} − t_k) dq_k,   (3.40)

and

θ_L^+(w_k) = ( D_3 L(w_k)(t_{k+1} − t_k) + L(w_k) ) dt_{k+1} + D_4 L(w_k)(t_{k+1} − t_k) dq_{k+1}.   (3.41)

These two 1-forms are the discrete version of the extended Lagrange 1-form (1.29). Unlike the continuous case, the solution of (3.38) does not satisfy (3.39) in general. Therefore, we must solve (3.38) and (3.39) simultaneously. Using the same method as in the above section, we can show that the coupled integrator

D_2 L(w_k)(t_{k+1} − t_k) + D_4 L(w_{k−1})(t_k − t_{k−1}) = 0,
D_1 L(w_k)(t_{k+1} − t_k) + D_3 L(w_{k−1})(t_k − t_{k−1}) + L(w_{k−1}) − L(w_k) = 0,   (3.42)

preserves the extended Lagrange 2-form ω_L = dθ_L^+.
For the discrete Lagrangian (3.6), (3.42) becomes

(q_{k+1} − q_k)/(t_{k+1} − t_k) − (q_k − q_{k−1})/(t_k − t_{k−1}) + h_k ∂F(w_k)/∂q_k + h_{k−1} ∂F(w_{k−1})/∂q_k = 0,

1/2 ( (q_{k+1} − q_k)/(t_{k+1} − t_k) )² + F(w_k) − h_k ∂F(w_k)/∂t_k
= 1/2 ( (q_k − q_{k−1})/(t_k − t_{k−1}) )² + F(w_{k−1}) + h_{k−1} ∂F(w_{k−1})/∂t_k.

For this kind of high-order discrete Lagrangian, similar formulae can be obtained.

14.3.4 Conclusions
Recently, it has been proved [GLWW01] that the symplectic structure is preserved not only by the phase flow but also by the flow of symplectic vector fields, as long as a certain cohomological condition is satisfied, in both the continuous and discrete cases. This should extend to the cases treated in this chapter.
Bibliography

[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. GTM 60. Springer-Verlag, Berlin Heidelberg, Second edition, (1989).
[BD01] T. J. Bridges and G. Derks: The symplectic Evans matrix, and the instability of solitary waves and fronts. Arch. Rat. Mech. Anal., 156:1–87, (2001).
[Bri97] T. J. Bridges: Multi-symplectic structures and wave propagation. Math. Proc. Cam.
Phil. Soc., 121:147–190, (1997).
[CGW03] J. B. Chen, H.Y. Guo, and K. Wu: Total variation in Hamiltonian formalism and
symplectic-energy integrators. J. of Math. Phys., 44:1688–1702, (2003).
[CH53] R. Courant and D. Hilbert: Methods of Mathematical Physics. Interscience, New York,
Second edition, (1953).
[Cia78] P. G. Ciarlet: The Finite Element Method for Elliptic Problems. North-Holland, Amsterdam, First edition, (1978).
[Fen65] K. Feng: Difference schemes based on variational principle. J. Appl. Comput. Math. (in Chinese), 2(4):238–262, (1965).
[Fen86] K. Feng: Difference schemes for Hamiltonian formalism and symplectic geometry. J.
Comput. Math., 4:279–289, (1986).
[FWQW89] K. Feng, H. M. Wu, M.Z. Qin, and D.L. Wang: Construction of canonical dif-
ference schemes for Hamiltonian formalism via generating functions. J. Comput. Math.,
7:71–96, (1989).
[GLW01a] H. Y. Guo, Y. Q. Li, and K. Wu: A note on symplectic algorithms. Commun. Theor. Phys., 36:11–18, (2001).
[GLW01b] H. Y. Guo, Y. Q. Li, and K. Wu: On symplectic and multisymplectic structures and their discrete version in Lagrange formalism. Commun. Theor. Phys., 35:703–710, (2001).
[GLWW01] H. Y. Guo, Y. Q. Li, K. Wu, and S. K. Wang: Difference discrete variational
principle, Euler-Lagrange cohomology and symplectic, multisymplectic structures. arXiv:
math-ph/0106001, (2001).
[GM88] Z. Ge and J. E. Marsden: Lie–Poisson–Hamilton–Jacobi theory and Lie–Poisson integrators. Physics Letters A, 133:134–139, (1988).
[GPS02] H. Goldstein, C. Poole, and J. Safko: Classical Mechanics. Addison Wesley, New York, Third edition, (2002).
[GW03] H. Y. Guo and K. Wu: On variations in discrete mechanics and field theory. J. of
Math. Phys., 44:5978–6044, (2003).
[KMO99] C. Kane, J. E. Marsden, and M. Ortiz: Symplectic-energy-momentum preserving
variational integrators. J. of Math. Phys., 40:3353–3371, (1999).
[Lag88] J. L. Lagrange: Mécanique Analytique, 2 vols. Gauthier-Villars et fils, Paris, 4th edition, 1888–89 (first edition 1788).
[Lee82] T. D. Lee: Can time be a discrete dynamical variable? Phys. Lett. B, 122:217–220, (1982).
[Lee87] T. D. Lee: Difference equations and conservation laws. J. Stat. Phys., 46:843–860,
(1987).

[MPS98] J. E. Marsden, G. W. Patrick, and S. Shkoller: Multisymplectic geometry, variational integrators, and nonlinear PDEs. Communications in Mathematical Physics, 199:351–395, (1998).
[MV91] J. Moser and A. P. Veselov: Discrete versions of some classical integrable systems and
factorization of matrix polynomials. Communications in Mathematical Physics, 139:217–
243, (1991).
[Olv93] P. J. Olver: Applications of Lie Groups to Differential Equations. GTM 107. Springer-
Verlag, Berlin, Second edition, (1993).
[SSC94] J. M. Sanz-Serna and M. P. Calvo: Numerical Hamiltonian Problems. AMMC 7.
Chapman & Hall, London, (1994).
[Ves88] A. P. Veselov: Integrable discrete-time systems and difference operators. Funkts. Anal.
Prilozhen, 22:1–33, (1988).
[Ves91a] A. P. Veselov: Integrable Lagrangian correspondences and the factorization of matrix
polynomials. Funkts. Anal. Prilozhen, 25:38–49, (1991).
[Ves91b] A. P. Veselov: Integrable maps. Russian Math. Surveys, 46:1–51, (1991).
[WM97] J. Wendlandt and J. Marsden: Mechanical integrators derived from a discrete varia-
tional principle. Physica D, 106:223–246, (1997).
Chapter 15. Structure Preserving Schemes for Birkhoff Systems

A universal symplectic structure for a Newtonian system, including nonconservative cases, can be constructed in the framework of the Birkhoffian generalization of Hamiltonian mechanics. In this chapter, the symplectic geometric structure of Birkhoffian systems is discussed, and the symplecticity of the Birkhoffian phase flow is presented. Based on these properties, a way to construct symplectic schemes for Birkhoffian systems by the generating function method is explained[SSQS07,SQ03].

15.1 Introduction
Birkhoffian representation is a generalization of Hamiltonian representation, which can be applied to hadron physics, statistical mechanics, space mechanics, engineering, biophysics, etc. (Santilli[San83a,San83b]). All conservative or nonconservative, self-adjoint or non-self-adjoint, unconstrained or nonholonomically constrained systems admit a Birkhoffian representation (Guo[GLSM01] and Santilli[San83b]). In the last 20 years, many researchers have studied Birkhoffian mechanics and obtained a series of results in integral theory, stability of motion, the inverse problem, and algebraic and geometric description.

Birkhoff's equations are more complex than Hamilton's equations, and the study of computational methods for the former is correspondingly more complicated. Previously there were no results on computational methods for Birkhoffian systems, and the known difference methods are not generally applicable to them. Just as a difference scheme used to solve a Hamiltonian system should be a Hamiltonian scheme (Hairer, Lubich and Wanner[HLW02] and Sanz-Serna and Calvo[SSC94]), a difference scheme used to simulate a Birkhoffian system should be a Birkhoffian scheme. However, conventional difference schemes such as the Euler centered scheme, the leap-frog scheme, etc., are not Birkhoffian schemes. So a way to systematically construct Birkhoffian schemes is necessary, and this is the main content of this chapter.

Both Birkhoffian and Hamiltonian systems are usually finite dimensional (Arnold[Arn89] and Marsden and Ratiu[MR99]); infinite dimensional systems have not been considered before. The algebraic and geometric profiles of finite dimensional Birkhoffian systems are described in local coordinates, and general nonautonomous Hamiltonian systems can be considered as autonomous Birkhoffian systems (Santilli[San83b]). Symplectic schemes have been systematically developed for standard Hamiltonian systems and for general Hamiltonian systems on Poisson manifolds, which belong to autonomous and semi-autonomous Birkhoffian systems (Feng and Wang[FW91b,FW91a] and Feng and Qin[FQ87]). So, in this chapter, we discuss the nonautonomous Birkhoffian system in detail. Throughout, Einstein's summation convention is used.

In Section 15.2, Birkhoffian systems are sketched out via variational self-adjointness, which shows the relationship between Birkhoffian and Hamiltonian systems more essentially and directly; then the basic geometric properties of Birkhoffian systems are presented. In Section 15.3, the definition of K̃(z)-Lagrangian submanifolds is extended to K̃(z, t)-Lagrangian submanifolds with parameter t, and the relationship between symplectic mappings and gradient mappings is discussed. In Section 15.4, the generating functions for the phase flow of Birkhoffian systems are constructed, and a method to simulate Birkhoffian systems by symplectic schemes of any order is given. Section 15.5 contains an illustrative example: schemes of order one, two, and four are derived for the linear damped oscillator. In the last section, Section 15.6, numerical experiments are given.
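The claim that conventional schemes are not Birkhoffian can be checked concretely. For the linear damped oscillator ṙ = p, ṗ = −r − νp treated in Section 15.5, one natural Birkhoff structure is K(z, t) = e^{νt} J₂ (an assumption used here for illustration; it need not coincide with the representation derived later). K-symplecticity of a linear one-step map M over a step τ requires M^T K(τ) M = K(0), i.e., det M = e^{−ντ} in the plane; the Euler centered (midpoint) rule misses this by O(τ³):

```python
import numpy as np

nu, tau = 0.5, 0.1
A = np.array([[0.0, 1.0], [-1.0, -nu]])   # damped oscillator z' = A z, z = (r, p)
I = np.eye(2)

# Euler centered (midpoint) one-step map: (I - tau A/2) z1 = (I + tau A/2) z0
M = np.linalg.solve(I - 0.5*tau*A, I + 0.5*tau*A)

J = np.array([[0.0, 1.0], [-1.0, 0.0]])
# K-symplecticity w.r.t. K(z, t) = exp(nu t) J would need exp(nu tau) M^T J M = J
defect = np.exp(nu*tau)*(M.T @ J @ M) - J
```

The defect is small but clearly nonzero (of size roughly ντ³/8 here), so the midpoint rule is not a Birkhoffian scheme for this structure even though it is symplectic in the conservative case ν = 0.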

15.2 Birkhoffian Systems


The generalization of Hamilton's equations is given by

( ∂F_j/∂z_i − ∂F_i/∂z_j ) dz_j/dt − ( ∂B(z, t)/∂z_i + ∂F_i(z, t)/∂t ) = 0,   i, j = 1, 2, ···, 2n,   (2.1)

where the abbreviations

K_ij = ∂F_j/∂z_i − ∂F_i/∂z_j,   K = (K_ij)_{i,j=1,···,2n}

are used in what follows. Following the terminology suggested by Santilli[San83b], this is called Birkhoff's equation, or a Birkhoffian system, under some additional assumptions. The function B(z, t) is called the Birkhoffian because of certain physical differences with the Hamiltonian; the F_i (i = 1, 2, ···, 2n) are called Birkhoffian functions. A representation of Newton's equations via Birkhoff's equation is called a Birkhoffian representation.

Definition 2.1. Birkhoff's equations (2.1) are called autonomous when the functions F_i and B are independent of the time variable. In this case, the equations take the simple form

K_ij(z) dz_j/dt − ∂B(z)/∂z_i = 0.   (2.2)

They are called semi-autonomous when the functions F_i do not depend explicitly on time. In this case, the equations have the more general form

K_ij(z) dz_j/dt − ∂B(z, t)/∂z_i = 0.

They are called nonautonomous when both the functions F_i and B depend explicitly on time. Then the equations read as follows:

K_ij(z, t) dz_j/dt − ∂B(z, t)/∂z_i − ∂F_i(z, t)/∂t = 0.   (2.3)

They are called regular when the functional determinant is nonzero in the region considered, i.e.,

det (K_ij) ≠ 0 in R̃e,

and degenerate otherwise.

Given an arbitrary analytic and regular first-order system

K_ij(z, t) dz_j/dt + D_i(z, t) = 0,   i = 1, 2, ···, 2n,   (2.4)

it is self-adjoint if and only if it satisfies the following conditions in R̃e for i, j, k = 1, 2, ···, 2n [AH75]:

K_ij + K_ji = 0,
∂K_ij/∂z_k + ∂K_jk/∂z_i + ∂K_ki/∂z_j = 0,   (2.5)
∂K_ij/∂t = ∂D_i/∂z_j − ∂D_j/∂z_i.
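The conditions (2.5) can be verified numerically for a concrete system. For the damped oscillator ṙ = p, ṗ = −r − νp, one standard choice (our assumption, not necessarily the book's) is K(z, t) = e^{νt} J₂ with D = −K ż; the sketch below checks antisymmetry and the time-derivative condition by central differences (the closedness condition is trivial here since K does not depend on z):

```python
import numpy as np

nu = 0.3

def K(z, t):
    # K(z, t) = e^{nu t} J_2 : an assumed Birkhoff structure for r'' + nu r' + r = 0
    return np.exp(nu*t)*np.array([[0.0, 1.0], [-1.0, 0.0]])

def D(z, t):
    # from K z' + D = 0 with z' = (p, -r - nu p): D = -K z'
    r, p = z
    zdot = np.array([p, -r - nu*p])
    return -K(z, t) @ zdot

z0, t0, eps = np.array([0.7, -0.4]), 0.9, 1e-6

# condition 1: antisymmetry of K
asym = K(z0, t0) + K(z0, t0).T

# condition 3: dK_ij/dt = dD_i/dz_j - dD_j/dz_i, via central differences
dKdt = (K(z0, t0 + eps) - K(z0, t0 - eps))/(2*eps)
Jac = np.zeros((2, 2))
for j in range(2):
    e = np.zeros(2); e[j] = eps
    Jac[:, j] = (D(z0 + e, t0) - D(z0 - e, t0))/(2*eps)
cond3 = dKdt - (Jac - Jac.T)
```

Both residuals vanish to finite-difference accuracy, so this first-order system is self-adjoint and hence, by Theorem 2.2 below in the text's sense, of Birkhoffian type.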

We now briefly describe the geometric significance of the condition of variational self-adjointness [MP91,SVC95]. Here the region considered is a star-shaped region R̃e of points of R × T^∗M, where T^∗M is the cotangent bundle of M, and M is a 2n-dimensional manifold. The geometric significance of the self-adjointness condition (2.5) is that it is the integrability condition for a 2-form to be an exact symplectic form.

Consider first the case for which K_ij = K_ij(z). A symplectic structure is written as the 2-form in local coordinates

Ω = Σ_{i,j=1}^{2n} K_ij(z) dz_i ∧ dz_j.

One of the fundamental properties of a symplectic form is that dΩ = 0. The exact character of the 2-form implies that

Ω = d( F_i dz_i );   (2.6)

this geometric property is fully characterized by the first two equations of the condition (2.5). That is, the 2-form (2.6) describes the geometric structure of the autonomous case (2.2) of Birkhoff's equations; it even sketches out the geometric structure of the semi-autonomous case.

For the case K_ij = K_ij(z, t), the full set of conditions (2.5) must be considered. The corresponding geometric structure is better expressed by passing from the symplectic geometry on the cotangent bundle T^∗M with local coordinates z_i to the contact geometry on the manifold R × T^∗M with local coordinates z̃_i (i = 0, 1, 2, ···, 2n), z̃_0 = t[San83b]. More general formulations of an exact contact 2-form exist, although it now refers to a (2n+1)-dimensional space:

Ω̃ = Σ_{i,j=0}^{2n} K̃_ij dz̃_i ∧ dz̃_j = Ω + 2 D_i dz_i ∧ dt,

where

K̃ = ( 0    −D^T
      D     K ),   D = (D_1, ···, D_2n)^T.

If the contact form is also of the exact type,

Ω̃ = d( F̃_i dz̃_i ),   F̃_i = { −B,   i = 0;
                               F_i,   i = 1, ···, 2n,   (2.7)

the geometric meaning of the condition of self-adjointness is then the integrability condition for the exact contact structure (2.7). Here B can be calculated from

−∂B/∂z_i = D_i + ∂F_i/∂t,

since

∂/∂z_j ( D_i + ∂F_i/∂t ) = ∂/∂z_i ( D_j + ∂F_j/∂t ).

All of the above discussion can be summarized in the following property.

Theorem 2.2 (Self-Adjointness of Birkhoffian Systems). For a general nonautonomous first-order system (2.4), a necessary and sufficient condition for self-adjointness in a region R̃e of points of R × T^∗R^{2n} is that it is of Birkhoffian type, i.e., the following representation holds for i, j = 1, 2, ···, 2n:

K_ij(z, t) dz_j/dt + D_i(z, t) = ( ∂F_j/∂z_i − ∂F_i/∂z_j ) dz_j/dt − ( ∇B(z, t) + ∂F(z, t)/∂t )_i.   (2.8)

Remark 2.3. The functions F_i and B can be calculated according to the rules [AH75]

F_i = ∫_0^1 λ z_j K_ji(λz, t) dλ,

B = − ∫_0^1 z_i ( D_i + ∂F_i/∂t )(λz, t) dλ.

Due to the self-adjointness of Birkhoff's equations, the phase flow of the system (2.8) conserves the symplecticity:

d/dt Ω = d/dt ( K_ij dz_i ∧ dz_j ) = 0.

So, denoting the phase flow of Equation (2.8) by (ẑ, t̂) yields

K_ij(ẑ, t̂) dẑ_i ∧ dẑ_j = K_ij(z, t) dz_i ∧ dz_j,

or, in the algebraic representation,

( ∂ẑ/∂z )^T K(ẑ, t̂) ( ∂ẑ/∂z ) = K(z, t).

In what follows, algorithms preserving this geometric property of the phase flow in discrete spaces will be constructed.
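The K-symplecticity of the phase flow can be confirmed numerically for the damped oscillator with K(z, t) = e^{νt} J₂ (an assumed Birkhoff structure, anticipating Section 15.5). Integrating the fundamental matrix Φ of the linear flow with RK4 and testing Φ(T)^T K(T) Φ(T) = K(0):

```python
import numpy as np

nu = 0.3
A = np.array([[0.0, 1.0], [-1.0, -nu]])   # damped oscillator as a linear system
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def rk4_flow(T, steps=2000):
    """Integrate the fundamental matrix Phi' = A Phi, Phi(0) = I, with RK4."""
    h = T/steps
    Phi = np.eye(2)
    f = lambda P: A @ P
    for _ in range(steps):
        k1 = f(Phi); k2 = f(Phi + 0.5*h*k1)
        k3 = f(Phi + 0.5*h*k2); k4 = f(Phi + h*k3)
        Phi = Phi + h/6*(k1 + 2*k2 + 2*k3 + k4)
    return Phi

T = 1.7
Phi = rk4_flow(T)
# K-symplecticity of the flow: Phi^T K(T) Phi = K(0) with K(t) = e^{nu t} J
defect = np.exp(nu*T)*(Phi.T @ J @ Phi) - J
```

The defect is at the level of the RK4 integration error, consistent with the exact flow satisfying the algebraic identity above.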

15.3 Generating Functions for K(z, t)-Symplectic Mappings

In this section, general K(z, t)-symplectic mappings and their relationships with gradient mappings and generating functions are considered [FW91b,FW91a,FQ87].

Definition 3.1. Denote

J_{2n} = ( O     I_n
          −I_n   O ),   J_{4n} = ( O       I_{2n}
                                  −I_{2n}  O ),   J̃_{4n} = ( J_{2n}   O
                                                             O   −J_{2n} ),

K̃(ẑ, z, t, t_0) = ( K(ẑ, t)      O
                     O      −K(z, t_0) ).

Then a 2n-dimensional submanifold L ⊂ R^{4n},

L = { (ẑ; z) ∈ R^{4n} | ẑ = ẑ(x, t), z = z(x, t_0), x ∈ U ⊂ R^{2n}, U an open set },

is called a J_{4n}-, J̃_{4n}-, or K̃(ẑ, z, t, t_0)-Lagrangian submanifold if

(T_x L)^T J_{4n} (T_x L) = 0,   (T_x L)^T J̃_{4n} (T_x L) = 0,

or

(T_x L)^T K̃(ẑ, z, t, t_0) (T_x L) = 0,

respectively, where T_x L is the tangent space to L at x.

Definition 3.2. The mapping with parameters t and t_0, z → ẑ = g(z, t, t_0) : R^{2n} → R^{2n}, is called a canonical map, a gradient map, or a K(z, t)-symplectic map if its graph

Γ_g = { (ẑ; z) ∈ R^{4n} | ẑ = g(z, t, t_0), z ∈ R^{2n} }

is a J_{4n}-, J̃_{4n}-, or K̃(ẑ, z, t, t_0)-Lagrangian submanifold, respectively.

For differentiable mappings, there exists an equivalent definition of K-symplecticness, which is also useful for difference schemes.

Definition 3.3. A differentiable mapping g : M → M is K(z, t)-symplectic if

( ∂g/∂z )^T K( g(z, t, t_0), t ) ( ∂g/∂z ) = K(z, t_0).

A difference scheme approximating the Birkhoffian system (2.8) with step size τ,

z^{k+1} = g^k(z^k, t_k + τ, t_k),   k ≥ 0,

is called a K-symplectic scheme if g^k is K-symplectic for every k, i.e.,

( ∂g^k/∂z^k )^T K(z^{k+1}, t_{k+1}) ( ∂g^k/∂z^k ) = K(z^k, t_k).

The graph of the phase flow g^t(z, t_0) = g(z, t, t_0) of the Birkhoffian system (2.8) is a K̃(ẑ, z, t, t_0)-Lagrangian submanifold, for

g_z^t(z, t_0)^T K( g^t(z, t_0), t ) g_z^t(z, t_0) = K(z, t_0).

Similarly, the graph of the phase flow of a standard Hamiltonian system is a J̃_{4n}-Lagrangian submanifold.

Consider the nonlinear transformation with two parameters t and t_0 from R^{4n} to itself,

α(t, t_0) : (ẑ; z) → (ŵ; w) = ( α_1(ẑ, z, t, t_0); α_2(ẑ, z, t, t_0) ),   (3.1)
α^{−1}(t, t_0) : (ŵ; w) → (ẑ; z) = ( ᾱ_1(ŵ, w, t, t_0); ᾱ_2(ŵ, w, t, t_0) ).

Denote the Jacobians of α and its inverse by

α_∗(ẑ, z, t, t_0) = ( A_α  B_α
                      C_α  D_α ),   α_∗^{−1}(ŵ, w, t, t_0) = ( Ā_α  B̄_α
                                                               C̄_α  D̄_α ).

Let α be a diffeomorphism from R^{4n} to itself. Then α carries every K̃-Lagrangian submanifold into a J_{4n}-Lagrangian submanifold if and only if α_∗^T J_{4n} α_∗ = K̃, i.e.,

( A_α  B_α )^T ( J_{2n}   O      ) ( A_α  B_α )   ( K(ẑ, t)      O
( C_α  D_α )   ( O      −J_{2n} ) ( C_α  D_α ) = ( O      −K(z, t_0) ).

Conversely, α^{−1} carries every J_{4n}-Lagrangian submanifold into a K̃-Lagrangian submanifold.
manifold.
Theorem 3.4. Let M ∈ R^{2n×2n}, let α be given as in (3.1), and define the fractional transformation

σ_α : M → N = σ_α(M) = (A_α M + B_α)(C_α M + D_α)^{−1}

under the transversality condition |C_α M + D_α| ≠ 0. Then the following four conditions are mutually equivalent:

|C_α M + D_α| ≠ 0,   |M C̄_α − Ā_α| ≠ 0,
|C̄_α N + D̄_α| ≠ 0,   |N C_α − A_α| ≠ 0.

The proof is direct and simple, so it is omitted here.
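The fractional transformation σ_α can be made concrete in the linear Hamiltonian case K = J. For one classical linear choice of α, σ_α reduces to the Cayley-type transform N = J(I − M)(I + M)^{−1}, which is symmetric exactly when M is symplectic — the matrix analogue of the symplectic-to-gradient correspondence of Theorem 3.5. A numerical check with a random symplectic M built from shear factors (construction ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2                     # degrees of freedom; phase space dimension 2n
J = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.eye(n), np.zeros((n, n))]])

def random_symplectic(layers=3):
    """Product of symplectic shears [[I, S], [0, I]] and [[I, 0], [S, I]], S symmetric."""
    M = np.eye(2*n)
    for _ in range(layers):
        S1 = rng.standard_normal((n, n)); S1 = 0.5*(S1 + S1.T)
        S2 = rng.standard_normal((n, n)); S2 = 0.5*(S2 + S2.T)
        U = np.eye(2*n); U[:n, n:] = S1    # upper shear
        Lw = np.eye(2*n); Lw[n:, :n] = S2  # lower shear
        M = M @ U @ Lw
    return M

M = random_symplectic()
# a Cayley-type sigma_alpha(M) for one classical linear alpha (transversality: I + M invertible)
N = J @ (np.eye(2*n) - M) @ np.linalg.inv(np.eye(2*n) + M)
```

Symmetry of N is exactly what makes it the Jacobian of a gradient map, as the next theorem exploits.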


Theorem 3.5. Let α be defined as in (3.1), and let z → ẑ = g(z, t, t_0) be a K(z, t)-symplectic mapping in some neighborhood R̃ of R^{2n} with Jacobian g_z(z, t, t_0) = M(z, t, t_0). If M satisfies the transversality condition in R̃,

| C_α(g(z, t, t_0), z, t, t_0) M(z, t, t_0) + D_α(g(z, t, t_0), z, t, t_0) | ≠ 0,   (3.2)

then there uniquely exists in R̃ a gradient mapping w → ŵ = f(w, t, t_0) with Jacobian f_w(w, t, t_0) = N(w, t, t_0) and a uniquely defined scalar generating function φ(w, t, t_0) such that

f(w, t, t_0) = φ_w(w, t, t_0),
α_1( g(z, t, t_0), z, t, t_0 ) = f( α_2(g(z, t, t_0), z, t, t_0), t, t_0 )
                              = φ_w( α_2(g(z, t, t_0), z, t, t_0), t, t_0 ),   (3.3)

and

N = (A_α M + B_α)(C_α M + D_α)^{−1},
M = (Ā_α N + B̄_α)(C̄_α N + D̄_α)^{−1}.

Proof. Under the transformation α, the image of the graph Γ_g is

α(Γ_g) = { (ŵ; w) ∈ R^{4n} | ŵ = α_1(g(z, t, t_0), z, t, t_0), w = α_2(g(z, t, t_0), z, t, t_0) }.

Inequality (3.2) implies

| ∂w/∂z | = | (∂α_2/∂ẑ)(∂ẑ/∂z) + ∂α_2/∂z | = | C_α M + D_α | ≠ 0,

so w = α_2( g(z, t, t_0), z, t, t_0 ) is invertible; the inverse function is denoted by z = z(w, t, t_0). Set

ŵ = f(w, t, t_0) = α_1( g(z, t, t_0), z, t, t_0 ) |_{z = z(w, t, t_0)};   (3.4)

then

N = ∂f/∂w = ( (∂α_1/∂ẑ)(∂g/∂z) + ∂α_1/∂z ) ∂z/∂w = (A_α M + B_α)(C_α M + D_α)^{−1}.

Notice that the tangent space to α(Γ_g) at z is

T_z( α(Γ_g) ) = ( ∂ŵ/∂z
                  ∂w/∂z ) = ( A_α M + B_α
                              C_α M + D_α ).

It can be concluded that α(Γ_g) is a J_{4n}-Lagrangian submanifold, for

( T_z α(Γ_g) )^T J_{4n} ( T_z α(Γ_g) )
= ( (A_α M + B_α)^T, (C_α M + D_α)^T ) J_{4n} ( A_α M + B_α
                                                C_α M + D_α )
= (M^T, I) α_∗^T J_{4n} α_∗ ( M
                              I ) = (M^T, I) K̃ ( M
                                                 I ) = 0.

So

(A_α M + B_α)^T (C_α M + D_α) − (C_α M + D_α)^T (A_α M + B_α) = 0,

i.e., N = (A_α M + B_α)(C_α M + D_α)^{−1} is symmetric. This implies that ŵ = f(w, t, t_0) is a gradient mapping. By the Poincaré lemma, there is a scalar function φ(w, t, t_0) such that

f(w, t, t_0) = φ_w(w, t, t_0).   (3.5)

Consider the construction of f(w, t, t_0) and z(w, t, t_0). Since z(w, t, t_0) ∘ α_2(g(z, t, t_0), z, t, t_0) ≡ z, substituting w = α_2(g(z, t, t_0), z, t, t_0) in (3.4) and (3.5) yields Equation (3.3). □

Theorem 3.6. The mapping f(w, t, t_0) obtained in Theorem 3.5 is also the solution of the following implicit equation:

ᾱ_1( f(w, t, t_0), w, t, t_0 ) = g( ᾱ_2(f(w, t, t_0), w, t, t_0), t, t_0 ).

Theorem 3.7. Let α be defined as in Theorem 3.5, and let w → ŵ = f(w, t, t_0) be a gradient mapping in some neighborhood R̃ of R^{2n} with Jacobian f_w(w, t, t_0) = N(w, t, t_0). If N satisfies in R̃ the condition

| C̄_α( f(w, t, t_0), w, t, t_0 ) N(w, t, t_0) + D̄_α( f(w, t, t_0), w, t, t_0 ) | ≠ 0,

then in R̃ there uniquely exists a K(z, t)-symplectic mapping z → ẑ = g(z, t, t_0) with Jacobian g_z(z, t, t_0) = M(z, t, t_0) such that

ᾱ_1( f(w, t, t_0), w, t, t_0 ) = g( ᾱ_2(f(w, t, t_0), w, t, t_0), t, t_0 ),
M = (Ā_α N + B̄_α)(C̄_α N + D̄_α)^{−1},
N = (A_α M + B_α)(C_α M + D_α)^{−1}.

Remark 3.8. The proofs of Theorems 3.6 and 3.7 are similar to that of Theorem 3.5 and are omitted here. Similarly to Theorem 3.6, the function g(z, t, t_0) is the solution of the implicit equation

α_1( g(z, t, t_0), z, t, t_0 ) = f( α_2(g(z, t, t_0), z, t, t_0), t, t_0 ).

15.4 Symplectic Difference Schemes for Birkhoffian Systems

In Section 15.2, it was shown that the phase flow of a general Birkhoffian system is symplectic. With the results of the last section, symplectic schemes for Birkhoffian systems can be constructed by approximating the generating functions.

The Birkhoffian phase flow is denoted by g^t(z, t_0); it is a one-parameter group of K(z, t)-symplectic mappings, at least locally in z and t, i.e., g^{t_0} = identity, g^{t_1+t_2} = g^{t_1} ∘ g^{t_2}. Here z is taken as the initial value at t = t_0, and ẑ(z, t, t_0) = g^t(z, t_0) = g(t; z, t_0) is the solution of the Birkhoffian system (2.8).

Theorem 4.1. Let α be defined as in Theorem 3.5, let z → ẑ = g^t(z, t_0) be the phase flow of the Birkhoffian system (2.8), and let M(t; z, t_0) = g_z(t; z, t_0) be its Jacobian. If at the initial point, i.e., at t = t_0, ẑ = z,

| C_α(z, z, t_0, t_0) + D_α(z, z, t_0, t_0) | ≠ 0,   (4.1)

then for sufficiently small |t − t_0| and in some neighborhood of z ∈ R^{2n} there exists a gradient mapping w → ŵ = f(w, t, t_0) with symmetric Jacobian f_w(w, t, t_0) = N(w, t, t_0) and a uniquely determined scalar generating function φ(w, t, t_0) such that

f(w, t, t_0) = φ_w(w, t, t_0),   (4.2)

∂/∂t φ_w(w, t, t_0) = A( φ_w(w, t, t_0), w, φ_ww(w, t, t_0), t, t_0 ),   (4.3)

A( ŵ, w, ∂ŵ/∂w, t, t_0 ) = Ā( ẑ(ŵ, w, t, t_0), z(ŵ, w, t, t_0), ∂ŵ/∂w, t, t_0 ),   (4.4)

Ā( ẑ, z, ∂ŵ/∂w, t, t_0 ) = d/dt ŵ(ẑ, z, t, t_0) − (∂ŵ/∂w) d/dt w(ẑ, z, t, t_0)
= ( A_α − (∂ŵ/∂w) C_α ) K^{−1} ( ∇B + ∂F/∂t )(ẑ, t) + ∂α_1/∂t − (∂ŵ/∂w) ∂α_2/∂t,   (4.5)

α_1( g(t; z, t_0), z, t, t_0 ) = f( α_2(g(t; z, t_0), z, t, t_0), t, t_0 )
                              = φ_w( α_2(g(t; z, t_0), z, t, t_0), t, t_0 ),

and

N = σ_α(M) = (A_α M + B_α)(C_α M + D_α)^{−1},
M = σ_α^{−1}(N) = (Ā_α N + B̄_α)(C̄_α N + D̄_α)^{−1}.
Proof. M(t; z, t_0) is differentiable with respect to z and t. Condition (4.1) guarantees that for sufficiently small |t − t_0| and for z in some neighborhood of z ∈ R^{2n},

| C_α(ẑ, z, t, t_0) M(t; z, t_0) + D_α(ẑ, z, t, t_0) | ≠ 0.

Additionally, the Birkhoffian phase flow is a symplectic mapping; therefore, by Theorem 3.5, there exists a time-dependent gradient map ŵ = f(w, t, t_0) and a scalar function φ(w, t, t_0) such that

f(w, t, t_0) = φ_w(w, t, t_0),   ∂f(w, t, t_0)/∂t = ∂φ_w(w, t, t_0)/∂t.   (4.6)

Notice that ẑ = g(t; z, t_0) is the solution of the initial-value problem

dẑ/dt = K^{−1}(ẑ, t) ( ∇B + ∂F/∂t )(ẑ, t),   ẑ|_{t=t_0} = z;

therefore, from the transformation (3.1), it follows that

dŵ/dt = (∂ŵ/∂ẑ) dẑ/dt + ∂α_1/∂t = A_α K^{−1} ( ∇B + ∂F/∂t ) + ∂α_1/∂t,
dw/dt = C_α K^{−1} ( ∇B + ∂F/∂t ) + ∂α_2/∂t,

so

∂ŵ/∂t = dŵ/dt − (∂ŵ/∂w) dw/dt = ( A_α − (∂ŵ/∂w) C_α ) K^{−1} ( ∇B + ∂F/∂t ) + ∂α_1/∂t − (∂ŵ/∂w) ∂α_2/∂t.

Since |∂w/∂z| ≠ 0, the functions ẑ = ẑ(ŵ, w, t, t_0) and z = z(ŵ, w, t, t_0) exist, even though they cannot be solved for explicitly from the transformations α and α^{−1}; we thus have

Ā( ẑ, z, ∂ŵ/∂w, t, t_0 ) = ∂ŵ/∂t,

and Equations (4.4) and (4.5). Then, from (4.6), Equation (4.3) follows. □

Following [FW91b,FW91a,FQ87], we can easily construct symplectic difference schemes of any order for autonomous or semi-autonomous Birkhoffian systems. Because of the simplicity of the geometric structure in these cases, the transformation α in (3.1) can be chosen independent of the parameter t; accordingly,

∂ŵ/∂t = dŵ/dt − (∂ŵ/∂w) dw/dt = ( A_α − (∂ŵ/∂w) C_α ) K^{−1} ∇B
= − ( B̄_α^T + (∂ŵ/∂w)^T Ā_α^T ) ∇_ẑ B
= −B_w( ẑ(ŵ, w) )   or   −B_w( ẑ(ŵ, w), t ).

Therefore, the corresponding Birkhoffian system is effectively a Hamiltonian system:

∂φ(w, t)/∂t = −B( ẑ(φ_w, w) ),   ∂φ(w, t, t_0)/∂t = −B( ẑ(φ_w, w), t ),   (4.7)

in the autonomous and semi-autonomous cases, respectively.

Remark 4.2. Because of the forcing term in (2.1), the Hamilton–Jacobi equation for the generating function φ(w, t, t_0) cannot be derived directly; instead, the Hamilton–Jacobi equation (4.3) for φ_w(w, t, t_0) can easily be derived. Assume that the generating function φ_w(w, t, t_0) can be expanded as a convergent power series in t:

φ_w(w, t, t_0) = Σ_{k=0}^{∞} (t − t_0)^k φ_w^{(k)}(w, t_0).   (4.8)

Lemma 4.3. The k-th order total derivative with respect to t of the function A defined in Theorem 4.1 can be written as

D_t^k A = ∂_{φ_w} A · Σ_{i=0}^{∞} (t − t_0)^i φ_w^{(k+i)} + ∂_{φ_ww} A · Σ_{i=0}^{∞} (t − t_0)^i φ_ww^{(k+i)}
  + ∂_t ∂_{φ_w} A · Σ_{i=0}^{∞} (t − t_0)^i φ_w^{(k−1+i)} + ∂_t ∂_{φ_ww} A · Σ_{i=0}^{∞} (t − t_0)^i φ_ww^{(k−1+i)}
  + Σ_{m=0}^{k} Σ_{n=1}^{k−m} Σ_{l=1}^{k−m−n} C_k^m C_{k−m}^n Σ_{h_1+···+h_n+j_1+···+j_l = k−m} ∂_{φ_w}^n ∂_{φ_ww}^l ∂_t^m A
      · ( Σ_{i=0}^{∞} (t − t_0)^i φ_w^{(h_1+i)}, ···, Σ_{i=0}^{∞} (t − t_0)^i φ_w^{(h_n+i)},
          Σ_{i=0}^{∞} (t − t_0)^i φ_ww^{(j_1+i)}, ···, Σ_{i=0}^{∞} (t − t_0)^i φ_ww^{(j_l+i)} ).

Then, at the point t = t_0, the total derivative of A is

D_t^k A |_{t_0} = ∂_{φ_w} A_{t_0} · φ_w^{(k)} + ∂_{φ_ww} A_{t_0} · φ_ww^{(k)}
  + ∂_t ∂_{φ_w} A_{t_0} · φ_w^{(k−1)} + ∂_t ∂_{φ_ww} A_{t_0} · φ_ww^{(k−1)}
  + Σ_{m=0}^{k} Σ_{n=1}^{k−m} Σ_{l=1}^{k−m−n} C_k^m C_{k−m}^n Σ_{h_1+···+h_n+j_1+···+j_l = k−m} ∂_{φ_w}^n ∂_{φ_ww}^l ∂_t^m A_{t_0}
      · ( φ_w^{(h_1)}, ···, φ_w^{(h_n)}, φ_ww^{(j_1)}, ···, φ_ww^{(j_l)} ),

where A_{t_0} = A( φ_w^{(0)}, w, φ_ww^{(0)}, t_0, t_0 ).
By means of these representations of the total derivative of A, the following results can be proved.

Theorem 4.4. Let A and α be analytic. Then the generating function φ_w^{α,A}(w, t, t_0) = φ_w(w, t, t_0) can be expanded as a convergent power series in t for sufficiently small |t − t_0|:

φ_w(w, t, t_0) = Σ_{k=0}^{∞} (t − t_0)^k φ_w^{(k)}(w, t_0),   (4.9)

and the φ_w^{(k)} (k ≥ 0) can be recursively determined by the following equations:

φ_w^{(0)}(w, t_0) = f(w, t_0, t_0),   (4.10)
φ_w^{(1)}(w, t_0) = A( φ_w^{(0)}, w, φ_ww^{(0)}, t_0, t_0 ),   (4.11)
φ_w^{(k+1)}(w, t_0) = 1/(k+1)! · D_t^k A( φ_w^{(0)}, w, φ_ww^{(0)}, t_0, t_0 ).   (4.12)

Proof. Differentiating Equation (4.9) with respect to w and t, we derive

φ_ww(w, t, t_0) = Σ_{k=0}^{∞} (t − t_0)^k φ_ww^{(k)}(w, t_0),   (4.13)

∂/∂t φ_w(w, t, t_0) = Σ_{k=0}^{∞} (k + 1)(t − t_0)^k φ_w^{(k+1)}(w, t_0).   (4.14)

By Equation (4.2),

φ_w^{(0)}(w, t_0) = φ_w(w, t_0, t_0) = f(w, t_0, t_0).

Substituting Equations (4.9) and (4.13) into A( ŵ, w, ∂ŵ/∂w, t, t_0 ) and expanding A at t = t_0, we get

A( φ_w, w, φ_ww, t, t_0 ) = A( f(w, t_0, t_0), w, f_w(w, t_0, t_0), t_0, t_0 )
+ Σ_{k=1}^{∞} 1/k! (t − t_0)^k D_t^k A( φ_w^{(0)}, w, φ_ww^{(0)}, t_0, t_0 ).   (4.15)

Using Equation (4.3) and comparing (4.15) with (4.14), we get (4.11) and (4.12). □

In the autonomous and semi-autonomous cases, $A$ is replaced by the Birkhoffian $B$, which makes it much easier to expand the generating function $\phi$. With Theorems 3.5 and 3.7, the relationship between the Birkhoffian phase flow and the generating function $\phi(w,t,t_0)$ is established. With this result, $K(z,t)$-symplectic difference schemes can be constructed directly.

Theorem 4.5. Let $A$ and $\alpha$ be analytic. For a sufficiently small time-step $\tau>0$, take

$$\psi_w^{(m)}(w,t_0+\tau,t_0)=\sum_{i=0}^{m}\tau^i\phi_w^{(i)}(w,t_0),\qquad m=1,2,\cdots,$$

where the $\phi_w^{(i)}$ are determined by Equations (4.10)–(4.12). Then $\psi_w^{(m)}(w,t_0+\tau,t_0)$ defines a $K(z,t)$-symplectic difference scheme $z^k\to z^{k+1}$,

$$\alpha_1(z^{k+1},z^k,t_{k+1},t_k)=\psi_w^{(m)}\big(\alpha_2(z^{k+1},z^k,t_{k+1},t_k),t_{k+1},t_k\big),\qquad(4.16)$$

of $m$-th order of accuracy.


Proof. Let $N=\phi_{ww}(w^0,t_0,t_0)=\psi_{ww}^{(m)}(w^0,t_0,t_0)$ with $w^0=\alpha(z,z,t_0,t_0)$; then Theorem 3.7 yields $|C_\alpha N+D_\alpha|\ne 0$, because

$$|C_\alpha(z,z,t_0,t_0)+D_\alpha(z,z,t_0,t_0)|\ne 0.$$

Thus, for sufficiently small $\tau$ and in some neighborhood of $w^0$,

$$\big|C_\alpha N^{(m)}(w,t_0+\tau,t_0)+D_\alpha\big|\ne 0,$$

where

$$N^{(m)}(w,t_0+\tau,t_0)=\psi_{ww}^{(m)}(w,t_0+\tau,t_0).$$

By Theorem 3.7, $\psi_w^{(m)}(w,t_0+\tau,t_0)$ defines a $K(z,t)$-symplectic mapping of the form (3.3). Therefore, Equation (4.16) determines an $m$-th order $K(z,t)$-symplectic difference scheme for the Birkhoffian system (2.8). □

15.5 Example

In this section, an example illustrates how to obtain schemes preserving the symplectic structure for a nonconservative system expressed in Birkhoffian representation. Consider the linear damped oscillator

$$\ddot r+\nu\dot r+r=0.\qquad(5.1)$$

We introduce the variable $p$ satisfying $p=\dot r$; then a Birkhoffian representation of (5.1) is given by

$$\begin{pmatrix}0&-e^{\nu t}\\ e^{\nu t}&0\end{pmatrix}
\begin{pmatrix}\dot r\\ \dot p\end{pmatrix}=
\begin{pmatrix}\nu e^{\nu t}p+e^{\nu t}r\\ e^{\nu t}p\end{pmatrix}.\qquad(5.2)$$

The structure matrix and functions are

$$K=\begin{pmatrix}0&-e^{\nu t}\\ e^{\nu t}&0\end{pmatrix},\qquad
K^{-1}=\begin{pmatrix}0&e^{-\nu t}\\ -e^{-\nu t}&0\end{pmatrix},$$
$$F=\begin{pmatrix}\dfrac12e^{\nu t}p\\[4pt] -\dfrac12e^{\nu t}r\end{pmatrix},\qquad
B=\frac12e^{\nu t}\big(r^2+\nu rp+p^2\big),$$

and the energy function reads

$$H(q,p)=\frac12\big(q^2+p^2\big),\qquad \frac{dH}{dt}=-\nu p^2.\qquad(5.3)$$
The Euler midpoint scheme (the one-stage Gauss–Runge–Kutta method) for the system (5.2), which can also be derived via the discrete Lagrange–d'Alembert principle[MW01], reads

$$\frac{q_{k+1}-q_k}{\tau}=\frac{p_{k+1}+p_k}{2},\qquad
\frac{p_{k+1}-p_k}{\tau}=-\nu\,\frac{p_{k+1}+p_k}{2}-\frac{q_{k+1}+q_k}{2},$$

and hence

$$\begin{pmatrix}q_{k+1}\\ p_{k+1}\end{pmatrix}
=\frac{1}{\Delta}\begin{pmatrix}4+2\nu\tau-\tau^2&4\tau\\ -4\tau&4-2\nu\tau-\tau^2\end{pmatrix}
\begin{pmatrix}q_k\\ p_k\end{pmatrix},\qquad(5.4)$$

where $\Delta=\tau^2+2\nu\tau+4$. This scheme is not $K(z,t)$-symplectic.
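As a cross-check, the transition matrix in (5.4) can be recovered by solving the implicit midpoint equations as a $2\times2$ linear system; a small sketch (all names and tolerances ours). The last assertion makes the failure of $K(z,t)$-symplecticity quantitative: $e^{\nu\tau}$ times the determinant of the transition matrix deviates from $1$ by $O(\tau^3)$.

```python
import numpy as np

nu, tau = 0.3, 0.1
# implicit midpoint step for q' = p, p' = -q - nu p:
# x_{k+1} = x_k + tau * L * (x_k + x_{k+1})/2, with L = [[0,1],[-1,-nu]]
L = np.array([[0.0, 1.0], [-1.0, -nu]])
midpoint = np.linalg.solve(np.eye(2) - tau/2*L, np.eye(2) + tau/2*L)

# closed form (5.4)
Delta = tau**2 + 2*nu*tau + 4
closed = np.array([[4 + 2*nu*tau - tau**2, 4*tau],
                   [-4*tau, 4 - 2*nu*tau - tau**2]]) / Delta
assert np.allclose(midpoint, closed)

# not K(z,t)-symplectic: e^{nu tau} * det differs from 1 (the defect is O(tau^3))
assert abs(np.exp(nu*tau) * np.linalg.det(midpoint) - 1.0) > 1e-5
```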


Now, let the transformation $\alpha$ in (3.1) be

$$\hat Q=e^{\nu t}\hat p-e^{\nu t_0}p,\qquad \hat P=\hat q-q,$$
$$Q=\frac12(\hat q+q),\qquad P=-\frac12\big(e^{\nu t}\hat p+e^{\nu t_0}p\big),\qquad(5.5)$$

where the Jacobian of $\alpha$ (with respect to $(\hat q,\hat p,q,p)$) is

$$\alpha_*=\begin{pmatrix}
0&e^{\nu t}&0&-e^{\nu t_0}\\
1&0&-1&0\\
\frac12&0&\frac12&0\\
0&-\frac12e^{\nu t}&0&-\frac12e^{\nu t_0}
\end{pmatrix}.$$
The inverse transformation is

$$\hat q=\frac12\hat P+Q,\qquad \hat p=\frac12e^{-\nu t}\hat Q-e^{-\nu t}P,$$
$$q=-\frac12\hat P+Q,\qquad p=-\frac12e^{-\nu t_0}\hat Q-e^{-\nu t_0}P,\qquad(5.6)$$

and

$$\alpha_*^{-1}=\begin{pmatrix}
0&\frac12&1&0\\
\frac12e^{-\nu t}&0&0&-e^{-\nu t}\\
0&-\frac12&1&0\\
-\frac12e^{-\nu t_0}&0&0&-e^{-\nu t_0}
\end{pmatrix}.$$
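A quick numerical sanity check that $\alpha_*$ and $\alpha_*^{-1}$ are mutually inverse, at sample values of $\nu$, $t$, $t_0$ (values ours):

```python
import numpy as np

nu, t, t0 = 0.4, 1.3, 0.7   # arbitrary sample values (ours)
jac = np.array([
    [0, np.exp(nu*t), 0, -np.exp(nu*t0)],
    [1, 0, -1, 0],
    [0.5, 0, 0.5, 0],
    [0, -0.5*np.exp(nu*t), 0, -0.5*np.exp(nu*t0)],
])
jac_inv = np.array([
    [0, 0.5, 1, 0],
    [0.5*np.exp(-nu*t), 0, 0, -np.exp(-nu*t)],
    [0, -0.5, 1, 0],
    [-0.5*np.exp(-nu*t0), 0, 0, -np.exp(-nu*t0)],
])
# applying the inverse and then alpha_* gives the identity
assert np.allclose(jac @ jac_inv, np.eye(4))
```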
Consequently, using (5.5), (5.6) and (5.2), we derive

$$\frac{d\tilde w}{dt}=\begin{pmatrix}\nu e^{\nu t}\hat p+e^{\nu t}\dot{\hat p}\\ \dot{\hat q}\end{pmatrix}
=\begin{pmatrix}-e^{\nu t}\hat q\\ \hat p\end{pmatrix}
=\begin{pmatrix}-\dfrac12e^{\nu t}\hat P-e^{\nu t}Q\\[4pt] \dfrac12e^{-\nu t}\hat Q-e^{-\nu t}P\end{pmatrix},$$

$$\frac{dw}{dt}=\begin{pmatrix}\dfrac14e^{-\nu t}\hat Q-\dfrac12e^{-\nu t}P\\[4pt] \dfrac14e^{\nu t}\hat P+\dfrac12e^{\nu t}Q\end{pmatrix}.$$

Simple calculations (for $m=0,1$) yield

$$\phi_w^{(0)}=\begin{pmatrix}\hat Q\\ \hat P\end{pmatrix}\bigg|_{t=t_0}=\begin{pmatrix}0\\ 0\end{pmatrix},\qquad
\phi_w^{(1)}=\frac{d\tilde w}{dt}\bigg|_{t=t_0}-\phi_{ww}^{(0)}\,\frac{dw}{dt}\bigg|_{t=t_0}
=\begin{pmatrix}-e^{\nu t_0}Q\\ -e^{-\nu t_0}P\end{pmatrix}.$$

Set $\tilde w=\phi_w^{(0)}+\tau\phi_w^{(1)}$; then the first-order scheme for the system (5.2) reads

$$\frac{q_{k+1}-q_k}{\tau}=e^{-\nu t_k}\,\frac{e^{\nu t_{k+1}}p_{k+1}+e^{\nu t_k}p_k}{2},$$
$$\frac{e^{\nu t_{k+1}}p_{k+1}-e^{\nu t_k}p_k}{\tau}=-e^{\nu t_k}\,\frac{q_{k+1}+q_k}{2},$$

and hence

$$\begin{pmatrix}q_{k+1}\\ p_{k+1}\end{pmatrix}
=\frac{1}{\Delta}\begin{pmatrix}4-\tau^2&4\tau\\ -4\tau e^{-\nu\tau}&(4-\tau^2)e^{-\nu\tau}\end{pmatrix}
\begin{pmatrix}q_k\\ p_k\end{pmatrix},\qquad(5.7)$$

where $\Delta=4+\tau^2$. The transition matrix, denoted by $A$, satisfies

$$A^{\mathrm T}\begin{pmatrix}0&-e^{\nu t_{k+1}}\\ e^{\nu t_{k+1}}&0\end{pmatrix}A
=\begin{pmatrix}0&-e^{\nu t_k}\\ e^{\nu t_k}&0\end{pmatrix}.$$
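This identity is easy to verify numerically; a sketch (names ours):

```python
import numpy as np

nu, tau, tk = 0.3, 0.1, 0.5
Delta = 4 + tau**2
A = np.array([[4 - tau**2, 4*tau],
              [-4*tau*np.exp(-nu*tau), (4 - tau**2)*np.exp(-nu*tau)]]) / Delta

def Kmat(t):
    # structure matrix K(t) = [[0, -e^{nu t}], [e^{nu t}, 0]]
    return np.array([[0.0, -np.exp(nu*t)], [np.exp(nu*t), 0.0]])

# K(z,t)-symplecticity of scheme (5.7): A^T K(t_{k+1}) A = K(t_k)
assert np.allclose(A.T @ Kmat(tk + tau) @ A, Kmat(tk))
```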

Next, consider the transformation $\alpha$ in (3.1) given by

$$\hat Q=e^{\nu t/2}\hat p-e^{\nu t_0/2}p,\qquad \hat P=-e^{\nu t/2}\hat q+e^{\nu t_0/2}q,$$
$$Q=\frac12\big(e^{\nu t/2}\hat q+e^{\nu t_0/2}q\big),\qquad P=-\frac12\big(e^{\nu t/2}\hat p+e^{\nu t_0/2}p\big).$$

The Jacobian of $\alpha$ is

$$\alpha_*=\begin{pmatrix}
0&e^{\nu t/2}&0&-e^{\nu t_0/2}\\
-e^{\nu t/2}&0&e^{\nu t_0/2}&0\\
\frac12e^{\nu t/2}&0&\frac12e^{\nu t_0/2}&0\\
0&-\frac12e^{\nu t/2}&0&-\frac12e^{\nu t_0/2}
\end{pmatrix}$$
and the inverse

$$\alpha_*^{-1}=\begin{pmatrix}
0&-\frac12e^{-\nu t/2}&e^{-\nu t/2}&0\\
\frac12e^{-\nu t/2}&0&0&-e^{-\nu t/2}\\
0&\frac12e^{-\nu t_0/2}&e^{-\nu t_0/2}&0\\
-\frac12e^{-\nu t_0/2}&0&0&-e^{-\nu t_0/2}
\end{pmatrix}.$$
Direct calculation yields the scheme of second order

$$\frac{e^{\nu t_{k+1}/2}q_{k+1}-e^{\nu t_k/2}q_k}{\tau}
=\frac{e^{\nu t_{k+1}/2}p_{k+1}+e^{\nu t_k/2}p_k}{2}
+\nu\,\frac{e^{\nu t_{k+1}/2}q_{k+1}+e^{\nu t_k/2}q_k}{4},$$
$$\frac{e^{\nu t_{k+1}/2}p_{k+1}-e^{\nu t_k/2}p_k}{\tau}
=-\frac{e^{\nu t_{k+1}/2}q_{k+1}+e^{\nu t_k/2}q_k}{2}
-\nu\,\frac{e^{\nu t_{k+1}/2}p_{k+1}+e^{\nu t_k/2}p_k}{4},$$

and hence

$$\begin{pmatrix}q_{k+1}\\ p_{k+1}\end{pmatrix}
=\frac{e^{-\nu\tau/2}}{\Delta}\begin{pmatrix}w_1&-16\tau\\ 16\tau&w_2\end{pmatrix}
\begin{pmatrix}q_k\\ p_k\end{pmatrix},\qquad(5.8)$$

where $\Delta=\nu^2\tau^2-4\tau^2-16$ and

$$w_1=-16-8\nu\tau-\nu^2\tau^2+4\tau^2,\qquad w_2=-16+8\nu\tau-\nu^2\tau^2+4\tau^2.$$

Abbreviating the matrix in (5.8), including the factor $e^{-\nu\tau/2}/\Delta$, by $M(\tau)$, then by composition[Yos90,QZ92] we obtain the scheme of order four

$$\begin{pmatrix}q_{k+1}\\ p_{k+1}\end{pmatrix}
=M(c_1\tau)\,M(c_2\tau)\,M(c_1\tau)\begin{pmatrix}q_k\\ p_k\end{pmatrix},\qquad(5.9)$$

where

$$c_1=\frac{1}{2-2^{1/3}},\qquad c_2=\frac{-2^{1/3}}{2-2^{1/3}}.$$
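The order of the composed scheme (5.9) can be checked against the exact flow of (5.1); the following sketch (variable names and tolerances ours) also confirms that both maps preserve the weighted area $e^{\nu t}\,dq\wedge dp$, i.e. have determinant $e^{-\nu\tau}$:

```python
import numpy as np

nu = 0.3
omega = np.sqrt(1 - nu*nu/4)   # underdamped frequency

def M(tau):
    # one step of scheme (5.8), including the prefactor e^{-nu tau/2}/Delta
    Delta = nu**2*tau**2 - 4*tau**2 - 16
    w1 = -16 - 8*nu*tau - nu**2*tau**2 + 4*tau**2
    w2 = -16 + 8*nu*tau - nu**2*tau**2 + 4*tau**2
    return (np.exp(-nu*tau/2)/Delta) * np.array([[w1, -16*tau], [16*tau, w2]])

def exact(tau):
    # exact propagator of q'' + nu q' + q = 0 in (q, p) variables
    c, s = np.cos(omega*tau), np.sin(omega*tau)
    return np.exp(-nu*tau/2) * np.array([[c + nu/(2*omega)*s, s/omega],
                                         [-s/omega, c - nu/(2*omega)*s]])

c1 = 1/(2 - 2**(1/3)); c2 = -2**(1/3)/(2 - 2**(1/3))
M4 = lambda tau: M(c1*tau) @ M(c2*tau) @ M(c1*tau)

# both maps preserve e^{nu t} dq ^ dp: det = e^{-nu tau}
assert abs(np.linalg.det(M(0.2)) - np.exp(-nu*0.2)) < 1e-12
assert abs(np.linalg.det(M4(0.2)) - np.exp(-nu*0.2)) < 1e-12
# the one-step error of the composed map scales like tau^5 (order four)
e1 = np.linalg.norm(M4(0.4) - exact(0.4))
e2 = np.linalg.norm(M4(0.2) - exact(0.2))
assert 20 < e1/e2 < 45
```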
Taking $m=2$, we find $\phi_w^{(2)}=0$.

Now take $m=3$. In general,

$$\phi_w^{(3)}=\frac{1}{3!}\left[\frac{\partial}{\partial t}\left(\frac{\partial}{\partial t}\frac{d\tilde w}{dt}-\frac{\partial\tilde w}{\partial w}\frac{dw}{dt}\right)-\frac{\partial^2\tilde w}{\partial w\,\partial w}\left(\frac{dw}{dt},\frac{\partial\tilde w}{\partial t}\right)-\frac{\partial\tilde w}{\partial w}\,\frac{\partial}{\partial t}\frac{dw}{dt}-\frac{\partial}{\partial t}\left(\frac{\partial\tilde w}{\partial w}\right)\frac{dw}{dt}\right].\qquad(5.10)$$

For the equation $\ddot q+\nu\dot q+q=0$, among the third derivatives of $\phi$ at $t=t_0$ only one term survives, namely

$$\frac{\partial}{\partial t}\left(\frac{\partial\tilde w}{\partial w}\right)\frac{dw}{dt}.$$

Simple calculation yields

$$\phi_w^{(3)}\Big|_{t=t_0}=\frac{1}{24}\Big(2-\frac{\nu^2}{2}\Big)
\begin{pmatrix}\dfrac{\nu}{2}P-Q\\[4pt] P-\dfrac{\nu}{2}Q\end{pmatrix}
=\frac{1}{24}\Big(2-\frac{\nu^2}{2}\Big)\,\phi_w^{(1)}.$$
Setting $\tilde w=\tau\phi_w^{(1)}+\tau^3\phi_w^{(3)}$, we get the fourth-order symmetric symplectic scheme

$$\frac{e^{\nu t_{k+1}/2}q_{k+1}-e^{\nu t_k/2}q_k}{\tau}
=\frac{e^{\nu t_{k+1}/2}p_{k+1}+e^{\nu t_k/2}p_k}{2}
+\nu\,\frac{e^{\nu t_{k+1}/2}q_{k+1}+e^{\nu t_k/2}q_k}{4}$$
$$\qquad+\frac{\tau^2}{24\times4}\Big(2-\frac{\nu^2}{2}\Big)
\Big[2\big(e^{\nu t_{k+1}/2}p_{k+1}+e^{\nu t_k/2}p_k\big)
+\nu\big(e^{\nu t_{k+1}/2}q_{k+1}+e^{\nu t_k/2}q_k\big)\Big],$$

$$\frac{e^{\nu t_{k+1}/2}p_{k+1}-e^{\nu t_k/2}p_k}{\tau}
=-\frac{e^{\nu t_{k+1}/2}q_{k+1}+e^{\nu t_k/2}q_k}{2}
-\nu\,\frac{e^{\nu t_{k+1}/2}p_{k+1}+e^{\nu t_k/2}p_k}{4}$$
$$\qquad-\frac{\tau^2}{24\times4}\Big(2-\frac{\nu^2}{2}\Big)
\Big[2\big(e^{\nu t_{k+1}/2}q_{k+1}+e^{\nu t_k/2}q_k\big)
+\nu\big(e^{\nu t_{k+1}/2}p_{k+1}+e^{\nu t_k/2}p_k\big)\Big].\qquad(5.11)$$
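Under our reading of (5.11), one step amounts to a $2\times2$ linear solve in the scaled variables $u=e^{\nu t/2}q$, $v=e^{\nu t/2}p$; a sketch (our construction) checking fourth-order accuracy against the exact flow:

```python
import numpy as np

nu = 0.3
omega = np.sqrt(1 - nu*nu/4)

def step(tau):
    # one step of (5.11): in the scaled variables the scheme is a midpoint-type
    # relation with the modified step lam = tau * (1 + tau^2/24 * (2 - nu^2/2))
    lam = tau * (1 + tau*tau/24 * (2 - nu*nu/2))
    L = np.array([[nu/2, 1.0], [-1.0, -nu/2]])
    cay = np.linalg.solve(np.eye(2) - lam/2*L, np.eye(2) + lam/2*L)
    return np.exp(-nu*tau/2) * cay   # undo the scaling at t_{k+1}

def exact(tau):
    c, s = np.cos(omega*tau), np.sin(omega*tau)
    return np.exp(-nu*tau/2) * np.array([[c + nu/(2*omega)*s, s/omega],
                                         [-s/omega, c - nu/(2*omega)*s]])

e1 = np.linalg.norm(step(0.4) - exact(0.4))
e2 = np.linalg.norm(step(0.2) - exact(0.2))
assert 20 < e1/e2 < 45                                         # local error ~ tau^5
assert abs(np.linalg.det(step(0.4)) - np.exp(-nu*0.4)) < 1e-12 # K(z,t)-symplectic
```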
This method is easily extended to more general ODEs such as

$$\dot p+\beta'(t)\,p+V(q,t)=0,\qquad \dot q-G(p,t)=0.\qquad(5.12)$$

Remark 5.1. The derived schemes (5.7), (5.8), and (5.9) are $K(z,t)$-symplectic, i.e., for $\tau>0$ and $k\ge 0$ they satisfy the Birkhoffian condition

$$e^{\nu t_{k+1}}\,dq_{k+1}\wedge dp_{k+1}=e^{\nu t_k}\,dq_k\wedge dp_k.$$

15.6 Numerical Experiments

In this section, we present numerical results for the linear damped oscillator (5.1),
resp., (5.2) using the derived K(z, t)-symplectic schemes (5.7), (5.8), and (5.9) of
order one, two, and four, respectively. Further, we use Euler’s midpoint scheme (5.4),
which is not K(z, t)-symplectic but shows convenient numerical results[MW01] , and
further Euler’s explicit scheme for comparison.
In the presented figures, the initial values are always chosen as q(0) = 1,
p(0) = q̇(0) = −1, and the time interval is from 0 to 25. There are only small
differences in the behavior of the different schemes choosing other initial values. The
actual error, err = |approximate solution - true solution|, is computed with step size
τ = 0.2. Using different step sizes, the schemes always show the same quality, which
is emphasized by representing the results in a double logarithmic scale using step sizes
τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5. The orbits are computed with step size τ = 0.05.
The first comparison is between scheme (5.7) and Euler's explicit scheme, both of order one. For smaller $\nu$, i.e., $0\le\nu\le1.3$, scheme (5.7) is better; for $\nu>1.3$, Euler's explicit scheme is better. The second comparison is between scheme (5.8) and Euler's midpoint scheme (5.4), both of order two. For $0\le\nu\le0.5$ both schemes show the same behavior; for $0.5<\nu<2.8$ scheme (5.8) is better, with the largest advantage around $\nu=2$; and for $\nu\ge2.8$ Euler's midpoint scheme behaves better. The third comparison is between scheme (5.9) of order four and scheme (5.8) of order two. Both schemes have the same structure-preserving property, and the higher order scheme (5.9) therefore shows a clear superiority over the second-order scheme. These differences between the discussed schemes are illustrated by the error curves (Figs. 6.1–6.4).
For the energy function (5.3), comparisons of the energy error between the different schemes are also shown in double logarithmic scale (Figs. 6.5–6.8). The results show that neither scheme (5.7) nor Euler's explicit scheme clearly dominates, while scheme (5.8) is always better than Euler's midpoint scheme for growing $\nu$, even for $\nu\ge2.8$. Scheme (5.9) keeps its superiority in these comparisons.

The comparisons also show that different schemes, obtained from different transformations $\alpha$, may preserve different quantities. This point has been proved for the generating function method for Hamiltonian systems (see Feng et al.[FW91b,FW91a]). The extension to Birkhoffian systems will be studied in a future paper.
[Figure: double logarithmic plot of max-err versus τ (τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5) for q'' + 0.6 q' + q = 0, comparing explicit Euler, midpoint scheme (5.4), and schemes (5.7), (5.8), (5.9).]

Fig. 6.1. Error comparison between the different schemes for ν = 0.6

[Figure: double logarithmic plot of max-err versus τ (τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5) for q'' + 1.3 q' + q = 0, comparing explicit Euler, midpoint scheme (5.4), and schemes (5.7), (5.8), (5.9).]

Fig. 6.2. Error comparison between the different schemes for ν = 1.3
[Figure: double logarithmic plot of max-err versus τ (τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5) for q'' + 1.9 q' + q = 0, comparing explicit Euler, midpoint scheme (5.4), and schemes (5.7), (5.8), (5.9).]

Fig. 6.3. Error comparison between the different schemes for ν = 1.9

[Figure: double logarithmic plot of max-err versus τ (τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5) for q'' + 2.8 q' + q = 0, comparing explicit Euler, midpoint scheme (5.4), and schemes (5.7), (5.8), (5.9).]

Fig. 6.4. Error comparison between the different schemes for ν = 2.8
[Figure: double logarithmic plot of max-err-H versus τ (τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5) for the damped oscillator with ν = 0.6, comparing explicit Euler, midpoint scheme (5.4), and schemes (5.7), (5.8), (5.9).]

Fig. 6.5. Energy error comparison between the different schemes for ν = 0.6

[Figure: double logarithmic plot of max-err-H versus τ (τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5) for the damped oscillator with ν = 1.3, comparing explicit Euler, midpoint scheme (5.4), and schemes (5.7), (5.8), (5.9).]

Fig. 6.6. Energy error comparison between the different schemes for ν = 1.3
[Figure: double logarithmic plot of max-err-H versus τ (τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5) for the damped oscillator with ν = 1.9, comparing explicit Euler, midpoint scheme (5.4), and schemes (5.7), (5.8), (5.9).]

Fig. 6.7. Energy error comparison between the different schemes for ν = 1.9

[Figure: double logarithmic plot of max-err-H versus τ (τ = 0.01, 0.02, 0.05, 0.1, 0.2, 0.5) for the damped oscillator with ν = 2.8, comparing explicit Euler, midpoint scheme (5.4), and schemes (5.7), (5.8), (5.9).]

Fig. 6.8. Energy error comparison between the different schemes for ν = 2.8
Bibliography

[AH75] R.W. Atherton and G.M. Homsy: On the existence and formulation of variational prin-
ciples for nonlinear differential equations. Studies in Applied Mathematics, LIV(1):1531–
1551, (1975).
[Arn89] V. I. Arnold: Mathematical Methods of Classical Mechanics. Springer-Verlag, GTM
60, Berlin Heidelberg, Second edition, (1989).
[FQ87] K. Feng and M.Z. Qin: The symplectic methods for the computation of Hamiltonian
equations. In Y. L. Zhu and B. Y. Guo, editors, Numerical Methods for Partial Differential
Equations, Lecture Notes in Mathematics 1297, pages 1–37. Springer, Berlin, (1987).
[FW91a] K. Feng and D.L. Wang: A Note on conservation laws of symplectic difference
schemes for Hamiltonian systems. J. Comput. Math., 9(3):229–237, (1991).
[FW91b] K. Feng and D.L. Wang: Symplectic difference schemes for Hamiltonian systems in
general symplectic structure. J. Comput. Math., 9(1):86–96, (1991).
[GLSM01] Y.X. Guo, S.K. Luo, M. Shang, and F.X. Mei: Birkhoffian formulations of non-
holonomic constrained systems. Reports on Mathematical Physics, 47:313–322, (2001).
[HLW02] E. Hairer, Ch. Lubich, and G. Wanner: Geometric Numerical Integration. Num-
ber 31 in Springer Series in Computational Mathematics. Springer-Verlag, Berlin, (2002).
[MP91] E. Massa and E. Pagani: Classical dynamics of non-holonomic systems: a geometric approach. Annales de l'Institut Henri Poincaré (A) Physique théorique, 55(1):511–544, (1991).
[MR99] J. E. Marsden and T. S. Ratiu: Introduction to Mechanics and Symmetry. Number 17
in Texts in Applied Mathematics. Springer-Verlag, Berlin, second edition, (1999).
[MW01] J. E. Marsden and M. West: Discrete mechanics and variational integrators. Acta
Numerica, 10:357–514, (2001).
[QZ92] M. Z. Qin and W. J. Zhu: Construction of higher order symplectic schemes by com-
position. Computing, 47:309–321, (1992).
[San83a] R.M. Santilli: Foundations of Theoretical Mechanics I. Springer-Verlag, New York, Second edition, (1983).
[San83b] R.M. Santilli: Foundations of Theoretical Mechanics II. Springer-Verlag, New York,
Second edition, (1983).
[SQ03] H. L. Su and M. Z. Qin: Symplectic schemes for Birkhoffian system. Technical Report
arXiv: math-ph/0301001, (2003).
[SSC94] J. M. Sanz-Serna and M. P. Calvo: Numerical Hamiltonian Problems. AMMC 7.
Chapman & Hall, London, (1994).
[SSQS07] H. L. Su, Y.J. Sun, M. Z. Qin, and R. Scherer: Symplectic schemes for Birkhoffian system. International Journal of Pure and Applied Mathematics, 40(3):341–366, (2007).
[SVC95] W. Sarlet, A. Vandecasteele, and F. Cantrijn: Derivations of forms along a map:
The framework for time-dependent second-order equations. Diff. Geom. Appl., 5:171–203,
(1995).
[Yos90] H. Yoshida: Construction of higher order symplectic integrators. Physics Letters A,
150:262–268, (1990).
Chapter 16.
Multisymplectic and Variational Integrators

Recently, multisymplectic discretizations have been drawing much attention and have therefore become a vigorous component of structure-preserving algorithms. In this chapter, we systematically develop what our research group has achieved in the field of multisymplectic discretizations. Some very interesting new issues arising in this field are also discussed. Multisymplectic and variational integrators are studied from a comparative point of view. The implementation issues of multisymplectic integrators are discussed, and composition methods to construct higher order multisymplectic integrators are presented. The equivalence of variational integrators to multisymplectic integrators is proved. Several generalizations are also described.

16.1 Introduction
The introduction of symplectic integrators is a milestone in the development of numer-
ical analysis[Fen85] . It has led to the establishment of structure-preserving algorithms, a
very promising subject. Due to its high accuracy, good stability and, in particular, the
capability for long-term computation, the structure-preserving algorithms have proved
to be very powerful in numerical simulations. The applications of structure-preserving
algorithms can be found on diverse branches of physics, such as celestial mechanics,
quantum mechanics, fluid dynamics, geophysics[LQHD07,MPSM01,WHT96] , etc.
Symplectic algorithms for finite dimensional Hamiltonian systems have been well
established. They not only bring new insights into existing methods but also lead to
many powerful new numerical methods. The structure-preserving algorithms for in-
finite dimensional Hamiltonian systems are comparatively less explored. Symplec-
tic integrators for infinite dimensional Hamiltonian systems were also considered
[Qin90,LQ88,Qin87,Qin97a]
. The basic idea is, first to discretize the space variables appro-
priately so that the resulting semi-discrete system is a Hamiltonian system in time;
and second, to apply symplectic methods to this semi-discrete system. The symplectic
integrator obtained in this way preserves a symplectic form which is a sum over the
discrete space variables. In spite of its success, a problem remains: the change of the
symplectic structure over the spatial domain is not reflected in such methods.
This problem was solved by introducing the concept of multisymplectic integra-
tors (Bridges and Reich[BR01a,BR06] ). In general, an infinite dimensional Hamiltonian
system can be reformulated as a multisymplectic Hamiltonian system in which as-
sociated to every time and space direction, there exists a symplectic structure and a
642 16. Multisymplectic and Variational Integrators

multisymplectic conservation law is satisfied. The multisymplectic conservation law is


completely local and reflects the change of the symplecticity over the space domain. A
multisymplectic integrator is a numerical scheme for the multisymplectic Hamiltonian
system which preserves a discrete multisymplectic conservation law, characterizing
the spatial change of the discrete symplectic structure. The multisymplectic integrator
is the direct generalization of the symplectic integrator and has good performance in
conserving local conservation laws. A disadvantage of the multisymplectic integrator
is the introduction of many new variables which usually are not needed in numerical
experiments. To solve this problem, we can eliminate the additional variables from
some multisymplectic integrators and obtain a series of new schemes for the equa-
tions considered. On the construction of multisymplectic integrators, it was proved
that using symplectic Runge–Kutta integrators in both directions lead to multisym-
plectic integrators[Rei00] . In this chapter, another approach, namely the composition
method will be presented.
The multisymplectic integrator is based on the Hamiltonian formalism. In the La-
grangian formalism, a geometric-variational approach to continuous and discrete me-
chanics and field theories is known by Marsden, Patrik, and Shkoller[MPS98] . The mul-
tisymplectic form is obtained directly from the variational principle, staying entirely
on the Lagrangian side, but the local energy and momentum conservation laws are not
particularly addressed. By disretizing the Lagrangian and using a discrete variational
principle, variational integrators are obtained, which satisfy a discrete multisymplectic
form[MPS98] . Taking the sine-Gordon equation and the nonlinear Schrödinger equation
as examples, we will show that some variational integrators are equivalent to multi-
symplectic integrators.
In addition to the standard multisymplectic and variational integrators, we have
more ambitious goal of presenting some generalizations, including multisymplec-
tic Fourier pseudospectral methods on real space, nonconservative multisymplectic
Hamiltonian systems, constructions of multisymplectic integrators for modified equa-
tions and multisymplectic Birkhoffian systems[SQ01,SQWR08,SQS07] .
This chapter is organized as follows. In the next Section 16.2, the basic theory
of multisymplectic geometry and multisymplectic Hamiltonian systems is presented.
Section 16.3 is devoted to developing multisymplectic integrators. In Section 16.4, the
variational integrators are discussed. In Section 16.5, some generalizations are given.

16.2 Multisymplectic Geometry and Multisymplectic


Hamiltonian Systems
In this section, the basic theory needed for multisymplectic and variational integrators is discussed. It includes multisymplectic geometry and multisymplectic Hamiltonian systems. We will present the theory from the perspective of the total variation[Lee82,Lee87], often called the Lee variational integrator (see Chapter 14)[Che02,CGW03].

1. Multisymplectic geometry

Exclusively, local coordinates are used, and the notion
of prolongation spaces instead of jet bundles[Olv86,Che05c] is employed. The covariant
configuration space is denoted by X × U and X represents the space of independent
variables with coordinates xμ (μ = 1, 2, · · · , n, 0), and U the space of dependent
variables with coordinates uA (A = 1, 2, · · · , N ). The first-order prolongation of X ×
U is defined to be
U (1) = X × U × U1 , (2.1)
where U1 represents the space consisting of first-order partial derivatives of uA with
respect to xμ .
Let φ : X → U be a smooth function, then its first prolongation is denoted by

pr1 φ = (xμ , φA , φA
μ ).

A Lagrangian density $L$ is defined as follows:

$$L: U^{(1)}\longrightarrow\Lambda^{n+1}(X),\qquad
L(\mathrm{pr}^1\phi)=L\big(x^\mu,\phi^A,\phi^A_\mu\big)\,d^{n+1}x,\qquad(2.2)$$

where $\Lambda^{n+1}(X)$ is the space of $(n+1)$-forms over $X$.


Corresponding to the Lagrangian density (2.2), the action functional is defined by

$$S(\phi)=\int_M L\big(x^\mu,\phi^A,\phi^A_\mu\big)\,d^{n+1}x,\qquad M\ \text{an open set in}\ X.\qquad(2.3)$$

Let $V$ be a vector field on $X\times U$ of the form

$$V=\xi^\mu(x)\frac{\partial}{\partial x^\mu}+\alpha^A(x,u)\frac{\partial}{\partial u^A},$$

where $x=(x^1,\cdots,x^n,x^0)$, $u=(u^1,\cdots,u^N)$, and the Einstein summation convention is used.
The flow $\exp(\lambda V)$ of the vector field $V$ is a one-parameter transformation group of $X\times U$ and transforms a map $\phi: M\to U$ into a family of maps $\tilde\phi:\tilde M\to U$ depending on the parameter $\lambda$. Now we calculate the variation of the action functional (2.3). For simplicity, let $n=1$, $N=1$ and $x^1=x$, $x^0=t$, $u^1=u$, $\alpha^1=\alpha$; then it follows that

$$\delta S=\frac{d}{d\lambda}\bigg|_{\lambda=0}S(\tilde\phi)
=\frac{d}{d\lambda}\bigg|_{\lambda=0}\int_{\tilde M}L\big(\tilde x,\tilde t,\tilde\phi,\tilde\phi_{\tilde x},\tilde\phi_{\tilde t}\big)\,d\tilde x\wedge d\tilde t=A+B,$$
dλ λ=0 dλ λ=0 M̃

where

$$\begin{aligned}A=\int_M\bigg\{&\bigg[\frac{\partial L}{\partial t}+D_t\Big(\frac{\partial L}{\partial\phi_t}\phi_t-L\Big)+D_x\Big(\frac{\partial L}{\partial\phi_x}\phi_t\Big)\bigg]\xi^0\\
&+\bigg[\frac{\partial L}{\partial x}+D_x\Big(\frac{\partial L}{\partial\phi_x}\phi_x-L\Big)+D_t\Big(\frac{\partial L}{\partial\phi_t}\phi_x\Big)\bigg]\xi^1\\
&+\bigg[\frac{\partial L}{\partial\phi}-D_x\frac{\partial L}{\partial\phi_x}-D_t\frac{\partial L}{\partial\phi_t}\bigg]\alpha\bigg\}\,dx\wedge dt,\qquad(2.4)\end{aligned}$$

and

$$\begin{aligned}B=\int_{\partial M}\bigg\{&\bigg[\Big(\frac{\partial L}{\partial\phi_t}\phi_t-L\Big)dx-\frac{\partial L}{\partial\phi_x}\phi_t\,dt\bigg]\xi^0\\
&+\bigg[\Big(L-\frac{\partial L}{\partial\phi_x}\phi_x\Big)dt+\frac{\partial L}{\partial\phi_t}\phi_x\,dx\bigg]\xi^1\\
&+\bigg[\frac{\partial L}{\partial\phi_x}\,dt-\frac{\partial L}{\partial\phi_t}\,dx\bigg]\alpha\bigg\}.\qquad(2.5)\end{aligned}$$

If ξ 1 (x), ξ 0 (x), and α(x, t, φ(x, t)) have compact support on M , then B = 0. In this
case, with the requirement of δS = 0 and from (2.4), the variation ξ 0 yields the local
energy evolution equation
   
∂L ∂L ∂L
+ Dt φt − L + Dx φt = 0, (2.6)
∂t ∂ φt ∂ φx

and the variation ξ 1 the local momentum evolution equation


   
∂L ∂L ∂L
+ Dx φx − L + Dt φx = 0. (2.7)
∂x ∂ φx ∂ φt

For a conservative L, i.e., the one that does not depend on x, t explicitly, (2.6) and
(2.7) become the local energy conservation law and the local momentum conservation
law respectively.
The variation α yields the Euler–Lagrange equation

∂L ∂L ∂L
− Dx − Dt = 0. (2.8)
∂φ ∂φx ∂φt

If the condition that $\xi^1(x,t)$, $\xi^0(x,t)$, $\alpha(x,t,\phi(x,t))$ have compact support on $M$ is not imposed, then from the boundary integral $B$ we can define the Cartan form

$$\Theta_L=\frac{\partial L}{\partial\phi_x}\,d\phi\wedge dt-\frac{\partial L}{\partial\phi_t}\,d\phi\wedge dx
+\Big(L-\phi_x\frac{\partial L}{\partial\phi_x}-\phi_t\frac{\partial L}{\partial\phi_t}\Big)\,dx\wedge dt,\qquad(2.9)$$

which, using the interior product $\lrcorner$ and the pull-back mapping $(\,)^*$, satisfies

$$B=\int_{\partial M}\big(\mathrm{pr}^1\phi\big)^*\big(\mathrm{pr}^1V\,\lrcorner\,\Theta_L\big).\qquad(2.10)$$
The multisymplectic form is defined to be ΩL = d ΘL .

Theorem 2.1. [MPS98,GAR73] Suppose $\phi$ is a solution of (2.8), let $\eta^\lambda$ and $\zeta^\lambda$ be two one-parameter symmetry groups of Equation (2.8), and let $V_1$ and $V_2$ be the corresponding infinitesimal symmetries. Then we have the multisymplectic form formula

$$\int_{\partial M}\big(\mathrm{pr}^1\phi\big)^*\big(\mathrm{pr}^1V_1\,\lrcorner\,\mathrm{pr}^1V_2\,\lrcorner\,\Omega_L\big)=0.\qquad(2.11)$$

2. Multisymplectic Hamiltonian systems

A large class of partial differential equations can be represented as[BR06,Bri97]

$$Mz_t+Kz_x=\nabla_zS(z),\qquad(2.12)$$

where $z\in\mathbf R^n$, $M$ and $K$ are antisymmetric matrices in $\mathbf R^{n\times n}$, $n\ge 3$, and $S:\mathbf R^n\to\mathbf R$ is a smooth function. Here, for simplicity, we only consider one space dimension.

We call (2.12) a multisymplectic Hamiltonian system, since it possesses the multisymplectic conservation law

$$D_t\omega+D_x\kappa=0,\qquad(2.13)$$

where $D_t=\frac{d}{dt}$, $D_x=\frac{d}{dx}$, and $\omega$ and $\kappa$ are the pre-symplectic forms

$$\omega=\frac12\,dz\wedge M\,dz,\qquad\kappa=\frac12\,dz\wedge K\,dz,$$

associated to the time direction and the space direction, respectively.
The system (2.12) satisfies a local energy conservation law

$$D_tE+D_xF=0,\qquad(2.14)$$

with energy density

$$E=S(z)-\frac12z^{\mathrm T}Kz_x$$

and energy flux

$$F=\frac12z^{\mathrm T}Kz_t.$$

The system (2.12) also has a local momentum conservation law

$$D_tI+D_xG=0,\qquad(2.15)$$

with momentum density

$$I=\frac12z^{\mathrm T}Mz_x$$

and momentum flux

$$G=S(z)-\frac12z^{\mathrm T}Mz_t.$$
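As a numerical illustration (our construction, not from the text), the local energy conservation law (2.14) can be checked along the exact kink $u=4\arctan e^{(x-ct)/\sqrt{1-c^2}}$ of the sine-Gordon equation $u_{tt}-u_{xx}+\sin u=0$, treated in Example 3.1 below, with $z=(u,u_t,u_x)$ and $S(z)=(u_t^2-u_x^2)/2-\cos u$; all derivatives are taken by central differences:

```python
import math

c = 0.5
g = math.sqrt(1 - c*c)
u = lambda x, t: 4*math.atan(math.exp((x - c*t)/g))   # exact kink

h = 1e-3
d_t = lambda fn, x, t: (fn(x, t+h) - fn(x, t-h)) / (2*h)   # central differences
d_x = lambda fn, x, t: (fn(x+h, t) - fn(x-h, t)) / (2*h)

def E(x, t):
    # E = S(z) - (1/2) z^T K1 z_x, where z^T K1 z_x = u*u_xx - u_x^2
    v, w = d_t(u, x, t), d_x(u, x, t)
    uxx = (d_x(u, x+h, t) - d_x(u, x-h, t)) / (2*h)
    return 0.5*(v*v - w*w) - math.cos(u(x, t)) - 0.5*(u(x, t)*uxx - w*w)

def F(x, t):
    # F = (1/2) z^T K1 z_t = (u*u_xt - u_x*u_t)/2
    v, w = d_t(u, x, t), d_x(u, x, t)
    uxt = (d_x(u, x, t+h) - d_x(u, x, t-h)) / (2*h)
    return 0.5*(u(x, t)*uxt - w*v)

# D_t E + D_x F vanishes along solutions, up to finite-difference error
res = d_t(E, 0.3, 0.2) + d_x(F, 0.3, 0.2)
assert abs(res) < 1e-4
```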

The multisymplectic Hamiltonian system can be obtained from the Lagrangian


density and the covariant Legendre transform, or Legendre–Hodge transformation[Bri06] .
The relationship between the Lagrangian and the Hamiltonian formalisms is ex-
plained in the following diagram, where in each line the corresponding equations are
given[Che05c,Che02,LQ02] .
$$\begin{aligned}
L=L(\phi,\phi_x,\phi_t)\ &\Longleftrightarrow\ H=L-\phi_x\frac{\partial L}{\partial\phi_x}-\phi_t\frac{\partial L}{\partial\phi_t},\\
\frac{\partial L}{\partial\phi}-D_x\frac{\partial L}{\partial\phi_x}-D_t\frac{\partial L}{\partial\phi_t}=0\ &\Longleftrightarrow\ Mz_t+Kz_x=\nabla_zS(z),\\
\int_{\partial M}(\mathrm{pr}^1\phi)^*(\mathrm{pr}^1V_1\,\lrcorner\,\mathrm{pr}^1V_2\,\lrcorner\,\Omega_L)=0\ &\Longleftrightarrow\ D_t\omega+D_x\kappa=0,\\
D_t\Big(\frac{\partial L}{\partial\phi_t}\phi_t-L\Big)+D_x\Big(\frac{\partial L}{\partial\phi_x}\phi_t\Big)=0\ &\Longleftrightarrow\ D_tE+D_xF=0,\\
D_x\Big(\frac{\partial L}{\partial\phi_x}\phi_x-L\Big)+D_t\Big(\frac{\partial L}{\partial\phi_t}\phi_x\Big)=0\ &\Longleftrightarrow\ D_tI+D_xG=0.
\end{aligned}$$

16.3 Multisymplectic Integrators and Composition


Methods
The concept of the multisymplectic integrators for the system (2.12) was introduced
by Bridges and Reich[BR01a] . A multisymplectic integrator is a numerical scheme for
(2.12) which preserves a discrete multisymplectic conservation law. The multisym-
plectic integrator is the direct generalization of the symplectic integrator and has good
performance in maintaining local conservation laws. Using symplectic Runge–Kutta
integrators in both directions leads to multisymplectic integrators[Rei00] .
A popular multisymplectic integrator is the multisymplectic Preissman integrator
which is obtained by using the midpoint method in both directions. Discretizing (2.12)
by the midpoint method in both directions yields

$$M\,\frac{z^{j+1}_{i+\frac12}-z^{j}_{i+\frac12}}{\Delta t}
+K\,\frac{z^{j+\frac12}_{i+1}-z^{j+\frac12}_{i}}{\Delta x}
=\nabla_zS\big(z^{j+\frac12}_{i+\frac12}\big),\qquad(3.1)$$

where $\Delta t$ and $\Delta x$ are the time step size and space step size, respectively, and

$$z^j_i\approx z(i\Delta x,j\Delta t),\qquad
z^{j+1}_{i+\frac12}=\frac12\big(z^{j+1}_{i}+z^{j+1}_{i+1}\big),\qquad
z^{j+\frac12}_{i+\frac12}=\frac14\big(z^{j}_{i}+z^{j}_{i+1}+z^{j+1}_{i}+z^{j+1}_{i+1}\big),\quad\text{etc.}$$

The scheme (3.1) satisfies the discrete multisymplectic conservation law

$$\frac{\omega^{j+1}_{i+\frac12}-\omega^{j}_{i+\frac12}}{\Delta t}
+\frac{\kappa^{j+\frac12}_{i+1}-\kappa^{j+\frac12}_{i}}{\Delta x}=0,\qquad(3.2)$$

which can be proved by direct calculations.

Example 3.1. First, consider the sine-Gordon equation[Che06b,WM01]

$$u_{tt}-u_{xx}+\sin u=0.\qquad(3.3)$$

Introducing the new variables $v=u_t$ and $w=u_x$, Equation (3.3) is equivalent to the system

$$-v_t+w_x=\sin u,\qquad u_t=v,\qquad -u_x=-w,\qquad(3.4)$$

which can be represented as

$$M_1z_t+K_1z_x=\nabla_zS_1(z),\qquad(3.5)$$

where

$$z=(u,v,w)^{\mathrm T},\qquad S_1(z)=\frac12\big(v^2-w^2\big)-\cos u$$

and

$$M_1=\begin{pmatrix}0&-1&0\\ 1&0&0\\ 0&0&0\end{pmatrix},\qquad
K_1=\begin{pmatrix}0&0&1\\ 0&0&0\\ -1&0&0\end{pmatrix}.$$
Applying the multisymplectic integrator (3.1) to (3.3) yields

$$-\frac{v^{j+1}_{i+\frac12}-v^{j}_{i+\frac12}}{\Delta t}
+\frac{w^{j+\frac12}_{i+1}-w^{j+\frac12}_{i}}{\Delta x}=\sin u^{j+\frac12}_{i+\frac12},$$
$$\frac{u^{j+1}_{i+\frac12}-u^{j}_{i+\frac12}}{\Delta t}=v^{j+\frac12}_{i+\frac12},\qquad(3.6)$$
$$-\frac{u^{j+\frac12}_{i+1}-u^{j+\frac12}_{i}}{\Delta x}=-w^{j+\frac12}_{i+\frac12}.$$

Eliminating $v$ and $w$ from (3.6), a nine-point integrator for $u$ is derived:

$$\frac{u^{j+1}_{(i)}-2u^{j}_{(i)}+u^{j-1}_{(i)}}{\Delta t^2}
-\frac{u^{(j)}_{i+1}-2u^{(j)}_{i}+u^{(j)}_{i-1}}{\Delta x^2}
+\overline{\sin}\,u^j_i=0,\qquad(3.7)$$

where

$$u^l_{(i)}=\frac{u^l_{i-1}+2u^l_i+u^l_{i+1}}{4},\quad l=j-1,j,j+1,\qquad
u^{(j)}_m=\frac{u^{j-1}_m+2u^j_m+u^{j+1}_m}{4},\quad m=i-1,i,i+1,$$

$$\overline{\sin}\,u^j_i=\frac14\big[\sin\bar u^j_i+\sin\bar u^j_{i-1}+\sin\bar u^{j-1}_{i-1}+\sin\bar u^{j-1}_i\big],$$

$$\bar u^j_i=\frac14\big(u^j_i+u^j_{i+1}+u^{j+1}_{i+1}+u^{j+1}_i\big),\qquad
\bar u^j_{i-1}=\frac14\big(u^j_{i-1}+u^j_i+u^{j+1}_{i-1}+u^{j+1}_i\big),$$
$$\bar u^{j-1}_{i-1}=\frac14\big(u^{j-1}_{i-1}+u^{j-1}_i+u^j_i+u^j_{i-1}\big),\qquad
\bar u^{j-1}_i=\frac14\big(u^{j-1}_i+u^{j-1}_{i+1}+u^j_{i+1}+u^j_i\big).$$
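The second-order consistency of the nine-point scheme (3.7) can be probed by inserting the exact kink solution of (3.3) and watching the residual shrink like $O(\Delta^2)$; a sketch (our construction, with $\Delta t=\Delta x=d$):

```python
import math

c = 0.5
g = math.sqrt(1 - c*c)
u = lambda x, t: 4*math.atan(math.exp((x - c*t)/g))   # exact kink of (3.3)

def residual(d):
    # residual of the nine-point scheme (3.7) at (x, t) = (0.3, 0.2), dt = dx = d
    x0, t0 = 0.3, 0.2
    U = {(i, j): u(x0 + i*d, t0 + j*d) for i in (-1, 0, 1) for j in (-1, 0, 1)}
    xavg = lambda j: (U[-1, j] + 2*U[0, j] + U[1, j]) / 4   # spatial average u^j_(i)
    tavg = lambda i: (U[i, -1] + 2*U[i, 0] + U[i, 1]) / 4   # time average u^(j)_i
    cell = lambda i, j: (U[i, j] + U[i+1, j] + U[i, j+1] + U[i+1, j+1]) / 4
    sbar = sum(math.sin(cell(i, j)) for i in (-1, 0) for j in (-1, 0)) / 4
    return ((xavg(1) - 2*xavg(0) + xavg(-1)) / d**2
            - (tavg(1) - 2*tavg(0) + tavg(-1)) / d**2
            + sbar)

r1, r2 = residual(0.1), residual(0.05)
assert abs(r1) < 0.1 and abs(r2) < abs(r1)   # truncation error shrinks ~ d^2
```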

Second, consider the nonlinear Schrödinger equation, written in the form

$$\mathrm i\psi_t+\psi_{xx}+V'\big(|\psi|^2\big)\psi=0.\qquad(3.8)$$

Using $\psi=p+\mathrm iq$ and introducing the pair of conjugate momenta $v=p_x$, $w=q_x$, Equation (3.8) can be represented[Che06b,Che05b,CQ02,CQT02,SHQ06,SMM04,SQL06,Che04a] as a multisymplectic Hamiltonian system

$$M_2z_t+K_2z_x=\nabla_zS_2(z),\qquad(3.9)$$

where

$$z=(p,q,v,w)^{\mathrm T},\qquad S_2(z)=\frac12\big(v^2+w^2\big)+\frac12V\big(p^2+q^2\big)$$

and

$$M_2=\begin{pmatrix}0&1&0&0\\ -1&0&0&0\\ 0&0&0&0\\ 0&0&0&0\end{pmatrix},\qquad
K_2=\begin{pmatrix}0&0&-1&0\\ 0&0&0&-1\\ 1&0&0&0\\ 0&1&0&0\end{pmatrix}.$$

From the multisymplectic Preissman integrator (3.1), we obtain a six-point integrator for (3.8):

$$\mathrm i\,\frac{\psi^{j+1}_{[i]}-\psi^{j}_{[i]}}{\Delta t}
+\frac{\psi^{j+\frac12}_{i+1}-2\psi^{j+\frac12}_{i}+\psi^{j+\frac12}_{i-1}}{\Delta x^2}
+\frac12G_{i,j}=0,\qquad(3.10)$$

where

$$\psi^r_{[i]}=\frac14\big(\psi_{i-1,r}+2\psi_{i,r}+\psi_{i+1,r}\big),\quad r=j,j+1,$$
$$G_{i,j}=V'\Big(\big|\psi^{j+\frac12}_{i-\frac12}\big|^2\Big)\psi^{j+\frac12}_{i-\frac12}
+V'\Big(\big|\psi^{j+\frac12}_{i+\frac12}\big|^2\Big)\psi^{j+\frac12}_{i+\frac12}.$$

Third, consider the KdV equation (Korteweg–de Vries)

$$u_t+3\big(u^2\big)_x+u_{xxx}=0.\qquad(3.11)$$

Introducing the new variables $\phi$, $v$ and $w$, Equation (3.11) can be represented as

$$M_3z_t+K_3z_x=\nabla_zS_3(z),\qquad(3.12)$$

where

$$z=(\phi,u,v,w)^{\mathrm T},\qquad S_3(z)=\frac12v^2+u^3-uw$$

and

$$M_3=\begin{pmatrix}0&\frac12&0&0\\ -\frac12&0&0&0\\ 0&0&0&0\\ 0&0&0&0\end{pmatrix},\qquad
K_3=\begin{pmatrix}0&0&0&1\\ 0&0&-1&0\\ 0&1&0&0\\ -1&0&0&0\end{pmatrix}.$$

From the multisymplectic Preissman integrator (3.1), we obtain an eight-point integrator

$$\frac{u^{j+1}_{(i)}-u^{j}_{(i)}}{\Delta t}
+3\,\frac{\bar u^2_{i+1}-\bar u^2_{i-1}}{2\Delta x}
+\frac{u^{j+\frac12}_{i+1}-3u^{j+\frac12}_{i}+3u^{j+\frac12}_{i-1}-u^{j+\frac12}_{i-2}}{\Delta x^3}=0,\qquad(3.13)$$

where

$$u^l_{(i)}=\frac{u^l_{i-2}+3u^l_{i-1}+3u^l_i+u^l_{i+1}}{8},\quad l=j,j+1,\qquad
\bar u^2_m=\frac12\Big[\big(u^{j+\frac12}_{m}\big)^2+\big(u^{j+\frac12}_{m-1}\big)^2\Big],\quad m=i-1,i+1.$$

A twelve-point integrator for the KdV equation is known[ZQ00,AM04,MM05], which can be reduced to the eight-point integrator (3.13). Numerical experiments with the integrators mentioned above are given in [WM01,CQT02,ZQ00]. For other soliton equations, such as the ZK equation and the KP equation, similar results are obtained[Che03,LQ02].
The coupled Klein–Gordon–Schrödinger (KGS) equation[KLX06]

$$\mathrm i\psi_t+\frac12\psi_{xx}+\psi\varphi=0,$$
$$\varphi_{tt}-\varphi_{xx}+\varphi-|\psi|^2=0,\qquad \mathrm i=\sqrt{-1},\qquad(3.14)$$

is a classical model describing the interaction between a conservative complex nucleon field and a neutral meson field through Yukawa coupling in quantum field theory. The KGS equation is supplied with the initial and boundary conditions

$$\psi(0,x)=\psi_0(x),\qquad \varphi(0,x)=\varphi_0(x),\qquad \varphi_t(0,x)=\varphi_1(x),\qquad(3.15)$$
$$\psi(t,x_L)=\psi(t,x_R)=\varphi(t,x_L)=\varphi(t,x_R)=0,\qquad(3.16)$$

where $\psi_0(x)$, $\varphi_0(x)$ and $\varphi_1(x)$ are known functions. The problem (3.14)–(3.16) has the conserved quantity

$$\|\psi\|^2=\int_{x_L}^{x_R}\psi\bar\psi\,dx=1.$$

Set $\psi=p+\mathrm iq$, $\psi_x=p_x+\mathrm iq_x=f+\mathrm ig$, $\varphi_t=v$, $\varphi_x=w$, and $z=(p,q,f,g,\varphi,v,w)^{\mathrm T}$.

The multisymplectic formulation of the KGS system (3.14) is

$$-q_t+\frac12f_x=-\varphi p,\qquad p_t+\frac12g_x=-\varphi q,$$
$$-\frac12p_x=-\frac12f,\qquad -\frac12q_x=-\frac12g,$$
$$-\frac12v_t+\frac12w_x=\frac12\varphi-\frac12\big(p^2+q^2\big),$$
$$\frac12\varphi_t=\frac12v,\qquad -\frac12\varphi_x=-\frac12w.\qquad(3.17)$$
System (3.17) can be written in the standard Bridges form

$$M\frac{\partial z}{\partial t}+K\frac{\partial z}{\partial x}=\nabla S,\qquad(3.18)$$

where the matrices $M$ and $K$ in (3.18) are

$$M=\frac12\begin{pmatrix}
0&-2&0&0&0&0&0\\
2&0&0&0&0&0&0\\
0&0&0&0&0&0&0\\
0&0&0&0&0&0&0\\
0&0&0&0&0&-1&0\\
0&0&0&0&1&0&0\\
0&0&0&0&0&0&0
\end{pmatrix},\qquad
K=\frac12\begin{pmatrix}
0&0&1&0&0&0&0\\
0&0&0&1&0&0&0\\
-1&0&0&0&0&0&0\\
0&-1&0&0&0&0&0\\
0&0&0&0&0&0&1\\
0&0&0&0&0&0&0\\
0&0&0&0&-1&0&0
\end{pmatrix},$$

respectively, and the Hamiltonian function is

$$S(z)=-\frac12\varphi\big(p^2+q^2\big)+\frac14\big(\varphi^2+v^2-w^2-f^2-g^2\big).$$
2 4
For the three local conservation laws corresponding to (3.17), (3.18), we have

$$\omega(z)=-2\,dp\wedge dq-d\varphi\wedge dv,\qquad
\kappa(z)=dp\wedge df+dq\wedge dg+d\varphi\wedge dw,$$

$$E(z)=-\frac12\varphi\big(p^2+q^2\big)+\frac14\big(\varphi^2+v^2-pf_x-qg_x-\varphi w_x\big),$$
$$F(z)=\frac14\big(pf_t+qg_t+\varphi w_t-fp_t-gq_t-w\varphi_t\big),\qquad(3.19)$$
$$I(z)=\frac14\big(2qf-2pg+vw-\varphi v_x\big),$$
$$G(z)=-\frac12\varphi\big(p^2+q^2\big)+\frac14\big(\varphi^2-w^2-f^2-g^2+\varphi v_t\big)+\frac12\big(pq_t-qp_t\big).$$
16.3 Multisymplectic Integrators and Composition Methods 651

Recently, many equations of mathematical physics have been solved by multisymplectic
methods, such as the Gross–Pitaevskii equation [TM03, TMZM08], Maxwell's equations
[SQS07, SQ03, CYWB06, STMM07], the Camassa–Holm equation [Dah07], the Kadomtsev–
Petviashvili equation [JYJ06], the seismic wave equation [Che04b, Che04a, Che07a,
Che07c, Che07b], the Dirac equation [HL04], the nonlinear “good” Boussinesq equation
[HZQ03, Che05a], etc.
Now, let us discuss the composition method for constructing high order multisym-
plectic integrators [Che05c, CQ03]. First, recall the definition of a composition method for
ODEs [Yos90, QZ92, Suz92]: Suppose there are n integrators with corresponding operators
s1 (τ ), s2 (τ ), · · · , sn (τ ) of orders p1 , p2 , · · · , pn , respectively, with
maximal order μ = maxi (pi ). If there exist constants c1 , c2 , · · · , cn such that the or-
der of the integrator whose operator is the composition s1 (c1 τ )s2 (c2 τ ) · · · sn (cn τ )
is m > μ, then the new integrator is called the composition integrator of the original n
integrators. This construction of higher order integrators from lower order ones is
called the composition method.
While constructing higher order integrators, the main task is to determine con-
stants c1 , c2 , · · ·, cn such that the scheme with the corresponding operator

Gm (τ ) = s1 (c1 τ )s2 (c2 τ ) · · · sn (cn τ )

has order m > μ. Now, we will present the basic formula for determining the constants
ci (i = 1, · · · , n). For this purpose, we introduce the symmetrization operator S
    S(x^p z^q ) = ( p! q! / (p + q)! ) Σ_{Pm} Pm (x^p z^q ),

where x, z are arbitrary noncommuting operators and Σ_{Pm} denotes the summation over all
the operators obtained in all possible ways of permutation [Suz92].
We also introduce a time-ordering operator P :

    P (xi xj ) = xi xj  if i < j,     P (xi xj ) = xj xi  if j < i,

where xi , xj are noncommuting operators [Suz92].


Set Gm (τ ) = s1 (c1 τ ) · · · sn (cn τ ). The condition under which Gm has order m reads

    P S(x1^{n1} x2^{n2} x3^{n3} · · · ) = 0,     Σ_{i=1}^{n} ci = 1,              (3.20)

where n1 + 2n2 + 3n3 + · · · ≤ m, excluding n2 = n3 = · · · = 0.
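As a concrete illustration of condition (3.20) (our own sketch, not taken from the text): for a symmetric second-order integrator s(τ ), the triple-jump composition s(c1 τ )s(c2 τ )s(c1 τ ) with c1 = 1/(2 − 2^{1/3}) and c2 = −2^{1/3}/(2 − 2^{1/3}) satisfies 2c1 + c2 = 1 and raises the order from 2 to 4 [Yos90, Suz92]. The harmonic oscillator below is an assumed model problem used only to check the order numerically:

```python
import numpy as np

# Second-order symmetric integrator (leapfrog) for the model problem
# q' = p, p' = -q (harmonic oscillator); exact solution (cos t, -sin t).
def leapfrog(q, p, tau):
    p = p - 0.5 * tau * q
    q = q + tau * p
    p = p - 0.5 * tau * q
    return q, p

# Triple-jump coefficients: 2*c1 + c2 = 1, matching the sum condition
# in (3.20); symmetry cancels the order-3 error terms.
c1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
c2 = -2.0 ** (1.0 / 3.0) / (2.0 - 2.0 ** (1.0 / 3.0))

def composed(q, p, tau):
    # G(tau) = s(c1 tau) s(c2 tau) s(c1 tau): an order-4 method
    for c in (c1, c2, c1):
        q, p = leapfrog(q, p, c * tau)
    return q, p

def global_error(step, tau, T=1.0):
    q, p = 1.0, 0.0
    for _ in range(int(round(T / tau))):
        q, p = step(q, p, tau)
    return abs(q - np.cos(T)) + abs(p + np.sin(T))
```

Halving the step size reduces the composed method's global error by a factor close to 2^4 = 16, which confirms the order increase.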


Given a multisymplectic integrator for (2.12) with accuracy of order O(τ^p + τ̂^q ),

    M ( s(τ ) zi,j ) + K ( ŝ(τ̂ ) zi,j ) = ∇z S(z̃i,j ),              (3.21)

where s(τ ) and ŝ(τ̂ ) are discrete operators in the t-direction and x-direction respectively,
and τ and τ̂ are the time step and space step respectively. z̃i,j = f_{s,ŝ} (zi,j ) is a function of
zi,j corresponding to the operators s(τ ) and ŝ(τ̂ ).

Suppose Gm (τ ) is the composition operator of s(τ ) with accuracy of order O(τ^m ),
and Ĝn (τ̂ ) is the composition operator of ŝ(τ̂ ) with accuracy of order O(τ̂^n ). Then the
multisymplectic integrator

    M ( Gm (τ ) zi,j ) + K ( Ĝn (τ̂ ) zi,j ) = ∇z S(z̃i,j )           (3.22)

has accuracy of order O(τ^m + τ̂^n ).

16.4 Variational Integrators


In this section, variational integrators are discussed. First, we present Veselov-type
discretizations of first-order multisymplectic field theory developed in [MPS98]. For sim-
plicity, let n = 1, N = 1, X = (x, t), U = (u), and take X = (xi , tj ) and U = (uij )
as the discrete versions of X and U . It is more convenient to use only the indices of the
grid and set X = (i, j).
A rectangle ◻ of X is an ordered quadruple of the form

    ◻ = ( (i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1) ).          (4.1)

The i-th component of ◻ is the i-th vertex of the rectangle, denoted by ◻i . A
point (i, j) ∈ X is touched by a rectangle if it is a vertex of that rectangle. If M ⊆ X,
then (i, j) is an interior point of M if M contains all four rectangles that touch (i, j).
We denote by M̄ the union of all rectangles touching interior points of M. A boundary
point of M is a point in M̄ which is not an interior point. If M = M̄, we call M regular.
int M is the set of interior points of M, and ∂M is the set of boundary points.
The discrete first-order prolongation of X × U is defined by

    U^(1) ≡ (◻; uij , ui+1j , ui+1j+1 , uij+1 ),

and the first-order prolongation of the discrete map ϕ : X → U, ϕ(i, j) := ϕi,j , by

    pr1 ϕ ≡ (◻; ϕij , ϕi+1j , ϕi+1j+1 , ϕij+1 ).                    (4.2)

Corresponding to a discrete Lagrangian L : U^(1) → R, we define the discrete
functional

    S(ϕ) = Σ_{◻⊂M} L(pr1 ϕ) Δx Δt = Σ_{◻⊂M} L(◻, ϕij , ϕi+1j , ϕi+1j+1 , ϕij+1 ) Δx Δt,   (4.3)

where Δx and Δt are the grid sizes in the x and t directions, and M is a subset of X. In
this chapter, only an equally spaced grid is considered.
Now for brevity of notations, let M = [a, b] × [c, d] be a rectangular domain and
consider a uniform rectangular subdivision
a = x0 < x1 < · · · < xM −1 < xM = b, c = t0 < t1 < · · · < tN −1 < tN = d,
xi = a + i Δ x, tj = c + j Δ t, i = 0, 1, · · · , M, j = 0, 1, · · · , N,
M Δ x = b − a, N Δ t = d − c.
(4.4)

For an autonomous Lagrangian and a uniform rectangular subdivision, the discrete action
functional takes the form

    S(ϕ) = Σ_{i=0}^{M−1} Σ_{j=0}^{N−1} L( ϕij , ϕi+1j , ϕi+1j+1 , ϕij+1 ) Δx Δt.   (4.5)

Using the discrete variational principle (each ϕij enters the action (4.5) through the
four rectangles touching (i, j), once in each argument slot), we obtain the discrete
Euler–Lagrange equation (variational integrator), where Dk denotes the partial deriva-
tive with respect to the k-th vertex argument of L:

    D1 Lij + D2 Li−1j + D3 Li−1j−1 + D4 Lij−1 = 0,                   (4.6)

which satisfies the discrete multisymplectic form formula

    Σ_{◻ ; ◻∩∂M≠∅} ( Σ_{l ; ◻l ∈ ∂M} (pr1 ϕ)∗ ( pr1 V1 ⌟ pr1 V2 ⌟ Ω_L^l ) ) = 0,   (4.7)

where Ω_L^l = d Θ_L^l (l = 1, · · · , 4) and V1 and V2 are solutions of the linearized


equation of (4.6). Now a discretization of the autonomous Lagrangian L(ϕ, ϕx , ϕt )
is considered:

    L(ϕij , ϕi+1,j , ϕi+1,j+1 , ϕi,j+1 ) = L( ϕ̄ij , (ϕi+1,j+1/2 − ϕi,j+1/2 )/Δx , (ϕi+1/2,j+1 − ϕi+1/2,j )/Δt ),   (4.8)

where

    ϕ̄ij = ¼ (ϕij + ϕi+1j + ϕi+1j+1 + ϕij+1 ),
    ϕi,j+1/2 = ½ (ϕij + ϕij+1 ),
    ϕi+1/2,j+1 = ½ (ϕij+1 + ϕi+1j+1 ),

etc. For this discrete Lagrangian, the discrete Euler–Lagrange equation (4.6) is a nine-
point variational integrator. The following results demonstrate the equivalence of vari-
ational integrators and multisymplectic integrators. Consider the sine-Gordon equa-
tion (3.3); its Lagrangian is given by

    L(u, ux , ut ) = ½ u²x − ½ u²t − cos(u).                         (4.9)
2 2

The discrete Euler–Lagrange equation (4.6) corresponding to (4.9) is just the nine-
point integrator (3.7). Consider the nonlinear Schrödinger equation (3.8); the
Lagrangian for (3.8) is given by

    L(p, q, px , qx , pt , qt ) = ½ [ p²x + q²x + p qt − q pt − V (p² + q²) ].   (4.10)

The discrete Euler–Lagrange equation (4.6) corresponding to (4.10) reads

    i (ψ_i^{j+1} − ψ_i^{j−1}) / (2Δt)
      + ( ψ_{i+1}^{j+1/2} + ψ_{i+1}^{j−1/2} − 2ψ_i^{j+1/2} − 2ψ_i^{j−1/2} + ψ_{i−1}^{j+1/2} + ψ_{i−1}^{j−1/2} ) / Δx²
      + ¼ Gi,j + ¼ Gi,j−1 = 0.                                       (4.11)

The integrator (4.11) is equivalent to the integrator (3.10): replacing j by j − 1
in (3.10) and adding the resulting equation to (3.10) leads to (4.11) (see [CQ03]).

16.5 Some Generalizations


In this section, some generalizations based on multisymplectic geometry and mul-
tisymplectic Hamiltonian systems are presented.
1. Multisymplectic Fourier pseudospectral methods
Multisymplectic Fourier pseudospectral methods in Fourier space were considered
in [BR01b]. Now, we discuss these methods in real space [CQ01a], taking the nonlinear
Schrödinger equation as an example. Applying the Fourier pseudospectral method to
the multisymplectic system (3.9) and using the notations

    p = (p0 , · · · , pN−1 )T ,   q = (q0 , · · · , qN−1 )T ,

    v = (v0 , · · · , vN−1 )T ,   w = (w0 , · · · , wN−1 )T ,

it follows that

    d qj /d t − (D1 v)j = 2 (p²j + q²j ) pj ,
    −d pj /d t − (D1 w)j = 2 (p²j + q²j ) qj ,                       (5.1)
    (D1 p)j = vj ,
    (D1 q)j = wj ,

where j = 0, 1, · · · , N − 1 and D1 is the first-order spectral differentiation matrix.


The Fourier pseudospectral semidiscretization (5.1) has N semidiscrete multisym-
plectic conservation laws


    (d/d t) ωj + Σ_{k=0}^{N−1} (D1 )j,k κjk = 0,   j = 0, 1, · · · , N − 1,   (5.2)

where

    ωj = ½ (d zj ∧ M d zj ),   κjk = d zj ∧ K d zk ,
and zj = (pj , qj , vj , wj )T (j = 0, 1, · · · , N − 1).
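The action of D1 need not be formed as a dense matrix. As an illustrative sketch (our own, not from the text; the function name is ours), it can be applied through the FFT, which differentiates band-limited periodic data essentially to machine precision:

```python
import numpy as np

def apply_D1(u):
    # Apply the first-order Fourier spectral differentiation matrix D1
    # to periodic samples u taken at x_j = 2*pi*j/N on [0, 2*pi).
    N = len(u)
    k = np.fft.fftfreq(N, d=1.0 / N)      # integer wavenumbers
    return np.real(np.fft.ifft(1j * k * np.fft.fft(u)))

N = 32
x = 2.0 * np.pi * np.arange(N) / N
# d/dx sin(x) = cos(x) is reproduced to rounding error
err = np.max(np.abs(apply_D1(np.sin(x)) - np.cos(x)))
```

For even N (with the Nyquist mode annihilated, as taking the real part does here) the corresponding matrix D1 is real and skew-symmetric, which is what makes the semidiscrete conservation laws (5.2) possible.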
2. Nonconservative multisymplectic Hamiltonian systems
Nonconservative multisymplectic Hamiltonian systems refer to those depending on

the independent variables explicitly. Such an example is the Schrödinger equation
with variable coefficients [HLHKA06]. Another example is the three-dimensional scalar
seismic wave equation [Che04b, Che06a, Che07a, Che07b, Che04a]

    ∇²u − (1/c(x, y, z)²) utt = 0,                                   (5.3)

where ∇²u = uxx + uyy + uzz and c(x, y, z) is the wave velocity. Introducing the new
variables

    v = (1/c(x, y, z)) ut ,   w = ux ,   p = uy ,   q = uz ,

Equation (5.3) can be rewritten as

    M (x, y, z)Zt + KZx + LZy + N Zz = ∇Z S(Z),                      (5.4)

where Z = (u, v, w, p, q)T , S(Z) = ½ (v² − w² − p² − q²), and
2
\[
M(x, y, z) = \begin{pmatrix}
0 & -\dfrac{1}{c(x, y, z)} & 0 & 0 & 0\\
\dfrac{1}{c(x, y, z)} & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0\\
0 & 0 & 0 & 0 & 0
\end{pmatrix},
\qquad
K = \begin{pmatrix}
 0 & 0 & 1 & 0 & 0\\
 0 & 0 & 0 & 0 & 0\\
-1 & 0 & 0 & 0 & 0\\
 0 & 0 & 0 & 0 & 0\\
 0 & 0 & 0 & 0 & 0
\end{pmatrix},
\]
\[
L = \begin{pmatrix}
 0 & 0 & 0 & 1 & 0\\
 0 & 0 & 0 & 0 & 0\\
 0 & 0 & 0 & 0 & 0\\
-1 & 0 & 0 & 0 & 0\\
 0 & 0 & 0 & 0 & 0
\end{pmatrix},
\qquad
N = \begin{pmatrix}
 0 & 0 & 0 & 0 & 1\\
 0 & 0 & 0 & 0 & 0\\
 0 & 0 & 0 & 0 & 0\\
 0 & 0 & 0 & 0 & 0\\
-1 & 0 & 0 & 0 & 0
\end{pmatrix}.
\]

The corresponding four presymplectic forms associated to the time direction and the
three space directions are, respectively,

    ω = ½ d Z ∧ M (x, y, z) d Z,   κx = ½ d Z ∧ K d Z,
                                                                     (5.5)
    κy = ½ d Z ∧ L d Z,   κz = ½ d Z ∧ N d Z.

Note that the time direction presymplectic form ω depends on the space variables
(x, y, z). We can also obtain the corresponding multisymplectic integrators[Che06a] .
3. Construction of multisymplectic integrators for modified equations
Consider the linear wave equation

    utt = uxx .                                                      (5.6)

Based on the two Hamiltonian formulations of (5.6) and using hyperbolic func-
tions, various symplectic integrators were constructed in [QZ93]. By deriving the cor-
responding Lagrangians and their discrete counterparts, these symplectic integrators
were proved to be multisymplectic integrators for modified versions of (5.6)
in [SQ00].
Let us present an example. Using the hyperbolic function tanh(x), we can obtain a
symplectic integrator for (5.6) of accuracy O(Δt^{2s} + Δx^{2m} ):

    u_i^{j+1} − 2u_i^j + u_i^{j−1}
        = tanh_{2s}(Δt/2) tanh_{2s}(Δt/2) Δ^{(2m)} (u_i^{j+1} + 2u_i^j + u_i^{j−1}),   (5.7)

where tanh_{2s} denotes the order-2s approximation of tanh,

    Δ^{(2m)} = ∇+ ∇− Σ_{j=0}^{m−1} (−1)^j βj ( Δx² ∇+ ∇− / 4 )^j ,
    βj = (j!)² 2^{2j} / ( (2j + 1)! (j + 1) ),

and ∇+ and ∇− are the forward and backward difference operators respectively.
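As an illustrative numerical check of Δ^{(2m)} (our own sketch; we take ∇+ and ∇− as divided differences, so that ∇+∇− is the standard divided central second difference, and the helper names are ours): for m = 2 the operator reproduces the classical fourth-order five-point approximation of ∂xx on a periodic grid.

```python
import numpy as np
from math import factorial

def beta(j):
    # beta_j = (j!)^2 2^(2j) / ((2j + 1)! (j + 1)); beta_0 = 1, beta_1 = 1/3
    return factorial(j) ** 2 * 4 ** j / (factorial(2 * j + 1) * (j + 1))

def delta_2m(u, dx, m):
    # Delta^(2m) u = D2 ( sum_j (-1)^j beta_j (dx^2 D2 / 4)^j u ) on a
    # periodic grid, with D2 the divided central second difference.
    d2 = lambda v: (np.roll(v, -1) - 2.0 * v + np.roll(v, 1)) / dx ** 2
    acc = np.zeros_like(u)
    term = u.copy()
    for j in range(m):
        acc = acc + (-1) ** j * beta(j) * term
        term = dx ** 2 * d2(term) / 4.0
    return d2(acc)

N = 64
dx = 2.0 * np.pi / N
x = dx * np.arange(N)
u = np.sin(x)                    # exact second derivative: -sin(x)
err4 = np.max(np.abs(delta_2m(u, dx, 2) + np.sin(x)))
err2 = np.max(np.abs(delta_2m(u, dx, 1) + np.sin(x)))
```

At this resolution err4 (fourth-order, m = 2) is smaller than err2 (second-order, m = 1) by more than two orders of magnitude, consistent with the stated accuracy.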
For m = 2 and s = 2, the integrator (5.7) is a multisymplectic integrator of the
modified equation

    utt = uxx − (Δt²/6) uxxxx − (Δt⁴/144) uxxxxxx .                  (5.8)

For other hyperbolic functions, we can obtain similar results.

4. Multisymplectic Birkhoffian systems


The multisymplectic Hamiltonian system can be generalized to include dissipation
terms. This generalization leads to the following multisymplectic Birkhoffian system

    M (t, x, z) zt + K(t, x, z) zx = ∇z B(t, x, z) + ∂F/∂t + ∂G/∂x,  (5.9)

where z = (z1 , · · · , zn )T , F = (f1 , · · · , fn )T , G = (g1 , · · · , gn )T , and M = (mij )
and K = (kij ) are two antisymmetric matrices with entries

    mij = ∂fj /∂zi − ∂fi /∂zj ,   kij = ∂gj /∂zi − ∂gi /∂zj .

The system (5.9) satisfies the following multisymplectic dissipation law:

    d/dt ( ½ d z ∧ M d z ) + d/dx ( ½ d z ∧ K d z ) = 0.             (5.10)

Let us present an example [SQ03, SQWR08]. Consider the equation describing the linear
damped string:

    utt − uxx + u + α ut + β ux = 0.                                 (5.11)

Introducing the new variables p = ut and q = ux , Equation (5.11) can be cast into the
form of (5.9) with

\[
M = \begin{pmatrix}
0 & e^{\alpha t - \beta x} & 0\\
-e^{\alpha t - \beta x} & 0 & 0\\
0 & 0 & 0
\end{pmatrix},
\qquad
K = \begin{pmatrix}
0 & 0 & -e^{\alpha t - \beta x}\\
0 & 0 & 0\\
e^{\alpha t - \beta x} & 0 & 0
\end{pmatrix},
\]

and

    z = (u, p, q)T ,   B = −½ e^{αt−βx} (u² + p² − q² + α u p + β u q),

    F = ( −½ e^{αt−βx} p,  ½ e^{αt−βx} u,  0 )T ,
    G = ( ½ e^{αt−βx} q,  0,  −½ e^{αt−βx} u )T .
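The Birkhoffian structure can be verified numerically from F and G alone. The sketch below is ours (sample point and helper names are illustrative; the last component of G is taken as −½e^{αt−βx}u, a reconstruction consistent with the antisymmetry of K):

```python
import math

E = math.exp(0.1 * 0.3 - 0.2 * 0.7)   # e^(alpha t - beta x) at a sample point

# F and G for the damped string, with z = (u, p, q)
def F(z):
    u, p, q = z
    return [-E * p / 2.0, E * u / 2.0, 0.0]

def G(z):
    u, p, q = z
    return [E * q / 2.0, 0.0, -E * u / 2.0]

def structure_matrix(vec, z, h=1e-6):
    # m_ij = d f_j / d z_i - d f_i / d z_j via central differences
    n = len(z)
    def d(j, i):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        return (vec(zp)[j] - vec(zm)[j]) / (2.0 * h)
    return [[d(j, i) - d(i, j) for j in range(n)] for i in range(n)]

z0 = [0.5, -1.2, 2.0]
M = structure_matrix(F, z0)   # expected [[0, E, 0], [-E, 0, 0], [0, 0, 0]]
K = structure_matrix(G, z0)   # expected [[0, 0, -E], [0, 0, 0], [E, 0, 0]]
```

Because F and G are linear in z, the central differences recover the entries of M and K up to rounding error.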

Similarly, we can develop multisymplectic dissipative integrators for the system (5.9)
which preserve a discrete version of the multisymplectic dissipation law (5.10).
5. Differential complexes, methods and multisymplectic structure
Differential complexes have come to play an increasingly important role in numerical
analysis. In particular, discrete differential complexes are crucial in design-
ing stable finite element schemes [Arn02]. With regard to discrete differential forms, a
generic Hodge operator was introduced in [Hip02], and it was shown that most finite ele-
ment schemes emerge as its specializations. The connection between Veselov discrete
mechanics and finite element methods was first suggested in [MPS98]. Symplectic and
multisymplectic structures in simple finite element methods are explored in [GjLK04]. It
will be of particular significance to study the multisymplectic structure of finite
element methods by using discrete differential complexes and, in particular, discrete
Hodge operators [STMM07]. We will explore this issue in the future.
Bibliography

[AM04] U.M. Ascher and R.I. McLachlan: Multisymplectic box schemes and the Korteweg-de
Vries equation. Appl. Numer. Math., 39:55–269, (2004).
[Arn02] D.N. Arnold: Differential complexes and numerical stability. Plenary address deliv-
ered at ICM 2002. Beijing, China, (2002).
[BR01a] T. J. Bridges and S. Reich: Multi-symplectic integrators: numerical schemes for
Hamiltonian PDEs that conserve symplecticity. Physics Letters A, 284:184–193, (2001).
[BR01b] T.J. Bridges and S. Reich: Multi-symplectic spectral discretizations for the Zakharov-
Kuznetsov and shallow water equations. Physica D, 152:491–504, (2001).
[BR06] T. J. Bridges and S. Reich: Numerical methods for Hamiltonian PDEs. J. Phys. A:
Math. Gen., 39:5287–5320, (2006).
[Bri97] T. J. Bridges: Multi-symplectic structures and wave propagation. Math. Proc. Cam.
Phil. Soc., 121:147–190, (1997).
[Bri06] T. J. Bridges: Canonical multisymplectic structure on the total exterior algebra bundle.
Proc. R. Soc. Lond. A, 462:1531–1551, (2006).
[CGW03] J. B. Chen, H.Y. Guo, and K. Wu: Total variation in Hamiltonian formalism and
symplectic-energy integrators. J. of Math. Phys., 44:1688–1702, (2003).
[Che02] J. B. Chen: Total variation in discrete multisymplectic field theory and multisymplec-
tic energy momentum integrators. Letters in Mathematical Physics, 51:63–73, (2002).
[Che03] J. B. Chen: Multisymplectic geometry, local conservation laws and a multisymplectic
integrator for the Zakharov–Kuznetsov equation. Letters in Mathematical Physics, 63:115–
124, (2003).
[Che04a] J. B. Chen: Multisymplectic geometry for the seismic wave equation. Com-
mun.Theor. Phys., 41:561–566, (2004).
[Che04b] J. B. Chen: Multisymplectic Hamiltonian formulation for a one-way seismic wave
equation of high order approximation. Chin Phys. Lett., 21:37–39, (2004).
[Che05a] J. B. Chen: Multisymplectic geometry, local conservation laws and Fourier pseu-
dospectral discretization for the ”good” Boussinesq equation. Applied Mathematics and
Computation, 161:55–67, (2005).
[Che05b] J. B. Chen: A multisymplectic integrator for the periodic nonlinear Schrödinger
equation. Applied Mathematics and Computation, 170:1394–1417, (2005).
[Che05c] J. B. Chen: Variational formulation for multisymplectic Hamiltonian systems. Let-
ters in Mathematical Physics, 71:243–253, (2005).
[Che06a] J. B. Chen: A multisymplectic variational formulation for the nonlinear elastic wave
equation. Chin Phys. Lett., 23(2):320–323, (2006).
[Che06b] J. B. Chen: Symplectic and multisymplectic Fourier pseudospectral discretization
for the Klein-Gordon equation. Letters in Mathematical Physics, 75:293–305, (2006).
[Che07a] J. B. Chen: High order time discretization in seismic modeling. Geophysics,
72(5):SM115–SM122, (2007).
[Che07b] J. B. Chen: Modeling the scalar wave equation with Nyströn methods. Geophysics,
71(5):T158, (2007).
[Che07c] J. B. Chen: A multisymplectic pseudospectral method for seismic modeling. Applied
Mathematics and Computation, 186:1612–1616, (2007).

[CQ01a] J. B. Chen and M. Z. Qin: Multisymplectic fourier pseudospectral method for the
nonlinear Schrödinger equation. Electronic Transactions on Numerical Analysis, 12:193–
204, (2001).
[CQ02] J.-B. Chen and M. Z. Qin. A multisymplectic variational integrator for the nonlinear
Schrödinger equation. Numer. Meth. Part. Diff. Eq., 18:523–536, 2002.
[CQ03] J. B. Chen and M. Z. Qin: Multisymplectic composition integrators of high order. J.
Comput. Math., 21(5):647–656, (2003).
[CQT02] J. B. Chen, M. Z. Qin, and Y. F. Tang: Symplectic and multisymplectic methods for
the nonlinear Schrödinger equation. Computers Math. Applic., 43:1095–1106, (2002).
[CWQ09] J. Cai, Y. S. Wang, and Z. H. Qiao: Multisymplectic Preissman scheme for the
time-domain Maxwell’s equations. J. of Math. Phys., 50:033510, (2009).
[CYQ09] J. X. Cai, Y.S.Wang, and Z.H. Qiao: Multisymplectic Preissman scheme for the
time-domain Maxwell’s equations. J. of Math. Phys., 50:033510, (2009).
[CYWB06] J. X. Cai, Y.S.Wang, B. Wang, and B.Jiang: New multisymplectic self-adjoint
scheme and its composition for time-domain Maxwell’s equations. J. of Math. Phys.,
47:123508, (2006).
[Dah07] M. L. Dahlby: Geometrical integration of nonlinear wave equations. Master's thesis,
Norwegian University of Science and Technology (NTNU), Trondheim, (2007).
[Fen85] K. Feng: On difference schemes and symplectic geometry. In K. Feng, editor, Pro-
ceedings of the 1984 Beijing Symposium on Differential Geometry and Differential Equa-
tions, pages 42–58. Science Press, Beijing, (1985).
[FQ87] K. Feng and M. Z. Qin: The symplectic methods for the computation of Hamiltonian
equations. In Y. L. Zhu and B. Y. Guo, editors, Numerical Methods for Partial Differential
Equations, Lecture Notes in Mathematics 1297, pages 1–37. Springer, Berlin, (1987).
[GAR73] P. L. García: The Poincaré–Cartan invariant in the calculus of variations. In Symposia
Mathematica XIV (Convegno di Geometria Simplettica e Fisica Matematica), pages
219–243. Academic Press, London, (1973).
[GjLK04] H. Y. Guo, X. M. Ji, Y. Q. Li, and K. Wu: Symplectic, multisymplectic structure-
preserving in simple finite element method. Preprint arXiv:hep-th/0104151, (2004).
[Hip02] R. Hiptmair: Finite elements in computational electromagnetism. Acta Numerica,
11:237–339, (2002).
[HL04] J. Hong and C. Li: Multi-symplectic Runge–Kutta methods for nonlinear Dirac equa-
tions. J. of Comp. Phys., 211:448–472, (2004).
[HLHKA06] J. L. Hong, Y. Liu, H. Munthe-Kaas, and A. Zanna: On a multisymplectic scheme
for Schrödinger equations with variable coefficients. Appl. Numer. Math., 56:816–843,
(2006).
[HZQ03] L. Y Huang, W. P. Zeng, and M.Z. Qin: A new multi-symplectic scheme for nonlinear
“good” Boussinesq equation. J. Comput. Math., 21:703–714, (2003).
[JYJ06] B. Jiang, Y.S.Wang, and Cai J.X: New multisymplectic scheme for generalized
Kadomtsev-Petviashvili equation. J. of Math. Phys., 47:083503, (2006).
[KLX06] L. H. Kong, R. X. Liu, and Z. L. Xu: Numerical simulation of interaction between
Schrödinger field and Klein–Gordon field by multisymplectic method. Applied
Mathematics and Computation, 181:342–350, (2006).
[Lag88] J. L. Lagrange: Mécanique Analytique Blanchard, Paris, 5th edition, vol. 1, (1965).
[Lee82] T. D. Lee: Can time be a discrete dynamical variable? Phys.Lett.B, 122:217–220,
(1982).
[Lee87] T. D. Lee: Difference equations and conservation laws. J. Stat. Phys., 46:843–860,
(1987).
[LQ88] C.W. Li and M.Z. Qin: A symplectic difference scheme for the infinite dimensional
Hamiltonian system. J. Comput. Appl. Math, 6:164–174, (1988).
[LQ02] T. T. Liu and M. Z. Qin: Multisymplectic geometry and multisymplectic Preissman
scheme for the KP equation. J. of Math. Phys., 43:4060–4077, (2002).

[LQHD07] X. S. Liu, Y.Y. Qi, J. F. He, and P. Z. Ding: Recent progress in symplectic algo-
rithms for use in quantum systems. Communications in Computational Physics, 2(1):1–53,
(2007).
[MM05] K.W. Morton and D.F. Mayers: Numerical Solution of Partial Differential Equations:
an introduction. Cambridge University Press, Cambridge, Second edition, (2005).
[MPS98] J. E. Marsden, G. W. Patrick, and S. Shkoller: Multi-symplectic geometry, variational
integrators, and nonlinear PDEs. Communications in Mathematical Physics, 199:351–395,
(1998).
[MPSM01] J. E. Marsden, S. Pekarsky, S. Shkoller, and M.West: Variational methods, multi-
symplectic geometry and continuum mechanics. J.Geom. Phys., 38:253–284, (2001).
[Olv86] P.J. Olver: Applications of Lie Groups to Differential Equations. Springer, New York,
(1986).
[Qin87] M. Z. Qin: A symplectic scheme for the Hamiltonian equations. J. Comput. Math.,
5:203–209, (1987).
[Qin90] M. Z. Qin: Multi-stage symplectic schemes of two kinds of Hamiltonian systems of
wave equations. Computers Math. Applic., 19:51–62, (1990).
[Qin97a] M. Z. Qin: A symplectic scheme for the PDEs. AMS/IP Studies in Advanced Math-
ematics, 5:349–354, (1997).
[QZ92] M. Z. Qin and W. J. Zhu: Construction of higher order symplectic schemes by com-
position. Computing, 47:309–321, (1992).
[QZ93] M. Z. Qin and W. J. Zhu: Construction of symplectic scheme for wave equation via
hyperbolic functions sinh(x), cosh(x) and tanh(x). Computers Math. Applic., 26:1–11,
(1993).
[Rei00] S. Reich: Multi-symplectic Runge–Kutta collocation methods for Hamiltonian wave
equations. J. of Comp. Phys., 157:473–499, (2000).
[SHQ06] J. Q. Sun, W. Hua, and M. Z. Qin: New conservation scheme for the nonlinear
Schrodinger system. Applied Mathematics and Computation, 177:446–451, (2006).
[SMM04] J. Q. Sun, Z. Q. Ma, and M. Z. Qin: RKMK method of solving non-damping
LL equations for ferromagnet chain equations. Applied Mathematics and Computation,
157:407–424, (2004).
[SMQ06] J. Q. Sun, Z. Q. Ma, and M. Z. Qin: Simulation of envelope Rossby solution in pair
of cubic Schrodinger equations. Applied Mathematics and Computation, 183:946–952,
(2006).
[SNW92] J.C. Simo, N.Tarnow, and K.K. Wong: Exact energy-momentum conserving algo-
rithms and symplectic schemes for nonlinear dynamics. Comput. Methods Appl. Mech.
Engrg., 100:63–116, (1992).
[SQ00] Y. J. Sun and M.Z. Qin: Construction of multisymplectic schemes of any finite order
for modified wave equations. J. of Math. Phys., 41:7854–7868, (2000).
[SQ01] H. L. Su and M. Z. Qin: Multisymplectic Birkhoffian structure for PDEs with dissipa-
tion terms, arxiv:math.na 0302299, (2001).
[SQ03] H. Su and M. Z. Qin: Symplectic schemes for Birkhoffian system. Technical Report
arXiv: math-ph/0301001, (2003).
[SQ04] Y. J. Sun and M. Z. Qin: A multi-symplectic schemes for RLW eqution. J. Comput.
Math., 22:611–621, (2004).
[SQ05] H. Su and M. Z. Qin: Multisymplectic geometry method for Maxwell’s equations and
multisymplectic scheme. Technical Report arXiv. org math-ph/0302058, (2005).
[SQL06] J. Q. Sun, M.Z. Qin, and T.T. Liu: Total variation and multisymplectic structure for
the CNLS system. Commun.Theor. Phys., 46(2):966–975, (2006).
[SQS07] H. L. Su, M.Z. Qin, and R. Scherer: Multisymplectic geometry method for Maxwell’s
equations and multisymplectic scheme. Inter. J of Pure and Applied Math, 34(1):1–17,
(2007).
[SQWD09] J. Q. Sun, M. Z. Qin, H. Wei, and D. G. Dong: Numerical simulation of collision
behavior of optical solitons in birefringent fibres. Commun. Nonlinear Science and
Numerical Simulation, 14:1259–1266, (2009).

[SQWR08] H. L. Su, M. Z. Qin, Y. S. Wang, and R. Scherer: Multisymplectic Birkhoffian


structure for PDEs with dissipation terms. Preprint No:2, Karlsruhe University, (2008).
[STMM07] A. Stern, Y. Tong, M. Desbrun, and J. E. Marsden: Electromagnetism with varia-
tional integrators and discrete differential forms, arXiv:0707.4470v2, (2007).
[Str68] G. Strang: On the construction and comparison of difference schemes. SIAM J. Numer.
Anal., 5:506–517, (1968).
[Str96] M. Struwe: Variational Methods Application to nonlinear PDEs and Hamiltonian sys-
tems, volume 34 of A Series of Modern Surveys in Mathematics. Springer-Verlag, Berlin,
Second edition, (1996).
[Suz92] M. Suzuki: General theory of higher-order decomposition of exponential operators
and symplectic integrators. Physics Letters A, 165:387–395, (1992).
[TM03] Y. M. Tian and M. Z. Qin: Explicit symplectic schemes for investigating the evolution
of vortices in a rotating Bose–Einstein condensate. Comput. Phys. Comm., 155:132–143,
(2003).
[TMZM08] Y.M. Tian, M.Z.Qin, Y. M. Zhang, and T. Ma: The multisymplectic method for
Gross–Pitaevskii equation. Comput. Phys. Comm., 176:449–458, (2008).
[WHT96] J. Wisdom, M. Holman, and J. Touma: Symplectic Correctors. In Jerrold E. Mars-
den, George W. Patrick, and William F. Shadwick, editors, Integration Algorithms and
Classical Mechanics, volume 10 of Fields Institute Communications, pages 217–244.
Fields Institute, American Mathematical Society, July (1996).
[WM01] Y. S. Wang and M. Z.Qin: Multisymplectic geometry and multisymplectic scheme
for the nonlinear Klein–Gordon equation. J. of Phys.soc. of Japan, 70:653–661, (2001).
[WWQ08] Y.S. Wang, B. Wang, and M. Z. Qin: Local structure-preserving algorithms for
partial differential equation. Science in China (series A), 51(11):2115–2136, (2008).
[Yos90] H. Yoshida: Construction of higher order symplectic integrators. Physics Letters A,
150:262–268, (1990).
[Zha91b] M. Q. Zhang: Explicit unitary schemes to solve quantum operator equations of mo-
tion. J. Stat. Phys., 65(3/4), (1991).
[Zha93a] M. Q. Zhang: Algorithms that preserve the volume amplification factor for linear
systems. Appl. Math. Lett., 6(3):59–61, (1993).
[Zha93b] M. Q. Zhang: Computation of n-body problem by 2-body problems. Physics Letters
A, 197:255–260, (1993).
[ZQ93a] M. Q. Zhang and M. Z. Qin: Explicit symplectic schemes to solve vortex systems.
Comp. & Math. with Applic., 26(5):51, (1993).
[ZQ93b] W. Zhu and M. Qin: Application of higher order self-adjoint schemes of PDEs. Com-
puters Math. Applic., 25(12):31–38, (1993).
[ZQ00] P. F. Zhao and M. Z. Qin: Multisymplectic geometry and multisymplectic Preissman
scheme for the KdV equation. J. Phys. A: Math. Gen., 33:3613–3626, (2000).
[ZW99] H.P. Zhu and J.K. Wu: Generalized canonical transformations and symplectic algo-
rithm of the autonomous Birkhoffian systems. Progr. Natur. Sci., 9:820–828, (1999).
[ZzT96] W. Zhu, X. Zhao, and Y. Tang: Numerical methods with a high order of accuracy
applied in the quantum system. J. Chem. Phys., 104(6):2275–2286, (1996).
Symbol

Symbol Description
A, B Matrix A = {aij ∈ M (n)}

A∗ = ĀT conjugate transpose of A
A′ , AT transpose of A
A J-orthogonal complement of A
A⊥ orthogonal complement of A
A = {Uλ , ϕλ } smooth atlas
Ad adjoint representation
Ad∗ coadjoint representation
adv adjoint vector field
ad∗v coadjoint vector field
Br (a) ball with center a and radius r
Bk space consist of all exact k-form
Bk set of all k-boundaries
bk , bk Betti number
B(ρ), C(η), D(ζ) order conditions of Butcher.
C the complex numbers
Cn complex vector space of complex n-vector
Ck space of k-times differentiable functions
C∞ space of smooth functions
C(z) Casimir function
Ck (M ) k-dimensional chain on M
C^i_{jk} structure constants
d exterior derivative, exterior differential operator
dxi basis differential 1-form
D total differential
det A determinant of matrix A
div divergence
deg ω (deg f )( deg P (x)) order of form (order of map) (order of polynomial)
Eτ Euler step-transient operator
e identity element of group
ex exponential function of x
ei , {ei , fj } basis, symplectic basis
e^{ta} phase flow of the vector field a
exp, Exp exponential map
F (t)f differential element of function f

Symbol Description
F a field (usually R or C)
Fn vector space (over F) of n-vectors
f∗p differential of the map f in the p place
F(Rn ) the class of all differentiable functions on Rn
g k (M ) set of all k-differential form on M
G group, Lie group
G2n,k M (2n, k) nonsingular equivalent class
g Lie algebra
g∗ dual to the Lie algebra
Gl(n), Gl(n, R), Gl(n, C) linear group on Rn ,(Cn )
gl(n) Lie algebra of n × n matrix
grad gradient
H K (M, R)(HK (M, R)) k-th cohomology (homology) group on M
H(p, q), H(z) Hamiltonian function
i inclusion map
iX contraction, interior product
I identity map
In , I2n identity matrix, standard Euclidean structure
id (Id) identity
im L image of map L
J momentum map
I2n , J4n symplectic structure
J"4n J"4n -symplectic structure
K K-symplectic structure
ker L  kernel of mapping L
L[u] = ∫ L dx variation of L
LX Y, LX ω vector field Y , differential form ω of Lie derivative
M, N manifold
M (n, m, R) set of all real matrix with n-row and m-column
M (n, m, C) set of all complex matrix with n-row and m-column
M (n, R) set of all real matrix of order n × n on Rn
M (n, C) set of all complex matrix of order n × n on Cn
M (n, F) set of all matrix of order n × n on Fn
O(n), o(n) orthogonal group, orthogonal Lie algebra
O zero matrix
P p coordinate in momentum space
Q q coordinate in configuration space
p order of p
R real number
Rn n-dimensional real vector space
Rnp , Rnq momentum space, configuration space in Rn
RP n real projection space
r(t) order of t-tree
S symplectic transformation, S-transformation
h, s step of time

Symbol Description
SL(n), SL(n, R), SL(n, C) special linear group, (real), (complex)
sl(n) Lie algebra of special linear group
SO(n) special orthogonal group
so(n) Lie algebra of special orthogonal group
Sp(2n) symplectic group, symplectic matrix
sp(2n) symplectic algebra, infinitesimal symplectic matrix
CSp(2n) conformal symplectic group
Sp(0) 0-class of symplectic matrix
Sp(I) I-class of symplectic matrix
Sp(II) II-class of symplectic matrix
Sp(III) III-class of symplectic matrix
Sp-diff or Sp-Diff symplectic diffeomorphism
TM tangent bundle
Tx M tangent space in place x
T ∗M cotangent bundle
Tx∗ M cotangent space in place of x
Sm symmetric group
u, v vector in Rn space
(U, ϕ), (V, ϕ) local coordinate
V vector space
Xp vector field in place p on manifold
ẋ, ẍ first, second, order derivative at x
x, xi x vector, coordinate component
y, y i y vector, coordinate component
X(M ) set of all tangent vector on M
XH Hamiltonian vector field
X (Rn ) set of all smooth vector field on Rn
α = ( Aα Bα ; Cα Dα ) Darboux transformation
α−1 = ( A^α B^α ; C^α D^α ) inverse of Darboux transformation
δ variational derivative, codifferential operator
σ(t) symmetry of t-tree
γ(t) density of t-tree
δij Kronecker symbol
α(t) essential different labelings
Γf , Gf , gr (f ) graphic of f
Δt, τ, s step size of time
Δx step size of space
θ differential 1-form
dθ exterior of differential 1-form
π T Rn −→ Rn projection T Rn to Rn
π −1 (x) = Tx Rn fiber in point x

Symbol Description
ϕ∗ ω (ϕ∗ ω) pull back of differential form (push-forward)
ϕ∗ f (ϕ∗ f ) pull back of function(push-forward)
ϕ∗ Y (ϕ∗ Y ) pull back of vector field (push-forward)
× product
∧ exterior product
Λk (Rn ) k-th exterior bundle over Rn
Λn Lagrangian subspace
Λn (K) K-Lagrangian subspace
f ⋔ Z f transverse to Z
f ⋔p Z f transverse to Z at the point p
Ω standard symplectic structure
Ω# lift of mapping Ω# (z1 )(z2 ) = Ω−1 (z1 , z2 )
Ωb down mapping Ωb (z1 )(z2 ) = Ω(z1 , z2 )
Ωk (Rn ), Ω0 (Rn ) = C ∞ (Rn ) k-differential form on Rn

∂/∂xi , ∂xi partial derivative with respect to xi
∂ boundary operator
∇ × f rotation (curl)
∇ · f divergence
∮ p ds boundary integral
∫ ω integral of differential form
∅ empty set
⊗ tensor product
∩ set-theoretic intersection
∪ set-theoretic union
⊂ inclusion
∈ element of
  element of
◦ f ◦ g = f (g) composition
/ division
 ∈ not element of
∀ for all
 homomorphism
 approximate

= similarly
≡ identity
:= definition
∼ corresponding, equivalent, congruent relation
∼c conjugate congruent
−→ mapping
=⇒ implies
⇐⇒ equivalent (if and only if)
Cnk = \binom{n}{k} = n!/(k!(n − k)!) binomial coefficient

Symbol Description
\binom{n}{k1 , k2 , · · · , kr } = n!/(k1 ! k2 ! · · · kr !) multinomial coefficients,
where k1 + k2 + · · · + kr = n
(a, b) open interval
[a, b] closed interval
[u, w] Lie bracket
[A, B] matrix commutator
[F, H] Poisson bracket
(u, v) inner product, Euclidean inner product
[U, V ] symplectic inner product
‖B‖ norm of the matrix B
U ⊕ V direct sum
P1 ⊕ P2 symplectic direct sum
⟨· , ·⟩ inner product
{ϕ, φ} Poisson bracket
a ⊥ b vector a orthogonal to b (Euclidean)
ab vector a symplectic orthogonal to b
1N (x) = x identity function
Index
A C
A(α)-stability, 550 calculate the formal energy, 267
a*–linear differential operator, 407 canonical equation, 170
a∗ –Jacobian matrix, 407 canonical forms under orthogonal
A-stability, 550 transformation, 134
ABC flow, 446 canonical reductions of bilinear forms, 128
action functional of Lagrangian density, 643 canonical transformation, 172, 188
Ad*-equivariant, 503 Cartan form, 644
adjoint integrator, 374 Cartan’s Magic formula, 106
adjoint method, 372 Casimir function, 501
all polynomials are symplectically separable
in R2n , 207
alternative canonical forms, 130 centered Euler scheme, 192, 200, 231
angular momentum in body description, 505 chains, 91
angular momentum in space description, 505 characteristic equations, 477
angular momentum-preserving schemes for chart, 40
rigid body, 525 Chebyshev spectral method, 508
angular velocity in body description, 505 classical Stokes theorem, 98
angular velocity in space description, 505 closed form, 84
anti-symmetric product, 117 closed nondegenerate differential 2-form, 165
atlas, 40 coadjoint orbits, 505
automorphism, 39 coclosed form, 90
autonomous Birkhoff’s equations, 618 codifferential operator, 89
coefficient B-series for centered Euler
scheme, 418
B coefficient B-series for exact solution, 418
coefficient B-series for explicit Euler scheme,
B-series, 417 418
B-stability, 550 coefficient B-series for implicit Euler scheme,
backward error analysis, 432 418
base of tangent space, 45 coefficient B-series for R–K method, 418
BCH formula, 380, 413 coefficient B-series for trapezoidal scheme,
Betti numbers, 99 418
bijective, 39 coefficients can be determined recursively,
bilinear antisymmetric form, 188 233
binary forms, 116 coexact form, 90
Birkhoffian system, 618 cohomology space, 98
black (fat) vertex, 309
boundary of chains, 92 commutativity of generator maps, 261
Butcher tableau, 278 commutator, 124, 179
commutator of two vector fields, 100 constructing s-scheme by Poincaré type g.f.,
comparison order conditions between 229
symplectic R–K (R–K–N) method, 302 constructing s-scheme via 1st kind g.f., 227
comparison order conditions P–R–K method construction of volume-preserving schemes
and symplectic P–R–K method, 318, 319, via g.f., 464
333 contact 1-form, 480
compatible of two local coordinate systems, contact algorithm, 483
40 contact algorithm–C, 493
complete non-integrability, 477 contact algorithm–P , 492
complexifiable, 124 contact algorithm–Q, 492
complexification of real vector space and real contact difference schemes, 492
linear transformation, 123 contact dynamical systems, 477
composition laws, 419 contact element, 482
composition of centered Euler scheme, 372 contact generating function, 487
composition of trapezoid scheme, 365 contact geometry, 477
composition scheme is not A-stable, 389 contact Hamiltonian, 483, 492
compositional property of Lie series, 379 contact map, 486
condition for centered Euler to be volume- contact structure, 477, 481
preserving, 444 contact transformation, 483
condition of symplectic P–R–K method, 303 contactization of conic symplectic maps, 487
condition of variational self-adjointness, 619 contraction, 105
configuration space, 188 convergence of symplectic difference
conformally K-symplectic group schemes, 239
CSp(K, n, F), 120 coordinate Lagrangian subspaces, 147
conformally canonical transformation, 173, coordinate of tangent vector, 45
182 coordinate subspaces, 139
conformally Hermitian, 117 cotangent bundle, 76, 249
conformally identical, 114 cotangent vector, 76
conformally orthogonal group CO(S, n, F), cycle, 93
120
conformally symmetric, 114 D
conformally symplectic group CSp(2n), 144
conformally unitary group CU (H, n, C), Darboux matrix, 231, 600
120 Darboux theorem, 168, 190
congruence canonical forms of conformally Darboux transformation, 249
symmetric, 130 De Rham theorem, 99
congruence canonical forms of Hermitian decomposed theorem of symplectic matrix,
matrices, 130 155
congruent reductions, 129 decompositions of source-free vector fields,
conic function, 484 452
conic Hamiltonian vector fields, 488 definition of symplectic for LMM, 356
conic map, 484 density of tree γ(t), 294
conic symplectic, 484 diagonal formal flow, 415
conic symplectic map, 484 diagonal Padé approximant, 194
conic transformation, 488 diagonally implicit method, 284
conservation laws, 234
conservation of spatial angular momentum 284
theorem, 506 diffeomorphism, 39, 102, 126, 188
constrained Hamiltonian algorithm, 537 diffeomorphism group, 102
construction of the difference schemes via differentiable manifold, 40
generating function, 213 differentiable manifold structure, 40
construct volume-preserving difference differentiable mapping, 41
schemes, 454 differentiable mapping, differential concept,
constructing s-scheme by 2nd kind g.f., 227 43
differentiable structure, 40 explicit Euler scheme, 204
differential, 45 explike function, 349
differential k-form, 77 exponential matrix transform, 125
differential complex, 657 extended canonical two form, 595
diophantine condition, 566, 572 extended configuration space, 581
diophantine frequency vectors, 552 extended Lagrangian 1-form, 585
diophantine step sizes, 569 extended phase space, 242
direction field, 477 extended symplectic 2-form, 585
discrete energy conservation law, 587 exterior algebra, 68
discrete Euler–Lagrange equation, 587, 652 exterior differential operator, 82
discrete extended Lagrange 2-form, 589 exterior form, 66
discrete Lagrange 2-form, 589 exterior monomials, 70
discrete Lagrangian, 652 exterior product, 64
discrete mechanics based on finite element exterior product of forms, 72
methods, 606
discrete multisymplectic conservation law, F
646
discrete multisymplectic form formula, 652 fathers’ and sons’ relations, 297
discrete total variation in the multisymplectic fiber of tangent bundle, 56
form, 605 first integrals, 234
discrete variational principle in total first order prolongation, 594, 643
variation, 596 first order prolongation of V , 584
divergence-free system, 443, 449 fixed point, 236
formal energy, 264
E formal energy for symplectic R–K method,
333, 339
eigenvalues of infinitesimal symplectic formal energy of centered Euler scheme, 344
matrix, 159 formal power series, 265, 407
eigenvalues of symplectic matrix, 158 formal vector field, 432
elementary divisor in real space, 136 fourth order with 3-stage scheme, 365
elementary divisors in complex space, 136 Frechet derivatives, 289
embedded submanifold, 538 free rigid body, 529
embedding submanifold, 51
endomorphism, 39
energy conservation law, 645 G
energy density, 645
energy equation, 644 G-stability, 550
energy flux, 645 Gauss IA-IA, 472
energy-preserving schemes for rigid body, Gauss theorem, 98
525 Gauss–Legendre polynomial, 279
epimorphism, 39 Ge–Marsden theorem, 273
equivalent atlas, 40 general Hamilton–Jacobi equation, 221
Euclidean form, 118 general linear group GL(n, F), 119
Euclidian structure, 137 general vector field, 583
Euler centered scheme, 194 generalized Cayley transformation, 197, 198
Euler equation, 506 generalized Euler schemes, 231
Euler–Lagrange 1-form, 583 generalized Hamiltonian equation, 500
Euler–Lagrange equation, 644 generalized Lagrangian subspaces, 162
Euler–Lagrange equation in FEM, 607 generalized Noether theorem, 502
even polynomial, 159 generating function, 182, 219, 233, 601
exact form, 84 generating function and H.J. equation of the
exact symplectic mapping, 551 first kind, 223
exp maps, 412 generating function and H.J. equation of the
explicit Euler method, 415 second kind, 223
generating function for Lie–Poisson system, immersion, 47
519 implicit Euler method, 415
generating function for volume-preserving, impossible to construct volume-preserving
460 algorithms analytically depending on
generating function method, 432 source-free vector fields, 452
generating functions, 221, 255 infinitesimal generator vector field, 502
generating functions for Lagrangian infinitesimal symplectic matrices, 190
subspaces, 160 injective, 39
generator map, 255 integral invariant, 171
generators of Sp(2n), 155 integral surface, 477
gradient map, 220 integrator S(τ ) has a formal Lie expression,
gradient mapping, 219 381
gradient transformation, 174 invariance of generating functions, 261
graph of gradient map, 219 invariant groups for scalar products, 119
graph of symplectic map, 219 invariant integral, 192
Grassmann algebra, 75 invariant tori, 574
Grassmann manifold, 143 invariant under the group G of symplectic
Green theorem, 97 transformations, 234
Gronwall inequality, 241 invariant under the phase flow of any
group homomorphism, 126 quadratic Hamiltonian, 235
group of contact transformations, 483 invariants under congruences, 132
inverse mapping, 39
H isomorphic mapping, 39
isotropic subspace, 138
H-stability, 401
H-stability interval of explicit scheme, 404
Hamilton–Jacobi equation, 182, 233, 462, J
602
Hamilton–Jacobi equation for contact system, Jacobi identity, 124, 177
494 J4n , J˜4n -Lagrangian submanifold, 219, 622
Hamiltonian function, 187 jet bundles, 643
Hamiltonian mechanics, 165, 168
Hamiltonian operator, 500 K
Hamiltonian phase flow, 171 "
K-Lagrangian submanifold, 623
Hamiltonian systems, 187 K(z, t)-symplectic, 621
Hamiltonian vector fields, 167, 170 k-forms, 67
Hamilton–Jacobi equation, 627
heavy top, 534 K-symplectic scheme, 622
Hermitian form, 117 K-symplectic structure, 190
Hermitian, anti-Hermitian, 116 KAM iteration process, 556
high order symplectic-energy integrator, 600 KAM theorem, 551
Hodge operator, 88 KAM theorem of symplectic algorithms, 559
homeomorphism, 39 Kane–Marsden–Ortiz integrator, 587
homogeneous symplectic, 484
homology space, 99 L
homomorphism, 39
Hopf algebra, 433 L-stability, 550
horizontal variation of q i , 586 labeled n-tree λτ , 297
hyperplane, 478 labeled P -tree, 309
hypersurface, 477 labeled graph, 292
labeled trees, 298
I Lagrange 2-form in FEM, 607
Lagrangian 2-form, 583
immersed submanifold, 48 Lagrangian density, 643
Lagrangian mechanics, 581 momentum equation, 644
Lagrangian submanifold, 182, 250 momentum flux, 645
Lagrangian subspace, 138 momentum mapping, 502
Lee-variational integrator, 581 monomial, 207
left translation action, 503 monomorphism, 39
Legendre transform, 645 monotonic rooted labeled trees, 298
Legendre–Hodge transformation, 645 Morse–Smale systems, 551
Lie algebra, 125, 179, 190, 409 multi-stage P–R–K method, 473
Lie algebra of conformally invariant groups, multisymplectic Birkhoffian systems, 656
128 multisymplectic conservation law, 605, 645
Lie bracket, 409 multisymplectic dissipation law, 656
Lie derivative, 103 multisymplectic form, 644
Lie group, 125 multisymplectic form formula, 644
Lie group action, 502 multisymplectic Fourier pseudospectral
Lie series, 377 methods, 654
Lie–Poisson bracket, 501, 504 multisymplectic geometry, 643
Lie–Poisson equation, 504 multisymplectic Hamiltonian system, 605
Lie–Poisson scheme, 519 multisymplectic Hamiltonian system for KdV
Lie–Poisson systems, 501 equation, 648
Lie–Poisson–Hamilton–Jacobi equation, 514
lifted action, 502 KGS equation, 649
linear damped oscillator, 629 multisymplectic Hamiltonian system for
linear fractional transformation, 213 Schrödinger equation, 647
linear Hamiltonian systems, 192 multisymplectic Hamiltonian system for
linear multistep method, 347 sine-Gordon equation, 646
Liouville frequency vectors, 552 multisymplectic Hamiltonian systems, 645
Liouville’s phase-volume conservation law, multisymplectic integrators, 646
189 multisymplectic integrators for modified
Liouville’s theorem, 172, 443 equations, 655
Lobatto III A, 279, 280 multisymplectic-energy-momentum
Lobatto III B, 279, 280 integrators, 605
Lobatto III C, 279, 280
Lobatto IIIC-IIIC, 472 N
Lobatto polynomial, 279
local coordinate systems, 40 natural product symplectic structure, 249
log maps, 412 near-0 formal power series, 409
logarithmic map, 434 near-1 formal power series, 409
loglike function, 350 nilpotent of degree 2, 204
Noether theorem, 179
M non-exceptional matrices, 197
non-existence of symplectic schemes
Möbius strip, 61 preserving energy, 273
manifold, 40 non-superfluous tree, 299
matrix representation of subspaces, 143 nonautonomous Birkhoff’s equation, 619
maximum non-degeneracy, 477 nonautonomous Hamiltonian System, 242
modified centered Euler scheme of sixth nonconservative multisymplectic Hamilto-
order, 433 nian systems, 654
modified equation, 334, 432 nonexistence of SLMM for nonlinear
modified equation for centered Euler scheme, Hamiltonian systems, 356
336, 433 nonresonant frequencies, 570
modified integrators, 432 normal Darboux matrix, 232, 239, 494
momentum, 502 normal Darboux matrix of a symplectic
momentum conservation law, 605, 645 transformation, 600
momentum density, 645 normalization coefficient B-series, 418
normalization Darboux transformation, 251 product preservation property of Lie series,
normalizing conditions, 453 379
null space of 1-form, 478 prolongation spaces, 643
number of essential different labelings α(t), proper mapping, 51
294 properties of Lie series, 379
numerical version of KAM theorem, 564 pull-back, 80
pull-back mapping, 374
O push-forward mapping, 374
obstruction, 450 Q
one-form (1-form), 66
one-leg weighted Euler schemes, 231 quadratic bilinear form, 115
one-parameter group of canonical maps, 221 quaternion form, 524
operation ∧, 65
optimization method, 603
orbit-preserving schemes, 527
order conditions for symplectic R–K–N Radau I A, 279
method, 319 Radau IA-IA, 471
orientable differentiable manifold, 59 Radau II A, 280
orientable vector spaces, 59 Radau IIA-IIA, 472
Radau polynomial, 279
orthogonal group O(n, F), 119
rational fraction, 200
real representation of complex vector space,
P 121
reduction method, 540
P–R–K method, 302
reflective polynomial, 158
Padé approximation, 193
regular submanifold, 51, 53
Padé approximation table, 196
relationship between rooted tree and
partitions and skeletons, 418 elementary differential, 293
Pfaffian theorem, 118 resonant, 568
phase flow, 102, 221, 408 revertible approximations, 450
phase flow of contact system, 483 Riemann structure, 167
phase flow etF , 235
phase space, 102 rigid body in Euclidean space, 523
phase-area conservation law, 189 Rodrigue formula, 543
Poincaré lemma, 85, 220, 222 root isomorphism, 298
Poincaré transformation, 250 rooted n-tree, 299
Poincaré’s generating function and H.J. rooted P -tree, 309
equation, 223 rooted S-tree, 321
Poisson bracket, 177, 192, 499 rooted 3-tree, 298
Poisson manifold, 499 rooted labeled n-tree ρλτ , 297
Poisson mapping, 500 rooted labeled P -tree, 309
Poisson scheme, 508 rooted labeled S-tree, 321
Poisson system, 500 rooted labeled 3-tree, 298
Poisson theorem, 179 rooted labeled trees, 298
postprocessed vector field, 432
Preissman integrator, 646 S
preprocessed vector field integrators, 432
preserve all quadratic first integrals of system, S-graph, 321
236 S-orthogonal group, 119
preserve angular momentum pT Bq, 236 S-tree, 321
preserving the contact structure, 483 scalar product, 117
presymplectic form, 645 section of tangent bundle, 62
presymplectic forms, 605 self-adjoint integrator, 376
product of cotangent bundles, 249 self-adjoint method, 372
semi-autonomous Birkhoff’s equation, 618 symplectic conditions for R–K method, 281
separable Hamiltonian system, 202 symplectic explicit R–K–N method
separable systems for source-free systems, (non-redundant 5-stage fifth order), 331
447 symplectic form, 118
sesquilinear form, 116 symplectic geometry, 165, 188
simplify symplectic R–K conditions, 300 symplectic group, 188
simplifying condition of R–K method, 279 symplectic group Sp(2n), 144
Sm(2n) matrices, 600
small twist mappings, 558 symplectic invariant algorithms, 235
some theorems about Sp(2n), 151 symplectic leave, 505
sons of the root, 297 symplectic LMM for linear Hamiltonian
source-free system, 443, 449, 467 systems, 348
Sp(2n) matrices, 600 symplectic manifold, 165
SpD2n the totality of symplectic operators, symplectic map, 220
232 symplectic mapping, 215
SpD2n the set of symplectic transformations, symplectic matrix, 189
601 symplectic operators near identity, 232
special linear group SL(n, F ), 119 symplectic pair, 217
special separable source-free systems, 458 symplectic R–K method, 277, 279
special type Sp2n (I), 150 symplectic R–K–N method, 319
special type Sp2n (II), 151 symplectic R–K–N method (3-stage and 4-th
special type Sp2n (III), 151 order), 323
special type Sp2n (I, II), 151 symplectic schemes for Birkhoffian Systems,
special types of Sp(2n), 148 625
stability analysis for composition scheme, symplectic schemes for nonautonomous
388 system, 244
standard antisymmetric matrix, 192 symplectic space, 137
standard symplectic structure, 169, 188, 249 symplectic structure, 137, 165, 215, 477
star operators, 88 symplectic structure for trapezoidal scheme,
step size resonance, 568 202
step transition, 415 symplectic structure in product space, 215
step-forward operator, 240 symplectic subspace, 137
Stokes theorem, 93 symplectic-energy integrator, 596, 602
structure-stability, 551 symplectic-energy-momentum, 581
subalgebra of a Lie algebra, 179 symplectically separable Hamiltonian
submanifold, 46 systems, 205
submersion, 51 symplectization of contact space, 487
substitution law, 432 symplified order conditions for symplectic
superfluous trees, 298 R–K–N method, 327
surjective, 39 symplified order conditions of explicit
Sylvester’s law of inertia, 132 symplectic R–K method, 307
Symm(2n) the set of symmetric
transformations, 601 T
symm(2n) the totality of symmetric
operators, 232 table of coefficient ω(τ ) for trees of order
symmetric operators near nullity, 232  5, 435
symmetric pair, 216 table of coefficients σ(τ ), γ(τ ), b̆(τ ), and b(τ ),
symmetric product, 117 434
symmetrical composition, 376 table of composition laws for the trees of
symmetry of tree σ(t), 294 order ≤ 4, 436
symplectic algebra, 216 table of substitution law ∗ defined in for the
symplectic algorithms as small twist trees of order ≤ 5, 437
mappings, 560 tangent bundle, 56
symplectic basis, 145 tangent mapping, 58
tangent space, 44
tangent vector, 43
the elementary differential, 291
the inverse function to exp, 126
the order of tree r(t), 294
time-dependent gradient map, 221
topological manifold, 40
total variation for Lagrangian mechanics, 583
total variation in Hamiltonian mechanics, 593
transversal, 54, 140, 143
transversal Lagrangian subspaces, 148
transversality condition, 181, 213, 221, 225,
227, 250, 251, 460, 623
trapezoidal method, 416
trapezoidal scheme, 201
tree, 298
trivial tangent bundle, 57
truncation, 233
two-forms (2-forms), 66
U
Unitary group U (n, C), 119
Unitary product, 118
V
variational integrators, 651
variational principle in Hamiltonian
mechanics, 591
vector field, 62
vertical vector field, 582
Veselov–Moser algorithm, 539
volume-preserving 2-Stage P–R–K methods,
471
volume-preserving P-R–K method, 467
volume-preserving R–K method, 467
volume-preserving schemes, 444
W
W -transformation, 304, 470
white (meagre) vertex, 309
Witt theorem, 132
X
X-matrix, 305