
Valery G. Romanovski
Douglas S. Shafer

The Center and Cyclicity Problems:
A Computational Algebra Approach

Birkhäuser
Boston • Basel • Berlin
Valery G. Romanovski
Center for Applied Mathematics and Theoretical Physics
University of Maribor
Krekova 2
2000 Maribor, Slovenia
[email protected]

Douglas S. Shafer
Department of Mathematics
University of North Carolina
Charlotte, NC 28025
USA
[email protected]

ISBN 978-0-8176-4726-1 eISBN 978-0-8176-4727-8


DOI 10.1007/978-0-8176-4727-8

Library of Congress Control Number: PCN applied for

Mathematics Subject Classification (2000): 34C07, 37G15, 37G05, 34C23, 34C14, 34-01, 37-01,
13-01, 14-01

© Birkhäuser Boston, a part of Springer Science+Business Media, LLC 2009


All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Birkhäuser Boston, c/o Springer Science+Business Media, LLC, 233
Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or
scholarly analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use in this publication of trade names, trademarks, service marks and similar terms, even if they
are not identified as such, is not to be taken as an expression of opinion as to whether or not they are
subject to proprietary rights.

Cover designed by Alex Gerasev.

Printed on acid-free paper.

Springer is part of Springer Science+Business Media (www.springer.com)


To our families and teachers.
Preface

The primary object of study in this book is small-amplitude periodic solutions of
two-dimensional autonomous systems of ordinary differential equations,

ẋ = P(x, y), ẏ = Q(x, y),

for which the right-hand sides are polynomials. Such systems are called polynomial
systems. If the origin is an isolated singularity of a polynomial (or real analytic)
system, and if there does not exist an orbit that tends to the singularity, in either for-
ward or reverse time, with a definite limiting tangent direction, then the singularity
must be either a center, in which case there is a neighborhood of the origin in which
every orbit except the origin is periodic, or a focus, in which case there is a neigh-
borhood of the origin in which every orbit spirals towards or away from the origin.
The problem of distinguishing between a center and a focus for a given polynomial
system or a family of such systems is known as the Poincaré center problem or the
center-focus problem. Although it dates from the end of the 19th century, it is com-
pletely solved only for linear and quadratic systems (max{deg(P), deg(Q)} equal to
1 or 2, respectively) and a few particular cases in families of higher degree.
Relatively simple analysis shows that when the matrix of the linearization of the
system at the singular point has eigenvalues with nonzero real parts, the singular
point is a focus. If, however, the real parts of the eigenvalues are zero then the type
of the singular point depends on the nonlinear terms of polynomials in a nontrivial
way. A general method due to Poincaré and Lyapunov reduces the problem to that
of solving an infinite system of polynomial equations whose variables are param-
eters of the system of differential equations. That is, the center-focus problem is
reduced to the problem of finding the variety of the ideal generated by a collection
of polynomials, called the focus quantities of the system.
A second problem, called the cyclicity problem, is to estimate the number of
limit cycles, that is, isolated periodic solutions, that can bifurcate from a center or
focus when the coefficients of the system of differential equations are perturbed by
an arbitrarily small amount, but in such a way as to remain in a particular family
of systems, for example in the family of all quadratic polynomial systems if the


original system was quadratic. This problem is a part of the still unresolved 16th
Hilbert problem and is often called the local 16th Hilbert problem. In fact, in order
to find an upper bound for the cyclicity of a center or focus in a polynomial system
it is sufficient to obtain a basis for the above-mentioned ideal of focus quantities.
Thus the study of these two famous problems in the qualitative theory of differential
equations can be carried out through the study of polynomial ideals, that is, through
the study of an object of commutative algebra.
Recent decades have seen a surge of interest in the center and cyclicity prob-
lems. Certainly an important reason for this is that the resolution of these problems
involves extremely laborious computations, which nowadays can be carried out us-
ing powerful computational facilities. Applications of concepts that could not be
utilized even 30 years ago are now feasible, often even on a personal computer,
because of advances in the mathematical theory, in the computer software of com-
putational algebra, and in computer technology. This book is intended to give the
reader a thorough grounding in the theory, and explains and illustrates methods of
computational algebra, as a means of approaching the center-focus and cyclicity
problems.
The methods we present can be most effectively exploited if the original real
system of differential equations is properly complexified; hence, the idea of com-
plexifying a real system, and more generally working in a complex setting, is one
of the central ideas of the text. Although the idea of extracting information about a
real system of ordinary differential equations from its complexification goes back
to Lyapunov, it is still relatively scantily used. Our belief that it deserves exposition
at the level of a textbook has been a primary motivation for this work. In addition
to that, it has appeared to us that by and large specialists in the qualitative theory
of differential equations are not well versed in these new methods of computational
algebra, and conversely that there appears to be a general lack of knowledge on
the part of specialists in computational algebra about the possibility of an algebraic
treatment of these problems of differential equations. We have written this work
with the intention of trying to help to draw together these two mathematical com-
munities.
Thus, the readers we have had in mind in writing this work have been gradu-
ate students and researchers in nonlinear differential equations and computational
algebra, and in fields outside mathematics in which the investigation of nonlinear
oscillation is relevant. The book is designed to be suitable for use as a primary text-
book in an advanced graduate course or as a supplementary source for beginning
graduate courses. Among other things, this has meant motivating and illustrating
the material with many examples, and including a great many exercises, arranged in
the order in which the topics they cover appear in the text. It has also meant that we
have given complete proofs of a number of theorems that are not readily available in
the current literature and that we have given much more detailed versions of proofs
that were written for specialists. All in all, researchers working in the theory of limit
cycles of polynomial systems should find it a valuable reference resource, and be-
cause it is self-contained and written to be accessible to nonspecialists, researchers
in other fields should find it an understandable and helpful introduction to the tools

they need to study the onset of stable periodic motion, such as ideals in polynomial
rings and Gröbner bases.
The first two chapters introduce the primary technical tools for this approach
to the center and cyclicity problems, as well as questions of linearizability and
isochronicity that are naturally investigated in the same manner. The first chapter
lays the groundwork of computational algebra. We give the main properties of ide-
als in polynomial rings and their affine varieties, explain the concept of Gröbner
bases, a key component of various algorithms of computational algebra, and provide
explicit algorithms for elimination and implicitization problems and for basic opera-
tions on ideals in polynomial rings and on their varieties. The second chapter begins
with the main theorems of Lyapunov’s second method, theorems that are aimed at
the investigation of the stability of singularities (in this context often termed equi-
librium points) by means of Lyapunov functions. We then cover the basics of the
theory of normal forms of ordinary differential equations, including an algorithm
for the normalization procedure and a criterion for convergence of normalization
transformations and normal forms.
Chapter 3 is devoted to the center problem. We describe how the concept of a
center can be generalized to complex systems, in order to take advantage of work-
ing over the algebraically closed field C in place of R. This leads to the study of
the variety, in the space of parameters of the system, that corresponds to systems
with a center, which is called the center variety. We present an efficient compu-
tational algorithm for computing the focus quantities, which are the polynomials
that define the center variety. Then we describe two main mechanisms for prov-
ing the existence of a center in a polynomial system, Darboux integrability and
time-reversibility, thereby completing the description of all the tools needed for this
method of approach to the center-focus problem. This program and its efficiency
are demonstrated by applying it to resolve the center problem for the full family of
quadratic systems and for one particular family of cubic systems. In a final section,
as a complement to the rest of the chapter, particularly aspects of symmetry, the
important special case of Liénard systems is presented.
If all solutions in a neighborhood of a singular point are periodic, then a ques-
tion that arises naturally is whether all solutions have the same period. This is the
so-called isochronicity problem that has attracted study from the time of Huygens
and the Bernoullis. In Chapter 4 we present a natural generalization of the concept
of isochronicity to complex systems of differential equations, the idea of lineariz-
ability. We then introduce and develop methods for investigating linearizability in
the complex setting.
As indicated above, one possible mechanism for the existence of a center is time-
reversibility of the system. Chapter 5 presents an algorithm for computing all time-
reversible systems within a given polynomial family. This takes on additional im-
portance because in all known cases the set of time-reversible systems forms exactly
one component of the center variety. The algorithm is derived using the study of in-
variants of the rotation group of the system and is a nice application of that theory
and the algebraic theory developed in Chapter 1.

The last chapter is devoted to the cyclicity problem. We describe Bautin’s
method, which reduces the study of cyclicity to finding a basis of the ideal of focus
quantities, and then show how to obtain the solution for the cyclicity problem in
the case that the ideal of focus quantities is radical. In the case that the ideal gen-
erated by the first few focus quantities is not radical, the problem becomes much
more difficult; at present there is no algorithmic approach for its treatment. Nev-
ertheless we present a particular family of cubic systems for which it is possible,
using Gröbner basis calculations, to obtain a bound on cyclicity. Finally, as a further
illustration of the applicability of the ideas developed in the text, we investigate the
problem of the maximum number of cycles that can maintain the original period of
an isochronous center in R2 when it is perturbed slightly within the collection of
centers, the so-called problem of bifurcation of critical periods.
Specialists perusing the table of contents and the bibliography will surely miss
some of their favorite topics and references. For example, we have not mentioned
methods that approach the center and cyclicity problems based on the theory of
resultants and triangular decomposition, and have not treated the cyclicity problem
specifically in the important special case of Liénard systems, such as we did for the
center problem. We are well aware that there is much more that could be included,
but one has to draw the line somewhere, and we can only say that we have made
choices of what to include and what to omit based on what seemed best to us, always
with an eye to what we hoped would be most valuable to the readers of this book.
The first author acknowledges the financial support of this work by the Slovenian
Research Agency. We thank all those with whom we consulted on various aspects of
this work, especially Vladimir Basov, Carmen Chicone, Freddy Dumortier, Maoan
Han, Evan Houston, and Dongming Wang.

Maribor, Charlotte                                    Valery G. Romanovski
May 2008                                              Douglas S. Shafer
Contents

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

Notation and Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

1 Polynomial Ideals and Their Varieties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1


1.1 Fundamental Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 The Ideal Membership Problem and Gröbner Bases . . . . . . . . . . . . . . 7
1.3 Basic Properties and Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.4 Decomposition of Varieties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.5 Notes and Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

2 Stability and Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57


2.1 Lyapunov’s Second Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.2 Real Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.3 Analytic and Formal Normal Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
2.4 Notes and Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

3 The Center Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89


3.1 The Poincaré First Return Map and the Lyapunov Numbers . . . . . . . 91
3.2 Complexification of Real Systems, Normal Forms, and the Center
Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.3 The Center Variety . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
3.4 Focus Quantities and Their Properties . . . . . . . . . . . . . . . . . . . . . . . . . 118
3.5 Hamiltonian and Reversible Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 128
3.6 Darboux Integrals and Integrating Factors . . . . . . . . . . . . . . . . . . . . . . 136
3.7 Applications: Quadratic Systems and a Family of Cubic Systems . . . 147
3.8 The Center Problem for Liénard Systems . . . . . . . . . . . . . . . . . . . . . . . 158
3.9 Notes and Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167


4 The Isochronicity and Linearizability Problems . . . . . . . . . . . . . . . . . . . 175


4.1 The Period Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
4.2 Isochronicity Through Normal Forms and Linearizability . . . . . . . . . 177
4.3 The Linearizability Quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
4.4 Darboux Linearization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
4.5 Linearizable Quadratic Centers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
4.6 Notes and Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

5 Invariants of the Rotation Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213


5.1 Properties of Invariants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
5.2 The Symmetry Ideal and the Set of Time-Reversible Systems . . . . . . 229
5.3 Axes of Symmetry of a Plane System . . . . . . . . . . . . . . . . . . . . . . . . . . 237
5.4 Notes and Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

6 Bifurcations of Limit Cycles and Critical Periods . . . . . . . . . . . . . . . . . . 249


6.1 Bautin’s Method for Bifurcation Problems . . . . . . . . . . . . . . . . . . . . . . 250
6.2 The Cyclicity Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
6.3 The Cyclicity of Quadratic Systems and a Family of Cubic Systems 269
6.4 Bifurcations of Critical Periods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
6.5 Notes and Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301

Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313

Index of Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323

Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
List of Tables

1.1 The Multivariable Division Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 12


1.2 The Computations of Example 1.2.9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3 Buchberger’s Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.4 The Radical Membership Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
1.5 Algorithm for Computing I ∩ J . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.6 Algorithm for Computing I : J . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.7 Singular Output of Example 1.4.12 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.8 Singular Output of Example 1.4.13 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
1.9 Singular Output of Example 1.4.12 Using minAssChar . . . . . . . . . . 46
1.10 The Euclidean Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

2.1 Normal Form Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

3.1 The Focus Quantity Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128


3.2 Generators of Isym for System (3.100). . . . . . . . . . . . . . . . . . . . . . . . . . . 136

4.1 The Linearizability Quantities Algorithm . . . . . . . . . . . . . . . . . . . . . . . . 199

5.1 Algorithm for Computing Isym and a Hilbert Basis of M . . . . . . . . . . . 235

6.1 Reduced Gröbner Basis of B3 for System (3.129) . . . . . . . . . . . . . . . . 272


6.2 Normal Form Coefficients for System (6.51) . . . . . . . . . . . . . . . . . . . . . 292
6.3 Isochronicity Quantities for System (6.51) . . . . . . . . . . . . . . . . . . . . . . . 293

Notation and Conventions

N the set of natural numbers {1, 2, 3, . . .}


N0 N ∪ {0}
Z the ring of integers
Q the field of rational numbers
R the field of real numbers
C the field of complex numbers
A⊂B A is a subset of B, A = B allowed
A⊊B A is a proper subset of B
A\B elements that are in A and are not in B
See the Index of Notation beginning on p. 323 for a full list of notation.

Chapter 1
Polynomial Ideals and Their Varieties

As indicated in the Preface, solutions of the fundamental questions addressed in
this book, the center and cyclicity problems, are expressed in terms of the sets of
common zeros of collections of polynomials in the coefficients of the underlying
family of systems of differential equations. These sets of common zeros are termed
varieties. They are determined not so much by the specific polynomials themselves
as by larger collections of polynomials, the so-called ideals that the original collec-
tions of polynomials generate. In the first section of this chapter we discuss these
basic concepts: polynomials, varieties, and ideals. An ideal can have more than one
set of generating polynomials, and a fundamental problem is that of deciding when
two ideals, hence the varieties they determine, are the same, even though presented
by different sets of generators. To address this and related issues, in Sections 1.2
and 1.3 we introduce the concept of a Gröbner basis and certain fundamental tech-
niques and algorithms of computational algebra for the study of polynomial ideals
and their varieties. The last section is devoted to the decomposition of varieties into
their simplest components and shows how this decomposition is connected to the
structure of the generating ideals. For a fuller exposition of the concepts presented
here, the reader may consult [1, 18, 23, 60].
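As a foretaste of how such questions become algorithmic, the test for whether two generating sets present the same ideal can be phrased through reduced Gröbner bases, which are canonical once a monomial order is fixed. The following sketch uses the sympy library (our choice for illustration only; it is not used in the book, whose computations are carried out in Singular):

```python
# Sketch using the sympy library (an assumed external tool, not the book's
# code): two generating sets define the same ideal exactly when their reduced
# Groebner bases, for a fixed monomial order, coincide.
from sympy import symbols, groebner

x, y = symbols('x y')

G1 = groebner([x + y, x - y], x, y, order='lex')
G2 = groebner([x, y], x, y, order='lex')

# Both reduce to the basis [x, y], so <x + y, x - y> = <x, y> in Q[x, y].
print(list(G1.exprs) == list(G2.exprs))  # True
```

Because the reduced Gröbner basis of an ideal is unique for a given monomial order, comparing the two bases decides equality of the ideals without exhibiting an explicit conversion of one generating set into the other.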

1.1 Fundamental Concepts

A polynomial in variables x_1, x_2, …, x_n with coefficients in a field k is a formal expression of the form

f = ∑_{α ∈ S} a_α x^α ,    (1.1)

where S is a finite subset of N_0^n, a_α ∈ k, and for α = (α_1, α_2, …, α_n), x^α denotes the monomial x_1^{α_1} x_2^{α_2} · · · x_n^{α_n}. In most cases of interest k will be Q, R, or C. The product a_α x^α is called a term of the polynomial f. The set of all polynomials in the variables x_1, …, x_n with coefficients in k is denoted by k[x_1, …, x_n]. With the natural and well-known addition and multiplication, k[x_1, …, x_n] is a commutative ring. The


full degree of a monomial x^α is the number |α| = α_1 + · · · + α_n. The full degree of a term a_α x^α is the full degree of the monomial x^α. The full degree of a polynomial f as in (1.1), denoted by deg(f), is the maximum of |α| among all monomials (with nonzero coefficients a_α, of course) of f.
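As a small illustrative sketch (not from the text), a polynomial of form (1.1) can be stored as a dictionary mapping each exponent tuple α to its coefficient a_α, after which the full degree is a one-line computation:

```python
# Sketch (not from the text): a polynomial of form (1.1) stored as a dict
# mapping exponent tuples alpha to coefficients a_alpha; the full degree is
# the maximum of |alpha| = alpha_1 + ... + alpha_n over nonzero terms.

def full_degree(terms):
    """deg(f) for a polynomial given as {alpha: a_alpha}."""
    return max(sum(alpha) for alpha, coeff in terms.items() if coeff != 0)

# f = 3*x1^2*x2 - x2^3 + 5*x1 in k[x1, x2]
f = {(2, 1): 3, (0, 3): -1, (1, 0): 5}
print(full_degree(f))  # 3
```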
If a field k and a natural number n are given, then we term the set

k^n = {(a_1, …, a_n) : a_1, …, a_n ∈ k}

n-dimensional affine space. If f is the polynomial in (1.1) and (a_1, …, a_n) ∈ k^n, then f(a_1, …, a_n) will denote the element ∑_{α} a_α a_1^{α_1} · · · a_n^{α_n} of k. Thus, to any polynomial f ∈ k[x_1, …, x_n] is associated the function f : k^n → k defined by

f : (a_1, …, a_n) ↦ f(a_1, …, a_n) .

This ability to consider polynomials as functions defines a kind of duality between
the algebra and geometry of affine spaces. In the case of an arbitrary field k this
interconnection between polynomials and functions on affine spaces can hold some
surprises. For example, the statements “f is the zero polynomial” (all coefficients a_α are equal to zero) and “f is the zero function” (f|_{k^n} ≡ 0) are not necessarily
equivalent (see Exercise 1.1). However, we will work mainly with the infinite fields
Q, R, and C, for which the following two statements show that our naive intuition
is correct.

Proposition 1.1.1. Let k be an infinite field and f ∈ k[x_1, …, x_n]. Then f is the zero element of k[x_1, …, x_n] (that is, all coefficients a_α of f are equal to zero) if and only if f : k^n → k is the zero function.

Proof. Certainly if every coefficient of the polynomial f is equal to zero, then the corresponding function is the zero function. We must establish the converse:

If f(a_1, …, a_n) = 0 for all (a_1, …, a_n) ∈ k^n, then f is the zero polynomial. (1.2)

We will do this by induction on the number of variables in the polynomial ring.


Basis step. For n = 1, the antecedent in (1.2) means that either (i) f is the zero
polynomial or (ii) deg( f ) is defined and at least 1 and f has infinitely many roots.
It is well known, however (Exercise 1.2), that every polynomial f ∈ k[x] for which
deg( f ) = s > 0 has at most s roots. Hence only alternative (i) is possible, so (1.2)
holds for n = 1.
Inductive step. Suppose (1.2) holds in the ring k[x_1, …, x_p] for p = 1, 2, …, n − 1. Let f ∈ k[x_1, …, x_n] be such that the antecedent in (1.2) holds for f. We can write f in the form

f = ∑_{j=0}^{m} g_j(x_1, …, x_{n−1}) x_n^j

for some finite m, and will show that g_j is the zero polynomial for each j, 0 ≤ j ≤ m. This will imply that f is the zero polynomial. Thus fix any a = (a_1, …, a_{n−1}) ∈ k^{n−1} and define f_a ∈ k[x_n] by
f_a = ∑_{j=0}^{m} g_j(a_1, …, a_{n−1}) x_n^j .

By hypothesis, f_a(a_n) = 0 for all a_n ∈ k. Hence, by the induction hypothesis, f_a is the zero polynomial; that is, its coefficients g_j(a_1, …, a_{n−1}) are equal to zero for all j, 0 ≤ j ≤ m. But (a_1, …, a_{n−1}) was an arbitrary point in k^{n−1}, hence the evaluation function corresponding to g_j is the zero function for j = 0, …, m, which, by the induction hypothesis, implies that g_j is the zero polynomial for j = 0, …, m, as required. Thus the proposition holds. □
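The hypothesis that k be infinite cannot be dropped (compare Exercise 1.1). A quick check, sketched here in Python and not taken from the text, exhibits the standard counterexample over the two-element field F₂:

```python
# Sketch: over the finite field F_2 = {0, 1} the polynomial f(x) = x^2 + x is
# not the zero polynomial, yet its evaluation function is identically zero,
# so Proposition 1.1.1 genuinely needs k infinite (compare Exercise 1.1).

def f(a, p=2):
    """Evaluate x^2 + x at a, with arithmetic in the field F_p."""
    return (a * a + a) % p

vanishes_everywhere = all(f(a) == 0 for a in range(2))
print(vanishes_everywhere)  # True
```

Every a ∈ F₂ satisfies a² = a, so x² + x vanishes identically on F₂ even though its coefficients are not all zero.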

The proposition yields the following result.

Corollary 1.1.2. If k is an infinite field and f and g are elements of k[x_1, …, x_n], then f = g in k[x_1, …, x_n] if and only if the functions f : k^n → k and g : k^n → k are equal.

Proof. Suppose f and g in k[x_1, …, x_n] define the same function on k^n. Then f − g is the zero function. Hence, by Proposition 1.1.1, f − g is the zero polynomial in k[x_1, …, x_n], so that f = g in k[x_1, …, x_n]. The converse is clear. □

Throughout this chapter, unless otherwise indicated, k will denote an arbitrary


field. The main geometric object of study in this chapter is what is called an affine
variety in k^n, defined as follows.

Definition 1.1.3. Let k be a field and let f_1, …, f_s be (finitely many) elements of k[x_1, …, x_n]. The affine variety defined by the polynomials f_1, …, f_s is the set

V(f_1, …, f_s) = {(a_1, …, a_n) ∈ k^n : f_j(a_1, …, a_n) = 0 for 1 ≤ j ≤ s} .

An affine variety is a subset V of k^n for which there exist finitely many polynomials f_1, …, f_s such that V = V(f_1, …, f_s). A subvariety of V is a subset of V that is itself an affine variety.

In other words, the affine variety V(f_1, …, f_s) ⊂ k^n is the set of solutions of the system

f_1 = 0, f_2 = 0, …, f_s = 0    (1.3)

of finitely many polynomial equations in k^n. Of course, this set depends on k and could very well be empty: V(x^2 + y^2 + 1) = ∅ for k = R but not for k = C, while V(x^2 + y^2 + 1, x, y) = ∅ no matter what k is, since k is a field.
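Membership of a point in V(f_1, …, f_s) is exactly the condition that every generator vanishes there, so it can be tested directly; in the following sketch the polynomials are illustrative choices, not taken from the text:

```python
# Sketch: a point lies in the affine variety V(f1, ..., fs) exactly when every
# generator vanishes there (system (1.3)). The polynomials below are
# illustrative choices only.

def in_variety(point, polys):
    """True iff f(point) == 0 for every f in polys."""
    return all(f(*point) == 0 for f in polys)

f1 = lambda x, y: x**2 + y**2 - 2   # a circle
f2 = lambda x, y: x - y             # a line through the origin

print(in_variety((1, 1), [f1, f2]))   # True: (1, 1) satisfies both equations
print(in_variety((1, -1), [f1, f2]))  # False: f2(1, -1) = 2
```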
The following proposition gives an important property of affine varieties. The
proof is left as Exercise 1.3, in which the reader is asked to prove in addition that
the arbitrary (that is, possibly infinite) intersection of affine varieties is still an affine
variety.

Proposition 1.1.4. If V ⊂ k^n and W ⊂ k^n are affine varieties, then V ∪ W and V ∩ W are also affine varieties.

It is easy to see that, given an affine variety V, the collection of polynomials {f_1, …, f_s} such that V = V(f_1, …, f_s) is not unique, and thus cannot be uniquely recovered from the point set V. For example, for any a and b in k, a ≠ 0, it is apparent that V(f_1, …, f_s) = V(a f_1 + b f_2, f_2, …, f_s). See also Example 1.1.13 and
Proposition 1.1.11. In order to connect a given variety with a particular collection
of polynomials, we need the concept of an ideal, the main algebraic object of study
in this chapter.

Definition 1.1.5. An ideal of k[x1 , . . . , xn ] is a subset I of k[x1 , . . . , xn ] satisfying


(a) 0 ∈ I,
(b) if f , g ∈ I then f + g ∈ I, and
(c) if f ∈ I and h ∈ k[x1 , . . . , xn ], then h f ∈ I.

Let f_1, …, f_s be elements of k[x_1, …, x_n]. We denote by ⟨f_1, …, f_s⟩ the set of all linear combinations of f_1, …, f_s with coefficients from k[x_1, …, x_n]:

⟨f_1, …, f_s⟩ = { ∑_{j=1}^{s} h_j f_j : h_1, …, h_s ∈ k[x_1, …, x_n] } .    (1.4)
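Definition (1.4) can be made concrete with a small sketch (illustrative only, not the book's code): storing polynomials as dictionaries from exponent tuples to coefficients, a membership certificate for an ideal is just an explicit combination ∑ h_j f_j:

```python
# Sketch: polynomials in k[x, y] as dicts {exponent tuple: coefficient};
# padd and pmul implement the ring operations, and an explicit combination
# h1*f1 + h2*f2 certifies membership in the ideal generated by f1, f2.

def padd(f, g):
    """Sum of two polynomials, dropping zero coefficients."""
    h = dict(f)
    for expo, coeff in g.items():
        h[expo] = h.get(expo, 0) + coeff
        if h[expo] == 0:
            del h[expo]
    return h

def pmul(f, g):
    """Product of two polynomials."""
    h = {}
    for a, c in f.items():
        for b, d in g.items():
            expo = tuple(i + j for i, j in zip(a, b))
            h[expo] = h.get(expo, 0) + c * d
    return {expo: coeff for expo, coeff in h.items() if coeff != 0}

# With f1 = x, f2 = y and h1 = x, h2 = y, the combination h1*f1 + h2*f2
# exhibits x^2 + y^2 as an element of the ideal <x, y>.
fx = {(1, 0): 1}   # the polynomial x
fy = {(0, 1): 1}   # the polynomial y
combo = padd(pmul(fx, fx), pmul(fy, fy))
print(combo == {(2, 0): 1, (0, 2): 1})  # True
```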

It is easily seen that the set ⟨f_1, …, f_s⟩ is an ideal in k[x_1, …, x_n]. We call ⟨f_1, …, f_s⟩ the ideal generated by the polynomials f_1, …, f_s, and the polynomials themselves generators of the ideal. A generalization of this idea that will be important later is the following: if F is any nonempty subset of k[x_1, …, x_n] (possibly infinite), then we let ⟨f : f ∈ F⟩ denote the set of all finite linear combinations of elements of F with coefficients from k[x_1, …, x_n]. (Occasionally we will abbreviate the notation to just ⟨F⟩.) Then ⟨f : f ∈ F⟩ is also an ideal, the ideal generated by the elements of F, which are likewise called its generators (Exercise 1.4; see Exercise 1.38). An arbitrary ideal I ⊂ k[x_1, …, x_n] is called finitely generated if there exist polynomials f_1, …, f_s ∈ k[x_1, …, x_n] such that I = ⟨f_1, …, f_s⟩; the set {f_1, …, f_s} is called a basis of I. The concept of an ideal arises in the context of arbitrary commutative rings. In that setting an ideal need not be finitely generated, but in a polynomial ring over a field it must be:

Theorem 1.1.6 (Hilbert Basis Theorem). If k is a field, then every ideal in the
polynomial ring k[x1 , . . . , xn ] is finitely generated.

For a proof of the Hilbert Basis Theorem the reader is referred to [1, 60, 132, 195].

Corollary 1.1.7. Every ascending chain of ideals I1 ⊂ I2 ⊂ I3 ⊂ · · · in a polynomial


ring over a field k stabilizes. That is, there exists m ≥ 1 such that for every j > m,
I j = Im .

Proof. Let I1 ⊂ I2 ⊂ I3 ⊂ · · · be an ascending chain of ideals in k[x1 , . . . , xn ] and


set I = ∪_{j=1}^{∞} I_j, clearly an ideal in k[x1, ..., xn]. By the Hilbert Basis Theorem there
exist f1, ..., fs in k[x1, ..., xn] such that I = ⟨f1, ..., fs⟩. Choose any N ∈ N such that
F = {f1, ..., fs} ⊂ I_N, and suppose that g ∈ I_p for some p ≥ N. Since g ∈ I and F
is a basis for I, there exist h1, ..., hs ∈ k[x1, ..., xn] such that g = h1 f1 + ··· + hs fs.

But then because F ⊂ I_N and I_N is an ideal, g ∈ I_N. Thus I_p ⊂ I_N, and the ascending
chain has stabilized by I_N. □
Rings in which every ascending chain of ideals stabilizes are called
Noetherian rings. The Hilbert Basis Theorem and its corollary hold under the milder
condition that k be only a commutative Noetherian ring. Some condition is necessary,
though, which is why in the statements above we explicitly included the condition
that k be a field, which is enough for our purposes.
Occasionally we will find that it is important not to distinguish between two
polynomials whose difference lies in a particular ideal I. Thus, we define a relation
on k[x1 , . . . , xn ] by saying that f and g are related if f − g ∈ I. This relation is an
equivalence relation (Exercise 1.5) and is the basis for the following definition.
Definition 1.1.8. Let I be an ideal in k[x1 , . . . , xn ]. Two polynomials f and g in
k[x1 , . . . , xn ] are congruent modulo I, denoted f ≡ g mod I, if f − g ∈ I. The set
of equivalence classes is denoted k[x1 , . . . , xn ]/I.
As a simple example, if in R[x] we take I = ⟨x⟩, then f ≡ g mod I precisely when
f(x) − g(x) = x h(x) for some polynomial h. Hence f and g are equivalent if and
only if they have the same constant term.
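This congruence test is mechanical. The sketch below (using the SymPy library — a convenience of ours, not something the text prescribes) checks that f ≡ g mod ⟨x⟩ exactly when x divides f − g, that is, when f and g share a constant term:

```python
from sympy import symbols, div

x = symbols('x')

def congruent_mod_x(f, g):
    """f ≡ g mod <x> iff x divides f - g, i.e., the remainder is 0."""
    _, r = div(f - g, x, x)
    return r == 0

# Same constant term (5), so congruent modulo <x>:
print(congruent_mod_x(x**2 + 3*x + 5, 7*x + 5))   # True
# Different constant terms (5 vs 1): not congruent.
print(congruent_mod_x(x**2 + 5, x**2 + 1))        # False
```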
If for f ∈ k[x1, ..., xn] the equivalence class of f is denoted [f], then for any f1 and
f2 in [f] and for any g1 and g2 in [g], (f1 + g1) − (f2 + g2) ∈ I and f1 g1 − f2 g2 ∈ I.
We conclude that addition and multiplication are well-defined on k[x1, ..., xn]/I by
[f] + [g] = [f + g] and [f][g] = [fg], which give it the structure of a ring (Exercise 1.6).
Suppose f1, ..., fs ∈ k[x1, ..., xn] and consider system (1.3), whose solution set
is the affine variety V = V(f1, ..., fs). The reader may readily verify that for any
a ∈ k^n, a ∈ V if and only if f(a) = 0 for every f ∈ I = ⟨f1, ..., fs⟩. That is, V is the set of
common zeros of the full (typically infinite) set I of polynomials. Moreover, given
the ideal I, as the following proposition states, the particular choice of generators
is unimportant; the same variety will be determined. Thus, it is the ideal that determines
the variety, and not the particular collection of polynomials f1, ..., fs.
Proposition 1.1.9. Let f1, ..., fs and g1, ..., gm be bases of an ideal I ⊂ k[x1, ..., xn],
that is, I = ⟨f1, ..., fs⟩ = ⟨g1, ..., gm⟩. Then V(f1, ..., fs) = V(g1, ..., gm).
The straightforward proof is left to the reader.
We have seen how a finite collection of polynomials defines a variety. Conversely,
given a variety V, there is naturally associated to it an ideal. As already noted, the
collection of polynomials in a system (1.3) for which V is the solution set is not
unique, and neither is the ideal they generate, although any such ideal has the
property that V is precisely the subset of k^n on which every element of the ideal vanishes.
The ideal naturally associated to V is the one given in the following definition.
Definition 1.1.10. Let V ⊂ k^n be an affine variety. The ideal of the variety V is the
set

    I(V) = { f ∈ k[x1, ..., xn] : f(a1, ..., an) = 0 for all (a1, ..., an) ∈ V } .


In Exercise 1.7 the reader is asked to show that I(V) is an ideal in k[x1, ..., xn],
even if V is not a variety but simply an arbitrary subset of k^n. (See also the
discussion following Theorem 1.3.18.)
The ideal naturally associated to a variety V bears the following relation to the
family of ideals that come from the polynomials in any system of equations that
define V .

Proposition 1.1.11. Let f1, ..., fs be elements of k[x1, ..., xn]. Then the inclusion
⟨f1, ..., fs⟩ ⊂ I(V(f1, ..., fs)) always holds, but it may be strict.

Proof. Let f ∈ ⟨f1, ..., fs⟩. Then there exist h1, ..., hs ∈ k[x1, ..., xn] such that
f = h1 f1 + ··· + hs fs. Since f1, ..., fs all vanish on V(f1, ..., fs), so does f, so
f ∈ I(V(f1, ..., fs)). The demonstration that the inclusion can be strict is given by
Example 1.1.13. □

When V is not just a subset of k^n but a variety, the ideal I(V) naturally determined
by V uniquely determines V:

Proposition 1.1.12. Let V and W be affine varieties in k^n. Then


1. V ⊂ W if and only if I(W ) ⊂ I(V ).
2. V = W if and only if I(W ) = I(V ).

Proof. (1) Suppose V ⊂ W . Then any polynomial that vanishes on W also vanishes
on V , so I(W ) ⊂ I(V ). Suppose conversely that I(W ) ⊂ I(V ). Choose any collection
{h1 , . . . , hs } ⊂ k[x1 , . . . , xn ] such that W = V(h1 , . . . , hs ), which must exist, since W
is a variety. Then for 1 ≤ j ≤ s, h j ∈ I(W ) ⊂ I(V ), so that if a ∈ V , then h j (a) = 0.
That is, if a ∈ V , then a ∈ V(h1 , . . . , hs ) = W , so V ⊂ W .
Statement (2) is an immediate consequence of statement (1). □

Example 1.1.13. Let V = {(0, 0)} ⊂ R². Then I(V) is the set of all polynomials in
two variables without constant term. We will express V as V(f1, f2) in two different
ways. Choosing f1 = x and f2 = y, V = V(f1, f2) and I = ⟨x, y⟩ is the same ideal
as I(V). Choosing f1 = x² and f2 = y, V = V(f1, f2) still holds, but J = ⟨x², y⟩ is the set of
elements of R[x, y] every term of which contains x² or y; hence J ⊊ I(V). Note that
both I and J have the property that V is precisely the set of common zeros of all
their elements.
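The strict inclusion J ⊊ I(V) can also be verified by machine. The sketch below uses SymPy's Gröbner-basis routines (the machinery justifying this membership test is developed in Section 1.2; the specific calls are our assumption of convenience): x vanishes at the origin, so x ∈ I(V), yet x ∉ J.

```python
from sympy import symbols, groebner

x, y = symbols('x y')

# J = <x**2, y>: every element has each term divisible by x**2 or y.
J = groebner([x**2, y], x, y, order='lex')

# x vanishes at the origin, so x is in I(V) for V = {(0, 0)},
# but x is not in J; hence the inclusion J ⊂ I(V) is strict.
print(J.contains(x))           # False
print(J.contains(x**2 + 3*y))  # True: an actual combination of the generators
```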

Denote by V the set of all affine varieties of k^n and by I the set of all polynomial
ideals in k[x1, ..., xn]. Then Definition 1.1.10 defines a map

I : V → I. (1.5)

Because every ideal I of k[x1, ..., xn] has a finite basis (Theorem 1.1.6), so that
I = ⟨f1, ..., fs⟩, and because the variety defined using any basis of I is the same as
that defined using any other (Proposition 1.1.9), there is also a natural map from I
to V defined by

    V : I → V : ⟨f1, ..., fs⟩ ↦ V(f1, ..., fs) .   (1.6)

That is, for an ideal I in k[x1, ..., xn], V(I) = V(f1, ..., fs) for any finite collection
of polynomials satisfying I = ⟨f1, ..., fs⟩. Thus the symbol V will be doing double
duty, since we will continue to write V(f1, ..., fs) in place of the more cumbersome
V(⟨f1, ..., fs⟩). The following theorem establishes some properties of the maps I
and V. (See also Theorem 1.3.15.)

Theorem 1.1.14. For any field k, the maps I and V are inclusion-reversing. I is one-to-one
(injective) and V is onto (surjective). Furthermore, for any variety V ⊂ k^n,
V(I(V)) = V.

Proof. In Exercise 1.8 the reader is asked to show that the maps I and V are
inclusion-reversing. Now let an affine variety V = V(f1, ..., fs) of k^n be given. Since
I(V) is the collection of all polynomials that vanish on V, if a ∈ V, then every
element of I(V) vanishes at a, so a is in the set of common zeros of I(V), which is
V(I(V)). Thus, V ⊂ V(I(V)). For the reverse inclusion, by the definition of I(V),
f_j ∈ I(V) for 1 ≤ j ≤ s; hence ⟨f1, ..., fs⟩ ⊂ I(V). Since V is inclusion-reversing,
V(I(V)) ⊂ V(⟨f1, ..., fs⟩) = V(f1, ..., fs) = V.
Finally, I is one-to-one because it has a left inverse, and V is onto because it has
a right inverse. □

1.2 The Ideal Membership Problem and Gröbner Bases

One of the main problems of computational algebra is the Ideal Membership Problem,
formulated as follows.
Ideal Membership Problem. Let I ⊂ k[x1 , . . . , xn ] be an ideal and let f
be an element of k[x1 , . . . , xn ]. Determine whether or not f is an element
of I.
We first consider the polynomial ring with one variable x. One important feature
of this ring is the existence of the Division Algorithm: given two polynomials f
and g in k[x], g ≠ 0, there exist unique elements q and r of k[x], the quotient and
remainder, respectively, of f upon division by g, such that f = qg + r, and either
r = 0 or deg(r) < deg(g). To divide f by g is to express f as f = qg + r. We say
that g divides f if r = 0, and write g | f. As outlined in Exercises 1.9–1.12, the
greatest common divisor of two polynomials in k[x] is defined, is easily computed
using the Euclidean Algorithm, and can be used in conjunction with the Hilbert
Basis Theorem to show that every ideal in k[x] is generated by a single element. (An
ideal generated by a single element is called a principal ideal, and a ring in which
every ideal is principal is a principal ideal domain). The Ideal Membership Problem
is then readily solved: given an ideal I and a polynomial f , we first find a generator
g for I, then divide f by g; f ∈ I if and only if g | f .
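In one variable the whole procedure fits in a few lines. The sketch below (again SymPy, as an illustration; the polynomials are ours) finds the single generator of ⟨f1, f2⟩ as gcd(f1, f2) and settles membership by one division:

```python
from sympy import symbols, gcd, rem

x = symbols('x')

f1, f2 = x**2 - 1, x**3 - 1
g = gcd(f1, f2)   # single generator of <f1, f2> in Q[x]; here g = x - 1
print(g)          # x - 1

def in_ideal(f):
    """f is in <g> = <f1, f2> iff g divides f, i.e., rem(f, g) == 0."""
    return rem(f, g, x) == 0

print(in_ideal((x - 1)*(x**5 + 7)))  # True
print(in_ideal(x**2 + 1))            # False
```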
In polynomial rings of several variables, we want to follow an analogous procedure
for solving the Ideal Membership Problem: performing a division and examining
a remainder. Matters are more complicated, however. In particular, in general

ideals are not generated by just one polynomial, so we have to formulate a procedure
for dividing a polynomial f by a set F of polynomials, and although there is a way
to generalize the Division Algorithm to do this for elements of k[x1, ..., xn], a
complication arises in that the remainder under the division is not necessarily unique.
To describe the division algorithm in k[x1, ..., xn], we must digress for several
paragraphs to introduce the concepts of a term ordering and of reduction of a
polynomial modulo a set of polynomials, along with attendant terminology. We first of
all specify an ordering on the terms of the polynomials. In the case of one variable
there is the natural ordering according to degree. In the multivariable case
there are different orders that can be used. We will define the general concept of
a term order and a few of the most frequently used term orders. Observe that because
of the one-to-one correspondence between monomials x^α = x1^{α1} x2^{α2} ··· xn^{αn} and
n-tuples α = (α1, ..., αn) ∈ N_0^n, it is sufficient to order elements of N_0^n (for, as in
the one-variable case, the actual coefficients of the terms play no role in the
ordering). Underlying this correspondence, of course, is the assumption of the ordering
x1 > x2 > ··· > xn of the variables themselves.
Recall that a partial order ≻ on a set S is a binary relation that is reflexive (a ≻ a
for all a ∈ S), antisymmetric (a ≻ b and b ≻ a only if a = b), and transitive (a ≻ b
and b ≻ c implies a ≻ c). A total order > on S is a partial order under which any
two elements can be compared: for all a and b in S, either a = b, a > b, or b > a.

Definition 1.2.1. A term order on k[x1, ..., xn] is a total order > on N_0^n having the
following two properties:
(a) for all α, β, and γ in N_0^n, if α > β, then α + γ > β + γ; and
(b) N_0^n is well-ordered by >: if S is any nonempty subset of N_0^n, then there exists a
smallest element µ of S (for all α ∈ S \ {µ}, α > µ).

The monomials {x^α : α ∈ N_0^n} are then ordered by the ordering of their exponents,
so that x^α > x^β if and only if α > β. Note that while we speak of the
term order > as being on k[x1, ..., xn], we are not actually ordering all elements of
k[x1, ..., xn], but only the monomials, hence the individual terms of the polynomials
that comprise k[x1, ..., xn]; this explains the terminology term order. The terminology
monomial order is also widely used.
A sequence (α_j) in N_0^n is strictly descending if α_j > α_{j+1} and α_j ≠ α_{j+1} for all j.
Such a sequence terminates if it is finite.

Proposition 1.2.2. A total order > on N_0^n well-orders N_0^n if and only if each strictly
descending sequence of elements of N_0^n terminates.

Proof. If there exists a strictly descending sequence α1 > α2 > α3 > ··· that does
not terminate, then {α1, α2, ...} is a nonempty subset of N_0^n with no minimal
element, and > does not well-order N_0^n.
Conversely, if > does not well-order N_0^n, then there exists a nonempty subset A
of N_0^n that has no minimal element. Let α1 be an arbitrary element of A. It is not
minimal; hence there exists α2 ∈ A, α2 ≠ α1, such that α1 > α2. Continuing the
process, we get a strictly descending sequence that does not terminate. □

We now define the three most commonly used term orders; in Exercise 1.16 we
ask the reader to verify that they indeed meet the conditions in Definition 1.2.1.
Addition and rescaling in Zn are performed componentwise: for α , β ∈ Zn and
p ∈ Z, the jth entry of α + pβ is the jth entry of α plus p times the jth entry of β .
The word “graded” is sometimes used where we use the word “degree.”

Definition 1.2.3. Let α = (α1, ..., αn) and β = (β1, ..., βn) be elements of N_0^n.

(a) Lexicographic Order. Define α >lex β if and only if, reading left to right, the
first nonzero entry in the n-tuple α − β ∈ Z^n is positive.
(b) Degree Lexicographic Order. Define α >deglex β if and only if

    |α| = α1 + ··· + αn > |β| = β1 + ··· + βn,  or  |α| = |β| and α >lex β.

(c) Degree Reverse Lexicographic Order. Define α >degrev β if and only if either
|α| > |β| or |α| = |β| and, reading right to left, the first nonzero entry in the
n-tuple α − β ∈ Z^n is negative.

For example, if α = (1, 4, 4, 2) and β = (1, 2, 6, 2), then α is greater than β with
respect to all three orders. Note in particular that this example shows that degrev is
not simply the reverse of deglex.
When a term order > on k[x1, ..., xn] is given, we write a_α x^α > a_β x^β if and
only if α > β. We reiterate that the definitions above are based on the presumed
ordering x1 > ··· > xn of the variables. This ordering must be explicitly identified
when non-subscripted variables are in use. For instance, if in k[x, y] we choose y > x,
then y⁵ >lex x⁹ (since (5, 0) >lex (0, 9)) and xy⁴ >deglex x²y³ (since 4 + 1 = 3 + 2 and
(4, 1) >lex (3, 2)), and we will typically write these latter two terms as y⁴x and y³x²
to reflect the underlying ordering of the variables themselves.
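These orders are implemented in most computer algebra systems. In the sketch below we compare the tuples α = (1, 4, 4, 2) and β = (1, 2, 6, 2) from the example above using SymPy's monomial-order objects; the names grlex and grevlex (for deglex and degrev) and the fact that each object maps an exponent tuple to a comparable sort key are details of that library, assumed here:

```python
# In SymPy, deglex is called 'grlex' and degrev is called 'grevlex'.
from sympy.polys.orderings import lex, grlex, grevlex

alpha = (1, 4, 4, 2)
beta = (1, 2, 6, 2)

# alpha exceeds beta under all three orders, as in the text.
for name, order in [('lex', lex), ('deglex', grlex), ('degrevlex', grevlex)]:
    print(name, order(alpha) > order(beta))  # True for each
```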
Fixing a term order > on k[x1, ..., xn], any nonzero f ∈ k[x1, ..., xn] may be written
in the standard form, with respect to >,

    f = a1 x^{α1} + a2 x^{α2} + ··· + as x^{αs} ,   (1.7)

where a_j ≠ 0 for j = 1, ..., s, α_i ≠ α_j for i ≠ j, 1 ≤ i, j ≤ s, and where, with
respect to the specified term order, α1 > α2 > ··· > αs.

Definition 1.2.4. Let a term order on k[x1 , . . . , xn ] be specified and let f be a nonzero
element of k[x1 , . . . , xn ], written in the standard form (1.7).
(a) The leading term LT(f) of f is the term LT(f) = a1 x^{α1}.
(b) The leading monomial LM(f) of f is the monomial LM(f) = x^{α1}.
(c) The leading coefficient LC(f) of f is the coefficient LC(f) = a1.
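SymPy exposes these three operations directly (with its default lex order on the listed generators; the sample polynomial is ours):

```python
from sympy import symbols, LT, LM, LC

x, y = symbols('x y')
f = 3*x**2*y + 2*x*y**2 - 7*y   # under lex with x > y: (2,1) > (1,2) > (0,1)

print(LT(f, x, y))   # 3*x**2*y   (leading term)
print(LM(f, x, y))   # x**2*y     (leading monomial)
print(LC(f, x, y))   # 3          (leading coefficient)
```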

The concept of division of single-variable polynomials has an obvious generalization
to the case of division of one monomial by another: we say that a monomial
x^α = x1^{α1} ··· xn^{αn} divides a monomial x^β = x1^{β1} ··· xn^{βn}, written x^α | x^β, if and only if
β_j ≥ α_j for all j, 1 ≤ j ≤ n. In such a case the notation x^β / x^α denotes the monomial
x1^{β1−α1} ··· xn^{βn−αn}. In k[x1, ..., xn], to divide a polynomial f by nonzero polynomials
{f1, ..., fs} means to represent f in the form

    f = u1 f1 + ··· + us fs + r ,

where u1, ..., us, r ∈ k[x1, ..., xn], and either r = 0 or deg(r) ≤ deg(f) (the inequality


is not strict). The most important part of this expression is the remainder r, not
the weights u_j, for the context in which we intend to apply the division concept is
that the f_j are generators of an ideal I, and we want the division to produce a zero
remainder r if and only if f is in I.
We must first specify a term order on k[x1 , . . . , xn ]. The main idea then of the
algorithm for the division is the same as in the one-variable case: we reduce the
leading term of f (as determined by the specified term order) by multiplying some f j
by an appropriate term and subtracting. We will describe the procedure in detail, but
to understand the motivation for the following definition, recall that in the familiar
one-variable case of polynomial long division, in the first pass through the algorithm
dividing, for example, f = 6x³ + ··· (lower-order terms omitted) by g = 7x² + ···,
we compare the leading terms, multiply g by (6/7)x = 6x³/(7x²) = LT(f)/LT(g), and then subtract the
product from f to obtain a polynomial h = f − (6/7)x g that satisfies deg(h) < deg(f).
Definition 1.2.5. (a) For f, g, h ∈ k[x1, ..., xn] with g ≠ 0, we say that f reduces to
h modulo g in one step, written as

    f →_g h ,

if and only if LM(g) divides a nonzero term X that appears in f and

    h = f − (X / LT(g)) g .   (1.8)

(b) For f, f1, ..., fs, h ∈ k[x1, ..., xn] with f_j ≠ 0, 1 ≤ j ≤ s, letting F = {f1, ..., fs},
we say that f reduces to h modulo F, written as

    f →_F h ,

if and only if there exist a sequence of indices j1, j2, ..., jm ∈ {1, ..., s} and a
sequence of polynomials h1, ..., h_{m−1} ∈ k[x1, ..., xn] such that

    f →_{f_{j1}} h1 →_{f_{j2}} h2 →_{f_{j3}} ··· →_{f_{j_{m−1}}} h_{m−1} →_{f_{jm}} h .
Remark 1.2.6. Applying part (a) of the definition repeatedly shows that if f →_F h,
then there exist u_j ∈ k[x1, ..., xn] such that f = u1 f1 + ··· + us fs + h. Hence, by
Definition 1.1.8, f reduces to h modulo F = {f1, ..., fs} only if f ≡ h mod ⟨f1, ..., fs⟩.
The converse is false, as shown by Example 1.2.12.
Example 1.2.7. We illustrate each part of Definition 1.2.5.
(a) In Q[x, y] with x > y and the term order deglex, let f = x²y + 2xy − 3x + 5 and
g = xy + 6y² − 4x. If the role of X is played by the leading term x²y in f, then

    h = f − (x²y / xy)(xy + 6y² − 4x) = −6xy² + 4x² + 2xy − 3x + 5,

so f →_g h and LM(h) < LM(f). If the role of X is played by the term 2xy in f, then

    h̃ = f − (2xy / xy)(xy + 6y² − 4x) = x²y − 12y² + 5x + 5,

so f →_g h̃ and LT(h̃) = LT(f). In either case we remove the term X from f and
replace it with terms that are smaller with respect to deglex.
(b) In Q[x, y] with y > x and the term order deglex, let f = y²x + y² + 3y, f1 = yx + 2,
and f2 = y + x. Then

    y²x + y² + 3y →_{f1} y² + y →_{f2} −yx + y →_{f2} x² + y,

so f →_{ {f1, f2} } x² + y.
Definition 1.2.8. Suppose f, f1, ..., fs ∈ k[x1, ..., xn], f_j ≠ 0 for 1 ≤ j ≤ s, and let
F = {f1, ..., fs}.
(a) A polynomial r ∈ k[x1, ..., xn] is reduced with respect to F if either r = 0 or no
monomial that appears in the polynomial r is divisible by any element of the set
{LM(f1), ..., LM(fs)}.
(b) A polynomial r ∈ k[x1, ..., xn] is a remainder for f with respect to F if f →_F r
and r is reduced with respect to F.
The Multivariable Division Algorithm is the direct analogue of the procedure
used to divide one single-variable polynomial by another. To divide f by the ordered
set F = {f1, ..., fs}, we proceed iteratively, at each step performing a familiar
polynomial long division using one element of F. Typically, the set F of divisors is
presented to us in no particular order, so as a preliminary we must order its elements
in some fashion; the order selected can affect the final result. At the first step in the
actual division process, the “active divisor” is the first element of F, call it f_j, whose
leading term divides the leading term of f; at this step we replace f by the polynomial
h of (1.8) with X = LT(f) and g = f_j, thereby reducing f somewhat using f_j.
At each succeeding step, the active divisor is the first element of F whose leading
term divides the leading term of the current polynomial h; at this step we similarly
reduce h somewhat using the active divisor. If at any stage no division is possible,
then the leading term of h is added to the remainder, and we try the same process
again, continuing until no division is possible at all. By Exercise 1.17, building up
the remainder successively is permissible. An explicit description of the procedure
is given in Table 1.1 on page 12. In the next theorem we will prove that the algorithm
works correctly to perform the reduction f →_F r and generate the components
of the expression f = u1 f1 + ··· + us fs + r, where r is a remainder for f with respect
to F (thus showing that a remainder always exists), but first we present an example.

Multivariable Division Algorithm

Input:
  f ∈ k[x1, ..., xn]
  ordered set F = {f1, ..., fs} ⊂ k[x1, ..., xn] \ {0}

Output:
  u1, ..., us, r ∈ k[x1, ..., xn] such that
  1. f = u1 f1 + ··· + us fs + r,
  2. r is reduced with respect to {f1, ..., fs}, and
  3. max(LM(u1)LM(f1), ..., LM(us)LM(fs), LM(r)) = LM(f)

Procedure:
  u1 := 0; ...; us := 0; r := 0; h := f
  WHILE h ≠ 0 DO
    IF there exists j such that LM(f_j) divides LM(h) THEN
      for the least such j:
        u_j := u_j + LT(h)/LT(f_j)
        h := h − (LT(h)/LT(f_j)) f_j
    ELSE
      r := r + LT(h)
      h := h − LT(h)

Table 1.1 The Multivariable Division Algorithm
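Table 1.1 translates almost line for line into executable code. The sketch below is our own rendering with SymPy supplying the leading-term computations; it fixes the default lex order and tests divisibility of leading monomials by checking that the term ratio is polynomial:

```python
from sympy import symbols, expand, LT

def mv_divide(f, F, gens):
    """Multivariable Division Algorithm of Table 1.1 (default lex order on gens)."""
    u = [0] * len(F)
    r = 0
    h = expand(f)
    while h != 0:
        lt_h = LT(h, *gens)
        for j, fj in enumerate(F):        # least j with LM(f_j) | LM(h)
            q = lt_h / LT(fj, *gens)
            if q.is_polynomial(*gens):    # exact division: LM(f_j) divides LM(h)
                u[j] += q
                h = expand(h - q * fj)
                break
        else:                             # no divisor applies: move LT(h) to r
            r += lt_h
            h = expand(h - lt_h)
    return u, r

x, y = symbols('x y')
u, r = mv_divide(x**2*y + x*y**3 + x*y**2, [x*y + 1, y**2 + 1], (x, y))
print(u, r)   # [x + y**2 + y, -1] -x - y + 1
```

The `for`/`else` idiom mirrors the IF/ELSE of the table: the `else` branch runs exactly when no divisor's leading monomial divides LM(h).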

Example 1.2.9. In Q[x, y] with x > y and the term order lex, we apply the algorithm
to divide f = x²y + xy³ + xy² by the polynomials f1 = xy + 1 and f2 = y² + 1, ordered
f1 then f2.
The two panels in Table 1.2 on page 14 show the computation in tabular form
and underscore the analogy with the one-variable case. The top panel shows three
divisions by f1, at which point no division (by either divisor) is possible. The leading
term −x is sent to the remainder, and the process is restarted. Division by f1 is
impossible, but the bottom panel shows one further division by f2, and then all
remaining terms are sent to the remainder. Therefore,

f = u1 f1 + u2 f2 + r = (x + y2 + y) f1 + (−1) f2 + (−x − y + 1) .

That is, the quotient is {u1 , u2 } = {x + y2 + y, −1} and the remainder is −x − y + 1.


(In general, the role of the divisor can alternate among the f_j from step to step, so that
in a hand computation the full table will contain more than s panels, and the partial
quotients produced while f_j is the active divisor must be added to obtain u_j.)

Now let us go through exactly the same computation by means of an explicit


application of the Multivariable Division Algorithm. That is, we will follow the
instructions presented in Table 1.1 in a step-by-step fashion.
First pass:
  LM(f1) | LM(h) but LM(f2) ∤ LM(h); f1 is least
  u1 = 0 + x²y/xy = x
  h = (x²y + xy³ + xy²) − x(xy + 1) = xy³ + xy² − x
Second pass:
  LM(f1) | LM(h) and LM(f2) | LM(h); f1 is least
  u1 = x + xy³/xy = x + y²
  h = (xy³ + xy² − x) − y²(xy + 1) = xy² − x − y²
Third pass:
  LM(f1) | LM(h) and LM(f2) | LM(h); f1 is least
  u1 = x + y² + xy²/xy = x + y² + y
  h = (xy² − x − y²) − y(xy + 1) = −x − y² − y
Fourth pass:
  LM(f1) ∤ LM(h) and LM(f2) ∤ LM(h)
  r = 0 + (−x) = −x
  h = (−x − y² − y) − (−x) = −y² − y
Fifth pass:
  LM(f1) ∤ LM(h) but LM(f2) | LM(h); f2 is least
  u2 = 0 + (−y²)/y² = −1
  h = (−y² − y) − (−1)(y² + 1) = −y + 1
Sixth pass:
  LM(f1) ∤ LM(h) and LM(f2) ∤ LM(h)
  r = −x + (−y) = −x − y
  h = (−y + 1) − (−y) = 1
Seventh pass:
  LM(f1) ∤ LM(h) and LM(f2) ∤ LM(h)
  r = −x − y + 1
  h = 1 − 1 = 0
A summary statement in the language of Definition 1.2.5 for these computations
is the string of reductions and equalities

    f →_{f1} h1 →_{f1} h2 →_{f1} h3 = h4 + (−x) →_{f2} h5 + (−x)
      = h6 + (−x − y) = h7 + (−x − y + 1) = −x − y + 1

or, more succinctly, f →_{ {f1, f2} } −x − y + 1 .

Table 1.2 The computations of Example 1.2.9, shown here in linear form. Top panel (three
divisions by f1 = xy + 1, quotient accumulating to u1 = x + y² + y):

    f  = x²y + xy³ + xy²
    h1 = xy³ + xy² − x
    h2 = xy² − x − y²
    h3 = −x − y² − y    (leading term −x sent to the remainder)

Bottom panel (one division by f2 = y² + 1, quotient u2 = −1, then remaining terms sent to the
remainder):

    h4 = −y² − y
    h5 = −y + 1    (−y sent to the remainder)
    h6 = 1         (1 sent to the remainder)
    h7 = 0,   accumulated remainder r = −x − y + 1
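The computation of Example 1.2.9 can be reproduced with SymPy's built-in `reduced`, which implements division by an ordered list of divisors and returns the quotients together with the remainder (that its divisor-selection rule matches the least-index rule of Table 1.1 is an assumption we rely on here):

```python
from sympy import symbols, reduced, expand

x, y = symbols('x y')
f  = x**2*y + x*y**3 + x*y**2
f1 = x*y + 1
f2 = y**2 + 1

Q, r = reduced(f, [f1, f2], x, y, order='lex')
print(Q)   # [x + y**2 + y, -1]
print(r)   # -x - y + 1

# The defining identity f = u1*f1 + u2*f2 + r holds:
print(expand(Q[0]*f1 + Q[1]*f2 + r - f))  # 0
```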

Theorem 1.2.10. Let an ordered set F = { f1 , . . . , fs } ⊂ k[x1 , . . . , xn ]\ {0} of nonzero


polynomials and a polynomial f ∈ k[x1 , . . . , xn ] be given. The Multivariable Division
Algorithm produces polynomials u1 , . . . , us , r ∈ k[x1 , . . . , xn ] such that

f = u1 f1 + · · · + us fs + r, (1.9)

where r is a remainder for f with respect to F and

LM( f ) = max(LM(u1 )LM( f1 ), . . . , LM(us )LM( fs ), LM(r)), (1.10)

where LM(u j )LM( f j ) is not present in (1.10) if u j = 0.

Proof. The algorithm certainly produces the correct result u j = 0, 1 ≤ j ≤ s, and


r = f in the special cases that f = 0 or that no leading term in any of the divisors
divides any term of f . Otherwise, after the first pass through the WHILE loop for
which the IF statement is true, exactly one polynomial u j is nonzero, and we have

    max[LM(u1)LM(f1), ..., LM(us)LM(fs), LM(h)]
      = max[LM(u1)LM(f1), ..., LM(us)LM(fs)] ,   (1.11)

which clearly holds at every succeeding stage of the algorithm. Consequently, (1.10)
holds at that and every succeeding stage of the algorithm, hence holds when the
algorithm terminates.
At every stage of the algorithm, f = u1 f1 + · · · + us fs + r + h holds. Because the
algorithm halts precisely when h = 0, this implies that on the last step (1.9) holds.
Moreover, since at each stage we add to r only terms that are not divisible by LT( f j )
for any j, 1 ≤ j ≤ s, r is reduced with respect to F, and thus is a remainder of f with
respect to F.
To show that the algorithm must terminate, let h1 , h2 , . . . be the sequence of poly-
nomials produced by the successive values of h upon successive passes through
the WHILE loop. The algorithm fails to terminate only if for every j ∈ N there
is a jth pass through the loop, hence an h j 6= 0. Then LM(h j ) exists for each
j ∈ N, and the sequence LM(h1 ), LM(h2 ), . . . satisfies LM(h j+1 ) < LM(h j ) and
LM(h j+1 ) 6= LM(h j ), which contradicts Proposition 1.2.2. 

If we change the order of the polynomials in Example 1.2.9, dividing first by


f2 and then by f1 , then the quotient and remainder change to {xy + x, x − 1} and
−2x + 1, respectively (Exercise 1.21). Thus we see that, unlike the situation in the
one-variable case, the quotient and remainder are not unique. They depend on the
ordering of the polynomials in the set of divisors as well as on the term order chosen
for the polynomial ring (see Exercise 1.22). But what is even worse from the point
of view of solving the Ideal Membership Problem is that, as the following examples
show, it is even possible that, keeping the term order fixed, there exists an element of
the ideal generated by the divisors whose remainder can be zero under one ordering
of the divisors and different from zero under another, or even different from zero no
matter how the divisors are ordered.

Example 1.2.11. In the ring R[x, y] fix the lexicographic term order with x > y
and consider the polynomial f = x²y + xy + 2x + 2. When we use the Multivariable
Division Algorithm to reduce f modulo the ordered set
{f1 = x² − 1, f2 = xy + 2}, we obtain

    f = y f1 + f2 + (2x + y) .

Since the corresponding remainder, 2x + y, is different from zero, we might conclude
that f is not in the ideal ⟨f1, f2⟩. If, however, we change the order of the divisors so
that f2 is first, we obtain

    f = 0 · f1 + (x + 1) f2 + 0 = (x + 1) f2 ,   (1.12)

so that f ∈ ⟨f1, f2⟩ after all.
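The order dependence is easy to reproduce; with SymPy's `reduced` (our illustrative tool), the two orderings of the same divisors return the two different remainders found above:

```python
from sympy import symbols, reduced

x, y = symbols('x y')
f  = x**2*y + x*y + 2*x + 2
f1 = x**2 - 1
f2 = x*y + 2

_, r12 = reduced(f, [f1, f2], x, y, order='lex')
_, r21 = reduced(f, [f2, f1], x, y, order='lex')
print(r12)  # 2*x + y  -- suggests f is not in <f1, f2> ...
print(r21)  # 0        -- ... but it is: f = (x + 1)*f2
```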

Example 1.2.12. In the ring R[x, y] fix the lexicographic term order with x > y. Then
2y = 1 · (x + y) + (−1) · (x − y) ∈ ⟨x + y, x − y⟩, but because LT(x + y) = LT(x − y) = x
does not divide 2y, the remainder of 2y with respect to {x + y, x − y} is unique and is
2y. Thus, for either ordering of the divisors, the Multivariable Division Algorithm
produces this nonzero remainder.

We see then that we have lost the tool that we had in polynomial rings of one
variable for resolving the Ideal Membership Problem. Fortunately, not all is lost.
While the Multivariable Division Algorithm cannot be improved in general, it has
been discovered that if we use a certain special generating set for our ideals, then
it is still true that f ∈ ⟨f1, ..., fs⟩ if and only if the remainder in the Division
Algorithm is equal to zero, and we are able to decide the Ideal Membership Problem.
Such a special generating set for an ideal is called a Gröbner basis or a standard
basis. It is one of the primary tools of computational algebra and underlies
numerous algorithms of computational algebra and algebraic geometry. To motivate
the definition of a Gröbner basis we discuss Example 1.2.11 again. It showed that
in the ring k[x, y], under lex with x > y, for f = x²y + xy + 2x + 2, f1 = x² − 1, and
f2 = xy + 2,

    f →_{ {f1, f2} } 2x + y .

But by (1.12), f ∈ ⟨f1, f2⟩, so the remainder r = 2x + y must also be in ⟨f1, f2⟩. The
trouble is that the leading term of r is not divisible by either LM(f1) or LM(f2),
and this is what halts the division process in the Multivariable Division Algorithm.
So the problem is that the ideal ⟨f1, f2⟩ contains elements whose leading terms are not
divisible by a leading term of either element of the particular basis {f1, f2} of the ideal.
If, for any ideal I, we had a basis B with the special property that the leading
term of every polynomial in I was divisible by the leading term of some element of
B, then the Multivariable Division Algorithm would provide an answer to the Ideal
Membership Problem: a polynomial f is in the ideal I if and only if the remainder
of f upon division by the elements of B, in any order, is zero. This is the idea behind
the concept of a Gröbner basis of an ideal, and we use this special property as the
defining characteristic of a Gröbner basis.

Definition 1.2.13. A Gröbner basis (also called a standard basis) of an ideal I in
k[x1, ..., xn] is a finite nonempty subset G = {g1, ..., gm} of I \ {0} with the following
property: for every nonzero f ∈ I, there exists g_j ∈ G such that LT(g_j) | LT(f).

It is implicit in the definition that we do not consider the concept of a Gröbner


basis G for the zero ideal, nor will we need it. See Section 5.2 of [18] for this more
general situation, in which G must be allowed to be empty. Note that the requirement
that the set G actually be a basis of the ideal I does not appear in the definition of a
Gröbner basis but is a consequence of it (Theorem 1.2.16). Note also that whether
or not a set G forms a Gröbner basis of an ideal I depends not only on the term order
in use, but also on the underlying ordering of the variables. See Exercise 1.23.
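For the two troublesome examples above, a Gröbner basis restores the one-variable picture. In the sketch below (SymPy again; `groebner` computes a reduced Gröbner basis for the stated order) the basis of ⟨x + y, x − y⟩ under lex with x > y is {x, y}, whose leading terms do divide LT(2y), and membership is decided by reduction:

```python
from sympy import symbols, groebner

x, y = symbols('x y')

# Example 1.2.12: 2y is in <x + y, x - y>, and a Groebner basis detects it.
G1 = groebner([x + y, x - y], x, y, order='lex')
print(list(G1.exprs))      # the computed basis is {x, y}
print(G1.contains(2*y))    # True

# Example 1.2.11: f = (x + 1)*f2 is in <x**2 - 1, x*y + 2>.
f = x**2*y + x*y + 2*x + 2
G2 = groebner([x**2 - 1, x*y + 2], x, y, order='lex')
print(G2.contains(f))      # True
print(G2.contains(x + y))  # False: x + y is not in the ideal
```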
With a Gröbner basis we again have the important property of uniqueness of the
remainder, which we had in k[x] and which we lost in the multivariable case for
division by an arbitrary set of polynomials:
1.2 The Ideal Membership Problem and Gröbner Bases 17

Proposition 1.2.14. Let G be a Gröbner basis for a nonzero ideal I in k[x1 , . . . , xn ]


and f ∈ k[x1 , . . . , xn ]. Then the remainder of f with respect to G is unique.
G G
Proof. Suppose f → r1 and f → r2 and both r1 and r2 are reduced with respect to
G. Since f − r1 and f − r2 are both in I, r1 − r2 ∈ I. By Definition 1.2.8, certainly
r1 − r2 is reduced with respect to G. But then by Definition 1.2.13 it is immediate
that r1 − r2 = 0, since it is in I. 
Definition 1.2.15. Let I be an ideal and f a polynomial in k[x1 , . . . , xn ]. To reduce f
modulo the ideal I means to find the unique remainder of f upon division by some
Gröbner basis G of I. Given a nonzero polynomial g, to reduce f modulo g means
to reduce f modulo the ideal hgi.
Proposition 1.2.14 ensures that once a Gröbner basis is selected, the process is
well-defined, although the remainder obtained depends on the Gröbner basis speci-
fied. We will see when this concept is applied in Section 3.7 that this ambiguity is
not important in practice.
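A small experiment illustrates the proposition (again a sketch assuming SymPy; the divisor lists are passed to `reduced` in the order shown). With the non-Gröbner set { f1 , f2 } of Example 1.2.11 the remainder depends on the order of the divisors, while a Gröbner basis gives the same remainder in any order.

```python
from sympy import symbols, reduced, groebner

x, y = symbols('x y')
f  = x**2*y + x*y + 2*x + 2
f1, f2 = x**2 - 1, x*y + 2

# {f1, f2} is not a Groebner basis: the remainder depends on the divisor order
_, r12 = reduced(f, [f1, f2], x, y, order='lex')   # divide by f1 first
_, r21 = reduced(f, [f2, f1], x, y, order='lex')   # divide by f2 first
print(r12, r21)   # 2*x + y 0

# with a Groebner basis G of <f1, f2> the remainder is the same in any order
G = list(groebner([f1, f2], x, y, order='lex').exprs)
_, s1 = reduced(f, G, x, y, order='lex')
_, s2 = reduced(f, list(reversed(G)), x, y, order='lex')
print(s1, s2)     # 0 0
```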
Let S be a subset of k[x1 , . . . , xn ] (possibly an ideal). We denote by LT(S) the
set of leading terms of the polynomials that comprise S and by hLT(S)i the ideal
generated by LT(S) (the set of all finite linear combinations of elements of LT(S)
with coefficients in k[x1 , . . . , xn ]). The following theorem gives the main properties
F
of Gröbner bases. We remind the reader that the expression f → h means that there
is some sequence of reductions using the unordered set F of divisors that leads
from f to h, which is not necessarily a remainder of f with respect to F. This is in
contrast to the Multivariable Division Algorithm, in which F must be ordered, and
the particular order selected determines a unique sequence of reductions from f to
a remainder r.
Theorem 1.2.16. Let I ⊂ k[x1 , . . . , xn ] be a nonzero ideal, let G = {g1 , . . . , gs } be a
finite set of nonzero elements of I, and let f be an arbitrary element of k[x1 , . . . , xn ].
Then the following statements are equivalent:
(i) G is a Gröbner basis for I;
G
(ii) f ∈ I ⇔ f → 0;
(iii) f ∈ I ⇔ f = ∑sj=1 u j g j and LM( f ) = max1≤ j≤s (LM(u j )LM(g j ));
(iv) hLT(G)i = hLT(I)i.
Proof. (i) ⇒ (ii). Let any f ∈ k[x1 , . . . , xn ] be given. By Theorem 1.2.10 there exists
G
r ∈ k[x1 , . . . , xn ] such that f → r and r is reduced with respect to G. If f ∈ I, then
r ∈ I; hence, by the definition of Gröbner basis and the fact that r is reduced with
G
respect to G, r = 0. Conversely, if f → 0, then obviously f ∈ I.
(ii) ⇒ (iii). Suppose f ∈ I. Then by (ii) there is a sequence of reductions
g j1 g j2 g j3 g jm−1 g jm
f −−→ h1 −−→ h2 −−→ · · · −−−→ hm−1 −−→ 0

which yields f = u1 g1 + · · · + us gs for some u j ∈ k[x1 , . . . , xn ]. Exactly as described


in the first paragraph of the proof of Theorem 1.2.10, an equality analogous to (1.11)
18 1 Polynomial Ideals and Their Varieties

holds at each step of the reduction and thus the equality in (iii) holds. The reverse
implication in (iii) is immediate.
(iii) ⇒ (iv). G ⊂ I, hence hLT(G)i ⊂ hLT(I)i is always true, so we must ver-
ify that hLT(I)i ⊂ hLT(G)i when (iii) holds. The inclusion hLT(I)i ⊂ hLT(G)i
is implied by the implication: if f ∈ I, then LT( f ) ∈ hLT(G)i. Hence suppose
f ∈ I. Then it follows immediately from the condition on LM( f ) in (iii) that
LT( f ) = ∑ j LT(u j )LT(g j ), where the summation is over all indices j such that
LM( f ) = LM(u j )LM(g j ). Therefore hLT(I)i ⊂ hLT(G)i holds.
(iv) ⇒ (i). For f ∈ I, LT( f ) ∈ LT(I) ⊂ hLT(I)i = hLT(G)i, so there exist
h1 , . . . , hm ∈ k[x1 , . . . , xn ] such that
LT( f ) = ∑mj=1 h j LT(g j ). (1.13)

Write LT( f ) = c x1^β1 · · · xn^βn and LT(g1 ) = a1 x1^α1 · · · xn^αn . If there exists an index u such
that αu > βu , then every term in h1 LT(g1 ) has exponent on xu exceeding βu , hence
must cancel out in the sum (1.13). Similarly for g2 through gm , implying that there
must exist an index j so that LT(g j ) = a j x1^γ1 · · · xn^γn has γu ≤ βu for 1 ≤ u ≤ n, which
is precisely the statement that LT(g j ) | LT( f ), so G is a Gröbner basis for I. 

The selection of a Gröbner basis of an ideal I provides a solution to the Ideal


Membership Problem:
Ideal Membership Problem: Solution. If G is a Gröbner basis for an
G
ideal I, then f ∈ I if and only if f → 0.
However, it is not at all clear from Definition 1.2.13 that every ideal must actually
possess a Gröbner basis. We will now demonstrate the existence of a Gröbner basis
for a general ideal by giving a constructive algorithm for producing one. The algo-
rithm is due to Buchberger [27] and either it or some modification of it is a primary
component of a majority of the algorithms of computational algebra and algebraic
geometry. A key ingredient in the algorithm is the so-called S-polynomial of two
polynomials. To get some feel for why it should be relevant, consider the ideal I
generated by the set F = { f1 , f2 } ⊂ R[x, y], f1 = x2 y + y and f2 = xy2 + x. Then
f = y f1 − x f2 = y2 − x2 ∈ I, but under lex or deglex with x > y, neither LT( f1 ) nor
LT( f2 ) divides LT( f ), showing that F is not a Gröbner basis of I. The polynomial
f is the S-polynomial of f1 and f2 and clearly was constructed using their leading
terms in precisely a fashion that would produce a cancellation that led to the failure
of their leading terms to divide its leading term.

Definition 1.2.17. Let f and g be nonzero elements of k[x1 , . . . , xn ], LM( f ) = xα


and LM(g) = xβ . The least common multiple of xα and xβ , denoted LCM(xα , xβ ),
is the monomial xγ = x1^γ1 · · · xn^γn such that γ j = max(α j , β j ), 1 ≤ j ≤ n, and (with the
same notation) the S-polynomial of f and g is the polynomial

S( f , g) = (xγ /LT( f )) f − (xγ /LT(g)) g .
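The definition transcribes directly into code (a sketch assuming SymPy, whose `LT` returns the leading term and `lcm` the least common multiple; when the leading coefficients are 1, as in the example below, the lcm of the leading terms is exactly the monomial xγ).

```python
from sympy import symbols, LT, lcm, expand

def s_poly(f, g, gens, order='lex'):
    """S(f, g) = (x^gamma / LT(f)) f - (x^gamma / LT(g)) g."""
    ltf = LT(f, *gens, order=order)
    ltg = LT(g, *gens, order=order)
    # for monic leading terms this is the monomial x^gamma of the definition
    xgamma = lcm(ltf, ltg)
    return expand(xgamma / ltf * f - xgamma / ltg * g)

x, y = symbols('x y')
f1, f2 = x**2*y + y, x*y**2 + x   # the motivating pair considered above
s = s_poly(f1, f2, [x, y])
print(s)   # -x**2 + y**2, i.e. y f1 - x f2; neither LT(f1) = x**2*y
           # nor LT(f2) = x*y**2 divides its leading term
```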

The following lemma, whose proof is left as Exercise 1.24 (or see [1, §1.7]), de-
scribes how any polynomial f created as a sum of polynomials whose leading terms
all cancel in the formation of f must be a combination of S-polynomials computed
pairwise from the summands in f .

Lemma 1.2.18. Fix a term order < on k[x1 , . . . , xn ]. Suppose M is a monomial and
f1 , . . . , fs are elements of k[x1 , . . . , xn ] such that LM( f j ) = M for all j, 1 ≤ j ≤ s. Set
f = ∑sj=1 c j f j , where c j ∈ k. If LM( f ) < M, then there exist d1 , . . . , ds−1 ∈ k such
that
f = d1 S( f1 , f2 ) + d2 S( f2 , f3 ) + · · · + ds−1S( fs−1 , fs ). (1.14)

In writing (1.14) we have used the fact that S( f , g) = −S(g, f ) to streamline the
expression. We also use this fact in the following theorem, expressing the condition
on S(gi , g j ) only where we know it to be at issue: for i ≠ j. The theorem is of fun-
damental importance because it provides a computational method for determining
whether or not a given set of polynomials is a Gröbner basis for the ideal that they
generate.

Theorem 1.2.19 (Buchberger’s Criterion). Let I be a nonzero ideal in k[x1 , . . . , xn ]


and let < be a fixed term order on k[x1 , . . . , xn ]. A generating set G = {g1 , . . . , gs } is
G
a Gröbner basis for I with respect to < if and only if S(gi , g j ) → 0 for all i ≠ j.
G
Proof. If G is a Gröbner basis, then by (ii) in Theorem 1.2.16, f → 0 for all f ∈ I,
including S(gi , g j ). Conversely, suppose that all S-polynomials of the g j reduce to 0
modulo G and let
f = ∑sj=1 h j g j (1.15)

be an arbitrary element of I. We need to show that there exists an index j0 for which
LT(g j0 ) divides LT( f ). The choice of the set of polynomials h j in the representation
(1.15) of f is not unique, and to each such set there corresponds the monomial

M = max1≤ j≤s (LM(h j )LM(g j )) ≥ LM( f ) . (1.16)

The set M of all such monomials is nonempty, hence it has a least element M0 (since
the term order is a well-order). We will show that for any M ∈ M, if M > LM( f ),
then there is an element of M that is smaller than M, from which it follows that
M0 = LM( f ), completing the proof.
Thus let M be an element of M for which M ≠ LM( f ). Let {h1 , . . . , hs } be a
collection satisfying (1.15) and giving rise to M, and let

J = { j ∈ {1, . . . , s} : LM(h j )LM(g j ) = M}.

For j ∈ J, we write h j = c j M j + lower terms. Let g = ∑ j∈J c j M j g j . Then

f = g + g̃, (1.17)

where LM(g̃) < M. Thus LM(M j g j ) = M for all j ∈ J, but LM(g) ≤ max(LM( f ), LM(g̃)) < M. By
Lemma 1.2.18, g is a combination of S-polynomials,

g = ∑i, j∈J di j S(Mi gi , M j g j ), (1.18)

with di j ∈ k.
We now compute these S-polynomials. Since the least common multiple of all
pairs of leading terms of the Mi gi is M, using the definition of S-polynomials we
obtain
S(Mi gi , M j g j ) = (M/Mi j ) S(gi , g j ),
where Mi j = LCM(LM(gi ), LM(g j )). Here Mi j divides M because LM(gi ) and LM(g j )
each divide M = LM(Mi gi ) = LM(M j g j ). By hypothesis,
G G
S(gi , g j ) → 0 whenever i ≠ j, hence S(Mi gi , M j g j ) → 0 (Exercise 1.18), so by the
definition of reduction modulo G,

S(Mi gi , M j g j ) = ∑t hi jt gt , (1.19)

and

max1≤t≤s (LM(hi jt )LM(gt )) = LM(S(Mi gi , M j g j )) < max(LM(Mi gi ), LM(M j g j )) = M.

Substituting (1.19) into (1.18) and then (1.18) into (1.17), we obtain f = ∑sj=1 h′j g j ,
where max1≤ j≤s (LM(h′j )LM(g j )) < M (recall that di j ∈ k), as required.

Example 1.2.20. (Continuation of Example 1.2.11.) Once again consider the ideal
I in k[x, y] generated by the polynomials f1 = x2 − 1 and f2 = xy + 2. As before
we use lex with x > y. If f and g are in I, then certainly S( f , g) is. Beginning with
the basis { f1 , f2 } of I, we will recursively add S-polynomials to our basis until,
based on Buchberger’s Criterion, we achieve a Gröbner basis. Thus we initially set
G := { f1 , f2 } and, using LCM(x2 , xy) = x2 y, compute the S-polynomial

S( f1 , f2 ) = (x2 y/x2 )(x2 − 1) − (x2 y/(xy))(xy + 2) = −2x − y.

Since −2x − y is obviously reduced with respect to { f1 , f2 }, it must be added to


G, which becomes G := { f1 , f2 , −2x − y}. We then compute the S-polynomial for
every pair of polynomials in the new G and reduce it, if possible. We know already
that S( f1 , f2 ) = −2x − y; hence, we compute

S( f1 , −2x − y) = (x2 /x2 )(x2 − 1) − (x2 /(−2x))(−2x − y) = −(1/2)(xy + 2) → 0 (modulo G),
S( f2 , −2x − y) = (xy/xy)(xy + 2) − (xy/(−2x))(−2x − y) = −(1/2)y2 + 2 .

Since −(1/2)y2 + 2 is reduced with respect to G, it is added to G, which then becomes
G := { f1 , f2 , −2x − y, −(1/2)y2 + 2}. Further computations show that the S-polynomial
of every pair of polynomials now in G reduces to zero modulo G. Therefore a Gröbner basis for
I = h f1 , f2 i is { f1 , f2 , −2x − y, −(1/2)y2 + 2}.
This example illustrates the algorithm in Table 1.3, based on Buchberger’s Cri-
terion, for computing a Gröbner basis of a polynomial ideal. (It is implicit in the
algorithm that we order the set G at each stage in order to apply the Multivariable
Division Algorithm.)

Buchberger’s Algorithm

Input:
A set of polynomials { f 1 , . . ., f s } ⊂ k[x1 , . . ., xn ] \ {0}

Output:
A Gröbner basis G for the ideal h f 1 , . . ., f s i

Procedure:
G := { f 1 , . . ., f s}.
Step 1. For each pair gi , g j ∈ G, i ≠ j, compute the S-polynomial S(gi , g j ) and
apply the Multivariable Division Algorithm to compute a remainder ri j :
G
S(gi , g j ) → ri j
IF
All ri j are equal to zero, output G
ELSE
Add all nonzero ri j to G and return to Step 1.

Table 1.3 Buchberger’s Algorithm
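The algorithm in Table 1.3 admits a short, unoptimized transcription. The sketch below assumes SymPy; note that SymPy's `lcm` applied to leading *terms* may carry a scalar factor, so the basis elements it adjoins can differ from the hand computation of Example 1.2.20 by nonzero scalar multiples, which does not affect the Gröbner basis property.

```python
from sympy import symbols, LT, lcm, expand, reduced

def s_poly(f, g, gens, order='lex'):
    # S(f, g) = (L/LT(f)) f - (L/LT(g)) g, with L the lcm of the leading terms
    ltf, ltg = LT(f, *gens, order=order), LT(g, *gens, order=order)
    L = lcm(ltf, ltg)
    return expand(L / ltf * f - L / ltg * g)

def buchberger(F, gens, order='lex'):
    """Naive Buchberger: adjoin nonzero remainders of S-polynomials until none remain."""
    G = [expand(f) for f in F]
    pairs = [(i, j) for i in range(len(G)) for j in range(i + 1, len(G))]
    while pairs:
        i, j = pairs.pop()
        sp = s_poly(G[i], G[j], gens, order)
        if sp == 0:
            continue
        _, r = reduced(sp, G, *gens, order=order)   # Multivariable Division Algorithm
        if r != 0:
            pairs.extend((k, len(G)) for k in range(len(G)))
            G.append(r)
    return G

x, y = symbols('x y')
G = buchberger([x**2 - 1, x*y + 2], [x, y])
print(G)   # f1 and f2 together with scalar multiples of 2x + y and y**2 - 4,
           # as in Example 1.2.20
```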

Although it is easy to understand why the algorithm produces a Gröbner basis for
the ideal h f1 , . . . , fs i, it is not immediately obvious that the algorithm must terminate
after a finite number of steps. In fact, its termination is a consequence of the Hilbert
Basis Theorem (Theorem 1.1.6), as we now show.
Theorem 1.2.21. Buchberger’s Algorithm produces a Gröbner basis for the nonzero
ideal I = h f1 , . . . , fs i.
Proof. Since the original set G is a generating set for I, and at each step of the algo-
rithm we add to G polynomials from I, certainly at each step G remains a generating
set for I. Thus, if at some stage all remainders ri j are equal to zero, then by Buch-
berger’s criterion (Theorem 1.2.19) G is a Gröbner basis for I. Therefore we need
only prove that the algorithm terminates.
To this end, let G1 , G2 , G3 , . . . be the sequence of sets produced by the successive
values of G upon successive passes through the algorithm. If the algorithm does not

terminate, then we have a strictly increasing infinite sequence G j ⊊ G j+1 , j ∈ N.


Each set G j+1 is obtained from the set G j by adjoining to G j at least one polyno-
mial h of I, where h is a nonzero remainder with respect to G j of an S-polynomial
S( f1 , f2 ) for f1 , f2 ∈ G j . Since h is reduced with respect to G j , its leading term is
not divisible by the leading term of any element of G j , so that LT(h) ∉ hLT(G j )i.
Thus we obtain the strictly ascending chain of ideals

hLT(G1 )i ⊊ hLT(G2 )i ⊊ hLT(G3 )i ⊊ · · · ,

in contradiction to Corollary 1.1.7. 

Even if a term order is fixed, an imprecision in the computation of a Gröbner basis


arises because the Multivariable Division Algorithm can produce different remain-
ders for different orderings of polynomials in the set of divisors. Thus the output
of Buchberger’s Algorithm is not unique. Also, as a rule the algorithm is ineffi-
cient in the sense that the basis that it produces contains more polynomials than are
necessary. We can eliminate the superfluous polynomials from the basis using the
following fact, which follows immediately from the definition of a Gröbner basis.

Proposition 1.2.22. Let G be a Gröbner basis for I ⊂ k[x1 , . . . , xn ]. If g ∈ G and


LT(g) ∈ hLT(G \ {g})i, then G \ {g} is also a Gröbner basis for I.

Proof. Exercise 1.27. 

In particular, if g ∈ G is such that LT(g′ ) | LT(g) for some other element g′ of G,


then g may be discarded.

Definition 1.2.23. A Gröbner basis G = {g1 , . . . , gm } is called minimal if, for all
i, j ∈ {1, . . . , m}, LC(gi ) = 1 and, for j ≠ i, LM(gi ) does not divide LM(g j ).

Theorem 1.2.24. Every nonzero polynomial ideal has a minimal Gröbner basis.

Proof. By the Hilbert Basis Theorem (Theorem 1.1.6) there exists a finite basis
{ f1 , . . . , fs } of I. From the Gröbner basis G obtained from { f1 , . . . , fs } by Buch-
berger’s Algorithm (Theorem 1.2.21) discard all those polynomials g such that
LT(g) ∈ hLT(G \ {g})i, then rescale the remaining polynomials to make their lead-
ing coefficients equal to 1. By Proposition 1.2.22 the resulting set is a minimal
Gröbner basis. 
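The discard-and-rescale step of this proof can be sketched as follows (assuming SymPy; the helper `divides` and the name `minimalize` are ours, not the book's, and the sketch assumes the basis elements have pairwise distinct leading monomials, as in Example 1.2.20).

```python
from sympy import symbols, LM, LC, Poly, expand

def divides(m1, m2, gens):
    """True if the monomial m1 divides m2 (componentwise exponent comparison)."""
    e1 = Poly(m1, *gens).monoms()[0]
    e2 = Poly(m2, *gens).monoms()[0]
    return all(a <= b for a, b in zip(e1, e2))

def minimalize(G, gens, order='lex'):
    lms = [LM(g, *gens, order=order) for g in G]
    # discard every g whose leading monomial is divisible by another's
    kept = [g for i, g in enumerate(G)
            if not any(j != i and divides(lms[j], lms[i], gens)
                       for j in range(len(G)))]
    # rescale so every leading coefficient becomes 1
    return [expand(g / LC(g, *gens, order=order)) for g in kept]

x, y = symbols('x y')
G = [x**2 - 1, x*y + 2, -2*x - y, -y**2/2 + 2]   # the basis from Example 1.2.20
M = minimalize(G, [x, y])
print(M)   # [x + y/2, y**2 - 4]
```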

Proposition 1.2.25. Any two minimal Gröbner bases G and G′ of an ideal I of the
ring k[x1 , . . . , xn ] have the same set of leading terms: LT(G) = LT(G′ ). Thus they
have the same number of elements.

Proof. Write G = {g1, . . . , gs } and G′ = {g′1 , . . . , gt′ }. Because g1 ∈ I and G′ is a


Gröbner basis, there exists an index i such that

LT(g′i ) | LT(g1 ) . (1.20)

Because g′i ∈ I and G is a Gröbner basis, there exists an index j such that

LT(g j ) | LT(g′i ) , (1.21)

whence LT(g j ) | LT(g1 ). Since G is minimal we must have j = 1. But then (1.20) and
(1.21) taken together force LT(g′i ) = LT(g1 ). Re-index G′ so that g′i = g′1 , and repeat
the argument with g1 replaced by g2 (noting that LT (g1 ) ∤ LT (g2 ), by minimality
of G), obtaining existence of an index i such that LT(g′i ) = LT(g2 ). Applying the
same argument for g3 through gs , we obtain the fact that t ≥ s and, suitably indexed,
LT(g′i ) = LT(gi ) for 1 ≤ i ≤ s. Reversing the roles of G and G′ , we have t ≤ s, and
in consequence the fact that LT(G′ ) = LT(G).
By minimality, there is a one-to-one correspondence between leading terms of
G and its elements, and similarly for G′ , so G and G′ have the same number of
elements. 
Although the minimal basis obtained by the procedure described in the proof
of Theorem 1.2.24 can be much smaller than the original Gröbner basis provided
by Buchberger’s Algorithm, nevertheless it is not necessarily unique (see Exercise
1.28). Fortunately, we can attain uniqueness with one additional stipulation concern-
ing the Gröbner basis.
Definition 1.2.26. A Gröbner basis G = {g1 , . . . , gm } is called reduced if, for all i,
1 ≤ i ≤ m, LC(gi ) = 1 and no term of gi is divisible by any LT(g j ) for j ≠ i.
That is, the Gröbner basis is reduced if each of its elements g is monic and is re-
duced with respect to G\ {g} (Definition 1.2.8). A quick comparison with Definition
1.2.23 shows that every reduced Gröbner basis is minimal.
Theorem 1.2.27. Fix a term order. Every nonzero ideal I ⊂ k[x1 , . . . , xn ] has a
unique reduced Gröbner basis with respect to this order.
Proof. Let G be any minimal Gröbner basis for I, guaranteed by Theorem 1.2.24 to
exist. For any g ∈ G, replace g by its remainder r upon division of g by elements of
G \ {g} (in some order), to form the set H = (G \ {g}) ∪ {r}. Then LT(r) = LT(g),
since, by minimality of G, for no g′ ∈ G \ {g} does LT(g′ ) | LT(g). Thus LT(H) = LT(G),
so hLT(H)i = hLT(G)i. Then by Theorem 1.2.16 H is also a Gröbner basis for I. It
is clear that it is also minimal. Since r is a remainder of g with respect to G \ {g}, by
definition it is reduced with respect to G \ {g} = H \ {r}. Applying this procedure
to each of the finitely many elements of G in turn yields a reduced Gröbner basis for
the ideal I.
Turning to the question of uniqueness, let two reduced Gröbner bases G and G′
for I be given. By Proposition 1.2.25, LT(G) = LT(G′ ), so that for any g ∈ G there
exists g′ ∈ G′ such that
LT(g) = LT(g′ ). (1.22)
By the one-to-one correspondence between the elements of a minimal Gröbner basis
and its leading terms, to establish uniqueness it is sufficient to show that g = g′ . Thus
consider the difference g − g′. Since g − g′ ∈ I, by Theorem 1.2.16
G
g − g′ → 0. (1.23)

On the other hand, g − g′ is already reduced with respect to G: for by (1.22) the
leading terms of g and g′ have cancelled, no other term of g is divisible by any
element of LT(G) (minimality of G), and no other term of g′ is divisible by any
element of LT(G′ ) = LT(G) (minimality of G′ and (1.22) again); that is,
G
g − g′ → g − g′ . (1.24)

Hence, from (1.23) and (1.24) by Proposition 1.2.14, we conclude that g − g′ = 0,


as required. 
Recall that by Theorem 1.2.16, a Gröbner basis G of an ideal I provides a solution
G
to the Ideal Membership Problem: f ∈ I if and only if f → 0. An analogous prob-
lem is to determine whether or not two ideals I, J ⊂ k[x1 , . . . , xn ], each expressed in
terms of a specific finite collection of generators, are the same ideal. Theorem 1.2.27
provides a solution that is easily implemented:
Equality of Ideals. Nonzero ideals I and J in k[x1 , . . . , xn ] are the same
ideal if and only if I and J have the same reduced Gröbner basis with
respect to a fixed term order.
Buchberger’s Algorithm is far from the most efficient way to compute a Gröbner
basis. For practical use there are many variations and improvements, which the in-
terested reader can find in [1, 18, 60], among other references. The point is, however,
that such algorithms have now made it feasible to actually compute special gener-
ating sets for large ideals. Almost any readily available computer algebra system,
such as Mathematica, Maple, or REDUCE, will have built-in routines for comput-
ing Gröbner bases with respect to lex, deglex, and degrev. In the following section
we will repeatedly assert that such and such a collection is a Gröbner basis for a par-
ticular ideal. The reader is encouraged to use a computer algebra system to duplicate
the calculations on his own.

1.3 Basic Properties and Algorithms

In this section we present some basic facts concerning polynomial ideals that will be
important for us later. Based on these properties of polynomial ideals we develop al-
gorithms for working constructively with them and interpret these algorithms from
a geometric point of view, meaning that we will see which operations on affine
varieties correspond to which operations on ideals, and conversely. Buchberger’s
Algorithm is a fundamental component of all the algorithms below. In our presen-
tation we use for the most part the notation in [60], to which we refer the reader for
more details and further results.
To begin the discussion, recall that in solving a system of linear equations, an
effective method is to reduce the system to an equivalent one in which initial strings
of variables are missing from some of the equations, then work “backwards” from
constraints on a few of the variables to the full solution. The first few results in

this section are a formalization of these ideas and an investigation into how they
generalize to the full nonlinear situation (1.3).

Definition 1.3.1. Let I be an ideal in k[x1 , . . . , xn ] (with the implicit ordering of the
variables x1 > · · · > xn ) and fix ℓ ∈ {0, 1, . . ., n − 1}. The ℓth elimination ideal of I is
the ideal Iℓ = I ∩ k[xℓ+1 , . . . , xn ]. Any point (aℓ+1 , . . . , an ) ∈ V(Iℓ ) is called a partial
solution of the system { f = 0 : f ∈ I}.

The following theorem is most helpful in investigating solutions of systems of


polynomial equations. We will use it several times in later sections.

Theorem 1.3.2 (Elimination Theorem). Fix the lexicographic term order on the
ring k[x1 , . . . , xn ] with x1 > x2 > · · · > xn and let G be a Gröbner basis for an ideal
I of k[x1 , . . . , xn ] with respect to this order. Then for every ℓ, 0 ≤ ℓ ≤ n − 1, the set

Gℓ := G ∩ k[xℓ+1, . . . , xn ]

is a Gröbner basis for the ℓth elimination ideal Iℓ .

Proof. Fix ℓ and suppose that G = {g1 , . . . , gs }. We may assume that the g j are
ordered so that exactly the first u elements of G lie in k[xℓ+1 , . . . , xn ], that is, that
Gℓ = {g1 , . . . , gu }. Since Gℓ ⊂ Iℓ , Theorem 1.2.16 is applicable and implies that
Gℓ is a Gröbner basis of Iℓ provided hLT(Iℓ )i = hLT(Gℓ )i. It is certainly true that
hLT(Gℓ )i ⊂ hLT(Iℓ )i. The reverse inclusion means that if f ∈ Iℓ , then LT( f ) is a
linear combination of leading terms of elements of Gℓ . Since we are working with
monomials, this is true if and only if LT( f ) is divisible by LT(g j ) for some index
j ∈ {1, . . . , u}. (See the reasoning in the last part of the proof of Theorem 1.2.16.)
Given f ∈ I, since G is a Gröbner basis for I, there exists a g j ∈ G such that LT( f )
is divisible by LT(g j ). But f is also in Iℓ , so f depends on only xℓ+1 , . . . , xn , hence
LT (g j ) depends on only xℓ+1 , . . . , xn . Because we are using the term order lex with
x1 > x2 > · · · > xn , this implies that g j = g j (xℓ+1 , . . . , xn ), that is, that g j ∈ Gℓ . 

The Elimination Theorem provides an easy way to eliminate a group of variables


from a polynomial system. Moreover, it provides a way to find all solutions of a
polynomial system in the case that the solution set is finite, or in other words, to find
the variety of a polynomial ideal in the case that the variety is zero-dimensional. We
will build our discussion around the following example.

Example 1.3.3. Let us find the variety in C3 of the ideal I = h f1 , f2 , f3 , f4 i, where

f 1 = y2 + x + z − 1
f2 = x2 + 2yx + 2zx − 4x − 3y + 2yz − 3z + 3
f3 = z2 + x + y − 1
f4 = 2x3 + 6zx2 − 5x2 − 4x − 7y + 4yz − 7z + 7 ,

that is, the solution set of the system



f1 = 0, f2 = 0, f3 = 0, f4 = 0 . (1.25)

Under lex and the ordering x > y > z, a Gröbner basis for I is G = {g1 , g2 , g3 }, where
g1 = x + y + z2 − 1, g2 = y2 − y − z2 + z, and g3 = z4 − 2z3 + z2 . Thus system (1.25)
is equivalent to the system

x + y + z2 − 1 = 0
y2 − y − z2 + z = 0 (1.26)
z4 − 2z3 + z2 = 0 .

System (1.26) is readily solved. The last equation factors as z2 (z − 1)2 = 0, yielding
z = 0 or z = 1. Inserting these values successively into the second equation deter-
mines the corresponding values of y, and inserting pairs of (y, z)-values into the first
equation determines the corresponding values of x, yielding the solution

V = V( f1 , f2 , f3 , f4 ) = {(−1, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)} ⊂ C3 , (1.27)

so V is just a union of four points in C3 (and of course it is also a variety in R3 and


in Q3 ).
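The whole computation can be reproduced in a computer algebra system (a sketch assuming SymPy; the list comprehension selecting the basis elements free of x illustrates the Elimination Theorem).

```python
from sympy import symbols, groebner, solve

x, y, z = symbols('x y z')
f1 = y**2 + x + z - 1
f2 = x**2 + 2*y*x + 2*z*x - 4*x - 3*y + 2*y*z - 3*z + 3
f3 = z**2 + x + y - 1
f4 = 2*x**3 + 6*z*x**2 - 5*x**2 - 4*x - 7*y + 4*y*z - 7*z + 7

G = groebner([f1, f2, f3, f4], x, y, z, order='lex')
print(list(G.exprs))
# [x + y + z**2 - 1, y**2 - y - z**2 + z, z**4 - 2*z**3 + z**2]

# Elimination Theorem: the basis elements free of x form a Groebner basis
# of the first elimination ideal I1
elim = [g for g in G.exprs if x not in g.free_symbols]
print(elim)           # [y**2 - y - z**2 + z, z**4 - 2*z**3 + z**2]

# back-substitution recovers the four points of V
sols = solve([f1, f2, f3, f4], [x, y, z])
print(sorted(sols))   # [(-1, 1, 1), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
```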

Careful examination of the example shows that system (1.26) has a form similar
to the row-echelon form for systems of linear equations (and indeed row reduction
is a process that leads to a Gröbner basis). As in the linear case the procedure for
solving system (1.25) consists of two steps: (1) the forward or elimination step, and
(2) the backward or extension step.
In the first step we eliminated from system (1.25) the variables x and y, in the
sense that we transformed our system into an equivalent system that contains a
polynomial g3 that depends only on the single variable z, and which is therefore
easy to solve (although in general we might be unable to find the exact roots of such
a polynomial if its degree is greater than four). Once we found the roots of g3 , which
is the generator of the second elimination ideal I2 , we came to the second part of
the procedure, the extension step. Here, using the roots of g3 , we first found roots
of the polynomial g2 (which, along with g3 , generates the first elimination ideal I1 ),
and then we found all solutions of system (1.26), which is equivalent to our original
system.
The example shows the significance of the choice of term order for the elimi-
nation step, because in fact the original polynomials F = { f1 , . . . , f4 } also form a
Gröbner basis for I, but with respect to the term order deglex with y > x > z. Unlike
the Gröbner basis G = {g1 , g2 , g3 }, the Gröbner basis F does not contain a univariate
polynomial.
Now consider this example from a geometrical point of view. The map

πm : Cn → Cn−m : (a1 , . . . , an ) ↦ (am+1 , . . . , an )

is called the projection of Cn onto Cn−m . For the variety (1.27), the projection
π1 (V ) = {(1, 1), (0, 1), (1, 0), (0, 0)} is the variety of the first elimination ideal and

π2 (V ) = {1, 0} is the variety of the second elimination ideal. So in solving the sys-
tem, we first found the variety of the second elimination ideal I2 , extended it to the
variety of the first elimination ideal I1 , and finally extended the latter variety to the
variety of the original ideal I = h f1 , f2 , f3 , f4 i.
Unfortunately, the backward step, the extension of partial solutions, does not
always work. In the example above, substitution of each root of g j into the preceding
polynomials yields a system that has solutions. The next example shows that this is
not always the case. That is, it is not always possible to extend a partial solution to
a solution of the original system.

Example 1.3.4. Let V be the solution set in C3 of the system

xy = 1, xz = 1. (1.28)

The reduced Gröbner basis of I = hxy − 1, xz − 1i with respect to lex with x > y > z
is {xz − 1, y − z}. Thus the first elimination ideal is I1 = hy − zi. The variety of I1 is
the line y = z in the (y, z)-plane. That is, the partial solutions corresponding to I1 are
{(a, a) : a ∈ C}. Any partial solution (a, a) for which a 6= 0 can be extended to the
solution (1/a, a, a) of (1.28). The partial solution (0, 0) cannot, which corresponds
to the fact that the point (0, 0) ∈ V(I1 ) has no preimage in V under the projection π1 .
The projection π1 (V ) of V onto C2 (regarded as the (y, z)-plane) is not the line y = z,
but is this line with the point (y, z) = (0, 0) deleted, hence is not a variety (Exercise
1.30).
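A quick check of this example (again a sketch assuming SymPy; the extra equations pinning y and z to specific values are our device for testing which partial solutions extend).

```python
from sympy import symbols, groebner, solve

x, y, z = symbols('x y z')
G = groebner([x*y - 1, x*z - 1], x, y, z, order='lex')
print(list(G.exprs))   # [x*z - 1, y - z], so the first elimination ideal is <y - z>

# a partial solution (a, a) with a != 0 extends ...
ext = solve([x*y - 1, x*z - 1, y - 2, z - 2], [x, y, z])
print(ext)             # [(1/2, 2, 2)]

# ... but (0, 0) does not: the augmented system is inconsistent
no_ext = solve([x*y - 1, x*z - 1, y, z], [x, y, z])
print(no_ext)          # []
```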

Note that the situation is the same if we consider the ideal I as an ideal in R[x, y, z].
In this case we can easily sketch V , which is a hyperbola in the plane y = z in R3 ,
and we see from the picture that the projection of the hyperbola onto the plane x = 0
is the line y = z with the origin deleted.
The next theorem gives a sufficient condition for the possibility of extending a
partial solution of a polynomial system to the complete solution.

Theorem 1.3.5 (Extension Theorem). Let I = h f1 , . . . , fs i be a nonzero ideal in the


ring C[x1 , . . . , xn ] and let I1 be the first elimination ideal for I. Write the generators
of I in the form f j = g j (x2 , . . . , xn ) x1^N j + g̃ j , where N j ∈ N0 , g j ∈ C[x2 , . . . , xn ] are
nonzero polynomials, and g̃ j are the sums of terms of f j of degree less than N j in
x1 . Consider a partial solution (a2 , . . . , an ) ∈ V(I1 ). If (a2 , . . . , an ) 6∈ V(g1 , . . . , gs ),
then there exists a1 such that (a1 , a2 , . . . , an ) ∈ V(I).

The reader can find a proof, which uses the theory of resultants, in [60, Chap. 3,
§6]. Note that the polynomials in Example 1.3.4 do not satisfy the condition of the
theorem.
An important particular case of the Extension Theorem is that in which the lead-
ing term of at least one of the f j , considered as polynomials in x1 , is a constant.

Corollary 1.3.6. Let I = h f1 , . . . , fs i be an ideal in C[x1 , . . . , xn ] and let I1 be the


first elimination ideal for I. Suppose there exists an f j that can be represented in the
form f j = c x1^N + g̃, where N ∈ N, c ∈ C \ {0}, and g̃ is the sum of the terms of f j

of degree less than N in x1 . Then for any partial solution (a2 , . . . , an ) ∈ V(I1 ), there
exists a1 ∈ C such that (a1 , a2 , . . . , an ) ∈ V(I).

Assuming now that the ideal I in the theorem or in the corollary is an elimination
ideal Ik , we obtain a method that will sometimes guarantee that a partial solution
(ak+1 , . . . , an ) can be extended to a complete solution of the system. For we first use
the theorem (or corollary) to see if we can be sure that the original partial solution
can be extended to a partial solution (ak , ak+1 , . . . , an ), then we apply the theorem
again to see if we can be sure that this partial solution extends, and so on.
A restatement of Corollary 1.3.6 in geometric terms is the following. In Exercise
1.31 the reader is asked to supply a proof.

Corollary 1.3.7. Let I and f j be as in Corollary 1.3.6. Then the projection of V(I)
onto the last n − 1 components is equal to the variety of I1 ; that is, π1 (V ) = V(I1 ).

We have seen in Example 1.3.4 that the projection of a variety in kn onto kn−m
is not necessarily a variety. The following theorem describes more precisely the
geometry of the set πℓ (V ) and its relation to the variety of Iℓ . For a proof see, for
example, [60, Chap. 3, §2].

Theorem 1.3.8 (Closure Theorem). Let V = V( f1 , . . . , fs ) be an affine variety in


Cn and let Iℓ be the ℓth elimination ideal for the ideal I = h f1 , . . . , fs i. Then
1. V(Iℓ ) is the smallest affine variety containing πℓ (V ) ⊂ Cn−ℓ , and
2. if V ≠ ∅, then there is an affine variety W ⊊ V(Iℓ ) such that V(Iℓ ) \ W ⊂ πℓ (V ).

According to the Fundamental Theorem of Algebra, every element f of C[x] with


deg( f ) ≥ 1 has at least one root. The next theorem is a kind of analogue for the
case of polynomials of several variables. It shows that any system of polynomials
in several variables that is not equivalent to a constant has at least one solution
over C. We will need the following lemma concerning homogeneous polynomials.
(A polynomial f in several variables is homogeneous if the full degree of every
term of f is the same.) The idea is that in general it is possible that f not be the
zero element of k[x1 , . . . , xn ] yet g defined by g(x2 , . . . , xn ) = f (1, x2 , . . . , xn ) be the
zero element of k[x2 , . . . , xn ]. An example is f (x, y) = x2 y − xy. The lemma shows that this
phenomenon does not occur if f is homogeneous.

Lemma 1.3.9. Let f (x1 , . . . , xn ) ∈ k[x1 , . . . , xn ] be a homogeneous polynomial. Then f (x1 , . . . , xn ) is the zero polynomial in k[x1 , . . . , xn ] if and only if f (1, x2 , . . . , xn ) is the zero polynomial in k[x2 , . . . , xn ].

Proof. If f (x1 , . . . , xn ) is the zero polynomial, then certainly f (1, x2 , . . . , xn ) is the zero polynomial as well. For the converse, let f ∈ k[x1 , . . . , xn ] be a nonzero homogeneous polynomial with deg( f ) = N. Without loss of generality we assume that all similar terms of f (x1 , . . . , xn ) are collected, so that the nonzero terms of f are different and satisfy the condition

a_α x^α ≠ a_β x^β implies α ≠ β . (1.29)
1.3 Basic Properties and Algorithms 29

Then for any two terms c_1 x_2^{β_2} · · · x_n^{β_n} and c_2 x_2^{γ_2} · · · x_n^{γ_n} in f (1, x_2 , . . . , x_n ) for which c_1 c_2 ≠ 0, it must be true that (β_2 , . . . , β_n ) ≠ (γ_2 , . . . , γ_n ), else

c_1 x_1^{N−(β_2 +···+β_n )} x_2^{β_2} · · · x_n^{β_n} and c_2 x_1^{N−(γ_2 +···+γ_n )} x_2^{γ_2} · · · x_n^{γ_n}

are terms in f that violate condition (1.29). That is, no nonzero terms in f (1, x_2 , . . . , x_n ) are similar, hence none is lost (there is no cancellation) when terms in f (1, x_2 , . . . , x_n ) are collected. Since f is not the zero polynomial, there is a nonzero term c x_1^{α_1} · · · x_n^{α_n} in f , which yields a corresponding nonzero term c x_2^{α_2} · · · x_n^{α_n} in f (1, x_2 , . . . , x_n ), which is therefore not the zero polynomial. □

Theorem 1.3.10 (Weak Hilbert Nullstellensatz). If I is an ideal in C[x1 , . . . , xn ] such that V(I) = ∅, then I = C[x1 , . . . , xn ].

Proof. To prove the theorem it is sufficient to show that 1 ∈ I. We proceed by induction on the number of variables in our polynomial ring.
Basis step. Suppose that n = 1, so the polynomial ring is C[x], and let an ideal I for which V(I) = ∅ be given. By Exercise 1.12 there exists f ∈ C[x] such that I = ⟨ f ⟩. The hypothesis V(I) = ∅ means that f has no roots in C, hence by the Fundamental Theorem of Algebra we conclude that f = a ∈ C \ {0}. It follows that I = ⟨ f ⟩ = C[x].
Inductive step. Suppose that the statement of the theorem holds in the case of n − 1 ≥ 1 variables. Let I be an ideal in C[x1 , x2 , . . . , xn ] such that V(I) = ∅. By the Hilbert Basis Theorem (Theorem 1.1.6) there exist finitely many polynomials f j such that I = ⟨ f1 , . . . , fs ⟩. Set N = deg( f1 ). If N = 0, then f1 is constant, so that I = C[x1 , . . . , xn ], as required. If N ≥ 1, then for any choice of (a2 , . . . , an ) ∈ Cn−1
the linear transformation
x1 = y1
x2 = y2 + a 2 y1
.. (1.30)
.
xn = yn + a n y1
induces a mapping

T : C[z1 , . . . , zn ] → C[z1 , . . . , zn ] : f ↦ f̃ (1.31)

defined by f̃(y1 , . . . , yn ) = f (y1 , y2 + a2 y1 , . . . , yn + an y1 ). Let g ∈ C[z1 , . . . , zn−1 ] be defined by

f̃1 (y1 , . . . , yn ) = f1 (y1 , y2 + a2 y1 , . . . , yn + an y1 ) = g(a2 , . . . , an ) y1^N + h , (1.32)

where h is the sum of all terms of f̃1 whose degree in y1 is less than N. If we rewrite f1 in the form f1 = uN + uN−1 + · · · + u0 , where, for each j, u j is a homogeneous polynomial of degree j in x1 , . . . , xn , then g(a2 , . . . , an ) = uN (1, a2 , . . . , an ). By Lemma 1.3.9, g is a nonzero polynomial, hence there exist ã2 , . . . , ãn such that g(ã2 , . . . , ãn ) ≠ 0. Fix a choice of such ã2 , . . . , ãn to define T in (1.31).
30 1 Polynomial Ideals and Their Varieties

Clearly Ĩ := { f̃ : f ∈ I} is an ideal in C[y1 , . . . , yn ] and V(Ĩ) = ∅, since the linear transformation given by (1.30) is invertible.
To prove the theorem it is sufficient to show that 1 ∈ Ĩ (this inclusion implies that 1 ∈ I because the transformation (1.31) does not change constant polynomials). From (1.32) and Corollary 1.3.7 we conclude that V(Ĩ1 ) = π1 (V(Ĩ)). Therefore

V(Ĩ1 ) = π1 (V(Ĩ)) = π1 (∅) = ∅.

Thus by the induction hypothesis Ĩ1 = C[y2 , . . . , yn ]. Hence 1 ∈ Ĩ1 ⊂ Ĩ; the statement of the theorem thus holds for the case of n variables, and the proof is complete. □

The Weak Hilbert Nullstellensatz provides a way for checking whether or not a
given polynomial system
f1 = f2 = · · · = fs = 0 (1.33)
has a solution over C, for in order to find out if there is a solution to (1.33) it is sufficient to compute a reduced Gröbner basis G for ⟨ f1 , . . . , fs ⟩ using any term order:
Existence of Solutions of a System of Polynomial Equations. System (1.33) has a solution over C if and only if the reduced Gröbner basis G for ⟨ f1 , . . . , fs ⟩ with respect to any term order on C[x1 , . . . , xn ] is different from {1}.
Example 1.3.11. Consider the system

f1 = 0, f2 = 0, f3 = 0, f4 = 0, f5 = 0, (1.34)

where f1 , f2 , f3 , and f4 are from Example 1.3.3 and f5 = 2x^3 + xy − z^2 + 3. A computation shows that the reduced Gröbner basis of ⟨ f1 , . . . , f5 ⟩ with respect to any term order is {1}. Therefore system (1.34) has no solutions.
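This consistency check is easy to script. The sketch below assumes the sympy library (the helper name has_solution is ours); it decides solvability over C by testing whether the reduced Gröbner basis is {1}, and is illustrated on the trivially inconsistent system {xy − 1, x}.

```python
from sympy import groebner, symbols

def has_solution(polys, variables):
    """Decide whether polys = 0 has a solution over C: by the criterion
    above, it does iff the reduced Groebner basis is not {1}."""
    G = groebner(polys, *variables, order='lex')  # sympy returns the reduced basis
    return list(G.exprs) != [1]

x, y = symbols('x y')
# x*y = 1 forces x != 0, contradicting x = 0, so the basis collapses to {1}
print(has_solution([x*y - 1, x], [x, y]))              # False: no solution over C
print(has_solution([x**2 + y**2 - 1, x - y], [x, y]))  # True
```

Any term order works; lex is used here only for definiteness.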

If we are interested in solutions of (1.33) over a field k that is not algebraically closed, then a Gröbner basis computation gives a definite answer only if it is equal to {1}. For example, if there is no solution in Cn , then certainly there is no solution in Rn . In general, if {1} is a Gröbner basis for ⟨ f1 , . . . , fs ⟩ over a field k, then system (1.33) has no solutions over k.
We now introduce a concept, the radical of an ideal I, that will be of fundamental
importance in the procedure for identifying the variety V(I) of I.

Definition 1.3.12. Let I ⊂ k[x1 , . . . , xn ] be an ideal. The radical of I, denoted √I, is the set

√I = { f ∈ k[x1 , . . . , xn ] : there exists p ∈ N such that f^p ∈ I}.

An ideal J ⊂ k[x1 , . . . , xn ] is called a radical ideal if √J = J.

In Exercise 1.32 the reader is asked to verify that √I is an ideal, and in Exercise 1.33 to show that √I determines the same affine variety as I:
V(√I) = V(I) . (1.35)

Example 1.3.13. Consider the set of ideals I^(p) = ⟨(x − y)^p ⟩, p ∈ N. All these ideals define the same variety V , which is the line y = x in the plane k2 . It is easy to see that I^(1) ⊋ I^(2) ⊋ I^(3) ⊋ · · · and that √I^(p) = I^(1) for every index p, so the only radical ideal among the I^(p) is I^(1) = ⟨x − y⟩.
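The claim that x − y lies in the radical of each I^(p) can be verified mechanically, anticipating the Radical Membership Test of Theorem 1.3.17 below. The sketch assumes the sympy library and checks the case p = 3 via the slack-variable criterion: f ∈ √I if and only if 1 lies in the ideal generated by the generators of I together with 1 − w f.

```python
from sympy import groebner, symbols

# Check that x - y is in the radical of <(x - y)**3>:
# f is in sqrt(I) iff the reduced Groebner basis of <gens, 1 - w*f> is {1}.
x, y, w = symbols('x y w')
f = x - y
G = groebner([(x - y)**3, 1 - w*f], x, y, w, order='lex')
print(list(G.exprs))  # [1], so x - y lies in the radical of <(x - y)**3>
```

Indeed 1 = (1 − w f )(1 + w f + w^2 f^2 ) + w^3 f^3 exhibits 1 as an element of the enlarged ideal.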

This example indicates that it is the radical of an ideal that is fundamental in picking out the variety that it and other ideals may determine. The next theorem and Proposition 1.3.16 say this more precisely.

Theorem 1.3.14 (Strong Hilbert Nullstellensatz). Let f , f1 , . . . , fs be elements of C[x1 , . . . , xn ]. Then f ∈ I(V( f1 , . . . , fs )) if and only if there exists p ∈ N such that f^p ∈ ⟨ f1 , . . . , fs ⟩. In other words, for any ideal I in C[x1 , . . . , xn ],

√I = I(V(I)). (1.36)

Proof. If f^p ∈ ⟨ f1 , . . . , fs ⟩, then f^p vanishes on the variety V( f1 , . . . , fs ). Hence, because f and f^p have the same zero sets, f itself vanishes on V( f1 , . . . , fs ). That is, f ∈ I(V( f1 , . . . , fs )).
For the reverse inclusion, suppose f vanishes on V( f1 , . . . , fs ). We must show that there exist p ∈ N and h1 , . . . , hs ∈ C[x1 , . . . , xn ] such that

f^p = ∑_{j=1}^s h_j f_j . (1.37)

To do so we expand our polynomial ring by adding a new variable w, and consider the ideal

J = ⟨ f1 , . . . , fs , 1 − w f ⟩ ⊂ C[x1 , . . . , xn , w].

We claim that V(J) = ∅. For consider any point à = (a1 , . . . , an , an+1 ) ∈ Cn+1 , and let A be the projection of à onto the subspace comprised of the first n coordinates, so that A = (a1 , . . . , an ). Then either (i) A ∈ V( f1 , . . . , fs ) or (ii) A ∉ V( f1 , . . . , fs ). In case (i) f (A) = 0, so that 1 − w f is equal to 1 at Ã. But then it follows that à ∉ V( f1 , . . . , fs , 1 − w f ) = V(J). In case (ii) there exists an index j, 1 ≤ j ≤ s, such that f j (A) ≠ 0. We can consider f j as an element of C[x1 , . . . , xn , w] that does not depend on w. Then f j (Ã) ≠ 0, which implies that à ∉ V(J). Since à was an arbitrary point of Cn+1 , the claim that V(J) = ∅ is established.
By the Weak Hilbert Nullstellensatz, Theorem 1.3.10, 1 ∈ J, so that
1 = ∑_{j=1}^s q_j (x1 , . . . , xn , w) f_j + q(x1 , . . . , xn , w)(1 − w f ) (1.38)

for some elements q1 , . . . , qs , q of the ring C[x1 , . . . , xn , w]. If we replace w by 1/ f and multiply by a sufficiently high power of f to clear denominators, we then obtain (1.37), as required. □

Remark. Theorems 1.3.10 and 1.3.14 remain true with C replaced by any alge-
braically closed field, but not for C replaced by R (Exercise 1.34).

For any field k the map I actually maps into the set of radical ideals in k[x1 , . . . , xn ].
It is a direct consequence of the Strong Nullstellensatz that over C, if we restrict the
domain of V to just the radical ideals, then it becomes one-to-one and is an inverse
for I.

Theorem 1.3.15. Let k be a field and let H denote the set of all radical ideals in k[x1 , . . . , xn ]. Then I maps every affine variety into H; that is, I(V ) is a radical ideal in k[x1 , . . . , xn ]. When k = C, V|H is an inverse of I; that is, if I ∈ H, then I(V(I)) = I.

Proof. To verify that I(V ) is a radical ideal, suppose f ∈ k[x1 , . . . , xn ] is such that f^p ∈ I(V ). Then for any a ∈ V , f^p (a) = 0. That is, ( f (a))^p = 0, yielding f (a) = 0, since k is a field. Because a is an arbitrary point of V , f vanishes at all points of V , so f ∈ I(V ). By Theorem 1.1.14, the equality V(I(V )) = V always holds. If k = C, Theorem 1.3.14 applies, and for a radical ideal I equation (1.36) reads I = √I = I(V(I)). □

Part of the importance for us of the concept of the radical of an ideal is that when
the field k in question is C, it completely characterizes when two ideals determine
the same affine variety:

Proposition 1.3.16. Let I and J be ideals in C[x1 , . . . , xn ]. Then V(I) = V(J) if and only if √I = √J.

Proof. If V(I) = V(J), then by (1.35), V(√I) = V(√J), hence by Theorem 1.3.15 we obtain √I = √J.
Conversely, if √I = √J, then V(√I) = V(√J) and by (1.35), V(I) = V(J). □

It is a difficult computational problem to compute the radical of a given ideal (unless the ideal is particularly simple, as was the case in Example 1.3.13). However, examining the proof of Theorem 1.3.14 we obtain the following method for checking whether or not a given polynomial belongs to the radical of a given ideal.

Theorem 1.3.17. Let k be an arbitrary field and let I = ⟨ f1 , . . . , fs ⟩ be an ideal in k[x1 , . . . , xn ]. Then f ∈ √I if and only if 1 ∈ J := ⟨ f1 , . . . , fs , 1 − w f ⟩ ⊂ k[x1 , . . . , xn , w].

Proof. If 1 ∈ J, then from (1.38) and the discussion surrounding it we obtain that f^p ∈ I for some p ∈ N. Conversely, if f ∈ √I, then by the definition of √I, f^p ∈ I ⊂ J for some p ∈ N. Because 1 − w f ∈ J as well,

1 = w^p f^p + (1 − w^p f^p ) = w^p f^p + (1 + w f + · · · + w^{p−1} f^{p−1} )(1 − w f ) ∈ J,

since w^p , 1, w f , . . . , w^{p−1} f^{p−1} ∈ k[x1 , . . . , xn , w] and J is an ideal. □

This theorem provides the simple algorithm presented in Table 1.4 on page 33 for deciding whether or not a polynomial f lies in √⟨ f1 , . . . , fs ⟩.

Radical Membership Test

Input:
f , f1 , . . . , fs ∈ k[x1 , . . . , xn ]

Output:
“Yes” if f ∈ √⟨ f1 , . . . , fs ⟩ or “No” if f ∉ √⟨ f1 , . . . , fs ⟩

Procedure:
1. Compute a reduced Gröbner basis G for
⟨ f1 , . . . , fs , 1 − w f ⟩ ⊂ k[x1 , . . . , xn , w]
2. IF G = {1}
THEN
“Yes”
ELSE
“No”

Table 1.4 The Radical Membership Test
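The procedure of Table 1.4 transcribes directly into a computer algebra system. The sketch below assumes the sympy library (the function name in_radical is ours); it adjoins the slack variable w and checks whether the reduced Gröbner basis is {1}.

```python
from sympy import groebner, symbols

def in_radical(f, gens, variables):
    """Radical Membership Test: f lies in the radical of <gens> iff the
    reduced Groebner basis of <gens, 1 - w*f> is {1} (Theorem 1.3.17)."""
    w = symbols('w_slack')  # fresh slack variable
    G = groebner(list(gens) + [1 - w*f], *variables, w, order='lex')
    return list(G.exprs) == [1]

x, y, z = symbols('x y z')
# (x + z)**2 generates the ideal, so x + z is in its radical
print(in_radical(x + z, [x**2 + 2*x*z + z**2], [x, y, z]))  # True
print(in_radical(y, [x**2 + 2*x*z + z**2], [x, y, z]))      # False
```

Since the test only asks whether the basis is {1}, any term order may be substituted for lex.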

Proposition 1.3.16 and the Radical Membership Test together give us a means for checking whether or not the varieties of two given ideals I = ⟨ f1 , . . . , fs ⟩ and J = ⟨g1 , . . . , gt ⟩ in C[x1 , . . . , xn ] are equal. Indeed, by Exercise 1.35, √I = √J if and only if f j ∈ √J for all j, 1 ≤ j ≤ s, and g j ∈ √I for all j, 1 ≤ j ≤ t. Thus by Proposition 1.3.16, V(I) = V(J) if and only if the Radical Membership Test shows that f j ∈ √J for all j and g j ∈ √I for all j.
In succeeding chapters we will study the so-called center varieties of polynomial
systems of ordinary differential equations. The importance of the Radical Member-
ship Test for our studies of the center problem is that it gives a simple and efficient
tool for checking whether or not the center varieties obtained by different methods
or different authors are the same. Note, however, that the varieties in question must
be complex, since the identification procedure is based on Proposition 1.3.16, which
does not hold when C is replaced by R.
Now we turn to the question of the relationship between operations on ideals
and the corresponding effects on the affine varieties that they define, and between
properties of ideals and properties of the corresponding varieties. Hence let I and
J be two ideals in k[x1 , . . . , xn ]. The intersection of I and J is their set-theoretic
intersection
I ∩ J = { f ∈ k[x1 , . . . , xn ] : f ∈ I and f ∈ J},
and the sum of I and J is

I + J := { f + g : f ∈ I and g ∈ J} .

In Exercise 1.36 the reader is asked to show that I ∩ J and I + J are again ideals in
k[x1 , . . . , xn ] and to derive a basis for the latter. (See also Exercise 1.38. For a basis
of I ∩ J, see Proposition 1.3.25 and Table 1.5.) The intersection of ideals interacts
well with the process of forming the radical of an ideal:
√(I ∩ J) = √I ∩ √J , (1.39)

and with the mapping I defined by (1.5):

I(A) ∩ I(B) = I(A ∪ B) . (1.40)

The reader is asked to prove these equalities in Exercise 1.39. Both intersection and
sum mesh nicely with the mapping V defined by (1.6):
Theorem 1.3.18. If I and J are ideals in k[x1 , . . . , xn ], then
1. V(I + J) = V(I) ∩ V(J) and
2. V(I ∩ J) = V(I) ∪ V(J).
Proof. Exercise 1.40 (or see, for example, [60]). 
In Definition 1.1.10 we defined the ideal I(V ) of an affine variety. As the reader
showed in Exercise 1.7, the fact that V was an affine variety was not needed in the
definition, which could be applied equally well to any subset S of kn to produce an
ideal I(S). (However, the equality V(I(V )) = V need no longer hold; see Exercise
1.41.) Thus for any set S in kn we define I(S) as the set of all elements of k[x1 , . . . , xn ]
that vanish on S. Then I(S) is an ideal in k[x1 , . . . , xn ], and the first few lines in the
proof of Theorem 1.3.15 show that it is actually a radical ideal. There is also an
affine variety naturally attached to any subset of kn .
Definition 1.3.19. The Zariski closure of a set S ⊂ kn is the smallest variety con-
taining S. It will be denoted by S̄.
By part (b) of Exercise 1.3 the Zariski closure is well-defined: one merely takes
the intersection of every variety that contains S. Another characterization of the
Zariski closure will be given in Proposition 1.3.21. By Exercise 1.3, Proposition
1.1.4, and the obvious facts that S ⊂ S̄, that the closure of S̄ is S̄, and that the closure of ∅ is ∅, the mapping from
the power set of kn into itself defined by S → S̄ is a Kuratowski closure operation,
hence can be used to define a topology on kn (Theorem 3.7 of [200]), the Zariski
topology. When k is R or C with the usual topology, the Zariski closure of a set S is a
topologically closed set that contains S. If Cℓ(S) is the topological closure of S, then
S ⊂ Cℓ(S) ⊂ S̄; the second inclusion, like the first, could be strict. See Exercise 1.42.
The observation contained in the following proposition, which will prove useful
later, underscores the analogy between the Zariski closure and topological closure.
We have stated it as we have in order to emphasize this analogy.
Proposition 1.3.20. Suppose P denotes a property that can be expressed in terms of polynomial conditions f1 = · · · = fs = 0 and that it is known to hold on a subset V \ S of a variety V . If the Zariski closure of V \ S is all of V , then P holds on all of V .

Proof. Let W = V( f1 , . . . , fs ). Then the set of all points of kn on which property P holds is the affine variety W , our assumption is just that V \ S ⊂ W , and we must show that V ⊂ W . If V ∩ S = ∅, then V \ S = V , so the desired conclusion follows automatically from the assumption V \ S ⊂ W . If V ∩ S ≠ ∅, then V \ S ≠ V , so the set V \ S is not itself a variety, since it is not equal to the smallest variety that contains it, which by hypothesis is V . But W is a variety and it does contain V \ S, hence W contains the smallest variety that contains V \ S, namely V , as required. □
Here is the characterization of the Zariski closure of a set mentioned earlier.
Proposition 1.3.21. Let S be a subset of kn . The Zariski closure of S is equal to
V(I(S)) :
S̄ = V(I(S)). (1.41)
Proof. To establish (1.41) we must show that if W is any affine variety that con-
tains S, then V(I(S)) ⊂ W . Since S ⊂ W , I(W ) ⊂ I(S), therefore by Theorem 1.1.14
V(I(S)) ⊂ V(I(W )) = W . 
Proposition 1.3.22. For any set S ⊂ kn ,

I(S̄) = I(S). (1.42)

Proof. Since S ⊂ S̄, Theorem 1.1.14 yields I(S̄) ⊂ I(S), so we need only show that

I(S) ⊂ I(S̄). (1.43)

If (1.43) fails, then there exist a polynomial f in I(S) and a point a ∈ S̄ \ S such that f (a) ≠ 0. Let V be the affine variety defined by V = S̄ ∩ V( f ). Then a ∈ S̄ but a ∉ V (since f (a) ≠ 0), so V ⊊ S̄. Because S ⊂ S̄ and S ⊂ V( f ) (since f ∈ I(S)), S ⊂ V . Thus S ⊂ V ⊊ S̄, in contradiction to the fact that S̄ is the smallest variety that contains the set S. □
Just above we defined the sum and intersection of polynomial ideals. From The-
orem 1.3.18 we see that the first of these ideal operations corresponds to the in-
tersection of their varieties, while the other operation corresponds to the union of
their varieties. In order to study the set theoretic difference of varieties we need to
introduce the notion of the quotient of two ideals.
Definition 1.3.23. Let I and J be ideals in k[x1 , . . . , xn ]. Their ideal quotient I : J is
the ideal
I : J = { f ∈ k[x1 , . . . , xn ] : f g ∈ I for all g ∈ J}.
In Exercise 1.43 we ask the reader to prove that I : J is indeed an ideal in
k[x1 , . . . , xn ].
The next theorem shows that the formation of the quotient of ideals is an alge-
braic operation that closely corresponds to the geometric operation of forming the
difference of two varieties. It is useful in the context of applications of Proposition
1.3.20, even when the set S is not itself a variety, for then we search for a variety V1 ⊂ S for which the Zariski closure of V \ V1 is still all of V .

Theorem 1.3.24. If I and J are ideals in k[x1 , . . . , xn ], then

\overline{V(I) \ V(J)} ⊂ V(I : J) , (1.44)

where the overline indicates Zariski closure. If k = C and I is a radical ideal, then

\overline{V(I) \ V(J)} = V(I : J).

Proof. We first show that

I : J ⊂ I(\overline{V(I) \ V(J)}). (1.45)

Take any f ∈ I : J and a ∈ V(I) \ V(J). Because a ∉ V(J) there exists g̃ ∈ J such that g̃(a) ≠ 0. Since f ∈ I : J and g̃ ∈ J, f g̃ ∈ I. Because a ∈ V(I), f (a)g̃(a) = 0, hence f (a) = 0. Thus f (a) = 0 for any a ∈ V(I) \ V(J), which yields the inclusion f ∈ I(V(I) \ V(J)). Since, by Proposition 1.3.22, I(V(I) \ V(J)) = I(\overline{V(I) \ V(J)}), (1.45) holds, and applying Theorem 1.1.14 we see that (1.44) is true as well.
To prove the reverse inclusion when k = C and I is a radical ideal, first suppose h ∈ I(\overline{V(I) \ V(J)}). Then for any g ∈ J, hg vanishes on V(I), hence by Theorem 1.3.14, hg ∈ √I, and because √I = I, hg ∈ I. Thus we have shown that

h ∈ I(\overline{V(I) \ V(J)}) implies hg ∈ I for all g ∈ J . (1.46)

Now fix any a ∈ V(I : J). By the definition of V(I : J), this means that if h is any
polynomial with the property that hg ∈ I for all g ∈ J, then h(a) = 0:

hg ∈ I for all g ∈ J implies h(a) = 0 . (1.47)

Thus if h ∈ I(\overline{V(I) \ V(J)}), then by (1.46) the antecedent in (1.47) is true, hence h(a) = 0, meaning that a ∈ V(I(\overline{V(I) \ V(J)})). Taking into account Propositions 1.3.21 and 1.3.22,

V(I : J) ⊂ V(I(\overline{V(I) \ V(J)})) = \overline{V(I) \ V(J)} . □

Suppose I and J are ideals in k[x1 , . . . , xn ] for which we have explicit finite bases,
say I = h f1 , . . . , fu i and J = hg1 , . . . , gv i. We wish to use these bases to obtain bases
for I + J, I ∩ J, and I : J. For I + J the answer is simple and is presented in Exer-
cise 1.36. To solve the problem for I ∩ J we need the expression for I ∩ J in terms
of an ideal in a polynomial ring with one additional indeterminate as given by the
following proposition. The equality in the proposition is proved by treating its constituents only as sets. See Exercise 1.37 for the truth of the more general equality
of ideals that it suggests.

Proposition 1.3.25. If I = ⟨ f1 , . . . , fu ⟩ and J = ⟨g1 , . . . , gv ⟩ are ideals in k[x1 , . . . , xn ], then

I ∩ J = ⟨t f1 , . . . , t fu , (1 − t)g1 , . . . , (1 − t)gv ⟩ ∩ k[x1 , . . . , xn ] .

Proof. If f ∈ I ∩ J, then there exist h j , h′j ∈ k[x1 , . . . , xn ] such that

f = t f + (1 − t) f = t(h1 f1 + · · · + hu fu ) + (1 − t)(h′1 g1 + · · · + h′v gv ).



Conversely, if there exist h j , h′j ∈ k[t, x1 , . . . , xn ] such that

f (x) = h1 (x, t) t f1 (x) + · · · + hu (x, t) t fu (x) + h′1 (x, t)(1 − t)g1 (x) + · · · + h′v (x, t)(1 − t)gv (x),

then setting t = 0 we obtain f (x) = h′1 (x, 0)g1 (x) + · · · + h′v (x, 0)gv (x) ∈ J and setting t = 1 we obtain f (x) = h1 (x, 1) f1 (x) + · · · + hu (x, 1) fu (x) ∈ I. □

If we order the variables t > x1 > · · · > xn , then in the language of Definition
1.3.1 the proposition says that I ∩ J is the first elimination ideal of the ideal in
k[t, x1 , . . . , xn ] given by ⟨t f1 , . . . , t fu , (1 − t)g1 , . . . , (1 − t)gv ⟩. The proposition and
the Elimination Theorem (Theorem 1.3.2) together thus yield the algorithm pre-
sented in Table 1.5 for computing a generating set for I ∩ J from generating sets of
I and J. To obtain a basis for the ideal I : J from bases for I and J we will need two
more results.

Algorithm for Computing I ∩ J

Input:
Ideals I = ⟨ f1 , . . . , fu ⟩ and J = ⟨g1 , . . . , gv ⟩ in k[x1 , . . . , xn ]

Output:
A Gröbner basis G for I ∩ J

Procedure:
1. Compute a Gröbner basis G′ of
⟨t f1 (x), . . . , t fu (x), (1 − t)g1 (x), . . . , (1 − t)gv (x)⟩
in k[t, x1 , . . . , xn ] with respect to lex with t > x1 > · · · > xn .
2. G = G′ ∩ k[x1 , . . . , xn ]

Table 1.5 Algorithm for Computing I ∩ J
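The algorithm of Table 1.5 can be sketched in a few lines. The version below assumes the sympy library (the helper name intersect_ideals is ours); it returns the t-free elements of the lex Gröbner basis as generators of I ∩ J.

```python
from sympy import groebner, symbols

def intersect_ideals(F, H, variables):
    """Generators of <F> ∩ <H> via the t-trick of Table 1.5:
    compute a lex Groebner basis of <t*F, (1-t)*H> with t ordered first,
    then keep the basis elements that do not involve t."""
    t = symbols('t_aux')  # auxiliary variable, ordered before x1, ..., xn
    G = groebner([t*f for f in F] + [(1 - t)*h for h in H],
                 t, *variables, order='lex')
    return [g for g in G.exprs if t not in g.free_symbols]

x, y = symbols('x y')
print(intersect_ideals([x], [y], [x, y]))  # [x*y]: <x> ∩ <y> = <x*y>
```

By the Elimination Theorem the t-free basis elements form a Gröbner basis of the first elimination ideal, which Proposition 1.3.25 identifies with I ∩ J.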

Proposition 1.3.26. Suppose I is an ideal in k[x1 , . . . , xn ] and g is a nonzero element of k[x1 , . . . , xn ]. If {h1 , . . . , hs } is a basis for I ∩ ⟨g⟩, then {h1 /g, . . . , hs /g} is a basis for I : ⟨g⟩.

Proof. First note that for each j, 1 ≤ j ≤ s, because h j ∈ ⟨g⟩, h j /g is a polynomial, and because h j ∈ I, h j /g is in I : ⟨g⟩. For any element f of I : ⟨g⟩, g f ∈ I ∩ ⟨g⟩, hence by hypothesis g f = ∑_{j=1}^s u j h j for some choice of u j ∈ k[x1 , . . . , xn ], from which it follows that f = ∑_{j=1}^s u j (h j /g). □

Proposition 1.3.27. Suppose that I and J1 , . . . , Jm are ideals in k[x1 , . . . , xn ]. Then

I : (∑_{s=1}^m Js ) = ∩_{s=1}^m (I : Js ) .

Proof. Exercise 1.44. 

Using the fact that any ideal J = ⟨g1 , . . . , gs ⟩ can be represented as the sum of the principal ideals of its generators, J = ⟨g1 ⟩ + · · · + ⟨gs ⟩, we obtain from Propositions 1.3.25, 1.3.26, and 1.3.27 the algorithm presented in Table 1.6 for computing generators of the ideal quotient.

Algorithm for Computing I : J

Input:
Ideals I = ⟨ f1 , . . . , fu ⟩ and J = ⟨g1 , . . . , gv ⟩ in k[x1 , . . . , xn ]

Output:
A basis { fv1 , . . . , fvpv } for I : J

Procedure:
FOR j = 1, . . . , v
Compute I ∩ ⟨g j ⟩ = ⟨h j1 , . . . , h jm j ⟩
FOR j = 1, . . . , v
Compute I : ⟨g j ⟩ = ⟨h j1 /g j , . . . , h jm j /g j ⟩
K := I : ⟨g1 ⟩
FOR j = 2, . . . , v
Compute K := K ∩ (I : ⟨g j ⟩) = ⟨ f j1 , . . . , f j p j ⟩

Table 1.6 Algorithm for Computing I : J
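Combining the intersection routine with exact division gives a working sketch of the principal-divisor step of Table 1.6. The code below assumes the sympy library (the helper name quotient_by_principal is ours) and uses Proposition 1.3.26: a basis of I : ⟨g⟩ is obtained by dividing a basis of I ∩ ⟨g⟩ through by g.

```python
from sympy import groebner, symbols, div

def quotient_by_principal(F, g, variables):
    """Basis of I : <g> for I = <F> (Proposition 1.3.26):
    compute I ∩ <g> by the t-trick of Table 1.5, then divide each
    generator by g; the division is exact since each lies in <g>."""
    t = symbols('t_aux')
    G = groebner([t*f for f in F] + [(1 - t)*g], t, *variables, order='lex')
    inter = [p for p in G.exprs if t not in p.free_symbols]  # basis of I ∩ <g>
    return [div(h, g, *variables)[0] for h in inter]

x, y = symbols('x y')
print(quotient_by_principal([x*y], x, [x, y]))  # [y]: <x*y> : <x> = <y>
```

For a non-principal J one would, as in Table 1.6, intersect the quotients by the principal ideals ⟨g1 ⟩, . . . , ⟨gv ⟩ (Proposition 1.3.27).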

1.4 Decomposition of Varieties

Consider the ideal I = ⟨x^3 y^3 , x^2 z^2 ⟩. It is obvious that its variety V = V(I) is the union of two varieties, V = V1 ∪ V2 , where V1 is the plane x = 0 and V2 is the line {(x, y, z) : y = 0 and z = 0}. It is geometrically clear that we cannot further decompose either V1 or V2 into a union of two varieties as we have just done for V . Of course, each could be decomposed as the uncountable union of its individual points, but we are thinking in terms of finite unions. With this restriction, our variety V is the union of two irreducible varieties V1 and V2 .

Definition 1.4.1. A nonempty affine variety V ⊂ kn is irreducible if V = V1 ∪ V2 , for affine varieties V1 and V2 , only if either V1 = V or V2 = V .

The ideal I = ⟨x^3 y^3 , x^2 z^2 ⟩ considered above defines the same variety as any ideal ⟨x^p y^q , x^r z^s ⟩, where p, q, r, s ∈ N. Among all these ideals the one having the simplest description is the radical of I, √I = ⟨xy, xz⟩, and it is easily seen that √I is the intersection of the two radical ideals ⟨x⟩ and ⟨y, z⟩, which in turn correspond to the irreducible components of V(I). The radical ideals that define irreducible varieties are the so-called prime ideals.
Definition 1.4.2. A proper ideal I ⊊ k[x1 , . . . , xn ] is a prime ideal if f g ∈ I implies that either f ∈ I or g ∈ I.
The following property of prime ideals is immediate from the definition.
Proposition 1.4.3. Every prime ideal is a radical ideal.
Another interrelation between prime and radical ideals is the following fact.
Proposition 1.4.4. Suppose P1 , . . . , Ps are prime ideals in k[x1 , . . . , xn ]. Then the ideal I = ∩_{j=1}^s Pj is a radical ideal.
Proof. Assume that for some p ∈ N, f^p ∈ ∩_{j=1}^s Pj . Then for each j, 1 ≤ j ≤ s, f^p ∈ Pj , a prime ideal, which by Proposition 1.4.3 is a radical ideal, hence f ∈ Pj . Therefore f ∈ ∩_{j=1}^s Pj = I, so that I is a radical ideal. □
We know from Theorem 1.3.15 that in the case k = C there is a one-to-one cor-
respondence between radical ideals and affine varieties. We now show that there is
always a one-to-one correspondence between prime ideals and irreducible varieties.
Theorem 1.4.5. Let V ⊂ kn be a nonempty affine variety. Then V is irreducible if and only if I(V ) is a prime ideal.
Proof. Suppose V is irreducible and that f g ∈ I(V ). Setting V1 = V ∩ V( f ) and
V2 = V ∩V(g), V = V1 ∪V2 , hence either V = V1 or V = V2 . If V = V1 , then V ⊂ V( f ),
therefore f ∈ I(V ). If V = V2 , then g ∈ I(V ) similarly, so the ideal I(V ) is prime.
Conversely, suppose that I(V ) is prime and that V = V1 ∪ V2 . If V = V1 , then
there is nothing to show. Hence we assume V 6= V1 and must show that V = V2 . First
we will show that I(V ) = I(V2 ). Indeed, V2 ⊂ V , so Theorem 1.1.14 implies that
I(V ) ⊂ I(V2 ). For the reverse inclusion we note that by Proposition 1.1.12, I(V ) is
strictly contained in I(V1 ), I(V ) ⊊ I(V1 ), because V1 is a proper subset of V (possibly
the empty set). Therefore there exists a polynomial

f ∈ I(V1 ) \ I(V ). (1.48)

Choose any polynomial g ∈ I(V2 ). Then since f ∈ I(V1 ) and g ∈ I(V2 ), by the fact
that V = V1 ∪ V2 and (1.40), f g ∈ I(V ). Because I(V ) is prime, either f ∈ I(V ) or
g ∈ I(V ). But (1.48) forces g ∈ I(V ), yielding I(V2 ) ⊂ I(V ), hence I(V ) = I(V2 ). But
then by part (2) of Proposition 1.1.12, V = V2 , so that V is irreducible. 
We saw above that the variety of any ideal ⟨x^p y^q , x^r z^s ⟩, where p, q, r, s ∈ N, is a
union of a plane and a line. As one would expect, the possibility of decomposing a
variety into a finite union of irreducible varieties is a general property:

Theorem 1.4.6. Let V ⊂ kn be an affine variety. Then V is a union of a finite number of irreducible varieties.

Proof. If V is not itself irreducible, then V = VL ∪ VR , where VL and VR are affine varieties but neither is V . If they are both irreducible, then the proof is complete. Otherwise, one or both of them decomposes into a union of proper subsets, each one an affine variety, say V = (VLL ∪ VLR ) ∪ (VRL ∪ VRR ). If every one of these four sets is irreducible, the proof is complete; otherwise, the decomposition continues. In this way a collection of strictly descending chains of affine varieties is created, such as (with the obvious meaning to the notation)

V ⊋ VL ⊋ VLR ⊋ VLRL ⊋ · · · ,

each of which in turn generates a strictly increasing chain of ideals, such as

I(V ) ⊊ I(VL ) ⊊ I(VLR ) ⊊ I(VLRL ) ⊊ · · ·

in k[x1 , . . . , xn ]. (The inclusions are proper because, by Theorem 1.1.14, I is injective.) By Corollary 1.1.7 every such chain of ideals terminates, hence (again by
injectivity of I) each chain of affine varieties terminates, and so the decomposition
of V terminates with the expression of V as a union of finitely many irreducible
affine varieties. 

A decomposition V = V1 ∪ · · · ∪ Vm of the variety V ⊂ kn into a finite union of irreducible subvarieties is called a minimal decomposition if Vi ⊄ V j for i ≠ j.

Theorem 1.4.7. Every variety V ⊂ kn has a minimal decomposition

V = V1 ∪ · · · ∪Vm , (1.49)

and this decomposition is unique up to the order of the V j in (1.49).

The proof is left as Exercise 1.48 (or see [60]).


Note that for the statement of uniqueness in the theorem above it is important that
the minimal decomposition be a union of a finite number of varieties. Otherwise, for
instance, a plane can be represented as the union of points or the union of lines, and
they, of course, are different decompositions.
We already knew that an intersection of prime ideals is a radical ideal. As a direct
corollary of Theorems 1.3.15, 1.4.5, and 1.4.7 and identity (1.40), when k = C we
obtain as a converse the following property of radical ideals.

Theorem 1.4.8. Every radical ideal I ⊂ C[x1 , . . . , xn ] can be uniquely represented as an intersection of prime ideals, I = ∩_{j=1}^m Pj , where Pr ⊄ Ps if r ≠ s.

Our goal is to understand the set of solutions of a given system

f1 (x1 , . . . , xn ) = 0, . . . , fs (x1 , . . . , xn ) = 0 (1.50)



of polynomial equations, which, as we know (Proposition 1.1.9), really depends on


the ideal I = h f1 , . . . , fs i rather than on the individual polynomials themselves. The
solution set of (1.50) is the variety V = V( f1 , . . . , fs ) = V(I). To solve the system
is to obtain a description of V that is as complete and explicit as possible. If V is
composed of a finite number c of points of kn , then as illustrated in Example 1.3.3,
we can find them by computing a Gröbner basis of the ideal I with respect to lex
and then iteratively finding roots of univariate polynomials. It is known that this
procedure will always yield full knowledge of the solution set V (including the case
c = 0, for then the reduced Gröbner basis is {1}). If V is not finite, then we turn to
the fact that there must exist a unique minimal decomposition (1.49); the best way
to describe V is to find this minimal decomposition. Application of I to (1.49) yields

I(V ) = I(V1 ) ∩ · · · ∩ I(Vm ) , (1.51)

where each term on the right is a prime ideal (Theorem 1.4.5). These prime ideals
correspond precisely to the irreducible varieties that we seek. However, unlike I(V ),
the ideal I with which we have to work need not be radical, and from Proposition
1.4.4 and Theorem 1.4.8 it is apparent that radical ideals, but only radical ideals,
can be represented as intersections of prime ideals. A solution is to work over C and appeal to Proposition 1.3.16: supposing that the prime decomposition of √I is ∩_{j=1}^m Pj and applying the map V then yields

V = V(I) = V(√I) = V(∩_{j=1}^m Pj ) = ∪_{j=1}^m V(Pj ) , (1.52)

and each term V(Pj ) specifies the irreducible subvariety V j . A first step in following this procedure is to compute √I from the known ideal I, and there exist algorithms for computing the radical of an arbitrary ideal ([81, 173, 192]). It is more convenient to work directly with I, however; in order to decompose it into an intersection of ideals, we need the following weaker condition on the components of the decomposition.

Definition 1.4.9. An ideal I ⊂ k[x1 , . . . , xn ] is called a primary ideal if, for any pair f , g ∈ k[x1 , . . . , xn ], f g ∈ I only if either f ∈ I or g^p ∈ I for some p ∈ N.

An ideal I is primary if and only if √I is prime (Exercise 1.49); √I is called the associated prime ideal of I.

Definition 1.4.10. A primary decomposition of an ideal I ⊂ k[x_1, . . . , x_n] is a representation of I as a finite intersection of primary ideals Q_j:

I = ∩_{j=1}^m Q_j .   (1.53)

The decomposition (1.53) is called a minimal primary decomposition if the associated
prime ideals √Q_j are all distinct and ∩_{i≠j} Q_i ⊄ Q_j for any j.

A minimal primary decomposition of a polynomial ideal always exists, but it is
not necessarily unique:
42 1 Polynomial Ideals and Their Varieties

Theorem 1.4.11 (Lasker–Noether Decomposition Theorem). Every ideal I in
k[x_1, . . . , x_n] has a minimal primary decomposition (1.53). All such decompositions
have the same number m of primary ideals and the same collection of associated
prime ideals.
The reader can find a proof in [18, 60].
Returning to the problem of finding the variety of the ideal I generated by the
polynomials in (1.50), whose primary decomposition is given by (1.53), by (1.39)
the prime decomposition of √I is √I = ∩_{j=1}^m √Q_j =: ∩_{j=1}^m P_j. Now apply V, which
by Theorem 1.3.18(2) yields (1.52), in which each term V(P_j) specifies the irreducible
subvariety V_j. There exist algorithms for computing the primary decomposition
and, as mentioned above, the radical of a given ideal ([81, 173, 192]). They
are time- and memory-consuming, however, and have yet to be implemented in
some general-purpose computer algebra systems. They can be found in Maple and
in specialized computer systems designed specifically for algebraic computations,
like CALI ([85]), Macaulay ([88]), and Singular ([89]). We illustrate the ideas with
a pair of examples.
Example 1.4.12. We look at Example 1.3.3 again, this time using Singular in order
to obtain a prime decomposition of the ideal I = ⟨f_1, . . . , f_4⟩. Code for carrying out
the decomposition is as follows.
> LIB "primdec.lib";
> ring r=0,(x,y,z),dp;
> poly f1=y^2+x+z-1;
> poly f2=x^2+2*y*x+2*z*x-4*x-3*y+2*y*z-3*z+3;
> poly f3=z^2+x+y-1;
> poly f4=2*x^3+6*z*x^2-5*x^2-4*x-7*y+4*y*z-7*z+7;
> ideal i=f1,f2,f3,f4;
> primdecGTZ(i);
The first command loads a Singular library that enables computation of primary
and prime decompositions. The second command declares that the polynomial
ring involved has characteristic zero, that the variables are x, y, and z in the order
x > y > z, and that the term order to be used (as specified by the parameter dp) is
degree reverse lexicographic order. The next four lines specify the polynomials in question;
in the following line ideal is the declaration that the ideal under investigation is
I = ⟨f_1, f_2, f_3, f_4⟩. Finally, primdecGTZ ([62]) commands the computation of a
primary decomposition of I using the Gianni–Trager–Zacharias algorithm ([81]).
The output is displayed in Table 1.7 on page 43. In the output the symbol z2 is
Singular's "short" notation for z^2. To have the output presented as z^2, switch off
the short output format with the command short=0.
The output is a list of pairs of ideals, where each ideal is, of course, specified by
a list of generators. In this case there are four pairs. The first ideal Q_j in each pair
is a primary ideal in a primary decomposition of I; the second ideal P_j in each pair
is the associated prime ideal, that is, the radical, of the first: P_j = √Q_j. Singular's
output indicates therefore that the ideal I in Example 1.4.12 is the intersection of the
four primary ideals
1.4 Decomposition of Varieties 43

[1]:
[1]:
_[1]=z2-2z+1
_[2]=y+z-1
_[3]=z2+x+y-1
[2]:
_[1]=z-1
_[2]=y
_[3]=z2+x+y-1
[2]:
[1]:
_[1]=z2-2z+1
_[2]=y-z
_[3]=z2+x+y-1
[2]:
_[1]=z-1
_[2]=y-1
_[3]=z2+x+y-1
[3]:
[1]:
_[1]=z2
_[2]=y-z
_[3]=z2+x+y-1
[2]:
_[1]=z
_[2]=y
_[3]=z2+x+y-1
[4]:
[1]:
_[1]=z2
_[2]=y+z-1
_[3]=z2+x+y-1
[2]:
_[1]=z
_[2]=y-1
_[3]=z2+x+y-1

Table 1.7 Singular Output of Example 1.4.12

Q1 = ⟨z^2 − 2z + 1, y + z − 1, z^2 + x + y − 1⟩
Q2 = ⟨z^2 − 2z + 1, y − z, z^2 + x + y − 1⟩
Q3 = ⟨z^2, y − z, z^2 + x + y − 1⟩
Q4 = ⟨z^2, y + z − 1, z^2 + x + y − 1⟩ .

We see that for each j, 1 ≤ j ≤ 4, √Q_j = P_j ≠ Q_j, so that Q_j is not prime. It is also
apparent that the varieties V(P_j) are the four points from (1.27).
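As an independent sanity check (ours, not from the book), a few lines of Python with SymPy confirm that the four points read off from the prime ideals P_j annihilate all four generators:

```python
from sympy import symbols

x, y, z = symbols('x y z')

# The generators f1, ..., f4 of the ideal I from Example 1.4.12.
f1 = y**2 + x + z - 1
f2 = x**2 + 2*y*x + 2*z*x - 4*x - 3*y + 2*y*z - 3*z + 3
f3 = z**2 + x + y - 1
f4 = 2*x**3 + 6*z*x**2 - 5*x**2 - 4*x - 7*y + 4*y*z - 7*z + 7

# The four points cut out by the prime ideals P_1, ..., P_4 (cf. Table 1.9).
points = [(0, 1, 0), (0, 0, 1), (-1, 1, 1), (1, 0, 0)]

all_vanish = all(f.subs({x: a, y: b, z: c}) == 0
                 for f in (f1, f2, f3, f4)
                 for (a, b, c) in points)
```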

Example 1.4.13. The discussion in Section 6.4 on the bifurcation of critical periods
from centers will be built around the system of differential equations

ẋ = x − a20 x^3 − a11 x^2 y − a02 x y^2 − a_{−1,3} y^3
ẏ = −y + b02 y^3 + b11 x y^2 + b20 x^2 y + b_{3,−1} x^3        (1.54)

(the negative indices are merely a product of the indexing system that will simplify
certain expressions). Associated to (1.54) is an infinite collection of polynomials
g_{kk}, k ∈ N, in the coefficients a_{pq} and b_{qp} of (1.54), called the focus quantities of
the system. Here we consider the ideal I generated by just the first five focus quantities,
I = ⟨g_{11}, . . . , g_{55}⟩. They are the polynomials denoted in the Singular code that
follows by g11, . . . , g55. To carry out the primary decomposition of I we use the
following Singular code (where a13 stands for a_{−1,3} and b31 stands for b_{3,−1}):
>LIB "primdec.lib";
>ring r=0,(a20,a11,a02,a13,b31,b20,b11,b02),dp;
>poly g11=a11-b11;
>poly g22=a20*a02-b02*b20;
>poly g33=(3*a20^2*a13+8*a20*a13*b20+3*a02^2*b31
-8*a02*b02*b31-3*a13*b20^2-3*b02^2*b31)/8;
>poly g44=(-9*a20^2*a13*b11+a11*a13*b20^2
+9*a11*b02^2*b31-a02^2*b11*b31)/16;
>poly g55=(-9*a20^2*a13*b02*b20+a20*a02*a13*b20^2
+9*a20*a02*b02^2*b31+18*a20*a13^2*b20*b31
+6*a02^2*a13*b31^2-a02^2*b02*b20*b31
-18*a02*a13*b02*b31^2
-6*a13^2*b20^2*b31)/36;
>ideal i = g11,g22,g33,g44,g55;
>primdecSY(i);
The only significant difference between this code and that of the previous example is
the last line; primdecSY commands the computation of a primary decomposition
of I using the Shimoyama–Yokoyama algorithm ([173]). The output is displayed in
Table 1.8 on page 45.
Thus in this case the ideal I is the intersection of three primary ideals

Q1 = ⟨a02 − 3b02, a11 − b11, 3a20 − b20⟩
Q2 = ⟨b11, 3a02 + b02, a11, a20 + 3b20, 3a_{−1,3}b_{3,−1} + 4b20 b02⟩
Q3 = ⟨a11 − b11, a20 a02 − b20 b02, a20 a_{−1,3} b20 − a02 b_{3,−1} b02,
      a02^2 b_{3,−1} − a_{−1,3} b20^2, a20^2 a_{−1,3} − b_{3,−1} b02^2⟩
which correspond to the first ideals of each pair in the output. In contrast to the first
example, in this case the second ideal in each output pair is the same as the first one,
which means that Q_j = √Q_j = P_j for j = 1, 2, 3, hence each Q_j is prime (see Exercise 1.49).
Thus, by Proposition 1.4.4, I is a radical ideal. The variety V(I) is the union of three
irreducible varieties V(P_1), V(P_2), and V(P_3).
Note that in many cases the simplest output form of the minimal associated prime
ideals is provided by Singular’s routine minAssChar, which computes the mini-
mal associated primes by the characteristic sets method ([198]). For instance, for the

[1]:
[1]:
_[1]=a02-3*b02
_[2]=a11-b11
_[3]=3*a20-b20
[2]:
_[1]=a02-3*b02
_[2]=a11-b11
_[3]=3*a20-b20
[2]:
[1]:
_[1]=b11
_[2]=3*a02+b02
_[3]=a11
_[4]=a20+3*b20
_[5]=3*a13*b31+4*b20*b02
[2]:
_[1]=b11
_[2]=3*a02+b02
_[3]=a11
_[4]=a20+3*b20
_[5]=3*a13*b31+4*b20*b02
[3]:
[1]:
_[1]=a11-b11
_[2]=a20*a02-b20*b02
_[3]=a20*a13*b20-a02*b31*b02
_[4]=a02^2*b31-a13*b20^2
_[5]=a20^2*a13-b31*b02^2
[2]:
_[1]=a11-b11
_[2]=a20*a02-b20*b02
_[3]=a20*a13*b20-a02*b31*b02
_[4]=a02^2*b31-a13*b20^2
_[5]=a20^2*a13-b31*b02^2

Table 1.8 Singular Output of Example 1.4.13

polynomials of Example 1.4.12 the output is given in Table 1.9 on page 46. These
output ideals look simpler than those obtained by primdecGTZ, but of course they
are the same.
Finally, we turn to the problem of “rational implicitizations.” Consider the unit
circle in the real plane R2 , which is the real affine variety V = V(x2 + y2 − 1).
V is represented by the well-known parametrization (x, y) = (cos t, sin t). This is
not an algebraic parametrization, however, but a transcendental one. An algebraic
parametrization of the circle is given by the rational functions

x = (1 − t^2)/(1 + t^2) ,   y = 2t/(1 + t^2) ,   t ∈ R,   (1.55)

[1]:
_[1]=z
_[2]=y-1
_[3]=x
[2]:
_[1]=z-1
_[2]=y
_[3]=x
[3]:
_[1]=z-1
_[2]=y-1
_[3]=x+1
[4]:
_[1]=z
_[2]=y
_[3]=x-1

Table 1.9 Singular Output of Example 1.4.12 Using minAssChar

which covers all points of the variety V except the point (−1, 0). (For a derivation
of this parametrization see [60, §4.4].)
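That (1.55) really does land on the circle, and really does miss (−1, 0), can be checked symbolically; a quick sketch in Python with SymPy (our check, not from the book):

```python
from sympy import symbols, simplify, solve

t = symbols('t', real=True)
x = (1 - t**2)/(1 + t**2)
y = 2*t/(1 + t**2)

# The parametrized point lies on x^2 + y^2 = 1 for every t.
residue = simplify(x**2 + y**2 - 1)

# The point (-1, 0) is never reached: x = -1 has no finite solution in t.
missed = solve(x + 1, t)
```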
Now we consider the opposite problem: given a rational or even a polynomial
parametrization of a subset S of kn , such as (1.55) for the unit circle, try to in essence
eliminate the parameters so as to express the set in terms of polynomials in x1 , . . . , xn .
Hence suppose we are given the system of equations

x_1 = f_1(t_1, . . . , t_m)/g_1(t_1, . . . , t_m) , . . . , x_n = f_n(t_1, . . . , t_m)/g_n(t_1, . . . , t_m) ,   (1.56)

where f_j, g_j ∈ k[t_1, . . . , t_m] for j = 1, . . . , n. Let W = V(g_1 · · · g_n). Equations (1.56)
define a function
F : k^m \ W → k^n
by the formula
 
F(t_1, . . . , t_m) = ( f_1(t_1, . . . , t_m)/g_1(t_1, . . . , t_m) , . . . , f_n(t_1, . . . , t_m)/g_n(t_1, . . . , t_m) ) .   (1.57)

The image of k^m \ W under F, which we denote by F(k^m \ W), is not necessarily an
affine variety. For instance, for system (1.55) with k = R, W = ∅ and F(R) is not
the circle x^2 + y^2 = 1, but the circle with the point (−1, 0) deleted. Consequently,
we look for the smallest affine variety that contains F(k^m \ W), that is, its Zariski
closure. In the case of (1.55), this is the whole unit circle. The problem of finding
the Zariski closure of F(k^m \ W) is known as the problem of rational implicitization. If
the right-hand sides of (1.56) are polynomials, then it is the problem of polynomial
implicitization. The terminology comes from the fact that the collection of polynomials
f_1, . . . , f_s that determine the variety V(f_1, . . . , f_s) defines it only implicitly.
The following theorem gives an algorithm for polynomial implicitization.

Theorem 1.4.14. Let k be an infinite field, let f_1, . . . , f_n be elements of k[t_1, . . . , t_m],
and let F : k^m → k^n be the function defined by the equations

x_1 = f_1(t_1, . . . , t_m), . . . , x_n = f_n(t_1, . . . , t_m).   (1.58)

Form the ideal I = ⟨f_1 − x_1, . . . , f_n − x_n⟩ ⊂ k[t_1, . . . , t_m, x_1, . . . , x_n]. Then the smallest
variety in k^n that contains F(k^m), the image of k^m under F, is the variety V(I_m) of
the mth elimination ideal I_m = I ∩ k[x_1, . . . , x_n].

Proof. We prove the theorem only for the case k = C; for the general case see [60].
The function F is defined by

F(t1 , . . . ,tm ) = ( f1 (t1 , . . . ,tm ), . . . , fn (t1 , . . . ,tm )). (1.59)

Its graph is the subset of C^{m+n} for which each equation in (1.58) holds, hence is
precisely the affine variety V = V(I). The set F(C^m) ⊂ C^n is simply the projection
of the graph of the function F onto C^n. That is, F(C^m) = π_m(V), where π_m is the
projection from C^{m+n} to C^n defined by π_m(t_1, . . . , t_m, x_1, . . . , x_n) = (x_1, . . . , x_n). By
the Closure Theorem (Theorem 1.3.8), V(I_m) is the smallest variety that contains the
set π_m(V). 
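The elimination step in this proof is exactly what a lex Gröbner basis computes. A sketch of Theorem 1.4.14 in Python with SymPy, for the toy polynomial parametrization x = t^2, y = t^3 (our example, not the book's):

```python
from sympy import symbols, groebner

t, x, y = symbols('t x y')

# The ideal I = <f1 - x1, f2 - x2> for the parametrization x = t^2, y = t^3.
I = [t**2 - x, t**3 - y]

# Lex with t greatest: the basis elements free of t generate I_1 = I ∩ k[x, y].
G = groebner(I, t, x, y, order='lex')
I1 = [g for g in G.exprs if not g.has(t)]
# V(I_1) is the smallest variety containing the image: here the cuspidal cubic
# x^3 = y^2, on which every parametrized point (t^2, t^3) clearly lies.
```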

To obtain an algorithm for rational implicitization we add an additional variable
in order to eliminate the influence of vanishing denominators.

Theorem 1.4.15. Let k be an infinite field, let f_1, . . . , f_n and g_1, . . . , g_n be elements
of k[t_1, . . . , t_m], let W = V(g_1 · · · g_n), and let F : k^m \ W → k^n be the function defined
by equation (1.57). Set g = g_1 · · · g_n. Consider the ideal

J = ⟨f_1 − g_1 x_1, . . . , f_n − g_n x_n, 1 − gy⟩ ⊂ k[y, t_1, . . . , t_m, x_1, . . . , x_n],

and let
J_{m+1} = J ∩ k[x_1, . . . , x_n]   (1.60)
be the (m + 1)st elimination ideal. Then V(J_{m+1}) is the smallest variety in k^n containing
F(k^m \ W).

For a proof see [60]. We will use this theorem in our derivation of the center
variety of complex quadratic systems (Theorem 3.7.1) and later in our investigation
of the symmetry ideal of more general families (Section 5.2).
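Applied to the circle parametrization (1.55), the recipe of Theorem 1.4.15 recovers x^2 + y^2 − 1. A sketch in Python with SymPy (our check; u plays the role of the theorem's extra variable y, which is already taken as a coordinate here):

```python
from sympy import symbols, groebner

u, t, x, y = symbols('u t x y')

# f1 = 1 - t^2, f2 = 2t, g1 = g2 = 1 + t^2, so g = g1*g2.
g = (1 + t**2)**2
J = [(1 - t**2) - (1 + t**2)*x,
     2*t - (1 + t**2)*y,
     1 - g*u]                 # 1 - g*u forces the denominators to be nonzero

# Eliminate u and t: lex with u > t > x > y, then keep the (u, t)-free elements.
G = groebner(J, u, t, x, y, order='lex')
J2 = [p for p in G.exprs if not p.has(u) and not p.has(t)]
```

The surviving generator cuts out the whole circle, including the point (−1, 0) missed by the parametrization, illustrating that V(J_{m+1}) is the Zariski closure of the image.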

Definition 1.4.16. An affine variety V ⊂ k^n admits a rational parametrization (or
can be parametrized by rational functions) if there exists a function F of the
form (1.57), F : k^m \ W → k^n, where f_j, g_j ∈ k[t_1, . . . , t_m] for j = 1, . . . , n and
W = V(g_1 · · · g_n), such that V is the Zariski closure of F(k^m \ W).

Theorem 1.4.17. Let f_1, . . . , f_n and g_1, . . . , g_n be elements of C[t_1, . . . , t_m]. Suppose
I is an ideal in C[x_1, . . . , x_n] that satisfies

C[x_1, . . . , x_n] ∩ ⟨1 − tg, x_1 g_1 − f_1, . . . , x_n g_n − f_n⟩ = I ,   (1.61)

where g = g_1 g_2 · · · g_n and the f_j and g_j are evaluated at (t_1, . . . , t_m). If the variety
V(I) of I admits a rational parametrization (1.56), then I is a prime ideal in
C[x_1, . . . , x_n].
Corollary 1.4.18. If an affine variety V ⊂ Cn can be parametrized by rational func-
tions, then it is irreducible.
Proof. Let (1.56) be the parametrization of a variety V . Then by Theorem 1.4.15,
V = V(J_{m+1}), where J_{m+1} is defined by (1.60). However, J_{m+1} is the same as the
ideal I defined by (1.61). According to Theorem 1.4.17, the ideal I is prime, hence
radical (Proposition 1.4.3). Then using the Strong Hilbert Nullstellensatz (Theorem
1.3.14) in the second step, I(V) = I(V(I)) = √I = I is prime, so by Theorem 1.4.5
the variety V = V(I) is irreducible. 
The statement of the corollary remains true with C replaced by any infinite field
(for the proof see, for example, [60, Chap. 4]).

Proof of Theorem 1.4.17. It is sufficient to show that the ideal

H = ⟨1 − tg, x_1 g_1(t_1, . . . , t_m) − f_1(t_1, . . . , t_m), . . . , x_n g_n(t_1, . . . , t_m) − f_n(t_1, . . . , t_m)⟩

is prime in C[x1 , . . . , xn ,t1 , . . . ,tm ,t]. To do so we exploit the following facts of ab-
stract algebra:
(i) if ψ : R → R′ is a ring homomorphism, then the kernel ker(ψ ) = {r : ψ (r) = 0}
is an ideal in R (which means that it is a subring of R and for all r ∈ R and
s ∈ ker(ψ ), rs ∈ ker(ψ ), which the reader should be able to verify); and
(ii) if ker(ψ ) is a proper subset of R and R′ is an integral domain (which means
that ab = 0 only if either a = 0 or b = 0), then ker(ψ ) is prime (which is
true because ab ∈ ker(ψ ) means that 0 = ψ (ab) = ψ (a)ψ (b), which forces
ψ (a) = 0 or ψ (b) = 0 by the hypothesis on R′ ).
Let C(t1 , . . . ,tm ) denote the ring of rational functions of m variables with coeffi-
cients in C, and consider the ring homomorphism

ψ : C[x1 , . . . , xn ,t1 , . . . ,tm ,t] → C(t1 , . . . ,tm )

defined by

ti → ti , x j → f j (t1 , . . . ,tm )/g j (t1 , . . . ,tm ), t → 1/g(t1 , . . . ,tm ),

i = 1, . . . , m, j = 1, . . . , n. We will prove that H = ker(ψ ), which is clearly a proper


subset of C[x1 , . . . , xn ,t1 , . . . ,tm ,t]. It is immediate that H ⊂ ker(ψ ). We will show
the other inclusion by induction on the degree in just the variables x1 , . . . , xn ,t of the
polynomial h ∈ C[x1 , . . . , xn ,t1 , . . . ,tm ,t].
Basis step. Suppose that h ∈ ker(ψ ) and that h is linear in x1 , . . . , xn ,t, so that h
may be written as
h = ∑_{j=1}^n α_j(t_1, . . . , t_m) x_j + α(t_1, . . . , t_m) t + α_0(t_1, . . . , t_m).

Then α_0 t g = −∑_{j=1}^n α_j f_j t g̃_j − α t, where g̃_j = g/g_j and g, f_j, and g̃_j are all evaluated
at (t_1, . . . , t_m). Therefore h = ∑_{j=1}^n α_j (x_j g_j − f_j) t g̃_j − α t(1 − tg) + (1 − tg)h, so that
h ∈ H.
Inductive step. Now assume that for all polynomials of degree d in x1 , . . . , xn ,t,
if h ∈ ker(ψ ), then h ∈ H, and let h ∈ ker(ψ ) be of degree d + 1 in x1 , . . . , xn ,t. We
can write h as
h = ∑_{j=1}^n h_j(x_j, x_{j+1}, . . . , x_n, t_1, . . . , t_m, t) + ĥ(t_1, . . . , t_m, t) + h_0(t_1, . . . , t_m),

where every term of h_j contains x_j and every term of ĥ contains t, which allows us
to express h as h = u + v for

u = ∑_{j=1}^n (h_j/x_j)(x_j g_j − f_j) t g̃_j − ĥ(1 − tg) + (1 − tg)h ∈ H ⊂ ker(ψ)

and

v = t ∑_{j=1}^n f_j g̃_j (h_j/x_j) + ĥ + t g h_0 .

Since h, u ∈ ker(ψ ), v ∈ ker(ψ ), which clearly implies that v/t ∈ ker(ψ ). Then by
the induction hypothesis, v/t ∈ H, hence v ∈ H, so that h ∈ H as well. 

Example 1.4.19. As an illustration of the use to which we put these ideas, consider
the ideal Q3 of Example 1.4.13. Of course, we know already that Q3 is prime, so
that V(Q3 ) is irreducible. Imagine, however, that we merely suspect that V(Q3 ) is
irreducible and that we wish to establish its irreducibility directly. We attempt to
parametrize V(Q3 ). There is no systematic procedure for finding a parametrization;
one must proceed on a case-by-case basis. In this instance, the first generator of Q3
suggests that we introduce the single parameter u by writing a11 = b11 = u. The
equation a20 a02 − b20 b02 = 0 corresponding to the second generator suggests that
we will need three more parameters, since we have one equation (albeit nonlinear) in
three variables. If we arbitrarily select a20 from among a20 and a02 to play the role of
a parameter and similarly select b02 as a second parameter, then writing a20 = v and
b02 = w, we can satisfy the equation using just one more parameter s if we require
a02 = sw and b20 = sv. Inserting the expressions already specified into the third
generator of Q3 yields a20 a_{−1,3} b20 − a02 b_{3,−1} b02 = s(a_{−1,3} v^2 − b_{3,−1} w^2), which
can be made to vanish by the introduction of just one more parameter t if we require
a_{−1,3} = t w^2 and b_{3,−1} = t v^2. The last two generators then vanish automatically when
all the conditions specified so far are applied. Thus the set S ⊂ C8 defined by the
parametric equations
a11 = u    a02 = sw
b11 = u    b02 = w
a20 = v    a_{−1,3} = t w^2      (1.62)
b20 = sv   b_{3,−1} = t v^2

satisfies S ⊂ V(Q3). To show that S̄ = V(Q3) we apply Theorem 1.4.14. A Gröbner
basis of the ideal

I = ⟨a11 − u, b11 − u, a20 − v, b20 − sv, a02 − sw, b02 − w, a_{−1,3} − t w^2, b_{3,−1} − t v^2⟩

with respect to lex with

u > s > w > t > v > a11 > b11 > a20 > b20 > a02 > b02 > a_{−1,3} > b_{3,−1}

is

{a_{−1,3} b20^2 − a02^2 b_{3,−1}, a02 a20 − b02 b20, −a_{−1,3} a20 b20 + a02 b02 b_{3,−1},
a_{−1,3} a20^2 − b02^2 b_{3,−1}, a11 − b11, −a20 + v, a_{−1,3} − b02^2 t,
a02 b_{3,−1} − a20 b02 b20 t, b_{3,−1} − a20^2 t, b02 − w, −b_{3,−1} s + a20 b20 t,
a_{−1,3} s − a02 b02 t, −a02 + b02 s, b20 − a20 s, b11 − u}.

The first five polynomials listed in the Gröbner basis, the ones that do not depend
on any of s, t, u, v, or w, are the generators of Q3 . According to Theorem 1.4.14 the
variety V(Q3 ) is the Zariski closure of S. Thus, by Definition 1.4.16, (1.62) defines
a polynomial parametrization of V(Q3 ), which by Corollary 1.4.18 is irreducible.
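The substitution check at the start of the example is easy to automate; in Python with SymPy (our check, with a13 and b31 once more standing for a_{−1,3} and b_{3,−1}):

```python
from sympy import symbols, expand

u, s, w, t, v = symbols('u s w t v')
a20, a11, a02, a13, b31, b20, b11, b02 = symbols(
    'a20 a11 a02 a13 b31 b20 b11 b02')

# The five generators of Q3.
Q3 = [a11 - b11,
      a20*a02 - b20*b02,
      a20*a13*b20 - a02*b31*b02,
      a02**2*b31 - a13*b20**2,
      a20**2*a13 - b31*b02**2]

# The parametrization (1.62).
vals = {a11: u, b11: u, a20: v, b20: s*v,
        a02: s*w, b02: w, a13: t*w**2, b31: t*v**2}

# All generators vanish identically on the parametrized set, so S ⊂ V(Q3).
residues = [expand(g.subs(vals)) for g in Q3]
```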

Example 1.4.19 gives us an intuitive idea of the concept of the dimension of an
irreducible affine variety. The variety S̄ = V(Q3) is the Zariski closure of an image
of C^5; therefore, it is a five-dimensional variety. If a variety is not irreducible, then its
dimension is the maximal dimension of its components. For the precise definition of
the dimension of a variety and methods for computing it see, for example, [18] or [60].
Here we only observe that, with reference to Example 1.4.19, the naive supposition
that because the variety V(Q3) arises from five conditions on eight variables its
dimension must be 8 − 5 = 3 is incorrect.

1.5 Notes and Complements

Gröbner bases were introduced by Bruno Buchberger in 1965 ([26]) in the context of
his work on performing algorithmic computations in residue classes of polynomial
rings. He named them after his thesis advisor Wolfgang Gröbner, a professor at the
University of Innsbruck, who stimulated the research on the subject. The concept
of a Gröbner basis has proven to be an extremely useful computational tool and
has provided new insights in various subjects of modern algebra. It has also found
enormous application in studies of many fundamental problems in various branches
of mathematics and engineering. For further reading on Gröbner basis theory, its
development, and applications, the reader is referred to [1, 18, 28, 60, 184, 185].
We have written the algorithms in this and succeeding chapters in pseudocode
in order to facilitate the reader’s writing his own programs in the computer algebra

system or programming language of his choice. Each system has its own peculiar-
ities, which the user must take into account when using it. For instance, Singular
performs decompositions of polynomial ideals over the field Q and fields of positive
characteristic, but not over R or C. Thus although the decomposition (1.53),
I = ∩_{j=1}^m Q_j, and the companion decomposition √I = ∩_{j=1}^m √Q_j =: ∩_{j=1}^m P_j that it
gives are still valid over C, the components in the corresponding decomposition
(1.52), V = V(I) = V(√I) = V(∩_{j=1}^m P_j) = ∪_{j=1}^m V(P_j), might not all be irreducible
as varieties in C^n.
The decomposition of varieties that arise in actual applications, such as the prob-
lems in the qualitative theory of ordinary differential equations that we will study in
Chapters 3–6, can lead to computations so vast that they cannot be completed even
with rather powerful computers. In such cases it can be helpful to employ modular
arithmetic; see [70, 148].

Exercises

1.1 Let F2 = {0, 1}, define an addition by 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0,


and define a multiplication by 1 · 0 = 0, 0 · 0 = 0, 0 · 1 = 0, 1 · 1 = 1. Show that F2
is a field and that f = x2 − x is a nonzero element of the ring F2 [x] that defines
the zero function f : F2 → F2 .
1.2 The Factor Theorem states that for f ∈ k[x] and c ∈ k, f (c) = 0 (c is a root of f )
if and only if (x − c) divides f . Use this to show that a polynomial f of degree
s > 0 in k[x] has at most s roots.
1.3 a. Prove Proposition 1.1.4.
b. More generally, prove that if T is any indexing set whatsoever and if for each
t ∈ T the set Vt is an affine variety, then ∩t∈T Vt is an affine variety.
Hint. Use part (b) of Exercise 1.4 and the Hilbert Basis Theorem (Theorem
1.1.6).
1.4 a. Prove that the set ⟨f_1, . . . , f_s⟩ defined by (1.4) is an ideal in k[x_1, . . . , x_n].
b. More generally, prove that ⟨f : f ∈ F⟩ is an ideal in k[x_1, . . . , x_n].
1.5 Prove that congruence modulo an ideal I (Definition 1.1.8) is an equivalence
relation on k[x1 , . . . , xn ].
1.6 a. Let [ f ] denote the equivalence class of f in k[x1 , . . . , xn ]/I. Show that for any
f1 and f2 in [ f ] and for any g1 and g2 in [g], ( f1 + g1 ) − ( f2 + g2 ) ∈ I and
f1 g1 − f2 g2 ∈ I.
b. Use part (a) to show that addition and multiplication in k[x1 , . . . , xn ]/I can be
validly defined by [ f ] + [g] = [ f + g] and [ f ][g] = [ f g].
c. Show that the operations on k[x1 , . . . , xn ]/I of part (b) define a ring structure
on k[x1 , . . . , xn ]/I. Note in particular what plays the role of the zero object.
1.7 Prove that for any subset V of k^n, not necessarily an affine variety, the set I(V)
specified by Definition 1.1.10 is an ideal of k[x_1, . . . , x_n].
1.8 Prove that the maps (1.5) and (1.6) are inclusion-reversing.

1.9 The greatest common divisor of polynomials f and g in k[x], denoted GCD( f , g),
is the polynomial h such that (i) h divides both f and g, and (ii) if p ∈ k[x] di-
vides both f and g, then p divides h. Prove that
a. GCD( f , g) exists (and is unique except for multiplication by nonzero ele-
ments of k);
b. the Euclidean Algorithm, shown in Table 1.10, produces GCD( f , g).
Hint. The idea of the algorithm is that if R is the remainder upon division of F
by G, then the set of common divisors of F and G is precisely the set of common
divisors of G and R, so GCD(F, G) = GCD(G, R). Note that GCD( f , 0) = f .

The Euclidean Algorithm

Input:
Polynomials f , g ∈ k[x]

Output:
h = GCD( f , g)

Procedure:
h := f , s := g
WHILE s ≠ 0
DO
   h →_s r   (i.e., r := remainder on division of h by s)
   h := s
   s := r

Table 1.10 The Euclidean Algorithm
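Table 1.10 translates almost line for line into executable code; a sketch in Python with SymPy's polynomial division, normalizing the answer to be monic (the sample cubics are ours):

```python
from sympy import symbols, rem, LC, expand

x = symbols('x')

def euclid_gcd(f, g):
    """The Euclidean algorithm of Table 1.10: h := f, s := g, then
    repeatedly replace (h, s) by (s, remainder of h upon division by s)."""
    h, s = f, g
    while s != 0:
        r = rem(h, s, x)      # the step  h -> r  (division by s) of the table
        h, s = s, r
    # GCD is unique only up to a nonzero constant of k; make it monic.
    return expand(h / LC(h, x)) if h != 0 else h

# (x - 1)(x + 1)(x + 3) and (x + 1)(x - 3)(x + 3) share the factor (x + 1)(x + 3).
d = euclid_gcd(x**3 + 3*x**2 - x - 3, x**3 + x**2 - 9*x - 9)
```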

1.10 Suppose f , g ∈ k[x] and let d = GCD( f , g). Show that there exist polynomials
u1 , u2 ∈ k[x] such that d = u1 f + u2g. (Compare with Exercise 5.7.)
Hint. The polynomials u1 and u2 can be shown to exist and simultaneously
computed using the Extended Euclidean Algorithm, which is to write successive
passes through the Euclidean Algorithm as

f = q1 g + r1
g = q2 r1 + r2
r1 = q3 r2 + r3
..
.
rn−3 = qn−1 rn−2 + rn−1
rn−2 = qn rn−1 + 0

and retain all the quotients. For j ≥ 2, each remainder r j can be expressed in
terms of r1 through r j−1 , and r1 can be expressed in terms of f and g.
1.11 Use Exercise 1.10 to show that in k[x], ⟨f_1, f_2⟩ = ⟨GCD(f_1, f_2)⟩.
1.12 Show that any ideal in k[x] is generated by a single polynomial f .
Hint. The Hilbert Basis Theorem and the idea in Exercise 1.11.
1.13 Without factoring the polynomials involved, find a single generator of the ideal
in R[x] indicated.
a. I = ⟨x^2 − 4, x + 3⟩
b. I = ⟨x^3 + 3x^2 − x − 3, x^3 + x^2 − 9x − 9⟩
c. I = ⟨6x^3 − 11x^2 − 39x + 14, 4x^3 − 12x^2 − x − 21, 10x^3 − 35x^2 + 8x − 28⟩
1.14 Determine whether or not f ∈ I by finding a single generator of I, and then
applying the Division Algorithm.
a. f = 6x^3 + 12x^2 + 4x + 15, I = ⟨x − 5, x^2 + x + 1⟩
b. f = 2x^3 + 5x^2 − 6x − 9, I the ideal of Exercise 1.13(b)
c. f = x^4 − 7x^3 + 12x − 13, I the ideal of Exercise 1.13(b)
1.15 Show that for any term order > on k[x_1, . . . , x_n], if α ∈ N_0^n is not (0, . . . , 0), then
α > (0, . . . , 0), hence that if α + β = γ, then γ > α and γ > β.
1.16 Show that the term orders introduced in Definition 1.2.3 satisfy the properties
of Definition 1.2.1.
1.17 Suppose f, g, h, and p are in k[x_1, . . . , x_n] and g ≠ 0. Show that if f →_g h, then
(f + p) →_g (h + p).
1.18 a. Suppose f, g, h, p ∈ k[x_1, . . . , x_n] and p ≠ 0. Show that if g →_p h and f is a
monomial, then f g →_p f h.
b. Suppose f_1, . . . , f_s, f, g ∈ k[x_1, . . . , x_n] with f_j ≠ 0 for 1 ≤ j ≤ s, and write
F = {f_1, . . . , f_s}. Show that if g →_F h and f is a monomial, then f g →_F f h.
1.19 Suppose f_1, . . . , f_s, f, g ∈ k[x_1, . . . , x_n], f_s = f + g, and g ∈ ⟨f_1, . . . , f_{s−1}⟩. Show
that ⟨f_1, . . . , f_s⟩ = ⟨f_1, . . . , f_{s−1}, f⟩.
1.20 Suppose f_1, . . . , f_s, g ∈ k[x_1, . . . , x_n], f_j ≠ 0 for 1 ≤ j ≤ s, and set F = {f_1, . . . , f_s}.
Show that if g →_F h, then ⟨f_1, . . . , f_s, g⟩ = ⟨f_1, . . . , f_s, h⟩.
1.21 Show that for the polynomials of Example 1.2.9 the remainder upon division of
f by { f2 , f1 } is −2x + 1, whether the term order is lex, deglex, or degrevlex.
1.22 Find the remainder upon division of the polynomial f = x^7 y^2 + x^3 y^2 − y + 1 by
F = {x y^2 − x, x − y^3} (in the order listed), first with respect to the term order lex
on R[x, y], then with respect to the term order deglex on R[x, y].
1.23 In R[x, y, z] let g_1 = x + z, g_2 = y + z, and I = ⟨g_1, g_2⟩. Use Buchberger's Criterion
(Theorem 1.2.19) as in Example 1.2.20 to show that under lex with
x > y > z, G = {g_1, g_2} is a Gröbner basis of I, but under lex with x < y < z it
is not.
1.24 Prove Lemma 1.2.18.
Hint. Let a_j = LC(f_j) ≠ 0, 1 ≤ j ≤ s. Observe that ∑_j c_j a_j = 0 and that
S(f_j, f_ℓ) = a_j^{−1} f_j − a_ℓ^{−1} f_ℓ, and show that (1.14) can be represented in the form

f = c_1 a_1 S(f_1, f_2) + (c_1 a_1 + c_2 a_2) S(f_2, f_3) + · · · + (c_1 a_1 + · · · + c_{s−1} a_{s−1}) S(f_{s−1}, f_s).

1.25 Find the reduced Gröbner basis for the ideal ⟨f_1, f_2⟩ of Example 1.2.20.
1.26 Find an example of a set S ⊂ k[x, y] such that ⟨LT(S)⟩ ⊊ ⟨LT(⟨S⟩)⟩.
1.27 Prove Proposition 1.2.22.
Hint. Use the reasoning applied in the proof that (iv) implies (i) in Theorem
1.2.16.
1.28 Find two Gröbner bases for the ideal I = ⟨y^2 + yx + x^2, y + x, y⟩ in the polynomial
ring R[x, y] under the term order lex with y > x. Use Proposition 1.2.22 to make
two different choices of elements to discard so as to obtain two distinct minimal
Gröbner bases of I.
1.29 Given a Gröbner basis G of an ideal I, let us say that an element g of G is
reduced for G if no monomial in g lies in ⟨LT(G) \ {g}⟩, and then define G to
be reduced if each element of G is reduced for G and has leading coefficient
1. Show that this definition of a reduced Gröbner basis agrees with Definition
1.2.26.
1.30 Prove that a line with one point deleted is not a variety.
1.31 Prove Corollary 1.3.7.
1.32 Prove that for any ideal I, √I is an ideal.
Hint. For f^u, g^v ∈ I, u, v ∈ N, consider the binomial expansion of (f + g)^{u+v−1}.
1.33 Let k be a field and I an ideal in k[x_1, . . . , x_n]. Show that V(√I) = V(I).


1.34 Construct counterexamples to the analogues of Theorems 1.3.10 and 1.3.14 that
arise when C is replaced by R.
1.35 Let I = ⟨f_1, . . . , f_s⟩ and J = ⟨g_1, . . . , g_t⟩ be ideals in C[x_1, . . . , x_n]. Show that
√I = √J if and only if f_i ∈ √J for all i, 1 ≤ i ≤ s, and g_j ∈ √I for all j,
1 ≤ j ≤ t.
1.36 Let I = ⟨f_1, . . . , f_u⟩ and J = ⟨g_1, . . . , g_v⟩ be ideals in k[x_1, . . . , x_n].
a. Prove that I ∩ J and I + J are ideals in k[x_1, . . . , x_n].
b. Show that I + J = ⟨f_1, . . . , f_u, g_1, . . . , g_v⟩.
1.37 Suppose I and J are ideals in k[x1 , . . . , xm , y1 , . . . , yn ].
a. Show that I ∩ k[y1 , . . . , yn ] is an ideal in k[y1 , . . . , yn ].
b. Let k[y] denote k[y1 , . . . , yn ]. Show that (I ∩ k[y]) + (J ∩ k[y]) ⊂ (I + J) ∩ k[y]
but that the reverse inclusion need not hold.
1.38 a. If A is any set and {I_a : a ∈ A} is a collection of ideals indexed by A, show
that ∩_{a∈A} I_a is an ideal.
b. Let any subset F of k[x_1, . . . , x_n] be given. By part (a) of this exercise the
set J := ∩{I : I is an ideal and I ⊃ F} is an ideal, the "smallest ideal that
contains every element of F." Show that J = ⟨f : f ∈ F⟩.
1.39 Prove that for any two ideals I and J, √(I ∩ J) = √I ∩ √J. Prove that for any two
sets A and B in k^n, I(A ∪ B) = I(A) ∩ I(B).
1.40 Prove Theorem 1.3.18.
1.41 See Exercise 1.7. Show that if S is not an affine variety, then S ⊂ V(I(S)) always
holds, but the inclusion could be strict.

1.42 Show that if V is a variety, then V is a topologically closed set. Suppose S is


a set (not necessarily a variety) and C(S) is its topological closure; show that
S ⊂ C(S) ⊂ S̄. Show by example that just the first, just the second, both, or
neither of the two set inclusions could be strict.
1.43 Prove that the set I : J introduced in Definition 1.3.23 is an ideal in k[x1 , . . . , xn ].
1.44 Prove Proposition 1.3.27.
1.45 [Referenced in Theorem 3.7.1, proof.] Let V be a variety and V_1 a subvariety of
V with V_1 ⫋ V.
a. Since V_1 ⊂ V, by definition V = V_1 ∪ (V \ V_1). Show that no points outside
V are picked up in forming the Zariski closure of V \ V_1. That is, show that
V_1 together with the Zariski closure of V \ V_1 is contained in V.
b. Use the result in part (a) to show that if V is irreducible, then the Zariski
closure of V \ V_1 is V. (Compare this result to Proposition 1.3.20.)
c. Show that the set equality in part (b) can fail if V is not irreducible or if V_1
is not a variety.
1.46 Consider J = ⟨2a_1 + b_0, a_0 + 2b_1, a_1 b_1 − a_2 b_2⟩ in C[a_0, a_1, a_2, b_2, b_1, b_0].
a. Use a computer algebra system to verify that J is a radical ideal. (For
example, in Maple apply the IsRadical command that is part of the
PolynomialIdeals package.)
b. Use the algorithm of Table 1.6 on page 38 to verify that J : ⟨b_1 b_2⟩ = J.
c. Conclude that the Zariski closure of V(J) \ V(b_1 b_2) is V(J).
d. Could the conclusion in part (c) be drawn from the result in Exercise 1.45
without computing J : ⟨b_1 b_2⟩? If so, explain why; if not, what additional
information is needed?
1.47 In the context of Exercise 1.46, define a pair of polynomials g and h by
g = a_0 b_0 b_2 − a_2 b_2^2 and h = 4a_0^3 a_2 − a_0^2 b_0^2 − 18 a_0 a_2 b_0 b_2 + 4 b_0^3 b_2 + 27 a_2^2 b_2^2.
Show that the Zariski closure of V(J) \ V(⟨g⟩ ∩ ⟨h⟩) is V(J).
1.48 Prove Proposition 1.4.7.
1.49 Prove that an ideal I is primary if and only if the ideal √I is prime.
1.50 Show that the minimal associated prime ideals given in Tables 1.7 and 1.9 are
the same.
Chapter 2
Stability and Normal Forms

In this chapter our concern is with a system of ordinary differential equations
ẋ = f(x) in R^n or C^n in a neighborhood of a point x_0 at which f(x_0) = 0. Early
investigations into the nature of solutions of the system of differential equations in
a neighborhood of such a point were made in the late 19th century by A. M. Lya-
punov ([114, 115]) and H. Poincaré ([143]). Lyapunov developed two methods for
investigating the stability of x0 . The so-called First Method involves transformation
of the system to normal form; the Second or Direct Method involves the use of what
are now termed Lyapunov functions. In the first section of this chapter we present
several of the principal theorems of Lyapunov’s Direct Method. Since smoothness
of f is not necessary for these results, we do not assume it in this section. The second
and third sections are devoted to the basics of the theory of normal forms.

2.1 Lyapunov’s Second Method

Let Ω be an open subset of Rn , x0 a point of Ω , and f : Ω → Rn continuous and
such that solutions of initial value problems associated with the autonomous system
of differential equations
ẋ = f(x) (2.1)
are unique. For x1 ∈ Ω , we let x1 (t) denote the unique solution of (2.1) that satis-
fies x(0) = x1 , on its maximal interval of existence J1 ; this is the trajectory through
x1 . The point set {x1 (t) : t ∈ J1 } is the orbit of (or through) x1 . If f(x0 ) = 0, then
x(t) ≡ x0 solves (2.1) uniquely, the orbit through x0 is just {x0 }, and x0 is termed
an equilibrium or rest point of the system, or a singularity or singular point (par-
ticularly when we view f as a vector field on Ω ; see Remark 3.2.4(b)). Any orbit
is topologically a point, a circle (a closed orbit or a cycle), or a line (see [44] or
[140]). The decomposition of Ω , the phase space of (2.1), into the union of disjoint
orbits is the phase portrait of (2.1). In part (a) of the following definition we have
incorporated into the definition the result from the theory of differential equations

V.G. Romanovski, D.S. Shafer, The Center and Cyclicity Problems, 57
DOI 10.1007/978-0-8176-4727-8_2,
© Birkhäuser, a part of Springer Science+Business Media, LLC 2009

that if x1 (t) is confined to a compact set for all nonnegative t in its maximal interval
of existence, then that interval must contain the half-line [0, ∞), so that existence of
x1 (t) for all nonnegative t need not be assumed in advance. The same consideration
applies to part (b) of the definition.

Definition 2.1.1.
(a) An equilibrium x0 of (2.1) is stable if, for every ε > 0, there exists δ > 0 such
that if x1 satisfies |x1 − x0 | < δ then |x1 (t) − x0| < ε for all t ≥ 0.
(b) An equilibrium x0 of (2.1) is asymptotically stable if it is stable and if there
exists δ1 > 0 such that if x1 satisfies |x1 − x0 | < δ1 , then limt→∞ x1 (t) = x0 .
(c) An equilibrium of (2.1) is unstable if it is not stable.

Note that it is possible that the trajectory of every point in a neighborhood of
an equilibrium x0 tends to x0 in forward time, yet in every neighborhood there
exists a point whose forward trajectory travels a uniformly large distance away from
x0 before returning to limit on x0 (Exercise 2.1). This is the reason for the specific
requirement in point (b) that x0 be stable. What we have called stability and asymp-
totic stability are sometimes referred to in the literature as positive stability and
positive asymptotic stability, and the equilibrium is then said to be negatively stable
or negatively asymptotically stable for (2.1) if it is respectively positively stable or
positively asymptotically stable for ẋ = −f(x).
The problem of interest in this section is to obtain a means of determining when
an equilibrium of system (2.1) is stable, asymptotically stable, or unstable without
actually solving or estimating solutions of the system, particularly when linear es-
timates fail. In a system of differential equations that models a mechanical system,
the total energy function typically holds the answer: if total energy strictly decreases
along positive trajectories near the equilibrium, then it is asymptotically stable. The
concept of Lyapunov function generalizes this idea.
Since without loss of generality we may assume that the equilibrium is located
at the origin of Rn , we henceforth assume that Ω is a neighborhood of 0 ∈ Rn
and that f(0) = 0. To understand the meaning of the quantity Ẇ in point (b) of the
following definition, note that if x(t) is a trajectory of system (2.1) (corresponding
to some initial condition x(0) = x0 ), then the expression w(t) = W (x(t)) defines a
differentiable function from a punctured neighborhood of 0 ∈ R into R, and by the
chain rule its derivative is dW (x) · f(x). The expression in (b) can thus be understood
as giving the instantaneous rate of change (at x) of the function W along the unique
trajectory of (2.1) which is at x at time zero.

Definition 2.1.2. Let U be an open neighborhood of 0 ∈ Rn and let W : U → R be a
continuous function that is C1 on U \ {0}.
(a) W is positive definite if W (0) = 0 and W (x) > 0 for x ≠ 0 .
(b) W is a Lyapunov function for system (2.1) if it is positive definite and if the
function Ẇ : U \ {0} → R : x ↦ dW (x) · f(x) is nonpositive.
(c) W is a strict Lyapunov function for system (2.1) if it is positive definite and if Ẇ
is negative.
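The computation of Ẇ in point (b) is purely mechanical, so it lends itself to a computer algebra check. The following sketch (assuming the sympy library; the cubic system chosen here is illustrative and not taken from the text) exhibits a strict Lyapunov function:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

# An illustrative system (not from the text) with an equilibrium at the origin.
f = sp.Matrix([-y - x**3, x - y**3])

# Candidate Lyapunov function and its rate of change dW(x) . f(x) along trajectories.
W = (x**2 + y**2) / 2
Wdot = sp.expand(sp.Matrix([W]).jacobian([x, y]).dot(f))

print(Wdot)  # -x**4 - y**4
```

Since Ẇ = −x⁴ − y⁴ is negative off the origin, W is a strict Lyapunov function for this system.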

Theorem 2.1.3. Let Ω be an open neighborhood of 0 ∈ Rn , and let 0 be an equilib-
rium for system (2.1) on Ω .
1. If there exists a Lyapunov function for system (2.1) on a neighborhood U of 0,
then 0 is stable.
2. If there exists a strict Lyapunov function for system (2.1) on a neighborhood U of
0, then 0 is asymptotically stable.
Proof. Suppose there exists a Lyapunov function W defined on a neighborhood U
of 0, and let ε > 0 be given. Decrease ε if necessary so that {x : |x| ≤ ε } ⊂ U, and
let S = {x : |x| = ε }. By continuity of W , compactness of S, and the fact that W is
positive definite, it is clear that m := min{W (x) : x ∈ S} is finite and positive, and
that there exists a positive number δ < ε such that M := max{W (x) : |x| ≤ δ } < m.
We claim that δ is as required. For fix x1 such that 0 < |x1 | ≤ δ , and as usual let
x1 (t) denote the trajectory through x1 . If, contrary to what we wish to establish,
there exists a positive value of t for which |x1 (t)| = ε , then there exists a smallest
such value T . Then for 0 ≤ t ≤ T , x1 (t) is in U \ {0}, hence v(t) := W (x1 (t)) is
defined and smooth, and v′ (t) = Ẇ (x1 (t)) ≤ 0, so that v(0) ≥ v(T ), in contradiction
to the fact that v(0) ≤ M < m ≤ v(T ).
For our proof of the second point we recall the notion of the omega limit set ω (x)
of a point x for which x(t) is defined for all t ≥ 0 : ω (x) is the set of all points y
for which there exists a sequence t1 < t2 < · · · of numbers such that tn → ∞ and
x(tn ) → y as n → ∞. Note that ω (x) is closed and invariant under the flow of system
(2.1) (see [44] or [140]).
Suppose there exists a strict Lyapunov function W defined on a neighborhood U
of 0. Since W is also a Lyapunov function, by point (1) 0 is stable. Choose ε > 0 so
small that {x : |x| ≤ ε } ⊂ U, and let δ be such that if |x| < δ , then x(t) exists for all
t ≥ 0 and satisfies |x(t)| < ε /2, hence the omega limit set ω (x) is defined and is a
nonempty, compact, connected set ([44, 140]). Note that if 0 < |x1 | < δ , then for all
t ≥ 0, W (x1 (t)) exists and Ẇ (x1 (t)) < 0.
Fix any x1 such that 0 < |x1 | < δ . If a, b ∈ ω (x1 ), then W (a) = W (b). Indeed,
there exist sequences tn and sn , tn → ∞ and sn ≥ 0, such that both x1 (tn ) → a and
x1 (tn + sn ) → b. Since W is continuous and strictly decreases on every trajectory in
Ω \ {0}, W (a) ≥ W (b). Similarly, W (b) ≥ W (a).
Again, if a ∈ ω (x1 ), then a must be an equilibrium. For if |a| ≠ 0 and f(a) ≠ 0,
then a(t) ∈ Ω \ {0} when it is defined, and for sufficiently small τ > 0 the time-τ
image a(τ ) of a satisfies a(τ ) ≠ a. But if a ∈ ω (x1 ), then |a| ≤ ε /2 < ε , so that
W (a(τ )) < W (a). Yet a(τ ) ∈ ω (x1 ), hence W (a(τ )) = W (a), a contradiction.
Finally, 0 is the only equilibrium in {x : |x| < δ }. For given any x satisfying
0 < |x| < δ , for sufficiently small τ > 0, W (x(τ )) is defined and W (x(τ )) < W (x),
hence x(τ ) ≠ x.
In short, ω (x1 ) = {0}, so x1 (t) → 0 as t → ∞, as required. 
The following slight generalization of part (2) of Theorem 2.1.3 is sometimes
useful (see Example 2.1.6, but also Exercise 2.4).
Theorem 2.1.4. Let Ω be an open neighborhood of 0 ∈ Rn , and let 0 be an equi-
librium for system (2.1) on Ω . Let C be a smooth curve in Ω to which f is nowhere

tangent except at 0. If there exists a Lyapunov function W for system (2.1) on a
neighborhood U of 0 such that Ẇ is negative on Ω \ C, then 0 is asymptotically
stable.

Proof. The proof of part (2) of Theorem 2.1.3 goes through unchanged, since it
continues to be true that W is strictly decreasing on every trajectory in Ω \ {0}. 

An instability theorem analogous to Theorem 2.1.3 is the following.

Theorem 2.1.5. Let Ω be an open neighborhood of 0 ∈ Rn , and let 0 be an equilib-
rium for system (2.1) on Ω .
1. If there exists a positive definite function W on a neighborhood U of 0 such that
Ẇ > 0 on U \ {0}, then 0 is unstable.
2. If there exists a function W on a neighborhood U of 0 such that W (0) = 0, Ẇ is
positive definite, and W takes a positive value in every neighborhood of 0, then
0 is unstable.

Proof. The proof is left as Exercise 2.5. 

Example 2.1.6. Consider a quadratic system on R2 with an equilibrium at which
the linear part has one eigenvalue negative and the other eigenvalue zero. (To say
that the system is quadratic means that the right-hand sides in (2.1) are polynomials,
the maximum of whose degrees is two.) By translating the equilibrium to the origin,
performing a linear change of coordinates, and rescaling time, we may place the
system in the form
ẋ = −x + ax² + bxy + cy²
ẏ = dx² + exy + f y² .
We will use a Lyapunov function to show that the equilibrium is stable if f = 0 and
ce < 0, or if c = e = f = 0. To do so, consider any function W of the form

W (x, y) = (Ax² + By²)/2. (2.2)

Then

Ẇ (x, y) = −Ax² + Aax³ + (Ab + Bd)x²y + (Ac + Be)xy² + B f y³ .

Choosing A = |e| and B = |c| if ce < 0 and A = B = 1 if c = e = 0, W is positive
definite. When f = 0, Ẇ becomes Ẇ (x, y) = −x²(A − Aax − (Ab + Bd)y), hence is
nonpositive on a neighborhood of the origin, implying stability. We note that any
equilibrium of a planar system at which the linear part has exactly one eigenvalue
zero is a node, a (topological) saddle, or a saddle-node (see, for example, Theorem
65, §21 of [12]), so that the equilibrium in question must actually be a stable node,
hence be asymptotically stable, a fact that the Lyapunov function does not reveal if
only Theorem 2.1.3 is used, but one that is shown by Theorem 2.1.4 when c ≠ 0
(but not when c = 0).

Example 2.1.7. Consider a quadratic system on R2 with an equilibrium at the origin
at which the linear part has purely imaginary eigenvalues. By a linear change of
coordinates and time rescaling (Exercise 2.7), we may write the system in the form

ẋ = −y + ax² + bxy + cy²
ẏ = x + dx² + exy + f y² .                                        (2.3)

If we look for a Lyapunov function in the form (2.2), then

Ẇ (x, y) = (B − A)xy + Aax³ + (Ab + Bd)x²y + (Ac + Be)xy² + B f y³ ,

which is nonpositive in a neighborhood of the origin if and only if A = B and

a = f = 0, b = −d, c = −e (2.4)

(in which case Ẇ ≡ 0). Therefore the origin of system (2.3) is stable if the coeffi-
cients satisfy (2.4). The sufficient conditions (2.4) that we have found for stability
of the origin of system (2.3) in fact are not necessary, however, but arose simply
as a result of our specific choice of the form (2.2) of our trial function W . Indeed,
consider system (2.3) with a = f = 1 and c = d = e = 0, that is,

ẋ = −y + x² + bxy, ẏ = x + y² ,                                   (2.5)

and in place of (2.2) the far more complicated trial function

W (x, y) = x² + y² + [2(2 + b)/3] x³ − 2x²y + 2xy² − (4/3) y³
           − [(26 − 30b² − 9b³)/(6(4 + 3b))] x⁴ + [4/(3(4 + 3b))] x³y
           − [2(1 + 3b)/(3(4 + 3b))] x²y² + [4(1 + b)(1 + 3b)/(3(4 + 3b))] xy³
           − [(26 + 36b + 9b²)/(6(4 + 3b))] y⁴ ,

for which

Ẇ (x, y) = [2(26 + 30b + 9b²)/(12 + 9b)] x⁴ − [4(13 + 13b + 3b²)/(12 + 9b)] y⁴ + o(x⁴ + y⁴) .

When (−13 − √13)/6 < b < (−13 + √13)/6, W is a strict Lyapunov function, hence for such b
the origin is an asymptotically stable singular point of system (2.5).
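The cancellations asserted in Example 2.1.7 can be verified mechanically. The following sketch (sympy assumed) expands Ẇ for system (2.5) with the trial function above and confirms that the terms of total degree at most four reduce to the two displayed quartics:

```python
import sympy as sp

x, y, b = sp.symbols('x y b')

# System (2.5): xdot = -y + x^2 + b*x*y, ydot = x + y^2.
f = sp.Matrix([-y + x**2 + b*x*y, x + y**2])

# The quartic trial function W of Example 2.1.7.
R = sp.Rational
W = (x**2 + y**2
     + R(2, 3)*(2 + b)*x**3 - 2*x**2*y + 2*x*y**2 - R(4, 3)*y**3
     - (26 - 30*b**2 - 9*b**3)/(6*(4 + 3*b))*x**4
     + 4/(3*(4 + 3*b))*x**3*y
     - 2*(1 + 3*b)/(3*(4 + 3*b))*x**2*y**2
     + 4*(1 + b)*(1 + 3*b)/(3*(4 + 3*b))*x*y**3
     - (26 + 36*b + 9*b**2)/(6*(4 + 3*b))*y**4)

Wdot = sp.expand(sp.Matrix([W]).jacobian([x, y]).dot(f))

# Keep only the terms of total degree <= 4 in (x, y).
low = sum(cf*x**i*y**j for (i, j), cf in sp.Poly(Wdot, x, y).terms() if i + j <= 4)
lead = (2*(26 + 30*b + 9*b**2)*x**4 - 4*(13 + 13*b + 3*b**2)*y**4)/(12 + 9*b)
print(sp.simplify(low - lead))  # 0: degrees 2 and 3 cancel, and degree 4 matches
```

Both quartic coefficients are negative exactly on the stated interval of b, which is how the bound (−13 ± √13)/6 arises.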

Example 2.1.7 illustrates that even for a system that is apparently as simple as
system (2.3) it is a difficult problem to find in the space of parameters {a, b, c, d, e, f }
the subsets corresponding to systems with a stable, unstable, or asymptotically sta-
ble singularity at the origin. The theorems of this section do not provide a procedure
for resolving this problem. We will study this situation in detail in Chapter 3.

2.2 Real Normal Forms

Suppose x0 is a regular point of system (2.1), that is, a point x0 at which f(x0 ) ≠ 0,
and that f is C∞ (or real analytic). The Flowbox Theorem (see, for example, §1.7.3
of [44]) states that there is a C∞ (respectively, real analytic) change of coordinates
x = H(y) in a neighborhood of x0 so that with respect to the new coordinates system
(2.1) becomes

ẏ1 = 1 and ẏ j = 0 for 2 ≤ j ≤ n . (2.6)

The Flowbox Theorem confirms the intuitively obvious answers to two questions
about regular points, those of identity and of structural stability or bifurcation: re-
gardless of the infinitesimal generator f, the phase portrait of system (2.1) in a neigh-
borhood of a regular point x0 is topologically equivalent to that of the parallel flow
of system (2.6) at any point, and when f is regarded as an element of any “reason-
ably” topologized set V of vector fields, there is a neighborhood N of f in V such
that for any f̃ in N, the flows of f and f̃ (or we often say f and f̃ themselves) are topo-
logically equivalent in a neighborhood of x0 . (Two phase portraits, or the systems of
differential equations or vector fields that generate them, are said to be topologically
equivalent if there is a homeomorphism carrying the orbits of one onto the orbits of
the other, preserving the direction of flow along the orbits, but not necessarily their
actual parametrizations, say as solutions of the differential equation.) In short, the
phase portrait in a neighborhood of a regular point is known (up to diffeomorphism),
and there is no bifurcation under sufficiently small perturbation.
Now suppose that x0 is an equilibrium of system (2.1). As always we assume, by
applying a translation of coordinates if necessary, that the equilibrium is located at
the origin and that f is defined and is sufficiently smooth on some open neighbor-
hood of the origin. The questions of identity and of structural stability or bifurcation
have equally simple answers in the case that the real parts of the eigenvalues of
the linear part A := df(0) of f are nonzero, in which case the equilibrium is called
hyperbolic. For the Hartman–Grobman Theorem (see for example [44]) states that,
in such a case, in a neighborhood of the origin the local flows φ (t, x) generated by
(2.1) and ψ (t, x) generated by the linear system

ẋ = Ax (A = df(0)) (2.7)

are topologically conjugate: there is a homeomorphism H of a neighborhood of


the origin onto its image such that H(φ (t, x)) = ψ (t, H(x)). (Here, for each x0 ,
φ (t, x0 ) (respectively, ψ (t, x0 )) is the unique solution x(t) of (2.1) (respectively,
of (2.7)) satisfying the initial condition x(0) = x0 .) Because the homeomorphism
H thus carries trajectories of system (2.1) onto those of system (2.7), preserving
their sense (in fact, their parametrization), it is a fortiori a topological equivalence
between the full system (2.1) and the linear system (2.7) in a neighborhood of 0,
the latter of which is explicitly known. Furthermore, if f is an element of a set V of

vector fields that is topologized in such a way that eigenvalues of linear parts depend
continuously on the vector fields, then system (2.1) is structurally stable as well.
The real situation of interest then is that of an isolated equilibrium of system
(2.1) that is nonhyperbolic. The identity problem is most fully resolved in dimension
two. If there is a characteristic direction of approach to the equilibrium, then apart
from certain exceptional cases, the topological type of the equilibrium can be found
by means of a finite sequence of “blow-ups” and is determined by a finite initial
segment of the Taylor series expansion of f at 0 (see [65]). In higher dimensions,
results on the topological type of degenerate equilibria are less complete.
Normal form theory enters in when we wish to understand the bifurcations of
system (2.1) in a neighborhood of a nonhyperbolic equilibrium. Supposing that we
have solved the identity problem (although in actual practice an initial normal form
computation may be done in order to simplify the identity problem for the original
system), we wish to know what phase portraits are possible in a neighborhood of
0, for any vector field in a neighborhood of f in some particular family V of vec-
tor fields, with a given topology. The idea of normal form theory quite simply is to
perform a change of coordinates x = H(y), or a succession of coordinate transfor-
mations, so as to place the original system (2.1) into a form most amenable to study.
Typically, this means eliminating as many terms as possible from an initial segment
of the power series expansion of f at the origin.
It is useful to compare this idea to an application of the Hartman–Grobman The-
orem. Although the homeomorphism H guaranteed by the Hartman–Grobman The-
orem can rightly be regarded as a change of coordinates in a neighborhood of the
origin, in that it is invertible, it would be incorrect to say that system (2.1) has been
transformed into system (2.7) by the change of coordinates H, since H could fail
to be smooth. In fact, for some choices of f in (2.1) it is impossible to choose H
to be smooth. An example for which this happens is instructive and will lay the
groundwork for the general approach in the nonhyperbolic case.
Example 2.2.1. Consider the linear system

ẋ1 = 2x1
ẋ2 = x2 ,                                                         (2.8)

which has a hyperbolic equilibrium at the origin, and the general quadratic system
with the same linear part,

ẋ1 = 2x1 + ax1² + bx1 x2 + cx2²
ẋ2 = x2 + a′ x1² + b′ x1 x2 + c′ x2² .                            (2.9)

We make a C2 change of coordinates x = H(y) = h(0) (y) + h(1) (y) + h[2] (y), where
h(0) (y) denotes the constant terms, h(1) (y) the linear terms in y1 and y2 , and h[2] (y)
all the remaining terms. Our goal is to eliminate all the quadratic terms in (2.9). To
keep the equilibrium situated at the origin, we choose h(0) (y) = 0, and to maintain
the same linear part, which is already in “simplest” form, namely Jordan normal
form, we choose h(1) (y) = y. Thus the change of coordinates is

x = y + h[2](y) ,

whence
ẋ = ẏ + dh[2] (y)ẏ = (Id + dh[2] (y))ẏ, (2.10)
where for the n-dimensional vector function u we denote by du the Jacobian matrix
whose (i, j)th entry is ∂ ui /∂ y j , 1 ≤ i, j ≤ n.

For y sufficiently close to 0 the geometric series ∑k≥0 (−dh[2] (y))^k converges in the
real vector space of linear transformations from Rn to Rn with the uniform norm
‖T‖ = max{|Tx| : |x| ≤ 1}. Therefore, the linear transformation Id + dh[2] (y) is
invertible and (Id + dh[2] (y))−1 = Id − dh[2] (y) + · · ·, so that (2.10) yields

ẏ = (Id + dh[2] (y))−1 ẋ = (Id − dh[2] (y) + · · ·)ẋ . (2.11)

Writing h[2] as

   h[2] (y) = ( a20 y1² + a11 y1 y2 + a02 y2² + · · · ,  b20 y1² + b11 y1 y2 + b02 y2² + · · · )ᵀ ,

we have

   dh[2] (y) = ⎡ 2a20 y1 + a11 y2 + · · ·    a11 y1 + 2a02 y2 + · · · ⎤
               ⎣ 2b20 y1 + b11 y2 + · · ·    b11 y1 + 2b02 y2 + · · · ⎦ ,

so that (2.11) is

   ⎛ẏ1 ⎞   ⎡1 − 2a20 y1 − a11 y2 − · · ·      −a11 y1 − 2a02 y2 − · · · ⎤
   ⎝ẏ2 ⎠ = ⎣−2b20 y1 − b11 y2 − · · ·      1 − b11 y1 − 2b02 y2 − · · · ⎦

            ⎛2(y1 + a20 y1² + a11 y1 y2 + a02 y2² + · · ·) + a(y1 + · · ·)² + b(y1 + · · ·)(y2 + · · ·) + c(y2 + · · ·)² ⎞
          × ⎝y2 + b20 y1² + b11 y1 y2 + b02 y2² + · · · + a′(y1 + · · ·)² + b′(y1 + · · ·)(y2 + · · ·) + c′(y2 + · · ·)² ⎠

            ⎛2y1 + (a − 2a20 )y1² + (b − a11 )y1 y2 + cy2² + · · ·              ⎞
          = ⎝y2 + (a′ − 3b20 )y1² + (b′ − 2b11 )y1 y2 + (c′ − b02 )y2² + · · · ⎠ .

From this last expression it is clear that suitable choices of the coefficients ai j and
bi j exist to eliminate all the quadratic terms in the transformed system except for the
y22 term in ẏ1 . In short, no matter what our choice of the transformation H(y), if H
is twice differentiable, then system (2.9) can be simplified to

ẏ1 = 2y1 + cy2²
ẏ2 = y2                                                           (2.12)

through order two, but the quadratic term in the first component cannot be removed;
(2.12) is the normal form for system (2.9) through order two.
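The coefficient matching done by hand in Example 2.2.1 can be automated: expand ẏ = (Id + dh[2] (y))−1 f(y + h[2] (y)) through order two and solve the linear equations that kill the removable quadratic terms. A sketch with sympy (the symbols ap, bp, cp stand in for a′, b′, c′):

```python
import sympy as sp

y1, y2 = sp.symbols('y1 y2')
a, b, c = sp.symbols('a b c')
ap, bp, cp = sp.symbols('ap bp cp')  # stand-ins for a', b', c'
a20, a11, a02, b20, b11, b02 = sp.symbols('a20 a11 a02 b20 b11 b02')

def f(x1, x2):
    # Right-hand side of system (2.9).
    return sp.Matrix([2*x1 + a*x1**2 + b*x1*x2 + c*x2**2,
                      x2 + ap*x1**2 + bp*x1*x2 + cp*x2**2])

# Quadratic change of coordinates x = y + h(y), as in the example.
h = sp.Matrix([a20*y1**2 + a11*y1*y2 + a02*y2**2,
               b20*y1**2 + b11*y1*y2 + b02*y2**2])
Y = sp.Matrix([y1, y2])

# ydot = (Id + dh)^(-1) f(y + h(y)); Id - dh is accurate through order two.
rhs = (sp.eye(2) - h.jacobian([y1, y2])) * f(*(Y + h))

def coeff(k, i, j):
    # Coefficient of y1^i y2^j in component k of ydot.
    return sp.Poly(sp.expand(rhs[k]), y1, y2).coeff_monomial(y1**i * y2**j)

# Kill every quadratic term except y2^2 in the first component.
sol = sp.solve([coeff(0, 2, 0), coeff(0, 1, 1),
                coeff(1, 2, 0), coeff(1, 1, 1), coeff(1, 0, 2)],
               [a20, a11, b20, b11, b02], dict=True)[0]
print(sol)                       # {a20: a/2, a11: b, b20: ap/3, b11: bp/2, b02: cp}
print(coeff(0, 0, 2).subs(sol))  # c: the y2^2 term in ydot1 survives
```

The coefficient a02 never enters the equations: the quadratic term it multiplies cannot affect the transformed system at order two, which is exactly the obstruction that leaves the cy2² term in (2.12).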

We now turn our attention to the general situation in which system (2.1) has
an isolated equilibrium at the origin, not necessarily hyperbolic. We will need the
following notation and terminology. As in Chapter 1, for α = (α1 , . . . , αn ) ∈ Nn0 , xα
denotes x1^α1 · · · xn^αn and |α | = α1 + · · · + αn . We let Hs denote the vector space of
functions from Rn to Rn each of whose components is a homogeneous polynomial
function of degree s; elements of Hs will be termed vector homogeneous functions.
If {e1 , . . . , en } is the standard basis of Rn , where e j has a 1 in the jth position and
0s elsewhere, then a basis
for Hs is the collection of vector homogeneous functions

v j,α = xα e j (2.13)

for all j such that 1 ≤ j ≤ n and all α such that |α | = s (this is the product of a
monomial and a vector). For example, the first three vectors listed in the basis for
H2 given in Example 2.2.2 below are v1,(2,0) , v1,(1,1) , and v1,(0,2). Thus Hs has
dimension N = nC(s + n − 1, s) (see Exercise 2.6).
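The dimension count can be checked by brute force: a basis element (2.13) is a choice of one of the n components together with one of the degree-s monomials in n variables. A small Python sketch (standard library only):

```python
from itertools import combinations_with_replacement
from math import comb

def dim_H(n, s):
    # One basis vector v_{j,alpha} per pair (j, alpha) with 1 <= j <= n and |alpha| = s;
    # multisets of size s drawn from n variables enumerate the exponent vectors alpha.
    return n * len(list(combinations_with_replacement(range(n), s)))

for n in range(1, 5):
    for s in range(1, 6):
        assert dim_H(n, s) == n * comb(s + n - 1, s)
print("dim H_s = n C(s+n-1, s) confirmed for 1 <= n <= 4, 1 <= s <= 5")
```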
Assuming that f is C2 , we expand f in a Taylor series so as to write (2.1) as

ẋ = Ax + f(2)(x) + R(x), (2.14)

where f(2) ∈ H2 and the remainder satisfies the condition that |R(x)|/|x|2 → 0 as
|x| → 0. We assume that A has been placed in a standard form by a preliminary linear
transformation, so that in practical situations the nonlinear terms may be different
from what they were in the system as originally encountered.
Applying the reasoning of Example 2.2.1, we make a coordinate transformation
of the form
x = H(y) = y + h(2)(y), (2.15)
where in this case h(2) ∈ H2 , that is, each of the n components of h(2) (y) is a ho-
mogeneous quadratic polynomial in y. Since dH(0) = Id is invertible, the Inverse
Function Theorem guarantees that H has an analytic inverse on a neighborhood of
0. Using (2.15) in the right-hand side of (2.14) and inserting that in turn into the
analogue of (2.11) (that is, into (2.11) with h[2] replaced by h(2) ) yields

ẏ = Ay + Ah(2)(y) + f(2)(y) − dh(2) (y)Ay + R2(y), (2.16)

where the remainder satisfies the condition |R2 (y)|/|y|2 → 0 as |y| → 0. The
quadratic terms can be eliminated from (2.16) if and only if h(2) (y) can be cho-
sen so that
L h(2) (y) = dh(2) (y)Ay − Ah(2)(y) = f(2) (y), (2.17)
where L , the so-called homological operator, is the linear operator on H2 defined
by
L : p(y) ↦ dp(y)Ay − Ap(y) . (2.18)

In other words, all quadratic terms can be eliminated from (2.16) if and only if L
maps H2 onto itself. If L is not onto, then because H2 is finite-dimensional, it
decomposes as a direct sum H2 = Image(L ) ⊕ K2 , where Image(L ) denotes the
image of L in H2 , although the complementary subspace K2 is not unique. The
quadratic terms in (2.16) that can be eliminated by a C2 change of coordinates are
precisely those that lie in Image(L ). Those that remain have a form dependent on
the choice of the complementary subspace K2 .
Example 2.2.2. Let us reconsider the system (2.9) of Example 2.2.1 in this context.
The basis (2.13) for H2 is

   (x1², 0)ᵀ , (x1 x2 , 0)ᵀ , (x2², 0)ᵀ , (0, x1²)ᵀ , (0, x1 x2 )ᵀ , (0, x2²)ᵀ ,

which we order by the order in which we have listed the basis elements, and which
for ease of exposition we label u1 , u2 , u3 , u4 , u5 , and u6 . A straightforward compu-
tation based on the definition (2.18) of the homological operator L yields

   L (y1^α1 y2^α2 , 0)ᵀ = (2α1 + α2 − 2) (y1^α1 y2^α2 , 0)ᵀ                  (2.19a)

and

   L (0, y1^α1 y2^α2 )ᵀ = (2α1 + α2 − 1) (0, y1^α1 y2^α2 )ᵀ ,                (2.19b)
so that each basis vector is an eigenvector. The eigenvalues are, in the order of the
basis vectors to which they correspond, 2, 1, 0, 3, 2, and 1. Image(L ) is thus the
five-dimensional subspace of H2 spanned by the basis vectors other than u3 , and
a natural complement to Image(L ) is Span{u3 }, corresponding precisely to the
quadratic term in (2.12).
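Assuming sympy, the eigenvalue computation of Example 2.2.2 is easy to reproduce: apply the homological operator (2.18) to each basis vector and read off the scalar multiple.

```python
import sympy as sp

y1, y2 = sp.symbols('y1 y2')
A = sp.Matrix([[2, 0], [0, 1]])  # linear part of system (2.9)

monos = [y1**2, y1*y2, y2**2]
basis = [sp.Matrix([m, 0]) for m in monos] + [sp.Matrix([0, m]) for m in monos]  # u1..u6

def L(p):
    # Homological operator (2.18): L p(y) = dp(y) A y - A p(y).
    return p.jacobian([y1, y2]) * (A * sp.Matrix([y1, y2])) - A * p

# Each u_i is an eigenvector; divide L(u_i) by u_i to recover its eigenvalue.
eigs = [sp.simplify((L(u)[0] + L(u)[1]) / (u[0] + u[1])) for u in basis]
print(eigs)  # [2, 1, 0, 3, 2, 1], as in the text
```

The zero eigenvalue attached to u3 is exactly the direction in which L fails to be onto, so the y2² term in ẏ1 cannot be removed.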
Returning to the general situation, beginning with (2.1), written in the form
(2.14), we compute the operator L of (2.18), choose a complement K2 to Image(L )
in H2 , and decompose f(2) as f(2) = (f(2) )0 + f̃(2) ∈ Image(L ) ⊕ K2 . Then for any
h(2) ∈ H2 satisfying L h(2) = (f(2) )0 , by (2.16) the change of coordinates (2.15)
reduces (2.1) (which is the same as (2.14)) to

ẏ = Ay + f̃(2)(y) + R̃2(y), (2.20)

where, to repeat, f̃(2) ∈ K2 , and |R̃2 (y)|/|y|2 → 0 as |y| → 0. This is the normal
form for (2.1) through order two: the quadratic terms have been simplified as much
as possible.
Turning to the cubic terms, and assuming one more degree of differentiability, let
us return to x for the current coordinates and write (2.20) as

ẋ = Ax + f̃(2)(x) + f(3)(x) + R(x), (2.21)

where R denotes a new remainder term, satisfying the condition |R(x)|/|x|3 → 0 as


|x| → 0. A change of coordinates that will leave the constant, linear, and quadratic

terms in (2.21) unchanged is one of the form

x = y + h(3)(y), (2.22)

where h(3) ∈ H3 , that is, each of the n components of h(3) (y) is a homogeneous
cubic polynomial function in y. Using (2.22) in the right-hand side of (2.21) and
inserting that in turn into the analogue of (2.11) (that is, into (2.11) with h[2] replaced
by h(3) ) yields

ẏ = Ay + f̃(2)(y) + Ah(3)(y) + f(3)(y) − dh(3) (y)Ay + R3(y), (2.23)

where the remainder satisfies the condition |R3 (y)|/|y|3 → 0 as |y| → 0. The cubic
terms can be eliminated from (2.23) if and only if h(3) (y) can be chosen so that

L h(3) (y) = dh(3) (y)Ay − Ah(3)(y) = f(3) (y). (2.24)

Comparing (2.17) and (2.24), the reader can see that the condition for the elim-
ination of all cubic terms is exactly the same as that for the elimination of all
quadratic terms. The homological operator L is again defined by (2.18), except
that it now operates on the vector space H3 of functions from Rn to Rn all of
whose components are homogeneous cubic polynomial functions. If L does not
map onto H3 , then exactly as for H2 when L does not map onto H2 , H3 decom-
poses as a direct sum H3 = Image(L ) ⊕ K3 , although again the complementary
subspace K3 is not unique. Once we have chosen a complement K3 , f(3) decom-
poses as f(3) = (f(3) )0 + f̃(3) ∈ Image(L ) ⊕ K3 . Then for any h(3) ∈ H3 satisfying
L h(3) = (f(3) )0 , by (2.23) the change of coordinates (2.22) reduces (2.1) (which is
the same as (2.21)) to

ẏ = Ay + f̃(2)(y) + f̃(3)(y) + R̃3(y), (2.25)

where f̃(3) ∈ K3 , and |R̃3 (y)|/|y|3 → 0 as |y| → 0. This is the normal form for (2.1)
through order three: the quadratic and cubic terms have been simplified as much as
possible.
It is apparent that the pattern continues through all orders, as long as f is suf-
ficiently differentiable. Noting that a composition of transformations of the form
x = y + p(y), where each component of p(y) is a polynomial function, is a trans-
formation of the same form and has an analytic inverse on a neighborhood of 0, we
have the following theorem.

Theorem 2.2.3. Let f be defined and Cr on a neighborhood of 0 in Rn and satisfy
f(0) = 0. Let A = df(0). For 2 ≤ k ≤ r, let Hk denote the vector space of functions
from Rn to Rn all of whose components are homogeneous polynomial functions of
degree k, let L denote the linear operator on Hk (the “homological operator”) de-
fined by L p(y) = dp(y)Ay − Ap(y), and let Kk be any complement to Image(L )
in Hk , so that Hk = Image(L ) ⊕ Kk . Then there is a polynomial change of coor-
dinates x = H(y) = y + p(y) such that in the new coordinates system (2.1) is

ẏ = Ay + f(2)(y) + · · · + f(r) (y) + R(y), (2.26)

where for 2 ≤ k ≤ r, f(k) ∈ Kk , and the remainder R satisfies |R(y)|/|y|r → 0 as
|y| → 0.

Definition 2.2.4. In the context of Theorem 2.2.3, expression (2.26) is a normal
form through order r for system (2.1).

2.3 Analytic and Formal Normal Forms

In this section we study in detail the homological operator L and normal forms of
the system
ẋ = Ax + X(x), (2.27)
where now x ∈ Cn , A is a possibly complex n × n matrix, and each component Xk (x)
of X, 1 ≤ k ≤ n, is a formal or convergent power series, possibly with complex
coefficients, that contains no constant or linear terms. Our treatment mainly follows
the lines of [19].
To see why this is of importance, even when our primary interest is in real sys-
tems, recall that the underlying assumption in the discussion leading up to Theorem
2.2.3, which was explicitly stated in the first sentence following (2.14), was that the
linear terms in the right-hand side of (2.1), that is, the n × n matrix A in (2.14), had
already been placed in some standard form by a preliminary linear transformation.
To elaborate on this point, typically at the beginning of an investigation of system
(2.1), expressed as (2.14), the matrix A has no particularly special form. From linear
algebra we know that there exists a nonsingular n × n matrix S such that the simi-
larity transformation SAS−1 = J produces the Jordan normal form J of A. If we use
the matrix S to make the linear coordinate transformation

y = Sx (2.28)

of phase space, then in the new coordinates (2.14) becomes

ẏ = Jy + S f(2)(S−1 y) + S R(S−1y). (2.29)

Although the original system (2.1) or (2.14) is real, the matrices J and S can be
complex, hence system (2.29) can be complex as well. Thus even if we are primarily
interested in studying real systems, it is nevertheless fruitful to investigate normal
forms of complex systems (2.27). Since we are working with systems whose right-
hand sides are power series, we will allow formal as well as convergent series.
Whereas previously Hs denoted the vector space of functions from Rn to Rn , all
of whose components are homogeneous polynomial functions of degree s, we now
let Hs denote the vector space of functions from Cn to Cn , all of whose components
are homogeneous polynomial functions of degree s. The collection v j,α of (2.13)

remains a basis of Hs . For α = (α1 , . . . , αn ) ∈ Nn0 and κ = (κ1 , . . . , κn ) ∈ Cn we
will let (α , κ ) denote the scalar product

   (α , κ ) = α1 κ1 + · · · + αn κn .

Lemma 2.3.1. Let A be an n × n matrix with eigenvalues κ1 , . . . , κn , and let L be
the corresponding homological operator on Hs , that is, the linear operator on Hs
defined by
L p(y) = dp(y)Ay − Ap(y) . (2.30)
Let κ = (κ1 , . . . , κn ). Then the eigenvalues λ j , j = 1, . . . , N, of L are

λ j = (α , κ ) − κm ,

where m ranges over {1, . . . , n} and α ranges over {β ∈ Nn0 : |β | = s}.

Proof. For ease of exposition, just for this paragraph let T denote the linear trans-
formation of Cn whose matrix representative with respect to the standard basis
{e1 , . . . , en } of Cn is A. There exists a nonsingular n×n matrix S such that J = SAS−1
is the lower-triangular Jordan form of A (omitted entries are zero),

        ⎡κ1              ⎤
        ⎢σ2  κ2          ⎥
   J =  ⎢    σ3  κ3      ⎥ ,
        ⎢       ⋱   ⋱    ⎥
        ⎣          σn  κn⎦

where κ1 through κn are the eigenvalues of A (repeated eigenvalues grouped to-
gether) and σ j ∈ {0, 1} for 2 ≤ j ≤ n. We will compute the eigenvalues of L by
finding its N × N matrix representative L with respect to a basis of Hs correspond-
ing to the new basis of Cn in which J is the matrix of T . This corresponds to the
change of coordinates x = Sy in Cn .
In Exercise 2.11 the reader is led through a derivation of the fact that with respect
to the new coordinates the expression for L changes to (2.30) with A replaced by
J, that is,
L p(y) = dp(y)Jy − Jp(y) . (2.31)
Column ( j, α ) of L is the coordinate vector of the image under L of the basis
vector v_{j,α} of (2.13). Computing directly from (2.31),

    L v_{j,α}(y) = [ ∑_{i=1}^n α_i y^{α−e_i} (κ_i y_i + σ_i y_{i−1}) ] e_j − y^α (κ_j e_j + σ_{j+1} e_{j+1})

                 = ( ((α , κ) − κ_j) y^α + ∑_{i=2}^n σ_i α_i y_1^{α_1} · · · y_{i−1}^{α_{i−1}+1} y_i^{α_i −1} · · · y_n^{α_n} ) e_j − σ_{j+1} y^α e_{j+1}

                 = ((α , κ) − κ_j) v_{j,α}(y) + ∑_{i=2}^n σ_i α_i v_{j,(α_1 ,...,α_{i−1}+1,α_i −1,...,α_n)}(y) − σ_{j+1} v_{j+1,α}(y) ,

with the conventions σ_1 := 0 and σ_{n+1} := 0.

If the basis of Hs is ordered so that vr,β precedes vs,γ if and only if the first nonzero
entry (reading left to right) in the row vector (r − s, β − γ ) is negative, then the basis
vector v j,α precedes all the remaining vectors in the expression for L v j,α . This
implies that the corresponding N × N matrix L for L is lower triangular, and has
the numbers (α , κ ) − κm (|α | = s, 1 ≤ m ≤ n) on the main diagonal. 

The order of the basis of Hs referred to in the proof of Lemma 2.3.1 is the
lexicographic order. See Exercise 2.12.
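As a quick mechanical check of Lemma 2.3.1 (an illustration of mine, not part of the text), the matrix of the homological operator can be assembled with a computer algebra system for n = 2 and s = 2, using the operator (2.31) and the basis ordering described in the proof; sympy is assumed to be available:

```python
import sympy as sp

# n = 2, s = 2: build the 6 x 6 matrix of the homological operator (2.31)
# with respect to the ordered basis v_{j,alpha}(y) = y^alpha e_j.
y1, y2 = sp.symbols('y1 y2')
k1, k2, sg = sp.symbols('kappa1 kappa2 sigma2')
y = sp.Matrix([y1, y2])
J = sp.Matrix([[k1, 0], [sg, k2]])   # lower triangular, subdiagonal entry sigma2

# Ordering: v_{r,beta} precedes v_{s,gamma} when the first nonzero entry
# of (r - s, beta - gamma) is negative.
basis = [(j, a) for j in (0, 1) for a in ((0, 2), (1, 1), (2, 0))]

def hom_op(p):
    """L p(y) = dp(y) J y - J p(y), as in (2.31)."""
    return sp.expand(p.jacobian(y) * J * y - J * p)

cols = []
for j, a in basis:
    p = sp.Matrix([0, 0])
    p[j] = y1**a[0] * y2**a[1]
    img = hom_op(p)
    cols.append([img[i].coeff(y1, b[0]).coeff(y2, b[1]) for i, b in basis])
M = sp.Matrix(cols).T   # column (j, alpha) = coordinates of L v_{j,alpha}

kap = (k1, k2)
for idx, (j, a) in enumerate(basis):
    # diagonal entry is (alpha, kappa) - kappa_j, as the lemma asserts
    assert sp.simplify(M[idx, idx] - (a[0]*k1 + a[1]*k2 - kap[j])) == 0
assert M.is_lower   # no entries above the main diagonal
print("L is lower triangular with eigenvalues (alpha, kappa) - kappa_m")
```

The same computation with larger n or s only changes the size of the basis; the triangular structure persists for any lower-triangular J.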
We say that our original system (2.27) under consideration is formally equivalent
to a like system
ẏ = Ay + Y(y) (2.32)
if there is a change of variables

    x = H(y) = y + h(y)    (2.33)
that transforms (2.27) into (2.32), where the coordinate functions of Y and h, Y j and
h j , j = 1, . . . , n, are formal power series. (Of course, in this context it is natural to
allow the coordinate functions X j of X to be merely formal series as well.) If all Y j
and h j are convergent power series (and all X j are as well), then by the Inverse Func-
tion Theorem the transformation (2.33) has an analytic inverse on a neighborhood
of 0 and we say that (2.27) and (2.32) are analytically equivalent. (See the para-
graph following Corollary 6.1.3 for comments on the convergence of power series
of several variables.)
We alert the reader to the fact that, as indicated by the notation introduced in
(2.33), h stands for just the terms of order at least two in the equivalence transfor-
mation H between the two systems.
Lemma 2.3.1 yields the following theorem.
Theorem 2.3.2. Let κ1 , . . . , κn be the eigenvalues of the n × n matrix A in (2.27) and
(2.32), set κ = (κ_1 , . . . , κ_n), and suppose that

    (α , κ) − κ_m ≠ 0    (2.34)
for all m ∈ {1, . . . , n} and for all α ∈ Nn0 for which |α | ≥ 2. Then systems (2.27) and
(2.32) are formally equivalent for all X and Y, and the equivalence transformation
(2.33) is uniquely determined by X and Y.
Proof. Differentiating (2.33) with respect to t and applying (2.27) and (2.32) yields
the condition

    dh(y)Ay − Ah(y) = X(y + h(y)) − dh(y)Y(y) − Y(y)    (2.35)

that h must satisfy. We determine h by a recursive process like that leading up to
Theorem 2.2.3.
Decomposing X, Y, and h as the sum of their homogeneous parts,

    X = ∑_{s=2}^∞ X^{(s)} ,   Y = ∑_{s=2}^∞ Y^{(s)} ,   h = ∑_{s=2}^∞ h^{(s)} ,    (2.36)

where X^{(s)} , Y^{(s)} , h^{(s)} ∈ Hs , (2.35) decomposes into the infinite sequence of equations

    L (h^{(s)}) = g^{(s)}(h^{(2)} , . . . , h^{(s−1)} , Y^{(2)} , . . . , Y^{(s−1)} , X^{(2)} , . . . , X^{(s)}) − Y^{(s)} ,    (2.37)

for s = 2, 3, . . . , where g^{(s)} denotes the function obtained by substituting
y + ∑_{i=2}^{s} h^{(i)}(y) for y + h(y) and ∑_{i=2}^{s} Y^{(i)}(y) for Y(y) in
X(y + h(y)) − dh(y)Y(y), and retaining only the terms of order s. For s = 2, the
right-hand side of (2.37) is to be understood to stand for X^{(2)}(y) − Y^{(2)}(y),
which is known. For s > 2, the right-hand side of (2.37) is known if h^{(2)} , . . . , h^{(s−1)}
have already been computed. By the hypothesis (2.34) and Lemma 2.3.1 the operator
L is invertible. Thus for any s ≥ 2 there is a unique solution h^{(s)} to (2.37).
Therefore a unique solution h(y) of (2.35) is determined recursively. 
Choosing Y = 0 in (2.32) yields the following corollary and motivates the defi-
nition that follows it.
Corollary 2.3.3. If condition (2.34) holds, then system (2.27) is formally equivalent
to its linear approximation ẏ = Ay. The (possibly formal) coordinate transformation
that transforms (2.27) into ẏ = Ay is unique.
Definition 2.3.4. System ẋ = Ax + X(x) is linearizable if there is an analytic nor-
malizing transformation x = y + h(y) that places it in the normal form ẏ = Ay.
Both the linear system ẏ = Ay and the linearizing transformation that produces it are
referred to as a “linearization” of the system ẋ = Ax+X(x). We will see later (Corol-
lary 4.2.3) that, at least when A is diagonal and n = 2 (the only case of practical
interest for us), the existence of a merely formal linearization implies the existence
of a convergent linearization.
As we saw in Example 2.2.2, when (2.34) does not hold some equations in (2.37)
might not have a solution. This means that in such a case we might not be able to
transform system (2.27) into a linear system by even a formal transformation (2.33).
The best we are sure to be able to do is to transform (2.27) into a system in which all
terms that correspond to pairs (m, α ) for which (2.34) holds have been eliminated.
However, terms corresponding to those pairs (m, α ) for which (2.34) fails might be
impossible to eliminate. These troublesome terms have a special name.
Definition 2.3.5. Let κ1 , . . . , κn be the eigenvalues of the matrix A in (2.27), ordered
according to the choice of a Jordan normal form J of A, and let κ = (κ1 , . . . , κn ).
Suppose m ∈ {1, . . . , n} and α ∈ N_0^n , |α| = α_1 + · · · + α_n ≥ 2, are such that

    (α , κ) − κ_m = 0 .

Then m and α are called a resonant pair, the corresponding coefficient X_m^{(α)} of the
monomial x^α in the mth component of X is called a resonant coefficient, and the
corresponding term is called a resonant term of X. Index and multi-index pairs,
terms, and coefficients that are not resonant are called nonresonant.
A “normal form” for system (2.27) should be a form that is as simple as possible.
The first step in the simplification process is to apply (2.28) to change the linear part
A in (2.27) into its Jordan normal form. We will assume that this preliminary step
has already been taken, so we begin with (2.27) in the form

ẋ = Jx + X(x), (2.38)

where J is a lower-triangular Jordan matrix. (Note that the following definition is
based on this supposition.) The simplest form that we are sure to be able to obtain
is one in which all nonresonant terms are zero, so we will take this as the meaning
of the term “normal form” from now on.
Definition 2.3.6. A normal form for system (2.27) is a system (2.38) in which every
nonresonant coefficient is equal to zero. A normalizing transformation for system
(2.27) is any (possibly formal) change of variables (2.33) that transforms (2.27)
into a normal form; it is called distinguished if for each resonant pair m and α , the
corresponding coefficient h_m^{(α)} is zero, in which case the resulting normal form is
likewise termed distinguished.
Two remarks about this definition are in order. The first is that it is more re-
strictive than the definition of normal form through order k for a smooth function,
Definition 2.2.4, since it requires that every nonresonant term be eliminated. In Ex-
ample 2.2.2, for instance, Span{c1 u1 +c2 u2 +u3 +c4 u4 +c5 u5 +c6 u6 } for any fixed
choice of the constants c j is an acceptable complement to Image(L ), hence
    ẋ = 2x + c(c_1 x^2 + c_2 xy + y^2)
    ẏ = y + c(c_4 x^2 + c_5 xy + c_6 y^2)    (2.39)

is a normal form through order two according to Definition 2.2.4. But equations
(2.19) show that the single resonant term is the y2 term in the first component, so
that (2.39) does not give a normal form according to Definition 2.3.6 unless the c j
are all chosen to be zero.
The second remark concerning Definition 2.3.6 is almost the reverse of the first:
although a normal form is the simplest form that we are sure to be able to obtain
in general, for a particular system it might not be the absolute simplest. In other
words, the fact that a coefficient is resonant does not mean that it must be (or remain)
nonzero under every normalization: a normalizing transformation that eliminates all
the nonresonant terms could very well eliminate some resonant terms as well. For
example, the condition in Corollary 2.3.3 is sufficient for linearizability, but it is by
no means necessary. In Chapter 4 we will treat the question of the possibility of
removing all resonant as well as nonresonant terms under normalization.

Remark 2.3.7. Suppose the system ẋ = Ax + X(x) is transformed into the normal
form ẏ = Ay+Y(y) by the normalizing transformation x = y+h(y), and let λ be any
nonzero number. It is an immediate consequence of Definitions 2.3.5 and 2.3.6 that
ẏ = λ Ay + λ Y(y) is a normal form for ẋ = λ Ax + λ X(x), and inspection of (2.35)
shows that the same transformation x = y + h(y) is a normalizing transformation
between the scaled systems.

Henceforth we will use the following notation. For any multi-index α , the coefficient of the monomial x^α in the mth component X_m of X will be denoted X_m^{(α)}.
Thus, for example, in system (2.9), X_1^{((2,0))} = a and X_2^{((1,1))} = b, although when α is
given explicitly, by slight abuse of notation we will write simply X_m^{(α_1 ,...,α_n)} instead
of X_m^{((α_1 ,...,α_n))}. Hence, for example, we write X_1^{(2,0)} = a instead of X_1^{((2,0))} = a. We
will use the same notational convention for Y and h.
Every system is at least formally equivalent to a normal form, and as the proof
of the following theorem shows, there is some freedom in choosing it, although that
freedom disappears if we restrict ourselves to distinguished normalizing transfor-
mations. Theorem 2.3.11 has more to say about this.

Theorem 2.3.8. Any system (2.38) is formally equivalent to a normal form (which
need not be unique). The normalizing transformation can be chosen to be distin-
guished.

Proof. Since the linear part is already in simplest form, we look for a change of
coordinates of the form (2.33) that transforms system (2.38) into ẏ = Jy + Y(y) in
which all nonresonant coefficients are zero. Writing h(y) = ∑_{s=2}^∞ h^{(s)}(y), for each
s the function h^{(s)} must satisfy equation (2.37), arising from (2.35) by the process
described immediately below (2.37). We have shown in the proof of Lemma 2.3.1
that the matrix of the operator L on the left-hand side of (2.37) is lower triangular
with the eigenvalues (α , κ) − κ_m on the main diagonal. Therefore any coefficient
h_m^{(α)} of h^{(s)} is determined by the equation

    [(α , κ) − κ_m] h_m^{(α)} = g_m^{(α)} − Y_m^{(α)} ,    (2.40)

where g_m^{(α)} is a known expression depending on the coefficients of X and on those
of the h^{(j)} and Y^{(j)} with j < s. Suppose that for j = 2, 3, . . . , s − 1, the homogeneous
terms h^{(j)} and Y^{(j)} have been determined. Then for any m ∈ {1, . . . , n} and any
multi-index α with |α| = s, if the pair m and α is nonresonant, that is, if
(α , κ) − κ_m ≠ 0, then we choose Y_m^{(α)} = 0 so that Y will be a normal form, and
choose h_m^{(α)} as uniquely determined by equation (2.40). If (α , κ) − κ_m = 0, then we
may choose h_m^{(α)} arbitrarily (and in particular, the choice h_m^{(α)} = 0 every time yields
a distinguished transformation), but the resonant coefficient Y_m^{(α)} must be chosen to
be g_m^{(α)}: Y_m^{(α)} = g_m^{(α)}. The process can be started because for s = 2 the right-hand
side of (2.40) is X_m^{(α)} − Y_m^{(α)}. Thus formal series for a normal form and a
normalizing transformation, distinguished or not, as we decide, are obtained. 
For simplicity, from now on we will assume that the matrix J is diagonal, that
is, that σ_k = 0 for k = 2, . . . , n. (All applications of normal form theory in this book
will be confined to systems that meet this condition.) Then the mth component on
the right-hand side of (2.35) is obtained by expanding X_m(y + h(y)) in powers of
y; we let {X_m(y + h(y))}^{(α)} denote the coefficient of y^α in this expansion. Using
this fact and Exercise 2.14, the coefficient g_m^{(α)} in (2.40) is given by the expression

    g_m^{(α)} = {X_m(y + h(y))}^{(α)} − ∑_{j=1}^{n} ∑_{2≤|β|≤|α|−1, α−β+e_j ∈ N_0^n} β_j h_m^{(β)} Y_j^{(α−β+e_j)} ,    (2.41)

where e_j = (0, . . . , 0, 1, 0, . . . , 0) ∈ N_0^n , with the 1 in the jth position. Note that for
|α| = 2 the sum over β is empty, so that g_m^{(α)} = {X_m(y + h(y))}^{(α)}, which reduces
to g_m^{(α)} = X_m^{(α)}, since X and h begin with quadratic terms. For |α| > 2, the inequalities
|β| < |α| and |α − β + e_j| < |α| ensure that g_m^{(α)} is uniquely determined by (2.41).
The proof of Theorem 2.3.8 and formula (2.41) yield the normalization procedure
that is displayed in Table 2.1 on page 75 for system (2.38) in the case that the matrix
J is diagonal.
Example 2.3.9. Fix any C∞ system (2.1) with an equilibrium at 0 whose linear part
is the same as that of system (2.8):

    ẋ_1 = 2x_1 + a x_1^2 + b x_1 x_2 + c x_2^2 + · · ·
    ẋ_2 = x_2 + a′ x_1^2 + b′ x_1 x_2 + c′ x_2^2 + · · · .    (2.42)

The resonant coefficients are determined by the equations
Normal Form Algorithm

Input:   system ẋ = Jx + X(x),
         J diagonal with eigenvalues κ_1 , . . . , κ_n
         κ = (κ_1 , . . . , κ_n)
         an integer k > 1

Output:  a normal form ẏ = Jy + Y(y) + o(|y|^k) up to order k
         a distinguished transformation x = H(y) = y + h(y) + o(|y|^k) up to order k

Procedure:
    h(y) := 0; Y(y) := 0
    FOR s = 2 TO s = k DO
        FOR m = 1 TO m = n DO
            compute X_m(y + h(y)) through order s
        FOR |α| = s DO
            FOR m = 1 TO m = n DO
                compute g_m^{(α)} using (2.41)
                IF (α , κ) − κ_m ≠ 0 THEN
                    h_m^{(α)} := g_m^{(α)} / ((α , κ) − κ_m)
                    h_m(y) := h_m(y) + h_m^{(α)} y^α
                ELSE
                    Y_m^{(α)} := g_m^{(α)}
                    Y_m(y) := Y_m(y) + Y_m^{(α)} y^α
    ẏ := Jy + Y(y) + o(|y|^k)
    H := y + h(y) + o(|y|^k)

Table 2.1 Normal Form Algorithm

(α , κ ) − 2 = 2α1 + α2 − 2 = 0
(α , κ ) − 1 = 2α1 + α2 − 1 = 0 .

When |α | = 2, the first equation has the unique solution (α1 , α2 ) = (0, 2) and the
second equation has no solution; for |α | ≥ 3, neither equation has a solution. Thus
by Definition 2.3.6, for any k ∈ N0 , the normal form through order k is
    ẏ_1 = 2y_1 + Y_1^{(0,2)} y_2^2 + o(|y|^k)
    ẏ_2 = y_2 + o(|y|^k) .
As remarked immediately after equation (2.41), for |α| = 2, g_m^{(α)} = X_m^{(α)}, so we
know that in fact Y_1^{(0,2)} = c. If we ignore the question of convergence, then (2.42) is
formally equivalent to

    ẏ_1 = 2y_1 + c y_2^2
    ẏ_2 = y_2 .
In Exercise 2.15 the reader is asked to use the Normal Form Algorithm to find the
normalizing transformation H through order two.
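The procedure of Table 2.1 can be sketched directly from formula (2.41). The function below is an illustrative implementation of mine (it is not the book's Appendix code), assuming a Python environment with sympy and a diagonal linear part; it reproduces the normal form of Example 2.3.9:

```python
import sympy as sp
from itertools import product

def normal_form(X, kappa, k, y):
    """Sketch of the Normal Form Algorithm of Table 2.1 for J = diag(kappa):
    returns (Y, h) through order k, computing each g_m^(alpha) from (2.41)
    and splitting on resonance as in (2.40)."""
    n = len(kappa)
    h = [sp.Integer(0)] * n            # distinguished transformation terms
    Y = [sp.Integer(0)] * n            # resonant (normal form) terms
    hc, Yc = {}, {}                    # (m, alpha) -> coefficient
    mono = lambda a: sp.prod([yi**ai for yi, ai in zip(y, a)])
    coeff = lambda e, a: sp.Poly(sp.expand(e), *y).coeff_monomial(mono(a))
    for s in range(2, k + 1):
        # X_m(y + h(y)); only h^(2), ..., h^(s-1) influence its order-s part
        sub = dict(zip(y, [yi + hi for yi, hi in zip(y, h)]))
        Xh = [Xm.subs(sub, simultaneous=True) for Xm in X]
        for alpha in (a for a in product(range(s + 1), repeat=n) if sum(a) == s):
            for m in range(n):
                g = coeff(Xh[m], alpha)
                # subtract the correction sum of (2.41)
                for (mm, beta), hco in list(hc.items()):
                    if mm != m:
                        continue
                    for j in range(n):
                        gam = tuple(ai - bi + (1 if i == j else 0)
                                    for i, (ai, bi) in enumerate(zip(alpha, beta)))
                        if min(gam) >= 0 and (j, gam) in Yc:
                            g -= beta[j] * hco * Yc[(j, gam)]
                lam = sum(ai * ki for ai, ki in zip(alpha, kappa)) - kappa[m]
                if lam != 0:                       # nonresonant: solve (2.40)
                    hc[(m, alpha)] = g / lam
                    h[m] += (g / lam) * mono(alpha)
                else:                              # resonant: Y_m^(alpha) = g
                    Yc[(m, alpha)] = g
                    Y[m] += g * mono(alpha)
    return [sp.expand(Ym) for Ym in Y], [sp.expand(hm) for hm in h]

# System (2.42) with eigenvalues (2, 1); a1, b1, c1 stand in for a', b', c'.
y1, y2 = sp.symbols('y1 y2')
a, b, c, a1, b1, c1 = sp.symbols('a b c a1 b1 c1')
X = [a*y1**2 + b*y1*y2 + c*y2**2, a1*y1**2 + b1*y1*y2 + c1*y2**2]
Y, h = normal_form(X, (2, 1), 3, (y1, y2))
assert Y[0] == c*y2**2 and Y[1] == 0   # matches Example 2.3.9
```

The returned h also answers Exercise 2.15 through order two; raising k extends both series, at rapidly growing symbolic cost.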

Example 2.3.10. Let us change the sign of the coefficient of x2 in the second equa-
tion of system (2.42) and consider the resulting system:

    ẋ_1 = 2x_1 + a x_1^2 + b x_1 x_2 + c x_2^2 + · · ·
    ẋ_2 = −x_2 + a′ x_1^2 + b′ x_1 x_2 + c′ x_2^2 + · · · .    (2.43)

The normal form of (2.43) is drastically different from the normal form of (2.42).
For now the resonant coefficients are determined by the equations

(α , κ ) − 2 = 2α1 − α2 − 2 = 0
(α , κ ) + 1 = 2α1 − α2 + 1 = 0 .

Solutions of the first equation that correspond to |α| ≥ 2 are the pairs (k, 2k − 2),
k ∈ N_0 , k ≥ 2; solutions of the second equation that correspond to |α| ≥ 2 are the
pairs (k, 2k + 1), k ∈ N_0 , k ≥ 1. By Definition 2.3.6, the normal form of (2.43) is

    ẏ_1 = 2y_1 + y_1 ∑_{k=1}^∞ Y_1^{(k+1,2k)} (y_1 y_2^2)^k ,
    ẏ_2 = −y_2 + y_2 ∑_{k=1}^∞ Y_2^{(k,2k+1)} (y_1 y_2^2)^k .    (2.44)

In Exercise 2.16 the reader is asked to use the Normal Form Algorithm to find the
resonant coefficients Y_1^{(2,2)} and Y_2^{(1,3)} and the normalizing transformation H through
order three.
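The resonance conditions in Examples 2.3.9 and 2.3.10 are simple linear Diophantine equations, so the resonant pairs of Definition 2.3.5 can be enumerated mechanically. The following sketch (an illustration of mine; the order cutoff is an arbitrary choice) recovers the single resonant pair for κ = (2, 1) and the two families behind (2.44) for κ = (2, −1):

```python
from itertools import product

def resonant_pairs(kappa, max_order):
    """All resonant pairs (m, alpha) with 2 <= |alpha| <= max_order for
    J = diag(kappa), per Definition 2.3.5; m is reported 1-based."""
    n = len(kappa)
    out = []
    for s in range(2, max_order + 1):
        for alpha in product(range(s + 1), repeat=n):
            if sum(alpha) != s:
                continue
            for m in range(n):
                if sum(a * k for a, k in zip(alpha, kappa)) - kappa[m] == 0:
                    out.append((m + 1, alpha))
    return out

# kappa = (2, 1): the single resonant pair of Example 2.3.9
assert resonant_pairs((2, 1), 8) == [(1, (0, 2))]

# kappa = (2, -1): the first members of the two infinite families in (2.44)
pairs = resonant_pairs((2, -1), 8)
fam1 = [(1, (k + 1, 2 * k)) for k in range(1, 3)]   # y1 (y1 y2^2)^k terms
fam2 = [(2, (k, 2 * k + 1)) for k in range(1, 3)]   # y2 (y1 y2^2)^k terms
assert set(fam1 + fam2) <= set(pairs)
```

This only detects which coefficients survive in a normal form; their values still require the recursion of Table 2.1.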

Theorem 2.3.11. Let system (2.38), with J diagonal, be given. There is a unique
normal form that can be obtained from system (2.38) by means of a distinguished
normalizing transformation, which we call the distinguished normal form for (2.38),
and the distinguished normalizing transformation that produces it is unique. The
resonant coefficients of the distinguished normal form are given by the formula

    Y_m^{(α)} = {X_m(y + h(y))}^{(α)} ,    (2.45)

where {X_m(y + h(y))}^{(α)} denotes the coefficient of y^α obtained after expanding
X_m(y + h(y)) in powers of y.
Proof. In the inductive process in the proof of Theorem 2.3.8, at each step the choice
of h_m^{(α)} is already uniquely determined if m and α are a nonresonant pair, while if
they are resonant, then h_m^{(α)} must be chosen to be zero so that the transformation
will be distinguished. Consider now the choice of Y_m^{(α)} at any step. If m and α are
a nonresonant pair, then of course Y_m^{(α)} must be chosen to be zero. If m and α are a
resonant pair, so that (α , κ) − κ_m = 0, then by (2.40) and (2.41)

    Y_m^{(α)} = g_m^{(α)} = {X_m(y + h(y))}^{(α)} − ∑_{j=1}^{n} ∑_{2≤|β|≤|α|−1, α−β+e_j ∈ N_0^n} β_j h_m^{(β)} Y_j^{(α−β+e_j)} ,    (2.46)

so again Y_m^{(α)} is uniquely determined at this stage.
    At the start of the process, when |α| = 2, as noted immediately below (2.41)
the sum in (2.46) is empty; if m and α are a resonant pair, then Y_m^{(α)} is determined
as Y_m^{(α)} = {X_m(y + h(y))}^{(α)} = X_m^{(α)}, since X and h begin with quadratic
terms. To obtain formula (2.45), consider the sum in (2.46). Any particular coefficient
Y_j^{(α−β+e_j)} can be nonzero only if j and α − β + e_j are a resonant pair, that
is, only if (α − β + e_j , κ) − κ_j = 0, which by linearity and the equality (e_j , κ) = κ_j
holds if and only if (α − β , κ) = 0. That is,

    Y_j^{(α−β+e_j)} ≠ 0 implies (α − β , κ) = 0 .    (2.47)

Thus if in the sum j and β are such that Y_j^{(α−β+e_j)} ≠ 0, then

    (β , κ) − κ_m = (β − α , κ) + (α , κ) − κ_m = 0

by (2.47) and the assumption that m and α are a resonant pair. Since h is distinguished,
this means that h_m^{(β)} = 0. Thus every term in the sum in (2.46) is zero, and
the theorem is established. 

We close this section with a theorem that gives a criterion for convergence of
normalizing transformations that applies to all the situations of interest in this book.
Recall that a series v(z) = ∑_α v^{(α)} z^α is said to majorize a series u(z) = ∑_α u^{(α)} z^α,
and v(z) is called a majorant of u(z), denoted u(z) ≺ v(z), if |u^{(α)}| ≤ v^{(α)} for all
α ∈ N_0^n. If a convergent series v(z) majorizes a series u(z), then u(z) converges on
some neighborhood of 0. By way of notation, for any series f(z) = ∑_α f^{(α)} z^α we
denote by f^♮(z) its trivial majorant, the series that is obtained by replacing each
coefficient of f by its modulus: f^♮(z) := ∑_α | f^{(α)} | z^α. Note that in the following
lemma f (x) begins with terms of order at least two.

Lemma 2.3.12. Suppose the series f(x) = ∑_{α:|α|≥2} f^{(α)} x^α converges on a neighborhood of 0 ∈ C^n. Then there exist positive real numbers a and b such that

    f^♮(x) ≺ a (∑_{j=1}^n x_j)^2 / (1 − b ∑_{j=1}^n x_j) .    (2.48)

Proof. There exist positive real numbers a_0 and b_0 such that f(x) converges on a
neighborhood of M := {x : |x_j| ≤ b_0 for 1 ≤ j ≤ n}, and |f(x)| ≤ a_0 for x ∈ M. Then
the Cauchy Inequalities state that

    | f^{(α)} | ≤ a_0 / (b_0^{α_1} · · · b_0^{α_n})  for all α ∈ N_0^n .    (2.49)

From the identity ∑_{|α|≥0} y_1^{α_1} · · · y_n^{α_n} = ∏_{j=1}^n [ ∑_{s=0}^∞ y_j^s ] it follows that

    ∑_{|α|≥0} (a_0 / (b_0^{α_1} · · · b_0^{α_n})) x_1^{α_1} · · · x_n^{α_n}
        = a_0 ∑_{|α|≥0} (x_1/b_0)^{α_1} · · · (x_n/b_0)^{α_n}
        = a_0 ∏_{j=1}^n [ ∑_{s=0}^∞ (x_j/b_0)^s ]
        = a_0 ∏_{j=1}^n (1 − x_j/b_0)^{−1}    (2.50)

holds on Int(M). By the definition of majorization, (2.49) and (2.50) yield

    f(x) ≺ a_0 ∏_{j=1}^n (1 − x_j/b_0)^{−1} .    (2.51)

It follows readily from the fact that (n + k)/(1 + k) ≤ n for all n, k ∈ N, and from the
series expansion of (1 + x)^{−n} about 0 ∈ C, that for any n ∈ N,

    (1 + x)^{−n} ≺ 1/(1 − nx) .    (2.52)

It is also readily verified that

    ∏_{j=1}^n (1 − x_j/b_0)^{−1} ≺ ( 1 − (1/b_0) ∑_{j=1}^n x_j )^{−n} .    (2.53)

Thus applying (2.52) with x replaced by −(1/b_0) ∑_{j=1}^n x_j, and using the fact that for
c ∈ R^+, 1/(1 + cu) ≺ 1/(1 − cu), (2.53) yields

    ∏_{j=1}^n (1 − x_j/b_0)^{−1} ≺ 1 / ( 1 − (n/b_0) ∑_{j=1}^n x_j ) .    (2.54)

Combining (2.51) and (2.54) yields

    f^♮(x) ≺ a_0 / ( 1 − (n/b_0) ∑_{j=1}^n x_j ) .

But since f has no constant or linear terms, the constant and linear terms in the
right-hand side may be removed, yielding finally

    f^♮(x) ≺ a_0 [ 1 / ( 1 − (n/b_0) ∑_{j=1}^n x_j ) − 1 − (n/b_0) ∑_{j=1}^n x_j ] ,

so that (2.48) holds with a = (n/b_0)^2 a_0 and b = n/b_0 . 
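The two elementary majorizations used in this proof reduce to inequalities between binomial coefficients, which can be spot-checked numerically; the following is only a finite-range illustration of mine, not part of the proof:

```python
from math import comb

# (2.52): the k-th coefficient of (1+x)^(-n) is (-1)^k C(n+k-1, k), so the
# claim (1+x)^(-n) ≺ 1/(1-nx) amounts to C(n+k-1, k) <= n^k, which follows
# from the stated fact (n+j-1)/j <= n for each factor of the binomial product.
for n in range(1, 8):
    for k in range(12):
        assert comb(n + k - 1, k) <= n**k

# (2.53) for n = 2: the coefficient of x^i y^j is b0^-(i+j) on the left and
# (i+j+1) C(i+j, i) b0^-(i+j) on the right, so domination reduces to
# 1 <= (i+j+1) C(i+j, i).
for i in range(8):
    for j in range(8):
        assert 1 <= (i + j + 1) * comb(i + j, i)
print("majorant inequalities (2.52) and (2.53) verified on a finite range")
```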

Theorem 2.3.13. Let κ_1 , . . . , κ_n be the eigenvalues of the matrix J in (2.38) and set
κ = (κ_1 , . . . , κ_n). Suppose X is analytic, that is, that each component X_m is given
by a convergent power series, and that for each resonant coefficient Y_j^{(α)} in the
distinguished normal form Y of X, α ∈ N^n (that is, every entry in the multi-index α
is positive). Suppose further that there exist positive constants d and ε such that the
following conditions hold:
(a) for all α ∈ N_0^n and all m ∈ {1, . . . , n} such that (α , κ) − κ_m ≠ 0,

    |(α , κ) − κ_m| ≥ ε ;    (2.55)

(b) for all α and β in N_0^n for which 2 ≤ |β| ≤ |α| − 1, α − β + e_m ∈ N_0^n for all
m ∈ {1, . . . , n}, and

    (α − β , κ) = 0 ,    (2.56)

the following inequality holds:

    | ∑_{j=1}^n β_j Y_j^{(α−β+e_j)} | ≤ d |(β , κ)| ∑_{j=1}^n | Y_j^{(α−β+e_j)} | .    (2.57)

Then the distinguished normalizing transformation x = H(y) is analytic as well,
that is, each component h_m(y) of h is given by a convergent power series, so that
system (2.38) is analytically equivalent to its normal form.

Proof. Suppose that a particular pair m and α correspond to a nonresonant term.
Then Y_m^{(α)} = 0 and by (2.40) and (2.41)

    |h_m^{(α)}| ≤ (1/ε) |{X_m(y + h(y))}^{(α)}|
        + (1/|(α , κ) − κ_m|) | ∑_{j=1}^n ∑_{2≤|β|≤|α|−1, α−β+e_j ∈ N_0^n} β_j h_m^{(β)} Y_j^{(α−β+e_j)} | .    (2.58)

For any nonzero term Y_j^{(α−β+e_j)} in (2.58), by (2.47)

    (α , κ) − κ_m = (α − β , κ) + (β , κ) − κ_m = (β , κ) − κ_m ,

so by hypothesis (a), |(β , κ) − κ_m| = |(α , κ) − κ_m| ≥ ε. We adopt the convention that
Y_j^{(γ)} = 0 if γ ∉ N_0^n, so that we can reverse the order of summation in (2.58) and apply
hypothesis (b). Thus the second term in (2.58) is bounded above by
    (1/|(α , κ) − κ_m|) ∑_{2≤|β|≤|α|−1} |h_m^{(β)}| | ∑_{j=1}^n β_j Y_j^{(α−β+e_j)} |
        ≤ (1/|(α , κ) − κ_m|) ∑_{2≤|β|≤|α|−1} |h_m^{(β)}| · d |(α , κ) − κ_m + κ_m| ∑_{j=1}^n |Y_j^{(α−β+e_j)}|
        ≤ d (1 + |κ_m|/ε) ∑_{j=1}^n ∑_{2≤|β|≤|α|−1, α−β+e_j ∈ N_0^n} |h_m^{(β)}| |Y_j^{(α−β+e_j)}| .

Applying this to (2.58) gives

    |h_m^{(α)}| ≤ (1/ε) |{X_m(y + h(y))}^{(α)}| + d_0 ∑_{j=1}^n ∑_{2≤|β|≤|α|−1, α−β+e_j ∈ N_0^n} |h_m^{(β)}| |Y_j^{(α−β+e_j)}| ,    (2.59)

where d_0 = max_{1≤m≤n} d(1 + |κ_m|/ε). Thus

    h_m^♮(y) ≺ (1/ε) ∑_{|α|≥2} |{X_m(y + h(y))}^{(α)}| y^α
        + d_0 ∑_{|α|≥2} ∑_{j=1}^n ∑_{2≤|β|≤|α|−1, α−β+e_j ∈ N_0^n} |h_m^{(β)}| |Y_j^{(α−β+e_j)}| y^α .    (2.60)

Clearly

    ∑_{|α|≥2} |{X_m(y + h(y))}^{(α)}| y^α ≺ X_m^♮(y + h^♮(y)) .    (2.61)

Turning to the term to the right of the plus sign in (2.60), for convenience index
the elements of N_0^n as {γ_r}_{r=1}^∞. Recalling our convention that Y^{(γ)} = 0 if γ ∉ N_0^n,

    ∑_{|α|≥2} ∑_{j=1}^n ∑_{2≤|β|≤|α|−1, α−β+e_j ∈ N_0^n} |h_m^{(β)}| |Y_j^{(α−β+e_j)}| y^α
        = ∑_{r=1}^∞ |h_m^{(β_r)}| y^{β_r} [ ∑_{j=1}^n ∑_{s=1}^∞ |Y_j^{(α_s−β_r+e_j)}| y^{α_s−β_r} ] .    (2.62)

For any fixed multi-index β_r, consider the sum

    ∑_{j=1}^n ∑_{s=1}^∞ |Y_j^{(α_s−β_r+e_j)}| y^{α_s−β_r} .    (2.63)

The number Y_j^{(α_s−β_r+e_j)} is nonzero only if j and α_s − β_r + e_j form a resonant pair.
The same term, times the same power of y, will occur in the sum (2.63) corresponding
to the multi-index β_{r′} if and only if there exists a multi-index α_{s′} that
satisfies α_{s′} − β_{r′} = α_s − β_r, which is true if and only if α_s − β_r + β_{r′} ∈ N_0^n. Writing
α_s = (α_s^1 , . . . , α_s^n) and β_r = (β_r^1 , . . . , β_r^n), then by the fact that α_s − β_r + e_j is part of
a resonant pair and the hypothesis that no entry in the multi-index of a resonant pair
is zero, if k ≠ j, then α_s^k − β_r^k ≥ 1, while α_s^j − β_r^j ≥ 0. Thus α_s − β_r + β_{r′} ∈ N_0^n, so
we conclude that the expression in (2.63) is the same for all multi-indices β_r, and
may be written

    ∑_{j=1}^n ∑_{(γ, j) resonant} |Y_j^{(γ)}| y^{γ−e_j} ,

hence (2.62) satisfies

    ∑_{|α|≥2} ∑_{j=1}^n ∑_{2≤|β|≤|α|−1, α−β+e_j ∈ N_0^n} |h_m^{(β)}| |Y_j^{(α−β+e_j)}| y^α
        ≺ ( ∑_{|β|≥2} |h_m^{(β)}| y^β ) ( ∑_{j=1}^n ∑_{(γ, j) resonant} |Y_j^{(γ)}| y^{γ−e_j} )
        = h_m^♮(y) ∑_{j=1}^n y_j^{−1} Y_j^♮(y) ,    (2.64)

which is well-defined even at y = 0 by the hypothesis on resonant pairs. Thus applying
(2.61) and (2.64) to (2.60) yields

    h_m^♮(y) ≺ (1/ε) X_m^♮(y + h^♮(y)) + d_0 h_m^♮(y) ∑_{j=1}^n y_j^{−1} Y_j^♮(y) .    (2.65)

Multiplying (2.45) by y^α and summing it over all α for which |α| ≥ 2, from (2.61)
we obtain Y_m^♮(y) ≺ X_m^♮(y + h^♮(y)). Summing this latter expression and expression
(2.65) over m, and recalling that for a distinguished transformation h_m^{(α)} = 0 for all
resonant pairs, we obtain the existence of a real constant c_1 > 0 such that the
following majorizing relation holds:

    ∑_{m=1}^n Y_m^♮(y) + ∑_{m=1}^n h_m^♮(y) ≺ c_1 ∑_{m=1}^n X_m^♮(y + h^♮(y)) + c_1 ∑_{m=1}^n h_m^♮(y) ∑_{j=1}^n Y_j^♮(y)/y_j .    (2.66)

By Lemma 2.3.12,

    X_m^♮(y + h^♮(y)) ≺ a ( ∑_{j=1}^n y_j + ∑_{j=1}^n h_j^♮(y) )^2 / ( 1 − b ( ∑_{j=1}^n y_j + ∑_{j=1}^n h_j^♮(y) ) ) ,    (2.67)

so that (2.66) becomes

    ∑_{m=1}^n Y_m^♮(y) + ∑_{m=1}^n h_m^♮(y)
        ≺ c_1 a ( ∑_{j=1}^n y_j + ∑_{j=1}^n h_j^♮(y) )^2 / ( 1 − b ( ∑_{j=1}^n y_j + ∑_{j=1}^n h_j^♮(y) ) )
          + c_1 ∑_{m=1}^n h_m^♮(y) ∑_{j=1}^n y_j^{−1} Y_j^♮(y) .    (2.68)
To prove the theorem it suffices to show that the series

    S(y) = ∑_{m=1}^n Y_m^♮(y) + ∑_{m=1}^n h_m^♮(y)

converges at some nonzero value of y. We will show convergence at y = (η , . . . , η)
for some positive value of the real variable η. To do so, we consider the real
series S(η , . . . , η). Since Y and h begin with quadratic or higher terms, we can
write S(η , . . . , η) = η U(η) for a real series U(η) = ∑_{k=1}^∞ u_k η^k, where the coefficients
u_k are nonnegative. By the definition of S and U, U(η) clearly satisfies
∑_{m=1}^n h_m^♮(η , . . . , η) ≺ η U(η) and ∑_{m=1}^n Y_m^♮(η , . . . , η) ≺ η U(η), which when applied
to (2.68) yield

    η U(η) = S(η , . . . , η) ≺ c_1 a (nη + η U(η))^2 / ( 1 − b(nη + η U(η)) ) + c_1 η U^2(η) ,

so that

    U(η) ≺ c_1 U^2(η) + c_1 a η (n + U(η))^2 / ( 1 − b η (n + U(η)) ) .    (2.69)
Consider the real analytic function F defined on a neighborhood of (0, 0) ∈ R^2 by

    F(x, y) = c_1 x^2 + c_1 a y (n + x)^2 / ( 1 − b y (n + x) ) .

Using the geometric series to expand the second term, it is clear that

F(x, y) = F0 (y) + F1(y)x + F2(y)x2 + · · · ,

where F_j(0) = 0 for j ≠ 2 and F_j(y) ≥ 0 if y ≥ 0. Thus for any sequence of real
constants r_1 , r_2 , r_3 , . . . ,

    F( ∑_{k=1}^∞ r_k y^k , y ) = δ_1 y + δ_2(r_1) y^2 + δ_3(r_1 , r_2) y^3 + · · · ,    (2.70)

where δ_1 ≥ 0 and, for k ≥ 2, δ_k is a polynomial in r_1 , r_2 , . . . , r_{k−1} with nonnegative
coefficients. Thus

    0 ≤ a_j ≤ b_j for j = 1, . . . , k − 1 implies δ_k(a_1 , . . . , a_{k−1}) ≤ δ_k(b_1 , . . . , b_{k−1}) .    (2.71)
By the Implicit Function Theorem there is a unique real analytic function w = w(y)
defined on a neighborhood of 0 in R such that w(0) = 0 and x − F(x, y) = 0 if and
only if x = w(y); write w(y) = ∑_{k=1}^∞ w_k y^k. By (2.70) the coefficients w_k satisfy

    w_k = δ_k(w_1 , . . . , w_{k−1}) for k ≥ 2,    (2.72)

and by (2.69) the coefficients uk satisfy

uk ≤ δk (u1 , . . . , uk−1 ) for k ≥ 2. (2.73)

A simple computation shows that u1 = c1 an2 = w1 , hence (2.71), (2.72), and (2.73)
imply by mathematical induction that uk ≤ wk for k ≥ 1. Thus U(η ) ≺ w(η ), imply-
ing the convergence of U on a neighborhood of 0 in R, which implies convergence
of S, hence of h and Y, on a neighborhood of 0 in Cn . 

2.4 Notes and Complements

Our concern in this chapter has been with vector fields in a neighborhood of an
equilibrium. In Section 2.1 we presented a few of Lyapunov’s classical results on
the stability problem. Other theorems of this sort are available in the literature. The
reader is referred to the monograph of La Salle ([109]) for generalizations of the
classical Lyapunov theory.
The idea of placing system (2.1) in some sort of normal form in preparation for
a more general study, or in order to take advantage of a general property of the
collections of systems under consideration, has wide applicability. To cite just one
example, a class of systems of differential equations that has received much attention
is the set of quadratic systems in the plane, which are those of the form

    ẋ = a_{00} + a_{10} x + a_{01} y + a_{20} x^2 + a_{11} xy + a_{02} y^2
    ẏ = b_{00} + b_{10} x + b_{01} y + b_{20} x^2 + b_{11} xy + b_{02} y^2 .    (2.74)

Using special properties of such systems, it is possible by a sequence of coordinate
transformations and time rescalings to place any such system that has a cycle in its
phase portrait into the special form

    ẋ = δ x − y + ℓ x^2 + m xy + n y^2
    ẏ = x(1 + ax + by) ,    (2.75)

thereby simplifying the expression, reducing the number of parameters, and making
one parameter (δ ) into a rotation parameter on the portion of the plane in which a
cycle can exist. (See §12 of [202].)
Turning to the problem of computing normal forms, if we wish to know the ac-
tual coefficients in the normal form in terms of the original coefficients, whether
numerical or symbolic, we must keep track of all coefficients exactly at each suc-
cessive step of the sequence of transformations leading to the normal form. Hand
computation can quickly become infeasible. Computer algebra approaches to the
actual computation of normal forms are treated in the literature; the reader is referred
to [71, 139, 146]. Sample code for the algorithm in Table 2.1 is in the Appendix.
We have noted that unless L maps Hs onto itself, the subspace Ks comple-
mentary to Image(L ) is not unique. It is reasonable to attempt to make a uniform,
or at least systematic, choice of the subspace Ks , s ≥ 2. Such a systematic choice
is termed in [139] a normal form style, and the reader is directed there for a full
discussion.
For more exhaustive treatments of the theory of normal forms the reader can
consult, for example, the references [15, 16, 19, 24, 25, 115, 139, 142, 180, 181].

Exercises

2.1 Show that the trajectory of every point in a neighborhood of the equilibrium
x0 = (1, 0) of the system
    ẋ = x − (x + y)√(x^2 + y^2) + xy
    ẏ = y + (x − y)√(x^2 + y^2) − x^2

on R2 tends to x0 in forward time, yet in every neighborhood of x0 there exists
a point whose forward trajectory travels distance at least 1 away from x0 before
returning to limit on x0 .
Hint. Change to polar coordinates.
2.2 Consider the general second-order linear homogeneous ordinary differential
equation in one dependent variable in standard form,

ẍ + f (x)ẋ + g(x) = 0 , (2.76)

generally called a Liénard equation.


a. Show that if there exists ε > 0 suchR that xg(x) > 0 whenever 0 < |x| < ε ,
then the function W (x, y) = 21 y2 + 0x g(u) du is positive definite on a neigh-
borhood of 0 ∈ R2 .
b. Assuming the truth of the condition on g(x) in part (a), use the function W
to show that the equilibrium of the system

ẋ1 = x2
ẋ2 = −g(x1 ) − f (x1 )x2 ,

   which is equivalent to (2.76), is stable if f(x) ≡ 0 and is asymptotically stable
   if f(x) > 0 for 0 < |x| < ε .
2.3 Transform system (2.76) into the equivalent Liénard form

ẋ1 = x2 − F(x1 )
(2.77)
ẋ2 = −g(x1 ) ,
where F(x) = ∫_0^x f(u) du. Formulate conditions in terms of g(x) and F(x) so that
the equilibrium of (2.77) is stable, asymptotically stable, or unstable.
2.4 Construct a counterexample to Theorem 2.1.4 if tangency to C \ {0} is allowed,
even if at only countably many points.
2.5 Prove Theorem 2.1.5.
2.6 Show that the dimension of the vector space Hs of functions from Rn to Rn
(or from Cn to Cn ), all of whose components are homogeneous polynomial
functions of degree s, is nC(s + n − 1, s) = n(s + n − 1)!/(s!(n − 1)!).
Hint. Think in terms of distributing p identical objects into q different boxes,
which is the same as selecting with repetition p objects
 from q types of objects.
2.7 Show that if the eigenvalues of the matrix [ a b ; c d ] are ±iβ (β ∈ R), then by a
linear transformation the system

ẋ = ax + by
ẏ = cx + dy

can be brought to the form


ẋ = β y
ẏ = −β x .
2.8 For system (2.1) on R², suppose f(0) = 0 and A = df(0) has exactly one zero
eigenvalue. Show that for each k ∈ N, k ≥ 2, each vector in the ordered basis
of Hk of Example 2.2.2 is an eigenvector of L , and that L has zero as an
eigenvalue of multiplicity two, with corresponding eigenvectors (xy^{k−1}, 0)ᵀ and
(0, y^k)ᵀ, where x and y denote the usual coordinates on R². Write down the normal
form for (2.1) through order k. (Assume from the outset that A has been placed
in Jordan normal form, that is, diagonalized.)
2.9 For system (2.1) on R², suppose f(0) = 0 and A = df(0) has both eigenvalues
zero but is not itself the zero transformation, hence has upper-triangular Jordan
normal form [ 0 1 ; 0 0 ]. (This form for the linear part is traditional in this context.)
a. Show that the matrix of L : H₂ → H₂ with respect to the ordered basis of
Example 2.2.2 is

        ⎡ 0 0 0 −1  0  0 ⎤
        ⎢ 2 0 0  0 −1  0 ⎥
    L = ⎢ 0 1 0  0  0 −1 ⎥ .
        ⎢ 0 0 0  0  0  0 ⎥
        ⎢ 0 0 0  2  0  0 ⎥
        ⎣ 0 0 0  0  1  0 ⎦
b. Show that the rank of L (the dimension of its column space) is four, hence
that dim Image(L ) = 4.
c. Show that none of the basis vectors (x², 0)ᵀ, (0, xy)ᵀ, and (0, x²)ᵀ lies in Image(L ).
Hint. Work in coordinates, for example replacing the second vector in the list
by its coordinate vector, the standard basis vector e₅ of R⁶.

d. Explain why two possible normal forms of f through order two are

(i)  ẋ = y + O(3),            (ii)  ẋ = y + ax² + O(3),
     ẏ = ax² + bxy + O(3);          ẏ = bx² + O(3).

Remark: This is the “Bogdanov–Takens Singularity.” Form (i) in part (d) is the
normal form of Bogdanov ([20]); form (ii) is that of Takens ([187]).
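Parts (b) and (c) of Exercise 2.9 are finite linear algebra and can be checked numerically; a numpy sketch, using the ordered basis (x²,0), (xy,0), (y²,0), (0,x²), (0,xy), (0,y²):

```python
import numpy as np

# Matrix of L on H_2 from part (a)
L = np.array([
    [0, 0, 0, -1,  0,  0],
    [2, 0, 0,  0, -1,  0],
    [0, 1, 0,  0,  0, -1],
    [0, 0, 0,  0,  0,  0],
    [0, 0, 0,  2,  0,  0],
    [0, 0, 0,  0,  1,  0],
], dtype=float)

assert np.linalg.matrix_rank(L) == 4   # part (b)

def in_column_space(A, b):
    # b lies in Image(A) iff appending b does not raise the rank
    return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

e = np.eye(6)
for k in (0, 4, 3):                    # (x^2,0), (0,xy), (0,x^2) -> e1, e5, e4
    assert not in_column_space(L, e[:, k])   # part (c)
```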
2.10 In the same situation as that of Exercise 2.9, show that for all k ∈ N, k ≥ 3,
dim(Image(L )) = dim(Hk) − 2 and that Kk = Span{ (x^k, 0)ᵀ, (0, x^k)ᵀ } is one
choice of Kk. Hence a normal form for (2.1) through order r is

ẋ = y + a₂x² + a₃x³ + · · · + a_r x^r + O(r + 1)
ẏ = b₂x² + b₃x³ + · · · + b_r x^r + O(r + 1) .

2.11 Show that if S is a nonsingular n × n matrix such that J = SAS⁻¹, then under the
coordinate change x = Sy, expression (2.30) for L is transformed into (2.31),
as follows.
a. Writing S : Cⁿ → Cⁿ : x ↦ y = Sx and for h ∈ Hs letting u = S⁻¹ ∘ h ∘ S ,
use the fact that dS (y)z = Sz to show that L h(x) = S du(y)Jy − ASu(y).
b. Use the fact that L u = S⁻¹ ∘ L h ∘ S and the result of part (a) to obtain
(2.31).
2.12 a. Order the basis vectors of Example 2.2.2 according to the lexicographic or-
der of the proof of Lemma 2.3.1.
b. Rework part (a) of Exercise 2.9 using lexicographic order. (Merely use the
result of Exercise 2.9 without additional computation.)
c. By inspection of the diagonal entries of the matrix in part (b), determine the
eigenvalues of L .
d. Verify by direct computation of the numbers (κ, α) − κ_j that the eigenvalues
of L are all zero.
2.13 [Referenced in Section 3.2.] Let κ1 , . . . , κn be the eigenvalues of the matrix A in
display (2.27).
a. Suppose n = 2. Show that if κ₁ ≠ 0, then a necessary condition that there be
a resonant term in either component of X is that κ₂/κ₁ be a rational number.
Similarly if κ₂ ≠ 0. (If the ratio of the eigenvalues is p/q, GCD(p, q) = 1,
then the resonance is called a p : q resonance.)
b. Show that the analogous statement is not true when n ≥ 3.
2.14 Derive the rightmost expression in (2.41) in the following three steps.
a. Use the expansions of h and Y analogous to (2.36) to determine that the
vector homogeneous function of degree s in dh(y)Y(y) is a sum of s − 2
products of the form dh^{(k)}(y)Y^{(ℓ)}(y).
b. Show that for each r ∈ {2, . . . , s − 1}, the mth component of the correspond-
ing product in part (a) is

∑_{j=1}^{n} ( ∑_{|β|=r} β_j h_m^{(β)} y₁^{β₁} · · · y_j^{β_j−1} · · · y_n^{β_n} ) ( ∑_{|γ|=s−r+1} Y_j^{(γ)} y^γ ) .

c. For any α such that |α | = s, use the expression in part (b) to find the co-
efficient of yα in the mth component of dh(y)Y(y), thereby obtaining the
negative of the rightmost expression in (2.41).
2.15 Apply the Normal Form Algorithm displayed in Table 2.1 on page 75 to show
that the normalizing transformation H through order two in Example 2.2.2 is

H( y₁, y₂ ) = ( y₁, y₂ )ᵀ + ( ½ay₁² + by₁y₂ ,  ⅓a′y₁² + ½b′y₁y₂ + c′y₂² )ᵀ + · · · .

2.16 Apply the Normal Form Algorithm displayed in Table 2.1 on page 75 to com-
pute the coefficients Y₁^{(2,2)} and Y₂^{(1,3)} of the normal form (2.44) of Example
2.3.10 and the normalizing transformation H(y) up to order three.
2.17 In system (2.38) let J be a diagonal 2 × 2 matrix with eigenvalues κ₁ = p and
κ₂ = −q, where p, q ∈ N and GCD(p, q) = 1. Show that the normal form of
system (2.38) is

ẏ₁ = py₁ + y₁Y₁(w)
ẏ₂ = −qy₂ + y₂Y₂(w) ,

where Y₁ and Y₂ are formal power series without constant terms and w = y₁^q y₂^p .
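The resonance structure behind Exercise 2.17 can be checked by brute force; a sketch in which p = 2, q = 3 are sample values (not from the book):

```python
# A monomial y1^a1 * y2^a2 in component j is resonant iff a1*p - a2*q = kappa_j.
p, q = 2, 3               # hypothetical sample values with GCD(p, q) = 1
kappa = {1: p, 2: -q}

res = {j: [(a1, a2) for a1 in range(8) for a2 in range(8)
           if a1 + a2 >= 2 and a1*p - a2*q == kj]
       for j, kj in kappa.items()}

# component 1 gives y1*w and y1*w**2; component 2 gives y2*w and y2*w**2,
# where w = y1**q * y2**p = y1**3 * y2**2
assert res[1] == [(4, 2), (7, 4)]
assert res[2] == [(3, 3), (6, 5)]
```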
2.18 For system (2.1) on Rⁿ, suppose that f(0) = 0 and that the linear part A = df(0)
of f at 0 is diagonalizable. For k ∈ N, k ≥ 2, choose as a basis of Hk the ana-
logue of the basis of Example 2.2.2, consisting of functions of the form

h_{α,j} : Rⁿ → Rⁿ : (x₁, x₂, . . . , x_n)ᵀ ↦ x₁^{α₁} x₂^{α₂} · · · x_n^{α_n} e_j ,

where α = (α₁, . . . , α_n) ∈ N₀ⁿ, α₁ + α₂ + · · · + α_n = k, j ∈ {1, 2, . . . , n}, and
for each such j, e_j is the jth standard basis vector of Rⁿ, as always. Show by
direct computation using the definition (2.18) of L that each basis vector h_{α,j}
is an eigenvector of L , with corresponding eigenvalue λ_j − ∑_{i=1}^{n} α_i λ_i. (Note
how Example 2.2.2 and Exercise 2.8 fit into this framework but Exercises 2.9
and 2.10 do not.) When λ_j ≠ 0 for all j the equilibrium is hyperbolic, and the
non-resonance conditions

λ_j − ∑_{k=1}^{n} α_k λ_k ≠ 0   for all 1 ≤ j ≤ n, for all α ∈ N₀ⁿ

are necessary conditions for the existence of a smooth linearization. Results on


the existence of smooth linearizations are given in [183] (for the C∞ case) and
in [180] (for the analytic case).
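The eigenvector computation of Exercise 2.18 can be spot-checked with sympy; the operator below uses the sign convention L h = Ah − dh·Ax, which matches the eigenvalue λ_j − ∑ αᵢλᵢ stated in the exercise (the book's definition (2.18) fixes the actual sign):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
l1, l2 = sp.symbols('lambda1 lambda2')
A = sp.diag(l1, l2)          # diagonalized linear part
X = sp.Matrix([x1, x2])

def L_op(h):
    # L h = A h(x) - dh(x) A x  (sign chosen to match the exercise's eigenvalue)
    return A*h - h.jacobian(X)*(A*X)

# basis vector h_{alpha, j} with alpha = (2, 1) and j = 1, a sample choice
a = (2, 1)
h = sp.Matrix([x1**a[0] * x2**a[1], 0])
eig = l1 - (a[0]*l1 + a[1]*l2)

assert sp.simplify(L_op(h) - eig*h) == sp.zeros(2, 1)
```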
Chapter 3
The Center Problem

Consider a real planar system of differential equations u̇ = f(u), defined and analytic
on a neighborhood of 0, for which f(0) = 0 and the eigenvalues of the linear part of
f at 0 are α ± iβ with β ≠ 0. If the system is actually linear, then a straightforward
geometric analysis (see, for example, [44], [95], or [110]) shows that when α 6= 0
the trajectory of every point spirals towards or away from 0 (see Definition 3.1.1:
0 is a focus), but when α = 0, the trajectory of every point except 0 is a cycle, that
is, lies in an oval (see Definition 3.1.1: 0 is a center). When the system is nonlinear,
then in the first case (α ≠ 0) trajectories in a sufficiently small neighborhood of the
origin follow the behavior of the linear system determined by the linear part of f at
0: they spiral towards or away from the origin in accordance with the trajectories
of the linear system. The second case is different: the linear approximation does
not necessarily determine the geometric behavior of the trajectories of the nonlin-
ear system in a neighborhood of the origin. This phenomenon is illustrated by the
system
u̇ = −v − u(u2 + v2 )
(3.1)
v̇ = u − v(u2 + v2 ) .
In polar coordinates system (3.1) is ṙ = −r3 , ϕ̇ = 1 . Thus whereas the origin is a
center for the corresponding linear system, every trajectory of (3.1) spirals towards
the origin, which is thus a stable focus. On the other hand, one can just as easily
construct examples in which the addition of higher-order terms does not destroy the
center. We see then that in the case of a singular point for which the eigenvalues
of the linear part are purely imaginary the topological type of the point is not de-
termined by the linear approximation, and a special investigation is needed. Here
we face the fascinating problem in the qualitative theory of differential equations
known as the problem of distinguishing between a center and a focus, or just the
center problem for short, which is the subject of this chapter. Since there always ex-
ists a nonsingular linear transformation that transforms a system whose linear part
has eigenvalues α ± iβ into the form (3.4) below, and a time rescaling τ = β t will
eliminate β , the objects of study will be systems of the form

V.G. Romanovski, D.S. Shafer, The Center and Cyclicity Problems, 89


DOI 10.1007/978-0-8176-4727-8_3,
© Birkhäuser is a part of Springer Science+Business Media, LLC 2009

u̇ = −v + U(u, v)
(3.2)
v̇ = u + V(u, v) ,

where U and V are convergent real series starting with quadratic terms. We reiterate
that every real analytic planar system of differential equations with a nonzero linear
part at the origin that has purely imaginary eigenvalues can be placed in this form
by a translation, followed by a nonsingular linear change of coordinates, followed
by a time rescaling.
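The polar-form claim for (3.1) is a one-line computation; a sympy check (not from the book):

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)
r = sp.sqrt(u**2 + v**2)

udot = -v - u*(u**2 + v**2)
vdot = u - v*(u**2 + v**2)

rdot = sp.expand(u*udot + v*vdot)/r       # r*rdot = u*udot + v*vdot
phidot = sp.expand(u*vdot - v*udot)/r**2  # r**2*phidot = u*vdot - v*udot

assert sp.simplify(rdot + r**3) == 0      # rdot = -r**3
assert sp.simplify(phidot - 1) == 0       # phidot = 1
```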
Our principal concern, however, is not so much with this or that particular system
of the form (3.2) as it is with families of such systems, and in particular with families
of polynomial systems, such as the family of all quadratic systems (see (2.74)) of
the form (3.2) (which is precisely the collection of quadratic systems that can have
a center at the origin, after the transformations described in the previous sentence).
Conditions that characterize when a member of such a family has a center at the
origin are given by the vanishing of polynomials in the coefficients of the members
of the family, hence they yield a variety in the space of coefficients. Because the
eigenvalues of the linear part at the origin of every system in question are complex,
and because complex varieties are more amenable to study than real varieties, it is
natural to complexify the family. This leads to the study of families of systems that
are of the form

ẋ = P̃(x, y) = i ( x − ∑_{(p,q)∈S} a_{pq} x^{p+1} y^q )
                                                        (3.3)
ẏ = Q̃(x, y) = −i ( y − ∑_{(p,q)∈S} b_{qp} x^q y^{p+1} ) ,

where the variables x and y are complex, the coefficients of P̃ and Q̃ are complex,
where S ⊂ ({−1} ∪ N₀) × N₀ is a finite set, every element (p, q) of which satisfies
p + q ≥ 1, and where b_{pq} = ā_{qp} for all (p, q) ∈ S. The somewhat unusual indexing
is chosen to simplify expressions that will arise later. In this chapter we develop an
approach to solving the center problem for such a family.
An overview of the approach, and of this chapter, is the following. We begin in
Section 3.1 with a local analysis of any system for which the eigenvalues of the real
part are complex, then turn in Section 3.2 to the idea of complexifying the system
and examining the resulting normal form. The geometry of a center suggests that
there should be a kind of potential function Ψ defined on a neighborhood of the
origin into R whose level sets are the closed orbits about the center; such a function
is a so-called first integral of the motion and is shown in Section 3.2 to be intimately
connected with the existence of a center. Working from characterizations of centers
thus obtained, in Section 3.3 we generalize the concept of a center on R2 to sys-
tems of the form (3.3) on C2 . We then derive, for a family of systems of the form
(3.3), the focus quantities, a collection of polynomials {gkk : k ∈ N} in the coef-
ficients of family (3.3) whose simultaneous vanishing picks out the coefficients of
the systems that have a center at the origin. The variety so identified in the space
of coefficients is the center variety VC . Since by the Hilbert Basis Theorem (Theo-
rem 1.1.6) every polynomial ideal is generated by finitely many polynomials, there

must exist a K ∈ N such that the ideal B = ⟨g₁₁, g₂₂, . . .⟩ of the center variety V_C
satisfies B = B_K = ⟨g₁₁, . . . , g_{KK}⟩. Thus the center problem will be solved if we
find a number K such that B = B_K. Actually, finding such a number K is too strong
a demand: we are interested only in the variety of the ideal, not the ideal itself, so
by Proposition 1.3.16 it is sufficient to find a number K such that √B = √B_K.
To do so for a specific family (3.3) we compute the first few focus quantities. An
efficient computational algorithm is developed in Section 3.4. We compute the g_{kk}
until g_{J+1,J+1} ∈ √⟨g₁₁, . . . , g_{JJ}⟩, indicating that perhaps K = J; we might compute
the next several focus quantities, confirming that g_{J+s,J+s} ∈ √B_J for small s ∈ N
and thereby strengthening our conviction that K = J. Since

V(g₁₁) ⊃ V(g₁₁, g₂₂) ⊃ V(g₁₁, g₂₂, g₃₃) ⊃ · · · ⊃ V_C

the vanishing of the J polynomials g11 , . . . , gJJ on the coefficients of a particular


system in (3.3) is a necessary condition that the system have a center at the origin.
We will have shown that K = J, and thus have solved the center problem, if we
can show that the vanishing of these J polynomials on the coefficients of a given
system is a sufficient condition that the system have a center at the origin. To do so
we need techniques for determining that a system of a given form has a center, and
we present two of the most important methods for doing so in the context of our
problem in Sections 3.5 and 3.6. The whole approach is illustrated in Section 3.7,
in which we apply it to find the center variety of the full set of quadratic systems,
as well as a family of cubic systems. As a complement to this approach, in the final
section we study an important class of real planar systems, the Liénard systems.
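The radical-membership tests just described reduce to Gröbner basis computations: g ∈ √⟨g₁, . . . , g_J⟩ if and only if 1 ∈ ⟨g₁, . . . , g_J, 1 − tg⟩ in a ring with one extra variable (the Rabinowitsch trick; this is Proposition 1.3.16 put to work). A toy sympy sketch with made-up polynomials far smaller than real focus quantities:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

def in_radical(g, gens, ring_vars):
    # g is in the radical of <gens> iff the Groebner basis of
    # <gens, 1 - t*g> is the trivial ideal [1]
    G = sp.groebner(list(gens) + [1 - t*g], *ring_vars, t, order='lex')
    return list(G.exprs) == [1]

# x + y vanishes on V((x + y)**2), so it lies in the radical ...
assert in_radical(x + y, [(x + y)**2], (x, y))
# ... but x does not
assert not in_radical(x, [(x + y)**2], (x, y))
```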
Since throughout this chapter we will be dealing with both real systems and their
complexifications, we will denote real variables by u and v and complex variables
by x and y, except in Section 3.8 on Liénard systems, where only real systems are
in view and we employ the customary notation. For formal and convergent power
series we will consistently follow the notational convention introduced in the para-
graph following Remark 2.3.7. In variables subscripted by ordered pairs of integers,
a comma is occasionally introduced for clarity, but carries no other meaning, so that
for example we write gkk in (3.56) but gk1 ,k2 in (3.55). Finally, it will be convenient
to have the notation N−n = {−n, . . ., −1, 0} ∪ N.

3.1 The Poincaré First Return Map and the Lyapunov Numbers

Throughout this and the following section the object of study is a real analytic sys-
tem u̇ = f(u) on a neighborhood of 0 in R2 , where f(0) = 0 and the eigenvalues of
the linear part of f at 0 are α ± iβ with β ≠ 0. By a nonsingular linear coordinate
change, such a system can be written in the form

u̇ = α u − β v + P(u, v)
(3.4)
v̇ = β u + α v + Q(u, v) ,

where P(u, v) = ∑_{k=2}^{∞} P^{(k)}(u, v) and Q(u, v) = ∑_{k=2}^{∞} Q^{(k)}(u, v), and P^{(k)}(u, v) and
Q^{(k)}(u, v) (if nonzero) are homogeneous polynomials of degree k. In Chapter 6 we
will need information about the function P(r) of Definition 3.1.3 in the case that
α ≠ 0, so we will not specialize to the situation α = 0 until the next section. The
following definition, which applies equally well to systems that are only C1 , is cen-
tral. For the second part recall that the omega limit set of a point a for which a(t)
is defined for all t ≥ 0 is the set of all points p for which there exists a sequence
t1 < t2 < · · · of numbers such that tn → ∞ and a(tn ) → p as n → ∞.
Definition 3.1.1. Let u̇ = f(u) be a real analytic system of differential equations on
a neighborhood of 0 in R2 for which f(0) = 0.
(a) The singularity at 0 is a center if there exists a neighborhood Ω of 0 such that
the trajectory of any point in Ω \ {0} is a simple closed curve γ that contains 0
in its interior (the bounded component of the complement R2 \ {γ } of γ ). The
period annulus of the center is the set ΩM \ {0}, where ΩM is the largest such
neighborhood Ω , with respect to set inclusion.
(b) The singularity at 0 is a stable focus if there exists a neighborhood Ω of 0 such
that 0 is the omega limit set of every point in Ω and for every trajectory u(t) in
Ω \ {0} a continuous determination of the angular polar coordinate ϕ (t) along
u(t) tends to ∞ or to −∞ as t increases without bound. The singularity is an
unstable focus if it is a stable focus for the time-reversed system u̇ = −f(u). The
singularity is a focus if it is either a stable focus or an unstable focus.
For smooth systems the behavior of trajectories near a focus can be somewhat
bizarre (even in the C∞ case), but the theory presented in this section shows that for
analytic systems the angular polar coordinate of any trajectory sufficently near zero
changes monotonically, so that every trajectory properly spirals onto or away from
0 in future time.
In polar coordinates u = r cos ϕ, v = r sin ϕ, system (3.4) becomes

ṙ = αr + P(r cos ϕ, r sin ϕ) cos ϕ + Q(r cos ϕ, r sin ϕ) sin ϕ
  = αr + r²[ P^{(2)}(cos ϕ, sin ϕ) cos ϕ + Q^{(2)}(cos ϕ, sin ϕ) sin ϕ ] + · · ·
                                                                            (3.5)
ϕ̇ = β − r⁻¹[ P(r cos ϕ, r sin ϕ) sin ϕ − Q(r cos ϕ, r sin ϕ) cos ϕ ]
  = β − r[ P^{(2)}(cos ϕ, sin ϕ) sin ϕ − Q^{(2)}(cos ϕ, sin ϕ) cos ϕ ] + · · · .

It is clear that for |r| sufficiently small, if β > 0 then the polar angle ϕ increases as
t increases, while if β < 0 then the angle decreases as t increases.
It is convenient to consider, in place of system (3.5), the equation of its trajecto-
ries

dr/dϕ = ( αr + r²F(r, sin ϕ, cos ϕ) ) / ( β + rG(r, sin ϕ, cos ϕ) ) = R(r, ϕ) .   (3.6)
The function R(r, ϕ ) is a 2π -periodic function of ϕ and is analytic for all ϕ and for
|r| < r∗ , for some sufficiently small r∗ . The fact that the origin is a singularity for
(3.4) corresponds to the fact that R(0, ϕ ) ≡ 0, so that r = 0 is a solution of (3.6). We

can expand R(r, ϕ) in a power series in r:

dr/dϕ = R(r, ϕ) = rR₁(ϕ) + r²R₂(ϕ) + · · · = (α/β) r + · · · ,   (3.7)

where Rk (ϕ ) are 2π -periodic functions of ϕ . The series is convergent for all ϕ and
for all sufficiently small r.
Denote by r = f (ϕ , ϕ0 , r0 ) the solution of system (3.7) with initial conditions
r = r0 and ϕ = ϕ0 . The function f (ϕ , ϕ0 , r0 ) is an analytic function of all three
variables ϕ , ϕ0 , and r0 and has the property that

f (ϕ , ϕ0 , 0) ≡ 0 (3.8)

(because r = 0 is a solution of (3.7)). Equation (3.8) and continuous dependence of


solutions on parameters yield the following proposition.

Proposition 3.1.2. Every trajectory of system (3.4) in a sufficiently small neighbor-


hood of the origin crosses every ray ϕ = c, 0 ≤ c < 2π .

The proposition implies that in order to investigate all trajectories in a sufficiently


small neighborhood of the origin it is sufficient to consider all trajectories passing
through a segment Σ = {(u, v) : v = 0, 0 ≤ u ≤ r∗ } for r∗ sufficiently small, that is,
all solutions r = f (ϕ , 0, r0 ). We can expand f (ϕ , 0, r0 ) in a power series in r0 ,

r = f(ϕ, 0, r₀) = w₁(ϕ)r₀ + w₂(ϕ)r₀² + · · · ,   (3.9)

which is convergent for all 0 ≤ ϕ ≤ 2π and for |r0 | < r∗ . This function is a solution
of (3.7), hence

w₁′r₀ + w₂′r₀² + · · ·
  ≡ R₁(ϕ)( w₁(ϕ)r₀ + w₂(ϕ)r₀² + · · · ) + R₂(ϕ)( w₁(ϕ)r₀ + w₂(ϕ)r₀² + · · · )² + · · · ,

where the primes denote differentiation with respect to ϕ . Equating the coefficients
of like powers of r0 in this identity, we obtain recurrence differential equations for
the functions w j (ϕ ):

w₁′ = R₁(ϕ)w₁ ,
w₂′ = R₁(ϕ)w₂ + R₂(ϕ)w₁² ,
w₃′ = R₁(ϕ)w₃ + 2R₂(ϕ)w₁w₂ + R₃(ϕ)w₁³ ,      (3.10)
  ⋮

The initial condition r = f (0, 0, r0 ) = r0 yields

w1 (0) = 1, w j (0) = 0 for j > 1. (3.11)


94 3 The Center Problem

Using these conditions, we can successively find the functions w_j(ϕ) by integrating
equations (3.10). In particular,

w₁(ϕ) = e^{αϕ/β} .   (3.12)
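Since R₁(ϕ) = α/β by (3.7), the first equation of (3.10) with initial condition (3.11) can be handed to a symbolic solver; a sketch:

```python
import sympy as sp

phi = sp.symbols('phi')
alpha, beta = sp.symbols('alpha beta', nonzero=True)
w1 = sp.Function('w1')

# w1' = R1(phi)*w1 with R1 = alpha/beta and w1(0) = 1, from (3.10)-(3.11)
sol = sp.dsolve(sp.Eq(w1(phi).diff(phi), (alpha/beta)*w1(phi)),
                w1(phi), ics={w1(0): 1})
assert sp.simplify(sol.rhs - sp.exp(alpha*phi/beta)) == 0   # recovers (3.12)
```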

Setting ϕ = 2π in the solution r = f (ϕ , 0, r0 ) we obtain the value r = f (2π , 0, r0 ),


corresponding to the point of Σ where the trajectory r = f (ϕ , 0, r0 ) first intersects
Σ again.

Definition 3.1.3. Fix a system of the form (3.4).


(a) The function

R(r₀) = f(2π, 0, r₀) = η̃₁r₀ + η₂r₀² + η₃r₀³ + · · ·   (3.13)

(defined for |r0 | < r∗ ), where η̃1 = w1 (2π ) and η j = w j (2π ) for j ≥ 2, is called
the Poincaré first return map or just the return map.
(b) The function

P(r₀) = R(r₀) − r₀ = η₁r₀ + η₂r₀² + η₃r₀³ + · · ·   (3.14)

is called the difference function.


(c) The coefficient η j , j ∈ N, is called the jth Lyapunov number.

Note in particular that by (3.12) the first Lyapunov number η₁ has the value
η₁ = η̃₁ − 1 = e^{2πα/β} − 1. Zeros of (3.14) correspond to cycles (closed orbits, that
is, orbits that are ovals) of system (3.4); isolated zeros correspond to limit cycles
(isolated closed orbits).
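For the concrete system (3.1), equation (3.6) is simply dr/dϕ = −r³, so the return map can be computed numerically and compared with the exact solution r(2π) = r₀/√(1 + 4πr₀²); a sketch using plain RK4 (not from the book):

```python
import math

def return_map(r0, steps=2000):
    """Integrate dr/dphi = -r**3 from phi = 0 to 2*pi with classical RK4."""
    f = lambda r: -r**3
    h = 2*math.pi/steps
    r = r0
    for _ in range(steps):
        k1 = f(r); k2 = f(r + h*k1/2); k3 = f(r + h*k2/2); k4 = f(r + h*k3)
        r += h*(k1 + 2*k2 + 2*k3 + k4)/6
    return r

r0 = 0.3
R = return_map(r0)
exact = r0/math.sqrt(1 + 4*math.pi*r0**2)
assert abs(R - exact) < 1e-9    # matches the closed-form solution
assert R - r0 < 0               # difference function P(r0) < 0: stable focus
```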

Proposition 3.1.4. The first nonzero coefficient of the expansion (3.14) is the coeffi-
cient of an odd power of r0 .

The proof is left as Exercise 3.3.


The Lyapunov numbers completely determine the behavior of the trajectories of
system (3.4) near the origin:

Theorem 3.1.5. System (3.4) has a center at the origin if and only if all the Lya-
punov numbers are zero. Moreover, if η₁ ≠ 0, or if for some k ∈ N

η₁ = η₂ = · · · = η_{2k} = 0,  η_{2k+1} ≠ 0,   (3.15)

then all trajectories in a neighborhood of the origin are spirals and the origin is a
focus, which is stable if η₁ < 0 or (3.15) holds with η_{2k+1} < 0 and is unstable if
η₁ > 0 or (3.15) holds with η_{2k+1} > 0.

Proof. If all the Lyapunov numbers vanish, then by (3.14) the difference function
P is identically zero, so every trajectory in a neighborhood of the origin closes.
If (3.15) holds (including the case η₁ ≠ 0), then
P(r₀) = η_{2k+1} r₀^{2k+1} + ∑_{m=2k+2}^{∞} α_m r₀^m = r₀^{2k+1} [ η_{2k+1} + P̃(r₀) ] ,   (3.16)

where P̃ is analytic in a neighborhood of the origin and P̃(0) = 0. Suppose that
η2k+1 < 0. By (3.16), r0 = 0 is the only zero of the equation P(r0 ) = 0 in an open
interval about 0. Thus there exists r̄0 > 0 such that

P(r0 ) < 0 for 0 < r0 < r̄0 . (3.17)

Then the trajectory passing through the point (expressed in polar coordinates)
P(0, r₀) (0 < r₀ < r̄₀) reaches points P(0, r₀^{(1)}), P(0, r₀^{(2)}), . . . as ϕ increases through
2π, 4π, . . . . Here (since P is the difference function) r₀^{(1)} = P(r₀) + r₀ and
r₀^{(n+1)} = P(r₀^{(n)}) + r₀^{(n)} for n ≥ 1, and r₀ > r₀^{(1)} > · · · > r₀^{(n)} > · · · > 0. The sequence
{r₀^{(n)}} has a limit r̂ ≥ 0, which must in fact be zero, since continuity of P(r₀) and
the computation

lim_{n→∞} P(r₀^{(n)}) = lim_{n→∞} ( r₀^{(n+1)} − r₀^{(n)} ) = 0

imply that P(r̂) = 0, so r̂ = 0 follows from (3.17). Therefore, all trajectories in a


sufficiently small neighborhood of the origin are spirals and tend to the origin as
ϕ → +∞. Similarly, if η2k+1 > 0, then the trajectories are also all spirals but tend to
the origin as ϕ → −∞. 

Remark 3.1.6. When (3.15) holds for k > 0 the focus at the origin is called a fine
focus of order k. The terminology reflects the fact that in the early literature a hy-
perbolic singularity was termed “coarse” since it is typically structurally stable, so
that a nonhyperbolic singularity is “fine.”

Remark 3.1.7. A singularity that is known to be either a node (Definition 3.1.1(b)


but ϕ (t) has a finite limit), a focus, or a center is termed an antisaddle. Thus The-
orem 3.1.5 implies that the origin is an antisaddle of focus or center type for any
analytic system of the form (3.4).

Remark. Our conclusions depend in an essential way on our assumption that the
functions P and Q on the right-hand side of (3.4) are analytic. If either P or Q is
not an analytic function, then the return map need not be analytic, and the function
P(r0 ) can have infinitely many isolated zeros that accumulate at zero. When this
happens a neighborhood of the origin contains a countably infinite number of limit
cycles separated by regions in which all trajectories are spirals or Reeb components
of the foliation of a punctured neighborhood of the origin by orbits. Such a singular
point is sometimes called a center-focus. A concrete example is given by the sys-
tem u̇ = −v + uh(u² + v²), v̇ = u + vh(u² + v²), where the function h : R → R is
the infinitely flat function defined by h(0) = 0 and h(w) = e^{−w⁻²} sin(w⁻¹) for w ≠ 0.
A more exotic example illustrating the range of possibilities for C∞ (but flat) sys-
tems, in which there does not even exist a topological structure at the singularity, is
constructed in [171], or consult the appendix of [13].

Remark. If we integrate system (3.10) with initial conditions other than (3.11),
then we get a new set of functions ŵ_j(ϕ) and a new set of Lyapunov numbers η̂_j.
However, the first nonzero Lyapunov number, which determines the behavior of
trajectories in a sufficiently small neighborhood of the origin, is the same in each
case. That is, if

η₁ = · · · = η_j = 0, η_{j+1} ≠ 0   and   η̂₁ = · · · = η̂_k = 0, η̂_{k+1} ≠ 0,

then k = j and η̂_{k+1} = η_{j+1}.

3.2 Complexification of Real Systems, Normal Forms, and the


Center Problem

Since we are investigating the nature of solutions of a real analytic system in the
case that the eigenvalues of the linear part at a singularity are complex, hence when
the system has the canonical form (3.4), it is natural to attempt to associate to it a
two-dimensional complex system that can be profitably studied to gain information
about the original real system. There are different ways to do so, some more use-
ful than others. The most convenient and commonly used technique is to begin by
considering the real plane (u, v) as the complex line

x1 = u + iv. (3.18)

Differentiating (3.18) and applying (3.4), we find that system (3.4) is equivalent to
the single complex differential equation

ẋ1 = (α + iβ )x1 + X1(x1 , x̄1 ), (3.19)

where X1 = P + iQ, and P and Q are evaluated at ((x1 + x̄1 )/2, (x1 − x̄1 )/(2i)). At
this point we have merely expressed our real system using the notation of complex
variables. To obtain a system of complex equations in a natural way, we now adjoin
to equation (3.19) its complex conjugate, to obtain the pair of equations

ẋ1 = (α + iβ )x1 + X1 (x1 , x̄1 )


(3.20)
x̄˙1 = (α − iβ )x̄1 + X1 (x1 , x̄1 ) ,

where, as the notation indicates, we have taken the complex conjugate of both the
coefficients and the variables in X1 (x1 , x̄1 ). Thus, for example, from the real system

u̇ = 2u − 3v + 4u² − 8uv
v̇ = 3u + 2v + 16u² + 12v²

we obtain first
ẋ₁ = (2 + 3i)x₁ + 4( (x₁ + x̄₁)/2 )² − 8( (x₁ + x̄₁)/2 )( (x₁ − x̄₁)/(2i) )
         + i[ 16( (x₁ + x̄₁)/2 )² + 12( (x₁ − x̄₁)/(2i) )² ]
    = (2 + 3i)x₁ + (1 + 3i)x₁² + (2 + 14i)x₁x̄₁ + (1 − i)x̄₁² ,

to which we adjoin

x̄˙₁ = (2 − 3i)x̄₁ + (1 + i)x₁² + (2 − 14i)x₁x̄₁ + (1 − 3i)x̄₁² .
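The coefficient arithmetic in this example is easy to mechanize; a sympy sketch in which x̄₁ is treated as an independent symbol (the quadratic part of v̇ is taken as 16u² + 12v², the reading that reproduces the printed coefficients):

```python
import sympy as sp

u, v = sp.symbols('u v')
x1, xb = sp.symbols('x1 xbar1')   # xb stands for the conjugate of x1

# X1 = P + i*Q for the quadratic parts of the example system
X1 = (4*u**2 - 8*u*v) + sp.I*(16*u**2 + 12*v**2)

expr = sp.expand(X1.subs({u: (x1 + xb)/2, v: (x1 - xb)/(2*sp.I)}))
poly = sp.Poly(expr, x1, xb)

assert poly.coeff_monomial(x1**2) == 1 + 3*sp.I
assert poly.coeff_monomial(x1*xb) == 2 + 14*sp.I
assert poly.coeff_monomial(xb**2) == 1 - sp.I
```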

Finally, if we replace x̄1 everywhere by x2 and regard it as a new complex vari-


able that is independent of x1 , then from (3.20) we obtain a full-fledged system of
analytic differential equations on C2 :

ẋ₁ = (α + iβ)x₁ + X₁(x₁, x₂)
ẋ₂ = (α − iβ)x₂ + X₂(x₁, x₂) ,     X₂(x₁, x̄₁) = X̄₁(x₁, x̄₁) .   (3.21)

System (3.21) on C2 is the complexification of the real system (3.4) on R2 . The


complex line Π := {(x1 , x2 ) : x2 = x̄1 } is invariant for system (3.21); viewing Π as
a two-dimensional hyperplane in R4 , the flow on Π is precisely the original flow of
system (3.4) on R2 (Exercise 3.6). In this sense the phase portrait of the real system
has been embedded in an invariant set in the phase portrait of a complex one. An
important point is that (3.21) is a member of the family

ẋ1 = λ1 x1 + X1 (x1 , x2 )
(3.22)
ẋ2 = λ2 x2 + X2 (x1 , x2 )

of systems of analytic differential equations on C2 with diagonal linear part, in


which the eigenvalues need not be complex conjugates, and the higher-order terms
can be unrelated. See Exercise 3.7. Two of the first few results in this section will
be stated and proved for elements of this more general collection of systems.
In the case of (3.21), since we are assuming that β ≠ 0, the eigenvalues λ₁, λ₂ of
the linear part satisfy λ₁λ₂ ≠ 0 and

λ₂/λ₁ = (α² − β²)/(α² + β²) − i · 2αβ/(α² + β²) ,

hence by Exercise 2.13 a necessary condition for the existence of resonant terms is
that αβ = 0, hence that α = 0. Thus when α ≠ 0 the hypothesis of Corollary 2.3.3
holds and system (3.21) is formally equivalent to its normal form

ẏ1 = (α + iβ )y1
(3.23)
ẏ2 = (α − iβ )y2 ,

a decoupled system in which each component spirals onto or away from 0 ∈ C,


reflecting the fact that the underlying real system has a focus at the origin of R2 in
this case.
We wish to know the normal form for system (3.21) and conditions guarantee-
ing that the normalizing transformation (2.33) that changes it into its normal form
is convergent, rather than merely formal. We will usually restrict to distinguished
normalizing transformations, hence by Theorem 2.3.11 (which applies because the
linear part in (3.22) is diagonal) can speak of the normal form and the distinguished
normalizing transformation that produces it. However, we will allow more general
normalizing transformations when it is not much more trouble to do so.
Proposition 3.2.1. Fix an analytic real system (3.4) for which β ≠ 0 (our standing
assumption throughout this chapter).
1. The normal form of the complexification (3.21) of system (3.4) that is produced by
the unique distinguished transformation is the complexification of a real system,
and its coefficients satisfy Y₂^{(j,k)} = Ȳ₁^{(k,j)} for all (j, k) ∈ N₀², j + k ≥ 2. More-
over, the distinguished normalizing transformation satisfies h₂^{(j,k)} = h̄₁^{(k,j)} for all
(j, k) ∈ N₀², j + k ≥ 2.
2. More generally, if x = y + h(y) is any normalizing transformation of the complex-
ification (3.21) of (3.4), chosen so as to satisfy the condition h₂^{(j,k)} = h̄₁^{(k,j)} for
every resonant pair (1, (k, j)) (which is always possible), then the correspond-
ing normal form is the complexification of a real system, its coefficients satisfy
Y₂^{(j,k)} = Ȳ₁^{(k,j)} for all (j, k) ∈ N₀², j + k ≥ 2, and the normalizing transformation
satisfies h₂^{(j,k)} = h̄₁^{(k,j)} for all (j, k) ∈ N₀², j + k ≥ 2.
Proof. If the eigenvalues of the linear part of the real system have nonzero real part,
then as we just saw the normal form is (3.23), which is the complexification of the
linearization of (3.4), u̇ = αu − βv, v̇ = βu + αv. The unique normalizing transfor-
mation x = y + h(y) is vacuously distinguished. The fact that its coefficients satisfy
h₂^{(j,k)} = h̄₁^{(k,j)} for all (j, k) ∈ N₀², j + k ≥ 2, follows from the arguments that will be
given in the proof of the proposition in the case that α = 0.
Thus suppose that the eigenvalues are purely imaginary, so that in the nota-
tion of Section 2.3, κ = (κ₁, κ₂) = (βi, −βi). We write X₁(x₁, x₂) = Σ X₁^{(j,k)} x₁^j x₂^k
and X₂(x₁, x₂) = Σ X₂^{(j,k)} x₁^j x₂^k, following the notational convention established in
the same section. Let Y₁ and Y₂ denote the higher-order terms in the normal form
of (3.21) and use the same notation Y₁^{(j,k)} and Y₂^{(j,k)} for their respective coeffi-
cients. By Exercise 3.7, the hypothesis is that X₂^{(j,k)} = X̄₁^{(k,j)} for all (j, k) ∈ N₀²,
j + k ≥ 2. An easy computation using the definition of resonance and the fact that
(κ₁, κ₂) = (βi, −βi) shows that for (j, k) ∈ N₀², j + k ≥ 2, 1 and (k, j) form a reso-
nant pair if and only if 2 and (j, k) form a resonant pair; this shows that the choice
described in statement (2) of the proposition is always possible. Since statement (2)
implies statement (1), the proposition will be fully proved if we can establish that
for any transformation x = y + h(y) that normalizes (3.21), the identities

Y₂^{(j,k)} = Ȳ₁^{(k,j)}   (3.24a)
h₂^{(j,k)} = h̄₁^{(k,j)}   (3.24b)

hold for all pairs (j, k) ∈ N₀², j + k ≥ 2, provided (3.24b) is required when 1 and
(k, j) form a resonant pair (but h₁^{(k,j)} and h₂^{(j,k)} are otherwise arbitrary).
The coefficients $Y_m^{(j,k)}$ and $h_m^{(j,k)}$ are computed using (2.40) and (2.41). The proof
will be by induction on $s = j + k$. Recall that for $m = 1, 2$ and a multi-index $\alpha$,
$\{X_m(y + h(y))\}^{(\alpha)}$ denotes the coefficient of $y^{(\alpha)}$ when $X_m(y + h(y))$ is expanded
in powers of $y$. To carry out the proof we will simultaneously prove by induction
that
$$\{X_2(y + h(y))\}^{(j,k)} = \overline{\{X_1(y + h(y))\}^{(k,j)}}. \qquad (3.25)$$
Basis step. A simple set of computations using Definition 2.3.5 shows that for
$s = 2$ there are no resonant terms, so $Y_m^{(j,k)} = 0$ and $g_m^{(j,k)} = X_m^{(j,k)}$ (comment after
(2.41)). We leave it to the reader to compute the coefficients $h_m^{(j,k)}$ and verify the
truth of (3.24b) in this case. Since $X$ and $h$ begin with quadratic terms, the truth of
(3.25) for $s = 2$ is immediate.
Inductive step. Suppose (3.24) and (3.25) hold for $j + k \le s$. (It is understood that
we make the choice that (3.24b) holds when 1 and $(k,j)$ form a resonant pair.) First
we establish the truth of (3.25) for $j + k = s + 1$. Let $\overset{(s+1)}{\sim}$ denote agreement of series
through order $s + 1$. Then
$$
\begin{aligned}
\sum_{j+k \ge 2} \{X_1(y + h(y))\}^{(j,k)} y_1^j y_2^k
&= X_1\bigl(y_1 + h_1(y_1,y_2),\, y_2 + h_2(y_1,y_2)\bigr) \\
&= \sum_{r+t \ge 2} X_1^{(r,t)} \bigl(y_1 + h_1(y)\bigr)^r \bigl(y_2 + h_2(y)\bigr)^t \\
&\overset{(s+1)}{\sim} \sum_{2 \le r+t \le s+1} X_1^{(r,t)} \Bigl(y_1 + \sum_{2 \le |\alpha| \le s} h_1^{(\alpha)} y_1^{\alpha_1} y_2^{\alpha_2}\Bigr)^{r} \Bigl(y_2 + \sum_{2 \le |\alpha| \le s} h_2^{(\alpha)} y_1^{\alpha_1} y_2^{\alpha_2}\Bigr)^{t}.
\end{aligned}
$$
We apply to each term in the sum the involution $y_1 \leftrightarrow y_2$ and obtain
$$\sum_{2 \le r+t \le s+1} X_1^{(r,t)} \Bigl(y_2 + \sum_{2 \le |\alpha| \le s} h_1^{(\alpha)} y_2^{\alpha_1} y_1^{\alpha_2}\Bigr)^{r} \Bigl(y_1 + \sum_{2 \le |\alpha| \le s} h_2^{(\alpha)} y_2^{\alpha_1} y_1^{\alpha_2}\Bigr)^{t}.$$
Now we conjugate the coefficients, obtaining
$$\sum_{2 \le r+t \le s+1} \overline{X_1^{(r,t)}} \Bigl(y_2 + \sum_{2 \le |\alpha| \le s} \overline{h_1^{(\alpha)}}\, y_2^{\alpha_1} y_1^{\alpha_2}\Bigr)^{r} \Bigl(y_1 + \sum_{2 \le |\alpha| \le s} \overline{h_2^{(\alpha)}}\, y_2^{\alpha_1} y_1^{\alpha_2}\Bigr)^{t},$$
which, applying the hypothesis on $X_1$ and $X_2$ and the induction hypothesis, is
100 3 The Center Problem
  
$$
\begin{aligned}
&= \sum_{2 \le r+t \le s+1} X_2^{(t,r)} \Bigl(y_2 + \sum_{2 \le |\alpha| \le s} h_2^{(\alpha_2,\alpha_1)} y_2^{\alpha_1} y_1^{\alpha_2}\Bigr)^{r} \Bigl(y_1 + \sum_{2 \le |\alpha| \le s} h_1^{(\alpha_2,\alpha_1)} y_2^{\alpha_1} y_1^{\alpha_2}\Bigr)^{t} \\
&\overset{(s+1)}{\sim} \sum_{r+t \ge 2} X_2^{(t,r)} \bigl(y_1 + h_1(y_1,y_2)\bigr)^t \bigl(y_2 + h_2(y_1,y_2)\bigr)^r \\
&= X_2\bigl(y_1 + h_1(y_1,y_2),\, y_2 + h_2(y_1,y_2)\bigr) \\
&= \sum_{j+k \ge 2} \{X_2(y + h(y))\}^{(j,k)} y_1^j y_2^k,
\end{aligned}
$$
so that (3.25) holds for $j + k = s + 1$.
Turning our attention to $g_m^{(j,k)}$, from (2.41) we compute the value of $\overline{g_1^{(k,j)}}$ as
$$\overline{\{X_1(y + h(y))\}^{(k,j)}} - \sum \beta_1 \overline{h_1^{(\beta_1,\beta_2)}}\; \overline{Y_1^{(k-\beta_1+1,\, j-\beta_2)}} - \sum \beta_2 \overline{h_1^{(\beta_1,\beta_2)}}\; \overline{Y_1^{(k-\beta_1,\, j-\beta_2+1)}},$$
which by (3.25) and the induction hypothesis is
$$\{X_2(y + h(y))\}^{(j,k)} - \sum \beta_1 h_2^{(\beta_2,\beta_1)} Y_2^{(j-\beta_2,\, k-\beta_1+1)} - \sum \beta_2 h_2^{(\beta_2,\beta_1)} Y_2^{(j-\beta_2+1,\, k-\beta_1)},$$
which is $g_2^{(j,k)}$, so that
$$g_2^{(j,k)} = \overline{g_1^{(k,j)}}. \qquad (3.26)$$
Since 1 and $(k,j)$ form a resonant pair if and only if 2 and $(j,k)$ form a resonant
pair, $Y_1^{(k,j)} \neq 0$ if and only if $Y_2^{(j,k)} \neq 0$ and we may freely choose $h_1^{(k,j)}$ if and only
if we may freely choose $h_2^{(j,k)}$. Thus for any $(j,k)$ with $j + k = s + 1$, by (2.40) we
see that when they are nonzero,
$$\overline{Y_1^{(k,j)}} = \overline{g_1^{(k,j)}} = g_2^{(j,k)} = Y_2^{(j,k)}$$
by (3.26), so that (3.24a) holds, and (again by (2.40)) when its value is forced
$$h_m^{(j,k)} = \frac{g_m^{(j,k)}}{\bigl((j,k),(\kappa_1,\kappa_2)\bigr) - \kappa_m},$$
so that by (3.26) and the fact that $(\kappa_1,\kappa_2) = (i\beta, -i\beta)$,
$$\overline{h_1^{(k,j)}} = \overline{\left[\frac{g_1^{(k,j)}}{\bigl((k,j),(i\beta,-i\beta)\bigr) - i\beta}\right]} = \overline{\left[\frac{g_1^{(k,j)}}{i\beta(k-j-1)}\right]} = \frac{\overline{g_1^{(k,j)}}}{i\beta(j-k+1)} = \frac{g_2^{(j,k)}}{\bigl((j,k),(i\beta,-i\beta)\bigr) + i\beta} = h_2^{(j,k)},$$
so that (3.24b) holds. Thus (3.24) is true for the pair $(j,k)$. □
It is apparent that if a real system is complexified and that system is transformed
into normal form, then the real system that the normal form represents is a transformation of the original real system.
The first two results that we state apply to more general systems than those that
arise as complexifications of real systems. The condition that λ1 and λ2 be rationally
related is connected to the condition for resonance described in Exercise 2.13.
Proposition 3.2.2. Suppose that in the system (3.22) λ1 /λ2 = −p/q for p, q ∈ N
with GCD(p, q) = 1.
1. The normal form for (3.22) produced by any normalizing transformation (2.33)
(not necessarily satisfying the condition of Proposition 3.2.1(2), hence in partic-
ular not necessarily distinguished) is of the form (using the notational convention
introduced just before Theorem 2.3.8)
$$
\begin{aligned}
\dot y_1 &= \lambda_1 y_1 + y_1 \sum_{j=1}^{\infty} Y_1^{(jq+1,\,jp)} (y_1^q y_2^p)^j = \lambda_1 y_1 + y_1 Y_1(y_1^q y_2^p) \\
\dot y_2 &= \lambda_2 y_2 + y_2 \sum_{j=1}^{\infty} Y_2^{(jq,\,jp+1)} (y_1^q y_2^p)^j = \lambda_2 y_2 + y_2 Y_2(y_1^q y_2^p).
\end{aligned} \qquad (3.27)
$$
2. Let Y1 (w) and Y2 (w) be the functions of a single complex variable w defined by
(3.27). If the normalizing transformation that produced (3.27) is distinguished,
and if qY1 (w) + pY2(w) ≡ 0, then the normalizing transformation is convergent.
Proof. That the normal form for (3.22) has the form (3.27) follows immediately
from Definitions 2.3.5 and 2.3.6 and a simple computation that shows that the reso-
nant pairs are (1, ( jq + 1, jp)) and (2, ( jq, jp + 1)) for j ∈ N.
To establish point (2) we use Theorem 2.3.13. Let $\lambda = (\lambda_1,\lambda_2)$. To obtain a
uniform lower bound for all nonzero $|(\alpha,\lambda) - \lambda_m|$, $m = 1, 2$, define $E_m : \mathbb{R}^2 \to \mathbb{R}$
for $m = 1, 2$ by
$$E_1(x,y) = \bigl|((x,y),(\lambda_1,\lambda_2)) - \lambda_1\bigr| = \tfrac{1}{q}\,|\lambda_2|\,|px - qy - p|$$
$$E_2(x,y) = \bigl|((x,y),(\lambda_1,\lambda_2)) - \lambda_2\bigr| = \tfrac{1}{q}\,|\lambda_2|\,|px - qy + q|.$$
Then for both $m = 1$ and $m = 2$, $E_m(x+q,\, y+p) = E_m(x,y)$, so the minimum
nonzero value of $E_m$ on $\mathbb{N}_0 \times \mathbb{N}_0$ is the same as the minimum nonzero value of $E_m$
on $\{1, 2, \ldots, q+1\} \times \{0, 1, \ldots, p\}$, which is some positive constant $\varepsilon_m$. Thus (2.55)
holds with $\varepsilon = \min\{\varepsilon_1, \varepsilon_2\}$.
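The reduction to a finite grid in the preceding argument is easy to check computationally. The following sketch is not from the book (the helper names `E` and `min_nonzero_E` are ours), and it takes $|\lambda_2| = 1$ for simplicity: it computes the minimum nonzero value of $E_m$ on the grid $\{1,\ldots,q+1\} \times \{0,\ldots,p\}$ and compares it against a brute-force minimum over a large patch of $\mathbb{N}_0 \times \mathbb{N}_0$.

```python
from fractions import Fraction

def E(m, x, y, p, q):
    """E_m(x, y) from the proof, with |lambda_2| = 1, so that
    E_1 = |px - qy - p| / q and E_2 = |px - qy + q| / q."""
    shift = -p if m == 1 else q
    return Fraction(abs(p * x - q * y + shift), q)

def min_nonzero_E(m, p, q):
    """Minimum nonzero value of E_m on the finite grid
    {1, ..., q+1} x {0, ..., p}; by the periodicity
    E_m(x + q, y + p) = E_m(x, y) this equals the minimum over N_0 x N_0."""
    vals = [E(m, x, y, p, q)
            for x in range(1, q + 2) for y in range(p + 1)]
    return min(v for v in vals if v != 0)

if __name__ == "__main__":
    p, q = 2, 3
    for m in (1, 2):
        grid_min = min_nonzero_E(m, p, q)
        # brute force over a much larger patch of N_0 x N_0
        brute = min(E(m, x, y, p, q) for x in range(200) for y in range(200)
                    if E(m, x, y, p, q) != 0)
        assert grid_min == brute
        print(m, grid_min)
```

Since $px - qy$ ranges over integers, the bound that emerges for coprime $p, q$ is $\varepsilon = |\lambda_2|/q$; the code is only a sanity check of the periodicity reduction, not part of the proof.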
Since $p, q \ge 1$, if $qY_1(w) + pY_2(w) \equiv 0$ we have that, for any multi-indices $\beta$,
$\gamma \in \mathbb{N}_0^2$ for which $|\beta| \ge 2$ and $|\gamma| \ge 2$,
$$\Bigl|\beta_1 Y_1^{(\gamma)} + \beta_2 Y_2^{(\gamma)}\Bigr| = \Bigl|\beta_1 - \beta_2\,\frac{q}{p}\Bigr|\,\bigl|Y_1^{(\gamma)}\bigr| = \frac{1}{|\lambda_1|}\,|(\beta,\lambda)|\,\bigl|Y_1^{(\gamma)}\bigr|
\le \frac{1}{|\lambda_1|}\,|(\beta,\lambda)|\Bigl(\bigl|Y_1^{(\gamma)}\bigr| + \frac{q}{p}\bigl|Y_1^{(\gamma)}\bigr|\Bigr) = \frac{1}{|\lambda_1|}\,|(\beta,\lambda)|\Bigl(\bigl|Y_1^{(\gamma)}\bigr| + \bigl|Y_2^{(\gamma)}\bigr|\Bigr).$$
We conclude that condition (2.57) holds with $d = 1/|\lambda_1|$. □
For any normalization (3.27) of a system (3.22) that satisfies the hypotheses of
Proposition 3.2.2 we let
$$G(w) = \sum_{k=1}^{\infty} G_{2k+1} w^k, \qquad H(w) = \sum_{k=1}^{\infty} H_{2k+1} w^k \qquad (3.28a)$$
be the functions of the complex variable $w$ defined by
$$G = qY_1 + pY_2, \qquad H = qY_1 - pY_2. \qquad (3.28b)$$
The significance of the function G is shown by the next two theorems. The signif-
icance of H will be revealed in the next chapter. But first we need an important
definition. It is motivated by the geometric considerations described in the introduc-
tion to this chapter, namely that we should be able to realize the ovals that surround
a center as level curves of a smooth function. That our intuition is correct will be
borne out by Theorems 3.2.9 and 3.2.10.
Definition 3.2.3. A first integral on an open set $\Omega$ in $\mathbb{R}^n$ or $\mathbb{C}^n$ of a smooth or analytic system of differential equations
$$\dot x_1 = f_1(x), \;\ldots,\; \dot x_n = f_n(x) \qquad (3.29)$$
defined everywhere on $\Omega$ is a nonconstant differentiable function $\Psi : \Omega \to \mathbb{C}$ that
is constant on trajectories (that is, for any solution $x(t)$ of (3.29) in $\Omega$ the function
$\psi(t) = \Psi(x(t))$ is constant). A formal first integral is a formal power series in $x$, not
all of whose coefficients are zero, which under term-by-term differentiation satisfies
$\frac{d}{dt}[\Psi(x(t))] \equiv 0$ in $\Omega$.
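As a concrete numerical illustration of this definition (our own example, not one from the book), the linear center $\dot u = -v$, $\dot v = u$ has the first integral $\Psi(u,v) = u^2 + v^2$: sampling $\Psi$ along a numerically integrated trajectory shows that $\psi(t) = \Psi(x(t))$ is constant up to integration error. The helper `rk4_step` is a generic Runge–Kutta step, not anything defined in the text.

```python
def rk4_step(f, x, h):
    """One classical fourth-order Runge-Kutta step for x' = f(x)."""
    k1 = f(x)
    k2 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k1)])
    k3 = f([xi + 0.5 * h * ki for xi, ki in zip(x, k2)])
    k4 = f([xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + (h / 6.0) * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

# The system (3.29) with n = 2: udot = -v, vdot = u (a linear center),
# and the candidate first integral Psi(u, v) = u^2 + v^2.
f = lambda x: [-x[1], x[0]]
psi = lambda x: x[0] ** 2 + x[1] ** 2

x, values = [0.6, -0.8], []
for _ in range(1000):
    values.append(psi(x))
    x = rk4_step(f, x, 0.01)

# Psi is constant along the trajectory, up to the integrator's tiny drift.
assert max(values) - min(values) < 1e-9
```

The same kind of check applies verbatim to any candidate first integral; of course it can only corroborate, never prove, the identity $\mathscr{X}\Psi \equiv 0$.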
Remark 3.2.4. (a) If $\Psi$ is a first integral or formal first integral for system (3.29)
on $\Omega$, if $F : \mathbb{C} \to \mathbb{C}$ is any nonconstant differentiable function, and if $\lambda$ is any
constant, then $\Phi = F \circ \Psi$ is a first integral or formal first integral for the system
$\dot x_1 = \lambda f_1(x), \ldots, \dot x_n = \lambda f_n(x)$ on $\Omega$.
(b) If
$$\mathscr{X}(x) = f_1(x)\,\frac{\partial}{\partial x_1} + \cdots + f_n(x)\,\frac{\partial}{\partial x_n} \qquad (3.30)$$
is the smooth or analytic vector field on $\Omega$ associated to system (3.29), then a nonconstant differentiable function (or formal power series) $\Psi$ on $\Omega$ is a first integral
(or formal first integral) for (3.29) if and only if the function $\mathscr{X}\Psi$ vanishes throughout $\Omega$:
$$\mathscr{X}\Psi = f_1\,\frac{\partial\Psi}{\partial x_1} + \cdots + f_n\,\frac{\partial\Psi}{\partial x_n} \equiv 0 \text{ on } \Omega. \qquad (3.31)$$
(c) Our concern is only with a neighborhood of the origin, so by “existence of a first
integral” we will always mean “existence on a neighborhood of the origin.”
We now show that the existence of a formal first integral is enough to guarantee
that G(w) = qY1 (w) + pY2 (w) ≡ 0, hence by Proposition 3.2.2(2) that the normaliz-
ing transformation that transforms (3.22) to (3.27), if distinguished, is convergent.
Note that if $\Psi$ is any formal first integral of a system (3.22) that meets the conditions of Proposition 3.2.2, does not have a constant term, and begins with terms of
order no higher than $p + q$, then $\Psi$ must have the form $\Psi(x_1,x_2) = x_1^q x_2^p + \cdots$ (Exercise 3.8). The assertion made here and in the following theorem about the form of
the function $\Psi(x_1,x_2)$ means only that up to terms of degree $p + q$ it has the form
indicated, not that it is a function of the product $x_1^q x_2^p$ alone.
Theorem 3.2.5. Suppose that in system (3.22) $\lambda_1/\lambda_2 = -p/q$ for $p, q \in \mathbb{N}$ with
$\mathrm{GCD}(p,q) = 1$. Let $G$ be the function defined by (3.28b), computed from some normal form of (3.22).
1. If system (3.22) has a formal first integral of the form $\Psi = x_1^q x_2^p + \cdots$, then $G \equiv 0$.
Thus if the normalizing transformation in question is distinguished, it is convergent.
2. Conversely, if $G \equiv 0$ then system (3.22) has a formal first integral of the form
$\Psi(x_1,x_2) = x_1^q x_2^p + \cdots$, which is analytic when the normalizing transformation is
distinguished.
Proof. By Remarks 2.3.7 and 3.2.4(a), we may assume that $\lambda_1 = p$ and $\lambda_2 = -q$.
Suppose system (3.22) has a formal first integral of the form $\Psi(x_1,x_2) = x_1^q x_2^p + \cdots$.
If $H$ is the normalizing transformation that converts (3.22) into its normal form
(3.27), then $F = \Psi \circ H$ is a formal first integral for the normal form, hence
$$\frac{\partial F}{\partial y_1}(y_1,y_2)\bigl(py_1 + y_1 Y_1(y_1^q y_2^p)\bigr) + \frac{\partial F}{\partial y_2}(y_1,y_2)\bigl(-qy_2 + y_2 Y_2(y_1^q y_2^p)\bigr) \equiv 0, \qquad (3.32)$$
which we rearrange as
$$
\begin{aligned}
py_1 \frac{\partial F}{\partial y_1}&(y_1,y_2) - qy_2 \frac{\partial F}{\partial y_2}(y_1,y_2) \\
&= -y_1 \frac{\partial F}{\partial y_1}(y_1,y_2)\, Y_1(y_1^q y_2^p) - y_2 \frac{\partial F}{\partial y_2}(y_1,y_2)\, Y_2(y_1^q y_2^p).
\end{aligned} \qquad (3.33)
$$
Recalling from (2.33) the form of $H$ and writing $F$ according to our usual convention, $F(y_1,y_2)$ has the form
$$F(y_1,y_2) = \sum_{(\alpha_1,\alpha_2)} F^{(\alpha_1,\alpha_2)} y_1^{\alpha_1} y_2^{\alpha_2} = y_1^q y_2^p + \cdots. \qquad (3.34)$$
A simple computation on the left-hand side of (3.33), and insertion of (3.27) into
the right, yields
$$
\begin{aligned}
\sum_{(\alpha_1,\alpha_2)} &(\alpha_1 p - \alpha_2 q)\, F^{(\alpha_1,\alpha_2)} y_1^{\alpha_1} y_2^{\alpha_2} \\
&= -\Biggl[\sum_{(\alpha_1,\alpha_2)} \alpha_1 F^{(\alpha_1,\alpha_2)} y_1^{\alpha_1} y_2^{\alpha_2}\Biggr] \Biggl[\sum_{j=1}^{\infty} Y_1^{(jq+1,\,jp)} (y_1^q y_2^p)^j\Biggr] \\
&\quad - \Biggl[\sum_{(\alpha_1,\alpha_2)} \alpha_2 F^{(\alpha_1,\alpha_2)} y_1^{\alpha_1} y_2^{\alpha_2}\Biggr] \Biggl[\sum_{j=1}^{\infty} Y_2^{(jq,\,jp+1)} (y_1^q y_2^p)^j\Biggr].
\end{aligned} \qquad (3.35)
$$
We claim that $F(y_1,y_2)$ is a function of $y_1^q y_2^p$ alone, so that it may be written
$F(y_1,y_2) = f(y_1^q y_2^p) = f_1\, y_1^q y_2^p + f_2\,(y_1^q y_2^p)^2 + \cdots$. The claim is precisely the statement that for any term $F^{(\alpha_1,\alpha_2)} y_1^{\alpha_1} y_2^{\alpha_2}$ of $F$,
$$p\alpha_1 - q\alpha_2 \neq 0 \text{ implies } F^{(\alpha_1,\alpha_2)} = 0. \qquad (3.36)$$
Equation (3.34) shows that (3.36) holds for $|(\alpha_1,\alpha_2)| = \alpha_1 + \alpha_2 \le p + q$. This
implies that the right-hand side of (3.35) has the form $c_2\,(y_1^q y_2^p)^2 + \cdots$ for some
$c_2$, hence by (3.35) implication (3.36) holds for $\alpha_1 + \alpha_2 \le 2(p+q)$. But if that
is true, then it must be the case that the right-hand side of (3.35) has the form
$c_2\,(y_1^q y_2^p)^2 + c_3\,(y_1^q y_2^p)^3 + \cdots$ for some $c_3$, hence by (3.35) implication (3.36) must
hold for $\alpha_1 + \alpha_2 \le 3(p+q)$. Clearly by mathematical induction (3.36) must hold in
general, establishing the claim.
But if $F(y_1,y_2) = f(y_1^q y_2^p)$, then
$$y_1 \frac{\partial F}{\partial y_1}(y_1,y_2) = q\, y_1^q y_2^p\, f'(y_1^q y_2^p) \quad\text{and}\quad y_2 \frac{\partial F}{\partial y_2}(y_1,y_2) = p\, y_1^q y_2^p\, f'(y_1^q y_2^p),$$
so that, letting $w = y_1^q y_2^p$, (3.33) becomes
$$0 \equiv -qw f'(w)\, Y_1(w) - pw f'(w)\, Y_2(w).$$
But because $F$ is a formal first integral it is not a constant, so we immediately obtain
$qY_1(w) + pY_2(w) \equiv 0$, which in conjunction with part (2) of Proposition 3.2.2 proves
part (1).
Direct calculations show that if $G \equiv 0$, then $\widehat{\Psi}(y_1,y_2) = y_1^q y_2^p$ is a first integral of
(3.27). The coordinate transformation that places (3.22) in normal form has the form
given in (2.33), hence has an inverse of the form $y = x + \widehat{h}(x)$. Therefore system
(3.22) admits a formal first integral of the form $\Psi(x_1,x_2) = x_1^q x_2^p + \cdots$. By part (2)
of Proposition 3.2.2, if the transformation to the normal form (3.27) is distinguished,
then it is convergent, hence so is $\Psi(x_1,x_2)$. □
Corollary 3.2.6. An analytic system (3.22) possesses a formal first integral of the
form $\Psi(x_1,x_2) = x_1^q x_2^p + \cdots$ only if it possesses an analytic first integral of that
form.

Proof. If there exists a formal first integral of the form $\Psi(x_1,x_2) = x_1^q x_2^p + \cdots$, then
part (1) of the theorem implies that for any normalizing transformation the corresponding function $G$ vanishes identically. In particular this is true when the normalizing transformation is distinguished. But then by part (2) of the theorem there exists
an analytic first integral of the same form. □
Now we specialize to the case that arises from our original problem, the study
of a real system with a singularity at which the eigenvalues of the linear part are
purely imaginary, α ± iβ = ±iω , ω ∈ R \ {0}. By a translation and an invertible
linear transformation to produce (3.4) and the process described at the beginning of
this section, we obtain the complexification
$$
\begin{aligned}
\dot x_1 &= i\omega x_1 + X_1(x_1,x_2) \\
\dot x_2 &= -i\omega x_2 + X_2(x_1,x_2)
\end{aligned}\,, \qquad X_2(x_1,\bar x_1) = \overline{X_1(x_1,\bar x_1)}. \qquad (3.37)
$$
By Proposition 3.2.2 with $p = q = 1$, any normal form of (3.37) has the form
$$
\begin{aligned}
\dot y_1 &= i\omega y_1 + y_1 Y_1(y_1 y_2) \\
\dot y_2 &= -i\omega y_2 + y_2 Y_2(y_1 y_2).
\end{aligned} \qquad (3.38)
$$
Theorem 3.2.7. Let (3.38) be a normal form of (3.37) that is produced by some nor-
malizing transformation, and let G be the function computed according to (3.28b).
1. If G ≡ 0, then the original real system (3.4) generating (3.37) has a center at the
origin.
2. Conversely, if $G = G_{2k+1}(y_1 \bar y_1)^k + \cdots$, $G_{2k+1} \neq 0$, and if the normalizing transformation satisfies the condition of Proposition 3.2.1(2), then the origin is a focus
for (3.4), which is stable if $G_{2k+1} < 0$ and unstable if $G_{2k+1} > 0$.
Proof. Let $G$ be the function computed according to (3.28b) with respect to some
normalization (3.38) of (3.37), and suppose that $G \equiv 0$. By part (2) of Theorem 3.2.5,
system (3.37) has a formal first integral of the form $\Psi(x_1,x_2) = x_1 x_2 + \cdots$, hence by
Corollary 3.2.6 it has an analytic first integral of the form $\widehat{\Psi}(x_1,x_2) = x_1 x_2 + \cdots$. But
then system (3.4) has an analytic first integral of the form $\Phi(u,v) = u^2 + v^2 + \cdots$
(Exercise 3.9), which implies that the origin is a center for (3.4) (Exercise 3.10).
Now suppose that the normalizing transformation satisfies the condition of part
(2) of Proposition 3.2.1 and that $G = G_{2k+1}(y_1 y_2)^k + \cdots$, $G_{2k+1} \neq 0$. We
can avoid questions of convergence by breaking off the series that defines $h$ in the
normalizing transformation (2.33) with terms of order $2k + 1$, thereby computing
just an initial segment of the normal form that is nevertheless sufficiently long.

By Proposition 3.2.1(2) and Exercise 3.7, in the normal form (3.38) the coefficients satisfy $Y_2^{(j,j+1)} = \overline{Y_1^{(j+1,j)}}$ for all $j$, so the hypothesis on $G$ implies that $Y_1^{(j+1,j)}$
is purely imaginary for $1 \le j \le k - 1$. Hence there are real numbers $b_1, \ldots, b_{k-1}$ so
that the normal form (3.38) is
$$
\begin{aligned}
\dot y_1 &= i\omega y_1 + i\bigl(b_1 y_1 y_2 + \cdots + b_{k-1}(y_1 y_2)^{k-1}\bigr) y_1 + y_1 Y_1^{(k+1,k)} (y_1 y_2)^k + \cdots \\
\dot y_2 &= -i\omega y_2 - i\bigl(b_1 y_1 y_2 + \cdots + b_{k-1}(y_1 y_2)^{k-1}\bigr) y_2 + y_2 Y_2^{(k,k+1)} (y_1 y_2)^k + \cdots.
\end{aligned} \qquad (3.39)
$$
We appeal to Proposition 3.2.1 once more, reverting to the real system represented
by system (3.39) by replacing every occurrence of y2 in (3.39) by ȳ1 , thereby obtain-
ing two equations that describe the underlying real system, after some transforma-
tion of coordinates, in complex notation. Making the substitution in (3.39), we use
both equations to compute the derivatives of $r$ and $\varphi$ in complex polar coordinates
$y_1 = re^{i\varphi}$, obtaining
$$
\begin{aligned}
\dot r &= \tfrac{1}{2r}\bigl(\dot y_1 \bar y_1 + y_1 \dot{\bar y}_1\bigr) = \tfrac{1}{2} G_{2k+1}\, r^{2k+1} + o(r^{2k+1}) \\
\dot \varphi &= \tfrac{i}{2r^2}\bigl(y_1 \dot{\bar y}_1 - \dot y_1 \bar y_1\bigr) = \omega + b_1 r^2 + o(r^2).
\end{aligned} \qquad (3.40)
$$
The equation of the trajectories of system (3.40) is
$$\frac{dr}{d\varphi} = \frac{G_{2k+1}\, r^{2k+1} + o(r^{2k+1})}{2\omega + o(r)}. \qquad (3.41)$$
This equation has the form of (3.6) and, just as for (3.6), we conclude that the orbits
of (3.40) are spirals, hence that the origin is a focus. By the first equation of (3.40)
the focus is stable if $G_{2k+1} < 0$ and unstable if $G_{2k+1} > 0$. □
Remark 3.2.8. The observation made in the proof that $Y_2^{(j,j+1)} = \overline{Y_1^{(j+1,j)}}$ for all
$j \in \mathbb{N}$ implies that when $G \equiv 0$ all the coefficients in $Y_1$ and $Y_2$ have real part zero,
which in turn implies that the normal form (3.38) of (3.37) may be written
$$
\begin{aligned}
\dot y_1 &= i\omega y_1 + \tfrac{1}{2}\, y_1 H(y_1 y_2) \\
\dot y_2 &= -i\omega y_2 - \tfrac{1}{2}\, y_2 H(y_1 y_2),
\end{aligned}
$$
where $H$ is the function defined by (3.28), and in this case has only purely imaginary
coefficients as well.
We close this section with two theorems that give characterizations of real planar
systems of the form (3.2) that have a center at the origin. The first one was first
proved by Poincaré ([143]) in the case that U and V are polynomials. Lyapunov
([114]) generalized it to the case that $U$ and $V$ are real analytic functions. The assertion about the form of the function $\Psi(u,v)$ in the first theorem means only that
up to quadratic terms it has the form indicated, not that it is a function of
the sum $u^2 + v^2$ alone.
gral of this form for system (3.2), and the corresponding first integral of the form
Ψ (x1 , x2 ) = x1 x2 + · · · for the complexification of (3.2) (see the first line of the
proof), are sometimes referred to as “Lyapunov first integrals.”
Theorem 3.2.9 (Poincaré–Lyapunov Theorem). System (3.2) on $\mathbb{R}^2$ has a center at the origin if and only if there exists a formal first integral that has the form
$\Psi(u,v) = u^2 + v^2 + \cdots$.

Proof. By Exercise 3.9, the real system (3.2) has a formal first integral of the form
$\Psi(u,v) = u^2 + v^2 + \cdots$ if and only if its complexification (3.37) (with $\omega = 1$) has a
formal first integral of the form $\Phi(x_1,x_2) = x_1 x_2 + \cdots$.
If system (3.2) has a formal first integral of the form $\Psi(u,v) = u^2 + v^2 + \cdots$, then
by Theorem 3.2.5(1) $G \equiv 0$, hence by Theorem 3.2.7 system (3.2) has a center at
the origin.

Conversely, if the origin is a center for (3.2), then the function $G$ defined by
(3.28) is identically zero (since otherwise by Theorem 3.2.7 the origin is a focus).
Therefore, by Theorem 3.2.5(2), its complexification, system (3.37), possesses an
analytic local first integral of the form $\Phi(x_1,x_2) = x_1 x_2 + \cdots$, hence system (3.2)
admits a first integral of the form $\Psi(u,v) = u^2 + v^2 + \cdots$. □
Remark. The analogue of the Poincaré–Lyapunov Theorem fails when the linear
part at the origin vanishes identically. That is, it is possible for a system of polynomial differential equations on $\mathbb{R}^2$ to have a center at the origin yet admit no
formal first integral in any neighborhood of the origin. See Exercise 3.11. However,
there must exist a $C^\infty$ first integral ([137]).
The last theorem of this section states that system (3.2) has a center at the origin
if and only if it can be placed in a special form by an analytic coordinate change.
This is not the normal form in the sense of Definition 2.3.6: because the eigenvalues
of the linear part of (3.2) are complex, the Jordan normal form of the linear part, and
hence the normal form in that sense, is complex. Nevertheless, going over to the
complexification is a useful step in establishing the result.
Theorem 3.2.10. The origin is a center for system (3.2) if and only if there is a
transformation of the form
$$
\begin{aligned}
u &= \xi + \phi(\xi,\eta) \\
v &= \eta + \psi(\xi,\eta)
\end{aligned} \qquad (3.42)
$$
that is real analytic in a neighborhood of the origin and transforms (3.2) into the
system
$$
\begin{aligned}
\dot\xi &= -\eta\, F(\xi^2 + \eta^2) \\
\dot\eta &= \xi\, F(\xi^2 + \eta^2),
\end{aligned} \qquad (3.43)
$$
where $F(z)$ is real analytic in a neighborhood of the origin and $F(0) = 1$.
Proof. By Proposition 3.2.2, the normal form of the complexification of (3.2) is
$$
\begin{aligned}
\dot y_1 &= iy_1 + y_1 Y_1(y_1 y_2) \\
\dot y_2 &= -iy_2 + y_2 Y_2(y_1 y_2).
\end{aligned} \qquad (3.44)
$$
By Proposition 3.2.1, this is the complexification of a real system, which we can
recover by making the substitutions $y_1 = \xi + i\eta$ and $y_2 = \xi - i\eta$ in (3.44) and
applying them to $\dot\xi = \tfrac{1}{2}(\dot y_1 + \dot y_2)$ and $\dot\eta = \tfrac{1}{2i}(\dot y_1 - \dot y_2)$. Direct computation gives
$$
\begin{aligned}
\dot\xi &= \xi\, \Phi_2(\xi^2 + \eta^2) - \eta\, \Phi_1(\xi^2 + \eta^2) \\
\dot\eta &= \xi\, \Phi_1(\xi^2 + \eta^2) + \eta\, \Phi_2(\xi^2 + \eta^2),
\end{aligned} \qquad (3.45)
$$
where
$$\Phi_1(w) = 1 - \tfrac{i}{2}\bigl(Y_1(w) - Y_2(w)\bigr) \quad\text{and}\quad \Phi_2(w) = \tfrac{1}{2}\bigl(Y_1(w) + Y_2(w)\bigr),$$
so that Φ1 (w) and Φ2 (w) are formal series. The process by which it was derived
means that system (3.45) is a transformation of the original system (3.2), and from
the fact that Φ1 (0) = 1 and Φ2 (0) = 0 we conclude that it has the form (3.42).
Suppose (3.2) has a center at the origin. Then by Theorem 3.2.7, $Y_1 + Y_2 \equiv 0$, so
that $\Phi_2 \equiv 0$. Hence every real system with a center can be brought into the form
(3.43) by a substitution (3.42), where the series $\phi(\xi,\eta)$, $\psi(\xi,\eta)$, and $F(\xi^2 + \eta^2)$
are convergent for small $\xi$ and $\eta$.

Conversely, suppose that for (3.2) there is a substitution (3.42) such that in the
new coordinates the system has the form (3.43). System (3.43) has the first integral
$\xi^2 + \eta^2$. Therefore system (3.2) possesses a first integral of the form $u^2 + v^2 + \cdots$
and by Theorem 3.2.9 has a center at the origin. □
Although, as we mentioned earlier, system (3.43) is not a normal form in the
sense of Definition 2.3.6, it is still a real normal form in the sense of Section 2.2.
3.3 The Center Variety

Reviewing the results of the two previous sections we see that it is possible to give
a number of statements that are characterizations of systems of the form (3.2) that
have a center at the origin.
Theorem 3.3.1. The following statements about system (3.2), when the right-hand
sides are real analytic functions of real variables $u$ and $v$, are equivalent:
(a) system (3.2) has a center at the origin;
(b) the Lyapunov numbers $\eta_{2k+1}$ are all zero [Proposition 3.1.4 and Theorem 3.1.5];
(c) system (3.2) has a formal first integral of the form $\Psi(u,v) = u^2 + v^2 + \cdots$ [Theorem 3.2.9];
(d) the complexification
$$
\begin{aligned}
\dot x_1 &= ix_1 + X_1(x_1,x_2) \\
\dot x_2 &= -ix_2 + X_2(x_1,x_2)
\end{aligned}\,, \qquad X_2(x_1,\bar x_1) = \overline{X_1(x_1,\bar x_1)} \qquad (3.46)
$$
of system (3.2) has a formal first integral of the form $\Psi(x_1,x_2) = x_1 x_2 + \cdots$
[Theorems 3.2.5 and 3.2.7];
(e) the coefficients $G_{2k+1}$ of the function $G = Y_1 + Y_2$ computed from the normal
form
$$
\begin{aligned}
\dot y_1 &= i\omega y_1 + y_1 Y_1(y_1 y_2) \\
\dot y_2 &= -i\omega y_2 + y_2 Y_2(y_1 y_2)
\end{aligned} \qquad (3.47)
$$
of the complexification (3.46) of (3.2) (produced by a normalizing transformation that satisfies the condition of Proposition 3.2.1(2)) are all zero [Theorem
3.2.7];
(f) in the normal form (3.47) of the complexification (3.46) of system (3.2) (produced by a normalizing transformation that satisfies the condition of Proposition 3.2.1(2)) the real parts of the functions on the right-hand side are all zero
[Theorem 3.2.7 and Remark 3.2.8];
(g) there exists a real analytic transformation in a neighborhood of the origin of
the form $u = \xi + \phi(\xi,\eta)$, $v = \eta + \psi(\xi,\eta)$ that transforms system (3.2) into the
form $\dot\xi = -\eta F(\xi^2+\eta^2)$, $\dot\eta = \xi F(\xi^2+\eta^2)$, for some real analytic function $F$
satisfying $F(0) = 1$ [Theorem 3.2.10].
Following Dulac ([64]), we use statement (d) of the theorem to extend the con-
cept of a center to certain complex systems. See also Definition 3.3.7 below.
Definition 3.3.2. Consider the system
$$
\begin{aligned}
\dot x_1 &= ix_1 + X_1(x_1,x_2) \\
\dot x_2 &= -ix_2 + X_2(x_1,x_2),
\end{aligned} \qquad (3.48)
$$
where $x_1$ and $x_2$ are complex variables and $X_1$ and $X_2$ are complex series without
constant or linear terms that are convergent in a neighborhood of the origin. System
(3.48) is said to have a center at the origin if it has a formal first integral of the form
$$\Psi(x_1,x_2) = x_1 x_2 + \sum_{j+k \ge 3} w_{jk}\, x_1^j x_2^k. \qquad (3.49)$$
If in (3.48) $X_1$ and $X_2$ satisfy the condition $X_2(x_1,\bar x_1) = \overline{X_1(x_1,\bar x_1)}$, then (3.48)
is the complexification of a real system that has a center at the origin, and the real
system can be recovered from either equation by replacing $x_2$ by $\bar x_1$. It is worth
emphasizing that the basis of the definition is that a real system of the form (3.2)
has a center at the origin if and only if its complexification has a center at $(0,0) \in \mathbb{C}^2$.
Thus far we have dealt primarily with analytic systems of differential equations
in which the coefficients were fixed. We now come to the central object of study in
this book, planar polynomial systems having a singularity at the origin at which the
eigenvalues of the linear part are purely imaginary, and whose coefficients depend
on parameters. Up to a time rescaling any such system can be written in the form
(3.2), where U and V are now polynomial functions, hence by (3.21) its complexi-
fication can be written in the form
$$
\begin{aligned}
\dot x &= \widetilde{P}(x,y) = i\Bigl(x - \sum_{(p,q)\in S} a_{pq}\, x^{p+1} y^q\Bigr) \\
\dot y &= \widetilde{Q}(x,y) = -i\Bigl(y - \sum_{(p,q)\in S} b_{qp}\, x^q y^{p+1}\Bigr),
\end{aligned} \qquad (3.50)
$$
where the coefficients of $\widetilde{P}$ and $\widetilde{Q}$ are complex, where $S \subset \mathbb{N}_{-1} \times \mathbb{N}_0$ is a finite set
($\mathbb{N}_{-1} := \{-1\} \cup \mathbb{N}$), every element $(p,q)$ of which satisfies $p + q \ge 1$, and where
$b_{qp} = \bar a_{pq}$ for all $(p,q) \in S$. This is system (3.3) mentioned in the opening of this

chapter; we will always refer to it using that number. The somewhat unusual index-
ing is to simplify expressions that will arise later. Similarly, although system (3.3) is
a system of the form (3.48), we will find that it is more convenient to completely fac-
tor out the i than it is to use the form (3.48) (see Exercise 3.14). We have not scaled
out the multiplicative factor of i so as to avoid complex time (but see Proposition
3.3.9).
The complexification of any individual system of the form (3.2) can be written in the notation of (3.3) by choosing the set $S$ and the individual coefficients
$a_{pq}$ and $b_{qp}$ suitably. However, (3.3) is intended to indicate a parametrized family
of systems under consideration. The set $S$ picks out the allowable nonzero coefficients and thereby specifies the family. Thus $S = \{(1,0), (0,1), (-1,2)\}$ corresponds to the collection of complex systems with linear part $\mathrm{diag}(i,-i)$ and with
arbitrary quadratic nonlinearities, while $S = \{(1,0), (-1,2)\}$ corresponds to all systems with the same linear part but with quadratic nonlinearities that do not contain
a monomial $xy$. Understood this way, (3.3) consists of only those families of systems that are closed under the involution of the dependent variables and parameters
$(x, y, a_{pq}, b_{qp}, t) \mapsto (y, x, b_{qp}, a_{pq}, -t)$. Thus while not completely general, (3.3) includes all systems that are complexifications of real systems, and is adapted to our
purposes.
We will allow in our considerations the full set of systems of the form (3.3)
without the requirement that $b_{qp} = \bar a_{pq}$. Thus throughout this book we take $\mathbb{C}^{2\ell}$ as
the parameter space of (3.3), where $\ell$ is the cardinality of the set $S$, and will denote
it by $E(a,b)$. $E(a) = E(a,\bar a)$ will denote the parameter space of
$$\dot x = \widetilde{P}(x,\bar x) = i\Bigl(x - \sum_{(p,q)\in S} a_{pq}\, x^{p+1} \bar x^q\Bigr), \qquad (3.51)$$
which we call the real polynomial system in complex form. That is, (3.51) is just
(3.19) for system (3.2). To shorten the notation we will let $\mathbb{C}[a,b]$ denote the polynomial ring in the variables $a_{pq}$, $b_{qp}$, $(p,q) \in S$. So, for example, if we want the
nonlinear terms in (3.3) to be precisely the collection of homogeneous quadratic
polynomials we take $S = \{(1,0), (0,1), (-1,2)\}$, and $E(a,b) = \mathbb{C}^6$. When the context makes it clear that we are considering systems of the form (3.3), for economy
of expression we will speak of "the system $(a,b)$" when we mean the system of the
form (3.3) with the choice $(a,b)$ of parameter values.
To determine if a system of the form (3.3) has a center at the origin, by Definition
3.3.2 we must look for a formal first integral of the form
$$\Psi(x,y) = xy + \sum_{j+k \ge 3} v_{j-1,k-1}\, x^j y^k, \qquad (3.52)$$
where $j, k \in \mathbb{N}_0$ and the indexing is chosen to simplify the formulas that we will
obtain in the next section. In this context the condition (3.31) of Remark 3.2.4 that
the function $\Psi(x,y)$ be a formal first integral is the identity
$$\mathscr{X}\Psi = \frac{\partial\Psi}{\partial x}\,\widetilde{P}(x,y) + \frac{\partial\Psi}{\partial y}\,\widetilde{Q}(x,y) \equiv 0, \qquad (3.53)$$
which by (3.52) is
$$
\begin{aligned}
i\Bigl(y + \sum_{j+k \ge 3} j\, v_{j-1,k-1}\, x^{j-1} y^k\Bigr)&\Bigl(x - \sum_{(p,q)\in S} a_{pq}\, x^{p+1} y^q\Bigr) \\
{}+ i\Bigl(x + \sum_{j+k \ge 3} k\, v_{j-1,k-1}\, x^j y^{k-1}\Bigr)&\Bigl(-y + \sum_{(p,q)\in S} b_{qp}\, x^q y^{p+1}\Bigr) \equiv 0.
\end{aligned} \qquad (3.54)
$$
In agreement with formula (3.52) we set $v_{00} = 1$ and $v_{1,-1} = v_{-1,1} = 0$ (so that $v_{00}$ is
the coefficient of $xy$ in $\Psi(x,y)$). We also set $a_{jk} = b_{kj} = 0$ for $(j,k) \notin S$. With these
conventions, for $k_1, k_2 \in \mathbb{N}_{-1}$, the coefficient $g_{k_1,k_2}$ of $x^{k_1+1} y^{k_2+1}$ in (3.54) is zero
for $k_1 + k_2 \le 0$ and for $k_1 + k_2 \ge 1$ is
$$g_{k_1,k_2} = i\Biggl[(k_1 - k_2)\, v_{k_1,k_2} - \sum_{\substack{s_1+s_2=0 \\ s_1,s_2 \ge -1}}^{k_1+k_2-1} \bigl[(s_1+1)\, a_{k_1-s_1,\,k_2-s_2} - (s_2+1)\, b_{k_1-s_1,\,k_2-s_2}\bigr]\, v_{s_1,s_2}\Biggr]. \qquad (3.55)$$
If we think in terms of starting with a system (3.3) and trying to build a formal
first integral $\Psi$ in a step-by-step process, at the first stage finding all $v_{k_1,k_2}$ for which
$k_1 + k_2 = 1$, at the second all $v_{k_1,k_2}$ for which $k_1 + k_2 = 2$, and so on, then for a pair $k_1$
and $k_2$, if $k_1 \neq k_2$, and if all coefficients $v_{j_1,j_2}$ are already known for $j_1 + j_2 < k_1 + k_2$,
then $v_{k_1,k_2}$ is uniquely determined by (3.55) and the condition that $g_{k_1,k_2}$ be zero, and
the process is successful at this step. By our specification of $v_{-1,1}$, $v_{00}$, and $v_{1,-1}$ the
procedure can be started. But at every second stage (in fact, at every even value of
$k_1 + k_2$), there is the one pair $k_1$ and $k_2$ such that $k_1 = k_2 = k > 0$, for which (3.55)
becomes
$$g_{kk} = -i\Biggl[\sum_{\substack{s_1+s_2=0 \\ s_1,s_2 \ge -1}}^{2k-1} \bigl[(s_1+1)\, a_{k-s_1,\,k-s_2} - (s_2+1)\, b_{k-s_1,\,k-s_2}\bigr]\, v_{s_1,s_2}\Biggr], \qquad (3.56)$$
so the process of constructing a formal first integral $\Psi$ succeeds at this step only
if the expression on the right-hand side of (3.56) is zero. The value of $v_{kk}$ is not
determined by equation (3.55) and may be assigned arbitrarily.
It is evident from (3.55) that for all indices $k_1$ and $k_2$ in $\{-1\} \cup \mathbb{N}_0$, $v_{k_1,k_2}$ is a
polynomial function of the coefficients of (3.3), that is, is an element of the set that
we have denoted $\mathbb{C}[a,b]$, hence by (3.56) so are the expressions $g_{kk}$ for all $k$.
The polynomial $g_{11}$ is unique, but for $k \ge 2$ the polynomial $g_{kk}$ depends on the
arbitrary choices made for $v_{jj}$ for $1 \le j < k$. So while it is clear that if for system
$(a^*, b^*)$, $g_{kk}(a^*, b^*) = 0$ for all $k \in \mathbb{N}$, then there is a center at the origin, since
the process of constructing the formal first integral $\Psi$ succeeds at every step, the
truth of the converse is not immediately apparent. For even if for some $k \ge 2$ we

obtained gkk (a∗ , b∗ ) 6= 0, it is conceivable that if we had made different choices for
the polynomials v j j for 1 ≤ j < k, we might have gotten gkk (a∗ , b∗ ) = 0. We will
show below (Theorem 3.3.5) that in fact whether or not gkk vanishes at any particular
(a∗ , b∗ ) ∈ E(a, b) is independent of the choices of the vkk . Thus the polynomial gkk
may be thought of as the kth “obstacle” to the existence of a first integral (3.52): if at
a point (a∗ , b∗ ) of our parameter space E(a, b) gkk (a∗ , b∗ ) 6= 0, then the construction
process fails at that step, no formal first integral of the form (3.52) exists for the
corresponding system (3.3), and by Definition 3.3.2 that system does not have a
center at the origin. Only if all the polynomials gkk vanish, gkk (a∗ , b∗ ) = 0 for all
k > 0, does the corresponding system (3.3) have a formal first integral of the form
(3.52), hence have a center at the origin of C2 . Although it is not generally true that
a first integral of the form (3.52) exists, the construction process always yields a
series of the form (3.52) for which X Ψ = Ψx Pe + Ψy Q e reduces to

X Ψ = g11 (xy)2 + g22(xy)3 + g33(xy)4 + · · · . (3.57)

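The step-by-step construction just described is entirely mechanical and can be sketched in a few lines of code. The following is our own illustrative implementation, not code from the book (the function name `focus_quantities` is hypothetical): it fills in the coefficients $v_{k_1,k_2}$ degree by degree from (3.55), taking $v_{kk} = 0$ as the arbitrary choice, and records the obstructions $g_{kk}$ from (3.56). For the quadratic family (3.58) with numerical parameter values it reproduces the closed form $g_{11} = -i(a_{10}a_{01} - b_{01}b_{10})$, which one can also obtain by hand from the Stage 1 values of Example 3.3.4.

```python
def focus_quantities(a, b, kmax):
    """Build the coefficients v_{k1,k2} of (3.52) degree by degree using (3.55),
    taking v_kk = 0 as the arbitrary choice, and collect the focus quantities
    g_kk of (3.56) for k = 1, ..., kmax.  a and b are dicts mapping index pairs
    (p, q) to (complex) coefficient values; absent pairs count as zero."""
    A = lambda j, k: a.get((j, k), 0)
    B = lambda j, k: b.get((j, k), 0)
    v = {(-1, 1): 0, (0, 0): 1, (1, -1): 0}   # Stage 0 conventions
    g = {}
    for d in range(1, 2 * kmax + 1):          # d = k1 + k2
        for k1 in range(-1, d + 2):
            k2 = d - k1
            if k2 < -1:
                continue
            # inner sum of (3.55): s1 + s2 = 0, ..., d - 1 with s1, s2 >= -1
            total = 0
            for e in range(d):
                for s1 in range(-1, e + 2):
                    s2 = e - s1
                    c = ((s1 + 1) * A(k1 - s1, k2 - s2)
                         - (s2 + 1) * B(k1 - s1, k2 - s2))
                    total += c * v[(s1, s2)]
            if k1 == k2:
                g[k1] = -1j * total           # (3.56): the k-th obstruction
                v[(k1, k2)] = 0               # arbitrary; we choose 0
            else:
                v[(k1, k2)] = total / (k1 - k2)   # forces g_{k1,k2} = 0

    return v, g

# The quadratic family (3.58): S = {(1,0), (0,1), (-1,2)}, arbitrary values.
a = {(1, 0): 0.5 + 0.25j, (0, 1): -1 + 2j, (-1, 2): 0.75 - 1j}
b = {(0, 1): 2 - 1j, (1, 0): 0.125j, (2, -1): -3 + 0j}

v, g = focus_quantities(a, b, 1)
# Stage 1 of Example 3.3.4: v_{0,1} = -a_{01} + b_{01}, v_{-1,2} = -a_{-1,2}/3
assert abs(v[(0, 1)] - (b[(0, 1)] - a[(0, 1)])) < 1e-12
assert abs(v[(-1, 2)] + a[(-1, 2)] / 3) < 1e-12
# First focus quantity: g_11 = -i (a_10 a_01 - b_01 b_10)
expected = -1j * (a[(1, 0)] * a[(0, 1)] - b[(0, 1)] * b[(1, 0)])
assert abs(g[1] - expected) < 1e-12
```

For symbolic parameters one would run the same recurrence over a polynomial ring (e.g., in a computer algebra system), which is how the focus quantities are actually obtained as elements of $\mathbb{C}[a,b]$.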
Definition 3.3.3. Fix a set $S$. The polynomial $g_{kk}$ defined by (3.56) is called the $k$th
focus quantity for the singularity at the origin of system (3.3). The ideal of focus
quantities, $B = \langle g_{11}, g_{22}, \ldots, g_{jj}, \ldots \rangle \subset \mathbb{C}[a,b]$, is called the Bautin ideal, and the
affine variety $V_C = \mathbf{V}(B)$ is called the center variety for the singularity at the origin
of system (3.3), or more simply, of system (3.3). $B_k$ will denote the ideal generated
by the first $k$ focus quantities, $B_k = \langle g_{11}, g_{22}, \ldots, g_{kk} \rangle$.
Even though the focus quantities gkk are not unique, Theorem 3.3.5(2) shows that
the center variety is well-defined. First, though, we’ll illustrate the ideas so far with
a simple but important example.

Example 3.3.4. Let us consider the set of all systems of the form (3.3) that have
quadratic nonlinearities, so that, ordered from greatest to least under degree
lexicographic order, S is the ordered set S = {(1, 0), (0, 1), (−1, 2)}, and (3.3) reads

ẋ = i x − a10 x^2 − a01 xy − a−12 y^2
ẏ = −i y − b2,−1 x^2 − b10 xy − b01 y^2 .          (3.58)

We will use (3.55) and (3.56) to compute vk1 ,k2 through k1 + k2 = 2.


Stage 0 : k1 + k2 = 0 : (k1 , k2 ) ∈ {(−1, 1), (0, 0), (1, −1)}
By definition, v−11 = 0, v00 = 1, and v1,−1 = 0.
Stage 1: k1 + k2 = 1 : (k1 , k2 ) ∈ {(−1, 2), (0, 1), (1, 0), (2, −1)}
In (3.55) s1 + s2 runs from 0 to 0, so the sum is over the terms (s1 , s2 ) in the index
set {(−1, 1), (0, 0), (1, −1)}. Inserting the values of vs1 ,s2 from the previous stage,
(3.55) reduces to gk1 ,k2 = i[(k1 − k2 )vk1 ,k2 − ak1 ,k2 + bk1 ,k2 ]. Setting gk1 ,k2 equal to
zero for each choice of (k1 , k2 ) yields

v−12 = −(1/3) a−12 ,  v01 = −a01 + b01 ,  v10 = a10 − b10 ,  v2,−1 = −(1/3) b2,−1 ,

where we have applied the convention that a pq = bqp = 0 if (p, q) ∉ S.


3.3 The Center Variety 113

Stage 2 : k1 + k2 = 2 : (k1 , k2 ) ∈ {(−1, 3), (0, 2), (1, 1), (2, 0), (3, −1)}
In (3.55) s1 + s2 runs from 0 to 1, so the sum is over the terms (s1 , s2 ) in the index
set {(−1, 1), (0, 0), (1, −1); (−1, 2), (0, 1), (1, 0), (2, −1)}, and (3.55) is

gk1 ,k2 = i[(k1 − k2 )vk1 ,k2 + 2bk1 +1,k2 −1 v−11 − (ak1 ,k2 − bk1 ,k2 )v00 − 2ak1 −1,k2 +1 v1,−1
         + 3bk1 +1,k2 −2 v−12 − (ak1 ,k2 −1 − 2bk1 ,k2 −1 )v01
         − (2ak1 −1,k2 − bk1 −1,k2 )v10 − 3ak1 −2,k2 +1 v2,−1 ] .

For the first choice (k1 , k2 ) = (−1, 3) this reads

g−13 = i[−4v−13 + 2b02 v−11 − (a−13 − b−13 )v00 − 2a−24 v1,−1 + 3b01 v−12
        − (a−12 − 2b−12 )v01 − (2a−23 − b−23 )v10 − 3a−34 v2,−1 ]
     = i[−4v−13 + 3b01 v−12 − a−12 v01 ] ,

where we have applied the convention that a pq = bqp = 0 if (p, q) ∉ S. Setting g−1,3
equal to zero and inserting the known values of vi j from the previous two stages
yields v−13 = −(1/2) a−12 b01 + (1/4) a−12 a01 . Applying the same procedure for all the
remaining choices of (k1 , k2 ) except (k1 , k2 ) = (1, 1) yields

v02 = (1/2) a−12 b10 + (1/2) a01^2 − (3/2) a01 b01 + b01^2 − a10 a−12 ,
v20 = (1/2) a01 b2,−1 − b01 b2,−1 + a10^2 − (3/2) a10 b10 + (1/2) b10^2 ,
v3,−1 = −(1/2) a10 b2,−1 + (1/4) b10 b2,−1 .

When (k1 , k2 ) = (1, 1), (3.55) becomes

g11 = i[0 · v11 + 2b20 v−11 − (a11 − b11 )v00 − 2a02 v1,−1
      + 3b2,−1 v−12 − (a10 − 2b10 )v01 − (2a01 − b01 )v10 − 3a−12 v2,−1 ]
    = −i[a10 a01 − b10 b01 ] .

This is the first focus quantity, which must be zero in order for an element of family
(3.58) to have a center at the origin.
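The two stages just carried out can be mechanized directly from (3.55) and (3.56). The sketch below is our illustration, not code from the literature (the dictionary representation and all names are assumptions): it encodes each monomial in the parameters as an exponent 6-tuple in the order (a−12 , a01 , a10 , b01 , b10 , b2,−1 ) and reproduces v−12 and i g11 for family (3.58).

```python
from fractions import Fraction

# Index set S of (3.58), ordered as in Example 3.3.4; a polynomial in the
# parameters (a_{-1,2}, a_{0,1}, a_{1,0}, b_{0,1}, b_{1,0}, b_{2,-1}) is
# represented as a dict {exponent 6-tuple: Fraction coefficient}.
S = [(-1, 2), (0, 1), (1, 0)]
N = 2 * len(S)

def unit(i):
    nu = [0] * N
    nu[i] = 1
    return tuple(nu)

def a_mono(p, q):      # exponent tuple of a_{p,q}, or None if (p,q) not in S
    return unit(S.index((p, q))) if (p, q) in S else None

def b_mono(q, p):      # b_{q,p} occupies the position mirroring a_{p,q}
    return unit(N - 1 - S.index((p, q))) if (p, q) in S else None

def acc(f, g, m, c):   # f += c * (monomial with exponents m) * g
    for nu, cf in g.items():
        key = tuple(x + y for x, y in zip(nu, m))
        f[key] = f.get(key, Fraction(0)) + c * cf
        if f[key] == 0:
            del f[key]

# Stage 0: v_{1,-1} = v_{-1,1} = 0 and v_{0,0} = 1
v = {(1, -1): {}, (-1, 1): {}, (0, 0): {(0,) * N: Fraction(1)}}
ig = {}                # ig[k] holds i*g_{kk}, which has rational coefficients

for deg in range(1, 3):            # stages k1 + k2 = 1 and k1 + k2 = 2
    for k1 in range(-1, deg + 2):
        k2 = deg - k1
        total = {}                 # the sum over (s1, s2) in (3.55)
        for (s1, s2), vs in list(v.items()):
            if not 0 <= s1 + s2 <= deg - 1:
                continue
            am = a_mono(k1 - s1, k2 - s2)
            bm = b_mono(k1 - s1, k2 - s2)
            if am is not None:
                acc(total, vs, am, Fraction(s1 + 1))
            if bm is not None:
                acc(total, vs, bm, Fraction(-(s2 + 1)))
        if k1 == k2:
            v[(k1, k2)] = {}
            ig[k1] = total         # g_{kk} = -i * total
        else:
            v[(k1, k2)] = {nu: c / (k1 - k2) for nu, c in total.items()}

# v_{-1,2} = -(1/3) a_{-1,2};  i*g11 = a01*a10 - b01*b10
assert v[(-1, 2)] == {(1, 0, 0, 0, 0, 0): Fraction(-1, 3)}
assert ig[1] == {(0, 1, 1, 0, 0, 0): 1, (0, 0, 0, 1, 1, 0): -1}
```

Since g11 = −i(a10 a01 − b10 b01 ), the dictionary ig[1] carries exactly the two monomials a01 a10 and b01 b10 , with coefficients 1 and −1; the cancellation of a−12 b2,−1 happens automatically in the accumulation.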
As promised, we now show that for fixed k ∈ N, the variety V(gkk ) is the same
for all choices (when k ≥ 2) of the polynomials v j j , j < k, which determine gkk , and
thus that the center variety VC is well-defined.
Theorem 3.3.5. Fix a set S and consider family (3.3).
1. Let Ψ be a formal series of the form (3.52) and let g11 (a, b), g22 (a, b), . . . be
polynomials satisfying (3.57) with respect to system (3.3). Then system (a∗ , b∗ )
has a center at the origin if and only if gkk (a∗ , b∗ ) = 0 for all k ∈ N.
2. Let Ψ and gkk be as in (1) and suppose there exist another function Ψ ′ of the form
(3.52) and polynomials g′11 (a, b), g′22 (a, b), . . . that satisfy (3.57) with respect to
family (3.3). Then VC = VC′ , where VC = V(g11 (a, b), g22 (a, b), . . . ) and where
VC′ = V(g′11 (a, b), g′22 (a, b), . . . ).

Proof. 1) Suppose that family (3.3) is as in the statement of the theorem. Let Ψ be a
formal series of the form (3.52) and let {gkk (a, b) : k ∈ N} be polynomials in (a, b)
that satisfy (3.57).
If, for (a∗ , b∗ ) ∈ E(a, b), gkk (a∗ , b∗ ) = 0 for all k ∈ N, then Ψ is a formal first
integral for the corresponding family in (3.3), so by Definition 3.3.2 the system has
a center at the origin of C^2 .
To prove the converse, we first make the following observations. Suppose that
there exist a k ∈ N and a choice (a∗ , b∗ ) of the parameters such that g j j (a∗ , b∗ ) = 0
for 1 ≤ j ≤ k − 1 but gkk (a∗ , b∗ ) 6= 0. Let H(x1 , y1 ) be the distinguished normalizing
transformation (2.33), producing the distinguished normal form (3.27), here written
ẋ1 = ix1 + x1 X(x1 y1 ), ẏ1 = −iy1 + y1Y (x1 y1 ), and consider the function F = Ψ ◦ H.
By construction
[ix1 + x1 X(x1 y1 )] ∂F/∂x1 (x1 , y1 ) + [−iy1 + y1 Y (x1 y1 )] ∂F/∂y1 (x1 , y1 )
   = gkk (a∗ , b∗ )[x1 + h1 (x1 , y1 )]^{k+1} [y1 + h2 (x1 , y1 )]^{k+1} + · · ·          (3.59)
   = gkk (a∗ , b∗ ) x1^{k+1} y1^{k+1} + · · · .

Through order 2k + 1 this is almost precisely equation (3.32) (which was derived on
the admissible simplifying assumption that λ1 = p and λ2 = q), so that if we repeat
practically verbatim the argument that follows (3.32), we obtain identity (3.36), with
p = q = 1, through order 2k + 2. (We are able to obtain the conclusion about the form
of F with regard to the terms of order 2k + 2 from the fact that at the first step at
which the terms on the right-hand side of (3.59) must be taken into account, they
have the form a constant times (x1 y1 )^{k+1} .) Therefore

F(x1 , y1 ) = f1 · (x1 y1 ) + · · · + fk+1 · (x1 y1 )^{k+1} + U(x1 , y1 ) = f (x1 y1 ) + U(x1 , y1 ),

where f1 = 1 and U(x1 , y1 ) begins with terms of order at least 2k + 3. Thus

x1 ∂F/∂x1 = x1 y1 f ′ (x1 y1 ) + α (x1 , y1 )   and   y1 ∂F/∂y1 = x1 y1 f ′ (x1 y1 ) + β (x1 , y1 ),

where α (x1 , y1 ) and β (x1 , y1 ) begin with terms of order at least 2k + 3, and so the
left-hand side of (3.59) is

i[α (x1 , y1 ) − β (x1 , y1 )] + (X(x1 y1 ) + Y (x1 y1 )) x1 y1 f ′ (x1 y1 )
   + X(x1 y1 ) α (x1 , y1 ) + Y (x1 y1 ) β (x1 , y1 ) .

Hence if we subtract

i[α (x1 , y1 ) − β (x1 , y1 )] + X(x1y1 ) α (x1 , y1 ) + Y (x1 y1 ) β (x1 , y1 ) ,

which begins with terms of order at least 2k + 3, from each side of (3.59), we obtain

G(x1 y1 ) x1 y1 f ′ (x1 y1 ) = gkk (a∗ , b∗ )(x1 y1 )^{k+1} + · · · ,          (3.60)



where G is the function of (3.28b). Suppose now, contrary to what we wish to
show, that system (3.3) for the choice (a, b) = (a∗ , b∗ ) has a center at the origin of
C^2 , so that it admits a first integral Φ (x, y) = xy + · · · . Then by part (1) of Theorem
3.2.5, the function G vanishes identically, hence the left-hand side of (3.60) is
identically zero, whereas the right-hand side is not, a contradiction.
2) If VC ≠ VC′ , then there exists (a∗ , b∗ ) that belongs to one of the varieties VC
and VC′ but not to the other, say (a∗ , b∗ ) ∈ VC but (a∗ , b∗ ) 6∈ VC′ . The inclusion
(a∗ , b∗ ) ∈ VC means that the system corresponding to (a∗ , b∗ ) has a center at the
origin. Therefore by part (1) g′kk (a∗ , b∗ ) = 0 for all k ∈ N. This contradicts our as-
sumption that (a∗ , b∗ ) 6∈ VC′ . 
Remark 3.3.6. Note that it is a consequence of (3.60) that if, for a particular
(a∗ , b∗ ) ∈ E(a, b), gkk (a∗ , b∗ ) is the first nonzero focus quantity, then the first
nonzero coefficient of G(a∗ , b∗ ) is G2k+1 (a∗ , b∗ ) and G2k+1 (a∗ , b∗ ) = gkk (a∗ , b∗ ).
Thus points of VC correspond precisely to systems in family (3.3) that have a
center at the origin of C^2 , in the sense that there exists a first integral of the form
(3.52). If (a, b) ∈ VC and a pq = b̄qp for all (p, q) ∈ S, which we will denote b = ā,
then such a point corresponds to a system that is the complexification of the real
system expressed in complex coordinates as (3.51), which then has a topological
center at the origin. More generally, we can consider the intersection of the center
variety VC with the set Π := {(a, b) : b = ā} whose elements correspond to
complexifications of real systems; we call this the real center variety VCR := VC ∩ Π . To
the set Π there corresponds a family of real systems of ordinary differential equa-
tions on R^2 expressed in complex form as (3.51) or in real form as (3.2), and for
this family there is a space E^R of real parameters. Within E^R there is a variety V
corresponding to systems that have a center at the origin. By Theorem 3.3.1 points
of V are in one-to-one correspondence with points of VCR .
By Theorems 3.1.5, 3.2.7, and 3.3.5, in order to find either VC or VCR , one can
compute the Lyapunov numbers η2k+1 , the coefficients G2k+1 of the function
G defined by (3.28), or the focus quantities gkk , all of which are polynomial
functions of the parameters. (Recall that the functions U and V in (3.2) are now assumed
to be polynomials. It is apparent from the discussion in the proof of Theorem 2.3.8
that the coefficients of the normalizing transformation x = H(y) = y + h(y) and of
the resulting normal form are polynomial functions of the parameters (a, b) with
coefficients in the field of Gaussian rationals, Q(i) = {a + ib : a, b ∈ Q}, provided
that, when m and α form a resonant pair, we choose h_m^{(α)} to be a polynomial in (a, b)
with coefficients in Q(i). Thus G2k+1 ∈ Q[a, b]. That η2k+1 is a polynomial in (a, b)
is Proposition 6.2.2.) They can be markedly different from each other, but they all
pick out the same varieties VC and VCR . From the point of view of applications the
most interesting systems are the real systems. The trouble is, of course, that the field
R is not algebraically closed, making it far more difficult to study real varieties than
complex varieties. This is why we will primarily investigate the center problem for
complex systems (3.3). If the real case is the one of interest, then it is a straightforward
matter to pass from VC to the real variety VCR by means of the substitution b = ā.

It must be noted that in the literature all of the polynomials η2k+1 , G2k+1 , and gkk
are called “focus quantities” (or numbers) or “Lyapunov quantities” (or numbers, or
constants), but we reserve the term “focus quantities” for the polynomials gkk . When
they are viewed as polynomials in the coefficients of a family of systems, we refer
to the η2k+1 as “Lyapunov quantities.” We underscore the fact that once an indexing
set S is fixed, the polynomial gkk is determined for any k ∈ N, up to choices made at
earlier stages when k > 1. The polynomial gkk belongs to the family (3.3), not to any
particular system in the family. If a system is specified, by specifying the parameter
string (a∗ , b∗ ), then what is also called a kth focus quantity is obtained for that
system, namely, the element of C that is obtained by evaluating gkk at (a∗ , b∗ ). In this
sense the term “the kth focus quantity” has two meanings: first, as the polynomial
gkk ∈ C[a, b], and second as the number gkk (a∗ , b∗ ) obtained when gkk is regarded
as a function from E(a, b) to C and is evaluated at the specific string (a∗ , b∗ ). The
same is true of η2k+1 .
Although a system of the form (3.22) can never in any sense correspond to a
real system with a center if λ2 ≠ λ̄1 , when λ1 and λ2 are rationally related, say
λ1 /λ2 = −p/q, the condition G ≡ 0 on its normal form for the existence of a first
integral of the form Ψ (x1 , x2 ) = x1^q x2^p + · · · (Theorem 3.2.5) is precisely the condition
(with p = q = 1 and expressed in terms of the normal form of its complexification)
that a real system have a center (Theorem 3.3.1(e)). Thus it is reasonable to extend
Definition 3.3.2 of a center of a system on C2 as follows. We restrict to λ1 = p and
λ2 = −q to avoid complex time that might be introduced by a time rescaling. See
also Exercise 3.8.

Definition 3.3.7. Consider the system

ẋ1 = px1 + X1 (x1 , x2 )
ẋ2 = −qx2 + X2 (x1 , x2 ),          (3.61)

where x1 and x2 are complex variables, X1 and X2 are complex series without con-
stant or linear terms and are convergent in a neighborhood of the origin, and p and q
are elements of N satisfying GCD(p, q) = 1. System (3.61) is said to have a p : −q
resonant center at the origin if it has a formal first integral of the form

Ψ (x1 , x2 ) = x1^q x2^p + ∑_{ j+k>p+q} w jk x1^j x2^k .          (3.62)

By Theorem 3.2.5, the origin is a p : −q resonant center for (3.61) if and only if
there exists a convergent first integral of the form (3.62). (Compare Corollary 3.2.6.)
Another immediate corollary of Theorem 3.2.5 is the following characterization of
a p : −q resonant center.

Proposition 3.3.8. The origin is a p : −q resonant center of system (3.61) if and
only if the function G computed from its normal form (defined by equation (3.28b))
vanishes identically.

When (3.61) is a polynomial system, we can write it in the form



ẋ = px − ∑_{( j,k)∈S} a jk x^{ j+1} y^k
ẏ = −qy + ∑_{( j,k)∈S} bk j x^k y^{ j+1} .          (3.63)

Repeating the same procedure used for family (3.3), we now look for a first integral
in the form
Ψ (x, y) = x^q y^p + ∑_{ j+k>p+q, j,k∈N0} v j−q,k−p x^j y^k ,          (3.64)

where the indexing has been chosen in direct analogy with that used in (3.52). Con-
dition (3.53) now reads
  
[ qx^{q−1} y^p + ∑_{ j+k>p+q} j v j−q,k−p x^{ j−1} y^k ][ px − ∑_{(m,n)∈S} amn x^{m+1} y^n ]
 + [ px^q y^{p−1} + ∑_{ j+k>p+q} k v j−q,k−p x^j y^{k−1} ][ −qy + ∑_{(m,n)∈S} bnm x^n y^{m+1} ] ≡ 0 .          (3.65)

We augment the set of coefficients in (3.52) with the collection

J = {v−q+s,q−s : s = 0, . . . , p + q},

where, in agreement with formula (3.64), we set v00 = 1 and vmn = 0 for all other
elements of J, so that elements of J are the coefficients of the terms of degree p+q in
Ψ (x, y). With the convention amn = bnm = 0 for (m, n) ∉ S, for (k1 , k2 ) ∈ N−q × N−p
the coefficient gk1 ,k2 of xk1 +q yk2 +p in (3.65) is zero for k1 + k2 ≤ 0 and for k1 + k2 ≥ 1
is given by

gk1 ,k2 = (pk1 − qk2 )vk1 ,k2
   − ∑_{s1 +s2 =0, s1 ≥−q, s2 ≥−p}^{k1 +k2 −1} [ (s1 + q)ak1 −s1 ,k2 −s2 − (s2 + p)bk1 −s1 ,k2 −s2 ] vs1 ,s2 .          (3.66)
The focus quantities are now
gkq,kp = − ∑_{s1 +s2 =0, s1 ≥−q, s2 ≥−p}^{kq+kp−1} [ (s1 + q)akq−s1 ,kp−s2 − (s2 + p)bkq−s1 ,kp−s2 ] vs1 ,s2 ,          (3.67)

so that we always obtain a series of the form (3.64) for which X Ψ reduces to

X Ψ = gq,p (x^q y^p )^2 + g2q,2p (x^q y^p )^3 + g3q,3p (x^q y^p )^4 + · · · .          (3.68)

The center variety for family (3.63) is defined in analogy with Definition 3.3.3.
With appropriate changes, the proof of the analogue of Theorem 3.3.5 goes through

as before to show that the center variety for family (3.63) is well-defined. Moreover,
we have the following result.
Proposition 3.3.9. The center variety of system (3.63) with p = q = 1 coincides with
the center variety of system (3.3).
Proof. When p = q = 1 the computation of the condition that (3.62) be a formal
first integral of (3.63) yields (3.54) but without the multiplicative factor of i. 
Based on this proposition we are free to consider the system

ẋ = P̃(x, y) = x − ∑_{(p,q)∈S} a pq x^{p+1} y^q
ẏ = Q̃(x, y) = −y + ∑_{(p,q)∈S} bqp x^q y^{p+1}          (3.69)

instead of system (3.3). As the reader has by now noticed, the expression for the
focus quantities may or may not contain a multiplicative factor of i, depending on
how the problem was set up. Since we are concerned only with the variety the focus
quantities determine, hence only with their zeros, the presence or absence of such a
nonzero factor is unimportant.
There are different points of view as to what exactly constitutes the center prob-
lem. Under one approach, to resolve the center problem for a family of polynomial
systems means to find the center variety of the family. More precisely, to solve the
center problem for the family (3.3) or the family (3.63) of polynomial systems, for
a fixed set S, means to find the center variety and its irreducible decomposition
V(B) = V1 ∪ · · · ∪Vk . In the following sections we will present some methods that
enable one to resolve the center problem in this sense for many particular families
of polynomial systems.
In a wider sense the center problem is the problem of finding an algorithm that
will guarantee complete identification of the center variety of any family of polyno-
mial systems in a finite number of steps. It is unknown if the methods that we will
present are sufficient for the construction of such an algorithm. Understood in this
sense, the center problem seems to be still far from resolution.

3.4 Focus Quantities and Their Properties

Calculating any of the polynomials η2k+1 , G2k+1 , or gkk is a difficult computational
problem because the number of terms in these polynomials grows so fast. The focus
quantities gkk are the easiest to calculate, but even they are difficult to compute for
large k if we use only formulas (3.55) and (3.56). In this section we identify structure
in the focus quantities gkk for systems of the form (3.3). This structure is the basis
for an efficient algorithm for computing the focus quantities.
We begin by directing the reader’s attention back to Example 3.3.4. Consider
v02 = (1/2) a−12 b10 + (1/2) a01^2 − (3/2) a01 b01 + b01^2 − a10 a−12 , and note that for any monomial

that appears, the sum of the product of the index of each term (as an element of
N−1 × N0 ) with its exponent is the index of v02 :

a−12 b10 : 1 · (−1, 2) + 1 · (1, 0) = (0, 2)
a01^2 : 2 · (0, 1) = (0, 2)
a01 b01 : 1 · (0, 1) + 1 · (0, 1) = (0, 2)          (3.70)
b01^2 : 2 · (0, 1) = (0, 2)
a10 a−12 : 1 · (1, 0) + 1 · (−1, 2) = (0, 2)

The reader can verify that the same is true for every monomial in every coefficient
v jk computed in Example 3.3.4.
To express this fact in general we introduce the following notation. We order the
index set S in equation (3.3) in some manner, say by degree lexicographic order from
least to greatest, and write the ordered set S as S = {(p1 , q1 ), . . . , (pℓ , qℓ )}. Consistently
with this we then order the parameters as (a p1 ,q1 , . . . , a pℓ ,qℓ , bqℓ ,pℓ , . . . , bq1 ,p1 ),
so that any monomial appearing in vi j has the form a_{p1 ,q1}^{ν1} · · · a_{pℓ ,qℓ}^{νℓ} b_{qℓ ,pℓ}^{νℓ+1} · · · b_{q1 ,p1}^{ν2ℓ} for
some ν = (ν1 , . . . , ν2ℓ ). To simplify the notation, for ν ∈ N0^{2ℓ} we write

[ν ] := a_{p1 ,q1}^{ν1} · · · a_{pℓ ,qℓ}^{νℓ} b_{qℓ ,pℓ}^{νℓ+1} · · · b_{q1 ,p1}^{ν2ℓ} .

We will write just C[a, b] in place of C[a p1 ,q1 , . . . , a pℓ ,qℓ , bqℓ ,pℓ , . . . , bq1 ,p1 ], and for
f ∈ C[a, b] write f = ∑_{ν ∈Supp( f )} f^{(ν)} [ν ], where Supp( f ) denotes those ν ∈ N0^{2ℓ} such
that the coefficient of [ν ] in the polynomial f is nonzero.
Once the ℓ-element set S has been specified and ordered, we let L : N0^{2ℓ} → Z^2 be
the linear map defined by

L(ν ) = (L1 (ν ), L2 (ν ))
= ν1 (p1 , q1 ) + · · · + νℓ (pℓ , qℓ ) + νℓ+1 (qℓ , pℓ ) + · · · + ν2ℓ(q1 , p1 )
(3.71)
= (p1 ν1 + · · · + pℓνℓ + qℓνℓ+1 + · · · + q1ν2ℓ ,
q1 ν1 + · · · + qℓ νℓ + pℓ νℓ+1 + · · · + p1ν2ℓ ),

which is just the formal expression for the sums in (3.70). The fact that we have
observed about the v jk is that for each monomial [ν ] appearing in v jk , L(ν ) = ( j, k).
This is the basis for the following definition.
Definition 3.4.1. For ( j, k) ∈ N−1 × N−1 , a polynomial f = ∑ν ∈Supp( f ) f (ν ) [ν ] in
C[a, b] is a ( j, k)-polynomial if, for every ν ∈ Supp( f ), L(ν ) = ( j, k).
Theorem 3.4.2. Let family (3.3) (family (3.50)) be given. There exists a formal se-
ries Ψ (x, y) of the form (3.52) (note that j − 1 and k − 1 there correspond to j and k
in points (2) and (3) below) and polynomials g11 , g22 , . . . in C[a, b] such that
1. equation (3.57) holds,
2. for every pair ( j, k) ∈ N2−1 , j + k ≥ 0, v jk ∈ Q[a, b], and v jk is a ( j, k)-polynomial,
3. for every k ≥ 1, vkk = 0, and
4. for every k ≥ 1, igkk ∈ Q[a, b] (i = √−1), and gkk is a (k, k)-polynomial.

Proof. The discussion leading from equation (3.52) to equation (3.57) shows that if
we define v1,−1 = 0, v00 = 1, and v−11 = 0, then for k1 , k2 ≥ −1, if vk1 ,k2 are defined
recursively by
vk1 ,k2 = (1/(k1 − k2 )) ∑_{s1 +s2 =0, s1 ,s2 ≥−1}^{k1 +k2 −1} [(s1 + 1)ak1 −s1 ,k2 −s2 − (s2 + 1)bk1 −s1 ,k2 −s2 ] vs1 ,s2   if k1 ≠ k2 ,
vk1 ,k2 = 0   if k1 = k2 ,          (3.72)
where the recursion is on k1 + k2 (that is, find all vk1 ,k2 for which k1 + k2 = 1, then
find all vk1 ,k2 for which k1 + k2 = 2, and so on), and once all vk1 ,k2 are known for
k1 + k2 ≤ 2k − 1, gkk is defined by
gkk = −i ∑_{s1 +s2 =0, s1 ,s2 ≥−1}^{2k−1} [(s1 + 1)ak−s1 ,k−s2 − (s2 + 1)bk−s1 ,k−s2 ] vs1 ,s2 ,          (3.73)

then for every pair ( j, k), v jk ∈ Q[a, b], for every k, igkk ∈ Q[a, b], and equation
(3.57) holds. (An assumption that should be recalled is that in (3.72), ak1 −s1 ,k2 −s2
and bk1 −s1 ,k2 −s2 are replaced by zero when (k1 − s1 , k2 − s2 ) ∉ S.) By our definition
of vkk in (3.72), (3) holds. To show that v jk is a ( j, k)-polynomial we proceed by
induction on j + k.
Basis step. For k1 + k2 = 0, there are three polynomials, v−11 , v00 , and v1,−1 .
Since Supp(v−11 ) = Supp(v1,−1 ) = ∅, the condition in Definition 3.4.1 is vacuous
for v−11 and v1,−1 . For v00 , Supp(v00 ) = {(0, . . . , 0)}, and L(0, . . . , 0) = (0, 0).
Inductive step. Suppose that v jk is a ( j, k)-polynomial for all ( j, k) satisfying
j + k ≤ m, and that k1 + k2 = m + 1. Consider a term vs1 ,s2 ak1 −s1 ,k2 −s2 in the sum
in (3.72). If (k1 − s1 , k2 − s2 ) 6∈ S, then ak1 −s1 ,k2 −s2 = 0 by convention, and the term
does not appear. If (k1 − s1 , k2 − s2 ) = (pc , qc ) ∈ S, then
 
(ν ) (ν )
vs1 ,s2 ak1 −s1 ,k2 −s2 =  ∑ vs1 ,s2 [ν ] [µ ] = ∑ vs1 ,s2 [ν + µ ] , (3.74)
ν ∈Supp(vs1 ,s2 ) ν ∈Supp(vs1 ,s2 )

where µ = (0, . . . , 0, 1, 0, . . . , 0), with the 1 in the cth position, counting from the left.
Clearly L(µ ) = (k1 − s1 , k2 − s2 ), hence by the inductive hypothesis and additivity
of L, every term in (3.74) satisfies

L(ν + µ ) = L(ν ) + L(µ ) = (s1 , s2 ) + (k1 − s1 , k2 − s2 ) = (k1 , k2 ) .

Similarly, for any term vs1 ,s2 bk1 −s1 ,k2 −s2 such that (k1 − s1 , k2 − s2 ) = (qc , pc ), that is,
such that (k2 − s2 , k1 − s1 ) = (pc , qc ) ∈ S, we obtain an expression just like (3.74),
except that now the 1 in µ is in the cth position counting from the right, and we
easily compute that every term in this expression satisfies L(ν + µ ) = (k1 , k2 ). Thus
point (2) is fully established.

It is clear that the same arguments show that gkk is a (k, k)-polynomial, complet-
ing the proof of point (4). 
The import of Theorem 3.4.2 is that any monomial [ν ] can appear in at most one
of the polynomials vk1 ,k2 (or gk1 ,k2 if k1 − k2 = 0), and we find it by computing L(ν ).
For example, in the context of Example 3.3.4, S = {(−1, 2), (0, 1), (1, 0)}, for the
monomial a01^2 b10^3 b2,−1 = a−12^0 a01^2 a10^0 b01^0 b10^3 b2,−1^1 we have ν = (0, 2, 0, 0, 3, 1), and

L(ν ) = 0 · (−1, 2) + 2 · (0, 1) + 0 · (1, 0) + 0 · (0, 1) + 3 · (1, 0) + 1 · (2, −1) = (5, 1),

so that a01^2 b10^3 b2,−1 can appear in v51 but in no other polynomial vk1 ,k2 . From the
opposite point of view, since in the situation of this example L1 (ν ) + L2 (ν ) = |ν |
for all ν ∈ N0^6 , given (k1 , k2 ) ∈ N0^2 , any solution ν of (L1 (ν ), L2 (ν )) = (k1 , k2 ) must
satisfy |ν | = L1 (ν ) + L2 (ν ) = k1 + k2 . Hence we find all monomials that can appear
in vk1 ,k2 by checking L(ν ) = (k1 , k2 ) for all ν ∈ N0^6 with |ν | = k1 + k2 . Thus, for
example, for v−13 , we check L(ν ) = (−1, 3) for ν ∈ N0^6 satisfying |ν | = 2. The set
to check is

{(2, 0, 0, 0, 0, 0), (0, 2, 0, 0, 0, 0), . . . , (0, 0, 0, 0, 0, 2);
 (1, 1, 0, 0, 0, 0), (1, 0, 1, 0, 0, 0), . . . , (0, 0, 0, 0, 1, 1)}

consisting of 21 sextuples. A short computation shows that

(L1 (ν ), L2 (ν )) = (−ν1 + ν3 + ν5 + 2ν6 , 2ν1 + ν2 + ν4 − ν6 ) = (−1, 3)

only for ν = (1, 1, 0, 0, 0, 0) and ν = (1, 0, 0, 1, 0, 0), so that a−12 a01 and a−12 b01 ,
respectively, appear in v−13 . That is, v−13 = v_{−13}^{(1,1,0,0,0,0)} a−12 a01 + v_{−13}^{(1,0,0,1,0,0)} a−12 b01
for some unknown constants v_{−13}^{(1,1,0,0,0,0)} and v_{−13}^{(1,0,0,1,0,0)} . The next theorem shows
how to compute the coefficients v_{k1 ,k2}^{(ν)} . The following definition and lemma will be
needed.
Definition 3.4.3. Let f = ∑_{ν ∈Supp( f )} f^{(ν)} a_{p1 ,q1}^{ν1} · · · a_{pℓ ,qℓ}^{νℓ} b_{qℓ ,pℓ}^{νℓ+1} · · · b_{q1 ,p1}^{ν2ℓ} ∈ C[a, b]. The
conjugate f̂ of f is the polynomial obtained from f by the involution

f^{(ν)} → f̄^{(ν)} ,   ai j → b ji ,   b ji → ai j ;

that is, f̂ = ∑_{ν ∈Supp( f )} f̄^{(ν)} a_{p1 ,q1}^{ν2ℓ} · · · a_{pℓ ,qℓ}^{νℓ+1} b_{qℓ ,pℓ}^{νℓ} · · · b_{q1 ,p1}^{ν1} ∈ C[a, b].

Since [ν ] = a_{p1 ,q1}^{ν1} · · · a_{pℓ ,qℓ}^{νℓ} b_{qℓ ,pℓ}^{νℓ+1} · · · b_{q1 ,p1}^{ν2ℓ} , we have [ν ]̂ = a_{p1 ,q1}^{ν2ℓ} · · · a_{pℓ ,qℓ}^{νℓ+1} b_{qℓ ,pℓ}^{νℓ} · · · b_{q1 ,p1}^{ν1} , so that

[(ν1 , . . . , ν2ℓ )]̂ = [(ν2ℓ , . . . , ν1 )] .          (3.75)

For this reason we will also write, for ν = (ν1 , . . . , ν2ℓ ), ν̂ = (ν2ℓ , . . . , ν1 ).
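If a polynomial in C[a, b] is stored as a map from exponent 2ℓ-tuples to coefficients (a representation we assume purely for illustration), the involution amounts to reversing each tuple and conjugating each coefficient. A minimal sketch for the quadratic family, using i g11 = a01 a10 − b01 b10 from Example 3.3.4:

```python
from fractions import Fraction

# Conjugation of Definition 3.4.3 on a {exponent-tuple: coefficient} map.
# The coefficients here are rational, so complex conjugation is trivial.
def conj(f):
    return {nu[::-1]: c for nu, c in f.items()}

# i*g11 = a01*a10 - b01*b10 in the ordering
# (a_{-1,2}, a_{0,1}, a_{1,0}, b_{0,1}, b_{1,0}, b_{2,-1})
ig11 = {(0, 1, 1, 0, 0, 0): Fraction(1), (0, 0, 0, 1, 1, 0): Fraction(-1)}

# Reversing (0,1,1,0,0,0) gives (0,0,0,1,1,0): the two monomials swap,
# so conj(ig11) = -ig11, anticipating identity (3.78a) below.
assert conj(ig11) == {nu: -c for nu, c in ig11.items()}
```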
Let a family (3.3) (family (3.50)) for some set S of indices be fixed, and for any
ν ∈ N0^{2ℓ} define V (ν ) ∈ Q recursively, with respect to |ν | = ν1 + · · · + ν2ℓ , as follows:

V ((0, . . . , 0)) = 1;          (3.76a)

for ν ≠ (0, . . . , 0),

V (ν ) = 0   if L1 (ν ) = L2 (ν ),          (3.76b)

and when L1 (ν ) ≠ L2 (ν ),

V (ν ) = (1/(L1 (ν ) − L2 (ν )))
        × [ ∑_{ j=1}^{ℓ} Ṽ (ν1 , . . . , ν j − 1, . . . , ν2ℓ )(L1 (ν1 , . . . , ν j − 1, . . . , ν2ℓ ) + 1)          (3.76c)
          − ∑_{ j=ℓ+1}^{2ℓ} Ṽ (ν1 , . . . , ν j − 1, . . . , ν2ℓ )(L2 (ν1 , . . . , ν j − 1, . . . , ν2ℓ ) + 1) ] ,

where Ṽ (η ) = V (η ) if η ∈ N0^{2ℓ} , and Ṽ (η ) = 0 if η ∈ N−1^{2ℓ} \ N0^{2ℓ} .
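The recursion (3.76) translates directly into a short memoized program. The sketch below is our illustration (not code from the literature), specialized to the quadratic family of Example 3.3.4 with the parameters ordered (a−12 , a01 , a10 , b01 , b10 , b2,−1 ); the assertions check V (ν ) against coefficients computed in that example.

```python
from fractions import Fraction
from functools import lru_cache

# Index pairs attached to the six exponent positions: (p,q) for the a's,
# (q,p) for the b's, in the ordering fixed in Example 3.3.4.
S = [(-1, 2), (0, 1), (1, 0)]
pairs = S + [(q, p) for (p, q) in reversed(S)]
ELL = len(S)

def L(nu):
    return (sum(n * p for n, (p, q) in zip(nu, pairs)),
            sum(n * q for n, (p, q) in zip(nu, pairs)))

@lru_cache(maxsize=None)
def V(nu):
    if min(nu) < 0:
        return Fraction(0)           # the extension V~ vanishes off N_0^{2l}
    if not any(nu):
        return Fraction(1)           # (3.76a)
    L1, L2 = L(nu)
    if L1 == L2:
        return Fraction(0)           # (3.76b)
    total = Fraction(0)              # the bracketed sum of (3.76c)
    for j in range(2 * ELL):
        mu = tuple(n - (i == j) for i, n in enumerate(nu))
        L1m, L2m = L(mu)
        total += V(mu) * ((L1m + 1) if j < ELL else -(L2m + 1))
    return total / (L1 - L2)

# v_{-1,2} = -(1/3)a_{-1,2};  v_{-1,3} = -(1/2)a_{-1,2}b01 + (1/4)a_{-1,2}a01
assert V((1, 0, 0, 0, 0, 0)) == Fraction(-1, 3)
assert V((1, 1, 0, 0, 0, 0)) == Fraction(1, 4)
assert V((1, 0, 0, 1, 0, 0)) == Fraction(-1, 2)
assert V((0, 0, 0, 0, 1, 1)) == V((1, 1, 0, 0, 0, 0))   # V(nu-hat) = V(nu)
```

The last assertion matches the coefficient (1/4) of b10 b2,−1 in v3,−1 from Example 3.3.4.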

Lemma 3.4.4. Suppose ν ∈ N0^{2ℓ} is such that either L1 (ν ) < −1 or L2 (ν ) < −1. Then
V (ν ) = 0.
Proof. The proof is by induction on |ν |.
Basis step. A quick computation shows that if |ν | = 0 or |ν | = 1, then both
L1 (ν ) ≥ −1 and L2 (ν ) ≥ −1, so the basis step is |ν | = 2. This happens in two
ways:
(α ) ν = (0, . . . , 0, 2, 0, . . . , 0), with the 2 in position j;
(β ) ν = (0, . . . , 0, 1, 0, . . . , 0, 1, 0, . . . , 0), with the 1’s in positions j and k, j < k.
In case (α ), since for (p, q) ∈ S, q ≥ 0, L1 (ν ) ≤ −2 or L2 (ν ) ≤ −2 holds if and only
if:
(i) when 1 ≤ j ≤ ℓ : (p j , q j ) = (−1, q j ), and
(ii) when ℓ + 1 ≤ j ≤ 2ℓ : (q2ℓ− j+1, p2ℓ− j+1) = (q2ℓ− j+1, −1).
In subcase (i),

V (ν ) = (1/(−2 − 2q j )) V (0, . . . , 0, 1, 0, . . . , 0) (L1 (0, . . . , 0, 1, 0, . . . , 0) + 1)
      = (1/(−2 − 2q j )) V (0, . . . , 0, 1, 0, . . . , 0) (−1 + 1)
      = 0,

where the 1 is in position j. Subcase (ii) is the same, except for a minus sign.


In case (β ), L1 (ν ) ≤ −2 or L2 (ν ) ≤ −2 holds if and only if:
(i) when 1 ≤ j < k ≤ ℓ : (p j , q j ) = (−1, q j ) and (pk , qk ) = (−1, qk ), and
(ii) when ℓ + 1 ≤ j < k ≤ 2ℓ : (q2ℓ− j+1, p2ℓ− j+1) = (q2ℓ− j+1, −1) and
(q2ℓ−k+1, p2ℓ−k+1 ) = (q2ℓ−k+1, −1).

In subcase (i), with the surviving 1 in position k and in position j, respectively,

V (ν ) = (1/(−2 − q j − qk )) [ V (0, . . . , 0, 1, 0, . . . , 0) (L1 (0, . . . , 0, 1, 0, . . . , 0) + 1)
          + V (0, . . . , 0, 1, 0, . . . , 0) (L1 (0, . . . , 0, 1, 0, . . . , 0) + 1) ]
      = (1/(−2 − q j − qk )) [ V (0, . . . , 0, 1, 0, . . . , 0) (−1 + 1)
          + V (0, . . . , 0, 1, 0, . . . , 0) (−1 + 1) ] = 0 .

Subcase (ii) is the same, except for a minus sign.


Inductive step. Assume the lemma holds for all ν satisfying |ν | ≤ m and let ν be
such that |ν | = m + 1, L1 (ν ) < −1 or L2 (ν ) < −1, and L1 (ν ) ≠ L2 (ν ).
Suppose first that L1 (ν ) < −2. For any term in either sum in (3.76c), the argu-
ment µ of Ṽ satisfies |µ | ≤ m, and if µ arises in a term in the first sum,

L1 (µ ) = L1 (ν1 , . . . , ν j − 1, . . . , ν2ℓ ) = L1 (ν ) − p j < −1 ,

while if µ arises in a term in the second sum,

L1 (µ ) = L1 (ν1 , . . . , ν j − 1, . . ., ν2ℓ ) = L1 (ν ) − q2ℓ− j+1 < −2 ,

so that in either case, if Ṽ (µ ) is not automatically zero, then by the induction hy-
pothesis Ṽ (µ ) = V (µ ) = 0. The proof when L2 (ν ) < −2 is similar.
e
Now suppose L1 (ν ) = −2. For any term in either sum in (3.76c), the argument
µ of Ṽ satisfies |µ | ≤ m. For any term in the first sum in (3.76c), we have that
L1 (µ ) = L1 (ν ) − p j = −2 − p j . If p j ≠ −1, then L1 (µ ) ≤ −2 and by the induction
hypothesis Ṽ (µ ) = 0. If p j = −1, then L1 (µ ) + 1 = −1 + 1 = 0. In either case,
the term is zero. Again, for any term in the second sum in (3.76c), we have that
L1 (µ ) = L1 (ν ) − q2ℓ− j+1 = −2 − q2ℓ− j+1 ≤ −2, so by the induction hypothesis
Ṽ (µ ) = 0.
The proof when L2 (ν ) = −2 is similar. 

Here is the theorem that is the basis for an efficient computational algorithm for
obtaining the focus quantities for family (3.3).

Theorem 3.4.5. For a family of systems of the form (3.3) (family (3.50)), let Ψ be
the formal series of the form (3.52) and let {gkk : k ∈ N} be the polynomials in
C[a, b] given by (3.72) and (3.73), which satisfy the conditions of Theorem 3.4.2.
Then
1. for ν ∈ Supp(vk1 ,k2 ), the coefficient v_{k1 ,k2}^{(ν)} of [ν ] in vk1 ,k2 is V (ν ),
2. for ν ∈ Supp(gkk ), the coefficient g_{kk}^{(ν)} of [ν ] in gkk is

g_{kk}^{(ν)} = −i [ ∑_{ j=1}^{ℓ} Ṽ (ν1 , . . . , ν j − 1, . . . , ν2ℓ )(L1 (ν1 , . . . , ν j − 1, . . . , ν2ℓ ) + 1)          (3.77)
              − ∑_{ j=ℓ+1}^{2ℓ} Ṽ (ν1 , . . . , ν j − 1, . . . , ν2ℓ )(L2 (ν1 , . . . , ν j − 1, . . . , ν2ℓ ) + 1) ] ,

and
3. the following identities hold:

V (ν̂ ) = V (ν ) and g_{kk}^{(ν̂)} = −g_{kk}^{(ν)} for all ν ∈ N0^{2ℓ} ,          (3.78a)
V (ν ) = g_{kk}^{(ν)} = 0 if ν̂ = ν ≠ (0, . . . , 0) .          (3.78b)

Proof. The proof of part (1) is by induction on k1 + k2.


Basis step. For k1 + k2 = 0, there are three polynomials, v−11 , v00 , and v1,−1 .
Supp(v00 ) = {(0, . . . , 0)}, and

v00 = 1 · a_{p1 ,q1}^0 · · · a_{pℓ ,qℓ}^0 b_{qℓ ,pℓ}^0 · · · b_{q1 ,p1}^0 = V ((0, . . . , 0)) · a_{p1 ,q1}^0 · · · a_{pℓ ,qℓ}^0 b_{qℓ ,pℓ}^0 · · · b_{q1 ,p1}^0 ,

as required. Since Supp(v−11 ) = Supp(v1,−1 ) = ∅, statement (1) holds vacuously


for them.
Inductive step. Suppose statement (1) holds for vk1 ,k2 for k1 + k2 ≤ m, and let k1
and k2 be such that k1 + k2 = m + 1. If k1 = k2 , then vk1 ,k2 = 0 by Theorem 3.4.2(3).
By Theorem 3.4.2(2), for any ν ∈ Supp(vk1 ,k2 ), L(ν ) = (k1 , k2 ), so L1 (ν ) = L2 (ν )
and V (ν ) = 0, as required. If k1 ≠ k2 , then by (3.72)

vk1 ,k2 = (1/(k1 − k2 )) ∑_{s1 +s2 =0, s1 ,s2 ≥−1}^{k1 +k2 −1} [(s1 + 1)ak1 −s1 ,k2 −s2 − (s2 + 1)bk1 −s1 ,k2 −s2 ] vs1 ,s2

   = (1/(k1 − k2 )) ∑_{s1 +s2 =0, s1 ,s2 ≥−1}^{k1 +k2 −1} [ (s1 + 1) ( ∑_{µ ∈Supp(vs1 ,s2 )} v_{s1 ,s2}^{(µ)} [µ ] ) ak1 −s1 ,k2 −s2
        − (s2 + 1) ( ∑_{µ ∈Supp(vs1 ,s2 )} v_{s1 ,s2}^{(µ)} [µ ] ) bk1 −s1 ,k2 −s2 ]

   = (1/(k1 − k2 )) ∑_{s1 +s2 =0, s1 ,s2 ≥−1}^{k1 +k2 −1} [ (s1 + 1) ∑_{µ ∈Supp(vs1 ,s2 )} v_{s1 ,s2}^{(µ)} a_{p1 ,q1}^{µ1} · · · a_{pc ,qc}^{µc +1} · · · b_{q1 ,p1}^{µ2ℓ}
        − (s2 + 1) ∑_{µ ∈Supp(vs1 ,s2 )} v_{s1 ,s2}^{(µ)} a_{p1 ,q1}^{µ1} · · · b_{qd ,pd}^{µ2ℓ−d+1 +1} · · · b_{q1 ,p1}^{µ2ℓ} ] ,          (3.79)

where (pc , qc ) = (k1 − s1 , k2 − s2 ) (provided (k1 − s1 , k2 − s2 ) ∈ S, else by convention
the product is zero) and (qd , pd ) = (k1 − s1 , k2 − s2 ) (provided (k2 − s2 , k1 − s1 ) ∈ S,
else by convention the product is zero).
(ν )
Fix ν ∈ N2ℓ 0 for which L(ν ) = (k1 , k2 ). We wish to find the coefficient vk1 ,k2 of
[ν ] in vk1 ,k2 . For a fixed j ∈ {1, . . . , ℓ}, we first ask which pairs (s1 , s2 ) are such that
(pc , qc ) = (k1 − s1 , k2 − s2 ) = (p j , q j ). There is at most one such pair: s1 = k1 − p j
and s2 = k2 − q j ; it exists if and only if k1 − p j ≥ −1 and k2 − q j ≥ −1. For that pair,
we then ask which µ ∈ N0^{2ℓ} are such that (µ1 , . . . , µ j + 1, . . . , µ2ℓ ) = (ν1 , . . . , ν2ℓ ).
There is at most one such multi-index: (µ1 , µ2 , . . . , µ2ℓ ) = (ν1 , . . . , ν j − 1, . . . , ν2ℓ );
it exists if and only if ν j ≥ 1. For this µ ,

L(µ ) = ν1 (p1 , q1 ) + · · · + ν2ℓ(q1 , p1 ) − (p j , q j ) = (k1 − p j , k2 − q j ) = (s1 , s2 ),

although µ ∉ Supp(vs1 ,s2 ) is possible. Applying the same considerations to the cases
(qd , pd ) = (q2ℓ− j+1 , p2ℓ− j+1 ) for j = ℓ + 1, . . . , 2ℓ, we see that for any term v_{k1 ,k2}^{(ν)} [ν ]
appearing in vk1 ,k2 there is at most one term on the right-hand side of (3.79) for which
the value of c is 1, at most one for which the value of c is 2, and so on through c = ℓ,
and similarly at most one term for which the value of d is ℓ, at most one for which
the value of d is ℓ − 1, and so on through d = 1. Thus the coefficient of [ν ] in (3.72)
is (recalling that vk1 ,k2 is a (k1 , k2 )-polynomial so that (k1 , k2 ) = (L1 (ν ), L2 (ν )), and
similarly for vs1 ,s2 ):

v_{k1 ,k2}^{(ν)} = (1/(L1 (ν ) − L2 (ν )))
   × [ ∑′_{ j=1}^{ℓ} (L1 (ν1 , . . . , ν j − 1, . . . , ν2ℓ ) + 1) v_{k1 −p j ,k2 −q j}^{(ν1 ,...,ν j −1,...,ν2ℓ )}          (3.80)
      − ∑′_{ j=ℓ+1}^{2ℓ} (L2 (ν1 , . . . , ν j − 1, . . . , ν2ℓ ) + 1) v_{k1 −q2ℓ− j+1 ,k2 −p2ℓ− j+1}^{(ν1 ,...,ν j −1,...,ν2ℓ )} ] ,

where the prime on the first summation symbol indicates that if (i) ν j − 1 < 0, or
if (ii) k1 − p j < −1, or if (iii) k2 − q j < −1, or if (iv) on the contrary ν j − 1 ≥ 0,
k1 − p j ≥ −1, and k2 − q j ≥ −1, but (ν1 , . . . , ν j − 1, . . . , ν2ℓ ) ∉ Supp(vk1 −p j ,k2 −q j ),
then the corresponding term does not appear in the sum, and the prime on the second
summation symbol has a similar meaning.
If in either sum j is such that νj − 1 < 0, then since the corresponding term does not appear, if we replace v_{k1−pj, k2−qj}^{(ν1,…,νj−1,…,ν2ℓ)} by Ṽ(ν1, …, νj − 1, …, ν2ℓ) the sum is unchanged, since the latter is zero in this situation.
In the first sum, suppose j is such that νj − 1 ≥ 0. If both k1 − pj ≥ −1 and k2 − qj ≥ −1, then there are two subcases. If the corresponding term appears in the sum, then because |ν1 + · · · + (νj − 1) + · · · + ν2ℓ| ≤ m, the induction hypothesis applies and v_{k1−pj, k2−qj}^{(ν1,…,νj−1,…,ν2ℓ)} = V(ν1, …, νj − 1, …, ν2ℓ). Since in this situation Ṽ(ν1, …, νj − 1, …, ν2ℓ) = V(ν1, …, νj − 1, …, ν2ℓ), in the corresponding term we
126 3 The Center Problem

may replace v_{k1−pj, k2−qj}^{(ν1,…,νj−1,…,ν2ℓ)} by Ṽ(ν1, …, νj − 1, …, ν2ℓ) and the sum is unchanged.
The second subcase is that in which the corresponding term does not appear, meaning that (ν1, …, νj − 1, …, ν2ℓ) ∉ Supp(v_{k1−pj, k2−qj}). But again the induction hypothesis applies, and now yields V(ν1, …, νj − 1, …, ν2ℓ) = 0, so again the sum is unchanged by the same replacement.
Finally, suppose that in the first sum j is such that ν j − 1 ≥ 0 but that either
k1 − p j < −1 or k2 − q j < −1, so the corresponding term is not present in the sum.
Then because L(ν1 , . . . , ν j − 1, . . . , ν2ℓ ) = (k1 − p j , k2 − q j ), Lemma 3.4.4 applies,
by which we can make the same replacement as above, and thus the first sum in
(3.80) is the same as the first sum in (3.76c). The second sum in (3.80) is treated
similarly. This proves point (1). The same argument as in the inductive step, with only slight modifications, gives point (2).
Turning to point (3), first note that it follows directly from the definitions of L and ν̂ that for all ν ∈ N0^{2ℓ},

  L1(ν̂) = L2(ν) and L2(ν̂) = L1(ν),   (3.81)

and that if ν̂ = ν then L1(ν) = L2(ν). This latter statement means that if ν̂ = ν and ν ≠ (0, …, 0) then by definition of V, V(ν) = 0. Of course if ν = (0, …, 0) then ν̂ = ν, and by definition V((0, …, 0)) = 1. In sum, for all ν ∈ N0^{2ℓ} such that ν̂ = ν ≠ (0, …, 0), V(ν) = 0, which is the assertion about V in (3.78b). This implies that Ṽ(ν) = 0 for all ν ∈ N0^{2ℓ} such that ν̂ = ν ≠ (0, …, 0). We prove that V(ν̂) = V(ν) for ν̂ ≠ ν by induction on |ν|.
Basis step. The equality |ν| = 0 holds only if ν = (0, …, 0), hence only if ν̂ = ν, so the statement is vacuously true.

Inductive step. Suppose that if |ν| ≤ m and ν̂ ≠ ν then V(ν̂) = V(ν), hence Ṽ(ν̂) = Ṽ(ν). Thus, more succinctly, V(ν̂) = V(ν) (and Ṽ(ν̂) = Ṽ(ν)) if |ν| ≤ m. Fix ν ∈ N0^{2ℓ} for which |ν| = m + 1 and ν̂ ≠ ν. In general (whether or not ν̂ = ν), L1(ν) = L2(ν) if and only if L1(ν̂) = L2(ν̂). Thus if L1(ν) = L2(ν), then it is also true that L1(ν̂) = L2(ν̂) and by definition both V(ν) = 0 and V(ν̂) = 0, so that V(ν̂) = V(ν).
If L1(ν) ≠ L2(ν), then

  V(ν̂) = 1/(L1(ν̂) − L2(ν̂))
      × [ Ṽ(ν2ℓ − 1, ν2ℓ−1, …, ν1)(L1(ν2ℓ − 1, ν2ℓ−1, …, ν1) + 1)
        + Ṽ(ν2ℓ, ν2ℓ−1 − 1, …, ν1)(L1(ν2ℓ, ν2ℓ−1 − 1, …, ν1) + 1)
        + · · ·
        + Ṽ(ν2ℓ, …, νℓ+1 − 1, νℓ, …, ν1)(L1(ν2ℓ, …, νℓ+1 − 1, νℓ, …, ν1) + 1)
        − Ṽ(ν2ℓ, …, νℓ+1, νℓ − 1, …, ν1)(L2(ν2ℓ, …, νℓ+1, νℓ − 1, …, ν1) + 1)
        − · · ·
        − Ṽ(ν2ℓ, …, ν2, ν1 − 1)(L2(ν2ℓ, …, ν2, ν1 − 1) + 1) ],

which, by the induction hypothesis and (3.81), is equal to

  1/(L2(ν) − L1(ν))
      × [ Ṽ(ν1, …, ν2ℓ−1, ν2ℓ − 1)(L2(ν1, …, ν2ℓ−1, ν2ℓ − 1) + 1)
        + Ṽ(ν1, …, ν2ℓ−1 − 1, ν2ℓ)(L2(ν1, …, ν2ℓ−1 − 1, ν2ℓ) + 1)
        + · · ·
        + Ṽ(ν1, …, νℓ, νℓ+1 − 1, …, ν2ℓ)(L2(ν1, …, νℓ, νℓ+1 − 1, …, ν2ℓ) + 1)
        − Ṽ(ν1, …, νℓ − 1, νℓ+1, …, ν2ℓ)(L1(ν1, …, νℓ − 1, νℓ+1, …, ν2ℓ) + 1)
        − · · ·
        − Ṽ(ν1 − 1, ν2, …, ν2ℓ)(L1(ν1 − 1, ν2, …, ν2ℓ) + 1) ] = V(ν).

Thus V(ν̂) = V(ν) if ν̂ ≠ ν for |ν| = m + 1, and hence for all ν ∈ N0^{2ℓ}.
The identity V(ν̂) = V(ν) for all ν ∈ N0^{2ℓ} and formula (3.77) immediately imply that g^{(ν̂)} = −g^{(ν)} for all ν ∈ N0^{2ℓ}. Thus (3.78a) is proved, and it implies the assertion about g_kk^{(ν)} in (3.78b). The truth of the assertion about V in (3.78b) has already been noted. □

Theorem 3.4.2(4) and equation (3.78a) show that the focus quantities have the following structure (Exercise 3.16; see also Exercise 4.3), identifiable in (3.133) and (3.134) (but not present in (3.135) because of a reduction modulo ⟨g11, g22⟩).

Corollary 3.4.6. The focus quantities have the form

  g_kk = ½ ∑_{ν : L(ν)=(k,k)} g_kk^{(ν)} ([ν] − [ν̂]).   (3.82)

Remark 3.4.7. Noting that by (3.77) the coefficient g_kk^{(ν)} of [ν] in g_kk is purely imaginary, it is an easy consequence of (3.82) that if system (3.3) (system (3.50)) is the complexification of a real system, that is, if b̄ = a, then for every fixed a ∈ C^ℓ, g_kk is a real number.

The results of this section yield the Focus Quantity Algorithm for computation
of the focus quantities for family (3.3) given in Table 3.1 on page 128, where in the
last two lines we have abused the notation slightly. The formula for the maximum
value of |ν | such that ν can contribute to gkk (the quantity M in the algorithm) is
established in Exercise 3.15; in that formula, ⌊r⌋ denotes the greatest integer less
than or equal to r. The code for an implementation of this algorithm in Mathematica
is given in the Appendix.

Focus Quantity Algorithm

Input:
  K ∈ N
  Ordered set S = {(p1, q1), …, (pℓ, qℓ)} ⊂ ({−1} ∪ N0)²
    satisfying pj + qj ≥ 1, 1 ≤ j ≤ ℓ

Output:
  Focus quantities g_kk, 1 ≤ k ≤ K, for family (3.3)

Procedure:
  w := min{p1 + q1, …, pℓ + qℓ}
  M := ⌊2K/w⌋
  g11 := 0; …; gKK := 0; V(0, …, 0) := 1
  FOR m = 1 TO M DO
    FOR ν ∈ N0^{2ℓ} such that |ν| = m DO
      Compute L(ν) using (3.71)
      Compute V(ν) using (3.76)
      IF L1(ν) = L2(ν) THEN
        Compute g_{L(ν)}^{(ν)} using (3.77)
        g_{L(ν)} := g_{L(ν)} + g_{L(ν)}^{(ν)} [ν]

Table 3.1 The Focus Quantity Algorithm
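The combinatorial bookkeeping in the algorithm — the map L of (3.71), the involution ν ↦ ν̂, and the loop over multi-indices with |ν| = m ≤ ⌊2K/w⌋ — can be sketched in Python. This is only the enumeration layer: the recursions (3.76) and (3.77) for V(ν) and g^{(ν)} are not reproduced here (the book's own implementation, in Mathematica, is in the Appendix), and all function names and interfaces below are our choices, not the book's:

```python
def compositions(total, parts):
    """Yield every tuple of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for head in range(total + 1):
        for rest in compositions(total - head, parts - 1):
            yield (head,) + rest

def L(nu, S):
    """The map (3.71): nu_1(p1,q1)+...+nu_l(pl,ql)+nu_{l+1}(ql,pl)+...+nu_{2l}(q1,p1)."""
    l = len(S)
    k1 = k2 = 0
    for j, (p, q) in enumerate(S):        # first l slots carry (pj, qj)
        k1 += nu[j] * p
        k2 += nu[j] * q
    for j, (p, q) in enumerate(S):        # last l slots, in reversed order, carry (qj, pj)
        k1 += nu[2 * l - 1 - j] * q
        k2 += nu[2 * l - 1 - j] * p
    return (k1, k2)

def hat(nu):
    """The involution nu -> nu^: reversal of the multi-index."""
    return tuple(reversed(nu))

def enumerate_resonant(S, K):
    """List all (nu, k) with L(nu) = (k, k), 1 <= k <= K, following Table 3.1's loops."""
    l = len(S)
    w = min(p + q for p, q in S)
    M = (2 * K) // w                      # M = floor(2K/w), as in the algorithm
    found = []
    for m in range(1, M + 1):
        for nu in compositions(m, 2 * l):
            k1, k2 = L(nu, S)
            if k1 == k2 and 1 <= k1 <= K:
                found.append((nu, k1))
    return found
```

For S = {(1, 0), (0, 1)} (the quadratic family (3.99)) and K = 1, this enumeration recovers exactly the four 4-tuples with L(ν) = (1, 1) listed later in Example 3.5.7.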

3.5 Hamiltonian and Reversible Systems

As we showed in the opening paragraphs of this chapter, it is important that we have


means of determining when a system satisfying certain conditions has a center at the
origin. In this and the next section we describe several such techniques. This section
is devoted to a discussion of two classes of systems that have the property that, in
the real case, any antisaddle at the origin must be a center. Because they possess this
property, we will find these two classes naturally appearing as components of the
center varieties that we study in Section 3.7.
The first class with which we deal is the set of Hamiltonian systems. A system ẋ = P̃(x, y), ẏ = Q̃(x, y) on C² is said to be a Hamiltonian system if there is a function H : C² → C, called the Hamiltonian of the system, such that P̃ = −H_y and Q̃ = H_x. It is immediately apparent that X H ≡ 0, so that H is a first integral of the system.
Thus an antisaddle of any real Hamiltonian system, and the singularity at the origin
of any complex Hamiltonian system of the form (3.3) (= (3.50)) or (3.61), is known
to be a center. Hamiltonian systems are easy to detect, as the following proposition
shows.

Proposition 3.5.1. The collection of Hamiltonian systems in a family of the form (3.3) is precisely the set of systems whose coefficients satisfy the following condition: for all (p, q) ∈ S for which p ≥ 0:

  if (q, p) ∈ S, then (p + 1)a_pq = (q + 1)b_pq;   (3.83)
  if (q, p) ∉ S, then a_pq = b_qp = 0.

Proof. If there exists a function H : C² → C such that P̃ = −H_y and Q̃ = H_x, then P̃_x = −Q̃_y, which from (3.3) is the condition

  ∑_{(p,q)∈S, p≥0} (p + 1) a_pq x^p y^q = ∑_{(p,q)∈S, p≥0} (p + 1) b_qp x^q y^p.   (3.84)

If (r, s) ∈ N0² is such that both (r, s) ∈ S and (s, r) ∈ S, then the monomial x^r y^s appears in the left-hand side of equation (3.84) with coefficient (r + 1)a_rs and in the right-hand side with coefficient (s + 1)b_rs, so that (r + 1)a_rs = (s + 1)b_rs holds. But if (r, s) ∈ N0² is such that (r, s) ∈ S but (s, r) ∉ S, then the monomial x^r y^s appears in the left-hand side of equation (3.84) with coefficient (r + 1)a_rs (and r ≥ 0) but not in the right-hand side, so that a_rs = 0, and the monomial x^s y^r appears in the right-hand side of equation (3.84) with coefficient (r + 1)b_sr (and r ≥ 0) but not in the left-hand side, so that b_sr = 0. Thus condition (3.83) holds.
Conversely, if condition (3.83) holds, then because the system is polynomial, −P̃(x, y) can be integrated with respect to y and Q̃(x, y) can be integrated with respect to x consistently to produce a polynomial H that is a Hamiltonian for system (3.3). □
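Since condition (3.83) is a finite check on the coefficient list, it is easy to test mechanically. A minimal sketch, under our own indexing convention (not the book's): for each (p, q) ∈ S, a[(p, q)] stores a_pq and b[(p, q)] stores b_qp:

```python
def is_hamiltonian(S, a, b):
    """Check condition (3.83) for a member of family (3.3).
    a[(p,q)] holds a_pq and b[(p,q)] holds b_qp, for each (p,q) in S."""
    Sset = set(S)
    for (p, q) in S:
        if p < 0:                     # (3.83) only constrains pairs with p >= 0
            continue
        if (q, p) in Sset:
            # (p+1) a_pq = (q+1) b_pq; b_pq is stored under the S-pair (q, p)
            if (p + 1) * a[(p, q)] != (q + 1) * b[(q, p)]:
                return False
        else:
            # a_pq = b_qp = 0 when (q, p) is not in S
            if a[(p, q)] != 0 or b[(p, q)] != 0:
                return False
    return True
```

For the quadratic family (3.99), where S = {(1, 0), (0, 1)}, the check reduces to the two equations 2a10 = b10 and a01 = 2b01.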
By Proposition 3.5.1, the set of Hamiltonian systems in family (3.3) corresponds precisely to the variety of the ideal

  I_Ham := ⟨(p + 1)a_pq − (q + 1)b_pq : p ≥ 0, (p, q) ∈ S, and (q, p) ∈ S⟩
         + ⟨a_pq, b_qp : p ≥ 0, (p, q) ∈ S, and (q, p) ∉ S⟩.   (3.85)

If there are r1 generators in the first ideal on the right in (3.85) and 2r2 in the second, then a polynomial parametrization of V(I_Ham) is given by a mapping F : C^{2ℓ−(r1+2r2)} → C^{2ℓ}, where r1 component functions have the form b_pq = t_j and the corresponding r1 component functions have the form a_pq = ((q+1)/(p+1)) t_j; where r2 component functions have the form a_pq = 0 and r2 component functions have the form b_qp = 0; and where the remaining 2ℓ − 2(r1 + r2) component functions, corresponding to the a_pq and b_qp not appearing in any generator of I_Ham, have the form a_pq = t_j or b_qp = t_j. By Corollary 1.4.18, V(I_Ham) is irreducible.
The second class of systems that we discuss in this section are those having a
symmetry that in the real case forces any singularity of focus or center type to be
in fact a center. When direction of flow is taken into account, there are two types of
symmetry of a real system with respect to a line L: mirror symmetry, meaning that
when the phase portrait is reflected in the line L it is unchanged; and time-reversible

symmetry (standard terminology, although reversed-time symmetry might be more


accurate), meaning that when the phase portrait is reflected in the line L and then
the sense of every orbit is reversed (corresponding to a reversal of time), the original
phase portrait is obtained. When L is the u-axis, the former situation is exemplified
by the canonical linear saddle u̇ = −u, v̇ = v, and the latter situation is exemplified
by the canonical linear center u̇ = −v, v̇ = u. The canonical saddle has four lines of
symmetry in all, two exhibiting mirror symmetry and two exhibiting time-reversible
symmetry; every line through the origin is a line of time-reversible symmetry for the
center. In general, a real system u̇ = Ũ(u, v), v̇ = Ṽ(u, v) possesses a time-reversible symmetry with respect to the u-axis if and only if

  Ũ(u, −v) = −Ũ(u, v) and Ṽ(u, −v) = Ṽ(u, v);   (3.86a)

it possesses a mirror symmetry with respect to the u-axis if and only if

  Ũ(u, −v) = Ũ(u, v) and Ṽ(u, −v) = −Ṽ(u, v)   (3.86b)

(Exercise 3.20). Time-reversible symmetry is of interest in connection with the


center-focus problem because for any antisaddle on L, presence of the symmetry
precludes the possibility that the singularity be a focus, hence forces it to be of
center type. For example, consider the real system

u̇ = −v − vM(u, v2), v̇ = u + N(u, v2), (3.87)

where M is an analytic function without constant term and N is an analytic function


whose series expansion at (0, 0) starts with terms of order at least two. We know
by Remark 3.1.7 that the origin is either a focus or a center. The fact that system
(3.87) satisfies condition (3.86a) implies that all orbits near the origin close, so that
the origin is in fact a center. Mirror symmetry will not be of interest to us in the
present chapter since it is incompatible with the existence of either a focus or a
center on the line L. We will take it up again in the last section of Chapter 5.

Definition 3.5.2. A real system u̇ = Ũ(u, v), v̇ = Ṽ(u, v) is time-reversible if its phase portrait is invariant under reflection with respect to a line and a change in the direction of time (reversal of the sense of every trajectory).

We say that a line L is an axis of symmetry of system (3.88) if as point sets


(ignoring the sense of the parametrization by time t) the orbits of the system are
symmetric with respect to L. Henceforth in this section, except for Definition 3.5.4,
which applies to general analytic systems (as did Definition 3.5.2), we restrict to
polynomial systems.
The point of view that we take throughout this book is that it is generally easier
and much more fruitful to study the center problem for a complex system than it
is to do so for a real one. Thus our first objective in this part of this section is to
generalize the concept of reversibility to complex systems. As usual, we complexify
the (u, v)-plane by making use of the substitution x = u + iv. Then the real system

  u̇ = Ũ(u, v),   v̇ = Ṽ(u, v)   (3.88)

is transformed into

  ẋ = P̃(x, x̄),   (3.89)

where P̃(x, x̄) = Ũ((x + x̄)/2, (x − x̄)/(2i)) + iṼ((x + x̄)/2, (x − x̄)/(2i)). Both mirror symmetry and time-reversible symmetry with respect to the u-axis are captured by a simple condition on the vector a of coefficients of the polynomial P̃: system (3.88) has one of these two types of symmetry with respect to the u-axis if and only if
  Ũ(u, −v) + iṼ(u, −v) ≡ ±(Ũ(u, v) − iṼ(u, v)),

which in terms of P̃ is precisely the condition P̃(x̄, x) = ±\overline{P̃(x, x̄)}, as can easily be seen by writing it out in detail, and this latter condition is equivalent to ā = ±a. The upper sign everywhere corresponds to mirror symmetry; the lower sign everywhere corresponds to time-reversible symmetry. Thus we have established the following fact.
Lemma 3.5.3. Let a denote the vector of coefficients of the polynomial P̃(x, x̄) in (3.89).
1. System (3.88) exhibits mirror symmetry (respectively, time-reversible symmetry) with respect to the u-axis if and only if a = ā (respectively, a = −ā), that is, if and only if all coefficients are real (respectively, all are purely imaginary).
2. If a = ±ā, then the u-axis is an axis of symmetry for system (3.88).
The condition in part (2) is only sufficient because the u-axis can be an axis of
symmetry in the absence of both mirror and time-reversible symmetry; see Section
5.3.
By the lemma, the u-axis is an axis of symmetry for (3.89) if

  P̃(x̄, x) = −\overline{P̃(x, x̄)},   (3.90a)

corresponding to (3.86a), time-reversible symmetry, or if

  P̃(x̄, x) = \overline{P̃(x, x̄)},   (3.90b)

corresponding to (3.86b), mirror symmetry. If (3.90a) is satisfied, then under the involution

  x → x̄,   x̄ → x,   (3.91)

(3.89) is transformed into its negative,

  ẋ = −P̃(x, x̄),   x̄˙ = −\overline{P̃(x, x̄)}.   (3.92)

So if (3.89) is obtained from (3.88) and the transformation (3.91) when applied
to (3.89) and its complex conjugate yields the system (3.92), then the real system
(3.88) is time-reversible, hence has a center at the origin. If the line of reflection is
not the u-axis but a distinct line L passing through the origin, then we can apply the

rotation x1 = e−iϕ x through an appropriate angle ϕ to make L the u-axis. In the new
coordinates we have

  ẋ1 = e^{−iϕ} P̃(e^{iϕ} x1, e^{−iϕ} x̄1) =: P̃1(x1, x̄1).

According to Lemma 3.5.3, this system is time-reversible with respect to the line Im x1 = 0 if the analogue P̃1(x̄1, x1) = −\overline{P̃1(x1, x̄1)} of (3.90a) holds, which by a straightforward computation is

  e^{iϕ} \overline{P̃(e^{iϕ} x1, e^{−iϕ} x̄1)} = −e^{−iϕ} P̃(e^{iϕ} x̄1, e^{−iϕ} x1).

Hence (3.89) is time-reversible precisely when there exists a ϕ ∈ R such that

  e^{2iϕ} \overline{P̃(x, x̄)} = −P̃(e^{2iϕ} x̄, e^{−2iϕ} x).   (3.93)

Recall from the beginning of Section 3.2 that the complexification of system (3.88)
is the system that is obtained by adjoining to (3.89) its complex conjugate. An exam-
ination of (3.93) and its conjugate suggests the following natural generalization of
time-reversibility to systems on C2 . For ease of comparison with the mathematical
literature we have stated it in two equivalent forms.
Definition 3.5.4. A system ẋ = P̃(x, y), ẏ = Q̃(x, y) on C² is time-reversible if there exists γ ∈ C \ {0} such that

  P̃(x, y) = −γ Q̃(γy, γ⁻¹x).   (3.94)

Equivalently, letting z = (x, y) ∈ C², the system

  dz/dt = F(z)

is time-reversible if there exists a linear transformation T(x, y) = (γy, γ⁻¹x), for some γ ∈ C \ {0}, such that

  d(Tz)/dt = −F(Tz).
The equivalence of the two forms of the definition follows from the fact, readily
verified by direct computation, that a complex system is time-reversible according to
the first statement in Definition 3.5.4 if and only if under the change of coordinates

x1 = γ y, y1 = γ −1 x (3.95)

and reversal of time the form of the system of differential equations is unaltered.
We note in particular that a system of the form (3.3) is time-reversible if and only
if the parameter (a, b) satisfies

  b_qp = γ^{p−q} a_pq for all (p, q) ∈ S   (3.96)



for some fixed γ ∈ C \ {0}. Observe also that if we complexify the real system (3.88), written in the complex form (3.89), by adjoining to the latter the equation x̄˙ = Q̃(x, x̄) = \overline{P̃(x, x̄)}, then setting γ = e^{2iϕ} in (3.94) and replacing y by x̄ we recover (3.93), as anticipated.
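Condition (3.96) is mechanical to verify once a candidate γ is known, and when some exponent satisfies p − q = ±1 a candidate can be read off from a coefficient ratio. A hedged sketch — the dictionary convention (a[(p, q)] = a_pq, b[(p, q)] = b_qp), the restriction to integer or rational data, and the function names are all ours, not the book's:

```python
from fractions import Fraction

def satisfies_396(S, a, b, gamma):
    """Does (3.96) hold for this particular gamma?  b_qp = gamma**(p-q) * a_pq."""
    return all(b[(p, q)] == gamma ** (p - q) * a[(p, q)] for (p, q) in S)

def find_gamma(S, a, b):
    """Try to solve (3.96) for gamma, reading candidates off pairs with p - q = +-1.
    A sketch only: general exponents p - q would require root extractions over C."""
    for (p, q) in S:
        apq, bqp = a[(p, q)], b[(p, q)]
        if p - q == 1 and apq != 0:
            cand = Fraction(bqp, apq)        # b_qp = gamma * a_pq
        elif p - q == -1 and bqp != 0:
            cand = Fraction(apq, bqp)        # b_qp = a_pq / gamma
        else:
            continue
        if cand != 0 and satisfies_396(S, a, b, cand):
            return cand
    return None
```

Applied to family (3.99): the system with (a10, a01, b10, b01) = (1, 2, 1, 2) is time-reversible with γ = 2, and so by Theorem 3.5.5 below it has a center at the origin; the system ẋ = i(x − x²), ẏ = −i(y − xy) of Example 3.5.7 admits no such γ.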
As we have seen above, a real polynomial system that has a singularity that must
be either a center or focus and that is time-reversible with respect to a line pass-
ing through this point must have a center at the point. We will now show that the
analogous fact for complex systems of the form (3.3) is also true.

Theorem 3.5.5. Every time-reversible system of the form (3.3) (not necessarily sat-
isfying bqp = ā pq ) has a center at the origin.

Proof. Suppose the system of the form (3.3) under consideration corresponds to
parameter value (a, b), so that condition (3.96) holds for some γ ∈ C \ {0}. We
will compute the value of the focus quantities gkk at (a, b) using (3.82). Hence fix
k ∈ N. Since gkk is a (k, k)-polynomial (Theorem 3.4.2), each ν ∈ Supp(gkk ) satisfies
L(ν ) = (k, k), which by (3.71) is

ν1 p1 + · · · + νℓ pℓ + νℓ+1qℓ + · · · + ν2ℓq1 = k
(3.97)
ν1 q1 + · · · + νℓ qℓ + νℓ+1 pℓ + · · · + ν2ℓ p1 = k .

For any such ν, evaluating [ν] and [ν̂] at (a, b) and applying condition (3.96) yields the relation [ν̂] = γ^w [ν], where

  w = ν2ℓ(q1 − p1) + · · · + νℓ+1(qℓ − pℓ) + νℓ(pℓ − qℓ) + · · · + ν1(p1 − q1).

But (3.97) implies that w is zero, so [ν̂] and [ν] agree at (a, b), and hence by (3.82) g_kk(a, b) = 0. Since this holds for all k, system (a, b) has a center at the origin. □

The set of time-reversible systems is not generally itself a variety, for although
condition (3.96) is a polynomial condition, the value of γ will vary from system
to system, so that the full set of time-reversible systems is not picked out by one
uniform set of polynomial conditions. It can be made into a variety by forming
its Zariski closure, but the proof of Theorem 3.5.5 suggests the following idea for
defining a readily identifiable variety that contains it, which turns out to be a useful
approach, and will be shown in Section 5.2 to be equivalent. For a fixed index set S
determining a family (3.3) and the corresponding mapping L given by (3.71), define

  M = {ν ∈ N0^{2ℓ} : L(ν) = (j, j) for some j ∈ N0}   (3.98)

and let Isym be the ideal defined by

  Isym := ⟨[ν] − [ν̂] : ν ∈ M⟩ ⊂ C[a, b],

which is termed the symmetry ideal or the Sibirsky ideal for family (3.3). It is almost
immediate that any time-reversible system in family (3.3) corresponds to an element

of V(Isym): for ν ∈ M means that (3.97) holds, and if system (a, b) is reversible then [ν̂] = γ^w [ν] for w as in the proof of Theorem 3.5.5, which by (3.97) is zero, hence ([ν] − [ν̂])|(a,b) = 0.
(The reader may have wondered why, in the definition of M , the requirement on
j is that it be in N0 , when j ∈ N would have served. The reason is that the same
set M will be encountered again in Chapter 5, where we will want it to have the
structure of a monoid under addition, hence we need it to have an identity element.)
By definition of the Bautin ideal B (Definition 3.3.3) and equation (3.82) clearly
B ⊂ Isym , hence V(Isym ) ⊂ V(B), the center variety of system (3.3).

Definition 3.5.6. The variety V(Isym ) is termed the symmetry or Sibirsky subvariety
of the center variety.

Although the parameters a and b of any time-reversible system (a, b) of the form
(3.3) satisfy (a, b) ∈ V(Isym ), the converse is false, as the following example shows.

Example 3.5.7. We will find the symmetry ideal Isym for the family

  ẋ = i(x − a10 x² − a01 xy)
  ẏ = −i(y − b10 xy − b01 y²),   (3.99)

family (3.3) when S = {(p1, q1), (p2, q2)} = {(1, 0), (0, 1)}, so ℓ = 2. We must first identify the set M = {ν ∈ N0⁴ : L(ν) = (k, k) for some k ∈ N0}. Writing ν = (ν1, ν2, ν3, ν4), we have [ν] = a10^{ν1} a01^{ν2} b10^{ν3} b01^{ν4} ∈ C[a10, a01, b10, b01], [ν̂] = a10^{ν4} a01^{ν3} b10^{ν2} b01^{ν1}, and L(ν) = ν1(1, 0) + ν2(0, 1) + ν3(1, 0) + ν4(0, 1) = (ν1 + ν3, ν2 + ν4).
For k = 1, (ν1 , ν3 ), (ν2 , ν4 ) ∈ {(1, 0), (0, 1)}, so there are four 4-tuples ν satisfy-
ing L(ν ) = (1, 1), and we compute:

  ν = (1, 1, 0, 0) : [ν] − [ν̂] = a10 a01 − b10 b01,
  ν = (1, 0, 0, 1) : [ν] − [ν̂] ≡ 0,
  ν = (0, 1, 1, 0) : [ν] − [ν̂] ≡ 0,
  ν = (0, 0, 1, 1) : [ν] − [ν̂] = b10 b01 − a10 a01.

For k = 2, (ν1 , ν3 ), (ν2 , ν4 ) ∈ {(2, 0), (1, 1), (0, 2)}, so there are nine 4-tuples ν
satisfying L(ν ) = (2, 2), and we compute:

  ν = (2, 2, 0, 0) : [ν] − [ν̂] = a10² a01² − b10² b01² = (a10 a01 − b10 b01)(a10 a01 + b10 b01),
  ν = (2, 1, 0, 1) : [ν] − [ν̂] = a10² a01 b01 − a10 b10 b01² = a10 b01 (a10 a01 − b10 b01),
  ν = (2, 0, 0, 2) : [ν] − [ν̂] ≡ 0,
  ν = (1, 2, 1, 0) : [ν] − [ν̂] = a10 a01² b10 − a01 b10² b01 = a01 b10 (a10 a01 − b10 b01),

and so on for the remaining five 4-tuples.


The pattern seen so far holds for all k (Exercise 3.21):

  Isym = ⟨[ν] − [ν̂] : ν ∈ M⟩ = ⟨a10 a01 − b10 b01⟩.

Thus V(Isym ) = {(a10 , a01 , b10 , b01 ) : a10 a01 − b10 b01 = 0} .
In particular, V(Isym) includes (a10, a01, b10, b01) = (1, 0, 1, 0), corresponding to the system ẋ = i(x − x²), ẏ = −i(y − xy). But this system is not time-reversible, since for (p, q) = (0, 1), condition (3.96) reads b10 = γ⁻¹ a01, which here evaluates to 1 = 0.
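The computation in Example 3.5.7 is easy to mechanize, since each generator [ν] − [ν̂] is a difference of two monomials, each recorded by its exponent vector in (a10, a01, b10, b01). A small sketch (the function name and representation are ours):

```python
def compositions(total, parts):
    """Yield every tuple of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for head in range(total + 1):
        for rest in compositions(total - head, parts - 1):
            yield (head,) + rest

def generators_399(kmax):
    """For family (3.99) (S = {(1,0),(0,1)}), list the nonzero binomials
    [nu] - [nu^] with L(nu) = (k, k), k <= kmax, as pairs of exponent
    vectors in (a10, a01, b10, b01)."""
    gens = []
    for k in range(1, kmax + 1):
        for nu in compositions(2 * k, 4):        # L(nu) = (k,k) forces |nu| = 2k
            n1, n2, n3, n4 = nu
            if (n1 + n3, n2 + n4) != (k, k):     # L(nu) = (nu1+nu3, nu2+nu4)
                continue
            mono, hat_mono = (n1, n2, n3, n4), (n4, n3, n2, n1)
            if mono != hat_mono:                 # keep only [nu] - [nu^] not identically 0
                gens.append((mono, hat_mono))
    return gens
```

generators_399(1) returns the two nonzero k = 1 binomials, ±(a10 a01 − b10 b01); for k = 2, six of the nine 4-tuples give nonzero binomials, matching the lists above.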
Thus to repeat, while every time-reversible system belongs to the symmetry sub-
variety of the center variety, the converse fails, as Example 3.5.7 shows. We will
nevertheless use the terminology “symmetry” subvariety because the restriction of
V(Isym ) to the real center variety VCR gives precisely the time-reversible real systems
with a center and because the name is suggestive. The correct relationship between
the set of time-reversible systems in family (3.3) and the symmetry subvariety is
described by the following theorem.
Theorem 3.5.8. Let R ⊂ E(a, b) be the set of all time-reversible systems in family
(3.3). Then
1. R ⊂ V(Isym ) and
2. V(Isym) \ R = {(a, b) ∈ V(Isym) : there exists (p, q) ∈ S with a_pq b_qp = 0 but a_pq + b_qp ≠ 0}.
We have already seen the proof of part (1). We will present the proof of part (2)
in Chapter 5, where we also give a simple and effective algorithm for computing
a finite set of generators for the ideal Isym (Table 5.1) and establish the following
important property that it possesses.
Theorem 3.5.9. The ideal Isym is prime in C[a, b].
This theorem immediately implies that the variety V(Isym ) is irreducible. In fact,
for all polynomial systems investigated up to this point, V(Isym ) is a component, that
is, is a proper irreducible subvariety, of the center variety. We conjecture that this is
always the case, that is, that for any polynomial system of the form (3.3), V(Isym ) is
a component of the center variety.
Using the algorithm mentioned just after Theorem 3.5.8 one can compute the
ideal Isym for the general cubic system of the form (3.69):

  ẋ = x − a10 x² − a01 xy − a−12 y² − a20 x³ − a11 x²y − a02 xy² − a−13 y³
  ẏ = −y + b2,−1 x² + b10 xy + b01 y² + b3,−1 x³ + b20 x²y + b11 xy² + b02 y³   (3.100)

(where x, y, ai j , and bi j are in C), which by Proposition 3.3.9 has the same center
variety as the general cubic system of the form (3.3).
Theorem 3.5.10. The ideal Isym of system (3.100) is generated by the polynomials
listed in Table 3.2 on page 136. Thus the symmetry subvariety V(Isym ) of the center
variety of system (3.100) is the set of common zeros of these polynomials.
The proof of Theorem 3.5.10 will also be given in Chapter 5 (page 236), where
we will also show that the pattern exhibited in Table 3.2, that every generator has the form [µ] − [µ̂] for some µ ∈ N0^{2ℓ}, holds in general (Theorem 5.2.5).

a11 − b11 a01 b02 b2,−1 − a−12 b10 a20


a01 a02 b2,−1 − a−12 b10 b20 a410 a−13 − b3,−1 b401
a10 a−12 b20 − b01 b2,−1 a02 a10 a−12 b210 − a201 b2,−1 b01
a20 a02 − b20 b02 a210 a−12 b10 − a01 b2,−1 b201
a10 b02 b10 − a01 a20 b01 a301 b2,−1 − a−12 b310
a10 a02 b10 − a01 b20 b01 a310 a−12 − b2,−1 b301
a10 a−13 b2,−1 − a−12 b3,−1 b01 a20 a−13 b20 − a02 b3,−1 b02
a210 b02 − a20 b201 a202 b3,−1 − a−13 b220
a01 a−12 b3,−1 − a−13 b2,−1 b10 a201 b20 − a02 b210
a220 a−13 − b3,−1 b202 a10 a−13 b20 b10 − a01 a02 b3,−1 b01
a10 a20 a−13 b10 − a01 b3,−1 b02 b01 a10 b202 b2,−1 − a−12 a220 b01
a210 a02 − b20 b201 a10 a02 b02 b2,−1 − a−12 a20 b20 b01
a10 a202 b2,−1 − a−12 b220 b01 a201 b3,−1 b02 − a20 a−13 b210
a201 a20 − b02 b210 a01 a−12 b220 − a202 b2,−1 b10
a210 a−13 b20 − a02 b3,−1 b201 a−12 a20 a10 − b02 b2,−1 b01
a01 a−12 a20 b20 − a02 b02 b2,−1 b10 a201 a02 b3,−1 − a−13 b20 b210
a210 a20 a−13 − b3,−1 b02 b201 a01 a−12 a220 − b202 b2,−1 b10
a10 a−13 b310 − a301 b3,−1 b01 a210 a−13 b210 − a201 b3,−1 b201
a310 a−13 b10 − a01 b3,−1 b301 a401 b3,−1 − a−13 b410
a01 a−13 b20 b2,−1 − a−12 a02 b3,−1 b10 a01 a20 a−13 b2,−1 − a−12 b3,−1 b02 b10
a10 a−12 b3,−1 b02 − a20 a−13 b2,−1 b01 a10 a−12 a02 b3,−1 − a−13 b20 b2,−1 b01
a2−12 b3,−1 b20 − a02 a−13 b22,−1 a2−12 a20 b3,−1 − a−13 b02 b22,−1
a10 a2−12 b3,−1 b10 − a01 a−13 b22,−1 b01 a201 a−13 b22,−1 − a2−12 b3,−1 b210
a2−12 b320 − a302 b22,−1 a2−12 a20 b220 − a202 b02 b22,−1
a2−12 a220 b20 − a02 b202 b22,−1 a210 a2−12 b3,−1 − a−13 b22,−1 b201
a2−12 a320 − b302 b22,−1 a2−12 b23,−1 b02 − a20 a2−13 b22,−1
a4−12 b33,−1 − a3−13 b42,−1 a2−12 a02 b23,−1 − a2−13 b20 b22,−1
a10 a01 − b10 b01 a01 a2−13 b32,−1 − a3−12 b23,−1 b10
a10 a3−12 b23,−1 − a2−13 b32,−1 b01

Table 3.2 Generators of Isym for System (3.100)

3.6 Darboux Integrals and Integrating Factors

We now present the method of Darboux integration for proving the existence of first
integrals and integrating factors for polynomial systems of differential equations on
C2 . We thus consider systems
  ẋ = P̃(x, y),   ẏ = Q̃(x, y),   (3.101)

where x, y ∈ C, P̃ and Q̃ are polynomials without constant terms that have no nonconstant common factor, and m = max(deg(P̃), deg(Q̃)). Let X denote the corresponding vector field as defined in Remark 3.2.4. Suppose system (3.101) has a first integral H on a neighborhood of the origin. If, as is not infrequently the case, H has the form ∏ f_j^{α_j}, where for each j, α_j ∈ C and f_j ∈ C[x, y] (and we may assume that f_j is irreducible and that f_j and f_k are relatively prime if j ≠ k), then f_j divides

X f_j for each j (Exercise 3.22). But the fact that X f_j = K_j f_j for some polynomial K_j ∈ C[x, y] implies that the variety V(f_j) of f_j is an invariant curve for (3.101), since it is then the case that

  (∂f_j/∂x · P̃ + ∂f_j/∂y · Q̃)|_{V(f_j)} = ⟨grad f_j, (P̃, Q̃)⟩|_{V(f_j)} = X f_j|_{V(f_j)} ≡ 0.   (3.102)

Starting with system (3.101), this suggests that in the search for a first integral one
look for a first integral that is a product of powers of polynomials whose zero sets
are invariant curves in the phase portrait of (3.101). This discussion is the motivation
for the definitions and results in this section.
Definition 3.6.1. A nonconstant polynomial f(x, y) ∈ C[x, y] is called an algebraic partial integral of system (3.101) if there exists a polynomial K(x, y) ∈ C[x, y] such that

  X f = (∂f/∂x) P̃ + (∂f/∂y) Q̃ = K f.   (3.103)

The polynomial K is termed a cofactor of f; it has degree at most m − 1. (See Exercise 3.23.)
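Condition (3.103) is straightforward to verify for concrete polynomials. A minimal sketch using dictionaries mapping exponent pairs (i, j) (for x^i y^j) to coefficients; the representation and function names are ours, and the test system ẋ = x, ẏ = −y below is chosen only because its invariant lines are obvious, not because it is of the center type studied here:

```python
def p_add(f, g):
    """Sum of two sparse polynomials, dropping zero coefficients."""
    h = dict(f)
    for k, v in g.items():
        h[k] = h.get(k, 0) + v
    return {k: v for k, v in h.items() if v != 0}

def p_mul(f, g):
    """Product of two sparse polynomials."""
    h = {}
    for (i, j), u in f.items():
        for (k, l), v in g.items():
            key = (i + k, j + l)
            h[key] = h.get(key, 0) + u * v
    return {k: v for k, v in h.items() if v != 0}

def p_dx(f):
    """Partial derivative with respect to x."""
    return {(i - 1, j): i * c for (i, j), c in f.items() if i > 0}

def p_dy(f):
    """Partial derivative with respect to y."""
    return {(i, j - 1): j * c for (i, j), c in f.items() if j > 0}

def is_partial_integral(f, K, P, Q):
    """Verify (3.103): X f = f_x * P + f_y * Q equals K * f exactly."""
    Xf = p_add(p_mul(p_dx(f), P), p_mul(p_dy(f), Q))
    return Xf == p_mul(K, f)
```

For ẋ = x, ẏ = −y the lines x = 0 and y = 0 are algebraic partial integrals with cofactors 1 and −1, and their product xy has cofactor 1 + (−1) = 0, i.e., it is a first integral — an instance of fact (2) above.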
Since the vector field X associated to (3.101) is a derivation (see (4.58)), the
following facts are apparent:
1. if f is an algebraic partial integral for (3.101) with cofactor K, then any constant
multiple of f is also an algebraic partial integral for (3.101) with cofactor K;
2. if f1 and f2 are algebraic partial integrals for (3.101) with cofactors K1 and K2 ,
then f1 f2 is an algebraic partial integral for (3.101) with cofactor K1 + K2 .
We know that if f is an algebraic partial integral, then V( f ) is an algebraic invariant
curve. The converse also holds:
Proposition 3.6.2. Fix f ∈ C[x, y]. V( f ) is an algebraic invariant curve of system
(3.101) if and only if f is an algebraic partial integral of system (3.101).
Proof. Just the “only if” part of the proposition requires proof. Write f = f1α1 · · · fsαs ,
where, for each j, f j is an irreducible polynomial. The equality X f |V( f ) = 0 implies
that
X f j |V( f j ) = 0 (3.104)
for all j. Thus V( f j ) ⊂ V(X f j ); applying Proposition 1.1.12 gives the inclusion
X f j ∈ I(V(X f j )) ⊂ I(V( f j )). Since f j is irreducible, h f j i is prime, hence radical
(Proposition 1.4.3), so by Theorem 1.3.15, I(V( f j )) = h f j i and we conclude that
X f j ∈ h f j i for all j. Therefore X f j = K j f j , for some K j ∈ C[x, y], so that every
polynomial f j is an algebraic partial integral of (3.101). As noted just above the
statement of the proposition, if g and h are two algebraic partial integrals with co-
factors Kg and Kh , then gh is also an algebraic partial integral, with cofactor Kg + Kh .
Therefore f = f1α1 · · · fsαs is an algebraic partial integral of (3.101). 
Remark. In the literature a function that meets the condition of Definition 3.6.1 is
frequently termed an algebraic invariant curve, in keeping with the characterization
given in the proposition.

Definition 3.6.3. Suppose that the curves defined by f_1 = 0, …, f_s = 0 are algebraic invariant curves of system (3.101), and that α_j ∈ C for 1 ≤ j ≤ s. A first integral of system (3.101) of the form

  H = f_1^{α_1} · · · f_s^{α_s}   (3.105)

is called a Darboux first integral of system (3.101).

The existence of a Darboux first integral can be highly restrictive. For example,
if α j is real and rational for all j, then every trajectory of (3.101) lies in an algebraic
curve. (See Exercise 3.25.) If sufficiently many algebraic invariant curves can be
found, then they can be used to construct a Darboux first integral, as the following
theorem shows.

Theorem 3.6.4 (Darboux). Suppose system (3.101) has q (distinct) algebraic in-
variant curves f j (x, y) = 0, 1 ≤ j ≤ q, where for each j, f j is irreducible over C2 ,
and that q > (m2 + m)/2. Then system (3.101) admits a Darboux first integral.

Proof. Let Cm−1[x, y] denote the complex vector space of polynomials of degree at most m − 1. The homogeneous polynomials of degree p span a subspace of dimension p + 1, so Cm−1[x, y] has dimension m + (m − 1) + · · · + 1 = m(m + 1)/2. By Proposition 3.6.2, there exist
polynomials K1 , . . . , Kq such that K j ∈ Cm−1 [x, y] and X f j = K j f j for 1 ≤ j ≤ q.
Thus the number of vectors K j is greater than the dimension of Cm−1 [x, y], hence
the collection {K1 , . . . , Kq } is linearly dependent, so that there exist constants α j ,
not all zero, such that
q q
X fj
∑ α j K j = ∑ αj fj = 0,
j=1 j=1
the zero polynomial. Defining H for these polynomials f_j and constants α_j by
(3.105) with s = q, this yields

X H = H ∑_{j=1}^q α_j (X f_j)/f_j = H ∑_{j=1}^q α_j K_j = 0,

the zero function, meaning that H is a first integral of (3.101) if it is not constant.
Since the algebraic curves f_j = 0 are irreducible and not all the constants α_j are
zero, the function H is indeed not constant (Exercise 3.26). □
Corollary 3.6.5. If system (3.101) has at least q = (m² + m)/2 algebraic invariant
curves f_j(x, y) = 0, each of which is irreducible over C and does not pass through
the origin (that is, f_j(0, 0) ≠ 0), then it admits a Darboux first integral (3.105).

Proof. In this case all the cofactors are of the form K_j = ax + by + · · ·. Thus they
are contained in a vector subspace of C_{m−1}[x, y] of dimension (m² + m − 2)/2. □
The situation in which a system of the form (3.101) has more independent algebraic
partial integrals than dim C_{m−1}[x, y] occurs very seldom (but examples are
known; see Section 3.7). Sometimes, though, it is possible to find a Darboux first
integral using a smaller number of invariant algebraic curves. Indeed, the proof of
Theorem 3.6.4 shows that if f1, . . . , fs are any number of distinct irreducible algebraic
partial integrals of system (3.101), then as long as a nontrivial linear combination
of the cofactors K_j is zero, ∑_{j=1}^s α_j K_j = 0, H = f1^α1 · · · fs^αs will be a first integral of
system (3.101). See also Theorem 3.6.8 below.
If a first integral of system (3.101) cannot be found, then we turn our attention to
the possible existence of an integrating factor. Classically, an integrating factor of
the equation
M(x, y)dx + N(x, y)dy = 0 (3.106)
for differentiable functions M and N on an open set Ω is a differentiable function
µ(x, y) on Ω such that µ(x, y)M(x, y) dx + µ(x, y)N(x, y) dy = 0 is an exact
differential, which is the case if and only if

∂(µM)/∂y − ∂(µN)/∂x ≡ 0.

For the following definition, the functions P̃ and Q̃ need only be differentiable.
Denote by div X the divergence of the vector field X, div X = P̃_x + Q̃_y.
Definition 3.6.6. An integrating factor on an open set Ω for system (3.101) is a
differentiable function µ(x, y) on Ω such that

X µ = −µ div X (3.107)

holds throughout Ω. An integrating factor on Ω of the form

µ = f1^β1 · · · fs^βs, (3.108)

where f_j is an algebraic partial integral for (3.101) on Ω for 1 ≤ j ≤ s, is called a
Darboux integrating factor on Ω.
This definition is consistent with the classical definition of an integrating factor
of equation (3.106). Indeed, the trajectories of system (3.101) satisfy the equation

Q̃(x, y) dx − P̃(x, y) dy = 0,

so that µ is an integrating factor in the classical sense if and only if

∂(µQ̃)/∂y − ∂(−µP̃)/∂x ≡ 0,

hence if and only if µ_x P̃ + µ_y Q̃ + µ(P̃_x + Q̃_y) ≡ 0, which is precisely (3.107). Of
course, the importance of µ is that when (3.101) is rescaled by µ the resulting
system on Ω, which has the same orbits in Ω as (3.101) except where µ = 0, has
the form ẋ = −H_y, ẏ = H_x, from which the first integral H on Ω can be found by
integration (although H could be multivalued if Ω is not simply connected).
The same computation as in the proof of Theorem 3.6.4 shows that a function µ
of the form (3.108) is a Darboux integrating factor if and only if

X µ + µ div X = µ ( ∑_{j=1}^s β_j K_j + div X ) ≡ 0,

where K_j is the cofactor of f_j for j = 1, . . . , s. Thus if we have s algebraic invariant
curves f1, . . . , fs and are able to find s constants β_j such that

∑_{j=1}^s β_j K_j + div X ≡ 0, (3.109)

then µ = f1^β1 · · · fs^βs is an integrating factor of (3.101).
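For a concrete family, searching for the constants β_j in (3.109) is linear algebra on the coefficients of ∑ β_j K_j + div X. The following sympy sketch is our own illustration (the data are the cubic family (3.115) of Example 3.6.11 below, with the single curve f = u² + v²):

```python
import sympy as sp

u, v, A, B, C = sp.symbols('u v A B C')

# family (3.115): u' = P, v' = Q
P = -v + A*u**3 + B*u**2*v + C*v**3
Q = u - C*u**3 + A*u**2*v + (B - 2*C)*u*v**2

f = u**2 + v**2
Xf = sp.expand(P*sp.diff(f, u) + Q*sp.diff(f, v))
K = sp.div(Xf, f, u, v)[0]                       # cofactor of f
divX = sp.expand(sp.diff(P, u) + sp.diff(Q, v))  # divergence of the field

# condition (3.109) with a single curve: beta*K + div X = 0; try beta = -2
print(sp.expand(-2*K + divX))   # 0, so mu = f**(-2) is an integrating factor
```

Here K = 2Au² + 2(B − C)uv and div X = 4Au² + 4(B − C)uv, so β = −2 makes (3.109) hold identically.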
It is possible to extend the concept of Darboux integrability to a wider class of
functions that includes limits of first integrals of the form (3.105). We can proceed
in the following way. Suppose that for every value of a parameter ε near zero both
f = 0 and f + εg = 0 are invariant curves for (3.101), with respective cofactors K_f
and K_{f+εg}. Then

K_{f+εg} = X(f + εg)/(f + εg) = (X f)/(f + εg) + ε (X g)/(f + εg)
         = K_f (1 − ε g/f + O(ε²)) + ε (X g)/f + O(ε²) = K_f + ε (X g − g K_f)/f + O(ε²).

Since K_{f+εg} is a polynomial of degree at most m − 1,

K′ := (X g − g K_f)/f

is a polynomial of degree at most m − 1. Thus

K_{f+εg} = K_f + ε K′ + O(ε²)

and

X ((f + εg)/f)^{1/ε} = ((f + εg)/f)^{1/ε} (K_{f+εg} − K_f)/ε = ((f + εg)/f)^{1/ε} (K′ + O(ε)).

As ε tends to zero, ((f + εg)/f)^{1/ε} tends to h = e^{g/f}, which clearly satisfies

X h = K′ h. (3.110)

Thus the function h satisfies the same equation as an algebraic invariant curve,
namely equation (3.103), and has a polynomial cofactor K′ of degree at most m − 1.

Definition 3.6.7. A (possibly multivalued) function of the form

e^{g/f} ∏_j f_j^{α_j},

where f, g, and all the f_j are polynomials, is called a Darboux function. A function
of the form h = e^{g/f} satisfying (3.110), where K′ is a polynomial of degree at most
m − 1, is called an exponential factor. As before, K′ is termed the cofactor of h.

Sometimes an exponential factor is called a “degenerate algebraic invariant
curve,” to emphasize its origin from the coalescence of algebraic invariant curves.
The name above is preferable since e^{g/f} is neither algebraic nor a curve. It is easy
to check that if h = e^{g/f} is an exponential factor, then f = 0 is an algebraic invariant
curve, and g satisfies the equation

X g = g K_f + f K_h,

where K_f is the cofactor of f and K_h is the cofactor of h. Since the product of two
exponential factors is again an exponential factor, there is no loss of generality in
allowing only one exponential factor in Definition 3.6.7.
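A toy illustration of these relations (our own example, not from the text): for the linear system ẋ = x, ẏ = x + y, the line f = x is invariant with cofactor K_f = 1, and h = e^{y/x} is an exponential factor. Both X h = K′h and X g = gK_f + fK_h can be checked with sympy:

```python
import sympy as sp

x, y = sp.symbols('x y')
P, Q = x, x + y        # x' = x, y' = x + y  (illustrative linear system)

h = sp.exp(y/x)
Xh = P*sp.diff(h, x) + Q*sp.diff(h, y)
Kh = sp.simplify(Xh/h)
print(Kh)              # 1, a polynomial cofactor of degree 0 <= m - 1

# consistency check: g = y satisfies X g = g*K_f + f*K_h with K_f = 1
g, f, Kf = y, x, 1
print(sp.expand(P*sp.diff(g, x) + Q*sp.diff(g, y) - (g*Kf + f*Kh)))  # 0
```

Note that h itself is not algebraic, yet it obeys the same linear relation X h = K′h as an invariant curve.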
The theory of Darboux integrability presented above for algebraic invariant
curves goes through essentially unchanged when exponential factors are allowed in
addition to algebraic curves. In particular, the existence of at least m(m + 1)/2 + 1
invariant algebraic curves or exponential factors yields the existence of Darboux
first integrals, and the existence of at least m(m + 1)/2 invariant algebraic curves
or exponential factors implies the existence of a Darboux integrating factor. The
following theorem provides a Darboux first integral when the number of algebraic
invariant curves and exponential factors is small.

Theorem 3.6.8. Suppose system (3.101) has q (distinct) irreducible algebraic partial
integrals f_j with corresponding cofactors K_j, 1 ≤ j ≤ q, and has r (distinct) exponential
factors exp(g_j/h_j) with corresponding cofactors L_j, 1 ≤ j ≤ r, for which
there exist q complex constants α_j, 1 ≤ j ≤ q, and r complex constants β_j, 1 ≤ j ≤ r,
not all zero, such that ∑_{j=1}^q α_j K_j + ∑_{j=1}^r β_j L_j ≡ 0. Then system (3.101) admits a first
integral of the form H = f1^α1 · · · fq^αq (exp(g1/h1))^β1 · · · (exp(g_r/h_r))^βr.

Proof. Simply differentiate the expression for H and apply the same reasoning as in
the proof of Theorem 3.6.4. □

Darboux’s method is one of the most efficient tools for studying the center problem
for polynomial systems (3.101). In particular, if we are able to construct a Darboux
first integral (3.105) or a Darboux integrating factor (3.108) with algebraic
curves f_j = 0 that do not pass through the origin, then we are sure to have a first
integral that is analytic in a neighborhood of the origin. If for system (3.48)
(respectively, (3.61)) the first integral has the form (3.49) (respectively, (3.62)), or
can be modified in accordance with Remark 3.2.4 to produce a first integral on a
neighborhood of the origin that does, then the system has a center at the origin.
In the case that at least one of the invariant curves used to construct a first
integral or an integrating factor passes through the origin, the first integral that we
obtain need not exist on a neighborhood of the origin. In certain situations this poses
no real difficulty. Suppose, for instance, that a first integral H has the form H = f/g^p
for p ∈ N and algebraic partial integrals f and g with g(0, 0) = 0 but f(0, 0) ≠ 0.
Then G = g^p/f is certainly a first integral on a neighborhood of (0, 0) in C² \ V(g).
Since V(g) is an algebraic invariant curve and G takes the constant value 0 on V(g),
G is constant on all orbits in a neighborhood of (0, 0), including those confined to
V(g), hence G is a first integral on a neighborhood of (0, 0). Similar situations are
examined in Exercises 3.28–3.30.
Under certain circumstances it is sufficient to have an integrating factor that is
defined only in a punctured neighborhood of the origin in order to distinguish between
a center and a focus for a real polynomial system of the form (3.2), that is, of
the form

u̇ = −v + U(u, v) = Ũ(u, v), v̇ = u + V(u, v) = Ṽ(u, v), (3.111)

where max(deg Ũ, deg Ṽ) = m. The following theorem is an example of this sort of
result. As always, we let X denote the vector field associated with the system of
differential equations, here X(u, v) = Ũ(u, v) ∂/∂u + Ṽ(u, v) ∂/∂v.

Theorem 3.6.9. Suppose u² + v² is an algebraic partial integral of the real system
(3.111) and that for some negative real number α, (u² + v²)^{α/2} is an integrating
factor for (3.111) on a punctured neighborhood of the origin. Consider the polynomial
div X, and decompose it as a sum of homogeneous polynomials

div X(u, v) = ∑_{j=1}^{m−1} d_j(u, v). (3.112)

Then the origin is a center if either
(i) ∫₀^{2π} div X(r cos ϕ, r sin ϕ) dϕ ≡ 0 on a neighborhood of 0 in R, or
(ii) ∫₀^{2π} d_j(cos ϕ, sin ϕ) dϕ = 0 for all j such that j < −α − 1.
In particular, the origin is a center if α ≥ −2.
Before presenting a proof of this theorem, we will consider two specific examples.
More examples of applying the Darboux method are presented in Section 3.7.
Example 3.6.10. Consider the family of cubic systems of the form (3.111) given by

u̇ = −v + B(u² − v²) + 2Auv + 2Du³ − (4E − C)u²v − 2Duv² + Cv³
v̇ = u − A(u² − v²) + 2Buv − Cu³ + 2Du²v − (4E + C)uv² − 2Dv³, (3.113)

where A, B, C, D, and E are real constants. To search for invariant lines, we use the
method of undetermined coefficients: we insert f(u, v) = f00 + f10 u + f01 v and, since
m = 3, K(u, v) = K00 + K10 u + K01 v + K20 u² + K11 uv + K02 v² into (3.103), written
as X f − K f ≡ 0, and collect terms. This yields a cubic polynomial equation

f00 K00 + (K00 f10 + K10 f00 − f01)u + (K00 f01 + K01 f00 + f10)v + · · · ≡ 0. (3.114)

To make the constant term zero we arbitrarily choose f00 = 0. With that choice
made, if, in order to make the coefficient of either linear term zero, we choose either
f10 = 0 or f01 = 0, then f00 = f10 = f01 = 0 is forced, which is of no interest, so instead
we eliminate the linear terms by choosing K00 = f01/f10 and K00 = −f10/f01,
which in turn forces f10² + f01² = 0. Although system (3.113) is real, we allow complex
coefficients in this computation, so that we proceed by choosing f01 = i f10 or
f01 = −i f10. These choices ultimately lead to the algebraic partial integrals

f1(u, v) = u + iv and f2(u, v) = u − iv

with corresponding cofactors

K1(u, v) = i + (B − iA)u + (A + iB)v + (2D − iC)u² − 4Euv − (2D + iC)v²

and

K2(u, v) = −i + (B + iA)u + (A − iB)v + (2D + iC)u² − 4Euv − (2D − iC)v².

This example illustrates the fact that a real system can have complex algebraic partial
integrals, although what happens here is true in general: they always occur in
complex conjugate pairs (Exercise 3.27).
Then f(u, v) = f1(u, v) f2(u, v) = u² + v² must be a real algebraic partial integral
with cofactor K(u, v) = K1(u, v) + K2(u, v) = 2(Bu + Av + 2Du² − 4Euv − 2Dv²). We
will not pursue the other possible ways of forcing the truth of (3.114).
The equation

(u − A(u² − v²) + 2Buv − Cu³ + 2Du²v − (4E + C)uv² − 2Dv³) du
− (−v + B(u² − v²) + 2Auv + 2Du³ − (4E − C)u²v − 2Duv² + Cv³) dv = 0

corresponding to (3.113) is exact if and only if A = B = D = E = 0. We seek to use
the algebraic partial integral u² + v² to construct an integrating factor by looking for
a constant β such that (3.109) holds. Substitution of f and K into (3.109) gives

2β(Bu + Av + 2Du² − 4Euv − 2Dv²) + 4(Bu + Av + 2Du² − 4Euv − 2Dv²) ≡ 0,

hence condition (3.109) holds if β = −2. Thus µ(u, v) = f^{−2} = (u² + v²)^{−2} is an
integrating factor for (3.113) on the set Ω = R² \ {(0, 0)}. By integrating −H_v = µP̃
and H_u = µQ̃, we obtain a first integral

H(u, v) = C log(u² + v²) + (1 − 2Au + 2Bv + 4Duv + 4Eu²)/(u² + v²)

on the set Ω. The first two hypotheses of Theorem 3.6.9 are satisfied (with α = −4),
and a simple computation shows that condition (i) is met, so that system (3.113) has
a center at the origin.
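The claim that H is a first integral can be confirmed symbolically: X H is a rational function (the logarithm differentiates away) that cancels to zero. A sympy sketch of this check, our own illustration rather than part of the text:

```python
import sympy as sp

u, v, A, B, C, D, E = sp.symbols('u v A B C D E')

# family (3.113)
P = (-v + B*(u**2 - v**2) + 2*A*u*v + 2*D*u**3
     - (4*E - C)*u**2*v - 2*D*u*v**2 + C*v**3)
Q = (u - A*(u**2 - v**2) + 2*B*u*v - C*u**3
     + 2*D*u**2*v - (4*E + C)*u*v**2 - 2*D*v**3)

H = C*sp.log(u**2 + v**2) + (1 - 2*A*u + 2*B*v + 4*D*u*v + 4*E*u**2)/(u**2 + v**2)

# X H is rational, so putting it over a common denominator and cancelling
# decides whether it vanishes identically
XH = P*sp.diff(H, u) + Q*sp.diff(H, v)
print(sp.cancel(sp.together(XH)))   # 0, so H is a first integral on Omega
```

Since the check is an exact rational-function identity, it holds for all real values of the parameters A, B, C, D, E.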

Example 3.6.11. In practice, it is typically the case that when a Darboux first integral
can be found for system (3.111) on a punctured neighborhood of the origin,
then the origin is a center, but this is not always the case, as the following example
shows. For real constants A, B, and C, consider the family of cubic systems

u̇ = −v + Au³ + Bu²v + Cv³
v̇ = u − Cu³ + Au²v + (B − 2C)uv². (3.115)

Exactly the same procedure as in the previous example leads to the algebraic partial
integral f(u, v) = u² + v² with cofactor K(u, v) = 2Au² + 2(B − C)uv, which can be
used as above to find the integrating factor µ(u, v) = f^{−2} = (u² + v²)^{−2} on the set
Ω = R² \ {(0, 0)}. Further calculation gives the first integral

H(u, v) = C log(u² + v²) + (1 + (C − B)u² + Auv)/(u² + v²) + A arctan(v/u)
        = C log(u² + v²) + (1 + (C − B)u² + Auv)/(u² + v²) + A(π/2 − arctan(u/v))
on Ω. The first two hypotheses of Theorem 3.6.9 are satisfied (with α = −4). Since

div X = 4Au² + 4(B − C)uv

is homogeneous of degree two, the condition ∫₀^{2π} d_j(cos ϕ, sin ϕ) dϕ = 0 for j < 3
is restrictive; Theorem 3.6.9 guarantees that system (3.115) has a center at the
origin provided

∫₀^{2π} (4A cos²ϕ + 4(B − C) cos ϕ sin ϕ) dϕ = 4Aπ = 0,

that is, if A = 0.
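The integral is elementary to confirm; a quick symbolic check (our own, not part of the text):

```python
import sympy as sp

phi, A, B, C = sp.symbols('phi A B C')

# the degree-two component d_2(cos, sin) integrated over one period,
# as in condition (ii) of Theorem 3.6.9
val = sp.integrate(4*A*sp.cos(phi)**2 + 4*(B - C)*sp.cos(phi)*sp.sin(phi),
                   (phi, 0, 2*sp.pi))
print(sp.simplify(val))   # 4*pi*A, which vanishes exactly when A = 0
```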
Whether the origin is a center or a focus when A ≠ 0 can be determined using
the theory of Lyapunov stability from Section 2.1 since, by Remark 3.1.7, we know
that the origin is a focus or a center. (See Exercise 3.31 for an alternate approach.)
The function W(u, v) = u² + v² + (B − C)u⁴ − Au³v + (B − C)u²v² − Auv³ is positive
definite on a neighborhood of the origin, and differentiation with respect to t gives
Ẇ(u, v) = (u² + v²)(A(u² + v²) + · · · ), where the omitted terms are of order three or
greater. Thus Ẇ < 0 on a punctured neighborhood of the origin if A < 0 and Ẇ > 0
on a punctured neighborhood of the origin if A > 0. Hence by Theorems 2.1.3(2)
and 2.1.5(1) and the fact that the origin is known to be a focus or a center, the origin
is a stable focus if A < 0 and an unstable focus if A > 0.

In order to prove Theorem 3.6.9 we will need an additional result, which we state
and prove first.

Theorem 3.6.12. Suppose that for the real system (3.111) there exist a punctured
neighborhood Ω of the origin in R² and a function B : Ω → [0, ∞) ⊂ R that is
continuously differentiable, B ≢ 0 on any punctured neighborhood of the origin but

div BX = ∂(BŨ)/∂u + ∂(BṼ)/∂v ≡ 0 on Ω, (3.116)

and for which there exists a finite positive number M such that

∫₀^{2π} (BŨ cos ϕ + BṼ sin ϕ)|_{u=r cos ϕ, v=r sin ϕ} dϕ < M (3.117)

on some punctured neighborhood of r = 0 in R. Then system (3.111) has a center at
the origin.

Proof. There are a ray ρ at the origin and a sequence of points p_j in ρ, j ∈ N, such
that B(p_j) > 0 and p_j → (0, 0) as j → ∞. Since the linear part of system (3.111) is
invariant under a rotation of the coordinate system about the origin, we may assume
that ρ is the positive u-axis. By Remark 3.1.7, the origin is a focus or a center for
system (3.111). Moreover the Poincaré first return map R(r) is defined for r > 0
sufficiently small. Suppose, contrary to what we wish to show, that the origin is a
focus. Reversing the flow if necessary so that the origin is a sink (that is, attracts
every point in a sufficiently small neighborhood of itself), this means that if we
choose any point (u, v) = (r1, 0) in ρ with r1 > 0 sufficiently small, a sequence r_j,
j ∈ N, is generated, satisfying 0 < r_{j+1} = R(r_j) < r_j and r_j → 0 as j → ∞. The
corresponding points in R² have (u, v)-coordinates (r_j, 0). For j sufficiently large,
∫_{r_{j+1}}^{r_j} [u + V(u, 0)] du > 0. If the flow was not reversed in order to form the sequence
r_j, choose J such that B(u, 0) has a nonzero value in the interval r_{J+1} ≤ u ≤ r_J and
consequently

∫_{r_{J+1}}^{r_J} B(u, 0)[u + V(u, 0)] du > 0. (3.118)

If the flow was reversed for the construction, choose J so that B(u, 0) has a nonzero
value in the interval r_J ≤ u ≤ r_{J−1} and consequently

∫_{r_J}^{r_{J−1}} B(u, 0)[u + V(u, 0)] du > 0. (3.119)

Now let Γ be the positively oriented simple closed curve composed of the arc γ
of the trajectory of system (3.111) (not reversed) from (r_J, 0) to its next intersection
(either (r_{J+1}, 0) or (r_{J−1}, 0)) with ρ, followed by the segment λ oriented from that
point back to (r_J, 0). We conclude from (3.118) or (3.119), whichever applies, that

∫_λ B(u, 0)[u + V(u, 0)] du ≠ 0. (3.120)

Let C_r denote the negatively oriented circle of radius r centered at the origin and,
for r > 0 sufficiently small, let U denote the region bounded by Γ and C_r. Then by
(3.116) and Green's Theorem,

0 = ∬_U div BX dA = ∫_γ (−BṼ du + BŨ dv) ± ∫_λ BṼ du + ∫_{C_r} (BṼ du − BŨ dv), (3.121)

where the immaterial ambiguity in the sign of the second term arises from the
question of whether or not the flow was reversed in forming the sequence r_j.
The first term in (3.121) is

∫₀^T (−BṼŨ + BŨṼ)(u(t), v(t)) dt = 0,

where T is the time taken to describe γ. The third term is, up to sign,

r ∫₀^{2π} (BṼ sin ϕ + BŨ cos ϕ)|_{u=r cos ϕ, v=r sin ϕ} dϕ,

which by (3.117) tends to zero as r tends to zero. But by (3.120) the second term in
(3.121) is a fixed nonzero constant, yielding a contradiction. □

We are now in a position to prove Theorem 3.6.9.

Proof of Theorem 3.6.9. By (3.103) and a simple computation, the condition that
u² + v² = 0 be an invariant curve is that there exist a polynomial K(u, v) satisfying

uU(u, v) + vV(u, v) = ½(u² + v²)K(u, v) (3.122)

or

U(r cos ϕ, r sin ϕ) cos ϕ + V(r cos ϕ, r sin ϕ) sin ϕ = ½ r K(r cos ϕ, r sin ϕ). (3.123)

The condition that (u² + v²)^{α/2} = r^α be an integrating factor is, by (3.107) and
straightforward manipulations,

α(uU(u, v) + vV(u, v)) = −(u² + v²)(∂U/∂u + ∂V/∂v)(u, v), (3.124)

which, when combined with (3.122), is

½ α K(u, v) = −(∂U/∂u + ∂V/∂v)(u, v). (3.125)

We now apply Theorem 3.6.12. When B(u, v) = (u² + v²)^{α/2} = r^α the left-hand
side of (3.116) may be computed as

(u² + v²)^{α/2 − 1} [ α(uU(u, v) + vV(u, v)) + (u² + v²)(∂U/∂u + ∂V/∂v)(u, v) ],

which by (3.124) vanishes identically; the integral in (3.117) is

∫₀^{2π} r^{α−1} [uU(u, v) + vV(u, v)]|_{u=r cos ϕ, v=r sin ϕ} dϕ,

which, again by (3.122), is

∫₀^{2π} ½ r^{α+1} K(r cos ϕ, r sin ϕ) dϕ,

so condition (3.117) holds if

∫₀^{2π} K(r cos ϕ, r sin ϕ) dϕ = O(r^{−α−1}), (3.126)

in which case by Theorem 3.6.12 (noting that B(u, v) > 0 on R² \ {(0, 0)}) the origin
is a center for system (3.111). We observe in particular that if we write

K = ∑_{j=1}^{m−1} K_j, (3.127)

where each K_j is a homogeneous polynomial of degree j, then (3.126) holds (and
system (3.111) has a center) if ∫₀^{2π} K_j(cos ϕ, sin ϕ) dϕ = 0 for j < −α − 1. This is
automatic for α ≥ −2.
If now condition (i) in Theorem 3.6.9 holds, then replacing K(r cos ϕ, r sin ϕ) in
(3.126) using (3.125) we see that (3.126) holds, so there is a center at the origin.
If condition (ii) in Theorem 3.6.9 holds, then (3.125) shows that the homogeneous
components in the decomposition (3.112) are d_j(u, v) = −(α/2)K_j(u, v), where
K_j is as in (3.127). The theorem then follows from the remarks immediately
following (3.127). □

3.7 Applications: Quadratic Systems and a Family of Cubic Systems

To find the center variety of a given family of polynomial systems of the form (3.3)
we use the following approach, outlined in the introduction to this chapter. We compute
the first focus quantity that is different from zero, say g_KK, and set G = {g_KK}.
We then compute the next focus quantity g_{K+1,K+1}, reduce it modulo g_KK (Definition
1.2.15), and, using the Radical Membership Test, check if the reduced polynomial
g′_{K+1,K+1} belongs to √⟨g_{K,K}⟩. If not, then we add it to G, so that now
G = {g_KK, g′_{K+1,K+1}}. We then compute g_{K+2,K+2}, reduce it modulo ⟨G⟩ (where
⟨G⟩ denotes the ideal generated by the polynomials in G), and check whether the
reduced polynomial g′_{K+2,K+2} is in √⟨G⟩. (Of course, the set G need not be a Gröbner
basis of ⟨G⟩.) If not, we adjoin it to G, and so continue until we reach the smallest
value of s such that G = {g_{K,K}, g′_{K+1,K+1}, . . . , g′_{K+s,K+s}} and g′_{K+s+1,K+s+1} ∈ √⟨G⟩.
At this point we expect that

V_C = V(B) = V(⟨G⟩), (3.128)
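The Radical Membership Test invoked at each step is itself a Gröbner basis computation: f ∈ √⟨G⟩ if and only if 1 ∈ ⟨G, 1 − wf⟩ for a fresh variable w. A generic sketch in Python with sympy (our own illustrative helper, shown on a toy ideal rather than on focus quantities):

```python
import sympy as sp

def in_radical(f, G, gens):
    """Radical Membership Test (illustrative helper): f lies in sqrt(<G>)
    iff the reduced Groebner basis of <G, 1 - w*f> is {1}."""
    w = sp.Dummy('w')
    gb = sp.groebner(list(G) + [1 - w*f], *gens, w, order='lex')
    return gb.exprs == [1]

x, y = sp.symbols('x y')
print(in_radical(x*y, [x**2, y**2], (x, y)))    # True:  (x*y)**2 lies in the ideal
print(in_radical(x + 1, [x**2, y**2], (x, y)))  # False: x + 1 is nonzero at V = {(0,0)}
```

The same helper, applied to the focus quantities of a concrete family, automates the membership checks described above.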


where B is the Bautin ideal and V_C is the center variety of the family of systems
under consideration. To increase our confidence, we might compute the next few
focus quantities and verify that they also lie in √⟨G⟩. Certainly V(B) ⊂ V(⟨G⟩);
we must establish the reverse inclusion so as to prove (3.128). To this end, we next
find the irreducible decomposition of V(⟨G⟩),

V(⟨G⟩) = V1 ∪ · · · ∪ Vq,

and then for every component V_s of the decomposition use the methods presented in
the previous sections, augmented as necessary by other techniques, to prove that all
systems from that component have a center at the origin.
In this section we will apply this theory to find the center variety of two families
of systems, all systems of the form (3.3) with quadratic nonlinearities,

ẋ = i x − a10x2 − a01xy − a−12y2
 (3.129)
ẏ = −i y − b2,−1x2 − b10xy − b01y2 ,

and the restricted family of cubic systems of the form (3.3) given by

ẋ = i x − a10x2 − a01xy − a−13y3
 (3.130)
ẏ = −i y − b10xy − b01y2 − b3,−1x3 .

According to Proposition 3.3.9, the center variety of each system is the same as the
center variety of the respective system

ẋ = x − a10x2 − a01xy − a−12y2


(3.131)
ẏ = −y + b2,−1x2 + b10xy + b01y2

and
ẋ = x − a10x2 − a01xy − a−13y3
(3.132)
ẏ = −y + b10xy + b01y2 + b3,−1x3 .
This particular cubic family was selected both because it is amenable to study and
because the Darboux theory of integrability is inadequate to treat it completely, so
that it gives us an opportunity to illustrate yet another technique for finding a first
integral. It also provides an example illustrating the following important remark.

Remark. The problem of finding an initial string of focus quantities g11, . . . , g_KK
such that V(B_K) = V(B) and the problem of finding an initial string of focus quantities
g11, . . . , g_JJ such that B_J = B are not the same problem, and need not have
the same answer. The first equality tells us only that the ideals B_K and B have the
same radical, not that they are the same ideal. We will see in Section 6.3 that for the
general quadratic family (3.129) both V(B3) = V(B) and B3 = B hold true, but
that for family (3.130) V(B5) = V(B) but B5 ⊊ B. In the same section we will see
that these two families also show that B_K for the least K such that V(B_K) = V(B)
may or may not be radical (true for family (3.129) but false for (3.130)).

Theorem 3.7.1. The center variety of families (3.129) and (3.131) is the variety of
the ideal B3 generated by the first three focus quantities, and is composed of the
following four irreducible components:
1. V1 = V(J1), where J1 = ⟨2a10 − b10, 2b01 − a01⟩;
2. V2 = V(J2), where J2 = ⟨a01, b10⟩;
3. V3 = V(J3), where J3 = ⟨2a01 + b01, a10 + 2b10, a01b10 − a−12b2,−1⟩;
4. V4 = V(J4), where J4 = ⟨f1, f2, f3, f4, f5⟩, where
(a) f1 = a01³b2,−1 − a−12b10³,
(b) f2 = a10a01 − b01b10,
(c) f3 = a10³a−12 − b2,−1b01³,
(d) f4 = a10a−12b10² − a01²b2,−1b01, and
(e) f5 = a10²a−12b10 − a01b2,−1b01².
Moreover, V1 = V(I_Ham), V4 = V(I_sym), V2 is the Zariski closure of those systems
having three invariant lines, and V3 is the Zariski closure of those systems having
an invariant conic and an invariant cubic.
Proof. Following the general approach outlined above, by means of the algorithm
of Section 3.4, we compute the first three focus quantities for family (3.129) and
reduce g22 modulo g11 and g33 modulo {g11, g22}. Actually, before performing the
reduction of g33, we compute a Gröbner basis for ⟨g11, g22⟩ (with respect to lex and
with the ordering a10 > a01 > a−12 > b2,−1 > b10 > b01) and reduce it modulo this
basis. Our abbreviated terminology for this procedure is that we “reduce g33 modulo
⟨g11, g22⟩” (see Definition 1.2.15). Maintaining the notation g_kk for the reduced
quantities, the result is

g11 = −i(a10a01 − b01b10) (3.133)

g22 = −i(a10a−12b10² − b01b2,−1a01² − (2/3)(a−12b10³ − b2,−1a01³)
      − (2/3)(a01b01²b2,−1 − b10a10²a−12)) (3.134)

g33 = (5/8)i(−a01a−12b10⁴ + 2a−12b01b10⁴ + a01⁴b10b2,−1
      − 2a01³b01b10b2,−1 − 2a10a−12²b10²b2,−1 + a−12²b10³b2,−1 (3.135)
      − a01³a−12b2,−1² + 2a01²a−12b01b2,−1²).

Note that g33 is not of the form (3.82) because of the reduction.
The reader is encouraged to verify (Exercise 3.32) using the Radical Membership
Test that

g22 ∉ √⟨g11⟩, g33 ∉ √⟨g11, g22⟩, g44, g55, g66 ∈ √⟨g11, g22, g33⟩. (3.136)

Thus we expect that

V(⟨g11, g22, g33⟩) = V(B3) = V(B). (3.137)
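The first of the memberships in (3.136) can even be settled by hand: g11 (stripped of the unit factor −i) is an irreducible polynomial, so √⟨g11⟩ = ⟨g11⟩ and the radical membership test reduces to divisibility. A sympy sketch of this check (our own; symbol names such as am12 and b2m1 stand for a−12 and b2,−1):

```python
import sympy as sp

a10, a01, am12, b2m1, b10, b01 = sp.symbols('a10 a01 am12 b2m1 b10 b01')

g11 = a10*a01 - b01*b10   # (3.133) without the unit factor -i
g22 = (a10*am12*b10**2 - b01*b2m1*a01**2
       - sp.Rational(2, 3)*(am12*b10**3 - b2m1*a01**3)
       - sp.Rational(2, 3)*(a01*b01**2*b2m1 - b10*a10**2*am12))

# g11 is irreducible, so membership in sqrt(<g11>) = <g11> is just
# divisibility; a nonzero remainder shows g22 is not in sqrt(<g11>)
q, rem = sp.div(g22, g11, a10, a01, am12, b2m1, b10, b01)
print(rem == 0)   # False
```

The remaining memberships in (3.136), where the ideal is no longer principal, need the general 1 ∈ ⟨G, 1 − wf⟩ form of the test.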

To verify that this is the case, we will find the irreducible decomposition of V(B3 )
and then check that an arbitrary element of each component has a center at the
origin, thus confirming that V(B3 ) ⊂ V(B) and thereby establishing (3.137).

In general, finding the irreducible decomposition of a variety is a difficult computational
problem, which relies on rather laborious algorithms for the primary decomposition
of polynomial ideals. Although, as we have mentioned in Chapter 1, there
are at present implementations of such algorithms in some specialized computer
algebra systems (for example, CALI, Macaulay, and Singular), for system (3.129) we
can find the irreducible decomposition of V(B3) using only a general-purpose computer
algebra system such as Maple or Mathematica. Moreover, we choose to do so
here in order to further illustrate some of the concepts that have been developed.
We begin by computing the Hamiltonian and symmetry ideals using equation (3.83)
and the algorithm of Section 5.2. This gives I_Ham = J1 and I_sym = J4; we know that
V1 = V(I_Ham) and V4 = V(I_sym) are irreducible; they are, therefore, candidates for
components of V(B3).
To proceed further, we parametrize V(g11) as

(a−12, a01, a10, b01, b10, b2,−1) = (a−12, a01, s b10, s a01, b10, b2,−1).

For j ∈ {1, 2, 3}, define g̃_jj ∈ C[a−12, a01, b10, b2,−1, s] by

g̃_jj(a−12, a01, b10, b2,−1, s) = g_jj(a−12, a01, s b10, s a01, b10, b2,−1).

For every (a−12, a01, b10, b2,−1, s) ∈ C⁵,

g̃11(a−12, a01, b10, b2,−1, s) := g11(a−12, a01, s b10, s a01, b10, b2,−1) = 0;

g̃_jj(a−12, a01, b10, b2,−1, s) = 0 if and only if g_jj(a−12, a01, s b10, s a01, b10, b2,−1) = 0
for j = 2, 3, although some points of V(g11) are missed by the parametrization,
and will have to be considered separately (points (iv) and (v) below). The system
g̃22 = g̃33 = 0 has the same solution set, that is, defines the same variety, as the
system h1 = h2 = 0, where {h1, h2} is a Gröbner basis of ⟨g̃22, g̃33⟩. Such a basis
with respect to lex with a−12 > a01 > b10 > b2,−1 > s is

{(2s − 1)(s + 2) f1, (2s − 1)(a−12b2,−1 − b10a01) f1},

where f1 is the first polynomial on the list of generators of J4 = I_sym. (Strictly speaking,
we should have f̃1, defined analogously to the g̃_jj, in place of f1.) We solve this
system.
(i) One solution is s = 1/2: (a−12, a01, b10/2, a01/2, b10, b2,−1) lies in V(g11, g22, g33)
for all (a−12, a01, b10, b2,−1). All these sextuples lie in V(2a10 − b10, 2b01 − a01),
and they are precisely this variety. Thus (a, b) ∈ V(2a10 − b10, 2b01 − a01) implies
(a, b) ∈ V(B3). This is V(J1).
(ii) A second solution is f1 = a01³b2,−1 − a−12b10³ = 0. This means that if a sextuple
(a−12, a01, s b10, s a01, b10, b2,−1) (which lies in V(g11)) satisfies the condition
a01³b2,−1 − a−12b10³ = 0, then it is in V(g22, g33), hence in V(g11, g22, g33) = V(B3).
We would like to say that any element of V(g11) ∩ V(a01³b2,−1 − a−12b10³) lies in
V(B3), but we have not shown that, because there are points in V(g11) that have
been left out of consideration, the points missed by our parametrization: those for
which a01 = 0 but b01 ≠ 0 and those for which b10 = 0 but a10 ≠ 0. In the former
case g11 = 0 only if b10 = 0, and we quickly check that if a01 = b10 = 0,
then g22 = g33 = 0. Similarly, if b10 = 0 but a10 ≠ 0, g11 = 0 forces a01 = 0,
so we again have that a01 = b10 = 0, so g22 = g33 = 0. Thus we conclude that
(a, b) ∈ V(g11, a01³b2,−1 − a−12b10³) = V(a10a01 − b10b01, a01³b2,−1 − a−12b10³) implies
(a, b) ∈ V(B3). We set J5 = ⟨a10a01 − b10b01, a01³b2,−1 − a−12b10³⟩ and set
V5 = V(J5).
(iii) A third solution is s = −2 and a−12b2,−1 − b10a01 = 0, which means that any
sextuple (a−12, a01, −2b10, −2a01, b10, b2,−1) satisfying a−12b2,−1 − b10a01 = 0 lies
in V(B3). These points form V(a−12b2,−1 − b10a01, a10 + 2b10, 2a01 + b01) = V3,
so V3 ⊂ V(B3).
so V3 ⊂ V(B3 ).
Points of V(g11) that are not covered by our parametrization are those points for
which
(iv) b10 = 0 but a10 ≠ 0, hence a01 = 0; and
(v) a01 = 0 but b01 ≠ 0, hence b10 = 0.
In either case, a01 = b10 = 0, so all these points lie in V(a01, b10), and we have
already seen that a01 = b10 = 0 implies that g22 = g33 = 0, so we conclude that
(a, b) ∈ V(a01, b10) = V2 implies (a, b) ∈ V(B3), so V2 ⊂ V(B3).
Our computations have shown that V(B3) = V1 ∪ V2 ∪ V3 ∪ V5. Since the variety
V4 = V(I_sym), which is irreducible, does not appear, but V4 ⊂ V_C ⊂ V(B3), we
include it as well, writing

V(B3) = V1 ∪ V2 ∪ V3 ∪ V4 ∪ V5.

Since J5 = ⟨f1, f2⟩ ⊂ I_sym, V4 = V(I_sym) ⊂ V(J5) = V5, so we suspect that V5 is the
union of V4 and another subvariety on our list, and recognize that V2 ⊂ V5 as well.
We can discard V5 provided V2 ∪ V4 contains it. To check this, we compute (using,
say, the algorithm in Table 1.6 in Section 1.3) J5 : J4 and apply Theorem 1.3.24 to
obtain

V(J5) \ V(J4) ⊂ the Zariski closure of V(J5) \ V(J4)
             ⊂ V(J5 : J4) = V(b10³, a01², a01b10², a01²b10, a10a01 − b10b01)
             = V(a01, b10) = V(J2),

where the next-to-last equality is by inspection. Thus V(J5) ⊂ V(J2) ∪ V(J4), so V5
is superfluous, and we remove it from our list of subvarieties composing V(B3).
Turning to the question of the irreducibility of V1, V2, V3, and V4, we already
know that V1 and V4 are irreducible. Of course, the irreducibility of both V1 and V2
is clear “geometrically” because each is the intersection of a pair of hyperplanes in
C⁶. Alternatively, each has a polynomial parametrization, namely,

(a−12, a01, a10, b01, b10, b2,−1) = (r, 2t, s, t, 2s, u)

and

(a−12, a01, a10, b01, b10, b2,−1) = (r, 0, s, t, 0, u),

respectively, hence by Corollary 1.4.18 is irreducible. We note here a fact that will
be needed later, that by Theorem 1.4.17 the ideal J2 is prime, hence by Proposition
1.4.3 is radical. To show that V3 is irreducible, we look for a rational parametrization
similarly. It is apparent that S := V3 \ {a−12 = 0} is precisely the image of the map

(a−12 , a01 , a10 , b01 , b10 , b2,−1 ) = F(r, s,t) = (r, s, −2t, −2s,t, st/r)

from {(r, s, t) ∈ C³ : r ≠ 0} into C⁶, so that irreducibility of V3 will follow from
Definition 1.4.16 and Corollary 1.4.18 if we can establish that V3 is the Zariski
closure of V3 \ V(a−12). To do so we apply Theorem 1.4.15, which in this context
states that the Zariski closure of S is the variety of the fourth elimination ideal of the ideal

J = ⟨r − a−12, s − a01, −2t − a10, −2s − b01, t − b10, st − r b2,−1, 1 − ry⟩

in the ring C[r, s,t, y, a−12 , a01 , a10 , b01 , b10 , b2,−1 ]. A Gröbner basis of J with respect
to lex with r > s > t > y > a−12 > a01 > a10 > b01 > b10 > b2,−1 is

{2a01 + b01 , a10 + 2b10, 2a−12 b2,−1 + b01 b10 , y b10 − 1,t − b10, 2s + b01, r − a−12 } .

The generators of the fourth elimination ideal are the basis elements that do not
contain r, s, t, or y: g1 = 2a01 + b01, g2 = a10 + 2b10, and g3 = 2a−12 b2,−1 + b01 b10 .
These are not the generators of J3 as given in the statement of the theorem, but the
two ideals might still be the same. To see if they are, we must compute a reduced
Gröbner basis of J3 with respect to lex with our usual ordering of the variables (see
Theorem 1.2.27). When we do so, we obtain {g1 , g2 , g3 } and conclude that J3 is the
fourth elimination ideal of J, as required. As we did for J2 , we note that by Theorem
1.4.17 the ideal J3 is prime, hence, by Proposition 1.4.3, it is radical.
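Computations like this one are easy to reproduce with a general-purpose computer algebra system. The following sympy sketch (with ASCII names am12 and b2m1 standing in for a−12 and b2,−1) recomputes the lex Gröbner basis of J and picks out the generators of the fourth elimination ideal.

```python
# Recompute the lex Groebner basis of J and extract the fourth
# elimination ideal (generators free of r, s, t, y).
from sympy import symbols, groebner

r, s, t, y = symbols('r s t y')
am12, a01, a10, b01, b10, b2m1 = symbols('am12 a01 a10 b01 b10 b2m1')

J = [r - am12, s - a01, -2*t - a10, -2*s - b01,
     t - b10, s*t - r*b2m1, 1 - r*y]

G = groebner(J, r, s, t, y, am12, a01, a10, b01, b10, b2m1, order='lex')

# Basis elements not involving r, s, t, y generate the elimination ideal.
elim = [g for g in G.exprs if not g.free_symbols & {r, s, t, y}]

# The three generators g1, g2, g3 found in the text reduce to zero mod G.
for p in (2*a01 + b01, a10 + 2*b10, 2*am12*b2m1 + b01*b10):
    assert G.reduce(p)[1] == 0
```

(sympy normalizes the reduced basis to be monic, so the elements of `elim` agree with g1, g2, and g3 only up to constant multiples.)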
As an aside we note that we could have approached the problem of the irre-
ducibility of V1 , V2 , and V3 by proving, without reference to the varieties involved,
that the ideals J1 , J2 , and J3 are prime, hence radical, and appealing to Theorems
1.3.14 and 1.4.5, just as we concluded irreducibility of V4 from the fact that J4 = Isym
is prime (Theorem 3.5.9). Direct proofs that J1 , J2 , and J3 are prime are given in the
proof of Theorem 6.3.3.
At this point we have shown that the unique minimal decomposition of V(B3 )
guaranteed by Theorem 1.4.7 to exist is

V(B3 ) = V1 ∪V2 ∪V3 ∪V4 .

It remains to show that every system from Vk , k = 1, 2, 3, 4, has a center at the ori-
gin. For V1 = V(IHam ) and V4 = V(Isym ) this is automatic. For the remaining two
components of V(B3 ), we look for Darboux first integrals.
Systems from V2 have the form

ẋ = x − a10 x² − a−12 y², ẏ = −y + b2,−1 x² + b01 y². (3.138)

If we look for an invariant line of (3.138) that does not pass through the origin, say
the zero set of the function f (x, y) = 1 + rx + sy, then the cofactor K is a first-degree
3.7 Applications: Quadratic Systems and a Family of Cubic Systems 153

polynomial, say K = K0 + K1 x + K2 y, and satisfies the equation X f = K f . Since

X f − K f = −K0 + (r − K1 − K0 r)x − (s + K2 + K0 s)y + · · · ,

K is forced to have the form K(x, y) = rx − sy, and a computation of the three re-
maining coefficients in X f − K f indicates that the zero set of f is an invariant line
if and only if
r² + a10 r − b2,−1 s = 0 (3.139)
and
s² − a−12 r + b01 s = 0. (3.140)
Suppose b2,−1 6= 0. We solve (3.139) for s and insert the resulting expression into
(3.140) to obtain
r [ r³ + 2a10 r² + (a10² + b2,−1 b01) r + (a10 b2,−1 b01 − a−12 b2,−1²) ] = 0.
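Both the invariance conditions and the elimination of s can be verified mechanically. The following sympy sketch (am12 and b2m1 are ASCII stand-ins for a−12 and b2,−1) computes X f − K f for system (3.138), checks that only the x² and y² coefficients survive, and then eliminates s to recover the cubic in r.

```python
# Verify that f = 1 + r*x + s*y is invariant for system (3.138) with
# cofactor K = r*x - s*y exactly when (3.139) and (3.140) hold, and that
# eliminating s reproduces the displayed cubic in r.
from sympy import symbols, expand, Poly

x, y, r, s = symbols('x y r s')
a10, am12, b01, b2m1 = symbols('a10 am12 b01 b2m1')

P = x - a10*x**2 - am12*y**2           # xdot in (3.138)
Q = -y + b2m1*x**2 + b01*y**2          # ydot in (3.138)
f = 1 + r*x + s*y
K = r*x - s*y

D = expand(P*f.diff(x) + Q*f.diff(y) - K*f)   # X f - K f
coeffs = Poly(D, x, y).as_dict()
assert set(coeffs) <= {(2, 0), (0, 2)}        # only x^2, y^2 terms survive

eq139 = r**2 + a10*r - b2m1*s
eq140 = s**2 - am12*r + b01*s
assert expand(coeffs[(2, 0)] + eq139) == 0
assert expand(coeffs[(0, 2)] - eq140) == 0

# Substitute s from (3.139) into (3.140) and clear denominators:
elim = expand(eq140.subs(s, (r**2 + a10*r)/b2m1) * b2m1**2)
cubic = r**3 + 2*a10*r**2 + (a10**2 + b2m1*b01)*r \
        + (a10*b2m1*b01 - am12*b2m1**2)
assert expand(elim - r*cubic) == 0
```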

Let g denote the constant term in the cubic and h the discriminant of the cubic,
which is a homogeneous polynomial of degree four in a10 , a−12 , b2,−1 , and b01 . Off
V(h) the cubic has three distinct roots ([191]), call them r1 , r2 , and r3 , and off V(g)
none is zero. Let s1 , s2 , and s3 denote the corresponding values of s calculated from
(3.139), s_j = (r_j² + a10 r_j)/b2,−1. The identity α1 K1 + α2 K2 + α3 K3 ≡ 0 is the pair of linear
equations

r1 α1 + r2 α2 + r3 α3 = 0 (3.141)
s1 α1 + s2 α2 + s3 α3 = 0 (3.142)

in the three unknowns α1, α2, and α3, hence by Theorem 3.6.8 there exists a Darboux
first integral H = (1 + r1 x − s1 y)^α1 (1 + r2 x − s2 y)^α2 (1 + r3 x − s3 y)^α3. We must
show that the exponents in H can be chosen so that H has the form H(x, y) = xy + · · · .
Conditions (3.141) and (3.142) imply that Hx (0, 0) and Hy (0, 0) are both zero. Us-
ing them to simplify the second partial derivatives of H, we obtain the additional
conditions

r1² α1 + r2² α2 + r3² α3 = 0 (3.143)
s1² α1 + s2² α2 + s3² α3 = 0 (3.144)

arising from Hxx (0, 0) = 0 and Hyy (0, 0) = 0 and

r1 s1 α1 + r2 s2 α2 + r3 s3 α3 ≠ 0

arising from Hxy(0, 0) ≠ 0, which by (3.139) and (3.143) simplifies to

r1³ α1 + r2³ α2 + r3³ α3 ≠ 0. (3.145)

Given nonzero r1, r2, and r3, for any choice of α1, α2, and α3 meeting conditions
(3.141) and (3.143), conditions (3.142) and (3.144) are met automatically: for the

relationship s_j = (r_j² + a10 r_j)/b2,−1 for j = 1, 2, 3 means that (3.142) is a linear
combination of (3.141) and (3.143), while the truth of the first three yields (3.144)
because s_j² = a−12 r_j − b01 s_j. Since
 
    | r1   r2   r3  |
det | r1²  r2²  r3² | = −r1 r2 r3 (r1 − r2)(r1 − r3)(r2 − r3)
    | r1³  r2³  r3³ |

is nonzero, the only choice of α1, α2, and α3 that satisfies (3.141) and (3.143) but
violates (3.145) is α1 = α2 = α3 = 0. Thus for any other choice of α1 , α2 , and α3 ,
Ψ = (H − 1)/(r1³ α1 + r2³ α2 + r3³ α3) is a first integral of the required form.
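The determinant identity used here is a standard alternant evaluation and can be confirmed directly with sympy:

```python
# Check the alternant determinant used to rule out the degenerate choice
# alpha1 = alpha2 = alpha3 = 0.
from sympy import symbols, Matrix, expand

r1, r2, r3 = symbols('r1 r2 r3')
M = Matrix([[r1,    r2,    r3],
            [r1**2, r2**2, r3**2],
            [r1**3, r2**3, r3**3]])
d = M.det()
# det M = -r1*r2*r3*(r1 - r2)*(r1 - r3)*(r2 - r3)
assert expand(d + r1*r2*r3*(r1 - r2)*(r1 - r3)*(r2 - r3)) == 0
```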
Thus every element of V2 \ (V(g) ∪ V(h) ∪ V(b2,−1 )) = V2 \ V(ghb2,−1) has a
center at the origin. Since V2 is irreducible and V2 \ V(ghb2,−1) is clearly a proper
subset of V2, we conclude by Exercise 1.45 and Proposition 1.3.20 that every element
of V2 has a center at the origin.
Now consider the variety V3 . A search for invariant lines turns up nothing, so
we look for a second-degree algebraic partial integral f1 (x, y) that does not pass
through the origin and its first-degree cofactor K1 . Solving the system of polynomial
equations that arises by equating coefficients of like powers in the defining identity
X f1 = K1 f1 yields without complications

f1 = 1 + 2 b10 x + 2 a01 y − a01 b2,−1 x² + 2 a01 b10 xy − (a01 b10²/b2,−1) y²

and its cofactor K1 = 2 (b10 x − a01 y) . One algebraic partial integral is inadequate,
however, so we look for an invariant cubic curve. The same process gives us
f2 = (2 b10 b2,−1²)⁻¹ [ 2 b10 b2,−1² + 6 b10² b2,−1² x + 6 a01 b10 b2,−1² y
    + 3 b10 b2,−1² (b10² − a01 b2,−1) x² + 3 b2,−1 (2 a01 b10² b2,−1 − b10⁴ − a01² b2,−1²) xy
    + 3 a01 b10 b2,−1 (a01 b2,−1 − b10²) y²
    + a01 b2,−1³ (a01 b2,−1 − b10²) x³ + 3 a01 b10 b2,−1² (b10² − a01 b2,−1) x² y
    + 3 a01 b10² b2,−1 (a01 b2,−1 − b10²) xy² + a01 b10³ (b10² − a01 b2,−1) y³ ]

and its cofactor K2 = 3 (b10 x − a01 y). From a comparison of the cofactors K1 and
K2 it is immediately apparent that a nontrivial linear combination α1 K1 + α2 K2 that
is zero is given by α1 = −3, α2 = 2, hence by Theorem 3.6.8 the system has the Darboux
first integral H = f1⁻³ f2², provided b10 b2,−1 ≠ 0. Set g = 2 a01 b10² b2,−1 + b10⁴ + a01² b2,−1².
Then H(x, y) = 1 − 3g (b2,−1 b10)⁻¹ xy + · · · , so that Ψ = b2,−1 b10 (1 − H)/(3g) is a
first integral of the form Ψ(x, y) = xy + · · · , and every system in V3 \ V(b2,−1 b10 g)
has a center at the origin. Since V3 is irreducible and V3 \ V(b2,−1 b10 g) is clearly
a proper subset of V3 , we conclude by Exercise 1.45 and Proposition 1.3.20 that
every element of V3 has a center at the origin. (See Exercise 3.33 for an alternative
way to treat the situation b2,−1b10 = 0. Note also that we could have rescaled f2 by

2 b10 b2,−1² and still had an algebraic partial integral with the same cofactor, eliminating
the problem with vanishing of b10. Rescaling f1 by b2,−1 does not eliminate
the problem of b2,−1 vanishing, however, since the resulting invariant curve then
passes through the origin.) 
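The coefficients of f1 can be double-checked by direct computation. The sympy sketch below writes family (3.131) as ẋ = x − a10x² − a01xy − a−12y², ẏ = −y + b2,−1x² + b10xy + b01y², restricts it to V3 by setting a10 = −2b10, b01 = −2a01, and a−12 = a01 b10/b2,−1 (valid off V(b2,−1)), and verifies the defining identity X f1 = K1 f1; am12 and b2m1 are ASCII stand-ins.

```python
# Verify X f1 = K1 f1 for the quadratic family restricted to V3.
from sympy import symbols, simplify

x, y = symbols('x y')
a01, b10, b2m1 = symbols('a01 b10 b2m1')   # b2m1 stands for b_{2,-1}

# V3 constraints: a10 = -2*b10, b01 = -2*a01, a_{-1,2} = a01*b10/b2m1.
am12 = a01*b10/b2m1
P = x + 2*b10*x**2 - a01*x*y - am12*y**2     # xdot
Q = -y + b2m1*x**2 + b10*x*y - 2*a01*y**2    # ydot

f1 = 1 + 2*b10*x + 2*a01*y - a01*b2m1*x**2 + 2*a01*b10*x*y \
     - (a01*b10**2/b2m1)*y**2
K1 = 2*(b10*x - a01*y)

residual = simplify(P*f1.diff(x) + Q*f1.diff(y) - K1*f1)
assert residual == 0
```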

A computation shows that for systems in family (3.131) the symmetry subvari-
ety is the Zariski closure of the set of time-reversible systems. Thus the theorem
identifies one component of the center variety as the set of Hamiltonian systems,
one component as the smallest variety containing the time-reversible systems, and
the remaining two components as the smallest varieties that contain systems that
possess Darboux first integrals, in one case because the underlying system has three
invariant lines, and in the other because it contains an invariant conic and an in-
variant cubic. The operation of taking the Zariski closure is essential. Although
there occur in the literature statements that identify what we have called V2 as “the
set of quadratic systems having three invariant lines,” in fact not every element of
V2 does so. For example, the system ẋ = x − y², ẏ = −y + y², corresponding to
(a−12, a01, a10, b01, b10, b2,−1) = (1, 0, 0, 1, 0, 0) ∈ V2, has exactly two invariant lines,
real or complex; the system ẋ = x + x² − y², ẏ = −y + x² − y², corresponding to
(1, 0, −1, −1, 0, 1) ∈ V2, has exactly one (Exercise 3.34).
Theorem 3.7.1 shows the utility of finding the minimal decomposition of V(Bk ):
although systems in V2 and V3 typically possess Darboux first integrals, the form
of the integral for systems from V2 differs from that for systems from V3 ; trying
to deal with these two families as a whole, and in particular trying to prove the
existence of a Darboux first integral for the whole family, would almost certainly be
impossible. In the case of family (3.130), even when attention is restricted to some
individual component of the center variety it does not seem possible to produce first
integrals solely by Darboux’s method of invariant algebraic curves, and we resort to
a different approach.

Theorem 3.7.2. The center variety of families (3.130) and (3.132) is the variety
of the ideal B5 generated by the first five focus quantities and is composed of the
following eight irreducible components:
1. V(J1), where J1 = ⟨a10, a−13, b10, 3a01 − b01⟩;
2. V(J2), where J2 = ⟨a01, b3,−1, b01, 3b10 − a10⟩;
3. V(J3), where J3 = ⟨a10, a−13, b10, 3a01 + b01⟩;
4. V(J4), where J4 = ⟨a01, b3,−1, b01, 3b10 + a10⟩;
5. V(J5), where J5 = ⟨a01, a−13, b10⟩;
6. V(J6), where J6 = ⟨a01, b3,−1, b10⟩;
7. V(J7), where J7 = ⟨a01 − 2b01, b10 − 2a10⟩;
8. V(J8), where J8 = ⟨a10 a01 − b01 b10, a01⁴ b3,−1 − b10⁴ a−13, a10⁴ a−13 − b01⁴ b3,−1,
   a10 a−13 b10³ − a01³ b01 b3,−1, a10² a−13 b10² − a01² b01² b3,−1,
   a10³ a−13 b10 − a01 b01³ b3,−1⟩.

Proof. When we compute gkk for family (3.130) and (for k ≥ 2) reduce it modulo
Bk−1 (Definition 1.2.15), retaining the same notation gkk for the reduced quantities,
we obtain

g11 = −i(a10 a01 − b01 b10)
g22 = 0
g33 = i(2a10³ a−13 b10 − a10² a−13 b10² − 18a10 a−13 b10³ − 9a01⁴ b3,−1
    + 18a01³ b01 b3,−1 + a01² b01² b3,−1 − 2a01 b01³ b3,−1 + 9a−13 b10⁴)/8
g44 = i(14a10 b01 (2a10 a−13 b10³ + a01⁴ b3,−1 − 2a01³ b01 b3,−1 − a−13 b10⁴))/27
g55 = −i a−13 b3,−1 (378a10⁴ a−13 + 5771a10³ a−13 b10 − 25462a10² a−13 b10²
    + 11241a10 a−13 b10³ − 11241a01³ b01 b3,−1 + 25462a01² b01² b3,−1
    − 5771a01 b01³ b3,−1 − 378b01⁴ b3,−1)/3240
g66 = 0
g77 = i a−13² b3,−1² (343834a10² a−13 b10² − 1184919a10 a−13 b10³ + 506501a−13 b10⁴
    − 506501a01⁴ b3,−1 + 1184919a01³ b01 b3,−1 − 343834a01² b01² b3,−1)
g88 = 0
g99 = i a−13³ b3,−1³ (2a10 a−13 b10³ − a−13 b10⁴ + a01⁴ b3,−1 − 2a01³ b01 b3,−1)

The Radical Membership Test (Table 1.4) shows that both g77 and g99 lie in √B5,
which suggests that V(B) = V(B5), which we will now prove.
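The test itself is the Rabinowitsch trick: f ∈ √I if and only if 1 ∈ ⟨I, 1 − wf⟩ for a new variable w, which a Gröbner basis computation detects. The following sympy sketch is a toy illustration of the mechanism, not the (much larger) B5 computation itself: here f = x + y and I = ⟨x², y²⟩, so f ∉ I but (x + y)³ ∈ I.

```python
# Radical Membership Test via the Rabinowitsch trick.
from sympy import symbols, groebner

x, y, w = symbols('x y w')

def in_radical(f, I, gens):
    # f is in sqrt(I)  iff  the Groebner basis of <I, 1 - w*f> is {1}.
    G = groebner(list(I) + [1 - w*f], *gens, w, order='lex')
    return G.exprs == [1]

assert in_radical(x + y, [x**2, y**2], (x, y))          # (x+y)^3 in I
assert not in_radical(x + y + 1, [x**2, y**2], (x, y))  # nonzero at origin
```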
Computing IHam using (3.85) gives J7 ; setting the relevant coefficients equal to
zero in the polynomials listed in Table 3.2 gives Isym = J8 . By factoring the focus
quantities through a change of parameters, as was done for the quadratic family,
we find that the system g11 = g33 = g44 = g55 = 0 is equivalent to the eight condi-
tions listed in the theorem. Rather than duplicating those computations, the reader
can verify this by first applying the algorithm in Table 1.5 recursively to compute
I = ∩_{j=1}^8 J_j, then using the Radical Membership Test to verify that √I = √B5 (Exercise
3.36). Because each of V1 through V7 has a polynomial parametrization, it is
irreducible; V8 is irreducible because it is V(Isym).
We now prove that every system from V j , 1 ≤ j ≤ 8, has a center at the origin.
First observe that if in a generator of J1 every occurrence of a jk is replaced by bk j
and every occurrence of b jk is replaced by ak j , then a generator of J2 is obtained,
and conversely. Thus any procedure that yields a first integral H for a system in V2
also yields a first integral for the corresponding element of V1 under the involution:
if there is a formula for H in terms of (a, b), simply perform the same involution on
it. Thus we need not treat V1 . Similarly, we need not treat V3 or V5 . The varieties V7
and V8 need not be treated since, as already mentioned, J7 = IHam and J8 = Isym .
Any system from V2 has the form

ẋ = x − 3b10 x² − a−13 y³,   ẏ = −y + b10 xy,

for which div X = −5 b10 x. If either a−13 or b10 is zero, then in fact the system
also comes from V8, already known to have only systems with a center at the origin,
so we need only treat the case a−13 b10 ≠ 0. We look for a Darboux first integral
or a Darboux integrating factor. A search for invariant lines by the usual method of

undetermined coefficients yields only the obvious algebraic partial integral f = y
and its cofactor K = −1 + b10 x. Since for no choice of β can equation (3.109) be
satisfied, we cannot obtain an integrating factor from just f alone, and so we look
for higher-order invariant curves. We obtain f1 = 1 − 6b10 x + 9b10² x² + 2a−13 b10 y³
and its cofactor K1 = −6b10 x. Since equation (3.109) holds with β1 = −5/6, an
analytic Darboux integrating factor for our system is μ = f1^(−5/6). Then

H(x, y) = −∫ μ(x, y)P(x, y) dy = ∫ (−x + · · ·) dy = (−xy + · · ·) + c(x),

for some analytic c(x), is a first integral in a neighborhood of the origin. Because

∂H/∂x = (−y + · · ·) + c′(x) = μ(x, y)Q(x, y) = −y + · · · ,

c(x) begins with terms of order at least three. Thus Ψ = −H is a Lyapunov first
integral on a neighborhood of the origin, which is thus a center.
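Both claims in this step, that X f1 = K1 f1 and that β1 = −5/6 balances the divergence, reduce to polynomial identities that sympy confirms. Here am13 stands for a−13, and the integrating-factor condition (3.109) is used in the usual form β1K1 + div X = 0.

```python
# Check f1 and the Darboux integrating-factor condition for the V2 system.
from sympy import symbols, simplify, Rational

x, y, am13, b10 = symbols('x y am13 b10')   # am13 stands for a_{-1,3}

P = x - 3*b10*x**2 - am13*y**3   # xdot
Q = -y + b10*x*y                 # ydot

f1 = 1 - 6*b10*x + 9*b10**2*x**2 + 2*am13*b10*y**3
K1 = -6*b10*x

# f1 is an algebraic partial integral: X f1 = K1 f1.
assert simplify(P*f1.diff(x) + Q*f1.diff(y) - K1*f1) == 0

# beta1*K1 + div X = 0 with beta1 = -5/6, so mu = f1**(-5/6) works.
divX = P.diff(x) + Q.diff(y)
assert simplify(Rational(-5, 6)*K1 + divX) == 0
```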
Any system from V4 has the form

ẋ = x + 3b10 x² − a−13 y³,   ẏ = −y + b10 xy,

for which div X = 7 b10 x. If either a−13 or b10 is zero, then this system also comes
from V8, so we restrict attention to the situation a−13 b10 ≠ 0. An invariant cubic
curve is the zero set of f1 = 1 + 3b10 x − a−13 b10 y³, whose cofactor is K1 = 3 b10 x.
Since equation (3.109) holds with β1 = −7/3, an analytic Darboux integrating factor
for our system is μ = f1^(−7/3), by means of which we easily obtain a Lyapunov first
integral on a neighborhood of the origin, which is thus a center.
Any system from V6 has the form

ẋ = x − a10 x² − a−13 y³,   ẏ = −y + b01 y².

This system also comes from V8 if a10 a−13 = 0, so we continue on the assumption
that a10 a−13 ≠ 0. A search for a Darboux first integral or integrating
factor proves fruitless, so we must proceed along different lines for this case. To
simplify the computations that follow, we make the change of variables x1 = −a10 x,
y1 = y. Dropping the subscripts, the system under consideration is transformed into

ẋ = x + x² + a10 a−13 y³,   ẏ = −y + b01 y². (3.146)

We look for a formal first integral expressed in the form Ψ(x, y) = ∑_{j=1}^∞ v_j(x) y^j.
When this expression is inserted into equation (3.31) and terms are collected on
powers of y, the functions v_j are determined recursively by the first-order linear
differential equations

(x + x²) v_j′(x) − j v_j(x) + a10 a−13 v_{j−3}′(x) + b01 (j − 1) v_{j−1}(x) = 0, (3.147)

if we define v_j(x) ≡ 0 for j ∈ {−2, −1, 0}. It is easily established by mathematical
induction that, making suitable choices for the constants of integration, there exist
functions of the form
    v_j(x) = P_j(x)/(x + 1)^j
that satisfy (3.147), where P_j(x) is a polynomial of degree j, and that we may choose
v1(x) = x/(x + 1). Thus system (3.146) admits a formal first integral of the form
Ψ(x, y) = xy + · · · , hence has a center at the origin. 
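The first steps of the induction can be checked explicitly. With the integration constants chosen so that v1 = x/(x + 1), one admissible next step is v2 = b01 x/(x + 1)² (the coefficient b01 is the one appearing in (3.146), and v2 is determined only up to a multiple of the homogeneous solution x²/(x + 1)²). The sympy sketch below verifies the recursion for j = 1, 2, where the v_{j−3} term is absent.

```python
# Verify the recursion (3.147) for j = 1, 2.
from sympy import symbols, simplify

x, b01 = symbols('x b01')
v0 = 0                       # v_j = 0 for j <= 0 by convention
v1 = x/(x + 1)
v2 = b01*x/(x + 1)**2        # one admissible choice of integration constant

def recursion(vj, j, vjm1, vjm3_prime=0):
    # (x + x^2) v_j' - j v_j + a10*a_{-1,3}*v_{j-3}' + b01*(j-1)*v_{j-1}
    return (x + x**2)*vj.diff(x) - j*vj + vjm3_prime + b01*(j - 1)*vjm1

assert simplify(recursion(v1, 1, v0)) == 0
assert simplify(recursion(v2, 2, v1)) == 0
```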
The reader may have noticed that the nonzero focus quantities and reduced fo-
cus quantities listed for systems (3.129) and (3.130) at the beginning of the proofs
of Theorems 3.7.1 and 3.7.2 are homogeneous polynomials of increasing degree.
While the nonzero focus quantities and reduced focus quantities need not be homo-
geneous in general, the fact that these are is not a coincidence. It stems from the fact
that the nonlinearities in families (3.129) and (3.130) are themselves homogeneous.
These ideas are developed in Exercises 3.40 and 3.41.

3.8 The Center Problem for Liénard Systems

In Section 3.5 we studied systems that are symmetric with respect to the action of
a linear group followed by reversion of time. In the present section we will study
a more complex symmetry, the symmetry that is the result of the action, not of a
linear group of affine transformations, but of an analytic invertible transformation
followed by reversion of time. This kind of symmetry is sometimes called general-
ized symmetry. We will analyze the important family of real analytic systems of the
form
ẋ = y, ẏ = −g(x) − y f (x), (3.148)
which are equivalent to the second-order differential equation

ẍ + f (x)ẋ + g(x) = 0. (3.149)

Any system of the form (3.148) is known as a Liénard system; the corresponding
differential equation (3.149) is called a Liénard equation. Equations of this type
arise frequently in the study of various mathematical models of physical, chemical,
and other processes. We will not complexify system (3.148), so in this section x and
y will denote real variables.
Our standing assumptions will be that the functions f and g are real analytic in a
neighborhood of the origin and that

g(0) = 0, g′ (0) > 0. (3.150)

The condition g(0) = 0 is equivalent to system (3.148) having a singularity at the


origin; the condition g′(0) > 0 is equivalent to its linear part having a positive determinant
there. In such a case the origin is an antisaddle if and only if f(0)² < 4g′(0),

but we will not make use of this condition. F and G will denote the particular an-
tiderivatives of f and g given by
F(x) = ∫₀ˣ f(s) ds,   G(x) = ∫₀ˣ g(s) ds. (3.151)

We will show that a Liénard system has a center at the origin only if it possesses a
generalized symmetry, and identify all polynomial Liénard systems having a center
at the origin. We begin with two criteria for distinguishing between a center and a
focus at the origin in system (3.148).
Theorem 3.8.1. Suppose system (3.148) satisfies (3.150). Then (3.148) has a center
at the origin if and only if the functions F and G defined by (3.151) are related by
F(x) = Ψ (G(x)) for some analytic function Ψ for which Ψ (0) = 0.
Proof. By means of the so-called Liénard transformation y1 = y + F(x), we obtain
from (3.148) the system

ẋ = y − F(x), ẏ = −g(x), (3.152)

where the subscript 1 in y1 has been dropped.


Since 2G(x) = g′(0)x² + · · · , we may introduce a new variable by setting

u = υ(x) = sgn(x) √(2G(x)). (3.153)

Condition (3.150) implies that υ(x) is an analytic, invertible function on a neighborhood
of x = 0 of the form υ(x) = √(g′(0)) x + O(x²). Let x = ξ(u) denote its inverse.
The change of coordinates u = υ(x), y = y transforms (3.152) into the form

u̇ = (g(ξ(u))/u) (y − F(ξ(u))),   ẏ = −g(ξ(u)). (3.154)

Because g(ξ(u))/u = √(g′(0)) + O(u) is analytic and different from zero in a neighborhood
of the origin, we conclude that the origin is a center for system (3.154),
equivalently for systems (3.152) and (3.148), if and only if it is a center for the
system
u̇ = y − F(ξ (u)), ẏ = −u. (3.155)
Consider the power series expansion of F1(u) := F(ξ(u)), F1(u) = ∑_{k=1}^∞ a_k u^k. We
claim that the origin is a center for (3.155) if and only if

a2k−1 = 0 for all k ∈ N. (3.156)

Indeed, if (3.156) holds, then (3.155) is time-reversible, hence has a center at the
origin. Conversely, suppose that not all as with odd index s vanish, and let the first
such nonzero coefficient be a2m+1 . Then, as we have just seen, the origin is a center
for the system

u̇ = y − ∑_{k=1}^∞ a_{2k} u^{2k} := y − F̂1(u),   ẏ = −u, (3.157)

hence, according to Theorem 3.2.9, it has a first integral Φ(u, y) = u² + y² + · · · on
a neighborhood of the origin. This Φ is a Lyapunov function for system (3.155); Φ̇ is

(∂Φ/∂u)(y − F̂1(u)) − (∂Φ/∂y) u − (∂Φ/∂u)(a_{2m+1} u^{2m+1} + · · ·) = −2 a_{2m+1} u^{2m+2} (1 + · · ·).
By Theorem 2.1.4, the origin is a stable focus if a2m+1 > 0 and an unstable focus if
a2m+1 < 0.
Thus we have shown that the origin is a center for (3.155), hence for (3.148),
if and only if F(ξ(u)) = h(u²) for some analytic function h for which h(0) = 0.
However, by (3.153) we have u² = 2G(ξ(u)), which means that the theorem follows
with Ψ(v) = h(2v). 

Theorem 3.8.2. Suppose system (3.148) satisfies (3.150). Then it has a center at the
origin if and only if there exists a function ζ (x) that is defined and analytic on a
neighborhood of 0 and satisfies

ζ (0) = 0, ζ ′ (0) < 0 (3.158)

and
F(x) = F(ζ (x)), G(x) = G(ζ (x)) , (3.159)
where F and G are the functions defined by (3.151).

Proof. Define a real analytic function from a neighborhood of (0, 0) ∈ R² into R by

Ĝ(x, z) = G(x) − G(z)
        = ½ g′(0)(x² − z²) + Ĝ3 · (x³ − z³) + · · ·
        = (x − z)[ ½ g′(0)(x + z) + R(x, z) ],

where R(x, z) is a real analytic function that vanishes together with its first partial
derivatives at (0, 0). Because g′(0) ≠ 0, by the Implicit Function Theorem the
equation G(x) − G(z) = 0 defines, in addition to z = x, a real analytic function
z = ζ(x) = −x + O(x²) on a neighborhood of 0 in R. That is, the second equation in
(3.159) always has a unique real analytic solution z = ζ(x) satisfying (3.158).
As in the proof of Theorem 3.8.1, define a function u = υ(x) = sgn(x) √(2G(x))
and its inverse x = ξ(u). Then 2G(ξ(−u)) = (−u)² = u² = 2G(ξ(u)), so that
G(ξ(−u)) = G(ξ(u)). But ξ(−υ(x)) = −x + O(x²), so we conclude that in fact
z = ζ(x) = ξ(−υ(x)).
The proof of Theorem 3.8.1 showed that (3.148) has a center if and only if
F1 (u) = F(ξ (u)) is an even function, that is, if and only if F(ξ (u)) = F(ξ (−u)),
equivalently, if and only if

F(ξ(υ(x))) − F(ξ(−υ(x))) = F(x) − F(ζ(x)) = 0,

and the theorem follows. 



The next proposition, in conjunction with Theorem 3.8.2, shows that the gener-
alized symmetry is the only mechanism yielding a center in Liénard systems.

Proposition 3.8.3. Suppose system (3.148) satisfies (3.150). Then the origin is a
center for (3.148) only if there exists an analytic invertible transformation T of the
form T (x, y) = (ζ (x), y), defined on a neighborhood of the origin, such that (3.148)
is invariant with respect to an application of T and a reversal of time.

Proof. Suppose the origin is a center for (3.148), let z = ζ (x) be the function given
by Theorem 3.8.2, and let ξ (z) be the inverse of ζ (x). Define a transformation
T (x, y) by (z, y1 ) = T (x, y) = (ζ (x), y), so that T is an analytic invertible transfor-
mation defined on a neighborhood of (0, 0).
Differentiation of (3.159) yields

F ′ (x) = F ′ (ζ (x)) · ζ ′ (x), G′ (x) = G′ (ζ (x)) · ζ ′ (x) . (3.160)

Thus under application of T we have

ż = ζ ′ (x)ẋ = ζ ′ (x)y1 = ζ ′ (ξ (z))y1

and, using (3.160),

ẏ1 = ẏ = −G′(x) − yF′(x)
    = −G′(ζ(x))ζ′(x) − yF′(ζ(x))ζ′(x) = ζ′(ξ(z))[−g(z) − y1 f(z)] .

That is, under the transformation T the system becomes

ż = Z(z) y1 , ẏ1 = Z(z)[−g(z) − y1 f (z)]

for Z : R → R : z ↦ ζ′(ξ(z)) = −1 + · · · , whose orbits in a neighborhood of the
origin are precisely those of (3.148) but with the sense reversed. 

We now study in more detail the situation that f and g are polynomials, in which
case F and G are polynomials as well. To do so, we first recall that if k is a field
and if polynomials p, q ∈ k[x] both have full degree at least one, then their resultant,
Resultant(p, q, x), is a polynomial in the coefficients of p and q that takes the value
zero if and only if p and q have a common factor. For p, q ∈ k[x, z], if neither p(x, 0)
nor q(x, 0) reduces to a constant, then we can regard p and q as elements of k[x] with
coefficients in k[z] and form their resultant, Resultant(p, q, x), to obtain a polynomial
in k[z], and it proves to be the case that p and q have a common factor in k[x, z] if
and only if Resultant(p, q, x) = 0 ([60, Chap. 3, §6]).

Proposition 3.8.4. Suppose that in (3.148) f and g are polynomial functions and
that (3.150) holds. Then the origin is a center only if the resultant with respect to x
of
(F(x) − F(z))/(x − z)   and   (G(x) − G(z))/(x − z)

is equal to zero. Conversely, if the resultant with respect to x of these two polynomi-
als is zero, and if the common factor that therefore exists vanishes at (x, z) = (0, 0),
then the origin is a center for system (3.148).
Proof. Since F(x) and G(x) are polynomials, a solution z = ζ (x) of (3.159) satisfy-
ing (3.158) corresponds to a common factor between F(x) − F(z) and G(x) − G(z)
other than x − z. Thus if such a solution z = ζ (x) exists, then
 
Resultant( (F(x) − F(z))/(x − z), (G(x) − G(z))/(x − z), x ) = 0. (3.161)

Conversely, suppose (3.161) holds and let c(x, z) be the common factor of the
two polynomials in (3.161). In Exercise 3.42 the reader is asked to show that if
c(0, 0) = 0, then the equation c(x, z) = 0 defines a function z = ζ (x) that satisfies
ζ ′ (0) = −1. Thus by Theorem 3.8.2 system (3.148) has a center at the origin. 
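The criterion is easy to try out with sympy. In the sketch below, the center example takes f(x) = 2x and g(x) = x, so F = x² and G = x²/2 are both polynomials in x² and the two quotient polynomials share the factor x + z; perturbing F by x³ destroys the common factor and the resultant becomes nonzero.

```python
# Proposition 3.8.4 on two small Lienard systems.
from sympy import symbols, resultant, cancel, expand

x, z = symbols('x z')

def crit(F, G):
    # Resultant of (F(x)-F(z))/(x-z) and (G(x)-G(z))/(x-z) w.r.t. x.
    p = cancel((F - F.subs(x, z))/(x - z))
    q = cancel((G - G.subs(x, z))/(x - z))
    return expand(resultant(p, q, x))

assert crit(x**2, x**2/2) == 0           # common factor x + z: center
assert crit(x**2 + x**3, x**2/2) != 0    # no common factor: no center
```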
We close this section with a result that shows how polynomial Liénard systems
with a center at the origin arise: they are precisely those polynomial systems (3.148)
for which the functions F and G given by (3.151) are polynomial functions of a com-
mon polynomial function h. To establish this result, we must examine the subfield
K of the field of rational functions generated by the polynomials F(x) and G(x).
In general, for an arbitrary field k we let k(x) denote the field of rational functions
with coefficients in k, that is,
 
k(x) = { α(x)/β(x) : α, β ∈ k[x], β ≠ 0 } ,

and for h ∈ k[x], let k(h) denote the smallest subfield of k(x) that contains h, namely,
 
k(h) = { α(h(x))/β(h(x)) : α, β ∈ k[x], β ≠ 0 } .

In particular, in reference to the Liénard system (3.148), for F and G as given by


(3.151), we may first form R(F) and, since R(F) is a field, in turn form the field
K := R(F)(G). Then K is the smallest subfield of R(x) that contains both F and
G or, in other words, is the subfield of R(x) generated by F and G.
Lemma 3.8.5. If an analytic function ζ (x) defined on a neighborhood of 0 satisfies
(3.159), then h(x) = h(ζ (x)) for every h ∈ K .
Proof. The result follows immediately from the fact that any element of K has the
form
    (A_N(x) G(x)^N + · · · + A_0(x)) / (B_M(x) G(x)^M + · · · + B_0(x)),
where each A_j and B_j has the form
    (a_n F(x)^n + · · · + a_0) / (b_m F(x)^m + · · · + b_0)

for M, N, m, n ∈ N0 and a j , b j ∈ R. 

Lemma 3.8.6. There exists a polynomial h ∈ R[x] such that K = R(h).

Proof. Because R ⊂ K ⊂ R(x) and K contains a nonconstant polynomial, the


result follows immediately from the following result of abstract algebra. 

Proposition 3.8.7. Let k and K be fields such that k ⊂ K ⊂ k(x). If K contains a


nonconstant polynomial, then K = k(h) for some h ∈ k[x].

Proof. Lüroth’s Theorem (see [195]) states that any subfield of k(x) that strictly
contains k is isomorphic to k(x). Thus there is a field isomorphism ξ : k(x) → K.
Let r ∈ K denote the image of the identity function ι (x) = x under ξ . Then for any
element of k(x),
 
ξ( (a_n x^n + · · · + a_0)/(b_m x^m + · · · + b_0) ) = (a_n ξ(x)^n + · · · + a_0)/(b_m ξ(x)^m + · · · + b_0) = (a_n r^n + · · · + a_0)/(b_m r^m + · · · + b_0) ∈ k(r) .

Since ξ is one-to-one and onto, we conclude that K = k(r).


We know that r = A/B for some polynomials A, B ∈ k[x]; we must show that B is
a constant. Since the theorem is trivially true if A is a constant multiple of B, without
loss of generality we may assume that A and B are relatively prime. It is well known
([195, §63]) that for any a, b, c, d ∈ k for which ad − bc ≠ 0,

k(r) = k( (ar + b)/(cr + d) ) = k( (aA + bB)/(cA + dB) ),

hence without loss of generality we may also assume that deg(A) > deg(B) (Exer-
cise 3.43).
Let H ∈ k[x] be any nonconstant polynomial in K. Then because K = k(A/B), we
have
H = (a_n (A/B)^n + · · · + a_1 (A/B) + a_0) / (b_m (A/B)^m + · · · + b_1 (A/B) + b_0)
for some a j , b j ∈ k. It must be the case that n > m, for if n ≤ m, then clearing
fractions yields

H · (b_m A^m + · · · + b_1 A B^(m−1) + b_0 B^m) = a_n A^n B^(m−n) + · · · + a_1 A B^(m−1) + a_0 B^m .

However, the degree of the polynomial on the left is deg(H) + m deg(A), while the
degree of the polynomial on the right is at most m deg(A). Clearing fractions thus
actually yields

H · (b_m A^m B^(n−m) + · · · + b_1 A B^(n−1) + b_0 B^n) = a_n A^n + · · · + a_1 A B^(n−1) + a_0 B^n .

If, contrary to what we wish to show, deg(B) > 0, then B is a divisor of the polyno-
mial on the left but is relatively prime to the polynomial on the right, a contradiction.
Thus B is constant, as we wished to show. 

Theorem 3.8.8. Suppose that the functions f and g in (3.148) are polynomials for
which (3.150) holds, and let F and G be the functions defined by (3.151). The origin
is a center for system (3.148) if and only if there exist a polynomial h ∈ R[x] that
satisfies
h′(0) = 0   and   h′′(0) ≠ 0 (3.162)
and polynomials α , β ∈ R[x] such that F(x) = α (h(x)) and G(x) = β (h(x)).
Proof. Suppose there exist polynomials α, β, h ∈ R[x] such that F(x) = α(h(x)),
G(x) = β(h(x)), and h satisfies (3.162), say h(x) = a_n x^n + · · · + a_2 x² + a_0, a_2 ≠ 0.
Then for x ≠ z, U(x, z) = (h(x) − h(z))/(x − z) is the same as the polynomial function
V(x, z) that results by factoring x − z out of the numerator of U and cancelling it
with the denominator, and V(x, z) = a_2 (x + z) + · · · . But V satisfies V(0, 0) = 0
and Vx(0, 0) = Vz(0, 0) = a_2 ≠ 0, hence by the Implicit Function Theorem there
is an analytic function ζ defined on a neighborhood of 0 such that ζ (0) = 0 and
in a neighborhood of (0, 0) in R2 , V (x, z) = 0 if and only if z = ζ (x). Moreover,
ζ′(0) = −1. There is thus a neighborhood of 0 in which ζ(x) ≠ x for x ≠ 0, hence in
which V = 0 is equivalent to U = 0. Thus h(x) = h(ζ (x)) holds on a neighborhood
of 0. Since F(x) = α (h(x)) = α (h(ζ (x))) = F(ζ (x)) and similarly G(x) = G(ζ (x)),
by Theorem 3.8.2 the origin is a center for system (3.148).
Conversely, assume that the origin is a center for the polynomial Liénard sys-
tem (3.148), and let ζ be the analytic function defined on a neighborhood of the
origin, satisfying (3.159) and (3.158), as provided by Theorem 3.8.2. If h is the
polynomial provided by Lemma 3.8.6, then F(x) = α (h(x)) and G(x) = β (h(x)) for
some polynomials α , β ∈ R[x], since both F and G lie in K . But then by Lemma
3.8.5, h(x) = h(ζ (x)), hence h′ (x) = h′ (ζ (x))ζ ′ (x). Evaluating this latter equation
at x = 0 and applying (3.158) yields h′ (0) = 0. Similarly, differentiating the identity
G(x) = β(h(x)) twice and evaluating at x = 0 yields G′′(0) = β′(h(0)) h′′(0); since
G′′(0) = g′(0) > 0 (by (3.150)) we have h′′(0) ≠ 0, so that h satisfies (3.162). 
Theorem 3.8.8 gives us a means for constructing polynomial Liénard systems
having a center at the origin.
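For instance, taking h(x) = x² + x³ (so that h′(0) = 0 and h′′(0) = 2 ≠ 0) together with α(t) = t + t² and β(t) = t/2 produces f = F′ and g = G′ with g′(0) = 1 > 0 and a center at the origin. A sympy sketch confirming the construction via the resultant criterion of Proposition 3.8.4:

```python
# Construct a polynomial Lienard center from a seed polynomial h.
from sympy import symbols, cancel, expand, resultant

x, z = symbols('x z')

h = x**2 + x**3       # h'(0) = 0, h''(0) = 2 != 0
F = h + h**2          # F = alpha(h) with alpha(t) = t + t^2
G = h/2               # G = beta(h) with beta(t) = t/2, so g'(0) = 1 > 0
f = F.diff(x)         # f and g of system (3.148)
g = G.diff(x)

p = cancel((F - F.subs(x, z))/(x - z))
q = cancel((G - G.subs(x, z))/(x - z))
assert expand(resultant(p, q, x)) == 0    # the resultant vanishes

# The common factor (h(x) - h(z))/(x - z) vanishes at the origin,
# so the converse half of Proposition 3.8.4 gives a center.
c = cancel((h - h.subs(x, z))/(x - z))
assert c.subs({x: 0, z: 0}) == 0
```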

3.9 Notes and Complements

The center problem for polynomial systems dates back about 100 years, beginning
with Dulac’s 1908 study [64] of the quadratic system (3.131). He showed that for
this system a first integral of the form (3.52) exists if and only if by means of a linear
transformation the equation of the trajectories can be brought into one of 11 forms,
each of which contains at most two parameters. He also found first integrals for all
11 forms. The center problem for quadratic systems was also solved by Kapteyn
([103, 104]) in 1911 and 1912.
Neither Dulac nor Kapteyn, however, gave explicit conditions on the coefficients
of the system for the existence of a center. Thus they did not give so-called “nec-
essary and sufficient center conditions.” This problem was considered for the first

time by Frommer ([74]), who appears to have been unaware of the work of Du-
lac and Kapteyn. He investigated not the complex system (3.129), but the corre-
sponding real quadratic system of the form (3.111), for which calculations are much
more difficult. For this reason, perhaps, his center conditions were incomplete and
partially incorrect. Somewhat later, correct coefficient conditions for a center for
the real quadratic system were obtained by Saharnikov ([168]), Sibirsky ([174, 175]),
Malkin ([134]), and others ([76, 206]). For practical use we note in particular the
discriminant quantities of Li Chengzhi, computed from the normal form (2.75), whose
successive vanishing (and sign) determines that the origin is a first-, second-, or
third-order fine focus (and of what asymptotic stability) or a center (Theorem 12.2
of [202], or Theorem II.5.2 of [205]). (Recall Remark 3.1.6 for the definition of a
fine focus.)
The conditions obtained by different authors for the same system often look
markedly different. But these “center conditions” are simply the zero set of a system
of polynomials, the focus quantities, hence the set of systems (3.131) with a center
is a complex variety (a real variety in the case of real systems of the form (3.2)). This
makes it natural to formulate the center problem as we did above: find an irreducible
decomposition of the center variety defined by prime ideals. Then the answer will
be unique in the sense that we can easily check if the conditions obtained by differ-
ent authors are the same simply by computing the reduced Gröbner bases. It is thus
surprising that, given the substantial amount of work devoted to this topic, the con-
cept of the center variety apparently first appeared explicitly in the literature only
relatively recently, in the paper [210] of Żoła̧dek. The irreducible decomposition of
the center variety of quadratic systems was obtained for the first time in [154].
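The comparison just described is easy to mechanize: compute the reduced Gröbner basis of each set of conditions with respect to the same term order and test for equality. A sketch in Python with SymPy (the two generating sets below are toy illustrations of the method, not center conditions from the literature):

```python
from sympy import symbols, groebner

a, b = symbols('a b')

# Two different-looking "condition" sets; both generate the ideal <a, b>.
conds1 = [a + b, a - b]
conds2 = [2 * a + 3 * b, 5 * a + 7 * b]

G1 = groebner(conds1, a, b, order='lex')   # reduced Groebner basis
G2 = groebner(conds2, a, b, order='lex')

# Identical reduced Groebner bases <=> identical ideals.
print(list(G1.exprs), list(G2.exprs), list(G1.exprs) == list(G2.exprs))
```

Since the reduced Gröbner basis of an ideal is unique for a fixed term order, equality of the two bases is a complete test that the two condition sets cut out the same ideal.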
In Section 3.4 we described the Focus Quantity Algorithm (page 128) for
computing the focus quantities for family (3.3), which is based on the work in
[149, 150, 153]. Of course, there are many other algorithms for computing these
and similar quantities. The reader can consult, for example, [71, 77, 80, 121, 125,
129, 130, 131, 196]. In particular, an algorithm for computing focus quantities is
available in the Epsilon library of Maple (see [199]).
The p : −q resonant center problem has become the subject of study relatively re-
cently. The discussion in Section 3.4 related to the focus quantities for family (3.3),
leading to the Focus Quantity Algorithm, has a direct generalization to the family
(3.63) containing the p : −q resonant centers, leading to an analogous algorithm for
the quantities g_{kq,kp} of (3.68). The only change in the algorithm is that the expression for the quantity M becomes ⌊K(p + q)/w⌋, the condition in the IF statement changes to L_1(ν)p = L_2(ν)q, and the formulas for V^{(ν)} and g_{kk}^{(ν)} are replaced by those of Exercises 3.18 and 3.19. See [150] for a complete treatment. Other results on
p : −q resonant centers are obtained in [57]. The center problem for quadratic sys-
tems with a 1 : −2 resonant singular point has been solved by Fronville, Sadovskii,
and Żoła̧dek ([75, 210]).
In the section devoted to time-reversible systems we considered the simplest case
of reversibility, namely, reversibility with respect to the action of the rotation group.
This concept can be generalized, and it is an interesting and important problem
to study the reversibility of differential equations with respect to different classes of

rational and analytic transformations. See [124] and [208]. We note that this problem
is also important from the point of view of physical applications; see the survey of
Lamb and Roberts ([108]) and the references therein.
In Sections 3.6 and 3.7 we discussed the Darboux method of integrability as a
tool for investigating the center problem. However, it also provides a way to con-
struct elementary and Liouvillian first integrals of ordinary differential equations
and is therefore also an efficient tool in the study of the important problem of the
integrability of differential equations in closed form. In recent years the ideas of
Darboux have been significantly developed; see [54, 55, 135, 170] and the refer-
ences they contain. In particular, Jouanalou ([101]) extended the Darboux theory to
polynomial systems in Rn and Cn . In [144] Prelle and Singer showed that if a poly-
nomial vector field has an elementary first integral, then that first integral can be
computed using the Darboux method. Later Singer ([182]) proved that if a polyno-
mial vector field has Liouvillian first integrals, then it has integrating factors given
by Darbouxian functions. A good survey of all these results is [119].
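The computation underlying Darboux's method — checking that X f is divisible by f and reading off the cofactor K from X f = K f — takes only a few lines in a computer algebra system. A sketch in Python with SymPy, applied to system (3.165) of Exercise 3.28 and its partial integral f₁, so that part (a) of that exercise can be cross-checked:

```python
from sympy import symbols, diff, div, expand, factor

x, y = symbols('x y')

# System (3.165) from Exercise 3.28 and its claimed partial integral f1.
P = x - 3 * x**2 * y + 2 * x * y**2 + 5 * y**3
Q = -3 * y - x**3 + 4 * x**2 * y + 3 * x * y**2 - 2 * y**3
f1 = 1 - x**2 - 2 * x * y - y**2

Xf = expand(diff(f1, x) * P + diff(f1, y) * Q)   # derivative of f1 along the field
K, rem = div(Xf, f1, x, y)                       # X f1 = K * f1 + rem

print(factor(K), rem)   # a polynomial cofactor K, with remainder 0
```

A zero remainder certifies that f₁ = 0 is an invariant algebraic curve; the quotient K is its cofactor, here of degree exactly one less than the conclusion of Exercise 3.23 allows.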
An important problem from the point of view of practical usage of the Dar-
boux method is the study of possible degrees of invariant algebraic curves and their
properties. This problem is addressed in the papers of Campillo–Carnicer ([29]),
Cerveau–Lins Neto ([30]), Tsygvintsev ([188]), and Żoła̧dek ([211, 212]). This is
an interesting subject with the interplay of methods of differential equations and
algebraic geometry and is likely to be a fruitful field for future research.
A generalization of the Darboux method to higher-dimensional systems is pre-
sented in [120].
In Section 3.7 we presented the solution of the center problem for the gen-
eral quadratic system. If we go further and consider the center problem for the
general cubic system (3.100), we face tremendous computational difficulties. Al-
though it appears that by using the modern computational facilities it is possible
to compute enough focus quantities to determine the center variety, the polyno-
mials obtained are so cumbersome that it is impossible to carry out a decompo-
sition of the center variety. Thus current research is devoted to the study of vari-
ous subfamilies of (3.100). One important subfamily is the cubic Liénard system
u̇ = −v, v̇ = u + ∑_{k+p=2}^{3} u^k v^p. The center problem for this system has been solved
by Sadovskii ([166]) and Lloyd and Pearson ([49, 122]). Another variant is to elimi-
nate all quadratic terms, that is, to study the center problem for a planar linear center
perturbed by third-degree homogeneous polynomials. This problem was studied ini-
tially by Al’muhamedov ([3]) and Saharnikov ([169]), who obtained sufficient con-
ditions for a center. The complete solution of the problem was obtained by Malkin
([133]) and Sadovskii ([163]) for the real and complex cases, respectively; see The-
orem 6.4.3. Later work on this problem appears in [58, 76, 128, 118, 207]. Similarly,
one could also consider the situation in which the perturbation terms are fourth- or
fifth-degree homogeneous polynomials; see [34, 35]. For results on the center prob-
lem for some particular subfamilies of the cubic system (not restricting to homoge-
neous nonlinearities) see [22, 70, 122, 123, 117, 167, 186, 196] and the references
they contain.

The results on the center problem for the Liénard system presented in Section
3.8 are due mainly to Cherkas ([43]), Christopher ([50]), and Kukles ([105, 106]).
We have not considered the center problem in the interesting but much more
difficult case that the singular point is not simple, that is, when the determinant of
the linear part of the right-hand sides vanishes at the singular point. For treatment
of this situation the reader can consult [9, 10, 11, 33, 99, 111, 138, 164, 165, 179].
In sharp contrast to the theory that we have presented here, when the linear part
vanishes identically the center problem is not even algebraically solvable ([98]).

Exercises

3.1 Without appealing to the Hartman–Grobman Theorem, show that if df(0) has
eigenvalues α ± iβ with αβ ≠ 0, then the phase portraits of u̇ = f(u) and
u̇ = df(0) · u in a neighborhood of the origin, the latter of which is a focus,
are topologically equivalent (that is, that there is a homeomorphism of a neigh-
borhood of the origin onto its image mapping orbits of one system (as point
sets) onto those of the other).
Hint. Show that 0 is the only singularity of the nonlinear system near 0. Then
show that there exists a simple closed curve surrounding 0 that every non-
stationary trajectory of the nonlinear system near 0 intersects exactly once.
3.2 Find a nonsingular linear transformation that changes

      u̇ = au + bv + ∑_{j+k=2}^{∞} U_{jk} u^j v^k,
      v̇ = cu + dv + ∑_{j+k=2}^{∞} V_{jk} u^j v^k          (3.163)

into (3.4) when the eigenvalues of the matrix ( a b ; c d ) of the linear part of
system (3.163) are α ± iβ, β ≠ 0.
3.3 Prove Proposition 3.1.4.
Hint. The geometry of the return map.
3.4 A real quadratic system of differential equations that has a weak focus (one for
which the real part of the eigenvalues of the linear part is zero) or a center at the
origin can be placed in the form

u̇ = −v + a20u2 + a11uv + a02v2 , v̇ = u + b20u2 + b11uv + b02v2 .

a. Show that the complex form (3.19) (or (3.51)) of this family of systems is
ẋ = ix + c20 x2 + c11xx̄ + c02 x̄2 . Find the coefficients c jk ∈ C in terms of the
real coefficients a jk and b jk .
b. Confirm that the transformation ξ : R6 → R6 that carries (a20 , a11 , . . . , b02 )
to (Re c20 , Im c20 , . . . , Im c02 ) is an invertible linear transformation.

3.5 Show that if in the real system (3.2) U and V are polynomial functions that
satisfy max{deg U, deg V} = n, then the complexification (3.19) (or (3.51)) has
the form ẋ = ix + R(x, x̄), where R is a polynomial without constant or linear
terms that satisfies deg R = n, and that the transformation from R(n+1)(n+2)−6 to
R(n+1)(n+2)−6 that carries the coefficients of U and V to the real and imaginary
parts of those of R is an invertible linear transformation.
3.6 Show that the line x2 = x̄1 is invariant under the flow of system (3.21) on C2 .
Show that if the invariant line is viewed as a 2-plane in real 4-space, then the
flow on the plane is the flow of the original real system (3.4). (Part of the prob-
lem is to make these statements precise.)
3.7 [Referenced in Proposition 3.2.1 and Theorem 3.2.7, proofs.] Writing the
higher-order terms in (3.22) as X_j(x_1, x_2) = ∑_{α+β≥2} X_j^{(α,β)} x_1^α x_2^β, j = 1, 2, show
that a member of family (3.22) arises from a real system if and only if λ_2 = λ̄_1
and X_2^{(α,β)} = X̄_1^{(β,α)}.
3.8 Suppose that in system (3.22), λ_1/λ_2 = −p/q for p, q ∈ N, GCD(p, q) = 1, and
that Ψ is a formal first integral of (3.22) that does not have a constant term and
begins with terms of order no higher than p + q. Show that Ψ must have the
form Ψ(x_1, x_2) = x_1^q x_2^p + · · · .
Hint. One approach is to (i) show that all partial derivatives of Ψ through order
p + q − 1 vanish at (0, 0); and then (ii) show that if not all partial derivatives of
Ψ through order p + q vanish at (0, 0), then Ψ(x_1, x_2) = x_1^q x_2^p + · · · .
3.9 [Referenced in Theorems 3.2.7 and 3.2.9, proofs.] Let P(u, v), Q(u, v), and
Ψ (u, v) be real functions of real variables u and v and suppose that for com-
plex variables x and y the functions
     
x+y x−y x+y x−y x+y x−y
P , , Q , , and Ψ ,
2 2i 2 2i 2 2i

are well-defined. Show that Ψ (u, v) is a formal first integral for the system

u̇ = P(u, v)
v̇ = Q(u, v)

if and only if  
def x+y x−y
Φ (x, y) = Ψ ,
2 2i
is a formal first integral for the system
   
x+y x−y x+y x−y
ẋ = P , + iQ ,
2 2i 2 2i
   
x+y x−y x+y x−y
ẏ = P , − iQ ,
2 2i 2 2i
   
x+y x−y x+y x−y
=P , − iQ , .
2 2i 2 2i

3.10 [Referenced in Theorem 3.2.7, proof.] Show that if system (3.4) has a first inte-
gral of the form Ψ (u, v) = u2 + v2 + · · · , then the origin is a center.
Hint. Restricting Ψ to a sufficiently small neighborhood of the origin, for
(u0 , v0 ) sufficiently near the origin Ψ (u0 , v0 ) is a regular value of Ψ , hence,
by an appropriate result of differential geometry and the form of Ψ , the set
{(u, v) : Ψ (u, v) = Ψ (u0 , v0 )} is an oval.
3.11 Consider the following system (analogous to an example of R. Moussu) of poly-
nomial differential equations on R2 :

u̇ = −v3 , v̇ = u3 + u2 v2 . (3.164)

Note that in the absence of the quartic terms the system is Hamiltonian, hence
has an analytic first integral.
a. Show that (0, 0) is an antisaddle of focus or center type, say by passing to
polar coordinates as at the beginning of Section 3.1.
b. Show that (0, 0) is in fact a center by finding a line of symmetry for the orbits
as point sets.
c. Show that if H(u, v) = ∑ h_{jk} u^j v^k is constant on orbits of (3.164), then h_{jk} = 0
for all (j, k) ∈ N_0 × N_0 \ {(0, 0)}.
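Symmetry conditions of this kind can be verified mechanically. For system (3.164), reversibility with respect to reflection in the u-axis amounts to the first component of the field being odd and the second even in v, which a short SymPy check confirms (a sketch of the verification behind part (b)):

```python
from sympy import symbols, expand

u, v = symbols('u v')

# Right-hand sides of system (3.164).
U = -v**3
V = u**3 + u**2 * v**2

# Time-reversibility with respect to reflection in the u-axis,
# sigma(u, v) = (u, -v): requires U(u,-v) = -U(u,v) and V(u,-v) = V(u,v).
odd_in_v = expand(U.subs(v, -v) + U)    # 0 iff U is odd in v
even_in_v = expand(V.subs(v, -v) - V)   # 0 iff V is even in v
print(odd_in_v, even_in_v)
```

Both expressions vanish identically, so the orbits are symmetric as point sets about the u-axis and every orbit near the antisaddle closes up.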
3.12 Refer to Definition 3.3.3. Suppose s ∈ N is such that V(Bs ) = V(B) = VC .
Prove the following implication: if for some r ≥ s Br is a radical ideal, then for
every j ≥ r, B j = Br = B (hence is a radical ideal).
3.13 Explain why VCR is a real affine variety.
3.14 Does (3.82) hold for system (3.48)? What is the structure of the focus quantities
for this system?
3.15 Let S = {(p_1, q_1), . . . , (p_ℓ, q_ℓ)} be the indexing set for family (3.3) and fix a
number K ∈ N. Let w = min{p_j + q_j : (p_j, q_j) ∈ S}. Show that if ν ∈ N_0^{2ℓ}
is such that |ν| > ⌊2K/w⌋, where ⌊r⌋ denotes the greatest integer less than or
equal to r, then ν ∉ Supp(g_{kk}) for 1 ≤ k ≤ K.
3.16 Prove Corollary 3.4.6.
3.17 Using the algorithm of Section 3.4, write a code in an available computer al-
gebra system and compute the first three focus quantities of system (3.129).
Check that the quantities thus obtained generate the same ideal as the quantities
(3.133)–(3.135).
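As an independent cross-check on any such implementation, a single focus quantity can also be extracted directly from the real Lyapunov-function ansatz V = (u² + v²)/2 + ···, by solving the linear equations that force V̇ = c(u² + v²)² + O(|(u, v)|⁵). The sketch below is our own construction, not the book's Focus Quantity Algorithm; the Hamiltonian test system and the normalization b22 = 0 are illustrative choices:

```python
import sympy as sp

u, v = sp.symbols('u v')
c = sp.Symbol('c')   # candidate focus quantity: Vdot = c*(u^2 + v^2)^2 + O(5)

# Test system (our own choice): the Hamiltonian center u' = -v, v' = u + 3u^2,
# with H = (u^2 + v^2)/2 + u^3, for which the quantity must vanish.
P = -v
Q = u + 3 * u**2

a30, a21, a12, a03 = sp.symbols('a30 a21 a12 a03')
b40, b31, b13, b04 = sp.symbols('b40 b31 b13 b04')   # b22 pinned to 0

V = ((u**2 + v**2) / 2
     + a30 * u**3 + a21 * u**2 * v + a12 * u * v**2 + a03 * v**3
     + b40 * u**4 + b31 * u**3 * v + b13 * u * v**3 + b04 * v**4)

# Vdot along the field, minus the target normal form c*(u^2 + v^2)^2.
target = sp.expand(sp.diff(V, u) * P + sp.diff(V, v) * Q - c * (u**2 + v**2)**2)

# Force every term of total degree 3 and 4 to vanish and solve the linear system.
eqs = [coeff for monom, coeff in sp.Poly(target, u, v).terms()
       if 3 <= sum(monom) <= 4]
sol = sp.solve(eqs, [a30, a21, a12, a03, b40, b31, b13, b04, c], dict=True)[0]
print(sol[c])   # 0: consistent with the origin being a center
```

For a genuine focus the same computation returns a nonzero c, whose sign determines the stability of the origin.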
3.18 Imitate the proof of Theorem 3.4.2 to prove the following analogous theorem
for family (3.63). (Definition 3.4.1 is identical except that it is stated for all
( j, k) ∈ N−q × N−p.)
Theorem. Let family (3.63) be given, where p, q ∈ N, GCD(p, q) = 1. There
exist a formal series Ψ(x, y) of the form (3.64) and polynomials g_{q,p}, g_{2q,2p},
g_{3q,3p}, . . . in C[a, b] such that
1. equation (3.68) holds;
2. for every pair (j, k) ∈ N_{−q} × N_{−p} such that j + k ≥ 0, v_{jk} ∈ Q[a, b], and v_{jk}
is a (j, k)-polynomial;
3. for every k ≥ 1, v_{kq,kp} = 0; and
4. for every k ≥ 1, g_{kq,kp} ∈ Q[a, b], and g_{kq,kp} is a (kq, kp)-polynomial.

3.19 For p, q ∈ N with GCD(p, q) = 1, modify the definition (3.76) of V^{(ν)} so that
(3.76b) becomes V^{(ν)} = 0 if L_1(ν)p = L_2(ν)q and (3.76c) becomes

      V^{(ν)} = [1/(L_1(ν)p − L_2(ν)q)]
                × [ ∑_{j=1}^{ℓ} Ṽ(ν_1, . . . , ν_j − 1, . . . , ν_{2ℓ}) (L_1(ν_1, . . . , ν_j − 1, . . . , ν_{2ℓ}) + q)
                  − ∑_{j=ℓ+1}^{2ℓ} Ṽ(ν_1, . . . , ν_j − 1, . . . , ν_{2ℓ}) (L_2(ν_1, . . . , ν_j − 1, . . . , ν_{2ℓ}) + p) ]

(but maintain V^{(0,...,0)} = 1 as before). On the assumption that V^{(ν)} = 0 if
L_1(ν) < −q or if L_2(ν) < −p (the analogue of Lemma 3.4.4, which is true),
prove the following analogue of the first two parts of Theorem 3.4.5.
Theorem. Let family (3.63) be given, where p, q ∈ N, GCD(p, q) = 1. Let Ψ be
the formal series of the form (3.64) and let {g_{kq,kp} : k ∈ N} be the polynomials
in C[a, b] given by the theorem in Exercise 3.18. Then
1. for ν ∈ Supp(v_{k_1,k_2}), the coefficient v_{k_1,k_2}^{(ν)} of [ν] in v_{k_1,k_2} is V^{(ν)};
2. for ν ∈ Supp(g_{kq,kp}), the coefficient g_{kq,kp}^{(ν)} of [ν] in g_{kq,kp} is

      g_{kq,kp}^{(ν)} = −[ ∑_{j=1}^{ℓ} Ṽ(ν_1, . . . , ν_j − 1, . . . , ν_{2ℓ}) (L_1(ν_1, . . . , ν_j − 1, . . . , ν_{2ℓ}) + q)
                         − ∑_{j=ℓ+1}^{2ℓ} Ṽ(ν_1, . . . , ν_j − 1, . . . , ν_{2ℓ}) (L_2(ν_1, . . . , ν_j − 1, . . . , ν_{2ℓ}) + p) ].

3.20 Show that equations (3.86) are the conditions for the two types of symmetry of
a real system u̇ = Ũ(u, v), v̇ = Ṽ(u, v) with respect to the u-axis.
3.21 Prove that for the family of Example 3.5.7, I_sym = ⟨a_{10}a_{01} − b_{10}b_{01}⟩, that is, that
a_{10}a_{01} − b_{10}b_{01} divides [ν] − [ν̂] for every ν ∈ M.
3.22 Suppose H = ∏_{j=1}^{s} f_j^{α_j} is a first integral for system (3.101), where for each j,
α_j ∈ C and f_j is an irreducible element of C[x, y], and for k ≠ j, f_k and f_j are
relatively prime. Show that for each j, f_j | X f_j.
Hint. Factor f_j from the equation X H ≡ 0.
3.23 Prove that for the polynomial K in (3.103), deg K ≤ m−1. Construct an example
for which the inequality is strict.
3.24 Let X be the vector field corresponding to (3.101), let f1 , . . . , fs be elements of
C[x, y], and let I = h f1 , . . . , fs i. This exercise is a generalization of the discussion
that surrounds (3.102) and leads up to Proposition 3.6.2.
a. Prove that if X f j ∈ I for 1 ≤ j ≤ s, then V( f1 , . . . , fs ) is invariant under
the flow of X . (Invariance means that if η (t) = (x(t), y(t)) is a solution of
(3.101) for which η (0) ∈ V( f1 , . . . , fs ), then f j (η (t)) ≡ 0 for 1 ≤ j ≤ s.)

b. Prove conversely that if V(f_1, . . . , f_s) is invariant under the flow of X, then
X f_j ∈ √I for 1 ≤ j ≤ s (hence X f_j ∈ I for 1 ≤ j ≤ s if I is a radical ideal).
3.25 Suppose (3.101) has a Darboux first integral. Formulate conditions that imply
that every trajectory of (3.101) lies in an algebraic curve. (One condition was
given immediately following Definition 3.6.3.)
3.26 [Referenced in Theorem 3.6.4, proof.] Prove that the function H constructed in
the proof of Darboux’s Theorem (Theorem 3.6.4) is not constant.
3.27 Show that f (u, v) is a complex algebraic partial integral of the real system
(3.101) with cofactor K(u, v) if and only if f¯(u, v) is a complex algebraic partial
integral of the real system (3.101) with cofactor K̄(u, v), where the conjugation
is of the coefficients of the polynomials only.
3.28 Consider the system

ẋ = x − 3x2y + 2xy2 + 5y3, ẏ = −3y − x3 + 4x2y + 3xy2 − 2y3. (3.165)

a. Verify by finding their cofactors that both f_1(x, y) = 1 − x² − 2xy − y² and
f_2(x, y) = 6y + x³ − 3x²y − 9xy² − 5y³ are algebraic partial integrals.
b. Show that µ = f_1 f_2^{−2/3} is an integrating factor of (3.165) on some open set
Ω ⊂ C². What is the largest that Ω can be with respect to set inclusion?
c. Using part (b) and a computer algebra system, derive the first integral
Φ(x, y) = (2x − x³ − x²y + xy² + y³) f_2(x, y)^{1/3} of (3.165) on Ω. Explain
why Φ is not a first integral of (3.165) on any neighborhood of (0, 0) in C²
even though it is defined on C².
d. Explain why Ψ = (1/48)Φ³ is a first integral of (3.165) on any neighborhood of
(0, 0). Conclude that (3.165) has a 1 : −3-resonant center at (0, 0).
3.29 Consider the family

ẋ = x + ax3 + bx2y − xy2 , ẏ = −3y + x2y + b̃xy2 + c̃y3 , (3.166)

where the coefficients are restricted to lying in the variety V(I) ⊂ C4 of the
ideal I = h5bc̃+ 3b + b̃c̃ + 3b̃, 2ac̃+ 3a + c̃, 3ab − ab̃− b − b̃i. Obvious algebraic
partial integrals are x and y with their complementary factors in the polynomials
in ẋ and ẏ as their cofactors.
a. Show that when a ≠ −1/3, an integrating factor of (3.166) on the open set
Ω = C² \ {(x, y) : xy = 0} is µ(x, y) = x^α y^β, where α = −(9a + 1)/(3a + 1)
and β = −(5a + 1)/(3a + 1).
b. Using a computer algebra system, derive the first integral

      Φ(x, y) = (x³y)^{−2a/(3a+1)} [(2a + 1)(a + 1) + a(2a + 1)(a + 1)x² + 2a(2a + 1)xy + a(a + 1)y²]

of (3.166) on Ω, valid for a ∉ {−1/3, 0}.


c. Show that Ψ = [Φ/((2a + 1)(a + 1))]^{−(3a+1)/(2a)} is a first integral of (3.166)
on a neighborhood of (0, 0) in C², of the form x³y + · · · , for all systems
(3.166) corresponding to S = V(I) \ V(J), J = ⟨a(3a + 1)(2a + 1)(a + 1)⟩.

d. Show that the Zariski closure S̄ of S is V(I) by computing the ideal quotient
I : J and confirming that √I = √(I : J) using the Radical Membership Test.
e. Use (c) and (d) to conclude that every element of V(I) has a 1 : −3-resonant
center at the origin.
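The Radical Membership Test invoked in part (d) reduces to one Gröbner basis computation via the Rabinowitsch trick: f ∈ √I for I = ⟨g₁, . . . , g_s⟩ if and only if 1 ∈ ⟨g₁, . . . , g_s, 1 − w f⟩ in the polynomial ring enlarged by one extra variable w. A sketch in Python with SymPy (`in_radical` is our own helper name):

```python
from sympy import symbols, groebner

x, y, w = symbols('x y w')

def in_radical(f, gens, *ring):
    # f lies in the radical of <gens> iff 1 lies in <gens, 1 - w*f> in the
    # enlarged ring, iff the reduced Groebner basis of that ideal is [1].
    G = groebner(list(gens) + [1 - w * f], w, *ring, order='lex')
    return list(G.exprs) == [1]

print(in_radical(x, [x**2], x, y))   # True:  x is in the radical of <x^2>
print(in_radical(y, [x**2], x, y))   # False: y is not
```

Applying this test to the generators of each ideal in both directions establishes the equality of radicals asked for in part (d).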
3.30 Consider the family

ẋ = x − x3 , ẏ = −3y + ax3 + x2 y + bxy2, (a, b) ∈ C2 . (3.167)

a. By finding their cofactors, verify that the following five polynomials are
algebraic partial integrals of (3.167):

      f_1(x, y) = x,   f_2(x, y) = 1 + x,   f_3(x, y) = 1 − x,
      f_4(x, y) = −2 + (√(1 − ab) + 1)x² + bxy,
      f_5(x, y) = 4 − 4x² − 4bxy.

b. Show that for b(ab − 1) ≠ 0,

      Φ = f_2^{√(1−ab)} f_3^{√(1−ab)} f_4^{−2} f_5 = 1 + (b√(1 − ab)/2) x³y + · · ·

is an analytic first integral of (3.167) on a neighborhood of (0, 0) in C²,
hence that Ψ = (2/(b√(1 − ab)))[Φ − 1] is an analytic first integral of (3.167) on a
neighborhood of (0, 0) in C² of the form x³y + · · · .
c. Use Exercise 1.45(b) to conclude that (3.167) has a 1 : −3-resonant center at
the origin.
3.31 Show that system (3.115) in Example 3.6.11 has a stable (resp., unstable) focus
at the origin when A < 0 (resp., A > 0) by changing to polar coordinates and
integrating the differential equation (3.6) for the trajectories.
3.32 Verify the truth of (3.136).
3.33 In the context of Theorem 3.7.1, suppose (a, b) ∈ V3 ∩ V(b2,−1 b10 ).
a. Show that if (b2,−1 , b10 ) = (0, 0), then (a, b) ∈ V(Isym ), hence there is a cen-
ter at (0, 0).
b. Show that if (b_{2,−1}, b_{10}) ≠ (0, 0), then there exist irreducible algebraic partial
integrals h_1, h_2, and h_3 for which there exist suitable constants α_1, α_2, and
α_3 such that by Theorem 3.6.8, H = h_1^{α_1} h_2^{α_2} h_3^{α_3} is a first integral.
Hint. deg(h_1) = 1, deg(h_2) = deg(h_3) = 2.
3.34 Each of the systems ẋ = x − y², ẏ = −y + y² and ẋ = x + x² − y², ẏ = −y + x² − y²
corresponds to an element of the set V2 of Theorem 3.7.1. Show that the first
system has exactly two invariant lines, real or complex, and that the second has
exactly one.
3.35 System (3.2) with quadratic polynomial nonlinearities can be written in the form

      u̇ = −v − bu² − (2c + β)uv − dv²,
      v̇ = u + au² + (2b + α)uv + cv².          (3.168)

Necessary and sufficient conditions that there be a center at the origin as formu-
lated by Kapteyn are that at least one of the following sets of equalities holds:
I. a + c = b + d = 0;
II. α(a + c) = β(b + d) and α³a − (α + 3b)α²β + (β + 3c)αβ² − β³d = 0;
III. α + 5(b + d) = β + 5(a + c) = 2(a² + d²) + ac + bd = 0.
Derive these conditions by complexifying (3.168), relating the parameters there
to a10 , a01 , and so on, and using Theorem 3.7.1.
3.36 Check that in the context of Theorem 3.7.2, V(B_5) = ∪_{j=1}^{8} V_j using the
procedure described on page 156.
3.37 Prove that the origin is a center for system (3.2) if the system admits the
integrating factor (u² + v²)^α with |α| ≤ 2.
3.38 Consider the family of systems

ẋ = x − a10x2 − a20 x3 − a11x2 y − a02xy2 ,


(3.169)
ẏ = −y + b01y2 + b02y3 + b11xy2 + b20x2 y .

a. Show that when a11 = b11 and a20 + b20 = 0 the corresponding system
(3.169) has a center at the origin.
Hint. H(x, y) = (1 − a10x + b01y + x2 + y2 )/(xy) + a11 log xy .
b. Find the center variety of (3.169).
3.39 Find the center variety and its irreducible decomposition for the system

ẋ = x − a20x3 − a11x2 y − a02xy2 − a−13y3 ,


ẏ = −y + b02y3 + b11xy2 + b20x2 y + b3,−1x3 .

Hint. See Theorem 6.4.3.


3.40 [Referenced in Proposition 4.2.12, proof.] Suppose the nonlinearities in family
(3.3) are homogeneous of degree K, which means that each nonlinear term is of
the same total degree K, and that a polynomial f ∈ C[a, b] is an (r, s)-polynomial
with respect to the corresponding function L of Definition 3.71. (You can think
of the focus quantities, but other polynomials connected with family (3.3) that
we will encounter subsequently have the same property.) Show that f is homo-
geneous of degree (r + s)/(K − 1).
Hint. L1 (ν ) + L2 (ν ).
3.41 Let k be a field and suppose each polynomial f j ∈ k[x1 , . . . , xn ] in the collection
F = { f1 , . . . , fs } is nonzero and homogeneous (but possibly of all different de-
grees), and that f ∈ k[x1 , . . . , xn ] is nonzero and homogeneous. Show that the re-
mainder when f is reduced modulo F, under any ordering of the elements of F,
if nonzero, is homogeneous. Combined with the previous exercise, this shows
that when family (3.3) has homogeneous nonlinearities, the focus quantity gkk
and its remainder modulo Bk−1 are homogeneous polynomials, if nonzero.
3.42 [Referenced in Proposition 3.8.4, proof.] Show that if the common factor c(x, z)
of the polynomials displayed in Proposition 3.8.4 vanishes at (0, 0), then con-
dition (3.150) ensures that the equation c(x, z) = 0 defines a function z = ζ (x)
and that ζ ′ (0) = −1.

3.43 [Referenced in Proposition 3.8.7, proof.] Let k be a field. For A, B ∈ k[x] such
that A/B is not constant, show that there exist constants a, b, c, d ∈ k satisfying
ad − bc 6= 0 and such that deg(aA + bB) > deg(cA + dB).
Hint. If deg(A) = deg(B), consider b_n A − a_n B, where A = a_n x^n + · · · + a_0 and
B = b_n x^n + · · · + b_0.
3.44 Use the family of vector fields originated by C. Christopher ([50]),
   
      ẋ = y,   ẏ = −(1/8 + (1/4)h(x)) h′(x) − y (1 + (2λ − 4)h²(x)) h′(x),

where h(x) = (1/2)(x² − 2x³), to show that there exist polynomial Liénard systems
with coexisting centers and foci, coexisting centers and fine foci, and coexisting
limit cycles and centers.
Hint. Use Hopf bifurcation theory ([44, 140]).
Chapter 4
The Isochronicity and Linearizability Problems

In the previous chapter we presented methods for determining whether the antisad-
dle at the origin of the real polynomial system (3.2) is a center or a focus, and more
generally if the singularity at the origin of the complex polynomial system (3.4) is
a center. In this chapter we assume that the singularity in question is known to be
a center and present methods for determining whether or not it is isochronous, that
is, whether or not every periodic orbit in a neighborhood of the origin has the same
period. A seemingly unrelated problem is that of whether the system is linearizable
(see Definition 2.3.4) in a neighborhood of the origin. In fact, the two problems are
intimately connected, as we will see in Section 4.2, and are remarkably parallel to
the center problem.

4.1 The Period Function

A planar real analytic system with a center at the origin and nonzero linear part there
can, by a suitable analytic change of coordinates and time rescaling, be written in
the form (3.2):
u̇ = −v + U(u, v), v̇ = u + V(u, v) , (4.1)
where U and V are convergent real series that start with quadratic terms. Recall that
the period annulus of the center is the largest neighborhood Ω of the origin with the
property that the orbit of every point in Ω \ {(0, 0)} is a simple closed curve that
encloses the origin. Thus the trajectory of every point in Ω \ {(0, 0)} is a periodic
function, and the following definition makes sense.

Definition 4.1.1. Suppose the origin is a center for system (4.1) and that the number
r∗ > 0 is so small that the segment Σ = {(u, v) : 0 < u < r∗ , v = 0} of the u-axis
lies wholly within the period annulus. For r satisfying 0 < r < r∗ , let T (r) denote
the least period of the trajectory through (u, v) = (r, 0) ∈ Σ . The function T (r) is
the period function of the center (which by the Implicit Function Theorem is real
analytic). If the function T (r) is constant, then the center is said to be isochronous.

V.G. Romanovski, D.S. Shafer, The Center and Cyclicity Problems, 175
DOI 10.1007/978-0-8176-4727-8_4,
© Birkhäuser is a part of Springer Science+Business Media, LLC 2009

Going over to polar coordinates u = r cos ϕ, v = r sin ϕ, system (4.1) becomes

      dr/dt = ∑_{k=1}^{∞} ξ_k(ϕ) r^{k+1},      dϕ/dt = 1 + ∑_{k=1}^{∞} ζ_k(ϕ) r^k,          (4.2)

where ξ_k(ϕ) and ζ_k(ϕ) are homogeneous polynomials in sin ϕ and cos ϕ of degree
k + 2. Elimination of the time t from (4.2) yields the equation

      dr/dϕ = ∑_{k=2}^{∞} R_k(ϕ) r^k,          (4.3)

where Rk (ϕ ) are 2π -periodic functions of ϕ and the series is convergent for all ϕ
and for all sufficiently small r. The initial value problem for (4.3) with the initial
condition (r, ϕ ) = (r0 , 0) has a unique solution

      r = r_0 + ∑_{k=2}^{∞} u_k(ϕ) r_0^k,          (4.4)

which is convergent for all 0 ≤ ϕ ≤ 2π and all r0 < r∗ , for some sufficiently small
r∗ > 0; the coefficients uk (ϕ ) are determined by simple quadratures using formulas
(3.10) with α = 0 and β = 1. Substituting (4.4) into the second equation of (4.2)
and dropping the subscript on r yields an equation of the form


      dϕ/dt = 1 + ∑_{k=1}^{∞} F_k(ϕ) r^k.

Rewriting this equation as


      dt = [1/(1 + ∑_{k=1}^{∞} F_k(ϕ) r^k)] dϕ = (1 + ∑_{k=1}^{∞} ψ_k(ϕ) r^k) dϕ          (4.5)

and integrating yields



      t − ϕ = ∑_{k=1}^{∞} θ_k(ϕ) r^k,          (4.6)

where θ_k(ϕ) = ∫_0^ϕ ψ_k(s) ds and the series in (4.6) converges for 0 ≤ ϕ ≤ 2π and
sufficiently small r ≥ 0. From (4.6) it follows that the least period of the trajectory
of (4.1) passing through (u, v) = (r, 0) for r ≠ 0 is

      T(r) = 2π (1 + ∑_{k=1}^{∞} T_k r^k),          (4.7)

where the coefficients T_k are given by the expression

      T_k = (1/2π) θ_k(2π) = (1/2π) ∫_0^{2π} ψ_k(ϕ) dϕ.          (4.8)

We now see that if the origin is an isochronous center of (4.1), then the functions
θk (ϕ ) satisfy the condition
θk (2π ) = 0 (4.9)
for all k ∈ N. Obviously the converse holds as well, so we have established the
following result.
Theorem 4.1.2. Suppose that in system (4.1) U and V are convergent real series
starting with quadratic terms and that there is a center at the origin.
1. The period function T (r) of system (4.1) is given by formula (4.7).
2. System (4.1) has an isochronous center at the origin if and only if (4.9) holds for
all k ∈ N.
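The expansion (4.7) can be compared with direct numerical integration: follow the trajectory through (u, v) = (r, 0) once around the origin and record the return time. A sketch (the sample Hamiltonian center u̇ = −v, v̇ = u + u² is our own test case, not a system from the text):

```python
def period(r0, dt=1e-3):
    # Sample center (our own choice): u' = -v, v' = u + u^2, which is
    # Hamiltonian with H = (u^2 + v^2)/2 + u^3/3, hence a center near 0.
    def rhs(u, v):
        return -v, u + u**2

    def rk4(u, v):
        # one classical Runge-Kutta step of size dt
        k1 = rhs(u, v)
        k2 = rhs(u + dt / 2 * k1[0], v + dt / 2 * k1[1])
        k3 = rhs(u + dt / 2 * k2[0], v + dt / 2 * k2[1])
        k4 = rhs(u + dt * k3[0], v + dt * k3[1])
        return (u + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
                v + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

    u, v, t = r0, 0.0, 0.0
    while True:
        vp = v
        u, v = rk4(u, v)
        t += dt
        if vp < 0.0 <= v and u > 0.0:           # one full revolution completed
            return t - dt + dt * vp / (vp - v)  # interpolate the zero crossing

for r in (0.05, 0.2, 0.4):
    print(r, period(r))
```

For small r the computed period is close to T(0) = 2π, and the deviation grows with r, as the series (4.7) predicts for a non-isochronous center.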

4.2 Isochronicity Through Normal Forms and Linearizability

In the previous section we expressed the isochronicity problem for real analytic pla-
nar systems in terms of conditions on the coefficients in an expansion of the period
function. For a completely different point of view, we begin with the observation
that the canonical linear center

u̇ = −v, v̇ = u (4.10)

is itself isochronous. Since isochronicity does not depend on the coordinates in use,
certainly any system obtainable from (4.10) by an analytic change of coordinates
will still be isochronous. Viewed from the other direction, any real analytic system
with a center that can be reduced to (4.10) by an analytic change of coordinates (that
is, any linearizable system according to Definition 2.3.4) must be isochronous. This
discussion shows that linearizability and isochronicity are intimately connected,
since it reveals that only isochronous systems are linearizable. In fact, Theorem
3.2.10 implies that the correspondence between the two families is perfect, as we
now demonstrate.
Theorem 4.2.1. The origin is an isochronous center for system (4.1) if and only if
there is an analytic change of coordinates (3.42) that reduces (4.1) to the canonical
linear center (4.10).
Proof. By the preceding discussion we need only show that every isochronous cen-
ter is linearizable. Hence suppose the origin is an isochronous center of (4.1). Then
by Theorem 3.2.10 there is an analytic change of coordinates (3.42) that transforms
(4.1) into the real normal form (3.43),

ξ̇ = −η F(ξ 2 + η 2 ), η̇ = ξ F(ξ 2 + η 2 ), (4.11)

where F(z) = 1 + ∑_{k=1}^{∞} γ_k z^k is analytic in a neighborhood of z = 0. In polar
coordinates, system (4.11) is ṙ = 0, ϕ̇ = F(r²), for which the period function is clearly
dinates, system (4.11) is ṙ = 0, ϕ̇ = F(r2 ), for which the period function is clearly

T (r) = 2π /F(r2 ). Since the transformed system is still isochronous, F(z) ≡ 1, as


required. 

Theorem 4.2.1 tells us that the isochronicity of a planar analytic system is equiv-
alent to its linearizability, so instead of studying the isochronicity of planar analytic
systems, we will investigate their linearizability. The advantage is that, because linearizability has a natural generalization to the complex setting, we can, as in the previous chapter, transform our real system (4.1) into a complex system on C² and study the complex affine varieties corresponding to the linearizable systems.
The complexification of (4.1) has the form ż = Jz + Z(z), where J is a diagonal ma-
trix. As a preliminary we prove the important fact that if there exists any normalizing
transformation that linearizes such a system, then every normalizing transformation
does so. Thus it will be no loss of generality in our investigation of linearization of
such systems to impose conditions on the normalizing transformation whenever it
is convenient to do so. Since the theorem is true not just in C2 but in Cn , we state
and prove it in this setting.

Theorem 4.2.2. Consider the system

ż = Jz + Z(z), (4.12)

where z ∈ Cⁿ, J is a diagonal matrix, and each component Z_j(z) of Z, 1 ≤ j ≤ n, is a formal or convergent power series, possibly with complex coefficients, that contains no constant or linear terms. Suppose

z = y + h̃(y) (4.13)

and

z = x + ĥ(x) (4.14)
are normalizing transformations (possibly merely formal) that transform (4.12) into
the respective normal forms
ẏ = Jy + Y(y) (4.15)
and
ẋ = Jx + X(x) . (4.16)
If the normal form (4.15) is linear, that is, if Y(y) ≡ 0, then the normal form (4.16)
is linear as well.

Proof. Because (4.14) is invertible there exists a (formal) transformation

x = y + h(y) (4.17)

that transforms (4.16) into (4.15). Certainly (4.15) is a normal form for (4.16) and
(4.17) is a normalizing transformation that produces it. Therefore (employing the
notation introduced in (2.36) and in the paragraph that follows Remark 2.3.7) for
s ∈ N, s ≥ 2, any coefficient h_m^{(α)} of h^{(s)} is determined by equation (2.40):
4.2 Isochronicity Through Normal Forms and Linearizability 179

[(α, κ) − κ_m] h_m^{(α)} = g_m^{(α)} − Y_m^{(α)}, (4.18)

where g_m^{(α)} is a known expression depending on the coefficients of h^{(j)} for j < s. Specifically, g_m^{(α)} is defined by equation (2.41):

g_m^{(α)} = {X_m(y + h(y))}^{(α)} − ∑_{j=1}^{n} ∑_{2≤|β|≤|α|−1, α−β+e_j∈N_0^n} β_j h_m^{(β)} Y_j^{(α−β+e_j)}, (4.19)

where {X_m(y + h(y))}^{(α)} denotes the coefficient of y^α obtained after expanding X_m(y + h(y)) in powers of y, e_j = (0, …, 0, 1, 0, …, 0) ∈ N_0^n (the 1 in position j), and the summation is empty when |α| = 2.
Assume, contrary to what we wish to show, that the normal form (4.16) is not
linear. If k ≥ 2 is such that the series expansions of the coordinate functions of X
start with terms of order k, then there exist a coordinate index m ∈ {1, . . . , n} and
a multi-index γ for which |γ| = k such that X_m^{(γ)} ≠ 0. By (4.19) and the fact that Y(y) ≡ 0,
g_m^{(γ)} = {X_m(y + h(y))}^{(γ)} − ∑_{j=1}^{n} ∑_{2≤|β|≤|γ|−1, γ−β+e_j∈N_0^n} β_j h_m^{(β)} Y_j^{(γ−β+e_j)} = X_m^{(γ)}.

On the other hand, from (4.18) and the fact that m and γ must form a resonant pair we obtain

0 · h_m^{(γ)} = g_m^{(γ)} − Y_m^{(γ)} = X_m^{(γ)} ≠ 0,

a contradiction. □
Corollary 4.2.3. Suppose A is a diagonal matrix. System ẋ = Ax + X(x) on C2 is
linearizable according to Definition 2.3.4 if there exists a merely formal normalizing
transformation x = y + h(y) that places it in the normal form ẏ = Ay.
Proof. Suppose there is a formal normalizing transformation that reduces the sys-
tem to ẏ = Ay. Then by the theorem the distinguished normalizing transformation
does, too. But Ψ (y1 , y2 ) = y1 y2 is a first integral for the normalized system, hence
by Theorem 3.2.5(1) the distinguished normalizing transformation is convergent. □

If we apply the complexification procedure described at the beginning of Section 3.2 to (4.1), we obtain system (3.21) with α = 0 and β = 1, where x_1 and x_2 are complex variables:

ẋ_1 = ix_1 + X_1(x_1, x_2),  ẋ_2 = −ix_2 + X_2(x_1, x_2),  X_2(x_1, x̄_1) = X̄_1(x_1, x̄_1). (4.20)

We have already seen (see Proposition 3.2.2) that the resonant pairs for (4.20) are (1, (k + 1, k)) and (2, (k, k + 1)), k ∈ N, so that when we apply a normalizing transformation (2.33), namely

x_1 = y_1 + ∑_{j+k≥2} h_1^{(j,k)} y_1^j y_2^k,  x_2 = y_2 + ∑_{j+k≥2} h_2^{(j,k)} y_1^j y_2^k, (4.21)

to reduce system (4.20) to the normal form (3.27), we obtain

ẏ1 = y1 (i + Y1(y1 y2 )), ẏ2 = y2 (−i + Y2(y1 y2 )), (4.22)

where

Y_1(y_1y_2) = ∑_{j=1}^{∞} Y_1^{(j+1,j)} (y_1y_2)^j  and  Y_2(y_1y_2) = ∑_{j=1}^{∞} Y_2^{(j,j+1)} (y_1y_2)^j. (4.23)

System (4.20) is linearized by the transformation (4.21) precisely when (4.22) reduces to ẏ_1 = iy_1, ẏ_2 = −iy_2. In any case, the first equation in the normal form (4.22) is given by

ẏ_1 = y_1(i + (1/2)[G(y_1y_2) + H(y_1y_2)]), (4.24)

where G and H are the functions of (3.28). If the normalizing transformation was
chosen subject to the condition in Proposition 3.2.1(2), then by Theorem 3.2.7 the
origin is a center for (4.1) if and only if in the normal form (4.24) G ≡ 0, in which
case H has purely imaginary coefficients (Remark 3.2.8). It is convenient to define H̃ by

H̃(w) = −(1/2) iH(w).
By Proposition 3.2.1 system (4.22) is the complexification of a real system; we obtain two descriptions of it in complex form by replacing every occurrence of y_2 by ȳ_1 in each equation of (4.22). Setting y_1 = re^{iϕ}, we obtain from them

ṙ = (1/(2r))(ẏ_1 ȳ_1 + y_1 ȳ̇_1) = 0,  ϕ̇ = (i/(2r²))(y_1 ȳ̇_1 − ẏ_1 ȳ_1) = 1 + H̃(r²). (4.25)

Integrating the expression for ϕ̇ in (4.25) yields (since r is constant in t)

T(r) = 2π/(1 + H̃(r²)) = 2π(1 + ∑_{k=1}^{∞} p_{2k} r^{2k}) (4.26)

for some coefficients p_{2k} that will be examined later in this section and will play a prominent role in Chapter 6. Recall from (3.28a) that we write the function H as H(w) = ∑_{k=1}^{∞} H_{2k+1} w^k. Similarly, write H̃(w) = ∑_{k=1}^{∞} H̃_{2k+1} w^k. The center is isochronous if and only if p_{2k} = 0 for k ≥ 1 or, equivalently, H̃_{2k+1} = 0 for k ≥ 1. We call p_{2k} the kth isochronicity quantity.
The expression for T in (4.26) pertains to the polar coordinate expression (4.25)
for the real system whose complexification is the normalization (4.22) of the com-
plexification (4.20) of the original system (4.1), which has polar coordinate expres-
sion (4.2). If the polar distance in (4.25) is denoted R, then R is an analytic function of r of the form R = r + ⋯, so a comparison of (4.7) and (4.26), which now reads T(R) = 2π(1 + ∑_{k=1}^{∞} p_{2k} R^{2k}), immediately yields the following property.

Proposition 4.2.4. The first nonzero coefficient of the expansion (4.7) is the coeffi-
cient of an even power of r.

The discussion thus far in this chapter applies to any planar analytic system of
the form (4.1). We now turn our attention to families of polynomial systems of the
form (4.1), which, as in Section 3.2, we complexify to obtain a family of polynomial
systems of equations on C² that have the form

ẋ_1 = P̃(x_1, x_2) = i(x_1 − ∑_{(p,q)∈S} a_{pq} x_1^{p+1} x_2^{q}),
ẋ_2 = Q̃(x_1, x_2) = −i(x_2 − ∑_{(p,q)∈S} b_{qp} x_1^{q} x_2^{p+1}), (4.27)

where the coefficients of P̃ and Q̃ are complex and where S ⊂ N_{−1} × N_0 is a finite set, every element (p, q) of which satisfies p + q ≥ 1. This, of course, is the same expression as (3.3) and (3.50). When family (4.27) arises as the complexification of a real family, the equality b_qp = ā_pq holds for all (p, q) ∈ S, but we will not restrict ourselves to this condition in what follows. When that condition fails, the function H̃ and the isochronicity quantities p_{2k} implicitly defined by the second equality in (4.26) still exist, although there is no period function T. In any event, the coefficients p_{2k} defined by (4.26) and the coefficients H̃_{2k+1} of H̃ = −(1/2)iH are now polynomials in the parameters (a, b) ∈ E(a, b) = C^{2ℓ}.
The coefficients Y_1^{(k+1,k)} and Y_2^{(k,k+1)} of the series (4.23) in the normal form (4.22) are now elements of the ring C[a, b] of polynomials. They generate an ideal

Y := ⟨Y_1^{(j+1,j)}, Y_2^{(j,j+1)} : j ∈ N⟩ ⊂ C[a, b]. (4.28)

For any k ∈ N we set Y_k = ⟨Y_1^{(j+1,j)}, Y_2^{(j,j+1)} : j = 1, …, k⟩. The normal form of a particular system (a*, b*) is linear when all the coefficients

{Y_1^{(j+1,j)}(a*, b*), Y_2^{(j,j+1)}(a*, b*) : j ∈ N}

are equal to zero. Thus we have the following definition.

Definition 4.2.5. Suppose a normal form (4.22) arises from system (4.27) by means
of a normalizing transformation (2.33) or (4.21), and let Y be the ideal (4.28). The
variety VL := V(Y ) is called the linearizability variety of system (4.27).

As was the case with the center variety, we must address the question of whether
the linearizability variety is actually well-defined, that is, that the variety VL does
not depend on the particular choice of the resonant coefficients of the normalizing
transformation. In the current situation, however, no elaborate argument is needed:

the correctness of Definition 4.2.5 is an immediate consequence of Theorem 4.2.2, that if any normalizing transformation linearizes (4.27), then all normalizing transformations do.

Remark 4.2.6. If the system

ẋ_1 = P̃(x_1, x_2) = x_1 − ∑_{(p,q)∈S} a_{pq} x_1^{p+1} x_2^{q},
ẋ_2 = Q̃(x_1, x_2) = −x_2 + ∑_{(p,q)∈S} b_{qp} x_1^{q} x_2^{p+1} (4.29)

is transformed into the linear system ẏ1 = y1 , ẏ2 = −y2 by transformation (4.21),
then (4.27) is reduced to ẏ1 = iy1 , ẏ2 = −iy2 by the same transformation. Conversely,
if (4.21) linearizes (4.27), then it also linearizes (4.29). Therefore systems (4.27)
and (4.29) are equivalent with regard to the problem of linearizability. Thus in what
follows we typically state results for both systems but provide proofs for just one or
the other.

The following proposition gives another characterization of V_L, based on the idea that linearizability and isochronicity must be equivalent. Form

H = ⟨H_{2j+1} : j ∈ N⟩ = ⟨H̃_{2j+1} : j ∈ N⟩. (4.30)

For each k ∈ N we also set H_k = ⟨H_{2j+1} : j = 1, …, k⟩ = ⟨H̃_{2j+1} : j = 1, …, k⟩.
The corresponding variety V(H ), when intersected with the center variety, picks
out the generalization of isochronous centers to the complex setting. Thus define

VI = V(H ) ∩VC .

This variety, the isochronicity variety, should be the same as VL , and we now show
that it is. This result thus shows that the variety VI is well-defined, independently
of the normalizing transformation (4.21).

Proposition 4.2.7. With reference to family (4.27), the sets VI = V(H ) ∩ VC and
VL are the same.
Proof. Let a set of coefficients Y_1^{(k+1,k)}, Y_2^{(k,k+1)}, k ∈ N, arising from a normalizing transformation that satisfies the condition of Proposition 3.2.1(2) be given. By (3.28b), H_{2k+1} = Y_1^{(k+1,k)} − Y_2^{(k,k+1)}, which immediately implies that V_L ⊂ V_I. On the other hand, if (a, b) ∈ V_C, then by Theorem 3.2.5(1) the function G computed from the Y_1^{(k+1,k)} and Y_2^{(k,k+1)} vanishes identically. Since, again by (3.28b),

Y_1^{(k+1,k)} = (G_{2k+1} + H_{2k+1})/2  and  Y_2^{(k,k+1)} = (G_{2k+1} − H_{2k+1})/2, (4.31)

this yields V_I ⊂ V_L. □

The following theorem describes some of the properties of the coefficients h_m^{(α)} ∈ C[a, b] of the normalizing transformation (4.21) and of the resonant terms Y_1^{(k+1,k)}, Y_2^{(k,k+1)} ∈ C[a, b] of the resulting normal form (4.22) that are similar to the properties of the coefficients v_{j−1,k−1} of the function (3.52) given by Theorem 3.4.2. The theorem is true in general, without reference to linearization, hence could have been presented in Chapter 3. We have saved it for now because of its particular connection to the idea of linearization in view of Definition 4.2.5 and Proposition 4.2.7. We will use the notation that was introduced in the paragraph preceding Definition 3.4.1 of (j, k)-polynomials. Note in particular that the theorem implies that, for m ∈ {1, 2} and α ∈ N_0^2, as a polynomial in the indeterminates a_pq and b_qp, Y_m^{(α)}, like X_m^{(α)}, has purely imaginary coefficients, while the coefficients of h_m^{(α)} are real.

Theorem 4.2.8. Let a set S, hence a family (4.27) (family (3.3)), be given, and let (4.21) be any normalizing transformation whose resonant coefficients h_1^{(k+1,k)}, h_2^{(k,k+1)} ∈ C[a, b], k ∈ N, are chosen so that they are (k, k)-polynomials with coefficients in Q.
1. For every (j, k) ∈ N_0^2 with j + k ≥ 2, h_1^{(j,k)} is a (j − 1, k)-polynomial and h_2^{(j,k)} is a (j, k − 1)-polynomial, each with coefficients in Q.
2. For all k ∈ N, the polynomials iY_1^{(k+1,k)} and iY_2^{(k,k+1)} are (k, k)-polynomials with coefficients in Q.
3. If the resonant coefficients h_1^{(k+1,k)} and h_2^{(k,k+1)} are chosen so as to satisfy h_2^{(k,k+1)} = ĥ_1^{(k+1,k)}, then

h_2^{(k,j)} = ĥ_1^{(j,k)} for all (j, k) ∈ N_0^2 with j + k ≥ 2

and

Y_2^{(k,k+1)} = Ŷ_1^{(k+1,k)} for all k ∈ N_0,

where for f ∈ C[a, b] the conjugate f̂ is given by Definition 3.4.3.

Proof. In the proof it will be convenient to treat point (2) and the part of point (3) that pertains to Y_m^{(α)} as if they were stated for all the coefficients in the normal form, including the nonresonant ones, which of course are zero. By iQ we denote the set of all purely imaginary elements of the field of Gaussian rational numbers, that is, iQ = {iq : q ∈ Q}.
The vector κ of eigenvalues from Definition 2.3.5 is κ = (κ_1, κ_2) = (i, −i), so that for α ∈ N_0^2, |α| ≥ 2, we have that (α, κ) − κ_1 = i(α_1 − α_2 − 1) ∈ iQ and that (α, κ) − κ_2 = i(α_1 − α_2 + 1) ∈ iQ. The proof of points (1) and (2) will be done simultaneously for h_m^{(α)} and Y_m^{(α)} by induction on |α|.
Basis step. Suppose α ∈ N_0^2 has |α| = 2. Since there are no resonant pairs (m, α), Y_m^{(α)} = 0, so (2) holds. The coefficients h_m^{(α)} are determined by (2.40) (which is (4.18)), where g_m^{(α)} is given by (2.41) (which is (4.19)), in this case (that is, the basis step) without the sum. Since X and h begin with quadratic terms, {X_m(y + h(y))}^{(α)} = X_m^{(α)}, hence h_m^{(α)} = [(α, κ) − κ_m]^{−1} X_m^{(α)}. Recalling the notation of Section 3.4, and writing α = (j, k), X_1^{(j,k)} = −i a_{j−1,k} = −i[ν] for the string ν = (0, …, 1, …, 0) with the 1 in the position c such that (j − 1, k) = (p_c, q_c). But L(ν) = (p_c, q_c), so h_1^{(j,k)} is a (j − 1, k)-polynomial with coefficients in Q. An analogous argument shows that h_2^{(j,k)} is a (j, k − 1)-polynomial with coefficients in Q, so (1) holds for |α| = 2.
Inductive step. Suppose points (1) and (2) hold for all h_m^{(α)} and Y_m^{(α)}, m ∈ {1, 2}, 2 ≤ |α| ≤ s, and fix α = (j, k) ∈ N_0^2 with |α| = s + 1. If m and α form a nonresonant pair, then Y_m^{(α)} = 0, so (2) holds, and h_m^{(α)} = C · g_m^{(α)} for some C ∈ iQ. If m and α form a resonant pair, then h_m^{(α)} and its companion resonant coefficient in h are selected arbitrarily, subject to the condition that they satisfy the conclusion in point (1) and that their coefficients lie in Q, and Y_m^{(α)} = g_m^{(α)}. Thus point (2) will be established in either case if we can show that g_1^{(j,k)} is a (j − 1, k)-polynomial and g_2^{(j,k)} is a (j, k − 1)-polynomial, each with coefficients in iQ. For the proof it is convenient to introduce the six constant polynomials in C[a, b] defined by h_1^{(1,0)} = h_2^{(0,1)} = 1 and h_1^{(0,0)} = h_1^{(0,1)} = h_2^{(0,0)} = h_2^{(1,0)} = 0. Since the support of each of the last four is empty and Supp(h_1^{(1,0)}) = Supp(h_2^{(0,1)}) = (0, …, 0) ∈ N_0^{2ℓ}, they satisfy the conclusions of the theorem. They allow us to write

X_m(y + h(y)) = ∑_{(c,d)∈N_0^2, c+d≥2} X_m^{(c,d)} (∑_{(r,t)∈N_0^2} h_1^{(r,t)} y_1^r y_2^t)^c (∑_{(r,t)∈N_0^2} h_2^{(r,t)} y_1^r y_2^t)^d, (4.32)

so that (before collecting on powers of y_1 and y_2) X_m(y + h(y)) is a sum of terms of the form

X_m^{(c,d)} h_1^{(r_1,t_1)} ⋯ h_1^{(r_c,t_c)} h_2^{(u_1,v_1)} ⋯ h_2^{(u_d,v_d)} y_1^{r_1+⋯+r_c+u_1+⋯+u_d} y_2^{t_1+⋯+t_c+v_1+⋯+v_d}. (4.33)

Consider any such term in X_m(y + h(y)) for which

(r_1 + ⋯ + r_c + u_1 + ⋯ + u_d, t_1 + ⋯ + t_c + v_1 + ⋯ + v_d) = (j, k).


Then X_1^{(c,d)} = −i a_{c−1,d} = −i[ν], where L(ν) = (c − 1, d), hence by the induction hypothesis, for any monomial [µ] appearing in the coefficient in (4.33) when m = 1,

L(µ) = (c − 1, d) + (r_1 − 1, t_1) + ⋯ + (r_c − 1, t_c) + (u_1, v_1 − 1) + ⋯ + (u_d, v_d − 1) = (j − 1, k).

Again, X_2^{(c,d)} = i b_{c,d−1} = i[ν′], where L(ν′) = (c, d − 1), and a similar argument implies that for any monomial [µ] appearing in the coefficient in (4.33) when m = 2, L(µ) = (j, k − 1). Therefore {X_1(y + h(y))}^{(j,k)} is a (j − 1, k)-polynomial and {X_2(y + h(y))}^{(j,k)} is a (j, k − 1)-polynomial, whose coefficients are clearly elements of iQ.

Any monomial [µ] that appears in the sum in (4.19) is a product of a monomial from h_m^{(β)} and (changing the name of the index on the first sum to w) a monomial from Y_w^{(α−β+e_w)}, hence by the induction hypothesis, for m = 1 and w = 1,

L(µ) = (β_1 − 1, β_2) + ((j − β_1 + 1) − 1, k − β_2) = (j − 1, k)

and similarly for m = 1 and w = 2, and for m = 2 and w = 1,

L(µ) = (β_1, β_2 − 1) + (j − β_1, (k − β_2 + 1) − 1) = (j, k − 1)

and similarly for m = 2 and w = 2. Thus g_1^{(j,k)} is a (j − 1, k)-polynomial and g_2^{(j,k)} is a (j, k − 1)-polynomial, each with coefficients in iQ, and the inductive step is complete, proving points (1) and (2).
The proof of point (3) will be done simultaneously for h_m^{(α)} and Y_m^{(α)} by induction on |α|. For notational convenience we will write ^[f] in place of f̂ when the expression for f is long.
Basis step. Suppose α ∈ N_0^2 has |α| = 2. There are no resonant pairs (m, α), hence for α = (j, k),

h_2^{(k,j)} = (i(k − j + 1))^{−1} X_2^{(k,j)} = (i(k − j + 1))^{−1} i b_{k,j−1}
 = (i(j − k − 1))^{−1}(−i) b_{k,j−1} = ^[(i(j − k − 1))^{−1}(−i) a_{j−1,k}]
 = ^[(i(j − k − 1))^{−1} X_1^{(j,k)}] = ĥ_1^{(j,k)},

and since Y_m^{(α)} = 0, Y_2^{(k,k+1)} = Ŷ_1^{(k+1,k)} is automatic.
Inductive step. Suppose the resonant terms h_m^{(α)} in h with |α| ≤ s have been chosen so that h_2^{(k,k+1)} = ĥ_1^{(k+1,k)} and that the conclusion in (3) holds for all h_m^{(α)} and Y_m^{(α)} for which 2 ≤ |α| ≤ s. Fix α = (j, k) ∈ N_0^2 with |α| = s + 1. If (1, (j, k)) is a nonresonant pair, then so is (2, (k, j)), and Y_2^{(k,j)} = 0 = 0̂ = Ŷ_1^{(j,k)}, so (3) holds for Y_1^{(j,k)} and Y_2^{(k,j)}, while h_1^{(j,k)} = C · g_1^{(j,k)} and h_2^{(k,j)} = C̄ · g_2^{(k,j)} for C = [i(j − k − 1)]^{−1}. If (1, (j, k)) is a resonant pair, then we choose h_1^{(j,k)} and the resonant h_2^{(k,j)} with coefficients in Q and so as to meet the condition in (3), and Y_1^{(j,k)} = g_1^{(j,k)} and Y_2^{(k,j)} = g_2^{(k,j)}. Thus point (3) will be established if we can prove that g_2^{(k,j)} = ĝ_1^{(j,k)}.
When expression (4.32) for X_m(y + h(y)) is fully expanded without collecting on powers of y_1 and y_2, it is the sum of all terms of the form given in (4.33). Thus in general {X_m(y + h(y))}^{(β)} is the sum of all products of the form

X_m^{(c,d)} h_1^{(r_1,t_1)} ⋯ h_1^{(r_c,t_c)} h_2^{(u_1,v_1)} ⋯ h_2^{(u_d,v_d)} (4.34)

with admissible (c, d), (r_w, t_w), (u_w, v_w) ∈ N_0^2 for which

(r_1 + ⋯ + r_c + u_1 + ⋯ + u_d, t_1 + ⋯ + t_c + v_1 + ⋯ + v_d) = (β_1, β_2). (4.35)



By the obvious properties of conjugation, ^[{X_1(y + h(y))}^{(j,k)}] is thus the sum of the conjugates of all such products, hence of all products

X̂_1^{(c,d)} ĥ_1^{(r_1,t_1)} ⋯ ĥ_1^{(r_c,t_c)} ĥ_2^{(u_1,v_1)} ⋯ ĥ_2^{(u_d,v_d)} = X_2^{(d,c)} h_2^{(t_1,r_1)} ⋯ h_2^{(t_c,r_c)} h_1^{(v_1,u_1)} ⋯ h_1^{(v_d,u_d)}

with admissible (c, d), (r_w, t_w), (u_w, v_w) ∈ N_0^2 for which (4.35) holds with (β_1, β_2) replaced by (j, k), where we have applied the induction hypothesis. But this latter sum (with precisely this condition) is exactly {X_2(y + h(y))}^{(k,j)}, so that

^[{X_2(y + h(y))}^{(k,j)}] = {X_1(y + h(y))}^{(j,k)}. (4.36)

Again by the induction hypothesis (recall that α = (j, k)),

^[ ∑_{w=1}^{2} ∑_{2≤|β|≤|α|−1, α−β+e_w∈N_0^2} β_w h_1^{(β)} Y_w^{(α−β+e_w)} ]
 = ∑_{2≤|β|≤|α|−1, (j−β_1+1,k−β_2)∈N_0^2} ^[β_1 h_1^{(β_1,β_2)} Y_1^{(j−β_1+1,k−β_2)}]
 + ∑_{2≤|β|≤|α|−1, (j−β_1,k−β_2+1)∈N_0^2} ^[β_2 h_1^{(β_1,β_2)} Y_2^{(j−β_1,k−β_2+1)}]
 = ∑_{2≤|β|≤|α|−1, (k−β_2,j−β_1+1)∈N_0^2} β_1 h_2^{(β_2,β_1)} Y_2^{(k−β_2,j−β_1+1)}
 + ∑_{2≤|β|≤|α|−1, (k−β_2+1,j−β_1)∈N_0^2} β_2 h_2^{(β_2,β_1)} Y_1^{(k−β_2+1,j−β_1)}

and, writing γ = (k, j) and δ = (β_2, β_1),

 = ∑_{w=1}^{2} ∑_{2≤|δ|≤|γ|−1, γ−δ+e_w∈N_0^2} δ_w h_2^{(δ)} Y_w^{(γ−δ+e_w)}. (4.37)

Equations (4.36) and (4.37) together imply that g_2^{(k,j)} = ĝ_1^{(j,k)} for |α| = j + k = s + 1, so the inductive step is complete, and the truth of (3) follows. □

Corollary 4.2.9. Let a set S, hence a family (4.27), be given and let (4.21) be any normalizing transformation whose resonant coefficients h_1^{(k+1,k)}, h_2^{(k,k+1)} ∈ C[a, b], k ∈ N, are chosen so that they are (k, k)-polynomials with coefficients in Q such that h_2^{(k,k+1)} = ĥ_1^{(k+1,k)}. Then the pair of polynomials G_{2k+1} = Y_1^{(k+1,k)} + Y_2^{(k,k+1)} and H_{2k+1} = Y_1^{(k+1,k)} − Y_2^{(k,k+1)} defined by (3.28) have the form

G_{2k+1} = ∑_{ν: L(ν)=(k,k)} iq^{(ν)}([ν] − [ν̂]),  H_{2k+1} = ∑_{ν: L(ν)=(k,k)} iq^{(ν)}([ν] + [ν̂]),

for q^{(ν)} ∈ Q. (The symbol q^{(ν)} stands for the same number in each expression.)
Proof. By point (2) of Theorem 4.2.8, the polynomial Y_1^{(k+1,k)} has the form

Y_1^{(k+1,k)} = ∑_{ν: L(ν)=(k,k)} iq^{(ν)} [ν] (4.38)

for q^{(ν)} ∈ Q, hence by point (3) of Theorem 4.2.8,

Y_2^{(k,k+1)} = ∑_{ν: L(ν)=(k,k)} ^[iq^{(ν)} [ν]] = ∑_{ν: L(ν)=(k,k)} −iq^{(ν)} [ν̂]. (4.39)

Adding (4.39) to and subtracting it from (4.38) immediately gives the result. □

Let us now return to the isochronicity quantities p_{2k}, k ∈ N, defined implicitly by (4.26), where our discussion is still restricted to families of systems (4.27). We do not, however, restrict to the situation b_qp = ā_pq, so the complex systems under consideration are not necessarily complexifications of real systems. To find the isochronicity quantities we must first find the polynomials H̃_{2k+1}, which themselves are given in terms of the polynomials Y_1^{(k+1,k)} and Y_2^{(k,k+1)} of the normal form (4.22). These latter polynomials can be computed by means of the Normal Form Algorithm in Table 2.1 on page 75, at least when the normalizing transformation is distinguished, which will be adequate for our purposes. We already know that Y_1^{(k+1,k)} and Y_2^{(k,k+1)} are (k, k)-polynomials, hence so are H_{2k+1} and H̃_{2k+1}. The same is true for the isochronicity quantities.

Proposition 4.2.10. Let a set S, hence a family (4.27), be given. The isochronicity
quantities p2k for that family are (k, k)-polynomials.

Proof. To get our hands on p_{2k} we must invert the series on the left-hand side of (4.26). In general, if (1 + ∑_{k=1}^{∞} a_k x^k)^{−1} = 1 + ∑_{k=1}^{∞} b_k x^k, then clearing the denominator gives 1 = 1 + ∑_{k=1}^{∞} (a_k + a_{k−1}b_1 + ⋯ + b_k) x^k, hence b_1 = −a_1 and, for k ≥ 2, b_k = −a_1 b_{k−1} − a_2 b_{k−2} − ⋯ − a_{k−1} b_1 − a_k, so that the coefficients b_k can be recursively computed in terms of the a_k: b_1 = −a_1, b_2 = −a_1 b_1 − a_2 = a_1² − a_2, and so on. It follows easily by mathematical induction that for every term c a_1^{ν_1} a_2^{ν_2} ⋯ a_k^{ν_k} occurring in the expression for b_k,

ν_1 + 2ν_2 + 3ν_3 + ⋯ + kν_k = k. (4.40)

In our specific case, by (4.26), a_k = H̃_{2k+1} = −(1/2)i H_{2k+1} = −(1/2)i (Y_1^{(k+1,k)} − Y_2^{(k,k+1)}) and b_k = p_{2k}. Thus p_{2k} is a sum of polynomials of the form

c (Y_1^{(2,1)} − Y_2^{(1,2)})^{ν_1} (Y_1^{(3,2)} − Y_2^{(2,3)})^{ν_2} ⋯ (Y_1^{(k+1,k)} − Y_2^{(k,k+1)})^{ν_k}.

By Exercise 4.2, this expression defines a (j, j)-polynomial, where j is given by j = ν_1 + 2ν_2 + ⋯ + kν_k = k, hence p_{2k} is a (k, k)-polynomial. □
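The inversion recursion in this proof translates directly into code. The following sympy sketch (the helper name invert_series is ours, not the book's) recovers the explicit expressions for p_2, p_4, and p_6 listed in (4.42) when the a_k are taken to be the H̃_{2k+1}:

```python
import sympy as sp

def invert_series(a):
    """Coefficients b_k of (1 + a_1 x + a_2 x^2 + ...)^(-1) = 1 + b_1 x + ...,
    computed by the recursion b_k = -a_1 b_{k-1} - ... - a_{k-1} b_1 - a_k."""
    b = []
    for k in range(1, len(a) + 1):
        bk = -a[k - 1] - sum(a[j - 1] * b[k - j - 1] for j in range(1, k))
        b.append(sp.expand(bk))
    return b

# taking a_k = H~_{2k+1} in (4.26) gives b_k = p_{2k}
H3, H5, H7 = sp.symbols('H3 H5 H7')
p2, p4, p6 = invert_series([H3, H5, H7])
# p2 == -H3, p4 == -H5 + H3**2, p6 == -H7 + 2*H3*H5 - H3**3, cf. (4.42)
```

Each monomial produced by the recursion satisfies the weight condition (4.40), which is exactly what makes p_{2k} a (k, k)-polynomial.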
From (4.26) and the recursion formula for the inversion of series given in the proof of Proposition 4.2.10 it is immediately apparent that the polynomials p_{2k} and H̃_{2k+1} satisfy (see Definition 1.1.8)

p_2 = −H̃_3 and p_{2k} ≡ −H̃_{2k+1} mod ⟨H̃_3, …, H̃_{2k−1}⟩ for k ≥ 2, (4.41)

which is enough to show that ⟨H̃_3, …, H̃_{2k+1}⟩ = ⟨p_2, …, p_{2k}⟩ for all k ∈ N, hence that ⟨H̃_{2k+1} : k ∈ N⟩ = ⟨p_{2k} : k ∈ N⟩. With more work we can obtain explicit expressions for the polynomials p_{2k}. The first few are (see Exercise 4.4)

p_2 = −H̃_3 = (i/2)(Y_1^{(2,1)} − Y_2^{(1,2)})

p_4 = −H̃_5 + (H̃_3)² = (i/2)(Y_1^{(3,2)} − Y_2^{(2,3)}) − (1/4)(Y_1^{(2,1)} − Y_2^{(1,2)})²

p_6 = −H̃_7 + 2H̃_3H̃_5 − (H̃_3)³
  = (i/2)(Y_1^{(4,3)} − Y_2^{(3,4)}) − (1/2)(Y_1^{(2,1)} − Y_2^{(1,2)})(Y_1^{(3,2)} − Y_2^{(2,3)}) − (i/8)(Y_1^{(2,1)} − Y_2^{(1,2)})³ (4.42)

p_8 = −H̃_9 + 2H̃_7H̃_3 + (H̃_5)² − 3H̃_5(H̃_3)² + (H̃_3)⁴
  = (i/2)(Y_1^{(5,4)} − Y_2^{(4,5)}) − (1/2)(Y_1^{(4,3)} − Y_2^{(3,4)})(Y_1^{(2,1)} − Y_2^{(1,2)}) − (1/4)(Y_1^{(3,2)} − Y_2^{(2,3)})² − (3i/8)(Y_1^{(3,2)} − Y_2^{(2,3)})(Y_1^{(2,1)} − Y_2^{(1,2)})² + (1/16)(Y_1^{(2,1)} − Y_2^{(1,2)})⁴.

In general, the isochronicity quantities exhibit the following structure.

Proposition 4.2.11. Let a set S, hence a family (4.27), be given. The isochronicity quantities of that family have the form

p_{2k} = (1/2) ∑_{ν: L(ν)=(k,k)} p_{2k}^{(ν)} ([ν] + [ν̂]).

Proof. From (4.26) and the recursion formula for the inversion of series given in the proof of Proposition 4.2.10 it is clear that the polynomial p_{2k} is a polynomial function of H̃_3, H̃_5, …, H̃_{2k+1}. The result now follows from Corollary 4.2.9 and Exercise 4.5. □
The following proposition states a fact about the isochronicity quantities that will
be needed in Chapter 6.
Proposition 4.2.12. Suppose the nonlinear terms in family (4.27) are homogeneous
polynomials of degree K. Then the isochronicity quantities, p2k for k ∈ N, are ho-
mogeneous polynomials in C[a, b] of degree 2k/(K − 1).

Proof. This fact follows immediately from Proposition 4.2.10 and Exercise 3.40. □

Although the isochronicity quantities p2k are defined for all choices of the co-
efficients (a, b) ∈ E(a, b), they are relevant only for (a, b) ∈ VC , the center vari-
ety, where their vanishing identifies isochronicity for real centers and linearizabil-
ity for all centers. Thus if two of them agree at every point of the center variety,
then they are equivalent with respect to the information that they provide about the
linearizability of centers in the family under consideration. We are naturally led
therefore to the concept of the equivalence of polynomials with respect to a vari-
ety. In general terms, let k be a field and let V be a variety in kⁿ. We define an
equivalence relation on k[x1 , . . . , xn ] by saying that two polynomials f and g are
equivalent if, for every x ∈ V , f (x) = g(x) as constants in k. The set of equivalence
classes is denoted k[V ] and is called the coordinate ring of the variety V . If [[ f ]]
denotes the equivalence class of the polynomial f , then it is easy to see that the
operations [[ f ]] + [[g]] = [[ f + g]] and [[ f ]][[g]] = [[ f g]] are well-defined and give
k[V ] the structure of a commutative ring. The reader is asked in Exercise 4.7 to
show that f and g are equivalent under this new equivalence relation if and only if
f ≡ g mod I(V ). That is, f and g are in the same equivalence class in k[V ] if and
only if they are in the same equivalence class in k[x1 , . . . , xn ]/I(V ). Hence there is
a mapping ϕ : k[V ] → k[x1 , . . . , xn ]/I(V ) naturally defined by ϕ ([[ f ]]) = [ f ]. It is an
isomorphism of rings (Exercise 4.8, but see also Exercise 4.9).
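As a concrete illustration of this equivalence (our example, not one from the book), take k = Q and V = V(y − x²) ⊂ k². Since the ideal ⟨y − x²⟩ is prime, I(V) = ⟨y − x²⟩, so two polynomials determine the same element of k[V] exactly when their difference reduces to zero modulo this ideal, which sympy's reduced can check:

```python
import sympy as sp

x, y = sp.symbols('x y')
IV = [y - x**2]        # generators of I(V) for V = V(y - x**2)

f = x**2 * y + x       # f and g differ as polynomials ...
g = y**2 + x           # ... but agree at every point of V

_, rem = sp.reduced(f - g, IV, x, y, order='lex')
print(rem)             # 0, hence f ≡ g mod I(V), i.e. [[f]] = [[g]] in k[V]

# spot check at the point (2, 4), which lies on V
assert f.subs({x: 2, y: 4}) == g.subs({x: 2, y: 4})
```

Here the zero remainder certifies f − g ∈ I(V), which by Exercise 4.7 is exactly the condition that f and g represent the same class in the coordinate ring k[V].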
Returning to the specific case of the isochronicity quantities, by way of notation let

P = ⟨p_{2k} : k ∈ N⟩ ⊂ C[a, b] and P̃ = ⟨[[p_{2k}]] : k ∈ N⟩ ⊂ C[V_C],

and for k ∈ N let

P_k = ⟨p_2, …, p_{2k}⟩ and P̃_k = ⟨[[p_2]], …, [[p_{2k}]]⟩.

Since the equivalences given in (4.41) clearly imply that H = P (and that H_k = P_k for all k ∈ N), a system that has a linearizable center at the origin corresponds to a point in the center variety at which every polynomial p_{2k} vanishes; symbolically: V_I = V(H) ∩ V_C = V(P) ∩ V_C. We will also be interested in V(P_k) ∩ V_C. Since all points of interest lie in the center variety, on which every representative of [[p_{2k}]] agrees, we are tempted to write V(P̃) ∩ V_C and V(P̃_k) ∩ V_C instead of V(P) ∩ V_C and V(P_k) ∩ V_C. The objects V(P̃) and V(P̃_k) are not well-defined, however, since elements of P̃ are equivalence classes and, for (a, b) ∈ C^{2ℓ} \ V_C, the value of f(a, b) depends on the representative f ∈ [[p_{2k}]] that is chosen. The value is independent of the representative when (a, b) ∈ V_C, however, so we may validly define

V_{V_C}(P̃) := {(a, b) ∈ V_C : if f ∈ [[p_{2k}]] for some k ∈ N then f(a, b) = 0}. (4.43)

The definition of V_{V_C}(P̃_k) is similar. The following proposition shows that these sets
are subvarieties of the center variety and are precisely what we expect them to be.

Proposition 4.2.13. Let I = P_k for some k ∈ N or I = P. In the former case let Ĩ = P̃_k = ⟨[[p_2]], …, [[p_{2k}]]⟩ and in the latter case let Ĩ = P̃ = ⟨[[p_{2k}]] : k ∈ N⟩. Then V_{V_C}(Ĩ) = V(I) ∩ V_C.

Proof. We prove the proposition when I = P; the ideas are the same when I = P_k. Suppose (a*, b*) ∈ V_{V_C}(P̃), so that (a*, b*) ∈ V_C and f(a*, b*) = 0 for every f that lies in [[p_{2k}]] for some k ∈ N. Since p_{2k} ∈ [[p_{2k}]], we have that p_{2k}(a*, b*) = 0 for all k ∈ N, so (a*, b*) ∈ V(P) ∩ V_C.
Conversely, suppose (a*, b*) ∈ V(P) ∩ V_C and that the polynomial f ∈ C[a, b] satisfies [[f]] ∈ P̃. We must show that f(a*, b*) = 0 even though f need not be in P. There exist f_{j_1}, …, f_{j_s} ∈ C[a, b] such that

[[f]] = [[f_{j_1}]][[p_{2j_1}]] + ⋯ + [[f_{j_s}]][[p_{2j_s}]] = [[f_{j_1} p_{2j_1} + ⋯ + f_{j_s} p_{2j_s}]].

By the definition of C[V_C], since (a*, b*) ∈ V_C, f agrees with f_{j_1} p_{2j_1} + ⋯ + f_{j_s} p_{2j_s} at (a*, b*), so

f(a*, b*) = f_{j_1}(a*, b*) p_{2j_1}(a*, b*) + ⋯ + f_{j_s}(a*, b*) p_{2j_s}(a*, b*) = f_{j_1}(a*, b*) · 0 + ⋯ + f_{j_s}(a*, b*) · 0 = 0,

since (a*, b*) ∈ V(P), as required. □

As previously noted, the equivalences given in (4.41) imply that H_k = P_k and H = P, so that V(H_k) = V(P_k) and V(H) = V(P). The polynomials p_{2k} are ultimately defined in terms of the polynomials Y_1^{(k+1,k)} and Y_2^{(k,k+1)}, so we now connect V(P_k) and V(P) to V(Y_k) and V(Y) directly. (See also Exercise 4.10.)

Proposition 4.2.14. With reference to family (4.27),

V(P) ∩ V_C = V(Y) ∩ V_C and V(P_k) ∩ V_C = V(Y_k) ∩ V_C for all k ∈ N.

Proof. Since H_{2k+1} = Y_1^{(k+1,k)} − Y_2^{(k,k+1)}, if f ∈ H_k, then f ∈ Y_k, but not conversely. But by (4.31) and the identity [[G_{2k+1}]] = [[0]] in C[V_C] (see Theorem 3.2.7), we have that f ∈ Y_k implies [[f]] ∈ ⟨[[H_3]], …, [[H_{2k+1}]]⟩ = ⟨[[p_2]], …, [[p_{2k}]]⟩ = P̃_k. Therefore if we now let Ỹ_k = ⟨[[Y_1^{(2,1)}]], [[Y_2^{(1,2)}]], …, [[Y_1^{(k+1,k)}]], [[Y_2^{(k,k+1)}]]⟩ and similarly let Ỹ = ⟨[[Y_1^{(j+1,j)}]], [[Y_2^{(j,j+1)}]] : j ∈ N⟩, then Ỹ_k = P̃_k and Ỹ = P̃ in C[V_C]. Then for all k ∈ N, using Proposition 4.2.13 for the first and last equalities,

V(P_k) ∩ V_C = V_{V_C}(P̃_k) = V_{V_C}(Ỹ_k) = V(Y_k) ∩ V_C,

and V(P) ∩ V_C = V(Y) ∩ V_C similarly. □



4.3 The Linearizability Quantities

In order to find all systems (4.27) that are linearizable by a convergent transformation of the type (4.21), one approach is to try to directly construct a linearizing transformation (4.21) and the corresponding normal form (3.27), imposing the condition that Y_1^{(k+1,k)} = Y_2^{(k,k+1)} = 0 for all k ∈ N. Instead of doing so, we will look for the inverse of such a transformation. The motivation for this idea is that it will lead to a recursive computational formula that is closely related to formula (3.76) in Section 3.4. First we write what we might call the inverse linearizing transformation, which changes the linear system
which changes the linear system

ẏ1 = iy1 , ẏ2 = −iy2 (4.44)

into system (4.27) as

y_1 = x_1 + ∑_{j+k=2}^{∞} u_1^{(j−1,k)} x_1^j x_2^k,  y_2 = x_2 + ∑_{j+k=2}^{∞} u_2^{(j,k−1)} x_1^j x_2^k, (4.45)

where we have made the asymmetrical index shift so that the “linearizability quantities” that we ultimately obtain are indexed I_kk and J_kk rather than I_{k,k+1} and J_{k+1,k}.
In agreement with (4.45), let u_1^{(0,0)} = u_2^{(0,0)} = 1 and u_1^{(−1,1)} = u_2^{(1,−1)} = 0, so that the
indexing sets are N_{−1} × N_0 for u_1^{(k_1,k_2)} and N_0 × N_{−1} for u_2^{(k_1,k_2)}. Then, making the
convention that a_{pq} = b_{qp} = 0 if (p, q) ∉ S, when we differentiate each part of (4.45)
with respect to t, apply (4.27) and (4.44), and equate the coefficients of like powers
x_1^{α_1} x_2^{α_2}, the resulting equations yield the recurrence formulas

    (k_1 − k_2) u_1^{(k_1,k_2)} = Σ_{s_1+s_2=0; s_1≥−1, s_2≥0}^{k_1+k_2−1} [ (s_1+1) a_{k_1−s_1, k_2−s_2} − s_2 b_{k_1−s_1, k_2−s_2} ] u_1^{(s_1,s_2)}    (4.46a)

    (k_1 − k_2) u_2^{(k_1,k_2)} = Σ_{s_1+s_2=0; s_1≥0, s_2≥−1}^{k_1+k_2−1} [ s_1 a_{k_1−s_1, k_2−s_2} − (s_2+1) b_{k_1−s_1, k_2−s_2} ] u_2^{(s_1,s_2)} ,    (4.46b)

(k_1, k_2) ∈ N_{−1} × N_0 for u_1^{(k_1,k_2)} and (k_1, k_2) ∈ N_0 × N_{−1} for u_2^{(k_1,k_2)}. (An extra detail
on this computation is given in the paragraph of the proof of Theorem 4.3.2 between
displays (4.50) and (4.51).) With the appropriate initialization u_1^{(0,0)} = u_2^{(0,0)} = 1 and
u_1^{(−1,1)} = u_2^{(1,−1)} = 0, and the convention that a_{pq} = b_{qp} = 0 if (p, q) ∉ S, mentioned
above, the coefficients u_1^{(k_1,k_2)} and u_2^{(k_1,k_2)} of (4.45) can be computed recursively
using formulas (4.46), where the recursion is on k_1 + k_2, beginning at k_1 + k_2 = 0.
But at every even value of k_1 + k_2 ≥ 2 there occurs the one pair (k_1, k_2) such that
k_1 = k_2 = k > 0, for which (4.46) becomes I_{kk} = J_{kk} = 0, where
    I_{kk} = Σ_{s_1+s_2=0; s_1≥−1, s_2≥0}^{2k−1} [ (s_1+1) a_{k−s_1, k−s_2} − s_2 b_{k−s_1, k−s_2} ] u_1^{(s_1,s_2)}    (4.47a)

    J_{kk} = Σ_{s_1+s_2=0; s_1≥0, s_2≥−1}^{2k−1} [ s_1 a_{k−s_1, k−s_2} − (s_2+1) b_{k−s_1, k−s_2} ] u_2^{(s_1,s_2)} .    (4.47b)

Of course, when k_1 = k_2 = k, u_1^{(k,k)} and u_2^{(k,k)} can be chosen arbitrarily, but we
typically make the choice u_1^{(k,k)} = u_2^{(k,k)} = 0. The process of creating the inverse of a
normalizing transformation that linearizes (4.27) succeeds only if the expressions on
the right-hand sides of (4.46) are equal to zero for all k ∈ N. That is, for a particular
member of family (4.27) corresponding to (a∗, b∗), the vanishing of the polynomials
I_{kk} and J_{kk} at (a∗, b∗) for all k ∈ N is the condition that the system be linearizable.
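The recursion (4.46)–(4.47) is easy to mechanize. The following Python sketch runs it for one hypothetical member of a quadratic family (ℓ = 2, S = {(1,0), (0,1)}); the coefficient values, and all variable names, are illustrative assumptions and do not come from the text. Coefficients are kept as exact rationals.

```python
from fractions import Fraction as F

# Hypothetical quadratic family: S = {(1,0), (0,1)}, with illustrative
# coefficient values; both dictionaries default to 0 off S, per the
# convention a_{pq} = b_{qp} = 0 for (p, q) not in S.
a = {(1, 0): F(1), (0, 1): F(2)}   # a_{pq}, keyed by (p, q)
b = {(0, 1): F(3), (1, 0): F(5)}   # b_{qp}, keyed by (q, p)

u1 = {(0, 0): F(1), (-1, 1): F(0)}  # u_1^{(k1,k2)} on N_{-1} x N_0
u2 = {(0, 0): F(1), (1, -1): F(0)}  # u_2^{(k1,k2)} on N_0 x N_{-1}
I, J = {}, {}

def rhs(u, k1, k2, first):
    """Right-hand side of (4.46a) if first, of (4.46b) otherwise."""
    total = F(0)
    for (s1, s2), us in u.items():
        if not 0 <= s1 + s2 <= k1 + k2 - 1:
            continue
        av = a.get((k1 - s1, k2 - s2), F(0))   # a_{k1-s1, k2-s2}
        bv = b.get((k1 - s1, k2 - s2), F(0))   # b_{k1-s1, k2-s2}
        coef = (s1 + 1) * av - s2 * bv if first else s1 * av - (s2 + 1) * bv
        total += coef * us
    return total

K = 3
for d in range(1, 2 * K + 1):          # recursion on d = k1 + k2
    for k1 in range(-1, d + 1):        # index set of u_1: k2 = d - k1 >= 0
        k2 = d - k1
        if k1 == k2:
            I[k1] = rhs(u1, k1, k2, True)      # (4.47a)
            u1[(k1, k2)] = F(0)                # the usual choice
        else:
            u1[(k1, k2)] = rhs(u1, k1, k2, True) / (k1 - k2)
    for k1 in range(0, d + 2):         # index set of u_2: k2 = d - k1 >= -1
        k2 = d - k1
        if k1 == k2:
            J[k1] = rhs(u2, k1, k2, False)     # (4.47b)
            u2[(k1, k2)] = F(0)
        else:
            u2[(k1, k2)] = rhs(u2, k1, k2, False) / (k1 - k2)
```

For this sample data the first pair works out to I_{11} = 12 and J_{11} = −25, consistent with J_{kk} = −Î_{kk} (Proposition 4.3.4 below), since conjugation here swaps a_{10} ↔ b_{01} and a_{01} ↔ b_{10}.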

Remark 4.3.1. If a similar procedure is applied to (4.29), then precisely the same
polynomials are obtained: I_{kk} and J_{kk} are the same for systems (4.27) and (4.29).

Thus, in a procedure reminiscent of the attempt in Section 3.3 to construct a first
integral (3.52) for system (3.3) using (3.55) to recursively determine the coefficients
v_{k_1 k_2}, for fixed (a∗, b∗) ∈ E(a, b) we may attempt to construct the inverse (4.45) of
a linearization for system (4.27) or (4.29) in a step-by-step procedure, beginning by
choosing u_1^{(0,0)} = u_2^{(0,0)} = 1 and u_1^{(−1,1)} = u_2^{(1,−1)} = 0 and using (4.46) to recursively
construct all the remaining coefficients in (4.45). The first few coefficients are determined
uniquely, but for k ∈ N, u_1^{(k,k)} and u_2^{(k,k)} exist only if I_{kk} and J_{kk} given by
(4.47) are zero, in which case u_1^{(k,k)} and u_2^{(k,k)} may be selected arbitrarily. For k_0 ≥ 2,
the values of I_{k_0 k_0} and J_{k_0 k_0} seemingly depend on the choices made earlier for u_1^{(k,k)}
and u_2^{(k,k)} for k < k_0. Hence, although I_{kk} = J_{kk} = 0 for all k ∈ N means that the
procedure succeeds and system (a∗, b∗) is linearizable, it does not automatically follow
that if (I_{k_0 k_0}, J_{k_0 k_0}) ≠ (0, 0) for some k_0, then the system (a∗, b∗) is not linearizable,
since it is conceivable that for different choices of u_1^{(k,k)} and u_2^{(k,k)} for k < k_0 we
would have gotten (I_{k_0 k_0}, J_{k_0 k_0}) = (0, 0). We will now show that this is not the case
but that, on the contrary, whether or not I_{kk} = J_{kk} = 0 for all k ∈ N is independent of
the choices made for u_1^{(k,k)} and u_2^{(k,k)}.
In order to do so, we shift our focus from a single system corresponding to a
choice (a∗, b∗) of the coefficients (a, b) in (4.27) or (4.29) to the full family
parametrized by (a, b) ∈ E(a, b) = C^{2ℓ} that corresponds to the choice of a fixed ℓ-element
index set S. Now the initialization is u_1^{(0,0)} = u_2^{(0,0)} ≡ 1 and u_1^{(−1,1)} = u_2^{(1,−1)} ≡ 0 in
C[a, b], and for an arbitrary choice of elements u_1^{(k,k)} and u_2^{(k,k)} in C[a, b], k ∈ N,
formulas (4.46) and (4.47) recursively define a collection of polynomials in C[a, b]: u_1^{(k_1,k_2)}
for (k_1, k_2) ∈ N_{−1} × N_0 but k_1 ≠ k_2, u_2^{(k_1,k_2)} for (k_1, k_2) ∈ N_0 × N_{−1} but k_1 ≠ k_2, and
I_{kk} and J_{kk} for k ∈ N. As is the case with the focus quantities g_{kk} for system (3.3) and
the coefficients Y_1^{(k+1,k)}, Y_2^{(k,k+1)} of the normal form (4.22), the polynomials I_{kk}, J_{kk}
are not determined uniquely, the indefiniteness in this case arising from the freedom
we have in choosing the coefficients u_1^{(k,k)} and u_2^{(k,k)} (in C[a, b]) of an inverse
normalizing transformation that is to linearize (4.27). But the varieties in the space of
parameters E(a, b) = C^{2ℓ} defined by these polynomials should be the same regardless
of those choices. Theorem 4.2.2 (or alternatively Proposition 4.2.7) established
this fact in the case of Y_1^{(k+1,k)} and Y_2^{(k,k+1)}. The following theorem states that it is
true for I_{kk} and J_{kk}.
Theorem 4.3.2. Let a set S, hence families (4.27) and (4.29), be given, and let I_{kk}
and J_{kk}, k ∈ N, be any collection of polynomials generated recursively by (4.46)
and (4.47) from the initialization u_1^{(0,0)} = u_2^{(0,0)} ≡ 1 and u_1^{(−1,1)} = u_2^{(1,−1)} ≡ 0 and
for some specific but arbitrary choice of u_1^{(k,k)} and u_2^{(k,k)}, k ∈ N (see Remark 4.3.1).
Then the linearizability variety of the systems (4.27) and (4.29) (see Remark 4.2.6)
as given by Definition 4.2.5 (and Proposition 4.2.7) coincides with the variety
V(⟨I_{kk}, J_{kk} : k ∈ N⟩).
Proof. The reasoning is identical whether family (4.27) or (4.29) is under consideration,
so we work with just (4.29). For convenience we let ũ_1^{(k,k)}, ũ_2^{(k,k)}, Ĩ_{kk}, and
J̃_{kk} denote the specific elements of C[a, b] that were specified in the statement of the
theorem.
Suppose system (a∗, b∗) ∈ V(⟨Ĩ_{kk}, J̃_{kk} : k ∈ N⟩). Then the recursive process using
(4.46) to create a transformation that will transform

    ẏ_1 = y_1 ,    ẏ_2 = −y_2    (4.48)

into (4.29) for (a, b) = (a∗, b∗) succeeds when the constants u_1^{(k,k)} and u_2^{(k,k)} are
chosen to be ũ_1^{(k,k)}(a∗, b∗) and ũ_2^{(k,k)}(a∗, b∗), respectively, k ∈ N; call the resulting
formal transformation U. Then the formal inverse U^{−1} is a normalizing transformation
that linearizes (4.29) for (a, b) = (a∗, b∗). By Corollary 4.2.3, system (4.29) for
(a, b) = (a∗, b∗) is therefore linearizable, and so (a∗, b∗) ∈ V(Y).
Conversely, suppose (a∗, b∗) ∈ V(Y). This means there exists a normalizing
transformation H that transforms (4.29) for (a, b) = (a∗, b∗) into (4.48). Then H^{−1}
is a change of coordinates of the form (4.45) that transforms (4.48) back into (4.29)
for (a, b) = (a∗, b∗). This latter transformation corresponds to a choice of constants
u_1^{(k,k)} and u_2^{(k,k)} for which the corresponding constants I_{kk} and J_{kk} are all zero. Now
begin the recursive procedure of constructing a transformation of the form (4.45)
for system (a∗, b∗), but at every step using for our choice of the constants u_1^{(k,k)} and
u_2^{(k,k)} the values ũ_1^{(k,k)}(a∗, b∗) and ũ_2^{(k,k)}(a∗, b∗). Suppose, contrary to what we wish
to show, that for some (least) K ∈ N, Ĩ_{KK}(a∗, b∗) ≠ 0 or J̃_{KK}(a∗, b∗) ≠ 0, so the
recursive process halts, and consider the polynomial transformation T constructed so
far:

    y_1 = x_1 + Σ_{j+k=2; (j,k)≠(K+1,K)}^{2K+1} u_1^{(j−1,k)} x_1^j x_2^k ,    y_2 = x_2 + Σ_{j+k=2; (j,k)≠(K,K+1)}^{2K+1} u_2^{(j,k−1)} x_1^j x_2^k .    (4.49)
We will check how close T^{−1} comes to linearizing (a∗, b∗) by examining ẏ_1 − y_1
and ẏ_2 + y_2. The key computational insight is to maintain the old coordinates until
the last step, avoiding having to actually work with T^{−1}. We start with ẏ_2 + y_2, for
which we will do the computation in detail. Differentiation of the second equation
in (4.49) and application of (4.29) yields

    ẏ_2 + y_2 = Σ_{j+k=2; (j,k)≠(K,K+1)}^{2K+1} (1 + j − k) u_2^{(j,k−1)} x_1^j x_2^k + Σ_{(p,q)∈S} b_{qp} x_1^q x_2^{p+1}

              + Σ_{j+k=2; (j,k)≠(K,K+1)}^{2K+1} Σ_{(p,q)∈S} [ u_2^{(j,k−1)} k b_{qp} x_1^{q+j} x_2^{p+k} ]    (4.50)

              − Σ_{j+k=2; (j,k)≠(K,K+1)}^{2K+1} Σ_{(p,q)∈S} [ u_2^{(j,k−1)} j a_{pq} x_1^{p+j} x_2^{q+k} ] .

Setting D := {(j, k) ∈ N_0 × N_{−1} : 2 ≤ j + k ≤ r + s − 2} ∪ {(r, s − 1)}, for (r, s) ∈ N²,
r + s ≥ 2, a contribution to the coefficient of x_1^r x_2^s in (4.50) is made by u_2^{(k_1,k_2)} if and
only if (k_1, k_2) ∈ D. There is complete cancellation in the coefficient of x_1^r x_2^s provided
(K, K) ∉ D. That is certainly true if r + s − 1 < 2K. For r + s = 2K + 1, if
(r, s) ≠ (K, K + 1), then (K, K) ∉ D, but if (r, s) = (K, K + 1), then the last pair listed
in D is (K, K). Thus the only term present in (4.50) is that corresponding to x_1^K x_2^{K+1}.
To find its coefficient, we repeat the reasoning that is used to derive (4.46b) from
the analogue of (4.50): identify, for each pair (r, s) ∈ N², r + s ≥ 2, the contribution
of each sum in (4.50) to the coefficient of x_1^r x_2^s (only in deriving (4.46b) we
set the resulting sum to zero). Here the first sum contributes nothing to the coefficient
of x_1^r x_2^s and the second sum contributes precisely b_{KK}. In the third sum, for any
(j, k), (p + j, q + k) = (r, s) forces (p, q) = (r − j, s − k); j + k is maximized when
p + q is minimized, so max(j + k) + min(p + q) = max(j + k) + 1 = r + s, and the
contribution is Σ_{j+k=2}^{2K} k u_2^{(j,k−1)} b_{K−j, K−k+1}. Similarly, the fourth sum contributes
−Σ_{j+k=2}^{2K} j u_2^{(j,k−1)} a_{K−j, K−k+1}, so that ultimately we obtain, using nothing more
about T^{−1} than that it must have the form (x_1, x_2) = T^{−1}(y_1, y_2) = (y_1 + ⋯, y_2 + ⋯),
that ẏ_2 + y_2 = −J_{KK} y_1^K y_2^{K+1} + ⋯. Similarly, ẏ_1 − y_1 = I_{KK} y_1^{K+1} y_2^K + ⋯. We conclude
that the transformation x = T^{−1}(y) = y + ⋯ transforms system (a∗, b∗) into

    ẏ_1 = y_1 + Ĩ_{KK}(a∗, b∗) y_1^{K+1} y_2^K + r_1(y_1, y_2)
                                                              (4.51)
    ẏ_2 = −y_2 − J̃_{KK}(a∗, b∗) y_1^K y_2^{K+1} + r_2(y_1, y_2),

where r_1(y_1, y_2) and r_2(y_1, y_2) are analytic functions beginning with terms of order
at least 2K + 2. By Proposition 3.2.2, (4.51) is a normal form through order
2K + 1 but is not necessarily a normal form since we do not know that higher-order
nonresonant terms have been eliminated. However, we can freely change
any term of order 2K + 2 or greater in T^{−1} without changing terms in (4.51)
of order less than or equal to 2K + 1; hence there exists a normalizing transformation
x = S(y) = y + s(y) that agrees with T^{−1} through order 2K + 1, hence
whose inverse agrees with T through order 2K + 1, and which produces a normal
form that agrees with (4.51) through order 2K + 1. But then the assumption
that (Ĩ_{KK}(a∗, b∗), J̃_{KK}(a∗, b∗)) ≠ (0, 0) leads to a contradiction: for we know that
H linearizes (4.29) for (a, b) = (a∗, b∗), hence by Theorem 4.2.2 so does S. Thus
(Ĩ_{KK}(a∗, b∗), J̃_{KK}(a∗, b∗)) = (0, 0), and (a∗, b∗) ∈ V(⟨I_{kk}, J_{kk} : k ∈ N⟩). □

Theorem 4.3.2 justifies the following definition.

Definition 4.3.3. Let a set S, hence families (4.27) and (4.29), be given. For any
choice of the polynomials u_1^{(k,k)} and u_2^{(k,k)}, k ∈ N, in C[a, b], the polynomials I_{kk}
and J_{kk} defined recursively by (4.46) and by (4.47) (starting with the initialization
u_1^{(0,0)} = u_2^{(0,0)} ≡ 1 and u_1^{(−1,1)} = u_2^{(1,−1)} ≡ 0 and the convention that a_{pq} = b_{qp} = 0 if
(p, q) ∉ S) are the kth linearizability quantities. The ideal ⟨I_{kk}, J_{kk} : k ∈ N⟩ that they
generate is the linearizability ideal and is denoted L.

To recapitulate, with regard to isochronicity and linearizability we have one variety
with several characterizations and names:

    V(H) ∩ V_C = V_I = V_L = V(Y) = V(L) .

The first and third equalities are definitions and the second and fourth are by Proposition
4.2.7 and Theorem 4.3.2, respectively.
With a proper choice of the coefficients u_1^{(k,k)} and u_2^{(k,k)}, the quantities J_{kk} can
be computed immediately from the quantities I_{kk}, as described by the following
proposition. As always, f̂ denotes the involution of Definition 3.4.3.

Proposition 4.3.4. Let a set S, hence families (4.27) and (4.29), be given.
1. If, for all k ∈ N, u_1^{(k,k)} and u_2^{(k,k)} are chosen so as to satisfy u_2^{(k,k)} = û_1^{(k,k)}, then
   u_2^{(k,j)} = û_1^{(j,k)} for all (j, k) ∈ N_{−1} × N_0.
2. If, for all k ∈ N, u_1^{(k,k)} and u_2^{(k,k)} are chosen as in (1), then J_{kk} = −Î_{kk} for all k ∈ N.

Proof. The first part follows directly from formula (4.46) by induction on j + k.
(Exercise 4.11.) The second part follows by a direct application of part (1) of the
proposition to (4.47). □
Note that the usual choices u_1^{(k,k)} = 0 and u_2^{(k,k)} = 0 satisfy the condition in part
(1) of the proposition. Moreover, since clearly u_1^{(k,k)} and u_2^{(k,k)} have coefficients that
lie in Q, the same is true of I_{kk}; hence, because its coefficients are real, Î_{kk} can be
obtained from I_{kk} merely by replacing every monomial [ν] that appears in I_{kk} by the
monomial [ν̂].
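Because the coefficients are rational, this monomial replacement is purely combinatorial. If, as in Definition 3.4.3, the variables are ordered (a_{p_1 q_1}, …, a_{p_ℓ q_ℓ}, b_{q_ℓ p_ℓ}, …, b_{q_1 p_1}) and ν̂ is the reversed exponent string, then conjugation of a polynomial is just tuple reversal on its monomials. A minimal sketch (the sample polynomial and the ℓ = 2 variable ordering are illustrative assumptions, not data from the text):

```python
# The involution f -> f_hat acts monomial-by-monomial.  With the variables
# ordered (a_{p1 q1}, ..., a_{pl ql}, b_{ql pl}, ..., b_{q1 p1}), the
# conjugate exponent string nu_hat is assumed to be nu reversed, so on this
# representation conjugation is tuple reversal.
def hat(poly):
    """f -> f_hat for a polynomial stored as {exponent tuple: coefficient}."""
    out = {}
    for nu, c in poly.items():
        out[nu[::-1]] = out.get(nu[::-1], 0) + c
    return out

# Illustrative (1,1)-polynomial for l = 2, variables (a10, a01, b10, b01).
sample = {(1, 1, 0, 0): 1,   # a10*a01: its conjugate is b10*b01
          (0, 1, 1, 0): 1}   # a01*b10: self-conjugate

J_from_I = {nu: -c for nu, c in hat(sample).items()}  # J_kk = -I_kk_hat
```

Since hat is an involution, applying it twice returns the original polynomial, which gives a cheap consistency check for an implementation.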
The properties of the focus quantities and the coefficients of the function Ψ as
described in Theorem 3.4.2 have analogues for the linearizability quantities Ikk and
Jkk and for the coefficients of the transformation (4.45). The precise statements are
given in the following theorem.

Theorem 4.3.5. Let a set S, hence families (4.27) and (4.29), be given.
1. For every (j, k) ∈ N_{−1} × N_0 (respectively, for every (j, k) ∈ N_0 × N_{−1}) the coefficient
   u_1^{(j,k)} (respectively, u_2^{(j,k)}) of the transformation (4.45) is a (j, k)-polynomial
   with coefficients in Q.
2. For every k ∈ N, the linearizability quantities I_{kk} and J_{kk} are (k, k)-polynomials
   with coefficients in Q.

Proof. As just noted, it is immediate from (4.46) that u_1^{(j,k)}, u_2^{(j,k)}, I_{kk}, and J_{kk} lie in
Q[a, b]. The proof that they are (j, k)- and (k, k)-polynomials is precisely the same
inductive argument as for the proof of points (2) and (4) of Theorem 3.4.2. □

Theorem 3.4.5 also has an analogue for linearizability. We adopt the notation
introduced in the paragraph preceding Definition 3.4.1, including letting ℓ denote
the cardinality of the index set S. To state the result we must first generalize the
function V on N^{2ℓ} given by (3.76).
Let a set S, hence a family (4.27), be given, and fix (m, n) ∈ {(0, 1), (1, 0), (1, 1)}.
For any ν ∈ N_0^{2ℓ} define V_{(m,n)}(ν) ∈ Q recursively, with respect to |ν| = ν_1 + ⋯ + ν_{2ℓ},
as follows:
    V_{(m,n)}((0, …, 0)) = 1;    (4.52a)

for ν ≠ (0, …, 0),

    V_{(m,n)}(ν) = 0 if L_1(ν) = L_2(ν);    (4.52b)

and when L_1(ν) ≠ L_2(ν),

    V_{(m,n)}(ν) = 1/(L_1(ν) − L_2(ν))
                 × [ Σ_{j=1}^{ℓ} Ṽ_{(m,n)}(ν_1, …, ν_j − 1, …, ν_{2ℓ}) (L_1(ν_1, …, ν_j − 1, …, ν_{2ℓ}) + m)
                   − Σ_{j=ℓ+1}^{2ℓ} Ṽ_{(m,n)}(ν_1, …, ν_j − 1, …, ν_{2ℓ}) (L_2(ν_1, …, ν_j − 1, …, ν_{2ℓ}) + n) ] ,    (4.52c)

where

    Ṽ_{(m,n)}(η) = V_{(m,n)}(η) if η ∈ N_0^{2ℓ},  and  Ṽ_{(m,n)}(η) = 0 if η ∈ N_{−1}^{2ℓ} \ N_0^{2ℓ} .

The function V_{(1,1)} is the function defined by (3.76). V_{(1,0)} pertains to I_{kk} and V_{(0,1)}
pertains to J_{kk}, but because of the relationship described in Proposition 4.3.4(2),
we will need to consider only V_{(1,0)}, for which we now prove that the analogue of
Lemma 3.4.4 is true.

Lemma 4.3.6. Suppose ν ∈ N_0^{2ℓ} is such that either L_1(ν) < −1 or L_2(ν) < −1. Then
V_{(1,0)}(ν) = 0.

Proof. The proof is by induction on |ν|. The basis step is practically the same as that
in the proof of Lemma 3.4.4, as are the parts of the inductive step in the situations
L_1(ν) < −2, L_2(ν) < −2, and L_1(ν) = −2, so we will not repeat them. We are left
with showing that if the lemma is true for |ν| ≤ m, and if ν is such that |ν| = m + 1,
L_2(ν) = −2, and L_1(ν) ≥ −1 (else we are in a previous case), then V_{(1,0)}(ν) = 0.
For this part of the inductive step the old proof does not suffice. It does show that
L_2(µ) = L_2(ν) − q_j = −2 − q_j ≤ −2 when µ is the argument of any term in the first
sum in (4.52c), so that by the induction hypothesis, V_{(1,0)}(µ) = 0, which implies
Ṽ_{(1,0)}(µ) = 0, while L_2(µ) = L_2(ν) − p_j = −2 − p_j if µ is the argument of any term
in the second sum in (4.52c), so that if p_j ≥ 0, then the induction hypothesis applies
to give Ṽ_{(1,0)}(µ) = 0. Thus we know that if L_2(ν) = −2 (and L_1(ν) ≥ −1, to avoid
a known case), then expression (4.52c) for V_{(1,0)}(ν) reduces to

    V_{(1,0)}(ν) = −1/(L_1(ν) − L_2(ν)) × Σ_{j=ℓ+1; p_j=−1}^{2ℓ} Ṽ_{(1,0)}(ν_1, …, ν_j − 1, …, ν_{2ℓ}) L_2(ν_1, …, ν_j − 1, …, ν_{2ℓ}) .    (4.53)
To show that V_{(1,0)}(ν) = 0, we consider the iterative process by which V_{(1,0)}(ν)
is evaluated. It consists of m additional steps: each summand in (4.53), call it
Ṽ_{(1,0)}(µ_0) L_2(µ_0), is replaced by a sum of up to 2ℓ terms, each of which has the
form Ṽ_{(1,0)}(µ_1) M_1(µ_1) L_2(µ_0), where µ_1 is derived from µ_0 by decreasing exactly
one entry by 1 and where M_1(µ_1) is either L_1(µ_1) + 1 or L_2(µ_1), then each of these
terms is replaced by a sum of up to 2ℓ terms similarly, and so on until ultimately
(4.53) is reduced to a sum of terms each of which is of the form

    C · Ṽ_{(1,0)}(0, …, 0) M_m(µ_m) M_{m−1}(µ_{m−1}) ⋯ M_2(µ_2) M_1(µ_1) L_2(µ_0),    (4.54)

where C ∈ Q, where M_k(µ_k) = L_1(µ_k) + 1 or M_k(µ_k) = L_2(µ_k), and where µ_{k+1} is
derived from µ_k by decreasing exactly one entry by 1.
Consider the sequence of values L_2(µ_0), L_2(µ_1), …, L_2(µ_m) that is created in
forming any such term from a summand in (4.53). For some j ∈ {ℓ + 1, …, 2ℓ},
L_2(µ_0) = L_2(ν) − p_j = −2 + 1 = −1; L_2(µ_m) = L_2(0, …, 0) = 0. On the rth step,
supposing µ_r = µ_{r−1} + (0, …, 0, −1, 0, …, 0), with the −1 in the rth position, the
value of L_2 decreases by q_r ≥ 0 if 1 ≤ r ≤ ℓ, decreases by p_{2ℓ−r+1} if r > ℓ and
p_{2ℓ−r+1} ≥ 0, and increases by 1 if r > ℓ and p_{2ℓ−r+1} = −1. Let w denote the
number of the step of the reduction process on which the value of L_2 changes to 0
for the last time; that is, L_2(µ_{w−1}) ≠ 0, L_2(µ_k) = 0 for k ≥ w. If L_2(µ_{w−1}) < 0,
then L_2(µ_w) is obtained from L_2(µ_{w−1}) by an increase of 1, hence the value of the
index j of the entry that decreased in order to form µ_w exceeds ℓ, hence
M_w(µ_w) = L_2(µ_w) = 0. If L_2(µ_{w−1}) > 0, then the value of L_2 must have increased
across 0 on some earlier sequence of steps. Thus for some v < w, L_2(µ_{v−1}) increased
to L_2(µ_v) = 0 by an increase of 1 unit, so that on that step the index j of the entry
in µ_{v−1} that decreased to form µ_v exceeded ℓ, so M_v(µ_v) = L_2(µ_v) = 0. Thus for
at least one index k in the product (4.54), M_k(µ_k) = L_2(µ_k) = 0. Hence V_{(1,0)}(ν)
evaluates to a sum of zeros, and the inductive step is complete, proving the lemma. □

Here then is the analogue of Theorem 3.4.5 for linearizability.

Theorem 4.3.7. Let a set S, hence families (4.27) and (4.29), be given. Define four
constant polynomials by u_1^{(−1,1)} = u_2^{(1,−1)} = 0, u_1^{(0,0)} = u_2^{(0,0)} = 1, and by recursion
on j + k generate polynomials u_1^{(j,k)} and u_2^{(j,k)} in Q[a, b] using equation (4.46), where
the choices u_1^{(k,k)} = 0 and u_2^{(k,k)} = 0 are always made. For k ∈ N, let I_{kk} and J_{kk} be
the polynomials defined by equation (4.47). Let f̂ denote the conjugate of f as given
by Definition 3.4.3.
1. The coefficient of [ν] in the polynomial u_1^{(j,k)} and the coefficient of [ν̂] in the
   polynomial u_2^{(k,j)} are equal to V_{(1,0)}(ν_1, ν_2, …, ν_{2ℓ}).
2. The linearizability quantities I_{kk} and J_{kk} of family (4.27) are given by

    I_{kk} = Σ_{{ν : L(ν)=(k,k)}} I_{kk}^{(ν)} [ν]  and  J_{kk} = Σ_{{ν : L(ν)=(k,k)}} J_{kk}^{(ν)} [ν],

where

    I_{kk}^{(ν)} = Σ_{j=1}^{ℓ} Ṽ_{(1,0)}(ν_1, …, ν_j − 1, …, ν_{2ℓ}) (L_1(ν_1, …, ν_j − 1, …, ν_{2ℓ}) + 1)
              − Σ_{j=ℓ+1}^{2ℓ} Ṽ_{(1,0)}(ν_1, …, ν_j − 1, …, ν_{2ℓ}) L_2(ν_1, …, ν_j − 1, …, ν_{2ℓ})    (4.55)

and J_{kk}^{(ν)} = −Ī_{kk}^{(ν̂)}.

Proof. The proofs of the assertion made about u_1^{(j,k)} in point (1) and of the assertion
made about I_{kk} in point (2) are identical to the proofs of the corresponding points in
the proof of Theorem 3.4.5, where Lemma 4.3.6 is used in place of Lemma 3.4.4 at
one point. The assertions concerning u_2^{(j,k)} and J_{kk} are just Proposition 4.3.4, based
on the choices u_1^{(k,k)} = 0 and u_2^{(k,k)} = 0 and properties of the conjugation. □

Thus we see that the linearizability quantities can be computed by practically the
same formulas that we obtained for the focus quantities in Section 3.4. An algorithm
for their computation analogous to that displayed in Table 3.1 on page 128 for the
focus quantities is given in Table 4.1 on page 199. An implementation in Mathematica
is in the Appendix. As always, if we are interested in linearizability conditions
for a real polynomial family of the form (4.1), we can obtain them by computing
linearizability conditions for the family of systems (4.27) on C² that arises by
complexification of the original real family and then replacing every occurrence of b_{kj}
by ā_{jk}.

Linearizability Quantities Algorithm

Input:  K ∈ N
        Ordered set S = {(p_1, q_1), …, (p_ℓ, q_ℓ)} ⊂ ({−1} ∪ N_0)²
        satisfying p_j + q_j ≥ 1, 1 ≤ j ≤ ℓ

Output: Linearizability quantities I_{kk}, J_{kk}, 1 ≤ k ≤ K, for family (4.27)

Procedure:
    w := min{p_1 + q_1, …, p_ℓ + q_ℓ}
    M := ⌊2K/w⌋
    I_{11} := 0; …; I_{KK} := 0
    J_{11} := 0; …; J_{KK} := 0
    V_{(1,0)}(0, …, 0) := 1
    FOR m = 1 TO M DO
        FOR ν ∈ N_0^{2ℓ} such that |ν| = m DO
            Compute L(ν) using (3.71)
            Compute V_{(1,0)}(ν) using (4.52)
            IF L_1(ν) = L_2(ν) THEN
                Compute I_{L(ν)}^{(ν)} using (4.55)
                I_{L(ν)} := I_{L(ν)} + I_{L(ν)}^{(ν)} [ν]
        FOR ν ∈ N_0^{2ℓ} such that |ν| = m DO
            Compute L(ν) using (3.71)
            IF L_1(ν) = L_2(ν) THEN
                J_{L(ν)} := J_{L(ν)} − I_{L(ν)}^{(ν̂)} [ν]

Table 4.1 The Linearizability Quantities Algorithm
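Table 4.1 translates almost line for line into code. The sketch below (Python; the quadratic set S, the exact form of L(ν) from (3.71), and all names are illustrative assumptions) computes V_{(1,0)} by the recursion (4.52) and assembles I_{11} from (4.55), representing each monomial [ν] by its exponent tuple.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical quadratic set S = {(1,0), (0,1)} (l = 2).  L(nu) is assumed
# to be the linear map of (3.71): position j is paired with (p_j, q_j) for
# j <= l and with the reversed pairs (q_j, p_j), taken in reversed order,
# for j > l.
S = [(1, 0), (0, 1)]
l = len(S)
pairs = list(S) + [(q, p) for (p, q) in reversed(S)]

def L(nu):
    return (sum(n * p for n, (p, _) in zip(nu, pairs)),
            sum(n * q for n, (_, q) in zip(nu, pairs)))

memo = {}
def V10(nu):                              # V_{(1,0)} via (4.52)
    if min(nu) < 0:                       # the tilde-V convention
        return F(0)
    if all(n == 0 for n in nu):
        return F(1)
    if nu not in memo:
        L1, L2 = L(nu)
        if L1 == L2:
            memo[nu] = F(0)
        else:
            total = F(0)
            for j in range(2 * l):
                mu = nu[:j] + (nu[j] - 1,) + nu[j + 1:]
                M1, M2 = L(mu)
                # first sum of (4.52c) with m = 1, second sum with n = 0
                total += V10(mu) * (M1 + 1) if j < l else -V10(mu) * M2
            memo[nu] = total / (L1 - L2)
    return memo[nu]

def I_coeff(nu):                          # I_kk^{(nu)} via (4.55)
    total = F(0)
    for j in range(2 * l):
        mu = nu[:j] + (nu[j] - 1,) + nu[j + 1:]
        M1, M2 = L(mu)
        total += V10(mu) * (M1 + 1) if j < l else -V10(mu) * M2
    return total

I11 = {}
for nu in product(range(2), repeat=2 * l):   # L(nu) = (1,1) forces nu_j <= 1
    if L(nu) == (1, 1):
        c = I_coeff(nu)
        if c != 0:
            I11[nu] = c
```

With the variable ordering (a_{10}, a_{01}, b_{10}, b_{01}) this yields I_{11} = a_{10} a_{01} + a_{01} b_{10} for this hypothetical S; J_{11} is then obtained by negating and reversing each exponent tuple, as in the last loop of the table.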

4.4 Darboux Linearization

In Chapter 3 we saw that the problem of finding the center variety of system (3.3)
splits into two parts: in the first part we compute an initial string of some number r
of the focus quantities until it appears that V(B) = V(B_K) for some number K, and
find the minimal decomposition into irreducible components of the variety V(B_K)
so obtained (thus deriving a collection of necessary conditions for a center); in the
second part we prove that any system from any component of V(B_K) actually has
a center at the origin, typically by constructing a first integral, and thus proving that
the necessary conditions for a center derived in the first part are also sufficient.
The method for finding the linearizability variety of (3.3) (which is the same as
(4.27)) is analogous: we begin by computing the first few linearizability quantities
and find the irreducible decomposition of the variety obtained, then we check that
all systems from the variety are linearizable. In this section we present one of the
most efficient tools for performing such a check, namely, the method of Darboux
linearization. In the definition that follows, the coordinate transformation in ques-
tion is actually the inverse of what we have heretofore called the linearization of the
original system (see Section 2.3 and in particular the discussion surrounding Defi-
nition 2.3.4). We will continue to use this mild abuse of language for the remainder
of this section.

Definition 4.4.1. For x = (x_1, x_2) ∈ C², a Darboux linearization of a polynomial
system of the form (2.27), ẋ = Ax + X(x), is an analytic change of variables

    y_1 = Z(x_1, x_2),    y_2 = W(x_1, x_2)    (4.56)

whose inverse linearizes (2.27) and is such that Z(x_1, x_2) and W(x_1, x_2) are of the
form

    Z(x_1, x_2) = Π_{j=0}^{m} f_j^{α_j}(x_1, x_2) = x_1 + Z′(x_1, x_2)

    W(x_1, x_2) = Π_{j=0}^{n} g_j^{β_j}(x_1, x_2) = x_2 + W′(x_1, x_2),

where f_j, g_j ∈ C[x_1, x_2], α_j, β_j ∈ C, and Z′ and W′ begin with terms of order at
least two. A generalized Darboux linearization is a transformation (4.56) for which
the functions Z(x_1, x_2) and W(x_1, x_2) are Darboux functions (see Definition 3.6.7).
A system is Darboux linearizable (respectively, generalized Darboux linearizable)
if it admits a Darboux (respectively, generalized Darboux) linearization.

Theorem 4.4.2. Fix a polynomial system of the form (4.29).

1. The system is Darboux linearizable if and only if there exist s + 1 ≥ 1 algebraic
partial integrals f_0, …, f_s with corresponding cofactors K_0, …, K_s and t + 1 ≥ 1
algebraic partial integrals g_0, …, g_t with corresponding cofactors L_0, …, L_t with
the following properties:
a. f_0(x_1, x_2) = x_1 + ⋯ but f_j(0, 0) = 1 for j ≥ 1;
b. g_0(x_1, x_2) = x_2 + ⋯ but g_j(0, 0) = 1 for j ≥ 1; and
c. there are s + t constants α_1, …, α_s, β_1, …, β_t ∈ C such that

    K_0 + α_1 K_1 + ⋯ + α_s K_s = 1  and  L_0 + β_1 L_1 + ⋯ + β_t L_t = −1 .    (4.57)

The Darboux linearization is then given by

    y_1 = Z(x_1, x_2) = f_0 f_1^{α_1} ⋯ f_s^{α_s} ,    y_2 = W(x_1, x_2) = g_0 g_1^{β_1} ⋯ g_t^{β_t} .

2. The system is generalized Darboux linearizable if and only if the same conditions
as in part (1) hold, with the following modification: either (i) s ≥ 1 and f1 is an
exponential factor rather than an algebraic partial integral or (ii) t ≥ 1 and g1
is an exponential factor rather than an algebraic partial integral (or both (i) and
(ii) hold), and (4.57) holds with α1 = 1 (if s ≥ 1) and β1 = 1 (if t ≥ 1).
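A toy illustration of part (1) may help (the system here is an illustrative assumption, not an example from the text): for ẋ_1 = x_1 + x_1², ẋ_2 = −x_2, the functions f_0 = x_1 and f_1 = 1 + x_1 are algebraic partial integrals with cofactors K_0 = 1 + x_1 and K_1 = x_1, so K_0 + α_1 K_1 = 1 with α_1 = −1, while g_0 = x_2 has constant cofactor L_0 = −1. The sketch below checks with exact arithmetic that y_1 = Z = x_1(1 + x_1)^{−1} satisfies ẏ_1 = y_1 along the flow.

```python
from fractions import Fraction as F

# Hypothetical system (mine, not from the text): x1' = x1 + x1^2, x2' = -x2.
# f0 = x1 has cofactor K0 = 1 + x1; f1 = 1 + x1 has cofactor K1 = x1,
# since X f1 = x1 + x1^2 = (1 + x1) * x1.  Hence K0 - K1 = 1, and Theorem
# 4.4.2(1) predicts Z = f0 * f1^{-1} = x1/(1 + x1) satisfies Z' = Z; the
# second component W = g0 = x2 is already linear.
def check(x1):
    dx1 = x1 + x1 * x1                 # x1'
    y1 = x1 / (1 + x1)                 # Z(x1)
    dy1 = dx1 / (1 + x1) ** 2          # chain rule: (dZ/dx1) * x1'
    return dy1 == y1

assert all(check(F(n, 7)) for n in range(1, 5))
```

The check is exact because dy_1 = x_1(1 + x_1)/(1 + x_1)² = x_1/(1 + x_1) = y_1 identically wherever 1 + x_1 ≠ 0.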

Proof. The vector field X is a derivation: for smooth functions f and g,

    X(fg) = f Xg + g Xf ,    X(f/g) = (g Xf − f Xg)/g² ,    X e^f = e^f Xf ,    (4.58)
which may be verified using the definition of Xf and straightforward computations.
Suppose that for a polynomial system (4.29) there exist s + 1 ≥ 1 algebraic partial
integrals f_0, …, f_s and t + 1 ≥ 1 algebraic partial integrals g_0, …, g_t that satisfy the
conditions (a), (b), and (c). Form the mapping

    y_1 = Z(x_1, x_2) = f_0 f_1^{α_1} ⋯ f_s^{α_s} ,    y_2 = W(x_1, x_2) = g_0 g_1^{β_1} ⋯ g_t^{β_t} ,    (4.59)

which by the conditions imposed is analytic and by the Inverse Function Theorem
has an analytic inverse x_1 = U(y_1, y_2), x_2 = V(y_1, y_2) on a neighborhood of the origin
in C². For ease of exposition we introduce α_0 = 1. Then differentiation of the first
equation in (4.59) with respect to t yields

    ẏ_1 = X(f_0^{α_0} ⋯ f_s^{α_s}) = Σ_{j=0}^{s} f_0^{α_0} ⋯ α_j f_j^{α_j −1} X(f_j) ⋯ f_s^{α_s}

        = Σ_{j=0}^{s} α_j K_j f_0^{α_0} ⋯ f_s^{α_s}    (4.60)

        = f_0^{α_0} ⋯ f_s^{α_s} Σ_{j=0}^{s} α_j K_j = Z = y_1 .
Similarly, ẏ_2 = −y_2, so the system has been linearized by the transformation (4.59).
If f_1 or g_1 exists and is an exponential factor meeting the conditions of the
theorem, the proof of sufficiency is identical. In either case, the pair of identities
Z(x_1, x_2) = x_1 + ⋯ and W(x_1, x_2) = x_2 + ⋯ is forced by the lowest-order terms of
the original system and its linearization. Note that this forces the presence of at least
one algebraic partial integral, to play the role of f_0. Moreover, there is no loss of
generality in assuming α_1 = 1 and β_1 = 1, since a power can be absorbed into the
exponential factor.
Conversely, suppose there exists a generalized Darboux linearization of systems
(3.3) and (3.69), and let the first component be

    y_1 = Z(x_1, x_2) = e^{f/g} h_1^{γ_1} ⋯ h_s^{γ_s} .

Without loss of generality we may assume that the polynomials h_j are irreducible
and relatively prime and that f and g are relatively prime. Then using (4.58),

    ẏ_1 = XZ = e^{f/g} X(f/g) Π_{j=1}^{s} h_j^{γ_j} + e^{f/g} Σ_{k=1}^{s} ( Π_{j≠k} h_j^{γ_j} ) γ_k h_k^{γ_k −1} Xh_k = e^{f/g} Π_{j=1}^{s} h_j^{γ_j} .

For any index value w, each term on each side of the last equation contains h_w^{γ_w −1}
as a factor, hence we may divide through by e^{f/g} h_1^{γ_1 −1} ⋯ h_s^{γ_s −1} to obtain

    X(f/g) h_1 ⋯ h_s + Σ_{k=1}^{s} ( Π_{j≠k} h_j ) γ_k Xh_k = h_1 ⋯ h_s .

For any index value w, every term except the summand corresponding to k = w
contains h_w as a factor, hence is divisible by h_w. Thus the summand corresponding
to k = w is divisible by h_w as well, and since the polynomials h_j are relatively
prime, h_w divides Xh_w. This means that h_w is an algebraic partial integral, say with
cofactor K_w, and the equation reduces to

    X(f/g) + Σ_{k=1}^{s} γ_k K_k = 1 .

Using the expression for X(f/g) given in (4.58) and multiplying by g², we obtain

    (g Xf − f Xg) + g² Σ_{k=1}^{s} γ_k K_k = g² ,

which implies that the term −f Xg is divisible by g, say f Xg = f gL, and upon
dividing by g again, we have

    (Xf − f L) + g Σ_{k=1}^{s} γ_k K_k = g .

But now the term in parentheses must be divisible by g, say Xf − f L = gK, and
upon one more division by g we have

    K + Σ_{k=1}^{s} γ_k K_k = 1 .    (4.61)

Moreover,

    X(f/g) = (g Xf − f Xg)/g² = (g Xf − f gL)/g² = (Xf − f L)/g = gK/g = K

and, by (4.61), deg(K) ≤ max_{1≤j≤s} {deg(K_j)} ≤ m − 1, where m is the degree of the
original system, so e^{f/g} is an exponential factor with cofactor K.
Since Z(x_1, x_2) = x_1 + ⋯ and W(x_1, x_2) = x_2 + ⋯, e^{f/g} = 1 + ⋯, and all the
h_j are polynomials, for exactly one index value w the polynomial h_w must have the
form h_w(x_1, x_2) = x_1 + ⋯, γ_w must be 1, and for j ≠ w, h_j must have the form
h_j(x_1, x_2) = 1 + ⋯. Thus, by (4.61), condition (a) and the first condition in (4.57)
hold, with h_w playing the role of f_0.
If f = 0, then every line of the argument remains true without modification.
The discussion for the second component of the linearizing transformation is
identical, since the minus sign has no bearing on the argument, except to reverse the
sign on the right-hand side of equation (4.61), thereby yielding the second condition
in display (4.57). □
Remark. Note that a single algebraic partial integral can serve both as one of the
f_j and as one of the g_j. This fact is illustrated by system (4.69) below.
The following two theorems show that even if we are unable to find sufficiently
many algebraic partial integrals to construct a linearizing transformation for system
(4.27) by means of Theorem 4.4.2, it is sometimes possible to construct a lineariz-
ing transformation if, in addition, we use a first integral of the system, which must
exist since we are operating on the assumption that system (4.27) has a center at the
origin. In the first theorem we suppose that we can find algebraic partial integrals
(and possibly an exponential factor) satisfying just one or the other of the condi-
tions (4.57). For simplicity we state the theorem only for the case that we have just
f_0, …, f_s meeting the first equation in (4.57).
Theorem 4.4.3. Suppose system (4.29) has a center at the origin, hence possesses
a formal first integral Ψ(x_1, x_2) of the form Ψ = x_1 x_2 + ⋯, and that there exist
algebraic partial integrals and possibly an exponential factor, f_0, …, f_s, that meet
condition (1.a) in Theorem 4.4.2 and satisfy the first of equations (4.57), possibly as
modified in (2) of Theorem 4.4.2. Then system (4.29) is linearized by the transformation

    y_1 = Z(x_1, x_2) = f_0 Π_{j=1}^{s} f_j^{α_j} = x_1 + ⋯ ,
                                                          (4.62)
    y_2 = W(x_1, x_2) = Ψ / Z(x_1, x_2) = x_2 + ⋯ .
Proof. Recall from Corollary 3.2.6 that if a formal first integral Ψ of the form (3.52)
exists, then there exists an analytic first integral of the same form, which we still denote
by Ψ. Condition 1(a) in Theorem 4.4.2 ensures that the transformation (4.62) is
analytic and has an analytic inverse on a neighborhood of the origin in C². The
computation (4.60) is valid and gives ẏ_1 = XZ = Z = y_1. As in the proof of Theorem
4.4.2, set α_0 = 1. Then by (4.58) and the fact that Ψ is a first integral,

    ẏ_2 = [ Π_{j=0}^{s} f_j^{α_j} XΨ − Ψ X( Π_{j=0}^{s} f_j^{α_j} ) ] / ( Π_{j=0}^{s} f_j^{α_j} )² = −Ψ Π_{j=0}^{s} f_j^{α_j} / ( Π_{j=0}^{s} f_j^{α_j} )² = −y_2 .    □

In the second theorem one of the conditions on system (4.29) is that the coordi-
nate axes be invariant curves through the center. In the context of systems of dif-
ferential equations on C2 , such a curve is termed a separatrix of the center, which
underscores the contrast with real systems.

Theorem 4.4.4. Suppose that system (4.29) has a center at the origin, hence possesses
a formal first integral Ψ(x1, x2) of the form Ψ = x1x2 + ⋯, that P̃ contains x1
as a factor and Q̃ contains x2 as a factor, and that there exist s algebraic partial integrals
and exponential factors f1, …, fs with corresponding cofactors K1, …, Ks and
t algebraic partial integrals and exponential factors g1, …, gt with corresponding
cofactors L1, …, Lt with the following properties:
a. f j (0, 0) = 1 for 1 ≤ j ≤ s;
b. g j (0, 0) = 1 for 1 ≤ j ≤ t;
c. there exist s + t + 2 constants a, b, α1, …, αs, β1, …, βt ∈ C such that

   (1 − a) x1⁻¹ P̃ − a x2⁻¹ Q̃ + ∑_{j=1}^{s} αj Kj = 1          (4.63a)

and

   −b x1⁻¹ P̃ + (1 − b) x2⁻¹ Q̃ + ∑_{j=1}^{t} βj Lj = −1 .      (4.63b)

Then (4.29) is linearized by the transformation

   y1 = Z(x1, x2) = x1^{1−a} x2^{−a} Ψ^{a} f1^{α1} ⋯ fs^{αs},
                                                                  (4.64)
   y2 = W(x1, x2) = x1^{−b} x2^{1−b} Ψ^{b} g1^{β1} ⋯ gt^{βt}.

Proof. Conditions (4.63) imply that P̃ does not contain any term of the form x2^k and
that Q̃ does not contain any term of the form x1^k. By an inductive argument using
(3.55), it follows that any first integral (3.52) of the system must have the form

   Ψ(x1, x2) = x1 x2 ( 1 + ∑_{k+j≥1, k,j≥0} v_{k,j} x1^k x2^j ).

This fact together with conditions (a) and (b) implies that the mapping (4.64) is of
the form y1 = Z(x1 , x2 ) = x1 + · · · , y2 = W (x1 , x2 ) = x2 + · · · , hence is analytic and
has an analytic inverse on a neighborhood of the origin in C2 . Computations just
like (4.60) and using (4.63) show that (4.64) linearizes (4.29). □

As was the case with Theorem 4.4.2, a single function can play the role of one of
the functions f j and one of the functions g j , as is the case in the following example.

Example. Consider the system

   ẋ = x(1 − 6b10x + 8b10²x² − 2b02y²),
                                                                  (4.65)
   ẏ = −y(1 − b10x − b02y²)

on C² with b10b02 ≠ 0. This system has algebraic partial integrals

   h1 = 1 − 4b10x + 8b02b10xy²,
   h2 = 1 − 12b10x + 48b10²x² − 64b10³x³ + 24b02b10xy²



with respective cofactors

   M1 = −4b10x + 8b10²x²,   M2 = −12b10x + 24b10²x².

Since −3M1 + M2 ≡ 0, by Theorem 3.6.8 a Darboux first integral is given by the
analytic function Φ(x, y) = h1⁻³h2 = 1 + 192b02b10²x²y² + ⋯ . Thus

   Ψ(x, y) = (1/√(192b02b10²)) · √(f2 f1⁻³ − 1)

is a first integral of the form (3.52). Taking f1 = g1 = h1 and f2 = g2 = h2, equations
(4.63) are in this instance a pair of systems of linear equations with infinitely many
solutions, among them a = 2, α1 = 1, α2 = 0 and b = −1, β1 = −1, β2 = 0. Thus,
by Theorem 4.4.4, system (4.65) is linearized by the inverse of the transformation

   z = f1(x, y)Ψ²(x, y) / (xy²) = x + ⋯ ,   w = xy² / (f1(x, y)Ψ(x, y)) = y + ⋯ .
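The claims in this example are easy to check mechanically. The following sketch (using sympy rather than the Singular computations referred to elsewhere in the text; all names are ours) verifies that h1 and h2 are algebraic partial integrals of (4.65) with the stated cofactors, and that the Darboux relation −3M1 + M2 = 0 behind Φ = h1⁻³h2 holds:

```python
import sympy as sp

x, y, b10, b02 = sp.symbols('x y b10 b02')

# Right-hand sides of system (4.65)
xdot = x*(1 - 6*b10*x + 8*b10**2*x**2 - 2*b02*y**2)
ydot = -y*(1 - b10*x - b02*y**2)

def X(h):
    """Lie derivative of h along the vector field of (4.65)."""
    return sp.diff(h, x)*xdot + sp.diff(h, y)*ydot

h1 = 1 - 4*b10*x + 8*b02*b10*x*y**2
h2 = 1 - 12*b10*x + 48*b10**2*x**2 - 64*b10**3*x**3 + 24*b02*b10*x*y**2
M1 = -4*b10*x + 8*b10**2*x**2
M2 = -12*b10*x + 24*b10**2*x**2

# h_j is an algebraic partial integral with cofactor M_j:  X h_j = M_j h_j
assert sp.expand(X(h1) - M1*h1) == 0
assert sp.expand(X(h2) - M2*h2) == 0

# Darboux condition for Phi = h1**(-3) * h2:  (-3)*M1 + 1*M2 = 0
assert sp.expand(-3*M1 + M2) == 0
```

The same pattern of checks applies to every partial-integral computation in this chapter.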

Two more examples of linearizable polynomial systems are given in Exercises
4.12 and 4.13. In the next section we apply the theory developed in this section to
the question of linearizability of quadratic systems.

4.5 Linearizable Quadratic Centers

This section is devoted to finding the linearizability variety for the full family of
quadratic systems of the form (4.29) (= (3.69)):

   ẋ = x − a10x² − a01xy − a−12y²,
                                                                  (4.66)
   ẏ = −y + b2,−1x² + b10xy + b01y².

This is the same family for which the center variety was obtained in Section 3.7, as
described in Theorem 3.7.1.

Theorem 4.5.1. The linearizability variety of family (4.66) consists of the following
nine irreducible components:
1. V1 = V(J1 ) where J1 = ⟨a01, b01, b2,−1, a10 + 2b10⟩;
2. V2 = V(J2 ) where J2 = ⟨a10, a−12, b10, 2a01 + b01⟩;
3. V3 = V(J3 ) where J3 = ⟨a01, a−12, b01⟩;
4. V4 = V(J4 ) where J4 = ⟨b10, b2,−1, a10⟩;
5. V5 = V(J5 ) where J5 = ⟨−7b10² + 12b2,−1b01, 49a−12b10 + 18b01²,
   14a−12b2,−1 + 3b10b01, 7a01 + 6b01, 6a10 + 7b10⟩;
6. V6 = V(J6 ) where J6 = ⟨15b10² + 4b2,−1b01, 25a−12b10 − 6b01²,
   10a−12b2,−1 + 9b10b01, 5a01 + 2b01, 2a10 + 5b10⟩;
7. V7 = V(J7 ) where J7 = ⟨b2,−1, a−12, a01 + b01, a10 + b10⟩;

8. V8 = V(J8 ) where J8 = ⟨a01, b10, b2,−1⟩;
9. V9 = V(J9 ) where J9 = ⟨b10, a01, a−12⟩.

Proof. In analogy to what was done in Section 3.7 when we found the center variety
for this family, we compute the first few pairs of linearizability quantities until a pair
or two are found to lie in the radical of the ideal generated by the earlier pairs. The
actual computation, which was done by means of the algorithm based on Theorem
4.3.7, led us to suspect that the first three pairs of linearizability quantities, which
we will not list here, form a basis of L , indicating that V(L ) = V(L3 ). Using the
Singular routine minAssGTZ (which computes the minimal associated primes of
polynomial ideals by means of the algorithm of [81]), we found that the minimal
associated primes of L3 are the nine ideals written out in the right-hand sides of the
equations for the Vj above. Thus V(L) ⊂ ∪_{j=1}^{9} Vj.
To prove the reverse inclusion, we have to show that for every system from V j for
j ∈ {1, . . . , 9}, there is a transformation z = x + · · · , w = y + · · · that reduces (4.66)
to the linear system ż = z, ẇ = −w. We will find that in every case either Theorem
4.4.2 or Theorem 4.4.3 applies. To begin, we observe that each variety V1 through V9
lies in some irreducible component of the center variety VC as identified in Theorem
3.7.1 (Exercise 4.14), so that there must exist an analytic first integral Ψ of the form
Ψ (x, y) = xy + · · · for each of the corresponding systems. We also observe that the
polynomials defining V2 , V4 , and V9 are conjugate to the polynomials defining V1 ,
V3 , and V8 , respectively, so by reasoning analogous to that presented in the proof of
Theorem 3.7.2, it is sufficient to consider systems from just the components V1 , V3 ,
V5 , V6 , V7 , and V8 .
The component V1 . Systems from the component V1 have the form

   ẋ = x + 2b10x² − a−12y²,   ẏ = −y + b10xy.                    (4.67)

In an attempt to apply Theorem 4.4.2, we search for algebraic partial integrals, be-
ginning with those of degree one, whose cofactors have degree at most one. Apply-
ing the technique of undetermined coefficients (first described in Example 3.6.10),
we find that any algebraic partial integral of (4.67) of degree one that is valid for
the whole family (and not just for a special case, such as a−12 = 0) has the form
h0 = cy, c ∈ R \ {0}, with corresponding cofactor M0 = −1 + b10x. A search for
algebraic partial integrals of the second degree, for which the cofactor M will be a
polynomial of degree at most one, yields (up to multiplication by a nonzero constant)
the two polynomials h1 = 1 + 2b10x − a−12b10y² and h2 = x − (a−12y²)/3
for system (4.67), with the respective cofactors M1 = 2b10x and M2 = 1 + 2b10x.
The form of h2 as h2 (x, y) = x + · · · suggests that we let it play the role of
f0 in Theorem 4.4.2 and attempt to find α j solving the first equation in (4.57).
Since M2 + (−1)M1 = 1, we choose h1 for f1 and α1 = −1. The form of h0 as
h0 (x, y) = y + · · · suggests that we attempt to find β j solving the second equation in
(4.57). Since M0 + (−1/2)M1 = −1, we choose h1 for g1 and β1 = −1/2. Thus, by
Theorem 4.4.2, any system in V1 is linearized by the inverse of the transformation

   z = h2(x, y)[h1(x, y)]⁻¹ = (x − (a−12y²)/3) / (1 + 2b10x − a−12b10y²) = x + ⋯ ,
   w = h0(x, y)[h1(x, y)]^{−1/2} = y / √(1 + 2b10x − a−12b10y²) = y + ⋯ .

To finish the proof for component V1 , see Exercise 4.16.
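The cofactor arithmetic behind this linearization can be verified symbolically. A sympy sketch (the symbol a12 stands for the coefficient a−12; all helper names are ours) checks the partial-integral identities for h0, h1, h2, the two cofactor relations used above, and confirms ż = z directly:

```python
import sympy as sp

x, y, b10, a12 = sp.symbols('x y b10 a12')  # a12 plays the role of a_{-1,2}

# System (4.67): systems from the component V1
xdot = x + 2*b10*x**2 - a12*y**2
ydot = -y + b10*x*y

def X(h):
    """Lie derivative of h along the vector field of (4.67)."""
    return sp.diff(h, x)*xdot + sp.diff(h, y)*ydot

h0, M0 = y, -1 + b10*x
h1, M1 = 1 + 2*b10*x - a12*b10*y**2, 2*b10*x
h2, M2 = x - a12*y**2/3, 1 + 2*b10*x

for h, M in [(h0, M0), (h1, M1), (h2, M2)]:
    assert sp.expand(X(h) - M*h) == 0       # X h = M h

assert sp.expand(M2 - M1) == 1              # cofactor of z = h2/h1
assert sp.expand(M0 - M1/2) == -1           # cofactor of w = h0*h1**(-1/2)

# Direct check that z = h2/h1 satisfies zdot = z
z = h2/h1
assert sp.cancel(sp.diff(z, x)*xdot + sp.diff(z, y)*ydot - z) == 0
```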


The component V3 . Systems from the component V3 have the form

   ẋ = x − a10x²,   ẏ = −y + b2,−1x² + b10xy.                    (4.68)

The first equation is independent of y, so we may apply the one-dimensional analogue
of Theorem 4.4.2: first-degree algebraic partial integrals are f0(x) = x and
f1(x) = 1 − a10x with corresponding cofactors K0(x) = 1 − a10x and K1(x) = −a10x
that satisfy K0 + α1K1 = 1 when α1 = −1, so that ẋ = x − a10x² is transformed into
ż = z by z = Z(x) = f0(x)[f1(x)]⁻¹ = x/(1 − a10x). Since a system (a, b) from V3
need not be in V(J2 ) ∪ V(J3 ) for the irreducible components V(J2 ) and V(J3 ) of the
center variety, as identified in Theorem 3.7.1, we do not expect to find algebraic par-
tial integrals from which we could build an explicit first integral. Thus simply noting
that any system (4.68) lies in V(Isym ) (component V4 in Theorem 3.7.1), although it
is not time-reversible except in the trivial case that it is already linear (see Exercises
4.14 and 4.15), we know that it must possess an analytic first integral Ψ = xy + · · ·,
hence, by Theorem 4.4.3, is linearized by

   z = Z(x, y) = x / (1 − a10x),
   w = W(x, y) = Ψ(x, y) · f1(x)/f0(x) = Ψ(x, y) · (1 − a10x)/x .
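For the one-dimensional step, the computation is short enough to check in two lines; a sympy sketch (a10 a free symbol, names ours):

```python
import sympy as sp

x, a10 = sp.symbols('x a10')

xdot = x - a10*x**2        # first equation of (4.68)
z = x/(1 - a10*x)          # z = Z(x) = f0(x) * f1(x)**(-1)

# zdot = Z'(x) * xdot should equal z, i.e., the first equation becomes linear
assert sp.cancel(sp.diff(z, x)*xdot - z) == 0
```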

The component V5. Systems from V5 for which a01b10 ≠ 0 have the form

   ẋ = x + (7/6)b10x² − a01xy + (a01²/(2b10))y²,
   ẏ = −y − (b10²/(2a01))x² + b10xy − (7/6)a01y².                (4.69)
By the usual method of undetermined coefficients, we ascertain that any such system
has the irreducible algebraic partial integrals and cofactors

   h1 = 1 + (2/3)b10x + (2/3)a01y                  M1 = (2/3)(b10x − a01y)
   h2 = 6b10x + b10²x² + 2a01b10xy + a01²y²        M2 = 1 + (4/3)b10x − (4/3)a01y
   h3 = 6a01y + b10²x² + 2a01b10xy + a01²y²        M3 = −1 + (4/3)b10x − (4/3)a01y .

Since 1 · M2 + (−2)M1 = 1 and 1 · M3 + (−2)M1 = −1, and since h̃2 = h2/(6b10) and
h̃3 = h3/(6a01) are algebraic partial integrals with respective cofactors M2 and M3,
Theorem 4.4.2 applies with f0 = h̃2, f1 = h1, g0 = h̃3, and g1 = h1 to yield the
Darboux linearization

   z = (1/(6b10)) h2 h1⁻² = (6b10x + b10²x² + 2a01b10xy + a01²y²) / (6b10(1 + (2/3)b10x + (2/3)a01y)²),
   w = (1/(6a01)) h3 h1⁻² = (6a01y + b10²x² + 2a01b10xy + a01²y²) / (6a01(1 + (2/3)b10x + (2/3)a01y)²)

for systems from V5 for which a01b10 ≠ 0. Since the Zariski closure of V5 \ V(a01b10)
is all of V5, as can be shown either by computing the quotient of the corresponding
ideals or by following the strategy outlined in Exercise 4.18 for the analogous
equality that arises for component V1, we conclude that a linearization exists for
every system from V5.
The arguments for the components V6, V7, and V8 are similar and are left as
Exercise 4.21. □

4.6 Notes and Complements

The history of isochronicity is as old as the history of clocks based on some sort
of periodic motion, such as the swinging of a pendulum. Based on his knowledge
that the cycloid is a tautochrone (a frictionless particle sliding down a wire in the
shape of a cycloid reaches the lowest point in the same amount of time, regardless
of its starting position) and that the evolute of a cycloid is a cycloid, in the 17th
century Huygens designed and built a pendulum clock with cycloidal “cheeks.” This
is probably the earliest example of a nonlinear isochronous system.
Interest in isochronicity in planar systems of ordinary differential equations was
renewed in the second half of the 20th century. In the early 1960s a criterion for
isochronicity was obtained by Urabe ([189, 190]) for the simple but physically important
“kinetic + potential” Hamiltonian system ẋ = Hy = y, ẏ = −Hx = −g(x),
where f(x) and g(x) are smooth functions in a neighborhood of the origin and
H(x, y) = y²/2 + ∫₀ˣ g(s) ds. Urabe’s method can also be applied to study isochronicity
in the system

   ẋ = y,   ẏ = −g(x) − f(x)y²
and some families of polynomial systems ([47, 48, 162]). Efficient criteria for
isochronicity in Liénard systems

ẋ = y, ẏ = −g(x) − f (x)y (4.70)

have been obtained by Sabatini ([160]) and Christopher and Devlin ([52]). In [52]
the authors classified all isochronous polynomial Liénard systems (4.70) of degree
34 or less.
The method of Darboux linearization presented in this chapter indeed is based
on an idea of Darboux, but was applied to the linearization of differential equations
only in 1995, in the work of Mardešić, Rousseau, and Toni ([136]). See such works
as [57, 135] for further developments. Theorem 4.5.1 on the linearization of the
quadratic system (4.66) is due to Christopher and Rousseau ([58]). The problem of

isochronicity for real quadratic systems was solved earlier by Loud ([127]). Sys-
tems with only homogeneous cubic nonlinearities were treated by Pleshkan ([141]).
Necessary and sufficient conditions for linearizability of the system ẋ = x + P(x, y),
ẏ = −y + Q(x, y), where P and Q are homogeneous polynomials of degree five, were
obtained in [148] .
The problems of isochronicity and linearizability for certain families of time-
reversible polynomial systems have been considered in [21, 32, 36, 40, 145]. There
are also many works concerning the problems of linearizability and isochronicity
for various particular families of polynomial systems.
In this chapter we gave the proof that isochronicity of a planar system with a
center is equivalent to its linearizability. Isochronicity has also been characterized
in terms of commuting systems. Vector fields X and Z are said to commute if
their Lie bracket [X , Z ] is identically zero. It was proved by Ladis ([107]; see also
[2, 161, 193]) that system (4.1) has an isochronous center at the origin if and only
if there exists a holomorphic system u̇ = u + M(u, v), v̇ = v + N(u, v) such that the
associated vector fields commute. This result was generalized by Giné and Grau
([82]), who showed that the smooth (respectively, analytic) system ẋ = f(x) with a
non-degenerate singular point at the origin and with the associated smooth (respec-
tively, analytic) vector field X = ∑nj=1 f j (x)∂ /∂ x j is linearizable if and only if there
exists a smooth (respectively, analytic) vector field Y = ∑nj=1 (x j + o(|x|2))∂ /∂ x j
such that [X , Y ] ≡ 0. Along with Darboux linearization, the construction of a com-
muting system is a powerful method for proving the isochronicity or linearizability
of particular families of polynomial systems.
The concept of isochronicity can also be extended to foci. Briefly put, a focus
of an analytic system is isochronous if there is a local analytic change of variables
η for which dη (0) is the identity and is such that in a subsequent change to polar
coordinates the ϕ̇ equation has no r dependence. The reader is referred to [6, 7, 83,
84] and the references they contain.
Finally, there is also the concept of strong isochronicity: a system with an antisad-
dle at the origin is strongly isochronous of order n if in polar coordinates (r, ϕ ) there
exist n rays L j = {(r, ϕ ) : r ≥ 0, ϕ = ϕ0 + 2 jπ /n}, j = 0, . . . , n − 1, such that the
time required for any trajectory sufficiently near the origin to pass from L j to L j+1 is
2π /n. If the system is strongly isochronous of order n = 2 with respect to the initial
polar ray ϕ0 = π /2, then it is called strongly isochronous. Strong isochronicity has
been investigated by Amel’kin and his coworkers (see [4, 7] and references therein).
For a survey on the problem of isochronicity, consult [38].

Exercises

4.1 Find the error in the following argument: Every quadratic system is lineariz-
able. For suppose family (4.27) has only quadratic nonlinearities. There are no
resonant terms, hence the normal form is ẏ1 = iy1 , ẏ2 = −iy2 .

4.2 [Referenced in Proposition 4.2.10, proof.] Suppose f is a (k, k)-polynomial, g
    is a (j, j)-polynomial, and r ∈ N. Show that f^r is an (rk, rk)-polynomial and
    that f g is a (k + j, k + j)-polynomial.
4.3 a. Show that any (k, k)-polynomial f = ∑_{ν:L(ν)=(k,k)} f^{(ν)}[ν] can be written in
    the form f = ∑_{ν:L(ν)=(k,k)} (f^{(ν)}[ν] + f^{(ν̂)}[ν̂]).
    Hint. Let ν1, …, νm be those elements of {ν : L(ν) = (k, k)} for which ν̂ ≠ ν
    and let µ1, …, µn be those that are self-conjugate. Write out f + f and pair
    each term with the term with the conjugate monomial.
    b. Derive Corollary 3.4.6 from (a), Theorem 3.4.2(4), and (3.78a).
4.4 Follow the procedure outlined in the proof of Proposition 4.2.10 to derive the
expressions for the first four isochronicity quantities given by (4.42).
4.5 [Referenced in Proposition 4.2.11, proof.] Prove that if polynomials f and g
    have the form f = ∑_{ν∈F} f^{(ν)}([ν] + [ν̂]) and g = ∑_{ν∈F} g^{(ν)}([ν] + [ν̂]) for some
    finite indexing set F, then for all r, s ∈ N0, the polynomial f^r g^s has the same
    form.
    Hint. It is enough to demonstrate the result just for the two cases (r, s) = (r, 0)
    and (r, s) = (1, 1). In the former case recall that [ν]^k = [kν] and [ν][µ] = [ν + µ]
    and use the Binomial Theorem.
4.6 a. Use the Normal Form Algorithm in Table 2.1 on page 75 to compute the
    first few coefficients Y1^{(k+1,k)} for the family of systems (4.27) with a full
    set of homogeneous cubic nonlinearities; this is the complexification of the
    system whose complex form is given by (6.50). Compare your answer to the
    list given in Section 6.4 on page 292.
b. Insert the results of the computations in part (a) into (4.42) to obtain the first
few polynomials p2k explicitly in terms of the parameters (a, b).
4.7 Let k be a field and V a variety in kn . Prove that two polynomials f and g in
k[x1 , . . . , xn ] are in the same equivalence class in k[V ] if and only if they are in
the same equivalence class in k[x1 , . . . , xn ]/I(V ).
4.8 In the context of the previous problem, prove that the mapping ϕ from k[V ] to
k[x1 , . . . , xn ]/I(V ) defined by ϕ ([[ f ]]) = [ f ] is an isomorphism of rings.
4.9 In contrast with the previous problem, show that in k[x] if I = hx2 i, then k[V(I)]
and k[x]/I are not isomorphic.
Hint. One is an integral domain.
4.10 Derive the equality V(P) ∩VC = V(Y ), a slightly improved version of the first
equality in Proposition 4.2.14, directly from Proposition 4.2.7 and the identity
H = P.
4.11 Supply the inductive argument for the proof of part (1) of Proposition 4.3.4.
4.12 Prove that the system

   ẋ = x − a13x²y³ − a04xy⁴ − y⁵,   ẏ = −y + b13xy⁴ + b04y⁵

is linearizable.
Hint. Show that there exists a Lyapunov first integral for this system of the form
Ψ(x, y) = ∑_{k=1}^{∞} gk(x)y^k, where g1(x) = x, g2(x) = x², and gk(x) are polynomials
of degree k, and that a linearization for the second equation of the system
can be constructed in the form w = ∑_{k=1}^{∞} fk(x)y^k, where fk(x), k = 2, 3, …, are
polynomials of degree k − 1 and f1(x) ≡ 1.
4.13 Find a linearization of the system

   ẋ = x − x⁵ − (4/b)x⁴y − xy⁴,   ẏ = −y + x⁴y + bxy⁴ + y⁵.
Hint. Use Theorem 4.4.4.
4.14 With reference to the family (4.66) of quadratic systems, show that the compo-
nents V8 and V9 of the linearizability variety V(L ), as identified in Theorem
4.5.1, lie in the irreducible component V(J2 ) of the center variety VC , as iden-
tified in Theorem 3.7.1. Show that V1 and V2 lie in the component V(J3 ) of VC
and that the remaining components of V(L ) lie in V(J4 ) = V(Isym ) of VC .
4.15 Using the previous exercise, apply Theorem 3.5.8(2) to systems in family (4.66)
corresponding to component V3 in Theorem 4.5.1 to obtain further examples of
systems in V(Isym ) that are not time-reversible.
4.16 a. In the proof of Theorem 4.5.1, explain why the argument that every system
in V1 is linearizable is not valid when b10 = 0.
    b. Show that the result stated in that part of the proof, that the inverse of the
    mapping z = h2/h1, w = h0/h1^{1/2} is a linearization, nevertheless is valid when
    b10 = 0.
4.17 a. Use Theorem 3.6.8 and the algebraic partial integrals found in the proof
of Theorem 4.5.1, that every element of V1 is linearizable, to construct an
explicit first integral Ψ for systems in V1 , and by means of it use Theorem
4.4.3 to rederive the linearization of elements of V1 .
b. When b10 = 0, the function h1 is no longer an algebraic partial integral. Is
the mapping Ψ still a first integral?
4.18 In the proof of Theorem 4.5.1, that every system in V1 is linearizable, both the
proof in the text using Theorem 4.4.2 and the proof in Exercise 4.17 using The-
orem 4.4.3, the case b10 = 0 was anomalous, but we were able to verify the
linearizability of systems in V1 directly because the expressions simplified sig-
nificantly when b10 = 0. For a different and somewhat more general approach to
the situation, show that a linearization must exist for systems in V1 with b10 = 0
(not finding it explicitly) in the following steps. Let E(a, b) = C⁶ be the parameter
space for system (4.66), with the usual topology.
a. Show that in E(a, b), Cℓ(V1 \ V(b10)) = V1 .
    b. Use the result of part (a) and Exercise 1.42 to show that V1 ⊂ V1 \ V(⟨b10⟩).
    c. Prove that V1 \ V(⟨b10⟩) = V1.
d. Use part (c) and Proposition 1.3.20 to conclude that a linearization exists for
every system from V1 , including those for which b10 = 0.
4.19 In the context of the previous exercises, give an alternate proof that systems in
V1 for which b10 = 0 are linearizable based on Theorem 4.4.3 and the fact that
the ẏ equation is already linear.
4.20 Consider the family of systems of the form (4.27) that have only homogeneous
cubic nonlinearities

   ẋ = i(x − a20x³ − a11x²y − a02xy² − a−13y³),
                                                                  (4.71)
   ẏ = −i(y − b3,−1x³ − b20x²y − b11xy² − b02y³).

a. Find the first four pairs of linearizability quantities for family (4.71).
b. Using the Radical Membership Test, verify that the third pair found in part
(a) does not lie in the radical of the ideal generated by the previous pairs, but
that the fourth pair does. (If you are so inclined, you could compute the fifth
pair and verify that it lies in the ideal generated by the previous pairs, too.)
    c. From part (a) you have the ideal L4 and in part (b) a computation that suggests
    that V(L) = V(L4). Using the Singular routine primdecGTZ or the
    like, compute the primary decomposition of L4, L4 = ∩_{j=1}^{w} Qj, for some
    w ∈ N and the associated prime ideals Pj = √Qj.
    d. From the previous two parts you have, as in the proof of Theorem 4.5.1,

       V(L4) = V(√L4) = ∪_{j=1}^{w} V(Pj),
a minimal decomposition of V(L4 ), and you know that V(L ) ⊂ V(L4 ).


The reverse inclusion is true. Confirm that if (a, b) ∈ V(Pj ), then the corre-
sponding system (4.71) is linearizable for as many of the Pj as you can.
Hint. See Theorem 6.4.4 for the decomposition in part (c).
4.21 [Referenced in the proof of Theorem 4.5.1.] Find linearizing transformations
for quadratic systems (4.66) from the components V6 , V7 , and V8 of Theorem
4.5.1.
4.22 Although, by Exercise 4.14 above, V1 ⊂ V(J3 ) ⊂ VC , show that none of its
elements possesses an irreducible algebraic partial integral of degree three, but
rather every algebraic partial integral of degree at most three is a product of the
algebraic partial integrals listed in the proof of Theorem 4.5.1 for component
V1 .
Chapter 5
Invariants of the Rotation Group

In Section 3.5 we stated the conjecture that the center variety of family (3.3), or
equivalently of family (3.69), always contains the variety V(Isym ) as a compo-
nent. This variety V(Isym ) always contains the set R that corresponds to the time-
reversible systems within family (3.3) or (3.69), which, when they arise through
the complexification of a real family (3.2), generalize systems that have a line of
symmetry passing through the origin. In Section 3.5 we had left incomplete a full
characterization of R. To derive it we are led to a development of some aspects
of the theory of invariants of complex systems of differential equations. Using this
theory, we will complete the characterization of R and show that V(Isym ) is actually
its Zariski closure, the smallest variety that contains it. In the final section we will
also apply the theory of invariants to derive a sharp bound on the number of axes of
symmetry of a real planar system of differential equations.
We will consider polynomial systems on C2 in a form that is a bit more general
than (3.69), namely, systems of the form

   ẋ = − ∑_{(p,q)∈S̃} a_{pq} x^{p+1} y^{q} = P(x, y),
                                                                  (5.1)
   ẏ = ∑_{(p,q)∈S̃} b_{qp} x^{q} y^{p+1} = Q(x, y),

where the index set S̃ ⊂ N_{−1} × N0 is a finite set and each of its elements (p, q)
satisfies p + q ≥ 0. As before, if ℓ is the cardinality of the set S̃, we use the abbreviated
notation (a, b) = (a_{p1,q1}, a_{p2,q2}, …, a_{pℓ,qℓ}, b_{qℓ,pℓ}, …, b_{q2,p2}, b_{q1,p1}) for the
ordered vector of coefficients of system (5.1), let E(a, b) (which is just C^{2ℓ}) denote
the parameter space of (5.1), and let C[a, b] denote the polynomial ring in the variables
a_{pq} and b_{qp}. The only difference between (3.3) and (3.69) on the one hand
and (5.1) on the other is that in the definition of the index set S̃ for family (5.1) it is
required that p + q ≥ 0, whereas in the definition of S for families (3.3) and (3.69)
the inequality p + q ≥ 1 must hold for all pairs (p, q). Thus the linear part in (5.1) is

ẋ = −a00x − a−1,1y, ẏ = b1,−1 x + b00y,

V.G. Romanovski, D.S. Shafer, The Center and Cyclicity Problems, 213
DOI 10.1007/978-0-8176-4727-8_5,
© Birkhäuser is a part of Springer Science+Business Media, LLC 2009

whereas heretofore we have restricted attention to systems with diagonal linear part,
which includes the complexification of any real system u̇ = f(u) with f(0) = 0 and
for which the eigenvalues of df(0) are purely imaginary.

5.1 Properties of Invariants

We begin this section by sketching out how an examination of the condition for
reversibility naturally leads to an investigation of rotations of phase space, and iden-
tify a condition for reversibility that is expressed in the language of invariants. This
will serve to orient the reader to the development of the ideas in this section.
Condition (3.96) for reversibility of a system of the form (5.1), when written out
in detail and with γ expressed in polar form γ = ρe^{iθ}, is

   b_{q1 p1} = ρ^{p1−q1} e^{i(p1−q1)θ} a_{p1 q1}
       ⋮
   b_{qℓ pℓ} = ρ^{pℓ−qℓ} e^{i(pℓ−qℓ)θ} a_{pℓ qℓ}
                                                                  (5.2)
   a_{pℓ qℓ} = ρ^{qℓ−pℓ} e^{i(qℓ−pℓ)θ} b_{qℓ pℓ}
       ⋮
   a_{p1 q1} = ρ^{q1−p1} e^{i(q1−p1)θ} b_{q1 p1} .

Actually, condition (3.96) gives just the first ℓ equations of (5.2); we have solved
each equation for the coefficient a pq and adjoined these ℓ new equations to the orig-
inal ℓ equations for reasons that will soon be apparent.
When ρ is 1, equation (5.2) is suggestive of a rotation of coordinate axes, and
in fact when the dual rotation of coordinates (5.3) below is done in C2 , then, as we
will see, the effect on the coefficient vector (a, b) ∈ C^{2ℓ} in system (5.1) is multiplication
of (a, b) on the left by a diagonal matrix Uθ whose entries are the exponentials
in (5.2). The opposite directions of rotation in (5.3) is natural because when a real
system is complexified the second component is obtained by conjugation of the
first. Thus when ρ = 1 the right-hand side of (5.2) is Uθ · (a, b). If we associate
the vector ζ = (ζ1, …, ζ2ℓ) = (p1 − q1, …, pℓ − qℓ, qℓ − pℓ, …, q1 − p1) to family
(5.1) and for ρ ∈ R+ and any vector c = (c1, …, c2ℓ) let ρ^{ζ}c denote the vector
(ρ^{ζ1}c1, …, ρ^{ζ2ℓ}c2ℓ), then the right-hand side of (5.2) is ρ^{ζ}Uθ · (a, b). Consequently,
letting (b̂, â) denote the involution of (a, b) given by reversing the order of its en-
tries, the condition for reversibility of system (5.1) is the existence of ρ ∈ R+ and
θ ∈ [0, 2π ) such that (b̂, â) = ρ ζ Uθ · (a, b). Generalizing somewhat, we are thus led
to investigate conditions under which, for fixed vectors c, d ∈ E(a, b) = C2ℓ , the
equation σ^{−ζ}c = Uϕ · d has a solution (σ, ϕ) ∈ R+ × [0, 2π). The answer (Theorem
5.1.15) is most easily expressed in terms of invariants of the rotation group (Defini-
tion 5.1.4): c and d must have any nonzero entries in the same positions (which we

know immediately from (3.96) that c and ĉ do for c ∈ R), and must yield the same
value on any unary and binary invariant (Definition 5.1.9), hence on any invariant
(Theorem 5.1.19).
Definition 5.1.1. Let k be a field, let G be a group of n × n matrices with elements in
k, and for A ∈ G and x ∈ kn let A · x denote the usual action of G on kn . A polynomial
f ∈ k[x1 , . . . , xn ] is invariant under G if f (x) = f (A · x) for every A ∈ G. The poly-
nomial f is also called an invariant of G. An invariant is irreducible if it does not
factor as a product of polynomials that are themselves invariants (although it could
very well factor).

Example 5.1.2. Let B = ( 0 −1 ; 1 0 ) and let I2 denote the 2 × 2 identity matrix. Then the
set C4 = {I2, B, B², B³} is a group under multiplication (Exercise 5.1), and for the
polynomial f(x) = f(x1, x2) = ½(x1² + x2²) we have f(x) = f(B · x), f(x) = f(B² · x),
and f(x) = f(B³ · x). Thus f is an invariant of the group C4. Of course, when k = R,
C4 is simply the group of rotations by multiples of π/2 radians (mod 2π) about the
origin in R², and f is an invariant because its level sets are circles centered at the
origin, which are unchanged by such rotations.

Generalizing the example, consider the group of rotations

   x′ = e^{−iϕ}x,   y′ = e^{iϕ}y                                  (5.3)

of the phase space C2 of (5.1). Viewing the action of an element of the group as a
coordinate transformation, in (x′ , y′ )-coordinates system (5.1) has the form

   ẋ′ = − ∑_{(p,q)∈S̃} a(ϕ)_{pq} x′^{p+1} y′^{q},   ẏ′ = ∑_{(p,q)∈S̃} b(ϕ)_{qp} x′^{q} y′^{p+1},

where the coefficients of the transformed system are

   a(ϕ)_{pj qj} = a_{pj qj} e^{i(pj − qj)ϕ},   b(ϕ)_{qj pj} = b_{qj pj} e^{i(qj − pj)ϕ},      (5.4)

for j = 1, …, ℓ. Once the index set S̃ has been ordered in some manner, for any
fixed angle ϕ the equations in (5.4) determine an invertible linear mapping Uϕ of
the space E(a, b) of parameters of (5.1) onto itself, which we will represent as the
block diagonal 2ℓ × 2ℓ matrix

   Uϕ = ( Uϕ^{(a)}   0
           0    Uϕ^{(b)} ),

where Uϕ^{(a)} and Uϕ^{(b)} are diagonal matrices that act on the coordinates a and b, respectively.
Example 5.1.3. For the family of systems

   ẋ = −a00x − a−11y − a20x³,   ẏ = b1,−1x + b00y + b02y³        (5.5)



S̃ is the ordered set {(0, 0), (−1, 1), (2, 0)}, and equation (5.4) gives the collection
of 2ℓ = 6 equations

   a(ϕ)00 = a00 e^{i(0−0)ϕ}   a(ϕ)−11 = a−11 e^{i(−1−1)ϕ}   a(ϕ)20 = a20 e^{i(2−0)ϕ}
   b(ϕ)00 = b00 e^{i(0−0)ϕ}   b(ϕ)1,−1 = b1,−1 e^{i(1−(−1))ϕ}   b(ϕ)02 = b02 e^{i(0−2)ϕ}

so that

   Uϕ · (a, b)^T = ( Uϕ^{(a)}  0 ; 0  Uϕ^{(b)} ) · (a00, a−11, a20, b02, b1,−1, b00)^T
                 = diag(1, e^{−i2ϕ}, e^{i2ϕ}, e^{−i2ϕ}, e^{i2ϕ}, 1) · (a00, a−11, a20, b02, b1,−1, b00)^T
                 = (a00, a−11e^{−i2ϕ}, a20e^{i2ϕ}, b02e^{−i2ϕ}, b1,−1e^{i2ϕ}, b00)^T.

Thus here

   Uϕ^{(a)} = diag(1, e^{−i2ϕ}, e^{i2ϕ})   and   Uϕ^{(b)} = diag(e^{−i2ϕ}, e^{i2ϕ}, 1).

Note that Uϕ^{(a)} and Uϕ^{(b)} do not really depend on a and b; rather, the notation
simply indicates that Uϕ^{(a)} acts on the vector composed of the coefficients of the first
equation of (5.1) and Uϕ^{(b)} acts on the vector composed of the coefficients of the second
equation of (5.1).
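The diagonal action (5.4) is easy to realize numerically. A minimal sketch for the index set of Example 5.1.3 (the ordering of (a, b) and the exponents p − q, q − p follow the text; the helper names are ours):

```python
import cmath

S = [(0, 0), (-1, 1), (2, 0)]                 # ordered index set of (5.5)
# exponents of e^{i phi} for (a_{p1q1},...,a_{plql}, b_{qlpl},...,b_{q1p1})
exponents = [p - q for (p, q) in S] + [q - p for (p, q) in reversed(S)]

def U(phi):
    """Diagonal of the matrix U_phi acting on the coefficient vector (a, b)."""
    return [cmath.exp(1j*k*phi) for k in exponents]

# matches diag(1, e^{-i2 phi}, e^{i2 phi}, e^{-i2 phi}, e^{i2 phi}, 1) above
assert exponents == [0, -2, 2, -2, 2, 0]

# group property: U_{phi1} U_{phi2} = U_{phi1 + phi2} entrywise
u1, u2, u12 = U(0.3), U(0.5), U(0.8)
assert all(abs(a*b - c) < 1e-12 for a, b, c in zip(u1, u2, u12))
```

The final assertion is exactly the statement below that following one rotation with another corresponds to the group operation in U.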
We will usually write (5.4) in the short form

   (a(ϕ), b(ϕ)) = Uϕ · (a, b) = (Uϕ^{(a)} · a, Uϕ^{(b)} · b).

The set U = {Uϕ : ϕ ∈ R} is a group, a subgroup of the group of invertible
2ℓ × 2ℓ matrices with complex entries, under multiplication. In the context of U the
group operation corresponds to following one rotation with another.

Definition 5.1.4. The group U = {Uϕ : ϕ ∈ R} is the rotation group of family (5.1).
A polynomial invariant of the group U is termed an invariant of the rotation group
or, more simply, an invariant.

Since the terminology is so similar, care must be taken not to confuse the rotation
group U of family (5.1) with the group of rotations (5.3) of the phase plane, which
is not associated with any particular family of systems of differential equations.
The rotation group U of family (5.1) acts on E(a, b) = C2ℓ by multiplication on
the left. We wish to identify all polynomial invariants of this group action. The poly-
nomials in question are elements of C[a, b]. They identify polynomial expressions

in the coefficients of elements of family (5.1) that are unchanged under a rotation of
coordinates. Since Uϕ changes only the coefficients of polynomials, a polynomial
f ∈ C[a, b] is an invariant of the group U if and only if each of its terms is an invari-
ant, so it suffices to find the invariant monomials. By (5.4), for ν ∈ N2ℓ 0 , the image
ν ν ν2ℓ
of the corresponding monomial [ν ] = aνp11 q1 · · · a pℓℓ qℓ bqℓℓ+1
pℓ · · · b q1 p1 ∈ C[a, b] under Uϕ
is the monomial
ν
a(ϕ )νp11 q1 · · · a(ϕ )νpℓℓ qℓ b(ϕ )qℓℓ+1 ν2ℓ
pℓ · · · b(ϕ )q1 p1

= aνp11 q1 eiϕν1 (p1 −q1 ) · · · aνpℓℓ qℓ eiϕνℓ (pℓ −qℓ )


ν iϕνℓ+1 (qℓ −pℓ )
× bqℓℓ+1
pℓ e · · · bνq12ℓp1 eiϕν2ℓ (q1 −p1 ) (5.6)
iϕ [ν1 (p1 −q1 )+···+νℓ (pℓ −qℓ )+νℓ+1 (qℓ −pℓ )+···+ν2ℓ (q1 −p1 )]
=e
ν
× aνp11q1 · · · aνpℓℓ qℓ bqℓℓ+1 ν2ℓ
pℓ · · · b q1 p1 .

The quantity in square brackets in the exponent in the first term is L_1(ν) − L_2(ν), where L(ν) = (L_1(ν), L_2(ν)) is the linear operator on N_0^{2ℓ} defined with respect to the ordered set S̃ by (3.71):

L(ν) = (L_1(ν), L_2(ν))
     = ν_1(p_1, q_1) + ··· + ν_ℓ(p_ℓ, q_ℓ) + ν_{ℓ+1}(q_ℓ, p_ℓ) + ··· + ν_{2ℓ}(q_1, p_1)
     = (p_1ν_1 + ··· + p_ℓν_ℓ + q_ℓν_{ℓ+1} + ··· + q_1ν_{2ℓ},  q_1ν_1 + ··· + q_ℓν_ℓ + p_ℓν_{ℓ+1} + ··· + p_1ν_{2ℓ}).

Thus the monomial [ν] is an invariant if and only if L_1(ν) = L_2(ν). As in display (3.98) of Section 3.5, we define the set M by

M = {ν ∈ N_0^{2ℓ} : L(ν) = (j, j) for some j ∈ N_0},    (5.7)

which has the structure of a monoid under addition. (Recall that a monoid is a set
M together with a binary operation ∗ that is associative and for which there is an
identity element ι : for all a, b, c ∈ M, a ∗ (b ∗ c) = (a ∗ b) ∗ c and a ∗ ι = ι ∗ a = a.)
We have established the following proposition.

Proposition 5.1.5. The monomial [ν] is invariant under the rotation group U of (5.1) if and only if L_1(ν) = L_2(ν), that is, if and only if ν ∈ M.

Since, for any ν ∈ N_0^{2ℓ}, L_1(ν) − L_2(ν) = −(L_1(ν̂) − L_2(ν̂)), the monomial [ν] is invariant under U if and only if its conjugate [ν̂] is.
Before considering an example, we wish to make the following point. In order to find all ν ∈ N_0^{2ℓ} such that ν ∈ M, we would naturally express the condition that L_1(ν) = L_2(ν) = j ∈ N_0 as L_1(ν) − L_2(ν) = 0, which, written out completely, is

L_1(ν) − L_2(ν) = (p_1 − q_1)ν_1 + (p_2 − q_2)ν_2 + ··· + (p_ℓ − q_ℓ)ν_ℓ + (q_ℓ − p_ℓ)ν_{ℓ+1} + ··· + (q_1 − p_1)ν_{2ℓ} = 0,    (5.8)
218 5 Invariants of the Rotation Group

and look for solutions of this latter equation. Certainly, if ν ∈ M , then ν solves
(5.8). On the other hand, if ν ∈ N_0^{2ℓ} solves (5.8), that does not a priori guarantee
that ν ∈ M , since membership in M requires that the common value of L1 (ν ) and
L2 (ν ) be nonnegative. The following proposition asserts that it always is. The proof
is outlined in Exercise 5.3.

Proposition 5.1.6. The set of all solutions in N_0^{2ℓ} of equation (5.8) coincides with the monoid M defined by equation (5.7).

Example 5.1.7. We will find all the monomials of degree at most three that are
invariant under the rotation group U for the family of systems (5.5) of Example
5.1.3. Since S̃ = {(0, 0), (−1, 1), (2, 0)}, for ν ∈ N_0^6,

L(ν ) = ν1 (0, 0) + ν2 (−1, 1) + ν3 (2, 0) + ν4 (0, 2) + ν5 (1, −1) + ν6 (0, 0)


= (−ν2 + 2ν3 + ν5 , ν2 + 2ν4 − ν5 ),

so that equation (5.8) reads

−2ν2 + 2ν3 − 2ν4 + 2ν5 = 0. (5.9)

deg([ν]) = 0. The monomial 1, corresponding to ν = 0 ∈ N_0^6, is of course always an invariant.
deg([ν]) = 1. In this case ν = (0, …, 0, 1, 0, …, 0) ∈ N_0^6 with the 1 in position j. Clearly (5.9) holds if and only if ν = e_1 or ν = e_6, yielding a00^1 a−11^0 a20^0 b02^0 b1,−1^0 b00^0 = a00 and a00^0 a−11^0 a20^0 b02^0 b1,−1^0 b00^1 = b00, respectively.
deg([ν]) = 2. If ν = 2e_j and satisfies (5.9), then j = 1 or j = 6, yielding a00^2 and b00^2, respectively. If ν = e_j + e_k for j < k, then (5.9) holds if and only if either (j, k) = (1, 6) or one of j and k corresponds to a term in (5.9) with a plus sign and the other to a term with a minus sign, hence (j, k) ∈ P := {(2, 3), (2, 5), (3, 4), (4, 5)}. The former case gives a00 b00; the latter case gives

ν = (0, 1, 1, 0, 0, 0) yielding a00^0 a−11^1 a20^1 b02^0 b1,−1^0 b00^0 = a−11 a20
ν = (0, 1, 0, 0, 1, 0) yielding a00^0 a−11^1 a20^0 b02^0 b1,−1^1 b00^0 = a−11 b1,−1
ν = (0, 0, 1, 1, 0, 0) yielding a00^0 a−11^0 a20^1 b02^1 b1,−1^0 b00^0 = a20 b02
ν = (0, 0, 0, 1, 1, 0) yielding a00^0 a−11^0 a20^0 b02^1 b1,−1^1 b00^0 = b02 b1,−1.

deg([ν]) = 3. If all but one entry in ν is zero and the nonzero entry is 3, then clearly (5.9) holds precisely for ν = 3e_1 and ν = 3e_6, yielding a00^3 and b00^3, respectively. If ν = 2e_j + e_k, then it is clear from considerations of parity that (5.9) holds if and only if (j, k) ∈ {(1, 6), (6, 1)}, yielding a00^2 b00 and a00 b00^2. If ν = e_j + e_k + e_m for distinct j, k, and m, then again by parity considerations we get either j = 1 and (k, m) in the set P of pairs specified in the previous case, or j = 6 and (k, m) ∈ P. Thus we finally get the four monomials from the previous case multiplied by a00 and the same four multiplied by b00.

To summarize, we have found that the full set of monomial invariants of degree
at most three for family (5.5) is

degree 0: 1
degree 1: a00, b00
degree 2: a00^2, b00^2, a00 b00, a−11 a20, a−11 b1,−1, a20 b02, b02 b1,−1
degree 3: a00^3, b00^3, a00^2 b00, a00 b00^2, a00 a−11 a20, a00 a−11 b1,−1, a00 a20 b02, a00 b02 b1,−1, b00 a−11 a20, b00 a−11 b1,−1, b00 a20 b02, b00 b02 b1,−1.
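The hand count above is easy to verify mechanically: by (5.9), [ν] is invariant exactly when ζ · ν = 0, where ζ = (0, −2, 2, −2, 2, 0) is the vector of coefficients appearing there (the characteristic vector of Definition 5.1.8 below). The following short Python sketch — our own illustration, not part of the text — enumerates all invariant exponent vectors of degree at most three:

```python
from itertools import product

# Coefficient order for family (5.5): (a00, a-11, a20, b02, b1-1, b00).
# Condition (5.9) reads -2*nu2 + 2*nu3 - 2*nu4 + 2*nu5 = 0, i.e. zeta . nu = 0.
zeta = (0, -2, 2, -2, 2, 0)

def invariant_exponents(max_deg):
    """All nu in N_0^6 with |nu| <= max_deg satisfying zeta . nu = 0."""
    hits = []
    for nu in product(range(max_deg + 1), repeat=6):
        if sum(nu) <= max_deg and sum(z * n for z, n in zip(zeta, nu)) == 0:
            hits.append(nu)
    return hits

counts = {}
for nu in invariant_exponents(3):
    counts[sum(nu)] = counts.get(sum(nu), 0) + 1
print(counts)  # degree -> number of invariant monomials: {0: 1, 1: 2, 2: 7, 3: 12}
```

The counts 1, 2, 7, and 12 match the four rows of the table above.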

The following definition arises from the characterization of M that was given by
Proposition 5.1.6.
Definition 5.1.8. Fix an ordered index set S̃. Then for the corresponding family (5.1) we define:
the characteristic vector:
the characteristic vector:

ζ = (p1 − q1, . . . , pℓ − qℓ, qℓ − pℓ, . . . , q1 − p1 ) ∈ Z2ℓ , (5.10)

the characteristic number:

GCD(ζ) = GCD(p_1 − q_1, …, p_ℓ − q_ℓ) ∈ N_0

when ζ ≠ (0, …, 0), and GCD(ζ) = 1 otherwise, and

the reduced characteristic vector:

κ = ζ / GCD(ζ).

It will be convenient to introduce the notation z = (z_1, z_2, …, z_{2ℓ}) to denote a generic ordered vector (a, b) = (a_{p_1,q_1}, a_{p_2,q_2}, …, a_{p_ℓ,q_ℓ}, b_{q_ℓ,p_ℓ}, …, b_{q_2,p_2}, b_{q_1,p_1}) of coefficients of system (5.1), regarded as variables in a polynomial, so that

z_j = a_{p_j,q_j} if 1 ≤ j ≤ ℓ, and z_j = b_{q_{2ℓ−j+1},p_{2ℓ−j+1}} if ℓ + 1 ≤ j ≤ 2ℓ.    (5.11)

A pair of variables z_r and z_s in z, 1 ≤ r < s ≤ 2ℓ, are conjugate variables provided there exists u ∈ {1, …, ℓ} such that z_r = z_u and z_s = z_{2ℓ−u+1}, so that in terms of the original variable names, (z_r, z_s) = (a_{p_u q_u}, b_{q_u p_u}).

Definition 5.1.9. A unary invariant monomial is an invariant monomial that depends on only one variable, or on only one variable and its conjugate variable. A binary invariant monomial is an invariant monomial that depends on two nonconjugate variables, and possibly on one or both of their conjugate variables.

To illustrate, from Example 5.1.7 four invariant monomials for family (5.5) are
f1 = a00^2, f2 = a−11 b1,−1, f3 = a−11 a20, and f4 = a00 a−11 b1,−1. Obviously, f1 is a unary invariant, and so is f2, since a−11 and b1,−1 are conjugate variables; f3 and f4
are binary invariants because the two variables that appear in f3 are not conjugate,
while of the three variables appearing in f4 , two are conjugate.
Of course, if, for a general family (5.1), the characteristic vector ζ = (0, . . . , 0)
then every monomial is invariant. In such a case the irreducible unary invariant
monomials are all a pq and all bqp , and there are no irreducible binary invariant
monomials. The following proposition identifies the irreducible unary and binary
invariant monomials in the nontrivial case.

Proposition 5.1.10. Fix the ordered index set S̃ and let ζ be the characteristic vector (Definition 5.1.8) of the corresponding family (5.1). Suppose ζ ≠ (0, …, 0).
1. The unary irreducible invariant monomials of family (5.1) are all the monomials of the form a_{pp}, b_{pp}, and a_{pq} b_{qp} for p ≠ q.
2. The binary irreducible invariant monomials of family (5.1) are all the monomials of the form

z_r^{|ζ_s|/GCD(ζ_r,ζ_s)} z_s^{|ζ_r|/GCD(ζ_r,ζ_s)},    (5.12)

where z_r and z_s are defined by (5.11), and r and s are any pair from {1, 2, …, 2ℓ} such that ζ_r ζ_s < 0 and z_r and z_s are not conjugate variables (r + s ≠ 2ℓ + 1).
Proof. (1) Propositions 5.1.5 and 5.1.6 imply that for ν = (0, …, 0, 1, 0, …, 0) with the 1 in position s, [ν] = z_s is a unary invariant if and only if the corresponding coordinate of the characteristic vector ζ is equal to zero, that is, if and only if the corresponding coefficient of system (5.1), a_{p_s q_s} or b_{q_s p_s}, satisfies p_s = q_s. Similarly, for ν = (0, …, µ, …, η, …, 0) with µ in position r, η in position s, and s = 2ℓ − r + 1, by Proposition 5.1.5 and (5.8) the corresponding monomial

z_r^µ z_s^η = a_{p_r q_r}^µ b_{q_r p_r}^η    (5.13)

is a unary invariant if and only if (µ, η) is a solution in N × N of the equation µ(p_r − q_r) + η(q_r − p_r) = 0. If p_r = q_r, then a_{p_r q_r} and b_{q_r p_r} are unary invariants, so the monomial is not irreducible. Therefore µ = η, and the only irreducible invariant of the form (5.13) is a_{pq} b_{qp}, p ≠ q.
Clearly, no invariant monomial containing three or more distinct variables can be
a unary invariant, since at most two of the variables can be conjugate.
(2) By Propositions 5.1.5 and 5.1.6, for nonconjugate coefficients z_r and z_s, z_r^µ z_s^η is a binary invariant monomial if and only if µζ_r + ηζ_s = 0 and µη > 0 (else the invariant is actually unary). If ζ_r = 0, then ζ_s = 0 is forced, in which case it follows directly from the definition that each of z_r^µ and z_s^η is a unary invariant monomial, so that z_r^µ z_s^η is not an irreducible invariant.
Thus for an irreducible binary invariant of the form z_r^µ z_s^η we must have ζ_r ζ_s ≠ 0. In fact, it must be the case that ζ_r ζ_s < 0, since µ, η ∈ N. One solution to the equation µζ_r + ηζ_s = 0 is obviously (µ, η) = (|ζ_s|/G, |ζ_r|/G), where G = GCD(ζ_r, ζ_s) > 0, so that (5.12) gives a binary invariant. In Exercise 5.4 the reader is asked to show that if (µ, η) is any solution to µζ_r + ηζ_s = 0, then there exists a ∈ N such that

z_r^µ z_s^η = ( z_r^{|ζ_s|/G} z_s^{|ζ_r|/G} )^a,

so that (5.12) is irreducible and is the only irreducible binary invariant monomial of the form z_r^µ z_s^η involving nonconjugate coefficients z_r and z_s.
Treatment of the invariant monomials that contain two conjugate variables and a
distinct third variable, or contain two distinct pairs of conjugate variables, is left as
Exercise 5.5. 

Remark 5.1.11. In terms of the original variables a_{p_j q_j} and b_{q_j p_j}, the second statement of the proposition is that the binary irreducible invariants are (Exercise 5.6):
for any (r, s) ∈ N^2 for which 1 ≤ r, s ≤ ℓ and (p_r − q_r)(p_s − q_s) < 0:

a_{p_r q_r}^{|p_s−q_s|/GCD(p_r−q_r, p_s−q_s)} a_{p_s q_s}^{|p_r−q_r|/GCD(p_r−q_r, p_s−q_s)}

and

b_{q_r p_r}^{|p_s−q_s|/GCD(p_r−q_r, p_s−q_s)} b_{q_s p_s}^{|p_r−q_r|/GCD(p_r−q_r, p_s−q_s)},

and for any (r, s) ∈ N^2 for which 1 ≤ r, s ≤ ℓ, r ≠ s, and (p_r − q_r)(p_s − q_s) > 0:

a_{p_r q_r}^{|p_s−q_s|/GCD(p_r−q_r, p_s−q_s)} b_{q_s p_s}^{|p_r−q_r|/GCD(p_r−q_r, p_s−q_s)}.

Example 5.1.12. By part (1) of Proposition 5.1.10, all the unary irreducible invari-
ant monomials for family (5.5) are a00 , b00 , a−11 b1,−1 , and a20 b02 . We find all the
binary irreducible invariant monomials for this family in two ways.
(i) Using Proposition 5.1.10(2) directly, we first identify all pairs (r, s), 1 ≤ r ≤ 6, 1 ≤ s ≤ 6, with r ≠ s (else ζ_r ζ_s ≥ 0) and r + s ≠ 2 · 3 + 1 = 7, and compute ζ_r ζ_s for each:

(1, 2), (1, 3), (1, 4), (1, 5): ζ_1 ζ_s = (0)ζ_s = 0
(2, 3): ζ_2 ζ_3 = (−2)(2) = −4
(2, 4): ζ_2 ζ_4 = (−2)(−2) = 4
(2, 6): ζ_2 ζ_6 = (−2)(0) = 0
(3, 5): ζ_3 ζ_5 = (2)(2) = 4
(3, 6): ζ_3 ζ_6 = (2)(0) = 0
(4, 5): ζ_4 ζ_5 = (−2)(2) = −4
(4, 6), (5, 6): ζ_r ζ_6 = ζ_r(0) = 0.

Working with the two pairs (2, 3) and (4, 5) for which ζ_r ζ_s < 0, we obtain

z_2^{|2|/2} z_3^{|−2|/2} = z_2 z_3 = a−11 a20   and   z_4^{|2|/2} z_5^{|−2|/2} = z_4 z_5 = b02 b1,−1.

(ii) Using Remark 5.1.11:
(a) we identify all pairs (r, s), 1 ≤ r, s ≤ 3, with r ≠ s (else (p_r − q_r)(p_s − q_s) ≥ 0) and compute (p_r − q_r)(p_s − q_s) for each:

(1, 2): (p_1 − q_1)(p_2 − q_2) = (0)(−2) = 0
(1, 3): (p_1 − q_1)(p_3 − q_3) = (0)(2) = 0
(2, 3): (p_2 − q_2)(p_3 − q_3) = (−2)(2) = −4,

hence

a−11^{|2|/2} a20^{|−2|/2} = a−11 a20   and   b1,−1^{|2|/2} b02^{|−2|/2} = b1,−1 b02;

(b) we identify all pairs (r, s), 1 ≤ r, s ≤ 3, with r 6= s and (pr − qr )(ps − qs ) > 0: the
computation already done shows that there are none.
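Both searches can be automated directly from formula (5.12). The Python sketch below is our own illustration (the tuple `names` simply labels z_1, …, z_6 for family (5.5)); it recovers the two binary irreducible invariant monomials found above:

```python
from math import gcd

zeta = (0, -2, 2, -2, 2, 0)    # characteristic vector of family (5.5)
names = ("a00", "a-11", "a20", "b02", "b1-1", "b00")  # z1, ..., z6
ell = 3

binary_irreducible = []
for r in range(1, 7):                 # 1-based indices, as in the text
    for s in range(r + 1, 7):
        if r + s == 2 * ell + 1:      # conjugate pair: excluded
            continue
        if zeta[r - 1] * zeta[s - 1] >= 0:   # need zeta_r * zeta_s < 0
            continue
        G = gcd(abs(zeta[r - 1]), abs(zeta[s - 1]))
        # formula (5.12): z_r^{|zeta_s|/G} * z_s^{|zeta_r|/G}
        binary_irreducible.append(
            (names[r - 1], abs(zeta[s - 1]) // G,
             names[s - 1], abs(zeta[r - 1]) // G))

print(binary_irreducible)
# -> [('a-11', 1, 'a20', 1), ('b02', 1, 'b1-1', 1)]
```

Each 4-tuple records (variable, exponent, variable, exponent), so the output is exactly a−11 a20 and b02 b1,−1.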
Unary and binary invariants are of particular importance because, up to a natural
technical condition on the positions of zero entries, if all unary and binary invariants
agree on a pair of coefficient vectors in the parameter space E(a, b) of family (5.1),
then all invariants agree on the two vectors of coefficients, which we will prove later.
In order to obtain the main result on the solution of the equation σ^{−ζ} c = U_ϕ · d mentioned in the introductory paragraphs of this section, we will need a pair of technical lemmas. The first pertains to the equation

L_1(ν) − L_2(ν) = (p_1 − q_1)ν_1 + ··· + (p_ℓ − q_ℓ)ν_ℓ + (q_ℓ − p_ℓ)ν_{ℓ+1} + ··· + (q_1 − p_1)ν_{2ℓ} = GCD(ζ),    (5.14a)

which can be written in the equivalent form

ζ_1 ν_1 + ··· + ζ_ℓ ν_ℓ + ζ_{ℓ+1} ν_{ℓ+1} + ··· + ζ_{2ℓ} ν_{2ℓ} = GCD(ζ).    (5.14b)

Lemma 5.1.13. If ζ ∈ Z^{2ℓ} and ζ ≠ (0, …, 0), then equation (5.14) has a solution ν ∈ N_0^{2ℓ} for which ν_j ≥ 0 for all j and ν_j > 0 for at least one j.

Proof. By Exercise 5.7, the equation (p_1 − q_1)t_1 + ··· + (p_ℓ − q_ℓ)t_ℓ = GCD(ζ) has a solution t ∈ Z^ℓ. Let t⁺ ∈ N_0^ℓ be the positive part of t, defined by (t⁺)_j = max(t_j, 0), and let t⁻ ∈ N_0^ℓ be the negative part of t, defined by (t⁻)_j = max(−t_j, 0), so that t = t⁺ − t⁻. Let t̂⁻ be the involution of t⁻ defined by (t̂⁻)_j = (t⁻)_{ℓ−j+1}, and form the doubly long string ν ∈ N_0^{2ℓ} by concatenating t⁺ and t̂⁻: ν = (t⁺, t̂⁻). Then |ν| > 0, and writing out (5.14) in detail shows that it is satisfied by this choice of ν. □
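The proof is constructive and can be sketched in code. In the Python sketch below, the multivariable Bézout routine `bezout_vector` is our own helper (iterated extended Euclid), not something defined in the text; the final assertion-style check is exactly equation (5.14b) for family (5.5):

```python
def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g (g may come out negative)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def bezout_vector(coeffs):
    """Integer vector t with sum(c*t) equal to the positive gcd of coeffs."""
    g, t = 0, [0] * len(coeffs)
    for i, c in enumerate(coeffs):
        g2, x, y = ext_gcd(g, c)
        t = [x * tj for tj in t]   # rescale earlier entries
        t[i] = y
        g = g2
    if g < 0:
        g, t = -g, [-tj for tj in t]
    return g, t

# Family (5.5): (p_j - q_j) = (0, -2, 2), so GCD(zeta) = 2.
half = [0, -2, 2]
g, t = bezout_vector(half)
tplus = [max(tj, 0) for tj in t]                   # t+
tminus_hat = [max(-tj, 0) for tj in reversed(t)]   # involution of t-
nu = tplus + tminus_hat                            # candidate solution of (5.14)
zeta = half + [-c for c in reversed(half)]
print(g, nu, sum(z * n for z, n in zip(zeta, nu)))  # -> 2 [0, 0, 1, 0, 0, 0] 2
```

Here ν = (0, 0, 1, 0, 0, 0) corresponds to the single coefficient a20, and indeed ζ · ν = 2 = GCD(ζ).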
To state and prove the second lemma and the remaining results of this section most simply, we introduce the following notation. We will denote a specific element of the coefficient set E(a, b) = C^{2ℓ} of family (5.1) by c or d. (The notation (c^{(a)}, c^{(b)}) analogous to the notation (U_ϕ^{(a)}, U_ϕ^{(b)}) employed for an element U_ϕ of the rotation group might be more appropriate, but is overly cumbersome.) For c ∈ C^{2ℓ} we let R(c) denote the set of indices of the nonzero coordinates of the vector c.

Lemma 5.1.14. Fix the ordered index set S̃. For c and d in E(a, b), suppose that R(c) = R(d) and that for every unary or binary invariant monomial J(a, b) of family (5.1) the condition

J(c) = J(d)    (5.15)

holds. Then for any r, s ∈ R(c) = R(d),

(d_r / c_r)^{κ_s} = (d_s / c_s)^{κ_r},    (5.16)

where κ j denotes the jth entry in the reduced characteristic vector κ (Definition
5.1.8).

Proof. Suppose elements c and d of E(a, b) satisfy the hypotheses of the lemma and that r and s lie in R(c), so that none of c_r, c_s, d_r, and d_s is zero. If r = s, then certainly (5.16) holds, so we suppose that r ≠ s.
If κ_s = 0 (that is, p_s − q_s = 0), then z_s (which is equal to a_{p_s p_s} or to b_{p_{2ℓ−s+1} p_{2ℓ−s+1}}) is a unary invariant. By (5.15), c_s = d_s and so (5.16) holds. The case κ_r = 0 is similar.
Now suppose κ_r κ_s < 0. Consider the monomial defined by (5.12). If z_r and z_s are not conjugate variables, then it is a binary invariant. If they are conjugate variables, then (relabelling if necessary so that r < s) since now |ζ_r| = |ζ_s|, it is a power of the unary irreducible invariant a_{p_r q_r} b_{q_r p_r}, hence is a unary invariant. Either way, by hypothesis (5.15) it agrees on c and d, so that

c_r^{|κ_s|} c_s^{|κ_r|} = d_r^{|κ_s|} d_s^{|κ_r|}    (5.17)

(where we have raised each side of (5.15) to the power GCD(ζ_s, ζ_r)/GCD(ζ)). Considering each of the two possibilities κ_r < 0 < κ_s and κ_r > 0 > κ_s, we see that (5.17) yields (5.16).
Finally, consider the case κ_r κ_s > 0. Suppose κ_r and κ_s are both positive. Since κ_j = −κ_{2ℓ−j+1} for all j ∈ {1, …, 2ℓ} and r ≠ s, so that z_r and z_{2ℓ−s+1} cannot be conjugate variables, by Proposition 5.1.10

z_r^{|ζ_{2ℓ−s+1}|/GCD(ζ_r, ζ_{2ℓ−s+1})} z_{2ℓ−s+1}^{|ζ_r|/GCD(ζ_r, ζ_{2ℓ−s+1})} = z_r^{ζ_s/GCD(ζ_r, ζ_{2ℓ−s+1})} z_{2ℓ−s+1}^{ζ_r/GCD(ζ_r, ζ_{2ℓ−s+1})}

is a binary invariant. Thus if we raise it to the power GCD(ζ_r, ζ_{2ℓ−s+1})/GCD(ζ), we find that z_r^{κ_s} z_{2ℓ−s+1}^{κ_r} is a binary invariant, hence, by (5.15),

c_r^{κ_s} c_{2ℓ−s+1}^{κ_r} = d_r^{κ_s} d_{2ℓ−s+1}^{κ_r}.    (5.18)

Since by Proposition 5.1.10 z_s z_{2ℓ−s+1} is a unary invariant, by (5.15) we have that c_s^{−κ_r} c_{2ℓ−s+1}^{−κ_r} = d_s^{−κ_r} d_{2ℓ−s+1}^{−κ_r}, which, when multiplied by (5.18), yields (5.16). The case that κ_r and κ_s are both negative is similar. □

Suppose σ ∈ R \ {0}, c ∈ E(a, b), ν ∈ N_0^{2ℓ}, and ζ ∈ Z^{2ℓ} is the characteristic vector of family (5.1). We make the following notational conventions:

σ^ζ = (σ^{ζ_1}, …, σ^{ζ_{2ℓ}}),
σ^{−ζ} c = (σ^{−ζ_1} c_1, …, σ^{−ζ_{2ℓ}} c_{2ℓ}),
c^ν = [ν]|_c = c_1^{ν_1} ··· c_{2ℓ}^{ν_{2ℓ}}.

Theorem 5.1.15. Fix the ordered index set S̃, hence the family (5.1), the characteristic vector ζ, and rotation group U = {U_ϕ : ϕ ∈ R} for the family (5.1). For c, d ∈ E(a, b), there exist σ ∈ R⁺ and ϕ ∈ [0, 2π) such that

σ^{−ζ} c = U_ϕ · d    (5.19)

if and only if R(c) = R(d) and, for all unary and binary invariants J(a, b) of family
(5.1), condition (5.15) holds.

Proof. Suppose there exists a solution σ_0 > 0 and ϕ_0 ∈ [0, 2π) to (5.19). By definition of σ_0^{−ζ} c and U_{ϕ_0} · d we have c_j = σ_0^{ζ_j} e^{iζ_j ϕ_0} d_j for every j, 1 ≤ j ≤ 2ℓ, hence R(c) = R(d). Let [ν] be any invariant monomial. Then by Propositions 5.1.5 and 5.1.6, (5.8) holds for ν, which also reads ζ_1 ν_1 + ··· + ζ_{2ℓ} ν_{2ℓ} = 0 and immediately yields

[ν]|_c = c^ν = c_1^{ν_1} ··· c_{2ℓ}^{ν_{2ℓ}} = (σ_0^{ζ_1} e^{iζ_1 ϕ_0} d_1)^{ν_1} ··· (σ_0^{ζ_{2ℓ}} e^{iζ_{2ℓ} ϕ_0} d_{2ℓ})^{ν_{2ℓ}}
  = d_1^{ν_1} ··· d_{2ℓ}^{ν_{2ℓ}} σ_0^{ζ_1ν_1 + ··· + ζ_{2ℓ}ν_{2ℓ}} e^{i(ζ_1ν_1 + ··· + ζ_{2ℓ}ν_{2ℓ})ϕ_0} = d^ν = [ν]|_d.

Thus all invariant monomials agree on c and d, hence all invariants do.
Conversely, suppose that c, d ∈ E(a, b) satisfy R(c) = R(d) and that every unary and every binary invariant monomial for family (5.1) agree on c and d. We must show that there exist σ_0 ∈ R⁺ and ϕ_0 ∈ [0, 2π) solving the system of equations

c_r = d_r σ^{ζ_r} e^{iζ_r ϕ} for r = 1, …, 2ℓ,    (5.20)

that is, such that

c_r = d_r σ_0^{ζ_r} e^{iζ_r ϕ_0} for r = 1, …, 2ℓ.    (5.21)

If ζ = (0, …, 0), then our assumptions on c and d force c = d (Exercise 5.8) and (5.21) holds with σ_0 = 1 and ϕ_0 = 0. Hence suppose ζ ≠ (0, …, 0). First we treat the case that every coordinate in each of the vectors c and d is nonzero. Let µ ∈ N_0^{2ℓ} be the solution to (5.14) guaranteed by Lemma 5.1.13 to exist, and consider the equation, for unknown σ and ϕ,

c^µ = d^µ σ^{GCD(ζ)} e^{iGCD(ζ)ϕ}.

By assumption, c^µ and d^µ are specific nonzero numbers in C, so we may write this as

c^µ / d^µ = σ^{GCD(ζ)} e^{iGCD(ζ)ϕ}.

But the left-hand side is a nonzero complex number r_0 e^{iθ_0}, r_0 ∈ R⁺, θ_0 ∈ [0, 2π), hence σ_0 = r_0^{1/GCD(ζ)} and ϕ_0 = θ_0/GCD(ζ) satisfy

c^µ = d^µ σ_0^{GCD(ζ)} e^{iGCD(ζ)ϕ_0}.    (5.22)

We will now show that σ_0 and ϕ_0 are as required.
Fix r ∈ {1, …, 2ℓ}, raise each side of (5.22) to the power κ_r, and use the identity GCD(ζ)κ_r = ζ_r to obtain

(c^µ)^{κ_r} = (d^µ)^{κ_r} σ_0^{ζ_r} e^{iζ_r ϕ_0}.    (5.23)

For each s ∈ {1, …, 2ℓ}, form (5.16), raise each side to the power µ_s, and multiply the resulting expressions together to obtain

(d_r/c_r)^{κ_1 µ_1} ··· (d_r/c_r)^{κ_{2ℓ} µ_{2ℓ}} = (d_1/c_1)^{κ_r µ_1} ··· (d_{2ℓ}/c_{2ℓ})^{κ_r µ_{2ℓ}},

whence

d_r^{κ_1 µ_1} ··· d_r^{κ_{2ℓ} µ_{2ℓ}} / (d_1^{κ_r µ_1} ··· d_{2ℓ}^{κ_r µ_{2ℓ}}) = c_r^{κ_1 µ_1} ··· c_r^{κ_{2ℓ} µ_{2ℓ}} / (c_1^{κ_r µ_1} ··· c_{2ℓ}^{κ_r µ_{2ℓ}}).

But by (5.14b), which µ solves, κ_1 µ_1 + ··· + κ_{2ℓ} µ_{2ℓ} = ζ_1 µ_1/GCD(ζ) + ··· + ζ_{2ℓ} µ_{2ℓ}/GCD(ζ) = 1, so the last expression simplifies to

d_r / (d^µ)^{κ_r} = c_r / (c^µ)^{κ_r},

which, when solved for (c^µ)^{κ_r} and inserted into (5.23), gives the equation with index r in system (5.21), as required.
Now suppose that there are zero elements in the vector c; by hypothesis, the corresponding elements in the vector d are zero. If all ζ_r corresponding to the nonzero elements c_r and d_r are equal to zero, then system (5.20) is satisfied for any ϕ and σ ≠ 0 (in this case the corresponding polynomials z_r are unary invariants). Otherwise, in order to see that there is a solution of the system composed of the remaining equations of (5.20),

c_r = d_r σ^{ζ_r} e^{iζ_r ϕ} for r ∈ R(c),

we proceed as above, but with the characteristic vector ζ replaced by the vector ζ(c) that is obtained from ζ by striking all entries ζ_r for which r ∉ R(c) (so that we also have GCD(ζ(c)) in place of GCD(ζ) and κ(c) in place of κ). Since Lemmas 5.1.13 and 5.1.14 are still valid, the proof goes through as before. □
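The "only if" computation in the first paragraph of the proof is easy to check numerically for family (5.5): build c from d via c_j = σ_0^{ζ_j} e^{iζ_j ϕ_0} d_j and evaluate a few invariant monomials on both vectors. A small Python sketch with arbitrarily chosen values (our own illustration):

```python
import cmath

zeta = (0, -2, 2, -2, 2, 0)                      # family (5.5)
d = [1+1j, 2-1j, 0.5j, 3+0j, -1+2j, 0.25-1j]     # arbitrary nonzero coefficients
sigma0, phi0 = 1.7, 0.9
c = [sigma0**z * cmath.exp(1j*z*phi0) * dj for z, dj in zip(zeta, d)]

def mono(vals, nu):
    """Evaluate the monomial [nu] at the coefficient vector vals."""
    out = 1
    for v, n in zip(vals, nu):
        out *= v**n
    return out

# Exponent vectors with zeta . nu = 0, i.e. invariant monomials.
invariant_nus = [(0, 1, 1, 0, 0, 0), (0, 0, 0, 1, 1, 0), (2, 1, 0, 0, 1, 3)]
gap = max(abs(mono(c, nu) - mono(d, nu)) for nu in invariant_nus)
print(gap < 1e-9)  # True: invariant monomials agree on c and d
```

A monomial with ζ · ν ≠ 0 (for instance ν = e_2 alone) does not agree on c and d, which is the content of the converse direction.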
Corollary 5.1.16. Fix the ordered index set S̃, hence the family (5.1), the characteristic vector ζ, and rotation group U_ϕ for the family (5.1). For c, d ∈ E(a, b), there exists ϕ ∈ [0, 2π) such that

c = U_ϕ · d    (5.24)

if and only if R(c) = R(d), |c_r| = |d_r| for r = 1, …, 2ℓ, and, for all unary and binary invariants J(a, b), condition (5.15) holds.

Proof. We repeat the proof of the theorem almost verbatim. The difference is that in this case, because of the condition on the moduli of the components of c and d, we have |c^µ| = |d^µ|, so that in the discussion leading up to (5.22), r_0 = 1 and in place of (5.22) we obtain the equation

c^µ = d^µ e^{iGCD(ζ)ϕ_0}.

The remainder of the proof applies without modification to show that ϕ_0 is as required. □

Proposition 5.1.17. Fix the ordered index set S̃, hence the family (5.1), the characteristic vector ζ, and rotation group U_ϕ for the family (5.1).
1. Suppose ζ ≠ (0, …, 0) and c, d ∈ E(a, b) satisfy R(c) = R(d) and J(c) = J(d) for all unary and binary invariants J(a, b). Then (σ_0, ϕ_0) ∈ R⁺ × [0, 2π) satisfies

σ_0^{−ζ} c = U_{ϕ_0} · d    (5.25)

if and only if for every solution ν = µ of (5.14), equation (5.22) holds:

c^µ = d^µ σ_0^{GCD(ζ)} e^{iGCD(ζ)ϕ_0}.    (5.26)

2. If in part (1) the condition R(c) = R(d) is replaced by the more restrictive condition |c_r| = |d_r| for r = 1, …, 2ℓ, then ϕ_0 ∈ [0, 2π) satisfies

c = U_{ϕ_0} · d    (5.27)

if and only if for every solution ν = µ of (5.14),

c^µ = d^µ e^{iGCD(ζ)ϕ_0}.    (5.28)

Proof. As to part (1), the paragraph in the proof of Theorem 5.1.15 that follows
(5.22) showed that if (σ0 , ϕ0 ) satisfies (5.26) then it satisfies (5.25). Conversely, if
(5.25) holds, and if ν = µ is any solution of (5.14), then for each r ∈ {1, . . . , 2ℓ}
raise (5.21), the rth component of (5.25), to the power µr and multiply all these
terms together. Because ν = µ solves (5.14), the resulting product reduces to (5.26).
The proof of statement (2) is practically the same. 

Remark 5.1.18. A basis or generating set of a monoid (M, +) written additively is a set B such that every element of M is a finite sum of elements of B, where summands need not be distinct. If for β ∈ B we define 0β to be the identity in M, and for c ∈ N and β ∈ B define cβ to be the c-fold sum β + ··· + β (c summands), then B is a basis of M if and only if every element of M is an N_0-linear combination of elements of B. It is clear that if (5.15) holds for all unary and binary invariants from a basis of the monoid M, then it holds for all unary and binary invariants (since every unary and binary invariant is a product of unary and binary invariants from a basis of M). Hence in the statements of Lemma 5.1.14, Theorem 5.1.15, Corollary 5.1.16, and Proposition 5.1.17 it is sufficient to require that condition (5.15) hold not for all unary and binary invariants but only for unary and binary invariants from a basis of M.

The precise statement concerning how agreement of unary and binary invariants
on a pair of coefficient vectors forces agreement of all invariants on the pair is the
following.

Theorem 5.1.19. Fix the ordered index set S̃. For c and d in E(a, b), suppose that
R(c) = R(d) and that for every unary or binary invariant monomial J of family (5.1),
J(c) = J(d). Then J(c) = J(d) holds for all invariants J(z).

Proof. It is sufficient to consider only invariant monomials. Thus let [ν] be any invariant monomial and suppose c and d satisfy the hypotheses of the theorem. Then by Theorem 5.1.15 there exist σ ∈ R⁺ and ϕ ∈ [0, 2π) such that σ^{−ζ} c = U_ϕ · d, which reads c_j = σ^{ζ_j} e^{iζ_j ϕ} d_j for all j = 1, 2, …, 2ℓ. The last two sentences of the first paragraph of the proof of Theorem 5.1.15, copied verbatim, complete the proof. □

We can now complete the proof of Theorem 3.5.8 of Section 3.5. For a coefficient
vector (a, b) = (a p1 q1 , . . . , a pℓ qℓ , bqℓ pℓ , . . . , bq1 p1 ) ∈ E(a, b), recall that (b̂, â) denotes
the involution of (a, b) defined by

(b̂, â) = (bq1 p1 , . . . , bqℓ pℓ , a pℓ qℓ , . . . , a p1 q1 ) . (5.29)

Theorem 5.1.20 (Theorem 3.5.8). Fix an ordered index set S, hence family (3.3).
The set R ⊂ E(a, b) of all time-reversible systems in family (3.3) satisfies:
1. R ⊂ V(Isym );
2. V(Isym ) \ R
= {(a, b) ∈ V(Isym ) : there exists (p, q) ∈ S with a pq bqp = 0 but a pq + bqp 6= 0}.

Proof. We already know (page 134) that point (1) holds. As to point (2), the statement that there exists an index pair (p, q) ∈ S such that a_{pq} b_{qp} = 0 but a_{pq} + b_{qp} ≠ 0 is precisely the statement that for some (p, q) ∈ S exactly one of a_{pq} and b_{qp} is zero, which in our current language is precisely the statement that R(a, b) ≠ R(b̂, â). Thus if we denote by D the set D := {(a, b) ∈ V(Isym) : R(a, b) ≠ R(b̂, â)}, we must show that V(Isym) \ R = D. The inclusion D ⊂ V(Isym) \ R follows directly from the characterization of R given by (3.96) (Exercise 5.9). To establish the inclusion V(Isym) \ R ⊂ D, we demonstrate the truth of the equivalent inclusion V(Isym) \ D ⊂ R. Hence suppose system (a, b) ∈ V(Isym) satisfies R(a, b) = R(b̂, â). By definition of Isym,

[ν]|_(a,b) = [ν̂]|_(a,b)    (5.30)

for all ν ∈ M. But by Proposition 5.1.5 the set {[ν] : ν ∈ M} contains all unary and binary invariant monomials of system (3.3). Thus, since [ν̂]|_(a,b) = [ν]|_(b̂,â), (5.30) means that J(a, b) = J(b̂, â) for all such invariants. Thus by Theorem 5.1.15 the system of equations σ^{−ζ} · (b̂, â) = U_ϕ · (a, b) in unknowns σ and ϕ has a solution (σ, ϕ) = (ρ, θ) ∈ R⁺ × [0, 2π). When written out in detail, componentwise, this is precisely (5.2), that is, is (3.96) with α = ρ e^{iθ}, so (a, b) ∈ R. □

We have developed the ideas in this section using the rotation group of fam-
ily (5.1). Generalizing to a slightly more general group of transformations yields a
pleasing result that connects time-reversibility to orbits of a group action. We end
this section with an exposition of this idea.
Consider the group of transformations of the phase space C2 of (5.1) given by

x′ = η x, y′ = η −1 y (5.31)

for η ∈ C \ {0}; it is isomorphic to a subgroup of SL(2, C). In (x′, y′)-coordinates system (5.1) has the form

ẋ′ = Σ_{(p,q)∈S̃} a(η)_{pq} x′^{p+1} y′^{q},   ẏ′ = Σ_{(p,q)∈S̃} b(η)_{qp} x′^{q} y′^{p+1},

where the coefficients of the transformed system are

a(η)_{p_j q_j} = a_{p_j q_j} η^{q_j − p_j},   b(η)_{q_j p_j} = b_{q_j p_j} η^{p_j − q_j}    (5.32)

for j = 1, …, ℓ. Let U_η denote both an individual transformation (5.32), which we write in the short form (a(η), b(η)) = U_η · (a, b), and the full group of transformations for all η ∈ C \ {0}. As before, a polynomial f ∈ C[a, b] is an invariant of the group U_η if and only if each of its terms is an invariant.
In analogy with (5.6), by (5.32), for ν ∈ N_0^{2ℓ}, the image of the corresponding monomial [ν] = a_{p_1q_1}^{ν_1} ··· a_{p_ℓq_ℓ}^{ν_ℓ} b_{q_ℓp_ℓ}^{ν_{ℓ+1}} ··· b_{q_1p_1}^{ν_{2ℓ}} ∈ C[a, b] under U_η is the monomial

U_η · [ν] = a(η)_{p_1q_1}^{ν_1} ··· a(η)_{p_ℓq_ℓ}^{ν_ℓ} b(η)_{q_ℓp_ℓ}^{ν_{ℓ+1}} ··· b(η)_{q_1p_1}^{ν_{2ℓ}} = η^{−ζ·ν} a_{p_1q_1}^{ν_1} ··· a_{p_ℓq_ℓ}^{ν_ℓ} b_{q_ℓp_ℓ}^{ν_{ℓ+1}} ··· b_{q_1p_1}^{ν_{2ℓ}} = η^{−ζ·ν} [ν].    (5.33)

Thus exactly as with the rotation group Uϕ , a monomial [ν ] is invariant under the
action of Uη if and only if ζ · ν = 0, that is, if and only if ν ∈ M .
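This scaling behavior is easy to confirm numerically for family (5.5). In the sketch below (our own illustration, with coefficient values chosen arbitrarily), an invariant monomial is unchanged by U_η, while a noninvariant one picks up a power of η, consistent with (5.32):

```python
zeta = (0, -2, 2, -2, 2, 0)                       # family (5.5)
eta = 1.3 - 0.4j                                  # any eta in C \ {0}
ab = [1+2j, -1+1j, 0.5+0.5j, 2-1j, 0.3+1j, -2+0.1j]
# By (5.32), in the z-notation each coordinate transforms as z_j -> eta**(-zeta_j) * z_j.
ab_eta = [eta**(-z) * v for z, v in zip(zeta, ab)]

def mono(vals, nu):
    """Evaluate the monomial [nu] at the coefficient vector vals."""
    out = 1
    for v, n in zip(vals, nu):
        out *= v**n
    return out

nu_inv = (0, 1, 1, 0, 0, 0)   # zeta . nu = 0: invariant monomial a-11*a20
nu_non = (0, 1, 0, 0, 0, 0)   # zeta . nu = -2: scales by eta**2
print(abs(mono(ab_eta, nu_inv) - mono(ab, nu_inv)) < 1e-12)           # True
print(abs(mono(ab_eta, nu_non) - eta**2 * mono(ab, nu_non)) < 1e-12)  # True
```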
Written in the language of the action of Uη , condition (5.2) for time-reversibility
is that there exist η = ρ eiθ , ρ 6= 0, such that (b̂, â) = Uη · (a, b), which is the same
as (5.19) with c = (b̂, â), d = (a, b), and η = σ eiϕ . Letting O be an orbit of the
group action, O = {(a(η ), b(η )) : η ∈ C \ {0}} ⊂ E(a, b) = C2ℓ , this means that
for (a, b) ∈ O the system (a, b) is time-reversible if and only if (b̂, â) ∈ O as well.
We claim that if an orbit O contains one time-reversible system, then every system
that it contains is time-reversible. For suppose (a0 , b0 ) ∈ O is time-reversible. Then
for some η0 = σ eiθ , equation (5.19) holds with c = (a0 , b0 ) and d = (b̂0 , â0 ), so that
by Theorem 5.1.15,
R(a0 , b0 ) = R(b̂0 , â0 ). (5.34)
Now let any element (a1 , b1 ) of O be given, and let η1 = σ1 eiϕ1 6= 0 be such that
(a1 , b1 ) = Uη1 · (a0 , b0 ). It is clear that (5.34) implies

R(a1 , b1 ) = R(b̂1 , â1 ). (5.35)

If [ν] is any invariant of the U_η action, then because [ν] is constant on O,

[ν]|_{(a_0,b_0)} = [ν]|_{(b̂_0,â_0)} = [ν̂]|_{(a_0,b_0)},

and because [ν] and [ν̂] are unchanged under the action of U_{η_1} this implies that

[ν]|_{U_{η_1}·(a_0,b_0)} = [ν̂]|_{U_{η_1}·(a_0,b_0)} = [ν̂]|_{(a_1,b_1)} = [ν]|_{(b̂_1,â_1)},

which is [ν]|_{(a_1,b_1)} = [ν]|_{(b̂_1,â_1)}. This shows that every U_η invariant agrees on (a_1, b_1)
and (b̂1 , â1 ), hence because the Uη and Uϕ invariants are the same, every Uϕ invari-
ant does. This fact together with (5.35) implies by Theorem 5.1.15 that there exists
η2 = σ2 eiϕ2 such that (5.19) holds, so that (b̂1 , â1 ) ∈ O, as required.
We say that the orbit O of the Uη group action is invariant under the involution
(5.29) if (a, b) ∈ O implies (b̂, â) ∈ O. We have proven the first point in the following
theorem. The second point is a consequence of this discussion and Theorem 5.2.4
in the next section.

Theorem 5.1.21. Let an ordered index set S̃, hence a family (5.1), be fixed, and let U_η be the group of transformations (5.31), η ∈ C \ {0}.
1. The set of orbits of Uη is divided into two disjoint subsets: one subset lies in and
fills up the set R of time-reversible systems; the other subset lies in and fills up
E(a, b) \ R.
2. The symmetry variety V(Isym ) is the Zariski closure of the set of orbits of the
group Uη that are invariant under the involution (5.29).

5.2 The Symmetry Ideal and the Set of Time-Reversible Systems

The necessary and sufficient condition that a system of the form (5.1) be time-
reversible, that there exist γ ∈ C \ {0} such that

b_{qp} = γ^{p−q} a_{pq} for all (p, q) ∈ S̃    (5.36)

(which is just condition (3.96) applied to (5.1)), is neither a fixed polynomial de-
scription of the set R of reversible systems (since the number γ varies from system
to system in R) nor a polynomial parametrization of R. When we encountered this
situation in Section 3.5, we introduced the monoid

M = {ν ∈ N_0^{2ℓ} : L(ν) = (j, j) for some j ∈ N_0}

(where L(ν) = (L_1(ν), L_2(ν)) is the linear operator on N_0^{2ℓ} defined with respect to the ordered set S̃ by (3.71)) and the symmetry ideal

Isym := ⟨[ν] − [ν̂] : ν ∈ M⟩ ⊂ C[a, b]

and showed that R ⊂ V(Isym ). In this section we will show that V(Isym ) is the Zariski
closure of R by exploiting the implicitization theorems from the end of Section 1.4.
We begin by introducing a parametrization of the set R: for (t1 , . . . ,tℓ ) ∈ Cℓ and
γ ∈ C \ {0}, condition (5.36) is equivalent to

a_{p_j q_j} = t_j,   b_{q_j p_j} = γ^{p_j − q_j} t_j   for j = 1, …, ℓ.    (5.37)

Refer to equations (1.56) and (1.57) and to Theorem 1.4.15. We define


for 1 ≤ r ≤ ℓ:

g_r(t_1, …, t_ℓ, γ) ≡ 1,

for ℓ + 1 ≤ r ≤ 2ℓ:

g_r(t_1, …, t_ℓ, γ) = γ^{q_{2ℓ−r+1} − p_{2ℓ−r+1}} if p_{2ℓ−r+1} − q_{2ℓ−r+1} ≤ 0, and g_r(t_1, …, t_ℓ, γ) = 1 if p_{2ℓ−r+1} − q_{2ℓ−r+1} > 0,

for 1 ≤ r ≤ ℓ:

f_r(t_1, …, t_ℓ, γ) = t_r,

for ℓ + 1 ≤ r ≤ 2ℓ:

f_r(t_1, …, t_ℓ, γ) = t_{2ℓ−r+1} if p_{2ℓ−r+1} − q_{2ℓ−r+1} ≤ 0, and f_r(t_1, …, t_ℓ, γ) = γ^{p_{2ℓ−r+1} − q_{2ℓ−r+1}} t_{2ℓ−r+1} if p_{2ℓ−r+1} − q_{2ℓ−r+1} > 0;

set

g = g_1 ··· g_{2ℓ},   x_r = a_{p_r q_r} if 1 ≤ r ≤ ℓ, and x_r = b_{q_{2ℓ−r+1} p_{2ℓ−r+1}} if ℓ + 1 ≤ r ≤ 2ℓ;

and let

H = ⟨1 − tg, g_r x_r − f_r : 1 ≤ r ≤ 2ℓ⟩ ⊂ C[t, t_1, …, t_ℓ, γ, a, b].    (5.38)

Then by Theorem 1.4.15,

R = V(I), where I = H ∩ C[a, b].    (5.39)
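Computing I = H ∩ C[a, b] is an elimination-ideal computation that any Gröbner-basis system can carry out. As a toy illustration (a hypothetical two-term family with S̃ = {(1, 0), (0, 1)}, not the example treated in the text), (5.37) reads a10 = t1, a01 = t2, b01 = γt1, γb10 = t2, and a sympy sketch recovers the expected binomial generator:

```python
from sympy import symbols, groebner

t, t1, t2, gam = symbols("t t1 t2 gamma")
a10, a01, b10, b01 = symbols("a10 a01 b10 b01")

# H from (5.38) for the toy family S~ = {(1,0),(0,1)}:
# g = gamma, and the generators g_r*x_r - f_r are listed below.
H = [1 - t*gam, a10 - t1, a01 - t2, gam*b10 - t2, b01 - gam*t1]
G = groebner(H, t, t1, t2, gam, a10, a01, b10, b01, order="lex")

# Elimination Theorem: the basis elements lying in C[a,b] generate I = H n C[a,b].
I_gens = [p for p in G.exprs if p.free_symbols <= {a10, a01, b10, b01}]
print(I_gens)   # a single binomial, a10*a01 - b10*b01 (up to sign)
```

The single generator a10·a01 − b10·b01 is [ν] − [ν̂] for ν = (1, 1, 0, 0), a generator of Isym for this family — consistent with the claim, proved below, that I and Isym coincide.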

We will now show that I and Isym are the same ideal. To do so we will need the
following preliminary result.
Lemma 5.2.1. The ideal I is a prime ideal that contains no monomials and has a
reduced Gröbner basis consisting solely of binomials.
Proof. By Theorem 1.4.17 the ideal I is prime, because, as shown in the proof of that theorem, H is the kernel of the ring homomorphism

    ψ : C[x₁, …, x_{2ℓ}, t₁, …, t_ℓ, γ, t] → C(t₁, …, t_ℓ, γ)

defined by

    t_k → t_k,   x_j → f_j(t₁, …, t_ℓ, γ)/g_j(t₁, …, t_ℓ, γ),   t → 1/g(t₁, …, t_ℓ, γ),

k = 1, …, ℓ, j = 1, …, 2ℓ, which, in the notation of the coefficients a_{p_j q_j}, b_{q_j p_j}, is

    ψ : C[a, b, t₁, …, t_ℓ, γ, t] → C(t₁, …, t_ℓ, γ)

defined by

    t_k → t_k,   a_{p_j q_j} → t_j,   b_{q_j p_j} → γ^{p_j−q_j} t_j,   t → 1/g(t₁, …, t_ℓ, γ),

k = 1, …, ℓ, j = 1, …, ℓ. Simply evaluating ψ on any monomial shows that it is impossible to get zero, so there is no monomial in H, hence none in I, which is a subset of H.
As defined by (5.38), the ideal H has a basis consisting of binomials. Order the variables t > t₁ > ⋯ > t_ℓ > γ > a_{p₁q₁} > ⋯ > b_{q₁p₁} and fix the lexicographic term order with respect to this ordering of variables on the polynomial ring C[t, t₁, …, t_ℓ, γ, a, b]. Suppose a Gröbner basis G is constructed with respect to this order using Buchberger's Algorithm (Table 1.3 on page 21). Since the S-polynomial of any two binomials is either zero, a monomial, or a binomial, as is the remainder upon division of a binomial by a binomial, and since H contains no monomials, G consists solely of binomials. Then by the Elimination Theorem, Theorem 1.3.2, G₁ := G ∩ C[a, b] is a Gröbner basis of I consisting solely of binomials. A minimal Gröbner basis G₂ is obtained from G₁ by discarding certain elements of G₁ and rescaling the rest (see the proof of Theorem 1.2.24), and a reduced Gröbner basis G₃ is obtained from G₂ by replacing each element of G₂ by its remainder upon division by the remaining elements of G₂ (see the proof of Theorem 1.2.27). Since these remainders are binomials, G₃ is a reduced Gröbner basis of I that consists solely of binomials. □
Theorem 5.2.2. Let an index set S̃, hence a family (5.1), be given. The ideal Isym coincides with the ideal I defined by (5.38) and (5.39).

Proof. Let f ∈ Isym ⊂ C[a, b] be given, so that f is a finite linear combination, with coefficients in C[a, b], of binomials of the form [ν] − [ν̂], where ν ∈ M. To show that f ∈ I it is clearly sufficient to show that any such binomial is in I. By definition of the mapping ψ,

    ψ([ν] − [ν̂]) = t₁^{ν₁} ⋯ t_ℓ^{ν_ℓ} (γ^{p_ℓ−q_ℓ} t_ℓ)^{ν_{ℓ+1}} ⋯ (γ^{p₁−q₁} t₁)^{ν_{2ℓ}}
                      − t₁^{ν_{2ℓ}} ⋯ t_ℓ^{ν_{ℓ+1}} (γ^{p_ℓ−q_ℓ} t_ℓ)^{ν_ℓ} ⋯ (γ^{p₁−q₁} t₁)^{ν₁}   (5.40)
                  = t₁^{ν₁} ⋯ t_ℓ^{ν_ℓ} t₁^{ν_{2ℓ}} ⋯ t_ℓ^{ν_{ℓ+1}} (γ^{ν₁ζ₁+⋯+ν_ℓζ_ℓ} − γ^{ν_{2ℓ}ζ₁+⋯+ν_{ℓ+1}ζ_ℓ}).

Since ν ∈ M, ζ₁ν₁ + ⋯ + ζ_{2ℓ}ν_{2ℓ} = 0. But ζ_j = −ζ_{2ℓ−j+1} for 1 ≤ j ≤ 2ℓ, so

    ζ₁ν₁ + ⋯ + ζ_ℓν_ℓ = −ζ_{ℓ+1}ν_{ℓ+1} − ⋯ − ζ_{2ℓ}ν_{2ℓ} = ζ_ℓν_{ℓ+1} + ⋯ + ζ₁ν_{2ℓ}

and the exponents on γ in (5.40) are the same. Thus [ν] − [ν̂] ∈ ker(ψ) = H, hence [ν] − [ν̂] ∈ H ∩ C[a, b] = I, as required.
Now suppose f ∈ I = H ∩ C[a, b] ⊂ C[a, b]. Because by Lemma 5.2.1 I has a basis consisting wholly of binomials, it is enough to restrict to the case that f is a binomial, f = a_α[α] + a_β[β]. Using the definition of ψ and collecting terms,

    ψ(a_α[α] + a_β[β])
        = a_α t₁^{α₁+α_{2ℓ}} ⋯ t_ℓ^{α_ℓ+α_{ℓ+1}} γ^{ζ_ℓα_{ℓ+1}+⋯+ζ₁α_{2ℓ}}
          + a_β t₁^{β₁+β_{2ℓ}} ⋯ t_ℓ^{β_ℓ+β_{ℓ+1}} γ^{ζ_ℓβ_{ℓ+1}+⋯+ζ₁β_{2ℓ}}.

Since H = ker(ψ), this is the zero polynomial, so

    a_β = −a_α,   (5.41a)
    α_j + α_{2ℓ−j+1} = β_j + β_{2ℓ−j+1}   for j = 1, …, ℓ,   (5.41b)
    ζ_ℓα_{ℓ+1} + ⋯ + ζ₁α_{2ℓ} = ζ_ℓβ_{ℓ+1} + ⋯ + ζ₁β_{2ℓ}.   (5.41c)

Imitating the notation of Section 5.1, for ν ∈ N_0^{2ℓ} let R(ν) denote the set of indices j for which ν_j ≠ 0. First suppose that R(α) ∩ R(β) = ∅. It is easy to check that condition (5.41b) forces β_j = α_{2ℓ−j+1} for j = 1, …, 2ℓ, so that β = α̂. But then, because ζ_j = −ζ_{2ℓ−j+1} for 1 ≤ j ≤ 2ℓ, condition (5.41c) reads

    −ζ_{ℓ+1}α_{ℓ+1} − ⋯ − ζ_{2ℓ}α_{2ℓ} = ζ_ℓα_ℓ + ⋯ + ζ₁α₁,

or ζ₁α₁ + ⋯ + ζ_{2ℓ}α_{2ℓ} = 0, so α ∈ M. Thus f = a_α([α] − [α̂]) with α ∈ M, so f ∈ Isym.

If R(α) ∩ R(β) ≠ ∅, then [α] and [β] contain common factors, corresponding to the common indices of some of their nonzero entries. Factoring out the common terms, which form a monomial [µ], we obtain f = [µ](a_α[α′] + a_β[β′]), where R(α′) ∩ R(β′) = ∅. Since by Lemma 5.2.1 the ideal I is prime and contains no monomial, we conclude that a_α[α′] + a_β[β′] ∈ I, hence, by the first case, that a_α[α′] + a_β[β′] ∈ Isym, hence that f ∈ Isym. □
Lemma 5.2.1 and Theorem 5.2.2 together immediately yield the following theorem, which includes Theorem 3.5.9.

Theorem 5.2.3. Let an ordered index set S̃, hence a family (5.1), be given. The ideal Isym is a prime ideal that contains no monomials and has a reduced Gröbner basis consisting solely of binomials.
Together with (5.39), Theorem 5.2.2 provides the following description of time-reversible systems.

Theorem 5.2.4. Let an ordered index set S̃, hence a family (5.1), be given. The variety of the ideal Isym is the Zariski closure of the set R of time-reversible systems in family (5.1).
A generating set or basis N of M (see Remark 5.1.18) is minimal if, for each ν ∈ N, N \ {ν} is not a generating set. A minimal generating set, which need not be unique (Exercise 5.10), is sometimes referred to as a Hilbert basis of M. The fact that by Proposition 5.1.6 the monoid M is V ∩ N_0^{2ℓ}, where V is the vector subspace of R^{2ℓ} determined by the same equation (5.8) that determines M, can lead to some confusion. As elements of R^{2ℓ}, the elements of the Hilbert basis of M span V but are not necessarily a vector space basis of V. See Exercises 5.11 and 5.12 concerning this comment and other facts about Hilbert bases of M.

The next result describes the reduced Gröbner basis of the ideal Isym in more detail and shows how to use it to construct a Hilbert basis of the monoid M.

Theorem 5.2.5. Let G be the reduced Gröbner basis of Isym with respect to any term order.
1. Every element of G has the form [ν] − [ν̂], where ν ∈ M and [ν] and [ν̂] have no common factors.
2. The set

    ℋ = {µ, µ̂ : [µ] − [µ̂] ∈ G}
        ∪ {e_j + e_{2ℓ−j+1} : j = 1, …, ℓ and ±([e_j] − [e_{2ℓ−j+1}]) ∉ G},

where e_j = (0, …, 0, 1, 0, …, 0) with the 1 in the jth position, is a Hilbert basis of M.

Proof. Let g = a_α[α] + a_β[β] be an element of the reduced Gröbner basis of Isym. The reasoning in the part of the proof of Theorem 5.2.2 that showed that I ⊂ Isym shows that any binomial in Isym has the form [η]([ν] − [ν̂]), where ν ∈ M and R(ν) ∩ R(ν̂) = ∅. Thus if R(α) ∩ R(β) = ∅, then g = a([ν] − [ν̂]), and a = 1 since G is reduced. If R(α) ∩ R(β) ≠ ∅, then g = [η]([ν] − [ν̂]) = [η]h and h ∈ Isym since ν ∈ M. But then there exists g₁ ∈ G such that LT(g₁) divides LT(h), which implies that LT(g₁) divides LT(g), which is impossible since g ∈ G and G is reduced. This proves point (1).

It is an immediate consequence of the definition of M that e_j + e_{2ℓ−j+1} ∈ M for j = 1, …, ℓ. If, for some j, 1 ≤ j ≤ ℓ, [e_j] − [e_{2ℓ−j+1}] ∈ G, then e_j and e_{2ℓ−j+1} are both in ℋ, hence e_j + e_{2ℓ−j+1} ∈ Span ℋ, so that ℋ is a basis of M if

    ℋ⁺ = {µ, µ̂ : [µ] − [µ̂] ∈ G} ∪ {e_j + e_{2ℓ−j+1} : j = 1, …, ℓ}

is, and it is more convenient to work with ℋ⁺ in this regard.

Hence suppose ν is in M. If ν̂ = ν, then ν_j = ν_{2ℓ−j+1} for j = 1, …, ℓ, so that ν = Σ_{j=1}^{ℓ} ν_j(e_j + e_{2ℓ−j+1}). Thus ν could fail to be a finite sum of elements of ℋ⁺ only if ν ≠ ν̂, hence only if [ν] − [ν̂] ≠ 0. Suppose, contrary to what we wish to show, that the set F of elements of M that cannot be expressed as a finite N_0-linear combination of elements of ℋ⁺ is nonempty. Then F has a least element µ with respect to the total order on N_0^{2ℓ} that corresponds to (and in fact is) the term order on C[a, b]. The binomial [µ] − [µ̂] is in Isym and is nonzero, hence its leading term [µ̂] is divisible by the leading term [α] of some element [α] − [α̂] of G. Performing the division yields

    [µ̂] − [µ] = [β]([α] − [α̂]) + [α̂]([β] − [β̂]),

which, adopting the notation x = (x₁, …, x_{2ℓ}) = (a_{p₁q₁}, …, b_{q₁p₁}) of the definition of the ideal H defining I, is more intuitively expressed as

    x^{µ̂} − x^{µ} = x^{β}(x^{α} − x^{α̂}) + x^{α̂}(x^{β} − x^{β̂}),

for some β ∈ N_0^{2ℓ} that satisfies α + β = µ̂. But then α̂ + β̂ = µ, so that β̂ < µ (see Exercise 1.15) and β̂ = µ − α̂ ∈ M. Thus β̂ cannot be in F, whose least element is µ, so β̂ is a finite N_0-linear combination of elements of ℋ⁺. But then so is µ = α̂ + β̂, a contradiction. Thus F is empty, as required.
It remains to show that ℋ is minimal. We do this by showing that no element of ℋ is an N_0-linear combination of the remaining elements of ℋ. To fix notation we suppose that the reduced Gröbner basis of Isym is G = {g₁, …, g_r}, where in standard form (leading term first) g_j is g_j = [µ_j] − [µ̂_j], 1 ≤ j ≤ r.

Fix µ_k ∈ ℋ for which either [µ_k] − [µ̂_k] ∈ G or [µ̂_k] − [µ_k] ∈ G. It is impossible that µ_k = Σ_{j=1}^{ℓ} c_j(e_j + e_{2ℓ−j+1}) (with c_j = 0 if e_j + e_{2ℓ−j+1} ∉ ℋ), since this equation implies that µ̂_k = µ_k, hence that g_k = ([µ_k] − [µ̂_k]) = 0, which is not true. It is also impossible that

    µ_k = Σ_{j=1, j≠k}^{r} a_j µ_j + Σ_{j=1}^{r} b_j µ̂_j + Σ_{j=1}^{ℓ} c_j (e_j + e_{2ℓ−j+1}),

which is equivalent to

    µ̂_k = Σ_{j=1, j≠k}^{r} a_j µ̂_j + Σ_{j=1}^{r} b_j µ_j + Σ_{j=1}^{ℓ} c_j (e_{2ℓ−j+1} + e_j).

For if some a_j ≠ 0, then [µ_k] is divisible by the leading term [µ_j] of g_j (and j ≠ k), while if some b_j ≠ 0 then [µ̂_k] is divisible by the leading term [µ_j] of g_j (and j ≠ k since, by point (1) of the theorem, [µ_k] and [µ̂_k] have no common factors), contrary to the fact that G is reduced. Clearly this argument also shows that no µ̂_k is an N_0-linear combination of the remaining elements of ℋ.
Finally, suppose k ∈ {1, …, ℓ} is such that

    e_k + e_{2ℓ−k+1} = Σ_{θ_j ∈ ℋ} a_j θ_j   (5.42)

for a_j ∈ N_0 (we are not assuming that k is such that e_k + e_{2ℓ−k+1} ∈ ℋ). Because all a_j and all entries in all θ_j are nonnegative, no cancellation of entries is possible. Thus either the sum has just one summand, e_k + e_{2ℓ−k+1}, or it has exactly two summands, e_k and e_{2ℓ−k+1}. In the former case e_k + e_{2ℓ−k+1} ∈ ℋ; in the latter case both e_k and e_{2ℓ−k+1} = ê_k are in ℋ, so that by definition of ℋ either [e_k] − [e_{2ℓ−k+1}] ∈ G or [e_{2ℓ−k+1}] − [e_k] ∈ G, hence e_k + e_{2ℓ−k+1} ∉ ℋ. Thus if e_k + e_{2ℓ−k+1} ∈ ℋ, then (5.42) holds only if the right-hand side reduces to 1·(e_k + e_{2ℓ−k+1}). □

Theorem 1.3.2 provides an algorithm for computing a generating set for the ideal I and, therefore, for the ideal Isym. Using Theorem 5.2.5, we also obtain a Hilbert basis of the monoid M. The complete algorithm is given in Table 5.1 on page 235.

Example 5.2.6. Consider the ordered set of indices S̃ = {(0, 1), (−1, 3)} corresponding to the ordered parameter set {a01, a−13, b3,−1, b10} of family (5.1). In this

Algorithm for computing Isym and a Hilbert basis of M

Input:  An ordered index set S̃ = {(p₁, q₁), …, (p_ℓ, q_ℓ)}
        specifying a family of systems (5.1)

Output: A reduced Gröbner basis G for the ideal Isym
        and a Hilbert basis ℋ for the monoid M for family (5.1)

Procedure:
1. Compute the reduced Gröbner basis G_H for H defined
   by (5.38) with respect to lexicographic order with
   t > t₁ > ⋯ > t_ℓ > γ > a_{p₁q₁} > ⋯ > b_{q₁p₁}.
2. G := G_H ∩ C[a, b].
3. ℋ is the set defined in point (2) of Theorem 5.2.5.

Table 5.1 Algorithm for Computing Isym and a Hilbert Basis of M

case ℓ = 2, L : N_0^4 → Z² is the map

    L(ν₁, ν₂, ν₃, ν₄) = ν₁(0, 1) + ν₂(−1, 3) + ν₃(3, −1) + ν₄(1, 0)
                      = (−ν₂ + 3ν₃ + ν₄, ν₁ + 3ν₂ − ν₃),

and by Proposition 5.1.6, M is the set of all ν = (ν₁, ν₂, ν₃, ν₄) ∈ N_0^4 such that

    ν₁ + 4ν₂ − 4ν₃ − ν₄ = 0.   (5.43)

When we compute the quantities defined in the display between equations (5.37) and (5.38), we obtain the polynomials

    g₁(t₁, t₂, γ) = 1     f₁(t₁, t₂, γ) = t₁
    g₂(t₁, t₂, γ) = 1     f₂(t₁, t₂, γ) = t₂
    g₃(t₁, t₂, γ) = γ⁴    f₃(t₁, t₂, γ) = t₂
    g₄(t₁, t₂, γ) = γ     f₄(t₁, t₂, γ) = t₁

and the polynomial and variables

    g(t₁, t₂, γ) = γ⁵,   x₁ = a01,  x₂ = a−13,  x₃ = b3,−1,  x₄ = b10.

Thus the ideal defined by (5.38) lies in C[t, t₁, t₂, γ, a01, a−13, b3,−1, b10] and is

    H = ⟨1 − tγ⁵, a01 − t₁, a−13 − t₂, γ⁴b3,−1 − t₂, γb10 − t₁⟩.



Using practically any readily available computer algebra system, we can compute the reduced Gröbner basis for H with respect to lexicographic order with the variables ordered by t > t₁ > t₂ > γ > a01 > a−13 > b3,−1 > b10 (step 1 of the algorithm) as the set of polynomials

    b3,−1 a01⁴ − b10⁴ a−13       γ b3,−1 a01³ − a−13 b10³      γ b10 − a01
    γ² b3,−1 a01² − b10² a−13    γ³ b3,−1 a01 − b10 a−13       γ⁴ b3,−1 − a−13
    t a−13 a01 − b10 b3,−1       γ t a−13 − b3,−1              t₂ − a−13
    t a−13² − γ³ b3,−1²          γ² t a01³ − b10³              t₁ − a01
    γ t a01⁴ − b10⁴              t a01⁵ − b10⁵                 γ⁵ t − 1
    γ³ t a01² − b10²             γ⁴ t a01 − b10.

The reduced Gröbner basis of the fourth elimination ideal (step 2 of the algorithm) is composed of just those basis elements that do not contain any of t, t₁, t₂, or γ, hence is simply {b3,−1 a01⁴ − b10⁴ a−13}. This is the reduced Gröbner basis of Isym. Finally (step 3), by Theorem 5.2.5, since [e₁] = a01¹ a−13⁰ b3,−1⁰ b10⁰ = a01 and so on, a Hilbert basis of M is

    {(4, 0, 1, 0), (0, 1, 0, 4), (1, 0, 0, 1), (0, 1, 1, 0)}.

Any string (ν₁, ν₂, ν₃, ν₄) in N_0^4 that satisfies (5.43) is an N_0-linear combination of these four 4-tuples.
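Steps 1 and 2 of the algorithm for this example can be reproduced with essentially any computer algebra system; the following is a sketch in Python with SymPy, where the names a01, am13, b3m1, b10 are our own stand-ins for a01, a−13, b3,−1, b10:

```python
import sympy as sp

t, t1, t2, gam = sp.symbols('t t1 t2 gamma')
a01, am13, b3m1, b10 = sp.symbols('a01 am13 b3m1 b10')

# The ideal H of Example 5.2.6.
H = [1 - t*gam**5, a01 - t1, am13 - t2, gam**4*b3m1 - t2, gam*b10 - t1]

# Step 1: reduced Groebner basis, lex order with
# t > t1 > t2 > gamma > a01 > a-13 > b3,-1 > b10.
G = sp.groebner(H, t, t1, t2, gam, a01, am13, b3m1, b10, order='lex')

# Step 2: the elimination ideal -- keep the elements free of t, t1, t2, gamma.
Isym_basis = [p for p in G.exprs if not p.free_symbols & {t, t1, t2, gam}]
print(Isym_basis)
```

The elimination step simply discards the basis elements involving t, t₁, t₂, or γ; what survives is the single binomial b3,−1 a01⁴ − b10⁴ a−13 found in the text.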

We now use the algorithm to prove Theorem 3.5.10, that is, to compute the generators of the ideal Isym for system (3.100):

    ẋ = x − a10x² − a01xy − a−12y² − a20x³ − a11x²y − a02xy² − a−13y³
    ẏ = −y + b2,−1x² + b10xy + b01y² + b3,−1x³ + b20x²y + b11xy² + b02y³.

Proof of Theorem 3.5.10. We first compute a reduced Gröbner basis of the ideal of (5.38) for our family, namely the ideal

    H = ⟨1 − γ¹⁰t, a10 − t₁, a01 − t₂, a−12 − t₃, a20 − t₄, a11 − t₅, a02 − t₆, a−13 − t₇,
         γ⁴b3,−1 − t₇, γ²b20 − t₆, γb11 − t₅, b02 − γ²t₄, γ³b2,−1 − t₃, γb10 − t₂, b01 − γt₁⟩,

with respect to lexicographic order with

    t > t₁ > t₂ > t₃ > t₄ > t₅ > t₆ > t₇ > γ > a10 > a01 > a−12
      > a20 > a11 > a02 > a−13 > b3,−1 > b20 > b11 > b02 > b2,−1 > b10 > b01.

We obtain a list of polynomials that is far too long to be presented here, but which the reader should be able to compute using practically any computer algebra system. According to the second step of Algorithm 5.1, in order to obtain a basis of Isym we simply select from the list those polynomials that do not depend on t, t₁, t₂, t₃, t₄, t₅, t₆, t₇, or γ. These are precisely the polynomials presented in Table 3.2. Note how they exhibit the structure described in point (1) of Theorem 5.2.5. □

5.3 Axes of Symmetry of a Plane System

As defined in Section 3.5, a line L is an axis of symmetry of the real system

    u̇ = Ũ(u, v),   v̇ = Ṽ(u, v)   (5.44)

if as point sets the orbits of the system are symmetric with respect to L. In this section by an axis of symmetry we mean an axis passing through the origin, at which system (5.44) is assumed to have a singularity. We will also assume that Ũ and Ṽ are polynomials. Note that the eigenvalues of the linear part at the origin are not assumed to be purely imaginary, however, which is the reason for the enlarged index set S̃ employed in this chapter. The ideas developed in Section 5.1 lead naturally to a necessary condition on a planar system in order for it to possess an axis of symmetry and allow us to derive a bound on the number of axes of symmetry possible in the phase portrait of a polynomial system, expressed in terms of the degrees of the polynomials.

As was done in Section 3.5, we write (5.44) as a single complex differential equation by writing x = u + iv and differentiating with respect to t. We obtain

    dx/dt = Σ_{(p,q)∈S̃} a_{pq} x^{p+1} x̄^q = P̃(x, x̄),   (5.45)

where P̃ = Ũ + iṼ, with Ũ and Ṽ evaluated at ((x + x̄)/2, (x − x̄)/(2i)). When we refer below to equation (5.45), we will generally have in mind that it is equivalent to system (5.44).
Writing (5.44) in the complex form (5.45) is just the first step in the process of complexifying (5.44). Thus, letting a denote the vector whose components are the coefficients of the polynomial P̃, it is natural to write the parameter space for (5.45) as E(a) = C^ℓ. We let E_P(a) ⊂ E(a) denote the set of systems (5.45) for which the original polynomials Ũ and Ṽ in (5.44) have no nonconstant common factors. When we refer to "system a" we will mean system (5.45) with coefficients given by the components of the vector a.

The choice of the ℓ-element index set S̃ not only specifies the real family in the complex form (5.45) but simultaneously specifies the full family (5.1) on C², of which (5.45) is the first component evaluated on the complex line y = x̄. Thus we will use the notation U_{ϕ}^{(a)} for a rotation of coordinates in C, corresponding to the first equation in display (5.3), for family (5.45).

In Lemma 3.5.3 we proved that existence of either time-reversible or mirror symmetry with respect to the u-axis is characterized by the condition a₀ = ±ā₀, which is thus a sufficient condition that the u-axis be an axis of symmetry for system a₀.

Consider now for ϕ₀ ∈ [0, 2π) the line x = re^{iϕ₀}. Adapting the notation from the beginning of Section 5.1 as just mentioned, after a rotation in C through angle −ϕ₀ about the origin we obtain from system a₀ the system a₀′ = U_{ϕ₀}^{(a)} · a₀ (observe that in (5.3) the first rotation is through −ϕ). If a₀′ = ±ā₀′, then the line x = re^{iϕ₀} is an axis of symmetry of (5.45). Thus, because it is always true that

    \overline{U_{ϕ₀}^{(a)} · a₀} = U_{−ϕ₀}^{(a)} · ā₀,   (5.46)

Lemma 3.5.3(2) yields the following result.

Lemma 5.3.1. Fix an ordered index set S̃, hence the real family in complex form (5.45), the complex family (5.1), and the rotation group U_ϕ for family (5.1). If

    U_{−ϕ₀}^{(a)} · ā₀ = U_{ϕ₀}^{(a)} · a₀   (5.47)

or

    U_{−ϕ₀}^{(a)} · ā₀ = −U_{ϕ₀}^{(a)} · a₀,   (5.48)

then the line with complex equation x = re^{iϕ₀} is an axis of symmetry of system (5.45) with coefficient vector a₀.

Condition (5.47) characterizes mirror symmetry with respect to the line x = re^{iϕ₀}; condition (5.48) characterizes time-reversible symmetry with respect to the line x = re^{iϕ₀}. It is conceivable, however, that a line x = re^{iϕ₀} be an axis of symmetry for a system a₀ yet neither condition hold. If such were the case for a system a₀, then it is apparent that the system would still have to possess either mirror or time-reversible symmetry on a neighborhood N of any regular point in its phase portrait, hence on the saturation of N under the flow. Thus the phase portrait would have to be decomposed into a union of disjoint open regions on each of which either (5.47) or (5.48) holds, separated by curves of singularities. This suggests that more general symmetry can be excluded if Ũ and Ṽ are relatively prime polynomials. The following lemma confirms this conjecture.

Lemma 5.3.2. Fix an ordered index set S̃, hence the real family in complex form (5.45), the complex family (5.1), and the rotation group U_ϕ for the family (5.1). If a₀ ∈ E_P(a) and the line x = re^{iϕ₀} is an axis of symmetry of system (5.45), then either (5.47) or (5.48) holds.

Proof. The u-axis is an axis of symmetry of (5.44) if and only if condition (3.86) holds. Let a₀′ = U_{ϕ₀} · a₀ and write the corresponding system (5.44) as u̇ = Ũ′(u, v), v̇ = Ṽ′(u, v). It is easy to see that a₀′ ∈ E_P(a) as well (Exercise 5.15). For system a₀′ the u-axis is an axis of symmetry, so condition (3.86) is satisfied, hence for all (u, v) for which Ṽ′(u, v) ≠ 0,

    −(Ṽ′(u, −v)/Ṽ′(u, v)) Ũ′(u, v) ≡ Ũ′(u, −v).   (5.49)

Because Ũ′ and Ṽ′ are relatively prime polynomials, (5.49) can be satisfied only if the expression

    p(u, v) := −Ṽ′(u, −v)/Ṽ′(u, v)

defines a polynomial. Thus from (5.49) we have that p(u, v)Ũ′(u, v) ≡ Ũ′(u, −v). But Ũ′(u, v) and Ũ′(u, −v) are polynomials of the same degree, so p(u, v) must be identically constant. That is, there exists a real number k ≠ 0 such that

    Ũ′(u, −v) ≡ kŨ′(u, v),   Ṽ′(u, −v) ≡ −kṼ′(u, v).

This implies that P̃′(x, x̄) ≡ k \overline{P̃′}(x, x̄), where \overline{P̃′} denotes P̃′ with conjugated coefficients, whence a₀′ = kā₀′. But then ā₀′ = ka₀′, hence a₀′ = k²a₀′. Since a₀′ ≠ 0, k = ±1. Thus a₀′ = ±ā₀′, which by (5.46) implies that for system a₀ one of conditions (5.47) and (5.48) must hold. □


The condition that a₀ be in E_P(a), that is, that Ũ and Ṽ be relatively prime polynomials, is essential to the truth of Lemma 5.3.2. A counterexample otherwise is given by the system u̇ = −u(v − 1)²(v + 1), v̇ = v(v − 1)²(v + 1), whose trajectories are symmetric with respect to the u-axis as point sets (that is, without regard to the direction of flow), but which exhibits mirror symmetry in the strip |v| < 1 and time-reversible symmetry in the strips |v| > 1. Even so, this restriction on Ũ and Ṽ is no real hindrance to finding all axes of symmetry of a general system a₀. We simply identify the common factors F of Ũ and Ṽ and remove them, creating a new system a₀′, all of whose axes of symmetry are specified by conditions (5.47) and (5.48). If the zero set Z of F is symmetric with respect to an axis of symmetry L′ of system a₀′, then L′ is an axis of symmetry of the original system a₀; otherwise it is not. Conversely, if L is an axis of symmetry of system a₀, then just a few moments' reflection shows us that it is an axis of symmetry of system a₀′, since trajectories of a₀′ are unions of trajectories of a₀ (Exercise 5.16).

The following example illustrates the use of Lemmas 5.3.1 and 5.3.2 and the computations involved.

Example 5.3.3. We will locate the axes of symmetry for the real system

    u̇ = −2uv + u³ − 3uv²,   v̇ = −u² + v² + 3u²v − v³.   (5.50)

First we express the system in complex form by differentiating x = u + iv and making the substitutions u = (x + x̄)/2 and v = (x − x̄)/(2i) to obtain

    ẋ = x³ − ix̄².   (5.51)

This system is a particular element of the family ẋ = −a−12 x̄² − a20 x³ on C, which in turn lies in the family

    ẋ = −a−12 y² − a20 x³,   ẏ = b02 y³ + b2,−1 x²   (5.52)

on C². Thus here S̃ = {(−1, 2), (2, 0)}, so that ζ = (−3, 2, −2, 3). Hence, because U_ϕ = diag(e^{iζ₁ϕ}, e^{iζ₂ϕ}, e^{iζ₃ϕ}, e^{iζ₄ϕ}), equations (5.47) and (5.48) are

    ⎛ e^{3ϕ₀i}      0      ⎞ ⎛ ā−12 ⎞       ⎛ e^{−3ϕ₀i}     0      ⎞ ⎛ a−12 ⎞
    ⎝    0      e^{−2ϕ₀i}  ⎠ ⎝ ā20  ⎠  =  ± ⎝     0      e^{2ϕ₀i}  ⎠ ⎝ a20  ⎠ .   (5.53)

Choosing the plus sign, one collection of lines of symmetry x = re^{iϕ₀} arises in correspondence with solutions ϕ₀ of the system of equations (observing the sign convention in (5.52), which reflects that of (5.1))

    e^{6ϕ₀i} = a−12/ā−12 = i/(−i) = −1 :  ϕ₀ ∈ {π/6, π/2, 5π/6, 7π/6, 3π/2, 11π/6}
    e^{4ϕ₀i} = ā20/a20 = (−1)/(−1) = 1 :  ϕ₀ ∈ {0, π/2, π, 3π/2}.

The sole solution is the line corresponding to ϕ₀ = π/2, the v-axis. This is a mirror symmetry of the real system.

Choosing the minus sign in (5.53), a second collection of lines of symmetry x = re^{iϕ₀} arises in correspondence with solutions ϕ₀ of the system of equations

    e^{6ϕ₀i} = 1 :  ϕ₀ ∈ {0, π/3, 2π/3, π, 4π/3, 5π/3}
    e^{4ϕ₀i} = −1 :  ϕ₀ ∈ {π/4, 3π/4, 5π/4, 7π/4}.

There is no solution, so the real system does not possess a time-reversible symmetry.
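The two pairs of equations above involve only roots of unity, so their solution sets can be double-checked numerically. The following small Python sketch (ours, not the book's) scans the candidate angles kπ/12, which include every root of e^{6iϕ₀} = ±1 and of e^{4iϕ₀} = ±1 on [0, 2π):

```python
import cmath

# All roots of e^{6i*phi} = ±1 and e^{4i*phi} = ±1 on [0, 2*pi) lie among
# the angles k*pi/12, so a brute-force scan over those candidates suffices.
cands = [k * cmath.pi / 12 for k in range(24)]
close = lambda z, w: abs(z - w) < 1e-9

# Plus-sign case of (5.53): e^{6i*phi} = -1 and e^{4i*phi} = 1.
mirror = [phi for phi in cands
          if close(cmath.exp(6j * phi), -1) and close(cmath.exp(4j * phi), 1)]
# Minus-sign case: e^{6i*phi} = 1 and e^{4i*phi} = -1.
reversible = [phi for phi in cands
              if close(cmath.exp(6j * phi), 1) and close(cmath.exp(4j * phi), -1)]

print([round(phi / cmath.pi, 3) for phi in mirror])   # [0.5, 1.5]
print(reversible)                                      # []
```

Both common angles, π/2 and 3π/2, give the same line (the v-axis), and the minus-sign pair indeed has no common solution.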

We will now describe axes of symmetry in terms of invariants of the rotation group. If we are interested in finding lines of symmetry of a single specific system, as we just did in Example 5.3.3, there is nothing to be gained by this approach. But by applying the theory developed in Section 5.1, we will be able to derive a single equation whose solutions correspond to axes of symmetry, rather than a collection of equations whose common solutions yield the axes, as was the case with the example. This will allow us to readily count solutions, and thus obtain the bound in Corollary 5.3.6 below. As before, a circumflex accent denotes the involution of a vector; that is, if d = (d₁, …, d_ℓ), then d̂ = (d_ℓ, …, d₁). Clearly the conjugate of d̂ equals the involution of d̄ (the two operations commute), so the notation â̄₀ used below is unambiguous.

Theorem 5.3.4. Fix the ordered index set S̃, hence the real family in complex form (5.45), the complex family (5.1), and the rotation group U_ϕ for the family (5.1). Fix a₀ ∈ E(a). If, for all unary and binary invariants J(a, b) of family (5.1),

    J(ā₀, â₀) = J(a₀, â̄₀)   (5.54)

holds, or if, for all unary and binary invariants J(a, b) of family (5.1),

    J(ā₀, â₀) = J(−(a₀, â̄₀))   (5.55)

Proof. Note at the outset that for a₀ ∈ E(a), (5.47) holds if and only if

    (ā₀, â₀) = U_{2ϕ₀} · (a₀, â̄₀)   (5.56)

holds, and (5.48) holds if and only if

    (ā₀, â₀) = −U_{2ϕ₀} · (a₀, â̄₀)   (5.57)

holds. Thus for a₀ ∈ E_P(a) the truth of (5.56) or (5.57) is equivalent to the line x = re^{iϕ₀} being an axis of symmetry, and is sufficient for the line x = re^{iϕ₀} to be an axis of symmetry for any a₀ ∈ E(a).

Fix a₀ ∈ E(a) and suppose (5.54) holds. Note that R(a₀, â̄₀) = R(ā₀, â₀) and that the corresponding coordinates of the vectors (a₀, â̄₀) and (ā₀, â₀) have the same moduli. Hence, by Corollary 5.1.16, (5.54) implies that there exists a ϕ₀ such that (5.56) holds, or equivalently U_{−ϕ₀} · (ā₀, â₀) = U_{ϕ₀} · (a₀, â̄₀), which in expanded form is

    (U_{−ϕ₀}^{(a)} · ā₀, U_{−ϕ₀}^{(b)} · â₀) = (U_{ϕ₀}^{(a)} · a₀, U_{ϕ₀}^{(b)} · â̄₀).

Thus, by Lemma 5.3.1, x = re^{iϕ₀} is an axis of symmetry of system a₀. The proof that (5.55) implies (5.48) is similar.

Suppose now that for a₀ ∈ E_P(a) the corresponding system (5.45) has an axis of symmetry. Then by Lemma 5.3.2 either (5.47) or (5.48) holds. Suppose (5.47) holds with ϕ = ϕ₀. Then a₀ = U_{−2ϕ₀}^{(a)} · ā₀ and from (5.4) we see that â̄₀ = U_{−2ϕ₀}^{(b)} · â₀. Thus (a₀, â̄₀) = U_{−2ϕ₀} · (ā₀, â₀). But then for any invariant J(a, b) of the rotation group for family (5.1), since U_{−2ϕ₀} is in the group, J(ā₀, â₀) = J(U_{−2ϕ₀} · (ā₀, â₀)) = J(a₀, â̄₀), so (5.54) holds for all invariants. Similarly, if (5.48) holds, then (5.55) holds for all invariants J of family (5.1). □

Theorem 5.3.5. Let an ordered index set S̃, hence a real family of systems (5.44) in complex form (5.45) and a complex family (5.1) on C², be given. Let a specific vector of coefficients a₀ = (a_{p₁q₁}, …, a_{p_ℓq_ℓ}) ∈ E_P(a) be given.
1. Suppose a_{p_jq_j} ≠ 0 for 1 ≤ j ≤ ℓ. Then the number of axes of symmetry of system a₀, that is, of the corresponding system (5.45) with vector a₀ of coefficients, is GCD(ζ) if exactly one of conditions (5.54) and (5.55) holds and is 2GCD(ζ) if both of them hold, where ζ is the characteristic vector of family (5.44), which is defined by (5.10).
   a. When (5.54) holds, a collection of axes of symmetry x = re^{iϕ} in C corresponds to solutions ϕ = ϕ₀ of the equation

        [µ]|_{(ā₀,â₀)} = [µ]|_{(a₀,â̄₀)} e^{2iGCD(ζ)ϕ₀},   (5.58)

      where µ is any solution ν = µ of (5.14); the same collection is determined by any such µ. If (5.55) does not hold, then there are no other axes of symmetry.
   b. When (5.55) holds, a collection of axes of symmetry x = re^{iϕ} in C corresponds to solutions ϕ = ϕ₀ of the equation

        [µ]|_{(ā₀,â₀)} = [µ]|_{−(a₀,â̄₀)} e^{2iGCD(ζ)ϕ₀},   (5.59)

      where µ is any solution ν = µ of (5.14); the same collection is determined by any such µ. If (5.54) does not hold, then there are no other axes of symmetry.
2. Define χ : E(a) → C^{2ℓ} by

        χ(a) := ζ(a, â̄) = (ζ₁a_{p₁q₁}, …, ζ_ℓa_{p_ℓq_ℓ}, ζ_{ℓ+1}ā_{p_ℓq_ℓ}, …, ζ_{2ℓ}ā_{p₁q₁}).

   Suppose χ(a₀) = (0, …, 0). If either of conditions (5.54) and (5.55) holds, then for every ϕ₀ ∈ [0, 2π) the line x = re^{iϕ₀} is an axis of symmetry of system (5.45).
Proof. Suppose a₀ ∈ E_P(a). By Lemmas 5.3.1 and 5.3.2, axes of symmetry are determined by (5.47) and (5.48), which, as noted in the first paragraph of the proof of Theorem 5.3.4, are equivalent to (5.56) and (5.57), respectively. By Proposition 5.1.17 with c = (ā₀, â₀) and d = (a₀, â̄₀) = c̄, (5.56) (respectively, (5.57)) holds if and only if (5.58) (respectively, (5.59)) does, for any solution ν = µ of (5.14). But no entry in a₀ is zero, so neither side of equations (5.58) and (5.59) can be zero, hence each has exactly 2GCD(ζ) solutions in [0, 2π), none in common and all occurring in pairs ϕ₀ and ϕ₀ + π. Thus each determines GCD(ζ) lines of symmetry through the origin, all distinct from any that are determined by the other condition. This proves point (1).

Now suppose a₀ ∈ E_P(a) and χ(a₀) = (0, …, 0). Denote a₀ by (a₁, …, a_ℓ), so that the latter equation is (ζ₁a₁, …, ζ_ℓa_ℓ, ζ_{ℓ+1}ā_ℓ, …, ζ_{2ℓ}ā₁) = (0, …, 0). For any index j ≤ ℓ for which ζ_j = 0, the monomial a_{p_jq_j} is an invariant polynomial for family (5.1), so that (5.54) (respectively, (5.55)) holds for all unary invariants only if a_j = ā_j (respectively, a_j = −ā_j). For ϕ₀ ∈ [0, 2π), the matrix representing U_{ϕ₀}^{(a)} is the diagonal matrix with the number e^{iζ_jϕ₀} in position (j, j), which is 1 when ζ_j = 0. Thus, because by hypothesis a_j = 0 if ζ_j ≠ 0, (5.47) (respectively, (5.48)) holds for every ϕ₀, and the result follows from Lemma 5.3.1. □
It is apparent from the proof of point (1) that we can allow zero entries in a₀ and still use equations (5.58) and (5.59) to identify lines of symmetry, as long as we restrict to solutions µ of (5.14) that have zero entries where a₀ does and apply the convention 0⁰ = 1, although this observation matters only if we are trying to locate common axes of symmetry in an entire family in which certain variable coefficients are allowed to vanish, or perhaps to show that no axes of symmetry can exist. We also note explicitly that the sets of solutions of equations (5.58) and (5.59) are disjoint.
An important consequence of Theorem 5.3.5 is that we can derive a bound on the number of lines of symmetry in terms of the degrees of the polynomials in system (5.44).

Corollary 5.3.6. Suppose that in system (5.44) Ũ and Ṽ are polynomials for which max(deg(Ũ), deg(Ṽ)) = n. If the system does not have infinitely many axes of symmetry, then it has at most 2n + 2 axes of symmetry.
Proof. Suppose system (5.44) has at least one axis of symmetry, and let a₀ denote the vector of coefficients in its complex form (5.45). If the characteristic vector ζ of the corresponding family (5.1) is (0, …, 0), then every line through the origin is an axis of symmetry (Exercise 5.17). Otherwise, since for (p, q) ∈ S̃ we have −1 ≤ p ≤ n − 1 and 0 ≤ q ≤ n, every component ζ_j of ζ of the corresponding family (5.1) satisfies |ζ_j| ≤ n + 1. Therefore GCD(ζ) ≤ n + 1. If a₀ ∈ E_P(a), then it follows immediately that the number of axes of symmetry is at most 2(n + 1). If a₀ ∉ E_P(a), then we divide out the common factors in Ũ and Ṽ and recall the fact already noted that the number of axes of symmetry of the original system a₀ does not exceed the number of axes of symmetry of the system thereby obtained. □

The bound given in the corollary is sharp (see Exercise 5.18).

Example 5.3.7. We return to the real system (5.50) and now locate the axes of symmetry using Theorem 5.3.5. As before we must derive the complex form (5.51) on
C and view it as the first component of a specific member of the family (5.52) on
C^2, for which S̃ = {(−1, 2), (2, 0)}, so that ζ = (−3, 2, −2, 3) and GCD(ζ) = 1.
The next step in this approach is to find all the unary and binary irreducible invariant monomials of family (5.52). Applying Proposition 5.1.10, the unary irreducible
invariant monomials are J_1 = J_1(a_{−12}, a_{20}, b_{02}, b_{2,−1}) = a_{−12} b_{2,−1} and J_2 = a_{20} b_{02};
the binary irreducible invariant monomials are J_3 = a_{−12}^2 a_{20}^3 and J_4 = b_{02}^3 b_{2,−1}^2. Observing the sign convention in (5.52), which reflects that of (5.1), the vector whose
components are the coefficients of system (5.51) is a_0 = (i, −1), so

(ā_0, â_0) = (−i, −1, −1, i)
(a_0, â̄_0) = (i, −1, −1, −i)
−(a_0, â̄_0) = (−i, 1, 1, i).

Then

J_1(ā_0, â_0) = (−i)(i) = 1
J_1(a_0, â̄_0) = (i)(−i) = 1
J_1(−(a_0, â̄_0)) = (−i)(i) = 1,

so that both (5.54) and (5.55) hold for J_1. Similar computations show that they both
hold for J_2 but that only (5.54) holds for J_3 and J_4. Thus because (5.54) holds for
all invariants of family (5.52) but (5.55) does not, all axes of symmetry of the real
family will be picked out as solutions of the single equation (5.58), for any solution
μ of (5.14), and (5.59) need not be considered.
Equation (5.14) is −3ν_1 + 2ν_2 − 2ν_3 + 3ν_4 = 1, for which we choose the solution
μ = (0, 0, 1, 1). Then (5.58) is

(−i)^0 (−1)^0 (−1)^1 (i)^1 = (i)^0 (−1)^0 (−1)^1 (−i)^1 e^{2iϕ},

or e^{2iϕ_0} = −1, and this single equation yields all the axes of symmetry x = re^{iϕ} of
(5.50), namely the lines corresponding to ϕ = π/2 and ϕ = 3π/2, both of which
are, of course, the v-axis.
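The arithmetic of this example is easy to check by machine. The sketch below (an illustration; the component ordering of the displayed four-component vectors is taken as an assumption) evaluates J_1 on the three coefficient vectors and solves (5.58) for e^{2iϕ}.

```python
# Coefficient data of Example 5.3.7: a0 = (i, -1) and the displayed
# four-component vectors (component ordering as in the displays above
# is assumed here).
bar_a_hat_a   = [-1j, -1.0, -1.0,  1j]   # (a0-bar, a0-hat)
a_hat_bar     = [ 1j, -1.0, -1.0, -1j]   # (a0, conj(a0-hat))
neg_a_hat_bar = [-z for z in a_hat_bar]

# Unary invariant J1 = a_{-12} b_{2,-1}: product of first and last components.
J1 = lambda v: v[0] * v[3]
vals = [J1(bar_a_hat_a), J1(a_hat_bar), J1(neg_a_hat_bar)]
print(vals)      # each product equals 1, so (5.54) and (5.55) hold for J1

# Equation (5.58) with mu = (0,0,1,1): (-1)(i) = (-1)(-i) e^{2 i phi},
# hence e^{2 i phi} = -1, giving phi = pi/2 and 3*pi/2 (the v-axis).
e2iphi = ((-1.0) * (1j)) / ((-1.0) * (-1j))
print(e2iphi)    # equals -1
```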

We close this section with a result on axes of symmetry of a system (5.44) for
which the origin is of focus or center type. Suppose the lowest-order terms in either
of Ũ and Ṽ have degree m; note that we allow m > 1. Then (5.44) has the form
u̇ = Ũ_m(u, v) + · · · , v̇ = Ṽ_m(u, v) + · · · , where each of Ũ_m and Ṽ_m is either identically zero or a homogeneous polynomial of degree m, at least one of them is not
zero, and omitted terms are of order at least m + 1. Since (5.44) is a polynomial
system, the origin is of focus or center type if and only if there are no directions of
approach to it, which, by Theorem 64 of §20 of [12], is true if and only if the homogeneous polynomial uṼ_m(u, v) − vŨ_m(u, v) vanishes only at the origin. Making the
replacements Ũ(u, v) = Ũ((x + x̄)/2, (x − x̄)/(2i)) = Re P̃(x, x̄), Ṽ(u, v) = Im P̃(x, x̄),
u = (x + x̄)/2, and v = (x − x̄)/(2i), the condition that the origin be of focus or center
type for (5.44), expressed in terms of the vector of coefficients of its representation
in complex form (5.45), is

∑_{p+q=m−1} ( a_{pq} x^{p+1} x̄^{q+1} − ā_{pq} x̄^{p+1} x^{q+1} ) = 0   if and only if   x = 0 .          (5.60)

If the rotation indicated in the left equation in display (5.3) is performed for ϕ = ϕ_0,
then condition (5.60) becomes

∑_{p+q=m−1} ( a(ϕ_0)_{pq} x′^{p+1} x̄′^{q+1} − ā(ϕ_0)_{pq} x̄′^{p+1} x′^{q+1} ) = 0   if and only if   x′ = 0 .          (5.61)
Proposition 5.3.8. Let an ordered index set S̃, hence a real family of systems (5.44)
in complex form (5.45) and a complex family (5.1) on C^2, be given. If for system
a_0 ∈ E_P(a) the origin is of focus or center type, then equation (5.47) has no solutions
and condition (5.54) does not hold.
Proof. Suppose, contrary to what we wish to show, that there exists a solution ϕ_0 to
(5.47), that is, that U^{(a)}_{−ϕ_0} · ā_0 = U^{(a)}_{ϕ_0} · a_0. Then using (5.46) to eliminate the minus sign
that is attached to ϕ_0 on the left, we obtain \overline{U^{(a)}_{ϕ_0} · a_0} = U^{(a)}_{ϕ_0} · a_0. But then because
U^{(a)}_{ϕ_0} · a_0 = (a(ϕ_0)_{p_1 q_1}, . . . , a(ϕ_0)_{p_ℓ q_ℓ}), this implies that \overline{a(ϕ_0)_{pq}} = a(ϕ_0)_{pq} for all
(p, q) ∈ S̃. But then the sum in (5.61) vanishes for any x′ satisfying x′ = x̄′ ≠ 0, a
contradiction. Thus (5.47) has no solutions.
Again suppose that condition (5.54) is satisfied. Then by Corollary 5.1.16 there
exists some ϕ = 2ϕ_0 such that (ā_0, â_0) = U^{(a)}_{2ϕ_0} · (a_0, â̄_0), so that ā_0 = U^{(a)}_{2ϕ_0} · a_0 or
U^{(a)}_{−ϕ_0} · ā_0 = U^{(a)}_{ϕ_0} · a_0, that is, (5.47) holds, just shown to be impossible. □
5.4 Notes and Complements

The theory of algebraic invariants of ordinary differential equations presented in
this chapter was developed in large part by K. S. Sibirsky and his collaborators,
although many other investigators have played a role. The general theory, which
applies to analytic systems in n variables, goes far beyond what we have discussed
here. The action of any subgroup of the n-dimensional general linear group, not
just the rotation group, is allowed. Moreover, in addition to polynomial invariants in
the coefficients of the system of differential equations, which we have studied here,
there are two other broad classes of objects of interest, the analogous comitants,
which are polynomials in both the coefficients and the variables, and the syzygies,
which are polynomial identities in the invariants and comitants. A standard reference
is [178], in which a full development, and much more than we could indicate here,
may be found.
The algorithm presented in Section 5.2 is from [102, 150].

Exercises

5.1 Show that the set C4 of Example 5.1.2 is a group under multiplication.
5.2 Repeat Example 5.1.3 for the family (3.131) of all systems of the form (3.69)
with quadratic nonlinearities.
5.3 Prove Proposition 5.1.6.
Hint. If ν ∈ N_0^{2ℓ} satisfies (5.7), then clearly it satisfies (5.8). Establish the converse in two steps: (i) rearrange (5.8) to conclude that L_1(ν) = L_2(ν) = n for
some n ∈ Z; (ii) show that n ∈ N_0 by writing out the right-hand side of the
equation 2n = L_1(ν) + L_2(ν) and collecting on ν_1, . . . , ν_{2ℓ}. You must use the
restrictions on the elements (p_j, q_j) ∈ S̃.
5.4 [Referenced in the proof of Proposition 5.1.10.] Show that if (μ, η) ∈ N^2 and
(ζ_r, ζ_s) ∈ Z^2 \ {(0, 0)} satisfy μζ_r + ηζ_s = 0, then there exists a ∈ N such that

μ = ( |ζ_s| / GCD(ζ_r, ζ_s) ) · a   and   η = ( |ζ_r| / GCD(ζ_r, ζ_s) ) · a .

5.5 [Referenced in the proof of Proposition 5.1.10.] Complete the proof of part (2)
of Proposition 5.1.10 by showing that
a. any invariant monomial of the form z_r^μ z_s^η z_t^θ, where z_r and z_t are conjugate
variables and z_s is distinct from z_r and z_t, factors as a product of unary and
binary invariant monomials, and
b. any invariant monomial of the form z_r^μ z_s^η z_t^θ z_w^γ, where z_r and z_t are conjugate
variables, z_s and z_w are conjugate variables, and z_s is distinct from z_r and z_t,
factors as a product of unary and binary invariant monomials.
5.6 Derive Remark 5.1.11 from Proposition 5.1.10.
5.7 [Referenced in the proof of Lemma 5.1.13.] The greatest common divisor of a
set of numbers {p_1, . . . , p_s} in Z is the unique number d ∈ N such that d divides
each p_j (that is, leaves remainder zero when it is divided into p_j) and is itself
divisible by any number that divides all the p_j. It can be found by repeated
application of the Euclidean Algorithm (the algorithm in Table 1.10 on page 52
with suitable change in terminology). In analogy with Exercise 1.10, show that
if d = GCD(p_1, . . . , p_s), then there exist α_1, . . . , α_s ∈ Z (not unique) such that
d = α_1 p_1 + · · · + α_s p_s. (Compare with Exercise 1.10.)
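The Bézout representation asked for in this exercise is produced constructively by iterating the extended Euclidean algorithm; a short sketch (the function names are illustrative, not from the text):

```python
def ext_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) = x*a + y*b (iterative extended Euclid)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def bezout(nums):
    """Coefficients alpha_i with sum(alpha_i * p_i) = GCD(p_1, ..., p_s),
    obtained by folding ext_gcd over the list."""
    g, coeffs = nums[0], [1]
    for p in nums[1:]:
        g, x, y = ext_gcd(g, p)
        coeffs = [c * x for c in coeffs] + [y]
    return g, coeffs

g, alphas = bezout([12, 20, 15])
print(g, alphas)  # 1 [8, -4, -1]: indeed 8*12 - 4*20 - 1*15 = 1
```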
5.8 [Referenced in the proof of Theorem 5.1.15.] Show that if for family (5.1) the
characteristic vector ζ ∈ Z^{2ℓ} is zero, ζ = (0, . . . , 0), then every unary and binary
invariant agrees on specific elements c and d of E(a, b) only if c = d.
5.9 [Referenced in the proof of Theorem 3.5.8, end of Section 5.1.] For D as defined
in the proof of Theorem 3.5.8, show how the inclusion D ⊂ V(Isym ) \ R follows
directly from the characterization (3.96) of R.
5.10 Find a second Hilbert Basis for the monoid M of Example 5.2.6.
5.11 a. Consider the set V = {(x_1, x_2, x_3, x_4) : x_1 + 4x_2 − 4x_3 − x_4 = 0}, the vector
subspace of R^4 that is determined by the same equation (5.43) that determines the monoid M of Example 5.2.6. Show that the Hilbert Basis H of
M is not a vector space basis of V.
b. In particular, find an element of M that has at least two representations as
an N_0-linear combination of elements of H.
5.12 [Referenced in the proof of Proposition 6.3.5.] Show that in general, for an
ordered set S̃, the corresponding monoid M, and a Hilbert Basis H of M,
any element μ of M has at most finitely many representations as an N_0-linear
combination of elements of H.
5.13 Consider the family

ẋ = i(x − a_{20}x^3 − a_{11}x^2y − a_{02}xy^2 − a_{−13}y^3)
ẏ = −i(y − b_{3,−1}x^3 − b_{20}x^2y − b_{11}xy^2 − b_{02}y^3) .          (5.62)

a. Find the symmetry variety of family (5.62).
b. Find a Hilbert Basis for the monoid M for family (5.62).
Hint. See Theorem 6.4.3 and the proof of Lemma 6.4.7 for the answers.
5.14 Find the symmetry variety of the system

ẋ = x − a_{30}x^4 − a_{21}x^3y − a_{12}x^2y^2 − a_{03}xy^3 − a_{−14}y^4
ẏ = −y + b_{4,−1}x^4 + b_{30}x^3y + b_{21}x^2y^2 + b_{12}xy^3 + b_{03}y^4 .

5.15 [Referenced in the proof of Lemma 5.3.2.] Let a_0 and a_0′ be vectors whose components are the coefficients of two systems of the form (5.45) and that satisfy
a_0′ = U_{ϕ_0} · a_0 for some ϕ_0. Show that either both a_0 and a_0′ are in E_P(a) or
neither is.
5.16 Give a precise argument to show that if the common factor F of Ũ and Ṽ of a
system (5.44) is discarded, then any axis of symmetry of the original system is
an axis of symmetry of the new system thereby obtained.
5.17 [Referenced in the proof of Corollary 5.3.6.] Suppose system (5.44) has at least
one axis of symmetry and that the characteristic vector ζ of the corresponding
family (5.1) is (0, . . . , 0). Show that every line through the origin is an axis of
symmetry.
5.4 Notes and Complements 247

5.18 Show that the system u̇ = Re(u − iv)^n, v̇ = Im(u − iv)^n has 2n + 2 axes of symmetry.
5.19 Find the axes of symmetry of the system ẋ = x − x^3 + 2xx̄^2.
5.20 Find the axes of symmetry of the system

u̇ = −6u^2v − 8uv^2 + 6v^3 ,   v̇ = 6u^3 + 8u^2v − 6uv^2 .

Note the collapsing of systems (5.47) and (5.48) into single equations.
Chapter 6
Bifurcations of Limit Cycles and Critical Periods

In this chapter we consider systems of ordinary differential equations of the form

u̇ = Ũ(u, v),    v̇ = Ṽ(u, v),          (6.1)

where u and v are real variables and Ũ(u, v) and Ṽ(u, v) are polynomials for which
max(deg Ũ, deg Ṽ) ≤ n. The second part of the sixteenth of Hilbert’s well-known
list of open problems posed in the year 1900 asks for a description of the possible
number and relative locations of limit cycles (isolated periodic orbits) occurring in
the phase portrait of such polynomial systems. The minimal uniform bound H(n)
on the number of limit cycles for systems (6.1) (for some fixed n) is now known as
the nth Hilbert number.
Despite the simplicity of the statement of the problem, not much progress has
been made even for small values of n, and even then a number of important results
have later been shown either to be false or to have faulty proofs. For many years it
was widely believed, for example, that H(2) = 3, but around 1980 Chen and Wang
([39]) and Shi ([172]) constructed examples of quadratic systems (systems (6.1)
with n = 2) having at least four limit cycles. About the same time the correctness of
Dulac’s proof of a fundamental preliminary to Hilbert’s 16th problem, that any fixed
polynomial system has but a finite number of limit cycles, was called into question
(the proof was later shown to be faulty). Examining the question for quadratic systems, in 1983 Chicone and Shafer ([46]) proved that a fixed quadratic system has
only a finite number of limit cycles in any bounded region of the phase plane. In
1986 Bamón ([14]) and Romanovski ([152]) extended this result to the whole phase
plane, thereby establishing the correctness of Dulac’s theorem in the quadratic case.
A few years later Dulac’s theorem was proved for an arbitrary polynomial system by
Ecalle ([69]) and Il’yashenko ([99]). Even so, as of this writing no uniform bound
on the number of limit cycles in polynomial systems of fixed degree is known. That
is, it is unknown whether or not H(n) is even finite except in the trivial cases n = 0
and n = 1.
Two fundamental concepts used in addressing the problem of estimating H(n)
are the twin ideas of a limit periodic set and of the cyclicity of such a set, ideas

V.G. Romanovski, D.S. Shafer, The Center and Cyclicity Problems, DOI 10.1007/978-0-8176-4727-8_6, © Birkhäuser is a part of Springer Science+Business Media, LLC 2009

used by Bautin in the seminal paper [17] in which he proved that H(2) ≥ 3. To
define them, we consider a family of systems (6.1) with coefficients drawn from a
specified parameter space E equipped with a topology. A limit periodic set is a point
set Γ in the phase portrait of the system (6.1) that corresponds to some choice e0 of
the parameters that has the property that a limit cycle can be made to bifurcate from
Γ under a suitable but arbitrarily small change in the parameters. That is, for any
neighborhood U of Γ in R2 and any neighborhood N of e0 in E there exists e1 ∈ N
such that the system corresponding to parameter choice e1 has a limit cycle lying
wholly within U. The limit periodic set Γ has cyclicity c with respect to E if and only
if for any choice e of parameters in a neighborhood of e0 in E the corresponding
system (6.1) has at most c limit cycles wholly contained in a neighborhood of Γ ,
and c is the smallest number with this property. Examples of limit periodic sets are
singularities of focus or center type, periodic orbits (not necessarily limit cycles),
and the set formed by a saddle point and a pair of its stable and unstable separatrices
that comprise the same point set (a “homoclinic loop”). More specifically, consider
a system of the form (6.11) below with β 6= 0 and |α | small. If α is made to cross
0 from negative to positive, then the singularity at the origin changes from a stable
to an unstable focus, and typically a limit cycle surrounding the origin is created
or destroyed in what is called a Hopf bifurcation. Roussarie showed ([155]) that
if it could be established (again for n fixed) that every limit periodic set for family
(6.1) has finite cyclicity (under a natural compactification of the parameter and phase
spaces; see Section 6.5), then it would follow that H(n) is finite. This program seems
feasible at least for quadratic systems, for which it is currently under way.
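The Hopf bifurcation just described is visible in the standard polar normal form ṙ = αr − r³, ϕ̇ = β — a textbook model used here purely for illustration, not the book's system (6.11) itself. For α ≤ 0 trajectories spiral into the origin; for α > 0 a stable limit cycle of radius √α surrounds it. A minimal numerical sketch of the radial equation:

```python
import math

# Radial equation of the polar normal form r' = alpha*r - r**3 (the angular
# equation phi' = beta decouples).  Forward-Euler integration from r0 > 0.
def limit_radius(alpha, r0=0.6, dt=1e-3, steps=200_000):
    r = r0
    for _ in range(steps):
        r += dt * (alpha * r - r ** 3)
    return r

for alpha in (-0.1, 0.04, 0.25):
    target = math.sqrt(alpha) if alpha > 0 else 0.0
    print(alpha, limit_radius(alpha), target)
# alpha <= 0: r -> 0 (stable focus); alpha > 0: r -> sqrt(alpha),
# the radius of the limit cycle born in the Hopf bifurcation.
```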
In this chapter we consider the problem of the cyclicity of a simple singularity
of system (6.1) (that is, one at which the determinant of the linear part is nonzero),
a problem that is known as the local 16th Hilbert problem. We describe a general
method based on ideas of Bautin to treat this and similar bifurcation problems, and
apply the method to resolve the cyclicity problem for singular points of quadratic
systems and the problem of bifurcations of critical periods in the period annulus of
centers for a family of cubic systems.

6.1 Bautin’s Method for Bifurcation Problems

Bautin’s approach to bifurcation of limit cycles from singularities of vector fields is
founded on properties of zeros of analytic functions of several variables depending
on parameters, which we address in this section. Let E be a subset of R^n and let
F : R × E → R : (z, θ) ↦ F(z, θ) be an analytic function, which we will write in
a neighborhood of z = 0 in the form

F(z, θ) = ∑_{j=0}^{∞} f_j(θ) z^j ,          (6.2)

where, for j ∈ N0 , f j (θ ) is an analytic function and for any θ ∗ ∈ E the series (6.2)
is convergent in a neighborhood of (z, θ ) = (0, θ ∗ ). In all situations of interest to
us we will be concerned solely with the number of positive solutions (for any fixed
parameter value θ ∗ ) of the equation F (z, θ ∗ ) = 0 in a neighborhood of z = 0 in
R. Thus we define the multiplicity of the parameter value θ ∗ with respect to E as
follows.

Definition 6.1.1. For any θ ∗ ∈ E and any sufficiently small ε > 0, let z(θ , ε ) denote
the number of isolated zeros of F (z, θ ) in the interval (0, ε ) ⊂ R. The point θ ∗ ∈ E
is said to have multiplicity c with respect to the space E at the origin in R if there
exist positive constants δ0 and ε0 such that, for every pair of numbers δ and ε that
satisfy 0 < δ < δ0 and 0 < ε < ε0 ,

max{z(θ , ε ) : |θ − θ ∗ | ≤ δ } = c,

where | · | denotes the usual Euclidean norm on Rn .
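A toy family makes the definition concrete. For the hypothetical example F(z, θ) = θz − z² = z(θ − z) with scalar θ, a parameter θ near θ* = 0 yields exactly one zero of F in (0, ε) when 0 < θ < ε and none otherwise, so θ* = 0 has multiplicity c = 1. A rough numerical check by counting sign changes on a grid (the grid size is chosen so that no sampled θ lands exactly on a grid point, an assumption of this sketch):

```python
# Count sign changes of F(z, theta) = theta*z - z**2 on a grid in (0, eps).
def count_zeros(theta, eps, n=99_991):
    F = lambda z: theta * z - z * z
    zs = [eps * k / n for k in range(1, n)]
    return sum(1 for a, b in zip(zs, zs[1:]) if F(a) * F(b) < 0)

thetas = [-0.05, -0.013, 0.0, 0.0137, 0.071]
print(max(count_zeros(t, 0.1) for t in thetas))  # 1: the multiplicity of theta* = 0
```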

There are two possibilities that are of interest to us in regard to the flatness of
F(z, θ*) at z = 0:
(i) there exists m ∈ N_0 such that f_0(θ*) = · · · = f_m(θ*) = 0 but f_{m+1}(θ*) ≠ 0;
(ii) f_j(θ*) = 0 for all j ∈ N_0.
In the first case it is not difficult to see that the multiplicity of θ* is at most m
(for example, see Corollary 6.1.3 below). Case (ii) is much more subtle, but there
is a method for its treatment suggested by Bautin in [17]. We first sketch out the
method in the case that the functions f_j are polynomial functions of θ. The idea
is to find a basis of the ideal ⟨f_0(θ), f_1(θ), f_2(θ), . . .⟩ in the ring of polynomials
R[θ]. By the Hilbert Basis Theorem (Theorem 1.1.6) there is always a finite basis.
Adding polynomials f_j as necessary in order to fill out an initial string of the sequence {f_0, f_1, . . .}, we may choose the first m + 1 polynomials {f_0(θ), . . . , f_m(θ)}
for such a basis. Using this basis we can then write the function F in the form

F(z, θ) = ∑_{j=0}^{m} f_j(θ)(1 + Φ_j(z, θ)) z^j ,          (6.3)

where Φ_j(0, θ) = 0 for j = 0, 1, . . . , m. Therefore the function F(z, θ) behaves like
a polynomial in z of degree m near θ = θ*, hence can have at most m zeros for any
θ in a neighborhood of θ*, as will be shown in Proposition 6.1.2.
We will see below that the local 16th Hilbert problem is just the problem of the
multiplicity of the function P(ρ ) = R(ρ ) − ρ , where R(ρ ) is the Poincaré return
map (3.13). In the case that F is the derivative of the period function T (ρ ) defined
by (4.26) and E is the center variety of system (6.1), we have the so-called problem
of bifurcations of critical periods, which we will consider in Section 6.4.
By the discussion in the preceding paragraphs, to investigate the multiplicity of
F (z, θ ) we will need to be able to count isolated zeros of functions expressed in the
form of (6.3). The following proposition is our tool for doing so.

Proposition 6.1.2. Let Z : R × R^n → R be a function that can be written in the form

Z(z, θ) = f_1(θ) z^{j_1}(1 + ψ_1(z, θ)) + · · · + f_s(θ) z^{j_s}(1 + ψ_s(z, θ)),          (6.4)

where j_u ∈ N for u = 1, . . . , s and j_1 < · · · < j_s, and where f_j(θ) and ψ_j(z, θ) are
real analytic functions on {(z, θ) : |z| < ε and |θ − θ*| < δ}, for some positive real
numbers δ and ε, and ψ_j(0, θ*) = 0 for j = 1, . . . , s. Then there exist numbers ε_1 and
δ_1, 0 < ε_1 ≤ ε and 0 < δ_1 ≤ δ, such that for each fixed θ satisfying |θ − θ*| < δ_1,
the equation

Z(z, θ) = 0,          (6.5)

regarded as an equation in z alone, has at most s − 1 isolated solutions in the interval
0 < z < ε_1.

Proof. Let δ_1 and ε_1 be such that 0 < δ_1 < δ and 0 < ε_1 < ε and |ψ_j(z, θ)| < 1 if
|z| ≤ ε_1 and |θ − θ*| ≤ δ_1 for j = 1, . . . , s. Let B(θ*, δ_1) denote the closed ball in
R^n of radius δ_1 centered at θ*. For each j ∈ {1, . . . , s}, f_j is not the zero function,
else the corresponding term is not present in (6.4). We begin by defining the set V_0
by V_0 := {θ ∈ B(θ*, δ_1) : f_j(θ) = 0 for all j = 1, . . . , s}, a closed, proper subset of
B(θ*, δ_1). For θ_0 ∈ V_0, as a function of z, Z(z, θ_0) vanishes identically on (0, ε_1), so
the proposition holds for θ_0 ∈ V_0.
For any θ_0 ∈ B(θ*, δ_1) \ V_0, let u ∈ {1, . . . , s} be the least index for which
f_u(θ_0) ≠ 0. Then Z(z, θ_0) = f_u(θ_0) z^{j_u} + z^{j_u + 1} g(z, θ_0), where g(z, θ_0) is a real analytic function on [−ε_1, ε_1]. Thus the j_u th derivative of Z(z, θ_0) is nonzero at z = 0, so
Z(z, θ_0) is not identically zero, hence has a finite number S_0(θ_0) of zeros in (0, ε_1).
Let V_1 = {θ ∈ B(θ*, δ_1) : f_j(θ) = 0 for j = 2, . . . , s}; V_1 ⊃ V_0. For θ_0 ∈ V_1, if
f_1(θ_0) = 0, then Z(z, θ_0) vanishes identically on (0, ε_1); if f_1(θ_0) ≠ 0, then as a
function of z, Z(z, θ_0) = f_1(θ_0) z^{j_1}(1 + ψ_1(z, θ_0)) has no zeros in (0, ε_1). Either way
the proposition holds for θ ∈ V_1.
For θ ∈ B(θ*, δ_1) \ V_1, we divide Z(z, θ) by z^{j_1}(1 + ψ_1(z, θ)) to form a real analytic function Z̃^{(1)}(z, θ) of z on [−ε_1, ε_1], then differentiate with respect to z to obtain
a real analytic function Z^{(1)}(z, θ) of z on [−ε_1, ε_1] that can be written in the form

Z^{(1)}(z, θ) = f_2(θ)(j_2 − j_1) z^{j_2 − j_1 − 1}(1 + ψ_2^{(1)}(z, θ)) + · · ·
                + f_s(θ)(j_s − j_1) z^{j_s − j_1 − 1}(1 + ψ_s^{(1)}(z, θ)),

where ψ_j^{(1)}(0, θ*) = 0 for j = 2, . . . , s. As a function of z, Z̃^{(1)}(z, θ) has the same
number S_0(θ) of zeros in (0, ε_1) as does Z(z, θ). As a function of z, Z^{(1)}(z, θ) is
not identically zero, hence has a finite number S_1(θ) of zeros in (0, ε_1). By Rolle’s
Theorem, Z̃^{(1)}(z, θ) has at most one more zero in (0, ε_1) than does Z^{(1)}(z, θ), so
S_0(θ) ≤ S_1(θ) + 1.
The function Z^{(1)}(z, θ) is of the same form as Z(z, θ) (incorporating the nonzero
constant j_2 − j_1 into f_2(θ) and taking, if necessary, ε_1 smaller in order to satisfy the
condition |ψ_j^{(1)}(z, θ)| < 1 if |z| ≤ ε_1), so we may repeat the same procedure: define
V_2 = {θ ∈ B(θ*, δ_1) : f_j(θ) = 0 for j = 3, . . . , s}, which contains V_1 and on which
Z^{(1)}, as a function of z, either vanishes identically or has no zeros in (0, ε_1), then for
θ ∈ B(θ*, δ_1) \ V_2 divide Z^{(1)}(z, θ) by z^{j_2 − j_1 − 1}(1 + ψ_2^{(1)}(z, θ)) to form Z̃^{(2)}(z, θ),
which has the same number S_1(θ) of zeros in (0, ε_1) as Z^{(1)}(z, θ), and finally differentiate Z̃^{(2)}(z, θ) with respect to z to obtain a real analytic function Z^{(2)}(z, θ) that,
for each θ ∈ B(θ*, δ_1) \ V_2, has, as a function of z, a finite number S_2(θ) of zeros in
(0, ε_1), and S_1(θ) ≤ S_2(θ) + 1, so that S_0(θ) ≤ S_2(θ) + 2.
Taking Z^{(0)}(z, θ) = Z(z, θ) and repeating the process for a total of s − 1 iterations,
we obtain a sequence of sets V_0 ⊂ V_1 ⊂ · · · ⊂ V_{s−1} and of functions Z^{(j)}(z, θ), defined
and analytic on [−ε_1, ε_1] for θ ∈ B(θ*, δ_1) \ V_j, j = 0, . . . , s − 1, with the property that
the proposition holds on V_j, Z^{(j)}(z, θ) has, as a function of z, a finite number S_j(θ)
of zeros in (0, ε_1), and S_0(θ) ≤ S_j(θ) + j. In particular, the proposition holds for
θ ∈ V_{s−1}, and for θ ∈ B(θ*, δ_1) \ V_{s−1}, S_0(θ) ≤ S_{s−1}(θ) + (s − 1). But we can write
Z^{(s−1)}(z, θ) = f_s(θ)(j_s − j_{s−1}) · · · (j_s − j_2)(j_s − j_1) z^{j_s − j_{s−1} − 1}(1 + ψ_s^{(s−1)}(z, θ)) for
some function ψ_s^{(s−1)} satisfying ψ_s^{(s−1)}(0, θ*) = 0. As a function of z, Z^{(s−1)}(z, θ)
is not identically zero, hence has no zeros in (0, ε_1). Thus S_0(θ) ≤ s − 1, so the
proposition holds for all θ ∈ B(θ*, δ_1). □

Remark. If we are interested in the number of isolated solutions of (6.5) in the
interval (−ε_1, ε_1), then we obtain the upper bound 2s − 1: by the proposition at
most s − 1 isolated solutions in (0, ε_1), by the same argument as in the proof of the
proposition at most s − 1 isolated solutions in (−ε_1, 0), plus a possible solution at
z = 0.

Corollary 6.1.3. Suppose the coefficients of the function F of (6.2) satisfy

f_0(θ*) = · · · = f_m(θ*) = 0,    f_{m+1}(θ*) ≠ 0.

Then the multiplicity of θ* is less than or equal to m.

Proof. Because f_{m+1}(θ*) ≠ 0, in a neighborhood of θ* we can write the function
F in (6.2) in the form

F(z, θ) = ∑_{j=0}^{m} f_j(θ) z^j + f_{m+1}(θ) z^{m+1}(1 + ψ(z, θ)).

Now apply Proposition 6.1.2. □

Proposition 6.1.2 enables us to count zeros of functions of the form (6.4),
but functions of interest to us naturally arise in the form of (6.2). In order to use
Proposition 6.1.2, we must rearrange the terms of our series. In doing so it is expedient to work over C rather than R, so we next review some terminology and
facts related to functions of several complex variables, and to series of the form
∑_{α∈N_0^n} a_α (x − c)^α, where the a_α are real or complex, that will be needed to proceed (see, for example, [87, 90, 96]). Let k denote the field R or C. A polydisk
in k^n centered at c = (c_1, . . . , c_n) and of polyradius r = (r_1, . . . , r_n) is the open set
{(x_1, . . . , x_n) : |x_j − c_j| < r_j for j = 1, . . . , n}. Because the set N_0^n of multi-indices

can be totally ordered in many ways, the question arises as to what limit precisely is being taken in the definition of the convergence of a series of the form
∑_{α∈N_0^n} a_α (x − c)^α. For a formal answer that avoids reference to any order on N_0^n the
reader is referred to the first section in [87]. However, convergence in any sense at a
point b = (b_1, . . . , b_n) implies absolute convergence on the open polydisk centered
at c and of polyradius (|c_1 − b_1|, . . . , |c_n − b_n|) (this is Abel’s Lemma; see the first
section of [96]), so that the existence and value of the sum of the series are independent of the ordering of its terms. A germ of an analytic function at a point θ* ∈ k^n
is an equivalence class of analytic functions under the relation f is equivalent to g if
there is a neighborhood of θ* on which f and g agree. We denote by G_{θ*} the ring
(under the natural addition and multiplication described in Exercise 6.1) of germs
of analytic functions of θ at the point θ* ∈ k^n, which is a Noetherian ring that is
isomorphic to the ring of convergent power series in n variables over k. If f is an
analytic function on some open neighborhood of θ* in k^n, we denote by f the element of G_{θ*} induced by f. (The context will make it clear when the use of boldface
type indicates a mapping into R^n or C^n for n > 1 and when it indicates the germ of a
function into R or C.) The following statement is a special case (p = 1) of Theorem
II.D.2 of [90].
Theorem 6.1.4. Let U be an open subset of C^n, let θ* be a point in U, let g_1, . . . , g_s
be holomorphic functions on U, and let g_1, . . . , g_s be the germs they induce at θ*. Let
I = ⟨g_1, . . . , g_s⟩. Then there exist a polydisk P ⊂ U, centered at θ*, and a constant
γ > 0 such that for any function f that is holomorphic on P and such that f ∈ I, there
exist functions h_1, . . . , h_s that are holomorphic on P and are such that f = ∑_{j=1}^{s} h_j g_j
on P and ‖h_j‖_P ≤ γ ‖f‖_P, for j = 1, . . . , s, where ‖·‖_P is the supremum norm for
continuous functions on P, ‖f‖_P = sup_{z∈P} |f(z)|.
Another point that arises when we change (6.2) into the form (6.4) by means of
a basis of the ideal generated by the coefficient functions f_j(θ) is that the order of
the coefficient functions, as determined by their indices, is important. It will be of
particular importance in the proof of the lemma on rearrangement of the series that
the basis of the ideal ⟨f_j : j ∈ N_0⟩ involved have the property that it include the first
nonzero function and any function f_j that is independent of all the functions with
lower indices, in the sense that it is not a linear combination of them. For example,
for the ordered collection {f^3, f^2, f} and the corresponding ideal I, I = ⟨f^3, f⟩ and
I = ⟨f⟩, but neither of the bases specified by these expressions meets the condition.
We separate out this property of certain bases in the following definition.
Definition 6.1.5. Let k be a field and let {f_0, f_1, f_2, . . .} be an ordered set of polynomials in k[x_1, . . . , x_n]. Suppose J is the least index for which f_J is not the zero
polynomial. A basis B of the ideal I = ⟨f_j : j ∈ N_0⟩ satisfies the retention condition
if
(a) f_J ∈ B, and
(b) for j ≥ J + 1, if f_j ∉ ⟨f_0, . . . , f_{j−1}⟩, then f_j ∈ B.
The minimal basis of I with respect to the retention condition is the basis constructed
in the following way: beginning with B = {f_J}, sequentially check successive elements f_j, starting with j = J + 1, and add f_j to B if and only if f_j ∉ ⟨B⟩.

The procedure described in the definition produces an ascending chain of ideals,
hence must terminate in finitely many steps, since the ring k[x_1, . . . , x_n] is Noetherian. The basis constructed in this fashion is minimal among all those that satisfy
the retention condition in that it contains as few elements as possible.
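When the f_j are polynomials, the construction in Definition 6.1.5 is effective: membership in ⟨B⟩ can be tested by reduction modulo a Gröbner basis of ⟨B⟩. A sketch in Python using SymPy follows; the sample sequence f_0, f_1, . . . is made up for illustration and is not from the text.

```python
from sympy import symbols, groebner

t1, t2 = symbols('theta1 theta2')

def minimal_retention_basis(fs, gens):
    """Build the minimal basis of Definition 6.1.5: scan f_0, f_1, ... in
    order and keep f_j exactly when it is not already in the ideal
    generated by the elements kept so far (Groebner-basis membership test)."""
    B = []
    for f in fs:
        if f == 0:
            continue                      # skip zero polynomials
        if not B or not groebner(B, *gens, order='lex').contains(f):
            B.append(f)                   # f_j not in <B>: retain it
    return B

# A made-up ordered sequence f_0, f_1, ... of coefficient polynomials.
fs = [0, t1 * t2, t1**2 * t2, t2**2, t1 + t2, t1**3]
print(minimal_retention_basis(fs, (t1, t2)))
# t1**2*t2 and t1**3 are dropped: each already lies in the ideal
# generated by earlier retained elements.
```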
When we write only “a basis B of the ideal I = ⟨f_1, f_2, . . .⟩ that satisfies the
retention condition” or “the minimal basis” in this regard, omitting mention of an
ordering, then it is to be understood that the set of functions in question is ordered
as they are listed when I is described. With these preliminaries we may now state
and prove the lemma on rearrangements of series of the form (6.2).

Lemma 6.1.6. Let F(z, θ) be a series of the form (6.2) that converges on a set
U = {(z, θ) : |z| < ε and |θ − θ*| < δ} ⊂ R × R^n, let f_j denote the germ of f_j at θ*
in the ring of germs G_{θ*} of complex analytic functions at θ* when θ* is regarded as
an element of C^n, and suppose there exists a basis B of the ideal I = ⟨f_0, f_1, f_2, . . .⟩ in
G_{θ*} that consists of m germs f_{j_1}, . . . , f_{j_m}, j_1 < · · · < j_m, and that satisfies the retention
condition. Then there exist positive numbers ε_1 and δ_1, 0 < ε_1 ≤ ε and 0 < δ_1 ≤ δ,
and m analytic functions ψ_{j_q}(z, θ) for which ψ_{j_q}(0, 0) = 0, q ∈ {1, . . . , m}, such that

F(z, θ) = ∑_{q=1}^{m} f_{j_q}(θ) (1 + ψ_{j_q}(z, θ)) z^{j_q}          (6.6)

holds on the set U_1 = {(z, θ) : |z| < ε_1 and |θ − θ*| < δ_1}.

Proof. Allow z and θ to be complex. The series that defines F still converges on
U, or more precisely on U^C := {(z, θ) : |z| < ε and |θ − θ*| < δ} ⊂ C × C^n. Let
R = min{ε/2, δ/2, 1}, and let M = sup{|F(z, θ)| : |z| ≤ R and |θ − θ*| ≤ R} < ∞.
Since, for any fixed θ ∈ C^n such that |θ − θ*| ≤ R, the series for F(z, θ) converges
on {z ∈ C : |z| ≤ R}, the Cauchy Inequalities state that |f_j(θ)| ≤ M/R^j for all j ∈ N_0,
so that if W = {θ : |θ − θ*| ≤ R} ⊂ C^n, then

‖f_j‖_W ≤ M / R^j .          (6.7)

Applying Theorem 6.1.4 to the ideal I = ⟨f_1, f_2, . . .⟩ = ⟨f_{j_1}, . . . , f_{j_m}⟩, we obtain the
existence of a polydisk P ⊂ W ⊂ C^n centered at θ* ∈ R^n, a positive real constant γ,
and for each j ∈ N_0 m analytic functions h_{j,1}, h_{j,2}, . . . , h_{j,m} on P such that

f_j = h_{j,1} f_{j_1} + h_{j,2} f_{j_2} + · · · + h_{j,m} f_{j_m}

and
‖h_{j,u}‖_P ≤ γ ‖f_j‖_P ,          (6.8)

for j ∈ N_0 and u = 1, . . . , m. Thus series (6.2) is

F(z, θ) = ∑_{j=0}^{∞} ( ∑_{q=1}^{m} h_{j,q}(θ) f_{j_q}(θ) ) z^j .          (6.9)

We wish to rearrange the terms in this series. Rearrangement is permissible if the
series converges absolutely, which we now show to be true. (Since the series is
not a power series we cannot appeal directly to Abel’s Lemma.) Let σ denote the
minimum component in the polyradius of P, so σ ≤ R. Then for j ≥ j_m and θ
satisfying |θ − θ*| < σ,

|h_{j,q}(θ) f_{j_q}(θ) z^j| ≤ γ ‖f_j‖_P (M/R^{j_q}) |z|^j    (by (6.7) and (6.8), since θ ∈ P ⊂ W)
                         ≤ γ ‖f_j‖_W (M/R^j) |z|^j       (since P ⊂ W, j ≥ j_m, and R ≤ 1)
                         ≤ γ (M^2/R^{2j}) |z|^j ,        (by (6.7))

which means that convergence is absolute for |z| < ε_1 := R^2 and |θ − θ*| < δ_1 := σ.
Thus we may rewrite (6.9) as

F(z, θ) = ∑_{j=0}^{j_m} ( ∑_{q=1}^{m} h_{j,q}(θ) f_{j_q}(θ) ) z^j + ∑_{q=1}^{m} ( ∑_{j=j_m+1}^{∞} h_{j,q}(θ) z^j ) f_{j_q}(θ) .          (6.10)

Suppose r ∈ {0, 1, . . ., jm }. If r = jq for some q ∈ {1, . . ., m}, then

fr (θ )zr = (0 · f j1 (θ ) + · · · + 1 · f jq (θ ) + · · · + 0 · f jm (θ ))z jq = f jq (θ )z jq .

If jq < r < jq+1 for some q ∈ {1, . . . , m − 1} (the only other case, since fr = 0
for r < j1 ), then since B satisfies the retention condition, there exist functions
ur,1 (θ ), . . . , ur,q (θ ), each one analytic on a neighborhood in Cn of θ ∗ ∈ Rn , such
that
fr (θ ) = ur,1(θ ) f j1 (θ ) + · · · + ur,q(θ ) f jq (θ ),
hence

fr (θ )zr = (ur,1 (θ )zr− j1 ) f j1 (θ )z j1 + · · · + (ur,q (θ )zr− jq ) f jq (θ )z jq .

Thus  
jm m m
∑ ∑ h j,q(θ ) f jq (θ ) z j = ∑ f jq (θ )(1 + ψe jq (z, θ ))z jq
j=0 q=1 q=1

for some functions ψ̃_{j_q}(z, θ) that are analytic on a neighborhood of (0, θ ∗ ) in C × C^n and satisfy ψ̃_{j_q}(0, θ ∗ ) = 0. Using (6.10) to incorporate higher-order terms in z into the functions ψ̃_{j_q}, for possibly smaller ε_1 and δ_1 we have that (6.6) holds on
U_1^C = {(z, θ) : |z| < ε_1 and |θ − θ ∗ | < δ_1} ⊂ C × C^n. We now make the simple observation that if real numbers µ, µ_1, . . . , µ_m and complex numbers ξ_1, . . . , ξ_m satisfy µ = µ_1 ξ_1 + · · · + µ_m ξ_m, then µ = µ_1 Re ξ_1 + · · · + µ_m Re ξ_m. Thus because the functions f_{j_1}, . . . , f_{j_m} have real coefficients, and F(z, θ) is real when z and θ are real, if each ψ_{j_q} is replaced by its real part, (6.6) still holds on U_1 := U_1^C ∩ (R × R^n). □

Theorem 6.1.7. Let F(z, θ) be a series of the form (6.2) that converges on a set {(z, θ) : |z| < ε and |θ − θ ∗ | < δ} ⊂ R × R^n, and let f_j denote the germ of f_j at θ ∗ in the ring of germs G_{θ ∗} of complex analytic functions at θ ∗ when θ ∗ is regarded as an element of C^n. Suppose the ideal I = ⟨f_0, f_1, f_2, . . . ⟩ in G_{θ ∗} has a basis B that consists of m germs f_{j_1}, . . . , f_{j_m}, j_1 < · · · < j_m, and satisfies the retention condition. Then there exist numbers ε_1 and δ_1, 0 < ε_1 ≤ ε and 0 < δ_1 ≤ δ, such that for each fixed θ satisfying |θ − θ ∗ | < δ_1, the equation F(z, θ) = 0, regarded as an equation in z alone, has at most m − 1 isolated solutions in the interval (0, ε_1).

Proof. According to Lemma 6.1.6 we can represent the function F (z, θ ) of (6.2) in
the form
F(z, θ) = ∑_{q=1}^{m} f_{j_q}(θ) (1 + ψ_{j_q}(z, θ)) z^{j_q} .

The conclusion now follows by Proposition 6.1.2. □

6.2 The Cyclicity Problem

In this section we use the results just derived to develop a method for counting the
maximum number of limit cycles that can bifurcate from a simple focus or center
of system (6.1), which we abbreviate in this paragraph as u̇ = f0 (u). (Recall that a
singularity u0 of (6.1) is called simple or nondegenerate if det df0 (u0 ) ≠ 0.) Back-
ing up a bit, suppose u0 is an arbitrary singularity of system (6.1). The discussion
in the second paragraph of Section 2.2 implies that if u0 is hyperbolic then it has
cyclicity zero. It fails to be hyperbolic if either det df0 (u0 ) = 0 or det df0 (u0 ) > 0 but
Tr df0 (u0 ) = 0. If we remove the hyperbolicity by letting det df0 (u0 ) = 0, then u0
need not be isolated from other singularities, and when it is isolated it is possible that
it either splits into more than one singularity or disappears entirely under arbitrarily
small perturbation of f0 , under any sensible topology on the set of polynomial vector fields f. (See
Exercises 6.3 through 6.6.) Thus it is natural to consider the situation Tr df0 (u0 ) = 0
and det df0 (u0 ) > 0, so that u0 is simple, and by Remark 3.1.7 is a focus or a center.
Moreover, any focus or center of a quadratic system is simple (Lemma 6.3.2), so the
theory developed here will cover quadratic antisaddles completely.
Thus suppose u0 is a singularity of system (6.1) at which the trace of the linear
part is zero and the determinant of the linear part is positive, a simple but non-
hyperbolic focus or center. By a translation to move u0 to the origin and a linear
transformation to place df0 (u0 ) in Jordan normal form, system (6.1) can be written
in the form (3.2). We must allow the trace of the linear part to become nonzero under
perturbation, however, so we will consider the family (3.4) for which the theory of
the difference map P and the Lyapunov numbers was developed in Section 3.1.
Under the time rescaling τ = β t equation (3.4) takes the simpler form

u̇ = λ u − v + P(u, v)
(6.11a)
v̇ = u + λ v + Q(u, v),

where λ = α/β and where P and Q are polynomials with max{deg P, deg Q} = n, say P(u, v) = ∑_{j+k=2}^{n} A_{jk} u^j v^k and Q(u, v) = ∑_{j+k=2}^{n} B_{jk} u^j v^k. Introducing the complex coordinate x = u + iv in the usual way expresses (6.11a) in the complex form
 
ẋ = λ x + i ( x − ∑_{(p,q)∈S} a_pq x^{p+1} x̄^q ) ,    (6.11b)

where S ⊂ N−1 × N0 is a finite set, every element (p, q) of which satisfies the con-
dition p + q ≥ 1. The choice of labels on the equations is meant to underscore the
fact that they are two representations of the same family of systems. Moreover,
Re a_pq ∈ Q[A, B] and Im a_pq ∈ Q[A, B]. When λ = 0, these equations reduce to

u̇ = −v + P(u, v)
(6.12a)
v̇ = u + Q(u, v)

(which is precisely (3.2)) and


 
ẋ = i ( x − ∑_{(p,q)∈S} a_pq x^{p+1} x̄^q ) .    (6.12b)

We will use just (λ, (A, B)) to stand for the full coefficient string (λ, A_20, . . . , B_0n) in R × R^{(n+1)(n+2)−6} and just a to stand for the full coefficient string (a_{p_1,q_1}, . . . , a_{p_ℓ,q_ℓ}) in C^ℓ. Thus we let E(λ, (A, B)), E(λ, a), E(A, B), and E(a) denote the spaces of parameters of families (6.11a), (6.11b), (6.12a), and (6.12b), respectively. These are just R × R^{(n+1)(n+2)−6}, R × C^ℓ, and so on.
The precise definition of the cyclicity of the singularity of (6.11) is the following,
which could also be expressed in terms of the parameters (λ , (A, B)). We repeat that
the natural context for perturbation of an element of family (6.12) is family (6.11),
hence the choice of the parameter λ in the parameter space in the definition.
Definition 6.2.1. For parameters (λ , a), let n((λ , a), ε ) denote the number of limit
cycles of the corresponding system (6.11) that lie wholly within an ε -neighborhood
of the origin. The singularity at the origin for system (6.11) with fixed coefficients
(λ ∗ , a∗ ) ∈ E(λ , a) has cyclicity c with respect to the space E(λ , a) if there exist
positive constants δ0 and ε0 such that for every pair ε and δ satisfying 0 < ε < ε0
and 0 < δ < δ0 ,

max{n((λ , a), ε ) : |(λ , a) − (λ ∗ , a∗ )| < δ } = c .

Refer to the discussion in Section 3.1 surrounding system (3.4). For ρ ∈ R, we have a first return map R(ρ) determined on the positive portion of the u-axis as specified in Definition 3.1.3 for systems of the form (3.4). By (3.13), R(ρ) has series expansion
series expansion
R(ρ) = η̃_1 ρ + η_2 ρ^2 + η_3 ρ^3 + · · · ,    (6.13)
where η̃_1 and the η_k for k ≥ 2 are real analytic functions of the parameters (λ, (A, B)) of system (6.11), and in particular η̃_1 = e^{2πλ}, as explained in the sentence following Definition 3.1.3. Since isolated zeros of the difference function

P(ρ) = R(ρ) − ρ = η_1 ρ + η_2 ρ^2 + η_3 ρ^3 + · · ·    (6.14)

correspond to limit cycles of system (6.11), the cyclicity of the origin of the system
corresponding to (λ ∗ , (A∗ , B∗ )) ∈ E(λ , (A, B)) is equal to the multiplicity of the
function P(ρ ) at (λ ∗ , (A∗ , B∗ )). Thus the behavior of the Lyapunov numbers ηk ,
k ∈ N, holds the key to the cyclicity of the origin for system (6.11). For example,
we see already that if the point (λ ∗ , (A∗ , B∗ )) in the space of parameters of system
(6.11) is such that λ ∗ = α ∗ /β ∗ ≠ 0, then η1 = η̃1 − 1 ≠ 0, so the expansion of P
shows that no limit cycle can bifurcate from the origin under small perturbation.
This is in precise agreement with the fact that (0, 0) is a hyperbolic focus in this
situation.
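The role of the difference function P(ρ) can be made concrete numerically. The sketch below is our toy illustration, not one of the polynomial families of the text: it integrates the polar model dr/dφ = λr − r³ (the rotationally symmetric cubic perturbation of the linear center) through one full turn with a classical Runge–Kutta step. For λ > 0 a single limit cycle of amplitude √λ surrounds the origin, so P changes sign exactly once near ρ = √λ; for λ < 0 the origin is a stable focus and P < 0 throughout, so no limit cycle is present.

```python
from math import pi

def return_map(rho, lam, steps=2000):
    """One turn of the first return map for dr/dphi = lam*r - r**3,
    computed with the classical 4th-order Runge-Kutta method."""
    f = lambda r: lam * r - r**3
    h = 2 * pi / steps
    r = rho
    for _ in range(steps):
        k1 = f(r)
        k2 = f(r + h * k1 / 2)
        k3 = f(r + h * k2 / 2)
        k4 = f(r + h * k3)
        r += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return r

def P(rho, lam):
    """Difference function P(rho) = R(rho) - rho."""
    return return_map(rho, lam) - rho

lam = 0.04                  # limit cycle expected at rho = sqrt(lam) = 0.2
assert P(0.15, lam) > 0     # inside the cycle: trajectories spiral outward
assert P(0.25, lam) < 0     # outside the cycle: trajectories spiral inward
assert P(0.15, -0.04) < 0   # lam < 0: stable focus, no small limit cycle
```

Isolated sign changes of P on a small interval (0, ε) are exactly the limit cycles counted by the cyclicity in Definition 6.2.1.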
We asserted in Section 3.1 that, like the focus quantities gkk for the complexifica-
tion (3.3) of (6.12a) and the coefficients G2k+1 of the function G defined by (3.28)
in terms of the distinguished normal form (3.27) of the complexification, the Lya-
punov numbers are polynomials in the parameters (A, B) of system (6.12a). We will
prove this assertion now. But since the Lyapunov numbers are also defined for the
larger family of systems of the form (6.11), there are actually two sets of “Lyapunov
numbers” of interest: those arising in the context of system (6.12a) and those arising
in the context of system (6.11a), the natural setting in which perturbations from an
element of (6.12a) take place in determination of the cyclicity of the antisaddle at
the origin (Definition 6.2.1). In the context of the parametrized families (6.11) or
(6.12), the Lyapunov numbers are referred to as the Lyapunov “quantities.” All that
we will need to know about the Lyapunov quantities for the more inclusive family
(6.11) is that they are real analytic functions of the parameters (λ , (A, B)). This fol-
lows immediately from the analyticity of the solution f (ϕ , ϕ0 , r0 ) of the initial value
problem (3.7) with initial conditions r = r0 and ϕ = ϕ0 , the fact that r = 0 is not a
singular point but a regular solution of (3.7), and analyticity of the evaluation map,
since the Poincaré first return map is nothing other than the evaluation of f (ϕ , 0, r0 )
at ϕ = 2π . The actual form of the Lyapunov quantities in terms of the parameters
(λ , (A, B)) for family (6.11) is described in Exercise 6.7. It is not difficult to see
that if different choices are made for the initial conditions in the initial value prob-
lems that determine the functions wk , which gives rise to the nonuniqueness of the
Lyapunov quantities, the same proof is valid.

Proposition 6.2.2. The Lyapunov quantities specified by Definition 3.1.3 for family
(6.12a) are polynomial functions of the parameters (A, B) with coefficients in R.

Proof. It is clear from an examination of equations (3.5), (3.6), and (3.7), where,
for the time-rescaled systems (6.11) and (6.12), α and β are replaced by λ and 1,
respectively, that for each k ∈ N, Rk is a polynomial in cos ϕ , sin ϕ , the coefficients
of P and Q (that is, (A, B)), and λ , with integer coefficients.
We claim that for all k ∈ N, the function wk (ϕ ) defined by (3.9) is a polynomial
in cos ϕ , sin ϕ , ϕ , and (A, B), with rational coefficients. By (3.7), R1 ≡ 0 so by (3.10)
and (3.11), the initial value problem that w1 solves uniquely is w′1 = 0, w1 (0) = 1,
so w1 (ϕ ) ≡ 1. If the claim holds for w1 , . . . , w j−1 , then by (3.10) and (3.11), the
initial value problem that determines w j uniquely is of the form w′j (ϕ ) = S j (ϕ ),
w j (0) = 0, where S j does not involve w j and is a polynomial in cos ϕ , sin ϕ , ϕ , and
(A, B), with rational coefficients. Simply integrating both sides shows that the claim
holds for w j , hence by mathematical induction the claim holds for wk for all k ∈ N.
(The first integration of an even power of cos ϕ or sin ϕ produces a rational constant
times ϕ as one term.) Then η1 = w1 (2π ) − 1 ≡ 0 and for k ≥ 2 ηk = wk (2π ) is a
polynomial in (A, B) with real coefficients, since the sines and cosines evaluate to 0
and 1 and powers of ϕ evaluate to powers of 2π . 
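The induction in this proof can be mirrored in code. The toy below is our illustration, not the book's systems: it runs the same recursion for the scalar model dr/dφ = a r², whose exact solution r = ρ/(1 − aφρ) gives w_k(φ) = (aφ)^{k−1}. As in the proof, w₁ ≡ 1 and each w_j is obtained from w₁, . . . , w_{j−1} by a single integration, so it stays a polynomial in φ with rational coefficients:

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials in phi stored as coefficient lists."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, ci in enumerate(p):
        for j, cj in enumerate(q):
            r[i + j] += ci * cj
    return r

def poly_add(p, q):
    """Add polynomials in phi stored as coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else Fraction(0)) +
            (q[i] if i < len(q) else Fraction(0)) for i in range(n)]

def integrate(p):
    """Antiderivative vanishing at phi = 0; rational coefficients stay rational."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(p)]

def w_functions(a, kmax):
    """w_1, ..., w_kmax for the model dr/dphi = a*r^2: the coefficient of rho^j
    in r^2 is sum_{p+q=j} w_p*w_q, so w_j' = a * sum_{p+q=j} w_p*w_q, w_j(0)=0."""
    w = {1: [Fraction(1)]}          # w_1 solves w_1' = 0, w_1(0) = 1
    for j in range(2, kmax + 1):
        s = []
        for p in range(1, j):
            s = poly_add(s, poly_mul(w[p], w[j - p]))
        w[j] = integrate([a * c for c in s])
    return w

a = Fraction(2, 3)
w = w_functions(a, 5)
# Exact solution r = rho/(1 - a*phi*rho) gives w_k(phi) = (a*phi)^(k-1):
for k in range(1, 6):
    assert w[k] == [Fraction(0)] * (k - 1) + [a ** (k - 1)]
```

Evaluating each w_k at φ = 2π then yields the analogue of the Lyapunov quantities, polynomial in the parameter a exactly as the proposition asserts for (A, B).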

By Theorems 3.1.5 and 3.3.5, the Lyapunov quantities and the focus quantities
pick out the centers in family (6.12); together with Theorem 3.2.7 and Remark 3.3.6
they also distinguish stable and unstable weak foci. These facts indicate that there must be an intimate connection between them and suggest that it should be possible to use the focus quantities to investigate the cyclicity of simple foci and centers.
This is true, and is important because the focus quantities are so much easier to work
with than the Lyapunov quantities. The precise nature of the relationship is given in
the next theorem. In anticipation of that, it will be helpful to review briefly how we
derived the focus quantities for the real system (6.12a). The first step was to express
(6.12a) in the complex form (6.12b) and then adjoin to that equation its complex
conjugate. By regarding x̄ as an independent variable, this pair of differential equa-
tions became a system of ordinary differential equations on C2 , the complexification
(3.50) of (6.12), which we duplicate here:
   
ẋ = i ( x − ∑_{(p,q)∈S} a_pq x^{p+1} y^q ) ,    ẏ = −i ( y − ∑_{(p,q)∈S} b_qp x^q y^{p+1} ) ,    (6.15)

with bqp = ā pq . Letting X denote the vector field on C2 associated to any system
on C2 of this form, not necessarily the complexification of a real system, hence not
necessarily satisfying the condition bqp = ā pq, we then applied X to a formal series
Ψ given by
Ψ(x, y) = xy + ∑_{j+k≥3} v_{j−1,k−1} x^j y^k .    (6.16)

Recursively choosing the coefficients v_{j−1,k−1} in an attempt to make all coefficients of X Ψ = ∑_{j+k≥1} g_{j,k} x^{j+1} y^{k+1} vanish, we obtained functions g_kk ∈ C[a, b] such that i g_kk ∈ Q[a, b] (i = √−1) and

X Ψ = g_11 (xy)^2 + g_22 (xy)^3 + g_33 (xy)^4 + · · · .    (6.17)

Thus whereas η_k is a polynomial in the original real coefficients (A, B) of (6.12a), g_kk is a polynomial in the complex coefficients (a, b) of the complexification (6.15). To make a proper comparison, we must express g_kk in terms of the parameters (A, B). This is possible because the coefficients (a, b) of the complexification satisfy b = ā and g_kk(a, ā) ∈ R for all a ∈ C^ℓ (Remark 3.4.7), and because Re a_pq and Im a_pq are polynomials (with rational coefficients) in the original coefficients (A, B), so that
g^R_kk(A, B) := g_kk(a(A, B), ā(A, B))    (6.18)

is a polynomial in (A, B) with rational coefficients. The distinction between g_kk and g^R_kk might be best understood by working through Exercise 6.9. We note, however, that this distinction is never made in the literature; one must simply keep in mind the context in which the quantity g_kk arises.
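The recursion (6.16)–(6.17) is easy to mechanize. The sketch below is our own pure-Python implementation, not code from the text: it builds Ψ degree by degree, solving i(j − k) v_{j−1,k−1} = −(known lower-order terms) for each off-diagonal coefficient and reading off g_kk from the diagonal obstruction. For the quadratic index set S = {(1, 0), (0, 1), (−1, 2)} a hand computation (ours, under this normalization) gives g₁₁ = i(b₀₁b₁₀ − a₀₁a₁₀), which the recursion reproduces; note also that g₁₁ comes out real when b_qp = ā_pq, in line with Remark 3.4.7.

```python
def poly_mul(p, q):
    """Multiply polynomials in x, y stored as {(j, k): coeff}."""
    r = {}
    for (j1, k1), c1 in p.items():
        for (j2, k2), c2 in q.items():
            key = (j1 + j2, k1 + k2)
            r[key] = r.get(key, 0) + c1 * c2
    return r

def x_psi(psi, xdot, ydot):
    """Apply the vector field X to psi: X(psi) = psi_x * xdot + psi_y * ydot."""
    out = {}
    dx = {(j - 1, k): j * v for (j, k), v in psi.items() if j}
    dy = {(j, k - 1): k * v for (j, k), v in psi.items() if k}
    for part, field in ((dx, xdot), (dy, ydot)):
        for key, c in poly_mul(part, field).items():
            out[key] = out.get(key, 0) + c
    return out

def focus_quantities(a, b, N):
    """g_11, ..., g_NN for xdot = i(x - sum a_pq x^{p+1} y^q),
    ydot = -i(y - sum b_qp x^q y^{p+1}), with a, b keyed by (p, q) in S."""
    xdot, ydot = {(1, 0): 1j}, {(0, 1): -1j}
    for (p, q), c in a.items():
        xdot[(p + 1, q)] = xdot.get((p + 1, q), 0) - 1j * c
    for (p, q), c in b.items():
        ydot[(q, p + 1)] = ydot.get((q, p + 1), 0) + 1j * c
    psi, g = {(1, 1): 1.0}, []
    for d in range(3, 2 * N + 3):
        c = x_psi(psi, xdot, ydot)      # degree-d inhomogeneous terms
        for j in range(d + 1):
            k = d - j
            coef = c.get((j, k), 0)
            if j != k:
                # choose v_{j-1,k-1} so that i*(j-k)*v + coef = 0
                psi[(j, k)] = 1j * coef / (j - k)
            else:
                g.append(coef)          # g_{j-1,j-1}: the obstruction that remains
    return g

# Quadratic family: S = {(1,0), (0,1), (-1,2)}, arbitrary complex coefficients.
a = {(1, 0): 0.3 + 0.4j, (0, 1): -0.2 + 0.5j, (-1, 2): 0.1 - 0.7j}
b = {(1, 0): 0.6 - 0.1j, (0, 1): 0.2 + 0.3j, (-1, 2): -0.4 + 0.2j}
g11 = focus_quantities(a, b, 1)[0]
assert abs(g11 - 1j * (b[(1, 0)] * b[(0, 1)] - a[(0, 1)] * a[(1, 0)])) < 1e-12
```

For larger index sets or higher g_kk the same loop applies unchanged; only the size of the polynomial dictionaries grows.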
We remark that if the time rescaling to obtain (6.11a) from (3.4) with α = 0 is not
done, so that β is present, then π is replaced by π /β everywhere in the statement of
the theorem.

Theorem 6.2.3. Let ηk be the Lyapunov quantities for system (6.12a) with regard to
the antisaddle at the origin, let gkk be the focus quantities for the complexification
(6.15) of (6.12a), and let gRkk denote the polynomial functions defined by (6.18).
Then η_1 = η_2 = 0, η_3 = π g^R_11, and for k ∈ N, k ≥ 2, η_{2k} ∈ ⟨g^R_11, . . . , g^R_{k−1,k−1}⟩ and η_{2k+1} − π g^R_kk ∈ ⟨g^R_11, . . . , g^R_{k−1,k−1}⟩ in R[A, B].

Proof. The idea is to compare the change P(ρ ) in position along the positive u-
axis in one turn around the singularity to the change in the value of the function
Ψ expressed by (6.16), computing the change in Ψ by integrating its derivative
along solutions of (6.12a), which naturally generates the focus quantities according
to (6.17). In reality, Ψ is defined for (x, y) ∈ C2 , but we evaluate it on (x, x̄), the
invariant plane that contains the phase portrait of (6.11); see Exercise 3.7. Since the
system on C2 to which Ψ and the focus quantities gkk pertain is the complexifica-
tion of a real system, it satisfies b = ā, so the focus quantities g_kk are actually the quantities g^R_kk. Since Ψ might not converge, hence not actually define a function, we work instead with the truncation of the series defining Ψ at some sufficiently high level,

Ψ_N(x, x̄) := x x̄ + ∑_{j+k=3}^{2N+1} v_{j−1,k−1} x^j x̄^k .

Fix an initial point on the positive u-axis with polar coordinates (r, ϕ ) = (ρ , 0) and
complex coordinate x = u + iv = ρ + i0 = ρ . In one turn about the singularity, time
increases by some amount τ = τ (ρ ) and the change in ΨN is
ΔΨ_N(ρ, ρ) = ∫_0^τ d/dt [Ψ_N(x(t), x̄(t))] dt
           = ∫_0^τ [ ∑_{k=1}^{N} g^R_kk (x(t) x̄(t))^{k+1} + o(|x(t)|^{2N+2}) ] dt
           = ∫_0^τ [ ∑_{k=1}^{N} g^R_kk |x(t)|^{2k+2} + o(|x(t)|^{2N+2}) ] dt.
Now change the variable of integration from t to the polar angle ϕ. By (3.9) and (3.12), keeping in mind that we have rescaled time to make (α, β) = (0, 1), we have |x(t)| = r(t) = ρ + w_2(ϕ)ρ^2 + w_3(ϕ)ρ^3 + · · · . By the second equation of (3.5), dϕ/dt = 1 + ∑_{k=1}^{∞} u_k(ϕ) r^k, so that
dt = dϕ / ( 1 + ∑_{k=1}^{∞} u_k(ϕ) [ρ + w_2(ϕ)ρ^2 + · · ·]^k ) = (1 + ũ_1(ϕ)ρ + ũ_2(ϕ)ρ^2 + · · ·) dϕ .

Thus

ΔΨ_N(ρ, ρ) = ∫_0^{2π} ∑_{k=1}^{N} g^R_kk (ρ + w_2(ϕ)ρ^2 + · · ·)^{2k+2} (1 + ũ_1(ϕ)ρ + ũ_2(ϕ)ρ^2 + · · ·) dϕ + o(ρ^{2N+2})
           = ∑_{k=1}^{N} [ 2π g^R_kk ρ^{2k+2} + g^R_kk ( f_{k,1} ρ^{2k+3} + f_{k,2} ρ^{2k+4} + · · · ) ] + o(ρ^{2N+2}).

Turning our attention to Δρ, for any value of ρ > 0 we have a positive real number ξ defined by the function ξ = f(ρ) = Ψ(ρ, ρ) = ρ^2 + V_3 ρ^3 + V_4 ρ^4 + · · · , which has an inverse ρ = g(ξ). By Taylor’s Theorem with remainder, there exists ξ̃ between ξ and ξ + ε such that g(ξ + ε) = g(ξ) + g′(ξ)ε + (1/2!) g″(ξ̃) ε^2. Let ρ̃ = g(ξ̃). Using the formulas for the first and second derivatives of g as the inverse of f and inverting the series obtained, we have that for ε = ΔΨ_N,
 
Δρ = [ 1/(2ρ + 3V_3 ρ^2 + · · ·) ] ΔΨ_N − [ (2 + 6V_3 ρ̃ + · · ·) / (2! (2ρ̃ + 3V_3 ρ̃^2 + · · ·)^3) ] (ΔΨ_N)^2

   = ( 1/(2ρ) + c_0 + c_1 ρ + c_2 ρ^2 + · · · )
     × ( ∑_{k=1}^{N} [ 2π g^R_kk ρ^{2k+2} + g^R_kk ( f_{k,1} ρ^{2k+3} + · · · ) ] + o(ρ^{2N+2}) )

   + ( − 1/(8ρ̃^3) + d_{−2}/ρ̃^2 + d_{−1}/ρ̃ + d_0 + · · · )
     × ( ∑_{k=1}^{N} (g^R_kk)^2 ( d_{k,1} ρ^{4k+4} + · · · ) + o(ρ^{2N+4}) ) .

Since ΔΨ_N is of order four or higher in ρ, it is apparent that ρ̃ is of order ρ. Thus

Δρ = ∑_{k=1}^{N} [ π g^R_kk ρ^{2k+1} + g^R_kk ( f̃_{k,1} ρ^{2k+2} + f̃_{k,2} ρ^{2k+3} + · · · ) ] + o(ρ^{2N+1}).    (6.19)

As indicated just after Definition 3.1.3, η_1 = e^{2πα/β} − 1, so the hypothesis λ = 0 implies that η_1 = 0. Then by Proposition 3.1.4, η_2 = 0. This gives the first two conclusions of the proposition. Since Δρ = P(ρ), (6.19) then reads

η_3 ρ^3 + η_4 ρ^4 + η_5 ρ^5 + · · · = π g^R_11 ρ^3 + g^R_11 ( f̃_{1,1} ρ^4 + f̃_{1,2} ρ^5 + · · · )
                                   + π g^R_22 ρ^5 + g^R_22 ( f̃_{2,1} ρ^6 + f̃_{2,2} ρ^7 + · · · )
                                   + π g^R_33 ρ^7 + g^R_33 ( f̃_{3,1} ρ^8 + f̃_{3,2} ρ^9 + · · · )
                                   + · · ·
                                   + π g^R_NN ρ^{2N+1} + g^R_NN ( f̃_{N,1} ρ^{2N+2} + f̃_{N,2} ρ^{2N+3} + · · · )
                                   + o(ρ^{2N+1}).

Thus η_3 = π g^R_11, and given k ∈ N, the choice N = k shows that the last pair of assertions of the proposition holds for η_4 through η_{2k+1}. □

Corollary 6.2.4. Let η_k be the Lyapunov quantities for system (6.12a) with regard to the antisaddle at the origin, let g_kk be the focus quantities for the complexification (6.15) of (6.12a), and let g^R_kk denote the polynomial functions defined by (6.18). Then

⟨g^R_11, g^R_22, g^R_33, . . . ⟩ = ⟨η_1, η_2, η_3, . . . ⟩ = ⟨η_3, η_5, η_7, . . . ⟩

in R[A, B]. For any (A∗, B∗) ∈ E(A, B), the corresponding equalities hold true for the corresponding germs and their ideals in G_{(A∗,B∗)}.

Proof. The proof is left to the reader as Exercise 6.10. □

The second corollary to the theorem is stated only for germs, since that is the context in which it will be used later.

Corollary 6.2.5. Let η_k be the Lyapunov quantities for system (6.12a) with regard to the antisaddle at the origin, let g_kk be the focus quantities for the complexification (6.15) of (6.12a), and let g^R_kk denote the polynomial function defined by (6.18). Let I = ⟨η_{2k+1} : k ∈ N⟩ = ⟨g_kk : k ∈ N⟩ ⊂ G_{(A∗,B∗)}. Suppose {η_{k_1}, . . . , η_{k_m}} and {g_{j_1,j_1}, . . . , g_{j_n,j_n}} are the minimal bases for I with respect to the retention condition with respect to the ordered sets {η_3, η_5, η_7, . . .} and {g_11, g_22, . . .}, respectively. Then m = n and for q = 1, 2, . . . , m, k_q = 2 j_q + 1.

Proof. The proof is left to the reader as Exercise 6.11. □
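The minimal basis with respect to the retention condition can be extracted by a simple walk down the ordered list of quantities, retaining a quantity exactly when it does not already lie in the ideal generated by the quantities retained so far. Ideal membership in a ring of germs is hard in general, but for a toy ordered sequence of monomials (our illustration, not quantities from an actual system) membership reduces to divisibility by a retained generator:

```python
def divides(m, n):
    """Monomial m = (e1, ..., ek) divides n iff it does so componentwise."""
    return all(a <= b for a, b in zip(m, n))

def retention_minimal_basis(monomials):
    """Walk the ordered list; keep a monomial iff it is not in the ideal
    generated by the monomials kept so far (membership = divisibility).
    Every skipped element then lies in the ideal of its retained
    predecessors, which is exactly the retention condition."""
    kept = []
    for m in monomials:
        if not any(divides(g, m) for g in kept):
            kept.append(m)
    return kept

# Toy ordered sequence of "quantities" in two parameters:
seq = [(2, 0), (2, 1), (3, 0), (0, 3), (1, 1)]
basis = retention_minimal_basis(seq)
assert basis == [(2, 0), (0, 3), (1, 1)]
```

For actual focus or Lyapunov quantities the membership test would instead require a Gröbner-basis normal-form computation, but the greedy structure of the walk is the same.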

One further consequence of Theorem 6.2.3, which is of independent interest, is that a fine focus of a planar polynomial system is of order k if and only if the first k − 1 focus quantities vanish but the kth does not. The precise statement is given in the following proposition.

Proposition 6.2.6. Let η_k be the Lyapunov quantities for system (6.12a) with regard to the antisaddle at the origin, let g_kk be the focus quantities for the complexification (6.15) of (6.12a), and let g^R_kk denote the polynomial function defined by (6.18). The system corresponding to a specific choice of the real parameters (A, B) has a kth-order fine focus at the origin if and only if g^R_11(A, B) = · · · = g^R_{k−1,k−1}(A, B) = 0 but g^R_kk(A, B) ≠ 0.

Proof. The proof is left to the reader as Exercise 6.12. □


Before continuing with our development of the theory that enables us to estimate
the cyclicity of a center in a family (6.12), we pause for a moment to show how
Theorem 6.2.3 and Proposition 6.2.6 combine to estimate the cyclicity of a kth-
order fine focus in such a family. The bound is universal, in the sense that whatever
the nature of the nonlinearities, say perhaps maximal degree two or maximal degree
three, a kth-order fine focus has cyclicity at most k.
Theorem 6.2.7. A fine focus of order k has cyclicity at most k − 1 for perturbation
within family (6.12) and at most k for perturbation within family (6.11).
Proof. By Theorem 6.2.3, for any system of the form (6.12), the difference function P may be written

P(ρ) = π g^R_11 ρ^3 + h_41 g^R_11 ρ^4
     + (h_51 g^R_11 + π g^R_22) ρ^5 + (h_61 g^R_11 + h_62 g^R_22) ρ^6
     + · · ·
     + (h_{2k−1,1} g^R_11 + · · · + π g^R_{k−1,k−1}) ρ^{2k−1} + (h_{2k,1} g^R_11 + · · · + h_{2k,k} g^R_{k−1,k−1}) ρ^{2k}
     + (h_{2k+1,1} g^R_11 + · · · + π g^R_kk) ρ^{2k+1} + η_{2k+2} ρ^{2k+2} + η_{2k+3} ρ^{2k+3} + · · · .

Suppose a system of the form (6.12) corresponding to parameter value (A∗, B∗) has a fine focus of order k at the origin, so that by Proposition 6.2.6, g^R_11 through g^R_{k−1,k−1} vanish at (A∗, B∗) but g^R_kk(A∗, B∗) ≠ 0. Then because g^R_kk is nonzero on a neighborhood of (A∗, B∗), when we factor out πρ^3 and collect on the g^R_jj (but for simplicity keep the same names for the polynomial weighting functions), we may write

P(ρ) = πρ^3 [ g^R_11 (1 + h_41 ρ + · · · + h_{2k+1,1} ρ^{2k−2})
            + g^R_22 (1 + h_62 ρ + · · · + h_{2k+1,2} ρ^{2k−4}) ρ^2
            + g^R_33 (1 + h_83 ρ + · · · + h_{2k+1,3} ρ^{2k−6}) ρ^4
            + · · ·
            + g^R_{k−1,k−1} (1 + h_{2k,k−1} ρ + h_{2k+1,k−1} ρ^2) ρ^{2k−4}
            + g^R_kk ρ^{2k−2} + η_{2k+2} ρ^{2k−1} + η_{2k+3} ρ^{2k} + · · · ]

     = πρ^3 [ g^R_11 (1 + ψ_1(ρ)) + g^R_22 ρ^2 (1 + ψ_2(ρ)) + · · · + g^R_kk ρ^{2k−2} (1 + ψ_k(ρ)) ] ,

valid on a neighborhood of (A∗, B∗). If a perturbation is made within family (6.12), then by Proposition 6.1.2, P has at most k − 1 isolated zeros in a small interval 0 < ρ < ε.
For the remainder of the proof, which treats the case that the perturbation is made within family (6.11), see Exercise 6.14. □
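To see the count in Theorem 6.2.7 realized, one can evaluate a model difference function in the factored form just derived, with made-up numerical values for g^R_11, g^R_22, g^R_33 (our toy, not quantities computed from an actual system) chosen small and alternating in sign against a fixed g^R_33. For k = 3 this produces the maximal k − 1 = 2 small positive zeros permitted by Proposition 6.1.2:

```python
from math import pi

def P(rho, g11, g22, g33):
    """Model difference function pi*rho^3*(g11 + g22*rho^2 + g33*rho^4),
    i.e. the factored form above with all psi_q set to 0."""
    return pi * rho**3 * (g11 + g22 * rho**2 + g33 * rho**4)

def count_small_zeros(g11, g22, g33, rho_max=0.5, n=5000):
    """Count sign changes of P on a grid in (0, rho_max]."""
    grid = [rho_max * i / n for i in range(1, n + 1)]
    vals = [P(r, g11, g22, g33) for r in grid]
    return sum(1 for u, v in zip(vals, vals[1:]) if u * v < 0)

# Unperturbed order-3 fine focus: g11 = g22 = 0, g33 != 0 -> no small zeros.
assert count_small_zeros(0.0, 0.0, 1.0) == 0

# Small alternating perturbation: zeros at rho^2 = 0.05 and rho^2 = 0.15,
# the two small limit cycles bifurcating from the order-3 fine focus.
assert count_small_zeros(0.0075, -0.2, 1.0) == 2
```

A third small zero would require a sign change of η_1, i.e. a perturbation of λ away from 0 within family (6.11), which is how the bound rises from k − 1 to k.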
We have indicated earlier that we hope to use the focus quantities to treat the cyclicity problem. The focus quantities arise from the complexification of system (6.12), but bifurcations to produce limit cycles naturally take place in the larger
family (6.11). We have connected the focus quantities and their ideals to the Lya-
punov quantities in the restricted context of family (6.12). The next result shows
how the minimal basis with respect to the retention condition of the ideal generated
by the Lyapunov quantities for the restricted family (6.12) is related to the minimal
basis of the ideal generated by the Lyapunov quantities of the larger family (6.11)
(with the same indexing set S, of course). We will distinguish between the two sets
of Lyapunov quantities by using the notation ηk for those depending on just the
parameters (A, B) and ηk (λ ) for those depending on the parameters (λ , (A, B)), al-
though, of course, ηk (0, (A, B)) = ηk (A, B). Because the functions ηk (λ ) are not
polynomials in the parameters (λ , (A, B)), we must work in the ring of germs in
order to handle domains of convergence. Note also that we treat the ηk merely as
analytic functions in the first hypothesis of the theorem, although they are actually
polynomials in (A, B).

Lemma 6.2.8. Fix families (6.11) and (6.12) with the same indexing set S. Let {η_k(λ)}_{k=1}^{∞} be the Lyapunov quantities for family (6.11) and let {η_k}_{k=1}^{∞} be the Lyapunov quantities for family (6.12). Fix (A∗, B∗) in E(A, B) and suppose that the minimal basis with respect to the retention condition of the ideal ⟨η_1, η_2, . . . ⟩ in G_{(A∗,B∗)} is {η_{k_1}, . . . , η_{k_m}}, k_1 < · · · < k_m. Then {η_1(λ), η_{k_1}, . . . , η_{k_m}} is the minimal basis with respect to the retention condition with respect to the ordered set {η_1(λ), η_2(λ), η_3(λ), . . .} of the ideal ⟨η_1(λ), η_2(λ), η_3(λ), . . . ⟩ in G_{(0,(A∗,B∗))}.

Proof. The functions ηk (λ , (A, B)) are analytic in a neighborhood of (0, (A∗ , B∗ )),
hence by Abel’s Lemma their power series expansions converge absolutely there, so
we may rearrange the terms in these expansions. Thus, for any k ∈ N in a neighbor-
hood of (0, (A∗ , B∗ )), ηk (λ , (A, B)) can be written

ηk (λ , (A, B)) = η̆k (λ , (A, B)) + η̌k (A, B), (6.20)

where η̆k (0, (A, B)) ≡ 0. When λ = 0, ηk (λ , (A, B)) reduces to ηk (A, B), so it must
be the case that ηk (0, (A, B)) = 0 + η̌k (A, B) = ηk (A, B) and (6.20) becomes

ηk (λ , (A, B)) = η̆k (λ , (A, B)) + ηk (A, B). (6.21)

Since
η_1(λ, (A, B)) = e^{2πλ} − 1 = 2πλ (1 + (1/2!)(2πλ) + · · ·),
there exists a function uk (λ , (A, B)) that is real analytic on a neighborhood of
(0, (A∗ , B∗ )) in E(λ , (A, B)) such that

η̆k (λ , (A, B)) = uk (λ , (A, B))η1 (λ , (A, B)).

Thus (6.21) becomes, suppressing the (A, B) dependence in the notation,

ηk (λ ) = uk (λ )η1 (λ ) + ηk . (6.22)

Let L denote the set {η_{k_1}, . . . , η_{k_m}}. Because L is the minimal basis with respect to the retention condition of the ideal ⟨η_k : k ∈ N⟩ in G_{(A∗,B∗)}, (6.22) implies that for all k ∈ N, the identity

ηk (λ , (A, B))
= uk (λ , (A, B))η1 (λ , (A, B)) + hk,1(A, B)ηk1 (A, B) + · · · + hk,m (A, B)ηkm (A, B)

holds on a neighborhood of (0, (A∗ , B∗ )) in E(λ , (A∗ , B∗ )) for functions hk,q that are
defined and real analytic on that neighborhood, albeit without λ dependence. The
same equation is therefore true at the level of germs in G(0,(A∗ ,B∗ )) . Thus

M = {η 1 (λ ), η k1 , . . . , η km }

is a basis of the ideal ⟨η_1(λ), η_2(λ), . . . ⟩ ⊂ G_{(0,(A∗,B∗))}. We must show that it is minimal among all bases that satisfy the retention condition with respect to the set {η_1(λ), η_2(λ), . . .}. Hence let

N = {η 1 (λ ), η j1 (λ ), . . . , η jn (λ )}

be the unique minimal basis with respect to the retention condition (which must
contain η 1 (λ ), since η 1 (λ ) is first on the list and is not 0), with the labelling chosen
so that j1 < · · · < jn , and suppose, contrary to what we wish to show, that it is not
the basis M. There are four ways this can happen, which we treat in turn.
Case 1: There exists p ∈ {1, 2, . . . , min{m, n}} such that for q ∈ {1, 2, . . . , p − 1}, k_q = j_q and η_{k_q} = η_{j_q}(λ), but η_{k_p} ≠ η_{j_p}(λ) and j_p < k_p.
Then k p−1 = j p−1 < j p < k p , so because L is minimal η j p = h1 η k1 +· · ·+h p−1 η k p−1
for h1 , . . . , h p−1 ∈ G(A∗ ,B∗ ) . Applying the corresponding equality of functions that
holds on a neighborhood of (A∗ , B∗ ) to (6.22) implies that

η j p (λ ) = u j p (λ )η1 (λ ) + η j p
= u j p (λ )η1 (λ ) + h1ηk1 + · · · + h p−1ηk p−1
= u j p (λ )η1 (λ ) + h1η j1 (λ ) + · · · + h p−1η j p−1 (λ )

is valid on a neighborhood of (0, (A, B)) in E(λ , (A, B)) (although hq is independent
of λ ), so the corresponding equality of germs contradicts the fact that N is minimal.
Case 2: There exists p ∈ {1, 2, . . . , min{m, n}} such that for q ∈ {1, 2, . . . , p − 1}, k_q = j_q and η_{k_q} = η_{j_q}(λ), but η_{k_p} ≠ η_{j_p}(λ) and j_p > k_p.
Then j_{p−1} = k_{p−1} < k_p < j_p, so η_{k_p}(λ) ∉ N, hence, because N is minimal,

η_{k_p}(λ) = h_0 η_1(λ) + h_1 η_{j_1}(λ) + · · · + h_{p−1} η_{j_{p−1}}(λ)
           = h_0 η_1(λ) + h_1 η_{k_1} + · · · + h_{p−1} η_{k_{p−1}}

for h_1, . . . , h_{p−1} ∈ G_{(0,(A∗,B∗))}. The corresponding equality of functions that holds on a neighborhood of (0, (A∗, B∗)) in E(λ, (A, B)), when evaluated at λ = 0, implies that η_{k_p} = h̃_1 η_{k_1} + · · · + h̃_{p−1} η_{k_{p−1}} on a neighborhood of (A∗, B∗) in E(A, B), where for q = 1, . . . , p − 1, h̃_q(A, B) = h_q(0, (A, B)). The corresponding equality of germs in G_{(A∗,B∗)} contradicts the fact that L is minimal.
Case 3: n < m and for q ∈ {1, . . . , n}: k_q = j_q and η_{k_q} = η_{j_q}(λ).
Then k_m > j_n and η_{k_m}(λ) ∉ N, so

η_{k_m}(λ) = h_0 η_1(λ) + h_1 η_{j_1}(λ) + · · · + h_n η_{j_n}(λ) = h_0 η_1(λ) + h_1 η_{k_1} + · · · + h_n η_{k_n}

for h_1, . . . , h_n ∈ G_{(0,(A∗,B∗))}. Since k_m > k_n, the corresponding equality of functions that holds on a neighborhood of (0, (A∗, B∗)) in E(λ, (A, B)), when evaluated at λ = 0, gives the same contradiction as in the previous case.
Case 4: n > m and for q ∈ {1, . . . , m}: k_q = j_q and η_{k_q} = η_{j_q}(λ).
Then j_n > k_m and η_{j_n} ∉ L, so

η_{j_n} = h_1 η_{k_1} + · · · + h_m η_{k_m} = h_1 η_{j_1}(λ) + · · · + h_m η_{j_m}(λ)

in G_{(0,(A∗,B∗))} (although h_q has no λ dependence), so an application of the corresponding equality of functions that holds on a neighborhood of (0, (A∗, B∗)) to (6.22) implies that

η_{j_n}(λ) = u_{j_n}(λ) η_1(λ) + η_{j_n} = u_{j_n}(λ) η_1(λ) + h_1 η_{j_1}(λ) + · · · + h_m η_{j_m}(λ)

on a neighborhood of (0, (A, B)) in E(λ, (A, B)). Thus, because j_n > j_m, the corresponding equality of germs contradicts the fact that N is minimal. □

Theorem 6.2.9. Suppose that for (A∗, B∗) ∈ E(A, B), the minimal basis M with respect to the retention condition of the ideal J = ⟨g^R_11, g^R_22, . . .⟩ in G_{(A∗,B∗)} for the corresponding system of the form (6.12) consists of m polynomials. Then the cyclicity of the origin of the system of the form (6.11) that corresponds to the parameter string (0, (A∗, B∗)) ∈ E(λ, (A, B)) is at most m.

Proof. As stated in the discussion surrounding (6.14), the cyclicity of the origin of an element of family (6.11) with respect to the parameter space E(λ, (A, B)) is equal to the multiplicity of the function

P(ρ) = η_1(λ) ρ + η_2(λ) ρ^2 + η_3(λ) ρ^3 + · · · .

By the hypothesis and Corollary 6.2.5, the minimal basis with respect to the retention condition of the ideal ⟨η_3, η_5, η_7, . . .⟩ in G_{(A∗,B∗)} has m elements, hence, by Lemma 6.2.8, in G_{(0,(A∗,B∗))} the minimal basis with respect to the retention condition of the ideal ⟨η_1(λ), η_2(λ), η_3(λ), . . .⟩ has m + 1 elements. Then by Theorem 6.1.7 the multiplicity of the function P(ρ) is at most m. □

The following corollary to the theorem is the result that connects the cyclic-
ity of the simple antisaddle at the origin of system (6.11) (perturbation within
E(λ , (A, B))) to the focus quantities of the complexification of the companion sys-
tem (6.12).

Corollary 6.2.10. Fix a family of real systems of the form (6.11), with parameter set E(λ, (A, B)) or E(λ, a). For the associated family (6.12) and parameter set E(A, B) or E(a) consider the complexification (6.15),

ẋ = i ( x − ∑_{(p,q)∈S} a_pq x^{p+1} y^q ) ,    ẏ = −i ( y − ∑_{(p,q)∈S} b_qp x^q y^{p+1} ) ,    (6.23)

and the associated focus quantities {g_kk}_{k=1}^{∞} ⊂ C[a, b]. Suppose {g_{k_1,k_1}, . . . , g_{k_m,k_m}} is a collection of focus quantities for (6.23) having the following properties, where we set K = {k_1, . . . , k_m}:
(a) g_kk = 0 for 1 ≤ k < k_1;
(b) g_{k_q,k_q} ≠ 0 for k_q ∈ K;
(c) for k_q ∈ K, q > 1, and k ∈ N satisfying k_{q−1} < k < k_q, g_kk ∈ B_{k_{q−1}}, the ideal in C[a, b] generated by the first k_{q−1} focus quantities;
(d) the ideal J = ⟨g_{k_1,k_1}, . . . , g_{k_m,k_m}⟩ in C[a, b] is radical; and
(e) V(J) = V(B), where B is the Bautin ideal ⟨g_kk : k ∈ N⟩ in C[a, b].
Then the cyclicity of the singularity at the origin of any system in family (6.11), with
respect to the parameter space E(λ , (A, B)), is at most m.

Proof. We first note that since g_kk(a, ā) = g^R_kk(A(a, ā), B(a, ā)) ∈ R for all k ∈ N, the observation made at the end of the proof of Lemma 6.1.6 implies that for any collection {j_1, . . . , j_n} ⊂ N, any k ∈ N, and any f_1, . . . , f_n ∈ C[a, b],

g_kk = f_1 g_{j_1,j_1} + · · · + f_n g_{j_n,j_n}    implies    g^R_kk = (Re f_1) g^R_{j_1,j_1} + · · · + (Re f_n) g^R_{j_n,j_n}.    (6.24)

Since V(J) = V(B), for any k ∈ N the kth focus quantity g_kk vanishes on V(J), so g_kk ∈ I(V(J)) (Definition 1.1.10). But because J is a radical ideal, by the Strong Hilbert Nullstellensatz (Theorem 1.3.14), I(V(J)) = J, hence g_kk ∈ J. Thus B ⊂ J, so J = B, and by the additional hypotheses on the set {g_{k_1,k_1}, . . . , g_{k_m,k_m}} it is the minimal basis with respect to the retention condition of the Bautin ideal B with respect to the set {g_11, g_22, . . .}. Thus, for any k ∈ N, there exist f_{k,1}, . . . , f_{k,m} ∈ C[a, b] such that g_kk = f_{k,1} g_{k_1,k_1} + · · · + f_{k,m} g_{k_m,k_m}. By (6.24), for any (A∗, B∗) in E(A, B), L := {g^R_{k_1,k_1}, . . . , g^R_{k_m,k_m}} is then a basis of the ideal I = ⟨g^R_kk : k ∈ N⟩ in G_{(A∗,B∗)}. Clearly hypothesis (a) implies that for k < k_1, g^R_kk = 0 in G_{(A∗,B∗)}. By hypothesis (c) and (6.24), for any k_q ∈ K, q > 1, and any k ∈ N satisfying k_1 < k < k_q, g^R_kk ∈ ⟨g^R_11, . . . , g^R_{k_{q−1},k_{q−1}}⟩ in G_{(A∗,B∗)}. Thus it is apparent that even if L is not the minimal basis with respect to the retention condition M of the ideal I = ⟨g^R_11, g^R_22, . . .⟩ in G_{(A∗,B∗)} (because of possible collapsing of g^R_{k_q,k_q} to g^R_{k_q,k_q} = 0), it nevertheless contains M, which therefore can have at most m elements. The conclusion of the corollary thus follows from Theorem 6.2.9. □

6.3 The Cyclicity of Quadratic Systems and a Family of Cubic Systems

The final result in the previous section, Corollary 6.2.10, tells us that if the Bautin
ideal B = ⟨g11, g22, . . .⟩ corresponding to the complexification of system (6.12) is
radical, then the cardinality of the minimal basis of B with respect to the retention
condition is an upper bound on the cyclicity of the simple singularity at the origin
of system (6.11), with respect to the parameter set E(λ, (A, B)), the parameter set
corresponding to family (6.11). This result enables us to derive a version of Bautin's
Theorem on the cyclicity of quadratic foci and centers. A simple but important result
needed for the proof is the fact that any focus or center of a quadratic system is
simple. Before we state and prove a lemma to that effect, however, we first state one
more property of foci of real quadratic systems that is a consequence of previous
results, has not been mentioned so far, and will nicely complement the facts about
quadratic foci and centers given in this section.

Proposition 6.3.1. A fine focus of a real quadratic system of differential equations
is of order at most three.

Proof. Consider any quadratic system on R² with a singularity at which the eigen-
values of the linear part are nonzero pure imaginary numbers. After a translation of
coordinates to move the focus to the origin, a nonsingular linear transformation, and
a time rescaling, the complexification of the system has the form (3.129) with some
specific set (a∗, b∗) of coefficients. If g11, g22, and g33 are all zero at (a∗, b∗), which
holds if and only if gR11, gR22, and gR33 are all zero at the corresponding original real
parameter string (A(a∗, b∗), B(a∗, b∗)), then by Theorem 3.7.1, (a∗, b∗) ∈ V(B3) = VC,
so the singularity is a center, not a focus. Thus at most the first two focus quantities
can vanish at a fine focus of a quadratic system. But then Proposition 6.2.6 implies
that the focus is of order at most three. □

We now continue our work on the cyclicity of quadratic systems, beginning with
the following important fact.

Lemma 6.3.2. Any focus or center of a quadratic system is simple.

Proof. We prove the contrapositive, hence consider a quadratic system, which we
write as u̇ = f(u), with a nonsimple isolated singularity at u0 ∈ R². If Tr df(u0) ≠ 0,
then by Theorem 65 of §21 of [12], the singularity is a node, saddle, or saddle-node.
If Tr df(u0) = 0, then by a translation to move the singularity to the origin, followed
by an invertible linear transformation and a time rescaling, we may place the system
in one of the two forms

    u̇ = P2(u, v),       v̇ = Q2(u, v)      (6.25)

or

    u̇ = v + P2(u, v),   v̇ = Q2(u, v),     (6.26)

where P2 is either zero or a homogeneous polynomial of degree two, Q2 is either
zero or a homogeneous polynomial of degree two, and max{deg P2, deg Q2} = 2. In
the case of (6.25), any line that is composed of solutions of the cubic homogeneous
equation uQ2(u, v) − vP2(u, v) = 0, of which there is at least one, is an invariant
line through the origin in the phase portrait of (6.25), so u0 is not a focus or center.
(This can be seen by writing the system in polar coordinates.) In the case of (6.26),
by the Implicit Function Theorem, the equation v + P2(u, v) = 0 defines an analytic
function v = ϕ(u) = α2u² + α3u³ + · · · . Write Q2(u, v) = au² + buv + cv² and define
ψ(u) := Q2(u, ϕ(u)) = au² + · · · . If a = 0, then v factors out of the v̇ equation, so
the line v = 0 is an invariant line through the origin in the phase portrait of u̇ = f(u),
so u0 is not a focus or a center. If a ≠ 0, then by Theorem 67 in §22 of [12] (or the
simplified scheme described later in that section), the singularity either is a saddle-
node or has exactly two hyperbolic sectors, hence is not a focus or a center. □
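The first case can be illustrated concretely: the invariant lines through the origin of (6.25) are the linear factors of the cubic uQ2 − vP2, and a real homogeneous cubic always has at least one real linear factor. A minimal sympy sketch, with an arbitrarily chosen quadratic pair P2, Q2 (illustrative only, not taken from the text):

```python
import sympy as sp

u, v = sp.symbols('u v')

# an arbitrary homogeneous quadratic pair (illustrative choice)
P2 = v**2
Q2 = u**2

# the cubic whose real linear factors give invariant lines through the origin
cubic = sp.expand(u*Q2 - v*P2)          # u**3 - v**3
factors = sp.factor_list(cubic)[1]

# pick out the linear factors; here the line v = u is invariant
linear = [f for f, mult in factors if sp.Poly(f, u, v).total_degree() == 1]
print(linear)
```

Writing the system in polar coordinates shows that the ray along such a factor consists of orbits, so no orbit can spiral around the origin.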

Theorem 6.3.3. The cyclicity of any center or focus in a quadratic system is at most
three. There exist both foci and centers with cyclicity three in quadratic systems.

Proof. Let a quadratic system with a singularity that is a focus or a center be given.
By Lemma 6.3.2, the system can be written in the form (6.11a), whose complex
form (6.11b) in this case is

    ẋ = λx + i(x − a10x² − a01xx̄ − a−12x̄²).      (6.27)

The corresponding system with λ = 0 has complexification (3.129), or equivalently
(3.131), but with b10 = ā01, b01 = ā10, and b2,−1 = ā−12, since it arises from a real
family. We will work with the complex parameters a10, a01, and a−12 rather than the
original real parameters, since it is easier to do so. This is permissible, and all the
estimates mentioned below continue to hold for the real parameters, since there is a
linear isomorphism between the two parameter sets.
The first three focus quantities for a general complex quadratic system (3.131)
(not necessarily the complexification of a real system) are listed in (3.133), (3.134),
and (3.135). When we examine the five hypotheses of Corollary 6.2.10 in regard
to the collection {g11, g22, g33}, hypothesis (a) holds vacuously, (b) and (c) hold
by inspection, and (e) is (3.137), which we established in the proof of Theorem
3.7.1. Thus the upper bound of three on the cyclicity will follow from the corol-
lary if we can show that B3 = ⟨g11, g22, g33⟩ is a radical ideal in C[a, b]. One way
to do so is to appeal to one of the advanced algorithms that are implemented in
such special-purpose computer algebra systems as Macaulay or Singular (see Ex-
ercise 6.15). We will follow a different approach using the results derived in ear-
lier chapters, which has the advantage of illustrating techniques for studying
polynomial ideals. We consider the four ideals J1, J2, J3, and J4 listed in Theorem
3.7.1. Since for any two ideals I and J, V(I ∩ J) = V(I) ∪ V(J) (Proposition 1.3.18),
the identity V(B3) = V(J1) ∪ V(J2) ∪ V(J3) ∪ V(J4) (Theorem 3.7.1) and (3.137)
suggest that perhaps B3 = J1 ∩ J2 ∩ J3 ∩ J4. If this is so, and if each ideal Jj is
prime, then because an intersection of prime ideals is radical (Proposition 1.4.4),
it follows that B3 is radical. We will verify both of these conjectures.

We know already that the ideal J4 is prime because it is Isym (third paragraph of
the proof of Theorem 3.7.1), which by Theorem 3.5.9 is always prime. For another
approach to showing that J1 and J2 are prime, see the proof of Proposition 6.3.4.
Here we will prove that they are prime using Gröbner Bases and the Multivariable
Division Algorithm.
J1 is prime. Under the ordering a10 > a01 > a−12 > b2,−1 > b10 > b01, the gener-
ators listed for J1 in Theorem 3.7.1 form a Gröbner basis G for J1 with respect to lex.
Let any f, g ∈ C[a, b] = C[a10, a01, a−12, b2,−1, b10, b01] be given. Then, applying the
Multivariable Division Algorithm, they can be written

    f = f1(2a10 − b10) + f2(2b01 − a01) + r1
    g = g1(2a10 − b10) + g2(2b01 − a01) + r2,

where f1, f2, g1, g2, r1, and r2 are in C[a, b] and r1 and r2 are reduced with respect to
G, so that for each j ∈ {1, 2}, either rj = 0 or rj ≠ 0 but no monomial in rj is divis-
ible by LT(2a10 − b10) = 2a10 or by LT(2b01 − a01) = −a01. Suppose neither f nor
g is in J1, so that by the solution of the Ideal Membership Problem given on page 18
of Section 1.2, rj ≠ 0, j = 1, 2. Recall the notation [ν] = a10^ν1 a01^ν2 a−12^ν3 b2,−1^ν4 b10^ν5 b01^ν6 for
ν ∈ N⁶₀. Setting Sj = Supp(rj), rj can then be written in the form rj = ∑ν∈Sj rj^(ν)[ν],
where because r1 and r2 are reduced, ν ∈ S1 ∪ S2 implies that ν1 = ν2 = 0. Clearly
the product r1r2 has the same form; hence, because fg has the form h + r1r2 for
some h ∈ J1, r1r2 is the remainder of the product fg upon division by G. Since
r1r2 ≠ 0, fg ∉ J1, so J1 is prime.
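The division step is easy to reproduce with a general-purpose system. A sketch in sympy (the polynomials f and g below are arbitrary illustrative choices, not taken from the text) divides by the two generators of J1 under lex with a10 > a01 > a−12 > b2,−1 > b10 > b01 and confirms the key property used above: the remainders contain no a10 or a01 (that is, ν1 = ν2 = 0 on their supports).

```python
import sympy as sp

a10, a01, am12, b2m1, b10, b01 = sp.symbols('a10 a01 am12 b2m1 b10 b01')
gens = (a10, a01, am12, b2m1, b10, b01)   # ordering a10 > a01 > ... > b01

G = [2*a10 - b10, 2*b01 - a01]            # the generators of J1

# hypothetical polynomials not lying in J1 (illustrative only)
f = a10*b01 + am12*b10
g = a01*b2m1 + b10**2

for p in (f, g):
    quotients, r = sp.reduced(p, G, *gens, order='lex')
    # the remainder is reduced w.r.t. G, hence free of a10 and a01
    assert a10 not in r.free_symbols and a01 not in r.free_symbols
    print(r)
```

Since every monomial containing a10 is divisible by LT(2a10 − b10) and every monomial containing a01 is divisible by LT(2b01 − a01), the assertion holds for any input polynomial, which is exactly why the product r1r2 of two nonzero remainders is again a nonzero remainder.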
J2 is prime. The same argument as for J1 applies.
J3 is prime. Ordering the coefficients a10 > a01 > a−12 > b2,−1 > b10 > b01,
the set G = {2a01 + b01, a10 + 2b10, 2a−12b2,−1 + b10b01} is a Gröbner basis for J3
with respect to lex. First observe that the ideal J = ⟨2a−12b2,−1 + b10b01⟩ is prime.
There are two ways to see this, both based on the fact that 2a−12b2,−1 + b10b01 is
irreducible. In general terms, C[a, b] is a unique factorization domain, and any ideal
in a unique factorization domain that is generated by a prime (here an irreducible
polynomial) is a prime ideal. In simpler terms, if h1h2 = f · (2a−12b2,−1 + b10b01),
then one or the other of h1 and h2 must be divisible by 2a−12b2,−1 + b10b01, since
it is impossible that each contribute a factor.
Let any f, g ∈ C[a, b] be given, and write them as

    f = f1(2a01 + b01) + f2(a10 + 2b10) + f3(2a−12b2,−1 + b10b01) + r1
    g = g1(2a01 + b01) + g2(a10 + 2b10) + g3(2a−12b2,−1 + b10b01) + r2,

where r1 and r2 are reduced with respect to G. Suppose that neither f nor g is in J3.
Then for j = 1, 2, rj ≠ 0, rj ∉ J, and rj has the form rj = ∑ν∈Sj rj^(ν)[ν] in which ν in
S1 ∪ S2 implies that ν1 = ν2 = ν3ν4 = 0. The product of f and g can be written in the
form fg = h1(2a01 + b01) + h2(a10 + 2b10) + h3(2a−12b2,−1 + b10b01) + r1r2. From
the form of r1 and r2, neither 2a01 = LT(2a01 + b01) nor a10 = LT(a10 + 2b10) divides
r1r2, so r1r2 reduces to zero modulo G only if 2a−12b2,−1 + b10b01 divides it. But
this is impossible, since it would imply that r1r2 ∈ J in the face of our assumption

that neither r1 nor r2 is in J and the preliminary observation that J is prime. Thus
f g is not in J3 , which is therefore prime.
To show that the ideals B3 and J1 ∩ J2 ∩ J3 ∩ J4 are the same, we order the coefficients
a10 > a01 > a−12 > b2,−1 > b10 > b01, compute the unique reduced Gröbner basis
of each with respect to lex under this ordering, and verify that they are identical.
The reduced Gröbner basis of B3 can be quickly computed by any general-purpose
symbolic manipulator; it is the six-element set shown in Table 6.1, where in order to
avoid fractions each (monic) polynomial in the basis except the quadratic one was
doubled before being listed. To find the reduced Gröbner basis of J1 ∩ J2 ∩ J3 ∩ J4,
we apply the algorithm given in Table 1.5 on page 37 for computing a generating set
for I ∩ J from generating sets of I and J three times: first for J1 ∩ J2, then for
(J1 ∩ J2) ∩ J3, and finally for (J1 ∩ J2 ∩ J3) ∩ J4. This, too, is easily accomplished
using a computer algebra system and yields the same collection of polynomials as for B3.

    a10a01 − b10b01

    2a01³b2,−1 − 2a−12b10³ + 3a10a−12b10² − 3a01²b2,−1b01 − 2a01b2,−1b01² + 2a10²a−12b10

    2a10a−12b10²b01 + 2a01⁴b2,−1 − 2a01a−12b10³ − 3a01³b2,−1b01 − 2a01²b2,−1b01² + 3a−12b10³b01

    2a−12b10³b01² + 2a01⁵b2,−1 − 2a01²a−12b10³ − 3a01⁴b2,−1b01 − 2a01³b2,−1b01² + 3a01a−12b10³b01

    2a10a−12²b2,−1b10² − a01⁴b2,−1b10 + a01³a−12b2,−1² + 2a01³b2,−1b10b01
        − 2a01²a−12b2,−1²b01 + a01a−12b10⁴ − a−12²b2,−1b10³ − 2a−12b10⁴b01

    4a−12²b2,−1b10³b01 − 4a01³a−12b2,−1²b01 − 2a01³b2,−1b10b01² + 2a01⁴a−12b2,−1²
        − 2a01a−12²b2,−1b10³ + a01⁴b2,−1b10b01 − a01a−12b10⁴b01 + 2a−12b10⁴b01²

Table 6.1 Reduced Gröbner basis of B3 for system (3.129). (All except the quadratic polynomial
were doubled before listing in order to eliminate fractional coefficients.)
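The three-fold intersection referred to above uses the standard elimination construction, which (assuming this is what the algorithm of Table 1.5 implements) computes I ∩ J as ⟨tI, (1 − t)J⟩ ∩ C[x] for a new variable t: take a lex Gröbner basis with t largest and keep the t-free elements. A minimal sympy sketch on a toy pair of ideals, not the Jj themselves:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')

def intersect(I, J, *gens):
    """Generators of <I> ∩ <J> by eliminating t from t*I + (1 - t)*J."""
    system = [t*f for f in I] + [(1 - t)*g for g in J]
    gb = sp.groebner(system, t, *gens, order='lex')   # t is the largest variable
    return [p for p in gb.exprs if t not in p.free_symbols]

# toy example: <x> ∩ <y> = <x*y>
print(intersect([x], [y], x, y))
```

Iterating the call three times, exactly as in the text, produces generators for a four-fold intersection such as J1 ∩ J2 ∩ J3 ∩ J4.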

Finally, we show that the bound on the cyclicity is sharp. We will denote a real
system by the single letter X (usually subscripted) and will continue to specify a real
system X by a complex triple (a10 , a01 , a−12 ). We will construct concrete quadratic
systems X0 and X1 , the first with a center of cyclicity three and the second with
a third-order fine focus of cyclicity three. To begin, we observe that if in the dis-
cussion in the first few paragraphs of Section 3.1, leading up to the definition of
the function P in (3.14), we regard the right-hand sides of system (3.4) (which
we have now replaced by the simpler form (6.11a)) as analytic functions in u, v,
λ , and the coefficients of P and Q, then P is an analytic function of r0 and the
coefficients. Thus we will be able to freely rearrange series expressions for P and
to assert convergence of expansions of P in powers of r0 on specified intervals
under sufficiently small perturbation of λ and the remaining coefficients. Next we

observe that the equality V(B) = V(B3) of varieties (Theorem 3.7.1) implies the
equality √B = √B3 of ideals (Proposition 1.3.16). Hence, by the definition of the
radical of an ideal and the fact that B3 is radical, B3 ⊂ B ⊂ √B = √B3 = B3.
Thus ⟨gjj : j ∈ N⟩ = ⟨g11, g22, g33⟩. Combining this fact with Theorem 6.2.3, when
λ = 0 in (6.27), the difference function P of (3.14), whose zeros pick out cycles
surrounding the antisaddle at the origin, can be written

    P(ρ) = πg11ρ³ + h41g11ρ⁴
           + (h51g11 + πg22)ρ⁵ + (h61g11 + h62g22)ρ⁶
           + (h71g11 + h72g22 + πg33)ρ⁷ + (h81g11 + h82g22 + h83g33)ρ⁸
           + (h91g11 + h92g22 + h93g33)ρ⁹ + (h10,1g11 + h10,2g22 + h10,3g33)ρ¹⁰ + · · ·

for some hjk ∈ C[a, b]. Rearranging terms and rewriting, the right-hand side be-
comes

    πρ³[g11(1 + h̃41ρ + · · · ) + g22(1 + h̃62ρ + · · · )ρ² + g33(1 + h̃83ρ + · · · )ρ⁴].   (6.28)

Because the focus quantities are being computed from the complexification of a real
system (so that bjk = ākj), by (3.133)–(3.135) they are

    g11 = −i[a10a01 − ā01ā10] = 2 Im(a10a01)
    g22 = −i[a10ā01²a−12 − ā10a01²ā−12 − (2/3)(ā01³a−12 − a01³ā−12)
              − (2/3)(ā10²a01ā−12 − a10²ā01a−12)]
    g33 = (5/8)i[−a01ā01⁴a−12 + 2ā10ā01⁴a−12 + a01⁴ā01ā−12 − 2ā10a01³ā01ā−12
              − 2a10ā01²a−12²ā−12 + ā01³a−12²ā−12 − a01³a−12ā−12² + 2ā10a01²a−12ā−12²].

To simplify the discussion, we introduce a complex parameter c such that a10 = cā01.
Then g11 = 2|a01|² Im(c) and for c ∈ R we have

    g22 = (2/3)(2c − 1)(c + 2) Im(ā01³a−12),   g33 = (5/4)(2c − 1)(|a−12|² − |a01|²) Im(ā01³a−12).
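The identity g11 = 2|a01|² Im(c) follows directly from g11 = 2 Im(a10a01) together with a10 = cā01, since then a10a01 = c|a01|². A quick numerical sanity check, with arbitrarily chosen values of a01 and c (illustrative only):

```python
# arbitrary test values (illustrative only)
a01 = 1.3 - 0.7j
c = 0.4 + 2.1j
a10 = c * a01.conjugate()          # the substitution a10 = c * conj(a01)

g11 = 2 * (a10 * a01).imag         # g11 = 2 Im(a10 a01)

# compare against the closed form 2*|a01|**2 * Im(c)
assert abs(g11 - 2 * abs(a01)**2 * c.imag) < 1e-12
print(g11)
```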

Choose λ = 0, choose any pair of nonzero complex numbers a01 and a−12 sat-
isfying the two conditions (a) |a−12| − |a01| = 0 and (b) Im(ā01³a−12) > 0, and
choose a10 = −2ā01, corresponding to c = −2. Let X0 denote the corresponding
real quadratic system. Then g11 = g22 = g33 = 0, so X0 has a center at the origin. Let
ρ0 > 0 be such that the power series expansion of P in powers of ρ converges on
the interval (−4ρ0, 4ρ0) and the disk D0 of radius 4ρ0 about the origin lies wholly
within the period annulus of the center of X0 at the origin.
Let positive real constants r and ε be given.
Make an arbitrarily small perturbation in a−12 so that convergence of P(ρ)
holds on the interval (−3ρ0, 3ρ0), condition (b) continues to hold, but now
|a−12|² − |a01|² < 0. Let X1 denote the corresponding real quadratic system, which
may be chosen so that its coefficients are arbitrarily close to those of X0. Then for X1
we have that g11 = g22 = 0 < g33 (since 2c − 1 < 0), so that the origin is a third-order

fine focus for X1 and is unstable (Remark 3.1.6 and Theorem 6.2.3). There are no
periodic orbits of X1 wholly contained in the disk D1 of radius 3ρ0 about the origin.
Now change a10 corresponding to the change in c from −2 to −2 + δ1 for
δ1 ∈ R⁺. For δ1 sufficiently small, convergence of P(ρ) holds on the interval
(−2ρ0, 2ρ0), g33 changes by an arbitrarily small amount, g11 remains zero, and g22
becomes negative but arbitrarily close to zero. Change a10 a second time, now cor-
responding to the change in c from −2 + δ1 to (−2 + δ1) + δ2i for δ2 ∈ R⁺. For
δ2 sufficiently small, convergence of P(ρ) holds on the interval (−ρ0, ρ0), g22 and
g33 change by arbitrarily small amounts, while g11 becomes positive but arbitrar-
ily close to zero. Let X2 denote the resulting real quadratic system. Because for
X2 the first three focus quantities satisfy 0 < g11 ≪ −g22 ≪ g33, by the expression
(6.28) for P(ρ) and the quadratic formula, for suitably chosen g11, g22, and g33
there are exactly two zeros of P(ρ) in the interval (0, ρ0], the larger one less than
√(−g22/g33), which can be made to be less than ρ0. Moreover, P(ρ) is positive on a
neighborhood of 0 in (0, ρ0]. But then by (3.14) and the identity η1 = e^{2πλ} − 1 (see
the sentence that follows Definition 3.1.3), if λ is made negative but |λ| is suffi-
ciently small, a third isolated zero of P(ρ) appears in (0, ρ0) for the corresponding
real quadratic system X3. System X3 can be chosen to be within ε of X0 and X1 and
have its three small limit cycles wholly within the disk about the origin of radius r.
Thus the center of X0 and the fine focus of X1 each have cyclicity three. □
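The root count for X2 can be illustrated numerically. Dropping the higher-order factors in (6.28) leaves the truncation P(ρ) ≈ πρ³(g11 + g22ρ² + g33ρ⁴), which has exactly two small positive roots when 0 < g11 ≪ −g22 ≪ g33 with g22² > 4g11g33, the larger below √(−g22/g33). A sketch with sample magnitudes (chosen only for illustration):

```python
import math

# sample magnitudes with 0 < g11 << -g22 << g33 and g22**2 > 4*g11*g33
g11, g22, g33 = 1e-8, -1e-3, 1.0

# positive roots of g11 + g22*r**2 + g33*r**4 = 0: quadratic formula in x = r**2
disc = g22**2 - 4*g11*g33
assert disc > 0
x1 = (-g22 - math.sqrt(disc)) / (2*g33)   # smaller root of the quadratic in x
x2 = (-g22 + math.sqrt(disc)) / (2*g33)   # larger root
rho1, rho2 = math.sqrt(x1), math.sqrt(x2)

# two positive roots, the larger one below sqrt(-g22/g33)
assert 0 < rho1 < rho2 < math.sqrt(-g22/g33)
print(rho1, rho2)
```

Turning λ slightly negative then lifts the value of P at 0 downward, creating the third zero near the origin, just as in the proof.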
We have easily obtained a proof of the celebrated Bautin theorem (which is the
solution of the local 16th Hilbert problem for system (6.1) with n = 2) with the help
of the methods of computational algebra. Simple good fortune entered in, however:
the ideal B3 defining the center variety was radical, and it will very seldom be the
case that the ideal generated by the first few focus quantities defining the center
variety is radical. We devote the remainder of this section to a study of the real
system that has complex form

    ẋ = i(x − a10x² − a01xx̄ − a−13x̄³),      (6.29)

which will illustrate the difficulties encountered with systems of higher order and
some of the techniques for working with them. As usual we adjoin to equation (6.29)
its complex conjugate and consider x̄ and ājk as new independent complex variables.
We thus obtain system (3.130), which we examined in Section 3.7 (Theorem 3.7.2)
and which for reference we reproduce here as

    ẋ = i(x − a10x² − a01xy − a−13y³)
    ẏ = −i(y − b10xy − b01y² − b3,−1x³).      (6.30)

The first nine focus quantities for system (6.30) are listed in the first paragraph
of the proof of Theorem 3.7.2. The first five focus quantities determine the center
variety of system (6.30) and, therefore, the radical of the ideal of focus quantities
of this system; that is, √B5 = √B. However, in contrast with what is the case for
quadratic systems, B5 is not all of the Bautin ideal B (Exercise 6.18), nor is it
radical:

Proposition 6.3.4. The ideal B5 = ⟨g11, g22, g33, g44, g55⟩ generated by the first five
focus quantities of system (6.30) is not radical in C[a10, a01, a−13, b3,−1, b10, b01].

Proof. We claim that each of the ideals Jj, j = 1, . . . , 8, in the statement of Theorem
3.7.2 is prime. J8 is prime by Theorem 3.5.9. For a proof that the remaining seven
ideals are prime that involves the theory developed in Chapter 1, see Exercise 6.17.
Here we give a purely algebraic proof. To begin with, it is clear that in general
for m, n ∈ N with m ≤ n, the ideal ⟨x1, . . . , xm⟩ in C[x1, . . . , xn] is prime. Thus for
cj ∈ C, j = 1, . . . , n, because the rings C[x1, . . . , xn] and C[x1 − c1, . . . , xn − cn] are
isomorphic, the ideal ⟨x1 − c1, . . . , xm − cm⟩ in C[x1, . . . , xn] is prime. Thus if, for
r, s ∈ N, we view C[x1, . . . , xr, y1, . . . , ys] as C[x1, . . . , xr][y1, . . . , ys], we have that the
ideal ⟨xj1 − yk1, . . . , xjp − ykp⟩ is prime. Applying these ideas to J1 through J7 shows
that they are prime.
Using the algorithm of Table 1.5 on page 37, we can compute a Gröbner basis
of the ideal J = J1 ∩ · · · ∩ J8, which by Proposition 1.4.4 is radical, and obtain from it
the unique reduced Gröbner basis of J. Using the known finite set of generators
of each of B5 and J, we can apply the Radical Membership Test (Table 1.4 on
page 33) and Exercise 1.35 to determine that √B5 = √J = J. (This identity
also follows directly from the identity V(B5) = V(B) = V(J1) ∪ · · · ∪ V(J8) = V(J1 ∩ · · · ∩ J8)
(Theorems 3.7.2 and 1.3.18(2)) and Propositions 1.3.16 and 1.4.4.) But when we
compute the reduced Gröbner basis of B5 we do not get the reduced Gröbner basis
for J, so B5 ≠ J and we conclude that B5 is not radical. □
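The Radical Membership Test invoked here is, presumably, the standard one: f ∈ √I if and only if 1 ∈ ⟨I, 1 − zf⟩ for a new variable z, i.e., the reduced Gröbner basis of the enlarged ideal is {1}. A minimal sympy sketch on a toy ideal (not B5 itself, whose generators are too long to reproduce here):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def in_radical(f, I, *gens):
    """f lies in sqrt(<I>) iff the Groebner basis of <I, 1 - z*f> is {1}."""
    gb = sp.groebner(list(I) + [1 - z*f], z, *gens, order='lex')
    return gb.exprs == [1]

I = [x**2, x*y]                       # toy ideal; its radical is <x>
print(in_radical(x, I, x, y))         # x lies in the radical
print(in_radical(y, I, x, y))         # y does not
```

Running the test on each generator of B5 against a Gröbner basis of J, and vice versa, establishes √B5 = √J exactly as described in the proof.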

As we have shown above, the first five focus quantities define the center variety of
system (6.30), but the corresponding ideal B5 is not radical. Similarly, the ideals B7
and B9 generated by the first seven and the first nine focus quantities are not radical
either (Exercise 6.19). By Exercise 6.18, neither B5 nor B7 is all of the Bautin
ideal B. In order to make progress on the cyclicity problem for family (6.29), we
are led to attempt to make the problem more tractable by means of a coordinate
transformation. Turn back to Theorem 3.7.2 and examine the generators of the eight
ideals that determine the irreducible components of the center variety of family
(3.130) (our family (6.30)), and also consider the first focus quantity g11, which
is given in the first paragraph of the proof of that theorem. If we replace a10 by a
multiple of b10 and replace b01 by a multiple of a01, then the first focus quantity and
many of the generators factor as products of distinct variables, so that a decoupling
of variables occurs. This suggests that we define a mapping G : C⁶ → C⁶ by

    (a10, a01, a−13, b3,−1, b10, b01)
        = G(s1, a01, a−13, b3,−1, b10, s2) = (s1b10, a01, a−13, b3,−1, b10, s2a01).   (6.31)

That is, we introduce new variables s1 and s2 by setting

    a10 = s1b10,   b01 = s2a01.   (6.32)
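The decoupling can be seen already on the first focus quantity. A minimal sympy sketch, assuming (consistent with the identity (s1 − s2)a01b10 = ğ11 used later in this section) that g11 is, up to a nonzero constant multiple, a10a01 − b10b01:

```python
import sympy as sp

a10, a01, b10, b01, s1, s2 = sp.symbols('a10 a01 b10 b01 s1 s2')

# g11 up to a constant multiple (assumption; see lead-in)
g11 = a10*a01 - b10*b01

# the substitution (6.32): a10 = s1*b10, b01 = s2*a01
g11_breve = g11.subs({a10: s1*b10, b01: s2*a01})

# the image factors as a product of distinct variables: (s1 - s2)*a01*b10
print(sp.factor(g11_breve))
```

Many of the ideal generators in Theorem 3.7.2 factor in the same way after the substitution, which is exactly the decoupling the transformation is designed to produce.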

This process induces a homomorphism of the ring of polynomials over C in six
indeterminates, defined for any f ∈ C[a, b] = C[a10, a01, a−13, b3,−1, b10, b01] by

    W : C[a10, a01, a−13, b3,−1, b10, b01] → C[s1, a01, a−13, b3,−1, b10, s2]

      : ∑_{ν∈Supp(f)} f^(ν) a10^ν1 a01^ν2 a−13^ν3 b3,−1^ν4 b10^ν5 b01^ν6
            ↦ ∑_{ν∈Supp(f)} f^(ν) s1^ν1 a01^{ν2+ν6} a−13^ν3 b3,−1^ν4 b10^{ν5+ν1} s2^ν6
            = ∑_{σ∈Σ} f^(α(σ)) s1^σ1 a01^σ2 a−13^σ3 b3,−1^σ4 b10^σ5 s2^σ6,

where Σ is the image in N⁶₀ of Supp(f) ⊂ N⁶₀ under the invertible linear map

    ω : R⁶ → R⁶ : (ν1, ν2, ν3, ν4, ν5, ν6) ↦ (σ1, σ2, σ3, σ4, σ5, σ6)
                = (ν1, ν2 + ν6, ν3, ν4, ν5 + ν1, ν6)   (6.33)

and α : R⁶ → R⁶ is the inverse of ω. Although the mapping G is not a true change
of coordinates (it is neither one-to-one nor onto), the ring homomorphism W is one-
to-one (Exercise 6.20), hence is an isomorphism of C[a, b] with its image, call it C.
We denote the image of f under W by f̆. Similarly, we denote the ideal in C that is
the image of an ideal I in C[a, b] by Ĭ.
The image C is not all of C[s1, a01, a−13, b3,−1, b10, s2], so in addition to the ide-
als B̆k and B̆ within C, we have the larger ideals ⟨ğ11, . . . , ğkk⟩ and ⟨ğjj : j ∈ N⟩ in
C[s1, a01, a−13, b3,−1, b10, s2], in which the weights in the finite linear combinations
are not restricted to lie in C. We denote these larger ideals by B̆k⁺ and B̆⁺, respec-
tively. In essence, by means of the transformation G, we have placed the Bautin
ideal in a larger ring, and it is here that we are able to prove that the first nine focus
quantities suffice to generate the whole ideal.
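On the level of exponent strings, the map ω of (6.33) and its inverse α are immediate to implement; a small roundtrip check:

```python
def omega(nu):
    """The linear map (6.33) on exponent strings."""
    n1, n2, n3, n4, n5, n6 = nu
    return (n1, n2 + n6, n3, n4, n5 + n1, n6)

def alpha(sigma):
    """Inverse of omega."""
    s1, s2, s3, s4, s5, s6 = sigma
    return (s1, s2 - s6, s3, s4, s5 - s1, s6)

nu = (1, 1, 0, 0, 0, 1)            # a sample exponent string
print(omega(nu))                   # (1, 2, 0, 0, 1, 1)
assert alpha(omega(nu)) == nu      # alpha inverts omega
```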

Proposition 6.3.5. With the notation of the preceding paragraph, B̆⁺ = B̆9⁺. That
is, the polynomials ğ11, ğ33, ğ44, ğ55, ğ77, and ğ99 form a basis of the ideal of focus
quantities of system (6.30) in the ring C[s1, a01, a−13, b3,−1, b10, s2].

Proof. As usual, we write just C[a, b] for C[a10, a01, a−13, b3,−1, b10, b01] and for
ν ∈ N⁶₀ let [ν] denote the monomial a10^ν1 a01^ν2 a−13^ν3 b3,−1^ν4 b10^ν5 b01^ν6 ∈ C[a, b]. The focus
quantities gkk ∈ C[a, b] have the form (3.82),

    gkk = (1/2) ∑_{ν : L(ν)=(k,k)} gkk^(ν) ([ν] − [ν̂]),   (6.34)

where igkk^(ν) ∈ Q and L is defined by (3.71), which in the present situation is the map
L : N⁶₀ → Z² given by

    L(ν) = ν1(1, 0) + ν2(0, 1) + ν3(−1, 3) + ν4(3, −1) + ν5(1, 0) + ν6(0, 1)
         = (−ν3 + 3ν4 + (ν5 + ν1), (ν2 + ν6) + 3ν3 − ν4).   (6.35)

The caret accent denotes the involution on C[a, b] given by Definition 3.4.3 and the
paragraph following it. For the remainder of this proof we modify it by not taking the
complex conjugate of the coefficients in forming the conjugate. The new involution
agrees with the old one on monomials, but now, for example, in the expression for
gkk we have ĝkk^(ν) = gkk^(ν), since it is a constant polynomial.
For any µ and σ in N⁶₀,

    [µ + σ] − [µ + σ]^ = (1/2)([σ] + [σ̂])([µ] − [µ̂]) + (1/2)([µ] + [µ̂])([σ] − [σ̂])
                       = f1([µ] − [µ̂]) + f2([σ] − [σ̂]),   (6.36)
where fj ∈ Q[a, b] and f̂j = fj for j = 1, 2. Every string ν appearing in the sum in
(6.34) is an element of the monoid M. Using the algorithm of Table 5.1 on page
235, we construct a Hilbert basis H of M; the result is

    H = {(100 001), (010 010), (001 100), (110 000), (000 011),
         (040 100), (001 040), (030 101), (101 030), (020 102), (201 020),
         (010 103), (301 010), (000 104), (401 000)}.
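Membership of each listed string in M, that is, the requirement L(ν) = (k, k) for some k ∈ N, can be verified directly from the formula (6.35):

```python
# L from (6.35): L(nu) = (nu1 - nu3 + 3*nu4 + nu5, nu2 + 3*nu3 - nu4 + nu6)
def L(nu):
    n1, n2, n3, n4, n5, n6 = nu
    return (n1 - n3 + 3*n4 + n5, n2 + 3*n3 - n4 + n6)

H = [(1,0,0,0,0,1), (0,1,0,0,1,0), (0,0,1,1,0,0), (1,1,0,0,0,0), (0,0,0,0,1,1),
     (0,4,0,1,0,0), (0,0,1,0,4,0), (0,3,0,1,0,1), (1,0,1,0,3,0), (0,2,0,1,0,2),
     (2,0,1,0,2,0), (0,1,0,1,0,3), (3,0,1,0,1,0), (0,0,0,1,0,4), (4,0,1,0,0,0)]

# every element of the Hilbert basis lies on the diagonal L(nu) = (k, k)
for nu in H:
    p, q = L(nu)
    assert p == q
print([L(nu) for nu in H[:5]])   # [(1, 1), (1, 1), (2, 2), (1, 1), (1, 1)]
```

The first five elements lie on the level k = 1 or k = 2; the remaining ten all have L(ν) = (3, 3).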

For j = 1, . . . , 15, let µj denote the jth element of H as listed here, so that, for
example, µ3 = (001 100). For any k ∈ N, the sum in (6.34) is finite. Expressing each
ν ∈ Supp(gkk) in terms of elements of H, applying (6.36) repeatedly, and collecting
terms yields an expression for gkk of the form

    gkk = f1([µ1] − [µ̂1]) + · · · + f15([µ15] − [µ̂15]),

where ifj ∈ Q[a, b] and f̂j = fj for j = 1, . . . , 15. It is apparent that fj is a polynomial
in the monomials [µ1], . . . , [µ15]. Since µ̂j = µj for j = 1, 2, 3, the first three terms
do not actually appear. Also, [µ4] − [µ̂4] = g11 and [µ5] − [µ̂5] = −g11. Since we are
attempting to show that ğkk ∈ B̆9, which contains ğ11 as a generator, we may replace
gkk with the polynomial that results when these multiples of g11 are removed. That
is, we reduce gkk modulo g11. So as not to complicate the notation, we retain the
same name gkk for the reduced focus quantity. The remaining elements of H occur
in conjugate pairs, µj+1 = µ̂j for j = 6, 8, 10, 12, and 14. Thus we may use the
identity [µj+1] − [µ̂j+1] = −([µj] − [µ̂j]) for these values of j to finally express the
reduced gkk as

    gkk = (f6 − f7)([µ6] − [µ̂6]) + · · · + (f14 − f15)([µ14] − [µ̂14])
        = ∑_{j=3}^{7} h2j ([µ2j] − [µ̂2j]),   (6.37)

where ih2j ∈ Q[a, b], ĥ2j = h2j for j = 3, . . . , 7, and again h2j is a polynomial in the
monomials [µ1], . . . , [µ15].
Define an involution on C[s1, a01, a−13, b3,−1, b10, s2], also denoted by a caret ac-
cent, by

    s1 → s2,   s2 → s1,   a01 → b10,   a−13 → b3,−1,   b3,−1 → a−13,   b10 → a01.

Thus, for example, if

    f = (2 + 3i)s1²a01a−13³b10³ + (5 − 7i)s1³a01⁴b3,−1b10³s2⁴
      = f1(s1, s2) a01a−13³b10³ + f2(s1, s2) a01⁴b3,−1b10³

then

    f̂ = (2 + 3i)s2²b10b3,−1³a01³ + (5 − 7i)s2³b10⁴a−13a01³s1⁴
      = f̂1(s1, s2) b10b3,−1³a01³ + f̂2(s1, s2) b10⁴a−13a01³.

For any f ∈ C[a, b], the conjugate of the image f̆ of f under the homomorphism W
is the image of the conjugate. In particular, f is invariant under the involution on
C[a, b] if and only if its image f̆ is invariant under the involution on the isomorphic
copy C = W(C[a, b]) of C[a, b] lying in C[s1, a01, a−13, b3,−1, b10, s2].
In order to make the next display easier to read, we extend the bracket nota-
tion to C[s1, a01, a−13, b3,−1, b10, s2], expressing the monomial a01^ν1 a−13^ν2 b3,−1^ν3 b10^ν4 as
[ν1, ν2, ν3, ν4]. With this notation,

    W([ν1, ν2, ν3, ν4, ν5, ν6]) = s1^ν1 s2^ν6 [ν2 + ν6, ν3, ν4, ν5 + ν1].

Then the images of the monomials corresponding to the 15 elements of the Hilbert
basis H of M are (note the ordering of the last two columns compared to the
ordering in the display for H)

    [100 001] → s1s2[1001]    [040 100] → [4010]       [001 040] → [0104]
    [010 010] → [1001]        [030 101] → s2[4010]     [101 030] → s1[0104]
    [001 100] → [0110]        [020 102] → s2²[4010]    [201 020] → s1²[0104]   (6.38)
    [110 000] → s1[1001]      [010 103] → s2³[4010]    [301 010] → s1³[0104]
    [000 011] → s2[1001]      [000 104] → s2⁴[4010]    [401 000] → s1⁴[0104]
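The table (6.38) can be regenerated mechanically from the rule W([ν1, . . . , ν6]) = s1^ν1 s2^ν6 [ν2 + ν6, ν3, ν4, ν5 + ν1]. A small sketch, encoding an image as (power of s1, power of s2, 4-bracket):

```python
def W_bracket(nu):
    """Image of [nu1,...,nu6] under W: (s1 exponent, s2 exponent, 4-bracket)."""
    n1, n2, n3, n4, n5, n6 = nu
    return (n1, n6, (n2 + n6, n3, n4, n5 + n1))

# reproduce three entries of table (6.38)
assert W_bracket((0, 4, 0, 1, 0, 0)) == (0, 0, (4, 0, 1, 0))   # [040 100] -> [4010]
assert W_bracket((0, 0, 0, 1, 0, 4)) == (0, 4, (4, 0, 1, 0))   # [000 104] -> s2**4 [4010]
assert W_bracket((1, 1, 0, 0, 0, 0)) == (1, 0, (1, 0, 0, 1))   # [110 000] -> s1 [1001]
print("table (6.38) entries reproduced")
```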

We name the monomials that appear as images as

    u = [0110] = a−13b3,−1,   v = [1001] = a01b10,   w = [0104] = a−13b10⁴.

The first two are self-conjugate, and ŵ = [4010] = a01⁴b3,−1.
Using (6.37) and the last two columns of (6.38), the image of the reduced gkk is

    ğkk = h̆6(ŵ − w) + h̆8(s2ŵ − s1w) + h̆10(s2²ŵ − s1²w)
          + h̆12(s2³ŵ − s1³w) + h̆14(s2⁴ŵ − s1⁴w)   (6.39)
        = hw − ĥŵ.
6.3 The Cyclicity of Quadratic Systems and a Family of Cubic Systems 279

Knowing that the functions h2j are polynomials in the monomials [µ1], . . . , [µ15] and
using the first column in (6.38), we see that in fact h can be regarded as a polynomial
function of s1, s2, u, v, w, and ŵ. Note, however, that by regrouping the terms in the
monomials that compose h, it could be possible to express h in terms of s1, s2, u, v, w,
and ŵ in different ways. An important example of this phenomenon is the identity
wŵ = a01⁴a−13b3,−1b10⁴ = uv⁴. We will now show that by means of this identity and
a possible reduction modulo ğ11, we can eliminate all ŵ dependence from h.
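The key identity wŵ = uv⁴ is a one-line check on the underlying monomials:

```python
import sympy as sp

# am13 stands for a_{-1,3} and b3m1 for b_{3,-1}
a01, am13, b3m1, b10 = sp.symbols('a01 am13 b3m1 b10')

u = am13 * b3m1          # [0110]
v = a01 * b10            # [1001]
w = am13 * b10**4        # [0104]
w_hat = a01**4 * b3m1    # [4010], the conjugate of w

# w * w_hat equals u * v**4 as monomials in (a, b)
assert sp.expand(w*w_hat - u*v**4) == 0
print(sp.expand(w*w_hat))
```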
For m, n ∈ N₀, n ≥ 1, let w^m ŵ^n f_{m,n}(s1, s2, u, v) denote the sum of all terms in
h that contain w^m ŵ^n. For n ≤ m, we can factor out a power of w to rewrite this
expression as w^{m−n}(wŵ)^n f_{m,n} = w^{m−n}(uv⁴)^n f_{m,n}. For n > m, we must move terms
between the expressions hw and ĥŵ in (6.39). Beginning as in the previous case, we
write

    (ŵ^{n−m}(uv⁴)^m f_{m,n})w − (w^{n−m}(uv⁴)^m f̂_{m,n})ŵ
        = ŵ^{n−m−1}(uv⁴)^{m+1} f_{m,n} − w^{n−m−1}(uv⁴)^{m+1} f̂_{m,n}
        = (−f̂_{m,n})(uv⁴)^{m+1} w^{n−m−1} − (−f_{m,n})(uv⁴)^{m+1} ŵ^{n−m−1}
        = (−f̂_{m,n}(uv⁴)^{m+1} w^{n−m−2})w − (−f_{m,n}(uv⁴)^{m+1} ŵ^{n−m−2})ŵ,

which eliminates ŵ dependence from h provided n ≥ m + 2. For n = m + 1, the last
line in the display is not present, and the line that precedes it has the form

    (uv⁴)^{m+1} f(s1, s2, u, v) − (uv⁴)^{m+1} f̂(s1, s2, u, v)
        = (uv⁴)^{m+1} (f(s1, s2, u, v) − f̂(s1, s2, u, v)).   (6.40)

Any term A s1^n1 s2^n2 u^n3 v^n4 in f produces in (6.40) the binomial

    A u^{n3+m+1} v^{n4+4m+4} (s1^n1 s2^n2 − s1^n2 s2^n1) = A u^p v^q (s1^s s2^t − s1^t s2^s),

where q > 0. The term does not appear if s = t. If s ≠ t, say s > t (the case t > s is
similar), such a binomial can be written

    A u^p v^q (s1s2)^t (s1^{s−t} − s2^{s−t})
        = A u^p v^{q−1} (s1s2)^t (s1^{s−t−1} + · · · + s2^{s−t−1})(s1 − s2)v.

But (s1 − s2)v = (s1 − s2)a01b10 = ğ11, so any such term can be discarded in a fur-
ther reduction of ğkk modulo ğ11, although as before we will retain the same no-
tation for the reduced focus quantity. We thus have expression (6.39), where, with
respect to the current grouping of terms in the monomials that are present in h,
ih ∈ Q[s1, s2, u, v, w].
In general, for any element f of C[a, b] and any point (a∗, b∗) ∈ C⁶ such that
f(a∗, b∗) = 0, if a01∗ b10∗ ≠ 0, then we can define the two numbers s1∗ = a10∗/b10∗ and
s2∗ = b01∗/a01∗ and know that f̆(s1∗, a01∗, a−13∗, b3,−1∗, b10∗, s2∗) = 0. By Theorem 3.7.2(7),
if (a, b) is such that b10 = 2a10 and a01 = 2b01, corresponding to s1 = s2 = 1/2,
then gkk(a, b) = 0 for all k ∈ N. This means that if we write the h of (6.39) as
h = ∑_{k=0}^T Hk(s1, s2, u, v)w^k, then

    ∑_{k=0}^T Hk(1/2, 1/2, u, v)w^{k+1} − ∑_{k=0}^T Ĥk(1/2, 1/2, u, v)ŵ^{k+1}
        = ∑_{k=0}^T Hk(1/2, 1/2, u, v)(w^{k+1} − ŵ^{k+1}) = 0   (6.41)

since u and v are self-conjugate. Although the Hk are polynomials in u and v, the
identity (6.41) does not automatically imply that Hk(1/2, 1/2, u, v) is the zero poly-
nomial in C[u, v], for u, v, w, and ŵ are not true indeterminates, but only short-
hand expressions for monomials in (a, b). As stated, identity (6.41) holds subject
to the condition uv⁴ − wŵ = 0. To obtain the conclusion that Hk(1/2, 1/2, u, v) is the
zero polynomial in C[u, v], we must show that no term in Hk w^{k+1} can cancel with
any term in Hj w^{j+1} or Ĥj ŵ^{j+1} for j ≠ k. Any term in Hk(1/2, 1/2, u, v)w^{k+1} has
the form A a01^s b10^{s+4k+4} a−13^{r+k+1} b3,−1^r; any term in Hj(1/2, 1/2, u, v)w^{j+1} has the form
B a01^n b10^{n+4j+4} a−13^{m+j+1} b3,−1^m. The powers on a01 agree only if n = s and the powers
on b3,−1 agree only if m = r, hence the powers on a−13 agree only if j = k. The
other cases are similar. Thus, for each k, as an element of C[a01, a−13, b3,−1, b10],
Hk(1/2, 1/2, u, v)w^{k+1} is zero, hence Hk(1/2, 1/2, u, v) is also the zero polynomial in
C[a01, a−13, b3,−1, b10]. The map from C⁴ to C² that carries (a01, a−13, b3,−1, b10)
to (u, v) is onto, so this means that the polynomial function F on C² defined by
F(u, v) = Hk(1/2, 1/2, u, v) is identically zero. By Proposition 1.1.1, Hk(1/2, 1/2, u, v)
must be the zero polynomial in C[u, v]. Thus there exist polynomials ha and hb in
C[s1, a01, a−13, b3,−1, b10, s2] such that h = ha(2s1 − 1) + hb(2s2 − 1). Inserting this
expression into (6.39) yields
expression into (6.39) yields

ğkk = ha(2s1 − 1)w + hb(2s2 − 1)w − ĥa(2s2 − 1)ŵ − ĥb(2s1 − 1)ŵ,    (6.42)

where, with respect to the current grouping of terms in the monomials that are
present in ha and hb, each of iha and ihb lies in Q[s1, s2, u, v, w]. With respect
to that grouping, now reorder the terms in ğkk so that it is expressed as a sum
of three polynomials ğkk = ğkk^{(1)} + ğkk^{(2)} + ğkk^{(3)}, where ğkk^{(1)} is the sum of all terms that
contain v, ğkk^{(2)} is the sum of all remaining terms that contain u, and ğkk^{(3)} is the sum
of all remaining terms. Thus
(1) ğkk^{(1)} is a sum of polynomials of the form

    v^c [(2s1 − 1) f w − (2s2 − 1) f̂ ŵ]    (6.43a)

and

    v^c [(2s2 − 1) f w − (2s1 − 1) f̂ ŵ],    (6.43b)

where c ∈ N and i f ∈ Q[s1, s2, u, w] (there is no v dependence in f because v is
self-conjugate; hence, decomposing any polynomial f that is currently present
into a sum of polynomials all of whose terms contain v to the same power, we
may assume that v is present to the same power in f and f̂ and can be factored
out);
6.3 The Cyclicity of Quadratic Systems and a Family of Cubic Systems 281

(2) ğkk^{(2)} is a sum of polynomials of the form

    u^c [(2s1 − 1) f w − (2s2 − 1) f̂ ŵ]    (6.44a)

and

    u^c [(2s2 − 1) f w − (2s1 − 1) f̂ ŵ],    (6.44b)

where c ∈ N and i f ∈ Q[s1, s2, w] (there is no u dependence in f for the same
reason: u is self-conjugate); and

(3) ğkk^{(3)} is a sum of polynomials of the form

    (2s1 − 1) f w − (2s2 − 1) f̂ ŵ    (6.45a)

and

    (2s2 − 1) f w − (2s1 − 1) f̂ ŵ,    (6.45b)

where i f ∈ Q[s1, s2, w].
The motivation for this decomposition is the following observation, which we will
have occasion to use several times in what follows: if a monomial [µ] either (i)
contains s1 but not b10 or (ii) contains s2 but not a01, or, stated in terms of u, v, w,
and ŵ, if [µ] either (i) contains s1 but neither v nor w or (ii) contains s2 but neither v
nor ŵ, then it cannot appear in ğkk. For in case (i), µ = (n, ∗, ∗, ∗, 0, ∗), n > 0, which
has as its image under α = ω^{−1} the 6-tuple (∗, ∗, ∗, ∗, −n, ∗), so that because of the
negative entry µ is not in the image of ω, while in case (ii), µ = (∗, 0, ∗, ∗, ∗, n),
n > 0, which has α-image the 6-tuple (∗, −n, ∗, ∗, ∗, ∗), so again µ is not in the
image of ω.
We will show that all polynomials of the forms (6.43) and (6.45) that appear
in ğkk lie in B̆5+ and that all those of the form (6.44) that appear in ğkk and
are relevant lie in B̆9+, where B̆j+ = ⟨ğ11, . . . , ğjj⟩ as an ideal in the full ring
C[s1, a01, a−13, b3,−1, b10, s2], for j = 5, 9. To start we compute the reduced Gröbner
basis of B̆5+ with respect to lex with s1 > s2 > a01 > b10 > a−13 > b3,−1. It contains
(among others) the polynomials

u1 = v(s1 − s2)
u2 = v(2s2 − 1)(w − ŵ)
u3 = −a01 u (2s2 − 1)(w − ŵ)
u4 = −b10 u [(2s1 − 1)w − (2s2 − 1)ŵ]
u5 = a01 (2s2 − 1)(s2 − 3)(s2 + 3)(w − ŵ)
u6 = (2s1 − 1)(s1 − 3)(s1 + 3)w − (2s2 − 1)(s2 − 3)(s2 + 3)ŵ,

which therefore also lie in B̆9+.


ğkk^{(1)} ∈ B̆5+. Consider a polynomial of the form specified in (6.43a). A monomial
s1^{n1} s2^{n2} u^{n3} v^{n4} w^s ∈ f gives rise to the binomial

v^c [(2s1 − 1) s1^{n1} s2^{n2} u^{n3} v^{n4} w^{s+1} − (2s2 − 1) s1^{n2} s2^{n1} u^{n3} v^{n4} ŵ^{s+1}]
  = v [(2s1 − 1) s1^{n1} s2^{n2} w^{s+1} − (2s2 − 1) s1^{n2} s2^{n1} ŵ^{s+1}] u^{n3} v^{n4+c−1}

in ğkk^{(1)}. If n1 ≥ n2, then the product (s1 s2)^{n2} can be factored out, while if n1 < n2,
then the product (s1 s2)^{n1} can be factored out, in each case expressing the binomial
as a monomial times a binomial of the form

v [(2s1 − 1) sj^r w^t − (2s2 − 1) ŝj^r ŵ^t]    (6.46)

for j ∈ {1, 2} and where r ∈ N0 and t ∈ N. Thus polynomials of the form (6.43a) lie
in B̆5+ if all monomials of the form (6.46) do. But

sj^r (2s1 − 1) w^t − ŝj^r (2s2 − 1) ŵ^t
  = (sj + ŝj) [sj^{r−1} (2s1 − 1) w^t − ŝj^{r−1} (2s2 − 1) ŵ^t]
    − sj ŝj [sj^{r−2} (2s1 − 1) w^t − ŝj^{r−2} (2s2 − 1) ŵ^t],

so this will follow by mathematical induction on r if it is true for r = 0 and 1. That
it is follows from the fact that for t = 1, f1 := v(2s2 − 1)(w − ŵ) = u2, and for t > 1,

f_t := v(2s2 − 1)(w^t − ŵ^t) = v(2s2 − 1)(w − ŵ)(w^{t−1} + ··· + ŵ^{t−1})
     = u2 (w^{t−1} + ··· + ŵ^{t−1})

(so they contain u2 as a factor) and the identities

v[(2s1 − 1)w^t − (2s2 − 1)ŵ^t] = 2w^t u1 + f_t
v[s1(2s1 − 1)w^t − s2(2s2 − 1)ŵ^t] = (2s1 + 2s2 − 1)w^t u1 + s2 f_t
v[s2(2s1 − 1)w^t − s1(2s2 − 1)ŵ^t] = (2s2(w^t − ŵ^t) + ŵ^t)u1 + s2 f_t,

which are easily verified simply by expanding each side. The demonstration that a
polynomial of the form specified in (6.43b) lies in B̆5+ is similar.
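The three displayed identities are easy to confirm by machine as well. The following sketch (our own, not from the text) treats s1, s2, v, w, and ŵ as independent rational values; u1 = v(s1 − s2) and f_t = v(2s2 − 1)(w^t − ŵ^t) as above.

```python
from fractions import Fraction as F

# Independent sample values; any nonzero choices work since these are
# polynomial identities in s1, s2, v, w, wh.
s1, s2, v, w, wh = F(2), F(5), F(3), F(7), F(11)
u1 = v * (s1 - s2)

for t in range(1, 6):
    ft = v * (2*s2 - 1) * (w**t - wh**t)
    assert v * ((2*s1 - 1)*w**t - (2*s2 - 1)*wh**t) == 2*w**t*u1 + ft
    assert v * (s1*(2*s1 - 1)*w**t - s2*(2*s2 - 1)*wh**t) \
        == (2*s1 + 2*s2 - 1)*w**t*u1 + s2*ft
    assert v * (s2*(2*s1 - 1)*w**t - s1*(2*s2 - 1)*wh**t) \
        == (2*s2*(w**t - wh**t) + wh**t)*u1 + s2*ft
```

Checking at a single generic rational point does not prove a polynomial identity, but it catches sign and coefficient slips immediately.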
ğkk^{(2)} ∈ B̆9+. For this case we will need the identity B̆10+ = B̆11+ = B̆12+ = B̆9+. It is
established by computing a Gröbner basis G9 of B̆9+, then computing ğ10,10, ğ11,11,
and ğ12,12 and reducing each one modulo G9; the remainder is zero each time. This
is sufficient since W is an isomorphism onto its image C.
Our next observation is that no polynomial of the form of (6.44b) actually appears
in ğkk. For any monomial A s1^{n1} s2^{n2} w^{n3} in f yields

2A s1^{n1} s2^{1+n2} a−13^{1+c+n3} b3,−1^c b10^{4+4n3} − A s1^{n1} s2^{n2} a−13^{1+c+n3} b3,−1^c b10^{4+4n3}
  − 2A s1^{1+n2} s2^{n1} a01^{4+4n3} a−13^c b3,−1^{1+c+n3} + A s1^{n2} s2^{n1} a01^{4+4n3} a−13^c b3,−1^{1+c+n3}.

The first term contains a power of s2 but not of a01, hence must be cancelled by some
other term in ğkk^{(2)}. But no term coming from a polynomial of the form in (6.44b) with
a different exponent c′ on u can cancel it, for if c′ ≠ c there will be a mismatch of the
exponents on b3,−1 in the first and second terms and a mismatch of the exponents
on a−13 in the third and fourth terms. No term coming from a polynomial of the
form in (6.44b) with the same exponent c on u can cancel it either, because of a
mismatch of the exponents on a−13 in the second term and of a01 in the third and
fourth terms. By writing out completely a general term coming from a monomial in
f in a polynomial of the form in (6.44a) with exponent c′ on u, we find similarly
that for no choice of c′ can we obtain a term to make the cancellation.
The same kind of argument shows that for no polynomial of the form of (6.44a)
can the polynomial f contain a monomial A s1^{n1} s2^{n2} w^{n3} in which n2 > 0. Thus we
reduce to consideration of polynomials of the form

p_{c,d} = u^c [(2s1 − 1) s1^r w^d − (2s2 − 1) s2^r ŵ^d],    (6.47)
where c ≥ 1, d ≥ 1, and r ≥ 0.
The two situations d = 1 and d ≥ 2 call for different treatments. If d = 1, then
there is an important shift in the logic of the proof. Heretofore we have implicitly
used the fact that if a polynomial f is a sum of polynomials that are in an ideal I,
then f ∈ I. But it is by no means necessary that every polynomial in such a sum be
in I in order for f to be in I. In particular, it is not necessary that each polynomial
p_{c,1} be in B̆9+ in order for ğkk^{(2)} to be there. Indeed, polynomials of the form p_{1,1} and
p_{2,1} are not. What is sufficient, and what we will show, is that any polynomial p_{c,1}
that is not definitely in B̆12+ = B̆9+ can be in ğkk^{(2)} only if k is so low (k ≤ 12) that ğkk
is a generator of B̆12+, hence certainly in it.
The argument that was just applied to narrow the scope of possible forms of f in
(6.44a) shows that r can be at most 3, where now we use the fact that if s1 occurs
in a term with exponent n, then b10 must also occur in that term with exponent
at least n (and cancellation of terms that fail this condition is impossible). Since
deg p_{c,1} = 2c + r + 6, max deg p_{c,1} = 2c + 9, hence, for c ≤ 2, deg p_{c,1} ≤ 13. By
Exercise 6.21, any monomial present in gkk has degree at least k if k is even and at
least k + 1 if k is odd. Since W preserves degree, the same lower bounds apply to
ğkk. Thus, for c ≤ 2, if p_{c,1} lies in any ğkk, it can do so only for k ≤ 12, so that ğkk
is in B̆12+ = B̆9+. For c = 3, reduce p_{3,1} modulo a Gröbner basis of B̆9+ to obtain
remainder zero and conclude that p_{3,1} ∈ B̆9+. For c ≥ 4, p_{c,1} = u^{c−3} p_{3,1}, hence is in
B̆9+.
For the case d ≥ 2, we factor out and discard a factor of u^{c−1} from (6.47). Define
the polynomial p⁺_{1,d} by

p⁺_{1,d} = u [(2s1 − 1) s1^r w^d + (2s2 − 1) s2^r ŵ^d] (w − ŵ).

The identity

p⁺_{1,d} = −(s1^r w^{d−1} + s2^r ŵ^{d−1}) a01³ b3,−1 u3 − (w − ŵ) s1^r a−13 b10³ w^{d−2} u4

shows that p⁺_{1,d} ∈ B̆5+ for all d ≥ 2. We will prove that any polynomial of the form
p_{1,d} is an element of B̆9+ by induction on d.
Basis step. For d = 2 we have

p_{1,2} = −s2^r a01³ b3,−1 u3 − s1^r a−13 b10³ u4 + (2s2 − 1) u² v⁴ (s1^r − s2^r).

Using the binomial theorem if necessary, factor v(s1 − s2) = u1 out of the last term
to see that p_{1,2} is in B̆5+.
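The basis-step identity can be spot-checked numerically. In this sketch (variable names ours) u, v, w, and ŵ are realized as the monomials they abbreviate — u = a−13 b3,−1, v = a01 b10, w = a−13 b10⁴, ŵ = b3,−1 a01⁴ — and u3, u4 are the Gröbner-basis elements listed earlier.

```python
from fractions import Fraction as F

a01, b10, am13, b3m1 = F(2), F(3), F(5), F(7)   # sample values for a01, b10, a_{-13}, b_{3,-1}
s1, s2 = F(4), F(9)
u, v = am13 * b3m1, a01 * b10
w, wh = am13 * b10**4, b3m1 * a01**4
u3 = -a01 * u * (2*s2 - 1) * (w - wh)
u4 = -b10 * u * ((2*s1 - 1)*w - (2*s2 - 1)*wh)

for r in range(0, 4):
    # p_{1,2} on the left, the claimed combination of u3, u4 on the right
    lhs = u * ((2*s1 - 1)*s1**r*w**2 - (2*s2 - 1)*s2**r*wh**2)
    rhs = -s2**r*a01**3*b3m1*u3 - s1**r*am13*b10**3*u4 \
        + (2*s2 - 1)*u**2*v**4*(s1**r - s2**r)
    assert lhs == rhs
```

Because the substitutions are the actual monomial meanings, the relation uv⁴ = wŵ holds automatically here.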
Inductive step. Suppose p_{1,d−1} ∈ B̆9+. By the identity (see (6.36))

p_{1,d} = ½ u [(2s1 − 1) s1^r w^{d−1} − (2s2 − 1) s2^r ŵ^{d−1}] (w + ŵ)
        + ½ u [(2s1 − 1) s1^r w^{d−1} + (2s2 − 1) s2^r ŵ^{d−1}] (w − ŵ)
      = ½ p_{1,d−1} (w + ŵ) + ½ p⁺_{1,d−1}

and the fact that p⁺_{1,j} ∈ B̆9+ for all j ≥ 2, it follows that p_{1,d} ∈ B̆9+.
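The halving identity that drives the induction can be checked with s1, s2, u, w, and ŵ treated as independent rationals (a sketch of our own, not from the text):

```python
from fractions import Fraction as F

s1, s2, u, w, wh = F(2), F(3), F(5), F(7), F(11)

def p(d, r):
    """p_{1,d} with exponent r, after the factor u^{c-1} has been discarded."""
    return u * ((2*s1 - 1)*s1**r*w**d - (2*s2 - 1)*s2**r*wh**d)

def pplus(d, r):
    """p^+_{1,d}: same bracket with a plus sign, times (w - wh)."""
    return u * ((2*s1 - 1)*s1**r*w**d + (2*s2 - 1)*s2**r*wh**d) * (w - wh)

for r in range(0, 4):
    for d in range(2, 7):
        assert p(d, r) == F(1, 2)*p(d - 1, r)*(w + wh) + F(1, 2)*pplus(d - 1, r)
```

The check confirms that each p_{1,d} is a combination of p_{1,d−1} and p⁺_{1,d−1}, which is exactly what the induction needs.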
ğkk^{(3)} ∈ B̆5+. Just as we applied part (7) of Theorem 3.7.2 to obtain the factorization
of the polynomial h in (6.39), we wish to apply parts (2) and (4) now. To do so we
extend the fact noted earlier that for any element f of C[a, b] and any point (a∗, b∗)
in C⁶ such that f(a∗, b∗) = 0, f̆ vanishes at the point (s1∗, a01∗, a−13∗, b3,−1∗, b10∗, s2∗)
for s1∗ and s2∗ defined by s1∗ = a10∗/b10∗ and s2∗ = b01∗/a01∗, which is possible provided
a01∗ b10∗ ≠ 0. The extension comes from the observation that if a01∗ = 0 but
b01∗ = 0 as well, then we may choose any value for s2∗ and f̆ will still vanish at
(s1∗, a01∗, a−13∗, b3,−1∗, b10∗, s2∗), provided only that b10∗ ≠ 0. Thus, by Theorem 3.7.2(2),
ğkk(3, 0, a−13, 0, b10, s2) = 0 holds for all a−13, s2, and nonzero b10, hence by Exercise
6.22 holds for all a−13, s2, and b10. Similarly, it follows from part (4) of Theorem
3.7.2 that ğkk(−3, 0, a−13, 0, b10, s2) = 0 holds for all a−13, s2, and b10. Since the
polynomial f in (6.45) contains only sums of products of s1, s2, and w = a−13 b10⁴,
and a01 and b3,−1 do not appear, we conclude that it must contain s1 + 3 and s1 − 3
as factors, so that the polynomials in (6.45) must have the form

(2s1 − 1)(s1 − 3)(s1 + 3) f w − (2s2 − 1)(s2 − 3)(s2 + 3) f̂ ŵ

and

(2s2 − 1)(s2 − 3)(s2 + 3) f w − (2s1 − 1)(s1 − 3)(s1 + 3) f̂ ŵ,

where f ∈ C[s1, s2, w]. The latter forms do not actually appear since every term that
they contain has either s1 without w or s2 without ŵ, and cancellation of such terms
is impossible. Thus we reduce to an examination of polynomials of the form

p_d := s1^r (2s1 − 1)(s1 − 3)(s1 + 3) w^d − s2^r (2s2 − 1)(s2 − 3)(s2 + 3) ŵ^d,

where r ≥ 0 and d ≥ 1.
For d = 1, an argument as in the case d = 1 in the proof that ğkk^{(2)} is in B̆9+ shows
that r is at most 1 and that p1 can be in ğkk only if k is so low that ğkk is a generator
of B̆9+.

For d = 2, we have the identity

p2 = a01³ b3,−1 s2^r u5 + s1^r w u6 + (s2 − 3)(s2 + 3)(2s2 − 1)(s1^r − s2^r) w ŵ.

Using the binomial theorem if necessary, factor v(s1 − s2) = u1 out of the last term
to see that p2 is in B̆5+.
For d ≥ 2, define a polynomial p⁺_d by

p⁺_d = [s1^r (2s1 − 1)(s1 − 3)(s1 + 3) w^d + s2^r (2s2 − 1)(s2 − 3)(s2 + 3) ŵ^d] (w − ŵ).

Then

p⁺_d = −s1^r (s2 − 3)(s2 + 3) a01³ b10⁴ w^{d−2} u3 + s2^r a01³ b3,−1 ŵ^{d−1} u5
     + s1^r w^{d−1} (w − ŵ) u6,

which shows that p⁺_d ∈ B̆5+. Following the pattern from the previous case, the reader
is invited to combine these identities with (6.36) to show by mathematical induction
that p_d ∈ B̆5+ for all d ≥ 2. □
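The two identities used in this last case are also easy to spot-check by machine. In this sketch (our own construction) u3, u5, and u6 are realized through the monomial meanings of u, v, w, ŵ; since signs are easily garbled in transcription, the signs asserted here are the ones we re-derived from those definitions.

```python
from fractions import Fraction as F

a01, b10, am13, b3m1 = F(2), F(3), F(5), F(7)
s1, s2 = F(4), F(9)
u, v = am13 * b3m1, a01 * b10
w, wh = am13 * b10**4, b3m1 * a01**4
u3 = -a01 * u * (2*s2 - 1) * (w - wh)
u5 = a01 * (2*s2 - 1) * (s2 - 3) * (s2 + 3) * (w - wh)
u6 = (2*s1 - 1)*(s1 - 3)*(s1 + 3)*w - (2*s2 - 1)*(s2 - 3)*(s2 + 3)*wh

for r in range(0, 4):
    # the d = 2 identity for p2
    p2 = s1**r*(2*s1 - 1)*(s1 - 3)*(s1 + 3)*w**2 \
       - s2**r*(2*s2 - 1)*(s2 - 3)*(s2 + 3)*wh**2
    assert p2 == a01**3*b3m1*s2**r*u5 + s1**r*w*u6 \
        + (s2 - 3)*(s2 + 3)*(2*s2 - 1)*(s1**r - s2**r)*w*wh
    # the identity expressing p_d^+ through u3, u5, u6
    for d in range(2, 6):
        pplus = (s1**r*(2*s1 - 1)*(s1 - 3)*(s1 + 3)*w**d
                 + s2**r*(2*s2 - 1)*(s2 - 3)*(s2 + 3)*wh**d) * (w - wh)
        rhs = -s1**r*(s2 - 3)*(s2 + 3)*a01**3*b10**4*w**(d - 2)*u3 \
            + s2**r*a01**3*b3m1*wh**(d - 1)*u5 + s1**r*w**(d - 1)*(w - wh)*u6
        assert pplus == rhs
```

A single generic rational point does not prove the identities, but it pins down every sign and coefficient.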

Using this result, we obtain the following estimate for the cyclicity of the singularity
at the origin for system (6.29).

Proposition 6.3.6. Fix a particular system in family (6.29) corresponding to a
parameter value a for which a01 ≠ 0. Then the cyclicity of the focus or center at the
origin of this system with respect to perturbation within family (6.29) is at most five.
If perturbation of the linear part is allowed, then the cyclicity is at most six.

Proof. For the general family (6.30) of differential equations on C², Proposition
6.3.5 implies that for any k ∈ N, there exist polynomials h1, h3, h4, h5, h7, and h9 in
six indeterminates with coefficients in C such that

ğkk = h1 ğ11 + h3 ğ33 + h4 ğ44 + h5 ğ55 + h7 ğ77 + h9 ğ99.    (6.48)

For any point (a, b) ∈ C⁶, this identity can be evaluated at (a, b), but that tells us
nothing about gkk(a, b). The transformation G defined by equation (6.31) maps the
open set D = {(z1, z2, z3, z4, z5, z6) : z2 z5 ≠ 0} one-to-one onto itself and has the
global inverse

F(a10, a01, a−13, b3,−1, b10, b01) = (a10/b10, a01, a−13, b3,−1, b10, b01/a01)

on D. Thus if (a, b) ∈ D, that is, if a01 b10 ≠ 0, then F is defined at (a, b) and when
(6.48) is evaluated at F(a, b), the left-hand side is gkk(a, b) and the right-hand side
is a sum of expressions of the form (letting Tj = Supp(hj) and Sj = Supp(gjj))

∑_{ν∈Tj} hj^{(ν)} (a10/b10)^{ν1} a01^{ν2} a−13^{ν3} b3,−1^{ν4} b10^{ν5} (b01/a01)^{ν6}
  × ∑_{ν∈Sj} gjj^{(ν)} (a10/b10)^{ν1} a01^{ν2+ν6} a−13^{ν3} b3,−1^{ν4} b10^{ν5+ν1} (b01/a01)^{ν6}
  = [fj(a, b) / (a01^{rj} b10^{sj})] gjj(a, b)

for some polynomials fj and constants rj, sj ∈ N0. Clearly the resulting identity

gkk = [f1(a, b)/(a01^{r1} b10^{s1})] g11 + [f3(a, b)/(a01^{r3} b10^{s3})] g33 + [f4(a, b)/(a01^{r4} b10^{s4})] g44
    + [f5(a, b)/(a01^{r5} b10^{s5})] g55 + [f7(a, b)/(a01^{r7} b10^{s7})] g77 + [f9(a, b)/(a01^{r9} b10^{s9})] g99

of analytic functions holds on D. Hence, for any (a, b) ∈ D, the set of germs
{g11, g33, g44, g55, g77, g99} in the ring G(a,b) of germs of complex analytic functions
at (a, b) is a minimal basis with respect to the retention condition of the
ideal ⟨gkk : k ∈ N⟩. Since parameter values of interest are those for which (6.30)
is the complexification of the real system (in complex form) (6.29), they satisfy
b10 = ā01, so the condition a01 ≠ 0 ensures that (a, b) is in D, and the set of germs
{g^R_{11}, g^R_{33}, g^R_{44}, g^R_{55}, g^R_{77}, g^R_{99}} in the ring G(A(a,b),B(a,b)) of germs of complex
analytic functions at (A(a, b), B(a, b)) is a minimal basis with respect to the retention
condition of the ideal ⟨g^R_{kk} : k ∈ N⟩. Consequently, Theorem 6.2.9 implies that the
cyclicity of the origin with respect to perturbation within the family of the form
(6.11) that corresponds to (6.29) (that is, allowing perturbation of the linear terms)
is at most six. If perturbations are constrained to remain within family (6.29), then,
by Lemma 6.2.8, the cyclicity is at most five. □
Our treatment of the cyclicity problem for family (6.29) is incomplete in two
ways: there remain the question of the sharpness of the bounds given by Proposition
6.3.6 (see Exercise 6.23 for an outline of how to obtain at least three small limit
cycles, or four when perturbation of the linear part is permitted) and the question
of what happens at all in the situation a01 = 0. Since family (6.29) was intended
only as an example, we will not pursue it any further, except to state and prove one
additional result concerning it, a result that we have chosen because we wish to
again bring to the reader’s attention the difference between bifurcation from a focus
and bifurcation from a center.
Proposition 6.3.7. Fix a particular system in family (6.29) corresponding to a
parameter value a. If a01 = 0 but a10⁴ a−13 − ā10⁴ ā−13 ≠ 0, then the cyclicity of the origin
with respect to perturbation within family (6.29) is at most three. If perturbation of
the linear part is allowed, then the cyclicity is at most four.
Proof. Form the complexification of (6.29), system (6.30) with bqp = āpq. The focus
quantities for family (6.30) are listed at the beginning of the proof of Theorem 3.7.2.
When a01 = 0 (hence b10 = ā01 = 0) the first four of them are zero and

g55 = −(7i/60) a−13 b3,−1 (a10⁴ a−13 − b01⁴ b3,−1) = −(7i/60) |a−13|² (a10⁴ a−13 − ā10⁴ ā−13).

Since a10⁴ a−13 − ā10⁴ ā−13 ≠ 0 only if |a−13| ≠ 0, the second condition in the hypothesis
ensures that g55 ≠ 0, so that the origin is not a center but a fine focus of order
five. An argument similar to the proof of Corollary 6.1.3 gives the result. □
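As a sanity check on the displayed expression for g55 (a sketch of our own, with arbitrary sample values): under the reality conditions b3,−1 = ā−13 and b01 = ā10, the two forms agree and the value is real.

```python
# Sample parameter values; the conjugation conditions encode b_{qp} = conj(a_{pq}).
a10, am13 = 1.3 - 0.7j, -0.4 + 2.1j
b01, b3m1 = a10.conjugate(), am13.conjugate()

g55_complex = -7j/60 * am13 * b3m1 * (a10**4 * am13 - b01**4 * b3m1)
g55_real = -7j/60 * abs(am13)**2 * (a10**4*am13 - (a10**4*am13).conjugate())

assert abs(g55_complex - g55_real) < 1e-9   # the two displayed forms agree
assert abs(g55_complex.imag) < 1e-9         # g55 is real on the real family
```

Realness is expected: X − conj(X) is purely imaginary, so multiplying by −7i/60 times the real factor |a−13|² gives a real number.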

6.4 Bifurcations of Critical Periods

Suppose that a particular element of the family of systems of real differential equations
(6.12), corresponding to a specific string a∗ of the coefficients of the complex
form (6.12b), has a center at the origin. The series on the right-hand side of the
expansion (4.7) for the period function T(r) (Definition 4.1.1) for (6.12) defines a
real analytic function on a neighborhood of 0 in R, whose value for r > 0 is the least
period of the trajectory of (4.1) (which is the same as (6.12a)) through the point with
coordinates (r, 0). We will still call this extended function the period function and
continue to denote it by T(r). By Proposition 4.2.4, r = 0 is a critical point of T, a
point at which T′ vanishes. In this context any value r > 0 for which T′(r) = 0 is
called a critical period. The question that we address in this section is the maximum
number of zeros of T′ that can bifurcate from the zero at r = 0, that is, the maximum
number of critical periods that can lie in a sufficiently small interval (0, ε0), when
the coefficient string a is perturbed from a∗ but remains within the center variety
VCR of the original family (6.12). This is called the problem of bifurcations of critical
periods, which was first considered by C. Chicone and M. Jacobs ([45]), who
investigated it for quadratic systems and for some general Hamiltonian systems. In
this section we show how this problem can be examined by applying the techniques
and results that we have developed in earlier sections to the complexification (6.15)
of (6.12b). From the standpoint of the methods used, the problem of bifurcation of
critical periods is analogous to the problem of small limit cycle bifurcations when
the parameter space is not R^{n²+3n} but an affine variety in R^{n²+3n}. To be specific, the
problem of the cyclicity of antisaddles of focus or center type in polynomial systems,
studied in the previous two sections, is the problem of the multiplicity of the
function F(z, θ) of (6.2) when this function is the difference function P of Definition
3.1.3 and E = R^{n²+3n} is the space of parameters of a two-dimensional system
of differential equations whose right-hand sides are polynomials of degree n without
constant terms. The problem of bifurcations of critical periods is the problem
of the multiplicity of the function F(r, (a, ā)) = T′(r) (there is now the parameter
pair (a, b) = (a, ā), since we have complexified), where T(r) is the period function
given by equations (4.7) and (4.26), and E is not the full space R^{n²+3n} but the center
variety VCR of family (6.12).
At the outset we make the following simplifying observation. As presented so
far, the problem is to investigate the multiplicity of the function T′(r), which, by
(4.26), is T′(r, (a, ā)) = ∑_{k=1}^∞ 2k p2k(a, ā) r^{2k−1}. Using Bautin's method of Section
6.1, this amounts to analyzing the ideal ⟨2k p2k : k ∈ N⟩ ⊂ C[a, b]. But this is the
same ideal as ⟨p2k : k ∈ N⟩ ⊂ C[a, b], which arises from the study of the multiplicity
of the function

T(r, (a, ā)) = T(r) − 2π = ∑_{k=1}^∞ p2k(a, ā) r^{2k},    (6.49)

whose zeros count the number of small cycles that maintain their original period 2π
after perturbation. Fix an initial parameter string (a∗, b∗) = (a∗, ā∗) ∈ VCR. For any
specific parameter string (a, ā) ∈ VCR near (a∗, b∗), the number of critical periods and
the number of cycles of period 2π, in a neighborhood of (0, 0) of radius ε0 > 0, may
differ, but the upper bounds on the number of critical periods and on the number
of period-2π cycles that arise from examining the ideal presented with generators
{2k p2k} on the one hand and examining the same ideal but presented with generators
{p2k} on the other are the same. Thus we can avoid taking derivatives and work with
the function T of (6.49) when we use Bautin's method to obtain an upper bound on
the number of critical periods.
According to Proposition 6.1.2, if we can represent the function T (r, (a, ā)) in
the form (6.4), then the multiplicity of T at any point is at most s − 1. We are
interested in bifurcations of critical periods for systems from VCR; however, we consider
this variety as enclosed in VC and look for representation (6.4) on components of
the center variety VC . Because the p2k are polynomials with real coefficients, if such
a representation exists in C[a, b], then it also exists in R[a, b].
We will build our discussion here around systems of the form u̇ = −v + U(u, v),
v̇ = u + V (u, v), where U and V are homogeneous cubic polynomials (or one of
them may be zero). We will find that a sharp upper bound for the number of critical
periods that can bifurcate from any center in this family (while remaining in this
family and still possessing a center at the origin) is three (Theorem 6.4.10). We
begin with the statement and proof of a pair of lemmas that are useful in general.

Lemma 6.4.1. Suppose system (a∗, b∗) = (a∗, ā∗) corresponds to a point of the center
variety VCR of family (6.15) that lies in the intersection of two subsets A and B
of VCR and that there are parametrizations of A and B in a neighborhood of (a∗, ā∗)
such that the function T(r, (a, ā)) can be written in the form (6.4) with s = mA and
s = mB, respectively. Then the multiplicity of T(r, (a, ā)) with respect to A ∪ B is
at most max(mA − 1, mB − 1).

Proof. Apply Proposition 6.1.2 to T(r, (a, ā)) with E = A and to T(r, (a, ā)) with
E = B separately. □

The second lemma, a slight modification of a result of Chicone and Jacobs, pro-
vides a means of bounding the number of bifurcating critical periods in some cases
without having to find a basis of the ideal of isochronicity quantities at all.

Lemma 6.4.2. Let F(z, θ) be defined by (6.2) and convergent in a neighborhood
U of the origin in R × C^n and such that each function fj(θ) is a homogeneous
polynomial satisfying deg(fj) > deg(f0) for j > 0. Suppose there exists a closed set
B in C^n that is closed under rescaling by nonnegative real constants and is such
that |f0(θ)| > 0 for all θ ∈ B \ {0}. Then there exist ε > 0 and δ > 0 such that for
each fixed θ ∈ B that satisfies 0 < |θ| < δ, the equation F(z, θ) = 0, regarded as
an equation in z alone, has no solutions in the interval (0, ε).

Proof. For any θ for which f0(θ) ≠ 0, we can write

F(z, θ) = f0(θ) [1 + (f1(θ)/f0(θ)) z + (f2(θ)/f0(θ)) z² + ···].

The hypotheses on the fj imply that fj(θ)/f0(θ) → 0 as |θ| → 0, which suggests
the truth of the lemma, at least when B = C^n. Thus let the function G be defined on
U ∩ (R × B) by

G(z, θ) = 1 if θ = 0;  G(z, θ) = F(z, θ)/f0(θ) if θ ≠ 0.
We will show that G is continuous on a neighborhood of the origin in R × B. Since
for any (z, θ) ∈ U ∩ (R × B), G(z, θ) = 1 if z = 0, this implies that ε and δ as in
the conclusion of the lemma exist for G. But also for any (z, θ) ∈ U ∩ (R × B), for
θ ≠ 0, F(z, θ) = 0 only if G(z, θ) = 0, so the conclusion of the lemma then holds
for F with the same ε and δ. Clearly continuity of G is in question only at points
in the set {(z, θ) : (z, θ) ∈ U ∩ (R × B) and θ = 0}.
Let D = {θ : |θ| ≤ R}, where R is a fixed real constant chosen such that F
converges on the compact set [−R, R] × D, and let M denote the supremum of |F|
on [−R, R] × D. The Cauchy Inequalities give |fj(θ)| ≤ M/R^j for all θ ∈ D.
By the hypothesis that B is closed under rescaling by nonnegative real constants,
any θ ∈ D ∩ B can be written as θ = ρθ′ for ρ ∈ [0, 1] and θ′ ∈ B satisfying
|θ′| = R, and this representation is unique for nonzero θ. Thus if, for j ∈ N0, fj is
homogeneous of degree mj ∈ N0, then

|fj(θ)| = |fj(ρθ′)| = ρ^{mj} |fj(θ′)| ≤ ρ^{mj} M/R^j.

Since f0(θ) ≠ 0 for θ ∈ B \ {0} and {θ : |θ| = R} ∩ B is compact, we have that
m := inf{|f0(θ′)| : |θ′| = R and θ′ ∈ B} is positive, and for all θ ∈ (D ∩ B) \ {0},

|f0(θ)| = |f0(ρθ′)| = ρ^{m0} |f0(θ′)| ≥ m ρ^{m0}.

For j ∈ N, set m′j = mj − m0 > 0. Combining the displayed estimates gives, for
θ ∈ (D ∩ B) \ {0} and |z| ≤ R/2,

|fj(θ)/f0(θ)| |z|^j ≤ [(ρ^{mj} M/R^j)/(m ρ^{m0})] (R/2)^j = ρ^{m′j} 2^{−j} M/m ≤ 2^{−j} (M/m) ρ.

Thus

|∑_{j=0}^∞ (fj(θ)/f0(θ)) z^j − 1| ≤ (M/m) ρ ∑_{j=1}^∞ 2^{−j} = (M/m) ρ,

so that, given η > 0, |G(z, θ) − 1| < η once |θ| = ρR (hence ρ) is sufficiently
small, and G is continuous on [−R/2, R/2] × (D ∩ B). □
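To make the mechanism of the lemma concrete, here is a toy instance of our own construction (not from the text): f0(θ) = θ1² + θ2² has degree 2, f1(θ) = θ1³ has degree 3 > 2, F(z, θ) = f0(θ) − f1(θ)z, and B = R². The only zero is z* = f0/f1, which satisfies |z*| ≥ 1/|θ1|, so small θ pushes every zero out of a fixed interval (0, ε).

```python
import itertools, math

eps, delta = 1.0, 0.5
steps = [k / 40 for k in range(-19, 20)]   # grid of theta values with |theta_i| < delta

for th1, th2 in itertools.product(steps, steps):
    if (th1, th2) == (0.0, 0.0) or math.hypot(th1, th2) >= delta:
        continue
    f0 = th1**2 + th2**2
    f1 = th1**3
    if f1 != 0:
        zstar = f0 / f1            # the unique zero of F(., theta)
        assert not (0 < zstar < eps)
    # if f1 == 0, F(z, theta) = f0 > 0 has no zero at all
```

The inequality driving the example is |z*| ≥ θ1²/|θ1|³ = 1/|θ1| > 1/δ = 2 > ε, mirroring how the lemma's ratio estimate forces zeros away from 0.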

The first step in investigating the bifurcation of critical periods of a center of
a real family (6.12) is to write the family in complex form (6.12b) and form the
complexification (6.15). The complex form of (6.12a) when the nonlinearities are
homogeneous cubic polynomials (perhaps one of them zero) is

ẋ = i(x − a20 x³ − a11 x² x̄ − a02 x x̄² − a−13 x̄³)    (6.50)

for apq ∈ C. The complexification of this family is

ẋ = i(x − a20 x³ − a11 x² y − a02 x y² − a−13 y³)
ẏ = −i(y − b3,−1 x³ − b20 x² y − b11 x y² − b02 y³).    (6.51)

In general, the center and isochronicity varieties must be found next. It would be too
lengthy a process to derive them for system (6.51) here, so we will simply describe
them in the following two theorems (but see Example 1.4.13). Characterization of
centers in family (6.15) was first given by Sadovskii in [163], where a derivation
can be found, although his conditions differ from those listed here. (The real case
was first completely solved by Malkin; see [133].) The conditions in Theorem 6.4.4
are close to those derived in [58]; see also [141]. The conditions in both theorems
are for family (6.51) without the restriction bqp = ā pq that holds when (6.51) arises
as the complexification of a real family.

Theorem 6.4.3. The center variety VC of family (6.51) consists of the following
three irreducible components:
1. V(C1), where C1 = ⟨a11 − b11, 3a20 − b20, 3b02 − a02⟩;
2. V(C2), where C2 = ⟨a11, b11, a20 + 3b20, b02 + 3a02, a−13 b3,−1 − 4a02 b20⟩;
3. V(C3), where C3 = ⟨a20² a−13 − b3,−1 b02², a20 a02 − b20 b02,
   a20 a−13 b20 − a02 b3,−1 b02, a11 − b11, a02² b3,−1 − a−13 b20²⟩.

Systems from C1 are Hamiltonian, systems from C2 have a Darboux first integral,
and systems from C3 are reversible.

Theorem 6.4.4. The isochronicity and linearizability variety VI = VL of family
(6.51) consists of the following seven irreducible components:
1. V(a11, a02, a−13, b02, b11);
2. V(a20, a11, b11, b20, b3,−1);
3. V(a11, a02, a−13, b11, b20, b3,−1);
4. V(a11, a02, b02, b11, b3,−1, a20 + 3b20);
5. V(a20, a11, a−13, b11, b20, 3a02 + b02);
6. V(a11, a−13, b11, b3,−1, a20 + b20, a02 + b02);
7. V(a11, b11, 3a20 + 7b20, 7a02 + 3b02, 112b20³ + 27b3,−1² b02,
   49a−13 b20² − 9b3,−1 b02², 21a−13 b3,−1 + 16b20 b02, 343a−13² b20 + 48b02³).

Combining these theorems, we can identify all isochronous centers in family
(6.50) (Exercise 6.29), which will be important at one point in the analysis.
In general, the center variety of a family of systems is presented, as is the case
with Theorem 6.4.3 (and Theorems 3.7.1 and 3.7.2), as a union of subvarieties (typically
irreducible), VC = ∪_{j=1}^s Vj. If (a∗, b∗) ∈ V1 \ ∪_{j=2}^s Vj, then because ∪_{j=2}^s Vj is a
closed set, any sufficiently small perturbation of (a∗, b∗) that remains in VC must lie
in V1, and similarly if (a∗, b∗) lies in exactly one of the subvarieties Vj0 for any
other value of j0. This observation motivates the following definition of a proper
perturbation and shows why we may restrict our attention to proper perturbations in
the investigation of the bifurcation of critical periods.
Definition 6.4.5. Suppose the center variety of a family (6.15) decomposes as a
union of subvarieties VC = ∪_{j=1}^s Vj and that (a∗, b∗) ∈ Vj0. A perturbation (a, b)
of (a∗, b∗) is a proper perturbation with respect to Vj0 if the perturbed system (a, b)
also lies in Vj0 (same index j0).
The isochronicity quantities of the complexification (6.15) of a general system
(6.12) are defined implicitly by equation (4.26) in terms of the quantities H̃2k+1,
which in turn are defined in terms of the coefficients Y1^{(k+1,k)} and Y2^{(k,k+1)} of the
distinguished normalization of (6.15). The ideal in C[a, b] that they generate is denoted
by Y (see (4.28)); we let Yk denote the ideal generated by just the first k pairs:
Yk = ⟨Y1^{(2,1)}, Y2^{(1,2)}, . . . , Y1^{(k+1,k)}, Y2^{(k,k+1)}⟩. Similarly, we let P = ⟨p2k : k ∈ N⟩ denote
the ideal generated by the isochronicity quantities and Pk = ⟨p2, . . . , p2k⟩ the ideal
generated by just the first k of them. Implementing the Normal Form Algorithm (Table
2.1 on page 75) on a computer algebra system, it is a straightforward matter to
derive the normal form of (6.15) through low order. For our model system (6.50) and
(6.51), the first few nonlinear terms for which we will need explicit expressions are
listed in Table 6.2 on page 292. Using these expressions, we can obtain by means of
(4.42) expressions for p2, p4, p6, and p8, which are listed in Table 6.3 on page 293.
(The reader can compute p10, which will also be needed, but is too long to list.)
In the coordinate ring C[VC], P4 = P5 = P6, which suggests the following result.
Lemma 6.4.6. For system (6.51),

VL = V(Y4) = VC ∩ V(P4) = VI.    (6.52)

Proof. Using Singular ([89]), we computed the primary decomposition of Y4 and
found that the associated primes are given by the polynomials defining the varieties
(1)–(7) of Theorem 6.4.4, which implies that VL = V(Y4). The next equality
follows by Proposition 4.2.14. The last equality is Proposition 4.2.7. □
We now consider bifurcation of critical periods under proper perturbation with
respect to each of the three components of the center variety of family (6.51) iden-
tified in Theorem 6.4.3. Our treatment of the component V(C1 ) of VC illustrates
techniques for finding a basis of the ideal P in the relevant coordinate ring. See Ex-
ercise 6.30 for a different approach that leads to an upper bound of one, which is
sharp (Exercise 6.31).
Y1^{(2,1)} = −i a11
Y2^{(1,2)} = i b11
Y1^{(3,2)} = i(−4a02 a20 − 4a02 b20 − 3a−13 b3,−1)/4
Y2^{(2,3)} = i(4a02 b20 + 4b02 b20 + 3a−13 b3,−1)/4
Y1^{(4,3)} = i(−6a−13 a20² + 4a11 a20 b02 − 16a02 a20 b11 − 20a02 a11 b20 − 24a−13 a20 b20
    − 8a02 b11 b20 − 18a−13 b20² − 24a02² b3,−1 − 11a11 a−13 b3,−1 − 8a02 b02 b3,−1
    − 6a−13 b11 b3,−1)/16
Y2^{(3,4)} = i(−4a20 b02 b11 + 8a02 a11 b20 + 8a−13 a20 b20 + 16a11 b02 b20 + 20a02 b11 b20
    + 24a−13 b20² + 18a02² b3,−1 + 6a11 a−13 b3,−1 + 24a02 b02 b3,−1 + 6b02² b3,−1
    + 11a−13 b11 b3,−1)/16
Y1^{(5,4)} = i(144a02 a11² a20 + 144a11 a−13 a20² − 96a11² a20 b02 + 96a02 a11 a20 b11 − 216a−13 a20² b11
    + 240a11 a20 b02 b11 − 432a02 a20 b11² − 288a02 a11² b20 − 288a02² a20 b20
    − 72a11 a−13 a20 b20 − 192a02 a20 b02 b20 − 240a02 a11 b11 b20 − 576a−13 a20 b11 b20
    − 192a02² b20² − 516a11 a−13 b20² − 96a02 b02 b20² − 234a−13 b11 b20² − 582a02² a11 b3,−1
    − 132a11² a−13 b3,−1 − 660a02 a−13 a20 b3,−1 − 192a02 a11 b02 b3,−1
    − 144a−13 a20 b02 b3,−1 − 336a02² b11 b3,−1 − 120a11 a−13 b11 b3,−1 + 24a02 b02 b11 b3,−1
    − 18a−13 b11² b3,−1 − 1120a02 a−13 b20 b3,−1 − 300a−13 b02 b20 b3,−1
    − 81a−13² b3,−1²)/192
Y2^{(4,5)} = i(96a20 b02 b11² − 240a11 a20 b02 b11 + 96a02² a20 b20 − 24a11 a−13 a20 b20 + 432a11² b02 b20
    + 192a02 a20 b02 b20 + 240a02 a11 b11 b20 + 192a−13 a20 b11 b20 − 96a11 b02 b11 b20
    + 288a02 b11² b20 − 144b02 b11² b20 + 192a02² b20² + 336a11 a−13 b20² + 288a02 b02 b20²
    + 582a−13 b11 b20² + 234a02² a11 b3,−1 + 18a11² a−13 b3,−1 + 300a02 a−13 a20 b3,−1
    + 576a02 a11 b02 b3,−1 + 144a−13 a20 b02 b3,−1 + 216a11 b02² b3,−1 + 516a02² b11 b3,−1
    + 120a11 a−13 b11 b3,−1 + 72a02 b02 b11 b3,−1 − 144b02² b11 b3,−1 + 132a−13 b11² b3,−1
    + 1120a02 a−13 b20 b3,−1 + 660a−13 b02 b20 b3,−1 + 81a−13² b3,−1²)/192.

Table 6.2 Normal Form Coefficients for System (6.51)

Lemma 6.4.7. For family (6.51) and the decomposition of VC given by Theorem
6.4.3, the first four isochronicity quantities form a basis of the ideal P = hp2k : k ∈ Ni
in C[V(C1 )]. At most three critical periods bifurcate from centers of the component
V(C1 ) (the Hamiltonian centers) under proper perturbations.

Proof. On the component V(C1 ) of Hamiltonian systems we have

b11 = a11 , a20 = b20 /3, b02 = a02 /3, (6.53)


6.4 Bifurcations of Critical Periods 293

p2 = (1/2)(a11 + b11 )
p4 = −(1/4)(a11 + b11 )^2 + (1/4)(2a02 a20 + 4a02 b20 + 2b02 b20 + 3a−13 b3,−1 )
p6 = (1/8)(a11 + b11 )^3
    + (1/32)(−4a11 a20 b02 + 24a02 a20 b11 + 44a02 a11 b20 + 32a−13 a20 b20
    + 44a02 b11 b20 + 29a11 a−13 b3,−1 + 32a02 b02 b3,−1 + 29a−13 b11 b3,−1
    − 4a20 b02 b11 + 24a11 b02 b20 + 8a11 a02 a20 + 8b11 b02 b20
    + 6a−13 a20^2 + 42a−13 b20^2 + 42a02^2 b3,−1 + 6b02^2 b3,−1 )
p8 = (1/16)(a11 + b11 )^4
    + (1/192)(−36a11 a−13 a20^2 + 24a11^2 a20 b02 + 192a02 a11 a20 b11
    − 288a11 a20 b02 b11 + 216a11 a−13 a20 b20 + 288a02 a20 b02 b20
    + 864a02 a11 b11 b20 + 576a−13 a20 b11 b20 + 624a02 a−13 a20 b3,−1
    + 576a02 a11 b02 b3,−1 + 144a−13 a20 b02 b3,−1 + 540a11 a−13 b11 b3,−1
    + 216a02 b02 b11 b3,−1 + 1408a02 a−13 b20 b3,−1 + 624a−13 b02 b20 b3,−1
    + 192a11 b02 b11 b20 + 189a−13^2 b3,−1^2 + 384a02^2 b20^2 + 144a−13 a20^2 b11
    + 384a02 a20 b11^2 + 456a02 a11^2 b20 + 384a02^2 a20 b20 + 678a11 a−13 b20^2
    + 384a02 b02 b20^2 + 660a−13 b11 b20^2 + 660a02^2 a11 b3,−1 + 285a11^2 a−13 b3,−1
    + 678a02^2 b11 b3,−1 + 285a−13 b11^2 b3,−1 + 24a20 b02 b11^2
    + 384a11^2 b02 b20 + 456a02 b11^2 b20 + 144a11 b02^2 b3,−1
    − 36b02^2 b11 b3,−1 + 48a02^2 a20^2 + 48b02^2 b20^2 )

Table 6.3 Isochronicity Quantities for System (6.51)

so that when we make these substitutions in p2k , we obtain a polynomial pe2k that is
in the same equivalence class [[p2k ]] in C[V(C1 )] as p2k . For example, by Table 6.3,
pe2 = a11 and pe4 = −a11^2 + (4/3)a02 b20 + (3/4)a−13 b3,−1 . The polynomials pe2k so
obtained may be viewed as elements of C[a11 , a02 , a−13 , b3,−1 , b20 ]. Working in this
ring and recalling Remark 1.2.6, we apply the Multivariable Division Algorithm to
obtain the equivalences

pe4 ≡ (4/3) a02 b20 + (3/4) a−13 b3,−1 mod h pe2 i
pe6 ≡ (5/3) a−13 b20^2 + (5/3) a02^2 b3,−1 mod h pe2 , pe4 i          (6.54)
pe8 ≡ −(105/32) a−13^2 b3,−1^2 mod h pe2 , pe4 , pe6 i
pe10 ≡ 0 mod h pe2 , pe4 , pe6 , pe8 i.

(The actual computation each time is to compute a Gröbner basis G of the ideal in
question and reduce pe2k modulo G.)
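The reduction described in this parenthetical can be imitated with sympy, using the expressions for pe2 and pe4 quoted above. The variable names am13 and b3m1 (for a−13 and b3,−1) are mine, and this is only a sketch of the computation, not the authors' actual Singular session:

```python
from sympy import Rational, symbols, groebner, reduced, expand

# stand-ins for a11, a02, a_{-1,3}, b_{3,-1}, b20 (names are mine)
a11, a02, am13, b3m1, b20 = symbols('a11 a02 am13 b3m1 b20')

p2t = a11                                                          # pe2
p4t = -a11**2 + Rational(4, 3)*a02*b20 + Rational(3, 4)*am13*b3m1  # pe4

# Groebner basis of <pe2>, then reduction of pe4 modulo it
G = groebner([p2t], a11, a02, am13, b3m1, b20, order='grevlex')
_, rem = reduced(p4t, list(G), a11, a02, am13, b3m1, b20, order='grevlex')
# the a11**2 term drops out, leaving (4/3) a02 b20 + (3/4) am13 b3m1
```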
To prove the first statement of the lemma, we must show that pe2k ∈ h pe2 , pe4 , pe6 , pe8 i
for all k ≥ 6. By Proposition 4.2.11,

pe2k = (1/2) ∑_{ν : L(ν)=(k,k)} p2k^(ν) ([ν] + [νb]).          (6.55)

In each summand the string ν that appears thus lies in the monoid M associated to
family (6.51). We select the order a20 > a11 > a02 > a−13 > b3,−1 > b20 > b11 > b02
on the coefficients and use the algorithm of Table 5.1 on page 235 to construct a
Hilbert basis H of the monoid M of system (6.51). The result is

H = {(2001 0000), (0000 1002), (1010 0000), (0000 0101),
(1001 0100), (0010 1001), (0020 1000), (0001 0200), (0100 0000),
(0000 0010), (1000 0001), (0010 0100), (0001 1000)}.
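As a plausibility check on the displayed list, each string ν in H should be balanced, L1(ν) = L2(ν). The pure-Python snippet below assumes the weight convention that a_pq contributes (p, q) and b_qp contributes (q, p) to L; that reading of the definitions is mine:

```python
# coefficient order: a20, a11, a02, a_{-1,3}, b_{3,-1}, b20, b11, b02
weights = [(2, 0), (1, 1), (0, 2), (-1, 3),
           (3, -1), (2, 0), (1, 1), (0, 2)]

H = ["20010000", "00001002", "10100000", "00000101", "10010100",
     "00101001", "00201000", "00010200", "01000000", "00000010",
     "10000001", "00100100", "00011000"]

def L(nu):
    """L(nu) = componentwise sum of nu_j times the j-th weight."""
    L1 = sum(int(c) * p for c, (p, q) in zip(nu, weights))
    L2 = sum(int(c) * q for c, (p, q) in zip(nu, weights))
    return (L1, L2)

balanced = [L(nu)[0] == L(nu)[1] for nu in H]
```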

Thus for example [0020 1000] = a02^2 b3,−1 . We name this string α and similarly set
β = (0010 0100) and γ = (0001 1000). Under the substitutions (6.53) every element
of H reduces to a rational constant times one of α , α b , β , or γ . Thus any ν appearing
in (6.55) with k ≥ 6 can be written in the form (see Remark 5.1.18 for the notation)
ν = rα + sαb + mβ + nγ , so [ν] = [α]^r [αb]^s [β]^m [γ]^n . In fact, pe2k can be written as a
sum of terms p2k^(ν) ([ν] + [νb]) in such a way that αb does not appear in ν , this time
using the identity [α][αb] = [β]^2 [γ]. For if r ≥ s, then we can factor out a power
of [α] to form [ν] = [α]^(r−s) ([α][αb])^s [β]^m [γ]^n = [α]^(r−s) [β]^(m+2s) [γ]^(n+s) . If r < s, then
we simply replace the summand p2k^(ν) ([ν] + [νb]) with p2k^(ν) ([µ] + [µb]), where µ = νb,
altered so as to eliminate αb (this is really just a matter of changing the description
of the indexing set on the sum in (6.55)). But since β and γ are self-conjugate, we
then have that for k ≥ 6, pe2k is a sum of terms of the form a rational constant times

[ν] + [νb] = ([α]^r + [αb]^r ) [β]^m [γ]^n .          (6.56)

Moreover, since L(α ) = (3, 3) and L(β ) = L(γ ) = (2, 2), L1 (ν ) = 3r + 2m + 2n,
from which it follows that either r or m + n is at least two, since L1 (ν ) ≥ 6.
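The counting step here is easy to confirm exhaustively (pure Python; the search bound 8 is an arbitrary choice of mine):

```python
# L(alpha) = (3,3) and L(beta) = L(gamma) = (2,2), so L1(nu) = 3r + 2m + 2n.
# Confirm the claim: L1(nu) >= 6 forces r >= 2 or m + n >= 2.
bad = [(r, m, n)
       for r in range(8) for m in range(8) for n in range(8)
       if 3*r + 2*m + 2*n >= 6 and not (r >= 2 or m + n >= 2)]
```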
Suppose r ≥ 2. Then using the formula
[µ + σ] + [(µ + σ)b] = (1/2)([σ] + [σb])([µ] + [µb]) + (1/2)([µ] − [µb])([σ] − [σb])
we obtain

[ν] + [νb] = [(r − 1)α + α] + [(r − 1)αb + αb]
    = (1/2)([α] + [αb])([(r − 1)α] + [(r − 1)αb])
        + (1/2)([(r − 1)α] − [(r − 1)αb])([α] − [αb])
    = (1/2)([α] + [αb])([α]^(r−1) + [αb]^(r−1)) + (1/2)([α]^(r−1) − [αb]^(r−1))([α] − [αb]).

The second congruence in (6.54) implies that the first factor in the first term in this
sum is in h pe2 , pe4 , pe6 i. If r = 2, the second term in the sum is (1/2)([α] − [αb])^2 , which
a computation shows to be in h pe2 , pe4 , pe6 , pe8 i. Otherwise, the second term can be
written

(1/2)([α] − [αb])^2 ([α]^(r−2) + [α]^(r−3) [αb] + · · · + [α][αb]^(r−3) + [αb]^(r−2) ),

which is in h pe2 , pe4 , pe6 , pe8 i because ([α] − [αb])^2 is.
Suppose now that in (6.56) m + n ≥ 2. If m = 0, then

[γ]^n = (a−13 b3,−1 )^n = (a−13 b3,−1 )^2 (a−13 b3,−1 )^(n−2) ≡ −(32/105) pe8 (a−13 b3,−1 )^(n−2) mod h pe2 , pe4 , pe6 i

by the third congruence in (6.54). If n = 0, then

[β]^m = (a02 b20 )^m = (a02^2 b20^2 )(a02 b20 )^(m−2) ≡ 0 mod h pe2 , pe4 , pe6 , pe8 i

since a computation shows that a02^2 b20^2 ≡ 0 mod h pe2 , pe4 , pe6 , pe8 i. If mn ≠ 0, then

β^m γ^n = (a02 b20 )^m (a−13 b3,−1 )^n = (a02 b20 a−13 b3,−1 )(a02 b20 )^(m−1) (a−13 b3,−1 )^(n−1) ≡ 0 mod h pe2 , pe4 , pe6 , pe8 i

since another computation shows that a02 b20 a−13 b3,−1 ≡ 0 mod h pe2 , pe4 , pe6 , pe8 i.
This proves the first statement of the lemma. Because we have been working with
the pe2k as elements of C[a11 , a02 , a−13 , b3,−1 , b20 ] without any conditions on a11 , a02 ,
a−13 , b3,−1 , or b20 , so that they are true indeterminates, the second statement of the
lemma follows from what we have just proven and Theorem 6.1.7. □
For the component V(C2 ) of the center variety for family (6.51) we do not use
the Bautin method but apply Lemma 6.4.2 when the original system is isochronous
and a simple analysis otherwise.
Lemma 6.4.8. No critical periods of system (6.50) bifurcate from centers of the
component V(C2 ) of the center variety (centers that have a Darboux first integral)
under proper perturbation.
Proof. Since the first two generators of C2 are just a11 and b11 , we may restrict
attention to the copy of C6 lying in C8 that corresponds to a11 = b11 = 0, replace
p2k by pe2k (a20 , a02 , a−13 , b3,−1 , b20 , b02 ) = p2k (a20 , 0, a02 , a−13 , b3,−1 , b20 , 0, b02 ) for
k ≥ 2, and dispense with p2 altogether. In the remainder of the proof of the lemma
we will let (a, b) denote (a20 , a02 , a−13 , b3,−1 , b20 , b02 ). By Proposition 4.2.12, p2k
is a homogeneous polynomial of degree k, and clearly the same is true for pe2k ,
regarded as an element of C[a, b].
Since the generators of C2 are homogeneous polynomials, V(C2 ), now regarded
as lying in C6 , is closed under rescaling by elements of R and is, of course, topo-
logically closed. The same is clearly true of V(C2 )R = V(C2 ) ∩ {(a, b) : b = ā}.
The value of pe4 , viewed as a function, at any point of V(C2 )R is |a02|^2 , which
we obtained by replacing a20 by −3b20 , b02 by −3a02 , a−13 b3,−1 by 4a02 b20 , and
then replacing b20 by ā02 . The polynomial function pe4 takes the value zero only
on the string (a, ā) = (0, 0), since in V(C2 )R we have |a−13|^2 = |a02|^2 . If the initial string
(a∗ , a∗ ) of coefficients in V(C2 )R is zero, meaning that it corresponds to the linear
system in family (6.50) (which, by Exercise 6.29(b), is the only isochronous center
in V(C2 )), the conditions of Lemma 6.4.2 are satisfied (with a change of indices)

with B = V(C2 )R and F = T ′ , hence for some small ε the equation T ′ (r, (a, ā)) = 0
has no solutions for 0 < r < ε for (a, ā) ∈ V(C2 )R with |(a, ā)| > 0 sufficiently small.
If the initial string (a∗ , a∗ ) of coefficients in V(C2 )R is nonzero, then pe4 is nonzero
on a neighborhood U of (a∗ , a∗ ) in V(C2 )R , and for any string (a, ā) that lies in
U, T ′ (r, (a, ā)) = 4r^3 ( pe4 (a, ā) + · · · ) has no zeros in 0 < r < ε . In either case no
bifurcation of critical periods is possible on V(C2 ). 
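The substitutions quoted in this proof can be spot-checked numerically. The sketch below evaluates p4 from Table 6.3 (with a11 = b11 = 0) at random points built by those substitutions and compares it with |a02|^2; the construction of the points is my own reading of the proof:

```python
import random

random.seed(1)
for _ in range(100):
    z = complex(random.uniform(-1, 1), random.uniform(-1, 1))  # a02
    b20 = z.conjugate()          # b20 = conjugate of a02
    a20 = -3 * b20               # substitutions quoted in the proof
    b02 = -3 * z
    am13_b3m1 = 4 * z * b20      # the product a_{-1,3} b_{3,-1}
    # p4 from Table 6.3 with a11 = b11 = 0
    p4 = 0.25 * (2*z*a20 + 4*z*b20 + 2*b02*b20 + 3*am13_b3m1)
    assert abs(p4 - abs(z)**2) < 1e-12
```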
The third component of the center variety for family (6.51) has such a com-
plicated expression that we analyze the ideal P as an ideal in the coordinate ring
C[V(C3 )] only in conjunction with several parametrizations of V(C3 ).

Lemma 6.4.9. At most three critical periods of system (6.50) bifurcate from cen-
ters of the component V(C3 ) of the center variety (reversible systems) under proper
perturbation, and this bound is sharp.

Proof. Refer to Example 1.4.19. Proceeding as we did there, we derive a mapping
F1 : C5 → C8 defined by

a20 = sw        b3,−1 = tw^2
a11 = u         b20 = w
a02 = v         b11 = u          (6.57)
a−13 = tv^2     b02 = sv

and readily verify that F1 (C5 ), the image of C5 under F1 , lies in V(C3 ). To use
Theorem 1.4.14 to show that F1 defines a polynomial parametrization of V(C3 ), we
compute a Gröbner basis of the ideal

ha11 − u, b11 − u, a20 − sw, b02 − sv, a−13 − tv^2 , b3,−1 − tw^2 , a02 − v, b20 − wi
(6.58)
with respect to lex with

t > s > v > w > u > a11 > b11 > a20 > b02 > a02 > b20 > a−13 > b3,−1 ,

and obtain

{a−13 b20^2 − a02^2 b3,−1 , −a−13 a20 b20 + a02 b02 b3,−1 , −a02 a20 + b02 b20 ,
− a−13 a20^2 + b02^2 b3,−1 , −a11 + b11 , b11 − u, b20 − w, a02 − v, −a20 + b20 s,
− b02 + a02 s, b3,−1 − b20^2 t, a−13 − a02^2 t, −a−13 s + a02 b02 t,
a−13 s^2 − b02^2 t, −b3,−1 s + a20 b20 t, b3,−1 s^2 − a20^2 t}.
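This implicitization computation can be reproduced in miniature with sympy; the snippet below checks that the first relation of the displayed basis indeed lies in the ideal (6.58). Variable names are mine, and sympy's lex Gröbner basis merely stands in for the computation described in the text:

```python
from sympy import symbols, groebner, reduced

t, s, v, w, u = symbols('t s v w u')
(a20, a11, a02, am13,
 b3m1, b20, b11, b02) = symbols('a20 a11 a02 am13 b3m1 b20 b11 b02')

# parameters first, so lex elimination removes t, s, v, w, u
gens = (t, s, v, w, u, a11, b11, a20, b02, a02, b20, am13, b3m1)
ideal = [a11 - u, b11 - u, a20 - s*w, b02 - s*v,
         am13 - t*v**2, b3m1 - t*w**2, a02 - v, b20 - w]  # the ideal (6.58)
G = groebner(ideal, *gens, order='lex')

# the first displayed relation reduces to zero modulo the basis
rel = am13*b20**2 - a02**2*b3m1
_, rem = reduced(rel, list(G), *gens, order='lex')
```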

The polynomials in the basis that do not depend on t, s, v, w, and u are exactly the
generating polynomials of the ideal C3 . By Theorem 1.4.14, this means that (6.57)
is a polynomial parametrization of V(C3 ) (and incidentally confirms, by Corollary
1.4.18, that V(C3 ) is irreducible). This proves only that V(C3 ) is the Zariski closure
of F1 (C5 ), that is, the smallest variety in C8 that contains it, not that F1 (C5 ) covers

all of V(C3 ). In fact, by Exercise 6.32, F1 (C5 ) is a proper subset of V(C3 ). Also
by that exercise, setting M = {(0, u, 0, 0, 0, 0, u, 0) : u ∈ C} ⊂ V(C3 ), F1 defines an
analytic coordinate system on F1 (C5 ) \ M as an embedded submanifold of C8 ([97,
§1.3]). Based on this fact, we make the substitution (6.57) in p2k and regard p2k as
an element of R[u, v, w, s,t] (recall that its coefficients are real), but without a change
of name. From Table 6.3 we obtain p2 = u, p4 = −u^2 + (1/4)vw(4 + 4s + 3vwt^2 ), and so
on. We will show that in C[u, v, w, s,t], p2k ∈ hp2 , p4 , p6 , p8 i for k ≥ 5, which implies
the corresponding inclusion in the ring of germs at every point of F1 (C5 ) \ M. Since
in C[u, v, w, s,t] p2 = u, it is enough to show that if pe2k denotes the polynomial
obtained from p2k by replacing u by zero, then pe2k ∈ h pe4 , pe6 , pe8 i for k ∈ N, k ≥ 5.
Computations yield

pe4 = (1/4) vw(4 + 4s + 3vwt^2 )
pe6 = (1/8) v^2 w^2 t(3 + s)(7 + 3s)          (6.59)
pe8 ≡ (5/36) v^2 w^2 (1 + s)(7 + 3s) mod h pe4 , pe6 i

and pe10 ∈ h pe4 , pe6 , pe8 i.
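The expression for pe4 here (and the −u^2 term that is discarded along with p2) can be confirmed by substituting the parametrization (6.57) into p4 from Table 6.3; a sympy sketch:

```python
from sympy import symbols, Rational, expand

u, v, w, s, t = symbols('u v w s t')

# parametrization (6.57)
a20, a11, a02, am13 = s*w, u, v, t*v**2
b3m1, b20, b11, b02 = t*w**2, w, u, s*v

# p4 from Table 6.3
p4 = (-Rational(1, 4)*(a11 + b11)**2
      + Rational(1, 4)*(2*a02*a20 + 4*a02*b20 + 2*b02*b20 + 3*am13*b3m1))

expected = -u**2 + Rational(1, 4)*v*w*(4 + 4*s + 3*v*w*t**2)
```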
Lemma 6.4.6 says that VI ∩ V(u) = V( pe4 , pe6 , pe8 ) (intersection with VC drops
out because we are now working in analytic coordinates on V(C3 ) \ M) so that pe2k
vanishes everywhere on V( pe4 , pe6 , pe8 ), hence pe2k ∈ I(V( pe4 , pe6 , pe8 )). By the Strong
Hilbert Nullstellensatz (Theorem 1.3.14), this latter ideal is √h pe4 , pe6 , pe8 i. If the
ideal h pe4 , pe6 , pe8 i were radical, then we would have the result that we seek. This is
not the case however, because of the squares on v and w in pe6 and pe8 . If we define
p̃˜6 = pe6 /(vw) and p̃˜8 = pe8 /(vw), then a computation shows that
√h pe4 , pe6 , pe8 i = h pe4 , p̃˜6 , p̃˜8 i.          (6.60)

The way forward is to recognize that pe2k also contains v^2 w^2 as a factor when k ≥ 6.
That is, we claim that for each k ≥ 6 there exists fk ∈ C[v, w, s,t] such that

pe2k = v^2 w^2 fk .          (6.61)

To see this, observe that for every element ν of the Hilbert Basis H displayed in the
proof of Lemma 6.4.7, by inspection [ν ] contains (vw) as a factor. Thus if [θ ] + [θb]
appears in p2k (recall (6.55)) with k ≥ 6, then because L(ν ) = (r, r) with r ≤ 3 for
all ν ∈ H , θ = m1 ν1 + · · · + ms νs with m j ∈ N where s ≥ 2. The important point is
that s is at least two, since it means that [θ] = [m1 ν1 + · · · + ms νs ] = [ν1 ]^m1 · · · [νs ]^ms
must contain v^2 w^2 .
We know that pe2k vanishes at every point of V( pe4 , pe6 , pe8 ). But now because
pe2k = vw(vw fk ), this implies that vw fk vanishes at every point of V( pe4 , pe6 , pe8 ), too.
Furthermore, by (6.60) and Proposition 1.3.16, V( pe4 , pe6 , pe8 ) = V( pe4 , p̃˜6 , p̃˜8 ), so that
vw fk vanishes at every point of V( pe4 , p̃˜6 , p̃˜8 ), hence is in I(V( pe4 , p̃˜6 , p̃˜8 )), which, by
the Strong Nullstellensatz (Theorem 1.3.14) and (6.60), is h pe4 , p̃˜6 , p̃˜8 i. That is, there
exist h1 , h2 , h3 ∈ C[v, w, s,t] such that vw fk (v, w, s,t) = h1 pe4 + h2 p̃˜6 + h3 p̃˜8 , hence
pe2k = v^2 w^2 fk = (vw h1 ) pe4 + h2 pe6 + h3 pe8 , that is, pe2k ∈ h pe4 , pe6 , pe8 i, as required.

In order to handle as many points in V(C3 ) that were not in the image of the
mapping F1 as possible, we define a second, similar mapping F2 : C5 → C8 by

a20 = v         b3,−1 = tv^2
a11 = u         b20 = sv
a02 = sw        b11 = u          (6.62)
a−13 = tw^2     b02 = w.

Since F2 = η ◦ F1, where η : C8 → C8 is the involution defined by

a20 → a02 , a02 → a20 , a−13 → b3,−1 , b3,−1 → a−13 , b20 → b02 , b02 → b20 ,

which preserves V(C3 ) and V(C3 )R , F2 is a parametrization of V(C3 ). By Exercise
6.33, F2 forms an analytic coordinate system on F2 (C5 ) \ M as an embedded sub-
manifold of C8 , where M ⊂ V(C3 ) is as above.
We now proceed just as we did with the parametrization F1 . We define pe2k as
before and similarly set p̃˜6 = pe6 /(vw) and p̃˜8 = pe8 /(vw). Computations now give

pe4 = (1/4) vw(4s + 4s^2 + 3vwt^2 )
p̃˜6 = (1/8) vwt(1 + 3s)(3 + 7s)
p̃˜8 ≡ −(5/108) svw(1 + s)(3 + 7s) mod h pe4 , pe6 i ,

pe10 ∈ h pe4 , pe6 , pe8 i, and establish the identity (6.60) in this setting as well. We thus
obtain the inclusion p2k ∈ hp2 , p4 , p6 , p8 i for k ≥ 5 in C[u, v, w, s,t], which implies
the corresponding inclusion in the ring of germs at every point of F2 (C5 ) \ M.
The points of V(C3 ) that are not covered by either parametrization are covered
by the parametrization F3 of V(C3 ) ∩ V(a20 , a02 , b20 , b02 ) defined by

a11 = b11 = u, a20 = a02 = b20 = b02 = 0, a−13 = w, b3,−1 = v. (6.63)

With this parametrization, p2 = u and p4 = −u^2 + (3/4)vw. The Hilbert basis displayed
in the proof of Lemma 6.4.7 reduces to {u, vw}, so that now for every θ in M we have
[θ] + [θb] = 2[u]^r [vw]^s , hence (6.55) implies that P = hp2 , p4 i.
Let (a∗ , b∗ ) be a point of V(C3 )R and let U ⊂ V(C3 ) be a sufficiently small
neighborhood of a∗ . Obviously U is covered by the union of the images of the
parametrizations (6.57), (6.62), and (6.63). Therefore, using Lemma 6.4.1, we con-
clude that at most three small critical periods can appear under proper perturbation
of system (a∗ , b∗ ).
Finally, to show that the bound is sharp we produce a bifurcation of three critical
periods from the center of a system in V(C3 ). We will use the parametrization (6.57)
with initial values u = 0, v = w = 1, s = −3, and t = 2√(2/3). We will not change
v or w but will successively change s to −3 + α , then t to 2√((2 − α)/3) + β , and
finally u to a nonzero value that we still denote by u. The expressions (6.59) for
the isochronicity quantities become, replacing pe8 by the polynomial to which it is

equivalent, without changing the notation,

pe2 = u
pe4 = (1/4) β (4√3 √(2 − α) + 3β )
pe6 = (1/24) α (−2 + 3α )(2√3 √(2 − α) + 3β )
pe8 ≡ (5/36) (α − 2)(3α − 2).

By the theory that we have developed, it is enough to show that, replacing r^2 by
R, the polynomial T (R) = p8 R^4 + p6 R^3 + p4 R^2 + p2 R has three arbitrarily small
positive roots for suitably chosen values of α , β , and u that are arbitrarily close to
zero.
Keeping β and u both zero, we move α to a positive value that is arbitrarily
close to 0. Then p8 ≈ 5/9 and p6 ≈ −α/√6 < 0. Moving β to a sufficiently small
positive value, the quadratic factor in T (R) = R^2 (p8 R^2 + p6 R + p4 ) has
two positive roots, the larger one less than −p6 /p8 , which is about 9α/(5√6). If
we now move u to a sufficiently small positive value, the two positive roots move
slightly and a new positive root of T bifurcates from 0, giving the third critical
period. □
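The closing argument can be illustrated numerically in pure Python. The parameter values below are my own choices, and I take u with sign opposite to that of p4 so that the coefficient signs alternate; with these choices T(R)/R changes sign three times on a small interval, exhibiting the three critical periods:

```python
import math

alpha, beta = 1e-2, 1e-6
u = -1e-10          # sign chosen opposite to p4 (my choice, for alternation)

s3 = math.sqrt(3) * math.sqrt(2 - alpha)
p2 = u
p4 = 0.25 * beta * (4 * s3 + 3 * beta)
p6 = alpha * (3 * alpha - 2) * (2 * s3 + 3 * beta) / 24
p8 = (5.0 / 36.0) * (alpha - 2) * (3 * alpha - 2)

def cubic(R):
    """T(R)/R = p8*R**3 + p6*R**2 + p4*R + p2."""
    return ((p8 * R + p6) * R + p4) * R + p2

# count sign changes of T(R)/R on a fine grid of small positive R
xs = [i * 1e-6 for i in range(1, 20001)]          # R in (0, 0.02]
signs = [cubic(x) > 0 for x in xs]
changes = sum(1 for s0, s1 in zip(signs, signs[1:]) if s0 != s1)
# three sign changes correspond to three small positive roots
```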

By the discussion preceding Definition 6.4.5, Lemmas 6.4.7, 6.4.8, and 6.4.9
combine to give the following result.

Theorem 6.4.10. At most three critical periods bifurcate from centers of system
(6.50), and there are systems of the form (6.50) with three small critical periods.

6.5 Notes and Complements

Bautin’s theorem on the cyclicity of antisaddles in quadratic systems is a funda-
mental result in the theory of polynomial systems of ordinary differential equations,
important not only because of the bound that it provides, but also because of the
approach it gives to the study of the problem of cyclicity in any polynomial system.
Specifically, Bautin showed that the cyclicity problem in the case of a simple focus
or center could be reduced to the problem of finding a basis for the ideal of focus
quantities. By way of contrast, a full description of the bifurcation of singularities
of quadratic systems cannot be expressed in terms of algebraic conditions on the
coefficients ([66]).
In his work Bautin considered quadratic systems with antisaddles in the normal
form of Kapteyn. Such systems look simpler in that form because it contains only
six real parameters. However, the ideal of focus quantities is not radical when the
system is written this way, so that Bautin’s method of constructing that ideal was
rather complicated. A simpler way to construct a basis for such systems was sug-
gested by Yakovenko ([201]). In [206] Żoła̧dek considered the focus quantities of

system (6.12b) in the ring of polynomials that are invariant under the action of the
rotation group, and observed that the ideal of focus quantities is radical in that ring.
As to a generalization of Bautin’s theorem to the case of cubic systems, as of now
only a few partial results are known. In particular, Żoła̧dek ([209]) and Christopher
([51]) have shown that there are cubic systems with 11 small limit cycles bifurcating
from a simple center or focus, and Sibirsky ([176]) (see also [207]) has shown that
the cyclicity of a linear center or focus perturbed by homogeneous polynomials of
the third degree is at most five. For studies of bifurcations of small limit cycles of
cubic Liénard systems the reader can consult [56] and for studies of simultaneous
bifurcations of such cycles in cubic systems [203].
At present we have no reason to believe that it is possible to find a bound on the
cyclicity of a simple center or focus of system (6.1) as a function of n. We can look
at the problem from another point of view, however, asking whether it is possible
to perform the steps outlined above algorithmically. That is, for a given polynomial
system with a singular point that is a simple center in the linear approximation we
can ask:
• Does there exist an algorithm for finding the center variety in a finite number of
steps?
• Given the center variety, is it possible to find a basis of the Bautin ideal in a
finite number of steps?
Affirmative answers to these questions would constitute one sort of solution to the
local 16th Hilbert problem.
In this chapter we considered the cyclicity problem only for singularities of fo-
cus or center type. For an arbitrary singularity of system (6.1), even for quadratic
systems this problem is still only partially solved, although much is known.
An important problem from the point of view of the theory of bifurcations and
the study of the Hilbert problem is the investigation of the cyclicity of more gen-
eral limit periodic sets than singularities, of which the next most complicated is the
homoclinic loop mentioned in the introduction to this chapter. Since rescaling the
system by a nonzero constant does not alter the phase portrait of (6.1), the param-
eters may be regarded as lying in the compact sphere S^((n+1)(n+2)−1) rather than in
R^((n+1)(n+2)) . There is also a natural extension of the vector field X that corresponds
to (6.1) to the sphere, which amounts to a compactification of the phase space from
R2 to the “Poincaré sphere” S2 . If every limit periodic set in this extended context
has finite cyclicity, then finitude of H(n) for any n follows ([155]). Although it is
still unknown whether even H(2) is finite, considerable progress in the study of the
cyclicity of limit periodic sets of quadratic systems has been achieved. Quadratic
systems are simple enough that it is possible to list all limit periodic sets that can
occur among them. There are 121 of them on the Poincaré sphere, not counting
those that consist of a single singular point. See [68] and [156] for a full elaboration
of these ideas. There is hope that the finitude of H(2) can be proved in this way, but
even for cubic systems this approach is unrealistic. To find out more about methods
for studying bifurcations of general limit periodic sets the reader is referred to the
works of Dumortier, Li, Roussarie, and Rousseau ([67, 68, 112, 156, 157]), Han and
his collaborators ([91, 92, 93, 94]), and Żoła̧dek ([206]). For the state of the art con-

cerning the 16th Hilbert problem and references the reader can consult the recent
surveys by Chavarriga and Grau ([37]) and Li ([113]), the books by Christopher and
Li ([53]), Roussarie ([156]), Ye ([202]), and Zhang ([205]), and the bibliography of
Reyn ([147]).
In Section 6.4 we presented one approach to the investigation of bifurcations of
critical periods of polynomial systems as we applied it to system (6.50). The proof
given there, based as it is on the use of the center and isochronicity varieties in C8
of system (6.51), is essentially different from the original proof in [158], which,
however, also includes sharp bounds on the maximum number of critical periods
that can bifurcate in terms of the order of vanishing of the period function at r = 0.

Exercises

6.1 For the purpose of this exercise, for a function f that is defined and analytic in a
neighborhood of θ ∗ ∈ kn , in addition to f let [ f ] also denote the germ determined
by f at θ ∗ . If f and g are germs, show that for any f1 , f2 ∈ f and any g1 , g2 ∈ g,
f1 + g1 = f2 + g2 and f1 g1 = f2 g2 on a neighborhood of θ ∗ , so that addition and
multiplication of germs are well-defined by [ f ] + [g] = [ f + g] and [ f ][g] = [ f g].
6.2 Show that if I = hf1 , . . . , fs i ⊂ Gθ ∗ , then f ∈ I if and only if for any f ∈ f and
f j ∈ f j , j = 1, . . . , s, there exist an open neighborhood U of θ ∗ and functions
h j , j = 1, . . . , s, such that f and all f j and h j are defined and analytic on U and
f = h1 f1 + · · · + hs fs holds on U.
6.3 Show in two different ways that if system (6.1) has a simple singularity at (0, 0),
then the singularity is isolated.
a. Interpret the condition

         | Ueu  Uev |
    det  |          |  ≠ 0
         | Veu  Vev |
geometrically and give a geometric argument.
b. Give an analytic proof.
Hint. By a suitable affine change of coordinates and the Implicit Function
Theorem near (0, 0), Ve(u, v) = 0 is the graph of a function v = f (u). Consider
g(u) = Ue(u, f (u)).
6.4 Show that if system (6.1) has a simple singularity at (0, 0), if the coefficient
topology is placed on the set of pairs (Ue, Ve) (each pair is identified with a point
of R^((n+1)(n+2)) under the usual topology by the (n + 1)(n + 2) coefficients of Ue
and Ve), and if N is a neighborhood of (0, 0) that contains no singularity of (6.1)
besides (0, 0), then for (Ue′, Ve′) sufficiently close to (Ue, Ve), the system

u̇ = Ue′(u, v),     v̇ = Ve′(u, v)          (6.64)

(0, 0) with the property that given any neighborhood N of (0, 0), there is a pair

(Ue′, Ve′) arbitrarily close to (Ue, Ve) in the coefficient topology (Exercise 6.4)
such that system (6.64) has more than one singularity in N.
6.6 Same as Exercise 6.5 but with the conclusion that system (6.64) has no singu-
larities in N.
6.7 Show the Lyapunov quantities specified by Definition 3.1.3 for family (6.11a)
are finite sums of polynomials in (A, B) and rational functions in λ whose de-
nominators have no real zero besides λ = 0, all with rational coefficients, times
exponentials of the form e^(mπλ) , m ∈ N.
Hint. Two key points are as follows: the initial value problem that determines w j
is of the form w′j (ϕ) − λ w j (ϕ) = S j (ϕ) e^(mλϕ) , w j (0) = 0, where the constant m
is at least two; and the antiderivative of a function of the form e^(λu) sin^r (au) cos^s (bu),
r, s ∈ N0 , is a sum of terms of similar form times a rational function of λ whose
denominator has either the form λ , which occurs only if r = s = 0, or the form
λ^2 + c^2 , some c ∈ Z depending on a, b, r, and s.
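For the simplest instance of the second key point (r = 1, s = 0, a = 1), the claimed shape of the antiderivative — a term of the same exponential-trigonometric form times a rational function of λ with denominator λ^2 + c^2 — can be verified with sympy (the closed form below is a standard one, not taken from the text):

```python
from sympy import symbols, exp, sin, cos, diff, simplify

u, lam = symbols('u lam')

# candidate antiderivative of exp(lam*u)*sin(u): rational in lam with
# denominator lam**2 + 1, times terms of the same form
F = exp(lam*u) * (lam*sin(u) - cos(u)) / (lam**2 + 1)

check = simplify(diff(F, u) - exp(lam*u)*sin(u))
```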
6.8 Use (3.5)–(3.7) and (3.10) and a computer algebra system to compute the first
few Lyapunov quantities at an antisaddle of the general quadratic system (2.74)
written in the form (6.11a). (Depending on your computing facility, even η3
could be out of reach. First do the case λ = 0.)
6.9 The general quadratic system of differential equations with a simple nonhyper-
bolic antisaddle at the origin is

u̇ = −v + A20 u^2 + A11 uv + A02 v^2 ,     v̇ = u + B20 u^2 + B11 uv + B02 v^2 .

a. Compute the complexification
ẋ = i( x − a10 x^2 − a01 xy − a−12 y^2 )
ẏ = −i( y − b2,−1 x^2 − b10 xy − b01 y^2 ),

which is just (3.129) in Section 3.7. That is, express each coefficient of the
complexification in terms of the coefficients of the original real system.
b. The first focus quantity is given by (3.133) as g11 = −i(a10 a01 − b10 b01 ).
Compute g11^R , the expression for the first focus quantity in terms of the orig-
inal coefficients.
6.10 Prove Corollary 6.2.4.
6.11 Use Corollary 6.2.4 to prove Corollary 6.2.5.
6.12 Prove Proposition 6.2.6.
6.13 Find an example of two polynomials f and g and a function h that is real analytic
on an open interval U in R but is not a polynomial such that f = hg. Can U be
all of R? (This shows why it is necessary to work with the germs η 1 , η 2 , . . . to
obtain a contradiction in cases 2 and 3 of the proof of Lemma 6.2.8.)
6.14 a. Explain why the following conclusion to the proof of Theorem 6.2.7 is in-
valid. “If the perturbation is made within family (6.11), then we first perturb
without changing λ from zero to obtain up to k − 1 zeros in (0, ε ) as well as
a zero at ρ = 0. Since P ′ (0) = η1 = e^λ − 1, we may then change λ to an
arbitrarily small nonzero value, whose sign is the opposite of that of P(ρ )
6.5 Notes and Complements 303

for ρ small, to create an additional isolated zero in (0, ε ).”
Hint. Study the first line in the proof of the theorem.
b. Validly finish the proof of Theorem 6.2.7 using the theory developed in the
remainder of Section 6.2.
6.15 Prove that the ideal B3 for a general complex quadratic system (3.131) is radical
using a computer algebra system that has a routine for computing radicals of
polynomial ideals (such as Singular) as follows:
a. compute the radical of B3 ;
b. apply the Equality of Ideals Criterion on page 24.
6.16 The proof that the bound in Theorem 6.3.3 is sharp is based on the same logic
and procedure as the faulty proof of part of Theorem 6.2.7 in Exercise 6.14.
Explain why the reasoning is valid in the situation of Theorem 6.3.3.
6.17 Show that the ideals J1 through J7 listed in Theorem 3.7.2 are prime using The-
orem 1.4.17 and the Equality of Ideals Criterion on page 24.
6.18 Show that for family (6.30), g77 ∉ B5 and g99 ∉ B7 .
6.19 a. Show that the ideal B9 for family (6.30) is not radical.
b. How can you conclude automatically that B5 cannot be radical either?
6.20 Show that the mapping W that appears in the proof of Theorem 6.3.5 is one-to-
one, either directly or by showing that ker(W ) = {0}.
6.21 [Referenced in Proposition 6.3.5, proof.] Show that any monomial in the kth
focus quantity gkk of family (6.30) has degree at least k if k is even and at least
k + 1 if k is odd.
Hint. Any monomial [ν ] in gkk is an N0 -linear combination of elements of the
Hilbert basis H of the monoid M listed on page 277 in the proof of Proposition
6.3.5 and satisfies L(ν ) = (k, k).
6.22 [Referenced in Proposition 6.3.5, proof.] Let k be R or C.
a. Let f and g be polynomials in k[x1 , . . . , xn ], let V ⊊ k^n be a variety, and
suppose that if x ∈ k^n \ V , then f (x) = g(x). Show that f = g.
Hint. Recall Proposition 1.1.1.
b. Generalize the result to the “smallest” set on which f and g agree that you
can, and to other fields k, if possible.
6.23 Consider family (6.29). We wish to generate as many limit cycles as possible
by imitating the technique used at the end of the proof of Theorem 6.3.3.
a. Explain why a fine focus in family (6.29) is of order at most five.
b. Begin with a system that has a fifth-order fine focus at the origin, and adjust
the coefficients in order to change g11 , g33 , and g44 in such a way as to
produce three limit cycles in an arbitrarily small neighborhood of the origin.
By an appropriate change in the linear part, obtain a system arbitrarily close
to the original system with four arbitrarily small limit cycles about the origin.
6.24 Determine the cyclicities of systems from V(J j )R , where J j are the ideals de-
fined in the statement of Theorem 3.7.1.
6.25 Suppose the coefficients f j (θ) of the series (6.2) are polynomials over C and
that the ideal Is = h f1 , . . . , fs i is radical. Denote by F̃(z, θ) = ∑_{j=s+1}^∞ f j (θ) z^j
the function F (z, θ)|V(Is ) , by Ĩ the ideal generated by the coefficients of F̃(z, θ)
in C(V(Is )), and assume that the minimal basis MĨ of Ĩ consists of t polynomi-
als. Prove that the multiplicity of F̃ (over R) is at most s + t.
6.26 Investigate the cyclicity of the origin for the system

ẋ = i(x − a20 x³ − a11 x² x̄ − a02 x x̄² − a−13 x̄³).
6.27 Prove Theorem 6.4.3.


6.28 Prove Theorem 6.4.4.
6.29 Use Theorems 6.4.3 and 6.4.4 to show that for the real family (6.50),
a. no nonlinear center corresponding to a point lying in the component V(C1 )
of the center variety VCR is isochronous;
b. no nonlinear center corresponding to a point lying in the component V(C2 )
of the center variety VCR is isochronous; and
c. the centers corresponding to points lying in the component V(C3 ) of the
center variety VCR that are isochronous correspond to precisely the parameter
strings (a20 , a11 , a02 , a−13 , b3,−1 , b20 , b11 , b02 ) ∈ C8 of the form
(i) (a, 0, −ā, 0, 0, −a, 0, ā), a ∈ C;
(ii) (−(28/(9r))e^{is}, 0, (4/(3r))e^{−is}, (16/(9r))e^{−2is}, (16/(9r))e^{2is},
(4/(3r))e^{is}, 0, −(28/(9r))e^{−is}), r, s ∈ R;
(iii) (a, 0, 0, 0, 0, 0, 0, ā), a ∈ C; and
(iv) ((4/7)e^{−is}, 0, −(12/49)e^{is}, (16/49)e^{2is}, (16/49)e^{−2is},
−(12/49)e^{−is}, 0, (4/7)e^{is}), s ∈ R.
Hint. Systems (i) and (ii) arise from the first parametrization of V(C3 ) in the
proof of Lemma 6.4.9; systems (iii) and (iv) arise from the second param-
etrization. No point covered by the third parametrization corresponds to an
isochronous center.
6.30 Reduce the upper bound in Lemma 6.4.7 from three to one as follows. Assume
that the reduction to p̃2k ∈ C[a11 , a02 , a−13 , b3,−1 , b20 ] as in the proof of the
lemma has been made.
a. Each p̃2k is a homogeneous polynomial of degree k.
b. The polynomial function p̃2 vanishes at points in V(C1 )R other than 0, but
p̃4 does not.
Hint. V(C1 )R is obtained from V(C1 ) by making the substitutions bkj = ājk .
The coefficient a11 is forced to be real (recall (6.53)).
c. Make the change of variables z = r² and show that the number of critical
periods in (0, ε) is the number of zeros of the function F(z) = ∑∞k=1 2k p̃2k z^{k−1}
in (0, ε²).
Hint. F corresponds to T′(r)/r.
d. F(z) has at most one more zero in (0, ε²) than does its derivative F′.
e. F′ has no zeros in (0, ε²). The result thus follows from parts (c) and (d).
Hint. By parts (a) and (b), Lemma 6.4.2 applies to F′ with B = V(C1 )R . You
must use Exercise 6.29(a).
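The correspondence in part (c) can be written out explicitly. This is a sketch, assuming the period function expands as T(r) = 2π + ∑k≥1 p̃2k r^{2k} (the even form forced by the reduction above):

```latex
T'(r) = \sum_{k=1}^{\infty} 2k\,\tilde{p}_{2k}\, r^{2k-1},
\qquad
\frac{T'(r)}{r} = \sum_{k=1}^{\infty} 2k\,\tilde{p}_{2k}\, r^{2k-2}
= \mathcal{F}(r^{2}),
```

so T′ and r ↦ F(r²) have the same zeros on 0 < r < ε, and the substitution z = r² carries them bijectively onto the zeros of F in (0, ε²).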
6.31 Show that the bound in the previous exercise is sharp by finding a perturbation
from the linear center into V(C1 )R that produces a system with an arbitrarily
small critical period.
6.32 Consider the mapping F1 : C5 → C8 defined by (6.57).
a. Given (a20 , . . . , b02 ) ∈ C8 , show that equations (6.57) can be solved for
(u, v, w, s, t) in terms of (a20 , . . . , b02 ) if (a02 , b20 ) ≠ (0, 0), and that when
(a02 , b20 ) = (0, 0) there is a solution if and only if a20 = a−13 = b3,−1 = b02 = 0.
b. Use part (a) to show that the image of F1 is

V(C3 ) \ {(a, r, 0, b, c, 0, r, d) : (a, b, c, d) ≠ (0, 0, 0, 0) and a²b − cd² = 0}

(but a, b, c, d, and r are otherwise arbitrary elements of C).


c. Let K = {(u, 0, 0, s, t)} ⊂ C5 and M = {(0, u, 0, 0, 0, 0, u, 0) : u ∈ C} ⊂ V(C3 ).
Show that F1 maps C5 \ K one-to-one onto F1 (C5 ) \ M and maps K onto M.
d. Show that the derivative DF1 of F1 has maximal rank at every point of C5 \ K.
6.33 Consider the mapping F2 : C5 → C8 defined by (6.62).
a. Give an alternate proof that F2 is a parametrization of V(C3 ) from that given
in the text by computing a Gröbner basis of a suitably chosen ideal, as was
done for F1 .
b. Proceeding along the lines of the previous exercise, show that the image of
F2 is

V(C3 ) \ {(0, r, a, b, c, d, r, 0) : (a, b, c, d) ≠ (0, 0, 0, 0) and a²c − bd² = 0}

(but a, b, c, d, and r are otherwise arbitrary elements of C).


c. Show that F2 maps C5 \ K one-to-one onto F2 (C5 ) \ M and collapses K onto
M, where K and M are the sets defined in part (c) of the previous exercise.
d. Show that the derivative DF2 of F2 has maximal rank at every point of C5 \ K.
6.34 The example at the end of the proof of Lemma 6.4.9 produced a bifurcation of
three critical periods from a nonlinear center. Modify it to obtain a bifurcation
of three critical periods from a linear center.
Appendix

The algorithms presented in this book were written in pseudocode so that they would
not be tied to any particular software. It is a good exercise to program them in a
general-purpose computer algebra system such as Maple or Mathematica. The algo-
rithms of Chapter 1 have been implemented in a number of computer algebra sys-
tems, including Macaulay, Singular, some packages of Maple, and REDUCE. The
interested reader can find a short overview of available computer algebra systems
with routines for dealing with polynomial ideals in [60].
We present here two Mathematica programs that we used to study the general
quadratic system and systems with homogeneous cubic nonlinearities. They can
easily be modified to compute the focus and linearizability quantities and normal
forms of other polynomial systems.
In Figures 6.1 and 6.2 we give Mathematica code for computing the first three
focus quantities of system (3.131). Setting α = 1 and β = 0, it is possible to use this
code to compute the linearizability quantities Ikk . Set α = 0 and β = 1 to compute
the quantities Jkk . In the code a12 and b21 stand for a−12 and b2,−1 .
In Figures 6.3 and 6.4 we present the Mathematica code that we used to compute
the first four pairs of resonant coefficients in the normal form of system (6.51).
Since the output for the last two pairs is relatively large in this notation, we do
not display those quantities here; they are given in Table 6.2.
Note also that we suppress the output of large expressions by placing the
symbol “;” at the end of the input expressions.
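The computation in Figure 6.3 is organized around the conjugacy condition: writing the transformation (4.21) as (x, y) = H(x1, y1) and denoting by X the vector field of (6.51) and by X̃ that of the normal form, equation (2.35) requires (a schematic restatement in our notation, not the book's):

```latex
DH(x_1, y_1)\,\widetilde{X}(x_1, y_1) - X\bigl(H(x_1, y_1)\bigr) = 0 .
```

The variables v1 and v2 hold the two components of the left-hand side, and sh[s] in Figure 6.4 sets the coefficient of each monomial x1^j y1^{s−j} of degree s to zero, solving for the coefficients h1, h2 of H at nonresonant monomials and for the normal form coefficients Y1, Y2 at resonant ones.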


(* The operator (3.71) for system (3.131) *)

In[1]:= l1[nu1_,nu2_,nu3_,nu4_,nu5_,nu6_]:= 1 nu1 + 0 nu2 - nu3 + 2 nu4 + 1 nu5 + 0 nu6;
        l2[nu1_,nu2_,nu3_,nu4_,nu5_,nu6_]:= 0 nu1 + 1 nu2 + 2 nu3 - nu4 + 0 nu5 + 1 nu6;

(* Set α=β=1 to compute the focus quantities;
   set α=1, β=0 to compute the linearizability quantities I_{kk};
   set α=0, β=1 to compute J_{kk} *)

In[3]:= α=1; β=1;

(* Definition of function (4.52) *)

In[4]:= v[k1_,k2_,k3_,k4_,k5_,k6_]:=v[k1,k2,k3,k4,k5,k6]=
         Module[{us,coef},coef=l1[k1,k2,k3,k4,k5,k6]-l2[k1,k2,k3,k4,k5,k6]; us=0;
          v[0,0,0,0,0,0]=1;
          If[k1>0,us=us+(l1[k1-1,k2,k3,k4,k5,k6]+α)*v[k1-1,k2,k3,k4,k5,k6]];
          If[k2>0,us=us+(l1[k1,k2-1,k3,k4,k5,k6]+α)*v[k1,k2-1,k3,k4,k5,k6]];
          If[k3>0,us=us+(l1[k1,k2,k3-1,k4,k5,k6]+α)*v[k1,k2,k3-1,k4,k5,k6]];
          If[k4>0,us=us-(l2[k1,k2,k3,k4-1,k5,k6]+β)*v[k1,k2,k3,k4-1,k5,k6]];
          If[k5>0,us=us-(l2[k1,k2,k3,k4,k5-1,k6]+β)*v[k1,k2,k3,k4,k5-1,k6]];
          If[k6>0,us=us-(l2[k1,k2,k3,k4,k5,k6-1]+β)*v[k1,k2,k3,k4,k5,k6-1]];
          If[coef!=0, us=us/coef]; If[coef==0, gg[k1,k2,k3,k4,k5,k6]=us; us=0]; us]

(* gmax is the number of the focus or linearizability quantity to be computed *)

In[5]:= gmax=3;

(* Computing the quantities q[1], q[2], ... up to the order gmax *)

In[6]:= Do[k=sc; num=k; q[num]=0;
         For[i1=0,i1<=2 k,i1++,
          For[i2=0,i2<=(2 k-i1),i2++,
           For[i3=0,i3<=(2 k-i1-i2),i3++,
            For[i4=0,i4<=(2 k-i1-i2-i3),i4++,
             For[i5=0,i5<=(2 k-i1-i2-i3-i4),i5++,
              For[i6=0,i6<=(2 k-i1-i2-i3-i4-i5),i6++,
               If[(l1[i1,i2,i3,i4,i5,i6]==k)&&(l2[i1,i2,i3,i4,i5,i6]==k),
                v[i1,i2,i3,i4,i5,i6];
                q[num]=q[num]+gg[i1,i2,i3,i4,i5,i6] TT[i1,i2,i3,i4,i5,i6]]]]]]]],
         {sc,1,gmax}]
Fig. 6.1 Mathematica code for computing focus and linearizability quantities of a quadratic system

(* Definition of monomials of system (3.131) *)

In[7]:= TT[l1_,l2_,l3_,l4_,l5_,l6_]:=a10^l1 a01^l2 a12^l3 b21^l4 b10^l5 b01^l6

(* Output of focus or linearizability quantities *)

In[8]:= Do[g[i]=Factor[q[i]]; Print[g[i]], {i,1,gmax}]

a01 a10 - b01 b10

(The output continues with the second and third quantities: homogeneous
polynomials of degrees 4 and 6 in a10, a01, a12, b21, b10, b01, with rational
coefficients carrying the prefactors 1/3 and 1/72, respectively.)
Fig. 6.2 Mathematica code for computing focus and linearizability quantities of a quadratic system
(continued)

(* Input of system (6.51) *)

In[1]:= xdot = I (x - Sum[a[2-k,k] x^(3-k) y^k, {k,0,3}])

Out[1]= I (x - y^3 a[-1,3] - x y^2 a[0,2] - x^2 y a[1,1] - x^3 a[2,0])

In[2]:= ydot = -I (y - Sum[b[k,2-k] x^k y^(3-k), {k,0,3}])

Out[2]= -I (y - y^3 b[0,2] - x y^2 b[1,1] - x^2 y b[2,0] - x^3 b[3,-1])

(* Normal form (4.22) of (6.51) up to the 11th order *)

In[3]:= x1dot = I (x1 - x1 Sum[Y1[2 k] x1^k y1^k, {k,1,5}])

Out[3]= I (x1 - x1 (x1 y1 Y1[2] + x1^2 y1^2 Y1[4] + x1^3 y1^3 Y1[6] +
          x1^4 y1^4 Y1[8] + x1^5 y1^5 Y1[10]))

In[4]:= y1dot = -I (y1 - y1 Sum[Y2[2 k] x1^k y1^k, {k,1,5}])

Out[4]= -I (y1 - y1 (x1 y1 Y2[2] + x1^2 y1^2 Y2[4] + x1^3 y1^3 Y2[6] +
          x1^4 y1^4 Y2[8] + x1^5 y1^5 Y2[10]))

(* Transformation (4.21) *)

In[5]:= xsub = x1 + Sum[Sum[h1[k-1,s-k] x1^k y1^(s-k), {k,0,s}], {s,3,11}] // Expand;

In[6]:= ysub = y1 + Sum[Sum[h2[k,s-k-1] x1^k y1^(s-k), {k,0,s}], {s,3,11}] // Expand;

(* Choose the distinguished transformation *)

In[7]:= Do[h1[k,k] = 0; h2[k,k] = 0, {k,1,6}]

(* Create equation (2.35) *)

In[8]:= v1 = D[xsub,x1] x1dot + D[xsub,y1] y1dot - (xdot /. {x -> xsub, y -> ysub}) // Expand;

In[9]:= v2 = D[ysub,x1] x1dot + D[ysub,y1] y1dot - (ydot /. {x -> xsub, y -> ysub}) // Expand;

Fig. 6.3 Mathematica code for computing the distinguished normal form of system (6.51)

(* sh[s] is a code to solve (2.35) in the space H_s of vector homogeneous functions *)

In[10]:= sh[s_]:=
          Module[{g},
           t1[s-1,0]=First[Solve[Coefficient[v1 /. y1->0, x1^s]==0, h1[s-1,0]]];
           h1[s-1,0]=h1[s-1,0] /. t1[s-1,0];
           t1[-1,s]=First[Solve[Coefficient[v1 /. x1->0, y1^s]==0, h1[-1,s]]];
           h1[-1,s]=h1[-1,s] /. t1[-1,s];
           Do[If[s-1-i != i,
             t1[s-1-i,i]=First[Solve[Coefficient[v1, x1^(s-i) y1^i]==0, h1[s-1-i,i]]];
             h1[s-1-i,i]=h1[s-1-i,i] /. t1[s-1-i,i],
             t1[s-1-i,i]=First[Solve[Coefficient[v1, x1^(s-i) y1^i]==0, Y1[2 i]]];
             Y1[2 i]=Y1[2 i] /. t1[s-1-i,i]], {i,1,s-1}];
           t2[s,-1]=First[Solve[Coefficient[v2 /. y1->0, x1^s]==0, h2[s,-1]]];
           h2[s,-1]=h2[s,-1] /. t2[s,-1];
           t2[0,s-1]=First[Solve[Coefficient[v2 /. x1->0, y1^s]==0, h2[0,s-1]]];
           h2[0,s-1]=h2[0,s-1] /. t2[0,s-1];
           Do[If[s-1-i != i,
             t2[i,s-1-i]=First[Solve[Coefficient[v2, x1^i y1^(s-i)]==0, h2[i,s-1-i]]];
             h2[i,s-1-i]=h2[i,s-1-i] /. t2[i,s-1-i],
             t2[s-1-i,i]=First[Solve[Coefficient[v2, x1^i y1^(s-i)]==0, Y2[2 i]]];
             Y2[2 i]=Y2[2 i] /. t2[s-1-i,i]], {i,1,s-1}]]

(* Calculations using sh *)

In[11]:= Do[sh[k], {k,3,9}]

(* Output of the coefficients of the normal form *)

In[12]:= Do[Print[{Y1[2 k], Y2[2 k]}], {k,1,4}]

{-a[1,1], -b[1,1]}

(The output continues with the pair {Y1[4], Y2[4]}: combinations, with prefactor
1/4, of the products 4 a[0,2] a[2,0], 4 a[0,2] b[2,0], 3 a[-1,3] b[3,-1] for Y1[4]
and 4 a[0,2] b[2,0], 4 b[0,2] b[2,0], 3 a[-1,3] b[3,-1] for Y2[4].)
Fig. 6.4 Mathematica code for computing the distinguished normal form of system (6.51) (contin-
ued)
References

1. W. W. Adams and P. Loustaunau. An Introduction to Gröbner Bases. Graduate Studies in


Mathematics, Vol. 3. Providence, RI: American Mathematical Society, 1994.
2. A. Algaba, E. Freire, and E. Gamero. Isochronicity via normal form. Qual. Theory Dyn. Sys.
1 (2000), no. 2, 133–156.
3. M. I. Al’muhamedov. On conditions for the existence of singular point of the center type.
(Russian) Izv. Fiz.-Mat. Obs. (Kazan) 9 (1937) 105–121.
4. V. V. Amel’kin. Strong isochronicity of the Liénard system. Differ. Uravn. 42 (2006) 579–
582; Differ. Equ. 42 (2006) 615–618.
5. V. V. Amel’kin and K. S. al’Khaı̆der. Strong isochronism of polynomial differential systems
with a center. Differ. Uravn. 35 (1999) 867–873; Differ. Equ. 35 (1999) 873–879.
6. V. V. Amel’kin and C. Dang. Isochronicity of the Cauchy-Riemann system in the focus case.
(Russian) Vestsi Nats. Akad. Navuk Belarusi Ser. Fiz. Mat.-Navuk (1993) 28–31.
7. V. V. Amel’kin and O. B. Korsantiya. Isochronous and strongly isochronous oscillations
of two-dimensional monodromic holomorphic dynamical systems. Differ. Uravn. 42 (2006)
147–152; Differ. Equ. 42 (2006) 159–164.
8. V. V. Amel’kin, N. A. Lukashevich, and A. P. Sadovskii. Nonlinear Oscillations in Second
Order Systems. (Russian) Minsk: Belarusian State University, 1982.
9. A. F. Andreev. Solution of the problem of the center and the focus in one case. (Russian)
Akad. Nauk SSSR. Prikl. Mat. Meh. 17 (1953) 333–338.
10. A. F. Andreev. Singular Points of Differential Equations. (Russian) Minsk: Vysh. Shkola,
1979.
11. A. F. Andreev, A. P. Sadovskii, and V. A. Tsikalyuk. The center-focus problem for a system
with homogeneous nonlinearities in the case of zero eigenvalues of the linear part. Dif-
fer. Uravn. 39 (2003) 147–153; Differ. Equ. 39 (2003) 155–164.
12. A. A. Andronov, E. A. Leontovitch, I. I. Gordon, and A. G. Maier. Qualitative Theory of
Second-Order Dynamic Systems. Israel Program for Scientific Translations. New York: John
Wiley and Sons, 1973.
13. A. A. Andronov, E. A. Leontovitch, I. I. Gordon, and A. G. Maier. Theory of Bifurcations
of Dynamic Systems on a Plane. Israel Program for Scientific Translations. New York: John
Wiley and Sons, 1973.
14. R. Bamón. Quadratic vector fields in the plane have a finite number of limit cycles.
Publ. Math. Inst. Hautes Etudes Sci. 64 (1986) 111–142.
15. V. V. Basov. The Normal Forms Method in the Local Qualitative Theory of Differential Equa-
tions: Formal Theory of Normal Forms. (Russian) Saint Petersburg: Izd. Saint Petersburg
University, 2001.
16. V. V. Basov. The Normal Forms Method in the Local Qualitative Theory of Differential Equa-
tions: Analytical Theory of Normal Forms. (Russian) Saint Petersburg: Izd. Saint Petersburg
University, 2002.


17. N. N. Bautin. On the number of limit cycles which appear with the variation of coeffi-
cients from an equilibrium position of focus or center type. Mat. Sb. 30 (1952) 181–196;
Amer. Math. Soc. Transl. 100 (1954) 181–196.
18. T. Becker and V. Weispfenning. Gröbner Bases: A Computational Approach to Commutative
Algebra. Graduate Texts in Mathematics, Vol. 141. New York: Springer-Verlag, 1993.
19. Y. N. Bibikov. Local Theory of Nonlinear Analytic Ordinary Differential Equations. Lecture
Notes in Mathematics, Vol. 702. New York: Springer-Verlag, 1979.
20. R. I. Bogdanov. Versal deformations of a singular point on the plane in the case of zero
eigenvalues. Funct. Anal. Appl. 9 (1975) 144–145.
21. Yu. L. Bondar. Solution of the problem of the isochronicity of the center for a cubic system.
(Russian) Vestn. Beloruss. Gos. Univ. Ser. 1 Fiz. Mat. Inform. (2005) 73–76.
22. Yu. L. Bondar and A. P. Sadovskii. Solution of the center and focus problem for a cubic
system that reduces to the Liénard system. Differ. Uravn. 42 (2006) 11–22, 141; Differ. Equ.
42 (2006) 10–25.
23. J. Brainen and R. Laubenbacher. Lecture Notes of the Summer School, Laramie, Wyoming,
2000.
24. A. D. Brjuno. Analytic form of differential equations. I, II. (Russian) Tr. Mosk. Mat. Obs. 25
(1971) 119–262; 26 (1972) 199–239.
25. A. D. Brjuno. A Local Method of Nonlinear Analysis for Differential Equations. Moscow:
Nauka, 1979; Local Methods in Nonlinear Differential Equations. Translated from the Rus-
sian by William Hovingh and Courtney S. Coleman. Berlin: Springer-Verlag, 1989.
26. B. Buchberger. Ein Algorithmus zum Auffinden der Basiselemente des Restklassenringes
nach einem nulldimensionalen Polynomideal. PhD Thesis, Mathematical Institute, Univer-
sity of Innsbruck, Austria, 1965; An algorithm for finding the basis elements of the residue
class ring of a zero dimensional polynomial ideal. J. Symbolic Comput. 41 (2006) 475–511.
27. B. Buchberger. Gröbner bases: an algorithmic method in polynomial ideals theory. In: Multi-
dimensional Systems: Theory and Applications (N. K. Bose, Ed.), 184–232. Dordrecht, The
Netherlands: D. Reidel, 1985.
28. B. Buchberger. Introduction to Gröbner bases. In: Gröbner Bases and Applications (Linz,
1998), London Math. Soc. Lecture Note Ser. 251, 3–31. Cambridge: Cambridge University
Press, 1998.
29. A. Campillo and M. M. Carnicer. Proximity inequalities and bounds for the degree of invari-
ant curves by foliations of P2C . Trans. Amer. Math. Soc. 349 (1997) 2211–2228.
30. D. Cerveau and A. Lins Neto. Holomorphic foliations in CP(2) having an invariant algebraic
curve. Ann. Inst. Fourier (Grenoble) 41 (1991) 883–903.
31. J. Chavarriga, I. A. Garcı́a, and J. Giné. On Lie’s symmetries for planar polynomial differen-
tial systems. Nonlinearity 14 (2001) 863–880.
32. J. Chavarriga, I. A. Garcı́a, and J. Giné. Isochronicity into a family of time-reversible cubic
vector fields. Appl. Math. Comput. 121 (2001) 129–145.
33. J. Chavarriga, H. Giacomini, J. Giné, and J. Llibre. Local analytic integrability for nilpotent
centers. Ergodic Theory Dynam. Systems 23 (2003) 417–428.
34. J. Chavarriga and J. Giné. Integrability of a linear center perturbed by fourth degree homo-
geneous polynomial. Publ. Mat. 40 (1996) 21–39.
35. J. Chavarriga and J. Giné. Integrability of a linear center perturbed by fifth degree homoge-
neous polynomial. Publ. Mat. 40 (1996) 335–356.
36. J. Chavarriga, J. Giné, and I. A. Garcı́a. Isochronous centers of a linear center perturbed by
fourth degree homogeneous polynomial. Bull. Sci. Math. 123 (1999) 77–96.
37. J. Chavarriga and M. Grau. Some open problems related to 16th Hilbert problem. Sci. Ser. A
Math. Sci. (N. S.) 9 (2003) 1–26.
38. J. Chavarriga and M. Sabatini. A survey of isochronous centers. Qual. Theory Dyn. Syst. 1
(1999) 1–70.
39. L. Chen and M. Wang. The relative position and number of limit cycles of a quadratic differ-
ential system. Acta Math. Sinica 22 (1979) 751–758.
40. X. Chen, V. G. Romanovski, and W. Zhang. Linearizability conditions of time-reversible
quartic systems having homogeneous nonlinearities. Nonlinear Anal. 69 (2008) 1525–1539.

41. L. A. Cherkas. On the conditions for a center for certain equations of the form yy′ = P(x) +
Q(x)y + R(x)y2 . Differ. Uravn. 8 (1972) 1435–1439; Differ. Equ. 8 (1972) 1104–1107.
42. L. A. Cherkas. Conditions for a Liénard equation to have a center. Differ. Uravn. 12 (1976)
292–298; Differ. Equ. 12 (1976) 201–206.
43. L. A. Cherkas. Conditions for a center for the equation yy′ = ∑3i=0 pi (x)yi . Differ. Uravn. 14
(1978) 1594–1600, 1722; Differ. Equ. 14 (1978) 1133–1137.
44. C. Chicone. Ordinary Differential Equations with Applications. New York: Springer-Verlag,
1999.
45. C. Chicone and M. Jacobs. Bifurcation of critical periods for plane vector fields. Trans.
Amer. Math. Soc. 312 (1989) 433–486.
46. C. Chicone and D. S. Shafer. Separatrix and limit cycles of quadratic systems and Dulac’s
theorem. Trans. Amer. Math. Soc. 278 (1983) 585–612.
47. A. R. Chouikha. Isochronous centers of Liénard type equations and applications. J. Math.
Anal. Appl. 331 (2007) 358–376.
48. A. R. Chouikha, V. G. Romanovski, and X. Chen. Isochronicity of analytic systems via
Urabe’s criterion. J. Phys. A 40 (2007) 2313–2327.
49. C. Christopher. Invariant algebraic curves and conditions for a centre. Proc. Roy. Soc. Edin-
burgh Sect. A 124 (1994) 1209–1229.
50. C. Christopher. An algebraic approach to the classification of centers in polynomial Liénard
systems. J. Math. Anal. Appl. 229 (1999) 319–329.
51. C. Christopher. Estimating limit cycles bifurcations. In: Trends in Mathematics, Differen-
tial Equations with Symbolic Computations (D. Wang and Z. Zheng, Eds.), 23–36. Basel:
Birkhäuser-Verlag, 2005.
52. C. Christopher and J. Devlin. On the classification of Liénard systems with amplitude-
independent periods. J. Differential Equations 200 (2004) 1–17.
53. C. Christopher and C. Li. Limit Cycles of Differential Equations. Basel: Birkhäuser-Verlag,
2007.
54. C. Christopher and J. Llibre. Algebraic aspects of integrability for polynomial systems.
Qual. Theory Dyn. Syst. 1 (1999) 71–95.
55. C. Christopher, J. Llibre, C. Pantazi, and X. Zhang. Darboux integrability and invariant alge-
braic curves for planar polynomial systems. J. Phys. A 35 (2002) 2457–2476.
56. C. Christopher and S. Lynch. Small-amplitude limit cycle bifurcations for Liénard systems
with quadratic or cubic damping or restoring forces. Nonlinearity 12 (1999) 1099–1112.
57. C. Christopher, P. Mardešić, and C. Rousseau. Normalizable, integrable, and linearizable
saddle points for complex quadratic systems in C2 . J. Dyn. Control Sys. 9 (2003) 311–363.
58. C. Christopher and C. Rousseau. Nondegenerate linearizable centres of complex planar
quadratic and symmetric cubic systems in C2 . Publ. Mat. 45 (2001) 95–123.
59. A. Cima, A. Gasull, V. Mañosa, and F. Mañosas. Algebraic properties of the Liapunov and
periodic constants. Rocky Mountain J. Math. 27 (1997) 471–501.
60. D. Cox, J. Little, and D. O’Shea. Ideals, Varieties, and Algorithms, 2nd edition. New York:
Springer-Verlag, 1997.
61. G. Darboux. Mémoire sur les équations différentielles algébriques du premier ordre et du
premier degré. Bull. Sci. Math. Sér. 2 2 (1878) 60–96, 123–144, 151–200.
62. W. Decker, G. Pfister, H. Schönemann, and S. Laplagne. A S INGULAR 3.0 library for com-
puting the primary decomposition and radical of ideals, primdec.lib, 2005.
63. J. Devlin, N. G. Lloyd, and J. M. Pearson. Cubic systems and Abel equations. J. Differential
Equations 147 (1998) 435–454.
64. H. Dulac. Détermination et intégration d’une certaine classe d’équations différentielles ayant
pour point singulier un centre. Bull. Sci. Math. (2) 32 (1908) 230–252.
65. F. Dumortier. Singularities of vector fields on the plane. J. Differential Equations 23 (1977)
53–106.
66. F. Dumortier and P. Fiddelaers. Quadratic models for generic local 3-parameter bifurcations
on the plane. Trans. Amer. Math. Soc. 326 (1991) 101–126.

67. F. Dumortier and C. Li. Perturbation from an elliptic Hamiltonian of degree four I, II, III, IV.
J. Differential Equations 175 (2001) 209–243; 176 (2001) 114–175; 188 (2003) 473–511,
512–554.
68. F. Dumortier, R. Roussarie, and C. Rousseau. Hilbert’s 16th problem for quadratic vector
fields. J. Differential Equations 110 (1994) 95–123.
69. J. Ecalle. Introduction aux fonctions analysables et preuve constructive de la conjecture de
Dulac. Paris: Hermann, 1992.
70. V. F. Edneral. Computer evaluation of cyclicity in planar cubic system. In: Proceedings of
the 1997 International Symposium on Symbolic and Algebraic Computation, 305–309. New
York: ACM Press, 1997.
71. V. F. Edneral. Looking for periodic solutions of ODE systems by the normal form method.
In: Trends in Mathematics, Differential Equations with Symbolic Computations (D. Wang
and Z. Zheng, Eds.), 173–200. Basel: Birkhäuser-Verlag, 2005.
72. W. W. Farr, C. Li, I. S. Laboriau, and W. F. Langford. Degenerate Hopf bifurcation formulas
and Hilbert’s 16th problem. SIAM J. Math. Anal. 20 (1989) 13–30.
73. J.-P. Françoise and A. Fronville. Computer algebra and bifurcation theory of vector fields
of the plane. Proceedings of the 3rd Catalan Days on Applied Mathematics (Lleida, 1996),
57–63. Lleida: Univ. Lleida, 1997.
74. M. Frommer. Über das Auftreten von Wirbeln und Strudeln (geschlossener und spiraliger
Integralkurven) in der Umgebung rationaler Unbestimmtheitsstellen. Math. Ann. 109 (1934)
395–424.
75. A. Fronville, A. P. Sadovski, and H. Żoła̧dek. The solution of the 1 : −2 resonant center
problem in the quadratic case. Fund. Math. 157 (1998) 191–207.
76. A. Gasull, A. Guillamon, and V. Mañosa. Centre and isochronicity conditions for systems
with homogeneous nonlinearities. Proceedings of the 2nd Catalan Days on Applied Mathe-
matics (Odeillo, 1995), 105–116. Perpiniá: Presses Universitaires de Perpignan, 1995.
77. A. Gasull, A. Guillamon, and V. Mañosa. An analytic-numerical method for computation
of the Liapunov and period constants derived from their algebraic structure. SIAM J. Nu-
mer. Anal. 36 (1999) 1030–1043.
78. A. Gasull, J. Llibre, V. Mañosa, and F. Mañosas. The focus-centre problem for a type of
degenerate system. Nonlinearity 13 (2000) 699–729.
79. A. Gasull and J. Torregrosa. Center problem for several differential equations via Cherkas’
method. J. Math. Anal. Appl. 228 (1998) 322–343.
80. A. Gasull and J. Torregrosa. A new approach to the computation of the Lyapunov constants.
The geometry of differential equations and dynamical systems. Comput. Appl. Math. 20
(2001) 149–177.
81. P. Gianni, B. Trager, and G. Zacharias. Gröbner bases and primary decomposition of poly-
nomials. J. Symbolic Comput. 6 (1988) 146–167.
82. J. Giné and M. Grau. Linearizability and integrability of vector fields via commutation.
J. Math. Anal. Appl. 319 (2006) 326–332.
83. J. Giné and M. Grau. Characterization of isochronous foci for planar analytic differential
systems. Proc. Roy. Soc. Edinburgh Sect. A 135 (2005) 985–998.
84. J. Giné and J. Llibre. A family of isochronous foci with Darboux first integral. Pacific J. Math.
218 (2005) 343–355.
85. H.-G. Gräbe. CALI–a REDUCE Package for Commutative Algebra, Version 2.2, 1995.
http://www.informatik.uni-leipzig.de/˜graebe/ComputerAlgebra/
Software/Cali.
86. R. L. Graham, D. E. Knuth, and O. Patashnik. Concrete Mathematics. New York: Addison-
Wesley, 1994.
87. H. Grauert and K. Fritzsche. Several Complex Variables. New York: Springer-Verlag, 1976.
88. D. Grayson and M. Stillman. Macaulay2: a software system for algebraic geometry and com-
mutative algebra. http://www.math.uiuc.edu/Macaulay2.
89. G.-M. Greuel, G. Pfister, and H. Schönemann. S INGULAR 3.0. A Computer Algebra System
for Polynomial Computations. Centre for Computer Algebra, University of Kaiserslautern,
2005. http://www.singular.uni-kl.de/

90. R. C. Gunning and H. Rossi. Analytic Functions of Several Complex Variables. Englewood
Cliffs, NJ: Prentice Hall, 1965.
91. M. Han, Y. Lin, and P. Yu. A study on the existence of limit cycles of a planar system with
third-degree polynomials. Internat. J. Bifur. Chaos Appl. Sci. Engrg. 14 (2004) 41–60.
92. M. Han, Y. Wu, and P. Bi. Bifurcation of limit cycles near polycycles with n vertices. Chaos
Solitons Fractals 22 (2004) 383–394.
93. M. Han and C. Yang. On the cyclicity of a 2-polycycle for quadratic systems. Chaos Solitons
Fractals 23 (2005) 1787–1794.
94. M. Han and T. Zhang. Some bifurcation methods of finding limit cycles. Math. Biosci. Eng.
3 (2006) 67–77.
95. P. Hartman. Ordinary Differential Equations. New York: John Wiley & Sons, 1964; Boston:
Birkhäuser, 1982 (reprint of the second edition).
96. M. Hervé. Several Complex Variables: Local Theory. London: Oxford University Press,
1963.
97. M. Hirsch. Differential Topology. New York: Springer-Verlag, 1976.
98. Yu. Il’yashenko. Algebraic nonsolvability and almost algebraic solvability of the center-focus
problem. Funktsion. Anal. i Prilozhen. 6 (1972) 197–202; Funct. Anal. Appl. 6 (1972) 30–37.
99. Yu. Il’yashenko. Finiteness Theorems for Limit Cycles. Providence, RI: American Mathe-
matical Society, 1991.
100. Yu. Ilyashenko. Centennial history of Hilbert’s 16th problem. Bull. Amer. Math. Soc. (N. S.)
39 (2002) 301–354.
101. J.-P. Jouanolou. Equations de Pfaff algébriques. Lecture Notes in Mathematics, Vol. 708.
New York: Springer-Verlag, 1979.
102. A. S. Jarrah, R. Laubenbacher, and V. Romanovski. The Sibirsky component of the center
variety of polynomial differential systems. Computer algebra and computer analysis (Berlin,
2001). J. Symbolic Comput. 35 (2003) 577–589.
103. W. Kapteyn. On the centra of the integral curves which satisfy differential equations of the
first order and the first degree. Proc. Kon. Akad. Wet., Amsterdam 13 (1911) 1241–1252.
104. W. Kapteyn. New researches upon the centra of the integrals which satisfy differential equa-
tions of the first order and the first degree. Proc. Kon. Acad. Wet., Amsterdam 14 (1912)
1185–1185; 15 (1912) 46–52.
105. I. S. Kukles. Sur les conditions nécessaires et suffisantes pour l’existence d’un centre.
Dokl. Akad. Nauk SSSR 42 (1944) 160–163.
106. I. S. Kukles. Sur quelques cas de distinction entre un foyer et un centre. Dokl. Akad. Nauk
SSSR 42 (1944) 208–211.
107. N. N. Ladis. Commuting vector fields and isochrony. (Russian) Vestn. Beloruss. Gos.
Univ. Ser. I Fiz. Mat. Inform. (1976) 21–24, 93.
108. J. S. W. Lamb and J. A. G. Roberts. Time-reversal symmetry in dynamical systems: a survey.
Time-reversal symmetry in dynamical systems (Coventry, 1996). Phys. D 112 (1998) 1–39.
109. J. P. La Salle. The Stability of Dynamical Systems. Philadelphia: Society for Industrial and
Applied Mathematics, 1976.
110. S. Lefschetz. Differential Equations: Geometric Theory. New York: Wiley Interscience,
1963; New York: Dover Publications, 1977 (reprint).
111. V. L. Le and A. P. Sadovskii. The center and focus problem for a cubic system in the case of
zero eigenvalues of the linear part. (Russian) Vestn. Beloruss. Gos. Univ. Ser. 1 Fiz. Mat. In-
form. (2002) 75–80.
112. C. Li and R. Roussarie. The cyclicity of the elliptic segment loops of the reversible quadratic
Hamiltonian systems under quadratic perturbations. J. Differential Equations 205 (2004)
488–520.
113. J. Li. Hilbert’s 16th problem and bifurcations of planar polynomial vector fields. Inter-
nat. J. Bifur. Chaos Appl. Sci. Engrg. 13 (2003) 47–106.
114. A. Liapounoff. Problème général de la stabilité du mouvement. Annales de la Faculté des Sci-
ences de Toulouse Sér. 2 9 (1907) 204–474. Reproduction in Annals of Mathematics Studies
17, Princeton: Princeton University Press, 1947, reprinted 1965, Kraus Reprint Corporation,
New York.

115. A. M. Liapunov. Stability of Motion. With a contribution by V. Pliss. Translated by


F. Abramovici and M. Shimshoni. New York: Academic Press, 1966.
116. Y. Liu. Formulas of focal values, center conditions and center integrals for the system (E3^(3)).
Kexue Tongbao (English Ed.) 33 (1988) 357–359.
117. Y. Liu. Formulas of values of singular point and the integrability conditions for a class of
cubic system, M(3) ≥ 7. Chinese Sci. Bull. 35 (1990) 1241–1245.
118. Y. Liu and J. Li. Theory of values of singular point in complex autonomous differential
systems. Sci. China Ser. A 33 (1990) 10–23.
119. J. Llibre. Integrability of polynomial differential systems. Handbook of Differential Equa-
tions, Volume 1: Ordinary Differential Equations (A. Cañada, P. Drábek, and A. Fonda, Eds.).
Amsterdam: Elsevier, 2004.
120. J. Llibre and G. Rodriguez. Invariant hyperplanes and Darboux integrability for d-dimen-
sional polynomial differential systems. Bull. Sci. Math. 124 (2000) 599–619.
121. N. G. Lloyd and J. M. Pearson. REDUCE and the bifurcation of limit cycles. J. Symbolic
Comput. 9 (1990) 215–224.
122. N. G. Lloyd and J. M. Pearson. Computing centre conditions for certain cubic systems.
J. Comput. Appl. Math. 40 (1992) 323–336.
123. N. G. Lloyd and J. M. Pearson. Bifurcation of limit cycles and integrability of planar dynam-
ical systems in complex form. J. Phys. A 32 (1999) 1973–1984.
124. N. G. Lloyd and J. M. Pearson. Symmetry in planar dynamical systems. J. Symbolic Comput.
33 (2002) 357–366.
125. N. G. Lloyd, J. M. Pearson, and C. Christopher. Algorithmic derivation of centre conditions.
SIAM Rev. 38 (1996) 619–636.
126. N. G. Lloyd, J. M. Pearson, and V. G. Romanovsky. Centre conditions for cubic systems in
complex form. Preprint, 1998.
127. W. S. Loud. Behavior of the period of solutions of certain plane autonomous systems near
centers. Contributions to Differential Equations 3 (1964) 21–36.
128. V. A. Lunkevič and K. S. Sibirskii. Conditions for a center in the case of homogeneous
nonlinearities of third degree. Differ. Uravn. 1 (1965) 1482–1487; Differ. Equ. 1 (1965) 1164–
1168.
129. S. Lynch. Dynamical Systems with Applications Using Maple, 2nd edition. Boston: Birk-
häuser, 2008.
130. S. Lynch. Dynamical Systems with Applications Using Mathematica. Boston: Birkhäuser,
2007.
131. S. Lynch. Symbolic computation of Lyapunov quantities and the second part of Hilbert’s
sixteenth problem. In: Trends in Mathematics, Differential Equations with Symbolic Compu-
tations (D. Wang and Z. Zheng, Eds.), 1–22. Basel: Birkhäuser-Verlag, 2005.
132. S. MacLane and G. Birkhoff. Algebra. New York: Macmillan, 1967.
133. K. E. Malkin. Criteria for the center for a certain differential equation. (Russian) Volž.
Mat. Sb. Vyp. 2 (1964) 87–91.
134. K. E. Malkin. Conditions for the center for a class of differential equations. (Russian)
Izv. Vysš. Učebn. Zaved. Matematika 50 (1966) 104–114.
135. P. Mardešić, L. Moser-Jauslin, and C. Rousseau. Darboux linearization and isochronous cen-
ters with a rational first integral. J. Differential Equations 134 (1997) 216–268.
136. P. Mardešić, C. Rousseau, and B. Toni. Linearization of isochronous centers. J. Differential
Equations 121 (1995) 67–108.
137. L. Mazzi and M. Sabatini. A characterization of centres via first integrals. J. Differential
Equations 76 (1988) 222–237.
138. N. B. Medvedeva. Analytic solvability of the center–focus problem in some classes of vector
fields with a complex monodromic singular point. Proc. Steklov Inst. Math. 2002 Algebra,
Topology, Mathematical Analysis, Suppl. 2, S120–S141.
139. J. Murdock. Normal Forms and Unfoldings for Local Dynamical Systems. New York:
Springer-Verlag, 2003.
140. L. Perko. Differential Equations and Dynamical Systems, 3rd edition. New York: Springer-
Verlag, 2001.
141. I. I. Pleshkan. A new method of investigation on the isochronism of a system of differential equations. Dokl. Akad. Nauk SSSR 182 (1968) 768–771; Soviet Math. Dokl. 9 (1968) 1205–1209.
142. V. A. Pliss. On the reduction of an analytic system of differential equations to linear form.
Differ. Uravn. 1 (1965) 153–161; Differ. Equ. 1 (1965) 111–118.
143. H. Poincaré. Mémoire sur les courbes définies par une équation différentielle. J. Math. Pures
et Appl. (Sér. 3) 7 (1881) 375–422; (Sér. 3) 8 (1882) 251–296; (Sér. 4) 1 (1885) 167–244;
(Sér. 4) 2 (1886) 151–217.
144. M. J. Prelle and M. F. Singer. Elementary first integrals of differential equations. Trans. Amer.
Math. Soc. 279 (1983) 215–229.
145. N. B. Pyzhova, A. P. Sadovskii, and M. L. Hombak. Isochronous centers of a reversible
cubic system. (Russian) Tr. Inst. Mat. Natl. Akad. Nauk Belarusi, Inst. Mat., Minsk 4 (2000)
120–127.
146. R. H. Rand and D. Armbruster. Perturbation Methods, Bifurcation Theory and Computer
Algebra. New York: Springer-Verlag, 1987.
147. J. W. Reyn. A Bibliography of the Qualitative Theory of Quadratic Systems of Differential
Equations in the Plane. 3rd edition. Report 94-02, Delft University of Technology, 1994.
148. V. G. Romanovski, X. Chen, and Z. Hu. Linearizability of linear systems perturbed by fifth
degree homogeneous polynomials. J. Phys. A 40 (2007) 5905–5919.
149. V. Romanovski and M. Robnik. The center and isochronicity problems for some cubic sys-
tems. J. Phys. A 34 (2001) 10267–10292.
150. V. G. Romanovski and D. S. Shafer. On the center problem for p : −q resonant polynomial
vector fields. Bull. Belg. Math. Soc. Simon Stevin 15 (2008) (in press).
151. V. G. Romanovski and D. S. Shafer. Time-reversibility in two-dimensional polynomial
systems. In: Trends in Mathematics, Differential Equations with Symbolic Computations
(D. Wang and Z. Zheng, Eds.), 67–84. Basel: Birkhäuser Verlag, 2005.
152. V. G. Romanovskii. On the Number of Limit Cycles of a Second Order System of Differential
Equations. (Russian) PhD thesis, Leningrad State University, 1986.
153. V. G. Romanovskii. Calculation of Lyapunov numbers in the case of two pure imaginary
roots. Differ. Uravn. 29 (1993) 910–912; Differ. Equ. 29 (1993) 782–784.
154. V. Romanovsky and A. Şubă. On Bautin ideal of the quadratic system. In: Proceedings of
the Third International Conference “Differential Equations and Applications,” Saint Peters-
burg, June 12–17, 2000. Mathematical Research 7 21–25. St. Petersburg: St. Petersburg State
Technical University Press, 2000.
155. R. Roussarie. A note on finite cyclicity property and Hilbert’s 16th problem. Lecture Notes
in Mathematics, Vol. 1331. New York: Springer-Verlag, 1988.
156. R. Roussarie. Bifurcations of planar vector fields and Hilbert’s sixteenth problem. Progress
in Mathematics 164. Basel: Birkhäuser, 1998.
157. C. Rousseau. Bifurcation methods in polynomial systems. In: Bifurcations and Periodic Or-
bits of Vector Fields (Montreal). NATO Adv. Sci. Inst. Ser. C 405 383–428. Dordrecht: Kluwer
Acad. Publ., 1993.
158. C. Rousseau and B. Toni. Local bifurcation of critical periods in vector fields with homoge-
neous nonlinearities of the third degree. Canad. Math. Bull. 36 (1993) 473–484.
159. C. Rousseau and B. Toni. Local bifurcations of critical periods in the reduced Kukles system.
Canad. J. Math. 49 (1997) 338–358.
160. M. Sabatini. On the period function of Liénard systems. J. Differential Equations 152 (1999)
467–487.
161. M. Sabatini. Characterizing isochronous centers by Lie brackets. Differential Equations Dy-
nam. Systems 5 (1997) 91–99.
162. M. Sabatini. On the period function of x′′ + f (x)x′2 + g(x) = 0. J. Differential Equations 196 (2004) 151–168.
163. A. P. Sadovskii. Holomorphic integrals of a certain system of differential equations. Dif-
fer. Uravn. 10 (1974) 558–560; Differ. Equ. 10 (1974) 425–427.
164. A. P. Sadovskii. The problem of the center and focus for analytic systems with a zero linear
part. I. Differ. Uravn. 25 (1989) 790–799; Differ. Equ. 25 (1989) 552–560.

165. A. P. Sadovskii. The problem of the center and focus for analytic systems with a zero linear
part. II. Differ. Uravn. 25 (1989) 950–956; Differ. Equ. 25 (1989) 682–687.
166. A. P. Sadovskii. Solution of the center and focus problem for a cubic system of nonlinear
oscillations. Differ. Uravn. 33 (1997) 236–244, 286; Differ. Equ. 33 (1997) 236–244.
167. A. P. Sadovskii. Centers and foci of a class of cubic systems. Differ. Uravn. 36 (2000) 1652–
1657; Differ. Equ. 36 (2000) 1812–1818.
168. N. A. Saharnikov. On Frommer’s center conditions. (Russian) Akad. Nauk SSSR. Prikl. Mat.
Meh. 12 (1948) 669–670.
169. N. A. Saharnikov. On conditions for the existence of a center and a focus. (Russian)
Akad. Nauk SSSR. Prikl. Mat. Meh. 14 (1950) 513–526.
170. D. Schlomiuk. Elementary first integrals and algebraic invariant curves of differential equa-
tions. Exposition. Math. 11 (1993) 433–454.
171. D. S. Shafer. Weak singularities under weak perturbation. J. Dynam. Differential Equations
16 (2004) 65–90.
172. S. L. Shi. A concrete example of the existence of four limit cycles for plane quadratic system.
Sci. Sinica Ser. A 23 (1980) 153–158.
173. T. Shimoyama and K. Yokoyama. Localization and primary decomposition of polynomial
ideals. J. Symbolic Comput. 22 (1996) 247–277.
174. K. S. Sibirskii. On the conditions for existence of a center and a focus. (Russian) Uč. Zap.
Kišinevsk. Univ. 11 (1954) 115–117.
175. K. S. Sibirskii. The principle of symmetry and the problem of the center. (Russian)
Kišinev. Gos. Univ. Uč. Zap. 17 (1955) 27–34.
176. K. S. Sibirskii. On the number of limit cycles in the neighborhood of a singular point. Dif-
fer. Uravn. 1 (1965) 53–66; Differ. Equ. 1 (1965) 36–47.
177. K. S. Sibirskii. Algebraic Invariants of Differential Equations and Matrices. (Russian)
Kishinev: Shtiintsa, 1976.
178. K. S. Sibirsky. Introduction to the Algebraic Theory of Invariants of Differential Equations.
Kishinev: Shtiintsa, 1982; Nonlinear Science: Theory and Applications. Manchester: Manch-
ester University Press, 1988.
179. K. S. Sibirskii and A. S. Shubè. Coefficient conditions for a center in the sense of Du-
lac for a differential system with one zero characteristic root and cubic right-hand sides.
Dokl. Akad. Nauk SSSR 303 (1988) 799–803; Soviet Math. Dokl. 38 (1989) 609–613.
180. C. L. Siegel. Über die Normalform analytischer Differentialgleichungen in der Nähe einer
Gleichgewichtslösung. Nachr. der Akad. Wiss. Göttingen Math.–Phys. Kl. IIa (1952) 21–30.
181. C. L. Siegel and J. K. Moser. Lectures on Celestial Mechanics. Berlin: Springer-Verlag, 1971,
1995.
182. M. F. Singer. Liouvillian first integrals of differential equations. Trans. Amer. Math. Soc. 333 (1992) 673–688.
183. S. Sternberg. On the structure of local homeomorphisms of Euclidean n-space, II. Amer. J.
Math. 80 (1958) 623–631.
184. B. Sturmfels. Gröbner bases and convex polytopes. University Lecture Series, Vol. 8. Provi-
dence, RI: American Mathematical Society, 1996.
185. B. Sturmfels. Algorithms in Invariant Theory. New York: Springer-Verlag, 1993.
186. A. Şubă and D. Cozma. Solution of the problem of the centre for cubic differential system
with three invariant straight lines in generic position. Qual. Theory Dyn. Syst. 6 (2005) 45–
58.
187. F. Takens. Singularities of vector fields. Publ. Math. Inst. Hautes Etudes Sci. 43 (1974) 47–
100.
188. A. Tsygvintsev. Algebraic invariant curves of plane polynomial differential systems. J. Phys.
A 34 (2001) 663–672.
189. M. Urabe. Potential forces which yield periodic motions of a fixed period. J. Math. Mech. 10
(1961) 569–578.
190. M. Urabe. The potential force yielding a periodic motion whose period is an arbitrary contin-
uous function of the amplitude of the velocity. Arch. Ration. Mech. Anal. 11 (1962) 27–33.

191. J. V. Uspensky. Theory of Equations. New York: McGraw-Hill, 1948.
192. W. V. Vasconcelos. Computational Methods in Commutative Algebra and Algebraic Geome-
try. Algorithms and Computation in Mathematics, Vol. 2. Berlin: Springer-Verlag, 1998.
193. E. P. Volokitin and V. V. Ivanov. Isochronicity and commutability of polynomial vector fields.
Sibirsk. Mat. Zh. 40 (1999) 30–48; Siberian Math. J. 40 (1999) 23–38.
194. A. P. Vorob’ev. On periodic solutions in the case of a center. Dokl. Akad. Nauk BSSR 6 (1962)
281–284.
195. B. L. van der Waerden. Modern Algebra. New York: Frederick Ungar Publishing, 1969.
196. D. Wang. Mechanical manipulation for a class of differential systems. J. Symbolic Comput.
12 (1991) 233–254.
197. D. Wang. Irreducible decomposition of algebraic varieties via characteristic sets and Gröbner
bases. Comput. Aided Geom. Design 9 (1992) 471–484.
198. D. Wang. Elimination Methods. New York: Springer-Verlag, 2001.
199. D. Wang. Elimination Practice: Software Tools and Applications. London: Imperial College
Press, 2004.
200. S. Willard. General Topology. Reading, MA: Addison-Wesley, 1970.
201. S. Yakovenko. A geometric proof of the Bautin theorem. Concerning the Hilbert Sixteenth
Problem. Advances in Mathematical Sciences, Vol. 23; Amer. Math. Soc. Transl. 165 (1995)
203–219.
202. Y.-Q. Ye. Theory of Limit Cycles. Transl. Math. Monographs, Vol. 66. Providence, RI: Amer-
ican Mathematical Society, 1986.
203. P. Yu and M. Han. Twelve limit cycles in a cubic case of the 16th Hilbert problem. Inter-
nat. J. Bifur. Chaos Appl. Sci. Engrg. 15 (2005) 2191–2205.
204. A. Zegeling. Separatrix cycles and multiple limit cycles in a class of quadratic systems.
J. Differential Equations 113 (1994) 355–380.
205. Z. Zhang, T. Ding, W. Huang, and Z. Dong. Qualitative Theory of Differential Equations.
Transl. Math. Monographs, Vol. 101. Providence, RI: American Mathematical Society, 1992.
206. H. Żoła̧dek. Quadratic systems with center and their perturbations. J. Differential Equations
109 (1994) 223–273.
207. H. Żoła̧dek. On a certain generalization of Bautin’s theorem. Nonlinearity 7 (1994) 273–279.
208. H. Żoła̧dek. The classification of reversible cubic systems with center. Topol. Methods Non-
linear Anal. 4 (1994) 79–136.
209. H. Żoła̧dek. Eleven small limit cycles in a cubic vector field. Nonlinearity 8 (1995) 843–860.
210. H. Żoła̧dek. The problem of center for resonant singular points of polynomial vector fields.
J. Differential Equations 135 (1997) 94–118.
211. H. Żoła̧dek. Algebraic invariant curves for the Liénard equation. Trans. Amer. Math. Soc. 350
(1998) 1681–1701.
212. H. Żoła̧dek. New examples of holomorphic foliations without algebraic leaves. Studia Math.
131 (1998) 137–142.
Index of Notation

∼^{(s+1)} [binary relation] agreement of series through order s + 1
≺ [binary relation] u(z) ≺ v(z) : series u(z) majorizes series v(z)
\ set difference A \ B : elements of A that are not in B
|α| for α = (α1, …, αn) ∈ N_0^n, |α| = α1 + ⋯ + αn
(α, κ) for α ∈ N_0^n and κ ∈ C^n, the scalar ∑_{j=1}^n αj κj
ηj jth Lyapunov number
[ν] for ν ∈ N_0^{2ℓ}, [ν] = a_{p1,q1}^{ν1} ⋯ a_{pℓ,qℓ}^{νℓ} b_{qℓ,pℓ}^{νℓ+1} ⋯ b_{q1,p1}^{ν2ℓ} ∈ C[a, b]
σ^ζ for σ ∈ R \ {0}, ζ ∈ Z^{2ℓ}, σ^ζ = (σ^{ζ1}, …, σ^{ζ2ℓ})
σ^{−ζ}c for σ ∈ R \ {0}, ζ ∈ Z^{2ℓ}, c ∈ E(a, b), σ^{−ζ}c = (σ^{−ζ1}c1, …, σ^{−ζ2ℓ}c_{2ℓ})
a(t) for a ∈ kn , the unique solution of (2.1) with initial condition a(0) = a,
on its maximal interval of existence
(a, b) the vector of coefficients of system (3.3), ordered
(a_{p1,q1}, a_{p2,q2}, …, a_{pℓ,qℓ}, b_{qℓ,pℓ}, …, b_{q2,p2}, b_{q1,p1})
B the Bautin ideal, generated by the focus quantities
B_k the ideal ⟨g_{11}, g_{22}, …, g_{kk}⟩ generated by the first k focus quantities
B̆, B̆_k the images of B and B_k under the homomorphism W of Proposition 6.3.5
B̆⁺ the ideal ⟨ğ_{kk} : k ∈ N⟩ described in the paragraph preceding Proposition 6.3.5
B̆_k⁺ the ideal ⟨ğ_{11}, …, ğ_{kk}⟩ described in the paragraph preceding Proposition 6.3.5
(b̂, â) the involution of (a, b) defined by (5.29)
C the field of complex numbers
C[a, b] the polynomial ring in the variables a_{pq}, b_{qp}, (p, q) ∈ S
C[V ] the coordinate ring of a variety V in Cn
c^ν for c ∈ E(a, b), ν ∈ N_0^{2ℓ}, c^ν = [ν]|_c = c_1^{ν1} ⋯ c_{2ℓ}^{ν2ℓ}
div X divergence of the vector field X
E(a) the parameter space Cℓ of the complex form (3.51) of the
real polynomial system (3.4)
E_P(a) subset of E(a) for which P̃ and Q̃ are relatively prime polynomials
E(a, b) the parameter space C2ℓ of system (3.3)


F the function defined by (3.151) (Section 3.8)
f a mapping into Rn or Cn for n ≥ 2
f for an analytic function f , the germ at θ ∗ induced by f (Chapter 6)
f̂ the conjugate of the polynomial f defined in Definition 3.4.3
(except: omitting complex conjugation of the coefficients
in the proof of Theorem 6.3.5)
f♮(z) the trivial majorant of f(z) = ∑_{α∈N^n} f^{(α)} z^α : f♮(z) = ∑_{α∈N^n} |f^{(α)}| z^α
f |g polynomial f divides polynomial g
f →_F h  f reduces to h modulo F = {f_1, …, f_s} (Definition 1.2.5)
f ≡ g mod I f − g ∈ I ( f , g polynomials and I an ideal)
⟨f_1, …, f_s⟩ ideal generated by the polynomials f_1, …, f_s
G the function defined by (3.28b)
G the function defined by (3.151) (Section 3.8)
Gθ ∗ the ring of germs of analytic functions of θ at the point θ ∗ ∈ kn
gkk the kth focus quantity of (3.3)
g_{k1,k2} the coefficient of x^{k1+1} y^{k2+1} in (3.53)
ğkk the image of gkk under the homomorphism W of Proposition 6.3.5
H the set of radical ideals in the ring k[x1 , . . ., xn ]
H the ideal ⟨H_{2j+1} : j ∈ N⟩ = ⟨H̃_{2j+1} : j ∈ N⟩ for H(w) = ∑_{k=1}^∞ H_{2k+1} w^k defined by (3.28b)
H_k the ideal ⟨H_{2j+1} : j = 1, …, k⟩ = ⟨H̃_{2j+1} : j = 1, …, k⟩

Hs the set of functions from Rn to Rn (or Cn to Cn ) all of whose
components are homogeneous polynomial functions of
degree s (Chapter 2)
H the function defined by (3.28b)
H̃ (i/2)H
I the set of all ideals in the ring k[x1 , . . ., xn ]
I(S)
√ the ideal of a set or variety S
I the radical of an ideal I
I +J the sum of two ideals I and J
I:J the quotient ideal of two ideals I and J
IJ the product ideal of two ideals I and J
Ikk , Jkk the kth linearizability quantities of (4.27)
IHam the ideal of Hamiltonian systems of (3.3) defined by (3.85)
Isym the symmetry or Sibirsky ideal
iQ the set of pure imaginary elements of Q(i), iQ = {iq : q ∈ Q}
k a field
k[V ] the coordinate ring of a variety V in kn
k[x1 , . . ., xn ] the ring of polynomials in n indeterminates with coefficients in k
k(x1 , . . ., xn ) the ring of rational functions in n indeterminates with coefficients
in k
k[[x1 , . . ., xn ]] the ring of formal power series in n indeterminates with coefficients
in k
L the homological operator on Hs (Chapter 2)
L the linearizability ideal hIkk , Jkk : k ∈ Ni (Chapter 4)
L(ν) the linear map L : N_0^{2ℓ} → Z^2 of (3.71)
LCM(xα , xβ ) least common multiple of monomials xα and xβ
LC( f ) leading coefficient of a polynomial f
LM( f ) leading monomial of a polynomial f
LT ( f ) leading term of a polynomial f
LT (S) set of leading terms of a set S of polynomials
M the set {ν ∈ N_0^{2ℓ} : L(ν) = (j, j) for some j ∈ N_0}, with L defined by (3.71)
N the set of natural numbers {1, 2, 3, . . .}
N0 {0} ∪ N
N−n {−n, . . ., −1, 0} ∪ N
P difference function of (3.14)
P the ideal ⟨p_{2k} : k ∈ N⟩ (Chapters 4 and 6)
P_k the ideal ⟨p_2, p_4, …, p_{2k}⟩ (Chapters 4 and 6)
p2k the kth isochronicity quantity
Q the field of rational numbers
Q(i) the field of Gaussian rationals, Q(i) = {a + ib : a, b ∈ Q}
R the field of real numbers
R(α ) for α ∈ kn , the set of indices of nonzero entries
⌊r⌋ the greatest integer less than or equal to r (floor function)
S̄ the Zariski closure of a subset S of kn
Supp( f ) the set of ν such that the coefficient of xν in the polynomial
f = ∑ f (ν ) xν is nonzero
U the rotation group of Definition 5.1.4
Uϕ element of the rotation group U
U_ϕ^{(a)}, U_ϕ^{(b)} blocks in the matrix representation of U_ϕ
V the set of varieties in kn , for k a field
V(I) the variety of an ideal I
VC the center variety for family (3.3)
VI the isochronicity variety for family (3.3)
VL the linearizability variety for family (3.3)
X the vector field X = f_1 ∂/∂x_1 + ⋯ + f_n ∂/∂x_n corresponding to
the system of differential equations ẋ_1 = f_1(x), …, ẋ_n = f_n(x)
X_m^{(α)} the coefficient of the monomial x^α in the mth component X_m of the vector function X
{X_m(y + h(y))}^{(α)} the coefficient of y^α in the expansion of X_m(y + h(y)) in powers of y
x^α the monomial x_1^{α1} ⋯ x_n^{αn}, α = (α1, …, αn) ∈ N_0^n
Y the ideal ⟨Y_1^{(j+1,j)}, Y_2^{(j,j+1)} : j ∈ N⟩ defined in (4.28)
Z the ring of integers
Index

Abel's Lemma, 256
affine space, 4
affine variety, see variety, affine
algebraic partial integral, 139
Algorithm
  Buchberger's, 23
  Division, 9
  Euclidean, 9, 54
  Focus Quantity, 130
    implementation, 309
  for computing I : J, 40
  for computing I ∩ J, 39
  for computing Isym, 237
  for computing a Hilbert basis of M, 237
  Linearizability Quantities, 201
    implementation, 309
  Multivariable Division, 14
  Normal Form, 77
    implementation, 309
analytically equivalent systems, 72
antisaddle, 97
arithmetic
  modular, 53
basis
  Gröbner, see Gröbner basis
  of a monoid, 228
  of an ideal, 6, 38–40
    minimal, 257
  standard, 18
Bautin ideal, see ideal, Bautin
Buchberger's Criterion, 21
CALI, 318
center, 94, 111
  and the Lyapunov numbers, 96
  characterizations, 110
  for complex systems, 111, 115, 118
  in Liénard systems, 161–166
  isochronous, 177
  linearizable, 179
    in quadratic systems, 207
  p : −q resonant center, 118
  problem, 91–93, 120, 166–169
    for cubic systems, 168
    for Liénard systems, 160–166
  separatrix of, 205
center variety, see variety, center
center-focus, 97
characteristic number, 221
characteristic vector, 221
coefficient
  leading, 11
  resonant, 74
cofactor, 139
complex form of a real system, 112
complexification of a real system, 98–99
conjugate variables, 221
coordinate ring, see variety, coordinate ring of
cycle, 59, 96
  limit, 96, 97, 176, 251–252, 260–261, 302
cyclicity, 251
  of a singular point, 260
degree
  full, 4
difference function, see function, difference
distinguished transformation, 74
divergence, 141, 144
equilibrium, 59
  asymptotically stable, 60
  hyperbolic, 64
  stable, 60

  unstable, 60
existence of solutions of a system of polynomial equations, 32
exponential factor, 142–143
first integral, see integral
focus, 176
  fine, 97
    of a quadratic system, 271
focus quantities, 114–118, 125
  as (k, k)-polynomials, 121
  for p : −q resonant systems, 119
  for quadratic systems, 151
  recursion formula for coefficients, 125
  relation to fine foci, 265
  relation to Lyapunov quantities, 263–265
  structure of, 129
formally equivalent systems, 72
function
  Darboux, 143
  difference, 96
    and cyclicity, 261
  germ of, 256
  polynomial, 4
  vector homogeneous, 67
G defined by (3.28), 104–108, 182, 188
  and first integrals, 105
  and real centers, 107
germ of a function, 256
germs
  ring of, 256
Gröbner basis, 18–26
  minimal, 24
  of Isym, 235
  reduced, 25
  uniqueness of, 25
H defined by (3.28), 104, 182, 188
  and isochronicity, 184
H̃ for H defined by (3.28), 182
Hamiltonian
  ideal, 131
  system, 130–131
Hartman–Grobman Theorem, 64, 65
Hilbert Basis
  of a monoid, 234
  Theorem, 6
Hilbert Nullstellensatz, see Nullstellensatz
Hilbert's 16th Problem, 251
  local, 252
homological operator, 67, 69
I, 7–9, 33, 34
  and V, 33, 34
ideal
  associated prime, 43
  basis of, 6, 38–40
    minimal, 257
  Bautin, 114, 150
    for quadratic systems, 150, 272–275
  congruence modulo, 7
  definition, 6
  elimination, 27
  equality of, 26
  finitely generated, 6
  generators of, 6
  Hamiltonian, 131
  intersection of, 35, 39
  linearizability, 197
  minimal basis of, 257
  of a set, 36, 37
  of a variety, 7–9, 33, 34, 36
  primary, 43
  primary decomposition, 43
  prime, 41
  principal, 9
  quotient, 37–40
  radical, 32
  radical of, 32–35, 43
  reduction modulo, 12, 19
    and congruence, 12
  Sibirsky, 135
  sum, 35, 56
  symmetry, 135, 233–236
Ideal Membership Problem, 9, 18
  solution, 20
  solution in k[x], 9
implicitization
  polynomial, 48
  rational, 48
integral
  algebraic partial, 139
  Darboux, 140–144, 154–156, 158–159
  first, 92, 104
  formal first, 104, 108, 110–113, 159
  Lyapunov first, 108
integrating factor, 141, 158–159
  Darboux, 141
invariant, 217–229
  binary, 221
  curve, 139
  irreducible, 217, 222–224
  unary, 221
isochronicity quantities, 182, 189–192
L mapping, 121, 135, 219–220
Lasker–Noether Decomposition Theorem, 44
least common multiple, 20
lex, see order, lexicographic
Liénard equation, 86, 160
Liénard system, 86, 160–166
  isochronicity of, 210
limit cycle, see cycle, limit
limit periodic set, 251
linearizability quantities, 193–200
linearizable system, 73
linearization, 74, 180–181, 193–195, 205–206
  and isochronicity, 179
  Darboux, 202
  generalized, 202
Lyapunov
  first integral, 108
  function, 60
    strict, 60
  numbers, 96, 261
  quantities, 118, 261–267, 304
    relation to focus quantities, 263–265
Lyapunov's Second Method, 59–63
M, 219
  characterization, 220
  definition, 135
Macaulay, 318
majorant, 79
majorization, 79
monoid, 219
  basis of, 228
  Hilbert basis of, 234
  minimal basis of, 234
monomial, 3
  leading, 11
multiplicity of a parameter value, 253
Noetherian ring, 7
normal form, 74–79
  convergence of, 81
  distinguished, 74, 78
  existence, 75, 78
  for complex systems with diagonal linear part, 103
  for the complexification of a real antisaddle, 100
  real, 65–70
    definition, 70
  structure of coefficients, 185
  uniqueness, 75, 78
normalization, see normalizing transformation
normalizing transformation, 74
  convergence of, 79–85, 103
  distinguished, 74
  formal, 74
  linearizing, 74, 181
  structure of coefficients, 185
Nullstellensatz
  Strong, 33
  Weak, 31
omega limit set, 61, 94
orbit, 59
  closed, 59
order
  degree lexicographic, 11
  degree reverse lexicographic, 11
  lexicographic, 11
  monomial, 10
pair
  resonant, 74
period
  critical, 289
period annulus, 94
period function, 177–179, 183, 289–290
phase portrait, 59, 64, 65
phase space, 59
Poincaré first return map, see return map
Poincaré–Lyapunov Theorem, 108
polydisk, 255
polynomial, 3
  homogeneous, 30
  (j, k)-, 121, 185
  reduced with respect to a set, 13
  standard form of, 11
  term of, 3
polyradius, 255
proper perturbation, 293
quadratic system, 62, 85, 251–252, 271, 272
Radical Membership Test, 35
real center variety, 117
reduce f modulo g, 19
reduce f modulo ideal I, 19
regular point, 64
remainder, 13
resonant coefficient, see coefficient, resonant
resonant pair, see term, resonant
resonant term, see pair, resonant
retention condition, 256
return map, 96, 260
rotation group, 218
S-polynomial, 20–21
separatrix of a center, 205
simple singularity, see singularity, simple
Singular, 53, 318
  routine minAssChar, 46
  routine primdecGTZ, 44
  routine primdecSY, 45
singular point, 59
singularity, 59, 97
  cyclicity of, 260
  hyperbolic, 64
  nondegenerate, 259
  simple, 259, 271, 303
subvariety, 5
symmetry
  axes of, 132, 239–246
    and invariants, 242, 243
    upper bound, 244
  axis of, 240
  generalized, 160
  mirror, 131–133, 240
  time-reversible, 132–134, 240
term, see polynomial, term of
  leading, 11
  resonant, 74
term order, 10
time-reversible system, 229, 231–234
  and centers, 135
  and the symmetry variety, 234
  real, 132
topological equivalence of vector fields, 64
trajectory, 59
V, 5, 8–9, 33, 34
  and I, 33, 34
variety
  affine, 5–9, 29–52
  center, 92–93, 114, 119, 149
    for p : −q-resonant systems, 119
    in quadratic systems, 151
    of quadratic systems, 149–157
    Sibirsky subvariety of, 136
    symmetry subvariety of, 136
  coordinate ring of, 191
  dimension of, 52
  equality and radical ideals, 34
  irreducibility criterion, 41
  irreducible, 40
  isochronicity, 184
  linearizability, 183, 195, 197
    for quadratic systems, 207
  minimal decomposition, 42
  minimal primary decomposition, 43
  of an ideal, 8–9
    and the radical of the ideal, 33
  parametrization of, 49–51
  Sibirsky, 136
  symmetry, 136, 229, 234
  test for equality of, 35
vector field, 59, 104
Zariski closure, 36, 37, 157
