0% found this document useful (0 votes)
76 views409 pages

Garrett Birkhoff, Gian-Carlo Rota - Ordinary Differential Equations-Wiley (1989)

Uploaded by

cacota
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
76 views409 pages

Garrett Birkhoff, Gian-Carlo Rota - Ordinary Differential Equations-Wiley (1989)

Uploaded by

cacota
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 409

ORDINARY

DIFFERENTIAL
EQUATIONS
FOURTH EDITION

GARRETT BIRKHOFF

GIAN-CARLO ROTA
ORDINARY
DIFFERENTIAL
EQUATIONS
FOURTH EDITION

Garrett Birkhoff
Harvard University

Gian-Carlo Rota
Massachusetts Institute of Technology

cy
WILEY

JOHN WILEY & SONS

New York Chicheste Brisbane Toron Singapore


Copyright © 1959, 1960, 1962, 1969, 1978, and 1989 by John Wiley & Sons, Inc

All rights reserved. Published simultaneously in Canada.

Reproduction or translation of any part of


this work beyond that permitted by Sections
107 and 108 of the 1976 United States Copyright
Act without the permission of the copyright
owner is unlawful. Requests for permission
or further information should be addressed to
the Permissions Department, John Wiley & Sons.

Library of Congress Cataloging in Publication Data:

Birkhoff, Garrett, 1911-


Ordinary differential equations.

Bibliography: p. 392
Includes index.
1. Differential equations. I. Rota, Gian-Carlo,
1932- II. Title.
QA372.B58 1989 515.3’52 88-14231
ISBN 0-471-86003-4

Printed in the United States of America

1098 765 4 3 2 1
PREFACE

The theory of differential equations is distinguished for the wealth of its ideas
and methods. Although this richness makes the subject attractive as a field of
research, the inevitably hasty presentation of its many methods in elementary
courses leaves many students confused. One of the chief aims of the present
text is to provide a smooth transition from memorized formulas to the critical
understanding of basic theorems and their proofs.
We have tried to present a balanced account of the most important key ideas
of the subject in their simplest context, often that of second-order equations.
We have deliberately avoided the systematic elaboration of these key ideas, feel-
ing that this is often best done by the students themselves. After they have
grasped the underlying methods, they can often best develop mastery by gen-
eralizing them (say, to higher-order equations or to systems) by their own
efforts.
Our exposition presupposes primarily the calculus and some experience with
the formal manipulation of elementary differential equations. Beyond this
requirement, only an acquaintance with vectors, matrices, and elementary com-
plex functions is assumed throughout most of the book.
In this fourth edition, the first eight chapters have again been carefully
revised. Thus simple numerical methods, which provide convincing empirical
evidence for the well-posedness of initial value problems, are already introduced
in the first chapter. Without compromising our emphasis on advanced ideas and
proofs, we have supplied detailed reviews of elementary facts for convenient
reference. Valuable criticisms and suggestions by Calvin Wilcox have helped to
eliminate many obscurities and troublesome errors.
The book falls broadly into three parts. Chapters 1 through 4 constitute a
review of material to which, presumably, the student has already been exposed
in elementary courses. The review serves two purposes: first, to fill the inevitable
gaps in the student’s mastery of the elements of the subject, and, second, to give
a rigorous presentation of the material, which is motivated by simple examples.
This part covers elementary methods of integration of first-order, second-order
linear, and nth-order linear constant-coefficient, differential equations. Besides
reviewing elementary methods, Chapter 3 introduces the concepts of transfer
function and the Nyquist diagram with their relation to Green’s functions.
Although widely used in communications engineering for many years, these con-
cepts are ignored in most textbooks on differential equations. Finally, Chapter
Vv
vi Preface

4 provides rigorous discussions of solution by power series and the method of


majorants.
Chapters 5 through 8 deal with systems of nonlinear differential equations.
Chapter 5 discusses plane autonomous systems, including the classification of
nondegenerate critical points, and introduces the important notion of stability
and Liapunov’s method, which is then applied to some of the simpler types of
nonlinear oscillations. Chapter 6 includes theorems of existence, uniqueness,
and continuity, both in the small and in the large, and introduces the pertur-
bation equations.
Chapter 7 gives rigorous error bounds for the methods introduced in Chap-
ter 1, analyzing their rates of convergence. Chapter 8 then motivates and ana-
lyzes more sophisticated methods having higher orders of accuracy.
Finally, Chapters 9 through 11 are devoted to the study of second-order lin-
ear differential equations. Chapter 9 develops the theory of regular singular
points in the complex domain, with applications to some important special func-
tions. In this discussion, we assume familiarity with the concepts of pole and
branch point. Chapter 10 is devoted to Sturm-Liouville theory and related
asymptotic formulas, for both finite and infinite intervals. Chapter 11 establishes
the completeness of the eigenfunctions of regular Sturm-Liouville systems,
assuming knowledge of only the basic properties of Euclidean vector spaces
(inner product spaces).
Throughout our book, the properties of various important special func-
tions—notably Bessel functions, hypergeometric functions, and the more com-
mon orthogonal polynomials—are derived from their defining differential
equations and boundary conditions. In this way we illustrate the theory of ordi-
nary differential equations and show its power.
This textbook also contains several hundred exercises of varying difficulty,
which form an important part of the course. The most difficult exercises are
starred.
It is a pleasure to thank John Barrett, Fred Brauer, Thomas Brown, Nathaniel
Chafee, Lamberto Cesari, Abol Ghaffari, Andrew Gleason, Erwin Kreyszig, Carl
Langenhop, Norman Levinson, Robert Lynch, Lawrence Markus, Frank Stew-
art, Feodor Theilheimer, J. L. Walsh, and Henry Wente for their comments,
criticisms, and help in eliminating errors.

Garrett Birkhoff
Gian-Carlo Rota

Cambridge, Massachusetts
CONTENTS

1 FIRST-ORDER OF DIFFERENTIAL EQUATIONS 1

Introduction 1
Fundamental Theorem of the Calculus 2
First-order Linear Equations 7
Separable Equations 9
Quasilinear Equations; Implicit Solutions 11
Exact Differentials; Integrating Factors 15
Linear Fractional Equations 17
Graphical and Numerical Integration 20
The Initial Value Problem 24
*10 Uniqueness and Continuity 26
*11 A Comparison Theorem 29
*12 Regular and Normal Curve Families 31

2 SECOND-ORDER LINEAR EQUATIONS 34

Bases of Solutions 34
Initial Value Problems 37
Qualitative Behavior; Stability 39
Uniqueness Theorem 40
The Wronskian 43
Separation and Comparison Theorems 47
The Phase Plane 49
Adjoint Operators; Lagrange Identity 54
Green’s Functions 58
*10 Two-endpoint Problems 63
*11 Green’s Functions, II 65

3 LINEAR EQUATIONS WITH CONSTANT COEFFICIENTS 71

The Characteristic Polynomial 71


Complex Exponential Functions 72
The Operational Calculus 76
Solution Bases 78
Inhomogeneous Equations 83
vil
viii Contents

6. Stability 85
7. The Transfer Function 86
*8. The Nyquist Diagram 90
*9,. The Green’s Function 93

4 POWER SERIES SOLUTIONS 99

Introduction 99
Method of Undetermined Coefficients 101
More Examples 105
Three First-order DEs 107
Analytic Functions 110
Method of Majorants 113
*7 Sine and Cosine Functions 116
*8 Bessel Functions 117
9 First-order Nonlinear DEs 121
10 Radius of Convergence 124
*11 Method of Majorants, II 126
*12 Complex Solutions 128

5 PLANE AUTONOMOUS SYSTEMS 131

Autonomous Systems 131


Plane Autonomous Systems 134
The Phase Plane, II 136
Linear Autonomous Systems 141
Linear Equivalence 144
Equivalence Under Diffeomorphisms 151
Stability 153
Method of Liapunov 157
Undamped Nonlinear Oscillations 158
10 Soft and Hard Springs 159
11 Damped Nonlinear Oscillations 163
Limit Cycles 164

6 EXISTENCE AND UNIQUENESS THEOREMS 170

1 Introduction 170
2 Lipschitz conditions 172
3 Well-posed Problems 174
4 Continuity 177
*5 Normal Systems 180
Equivalent Integral Equation 183
Successive Approximation 185
Linear Systems 188
Local Existence Theorem 190
Contents x

*10. The Peano Existence Theorem 191


*11. Analytic Equations 193
*12. Continuation of Solutions 197
*13. The Perturbation Equation 198

7 APPROXIMATE SOLUTIONS 204

1 Introduction 204
2 Error Bounds 205
*3 Deviation and Error 207
4 Mesb-halving; Richardson Extrapolation 210
5 Midpoint Quadrature 212
6 Trapezoidal Quadrature 215
7, Trapezoidal Integration 218
8 The Improved Euler Method 222
+9, The Modified Euler Method 224
*10. Cumulative Error Bound 226

8 EFFICIENT NUMERICAL INTEGRATION 230

1 Difference Operators 230


2 Polynomial Interpolation 232
*3 Interpolation Errors 235
4 Stability 237
*5 Numerical Differentiation; Roundoff 240
*6 Higher Order Quadrature 244
«7, Gaussian Quadrature 248
8 Fourth-order Runge-Kutta 250
*9 Milne’s Method 256
*10 Multistep Methods 258

9 REGULAR SINGULAR POINTS 261

1 Introduction 261
*2 Movable Singular Points 263
First-order Linear Equations 264
Continuation Principle; Circuit Matrix 268
Canonical Bases 270
Regular Singular Points 274
Bessel Equation 276
The Fundamental Theorem 281
*9 Alternative Proof of the Fundamental Theorem 285
*10 Hypergeometric Functions 287
*11 The Jacobi Polynomials 289
*12 Singular Points at Infinity 292
*13 Fuchsian Equations 294
x Contents

10 STURM-LIOUVILLE SYSTEMS 300

1 Sturm-Liouville Systems 300


2 Sturm-Liouville Series 302
*3 Physical Interpretations 305
Singular Systems 308
Prifer Substitution 312
Sturm Comparison Theorem 313
Sturm Oscillation Theorem 314
The Sequence of Eigenfunctions 318
The Liouville Normal Form 320
10 Modified Priifer Substitution 323
*11 The Asymptotic Behavior of Bessel Functions 326
12 Distribution of Eigenvalues 328
13 Normalized Eigenfunctions 329
14. Inhomogeneous Equations 333
15 Green’s Functions 334
*16 The Schroedinger Equation 336
*17 The Square-well Potential 338
*18 Mixed Spectrum 339

11 EXPANSIONS IN EIGENFUNCTIONS 344

Fourier Series 344


2 Orthogonal Expansions 346
Mean-square Approximation 347
4 Completeness 350
5 Orthogonal Polynomials 352
+6. Properties of Orthogonal Polynomials 354
*7, Chebyshev Polynomials 358
8 Euclidean Vector Spaces 360
9 Completeness of Eigenfunctions 363
*10. Hilbert Space 365
*11. Proof of Completeness 367

APPENDIX A: LINEAR SYSTEMS 371

1. Matrix Norm 371


_ 2. Constant-coefficient Systems 372
3. The Matrizant 375
4. Floquet Theorem; Canonical Bases 377

APPENDIX B: BIFURCATION THEORY 380

1. What Is Bifurcation? 380


*2. Poincaré Index Theorem 381
Contents xi

3. Hamiltonian Systems 383


4. Hamiltonian Bifurcations 386
5. Poincaré Maps 387
6. Periodically Forced Systems 389

BIBLIOGRAPHY 392

INDEX 395
CHAPTER 1

FIRST-ORDER
DIFFERENTIAL
EQUATIONS

1 INTRODUCTION

A differential equation is an equation between specified derivatives of an


unknown function, its values, and known quantities and functions. Many phys-
ical laws are most simply and naturally formulated as differential equations (or
DEs, as we will write for short). For this reason, DEs have been studied by the
greatest mathematicians and mathematical physicists since the time of Newton.
Ordinary differential equations are DEs whose unknowns are functions of a
single variable; they arise most commonly in the study of dynamical systems and
electrical networks. They are much easier to treat than partial differential equa-
tions, whose unknown functions depend on two or more independent variables.
Ordinary DEs are classified according to their order. The order of a DE is
defined as the largest positive integer, n, for which an nth derivative occurs in
the equation. Thus, an equation of the form

o(x,9,9’) = 0

is said to be of the first order.


This chapter will deal with first-order DEs of the special form

(1) M(x,y) + N(x,y)y’ =

A DE of the form (1) is often said to be of the first degree. This is because, con-
sidered as a polynomial in the derivative of highest order, y’, it is of the first
degree.
One might think that it would therefore be called “linear,” but this name is
reserved (within the class of first-order DEs) for DEs of the much more special
form a(x)y’ + b(x)y + ¢(x) = 0, which are linear in y and its derivatives. Such
“linear” DEs will be taken up in §3, and we shall call first-order DEs of the more
general form (1) quasilinear.
A primary aim of the study of differential equations is to find their solutions—
that is, functions y = f(x) which satisfy them. In this chapter, we will deal with
the following special case of the problem of “solving’’ given DEs.

DEFINITION. A solution of (1) is a function f(x) such that M(x,f(x)) +


N(x,f(x))f(x) = 0 for all x in the interval where f(x) is defined.
1
2 CHAPTER 1 First-Order Differential Equations

The problem of solving (1) for given functions M(x,y) and N(x,y) is thus to
determine all real functions y = f(x) which satisfy (1), that is, all its solutions.

Example 1. Consider the first-order quasilinear DE

(2) x + yy =0

The solutions of (2) can be found by considering the formula d(x? + y*) /dx =
=

2(x + yy’). Clearly, y = f(x) is a solution of (2) if and only if x? + y? = Cisa


constant.

The equation x? + y? = C defines y implicitly as a two-valued function of x,


for any positive constant C. Solving for y, we get for each positive constant C
two solutions, the (single-valued)t functions y = + VC — x*. The graphs of these
solutions, the so-called solution curves, form two families of semicircles. These
fill the upper half-plane y > 0 and the lower half-plane y < 0, respectively, in
that there is one and only one such semicircle through each point in each half-
plane.

Caution. Note that the functions y = + C — x* are defined only in the


interval —VC sx VC , and that since y’ does not exist (is “‘infinite”) when
+ VC, these functions are solutions of (1) only on — C<x<ivec.
=

x =

Therefore, although the pairs of semicircles in Figure 1.1 appear to join together
to form the full circle x? + x = C, the latter is not a “‘solution curve’ of (1). In
fact, no solution curve of (2) can cross the x-axis (except possibly at the origin),
because on the x-axis y = 0 the DE (2) implies x 0 for any finite y’.
=
=

The preceding difficulty also arises if one tries to solve the DE (2) for y’. Divid-
ing through by y, one gets y’ —x/y, an equation which cannot be satisfied if
=
=

y = 0. The preceding difficulty is thus avoided if one restricts attention to


regions where the DE (1) is normal, in the following sense.

DEFINITION. A normal first-order DE is one of the form

(3) y = F(x,y)

In the normal form y’ = —x/y of the DE (2), the function F(x,y) is continuous
in the upper half-plane y > 0 and in the lower half-plane where y < 0; it is
undefined on the x-axis.

2 FUNDAMENTAL THEOREM OF THE CALCULUS

Although the importance of the theory of (ordinary) DEs stems primarily


from its many applications to geometry, science, and engineering, a clear under-

{ In this book, the word “function”’ will always mean single-valued function, unless the contrary is
expressly specified.
2 Fundamental Theorem of the Calculus 3

Figure 1.1 Integral curves of x + yy’ = 0.

standing of its capabilities can only be achieved if its definitions and results are
formulated precisely. Some of its most difficult results concern the existence and
uniqueness of solutions. The nature of such existence and uniqueness theorems
is well illustrated by the most familiar (and simplest!) class of ordinary DEs.
These are the first-order DEs of the very special form

(4) y¥ = g(x)

Such DEs are normal; their solutions are described by the fundamental theorem
of the calculus, which reads as follows.

FUNDAMENTAL THEOREM OF THE CALCULUS. Let ithe function g(x) in the


DE (4) be continuous in the interval a <= x =< b. Given a number c, there is one and
only one solution f(x) of the DE (A) in the interval such that f(a) = ¢. This solution is
given by the definite integral

(5) fe =et f g(t) di, c = f(a)

This basic result serves as a model of rigorous formulation in several respects.


First, it specifies the region under consideration, as a vertical strip a = x = b
in the xy-plane. Second, it describes in precise terms the class of functions g(x)
considered. And third, it asserts the existence and uniqueness of a solution, given
the “initial condition” f(a) = ¢.
We recall that the definite integral

(5’) f g(t) dt = lim ; >~ glt,) At, At, = t, — tp-1


maxAy

is defined for each fixed x as a limit of Riemann sums; it is not necessary to find
a formal expression for the indefinite integral [g(x) dx to give meaning to the
definite integral {% g(t) dt, provided only that g(t) is continuous. Such functions
4 CHAPTER 1 First-Order Differential Equations

as the error function erf x


=
=

(2/Vx) fé e~® dt and the sine integral function


Si (x) = Jf} [(sin #)/é] dt are indeed commonly defined as definite integrals; cf.
Ch. 4, §1.

Quadrature. The preceding considerations enable one to solve DEs of the


special form y’ = g(x) by inspection: for any a, one solution is the function
J% g(é) dt; the others are obtained by adding an arbitrary constant C to this
ee
particular” solution. Thus, the solutions of y’ = e
x
* are the functions y =
Se-® dx =
=
(Vm /2) erf x + C; those of xy’ =
=

sin x are the functions y =


Si (x) + C; and so on. Note that from any one solution curve of y’ = g(x),
the others are obtained by the vertical translations (x,y) fF» (x,y + C).t Thus,
they form a one-parameter family of curves, one for each value of the parameter
C. This important geometrical fact is illustrated in Figure 1.2.
After y’ = f(x), the simplest type of DE is y’ = g(y). Any such DE is invariant
under horizontal translation (x,y) Fe (x + ¢,y). Hence, any horizontal line is cut
by all solution curves at the same angle (such lines are called “‘isoclines”), and
any horizontal translate y = ¢(x + c) of any solution curve y = ¢(x) is again a
solution curve.
The DE y’ = y is the most familiar DE of this form. It can be solved by rewrit-
ing it as dy/y = dx; integrating, we get x = In |y| + c, ory = +e*’, where c
is an arbitrary constant. Setting k = te‘, we get the general solution y = ke*—
but the solution y = 0 is “‘lost’’ until the last step.

Example 2. A similar procedure can be applied to any DE of the form y’ =


g(y). Thus consider

(6) y=y-l
Since »? — 1 = (y + 1)(y — 1), the constant functionsy = —1 and y = 1 are
particular solutions of (6). Since y* > 1 if |y| > 1 whereas y? < 1 if —1 < y

_——

a
—x2
Figure 1.2 Solution curves of y’ = e

+ The symbol |~ is to be read as “goes into’’.


2 Fundamental Theorem of the Calculus 5

< 1, all solutions are decreasing functions in the strip |y| < 1 and increasing
functions outside it; see Figure 1.3.
Using the partial fraction decomposition 2/(y? — 1) = 1/(y — 1) — 1/(y +
1), one can rewrite (6) as 2 dx = dy/(y — 1) — dy/(y + 1) from which we obtain,
by integrating, 2(x — c) = In |(y — 1)/(y + I)|. Exponentiating both sides, we
get te*%*-9 = (y — 1)/(y + 1), which reduces after some manipulation to

1 + e%-9 tanh
(6/) ,= 1 Ft ee4 =
| coth
Jes
This procedure “‘loses’”’ the special solutions y = 1 and y = —1, but gives all
others. Note that if y = f(x) is a solution of (6), then so is 1/y = 1/f(x), as can
be directly verified from (6) (provided y # 0).

Example 3. A more complicated DE tractable by the same methods is y’ =


y® — y. Since y®? — y = y(y + 1)(y — 1), the constant functions y = —1, y = 0,
and y = 1 are particular solutions. Since y’ > y if —1 < y < 0 or 1 < y, whereas
y> < y if y < —1 or 0 < y < 1, all solutions are increasing functions in the
strips —1 < y < 0 and y > 1, and decreasing in the complementary strips.
To find the other solutions, we replace the DE y’ = dy/dx = y® — y by its
reciprocal, dx/dy = 1/(y®> — y). We then use partial fractions to obtain the DE

dx 1 2
(6”)
—_—

dy
= oe

yy
a

yt]
a

y-1 J |
The DE (6”) can be integrated termwise to give, after some manipulation,
x =4In|1—y?| + Cory = [1 F exp (2x — 77,k = Qe.

Symmetry. The labor of drawing solution curves of the preceding DEs is


reduced not only by their invariance under horizontal translation, but by the
use of other symmetries as well. Thus, the DEs 9’ = y and y’ = y® — y are invar-
iant under reflection in the x-axis [i.e., under (x,y) F> (x, —y)]; hence, so are
their solution curves. Likewise, the DEs y’ = 1 + y* and y’ = y? —1 (and their
solution curves) are invariant under (x,y) Fey (—x, —y)—i.e., under rotation
through 180° about the origin. These symmetries are visible in Figures 1.3 and
1.4.

EXERCISES A

1 (a) Show that if f(x) satisfies (6), then so do 1/f(x) and —/f(—x).
(b) Explain how these facts relate to Figure 1.2.

Show that every solution curve (6’) of (6) is equivalent under horizontal translation
and/or reflection in the x-axis toy = (1 + e*)/(1 — e*) or toy = (1 — e*)
(1 +
e**),
(a) Show that if y’ = y? + 1, then y is an increasing function and x = arctan y + ¢.
(b) Infer that no solution of y’ = y® + 1 can be defined on an interval of length
exceeding 7.
CHAPTER 1 First-Order Differential Equations

i AN SSS
yr

yr-t

Figure 1.3 Solution curves of y’ = y? — 1.

(c) Show that a nonhorizontal solution curve of y’ = y? + 1 hasa point of inflection


on the x-axis and nowhere else.

Show that the solution curves of y’ = y* are the x-axis and rectangular hyperbolas
having this for one asymptote. [HInT: Rewrite y’ = y? as dy/y? = dx.]

Sketch sample solution curves to indicate the qualitative behavior of the solutions of
the following DEs: (a) y’ = 1 — y°, (b) y’ = sin xy, (©) y’ = sin? y.
Show that the solutions of y’ = g(y), for any continuous function g, are either all
increasing functions or all decreasing functions in any strip y,_; < y < y, between
successive zeros of g(y) [i.e., values y,, such that g(y) = 0].

Show that the solutions of y’ = g(y) are convex up or convex down for given y accord-
ing as {g| is an increasing or decreasing function of y there.

LLL
——=>
——
+

aoe
EE ye

WV\Y\\
Figure 1.4 Solution curves of y’ = y* — y.
3 First-Order Linear Equations 7

*8. (a) Prove in detail that any nonconstant solution of (6) must satisfy

x=ct+4in
ly — D/yt+ DI

(b) Solve (6”) in detail, discussing the case k = 0 and the limiting case k = © (y =
0).

*9. (a) Show that the choice k < 0 in (6) gives solutions in the strip —1 < y < 1.
(b) Show that the choice k = 1 gives two solutions having the positive and negative
y-axes for asymptotes, respectively.

3. FIRST-ORDER LINEAR EQUATIONS

In the next five sections, we will recall some very elementary, but extremely
useful methods for solving important special families of first-order DEs. We
begin with the first-order linear DE

(7) a(x)y’ + b(x)y + cx) = 0

It is called homogeneous if c(x) = 0, and inhomogeneous otherwise.


Let the coefficient functions a, b, c be continuous. In any interval J where a(x)
does not vanish, the linear DE (7) can be reduced to the normal form

(8) y = —pl)y — g(x)

with continuous coefficient functions p = b/a and q = c/a.


The homogeneous linear case y’ —p(x)y of (8) is solved easily, if not rig-

=

orously, as follows. We separate variables, dy/y = — p(x) dx; then we integrate


(by quadratures), In |y| = —Jp(x) dx + C. Exponentiating both sides, we obtain
ly| = Ke), where K = e and any indefinite integral P(x) = Jp(x) dx may
be used.
This heuristic reasoning suggests that, if P’(x) = p(x), then ye is a constant.
Though this result was derived heuristically, it is easily verified rigorously:

d{ye” =) /dx = ye? + plx)ye”©) = ¢P Ty’ + p(x)y] = 0

if and (since e”” ¥ 0) only if y satisfies (8). This proves the following result.

THEOREM 1. [IfP(x) = Sp(x) dx is an indefinite integral of the continuousfunc-


tion p, then the function ce~" = ce is a solution of the DE y'+p(x)y =0 for
any constant c, and all solutions of the DE are of this form.

* The more difficult exercises in this book are starred.


8 CHAPTER 1 First-Order Differential Equations

We can treat the general case of (8) similarly. Differentiating the function
P(x)
ey, where P(x) is as before, we get

d[e?™y]/dx = e?™[y’ + p(x)y] = —e?g(x)

It follows that, for some constant yo, we must have ¢P(x)J = 9) — fz a(t) at,
whence

(8’)
y = yeP® — 9PO)f ° ea(t) dt
a

Conversely, formula (8’) defines a solution of (8) with y(a) = yo for every yo, by
the Fundamental Theorem of the Calculus. This proves

THEOREM 2. If P(x) is as in Theorem 1, then the general solution of the DE (8)


is given by (8’). Moreover, yo = (a) if and only if P(x) = SX p(x)dx.

Quadrature. In the Fundamental Theorem of the Calculus, if the function


g is nonnegative, the definite integral in (5) is the area under the curve y = g(x)
in the vertical strip between a and x. For this reason, the integration of (4) is
called a quadrature. Formula (8’) reduces the solution of any first-order linear
DE to the performance of a sequence of quadratures. Using Tables of Indefinite
Integrals,+ the solutions can therefore often be expressed explicitly, in terms of
“elementary” functions whose numerical values have been tabulated (‘‘tabulated
functions’’).

Initial Value Problem. In general, the “initial value problem” fora first-
order DE y’ = F(x,y) consists in finding a solution y = g(x) that satisfies an initial
condition y(a) = yo, where a and yo are given constants. Theorem2 states that
the initial value problem always has one and only one solution for a linear DE
(8), on any interval a S x S 5b where p(x) and q(x) are defined and continuous.

Remark. There are often easier ways to solve linear DEs than substitution in
(8’). This fact is illustrated by the following example.

Example 4. Consider the inhomogeneous linear DE

(9) yty=xt3

Trying y = ax + 4, one easily verifies that x + 2 is one solution of (9). On the


other hand, if y = f(x) is any other solution, then z = y — (x + 2) must satisfy
z’ +z = (y’ + 9) — & + 3) = 0, whence z

=
ce-* by Theorem 1. It follows
that the general solution of (9) is the sum ce~* + x + 2.

+ See the book by Dwight listed in the Bibliography. Kamke’s book listed there contains an extremely
useful catalog of solutions of DEs not of the form y’ = g(x). For a bibliography of function tables,
see Fletcher, Miller, and Rosenhead.
4 Separable Equations 9

4 SEPARABLE EQUATIONS

A differential equation that can be written in the form

(10) yx = g(x)h(y)

is said to be separable. Thus, the DEs y’ = 9? — 1 and y’ = y° — y of Examples


2 and 3 are obviously separable, with g(x) = 1. The DE x + yy’ = 0 of Example
1, rewritten as y’ = (—x)(1/y) is separable except on the x-axis, where 1/y
becomes infinite. As we have seen, the solutions y = + VC — x* of this DE
cannot be expressed as single-valued functions of x on the x-axis, essentially for
this reason.
A similar difficulty arises in general for DEs of the form

(11) M(x) + N(y)y’ = 0

These can also be rewritten as

(11%) M(x) dx + Ny) dy = 0

or as y’ = —M(x)/N(y) and are therefore also said to be “separable.”” Whenever

N(y) vanishes, it is difficult or impossible to express y as a function of x.


It is easy to solve separable DFs formally. If ¢(x) = JM(x) dx and yy) =
JN(y) dy are any antiderivatives (“indefinite integrals’) of M(x) and N(y), respec-
tively, then the level curves

d(x) + Wy) = C

of the function U(x,y) = (x) + (y) are solution curves of the DEs (11) and
(11’). Moreover, the Fundamental Theorem of the Calculus assures us of the
existence of such antiderivatives. Likewise, for any indefinite integrals G(x) =
Sg(x) dx and H(y) = Sdy/h(y), the level curves of

G(x) — H(y) = C

may be expected to define solutions of (10), of the form

(11”) y = HC — Gi]

However, the solutions defined in this way are only local. They are defined by
the Inverse Function Theorem,} but only in intervals of monotonicity of H(y)
where h(y) and hence H’(y) = 1/h(y) has constant sign. Moreover, the range
of H(y) may be bounded, as in the case of the DE y’ = I + 9. In this case,

+ This theorem states that if H(y) is a strictly monotonic map of [c,d] onto [4,6], then H~!('y) is single-
valued and monotonic from [a,b] to [c,d].
10 CHAPTER 1 First-Order Differential Equations

S%o0 dy/(. + 9?) = a. Therefore, no solution of the DE ¥ = 1 + 9? can be


continuously defined over an interval (a,b) of length exceeding 7.

Example 5. Consider the DE y’ = (1 + y)e~™. Separating variables, we get


J dy/( + 9%) = f e-* dx, whose general solution is arctan y = (Vx/2) erf x +
C, or y = tan {((Vx/2) erf (x) + C).
The formal transformations (10’) and (10”) can be rigorously justified when-
ever g(x) and h(y) are continuous functions, in any interval in which'h(y) does not
vanish. This is because the Fundamental Theorem of the Calculus again assures
us that ¢(x) = Jf g(x) dx exists and is differentiable on any interval where g(x) is
defined and continuous, while ¥(y) = f dy/h(y) exists and is strictly monotonic in
any interval (y,,¥2) between successive zeros y, and y2 of h(y), which we also
assume to be continuous. Hence, as in Example 2, the equation

vo — 90) = f 2 — J no ax =e
gives for each c a solution of y’ = g(x)h(y) in the strip y, < y < yg. Near any x
with y, — ¢ < $(x) < yg — ¢, this solution is defined by the inverse function
theorem, by the formula y = y~'(¢(x) + 0).

Orthogonal Trajectories. An orthogonal trajectory to a family of curves is a


curve that cuts all the given curves at right angles. For example, consider the
family of geometrically similar, coaxial ellipses x* + my? = C. These are integral
curves of the DE x + myy’ = 0, whose normal form y’ = —x/my has separable
variables. The orthogonal trajectories of these ellipses have at each point a slope
/
J my/x, which is the negative reciprocal of —x/my. Separating variables, we
=
=

get dy/y = m dx/x, or In |4| m In |x|, whence the orthogonal trajectories


_
=

are given byy = + |x|”.


More generally, the solution curves of any separable DE y’ = g(x)h(y) have
as orthogonal trajectories the solution curves of the separable DE y’

=

—1/g(x)h(y).

Critical Points. Points where du/dx du/dy = 0 are called critical points
=
=

of the function u(x,y). Note that the directions of level lines and gradient lines
may be very irregular near critical points; consider those of the functions x? +
y® near their critical point (0,0).
As will be explained in §5, the level curves of any function u € @'(D) satisfy
the DE du/dx + y’du/dy = 0 in D, except at critical points of u. Clearly, their
orthogonal trajectories are the solution curves of du/dy = y’du/dx, and so are
everywhere tangent to the direction of Vu grad u (0u/dx, du/dy). Curves
= =>
= =

having this property are called gradient curves of u. Hence the gradient curves
of u are orthogonal trajectories of its level curves, except perhaps at critical
points.
5 Quasilinear Equations; Implicit Solutions 11

EXERCISES B

1 Find the solution of the DE xy’ + 3y = 0 that satisfies the initial condition
fl) = 1.

Find equations describing all solutions of y’ = (x + yy. (Hint: Set u = x + 9.)


(a) Find all solutions of the DE xy’ + (1 — x)y = 0
(b) Same question for xy’ + (1 — x)y = 1.
(a) Solve the DEs of Exercise 3 for the initial conditions y(1) = 1, y(1) = 2.
(b) Do the same for y(0) = 0 and y(0) = 1, or prove that no solution exists.

(a) Find the general solution of the DE y’ + y = sin 2¢.


(b) For arbitrary (real) constants a, b, and k # 0, find a particular solution of

(*) y = ay + bsin
kt

(c) What is the general solution of (*)?

(a) Find a polynomial solution of the DE

(**) y t=
x? + 4x47

(b) Find a solution of the DE (*) that satisfies the initial condition y(0) = 0.

Show that if & is a nonzero constant and q(x) a polynomial of degree n, then the DE
xy’ + y = q(x) has exactly one polynomial solution of degree n.
In Exs. 8 and 9, solve the DE shown and discuss its solutions qualitatively.

8. dr/d0 = 7° sin 1/r (polar coordinates).


9. dr/d0 = 2/log r.
10. (a) Show that the ellipses 5x” + 6xy + 5y? = C are integral curves of the DE

(5x + 3y) + (3x + 5y)y’ = 0

(b) What are its solution curves?

5 QUASILINEAR EQUATIONS; IMPLICIT SOLUTIONS

In this section and the next, we consider the general problem of solving
quasilinear DEs (1), which we rewrite as

(12) M(x,y) dx + N(x,y) dy = 0

to bring out the latent symmetry between the roles of x and y. Such DEs arise
naturally if we consider the level curves of functions. If G(x,y) is any continuously
differentiable function, then the DE

0G
(12’/
ox
(x,y) + > (x,y) = 0
12 CHAPTER 1 First-Order Differential Equations

is satisfied on any level curve G(x,y) = C, at all points where dG/dy # 0. This
DE is of the form (1), with M(x,y) = 8G/dx and N(x,y) = 8G/dy.
For this reason, any function G which is related in the foregoing way to a
quasilinear DE (1) or (12), or to a nonzero multiple of (12) of the form
'

(12”) (x,y) [M(x,y) dx + N(x,y) dy] = 0, w#0

is called an implicit solution of (12). Slightly more generally, an integral of (1) or


(12) is defined as a function G(x,y) of two variables that is constant on every
solution curve of (1).
For example, the equation x* — 6x”y? + y* = C is an implicit solution of the
quasilinear DE

(x? — Bxy*) + (y? — 3x°y)y’ = 0


/ =
(x — 3xy")
or
~ (8x2y — 9°)
The level curves of x* — 6xy? + y* have vertical tangents on the x-axis and the
lines y = + /3x. Elsewhere, the DE displayed above is of the normal form
y = F(x,y).

Critical Points. At points where 0¢/dx = 0¢/dy = 0, the directions of the


gradient and level curves are undefined; such points are called “critical points”
of ¢. Thus, the function x? + y” has the origin for its only critical point, and the
same is true of the function x* — 6x*y? + y*. (Can you prove it?) On the other
hand, the function sin (x? + 9%) also has circles of critical points, occurring wher-
ever 7° is an odd integral multiple of #/2. Most functions have only isolated crit-
ical points, however, and in general we shall confine our attention to such
functions.
We will now examine more carefully the connection between quasilinear DEs
and level curves of functions, illustrated by the two preceding examples. To
describe it accurately, we will need two more definitions. We first define a
domain} as a nonempty open connected set. We call a function @ = (x), » X,)
of class @” in a domain D when all its derivatives 0¢/0x,, 0°/Ax,0x . . of orders
1,..., m exist and are continuous in D. We will write this condition in symbols
as @ € @” or ¢ € @*(D). When ¢ is merely assumed to be continuous, we will write
$ € @or € @(D).
To make the connection between level curves and quasilinear DEs rigorous,
we will also need to assume the following basic theorem.

+ See Apostol, Vol. 1, p. 252. Here and later, page references to authors refer to the books listed in
the selected bibliography.
5 Quasilinear Equations; Implicit Solutions 13

IMPLICIT FUNCTION THEOREM.+ Let u(x,y) be a function of class C" (n =


1) in a domain containing (xo,Jo); let up denote u(xo,yo), and let u,(x,9o) #0. Then
there exists positive numbers ¢ and » such that for each x € (xg — €,X%9 + 2 and CE
(up — €,to + ©), the equation u(x,y) = C has a unique solution y = f(x,C) in the
interval (yo — 7, yo + 7). Moreover, the function f so defined is also of class @”.
It follows that if u € @"(D), n = 1, the level curves of u are graphs of func-
tions y = f(x,c), also of class @”, except where du/dy = 0. In Example 1, u =
=

xe + y and there is one such curve, the x-axis y = 0; this divides the plane into
two subdomains, the half-planes y > 0 and y < 0. Moreover, the locus (set) where
du/dy = 0 consists of the points where the circles u const have vertical tan-
=
=

gents and the “‘critical point” (0,0) where du/dx = du/dy = 0—that is, where
the surface z u(x,y) has a horizontal tangent plane.
=
=

This situation is typical: for most functions u(x,y), the partial derivative du/d
y vanishes on isolated curves that divide the (x,y)-plane into a number of regions
where 0u/dy # 0 has constant sign, and hence in which the Implicit Function
Theorem applies.

THEOREM 3. In any domain where du/dy # 0, the level curves of any function
u € @' are solution curves of the quasilinear DE

(13) (x,y7,") = M(x,y) + N(x,y)y’ = 0

where M(x,y) = du/dx and N(x,y) = du/dy.

Proof. By the Chain Rule, du/dx = du/dx + (du/dy)y’ along any curve y =
J (x). Hence, such a curve is a level curve of u if and only if

du Ou ou /
0
=
—> = ——
=

dx ox ay
By the Implicit Function Theorem, the level curves of u, being graphs of func-
tions y = f(x) in domains where df/dy # 0, are therefore solution curves of the
quasilinear DE (13). In the normal form y’ = F(x,y) of this DE, therefore, F(x,y)
=
=
— (du/dx)/(0u/dy) becomes infinite precisely when du/dy = 0.

To describe the relationship between the DE (13) and the function u, we need
a new notion.

DEFINITION. An integral of a first-order quasilinear DE (1) is a function


of two variables, u(x,y), which is constant on every solution curve of (1).
Thus, the function u(x,y)
=
=

x® + y* is an integral of the DE x + yy’ = 0

+ Courant and John, Vol. 2, p. 218. We will reconsider the Implicit Function Theorem in greater
depth in §12.
14 CHAPTER 1 First-Order Differential Equations

because, upon replacing the variable y by any function + VC — x? , we obtain


u(x,y) = C. This integral is most easily found by rewriting x + y dy/dx 0 in
=
=

differential form, as x dx + y dy = 0, and recognizing that x dx + y dy = 4d(x?


+ y°) is an ‘exact’ differential (see §6).
Level curves of an integral of a quasilinear DE are called integral curves of the
DE; thus, the circles x? + y? = C are integral curves of the DE x + yy’ = 0,
although not solution curves.

Example 6. From the DE yy’ = x, rewritten as y dy/dx = x, we get the equa-


tion
y dy — x dx
=
=
0. Since y dy — x dx
=
=
$d(y? — x”), we see that the integral
curves of the DE are the branches of the hyperbolas y? = x? + C and the asymp-
totes y = +x, as shown in Figure 1.5. The branches y = + V x + kare solu-
tion curves, but each level curve y = + x? — k* has four branches separated
by the x-axis (the line where the integral curves have vertical tangents).
Note that, where the level curves y = x and y = —x of y? — x2 cross,
the gradient (OF/dx, OF/dy) of the integral F(x,y) = y* — x” vanishes: (@F/dx,
dF/dx) = (0,0).

7.
is \,

SZN
Figure 1.5 Level curves
c = 0, +1, +2, +3, +4, +6, +9, +12 of
y? — 2
6 Exact Differentials; Integrating Factors 15

6 EXACT DIFFERENTIALS; INTEGRATING FACTORS

A considerably larger class of “implicit solutions” of quasinormal DEs can be


found by examining more closely the condition that M(x,y) dx + N(x,y) dy be an
“exact differential’? dU, and by looking for an “integrating factor’ y(x,y) that
will convert the equation

(14) M(x,y) dx + N(x,y) dy = 0

into one involving a “‘total’’ or “exact” differential

pdU = p(x,y)[M(x,y) dx + N(x,y) dy] = 0

whose (implicit) solutions are the level curves of U.


In general, the quasinormal DE (1) or

(14) M(x,y) + N(x,y)y’ = 0

is said to be exact when there exists a function U(x,y) of which it is the ‘total
differential’, so that QU/dx = M(x,y) and 0U/dy = N(x,y), or equivalently

aU 0
(14”) + ey dy = M(x,y) dx + N(x,y) dy
dU = — dx
Ox 0

Since dU = 0 on any solution curve of the DE (14), we see that solution curves
of (14) must lie on level curves of U, just as in the “separable variable”’ case.
Since 0°U/dxdy = 6°U/dydx, clearly a necessary condition for (14’) to be an
exact differential is that 0N/dx 0M/dy. It is shown in the calculus that the

=

converse is also true locally. More precisely, the following result is true.

THEOREM 4. If M(x,y) and N(x,y) are continuously differentiable functions


in a simply connected domain, then (14’) is an exact differential if and only if
aN/dx = 8M/dy.
The function U =U(P) for (14) is constructed as the line integral [§ [M(x,y)dx
+N (x,y) dy] from a fixed point 0 in the domain [perhaps 0 = (0,0)] to a variable
point P = (x,y). Thus, for the DE x + yy’ = 0 of Example 1, this procedure
gives [6 (x dx + y dy) = (x? + y’) /2, showing again that the solution curves of
x + yy’ = Olie on the circles x? + y? = C with center (0,0). More generally, in
the separable equation case of g(x) dx + dy/h(y), we have d[g(x)] /dy = 0 =
O[1/h(y)] /Ox, giving G(x) +H(y) = Cas in §5.
Even when the differential M dx + N dy is not exact, one can often find a
function # (x,y) such that the product

(uM) dx + (uN) dy = du
16 CHAPTER 1 First-Order Differential Equations

is an exact differential. The contour lines u(x,y) = C will then again be integral
curves of the DE M(x,y) + N (x,y)y’ = 0 because du/dx = u(M +Ny’) = 0; and
segments of these contour lines between points of vertical tangency will be solu-
tion curves. Such a function yg is called an integrating factor.

DEFINITION. An integrating factor for a differential M(x,y) dx + N(x,y) dy is


a nonvanishing function (x,y) such that the product (uM) dx + (uN) dy is an
exact differential.

Thus, as we saw in §3, for any indefinite integral P(x) = [p(x) dx of p(x), the
function exp {P(x)} is an integrating factor for the linear DE (8). Likewise, the
function 1/h(x) is an integrating factor for the separable DE (11).
The differential x dy — y dx furnishes another interesting example. It has an
integrating factor in the right half-plane x > 0 of the form u(x) = 1/x?, since
dy/x — y dx/x® = d(y/x); cf. Ex. C11. A more interesting integrating factor is
1/(x? + y*). Indeed, the function

(x,y) (xdy -y dx)


O(x,y) = J 1,0) (x? + y’)

is the angle made with the positive x-axis by the vector (x,y). That is, it is just the
polar angle @ when the point (x,y) is expressed in polar coordinates. Therefore,
the integral curves of xy’ = y in the domain x > 0 are the radii 6 = C, where
—a/2 <0 < w/2; the solution curves are the same.
Note that the differential (x dy — y dx)/(x® + y°) is not exact in the punctured
plane, consisting of the x,y-plane with the origin deleted. For @ changes by 2x
in going around the origin. This is possible, even though 9[x/(x? + y°)]/dx = 0
[—y/(x? + y*)]/dy, because the punctured plane is not a simply connected
domain.
Still another integrating factor of x dy — y dx is 1/xy, which replaces x dy —
y dx 0 by dy/y = dx/x, or In |y| = In |x| + Cin the interior of each of the
=
=

four quadrants into which the coordinate axes divide the (x,y)-plane. Exponen-
tiating both sides, we get y = kx.
A less simple example concerns the DE x(x* — 2y°)y’ = (2x3 — y*)y. Here an
integrating factor is 1/x*y?. If we divide the given DE by x*y®, we get
2 2 _ Qxy _ xty! — y* + Qxy>y!
d x
J
dx ( J x
xy?

Hence the solution curves of the DE are (x?/y) + (y?/x) = C, or x° + 9? = Cxy.

Parametric Solutions. Besides “explicit” solutions y = f(x) and ‘‘implicit”’


solutions U(x,y) = C, quasinormal DFs (14) can have “parametric’’ solutions.
Here by a parametric solutzon is meant a parametric curve x = g(i), y = A(é) along
which the line integral [M(x,y) dx + N(x,9) dy, defined as

(15) [email protected])g’O + N(e@,hO)A(O) at


7 Linear Fractional Equations 17

vanishes. Thus, the curves x = A cost, y = A sin ¢ are parametric solutions of


x + yy’ = 0. They are also solutions of the system of two first-order DEs dx/dt
= —y, dy/dt = x, and will be studied from this standpoint in Chapter 5.

EXERCISES C

1 Find an integral of the DE y’ = y?/x?, and plot its integral curves. Locate its critical
points, if any.

Sketch the level curves and gradient lines of the function x° + 3x°y + 9°. What are
its critical points?

Same question as Exercise 2 for xs


— 3x%y + 9%
Find equations describing all solutions of

1
y= Qx+y

For what pairs of positive integers n,r is the function |x|" of class @’?

Solve the DE xy’ + y = 0 by the method of separation of variables. Discuss its


solution curves, integral curves, and critical points.

(a) Reduce the Bernoulli DE y’ + p(x)y = q(x)y", n # 1, to a linear first-order DE


by the substitution u = y'~*.
(b) Express its general solution in terms of indefinite integrals.

In Exs. 8 and 9, solve the DE exhibited, sketch its solution curves, and describe them
qualitatively:
1
8 ¥ = y/x— x? 9. 9 = y/x — In [x] °
10 Find all solutions of the DE |x| + jy[y’ = 0. In which regions of the plane is the
differential on the left side exact?

*11. Show that the reciprocal of any homogeneous quadratic function Q(x) = Ax? +
2Bxy + Cy? is an integrating factor of x dy — y dx.
*12. Show that if u and v are both integrals of the DE M(x,y) + N(x,y)y’ = 0, then so
are u + v, uv except where v =
=
0, Aw + wv for any constants A and p, and g(u)
for any single-valued function g.

*13, (a) What are the level lines and critical points of sin (« + 4)?
(b) Show that for
u = sin (x + 4), (xo.¥o) = (0,0), andé = «= 4 F(x,¢) in the Implicit
Function Theorem need not exist if 7 < }while it may not be unique if n > 4.

7 LINEAR FRACTIONAL EQUATIONS

An important first-order DE is the linear fractional equation

dy _ ox +dy
(16) ad
# be
dx ax + by’
18 CHAPTER 1 First-Order Differential Equations

which is the normal form of

(16’) (ax + by)y’ — (cx + dy) = 0

It is understood that the coefficients a, b, c, d are constants.


The integration of the DE (16) can be reduced to a quadrature by the sub-
stitution y = vx. This substitution replaces (16) by the DE

e+ dv
xv’ +o =
a+ bv

in which the variables x and v can be separated. Transposing v, we are led to


the separation of variables

(a + bv) dv dx
=0
bv? + (a — dv —c x

Since the integrands are rational functions, this can be integrated in terms of
elementary functions. Thus, x can be expressed as a function of v = y/x: we
have x = kG(y/x), where

Gv) = exp |- S| bu? + a+ bv


(a — du —c Ja

More generally, any DE of the form y’ = F(y/x) can be treated similarly. Set-
ting v = y/x and differentiating y = xv, we get xv’ + v = F(v). This is clearly
equivalent to the separable DE

dv dx
d (In x)
_ =
=

Fv) —v ~

whence x = K exp {fdv/[F(v) — v}}.


Alternatively, we can introduce polar coordinates, setting x r cos @ and
=
=

y = rsin 6. If y = y — @ is the angle between the tangent direction and


the radial direction 6, then

cot ycot@ +1
-—=
=

coty =
d0 cot 6 — coty

Since tan y = y’ = F(y/x) = F(tan 8), we have

1dr 1 + tan
y tan 0 _ 1 + (tan 6)F(tan6)
(17) = Q0)
rd tan y —tan 6 F(tan 6) — tan
0
7 Linear Fractional Equations 19

This can evidently be integrated by a quadrature:

(17) (8) = r(0)expf Q(@)db (17/

The function on the right is well-defined, by the Fundamental Theorem of the


Calculus, as long as tan y # tan 8, that is, as long as y’ # y/x.

Invariant Radii. The radii along which the denominator of Q(@) vanishes
are those where (16) is equivalent to d0/dr = 0. Hence, these radii are particular
solution curves of (16); they are called invariant radii. They are the solutions
y = 7x, for constant 7 tan 6. Therefore, they are the radii y = rx for which
=
=

y = 7 = (¢ + dr)/(a + br), by (16), and so their slopes 7 are the roots of the
quadratic equation

(18) br? + (a—


dr =c

Ifb # 0, Eq. (18) has zero, one, or two real roots according as its discriminant
is negative, zero, or positive. This discriminant is

(18) A = (a—d)* + Abc = (a + d)* — A(ad —be)

In the sectors between adjacent invariant radii, d?/dr has constant sign; this fact
facilitates the sketching of solution curves. Together with the invariant radii, the
solution curves (17’) form a regular curve family in the punctured plane, consist-
ing of the xy-plane with the origin deleted.

Similarity Property. Each solution of the linear fractional DE (16) is trans-


formed into another solution when x and y are both multiplied by the same
nonzero constant k. The reason is, that both y’ = dy/dx and y/x are unchanged
by the transformation (x,y) — (kx,ky). In polar coordinates, if r = f(6) is a solu-
tion of (17), then so is r kf@). Since the transformation (x,y) > (kx,ky) is a
=
=

similarity transformation of the xy-plane for any fixed k, it follows that the solu-
tion curves in the sector between any two adjacent invariant radii are all geo-
metrically similar (and similarly placed). This fact is apparent in the drawings of
Figure 1.6.
Note also that the hyperbolas in Figure 1.6a@ are the orthogonal trajectories
of those of Figure 1.5. This is because they are integral curves of yy’ x and
=
=

xy’ —y, respectively, and x/y is the negative reciprocal of —y/x.


=
=

EXERCISES D

1. Sketch the integral curves of the DEs in Exs. C8 and C9 in the neighborhood of the
origin of coordinates.

2. Express in closed form all solutions of the following DEs:


(a) y= (x? — yy(x? + yd (b) ¥ = sin (y/x)
CHAPTER 1 First-Order Differential Equations

(a) xy’ +y =0 (b) xy’ = 2y (c) y (8x + y)/(x ~ 3y)


Figure 1.6 Integral curves.

(a) Show that the inhomogeneous linear fractional DE

(cx + dy + e) dx — (ax + by + f) dy = 0 ad
# be

can be reduced to the form (16) by a translation of coordinates


(b) Using this idea, integrate (x + y + 1) dx = (2x — y — 1) dy
(c) For what sets of constants a, b, c, d, e, f, is the displayed DE exact?

Find all integral curves of (x" + y")y — x = 0. [Hint: Set


u = y/x.]

Prove in detail that the solutions of any homogeneous DE y’ g(y/x) have the
Similarity Property described in §7
Show that the solution curves of y’ G(x,y) cut those of y’ F(x,y) at a constant
angle @ if and only if G (1 + F)/(1 — tF), where
r = tan
B

Let A, B, C be constants, and K a parameter. Show that the coaxial conics


Ax? + 2Bxy + Cy® = K, satisfy the DE y —(Ax + By) /(Bx + Cy)

(a) Show that the differential (ax + by) dy — (cx + dy) dx is exact if and only if a +
d = 0, and that in this case the integral curves form a family of coaxial conics
(b) Using Exs. 6 and 7, show that if tan 6 (a + d)/(c — 5), the curves cutting the
solution curves of the linear fractional DE y (cx + dy)/(ax + by) at an angle
B form a family of coaxial conics

For the linear fractional DE (16) show that

” = (ad — be)[cx® — (a — d)xy — by*|/(ax + by)®

Discuss the domains of convexity and concavity of solutions

10 Find an integrating factor for y’ + (2y/x) = a, and integrate the DE by quadratures

GRAPHICAL AND NUMERICAL INTEGRATION

The simplest way to sketch approximate solution curves of a given first-order


normal DE y F(x,y) proceeds as follows. Draw a short segment with slope A.
= F(x,,y;) = tan 6; through each point (x;,y,) of a set of sample points sprinkled
fairly densely over the domain of interest. Then draw smooth curves so as to
have at every point a slope y’ approximately equal to the average of the F(x,,,)
8 Graphical and Numerical Integration 21

at nearby points, weighting the nearest points most heavily (i.e., using graphical
interpolation). Methods of doing this systematically are called schemes of graph-
ical integration.
The preceding construction also gives a graphical representation of the direc-
tion field associated with a given normal first-order DE. This is defined as
follows.

DEFINITION. A direction field in a region D of the plane is a function that


assigns to every point (x,y) in D a direction. Two directions are considered the
same if they differ by an integral multiple of 180°, or w radians.

With every quasinormal DE M(x,y) + N(x,y)y’ = 0, there is associated a direc-


tion field. This associates with each point (x,,y,) not a critical point where M =
N = 0, a short segment parallel to the vector (N(x,,9,), —M(x,,9,)). Such seg-
ments can be vertical whereas this is impossible for normal DEs.
It is very easy to integrate graphically the linear fractional equation (16)
because solution curves have the same slope along each radius y = vx, v =
constant: each radius y = kx is an isocline. We need only draw segments having
the right direction fairly densely on radii spaced at intervals of, say, 30°. After
tracing one approximate integral curve through the direction field by the graph-
ical method described above, we can construct others by taking advantage of the
Similarity Property stated in §7.

Numerical Integration. With modern computers, it is easy to construct


accurate numerical tables of the solutions of initial value problems, where they
exist, for most reasonably well-behaved functions F(x,y). Solutions may exist
only locally. Thus, to solve the initial value problem for y’ = 1 + y? for the initial
value (0) = 0 on [0,1.6] is impossible, since the solution tan x becomes infinite
when y = 7/2 = 1.57086. ... We will now describe three very simple methods
(or “algorithms’’) for computing such tables; the numerical solution of ordinary
DEs will be taken up systematically in Chapters 7 and 8.
Simplest is the so-called Euler method, whose convergence to the exact solu-
tion (for F € @') was first proved by Cauchy around 1840 (see Chapter 7, §2).
One starts with the given initial value, y(@) = yo c, setting Xp aand Y, =
= —
= =

yo, and then for a suitable step-size h computes recursively

(19) Xn = X, + h, Y,n +1 = Y,, + hF(X,,Y,,)

A reasonably accurate table can usually be obtained in this way, by letting h =


.001 (say), and printing out every tenth value of Y,,.
If greater accuracy is desired, one can reduce h to .0001, printing out
Yo,Yio0>YooosY3005 ., and “formatting” the results so that values are easy to look
up.

Improved Euler Method. The preceding algorithm, however, is very waste-


ful, as Euler realized. As he observed, one can obtain much more accurate
22 CHAPTER 1 First-Order Differential Equations

results with roughly the same computational effort by replacing (19) with the
following “improved” Euler algorithm

(20) Zn+1 = Y, + hF(X,,,Y,,)


Yi+1 Y, + 9 [F(X,,¥,) + FXn+1Zn+v]

With h = .001, this “improved” Euler method gives 5-digit accuracy in most
cases, while requiring only about twice as much arithmetic per time step.
Whereas with Euler’s method, to use 10 times as many mesh points ordinarily
gives only one more digit of accuracy, the same mesh refinement typically gives
two more digits of accuracy with the improved Euler method.
As will be explained in Chapter 8, when truly accurate results are wanted, it
is better to use other, more sophisticated methods that give four additional digits
of accuracy each time h is divided by 10. In the special case of quadrature—that
is, of DEs of the form y’ = g(x) (see §2)—to do this is simple. It suffices to
replace (19) by Simpson’s Rule.

Xn h
(21) Ynt+1= Yn + 7 [g(x,) + se + g(x, + h))
2

For example, one can compute the natural logarithm of 2,

(2) =In2 = f dx/x = 69314718. ..


with 8-digit accuracy by choosing x = 25 and using the formula

25
1 50 50 50
In 2 = —
150 & | 48 + 2k
+

49 + 2k
+

50 + 2k |
Caution. To achieve 8-digit accuracy in summing 25 terms, one must use a
computer arithmetic having at least 9-digit accuracy. Many computers have only
7-digit accuracy!

Taylor Series Method. A third scheme of numerical integration is obtained


by truncating the Taylor series formula after the term in y%, and writing

Y,n 41 = V(x, +h) = Y, + AY, + WPY%/2 + O(n)

a
For the DE y’ = y, since y;, = 9a ~ Yn» this method gives Y,,, = (1 +h + h?/
2)¥,,, and so it is equivalent to the improved Euler method.
For the DE y’ = 1 + 9°, since y” = 2Qyy’ = 2y(1 + 9°), the method gives

Y,n a = Y, tA + ¥%2) + wy,


+ ¥5)
8 Graphical and Numerical Integration 2

This differs from the result given by Euler’s improved method. In general, since
d[F(x,y)|/dx = OF/dx + (OF/dy) dy/dx, Y% =
=
(F, + FF,),. This makes the
method easy to apply.
The error per step, like that of the improved Euler method, is roughly pro-
portional to the cube of h. Since the number of steps is proportional to h', the
cumulative error of both methods is roughly proportional to h”. Thus, one can
obtain two more digits of accuracy with it by using 10 times as many mesh
points.
As will be explained in Chapter 8, when truly accurate results are wanted,
one should use other, more sophisticated methods that give four additional digits
of accuracy when 10 times as many mesh points are used.
e

Constructing Function Tables. Many functions are most simply defined as


solutions of initial value problems. Thus ¢* is the solution of y’ = y that satisfies
the initial condition e® = 1, and tanx is the solution of y’ = 1 + y? that satisfies
tan 0 = 0. Reciprocally, In x is the solution of y’ = 1/y that satisfies In 0 = 1,
while arctan x is the solution of y’ = 1/(1 + 5°) that satisfies arctan 0 = 0.
It is instructive and enjoyable (using modern computers) to try to construct
tables of numerical values of such functions, using the methods described in this
section, and other methods to be discussed in Chapters 7 and 8. The accuracy
of the computer output, for different methods and choices of the mesh length
h, can be determined by comparison with standard tables.t One can often use
simple recursion formulas instead, like

eth Ax
tan
x + tanh
e = é¢@ and tan (x + h) =
1 — tanx tank’

after evaluating e* = 1.01005167, and also by its Taylor series tan x x +


=
=

23/3 + 2x°/15 +- - tan (.01) = 0.0100003335. . .. Such comparisons will


often reveal the limited accuracy of machine computations (perhaps six digits).

EXERCISES E

1. For each of the following initial value problems, make a table of the approximate
numerical solution computed by the Euler method, over the interval and for the
mesh lengths specified:
(a) y = y with 9(0) = 1, on [0,1], for h = 0.1 and 0.02.
(b) y’ = 1 + 9’ with 9(0) = 0, on [0,1.6], for A = 0.1, 0.05, and 0.02.
Knowing that the exact solutions of the preceding initial value problems are e” and
tan x:

(a) Evaluate the errors E, = Y, — y(X,) for the examples of Exercise 1.


(b) Tabulate the ratios E,/hx, verifying when it is true that they are roughly indepen-
dent of h and x.

+ See for example Abramowitz and Stegun, which contains also a wealth of relevant material.
24 CHAPTER 1 First-Order Differential Equations

Compute approximate solutions of the initial value problems of Exercise 1 by the


improved Euler method.

Find the errors of the approximate values computed in Exercise 3, and analyze the
ratios Y,/h*x (cf. Ex. 2).
Use Simpson’s Rule to compute a table of approximate values of the natural loga-

rithm function In x
=-
=
f dt/t, on the interval [1,2].

Constructa tableof the functionarctanx = f aij + 2) on the interval [0,1] by


Simpson’s Rule, and compare the computed value of arctan 1 with 1/4.

“7 In selected cases, test how well your tables agree with the identities arctan (tan x) =
x and In () =x.
*8 Let ¢, be the approximate value of e obtained using Euler’s method to solve y’ = y
for the initial condition y(0) = 1 on [0,1], on a uniform mesh with mesh length 4 =
L/n.
(a) Show that Ine, = nIn (1 + A).
(b) Infer that ne, = 1 —h/2+ h?/3—---
(c) From this, derive the formula

*) —In (¢,/e) = h/2—W/3+---

(d) From formula (*) show that, as h | 0, ¢ — e, = (he/2)[1 — (4/6) + O(h’)].

9 THE INITIAL VALUE PROBLEM

For any normal first-order differential equation y’ = F(x,y) and any “‘initial”
Xo (think of x as time), the initial value problem consists in finding the solution
or solutions of the DE, for x = x», which also satisfy f(x9) = c. In geometric
language, this amounts to finding the solution curve or curves that issue from
the point (xo,c) to the right in the (x,y)-plane. As we have just seen, most initial
value problems are easy to solve on modern computers, if one is satisfied with
approximate solutions accurate to (say) 3—5 decimal digits.
However, there is also a basic theoretical problem of proving the uniqueness of
this solution.
When F(x,y) = g(x) depends on x alone, this theoretical problem is solved by
the Fundamental Theorem of the Calculus (§2). Given xp =>
=
aand yy = c, the
initial value problem for the DE y’ = g(x) has one and only one solution, given
by the definite integral (5).
The initial value problem is said to be well-posed in a domain D when there is
one and only one solution y = f(x,c) in D of the given DE for each given (X9,c)
€ D, and when this solution varies continuously with c. To show that the initial
value problem is well-posed, therefore, requires proving theorems of existence
(there is a solution), uniqueness (there is only one solution), and continuity (the
solution depends continuously on the initial value). The concept of a well-posed
initial value problem gives a precise mathematical interpretation of the physical
9 The Initial Value Problem 25

concept of determinism (cf. Ch. 6, §5). As was pointed out by Hadamard, solu-
tions which do not have the properties specified are useless physically because
no physical measurement is exact.
It is fairly easy to show that the initial value problems discussed so far are
well-posed. Thus, using formula (8’), one can show that the initial value problem
is well-posed for the linear DE y’ + p(x)y = q(x) in any vertical strip a < x <b
where p and q are continuous. The initial value problem is also well-posed for
the linear fractional DE (16) in each of the half-planes ax + by > 0 and
ax + by
< 0.
Actually, for the initial value problem for y’ = F(x,y) to be well-posed in a
domain D, it is sufficient that F € @! in D. But it is not sufficient that F € @:
though the continuity of F implies the existence of at least one solution through
every point (cf. Ch. 6,813), it does not necessarily imply uniqueness, as the fol-
lowing example shows.

Example 7, Consider the curve family y = (x — C)’, sketched in Figure 1.7.


For fixed C, we have

y= dy_ 3(x — c)* = 3y°8


(22)
0

a DE whose right side is a continuous function of position (x,y). Through every


point (x9,¢) of the plane passes just one curve y = (x — C)? of the family, for
1/3
which C = x) — ¢ depends continuously on (x9,c). Hence, the initial value
problem for the DE (22) always has one and only one solution of the form
y= C)°. But there are also other solutions.
Thus, the function y = 0 also satisfies (22). Its graph is the envelope of the
curves y = (x — ©)*. In addition, for any a < 8, the function defined by the
three equations

(x — a)’, x<a

(22) J,=
ax=x=8
— x > Bp

Figure 1.7 Solution curves of y’ = 3y2/3


26 CHAPTER 1 First-Order Differential Equations

is a solution of (22). Hence, the first-order DE y’ = 3y”/* has a two-parameter


family of solutions, depending on the parameters a and 8.

*10 UNIQUENESS AND CONTINUITY

The rest of this chapter will discuss existence, uniqueness, and continuity
theorems for initial value problems concerning normal first-order DEs
y’ = F(x,y). Readers who are primarily interested in applications are advised to
skip to Chapter 2.
Example 9 shows that the mere continuity of F(x,y) does not suffice to ensure
the uniqueness of solutions y = f(x) of y’ = F(x,y) with given f(a) = c. However,
it is sufficient that F € @'.(D). We shall prove this and continuity at the same
time, using for much of the proof the following generalization of the standard
Lipschitz condition.

DEFINITION. A function F(x,y) satisfies a one-sided Lipschitz condition in a


domain D when, for some finite constant L

(23) FN implies F(x,¥9) — F(x,¥)) = Liye ~~ y)

identically in D. 1t satisfies a Lipschitz condition} in D when, for some nonnegative


constant L (Lipschitz constant), it satisfies the inequality

(23’) | F(x,y) ~ F(x,z)| = Lly ~ z|

for all point pairs (x,y) and (x,z) in D having the same x-coordinate.

The same function F may satisfy Lipschitz conditions with different Lipschitz
constants, or no Lipschitz conditions at all, as the domain D under consideration
varies. For example, the function F(x,y) = 3y2/3 of the DE in Example9satisfies
a Lipschitz condition in any half-plane y = ¢, ¢ > 0, with L = 2e~'/?, but no
Lipschitz condition in the half-plane y > 0. More generally, one can prove the
following.

LEMMA 1. Let F be continuously differentiable in a bounded closed convext


domain D. Then it satisfies a Lipschitz condition there, with L = supp|OF/dy|.

* In this book, starred sections may be omitted without loss of continuity.

+R. Lipschitz, Bull. Sci. Math. 10 (1876), p. 149; the idea of the proof is due to Cauchy (1839). See
Ince, p. 76, for a historical discussion.

t Aset of points is called convex when it contains, with any two points, the line segment joining them.
10 Uniqueness and Continuity 27

Proof. The domain being convex, it contains the entire vertical segment join-
ing (x,y) with (x,z). Applying the Law of the Mean to F(x,y) on this segment,
considered as a function of 7, we have

1 |e
| F(x,y) — F(x,z)| = ly

for some n between y and z. The inequality (23), with L = supp|0F/dy|, follows.
A similar argument shows that (23) holds with L = max p 0F/dy.

The case F(x,y) = g(x) of ordinary integration, or “quadrature, is easily
identified as the case when L =0 in (23’). A Lipschitz condition is satisfied even
if g(x) is discontinuous.

LEMMA 2. Let o be a differentiable function satisfying the differential inequality

(24) o’(x) = Ko(x), aszx=sb

where K is a constant. Then

Cd) o(x) = o(a)eX*-9, for axzx=sdb

Proof. Multiply both sides of (24) by e~** and transpose, getting

0 = e**[o’(x) — Ko(x)] = < {o(x)e—**}


—Kx
The function o(x)e thus has a negative or zero derivative and so is nonincreas-
ing for a < x = b. Therefore, a(x)e** < o(a)e~*, q.e.d.

LEMMA 3. The one-sided Lipschitz condition (23) implies that

fe(x) — fg’) — fe] S Ligh) — fe)?

for any two solutions f(x) and g(x)of y’ = F(x,y).

Proof. Setting f(x) = 91, g(*) = yg, we have

fe(x) — fe’) — f£’()) = (92 — LF O92) — Fx,90)1

from the DE. If yy > y), then, by (23), the right side of this equation has the
upper bound L(y — ,)®. Since all expressions are unaltered when y, and yo are
interchanged, we see that the inequality of Lemma3is true in any case.
We now prove that solutions of y’ = F(x,y) depend continuously (and hence
uniquely) on their initial values, provided that a one-sided Lipschitz condition
holds.
28 CHAPTER 1 First-Order Differential Equations

THEOREM 5. Let f(x)and g(x) be any two solutions of the first-order normal DE
y = F(x,y)in a domain D where F satisfies the one-sided Lipschitz condition (23). Then

(25) [flx) — g(x)| =e? |f@ — g(a)| if x>a

Proof. Consider the function

a(x) = [g(x) — fx]?

Computing the derivative by elementary formulas, we have

o (x) = 2[g(x) — flx)] - Le’) — £’@))

By Lemma 3, this implies that o’(x) =< 2Lo(x); and by Lemma 2, this implies
o(x) < e*+*-%9(a). Taking the square root of both sides of this inequality (which
are nonnegative), we get (25), completing the proof.

As the special case f(a) = g(a) of Theorem 5, we get uniqueness for the initial
value problem: in any domain whereFsatisfies the one-sided Lipschitz condition
(23), at most one solution of y’ = F(x,y) for x = a, satisfies f(a) = c. However,
we do not get uniqueness or continuity for decreasing x. We now prove that we
have uniqueness and continuity in both directions when the Lipschitz condition
(23’) holds.

THEOREM 6. [/f (23’) holds in Theorem 5, then

(26) If) — g()| = &*"|fl@ — g(a)|

In particular, the DE y’ = F(x,9) has at most one solution curve passing through any
point (a,c) €D.

Proof. Since (23’) implies (23), we know that the inequality (23) holds; from
Theorem 5, this gives (26) for x = a. Since (23’) also implies (23) when x goes
to —x, we also have by Theorem 5

Ife) — g@)| =X? |f@ — g(a)| = "f@ — g(a)|

giving (26) also for x < a, and completing the proof.

EXERCISES F

1. In which domains do the following functions satisfy a Lipschitz condition?


(a) F(x,y) = 1 + x (b) Fx) = 1 +5?
(c) F(x,y) = 9/1 + 2°) (d) F(x,y) = x/(1 + 9%)
2. Find all solutions of y’ = [xy].
3. Show that the DE xu’ — 2u + x 0 has a two-parameter family of solutions.
=
=

[HinT: Join together solutions satisfying u(0) = 0 in each half-plane separately.]


11 A Comparison Theorem 29

Let fand g be solutions of y’ = F(x,y), where F is a continuous function. Show that


the functions m and M, defined as m(x) = min (f(x), g(x)) and M(x) = max (f(x),
g(x)), satisfy the same DE. [Hint: Discuss separately the cases f(x) = g(x), f(x) <
g(x), and f(x) > g(x).]

Let a(t), positive and of class @! for a = t = a + ¢, satisfy the differential inequality
o'(t) = Ko(t) log o(#). Show that o(t) = o(@) exp [K(é — a)].

Let F(x,y) = y log (1/y) for 0 < y < 1, F(y) = 0 fory = 0. Show that y’ = F(x,y)
has at most one solution satisfying f(0) = c, even though F does not satisfy a
Lipschitz condition.

(Peano uniqueness theorem). For each fixed x, let F(x,y) be a nonincreasing function
of y. Show that, if f(x) and g(x) are two solutions of y’ = F(x,y), and 6 > a, then
if® — g®| = lf@ — g(@|. Infer a uniqueness theorem.

Discuss uniqueness and nonuniqueness for solutions of the DE y’ = —y!”. [Hrnt:


Use Ex. 7.]

(a) Prove a uniqueness theorem for y’ = xy on — < x,y < +00,


*(b) Prove the same result for y’ = 9° + 1.
10 (Generalized Lipschitz condition.) Let F € @ satisfy

| F(x,y) — F(x,z)| Sk)


ly — zl

identically on the strip 0 < x < a. Show that, if the improper integral [§ k(x) dx is
finite, then y’ = F(x,y) has at most one solution satisfying (0) = 0.

*1] Let F be continuous and satisfy

| F(x,y) — F(x,z)| S Kly — z| log (ly — 217, - for ly-—z[<1

Show that the solutions of y’ = F(x,y) are unique.

*11 A COMPARISON THEOREM

Since most DEs cannot be solved in terms of elementary functions, it is


important to be able to compare the unknown solutions of one DE with the
known solutions of another. It is also often useful to compare functions satis-
fying the differential inequality

(27) f'@) = F(x,f®))

with exact solutions of the DE (3). The following theorem gives such a
comparison.

THEOREM 7. Let F satisfy a Lipschitz condition for x = a. If the function f sat-


isfies the differential inequality (27) for x = a, and if g is a solution of y’ = F(x,y)
satisfying the initial condition g (a) = f(a), then f(x) = g(x) for all x = a.

Proof. Suppose that f(x) > g(x,) for some x, in the given interval, and define
Xo to be the largest x in the interval a < x = x, such that f(x) < g(x). Then
30 CHAPTER 1 First-Order Differential Equations

F(%o) = g(xo). Letting o(x) = f(x) — g(x), we have o(x) = 0 for x9 S x S x); and,
also for x9 = x = x,

o (x) = f'(x) — g(x) S F(x,f(x)) — Flx,g(x)) S L(f(x) — g(x) = Lo(x)

where L is the Lipschitz constant for the function F. That is, the function o
satisfies the hypothesis of Lemma 2 of §10 on xy S x S x, with K = L. Hence
a(x) < o(xp)e“*-* = 0 and so o, being nonnegative, vanishes identically. But
this contradicts the hypothesis f(x,) > g(x,). We conclude that f(x) =< g(x) for
all x in the given interval, q.e.d.

THEOREM 8 (Comparison Theorem). Let fand g be solutions of the DEs

(28) y’ = Fix,y), z’ = G(x,z)

respectively, where F(x,y) = G(x,y) in the strip a S x SS b andF orG satisfies a


Lipschitz condition. Let also f(a) = g(a). Then f(x) = g(x) for all x € [a,6).

Proof. Let G satisfy a Lipschitz condition. Since y’ = F(x,y) = G(x,y), the


functions fand g satisfy the conditions of Theorem 7 with G in place of F. There-
fore, the inequality f(x) = g(x) for x = a follows immediately.

lf F satisfies a Lipschitz condition, the functions u = —/f(x) and v = —g(x)


satisfy the DEs u’ = —F(x, —u) and

v’ = —G(x, —v) = —F(x, —v)

Theorem 6, applied to the functions v, u and A(u,v) = —F(x, —v) now yields
the inequality u(x) = u(x) for x = a, or g(x) = f(x), as asserted.
The inequality f(x) < g(x) in this Comparison Theorem can often be replaced
bya strict inequality. Either fand g are identically equal for a = x = x), or else
f(%o) < g(%o) for some xp in the interval (a, x,). By the Comparison Theorem,
the function o,(x) = g(x) — f(x) is nonnegative for a = x < x), and moreover
61(x9) > 0. Much as in the preceding proof

oi(x) = G(x,g(x)) — F(x,f(x)) = G(x,g(x)) — Gx,f(x) = —Lo,

Hence [e'*o,(x)]’ =
=
e*[o{ + Lo,] = 0; from this expression ¢’c,(x) is a non-
decreasing function on a = x = x,. Consequently, we have

o,(x) = 0,(xp)e 1%*0) >0

which gives a strict inequality. This proves

COROLLARY 1. In Theorem 6, for any x, > a, either f(x,) < g(x), or f(x) =
g(x) for all x € [a,x;].

Theorem 7 can also be sharpened in another way, as follows.


12 Regular and Normal Curve Families 31

COROLLARY 2. In Theorem 7, assume that F, as well as G, satisfies a Lipschitz


condition and, instead off(a) = g(a), that f(a) < g(a). Then f(x) < g(x) forx > a.

Proof. The proof will be by contradiction. 1f we had f(x) = g(x) for some
x > a, there would be a first x = x, > a where f(x) = g(x). The two functions
y = d(x) = f(—x) and z = (x) = g(—x) satisfy the DEs y’ = —F(—x,y) and
z’ = —G(-—x,z) as well as the respective initial conditions ¢(—x,) = ¥(—+x)).
Since —F(—x,y) = —G(—x,y), we can apply Theorem 7 in the interval
[—x,, —a], knowing that the function — F(—x,y) satisfies a Lipschitz condition.
We conclude that ¢(—a) = ¥(—a), that is, that f(a) = g(a), a contradiction.

*12 REGULAR AND NORMAL CURVE FAMILIES

In this chapter, we have analyzed many methods for solving first-order DEs
of the related forms y’ = F(x,y),M(x,y) + N(x,y)y’ = 0, and M(x,y) dx + N(x,y)
dy = 0, describing conditions under which their “solution curves” and/or “‘inte-
gral curves’’ constitute “one-parameter families’ filling up appropriate domains
of the (x,y)-plane. In this concluding section, we will try to clarify further the
relationship between such first-order DEs and one-parameter curve families.
A key role is played by the Implicit Function Theorem, which shows} that the
level curves u = C of any function u € @'(D) have the following properties in
any domain D not containing any critical point: (i) one and only one curve of
the family passes through each point of D, (ii) each curve of the family has a
tangent at every point, and (iii) the tangent direction is a continuous function
of position. Thus, they constitute a regular curve family in the sense of the fol-
lowing definition.

DEFINITION. A regular curve family is a curve family that satisfies condi-


tions (i) through (iii).
Thus, the circles x? + y? = C (C > 0) forma regular curve family; they are
the integral curves of x + yy’ = 0, the DE of Example 1. Concerning the DE
y = y° — y of Example 2, even though it is harder to integrate, we can say
more: its solution curves form a normal curve family in the following sense.

DEFINITION. A regular curve family is normal when no curve of the family


has a vertical tangent anywhere.
Almost by definition, the curves of any normal curve family are solution
curves of the normal DE y’ = F(x,y), where F(x,y) is the slope at (x,y) of the
curve passing through it. Moreover, by Theorem 5’, if F € @', there are no other
solution curves.
The question naturally arises: do the solution curves of y’ = F(x,y) always
form a normal curve family in any domain where F € @'? They always do locally,
but the precise formulation and proof of a theorem to this effect are very dif-

+ Where du/dy = 0 but du/dx # 0, we can set x = g(y) locally on the curve; see below.
32 CHAPTER 1 First-Order Differential Equations

ficult, and will be deferred to Chapter 6. There we will establish the simpler
result that the initial value problem is locally well-posed for such DEs, after treat-
ing (in Chapter 4) the case that F is analytic (i.e., the sum of a convergent power
series).
In the remaining paragraphs of this chapter, we will simply try to clarify fur-
ther what the Implicit Function theorem does and does not assert about “‘level
curves,”

Parametrizing Curve Families. Although the name “level curve” suggests


that for each C the set of points where F(x,y) = C is always a single curve, this
is not so. Thus, consider the level curves of the function F(x,y) = (x? + 9%)? —
2x + Qy*. The level curve F = 0 is the lemniscate 7? = 2 cos 26, and is divided
by the critical point at the origin into two pieces. Inside each lobe of this lem-
niscate is one piece of the level curve F = C for —1 < C < 0, while the “level
curve” F = —1 consists of the other two critical points (+1, 0).
Similarly, in the infinite horizontal strip -1 < y < 1, every solution curve y
=
=
sin x + C of the DE y’ = cos x consists of an infinite number of pieces. The
same is true of the interval curves of the DE cos x dx = sin x dy, which are the
level curves of e~ sin x. (These can also be viewed as the graphs of the functions
y = y = In |sin x| + Cand the vertical lines y = +z.) In general, one cannot
parametrize the level curves of F(x,y) globally by the parameter C.
However, one can parametrize the level curves of any function u € @' locally,
in some neighborhood of any point (x9,¥9) where du/dy # 0. For, by the Implicit
Function Theorem, there exist positive « and y such that for all x € (x9 — €, x9
+ 6 and c € (uy — €, ug + ©), there is exactly one y € (yg — 1, ¥o + ) such that
u(x,y) = c. This defines a function y(x,c) locally, in a rectangle of the (x,u)-plane.
The parameter ¢ parametrizes the level curves of u(x,y) in the corresponding
neighborhood of (xo,yo) in the (x,y)-plane; cf. Figure 1.8.

y=Yotn

x &
u>
= WO

ae
2 UO
u>

¥Y=Yo7-7

xX = Xg—€ X=Xote

Figure 1.8
12 Regular and Normal Curve Families 33

EXERCISES G

1 Let f(u) be continuous and a + df(u) # 0 for p S u S q. Show that the DE


y’ = flax + by + 6) (a,b,c are constants) has a solution passing through every
point of the strip p < ax + by te <q.

Find all solutions of the DE y’ = |x°y°].


Show that, if M and N are homogeneous functions of the same degree, then (1’)
has the integrating factor (xM + yN)~! in any simply connected domain where
xM + yN does not vanish.

Show that if g(y) satisfies a Lipschitz condition, the solutions of y’ = g(y) form a
normal curve family in the (x,y)-plane. [HINT: Apply the Inverse Function Theorem
tox = f dy/g(y) + C]
Let g(x) be continuous for 0 = x < ©, lim,.... g(x) = 5 and a > 0. Show that, for
every solution y = f(x) of y’ + ay = g(x), we have lim, f(x) = b/a.
Show that if a < 0 in Ex. 5, then there exists one and only one solution of the DE
such that lim,. f(x) = b/a.

*7 (Osgood’s Uniqueness Theorem.) Suppose that $(u) is a continuous increasing function


defined and positive for u > 0, such that [2 du/d(u) -> 90 as e > 0. If | F(x,y) —
F(x,z)| < (Cy — z]), then the solutions of the DE (3) are unique. [Hinr: Use Ex.
E4.]

Let F, G, f, g be as in Theorem 8, and F(x,y) < G(x,y). Show that f(x) < g(x) for x
> a, without assuming that F orG satisfies a Lipschitz condition.

Show that the conditions dx/dt = |x| 12 and x(0) = —1 define a well-posed initial
value problem on [0,a) if a S 1, but not ifa > 1.

10 (a) Find the critical points of the DE x dy = y dx.


(b) Show that in the punctured plane (the x,y-plane with the origin deleted),
the integral curves of xy’ = y are the lines 0 = c, where @ is a periodic angular
variable only determined up to integral multiples of 27.
(c) What are its solution curves?
(d) Show that the veal variables x/r
=
=
cos 6 and y/r _
=
sin @ are integrals of
xy’ = y, and describe carefully their level curves.

*11 (a) Prove that there is no real-valued function u € @! in the punctured plane of
Ex. 10 whose level curves are the integral curves of xy’ = y.
(b) Show that the integral curves of y’ = (x + y)/(x — 9) are the equiangular spirals
r= he =e
2 #0,
(b) Prove that there is no real-valued function u € @! whose level curves are these
spirals.
CHAPTER 2

SECOND-ORDER
LINEAR
EQUATIONS

1 BASES OF SOLUTIONS

The most intensively studied class of ordinary differential equations is that of


second-order linear DEs of the form

du
(1) po(x)

dx”
+ pix)
“% + polx)u = ps(x)
The coefficient-functions p,(x) [i = 0, 1, 2, 3] are assumed continuous and real-
valued on an interval J of the real axis, which may be finite or infinite. The inter-
val J may include one or both of its endpoints, or neither of them. The central
problem is to find and describe the unknown functions u = f(x) on J satisfying
this equation, the solutions of the DE. The present chapter will be devoted to
second-order linear DEs and the behavior of their solutions.
Dividing (1) through by the leading coefficient po(x), one obtains the normal
form

du
(1’) dx? + p(x) = + q(x)u = r(x)

p=
pi q=
Ps r=
Ps
Po Po Po

This DE is equivalent to (1) so long as po(x) # 0; if po(xo) = 0 at some point x


= Xo, then the functions p and gq are not defined at the point x). One therefore
says that the DE (1) has a singular point, or singularity at the point x9, when f(x)
0

=

For example, the Legendre DE

(*) 2la-s%]+m=0
has singular points at x = +1. This is evident since when rewritten in the form
(1) it becomes (1 — x”)u” = Qxu’ + du —
=
0. Although it has polynomial solu-
34
1 Bases of Solutions 35

tions when A = n(n + 1), as we shall see in Ch. 4, §1, all its other nontrivial
solutions have a singularity at either x =
=

lorx
=
=
—1.
Likewise, the Bessel DE

(**) xu” + xu’ + (x? — nu = 0

has a singular point at x = 0, and nowhere else. More commonly written in the
normal form

wt ta + 1-3 Jano
its important Bessel function solution Jo(x) will be discussed in Ch. 4, §8.
Linear DEs of the form (1) or (1’) are called homogeneous when their right-
hand sides are zero, so that ps(x) = 0 in (1)—or, equivalently, r(x) = 0 in (14.
The homogeneous linear DE

(2) po(x)u” + py(x)u’ + polx)u


=
=
0

obtained by dropping the forcing term p,(x) from a given inhomogeneous linear
DE (1) is called the reduced equation of (1). Evidently, the normal form of the
reduced equation (2) of (1) is the reduced equation

du
(2’) dx? + p(x)
as
% + q(x)u = 0

of the normal form (1’) of (1).


A fundamental property of linear homogeneous DEs is the following Super-
position Principle. Given any two solutions /\(x) and fo(x) of the linear homoge-
neous DE (2), and any two constants ¢, and ¢o, the function

(3) Six) = exfi(x) + cefolx)

is also a solution of (2). Thisproperty is characteristic of homogeneous linear


equations; the function fis called a linear combination of the functions f, and fy.

Bases of Solutions. It is a fundamental theorem, to be proved in §5, that if


f(x) and f,(x) are two solutions of (2), and if neither is a multiple of the other,
then every solution of (2) can be expressed in the form (3). A pair of functions
with this property is called a basis of solutions.

Example 1. The trigonometric DE is u” + ku = 0; its solutions include


cos kx and sin kx. Hence, all linear combinations a cos kx + 6 sin kx of these
basic solutions are likewise solutions.
Evidently, the zero function u(x) =0 is a trivial solution of any homogeneous
36 CHAPTER 2 Second-Order Linear Equations

linear DE. Letting A = V a: + B, and expressing (a,b) = (A cos y, A sin y) in


polar coordinates, we can also write

acos kx + bsin kx = A cos(kx — +)

for any nontrivial solution of u” + k°u = 0. The constantA in (2) is called the
amplitude of the solution; ¥ its initial phase, and k its wave number; k/2z is called
its frequency, and 22/k its period.

Constant-coefficient DEs. We next show how to construct a basis of solu-


tions of any second-order constant-coefficient homogeneous linear DE

u” + pul + qu =0, (p,q constants) (5)

The trick is to set u


=
=

eP*/2y(x), so that u’ = e*/[y’ — po/2) and u” =


=

eP*/?2Ty” — po’ + p*v/4


], whence (5) is equivalent
to

(5’) o” + (q — pr/A)u =
=

0, v= ey

There are three cases, depending on whether the discriminant A = p” — 4q is


positive, negative, or zero.

Case 1. IfA > 0, then (5) reduces to v” = k?v, where k = VA/2. This DE
has the functions v = e*, ¢— hx as a basis of solutions whence

(VE—p)x/2 (-VA-p)x/2
(6a) u=e u=e

are a basis of solutions of (5). Actually, it is even simpler to make the “exponen-
tial substitution” «

=
* in this case. Then (5) is equivalent to (4? + prat
ge = 0; the roots of the quadratic equation \? + pr + q = 0 are the coeffi-
cients of the exponents in (6a).

Case 2. IfA < 0, then (5’) reduces to v” + k*v = 0, where k = V —A/2.


This DE has cos kx, sin kx as a basis of solutions, whence

(6b) u =e? cos(\V —Ax/2, u = eP/? sin(\/ —Ax/2)

forma basis of solutions of (5) when A < 0.

Case 3. When A = 0, (5’) reduces to v” = 0, which has 1 and x as a basis of


solutions. Hence the pair

(6c) u=eh/?, u = xe P*/?

is a basis of solutions of (5) when ?* = 4q.


2 Initial Value Problems 37

2 INITIAL VALUE PROBLEMS

With differential equations arising from physical problems, one is often inter-
ested in particular solutions satisfying additional initial or boundary conditions.
Thus, in Example 1, one may wish to find a solution satisfying u(0) = up and
u’(0) = ug. An easy way to find a solution satisfying these initial conditions is to
use Eq. (4) with a = uy and b = uj/k. In general, given a second-order linear
DE such as (1) or (14), the problem of finding a solution u(x) that satisfies given
initial conditions u(a) = up and u’(a) = up is called the initial value problem.

Example 2. Suppose we supplement the normal DE of Example 1 with the


“forcing function” r(x) = 3 sin 2x, and wish to find the solution of the resulting
DE u” + u =
=
3 sin 2x satisfying the initial conditions u(0) = u’(0) = 0.
To solve this initial value problem, we first construct a particular solution of
this DE, trying u A sin 2x, where A is an unknown coefficient to be deter-
=
=

mined. Substituting into the DE, we get (—4A + A) sin 2x =


=
3 sin 2x, orA =
—1. Since a cos x + & sin x satisfies u” + u = 0 for any constants a and 4, it
follows that any function of the form

u = acosx + bsinx — sin 2x

satisfies the original DE u” + u = 3 sin 2x. Such a function will satisfy u(0) =
0 if and only if a = 0, so that

u’(x) = bcos
x — 2 cos 2x

In particular, therefore, u’(0) = 6 — 2 = 0. Hence the function u = 2 sin x —


sin 2x solves the stated initial value problem.

Particular solutions of constant-coefficient DEs with polynomial forcing


terms can be treated similarly. Thus, to solve

ul + pu’ + qu = cx +d (p, q ¢, d constants)

it is simplest to look first for a particular solution of the form ax + 5. Substi-


tuting into the DE, we obtain the equations ga = ¢c and pa + qb = 0. Unless q
= 0, these give the particular solution

uaixt qd —2pe
q

When q = 0 but p # 0, we look for a quadratic solution; thus u” + u’ = x has


the solution

—~—
38 CHAPTER 2 Second-Order Linear Equations

Finally, u”
=
=

ex + dhas the cubic solution u = cx°/6 + dx?/2.


The procedure just followed can be used to solve initial value problems for
many other second-order linear DEs of the form (1) and (1’). It requires four
steps.

Step 1. Find a particular solution u,(x) of the DE.


Step 2. Find the general solution of the reduced equation obtained by setting
ps(x) = 0 in (1), or r(x) = 0 in (1. It suffices to find two solutions ¢(x)
and (x) of the reduced DE, neither of which is a multiple of the other.
STep 3. Recognize u
=
=
ag(x) + by(x) + u,(x), where a and b are constants to
be determined from the initial conditions, as the general solution of the
inhomogeneous DE.
Strep 4. Solve for a and 6 the equations

$(0)a + Y(0)b = up — u,(0)

$'(O)a + ¥/() = up — u,(0)

For these equations to be uniquely solvable, the condition

¥(0)
(0) | = $(0)¥/(0)—¥(0)6(0) ¥ 0
¢/(0) ¥(0)

is clearly necessary and sufficient—the expression (4’) is called the Wronskian of


@ and y; we will discuss it in §5.

EXERCISES A

1 (a) Find the general solution of u” + 3u’ + 2u =

K, whereK is an arbitrary constant.


(b) Same question for u” + 3u’ = K.

Solve the initial value problem for u” + 3u’ + 2u _—


=
0, and the following initial
conditions:
(a) u(0) = 1, u’(0) = 0 (b) u(0) = 0, u’(0) 1
=
=

Answer the same questions for u” + 2u’ + 2u 0


=
=

Find a particular solution of each of the following DEs:


es
(a) u” + 3u’ + Qu =
=
e (b) wu” + 3u’ + Qu =
=
sin x
x
(c) u” + 3u’ + Qu *(d) u” + 2u’ + u = e*
=
=
é

Find the general solution of each of the DEs of Ex. 4.

Solve the initial value problem for each of the DEs of Exercise 4, with the initial con-
ditions u(0) = u’(0) = 4.

Find a particular solution of: (a) u” + Qu’ + Qu = e*,


(b) u” + 2u’ + 2u = sin x, *(c) wu” + 2u’ + Qu = e™* sin x.

Solve the initial value problem for each of the DEs of Exercise 7, and the initial con-
ditions u(0) = u’(0) = 0.

Show that any second-order linear homogeneous DE satisfied by x sin x must have a
singular point at x = 0.
3 Qualitative Behavior; Stability 39

3 QUALITATIVE BEHAVIOR; STABILITY

Note that when A < 0 (i.e., in Case 2), all nontrivial solutions of (5) reverse
sign each time that x increases by +/k. Qualitatively speaking, they are oscillatory
in the sense of changing sign infinitely often. These facts become evident if we
rewrite (6b) in the form (4), as

(7) u(x) = Ae~?*/* cos[k(x — ¢)]

Contrastingly, when A = 0, a nontrivial solution of (6) can vanish only when


ac* = —be**, This implies that Bye = —b/a, so that (i) a and 6 must have
opposite signs, and (ii) x = In|b/a|/(@ — a). Hence, a nontrivial solution can
change sign at most once: it is nonoscillatory. Likewise, in Case 3, a nontrivial
solution can vanish only where a + bx = 0, or x = (—a/b), giving the same
result. We conclude:

THEOREM 1. ifA 2 0, then a nontrivial solution of (5) can vanish at most once.
If A < 0, however, it vanishes periodically with period x/\/ —&

Stability. Even more important than being oscillatory or nonoscillatory is


the property of being stable or unstable, in the sense of the following definitions.

DEFINITION. The homogeneous linear DE (2) is stricily stable when every


solution tends to zero as x —> 00; it is stable when every solution remains bounded
as x — 00. When not stable, it is called unstable.

THEOREM 2. The constant-coefficient DE (5) is strictly stable when p > 0 and q


> 0; it is stable when p = 0 but q > 0. It is unstable in all other cases.

Proof. This result can be proved very simply if complex exponents are used
freely (see Chapter 3, §3). In the real domain, however, one must distinguish
several possibilities, viz.:

(A) 1f q < 0, then A > 0 and A? + pr + gq = 0 must have two real roots of
opposite sign. Instability is therefore obvious.
(B) If p < 0, instability is obvious from (6a)-(6b), if one keeps in mind the sign
of p in each case.

(C) If p = 0 and q > 0, then we have Example 2: the DE (5) is stable but not
strictly stable.

(D) If p > 0 and q > 0, there are two possibilities: (i) A S 0, in which case we
have strict stability by (6b) and (6c); (ii) A > 0, in which case VA< p since
A = p? — q < p’, and strict stability follows from (6a).

Second-order linear DEs with constant coefficients have so many applications


that it is convenient to summarize their qualitative properties in a diagram; we
have done this in Figure 2.1. (The words “focal,” “nodal,’”’ and ‘‘saddle’’ point
will be explained in §7; to have a focal point is equivalent to having oscillatory
solutions.)
40 CHAPTER 2 Second-Order Linear Equations

q=up/4 _
q=p/4
Unstable Stable
focal point focal point
—_

Ay way hie;
eo

Unstable Stable
nodal point nodal point
w<M <0 O<m<d
4 i

Saddlepoint (unstable)
M<0<ds

Figure 2.1 Stability Diagram for ti + pu + qu = 0.

4 UNIQUENESS THEOREM

We are now ready to treat rigorously the initial value problem stated in §1.
The first basic concept involved is very general and applies to any normal sec-
ond-order DE u” = F(x,u,u’), whether linear or not.
Think of x as time, and of the possible pairs (u,w’) as states of a physical system,
which is governed (or modeled mathematically) by the given DE. Since wu’
expresses the rate of change of u at any “time” x, while u” = du’/dx gives the
rate of change of w’, it is natural to surmise that the present state of any such
system uniquely determines its state at all future times. Indeed, the theoretical
initial value problem is to prove this result as generally as possible.
In this section, we will prove it for second-order linear DEs of the form (1)
having continuous coefficient-functions p,(x) and no singular points. Since fo(x)
¥ 0, it suffices to consider the normal form (1’).
One would like to prove also that there always exists a solution for any initial
(u,45); this will be proved for second-order linear DEs having analytic coefh-
cient-functions in Chapter 4, and (locally) for linear DEs having continuously
differentiable coefficient-functions in Chapter 6. For the present, we will have to
construct “particular” solutions and bases of solutions for homogeneous DEs
u” + p(x)u’ + q(x)u = 0 by other methods.

Linear Operators. We begin by discussing carefully the general concept of


a “linear operator.” Clearly, the operation of transforming a given function f
into a new function g by the rule

= pof”
+ pif’ + bof
4 Uniqueness Theorem 41

(for continuous p,) is a transformation from one family of functions [in our case,
the family @?(J) of continuously twice-differentiable functions on a given inter-
val J] to another family of functions [in our case, @(J)]. Such a functional trans-
formation is called an operator, and is written in operator notation

Lif] = pof” + pul’ + pef

In our case, the operator L is linear; that is, it satisfies

L{ef + dg] cLLf] + dL[g]


=
=

for any constants ¢ and d.


As a Special case (setting c = 1, d —1), if u and v are any two solutions of
=
=

the inhomogeneous linear DE (1), then their difference u —vsatisfies

L{u — vo] = L[u] — Lv] = ps(x) — ps(x) = 0

That is, their difference is a solution of the homogeneous second-order linear DE


(2).
The preceding simple observations, whose proofs are immediate, have the
following result as a direct consequence.

LEMMA I. If the function v(x) is any particular solution} of the inhomogeneous


DE (1), then the general solution of (1) is obtained by adding to v(x) the general solu-
tion of the corresponding homogeneous linear DE (2).
For, if u(x) is any other solution of (1), then u(x) = v(x) + [u(x) — v(x)],
where L[u(x) — v(x)] = 0 as before. More generally, the following lemma holds.

LEMMA 2. If u(x) is a solution of L{u] = r(x), if v(x) is @ solution of L[u] =


5(x), and if c,d are constants, then w = cu(x) + dv(x) is a solution of the DE L[u] =
er(x) + ds(x).

The proof is trivial, but the result describes the fundamental property of lin-
ear operators. Its use greatly simplifies the solution of inhomogeneous linear
DEs.

Main Theorem. Having established these preliminary results, it is easy to


prove a strong uniqueness theory for second-order linear DEs.

THEOREM 3. (Uniqueness Theorem). If p and q are continuous, then at most one


solution of (1’) can satisfy given initial conditions f (a) = co and f’(a) = cy.

Proof. Let v and w be any two solutions of (1’) that satisfy these initial con-
ditions; we shall show that their differences u —
=

v — w vanishes identically.

+ The phrase “particular solution”’ is used to emphasize that only one solution of (1) need be found,
thus reducing the problem of solving it to the case ps(x) = 0.
42 CHAPTER 2 Second-Order Linear Equations

Indeed, u satisfies (8) by Lemma 1. It also satisfies the initial conditions u = u’


= 0 when x = a. Now consider the nonnegative function o(x) = u® + u’?. By
definition, o(a) = 0. Differentiating, we have, since r(x) = 0,

o'(x) 2u’(u + u”) = 2u’[u — pl(x)u’ — 9q(x)u]


=
=

—2p(x)u + 21 — g(x))uw’

Since (u + u’)? = 0, it follows that |2uu’| < u? + w?. Hence

2[1 — g(x)Juw’ = (1 + | g(x)|)@u? + w)

and

o’(x) = [1 + | q(x) |]u® + [1 + [g(x)| + 12p(x)|]u?

Therefore, if K = 1 + max [|q(x)| + 2|p(x)|], the maximum being taken over


any finite closed interval [a, 6], we obtain

o’(x) <= Ka(x), K < +00

By Lemma 2 of Ch. 1, §10, it follows that o(x) = 0 for all x € [a, 6]. Hence
u(x) = 0 and v(x) = w(x) on the interval, as claimed.

The Uniqueness Theorem just proved implies an important extension of the


Superposition Principle stated in §1.

THEOREM 4. Let f and g be two solutions of the homogeneous second-order linear


DE

(8) u” + p(x)u’ + q(x)u = 0, p.qee

For some x Xo, let (flxo), f’(%o)) and (g(xo), g’(Xo)) be linearly independent
_

vectors. Then every solution of this DE is equal to some linear combination


h(x) cf(x) + dg(x) of f and g with constant coefficients c,d.
=
=

In other words, the general solution of the given homogeneous DE (8) is cf(x)
+ dg(x), where c and d are arbitrary constants.

Proof. By the Superposition Principle, any such h(x) satisfies (8). Conversely,
suppose the function h(x) satisfies the given DE (8). Then, at the given point xg,
constants c and d can be found such that

cf(%o) + dg(xo) = h(x), of (xo) + dg’(xo) = h’(xo)


5 The Wronskian 43

In fact, the constants ¢ and d are given by Cramer’s Rule, as

c = (hogs — gohd)/(fogs — of)

d = (fol! — hof d/( fogs — Bfd)

where we have used the abbreviations fo = fixo), f6 = f’(%o), and so on. For this
choice of c and d, the function

u(x) = h(x) — ¢f(x) — dg(x)

satisfies the given homogeneous DE by the Superposition Principle and the


initial conditions u(x9) = wu’(xo) = 0. Hence by the Uniqueness Theorem,
u(x) is the trivial solution, u(x) = 0, of the given homogeneous DE; therefore
h = of +dg.
Two solutions, f and g, of a homogeneous linear second-order DE (8) with
the property that every other solution can be expressed as a linear combination
of them are said to be a basis of solutions of the DE.

5 THE WRONSKIAN

The question of whether two solutions of a homogeneous linear DE form a


basis of solutions is easily settled by examining their Wronskian, a concept that
we now define.

DEFINITION. The Wronskian of any two differentiable functions f(x) and


g(x) is

fx) f'®)
(9) Wf g *) = flxe’(x) — gf") = g(x)
g(x)

THEOREM 5. The Wronskian (9) of any two solutions of (8) satisfies the identity

(10) Wf a3) = Wf,g;a)exp(- f ‘ poa)


Proof. If we differentiage (9) and write W(f, g; x) = W(x) for short, a direct
computation gives W = fe” — gf”. Substituting for g” and f” from (8) and
cancelling, we have the linear homogeneous first-order DE

(11) W(x) + p(x) W(x) = 0

Equation (10) follows from the first-order homogeneous linear DE (11) by Theo-
rem 4 of Ch. 1, §6.
44 CHAPTER 2 Second-Order Linear Equations

COROLLARY. The Wronskian of any two solutions of the homogeneous linear DE


(8) is identically positive, identically negative, or identically zero,

We now relate the Wronskian of two functions to the concept of linear inde-
pendence. In general, a collection of functions f|, fo... , f, is called linearly
independent on the interval a < x < 6 when no linear combination ¢, f\(x) +
Cofglx) +... + Cyfy (x) of these functions gives the identically zero function for
a = x = |, except the trivial linear combination where all coefficients vanish.
Functions that are not linearly independent are called linearly dependent. 1f f
and g are any two linearly dependent functions, then cf + dg = 0 for suitable
constants c and d, not both zero. Hence g = —(d/d)f or f = —(c/dg; the func-
tions f and g are proportional.

LEMMA. [If f and g are linearly dependent differentiable functions, then their
Wronskian vanishes identically.

Proof. Suppose that fand g are linearly dependent. Then there are two con-
stants ¢ and d, not both zero, which satisfy the two linear equations

cf(x) + dg(x) = 0, of (x) + dg’(x) = 0

identically on the interval of interest. Therefore, the determinant of the two


equations, which is the Wronskian W( f, g; x), vanishes identically.

The interesting fact is that when fand g are both solutions of a second-order
linear DE, a strong converse of this lemma is also true.

THEOREM 6. [f fand g are two linearly independent solutions of the nonsingular


second-order linear DE (8), then their Wronskian never vanishes.

Proof. Suppose that the Wronskian W(f, g; x) vanished at some point x).
Then the vectors [ f(x,), f’(x,)] and [g(x)), g’(x))] would be linearly dependent
and, therefore, proportional: g(x,) = kf(x;) and g’(x,;) = kf’(x,) for some con-
stant k. Consider now the function h(x) = g(x) — kf(x). This function is a solu-
tion of the DE (8), since it is a linear combination of solutions. It also satisfies
the initial conditions h(x) = h’(x,) = 0. By the Uniqueness Theorem, this func-
tion must vanish identically. Therefore, g(x) = &f(x) for all x, contradicting the
hypothesis of linear independence of f and g.

Remark 1. The fact that the DE (8) is nonsingular is essential in Theorem 6.


For example, the Wronskian x* of the two linearly independent solutions x? and
x? of the DE x2u” ~ 4xu’ + 6u = 0 vanishes at x = 0. This is possible because
the leading coefficient po(x) of the DE vanishes there.

Remark 2. There is an obvious connection between the formula for the


Wronskian of two functions and the formula for the derivative of their quotient:

(2) — fe’ ~ of) _ Wh®


~2.

f? f?
5 The Wronskian 45

This suggests that the ratio of two functions is a constant if and only if their
Wronskian vanishes identically. However, this need not be true if f vanishes: the
ratio of the two functions x° and |x|° is not a constant, yet their Wronskian
Wx3, |x|°) = 0. (Note also that both functions satisfy the DEs xu’ = 3u and
8xu” — Qu’ = Q.)

Nevertheless, the connection between W( f,g) and g/f is a useful one. Thus,
it allows one to construct a second solution g(x) of (8) if one nontrivial solution
is known. Namely, if P(x) = JSp(x)dx is any indefinite integral of p(x), then the
function

e7P®)
(12) a = fof | *(x)
|e
is a second, linearly independent solution of (8) in any interval where f(x) is
nonvanishing. This is evident, since (g/f)’ = W(fig)/f?, whence

— P(x)

Ja=JIf?@) |
gh)
_ W( fg)
fix) S| f?

For example, knowing that e* is one nontrivial solution of u” — 6u’ + 9 =


0, since P(x) = —6x = fpdx, setting ¢Pa) = 6
, we obtain the second solution

6x

g(x) = e*f | (e**)? Jax = 6%f dx = xe**


Riccati Equation. Finally, consider the formula for the derivative of the
ratio v = u’/u,f where u is any nontrivial solution of (8):

, 72
Uu

(
v’
(13)
=- —

p(x)v — q(x) — v?
=
_—
=

Uu

The quadratic first-order DE (13) is called the Riccati equation associated with
(8); its solutions form a one-parameter family. Conversely, if v(x) is any solution
of the Riccati equation (13) and if u’ = vu(x)u, then u satisfies (8). Hence, every
solution u(x) of (8) can be written in any interval where u does not vanish, in
the form,

(14) u(x) = C exp J v(x)dx

where v(x) is some solution of the associated Riccati equation (13).


The Riccati substitution v = u’/u thus reduces the problem of solving (8) to
the integration of a first-order quadratic DE and a quadrature. For instance,

+ Since v = u’/u = d(n u)/dx, this is called the logarithmic derivative of u.


46 CHAPTER 2 Second-Order Linear Equations

the Riccati equation associated with the trigonometric equation u” + k®u = 0


is vo’ + v* + k? = 0, whose general solution is v = k tan k(x, — x).

EXERCISES B

1. Show that all solutions of (8) have continuous second derivatives. Show also that
this is not true for (1).

2. Find a formula expressing the fourth derivative u” of any solution u of (8) in terms
of u, u’, and the derivatives of p and g. What differentiability conditions must be
assumed on the coefficients of (8) to justify this formula?

For the solution pairs of the DEs specified in Exs. 3-5 to follow, (a) calculate the
Wronskian, and (b) solve the initial-value problem for the DE specified with each of the
initial conditions u(0) = 2, u’(0) = 1, and u(0) = 1, u’(0) = —1 (or explain why there is
no solution).

3 . fix) = cos x, g(x) = sin x (solutions of u” + u = 0).

4. fix) = e7*, g(x) = e7* (solutions of u” + 4u’ + 3u = 0).


5 . fx)
=
=
x + 1, g(x) = & (solutions of xu” — (1 + x)u’ + u = 0).

Let f(x), g(x), and h(x) be any three solutions of (8). Show that

ff f"
gg’ Bg" =0
hh”
Ah

(a) Prove the Corollary of Theorem 5.


(b) Prove that if f(x) and g(x) satisfy the hypotheses of Theorem 6, then
pts) = @f" — fe")/W and a(x) = (f'e” — £'79/W.
What is wrong with the following ‘“‘proof”’ of Theorem 5: “Let w(x) = log W(x);
then w’(x) = —p(x). Hence, w(x) = w(a) — f%p(x) dx, from which (10) follows.”
Construct second-order linear homogeneous DEs having the following bases of
solutions; you may assume the result of Ex. 7:
(a) x, sin x, (b) x, x", (c) sinh x, sin x, (d) tan x, cot x.
For each of the examples of Ex. 9, determine the singular points of the resulting
DE.

10 (a) Show that if p,q € @", then every solution of (8) is of class @**?.
(b) Show that if every solution of (8) is of class @"*? then pe @" and g€e@".
11 Let f(x), g(x), h(x) be three solutions of the linear third-order DE

9” + pilx)y” + polx)y’ + ps(x)y = 0

Derive a first-order DE satisfied by the determinant

ff f*
w(x) =
ge 8”
h”
AW
6 Separation and Comparison Theorems 4

*12. Let y” + q(x)y = 0, where g(x) is “piecewise continuous” (i.e., continuous except
for a finite number of finite jumps). Define a “solution” of such a DE as a function
y = fix) € @' that satisfies the DE except at these jumps.
(a) Show that any such solution has left- and right-derivatives at every point of
discontinuity.
(b) Describe explicitly a basis of solutions for the DE y” + g(x)y = 0, if

+1 when x > 0
q(x) =
—l when x < 0

[N.B. The preceding function g(x) is commonly denoted sgn x,.]

6 SEPARATION AND COMPARISON THEOREMS

The Wronskian can also be used to derive properties of the graphs of solu-
tions of the DE (8). The following result, the celebrated Sturm Separation Theo-
rem, states that all nontrivial solutions of (8) have essentially the same number
of oscillations, or zeros. (A ‘‘zero”’ of a function is a point where its value is zero;
functions have two zeros in each complete oscillation.)

THEOREM 7. [Jff(x) and g(x) are linearly independent solutions of the DE (8),
then f(x) must vanish at one point between any two successive zeros of g(x). In other
words, the zeros offix) and g(x) occur alternately.

Proof. lf g(x) vanishes at x = x,, then the Wronskian

Wf, gx) = fixe’) # 0

since f and g are linearly independent; hence, f(x,) # 0 and g’(x,) # 0 if g(x,)
= 0. If x, and x» are two successive zeros of g(x), then g’(x,), g’(x2), f(x,), and
Jix2) are all nonzero. Moreover, the nonzero numbers g/’(x,) and g’(xs) cannot
have the same sign, because if the function is increasing at x = x,, then it must
be decreasing at x Xg, and vice-versa. Since W(/, g; x) has constant sign by
=
=

the Corollary of Theorem 4, it follows that f(x,) and f(x.) must also have opposite
signs. Therefore f(x) must vanish somewhere between x, and x9.

For instance, applied to the trigonometric DE u” + k®u = 0, the Sturm Sep-


aration Theorem yields the well-known fact that the zeros of sin kx and cos kx
must alternate, simply because these functions are two linearly independent
solutions of the same linear homogeneous DE.
A slight refinement of the same reasoning can be used to prove an even more
useful Comparison Theorem, also due to Sturm.

THEOREM 8. Let f(x) and g(x) be nontrivial solutions of the DEs u” + p(x)u = 0
and v” + q(x)v = 0, respectively, where p(x) = q(x). Then f(x) vanishes at least once
between any two zeros of g(x), unless p(x) = q(x) and fis a constant multiple of g.
48 CHAPTER 2 Second-Order Linear Equations

Proof. Let x, and x» be two successive zeros of g(x), so that g(x,) = g(xo) =
0. Suppose that f(x) failed to vanish in x; < x < x9. Replacing f and/or g by
their negative, if necessary, we could find solutions f and g positive on x, < x
< x9. This would make

Wf, 83 1) = flxig’(*) = 0 and Ws g; Xo) = Sixadg’ (xo) =0

On the other hand, since


f > 0, g > 0, and p = gon x, < x < xg, we have

“(Wh 69) =fe" — of = — de =0 on xy <x < Xx

Hence, W is nondecreasing, giving a contradiction unless

p-q=Wf gx) =0

In this event, f= kg for some constant k by Theorem 4, completing the proof.

COROLLARY 1. If q(x) =< 0, then no nontrivial solution of u” + q(x) u 0


=
=

can have more than one zero.

The proof is by contradiction. By the Sturm Comparison Theorem, the solu-


tion v = 1 of the DE v” 0 would have to vanish at least once between any
=
=

two zeros of any nontrivial solution of the DE u” + q(x)u = 0.


The preceding results show that the oscillations of the solutions of u” + q(x)u
= 0 are largely determined by the sign and magnitude of q(x). When q(x) <= 0,
oscillations are impossible: no solution can change sign more than once. On the
other hand, if q(x) = k? > 0, then any solution of w” + q(x)u = 0 must vanish
between any two successive zeros of any given solution A cos k(x — x,) of the
trigonometric DE u” + k?u = 0, hence in any interval of length z/k.
This result can be applied to solutions of the Bessel DE (**) of §1 (i.e., to the
Bessel function of order n; see Ch. 4, §4). Substituting u = v/\V/x into (**), we
obtain the equivalent DE

(15) w+ [1— 4n?


—1
Ax?
| =

whose solutions vanish when u does (for x # 0). Applying the Comparison Theo-
rem to (15) and u” + u = 0, we obtain the following.

COROLLARY 2. Each interval of length x of the positive x-axis contains at least


one zero of any solution of the Bessel DE of order zero, and at most one zero of any
nontrivial solution of the Bessel DE. of order n if n > }.

The fact that the oscillations of the solutions of u” + g(x)u =


=

0 depend on
the sign of q(x) is illustrated by Figures 2.2 and 2.3, which depict sample solution
curves for the cases q(x) = 1 and q(x) = —1, respectively.
7 The Phase Plane 49

y |

7S
RESKRRS
ERO DM \i
i

SEEK
VEY x
Figure 2.2 Solution curves of u” + u = 0

7 THE PHASE PLANE

In the theory of normal second-order DEs u” = F(x,u,w’), linear or nonlin-


ear, the two-dimensional space of all vectors (u,’) is called the phase plane. As
was noted in §5, the points of this phase plane correspond to the states of any
physical system whose behavior is modeled by such a DE
Clearly, any solution u(x) of the given DE determines a parametric curve or
trajectory in this phase plane, which consists of all [u(x),u’(x)] associated with this
solution. [A trivial exception arises at equilibrium states at which F(x,c,0) = 0, so
that w’(x) = 0 and u(x) c. Clearly, any such equilibrium point is necessarily
on the u-axis, where u’ = 0.]

4ex
4cosh
z

4 sinh
z

sez
rT
—}te-z

— sinh x

z
— 4teosh
det

Figure 2.3. Solution curves of u” — u = 0


50 CHAPTER 2 Second-Order Linear Equations

The trajectories just defined have some important general geometrical prop-
erties. For example, since u is increasing when wv’ > 0 and decreasing when w’
< 0, the paths of solutions must go to the right in the upper half-plane and to
the left in the lower half-plane. Furthermore, paths of solutions (‘‘trajectories’’)
must cut the u-axis u’ = 0 orthogonally, except where F = 0.
We will treat in this chapter only homogeneous second-order linear DEs (8),
deferring discussion of the nonlinear case to Chapter 5. Using the letter v to
signifly wu’, this DE is obviously equivalent to the system

du d
(16) —_ =


—_—_—
=
P(x)o — q(x)u
dx d.

which can also be written in vector form as

d 0 1

(3) =|
u

dx — q(x) — p(x) I v

Note that if g(x) < 0, then du’/dx = —q(x)u has the same sign as u on the u-
axis. It follows that if q(x) is negative, then any trajectory once trapped in the
first quadrant can never leave it, because it can neither cross the u-axis into the
fourth quadrant nor recross the u’-axis into the second quadrant. The same is
true, for similar reasons, of trajectories trapped in the third quadrant.
Even more important, two nontrivial solutions of (8) are linearly dependent if
and only if they are on the same straight line through the origin in the (u,v)-
plane. It follows that each straight line through the origin moves as a unit. The
preceding facts also become evident analytically, if we introduce clockwise polar
coordinates in the phase plane, by the formulas

(17) u’(x) = r cos (x), u(x) = r sin 6{x)

(We adopt this clockwise orientation so that 6 will be an increasing function on


the u’-axis.) Differentiating the relation cot # = u’/u, we then have the formulas

—(csc? 0)6’ =

(u” /u) — (u’/u)? = —plu’/u) — q — (u’/u)?


=_

=
—pcot 6 — q — cot?6

If we multiply through by —sin? 6, this equation gives

(18) d0/dx = cos? 6 + p(x) cos 6 sin 6 + q(x) sin? 6

This first-order DE gives much information about the oscillations of u.


Differentiating r(x) = u(x) + u’*(x) as in the proof of Theorem 1, where
a(x) = r°(x), we get
/
uw’ + u’u” u'(u — pu’ — qu)
= —

TY = =

=
=
r® cos O[(1 — g(x)) sin 6 — p(x) cos 6]
7 The Phase Plane 51

Dividing through by r? and simplifying, we obtain

1 dr
(19)
—_—— =

p(x) cos? 6 + (1 — q(x)) cos 6 sin 6


r dx

As in Theorem 1, it follows that the magnitude |d(In r)/dx| of the logarithmic


derivative of r(x) is bounded by [P|max + (1 + |4|maxd/2:
Now, consider the graph of the multiple-valued function @(x) in the (x, 6)-
plane. Since cot @ is periodic with period 7, the graph at @ = arc cot(u’/u) for
any solution of (8) consists of an infinite family of congruent curves, all obtained
from any one by vertical translation through integral multiples of 7. The curves
that form the graphs of 6,(x) and @,(x), for any two linearly independent solu-
tions wu), u, of (8), occur alternately. Moreover, by the uniqueness theorem of
Ch. 1, they can never cross.
In (17), u = 0 precisely when sin @ = 0, that is, when @ = 0 (mod 7). Inspect-
ing (18), we also see that

(20a) When 6 = 0 (mod na), that is, u = 0, then d@/dx > 0-

(20b) When 6 =: 1/2 (mod x), d0/dx has the sign of q


¢

From (20a) it follows that, after the graph of any @(x) has crossed the line 6 =
nw, it can never recross it backwards. Where u(x) next vanishes (if it does), we
must have 6 = (n + 1)z; in other words, successive zeros of u(x) occur precisely
where 6 increases from one integral multiple of 2 to the next!
After verifying that the right side of (18) satisfies a Lipschitz condition, we
see that this inequality can never cease to hold; hence, in any interval where
6,(x) increases from nz to nw + 7, 6.(x) must cross the line 6 = na + 7 and so
ug must vanish there. Sturm’s Comparison Theorem follows similarly: if q(x) is
increased and p(x) is left constant, the Comparison Theorem of Ch. 1, applied
to (19), yields it as a corollary.

Oscillatory Solutions. The preceding considerations also enable one to


extend some of the results stated in §3 for constant-coefficient DEs to second-
order linear DEs with variable coefficients. When q(x) > p?(x)/4, the quadratic
form on the right side of (18) is positive definite; hence d@/dx is identically posi-
tive. Unless g(x) gets very near to p°(x)/4, the zero-crossings of solutions occur
with roughly uniform frequency, and so the DE (8) may be said to be of oscii-
latory type.
When q(x) < 0, the DE (8) is said to be of positive type. One can also say by
(20a) and (20b), that once 6(x) has entered the first or third quadrant, it can
never escape from this quadrant; it is trapped in it. Therefore, a given solution
u(x) of (8) can have at most one zero if g(x) < 0; solutions are nonoscillatory.
Moreover, since uu’ > 0 in the first and third quadrants, u*(x) and hence | u(x)|
are perpetually increasing after a solution has been trapped in one of these
quadrants.
Using more care, one can show that when q(x) < 0 the limit as a| — 00 of
52 CHAPTER 2 Second-Order Linear Equations

the solutions q(x) satisfying va(0) = 1 and ue(a) = 0 is an everywhere increasing


positive solution. Moreover, replacing x by —x, which reverses the sign of u’(x),
one can construct similarly an everywhere decreasing positive solution. These two
monotonic solutions (e” and e~* for u” — u = 0) are usually unique, up to con-
stant positive factors, and provide a natural basis of solutions.

Focal, Nodal, and Saddle Points. Even more interesting than Sturm’s theo-
rems are the qualitative differences between the behavior of solutions of differ-
ent second-order DEs that become apparent when we look at the corresponding
trajectories in the phase plane (their so-called phase portraits). We shall discuss
these for nonlinear DEs in Chapter 5; here we shall discuss only the linear, con-
stant-coefficient case. We have already discussed this case briefly in §§2~3, pri-
marily from an algebraic standpoint.
In the linear constant-coefficient case, using the letter v to signify u’, we obvi-
ously have

du du
(21) —_ = —-
=
i

po
— qu.
dx Ya

Deferring to Chapter 5, §5, the discussion of the possibilities gq = 0 and


A = p’ — 4q = 0, the original DE u” + pu’ + qu = 0 has a basis of solutions
of one of the following three main kinds: A) if p? < 4q, e cos kx and e™ sin kx,
B) if p? > 4q > 0, functions e™ and e*, where a and 6 have the same sign,
C) if p? > 0 > 4g, functions e* and e* where a and 6 have opposite signs.
These three cases give very different-looking configurations of trajectories
in the phase plane.
Note that Cases B andC are subcases of the “Case 1” discussed in §1, while
Case A coincides with “Case 2’’ discussed there. As will be explained in Chapter
5, §5, most of the qualitative differences to be pointed out below have analogues

for nonlinear DEs of the general form oe F(u,v), of which the form
d

dv _ ~po
|
(21’)
du v

of (21) is a special case.

Case A. By (18), writing y = cot 6, we have

& = dB/dx = (sin?6)(y? + py + 4) >0 for all 0.

Hence, @ increases monotonically. In each half-turn around the origin, r


elle
is amplified or damped by a factor , according as a > 0 ora < 0. In
either case there are no invariant lines; the critical point at (0,0) is said to be
a focal point. Figure 2.4a shows the resulting phase portrait for u” + 0.2u’ +
4.0lu = 0.
7 The Phase Plane 53

A=2

/
0
ss

/ /
(a) u" +0.2u' + 4.0lu = 0. (b) 2u"—5u'+ 2u=0

Figure 2.4 Two phase portraits.

When > 4q (i.e., in Cases B and C), the two lines u’ = au and u’ = Bu in
the phase plane are invariant lines (Ch. 1, §'7). These lines, which correspond to
the solutions e** and e**, divide the uu’-plane up into four sectors, in each of
which @& is of constant sign and so @ is monotonic. If ¢ = a6 > 0, the two invar-
iant lines lie in the same quadrant; if g < 0, they lie in adjacent quadrants.

Case B. In this case, the trajectories in each sector are all tangent at the
origin to the same invariant line, and have an asymptotic direction parallel to
the other invariant line at 00. Fig. 2.4b depicts the phase portrait for 2u” — 5u’
+ Qu 0. The lines v 2u and u 2v are the invariant lines of the corre-
= = —
= => =

sponding linear fractional DF, dv/du = (5v — u)/v. In Case B, the origin is said
to be a nodal point.

Case C. In the saddle point case that p® > 0 > 4g, the two invariant lines lie
in different quadrants, and all trajectories are asymptotic to one of them as they
come in from infinity, and to the other as they recede to it. Figure 1.5 depicts
the phase portrait for the case u” u, with hyperbolic trajectories u 2 2
= —

—vd
=

4AB in the phase plane given parametrically by u Ae* + Be*,v = Ae


ul
= =
= =

~ Be~*, The invariant lines are the asymptotes v = +u.

EXERCISES C

1. (a) Show that if g(x) = f’(x), then g(x) vanishes at least once between any two zeros
of f(x).
(b) Show how to construct, for any n, a function f(x) satisfying (0) = fl) = 0,
Six) # 0 on (0,1), yet for which f(x) vanishes n times on (0,1).
54 CHAPTER 2 Second-Order Linear Equations

Show that there is a zero of J,(x) between any two successive zeros of Jo(x).

Show that every solution of u” + (1 + e*)u


=
=
0 vanishes infinitely often on (—0o
,0), and also infinitely often on (0300).

Show that no nontrivial solution of u” + (1 — x’)u = 0 vanishes infinitely often.


*5 The Legendre polynomial P,(x) satisfies the DE (1 — xu” — Qxu’ + n(n + lu =
0. Show that P,(x) must vanish O(n) times on [—1,1].
Apply numerical methods (Ch. 1, §8) to (18), to determine about how many times
any solution of u” + xu = 0 must vanish on (0,100).

Same question for the Mathieu DE u” + [x? + 4 cos 2x]u = 0.


(a) Show that no normal second-order linear homogeneous DE can be satisfied by
both some cos kx with k # 0, and some e*. [HINT: Consider the Wronskian.]
(b) Find a normal third-order homogeneous linear DE that has as solutions both the
oscillatory functions sin x, cos x, and the nonoscillatory function é*.

8 ADJOINT OPERATORS; LAGRANGE IDENTITY

Early studies of differential equations concentrated on formal manipulations


yielding solutions expressible in terms of familiar functions. Out of these studies
emerged many useful concepts, including those of integrating factor and exact
differential discussed in Ch. 1, §6. We will now extend these concepts to second-
order linear DEs, and derive from them the extremely important notions of
adjoint and self-adjoint equations.

DEFINITION. The second-order homogeneous linear DE

(22) Llu] = po(x)u"(x) + pi(x)u’(x) + polx)u(x) = 0

is said to be exact if and only if, for some A(x),B(x) € é},

(22) polxju” + pilxju’ + polx)u = ‘ [A(x)u’ + B(x)u]

for all functions u € @®. An integrating factor for the DE (22) is a function v(x)
such that vL[u] is exact. [Here and later, it will be assumed that py € @ and that
Pi-Po € @' in discussing the DEs (22) and (22’).]

If an integrating factor v for (22) can be found, then clearly

wepoladu" + prlsdu’ + pelea] = = CAGul + Bead


Hence, the solutions of the homogeneous DE (22) are those of the first-order
inhomogeneous linear DE

(23) A(x)u’ + B(x)u =


=
Cc
8 Adjoint Operators, Lagrange Identity 55

where C is an arbitrary constant. Also, the solutions of the inhomogeneous DE


L[u]
=
=
r(x) are those of the first-order DE

(23’) A(x)u’ + B(x)u


=
=
Jo(x)r(x) dx + C

The DEs (23) and (23’) can be solved by a quadrature (Ch. 1, §6). Hence, if an
integrating factor of (22) can be found, we can reduce the solution L[u] = r(x)
to a sequence of quadratures.
Evidently, L[u] = 0 is exact in (22) if and only if pp = A, p) = A’ + B, and
po = B’. Hence (22) is exact if and only if

pr = BY = (hy — AY = Bh ~ (pay

This simple calculation proves the following important result.

LEMMA. The DE (22) is exact if and only if its coefficient functions satisfy

p> — pi + po = 0

COROLLARY. A function v € @?is an integrating factor for the DE (22) if and


only if it is a solution of the second-order homogeneous linear DE

(24) M[v] = [polx)v]” — [pilx)u]’ + polx)o = 0

DEFINITION. The operator M in (24) is called the adjoint of the linear oper-
ator L. The DE (24), expanded to the DE

(24’) pov” + (20 — piv’ + (p60 — pi + pov = 0

is called the adjoint of the DE (22).

Clearly, whenever a nontrivial solution of the adjoint DE (24) or (24’)


of a given second-order linear DE (22) can be found, every solution of any DE
L[u] = r(x) can be obtained by quadratures, using (23’).

Lagrange Identity. The concept of the adjoint of a linear operator, which


originated historically in the search for integrating factors, is of major impor-
tance because of the role which it plays in the theory of orthogonal and bior-
thogonal expansions. We now lay the foundations for this theory.
Substituting into (24), we find that the adjoint of the adjoint of a given sec-
ond-order linear DE (20) is again the original DE (20). Another consequence of
(24) is the identity, valid whenever pp € C7, p,€ @',

vL[u] — uM[v] = (vpo)u” — ul pov)” + (wpy)u’ + ul p,o)’


56 CHAPTER 2 Second-Order Linear Equations

Since wu” — uw’ = (wu’ — uw’ and (uw)’ = uw’ + wu’, this can be simplified
to give the Lagrange identity

(25) L{u] — uMo) = © pg(u’o — wo’) — (66 ~ uo)


The left side of (25) is thus always an exact differential of a homogeneous bilin-
ear expression in u,v, and their derivatives.

Self-Adjoint Equations. Homogeneous linear DEs that coincide with their


adjoint are of great importance; they are called self-adjoint. For instance, the
Legendre DE of Example 2, §1, is self-adjoint. The condition for (22) to be self-
adjoint is easily derived. It is necessary by (24’) that 26) — p, = p,, that is,
po = py. Since this relation implies pj — pj = 0, it is also sufficient. Moreover,
in this self-adjoint case, the last term in (25) vanishes. This proves the first
statement of the following theorem.

THEOREM 9. The second-order linear DE (22) is self-adjoint if and only if it has


the form

(26)

The DE (22) can be made self-adjoint by multiplying through by

(26’) h(x) = |expJ u/po)ae /Po-


To prove the second statement, first reduce (22) to normal form by dividing
through by fp, and then observe that the DE

hu” + (ph)w’ + (qhju = 0

is self-adjoint if and only if h’


=
=
ph, or h = exp (fp dx).
For example, the self-adjoint form of the Bessel DE of Example 1 is

(xu’ + [x — (n?/x)]u =0

For self-adjoint DEs (26), the Lagrange identity simplifies to

(26”) vL[u] — uLf{v] = < [p(x)(u’v — uv’)]


8 Adjoint Operators; Lagrange Identity 57

EXERCISES D

1. Show that if «(x) and v(x) are solutions of the self-adjoint DE

(pu’y + q(x)u = 0

then p(x)[uv’ — vu’] is a constant (Abe]’s identity).

Reduce the following DEs to self-adjoint form:


(a) (1 — x)u” — xu’ + Au = 0 (Chebyshev DE)
(b) x?u” + xu? +u = 0 (c) u” + u’ tanx
=0

For each of the following DEs,


=
=
x is one solution; use (12) to find a second,
linearly independent solution by quadratures.
(a) x°y” — 4xy’ + 6y = 0 (b) xy” + (x —2)y’ — 3y = 0

Show that the substitution y = etry replaces (8) by

y” + Ty = 0, _P
I(x) =q—-
4— p’/2

*5 Show that two DEs of the form (8) can be transformed into each other by a change
of dependent variable of the form y = v(x)u,v # 0, if and only if the function
I(x) = g(x) — p'(x)/4 — p’(x)/2 is the same for both DEs [J (x) is called the invariant
of the DE].

Reduce the self-adjoint DE (pu’)’ + qu = 0 to normal form, and show that, in the
notation of Ex. 5, I(x) = (p? — 2pp” + 4pq)/4p°.
(a) Show that, for the normal form of the Legendre DE [(1 — x*)u’])’ + Au = 0

(1 +A
— Ax’)
I(x) = (Use Ex. 6.)
(1 — x)?

(b) Show that, if \ = n(n + 1), then every solution of the Legendre equation has
at least (2n + 1)/x zeros on (—1, 1).

Let u(x) be a solution of u” = q(x)u, q(x) > 0 such that u(0) and w’(0) are positive.
Show that wu’ and u(x) are increasing for x > 0.

Let h(x) be a nonnegative function of class @'. Show that the change of independent
variablet = f% h(s) ds, u(x) = v(t), changes (8) into v” + p,(é)v’ + ¢,(v = 0, where
Pil) = [plx)a(x) + h'(e)]/h(x)? and q(t) = g(x)/h(x)’.
10 (a) Show that a change of independent variable ¢ + f Iq(x)|'? dx, q # 0,
=
=

q€ @' changes the DE (8) into one whose normal form is

du q + 2pq du
(*) dt?
( 2\q|*” dt
~

(b) Show that no other change of independent variable makes |q/ = 1


*11 Using Ex. 10, show that Eq. (8) is equivalent to a DE with constant coefficients
3/2 2

under a change of independent variable if and only if (q’ + 29)/q 1s a constant.

12 Making appropriate definitions, show that pou” + piu” + pou’ + psu =


=
0 is an
exact DE if and only if p7 — p{ + po — ps; = 0.
58 CHAPTER 2 Second-Order Linear Equations

9 GREEN’S FUNCTIONS

The inhomogeneous linear second-order DE in normal form,

2
u
(27) L{u] =

dx? + p(x)“ + q(x)u = r(x)

differs from the homogeneous linear DE

2
u
(27’) Lf{u] = —

dx? + po) + glen =


=

by the nonzero function r(x) on the right side. In applications to electrical and
dynamical systems, the function r(x) is called the forcing term or input function.
By the Uniqueness Theorem of §4 and Lemma 2 of §4, it is clear that the solu-
tion u(x) of L[u] = r(x) for given homogeneous initial conditions such as u(0)
u’(0) = 0 depends linearly on the forcing term. We will now determine the
=
=

exact nature of this linear dependence.


Given the inhomogeneous linear DE (27), we will show that there exists an
integral operator G,

(28) G[r] = f ‘ G(x,E)r(€)dé


such that G[7] u. In fact, one can always find a function G that makes G[7]
=
=

satisfy given homogeneous boundary conditions, provided that the latter define
a well-set problem.
The kernel G(x, £) of Eq. (28) is then called the Green’s functiont associated
with the given boundary value problem. In operator notation, it is defined by
the identity L[G[r]] = r (Gis a “right-inverse’’ of the linear operator L) and
the given boundary conditions.
Green’s functions can be defined for linear differential operators of any
order, as we will show in Ch. 3, §9. To provide an intuitive basis for this very
general concept, we begin with the simplest, first-order case. In this example,
the independent variable will be denoted by ¢ and should be thought of as rep-
resenting time.

Example 3. Suppose that money is deposited continuously in a bank account,


at a continuously varying rate r(#), and that interest is compounded continuously
at a constant rate p (=100p% per annum). As a function of time, the amount

+ To honor the British mathematician George Green (1793-1841), who was the first to use formulas
like (28) to solve boundary value problems. Cauchy and Fourier used similar formulas earlier to solve
DEs in infinite domains.
9 Green’s Functions 59

u(é) in the account satisfies the DE

du
a bet wo

1f the account is opened when ¢ = 0 and initially has no money: u(0) = 0, then
one can calculate u(7) at any later time T > 0 as follows. Each infinitesimal
deposit r(f) dt, made in the time interval (¢, + dé), increases through compound
interest accrued during the time interval from t to T by a factor e7~® to the
amount e?—r(¢) dt. Hence the account should amount, at time T, to the integral
(limit of sums)

(29) u(T) = f T9001) dt = ef e'y(t) dt

This plausible argument is easily made rigorous. It is obvious that u(0) = 0 in


(29). Differentiating the product in the final expression of (29), we obtain

u(T) =per?f ePr(t) dt +efPT1(T) = pu(T) + 1(T)


where the derivative of the integral is evaluated by the Fundamental Theorem
of the Calculus.

Example 4. Consider next the motion of a mass m on an elastic spring, which


we model by the DE u” + pu’ + qu = r(i). Here p signifies the damping coef-
ficient and q the restoring force; r(t) is the forcing function; we will assume that
q> p’/A. Finally, suppose that the mass is at rest up to time fg, and is then given
an impulsive (that is, instantaneous) velocity vp at time fp.

The function f describing the position of the mass m as a function of time


under such conditions is continuous, but its derivative f’ is not defined at fp,
because of the sudden jump in the velocity. However, the left-hand derivative
of fat the point f) exists and is equal to zero, and the right-hand derivative also
exists and is equal to vo, the impulsive velocity. For t > ft, the function f is
obtained by solving the constant-coefficient DE u” + pu’ + qu = 0. Since
q > p’/A, the roots of the characteristic equation are complex conjugate, and
we obtain an oscillatory solution

0 t< ty
u(t)
-| (vp/v)e™"— sin v(t ~ ty) t=t

where wp = »/2 and p Vq — p/4.


60 CHAPTER 2 Second-Order Linear Equations

Now suppose the mass is given a sequence of small velocity impulses Av, =
=

r(t,) At, at successive instants fp, f) = t) + At,...,t, = t +k At,.... Summing


the effects of these over the time interval fj < ¢ = T, and passing to the limit
as At — 0, we are led to conjecture the formula

(30) u(T) = f ‘3 eT sino(T— t)r(t)dt


%

This represents the forced oscillation associated with the DE

(31) u” + pu’ + qu = r(d), q> p’/4


having the forcing term r(é).

Variation of Parameters. The conjecture just stated can be verified as a


special case of the following general result, valid for all linear second-order DEs
with continuous coefficients.

THEOREM 10. Let the function G(t, 7) be defined as follows:


(i) Gd, 7) = 0, fora =t<7,
(ti) for each fixed + =a and all t > 1, Gt, 7) is that particular solution of the DE
L[G] = Gy + p®G, + q®G = 0 which satisfies the initial conditions G = 0
and G, = 1 att = 7.
Then G is the Green’s function of the operator L for the initial value problem on
t= a.

Proof. We must prove that, for any continuous function r, the definite
integral

(32) u(t) = f G(t, r)r(7) dr = f G(t, t)r(r) dr [by(i)]

is a solution of the second-order inhomogeneous linear DE (27), which satisfies


the initial conditions u(a) = u’(a) = 0.
The proof is based on Leibniz’ Rule for differentiating definite integrals.t
This rule is: For any continuous function g(t, 7) whose derivative dg/dt is piece-
wise continuous, we have

2 f een dr = g(t, i) + f ¥ene

+ Kaplan, Advanced Calculus, p. 219. In our applications, dg/dt has, at worst, a simple jump across
t=fT.
9 Green’s Functions 61

Applying this rule twice to the right side of formula (32), we obtain successively,
since Git, t) = 0,

u(t) = Gd, Or) + f Git, t)r(r) dr = f G,@,r)r(r) dr

u"() = Gt, \r® + f G,{t, 7)r(r) dr

By assumption (ii), the last equation simplifiesto

u(t) = ri) + f G,(t,7)r(r) dr

Here the subscripts indicate partial differentiation with respect to ¢. Thus

Liu] u"() + pou’) + qQuO


=

ri) + f [Gult, 7) + pOGE,7) + qOGE,7))r(7) dr = 1


completing the proof.
The reader can easily verify that the function y~'e““~” sin v(t — 7) in (30)
satisfies the conditions of Theorem 10, in the special case of Example 4.
To construct the Green’s function G(¢, 7) of Theorem 9 explicitly, it suffices
to know two linearly independent solutions f() and g(¢) of the reduced equation
L[u] = 0. Namely, to compute G(t, r) for t > +r write Gt, 7) = c(r)f@® +
d(r)g(t), by Theorem 3. Solving the simultaneous linear equations G = 0, G, =
1, at ¢ = + specified in condition (ii) of Theorem 10

e(r)f(z) + d(z)g(r) = 0, c(t) f(r) + d&g’) = 1

we get the formulas

er) = —g(r)/W(f, g: 7), d(r) = f(r)/W(F, 8; 7)

where Wf, g3 7) = fe’) — g(t)f’@ is the Wronskian. This gives for the
Green’s function the formula

GE 7) = L/MgO — gMfOIWL/MEe’® — gOf')

Substituting into (32), we obtain our final result.

COROLLARY. Let f(t) and g(t) be any two linearly independent solutions of the
linear homogeneous DE (27’). Then the solution of L[u] = r(t) for the initial conditions
62 CHAPTER 2 Second-Order Linear Equations

u(a) = u’(a) = 0 is the function

‘fOg® — gofo r(r) dr


(33) u(t) =
a WLf@),.g@)]

Consequently, if we define the functions o(t) and Y(t) as the following definite
integrals:

‘ {@ * g(r) r(r) dr
ot) =
~~

r(r) dr, vO =~
S_

a Wg) a Wig)

we can write the solution of L[u] = r(@), in the form

(33’) ut) = Og® + VOSO

In textbooks on the elementary theory of DEs, formula (33) is often derived


formally by posing the question: What must c(r) and d(r) be in order that the
function

Gt, 7) = c(r)f® + ding

when substituted into (28), will give a solution of the inhomogeneous DE L[u]
= r(t)? Since c(r) and d(r) may be regarded as “variable parameters,”’ which vary
with 7, formula (33) is said to be obtained by the method of variation of
parameters.

EXERCISES E

1 Integrate the following DEs by using formula (33):


(a) y” —y= x” (b) y” ty =e!

(c) y” — gy +9 = 2xe&* (d) y” + 10y’ + 25y = sin


x

Show that the general solution of the inhomogeneous DE y” + k’y = R(x) is given by
y = (1/2); sin ke — YR di] + c, sin kx + cy cos kx.
Solve y” + 3y' + 2y = x? for the initial conditions y(0) = y'(0) = (0).
Show that any second-order inhomogeneous linear DE which is satisifed by both x?
and sin” x must have a singular point at the origin.
Construct Green’s functions for the initial-value problem, for the following DEs:
(a) uw” = 0 (b) u” =u () u® +u=0
(d) x7” + (x? + 2x)u’ + (x + 2)u = 0 [HINT: x is a solution.]
Find the general solutions of the following inhomogeneous Euler DEs:
(a) xy — Qxyl + Q2y = x7 + px tq (b) xy” + 3xy’ + y = R(x)
[Hint: Any homogeneous Euler DE has a solution of the form x’.]

(a) Construct a Green’s function for the inhomogeneous first-order DE

du/dt = p(tu + ri)


10 Two-Endpoint Problems 63
-

(b) Interpret in terms of compound interest (cf. Example 3).


(c) Relate to formula (87 of Ch. 1.

8. Show that, if g(f) < 0, then the Green’s function G(t, 7) of u, + q(tu =
=
0 for the
initial-value problem is positive and convex upward for ¢ > r.

*10 TWO-ENDPOINT PROBLEMS

So far, we have considered only “initial conditions.” That is, in considering


solutions of second-order DEs such as y” = —p(x)y’ — q(x)y, we have supposed
y and y’ both given at the same point a. That is natural in many dynamical prob-
lems. One is given the initial position and velocity, and a general law relating
the acceleration to the instantaneous position and velocity, and then one wishes
to determine the subsequent motion from this data, as in Example 4.
In other problems, two-endpoint conditions, at points x =
=

aand
x = Bb, are
more natural. For instance, the DE y” =
=
0 characterizes straight lines in
the plane, and one may be interested in determining the straight line joining
two given points (a, c) and (6, d). That is, the problem is to find the solution
y = f(x) of the DE y” = 0 which satisfies the two endpoint conditions f(a) = ¢
and f(b) = d.
Many two-endpoint problems for second-order DEs arise in the calculus of
variations. Here a standard problem is to find, for a given function F(x,y,y’), the
curve y = f(x) which minimizes the integral

(34) fp= f F(x,y,y’)dx


By a classical result of Euler,f the line integral (34) is an extremum (maximum,
minimum, or minimax), relative to all curves y = f(x) of class @? satisfying
{@ = ¢ and f() = d, if and only if f(x) satisfies the Euler-Lagrange variational
equation

d OF
(34’)
(ay’ (oF
=
=

dx dy

For example, if F(x,y,y)) = V1 + y” so that I(f) is the length of the curve, Eq.
(34’) gives zero curvature:

/ 72 a
d
)-9 | aa J

(
dx Viry
J
a + wy?
| = dl + wy?

+ See for example Courant and John, Vol. 2, p. 743.


64 CHAPTER 2 Second-Order Linear Equations

as the condition for the length to be an extremum. This is equivalent to y” = 0,


whose solutions are the straight lines y = cx + d.
It is natural to ask: Under what circumstances does a second-order DE have
a unique solution, assuming given values f(a) c and f(6) = d at two given
=
=

endpoints a and b > a? When this is so, the resulting two-endpoint problem is
called well-set. Clearly, the two-endpoint problem is always well-set for y” = 0.

Example 5. Now consider, for given p, q, r € C', the curves that minimize the
integral (34) for F = ${p(x)y + 2q(x)yy’ + 1(x)y®]. For this F, the Euler-
Lagrange DE is the second-order linear self-adjoint DE (py’)’ + (7 — ny = 0.
The question of when the two-endpoint problem is well-set in this example is
partially answered by the following result.

THEOREM 10. Let the second-order linear homogeneous DE

(35) polx)u” + py(x)u’ + po(x)u 0 po(x) # 0


=
=

with continuous coefficient-functions have two linearly independent solutions.t Then the
two-endpoint problem defined by (35) and the endpoint conditions u(a) c, u(b) = d
=
=

is well-set if and only if no nontrivial solution satisfies the endpoint conditions

u(b) = 0
=

(36) u(a) =

Proof. By Theorem 2, the general solution of the DE (35) is the function «


= af(x) + Bg(x), where fand g are a basis of solutions of the DE (35), and a, 6
are arbitrary constants. By the elementary theory of linear equations, the
equations

af(a) + Bg) =c af(b) + Bg) = 4

have one and only one solution vector (a, 8) if and only if the determinant
f@g) — g@f® # 0. The alternative f(a)g(6) = f®)g(@ is, however, the con-
dition that the homogeneous simultaneous linear equations

(37) af(a) + Bg(a) = af(&) + Bg) = 0

have a nontrivial solution (a, 8) # (0, 0). This proves Theorem 11.

When the DE (35) has a nontrivial solution satisfying the homogeneous end-
point conditions u(a) = u(b) = 0, the point (, 0) on the x-axis is called a con-
jugate point of the point (a, 0) for the given homogeneous linear DE (35) or for
a variational problem leading to this DE. In general, such conjugate points exist

t In Ch. 6, it will be shown that this hypothesis is unnecessary; a basis of solutions always exists for
continuous p,(x).
11 Green’s Functions, II 65

for DEs whose solutions oscillate but not for those of nonoscillatory type, such
as u” = g(x)u, q(x) > 0.
Thus, in Example 5, letp = 1, ¢ = 0, and r = —k? < Q. Then the general
solution of (35) for the initial condition u(a) = 0 is u = A sin [k(x — a)]. For
u(b) = 0 to be compatible with A # 0, it is necessary and sufficient that b = a
+ (nx/k). The conjugate points of a are spaced periodically. On the other hand,
the DE y” — Ay = 0, corresponding to the choice p = 1, q = 0,7 A in

=

Example 5, admits no conjugate points if \ = r is positive.

*11_ GREEN’S FUNCTIONS, II

We now show that, except in the case that a and b are conjugate points for
the reduced equation L{u] = 0, the inhomogeneous linear DE (27) can be
solved for the boundary conditions u(a) = u(b) = 0 by constructing an appro-
priate Green’s function G(x, &) on the square a < x, € <= b and setting

(38) u(x) = J G(x,Org) d& = Gly]


Note that G is an integral operator whose kernel is the Green’s function G(x, &).
The existence of a Green’s function for a typical two-endpoint problem is
suggested by simple physical considerations, as follows.

Example 6. Consider a nearly horizontal taut string under constant tension


T, supporting a continuously distributed load w(x) per unit length. If (x)
denotes the vertical displacement of the string, then the load w(x) Ax supported
by the string in the interval (x, x) + Ax) is in equilibrium with the net vertical
component of tension forces, which is

Thy (xo + Ax) — y(%o)}

in the nearly horizontal (‘small amplitude’’) approximation.+ Dividing through


by Ax and letting Ax | 0, we get Ty”(x) = w(x).
The displacement y(x) depends linearly on the load, by Lemma 2 of §4. This
suggests that we consider the load as the sum of a large number of point-con-
centrated loads w, = w(é,) Ag, at isolated points £;. For each individual such load,
the taut string consists of two straight segments, the slope jumping by w,/T at
é,. Thus, if the string extends from 0 to 1, the vertical displacement is

wG(x,&) = |6; ~~ 1)x


e€(x — 1) bi =x=l1
—_

+ For a more thorough discussion, see J. L. Synge and B. A. Griffith, Principles of Mechanics, McGraw-
Hill, 1949, p. 99.
66 CHAPTER 2 Second-Order Linear Equations

where ¢, is set equal to w,/T in order to give a jump in slope of w,/T at the point
x = £, Passing to the limit as the Aé, | 0, we are led to guess that

9) = J" Cle,ule) dé
where

G(x,§) = |&(x
(§ — 1)x/T
~ 1)/T
Ox=xxsé
é=x=1

These heuristic considerations suggest that, in general, the Green’s function


G(x, &) for the two-endpoint problem is determined for each fixed & by the fol-
lowing four conditions:

(i) L[G] = 0in eachof the intervals a = x Sfandé= x = 6b.

(ii) GG, = Gi, £) = 0.


(iii) G(x, &) is continuous across the diagonal x = £ of the square a = x,& =)
over which G(x, &) is defined.
(iv) The derivative 9G/dx jumps by 1/p9(x) across this diagonal. To fulfill these
conditions for any given &, let f(x) and g(x) be any nontrivial solutions of
L{u] = 0 that satisfy f(a) = 0 and g(b) = 0, respectively. Then for any factor
e(£), the function

Gx, ) = |e)f(x) g)
«(E)fg(x)
asxx<t
é=x<b

will satisfy L[G] = 0 in the required intervals because L[f] = L[g] = 0; it will
satisfy (ii) because f(a) = g(b) = 0; and it approaches the same limit «(£)f(é)g(é)
from both sides of the diagonal x = &; hence it is continuous there. For the
factor e(£) to give 0G/dx a jump of 1/po(x) across x = &, a direct computation
gives the condition

(39) so 8 — SEED = MOUOHO — BOLO) = 1/o®


We are therefore led to try the kernel

(39/ Ge, = |SOs)


fO)gO©/PoHWe
/poHwe
a=x<ét
Ex=x<b

where W = fg’ — gf’ is the Wronskian of f and g. Observe again that since f(a)
= g(b) = 0, Ga, §) = GO, §) = 0 for all & € [a, 8].
11 Green’s Functions, I 67

THEOREM 11’. For any continuous function r(x)on [a, b], the function u(x) €
@? defined by (39) and (39’) is the solution of pou” + piu’ + pou =
=
r that satisfies
the boundary conditions of u(a) = u(b) = 0, provided that W(f, g) # 0, i.2., that
there is no nontrivial solution of (35) satisfying the same boundary conditions.

The proof is similar to that of Theorem 10; the existence of two linearly inde-
pendent solutions of (35) is again assumed. Rewriting (38) in the form

u(x) = f ” G(x,Hr® at + J"Gle,Orl®)dé

and differentiating by Leibniz’ Rule, we have

ul(x) = f ” G,lee,Or® dé + J" Gxl0e,rl at


The endpoint contributions cancel since G(x, §) is continuous for x = &. Differ-
entiating again, we obtain

wey = f ” GyalotsB®) dé + Gylee,xr")


+ f Gu(x, 8G) d—é— G(x, x*)r(x*)
where f(x*) signifies the limit of f(€) as € approaches x from above, and f(x")
the limit as approaches x from below. The two terms corresponding to the
contributions from the endpoints come from the sides § < x and & > x of the
diagonal; since r is continuous, r(x) = r(x*). Hence their difference is [G,(x* ,x)
— G,{x~, x)]r(x), which equals r(x/po(x) by (39). Simplifying, we obtain

r(x)
u”(x) = f Gy(x, &)r(&) d& + po
~~
)

From this identity, we can calculate L[u]. It is

Llu) = J L[G(, H)]r dé + r(x) =


=

r(x)

since L[G(x, §)] = 0 except at x = & Here the operator L acts on the variable
x in G(x, &); thoughG is not in @?, the expression L[{G] is meaningful for one-
sided derivatives and the above can be justified. This gives the identity (38).
Since G(x, ), considered as a function of x for fixed &, satisfies the boundary
68 CHAPTER 2 Second-Order Linear Equations

conditions Gia, § = G(b, & = 0, it follows from (38) that u(a) —


=

u(b) = 0,
completing the proof of the theorem.

Delta-Function Interpretation. The ideas underlying the intuitive discus-


sion for Examples 4 and 5 can be given the following heuristic interpretation.
Let the symbolic function 6(x) stand for the limit of nonnegative “density” func-
tions p(x) concentrated in a narrow interval (—e«, ¢) near x = 0, with total mass
J&< p(x) dx = 1, ase | 0. Likewise, 5(x — £) stands for the density of a unit mass
(or charge) concentrated at x = &.
For any
f € C[a, 6] a < 0 < 6 and any such with support (—«, & C [a, 8],
we will have by the Second Law of the Mean for integrals

f “plsypts) dx = farptey dx = fley [pte ax = fe)


where —e < x, < ¢. Letting ¢ approach zero, we get in the limit

(40) J”flayb(a)dx = f(0)


Translating through & we have similarly

(40’) f "5e — Bfle) dx = f®, EE(a,b)


In particular, setting f(x) equal to one, we get

(41) ff be — 9 ax = (, if £ € (a, b)
if£ ¢ [a, 4]

Finally, the Green’s function of a differential operator L and given homoge-


neous linear initial or boundary conditions satisfies the symbolic equation

(42) L,G(x, &) = (x —

and the same initial or boundary conditions (in x). Now consider the function

(43) ua) = f Ges,Br® at


Extending heuristically the Superposition Principle to integrals (considered as
limits of sums), we are led to the good guess that u(x) satisfies the same initial
(resp. boundary) conditions and also

(44) L{u) =1,|f Ge,Hr a| = f d(x—Er(€)d&= r(x)


11 Green’s Functions, II - 69

EXERCISES F

In Exs. 1-3, (a) construct Green’s functions for the two-endpoint problem defined by the
DE specified and the boundary conditions u(0) = u(1) = 0, and (b) solve for 7(x) = x7:
1. u” — u = r(x) 2. u” + 4u = r(x)

io
3. u” — u = r(x)
x
— 1 2 1

In Exs. 4-5, find the conjugate points nearest to x = 0 for the DE specified.
4, u” + 2u’
+ 10u = 0

5 (x? — x + 1)u” + (1 — 2x)u’ + 2u = 0 [HInT: Look for polynomial solutions.]


*6. Uy — uy, + eu = 0
7 (a) Show that, for two-endpoint problems containing no pairs of conjugate points,
Green’s function is always negative.
(b) Show that, if g(x) < 0, then the Green’s function for u” + q(x)u = 0 in the two-
endpoint problem is always negative and convex (concave downward), with neg-
ative slope where x < £ and positive slope where x > &.
*8 Set F(x, y, y’) = (1 — y)? in (34), and find the curves joining (0, 0) and (1, §) which
minimize I( f).

Show that the Euler-Lagrange DE for F(x, y, y) = gp(x)y + 4 Ty” (g, T constants) is
Ty” = gp(x). Relate to the sag of a loaded string under tension T.

ADDITIONAL EXERCISES

1. Show that the ratio v = f{/g of any two linearly independent solutions of the DE u”
+ q(x)u = 0 is a solution of the third-order nonlinear DE

a vi
Vv 3
(*) Sfv] = —

/
2 (
—_—

/ = 2q(x)

The Schwarzian S[v] of a function v(x) being defined by the middle term of (*), show
that S[(av + b)/(cu + d)] = S[0°] for any four constants, a, b, c, d with ad # be.

*3 Prove that, if vo, v,, vo, v3 are any four distinct solutions of the Riccati DE, their
cross ratio is constant: (v9 — v)(vs — v2)/(v9 — v9)(v3 — v1) = ¢.

Find the general solutions of the following inhomogeneous Euler DEs:


(a) xy” — Dey + Qy = x2 + pe tq (b) x°y” + 3xy’ + y = R(x)
(a) Show that, if fand g satisfy u” + q(x)u = 0, the product /g = y satisfies the DE
Qyy” = (y')? — 4y%e(x) + ¢ for some constant c.

(b) As an application, solve 2yy” = (y’/)? — (x + 1)7?y?


Show that, if u is the general solution of the DE (1) of the text, and W, = pp} —
pip, then v = w’ is the general solution of

Popov” + (Pipe — Wos)u’ + (£3 — Wi)v = Wy.


(a) Show that the Riccati equation y’ = 1 + x? + y* has no solution on the interval
(0, ).
(b) Show that the Riccati equation y’ = 1 + x — x* has a solution on the interval
(-—o, +0),
70 CHAPTER 2 Second-Order Linear Equations

Let @,(x) = — x (arctan kx


=
=
k/e(1 + k?x*). Show that, if f(x) is any continuous

function bounded on (—, 00), then litte [0 O(% — Of(x) dx = flo.


For the DE u” + (B/x’)u 0, show that every solution has infinitely many zeros
=
=

on (1, +0) if B > }and a finite number if B < 4. (Hint: The DE is an Euler DE.]
10 For the DE u” + q(x)u = 0, show that every solution has a finite number of zeros
on (1, +00) if g(x) < 4x”, and infinitely many if g(x) > B/x*, B > 4.
*11. For the DE u” + g(x)u =
=
0, show that every solution has infinitely many zeros on
(1, +00) if

[P| -L]ae= 4
*12, Show that, if p, ¢ € @?, we can transform the DE (8) to the form d*z/d# = 0
in some neighborhood of the y-axis by transformations of the form z = f(x)y and
d& = h(x) dx. [HINT: Transforma basis of solutions to y, = 1, yo = &.]
CHAPTER 3

LINEAR EQUATIONS
WITH CONSTANT
COEFFICIENTS

1 THE CHARACTERISTIC POLYNOMIAL

So far, we have discussed only first- and second-order DEs, primarily because
so few DEs of higher order can be solved explicitly in terms of familiar func-
tions. However, general algebraic techniques make it possible to solve constant-
coefficient linear DEs of arbitrary order and to predict many properties of their
solutions, including especially their stability or instability.
This chapter will be devoted to explaining and exploiting these techniques.
In particular, it will exploit complex algebra and the properties of the complex
exponential function, which will be reviewed in this section. It will also apply
polynomial algebra to linear differential operators with constant coefficients,
using principles to be explained in §2.
The nth order linear DE with constant coefficients is

(1) L{u) = u™ + ayu®—) + ague-? + +++ + a,u = r(x)

Here u stands for the kth derivative d*u/dx* of the unknown function u(x);
a), , 4, are arbitrary constants; and r(x) can be any continuous function. As
in Ch. 2, §1, the letter L in (1) stands for a (homogeneous) linear operator. That
is, Law + 6v] = aL[u] + BL[v] for any functions u and vu of class @” and any
constant a and 6.
As in the second-order case treated in Chapter 2, the solution of linear DEs
of the form (1) is best achieved by expressing its general solution as the sum u
Up, + u, of some particular solution u,(x) of (2), and the general solution u,(x)
=
=

of the “reduced” (homogeneous) equation

(2) Llu} = u™ + ayu®-) + ague-? +--+ + a,u = 0

obtained by setting the right-hand side of (1) equal to 0.


dx ’
Solutions of (2) can be found by trying the exponential substitution u = é

where ) is a real or complex number to be determined. Since d*(e)/dx" =


Xe, this substitution reduces (2) to the identity

Lle™] = (N" + av") + agn"? + ++ + a,)e* =


=

71
72 CHAPTER 3 Linear Equations with Constant Coefficients

This is satisfied if and only if A is a (real or complex) root of the characteristic


polynomial of the DE (1), defined as

(3) pQ) pi) =r" + aA + + a4,-)A + 4

For the second-order DE u” + pu’ + qu = 0, the roots of the characteristic


polynomial are \ = 3(—p + ViVp* —%4q). In Ch. 2, §2, it was shown by a special
method that, when p* < 4q so that the characteristic polynomial has complex

roots X ~p/2 wv =
=
V4q — p*), the real functions e~**/? px form

abasisoofsolutions.Wewillnowshowhowtoapplytheexponentialsubstitution
u =e to solve the general DE (2), beginning with the second-order case.
Loosely speaking, when the characteristic polynomial p() has n distinct roots
AL »Ay the functions ¢,(x)
= e* form a basis of complex solutions of the
DE (2) ‘By this we mean that for any “initial”
x =
=
%q and specified (complex)
numbers up, U4, uf, mere exist unique numbers ¢,, , ¢, such that
the solution f(x)
(a-1)
=
=

ule) = Ljn1 6G,(x) satisfies f(xo) = uo, f"(xo) = UG, >

fO-M%)
Moreover, the complex roots A, of p(A) occur in pairs y, +3 vy just as in the
second-order case treated in Ch."9. Therefore, the real functions e* cos Vx,
e“* sin v,x together with the é* corresponding to real roots of(A) = 0, form a
basis of‘real solutions of (2).
Initial Value Problem. By the “‘initial value problem” for the nth order DE
(n— ) a solution
(1) is meant finding, for specified xp and numbers wp, uj, » Uo
u(x) of (1) that satisfies u(x9) up, and u(xo) = uf? for j = 1, n-1
If a basis of solutions ¢,(x) of the “‘reduced”’ DE (2) is known, together with
one “particular” solution u,(x) of the inhomogeneous DE (1), then the sum
u(x) = u,(x) + Le,b{x), with the c; chosen to make u,(x) = Lc,@,(x) satisfy
U,(%p) = Uy — uupl%)“uklxo) = ug ‘Uhl (n—1)
» U;, (%o) = uP ~ ue-PGxo),
constitutes one solution of the stated initial value problem. In §4, wewill prove
that this is the only solution (a uniqueness theorem), so that the stated initial
value problem is ‘“‘well-posed.”’

2 COMPLEX EXPONENTIAL FUNCTIONS

When the characteristic polynomial of u” + pu’ + qu = 0 has complex roots


\ = —p/2 + w, as before, the exponential substitution gives two complex expo-
nential functions as formal solutions, namely

bH/2tiE = 9 Pl? {cos yx +


—_ i sin px}

Fromthesecomplexsolutions, therealsolutionse*/” | | vxobtainedbya


special method in Ch. 2 can easily be constructed as linear combinations. The
2 Complex Exponential Functions 73

present section will be devoted to explaining how similar constructions can be


applied to arbitrary homogeneous linear constant-coefficient DEs (2).
The first consideration that must be invoked is the so-called Fundamental
Theorem of Algebra.t To apply this theorem effectively, one must also be famil-
iar with the basic properties of the complex exponential function. We shall now
take these up in turn.
The Fundamental Theorem of Algebra states that any real or complex poly-
nomial p(A) can be uniquely factored into a product of powers of distinct linear
factors (\ — A,):

(4) PO) = (A — AMA — Ao)” == Aayi

Clearly, the roots of the equation p(A) = 0 are the A,. The exponent &, in (4) is
called the multiplicity of the root \,; evidently the sum of the &, is the degree of
p. When all d, are distinct (ie., all k, = 1 so that m =
=
n), the DE has a basis of
complex exponential solutions ¢(x) = eM 7 =
=
1,2 , n; see §4 for details.

Example 1.For the fourth-order DE u” u, the characteristic polynomial


=
=

isA* = 1, with roots +1, +i. Therefore, a basis of complex solutions is provided
by e*, e~*, e™, and e”. From these we can construct a basis of four real solutions

e, e%, cosx = (e* + e*)/2i, sinx = (e® — e~*)/2i

Complex Exponentials. In this chapter and in Ch. 9, properties of the


complex exponential function e¢* will be used freely, and so we recall some of
them. The exponent z = x + ty is to be thought of as a point in the (x,y)-plane,
which is also referred to as the complex z-plane. The complex “value” w = ¢”
of the exponential function is evidently a vector in the complex w-plane with
magnitude |e’| = e¢*, which makes an angle y with the u-axis. (Here w ut
=
=

2v, so that u e* cos


y andv = e*sin y if w = e*.)
=
=

20
Because e° cos @ + isin 6, one also often writes z x + ty as
z
= =
= =
=r,

where r = Vx? + x and @ = arctan(y/x) are polar coordinates in the (x,y)-plane.


In this notation, the inverse of the complex exponential function ¢ is the com-
plex ‘‘natural logarithm’? function

+ 0
+ 4) =Inr
Inz = In(x

Since 6 is defined only modulo 27, In z is evidently a multiple-valued function.


In the problems treated in this chapter, the coefficients a,J of the polynomial
(2) will usually all be real. Its roots A, will then all be either real or complex
conjugate in pairs, \ = « + iv. Thus, for the second-order DE u” + pu’ + qu
= 0 discussed in Chapter 2, the roots are \ = 3(—p + ‘p* — 4q). They are
real when p? > 4q, and complex conjugate when p? < 4q.

+ Birkhoff and MacLane, p. 113.


74 CHAPTER 3 Linear Equations with Constant Coefficients

In this chapter, the independent variable x will also be considered as real.


Now recall that if A = y« + iy, where u,v are real, then we have for real x

e* = gixtwx
(5)
=
=

e**(cos vx + 7 sin vx)

Hence, if\ = w + wand A* = w — ware both roots of p,(A) = 0 in (3), the


functions e““(cos yx + i sin yx) are both solutions of (2). Since |¢’*| = 1 for all
real p,x, it also follows that, where e* is considered as a complex-valued function
of the real independent variable x

5’) lem] =e, arg{e“} = vx, =ut+wp

Example 1’. For the DE u” + 4u = 0, the characteristic polynomial A* + 4


has the roots +1 + i. Hence it has a basis of real solutions e* cos x, e* sin x,
e”*cos x, e~*sin x. (An equivalent basis is provided by the functions cosh x cos x,
cosh x sin x, sinh x cos x, sinh x sin x.)

Euler’s Homogeneous DE. The homogeneous linear DE

(6) xu+ bx” ue) + box Pu) +--+ + du = 0

is called Euler’s homogeneous differential equation. It can be reduced to the


form (2) on the positive semi-axis x > 0, by making the substitutions

d
t =Inx, _ =
=

> eM = xh
d d.

Corresponding to the real solutions é, te, ... of (2), we have real solutions
x, x In x, of (6).
Moreover, these can easily be found by substituting x* for u in (6). This sub-
stitution yields an equation of the form J(\)x* = 0, where J() is a polynomial
of degree n, called the indicial equation. Any A for which [(A) = 0 gives a so-
lution x* of (6); if X is a double root, then x* and x* In x are both solutions, and
so on.

For example, when n = 2, Euler’s homogeneous DE is

(7) x?u” + pxu’ + qu = 0, p,q real constants

Trying u
=
=

x, we get the indicial equation of (8):

Cs) AA—
1) +pr(+q=0

Alternatively, making the change of variable x = e', we get

du
(8)

dt?
+ (p- 1)
a 7 + mu =0, t=Inx
2 Complex Exponential Functions 75

since


d d oa
de dx (*dx dx? + x

If (p — 1? > 4q, the indicial equation has two distinct real roots \ = a and
\ = 8, and so the DE (7) has the two linearly independent real solutions x* and
x®, defined for positive x. For positive or negative x, we have the solutions |x |*
and |x|® since the substitution of —x for x does not affect (7). Note that the
DE (7) hasa singular point at x = 0 and that |x| has discontinuous slope there
ifa 31.
When the discriminant (p — 1)? — 4g is negative, the indicial equation has
two conjugate complex roots \ = » + w, where w = (1 — p)/2 andy = [4q —
(p — 1)*)'/7/2. A basis of real solutions of (8) is then e“ cos vt and e“ sin vt; the
corresponding solutions of the second order Euler homogeneous DE (7) are
x" cos(v In x) and x" sin(v In x). These are, for x > 0, the real and imaginary
parts of the complex power function

A oe hth (usw) In x
x =e = x*[cos@ In x) + i sin( In x)]

as in (5). For x < 0, we can get real solutions by using |x| in place of x. But for
x < 0, the resulting real solutions of (7) are no longer the real and imaginary
parts of x, because In(—x) = In x + ix; cf. Ch. 9, §1.

General Case. The general nth-order case can be treated in the same way.
We can again make the change of independent variable

t
x =e, t = Inx, x=

dx di

This reduces (6) to a DE of the form (2), whose solutions ve give a basis of
solutions for (6) of the form (In x)’x’.

EXERCISES A

In Exs. 1—4, find a basis of real solutions of the DE specified.

1. u” + 5u’ + 4u = 0 2,.u” =u

3. ua’ =u *4.u"+u=0

In Exs. 5—6, find a basis of complex exponential solutions of the DE specified.

5. u” + Qiu’ + 3u = 0 6. u” — 2u’ + 2u = 0

In Exs. 7-10, find the solution of the initial value problem specified.

7 u” + 5u’ + 4u = 0 u(0) = 1, w(0) = 0


a
8 uU = u,
u(0) = u”(0) = 0, w(0) =1
ID
*9 uU =u,
u(0) = u”(0) = 0, w(0) = u”(0) = 1

*10. u” — 2u’ + Qu = 0, u(0) = 1, u’(0) = 0


76 CHAPTER 3 Linear Equations with Constant Coefficients

In Exs. 11 and 12, find a basis of solutions of the Euler DE

11 xu”
+ 5xu’ + 3u
=0 12. x*u” + 2ixu — 3u = 0

13 Describe the behavior of the function z' of the complex variable z = x + iy as z


traces the unit circle z = e” around the origin

14 Do the same as in Ex. 13 for the function z’e

3 THE OPERATIONAL CALCULUS

We have already explained the general notion of a linear operator in Ch. 2


§2. Obviously, y linear combination M = c,L, + ¢oL of linear operators L, and
L,, defined by the formula M[u] = ¢,L,[u] + ¢ole[ul], is itself a linear operator,
in the sense that M[au + bv]
=
=
aM[u] + bM{v] for all u,v to which L, and Ly
are applicable. Moreover the same is true of the (left-) composite LoL, of L, and
Ly, defined by the formula L,[L,[u]]
For linear operators with constant coefficients, one can say much more. In
the first place, they are permutable, in the sense of the following lemma

LEMMA. Linear operators with constant coefficients are permutable: for any con-
stants ayby if p(D) Xa;D) and q(D) <b,D*, then p(D)q(D) q(D)p(D)
Labpit
Proof. Iterate the formula D[b,D[u]] = b,D°[u]. It follows that, for anytwo
constants a; and 5, and any two positive integers j and k, we have a,D/b,D* =

a,b,DI**= b, D‘a, Di,


This is not true of linear differential operators with variable coefficients
Thus, since

Dxf (xfy’ xf! + f (xD + Df, for fe @ [a,b]

we have Dx xD + I. This shows that the differentiation operator D is not


=
=

permutable with the operator “multiply by x.” Likewise (x2D)(xD) = x3D? +


2D, whereas (xD)(x2D) 3p? + 9x?D

Because constant-coefficient linear differential operators are permutable, we


can fruitfully apply polynomial algebra to them. As an immediate application, we
have

THEOREM 1. If isa ro of multiplicity k ofthe characteristicpolynomial (3),


then the functions x’e™ (r = » k —1) are solutions of the linear DE (2).

Proof. An elementary calculation gives, after cancelation, (D — Nie “f(x)]


= e (x) for any, differentiable function f(x). By induction, this implies
(D — d)'[fxe] = f(x) for any f€ @ *. Since the kth derivative of x” is zero
when k > 7, it follows that

(D — d)*[x’e*] > ifk>r


3 The Operational Calculus 77

Moreover, the operators (D — },)" being permutable, we can write, for any i

L{u)] = q(D)(D ~ , Ms, where q(D) = [] W- a)


yRe

Hence L{x"e*} = Q for each A, andr = 0,1,..., 4, — 1, as stated.

Real and Complex Solutions. Theorem 1 holds whether the coefficients a,


of the DE (2) are real or complex. Indeed, although the independent variable x
will be interpreted as real in this chapter (especially in discussing stability), the
operational calculus just discussed above, and the solutions constructed with it,
are equally applicable to functions of the complex variable z x + i,
=
=

However, when all the coefficients a, are real numbers, more detailed infor-
mation can be obtained about the solutions, as follows.

LEMMA. Let the complex-valued function w(x) = u(x) + iv(x) satisfy a homo-
geneous linear DE (1) with real coefficients. Then the functions u(x) and v(x) [the real
and imaginary parts of w(x) both satisfy the DE.

Proof. The complex conjugatet w*(x) = u(x) — iv(x) of w(x) satisfies the
complex conjugate of the given DE (2), obtained by replacing every coefficient
a, by its complex conjugate af, because L*[w*] = {L[w]}* = 0. If the a, are real,
then a¥ = a, and so w*(x) also satisfies (2). Hence, the linear combinations

[w(x) + w*(x)] [w(x) — w*(x)]


u(x) = d o(x) =
2 2i

also satisfy (2), as stated.


This result is also valid for DEs with variable coefficients a,(x).

COROLLARY 1. If the DE (2) has real coefficients and ¢ satisfies (2), then so
does é”*, The nonreal roots of the characteristic polynomial (3) thus occur in conjugate
pairs >, = pw, £ iv, having the same multiplicity k,.

Now, using formula (5), we obtain the following.

COROLLARY 2. Each pair of complex conjugate roots ,,* of (3) of multiplicity


k, gives veal solutions of (2) of the form

(9) x'eF* cos v;x, xf sin v,x, r=0,. ,k,-


1

These solutions differ from the solutions ¢* with real) in that they have infi-
nitely many zeros in any infinite interval of the real axis: that is, they are oscil-
latory. This proves the following result.

+ The complex conjugate w* of a complex number w = u + iv is u — iv. Some authors use w instead
of w* to denote the complex conjugate of w.
78 CHAPTER 3 Linear Equations with Constant Coefficients

THEOREM 2. [If the characteristic polynomial (3) with real coefficients has 2r non-
real roots, then the DE (2) has 2r distinct oscillatory real solutions of the form (9).

4 SOLUTION BASES

We now show that all solutions of the real homogeneous linear DE (2) are
linear combinations of the special solutions described in Corollary 2 above. The
proof will appeal to the concept of a basis of solutions of a general nth order
linear homogeneous DE

(10) L{u) = u™ + pi(xyu®—? +-++ + p,(x)u = 0

The coefficient-functions p,(x) in (10) may be variable, but they must be real and
continuous.

DEFINITION. A basis of solutions of the DE (10) is a set of solutions u,(x)


of (10) such that every solution of (10) can be uniquely expressed as a linear
combination ¢,u,(x) +: + + + ¢,u,(x).

The aim of this section is to prove that the special solutions described in Cor-
ollary 2 form a basis of real solutions of the DE (2). The fact that every nth order
homogeneous linear DE has a basis of n solutions is, of course, a theorem to be
proved.
First, as in Ch. 2, §2, we define a set of n real or complex functions f,,
Se . »f, defined on an interval (a,b) to be linearly independent when no linear
combination of the functions with constant coefficients not all zero can vanish
identically: that is, when Lf.) ¢,f,(x) = 0 implies c, = cg = +: - =
=
= 0. A
C,
set of functions that is not linearly independent is said to be linearly dependent.
There are two notions of linear independence, according as we allow the coef-
ficients ¢, to assume only real values, or also complex values. In the first case, one
says that the functions are linearly independent over the real field; in the second
case, that they are linearly independent over the complex field.

LEMMA I. A set of real-valued functions on an interval (a,b) is linearly indepen-


dent over the complex field if and only if it is linearly independent over the real field.

Proof. Linear dependence over the real field implies linear dependence over
the complex field, a fortiori. Conversely, the f(x) being real, suppose that Xc,/, (x)
= 0 fora < x < 6, Then [Xe,f-(x)]* = 0, and hence Xc*f(x) = 0. Subtracting,
we obtain X[(c, — c¥)/t] f(x) = 0. If all c, are real, there is nothing to prove. If
some c,J is not real, some real number (c, — c¥)/i will not vanish, and we still have
a vanishing linear combination with real coefficients.
A set of functions that is linearly dependent on a given domain may become
linearly independent when the functions are extended to a larger domain. How-
ever, a linearly independent set of functions clearly remains linearly indepen-
dent when the functions are so extended.
4 Solution Bases 79

LEMMA 2. Any set offunctions of the form

(11) f(x) = xe, jzl,...,n

where the r are nonnegative integers and the d, complex numbers, is linearly indepen-
dent on any nonvoid open interval, unless two or more of the functions are identical.

Proof. Suppose that Xc, f,(x) = 0. For any given A, choose R to be the larg-
est r such that c, # 0. Form the operator

gD) = (D-r)® TT Way?


1%]

where for each i # j, k, is the largest r such that x’e** is a member of the
set of functions in (11). It follows that g(D)[f,] = 0 unless 7 = j, and that
q(D)[f,] = 0 for r < R. Hence, we have

QD)[Xepfrl*)) = cpg(D)ix"e*)}
On the other hand, as in the proof of Theorem 1, we see that

g(Dytx"e) = (RY TT @, — Ayet


le # 0
amy

Therefore, substituting back, we find that cz, = 0. Since we assumed that cp, #
0, this gives a contradiction unless all c, = 0, proving linear independence.

From Theorem 1 we obtain the following corollary.

COROLLARY |. The DE (2) has at least n linearly independent, real or complex


solutions of theform xe.

The analogous result for real solutions of DE of the form (2) with real coef-
ficients can be proved as follows. For any two conjugate complex roots A = 4
+ w and A* # — w of the characteristic equation of (2), the real solutions
=
=

x’e cos vx and x’e“ sin yx are complex linear combinations of x’e and xrox
and conversely. Hence, they can be substituted for x’e* and xrone in any set of
solutions without affecting their linear independence. Since linear indepen-
dence over the complex field implies linear independence over the real field,
this proves the following.

COROLLARY 2. A linear DE (2) with constant real coefficients a, has a set of n


solutions of the form x'e™ or (9), which is linearly independent over the real field in
any nonvoid interval.

We now show that all solutions of the real homogeneous linear DE (2) are
linear combinations of the special solutions described in Corollary 2. (The proof
80 CHAPTER 3 Linear Equations with Constant Coefficients

will be extended to the case of complex coefficient-functions in Ch. 6, §11.) To


this end, we first prove a special uniqueness lemma for the more general homo-
geneous linear DE (10),

L{u] = u® + pi(xju"—? +--+ + p,(x)u = 0

with real and continuous coefficient-functions p,(x).

LEMMA 38. Let f(x) be any real or complex solution of the nth order homogeneous
linear DE (1) with continuous real coefficient-functions in the closed interval [a,b]. If
f@ =f'@=-+ + =f" @ = 0,then f(x) = 0 on [4,0].

Proof. We first suppose f(x) real. The function

a(x) = fix)? + f(x)? + +o bf" xy? = 0

satisfies the initial condition o(a) = 0. Differentiating o(x), we find, since o(x) is
real, that

(x) = QL flO’) + fF"! Fo +f PFO)!

Using the inequality |2a6| < a? + 6? repeatedly n — 1 times, we have

o(x) < (f? +f” + (f? +f") tee et fe??? + Lf") + af e-Dee

Since L[f] = 0, it follows that f™ = —LXf., pf”. Hence, the last term can
be rewritten in the form

frrgo = Daas ongens

Applying the inequality |206| < a? + 6? again, we obtain

2ife-Pf| < > Imafe? + LF)


Substituting and rearranging terms, we obtain

o' (x) S (1 + Ipal


f? + (2 + Ipe-alf? + 2 + IPs lf”?
fee et 2 + IpoDLfe>? + (1 + lpi| + > i) for

Now let K = 2 + max|p,(x)| + max,<,<, D:=110,(x)|. Then it follows from the


last inequality that o’(x) < Ko(x). From this inequality and the initial condition
4 Solution Bases 81

a(a) = 0, the identity o(x) = 0 follows by Lemma 2 of Ch. 1, §11. Hence, we


have f(x) = 0.
If h(x) = f(x) + ig(x) (fg real) is a complex solution of (10), then f(x) and g(x)
satisfy (10) by Lemma 2 of §3. Moreover, h(a) = h’(a) = - + + = A" Ya) = 0
implies the corresponding equalities on f and g. Hence, by the preceding para-
graph, we have h = f + ig =0 + 0 = 0, completing the proof.

We now show that any n linearly independent solutions of (10) form a basis
of solutions.

THEOREM 3. Let u;, » Uy, be n linearly independent real solutions of the nth
order linear homogeneous DE (10) with real coefficient-functions. Then, given arbitrary
real numbers a, Ug, Up, ., ug), there exist unique constants c,, . » Cy Such that
u(x) = Xc,u,(x) is a solution of (10) satisfying

; u® Yea) = ue)
(12’) Uos u’(a) = ud,
=

u(a) =

The functions u,(x) are a basis of solutions of (10).

Theorem 3 follows readily from the lemma. Suppose that, for some a, uo, u4,
, uf’), there were no linear combination Xc,u,(x) satisfying the given initial
conditions (12’). That is, suppose the 7 vectors

(n—
[u,(a), uj(a), » Uj %@), k=1,...,n

were linearly dependent. Then there would exist constants y,, ȴw not all zero,
such that

> Yiu,(a) = 0, > yula) = 0,..., >» yu"Y(a) = 0


k 1

That is, the function @(x) = y,u,(x) + - + + + Y,Un(x) would satisfy

(a) = o'(a) | oe” Ya) =0

From this it would follow, from the lemma, that ¢(x) = 0.


Recapitulating, we can find either ¢,, . , €, not all zero such that

u(x) = cyuj(x) + ++ + + 6,U,(x)

satisfies (12’), or ;, >» Yn not all zero such that

Hx) = Yim (x) +--+ + Ynttalx) = 0


82 CHAPTER 3 Linear Equations with Constant Coefficients

The second alternative contradicts the hypothesis of linear independence in


Theorem 3, which proves the first conclusion there.
To prove the second conclusion, let u(x) be any solution of (10). By the first
conclusion, constants ¢,, . , €, can be found such that

u(x) = cyuy(x) + + + + + Cyt,(x)

,u®(a) = va). Hence the difference


satisfies u(a) = v(a), u’(a) = v’(a), .
u(x) — v(x) satisfies the hypotheses of Lemma 3. Using the lemma, we
=

fix) =

obtain u(x) = v(x) and v(x) = Xe,u,(x), proving the second conclusion of Theo-
rem 3.

COROLLARY 1. Let, + > Aw be the roots of the characteristic polynomial of the


realt DE (2) with multiplicities k,, > Ry Then the functions x'e™, r
=
=

0, s k
J
— 1, are a basis of complex solutions of (2).

Referring back to Theorem 2, we have also the following.

COROLLARY 2. If the coefficients of the DE (2) are real, then it has a basis of real
solutions of the form xe, xe cos vx, and x’e™ sin vx, where d, p, and v are real
constants.

EXERCISES B
1. Solve the following initial-value problems:
(a) uw” —u =
=

0, u(0) = w’(0) = u”(0) = 0, u”(0) = 1


(b) ue u’ (0)
=
=

0 u/(0) = u”(0) = 0, u(0) = 1, =


=

—2
(c) u” + u” =
=

0, u”(0) = u”(0) = 0, u(0) = “(0) =]

(a) Find a DE L[u] = 0 of the form (2) having e~4, te“, and ¢ as a basis of solutions.
(b) For this linear operator L, find a basis of solutions of the sixth-order DE
L?[u] = 0 and the ninth-order DE L?[u] = 0.
Find bases of solutions for the following DEs:
(a) u" = u (b) u” — 3u” + 2u = 0
(c) u” + 6u” + 12u’ + 8u = 0
(d) u” + 6u” + 12u’
+ (8 + du = 0

Knowing bases of solutions L,[u] = 0 and L.[u] = 0 of the form given by Theorem
1, find a basis of solutions of L,[L,[u]] = 0.

Show that in every veal DE of the form (2), L can be factored as L = Lily... L,
where L, = D, + 6, or L, = D? + pD + q with all b, p, q, real.
Extend Lemma 2 of §4 to the case where the r are arbitrary complex numbers.
“7 State an analog of Corollary 2 of §4 for Euler’s homogeneous DE, and prove your
statement without assuming Corollary 2.

Prove that the DE of Ex. A5 has no nontrivial real solution.

+ The preceding result can be proved more generally for linear DEs with constant complex coeffi-
cients, by similar methods; see Ch. 6, §11.
5 Inhomogeneous Equations 83

5 INHOMOGENEOUS EQUATIONS

We now return to the nth order inhomogeneous linear DE with constant


coefficients,

u du
(13) L{u] = + a, —__—

df" 1
+--+ + a4,-; au7 T Gntt =
=

r(t)
dt d

already introduced in §1. As in the second-order case of Ch. 2, §8, the function
r(f) in (13) may be thought of as an “input” or “source’’ term, and u(é) as the
“output’’ due to r(é). We first describe a simple method for finding a particular
solution of the DE (13) in closed form, in the special case that r(#) = Lp,(terk!
is a linear combination of products of polynomials and exponentials.
We recall that, by Lemma 2 of §3, &

(D — »NeMAH) = Mf’
As a corollary, since every polynomial of degree s is the derivative r(t) = q’(i) of
a suitable polynomial q(é) of degree s + 1, we obtain the following result.

LEMMA 1. If r(é) is a polynomial of degree s, then (D — du] = eri)


has a solution of theform u = eMa(t), where q(t) is a polynomial of degree s + 1.

More generally, one easily verifies the identity

(D — A)le“AH) = MA — AYO +f’)


IfA #A,, and f() is a polynomial of degree s, then the right side of the preceding
identity is a polynomial of degree s times e™. This proves another useful alge-
braic fact:

LEMMA 2. If r(é) is a polynomial of degree s and X # d,, then

(D — A)[u] = er)
has a solution of theform u = eq(t), where q(t) is a polynomial of degree s.

Applying the two preceding lemmas repeatedly to the factors of the operator

-+(D—),)*
L = p,(D) = (D — d)"(D — d»)”

we get the following result.

THEOREM 4. The DE L[u] = e“r(t), here r(t) is a polynomial, has a particular


solution of theform &“q(t),where q(t) is also a polynomial. The degree of q(t) equals that
84 CHAPTER 3 Linear Equations with Constant Coefficients

of r(t) unless X = 3, is a root of the characteristic polynomial p,(A) = L(A — Js of L.


If =, is a k-fold root of p,(r), then the degree of q(t) exceeds that of r(t) by k.

Knowing the form of the answer, we can solve for the coefficients 5, of the
unknown polynomial q() = £8,t" by the method of undetermined coefficients.
Namely, applying p(D) to u(é) = e™(r,t'), one can compute the numbers P,, in
the formula

p(D)[u] = MI(Pybyt"
using formulas for differentiating elementary functions. One does not need to
factor p,. The simultaneous linear equations “P,,b, = ¢, can then be solved for
the b,, given r(t) = ct", by elementary algebra. Theorem 4 simply states how
many unknowns 5, must be used to get a compatible system of linear equations.

Example 2. Find the solution of the DE


L{u) = u” + 3u” + 2u’ = 12%

that satisfies the initial conditions u(0) = —17/3, w’(0) = u”(0) = 1/3. First,
since the two-dimensional subspace of functions of the form (a + Bie’ is
mapped into itself by differentiation, the constant-coefficient DE (*) may be
expected to have a particular solution of this form. And indeed, substituting u
= (a + Bie’ into (*) and evaluating, we get

(*) L{u]) = (6a + 118) + 6Bije = 12%

Comparing coefficients, we find a particular solution u


=
=
(-11/3 + 2c
of (*).
Second, the reduced DE u” + 3u” + 2u’ = 0 of (*) has 1, e~4 and e” asa
basis of solutions. The general solution of (*) is therefore

usat be! +ce™ + (—11/3 + Qe

The initial conditions yield three simultaneous linear equations in a, b, ¢ whose


solution is a = 1,6 = —4,¢

=
1. Hence the solution of the specified initial
value problem is

(**) u=1—4e'+e*% + (~11/3 + ie

EXERCISES C

In Exs. 1—4, find a particular solution of the DE specified. In Exs. 1—3, find the solutions
satisfying (a) u(0) = 0, u’(0) = 1 and (b) u(0) = 0, w’(0) = 1.

1. u” = te 2.u”+u= te
3. u” — u = te’ 4. uu” = fd
Stability 85

In each of Exs. 5-8, find a particular solution of the DE specified.


5. u! + 4u = sint 6. u” + 2u” + 3u’ + bu = cost
7, uu? + Bu” + 4u =e 8. u” + iu = sin 21, i=v-l1

In each of Exs. 9-12, find (a) the general solution of the DE specified four exercises
earlier, and (b) the particular solution satisfying the initial condition specified.
9 u(0) = 0 fory = 0, 1,2, 3,4 u(0) = 1
10. u(0) = u’(0), u”(0) = 1

11. u(0) = 10, u’(0) = u”(0) = u”(0) = 0

12. u(0) = 0, u(0) =i

6 STABILITY

An important physical concept is that of the stability of equilibrium. An equi-


librium state of a physical system is said to be stable when small departures from
equilibrium remain small with the lapse of time, and unstable when arbitrarily
small initial deviations from equilibrium can ultimately become quite large.
In considering the stability of equilibrium, it is suggestive to think of the inde-
pendent variable as standing for the time ¢. Accordingly, one rewrites the DE
(2) as

dy ad”*u
(14)
+

ap
T ag7yn?
—_——-

Foss tay + au = 0
dt"

For such constant-coefficient homogeneous linear DEs, the trivial solution u = 0


represents an equilibrium state, and the possibilities for stable and unstable
behavior are relatively few. They are adequately described by the following def-
inition (cf. Ch. 5, §7 for the nonlinear case).

DEFINITION. The homogeneous linear DE (14) is strictly stable when every


solution tends to zero as ¢ —> 0; it is stable when every solution remains bounded
as { > ©O; when not stable, it is called unstable.

Evidently, a homogeneous linear DE is strictly stable if and only if it has a


finite basis of solutions tending to zero, and stable if and only if it has a basis of
bounded solutions. The reason for this is that every finite linear combination of
bounded functions is bounded, as is easily shown. Hence Theorem 3, Corollary
6 gives algebraic tests for stability and strict stability of the DE (14). Take a basis
of solutions of the form ie, ée“ sin vt, te“ cos vt. Such a solution tends to zero
if and only if # < 0 and remains bounded as ¢ > © if and only if » < 0 or w =
r = 0. This gives the following result.

THEOREM 5. A given DE (14) is strictly stable if and only if every root of its
characteristic polynomial has a negative real part. It is stable if and only if every mul-
86 CHAPTER 3 Linear Equations with Constant Coefficients

tiple root dX, [with k, > 1 in (4)] has a negative real part, and no simple root (with
k, = 1) has a positive real part.

Polynomials all of whose roots have negative real parts are said to be of stable
type.t There are algebraic inequalities, called the Routh-Hurwitz conditions, on
the coefficients of a real polynomial, which are necessary and sufficient for it to
be of stable type. Thus, consider the quadratic characteristic polynomial of the
DE of the second-order DE (5) of Ch. 2, §1. An examination of the three cases
discussed in §2 above shows that the real DE

au du
u + au + agu = 0,
=

u =

a’ at

is strictly stable if and only if a, and a, are both positive (positive damping and
positive restoring force). That is, when n = 2, the Routh-Hurwitz conditions
are a, > 0 and a, > 0.
To make it easier to correlate the preceding results with the more informal
discussion of stability and oscillation found in Ch. 2, §2, we can rewrite the DE
discussed there as u + pu + qu = 0. We have just recalled that this DE is strictly
stable if and only if p > 0 and q > 0. It is oscillatory if and only if g > p’/4,
so that its characteristic polynomial A? + pdr + gq has complex roots
—(p £ Vp" — 49)/2.
In the case of a third-order DE (n 3), the test for strict stability is provided
=
=

by the inequalities a, > 0 G = 1, 2, 3) and a,a) > as. When n = 4, the condi-
tions for strict stability are a, > 0 =
1, 2, 3, 4), aja@q > ag, and a,a,a3 > ara,
+ aj.
When n > 2, there are no equally simple conditions for solutions to be oscil-
latory or nonoscillatory. Thus, the characteristic polynomial of the DE # + & +
ua + u = 0is A + 1)Q? + 1); hence its general solution is
t
acosit+ bsint + ce~

Unless a = 6 = 0, this solution will become oscillatory for large positive ¢, but
will be nonoscillatory for large negative ¢. Other illustrative examples are given
in Exercises C.

7 THE TRANSFER FUNCTION

Inhomogeneous linear DEs (13) are widely used to represent electric alter-
nating current networks or filiers. Such a filter may be thought of as a “black

+ See Birkhoff and MacLane, p. 122. For polynomials of stable type of higher degree, see F. R.
Gantmacher, Applications of Matrices, Wiley—Interscience, New York, 1959.
7 The Transfer Function 87

box’’ into which an electric current or a voltage is fed as an input r‘t) and out of
which comesa resulting output u(i).
Mathematically, this amounts to considering an operator transforming the
function r into a function u, which is the solution of the inhomogeneous linear
DE (13). Writing this operator as u = F[r], we easily see that L[F[r]] = r. Thus,
such an input-output operator is a right-inverse of the operator L.
Since there are many solutions of the inhomogeneous DE (13) for a given
input r(¢), the preceding definition of F is incomplete: the preceding equations
do not define F = L™' unambiguously.
For input-output problems that are unbounded in time, this difficulty can
often be resolved by insisting that F[r] be in the class B(—0o, +00) of bounded
functions; in §§7—8, we will make this restriction. For, in this case, for any two
solutions u, and wu, of the inhomogeneous DE L[u] = 1, the difference v = u,
— Uy would have to satisfy L{v] = 0. Unless the characteristic polynomial p, (A)
= 0 has pure imaginary roots, this implies v = 0. Hence, in particular, the DE
L{u] = rhas at most one bounded solution if the DE L{u] = 0 is strictly stable—
an assumption which corresponds in electrical engineering to a passive electrical
network with dissipation. Moreover, the effect of initial conditions is ‘‘tran-
sient’: it dies out exponentially.
For initial value problems and their Green’s functions, it is more appropriate
to define F by restricting its values to functions that satisfy u(0) = w’(0)
=
=

++ + = u@-(0) = 0; this also defines F unambiguously, by Theorem 3.


We now consider bounded solutions of (13) for various input functions, with-
out necessarily assuming that the homogeneous DE is strictly stable.
Sinusoidal input functions are of the greatest importance; they represent
alternating currents and simple musical notes of constant pitch. These are func-
tions of the form

A cos (kt + a) = Re {ce“}, A= Icl, x = argc

A is called the amplitude, k/2x the frequency, and a the phase constant. The fre-
quency k/2z is the reciprocal of the period 2x/k.
Except in the case p,(tk) = 0 of perfect resonance, there always exists a
unique periodic solution of the DE (13), having the same period as the given input
function r(¢) = ce. This output function u({) can be found by making the sub-
stitution u = C(k)ce™, where C(k) is to be determined. Substituting into the inho-
mogeneous DE (14), we see that L[C(k)ce™] = ce™ if and only if

(15) C(k) = 1/p.(k), pitik) # 0

where p,(A) is the characteristic polynomial defined by (3).

DEFINITION. The complex-valued function C(k) of the real variable k


defined by (15) is called the transfer function associated with the linear, time-
independent operator L. If C(k) = p(k)e™™, then p = | C(&)| is the gainfunction,
and y(k) = —arg C(k) = arg p,(ik) is the phase lag associated with k.
88 CHAPTER 3 Linear Equations with Constant Coefficients

The reason for this terminology lies in the relationship between the real part
of u(t) and that of the input r(é). Clearly,

Re {u(t)} = Re {C(k)ce™} = |C(k)| - [cl] cos (kt + a — y)

This shows that the amplitude of the output is p(k) times the amplitude of the
input, and the phase of the output lags ~y = —arg C behind that of the input at
all times.
In the strictly stable case, the particular solution of the inhomogeneous linear
DE L{u] = ce found by the preceding method is the only bounded solution;
hence F[{ce“] = C(k)ce™ describes the effect of the input~output operator F on
sinusoidal inputs. Furthermore, since every solution of the homogeneous DE
(14) tends to zero as t > +00, every solution of L[{u] = ce™ approaches C(k)ce™
exponentially.

Example 3. Consider the forced vibrations of a lightly damped harmonic


oscillator:

(*) [L[u]
=
=
u” + eu’ + pu = sin kt, e< 1]

The transfer function of (*) is easily found using the complex exponential trial
function e. Since

L{ce™] = (—k? + eik + p*)ce™

we have C(k) = 1/[(p? — k?) + eik]. De Moivre’s formulas give from this the
gain function p = 1/[(p? — k*)? + ¢k*]'” and the phase lag

ek
Y = arctan
——_—_——

(p? _— i)
|
The solution of (*) is therefore p sin (kt — ‘y), where p and y¥ are as stated.
One can also solve (*) in real form. Since differentiation carries functions of
the form u = a cos kt + b sin ki into functions of the same form, we look for a
periodic solution of (*) of this form. An elementary computation gives for u as
before:

L{u] = ((p? — ka + ekb] cos kt + [(p? — k?)b — eka] sin kt

To make the coefficient of cos kt in (*) vanish, it is necessary and sufficient that
a/b = ck/(k® — p*), the tangent of the phase advance (negative phase lag). The
gain can be computed similarly; we omit the tedious details.
Finally, note that the characteristic polynomial of any real DE (2) can be fac-
7 The Transfer Function 89

tored into real linear and quadratic factors

pi) TL a+ 4) [] a? + pa + 40, r+2s=n


jel t=]

and since all b, p, and q, are positive in the strictly stable case that all roots of
pi) = 0 have negative parts. Therefore

(k) = ~ arg(b;+ tk) + y arg(q,


+ ikp, ~ h°)
ym

increases monotonically from 0 to nw/2 as k increases from 0 to 0. This is evident


since each arg (b ik) increases from 0 to 7/2, while arg (q, ikp, — k?)
increases from 0 to =, as one easily sees by visualizing the relevant parametric
curves (straight line or parabola). Theorem 6 below will prove a corresponding
result for complex constant-coefficient DEs

Resonance. The preceding method fails when the characteristic polynomial


p.() has one or more purely imaginary roots A tk, (in electrical engineering,
this occurs in a “‘lossless passive network’’)
Thus, suppose that 2k is a root of the equation p,(A) = 0 and that we wish to
solve the inhomogeneous DE L[u] = e From the identity (cf. §6)

L{te“| = | 2 *| = = 16 —
= (pa = PLOe™ + piQyte™

=

we obtain, setting A = ik

L{te™] = pi(ik)e™

If zk is a simple root of the characteristic equation, then pj(ik) # 0. Hence a


solution of L[u] = e™ is u(t) = [1/pi(ék)lte™. The amplitude of this solution is
[1/|p7@k) |], and it increases to infinity as ¢ > 00. This is the phenomenon of
resonance, which arises when a nondissipative physical system © is excited by a
force whose period equals one of the periods of free vibration of £
A similar computation can be made whenik is a root of multiplicity n of the
characteristic polynomial, using the identity L[t"e™]
= pf (ik)e™, which is proved
in much the same way. In this case the amplitude of the solution again increases
to infinity.

Periodic Inputs. The transfer function gives a simple way for determining
periodic outputs from any periodic input function (2) in (13). Changing the time
90 CHAPTER 3 Linear Equations with Constant Coefficients

unit, we can write r(t + 27) =


=
r(t). We can then expand 7(é) in a Fourier series,
getting

(16) Llu) = 22 + 57 (ay cos kt +b, sin ki)


k=1

or, in complex form,

(16’) 2L[u} = y cet; Co = > CG, = c*, = a, — ib


k= —-00

summed over all integers k.


Applying the superposition principle to the Fourier components of c,e™ of
r(é) in (16’), we obtain, as at least a formal solution,

co

Je
|
ke
(17) ut) = >>
k= —00 pith)

provided that no p,(ik) vanishes. The series (17) is absolutely and uniformly con-
vergent, since p,(ik) = 0(k-") for an nth-order DE. We leave to the reader the
proof and the determination of sufficient conditions for term-by-term
differentiability.

EXERCISES D

In Exs. 1-4, test the DE specified for stability and strict stability.

1. u” + 5a’ + 4u = 0 2. u” + 6u” + 12u’ + 8 = 0


3. u” + 6u” + 1lu’ + 6u = 0 4. u” + 4u” + 4u” = 0

5. For which 7 is the DE u™ + u = O stable?


In Exs. 6-9, plot the gain and transfer functions of the operator specified (J denotes the
identity operator):

6. D?+4D+4+
41 7, D+ 6D?
+ 12D + 81
8. D? + 2D + 1017 9. Dt-—I
10. Fora strictly stable L[u] = u” + au’ + bu = r(i), calculate the outputs (the responses)
to the inputs 7() = 1 and r(#) = ¢ for a? > 46 and a? < 40,

*8 THE NYQUIST DIAGRAM

The transfer function C(k) = 1/p,(ék) of a linear differential equation with


constant coefficients L[u] = 0 is of great help in the study of the inhomoge-
neous DE (13). To visualize the transfer function, one graphs the logarithmic
gain In p(k) and phase lag y(k) as functions of the frequency k/2r. If Aj, >
8 The Nyquist Diagram 91

i, are the roots of the characteristic polynomial, we have

(18a) In p(k) = — yn | ~A,| = —§


LDIn [& — »)? + p?]
gm

yk) = Dare(ik~ 4) = Xarctan| ¢ v )


(18b)
H,
LE

|
from which these graphs are easilyplotted. Figure 3.1a depicts thegain function
and phase lag of the DE

u” + 0.8u” + 5.22u” + 1.424u’ + 4.1309u =


=

whose characteristic polynomial has the roots A; = —0.1 + i, — 0.3 + 2%.


We now compute how the phase lag y(k) changes as the frequency k/2a
increases from —0O to +00. By (18b), it suffices to add the changes in the func-
tions arg (tk — \,) for each X,. If Re {A,} is negative, then the vertical straight line
ik ~— h, (—90 <_k < ©) lies in the right half of the complex plane; hence, arg
(tk — A,) increases by x as & increases from —© to +0, If Re {A,} is positive,
then arg (#k — A,) decreases by = for a similar reason. Hence, if there are no
roots with zero real part, the change in y(f) is (m — p)x, where m is the number
of X, with negative real part, and p is the number with positive real part. If there
are no purely imaginary roots, then m + p = n, and we obtain a useful test for
strict stability.

THEOREM 6. The DEL [u] = r(t) of order n is strictly stable if and only if the
phase lag-y = —arg C(k) increases by nm as k increases from —©0 to + 00. In this
case, the phase lag increases monotonically with k.

if the differential operator L has real coefficients, then y(—k) = —y(&) and
+
p(—k) = p(k), since the complex roots A, occur in conjugate pairs y, ~_ wv, In
particular, we have 7(0) = 0, and the change in (A) as k increases from 0 to
00 is (m — p)x/2. This proves the following specialization of Theorem 6.

COROLLARY 1. A linear DE of order n with constant real coefficients is strictly


stable if and only if the phase lag increases from 0 to nw/2 as k increases from 0 to
CO.

If all roots \, of the characteristic polynomial are real, then one easily verifies
that all In |ik — d,| increase monotonically as k increases from 0 to 00, Hence,

+ For purely imaginary roots, the change of argument of A is undefined (it could be x or —z). In
this case, we make the convention that the change in the argument is zero. The following theorem
is true with the proviso that, whenever the argument is undefined, the change is taken to be zero.
92 CHAPTER 3 Linear Equations with Constant Coefficients

v(k)
e(k)
3x ¥(k)
1.0

0.5

In p(k)
0.2

0.1

05 (a)
0.5 1.0 15 2.0 2.5 k

him {C(k)}

k=-1

k=—-0.8

k= —1.2

k=-2
k=0.5
k=0
Re {c(k)}
k= 0.5

km 1.2
15 kL2

k=0.8

1
k=

(b)

Figure 3.1 (a) Phase-lag and gain functions, (6) Nyquist diagram.

in this case, the gain p(k) is a monotonically decreasing function of the fre-
quency. In the case of complex roots A, = #, + i, with p, very small, the gain
function p(k) is very large near k = »,; this is due to near resonance, as in Example
3 above.
Another useful way to visualize the transfer function is to plot the curve
z = C(k) in the complex plane as k ranges through all real values. The curve thus
obtained is called the Nyquist diagram of DE (14). Figure 3.1 depicts the Nyquist
diagram of the DE below (18b).
9 The Green’s Function 93

Since C(x) is the inverse of a polynomial, it tends to the origin as k > +00
that is, it “starts” and “ends” at the origin. It is a continuous curve except when
the characteristic equation has one or more imaginary roots A, = ik, From
Theorem 6 we obtain the following Nyquist Stability Criterion

COROLLARY 2 The equation L{u] 0 is strictly stable if and only if the Nyquist
diagram for C(k) turns through —nx radians as k increases from — 00 to +00

If L is real, then C(—A) = C*(k), and it suffices to plot half of the Nyquist
diagram. The operatorL is strictly stable if and only if the Nyquist diagram turns
through n/2 radians as k increases from 0 to 00

*9 THE GREEN’S FUNCTION

The concept of the Green’s function for initial value problems was intro-
duced in Ch. 2, §9. For any inhomogeneous linear DE L[u]
=
=
r(é), it is a func-
tion Gt, 7) such that

(19) u =f f Git, r)r(r)dr

satisfies L[u] ri) if t = a, for any continuous function r. We now state a


=
=

generalization of Theorem 10 of Ch. 2, §9, which describes the Green’s function


of a linear operator of arbitrary order

THEOREM 7. The Green’s function for the initial value problem of the nth order
real linear differential operator with continuous coefficients

-1

(20) L -2+pn0it+ 7+ pail =it Pald), axzt=b

is zero ift <1. For t = 7, it is that solution of the DE L[G]= 0(for fixed + and
variable t) which satisfies the initial conditions

= dG/at ve = OG /A? =
=

0, o'G/ar =1 for t=T7

In the case p,¢) = a, of linear DEs with constant real coefficients, the exis-
tence of sucha solution follows from the results of §3. Green’s function is easily
computed as a sum of polynomials times exponentials. Thus, if (20) is uw” + 3
+ 3u’ + u, the Green’s function is

ifi<r
G(, 7) {° —_ ry? T 12 ift=rT
94 CHAPTER 3 Linear Equations with Constant Coefficients

For variable coefficient-functions, existence will follow from the results of the
next chapter.
We omit the proof of Theorem 7. It follows exactly the proof for second-
order differential operators, given in Ch. 2, §9. This can be easily extended to
the present case: one simply applies Leibniz’s rule for differentiation under the
integral sign n times instead of twice.
The computation of Green’s functions for linear DEs with constant coefhi-
cients is most easily performed and its significance best understood by using the
following result.

THEOREM 8. The Green’s function for the initial value problem of any linear
differential operator with constant coefficients is a function G(t, r) = T(t — 7) depend-
ing only on the difference t — T.

Proof. Let T(@é) = G@, 0); then T(@) = 0 ift < 0. If t = 0, the function I
is the solution of the DE L{[T] = 0, which satisfies the initial conditions

To) =o) =--- = 1-20) = 0 re) =1

We now remark that, if u(f) is a solution of the DE L{u] = 0, for each fixed
7 the function u(t + 7) of the variable ¢ is also a solution of the DE. It follows
that the function Fi) = G( + 7, 7) (for fixed 7) is a solution of the DE. This
function satisfies the same initial conditions as the function I, because of the
way Green’s functions are defined. By the uniqueness theorem (Theorem 3),
it follows that I) = G(t + 7, 7). Hence, setting s t + 7, we obtain I'(s — 7)
=
=

= G(s, 7), q-e.d.

Referring to Theorems 1 and 3, we obtain the next corollary.

COROLLARY 1. In Theorem 8, the function T(t — 7) isof class @"~*. It satisfies


I\(s) = 0 for s < 0, while Is) is a linear combination offunctions soy fors > 0.f

Changing variables in (19), we have also the following corollary.

COROLLARY 2. [fr(t) vanishes for t < a, and is bounded and continuous for a
XS t, then the function
OO,

(21a) fo = f- TG — r)r(r) dr = f —_
~ T(s)r(é — 5) ds

is a solution of the inhomogeneous linear DE with constant coefficients (13) on [a, 00],
which satisfies f(a) = f@=--- =f Ng) = 0. (Note that, unless r(a) = 0,
f(a) does not exist.)

+ The function I(s) is, of course, also expressible for s > 0 as a real linear combination of functions
of the form (9).
9 The Green’s Function 95

Indeed, since r(r) vanishes for r < a and I'(é — 7) vanishes for rt > t, the
integral (21a) can be written as

(21b) fo = f Té — r)r(r) dr

and is equal to (19), by Theorem 8.

THEOREM 9. [fL is strictly stable, formulas (21a) remain valid for any bounded
continuous function r(t) defined for —00 <t < 00; for such a function, f (1) is the only
solution of the DE (13) which is bounded for —00 < t < co,

Proof. By Corollary 1 above, I'(s) is equal (for s = 0) to a linear combination


of the form

(22) Ts) = > ose, s=0


gui

If the equation L[{u] = 0 is strictly stable, then the real parts of all \, are nega-
tive. Let —m be the largest of these real parts. Then —m < 0, and

|e™T(s)| =< > leStet(m/2)s |


j=l

Since Re {\,} + (m/2) < 0 for 1 <j < 1, the right side remains bounded for 0
<= 5s = ©, Let M be an upper bound for the right side. Then we obtain

IT()| = Me™”, O=s<@

We next show that the integrals in (21a) are well-defined for any bounded
continuous function r. The first integral can be rewritten in the form

rf

J =
~ Tt — v)r(r) dr

since '¢ — 7) = 0 for ¢ < 7. Using the foregoing bound for I'(s), and letting R
be an upper bound for |7(r)| on —00 < 7 < 00, we obtain

If. Té — r)r(r) dr <{. IT@ — | }r@)| ar


t

= RM etn? dy = ——— < +00


—c

for all t. Hence the integral is well-defined and defines a bounded function /f().
To show that fis a solution of the DE, we can argue as in Theorem 10 of Ch. 2,
96 CHAPTER 3 Linear Equations with Constant Coefficients

provided we can carry out the differentiation under the integral sign. This can
indeed be justified}; instead, however, we shall give a direct argument.
Consider the sequence of functions 7,(#) defined by the formulas

ift
= —k
(23) r,t) = |rd)
0 ift< —k k=1,2,...

Then the functions


Lt

(23/) fo =f ' TG — 7)7,(r)dr = f TG — r)r,(r) dr


-

are solutions of the DE (14). We shall show that, for ¢ ranging over any interval
a <= ts Bb, the functions f,(, as well as their derivatives of orders up to n,
converge uniformly to the derivatives of the function f(é). This will also prove
that f(/) is a solution of the DE (14).
From the expression (22) for I(s) as a linear combination of functions of the
form s‘e', we see that all derivatives of I'(s) are also linear combinations.of func-
tions of the same form, for different 7, but with the same sequence of exponents
A, That is, for the derivative of order £, we have

Ts) = > pie,


jml

where the p, are polynomials in the variable s, depending on the order of differ-
entiation ¢.} It follows, as before, that

[T%()| = Me™”, €=1,2,...

Now, from the expression

PO = f ' TO — r)r,(r) dr
we find, for sufficiently large k and j, where k = j

IfPO —fPOl = f ; TO — AIlnG@) — 10) ar

= f- [TC — a)|ln@)I ar

<= RM, f . emt? dr

t Courant, Vol. 2, p. 312.


+ This can be easily seen by applying Leibniz’s rule; cf. Courant and John, p. 203.
9 The Green’s Function 97

and the last integral clearly tends to zero as j, k > ©, uniformly for a Si S b.
Therefore, [f/() — f@®| < ¢ for sufficiently large 2, j, uniformly for
a =i <b. This completes the proof of the fact that fis a solution of the DE.
Lastly, one can easily see, by the following argument, that f thus defined is
the only bounded solution. If f,; were another bounded solution, then f — f,
would be a bounded solution of the homogeneous DE. But, since the DE is
strictly stable as ¢ — ©, no nontrivial solution of the homogeneous DE can
remain bounded as tf > —© (cf. Theorem 4). Hence, f — f; = 0, and the proof
is complete.

Convolution. The preceding results have very simple interpretations in


terms of the important notion of convolution. The convolution of two functions
Ji) and g(#) defined for all real ¢ is defined by the formula

10,

(24) h(x) = f =-
~ Se — dg() at

whenever the integral is finite. If the functions f and g are identically zero for
i < 0, this formula simplifies to

(24’) h(x) = f¥g(x) = f °g(x — 8 dx

In many ways this operation is analogous to the multiplication of two infinite


series. It is commutative and associative, as can easily be seen.
Corollary 2 of Theorem8 states that the solution of the DE L[u] r(t) for

=

= ye
the initial conditions u(0) = u’(0) = - - (0) = 0 is the convolution
r *T of r and the Green’s function I for the same initial conditions, provided
that r= 0 fori
< 0.
Theorem 9 can also be interpreted in terms of the convolution operation. It
asserts that, for strictly stable L, the solution of the “input-output” problem
Liu]

=

r(t) is ry *T for any uniformly bounded “‘input”’ r(¢).

EXERCISES E

In Exs. 1-6, construct the Green’s function for the initial value problem of the DE
indicated.

1 @u/d® = rit) 2. d'u/dt" = r(t)

3 uv —u4 = rb) 4. u” +u= rd)

5 uv +u = rt) *6. u? — u = r(t)

7 Find Green’s function of the DE d"u/dt" = r().

8 Carry out in detail the proof of Theorem 7 for n =


=
3, performing all differentia-
tions under the integral sign explicitly.

*9 Show that, if (19) is defined for all ¢ and the Green’s function G(t, 7) = Tt — 7),
then all coefficients p,(f) are constant.

*10 Show that, if u(¢ + 7) is a solution of (19) with r(@) = 0 whenever u(t) is, then the
coefficients p,(é) are all constants.
98 CHAPTER 3 Linear Equations with Constant Coefficients

11 . (a) Show that in Ex. D1, the values of p,(ék) traverse the parabola x = 4 —(y/5)?.
(b) Verify Theorem6in this case.

12 For u” + au’ + bu = ce“, make graphs of the gain function versus the dimension-
less frequency k/\V/b for the values 7 = 0,0 =1/2,0= 1/v2, ” = 2 of the param-
eter7 = a/2V/b (b > 0).
13 Show that, if p, (ik) = pik) = - > - = pith) = 0 but p(ék) ¥ 0, then
asolution
of L{u] = eis u(t) = (1/pi(anyt"e.
14 A DE (13) is stable at t —© when all solutions of the DE remain bounded as ¢
— —0o, Find necessary and sufficient conditions for stability at —00,

15 In Ex. 14, find necessary and sufficient conditions for strict stability at —o.

16 Show that no DE (13) can be strictly stable at both 00 and —00,


17 Show that a DE (13) is stable at both 00 and —oo if and only if every root of the
characteristic equation is simple and a purely imaginary number.
CHAPTER 4

POWER SERIES
SOLUTIONS

1 INTRODUCTION

In the preceding chapters, we have constructed many explicit solutions of


DEs and initial value problems in terms of the so-called elementary functions
These are functions that can be built up using the four rational operations (addi-
tion, subtraction, multiplication, and division) from the exponential and trigo-
nometric functions and theirinverses (e.g., theinverse In x of the exponential
function x). Since x” = ¢’'™, fractional powers and nth roots x = x'/ of
positive (real) elementary functions can also be considered as ‘“‘elementary.”
In these earlier chapters, functions like P(x) JSp(t) dt defined symbolically
as indefinite integrals were also used freely. Indeed, accurate tables of such func-
tions are easily computed using Simpson’s rule (Ch. 1, §8). However, one should
also realize that the indefinite integrals of most “‘elementary’’ functions are not
themselves “‘elementary”’ in the sense defined above. This basic fact explains why
formulas for such expressions as

x
sin x do
« f
(1)
J dx and
V1 — #? sin

are conspicuously missing from tables of integrals.t+


This chapter will introduce a much more powerful method for constructing
solutions of DEs. This method consists in admitting sums of infinite power series
as defining functions in any domain where they converge. Such functions are
called ‘‘analytic.”’ All sums, differences, products, and (except where the denom-
inator vanishes) quotients of analytic functions are analytic. Moreover, as is
almost evident, since

a,x" a3x°
fe + ayt + ast® ) dt
= C + agx + —
2
+ —
3
+

any indefinite integral of any analytic function is also expressible as an analytic


function. For example, since (sin x)/x = 1 — (x°/3!) + (x°/5!) + termwise

+ Cf. Dwight. A valuable collection of analogous explicit solutions of ordinary DEs may be found in
Kamke, pp. 293-660

99
100 CHAPTER 4 Power Series Solutions

integration gives the sine integral function

3 5
sin & x
(2) Si(x) = dE =x — ——
~~ wee

187 600

2 yt
(Since E\(x) = —

dé is logarithmically infinite near x = 0, to express it as


x g
an analytic function is, however, more involved; cf. §3.)
Although most of the techniques introduced in this chapter are applicable
with little change to complex analytic functions, we shall defer the discussion of
these to Chapter 9. Instead, this chapter will introduce a second and even more
fundamental idea than that of considering functions as defined by power series.
This is the idea of considering functions as defined by differential equations and
appropriate initial or boundary conditions, by using the defining DE itself to
determine the properties of the function. i

This approach is especially easy to carry out for first-order linear DEs. For
example, we can use it to derive the key properties of the exponential function
E(x) (or e*) as the (unique) solution of the DE E’(x) = E(x) that satisfies the
initial condition E(@) = 1. For any a, f(x) = E(@a + x) must satisfy f(0) = E(a)
and f’(x) = E’(a + x) = E(a + x) = f(x). Since f(x) and E(x) are solutions of
the same first-order linear homogeneous DE, it follows that f(x) = f(O)E(x) =
ents
E(a)E(x), giving the formula
=
=
e“e*. In particular, E(a) can never vanish and
is always positive, together with E’(a) = Ea), E”’(a) = Ef@),.. , which shows
that E(x) is increasing and convex. Finally, its Maclaurin series is

(3) E(0) + E’(0)x + (E”(0)x?/2!) + (E” (0)x°/3!) + ++ -

giving the exponential series e* = x x*/(k!).


k=0

In Ch. 2, §§6—7, we have seen how one can derive many oscillation and non-
oscillation properties of solutions of linear second-order DEs

(4) u” + p(x)u’ + q(x)u = 0

When the coefficient-functions

P(X) = po + pix + pox? +..., Gx) = qo t+ uxt qx? +...,

can be expressed as sums of convergent power series, we will show in this chap-
ter how to find a basis of solutions of the DE (4) having the same form. The
functions so constructed include many of the special functions most commonly
used in applied mathematics.
Namely, by substituting a formal power series

(4’) U = dy + ayx + agx®+ asx?+--+ = > a,x"


2 Method of Undetermined Coefficients 101

into the DE (4), and using Cauchy’s product formula (18’),

(4”) plx)u(x) = > yx", 6 = Pay


gud

we shall first show how to compute the unknown coefficients a, in (4’). We shall
then show that the radius of convergence of the resulting formal power series
(4’) is at least as great as the lesser of the radii of convergence of the series for
p(x) and g(x).
We will then give a similar construction for the solutions of normal first-order
DEs of the form y’ = F(x,y), where

(5) F(x,y) = boo + diox + bory + box? + Byxy + doy? + °°:

is also assumed to be analytic. Finally, we will estimate the radius of convergence


of this series.

2 METHOD OF UNDETERMINED COEFFICIENTS

The class A(D) of functions analytic in a domain D is defined as the class of


those functions that can be expanded in a power series around any point of D,
which is convergent in some neighborhood of that point. By a translation of coor-
dinates, one can reduce the consideration of such power series to power series
having the origin as the center of expansion.
Many of the special functions commonly used in applied mathematics have
simple power series expansions. This is especially true of functions defined as
definite integrals of functions with known power series expansions: one simply
integrates termwise! Thus, from (2), we obtain

2k+1

si = [Oo ¢ * (Qkg + g1! = x (—1)' (Qk +


x

k=0
1)2(2k)!

Expanding the integrand in power series, we can obtain similarly the first few
terms of the series expansion for the elliptic integral of the first kind,

F(k,sin™!x) =
=

foa- 97a - Kea


1+ #? 3+
2h + 3h ,
+
x+
6 40

witha little more effort.


Likewise, consider the exponential function e*, defined as the solution of
y =y and the initial condition y(0) = 1. Differentiating this DE n times, we
(a+1)
get y y. Substitution into Taylor’s formula then gives the familiar expo-
nential series e* = Dp x*/(k).
102 CHAPTER 4 Power Series Solutions

We now give some other applications of the same principle to some special
functions familiar in applied mathematics, defined as solutions of second-order
linear homogeneous DEs (4) with analytic coefficient-functions.

Example 1. The Legendre DE (Ch. 2, §1) is usually written

d
(6)
dx
[a - od + hu = 0
where A is a parameter.
Substituting the series (4’) into the linear DE (6), and equating to zero the
coefficients of 1 = x°, x, x2 ,-+., We get an infinite system of linear equations

0 = 2a. + Aggy = Gag + A — 2)a, = 12a, + A — B)aqQ =

The kth equation of this system is the recurrence relation

kk +1)—A
(6’) +2 = (k + lk + 2) ay

This relation defines for each \ two linearly independent solutions, one consist-
ing of even powers of x, and the other of odd powers of x. These solutions are
power series whose radius of convergence is unity by the Ratio Test, unless \ =
n(n + 1) for some nonnegative integer n. When A = n(n + 1), the Legendre
DE has a polynomial solution which is an even function if n is even and an odd
function if m is odd. These polynomial solutions are the Legendre polynomials
P,,(x).

Graphs of the Legendre polynomials P(x), . , P,(x) are shown in Figure


4.1. Note how the number of their oscillations increases with AX = n(n + 1), as
predicted by the Sturm Comparison Theorem.
In general, to construct a solution of an ordinary DE in the form of a power
series, one first writes down a symbolic power series with letters as coefficients,
u = ay + a,x + agx® + .... To determine the a, numerically, one substitutes
this series for u in the DE, differentiating it term by term. One then collects the
coefficients of x* that results, for each k, and equates their sum to zero.
This is called the Method of Undetermined Coefficients. For it to be applic-
able to the second-order linear DE u” + p(x)u’ + q(x)u 0, p and g must be

=

analytic near the origin. That is (cf. §1), they can be expanded into power series

(7) P(x) = po + pix + pox? +


Q*) = got gx t gaxrt---

convergent for sufficiently small x.

+ Courant and John, p. 520; Widder, p. 288.


2 Method of Undetermined Coefficients 103

1 Po(z)

P, (x)
P,(z)
P,(z)

-l

P,(z)

~i+

Figure 4.1 Graphs of P,(x); Legendre polynomials.

To compute the solution, one assumes that

(7’) = Ay + a,x + aox®+ agx®+--+ = > a,x"

Term-by-term differentiation gives

u” = Qa + Basx + 1Qayx®? +--+ + (n+ 1)(n t+ Dayyox" + ---

pul = pod, + (2poag+ pyay)xt+ - > + | > (n+1— Pitas


n

k=1
xepores
Gu= qody+ (Goa+ Gyaq)xTess + bp tite x +
na]
Substituting into (4) and equating the coefficients of 1, x, Xx ,-.. to zero,

we get successively

ag = —4(Poa + 404), ag = —8(2pPodg + pya, + goa + 91a)

and so on. The general equation is

| >
k=0
(n—R)pQn—~+ > tie
(8) n=l
=
_—

ant) = >

n(n + 1)

Given a) = /(0) and a, = /’(0), a unique power series is determined by (8),


which formally satisfies the DE (4). We have proved the following theorem.
104 CHAPTER 4 Power Series Solutions

THEOREM 1. Given a linear homogeneous second-order DE (4) with analytic coef-


ficient functions (5), there exists a unique power series (6) that formally satisfies the DE,
for each choice of ay and ay.

Example 2. The Hermite DE is

(9) u” — QIxu’ + ru = 0

Applying the Method of Undetermined Coefficients to (9), we obtain the recur-


sion formula

2k—2X
(9) Qn+2
(kh + DR +2) 7

which again gives, for each A, one power series in the even powers x** of x and
2k+1
another in the odd powers x . These power series are convergent for all x; if
\ = 2n is a nonnegative even integer, one series is a polynomial of degree n,
the Hermite polynomial H,(x).

Caution. We have not stated or proved that the formal power series (7’) con-
verges or that it represents an analytic function. To see the need for proving
this, consider the DE x?u’ = u — x, which has the everywhere divergent formal
power series solution

x + x? + (2Nx? + (3Nxt + - +n
— Vint +

For normal second-order linear DEs (4), the convergence of the power series
defined by (7’) to an analytic solution will be proved in §6. First we treat some
more special cases, in which convergence is easily verified.

EXERCISES A

1 (a) Prove that the Legendre DE has a polynomial solution if and only if
AX = n(n
+ 1).
(b) Prove that the radius of convergence of every nonpolynomial solution of the
Legendre DE is one.

Find a recurrence relation like (8’) for the DE (1 + x’)y” = y, and compute expan-
sions through terms in x!° for a basis of solutions.
(a) Find power series expansions for a basis of solutions of Airy’s DE u” + xu = 0.
(b) Prove that the radius of convergence of both solutions is infinite.
(c) Show that any solution of Airy’s DE vanishes infinitely often on (0,0), but at
most once on (—©,0).
(d) Reduce u” + (ax + Bu = 0 to d®u/dt®? + tu =
=
0 by a suitable change of inde-
pendent variable.

Show that u(x) satisfies the Hermite DE (9) if and only if uv =-


=
e7**/2y satisfies
vp+at4t1—xv
=0.
(a) Find a basis of power series solutions for the DE u” + x?u = 0.
(b) Do the same for u” + xu = 0.
3 More Examples 105

In spherical coordinates, the Laplacian of a function F(r,6) is

VR= FL + oF, + [Fo + (cos6) Fo].

Show that F(r,0) = r"P(cos 6) satisfies V?F = 0 if and only if P(x) satisfies the
Legendre DE with A = n(n + 1).

(a) Show that in spherical coordinates, U(r,6) = (1 — 2r cos 6 + 1°)~'/ satisfies


V?U = 0. [Hint: Consider the potential of a charge at (1,0,0).]
(b) Infer that (1 — 2r cos 6 + 7’)? = LP r°P,(cos 6), where P,(x) is a solution of
the Legendre equation (8) with A = n(n + 1).

*8 Show that, for any positive integer n, the polynomial d*[(x? — 1)"]/dx" is a solution
of the Legendre DE with A = n(n + 1). (This is Rodrigues’ formula.)

MORE EXAMPLES ‘

Many other famous “higher transcendental functions”’ of classical analysis are


best defined as solutions of linear homogeneous second-order DEs. Among
these the Bessel functions have probably been most exhaustively studied.t

Example 3. The Bessel function /,(x) of order n can be defined, for n = 0,


1, 2,..., as an (analytic) solution of the Bessel DE of order n,f

(10) wt ta +( 1-2 Juno


The coefficients of this DE have a singular point x = 0, but are analytic at all
other points (see §5).
Though the Bessel DE (10) of integral order n has a singular point at the
origin, it has a nontrivial analytic solution there. The power series expression
for this solution can be computed by the Method of Undetermined Coefficients.
For example, when n = 0, the DE (10) reduces to

(10’) (xu’l + xu =
=
0

Substituting u

=

La,x* into (10), we get the recursion relation ka, = —a,_».


For a) = 1, a =
0, (10’) has the analytic solution

2
2 3

Jot)
=

| }+( } |
x
_

2
x

2-4
x

2-4-6
)+
(11)
2 4 6 2r
x x x
+ .--
1-—+—-~ +-+++(-17
4 64 2304 [2’@ry]?

+ See G. N. Watson, A Treatise on Bessel Functions, Cambridge University Press, 1926.

t+ Note that the Bessel DE of “order” 7is still a “second-order” DE.


106 CHAPTER 4 Power Series Solutions

This series defines the Bessel function of order zero. It is convergent for all x
(by the Ratio Test), and converges rapidly if |x| < 2.
Similar calculations give, for general n, the solution

2 4

Ini) = (2) |nm


1 x x
(12)
4(n + DY! 32(n + 2)!
2r

cw(
(Din + n! |
whose coefficients 5,
=
=

4,4, Satisfy the relation (n? — k*)b, = b,-9. The series


(12) is also everywhere convergent. Hence, the Bessel function /,,(x) of any inte-
gral order n is an entire function, analytic for all finite x. Graphs of Jo(x) and
Ji(«) are shown as Figure 4.2.
By comparing coefficients, one can easily verify the relations

Ji = —SJo, xJo = Ji — xJi = 2); — xJo, Js = 2Jo — x)

For general n, one can verify, in the same way, the recursion formula

(13) Int) = 0 ~~ xn = Qn], ~ Jn-y

Clearly, formula (13) defines Ji, Jo, Js, . . . recursively from Jo.

Example 4. A special case of the Jacobi DE is the following

(14) (l — x*)u” — xu’ + Au = 0

where ) is a parameter.
If we set u = Lf.o a,x", substitute into Eq. (14), and collect the terms in x”,
we get

(14’) (n + 1)(n + 2a,49 — (n? — dja, = 0

Jo(z)

J; (x)

ISS
"|
Figure 4.2. Bessel functions Jo(x) and J,(x) = — Jo(x).
4 Three First-Order DEs 107

For most values of \, this leads to two solutions of (10), one an even function
2
and the other odd. However, when X n is a square, we have instead a poly-
nomial solution
Even simpler to solve by power series is the Airy DE

(15) u” + xu=
0

One easily derives from (15) the recursion relation n(n — 1)a, + a,-3 for the
coefficients. A basis of power series solutions therefore consists of the functions

x? x® 9

A1(x)
=
=
1-> +757
180 12960

and

4 7 10
x x x
+
Bi(x) x-isteyt
12 504 45360

4 THREE FIRST-ORDER DEs

The three homogeneous, linear, second-order DEs just treated were quite sim-
ilar to each other. In this section, we will derive power series expansions for
three first-order DEs having much less in common

Example 5. To obtain a useful power series expansion for the “exponential


integral’ function E(x) defined by (2), one must supplement simple substitution
into the exponential series, which gives only

—* 2 3
é x x
Seet-1t5-g+g7:
9} 3!
x

from which termwise integration yields

E\(x) = my~Ins ~ (—x)"


(16)
[k
- (Rl)]

where ‘+ is an unknown constant of integration. The fact that y = .5772156649


discovered by Euler, requires additional analysis
Moreover, to evaluate E,(x) when x 2 10, one should replace (16) by the
asymptotic formula

k
OO x
~ dt
(16) Ey) —{ x 1+ @/x) x
Ja-2+2 pto0)
108 CHAPTER 4 Power Series Solutions

The final series of negative powers is divergent for all x: it is a so-called asymptotic
series (cf. Ch. 7, §'7). However, the partial sum of the first n terms hasa relative
error of less than 1%, and an absolute error less than 107° for x 2 10.

Example 6. Pearson’s DE is

(*) (A + Bx + Cx*)y’ = (D + Ex)y, A#0

Its solution by power series is straightforward, after dividing the equation


through by A.
Setting A = 1, clearly y = Xf. a,x* implies

(1 + Bx + Cx) = a, + (2a, +Ba,)x +


> Le ~ Qay-2C + ( ~ Vay-1B + haj)xt!
k=3

and

(D + Ex)y = Day + y (Da, + Ea,_,)x*


k=]

The two sides of the preceding equations are equal if and only if

_ [Fay + (D — Ba,)]
a, = Dao, ao =
2

ka, = [D — (k — 1)Blaq-) + [E — & — 2)Cla,-2

For given numerical values of A, B, C, D, E, one thus obtains a basic solution


of (*) in the form of the power series

[E + D(l — B)}x?
y= 1+ Det
2
+ x a,x"
h=3

where the a, (k 2 3) are computed recursively from the last previously displayed
formula.

Example 7, We will next consider the function y = tan x as the solution of


the nonlinear DE yf = 1 + 9? for the initial value y(0) = 0.
By successive differentiation of this DE, we easily obtain formulas for the sec-
ond and third derivatives:

yy” = qd + yy
Qyy’ = Qy(1 + y?)
=
=

y” = Qy + by’y = 201 + 3y)1 + 9°)


4 Three First-Order DEs 109

Since (1 + 3y°)(1 + x) 1 + 4y* + 3y", we have further

P= Q(By + 12y")y’ = By(2 + 3y°)1 + 9°)


1 y” (0)
In particular, setting x = y 0, we get y’(0) 0, (0) = 2, and
ww (0)
0 whencetan x = x + x°/3 + O(w)
Note that since 1 + x is positive, the function tan x is always increasing. Also,
since y”has the same sign as y, the graph of y = tan x is concave upwardin the
upper half-plane and concave downwardin the lower half-plane. Likewise, y” is
always positive, and so y” is increasing.
Again, settingt = —x and z —y, we obtain the formulas
=
=

a ~%_
=1l+y=14+2 and z(0)
dt ~ dx

Hence z(Z) is also a solution of the initial-value problem of Example 7. There-


fore, by the uniqueness theorem (Ch. 1, Theorem 6’) for first-order DEs
z = tan é. This proves that tan(—x) tan x: tan x is an odd function.
Consequently, the Taylor series expansion of tan x contains only terms of odd
order

y =x + agx + asx?
+ ayx’ + agx® +

Substituting into the defining DE, we obtain

1 + 3agx® + 5asx* + Tayx® + 9agx® +


1 + x? + Qagx* + (Qas + a8)x® + Q(a7 + agas)x® +

Equating coefficients of like powers of x, we get

3a3 = 1 5a, = 2a3, Ta, = 2a, + a5, 9ag = 2a, + 2agdr,


7

and so on. Solving recursively, we get the first few terms of the power series
expansion for tan x,

2x° 17
(**) tanx=x45
3
+384 HF
315
x? + =
62
2835
x?
+

The radius of convergence of this power series is 7/2; this follows from the for-
mula tan x
=
=
sin x/cos x and the results of §3

By differentiating the DE y 1 + y° repeatedly, one obtains similarly


“= Qyy/, yy” = (2 + 3y)(1 + x), P= by + 6y?)(1 + y), and so on. The
Taylor series expansion for y,,, = tan(x, + h) through terms in h* for given
Mn tan x, is therefore

(*) Int1 = In + AC + Qn + OC)


110 CHAPTER 4 Power Series Solutions

where

Qn = 1 + Wyh + (2 + By2h? + |] + Gy?)y,h?

By neglecting the O(h*) term in (*), one obtains a formula

(**) Yn+1 = Y, + hd + Y2)Q (Yash)


for computing a table of approximate values Y,, of y, = tan (nh) from the initial
value Yo = yo = 0, which is much more accurate than that given by the “Taylor
series method” of Ch. 1, §8.

Order of Accuracy. We have used in (*) the convenient notation O(h°) to


signify the fact that the remainder is bounded by Mh°, where M is independent
of h. As a result, one expects that the error per step fA will be roughly propor-
tional to >. Since the number of steps is proportional to h~!, one therefore
expects the cumulative error to be proportional to h*. This contrasts with the
simpler Taylor series method of Ch. 1, §8, which has only O(4?) cumulative
accuracy.

EXERCISES B

1. Derive formula (13) by comparing the coefficients of the appropriate power series.

2 (a) Show that the function (sin r)/r satisfies the DE u,, + (2/ryu, + u = 0 and the
initial conditions u(0) = 1, u’(0) = 0.
(b) Find another, linearly independent solution of this DE.

Show that the DE (Ax? + B)u” + Cxu’ + Du = 0 hasasolution that is a polynomial


of degreen if and only if An? + (C — A)n + D = 0.
Show that the change of independent variable x cos 8 transforms the Legendre
=
=

DE (8) of the text into ugg + (cos #)ug + Au = 0. What is the self-adjoint form of
this equation?

*5 Find conditions on the constants A, ..., F necessary and sufficient for the DE (Ax?
+ Bx + Chu” + (Dx + E)u’ + Fu = 0 to have a polynomial solution of degree n.

Show that if y’ = 1 + 9°, then y” = 2y(1 + 9%), y” = 2(1 + 99(1 + 3y%), and
y” = By(1 + y°)(2 + By’).
Show that any function that satisfies y’ = 1 + y* is an increasing function, and that
its graph is convex upward in the upper half-plane. [HinT: Use Ex. 6.]
Derive the coefficients 1/3, 2/15, 17/315, and 62/2835 of the series (**) of the text.
Show that, if y’ = 1 + y%, 9” = 89(2 + 15y? + 15y*)(1 + 9°).

5 ANALYTIC FUNCTIONS

A function is called analytic in a domain D when, near any point of D, it can


be expanded in a power series that is convergent in some neighborhood of
that point. For instance, a real function p(x) of one real variable is analytic
5 Analytic Functions 111

in the open interval (x),x2) when, given any x in this interval (i.e., satisfying
x; <_X9 < X9), there exist coefficients p,p),po, and a positive number 6
such that

(17) p(x) = >pal ~ ma if |x—x| <6, 6>0

The numerical values of 6 and the coefficients p, will, of course, depend on xp.
Likewise, a real function F(x,y) is analytic in a domain of D of the real
xy-plane when, given (X9,yo) € D, there exist constants 6, (j,k = 0,1, 2,...)
and 6 > 0 such that

F(x,y) = > > bye ~ xoY(y — yo)"


j=0 k=0
if |x — xol + ly — yol <5

An example of such a series expansion is the double geometric series

M M
(18) G(x,y) =
x
1-2
“£8 HK ayf
j=0 k=0

(
1-—

A I k

This series converges in the rectangle |x| < H, |y| < K and defines an analytic
function in this rectangle.
Analytic functions of three and more variables are defined similarly.

Domain of Convergence. Crucial for work with any power series is an


understanding of its domain of convergence. Within this domain, any power
series is absolutely convergent, and it can be differentiated or integrated term-
wise any number of times. It follows that each coefficient of any power series is
uniquely determined by the analytic function which the power series defines.
This is because, if u = F(x,y) = X,4 bax xo)(y — 90)" is convergent in some
neighborhood of (xo,yo), then

_ (anar*F
=

‘jk
dx/Ayjf (x0290)
If F and G(x,y) = Dye Gulx — xo¥(y — yo)* are any two power series expansions
about the same “center” (%o,jo), moreover, then their power series can be
added, subtracted, and multiplied termwise within the intersection of their
domains of convergence. Worth noting is Cauchy’s formula for the product h(x)
= f(x)g(x) of two analytic functions f(x) = Lf» a,x" and g(x) = Lio bx". This
formula is

(18’) h(x) = > ox*, where c= y abi,


112 CHAPTER 4 Power Series Solutions

By a well-known theorem of the calculus, this will be absolutely convergent to


f(x)g(x) whenever the series for f(x) and g(x) are both absolutely convergent.t+
In dealing with linear differential equations such as (4), it is sufficient to con-
sider analytic functions of one variable. For these, the key concept is that of the
radius of convergence. The radius of convergence R of the power series (17) is the
largest 5 such that the series converges whenever |x — x9| < 4. The radius of
convergence of any power series can be determined from its coefficients by Cau-
chy’s formula

(19) Rim sup V [Pal = lim {sup Vipal }


nwo

The series diverges for all x with |x — x9| > R. The inierval of convergence of
(17) is the interval (x9 — R, x9 + R).
For functions of a complex variable, the radius of convergence of the series
(17) is still determined by Eq. (19). The series is convergent in the circle of con-
vergence |x — x9| < Rand divergent if |x — x9| > R; it defines a single-valued
analytic (or holomorphic) complex function inside its circle of convergence. When
R = 00, the power series (17) defines an analytic function for all x, real or com-
plex; such functions are called entire functions.

Example 8. The substitution £ = C — x reduces the DE

/
(20) u” + u 0, c>0
C—-x (C— x”
to Euler’s homogeneous DE

du A du B
(20) 0
dg? &dé Pe”
already discussed in Ch. 3, §2.
To solve (20), try the function u = & = (C — x)’. This satisfies (20) if and
only if v is a root of the indicial equation of (20’,

vpy— 1)-—Av+B=0

When B < 0, this indicial equation has one positive root and one negative root
—p. Hence, (20) has two linearly independent real solutions, given by the bino-
mial series

(21) (1-2Cc 1 ~»(Cc)+ vy


~ 1)
x
x
( Ix] <C

=

>

2 Cc

+ Courant and John, pp. 542-544, 555; Widder, pp. 303-306 and 318-320. For a more complete
discussion, see K. Knopp, Theory and Application of Infinite Series, Dover, 1956.
6 Method of Majorants 113

and a like series with vy replaced by —y. When » is a nonnegative integer, a poly-
nomial solution is obtained. Otherwise, the radius of convergence of the series
is C, the same as that of the power series expansions of the coefficient-functions

p(x) =
(C —x) “(OP +E 8]
and q(x) = B/(C ~ x)* of the DE (20).

6 METHOD OF MAJORANTS

If one keeps in mind the results of §4, one can show quite easily that the
formal power series solutions of (4), obtained by the Method of Undetermined
Coefficients of §2, have for all choices of a9 and a, radii of convergence at least
as large as the smaller of the radii of convergence of the coefficient functions.
To prove this, one uses an ingenious method due to Cauchy, the so-called
Method of Majorants.
A power series La,x* is said to be majorized by the series LA,x* if and only if
|a,| << A, for all k = 0, 1, 2, 3, .... By the Comparison Test, the radius of
convergence of Xa,x* is then at least as large as that of XA,x*, and all A, are
positive or zero. Therefore, we say that the DE

(22) u” = P(x)u’ + Q(x)u, P(x) = XP,x*, Q(x) = TQyx*

majorizes the DE (A) if and oniy if P, = |p,| and Q, = |q|, for all &.
In particular, the choice of coefficient-functions

(22’) P(x) = ZI pel


x* and Q(x) = Xl gel x"

in (22) gives a DE that majorizes (4). Moreover, by (19) the coefficient-functions


(22’) have the same radius of convergence as p(x) and q(x), respectively.

LEMMA 1. Let the DE (22) majorize the DE (4), and let Df c,x* be the formal
power series solution of (22) whose first two coefficients are |agland |a,|. Then ¢, =
=

|a,/ for all k.

This lemma may be thought of as a generalized comparison test.

Proof. For the DE (22), the coefficients of formal power series solutions sat-
isfy, by (8) with p,
=
=

Pr Gh = —Qe

(*) Cn+) Th b (n—R)Piln-k+ s Qeev | n= 1


k=0

Hence,if ¢)
= |a,|,¢, = lai|,.-. 5 = Jal,
it follows that c,.; = |a,41|, as
stated. This is because a,,, is given for n = 1 by the display (8), like (*) above,
114 CHAPTER 4 Power Series Solutions

with each (positive) term replaced by one having at most as great an absolute
value. The lemma follows by induction on n.
Now let x, be any number whose absolute value |x,| = C is less than the
smaller of the radii of convergence of the two series (7). Then p,x} and q,x* are
uniformly bounded} in magnitude for all k, by some finite constant M. Hence
we have

| Pel = Mc, 1g = Mc, = 0,1,2,...

This implies that the power series for p(x) and q(x) are both majorized by the
geometric series

we
x MC
»
k=0 (Cc
_—

~ (C—x)’
for some
M > 0,C
> 0

This series being majorized in turn by

k
MC?
-uya+(2)
(C — x)

the DE (4) is majorized by the DE

4
MC MC?
(23) =
uw+
(C —x) (C—
x
But, as in Example 8, one solution of this DE is the function

om =[1-( _

Cc IP
where —p is the negative root of the quadratic indicial equation of (23). Again
as in Example 3, this equation is

vy — 1) — MCy — MC? = 0, where —MC? < 0

This function ¢(x) has a power series expansion

pe + 1)x?
(24) d(x) = ( 1--—

Cc
)oa+ 2C?
+

convergent for |x| < C, as in (21).

+ This is because, if a series is convergent, its n-th term tends to zero as n — ©0,
6 Method of Majorants 115

Now apply the foregoing lemma. Each solution of (4) is majorized by K times
the solution ¢(x) of (23), provided that

aC
(24’) k= max{ aol |
But K@(x) has the radius of convergence C. Hence, by the preceding lemma, the
radius of convergence of the series (6) is at least C = |x,|. This proves the
following result.

THEOREM 2. For any choice of ay and a), the radius of convergence of any power
series solution defined by the recursion formula (8) is at least as large as the smaller of
the radii of convergence for the series defining the coefficient functions in (4).

We now recall (§5) that power series can be added, multiplied together, and
differentiated term-by-term within their intervals (circles) of convergence. It fol-
lows from Theorem 2 that when applied to power series defined by (8), the three
equations displayed in §2 between formulas (7) and (8) are identities in the com-
mon interval of convergence specified. Hence, the power series defined by (8)
are solutions of (4), and we have proved the following local existence theorem.

THEOREM 3. Any initial value problem defined by a normal second-order


linear homogeneous DE (4) with analytic coefficient functions and initial conditions
S) = ao, f(0) =a, has an analytic solution near x 0, given by (8).
=
=

EXERCISES C

1 Let Da,x* have the radius of convergence R. Show that, for any r < R, the series is
majorized by L(m/r*}x* for some m > 0.
Using Ex. 1, prove Cauchy’s formula (19).

Prove that, unless v is a nonnegative integer, the radius of convergence of the bino-
mial series (21) is C.

Using the symbols A, B, C to denote the series La,x*, Db,x*, Cex, and writing
A < B to express the statement that series A is majorized by series B, prove the
following results:
(a) A « Band B < Cimply A « C.
(b) A « Band B < Aimply A =B.
(c) If A « B, then the derivative series A’: Dka,x* and B’: Lkbyx* satisfy A’ « B’.
(d) IfA « BandC
« D, thenA + C «B+ Dand AC < BD.

Prove that the radius of convergence of Xa,x* is unaffected by term-by-term differ-


entiation or integration.

(a) Obtain a recursion relation on the coefficients a, of power series solutions La,x*
of Pearson’s DE y’ = (D + Ex)y/(A + Bx + Cx, A # 0.
(b) What is the radius of convergence of the solution?
(c) Integrate this solution by quadratures, and compare.

*7 Extend the Method of Majorants of §6 to prove the convergence of the power series
solutions of the inhomogeneous DE u” + p(x)u’ + q(x)u = r(x), when the functions
116 CHAPTER 4 Power Series Solutions

p,q, 7 are all analytic. [HinT: Show that the DE is majorized by setting p(x) = —MC/
(C — x), q(x) = —MC/(C — x)’, r(x) = M/(C — x), for some finite M > 0, C > 0.]
*8 Let the coefficients of u
=
=
Lax" satisfy a recursion relation of the form
Gy+\/a, = P(k)/Q(k + 1), where P and Q are polynomials without common factors
and Q(0) # 0. Show also that « must satisfy a DE of the form

Q(. 4) [u] = xP(=4) [u]


and conversely.

*7 SINE AND COSINE FUNCTIONS

To illustrate the fact that properties of solutions of DEs can often be derived
from the DEs themselves, we will now study the trigonometric DE

(25) y’ ty =0

The general solution of this DE is y


=
=
a cos x + b sin x, where a and 6 are
arbitrary constants, and the functions cos x and sin x are defined geometrically.
We will pretend that we do not know this, and deduce properties of the trig-
onometric functions sin x and cos x from general theoretical principles, assum-
ing only the trigonometric DE (25) and the initial conditions that they satisfy. In
this spirit, we define the functions C(x) and S(x) as the solutions of this DE that
satisfy the initial conditions C(0) = 1, C’(0) = 0, and S(0) = 0, and S’(0) = 1,
respectively. Applying the Method of Undetermined Coefficients to (25) with
these initial conditions, we get easily the familiar power series expansions

2 4 3 5
x x

(25’) Ce) =1-2+—— ’ S@) =x —- Ste


2! 4!

whose convergence for all x follows by the Ratio Test.


Differentiation of (25) gives the DE y” + y = 0. Therefore, the function
C’(x) is also a solution of the DE (25); moreover, since the function satisfies
the initial conditions C’(0) = 0 and C’(0) = —C(0) = —1, it follows from
the Uniqueness Theorem (Ch. 2, Theorem 1) that C’(x) = —S(x). This
proves the differentiation rule for the cosine function. A similar computation
gives S’(x) = C(x).
The Wronskian of the functions C(x) and S(x) can be computed from these
two formulas; it is W(C,S;x) = C(x)? + S(x)*. From Theorem 3 of Ch. 2,
W(C,S;x) = C(0)? + S(0)? = 1 follows. This proves the familiar trigonometric
formula cos* x + sin* x = 1.
Again, by Theorem 2 of Ch. 2, every solution of the DE (25) is a linear com-
bination of the functions § and C. We now use this fact to derive the addition
8 Bessel Functions 117

formula for the sine function:

sin(a + x) = cos asin x + sin a cos x

First, by the chain rule for differentiating composite functions, the function
S(a + x) is also a solution of the DE (25). Therefore (Ch. 2, Theorem 2), this
function must be a linear combination of S(x) and C(x):

(26) S(@ + x) = AS(x) + BC(x)

Furthermore, if we write f(x) = S(@ + x), then f(0) = S(a) and f’(0) = Cia).
But if we differentiate the right side of (26) and set x 0, we find that
=
=

f(O) = Band f’(0) = A, whence S(a + x) = C(a)S(x) + S(a)C(x). This proves


the addition formula for the sine function. The addition formula for C(x)

C(a + x) = C(a)C(x) — S(@)S(x)

can be derived similarly.


Finally, the fact that the functions S and C are periodic can be proved from
the addition formulas. Define z/4 as the least positive x such that S(7/4) = 1/
2. Since S’
=
=

c=V1- SS 1/V 2 on any interval [0, 5] where S(x) = 1/


2 , we see that S(x) is increasing and satisfies S(x) = x/ V2 there. Hence, S(x)
= 1/V2 on (0, 1] is impossible, which shows that 1/4 exists in [0, 1]. Moreover,
w/4 = Sq/V2), where S~! is the inverse function of S. Since the derivative
of S~! is given by 1/S’(x) = 1/V1 — S*, this makes
1/v3
di
—_
=

4 0 Vvi-#

Moreover, C cannot change sign until S7 = 1 — C® = 1. Hence, cos


(1/4) = 1/ V2 = sin (x/4). Consequently, by the addition formulas proved
above

sin (5 + *) = —_—

Va
(sinx + cosx) = cos(; - *]
In particular, sin (7/2) = (2/V2)/ V2 = land, therefore, cos (4/2) = 0. Using
the addition formulas again, we get the formulas sin (7/2 + x) = sin (r/2 —
x), cos (x/2 + x) = —cos (#/2 — x), sin (m7 + x) = —sin x, cos (a + x) =
—cos x and, finally, the periodicity relations cos (24 + x) = cos x, sin (2x + x)
= sin x.

*8 BESSEL FUNCTIONS

The Bessel functions of integral order n and half-integer order n + } are


among the most important functions of mathematical physics (see Exercises D
below). In §3, we defined J,,(x) as Example 3, and derived a basic recursion for-
118 CHAPTER 4 Power Series Solutions

mula (13) expressing J,(x) algebraically in terms of Jo(x) and its derivatives. In
this section, we shall derive many other useful facts and formulas involving Bes-
sel functions from the results proved in §3. We emphasize that all of these for-
mulas can be derived from their defining DEs (10), the fact that /,,(x) is analytic
at 0, and the choice of leading coefficient in formula (12).
Specifically, one can prove all the properties of the Bessel functions of inte-
gral order from (10) and the recursion relations (13). For example, one can
obtain such useful formulas as

SxJo dx = xis Sx; dx = —xJo + STo dx

fetdx = ZR + y= SUR+W
More generally, we can obtain useful expressions representing, in closed form,
integrals of arbitrary polynomial functions times Bessel functions and products
of Bessel functions. The basic formulas are (13) and

(27a) Sx*J, dx = —x*Jy + kfx* Jy dx

(27b) Sxt]o dx = x*®J, — (k — UJx*


lf, dx

(27c) Qfx*Jo
Ji dx = —x*]J§ + kfx* YG dx
(27d) Sx*(J§ — Ji) dx = x'Jo
fy — kh — DSx* Yo fy dx
(27e) Sx*t& + JB + & — DUP de = x13 + JP)

Equation (27a) follows from J, —J$, integrating by parts. To derive formula


=
=

(27b), note that since (xJ>) = —xJo, it follows that

fy = —Le"Go] = —& — xt "YG + xo

To derive (27c)—(27e), differentiate x*J3, x*Jo J, and x**'( Jj + Js”), respectively,


and use (10) to eliminate J{. The integral [J, dx cannot be reduced further, and
so it has been calculated (by numerical integration) and tabulated.
Important qualitative information can also be obtained from a study of the
DE (10) which the Bessel functions satisfy. Substituting u = v/ Vx into (10), we
obtain the equivalent DE

—1
4n?
(28) vo”+ [1- 4x”
Jono

+ G. N. Watson, Bessel Functions, Chapter 8. Cambridge University Press 1926; A. N. Lowan et al.,
J: Math and Phys. 22 (1943).
8 Bessel Functions 119

The oscillatory behavior of nontrivial solutions of the Bessel DE (10), for large
x, can now be shown, using the Sturm Comparison Theorem (Ch. 2 §4). When
applied to (28), this result shows that, for large x, the distance between succes-
sive zeros of Jy(x) is inferior to + by a small quantity (at most 7/8x*), while that
between successive zeros x, and x,+, of J,(x) exceeds x by about n*1/2x° ifn =
1. Also, since J; = —Jo, there is a zero of J, between any two successive zeros
of Jo
Setting o(x) v? + vo? = x(Ji t+ Jr) + JnJn + J7/4x,it also followsfrom
(28) that o’(x) (4n® — 1)uv’/2x?. Since |2uv’| < v? + v” = a(x), there follows

Kote)
(28’) lo’(x)| = K,
=
=

2 a(x) > 0
Using the Comparison Theorem of Ch. 1, §11, we get from (28’)

Ae **/* =o(x) = x(Jn + Jv) + Jada +i


+ < AeKn/*

where K n* — 3|. For large x, therefore, o(x) must approach a constant A


Clearly, 1/ x is the asymptotic amplitude of the oscillations of the Bessel func-
tions for large x, since J; vanishes at maxima and minima of J,(x), so that o(x)
=
xJ2(x) there. Much more precise asymptotic results about the oscillations of Bes-
sel functions will be provedin Ch. 10.

The General Solution. The general solution of the Bessel DE of zero order
is, setting W(x)
=
=
e */* = 1/x in formula (13) of Ch. 2
2

(29) Zo(x=)Jo(x) [4 + af |
But a straightforward computation with power series gives

2
1 x
—+—-
5x4
1
4

Jo") 2 32

From this formula, substituting back into (19) and integrating the resulting
series term-by-term, we see that the general solution Zo(x) of the Bessel DE of
zero order is

5x*
Zo(x)=Jo(x) l4 +B (Ins +74 OH
(30)
128 ‘)
It follows that every solution not a constant multiple of J)(x) becomes logarith-
mically infinite as x | 0, since B # 0. For further information, see Ch. 9, §7
120 CHAPTER 4 Power Series Solutions

Generating Functions. Given a sequence {a,} of constants a9,),4, ., the


power series

g(x) = a9 + ax t+age? +--+ = ax”


n=0

is called its generating function. When the series on the right converges in an
interval, this defines a function g(x) there; otherwise, the infinite series is just a
formal power series. In many cases, useful information can be obtained about a
sequence {a,} by studying its generating function.
Likewise, given a sequence of functions F,(r), the function defined by the
power series Lt"F,,(r) is also called the “generating function’’ of the sequence.
Thus, the generating function of the sequence of Legendre polynomials is

(*) > rP,(0) = (1 ~— 2rcos6+ ry”,


n=0

The same phrase is used when the sum is taken over all integers; for example,
the Bessel functions of integral order have the generating function

aw

(31) Syn) = ere


—c

See Ex. D13.

EXERCISES D

1. Define E(x) as in §1, by the DE & = E and the initial condition E(0) = 1. Prove in
turn, justifying your arguments by referring to theorems, that
(a) E(x) = Exo x*/(k!) (b) Ela + x) = E(@E(x)
(c) E(—x) = 1/E(x) [Suggestion: Show that for any a, E(a + x)/E(a) satisfies the
conditions defining E(x).]

Define sinh x and cosh x as the solutions of the DE u” =


=
u that satisfy the initial
conditions u(0) = 0, «’(0) = 1 and u(0) = 1, u’(0) = 0, respectively. Show that sinh
x has only one real zero and cosh x has no real zeros. Relate this to the Sturm
Comparison Theorem.

Using methods like those of §7, establish the following formulas (cf. Ex. 4):
(a) cosh? x — sinh? x = 1 (b) cosh (—x) = cosh x
(c) sinh (—x) = —sinh x (d) sinh (x + y) = sinh x cosh y + cosh x sinh y

(a) Show that sinh x + cosh x satisfies the conditions used to define E(x) in Ex. 1.
*(b) Using this result, and the formulas of Ex. 3, show sinh !(x) = In(e +
Vx"? + 1).
Prove formulas (27a) and (27b) in detail, expanding on the remarks in the text.
Prove formulas (27c)~—(27e) similarly.
9 First-Order Nonlinear DEs 121

Establish the identities for Bessel functions of integral order in Exs. 7-10.
k

7. Jk*) = (—YJax) —*8- (24) (xu) = "Yn e(e)


*9. Ju—i(x) + Jnzi(x) = 2nx'],(x)
*10. Jai) — Jnvilx) = 2Jn)
M1. In polar coordinates, V?u = wu, + 77u, + 1771069.
a

ShowthatJ,(7) {cos| nOsatisfiesV?u + u = 0.


*(b)
Showthatconversely,ifJ(7) {cos
i
ndsatisfiesV?u + u
co: =
=
0 and is bounded

near r = 0, then j(r) is a constant multiple of J,(7).

12 Show that (sin 7)/Vr= Vr Jo”) is a constant multiple ofJ;,.(7). (Cf. Ex. A.)
*13. (a) Show that the real and imaginary parts cos(r sin 6) and sini sin 6) of
e? = e808 satisfy Vu + u = 0.
(b) Show that e? = e"""2,t = oy = rsin 0.
(c) Show that the functions F,(r) in the Laurent series expansion

gt? y OF,(7)

satisfy the Bessel equation of order n. [Use (a) and (b).]


(a) Comparing the coefficients of "7", prove the identity e""?_ = I, 2J"()
where J_,(*) = J.(—*) = (—)"Ju().
*14, (a) Show that Kummer’s confluent hypergeometric function

a(a + 1) x?
M(absx) = 1+ 2x +
b b(6 + 1) 2!

+
aa+1)---@tn)x"*
+

bb+1)---O+n)
x!

is a solution of the DE xF” + @ — x)F — aF = 0.


(b) Show that the preceding function is an entire function.

*15, (a) Show that u(r,0,a) = cos [r cos (0 — a@)] satisfies V7u + u =
=
0 for all a, and

hencesodoesitsaverage = f u(r,0;a)da = U(r)

(b) Provethat Jo(7) = = f . cos[rcosa] da = —f ° cos[rcosa] da

9 FIRST-ORDER NONLINEAR DEs

The Method of Undetermined Coefficients and the Method of Majorants can


also be applied to any normal analytic first-order DE. For any function F(x,y)
122 CHAPTER 4 Power Series Solutions

analytic near (0,0), consider the DE

(32)
®
dx
= F(x,y) = > ooney
y=0 &=0

The DE
y 1 + y* of Example 7 is one of the simplest nonlinear such DEs; we
refer the reader back to §4 for a preliminary discussion of how to solve this
particular DE
In this section we will explain how to solve a general DE of the form (32) by
the same method. Namely, we substitute into the DE (32) the formal power
series

(33) f@) Q,% + agx* + asx” +

assuming that we are looking for the solution of (32) satisfying the initial con-
dition y(0) = 0, which we can always do bya translation of coordinates. Accord-
ingly, setting

(33/) yf = a, + 2agx + 3agx? +

and substituting into (32), we obtain successively

a, = boo, 2a = bio + bo1@1


(34)
3ag = bo149 + boo + ba) + bo081

and so on. The expression on the right side of each of these equations is a poly-
nomial with positive integral coefficients. Equations (34) can be solved recur-
sively, giving the formulas

(10 + 5o0b01)
a= boo, an
=
2

(boy + Bi1bo0 + Fo2b00°) +


(iobo1 + So0b01°)
3 6

and so on. When we substitute the series (33) for y into the series (32) for F(x,y)
the coefficient of x” is a sum of products of factors b, (with j + k < h) times
polynomials of degree k in the a, obtained by raising the series (33) to the Ath
power. The coefficient of x" on the left side of (32) is, however, (h + 1)a,,1, by
(33’). Equating coefficients of like powers of x, we have, therefore,

(34’) (h + Vaasi = gulboo: Don’by 0> bp ls Di03 Q15 ’ a)

where the coefficients of q, are positive integers. Substituting for a,, >» &,
already available formulas, we obtain

(34”) Bn+l Prlboo, Bon’ bio, Dip 1» Dno)


9 First-Order Nonlinear DEs 123

The polynomial functions p, have positive, rational numbers as coefficients. They


are the same, no matter which function F(x,y) is used in (32).
Solving the resulting equation at x = 0 for a,,, = y"*"/(n + 1)!, we get the
following result.

THEOREM 4. There exists a power series (33) which formally satisfies any analytic
first-order DE (32). The coefficients of this formal power series are polynomial functions
of the bj, with positive rational coefficients.

The preceding formulas can also be obtained in another way. Let y = f(x) be
the graph in the (x,y)-plane of any solution of the DE y’ = F(x,y), and let u(x,y)
be any analytic function in a domain containing this graph. Differentiating with
respect to x along the graph, we get the formula

du Ou ou _ du du
ay
—_—_—a=

dx ax ( N dy dx Ox dy

This formula can be differentiated repeatedly, giving the operator identity

(35)
a
= (2 + F(x,y)2)
dx”

Applying this identity to the function F(x,y) = y’, we get, in succession,

y’ = F, + FF, y” = Fy + 2FFy + F°F, + FF, + FF;

and so on. The general formula is

d”*ly 0
+F 2) [F(x.y)]
-(
(a+)
(35’)
—_—

J dxt*!
Ox

The right side of (35’), evaluated at x = y = 0, is a polynomial in the variables


b, with positive integers as coefficients. This is because the operations used in
evaluating this expression are addition, multiplication, and differentiation.

EXERCISES E

In Exs. 1-7, calculate the first four nonzero terms of the power series expansion of the
solutions of the DE indicated, for the initial value (0) = 0.
9, y =] + xy?
yuxty
y = glx) = Ldyx* 4. y = gy) = xd,9"
yf = xy? + yx? +1 6. ¥ =1+%

y = cos
Vy +1
Calculate explicitly the polynomial p, = a, of Theorem 4.

Compute the first five polynomials p,, of Theorem 4 when F(x,y) = 5(x)y + c(x) in
(30).
124 CHAPTER 4 Power Series Solutions

10. Apply the same question for the Riccati DE y’ + y? = b(x)y + e(x).
11. Show that the DE y’ = x” + y” has a solution of the form LP a,x**~! with all a, > 0.
For a, = 1, compute ag, as, a4.

12. (a) From the DE y’ = 1 + xy, prove that the coefficient a,,, in the expansion tan x
= La,x", ag = 0, satisfies (n + 1)a,4, = ORZ} aa,-1. [Hint: Differentiate y* using
the binomial expansion of (uv)™.]
(b) Compute the first five nonzero coefficients, and compare with those obtained by
solving for y recursively from x = y — y°/3 + 9°/5 — ..., the series for x =
=

arctan y.

13 Show that if »’ = F(x,y), where F € @°, then

Ww
y
=
=

ee + 39Fay +39?Fey + ¥?Fy +39 Fy + 3y’y"Fy + y"F,

14 For the DE y’ = 2y/x and the initial condition y(1) = 1, calculate the first four terms
of the Taylor series of the solution.

10 RADIUS OF CONVERGENCE

The DE y = 1 + »y® of Example 7 shows that the radius of convergence of


power series solutions of a nonlinear DE y’ = F(x,y) can be much less than that
of the function F(x,y). For, the radius of convergence of the solution tan (x +
c) of the DE y’ = 1 + y? which satisfies y(0) = tan c = + is only 1/2 —c, the
distance to the nearest singular point of the solution. This can be made arbi-
trarily small by making y large enough, even though the radius of convergence
of F(x,y) = 1 + 9? is infinite.
The preceding situation is typical of nonlinear DEs and shows that we cannot
hope to establish an existence theorem for nonlinear DEs as strong as Theorem
3. The contrast with the situation for nonlinear DEs is further illustrated by the
Riccati equation of Ch. 2, §5, (*).

Example 5. The Riccati equation

du/dx —v? — p(x)v — q(x)


=

(36) =

is satisfied by the ratio v(x) = u’(x)/u(x), if u(x) is any nontrivial solution of the
second-order linear DE (4). Conversely, if v(x) is any solution of the Riccati DE
(36), then the function u
=
=

exp (Ju(x) dx) satisfies the linear DE (4).

Now, let p, g be analytic and a, ¢ given constants. Using Theorem 2 to find a


solution u of (4) satisfying the initial conditions u(a) = 1 and u’(a) = c, we obtain
an analytic solution v = u/’u of the Riccati equation (36) which satisfies v(a) =
c. This gives a local existence theorem for the nonlinear Riccati DE (36).
However, this solution becomes infinite when u = 0; hence the radius of
convergence of its power series can be made arbitrarily small by choosing c suf-
ficiently large. Also, the location of the singular points of solutions of (36) is
variable, depending on the zeros of u.
10 Radius of Convergence 125

The Riccati equation also serves to illustrate Theorem 4. A simple computa-


tion gives, for v = LY a,x", the formula

A-1

v? = atx? + Qa,agx® + (a3 + Qajas)x* + +--+ X 4,a,—.x" ++ +-

Substituting back into (36), we get the recursion relation

h-1 h

(h + Vanes = — >> ay. — >> apr —


2=] a=]

This is, of course, just the special case of formula (34’) corresponding to setting
F(x,y) = —y? — XP party — LP yx’
We shall now return to the general case. Let F be any function of x and y
analytic in some neighborhood of (0,0). This means that F can be expanded into
a double power series

(37) F(x,9) = bog + (box + Bory) + (beqx® + byxy + Bogy’) + + -

where b, are given real numbers, and the series is convergent for sufficiently
small x and y. We shall show that the series (33) referred to in Theorem 4, and
defined by formulas (34)-(34’)-(34”) has a positive radius of convergence.

Analytic Functions of Two Variables. To prove this, we shall need a few


facts about analytic functions of two variables and the convergence of double
power series like (37). The terms of any absolutely convergent series can be rear-
ranged in any order without destroying the convergence of the series or chang-
ing the value of the sum. If we substitute into a double power series like (3’7)
convergent near (0,0) any power series y = Djao a,x" having a positive radius of
convergence, we will obtain an analytic function F(x, f(x)) which itself has a pos-
itive radius of convergence.
Finally, let the double power series (37) be convergent at (H,K), where
H > 0 and K > 0. Then the terms of the series £),H°K* are bounded in
magnitude by some finite constant M = max |b,|H’K*. This gives the bound

|b,| < M/HK*, M< +o,

to the terms of the series (37). Comparing with the double geometric series men-
tioned in §6

M
“o> ez
(38) G(x,y) =
J j=0 k=0 (
HK*

(
1-—

It
126 CHAPTER 4 Power Series Solutions

and applying the Comparison Test, we see that the series (37) is absolutely con-
vergent in the open rectangle |x| < H, |y| < K, and can be differentiated
there, term-by-term, any number of times.
The preceding remarks have the following immediate consequence.

COROLLARY. [If the power series (33) of Theorem 4 has a positive radius of con-
vergence, then the function which it defines is an analytic solution of the DE (30) for
the initial condition (0) = 0.

11* METHOD OF MAJORANTS, II

We will now complete the proof of an existence theorem for analytic (normal)
first-order DEs by showing that the series (33) has a positive radius of conver-
gence. This is again shown by the Method of Majorants, which we now extend
to functions of two variables.
Consider the power series

ys: GX — Ax?
+ age + ---

F: boo _ bi9x + boy + Boox? + by xy +e

as infinite arrays of real or complex numbers, irrespective of any questions of


convergence. Such power series can be added, subtracted, and multiplied using
Cauchy’s product formula (18’) algebraically as infinite polynomials, without
ever being evaluated as functions of x and y. Such expressions are called formal
power series. It is possible to substitute one formal power series into another;
after rearranging terms, another formal power series is obtained. Two formal
power series are considered identical when all their coefficients coincide.
Comparing F with the formal power series

G: Coo + Ci0% + Cory + Cop” + C11xy + se ey

one says that the formal power series G majorizes the formal power series F when

[Dal S Cp for all j,k = 0,1,2,-°-.

In symbols, one writes

(39) F<G.

This implies that all coefficients c,, are nonnegative.


The following lemma is immediate.

LEMMA 1. Let F, G, H be any three formal power series. Then F < Gand G &
F imply F =G, and F < Gand G « H imply F « H.
11 Method of Majorants, II 127

It is not true, however, that F « F if F has any negative coefficient.


The relation (39) of majorization between formal power series is useful in
estimating the radius of convergence of such series. From the Comparison Test
for convergence, we obtain the following directly.

LEMMA 2. If F and G are formal power series and if F < G, then F converges
absolutely at any point (x,y) if G converges at (|x|, |y|).

The crucial result for the proof of convergence of the formal power series
(33), obtained from Theorem 4, is the following.

LEMMA 3. Let F < G, and let fand g be the formal power series (without constant
terms) obtained by solving y’ = F (x,y) and y’ = G (x,y) formally, as in Theorem 4,
for the initial condition y(0) = 0. Then g majorizes f (that is, f < g).

Proof. The polynomials p, in Theorem 4 have nonnegative coefficients. It fol-


lows that

lan4i1 = |Prlboo» B10, B01, LS PrlCoo. C10 Cor, )

for all h; hence, the absolute value of each coefficient a, is less than or equal to
the corresponding coefficient of the formal power series g, q.e.d.

It is now a straightforward matter to prove our main result.

THEOREM 5. Let F (x,y) be analytic in the closed rectangle |x| =< H, |y| = K,
where H and K are positive. Then the formal power series solution (33) of the DE (32)
has a positive radius of convergence.

Proof. The power series for the function F is convergent at (H,K); as in §9,
it follows that for some finite M = max |b,H'K"|,

|b,A’K*| < M, whence |b, = RE

That is, the formal power series F is majorized by the double geometric series
(38):

cw

G(x) = u /{( -3) ( ~ J l a

K
=
=

>
pk=0
HK
xy
k
.

This series is the product of two geometric series, each absolutely convergent if
|x| <H, |y| < K. Therefore, it is also absolutely convergent in this rectangular
domain, and defines an analytic function there.
Furthermore, the DE y’ = G(x,y) can be solved in closed form by separation
128 CHAPTER 4 Power Series Solutions

of variables. The solution satisfying (0) = 0 is

(40) y = K(1 ~ V1 + (2MA/K) In (1 — (x/M))),

where the principal values of the logarithmic and square root functions are
taken, corresponding to the usual expansions of the functions 1+
¢ and
In (1 + 4 in power series with center at ¢ = 0. The radius of convergence is
given by the equation (2MH/K) In [1 ~— x/H] = —1, or

(41) R= HU — e“ KM)

since the binomial series for the radicand in (40) converges so long as
(2MH/K)|In [1 — («/H)]| < 1. This completes the proof.
By Theorem 6 of Ch. 1, whose hypotheses are satisfied since F is continuously
differentiable, the solution satisfying the initial condition f(0) is unique. This
proves the following result.

COROLLARY. Every solution of the analytic DE (30) ts analytic. The solution


Satisfying the initial condition f(0) = 0 is unique, and given by the power series (33).

12* COMPLEX SOLUTIONS

Up to now, we have been assuming tacitly that all variables x, y, u, etc.


referred to were real. However, the discussion in this chapter also applies, with
very minor changes to complex power series. In particular (see Theorems 2-3),
any solution of the DE (4) obtained by power series methods defined a complex
analytic function within a circle of convergence |z| < C whose radius is the
smaller of the radii of convergence, of the series in (5). For instance, from the
DE w” + w = 0, we obtain the complex sine and cosine functions sin z = z —
23/31 + 25/5! ~ ,and cosz = 1 — 27/2! + 24/4! ~ , where as usual
w
_

=
u + iv and
z

=
x + iy refer to dependent and independent complex
variables.
Similarly, let
x

(42) dw/dz = F(z,w) = x b,dwt

be any analytic first-order DE, whose right side is a complexanalytic function.


(For F to be analytic, it is sufficient for the function F to be differentiable, since
differentiability implies analyticity in any complex domain.}) Then all the for-
mulas of §9 remain valid; so do the lemmas concerning majorants of §10.
For complex z, w, the domain D: |z| = H, |w| = K of Theorem5 is not a

+ Ahlfors, pp. 24-25; Hille, pp. 72, 196.


12 Complex Solutions 129

rectangle, but the four-dimensional product of two discs. Though this domain
is harder to visualize than a rectangle, it has the advantage that Cauchy’s integral
formulas hold on it: the constant M in (41) is given explicitly byt

M = sup |Fi(z,w)|.

These remarks cover the extension of Theorem 5 to complex DEs.

Dependence on Initial Value. We now consider the dependence of the


solutions of real or complex analytic first-order DEs (32) or (42) on their initial
values. We will prove that this dependence is analytic; it will follow that the solu-
tion curves of any real, normal first-order DE form a normal curve family.

THEOREM 6. Let f(x,c) be the solution of the first-order analytic DE y’ = F(x,y)


which satisfies the initial condition f(a) = c. Then f(x, ¢) is an analytic function of the
independent variable x and the initial value c.

Proof. By a translation of coordinates, we can reduce to the case a = 0, and


consider f(x,c) in the neighborhood of (0,0). But clearly for small fixed c, the
function n{x,c) = f(x,c) — cis the solution of

(43) dn/dx = F,(x, 9) = F(x, ¢ + 7) = XLby(c + n)* = X60,

satisfying the initial condition n(0) = 0. Here each By = B,(c) is an analytic


function of c, the coefficients of whose power series expansion in ¢ are positive
multiples (}) b,, of some b,, By Theorems 4 and 5, the solution n(x, ¢) of (43) is
Lieo &,x”, where each coefficient a, = «,(¢) is a polynomial in the 6,, with positive
rational coefficients. Hence, the doubly infinite formal power series

obtained has coefficients ¥,,,, which are polynomials in the b,, (because each b,,
only affects the 6, with k = h) with positive coefficients. This series formally
satisfies (43).
As in the proof of Theorem 5, this series is majorized by the power series
solution without constant term of the DE

dz/dx G(x, ¢ + 2) = M/[Q — x/M)\(1 — © + 2)/K)).


=
=

Integrating, we get

(44)
1-—
z——
2K
= -—-MHI|n
(3)
t See, for example, Picard, Vol. 2, p. 259.
130 CHAPTER 4 Power Series Solutions

Elementary algebraic manipulation now gives

(@+ 9—KI = @~wf + 2mHKIn(1 2).


The branch of the function z(x, c) which makes z(0, 0) = 0 is given by the
formula

Zz
=
=

K-~c- {« —K)?+ 2MHKIn ( ~ x\


The resulting function (as one sees by expanding the functions V1 + u and In
[1 — (x/H)] into power series) is analytic for x and ¢ sufficiently small. That is,
the power series solution of (44), g(x, c) = Uéq,c"x" (in which all 6. 2 0 as
shown above) is also convergent. But this majorizes f(x,c), whose power series
expansion is therefore also convergent, completing the proof.

EXERCISES F

For the power series expansion of each function defined in Exs. 1-4, determine the
domain of convergence:

1. Jol + y). 2. Vi-» —»).


3 1/Ul — Ge? + 9%). 4. 1/(1 + &? + 99).
5 Show that the solution of y’ = G(x, y), where G is the geometric series (38), is the
function (40).

6 Find the value of %_9 x’y'/(HK’).


7 Find the value of D74-0phx yt/(EPR’).
In Exs. 8-11, establish the properties of the relation “<” [P = P(x, y), Q = Q(x, 9),
p = p(x), and q = q(x) are formal power series].

8 Iff <GandP<«Q,thenF+P<G+Q,.

9 If F < Gand P « Q, then FP < GQ.

10 IfF < Gand p « gq, then F[x, p(x)] « G[x, q(x)].

11 If F < G, then 0F/dx « dG/dx (interpret the derivatives formally). Is the converse
true?

12 Obtain even and odd power series solutions of the DE w” + iw = 0, and interpret
the solutions.

13 Obtain an even power series solution of the DE B’ + 27'B’ + iB = 0, and show


that it defines an entire function.

14 Do the results of Exs. 8~11 hold for complex powerseries? Justify your answer.
CHAPTER 5

PLANE AUTONOMOUS
SYSTEMS

1 AUTONOMOUS SYSTEMS

This and the next three chapters will be concerned with systems of first-order
ordinary DEs in normal form. By this is meant a set of equations

dx,
= Xy(x), . » Xp} 0)
dt

(1)
ax,
= X01, +» Xq; b)
di
i

The X, are given functions of the n + 1 real variables x, : Xn) t. We want to


find solutions of (1), that is, sets of n functions x,(é), . , X,(t) of class @! which
satisfy (1). We shall assume the functions X, to be continuous and real-valued in
a given region R of the (n + 1)-dimensional space of the independent variables
X15 Xo5 Xn,
t

The simplicity of the concept of a first-order normal system becomes appar-


ent when (1) is written in vector notation. A vector is an n-tuple x = (x, Xn)
of real (or complex) numbers. Thus, the functions X,(x,, X,3) = X,(x, t), in
(1) define X = (X, . »X,,) aS a vector-valued function of the vector variable x
and the real variable ¢. Even more simply, we can define a vectorfield as a vector-
valued function X(x) of the vector variable x (ranging over a suitably defined
domain in ZR"), visualizing this as attaching a small arrow X(x) to each point x.
In vector notation, the system (1) assumes the very concise form

dx
(2) — = X(x,Z)
d

A solution of (2) is a vector-valued function x(é) of a real (scalar) variable ¢, such


that x’()) = X(x(i),é). The analogy between (2) and the normal first-order DE
y = F(x,y) studied in Ch. 1 is obvious; the only difference is that the dependent
variable in (2) is a vector and not a number (or “‘scalar’’), Hence, one can call
(2) a normal first-order vector DE.
131
132 CHAPTER 5 Plane Autonomous Systems

A solution of a normal system (1) or (2), defined by the functions x,(d),


»x,(t), can be visualized as a curve in the (n + 1)-dimensional region R. When
n = 1, this specializes to the concept of a solution curve defined in Ch. 1; when
n = 2, it is a curve in (*),x9,)-space. For this reason, the curve in R defined by
any solution of (1) is called a solution curve of (1).
Chapter 6 will contain proofs of existence, uniqueness, continuity, and dif-
ferentiability theorems for solutions of first-order systems (1). In the present
chapter, attention will be confined to autonomous first-order systems. By defini-
tion, these are systems of the form

dx,
(3) =~ = X,(x1, Xn), 2=1,...,n
d

The characteristic property of autonomous systems is the fact that the functions
X, do not depend on the independent variable ¢. When this variable is thought
of as representing time, autonomous systems are thus time-independent or
stationary.
In vector notation, the autonomous system (3) reduces to

dx
Co) — = X(x)
d

To every autonomous system (3) there thus corresponds a unique vector field
X(x) in Euclidean n-space, and conversely. Throughout this chapter, we will con-
sider only vector fields that are of class @', and hence satisfy a Lipschitz condi-
tion in every compact domain. As will be shown in Chapter 6, this implies that
one and only one solution x(é,c) of the autonomous system (3) satisfies the initial
condition x(0)
~
=
c, and that this solution depends continuously on c.
When n = 3, the autonomous system (3) can be imagined as representing the
steady flow of a fluid in space: at each point x in a region of space, the vector
X(x) expresses the velocity of the fluid at that point in magnitude and direction.
The flow is called steady because its velocity depends only on position and does
not vary with time. The solution x(é,c) of the autonomous system (3) for the
initial “value” ¢ then has a simple physical interpretation: it is the trajectory (path,
orbit, or streamline) of a moving fluid particle, whose position (initially at c) is
given as a function of the time ¢.
When the preceding path x(i,c) is considered as a set of points (that is, as a
geometric curve), without reference to its parametric representation, it is also
called a solution curve of the autonomous system (3), or of the associated vector
field X(x). If x(t,c) is a solution of the autonomous system (3), then so is
x(t + a,c) for any constant a; this can also be interpreted as the path of a particle
that passed through the point ¢ at time ¢ = a.

Plane Autonomous Systems. This chapter will be largely concerned with


the case n 2 of (3). In this case, we can omit subscripts and rewrite (3) in
=
=
1 Autonomous Systems 133

simpler notation as

dx 2
=
(x,9); = Y(x,9)
— =

dt d

The connection of such a “plane autonomous system” with the first-order DE


y = X(x,y)/Y(x,y) will be discussed in §2 below.

Example 1. The solutions of the autonomous system

dx ]
(4) =

dt 3
d

are evidently x
=
=

ce™, y = coe™ where ¢),Cy are arbitrary constants. The corre-


sponding solution curves are the loci y” = kx”. Figure 5.1 depicts sample curves
for the case m =
=

2,7
=
=

Note that the solution curves (“trajectories”) of any autonomous system are
endowed with a natural sense or orientation, the direction of increasing ¢. This is
indicated in drawings of solution curves by marking on them arrowheads point-
ing in this direction. See Fig. 5.1, which depicts sample (oriented) solution
curves of the system (4).
In Fig. 5.1, the origin (0,0) is evidently a very special point: integral curves
emanate from it both horizontally and vertically. This is possible only because
the vector field (X,Y) = (mx,ny) reduces there to the null vector 0 = (0,0), whose
direction is indeterminate. Such points are of particular importance for the
study of autonomous systems; they are called critical points.

DEFINITION. A point x = (x, »X,) where all the functions X, are equal
to zero is called a critical point of the autonomous system (3) and of the associ-
ated vector field X(x).

Ifx = c isa critical point of (3), then the functions x,(!) = ¢c, defineatrivial
solution x(é)
=
=
c of (3), which describes not a curve but just a point. In the

Figure 5.1 Integral curves of # = 2x, ¥ = 3y.


134 CHAPTER 5 Plane Autonomous Systems

terminology of hydrodynamics,c is called a stagnation point of the velocity field


X(x).
Every normal system (1) of (n — 1) first-order DEs

dx,/dt = X,(x1, » Xn-15 ‘), i=1,. ~na-l

is equivalent to an autonomous system in n variables. To see this, introduce an


additional variable x,, = t, and rewrite (1) as

dx,/adt = X,(%1, » Xn)s 1, ,n—1, dx,/dt = 1


=

2 =

Evidently, the system so constructed has no critical points.

2 PLANE AUTONOMOUS SYSTEMS

When n 2 in (3), it is convenient to write (3) without subscripts as


=
=

dx
(5) —_—

= X(x,y),
—_

= Y(x,9)
dt di

We then speak of a plane autonomous system. The plane autonomous system (5)
is evidently equivalent to the first-order DE

dy _ Y(x,9)
(5’)
dx X(x,9)

wherever X(x,y) # 0. The main advantage of the parametric form (5) is that
points X(x,y) = 0 of vertical tangency of the solutions of the DE (5’) are no
longer singular points of the corresponding plane autonomous system (5). Like-
wise, the solution curves of (5) are just the integral curves of the quasilinear DE

(5) Y(x,y) = X(x,9)y’

and the two have the same critical points.


The advantage of the parametric viewpoint is apparent in the following exam-
ple, already discussed in Ch. 1, §2.

Example 2. Consider the autonomous system

dx dy
(6) —_— = —
= %

da di

whose solutions are the function-pairs x = r cos(é + ¢), y = rsin(¢ + ¢), where
y and ¢ are arbitrary constants. The graphs of these solutions are concentric
2 Plane Autonomous Systems 135

circles, with center at the origin. The solutions of the corresponding first-order
DE

(6/) 9. _=
x

dx J

are the functions y = + Wr? — x®, which are defined only for |x| < |r|.
Whereas the function —x/y is undefined where y = 0, the functions X(x,y) =
—y and Y(x,y) = x in the system (6) are defined throughout the plane. This gives
the system (6) an obvious advantage over the DE (6’).
Referring to the definition of Ch. 1, §12, we see that the circles x? + yx =f
form a regular} curve family in the ‘‘punctured” xy-plane, the critical point of
(6) at the origin (0,0) being deleted. In Ch. 6, §11, it will be shown that this is
true of plane autonomous systems in general.
Plane autonomous systems have the following interesting relation to level
curves (cf. Ch. 1, §5).

THEOREM 1. For any continuously differentiable function V(x,y), each integral


curve of the plane autonomous system

dx ov
(7) =. 5)
dy — —

(x,y)
dt dt

lies on some level curveVix,y) = constant.

The proof is immediate: along any solution curve we have

dV_aV dx AV dy dV aV av av
=. eee
ee ee

dt Ox dt dy dt ax oy ov Ox

and so V[x(/),y(] = constant. Observe that the associated steady flow is diver-
gence-free or area conserving, because

ov -9d Vv _ &Vv a
aiv(Oy’ —

Ox ~ Axdy 7 OyOx ~

In fluid mechanics, such a steady flow (7) is called incompressible, and Vis called
its stream function.
The representation (7) also reveals the level curves of Vas the solution curves
of dx,/dt = 0V/dx,,
7 = 1,. »n—that is, in vector notation, of dx/dt = grad

+ Note that this regular curve family is not normal in the plane, whereas the graphs of the function
=
— x* form a normal curve family in the upper half-plane y > 0.
136 CHAPTER 5 Plane Autonomous Systems

The main advantage of the parametric representation (7) over the normal
form

ay __ OV/dx
(7/)
dx OV/dy

considered in Ch. 1, §6, is the following. Whereas the solution curves of (7’)
terminate wherever 0V/dy vanishes, those of (7) terminate only where the func-
tion Vhasacritical point (maximum, minimum, or saddle-point) in the sense that
grad V = 0. This happens exactly where the autonomous system (7) has critical
points.
If we set V = ~(x? + °)/2 in Theorem 1, we get the system (6) of Example
2, having circular streamlines. 1f y(x,y) is nonvanishing, then the system dx/di
= —yp, dy/dt = xp also has circles for solution curves, and we can construct a
wide variety of autonomous systems having the same solution curves in this way,
as has been noted before.
Another illustration of Theorem 1 is obtained by setting

- (x? + 9°) - r(cos°@ + sin®6)


V(x,9)
xy cos @ sin 6

and letting u(x,y) = x*y*. Evaluating (7), we get the following example.

Example 3. The plane autonomous system

d. y
(8) —_ =
=

x(2y> — x), -SE


(2x* — 9’)
di di

has as solution curves the curves x° + y? — 3cxy = 0, where c is an arbitrary


constant. Each such solution curve is a folium of Descartes, as in Figure 5.2.
The coordinate axes are also solution curves. The origin is the only critical point
of (8); correspondingly, the folia of Descartes in Figure 5.2 form with the axes
a curve family that is regular, except at the origin.
Note that the curves of Figure 5.2 form a family of similar curves, all similar
to x? + y° = 3xy undera transformation x — kx, y > ky, t > t/c’, where kis a
constant. The reason is that the DE

dy - — (2x? — y) _

dx x(2y> — x9) ()
is homogeneous of degree zero (see the end of Ch. 1, §7).

3 THE PHASE PLANE, II

We have already defined the ‘‘phase plane” in Ch. 2, $7, as a way of visual-
izing the behavior of solutions of (normal) second-order DEs ¥ = F(x,x). By
3 The Phase Plane, II 137

ye z

Figure 5.2 Folia of Descartes.

treating x u as a second dependent variable, any such DE is transformed into


=
=

a (first-order) “plane autonomous system” of the special form x u, u


= =
= =

F(x,u).
Plane autonomous systems of this special form arise naturally from dynamical
systems with one degree of freedom. Let a particle be constrained to move on a
straight line (or other curve) and let its acceleration x be determined by New-
ton’s Second Law of Motion as a function of its instantaneous position x and
velocity %. Then

(9) x = F(x,%)

where we have adopted Newton’s notation, representing time-derivatives by


dots placed over the variable differentiated.
It is usual in dynamics to denote x dx/dit by the letter v, and to call the
_
=

xv-plane the phase plane. Since the variables x and mv are conjugate position and
momentum variables, the phase plane is a special instance of the more general
concept of phase space in classical dynamics.
Since (9) is time-independent, it is called an autonomous second-order DE. In
the xv-plane, the second-order autonomous DE (9) is equivalent to the first-
order plane autonomous system

dx _ du dv
(9) — =v— = F(x,v)
a di d.

The integral curves of this autonomous system in the Poincaré phase plane
depict graphically the types of motions determined by the DE (9). Note that the
solution curves point to the right, to the left, or are vertical according as x > 0
(upper half-plane), x < 0 (ower half-plane), or x 0 (x-axis). This is because
=
=

x is increasing, decreasing, or stationary in these three cases, respectively.


138 CHAPTER 5 Plane Autonomous Systems

Example 4. Consider the damped linear oscillator defined by the second-order


linear DE with constant coefficients

(10) K+ px + qx 0,
=
=

p, q constant

discussed in Ch. 2, §2. The associated autonomous system in the Poincaré phase
plane is

dx dv
(10’) —_>
a= —
po — qx
de” di

The direction field of this system is easily plotted for any p, ¢

For example, let p = 1, q = 1. The resulting DE ¥ — x + x 0 describes


=
=

the free oscillations of a negatively damped particle in an attractive force-field.


Sample solution curves in the Poincaré phase plane are sketched in Figure 5.3.
Or again, let p = 4, q = 3. The resulting DE ¥ + 4% + 3x = 0 describes
heavily damped stable motion; the functions x
=
=
e™ constitute a
e ‘and x =
=

basis of solutions (the exponential substitution of Ch. 3, §1 gives A? + 44 + 3


= 0). The graphs of these solutions in the (x,%)-plane are the radii whose slopes
—1 and —3 are the roots of this polynomial; they represent solutions of the
first-order DEs x —xand% = —3x.
=>
=

Example 5. The DE of a simple pendulum of length 2Z is

a0 pas
(11)
— =

Rk’ sin 6,
dat? £

Figure 5.3. Damped linear oscillations.


3 The Phase Plane, II 139

Here 6 is the (counterclockwise) angle made by the pendulum with the vertical.
The corresponding plane autonomous system in the phase plane (the v@-plane)
is

(114 6 =v, b = —k’sin


0

Since the function sin @ is periodic, the trajectories form a periodic pattern in
the sense that, if [v(),6(/)] is a solution of (114, so is [v(é),0() + 22]. The case
k? =1 is sketched in Figure 5.4.
The solutions of (11’) correspond to the states of constant energy E: v? — 2
cos 6 = 2E (when k? 1). There are two “critical points” in the v6-plane: the
=
=

points (0,0) and (0,7), corresponding to stable and unstable equilibrium, respec-
tively. Near the “vortex point” (0,0), the pendulum oscillates back and forth;
the corresponding trajectories are closed curves, roughly elliptical in shape.
The point (0,7) is a saddle-point; the trajectories v? = 2(1 + cos 6) or v =
=

+2 cos(@/2) that terminate there are called ‘‘separatrices,”’ because they sepa-
rate the closed trajectories from the wavy trajectories v? = 2(E + cos 6) with E
> 1, which correspond to whirling the pendulum in circles.
When the amplitude is small, the DE (11) can be approximated well by
6+ 0
= 0. In this case, the period of oscillation is independent of the amplitude. For
exact solutions, the period increases with the amplitude. We will discuss this
phenomenon in §10 below.

EXERCISES A

1. Find and describe geometrically the solution curves of the following vector fields:
(a) (x, » 2) (b) (ax, by, cz) (c) (y, x, 1) (d) (y, 2, x)
2. Show that the solution curves of the autonomous system

dx d
S~=e-1, =<

dt di

are the curves y = c(e* — 1), and the y-axis x = 0.

LL

Figure 5.4 Simple pendulum.


140 CHAPTER 5 Plane Autonomous Systems

3. Show that the functions xyz and x* + y* are integrals of the system

dx y
dt
= xy", a= —

“vy, & = us —yh


dt

Describe the loci xyz =


=
constant, and sketch typical solution curves

4. (a) Show that the orthogonal trajectories of the level curves of V = x/(x? + 9°) are
another family of circles. Draw a sketch that displays both families of circles
(b) Same question for V = [(x — 1)? + 9*J/[(@ + 1)? + 97]
The gradient field of a scalar function V(x) is defined as the vector field

grad V OV/Ox,, 0V/dx,)

The gradient lines of V are the solution curves of the autonomous system dx,/dt =
=

OV/dx

In Exs. 5—7, find the gradient lines of the following functions

5 6. Vex ty
— 2z 7, V=In[@&e — oa? + ¥/[(x + o)? + 71)
8 Show that a function (x), , X,) of class @! is an integral of the system (3) if and
only if it satisfies the partial DE X, 0¢/dx, + X,, 86/8x, =
=

Show that if 0X/dx = dy/dy and 6X/dy = —dy/dx, the plane autonomous system
(5) is the real form of a single first-order complex analytic DE, and conversely

*10 Let e;, ,» é, and a, a, be real constants, and let

= [@ — a)? + + 2717?

Show that, if V = Le2/7, the functions

y > é, arccos a,
= (x
)/ and 6 = arctan (z/y)

are integrals of %
=
=
dV/dx, OV/dy, 0V/dz. Express the integral curves as
intersections of the surfaces defined by the preceding equations

Exercises 11-14 derive some of the main properties of elliptic functions by the methods
of Ch. 4, §7, and give an application to Example 5
11. The elliptic functions u = sn t, v =
=
cnt, w = dnt, may be defined as the solutions
of the autonomous system

du dv
(*) —— = vw, —wu, — = kup
dt

having the initial values u(0) = 0, v(0) = w(0)


(a) Establish the identities (sn 0)? + (en t)? = 1, k? (sn )? + (dn 0)”
(b) Using (a), show that the three functions specified are defined and analytic for
all real ¢
(c) Expand their solutions in power series through terms in &
*19. (a) Show that, in Ex. 11, if ® < 1, the function cn t vanishes at

t=K= f FOS
4 Linear Autonomous Systems 141

(b) Prove the following addition formulas, valid with k”’ = V1 — Re: sn(t + K) =
cn t/dn t, en(t + K) = —Rk’ sni/dnt, dnt + K) = k’/dnt. [Hint: Show that the
vector-valued function v/w, —k’u/w, k’/w satisfies (*), and that this vector
reduces ati = K to (1,0,k’.]
(c) Prove that sn(—t) = —sn i, cn(—1t) = ent, sn(t + 2K) = —snt, en(t + 2K) =
—cnt, dn(t + 2K) = dnt.
(d) Show that sn ¢ and cn ¢ have infinitely many zeros, and that the zeros of sn t
separate those of cn t.

13 (a) From the assumptions of Ex. 11, show that the function u = sn ¢ satisfies the
second-order DE

(**) a+ (1 + kyu — 2k = 0

(b) Infer from (**) that ad? + (1 + ku? — ku* = constant.


(c) Sketch the integral curves of the DE (**) in the phase plane, marking the special
curve (sn t, cn t, dn #) and any critical points.
(d) Determine the nature of the other critical points, if any.
14 (a) Show that a one-parameter family of solutions of § + &? sin 6 = 0 is given by
sin (6/2) = sin (a@/2)sn[k(t — to)], where @ is the amplitude of oscillation.
(b) Show that6 = 2k sin (a/2)cn[k(t — t)].

4 LINEAR AUTONOMOUS SYSTEMS

An autonomous system (3) is called linear when all the functions X, are linear
homogeneous functions of the x, so that

d
(12) = ay + ++ + AyXy; i=1, ,n
d

Hence, a linear autonomous system is just another name for a (homogeneous)


linear system of DEs with constant coefficients. Such a system is determined
by the square matrix A = |la y,|| of its coefficients, and its vector field satisfies
.

X(x) = Ax

Initial Value Problems. For any autonomous system x’(é) = X(x), the “‘ini-
tial value problem” consists in determining, for each ¢c in the domain of the
vector field X(x), the solution x(f) of the DE that satisfies the ‘‘initial condition”
x(0) = e. We will now show how to solve this problem for any linear plane auton-
omous system (i.e., in the case n = 2).
Any such system has the form

x
2.
(13) + by,
ax cx
+ dy
=
— —_
= =

d d

where a, b, c, d are constants. The coefficient matrix A = of constants is


d
nonsingular unless its determinant is ad — be =
=

0. The origin (0,0) is always a


142 CHAPTER 5 Plane Autonomous Systems

critical point of the system (13). Since the simultaneous linear equations ax +
by = cx + dy = 0 have no solution except x = y = 0 unless A is singular, we
see that the origin is the only critical point of the system (13), unless ad = bc
(the degenerate case |A| = 0)
To solve the initial value problem for the system (13), it is convenient to intro-
duce a new concept.

DEFINITION. The secular equation of (13) is

(14) ui — (a + du + (ad —beju 0


=
=

THEOREM 2. [f (x(é), y(¢)) is any solution of the plane autonomous system (13),
then x(t) and y(t) are solutions of the secular equation (14) of (13).

Proof. We shall prove that x(é) is a solution of (14); the proof for y(é) is the
same, replacing a with d and 6 with c. The first equation (13) implies x — ax =
by, which implies ¥ — ax = by. From the second equation it follows that

X ~— ax% = bex + bdy = bex + d(% + ax)

Transposing, we see that x(?) satisfies (14).


Conversely, the secular equation (14) can also be used to solve the linear sys-
tem (13) as follows. First, find a basis of solutions u(é) and u(é) of (14) by the
methods of Ch. 2. Then, ifb # 0 in (13), set

(15) x= bu and yru— au

The first equation of (13) will be automatically satisfied, whereas the second will
be equivalent to

by = cx +dy or bu — abu = beu + du — au

which holds by (14). In the same way, if6 = 0 but c ¥ 0 in (13), set x = u —
du and y = cu; (13) follows similarly. In both cases, a second solution can be
constructed from v(é).
In the remaining case that b = c = 0, the obvious formula

(15/ x = xe, y = yoo


solves (13) for any initial x(0) = x9 and (0) = yo, as in Example 1.

The preceding recipes are effective computationally. Thus, to solve the initial
value problem

x=x-—y, yuxty, x(0) = 1, 90) = 0

we can use (14) to obtain ¥ — 2% + 2x = 0. Since the roots of the characteristic


polynomial |A — AJ| = X® — 2d + 2 are \ =1 + i, the system has the general
4 Linear Autonomous Systems 143

solution

x = e(Acosit + Bsind) y = eA sint — Bcos t)

Moreover, the initial condition x(0) = 1 implies that A = 1, while (0) = x(0)
— x(0) implies that B = 0. The solution of the initial value problem stated is
therefore x
=
=

e’ cos t, y = e' sin t.

Systems in n Variables. Most of the preceding results have straightforward


generalizations to (homogeneous) linear autonomous systems (12) in n variables.
In vector and matrix notation, the system (12) simplifies to dx/di = Ax. We can
define its ‘‘secular equation,” for any matrix A, to be

(16) L{u) = cu — cu’ +: +u” = p,(D)[u] = 0

where pa(\) = |A ~ Al] = co + cA + - + - + A” is the characteristic polyno-


mial of the matrix A, and D = d/dt. We then have the following.

THEOREM 3. If x(é) is any solution of (12), then every component x,(i) of x()
satisfies the secular equation (16) of (12).

The proof of this result depends on theorems about matrices and so will be
deferred until Appendix A.

Eigensolutions. By an “eigensolution” of the (constant-coefficient) linear


autonomous system xX = Ax is meant a solution of the form x(f) = c(i)¢, where
¢ is a nonzero vector. Since this implies x =
=
c'()¢ = Ac(t)d, there follows Ad
= Ag with A = c’()/c(f), whence c(t) = Ke™. Conversely, if ¢ is any eigenvector
of the matrix A with eigenvalue i, then x() = ep is evidently an eigensolution,
since
X = Ax = Ax.
Eigensolutions (and generalized eigensolutions) of complex linear constant-
coefficient systems Z Cz provide a general canonical basis in the ‘solution

=

space.” But for real linear plane autonomous systems, they arise only when 0 is
a saddle-point. For example, since

—-3 2 2 2 —-3 2 1 1

( —2 2 N Je } wm [
1 1 —2 2 N }=(
2 2

the vector-valued functions ¢( (3) and y(t) = (3) form a basisof


-3 2
}e Theyclearlycorrespondtothe
eigensolutions of the system x’(é)
(

=

—2 2

invariant lines of the linear fractional DE y’ = (—2x + 2y)/(—3x + 2y) (cf.


Ch. 1, §7).
When the matrix A hasa basis of eigenvectors ¢,, there is an especially elegant
144 CHAPTER 5 Plane Autonomous Systems

way to solve the initial value problem x’({) = Ax. Namely, expand the initial data
x(0) = c into a linear combination of eigenvectors of A: c = La,¢,. Then the
solution of the system x’(t) = Ax for these initial data is

(17) x() = > ae",


j=l

where ), is the eigenvalue of ¢;.

5 LINEAR EQUIVALENCE

The secular equation (14) of a linear plane autonomous system (13) estab-
lishes a clear connection between its solutions and those of an associated (linear)
constant-coefficient second-order DE. As we shall now show, it also throws light on
the rough classification, made in Ch. 2, $7, into “focal,” “nodal” and ‘‘saddle”
points of the critical points of such DEs.
Consider first the case of focal points. Anticipating what will soon be proved,
we begin by considering system of the special form

(18) % = ax
— by, 3 = bx + ay

Substituting into (14), we find that its secular equation is

(18’) ui — 2at + (+ Wu =0

Here —2a is clearly arbitrary, while the discriminant A = 4a? — 4(a2 + 04) =
—46 can be any negative number. Hence (cf. Ch. 2, §7, Case A), all secular
equations of focal point type can be obtained from linear plane autonomous
systems of the special form (18).
In polar coordinates, on the other hand, one easily verifies that (18) reduces to

(19) tf = ar, 6=b, -— 2a


a=

Hence the orbits (trajectories) of (18) are equiangular spirals 0 = 6) + Dt,


r =e", yY = —2a/b, except in two degenerate cases:

(i) x
=
=

xe, y = yoe™ when b=0

(ii) ry cos(bt — 8B), y = 7 sin(bt — 8) when a=0


=

x =

In the first case, the origin is said to be a star point of (18); in the second, it is
said to be a vortex point of (18). It should be noted that these two “‘degenerate”’
cases (occurring when q = 0 resp. A = 0) were explicitly omitted in the discus-
sion of Ch. 2, §7.
Clearly the phase-plane representation of (18), which is

(19’) x= y, y = 2ax — (a? + B*)y


5 Linear Equivalence 145

is less attractive than (18); cf. Ex. 1. Yet the two are linearly equivalent in the
following sense.

DEFINITION. ‘Two first-order (linear homogeneous) autonomous systems,


x’() = Ax and u’(t) = Bu are called linearly equivalent when there exists a non-
singular matrix K such that B = KAK™', that is, when

(20) U, = kyxX, + bP Rin Xs i=lsscm

where as usual K denotes the matrix [£,].


The reason for this definition is that, if we write u = Kx and x = K7!u, then
x’({) = Ax is transformed into

dx
(21)
du
_ — = KAx = (KAK™u
di di

under the change of basis associated with the nonsingular matrix K. Thus, in alge-
braic language, linearly equivalent linear autonomous systems are associated
with similart matrices A and KAK™'.
Therefore, the reduction of linear autonomous systems to a standard simpli-
fied (or ‘“‘canonical”) form under linear equivalence amounts to reducing matri-
ces to canonical form under “‘similarity.” We will treat this problem here only
for linear plane autonomous systems

(22)
d
—_ =
ax +by, Om ox + dy

Its solution will throw considerable light onto the classification of critical points
of linear and nonlinear plane autonomous systems into those of focal, nodal,
and saddle-point type.

LEMMA. Linearly equivalent linear plane autonomous systems have the same sec-
ular equation.

Proof. This result follows immediately from (14’), (21), and general identities
of linear algebra. If B = KAK™', then?

|B — dll =
=
|KAK~! — dI| = |K(A — NK"!
|
=

|K]-|A — AZ| -|K7"] = JA —Al]

+ Birkhoff and MacLane, p. 264.

} Birkhoff and MacLane, p. 264. For reduction to diagonal form and Jordan canonical form, see

ibid., pp. 294, 354, For the companion matrix form| _ ___ |, see 1bid., p. 338.
p
146 CHAPTER 5 Plane Autonomous Systems

(The lemma is also a corollary of Theorem 3 unless there are two linearly inde-
pendent equations ui + pu + qu = 0satisfied by both components of all solu-
tions of dx/dt = Ax.)

THEOREM 4. Unlessa = dandb = c = 0, the linear plane autonomous system


(22) is linearly equivalent to the phase plane representation

du d
(23) v,
—_—_— =

qu
— po, =—(a@+d, q = (ad
—bc)
di dt

of its secular equation (14).

(22) Proof. Ifb¥ 0,letu = x,v = ax + by;thatis,letK= (1 0


b
} Then
reduces to

(24) u =v, b= ax + by = (a? + be)x + (ab + bd)y

The last expression in (24) is equal to

(a? + ad)x + (ab + bd)y — (ad — be)x = (a + dv — (ad — beju

by definition of u, v; hence (22) is equivalent to

a = v, b= (a + dv — (ad — boju = —pv — qu

that is, to (23). This shows, in particular, that

KAK"! = ( 0 1

~4q —?p

is the companion matrix of the secular equation of (22).


This proves Theorem 4 for the case b # 0. The case c # 0 can be treated in
the same way, letting u = y andv = cx + dy.
When 0 = c =0, let
u x + y,v = ax +dy; if a # d, we can set x = du

=

— v)/(d — a),y = (au — v)/(@ — 4). By (22) with b = c = 0, we have ud = % +


y = ax + dy = v. Similarly,

b=axt+djy=axtd 2.y

Comparing this with the expression

(a + d)v — (ad)u = [(a® + ad)x + (ad + d®)y — adx — ady]

we also verify (23) in this case.


5 Linear Equivalence 147

Exceptional Case. The case a = d, b = ¢ = 0 ofa scalar matrix,

(25) x = ax,
jy = ay

is genuinely exceptional. The secular equation of (25) is # — 2ax + ax =


=

0,
just as it is for the system

(25) x = ax, y =x tay

but (25) and (25’ are not linearly equivalent. Every component of every solution
of (25) satisfies 2 = au, but this is not true of the solution (e, te“) of (25/.
The exceptional case arises when the characteristic polynomial of the secular
equation (14) has equal roots, that is, when its discriminant A = p* ~ 4q =
(a — d)? + 4bc vanishes. This gives us the following corollary of Theorem 4.

COROLLARY. Unless the discriminant (a — d)* + 4bc vanishes, two linear plane
autonomous systems are linearly equivalent if and only if they have the same secular
equation.

Complete Classification. We now use Theorem 4 to provide a complete


classification of linear plane autonomous systems, giving for each type a simple
canonical form. This classification is based on a study of the possible root-pairs
dy,Ag of the characteristic equation (14), which we renumber as

(26) NW — @ta@ara+ (ad — be) =’ + prt+q=0

By Theorem 4, unless a = d and b = c = 0, it suffices to display one autonomous


system for each such root pair. We now enumerate the different possibilities,
which depend largely on the sign of the discriminant

(26/ A= p? — 4q = (a — d)? + Abc

of the characteristic equation (26). We begin with the case A # 0, g # 0 of


distinct nonzero roots A; # Ap.

A. Focal Points. Suppose A < 0, so that the characteristic equation (26) has
distinct complex roots \, = » + iv (v # 0). This is the case g = (uv? + v?) > 0
and 0 <= p® = 4y” < 4q of a harmonic oscillator. We choose the canonical form
(see the Corollary of Theorem 4 and Exs. B1—B3)

(26a) dx/dt = px — vy, dy/dt = vx + py

for (26), whose solutions are the equiangular spirals r = pe“, 0 = vt + 7 in polar
coordinates, where p = 0 and 7 are arbitrary constants. When p > 0, the spirals
approach the origin (stable focal point); when p < 0, they diverge from it (unsta-
148 CHAPTER 5 Plane Autonomous Systems

ble focal point); when p = 0, they are closed curves representing periodic oscil-
lations (neutrally stable vortex points). See Figures 5.5a and 5.5b.

B. Nodal Points. Suppose that A > 0 and q > 0, so that the roots A = yy,
Hg of the characteristic equation (26) are real, distinct, and of the same sign. We
choose, as the linearly equivalent canonical form,

(26b) dx/at = px, dy/dt = pey, 0 < tail < JHel

whose general solution is (ae"", be"). The system is stable when pu, and py are
negative and unstable when they are positive (the two subcases are related by
the transformation t > —t of time reversal). Geometrically, the integral curves
y = cx™ m Ho/Hy, look like a sheaf of parabolas, tangent at the origin, as in
=
=

Figure 5.5c.

C. Saddle-Points. Suppose that A > 0 but g < 0, so that the roots of the
characteristic equation (26) are real and of opposite sign. We again have the
canonical form (26b). But since y, and yy have opposite signs, the integral curves
x"y = c,m = — p/p, > 0 look like a family of similar hyperbolas having given
asymptotes, as in Figure 5.5d. A saddle-point is always unstable.
There remain various degenerate cases and subcases, in which A = 0 or
q = 0. The simplest such case is the exceptional case (25), in which A = 0,
= a? > 0. The integral curves consist of the straight lines through the origin,
and the configuration formed by them is calleda star, as in Figure 5.7e. In the
nonexceptional subcase, we have the canonical form of (25’)

(26c) dx/di = ax, dy/di = x + ay,

whose integral curves have the appearnce of Figure 5.5f. Such a point is also
called a nodal point, and it is stable or unstable according as a < 0 ora > 0.
The case q = 0, A # 0 corresponds to the phase plane representation of the
second-order DE ¥ + px = 0, p # 0. This corresponds to a rowboat ‘‘coasting”’
on a lake, with no wind and its oars shipped. The boat comes to rest at a finite
distance, in infinite time. The integral curves form a family of parallel straight
lines X + px = constant, as in Figure 5.5g. The origin is a stable (but not strictly
stable) critical point if p > 0, unstable if p < 0.
Finally, the case g = A = 0 reduces to x =
=
y = 0 in the exceptional case
(25) and to the phase plane representation of ¥ = 0 otherwise. The former case
is (neutrally) stable; the latter case is unstable.

EXERCISES B

1. Show that the secular equation of the system

x=
ax — By, J = Bx
+ ay (a, 6 real)

has the complex roots A = a + i.


5 Linear Equivalence 149

(a) Focal point (b) Vortex point

(c) Nodal point (d) Saddle point

ix
(e) Star point
Ay]
(f) Nodal point

(g) Degenerate case

Figure 5.5
150 CHAPTER 5 Plane Autonomous Systems

2. Show that the solution curves of the system

x = ax,
I= by, a #0, are y= Cx, Yr"

3. Solve the following initial value problems:

@ a) -(
2
3 —2
1

I
x

y
) [x(0),9(0)] = (1,0)

0 s0)-(
1
2
2
1
I
x

y
) x0) = 1, 90) = 0
(a) Solve the initial value problem for the DE

1
forgeneral ((0)
d 2 x
x
)
( ( I (
=
=

dt 2 1 y 9(0) b
y

(b) Show that the trajectories are hyperbolas or straight lines through the origin.
(a) Show that the characteristic equation of the system

(*) x = px — vy, y= vx + py (u,v real)

has the complex roots A = w + w.


(b) Infer that any linear plane autonomous system (13) with discriminant A < 0 is
linearly equivalent to (*), for some y,».
(c) Show that, in polar coordinates, the system (*) defines the flow r > e'r,0—> 0
+ vt.
(d) Show that (*) is the real form of the first-order complex linear DE z = dz, z = x
+ ty.

Consider the system dx/dt = ax, dy/di = By for a # 0.


(a) Show that its solution curves are y = C|x|?, p = B/a.
(b) Prove that any linear plane autonomous system (13) with positive discriminant is
linearly equivalent to a DE of the foregoing form under a suitable change of
basis.
*(c) Show that, in the punctured plane x” + 9? > 0, the system dx/dt = ax, dy/dt =
Bi and dx/dt = kax, dy/dt = kBy are equivalent if k # 0, but that they are not
equivalent on any domain that contains the origin unless k = 1.

Show that any linear plane autonomous system (13) with zero discriminant is equiva-
lent to dx/dt = ax, dy/dt = ay, or to dx/dt = ax, dy/dt = x + ay. Describe the asso-
ciated flows geometrically.

8 (a) Show that, if ad ¥ bc, the linear fractional DE

dy _xtatf
dx ax + by te

is equivalent by an (affine) transformation to one of the canonical forms of Exs.


5-7.
(b) Derive a set of canonical forms for the exceptional case ad = be.
6 Equivalence under Diffeomorphisms 151

6 EQUIVALENCE UNDER DIFFEOMORPHISMS

In the preceding sections, we have analyzed properties of linear autonomous


systems that are preserved under linear transformations; in the rest of this chap-
ter we will study properties that are preserved under the far more general class
of continuously differentiable transformations.
Such transformations often enable one to greatly simplify the form of a DE
or system of DEs. For instance, consider the system

dx _ 3x4 — 12x25? + y4

(x? + yy?
(27)
dt
dy _ 6x°y ~ 10xy?
dt (x? + 9°3/2
+

In polar coordinates r = (x? + y*)'/, 6 = arctan (y/x) with inverse functions x


=
=
r cos 6, y = rsin 8, this system reduces to

(27) — = 3rcos 36, — = sin 36


d

In this form, one sees at a glance that the rays @ = n/3 are integral curves,
for n = 0,...,5. Other integral curves are sketched in Figure 5.6.
We shall study below how far we can simplify linear autonomous systems by
such coordinate transformations. Our study will be based on a general concept
of equivalence under diffeomorphism, which we now define precisely. Let

(28) u, = Ai, nds Z=1,....n

be continuously differentiable functions with inverse functions

(28) x, = 8, yUn)s jui,...m

so that f(g(u)) = u and g(f(x)) =


=

x. For such inverse functions to exist locally


and be continuously differentiable, it is necessary and sufficient (by the Implicit

Bua
VW.
Figure 5.6 Solution curves at (18).
152 CHAPTER 5 Plane Autonomous Systems

Function Theorem}) that the Jacobian of (28) be nonvanishing: that | 0f,/dx,| #


0
If x(d) is any solution of the autonomous system

dx; =
(29)
—_

X(%1, > Xn)» j=l, n


d

then the functions

(30) u,(t) = Fil*@, » X,(0)), z=1, gn

satisfy the autonomous system

(31) a = Uy), U; = >- Of aj_ > x (g(u))X(g(u))


di Ox, dt
j=l j=l

and conversely. In this sense, the autonomous systems (29) and (31) are equiv-
alent. We formalize the preceding discussion in a definition.

DEFINITION. Let dx/dt = X(x) and du/dit = U(u) be autonomous systems,


defined in regions R and R’ of n-dimensional space, respectively. The two sys-
tems are equivalent if and only if there exists a one—one transformation u = f(x)
of coordinates, of class @' and with nonvanishing Jacobian, which maps R onto
R’ and carries dx/dt = X(x) into du/dt = U(u).

Under these circumstances, the inverse transformation is also of class @! with


nonvanishing Jacobian. Note that the relation of equivalence is symmetric,
reflexive, and transitive: it is an equivalence relation.t It follows from the pre-
ceding discussion that equivalent autonomous systems have solution curves
obtainable from each other by a coordinate transformation. If the systems (29)
and (31) are equivalent under the change of coordinates (30), and if V(u) is an
integral of (31), then V[f(x)] is an integral of (29).
However, two autonomous systems may have the same solution curves with-
out being equivalent. Thus, the solution curves of

(32) # = (KP + yy, y= —-O? + yx

are concentric circles, as in Example 2 (x —y, § = x). Yet the two systems are
=
=

not equivalent: all solutions of = —y, ¥


=
=
x are periodic with the same period

+ See Ch. 1, §5. The Jacobian of (28) is the determinant of the square matrix|| 0/,/dx,|| of first partial
derivatives. See Widder, p. 28 ff. Transformations with the properties stated are called
“diffeomorphisms.”

{ Birkhoff and Mac Lane, p. 34.


7 Stability 153

2a, whereas the periods of the solutions of (32) vary like 1/r? with distance from
the origin.

7 STABILITY

The concepts of stability and strict stability, already defined for linear DEs
with constant coefficients in Ch. 2, §3, apply to the critical points of any auto-
nomous system. Loosely speaking, a critical point P is stable when the solution
curves originating near P stay uniformly near it at all later times; P is strictly
stable if, in addition, each such individual solution curve gets and stays arbitrar-
ily near P as ¢ increases without limit. In vector notation, the precise definitions
are as follows.

DEFINITION. Let @ be a critical point of the autonomous system


x’(t) = X(x), so that X(a) = 0. The critical point a is called:

(i) stable when, given ¢ > 0, there exists a 6 > 0 so small that, if |x(0) — a] <
6, then |x() — al < ¢forallt > 0
(ii) attractive when, for some 6 > 0,

(33) |x(0) ~ a] <5 implies lim {x@ — al =0


i-~oa

(iii) strictly stable when it is stable and attractive. A stable critical point which is
not attractive is called neutrally stable; a critical point which is not stable is
called unstable.

Evidently, the preceding definitions are invariant under diffeomorphisms


(85); thus, they describe important qualitative distinctions between the kinds
of critical points. For first-order autonomous DEs, being “‘attractive’’ has a sim-
ple interpretation.

THEOREM 5. The critical point 0 of the one-dimensional autonomous DE x’(t) =


X(x) is attractive if and onlyif (ii’). For some 5 > 0,0 < |x| < 6 implies xX(x) <
0. In this case, the DE is strictly stable.

Explanation. In other words, in the first-order case (ii) is equivalent to (ii’)


while either condition implies (iii).

Proof. If X(x,) = 0 for some x, with 0 < |x,| < 4, then x(#) x, is a solu-
=
=

tion, violating (ii) above. In the same way, if x,X(x,) > 0 [that is, if x, and X(x,)
have the same sign], then the solution with initial value x(0) = (x, + 6 sgn x)/
2 could never cross x,; hence it would also violate (ii). Therefore, condition (ii’)
is necessary for being “‘attractive.”’ It is sufficient since, if 0 < x(0) < 6, then
SUP/5;,xc0}X(*) = —a, < O for all 6, € [0,x(0)]. Hence, we have 0 < x(#) < 6, for
all t = x(0)/a,, proving (30); we omit the details. A similar argument covers the
case —6 < x(0) < 0.
a

154 CHAPTER 5 Plane Autonomous Systems

In both cases, stability is immediate; indeed, stability follows if 0 is a limit


point of x with xX(x) < 0.

Attractiveness also implies strict stability for linear autonomous systems in n


dimensions, that is, for linear DEs with constant coefficients. We now prove this
important result, which goes back to Lagrange.

THEOREM 35’. The critical point 0 of the constant-coefficient linear autonomous


system x(t) = Ax is attractive if and only if every eigenvalue of A has a negative real
part. In this case, the system is also strictly stable.

Proof. If some eigenvalue A of A has a nonnegative real part, then (12) has a
solution of the form x() = e“f (an “eigensolution’”’ or “norma! mode”’), where
the initial eigenvector x(0) = f can have arbitrarily small length. Conversely,
note that by Theorem 3, every component x,(t) of every solution x(é) of (12)
satisfies the secular equation P,(D)x,(@) = 0, where the roots of the polynomial
equation P,(\) = 0 are just the eigenvalues A, of A. From Theorem 4 of Ch. 3,
we know that every x,(¢) F> 0 if these A, all have negative real parts, which implies
(33). This proves the first statement of Theorem 5’.
To prove the second statement, we extend the concept of solution basis to
vector DEs with constant coefficients. If x(0) = 0 and x’() = Ax, then by
repeated differentiations, we have x0) = 0 for all n. Hence, in Theorem 2,
every x(0) = 0 and, by the crucial Lemma of Ch. 3, §4, every x,() = 0. It
follows from this that the vector form x(é) = Ax of (12) can have, at most n
linearly independent solutions (it will be proved in Ch. 6 that it has exactly n of
them). Calling them u'(#), . , u(t), we see that the general solution of (12) is
x(t) = Lcu'(t) > 0, where |x(é)| < € for all sufficiently large t, uniformly, pro-
vided only that Xc, < 6, some sufficiently small number. The stability condition
(i) above follows.
It follows from Ch. 3, §5, that the conditions for strict stability of the second-
order system dx/di = ax + by, dy/dt cx + dy, are p = — (a + d) > 0,

=

q = ad — bc > 0 or equivalently
a + d < 0, ad > bc.

Caution. One should not conclude from Theorems 5 and 5’ that attractive-
ness implies strict stability for all autonomous systems. Indeed, we now con-
struct an attractive critical point of a nonlinear plane autonomous system that
is unstable.t Figure 5.7 depicts sample solution curves.

Example 6. Let D, be the lower half-plane y < 0; let Dy be the locus


x? + y = 2|x|, consisting of the discs (x + 1)? + x <= 1; let D, be the half-
strip |x| < 2, y > 0, exterior to Dg; let D, be the locus |x| > 2, y > 2, all as

¢ The authors are indebted to Dr. Thomas Brown for constructing Example 6.
7 Stability 155

Figure 5.7 Unstable critical point.

depicted in Figure 5.7. The system

2xy on D, U Dy
U Dg

| 2xy/[3 — (41x1)] on Dg
yg — x?
on D, UDy
A|x| — y? — 3x? on Ds
(4lx| — 9? ~ 3x?)/[3 — 4/|xI] on Dg

is unstable, yet lim,... |x(| = 0 for all orbits x(é).

Dynamical Systems. If the DEs for an autonomous dynamical system are


written in normal form, as

aq,
(34) df? = F(a 41) =F,q,p) t=1,...,m
°

and conjugate velocity variables p, = dq,/dt are introduced, then (34) defines an
autonomous system of first-order DEs

dq, dp,
(34’) = F(q, p)
a7 Pe di

in an associated 2m-dimensional phase space. A given point (q, p) of phase space


is a critical point for the system (34’) if and only if p = 0 and F(q, 0) = 0, so
that q is an equilibrium point of the dynamical system (34).
156 CHAPTER 5 Plane Autonomous Systems

The dynamical system (34) is called conservative when F,(q, 4) =


=
—0V/dq , for
a suitable potential energy function V(q). For any conservative system, the total
energy function E(q, p) = (= mp7/2) + V(q) is an integral of the system (34).
Moreover, the point (a,0) is a critical point of (34’) if and only if the gradient
VV(a) = 0, so that the potential energy function has a stationary value.t The
point is neutrally stable if the potential energy hasa(strict) local minimum at
q = a; it is never strictly stable.
Thus, consider the simple pendulum of Example 5, §3, with k = 1. For¢c =
—2, the “solution curve’’ reduces to a set of isolated critical points v = 0, 0 =
+ Qn. If |c| < 2, then [9 — 2nr| < 6), where 6) < 7 is the smallest positive
angle such that cos 4 —c/2. Therefore, the solution curves for —2 <¢ <2
=
=

are closed curves (loops) surrounding the origin or any one of the critical points
for 6 = +2nm. As c > —2, these loops tend to the origin. Consequently, the
origin and its translates 6 = +2nzx,v = 0 are neutrally stable. For c = 2, we
have the separatrix curve defined by v = +2 cos (0/2). From the first of equa-
tions (11%, it is seen that the direction of the motion is from —7 to x for v >
0 and from a to —a for v < 0. Therefore, the critical points v = 0,6 = +(2n
+ 1)x are unstable. These unstable critical points occur when the pendulum is
balanced vertically above the point of support.

EXERCISES C
1 For the following DEs, determine the stability and type of the solution curves in the
phase plane, sketching typical curves in each case:
(a) ét+u=0 (b) i+ a + u =
=
0 (c) i-u+tu=0
dd) @+t+u-—u=0 (c) @+2%+u=0 @ a+ 44+ u=0.

For which of the following is x =


=
0 a stable critical point:

2 2 3 = 3
x= x’,
=
—x*,
x= x*, x =
x =
—x

Determine conditions on the coefficients a, 6, c, d of (13) necessary and sufficient


for neutral stability.

Show that x = X(x) hasastrictly stable critical point at x = 0 if X(0) = 0 and X’(0)
< 0. [ Hint: x? is a Liapunov function.]

Show that the plane autonomous system ¥ = x — y, § = 4x? + 2y* —6 has critical
points at (1, 1) and (—1, —1), both of them unstable.

Show that the system dx/dt = In (1 + x + 2y), dy/dt = (x/2) — y + (x?/2) has an
unstable critical point at the origin.

*7 Is the system (*) of Ex. B5 strictly stable, neutrally stable, or unstable at (0, 0, 0)?

+ Courantand John, Vol. 2, p. 326. Points where the value of a function is stationary are also often
called “critical points.”
8 Method of Liapunov 157

Show that the autonomous system

( Je sin89+ 2sinxcosx+e¢—1
dx
— = -

dt 2
2)
—_=
sin (2x + 3y), = = tan (2x + z)
di di

has an unstable critical point atx =y=z=0.


Show that the DE ¥ + x sin (1/x) = 0 is neutrally stable at x = 0.
*10. Let § = S(x) be a function of class @' in some neighborhood of the origin, such
that S(0) = 0, while S(x) > 0 if0 < |x| < ¢forsomee > 0.
(a) Prove that the system dx/dt = X(x) is unstable at the origin if 2 X,0S/ax, > 0
there and is strictly stable if 2 X,0S/dx, < 0 there.
(b) Derive from (a) a stability criterion for the autonomous nth order DE

d"x/dt" = ®(x, dx/dt,... ,d"-'x/dt"")

11 Show that the integral curves in Example 6 are the semicircles

x? + x = +2ax in D, U Dy

and of the form x(x? — 2|x| + y’?) = constant in D; U D,.

8 METHOD OF LIAPUNOV

In studying a critical point of an autonomous system (3), we can assume with-


out loss of generality that it is at the origin. For X(x) of class @?, we can there-
fore rewrite the system as

x) = > ax, + R(x), a=1,...,n


jr

where the R; are infinitesimals of the second order. This suggests that the behav-
ior of solutions near the critical point will be like that of solutions of the asso-
ciated linearized system (12). We now show, at least for n = 2, that this is indeed
true as regards strict stability.

THEOREM 6. [If the critical point (0, 0) of the linear plane autonomous system
(12) is strictly stable, then so is that of the perturbed system

(35) X = ax + by + &x,y), y = cx + dy + n(x,y)

provided that |&(x,y)| + In(x,y)| = Ox? + 9°.

+ The symbol O(x? + 9°) stands for a function bounded by M(x” + y°) for some constant M and all
sufficiently small x,y.
158 CHAPTER 5 Plane Autonomous Systems

Idea of Proof. The proof is based on a simple geometrical idea due to Liapu-
nov. Let E(x,y) be any function having a strict local minimum at the origin. For
a small positive C, the level curves E(x,y) = E(0, 0) + C constitute a family of
small concentric closed loops, roughly elliptical in shape, enclosing the origin.
Now, examine the direction of the vector field defined by (35) on these small
loops. Intuition suggests that the critical point will be strictly stable whenever,
for all small enough loops, the vector field points inward. For this implies that
any trajectory which once crosses a loop is forever trapped inside it, because, to
get outside, the trajectory would have to cross the loop in an outward direction.
At such a crossing point, the vector could not point inward, giving a
contradiction.
To make the preceding intuitive argument precise, we define a Liapunov func-
tion for a critical point a of an autonomous system x(t) = X(x) to be a function
E(x) that assumes its minimum value at a and satisfies

>_ Xx) dE/dx, < 0 for all xa

Since E = dE/dt = X x(t) GE/dx =


=
2X,(x) 0E/dx, for any solution x(é), this
implies that E is decreasing along any trajectory x(i).
To construct a Liapunov function for the critical point (0, 0) under the
hypotheses of Theorem 6, we consider, in turn, each of the three canonical
forms (26a) through (26c) derived in §5. Since the definition of stability is invar-
iant under linear transformations of coordinates, it suffices to consider these
three cases.
For the linear terms in (27), the necessary calculations are simple. The Lia-
2
punov function can be taken as a positive definite quadratic function E = ax
+ 26xy + yy", a > 0, ay > 6”, whose level curves are a family of concentric
coaxial ellipses. We will show, by considering cases separately, that E = —kE
for some positive constant k.
In (26a), the Liapunov function E = x? + y° satisfies

E = 2(xx + yy) = QE

In the strictly stable case, » < 0. In (26b), the same Liapunov function satisfies
E = 2(xx + yy) = Qux® + Quey® < 2u,£, in the strictly stable case 0 > pw, =
fy. (By allowing equality, we also take care of the exceptional case of a star
point.) In (26c), the Liapunov function E = x® + ay? satisfies, in the strictly
stable case a < 0

E = 2(xx% + a’yy) = Q(ax? + a®xy + a°y’) = aE + a(x + ay)? S aE


Hence, in the three possible cases of strict linear stability, we have E = 2uE, E
< 2, E, or E S aE, where the coefficient on the right side is negative. It follows
that E < —kE for some k > 0, in every case. Since the quadratic function E(x,
9) is positive definite, we conclude that E(x, y) = K(x? + 9°) for some constant
K> 0.
9 Undamped Nonlinear Oscillations 159

We now consider the nonlinear system (27). Since E, = 0E/dx and E, =


OE/dy are linear functions of x and y, we have

[EE + Enl = O(lx| + yD? +9


Hence, for some ¢ > 0, E(x, y) = € implies |E,£ + E,n| S kE(x, y)/2.
Now let [x(é), y(i)] be a trajectory of (27) such that E(x(é), y(éo)) S €. Along
this trajectory, for t = t, E(t) = E[x(d), y(d] will satisfy

E() S —kE + |E€ + En| S —kEW)/2

By Theorem 7 of Ch. 1, it follows that E() = E(ég)exp[—k(t—to)/2]. Hence, E(é)


approaches zero exponentially. Since E(x, y) = K(x* + x), it follows at once
that the trajectory tends to the origin.
Using similar arguments, we can prove the following classic generalization of
Theorem 6, which we state without proof.

POINCARE-LIAPUNOV THEOREM. [If the critical point 0 of the linear autono-


mous system x'(t) = Ax is Strictly stable, then so is that of the perturbed system

x(t) = > a,x, + &(x)

provided that |&(x)| = O(\x|°).

9 UNDAMPED NONLINEAR OSCILLATIONS

The classification made in §5 covers linear oscillators near equilibrium points,


which correspond to critical points in the phase plane. We will now study the
nonlinear oscillations of a particle with one degree of freedom, about a position
of stable equilibrium. The case of undamped (i.e., frictionless) oscillations will
be treated first. This case is described by the second-order DE

(36) ¥ + g(x)
=0

which can be imagined as describing the motion of a particle in a conservative


force field. This DE is equivalent (for v —
=

%) to the first-order quasilinear DE


v du/dx + q(x) = 0, or, in the phase plane, to the plane autonomous system

(37) dx/dt = v, dv/dt = —4q(x)

By a translation of coordinates, we can move any position of equilibrium to


x = 0; hence we can let g(0) = 0. If equilibrium is stable, then the “restoring
force”’ q(x) must act in a direction opposite to the displacement x, at least for
small displacements. Therefore we assume that xq(x) > 0 for x # 0 sufficiently
small. This will make the system (37) have an isolated critical point at x = v —
=

0, as in the case of the simple pendulum (Example 5, §3).


160 CHAPTER 5 Plane Autonomous Systems

The key to the analysis of systems (36) is the potential energy integral

(38) Vix) = f q(é)a&


Since xq(x) > 0, the function V(x) is increasing when x is positive and decreasing
when x is negative; it has a local minimum V(0) = @ at x = 0. Differentiating
the total energy

(39) E(x,v) = = + V(x)

with respect to ¢, we get E= *[¥ + q(x)] = 0; hence E(x, v) is a constant along


any trajectory in the phase plane. Therefore E(x, v) is an integral of the system
(37), and of its first-order equivalent du/dx = —g(x)/v. Although the energy
integral E(x,v) has a strict local minimum at (0, 0), the following result is of more
interest to us now.

THEOREM 7. If q ¢ @! and if xq(x) > 0 for small nonzero x, then the critical
point (0, 0) of the system (36) is a vortex point.t

Proof. For any given positive constant E, the locus v?/2 + V(x) = E is an
integral curve, where V(0) = 0 and V(x) increases with |x|, on both sides of x
= 0. These curves are symmetric under reflection (x, v) F> (x, —v) in the x-axis;
they slope down with slope — q(x)/v in the first and third quadrants and up in
the second and fourth quadrants. For any given small value of EF, the function
E — V(x) has a maximum at x =
=
0 and decreases monotonically on both sides,
crossing zero at points x —B and x = A, where B and A are small and posi-
=
=

tive. Hence each locus v? = 2[E — V(x)] is a simple closed curve, symmetric
about the x-axis.
As the energy parameter E decreases, so does |v| = V2[E — V(x)]; thus, the
simple closed curves defined by the trajectories of (32) shrink monotonically
toward the origin as E | 0. In fact, consider the new coordinates (u, v), defined
Vata
by u + V2V(x), according as x is positive or negative. The transformation (x,
=
=

v) > (u, v) is of class @' with a nonvanishing Jacobian near (0, 0), if ¢(0) exists
and is positive. Hence the integral curves of (37) resemble a distorted family of
circles u? + v? = QE.

10 SOFT AND HARD SPRINGS

The most familiar special case of (36) is the undamped linear oscillator

(40) % + qx
= 0, q=kh>0

+ As in the linear case, a critical point of a plane autonomous system is called a vortex point when
nearby solution curves are concentric simple closed curves.
10 Soft and Hard Springs 161

for which q(x) = kx. The general solution of (40) is the function x = A cos [A(é
— &)], representing an oscillation of amplitude A, frequency k/2m (period 27/h),
and phase to.
In other cases, the DE (36) can be imagined as determining the motion of a
unit mass, attached to an elastic spring that opposes a displacement x by a force
q(x), independent of the velocity ¥. The ratio h(x) = q(x)/x is called the stiffness
of the spring; it is bounded for bounded x if g € @! and q(0) = 0. The case (40)
of a linear spring is the case of constant stiffness (Hooke’s Law). For linear
springs, the formulas of the last paragraph show that the frequency f = k/27 is
proportional to the square root of the stiffness k® and is independent of the
amplitude. We will now show that, for nonlinear springs, the frequency f still
increases with the stiffness but is amplitude-depenValeo
dent in Von
general.
Indeed, the force law (36) implies that x = v = 2(E — V(x)]. Hence, if the
limits of oscillation [i.e., the smallest negative and positive roots of the equation
V(x) = #] are x —Band x = A, the period T of the complete oscillation is
=
=

A
dx
T=
(41)
2 Jos V2[E — Vix)]
The integral (41) is improper, but it converges, provided that q(x) does not van-
ish at —B or A; hence it converges for all sufficiently small amplitudes if g € @'
in the stable case h(0) > 0.
We now compare the periods T and T, of the oscillation of two springs, hav-
ing stiffness h(x) and h,(x) = h(x), and the same limits of oscillation —B and A.
By (39), E = J q(u) du = V(A); hence, E — V(x) = 4 q(u) du. From the stiffness
inequality h,(x) = h(x) assumed, therefore, we obtain,

Be vey = fo gydus [ q du = EH), O=xA


Reversing the sign of x, we get the same inequality for —B = x =< 0. Substitut-
ing into (41), we obtain T = T,. We thus get the following comparison theorem.

THEOREM 8. For any two oscillations having the same span [—B, A], the period
becomes shorter and the frequency greater as the stiffness q(x)x increases in (36).

Springs for which h(x) = h(—x) are called symmetric; this makes g(—x) =
—q(x) and V(—x) = V(x), so that B = A in the preceding formulas: symmetric
springs oscillate symmetrically about their equilibrium position. Hence, for sym-
metric springs, the phrase “span [—B, A]’”’ in Theorem 8 can be replaced by
“amplitude A.”
For any symmetric spring, h’(0) = 0; if h”(0) is positive, so that h(x) increases
with |x|, the spring is said to be “hard”; if h”(0) is negative, so that h(x)
decreases as |x| increases, it is said to be ‘‘soft.” Thus the simple pendulum of
Example 5, §3, acts as a “‘soft” spring. We now show that the period of oscilla-
tion is amplitude-dependent, at least for symmetric hard and soft springs.
162 CHAPTER 5 Plane Autonomous Systems

THEOREM 9. The period of a hard symmetric spring decreases, whereas the period
of a soft symmetric spring increases as the amplitude of oscillation increases.

Proof. The period is given by (41); it suffices to compare the periods of quar-
ter-oscillations, say from 0 to A and from 0 to Aj, with A, > A. We write A;
=
=

cA, with ¢ > 1. To study the period p, of the quarter-oscillation from 0 to cA =


Aj, we let x
=
=
cy in (36). The equivalent DE fory is

(42) d*y/di? + yh(cy) = 0

where h(x) = q(x)/x. The oscillation of amplitude cA for (36) corresponds to the
oscillation of amplitude A for (42); and, since the independent variable ¢ is
unchanged, the periods of oscillation are the same for both. Therefore, it suf-
fices to compare the quarter periods p and p, for amplitude A for the two
springs (36) and (42), respectively. Using Theorem 8, we find that, for y > 0, if
yh(cy) = q(y) = yh(y), that is, if h(cy) & h(y) for c > 1 (hard spring), then we
have p, = p, and so the period decreases as the amplitude increases. Soft springs
can be treated similarly.

EXERCISES D

1 (a) Show that the integral curves of ¢ — x + x® = 0 in the phase plane are the
curves v? — x? + x*/2 =C,
(b) Sketch these curves.
(c) Show that the autonomous system defining these curves has a saddle point at
(0, 0) and vortex points at (+1, 0).

Duffing’s equation without forcing term is ¥ + gx + rx* = 0. Show that, for oscil-
lations of small but finite half-amplitude L, the periodT is

r=avaf™
V2q+ a + sin®6)
Verify Theorem 9 in this special case as a corollary.

(a) Draw sample trajectories of the DE # = 2x° in the phase plane, including ¢ =
+x.
(b) Show that x = 0 is the only solution of this DE defined for all ¢ € (—00, 0).

*4 Show that if (0) = 0 and q’(0) < 0 in (36), the origin is a saddle-point in the phase
plane.

Discuss the dependence on the sign of the constant y, of the critical point at the
origin of the system

w= —v
+ pu’, v=ut
pw

(a) Show that the trajectories of + q(x) = 0 in the phase plane are convex closed
curves if q(x) is an increasing function with g(0) = 0.
(b) Is the converse true?

Show that, if ¥ = ax + by, ¥ = cx + dy is unstable at the origin, and if

X(x, y) = ax + by + O(x? + 9%), V(x, 9) = ox + dy + O(x? + y’)


11 Damped Nonlinear Oscillations 163

where ad # bc and A # 0, then the system X = X(x, y), 3 = Y(x, y) is also unstable
there.

The equation of a falling stone in air satisfies approximately the DE

_ gels
*=8 2
bv

where the constant v is the “terminal velocity.”’ Sketch the integral curves of this
DE in the phase plane, and interpret them physically.

Show that the plane autonomous system x = y — x°, ¥ = —x? is stable, though its
linearization is unstable. [Hrnt: Show that x* + 2y? is a Liapunov function.]
*10. Show that for the analytic plane autonomous system

% = Qxy, y= xy? — xt —yl

the origin is an unstable critical point that is asymptotically stable. [HinT: Study
2
Example 6. To prove instability, show that the ellipse 4y* =
=
x—xX cannot be
crossed from the left in the first quadrant.]

11 DAMPED NONLINEAR OSCILLATIONS

The equation of motion for a particle of mass m having an equilibrium point


at x = 0, in the presence of a restoring force mq(x) and a friction force equal
to f(x, v) = mvp(x, v), is

(43) ¥ + p(x, X)# + q(x) = 0, q(x)


=
=

xh(x), pee’, ne@)

When A(0) is positive, the equilibrium point is called statically stable, because the
restoring force tends to restore equilibrium under static conditions (when
v = 0). The conservative system (36) obtained from any statically stable system
(43) by omitting the friction term xp(x, %), is neutrally stable by Theorem 7.
The differential equation (43) has a very simple interpretation in the phase
plane, as

d
(44) dx {S| =we
=

vp(x, v) — q(x)
ds di

The critical points of the system (44) are all on the x-axis, where x =
=

Vv
=
=

0;
they are the equilibrium points (x, 0) where g(x) = 0 in (43). Since in (43), (0)
= 0h(0) = 0 the origin is always a critical point of (44); unless h(0) changes sign,
there is no other.
We shall consider only the case h(x) > 0 of static stability in the large, which
is the case of greatest interest for applications. For simplicity, we will also
assume that p(0, 0) # 0.
Under these assumptions, the origin is the only critical point of (44). More-
164 CHAPTER 5 Plane Autonomous Systems

over the direction field, whose slope is

d q(x)
— p(x,v) —
— = r(x, v) = —

d v

points to the right in the upper half-plane, where ¥ = v > 0, and to the left in
the lower half-plane. On the x-axis, the solution curves have finite curvature
q(x); they cut it vertically downward on the positive x-axis and vertically upward
on the negative x-axis. Thus, the solution curves have a general clockwise
orientation.
Conversely, any continuous oriented direction field with the properties spec-
ified represents a DE of the form (43) in the phase plane. From this it is clear
that the behavior of the solutions of the DEs of the form (43) can be extremely
varied in the large (see §13). However, the local possibilities are limited.

THEOREM 10. [If p and h are of class @' in (43), and if p (0, 0) and h(0) are
positive, then the origin is a strictly stable critical point of (44).

Proof. Under the hypotheses of Theorem 10, we have that p > 0 near the
critical point (0, 0), and we can write (44) as

dx/dt = v, dv/dt = —h(0)x — pov + O(x? + 9’), po = p(0, 0)

An easy computation shows that the linearization of the system (44) has the sec-
ular equation A? + por + h(0) = 0. Since this quadratic polynomial has positive
coefficients, it is of stable type (Ch. 3, §5). Hence, by Theorem 6, the origin is a
strictly stable critical point of (43) when the damping factor p(0, 0) is positive.
The equilibrium point x —
=

0 of (43) is then said to be dynamically stable: the


solution curves tend to the origin in the vicinity of the origin.
When p(0, 0) is negative, the system is said to be negatively damped, and the
equilibrium point to be dynamically unstable. Since (43) can be rewritten as

d’x
(45)
d(—t)?
+ p(s*d(-t —dx dx

d(—t)
+ xh(x) = 0

we see that the substitutions i > —i, x ~ x, v > —v, of time reversal, reverse
the sign of p(x, %) but do not affect (43) otherwise. Hence, if p(0, 0) < 0, all
solution curves of (44) spiral outward near the origin.

*12 > LIMIT CYCLES

We now come to a major difference between nonlinear oscillations and linear


oscillations. When a linear oscillator is negatively damped, the amplitude of
oscillation always increases exponentially without limit. In contrast the ampli-
tude of oscillation of a negatively damped, statically stable, nonlinear oscillator
12 Limit Cycles 165

commonly tends to a finite limit. The limiting periodic oscillation of finite ampli-
tude so approached is called a limit cycle.
The simplest DE that gives rise to a limit cycle is the Rayleigh equation,

(46) &— wl — xe
+x =
=

0, up>od

The characteristic feature of this DE is the fact that the damping is negative for
small x and positive for large x. Hence it tends to increase the amplitude of small
oscillations and to decrease the amplitude of large oscillations. Between these
two types of motions, there is an oscillation of contant amplitude, a limit cycle.
If we differentiate the Rayleigh equation (46) and set y = ¥V/3, we obtain
the van der Pol equation

(47) 5 — wl — yy
+ y =0, u>od

This DE arises in the study of vacuum tubes. The sign of the damping term
depends on the magnitude of the displacement y. The remarks about the Ray-
leigh equation made above apply also to the van der Pol equation.
As stated in §11, negatively damped nonlinear oscillators can give rise to a
great variety of qualitatively different solution curve configurations in the phase
plane. For any particular DE of the form (43), such as the Rayleigh or van der
Pol equation with given yw, one can usually determine the qualitative behavior of
solutions by integrating the DE

v do + [vp(x,v) + g(x)] dx = 0

graphically (Ch. 1, §8). More accurate results can be had by use of numerical
integration. With modern computing machines, using the techniques to be
described in Ch. 8, it is a routine operation to obtain such a family of solution
curves. Figure 5.8 depicts sample integral curves for the van der Pol equation
with
» = 0.1, w = 1, and
» = 10 so obtained.

Liénard Equation. General criteria are also available which determine the
qualitative behavior of the oscillations directly from that of the coefficient-func-
tions. Such criteria are especially useful for DFs depending on parameters,
because graphical integration then becomes very tedious. They are available for
DEs of the form

¥ + f(xsx% + g(x) = 0 (Liénard equation) (48)

The van der Pol equation is a Liénard equation; moreover, it is symmetric in the
sense that —x(?) is a solution if x(/) is a solution. This holds whenever q(—x) =
—qg(x) is odd and f(—x) = f(x) is even.
One can prove the existence of limit cycles for a wide class of Liénard equa-
tions; we can even prove that every nontrivial solution is either a limit cycle, or
a spiral that tends toward a limit cycle as t > +00. This is true if: (i) xq(x) > 0
166 CHAPTER 5 Plane Autonomous Systems

15
F

‘1 | 10
Q2t

2h
=

A) 5h

eX
—1b —5r
—2+

—2e
—10F-
~4b

1 —-3Fr

—4 -2 —15E
-2 -1 0

(a) (b) (c)


Figure 5.8 Van der Pol equation.

for x # 0; (ii) f(x) in (48) is negative in an interval a < x < 6 containing the
origin and positive outside this interval, and
0

(49) [roa f =_
wf!) dx = +00

We sketch the proof.t+


In the xv-plane, solution curves satisfy

(50) @+f) +2 = 0 if v0
It follows, since xq(x) > 0, that they can cross the x-axis only downward if
x > 0, and upward if x < 0. Also by (50), between successive crossings of
the x-axis, v(x) is a bounded single-valued function, decreasing in magnitude if
x > bin the upper half-plane, and if x < a in the lower half-plane.
Now, consider the Liénard function

(51)
E(x, v) = afv + FQ)? + Ux) F(x) = f *fle) dx
U(x) = f g(x) dx = 0

A straightforward calculation gives dE/dit = — q(x)F(x), where, by (49), | F(x)|


becomes positively infinite as |x| — 90. For sufficiently large |x|, since xq(x) >
0, dE/dt is identically negative. Since the set of all (x, v) for which E(x, v) = Ep,

+ For a complete proof, see Lefschetz, p. 267, or Stoker, Appendices III and IV.
12 Limit Cycles 167

any finite constant, is contained in the bounded strip |v + F(x)| =V2E, it


also follows that solution curves can stay in one half-plane (v > 0 or v < 0) for
onlya finite distance (and time); if nontrivial, they must cut the x-axis infinitely
often.
Let x9, X), %g,... be successive zero-crossings; we can assume that x5, > 0
and %on+1 < 0 without loss of generality. If xy = xo, then (by uniqueness) x2, =
Xo and %on4; = x; for all n > 0; the solution curve is a limit cycle. Likewise, if
Xq > Xo, then we find that x», > Xg,-9 aNd Xon41 < Xo,-; for all n > 0 and
the solution curve spirals outward as ¢ increases. Similarly, if xy < x9, then the
solution curve must spiral inward.
Finally, since (0) < 0, Theorem 6 applies if f, g € @': solution curves near
the origin must spiral outward. Also, we know that

E(Xon, 0) ~ E(Xoq-1,0) = — f Mn x2n—1


g(x)F(x)
r(x)
d

and the definite integral is negative for sufficiently large oscillations since xq(x)
> 0 and F(to0) = +00, by (49). Hence, every solution curve sufficiently far
from the origin must spiral inward. Therefore, every oscillation of sufficiently
large initial amplitude must spiral inward toward a limit cycle of maximum
amplitude. Similarly, every oscillation of sufficiently small initial amplitude must
spiral outward to a smallest limit cycle.
For the Rayleigh and van der Pol equations, these limit cycles are the same.
Therefore, every nontrivial solution tends to a unique limit circle, which is sta-
ble. The preceding result holds under much more general conditions. We quote
one set of such conditions without proof.

LEVINSON--SMITH THEOREM. In (48), let q(x) = xh (x), where h(x) > 0 and
let f(x) be negative in an interval (a, b) containing the origin and positive outside this
interval. Let q(—x) = —4q(x), f(—x) = f(x), and let (49) hold. Then (48) has a
unique stable limit cycle in the phase plane, toward which every nontrivial integral
curve tends.

EXERCISES E

1. Show that any DE

¥ + (px? — gx + rx = 0

where g and r are positive constants, can be reduced to the van der Pol DE by a change
of dependent and independent variables.

2. (a) Show that the autonomous plane system

3 2,
uw 2
a@=u-v—w— »
v=utv- v Uv

has a unique critical point, which is unstable, and a unique limit cycle.
(b) Discuss the stability of the related system

wu’,
“a= —u-—vtwt v=eu-vtvtuy

with special reference to oscillations of very small and very large amplitude
168 CHAPTER 5 Plane Autonomous Systems

In the DE ¥ + q(x) = 0, let Vix) = 6 q(u) du, and let g be a continuous function
satisfying a Lipschitz condition. Show that, if V(x,) = Vixg) and Vix) < V(x,) for
x, <x < Xx9, the equivalent autonomous system (44) has a periodic solution passing
through the points (x,, 0) and (x, 0).

Show that the plane autonomous system

_ rn
>
6=1 (polar coordinates)
100

has just one limit cycle.

Discuss the limit cycles of the system

Tr 1

| )a( Q 1
_
=
=

100 Tr

Prove in detail that, for 4 = 1 and the initial condition x(0) = 10, %(0) = 0, the
amplitude of successive oscillations decreases in the Rayleigh DE (46).

Answer the same question for the van der Pol DE, if y(0) = 10, y(0) = 0.

Sketch the integral curves of the van der Pol DE in the phase plane for » = 100.
[Hint: Most of the time, the integral curves are “relaxation oscillations,” near y = 0
ory = 1]

9 Do the same for the Rayleigh DE.

ADDITIONAL EXERCISES

1 Locate the critical points of the DE x


=_
=
x(1 — x)(@ — x), and discuss how their
stability or instability varies with a.

Do the same for x


=
=
x(1 — x)(x — a).

Show that, in the complex domain, every system X = ax + by, ¥ = cx + dy is linearly


equivalent to either ¥ = Ax, ¥ = py or X = Ax, ¥ = Ay + x, for suitable A, p.

Show that, in the punctured plane x” + y? > 0, two linear systems (13) are equivalent,
provided that their discriminants A, A’ are not zero, and they both have either (a)
stable focal points, (b) unstable focal points, (c) vortex points, (d) stable nodal points,
(e) unstable nodal points, or (f) saddle-points.

Consider the linear autonomous system

% = ax + agy + azz, y = dx + boy + diz, Z = cx + Coy


+ Cz

(a) Show that the x-component of any solution satisfies

WE = pix _ por + psx, where py = a, + by + &

(0) po = aby — gb, + bots — bgcy + €3a, — Cas,


a; Ag ag

=
ps
i Cg Cg
12 Limit Cycles 169

(b) Show that the secular equation (¢) is invariant under any nonsingular linear trans-
formation of the variables x, y, z
(c) Conversely, show that, if the polynomial A® = p,A? — pod + ps has distinct real
roots dj, Ag, As, then the given DE is linearly equivalent to & = 6G = 1,2,3).
*(d) Work outa set of real canonical forms for the given DE, with respect to linear
equivalence, in the general case.

Let VU = X, U = Ux, vy +2) be any axially symmetric gradient field of


class @', Show that the DE x = VU admits as an integral the “‘stream function” V =
J r[(QU/dx) dr + (0U/dr) dx), the integral being independent of the path.

Sketch the integral curves in the phase plane for

(a)
¥ txt =0 (b) # + x/[%] = 0 (Coulomb friction)
x—@

(c) ¥ + X¥| + ksinx


=0

Let q(x) be an increasing function, with (0) = 0 and q(—L) = —q(L); let Q(x) =
[g(x) — 9(—x)]/2. Show that the period of oscillation of half-amplitude L for ¥ +
Q(x) = 0 is less than that for ¥ + g(x) = 0 unless g(—x) = —g(x) for all x € [0, LZ}.

Show that the DEs ¥ = 2x + sin x and ¥ = 2x are equivalent on (—00, +00) but
that x =
=
x + x® is not equivalent to ¥ =
=
x. [Hint: Consider the escape time.]

10 Showthat,
if a) < ag << +++ <a,andb <by<+++ <b, the DEs % =
=
Il(x —
a,) and % = I(x — 6,) are equivalent.
CHAPTER 6

EXISTENCE
AND UNIQUENESS
THEOREMS

1 INTRODUCTION

In the earlier chapters of this book, we have proved a number of theorems


establishing the existence of solutions of DEs and the well-posedness of initial value
problems, but always under special hypotheses. In this chapter, we shall study
these questions systematically, in the general context of normal systems of first-
order DEs, of the form

dx,
= X1(x1... 5 Xa t)
dt

(1)
dx,
=
n(Xy5 » Xq} f)
dt

For the most part, we shall restrict attention to the existence and uniqueness of
such solutions. But in the later sections, we shall consider more sophisticated
questions, such as the analyticity of solutions and their dependence on the initial
value vector c (1, , ¢,). We shall prove that, as one might expect, this
=
=

dependence is differentiable and shall derive perturbation formulas that express


the relevant partial derivatives explicitly.
The theorems proved in this chapter will include as special cases all the exis-
tence, uniqueness, and continuity theorems proved in earlier chapters. In Chap-
ters 7 and 8 to follow, we shall describe and analyze algorithms for effectively
computing the solutions whose existence is established in this chapter. Those who
are willing to assume plausible results, and who are mainly interested in appli-
cations, may wish to skip to Chapter 8.
We will continue to make the assumptions of Ch. 5, §1: that the X, are con-
tinuous, real-valued functions of the independent variables x,, x, > X,» tin
some region # of interest in (%), » Xn, )-Space. We shall also use the vector
notation introduced there, rewriting (1) as

(2) dx/dt = X(x, t) or x’(t) = X(x, 0)


170
1 Introduction 171

The curve x(é) in # defined by any solution of (1) will be called a solution curve
of the system (1).
Note that we can trivially inflate any normal system (1) of n first-order DEs
to a normal autonomous system of n + 1 DEs by the simple device of writing
t = X,+4,- This gives the equivalent system

dx,
—_— =

(x1, » Xns Xp+t)s j=l, »n+i


dt

where X,,,, is the function 1. However, this does not help to prove the theorems
of major interest.
As in Chapter 5, §1, one can view any autonomous system x’(i) = X(x) as
defining a steady flow in the appropriate region # of x-space. Although this
does not help to prove the theorems of major interest, it does make it easier to
visualize their meaning.
Thus, a continuously differentiable function U(x) is called an invariant of the
autonomous system x’(!) = X(x) when U(x(t) is constant for every solution x(t)
of this vector DE—i-e., when & X;(x) 9U/dx; = 0. This means that each solution
curve of (2’) stays on a single level surface U = const., and thus generalizes the
concept of “integral” defined in Chapter 1.

Example 1. Consider the autonomous system

dx dy dz
= —Yz, _ =
— = Xz,
“oy
di

It is easy to verify that, when x =

=
x(t) and y = y(é) are solutions, the two
functions

V(x, y, Z) = xy and Wx, 9,2) = xr + yr ~ 2

satisfy dV/dt = x dy/dt + y dx/dt = 0 and dW/dt = 0. Therefore, V and Ware


integrals of the system. The intersection of two surfaces V = c, and W = cisa
solution curve of the system. Thus, every solution curve lies on the intersection
of a hyperbolic cylinder xy = c, and a hyperboloid (or cone) x? + y? — z2 = Co.

As a familiar special case, consider also the plane autonomous system dx/dt
= N(x,y), dy/dt = —M(x,y) associated with the DE M(x,y) + N(x,y)y’ = 0, The
function U(x,y) is an integral of this system, associated as in Ch. 1, $5, with the
integrating factor p(x,y), if and only if OU/dx = uM and dU/dy = BN.
First-order normal systems (1) provide a standard form to which all normal
ordinary DEs and normal systems of DEs can be reduced. For example, one can
reduce the solution of a normal nth-order DE to the solution of a system of n
first-order normal DEs as follows. Let u(é) be any solution of the given nth order
172 CHAPTER 6 Existence and Uniqueness Theorems

DE,

du du du d’'y
(
=

dt” “at? de’ “? dite

Then the n functions x,(t) = u, x9(t) = du/dt, . , x,(t) = d"~'u/dt" satisfy


the normal first-order system

dx, dx,
= Xe+1> kR=1,...,7 1; = F(x), %2, » Xai)
dt di

Conversely, given any solution of the preceding first-order system, the first com-
ponent x,(¢) will have the other components xo, » %, aS its derivatives of
orders 1,...,” — 1. Hence, substituting back, x,(¢) will satisfy the given nth-
order equation.
In the presertt chapter, we shall use this standard form to develop a unified
theory for the existence and uniqueness of solutions of DEs and systems of DEs
of all orders.

2 LIPSCHITZ CONDITION

In order to make use of vector notation for systems of DEs, we recall a few
facts about vectors in n-dimensional Euclidean spaces. Addition of two vectors
and multiplication of vectors by scalars are defined component-wise, as in the
plane and in space. The length of a vector x = (x), 9, . ; X,) is defined as

Ix) =@2 +--+ > +2,3/7

Length satisfies the triangle inequality

Ix
+ yl = |x| + lyl

The dot product or inner product of two vectors is defined as

KY SH xXyy tes
+ xn

and satisfies the Schwarz inequality |x - y| = |x| - lyl.


We shall integrate, differentiate, and take limits of vector functions x(é) of a
scalar (real) variable ¢. All these operations can be carried out component by
component, as in vector addition.
For example, the derivative of a vector function

x(t) = (x, (8), X9(0), » Xn(t))


2 Lipschitz Condition 173

is the vector function x’(i) = (xj}(i), x4(é), , x/(#)). The integral f° x(¢) dt is the
vector with components f° x, £) dt, J2 xo(t) dt,..., J x,(é) dt. We shall often make
use of the fundamental inequalityt

(3) fix ai = f |x(é)|dt


A vector field X(x) is said to be continuous when each component X, of X is a
continuous function of the n variables x,, , X,, the components of the vector
independent variable x. This is equivalent to the following statement: The vector
field X(x) is continuous at the point (vector) c whenever, given « > 0, there
exists 6 > @ such that, if |x — c| < 6, then |X(x) — X(c)| < «. We leave it as
an exercise to verify that these definitions are equivalent.
There is no such thing as “the”’ derivative of a vector field X(x), but only a
‘Jacobian matrix” of partial derivatives X;/dx; relative to the different com-
ponents x, + Xqo

The reader who is not accustomed to working with functions of vectors


should note the differences between the following types of functions: vector-
valued functions of a scalar variable, such as x(é) = (x;(4), xo(@), . .. , X,(t)); sca-
lar-valued functions of a vector variable, such as |x| = tie
x] + aa
+ + x5; vec-
tor-valued functions of a vector variable such as

X(x) = (X,(x1, X9, ’ Xn)»Xo(x1; » Xn), ’ XA*1, .» Xy))

vector-valued function of a vector variable x and a parameter #, such as X(x, #).


A vector-valued function X(x) of a vector variable is said to be of class @” in
a given region when each of the component functions X,(x;, , X,) is of class
@” there. One can easily extend the definition of a Lipschitz condition to vector-
valued functions; as we shall see, this provides a simple sufficient condition for
the uniqueness and existence of solutions for normal systems.

DEFINITION. A family of vector fields X(x, #) satisfies a Lipschitz condition


in a region & of (x, é)-space if and only if, for some Lipschitz constant L,

(4) |X(x, ) — Xfy,)| = L|x—y| if (x, DER Y,)ER

Note that both terms on the left side of (4) involve the same value of #.

+ This inequality is the continuous analog of the triangle inequality

[x + x@ 4 eee 4 xO] < [x] + [x] 4 -- - + [x].

It can be obtained from this inequality by recalling the definition of the integral {° x(t) dt as a limit
of Riemann sums, using the triangle inequality for each of the Riemann sums, and passing to the
limit on both sides,
174 CHAPTER 6 Existence and Uniqueness Theorems

LEMMA. If X(x, t) is of class C' in a bounded closed (“compact’’) convext


domain D, then it satisfies a Lipschitz condition there.

Proof. Let M be the maximum of all partial derivatives |dX,/dx,| in the closed
domain D. For each component X, we have, for fixed x, y, ¢ and for variable s

# Xe +5y,01 = aX,* (x + sy; ),


k=l
0 Xp

Hence, by the mean-value theorem applied to the function X,(x + sy, #) of the
variable s on the interval 0 = s = 1, we have

Xx +y,) -X«%) = >- OX,


Ox,
(x + ay, ty,
k=l]

for some o, between 0 and 1. Squaring, and applying the Schwarz inequality to
the right side, we obtain

[X(x + y,) —Xx, )/? s (2» k=l


Ox,

Ox;
) (>: inl") < nM’ly|?
Consequently, summing over all components i, we have

[Xx + y, ) — X(x, D[? < n?M* Jy?

Taking square roots, the Lipschitz condition follows with Lipschitz constant nM.

3 WELL-POSED PROBLEMS

For DEs to be useful in predicting the future behavior of a physical system


from its present state, their solutions must exist, be unique, and depend contin-
uously on their initial values. As stated in Ch. 1, §9, an initial-value problem is
said to be well-posed when these conditions are satisfied. We now show that, if X
satisfies a Lipschitz condition, the vector DE (2) defines a well-posed (or ‘‘well-
set”’) initial-value problem.
We begin by proving uniqueness.

THEOREM 1 (UNIQUENESS THEOREM). [If the vector fields X(x, t) satisfy a


Lipschitz condition (4) in a domain R, there is at most one solution x(t) of the vector
DE (2) that satisfies a given initial condition x(a) = cin R.

+ A set S in n-space is convex when the segment joining any two points of the set S lies entirely within
S. This definition applies both to closed domains D and open regions &.
3 Well-Posed Problems 175

The proof of this theorem parallels that of Theorem 5 of Ch. 1. We show


that, if x(é and y(é) are both solutions of (2) and if they are equal for one value
of t, say t = a, it follows that x({) = y@) in any domain in which a Lipschitz
condition is satisfied.
Consider the square of the n-dimensional distance between the two vectors
x(é) and y(t). By definition, this is

of) = X [x,(t) — O12 = [x@ — yO? = 0

Differentiating o(/), and using the fact that x and y are solutions of the normal
system (2), we get

o(t) = 22 [x,() — »()1[XxO, ) — XQ, 1)

= 2[x@ — y) : [Xa@, ) — XY, 4]

By the Schwarz inequality, therefore, we have

o(t) = |o'@| = 21k — y): &&, ) — XG, D)I


<= 2|x —y] - |X(x, ) — X(y, )| S 2L|x — y|? = 2Lo)

By the result of Lemma 2 of Ch. 1, §10, it follows that if x (2) = y(a), that is, if
o(a) = 0, then o(f) = 0 [that is, |x(t) — y()|? = 0] for all t = a.
A similar argument works for ¢ < a: replacing ¢ by —?, as in proving Theorem
6 of Ch. 1, we obtain

— = = |o(| = 2Lo()
d(—t

again using the preceding inequality.


We shall prove next that the solutions of a normal first-order system (2)
depend continuously on their initial values.

THEOREM 2 (CONTINUITY THEOREM). Let x(t) and y(t) be any two solutions
of the vector DE (2), where X(x, t) ts continuous and satisfies the Lipschitz condition
(4). Then

(5) |x(a + h) — ya + A)| S "| x(a) — yo)|

Proof. Replacing a + ¢ by a — t, we can always reduce to the case h = 0.


Consider again off) = |x(é) — y(t)|. As in the proof of Theorem 1,

o'(t) = 2[x() — yO] - [X&«O, ) — XO, )] S 2L|x — y|? = 2Lol)

Applying Lemma 2 of Ch. 1, §10 to o(@), we get o(a + h) < o(a)e*™. Taking the
square root of both sides, we get the desired result.
176 CHAPTER 6 Existence and Uniqueness Theorems

From Theorem 2 we can easily infer the following important property of the
solutions of the DE (2).

COROLLARY. Let x(t, c) be the solution of the DE (2) satisfying the initial con-
dition x(a, c) = c. Let the hypotheses of Theorem 2 be satisfied, and let the functions
x(t, c) bedefined for |c ~ c°| < Kand |t — a| ST. Then:
(a) x(t, c) is a continuous function of both variables:
(b) ife > c°, then x(t, c) > x(t, c°) uniformly for |t — a| < T.
Both properties follow from the inequality (5).
In view of the preceding results, it remains only to prove an existence theorem,
in order to show that the initial value problem is well-set for normal first-order
systems (1). This will be done in Theorems 6~8 later.

EXERCISES A

1 Showthat u = x +» + zandv =
=
x? + y? + 2? are integrals of the linear system
dx/dt = y — z, dy/dt = z — x, dz/dt = x — y. Check that the solution curves are
circles having the line (, ¢, #) as the axis of symmetry.

Reduce each of the following DEs to an equivalent first-order system, and determine
in which domain or domains (e.g., entire plane, any bounded region, a half-plane,
etc.) the resulting system satisfies a Lipschitz condition:
(a) d®x/dt? + x? = 1 (b) ax/dt = xP, (c) dx/d = [1 + @?x/at?)?]'”
Reduce the following system to normal form, and determine in which domains a
Lipschitz condition is satisfied:

dv Qdu 3du
du —_—=
u? + v’, = Quu
dt dt dt dt

Show that the vector-valued function (¢ + be“, —e*/ab) satisfies the DE (2) with
X = [1 — (1/x9), 1/(x, — 4], for any nonzero constants a, 6

State and prove a uniqueness theorem for the DE y” = F(x, y, 9’), with Fe é}” (Hint:
Reduce to a first-order system, and use Theorem 1.]

(a) Show that any solution of the linear system dx/dt = y, dy/dt = z, dz/dt = x
satisfies the vector DE ax/dt = x, where x = (x, , 2).
(b) Show that every solution of the preceding system can be written x = da +
eb cosV3t/2 + csinV3t/2], for suitable constant vectors a, b, and c.
(c) Express a, b, and ¢ in terms of x(0), x’(0), and x” (0).

Show that the general solution of the system dx/dt


=
=
x*/y, dy/dt =_
=
x/2 is
x = I/(at + yy = —1/[2a(at + 6)).
Show that the curves defined parametrically as solutions of the system dx/dt = OF/
Ox, dy/dt = OF/dy, dz/dt = OF/dz are orthogonal to the surfaces F(x, y, z) = con-
stant. What differentiability condition on F must be assumed to make this system
satisfy a Lipschitz condition?

(a) Find a system of first-order DEs satisfied by all curves orthogonal to the spheres
x? + y? + 22 = Qax — a’.
(b) By integrating the preceding system, find the orthogonal trajectories in question.
Describe the solution curves geometrically.

10 (a) In what sense is the following statment inexact? “The general solution of the DE
4 Continuity 177

cy” = (1 + y”)°? is the circle (« — a)? + (y — 3)? =


=
c®, where a and 4 are
arbitrary constants.”
(b) Correct the preceding statement, distinguishing carefully between explicit,
implicit, and multiple-valued functions.
11 (a) Given & = a(i)x + by and ¥ = c(t)x + dQ)y, prove that, if b # 0

s-[eratt}e+[@—wm a+]. =
=
0

(b) Given that ¥ + p@)% + q@x = r(), prove that, if ¢ # 0, v = ¥satisfies

o+[p-4]o4+[p+e- oP Jorn peo

12 For which values of a, 8 does the function x%P satisfy a Lipschitz condition: (a) in
the open square 0 < x, y < 1, (b) in the quadrant 0 < x, y < +0, (c) in the part
of the quadrant of (b) exterior to the square of (a)?

13 For each of the following scalar-valued functions of a vector x and each of the fol-
lowing domains, state whether a Lipschitz condition is satisfied or not:
(a)x, +x tess
+ x, (b) x1%9 x, tt (c) y/(e? + 9°) (@) {x in @
|x| <1, (i) —00 < x, < ©, (iii) —00 < x, <0, }x,]| <1, k= 2.

14 Let X(x, é) = (X,(x, 4), . . . , X,(x, 2) be a one-parameter family of vector fields. Show
that X satisfies a Lipschitz condition if and only if each scalar-valued component X,
satisfies a Lipschitz condition, and relate the Lipschitz constant of X to those of the
x,

CONTINUITY

We shall now prove a much stronger continuity property of the solutions of


systems of DEs, namely that the solutions of (2) vary continuously when the
function X varies continuously. Loosely speaking, the solution of a DE depends
continuously upon the DE for given initial values.

THEOREM 3. Let x(t) and y(t) satisfy the DEs

dx/dt = X(x, t) and dy/dt = Yly, t)

respectively, on a <= t < b, Further, let thefunctions X and Y be defined and continuous
in a common domain D, and let

(6) |X, t) — ¥(@z,d)| <6 ast=b, zéED

Finally, let X(x, 1) satisfy the Lipschitz condition (4). Then

(7) Ix) — yO| S [x@ — y@|e"


4! + z feklt-4l — 1]

The function Y is not required to satisfy a Lipschitz condition.


178 CHAPTER 6 Existence and Uniqueness Theorems

Proof. Consider the real-valued function o(), defined for a = ¢ = b by

o(t) = |x —yO? = > [x() —yO)?


From the last expression we see that ¢ is differentiable. Its derivative can be
written in the form

a(t) = 2[Xx, ) — Y¥y@, ] - x@ — yO]


= 2((X(x, ) — X(y@, 0] - x@® — y@)}

+ 2X, t) — Yy@, ) - [x — yO}

We now apply the triangle inequality to the right side, and then the Schwarz
inequality to each of the two terms of the last expression. This gives the
inequality

|o’()| S 2|X&O, ) — XO, DI IxO — yO!


+ 2|XV@, ) — Yy@, OI |x® — yOl

To the first term on the right side we now apply the Lipschitz condition that X
satisfies, to the second term, we apply (6). This gives the following differential
inequality for o:

(8) o'(t) <= 2Lo(t) + 2eV a(t)

The theorem is now an immediate consequence of the following lemma.

LEMMA. Let o(f) = 0,a <1 <b be a differentiable function satisfying the dif-
ferential inequality (8). Then

(9) of) = | Vala)eb) + r (eh) — »} axt=b


Proof. We shall apply Theorem 7 of Ch. 1, $11, on differential inequalities
to (8). The right side of (8), the function F(, t) = 2Lo + 2Vo,satisfies a Lip-
schitz condition in any half plane o = gp that does not include the line « = 0.
Therefore, Theorem 7 of Ch. 1 applies when o(a) > 0. For, if o(@) > 0, then
the solution of the DE

du
(9’) _
%Vu + 2Lu, u=O0
dt

which satisfies the initial condition u(a) = o(a), will have a nonnegative deriva-
tive, and therefore will remain, for ¢ > a, within the half-plane u = o(a).
The DE (9’) is a Bernoulli DE (Ch. 1, Ex. C7). To find the solution satisfying
4 Continuity 179

u(@) = a(a), make the substitution v() = Vu(é). (The square root is well-defined
because u(é) = o(a) > 0.) This gives the equivalent DE

Qvuv! = 2ev + 2Lv?

If u(a) > 0, it follows that u(t) > 0 for all later ¢, since the derivative of u is
positive. This gives v(i) > 0, and so we can divide both sides of this DE by v.
The resulting DE is v’ ~ Lv = e, an inhomogeneous linear DE whose solution
satisfying the initial condition v(a) = Vu(a) is the function

Vult) = v(t) = Vula) &? + (/Lye"® —-1)

On applying Theorem 7 of Ch. 1, we obtain the inequality (9).


We must now consider the case o(a) = 0 when this theorem does not apply
directly. In this case, we consider the solution u,(¢) of the differential equation
(9’) that satisfies the initial condition u,(a) = 1/n. Since the right side of (9) is
positive, u,(¢) is an increasing function of ¢. We shall prove that u,() = o(?).
Suppose that at some point ¢; > a we had u,(¢,) < o(¢,). Then among all num-
bers ¢ with a < ¢ < é, such that u,() = o(é) there would bea largest, say ty.
Hence, we would have u,(f9) = o(f)) > 0 and u,() < of) for tj < t S é,. But
this is impossible by what we have already proved, since in the interval tp) S ¢ =
t,, the functions u(t) and o(¢) stay away from 0. Therefore a Lipschitz condition
is satisfied for (9’). We infer that

a(t) S [neh + E/E! — 1?

for all n > 0. Letting n — 00, we obtain the inequality (9) also in this case.

The following corollary follows immediately from Theorem 3.

COROLLARY. Let X(x, t; €) be a set of continuous functions of x and t, defined


in the domain D: |t — a| S T,|x — ¢| SK for all sufficiently small values of a
parameter €. Suppose that, as e > 0, thefunctions converge uniformly in D to a function
X(x, 4) that satisfies a Lipschitz condition. For each ¢€ > 0, let x(t, © be a solution of
dx/dt = X(x, t; €) satisfying the initial condition x(a; €) = ¢. Then the x(t; €) converge
to the solution of dx/dt = X(x, 1) satisfying x(a) = c, uniformly in any closed sub-
interval |t - a| = T, < Twhere all functions are defined.

EXERCISES B

1. Let X and Y be as in Theorem 3, and let x(a) — y(a). Show that |x() — y(@j/{t —
a| remains bounded as t > a.

2. Show that if |OF/dy| < L(x), then any two solutions of u’ = F(x, u) and vo’ = Fix, v)
satisfy |u(x) — u(x)| < |u(0) — v(0)|e,
For the pairs of DEs in Exs. 3-5, bound the differences on [0, 1] between solutions hav-
ing the same initial value (0) = c.
na

3. yy =PandY =1l+tytes- + 2
oa

!
180 CHAPTER 6 Existence and Uniqueness Theorems

a’
4,.—= 6 and at? =
=
—sin 0.
dt®

3,3 5,5 2n+1


io 2
—_— + y y
J
=
=
sin xy and
y’ = xy — +++ +(-1"
3! 5! (2n + 1)!
To what explicit formulas does formula (7) specialize for the system dx/dt = X(@?
For the DE dx/dt = bx + c?

*7 Show that the conclusion of Theorem 3 holds if only a one-sided Lipschitz condition
(x — y) - (X(x, ) — Xfy, )) S Llx — y|? is assumed for X.
Let X(x, ¢, s) be continuous for |x — c] < K, |t — a| < T, and |s — so] < S, and
let it satisfy | X(x, t, s) — X(y, t, s}| == L|x — y|. Show that the solution x(, s) of x’
= X(x, i, s) satisfying x(a, s) = ¢ is a continuous function of s.

*5 NORMAL SYSTEMS

Many important mathematical problems have normal systems of DEs of order


m > | as their natural formulation. We now give two examples and show how
to reduce every normal system of ordinary DEs to a first-order system.

DEFINITION. A normal system or ordinary DE’s for the unknown functions


&,(0), &(), ..., &,(t) is any system of the form

qr é,

(10) di™ =F, (, dé) —

dt
..3 &,
d&>
dt
3 Sn a

n
)
k
=
=

1, , m, in which for each & only derivatives d’é,/df of any &, of orders
p < n(j) occur on the right side.

In other words, the requirement is that the derivative d"™€,/dt™ of highest


order of each &, constitutes the left-hand side of one equation and occurs
nowhere else.

THEOREM 4. Every normal system (10) of ordinary DEs is equivalent to a first-


order normal system (1) (with n = m).

Proof. Each function F, appearing on the right side of (10) is a function of


several real variables. To reduce the system (11) to the first-order form (1), it is
convenient notationally to rewite n(z) as n,, and to define new variables x,, :
Xn, where

n=nt+...tn, n(1) + .. + n{m)


=
=

by the formulas

dé ae, _ a's
1 = &, x =
——

»%3 = >
n]
di di?” at"?

Xn+1 = £5, Xm+2 = de s » Xn +n


_ ob
di at")
5 Normal Systems 181

In terms of these new variables, the system (1) assumes the form

dx dX» AXyy,—1 Axa,


= F\(x, ’ Xn)s
=

—_—=

XQ» dt X3, > mi?


dt di d
AXm+1 AXni+ng
= Xn +2» ’ = F(x, » Xn» 7)
dt dt

It is clear that this system satisfies the requirements of the theorem.


The initial value problem for the normal system (10) is the problem of finding
a solution for which the variables

alg
dé ae dle
&, re oe di} » $2»
* aie)?
s
di" 1

assume given values at ¢ = a.


It is easily seen from the proof of Theorem 4 that, if the functions F,, con-
sidered as functions of the vector variables x = (x, , x,) and t, satisfy Lip-
schitz conditions, then so do the functions X,(x, f) in the associated first-order
systems (1). This gives the following corollary.

COROLLARY. [If the functions F, of the normal system (10) satisfy Lipschitz con-
ditions in a domain D, then the system has at most one solution in D satisfying given
initial conditions.

Example 2 (the n-body problem). Let n mass points with masses m,; attract each
J
other according to an inverse ath power law of attraction. Then, in suitable
units, their position coordinates satisfy a normal system of 3n second-order dif-
ferential equations of the form

d°x; — *)
mj(%
di? =) jFe
nen

and the same is true for d*y,/dé” and d°z,/dt”, where

ry = (%; — x)? +9; -9)? + & — 2)? =4,


Then Theorem 1asserts that the initial positions [x,(0), y,(0), z,(0)] and velocities
(x{(0), y{(0), z{(0)) of the mass points uniquely determine their subsequent motion
(if any motion is possible). That is, the uniqueness theorem asserts the determi-
nacy of the n-body problem. This theorem, taken with the continuity theorem, and
Theorem 8 to follow, asserts that the n-body problem is well-posed, provided
that there are no collisions.
To see this, let & = (&), . » &,) be the vector with components defined as
follows, for k = 1, ,n.

fi Xp» Esk = Deo Lontk = Zk

esntk Xho Sansa = Ie Esnth = Zh


182 CHAPTER 6 Existence and Uniqueness Theorems

In this notation, the system (10) is equivalent to a first order normal system of
the form (1):

En+3n h=1,...,3n
dé,
= F,(é) = > m(E,—3n ~ Ex-30)/rhton h=3n+1,...,6n

dt
Jj

where k(j) is the remainder of 7 when divided by n, and summation is extended


to those n — 1 values ofj such that (hk — 1)/n and (j — 1)/n are distinct and
have the same integral part. So long as no », = Q, that is, so long as there are
no collisions, a Lipschitz condition is evidently satisfied by the functions F,. When
one or more 7,, vanish, however, some of the functions F,, become singular (they
are undefined) and Theorem 1 is inapplicable.

Example 3. The Frenet-Serretformulast comprise the following normal system


of first-order DEs:

da ap
~—>$ See
Y ay 8
R(s) Ris) Ts)’ ds ~ Ts)
where a, 8, and y = a@ X @ are three-dimensionalf vectors: the unit tangent,
normal, and binormal vectors to a space curve. The curvature «(s) = 1/R(s) and
torsion 7(s) = 1/7(s) are functions of the arc length s; @ = dx/ds is the deriva-
tive of vector position with respect to arc length.
If we let n(s) be the nine-dimensional vector (@,, >, 03, 81, Bos B3, Yi» Yo. Y3)s
the system can be written as the first-order vector DE

dn/ds = Y(n; 5)

Here Y(q; s) is obtained by setting

K(S)Mis h=1,2,3
Y,(7; 5) = — K(s)m—3 + 7(S)ti+3 h=4,5,6
— 7(5)m—s h = 7, 8,9

If x(s) and 7(s) are bounded, the vector fields Y¥(n; s) satisfy a Lipschitz condition
(4) with L = sup{|x(s)| + |7(s)|}; hence, for given initial tangent direction a(0),
normal direction 8(0) perpendicular to (0), and binormal direction (0) = a(0)
X @(0), there is only one set of directions satisfying the Frenet-Serret formulas.
This proves that @ curve with nonvanishing curvature is determined up to a rigid
motion by its curvature and torsion.t}

+ Widder, p. 101.
ta X B denotes the cross product of the vectors @ and @.
+t This theorem of differential geometry can fail when «(s) is zero, because 8 = x”(s)/|x”(s)| is then
geometrically undefined, so that the Frenet-Serret formulas do not necessarily hold.
6 Equivalent Integral Equation 183

EXERCISES C

1. Find all solutions of the system

2,
dx dy d?x
—7 +sy
0, +xt+y=0
=

=—_ =
y
“Ue at* #2
di at?

2. Show that, if ay, = —a, then £ x? is an integral of the system

= > BngXyXp

3. The one-body problem is defined in space by the system

2
x
=
=

— xf(7), j= —wO, = —2f(), r =


x+y
+ 2?

(a) Show that the components L = yz — zy, M = zX — xz, and N = xy — yx of the


angular momentum vector (L, M, N) are integrals of this system.
(b) Show that any solution of the system lies in a plane Ax + By + Cz = 0
(c) Construct an energy integral for the system.

Let a = 2 in the n-body problem (Newton’s law of gravitation), and define the poten-
tial energy as V = — 2,2, mm,/rTy.
(a) Show that the n-body problem is defined by the system

m,a°x, av
=

dt? Ox

(b) Show that the total energy Z m,x2/2 + V(x) is an integral of the system.
(c) Show that the components & m,x,, etc., of linear momentum are integrals.
(d) Do the same as in (c) for the components Z m,(y,z zy,), etc., of angular
momentum.

Show that the general solution of the vector DE d*x/dt® = dx/dt is a + be! + ce,
where a, b, c are arbitrary vectors.

Exercises 6—9 refer to the Frenet -Serret formulas.

6 Show that, if a(s), B(s) , y(s) are orthogonal vectors of length one when s = 0, this is
true for all s, provided they satisfy the Frenet-Serret formulas.

Show that if 1/7(s) = 0, and dx/ds = a, the curve x(s) lies in a plane. [HiNT: Consider
the dot product ¥ - x.]

*8 Show that, if T = &R (k constant), the curve x(s) lies on a cylinder.

*9 Show that, if R/T + (TRY =


=
0, the curve x(s) lies on a sphere.

EQUIVALENT INTEGRAL EQUATION

We now establish the existence of a local solution of any normal first-order


system of DEs for arbitrary initial values. To this end, it is convenient to reduce
the given initial value problem to an equivalent integral equation. One reason
why this restatement of the problem makes it easier to treat is that we do not
have to deal with differentiable functions directly, but only with continuous
184 CHAPTER 6 Existence and Uniqueness Theorems

functions and their integrals. Every continuous function has an integral,


whereas many continuous functions are not differentiable.

THEOREM 5. Let X(x; 1) be a continuous vector function of the variables x and


t. Then any solution X(0) of the vector integral equation

(11) xf) =et+ f X(x(5),s) ds

is a solution of the vector DE (2) that satisfies the initial condition x (a) = ¢, and
conversely.

The vector integral equation (11) is a system of integral equations for r


unknown scalar functions x,(é), , x,(é), the components of the vector function
x(t). That is,

x,(t) = co,+ f X,(x1(S),XolS),... 5%,(S),5)ds, lsk=sr

(In §§6-—7, we will deal with r-dimensional vectors.)

Proof. If x(d) satisfies the integral equation (11), then x(a) = ¢ and, by the
Fundamental Theorem of the Calculus, x/() = X,(x(@); ) fork = 1,..., 7, so
that x(é) also satisfies the system (2). Conversely, the Fundamental Theorem of
the Calculus shows that x,(é) = x,(a) + Ji, x{(s) ds for all continuously differen-
tiable functions x(t). If x(é) satisfies the normal system of DEs (2), then
x(Z)
=
=
x(a) + J! X(x(s); 5) ds; if, in addition, x(a) =
=

c, the integral equation


(11) is obtained, q.e.d.

Example 4. Consider the DE dx/dt = e” for the initial condition x(0) = 0.


Separating variables, we see that this initial-value problem has the (unique) solu-
x
tion 1 — e&~ = t,x = —In (1 — #). Theorem 5 shows that it is equivalent to the
integral equation x(#) = J e* ds, which therefore has the same (unique) solu-
tion. Since the solution is defined only in the interval —00 < ¢ < 1, we see again
that only a local existence theorem can be proved.

Operator Interpretation. The problem of finding a solution to the integral


equation (11) can be rephrased in terms of operators on vector-valued functions
as follows. We define an operator y = U[x] = Ux, transforming vector-valued
functions x into vector-valued functions y by the identity

(12) y® = Ufx@] =e + f X(x(s),5)ds


If X(x, #) is defined for all x in the slab | —a| < T and is continuous, the
domain of this operator can be taken to be the family of continuous vector func-
7 Successive Approximation 185

tions defined in the interval |¢ — a] = T its range consists of all continuously


differentiable vector-valued functions defined in this interval, satisfying y(a) =
c. In this case, Theorem 5 has the following corollary.

COROLLARY. The DE (2) has a solution satisfying x(a) = c if and only if the
mapping U of (12) has a fixpoint in @{a, 5).

However, if X(x, ¢) is not defined for all x, the domain of the operator U has
to be determined with care. This will be done in Theorem 8 below.

7 SUCCESSIVE APPROXIMATION

Picard had the idea of iterating the integral operator U defined by (12), and
proving that, for any initial trial function x’, the successive integral transforms
(Picard approximations)

x’, x! = UX), x? = U%[x"] = Ulx'], x? = U[x’],...

converge to a solution. This idea works under various sets of hypotheses; one
such set is the following.

THEOREM 6. Let the vector function X(x; t) be continuous and satisfy the Lip-
schitz condition (4) on the interval |t — a| = T for all x, y. Then, for any constant
vector c, the vector DE x’(t) = X(x; t) has a solution defined on the interval |t — a|
ST, which satisfies the initial condition x(a) = c.

Proof. As remarked at the end of the preceding section, the operatorUis


defined by (12) for all functions x(#) continuous for |¢ — a| < T. In particular,
since Ux is again a continuous function of |f — a| < T, the function x* = U?[x]
= Ux is well-defined. Similarly, the iterates U®x, U*x, etc., are well-defined.
These iterates will always converge; a typical case is depicted in Figure 6.1.

*|
oo.
e/2
1+¢
g = ——+—

10>
l+¢t
r=

0.5

0.5 1.0 t

Figure 6.1 Picard approximation for dx/dt = x, x(0) = 1/2


186 CHAPTER 6 Existence and Uniqueness Theorems

LEMMA. If x°(t) = ¢, the sequence offunctions defined recursively by x! = U[x"),


x? = U[x'] = U?[x%, ,x” = U[x""'] = U"[x"], .. . converges uniformly for
jé-al ST.

Proof. Let M = supy—a)<7r |X(c; #)|; the numberM is finite because contin-
uous functions are bounded ona closed interval. Without loss of generality we
can assume that a 0 and ¢ = a, that is, that the interval is 0 = ¢ < T; the
=
=

proof for general a and for t < @ can be deduced from this case by the substi-
tutions i > i+ aandt—~a
— it.
By the basic inequality (3) for vector-valued functions, the function x'({) sat-
isfies the inequality

Ix'@® ~ x°@| = if X(x°(s),5)as


(13) < f |xcx’, 5) asm f a= mi
Again, by (3), the function x? = U[x"] satisfies the inequality

|x — x'O| = |f “DG! ) —XO"), as


= J |X(x"(s),s) — X(x%s),5)| ds
We now use the assumption that the function X satisfies a Lipschitz condition
with Lipschitz constant L. This gives, by (13), the inequality

Ix" — x'@| = f |X(x"(s), s) — XK"), 5)| ds


< [uso — x%|as<i [Meds = LM?
2

Similarly, for any n = 1, 2, 3,...,

|x") — x"()| = f |X(x(s), s) — X(x""1(5),s)| ds

< Lf |x"(s) — x""(s)| ds

We now proceed by induction. Assuming that

(M/L)(La)"
Ix") —x""O| = !
7 Successive Approximation 187

we infer that

M(Lt)"* 1
*(Ls)"
(14) |x") —x")| SL (#) ni L(n
+ 1)!

Next, we show that the sequence of functions x"(é) (n 0, 1,2 ) is uni-


formly convergent for 0 = ¢ = T. Indeed, the infinite series

n+l
GED
i
(7)azo | + 1)!

of positive terms is convergent to (M/L)(e“ — 1), and uniformly convergent for


0 <t ST. Hence, by the Comparison Test,+ the series x°() + D~o [x*11() —
x‘(é)] is uniformly convergent for 0 < t < T. The nth partial sum of this series
is the function x"(t). It follows that the sequence of functions x”(é) is uniformly
convergent. This completes the proof of the lemma
To complete the proof of Theorem 6, let x”(¢) denote the limit function of
the sequence x"(#); it suffices by Theorem 5 to show that x(#) is a solution of
the integral equation (11). To this end, we consider the limit of the equations
n+]
x = U[x"], namely the equations

"tg set f Xess

The left side converges uniformly, by the preceding lemma. By the Lipschitz
condition, [X(x"(s), s) — X(x"(), s)| <= L|x"(s) — x"(s)|, and so the integrals on
the right side also converge uniformly. It follows that they have a continuous
limit X(x(#); #).[ Passing to the limit, we have

x @=e+ f xe (s),
8)ds

This demonstrates (11) and completes the proof of Theorem 6

EXERCISES D

In Exs. 1-5, solve the integral equations specified

1. u(t) = 1 + Sp su(s) ds 2.ul) =1+ fi sur(s) ds


3. u(t) + ¢ = Ji su(s) ds 4. u(f) = 1 — fo u(s) tansds

5. ult) = fo [uls) + v(s)] ds v(t) = 1 — §3 us) ds

+ Courantand John, p. 535; see also Widder, p. 285

+ Courantand John, p. 537; Widder, p. 304


188 CHAPTER 6 Existence and Uniqueness Theorems

6. Show that the nth iterate for the solution of y’ = yx such that (0) 1 is the sum
of the first n + 1 terms of the power series expansion of e*/?
For the initial value problems in Exs. 7-10, obtain an expression for the nth function of
the sequence of Picard approximations x U"[x°] to the exact solutions
7. dx/dt ’ x(0) 9. dx/dt
=
=
tx, x(0)

8. dx/dt = 9, dy/dt —4x 10. dx/dt =


=
ty, dy/dt —Ix

x(0) 0, 1
=

(0)
= =

x(0) 0, 9(0) 1
=
=
= = =

2
For the initial value problems of Exs. 11-13, compute the functions x’, x x” of the
sequences of Picard approximations.
24 2
11 dx /dt =
=
x(0) = 0

12 dx/dt
=
=

yt dy/dt=x? + ?? (0) = y(0) = 0


13 dx/dt =
=
x(1 — 20), x(0)
*14 Show that, in Ex. 13, the sequence of Picard approximations converge for all é, but
that this is not so in Ex. 11. In Ex. 13, is the convergence uniform?

15 Let X(x, t) = Ax, where A is a constant matrix. Show that each component of the
nth Picard approximation to any solution is a polynomial function of degree at
most n

16 Establish the following inequalities for the sequence of Picard approximations

ro - xo) = B(EE— A) hy — ero =My L k=ntl


El aly

8 LINEAR SYSTEMS

A first-order system of DEs (1) is said to be linear when it is of the form

(15)
d
~ a;x(t) + bt) lsisn

In this case, we have X;,(x, ft) = Xj.) a;(t)x; + 5,(). In vector notation, the linear
system (15) is written in the form

(16) dx/dt A(x + b(t)

where A(é)x stands for the matrix ||a,,(¢)|| applied to the vector x, and b stands
for the vector (6, »b,)
When b(t) = 0, the system (16) is said to be homogeneous. Otherwise, it is
called inhomogeneous. The homogeneous system obtained from a given inho-
mogeneous system (15) by setting the 6; equal to zero is called the reduced system
associated with (15)
A basic property of a linear system of DEs (16) is that the difference x — y
of any two solutions of (16) is a solution of the reduced system. It can be imme-
diately verified that any linear combination ax() + by(é) of solutions x(#) and
y(t) of a homogeneous linear system is again a solution
8 Linear Systems 189

We shall now establish the existence of solutions of linear systems and


describe the set of all solutions.

LEMMA. Any linear system (15) with continuous coefficient functions on a closed
interval I satisfies a Lipschitz condition (4) with

(17) Ls 2 sup |a;()]

Proof. Since X(x, #) ~ X(y, 4) is the vector sum of n* vectors Z,;, with ith
component a,(x;; — y,) and other components zero, repeated use of the triangle
inequality gives

[X(x,) — XV, )| = 2 Iz;| <= s la,(t)| - Ix, — 9;I

sS a sup la;(O|- Ix — yl

The functions a;(t), being continuous on a closed interval, are bounded.f


Hence, the Lipschitz constant L of (17) is finite. This completes the proof of the
lemma.
We can now state the existence theorem for linear systems.

THEOREM 7. The initial value problem defined by a linear system (15), with the
a, ;(t) and b,(t) defined and continuous for |t — a | = T, and the initial condition x(a)
= ¢, has a unique solution on |t — a | ST.

Proof. The preceding lemma shows that such a system satisfies the hypothesis
of Theorem 6. This gives the existence of the solution. The uniqueness follows
from Theorem 1, again by the preceding lemma.
For homogeneous systems, we can construct a basis of solutions, as follows.

COROLLARY 1. Let x'(¢) be the solution of a homogeneous linear system dx/dt =


A()x that satisfies the initial condition xi(a) = 0, i # k, xi (a) = 1. Then the solution
satisfying the initial condition x(a) = ¢ = (c), » C,) is equal to the linear combi-
nation x(t) = cx'(t) + cox®() +--+ + c,x"(0).
Proof. The vector-valued function y(t) = x({) — Dj-1 cx) is a solution of
the linear system, since it is a linear combination of solutions. This function
satisfies the initial condition y(a) = (0, 0, ..., 0) because of the way in which
the initial conditions for the solutions x’ have been chosen. Since the identically
zero function is also a solution of the linear system, it follows from the unique-
ness in Theorem 7 that y(t) = 0, q.e.d.f

+ Courant and John, p. 101.


t In algebraic terms, Corollary 1 states that the solutions of a homogeneous linear system of dimen-
sion n form an n-dimensional vector space of functions. Therefore, any n + 1 solutions of such a
system are always linearly dependent.
190 CHAPTER 6 Existence and Uniqueness Theorems

The reduction of an nth order normal DE to a first-order system sketched in


in §1, when applied to a linear nth order DE in normal form

-1 -2
du u

dt
= p(t)
di”) + pl) Face + + pri) = + pala
transforms the DE into a homogeneous linear system dx/dt = A(é)x, where the
matrix ||@,,(é)|| = A(@ is defined as follows: a,() = 0 if 1 = i= n— 1 and
J#it la .@ = liflsisn— 14,0 = p10.
We therefore obtain the following result.

COROLLARY 2. An nth order DE in normal form, with coefficients p,(t) contin-


uous for |t — a| = T, has a basis of solutions ut) (1 S j Sn) satisfying the initialt
conditions u(0) =871,0 <isn—1.

More results about solutions of linear systems of DEs will be established in


Appendix A.

9 LOCAL EXISTENCE THEOREM

In Theorem 6, it was assumed that X(x, 4) was defined for all x and satisfied
a Lipschitz condition (4) for all x. But often this is not the case. For instance,
this assumption does not hold for the DE dx/di = e* of Example 4. The ratio

[X@,) — X0,9| _ @—1)


|x — 0| x

is unbounded if the domain of e” is unrestricted.t


Correspondingly, the conclusion of Theorem6 fails for this DE: the solution
which takes the value ¢ at ¢ = 0 is the function x(f) = —In (¢° — 2), and this
function is defined only in the interval —co < ¢ < e~*. Hence, there is no « > 0
such that the DE dx/dit = e* has a solution defined on all of |¢| < ¢ for every
initial value: the interval of definition of a solution changes with the initial value.
To cover this situation, and also cases where the function X is defined only
in a small region of (x), , X,)-Space, we now provea local existence theorem,
whose assumptions and conclusions refer only to neighborhoods of a given
point.

THEOREM 8. Suppose that the function X(x, t) in (2) is defined and continuous
in the closed domain |x — c| = K, |t — a| = Tand satisfies a Lipschitz condition

+ 8 is the Kronecker delta function: 5 = 0 if i # j and & =


=
1. For the concept of a basis of solutions
of nth order DEs, see Ch. 3, §4.
{ The same complications arise with the DE y’ = 1 + y’.
10 The Peano Existence Theorem 191

(4) there. Let M = sup |X(x, 1)| in this domain. Then the DE (2) has a unique solution
satisfying x(a) c and defined on the interval |t — a| = min (T, K/M).
=
=

Proof. All steps in the proof of Theorem6 can be carried out, provided we
know that the functions x"(f) referred to there take their values within the
domain D,: |x — c| = K, |é — a| S min (7, K/M), in which x(é) is surely
defined. In particular, note that since D, C D, the bound M and Lipschitz con-
stant L of Theorem 6 can be used in D,. Therefore, the proof is a corollary of
the following lemma.

LEMMA. Under the hypotheses of Theorem 8, the operator U defined by (12) carries
functions x() satisfying the conditions: (i) x(0) is defined and continuous on |t — a|
<= min (T, K/M); (ii) x(@) = ¢; iii) |x@ — c| S Kon the interval |t — a| S min
(T, K/M), into functions satisfying the same conditions.

Proof. In (12), suppose that x(s) satisfies conditions (i), (ii), (iii). We must
show that y(#) satisfies the same conditions. Clearly (i) and (ii) are satisfied by y(é).
By the inequality (3) we have (taking again ¢ = a for simplicity)

ly@ —c| = f X(x(s),5)as = f |X(x(s),s)| ds


lf M is the maximumof X and if |t — a| = K/M, this gives

ly¥® — c| =—
=
=

Therefore, (iii) is satisfied and y(¢) is defined for |¢ — a| < min (T, K/M), com-
pleting the proof.

Using the reduction of §1, taking an nth-order normal DE

yw = Flu, u’,u”,..., ul),t)


(18)

into an equivalent first-order normal system (1), we obtain the following.

COROLLARY. Lei thefunction F (x, X», » Xn, t) be continuous in the cylinder


|t— a| =T, |x —c| SK.Let (xq? + x5? +--+ +> +x,? + F*)'? =M,andlet
F satisfy a Lipschitz condition there. Then, on the interval |t — a| = min (T, K/M),
the DE (18) has one and only one solution that satisfies the initial conditions u(a) =
Cé4,p0Sisn—.

*10 THE PEANO EXISTENCE THEOREM

The existence theorems for normal systems (1) proved so far have assumed
that the functions X, satisfy Lipschitz conditions. We shall now derive an exis-
192 CHAPTER 6 Existence and Uniqueness Theorems

tence theorem, assuming only continuity. As shown in Ch. 1, solutions of such


systems need not be uniquely determined by their initial values.

THEOREM 9 (PEANO EXISTENCE THEOREM). If the function X(x, t) is con-


tinuous for |x — ¢ | = K, |t — a] ST, and if |X(x, i)| = Mthere, then thevector
DE (2)has at least one solution x(é), defined for

jt — a| = min (T, K/M)

satisfying the initial condition x(a) = c.

Proof. Using an elegant method due to Tonelli, we shall consider the equiv-
alent integral equation (11) of Theorem 5,

(19) xf) =ec+ f X(x(s),5)ds

and prove that this has a solution. Let T; = min (T, K/M). We may assume that
0 and that the interval is 0 = ¢ = Tj. In this interval we construct a
=

a =

sequence of functions x"(t) as follows. For 0 < t = T,/n, set x"() = cc. For
T\/n <t S T, define x"(é) by the formula

(20) x") =et+ fo X(x"(s),5)ds

This formula defines the value of x"(¢) in terms of the previous values of x"(s)
for0 =sSt—T,/n.
It follows, as in the lemma of §9, that the functions x"(¢) are defined for 0 =<
t <= T;. Also, we have

Ti

[x"@| = [e| + Mds = |c| + T\M

Hence, the sequence of functions {x"()| (1 = 1, 2, ...) is uniformly bounded.


Next, we prove that the sequence x” is equicontinuous in the following sense.

DEFINITION. A family ¥ of vector-valued functions x(), defined on an


interval /: |t ~ a| = T, is said to be equicontinuous when, given « > 0, a number
6 > 0 exists such that

jé—s| <6 implies |x@) — x(s)| <e

for all functions x € ¥, provided that s, ¢ € 1.


11 Analytic Equations 193

Indeed, using the inequality (3), we have

[x"(t.) — x"(t)| S f | X(x"(s),s)| ds= M|ty — ty|


from which it is evident that the x(t) are equicontinuous.

We now apply to the sequence x" the Theorem of Arzela-Ascoli, which is


stated below without proof.+

ARZELA-ASCOLI THEOREM. Le? x"(t) (n =


=
1, 2, 3,...) be a bounded equi-
continuous sequence of scalar or vector functions, defined for a < t = b. Then there
exists a subsequence x™(t) (2 1, 2,...) that is uniformly convergent in the interval.
=
=

Applying this result to the sequence x"(/), we see that it must contain a uni-
formly convergent subsequence x™(f), converging to a continuous function x*(é)
as n,
> ©.

It is now easy to verify that this limit function x™(¢) satisfies the integral equa-
tion (19). Indeed, (20) can be written in the form

&

(21) x"@ =ct+ f X(x"(s), s) ds — X(x"(s), s) ds


i- 'n,

As n, > 0, fy X(x(s), s) ds > JG X(x%(s), 5) ds because X(x, #) is uniformly


continuous; and the last term of (21) tends to zero, because, by the inequality
(3)

Fm | = Spe “™M “= 0 21
n;

Therefore, taking limits on both sides of (21) as n; —> 00, we find that x® satisfies
the integral equation (19), q.e.d.

*11 ANALYTIC EQUATIONS

We shall now consider the vector DE (2) under the assumption that X (x, 4)
is an analytic function of all variables x, , x,, t. The essential principle to be
established is that all solutions of analytic DEs ave analytic functions.*
The result is true whether the variables are real or complex; we shall first
consider the complex case. To emphasize that we are dealing with complex vari-

+ Rudin, p. 164 ff. The proof given there is for real-valued functions, but the method applies to
vector-valued functions.

* This section requires a knowledge of elementary complex function theory such as is found in the
books by Hille (Vol. 1) and Ahlfors.
194 CHAPTER 6 Existence and Uniqueness Theorems

ables, we rewrite the vector DE (2) as

(22) dz/dt = z(t) = Z(z,!), t=rtis

where z x, + ty, and Z, = X, + 7Y, are complex-valued functions.


=
=

J
We assume that the Z,(z,, » Zy #) are analytic functions of the variables z,,
Zo ., 2, and ¢ in the closed cylindrical domain C: |t — a| = T, |z—c| =K,
with maximum M there. By the lemma of §2, this implies that a Lipschitz con-
dition holds in C, for some constant L.
Vector notation can be adapted to complex vectors with the following
changes. The length (or norm) of a vector z (z1, 29» , Z,) with complex
=
=

components z, is defined as

[z| = (az¥ + zypB +--+ + z,249'”

The Hermitian inner product of two complex vectors z and

w= (wy, Wo, » W,)

is defined as

Zow = (zywt + zows + ++ + 2,0)

Note that z w = (w z)*: the dot product operation is not commutative for
complex vectors. (The set of complex n-vectors with the above inner product is
called a unitary space.)
Now let y be any path in the complex ¢-plane, defined parametrically by the
equation ¢ = t(¢) = r(o) + is(o), where 7, s € @' andais a real parameter. On
the path y, (22) is equivalent to the system of real DEs

(22’) x'(c) = X(x, y, o)r'(o) — Y(x, y, 0)s’(o)


y¥(o) = X&, y, o)s’(o) + ¥(x, y, or)

Theorems 1 through 8 apply to this system, which satisfies a Lipschitz condition.


Using the complex vector notation described before, we can also prove ana-
logs of these theorems directly, since the DE z(¢) = ZG, t(o))é’(c) is equivalent
to (22’), and hence to (22), on the path y.
The analog of the operator U of formula (12) is the operator W, defined by
the line integral

(23) w(@) = W@)] =e + f Z(z($),9)af


Since each component function Z, is analytic, the line integral defining the
operator W is independent of the path from 0 to ¢ in the complex t-plane, pro-
11 Analytic Equations 195

vided that this path stays within the domain C where the function Z is defined.t
By Morera’s theorem, the function w is therefore also analytic, in the sense that
each component w,(z) is. Thus, the operator W transforms analytic functions
into analytic functions. Moreover, the lemma of §9 still holds, because the inte-
grals in (23) can be taken along straight line segments in the complex {-plane.
This gives the following lemma.

LEMMA. For |t — a|< min (T, K/M), the operator Wdefined by (23) takes anal-
ytic complex-valued vectorfunctions z(t) with |z(t)— ¢|<S K into analytic vectorfunc-
tions w(t) with |w)— c|= K.

By repeated applications of this lemma, it follows that the functions W"[w"]


w"(t) defined by the Picard process of iterated quadrature, in the domain
=

|¢ — a| = min (T, K/M) of the complex t-plane, are all analytic.


We now apply the following result! from function theory.

Weierstrass Convergence Theorem. If a sequence {f,(/)} of complex ana-


lytic functions converges uniformly to f(é) in a domain D of the complex t-plane,
the /(d) is analytic in D.
By this theorem, the sequence of functions w"() converges uniformly for
|¢ - a] = min (T, K/M) to an analytic solution w™(é) of the integral equation

(24) zi) =e + f Z(), §) df = Wiz]


and hence of the complex DE (22). Applying the Existence Theorem of §8 for
real DEs to the system (22’), we infer the next theorem.

THEOREM 10. In Theorem 8, replace the real variables t, x,, X, with complex
variables t, z;, Z,. Under the same hypotheses, if the Z,(z, t) are complex analytic func-
tions, the vector DE (22) has a unique complex analytic solution 2(i) for given initial
conditions.

From this result and the uniqueness theorem, again for real DEs, we obtain
the following corollary.

COROLLARY 1. Let Z(z, t) be analytic in any simply-connected domain of z, t-


space, and let z (t) be any solution of the DE (22). Then z(t) is analytic.tt

+ This is true because the disk |¢ — a| << T where Zis defined is simply connected.

*Ahifors, p. 173. The result contrasts sharply with the case of functions of a real variable. By the
Weierstrass approximation theorem, every continuous function on a real interval a < x < bisa
uniform limit of polynomial (hence analytic) functions.

tt An alternative proof can be based directly on (22). If the z,() satisfy (22), they are continuously
differentiable. Hence, they are analytic (Ahlfors, pp. 24, 105; Hille, Vol. 1, pp. 72, 88).
196 CHAPTER 6 Existence and Uniqueness Theorems

Real Analytic DEs. A real function X(x, é) of real variables x,, , x, and
tis said to be analytic at (c, a) when it can be expanded into a power series with
real coefficients in the variables (x, — c,) and (¢ — a), convergent in the cylinder
|x — c| <4, |f — a] <, for sufficiently small positive y and ¢. When X(x, #)
is analytic, its power series is convergent also in the complex cylinder |z — c| <
n, |t — a| < € (¢ complex), and defines a complex-valued analytic function there.
Now, let a normal system of real DEs (1) be given, the X,(x, t) being analytic.
From Theorem 10, it follows that the resulting complex DE dx/dt = X(x, #) has
a unique complex analytic solution for given real initial values. On the other
hand, it also has a unique (local) real solution by Theorems 1 and 8. Hence the
two solutions must coincide, proving Corollary 2.

COROLLARY 2. If X(x, #) is an analytic real function of the real variables


X15 » X, and t, then every solution of (1) is analytic.

EXERCISES E

1. (a) Obtain an equivalent first-order system for d?x/dé? = t?x. Find the nth term of
the Picard sequence of iterates for the initial values x(0) = 1, x’(0) = 0.
(b) Prove that this initial-value problem has one and only one solution on (—,
oO),

(a) Obtain an equivalent first-order system for the DE d®x/dt” = x* + 2, and find
the Lipschitz constant for the resulting system in the domain |t} =< A, |x| = B,
Ix’| <C.
(b) State and prove a local existence theorem for solutions of this DE, for the initial
conditions x(0) = 6, x’(0) = ¢. Estimate the largest 7, U such that a solution is
defined on —U <1 T.

Show that, if F(y) is continuous for [y| =< K, and | F(y)| < M, every solution of y’ =
F(y) can be uniformly approximated arbitrarily closely for [x] <= K/M by a solution
of a DE y’ = P(y), where P is a polynomial.

Compute the nth Picard approximation to the solution of the complex system dw/dt
=
=
iz, dz/dt = w, which satisfies the initial conditions w(0) = 1, z(0) = i.

In the complex ¢-plane, determine a domain in which the system dw/dt = iz’, dz/at
= tw* has an analytic solution satisfying given initial conditions w(0) = wo, 2(0) = 2.
Show that the solution of the complex analytic DE

w'(z) = M | 21
—_

(1+2 yy
K
(lz1 < *)
which satisfies the initial condition w(0) = 0, is the function

i] =
w(z)= K (: + a IK
(n — 1)M

*7 Using the result of Ex. 6, show that the bound given by Theorem 8 for the domain
of existence of a solution is “best possible” for analytic functions of a complex
variable.
12 Continuation of Solutions 197

*12. CONTINUATION OF SOLUTIONS

Even when the function} X(x, 2) is of class @! and is defined for all x and #,
Theorem8 establishes the existence of solutions only in the neighborhood of a
given initial value. In other words, it establishes only the local existence of solu-
tions. We shall now study how such local solutions can be joined together to give
a global solution defined up to the boundary of the domain of definition of the
function X.

THEOREM 11, Let X(x, é) be defined and of class @' in an open region # of (x,
t) -space. For any point (c, a) in the region A, the DE (2) has a unique solution x(t)
satisfying the initial condition x(a) = ¢ and defined for an intervala St<b(bxs
00) such that, if b < 00, either x(t) approaches the boundary of the region, or x(t) is
unbounded as t > b.

Proof. Consider the set S of all local solutions of the system (2) that satisfy
the given initial condition x(a) = c. These are defined on intervals of varying
lengths of the form [a, T). Given two solutions x and y in this set, defined on
intervals J and I’ respectively, the function z, defined to be equal to x or to y
wherever either is defined, and hence also where both are defined, is also a solu-
tion defined on their union J U I’.
We now construct a single solution x, called the maximal solution, defined on
the union of all the intervals in which some local solution is defined, by letting
x(é) be equal to the value of any of the solutions of S defined at the point ¢. This
maximal solution x(é) is a well-defined function of class @', by the Uniqueness
Theorem. Furthermore, the interval of definition of this solution is the union
of all the intervals of definition and, therefore, is itself an interval of the form
ast<b.
Consider the limiting behavior of x(#), as ¢ tf b. By the Bolzano—Weierstrass
Theorem,‘ any infinite bounded set of points (x(¢,), ¢,) in xf-space must contain
a limit point. Hence either 6 = +00, or lim,, |x()| = +, or at least one
finite point (d, 6) is approached by at least one sequence of points [x(,), ¢,] on
the above solution curve. In the first case, ¢ is unbounded. In the second case,
x(é) is unbounded and the maximal solution may be said to “recede to infinity.”
It remains to consider the third case. A typical example is provided by choos-
ing the region & as the left half-plane t < 0 and x’(t) = ¢”? cos ¢~', with general
solution x C—sint?.
=
=

We shall now prove that, in the third case above, every limit point (d, 5) on
t = b of the maximal solution curve must lie on the boundary of #. Indeed,
suppose that it is in the interior; there would then exist a closed neighborhood

+ In this section we consider only real vectors and functions. The results can, however, be extended
to complex-valued and analytic functions, by methods similar to those used in § 1. The continuation
so defined is then the analytic continuation in the sense of complex function theory, by Theorem 9
(cf. Ch. 9, § 1).
+ Cf. Courant, Vol. 2, pp. 95 ff., where the Bolzano—Weierstrass Theorem is proved in R".
198 CHAPTER 6 Existence and Uniqueness Theorems

D: |x — d| =~ |t — 5] = «of (d, 0) also in ZR. Let M = max, |X|. Take


6 < min (, ¢/2M), and let G C D be the open rectangle |x — d| < «,
|t — 5| <6. Finally, choose k so that [x(t,), ¢,] « G. Then, applying Theorem 8
(in G) to the solution through [x(z,), #,], we see that it stays in G until ¢ = 6. Since
this is true for any ¢ > 0, lim,.,x({) = d. Hence, x(t) would have to coincide with
the unique (by Theorem 1) local solution of (2) through (d, b). Therefore, x(é)
would not be maximal, a contradiction.
The maximum length 5 — a of definition of the solutionxis called the escape
time of the solution for ¢ > a. There is a similar notion of the escape time for ¢
<a.

A solution with a finite escape time is one for which |x(é)| becomes
unbounded or reaches the boundary of # as t — b < 00, On the other hand, a
solution with an infinite escape time is one that remains within the domain of
definition of X for all tf > a. For example, every solution of the DE dx/dt = x
2
has infinite escape time, whereas every nonzero solution of the DE dx/dt = x,
namely every function x 1/(c — 2), has finite escape time.
=
=

*13. THE PERTURBATION EQUATION

It is easy to derive a formula for the dependence onc of the solution x = f(¢,
c) of the initial-value problem defined by the system x’() = X(x, é) and the initial
condition x(a) = c. For simplicity, consider first the case n = 1 of a single first-
order DE. Assuming that /(, c) is analytic, that is, that fhas a convergent Taylor
series expansion, we have

a off\=3c
9g of;_ 9

(25)
ot (
Oc (at Oc
[X(t ©), o)]

= |Xoe.0.0] -| of(t, ¢) |
“4

iC

When we expand around ¢ = 0, this gives formally

(26) fio =fO) + fiO + C/2DAO+---

where, by (25), fi) = Of/dc(t, 0) satisfies the linear perturbation equation

(27) Si® = Eo | A. fi(0) = 1


Hence, if we know f(t), we can compute /;(¢) in closed form by quadrature (Ch.
1). Illustrations of this “perturbation method” are given in Exs. F5 and F6
below.
As simple examples show (see Exs. F7—F10 below), the approximate solutions
13 The Perturbation Equation 199

obtained by linear perturbation are accurate near stable exact solutions of initial
value problems, but they can be misleading near unstable solutions.
We now drop the assumption that X is a one-dimensional vector, as well as
the assumption that the solution has a convergent Taylor expansion, and we
derive analogous results. We show that the solutions of a normal first-order sys-
tem (1) depend differentiably on the initial values, thus proving (at long last!)
that the solution curves of any normal first-order DE or system form a normal
curve family.

THEOREM 12. Let the vector function X be of class C', and let x(t, c) be the
solution of the normal system (2), taking the initial value c att = a. Then x(i, c) is a
continuously differentiable function of each of the components c, of c.

The proof is subdivided into three steps.


A. Consider the system of DEs for the unknown functions h,, the components
of the vector h = (h, » hy):

dh, OX,(x(t, c) > )


(28)
di
»
ry 0
h, + Ah, t, c, 7)

where x(t, c) is the solution of the normal system (2) for which x(é, a) = c. We
assume that the functions H; are bounded for |t — a] <= Tand |h — #| =
K, where 6 is the vector whose components are the Kronecker deltas 6]. We also
assume that H, tends to zero as y, > 0, uniformly for |f — a] <= T and
|h — 6’| < K. We define hi = hii, c, n,) as the solution of (28) that satisfied the
initial condition h/(a) = 4/. Applying the Corollary of Theorem 3, with « = 9,
we find that h/ tends, as y, > 0, to the solution f? of the linear system

th _ > OX,(x(¢,c), #)
(29) f
dt k=1
Ox,

satisfying the same initial conditions, namely, St; (a) = 8.


In addition, we infer from the same Corollary that the vector functions h’
remain bounded as y, > 0.
B. Set

x,(t, C1, Cos +3 G-1,9 + np G+ » Cn) ~ xt,c)


gilt, ¢, 9) =
1,

We next find a differential equation satisfied by the vector partial difference


g) = (gr, &), > Si).
By definition,

dgi(t, c, ny) -
dt
ny[Xx ©) + ng’ ©), t) — XGx(, ©))]
200 CHAPTER 6 Existence and Uniqueness Theorems

We now use the assumption that X, is @’. By Taylor’s theorem for functions of
several variables, we infer that the right side equals

AX(x(t, ©), t)
A(t, c) + | g,(t, ¢)|
k=l Ox;

where ¢; is a function of #, c, and 7; that tends to zero as n; — 0, uniformly as


the variables ¢ and c range over closed intervals. Setting H,(h, t, c, n;) = ¢,|h(,
= h of a system (28). The
c)|, we find that the vector functiong’ is a solution g/
function H satisfies the conditions stated under Step A.
C. The initial conditions satisfied by the g’ are, by definition,

»G-p & + Nis Cj+1 » €n) (i, ¢)


(a, Cy, Cos
g(a,c,n,)
nj
et ,
=0 if: #j
nj
+i;
=] ifi=j
1

Combining with the results of steps A and B, we conclude that, as 1 0, the


function g/ tends to the solution h/ of (25’) satisfying the same initial condition.
But we know that

lim g(t, ¢, 4) = Ox(t, c)/dc;


nj0

We have, therefore, shown that the derivative 0x/dc, exists and is indeed a solu-
tion of (29), q.e.d
The linear DE (27) is called the perturbation equation or variational equation of
the normal system (2), because it describes approximately the perturbation of
the solution caused by a small perturbation of the initial conditions
In the course of the preceding argument we have also proved the following
result

COROLLARY. /f x(t, c) is a solution of the normal system (2) satisfying the initial
condition x(a) c for each c, and if each component of the function X is of class @
then for each j the partial derivative Ox(t, c)/dc, is a solution of the perturbation equa
tion (28) of the system

= A()x + bi), the perturbation equation


In the case of linear systems dx/dit
= A(h of the given system andis the same for
is the reduced equation dh/di
all solutions. Butin nonlinear systems, the perturbation equations (27) and (28)
depend on the particular solution x(, ¢) whose initial value is being varied
13 The Perturbation Equation 201

Plane Autonomous Systems. We now apply the preceding results to the


trajectories of autonomous systems. The main result is that, near any noncritical
point, the trajectories of an autonomous system look like a regular family of
parallel straight lines. We give the proof for the case n = 2. Recall that a plane
autonomous system is one of the form

(30)
& = X(xa) iy
>-
=
=
Y(x,9)
di dt

THEOREM 13. Any plane autonomous system where X andYare of class @' is
equivalent under a diffeomorphism, in some neighborhood of any point that is not a
critical point, to the system du/dt = 1, do/dt = 0.

Proof. Let the point be (a, 6); without loss of generality, we may assume that
X(a, b) # 0. Let the solution of the system for the initial values x(0) = a, (0) =
cbhex = &t, c), y = nt, c), so that 0&/dt = X, On/dt = Y. Then by Theorem 12,
the transformation (t, c) F (E(t, ¢), n(t, c)) is of class @'. Moreover, since x(0) does
not vary with ¢, the Jacobian

ag, 7) 0& On of on
-_ SS xa, )- 1—0- Ya, b) = X(a,b)
=
—_—

At, c) at Oc Oc ot

is nonvanishing at (a, b). Hence, by the Implicit Function Theorem, the inverse
transformation u

=

t(x, y),
=_

=
c(x, y) is of class @'. In the (u, v)-coordinates,
the solutions reduce to u = 4, v = ¢ constant; hence, the DE assumes the
=
=

form stated, q.e.d.

COROLLARY 1. Any two plane autonomous systems are locally equivalent under
a diffeomorphism, except near critical points.

The system u 1,ov 0 is, therefore, locally a canonical form for plane
= =
= =

autonomous systems near noncritical points. In hydrodynamics, the velocity


field associated with this system is called a uniform flow.

COROLLARY 2. [If the functions X andY of the plane autonomous system (30)
satisfy local Lipschitz conditions, then tts integral curves form a regular curve family
im any domain that contains no critical points.

Proof. By Theorem 1, there is a unique integral curve of (26) passing through


each point ¢, not a critical point. As shown in §12, each such integral curve goes
all the way to the boundary. Finally, since Lipschitz conditions imply continuity,
the directions of the vectors (X(x, y), Y(x, y)) vary continuously with position,
except near a critical point, which completes the proof.

EXERCISES F

1. Let F(x, y) be continuous for |x — a] =< T, [y — c] < K. Show that the set of all
solutions of y’ =
=
F(x,y), satisfying the same initial condition f(a) oo
=
c, is
equicontinuous.
202 CHAPTER 6 Existence and Uniqueness Theorems

2. Show that, if X(x, #) is continuous and satisfies a Lipschitz condition for a = t <= 3b,
every solution of the DE (2) satisfying x(a) = ¢ is bounded for a = ¢ = db. Show that
the corresponding result is not true for open intervals a < t < b.

Let «X(x,y) + °¥(x,y) = 0, where X andYare of class @!. Show that the system x’
= X(x,y), y’ = Y(x,y) has infinite escape time. [Hint: Show that 2x” + y* is an integral
of the system.]

Let the function X(x, /) be defined for 0 < ¢ < © and for all x, and let

IX(x, #) — XY, )| = LOIx — yl


where Jf L() di < 00. Show that the DE dx/dt = X(x, #) has a solution on 0 <i <
+ 00 for every initial condition x(a) = c. Show that, if one solution is bounded, then
all are.

Let X(x, #, s) be of class @! for [x — ec] <= K, |t — af ST, [s — 5| SS. Let x(t, s)
be the solution of x’ = X(x, é, s) satisfying x(a) = c. Show that x is a differentiable
function of s.

*6 Under the assumptions of Ex. 5, suppose that X(x, ¢, s) is of class @*. Show that
x(t, s) has n continuous partial derivatives relative to s.
*7 Show that if there are two distinct solutions fand g of y’ = F(x, y) satisfying the same
initial condition ¢ = f(a) = g(a) (F continuous in [x — a] = T, |y ~— c| = 4A), there
are infinitely many of them.

*8 Show that there is a maximal and a minimal solution fy(x) and f,(x) of the DE in Ex.
7, such that f,,(x) <= f(x) < f(x) for any other solution f such that f(a) = f(a) =
Fal). UHint: See Ch. 1, Ex. F4.]

*9 Let F(x, y) and G(x, y) be continuous for a = x = T, |y — c{ = K, and F(x, y) = Gi,


y). Let f be a solution of y’ = F(x, y), and let g be the maximal solution of y’ = G(x,
4). Show that, if f(a) <S g(a), then f(x) = g(x) for x > a.

ADDITIONAL EXERCISES
*1. Let dx/dt = X(x, y, t) and dy/dt = Y(x, y, t), where

(x — x)[X(x, y, ) — Xe’, 9, O1 + y — YK, 9, — VR, 1


is everywhere negative or zero. Show that, for ¢ > 0, the above system has at most
one solution satisfying a given initial condition at ¢ = 0.
vo
In Exs. 2~4, f; means the right-derivative; prove the implication specified. You may
assume the existence of f{. and gj freely.

2 If fi (x) S gi (x), then f(x) — fly) = g(x) — gly) for x = y.

3 If [ft()| <= KIA) then |fx){ <= |fla)|eX'"


"4! for
x = a.
4 If (fi @)| <= KIfG)| + «, then |fix)| = La) [eX!=! + («/K)(eK"—"| — 1).
5 Let dz/dt = Q(z, » 2,), where the Q, are quadratic polynomials. Show that,
for any initial condition, the nth Picard approximation to the solution is a polyno-
mial in ¢ of degree at most 2” — 1.

(a) Prove that, if there is a normal Ath-order ordinary DE satisfied by two functions
uand v and if n > &, there is a normal nth-order DE satisfied by both functions.
State your differentiability assumptions.
(b) Prove that, if the given kth-order DE is linear, then the nth-order DE can also
be chosen to be linear.
13 The Perturbation Equation 203

(c) Prove that there is no fourth-order normal DE u” = Flu, u’, u”, u’”, t) satisfied
by both u =
=
t* and v =
=
i for all real ¢.
8
(d) Prove that «
=
=
satisfies no normal linear homogeneous DE of order six or
less with continuous coefficients.

Show that, if X,, ., X, satisfy Lipschitz conditions on a compact domain, so does


any polynomial function of the X,.

Show that, if X(@) = ||x,(¢)]| is a matrix whose columns are solutions of the homo-
geneous linear system X’ = A(#)X, then det X(f) = [det X(a)] exp ff E ay,(s) ds.
A matrix X(é) is a fundamental matrix for a <= t = a + T of a homogeneous linear
system X’
=
=
A(X if its columns are solutions of the system and det (X(t) # 0. Show
that, if the columns of X are solutions of the system and if det X(a) # 0, then X is
a fundamental matrix.

*10. Show that, if X(¢) is a fundamental matrix of the reduced linear system, the function
x(t) = X(t) JLX7'(s)b(9) ds is the solution of the inhomogeneous system such that x(a)
= 0 (X7! is the matrix inverse of X).
CHAPTER 7

APPROXIMATE
SOLUTIONS

1 INTRODUCTION

During the past 40 years, the accurate numerical solution of initial value
problems for ordinary DEs has become routine, because of the availability of
high-speed programmable computers. Even fairly large systems of DFs can be
treated similarly in many cases, although “stiff? systems involving time scales of
different orders of magnitude can be troublesome.
This development has not only made the study of classical numerical methods
(e.g., Runge-Kutta methods) more important, as practical substitutes for
involved analytical considerations, it has also increased interest in numerical
mathematics from a theoretical standpoint. In particular, the power series methods
explained in Chapter 4, together with techniques of numerical linear algebra,
have provided the basis for a new field of research.
Because of this changed emphasis, a few simple numerical methods for solv-
ing DEs were already described in Chapter 1, §8. In this chapter and the next,
we will treat the numerical solution of ordinary DEs and systems of DEs more
carefully. This chapter will concentrate on the underlying ideas, while the effec-
tive technical implementation of these ideas will be the subject of Chapter 8.
Since these ideas are applicable to systems of first-order DEs, we will adopt
throughout Chapters 7 and 8 the vector notation introduced in Chapter 5.
Thus, we will consider vector DEs of the form

(1) x’(t) = X(x, t), astSa+T

However, since writing and “debugging” computer programs for systems of


DEs can be very time-consuming, most students will probably find it more satis-
factory to interpret all statements and formulas in the conceptually simpler
context of y’ = F(x,y), the case of a single ordinary first-order DE discussed in
Chapter 1.
The basic idea involved, that one can use simple arithmetic to compute
approximate solutions of DEs, is a very natural one. Indeed, the simple methods
to be analyzed in this chapter were mostly known to Euler. However, their rig-
orous error analysis is more recent, having achieved a definitive form only
around 1900.
204
2 Error Bounds 205

Approximate Function Tables. The most effective methods for obtaining


approximate solutions of DEs compute in each case [i.e., for each DE (1) and
initial value x(a) = c] an approximate function table. Given any partition

at+T
=

(2) Tv €=ty<t<tp<t,<---
<4, =

of an interval [a, a + T] of interest by a sequence of mesh poinis t,, it produces


a sequence of approximate values x,(t,), nearly equal to the “true” values x(t,) of
the exact solution, whose existence and uniqueness was proved in Chapter 6.
The difference e,;(i,) = xz(t,) — x(t,) is the error (or “‘discretization error’’) of
the method, and this chapter will be mainly concerned with the error analysis of
the methods discussed.

2 ERROR BOUNDS

Cauchy Polygon Method. The simplest way to construct an approximate


function table for the solution of the DE (1) satisfying the initial condition
x(a) = c, ona given set of mesh points ¢,, is the Euler method of Ch. 1, §8. This
constructs from x) = x(a) = ¢ the sequence of values

(3) Xp = ¢, Xe = Key + XK, te — te-1), k=1,. ,m

This formula is recursive; each value x, can be computed knowing x,-_, alone.
From the approximate function table just defined, one can also construct an
approximate solution by linear interpolation. This approximate solution is defined
by the formula

(3/) Xa() = Xp M(Kp-1, e—-)E — G1) on [t,-1, tg]

Evidently, the graph of the approximate solution (3’) consists of m segments of


straight lines; it is a polygon in the (n + 1)-dimensional (t, x)-space. The function
defined by (3) and (3’) for each partition @ and initial value c is called the Cauchy
polygon approximation to the solution, for that partition.+

Example 1, When the DE (1) is of the special form x’ = f(t), the preceding
method reduces to the Riemann sum formula of Ch. 1, (5’):

(4) ” f di= YM At,, At,= t — th-1, to = 4, be< tes


where the symbol = is to be read “is approximately equal to.” The proof of
convergence to the exact solution, in this case, is the essence of Riemann’s the-
ory of integration.

+ It was Cauchy who first proved their convergence to exact solutions, though Euler had used “Cau-
chy polygons” a century earlier.
206 CHAPTER 7 Approximate Solutions

15 q T

1.25 -

h=0.2

1L0r

y mer
h=mAar=|xij=}
75

! l f J
0.55
0.2 0.4 0.6 0.8 1.0
t
Figure 7.1 Cauchy polygons for dx/dt = x, x(0) = 1/2.

An error bound for the Euler—Cauchy polygon approximation can be derived


in any closed bounded domain D, for any vector DE (1) whose right-hand side
X(x,é) is continuous and satisfies a Lipschitz condition

(5) [X(x,t) — Xly,4)| = LI|x — y|

This bound also depends on the norm of the partition 7,

5 ||

=
max(At,, ., At,) = max lee — te)
k=1,...,

and on the maximum M of |X(x,t| in D. As the following theorem states, this


bound is roughly proportional to L, ||, M, and the length T = ¢ — a of the
interval of integration.

THEOREM 1. Let X€ @' satisfy |X| S M, |OX/dt|S C, and (5) in the cylinder
atSa+t/T, |x — cl S MT. Then the Cauchy polygon approximation xz(t)
differs from the true solution x(t) by at most

(6) [xx(t) —x()| = E + ute — 1): Ie]


The proof of Theorem 1 will be presented and its significance explained in
§3. Here we emphasize that the inequality (6) only provides an upper bound to
the error. Because || is multiplied by a bounded factor, Theorem 1 asserts that
the error is O(|2]); hence it is O(#) in the case of a uniform mesh with constant
step size Ai, h.
=
=
3 Deviation and Error 207

However, as examples described in Exercises A show, the magnitude of the


true error may be very much smaller than the bound (6). Therefore, in most
practical computation, one relies on less general formulas. The basic fact is that,
in the important case ¢,
=
=
a + kh of a uniform mesh with mesh length h, the
error committed in using the Euler—Cauchy polygon approximation is usually
nearly proportional to h; see Figure 7.1.

Example 2. Consider the DE y’ = y on [0,1], for the initial value (0) = 1.


The exact solution is e*, with final value e = 2.71828182853 ---.
As was stated in Ex. £8 of Ch. 1, the final value of x,(1) computed by Euler’s
method,

3
(n — 1)h? n
(*) = (lth =1l+nh+n
2 + h 3!
+

is asymptotically e — (hk — th? + - - - ) e/2. This fact can also be deduced


from formula (*) (cf. Ex. A2 below).
Note that, in Examples 1 and 2, the error made in each individual step is only
O(h?). Since the number of steps is proportional to 1/h, the cumulative error is
still O(h). More generally, the order of magnitude of the cumulative error made
in integrating a first-order DE or system is an infinitesimal of order one less than
that of the error per step. It is the same as that of the relative error per step,
defined as the error divided by the length of the step.

*3 DEVIATION AND ERROR

This section will be devoted to proving Theorem 1, that Euler’s method has
O(h) accuracy. The proof will be based on a new concept: the deviation of a func-
tion from a DE. This concept is of theoretical interest in its own right.

DEFINITION. A vector-valued function y(#) is an approximate solution of the


vector DE (1), with error at most y, when |y(é) — x(| < 7 for all ¢ € [a, a + 7).
Its deviation is at most ¢ when y(#) is continuous, and satisfies the differential
inequality

(7) ly@ — X¥@, d| Se

for all except a finite number of points ¢ of the interval [a, a + T}.

Note that the definition requires the function y to be differentiable, except


at a finite, possibly empty, set of points. Such a function is said to be of class
D'.
The following example shows that an approximate solution can have a small
deviation without having a small error. It is essentially Example 8 of Ch. 1, §9;
note that the DE involved does not satisfy a Lipschitz condition.
208 CHAPTER 7 Approximate Solutions

Example 3. The function y(t) = 10~° is an approximate solution of the DE


dx/dt = 3x*/5 on the interval [0, 00), with deviation 0.0003. The exact solution
to this DE for the initial value x(0) = 107° is x = (¢ + 0.01)°. Atét = 1, it
assumes the value x(1) = 1.030301 instead of y(1) = 0.000001.
Theorem 1 asserts, among other things, that the preceding phenomenon can-
not arise if X(x;t) satisfies a Lipschitz condition (5). For such functions, we can
always make the deviation arbitrarily small by making the norm of the partition
sufficiently small. This result is contained in the following theorem.

THEOREM 2. Let X ¢ @’ satisfy |X| <= M, |9X/dt|<S C, and (5) in the cylinder
D: |x — c| SK,aStSat T. Then any Cauchy polygon in D is an approximate
solution of x’(t) = X(x,t) with deviation at most (C + LM)|x|.
In proving this theorem, we will use the fact that any Cauchy polygon approx-
imation is continuous in [a, a + T], and is differentiable at all points not mesh
points. At these, it still has a left and a right derivative.

Proof. On each subinterval (é;,¢,,;) of 7, it is clear, by (5), that | X(x(),¢) —


X(x,,2)| = L|x(t) — x,|5

|X(x(4),t) — X(x,t)| = LM|t — 4,| = LM|a|

since |x(t) — x,| = if X(x(s),s)ds} = Mt — t,|. Also

|X(x,,é) —_ X(x;,¢,)| = if dX/dt(s)as = C|z|


Adding together the two inequalities just obtained, and using the triangle
inequality, we get the desired conclusion:

(8) |X(x(#),2) — X(x,,,)| S (LM + C)|a|

We now prove a theorem that yields as a corollary an easily computed a priori


error bound for the Cauchy polygon method in terms of |7|, |X}max, the Lip-
schitz constant, and the deviation.

THEOREM 3. Let x(t) be an exact solution and y(t) an approximate solution with
deviation ¢, of the DE x’ (t) = X(x,t). Let X satisfy the Lipschitz condition (5). Then,
fort 2 a, we have

(9) Ix@ —yOl = Ix@ —y@le? + (<) (e-9 — 1)


Proof. Consider o(t) = |x(é) —y(é)|?. Differentiating,

a(t) = 2[Xx),t) — Xv] - x® — yO)


+ 2[X(y).) — yO) - xk® — yO)
3 Deviation and Error 209

Hence, adding inequalities, we obtain

o’(t) S 2Lo(t) + 2eV alt)

Now, set ¢ = v’; the foregoing gives v’ < Lv + ¢ (for ¢ > 0). Applying Theorem
7 of Ch. 1, §12, we get the desired inequality (9), much as in proving the lemma
of Ch. 6, §4.
A slight variant of the analysis leading to Theorem 3 yields a closely related
bound to the cumulative error of the Cauchy polygon approximation, as follows.
Define the directional derivative 0X/0£ of the vector function X(x,?) in the
direction
& = (&,€,, » §,) in t,x-space, for any vector & of unit length, as the
sum £ 0X/dt + Lin, & IX/dx,. It follows as in the proof of the lemma of Ch.
6, §2, that

|X(t,x) — X(u,y)| S |OX/dE] - | (tx) — (uy)|

where € is the unit vector in (¢,x)-space pointing in the direction (¢ — u, x — y).


This inequality gives a bound on the change in X(x,¢) along any side of a Cauchy
polygon, which we now use to complete the proof of Theorem 1.

Proof of Theorem 1. The inequality (6) of Theorem 1 is an immediate corollary


of Theorems 2 and 3. Under the hypotheses of Theorem 1, the deviation of
x(t) is by Theorem 2 at most ¢ (C+LM)|a|. Since x,(a) = x(a), the first
=
=

term of the inequality (9) vanishes if we let y(é) be the Cauchy polygon approx-
imation for the initial value x(a) in Theorem 3, and so (9) simplifies to

(*) Ix) —x,@| 3 [«c +Lwol\| [9 — 1)


This yields (6) by elementary algebra.
In particular, by setting

N
=
=

E + | [expLT— 1]
we obtain the following simple corollary of (6).

COROLLARY. Under the hypotheses of Theorems 1 and 2, let the interval [a, a
+ T) be divided into n equal parts of length h = T/n. Then the error of the Cauchy
polygon approximation is bounded by Nh, where N is a constant independent of h.

EXERCISES A

1. (a) What is the deviation of the approximate solution x = #°/2 — t*/24 of the initial
value problem defined by dx/dt = sin t, x(0) = 0 on the interval 0 <= ¢ =< 1?
(b) Compare the difference 1 — (cos 1) — 34 with the bound given by formula (*),
for the deviation computed in (a).
(c) For the initial value x(0) = 1, bound the difference between the solutions of
dx/dt = sin t and dx/dt = t — (/6).
210 CHAPTER 7 Approximate Solutions

In Exs. 2-5, for the initial value problem specified: (a) use the Cauchy polygon method
to compute an approximate function table for ¢, = 0.1, 0.2, ..., 1.0; (b) find the devia-
tion of the approximate solution obtained from this table by linear interpolation; (c) find
the exact solution, (d) find the error.
= x,
2.x (0) = 1 3. *¥ = 1 — 2x, x(0) = 0

4.%=y, Jr TK x(0)=0, ypOV=1 5X =y, Jae x(0) = 1, y(0) =0

6 (a) Find the deviation of the approximate solution y = 107!° of the DE x’() =
5x,
(b) What is the exact solution of this DE on [0, 0%) for the initial value (0) = 107!°
= (0)?
(c) Prove in detail] the uniqueness of this solution.

On the interval [0, 1], for any « > 0, construct an approximate solution with devia-
tion € to a suitable first-order DE, for which the exact solution with the same initial
value is unbounded.

*8 For the DE x’() = f(@) — x, show that

Ix.) — x@| = Cla, where C = sup |f’(0|

*9 (a) Sharpen (9) and (*) in the stable case [X(x, t) — Y(x, )] - [x — y] = 0. Compare
with the limiting case L = 0 of these formulas.
(b) When X(x, #) satisfies the one-sided Lipschitz condition [X(x, 1) — X(y, )] -
[x — 9] = L |x — y|?, how can (9) and (*) be sharpened? [Hrnr. See Ex. 8.]

4 MESH-HALVING; RICHARDSON EXTRAPOLATION

In practical computation, one can often reduce the error by a large factor by
accepting as a working hypothesis, the theoretical result that, in a wide variety of
situations, the truncation errort under repeated mesh-halvings is of the form

(10) 2(%,sh) — 9(x,) = Ch” + OK")

As has just been emphasized, the order of accuracy v 1 for the Euler—Cauchy
=
=

polygon method. For the modified and improved Euler’s methods to be dis-
cussed later in this chapter, »y = 2. For other methods to be discussed in Chapter
= 4.
8,»
If one knows p a priori, as one does for the methods of Euler just mentioned,
one can determine the unknown constant C in (10) with fair accuracy, by com-
paring the computed value Y) for a given partition x9, with the corresponding
value Y, for the partition 7, obtained from it by mesh-halving.
This is because formula (10) implies that

(11) ¥, — y(x) = > [(%> — yd] + OW)


+ The truncation (or “‘discretization’’) error is the error that would occur if computer floating point
arithmetic were exact. See the discussion of roundoff error at the end of this section.
4 Mesh-Halving; Richardson Extrapolation 211

Comparing with (10), we obtain,

[2Ӵ; ~ Yo] + owt)


(12) y(x,) =
(2 1)

The approximate value of y, = y(x,) obtained by suppressing the O(h’*!) term


in (12) is said to be obtained by Richardson extrapolation. This name is given to
honor the inventor of the method, L. F. Richardson, who called it “deferred
approach to the limit.”
Note that formula (12) corrects each computed value Y, by adding
(Y, — Yoq)/(2” — 1) to it. Thus, suppose we compute e* as the solution of y’ = y
for the initial value y(0) = 1 by the Euler-Cauchy polygon method for h = 27”
(m = 0,1,2,3,4). Then »y = 1, so that the corrected value is

(13a) Y= 2Y, _— Yop.

The resulting approximate values of e1 = (1 + h)*” are tabulated in Table 7.1,


together with their errors and the better approximations obtained using (13a).
For the improved Euler method (Heun’s method), the approximate value
¥, = (1 + h + h?/2)" has 0(h?) accuracy. Since p =
=

2, the corrected value is

(13b) y=yY,+ 5 (Y, — Yo) = 5 Ys — BY»)


The improvement made by applying Richardson extrapolation to this method is
shown in Table 7.2.
The final error is reduced by a factor of nearly 8 = 2° each time that the
mesh-length is halved.

Checking ». A good practical check on the reliability of Richardson approx-


imation consists in verifying that Y,, — Yo, is indeed about 2” times Y,, — Y,,
When » is unknown, one can also estimate it by assuming this same formula, for
all h. Summing the geometric series D2 2~” = 1/(2” — 1), we obtain after some
algebraic manipulation the following extrapolated approximationY to the lim-
iting value y of the series Y,, Y,/2,Y,/4, ...

Y=¥Y,-
(Yn — Yon)”
(14)
Yan —- 2Yon + Y;,

Table 7.1 Richardson Extrapolation of Euler’s Method

L 1 1
A= 2 4 8 16

Y, 2.25 2.44141 2.56574 2.63793


Error 46828 27687 -15254 .08035
2¥, — Yon 2.5 2.63282 2.69007 2.71072
Error -21828 .08645 .02821 00816
212 CHAPTER 7 Approximate Solutions

Table 7.2. Richardson Extrapolation of Heun’s Method

A= $ i & 1
Y, 2.640625 2.694856 2.711841 2.716593
Error .077657 023426 006441 .001689
(4¥,
— 3¥a4)/3 2.6875 2.712933 2.717503 2.718177
Error 030782 .005349 .000799 -000105

Caution. Although valid for sufficiently small A, formula (14) with h = }


overcorrects the computed value 2.694856 in Table 7.2, and overcorrects
2.44141 very badly in Table 7.1.

Roundoff Errors. The preceding discussion has set no limits to the fineness
of the mesh used in solving DEs numerically, and it has been tacitly assumed
that all arithmetric operations and function evaluations are exact. Actually,
however, the floatin -point arithmetic on many computers has an accuracy
of only around 10
-
5. On such computers, the dominant source of error
when h = 1/1024 (say) may well be due to so-called ‘roundoff errors” in
floating-point arithmetic. This is especially likely if values of the x, that are
not exact “binary decimals” are used—e.g. if h = .001 is used instead of
h = 1/1024.
Roundoff errors will be discussed again in Chapter 8, §6.

5 MIDPOINT QUADRATURE

As we have observed (Example 2 above), the relative error made in computing


e by solving y’ = y on [0, 1] for the initial condition »(0) = 1 by the improved
Euler method of Ch. 1, §8, is ~h?/3 + h*/4 + O(h*). In this section, we shall
derive some much more accurate error formulas for evaluating definite integrals
by the midpoint and trapezoidal formulas (i.e., for solving the DE y’ = F{x)).
This can be viewed as lending further credence to the Richardson extrapolation
method of §4.
The simplest formula for numerical quadrature having a higher order of
accuracy than the Cauchy polygon formula (4) is the midpoint quadrature formula

_ G1
+ x)
(15) SF(x) dx = M,[F] = > F(m,) Ax,,
1=1
t
2

Given the partition z of the interval of integration [a, 6] by points of subdivision


Xp <x, < <x, = b, the midpoint approximation M,[F] is easily
=

a =

computed; it takes its name from the fact that m, is the midpoint of the ith inter-
val of subdivision. We now derive an error bound for the midpoint quadrature
formula (15).
5 Midpoint Quadrature 213

THEOREM 4. IfF € @?, then

| fF #9 ax - ~ F(m,) Ax, =
JF” | max|r} 2(b _a)
(16) —

t=]
24

Proof. On each interval [x,—,,x,] = [m, — Ax,/2, m, + Ax,/2], Taylor’s for-


mula implies that

PF"(m; + 7)
F(m, + }) — Fm) — tF’(m,) =
2

where r is between 0 and ¢. But F’(m, + 1) is bounded below by the minimum


F”,, of F”(x) on [a, 6] and above by its maximum value F%,,. Therefore, we
have
tt
Frin e
—m— =< F(m, + 1) — Fm) — tF'(m) S
m: ax

Integration of this inequality over —Ax,/2 = t = Ax,/2 gives

3 Xe
Fmin z < Ft, Ax?
—_ ; F(x) dx — F(m,) Ax, < =
ae
24 i 24

Summing over i and noting that 0 < Ax? < |x|, we get (16).
Theorem 4 shows that the midpoint quadrature formula (15) has order of
accuracy O(h*), one order higher than the Cauchy polygon method.

Error Estimate. In the case of subdivisions into intervals of constant length


h, we can obtain a much more accurate estimate of the error in the midpoint
quadrature formula by considering the higher-order terms in Taylor’s formula.

THEOREM 5. Any function for F € @® on a uniform mesh with constant mesh


length Ax, 2k =h,
=
=

M, LF) = J"Ge)de —z [F'(b) — Fa)]


(17)
7h?
+
[F"() — F"(a)] + O(n)
5760

Proof. By Taylor’s formula with remainder, since F € @°, we have

5
Fmt" FOr!
Fm, +0 =>-
r=0 (7!) 720
214 CHAPTER 7 Approximate Solutions

where £ is some number between m, and m, + t. On each ith interval (x,_,, x,),
the final term (“remainder”) is bounded in magnitude by Mk°/720, where M =
max | F(£)|, the maximum being taken on the entire interval a < & = 5. Inte-
grating over —k =i Sk, we get

x1 Be Fe (m,)
WF?(m)
F(x) dx = 2kF(m,) + +
+ O(k*) Ax,
x:-1 3 60

where the factor O(&°) is bounded in magnitude by Mk°/720. When we sum over
z, there results the estimate

(18) ~ F(m,)Ax, = f F(x)dx — (=) a


y F"(m,)Ax,
z=1

- 120
> F°(m,) Ax, + O(F°)
emt

An application of (18) to the function F”(x) € @* gives similarly (one term being
dropped because of the loss in differentiability),

(18’) > F"(m,) Ax, f F"(x)dx —(


2=1
=>
=
_—

) : F°(m,) Ax, + O(k*)


=]

Applied to F(x) € @?, this gives

(18”) > F°(m,)Ax, = f "F(x)dx + O(k?)

Substituting from (18’) and (18”) back into (18), and combining terms, we get

7k
x F(m,)Ax, = f F(x)dx — . f F(x) dx + 360 J"F"(x)dx + O(k°) ——ee

21

When we set k = h/2, formula (17) follows immediately.


Note that the error estimate (17) implies the very accurate corrected midpoint
formula

IF) — P@)
(19) f " F(x) dx = 3 F(m) Ax, +
2=]
24

_ TF") — F"@)]
+ O(n)
5760
6 Trapezoidal Quadrature 215

EXERCISES B

In each of Exs. 1-4, a numerical quadrature formula is specified for approximately eval-
uating f”, f(x) dx. In each case: (a) compute the truncation error for f(x) = x", n = 0,
1, 2, 3, ..., and (b) find the order of accuracy of the formula, using Taylor’s formula
with remainder, assuming f(x) to be analytic.

1. Simpson’srule: S[f] = - Lf(—A) + 4f0) + fA).

2 Cotes’rule: CLf] = : [f-m = (3) + ar +su


3 Weddle’s rule:

WI) = a [Am + sr(- 2) +s(-4) + 6/0) +/(5) + or(2) +109

Hermiterule: HLf] = ALf(t) + f(-2)] - . Lf’) —f(—A)].


*5 In Ex. 3, find weighting coefficients w, such that the approximation

wof(—h)+ oy( _ 2h3 + wys(- *) + w3f(0)+ ws(3] + oy (2) + wef)


to ft, F(x) dx has a maximum order of accuracy. Compare with Weddle’s rule.
(a) Show that if F(x) = 1/x, then (F(2) — P(1)]/24 = 3g and [F”(2) — F’(1)]/
5760 = 1/1024.
(b) Infer that In 2 = M,(f) + h®/32 — 7h*/1024 + O(h').
(c) Knowing that In 2 = .69317408, compare with numerical experiments.

(a) Show that all odd-ordered derivatives F°**(0) of F(x) = 1/(1 + x?) vanish when
x = 0.
(b) Show that F(1) = —}and F’’(1) = 3
(c) Knowing that 7/4 = arctan 1 = f,3 dx/(1 + x*), derive the formula

he
+— -—
7h*
a/4 = M,[F] + O(n)
48 7680

*8 Derive formulas similar to those of Exs. 6—7 for


(a) J) V1 + x4 dx, and (b) J} sin (x’) dx.

6 TRAPEZOIDAL QUADRATURE

The formula for trapezoidal quadrature is

(20)
J” Rex) dx = T,[F] = 3 [F(x,-1) + Flx)] Ax,/2.
i=]
216 CHAPTER 7 Approximate Solutions

We will now use the concept of the Green’s function for a two-endpoint prob-
lem, as defined in Ch. 2, §11, to obtain an exact expression for the error in
trapezoidal quadrature over a single interval. Consider the linear function

(21) L(x) = Fa) + (« — a)[F(b) — Fla) /h, h=b-— a,

defined by linear interpolation between the values F(a) and F(6), and let R(x) =
F(x) — L(x). Then R” (x) = F’(x), and R(a) = R(b) = 0.
Now consider a single interval of length h, = x, — x,-; = 2k, and translate
coordinates so that (x,_,, x,) becomes the interval (—k, k). As in Ch. 2, §11, we
have

(22) R(x) = f ' G(x,HR(E)dé = f ' G(x,OF"(&)dé,


in which R(x), defined as above to be the difference between the function F(x)
and its trapezoidal approximation L(x), vanishes at the endpoints and satisfies
R” = F”. The Green’s function G(x, £) for F”
=
=
r(x) is given by

x =&,
(23) G(x,2) = |(fx/k + & — x — k)/2,
(fx/k — E& + x — h)/2, =x.

The error in trapezoidal quadrature over (—k, k) is

he

Ty[F) —ff F(x)dx =ff [L(x) —F(x)]dx = f Rex) dx. =

Substituting for R(x) the integral expression displayed above and interchanging
the order of integration in the resulting double integral, we get

Tr[F] —ff F(x)dx _ SAS, G(x,&)ax| F’(&)dé.


=
=

But by direct calculation, f*, G(x, &) dx =


=

—(k — £)/2. Hence

THEOREM 6. The error in trapezoidal quadrature over a single interval (—k, k)

isexactlyf . (k? — &)F"(6)dé/2.


Furthermore, since G(x, &) <= 0 for all x, & € (—&, k), we can use the Second
Mean Value Theorem of the Calculus to obtain as a corollary that the error is

F’(6) f . (k? — #7)dt/2 forsome £&€(—k,k)

The integral is easily evaluated as 2h°/3 = h3/12.


6 Trapezoidal Quadrature 217

COROLLARY. The error in trapezoidal quadrature over a single interval of


length h, is h? F’(E,)/12, for some &, in the interval.

Since h,? < ||*h,, summation over i now gives our final result.

THEOREM 7. The error bound for trapezoidal quadrature is given by the


inequality

(24) Tx[F] — f F(x)dx <I lal — 12


We shall next obtain an analog of Theorem 5 for trapezoidal quadrature in the
case that all the intervals of subdivision have the same length, Ax, = 2k = h.
For any F € @®, Taylor’s formula with remainder gives, much as in the proof of
Theorem 5,

Fm, + k) + Fm, — kh) = 2F(m,) + WPF"(m) + To Pim) + O(k*)

Multiplication by Ax,/2, followed by summation over i, now gives the further


estimate

R2 4

(25) Tr[F] = M,[F] + 2 M,[F”] +94 M,[F’] + O°)

The right side of (25) can be evaluated by repeated use of the midpoint quad-
rature formula error estimate (17). The conclusion is the truncated Euler—
Maclaurin formula.

THEOREM 8. For F € @®, let all intervals of subdivision have the same length
Ax; = 2k = h. Then

(26) T,[F] = f F(x)dx + = [F’() — F(a))


nt

[F"(b) — F”(a)] + O14)


~ 790
Proof. Replacing h by 2k in (17) and then substituting from (17) into (25),
we obtain, as the contribution from the first term on the right-hand side of (25),

b 2 4

J a
F(x) dx — & [FO — F(@) + =
360
[F”(b) — F”(a)] + O(K*)
218 CHAPTER7 Approximate Solutions

From the second term we obtain

)| fi rr dx — (6 [F"(b) — F’(a@)] + on'|


( 2

while the third term gives (k*/24)[F” (b) — F”(a)] + O(k°). Adding these three
contributions together, simplifying, and writing k = h/2, we get (26).

Simpson’s Rule. Comparing the error estimates (17) and (26) for midpoint
and trapezoidal quadrature, we are led to an error estimate for Simpson’s rule.
For a given partition 7, this is defined as

(28) SelF] = 2 M{F +5 TelFl = 16 > [F(x,-1) + 4F(m,) + F(x) Ax,


a=}

Forming the linear combination indicated for subdivision into double steps of
constant length 2k = h, we obtain

(29) Sr[F] = f F(x)dx + oe


180
LF"(b) — F” (a)) + Ove’)

Simpson’s rule will be studied further in Ch. 8, §9.

EXERCISES C

1. Use (26) to estimate the difference

ne-[ 1, 1 1 1

20

=640 >
k=1
10 +k 400 |
In Exs. 2—5, use (26) with A = 0.2 to evaluate the following numbers approximately:

2. In2 = fi dx/x 8. arctan] = fj dx/( + x’)

4. fi VT + x dx 5. Si sin (x?) dx
In Exs. 6-9, use Simpson’s rule (28) with double step 2k = h = 0.2 to evaluate approx-
imately the numbers defined in Exs. 2—5, respectively.
10. For a subdivision into 2n intervals of length kh = (6 — a)/2n, Simpson's approxima-
tion to f? f(x) dx is 5%, (4/3)[fx2,-1) + 4/flre,-1) + f(xs,)]. Show that the truncation
error is (#°/90) E%, f"(x2,-1) + OCA).
11 Show that f*, F(x) dx =
=

QhF(O) + 3 (ft, (Al — [x 13’) dx]. [Hint: Construct


the Green’s function for the initial value problem defined by u” = F”(x) and F(0) =
F’(0) = 0, and study the proof of Theorem 6.]

*7 TRAPEZOIDAL INTEGRATION

The rest of this chapter will be devoted to the theoretical analysis of three
classical methods for integrating first-order ordinary DEs (and systems of DEs).
Like (uncorrected) midpoint and trapezoidal quadrature, these methods have
7 Trapezoidal Integration 219

only O(h) accuracy. Since Runge-Kutta and other algorithms having at least
O(h*) accuracy are readily available and easy to use, readers who are primarily
interested in applications of numerical methods may wish to proceed directly to
Chapter 8, which will take up such more efficient methods.
For various reasons, the errors arising from the use of these more efficient
methods cannot in practice be predicted purely theoretically. Therefore, the dis-
cussion to follow will have no parallel in Chapter 8.
Our theoretical analysis will first take up trapezoidal integration. For any sys-
tem dx/dt = X(x, i) of first-order DEs, this is defined implicitlyt by the recursion
formula (difference equation)

(30) Ye = Ye-1 + X(ye-1, te-1) + Xn, t,)] At,/2

where Ai, = 1, t,-1. From a given initial value yp = ¢ and partition 1,


formula (30) defines a sequence of values y, = yr(t,), that is, a function table
describing approximately the solution of the DE x = X(x, #) satisfying the initial
value x(a) = c.
Note that when X = X(#), formula (30) is equivalent to the trapezoidal quad-
rature formula (20). Also note that, as in Ch. 6, §5, formula (30) can be
extended to DEs and systems of arbitrary order.t Last and most important, note
that if X(x, 2) = A(x + b{® is linear, then (30) is equivalent to

(30’) (i — 6,A)y, = I+ D,An—W)¥e—1 + 9;,(b,-1 + by)

where 6, = At,/2, A, = A(t), and b, = b(i). Hence, for At, small enough, the
system (30’) can be solved for y,, given y,-,, by Gaussian elimination.

Example 4. Consider the solution of the linear DE

(31) + 2x 1
=
=

taking the initial value x(0) = 0. By the formula of Ch. 1, §6, the solution is
the function x

=
e~® fi, e* ds. Looking up values of the definite integral in a
table,tt we get the first row of entries in the following display.

t 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

x 0.0993 0.1948 0.2826 0.3599 0.4244 0.4748 0.5105 0.5321 0.5407


J 0.0990 0.1941 0.2818 0.3590 0.4235 0.4739 0.5097 0.5315 0.5404
z 0.099 0.194 0.282 0.3586 0.424 0.4739 0.509 0.531 0.5383

+ For large At,, Eq. (80) may have more than one solution. But usually, y, can be computed by
iterating (30) two or three times.

[This does not mean that reduction to a first-order system is recommended in numerical
integration.

+{ E. Jahnke and F. Emde, Tables of Functions, Dover, 1943, p. 32; W. L. Miller and A. R. Gordon,
J. Phys. Chem. 35 (1931), p. 2878.
220 CHAPTER 7 Approximate Solutions

The second row of entries gives values y, of the approximate function table
constructed using the trapezoidal integration formula (30), with x(0) 0 and
constant mesh length h 0.1. With this value of 4, the formula in Example 4
reduces to

1 + (10
— t-1)y-1
10 + &

The entries tabulated in the second row were calculated from the preceding
formula, rounding off all numbers to six decimal digits and then rounding off
the final values to four decimals. The last row of the table gives values z, com-
puted by the “improved Euler method” of §8 below.
Above, we used a table of [6 e* ds; we next ask: How should one construct
this table? For large i, numerical quadrature formulas tend to be inefficient,
because the integrand e” varies so rapidly (by more than 10% between 5 and
5.01, for example). For this reason, rather than computing x(f) as we did, it is
more efficient to solve the DE (31), for the initial value x(0) = 0,by an accurate
numerical method (see Ch. 8), and then to compute {5 eds = é x(t) as a prod-
uct, than to compute x() as we did.

The preceding discussion illustrates an important principle. Reductions to


quadratures and substitutions in special formulas do not necessarily help one to
obtain accurate numerical values for solutions of DEs

Asymptotic Expansion. On the other hand, asymptotic expansions and


other analytical devices may be very helpful for analyzing the singularities of
solutions in DEs, their behavior for very large values of the independent vari-
ables, and their dependence on parameters. Thus, to evaluate the function of
Example 4 for very large ¢ (say, { = 25), it is best to set s = ¢ — rin Example 4
and to expand e” in a Taylor’s series. We then obtain

rt 6

ot [ea f (ree |a
Tr

+5t+qt-
2!

For large ¢, the kth term gives the integral

tt 1 (Qk)lt 2k-1

(RNQz+1 (RNQ7t! ’ p = 2ir

Moreover, by truncating the series at the kth term, an asymptotic error estimate
can be obtained
Knowing (or having guessed) the form of the asymptotic series x(t) ~ Lj-o
a [ee we also can derive from (31) purely formally, by the method of unde-
termined coefficients, that a) = 1/2 and a4, = (2k + 1)a,/2, from which
(asymptotically) we obtain

3 15
x~o~—+ 4° set abe
_——

as t—>
©
2t 81 16
7 Trapezoidal Integration 221

Though the series is (ultimately) divergent, the first few terms give an extremely
accurate approximation to x(¢) for ¢ 2 25, which is much more accurate than
could be obtained by general numerical methods

Error Bound. Finally, we derive a bound on the cumulative error of the


function table constructed by the trapezoidal integration formula (30), in terms
of the properties of X(x, t). To do this, we first construct an approximate solution
of the DE x’(t) = X(x, #) from the approximate function table of points (y,, é,)
defined by (30). Since Ay,/Az in (30) is the arithmetic mean of the slopes y; =
X, = X(y,, t) for 7 k — 1, k, the quadratic function

Qe = e-vT
9) = Ye + MAT + s Tet by
2 At

interpolates not only to y,_, and y,, but also to y,_, and yj at ¢,_; and &, respec-
tively.t This gives us a piecewise quadratic approximate solution of class @', which
satisfies. the given DE exactly at all points ¢,
We next bound the deviation (§1) | y’(é) — X(, y()| of this approximate solu-
tion. Since the deviation is zero at t,_; and ¢,, while y’(¢) is linear in the interval
[t,-1, t], it follows (by Theorem 1 of Ch. 8) that the deviation there is at most
t(h —1)|X} maxy Where h = t, — 1, and |X| ax signifies the maximum absolute
value of the second time derivative of X(f). On the other hand, we have

ax
_ 9X OX OX OX AX Ox
(32) x~
de
ye
€t
+ 2x
OxOt
+
at Ox
ax ot (Ox

In Example 4, this gives X = —4 + 12xt + 40° — 8xt


As in the proof of Theorem 3, we now set o(f) = |x() — y(d| ? and differen-
tiate, to get in any interval of length h

(32’) (t) S 2Lo(t) + Q(t) Vol

where, in any subinterval [#,-), ¢,] of length h or less |e(7)| S or(h — 1)|X} mays
tT = ¢— t,-,;. A more careful repetition of the proof of Theorem 3 shows that
since f§ t(h — #) dt 7/6 and |x(a) — y(a)| = 0, the cumulative (truncation)
error must satisfy

h?

(33) 7,
Ix) — 91 SEF |Xlmaater® “9 1}

As a corollary, the order of accuracy (§4) of trapezoidal integration is O(h?)

t It is actually the “cubic” Hermite interpolant to the y, and y) (to be discussed in Ch. 8), but this
happens to be quadratic in the present case
222 CHAPTER 7 Approximate Solutions

EXERCISES D

In Exs. 1-4, compute the trapezoidal approximations to the solutions of the initial value
problems specified, over the range 0 = ¢ = 1 withh = 0.1.

1 x
=
=
—Ix, x(0) = 1 2.2= 14+ x7, x(0) = 0

3 x =

); y=, x(0) = 0, 90) = 1


4 x= y, yauxty, x(0) = 0, =)
90 1°
5 For the DE y’ = F(x, y), show that the truncation error of the Cauchy polygon
method, in one step 0 = x SA, is

h

2
|X .50 + 70.10%0.59] +00, 90= 90)
In Exs. 6-9, compute the truncation error for trapezoidal integration over one interval
of length h, in terms of the Taylor series expansions of the functions involved:

6. % = plt)x, p analytic 7.
% =
=
1+ x?
xX= y,
8. 9.
x
=

y=
0 =

ys jaxty
10. For the DE y” = F(x, y), what is the order of accuracy of the formula

Intl = ¥n—-I + 2X n, Vn)

if all Ax, = A?

8 THE IMPROVED EULER METHOD

The trapezoidal method is very convenient for getting approximate solutions


to linear DEs, because one can solve algebraically for y,. Thus in Example 4,
formula (30) is equivalent to the recursive formula y, = [1 + (10 — &-~1)9-1]/
[10 + 4].
But the trapezoidal method is awkward when it comes to nonlinear DEs
because of the difficulty of solving (30) for y,. In general, formula (30) does not
define the vectors y, recursively but only implicitly. To determine each y,, we
have to solve an equationt (30) where y, is the unknown, y,_, having been pre-
viously determined.
For small Ai,, we can do this by iteration: start witha trial value of y, [say, y,-1
+ X(y,-1, te-1) At,], substitute this trial value y; into the right-hand side of (30)
to get a better approximation y;, and then repeat the process

r+]
(33) yz = yr-1 + EX(ya-1, te-1) + Xi, t)] At,/2

until (30) is satisfied up to the error tolerated.

+ The vector equation (30) is, of course, equivalent to a system of n simultaneous equations in the
components, For linear systems, these equations can be solved by Gauss elimination, instead of
iteration.
8 The Improved Euler Method 223

Instead of solving the implicit equation (30) precisely by using many itera-
tions, one usually gets greater accuracy for the same amount of work by using
a finer mesh and stopping after one or two iterations. If one stops after a single
interation, one has the improved Euler method, which is adequate for many non-
linear engineering problems requiring moderate accuracy (say, two significant
decimal digits).

Predictor-corrector Methods. The improved Euler method typifies an


important class of so-called predictor-corrector methods, whose underlying phi-
losophy is as follows.
First, by extrapolation or otherwise, one tries to make a reasonably good first
(0)
Suess Yi as to what y, should be; in the present case, this guess is provided by
the Cauchy polygon construction applied to y,_:

(0)
(34) Yk = ¥i-1 + X(y,-1, ty—-1) At,

This guess is called the predictor.


One then considers the implicit equation to be solved, for example, Eq. (30),
as a corrector:

(r+ 1)
(34’) Yi =y,-1 + EXY,-1, &-) + X(y??, t,)] At,/2
to be solved iteratively if necessary. In most cases, the full order of accuracy of
the implicit equation [O(h?) in the present instance] is achieved after one
iteration!
The improved Euler method consists in computing the sequence of z, by per-
forming these two substitutions in alternation. In the case X(x, i) = F()) of quad-
rature, it is equivalent to the trapezoidal method.
Applied to the initial value problem of Example 4, the improved Euler
method gives the approximate solution tabulated in the last row of the table of
§7; in this example, the work was carried to three decimal places to reduce the
cumulative error to 0.01.
To apply the improved Euler method to first-order systems, simply substitute
vectors for scalars in formulas (34), (34’). We illustrate the procedure by an
example.

Example 5. Consider the initial value problem defined by the nonlinear


system

O = x2 )
(35) + 9%, —=

= 1+x7 — 9%, x(0) = (0) = 0


d dt

There is little hope that formal methods of integration will help in the compu-
tation, but a straightforward application of the improved Euler method enables
one to calculate an approximate solution.
224 CHAPTER 7 Approximate Solutions

The calculations to be performed give, for this example, the double sequence
of numbers x,, y, defined by x9 = yo = 0 and by the formulas

Pe = X-1 + (x31 + ev At,


(36)
m= M1 td + a — 9-1) At,
and

My = Xp + (xR + yy + pe + Gi) At,/2


(36/)
Ye = Yaa + (2 + x1 — yh + pi — Gh) At,/2

Each step requires squaring four numbers and performing 14 additions and
subtractions and four multiplications.
For a subdivision into intervals of constant length h, the relative error com-
mitted in using the improved Euler method is Oh’), provided that the function
X(x, #) is of class @?. For, expanding the exact solution x(#) of dx/dt = X(x, t) by
Taylor’s Theorem with remainder, we have

h2

(37) xX, = x(E,) = Xp-1 + AX, + —~


2
E ax
Ox
+ —
OX
Ot
\.. + O(n)
where X,_, denotes X(x,_1, 4-1) = Xx(t,-1), &~1) and the subscript k — 1] on the
term in square brackets has a similar meaning. The improved Euler method of
(34)-(34’) gives

h?
oY
(38) De = Me + hY,-1 +
—_—

2
Eay
Ox
+ —

Ot
ln + Oh’)
where Y,_, denotes X(y,_, ¢,—1), and so on, The relative error committed in sub-
stituting (38) for (37) is thus O(4°).
A more explicit error bound is deduced in §10.

*9 THE MODIFIED EULER METHOD

The improved Euler method has the advantage over the trapezoidal method
of being explicit. Various other explicit methods about as accurate as the
improved Euler and trapezoidal methods can also be constructed. For instance,
one can use the following adaptation of the midpoint quadrature formula:

(39) WwW,= W,-] + wx + k-1

2
’ ty-1 + A2 ) h= At,
This midpoint or modified Euler method is about twice as accurate as the trape-
zoidal and improved Euler methods in the special case dx/di = F(i) of quadra-
9 The Modified Euler Method 225

ture, as a comparison of formulas (17) and (26) shows. (In this case, the
improved Euler method is the trapezoidal method.)
But for first-order DEs generally, no such simple error comparison holds. To
see this, let X = Lb,tix be expanded into a double power series, and let x(t)
satisfy x = X(x, é). Then, just as in Ch. 4, §2, we have ¥ = X, + XX,, and

d°x
at
= Xy + 2XXq + X°X,, + XX, + XX

We now introduce the abbreviations

C = dio + dodo, B = boy + 2b11b99 + boaboo" B* = dygbo + Ooobor"

Expanding out to infinitesimals of the fourth order, we find that the exact solu-
tion of the DE dx/di X(x, #) for the initial condition x(0) = 0 has the
=
=

expansion

WC n°(B + B*)
(40) x(h) = hbp + — + + O(n’)
2 6

With the trapezoidal approximation (30), we obtain

WC W(2B + B*)
(41) yh) = hbyy + — + + O(n)
2 4

giving a truncation error h3(B/3 + B*/12) + O(h*). With the improved Euler
approximation (34)-(34/), we get

WC WB
(42) z(h) = hbo + — +— + O(n)
2 2

with error h3(2B — B*)/6 + O(h*). With the midpoint approximation (39), we
finally obtain

WC WB
(43) w(h) = hbo + — + va + O(n’)
2

so that the error is 4°(2B* — B)/12 + O(h')

Corrected Trapezoidal Method. Theorem 8, when combined with Theo-


rem 5 of Ch. 6, shows that the exact solution x(t) of the first-order DE dx/dt =
X(x, #) satisfies

[X,-1 + X;] At, [Xp_-1 — Xj] At?


(44) Xp FS Xp] + + O(At,')
2 12
226 CHAPTER7 Approximate Solutions

where X, denotes X(x,, t,) and X denotes 8X/dt + X 8X/dx. Dropping the last
cerm, we get a corrected trapezoidal integration formula, which may be expected
to have a cumulative error of only O(|7|°).
For instance, when applied to the inhomogeneous linear DE ¥ = 1 — 2éx of
Example 4, in which X = —2x — 2t + 4x”, this formula gives the approximate
recursion formula

x, [1 + t, At, + (—1 + 26) AB/6)

At?
= x, 1[] — tj, Ate + (—1 + 2-17) Ate/6] + At, +=e

with absolute error O(Ai,’) and relative error O (At,°).

EXERCISES E

In Exs. 1-4, compute approximate function tables on [0, 1], with At, = 0.1, by the
improved Euler method for the following initial value problems:
1. x = —tx, x(0) = 1 2,%= (1+), x0) =0
3. x = y, 7 = 0, x(0) = 0, 9(0) = 1 4.%= 9,9 = x+y, x(0) = 0, y(0) = 1

In Exs. 5-8, compute approximate function tables for the data of Exs. 1-4, using the
midpoint (or modified Euler) method, instead of the improved Euler method.

9. Obtain an expression through terms in A° for the error committed in applying


(a) the improved Euler method and (b) the midpoint method to the DE x = p()x,
p(t) analytic.
10 For the analytic DE y’ = F(x, y) and one interval 0 <= x = h, let the exact solution
be given by y(t) = ag + ajh + agh? + ash® + O(h* and let

y = yo + eh + coh? + esh® + O(n)

be the approximate value given by trapezoidal formula. Show that cy = do, ¢, = a,


Cg = ay, and cs = 3a;3/2.

11 Show that through the terms computed in Ex. 10, the improved Euler method gives
the same result as trapezoidal integration for co, ¢), cg.

*10 CUMULATIVE ERROR BOUND

All the methods for constructing approximate function tables that have been
described in this chapter have had one feature in common. Namely, the kth
entry in the table has been constructed from the immediately preceding entry
alone, the (k — 1)st entry, without reference to the earlier entries. Such methods
are called one-step (or “two level’’) methods.
Given a one-step method for numerically integrating the DE dx/dt = X(x, 6),
that is, for constructing an approximate function table with entries y, = y(,),
10 Cumulative Error Bound 227

we can express the preceding property by writing

(45) Me = O(Ye-1> be-1s &s X)

Here¢is the function expressing y, in terms of y,_, and the data of the problem.
Bounds for the errors associated with one-step methods can be obtained by
using the following general theorem, which applies equally to the Cauchy poly-
gon method, the trapezoidal method, and the improved Euler and midpoint
methods.

THEOREM 9. In any one-step method for numerical integration of dx/dt


=
=

X(x, t), where X satisfies the Lipschitz condition

| X(x, ) — X(y, O| S Llx — yl

let the relative error at each step be at most «. Then, over an interval of length T, the
cumulative error is at most (e/L)(e“7 — 1).
Proof. For any partition a, let ¢, denote the error introduced at the kth step.
That is, if £,(é) is that exact solution of the given DE satisfying the initial condi-
tion %,(¢,) = », where y, is the value of the computed approximate solution at
ts let

& = ly — Sid)| = Flt) — %-1D|

By the definition of “relative error,” we have «, < ¢ At,. The magnitude of the
cumulative error is, by definition,

Lym — bm)| = |Xnltm) — Xoltm)| = > [Xe(tm)— %e—1n)]

< > [Egltm)— Far(ty)

But | %,(t,,) — %,-1@n)| is the magnitude of the difference, at t = ¢,,, of two solu-
tions of the given DE that differ by ¢, at t = ¢,. By Theorem 2 of Ch. 6, this is
at most

een < gem Ap,

since ¢, << ¢ At,. Here ¢ is an upper bound to the relative error. Summing over
k, we get the following upper bound to the cumulative error:

| art) ~ x(n)| = €> feien)At,] = €elim> fel At]


228 CHAPTER 7 Approximate Solutions

But the final sum is the Riemann lower sum approximation to the definite inte-
gral fim exp (—Lt) dt = [exp (—Lto) — exp (—Lé,)]/L. Hence

eee? — 1]
[Yr(tm) — %(tm)|
L

and Theorem 9 follows.

Trapezoidal Integration. For trapezoidal integration, the discussion of §7


shows that the truncation error ¢, at the kth step is bounded by

(11?
1X max) Ate
12[1 — (L At,/2)]

Therefore, in this case, an error bound is given by the following corollary.

COROLLARY. The error in the approximatefunction table constructed by the trap-


exoidal formula (30) is at most

XImale“ — 1)
[171 f jx] =-
(46)
12L[1 — L|x|/2] L

Similar error bounds can be found for the midpoint and improved Euler
approximate integration methods.

Approximate Solutions. The error bound (46) refers to the approximate


function table constructed by the trapezoidal integration formula (30). Using
linear interpolation between successive values, we can obtain from this function
table a continuous approximate solution to the DE dx/di = X(x, i).Since the
error in linear interpolation is bounded by |%|max|7|?/2, where # —
=
x

=

OX/dt + X 0X/x, we see that the order of accuracy of this approximate solution
is also O(|7|?).

EXERCISES F

1. Let x(@) and y(#) be approximate solutions of the system (1) with deviations ¢, and
€, defined for a <= t = b. Show that (7) implies that

€ + &€
|x — xO S [x@ — p@le
4 + et“! — 1)
L

2. Let F(x, y) and G(x, y) be everywhere continuous; let F satisfy a Lipschitz condition
with Lipschitz constant L, and let | F(x, y) — G(x, y)[ <= K. Show that if f(x) and
g(x) are approximate solutions of the DEs y’ = F(x, y) and z’ = G(x, z) with devia-
tions ¢ and 7, then

K+e+n
If) — g&)) = FO — g@ie" + (er4| — 1)]
10 Cumulative Error Bound 229

*3 Assume that, for equally spaced subdivisions of mesh length h, the truncation error
of a given approximate method J,[f] is Mh" + O(h"*'), where M is independent of
h. Prove that the extrapolated estimate

Lf] = (2"hlf] — hLAD/2" — 1)

has a truncation error O(h"*}).


*4 (a) Show that the extrapolation of Ex. 3 gives the trapezoidal approximation from
the Cauchy polygon approximation, and Simpson’s rule from the trapezoidal
approximation.
(b) Show that Simpson’s rule satisfies the hypotheses of Ex. 3 with n =
=
4. Derive
an extrapolation estimate for Simpson’s rule.

*5 For the DE dx/dt + tx = 0 and the mesh length A, = 1/(10]é,| + 1), show that
the truncation error of the trapezoidal method tends to zero as t — 00, regardless
of the initial value x(0). What is the limiting truncation error as t > — 00?

*6 Let f() be an analytic function, and f(¢ + 1) = f(@. Show that trapezoidal integra-
tion of fj f@ dt for h = 1/n has an infinite order of accuracy as n — 00. [HINT:
Expand /() in Fourier series.]

Show that the error is O(h’) in the extended Simpson’s rule

xh
h

J ~
; F(x) dx = 200 {114F(x) + 84[F@e +h) + Fe — A]

— [Fix + 2h) + F(x — 2h)]}

(a) Letz, = z + #1, i =


=
V—1. Show that, if F(z) is any complex polynomial
of degree five or less, then we have

z)

(*)
20
P(e) de = = [24K + AUF, + F) — (Fa + FO
(b) Infer that, if F(z) is a complex analytic function, (*) holds with an error that is
O(n’).
*Q Let F(x, y) be bounded and continuous on the strip 0 = x <1, —0 < y < +00;
let {x,} be any sequence of partitions of [0, 1] with |x,[ — 0; and let f,(x) be the
Cauchy polygon approximate solution defined by z,, for the initial value (0) = 0.
(a) Show that, if the f,(x) converge to a limit function f(x), then f(x) is a solution of
y = Fu, y).
(b) Show that, in any case, a uniformly convergent subsequence { f,)(x)} can be
found, n(i) < n(i + 1). [HinT: See Ch. 6, §13.]

*10 In Ex. 9, show that, if the DE y’ = F(x, y) admits only one solution for the initial
value (0) = 0, any sequence of Cauchy polygon approximations defined for (0)
= 0 by partitions whose norms tend to zero must converge to the exact solution.
CHAPTER 8

EFFICIENT NUMERICAL
INTEGRATION

1 DIFFERENCE OPERATORS

In Ch. 7, we analyzed theoretically a number of simple methods for comput-


ing approximate solutions for normal first-order systems

(1) a=
d
X(x, 0)

of ordinary DEs. The simplicity of the methods considered, all due to Euler,
facilitated a rigorous theoretical analysis of their errors. In general, they had
O(h?) accuracy for X € @?.
In this chapter, we will describe some more efficient methods having higher
order accuracy, usually O(h*) for X € @*. We will explain the guiding ideas that
motivated the construction of the algorithms used but will not push the analysis
to the point of getting rigorous error bounds. This is partly to avoid lengthy
discussions of complicated formulas, but mostly because errors are usually esti-
mated in practice by studying the numerical output.
Such higher order methods are almost always used in practice when more
than two or three significant digits are wanted. If their errors are accurately and
reliably known, the errors should be subtracted to obtain improved results, as
in Richardson extrapolation (Ch. 7, §5).
Like the schemes already analyzed in Ch. 7, many of the schemes for numer-
ical integration to be studied later will refer to an assumed partition 7 of the
interval [a, 6] of integration by a finite number of points (the mesh),

‘<t,=b=a+T
(2) Ta=itgp< ti <ig<:

Typically, the partition is made into steps At, = t, — t—, of constant length h,
so that ¢, = a + rh; we then speak of a uniform mesh.
On the mesh (2), the DE (1) is approximated by a suitable difference equation.
This difference equation is then solved step by step in hand computations, using
ordinary arithmetic supplemented by readings from available function tables. In
machine computations, however, function tables are usually replaced by simple
subroutines that give accurate approximations by rational functions.
230
1 Difference Operators 231

Using Taylor’s formula with remainder, it is easy to derive higher order


approximations to derivatives by difference quotients (see §2). Thus for f € @®
we have

fats
=
=

Co + cys + cgs? + 58° + c4s* + O(s5)

For t,4, = @ + khand x, S(), this gives

(3) [ Xnt+2 + 8xn41 8x, 1 + Xn gl/12h Sty) + O(n’)


This suggests trying to solve x’(f) = X(x, t) by simply substituting the difference
quotient of (3) for the left side of the DE
Unfortunately, as will be shown in §4, this procedure is highly unstable. Sta-
bility is only one of several ideas and techniques, few of which were known in
Euler’s time, that must be learned before one can understand, even superfi-
cially, the efficient schemes of numerical integration that are most commonly
used today. The object of §§2—7 to follow will be to explain some of these ideas
and techniques; the remainder of the chapter will be devoted to deriving some
truly efficient schemes of numerical integration
Much of our preliminary discussion will be concerned with difference oper-
ators and difference equations (or AEs, as we will write for short). Basic to these
are the forward difference operator A, the backward difference operator V, and
the central difference operator 5, defined by the formulas

(4a) Af(x) f(x +h f()


(4b) Vf (x) f() f(x — h)
(4c) of(x) = fe + 3h)
— f(x — 5h)

In the preceding formulas, the symbols A, V, 6 stand for linear operators that
transform functions into functions. Unlike the linear differential operators of
Ch. 2, §5, they apply to all functions
These operations are useful in obtaining approximate solutions because they
yield approximations to the derivative f’(x). If f € @', the derivative J’(x) is the
limit of the difference quotients

VS =lim of
i’) =lim +" = lim
0 mo h ano h

For obvious reasons, these are called the forward, backward, and central divided
difference approximations to f’(x)
The difference operators (4a)—(4c) can be applied to anyfunction table defined
on a uniform mesh with step h, consisting of the equally spaced points

(5) X, = Xo + th r=0 +2
=
> >
h>0
232 CHAPTER 8 Efficient Numerical Integration

Using the standard abbreviations y, = f, = f(a + rh), we obtain the identity

fi —fo = fla + ) — fla) = df = Vf = Sip

This shows that the usual difference notation is highly redundant.


This redundancy is also apparent when we iterate the difference operators
(4a)—(4c) to define the second differences

(6a) A*f(x) = A(A( fx) = ACf(x + h) — f(x) = fle + 2h) ~ 2f(x + h) + f(

(6b) VF
(x) = V(VF(x)) = VOf(x) — fx — h)) = flx) — fle — h) + flx — 2
(6c) Sf(x) = fx + h) — 2f(x) + fle — hy
We easily verify the identities 5°f(x) = A(Vf(x)) = V(Af(x)).
In formulas (6a)—(6c), the exponent 2 describes the effect of applying the
operators of formulas (4a)—(4c) twice, or ‘‘squaring” them. More generally, we
can form polynomials (with constant coefficients) of difference operators, like
A® — 3A + 2. Such linear difference operators with constant coefficients com-
mute (are permutable); those with variable coefficients do not commute (cf. Ex.
A4 below).

2. POLYNOMIAL INTERPOLATION

The difference notation of §1 permits us to write down simple formulas for


the polynomials of least degree interpolated through given values on any uni-
form mesh, These interpolation formulas make it possible to construct accurate
approximating functions from accurate function tables.
Simplest is the linear interpolation formula (for fixed h and variable k)

(7) pls + 8) = 99 + = (1 — 90) O<k<h

Next simplest is the formula for quadratic or parabolic interpolation. Using


the second central difference notation of formula (6c),

8°y, = yin — 29; + 9-1

we obtain the quadratic interpolation formula

(8) q(x, + &) wot oh(x —¥o+ * 3%]

+ It is also inconsistent with the usual notation At, = ¢, — ¢—, employed in writing Riemann sums,
used in Ch, 7,
2 Polynomial Interpolation 233

1f second differences are tabulated, as they are in many tables, to use this for-
mula requires only two multiplications and three additions.+
In a similar way, we can derive the quartic (fourth order) interpolation formula

(9) f(xg
+k)=ye x( yn+ Hy
+ A(R? *) | (8°y312n3
— d°y1)
|
Here 5+y = 67(8°y9) = yg — 4y3 + Bye
— 49, + 90
The preceding formulas are based on central differences. For polynomial
interpolation between n + 1 successive values on a uniform mesh, one often
uses the Gregory—-Newiton interpolation formula.

o)
(xo)
(10) p(x)
= f(xo) + > A*f(%
ie
hX(k!)
| Th - =~3]
where n = (x, Xo)/h, and where A* is the iterated forward difference operator
This formula gives an approximation to f(x) in terms of the differences of n
equally spaced values of f/ This formula is a difference analog of Taylor’s for-
mula, without a remainder term

Lagrange Interpolation Formula. The Gregory—Newton formula, in turn


can be regarded as a special case of a very general interpolation formula due to
Lagrange. Given the numbers x) < x, < < x, and 9, 91 Yn, ONe can

showf that there exists a unique polynomial p(x) of degree n or less which sat-
isfies p(x,) = y, for k = 0, 1 , n, that is, which assumes the n + 1 given
values at the points specified. Let

Q(x) xX — Xo)(x — x) (x — x,)

_
Q) =[[«-
plx)
=
(x — X) S#k

Then the polynomial

Pxlx)
Dy
(11) p(x)
= >4ie
pals) >" = Qe)>:Faw
Wene2
+ We do not count the division required to calculate k/h, since this requires only a decimal-point
shift in most tabulations.

t Birkhoff and MacLane, p. 60


234 CHAPTER 8 Efficient Numerical Integration

takes exactly the values p(x,) = y,, 0 = k S n. Indeed, p,(x,) = 0 for j # k, so


that substituting x = x, into (11), we have

Dy, _ Pi)
pilX)
=a
Pils
x)
k=O ~ pyle) 7h ~ J

Formula (11) is the Lagrange interpolation formula. Since the polynomial p(x)
is (11) is unique, (11) is equivalent to (10) ifx, = a + jh,j = 0,1 n. Hence
as interpolants to a function tabulated at equal intervals, (10) and (11) have the
same error

Hermite Interpolation. The limiting case of Lagrange interpolation of


order 2m, obtained by letting xo, : X-1 approach @ and xX,, » Xom—1
approach 8, is called osculatory interpolation of order m on [a, 6). The limit exists
if fe @”“'[a, 6), and is that polynomial p(x) of degree 2m —1 that satisfies p”(a)
f@ and p(b) p©®) for j m — 1. Hermite interpolation of
order m means, for any partition x of [a, 5], just osculatory interpolation of
order m on each subinterval
The case m = 2 of cubic Hermite interpolation is especially useful. On [0, 1]
[writing f(0) = yo, (0) = 96, f(D) = 9. f’C) = yi], this gives

p(x) = Jo + ox + [39 _ 30 _- 296 — mlx + [1 + Yo — 291 + 2yo] x

The preceding formula is easily applied to the solutions of first-order DEs y


F(x, y) (and of first-order systems), because the y, F(x,, y,) are then known
and usually already computed

Spline Interpolation. In problems whose formulation uses empirical data,


or whose solution will only be found on a coarse mesh, Lagrange and Hermite
interpolation can often be advantageously replaced by cubic spline interpola-
tion. This interpolates a piecewise cubic polynomial with continuous first and sec-
ond derivatives through any set of mesh points (x,, y,); see Exs. A9Q-Al1
The concepts defined above are fundamental to the understanding of two-
step and multistep methods for solving ordinary DEs and systems (see §9)
Moreover, the study of their properties is very attractive from a theoretical
standpoint. However, the one-step Runge-Kutta methods to be explained in
§8 are based on power series considerations, which are very different. There-
fore, some readers may wish to postpone the study of these properties until they
have become familiar with Runge-Kutta methods

EXERCISES A

1. (a) Show that, if the function f(x) = a,x" is a polynomial of degree m, then A" =
vp = "fF = h"f™, where f denotes the mth derivative of f.
(b) Show that, if y € @’, then 6 = O(h’) and A = O(h’)

2. Define the divided differences [up, u;] and {tp, w, Ug] as (2, — tUt9)/(x; — Xo) and ([u,
Us] — {uo, u1])/(%2 — Xo), respectively. Show that lim,,,,, xote [to U1, Ug] = u(x)
3 Interpolation Errors 235

Show that

rl
ay a Br — Rune

Show that, if f(x) = x?, then x(Af) = Qhx? + hx, yet A(xf) = 3x7h + Sxh? + HP.
Solve the AE u,,; = 2u, — U,-1, for the initial conditions wy = 1, u, = —1.

The nth Fibonacci number F, is the value at n of the solution of the AE

Fv =F, + Fy-1

for the initial conditions Fy = 0, F, = 1.

(a) Show that F, = (o" — 0”)V/5 , where


p = (V5 + 1)/2,0 = (1 — V5)/2.
(b) What is the solution G, of the Fibonacci AE for the initial conditions Gy) = 2,
1?
G
=
=

Estimate the largest 4 such that parabolic interpolation in a six-place table of log, x
on 2 =x = 3, with mesh length 4, will yield five-place accuracy.

Show that, with parabolic interpolation between y(—A), (0), and y(h), the maximum
error is normally near x = +h/ V3 and is about h? if’) | /9V3 there.
Show that the cubic Hermite interpolants to given values of f(x) and f’(x) at the end-
points of the intervals (a@—h,a) and (a,a+h) define a “cubic spline” function f ¢ @?
[a—h,a+h] if and only if

(*) (*) ALf(a—h) + 4f"(a) + fla+h)] = 38°f(a)

*10 Let h = (6 — a)/nand x, = at+ih. Prove that, given y,[i = 0 . ,n] and 94,¥;,, there
is one and only one cubic spline function s(x) ¢ @7(a,6) which satisfies: (i) the inter-
polation conditions s(x,)} = y, for i = 0, ... ,n; (ii) s(a@) = 6, s(6) = yy; and (iii) is a
cubic polynomial in [x,_),x,] for i = 1,...,n
(HinT: Use Ex. 9.]

*11 (a) Show that if s(x) is the (piecewise) spline interpolant to given y, and 9, 4, specified
in Ex. 10, and fix) = s(x) + v(x) eC 2(a,b) is any other interpolant with the same
properties, then

J‘Lae = J“[s"()Pdx + f [oePax


(b) Infer that the cubic spline interpolant minimizes the mean square value of the
second derivative on (a,2), among all interpolants f «@7(a,0).
bs

(Hint: Show that f s” (x)u”(x)dx = 0 for all continuous piecewise linear func-
a

tions v”(x).)

#3 INTERPOLATION ERRORS

Among the many interesting properties of interpolation schemes, their errors


are clearly most basic. We therefore take them up first. The order of magnitude
of such errors can often be determined algebraically, by simply finding the poly-
nomial of least degree for which the formula ceases to be exact.
236 CHAPTER 8 Efficient Numerical Integration

In this section, we shall do much better, by giving explicit expressions for the
magnitude of the error, by formulas involving an appropriate derivative of the
function. More precisely, let a function f(x) be tabulated at n + 1 points x) <
x< * + <x,, and let p(x) be the unique interpolation polynomial, of degree
n or less, which satisfies p(x,) = f(x,),k = 0,1... , n. How big is the error of
p(x), considered as an approximation to f(x)? An answer to this question, when
J (x) is sufficiently smooth, is provided by the following result.

THEOREM I. Let p(x) = ay + ax +++ + + a,x" be the polynomial satisfying


plxy) = f(xy), for x9 <x, < ++ + <x IffE Oxo, x], thenfor every x in [x,
x,] = I there exists & in I, such that

(x — Xo) °° (x — ,) fOMw}
(12) S(*) — px) =
(n+ VD)!

Proof. Let e(x) = f(x) — p(x) denote the error function. Since p(x) is a poly-
nomial of degree n, p"+ (x) = 0. Therefore

eft Dx) = fOTP(x)

for all x, and e(xo) = e(x;) = - + - = e(x,) = 0. Consider now the function

(13) Ht) = Qlx)e) — QMe(x)

where Qis the polynomial Q(x) = (x — xo)(x — x) ° + + (x — x,). We consider


@ as a function of ¢ on J, for x fixed. Clearly ¢ € @"*!, moreover, o(x) = 0,k =
0, 1,. , n, and in addition ¢(x) = 0. By Rolle’s Theorem, between any two
points where ¢ vanishes there is at least one point where ¢’ vanishes. Since the
function ¢ vanishes at n + 2 points (if x # x, for all k), the function ¢’ vanishes
for at least n + 1 points. Repeating the same argument for higher derivatives,
we eventually conclude that ¢”+ (2) vanishes for at least one point é in the inter-
val I. Differentiating (13) relative to i, n + 1 times, we obtain

0 = OME = Qxje"ME) — (n + Mle(x)

Since e®t(E) = f @*Y() and e(x) = f(x) — p(x), this gives

Q(x)
f(x) — pe) = ex) =
_—_

(n + 1)!
f° * %, q.e.d.

COROLLARY. [fp is the Lagrange interpolation polynomial of a function f(x) of


class @"*' in the interval [xo, x,] and xy < xj) < +++ <x, the error at any point
x € [xo, x,,] ts at most

< N,
If) — plx)| nm
~ (n+ 1)!

whereM, = MAX,<tsxn Ferre andN,, = MAX,o<xSxn | Q(x)|.


4 Stability 237

When the mesh points are equally spaced, we can compute N,, explicitly.
Thus, we have N, = h?/4; if x) < x < x9, N3 = 9h*/16, and so on. Moreover,
similar arguments show that for k <= n, the error in the kth derivative of the
Lagrangian interpolant is O(h""**?). It follows (though we shall not prove it) that
one can develop multipoint AEs that approximate DEs to an arbitrarily high
order of accuracy.

Applications. For example, if x = x, + k,0 <k <h, the magnitude of the


error in linear interpolation is |k(h — A)f’(6)|/2 =< h?|Ff’) |/8, for some
& € [x,, x, + h]. Likewise, parabolic interpolation through f(x, — h), f(x,), and

f(x, + h) gives an approximate value differing from f(x, + k) by |k(h? — k?)f”


(£)|/6. For |k| <S 4/2, the error is therefore bounded by h®|f”| max/16. Since
we would naturally choose j to minimize |x — x,|, this bounds the error in par-
abolic interpolation. The maximum error in the interval (x; — h, x, + h) is
slightly larger; see Ex. 8.
Ordinarily, parabolic interpolation is sufficiently accurate. For example, with
sin x, |f’” |max = 1; hence the error is bounded by A°/16 in radian units. There-
fore, parabolic interpolation give four-place accuracy in a table at 6° intervals!
More generally, unless |f” | max => 10, six-place tables can be extended by par-
abolic interpolation to all x if h = 0.01, without an appreciable loss of accuracy.
The same is true of nine-place tables if h = 0.001 (and of three-place tables if
h = 0.1).
For these reasons, higher order interpolation is unnecessary for most tables
in common use.

Caution. The approximations to a given function f(x) on a fixed interval,


defined by polynomial interpolation over that interval, are not necessarily good
approximations to f(x), even if f(x) is analytic. Thus, the approximations to the
analytic function f(x) = 1/(1 + x°), obtained by the Gregory-Newton interpo-
lation formula (10), do not converget to f(x) on the interval —5 = x = 5, but
oscillate more and more wildly as the step length / tends to zero.
This shows that Newtonian interpolation cannot be used to define the
approximating polynomials referred to in the Weierstrass Approximation Theo-
rem. To get the best such uniform polynomial approximations, one must use a
very different method due to Chebyshev (see Ch. 11, §7) and Remez.

4 STABILITY

The accurate numerical solution of initial value problems for ordinary DEs
(and systems) involves much more than interpolation error bounds and esti-
mates. For one thing, the relative error involves stability considerations. These
are most easily explained in the special case of linear DEs with constant coeffi-
cients, previously discussed in Ch. 3.

+ See J. F. Steffensen, Interpolation, Williams & Wilkins, Baltimore, 1927, pp. 35-38 and the refer-
ences given there.
238 CHAPTER 8 Efficient Numerical Integration

A linear AE with constant coefficients is one of the form

(14) Ynim = Fn + BYnt1 terest Qm—WWn+m-1

where a, are given constants. The solutions of such a AE can be obtained by a


substitution similar to the exponential substitution of Ch. 3, §1. Try the
sequence y, = p’, where p is a number to be determined. This gives from (14)
the characteristic equation

(15) p" — a — 4p — ++ * — ayp™! =0

For each root p, of this characteristic equation, the sequence y, = p,’ is the
solution of the linear AE (14), which satisfies the initial conditions yp 1, M
= =
= =

m—1
Ps +> » Ym-1 = Pr

Midpoint Method. The midpoint method for solving y = F(x, y) consists


in first computing y, (perhaps by Taylor series) and then using the formula

(16) nti = Yn-1 + 2hF (xy, Yn)

For example, consider again the DE y’ = y for the initial condition y(0) = 1,
whose exact solution is y = e*. In this case (16) reduces to

(16) n+l = In—1 + 2hy,

For h = 0.1 and yp = y(0) = 1, the exponential series truncated after five terms
gives y, = 1.1052, rounded off to four decimal places. Substituting into (16’)
we can compute the approximate function table for y, = exp (7/10):

x 0.1 0.2 0.3 0.4 0.5 0.6 0.7

J 1.1052 1.2210 1.3494 1.4909 1.6476 1.8204 2.0117

and so on. After 10 steps, this gives the approximate value e 2.714, whose
=
=

error is about —0.0043.


In (16), the characteristic equation is p? = 1 + 2hp, with distinct roots
2 pt
p=htvVlith= t1+ht—~F>H4+:-:-
8

For A = 0.1, this gives p, = 1.10499, pg = —0.90499 when rounded off to five
decimal places. The first root differs from the growth factor e°' = 1.105171 of
the exact solution by about 0.00018. This is about 0.27% of 0.105, which
explains why the O(4”) approximation (16) gives only two-digit accuracy for
h= 0.1.
4 Stability 239

Stability. By analogy with Ch. 3, §4, we will say that the homogeneous linear
nth order AE with constant coefficients (14) is stable when all solutions y, are
bounded sequences and are strictly stable when all solutions are sequences tending
to zero, as r — ©,
Since we can obtain a basis of solutions of (14) of the form r’p,’, the AE (14)
is strictly stable if and only if all roots of the characteristic equation (15) are less
than one is absolute value. This condition is obviously necessary; it is sufficient
because lim,_.., 7p" = 0 whenever |p| < 1.
The concept of stability brings out a significant aspect of the effectiveness of
the central difference approximation (16) for integrating numerically the DE
y’ = y. The general solution of (16) is Ap,” + Bps’, where p, = h + (1 + hh?!”
as above, and A, B are arbitrary constants. The positive root p;, which is approx-
imately equal to e°| is dominant in the sense that |p,| > |p9|. Therefore, the
term Bp; can be neglected in comparison with Ap} for large 7, provided that
A # 0.

Example 1. For the DE y’ = —y, the situation is reversed: the (central) dif-
ference approximation y,4; — 9,—1 = —2hy, to the stable DE y’ = —y is unstable.
Thus for h = 0.1, we have p,; = 0.904988, p. = —1.10499. Although p,
approximates the exact growth factor p = 0.90584 reasonably well, it is domi-
nated by the “extraneous root” py which is introduced in approximating a first-
order DE by a second-order AE. As a result, the “approximate solution”’ will
x
ultimately grow like e*, whereas the true solution decays like e~

Example 2. Consider the difference approximation (16) to the initial value


problem defined by the DE y’ = 2x and initial y(0) = 0. For the true solution y
= x°, (16) is exact, but it still gives very bad results because of roundoff errors.
This will be explained in §5; here we simply consider the characteristic poly-
nomial of the AE specified:

(17) M2 = Bye41 — Byp-1 + H-—9 — L2hy, Ih = 2%,

The characteristic polynomial of this AE is

(18) p* — 8p? + 8p — 1 = (p? — 1)(e? — 8p +_:1D)

whose roots are £1 and 4 + V15. One of these is near 8, so that errors tend
to grow by a factor 8 per time step. The AE is thus very unstable!

EXERCISES B
?

In Exs. 1-4, verify the formulas indicated for f € @*, a uniform mesh with step A,
0<6<1,and6 = (1 — 4).

1. f(% + 8h) = 99 + 6 Ayo — (00/2) A*yg + OCF?) (Newton)

2. flo + Oh) = yo + 889;9 — (68/4)(5%y + 5%y,) + OCF’) (Bessel)


240 CHAPTER 8 Efficient Numerical Integration

3. flxo + Oh) (Oy + 4) + (@ — Hs +@- 6)5*y,1/6 + O(n’) (Everett)

4. f(x + 6h) =
=

3(Yo + 9) +
6 — Yip — 6/4)6 + 8)
[00(20 — 1)/12] 6,2 + O(h*) (Bessel)

5 Find the truncation errors of the formulas of Exs. 1-3, for quartic polynomials
q(x) = a + bx + cx* + dx” + ex

(a) Find the cubic polynomial ¢(x) that satisfies c(0) = yo, (0) = yo, c(h) = 1,
"(h) = 9%
(b) Derive your formula asa limiting case of the four-point Lagrange interpolation
formula

Test the following AEs for stability or instability, by calculating the roots of their
characteristic equations
(a) Un+l Qu, Unt (b) Unt) =
Uy + Uy

(c) Unti 5u, + 6u,4, = 0 (d) Unsi =

Up-]

*8 Show in detail that |@ + ab] + 6° < 1 is a necessary and sufficient condition for the
strict stability of the AE 9,49 = @yn+; + by,

Derive necessary and sufficient conditions for the strict stability of a general linear
third-order AE with constant coefficients

*5 NUMERICAL DIFFERENTIATION; ROUNDOFF

Using Taylor series with remainder, one can in principle obtain approxima-
tions to the derivatives of tabulated functions having arbitrarily high orders of
accuracy, from suitably designed difference quotient and divided difference
formulas
For example, whereas the usual forward difference quotient formula Af/h has
only O(h) accuracy, and even the central difference quotient formula 6f/h has
only O(h?) accuracy, formula (3) of §1 has O(h*) accuracy. Likewise, we have the
truncation error estimates

Lew fe (x)h?
(19a)
a + Lor
— f'®) =
6 24
(at r
@r*
fO
(19b) —f'®)=
+"T920
where £ is in the interval over which the difference is being taken. This illustrates
the general principle that central difference quotients give more accurate
approximations to derivatives than forward or backward difference quotients of
the same order. We can obtain truncation error bounds similarly:

Lf + 4) fe —/)) Lf” Imax


(19¢) |
2h
f')|
5 Numerical Differentiation; Roundoff 241

Note that the interval in (19c) is twice as long as in (19b); hence, the truncation
error is multiplied by about four.
Similar approximations can be made to f”(x), using second difference quo-
tients. For f € @?, we have

Sf(x) = fle +h) — 2f(x) + fle — h)

= [f(x) + hf'(x) + “f Ey) — 2G) +f) — hfe) + FE)


= “yy"E) +f"E2)I
for some numbers &, and & in the intervals [x, x + h] and [x — h, x], respec-
tively. Hence 6°f/h? lies between the minimum and maximum values of f”(¢) for
€ in the interval x — h = & = x + A. Since a continuous function assumes all
values between its minimum and maximum, and since f”(x) is continuous in the
interval, we conclude that d°f = h?f”(&) for some € in [x — h, x + h]. This shows
that the difference quotient 5°f/h* is a good approximation to the derivative f”
for small h. Since 5*f/h? can be computed by consulting a numerical table of
J (x), the preceding formula may again be regarded as one for approximate numer-
ical differentiation. This is written in the form f”(x) = 8°f(x)/h®, where the symbol
= means “is approximately equal to,” as in Ch. 7, §2.
For f € @*, the preceding analysis can be refined to give an estimate of the
truncation error. Taylor’s formula with remainder gives

fe +h) = fix) + hf) +


aye) + Wf"
@) HPO
6 24

where x —h SE x + A. Since f(€) assumes all values between its minimum


and maximum values on this interval, we can write

(20) SF
— WF" (x) = A*f?(®)/12, x—-hs&ESxth

This formula gives the truncation error estimate

he?
Sf’) —_—
ef
~ T9 ”(6)
—_=

he

in the formula f’(x) =~ 5%f(x)/h? for numerical differentiation. This formula


shows that the truncation error is of the order of h?, and tends to zero fairly
rapidly when the step h is taken smaller and smaller.

Higher-order Derivatives. The preceding truncation error estimates and


bounds are special cases of a general result, namely:
242 CHAPTER 8 Efficient Numerical Integration

THEOREM 2. if f€@*® on I = [x — nh/2, x + nh/2), then

o"f nh?
oa SO ®),
(21) fO@) gel
oo e_

h”

Proof. The cases n = 1, 2 have been treated above. Proceeding by induction,


we get

n/2 n/a
of(x) = ph @ F 9 de Sf(x) = f “e dt pl ttt w du,

h/2 h/2

a"f(x) = f . dt, f =-

A/2
anf pl &t th +++: +4,)dt,

This is a multiple integral over an n-dimensional domain D with center t = 0


and volume h”, symmetric under the reflection 4; ~ —i, (j = 1,..., ). The
arithmetic mean of the integrands at symmetrically placed points x — T and
x + Tis, by Taylor’s formula,

SUF + 7) +f — TH] =fC) + (FE)se0% + 07)


for some 6, —1 < @ <= 1. Hence, setting T = ¢, + - + + ¢, and integrating
over D, we have

mf rae +++ dt, SOF— gy <% fr ay, +++ dt,

where m and M are the least and greatest values of f*®(g) for & in the given
interval. Since

4/2 3

and f" tit, dtp =0, k#E


J —

h/2
i,” dt, = —
12

and since f“+®(g), being continuous, assumes all values between its extreme
values m and M, formula (21) follows.

COROLLARY. Under the hypothesis of Theorem 2, we have the truncation error


bound

nMh?
(22) [f%e) ~ Of/h"| = 24

where M is the maximum of |f*(&)| as & ranges over the given interval.
5 Numerical Differentiation; Roundoff 243

Roundoff Errors. In the preceding discussion, as in the error estimates and


error bounds derived in Ch. 7, it has been tacitly assumed that all arithmetic
operations and readings from function tables were exact, to an unlimited num-
ber of decimal places. In reality, however, only a finite number of significant
digits are used. This leads to a source of error called the roundoff error, which
has been ignored previously in this book. The errors discussed previously are
referred to technically as truncation errors, or discretization errors.
In polynomial interpolation of low order, the roundoff error is not a serious
problem. For |k| < h, formulas (12)—(14) show that the roundoff error affects
only the last decimal place tabulated. For instance, writing k/h = r, parabolic
interpolation gives f(x; + rh) = woy) + wiyp + wey, where wo = (r* — 1)/2,
w= l—r*,w. = (r+ r’)/2, and |r| <4, if the three nearest tabulated values
are used. Since

(23)

and tabulated values are correct to $ in the last decimal place, the roundoff error
is at most one in the last decimal place.
In numerical quadrature formulas, which also have the form Dw,y, per step
with Dw, equal to the mesh length, the maximum roundoff error is similarly
bounded by the length of the interval multiplied by the maximum tabulation
error (ordinarily 3 in the last decimal place).
However, the effect of roundoff can be dramatic in other cases, as Example
2 of §4 demonstrates. The truncation error is zero in this example; hence if
=1l
h =

8 or some other binary fraction, the computer printout will also be exact.
But if h = 0.1 (which is not binary), the small initial roundoff error is amplified
by a factor 8 at each step, and dwarfs the true solution after 10 or 20 steps.
Empirically, roundoff errors are nearly independent, and randomly distrib-
uted in the first untabulated decimal place with a mean nearly zero. Hence,t the
cumulative roundoff error has a roughly normal distribution on a Gaussian
curve, and the probable cumulative roundoff error with n equal subdivisions is
only O(1/ Vn) times the maximum cumulative roundoff error.
Similar results hold for the numerical integration formulas to be considered.
The roundoff errors may be thought of as “noise,” superimposed on the system-
atic truncation error. Both are amplified in the course of the calculation by a
L(b—a)
factor of at most ¢ , where L = sup OF/dx is the one-sided Lipschitz con-
stant, and (6 — a) is the interval of integration.{ The reason for this is that, for
Las defined above,

(24) [y(x) — z(x)]’ = F(x, 9) — F(x, z) S L(y — 2) if you

see also Ch. 1, Theorem 5.

+ By the central limit theorem of probability theory.

t For a careful analysis of the cumulative roundoff error, see Henrici.


244 CHAPTER 8 Efficient Numerical Integration

It is interesting to compare the truncation error bound (19c) with the cor-
responding roundoff error bound, which is 10~"/2hA if an m-place table is used.
When h < (3/10"|f” | max)'”8, therefore, the roundoff error exceeds the trun-
cation error. To minimize the sum (|f” | max#?/6) + (107"/2h), which is the max-
imum total error if both terms have the same sign, set 2h° = 3/10" |f” | max: This
shows that the maximum total error with parabolic interpolation into an m-place
table cannot be reduced below

1 VIB" lmn/10™
and that one loses accuracy if |ff:,| < 2, by choosing h smaller than 0.04, ifa
four-place table is used to approximate f’(x) by a central difference quotient.
For example, for the function f(x) = sin x, where f(x) = sin x, since f(x)
ranges between —1 and 1, the maximum truncation error is approximately h?/
12, and the maximum roundoff error is 2 X 1075/h®, using five-place tables. To
minimize the greater of the truncation error (which tends to zero with h) and
the maximum roundoff error, we must make h?/24 = 1075/h. Hence, we min-
imize the maximum total error of f” =~ 5*f/h? near h* = 2.4 X 1074 or h &
0.13 radian = 8°, roughly, a surprisingly large interval!
Roundoff errors are not considered further in this chapter. This is partly
because, with high-speed computing machines, truncation errors are usually big-
ger unless h is very small (most modern machines carry at least ten decimal dig-
its), and partly because the analysis of roundoff errors involves difficult statistical
considerations.

EXERCISES C

1 Show that the effect of roundoff errors on tenth differences is bounded by about 500
* 107” in n-place tables.
Show that hf’(x9 + 4/2) = 8y12 — (1/24)8%y,2 + O(F').
Show that hf’(x9 + h/2) = dy12 — 8°y12/24 + 5°912/1920 + OH’).
Given a six-place table of sin x (x in radians), show that the approximate formula
{3 & &f)/h has a combined truncation and roundoff error bounded by 2/10°h? +
#?/12, and that this expression has a minimum of about 0.0008, assumed for h
about 0.07.
(a) Show that h2f” = Of — d4f/12 + O(n’).
*(b) Show that hf” =

ef — Hf/12 + 57/90 + O(H').


Show that for
y € @8, we have 8yy = A?[y% + 5°y$/12 — 84y%/240] + O(h°).
Show that, for small h, the AE (10) provides a strictly stable approximation.

*§ HIGHER ORDER QUADRATURE

As we have emphasized repeatedly, it is especially easy to derive accurate


numerical formulas for solving DEs of the form wu’ = u(x)—that is, for numer-
6 Higher Order Quadrature 245

ical quadrature. In this section, we will derive a rigorous error bound for Simp-
son’s Rule (Ch. 1, §8), by a method which is applicable to a wide variety of
numerical quadrature formulas. In the next section, we will discuss two other
remarkable quadrature formulas, which seem to have no analogs for other DEs.
Let 0 = % <7, < 7, S 1be given, and let x; = a@ + 7;h, h > 0, so that

(25) aSx%<x<-+-++<x,Sath=)d
For any function f € @"*®, let p(x) be the Lagrange polynomial interpolant of
degree n to the y, = f{x;) at the x; Then the approximation of f(x) = p(x) is
associated with a formula for numerical quadrature, namely

(25) [409 dx = F000 dx > wy, = > =


=

wyf(x)
i=0 ix=0

The coefficients w, are the integrals of the polynomials p,(x)/p,(x,) in (5);

(26) t
=
_ f pix)dx, =p(x) =I]
p jri
(x — x,)

Formula (25’) is exact for all polynomials for degree = n, since then f(x) = p(x).
For other functions, the error in (25’) is — J° e(x) dx, where —e(x) = p(x) —
f(x). Hence, by Theorem 1, the error in (26) for any given n is of the order of
h”*? at most. We have proved the following theorem.

THEOREM 3. For any choice of real numbers 1, with 0 < ry <1) < ‘<n
< 1 and weights w, defined by formula (26), we have, for all f € @"*? [a, b):

f 70 dx — > w,f(a + 1h) = O(n"*?)


=]

Setting 2 = 1 and 7) = 0, r, = 1 in the preceding formulas, we obtain po(x)


= x — x), p(x) = x — xo. Therefore

I (x — x9) dx =
x) — Xp
Se
w= =

xO XX) — Xo

A similar calculation gives wo —


=

h/2, so that the formula for trapezoidal quad-


rature (Ch. 7, §6),

ath

(27)
a
fe) dx ~ - [le + fle + 1 =F (0 +90)
246 CHAPTER 8 Efficient Numerical Integration

is obtained as a special case of (25) and (26). In this special case, the error e(x)
satisfies by Theorem 1, applied to the linear interpolant L(x) to f(x)

e(x) = L(x) — fix) = 3(% — xO — of’

for some & = &(x), in the interval x) = £ <= xg + h = x. Since

x1
A(yo + 1)
J xo
L(x) dx =
2

the error J%! e(x) dx = fj e(xo + 4) dt in trapezoidal quadrature satisfies

x}

sf f th-ojdts
a
min
e(x) dx = —
2
w”
max
[aoa
Xo

Since f% t(h — t) dt = 13/6, we obtain formula (24) of Ch. 7:

(28) Foot =f fod=Ep@, x<t<n


Applying (28) to each component of any vector-valued function x(t) € @*, we
have

f x,(0)dt — ; [x;(éo) + x(t] = —19 xz(7)


for some 7 in the interval [f, tg + 2], and for each component x,. By choosing
one axis parallel to the error vector Ji} x(é) dt — h(x + x))/2, we obtain, as a
special case of the preceding result, the inequality

| f x(t)dt — 5 +x)| = 19 |sup x”(r)|, ig<7T< ty

The relative truncation error is thus O(h?), as is to be expected from a formula


that neglects quadratic terms.
Note that the vector analogs of the Mean Value Theorem and of (28) are
false. For example, let x(é) = (, t*), t) = 0, and ¢; = 1. Then

1 3
f x(t) dt — (xy + x)
-( 410
is not equal to —x”(r)/12 = —(6r, 127°)/12 —(r, 27°)/2 for any r in [0, 1].
6 Higher Order Quadrature 247

Simpson’s Rule. Formula (8) for parabolic interpolation leads similarly to


Simpson’s rule:

(29) fle) dx = 5 Do + Ay + ye = 5 low) + Af) + fle


h = xg — %| = x, — Xo. This formula is exact for quadratic polynomials. Since
ft, 2° dx =
=
9, Simpson’s rule also is exact for cubic polynomials; this coinci-
dence makes Simpson’s rule especially practical.
The error estimate for Simpson’s rule (29) will now be derived for any f €@*.
Consider the cubic polynomial

P(x) = ay + a,x + agx® + asx?

satisfying P(x) = Yo, P(%1) = Mis P(x) = M1 = fH), Pg) = Jo, with xo = x) —
h, x9 = x, + h. These conditions amount to a) = 9, @, = yj, dg = 5y, /2h?,
and ag = —4a,/h? + (yo — yo)/2h°. Hence they can be satisfied for any yo, 91, Yo.
To estimate the error e(x) = p(x) — f(x), translate coordinates so that x9 =
=

—h, x, = 0, x. = h. Consider the function of t for fixed x

(30) bt) = x°(x? — We) — PP — h?Ye(x)

analogous to the function (13) used in proving Theorem 1. We have ¢(0) =


go(th) = $(x) = 0, and further, ¢’(0) = 0, since e’(0) = 0. By Rolle’s Theorem,
the function ¢’(é) vanishes at three places besides ¢ = 0 in the interval —h < ¢
< h. Hence $”(f) vanishes at least three times in —h < t < h, ”(t) vanishes
twice, and ¢'() vanishes once, at some point ¢ = £ We thus have, much as in
(27),

(31) 0 = G© = 2%(x? — He — 2Me(x)

Since p(x) is a cubic polynomial, p(x) = 0; therefore —e(§ = f). Substi-


tuting in (31) and solving for e(x), we get the error estimate

x?(h? _ xf »(¢)
(32) e(x) = > —h=ét=h
24

Integrating (32) with respect to x, since x(k? — x*) = 0, it follows that the
truncation error in using Simpson’s rule for quadrature over —h < x Sh lies
between m = min f"(é) and M = max /"(€) times the definite integral

* xn? — x*)dx WP
J =

h 24 ~ 90
248 CHAPTER 8 Efficient Numerical Integration

Since f € @*, f'(€) assumes every value between m and M, which gives the fol-
lowing theorem.

THEOREM 4. Iff(x) € @”, the truncation error for Simpson’s rule on the interval
—h=x S his equal to hef(é) /90 for some & in the interval [—h, h).
The relative truncation error is therefore h*f'’(£)/180. For example, to
achieve five decimal places of accuracy in computing In 2 = {7 dx/x, about 10*
points must be taken if Riemann sums are used, about 100 with trapezoidal
quadrature, whereas 10 are sufficient using Simpson’s rule!

*7 GAUSSIAN QUADRATURE

In formulas (25) and (26) for numerical quadrature, with the x, as free param-
eters, we have 2n + 2 adjustable constants in all. This suggests the hope that by
properly locating the x,, we can get a formula which is exact for all polynomials
of degree 2n + 1 or less, since these form a (2n + 1)-parameter family.
Such a formula was obtained by Gauss; in deriving it, it is convenient to
renormalize to the interval (—1, 1) and to label the points é,, . » Em, SO that
m =n + 1. The formula uses two properties of the Legendre polynomials P,,(x)
defined in Ch. 4, §2, which will be proved in Ch. 11, §6. These properties
are:

(i) P,,(x) has m distinct zeros x = &; << &<- ++ < €, in the interval (—1, 1),
whence p,y(x) = Cy(x — &)(x — &) + + + (x — &,) for some constant ¢,,.

(ii) P,,(x) is orthogonal to any polynomial of lower degree:

f x"P,,(x)dx =0, if n<m


We shall assume these results, and also the following definition.

DEFINITION. The Gaussian quadrature formula of order m is the special


case of formulas (25’) and (26), in which 7,_, (1 + &)/2,i=1,...,m,and
=
=

£, is the zth zero of the Legendre polynomial P,,(x).

THEOREM 5. Gaussian quadrature of order m is exact if f(x) is a polynomial


of degree 2m — | or less.

Proof. By centering the origin and change of scale, we can assume the
interval of integration to be [—1, 1] without loss of generality. Let f(x) be any
polynomial of degree 2m — 1 or less; let p(x) be the Lagrange interpolation
polynomial of degree at most m —1 satisfying p(é,) = f(&,), 7 = 1, 2,...,m
Then e(x) = f(x) — p(x) vanishes at &,, » &m- Hence,t we have e(x) = (x — &)
++ + (x — &,)b(x), where (x) is a polynomial of degree at most m — 1.

+ This follows by the Remainder Theorem; see Birkhoff-MacLane, p. 75.


7 Gaussian Quadrature 249

Therefore, by (i) above, e(x) = c,,'P,,(x)b(x) = s(x)P,,(x), where s(x) is a poly-


nomial of degree at most m — 1. But, by (ii), P(x) is orthogonal to any poly-
nomial of degree less than m. Hence

f e(x)dx f 5(x)P,,(x)dx = 0
=
=

so that (25’) and (26) are exact for e(£) by the choice of the £,. Moreover, Gaus-
sian quadrature is exact for p(x) by the choice of the &; hence, it is also exact
for the given f(x), completing the proof.
Let f(x) € @?"[a, a + h], and let q(x) be the polynomial of degree 2m — 1 (or
less) satisfying 9(x;) = f(x;), for x, =
=
a + [jh/(2m + 1)],j = 1,..., 2m. Then,
by Theorem 1, we have e(x) = q(x) — f(x) = O(h?") and so

i- ae) in - oe) dx
=
=

f ™ e(x)dx = O(h?"*)
But, by Theorem 5, we have
m

f - q(x)dx > wq(a + 7h) = >= wfla + 1h)


=
=

gal gol

Substituting back into the preceding equation, we get

ath m

(33)
a
f(x) dx — ° wfla + 1h) = OW?"*)) if r, _(d-¥§
J
2
joi

This proves the following result.

COROLLARY. For f(x) € @?", Gaussian quadrature of order m has an absolute


error O(h?"*) and a relative error O(h*").

Romberg Quadrature.t For more than a century, Gaussian quadrature was


the ultimate in ingenious quadrature methods. Then, around 1960, an extrapo-
lation method based on the idea of successive mesh-halving (as in Richardson
extrapolation) to trapezoidal quadrature turned out to be even more accurate in
many cases.

Let Tf be the trapezoidal sum


0-1

(33) TP =5 |F(a) + FQ) + 2 >” Fath) | h = (b—a)/é,£ = 2.


jm

and let T® = [4"7T&*) — TE) iJ/(4" — 1). Then the T? converge extremely

rapidly to J? F(x) dx.

+ See F. L. Bauer, H. Rutishauser, and E. Stiefel, Proc. XV Symposium on Applied Math., Am. Math,
Soc. 1963, 199-218.
250 CHAPTER 8 Efficient Numerical Integration

EXERCISES D

1. Using a five-place table of sin x, x in radians [but not tables of Si(x)], evaluate

fir sin tdt for x =0.1,0.2,...,1.0

by Simpson’s rule, with h = 0.1.

Show that J? dx/x = In 2 is given by various numerical quadrature formulas, with h


= 1/10, as follows: (a) initial point 0.73654401, (b) trapezoidal 0.69377139, (c) mid-
point 0.6928354, (d) Simpson 0.6931474.
Use Weddle’s rule (Ex. B3, Ch. 7) with 12 subdivisions to compute the approximation
0.69314935 to ln 2 = 0.69314718056.
Use Cotes’ rule (Ex. B2, Ch. 7) with 12 subdivisions, to compute the approximation
In 2 = 0.69319535.

*5 Show that

11
J's as =5 E +f ~ eh + Bf) + 720 (“fo + a) + O(h")
*6 Show that, for n = 3, the Gauss quadrature formula on (— A, h) is

# (970) + 5 Lf(—h V9 + VDI)


with truncation error h°f"()/15,750, where —h <& < h.
*7 Show that the error in Hermite’s tangent cubic quadrature formula,

fora =4 0+ -Eos—90
is h°f(£)/720, where 0 < E <A, ify = f(x) € @*.
(Simpson’s Five-Eight Rule). Show that, if f€ @, then

f ”fle) dx = 55 (ft) + 8710)—f(—M)] + OG)


*9 Show that, if f(@) is an analytic periodic function, then trapezoidal quadrature over
a complete period has O(h") accuracy for all n.

8 FOURTH-ORDER RUNGE-KUTTA

In principle, it is easy to derive formulas of numerical integration for y =


F(x, y) having an arbitrarily high order of accuracy, if F is sufficiently smooth.
8 Fourth-Order Runge-Kutta 251

Simply evaluate the successive derivatives of y as in Ch. 4, §8, getting y’ = F,


y’ = F = F, +FF,

y” = Fz + 2FFy + F?Fy + y"F,


y” = Fey + 3FFyy + 3F 7Figy + FPR
yy + y"(8Fy + 8FF,) + y’F,

and so on. If F is linear or a polynomial of low degree, the preceding formulas


may even be practical for computation.
Then evaluate Taylor’s formula, valid for F € @”,

n(n)
h
(34) (x, + h) = 9 + hy +
ye4 —_—_
+ O(h"*)

2 n

ignoring the remainder O(h"*}). The relative error in the recursion relation so
obtained

2a
h Ye 4. h‘yf
(35) Yer1 = MH hy +

2 (n!)

is obviously of order n. Hence, by Theorem 9 of Ch. 7, so is the cumulative


error. Moreover, this one-step explicit method has the advantages of permitting
a variable mesh length and of being stable for strictly stable DEs.
The calculation of (35), however, is rather cumbersome since it involves many
terms. Furthermore, in some cases the derivatives of F can be computed only
approximately by numerical differentiation, which amplifies roundoff errors
(§5). A one-step method of integration that gives a high order of accuracy and
avoids these defects will be described next.
This approach is based on the idea of obtaining as high an order of accuracy
as possible, using an explicit, one-step method. It consists in extending the
approximations of the improved Fuler method (Ch. 7, §8) further, so as to
obtain a one-step formula having a higher order of accuracy. One-step methods
have the advantage of permitting a change of mesh length at any step, because
no starting process is required.
The most commonly used one-step method with high order of accuracy is the
Runge—Kuttat method. We now describe the AE used in this method, for the
first-order system

dx
(36) — = X(x, 2), ast=sb
dt

with mesh points a = f) < t <t) << so . Letyo =


=

x(a) be the initial value.

+Zeits, Math. Phys. 46 (1901), 435-453; C. Runge and H. Kénig, Numerische Rechnung, 1924, Ch.
10.
252 CHAPTER 8 Efficient Numerical Integration

The approximate function table of values y, corresponding to the points ¢, is


defined by the AE

Your = Yet Ok + Bh + ky + ky)

me
(37) k, = Xi; 4), ky = X(x + hk,2 —,iitr-
h
2

ls = X(y+ hk,
2
+e2 ). ky= X(y;+hks,t, +A)
where the mesh length h = A; may vary with 2.
We now show that the preceding Runge-Kutta method has an error of only
O(h*) per step.t For simplicity, we restrict attention to the first-order DE

(38) = = X(x, i), xe et, asisb


di

and to the initial condition x(0) = 0. Formula (37) reduces in this case to

(39) M41 = ): +(?) (ky + 2ky+ 2hks+ ka)


where

a X(y 4), hg
=
=
x(4 2°
hky
+t
2
(39’)
hks
x(n 2”
hs
2 J. X(y; + hks, t, + A)

Let x(t) be the exact solution of the DE satisfying x(0) = yp =


=
0. Set
h/2 = 0. Then k, can be written as

ky = X(%1/9, 8) + Xx(%1/2, 9)[x9 — X12 + Oh]


+-—
ex(%1/20 [x9 — Xo + OR)? +++,
2

where x) = x(0), %1/2 = x(6), and subscripts x stand for partial derivatives. Using
primes to indicate total derivatives with respect to t, so that X’ = OX/dt +

} The proof that follows was constructed by Robert E. Lynch.


8 Fourth-Order Runge-Kutta 253

X 0X/dx, we get

X’(0, 0)6? X”(0, 0)6° X”(0, 0)64


X12 = Xo + X(0, 0)6 +
+
+ O(n)
2 6 24

Since k, = X(0, 0), it follows that

Ok, X’(0, 0)6? X”(0, 0)6°


Xo — Xj +
—_—=—
i oO SCO

+ O(h'*)
2 2 6

so that

X’(0, 0)6? X”(0, 0)0°


hy = X(X1/9, 8) — X(x1/2, 9)
2 6
| + O(n).
For ks, we have, similarly,

kg = X(x1/2, 9) + Xx(x1/2, 8)[x9 — X12 + Oke] ++ --

From this formula, since

Xo = X12 — X(x1/2, 98 + 3X(%1/9, h/2)6? — 4X"(x12, 60°


+ 94X”"(x19, 0)0* + O(n)

we obtain

hg = X(21/9, 9) + 3X(1/25 O)X"(x1/9, 00? — $X.(%1/2, 9)X-(%1/2, 9)X’(0, 0)0°

_ 8X,x(%1/25 IX" (x19, 66° + O(h*)


~

Similarly, we have

kg = X(xy, hb) + X.(x1, h)[x — x + Akg] ++ -

and

% — 1 = —AX(x1/, 8) — 3X”"(x12, 0° + O(h')

so that

hy = X(x, hb) — 4X(x1, AX" (10, O° + X,(e1, MX _(2e12, OX" 1/2, 0? + O(h4)
254 CHAPTER 8 Efficient Numerical Integration

Finally, we have the relations

X(x12, 8) = X(0, 0) + X’(O, 0)0 + $X”(0, 0)6? + EX”(0, 08° + O(n‘)

X(x,, kh) = X(0, 0) + X’(0, Oh + $X7(0, O)h? + 4X”(0, O)A? + OLA

X,(x), h) = X,(%1/2)6) + X3(%1/2,6)6 + O(h?)


Combining these results, we find that

n= Io + (2) [ky + Qhy+ Qhg+ hy]


2 hb

= y9 + X(0, O)h + X’(0, 0) 2 + X”(0, 0) 6 + x”(0, oe + O(n’)


and Jo = x9. Since x(h) is given by

x(h) = xy = xX) + X(O, O)h + X’(0, 0)h?/2


+ X(0, 0)h°/6 + X”(0, 0)h*/24 + O(n’)

we see that

ly; —x| = O(h’)


Hence, the relative error is of order four. Therefore, by Theorem 9 of Ch. 7,
so is the cumulative error. The method of proof consists in comparing various
Taylor series.
The main defect of the Runge-Kutta method is the need for evaluating k, =
X(x, ¢,) for four values of (x, ¢,) per time step. If X is a complicated function,
this may be quite time-consuming. To avoid this repetitious evaluation, many
computer programs use Adams-type methods instead; see Exercise F8.

EXERCISES E

1. (a) Derive a power series expansion for f(a + h) through terms in h® for the solu-
tion of y’ = 1 + 9? satisfying fla) = .
(b) Truncating the preceding series after terms in h?, evaluate approximately in
three steps the solution of y’ = 1 + y? satisfying (0) = 0, setting x, = 0.5,
Xo = 0.8, x3 =
=
1. What is the truncation error? [HINT: Consider tan x.]

2. Same question for y’


=
=
x® + y*. (The exact solution, rounded off to five decimal
places, is 0.35023.)

3. (a) Apply the Picard process to the DE y’ = x” + 9? for the initial value yp = 0 and
initial trial function y (O) = 0. Calculate the first four iterates.
(b) Using the power series method of the text, calculate the Taylor series of the
solution through terms in x!7, and check against the answer to (a).
(c) Evaluate (1) numerically at x = 1, using the preceding truncated power series,
and compare with the answer of Ex. 2.
8 Fourth-Order Runge-Kutta 255

4. (a) Apply the Runge-Kutta method to the DE y’ = 1 + ¥? for the initial value
(0) = 0, setting x9 = 0, x; = 0.5, xy = 0.8, x3 = 1.
(b) Same question for the DE y’ = x? + »°, with (0) = 0, and the same mesh.
5. For the first-order linear system dx/di = A(/)x, show that the Runge-Kutta method
is equivalent to

Xie = E + #(Ag + 4A; + Ag) + ne(A,Ay+ A? + AsA;)


4

+ x (A2Ay + AyA2) +94 (AgATAp)|x


where Ap A(t), Ay = AQ,+1/2), Ao = Alt+1)-
=
=

6. (a) Show that the system u’ = v, v’ = —wis neutrally stable, and indeed that |(u(2),
v())| = const. for any solution.
(b) Show that the Runge-Kutta method is strictly stable, and satisfies

6 8

[(u(t + h),vo + A)! = ( l-st+a


72 576
lu, v@) |

In Exs. 7-11, let y’ = F(x, y) = Lyin byx’y* and y(0) = 0.


7. Show that

y” = F, + FF, and y” = F,, + 2FF, + FPF, + FF, + FF?

In Exs. 8-11, let B = bog + B1:099 + Boaboo? and B¥ = bigby, + Bobo"

8. Show that, if y(0) = 0, then we have

yh) = boo + h*(bio + boobos)/2 + h?(B/3 + B*/6) + O(n’)

9. Show that, with midpoint integration, y(h) is given by the approximate formula

3B
yulh) = hboo = © + bobo) + — + orn

and that the truncation error is —1(B/ 12 + B*/6) + Orn’).


10. Show that the improved Euler method gives

yeh) = hbo + e (io + boob.) + WB + orn’


with truncation error 4°(B — B*)/6 + O(h*).
11. Show that the trapezoidal approximation to y(h) is

2
B* ) + or
yh) = hbo + 9 (bio + Boobor) + 3
(
=+
2 4

with truncation error 4>(B/6 + B*/12) + Oth’).


256 CHAPTER 8 Efficient Numerical Integration

12. Check the formulas of Ex. E8 against those of Exs. 9-10 above in the special case
t-; = 0,4, = h, and AW = pl) = fo + pil + pol? + of a first-order linear
DE.

*13, For the linear DE dx/dt = p(x, p) = Lae pt — a)*, evaluate x(a + h) through
terms in A® by the Runge-Kutta method. Compare this with the Taylor series for
the exact solution.

*9 MILNE’S METHOD

A very different method for solving initial value problems with fourth-order
accuracy is due to W. E. Milne. Whereas Runge-Kutta methods are based
directly on power series expansions, and the Euler methods of Ch. 7 (and Ch.
1, §8) basically approximate derivatives by difference quotients, Milne’s method
replaces x’(f) = X(x; ¢) by the equivalent (vector) integral equation:

(40) x(t)
=
=
x(a) + f X(x,s) ds

as in Ch. 6, (11). If the integral in (40) is evaluated by Simpson’s rule, we get a


very simple implicit, two-step AE

(41) seig = Xe + ZK f) + AXCteaas fear) + XKeaes fsa)


due to W. E. Milne.} 1t is perhaps the simplest scheme for achieving O(h*) accu-
racy. Moreover, in the case of linear systems x’(f) = A(x + b(é), one can solve
for X,49 in (41) algebraically.

Example 3. Consider the linear DE y’ = 1 — 2xy of Ch. 7, §7, with initial


value y(0) = 0 and mesh-length h = 0.1. In this case (41) reduces to

[15 + xp40)
[3 + (15 — x) — 40419041]
=

Dr+2
=

Evaluating (0.1)
=
=

0.09934 by power series, Milne’s formula gives »(0.2) =


0.19475. This approximate value agrees to five places with the value of y()
obtained by power series expansion. The comparison suggests that the mesh
length h = 0.1 is adequate for four-place accuracy. Repeated use of (41) then
gives the following approximate function table.

x 0.1 0.2 0.3 0.4 0.5

y 0.09934 0.19475 0.28264 0.36000 0.42444

+ W. H. Milne, Numerical Solution of Differential Equations, John Wiley & Sons, 1953.
9 Milne’s Method 257

The truncation error is about 107°, as can be verified by use of the power series
solution

2
=
x—- tx +— xP +--
3 15
= > aaa Gon+)
Aap]
=

2k + 1

which gives y(1) = 0.538079.


Having been successfully used in a wide variety of DEs, Milne’s method pro-
vides an excellent illustration of implicit, two-step methods.

Starting Process. Given the initial value yo c = f(a), one must compute
=
=

4 by a one-step method before one can begin to apply a two-step method. For
analytic F, it is usually best to calculate y, = f(a + h) by expanding f(x) in a
Taylor series as in Ch. 3, §7. WhenF is not analytic, but fairly smooth (say, if
F € @*), good approximations to f(a + h) are often obtained by repeated mesh-
halving of the interval [a, a + h], using a one-step method with a lower order
of accuracy. For instance, we might first compute fla + h/8) by midpoint inte-
gration, and then use Milne’s formula to get f(a + h/4) = 91,4 from yp and 9g.
Next we compute 9; 2 from yp and 4, 4 by a second application of Milne’s formula
with mesh length h/4, finally getting y, from yp and 9j/2 by a third application of
the same process.

Iterative Solution. Although (41) can be solved algebraically for linear DEs
and systems, for nonlinear DEs one must resort to iterative methods to compute
X,+9 from x, and x,,;. One can do this by a method analogous to Picard’s
method of successive approximation (Ch. 6, $7), as follows. First rewrite Milne’s
equation (40) in the following form

(42) Xir9 = Uki.9) = & + 5 (Xe, i) + 4X1 tis) + X(XKp+0,taaa)}


where all quantities are known except X;+9.
Regarded as an equation in the unknown vector x,+9, (42) has the form
X49 = U(x,42), where the function U is computable. For any initial trial value
xf, one can hope that the sequence

(1) (0) (1)


xP, = U(x k+2, (3)
(42’) Xi+2 = U(x k+2, )» » U(xfi2),
=

Xi+2 =

will converge fairly rapidly to the true solution.


In the case of a single DE x’(é) = X(x, t), we now show that this will be the
case if X satisfies a Lipschitz condition, where L is the Lipschitz constant and h
is small. More precisely, iteration converges if h < 3/L, and it converges rapidly
if kh < 3/L. This results from (42’), in which

|x,(+1)
+2
(”)
~ Kare
=_

5 IXcxf tesa) — XLS, tr)| S E | Xi+g


r)
+29
(”) (r—1)
— Xk+2 |
258 CHAPTER 8 Efficient Numerical Integration

Hence, if @ = hL/3, we have by induction on r

(+l) _ (r) (1) (0)


|x k+2 k 9| = 8|x k+2 k+2

9)
For h < 3/L, 6 < 1 and so the sequence of x h+2 is a Cauchy sequence; let x,,5
be its limit. Moreover U is a contraction which shrinks all distances by a factor 6
or less, and so is continuous. Hence, passing to the limit on both sides of the
(+l)
equation x,k+2 =
U(x’?..), we get (42).

*10 MULTISTEP METHODS

Milne’s method (41) is evidently a two-step method in the sense that each new
value of an approximate solution is computed using the two preceding values.
In this section, we will study Milne’s method more critically, and describe other
multistep ““Adams-type’”’ methods.
Multistep methods are usually best executed as predictor-corrector methods,
in the sense of Ch. 7, §8. An explicit “predictor” formula based on extrapola-
tion is made to yield higher order accuracy by one or two iterations of an implicit
“corrector’’ formula.

Milne’s Predictor. Thus with Milne’s method, we can use the predictor

(43) Varo = Xp + 2AX(Ka1, bean)

which has O(h*) absolute, and O(h?) relative accuracy. To get O(h*) relative
accuracy from this, we must iterate twice with the corrector (41). Alternatively,
for k = 2, one can use the four-step (five level) predictor

4h
(43 Zerg = Z—g+ ( —

3
{2X,_) — X, + 2X41}

which has O(h*) accuracy, and apply the corrector (41) once.

Stability. Unfortunately, like the two-step approximation y,,; = y,-; +


2hy, discussed in §3, Milne’s method can give an unstable difference approxi-
mation to a stable DE—and in fact it does this in the case of y’ + y = 0. This
can be verified by solving the relevant characteristic equation, which is

Ahp 3—h
—__ =9
(44) pe t+
3+h 3+h

Setting p = 1 — h + h°/2! — h°/3! + h‘/4!, it is easily verified that Eq. (44)


holds through terms in nh’, confirming that one characteristic root p, eh
=
=

+
O(h°). This corresponds to relative O(h*) accuracy. Unfortunately, the other root
po = —1 — h/3 + O(h?); hence, the magnitude of the error will grow exponen-
10 Multistep Methods 259

tially like e? with alternating sign. Therefore, computed values will ultimately
oscillate with increasing amplitude, whereas the exact values tend smoothly to
zero.

Adams-type Methods. The preceding instability is avoided by more sophis-


ticated multistep methods. One of the most successful of these uses the predictor
formula of Adams and Bashforth (1883)

(45) Vert = Ye th > BnV"X


m=0

where 6, = 1, 6; = 4, Bo = 7, B3 = 2. This is followed by the explicit corrector


formula of Moulton (1926).

(46) Vert = Ye th > YnV"%,


m=0

where ¥ 1,1 = —3 Yo = —% Ys = —#, and


=
=

Xiur = XGasr fer


By combining the results of §4 and §10, it is possible to derive an a priori
error bound for Milne’s method. Specifically, we can prove that the cumulative
truncation error, over any fixed interval [a, a + T], is bounded by Mh‘ for some
finite constant M, that is, independent of the mesh length h. This is true, pro-
vided that both yy and y, are accurate to O(h°).
Indeed, let y(x) be any exact solution of the DE y’ = F(x, y), and let d(x) =
F(x, y(x)). Then by Theorem 4, we have

M+ = Me + A [b(x,) + 4b(xp41) + O(%n40)] + ease


where |é+2| S [| max??/90. From this, a discussion like that of Ch. 7, §10,
yields the bound

L
(47) M = |6"
|max(@”— 1/901( 1-—

Finally, we can express ¢(x) in terms of F and its derivatives, just as in Ch. 4,
§8:

¢’ =
=

F, + FF,, @” = Fy, + 2FFy + F’Fy + F,F, + FF;

and so on. Combining these results, we can compute M a priori in terms of the
values of F and its derivatives, thus getting an explicit error bound.
260 CHAPTER 8 Efficient Numerical Integration

A priori error bounds, like the preceding, are seldom useful for methods hav-
ing O(h*) accuracy. One reason is that they are so complicated. In practice, reli-
ance is usually placed on a posteriori error estimates, which utilize computed val-
ues. Another reason is that they neglect roundoff errors.

EXERCISES F

1 (a) Show that, for the DE y’ = y and hk = 0.1, Milne’s method amounts to using the
AE y49 = Bly, + 4yn41)/29.
(b) Integrate the DE y’ = y from x 0 to x =

1 by Milne’s method with h = 0.1,


=
= =

for the starting values yp = 1 andy, = 1.1052.

Same question for the DE y’ = 1 + y”, using the starting values yo =


=
0 and
9; =
0.1003.

Same question for the DE y’ =


=
x? + y?, with the starting values yo =

0 and y, =
=

0.00033.

Exercises 4—6 concern the two-level midpoint method, defined by (16).

4 (a) Show that, if y’ = F(x, y), where F € @?, then y = f(x) satisfies the two-level mid-
point formula (16) with discrepancy O(h’).
(b) Show that, if F € @* and y,_), y, are exact, the truncation error is Kk? + O(h’).
(a) Integrate y’ = y approximately by the two-level midpoint method with h = 0.1,
taking yy = 1, y, = 1.1052 as starting values and integrating to y;9 = 2.7145.
(b) Estimate the discrepancy and the cumulative truncation error in (a), comparing
them with the roundoff error.

(a) Do the same as in Ex. 5a but for the system y’ = z — 2y, z’ = y — 2z and the
starting values yy = 1, % = 0,9, = 0.8228, z, = 0.0820, computed by the Taylor
series method.
(b) Show that the system in question is stable but that the approximating AE is not.
Explain why the computed table is approximately correct, although the method is
unstable.

(a) For general h > 0, set

he

y=1, natn 2 6 24

and Intl = Yn-1 + 2hy

Show that |y, — e“[ = O(h’), as h > 0 with nh constant.


(b) If A = 10°? and the roundoff error is 107’, infer that the cumulative total error
is O(10?-").
Adams three-level methods for integrating y’ = F(x, y) are

(As) Int = In + 3 [23F, — 16F,-, + 5F,-3] (explicit)

(A8) nti = Vn + vt [9F41 + 19F,, _ 5Fi-4 +F,_9] (implicit)

Show that the truncation error of (A) is O(h‘) per step, while that of the implicit
method (Aj) is O(h’).
CHAPTER 9

REGULAR SINGULAR
POINTS

1 INTRODUCTION

We briefly discussed the complex exponential function of a complex variable,


Az
we , in Chapter 3, §2. There we used its properties to explain the behavior
of In z
=
=

Jat/t, the natural logarithm of z, and the power function 2 = +


for an arbitrary real or complex exponent A = yw + w, as z x + ty ranges
=
=

over the complex plane. In the rest of that chapter, however, we assumed the
independent variable to be real.
In the present chapter, we shall consistently be considering complex-valued
functions w = f(z) of a complex independent variable z, and the behavior of
such functions as z varies in the complex domain. Specifically, we shall usually
be studying functions that satisfy some second-order, linear homogeneous ordi-
nary DE of the form

polzw” + pilzjw’ + polz)w = 0

We shall be particularly interested in the way in which their behavior depends


on the coefficient-functions p,(z) G = 0, 1, 2), much as (in Ch. 2) we considered
the analogous questions for real t and p,(t), extending the results to higher-order
DEs in Chapter 3.
Throughout, we shall be exclusively concerned with analytic functions. Here
an analytic, or holomorphic, function w = f(z) of a complex variable z =
=
x +iy is
one having a complex derivative f’(z) = dw/dz at every point.t This is equivalent
to the definition given in Ch. 4, §5, as is proved in books on complex analysis:
any complex analytic function can be expanded in a convergent power series.
An analytic DE is one in which the functions involved are all analytic. Its solu-
tions are then necessarily also analytic (Ch. 6, §11).
The solutions of analytic DEs are best studied as functions of a complex var-
iable, because their isolated singular points are surrounded by connected
domains in the complex z-plane. This permits one to continue solutions beyond
and around isolated singular points, whereas on the real line, solutions termi-
nate abruptly at singular points.

+ Ahlfors, p. 24; Hille, p. 72. Some knowledge of complex function theory is assumed in this chapter.
261
262 CHAPTER 9 Regular Singular Points

For instance, consider the DE du/dx


=
=

u? for real x and u. The formula


u = —1/x defines two real solutions of this DE; one defined for x > 0; the other
for x < 0. As x — 0, one solution tends to +, and the other to —©; the beha-
vior of these two solutions near x = 0 seems to be unrelated if x is restricted to
real values. On the other hand, consider the same DE dw/dz = w* for complex
z= x + wzand
w u + iv. The formula w —1/z defines a single
=> =
= =

solution of the DE, in a domain D that includes every point of the complex z-plane
except the isolated singular point at z = 0; hence, it includes both real solutions.
The general solution of the same DE is the complex-valued analytic function
w = 1/(c — z). This function has an isolated singularity at z = c.
The DE dw/dz
=
=

w® is defined and analytic for all z and w, real or complex.


Each particular solution w = 1/(c — z) of this DE is defined in the punctured
z-plane with the point z = c deleted. Since this domain is connected, the solution
can be continued as an analytic function from any region in it to any other. This
process of analytic continuation is uniquely defined, for any given path of
continuation.t
The (real) solution w
=
=
u = ~1/x of dw/dz = w* on the negative x-axis is
obtained by analytic continuation in the complex plane from the solution u
=
=

—~1/x of du/dx = u? for x > 0. This is evident if we continue the solution as a


complex analytic function w = (—x + iy)/(x? + y°) around the origin on either
side. The fact that the analytic continuation of a solution of a DE is a solution
of the analytic continuation of the DE is valid in general, as we shall prove in
§4.

Example 1. Consider the first-order Euler homogeneous DE

(1)
_ yw
dw ’ y=at #, a, 6 real
dz z

By separating variables and writing z = re”, we find the solution

= z’ = ev Ing _ etn r+i0)


w =

(1’/)
=

en +-P9 cos (8 In r + a6) + isin (8 Inr + a)]

When 6 = 0 and y = ais real, the analytic continuation of the real solution
u

=

x on the positive x-axis through the upper half-plane to the negative


x-axis, where 0 = 1, is (cos ra + 7 sin 7a)|x|*. Note that this is not equal to the
real solution |x{* on x < 0, unless a is an even integer.
The preceding example also shows that DEs involving only single-valued func-
tions can have multivalued solutions in the complex plane. Unless y is a real
integer, the value of w = z” changes bya factor

en? = e 2B(cos Ina + isin 2ra) #1

+ Ahlfors, p. 182; Hille, p. 209.


2 Movable Singular Points 263

when z describes a simple closed counterclockwise loop around the origin, mak-
ing 8 increase by 27 and In z by 277. This example shows that solutions of a linear
DE can have branch points where the DE has a singular point,} even though the
DE has single-valued coefficient-functions.

Example 2. The second-order homogeneous Euler DE is

(2) zw” + pew’ + qu = 0, p, q real constants

For positive z x > 0, a basis of real solutions is provided by the real and
=
=

imaginary parts of the functions z’ = e”'" *, where ¥ is either of the roots of the
indicial equation

(2’) P+ (p-lvytq=0

For instance, the roots of the indicial equation of the DE

vw” +2’ +w=0

are vy = +i. A basis of complex solutions, real on the positive x-axis, is therefore
provided, as in Ch. 3, §3 by the real and imaginary parts of the functions

Az + x) = dfertn r+i0) + Pm 7+36)


Ww

cosh 6 cos (In r) — i sinh @ sin (In 7), and

We # (z! — z*) = cosh @ sin (In 7) + i sinh @ cos (In 7)

The analytic continuation of the solution cos (In x), real on the positive x-axis
6 = 0, through the upper half-plane to the negative x-axis 6 = a is not the
solution cos (In |x|) = cos (In 7) given in Ch. 3, §3 but is the complex-valued
function cosh z cos (In r) — 7 sinh sin (In 7).

*2. MOVABLE SINGULAR POINTS

The general solution of the DE w’ = w* considered in §1 is 1 /(c — 2). This


function has a pole at the variable point c. Thus, the location of the singular
point of a solution depends in this example on the particular solution. This hap-
pens for most nonlinear DEs; one describes the situation by saying that the ‘“‘gen-
eral solution” of w’ = w* has a movable singular point.

+ We define a singular point of the linear DE

polzw™ + pw?) + 0+ + + pw = prii®


as a point where f(z) = 0, or some #,(z) has a singular point.
264 CHAPTER 9 Regular Singular Points

A second example of movable singular points is provided by the DE w’ =


=

z/w. The general solution of this DE, obtained by separating variables, is the
two-valued function w = (z? — c*)'/””, which has branch points at z =
=

te. Since
c is arbitrary, the general solution has a movable branch point.
There is no significant class of nonlinear first-order normal DEs whose solu-
tions have fixed singular points. However, the solutions of the generalized Ric-
cati DE w’ = fo(z) + pi(z)w + po(z)w® have fixed branch points.t This can be
shown by representing w = v’/pov as a quotient of solutions v of the linear DE
v” + pi’ + popov = 0, as in Chap. 2, §5.
A second-order nonlinear DE having a fixed singular point at z = 0 is

w” = (w?/w) — (w'/2),

whose general solution is w = Cz’, with C, y arbitrary complex constants. But


nonlinear DFs with fixed singular points are highly exceptional.
1t is otherwise for linear DEs, to which this chapter will be largely devoted.
An nth order normal linear DE

L{w) = w + pw) + + + + paw’ + p,(2w = fl2)

with holomorphic coefficient functions has holomorphic solutions in any


domain where the coefficient functions are holomorphic. The argument of Ch.
6, §8, can be applied to construct solutions along any path. Moreover, as in Ch.
6, §11, all the functions constructed in the Picard iteration process are holo-
morphic in any simply connected domain, say in 0 = |z| < R. It follows, as in
Corollary 2 of Theorem 10 of Ch. 6, that the DE L[w] = 0 has a basis of holo-
morphic solutions in any such domain—and that L[w] = f(z) hasa solution for
any choice of initial conditions compatible with the order of the DE.
It follows that the only possible singular points of the solutions of a normal
linear DE occur where one or more of the coefficient-functions ,(z) has a sin-
gular point. In short, linear DEs have fixed singular points. In the remainder of
this chapter, we will see how the nature of these singular points is determined
by the singularities of the coefficient-functions.

3 FIRST-ORDER LINEAR EQUATIONS

The study of singular points of linear DEs in the complex domain begins with
first-order DEs of the form

(3) w’ + p(zjw = 0

We will treat only isolated singular points, assuming that ~ is holomorphic in


some punctured disk A: 0 < |z| < », since any singular point can be moved to
the origin by a translation of coordinates.

+ The Riccati DE is the only first-order nonlinear DE with fixed branch points.
3 First-Order Linear Equations 265

It follows that p(z) can be expanded into a Laurent series,t+

(3/) p@) = Ya 0<|z| <p


convergent in A. When all a, with k < 0 vanish, p(z) is said to have a removable
singularity; when there is a largest negative integer k = —m for which a, ¥ 0,
p(2) is said to have a pole of order m there. If there are an infinite number of
nonzero coefficients a, with k < 0, then p(z) is said to have an essential singularity
at the origin.
In all cases, every nontrivial solution of (3) is given by Theorem 1 of Ch. 1,
§3, as

exp[—Jp(z) dz]
oo

A}
3”) C exp[—a_,Inz — >
k=] ( )e+
k3 (He)
As a corollary, we can represent the solution w in the form

(4) w= Cz%g(z), y= -a_,

where g(z) is a holomorphic function in the domain A.

Caution. When 7¥ is complex the innocent looking function z’ = e”* can


be quite nasty. The discussion of z’ in Ch. 3, §2, should be reviewed in the con-
text of Riemann surfaces. For example, writing z e in the usual polar coor-
=
=

dinate representation, but letting @ wind around the origin on an infinite-


sheeted ‘‘winding surface” for z # 0, |z'| = e* and arg(z') = In r. Hence z’ is
unbounded for |z| = 1, though bounded for |z| ron any one sheet of the
=_
=

Riemann surface of z’.

In general, if z’ (y = a + i), then as z tranverses any circle |z| —


=

r once

counterclockwise, z’ is multiplied by the constant

(4/) ey = 22M(-B+0) = 9 2*B(cog Pea + i sin 2ra)

Hence w is holomorphic if and only if y is a real integer.


We now describe the basic classification of singularities for first-order linear
DEs at any isolated singularity of p(z). In the vicinity of any removable singularity
of p(z), Y = a_, = 0 and so w(z) can be expanded in an ordinary convergent
Taylor series. Hence, if p(z) has a removable singularity at z = 0, so does w.
If p(z) has a pole of order one, then (4) holds, where

a(z) = 1 — age +
(aj — a2" +

+ Ahifors, p. 182; Hille, p. 209.


266 CHAPTER 9 Regular Singular Points

is still analytic. Therefore, w has the form

(*) w= C21 + oz + coz? +--+), yr 4,

In this case p(z) is said to have a branch pole of order ¥ at (Q, 0), and the DE (3)
to have a regular singular point at z = 0. The number + can be real or complex,
rational or irrational.
Finally, if p(z) has a pole of order exeeding 1 or an essential singularity at z
= 0, then the DE (3) is said to have an irregular singular point at 0. In this case,
one can show that g(z) in (4) has an essential singularity at z = 0. For example,
the DE w’ + z7?w =
=
Q has the solution w =
=
exp[1/{] = Upio z “/Al, easily
found by separating variables, with an essential singularity at z = 0.
Toprove this, suppose the contrary: that [z’g(z)]’ + p(z)z’g(z) = 0, with

g(z) = = cz", and c,, # 0. Solving for p(z), we get

zg'(z) ~ yE@) _ _ (m+ yeue™ t+:


p@® =
zg(z) Cye™ti pees

so that p(z) must have a simple pole at z = 0 unless M + y = 0, in which case


p(2) has a removable singularity.
In summary, we have proved the following result.

THEOREM |. Every solution of the first-order linear DE (3), with p holomorphic


in A, has the form w = z%g(z), where y = —a_, andg is single-valued and analytic
in A. Removable singularities, regular singular points, and irregular singular points
of (3) give solutions having removable singularities, branch poles of order y, and essen-
tial singularities at z = 0, respectively.

Theorem 1 was derived by explicit calculation. To extend it to higher order


linear DEs, which cannot be explicitly solved, more general arguments are
needed. These require the notion of a simple branch point of an analytic function
w = f(z). This is defined as an isolated singular point z) near which f(z) can be
represented in the form f(z) = (z — %)"g(z), where g(z) is a one-valued holo-
morphic function in the punctured neighborhood of z. Expanding g(z) in a
Laurent series, we get the following expansion for a function f having a simple
branch point at 2:

aw

S® = > a(z —zo)"


—c

Not all branch points are simple; for example, the functions In(z — zo) and
(2 — %)* + (% — 2%)? have branch points at z zo but do not have simple
=
=

branch points there, unless a — 6 is an integer. Any branch pole (4) is a simple
branch point, but a simple branch point need not be a branch pole; thus,
consider z!/2e71/” at z = 0.
3 First-Order Linear Equations 267

We shall now give another proof of the first result of Theorem 1, namely that
every solution w(z) of (3) that is not a holomorphic function has a simple branch
point at z 0
=
=

Starting at a point z) in the given punctured disc, we continue the function


w analytically counterclockwise around a closed circuit, say the circle |z| = [zo].
Returning to z) after a complete circuit around the origin, the function w(e?"z)
= w(z) obtained in a neighborhood of zp is still a solution of the DE, by Theorem
1, But # may differ from the function w in a neighborhood of zp, because the
function w may have a branch point at z 0 even though the coefficient p(z) of
=
=

the DE does not have a branch point there; the DE w’ = w/2z is a case in point.
However, the general solution of the DE (3) has the form cw(z); from this it
follows that

252
wz) = wie z) = cw(z), c#0

erm
Now, write c , where @ is a suitable complex number, and consider the
=
=

analytic continuation of the function

g(z) = 27 7w(2)

around the same circuit. We obtain

ama) = (ze7™)—*w(ze?™) = 2%ety (ze?™)


g(ze
2710
= zx %e cw(z) = z~*w(z) = g(z)

This shows that g is a single-valued function in the punctured disc A. Thus, the
function w is the product of z* and a function without a branch point.
The idea behind this second (and deeper) proof of the first result of Theorem
1 can be applied to linear DEs of any order, as we show in the following sections.

EXERCISES A

1. Show that no solution of w’ = 1/z that is real on the positive x-axis can be real on
the negative x-axis.

2 (a) Setting z
=
=
re®, discuss the analytic continuation to the negative x-axis of the
solutions z and z Inz of z’w” — zw’ + w = 0 on the positive x-axis.
(b) Show that no nontrivial solution of z2w” + 3w/8 = 0 that is real on the positive
x-axis can be real on the negative x-axis.

Let w™ + aw" +--+ - +.4,(2w = 0 be any holomorphic linear homogeneous


DE satisfied by In z. Show that a,(z) = 0.

*4 Show that any holomorphic linear homogeneous DE that is satisfied by z In z is also


satisfied by z.

Find the function g of Theorem 2 when


(a) p@) = 1/2" (nan integer) (b*) plz) = e!”
n

Solve the DE (4) for pz) = >~


h=1
* —&
268 CHAPTER 9 Regular Singular Points

Let p(z) be holomorphic and single-valued in |z| < p except at points a and b. Show
that any solution w, of (3) can be written in the form

w,{z) = (z — a)%(z — bPf(z)

where fis single-valued and holomorphic in |z{ < p except at a and bd.

Generalize the result of the preceding exercise to the case that p(z) is single-valued
for |z| < p and holomorphic at all points except a, + Dye

Prove in detail that solutions of the generalized Riccati equation

w! = polz) + pilz)w + polz)w*

can have branch points only where the p,(z) have singular points.

10 Show that, if the DE of Ex. 9 has no movable singularities, p.(z) = 0.

11 Show that, for analytic p,(z), the DE dw/dz =


=
3 p,(z)w~* has a regular solution.
k=O

(Hint: Consider the DE satisfied by p/w.]

4 CONTINUATION PRINCIPLE; CIRCUIT MATRIX

A rigorous discussion of complex analytic solutions of higher order DEs


involves the concept of analytic continuation, with which we will assume the
reader to be acquainted.+ It also assumes a less well-known Continuation Prin-
ciple for solutions of complex analytic DEs, which may be derived as follows.
Let Fw, ; Wy, 2) be an analytic complex-valued function in a domain D
of (w, ’ , W,, z)-Space. This means that F can be expanded into a convergent
power series with complex coefficients in some neighborhood of each point of
D. Then, as was shown in Ch. 6, §11, every solution of the nth order DE

d"w
*) = Fw, w',w”,... ,w-”, z)
dz"

is an analytic function. The function w(z) — F[w(z), w’(z), .. . ,w® 2), z] of


the variable z is holomorphic and vanishes identically in the subdomain where
the function w is defined. It follows that all analytic continuations of this func-
tion beyond D also vanish identically. Therefore, any analytic continuation of
the function w is also a solution of the DE (*), and we have the following
theorem.

THEOREM 2. (CONTINUATION PRINCIPLE). The function obtained by analytic


continuation of any solution of an analytic DE, along any path in the complex plane,
is a solution of the analytic continuation of the DE along the same path.

t Ahifors, p. 275; Hille, p. 184.


4 Continuation Principle; Circuit Matrix 269

With this theorem in hand, let w,(z) and w(z) be a basis of solutions of the
second-order DE

(5) w’ + p(z)w’ + qz)w = 0

where the functions p and q are single-valued and analytic in the punctured disc
A: 0 < |z| <p. Analytic continuation of each of these solutions counterclock-
wise around acircle |z| =
=
y < p with center at the origin yields two functions
29%
(in general different): @(z) wy,(ze ) and wo(z) w,(ze""’). These are, by the
Continuation Principle, also solutions of the DE (5). Since every solution of (5)
is a linear combination of w, and w,: the continued functions w,d can be
expressed as linear combinations of the solutions w, and w», thus

QR
@(z) wy(e Z) = 41, (zZ) + a)qW9(z)
252
Woz) Wole Z) = ag W,(z) + aggwo(x)

The 2 X 2 matrix of complex constants A = ||4@,|| is called the circuit matrix of


the DE at the singular point z = 0, relative to the basis of solutions (w), ws)
For instance, consider the Euler DE z*w” — zw’ + 3w/4 0, with indicial
equation (v — 3)(v — 3) = 0. The functions z'” and z*” forma basis of solutions
hence, the circuit matrix is the diagonal (scalar) matrix

—1 0

( 0 —1

A similar calculation shows that the DE w” + (2/92z*)w = 0 has the solution


1/3 2/3
basis z Relative to this basis, its circuit matrix is

wo O

(0 w

where w = (—1 + V3:)/2 is a cube root of unity.

Higher Order DEs. A similar construction can be used to definea circuit


matrix (relative to any solution basis) for the nth order linear DE

aly

(6) L{w] = + p,(zjw = 0


dat!

where again all the coefficient-functions ,(z) are holomorphic in the punctured
disc A

LEMMA. Givena basis of solutions wz) of (6), analytic continuation of the (2)
around any circle |z| =
=

rv in A, once counterclockwise, gives a new basis of solutions

of (6)

(2) = we"'2z) = ajw,(z) + + a,w,(z) J 1,2,...


270 CHAPTER 9 Regular Singular Points

Proof. By the Continuation Principle, the w,(z) satisfy (6); since the w,(z) form
a basis of solutions, the result follows.
The matrix A = |{a,|| so defined is the circuit matrix of the DE (6) relative
to the basis w,(z), , w,(z). It represents a linear transformation of the vector
space of all solutions of the nth-order linear analytic DE (6), in the following
manner. If

w(z) Wz) + CoW(z) terest CpW,(z)


=
=

is the general solution of the DE, analytic continuation of w around the same
circuit y carries w(z) into the solution

(6/) wW(z)
=
=

wer™z) = y C,,w,(Z)
pk=l

That is, the effect of analytic continuation around y counterclockwise is to mul-


tiply the vector c by the matrix A on the right, so that ¢ > cA.
Using the circuit matrix A just defined, we can construct at least one ‘‘canon-
ical” solution of (6) of the special form z*f(z) with f holomorphic in the disc A.
This is because every matrix hast at least one (complex) eigenvector (character-
istic vector). Hence, for some choice of c # 0, we can write cA = Ac, where A
is a complex number, an eigenvalue of the matrix A. Choose c in (6’) to be such
an eigenvector, and let f(z) = z “w(z), where a = (In A)/2z7. It is clear that \ #
0, since otherwise we could retrace backward the circuit y, continuing the solu-
tion w
=
=

0 into a nonzero solution. Continuing the function f(z)


=
=
z *w(z)
along the same circuit, we obtain, as in the proof of Theorem 2, the following
result.

THEOREM 3. Any nth order linear DE (6) with coefficients holomorphic in


A: 0 < |z| < p admits at least one nontrivial solution of the form

(7) w(z)

=

2*flz)

where the function f is single-valued in A.

5 CANONICAL BASES

A solution of a holomorphic DE (6) in A has a simple branch point at z =


=

0
if and only if it is carried into a constant (scalar) multiple of itself by continua-
tion around the circuit y. From the discussion in the preceding section, we see
that a solution w(z) = Xj.) ¢w;(z) of (6) has a simple branch point at z = 0 if
and only if the vector ¢ = (¢, 9, ,» €,) and the circuit matrix A satisfy the

+ Birkhoff and MacLane, p. 293. As stated there, A is a root of the characteristic equation |A =
MI = 0.
5 Canonical Bases 271

relation cA = dc, where the constant A will then be necessarily different from
zero. In other words, a linear combination Lj., cjw, of solutions of (6) has a
simple branch point at z = 0 if and only if the vector c is an eigenvector of the
circuit matrix A associated with the basis (w,, wo, , w,) of solutions. Thus,
there are as many linearly independent solutions of (6) with simple branch
points as there are linearly independent eigenvectors of the matrix A.
We shall now look for a basis of solutions with simple branch points for any
second-order linear DE (5), with coefficients holomorphic in A.
Given two linearly independent solutions w, and wy of (5), we can construct
the circuit matrix A = ||a,|| as in §4. The linear combination w =
=
CW,(2) +
€9W(z) then will have a simple branch point if and only if Dc,ay, = de, By the
theory of linear equations, this system of equations has a nontrivial solution if
and only if the following determinant (the characteristic equation of the circuit
matrix A) equals zero:

(8) [A — AZ] = A? — (ayy + agg) + (11499 — 41949) = 0

Ordinarily, this characteristic equation has two distinct roots A,, A». These
roots give two linearly independent solutions F(z) = ¢,w,(z) + cgwo(z) and
G(x) = d,w,(z) + dgwo(z), having simple branch points: Fez) = d F(z) and
G(e""z) = A»yG(z). Relative to the canonical basis of solutions F, G, the circuit
matrix is thus a diagonal matrix

A 0

( 0 dg

As in §4, F(z) =
=
z*flz) and G(z) = z8g(z), where A; = e?™*, dy = 6?™8, and fand
g are holomorphic in A. Such a basis is called a canonical basis.
When the characteristic equation has a single solution \, the solutions may
stillt sometimes have a basis of the form (7). Every solution of the DE is then
2
2a
multiplied by the same nonzero constant A = ¢ when continued around a
counterclockwise circuit ‘.
Otherwise, we choose a basis as follows. Let w,(z) be the solution of the form
z*flz), where f is one-valued in the punctured disc whose existence was estab-
lished in Theorem 2, and let w9(z) be any other linearly independent solution.
Continuation of w.(z) around the circuit y gives, as in §3, wo(ze?™) = aw,(z) +
bw,(z). The circuit matrix for this basis of solutions is therefore the matrix

A 0

(
=
=

a b

+ This occurs when the matrix A is a multiple of the identity matrix. Otherwise, any 2 * 2 matrix A

with only one eigenvalue is similar to a matrix of the form . This fact is not assumed in the
1 A

present discussion.
272 CHAPTER 9 Regular Singular Points

Since the eigenvalues of such a “triangular” matrix are \ and , and since the
only eigenvalue of A was assumed to be A, we must have 6 = A and

dA 0
). a#0
A
(
=
=

a,x

The continuation around the circuit y of the function h(z) = w(z)/w;(z) is easily
computed to be

We(z)
hie") =< + =

= + hz)
r WwW,(z) r

It follows that the function

Ai® = A(z) — 4 In z
Qrin

is single-valued in 0 < |z| < p and, therefore, that the function w,(z) can be
written in the form

we(z) = w,(2)f,@) +oa Qnz)w,(2)

In the exceptional case, since a # 0, we can make a/27id = 1 by replacing


Wy With 2ridws/a.
This completes the proof of the following theorem.

THEOREM 4. Under the hypotheses of Theorem 3, the second-order linear DE(5)


has a basis of solutions in the neighborhood of the singular point z = 0, having one of
the following forms:

(9a) wy(z) = z°flz), wo(z) = z2g(z)


or, exceptionally,

(9b) Ww,(2) = z“f(z) ’ W(z) = wi(z)[fi(z) + In z]

The functions f(z), g(z), and f,(z) ave holomorphic and single-valued in the punctured
disc
0 < [z| <p.

Higher Order Equations.t The preceding discussion can be extended to


nth order DEs (6). A basis of solutions w), wo, . » w, is called a canonical basis
when the associated circuit matrix A is in Jordan canonical form{ or, somewhat

+ For a complete discussion see Coddington and Levinson, Ch. 4, §1.


t See Appendix A or Birkhoff and MacLane, p. 354.
5 Canonical Bases 273

more generally, consists of zeros except on the main diagonal and (in the case
of a repeated eigenvalue) just above it. If the eigenvalues of the circuit matrix
are Aj, do, ., A, the continuation of the solution w, around a small circuit y
is given by one of the two formulas

(10a) w,(z) =w,(ze?**) = Ajw,(z)

or, when A, = A,-1

(10b) w,(z) = w,(ze"™) = Awwfz) + a; 1wj-1(2)

By Theorem 3, there always is at least one solution that goes into a multiple of
itself. If all the eigenvalues of the circuit matrix are distinct, then the circuit
matrix can be reduced to diagonal form by suitable choice of a ‘‘canonical basis”
of solutions, and the exceptional case of formula (10b) does not arise.

Example 3. Consider the nth order Euler DE

dw
ea tw Co d* wy
(11) L{w] = dz"
+—=w=0
z dz”!

The trial function z’, with unknown exponent », satisfies the DE if and only if
the exponent » is a root of the indicial equation of Ch. 4, §2,

(12) CC)
=
=
vpy~-1)--+-@—nt))
+evpe—-—D-- YV-—ntAaterr+
te wt c 0
=
=

When the roots of the indicial equation are distinct, the z” are a canonical basis
of solutions of the Euler DE (11). The circuit matrix is the diagonal matrix with
diagonal entries }, e exp 2xiv,, where », is a root of the indicial equation.

=

When»is a k-tuple root of the indicial equation, then the functions z’ log z,
z’(log z)", and so on, formabasis of solutions. This basis is not “canonical” when
n > 2 (see Ex. B2), though the circuit matrix for (11) is always triangular relative
to it.

EXERCISES B

1. Construct DEs (5) with circuit matrix

1 0

(0 1

for which the functions f and g in Theorem 4 have (a) essential, and (b) removable
singularities at 0.

2. (a) Compute the indicial polynomial for the homogeneous Euler DE zw” + 32z2w”/
2 + zw’/4 = w/8.
(b) Compute the circuit matrix of the preceding DE relative to the solution basis
z/?, z'(log z), and z'(log z)”.
274 CHAPTER 9 Regular Singular Points

FindaDE(5)withcircuitmatrix(0 1/A
for which the functions f and g in Theo-

rem 4 have (a) poles, and (b) essential singularities. Can f and g have removable
singularities?

Show that the requirements 0 S Re{a}, Re{@} = 1 uniquely determine the exponents
a and @ in formula (9a).

Show that, in the exceptional case of Theorem 4, the eigenvalues of the circuit matrix
are equal.

Show that eV*, e~V” forma solution basis for the DE w” + }w/ — kw = 0. Compute
the circuit matrix for the given DE relative to this basis, and also find a canonical
solution basis.

Construct a second-order holomorphic DE (5) with a singular point at z 0 whose


=
=

circuit matrix has the form , such that fin Theorem 4 has an essential singu-
1 1
larity at z 0
=
=

Show that, if w In z satisfies a (homogeneous) linear DE (6) with holomorphic coeffi-


cients, and w is holomorphic, then w satisfies the same DE.

6 REGULAR SINGULAR POINTS

Many of the ordinary DEs of greatest interest for mathematical physics have
singular points which are ‘“‘regular’’ in the sense of the following definition.

DEFINITION. A second-order DE

(13) w” + p(z)u’ + qz@)w = 0

analytic for 0 < [z — z| < p, has a regular singular point at z) when p(z) has at
worst a simple pole at z Zo, and q(z) at worst a double pole there.+

=

In the next several sections, we shall show how to adapt the power series
methods introduced in Ch. 3 to solve ordinary DEs in the neighborhood of any
regular singular point. In particular, we will show that a singular point of the
second-order linear DE (13) is “regular” if and only if the functions f(z) and g(z)
of the canonical basis (9a), of solutions constructed in Theorem 4, have at worst
branch poles there.{ Equivalently, the condition is that a basis of solutions have
the form (25) below.

+ That is, # may either be holomorphic (have a removable singularity) or have a simple pole, and q
may be holomorphic or have a pole of first or second order.

{Or, in the exceptional case of (9b), poles times logarithmic branch points (order of growth log
r/r).
6 Regular Singular Points 275

We now show that, near the regular singular point z = 0, there always exists
a formal solution of the DE (13), namely, a formal power series of the form

2” + cy’t} + Cox”? + eae


(14) w= 2(1 + cz + cox?
+ eg? + - + +) =
=

which, when substituted into (13), satisfies the DE. To calculate the coefficients
c, and the exponent p of (14) it is convenient to rewrite (13) in the form

(15) L{w) = z2w” + zP(z)w’ + Q@)w = 0

where P(z) = LfioP,z* and Q(z) = XfoQ,z* are convergent for |z| <p. Sub-
stituting (14) into (15) and equating to zero the coefficient of z’, we obtain the
indicial equation

(16) Iv) = vv — 1) + Pw + Q = 0

for the exponent v. The roots of this equation are called the characteristic expo-
nents of the singular point, and J(v) is called its indicial polynomial.
Equating to zero the coefficients of the higher powers of z, namely
ann y+n
z 3% ,+.., we obtain the relation

[~ + ly + Poe+ 1) + Qole; + Pv + Q, = 0

and, recursively,

n—-1

[w+ nv +n —- IW + UF MP + Qen = — > (e+ DPr-n + Qr-ilee


k=0

Since the left side of the preceding equation is [(v + n)c,, the equation can be
written in the form

7 l@+njc, = — y [@ + k)Pr-a + Qn-alers = 1, 2,3,...


k=0

The above equation for the coefficient c, can be solved recursively for ¢,, Cy, ¢3,
..., except in one case: when, for some positive integer n, both vy and (» + n)
are roots of the indicial equation. By taking a characteristic exponent having the
largest real part, we can make sure that I(v +- n) does not vanish for any positive
integer n, even in this case. We therefore obtain the following theorem.

THEOREM 5. Ifthe DE (13) has a regular singular point at z = 0, then at least


one formal power series of the form (14) formally satisfies the DE. Unless the roots of
the indicial equation differ by an integer, there are two linearly independent formal
power series solutions (14) of the DE, whose exponents are the two roots of the indicial
equation.
276 CHAPTER 9 Regular Singular Points

In the special case of an ordinary (i.e., nonsingular) point, clearly Py) = Qo


= 0, and the indicial equation »(@ — 1) = 0 has the roots 0, 1. In this special
case, the preceding construction reduces to the methods of Ch, 4, §2. The series
z(1 + £2,c,z*) associated with the larger real root is uniquely determined
but, because the roots differ by an integer and J(1) = 0, the series associated
with the root py = 0 is not.

Example 4. A remarkable class of special functions having regular singular


points at 0, 1, and © (see §12), and no other singular points, is obtained by
applying Theorem 5 to the hypergeometric DE

(18) zl — zw” + [fy —-(@ +6 + Iz]w’ — afw = 0

The hypergeometric DE has regular singular points at z 0 and z


= =
= =

1. The indicial equation at z


=
=

0 is py + y — 1) =0, with roots », = 0 and


vy = 1 — y. Unless y is an integer, by Theorem 5 the hypergeometric DE has
two formal power series solutions with exponents 0 and 1 ~ y. Unless ¥ is a
negative integer or zero, one such formal power series is obtained from the eas-
ily computed recursion relations

(19) (n + li + NCn+) = (a + ny(B + N)Cy, n=0

Setting cy = 1, we obtain the hypergeometric series

af a(a + 166 + 1) 2
Fa,
8, y;2) = 1 +maet
(20) y¥y¥+)) 2!
ac + 1)(a + 2)B(8B + 16 + 2) 2? +

vty + 1) F+ 2) 3!

From the Ratio Test, it follows that the radius of convergence of the series is at
least one, and is exactly one unless a, 8, or 7 is a negative integer. This also may
be expected from the existence theorems of Ch. 6, since the radius of conver-
gence of a solution extends to the nearest other singular point of the coefficients
of the hypergeometric DE, which is at z = 1. The function F(a, 6, ; z) defined
by the power series (20) is the hypergeometricfunction, to be studied in §10 below.

7 BESSEL EQUATION

To illustrate the behavior of solutions of second-order DEs near regular


singular points, we consider an example of great importance in applied mathe-
matics. This is the Bessel DE of order n:

(21) zw" + zw’ + (22 — nw = 0


7 Bessel Equation 277

The Besse] DE has a regular singular point at the origin, with indicial equation
Iv) = v? ~ n®? = 0. In physical applications, n is usually an integer or half-
integer. But, for theoretical purposes, it is interesting to let n® be an arbitrary
complex number.
By Theorem 5, we can compute a formal power series solution beginning with
z", of the form z"(1 + cz + coz” + - - -). From the recursion formulas (17) or,
more simply, by direct substitution into the DE, we obtain the recursion rela-
tions for the coefficientsc¢,:

(2n + l)c, = 0, (kh + 2)(Qn +h + erro + = 0, k=0,1,2,...

Since Re(2n) = 0, the factor (k + 2)(2n + k + 2) cannot vanish. Solving


recursively for c,,2, we obtain the series

( )-
1 1
of z z

n+1 (_

2 (2!)(n + 1)(n + 2) 2 |
The series in square brackets is an entire function (convergent for all finite z).
Multiplying by the normalizing factor 1/2I'(n + 1), we obtain the Bessel function
of order n, already discussed in Ch. 4, §4:

1 z 1 Zz

(2}[- n+1 (_

()-
1 z
(22) :

(2(n + 1)\(n + 2) 2 |
The Bessel function J, is an entire function (Ch. 4, §5) if n is a nonnegative
integer. Using the functional equation for the gamma function, Iz + 1) =
zI(z), this formula can be recast in the form

co
(- 1)*(z/2)"**

Ir=> Tn +k+ Dre + 1)


k=0

Unless n is an integer, the series (22) with —n in place of n defines a second,


linearly independent solution of the Bessel DE (21)

we (— 1)*(z/2)-"**

J-.©
=>
k=O |
Tk + DE(—n + k + 1) |
This solution has a branch pole at the origin. Unless n is an integer, J, and J_,
form a canonical basis of solutions of the Bessel DE.

Exceptional Case. When n is an integer, we have J_,(z) = (—1)"J,(z), so


that another method must be used to find a basis of solutions. We can then
proceed as follows [cf. Ch. 2, (12)].
278 CHAPTER 9 Regular Singular Points

Consider the Wronskian W = J,¢’ — ¢ of J,, and any other solution ¢ of


the Bessel DE. A straightforward computation gives W’ + (1/z)W = 0, whence
(zW)’ = 0. Hence, for some constant A, we have

Jnl2)b"2) (2)6@) = —

If g(z) is the quotient $(z)/J,(z), it follows by direct differentiation that

We)
_ A
gi) =
Tee) nz)

Therefore, the general solution of the Bessel DE (21) is

(23) Z,(2)
=Jul) | +A f Fi,|
= K,z-""4{1 + Die b,x") (cf. Ch. 4,§8],
for any indefinite integral of 1/z]2(z)
where K”is a suitable constant. Hence, we have

dz
(23’) =KA [xvin fy + Yo dz
Jr(2)

where the circle of convergence of the series in curly brackets (‘braces’’)


extends to the point z # 0 nearest the origin, where /,(z) vanishes. Within this
circle, we can integrate the series term by term. Thus, when n —
=

0, we easily
compute 1/zJo"(z) = z7' + 2/2 + 523/64 + , and so the general solution
of (21) when n = 0 is

2
524
(23”) Zo(z) = to B + A[ Inz+>+ 5+:
256
)
This shows that any solution of the Bessel DE of order zero that is not a constant
times J,(z) is logarithmically infinite near z = 0
More generally, the preceding formulas show that, when the parameter n in
the Bessel DE (21) is not an integer,} we obtain the general case (9a) of Theorem
4, with a = n and @ = —n. When nis an integer, we have the exceptional case
(9b) of Theorem 4. We shall now discuss this exceptional case

The Neumann Function. In the exceptional case that n is an integer, var-


ious choices are possible for a canonical basis of solutions of the Bessel equation
The first solution w)(z) in (9b) must be chosen as a multiple of J,(z), because
only such multiples have branch poles at the origin. Any choice A ¥ 0 in (23)

+ Note that, though the Bessel DE of half-integral order x + } has characteristic exponents which
differ by an integer, it has a basis of solutions of the form (9a). This is because the recurrence rela-
tions for the coefficients in its expansion express ¢ 49 as a multiple of ¢, without involving c,,,
7 Bessel Equation 279

will give a possible second member of the canonical basis (9b). Thus, when n =
0, we can choose A = 1, B = 0 in (23”). Relative to this choice, the circuit
1 0
matrix is
2 1

A more convenient choice is A = 2/x and B = (2y — 2 log 2)/a, where y =


0.5772 .. is Euler’s constant. This defines the Neumann function Yj(x). The
choice is convenient because of the asymptotic formulas, valid as x — 00,

(24) j= /3[o(o-8)+00)]
oe VE ( }+o(%)]
.
wv
xo
sin

This asymptotic behavior will be explained in Ch. 10, §11.


The Bessel and Neumann functions J, and Yp are clearly linearly independent;
hence, they are a canonical basis of solutions of the Bessel DE of order zero.
Using these functions, we now derive a canonical basis for the Bessel DE of
integral order n.
This can be done as follows. If Z,, is a solution of the Bessel equation of order
n, the function Z,,,, defined by the formula

Zari = —2"[2z-"Z,(z))’

is a solution of the Bessel equation of order n + 1. This formula is valid whether


n is an integer or not, as is immediately verified by substituting Z,,, into the
Bessel DE of order n + 1. In particular, J,4,(z) = —z"[z~"/,(2)]’, much as in
Ch. 4, (13).
We now define the Neumann function Y,(z) for integer n by the recursive
formulas

Y,,+1@) = ~2"[2" n(2))’, n=0,1,2,...

Since the function Y(z) is of the form Y)(z) = Jo(z)Lfol) + Ko log z], where fp
is holomorphic and single-valued in a punctured disk with center z _
=

0, we
verify by straightforward differentiation that

¥,@) = h@A® + K, log z}

where /, has the same property as fp, and, recursively, that

Y,@) = ADL + K, log z]

Thus, all Neumann functions Y,,(z) have a branch point at z 0. From this it
=
=

follows that Y,, and J, are linearly independent, and indeed are a canonical basis
of solutions for the singular point z 0 of Bessel’s DE.
=
=
280 CHAPTER 9 Regular Singular Points

The Neumann function is defined when» is not an integer by the formula

Jv(z) cos vr — J_Az)


Y(z) =
sin yr

When pv
=
=
n is an integer, let Y,(z) = lim,_,,Y,(z). The limit can be evaluated by
PHOpital’s Rule as

van

1t can be shown that this definition of the Neumann function for integral n
agrees with that above

Modified Bessel Functions. The values of J,(z) on the imaginary axis


iy define a real function of the positive variable y
=
=

cw k

nly)= (—9"Jaliy) (3) » k=0 |


J
Tk + DIR +n + 1) |
This function is called the modified Bessel function of order n. The function 1,(y)
satisfies the modified Bessel DE

od'l dl
J dy? ty — (+ n)1=0

obtained by substituting iy for z in the Bessel DE. The coefficient-functions of


the modified Bessel DE are real

EXERCISES C

1. (a) Show that the DE zw” + (1 — z)w’ + Aw = 0 hasa regular singular point at the
origin, and that y = 0 is a double root of the indicial equation.

(b) Find the power series expansion of the solution of this DE that is regular at
z = 0. (Hint: Derive a recursion formula for ¢,41/¢

2. State and prove an analog of Theorem 5 for w’ + p(z)w = 0


3. (a) Show that, if

(*) w’ + plz)w’ + g(z)w = 0

has a regular singular point at z = 0, and q(0) # 0, then

q
(*)
ool q )}e+[r-(t)ealens
has a regular singular point at z = 0
(b) Show that, if w,, wy are a basis of solutions of (*), then w{, w are a basis of
solutions of (**)
8 The Fundamental Theorem 281

*4 Show that, if the roots of the indicial equation at a regular singular point of (5) differ
by an integer, the eigenvalues of any circuit matrix are equal.

Let (13) have a regular singular point at z 0, and let a be a root of its indicial
=
=

equation having largest real part. Show that, if w is a solution, then v = z~*w satisfies
a DE of the form (15) with Q) = 0 and Re{P)} = 1.

6. (a) Given three pairs of nonzero complex numbers (A, Ag), (41, Ha), (21, 29), Construct
a holomorphic second-order linear DE (13) having regular points at z, and 2,
whose circuit matrices at these points have eigenvalues (A;, Ag), (41, Hs).
*(b) Generalize the above to n points 2), » Xp

7. Show that, for x a nonnegative integer, J,(z) and its complex multiples are the only
solutions of the Bessel DE that are holomorphic at the origin.

8 (a) Find the exponents at z =


=
0 of the DE zw” + (n+ dw’ + w =
=
0, and find
formal power series solutions corresponding to each characteristic exponent.
(b) Show that a basis of solutions of this DE is given by the functions

d"[sin (2V2)]Jaz” and d"[ cos (2V2)]/dz"

*Q Show that, when n is an integer, a solution of Bessel’s DE [the Neumann function


Y,,(z)] is defined by

a JnlZ)
~(-1)"21-26)
| a on

10 Show that, if u), uw. and v;, Us are bases of solutions of the Besse] and modified Bessel
DEs, respectively, %,, tg, U;, Vg form a basis of solutions of the DE

(2n? + 1) 4 — n’)
w® + (z2 ) a _— emt 2] w+ (n
J»-
Zz w+ —1

11 Show that the self-adjoint form (Ch. 2, §8) of the hypergeometric DE (18) is:

<(«a — grrr leas] _ [abz""'(1 _ zt Fw =0

12 Show that a canonical basis of solutions of the hypergeometric DE (18) is provided


by F(a, 8, y; z) and z-TF@ —y+1,8—7+ 1,2 — 7; 2) unless7 is an integer.

THE FUNDAMENTAL THEOREM

We now establish the fact that the formal power series solutions obtained in
§6 are convergent. We begin by proving the converse of this result.

THEOREM 6. Let the analytic functions

(25) Ww,= (1 +Ya), Wo - (1 + Doe!), a#B


282 CHAPTER 9 Regular Singular Points

have simple branch poles at z 0 of different orders a # 8. Then the normal second-
=
=

order DE satisfied by w, and w has a regular singular point at z 0, with charac-


=
=

teristic exponents a and B.

Proof. The coefficients of the normal second-order linear DE satisfied by w,


and wy are found by solving the simultaneous linear equations

wy + plz; + q(z)w, 0, j= 1,2


=
=

for the unknown coefficient functions p, q, as in Ch. 2, Ex. B7. The result is

p=
—(wywz — ww) _ (wjuh = whut

(w, wy — wWew}) (w,w3 — wew)

The Wronskian W = w,w5 — waw{ in the denominators is equal to

(8 ~ a)ztt8) (: + > a)
and does not vanish near z = 0, since a # 8. The numerators are the powers
yotB-2 a+B-3
and z multiplied by holomorphic functions of z. Dividing out, we
get

plz) = z"'PQ), q(z) = z-?Q&e)

where P(z) and Q(z) are holomorphic in some neighborhood of z = 0. This com-
pletes the proof of the theorem.

THEOREM 7. Let the second-order linear DE (5) have a regular singular point
at the origin, and let a be the larger root of its indicial equation Iv) = 0. Then the
formal power series (1 + Xa,z") of Theorem 5 converges to a solution of (5) in a
domain
0 < |z| <o, o> 0.

It will be recalled that, in Theorem 5, » was any root of the indicial equation
Iv) = 0 such that Jv + n) = 0 for no positive integer n.

Proof. The method of proof, due to Frobenius, is a generalization of Cau-


chy’s Method of Majorants (Ch. 4, §6). The functions P and Q are holomorphic
in a neighborhood of the origin; thus, a closed disk 0 <= |z| < p can be found
in which these functions are holomorphic, with | P(z)| = M and |Q(z)| = N. It
follows from the Cauchy estimates for derivatives that
~

Pr =F: 1Q, | So k =0,1,2,...

+ See Hille, pp. 197 and 202.


8 The Fundamental Theorem 283

Hence, we have

M|yp| + N
Pal lol + 1Ql ’ kz=0

From formula (16) we obtain the bound |I@ + ”| = n?/K for some constant
K = 1: since Iv + n) # 9 for all nonnegative integers n by hypothesis, the
sequence n“/|1(» + n)| is bounded (it tends to 1 as n — 00). Therefore, if

K =1+max| |1@ + nv)| | n

the recursion formulas (17) give the following bound for the coefficient c, in
the formal power series solution (14)

M|yp| +N cA
len| =
<=
£5 (k=0
n
") n—k

Now, let A = M|v| + N+


M+ 1. Clearly (M[y| + N + &M)/n SA for all n
and for 0 <= k =n, and AK = K = 1. Hence, we obtain

£Falal
Jeal = ARSlal
leak
(25/ lc,
|=
n m0 nM 4x9?”

Using this formula, we now prove by induction that

lenl =( )te n=1,2


This inequality is immediate for n =
=
1, namely |c,| = AK|cg|/p. Now assume
it is true for all c, for 1 = k S n — 1. Substituting in (25’) the bounds for
Cos » Cy, given by the induction hypothesis, we obtain

KA AK 1
lc,| = —— >
nN 4=0 ( p
n—k lcol

= OOo
[1 +(AK)
+ (AK)’ + + (AK)"""]

Since AK = 1, each term in the brackets above is bounded by (AK)"~’. Thus, we


have

< AKlool _n(AK)"~ (AK)"|co|


Ic,|
S n
n p
284 CHAPTER 9 Regular Singular Points

and this shows that the formal power series solution (14) has a radius of
convergence at least equal to p/AK. The proof is therefore complete, with
o = p/AK.

COROLLARY. In Theorem 7, unless the roots a, of the indicial equation differ


by an integer, the DE (5) has a basis of solutions whose circuit matrix is a diagonal
matrix with diagonal entries hy exp (2aia) and \y = exp (2778)

For this, one only has to choose a canonical basis of the form (25), which
exists by Theorem 7

Exceptional Case The exceptional case, namely when the roots of the
indicial equation differ by an integer, can be treated by the following method
Select for a a root of the indicial equation having maximum real part. Then
8
=
=
a —~— n for some integer » 2 0 and so, since by the indicial equation
a+ 8 = 1 — Po, it follows that 2a — Po 1, where Po is the leading
coefficient of p(z) Po/fz +P) + and n is a nonnegative integer
Moreover, we know that I(a@ + ) # 0 for all integers n > 0, and so by Theo-
rem 7 the given second-order linear DE (5) has a solution

w, = zflz) = (1 + Yan!)
which has a branch pole at z 0 and is nonvanishing in the punctured disk

=

0 < |z| < o for some o > 0. Hence, if we set w = wih = z"f(z)h(z), the DE (5)
is equivalent to

O = wh” + [2wi + pleywi]h’ = 2°f(2){ar (==) + 2(£


f +n
This first-order DE for the unknown function h’ has a regular singular point at
0, since f’/f is holomorphic there, while 2a/z and p(z) have at worst first-
—I2a ~—
order poles. Therefore, we can write h’(z) z(1 + “Xc,z"), where
-—n — 1, as shown above
=

Po =

Integrating h’(z) term by term in the circle of convergence, we thus have

Inz + (2) ifn


=0
h(z
|
=
=

Cz In
z + z-"d{z) ifn #0

where ¢(z) is holomorphic. This shows that the exceptional case of Theorem 4
always occurs when the indicial equation has a double root and also when the
roots differ by an integer n, unless ¢c, = 0
Collecting results, we have proved (for C = c,) the following theorem
9 Alternative Proof of the Fundamental Theorem 285

THEOREM 8. Suppose that the roots a and B = a — n of the indicial equation


of a second-order linear DE, having a regular singular point at z = 0, differ by a
nonnegative integer n. Then there exists a canonical basis of solutions of the form

(26) aoe 1+ > a,z*


k=l ). meal1+ lie ) + Cw,Inzk=l

where the power series are convergent in a neighborhood of z 0


=
=

Relative to the canonical basis (26), the circuit matrix has the form

dX 2xiC

(0 r

with A = e27!% = 928.

*9 ALTERNATIVE PROOF OF THE FUNDAMENTAL THEOREM

Theorem 7 also can be given a more intrinsic proof by relying on the follow-
ing characterization of poles of analytic functions.

ORDER OF GROWTH THEOREM. [f f(z) is holomorphic in 0 < |z| = R, then


z = 0 is a pole off(z) of order at most a, or a removable singularity, if and only if
there exists a positive number C such that

sup |f(re)| < Cr, O<r=R


05952"

Theorem 7 can be deduced quite easily from this and the following basic
result.

LEMMA. [f the DE (5) has a regular singular point at z —


=
0, then the function
J) in Theorem 3 has at most a pole at z =
=
0

Proof. For any solution w(z) of (5), consider the real-valued function

Ulz) = |w(e)[? + zw’)?

Setting z = re”, we shall majorize the derivative of this function relative to r for
fixed 6; its differentiability follows by the Chain Rule.

+ Hille, p. 213, Theorem 8.4.1.


286 CHAPTER 9 Regular Singular Points

For any differentiable complex-valued function V(r) of a real variable r, as in


Ch. 6, formula (3), we have

“Vv aul =
<{ |V(t)| dt
Applying this inequality to U(re) we obtain

ze 72
1] au
| = wu +|2 |
+ [z*w'w"

where z
=
=
Using the fact that w”
=
=

—(P(z)/2)w’ — (Q(z)/z")w we obtain

|dU/dr| <= |ww’| + |z w|/r + | P(z)||z w|/r + | Q(z) ||ww’|

The functions P(rei) and Q(re6) are holomorphic in some closed disk 0 = |z|
<= R, R > 0. Let M be a common upper bound for their absolute values, for
0 = 6 = 2m and for fixed r. This gives the inequality

1|%] « ww'| + (M
+ 1)[2w?
|
<= (M + 1)|ow

By definition of U, we have |w|? U, |w’|? < U/r’, and hence, multiplying,


|ww’| = U/r. We obtain, therefore,

| fou
(aU = (4M+4) U(re®) _ KU
)] | <
rT
K>0
r

In particular, for 0 < 7 = R, we obtain 0U/dr + KU/r = 0, whence integrating


between the limits rand R

R¥U(Re®) — U(re*) = 0

If N = maxpsesor U(Re"), we obtain

U(re®) < NR*r

and hence,a fortiori, that | w(re") |? < (NR*)r-*. By the Order of Growth Theo-
rem, with C = NR* and a = K, the conclusion of the lemma follows
Consequently, if the DE (5) has a regular singular point at the origin, then it
has a solution given by a locally convergent power series of the form described
in Theorems 5 and 7 The construction of a second solution can then be
achieved as in Theorem 8
The preceding result can be generalized to nth order linear DEs
10 Hypergeometric Functions 287

EXERCISES D

1. Find the exponents at z = 0 of the DE w” + (u/z)w’ + (1/z)w = 0. Show that this


DE has a power series solution C,(z) = 1 + Lf, az", convergent for all |z|. Show
that C,(z) = 2°" ?Jy_\(2V2).
Show that w” + (n + 4 — 2?/4)w = 0 has a basis of solutions

(2n + 1)z? (4n? + 4n + 3)z* |


wz) = 1—- ‘, and
4 96
_ (an + 12° (4n? + 4n + 72° |
wW,(z) = z
12 480

For what values of |z{ do these series converge?

The Laguerre DE is zw” + (1 — z)w’ + aw = 0.


(a) Find its characteristic exponents.
(b) Show that a nontrivial solution is given by De,z* with

_G-= ay
C +1
3
(j+1P

The associated Laguerre DE is zw” + (k + 1 — zw’ + (n — Aw =


=
0. Show that
this has a polynomial solution w = L*(z) for any positive integers k, n.
Show that e~*2z*-)21%(z) satisfies the DE

zw” + 2w' + [A+ Bz + C/z]w = 0

with
A = n— (k — 1)/2,B = —4,C = (1 — h9)/4.
Show that, if 6(0) # 0, the substitution w = ¢(z)v carries second-order linear DEs
(5) having a regular singular point at the origin into DEs having the same property.

Generalize the result of Ex. 6 to nth order linear DEs.


Show that the substitution w = z’(z)w,, where $(0) # 0 and $(z) is regular near
z = 0, carries a regular singular point at z
=
=
0 with indicial polynomial J() into
one with indicial polynomial J — 1).

Do the functions log z and (log z)? satisfy a second-order linear DE (3) with a regular
singular point at z = 0? Do they satisfy a third-order linear DE with regular singular
point at z = 0? Justify your answers.

*10 (a) The DE w” + C3 p,(z)w®-” = 0, p,(z) holomorphic for 0 <= |z| < 7, has a
regular singular point at z =
=
0 if p, has, at worst, a pole of order k. Derive an
analog of the indicial equation (16) and generalize Theorem 7 to this DE.
(b) Generalize Theorem 8 for this DE when two exponents coincide.

*10 HYPERGEOMETRIC FUNCTIONS

So far in this chapter, the behavior of solutions of DEs has been studied only
neara single isolated singular point. A fascinating topic of analysis is the relation
between the behavior at different singular points of analytic functions defined
by DEs. This topic is beautifully illustrated by the hypergeometric functions,
288 CHAPTER 9 Regular Singular Points

defined as solutions of the hypergeometric DE (18). This illustration (Example


4 of §6) is of especial interest because many common transcendental functions
can be expressed in terms of the hypergeometric functions. For example,
(1 — z)* = F(a, B, B; z), arcsin z = zF6, 5, 3; z°), log (1 + z) = zF(1, 1, 2; —2),
and so on.
According to the program laid out in Ch. 4, the properties of the hypergeo-
metric functions can be deduced from the DE (18). For example, let us derive
a formula for the derivative of the hypergeometric function F(a, 8, y; z). Differ-
entiating the hypergeometric DE, we get

21 ~ zw” + fy +1 —-@+t1+B+1+4+
Dqw" —-@+)E+ lw =0

which is again a hypergeometric DE with constants o, a+1,68,;=6+1,


=
=

v1 = y + 1. By Theorems 7 and 8, every solution of this DE holomorphic at


the origin is a constant times F(a + 1,8 + 1, y + 1; z). This implies the formula
Fa, B, y; 2) = kF(@ + 1,8 + 1,7 + 1; z). The constant & is determined by
differentiating (20) at z = 0. This gives the differentiation formula

(27) Fo, 67) = 2 Fet+1etlytiy

The Jacob: rdentity

- [22+ Fa, By) = oat l- ++ @tn— 1'Fa + 0, B, 32)


can be similarly established by multiplying both sides of the identity by z'~*, and
then verifying that both sides of the resulting identity satisfy the same hyper-
geometric DE with constants a, = a + n, 8, Y
The study of the hypergeometric DE is greatly facilitated by its symmetry prop-
erties. Making the substitution w = z'~%u, we obtain as a DE equivalent to (18)
for the dependent variable u, a second hypergeometric DE with different con-
stants (unless y = 1):

(28) zl — zju” + [y, — (@ + 6, + Dz]u’ — a,B,u = 0

where a, =a —y+158, B-y+1,andy, 2 — y. Since this DE has


= =

= =

the solution w)(z) = F(@;, 81, ¥:; z), we obtain at once a power series solution
of (18) corresponding to the exponent 1 — y in the form

(29) w(z) =z) ¥Fa-yt1,p-y+1,2-%2

The two solutions are a canonical basis of solutions of (18) at the regular singular
point z = 0.
The change of dependent variable w = (1 — z)’~*~8u gives another hyper-
geometric DE of the form (18) in the variable u with a, = y — a, 8; = y — B,
11 The Jacobi Polynomials 289

¥, = y. Since the solution of this DE, which is holomorphic at z 0 and takes


=
=

the value 1 there, is F(y — a, y — 8, y; z), we obtain the identity

(30) Fa, B, 32) = (1 — 2) * By — a, ¥ — BY; 2)


A change of independent variable that transforms the hypergeometric DE
into itself ist = | — z. This gives the DE

t(1 — t)w” + [y, — (@ — 6 + l)ilw’ — aBw = 0

where y,; = a + 8 — y + 1. 1t follows that the hypergeometric DE has a second


regular singular point at z
=
=
1, and a basis of solutions

w3(z) = Fla, B,a + 8 +1—y;1—


2)

and

wiz) = (1 ~ 2 Rly — a,y - 8,yy-a—-B+1;1—-aF


These functions form a canonical basis of solutions relative to the singular point
z = 1. Note that the functions ws and w, are equal to linear combinations of
the functions w, and wy by the uniqueness theorem for second-order linear DEs
(Ch. 2, Theorem 3).

*11 THE JACOBI POLYNOMIALS

The linear transformation z F 1 — 2z carries the hypergeometric DE (18)


with parameters
a = —n,
8 =n +a+6+1,y = a+1 intothe Jacobi DE

(31) (l ~ 2)w"” + [a -b~ @tbh+ 2)z]w’ +nn+a+b4+


lw=0

It carries the regular singular points 0, 1 of (18) into 1 and —1, respectively.
Note that the Jacobi DE (31) goes into itself under the transformation z +> —z,
a= b.
Multiplying by (1 — z)*(1 + 2)’, we get the self-adjoint form (Ch. 2, §5)


d
d:
la — 21 + 2)! a +nn+a+b+ 1)(1 —(1 + 2)’u = 0
When a = 8, this reduces to the ultraspherical DE

(32) — la ae a | + n(n + 2a+ 1) — zu = 0

+ Assuming, of course, that the parameters a, 8, y are not chosen in such a way that the solutions
coincide: thus y # a + 8.
290 CHAPTER 9 Regular Singular Points

This is obtained from the partial DE V?[r"u(cos 6)] = 0 in (2a + 3)-dimensional


space by separation of variables; hence, its solutions play an important role in
potential theory and its generalizations. Familiar special cases of the ultraspher-
ical DE are a = b = 0, which gives the Legendre DE [(1 — 2yu) + nin + lu
0 (Ch. 2, §1), and a = b = —3, which gives the Chebyshev DE

(33) [( ~ 2)Pu’y + nl — 2*)"'?u = 0

From the derivation of (32), it is evident that the hypergeometric functions


F(—n,n +a+6+1,a+4 1, (1 — 2/2) are solutions of it. It follows, inspecting
the hypergeometric series (20), that if n is a nonnegative integer, this series is a
polynomial in z unless a is a nonnegative integer —m, with m < n. Multiplying

by the normalizing factor , we get, by definition, the Jacobi polynomials

Pe (zy (nta )n nntatbt+ilatl (1


— x)
2
n
(34)

( v*( , x nntat+ot+1l;b4+1; di —2 2)2)

In turn, when a = 5, these give the ultraspherical or Gegenbauer polynomials


of index a + 3. These are usually normalized by the formula

Ta
+ n+ 1) (a—1/2,a—1/2)
PI) = Tati/2+n+1)""
(2)

With this normalization P©(z) T,,(z) is the Chebyshev polynomial of degree


n: P’/(z) is the Legendre polynomial of degree n, and so on
From the differentiation formula (27) for the hypergeometric function, we
infer the differentiation formula for Jacobi polynomials

— P(z) =C: plarmb+m(,y


(35) a

C=2"n+at+64+ V)int+at+b+4 2) (ntatb+m)

An expression for the Jacobi polynomials often more convenient than (34) is the
Rodrigues formula

pe )( )= 1"
(-1)"
(36)
ni2”
qd —- yd + yd — ond + 7]

We shall derive this formula from the identities for the hypergeometric function
established in the preceding section. First, since (1 — #)* F(—a, b, 6, t), the
11 The Jacobi Polynomials 291

bin
binomial series is a special case of the hypergeometric series: (1 — #
=

Fa+ 1, —n — b,a + 1, t). Using also the Jacobi identity, we justify the first
two steps of

nm

71 — or ern _ ‘?t")

=
=

rad — y= fee"Fa +1, -—n-ba+1;9]


=
=

(a+ Iiat 2): -atnd —)°’Fatn+1,—-n-—b,at+1;8


=@tD@at2):-:‘@taM-nntatoti,atl;9.

In the last step, identity (30) for the hypergeometric function is used.
The Rodrigues formula (36) follows by making the change of variable
t= (1 — 2/2.

EXERCISES E

1 Verify the following identities:


(a) Fla, B, B; 2) = (1 — 2)
(b) FG, 4, $ 2°) = (arcsin z)/z
(c) FI, 1, 2; z) = —log (1 — 2)/z

@i+(t)e+(2 }e treet (2) = (2) -A—m,l,a—m+1;—27')


a

(e) cos az = Fla/2, —a/2, 1/2; (sin z)"]


(f) log [1 + 2)/(1 — 2] = 22F(1/2, 1, 3/2; 2”)
(a) Show that (18) is equivalent to

| z—|z—+y7-1
az
2
(z— +a
\ z—+8
}
(b) Show that the eigenvalues of the circuit matrix of the hypergeometric DE at z =
0 are equal if y is an integer.
(c) Show that the eigenvalues of the circuit matrix for z = 1 are equal if y — a —
6 is an integer.

(a) Show that, if @ is zero or a negative integer, the hypergeometric DE (18) has a
polynomial solution unless -y < a is a negative integer.
(b) Using (34), express this solution as a Jacobi polynomial.

(a) Compute the characteristic exponents at z = +1 of the Legendre DE

[1 — 2*)w’! + Aw = 0

*(b) Describe corresponding circuit matrices, taking as basic solutions an even and
an odd solution.

5. (a) Show that setting ¢ = z” in the Legendre DE gives a hypergeometric DE.


(b) Express the Legendre polynomials as multiples of F(a, 8, 32’) for suitable a, 8,
Y
292 CHAPTER 9 Regular Singular Points

6. Find the characteristic exponents at z = +1 of the associated Legendre DE:

[1 — 2)w') + [n(n + 1) — w?2/1 — 2)Jw = 0

Derive from (31) the self-adjoint form of the Jacobi DE displayed in the text.

*8 ProvethatP@(z) = k,F| -n,-b-—n, lta; (z+


«@ — 1)
1)
| where
nt+a z+1
Ra
=
=

( n I 2

(Hint: Show that the right-hand side satisfies (31), using suitable identities for F.]

Find the roots of the indicial equation of the Jacobi DE (31) at z = 1 andz = —1,

10 Show that (34) defines a solution of (32) even when n is not a positive integer. What
happens when7 is a negative integer?

11 Using (36), show that, fora > b > —1

f ; PE(e)PLM(X)(L — x)°(1 + 2)?de =


=

0, form #n

*12. SINGULAR POINTS AT INFINITY

Even when the coefficient-functions p and g of the second-order linear DE


(5) are regular at infinity, the point at infinity may be neither a removable sin-
gularity nor a regular singular point, but an irregular singular point. For
instance, this is true of w”
=
=

w, whose solutions e** have essential singularities


at infinity.
One determines when the point at infinity is a regular singular point by mak-
ing the substitution z
=
=
1/t. This substitution transforms the second-order lin-
ear DE (5) into the DE

d’v 2 1 1 1
a
(37)
dt” | i (} _—

t di
f?
(po
where v(t) = w(1/t). The point at infinity is said to be a regular singular point
of the DE (5) when the origin is a regular singular point for the DE (37). This
happens when the function

2 1 1

| (2 a }
—_

t

has, at worst, a pole of the first order at ¢ = 0, that is, when the first coefficient
in the power series expansion of p(1/t) vanishes. Also, the function ¢~*9(1/1)
must have, at most, a pole of the second order at ¢ = 0; this happens when the
12 Singular Points at Infinity 293

first two coefficients in the power series for q(1/é) vanish. This gives the follow-
ing theorem.

THEOREM 9. The point at infinity is a regular singular point for the second-order
linear DE (5) if and only if the coefficients p and q have power series expansions
convergent for sufficiently large |z|, of the form

ple) = PL 4 Bey
(38) ’ q) = f++-

That is, it is necessary and sufficient that the function p have a zero of at least
the first order and the function g have a zero of at least the second order at
infinity. In particular, the solutions of the DE are holomorphic at z = ©, or
0, if and only if the coefficients

2 1 1 1
J} ana (
| ( HK
=

t
2
_-

t
_—

tt
K _

are regular at ¢ 0. Hence, the following corollary

COROLLARY. [If the coefficients p(z) and q(z) of the DE (5) are holomorphic for
sufficiently large z, then all solutions of (5) have removable singularities at z
=

oo if
and only if p,
= 2 and q2 = 93
= 0 in (38).

It follows from Theorem 7 that, if z 00 is a regular singular point, and if


=
=

the indicial equation of (37) at ¢ = 0 has roots e and 6 not differing by an inte-
ger, then the DE (5) has a basis of solutions of the form

(@)= 27(14% +324. ), vb=a,B


The indicial equation at infinity is defined, because of Theorem 9, to be the
following equation for v

(39) vy ~ 1+ (2-— pvt q=O0

Its roots are called the characteristic exponents at z = 00. If they differ by an
integer, then there is still a solution of the form (1/z’)(1 + (a)/z) + ), but
every second linearly independent solution may contain a logarithmic term

Example 5. The hypergeometric DE (18) has, by Theorem 9, a regular sin-


gular point at infinity with characteristic exponents a and 8. In order to derive
a canonical basis at infinity, it is convenient to make the substitution
u(t) w(1/t). This transforms the DE into another hypergeometric DE

“1 — du” + [yo (a + Bo + ilu _ AoBou = 0


294 CHAPTER 9 Regular Singular Points

with ag = a, Bg =a —y +1, y2 = a — B + 1. It follows that the hypergeo-


metric DE has the solution

w(z) = 2 "F(a,a —y+1,a—68 + 1; 1/2)

convergent when |z| > 1. From the symmetry between @ and 8, we obtain a
second solution

we(z) = zB,6B —y + 1,8 —a+1;1/2)


The functions w; and ws form a canonical basis at infinity, unless a = 6

*13 FUCHSIAN EQUATIONS

A homogeneous linear DE with single-valued analytic coefficients is called a


Fuchsian DE when it has, at worst, regular singular points in the extended com-
plex plane, including the point at infinity. Since functions whose only singular
points are poles necessarily are rational functions,t it follows that the coeffi-
cients of any Fuchsian DE are rational functions. The most general first-order
Fuchsian DE has the form (see Ex. F8)

wv+{> A,
ju =o
k=] * Zk

The general solution of this DE is the elementary function

wz) =cIl (z — z)74


k=]

Second-order Fuchsian DEs offer much more variety; they are classified
according to the number of their singular points. When the number of these is
small, their study is greatly simplified by making linear fractional transformations}
of the independent variable, of the form

—(zt4) ad # bc
(a ta)’

Any such transformation can be obtained by successive changes of variable of


the forms ¢ = z + k, § = az, and ¢ = 1/z. Each such change of variable shifts
the position of the singular points of a DE, carrying branch poles of solutions
into branch poles. Therefore, by Theorems 6, 7, and 8, a general linear frac-

+ Hille, p. 217, Theorem 8.5.1.

t See Hille, pp. 46-50, or Ahlfors, pp. 76-89.


13 Fuchsian Equations 295

tional transformation transforms regular singular points into regular singular


points, and the indicial equations of the transformed DE coincide with those of
the original DE at corresponding points.
We first consider second-order Fuchsian DEs having at most two singular
points, say at z z, and z zg. By a linear fractional transformation of the
= =
= =

form ¢ = (z — z,)/(@ — 29), we can send these singular points to zero and infin-
ity. 1t follows from the definition of a regular singular point and from Theorem
9 that the Laurent series of p(z) and g(z) reduce to p,/z and q/z”, respectively.
Hence, the most general Fuchsian DE of the second order with two regular sin-
gular points is equivalent to the Euler DE of Example 2,

w +ew+ By =0

after a linear fractional transformation.


The simplest Fuchsian DE of the second order whose solutions do not reduce
to elementary functions is, therefore, one having three regular singular points.
By a linear fractional transformation of the independent variable, we may put
these singular points at 0, 1, 00. From the definition of a regular singular point
and from Theorem 9 of §12, we can determine the coefficient-functions of a
second-order Fuchsian DE with three regular singular points at 0, 1, 00, as fol-
lows. The coefficient p(z) must have, at worst, poles of the first order at z —
=

0
and z = 1. It can therefore be written in the form

pl)
etna tne

=

where the function /;(z) is regular throughout the plane. However, by Theorem
9, the function zp(z) has a finite limit as |z| tends to infinity. Since the function
z[A,/z) + (B,/(@ — 1))] is bounded as |z| tends to infinity, it follows that zp, (z)
is uniformly bounded. By Liouville’s Theorem} it must, therefore, vanish
identically.
Similarly, the coefficient q(z) has at worst poles of the second order at z =
=

0
and z = 1, and can therefore be written in the form

A, As __ By_
+ +
3
qz) = at + qi(z)
Zz Zz «-1% z<-1

where the function q¢,(z) is holomorphic in the finite complex plane. By Theorem
9, the function z°q(z) remains bounded as |z| tends to infinity; hence, so does
the function

B (As + Bs)z — As + qi(z)z(z — 1)

| As
Zz
+
_—
5 + ata} = 2 zz —- 1)

t Hille, p. 204, Theorem 8.2.2.


296 CHAPTER 9 Regular Singular Points

Therefore, Az; = —Bs and, again by Liouville’s Theorem, the function q,(z) van-
ishes identically. This completes the proof of the following theorem.

THEOREM 10. Any second-order Fuchsian DE with three regular singular points
can be transformed by a linear fractional transformation into the form

(40) w++
A,
z
=
z-]
__Bo As
@-1? 2z—-1
leno
where the A, and B, are constants.

The differential equation (40) is called the Riemann DE; it evidently depends
on five parameters.
With the Riemann DE are associated three pairs of characteristic exponents
Ai, Ag), (Hi, Ha), (1, ¥9), belonging to the singular points 0, 1, 0, respectively.
These exponents are the roots of the indicial equations [cf. (16) and (39)]

He — 1) + Bye + By 0
=
=

vy? + (1 ~ Ay — Bw + Ag + By — Az = 0

By means of these equations, we can express the parameters in the DE (40) in


terms of the (characteristic) exponents:

A, 1—~—dy —Ag Ag ArAg

By 1—
m4 — be By
=_

My He

A, + B, =

yp ttl Ay + By — Ag =

VV

From the identities in the first column, we obtain the Riemann identity

(41) Ay + Ag + ey + Hg $y + vg = 1

Substituting into (40) we find the Riemann DE

w" +
1—dr, —~dg 1 ~ 4 — be /

(42) z z-l1

+
AiAg +
be VYg — Ajrg — Hills
0
=
=

2
z (2 — 1)? zz — 1)

The preceding discussion shows that the Riemann DE (40) is completely deter-
mined by the values of the exponents and the location of the singular points.

THEOREM 11. A Fuchsian DE of the second order with three regular singular
points in the extended complex plane is uniquely determined by prescribing the two expo-
nents at each singular point. The exponents satisfy Riemann’s identity (41).
13 Fuchsian Equations 297

The hypergeometric DE of §6 is a special case of the Riemann DE with three


singular points at 0, 1, 00. As shown in §6 and in §12, the hypergeometric DE
has three regular points at 0, 1, 00 with exponents 0, 1 — y; 0, y — a — B; a,
6 respectively.
From Theorems 6-8 and from the fact that the Riemann DE is the unique DE
satisfying the conditions of Theorem 11, several identities for the solutions can
be derived. 1f we make the change of dependent variable v(z) = zw(z), the func-
tion u(z) has a branch pole at each of the singular points 0, 1, 00. Therefore (cf.
Theorem 9, Corollary), u(z) satisfies a DE with three regular singular points at
0, 1, co. By Theorem 11, this must be the Riemann DE (42). The exponents of
this DE are unchanged at z = 1, whereas they are changed to a, + ) at z =
=

0
and to y, — A at infinity. A similar result holds for the more general change of
dependent variable

v(z) = Zz ~— 1)w(z)

Using these identities, we can prove the following fundamental

THEOREM 12. Every Riemann DE (40) can be reduced to the hypergeometric DE


(18) by a change of dependent variable of the form w = x\(1 — 2)"v(2).

COROLLARY. Every second-order Fuchsian DE with three regular singular


pendent
points can be reduced to the hypergeometric DE by changes of independent and de;
variable.

Proof. The general solution w(z) of the Riemann DE can be written in the
form

wz) = 2\(1 — 2Mtv(z)

where v is the general solution of a Riemann DE with exponents 0, Ay — Aj;


0, Mg My3 ¥) + Ay + oy, Vo + Ay + py. Thus, the function v is a solution
of a hypergeometric DE with a y +r + wm, 8B = % +A + My, and
=
=

Y= 1— do + ALq.e.d.

EXERCISES F

1. Show that the only second-order linear DE that has just two regular singular points,
at 0 and 0, is the Euler DE w” + (fp/z)w’
+ (qo/z*)w = 0.
2 Show that no analytic linear DE (5) can have only removable singularities, if the
point z = 09 is included.

Prove in detail that any linear fractional transformation carries regular singular
points into regular singular points.

lf p and gq are constant in (5), is the singular point at oO regular? Justify your
statement.

Show that, unless B = A?/4, the DE (x? + Az + B)w” + (Cz + D)w’ + Ew = 0


can be reduced to the hypergeometric equation by a linear substitution z =_
=

ag
+ 5b.
298 CHAPTER 9 Regular Singular Points

Show that the Besse] DE has an irregular singular point at z = ©.

Find necessary and sufficient conditions on p(z) for w’ + p(z)w 0 to have (a) a
=
=

removable singularity, and (b) a regular singular point at 00.


Show that the most general first-order linear DE with n + 1 distinct regular singular
points at z,, » Z, and © is

v+| > Uk
Jeno
k=l @ — 4)

Integrate this DE explicitly.

*9 Find the most general second-order linear DE (5) having regular singular points at
a, , 4, and 00,

*10. Find the most general linear DE having regular singular points at 0, 00 and no other
singular points. Show that any such DE can be integrated in terms of elementary
functions.

ADDITIONAL EXERCISES

1. Show that fo’? d0(1 — k? sin? 6)'/2 = (@/2)F@, 3, 1; BY).


2. Show that the substitution z = ¢” (m a nonzero integer) transforms DEs (5) having
a regular singular point at z = 0 into DEs having a regular singular point at ¢ = 0,
with the characteristic exponents multiplied by m.
Show that the DE w” + [(1 — 2)/4z7]w =
=
0 has a basis of solutions of the form
wz) = 27[1 + 27/16 + 24/1024 + ++ J] and w(z) = w, log z — 27/16
+ .

Find an entire function f(z) and constant ¢ for which the functions

dz
w, = f(2)'?exp + { f
ea V20l — 2] |
are a basis of solutions of z(1 — z)w” + (1 — 2z)w’/2 + (az + bw = 0.

The algebraic form of the Mathieu equation is

4E(1 — Ejuge + 211 ~— 26uz + (A — 16k + 32KE)u = 0

Show that this has a regular singular point at £ = 0, calculate the exponents, and
find a recurrence relation on the coefficients of the power series solutions.

*6 (a) If P and Q are given polynomials without common factors and if ¢,41/¢, = P(n)/
Q(n) and © ¢,z” is convergent, show that the function © ¢,z” satisfies the DE
2P(zd/dz) w — Q(zd/dzjw = 0.
(b) Find all quadratic polynomials P and Q for which the preceding DE has regular
singular points only, and express the solutions in terms of hypergeometric
functions.

*7 (a) Find the eigenvalues of the circuit matrix of (18) for z = 0.


(b) Using the change of variable = 1 — z, solve the same problem for the circuit
matrix for z 1
=
=

Show that the function In (In z) satisfies no linear DE of finite order with holo-
morphic coefficients.
13 Fuchsian Equations 299

9. Show that, if (0) = 0 but (0) ¥ 0, the substitution z = f({) carries a regular
singular point of (5) at z = 0 into one at ¢ = 0 having the same indicial equation.

10. Show that, for any nontrivial solution of the Euler DE Zw" + ww +w =
=
0 and
any integer n, there exists a spiral path @ = A(r) approaching the origin, along which
lim...) [z"w| = 00.

*11, Let the DE w” + p,(z)w’ + po(z)w = 0 have an isolated singular point at z =


=
ce.

Show that this singular point is regular if and only if, for some n > 0, every solution
satisfies lim, z-*w(re"*) = 0for0 <6 = 2a.
CHAPTER 10

STURM-LIOUVILLE
SYSTEMS

1 STURM-LIOUVILLE SYSTEMS

A Sturm-Liouville equation is a second-order homogeneous linear DE of the


form

(1) < Ee “| + Dols) — gid]u = 0


Here ) is a parameter, while p, p, and q are real-valued functions of x; the func-
tions p and p are positive. In operational notation, with L = D[p(x)D] — q(x),
we can write (1) in the abbreviated form

(1’) L{u] + Ap(x)u = 0

Such a DE (1) is self-adjoint for real \; to ensure the existence of solutions,


the functions g and p are assumed to be continuous and p to be continuously
differentiable (of class @'). For a given value of A, (1) defines a linear operator
transforming any function u € @? into L[u] + Apu. The Sturm-Liouville equa-
tion (1) is called regular ina closed finite interval a = x < b when the functions
p(x) and p(x) are positive for a = x < b. The functions p, g, and p, being con-
tinuous, are bounded in the interval.
For each ), it follows from the existence theorem of Ch. 6, §8, that a regular
Sturm-Liouville equation for a <= x = b hasa basis of two linearly independent
solutions of class @?.
A Sturm-Liouville system (or S-L system) is a Sturm-Liouville equation together
with endpoint (or boundary) conditzons to be satisfied by the solutions, for exam-
ple u(a) = u(b) = 0. One type of endpoint condition we shall study is the
following.

DEFINITION. A regular S-L system is a regular S-L equation (1) onafinite


closed interval a = x < 6, together with two separated endpoint conditions, of
the form-

(2) au(a) + o’u(a) = 0, Bu(b) + B’u'(b) = 0


300
1 Sturm-Liouville Systems 301

Here a, a’, 8, 8’ are given real numbers. We exclude the two trivial conditions
a=o'
= 0andé = @’ = 0.

A nontrivial solution of an S-L system is called an eigenfunction, and the cor-


responding) is called its eigenvalue. Each eigenfunction also is said to belong to
its eigenvalue. The set of all eigenvalues of a regular S-L system is called the
spectrum of the system.

Example 1. The system consisting of the DE u” + du =


=
0 in the interval
0 = x = «, with the boundary conditions u(0) = 0, u(r) =
=
0, has the eigen-
functions u,,(x)
=
=

sin nx and the eigenvalues A, = n’,n = 1,2,3,....

Example 2. For fixed n, the Bessel equation

2
d
du Ju =o, asr=xb
(3)
| I+(
dr “dr
Rr — —

in an S-L equation with p = p = 1, = k’, and q = n?/r. When 0 <a <4,a


regular S-L system is obtained by imposing the endpoint conditions u(a) u(b)
=
=

= 0, or by imposing any other separated endpoint conditions of the form (2).


With a = 0, the DE (3) does not define a regular S-L system, because the
coefficient p(r) vanishes at r = 0. We then obtain a singular S-L system, which
is treated in §4.
For fixed k and variable n, the Bessel equation (3) defines a different S-L
equation, because the parameter is different.

Periodic Endpoint Conditions. For S-L equations whose coefficients are


periodic functions of x with period b — a, the periodic endpoint conditions

(4) u(a) = u(b), u’(a) = u’(b)

are sometimes imposed and give another type of S-L system, a periodic S-L
system.

Example 3. The system consisting of the DE u” + Au = 0, for -—xw = x =


am, with the periodic endpoint conditions u(—a) = u(r) and u’(—7) = u’(x), has
the eigenfunctions 1, cos nx, sin nx, where n is any positive integer. The corre-
sponding eigenvalues are the squares of integers; if n > 0, there are two linearly
independent eigenfunctions having the same eigenvalue n’.

Example 4. A regular S-L system is obtained from the Mathieu equation

(5) u” + A + 16d cos 2x)u 0, Oxxx<r


=
=

by imposing separated endpoint conditions. In this example, p = p = 1, and


q(x) = —16d cos 2x. Note that, since cos 2x is periodic with period 7, any solu-
302 CHAPTER 10 Sturm-Liouville Systems

tion of (5) that satisfies the endpoint conditions

(5’) u(0) = u(x) and u/(0) = u’(m)

will also be periodic with period 7, while any solution satisfying

(5”) u(0) = —u(x) and u’(0) = —u'(x)

will be periodic with period 27. Moreover, since cos 2x = cos (—2x) is an even
function, any solution of (5) is the sum u(x) = $[@(x) + ¥(x)] of an even solution
@(x) = u(x) + u(—x) and an odd solution ¥(x). The Mathieu functions are suitably
normalized even and odd solutions of (5), of periods « or 27 [i.e., satisfying
(5’) or (5”)).

2 STURM-LIOUVILLE SERIES

Examples 1 and 3 define two S-L systems from the same S-L equation,
u” + du 0, but with different endpoint conditions. The eigenfunctions of
=

Example 3 are the functions used in the theory of Fourier series, studied in the
advanced calculus. It is shown there that these functions are orthogonal on
the interval —x < x < x. This means that the following relations hold.+

wT x

J sin mx sin nx dx
J cos mx cos nx dx = 0, ifm #n
_

=
Ww =
t

J =
WT
sin mx cos nx dx =

0, for all integers m, n

The eigenfunctions sin nx of Example 1 are also orthogonal on the interval


0 = x = on which the S-L system in question is defined:

f sin mxsin nx dx = 0, ifm #n

We will now show that analogous orthogonality relations hold for the eigen-
functions of regular S-L systems generally, and for the eigenfunctions of S-L
systems with periodic endpoint conditions.

DEFINITION. ‘Two integrable real-valued functions f and g are orthogonal


with weight function p > 0 on an interval J if and only if

(6) f p(x)f(x)g(x) dx = 0
I

+ Courant and John, pp. 274, 583; Widder, p. 395.


2 Sturm-Liouville Series 303

The interval J may be finite and open or closed at either end; or it may be semi-
infinite or infinite.

THEOREM 1. Eigenfunctions of a regular S-L system (1)—(2) having different


eigenvalues are orthogonal with weightfunction p. Thus, let u and v be eigenfunctions
belonging to distinct eigenvalues h and p. Then

(7) f p(x)u(x)v(x)dx = 0
Proof. We use the operator notation L[u] = [p(x)u’)’ —q(x)u. Then u and v
are eigenfunctions of (1) with eigenvalues ) and yp if and only if

L{u] + Ap(x)u = L{v) + pp(x)v = 0

We next establish the following lemma.

LEMMA. [fu andv satisfy an S-L equation (1) on a closed interval a = x = b,


for values d and yu of the parameter, then

x=b

(8) A — #) f p(x)u(x)u(x) dx = p(x)[u(x)v'(x) — v(x)u’(x)] x™a

To prove (8), we apply the hypothesis to the Lagrange identity of Ch. 2, §8,
namely the identity

ubfo) — vb[u] = + (pla)lu(adu’(@) — v(2)u/()


which is easily verified directly. Integrating this identity between the endpoints
x = aand x = band substituting —ypv for L[v] and —Apu for L[u], we get (8)
as claimed. The right-hand side of (8) is called its boundary term.
To prove Theorem 1, it suffices to show that the boundary term vanishes in
the case of.separated endpoint conditions. But (2) implies that

pla)[ula)o’(a) — v(a)u’(a)] = [exp(a)/e’][u(a)v(a) — v(a@)u(a)] = 0

if a’ # 0. Ifa # 0, the right side of (8) reduces similarly at x = a to

E2a [u’(a)v’(a)—v’(a)u’(a)] = 0
Hence p(a)[u(a)v'(a) — v{a)u’(a)] = 0 unless @ = a’ = 0. Similar formulas cover
the boundary term at x = b. Since the possibilities a =
=
a’ = 0and 8 = p’
=
=

are excluded, this shows that the right side of (8) vanishes. Formula (7) now
follows from identity (8), after dividing through by the nonzero factor (A — y).
304 CHAPTER 10 Sturm-Liouville Systems

From the lemma we also obtain the following corollary.

COROLLARY. The result of Theorem 1 holds also for S-L systems with periodic
endpoint conditions.

For, in this case, the contributions to the boundary term on the right side of
(8) from x aand x = bare equal in magnitude and opposite in sign; hence
=
=

they cancel.
It is shown in the calculus that any reasonably smooth periodic function f(x)
can be expanded into a Fourier series

Six) = a + ~ (a, cos kx + 0, sin kx)


kel
>

that is, expressed as an infinite linear combination of the eigenfunctions of the


Sturm-Liouville system of Example 3. Moreover, the orthogonality of these
eigenfunctions makes it easy to calculate the a, and b, [see Ch. 11, (1)].
The orthogonality relations just proved enable one to obtain similar expan-
sions for general f(x) in the eigenfunctions of other Sturm-Liouville systems; the
resulting infinite series are called Sturm-Liouville series. The most important
property of Sturm-Liouville systems is that, in general, this series converges to
Fi).
This will be proved for regular Sturm-Liouville systems in Ch. 11. In the pres-
ent chapter, our main objective is to prove that the eigenfunctions of any reg-
ular S-L system behave like the eigenfunctions of u” + Au = 0 for the same
endpoint conditions, in a sense to be made precise.

EXERCISES A

1. (a) Show that every solution of the Airy DE v” + xv 0 vanishes infinitely often
=
=

on the positive x-axis and at most once on the negative x-axis.


(b) Show that, if v(x) satisfies the Airy equation, u(x) = v(kx) satisfies u” + xu =
0
(c) Show that the S-L system defined for the DE u” + Axu = 0 by the endpoint
conditions u(0) = u(1) = 0 has an infinite sequence of positive eigenvalues and
no negative eigenvalue. [HinT: See Ch. 2, §4.]
For the S-L system defined by u” + Au 0 and the endpoint conditions u(0) =
=
=

u(x) + u(x) = 0, show that there is an infinite sequence of eigenfunctions with


distinct eigenvalues. What are its eigenvalues?

Show that for u” + Aw = 0 and the endpoint conditions


(a) u(0) = u(r) = 0 (c) w’(0) = u(x) = 0
(b) u(0) = w(x) = 0 (d) w’(0) = w(x) = 0

the eigenvalues are (k + 12,4 +1 /2)%, (zk + 1/2)%, and Rk, respectively. What are
the eigenfunctions?

4. (a) Show that u = U(kr) satisfies (3) if and only if U(x), (x = kr), satisfies the Bessel
DE U" + (1/x)U' + [1 — (n2/x2)]U = 0.
3 Physical Interpretations 305

(b) Show that if U(x) and V(x) satisfy the Bessel DE and if

U(ka) = U(kb) = V(kya) = V(kyb) = 0, hth,

then f U(kr)V(k,)r dr = 0.
(a) Show that any two Mathieu functions having distinct eigenfunctions are
orthogonal, in the sense that

f “ u(x)v(x) dx =
=

(b) Show that the even Mathieu functions are the eigenfunctions of the regular S-L
system defined by (5) and

u(0) = w(x) = 0

(c) Characterize the odd Mathieu functions similarly.

Determine the eigenvalues d, such that u” + Au = 0 admits a nontrivial eigenfunc-


tion satisfying f(0) = f’(0) = fr) = f’Gr) = 0.

Show that, if f(x) and f2(x) are eigenfunctions of Ex. 6 having distinct eigenvalues,
A, # do, then [5 fi(falx) dx = 0.
Show that the substitution £ = cos’ x transforms the Mathieu equation into

4&1 — Bjuge + (2 — 48)ug + A — 16d + 32dE)u =


=

*9 Consider the boundary value problem defined by the first-order DE

wu’ + [A + q(x)]u = 0, q€C, qreal

and one nontrivial side condition B[u] au(a) + a’u'(a) + Bulb) + B’u'(b) = 0.
=
=

Show that this problem admits at most three real eigenvalues.

*3 PHYSICAL INTERPRETATIONS

Sturm-Liouville systems arise typically from vibration problems in continuum


mechanics. In physical language, they describe boundary-value problems cor-
responding to simply harmonic standing waves. It is commonly assumed in phys-
ics that any wave motion can be resolved into simply harmonic standing waves, each
of which periodically oscillates with its proper frequency.
Though physicists commonly assume this result on the basis of experimental
evidence and intuition, it can actually be deduced rigorously from the mathe-
matical theory of wave motion as a boundary value problem in differential equa-
tions, as will be shown below.
We illustrate the physical interpretation of eigenfunctions of Sturm-Liouville
systems by three classic examples.
306 CHAPTER 10 Sturm-Liouville Systems

The partial DE of a vibrating string ist

2 =
T
Mt = CM
ex where c =

Here y is the lateral displacement from equilibrium; T is the tension and p the
density of the string, both assumed constant. Simply harmonic standing waves
are defined by the separation of variables

y(x, t) = u(x) cos k(t — ft)

For (x, #) to satisfy the vibrating string equation y, = c’y,,, it is necessary and
sufficient that u” + Au = 0, = k®/c?, where k depends on the endpoint
condition.
For the vibrating string, it is natural physically to have fixed endpoints, so that
y(a, t) = 9(b, t) = 0. This makes u(a) = u(b) = 0 and leads to the S-L problem
of Example 1. The eigenvalue belonging to each eigenfunction is proportional
to the squared frequency k®/4x*. This relation, combined with the analogy
between mechanical and electromagnetic waves, has led mathematicians to call
the set of eigenvalues the spectrum of an S-L system.
Another physical interpretation of S-L systems is furnished by the longitudi-
nal vibrations of an elastic bar of local stiffness p(x) and density p(x). The mean
longitudinal displacement v(x, f) of the section of sucha bar from its equilibrium
position x satisfies the wave equation

2.
0
p(x
—_=_—=«—_>

ot?

0
| pe) |
The simple harmonic vibrations (the normal modes of vibration) given by the sep-
aration of variables

v = u(x) cos k(t — ty)

are the solutions of the Sturm-Liouville equation

d
dx
|p(x)“| + h'p(x)u =
=
0

This is the special case g = 0 of (1), withA = ke

+ Widder, pp. 413-421. In this section, subscript letters denote differentiation with respect to the
variable indicated.
3 Physical Interpretations 307

For a finite bar, extending over the interval a =


_ x <= 6, various physical
boundary conditions arise naturally:

u(a) = u(b) = 0 (rigidly fixed ends)

u'(a) = u’(b) = 0 (free ends)

u'(a) + au(a) = u’(b) + Bud) = 0 (elastically held ends)

u(a) = u(d), u’(a) = u'(b) (periodic constraints)

Each of these endpoint conditions on u implies a corresponding condition on


v(x, #), with d/dx replaced by 0/dx. The natural frequencies of longitudinal
vibration (musical fundamental tone and overtones) of a bar whose ends are
held in each of the ways described are thus the solutions of the S-L systems
defined by (1) and the appropriate conditions above. Finally, the partial DE of
a vibrating membrane is

wy = C(W,y + Wy) = Cw, + lw, + 12woe)

where 7, 9 denote polar coordinates. A basis of standing wave solutions can be


found by trying the separation of variables

w(r,0,t) = uo‘i \nocosk(t — to)


For w to satisfy the membrane equation with x = ck, it is necessary and sufficient
that u be a solution of the Bessel equation (3). The singularity at r =
=
0 of the
Bessel equation is associated with the singularity in polar coordinates at the
origin. .

If the membrane is a circular disc of radius a (vibrating drumhead), the phys-


ically natural boundary conditons are u(a) = 0 and u(0) nonsingular. The latter
condition characterizes the Bessel functions among other solutions of the Bessel
equation, up to a constant normalizing factor.

EXERCISES B

1. Show that, if U,(x) satisfies the Bessel equation of order n (Ex. A4), then ¢ = U,(kr)
sin nf? and U,(kr) cos n# satisfy the Helmholtz equation V7@ + k®@ = 0 for polar
coordinates in the plane.

The partial DE of a vibrating membrane is V?U + #2U = 0. Using Ex. 1, show that
this equation has solutions satisfying U(x, y) = 0 on x® + y? = 1, for all numbers k,,,
such that J,(Rmn) = 0.

3. A string of density she + ahs cos 2x grams/cm is stretched taut between pegs at x =
—a/2 and x = 7/2, under a tension of 2 kg. Determine its natural frequencies, in
cycles per second.

4. For the Bessel DE (xu’)’ + Axu = 0, with the endpoint conditions that u(1) = 0 and
508 CHAPTER 10 Sturm-Liouville Systems

u is bounded on 0 < x = 1, show that the first five eigenvalues are approximately
A, = 5.78, Ay = 30.5, Ay = 74.9, Ay = 139, and A, = 223. [HinT: Consult a table
of zeros of Jo(x).]

A vibrating reed, with one clamped end and one free end, executes simply harmonic
vibrations with transverse displacement y(x, t) = u(x) cos kt if and only if u® =
=
ku,
u(0) = w(0) = 0, and w’() = u” (2) = 0. Find the characteristic frequencies 27/k.

(Hint: Consider the ultraspherical polynomials of Ch. 9, §11.]


Show that the general solution of the Airy DE is x'/?U,3(2x°/°/3), where U,,, is the
general solution of the Bessel DE of order one-third.

Show that the function J,,,(e* Vb/c) satisfies the DE u” + bce** — d®)u = 0.
Show that the function x/,,,(e/* Vb/c) satisfies the DE

wt x74 (be**/*
a) =0

*9 Show that the general solution of the DE

v” + (ab?x?-? + [(1 — 4n?a)/4x"))v = 0

is the function v(x) = VxU(bx’), whereUis the general solution of the Bessel DE of
order n.

4 SINGULAR SYSTEMS
5

An S-L equation (1) can be given ona finite, semi-infinite, or infinite interval
I. In the finite case, J may include neither, one, or both end points. The exclu-
sion of an endpoint a may be necessary when lim,,., p(x) = 0, lim,_., p(x) = 0,
or when any one of the functions , q, p is singular at a.
Only when/ is a closed, finite interval a < x = 6 can an S-L equation be
associated with a regular S-L system. If J is semi-infinite or infinite, or if J is finite
and p or p vanishes at one or both endpoints, or if q is discontinuous, we cannot
obtain from (1) a regular S-L system. In any such case, the given S-L equation
(1) is called singular.
We can obtain singular S-L systems from singular S-L equations by imposing
suitable homogeneous linear endpoint conditions. These conditions cannot
always be described by formulas like (2). For example, the condition that u be
bounded near a singular endpoint is a common boundary condition defining a
singular S-L system.

Example 5. The Legendre DE

(9) [1 ~ xu)’ + Au =
=

0, -Il<x<]l

together with the condition that a solution u be bounded in the interval, is an


example of a singular S-L system. As shown in Ch. 4, §2, the Legendre polynomials
4 Singular Systems 309

P,(x) are real eigenfunctions of this S-L system belonging to the eigenvalues
An n(n + 1).
=
=

Example 6. For fixed n, the Bessel equation of Example 2,

2
d du
Ju =o, Q@<rxa
_—

dr | |+(
rr
d
k?y - —

is a singular S-L equation with p = p


=
=
r,\ = h®, and q = n?/r. A singular
S-L system is obtained for any a > 0 by imposing the endpoint conditions
u(a) = 0, and u(r) bounded as r —> 0.

The eigenfunctions of the preceding singular S-L systems are the Bessel func-
tions J,(k,r), where £,a is the jth zero of the Bessel function J,(x) of order n. It
has been shown in Ch. 2, §6, that J,(x) has infinitely many zeros; it follows that
the singular S-L system just defined has infinitely many eigenvalues.
The eigenfunctions of singular S-L systems are also orthogonal, provided that
they are square-integrable relative to the weight function p, in the following sense.

DEFINITION. A real-valued function / is square-integrable on the interval I


relative to a given weight function p(x) > 0 when

(10) J Fee dx < +00

When the weight function p is identically equal to 1, we say simply that the
function fis square-integrable on the interval I.

The Schwarz inequality holdst for square-integrable functions:

(11) (f LAx)ge(x)|ox)és)= f S?()p(x)dxf g°(x)p(x)dx


This inequality implies in particular that the product of two such square-inte-
grable functions is an integrable function relative to the weight function p, that
is, the integral in parentheses on the left-hand of (11) is finite.
The right side of the boundary term in (8) vanishes in the limit, for any end-
point conditions that imply that

B
(12) lim p(x[u(x)o'(x) — v(x)u/(x)] 0
x=

+ Birkhoff and MacLane, p. 201; see also Apostol, Vol. 2, p. 16.


310 CHAPTER 10 Sturm-Liouville Systems

The conditions p(a) = p(b) = 0 and u’(x) bounded on the interval [a, 6] imply
this property, for example.
When (12) holds, we obtain from (8) the identity

(A — #) f p(x)u(x)v(x)dx = 0
for any two square-integrable eigenfunctions u and v with eigenvalues ) and y.
The integral here may be an improper integral. If \ # y, this implies, as in the
proof of the lemma of §2, that u and v are orthogonal. This proves the next
theorem.

THEOREM 2. Square-integrable eigenfunctions u and v belonging to different


eigenvalues of a singular S-L system are orthogonal with weight p whenever the bound-
ary term vanishes, as in (12).

Applying this result to Example 5, we obtain the orthogonality relation for


the Legendre polynomials
-

(13) f P,,(x)P,,(x)dx = 0, m*znN

after verifying that the boundary term vanishes. Applying it to the Bessel equa-
tion, we obtain the orthogonality relations for Bessel functions

(14)
J 0
xJn(k,x)Jn(k,x) dx = 0, kh, # k,

if J,(k,a) = Jy(k,a) = 0.

Example 7. The Hermite DE is

(15) u” — 2xu’ + rAu = 0, —CO


<i x <. +00

as in Ch. 4, §2. Using the recursion formula

(Qk — dja,
(16) k=0,1,2,...
mee + DEF 2D)’

of Ch. 4, (99, we obtain a polynomial solution of degree n for } = 2n. These


polynomials are commonly normalized by the condition that a, = 2” (and
a,-; = 0); this defines the Hermite polynomials H,(x). For example, Ho(x) = 1,
Ay(x) = 2x, Hox) = 4x ~— 2, etc. Evidently, H,,(x) is an even function for even
n, and an odd function for odd n.
The Hermite DE is not an S-L equation, because it is not self-adjoint. Making
4 Singular Systems 311

the substitution y = e~*/*u in (15), we obtain the following equivalent self-


adjoint S-L equation for the Hermite functions (x):

(17) yw +r (x? — 1)]y = 0, —CO


<I x <I CO

among whose solutions for \ = 2n are the functions ee ?H,,(x); these functions
are square-integrable and tend to zero as x > +00.
That is, the functions ¢,,(x) = et ?H,(x) are eigenfunctions for the singular
S-L system defined by (17) and by the endpoint condition that a solution y(x)
must tend to zero as x > +00, We shall now derive the orthogonality relations
for the Hermite polynomials

(18) f . H,,(x)H,(xe7™"dx =
=

0, men

For, substituting into the identity (8), we obtain

2(m — n) H,,(x)H,(x)e~* dx =
=

[Pm(x) bux) — Onlx)Pn(x)


xwa
x™—-a
~a

Since the boundary terms are e-* times a polynomial in x, and


—Xe

(19) lim x"e =0


x7+00

for all n, the boundary term vanishes in the limit, as in (12)

EXERCISES C

1. (a) Prove the orthogonality relations for the Bessel functions

(*) f xJilox),(Bx)dx=O0 if Jv@L)=J,/@L)=0, oo # 6


for any nonnegative integer n.
(b) Prove the equality (*) if a/,/(aL)/,(BL) = B],(BL)JAaL).

Show that the Legendre polynomials (and their constant multiples) are the only solu-
tions of the Legendre DE that are bounded on (—1, 1).

Show that the S-L system [(x — a)(6 — x)u’!’ + Au = 0, a < b, with u(x) bounded on
a<x <5, has the eigenvalues \ = 4n(n + 1)(6 — a)*. Describe the eigenfunctions.
(a) (Laguerre polynomials). Consider the singular S-L system

(xe™*u’y’ + Ae *u = 0 on O<x< +0

with endpoint conditions that u(0*) is bounded and that e*u(x) -> 0 as
x — +00, Show that the values A = n give polynomial eigenfunctions.
(b) Show that the preceding system has no other polynomial eigenfunctions. [HINT:
Obtain the power series expansion of the general solution of the DE.]
312 CHAPTER 10 Sturm-Liouville Systems

5. Show that the eigenvalues of the singular S-L system defined by

d gat “|
d. ja x?) = 0,
+ A(1 — x*)*u a>-—l

and the condition of being bounded on (—1, 1), are A, = n(n + 2a)

5 PRUFER SUBSTITUTION

We now develop a powerful method for the study of the solutions ofa self-
adjoint second-order linear DE

(20)
d
d.
|Pe a + O(x)u : a<x<b
where P(x) > 0 is of class @’ and Q is continuous. One may want to find out
how often the solutions of (20) oscillate on the interval under consideration, that
is, the number of zeros they have for a < x < 6. This can be done by using the
Poincaré phase plane, already introduced in Ch. 2, §7. Modifying slightly the
formulas used there, we make in (20) the Priifer substitution

(21) P(x) u’(x) = 1(x) cos 6(x) u(x)


=
=
r(x) sin 6(x)

The new dependent variables r and @ are defined by the formulas

(214 r=wt Pu”, 6 = arctan (u/Pu’)

ris called the amplitude and @ the phase variable. When r # 0, the correspon-
dences (Pu’, u) = (r, 0) defined by (21) are analytic with nonvanishing Jacobian
For nontrivial solutions, r is always positive because, if u(x) ”(x) 0 for
a given x, by the Uniqueness Theorem of Ch. 2, §4, u would be the trivial solu-
tion u=0
We now derive an equivalent system of DEs for r(x) and 6(x). Differentiating
the relationt cot @ = Pu’/u, we get

Pu?
d@ (Puy
c? 6 = — =F = —Qlx) — 5 cot”
dx u ue

If we multiply through by —sin? 0, this expression becomes

d
(22) — = Q(x) sin 294 costa = F(x,6)
P(x)

+ When @ = 0 (mod x), the relation is not defined. But the final equations (22)—(23) can still be
derived by differentiating the relation tan? = u/Pu’.
6 The Sturm Comparison Theorem 313

Differentiating r? = (Pu’)? + u? and simplifying, we obtain

(23)
=
dr
~_

dx |a
1

P(x)
Q(x)| rsin
60 cos @ = —
2 |
ee
1

P(x)
Q(x)| r sin20
The system (22)-(23) is equivalent to the DE (20) in the sense that every non-
trivial solution of the system defines a unique solution of the DE by the Priifer
substitution (21), and conversely. This system is called the Priifer system associ-
ated with the self-adjoint DE (20).
The DE (22) of the Priifer system is a first-order DE in 8, x alone, not con-
taining the other dependent variable 7, and it satisfies a Lipschitz condition with
Lipschitz constant

L= sup —

= sup |Q(x)| + sup


a<x<b 00 a<x<b a<x<b | P(x)|

The constant L is finite in any closed interval in which Q and P are continuous.
Hence, the existence and uniqueness theorems of Ch. 6 are applicable, and
show that the DE (22) has a unique solution (x) for any initial value #(a) = y,
provided P and Q are continuous at a.
With 0(x) known, 1(x) is given by (23) after a quadrature:

(23’) r=Kexp| 2 J la _aro| sin26a|


where K = 7(a). Each solution of the Priifer system (22)—-(23) depends on two
constants: the initial amplitude K = r(a) and the initial phase y = 6(a). Changing
the constant K just multiplies a solution u(x) by a constant factor; thus, the zeros
of any solution u of (20) can be located by studying only the DE (22).

6 THE STURM COMPARISON THEOREM

The zeros of any solution u(x) of the DE (20) occur where the phase function
6(x) in the Priifer substitution (21) assumes the values, 0, £2, +27, . , that
is, at all points x where sin 0(x) = 0. At each of these points cos? 6 = 1 and
d6/dx is positive, by (22) [recall that P(x) > 0]. Geometrically, this means that
the curve (P(x)u’(x), u(x)) in the (Pu’, u)-plane, corresponding to a solution u of
the DE, can cross the Pu’-axis @ = nx only counterclockwise.
Now compare the DE (22) with a DE of the same form, dé/dx Fy(x, 9),

=

having coefficients Q(x) = Q(x) and P,(x) S P(x):

1

=
=
Q,(x) sin? 6 + cos? 6 = F,(x, 6)
d. P(x)
314 CHAPTER 10 Sturm-Liouville Systems

If Q,(x) = Q(x) and P,(x) = P(x) in an interval J, then F,(x, 0) = F(x, 6) there.
By the Comparison Theorem of Ch. 1, §11, we conclude that, if @,(x) is a solu-
tion of the second DE whose initial value satisfies @,(a) = 0(a), and @(x) is a solu-
tion of (22), then 6,(x) = @(x) for a <= x = b. Furthermore, we have 6,(6) = 6(0)
only if 6(x) = 0,(x), which implies that u(x) = cu;(x), whence F(x, O(x)) =
F\(x, 8,(px)). This implies that Q(x) = Q,(x) since dO/dx = 1/P,(x) > 0, where
sin @ = 0; therefore, sin @ can vanish only at isolated points. It also implies that
P(x) = P\(x), except in intervals where cos 6 = 0, and so Q(x) = Q(x) = 0
(cf. Ch. 1, §12, Corollary 1). Therefore, if sin 6(a) = 0, the number of zeros of
sin 6,(x) for a < x < b is at least the number of zeros of sin 6(x), except when
P= P, and Q = Q,, when it is equal, and when Q = Q, = 0 in an interval,
when it may be equal. This completes the proof of the following theorem.

THEOREM 3 (STURM COMPARISON THEOREM). Let P(x) = P,(x) > 0 and


Q)(x) = Q(x) in the DEs

(Pe *) + Q(x)u= 0, du,


(24)

d.

d. (
P(x) me + Qy(x)uy = 0

Then, between any two zeros of a nontrivial solution u(x) of the first DE, there lies at
least one zero of every real solution of the second DE, except when u(x) = cu,(x). This
implies P = P, and Q= Q,, except possibly in intervals where Q = Q, = 0.

In the case of S-L equations, since p(x) > 0, Q(x) = Q,(x) evidently implies
that A = A).
Sturm’s Separation Theorem of Ch. 2, §6 follows as a corollary, by comparing
two linearly independent solutions of the same DE.
A short and easily remembered, if somewhat imprecise, summary is this: as Q
increases and P decreases, the number of zeros of every solution increases.

Maxima and Minima. For the self-adjoint DE (20), the inequality Q(x) > 0
implies that

d6/dx > 0 if 6=(n+4)r

For, in (22), cos 6 = 0 and |sin 6| = 1, if9 = (n + 3)m. Since cos 6 = 0if and
only if u’ = 0, it follows that, if Q(x) is positive, any nontrivial solution of (20)
has exactly one maximum or minimum between successive zeros.

7 STURM OSCILLATION THEOREM

We now consider the variation with \ in the number of zeros of the eigen-
functions of a regular S-L system (1) to (2). Setting P(x) = p(x) and Q(x) =
Ap(x) — g(x) in (1), we obtain (20). Since u 0 if and only if sin 6 = 0 in (21),
=
=
7 Sturm Oscillation Theorem 315

the zeros of any solution of (1) are the points where 6 = 0, ta, +2z, ’

+n, . , 9 being a solution of the associated Priifer equation

(25) _-

d.
=
Dro(x) — q(x)] sin 29 4 costa, azx=zdb
(x)

Here p(x) > 0, p(x) > Oforasx <b.


We now fix -y, and denote by 6(x, \) the solution of (25) that satisfies an initial
condition 6(a, \) = ¥ for all A, where ¥ is determined by the conditions}

/
u(a) a
(25’) tany = O=y<a7
playu'(a) playa.’
The constants a and oe’ come from the initial condition au(a) + a’u’(a) = 0. For
fixed y, the function (x, \) is defined on the domain a S x = b, -oOO<)\A<
oo; we shall consider its behavior there.
Applying the comparison theorem of Ch. 1, $11 (and especially Corollary 1
there) to (25), we obtain the following lemmas.

LEMMA 1. For fixed x > a, 0(x, d) is a strictly increasing function of the variable
r

LEMMA 2. Suppose that for some x, > a, 0(xX,, \) nn, where n = 0 is an


=

integer. Then 0 (x, d) > na for all x > x,.

Proof. If x, is any point where 6(x, A) = na, then by the DE (25), we have
d6(x,, \)/dx, = 1/p(x,) > 0. Thus, the function 0 = 6(x,, d), considered as a
function of x,,, is increasing where it crosses the line 6 = nz, as shown in Figure
10.1. Hence, @(x, A) stays above this line for x > x,, q.e.d.

Lemma 2, combined with the condition 0 <= y = @(a, \) < a, makes the first
zero of u(x) in the open interval a < x < b occur where 6 = =, and the nth
zero, where 0 = nz.
Our next aim is to show that, for fixed x > a, 0(x, 4) > C© asi = OO.
In view of Lemma 2, we will have shown that lim,_... 6(x, 4) = 00 for each x,
if we can show that for every integer n > 0, we can find a number x,(A) be the
smallest x such that 6(x, A) = nz. Then, all we need to show is that x; <x such
that 6(x,; d) =
=

nx for sufficiently large Xo. Stated in different terms, let x,(A)


exists for large \ and that lim)—.*,(A) = a. This is done in the following lemma.

LEMMA 3. For a given fixed positive integer n and sufficiently large \, the func-
tion x,(A) ts defined and continuous. It 1s a decreasing function of X, and lim--o
x,(A) = a.

+ We have assumed a * 0. When a@ = 0, set y = 4/2, tany = 0


316 CHAPTER 10 Sturm-Liouville Systems

6 5
=
Qu 65
| 5

32/2, OOOO
OO

2S OOOO
nn
acs
a

Figure 10.1 Direction field of

= Q(x) sin? 6 +Po cos’ 6 = F(x, 6)


dx )

Proof. By Theorem 3 of Ch. 6, the function (x, A) is a continuous function


of both variables x and A for a S x = band —@ <) < ©, We shall first prove
that, if the function x,,(A) is well-defined, that is, if 6(x, 4) = na for some x, then
x,(A) is a monotonic decreasing function of \. To prove this result, it suffices to
prove that @(x, A) is an increasing function of ). But this is the conclusion of
Lemma 1.
We now show that, for fixed n, the function x,(A) is well-defined for large
enough X. This amounts to saying that, for large enough 4, there is an x in the
inverval a <x < b for which @(x, A) = nw. We can translate this statement into
an equivalent statement for the solutions of the DE (1), using (21). It is equiv-
alent to saying that every nontrivial solution of (1) has at least m zeros in the
interval a < x < 5, since 6(x, A), being a continuous function of x, must take all
values between O{a, A) = y < m and nz.
Now, let qy and py be the maxima of q(x) and p(x), respectively, and let p,, be
the minimum of p(x) for a <= x = b. A solution of the DE

qm
(26) pure” + APn — Inu = 0, A> _—

Pm

is the function u,(x) = sin k(x — a), where k? = (Ap, — qu)/Pm. The successive
zeros of this function are spaced at a distance 7V py/(ApPm — qu) apart. By the
Sturm Comparison Theorem (Theorem 3 above), any nontrivial solution u(x) of
the Sturm-Liouville equation (1) must have at least one zero between any two
zeros of the function w,(x). Since u,(x) has zeros on (a, 6) whendis sufficiently
large, it follows that u(x) has at least n zeros and, therefore, that 6(x, ) takes the
value na for sufficiently large , as we wanted to show.
The number x,,(A) falls between the (n — 1)st and the nth zero of u(x), and
both these zeros tend to a as \ — ©. Therefore, we have x,,(A) — @ as A > ©,
q.e.d.

We are now ready to prove the following result.


7 Sturm Oscillation Theorem 317

THEOREM 4 (OSCILLATION THEOREM). The solution 6(x; ) of DE (25)


satisfying the initial condition O(a, \) = y, 0 = y < @ for each X, is a continuous
and strictly increasing function of d for fixed x ona < x < b. Moreover,

(27) lim 0(x; A) = ©o, lim 6(x; A) = @


00
A009

forax<x=sb.

The first sentence was proved in Lemmas 1~3. The first formula of (27) was
proved in Lemma 1.
We shall now prove the second formula of (27). Choose numbers y < y, <
aw and ¢ > 0. The slope of the segment in the x@-plane joining the points (a, 7;)
and (x), €) where a < x, = b, equals (€ — y,)/(x, — a). For a point (x, 6) on this
segment, the slope of 0(x, A), as given by (25), will be less than the slope of
the segment for large negative \. Therefore, the function @(x, d) will lie below
the segment for a = x = x, for all sufficiently large negative 4. We conclude
that @(x,, A) < e for sufficiently large negative \. Since, by the argument used to
prove Lemma 2, @(x,, A) > 0, it follows that |@(x), A)| < «And since ¢ and
x, are arbitrary, the proof is complete.
We now derive an estimate for the positions of the zeros of a solution of a
regular S-L equation (1), by comparing it with equation (26) and with

(28) Prt” + Apu — Indu = 9

where p,, and q,, are the minima of p(x) and q(x), and py the maximum of p(x)
foraxx=b.
Consider solutions of (26) and (28) for which u(a)/p(a)u’(a) = tan y. The
zeros of these solutions can be determined by inspection. They are a + (nx —
Y)/V An — 9u)/em and a + (nx — ¥)/VQpm — Inem respectively. Applying
the Sturm Comparison Theorem, we obtain the following Corollary.

COROLLARY. Let x, be the nth zero of a nontrivial solution of the S-L equation
(1). Then

Pu
(29)
dpm — Qn

The preceding results have been proved under the assumption that a # 0 in
(2). If a = 0, we can use the same argument when § # 0, by changing the
independent variable tot = a + b — x. Ifa = B = 0, we can still prove the
foregoing results with y = 7/2.

EXERCISES D

1. For u” + —A— q(x)]u =


=
0, with separated endpoint conditions (2), show that all
eigenvalues are positive if q(x) > 0, aa’ < 0, and 66’ > 0.
318 CHAPTER 10 Sturm-Liouville Systems

Show that the number of negative eigenvalues of a regular S-L system is always finite
and is at most 1 if g(x) > 0.

Show that any finite sequence of eigenvalues of a regular S-L system is unbounded.

Find all solutions of the DE 6’


=
=
A sin? 6 + B cos’ 6, where A andBare positive
constants. [HInT: Relate this to a Prtifer system.]

Show that, at all points x where a solution u(x) of (Pu’)’ + Qu = 0 has a minimum
or a maximum, d0/dx = Q(x).
Extend the Sturm Oscillation Theorem to the case where a = 6 = 0 in (2).

Derive Theorem 3 from the Sturm Comparison Theorem of Ch. 2 by introducing


the new dependent variables t = J% ds/P(s) and t; = [2 ds/P,(s).
For any solution of u” + g(x)u = 0, g(x) < 0, show that the product u(x)u’(x) is an
increasing function. Infer that a nontrivial solution can have at most one zero.

Show that, if g(x) < 0 in u” + p(x)u’ + q(x)u = 0, no nontrivial solution of the DE


can have more than one zero.

10 (a) Show that /,(x) is increasing for 0 < x < |n|. [HinT: Use the identity x(x/fy =
(my — x*)J,]
(b) Prove that, if xo is the first positive zero of J, and yo is that of J), then

in| = yo < x9

*11 (a) Let u(x) be a solution of (Pu’)’ + Qu = 0, where P > 0, P’ > 0, Q> 0, and
(P’/Q)’ > 1. Show that the zeros of u, wu’, u” follow one another cyclically.
(b) Infer that the zeros of J, Jn+is Jn+2 follow one another cyclically.

12 (Sturm Convexity Theorem). In u” + Q(x)u = 0, let Q(x) be increasing. Show that


Xp — Xn-1 < Xng1 — Xp, where {x,} is the sequence of successive zeros of a nontrivial
solution w.

13 For the modified Bessel function Jo(y) = Jo(#y), without considering its Taylor series,
show that Jj(y) > 0 and 1 < Jo{y) < cosh
y for all y > 0.

14 Show that, in the Sturm Oscillation Theorem, @(x, 4) > 00 as \ —> 00, uniformly in
any subinterval a’ = x < 6b, a’ > a.

15 Show that @(x, 4) ~ 0 as AX —~ —©, uniformly in any subinterval a’ <


= x <=
=>
b,
a> a.

8 THE SEQUENCE OF EIGENFUNCTIONS

The existence of an infinite sequence of eigenfunctions of a regular S-L sys-


tem that consists of the DE (1), together with the separated endpoint conditions
(2), that is, the conditions

(30) A[u) au{a) + a’u’(a) = 0, Blu] = Bulb) + B'u'(b) = 0


=
=

will now be proved.


We first transform these endpoint conditions into equivalent endpoint con-
ditions for the phase function 6(x, A) of the Priifer system (22)—(23) associated
8 The Sequence of Eigenfunctions 319

with the DE (1). If a # 0, then the function 6(x, \) must satisfy the initial con-
dition 6(a, \) = y, where y is the smallest positive number 0 = y < a such that
pla) tan y = —a’/a. When a 0, we chose a/2. Similarly, we choose
= =
= =

0 <6 =7so that tanéd = —6’/Bp(d).


A solution u(x) of the DE (1) for a S x < bis an eigenfunction of the regular
S-L problem obtained by imposing the endpoint conditions (3) if and only if,
for the corresponding phase function defined by (21),

(31) O(a, \) = ¥, 6(b, )) = 6 + ne, n=0,1,2,...


Osy<7, 0O<éb<4

Clearly, any value of for which conditions (31) are satisfied is an eigenvalue of
the given regular S-L system, and conversely. Let @(x, 4) be the solution of (25)
for the initial condition @(a, \) = . Figure 10.2 shows graphs of the function @
= 6(x, \) for various values of the parameter A. The waviness of the lines
expresses the fact that 1/P(x) in (23) is independent of A, whereas Q(x) = Ap —
q tends to infinity with A. As a result, the slope of the graph is 1/p(x) for all A
when 6 = 0 (mod =), although it tends to infinity with A for all other 6.
Since 9(6, A) is an increasing function of 4, and A(b, A) > 0 by Lemma 2 of §
7, as \ increases from —©6, there is a first value \,) for which the second of the
conditions (31) is satisfied. For this eigenvalue, we have 6(6, Ay) = 6. As A
increases, there is an infinite sequence of X,, for which the second boundary
condition is satisfied, namely those for which 6(4, A,) = 6 + na, for some non-
negative integer n. Each of these values gives an eigenfunction

(32) u,(x) = 1,(x) sin O(x, A,)

of the S-L system. Furthermore, the eigenfunction belonging to A,, has exactly
n zeros in the interval a < x < b, by Theorem 4. This proves all but the last
statement of the following theorem.

A=m25 »A4=9 4=25 /A=9


x/2F Tx /2
rA=4
3r
F-

A=4
5n/2F 5x /2

2xfF 2x

3x /2 5 rA=1 3x /2 A=]

re

r/2F A=0 x/2 rA4=0


rA=—1 A=—1
A=
— 00 0 A=-10

x
7=0 Yro

Figure 10.2 (x, \) for u” + Au = 0.


320 CHAPTER 10 Sturm-Liouville Systems

THEOREM 5. Any regular S-L system has an infinite sequence of real eigenvalues
Wy <M << with lim, .. Ay = ©. The eigenfunction u,(x) belonging to
the eigenvalue X,, has exactly n zeros in the interval a < x < b and is uniquely deter-
mined up to a constant factor.

Only the last assertion wants verification. Any two solutions of (1) that satisfy
the same initial condition au(a) + o’u’(a) = 0 are linearly dependent, by the
Uniqueness Theorem of Ch. 2, §4.

EXERCISES E

1 Show that for a regular S-L system, if q(x) is increased to q,(x) > q(x), each nth eigen-
value of the new system is larger than that of the old.

Show that for a regular S-L system, if p(x) is increased to p,(x) > p(x), all positive
eigenvalues decrease and any negative eigenvalue increases.

Discuss the asymptotic behavior, as n — 00 of the nth eigenvalue of the S-L systems
defined by u” + Au = 0, and the endpoint conditions:
(a) u(0) = 0, u(r) + w(x) = 0.
(b) u(0) = 0, u(r) = u’(x).
That is, find constants ap, a; such that VA, = 2 + ay + a,/n + O(1/n9.
For regular S-L systems with two sets of endpoint conditions, (30) and

a,u(a) + aju(a) = 0, Bu(b) + B’u'(b) = 0

show that, if aj/a, < o’/a, the eigenvalues of the second system are smaller than the
corresponding eigenvalues of the first.

(a) Given (Pu’Y + Qu = 0, and (P,v’)’ + Qyv = 0, P,(x) > 0, Q,(x) continuous,
establish Picone’s identity.

f [Qu(x) — Qlx)]u(x)?dx + f [P(x) — P,(x)]u’(x)?dx


+f"Pito|we) - u(x)v’ (x)

v(x)
| a= 0
where u(a) = u(b) = 0 and v(x) ¥ 0 in [a, 8].
(b) Infer the Sturm Comparison Theorem from Picone’s identity.

*6 (Szegé’s Comparison Theorem). Under the hypothesis of the Sturm Comparison


Theorem for a < x <b, P= P), Q = Q,, let u(x) > 0, u(x) > 0 fora <x <b,
and lim,.., P(x)[u’u, — uuj] = 0. Show that, if u(b) = 0, there is an x, in (a, b) such
that u;(x9) = 0.

9 THE LIOUVILLE NORMAL FORM

By changes of dependent and independent variables of the form

(33) u = y¥(x)w, i= f 2) dx; y> 0, h>0


9 The Liouville Normal Form $21

we can simplify the S-L equations (1) considerably. If the functions y and h are
positive and continuous in the given interval, the first substitution leaves the
location of zeros unchanged, while the second one distorts the range of the
independent variable, preserving the order, and leaves the number of zeros of
a solution in corresponding intervals unchanged. The equivalent DE in w and ¢
is obtained from the identity d/dx h(x) d/dt, which is obtained from the sec-
=
=

ond of equations (33). When substituted into the S-L equation (1), this identity
gives

0 = hlhpQw)], + Ae — Qyw

= hipyhwy, + [hp)y + 2hpy)w, + (hpy,)w} + Ap — Qyw

Dividing through by the coefficient pyh? of wy, we obtain the equivalent DE (for
h, y € @),

Wy + (pyh)'[(hp)y + 2hpy,)w, + [(pyh)'(hpy), + h-*p"Ap —_ qQlw =


=

The term \(p/ph®)\w reduces to dw if and only if h? = p/p. The coefficient of w,


vanishes if and only if (hp),/hp = —2y,/y, which can be achieved by choosing
y® = (hp)"|. Therefore, a simplified equivalent DE in w and ¢ is obtained
by choosing

(34) u=w/Vi@p®, t= f Vplx)/p(x) ax


This substitution reduces (1) to Liouville normal form. Since p and p are positive
throughout the interval of definition (cf. $1), this change of variables makes h(x)
and y(x) positive and of class @? whenever p andpare of class @”.

THEOREM 6. Liouville’s substitution (34) transforms the S-L equation (1) with
coefficient functions p, p € @? and q € © into the Liouville normal form

2
ou + A — q@]lw =0
(35)
dt”

where

(36) q=
1+ oF top"
az

Evaluating the second derivative in (36) and using the identity d/dt —
=

(p/p)'”” d/dx, we get the alternative rational form

Pp , p p / ,
3 1
--4,2
+ 1

()
p p p
(36’)
p
~

4p ( }+(
p p 4 (p 2 (I
p p 4 p
322 CHAPTER 10 Sturm-Liouville Systems

If the DE (1) is defined in a =


— x < 6, and ¢ is the definite integral
t = S% V(s)/p(s) ds, then the equivalent DE (35) is defined in the interval
[0, c), where c =
=

fa p(x)/p(x) dx. An S-L equation (1) with p, p € @? and


q € @ is transformed by Liouville’s substitution into an S-L equation (35) with
q € ©, since the denominator in (36) remains bounded away from 0.

COROLLARY 1. Liouville’s reduction (34) transforms regular S-L systems into


regular S-L systems, separated and periodic boundary conditions into separated and
periodic boundary conditions. The transformed system has the same eigenvalues as the
original system.

Let u(x) and u(x) be transformed into the functions f(#) and g(#) by Liouville’s
reduction (34). From the identity

(37) J“fet at = f u(x)u(x)Vp()o(x) oO dx = f u(x)v(x)p(x)dx


we infer the following result

COROLLARY 2. Liouville’s reduction (34) transforms functions orthogonal with


weight p into orthogonal functions with unit weight.

The Bessel equation (3) of Example 2, §1,

(38) wa +|kx — — ju=o


is the special case p = p = x, q = n”/x of the DE (1). Hence, Liouville’s reduc-
tion (34) is u = w/Vx and x = 1, which leads to the equivalent DE

2 1
dw
Jw = 0, w= xy
—s

kt
4
<=
dx? + 2

If n = 3, this is the trigonometric DE w” + k®w = 0, having a basis of solutions


cos kx and sin kx (k = 1, 2, 3,...). Since Ji;9(0) = 0, it follows that
J; (x) is a
constant multiple of (sin x)/V x.

EXERCISES F

1. (a) Show that the self-adjoint form of the Hermite DE (15) is the S-L equation

[eu + reu = 0

(b) Show that the Liouville normal form of this is the S-L equation (17) for the Her-
mite functions.
10 Modified Priifer Substitution 323

Show that the Liouville normal form of the self-adjoint form of the Jacobi DE is, for
=

x =
cos t

w+| 4 @sin?
— a’) G — 6’)
+(n4 @@+6+1)
(t/2)
TT

4 cos" (t/2) 2 len


Show that the self-adjoint form of the hypergeometric DE is the singular S-L
equation

[9 — xt IW — fab — xP]y =
=

What is the Liouville normal form for this DE?

Compute the Liouville normal form for the Legendre DE, setting x = —cos t, —a
<t<0.

*5 Show that every solution of the Legendre DE is square-integrable on [—1, 1] and


satisfies the endpoint conditions lim,.4, (1 — x”)u(x) = 0.
*6 The Laguerre DE is xu” + (1 — x)u’ + du 0. Show that its self-adjoint form is
=
=

the S-L equation [xe~*w’]’ + dAe~*u = 0. What is its Liouville normal form?

*7 Show that the Legendre polynomial P,(x) has-exactly zeros. [HinT: Reduce the
Legendre DE to Liouville normal form and apply Ex. E6.]

*8 If x, = cost), »*X, = cost, are the zeros of P,(x), x, < x,41, show j that 2x(—1)/
(Qn + 1) <4, < Qaj/(2n + J), for 2 = 7 <n. [Hint: Use the Liouville normal form
and Ex. E6.]

10 MODIFIED PRUFER SUBSTITUTION

By applying a modification of the Priifer substitution to the Liouville normal


form of an S-L system, we can obtain asymptotic formulas for the nth eigen-
function u,(x), valid for large n.
Using the Liouville substitution, any regular S-L system can be transformed
into a regular S-L system consisting of the equation

(39) u” + [A — gq(x)]u

=
u” + Q(x)u = 0, Q(x) = A — q(x)

and separated boundary conditions of the same form

(40) au(a) + a’u’(a) = 0, Bulb) + B’u’(b) = 0

The constants a, @’, B, 6’ are usually changed, but we still have a” + a’? # 0
and 6? + 6” # 0. By Theorem 6, Corollary 1, the eigenvalues of this system are
the same as those of the original system, and the eigenfunctions are obtained
from those for the Liouville normal form through the Liouville substitution. To
study the distribution of eigenvalues and magnitude of the eigenfunctions, it,
324 CHAPTER 10 Sturm-Liouville Systems

therefore, suffices to treat the system (39)—(40). In §§10—-11, we shall use mainly
(40)
We shall assume from now on that Q(x) > 0 for a = x = 8, that is, that \ >
q(x) and Q € @'. We introduce the functions R(x, \) and (x, A), the modified
amplitude and modified phase, which are defined in the terms of a given solution
u(x, d) of (39) by the equations

(41) u = =— sin ¢, v= RVQcos ¢


Q

These equations constitute the modified Priifer system for the DE (39)
We shall now derive a pair of DEs for R and ¢ that are equivalent to (39). We
havet

VQu +
(42) cot¢ =
Ye
—_—— —

Yo

Differentiating the first of these equations, we obtain (using u” = —Qu)

u’
Qu? + u” 1 Qu
(csc* $)¢’ = Q'? 2 9 OF? u

Using the second equation, this simplifies to

(csc* 6)’ = + 1g not


2Q

and, multiplying by sin“ @ and simplifying,

(43) ¢?’ = Q'? + -=> sin 2¢

To derive the DE satisfied by R, differentiate the second equation in (42),


obtaining the identity

Q’ JQ
2RR’ = 2Q7/?(Quu!+ u’u”) + & 2 Q-V2y%
2Q

The first term vanishes since u — Qu, leaving the DE


=_

Q’ —Q’

(44)
F_Yg (sin* @ — cos* ¢) cos 2¢
R 4Q 4Q

+ When u # 0, these equations are valid. When u = 0, set tang = V Qu/w’ and proceed similarly.
10 Modified Prifer Substitution 325

In terms of and q, the modified Priifer system is

,
q
(45a) = VA sin 26
~ 1 4a= 9)
q’

(45b) Re cos 2¢
R 40-9
Clearly, to every nontrivial solution of (39) there corresponds a solution of
the modified Priifer system, and conversely. Furthermore, we know that R > 0,
unless R vanishes identically.
Equations (45a) and (45b) determine the asymptotic behavior of the solutions
of (39) as \ — 00, The fundamental result is the following.

THEOREM 7. Let $(x, ) and R(x, X) be solutions of the system (45a) and (45b),
where q(x) € @' is bounded. Then, as \ > 00,

O(1)
(46) d(x, A) = o(a, ) + VA(x — a) + —_——

Vi

and

ov)
(47) R(x, A+) = Ra, ») + 7

Intuitively, Theorem 7 states that for large \ the modified phase ¢ is approx-
imately a linear function of VA, and the modified amplitude function R is
approximately constant.

The Symbol O(1). The symbol O(1) used here and later signifies a function
J(x, ») of x and X, defined for all sufficiently large A, which is uniformly bounded
for a = x = bash — ©. Hence, O(1)/M signifies a function f(x, \) such that
A‘f(x, A) is uniformly bounded. The symbol O(1)/X’ is also often written O(\™*),
as has been done in analogous contexts in Chs. 7 and 8.
The formula fx, 4) = O(1), where fis a given function, is not an ordinary
equation. Thus, to write O(1) = f(x, A) would be meaningless, since O(1) is not
a function. The formula means simply that f remains uniformly bounded for all
x as AX — 00, and that no other property of the function f is needed for the
purpose at hand. Using this definition, the following important properties of
the symbol O(1) can be easily verified:

O(1) + O00) = 00); = -O@)O(1) = 0); f O(1)dx = O(1)

for any finite a, b. Again, if a and @ are real numbers with a = 8, then O(1)/A*
+ o(/r® = O(1)/A*. Finally, if q(x) is any bounded function of x, then by
326 CHAPTER 10 Sturm-Liouville Systems

Taylor’s formula we have, as \ > 00

[A — g(x)]* = A*
1—q(x)
r
| = d* —ag(x)A*! + O(I)AT?
The preceding formulas will be used freely in subsequent computations

Proof. For all X for which |q(x)| <A on [a, 4], we have as before

q
—.
q
A
(1+ O(1)
r
-£, 00
n2
r
Aq
"
V i-~q=vx(1 -4) =
=
Vx
O(1)
x3?

We now compare the solutions of the DEs (45a) and (45b) with the solutions
x(x, ) = o(a,) + VA(x — a) and R(x, ) = R,(a) of
Vx

d’ —
=
and (log RY = 0

using Theorem 3 of Ch. 6. In making this comparison, we set € = O(1)/ Vy, and
replace x and y with the functions $(x, A) and (x, A), respectively. If $,(a, )
= (a, d), the inequality (7) of Ch. 6 gives |o(x, A) — $)(x, A)| = O(D/ Vx, and
since #,(x, A) = (a, A) + VA(x — a), equation (46) follows.
Similary, to derive (47), compare R(x, A) with R,(x, d), using the identity
eOOWR me y+ O(1)/\ obtained from Taylor’s formula.

*11.| THE ASYMPTOTIC BEHAVIOR OF BESSEL FUNCTIONS

We shall now use the modified Priifer substitution to study the asymptotic
behavior of solutions of the Bessel DE (3) as x — 00. The substitution u = w/
Vx reduces (3) to the Liouville normal form

2 1
(48) w” + [1 — (M/x*)]w = 0, <
0O<xom, M =n*—F

whose solutions are w(x) = VxZ,(x), where Z,(x) is a solution of the Bessel DE
[see (38)]. The modified Priifer system for (48) is then obtained by setting Q(x)
= 1 — M/x* in (43) and (44). This gives

(49a) b(x) =\r-™ _-


M

x?
M sin 26
2(x® — Mx)

R(x) _ ~Mcos 26
(49b)
R(x) 2(x? — Mx)
11 The Asymptotic Behavior of Bessel Functions 327

Expanding the right sides of these equations, wehave as x — ©O, since


(1 — M/x*)'? = 1 ~ M/2x? + O(1)/x*,

=]--
1M , O(1) R(x) o(1)
’(x) 2 x2 ~~ 3 ’ 3
x R(x) x

Here O(1) denotes a function of x that remains bounded as x ~> 00. Integrating
the first of these equations between any x > VM and y > x, we obtain

M M. O(1)
(x) o(y)
—_—

a
2
2x 2y x

Keeping x fixed and letting y — 00, we find that ¢ = lim, [y


— @(y)]is finite.
This gives ¢(x)
= eo + x — M/(2x) + O(1)/x?.
If VM < x < 4, integration of the second equation gives, similarly, log R(x)
log R(y) O(1)/x”. Taking exponentials and letting y > 00, we get

R(x) Rw exp [O(1)/x?] = Ro + O()/x

where R, Ry)
It follows that every solution of the Bessel DE (48) has the asymptotic form

na) = x? EB ()
|sn(6 2
+
O(1)
x
2

Since sin (A + O(1)/x*?) = sin A + O(1)/x?, the preceding display can be rewrit-
ten as

1psin( M
nl) «+ bo — =
2
)+ O(1) B/2

The solution Z,, is uniquely determined by the constants R,, and ¢.. above. For
if two solutions had the same asymptotic amplitude R.. and phase ¢.., their dif-
ference would be a solution having modified amplitude R(x) O(1)/x°. Since

R(x)
= Raoexp oo)
this would imply R = u = 0. Setting x. = 7/2 + d., this proves the following
theorem.

THEOREM 8 To every nontrivial solution of the Bessel DE (3), there corresponds


an asymptotic phase constant x. and a limiting modified amplitude Ro. The solution
is uniquely determined by x. and R.; every solution Z,(x) of Bessel’s DE can be
328 CHAPTER 10 Sturm-Liouville Systems

expressed as x ~> 00 in the form

(n? — 1/4
(50) Z,(x) = —= cos
x
(++ x- 2
O(1)
xo?

For the Bessel function J,(x), it can be shown that x. = n1/2 + 7/4 and that
Ro = V2/". The Neumann function Y,(x) is defined likewise by the conditions
Xo = nw/2 + 37/4 and R, = 2/x. Thus, the Neumann function Y,,(x) is
defined by the condition that it has the same asymptotic amplitude as /,(x), with
an asymptotic phase lag of 2/2 radians.
That is, the asymptotic relation between J,(x) and Y,,(x) is, for large positive
x, the same as that between cos x and sin x. The Hankel function H,(x) =
Jnlx) + 7Y,(x) is, therefore, analogous to the complex exponential function
e* = cos x + isin x.

12 DISTRIBUTION OF EIGENVALUES

Weshall next show that the asymptotic distribution of the eigenvalues of all
regular S-L systems is the same: the trigonometric DE u” + Au = 0 is typical.
We shall treat in detail the case of separated endpoint conditions (2), also assum-
ing a’6’ # 0 for uniformity. We can assume the given S-L system reduced to
Liouville normal form (39)—(40), because this does not change the eigenvalues
or the condition a’f’ # 0.
For the trigonometric DE and the boundary conditions u(a) = u(b) = 0, the
nth eigenfunction is sin [ma(x— a)/(b — a)] and the nth eigenvalue is »,, =
2a?/(b — a)*, n = 1, 2, 3,.... For u(a) = w/(b) = 0, u,(x) sin VA,(x — a),
where A, = (n_+ 3)*x?/(b — a)®. For u’(a) = u’(b) = 0, the (n + 1)st eigenfunc-
tion is cos A(x — a), where A, = n?x?/(b — a)? andn = 0,1,2,....
We will treat in detail, here and in §13, regular S-L systems satisfying sepa-
rated endpoint conditions (2) with a’f’ # 0. We will show that VA, = [nz/(b —
a)] + O(1)/n in this case, n = 0, 1, 2,.... That is, unless o& = 0 or #’ = 0 in
(40), the asymptotic behavior of the eigenvalues and eigenfunctions is similar to
that of u” + Au = 0, with the endpoint conditions a = 6 = 0.

THEOREM 9. For the regular S-L system (39)—(40), let a’B’ # 0. Then the eigen-
values \, are given, as n — 00, by the asymptotic formula

nT o(1)
(51) Vin b-a
+

Here O(1) denotes a function of n that is uniformly bounded for all integers
n = 0.

Proof. Let A = —a/a’ and B = —8/§’. By assumption, A andB are finite.


Choose a solution (x, A) of (39) satisfying the initial condition

(52) cot o(a, \) =


Va ala)’ 0<¢(a,\) <4
13 Normalized Eigenfunctions 329

According to (41), the solution u(x, 4) corresponding to ¢ will be an eigenfunc-


tion if and only if

(53) cot o(b, A) =


VA — qd)

Condition (52) can be simplified by expanding arccot x around x =


=
w/2 toa
first-order approximation in 1/\V/A. This gives, as \ — ©,

o()
(54) a) =F +t n3/2

Condition (53) can be simplified by a similar expansion. For the (n + 1)st


eigenvalue, the modified phase function changes asymptotically to nx + O(1)/
Van This gives, for x = A/VA — q(a)

+
O(1)
(54’) (6, r,) = 9 +
Vin
Subtracting (54) from (54), and comparing with (46) of Theorem 7, we
obtain the equation

O() _ O(1)
(55) (6, d,) = O(a, A,) = ne + Vib— a) +
Vin Vin
Letting \,, > 00 we obtain lim,_... nA, 2 = (6 — a), or VA, = K,n, where the
K,, tend to 1/(b — a). Substituting into (55), we obtain

nn Ol) __ne oO)


Vin = +
d
b-a Via b-a

COROLLARY. If X,, is the sequence of nonzero eigenvalues of a regular S-L


oO
system, then & n=0 Ma?
< ©.

13. NORMALIZED EIGENFUNCTIONS

A square-integrable function u on an interval a < x < bis normalized relative


to a weight function p when

J u?(x)p(x) dx = 1.
a

In the ease of the eigenfunctions of (39), p(x) = 1. Our aim is to show that
the normalized eigenfunctions of (39) and (40) behave approximately like cosine
functions, provided that a’6’ # 0. [The cases a = 0 and 6 = 0 are similar, after
phase-shifts of /2 in (54) and (54’.]
330 CHAPTER 10 Sturm-Liouville Systems

THEOREM 10, Lei u,(x) (n = 0, 1, 2, .. .) be the sequence of normalized eigen-


functions of the regular S-L system (39)—(40), with o/B’ # 0. Then

n(x — a) O(1)
(56)
b—-a n

The proof of this theorem will be carried out in three steps. For an eigen-
function u,(x), with eigenvalue A,, we have by (41)

R(x, d,)
(57) u,(x) = = sin (x, d,), a=x=b
An — q(x)

In order to obtain formula (56), we obtain asymptotic expressions separately


in terms of n for each of the three factors appearing in (57). This is done in the
following three lemmas.

LEMMA 1. Let (x, ) be as in the proof of Theorem 9. Then as \ — 00,

(58) f sin?$(x,A)dx = b-—a


2
,
+
O()
xn

Proof. Using (x, d) as the variable of integration in (55), and recalling from
(46) that dx/dp = (dp/dx)"' = —d~? + O(1)A~*”, we have

(bd) dx
f sin?o(x, A) dx = f ¢ — dd
sin?
(@,)) dp

= Al? + oar *) f " sin’?@do


a,

The last integral can be evaluated explicitly. Apply Theorem 7, to obtain

(6,2) $(6,r)
@ _ sin 2¢ _ Ne -
J ¢ (a,A)
sin? ¢ dd =
2 4 | @r) 2
+ O(1)

Substituting into the previous displayed formula and simplifying, we obtain (58)

A second step toward our result is the following lemma.

LEMMA 2. Let u(x, d) be a solution of (39). Then, as \ —> co

(59) (f u(x)is) = R(a,dyn“ b-a


—__—

2
(1+ O(1)
uN?
)+ O(1) no
13 Normalized Eigenfunctions 331

Proof. Expressingu in terms of R by (41), and then


n expanding Ras in Theo-
rem 7, formula (47), we have

f 2(x)dx = |Re,y+ m0] f [\— g(x)] 7?sin? @dx


Since (\ ~ g)'/? = 71? + O(D)A~*”, we get after simplifying and using (58)

f 2(x)dx = Ee )+ 20] AT? + O(DA “a + oan”)


O(1) b-a o(1)
-| R(a, +) + ——
r I Qni/? r

Hence, taking square roots

O(1) b-—a O(1)


[Lema =| Ria, )+
I 91/2
r

vr )+ O(1)
R(a, d) O(1)
vA ve nef?

COROLLARY. Jf, in Lemma 2, f° u?(x, d) dx = 1, then

(60) RQ,d) \/eenee + O(1)A~7]


Proof. Formula (59) gives the following condition on the amplitude function
of a normalized solution

R(a, »)
1-
O(1)
n° 4 A
VF *) O(1)
rn

Solving for R, and taking the asymptotic form of the quotient, we get (60), q.e.d

LEMMA 3. Let X,, be the nth eigenvalue (Ag < dy < dg of the S-L system
(39)-~(40). Then, as n — 00, unless a’B = 0

nx(x 4
(61) sin $(x, A,) = cos + O(1)Az¥?
b-

Proof. By Theorem 7, (46), we have

O(1)
(x, 4.) = oa, A) + VAx
— a) + —

Ve
332 CHAPTER 10 Sturm-Liouville Systems

Moreover by (54), (a, A,) = 7/2 + O(1)/ Vs Substituting back into the pre-
ceding formula, we get

(62)
sin $(x, A,)
=
=

sin [Vd, (x — a) + 4/2] + O(1)/VA,


=
=

cos [VA, (x ~ a)] + O(1)/n

We now apply Theorem 9 to this formula. By formula (51) and the mean value
theorem, we have

cos [VA,(x —a)] — cos |na(x — a) | = O(1)n7) = O()aA;)?


(b
— a)

Substituting into the right-hand side of (62), we obtain (61), q.e.d.


The proof of Theorem 10 can now be completed as follows. Of the three
factors in equality (57), 1/V/An — q(x) can be replaced by the first-order
approximation (A — q)~'/4 = 7/4 + O()A7**. The factor R(x, A,) is estimated
by the Corollary to Lemma 2, and the factor sin (x, A,) is estimated by Lemma
3. Substituting all these expressions into (57) and simplifying, we obtain

ug()= 2s n= 2) + Or,”
Since \,/? = O(1)n“', this gives Theorem 10.

EXERCISES G

1. For any DE u” + u + p(x)u


=
=

0 with p(x) = O(x~*) as x ~> +00, show that, for


every solution u(x), constants A and x, can be found for which

u(x) = A cos (x — x) + O(x7') as x — 0

*2 Establish the following formula for Legendre polynomials:

A
P,(cos 6) —_
= mip
(sin 6)!
COs
( nts 2
9-2
4
+On"*) for O0<0<a9

for some constant A. [HINT: Find a DE satisfied by P,(cos 6).]

Show that the relative maxima of x!/*|J,(x)| form an increasing sequence if 0 <n
< $and a decreasing sequence if n > }.
*4 (Sonin-Polya Theorem). Show that if in (Pu’)’ + Qu = 0, P, Q€ @'[a, bI, Q(x) # 0,
and P(x)Q(x) are nondecreasing, the successive maxima of |w(x)| form a nonincreas-
ing sequence, and that equality occurs if and only if Q(x) = 1/P(x). [Hint: Show that
the derivative of o(x) = u(x)? + P(x)u’(x)?/Q(x) is nonpositive.]
*5 Show that the values of | PX) Q(x) [77 | u(x) | at those points where u(x) = 0 are a
monotonic increasing or decreasing sequence, according as the values of P(x)Q(x)
are decreasing or increasing. [HinT: Consider v(x) = P(x)Q(x)@(x), ¢ as in Ex. 4.]
14 Inhomogeneous Equations 333

14 INHOMOGENEOUS EQUATIONS

Inhomogeneous second-order linear equations, of the form

(63) L{u] = polx)u” + pilx)u’ + po(x)u = f(x), polx) # 0, polx) € @!

subject to homogeneous separated endpoint conditions (30), can be solved by use


of Green’s functions. The method of solution generalizes that for two endpoint
problems described in Ch. 2, §§9, 11. The discussion given there, which covers
the case a’ = 8’ = 0 of (30), can now be reviewed to advantage.
Before introducing Green’s functions, we first analyze the problem with
inhomogeneous separated endpoint conditions:

(64) A{u] au(a) + a’u’(a) = a), Blu] = Bu(b) + B’u'(b) = B;


=
=

Let U(x) be the solution of L[u] = 0 satisfying the initial conditions U(a) =
a’, U'(a) = —a; let V(x) be the solution of L[u] = 0 satisfying Vid) = 6’, V(b)
=
=
—; let F(x) be the solution of L[u] = f(x) satisfying F(a) = F(a) = 0. The
existence and uniqueness of these functions follow from Theorem 7, Corollary
2, of Ch. 6, §8. For any constants, c, d, the function

w(x) = cU(x) + dV(x) + F(x)

satisfies the inhomogeneous DE (63). Moreover, we have

A[w] = d(aV(a) + o’V(a)) = dA[V]

Blw] = cBU®) + BU") + BEF] = cB[U] + BLF]

If Uand Vare linearly independent, their Wronskian W = UV’ — VU’ never


vanishes. Hence,

ALY) aV(a) + a’V’(a) = —U(a)V(a) + UlaV’a) # 0


=
=

Similarly, B[U] = —W(s) # 0. Therefore, equations (65) for the unknowns c


and d have a unique solution for any values given to A[w] and B[w].
On the other hand, if U andV are linearly dependent, their Wronskian van-
ishes identically. Hence, U(x) satisfies A[U] aa’ + a(—a) = 0 and BLU] =
=_

0. This proves the following theorem.

THEOREM 11. Either DE (63) has a solution w satisfying the boundary conditions
A [w] =a, and B [w] =8), for any given constants a, and B,, or else the homogeneous
DE L[u] = 0 has an eigenfunction with eigenvalue 0, satisfying the homogeneous
conditions A[u] = 0 and Biu] = 0.
334 CHAPTER 10 Sturm-Liouville Systems

15 GREEN’S FUNCTIONS

We now show that, in the first case of the preceding theorem, there exists a
Green’s function G(x, &) defined for a = x, § = b, such that the solution of (63)
subject to the boundary conditions (30) is given by

(66) u(x) = f " 6G, Of dé = SEF)


Note that ¢ is an integral operator (Ch. 2, §9) whose kernel is the Green’s func-
tion G(x, £).
This result has already been established in Ch. 2, §11, for the endpoint con-
ditions u(a) = u(b) = 0; it will now be generalized to arbitrary homogeneous sep-
arated endpoint conditions (30): Afu] = Blu] = 0.
In this general case, G(x, £) can be constructed by the method used in Ch. 2.
For each fixed & G(x, &) is a solution of the homogeneous DE L[G] = 0 on the
intervals [a, £] and [£, 4], satisfying the homogeneous endpoint conditions A[u]
= 0 and Blu] = 0, respectively. It is continuous across x = &(i.e., across the
principal diagonal of the square a =< x, & = d), and its derivative 0G/dx jumps
by 1/po(x) across this diagonal. In other words, we have

A(E)U(x) VE), asxsit


G(x,8) = |e(&) Vix) UE), éxx=)

where the factor e(£) above is chosen to give 0G/0x a jump of 1/p9(€) across
x = & Thus

26.9 - 26.) = d(UOVE) — VOU'E)) =


polé)

We are therefore led to try the kernel

U(x) VE)
sxx
pol&) WEE)’
(67) G(x, &)
UG) Vix)
Exx=b
po(é) W(E)’

where W = UV’ — VU’ is the Wronskian of U and V.

THEOREM 12. Unless W = 0, equations (66) and (67) yield for any continuous
function f on [a, b] a solution u(x) of the DE L{u] =f(x) that satisfies the boundary
conditions A{u] =B[u] = 0.
15 Green’s Functions 335

That is, unless the homogeneous linear boundary-valué problem L[u] A[u)
= Blu] = 0 admits an eigenfunction, the function defined by (67) is a Green’s
function for the system L[w] f, A{u] = Blu] = 0
The proof is like that given in Ch. 2, §11. Rewriting (66) as

u(x) f G(x, HE d& + f G(x, Of(E)dé


and differentiating, we have, by Leibniz’ rule,

oy= | Ge of ae + | Oe, of at
The endpoint contributions give G(x, x" )f(x") — G(x, x*)f(x*) = 0; they cancel
since G(x, £) and f are continuous for x = & Differentiating again, we have, by
Leibniz’ rule

(x) f Gyle, Of
EO) d& + G(x, x) fx
+ | Gale, B® a8 — Gx, x*Yfle")
The two terms corresponding to the contributions from the endpoints come
from the sides x > £ and x < é of the diagonal; since f is continuous, their
difference is [G,(x*, x) x(x~, x)]f(x) S(*)/po(x). Simplifying, we obtain

Sf)
u(x)= f Cale,of
e at + 1%.
polx)

From the foregoing identities, we can calculate L[z]. It is

Lf] f L,{Glx, DIO dé + fl) = fle)


where L,[G(x, §)] stands for the sum fG,, + ~,G, + poG. This sum is zero
except on the diagonal x = é, where it is undefined. This gives the identity (63)
Since G(x, §), as a function of x, satisfies the boundary conditions (30) for all
&, it follows from (66), by differentiating under the integral sign and using Leib-
niz’ Rule again, that u satisfies the same boundary conditions. This completes
the proof of the theorem
In operator language (cf. Ch. 2, §3), we have shown that the operator f >
S[f] transforms the space @[a, b] of continuous functions on the interval [a, 5]
into the space @?[a, 6] of functions of class @?, and that this operator is a right
inverse of the operator L. In other words, we have L[S[f]] = f for all contin-
uous f. In operator notation, we can write § = L™
336 CHAPTER 10 Sturm-Liouville Systems

EXERCISES H

In Exs. 1-5 show that Green’s function is as specified.

x for xe

|
a”
1 u
=
=

Sh u(0) = u’(1) = 0; G(x, 8) =

g for x>€&
u”
2 u(—1) = u(1) = 0; G(x, &) Cle — €| + x€ — 1172
=

f
=

u = =

3 xu” + u’ = f, u(x) bounded as x — 0, u(1) =


=

0;

for xsé
G(x, = | log &
log x for x>&

u” — u = f, u(x) bounded as |x| —> 00; G(x, &) = —exp (|x — &|)]/2.

u” — u’ = f(x), u(0) = u’(1) = 0.

— e%,
G(x,) = |e*(1
(ef — 1),
xe
x ee

Find Green’s function for u” — u = f with u(—a) = u(@) = 0. Show that, as a >
00, it approaches that of Ex. 4.

Show that L[u] + Aw = 0 for nontrivial u, \ # 0 and given homogeneous endpoint


conditions (30), if and only if [au] = wu for p = 1/.

*8 Show that the Green’s function G of a regular S-L system is a symmetric function of
x and &, in the sense that G(x, £) = G(&, x).

*16 THE SCHROEDINGER EQUATION

The Schroedinger equation of quantum mechanics in one space dimension is


the DE

2m
(68)
v + —

2 ) [E— Vix)ly = 0
Physically, the function V(x) has the significance of potential energy; the con-
stant m stands for the mass of the particle; the constant E is an energy param-
eter; h = h/2z is a universal constant, whose numerical value depends on the
units used. The “wave function” (x) may be real or complex; yyy* dx = |y|?
dx is the probability that the particle under consideration will be “‘observed” in
the interval (x, x + dx). The eigenvalues of (68) for varying E are the energy levels
of the associated physical system.
The DE (68) is precisely the Liouville normal form

(69) u” + [A — q(x)]u 0
=
=

of a general S-L equation, with A = 2mE/h? and q = 2mV/h*. But, in most


physical applications, one is concerned with the infinite interval (—00, 00). On
16 The Schroedinger Equation 337

this interval, the ‘“‘endpoint’”’ condition that a solution remain bounded as x >
+00 defines a singular S-L system (cf. §4). In problems involving the Schroedin-
ger equation, it is customary among physicists to define the spectrum of this S-L
system as the set of all eigenvalues for which eigenfunctions exist. The set of
isolated points (if any) in this spectrum is called the discrete spectrum; the part (if
any) that consists of entire intervals is called the continuous spectrum. We shall
adopt this suggestive terminology here; unfortunately, its logical extension to
boundary value problems generally is very technical, even for ordinary DEs.t+
For regular S-L systems, we have proved that the spectrum is always discrete,
and the eigenfunctions are (trivially) square-integrable. We now describe a sim-
ple singular S-L system whose spectrum is continuous and whose eigenfunctions
are not square-integrable.

Example 8. The S-L system ofa free particle is

(70) u” + ru = 0, —0O <x< +0

For every positive number \ > 0, this DE has two linearly independent bounded
solutions sin (VAx) and cos (VAx). For \ = 0, it has the bounded solution u =
1, and no other linearly independent eigenfunction. For \ < 0, it has the lin-
early independent unbounded solutions sinh (Vx x) and cosh (Vx x), and no
nontrivial bounded solution. Hence, the spectrum of the free particle is continu-
ous: it consists of the half-line \ = 0.

Example 9. In the case of a harmonic oscillator, the potential energy V(x) is a


constant multiple of x’. By a change of unit x — kx, we can reduce the resulting
Schroedinger DE to the normal form

(71) wu” +A— xu =0

Comparing with Example 7 of §4, we see that this has the eigenfunctions
e” 7H,(x), for\ = 2n + 1 (n =>
=

0, 1, 2, ...). These eigenfunctions are even


square-integrable.
For any value of A not an odd positive integer, the recurrence relation
Aig = (2k — A + 1)a,/(R + 1)\(R + 2) satisfied by A\(x) may be compared with

that for the Taylor series era y 6'x”’/ (r!), namely ¢y,49 = Beo,/(y + 1). Setting
2r = k, we see that, for all sufficiently large k,

r+ >
Cn+9
A
>0 if 6<1
a Cr

Hence |Ay(x)| > Be? — pr(x), where B > 0, pr(x) is a polynomial and (say)
6 = 4. It follows that |e~*Ay(x)| > Be*/* — O(1) is unbounded, unless d is an

+ See, for example, Coddington and Levinson, pp. 252-269.


338 CHAPTER 10 Sturm-Liouville Systems

odd positive integer. Finally, if u(x) is any bounded nontrivial solution of (71),
the same is true of u(—x) and of [u(x) + u(—x)]/2, (u(x) — u(—x)]/2. This
shows that, if (71) has an eigenfunction, it must have an odd eigenfunction or
an even eigenfunction. Since either of these would be defined up to a constant
factor by the relation

(2k —dX¥+ Da
Me hE DR + QD)

on its coefficients, we see that the Hermitefunctions e~* 7H,(x) are the only eigen-
functions of the harmonic oscillator.

*17 THE SQUARE-WELL POTENTIAL

In Example 8, the spectrum is continuous; in Example 9, it is discrete. We


now describe a Schroedinger equation whose spectrum is partly continuous and
partly discrete.

Example 10. A square-well potential is one satisfying V(x) = —C® on


|x| <a, and V(x) = 0 when |x| > a. This leads to the Schroedinger DE with
discontinuous q(x):

(72) wu
=| 0—C*u on

on
|x| >a
|x| <a

The eigenfunctions can again be determined explicitly.t


If u(x) is any eigenfunction, then so is u(—x), and so are the even part
[u(x) + u(—x)]/2 and the odd part [u(x) — u(—x)]/2 of u(x). Hence, (72) has
a basis of eigenfunctions consisting exclusively of even and odd eigenfunctions,
that is, satisfying u’(0) = 0 or u(0) = 0.
For A > 0, every solution of (72) has the form A cosVAx + Bsin Vax for
|x| > a. Hence, every nontrivial solution of (72) is an eigenfunction and, as in
Example 8, the spectrum includes the entire half line A = 0.
For \ < —C’, on the other hand, the continuation to |x| > a of both the
even solution cosh ( —) — C*x) and the odd solution sinh ( —r — C*x),
from the interval |x| < a, can be shown (see Theorem 14 below) to satisfy u(x)
> 0, w(x) > 0, and u”(x) > 0 for all positive x. Hence, they are both
unbounded. In summary, the spectrum contains no points on \ < —C*: there
is no bounded solution with eigenvalue \ < —C?.
In the interval —C? < \ < 0, one can show that the spectrum is discrete by
working out the implications of the Sturm Oscillation Theorem. The solutions

+ This is done with the usual understanding (Ch. 2, Ex. Al2) that a “solution” of (72) is a function
u € @' which satisfies (72), and so is of class @? where q(x) is continuous.
18 Mixed Spectrum 339

bounded for x > a are the functions A exp (— V —Ax) which satisfy u’(a)/u(a)
=
—). Writing p d\ + C?, we see that the even solutions A cos ux of
(72) satisfy the same boundary condition w’(a)/u(a) V—A if and only if
“tan pa = —X. The odd solutions B sin px satisfy it if and only if » cot pa
= —

—x. Solving the preceding transcendental equations graphically, we


see that the number of even eigenfunctions and the number of odd eigenfunc-
tions belonging to the discrete spectrum are both approximately equal to aC/r
Moreover, every eigenfunction that corresponds to the discrete spectrum is
square-integrable, and conversely

*18 MIXED SPECTRUM

The preceding example is typical of a wide class of Schroedinger equations—


namely, all those having a “‘potential well’? dying out at infinity. We first treat
the continuous portion of the spectrum

LEMMA I In the normalized Schroedinger equation (69), let q be continuous and


satisfy q(x) B/x + 0(1/x) as x — 00, for some constant B. Then, for \ > 0, every
solution has infinitely many zeros and is bounded

Proof. The first statement follows from the Sturm Comparison Theorem
comparing with the DE u” + Au/2 = 0.
To prove the second statement, first change the independent variable
tot = Vax, giving the DE u, + Q()u = 0, with Q@) = 1 — git/Vd)/A. Applied
to the new DE, the Priifer substitution (21) gives, by (22),

ae, —B
(73) — = Q(f)sin?6 + cos?
@= 1 +=sin294— A
Vx

Moreover, 7° = u* + u,° is given by (23’),

r =Kes |f'| +o( )| in 2m$ ao} Kexp |f 6)ds|


where F(s) (A/s)(sin 28) + O(1)/s*, and the limits of integration refer to s
Using the expression dt/d? = 1 — (A/t) sin? 6 + O(1)/#, derivable from the
preceding display, we have

ft

f ro as= f a
+ o( )| sin20dé
The first term on the right side above can be integrated by parts

fi r0ac=|-A cos]- f Fees2008+fo(3:) sin29a


340 CHAPTER 10 Sturm-Liouville Systems

The boundedness of the first two terms on the right side of this equation is
evident; the last term is bounded because d@/ds
=
=
1 + O(1)/s. Hence, u? is
bounded because u? < 7° < K? exp {J F(s) ds}.

Combining Lemma 1 with the analogous result for negative x, we obtain the
following result.

THEOREM 13. If q € @ satisfies q(x) =A/x + O(1/x°) as x > +00 and q(x)
= B/x + O(1/x?) as x > —00, then the spectrum (69) includes the half line > 0.

As regards the discrete portion of the spectrum, the key result is the following
lemma, which characterizes the asymptotic behavior for large x of a wide class
of DEs that have nonoscillatory solutions, such as the modified Bessel equation
of Ch. 9, §7.

LEMMA 2. In the Schroedinger equation (69), let q be continuous, let lim,.4.. q(x)
= 0, and let) = —k? < 0. For any ¢, 0 <€ <k, there exist two solutions u(x) and
U(x) of (69) such that, for all sufficiently large x,

(74) e-9* <= u(x) S ef *, oO < Uy(x) Se

Proof. Choose a so large that (k — 6? < q(x) — X¥ < (k + ©? for all


x = a, and let u,(x) be the solution defined by the initial conditions u,(a) = e™,
ui(a) = ke™. Then r(x) = uj /u, satisfies r(a) = k and the Riccati equation
7’ = G(x, 7) = g(x) — A — 7°. For the DEs

p’ = F(x, p) = (k — 6? — p*, of = Hx,o) =(k+ 9 —o*

it is clear that F(x, 7) <= G(x, rT) = H(x, r) on the domain 7 = k — ¢ > 0. More-
over, the solutions p(x) = k — ¢ and o(x) = k + « of the displayed DEs satisfy
p(a) < r(a) < o(a). Hence, by the Comparison Theorem of Ch. 1, §11, we have
k —€ = p(x) S 7(x) S o(x) = k + «. Integrating, we get the first inequality of
(74).
We now derive the second inequality. As in Ch. 2, §5, a linearly independent
solution of the DE (69) is given by

ds
Ug(x) = Zkuy(x)f
x uj(s)

The first inequality of (74), applied to the integral on the right, gives the
inequalities

OD
1 —2(k—€)x >
ds 1 —QWhk+ex
=>

2(k — 6) x u(y? Wk+0


18 Mixed Spectrum 341

Multiplying through by 2ku,(x), and using (74) again, we get

2 —2ex
é
—(k+Ox
(75) eOO > u(x) = é
1 — &/k 1+ ¢/k

But, for any 9 such that 0 < 3e < 9 < k we have, for sufficiently large x,

=-

é —(k+6)x > e eens


en te > —(k-€)x
é and é
~ 1—é/k 1 + &/k
-

- —

Applying these inequalities to (75), we obtain the second formula of (74) with 7
in place of ¢. Since, for any 7 with 0 < » < k, we can find ¢ 7/6 with 0 < 3¢
=
=

<» < k, and the proof is complete.

COROLLARY I. On (0, 90), let q(x) be continuous and satisfy lim,... q(x) =o
Then every solution of the Schroedinger equation with X <q that is bounded on the
interval (0, 00) is square-integrable.

COROLLARY 2. Lei q(x) € @ on the line (—00, 90), and let q(x) tend to limits qo
and q, respectively, as x — +00, Then every eigenfunction with eigenvalue \ < min
(40.9) is square-integrable.

The final conclusions can be summarized in a single theorem.

THEOREM 14. Let q(x) be as in Theorem 13. Then, for \ > 0, the spectrum is
continuous. For \ < 0, the eigenfunctions are square-integrable.

It can also be shown that, for \ > 0, the eigenfunctions are not square-inte-
grable, and that for \ < 0, the spectrum is discrete.

EXERCISES I

1. Show that the S-L system: u” + Au


=
=
0,0 = x < ©, au(0) + a’u’(0) = 0, u(x)
bounded as x -> 0, has a continuous spectrum 0 <A < © of aa’ ¥ 0.

2 Show that, if uw” — q(x)u = 0,0 = x < ©, with g(x) bounded, the DE cannot have
two square-integrable linearly independent solutions. [Hint: Use the Wronskian.]

Inu” + [A — g(x)]u = 0, 0 < x < 00, if g(x) > +00 as x — 00, show that, for any
A, the DE has exactly one square-integrable solution up to a constant factor.

#4, Under the assumptions of Ex. 3, show that the S-L system corresponding to the
boundary condition u(0) = 0, u(x) square-integrable in [0, 00), has an infinite
sequence of eigenvalues.

Show that, if the DE uw” + q(x)u = 0,0 = x < 0, g€ @ has a solution u(x) with
lim,-.o U;(x) = 1, it also has a solution w(x) such that lim... w9(x)/x = 1.

In wu” + g(x)u = 0, 0 < x < 00, suppose that fix


| q(x)| dx < 00. Show that the DE
342 CHAPTER 10 Sturm-Liouville Systems

has a solution with lim,..,. u(x) = 1. {Hint: Show, by successive approximations, that
the integral equation u(x) = 1 — f2(¢ — x)q(i)u() dt has a solution.]
Suppose that all solutions of the DE u” + q(x)u = 0 are bounded as x -> 00 and that
Jop(x) dx < 00, p(x) > 0. Show that, for all A, all solutions of the DE u” + (g(x)
+ Ap(x))u = 0 are also bounded as x — 00, (Hint: Consider the inhomogeneous DE
u” + qu = —Apu, and show that the integral equation obtained by variation of para-
meters has a bounded solution.]

Show that, if k® > 0 and J3'|q(x) — k?| dx < 00, all solutions of the DE u” + q(x)u
= 0 are bounded as x > ©.

*9 Show that solutions of the generalized Laguerre DE

wt2v+| r 1 a
-—

(
=+
4

2
Js
are &
=
=
67/28
V2TM(x), where L®(x) = d*[L,(x)]/dx", for a = (k® — 1)/4 and
A =n — (k — 1)/2, n, k any nonnegative integers.

ADDITIONAL EXERCISES

1. Show that, if a, 6 > 0, the singular S-L system

4 [a + x1 — x)*! | +A1 + x1 —x)u=0, -l<x<1


with the endpoint condition that u remain bounded as x -> +1, has the eigenvalues
h, = n(n + a + b + 1) and eigenfunctions u,(x) = P(x) (Jacobi polynomials).
Obtain orthogonality relations for the Jacobi polynomials.

Using Rodrigues’ formula, show that between any two zeros of P there is exactly
(a,b)
one zero of n#l> >
if a,b —1.

*4 Derive the following identities for Legendre polynomials:


(a) J>1P.°(x) dx
=
=
2/(2n + 1) (b) SlxP,(x)Pi(x) dx = 2n/(4n® — 1)
[HinT: Use Rodrigues’ formula and integrate by parts.]

*5 Show that the Legendre DE, with the endpoint condition

lim [ — x?)u’(x)] = 0
xt]

has the Legendre polynomials as eigenfunctions, and no other eigenfunctions.

Show that there exists a bounded differentiable function g on a < x < 3, satisfying
the inequality g’ + g?/P(x) + Q(x) < 0, if and only if no solution of (Pu’’ + Qu
= 0 has more than one zero ona = x = 5.

*7 Show that, if [2[Q(x)| dx < 4/(6 — a), no nontrivial solution of u” + Q(x)u =


=
0
can have more than one zero in a = x < b. [Hint: By Theorem 3, it can be assumed.
that Q = 0. Changing coordinates so that a = 0, b = 1, use Ex. 6 with

(1/x) — 4, O<xx<}
g(x) -f Q(t)dt + 1/(x — 1) g$sx<1
18 Mixed Spectrum 343

*8, (Fubini). Show that if, for a < x = b,

P(x) + ple)? — g(x) = pile) + pi)? — a(x)

then, between any two zeros of a solution of u” + 2p,u’ + q,u = 0, there is at least
one zero of u” + 2pu’ + qu = 0. [HinT: See Ch. 2, Ex. B4.]

9. For a regular S-L system with aa’ < 0, and 86’ < 0, and A less than the smallest
eigenvalue, show that the Green’s function is negative.
CHAPTER 11

EXPANSIONS IN
EIGENFUNCTIONS

1 FOURIER SERIES

One of the major mathematical achievements of the nineteenth century was


the proof that all sufficiently smooth functions can be expanded into infinite
series, whose terms are constant multiples of the eigenfunctions of any S-L system
with discrete spectrum. The present chapter will be devoted to proving this
result for regular S-L systems and explaining some of its applications.
The most familiar example of such an expansion into eigenfunctions is the
expansion into Fourier series. We begin by recalling} from the advanced cal-
culus two basic results about Fourier series. The first of these is the following.

FOURIER’S CONVERGENCE THEOREM. Let f(x) be any continuously differ-


entiable periodic function of period 2x, and let

(1) a, = .f- F(x)coskxdx, b, if- F(x)sinkxdx


=
=

Then the infinite series

(2) Qy/2 + a, cos x + b, sin x + ag cos 2x + by sin 2x +---

converges uniformly to f(x).

Note that the nonzero terms a, cos x, 5, sin x, .. in (2) are actually them-
selves eigenfunctions of the periodic Sturm-Liouville system in question (Exam-
ple 3 of Ch. 10); hence f(x) is represented as a sum of eigenfunctions in (2).
However, we shall adopt the usual convention of referring to the normalized
cos kx and sin kx in (2) as the eigenfunctions of the system.
Though there exist continuous functions whose Fourier series are not con-
vergent, the following sharpened form of Fourier’s Convergence Theorem
applies to all continuous periodic functions.

+ Fourier’s theorem is proved in Courant and John, p. 594 ff; Fejér’s Theorem is proved in Widder,
p. 423.

344
1 Fourier Series 345

FEJER’S CONVERGENCE THEOREM. Let f(x) be any continuous periodic func-


tion of period 2x, and let

N-1

6)(x) = =
N | » |Oy
2
2 (a,coskx + b,sinis
n=0 k

= 4%
2
+ y (al’coskx + Bysinkx)

where a’ = (1 — (k/N))a, By=qd- (k/N))b,, be the arithmetic mean of thefirst N


partial sums of the Fourier series of f(x). Then the sequence of functions
On(x) converges uniformly to f(x).

The preceding results, which we will assume as known, yield as corollaries the
following statements about cosine series and about sine series. Let f(x) be con-
tinuous on 0 = x < 7; define a function g(x) for —a7 <= x < =x by the equation
g(x) = f(|x|). Since g(—x) = g(x), g(x) can be extended to an even periodic
function of period 27, which is defined and continuous for all real x. By sym-
metry, all coefficients 6, are zero in the Fourier series of g(x). Applying Fejér’s
and Fourier’s Convergence Theorems, we have the following corollary.

COROLLARY 1. Any continuous function on 0 S x Sw can be approximated


uniformly and arbitrarily closely by linear combinations of cosine functions. If the func-
tion is of class @' and f(0) =f"(r) = 0, then it can be expanded into a uniformly
convergent series of cosine functions:

(2) F(x) = (@o/2) + a, cos x + ag cos 2x +---

By a linear transformation of the independent variable, the preceding result


can be extended to any closed interval [a, b]; the required cosine functions are
the functions cos [ka(x — a)/(b — a)].
Similarly, if f(0) = f(r) = 0, define h(x) as f(x) on 0 = x =
= aw, and as
—f(—x) on —2 = x & 0. This gives an odd continuous periodic function of
period 27, in whose Fourier series all a, vanish.

COROLLARY 2. Any function of class @' on 0 < x S x that satisfies f(0) =


Ff Ge) = 0 can be expanded into a uniformly convergent series of sine functions.

The preceding corollaries are examples of expansions into the eigenfunctions


of the regular S-L systems defined by the DE u” + Au = 0 and the two separated
endpoint conditions u’(0) = u’(z) = 0 and u(0) = u(x) = 0, respectively. We
will prove below that analogous expansions are possible into the eigenfunctions
of any regular S-L system.
346 CHAPTER 11 Expansions in Eigenfunctions

2 ORTHOGONAL EXPANSIONS

Let ¢;(x), do(x), $3(x), . .. be any bounded, square-integrable functions on an


interval I: a < x < b, orthogonal with respect to a positive weight function p(x),
so that

(3) f y(X)bi(x)p(x) dx = 0 if h#k


Suppose that a given function f(x) can be expressed as the limit of a uniformly
convergent series of multiples of the ¢,, so that

oo

(4) f(*)
=
=

C40 (*) + Cyho(x) + cyhg(x) ++ - >


=
=

d= cabal)
h=1

Multiplying both sides of (4) by ¢,(x)p(x), and integrating term-by-term over the
interval—as is possible for uniformly convergent series—we get from the
orthogonality relations (3) the equation

f S(®)bx(x)p(x)dx = > f CHPn(x)x(x)p(x)dx = cyJ ox(x)p(x)dx


Hence, the coefficients ¢, in (4) must satisfy the equation

(5) C= |f Sx)ox)p(x) ax| / J on’(x)p(x)ax|


When the ¢, are the trigonometric functions, from this identity we obtain, as a
special case, the coefficients ¢, = a9/2, cg = a), Cz = 5), .. . of the Fourier series
(1)-(2) with p = 1, using the familiar integrals

iT x

“ane, f cos”kx dx = f
_

vr =
we
sin? kx dx = 4

for any nonzero integer k.


We can summarize the preceding result as follows.

THEOREM 1. If a function f(x) ts the limit f(x) = Xc,o,(x) of a uniformly con-


vergent series of constant multiples of bounded square-integrable functions ,(x) that
are orthogonal with respect to a weight function p(x), the coefficients c, are given by (5).

The preceding conclusion holds provided that one can integrate the series
Xcyb,(x)@,(x)
p(x) term-by-term on the interval /. This holds much more generally
than for uniform convergence, e.g., for mean-square convergence as defined in
§3.
3 Mean-Square Approximation 347

The preceding conclusion was justified by using the fact that uniformly con-
vergent series can be integrated term-by-term on any finite interval J. Many
other series of orthogonal functions also can be integrated term-by-term, and
formula (5), therefore, also holds for them, as we shall prove in later sections.

3 MEAN-SQUARE APPROXIMATION

So far, we have considered only uniformly convergent series, because these


can be integrated term-by-term. The notion of convergence most appropriate
for orthogonal expansions is, however, not uniform convergence but mean-
square convergence, which we now define.

DEFINITION. Let fand the terms of the sequence { f,} (n = 1, 2, 3,...) be


square-integrable real functions. The sequence { f,} is said to converge to f in
the mean square on I, with respect to the positive weight function p(x), when

(6) J Lfalx) —f(x}?p(x) dx > 0, asn— 00


Now, suppose that $), ¢2, #3, . . . form an infinite sequence of square-integra-
ble functions on the interval J, arthogonal with respect to the weight function p,
and let f,(x) = yidi(x) + - + + + ¥n6,(x) be the nth partial sum of the series
Leet Y@,(%). To make the partial sums f, converge in the mean square to f as
rapidly as possible, we choose the coefficients y, so as to minimize the
expression:

(7) E= E™M,-.-+%)= f |ff) —> rico] p(x)dx


Expanding (7), and using the orthogonality relations (3), the function E of
the variables 7, Yo, » Yn is given by the expression

ee) B= | stods— 2) m4|sowaxt Oat fotoas


Now, consider the numbers ¥,, Yo, » ¥, that minimize the function E.
Since E is differentiable in each of its variables, the minimum can be attained
only by setting every 0E/dy, = 0. That is, a necessary condition for a minimum
is that the y, satisfy the equations

0
=
=

—2J fo dx + onf ditodx


Solving for y,, we get y, = {Jf féxp dx}/{f ¢» dx,}, which is the same as equation
(5) for the c,, in another notation.
348 CHAPTER 11 Expansions in Eigenfunctions

We now show that the choice +, Cp. Where, as in (5),


=
=

(8) = |Jsee.eoe ax| / J $2600 ax|


does indeed give a minimum for E. A simple calculation, completing the square,
gives for E the expression

(8/) E= f E~ Tina | dx
= J fea: + Dat + on — 09 f de ax
The right side shows that the minimum is attained if and only if +, =
=

c;. This
proves the following result, for any interval J.

THEOREM 2. Let {¢,(x)} be a sequence of orthogonal square-integrable func-


tions, and let f be square-integrable. Then, among all possible choices of
v1 + Yn» the integral (7) is minimized by selecting y, =c,, where c, is defined by
(8).

The coefficients c, are called the Fourier coefficients of f relative to the orthog-
onal sequence ¢,.
The partial sum ¢,¢9(x) + + - - + ¢,,(x) in Theorem 1 is thus, for each n,
the best mean-square approximation to f(x) among all possible sums y,¢,(x) + - - -
+ ¥,6,(x); it is often called the least square approximation to f(x) because it min-
imizes the mean square difference (7). The remarkable feature of least-square
approximation by orthogonal functions is that the kth coefficient ‘y, in the list
(y) > » ¥) which gives the best mean-square approximation to f is the same for
all n = k. This “finality property’? does not hold, for example, in the case of
least-squares approximation by nonorthogonal functions, or of the approxima-
tions in Fejér’s Theorem, or of best uniform approximation minimizing the func-
tional sup, <,<4| f(x) — Dh=1 €,6,(%)
|.

Orthonormal Functions. The preceding formulas become much simpler


when the orthogonal functions ¢, are orthonormal, in the sense that [¢@,’p dx =
1. For a sequence of orthonormal functions, the formula for the Fourier
coefficients is c,
=
=
Ji féy dx. We can easily construct, from any sequence
@, of orthogonal functions, an orthonormal sequence y, by setting y, —
=

o,/L) ¢:7p dx)'” . For example, the functions

1
- coskx, Va sinkx
Von’
3 Mean-Square Approximation 349

are orthonormal on —a7 = x = @ with respect to the weight function


(x) =1
Substituting the condition f ¢,%p dx =
=
1 into (8’) and remarking that E is
nonnegative, we obtain the following important corollary.

COROLLARY | Let Z} cy, be the least-square approximation to f by a linear


combination of orthonormal functions o, Then

(9) a f F?@)e(x) dx
For the right member of (9) to be finite, it is necessary that f° be integrable
that is, that f be square-integrable with respect to the weight function p. When
this is the case, the integrals (8) are also well-defined by the Schwarz inequality
Under these circumstances, since the right side of (9) is independent of n, if we
let n tend to infinity, we will still have

(10) > 7 = f f 2(x)p(x) dx < +00 (Besselinequality)

That is, the Fourier coefficients of any square-integrable function f form a square-sum-
mable sequence of numbers if the $, are orthonormal

EXERCISES A

1
Show that a, cos kx + 0, sin kx = (1/m) [® 2 f(O cos [A(t — x)] dt.
2 Show that4 + Lf_, cos kx = sin [(2n + 1)x/2]/(2 sin (x/2)]
3 Using Ex. 2, infer that

+ 5° (@cos kx +6,sin kx) =-{ f@ 2 sin [(¢ — x)/2]


sin [(2n + 1)(¢ — x)/2]
d

(a) Prove in detail Corollaries 1 and 2 of Fejér’s Theorem, discussing with care the
differentiability at 0 and x of the periodic functions constructed
(b) Find necessary and sufficient conditions for a continuous function on [0, 7] to
be uniformly approximable by a linear combination of functions sin kx.

Show that, in Fejér’s convergence theorem

» ear |
sin (nt/2)
le)= a [fle+
n (t/2)
*6. Prove Fejér’s theorem, assuming that Ex. 5 holds

For the regular S-L systems in Exs. 7 and 8, (a) find the eigenvalues and eigenfunctions
(b) obtain an expansion formula for a function f€ @! into aseries of eigenfunctions
7. u” +ru = 0,u (0) = 0,w(r)
= 0,0SeS0
350 CHAPTER 11 Expansions in Eigenfunctions

8. The same DE with w’(0) = 0, u(r) = 0.

9. Show that the trigonometric functions are orthogonal, for any a, in

—W asx
=z Ta

4 COMPLETENESS

The most important question about a sequence of continuous functions ¢,


(k = 1, 2, 3, ...), orthogonal and square-integrable with respect to a weight
function p, is the following: Can every square-integrable function {be expanded
into an infinite seriest f = CP c,d, of the ¢,? When this is possible for every
continuous f,t the sequence of orthogonal functions ¢, is said to be complete.
Using the fundamental equation (8) on mean-square approximation, we can
reformulate the definition of completeness as follows. In order that

lim
n~e
1| 0 _ > va] 069dx= 0
it is necessary and sufficient that

lim
n7c
{| freae ~ dat foto ax| + > - aa?| op ax| =
=

Since the term in square brackets is nonnegative by the Bessel inequality (10),
and since J ¢,2p dx > 0 for any nontrivial ¢,, the limit is zero if and only if y, =
¢, for all k, and equality holds in the Bessel inequality (10). This proves the fol-
lowing results.

THEOREM 3. Asequence {@,} offunctions $,(x), orthogonal and square-integrable


with positive weight p(x) on an interval I, is complete if and only if

f f?(x)p(x) dx —
=

>| |f reese ax]/ §o2ere00as|


for all continuous square-integrable functions f.

+ Here and below, the equation f = L7? c,¢, is to be interpreted in the sense of mean-square con
vergence, namely, that the partial sums 2j-, ¢ converge in the mean square to the function f with
respect to p.

t If every continuous function can be expanded into a series LY° ¢¢,, then many discontinuous func-
tions also have such an expansion, convergent in the mean square. The class of all such functions is
that of all Lebesgue square-integrable functions (see §11). We are here considering only continuous
functions in order to avoid assuming a knowledge of the Lebesgue integral.
4 Completeness 351

COROLLARY |. If the $,(x) are orthonormal, a necessary and sufficient condition


jor completeness is the validity of the Parseval equality

ce

(11) f f*()p(x) dx = > | JF(%)bi(%)(x) tx]


kel

for all continuous square-integrable functions f.

For example, take the case of Fourier series. In the notation of (1), the con-
dition for the completeness of the functions 1, cos kx, sin kx on —7 Sx Sr
is that, for all continuous functions /,

(12) © | + > (a + i | = fre dx

1t follows from Fourier’s Convergence Theorem, integrating the squares of the


partial sums of (2), that the identity (12) holds if fis a continuously differentiable
periodic function.
We shall now prove that the identity (12) holds for all continuous periodic
functions f. By Fejér’s Convergence Theorem, the sums

N-1

(13) ox) = s + > ( - i)a,coskx + » ( - | b,sinkx


k=1

converge uniformly for -a7 = x = x to a continuous periodic function f(x).


Therefore, {*, 0%, dx converges as N > © to J", f? dx. Evaluating the integral
by (13), we find that

(14) lim +
N-~co |
a9

2
+o (1 - Ay (aj + | = fre dx
k=1

Now, by the Bessel inequality, we have

(14’)
“| a% + > (a, + by)|= [Ura <o
2 k=1

Since [1 — (k/N))? <= 1, it follows that, if we replace the sum in square brackets
on the left side of (14) byay?/2 + ON, (a? + 6,2), we will get an increasing
sequence whose limit is at least equal to ff? dx. But, by (14’), this limit is at most
equal to f f? dx. Hence, the limit is exactly { f? dx, and (12) is proved. Since
any continuous function on —7 & x = 7 can be given an arbitrarily close mean-
square approximation by a continuous function satisfying f(—a) = f(x), this
proves
352 CHAPTER 11 Expansions in Eigenfunctions

COROLLARY 2. The trigonometric functions 1, cos kx, sin kx (k = 1, 2,...)


are a complete orthogonal sequence in the interval —w Sx Sa.

Using the method of Corollary 2 of §1 and changing variables, we obtain


another corollary.

COROLLARY 3. The functions cos [ka(x — a)/(b — a)], (k = 0, 1, 2,...),


form a complete orthogonal sequence in the interval a Sx SS b,

We conclude this section with the following criterion for completeness of a


sequence of orthogonal functions, which relates the notion of completeness to
that of approximation in the sense of mean-square convergence.

THEOREM 4. Let {d,} (k = 1, 2,.. .) be any sequence of orthogonal square-inte-


grable functions on an interval I, relative to a weight function p > 0. The sequence is
complete if and only if every continuous square-integrable function can be approximated
arbitrarily closely in the mean square by a linear combination of the oy.

Proof. The condition is clearly necessary. Conversely, suppose that, given


¢ > 0, we can find a linear combination Xj. y,@, such that

S- Yah) pa <e
1f we replace each of the y, by the Fourier coefficients ¢, of f relative to ¢,—as
given by formula (5)—then by Theorem 2 the square integral on the left
decreases:

flr > ats) Ppdx<eé


n

k=1

But this is precisely what we had set out to prove.

5 ORTHOGONAL POLYNOMIALS

We shall now prove the completeness of the eigenfunctions of some of the


singular S-L systems studied in Ch. 10. These are the S-L systems on a finite
interval whose eigenfunctions are polynomials, such as the Legendre
polynomials.
We can use any positive weight function p(x) on an interval (a, b) with the prop-
erty that f ® x"p(x) dx is convergent for all n = 0, to construct an infinite
sequence of polynomial functions P,(x), P;(x), Ps(x), . . . with P,(x) of degree n,
which are orthogonal on (a, 4) with respect to this weight function, so that
b

J Pylx)P,(x)p(x) dx 0, men
=

(15)
=

a
5 Orthogonal Polynomials 353

Equations (15) define P,,(x) uniquely up to an arbitrary factor of proportionality,


the normalization constant.
Given a weight function p(x), one can compute the P,(x) explicitly from (15);
the computations will not be described here.t Instead, we shall derive some
interesting general properties of orthogonal polynomials.
We shall first establish the fact that, on any finite interval, such sequences of
orthogonal polynomials are complete. To prove this, we will need the following
result.

LEMMA. Every uniformly convergent sequence of continuous functions is mean-


Square convergent on any interval I, with respect to any integrable positive weight func-
tion (Jf, p dx < ©).

This follows immediately from the inequality

(16) J Lie) —fle)]2o(x) dx < max [(flx) —f9)") f (x) dx


valid when J is any finite or infinite interval. On an infinite interval, however,
we must carefully check the integrability of the weight function. For instance,
the functions f,(x) = n~'/? exp (~x?/n*) converge uniformly to the zero func-
tion on the interval —co < x < 00, but the integrals [%.. f,,°(x) dx do not con-
verge to zero.

Using this lemma, it is easy to prove the completeness of a sequence of orthog-


onal polynomials defined on a finite interval J, relative to any continuous integra-
ble weight function p(x) from the fundamental

WEIERSTRASS APPROXIMATION THEOREM. Let f(x) be any function contin-


uous on a finite closed interval a = x = b, and let e > 0 be any positive number. Then
there exists a polynomial p(x), such that | p(x) — flx)| S.«, forallxonasx=bt

From this theorem, and the inequality (16), we infer

THEOREM 5. Let P,(x) (n = 0, 1, 2, .. .) be a polynomial function of degree n.


For a fixed interval I: a = x = b, let o

f " P,,(x)P,,(x)p(x)dx = 0, if msézAn


where p(x) is a continuous integrable positive weight function. Then the orthogonal
polynomials P,,(x) are complete on I.

Proof. Let p(x) be any polynomial of degree n. We can find ¢, such that p(x)
— ¢,P,,(x) is a polynomial of degree n — 1 or less. Hence, by induction on n, we

+ It is the Gram-Schmidt orthogonalization process applied to the vectors 1, x, x, . . This process


can be applied in any Euclidean vector space (Birkhoff and MacLane, p. 204).
t See Widder, p. 426, or Courant-Hilbert, Vol. 1, p. 65.
354 CHAPTER 11 Expansions in Eigenfunctions

can express f(x) as a finite linear combination of Po(x), . . . , P,(x). By the Weier-
strass Approximation Theorem, we can approximate uniformly any continuous
function arbitrarily closely by a suitable polynomial p(x). By the preceding
lemma, every continuous function can, therefore, be approximated arbitrarily
closely in the mean square by a linear combination of the P,. The result now
follows from Theorem 4.

The completeness of Legendre, Chebyshev, Gegenbauer (or ultraspherical),


and other Jacobi polynomials (see Ch. 9, §11) follows as a corollary. But it is
harder to prove the completeness of polynomials orthogonal on semi-infinite
and infinite intervals, such as the Hermite polynomials and the Laguerre poly-
nomials introduced in the next section.’

EXERCISES B

1. Using Fourier’s Convergence Theorem, show that the eigenfunctions of u” + Au =


0 for the separated boundary conditions u(0) = u’(r) = 0 are complete on (0, x).

2 Show that, if f, > f in the mean square, and ¢,, cf are the Fourier coefficients of f,
f, relative to a given orthonormal sequence ¢,, then cf” — ¢, uniformly in &.
Using expansions into Legendre polynomials, obtain a formula for the best mean-
square approximation in |x| << 1 of a square-integrable function by polynomials of
degree = n.

*4 Using the Liouville substitution, obtain from Fourier’s theorem an expansion theo-
rem for functions f€ @°[—1, 1] into series of Chebyshev polynomials.
Show that, given square-integrable functions f;, » > fu a Sequence ¢,, > %, of
orthonormal square-integrable functions can be found for which f, is a linear com-
bination of ¢), »o lsik=sm.

*6 PROPERTIES OF ORTHOGONAL POLYNOMIALS

We shall now develop some of the properties of orthogonal polynomials


which depend only on the fact that they are orthogonal, irrespective of com-
pleteness. These properties apply to the polynomials whose completeness was
proved in §5 (see last paragraph). They apply also to the Hermite and Laguerre
polynomials, whose completeness will not be proved in this book. All these poly-
nomials have been met before, except the Laguerre polynomials, which we now
define.

Example 1. Consider the singular S-L system consisting of the DE

(17) (xu’y’ + a+———


4
(2
— x)
Ju=o, 0O<x< oc
with the endpoint conditions that u(x) is bounded as x — oo and as x — 0.
—x/2
Setting u(x) = v(x)e , we get the Laguerre DE

(18) xv” + (1 — xu’ + av = 0


6 Properties of Orthogonal Polynomials 355

Trying v = Df. a,x", the Method of Undetermined Coefficients (Ch. 4, §2) gives
the recurrence relation 4,,, (k — a)a,/(k + 1)?. Hence, we have
=
=

ak + 1)a
(19) a, = (—l)ala — l(a — 2)--:
(ety?

The series is a polynomial if and only if a = n, a nonnegative integer; otherwise,


it represents a function that grows exponentially at infinity. Normalizing the
polynomial (for a =
=
n) by the condition a4, = (—1)"/(n!), we get the Laguerre
polynomialst:

(19’) L(x) = >> (-t


k=0 () =
k!

For example, Lo(x) = 1, L(x) = 1 — x, Le(x) = 1 — 2x + }x°,

L(x) = 1 — 3x + $x? —h3,..

Thus, the functions L,(x)e~*/? are eigenfunctions of a singular S-L system. These
functions are certainly square-integrable, together with their derivatives, hence
Theorem 2 of Ch. 10 applies, giving the orthogonality relations

f e"L,,(x)L,(x) dx = 0, men

We shall now consider some of the fundamental properties of an arbitrary


sequence of orthogonal polynomials (not necessarily solutions of a DE) Po(x),
..., P,(x),..., where P, is of degree n. As in §5, we assume that the weight
function p(x) is such that all products x"p(x) are integrable on J. We do not
assume that the interval J is finite.
We first derive a result similar to the Sturm Oscillation Theorem.

THEOREM 6. Let {P,} (n = 0, 1, 2, . . .) be any sequence of polynomials orthog-


onal on a given interval (a, b) where P,,(x) has degree n. Then P,, has n distinct zeros,
all contained in the interval (a, 6).

Proof. Suppose that P,(x) has fewer than n zeros in (a, 6). Let x, » Xm (m
<n) be those zeros at which P,(x) changes sign. Then the polynomial (x — x,)(x
— X9)+ + + (x — %,)P,(x) would be of constant sign. Hence

J (x ~ x) + + + & — X_)P,(x)p(x) dx # 0,
a
m<on

+ The normalizing condition a, = 1 is also often used, and makes some formulas simpler.
356 CHAPTER 11 Expansions in Eigenfunctions

But (« — x;) - + + (« — %,) is a polynomial of degree lower than n. Therefore,


it can be written as a linear combination of the polynomials Po, » Pins» Say
Ur ¢.P,(x), m <n. Hence, we have

J (x — x1) + + + & — Xn)P,(x)p(x)dx = f > CePi(x)Pa(x)o(x)dx


-Ya f P,P,pdx =
=

0
a

a patent contradiction, since the integrand is positive except at the x, Hence,


P,(x) has at least n zeros on (a, b). Since a polynomial of degree n has, at most,
n zeros, the proof is complete.

Next, we shall establish a recursion formula for an arbitrary system of orthog-


onal polynomials.

THEOREM 7. Any three orthogonal polynomials of consecutive degree satisfy a lin-


ear relation

(20) Pr+i(X) = (Apx + B,)P,(x) + CyP,-1()

for suitable constants A,, By C,.

Proof. First choose A,, such thatP,,, ,(x) — xA,P,(x) is a polynomial of degree
n or less, so that

Prix) — XA,P,(x) = YoPa(x) + V1Pa-1) + + + + YnPo

Multiplying both sides by P,(x)p(x), integrating from a to 6, and using the


orthogonality relation, we find that y, = 0 for k = 2, 3,. , n. Hence

Pi 41%) a xA,P,(x) = oP,(x) + nP, —1(%)

Therefore set yp = B, and y, = C,, q.e.d.

The numerical values of the constants A,, B,, C,, in Theorem 7 depend on the
normalizing factors used to define the orthogonal polynomials considered. For
convenience, we have listed in Table 1 the recursion coefficients for some com-
mon polynomials.

EXERCISES C

Establish the following formulas for Hermite polynomials (see Ch. 4, §2):

1. Asie) = 2xH,(x) — 2nH,,-1(x).


2. Aix) = 2nH,_,(x).
6 Properties of Orthogonal Polynomials 357

Table 1. Recursion Coefficients

Polynomial A, B, C,
2n+1 —n
Legendre 0
n+1 n+1
Chebyshev 2 —1
2n +X l—n-— 2A
Gegenbauer
n+1 n+1
Hermite 2 —2n
—l1 2n+1 —n
Laguerre
n+1 n+1 n+1

*3 Chao A,(x)t"/k! = e*-°/? (generating function).


Establish the recursion formula for the Laguerre polynomials:

L(x) = L(x) — Lisi)

(Hint: Differentiate the recursion formula for L,,.]


Show that for the functions ¢,(x) = ¢” d"/dx"(e"*x") are orthogonal in 0 < x
<= Ww,

Infer from Ex. 5 that ¢,(x) = (n!) e-*L,(x), where L, is the nth Laguerre
polynomial.

Show that, if L,(x) is the Laguerre polynomial of degree n, then d'[L,,(x)]/dx* satis-
fies the DE

xv” + (k+1—
x)’ + (n—- kv = 0

Prove the recursion formula of Table 1:

(m + 1)Ln+i(x) = (2n + 1 — x)Lp(x) — nL, 1%)

Let P,(x) be a sequence of orthonormal polynomials with weight function p in


a<x <b, and let c, = S¢f(x)P,(x)p(x) dx. Show that the partial sum

6,(x) = >. cPy(x)


k=0

coincides with f(x) in at least n + 1 points of the interval. [HinT: Use a method
similar to the proof of Theorem 6.]

*10 Show that, in Theorem 6, between any two zeros of P,, there is exactly one zero of
Pit
*1] Show that the Legendre polynomials are the only Gegenbauer polynomials for
which the maximum of | P3(x)|, namely P£(1), is independent of n.

*12 (a) Expand the function (1 — 2xh + h9)~'/%(|x} < 1) into a series of Legendre
polynomials, and show that the nth Fourier coefficient is 2h"/(2n + 1).
358 CHAPTER 11 Expansions in Eigenfunctions

(b) Obtain from (a) the formula

(1 — 2xh + AN? = y HP,(x) for Jal


< V2-1
k=0

*13. Show that the generating function for the Laguerre polynomials is,

eo

_ exp | 1-t
To i—xt
= >" #L,(x)/n!
n=O

In Exs. 14-17, D = d/dx and p(x) is positive. The method of proof is to find-by induction
a S-L equation satisfied by the expressions given.

*14. Show that the only orthogonal polynomials of the form

Palx) = K,(0(x))~'D"[o(x)] for p(x) €

are the Hermite polynomials.

*15. Show that the only orthogonal polynomials of the form

Palx) = K,(0(x))~'D"[o(x)(ax + B)]

are the Laguerre polynomials, after a change of independent variable.

*16. Show that the only system of orthogonal polynomials of the form

K,(0(x))~'D"[(x)(ax? + bx + o)]

are the Jacobi polynomials, after a change of independent variable.

*17. Show that the only sequences of orthogonal polynomials that satisfy a Rodrigues
formula p,(x) = K,[0(x)17'D"[o(x)p(~)], where p is a given polynomial, are the
Jacobi, Laguerre, and Hermite polynomials.

*7 CHEBYSHEV POLYNOMIALS

The Chebyshev polynomials T,(x) were introduced in Ch. 9, §11, as solutions


of the self-adjoint DE

(21) [d — x)wy + A — x*)7'?u = 0, -l<x<l

The Liouville normal form of this DE is um + Au = 0, which is obtained


by setting w = u and
@ = Jf*, dé/V1 — £*, or x = —cos0,0<6<7. For
integral n and \ = n?, two linearly independent solutions of this equation are

(21a) cos n§ = T,(cos #) = T,(x)


7 Chebyshev Polynomials 359

and S,(x) given by the formula

(21b) S,(x) = sin n? = sin 6U,,_,(x)

The U,,(x) defined by (21b) are also polynomials of degree m, called Chebyshev
polynomials of the second kind. Their theory is parallel to that of the T,,(x) and is
developed in Exs. D3—D7.
From the forms of the preceding explicit solutions, we see that the functions
T,(x) are eigenfunctions of the singular S-L system defined from (21) by the
boundary conditions that u’(—1) and w’(1) be finite. All solutions of (21) are
bounded at the singular points x
=
=
+1, as is apparent from inspection of
explicit solutions (21a) to (21b) and also from a calculation of the roots » =
=

0,
3 of the indicial equation 2v? — » =
=
0 of the normal form of (21). But only
multiples of the T,,(x) have bounded derivatives at the endpoints.

Minimax Property. The most striking property of the Chebyshev polyno-


mials is contained in the following result.

THEOREM 8. Among all monic polynomials P(x) = x" + URz) a,x" of degree n,
2'-"T.(x) minimizes max_)=,<,|P(x)| (Minimax property).

Proof. For n = 1, the result follows by inspection. For n = 2, it follows by


induction from the recursion formula

T,(x) = 2xT,,-\(%) — T,-2(%)

which is equivalent to the trigonometric identity

cos (m0 + 6) = 2 cos # cos m6 — -cos (m0 — 6), n=m+1

Next, since T,,(cos 6) = cos n@, we have that

max |2)-"7,(x)| = 2!"


-lsx=1

In order to establish the statement, it therefore suffices to show that for any
monic polynomial of degree n we have

max |x" + a,x"! +--+ +a} = 2!”


-Il<x<1

Suppose this were not so. Then, we could find a monic polynomial p(x) of degree
n such that max_j<,<; |p(x)| < 2'~”. Now, the polynomial 2!-"7 (x) — p(x) is
of degree n — 1. We shall reach a contradiction by showing that this polynomial
has n distinct zeros.
To see this, notice that the polynomial 2!-"7 (x) takes alternately the values
+2!" at n + 1 points xo -l<x<
=
=
<x, = 1, which immediately
360 CHAPTER 11 Expansions in Eigenfunctions

results from T,,(cos 6) =


=
cos n6. Since | p(x,)| < 2'~"| T,(x,|, it follows that the
polynomial 2'~"T,,(x) =

p(x) takes alternately positive and negative values at


n + 1 points. Therefore, this polynomial of degree n — 1 must have at least
n distinct zeros, and so, must vanish identically, q.e.d.

Chebyshev Equioscillation Principle. An important partial generalization


of Theorem 8 is the Chebyshev Equioscillation Principle. This states that, if
F(x) € @[a, b], then there is a unique polynomial p(x) of degree n — 1 that mini-
mizes max | p(x) — f(x)| on [a, 6]. The difference P(x) = p(x) — f(x) vanishes at
n — 1 points on [a, 5], and |P(x)| assumes its maximum value at n points.
Setting f(x) = x”, we have P(x) = T,,(x).

EXERCISES D

1. Show that the DE [(1 — x2))u’}’ + AU. — x*)!7u = 0 can be reduced to a DE with
constant coefficients by setting v(8) = (sin 6)u(cos 8).

Show that the endpoint conditions lim,_.+; V1 — x u(x) = 0 give an S-L system with
eigenvalues X,,
=
=
n(n + 2), from the DE of Ex. 1.

Show that the eigenfunction belonging to the eigenvalue A, is a Chebyshev polyno-


mial of the second kind.

Using Ex. 3, obtain an expansion theorem of a smooth function into a series of


Chebyshev polynomials of the second kind.
Show that 7,(x) = U,(x) — xU,_ (x).

Show that (1 — x?)U,_,(x) = x7,(x) — Ty+1(x).


Express U,(x) in terms of the hypergeometric function.

Expand the function arccos x on (—1, 1) into a series of Chebyshev polynomials


of the first kind.

*9 Infer the Weierstrass Approximation Theorem from Fejér’s Theorem.

EUCLIDEAN VECTOR SPACES

The concepts of mean-square convergence and completeness have suggestive


geometric interpretations. These interpretations are based on the properties of
inner products.
Consider the set of all real functions f, g, h, , continuous and square-
integrable on an interval J, with respect to a fixed positive weight function p.
The interval J may be open or closed, finite, semi-infinite, or infinite. Define the
inner product of two such functions /, g as the integral

(22) (ia = f S(=)g(x)p(x)dx, p(x) > 0


The following formulas are immediate:

(f+ gh) = (fh) + & A), (fa =@Sf)

(cf, 8) = eCf, 8); (f, f) > 0, unless fH0


8 Euclidean Vector Spaces 361

Hence, with respect to the inner product ( f, g), this set of functions is a Euclid-
ean vector spacet (or “inner product space’’).
For real functions, the integral (6) in the definition of mean-square conver-
gence, is the inner product (f, — fi f, ~ f); hence it is the square of the distance
lh —fl = i -ff — f)'” between f, and fin the Euclidean vector space E.
Therefore, f, = f in the mean square, relative to p, means that the Euclidean
distance from f,, to fin E tends to zero.
This distance enjoys the properties of distance in ordinary space, including
the triangle inequality and the Schwarz inequality

If - tel = &) = Il: Hell cos Z(f, g) = ISM « Ilell

The Schwarz inequality shows that fand g are orthogonal if and only if the angle

6 = Lf, g) = arceos [(f, g)/Ifll - liglll, 0Os0<17

is 90°; it gives a geometrical interpretation to the definition of orthogonal


functions.
We shall now generalize Theorems 1, 2, 3, and 4 to an arbitrary Euclidean
vector space E.
If {$,} is a sequence of orthogonal vectors in E, and f € E is given, consider
the squared distance

E(1, >Yn)= Ilr > Vr = (r- > Vibesf —> vit)


Defining c, = (/, ¢)/(@s. ,), we obtain, as in §3,

Ilr > Vide = (ff) - > [eer dp] + > [ce — Wor o,)]
Geometrically, the least-square approximation Lj. ¢,6, to fappears as the orthog-
onal projection of the vector f onto the subspace S of all linear combinations
N11 tess + YnOnof oy, %. This is because, in the orthogonal projection
onto a subspace S of a vector c issuing from the origin, the component ofc per-
pendicular to S$ is the shortest vector from S to c. The coefficients y, are given by
the direction cosine formulas of analytic geometry.
Completeness of the ¢, is defined as in §4, as the property that

"0, oo

tim |If- > Cer


no
that is, > CeDp =
=

f
k 1

+ The reader should familiarize himself with this notion; the space is not assumed to be finite-
dimensional.
362 CHAPTER 11 Expansions in Eigenfunctions

for every fin E. 1t has a simple geometric interpretation in any Euclidean vector
space E. The relation f = LP ¢,@, holds if and only if the distance ||f — 27 ¢¢,|
tends to zero as n — 00, That is, as in the proof of Theorem 3, the condition
for completeness is that we can approximate any f arbitrarily closely by finite
linear combinations ¢,¢, + + c,, of the orthogonal vectors ¢, whose
completeness is in question. This idea is most vividly expressed in terms of the
concept of a dense subset of a Euclidean vector space.

DEFINITION. A subset S of a Euclidean vector space E is dense in E if and


only if, for any fin E and positive numbers 6 > 0, an element s can be found in
S such that ||s — fl] <4.

As in Theorem 4, a set {¢,} of orthogonal elements of E is complete if and only


if the set S of all finite linear combinations £7 +,@, of the ¢, is dense in E.
The Parseval equality is easily derived in any Euclidean vector space E.
Consider the sequence of best mean-square approximations

Sn = > CEP: c= (f, Od/Ors On)


k=1

to a given vector fin E. If the finite linear combinations 7 ¢, are dense in E,


then the square distance

o,)"
k
> Cie — f
1
"eUGN-> | (f,
(Op oy)
k=1
|=o

must tend to zero as n — 00. Hence, if the sequence {¢,} is complete, then

(23) ¥ IG 1)"/(besor)] = G Af (Parsevalequality)


Applying Parseval’s equality to the vector f + g and then expanding and sim-
plifying, we obtain more generally

(23’)

valid whenever {¢,} is complete and f, g are square-integrable. Even if Parseval’s


equality fails, we still get

= (f, bi)” (Bessel inequality)


(24) =/
k=l] (di, oy)
9 Completeness of Eigenfunctions 363

If Parseval’s equality fails, then strict inequality will occur in (24) for some f in
E. For such an f, we have by Theorem 1

Ilr > wt = IIr- >t > Vi>0

for any choice of y,. Since 6 is independent of n, this shows that the ¢, cannot
be a complete set of orthogonal vectors. This gives another proof of Theorem
3, which we now restate for an arbitrary Euclidean vector space.

THEOREM 9. A sequence {q,} of orthogonal vectors of a Euclidean vector space E


is complete if and only if the Parseval equality (23) holds for all f in E.

9 COMPLETENESS OF EIGENFUNCTIONS

The completeness of the eigenfunctions of a regular Sturm-Liouville system


is a consequence of the asymptotic formulas of Ch. 10, and a geometric property
of sets of orthonormal vectors in Euclidean vector spaces. This property is stated
in the following theorem of N. Bary.

THEOREM 10. Let {¢,} be any complete sequence of orthonormal vectors in a


Euclidean vector space E, and let {y,} be any sequence of orthonormal vectors in E that
satisfies the inequality

(25) > Wn~ dnl” < +0


Then the W,, are complete in E.

This result will be proved in §§10, 11. It is plausible intuitively, because it


asserts that completeness is preserved in passing from a set of orthonormal vec-
tors ¢, to any nearby system.
Assuming Theorem 10 provisionally, we can establish the completeness of the
eigenfunctions of a regular S-L system as follows.
Consider the asymptotic formula (56) of Ch. 10,

ni(x — a) O(1)
u,(x) =
(b — a) | (b — a) | n

If u,(x) is the nth normalized eigenfunction of a regular S-L system in Liouville


normal form, and if ¢,(x) = V2/(6 — a) cos (na(x — a)/(b — a)), this gives
| u,(%) — ,(x)| = O(1)/n. Squaring and integrating, we obtain

lu, — dg? = f [up(t) — d(x]? de = om)


364 CHAPTER 11 Expansions in Eigenfunctions

Since the series 1 +34 + $+ -°-°+> + 1/2 +- converges (to 1/6), this
implies the following lemma.

LEMMA. Let u,(x) be the nth normalized eigenfunction of any regular S-L system
in Liouville normal form, with a’B’ # 0, and let

n(x — @)
bn(x) =
(b
— a) | (b
— a) |
Then the , are an orthonormal sequence, and

(26) > lle, — Pall? < +00


n=l]

Since the cosine functions are complete (by Corollary 3 of Theorem 3), it
follows from this lemma and Theorem 10 that the eigenfunctions of any regular
S-L system in Liouville normal form with a’6’ # 0 are a complete set of ortho-
normal functions.
As shown in Ch. 10, §9, the transformation to Liouville normal form, applied
to the (normalized) eigenfunctions, carries the inner product

(.¥) = J b(x)Y(x)p(x)dx
into the inner product

(u, v) = f a
u(x)v(x) dx

Therefore,t the change of variable that leads to a Liouville normal form carries
complete orthonormal sequences, relative to a weight function p, into complete
orthonormal sequences. Hence, the eigenfunctions of regular S-L systems not
in Liouville normal form are also complete.
Finally, since similar arguments cover the case a’6’ = 0, we have the following
result.

THEOREM 11. The eigenfunctions of any regular S-L system are complete in the
Euclidean vector space of square-integrable continuous functions, on the interval
a=x Sb, relative to the weight function p.

+ Since distance and convergence are defined in terms of inner products in any Euclidean vector
space.
10 Hilbert Space 365

*10 HILBERT SPACE

The set of real numbers differs from the set of rational numbers by the com-
pleteness property that every Cauchy sequence of real numbers is convergent.t
This property of completeness has an analog for Euclidean vector spaces (and,
more generally, for metric spaces).

DEFINITION. In a Euclidean vector space E, a Cauchy sequence is an infinite


sequence of vectors f, such that

(27) fm — frill > 0 as m,n


—>OO

The space E is called complete when, given any Cauchy sequence { f,}, there exists
a vector fin E such that |/f, — {|| ~ 0 as n > oo. A complete Euclidean vector
space is called a Hilbert space.
Any finite-dimensional Euclidean vector space is complete, but the Euclidean
vector space of continuous square-integrable functions defined in §8 is not com-
plete, as will appear presently.

Example 2. Let (£;) denote the Euclidean vector space of all infinite
sequences a = {a,} = (a), a9, 43, .. .) of real numbers which are square-sum-
2
mable, that is, which satisfy Xa, < +00, The vector operations on these
sequences are performed term-by-term, so that a + b is the sequence (a, + 4,
dy + do, dg + bs, ...). Inner products are defined by the formula

(28) (a, b) = x a,b, = a,b, + Agdo + asbs +-:


t=]

LEMMA 1. The space (€5) is a Hilbert space.

Proof. Our problem is to prove completeness. To this end, let {a"} be any
Cauchy sequence of square-summable sequences. That is, let

lim |ja” — a"? =


mn
lim
m,n—0o |3 (az — aj)?|
k=1
=
=
0

For each fixed k, the sequence of real numbers aj (n = 1, 2,.. .) (the Ath com-
ponents of a”) is a Cauchy sequence and, therefore, converges to some real num-
ber a,. Let a = {a,} (k = 1, 2, 3,...). We must prove that the sequenceais
square-summable and that a” > a in the Euclidean vector space.
Since ||la"|| — |la"|}| <= lla” — a™|| by the triangle inequality, it follows that
the sequence ||a"|| is bounded. Let \/M be an upper bound. Then, for all
integers n, we have Dj-1 (az)? <= M. Letting n — oo in this finite sum, we get
D1 (a,)* = M. SinceN is arbitrary, it follows that al? = D2, (@,)? < M.

+ Courant and John, p. 94 ff.; Widder, p. 277.


366 CHAPTER 11 Expansions in Eigenfunctions

Hence, a is a square-summable sequence. Moreover, given € > 0, n can be found


so large that |ja” — a™||? < ¢ for all m > n. Therefore, for every integer N
we have Li, (a2 ~ af)? < ¢ Letting m — 00, we obtain Lf, (a7 — a)? Se.
SinceNis arbitrary, this implies [7%, (aj — a,)® < ¢, q.e.d.
The property of Hilbert space that is most useful for establishing the com-
pleteness of eigenfunctions is the following.

THEOREM 12. An orthogonal sequence {p,} of vectors of a Hilbert space is complete


if and only if there is no nonzero vector f orthogonal to all the ¢,,

Proof. Let f be a vector, and let {¢,} be a sequence of orthogonal vectors in


a Hilbert space #. Let g, = Df. o, be the nth least-square approximation to
f by a linear combination of ¢), : 6,. Then, as before, we have c, = (f, ,)/
(ob, &,) and, for m > n,

m wo

( j; oi)" (f, oy)”


l@n — all? = >-
nt+l1 | (bi dy) | <

»
nt+1 | (bis bn)

By the Bessel inequality (24), the series D714 [(f ¢))°/(¢s, $)] of positive num-
bers is convergent; hence, the last sum in the preceding display tends to zero as
n — 00. That is, the sequence {g,} is a Cauchy sequence.
It follows that #, being complete, contains a vector g to which the g,
converge; let h = f — g = lim, (f — gn). Then, since (f — gn, 6) = 0 for
all m = k, we have in the limit, as m — 00, (h, @,) = 0 for all k. By Theorem 9
(Parseval’s equality), h = 0 for all f if and only if {,} is complete. This completes
the proof.

Remark. A complete orthonormal sequence {¢,} in the space (¢,) is obtained


by choosing ¢,
=
=
e*, the kth unit vector whose kth component is 1 and whose
other components are all 0.

The Euclidean vector space @[a, 6] of all continuous functions on a finite


interval a = x. < bis not complete; that is, it is not a Hilbert space. The following
is an example of a mean-square Cauchy sequence of continuous functions that
does not converge to any continuous function. In —1 = x = 1, let f(x) = 0
for ~1 =x
= 0, f(x) = nx for
0 = x = 1/n, and f(x) = I for1l/nsx=1.
The limit function f.(x) equals 0 for -1 =x =Oand1lfor0<x=<=1.
Though the Euclidean vector space @[a, b] is not complete, it can be embed-
ded in the (complete) Hilbert space (¢), as follows. First, make the change of
independent variable t = a(x — a)/(b — a), in order to map the interval [a, 3]
on [0, z). Then for each f{x) € @[a, 6), expand f(t) = f(a + (6 — a)t/z) into the
cosine seriesf(t) = Xe, cos kt, k = 1, 2,.... The vector f = (co, ¢1, Ca, . . .) defines
an element of the space (¢9) by the Bessel inequality.
The correspondence f — f maps the space @[a, b] into the space (¢); more-
over, it preserves vector operations: f + g — £ + g and Af — Mf. By Parseval’s
generalized equality (23’), which is applicable since the cosine functions are
11 Proof of Completeness 367

complete (Theorem 3, Corollary 3), we know that

f, g) = | (6
~
| f "flxg(x)dx = f .fs) at
Hence, the correspondence f — f also preserves inner products (up to a constant
normalizing factor). Therefore, it also preserves lengths (f, f 2, In particular,
= 0 implies that {7 f() dt = 0. Hence, since f is continuous, it also implies
that f(t) = 0. In conclusion, we have proved the following result.

LEMMA 2. The Euclidean vector space @{a, b] (—0 <a <b < +00) can be
embedded in the Hilbert space (€.), with preservation of vector operations and inner
products.

More generally, let ¢, be a complete sequence of orthonormal vectors in an


arbitrary Euclidean space E. Then the mapping f > c = {c,}, where c, =
=

(f, o1)
for fin E, defines an embedding of E as a subspace of (€9). In the same way as
before we obtain the following lemma.

LEMMA 3. Every Euclidean vector space with a complete sequence of orthonormal


vectors can be embedded in (£5), with preservation of vector operations and inner
products.

In view of this lemma, it suffices to prove Theorem 10 under the assumption


that the Euclidean vector space E is complete, that is, that FE is a Hilbert space.
We shall make this assumption from now on.

*11 PROOF OF COMPLETENESS

We are now ready to prove Theorem 10. But to bring out more clearly the
idea of the proof, we first treat a special case.
We define a sequence of orthonormal vectors of a Hilbert space to be an
orthonormal basis if and only if it is complete.

LEMMA. Let {@,} be an orthonormal basis in the Hilbert space Ff. Let {W,} be an
orthonormal sequence in #f satisfying the condition

(29) > Ie —dull?<1


Then the sequence {y,} is also an orthonormal basis in #.

Proof. If the sequence y, were not a basis, then we could find a nonzero
vector h-orthogonal to every y, by Theorem 12. The inner product of this vector
with @, would be given by

(A, dy) = (h, 7) + (h, Or — vy) = (h, di ~ Vi)


368 CHAPTER 11 Expansions in Eigenfunctions

Squaring and using the Schwarz inequality, we would have

(29) (h, bs) = (h, be — We)? S HAI? Ide — Vall?

Summing with respect to k, we would obtain

jal? = > (h,6) = me> lle —Well?< AIP


in evident violation of Parseval’s equality (Theorem 9), since the ¢, are an
orthonormal basis.

COROLLARY 1. Replace condition (29) by the weaker condition

co

(30) > We — ball? <1


k=N+1

for some integer N. Then every element h of 7 orthogonal to $y, Gy and toWys1,
Vn+2, .. must vanish.

Proof. Any such element h satisfies the inequality (29) for k > N. Indeed,
summing over all k, we have, since (h, ¢,) = 0 fork = 1, 2,...,N,

eo

all?
=
=

> (h,o))? = 3 (h, oy)” =< All?


feN+l
> Nee — Vall? < Wry?
k=N+1

again contradicting Parseval’s equality.

COROLLARY 2. If ¢, is an orthonormal basis and W;, an orthonormal sequence


satisfying (30), then every element of Wf orthogonal to y+1, Wn+2 .. and to the
elements

(31) mn = Gn ~ 3 (br VidWes


k=N+1
n=1,2,...,N

must vanish.

Proof. For any such element h, we have

(hy bn) = (A, 9) + > On Vh, i) = 0


k=N+1

forn = 1, 2,..., N. Thus # also satisfies the conditions of Corollary 2 and


therefore vanishes.
11 Proof of Completeness 369

The proof of Theorem 10 can now be completed as follows. Choose an inte-


ger N so that

oo

>> We — bell? <1


k= N+1

By Corollary 2, any element h of 7 which is orthogonal to the elements Wy,,


Wysz,--. and to the elements 7, (n = 1, 2,. , N) defined by formula (31) must
vanish. Denote by S the set of all elements of # orthogonal to Wyi1, nie,
Evidently S is a vector space containing 7, » Ny. By virtue of the previous
remark, the vector space S contains only the linear combinations of these ele-
ments. In other words, S$ is a finite-dimensional vector space whose dimension
is at most N.
But the elements ¥, Po, ... , Wy also belong to the vector space S, and they
are linearly independent (they are an orthonormal sequence!). Therefore, the
elements y;, Po, ., Wy are a basis for the vector space S.t Hence, the elements
Ms No» » Ny are linear combinations of ¥j, Y. ..., Hy. It follows that any
element of 7 that is orthogonal to all the y, must vanish, because such an ele-
ment is also orthogonal to , » Nn and to Wy3, Uy+e, .. We conclude that
the sequence {y,} is complete, proving Theorem 10.
Theorem 10 implies Theorem 11, as was already shown in §9.
1t is natural to ask whether the sums Xc,¢, having square-summable coeffi-
cient sequences can also be interpreted as functions. This question can also be
answered in the affirmative, by use of the Lebesgue integral. Given any complete
family {@,} of orthonormal functions on [a, b], the partial sums Lj.,¢,¢, with
square-summable coefficient sequences {c,} converge in the mean square of a func-
tion f(x), which is square-integrable for the Lebesgue integral. Conversely, if f(x) is
a Lebesgue square-integrable and if c, (f, %,), the partial sums f, = Lp.16,b;
=
=

converge to f(x) in the mean square. That is, the metric completion of the space
@[a, 6] is precisely the Hilbert space £,[a, 6] of all functions on [a, 5] whose
squares are Lebesgue integrable. It is this space that is really appropriate for
the theory of expansions in eigenfunctions.

EXERCISES E

1. Show that in a finite-dimensional Euclidean vector space E, a vector subspace Sis


dense if and only if S = E.
2. Show that an orthonormal sequence ¢, (k = 1, 2, 3, . . .) ina Euclidean vector space
is complete if Parseval’s equality (23) holds for all f in a dense subset.

3. (Vitali). Show that an orthonormal sequence ¢,(x) (k = 1, 2, .. .) of continuous func-


tions on a < x < bd is complete if and only if

(fae) =2~6 axx<=b


+ As shown in Apostol, Vol. 2, p. 12.
370 CHAPTER 11 Expansions in Eigenfunctions

{Hint: Show that linear combinations of the functions f(x) = |x — c| form a dense
subset, of @[a, 6], and apply Ex. 2.)

*4 (Dalzell). Show that a sequence of continuous orthonormal square-integrable func-


tions ¢,(x), a < x = b, is complete if and only if

> (f aoa) ae = OH

(Hint: Let g(x) = x — a — tay (S% (8) dt)®, and show that q(x) = 0 by establishing
q(x) = 0 and f® q(x) dx = 0, ¢ continuous, applying Ex. 3.]
Assuming the equality Df2,1/k? = 2/6, infer from Ex. 4 that the trigonometric
functions cos kx, sin kx, k = 0, 1, 2, . , are complete on (—7, =).

(Moment problem). Let f(x) be continuous, a = x = 6, and let

n=
f; x"f(x) dx

Show that if all the moments m, vanish, f(x) = 0. [HINT: Use the Weierstrass Approx-
imation Theorem.]

*7 Prove the completeness of the eigenfunctions of any regular S-L system with bound-
ary conditions u(a) = u(b) = 0.

*8 Let G(fi, fa .»f,) = det (@,), where a, = (f;, f). Show that the minimum of
If — © afill is equal to GU, fi, fo .? td /C des . » f). (Hint: Interpret the
determinants as volumes.]

Show by a counterexample that Theorem 10 does not remain valid if the y,, are
allowed to be nonorthogonal unit vectors.

10 Consider the orthogonal sequence ¢,, and let ¥, = ¢,, k = 2, ¥, = 0. Then


x (lo, — Yall? < 00, but (p,} is not complete. Which hypothesis of Theorem 10 is
violated?

11 Show that the first-order complex DE iu’ + (q(x) + Alu = 0, a = x < 3b, with the
boundary condition u(a) =
=
u(b) and i = —1, has a complete sequence
of eigenfunctions for any real continuous q(x).

12 Show that linear combinations of the trigonometric functions cos kx, sin kx are dense
in —a < x <7, relative to any continuous positive weight function.
APPENDIX A

LINEAR SYSTEMS

1 MATRIX NORM

As in Ch. 6, §4, any system of linear DEs can be reduced to a system of n


first-order DEs and so written in matrix notation as

(1) x’(f) = A(x

where x(f) is a column vector of length n and A(#) ann X n matrix, both depend-
ing on ¢. In Ch. 6, we proved the existence and uniqueness of solutions of (1) in
any interval of continuity of A(/), for any initial x(0) = c. In this appendix, we
derive some properties of solutions of linear systems (1) whose proofs depend
on deeper properties of matrices.
As before, x’ (é) is the limit

tim xe + At) — x(d)]

where the limit is taken separately in each component Ax,/At. Here we extend
this definition, calling the norm of any matrix A the real number

| |
(2) All = sup
| |

where

Ix] = (xx)? = (xx) and [Ax] = [xTATAx|!2

similarly.t Then A(t) > A(é)) means equivalently:

(i) ay(t) — ay(to) for all j,k = 1,...,0

(ii) | A@ — A(ép)|| tends to zero as t — fp.

+ We use standard notation here: A” signifies the transpose of A, and x" the inner product of x
with itself; see Birkhoff-MacLane, Chs, 8—10 for the facts assumed in this Appendix.

371
372 APPENDIX A Linear Systems

The equation ||J|| = 1, the triangle inequality |A + Bl] = ||Al| + |B], the
multiplicative inequality |ABl| < ||Al| - |B}, and |tAf}' = [2] - ||All all follow
quite directly from (2). From these relations, in turn, one can prove the conver-
gence of the exponential series

(3) exp 1A = 1+ tA + (#/2NA? + (B/3NA® +

for any matrix A and any scalar ¢. It suffices to copy the usual proof for real or
complex exponential series, replacing absolute values by norms throughout.

2 CONSTANT-COEFFICIENT SYSTEMS

We now introduce the alternative notation e“ for exp tA as defined above,


and prove the differentiability relation

d (ey

(4) = AeA for any constant matrix A


dt

tA uA
Because of the commutativity of all terms, we have, as in the scalar case, ee
= et +uyA
and, hence,

co

(4 _ De'A —_ ( At*A®
Je
et tAnAa _ eA
>
=
=

k=1
kl!

Dividing the series in parenthesis through by Ai, we get the identity matrix J
plus a series of matrices whose norms are bounded by Af”'a*/(kl), k = 2,
3, , where a =

=
|| All. Hence, the norm of the sum tends to zero with Aj,
proving (4).
From (4), there follows a very beautiful result.

THEOREM I. The constant-coefficient first-order linear system x'(t) = Ax has the


general solution x(t) = e'x(0).
Proof. Differentiating e4“x(0) = y(d), we get Ae“x(0) = Ay by (4), where x(0)
= y(0) trivially.

THEOREM 2. If v is a (column) eigenvector of the matrix A with eigenvalue


\ =u +4, then the vector-valued function w(t) =e™v is a solution of w'(t) =Aw.

Proof. Since AV = 2 v, we have

w'(t) = A\e“v = e“Av) = eM(Av) = Ae“ = Aw

where the successive steps are easily justified.


2 Constant-Coefficient Systems 373

Complex Solutions. The preceding results hold for complex as well as for
real solution vectors, coefficient-matrices, and independent variables, provided
that we define the norm of a complex vector z = (z,, , Z,)' as the square root
of the Hermitian inner product

A,
Zz = x, a tatty = ant
of z with its conjugate transpose z” = (z*, , z%). Hence, we can apply them
to complex eigenvalues and eigenvectors of real matrices.

Example 1. Consider the linear system

dz, /dt az, — bzg d 41 a —b a

dizg/‘dt =
bz, + AZo
dt ( }=(
22 b a
I 29

where the variables #, z,, zg may be real or complex, but the coefficients a and b
a —b
are real. The characteristic polynomial of the coefficient-matrix A =
b a

in (5) is (—a)? + b?; hence the eigenvalues of A are =


=

a + ib (cf. Ex. B1, Ch.

5). The corresponding column eigenvectors are | _ and i} Hence the

vector-valued functions

ib
“( \] and “(7): c=at ib, ch=a—
are a basis of complex solutions of the system (5).
In general, every matrix A with distinct eigenvalues has a basis of real or com-
plex (column) eigenvectors v,, with corresponding eigenvalues A, (j = 1,. ,n).
Aye
The vector-valued functions ¢,(i) = ¢ v, then form a basis of solutions of the
constant-coefficient linear system z’(t) = Az in the entire complex #-plane.

The Secular Equation. We now generalize the concept of the secular equa-
tion from second-order linear systems with constant coefficients to nth-order
systems. Namely, if p,(A) is the characteristic polynomial of the matrix A, we
define the secular equation of the constant-coefficient system x’(/) = Ax to be the
DE p,(D)u = 0, where D denotes d/dt, and p,(D) is the scalar differential oper-
ator obtained from the characteristic polynomial |A — AZ| = p,(A) of A by sub-
stituting D for X. We first prove Theorem 3 of Ch. 5.
374 APPENDIX A Linear Systems

THEOREM 3. Any component x,(t) of any vector solution x’ (t) = Ax satisfies the
secular equation p,(D) = 0.

Proof. Rewrite the given system as Dx = Ax, and let C(D) = ||C,,(D)|| denote
the matrix of cofactors of the entries ap
J
DéJ , of A — DI. Premultiplying the
equation (A — D)x = 0 by the matrix C(D)", we get the vector DE

(6) pa(D)Ix = C(D)"(A — D)x = 0

But (6) asserts precisely that for eachj = 1,. » 0, pa(D)x, = 0, q.e.d.

Example 2. Whenn = 2, the identity C(D)"(A — D) = pa(D)I reduces to the


easily verified matrix identity

a4, —D a\9 Ago — D —4y9 pa(D) 0

( ag) 99 — D I — 49) a;, —D }=| 0 pa(D)

Stability. Theorem 3 shows that, if every eigenvalue of A has a negative real


part, then by Theorem 5’ of Ch. 5 the constant-coefficient systems x’(t) = Ax is
strictly stable. A more explicit proof can be given of this result and its converse
by allowing vectors and matrices to have complex components and replacing x"x
and A‘A by the corresponding Hermitian formulas x”x and A%A, where the
superscript H signifies conjugate transpose.+ This permits us to make a change
of basis in the vector space of x(t) that replaces A by its Jordan normal form
J = PAP" (P complex nonsingular).
In this Jordan normal form, A is the direct sum of square diagonal blocks /,
having some eigenvalue A, on the diagonal and 1’s just above the diagonal. The
given system x’(é) = Ax then has for its typical DE

dx,
di
= AX, or
di
7 = AX, + X41

depending on whether x, is the last variable in its block or not.


Now consider, for example, a typical elementary Jordan block submatrix on
the diagonal of J, say the 4 X 4 matrix

Aj 1 0

0 A, 1
J
0 0 A,

0 0 90 A,

+ A” is also often denoted A*, and called the “adjoint” of A.


3 The Matrizant 375

We can exponentiate this block explicitly

1 t P72) 8/3!

eh = ov
0 1 t 7/2!

0 0 1 t

0 0 0 1

Since negative exponentials die out faster than any power, it follows that e >
0 as ¢ > 00 if and onlly if every eigenvalue of A has a negative real part. Finally,
since e4 = P™'e!P, it follows that the same is true of e“. (The matrices A and
J = PAP™' have the same eigenvalues.)

3 THE MATRIZANT

We next consider the general case of A(é) variable but continuous in (1). If
Xx) , , X,,(¢) is any list of m (column) vector solutions of (1), the nm X m matrix
X() composed of these columns will, trivially, satisfy the matrix DE X’() =
A(X. In particular, we can construct in this way an n X n matrix solution of

(7) M') = AM, M(0) = I (identity matrix)

The matrix function M(/) so defined is called the matrizant of the system (1). By
(4), the matrizant of the constant-coefficient system x(t) = Ax is just ¢.
Since the multiplication of matrices is associative, the vector-valued function
Mic = x(?) will satisfy x(0) = M(0)c = Ic = c for any given c. Since

x(t) = Mc = A()M(c = AW)x

it will also satisfy (1). By the uniqueness theorem proved in Ch. 6, it follows that
every solution of (1) has the form M()c.
The determinant det M(t) of the matrizant also has a remarkable property.
We know by (7) that

M(t + At) = M(t) + AM() = M(t) + A(t) AtM() + o(Ad)

= [I — A(t) At1M(@) + o(Ad.

By the multiplication rule for determinants, therefore, we have

det M(t + Af) = det [J + A(@® At] det M(@®) + o(Ad)

Expanding in minors, on the other hand, we obtain

det [J + A(é At] = 1 + At Tr A(t) + O(AP),


376 APPENDIX A Linear Systems

where Tr A = & ay, is the trace of A. Combining,

det M(t + At) = (1 + Aé Tr A) det M(@) + o(Ad

or

det M(t + Ai) — det M(t)


= Tr A det M(t) + o(1)
At

Now, passing to the limit as At > 0, we get [using the alternative notation | M(i)|
for det M(O):

d|M@|
= Tr A|M(@|
d

Integrating this first-order DE by the methods of Ch. 1, we obtain our final


result.

THEOREM 4. The determinant of the matrizant of (1) satisfies

(8) detM(@) = exp f {TrA()] a|


COROLLARY. If Tr A = 0, the matrizant of (1) has a determinant identically 1
(is “unimodular’’).

This is the case, for example, with systems (1) that come from nth-order DEs
having the normal form

u(t) + atu") +--+ +4,()u = 0

Next, we define the adjoint of the system (1) to be the first-order linear system

(9) f(t) = —A@E (hais,“ =


=

~ mu nd)
We consider the solution curves of (9) as lying in the dual space of linear func-
tionals on the space of x(d), given by the linear forms

(10) f(x)
= Xf, x, = (@, x) = fx (inner product)

THEOREM 5. In order that [{(), x(é)] be constant for every solution x(0) of the
vector DE (1), tt is necessary and sufficient that (9) hold.
4 Floquet Theorem; Canonical Bases 377

Proof. By straightforward differentiation, we have

4 b soso
| = Xposs + Tsjoso
= 2 HiOanlbaild + 2 fiOn(

Since there exists a solution of (1) with x(é) = c for any é) and c, the condition
that (£(, x(é) be constant for all solutions of (1), and hence that the derivative
above vanish identically, is precisely that

(11) 0= B fa) +f) forall sk

which is obviously equivalent to (9), as claimed.

4 FLOQUET THEOREM; CANONICAL BASES

We next consider periodic linear systems, that is, systems of the form (1) with

(12) Att + T) = A® for some fixed period T

A very special example is furnished by the Mathieu equation

u” + A — 16d cos 2x)u = 0

which is equivalent in the phase plane to

/
U = 0, v’ = (~A + 16d cos 2x)u = 0

Equations of Hill’s type,

u” + piu = 0, pli + x) = pl

provide a more general class of examples.


In any case, let M(T) be the matrizant of the first-order periodic linear system
satisfying (12), and let the eigenvectors of M(T) be v,, , V, (hopefully,
r = n), with M(T)v, = A,v,. Then, if x,@ is the solution of (1) and (12) with the
initial-value vector v,, then by definition we have x,(T) = A,x,(0) = A,v,; more-
over, A, # 0 since x(t) = 0 is the only solution with x(T) = 0 (Uniqueness
Theorem). Since A(é + T) = A(é), x,t + T) is, therefore, the solution of the
same initial value problem as x,(é) but with the initial value vector multiplied by
378 APPENDIX A Linear Systems

A,. We conclude the quasiperiodic relation

(13) x,(¢ + T) = Ax,

This is essentially Floquet’s Theorem. Since A; # 0, A, = eT for some real or


Ont
complex constant a,, we can define y,({) = ¢— x,(é) with the assurance that, by
(13), y,é + T) = y,@. We have proved the following theorem.

THEOREM 6. [If the matrizant M(T) of (1) has r independent eigenvectors and
(12) holds, then (1) has r linearly independent solutions satisfying

Cyt,
(14) x(t) = ey, @, where y.@ + T) = y,0

so that each x,(t) is an exponential (scalar) function e™ times a periodic function.

As in Ch. 5, §10, any second-order equation’ of Hill’s type (including the


Mathieu equation) leaves invariant a periodic, time-dependent “‘energy’’ func-
tion u/? + f p@dt. It follows that, for such equations, all the a, in (14) must
vanish if

fp(idt = Sip@dt = 0

In particular, referring back to Ch. 10 (Example 4 in §1, Ex. A5, and §8), we
can conclude that there exist infinite sequences of even and odd Mathieu func-
tions, and that the functions in each sequence have periods 27 and z alternately.
Finally, let ¢ = a@ be an isolated singular point of the matrix A(), considered
as a function of the complex independent variable ¢. By this we mean that all a,,(é)
are analytic in some punctured disk 0 < | — a| < p. Much as in Ch. 9, we can
consider the effect of making a simple loop ¢ = @ + és around ¢ = aon some
basis of solutions of the linear system (1); we do not assume periodicity. If the
basis of solutions has for initial values x,(f)) = e,, where e, = 6, 6,,,) is the
ith unit vector, the vectors resulting from going once around the loup are
the column vectors of the “matrizant’’ of (1) for the given loop. In any event,
we can write

(15) x(c + e?'r) = Mx(c + 7)

where M is a suitable nonsingular circuit matrix. An argument like that used to


prove Theorem 5 (and Theorem 4 of Ch. 9) gives the following result.

THEOREM 7.  If the circuit matrix of A(t) in (1) for the isolated singular point
t = a has r independent eigenvectors, then (1) has r solutions

(16)   x_k(t) = (t − a)^{α_k} f_k(t),

where the f_k(t) are holomorphic in 0 < |t − a| < ρ for some ρ > 0.

EXERCISES

1.  Compute the matrix B = e^A for the following A:


1
(a) ( —1
01
0 } |
0
1 0 } |
1
0
1
1
ya
a
0
0
d

2.  Apply Ex. 1 to the following 3 × 3 matrices:

          0 0 1           0 1 0           λ 1 0
    (a)   1 0 0     (b)   0 0 1     (c)   0 λ 1
          0 1 0           0 0 0           0 0 λ

3.  For each A in Ex. 1, discuss the stability of the DE x′(t) = Ax.

4.  Do the same for each matrix in Ex. 2.

5.  Compute the matrizant of x′(t) = Ax for the following matrices A:

2
0 0
te
(a (_° t
0
} of 0 t
) (c) 2
0
e
0
0 0

6.  (a) For the system x′ = y, y′ = z, z′ = x, find column vectors φ_j (j = 1, 2, 3) such
        that x₁(t) = e^t φ₁, x₂(t) = e^{ωt} φ₂, and x₃(t) = e^{ω²t} φ₃ (ω = e^{2πi/3}) are a
        basis of solutions.
    (b) Discuss the behavior of solutions for large t.

7.  Find a basis of solutions for the system

       dx₁/dt = −2x₁ + x₂,     dx_n/dt = x_{n−1} − 2x_n

       dx_j/dt = x_{j−1} − 2x_j + x_{j+1}     (j = 2, . . . , n − 1)

    [Hint: Try the initial conditions φ_r with components sin(rjπ/(n + 1)), j = 1, . . . , n.]

8.  Show that the matrizant M(T) of any DE of Hill's type has determinant |M(T)| = 1.
    [Hint: Show that, in the phase plane, Tr A(t) ≡ 0.]
APPENDIX B

BIFURCATION
THEORY

1 WHAT IS BIFURCATION?

In recent decades, the qualitative theory of ordinary differential equations
has been revolutionized by a series of new concepts, loosely characterized by
such words as "bifurcation," "control," "strange attractor," "chaos," and
"fractal."  Several fascinating books are now available which describe one or
more of these ideas in some detail and depth.†  The purpose of this appendix is
to give readers some idea of the variety and richness of the phenomena which
they try to analyze, by introducing them to the nature and typical examples of
"bifurcation."
In general, the qualitative behavior of solutions of differential equations and
systems of differential equations of given form often depends on the parameters
involved.  We studied this dependence for the DE ẍ + pẋ + qx = 0 in Chapter
2, §2, and for the system

(1)    d/dt ( x )  =  ( a   b ) ( x )
            ( y )     ( c   d ) ( y )

in Chapter 5, §5.  The qualitative dependence on the parameter λ of solutions
of Sturm-Liouville equations

(2)    d/dx [p(x) du/dx] + [λρ(x) − q(x)]u = 0

provides another classic illustration, to which Chapters 10 and 11 were devoted.
This appendix will introduce a typical concept associated with this parameter
dependence: that of bifurcation.
A striking example of bifurcation is provided by the van der Pol equation

(3)    ÿ − μ(1 − y²)ẏ + y = 0

† See the books by Hale and Chow-Hale, Guckenheimer-Holmes, and Thompson-Stewart listed in
the bibliography.


When μ is positive, as was shown in Chapter 5, §12, the equilibrium solution
y ≡ 0 is unstable, while all other solutions tend to stable periodic oscillations
y = f(t − τ) associated with the same limit cycle in the phase plane, as t ↑ +∞.
When μ is negative, the behavior of solutions is totally different.  This is evident
since the substitution t → −t carries (3) into ÿ + μ(1 − y²)ẏ + y = 0,
thus effectively reversing the sign of μ and the sense of trajectories in the phase
plane.  It follows that, when μ < 0, the equilibrium solution y ≡ 0 of (3) is stable;
the periodic solutions y = f(t − τ) associated with the limit cycle are unstable;
and all solutions having initial values (y₀, ẏ₀) located outside this limit cycle spiral
out to infinity as t ↑ +∞.
To describe this qualitative change, the value μ = 0 where it occurs is called
a bifurcation point of the parameter μ in (3).  More precisely, it is called a Hopf
bifurcation (contrast with Example 2 in §4 below).
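The qualitative change at μ = 0 is easy to see in a rough numerical experiment
(not part of the text): integrating (3) for a small positive and a small negative μ
from the same small initial disturbance, the amplitude of y(t) settles near the
limit-cycle value in the first case and decays to zero in the second.

import numpy as np
from scipy.integrate import solve_ivp

def vdp(mu):
    return lambda t, z: [z[1], mu*(1 - z[0]**2)*z[1] - z[0]]

for mu in (+0.5, -0.5):
    sol = solve_ivp(vdp(mu), (0.0, 60.0), [0.05, 0.0], max_step=0.01)
    tail = np.abs(sol.y[0][sol.t > 40.0]).max()
    print(mu, tail)   # mu > 0: amplitude near 2 (limit cycle); mu < 0: decays toward 0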
An analogous, even simpler bifurcation occurs when μ = 0 for the first-order
DE

(4)    ẋ + μx = C

In the stable case μ > 0, all solutions approach the same equilibrium solution
x = C/μ; when μ < 0, they all diverge from it.
By fixing one parameter (e.g., q), one can also apply the concept of bifurcation
to DEs depending on two parameters, such as

(5)    ẍ + pẋ + qx = 0

see the exercises at the end of this appendix.


A more novel example is provided by the so-called Brusselator equations of
chemical kinetics:

(6) X= A—(B+ lx + xy yj = Be — x*y

Setting A = 1, the phase portrait of (6) has a stable focal point for B << 2 =
1 + A’, and an unstable focal point for B > 2; hence B = 2 is a bifurcation
point for B in (6), if A = 1.
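For parameter values beyond this bifurcation, the attracting limit cycle of (6)
can be computed numerically, as Exercise 3 at the end of this appendix asks.
The following sketch (not from the text) integrates (6) with A = 1, B = 3 and
examines the late-time swing of x(t), whose persistent oscillation signals the
limit cycle.

import numpy as np
from scipy.integrate import solve_ivp

A, B = 1.0, 3.0
def brusselator(t, z):
    x, y = z
    return [A - (B + 1.0)*x + x*x*y, B*x - x*x*y]

sol = solve_ivp(brusselator, (0.0, 50.0), [1.1, 3.0], max_step=0.01)
late = sol.y[0][sol.t > 30.0]
print(late.min(), late.max())   # a wide, repeating swing indicates the limit cycle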

*2 POINCARÉ INDEX THEOREM

Let (X(x,y), Y(x,y)) be a plane vector field, and let γ be a simple closed curve
which does not go through any critical point of the vector field.  By the Jordan
curve theorem,† the interior of γ is a well-defined, simply connected region; we
will assume the vector field to be of class C² in γ and its interior.

† It is notoriously difficult to prove the Jordan curve theorem rigorously.



DEFINITION.  Let ψ(x,y) = arctan[Y(x,y)/X(x,y)] be the angle made by the
vector (X(x,y), Y(x,y)) with the horizontal.  Then the (Poincaré) index of γ for the
given vector field is

(7)    I(γ) = (1/2π) ∮_γ dψ

It is understood that γ is traversed counterclockwise.

Note that although the arctangent function is only defined up to an integral
multiple of π, the integral in (7) is independent of which branch of this function
is chosen, as well as of the initial point O from which the integral is computed.
Further, one can prove that I(γ) is an additive function of domains, in the
following sense.

LEMMA 1.  Let γ′ and γ″ enclose domains D′ and D″, whose union is a simply
connected domain D with boundary γ, as in Figure B.1.  Then

(8)    I(γ) = I(γ′) + I(γ″)

The proof depends on the fact that γ consists of γ′ ∪ γ″ with their common
segment γ′ ∩ γ″ deleted; since this segment is traversed in opposite directions
by γ′ and γ″, the contributions from it cancel.

LEMMA 2.  If a sufficiently small curve γ contains no critical point, then I(γ) = 0.
If it contains a single critical point (x₁, y₁), and

(9)    X(x,y) = a₁(x − x₁) + b₁(y − y₁) + O(r²)

       Y(x,y) = c₁(x − x₁) + d₁(y − y₁) + O(r²)

where r² = (x − x₁)² + (y − y₁)², then


Figure B.1 Additivity of Poincaré index.



(9′)   I(γ) =  +1              if a₁d₁ > b₁c₁,
               −1              if a₁d₁ < b₁c₁,
               indeterminate   if a₁d₁ = b₁c₁

In other words, I(γ) is the sign of the determinant |A₁| = a₁d₁ − b₁c₁ of the
matrix of the linearization of the DE ẋ = X(x,y), ẏ = Y(x,y) unless this critical
point (x₁, y₁) is degenerate, in the sense that |A₁| = 0.  Getting down to cases, we
see that focal, nodal, and vortex points have index +1, while saddle points have
index −1.
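These index values are easy to confirm numerically by evaluating (7) directly.
The sketch below (not from the text) samples the angle ψ around a small circle
and sums its increments, for one linear field with a₁d₁ > b₁c₁ (a focus) and one
with a₁d₁ < b₁c₁ (a saddle).

import numpy as np

def index(field, center=(0.0, 0.0), radius=0.5, m=4000):
    th = np.linspace(0.0, 2.0*np.pi, m)            # traverse gamma counterclockwise
    x = center[0] + radius*np.cos(th)
    y = center[1] + radius*np.sin(th)
    X, Y = field(x, y)
    psi = np.unwrap(np.arctan2(Y, X))              # a continuous branch of psi
    return round((psi[-1] - psi[0])/(2.0*np.pi))   # the integral (7)

focus  = lambda x, y: (-x - y,  x - y)             # a1*d1 - b1*c1 = 2 > 0
saddle = lambda x, y: ( x,     -y)                 # a1*d1 - b1*c1 = -1 < 0
print(index(focus), index(saddle))                 # prints 1 -1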
The proof of Lemma 2, in the case of nondegenerate critical points, is
straightforward but tedious.  One first observes that, since the vector field
X(x) ∈ C², there is some neighborhood of (x₁, y₁) in which not only is there no
other singular point, but the direction of X(x) differs by less than π/4 radians from
that of the linearized vector field

   X₁(x,y) = (a₁(x − x₁) + b₁(y − y₁),  c₁(x − x₁) + d₁(y − y₁))

One then takes up individually each of the non-degenerate cases of Figure 5.5;
see the exercises at the end of this appendix.
The reason why the degenerate case |A₁| = 0 has been excluded in Lemma 2
is easily explained by examples.  First, the degenerate field Z(z) = zⁿ (n > 1)
has index I(γ) = n for any contour γ containing the origin.  Thus (x² − y², 2xy)
has index 2, (x³ − 3xy², 3x²y − y³) has index 3, and so on.  And again, the vector
field (x, x² sin(1/x)) has infinitely many critical points where the angle ψ is
undefined, in any neighborhood of the origin.  However, using Lemmas 1 and 2, we
can prove the following Poincaré index theorem:
can prove the following Poincaré index theorem:

THEOREM 1.  Let γ be any simple closed curve not containing any degenerate critical
point of the plane vector field X(x).  Then γ contains only a finite number of critical
points x_j = (x_j, y_j), and

(10)   I(γ) = Σ_j I_j,     I_j = sgn det ( ∂X/∂x  ∂X/∂y ; ∂Y/∂x  ∂Y/∂y ) at x_j,

is the sum of the (Poincaré) indices of these critical points.

3 HAMILTONIAN SYSTEMS

Classical dynamics (including celestial mechanics) is primarily concerned with
systems of first-order DEs having a very special form.  These are so-called
Hamiltonian systems of DEs of the form

(11)   dq_i/dt = ∂H/∂p_i,     dp_i/dt = −∂H/∂q_i

where i = 1, . . . , n and H(q, p) is a given real-valued Hamiltonian function.

Most of the energy-conserving autonomous systems arising in classical
dynamics are "Hamiltonian," with H = T + V the sum of the kinetic energy T
and the potential energy V(q).  The q_i are position coordinates and the p_i the
corresponding momenta.  Thus, for the particle of mass m discussed in Chapter 5,
§9, q = x, p = mẋ, and H = p²/2m + V(q).  Hence ẋ = q̇ = ∂H/∂p = p/m
(which checks), and mẍ = ṗ = −∂H/∂q = −V′(x) = F(x), where F(x) = −V′(x)
is the force.
Likewise, for the pendulum discussed in Chapter 5, §3, letting the (generalized)
position coordinate be θ = q, and p = ℓθ̇, we have the energy function or
Hamiltonian

(12)   H = g(1 − cos θ) + ℓ²θ̇²/2 = 2g sin²(θ/2) + p²/2 = V + T

formulas from which (11) can easily be checked.


Many other Hamiltonian systems having no clear connection with physics are
also of current interest.  An ingenious one-parameter family of such systems is
the following.†

Example 1.  Consider the one-parameter family of plane autonomous
systems

(13)   ẋ = −μy + xy,     ẏ = μx + ½(x² − y²)


For any fixed μ, since ẋ = 0 when (x − μ)y = 0, the critical points of (13) lie
where ẏ = 0 on the lines x = μ and y = 0.  When y = 0, ẏ = μx + ½x² = 0
where x = 0 or x = −2μ; when x = μ, ẏ = μ² + ½(μ² − y²) = 0 when y = ±√3 μ.
Hence, if μ ≠ 0, the system (13) has four critical points: one located at
the origin, and the other three at the vertices (−2μ, 0) and (μ, ±√3 μ) of an
equilateral triangle.  (When μ = 0, there is only one critical point.  This is at the
origin, and is degenerate.)
One easily verifies that the system (13) is Hamiltonian because, for

(14)   H(x, y; μ) = −(μ/2)(x² + y²) + ½xy² − x³/6

† This example is taken from Sec. 1.8 of Guckenheimer-Holmes.



the DE (13) can be rewritten as

(14′)   ẋ = ∂H/∂y (x, y; μ),     ẏ = −∂H/∂x (x, y; μ)

which is of the form (11) in another notation.
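Both claims of Example 1 (that (13) is Hamiltonian with the H of (14), and that
it has the four critical points listed above) can be checked symbolically.  A minimal
sketch, not from the text, using SymPy and the forms of (13) and (14) as printed
above:

import sympy as sp

x, y, mu = sp.symbols('x y mu', real=True)
H = -(mu/2)*(x**2 + y**2) + x*y**2/2 - x**3/6

xdot = sp.diff(H, y)            # should equal -mu*y + x*y
ydot = -sp.diff(H, x)           # should equal mu*x + (x**2 - y**2)/2
print(sp.simplify(xdot - (-mu*y + x*y)))                    # 0
print(sp.simplify(ydot - (mu*x + (x**2 - y**2)/2)))         # 0
print(sp.solve([xdot, ydot], [x, y], dict=True))            # the four critical points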
We next generalize Theorem 1 of Chapter 5 from plane Hamiltonian systems
to Hamiltonian systems in general.

THEOREM 2.  The flow defined by any Hamiltonian system (11) is volume-conserving;
moreover, the solution curves (orbits or trajectories) lie on level surfaces
H(q, p) = C.

The first statement follows from a basic theorem of vector analysis, which
states that a velocity field X(x) defines a volume-conserving flow x′(t) = X(x) if
and only if its divergence is zero.†  For (11), evidently

(15)   div X = Σ_{i=1}^{n} [ ∂/∂q_i(∂H/∂p_i) + ∂/∂p_i(−∂H/∂q_i) ]
             = Σ_{i=1}^{n} [ ∂²H/∂q_i∂p_i − ∂²H/∂p_i∂q_i ] = 0

The second statement follows since

   dH/dt = Σ_{i=1}^{n} [ (∂H/∂q_i)(dq_i/dt) + (∂H/∂p_i)(dp_i/dt) ]
         = Σ_{i=1}^{n} [ (∂H/∂q_i)(∂H/∂p_i) − (∂H/∂p_i)(∂H/∂q_i) ] = 0

Critical Points.  By (11), the critical points of the Hamiltonian H(q, p),
where its gradient vanishes, are the points where the velocity vector (q′(t), p′(t))
is 0.  This shows that the critical points of the function H are the stagnation
points of the associated flow.
In the plane case (n = 1), we can say more.  Near any critical point (ξ, η), an
expansion in Taylor series gives

(16)   2V(x,y) = a₁₁(x − ξ)² + 2a₁₂(x − ξ)(y − η) + a₂₂(y − η)² + · · ·

where a₁₁ = V_xx, a₁₂ = V_xy, a₂₂ = V_yy.  Hence

(17)   d/dt ( x − ξ )  =  (  a₁₂    a₂₂ ) ( x − ξ )
            ( y − η )     ( −a₁₁   −a₁₂ ) ( y − η )

† Courant and John, vol. II, p. 602.  The fact that Hamiltonian flows are volume-conserving is
called Liouville's Theorem.

The eigenvalues of the coefficient matrix of the linearization (17) at (ξ, η) are
the roots of the quadratic equation

(18)   λ² = a₁₂² − a₁₁a₂₂ = −det ( a₁₁  a₁₂ ; a₁₂  a₂₂ )

THEOREM 3.  A nondegenerate critical point of a planar Hamiltonian system is a
vortex point at maxima and minima of H, where a₁₁a₂₂ > a₁₂², and a saddle point
where H has a saddle point (where a₁₁a₂₂ < a₁₂²).

Where a₁₁a₂₂ = a₁₂², the local behavior of trajectories is indeterminate.  Note
that since the surface z = H(x,y) is horizontal at critical points, det ( a₁₁  a₁₂ ; a₁₂  a₂₂ )
is precisely its Gaussian curvature.

4 HAMILTONIAN BIFURCATIONS

We now consider the bifurcations of one-parameter families of Hamiltonian
systems, with Hamiltonians V(x, y; μ):

(19)   ẋ = ∂V/∂y (x, y; μ),     ẏ = −∂V/∂x (x, y; μ)

The fact (Theorem 2) that for any μ, the orbits of (19) are the level curves of
the Hamiltonian V(x, y; μ), makes it easy to see how bifurcations arise.  One can
think of the orbits as the "shores" of the "lakes" obtained by deforming a flexible
bowl z = V(x, y; μ) as the "time" μ varies.†

In Example 1, the critical points (singular points of the phase portraits, or
critical points ∇V = 0 of V) move around as μ varies, changing their nature
only at the "bifurcation value" μ = 0.  The next example is more typical of
bifurcation, as that word is generally used.

Example 2.  Consider the one-parameter family of systems defined by the
Hamiltonians

(20)   V = y − μe^{−r²},     r² = x² + y²

The corresponding Hamiltonian flows satisfy

(21)   ẋ = 1 + 2μy e^{−r²},     ẏ = −2μx e^{−r²}

† Of course, this "time" parameter μ is unrelated to the variable t associated with the "time"
derivatives ẋ = dx/dt and ẏ = dy/dt.

In terms of the simile in the preceding paragraph, the sloping plane z = y
becomes deformed by a deepening circular depression centered at the origin.
When this depression becomes deep enough, the resulting dimple can hold
water; at the same time, the retaining ridge on the downside has a saddle point.
Thus this bifurcation gives rise to a pair of singularities: a vortex point and a
related saddle point.
In detail, clearly ẏ = 0 for r > 0 in Example 2 if and only if x = 0, that is,
on the y-axis.  Moreover, at points (0, η) on the y-axis, ẋ = 0 if and only if
e^{η²} = −2μη.  Bifurcation takes place when the line ξ = −2μη in the (η, ξ)-plane
touches the graph of ξ = e^{η²}.  For larger slopes μ > μ₁, there are two points of
intersection, and the surface z = V = y − μe^{−r²} has corresponding critical points: a
vortex point and a saddle point, as already stated.
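The critical slope itself is easy to pin down numerically.  In the sketch below
(not from the text), tangency of the line ξ = −2μη with the curve ξ = e^{η²} is
expressed as equality of values and of slopes, and the resulting pair of equations
is solved with SciPy; the root is η = −1/√2, μ = √(e/2) ≈ 1.166.

import numpy as np
from scipy.optimize import fsolve

def tangency(v):
    eta, mu = v
    return [np.exp(eta**2) + 2.0*mu*eta,            # equal values
            2.0*eta*np.exp(eta**2) + 2.0*mu]        # equal slopes

eta_c, mu_c = fsolve(tangency, [-0.7, 1.2])
print(eta_c, mu_c)      # eta ~ -1/sqrt(2), mu ~ sqrt(e/2) ~ 1.166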


Bifurcations involving the simultaneous appearance of such a paired vortex
point and saddle-point are called saddle-point (or ‘“‘fold”’) bifurcations, to distin-
guish them from Hopf bifurcations (see §1), in which a stable equilibrium point
is replaced by an unstable one enclosed in a surrounding limit cycle.

The Poincaré Index.  Theorem 3 explains why the critical points arising
from the preceding "bifurcation" are neither nodal points nor focal points.  The
fact that they are paired and of opposite indices is, however, a simple consequence
of the Poincaré index theorem (Theorem 1), as we shall now explain.
For a general one-parameter family of plane autonomous systems,

(22)   ẋ = X(x, y; μ),     ẏ = Y(x, y; μ)

consider the variation with μ of the Poincaré index of a fixed simple closed curve
γ, as defined by (7).  Unless the curve passes through a critical point of the system
(22) for some value of μ, the angle ψ(γ, μ) between the vector (X(x, y; μ), Y(x, y; μ))
and the horizontal will vary continuously with μ.  Hence, the contour integral

   I(γ, μ) = ∮_γ dψ(γ, μ)

will also vary continuously, and cannot jump by an integral multiple of 2π.  We
conclude

THEOREM 4.  Inside any simple closed curve not passing through a critical point,
new nondegenerate critical points appearing or disappearing at bifurcation values μ
must arise or be destroyed in adjacent pairs having indices of opposite signs.

In particular, they cannot appear or disappear singly.

5 POINCARÉ MAPS

The concept of a Poincaré map arises in three different contexts: (i) limit
cycles (Ch. 5, §13) and other periodic orbits of autonomous systems, (ii) periodically
forced systems, of which linear constant-coefficient systems like

(26)   ẍ + pẋ + qx = A cos ωt

yield familiar examples (in the phase plane), and (iii) systems with periodic
coefficient-functions, like the phase plane representation

(27)   u′ = v,     v′ = −p(t)u

of the Hill equation mentioned in Appendix A.

Example 3.  Consider plane autonomous systems of the special form

(28)   ẋ = f(r)x − y,     ẏ = x + f(r)y,     r = √(x² + y²)

In polar coordinates, the system (28) simplifies to

(28′)   θ̇ = 1,     ṙ = g(r) = rf(r)

and has a periodic orbit of radius r = a and period 2π whenever g(a) = 0.  This
orbit is evidently stable (a "limit cycle") if g′(a) < 0, and unstable if g′(a) > 0.

The concept of a Poincaré map provides another way of visualizing perturbation
approximations to such periodic orbits.  Let us call a small interval
a − ε < r < a + ε of the r-axis θ = 0 a Poincaré section of the periodic orbit r = a.
The Poincaré map of this "section" maps each point (a + η, 0) in it onto the point
where the orbit (solution curve) passing through that point next crosses the section.
If f ∈ C¹, then dη/dθ = dη/dt = ηg′(a) + O(η²), and so the Poincaré
map multiplies η by approximately exp(2πg′(a)).  It is thus locally a contraction if
g′(a) < 0, but an expansion if g′(a) > 0; the periodic orbit is correspondingly
stable or unstable.
More generally, by a Poincaré section of a periodic orbit at a point P is meant
any smooth curve through P not tangent to the orbit, and the Poincaré map of
this section is then defined (locally) as in the preceding paragraph.  Using the
theory of perturbations (Ch. 6, §12), one can show that if g′(a) ≠ 0, the linearization
η → exp(2πg′(a))η is the same for all Poincaré sections, and that the
Poincaré maps of the "sections" crossing a given periodic orbit are all "equivalent"
under diffeomorphism (Ch. 5, §6).

Bifurcation.  Now consider a one-parameter family ẋ = f(r, μ)x − y,
ẏ = x + f(r, μ)y of DEs of the form (28), with

(29)   g(r, μ) = rf(r, μ) = −r² + μr sin r

When μ = 0, the origin is a stable equilibrium point toward which all orbits
tend.  But when μ exceeds 1, the origin is unstable, and a limit cycle occurs whenever
(sin r)/r = 1/μ.  The nearest limit cycle is locally stable; the next is unstable,
the third is stable, and so on.  Looking more closely at the intersections of the
curve s = (sin r)/r with s = 1/μ, we see that new periodic orbits arise in pairs
as μ increases, one stable and the next unstable.  The analogy with bifurcations
of equilibrium points is obvious!
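The alternation of stable and unstable cycles is easy to exhibit numerically.  The
sketch below (not from the text) takes μ = 8, locates the positive zeros of
g(r, μ) = −r² + μr sin r, and classifies each periodic orbit by the sign of g′ there,
as in Example 3.

import numpy as np
from scipy.optimize import brentq

mu = 8.0
g  = lambda r: -r**2 + mu*r*np.sin(r)
dg = lambda r: -2.0*r + mu*(np.sin(r) + r*np.cos(r))

r_grid = np.linspace(0.1, 4.0*np.pi, 4000)
roots = []
for a, b in zip(r_grid[:-1], r_grid[1:]):
    if g(a)*g(b) < 0.0:                     # sign change brackets a zero of g
        roots.append(brentq(g, a, b))
for r in roots:
    print(round(r, 4), 'stable' if dg(r) < 0 else 'unstable')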

6 PERIODICALLY FORCED SYSTEMS

The Poincaré maps of sinusoidally forced (first-order) linear constant-coefficient
systems are easily determined algebraically, by expressing in matrix notation the
ideas introduced in Ch. 3, §7.  Let

(30)   x′(t) = Ax + ce^{λt},     λ = iκ  (κ real)

be such a system.  Then, unless |λI − A| = 0 (i.e., unless λ is an eigenvalue of
A, the case of resonance), the periodic solution is

(30′)   x = (λI − A)^{−1}ce^{λt} = Ce^{λt}

where C is the solution of (λI − A)C = c.  If the eigenvalues λ_k of A are distinct,
then the section |y − C| < ε surrounding the point x = C through which the
periodic orbit passes when t = 0 is transformed during the period T = 2π/κ
into a neighborhood |z − C| < ε of x(T) = C, by the formula

(31)   y = C + Σ_k a_k b_k,     z = C + Σ_k e^{λ_k T} a_k b_k

(b_k the eigenvector of A belonging to λ_k).  Just as in Ch. 4, §7, this periodic orbit
is strictly stable if and only if every eigenvalue λ_k of A has a negative real part, so
that |e^{λ_k T}| < 1.
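A short numerical sketch (not from the text) of (30)-(31) for a 2 × 2 example:
the coefficient C of the forced oscillation is obtained by solving (λI − A)C = c,
and, since e^{λT} = 1, the linear part of the Poincaré map over one period is e^{AT},
whose eigenvalue magnitudes |e^{λ_k T}| decide stability.

import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -0.5]])        # eigenvalues with negative real part
c = np.array([1.0, 0.0])
kappa = 1.5
lam = 1j*kappa
T = 2.0*np.pi/kappa

C = np.linalg.solve(lam*np.eye(2) - A, c)       # periodic solution x = C e^{lam t}
P = expm(A*T)                                   # linear part of the Poincare map
print(np.abs(np.linalg.eigvals(P)))             # all < 1: the forced oscillation is stable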

Periodic Linear Systems.  More generally, the periodic, homogeneous linear
systems

(32)   x′(t) = A(t)x,     A(t + T) = A(t)

considered in Appendix A, §4, also give rise to Poincaré maps.  Indeed, every
such system has the trivial equilibrium periodic solution x(t) ≡ 0, and the linear
transformation x(t) ↦ M(T)x(t) = x(t + T) defined by the matrizant (Floquet
matrix) referred to in Theorem 6 of Appendix A is the Poincaré map associated
with this equilibrium solution.  Clearly, the equilibrium solution ("equilibrium")
is strictly stable if and only if all the eigenvalues λ_k of this Floquet matrix have
magnitudes |λ_k| < 1.

Example 5.  Consider for example the Mathieu equation of Ch. 10, §1,
Example 4:†

(33)   u″ + (μ + 16d cos 2x)u = 0

† We have written μ in place of λ, to avoid confusion with the eigenvalues of (33″).


The initial conditions

(33′)   φ(0) = 1,  φ′(0) = 0;     ψ(0) = 0,  ψ′(0) = 1

determine a basis of solutions of (33), the first of which is an even function and
the second odd.  Moreover, the associated Floquet matrix has determinant one
(see Ex. 8 of Appendix A).  Hence its eigenvalues are the roots of a real characteristic
equation of the form

(33″)   λ² − 2Bλ + 1 = 0

with roots λ = B ± √(B² − 1).  The parameter B = B(μ, d) in (33″) must be
computed.

LEMMA.  The Mathieu equation (33) is stable or unstable according as |B| < 1
or |B| > 1.

For, since λ₁λ₂ = 1, they are complex conjugate and on the unit circle if
|B| < 1, but real and one exceeding one if |B| > 1.  On the other hand, the
values B = ±1 yield the eigenvalues λ = ±1 associated with the Mathieu functions
of periods π and 2π.  There follows:

THEOREM 5.  The bifurcation associated with transition between stability and
instability occurs at values μ and d associated with the periodic Mathieu function
solutions of (33).
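The parameter B(μ, d) itself is easily computed numerically.  In the rough sketch
below (not from the text), (33) is integrated over one period π of its coefficient
from the initial conditions (33′); since the Floquet matrix has determinant one,
B is half its trace, and the Lemma then decides stability.

import numpy as np
from scipy.integrate import solve_ivp

def mathieu_B(mu, d):
    rhs = lambda x, z: [z[1], -(mu + 16.0*d*np.cos(2.0*x))*z[0]]
    cols = []
    for z0 in ([1.0, 0.0], [0.0, 1.0]):            # phi and psi of (33')
        sol = solve_ivp(rhs, (0.0, np.pi), z0, rtol=1e-10, atol=1e-12)
        cols.append(sol.y[:, -1])
    M = np.array(cols).T                           # Floquet matrix, det M = 1
    return 0.5*np.trace(M)                         # so its eigenvalues solve (33'')

print(mathieu_B(5.0, 0.05))    # |B| < 1 here, so this (mu, d) gives a stable equation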

Duffing's Equation.  More interesting, and much harder to understand
theoretically, are the Poincaré maps of periodically forced nonlinear systems.  A
good introduction to these is provided by Duffing's equation

(34)   ẍ + cẋ − x + x³ = A cos ωt

In the unforced case A = 0, (34) evidently has three equilibrium solutions: x(t) ≡ 0
and x(t) ≡ ±1.  Of these, x ≡ 0 is unstable, while x ≡ ±1 are stable.  The
trajectories in the phase plane are easily drawn in the Hamiltonian case A = c = 0:
they are the level curves of the energy function H = x⁴ − 2x² + 2v², with
"separatrices" v = ±x √(1 − x²/2) through the origin.  These separatrices separate
the periodic orbits corresponding to local oscillations about one stable
equilibrium point from the oscillations about all three, whose amplitudes exceed
√2.  The tangents to all trajectories in the phase plane are horizontal where
they cross the vertical lines x = 0 and x = ±1.
When A = 0 but c > 0, most trajectories in the (x, v)-plane spiral clockwise
from "infinity" into the two stable focal points at (±1, 0).  There are, however,
two special trajectories which originate (at t = −∞) in the unstable saddle point
at (0, 0), each of which spirals into one of the focal points at (±1, 0).  Likewise,
there are two special separating trajectories which spiral in from "infinity" but
come to rest at (0, 0).  These separate the trajectories which spiral into (1, 0) from
those which spiral into (−1, 0).
In the forced case A ≠ 0, a rich variety of qualitatively different kinds of
behavior can arise, depending on the choice of the three coefficients c, A, and
ω in (34).  These can be explored most efficiently by using modern computers to
compute and display the sequences of points (x(nT), ẋ(nT)) in the phase plane
arising by iterating the Poincaré map, applied to selected initial states (x₀, ẋ₀) for
selected choices of c, A, and ω.
Of particular interest here are the fixed points of the Poincaré map, that is,
points such that (x(T), ẋ(T)) = (x(0), ẋ(0)), on periodic orbits of period T.  For
example, with small A, there are three such fixed points for (34): two stable fixed
points (on stable periodic orbits) near (±1, 0), and an unstable fixed point near
the origin.  A family of trajectories asymptotic to the unstable fixed point forms
a manifold separating trajectories attracted to the two stable periodic orbits.
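Such experiments are easy to set up.  The exploratory sketch below (not from the
text, with an arbitrary choice of c, A, and ω) iterates the Poincaré map of (34) by
integrating over successive forcing periods T = 2π/ω and recording the points
(x(nT), ẋ(nT)); plotting them reveals the kind of behavior described above.

import numpy as np
from scipy.integrate import solve_ivp

c, A, omega = 0.25, 0.30, 1.0
T = 2.0*np.pi/omega

def duffing(t, z):
    x, v = z
    return [v, -c*v + x - x**3 + A*np.cos(omega*t)]

state = np.array([1.0, 0.0])
points = []
for n in range(200):
    sol = solve_ivp(duffing, (n*T, (n + 1)*T), state, rtol=1e-9, atol=1e-12)
    state = sol.y[:, -1]
    points.append(state.copy())
points = np.array(points)          # the Poincare-map iterates (x(nT), xdot(nT))
print(points[-5:])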

EXERCISES

1. Explain why, for fixed q > 0, the value p = 0 is a "bifurcation value" of p in (5).

2. (a) Show that for given A and B, (A, B/A) is the only equilibrium point of (6).
   (b) Derive the system of variational equations of (6) for perturbations of the
       equilibrium solution.

*3. Compute numerically the stable limit cycle of (6) for A = 1, B = 3.

4. (a) Show that the system ẋ = 2y, ẏ = 3 − 3x² leaves invariant the energy (Hamiltonian)
       function V = x³ − 3x + y².
   (b) Show that this system has a saddle point at (−1, 0), a vortex point at (1, 0), and
       no other critical points.
   (c) Show that the cubic curve x³ − 3x + y² = 2 through the saddle point is a
       "separatrix."
   (d) Sketch the phase portrait of the system.

5. (a) Show that μ = 0 is a bifurcation value for the one-parameter family of Hamiltonian
       systems with Hamiltonians H = y² + μx³ − 3x.
   (b) Describe the qualitative change that takes place in the phase portrait when μ
       changes sign.
   [Hint: The case μ = 1 is treated in Ex. 4.]

6. (a) For the Hamiltonian H = y − μe^{−r²/2}, r² = x² + y², write down the Hamiltonian
       DEs.
   (b) Locate the critical points of the system, if any.
   (c) Show that the bifurcation values are μ = ±√e, and describe the changes in the
       phase portrait that take place when μ crosses these values.

7. As ε → 0, the DE εu″ + u = 1 is said to constitute a singular perturbation of u = 1.
   (a) Solve this DE explicitly for the endpoint conditions u(0) = u(1) = 0, when
       ε > 0 and when ε < 0.
   (b) Contrast the increasingly irregular behavior of solutions as ε ↓ 0 with the smooth
       "boundary layer" behavior as ε ↑ 0.
BIBLIOGRAPHY

GENERAL REFERENCES

Abramowitz, Milton, and I. A. Stegun, Handbook of Mathematical Functions. New York:


Dover, 1965.

Ahlfors, L. V., Complex Analysis, 2nd ed. New York: McGraw-Hill, 1966.

Apostol, Tom M., Calculus, 2 vols., 2nd ed. New York: Blaisdell, 1967, 1969.
Birkhoff, Garrett, and S. MacLane,
A Survey of Modern Algebra, 4th ed. New York: Mac-
millan, 1977.

Carrier, George F., Max Krook, and C. E. Pearson, Functions of a Complex Variable. New
York: McGraw-Hill, 1966.

Courant, Richard, and D. Hilbert, Methods of Mathematical Physics, Vol. I. New York:
Wiley-Interscience, 1953.

Courant, Richard, and F. John, Introduction to Calculus and Analysis, 2 vols. New York:
Wiley, 1965, 1972.
Dwight, H. B., Tables of Integrals and Other Mathematical Data, rev. ed. New York: Mac-
millan, 1947.

Fletcher, A.,
J. C. P. Miller, and L. Rosenhead, Index of Mathematical Tables, Vol. I and II,
2nd ed. Reading, Mass.: Addison-Wesley, 1962.

Hildebrand, Francis B., Introduction to Numerical Analysis, 2nd ed. New York: McGraw-
Hill, 1974.
Hille, Einar, Analytic Function Theory, Vol. 1. Waltham, Mass.: Blaisdell, 1959; Vol. 2,
1962.

Kaplan, Wilfred, Advanced Calculus. Reading, Mass.: Addison-Wesley, 1952.

Picard, E., Traité d’Analyse, 3 vols., 2nd ed. Paris: Gauthier-Villars, 1922-1928.
Rudin, Walter, Principles of Mathematical Analysis, 2nd ed. New York: McGraw-Hill, 1964.

Stiefel, E. L., An Introduction to Numerical Mathematics, translated by W. C. Rheinboldt.


New York: Academic Press, 1963.

Taylor, A. E., Advanced Calculus. Waltham, Mass.: Blaisdell, 1955.

Widder, David V., Advanced Calculus, 2nd ed. Englewood Cliffs, N.J.: Prentice-Hall,
1961.


WORKS ON ORDINARY DIFFERENTIAL EQUATIONS

Arnold, V. I., Ordinary Differential Equations, translated by R. A. Silverman. Cambridge:
M.I.T. Press, 1973.
Bellman, Richard, and Kenneth L. Cooke, Differential-Difference Equations. New York:
Academic Press, 1953.

Cesari, L., Asymptotic Behavior and Stability Problems in Ordinary Differential Equations, 2nd
ed. New York: Academic Press, 1963.
Chow, S.-N., and J. R. Hale, Methods in Bifurcation Theory. New York: Springer, 1982.
Coddington, Earl A., and N. Levinson, Theory of Ordinary Differential Equations. New
York: McGraw-Hill, 1955.

Cronin, Jane S., Differential Equations: Introduction and Qualitative Theory. Dekker, 1981.

Davis, Philip J., and Philip Rabinowitz, Numerical Integration. Blaisdell, 1966.
Gear, C. William, Numerical Initial Value Problems in Ordinary Differential Equations. Pren-
tice-Hall, 1971.
Guckenheimer, John, and Philip Holmes, Nonlinear Oscillations, Dynamical Systems, and
Bifurcations of Vector Fields. New York: Springer, 1983.

Hale, Jack K., Ordinary Differential Equations. New York: Wiley, 1969.
Hartman, Philip, Ordinary Differential Equations. New York: Wiley, 1964.
Henrici, Peter, Discrete Variable Methods in Ordinary Differential Equations. New York:
Wiley, 1961.

Henrici, Peter, Error Propagation for Difference Methods. New York: Wiley, 1963.

Hille, Einar, Lectures on Ordinary Differential Equations, Reading, Mass.: Addison-Wesley,


1969.
Hirsch, M. W., and S. Smale, Differential Equations, Dynamical Systems, and Linear Algebra.
New York: Academic Press, 1974.
Hurewicz, Witold, Lectures on Ordinary Differential Equations. Cambridge: M.I.T. Press,
1958.
Ince, E. L., Ordinary Differential Equations, 4th ed. New York: Dover, 1953.

Jordan, D. W., and P. Smith, Nonlinear Ordinary Differential Equations. Oxford: Oxford
University Press, 1977.
Kamke, E., Differentialgleichungen: Lösungsmethoden und Lösungen. Leipzig: Akademische
Verlag, 1943; Chelsea reprint, 1971.
Kaplan, Wilfred, Ordinary Differential Equations. Reading, Mass.: Addison-Wesley, 1958.
LaSalle, J. P., and S. Lefschetz, Stability by Liapounov’s Direct Method, with Applications.
New York: Academic Press, 1961.

Lefschetz, Solomon, Differential Equations, Geometric Theory, 2nd ed. New York: Wiley-
Interscience, 1963.

Liapounoff, A., Problème Générale de la Stabilité de Mouvement, reprinted by Princeton
University Press, 1949.

Magnus, Wilhelm, and S. Winkler, Hill's Equation. New York: Wiley-Interscience, 1966.
McLachlan, N. W., Ordinary Non-Linear Differential Equations in Engineering and Physical
Sciences, 2nd ed. New York: Oxford University Press, 1956.

Nemytskii, V. V., and V. V. Stepanov, Qualitative Theory of Differential Equations. Prince-


ton, N.J.: Princeton University Press, 1960.
Petrowski, I. G., Ordinary Differential Equations. New York: Prentice-Hall, 1966.
Picard, E., Leçons sur Quelques Problèmes aux Limites de la Théorie des Équations
Différentielles. Paris: Gauthier-Villars, 1930.

Pliss, V. A., Nonlocal Problems of the Theory of Oscillations. New York: Academic Press,
1966. (Russian ed., 1964.)
Poincaré, H., Les Méthodes Nouvelles de la Mécanique Céleste, Vol. I-III, New York: Dover,
1957.
Reid, W. T., Ordinary Differential Equations. New York: Wiley, 1971.
Sansone, G., and R. Conti, Non-Linear Differential Equations. New York: Pergamon, 1952.

Simmons, G. F., Differential Equations with Applications and Historical Notes. New York:
McGraw-Hill, 1972.
Stoker, J. J., Nonlinear Vibrations in Mechanical and Electrical Systems. New York:
Wiley-Interscience, 1950.

Thompson,
J. M. T., and H. B. Stewart, Nonlinear Dynamics and Chaos. New York: Wiley,
1986.

Tricomi, F. G., Equazioni Differenziali, 3rd ed. Turin: Einaudi, 1961.

Wasow, W., Asymptotic Expansions for Ordinary Differential Equations. New York: Wiley-
Interscience, 1966.
INDEX

Adams three-level methods, 260 Characteristic exponent, 275, 293


Adams-type methods, 259 Characteristic polynomial, 72, 239
Adjoint DE, 55 Chebyshev DE, 57, 358
Adjoint linear system, 376 equioscillation principle, 360
Airy DE, 107, 304, 308 Chebyshev polynomials, 358
Amplitude, 36, 160 Circuit matrix, 268ff, 378
Analytic continuation, 262 Comparison theorem, 29
Analytic DE, 193ff Sturm, 47
Analytic function, 101, 110, 125, 261 Completeness, 350ff, 363, 365
Approximate function table, 205 Complex exponential, 72ff
Approximate solution, 20ff, Ch. 7 Complex solutions, 77, 128, 373
Arzela-Ascoli theorem, 193 Conjugate point, 64
Asymptotic expansion, 107, 220 Conservative dynamical system, 156
Attractive, 153 Constant-coefficient DE, 36, 71¢f
Autonomous system, 13iff, 171 system, 372
linear, 141 Continuation principle, 268
Continuation of solutions, 197
Backward difference, 231 Continuity theorems, 26, 175, 177
Basis of solutions, 35, 43, 72, 78-82, 373 Contour lines, see Level curves
Bernoulli DE, 17 Convolution, 97
Bessel DE, 105, 276, 301, 307, 309, 311 Corrected midpoint formula, 214
Bessel function, 105, 117ff, 277 Corrected trapezoidal method, 225
asymptotic behavior, 326ff Corrector formula, 223, 259
Bessel inequality, 349, 362 Cotes’ rule, 215
Bifurcation, Appendix B Critical point, 10, 12, 133, 383, 385
Bifurcation point, 301 degenerate, 383
Boundary term, 303 Cumulative error, 226
Branch point, 263
Branch pole, 266 Damped linear oscillator, 138
Brusselator equation, 381 Damped nonlinear oscillation, 159
Deferred approach to limit, see Richardson
Canonical basis, 270, 298 extrapolation
Cauchy polygon, 205 Delta-function, 68
Cauchy product formula, 111 Derivative of vector function, 172
Cauchy sequence, 365 Deviation, 207
Central difference, 231 Difference equation, 219
Characteristic equation, 238, 271 constant coefficients, 238


Difference equation (Continued) Fourier coefficients, 348


operator, 230 Fourier convergence theorem, 344
Differential equation (DE), 1 Fourier series, 90, 344
Differential inequality, 27 Frenet-Serret formula 182
Directional derivative, 209 Frequency, 36, 160
Direction field, 21 Fuchsian equation, 294{f
Discretization error, see Truncation error Function tables, 23, 231
Divided difference, 234 Fundamental theorem of algebra, 73
Domain, 12 Fundamental theorem of calculus, 3
of convergence, 111
Duffing’s equation, 390 Gain function, 87
Dynamical system, 155 Gaussian quadrature, 248
Dynamic stability, 164 Gegenbauer polynomial, 290
Generating function, 120
Eigenfunction, 301, 318 Global solution, 161
normalized, 329 Gradient curve, 10, 140
Eigensolution, 143 Graphical integration, 20
Eigenvalue, 301, 336 Green’s function, 58, 65, 93, 216, 354
Elastic bar, 306 Gregory-Newton formula, 233
Elastic spring, 160
Elementary function, 99, 295 Hamiltonian system, 383{f
Elliptic function, 140 bifurcation, 387
Elliptic integral, 101 Hankel function, 328
Endpoint conditions, 300 Hard spring, 160
periodic, 301 Harmonic oscillator, 88, 337
separated, 300 Hermite DE, 104, 320, 322
Energy function, see Liapunov function Hermite functions, 310, 322, 338
Entire function, 112 Hermite interpolation, 234
Equicontinuous functions, 192 Hermite polynomial, 104, 310, 357
Equilibrium point, 155 Hermite quadrature, 215, 250
Equivalent autonomous systems, 144, 15iff Hilbert space, 365ff
Equivalent integral equation, 183 Holomorphic function, 261
Error, 205ff Homogeneous linear DE, 7, 35
Error function, 4 Hopf bifurcation, 381
Escape time, 198 Hypergeometric DE, 276, 297
Essential singularity, 265 Hypergeometric function, 276, 287ff, 297
Euclidean vector space, 360
Euler-Lagrange equation, 63 Implicit function theorem, 13
Euler method, 21, 205 Implicit solution, 12
Euler’s DE, homogeneous, 62, 74, 262-263 Improved Euler method, 21, 222
Exact differential, 15 Indicial equation, 112, 263, 273, 275
for second-order DE, 54 Indicial polynomial, 275
Existence theorems, 24, 115, 124ff, Ch. 6 Inhomogeneous linear DE, 7, 58, 83
Exponential series, 101, 372 Initial condition, 3, 37
Exponential substitution, 71 Initial value problem, 8, 24, 37, 72, 132, 141
Extremum, 63 Inner product, 172, 194, 373
Integral, 13
Féjer’s convergence theorem, 345 Integral curve, 14
Fibonacci number, 235 Integral operator, 65, 334
Floquet theorem, 377 Integrating factor, 16, 54
Focal point, 52, 144, 147, 383, 390 Interpolation, 233ff
Fold bifurcation, 387 Interpolation error, 235
Forcing term, 35, 58 Invariant lines (radii), 18, 53, 143
Formal power series, 122 Irregular singular point, 266, 292
Forward difference, 231 Iterative solution, 185, 257

Jacobi DE, 106 n-body problem, 181, 183


Jacobi identity, 288 Negative damping, 164
Jacobi polynomial, 289, 290 Neumann function, 278, 328
Jordan normal form, 374 Neutral stability, 153
Nodal point, 53, 148
Nonlinear oscillations, 158, 163
Lagrange identity, 55, 303
Norm, 206, 371
Lagrange interpolation, 233, 236
Normal curve family, 31, 129
Laguerre DE, 323, 354, 357
Normal DE, 2, 34
generalized, 342
Normal system, 132, 180
Laguerre polynomials, 311, 355
Numerical differentiation, 240
Laurent series, 265
Numerical integration, 21
Least square approximation, 348
Numerical quadrature, 212
Legendre DE, 34, 102, 308, 323
Nyquist diagram, 90ff
Legendre polynomial, 102, 308, 332, 342, 357
Nyquist stability criterion, 93
Leibniz’s rule, 60
Level curves, 11, 151
One-body problem, 183
Levinson-Smith theorem, 163
One-step method, 251
Liapunov function, 158
Operational calculus, 76
Lienard equation, 165
Operator, 41, 184
Limit cycle, 164, 381, 388
Order of accuracy, 110, 210
Linear DE, 7
Order of DE, 1
Linear equivalence, 144
Order of growth theorem, 285
Linear fractional DE, 17
Orthogonal expansion, 346
Linear independence, 78
Orthogonal functions, 302, 309
Linear operator, 40, 71, 231, 300
Orthogonal polynomials, 352ff
Linear system, 188, 371ff
Orthogonal trajectories, 10, 140
Liouville normal form, 324
Orthonormal basis, 367
Liouville substitution, 324
Orthonormal functions, 348
Lipschitz condition, 26, 173
Oscillatory solution, 38, 51
generalized, 29
Osgood’s uniqueness theorem, 33
one-sided, 27, 180, 210, 243
Lipschitz constant, 26
Parabolic interpolation, 235
Parametric solution, 16
Majorize, 113, 126 Parseval equality, 351, 362
Mathieu equation, 299, 301, 377, 389 Particular solution, 41, 71
Mathieu function, 302, 305, 390 Peano existence theorem, 192
Matrizant, 375 Peano uniqueness theorem, 29

Mean-square convergence, 347 Pearson’s DE, 108


Mesh-halving, 210 Period, 36, 160

Method of Liapounov, 157 Periodically forced system, 387ff


Method of Majorants, 113, 126 Periodic input, 89
Method of Undermined Coefficients, 10iff, 113, Perturbation equation, 198
121ff Phase, 36, 160, 312
Midpoint method, 238 Phase constant, 87
Midpoint quadrature, 212 Phase lag, 87
Milne’s method, 256 Phase plane, 49, 136
Milne's predictor, 258 Phase portrait, 52

Mixed spectrum, 339 Phase space, 155


Modified amplitude, phase, 324 Picard approximations, 185ff
Modified Bessel function, 280 Picone’s identity, 320
Modified Euler method, 224 Piecewise continuous, 47
Movable singular point, 263 Plane autonomous systems, Ch. 5, 201

Multiplicity of root, 73 Poincare index, 381ff


Multistep methods, 259 Poincaré-Liapunov theorem, 158

Poincare map, 388 Solution curve, 2


Pole, 263, 265 Sonin-Polya Theorem, 332
Polynomial interpolation, 232 Spectrum, 301, 306
of stable type, 86 continuous, discrete, 337
Potential energy (integral), 156, 159 mixed, 339
Predictor-corrector methods, 223, 259 Spline interpolation, 234
Prufer substitution, 312 Square-integrable function, 309, 349
modified, 323 Square-well potential, 358
Prufer system, 313 Stability, 85, 153, 237, 374
Punctured plane, 19 diagram, 40
of difference equation, 239, 258
Quadrature, 4, 8, 244 Stable, 39, 85, 153, 239
Quartic interpolation, 233 Star point, 148, 149
Quasilinear DE, 1, 11 Starting process, 257
Static stability, 163
Radius of convergence, 112, 124 Strict stability, 39, 153, 239, 374
Rayleigh DE, 165 Sturm comparison theorem, 47, 313
Reduced equation, 35, 71 Sturm convexity theorem, 318
Reduced system, 188 Sturm-Liouville systems, 300ff
Regular curve family, 31, 135, 201 series, 302
Regular singular point, 266, 274ff, 292 Sturm oscillation theorem, 317
Regular Sturm-Liouville system, 300 Sturm separation theorem, 47, 314
Relative error, 207 Successive approximation, 185
Removable singularity, 265 Superposition principle, 35, 68
Resonance, 89 Szego’s comparison theorem, 320
Riccati DE, 45, 68, 124
Richardson extrapolation, 211 Taylor series method, 22
Riemann DE, 296 Trajectory, 50
Rodrigues formula, 105, 290, 358 Transfer function, 86ff
Romberg quadrature, 249 Trapezoidal integration, 218, 228
Roundoff error, 243 Trapezoidal quadrature, 215
Routh-Hurwitz conditions, 86 Trigonometric DE, 35, 85, 116
Runge-Kutta method, 250 Truncation error, 243
Two-endpoint problems, 63, 333
Saddle point, 53, 148, 383
Schroedinger equation, 336ff
Ultraspherical DE, 289
Schwarz inequality, 172, 309
Undamped oscillations, 159
Schwarzian, 69
Undetermined coefficients, methods of, 101ff,
Second differences, 232
12iff
Secular equation, 142, 373
Uniform mesh, 213
Self-adjoint DE, 56, 300
Uniqueness theorem, 26, 41, 174
Separable equations, 8
Unstable critical point, 153
Separatrix, 139, 390
Unstable DE, 39, 85
Similarity, 19
Unstable difference approximation, 239, 258
Simple branch point, 266
Simple pendulum, 138
van der Pol DE, 165
Simpson’s five-eight rule, 250
Variation of parameters, 60
Simpson’s rule, 22, 218, 247
Sine integral function, 4, 100 Vector DE, 131
Vibrating membrane, 307
Singular point, 34, 261
Vibrating string, 306
Singular Sturm-Liouville system, 308, 337
Vortex point, 139, 144, 159, 383
Singularity, see Singular point
Soft spring, 160
Solution, 1, 34, 71 Wave number, 36
Solution basis, see Basis of solutions Weddle’s rule, 215

Weierstrass approximation theorem, 353 Well-posed problem, 24, 64, 170, 174
Weierstrass convergence theorem, 195 Well-set see Well-posed problem
Weight function, 302, 309, 352 Wronskian, 43
