Ordinary Differential Equations
Alexander Grigorian
University of Bielefeld
Lecture Notes, April - July 2008
Contents
1 Introduction: the notion of ODEs and examples
1.1 Separable ODE
1.2 Linear ODE of 1st order
1.3 Quasi-linear ODEs and differential forms
1.4 Integrating factor
1.5 Second order ODE
1.5.1 Newton's second law
1.5.2 Electrical circuit
3.8.2 Jordan cells
3.8.3 Jordan normal form
3.8.4 Transformation of an operator to a Jordan normal form
1 Introduction: the notion of ODEs and examples
A differential equation (Differentialgleichung) is an equation for an unknown function that contains not only the function but also its derivatives (Ableitung). In general, the unknown function may depend on several variables and the equation may include various partial derivatives. However, in this course we consider only differential equations for a function of a single real variable. Such equations are called ordinary differential equations¹, or ODEs for short (gewöhnliche Differentialgleichungen).
The most general ODE has the form
F(x, y, y', ..., y^(n)) = 0,    (1.1)
where F is a given function and y = y(x) is an unknown function. We begin with first order ODEs resolved with respect to the derivative, that is, equations of the form
y' = f(x, y),    (1.2)
where y = y(x) is the unknown real-valued function of a real argument x, and f(x, y) is a given function of two real variables.
Consider a couple (x, y) as a point in R^2 and assume that the function f is defined on a set D ⊂ R^2, which is called the domain (Definitionsbereich) of the function f and of the equation (1.2). Then the expression f(x, y) makes sense whenever (x, y) ∈ D.
Definition. A real valued function y(x) defined on an interval² I ⊂ R is called a (particular) solution of (1.2) if y(x) is differentiable at any x ∈ I, the point (x, y(x)) belongs to D for any x ∈ I, and the identity y'(x) = f(x, y(x)) holds for all x ∈ I.
The family of all particular solutions of (1.2) is called the general solution. The graph
of a particular solution is called an integral curve of the equation. Obviously, any integral
curve is contained in the domain D.
Usually a given ODE cannot be solved explicitly. We will consider some classes of f(x, y) for which one can find the general solution of (1.2) in terms of indefinite integration.
¹ The theory of partial differential equations, that is, of equations containing partial derivatives, is the topic of another lecture course.
² Here and below, by an interval we mean any set of the form (a, b), [a, b], [a, b), or (a, b], where −∞ ≤ a < b ≤ +∞.
Example. Assume that the function f does not depend on y, so that (1.2) becomes y' = f(x). Hence, y must be a primitive function³ of f. Assuming that f is a continuous (stetig) function on an interval I, we obtain the general solution on I by means of indefinite integration:
y = ∫ f(x) dx = F(x) + C,
where F(x) is a primitive function of f(x) and C is an arbitrary constant.
Example. Consider the ODE y' = y. In the domain y > 0, dividing by y gives (ln y)' = y'/y = 1, whence
ln y = x + C,
whence
y = e^C e^x = C_1 e^x,
where C_1 = e^C. Since C ∈ R is arbitrary, C_1 = e^C is any positive number. Hence, any positive solution y has the form
y = C_1 e^x,  C_1 > 0.
Similarly, any negative solution has the form y = C_1 e^x with C_1 < 0. Hence, every solution obtained so far can be written as
y(x) = C e^x,
where C > 0 or C < 0. Clearly, C = 0 suits as well since y ≡ 0 is a solution. The next plot contains the integral curves of such solutions:
³ By definition, a primitive function of f is any function whose derivative is equal to f.
[Plot: the integral curves y = C e^x for several values of C, shown for x ∈ [−2, 2].]
Let us show that the family of solutions y = C e^x, C ∈ R, is the general solution. Indeed, if y(x) is a solution that takes a positive value somewhere, then it is positive in some open interval, say I. By the above argument, y(x) = C e^x in I, where C > 0. Since e^x ≠ 0, this solution does not vanish at the endpoints of I either. This implies that the solution must be positive on the whole interval where it is defined. It follows that y(x) = C e^x in the domain of y(x). The same argument applies if y(x) < 0 for some x.
Hence, the general solution of the ODE y' = y is y(x) = C e^x, where C ∈ R. The constant C is referred to as a parameter. It is clear that the particular solutions are distinguished by the values of the parameter.
1.1 Separable ODE
A separable ODE is a first order ODE of the form
y' = f(x) g(y).    (1.3)
Theorem 1.1 (The method of separation of variables) Let f(x) and g(y) be continuous functions on open intervals I and J, respectively, and assume that g(y) ≠ 0 on J. Let F(x) be a primitive function of f(x) on I and G(y) be a primitive function of 1/g(y) on J. Then a function y defined on some subinterval of I solves the differential equation (1.3) if and only if it satisfies the identity
G(y(x)) = F(x) + C    (1.4)
for some real constant C.
For example, consider again the ODE y' = y in the domain x ∈ R, y > 0. Then f(x) = 1 and g(y) = y ≠ 0, so that Theorem 1.1 applies. We have
F(x) = ∫ f(x) dx = ∫ dx = x
and
G(y) = ∫ dy/g(y) = ∫ dy/y = ln y,
where we do not write the constant of integration because we need only one primitive function. The equation (1.4) becomes
ln y = x + C,
whence we obtain y = C_1 e^x as in the previous example. Note that Theorem 1.1 does not cover the case when g(y) may vanish, which must be analyzed separately when needed.
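For instance, the same scheme applies to the separable ODE y' = x y in the domain y > 0:
\[
F(x) = \int x\,dx = \frac{x^2}{2}, \qquad G(y) = \int \frac{dy}{y} = \ln y,
\]
\[
\ln y = \frac{x^2}{2} + C \quad\Longrightarrow\quad y = C_1 e^{x^2/2}, \qquad C_1 = e^C > 0.
\]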
Proof. Let y(x) solve (1.3). Since g(y) ≠ 0, we can divide (1.3) by g(y), which yields
y'/g(y) = f(x).    (1.5)
Observe that by the hypothesis f(x) = F'(x) and 1/g(y) = G'(y), which implies by the chain rule
y'/g(y) = G'(y) y' = (G(y(x)))'.
Hence, the equation (1.3) is equivalent to
(G(y(x)))' = F'(x),    (1.6)
which implies (1.4).
Conversely, if a function y satisfies (1.4) and is known to be differentiable in its domain, then differentiating (1.4) in x, we obtain (1.6); arguing backwards, we arrive at (1.3). The only question that remains to be answered is why y(x) is differentiable. Since the function g(y) does not vanish, it is either positive or negative in the whole domain. Then the function G(y), whose derivative is 1/g(y), is either strictly increasing or strictly decreasing in the whole domain. In both cases, the inverse function G^{-1} is defined and is differentiable. It follows from (1.4) that
y(x) = G^{-1}(F(x) + C).    (1.7)
Since both F and G^{-1} are differentiable, we conclude by the chain rule that y is also differentiable, which finishes the proof.
Corollary. Under the conditions of Theorem 1.1, for all x_0 ∈ I and y_0 ∈ J there exists a unique value of the constant C such that the solution y(x) defined by (1.7) satisfies the condition y(x_0) = y_0.
The condition y(x_0) = y_0 is called the initial condition (Anfangsbedingung).
Proof. Setting in (1.4) x = x_0 and y = y_0, we obtain G(y_0) = F(x_0) + C, which allows us to determine the value of C uniquely, namely C = G(y_0) − F(x_0). Conversely, assume that C is given by this formula and let us prove that it determines by (1.7) a solution y(x). If the right hand side of (1.7) is defined on an interval containing x_0, then by Theorem 1.1 it defines a solution y(x), and this solution satisfies y(x_0) = y_0 by the choice of C. We only have to make sure that the domain of the right hand side of (1.7) contains an interval around x_0 (a priori it may happen that the composite function G^{-1}(F(x) + C) has empty domain). For x = x_0 the right hand side of (1.7) is
G^{-1}(F(x_0) + C) = G^{-1}(G(y_0)) = y_0,
so that the function y(x) is defined at x = x_0. Since both functions G^{-1} and F + C are continuous and defined on open intervals, their composition is defined on an open set. Since this set contains x_0, it contains also an interval around x_0. Hence, the function y is defined on an interval around x_0, which finishes the proof.
One can rephrase the statement of the Corollary as follows: for all x_0 ∈ I and y_0 ∈ J there exists a unique solution y(x) of (1.3) that satisfies in addition the initial condition y(x_0) = y_0; that is, for every point (x_0, y_0) ∈ I × J there is exactly one integral curve of the ODE that goes through this point. However, the meaning of the uniqueness claim in this form is a bit ambiguous, because out of any solution y(x) one can make another solution just by slightly reducing the domain, and if the reduced domain still contains x_0 then the initial condition will be satisfied also by the new solution. The precise uniqueness claim means that any two solutions satisfying the same initial condition coincide on the intersection of their domains; also, such solutions correspond to the same value of the parameter C.
In applications of Theorem 1.1, it is necessary to find the functions F and G. Technically it is convenient to combine the evaluation of F and G with other computations as follows. The first step is always dividing (1.3) by g to obtain (1.5). Then integrate both sides in x to obtain
∫ y'/g(y) dx = ∫ f(x) dx.    (1.8)
Then we need to evaluate the integral in the right hand side. If F(x) is a primitive of f then we write
∫ f(x) dx = F(x) + C.
In the left hand side of (1.8), we have y' dx = dy. Hence, we can change variables in the integral, replacing the function y(x) by an independent variable y. We obtain
∫ y'/g(y) dx = ∫ dy/g(y) = G(y) + C.
Equating the two integrals, we recover the identity (1.4): G(y) = F(x) + C.
Example. Consider the ODE y' = √|y|. In the domain y > 0 we have
∫ dy/√y = ∫ dx,
whence
2√y = x + C
and
y = (x + C)²/4,  x > −C
(the restriction x > −C comes from the previous line). Similarly, in the domain y < 0, we obtain
∫ dy/√(−y) = ∫ dx,
whence
−2√(−y) = x + C
and
y = −(x + C)²/4,  x < −C.
We obtain the following integral curves:
[Plot: the integral curves y = ±(x + C)²/4, shown for x ∈ [−2, 5], y ∈ [−4, 4].]
We see that the integral curves in the domain y > 0 touch the curve y = 0, and so do the integral curves in the domain y < 0. This allows us to construct more solutions as follows: take a solution y_1(x) < 0 that vanishes at x = a and a solution y_2(x) > 0 that vanishes at x = b, where a < b are arbitrary reals. Then define a new solution:
y(x) = y_1(x) for x < a,  y(x) = 0 for a ≤ x ≤ b,  y(x) = y_2(x) for x > b.
Note that such solutions are not obtained automatically by the method of separation of variables. It follows that through any point (x_0, y_0) ∈ R^2 there are infinitely many integral curves of the given equation.
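A minimal numerical sanity check of this gluing (a Python sketch, assuming NumPy is available; the gluing points a = 0, b = 1 are an arbitrary choice):

import numpy as np

a, b = 0.0, 1.0  # gluing points, any a < b works

def y(x):
    # glued solution: -(x-a)^2/4 for x < a, 0 on [a, b], (x-b)^2/4 for x > b
    return np.where(x < a, -(x - a)**2 / 4,
           np.where(x > b, (x - b)**2 / 4, 0.0))

x = np.linspace(-3, 4, 2001)
h = 1e-6
dy = (y(x + h) - y(x - h)) / (2 * h)          # central-difference approximation of y'
print(np.max(np.abs(dy - np.sqrt(np.abs(y(x))))))   # small: the ODE y' = sqrt(|y|) holds up to discretization error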
1.2 Linear ODE of 1st order
A linear ODE of first order is an equation of the form
y' + a(x) y = b(x),    (1.9)
where a(x) and b(x) are given functions.
Theorem 1.2 (The method of variation of parameter) Let the functions a(x) and b(x) be continuous in an interval I. Then the general solution of the linear ODE (1.9) has the form
y(x) = e^{−A(x)} ∫ b(x) e^{A(x)} dx,    (1.10)
where A(x) is a primitive function of a(x) on I.
Note that the function y(x) given by (1.10) is defined on the full interval I.
Proof. Let us make the change of the unknown function u(x) = y(x) e^{A(x)}. Then
u' = (y' + a(x) y) e^{A(x)},
so that the equation (1.9) is equivalent to u' e^{−A} = b, that is, to u' = b e^{A}. Integrating and returning to y = u e^{−A}, we obtain (1.10). In particular, for b ≡ 0, that is, for the homogeneous equation
y' + a(x) y = 0,
the general solution is y = C e^{−A(x)}.
Example. Consider the equation
y' + (1/x) y = e^{x²}    (1.12)
in the domain x > 0. Then
A(x) = ∫ a(x) dx = ∫ dx/x = ln x
(we do not add a constant C since A(x) is one of the primitives of a(x)), whence
y(x) = (1/x) ∫ e^{x²} x dx = (1/(2x)) ∫ e^{x²} d(x²) = (1/(2x)) (e^{x²} + C),
where C is an arbitrary constant.
Alternatively, one can first solve the homogeneous equation
y' + (1/x) y = 0
using separation of variables:
y'/y = −1/x,
(ln y)' = −(ln x)',
ln y = −ln x + C_1,
y = C/x.
Next, replace the constant C by a function C(x) and substitute into (1.12):
(C(x)/x)' + (1/x) (C(x)/x) = e^{x²},
(C' x − C)/x² + C/x² = e^{x²},
C'/x = e^{x²},
C' = e^{x²} x,
C(x) = ∫ e^{x²} x dx = (1/2)(e^{x²} + C_0).
Hence,
y = C(x)/x = (1/(2x))(e^{x²} + C_0),
where C_0 is an arbitrary constant.
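This closed form can be cross-checked numerically; a minimal Python sketch, assuming NumPy and SciPy are available, with the particular constant C_0 = 0 and the initial value taken from the formula at x = 1:

import numpy as np
from scipy.integrate import solve_ivp

exact = lambda x: np.exp(x**2) / (2 * x)            # the solution of (1.12) with C_0 = 0

# rewrite (1.12) as y' = e^{x^2} - y/x and integrate from x = 1
sol = solve_ivp(lambda x, y: np.exp(x**2) - y / x,
                (1.0, 2.0), [exact(1.0)], dense_output=True, rtol=1e-10, atol=1e-12)

xs = np.linspace(1.0, 2.0, 5)
print(np.max(np.abs(sol.sol(xs)[0] - exact(xs))))   # small discrepancy, as expected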
Corollary. Under the conditions of Theorem 1.2, for any x_0 ∈ I and any y_0 ∈ R there exists exactly one solution y(x) defined on I and such that y(x_0) = y_0.
That is, through any point (x_0, y_0) ∈ I × R there goes exactly one integral curve of the equation.
Proof. Let B(x) be a primitive of b e^{A}, so that the general solution can be written in the form
y = e^{−A(x)} (B(x) + C)
with an arbitrary constant C. Obviously, any such solution is defined on I. The condition y(x_0) = y_0 allows us to determine C uniquely from the equation:
C = y_0 e^{A(x_0)} − B(x_0),
which finishes the proof.
1.3 Quasi-linear ODEs and differential forms
Let F(x, y) be a real valued function defined in an open set Ω ⊂ R^2. Recall that F is called differentiable at a point (x, y) ∈ Ω if there exist real numbers a, b such that
F(x + dx, y + dy) − F(x, y) = a dx + b dy + o(|dx| + |dy|)
as |dx| + |dy| → 0. Here dx and dy are the increments of x and y, respectively, which are considered as new independent variables (the differentials). The linear function a dx + b dy of the variables dx, dy is called the differential of F at (x, y) and is denoted by dF, that is,
dF = a dx + b dy.    (1.13)
In general, a and b are functions of (x, y).
Recall also the following relations between the notion of a differential and partial derivatives:
1. If F is differentiable at some point (x, y) and its differential is given by (1.13) then the partial derivatives F_x = ∂F/∂x and F_y = ∂F/∂y exist at this point and
F_x = a,  F_y = b.
Definition. Given two functions a(x, y) and b(x, y) in Ω, consider the expression
a dx + b dy,
which is called a differential form. The differential form is called exact in Ω if there is a differentiable function F in Ω such that
dF = a dx + b dy,    (1.14)
and inexact otherwise. If the form is exact then the function F from (1.14) is called the integral of the form.
Observe that not every differential form is exact, as one can see from the following statement.
Lemma 1.3 If the functions a, b are continuously differentiable in Ω then a necessary condition for the form a dx + b dy to be exact is the identity
a_y = b_x.
Proof. Indeed, if F is an integral of the form a dx + b dy then F_x = a and F_y = b, whence it follows that the derivatives F_x and F_y are continuously differentiable. By a well-known fact from Analysis, this implies that F_xy = F_yx, whence a_y = b_x.
Example. The form y dx − x dy is inexact because a_y = 1 while b_x = −1.
The form y dx + x dy is exact because it has the integral F(x, y) = xy.
The form 2xy dx + (x² + y²) dy is exact because it has the integral F(x, y) = x² y + y³/3 (it will be explained later how one can obtain an integral).
If the differential form a dx + b dy is exact then this allows us to solve easily the following differential equation:
a(x, y) + b(x, y) y' = 0.    (1.15)
This ODE is called quasi-linear because it is linear with respect to y' but not necessarily linear with respect to y. Using y' = dy/dx, one can write (1.15) in the form
a(x, y) dx + b(x, y) dy = 0,
which explains why the equation (1.15) is related to the differential form a dx + b dy. We say that the equation (1.15) is exact if the form a dx + b dy is exact.
Theorem 1.4 Let the form a dx + b dy be exact in Ω with an integral F. Then a function y(x) defined on an interval I, whose graph is contained in Ω, solves (1.15) if and only if F(x, y(x)) = const on I.
Proof. The hypothesis that the graph of y(x) is contained in Ω implies that the composite function F(x, y(x)) is defined on I. By the chain rule, we have
d/dx F(x, y(x)) = F_x + F_y y' = a + b y'.
Hence, the equation a + b y' = 0 is equivalent to d/dx F(x, y(x)) = 0, and the latter is equivalent to F(x, y(x)) = const.
Example. The equation y + x y' = 0 is exact and is equivalent to xy = C because y dx + x dy = d(xy). The same result can be obtained using the method of separation of variables.
The equation 2xy + (x² + y²) y' = 0 is exact and is equivalent to
x² y + y³/3 = C.
Below are some integral curves of this equation:
[Plot: integral curves of 2xy + (x² + y²) y' = 0, shown for x ∈ [−7.5, 7.5], y ∈ (0, 2].]
How can one decide whether a given differential form is exact or not? A partial answer is given by the following theorem.
We say that a set Ω ⊂ R^2 is a rectangle (box) if it has the form I × J where I and J are intervals in R.
Theorem 1.5 Let Ω ⊂ R^2 be an open rectangle and let a, b be continuously differentiable functions in Ω satisfying a_y ≡ b_x. Then the form a dx + b dy is exact in Ω.
Proof of Theorem 1.5. Assume first that the integral F exists and F(x_0, y_0) = 0 for some point (x_0, y_0) ∈ Ω (the latter can always be achieved by adding a constant to F). For any point (x, y) ∈ Ω, also the point (x, y_0) ∈ Ω; moreover, the intervals [(x_0, y_0), (x, y_0)] and [(x, y_0), (x, y)] are contained in Ω because Ω is a rectangle. Since F_x = a and F_y = b, we obtain by the fundamental theorem of calculus that
F(x, y_0) = F(x, y_0) − F(x_0, y_0) = ∫_{x_0}^{x} F_x(s, y_0) ds = ∫_{x_0}^{x} a(s, y_0) ds
and
F(x, y) − F(x, y_0) = ∫_{y_0}^{y} F_y(x, t) dt = ∫_{y_0}^{y} b(x, t) dt,
whence
F(x, y) = ∫_{x_0}^{x} a(s, y_0) ds + ∫_{y_0}^{y} b(x, t) dt.    (1.16)
Now use the formula (1.16) to define the function F(x, y). Let us show that F is indeed the integral of the form a dx + b dy. Since a and b are continuous, it suffices to verify that F_x = a and F_y = b.
It is easy to see from (1.16) that
F_y = ∂/∂y ∫_{y_0}^{y} b(x, t) dt = b(x, y).
Next, we have
F_x = ∂/∂x ∫_{x_0}^{x} a(s, y_0) ds + ∂/∂x ∫_{y_0}^{y} b(x, t) dt
    = a(x, y_0) + ∫_{y_0}^{y} ∂/∂x b(x, t) dt.    (1.17)
The fact that the integral and the derivative ∂/∂x can be interchanged will be justified below (see Lemma 1.6). Using the hypothesis b_x = a_y, we obtain from (1.17)
F_x = a(x, y_0) + ∫_{y_0}^{y} a_y(x, t) dt
    = a(x, y_0) + (a(x, y) − a(x, y_0))
    = a(x, y),
which was to be proved.
Lemma 1.6 Let g(x, t) be a continuous function on I × J where I and J are bounded closed intervals in R. Consider the function
f(x) = ∫_α^β g(x, t) dt,
where [α, β] = J, which is defined for all x ∈ I. If the partial derivative g_x exists and is continuous on I × J then f is continuously differentiable on I and, for any x ∈ I,
f'(x) = ∫_α^β g_x(x, t) dt.
Proof. The claimed identity for f'(x) amounts to
∫_α^β (g(x', t) − g(x, t))/(x' − x) dt → ∫_α^β g_x(x, t) dt  as x' → x.
Note that by the definition of a partial derivative, for any t ∈ [α, β],
(g(x', t) − g(x, t))/(x' − x) → g_x(x, t)  as x' → x.    (1.18)
Consider all parts of (1.18) as functions of t, with fixed x and with x' as a parameter. Then we have a convergence of a family of functions, and we would like to deduce that their integrals converge as well. By a result from Analysis II, this is the case if the convergence is uniform (gleichmässig) in the whole interval [α, β], that is, if
sup_{t ∈ [α, β]} | (g(x', t) − g(x, t))/(x' − x) − g_x(x, t) | → 0  as x' → x.    (1.19)
By the mean value theorem, for any t ∈ [α, β], there is ξ ∈ [x, x'] such that
(g(x', t) − g(x, t))/(x' − x) = g_x(ξ, t).
Hence, the difference quotient in (1.19) can be replaced by g_x(ξ, t). To proceed further, recall that a continuous function on a compact set is uniformly continuous. In particular, the function g_x(x, t) is uniformly continuous on I × J, that is, for any ε > 0 there is δ > 0 such that
|x − ξ| < δ implies |g_x(x, t) − g_x(ξ, t)| < ε for all t ∈ [α, β].
Since |ξ − x| ≤ |x' − x|, the condition (1.19) follows, which finishes the proof.
Example. Consider the form 2xy dx + (x² + y²) dy in Ω = R^2. Since
a_y = 2x = b_x,
we conclude by Theorem 1.5 that the given form is exact. The integral F can be found by (1.16) taking x_0 = y_0 = 0:
F(x, y) = ∫_0^x 2s·0 ds + ∫_0^y (x² + t²) dt = x² y + y³/3,
as was observed above.
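A minimal Python sketch of the same computation with SymPy (just an illustration of formula (1.16) with the base point (x_0, y_0) = (0, 0)):

from sympy import symbols, integrate, simplify

x, y, s, t = symbols('x y s t')
a = 2*x*y
b = x**2 + y**2

# formula (1.16) with (x0, y0) = (0, 0)
F = integrate(a.subs({x: s, y: 0}), (s, 0, x)) + integrate(b.subs(y, t), (t, 0, y))
print(F)                                                   # x**2*y + y**3/3
print(simplify(F.diff(x) - a), simplify(F.diff(y) - b))    # both 0, so dF = a dx + b dy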
Example. Consider the differential form
(−y dx + x dy)/(x² + y²)    (1.21)
in Ω = R^2 ∖ {0}. This form satisfies the condition a_y = b_x because
a_y = ∂/∂y ( −y/(x² + y²) ) = −((x² + y²) − 2y²)/(x² + y²)² = (y² − x²)/(x² + y²)²
and
b_x = ∂/∂x ( x/(x² + y²) ) = ((x² + y²) − 2x²)/(x² + y²)² = (y² − x²)/(x² + y²)².
By Theorem 1.5 we conclude that the given form is exact in any rectangular domain in Ω. However, let us show that the form is inexact in Ω.
Consider the function θ(x, y), the polar angle, which is defined in the domain
Ω_0 = R^2 ∖ {(x, 0) : x ≤ 0}
by the conditions
sin θ = y/r,  cos θ = x/r,  θ ∈ (−π, π),
where r = √(x² + y²). Let us show that in Ω_0
dθ = (−y dx + x dy)/(x² + y²).    (1.22)
In the half-plane {x > 0} we have tan θ = y/x and θ ∈ (−π/2, π/2), whence
θ = arctan(y/x).
Then (1.22) follows by differentiation of the arctan:
dθ = 1/(1 + (y/x)²) · (x dy − y dx)/x² = (−y dx + x dy)/(x² + y²).
In the half-plane {y > 0} we have cot θ = x/y and θ ∈ (0, π), whence
θ = arccot(x/y),
and (1.22) follows again. Finally, in the half-plane {y < 0} we have cot θ = x/y and θ ∈ (−π, 0), whence
θ = −arccot(−x/y),
and (1.22) follows again. Since Ω_0 is the union of the three half-planes {x > 0}, {y > 0}, {y < 0}, we conclude that (1.22) holds in Ω_0 and, hence, the form (1.21) is exact in Ω_0.
Why is the form (1.21) inexact in Ω? Assume from the contrary that the form (1.21) is exact in Ω and that F is its integral in Ω, that is,
dF = (−y dx + x dy)/(x² + y²).
Then dF = dθ in Ω_0, whence d(F − θ) = 0 and, hence, F = θ + const in Ω_0. It would follow that θ extends to a continuous function on Ω, which however is not true, because the limits of θ when approaching the point (−1, 0) (or any other point (x, 0) with x < 0) from above and from below are different.
The moral of this example is that the statement of Theorem 1.5 is not true for an arbitrary open set Ω. It is possible to show that the statement of Theorem 1.5 is true if and only if the set Ω is simply connected, that is, if any closed curve in Ω can be continuously deformed to a point while staying in Ω. Obviously, rectangles are simply connected (as well as Ω_0), while the set Ω = R^2 ∖ {0} is not simply connected.
1.5 Second order ODE
A general second order ODE, resolved with respect to y'', has the form
y'' = f(x, y, y'),
where f is a given function of three variables and y = y(x) is an unknown function.
1.5.1 Newton's second law
Consider a particle of mass m moving along the x-axis under a force F(x) that depends only on the position x. Denote by x(t) the position of the particle at time t and by v = x' its velocity. By Newton's second law, m x'' = F(x). Multiplying this equation by x' and integrating in t, we obtain
∫ (m/2) d/dt (x')² dt = ∫ F(x) dx,
whence, denoting by U(x) a primitive function of −F(x) (the potential),
m v²/2 = −U(x) + C
and
m v²/2 + U(x) = C.
The sum m v²/2 + U(x) is called the total energy of the particle (which is the sum of the kinetic energy and the potential energy). Hence, we have obtained the law of conservation of energy: the total energy of the particle in a conservative field remains constant.
1.5.2 Electrical circuit
[Circuit diagram: a voltage source V(t) connected in series with a resistor R and an inductance L.]
Consider a first order ODE
x' = f(t, x),
where f is a real valued function on an open set Ω ⊂ R^2 and the pair (t, x) is considered as a point in R^2.
Let us associate with the given ODE the initial value problem (Anfangswertproblem), shortly IVP, which is the problem of finding a solution that satisfies in addition the initial condition x(t_0) = x_0, where (t_0, x_0) is a given point in Ω. We write the IVP in the compact form
x' = f(t, x),  x(t_0) = x_0.    (2.1)
A solution to the IVP is a differentiable function x(t) : I → R, where I is an open interval containing t_0, such that (t, x(t)) ∈ Ω for all t ∈ I, which satisfies the ODE in I and the initial condition. Geometrically, the graph of the function x(t) is contained in Ω and goes through the point (t_0, x_0).
In order to state the main result, we need the following definitions.
Definition. We say that a function f : Ω → R is Lipschitz in x if there is a constant L such that
|f(t, x) − f(t, y)| ≤ L |x − y|
for all t, x, y such that (t, x) ∈ Ω and (t, y) ∈ Ω. The constant L is called the Lipschitz constant of f in Ω.
We say that a function f : Ω → R is locally Lipschitz in x if, for any point (t_0, x_0) ∈ Ω, there exist positive constants ε, δ such that the rectangle
R = [t_0 − δ, t_0 + δ] × [x_0 − ε, x_0 + ε]    (2.2)
is contained in Ω and the function f is Lipschitz in R; that is, there is a constant L such that for all t ∈ [t_0 − δ, t_0 + δ] and x, y ∈ [x_0 − ε, x_0 + ε],
|f(t, x) − f(t, y)| ≤ L |x − y|.
Note that in the latter case the constant L may be different for different rectangles.
Lemma 2.1 (a) If the partial derivative f_x exists and is bounded in a rectangle R ⊂ R^2 then f is Lipschitz in x in R.
(b) If the partial derivative f_x exists and is continuous in an open set Ω ⊂ R^2 then f is locally Lipschitz in x in Ω.
Proof. (a) If (t, x) and (t, y) belong to R then the whole interval between these points is also in R, and we have by the mean value theorem
f(t, x) − f(t, y) = f_x(t, ξ)(x − y)
for some ξ between x and y, whence, setting
L = sup_R |f_x|,    (2.3)
we obtain
|f(t, x) − f(t, y)| ≤ L |x − y|.
Hence, f is Lipschitz in R with the Lipschitz constant (2.3).
(b) Fix a point (t_0, x_0) ∈ Ω and choose positive ε, δ so small that the rectangle R defined by (2.2) is contained in Ω (which is possible because Ω is an open set). Since R is a bounded closed set, the continuous function f_x is bounded on R. By part (a) we conclude that f is Lipschitz in R, which means that f is locally Lipschitz in Ω.
Example. The function f(t, x) = |x| is Lipschitz in x in R^2 because
| |x| − |y| | ≤ |x − y|
by the triangle inequality for |x|. Clearly, f is not differentiable in x at x = 0. Hence, the continuous differentiability of f is sufficient for f to be Lipschitz in x but not necessary.
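For comparison, consider f(t, x) = x², which is smooth and hence locally Lipschitz in x by Lemma 2.1, but is not Lipschitz in x in the whole plane R²:
\[
|x^2 - y^2| = |x + y|\,|x - y| \le 2R\,|x - y| \quad \text{for } |x|, |y| \le R,
\]
so L = 2R works on every strip R × [−R, R], while
\[
\frac{|x^2 - y^2|}{|x - y|} = |x + y| \to \infty,
\]
so no single constant L works on all of R².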
The next theorem is one of the main results of this course.
Theorem 2.2 (The Picard-Lindelöf theorem) Let Ω be an open set in R^2 and let f(t, x) be a continuous function in Ω that is locally Lipschitz in x.
(Existence) For any point (t_0, x_0) ∈ Ω, the initial value problem (2.1) has a solution.
(Uniqueness) If x_1(t) and x_2(t) are two solutions of the same IVP then x_1(t) = x_2(t) in their common domain.
Remark. By Lemma 2.1, the hypothesis of Theorem 2.2 that f is locally Lipschitz in x could be replaced by the simpler hypothesis that f_x is continuous. However, as we have seen above, there are functions that are Lipschitz but not differentiable, and Theorem 2.2 applies to such functions.
If we completely drop the Lipschitz condition and assume only that f is continuous in (t, x), then the existence of a solution still holds (Peano's theorem), while the uniqueness fails in general, as will be seen in the next example.
Example. Consider the equation x' = √|x|, which was already solved before by separation of variables. The function x(t) ≡ 0 is a solution, and the following two functions
x(t) = t²/4,  t > 0,
x(t) = −t²/4,  t < 0,
are also solutions (this can be trivially verified by substituting them into the ODE). Gluing together these two functions and extending the resulting function to t = 0 by setting x(0) = 0, we obtain a new solution defined for all real t (see the diagram below). Hence, there are at least two solutions that satisfy the initial condition x(0) = 0.
[Plot: the solutions x = ±t²/4 glued through x = 0, shown for t ∈ [−4, 4].]
The uniqueness breaks down because the function √|x| is not Lipschitz near 0.
Proof of existence in Theorem 2.2. We start with the following observation.
Claim. Let x(t) be a function defined on an open interval I ⊂ R. The function x(t) solves the IVP if and only if x(t) is continuous, (t, x(t)) ∈ Ω for all t ∈ I, t_0 ∈ I, and
x(t) = x_0 + ∫_{t_0}^{t} f(s, x(s)) ds.    (2.4)
Indeed, if x solves the IVP then (2.4) follows from x' = f(t, x(t)) just by integration:
∫_{t_0}^{t} x'(s) ds = ∫_{t_0}^{t} f(s, x(s)) ds,
whence
x(t) − x_0 = ∫_{t_0}^{t} f(s, x(s)) ds.
Conversely, if x is a continuous function that satisfies (2.4) then the right hand side of (2.4) is differentiable in t, whence it follows that x(t) is differentiable. It is trivial that x(t_0) = x_0, and after differentiating (2.4) we obtain the ODE x' = f(t, x).
This claim reduces the problem of solving the IVP to the integral equation (2.4). Fix a point (t_0, x_0) ∈ Ω and let ε, δ be the parameters from the local Lipschitz condition at this point; that is, there is a constant L such that
|f(t, x) − f(t, y)| ≤ L |x − y|
for all t ∈ [t_0 − δ, t_0 + δ] and x, y ∈ [x_0 − ε, x_0 + ε]. Set
I = [t_0 − r, t_0 + r] and J = [x_0 − ε, x_0 + ε],
where 0 < r ≤ δ is a new parameter, whose value will be specified later on. By construction, I × J ⊂ Ω.
Denote by X the family of all continuous functions x(t) : I → J, that is,
X = {x : I → J | x is continuous},
[Diagram: the rectangle I × J = [t_0 − r, t_0 + r] × [x_0 − ε, x_0 + ε] inside [t_0 − δ, t_0 + δ] × [x_0 − ε, x_0 + ε].]
and consider the mapping A acting on functions from X by the rule
Ax(t) = x_0 + ∫_{t_0}^{t} f(s, x(s)) ds,
which is obviously motivated by (2.4). To be more precise, we would like to ensure that x ∈ X implies Ax ∈ X. Note that, for any x ∈ X, the point (s, x(s)) belongs to Ω, so that the above integral makes sense and the function Ax is defined on I. This function is obviously continuous. We are left to verify that the image of Ax is contained in J. Indeed, the latter condition means that
|Ax(t) − x_0| ≤ ε for all t ∈ I,
which follows from
|Ax(t) − x_0| = | ∫_{t_0}^{t} f(s, x(s)) ds | ≤ M |t − t_0| ≤ M r ≤ ε,
provided r ≤ ε/M, where
M = sup { |f(s, x)| : s ∈ [t_0 − δ, t_0 + δ], x ∈ [x_0 − ε, x_0 + ε] } < ∞.
To summarize the above argument, we have defined a function family X and a mapping A : X → X. By the above Claim, a function x ∈ X will solve the IVP if x is a fixed point of the mapping A, that is, if x = Ax.
The existence of a fixed point will be obtained using the Banach fixed point theorem: if (X, d) is a complete metric space (vollständiger metrischer Raum) and A : X → X is a contraction mapping (Kontraktionsabbildung), that is,
d(Ax, Ay) ≤ q d(x, y)
for some q ∈ (0, 1) and all x, y ∈ X, then A has a fixed point. In the proof of this theorem, one starts with any element x_0 ∈ X, constructs the sequence of iterations {x_n} using the rule x_{n+1} = A x_n, n = 0, 1, ..., and shows that the sequence {x_n} converges in X to a fixed point.
In order to be able to apply this theorem, we must introduce a distance function d (Abstand) on X so that (X, d) is a complete metric space and A is a contraction mapping with respect to this distance.
Let d be the sup-distance, that is, for any two functions x, y ∈ X, set
d(x, y) = sup_{t ∈ I} |x(t) − y(t)|.
Using the fact that the convergence in (X, d) is the uniform convergence of functions and that a uniform limit of continuous functions is continuous, one can show that the metric space (X, d) is complete (see Exercise 16).
How to ensure that the mapping A : X → X is a contraction? For any two functions x, y ∈ X and any t ∈ I, we have x(t), y(t) ∈ J, whence by the Lipschitz condition
|Ax(t) − Ay(t)| = | ∫_{t_0}^{t} f(s, x(s)) ds − ∫_{t_0}^{t} f(s, y(s)) ds |
  ≤ | ∫_{t_0}^{t} |f(s, x(s)) − f(s, y(s))| ds |
  ≤ | ∫_{t_0}^{t} L |x(s) − y(s)| ds |
  ≤ L |t − t_0| sup_{s ∈ I} |x(s) − y(s)|
  ≤ L r d(x, y).
Therefore,
sup_{t ∈ I} |Ax(t) − Ay(t)| ≤ L r d(x, y),
whence
d(Ax, Ay) ≤ L r d(x, y).
Hence, choosing r < 1/L, we obtain that A is a contraction, which finishes the proof of the existence.
Remark. Let us summarize the proof of the existence of solutions as follows. Let ε, δ, L be the parameters from the local Lipschitz condition at the point (t_0, x_0), that is,
|f(t, x) − f(t, y)| ≤ L |x − y|
for all t ∈ [t_0 − δ, t_0 + δ] and x, y ∈ [x_0 − ε, x_0 + ε]. Let
M = sup { |f(t, x)| : t ∈ [t_0 − δ, t_0 + δ], x ∈ [x_0 − ε, x_0 + ε] }.
Then the IVP has a solution on the interval [t_0 − r, t_0 + r] provided r is a positive number that satisfies the following conditions:
r ≤ δ,  r ≤ ε/M,  r < 1/L.    (2.6)
For some applications, it is important that r can be determined as a function of ε, δ, M, L.
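For instance, for the IVP x' = x², x(0) = 1 one can take δ = ε = 1 (one admissible choice), so that on the rectangle [−1, 1] × [0, 2]:
\[
M = \sup_{|x-1| \le 1} x^2 = 4, \qquad L = \sup_{|x-1| \le 1} |2x| = 4,
\]
and (2.6) is satisfied, for example, by
\[
r = \min\left(\delta, \frac{\varepsilon}{M}, \frac{1}{2L}\right) = \min\left(1, \frac{1}{4}, \frac{1}{8}\right) = \frac{1}{8},
\]
so this IVP is guaranteed to have a solution on [−1/8, 1/8].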
Example. The method of the proof of the existence in Theorem 2.2 suggests the following procedure for computing the solution of the IVP. We start with any function x_0 ∈ X (using the same notation as in the proof) and construct the sequence {x_n} of functions in X using the rule x_{n+1} = A x_n. The sequence {x_n} is called the Picard iterations, and it converges uniformly to the solution x(t).
Let us illustrate this method on the following example:
x' = x,  x(0) = 1.
The operator A is given by
Ax(t) = 1 + ∫_0^t x(s) ds,
whence, setting x_0(t) ≡ 1, we obtain
x_1(t) = 1 + ∫_0^t x_0 ds = 1 + t,
x_2(t) = 1 + ∫_0^t x_1 ds = 1 + t + t²/2,
x_3(t) = 1 + ∫_0^t x_2 ds = 1 + t + t²/2! + t³/3!,
and by induction
x_n(t) = 1 + t + t²/2! + t³/3! + ... + tⁿ/n!.
Clearly, x_n → e^t as n → ∞, and the function x(t) = e^t indeed solves the above IVP.
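The Picard iterations are easy to carry out symbolically; a minimal Python sketch with SymPy of the iteration x_{n+1} = A x_n for this example:

from sympy import symbols, integrate, Integer

t, s = symbols('t s')
x = Integer(1)                                   # starting function x_0(t) = 1
for n in range(5):
    x = 1 + integrate(x.subs(t, s), (s, 0, t))   # x_{n+1}(t) = 1 + \int_0^t x_n(s) ds
    print(x)                                     # 1 + t, 1 + t + t**2/2, ... : partial sums of e^t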
For the proof of the uniqueness, we need the following two lemmas.
Lemma 2.3 (The Gronwall inequality) Let z(t) be a non-negative continuous function on [t_0, t_1] where t_0 < t_1. Assume that there are constants C, L ≥ 0 such that
z(t) ≤ C + L ∫_{t_0}^{t} z(s) ds    (2.7)
for all t ∈ [t_0, t_1]. Then
z(t) ≤ C e^{L(t − t_0)}    (2.8)
for all t ∈ [t_0, t_1].
Proof. We can assume that C is strictly positive. Indeed, if (2.7) holds with C = 0 then it holds with any C > 0. Therefore, (2.8) holds with any C > 0, whence it follows that it holds with C = 0. Hence, assume in the sequel that C > 0. This implies that the right hand side of (2.7) is positive. Set
F(t) = C + L ∫_{t_0}^{t} z(s) ds
and observe that F is differentiable and F' = Lz. It follows from (2.7) that z ≤ F, whence
F' = Lz ≤ LF.
This is a differential inequality for F that can be solved similarly to a separable ODE. Since F > 0, dividing by F we obtain
F'/F ≤ L,
whence by integration
ln (F(t)/F(t_0)) = ∫_{t_0}^{t} F'(s)/F(s) ds ≤ ∫_{t_0}^{t} L ds = L(t − t_0),
whence
F(t) ≤ F(t_0) e^{L(t − t_0)} = C e^{L(t − t_0)}.
Since z(t) ≤ F(t), this proves (2.8).
Lemma 2.4 If S is a subset of an interval U ⊂ R that is both open (offen) and closed (abgeschlossen) in U then either S is empty or S = U.
Any set U that satisfies the conclusion of Lemma 2.4 is called connected (zusammenhängend). Hence, Lemma 2.4 says that any interval is a connected set.
Proof. Set S^c = U ∖ S, so that S^c is closed in U. Assume that both S and S^c are non-empty and choose some points a_0 ∈ S, b_0 ∈ S^c. Set c = (a_0 + b_0)/2, so that c ∈ U and, hence, c belongs to S or to S^c. Out of the intervals [a_0, c], [c, b_0] choose the one whose endpoints belong to different sets S, S^c and rename it [a_1, b_1], say a_1 ∈ S and b_1 ∈ S^c. Considering the point c = (a_1 + b_1)/2, we repeat the same argument and construct an interval [a_2, b_2], being one of the two halves of [a_1, b_1], such that a_2 ∈ S and b_2 ∈ S^c. Continuing further, we obtain a nested sequence {[a_k, b_k]} of intervals such that a_k ∈ S, b_k ∈ S^c and |b_k − a_k| → 0. By the principle of nested intervals (Intervallschachtelungsprinzip), there is a common point x ∈ [a_k, b_k] for all k. Note that x ∈ U. Since a_k → x, we must have x ∈ S, and since b_k → x, we must have x ∈ S^c, because both sets S and S^c are closed in U. This contradiction finishes the proof.
Proof of the uniqueness in Theorem 2.2. Assume that x_1(t) and x_2(t) are two solutions of the same IVP, both defined on an open interval U ⊂ R, and let us prove that they coincide on U.
We first prove that the two solutions coincide in some interval around t_0. Let ε and δ be the parameters from the Lipschitz condition at the point (t_0, x_0) as above. Choose 0 < r < δ so small that both functions x_1(t) and x_2(t) restricted to I = [t_0 − r, t_0 + r] take values in J = [x_0 − ε, x_0 + ε] (which is possible because both x_1(t) and x_2(t) are continuous functions). As in the proof of the existence, both solutions satisfy the integral identity
x(t) = x_0 + ∫_{t_0}^{t} f(s, x(s)) ds
for all t ∈ I. Hence, for the difference z(t) := |x_1(t) − x_2(t)|, we have
z(t) = |x_1(t) − x_2(t)| ≤ ∫_{t_0}^{t} |f(s, x_1(s)) − f(s, x_2(s))| ds,
assuming for definiteness that t_0 ≤ t ≤ t_0 + r. Since both points (s, x_1(s)) and (s, x_2(s)) in the given range of s are contained in I × J, we obtain by the Lipschitz condition
|f(s, x_1(s)) − f(s, x_2(s))| ≤ L |x_1(s) − x_2(s)|,
whence
z(t) ≤ L ∫_{t_0}^{t} z(s) ds.
By Lemma 2.3 with C = 0, we conclude that z(t) ≡ 0, that is, x_1(t) = x_2(t) on [t_0, t_0 + r]; the case t ≤ t_0 is treated in the same way.
Consider now the IVP with a variable initial value,
x' = f(t, x),  x(t_0) = s,    (2.9)
where s is a parameter close to x_0, and denote its solution by x(t, s). Choose ε, δ > 0 from the local Lipschitz condition at (t_0, x_0), so that the rectangle
R = [t_0 − δ, t_0 + δ] × [x_0 − ε, x_0 + ε]
is contained in Ω and, for all (t, x), (t, y) ∈ R,
|f(t, x) − f(t, y)| ≤ L |x − y|.
Let M be the supremum of |f(t, x)| in R. By the proof of Theorem 2.2, the solution x(t) with the initial condition x(t_0) = x_0 is defined in the interval [t_0 − r, t_0 + r], where r is any positive number that satisfies (2.6), and x(t) takes values in [x_0 − ε, x_0 + ε] for all t ∈ [t_0 − r, t_0 + r]. Let us choose r as follows:
r = min(δ, ε/M, 1/(2L)).    (2.10)
For an initial value s ∈ [x_0 − ε/2, x_0 + ε/2], the same argument applies with the rectangle R_0 = [t_0 − δ, t_0 + δ] × [s − ε/2, s + ε/2] ⊂ R (in comparison with (2.10), here ε is replaced by ε/2 in accordance with the definition of R_0). Clearly, if r satisfies (2.10) then the value
r(s) = r/2
satisfies the corresponding conditions for R_0. Let us state the result of this argument as follows.
Claim. Fix a point (t_0, x_0) ∈ Ω and choose ε, δ > 0 from the local Lipschitz condition at (t_0, x_0). Let L be the Lipschitz constant in R = [t_0 − δ, t_0 + δ] × [x_0 − ε, x_0 + ε], let M = sup_R |f|, and define r = r(ε, δ, L, M) by (2.10). Then, for any s ∈ [x_0 − ε/2, x_0 + ε/2], the solution x(t, s) of (2.9) is defined in [t_0 − r/2, t_0 + r/2] and takes values in [x_0 − ε, x_0 + ε].
In particular, we can compare solutions with different initial values s since they have the common domain [t_0 − r/2, t_0 + r/2] (see the diagram below).
[Diagram: solutions x(t, s) for initial values s ∈ [x_0 − ε/2, x_0 + ε/2], staying in the strip [x_0 − ε, x_0 + ε] over [t_0 − r/2, t_0 + r/2] inside the domain Ω.]
Theorem 2.5 (Continuous dependence on the initial value) Let Ω be an open set in R^2 and let f(t, x) be a continuous function in Ω that is locally Lipschitz in x. Let (t_0, x_0) be a point in Ω and let ε, r be as above. Then, for all s', s'' ∈ [x_0 − ε/2, x_0 + ε/2] and t ∈ [t_0 − r/2, t_0 + r/2],
|x(t, s') − x(t, s'')| ≤ 2 |s' − s''|.    (2.13)
Consequently, the function x(t, s) is continuous in (t, s).
Proof. Consider again the integral equations
x(t, s') = s' + ∫_{t_0}^{t} f(τ, x(τ, s')) dτ
and
x(t, s'') = s'' + ∫_{t_0}^{t} f(τ, x(τ, s'')) dτ.
It follows that, for t ∈ [t_0, t_0 + r/2],
|x(t, s') − x(t, s'')| ≤ |s' − s''| + ∫_{t_0}^{t} |f(τ, x(τ, s')) − f(τ, x(τ, s''))| dτ
  ≤ |s' − s''| + L ∫_{t_0}^{t} |x(τ, s') − x(τ, s'')| dτ,
where we have used the Lipschitz condition, because by the above Claim (τ, x(τ, s)) ∈ [t_0 − δ, t_0 + δ] × [x_0 − ε, x_0 + ε] for all s ∈ [x_0 − ε/2, x_0 + ε/2].
Setting z(t) = |x(t, s') − x(t, s'')|, we obtain
z(t) ≤ |s' − s''| + L ∫_{t_0}^{t} z(τ) dτ,
which implies by Lemma 2.3
z(t) ≤ |s' − s''| e^{L(t − t_0)} ≤ |s' − s''| e^{L r/2} ≤ 2 |s' − s''|,
since L r/2 ≤ 1/4 by (2.10), which proves (2.13) for t ≥ t_0. Similarly one obtains the same for t ≤ t_0.
Let us prove that x(t, s) is continuous in (t, s). Fix a point (t, s) and prove that x(t, s) is continuous at this point, that is,
x(t_n, s_n) → x(t, s)
whenever (t_n, s_n) → (t, s).
2.3 Higher order ODE and reduction to the first order system
A general ODE of order n resolved with respect to the highest derivative can be written in the form
y^(n) = F(t, y, ..., y^(n−1)),    (2.14)
where t is an independent variable and y(t) is an unknown function. It is sometimes more convenient to replace this equation by a system of ODEs of 1st order.
Let x(t) be a vector function of a real variable t, which takes values in R^n. Denote by x_k the components of x. Then the derivative x'(t) is defined component-wise, x' = (x_1', ..., x_n'). Consider a vector ODE of 1st order
x' = f(t, x),    (2.15)
where f is a given function of n + 1 variables, which takes values in R^n, that is, f : Ω → R^n, where Ω is an open subset of R^{n+1} (so that the couple (t, x) is considered as a point in Ω). Denoting by f_k the components of f, we can rewrite the vector equation (2.15) as a system of n scalar equations
x_1' = f_1(t, x_1, ..., x_n),
...
x_k' = f_k(t, x_1, ..., x_n),    (2.16)
...
x_n' = f_n(t, x_1, ..., x_n).
A system of ODEs of the form (2.15) is called a normal system.
Let us show how the equation (2.14) can be reduced to the normal system (2.16). Indeed, with any function y(t) let us associate the vector function
x = (y, y', ..., y^(n−1)),
that is,
x_1 = y, x_2 = y', ..., x_n = y^(n−1).
Obviously,
x' = (y', y'', ..., y^(n)),
and using (2.14) we obtain the system of equations
x_1' = x_2,
x_2' = x_3,
...    (2.17)
x_{n−1}' = x_n,
x_n' = F(t, x_1, ..., x_n).
This is a normal system of the form (2.15) with the function
f(t, x) = (x_2, x_3, ..., x_n, F(t, x_1, ..., x_n)).    (2.18)
Conversely, if x(t) solves (2.17) then y = x_1 satisfies y' = x_2, y'' = x_3, ..., y^(n−1) = x_n and y^(n) = F(t, y, ..., y^(n−1)), so that we obtain the equation (2.14) with respect to y = x_1. Hence, the equation (2.14) is equivalent to the vector equation (2.15) with the function f defined by (2.18).
Example. Consider the second order equation
y'' = F(t, y, y'),
which is equivalent to the normal system
x_1' = x_2,
x_2' = F(t, x_1, x_2).
What initial value problem is associated with the vector equation (2.15) and the scalar higher order equation (2.14)? Motivated by the study of the 1st order ODE, one can presume that it makes sense to consider the following IVP for the vector 1st order ODE:
x' = f(t, x),  x(t_0) = x_0,
where x_0 ∈ R^n is a given initial value of x(t). For the equation (2.14), this means that the initial conditions should prescribe the value of the vector x = (y, y', ..., y^(n−1)) at some t_0, which amounts to n scalar conditions
y(t_0) = y_0,
y'(t_0) = y_1,
...
y^(n−1)(t_0) = y_{n−1},
where y_0, ..., y_{n−1} are given values. Hence, the initial value problem for the scalar equation of order n can be stated as follows:
y^(n) = F(t, y, y', ..., y^(n−1)),
y(t_0) = y_0,
y'(t_0) = y_1,
...
y^(n−1)(t_0) = y_{n−1}.
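As an illustration of this reduction, a minimal Python sketch (assuming NumPy and SciPy) solves the IVP y'' = −y, y(0) = 0, y'(0) = 1 by passing the equivalent normal system x_1' = x_2, x_2' = −x_1 to a standard numerical solver; the exact solution is y = sin t:

import numpy as np
from scipy.integrate import solve_ivp

# normal system for y'' = -y:  x1' = x2,  x2' = -x1
def f(t, x):
    return [x[1], -x[0]]

sol = solve_ivp(f, (0.0, 10.0), [0.0, 1.0], dense_output=True, rtol=1e-9, atol=1e-12)
ts = np.linspace(0.0, 10.0, 5)
print(np.max(np.abs(sol.sol(ts)[0] - np.sin(ts))))   # small: the first component approximates sin t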
2.4 Norms in R^n
Recall that a norm in R^n is a function N : R^n → R with the following properties:
1. N(x) ≥ 0 for all x ∈ R^n, and N(x) = 0 if and only if x = 0;
2. N(λx) = |λ| N(x) for all x ∈ R^n and λ ∈ R;
3. N(x + y) ≤ N(x) + N(y) for all x, y ∈ R^n (the triangle inequality).
For example, the function |x| is a norm in R. Usually one uses the notation ‖x‖ for a norm instead of N(x).
Example. For any p ≥ 1, the p-norm in R^n is defined by
‖x‖_p = ( Σ_{k=1}^{n} |x_k|^p )^{1/p}.
In particular, for p = 1,
‖x‖_1 = Σ_{k=1}^{n} |x_k|,
and for p = 2,
‖x‖_2 = ( Σ_{k=1}^{n} x_k² )^{1/2}.
For p = ∞ one sets
‖x‖_∞ = max_{1 ≤ k ≤ n} |x_k|.
It is not difficult to verify that these are indeed norms. Moreover, it is known that all possible norms in R^n are equivalent in the following sense: if N_1(x) and N_2(x) are two norms in R^n then there are positive constants C' and C'' such that
C'' ≤ N_1(x)/N_2(x) ≤ C' for all x ≠ 0.    (2.19)
For example, it follows from the definitions of ‖x‖_1 and ‖x‖_∞ that
1 ≤ ‖x‖_1/‖x‖_∞ ≤ n.
For most applications, the relation (2.19) means that the choice of a specific norm is not important.
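A quick numerical illustration of these inequalities for a sample vector, using NumPy's norm function (a Python sketch):

import numpy as np

x = np.array([3.0, -4.0, 1.0])
n1 = np.linalg.norm(x, 1)          # 1-norm: 8.0
n2 = np.linalg.norm(x, 2)          # 2-norm: about 5.10
ninf = np.linalg.norm(x, np.inf)   # sup-norm: 4.0
print(n1, n2, ninf)
print(ninf <= n2 <= n1 <= len(x) * ninf)   # True: the norms are pairwise comparable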
The notion of a norm is used in order to define the Lipschitz condition for functions in R^n. Let us fix some norm ‖x‖ in R^n. For any x ∈ R^n and r > 0, define the closed ball
B(x, r) = {y ∈ R^n : ‖x − y‖ ≤ r}.
For example, in R with ‖x‖ = |x| we have B(x, r) = [x − r, x + r]. Similarly, one defines the open ball
B(x, r) = {y ∈ R^n : ‖x − y‖ < r}.
Below are sketches of the ball B(0, 1) in R^2 for different norms:
[Sketches of the unit ball in the 1-norm and in the 4-norm, in the (x_1, x_2)-plane.]
In view of the equivalence of any two norms in R^n, the property of being Lipschitz does not depend on the choice of the norm (but the value of the Lipschitz constant L does).
A subset K of R^{n+1} will be called a cylinder if it has the form K = I × B, where I is an interval in R and B is a ball (open or closed) in R^n. The cylinder is closed if both I and B are closed, and open if both I and B are open.
Definition. A function f(t, x) is called locally Lipschitz in x in Ω if for any (t_0, x_0) ∈ Ω there exist constants ε, δ > 0 such that the cylinder
K = [t_0 − δ, t_0 + δ] × B(x_0, ε)
is contained in Ω and f is Lipschitz in x in K, that is, there is a constant L such that
‖f(t, x) − f(t, y)‖ ≤ L ‖x − y‖ for all t ∈ [t_0 − δ, t_0 + δ] and x, y ∈ B(x_0, ε).
Lemma 2.6 (a) If all components f_k of f are differentiable functions in a cylinder K and all the partial derivatives ∂f_k/∂x_j are bounded in K, then the function f(t, x) is Lipschitz in x in K.
(b) If all partial derivatives ∂f_k/∂x_j exist and are continuous in Ω, then f(t, x) is locally Lipschitz in x in Ω.
Proof. (a) Fix t and an index k, and set g(x) = f_k(t, x) for x in the ball B. For any x, y ∈ B we claim that
g(y) − g(x) = Σ_{j=1}^{n} ∂g/∂x_j (ξ) (y_j − x_j)
for some point ξ on the interval [x, y] (note that the interval [x, y] is contained in the ball B, so that ∂g/∂x_j(ξ) makes sense). Indeed, consider the function
h(t) = g(x + t(y − x)), t ∈ [0, 1].
The function h(t) is differentiable on [0, 1] and, by the mean value theorem in R, there is τ ∈ (0, 1) such that
g(y) − g(x) = h(1) − h(0) = h'(τ).
Noticing that by the chain rule
h'(τ) = Σ_{j=1}^{n} ∂g/∂x_j (x + τ(y − x)) (y_j − x_j),
we obtain the claim with ξ = x + τ(y − x). Setting C = sup_K max_{k,j} |∂f_k/∂x_j|, it follows that
|f_k(t, y) − f_k(t, x)| ≤ C Σ_{j=1}^{n} |y_j − x_j| = C ‖y − x‖_1,
whence
‖f(t, y) − f(t, x)‖_∞ ≤ C ‖y − x‖_1.
Switching on both sides to the given norm ‖·‖ and using the equivalence of all norms, we obtain that f is Lipschitz in x in K.
(b) Given a point (t_0, x_0) ∈ Ω, choose positive ε and δ so that the closed cylinder
K = [t_0 − δ, t_0 + δ] × B(x_0, ε)
is contained in Ω. Since K is a bounded closed set, the continuous partial derivatives ∂f_k/∂x_j are bounded on K. By part (a), f is Lipschitz in x in K, which means that f is locally Lipschitz in x in Ω.
Theorem 2.7 (Picard-Lindelöf theorem for systems) Let Ω be an open set in R^{n+1} and let f : Ω → R^n be a continuous function that is locally Lipschitz in x. Then, for any point (t_0, x_0) ∈ Ω, the initial value problem
x' = f(t, x),  x(t_0) = x_0
has a solution; moreover, any two solutions of the same IVP coincide in their common domain.
Proof. The proof is very similar to the case n = 1 considered in Theorem 2.2. We start with the following claim.
Claim. A function x(t) solves the IVP if and only if x(t) is a continuous function on an open interval I such that t_0 ∈ I, (t, x(t)) ∈ Ω for all t ∈ I, and
x(t) = x_0 + ∫_{t_0}^{t} f(s, x(s)) ds.    (2.24)
Here the integral of the vector valued function f(s, x(s)) is understood component-wise. Indeed, if x solves the IVP then integrating the k-th component of the ODE gives
x_k(t) − (x_0)_k = ∫_{t_0}^{t} f_k(s, x(s)) ds,
and (2.24) follows. Conversely, if x is a continuous function that satisfies (2.24) then
x_k(t) = (x_0)_k + ∫_{t_0}^{t} f_k(s, x(s)) ds.
The right hand side here is differentiable in t, whence it follows that x_k(t) is differentiable. It is trivial that x_k(t_0) = (x_0)_k, and after differentiation we obtain x_k' = f_k(t, x) and, hence, x' = f(t, x).
Fix a point (t_0, x_0) ∈ Ω and let ε, δ be the parameters from the local Lipschitz condition at this point, that is, there is a constant L such that
‖f(t, x) − f(t, y)‖ ≤ L ‖x − y‖
for all t ∈ [t_0 − δ, t_0 + δ] and x, y ∈ B(x_0, ε). Choose some r ∈ (0, δ], to be specified later on, and set
I = [t_0 − r, t_0 + r] and J = B(x_0, ε).
Denote by X the family of all continuous functions x(t) : I → J, that is,
X = {x : I → J : x is continuous}.
Consider again the mapping A defined by
Ax(t) = x_0 + ∫_{t_0}^{t} f(s, x(s)) ds.
We would like to ensure that x ∈ X implies Ax ∈ X. Note that, for any x ∈ X, the point (s, x(s)) belongs to Ω, so that the above integral makes sense and the function Ax is defined on I. This function is obviously continuous. We are left to verify that the image of Ax is contained in J. Indeed, the latter condition means that
‖Ax(t) − x_0‖ ≤ ε for all t ∈ I,
which follows from
‖Ax(t) − x_0‖ = ‖ ∫_{t_0}^{t} f(s, x(s)) ds ‖ ≤ M |t − t_0| ≤ M r ≤ ε,
provided r ≤ ε/M, where
M = sup { ‖f(s, x)‖ : s ∈ [t_0 − δ, t_0 + δ], x ∈ B(x_0, ε) } < ∞.
Define the distance function on X by d(x, y) = sup_{t ∈ I} ‖x(t) − y(t)‖.
Then (X, d) is a complete metric space (see Exercise 16).
We are left to ensure that the mapping A : X → X is a contraction. For any two functions x, y ∈ X and any t ∈ I, t ≥ t_0, we have x(t), y(t) ∈ J, whence by the Lipschitz condition
‖Ax(t) − Ay(t)‖ = ‖ ∫_{t_0}^{t} f(s, x(s)) ds − ∫_{t_0}^{t} f(s, y(s)) ds ‖
  ≤ ∫_{t_0}^{t} ‖f(s, x(s)) − f(s, y(s))‖ ds
  ≤ L ∫_{t_0}^{t} ‖x(s) − y(s)‖ ds
  ≤ L (t − t_0) sup_{s ∈ I} ‖x(s) − y(s)‖
  ≤ L r d(x, y).
Hence, choosing r < 1/L, we obtain that A is a contraction. By the Banach fixed point theorem, we conclude that the equation Ax = x has a solution x ∈ X, which hence solves the IVP.
Assume that x(t) and y(t) are two solutions of the same IVP, both defined on an open interval U ⊂ R, and let us prove that they coincide on U. We first prove that the two solutions coincide in some interval around t_0. Let ε and δ be the parameters from the Lipschitz condition at the point (t_0, x_0) as above. Choose 0 < r < δ so small that both functions x(t) and y(t) restricted to I = [t_0 − r, t_0 + r] take values in J = B(x_0, ε) (which is possible because both x(t) and y(t) are continuous functions). As in the proof of the existence, both solutions satisfy the integral identity
x(t) = x_0 + ∫_{t_0}^{t} f(s, x(s)) ds
for all t ∈ I. Hence, for the difference z(t) := ‖x(t) − y(t)‖, we have
z(t) = ‖x(t) − y(t)‖ ≤ ∫_{t_0}^{t} ‖f(s, x(s)) − f(s, y(s))‖ ds,
assuming for definiteness that t_0 ≤ t ≤ t_0 + r. Since both points (s, x(s)) and (s, y(s)) in the given range of s are contained in I × J, we obtain by the Lipschitz condition
‖f(s, x(s)) − f(s, y(s))‖ ≤ L ‖x(s) − y(s)‖,
whence
z(t) ≤ L ∫_{t_0}^{t} z(s) ds.
By Lemma 2.3 with C = 0, we conclude that z(t) ≡ 0, so that x(t) = y(t) on [t_0, t_0 + r]; the case t ≤ t_0 is treated similarly. Hence, x(t) = y(t) on I.
Now we prove that they coincide on the full interval U. Consider the set
S = {t ∈ U : x(t) = y(t)}
and let us show that the set S is both closed and open in U. The closedness is obvious: if x(t_k) = y(t_k) for a sequence {t_k} and t_k → t ∈ U as k → ∞, then passing to the limit and using the continuity of the solutions, we obtain x(t) = y(t), that is, t ∈ S.
Let us prove that the set S is open. Fix some t_1 ∈ S. Since x(t_1) = y(t_1) =: x_1, both functions x(t) and y(t) solve the same IVP with the initial data (t_1, x_1). By the above argument, x(t) = y(t) in some interval I = [t_1 − r, t_1 + r] with r > 0. Hence, I ⊂ S, which implies that S is open.
Since the set S is non-empty (it contains t_0) and is both open and closed in U, we conclude by Lemma 2.4 that S = U, which finishes the proof of uniqueness.
Remark. Let us summarize the proof of the existence part of Theorem 2.7 as follows. For any point (t_0, x_0) ∈ Ω, we first choose positive constants ε, δ, L from the Lipschitz condition, that is, the cylinder
G = [t_0 − δ, t_0 + δ] × B(x_0, ε)
is contained in Ω and, for any two points (t, x) and (t, y) from G with the same t,
‖f(t, x) − f(t, y)‖ ≤ L ‖x − y‖.
Let
M = sup_G ‖f(t, x)‖.
Then the IVP has a solution on the interval [t_0 − r, t_0 + r] for any r satisfying r ≤ δ, r ≤ ε/M, r < 1/L, and this solution takes values in B(x_0, ε).
Corollary. For any point (t_0, x_0) ∈ Ω there exist constants r, ε > 0 such that, for any point (t_1, x_1) with t_1 ∈ [t_0 − r/2, t_0 + r/2] and x_1 ∈ B(x_0, ε/2), the IVP
x' = f(t, x),  x(t_1) = x_1    (2.27)
has a solution that is defined for all t ∈ [t_0 − r/2, t_0 + r/2] and takes values in B(x_0, ε).
Proof. With ε, δ, L, M as above, the cylinder
G_1 = [t_1 − δ/2, t_1 + δ/2] × B(x_1, ε/2)
is contained in G. Hence, the values of L and M for the cylinder G_1 can be taken the same as those for G. Therefore, the IVP (2.27) has a solution x(t) in the interval [t_1 − r, t_1 + r], and x(t) takes values in B(x_1, ε/2) ⊂ B(x_0, ε), provided
r ≤ δ/2,  r ≤ ε/(2M),  r < 1/L.
For example, take
r = min(δ/2, ε/(2M), 1/(2L)).
If t_1 ∈ [t_0 − r/2, t_0 + r/2] then [t_0 − r/2, t_0 + r/2] ⊂ [t_1 − r, t_1 + r], so that the solution x(t) of (2.27) is defined on [t_0 − r/2, t_0 + r/2] and takes values in B(x_0, ε), which was to be proved.
Definition. A solution x(t) of the ODE x' = f(t, x) defined on an open interval is called maximal if it cannot be extended to a solution defined on a larger open interval.
Theorem 2.8 Assume that the conditions of Theorem 2.7 are satisfied. Then the following is true.
(a) Any IVP has a unique maximal solution.
(b) If x(t) and y(t) are two maximal solutions to the same ODE and x(t) = y(t) for some value of t, then x and y are identically equal, including the identity of their domains.
(c) If x(t) is a maximal solution with the domain (a, b) then x(t) leaves any compact set K ⊂ Ω as t → a and as t → b.
Here the phrase "x(t) leaves any compact set K as t → b" means the following: there is T ∈ (a, b) such that for any t ∈ (T, b), the point (t, x(t)) does not belong to K. Similarly, the phrase "x(t) leaves any compact set K as t → a" means that there is T ∈ (a, b) such that for any t ∈ (a, T), the point (t, x(t)) does not belong to K.
Example. 1. Consider the ODE x' = x² in the domain Ω = R^2. This is a separable equation and can be solved as follows. Obviously, x ≡ 0 is a constant solution. In the domains where x ≠ 0 we have
∫ x'/x² dt = ∫ dt,
whence
−1/x = ∫ dx/x² = ∫ dt = t + C
and x(t) = −1/(t − C) (where we have replaced C by −C). Hence, the family of all solutions consists of the straight line x(t) = 0 and the hyperbolas x(t) = 1/(C − t) with the maximal domains (C, +∞) and (−∞, C) (see the diagram below).
[Plot: the hyperbolas x(t) = 1/(C − t) and the solution x ≡ 0, shown for t ∈ [−5, 5].]
Each of these solutions leaves any compact set K, but in different ways: the solution x(t) = 0 leaves K as t → ±∞ because K is bounded, while x(t) = 1/(C − t) leaves K as t → C because x(t) → ±∞.
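For instance, the maximal solution with the initial condition x(0) = 1 is obtained by choosing C = 1:
\[
x(0) = \frac{1}{C - 0} = 1 \;\Longrightarrow\; C = 1, \qquad x(t) = \frac{1}{1 - t}, \qquad t \in (-\infty, 1),
\]
and x(t) → +∞ as t → 1−, so this solution cannot be extended past t = 1 even though f(t, x) = x² is defined on all of R².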
2. Consider the ODE x' = 1/x in the domain Ω = R × (0, +∞) (that is, t ∈ R and x > 0). By separation of variables, we obtain
x²/2 = ∫ x dx = ∫ x x' dt = ∫ dt = t + C,
whence
x(t) = √(2(t − C)),  t > C.
See the diagram below:
[Plot: the curves x(t) = √(2(t − C)) for several values of C, shown for t ∈ [0, 5].]
Obviously, the maximal domain of the solution is (C, +∞). The solution leaves any compact K ⊂ Ω as t → C because (t, x(t)) tends to the point (C, 0) at the boundary of Ω.
The proof of Theorem 2.8 will be preceded by a lemma.
Lemma 2.9 Let {x_α(t)}_{α ∈ A} be a family of solutions to the same IVP, where A is any index set, and let the domain of x_α be an open interval I_α. Set I = ∪_{α ∈ A} I_α and define a function x(t) on I as follows:
x(t) = x_α(t) whenever t ∈ I_α.    (2.28)
Then I is an open interval and x(t) is a solution of the same IVP on I.
The function x(t) defined by (2.28) is referred to as the union of the family {x_α(t)}.
Proof. First of all, let us verify that the identity (2.28) defines x(t) correctly, that is, the right hand side does not depend on the choice of α. Indeed, if also t ∈ I_β then t belongs to the intersection I_α ∩ I_β and, by the uniqueness theorem, x_α(t) = x_β(t). Hence, the value of x(t) is independent of the choice of the index α. Note that the graph of x(t) is the union of the graphs of all functions x_α(t).
Set a = inf I, b = sup I and let us show that I = (a, b). Let us first verify that (a, b) ⊂ I, that is, any t ∈ (a, b) belongs also to I. Assume for definiteness that t ≥ t_0. Since b = sup I, there is t_1 ∈ I such that t < t_1 < b. There exists an index α such that t_1 ∈ I_α. Since also t_0 ∈ I_α, the entire interval [t_0, t_1] is contained in I_α. Since t ∈ [t_0, t_1], we conclude that t ∈ I_α and, hence, t ∈ I.
It follows that I is an interval with the endpoints a and b. Since I is the union of open intervals, I is an open subset of R, whence it follows that I is an open interval, that is, I = (a, b).
Finally, let us verify that x(t) solves the given IVP. We have x(t_0) = x_0 because t_0 ∈ I_α for any α and
x(t_0) = x_α(t_0) = x_0,
so that x(t) satisfies the initial condition. Why does x(t) satisfy the ODE at any t ∈ I? Any given t ∈ I belongs to some I_α. Since x_α solves the ODE in I_α and x ≡ x_α on I_α, we conclude that x satisfies the ODE at t, which finishes the proof.
Proof of Theorem 2.8. (a) Consider the IVP
x' = f(t, x),  x(t_0) = x_0,    (2.29)
and let S be the set of all possible solutions to this IVP defined on open intervals. Let x(t) be the union of all solutions from S. By Lemma 2.9, the function x(t) is also a solution to the IVP and, hence, x(t) ∈ S. Moreover, x(t) is a maximal solution because the domain of x(t) contains the domains of all other solutions from S and, hence, x(t) cannot be extended to a larger open interval. This proves the existence of a maximal solution.
Let y(t) be another maximal solution to the IVP and let z(t) be the union of the solutions x(t) and y(t). By Lemma 2.9, z(t) solves the IVP and extends both x(t) and y(t), which implies by the maximality of x and y that z is identical to both x and y. Hence, x and y are identical (including the identity of the domains), which proves the uniqueness of a maximal solution.
(b) Let x(t) and y(t) be two maximal solutions that coincide at some t, say t = t_1. Set x_1 = x(t_1) = y(t_1). Then both x and y are solutions to the same IVP with the initial point (t_1, x_1) and, hence, they coincide by part (a).
(c) Let x(t) be a maximal solution defined on (a, b), where a < b, and assume that x(t) does not leave a compact K ⊂ Ω as t → a. Then there is a sequence t_k → a such that (t_k, x_k) ∈ K where x_k = x(t_k). By a property of compact sets, any sequence in K has a convergent subsequence whose limit is in K. Hence, passing to a subsequence, we can assume that the sequence {(t_k, x_k)} converges to a point (t_0, x_0) ∈ K as k → ∞. Clearly, we have t_0 = a, which in particular implies that a is finite.
By the Corollary to Theorem 2.7, for the point (t_0, x_0), there exist r, ε > 0 such that the IVP with any initial point inside the cylinder
G = [t_0 − r/2, t_0 + r/2] × B(x_0, ε/2)
has a solution defined for all t ∈ [t_0 − r/2, t_0 + r/2]. In particular, if k is large enough then (t_k, x_k) ∈ G, which implies that the solution y(t) of the IVP
y' = f(t, y),  y(t_k) = x_k
is defined for all t ∈ [t_0 − r/2, t_0 + r/2] (see the diagram below).
[Diagram: the maximal solution x(t), the points (t_k, x_k) approaching (t_0, x_0), and the cylinder [t_0 − r/2, t_0 + r/2] × B(x_0, ε/2).]
Since x(t) also solves this IVP, the union z(t) of x(t) and y(t) solves the same IVP. Note that x(t) is defined only for t > t_0, while z(t) is defined also for t ∈ [t_0 − r/2, t_0]. Hence, the solution x(t) can be extended to a larger interval, which contradicts the maximality of x(t).
Remark. By definition, a maximal solution x(t) is defined on an open interval, say (a, b), and it cannot be extended to a larger open interval. One may wonder whether x(t) can be extended at least to the endpoints t = a or t = b. It turns out that this is never the case (unless the domain Ω of the function f(t, x) can be enlarged). Indeed, if x(t) can be defined as a solution of the ODE also for t = a, then (a, x(a)) ∈ Ω and, hence, there is a ball B in R^{n+1} centered at the point (a, x(a)) such that B ⊂ Ω. By shrinking the radius of B, we can assume that the corresponding closed ball is also contained in Ω. Since x(t) → x(a) as t → a, we obtain that (t, x(t)) ∈ B for all t close enough to a. Therefore, the solution x(t) does not leave the compact closed ball, a compact subset of Ω, as t → a, which contradicts part (c) of Theorem 2.8.
Hence, if we know that the solution y(t) of (2.31) depends continuously on the right hand side, then it will follow that y(t) is continuous in x_0, which implies that also the solution x(t) of (2.30) is continuous in x_0.
Let Ω be an open set in R^{n+1} and let f, g be two functions from Ω to R^n. Assume in what follows that both f, g are continuous and locally Lipschitz in x in Ω, and consider the two initial value problems
x' = f(t, x),  x(t_0) = x_0,    (2.32)
and
y' = g(t, y),  y(t_0) = x_0,    (2.33)
where (t_0, x_0) is a fixed point in Ω.
Assume that the function f is fixed and x(t) is a fixed solution of (2.32). The function g will be treated as variable. Our purpose is to show that if g is chosen close enough to f then the solution y(t) of (2.33) is close enough to x(t). Apart from the theoretical interest, this question has significant practical consequences. For example, if one knows the function f(t, x) only approximately (which is always the case in applications in Sciences and Engineering), then solving (2.32) approximately means solving another problem (2.33), where g is an approximation of f. Hence, it is important to know that the solution y(t) of (2.33) is actually an approximation of x(t).
Theorem 2.10 Let x(t) be a solution of the IVP (2.32) defined on an interval (a, b). Then, for all real α, β such that a < α < t_0 < β < b and for any ε > 0, there is η > 0 such that, for any function g : Ω → R^n with
sup_Ω ‖f − g‖ ≤ η,    (2.34)
there is a solution y(t) of the IVP (2.33) defined on [α, β], and this solution satisfies the inequality
sup_{[α, β]} ‖x(t) − y(t)‖ ≤ ε.
Proof. For any ε ≥ 0, consider the set
K_ε = {(t, x) ∈ R^{n+1} : t ∈ [α, β], ‖x − x(t)‖ ≤ ε},
which can be regarded as the closed ε-neighborhood in R^{n+1} of the graph of the function t ↦ x(t), t ∈ [α, β]. In particular, K_0 is the graph of the function x(t) on [α, β] (see the diagram below).
[Diagram: the graph K_0 of x(t) on [α, β] and its ε-neighborhood K_ε.]
The set K_0 is compact because it is the image of the compact interval [α, β] under the continuous mapping t ↦ (t, x(t)). Hence, K_0 is bounded and closed, which implies that K_ε for any ε > 0 is also bounded and closed. Thus, K_ε is a compact subset of R^{n+1} for any ε ≥ 0.
Claim 1. There is ε > 0 such that K_ε ⊂ Ω and f is Lipschitz in x in K_ε.
Indeed, by the local Lipschitz condition, for any point (t_*, x_*) ∈ Ω (in particular, for any (t_*, x_*) ∈ K_0), there are constants ε, δ > 0 such that the cylinder
G = [t_* − δ, t_* + δ] × B(x_*, ε)
is contained in Ω and f is Lipschitz in x in G.
[Diagram: a point (t_*, x_*) on K_0 and the cylinder G = [t_* − δ, t_* + δ] × B(x_*, ε) around it.]
Varying the point (t_*, x_*) in K_0, we obtain a cover of K_0 by the family of open cylinders H = (t_* − δ, t_* + δ) × B(x_*, ε/2), where ε, δ depend on (t_*, x_*). Since K_0 is compact, there is a finite subcover, that is, a finite number of points {(t_i, x_i)}_{i=1}^{m} on K_0 and corresponding numbers ε_i, δ_i > 0, such that the cylinders
H_i = (t_i − δ_i, t_i + δ_i) × B(x_i, ε_i/2), i = 1, ..., m,
cover K_0. Denote by L_i the Lipschitz constant of f in the cylinder G_i = [t_i − δ_i, t_i + δ_i] × B(x_i, ε_i), set
ε = (1/2) min_i ε_i,  L = max_i L_i,    (2.36)
and let us prove that K_ε ⊂ Ω and that the function f is Lipschitz in K_ε with the constant L. For any point (t, x) ∈ K_ε, we have by the definition of K_ε that t ∈ [α, β], (t, x(t)) ∈ K_0 and ‖x − x(t)‖ ≤ ε.
[Diagram: a point (t, x) ∈ K_ε, the corresponding point (t, x(t)) ∈ K_0 lying in some H_i, and the cylinders H_i ⊂ G_i.]
Since the cylinders H_i cover K_0, the point (t, x(t)) belongs to some H_i, so that |t − t_i| < δ_i and ‖x(t) − x_i‖ ≤ ε_i/2. Then
‖x − x_i‖ ≤ ‖x − x(t)‖ + ‖x(t) − x_i‖ ≤ ε + ε_i/2 ≤ ε_i,
where we have used that by (2.36) ε ≤ ε_i/2. Therefore, x ∈ B(x_i, ε_i), whence it follows that (t, x) ∈ G_i and, hence, (t, x) ∈ Ω. Hence, we have shown that any point from K_ε belongs to Ω, which proves that K_ε ⊂ Ω.
If (t, x), (t, y) ∈ K_ε then, by the above argument, both points x, y belong to the same ball B(x_i, ε_i) that is determined by the condition (t, x(t)) ∈ H_i. Then (t, x), (t, y) ∈ G_i and, since f is Lipschitz in G_i with the constant L_i, we obtain
‖f(t, x) − f(t, y)‖ ≤ L_i ‖x − y‖ ≤ L ‖x − y‖,
where we have used the definition (2.36) of L. This shows that f is Lipschitz in x in K_ε and finishes the proof of Claim 1.
Observe that if the statement of Claim 1 holds for some value of ε then it holds for all smaller values of ε as well, with the same L. Hence, we can assume that the value of ε from the statement of Theorem 2.10 is small enough so that it satisfies Claim 1.
Let now y(t) be the maximal solution to the IVP (2.33), and let (a', b') be its domain. By Theorem 2.8, the graph of y(t) leaves K_ε when t → a' and when t → b'. Let (α', β') be the maximal interval such that the graph of y(t) on this interval is contained in K_ε; that is,
α' = inf { t ∈ (α, β) ∩ (a', b') : (s, y(s)) ∈ K_ε for all s ∈ [t, t_0] },    (2.37)
and β' is defined similarly, with inf replaced by sup (see the diagrams below for the cases α' > α and α' = α, respectively).
[Diagrams: the solutions x(t) and y(t) and the neighborhood K_ε, in the cases α' > α and α' = α.]
In particular, (α', β') is contained in (a', b') ∩ (α, β), the function y(t) is defined on (α', β'), and
(t, y(t)) ∈ K_ε for all t ∈ (α', β').    (2.38)
Claim 2. We have [α0 , β 0 ] ½ (a0 , b0 ); in particular, y (t) is defined on the closed interval
[α0 , β 0 ]. Moreover, the following is true: either α0 = α or
Similarly, either β 0 = β or
By Theorem 2.8, y (t) leaves Kε as t ! a0 . Hence, for all values of t close enough to
a0 we have (t, y (t)) 2
/ Kε . For any such t we have by (2.37) t · α0 whence a0 < t · α and
a0 < α0 . Similarly, one shows that b0 > β 0 , whence the inclusion [α0 , β 0 ] ½ [a0 , b0 ] follows.
To prove the second part, assume that α′ ≠ α, that is, α′ > α, and prove that
The condition α′ > α together with α′ > a′ implies that α′ belongs to the open interval
(α, β) ∩ (a′, b′). It follows that, for τ > 0 small enough,
then, by the continuity of x (t) and y (t), the same inequality holds for all t ∈
(α′ − τ, α′ + τ) provided τ > 0 is small enough. Choosing τ to satisfy also (2.40), we
obtain that (t, y (t)) ∈ K_ε for all t ∈ (α′ − τ, α′], which contradicts the definition of α′.
Claim 3. For any given α, β, ε as above, there exists η > 0 such that if
sup_{K_ε} ‖f − g‖ ≤ η,   (2.41)
and
y (t) = x_0 + ∫_{t_0}^t g (s, y (s)) ds.
Assuming for simplicity that t ≥ t_0 and using the triangle inequality, we obtain
‖x (t) − y (t)‖ ≤ ∫_{t_0}^t ‖f (s, x (s)) − g (s, y (s))‖ ds
 ≤ ∫_{t_0}^t ‖f (s, x (s)) − f (s, y (s))‖ ds + ∫_{t_0}^t ‖f (s, y (s)) − g (s, y (s))‖ ds.
Since the points (s, x (s)) and (s, y (s)) are in K_ε, we obtain by the Lipschitz condition in
K_ε (Claim 1) that
‖x (t) − y (t)‖ ≤ L ∫_{t_0}^t ‖x (s) − y (s)‖ ds + sup_{K_ε} ‖f − g‖ (β − α).   (2.42)
Hence, by the Gronwall lemma applied to the function z (t) = ‖x (t) − y (t)‖,
In the same way, (2.43) holds for t ≤ t_0 so that it is true for all t ∈ [α′, β′].
Now choose η in (2.41) as follows:
η = ε e^{−L(β−α)} / (2 (β − α)).
Proof. By Claim 2 of the above proof, the maximal solution y (t) of (2.33) is defined
on [α′, β′]. Also, the difference ‖x (t) − y (t)‖ satisfies (2.43) for all t ∈ [α′, β′]. If
sup_{K_ε} ‖f − g‖ is small enough then by Claim 3 [α′, β′] = [α, β]. It follows that y (t) is
defined on [α, β] and satisfies (2.45).
where f : Ω → R^n and Ω is an open subset of R^{n+m+1}. Here the triple (t, x, s) is identified
as a point in R^{n+m+1} as follows:
How do we understand (2.46)? For any s ∈ R^m, consider the open set
Ω_s = {(t, x) ∈ R^{n+1} : (t, x, s) ∈ Ω}.
Denote by S the set of those s for which Ω_s contains (t_0, x_0), that is,
S = {s ∈ R^m : (t_0, x_0) ∈ Ω_s} = {s ∈ R^m : (t_0, x_0, s) ∈ Ω}.
[Diagram: the set S of parameters s ∈ R^m for which (t_0, x_0) belongs to the section Ω_s ⊂ R^{n+1}.]
Then the IVP (2.46) can be considered in the domain Ω_s for any s ∈ S. We always
assume that the set S is non-empty. Assume also in the sequel that f (t, x, s) is a continuous
function in (t, x, s) ∈ Ω and is locally Lipschitz in x for any s ∈ S. For any s ∈ S,
denote by x (t, s) the maximal solution of (2.46) and let I_s be its domain (that is, I_s is an
open interval on the axis t). Hence, x (t, s) as a function of (t, s) is defined in the set
U = {(t, s) ∈ R^{m+1} : s ∈ S, t ∈ I_s}.
Theorem 2.11 Under the above assumptions, the set U is an open subset of R^{m+1} and
the function x (t, s) : U → R^n is continuous in (t, s).
Proof. Fix some s_0 ∈ S and consider the solution x (t) = x (t, s_0) defined for t ∈ I_{s_0}.
Choose some interval [α, β] ⊂ I_{s_0} such that t_0 ∈ [α, β]. We will prove that there is ε > 0
such that
[α, β] × B̄ (s_0, ε) ⊂ U,   (2.47)
which will imply that U is open. Here B̄ (s_0, ε) is a closed ball in R^m with respect to
the ∞-norm (we can assume that all the norms in the various spaces R^k are the ∞-norms).
[Diagrams: the closed ball B̄ (s_0, ε) inside S, and the set K̃_ε = K_ε × B̄ (s_0, ε) around the graph of x (t, s_0), together with the graph of x (t, s) for s ∈ B̄ (s_0, ε).]
If ε is small enough then K̃_ε is contained in Ω (cf. the proof of Theorem 2.10 and
Exercise 26). Hence, for any s ∈ B̄ (s_0, ε), the function f (t, x, s) is defined for all (t, x) ∈
K_ε. Since the function f is continuous on Ω, it is uniformly continuous on the compact
set K̃_ε, whence it follows that
Using the Corollary to Theorem 2.10 with⁵ f (t, x) = f (t, x, s_0) and g (t, x) = f (t, x, s), where
s ∈ B̄ (s_0, ε), we obtain that if
is small enough then the solution y (t) = x (t, s) is defined on [α, β]. In particular,
this implies (2.47) for small enough ε. Furthermore, by the Corollary to Theorem 2.10 we
also obtain that
where the constant C depends only on α, β, ε and the Lipschitz constant L of the
function f (t, x, s_0) in K_ε. Letting s → s_0, we obtain that
for all t ∈ I and x ∈ R^n, where a (t) and b (t) are some continuous non-negative functions
of t. Then, for all t_0 ∈ I and x_0 ∈ R^n, the initial value problem
{ x′ = f (t, x),  x (t_0) = x_0 }   (2.49)
In other words, under the specified conditions, the maximal solution of (2.49) is defined
on I.
Proof. Let x (t) be the maximal solution to the problem (2.49), and let J = (α, β)
be the open interval where x (t) is defined. We will show that J = I. Assume on the
contrary that this is not the case. Then one of the points α, β is contained in I, say β ∈ I.
⁵ Since the common domain of the functions f (t, x, s) and f (t, x, s_0) is (t, x) ∈ Ω_{s_0} ∩ Ω_s, Theorem 2.10
should be applied with this domain.
Let us investigate the behavior of ‖x (t)‖ as t → β. By Theorem 2.8, (t, x (t)) leaves any
compact K ⊂ Ω := I × R^n. Consider a compact set
K = [β − ε, β] × B̄ (0, r),
where ε > 0 is so small that [β − ε, β] ⊂ I. Clearly, K ⊂ Ω. If t is close enough to β then
t ∈ [β − ε, β]. Since (t, x (t)) must be outside K, we conclude that x (t) ∉ B̄ (0, r), that is,
‖x (t)‖ > r. Since r is arbitrary, we have proved that ‖x (t)‖ → ∞ as t → β.
On the other hand, let us show that the solution x (t) must remain bounded as t → β.
From the integral equation
x (t) = x_0 + ∫_{t_0}^t f (s, x (s)) ds
and the estimate (2.48) we obtain, for all t ∈ [t_0, β),
‖x (t)‖ ≤ ‖x_0‖ + ∫_{t_0}^t (a (s) ‖x (s)‖ + b (s)) ds ≤ C + A ∫_{t_0}^t ‖x (s)‖ ds,
where
A = sup_{[t_0,β]} a (s) and C = ‖x_0‖ + ∫_{t_0}^β b (s) ds.
Since [t_0, β] ⊂ I and the functions a (s) and b (s) are continuous in [t_0, β], the values of A and
C are finite. The Gronwall lemma yields
‖x (t)‖ ≤ C exp (A (t − t_0)) ≤ C exp (A (β − t_0)).
Since the right hand side here does not depend on t, we conclude that the function ‖x (t)‖
remains bounded as t → β, which finishes the proof.
Example. We have considered above the ODE x′ = x², defined in R × R, and have
seen that the solution x (t) = 1/(C − t) cannot be defined on all of R. The same occurs for the
equation x′ = x^α for any α > 1. The reason is that the function f (t, x) = x^α does not admit
the estimate (2.48) for large x, due to α > 1. This example also shows that the condition
(2.48) is rather sharp.
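As a quick numerical illustration (my own sketch, not part of the notes; scipy is an assumed dependency), one can watch an integrator fail at the blow-up time of x′ = x² with, say, x (0) = 1, whose maximal solution 1/(1 − t) exists only for t < 1:

```python
# Sketch: finite-time blow-up of x' = x^2, x(0) = 1.
# The maximal solution 1/(1 - t) is defined only for t < 1,
# so the integrator cannot get past t = 1.
from scipy.integrate import solve_ivp

sol = solve_ivp(lambda t, x: x**2, (0.0, 2.0), [1.0], rtol=1e-10, atol=1e-12)
print(sol.status)                 # -1: the step size collapses near the blow-up
print(sol.t[-1], sol.y[0, -1])    # last time is close to 1, last value is huge
```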
A particularly important application of Theorem 2.12 is the case of the linear equation
x′ = A (t) x + B (t),
where x ∈ R^n, t ∈ I (where I is an open interval in R), B : I → R^n, A : I → R^{n×n}. Here
R^{n×n} is the space of all n × n matrices (which can be identified with R^{n²}). In other words,
for each t ∈ I, A (t) is an n × n matrix, and A (t) x is the product of the matrix A (t) and
the column vector x. In the coordinate form, one has a system of linear equations
x′_k = Σ_{l=1}^n A_{kl} (t) x_l + B_k (t).
Theorem 2.13 In the above notation, let A (t) and B (t) be continuous in t 2 I. Then,
for any t0 2 I and x0 2 Rn , the IVP
{ x′ = A (t) x + B (t),  x (t_0) = x_0 }
has a (unique) solution x (t) defined on I.
Proof. It suffices to check that the function f (t, x) = A (t) x + B (t) satisfies the con-
ditions of Theorem 2.12. This function is obviously continuous in (t, x) and continuously
differentiable in x, which implies by Lemma 2.6 that f (t, x) is locally Lipschitz in x.
We are left to verify (2.48). By the triangle inequality, we have
‖f (t, x)‖ ≤ ‖A (t) x‖ + ‖B (t)‖.   (2.50)
Let all the norms be the ∞-norm. Then
b (t) := ‖B (t)‖_∞ = max_k |B_k (t)|
is a continuous function of t, and
‖A (t) x‖_∞ ≤ a (t) ‖x‖_∞,
where
a (t) = max_k Σ_{l=1}^n |A_{kl} (t)|.
where i = 1, ..., n is the row index and l = 1, ..., m is the column index, so that fs is an
n £ m matrix.
If fx is continuous in Ω then by Lemma 2.6 f is locally Lipschitz in x so that all the
previous results apply.
Let x (t, s) be the maximal solution to (2.51). Recall that, by Theorem 2.11, the
domain U of x (t, s) is an open subset of Rm+1 and x : U ! Rn is continuous.
Theorem 2.14 Assume that the function f (t, x, s) is continuous and f_x and f_s exist and
are also continuous in Ω. Then x (t, s) is continuously differentiable in (t, s) ∈ U and the
Jacobian matrix y = ∂_s x solves the initial value problem
{ y′ = f_x (t, x (t, s), s) y + f_s (t, x (t, s), s),  y (t_0) = 0. }   (2.52)
Here ∂_s x = (∂x_k/∂s_l) is an n × m matrix where k = 1, ..., n is the row index and l = 1, ..., m
is the column index. Hence, y = ∂_s x can be considered as a vector in R^{n×m} depending on
t and s. Both terms on the right hand side of (2.52) are also n × m matrices, so that
(2.52) makes sense. Indeed, f_s is an n × m matrix, and f_x y is the product of the n × n
and n × m matrices, which is again an n × m matrix.
The ODE in (2.52) is called the variational equation for (2.51) along the solution
x (t, s) (or the equation in variations).
Note that the variational equation is linear. Indeed, for any fixed s, its right hand side
can be written in the form
y 0 = A (t) y + B (t) ,
where A (t) = fx (t, x (t, s) , s) and B (t) = fs (t, x (t, s) , s). Since f is continuous and
x (t, s) is continuous by Theorem 2.11, the functions A (t) and B (t) are continuous in t.
If the domain in t of the solution x (t, s) is Is then the domain of the variational equation
is Is £ Rn×m . By Theorem 2.13, the solution y (t) of (2.52) exists in the full interval Is .
Hence, Theorem 2.14 can be stated as follows: if x (t, s) is the solution of (2.51) on Is
and y (t) is the solution of (2.52) on Is then we have the identity y (t) = ∂s x (t, s) for all
t 2 Is . This provides a method of evaluating ∂s x (t, s) for a fixed s without finding x (t, s)
for all s.
Example. Consider the IVP with parameter
{ x′ = x² + 2s/t,  x (1) = −1 }
in the domain (0, +∞) × R × R (that is, t > 0 and x, s are arbitrary reals). Let us evaluate
x (t, s) and ∂_s x for s = 0. Obviously, the function f (t, x, s) = x² + 2s/t is continuously
differentiable in (x, s), whence it follows that the solution x (t, s) is continuously differentiable
in (t, s).
For s = 0 we have the IVP
{ x′ = x²,  x (1) = −1 },
whence we obtain x (t, 0) = −1/t. Noticing that f_x = 2x and f_s = 2/t, we obtain the
variational equation along this solution:
y′ = (f_x (t, x, s))|_{x=−1/t, s=0} y + (f_s (t, x, s))|_{x=−1/t, s=0} = −(2/t) y + 2/t.
This is a linear equation of the form y′ = a (t) y + b (t), which is solved by the formula
y = e^{A(t)} ∫ e^{−A(t)} b (t) dt,
that is,
x (t, s) = −1/t + (1 − t^{−2}) s + o (s) as s → 0.
In particular, we obtain for small s the approximation
x (t, s) ≈ −1/t + (1 − t^{−2}) s.
Later we will be able to obtain more terms in the Taylor formula and, hence, to get a
better approximation for x (t, s).
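As a sanity check (my own sketch, not from the notes; numpy/scipy are assumed), the coefficient 1 − t^{−2} of s above, which is the solution of the variational equation, can be compared with a finite-difference approximation of ∂_s x (t, 0):

```python
# Sketch: compare y(t) = 1 - 1/t^2 with the finite difference
# (x(t, h) - x(t, 0)) / h for x' = x^2 + 2s/t, x(1) = -1.
import numpy as np
from scipy.integrate import solve_ivp

def x_of(s, t):
    return solve_ivp(lambda tt, x: x**2 + 2*s/tt, (1.0, t[-1]), [-1.0],
                     t_eval=t, rtol=1e-10, atol=1e-12).y[0]

t = np.linspace(1.0, 5.0, 41)
h = 1e-6
fd = (x_of(h, t) - x_of(0.0, t)) / h
print(np.max(np.abs(fd - (1.0 - 1.0/t**2))))   # close to 0
```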
Remark. It is easy to deduce the variational equation (2.52) provided we know that the
function x (t, s) is sufficiently many times differentiable. Assume that the mixed partial
derivatives ∂_s ∂_t x and ∂_t ∂_s x exist and are equal (for example, this is the case when
x (t, s) ∈ C² (U)). Then, differentiating (2.51) in s and using the chain rule, we obtain
which implies (2.52) after the substitution ∂_s x = y. Although this argument is not a proof
of Theorem 2.14, it allows one to memorize the variational equation. The main technical
difficulty in the proof of Theorem 2.14 is verifying the differentiability of x in s.
How can one evaluate the higher derivatives of x (t, s) in s? Let us show how to find
the ODE for the second derivative z = ∂ss x assuming for simplicity that n = m = 1, that
is, both x and s are one-dimensional. For the derivative y = ∂s x we have the IVP (2.52),
which we write in the form
{ y′ = g (t, y, s),  y (t_0) = 0 }   (2.53)
where
g (t, y, s) = f_x (t, x (t, s), s) y + f_s (t, x (t, s), s).   (2.54)
For what follows we use the notation F (a, b, c, ...) 2 C k (a, b, c, ...) if all the partial deriva-
tives of the order up to k of the function F with respect to the specified variables a, b, c...
exist and are continuous functions, in the domain of F . For example, the condition in
Theorem 2.14 that fx and fs are continuous, can be shortly written as f 2 C 1 (x, s) , and
the claim of Theorem 2.14 is that x (t, s) 2 C 1 (t, s) .
Assume now that f ∈ C² (x, s). Then by (2.54) we obtain that g is continuous and
g ∈ C¹ (y, s), whence by Theorem 2.14 y ∈ C¹ (s). In particular, the function z = ∂_s y =
∂_{ss} x is defined. Applying the variational equation to the problem (2.53), we obtain the
equation for z:
z′ = g_y (t, y (t, s), s) z + g_s (t, y (t, s), s).
Since g_y = f_x (t, x, s) and
g_s (t, y, s) = f_{xx} (t, x, s) (∂_s x) y + f_{xs} (t, x, s) y + f_{sx} (t, x, s) ∂_s x + f_{ss} (t, x, s),
in the above example this equation becomes
z′ = −(2/t) z + 2 (1 − t^{−2})².
Solving this equation similarly to the first variational equation, with the same a (t) = −2/t
and with b (t) = 2 (1 − t^{−2})², we obtain
z (t) = e^{A(t)} ∫ e^{−A(t)} b (t) dt = t^{−2} ∫ 2t² (1 − t^{−2})² dt
 = t^{−2} ( (2/3) t³ − 2/t − 4t + C ) = (2/3) t − 2/t³ − 4/t + C/t².
The initial condition z (1) = 0 yields C = 16/3, whence
z (t) = (2/3) t − 2/t³ − 4/t + 16/(3t²).
Expanding x (t, s) at s = 0 by the Taylor formula of the second order, we obtain, as s → 0,
x (t, s) = x (t) + y (t) s + (1/2) z (t) s² + o (s²)
 = −1/t + (1 − t^{−2}) s + ( (1/3) t − 2/t + 8/(3t²) − 1/t³ ) s² + o (s²).
For comparison, the plots below show, for s = 0.1, the solution x (t, s) (yellow) found by
numerical methods (MAPLE), the first order approximation u (t) = −1/t + (1 − t^{−2}) s (green)
and the second order approximation v (t) = −1/t + (1 − t^{−2}) s + ( (1/3) t − 2/t + 8/(3t²) − 1/t³ ) s² (red).
[Plot: x (t, s) for s = 0.1 on 1 ≤ t ≤ 6, together with the approximations u (t) and v (t).]
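The comparison behind the plot can be reproduced with a short script (a sketch of mine, using scipy instead of MAPLE):

```python
# Sketch: numerical solution of x' = x^2 + 2s/t, x(1) = -1 for s = 0.1,
# versus the first and second order approximations u(t) and v(t).
import numpy as np
from scipy.integrate import solve_ivp

s = 0.1
t = np.linspace(1.0, 6.0, 51)
x = solve_ivp(lambda tt, x: x**2 + 2*s/tt, (1.0, 6.0), [-1.0],
              t_eval=t, rtol=1e-10, atol=1e-12).y[0]
u = -1/t + (1 - t**-2)*s
v = u + (t/3 - 2/t + 8/(3*t**2) - 1/t**3)*s**2
print(np.max(np.abs(x - u)), np.max(np.abs(x - v)))   # v is the better approximation
```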
Let us discuss an alternative method of obtaining the equations for the derivatives of
x (t, s) in s. As above, let x (t), y (t) , z (t) be respectively x (t, 0), ∂s x (t, 0) and ∂ss x (t, 0)
so that by the Taylor formula
x (t, s) = x (t) + y (t) s + (1/2) z (t) s² + o (s²).   (2.56)
Let us write a similar expansion for x0 = ∂t x, assuming that the derivatives ∂t and ∂s
commute on x. We have
∂s x0 = ∂t ∂s x = y 0
and in the same way
∂ss x0 = ∂s y 0 = ∂t ∂s y = z 0 .
Hence,
x′ (t, s) = x′ (t) + y′ (t) s + (1/2) z′ (t) s² + o (s²).
Substituting this into the equation
x0 = x2 + 2s/t
we obtain
x′ (t) + y′ (t) s + (1/2) z′ (t) s² + o (s²) = ( x (t) + y (t) s + (1/2) z (t) s² + o (s²) )² + 2s/t,
whence
x′ (t) + y′ (t) s + (1/2) z′ (t) s² = x² (t) + 2x (t) y (t) s + ( y (t)² + x (t) z (t) ) s² + 2s/t + o (s²).
Equating the terms with the same powers of s (which can be done by the uniqueness of
the Taylor expansion), we obtain the equations
x′ (t) = x² (t),
y′ (t) = 2x (t) y (t) + 2/t,
z′ (t) = 2x (t) z (t) + 2y² (t).
From the initial condition x (1, s) = −1 we obtain
−1 = x (1) + s y (1) + (s²/2) z (1) + o (s²),
whence x (1) = −1, y (1) = z (1) = 0. Solving successively the above equations with these
initial conditions, we obtain the same result as above.
Before we prove Theorem 2.14, let us prove some auxiliary statements from Analysis.
Definition. A set K ½ Rn is called convex if for any two points x, y 2 K, also the full
interval [x, y] is contained in K, that is, the point (1 ¡ λ) x + λy belong to K for any
λ 2 [0, 1].
Example. Let us show that any ball B (z, r) in Rn with respect to any norm is convex.
Indeed, it suffices to treat the case z = 0. If x, y 2 B (0, r) then kxk < r and kyk < r
whence for any λ 2 [0, 1]
k(1 ¡ λ) x + λyk · (1 ¡ λ) kxk + λ kyk < r.
It follows that (1 ¡ λ) x + λy 2 B (0, r), which was to be proved.
Lemma 2.15 (The Hadamard lemma) Let f (t, x) be a continuous mapping from Ω to
R^l where Ω is an open subset of R^{n+1} such that, for any t ∈ R, the set
Ω_t = {x ∈ R^n : (t, x) ∈ Ω}
is convex (see the diagram below). Assume that f_x (t, x) exists and is also continuous in
Ω. Consider the domain
Ω′ = {(t, x, y) ∈ R^{2n+1} : t ∈ R, x, y ∈ Ω_t}
 = {(t, x, y) ∈ R^{2n+1} : (t, x) and (t, y) ∈ Ω}.
Then there exists a continuous mapping ϕ (t, x, y) : Ω′ → R^{l×n} such that the following
identity holds:
f (t, y) − f (t, x) = ϕ (t, x, y) (y − x)
for all (t, x, y) ∈ Ω′ (here ϕ (t, x, y) (y − x) is the product of the l × n matrix and the
column-vector).
Furthermore, we have for all (t, x) ∈ Ω the identity
ϕ (t, x, x) = f_x (t, x).   (2.57)
[Diagram: a domain Ω with convex sections Ω_t, containing the points (t, x) and (t, y).]
Remark. The variable t can be multi-dimensional, and the proof goes through without
changes.
Since f (t, x) is continuously differentiable at x, we have
The point of the above Lemma is that the term o (‖x − y‖) can be eliminated if one
replaces f_x (t, x) by a continuous function ϕ (t, x, y).
Example. Consider some simple examples of functions f (x) with n = l = 1 and without
dependence on t. Say, if f (x) = x² then we have
f (y) − f (x) = (y + x) (y − x).
For any continuously differentiable function f (x), one can define ϕ (x, y) as follows:
ϕ (x, y) = { (f (y) − f (x)) / (y − x),  y ≠ x;  f ′ (x),  y = x. }
where ξ k 2 (xk , yk ), which implies that ξ k ! x and hence, f 0 (ξ k ) ! f 0 (x), where we have
used the continuity of the derivative f 0 (x).
Clearly, this argument does not work in the case n > 1 since one cannot divide by
y ¡ x. In the general case, we use a different approach.
Proof of Lemma 2.15. It suffices to prove this lemma for each component f_i
separately. Hence, we can assume that l = 1, so that ϕ is a row (ϕ_1, ..., ϕ_n). Hence, we
need to prove the existence of n real valued continuous functions ϕ_1, ..., ϕ_n of (t, x, y) such
that the following identity holds:
f (t, y) − f (t, x) = Σ_{i=1}^n ϕ_i (t, x, y) (y_i − x_i).
Since Ω_t is convex, the whole interval [x, y] lies in Ω_t, and by the fundamental theorem of calculus
f (t, y) − f (t, x) = ∫_0^1 (d/dλ) f (t, x + λ (y − x)) dλ
 = ∫_0^1 Σ_{i=1}^n f_{x_i} (t, x + λ (y − x)) (y_i − x_i) dλ
 = Σ_{i=1}^n ϕ_i (t, x, y) (y_i − x_i),
where
ϕ_i (t, x, y) = ∫_0^1 f_{x_i} (t, x + λ (y − x)) dλ.   (2.58)
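For instance (a small sketch of mine, using sympy), for f (x) = x² the formula (2.58) gives ϕ (x, y) = x + y, in agreement with the factorization f (y) − f (x) = (y + x)(y − x) in the example above:

```python
# Sketch: formula (2.58) for f(u) = u^2 (no dependence on t, n = l = 1).
import sympy as sp

x, y, lam = sp.symbols('x y lambda')
fprime = lambda u: 2*u                                   # derivative of f(u) = u^2
phi = sp.integrate(fprime(x + lam*(y - x)), (lam, 0, 1))
print(sp.simplify(phi))                                  # x + y
print(sp.simplify(phi*(y - x) - (y**2 - x**2)))          # 0, i.e. f(y) - f(x) = phi*(y - x)
```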
We are left to verify that ϕ_i is continuous. Observe first that the domain Ω′ of ϕ_i is an
open subset of R^{2n+1}. Indeed, if (t, x, y) ∈ Ω′ then (t, x) and (t, y) ∈ Ω, which implies by
the openness of Ω that there is ε > 0 such that the balls B ((t, x), ε) and B ((t, y), ε) in
R^{n+1} are contained in Ω. Assuming that the norm in all the spaces in question is the ∞-norm, we
obtain that B ((t, x, y), ε) ⊂ Ω′. The continuity of ϕ_i follows from the following general
statement.
Lemma 2.16 Let f (λ, u) be a continuous real-valued function on [a, b] × U, where U is
an open subset of R^k, λ ∈ [a, b] and u ∈ U. Then the function
ϕ (u) = ∫_a^b f (λ, u) dλ
is continuous in u ∈ U.
Proof of Lemma 2.16. Let {u_k}_{k=1}^∞ be a sequence in U that converges to some
u ∈ U. Then all u_k with large enough index k are contained in a closed ball B̄ (u, ε) ⊂ U.
Since f (λ, u) is continuous in [a, b] × U, it is uniformly continuous on any compact set in
this domain, in particular, in [a, b] × B̄ (u, ε). Hence, the convergence
f (λ, u_k) → f (λ, u) as k → ∞
is uniform in λ ∈ [a, b]. Since the operations of integration and uniform convergence
are interchangeable, we conclude that ϕ (u_k) → ϕ (u), which proves the continuity of ϕ.
The proof of Lemma 2.15 is finished as follows. Consider f_{x_i} (t, x + λ (y − x)) as a
function of (λ, t, x, y) ∈ [0, 1] × Ω′. This function is continuous in (λ, t, x, y), which implies
by Lemma 2.16 that ϕ_i (t, x, y) is also continuous in (t, x, y).
Finally, if x = y then f_{x_i} (t, x + λ (y − x)) = f_{x_i} (t, x), which implies by (2.58) that
ϕ_i (t, x, x) = f_{x_i} (t, x), that is, (2.57) holds.
[Diagram: the parameter interval (s∗ − δ, s∗ + δ) around s∗ and the interval [α, β] of t around t_0 and t∗.]
Besides, by the openness of Ω, ε and δ can be chosen so small that the following
condition is satisfied:
Ω̃ := {(t, x, s) ∈ R^{n+m+1} : α < t < β, ‖x − x (t, s∗)‖ < ε, |s − s∗| < δ} ⊂ Ω
(cf. the proof of Theorem 2.11). In particular, for all t ∈ (α, β) and s ∈ (s∗ − δ, s∗ + δ),
the solution x (t, s) is defined and (t, x (t, s), s) ∈ Ω̃.
[Diagram: the domain Ω̃ around the graph of x (t, s∗), containing the graphs of x (t, s) for s close to s∗.]
In what follows, we restrict the domain of the variables (t, x, s) to Ω̃. Note that this
domain is convex with respect to the variable (x, s), for any fixed t. Indeed, for a fixed t,
x varies in the ball B (x (t, s∗), ε) and s varies in the interval (s∗ − δ, s∗ + δ), which are
both convex sets.
Applying the Hadamard lemma to the function f (t, x, s) in this domain and using the
fact that f is continuously differentiable with respect to (x, s), we obtain the identity
a (t, s) = ϕ (t, x (t, s∗), s∗, x (t, s), s) and b (t, s) = ψ (t, x (t, s∗), s∗, x (t, s), s).   (2.59)
Set, for any s ∈ (s∗ − δ, s∗ + δ) \ {s∗},
z (t, s) = (x (t, s) − x (t, s∗)) / (s − s∗)
and observe that
z′ = (x′ (t, s) − x′ (t, s∗)) / (s − s∗) = (f (t, x (t, s), s) − f (t, x (t, s∗), s∗)) / (s − s∗)
 = a (t, s) z + b (t, s).
Note also that z (t_0, s) = 0 because both x (t, s) and x (t, s∗) satisfy the same initial
condition. Hence, the function z (t, s) solves, for any fixed s ∈ (s∗ − δ, s∗ + δ) \ {s∗}, the IVP
{ z′ = a (t, s) z + b (t, s),  z (t_0, s) = 0. }   (2.60)
Since this ODE is linear and the functions a and b are continuous in (t, s) ∈ (α, β) ×
(s∗ − δ, s∗ + δ), we conclude by Theorem 2.13 that the solution to this IVP exists for all
s ∈ (s∗ − δ, s∗ + δ) and t ∈ (α, β) and, by Theorem 2.11, the solution is continuous in
(t, s) ∈ (α, β) × (s∗ − δ, s∗ + δ). Hence, we can define z (t, s) also at s = s∗ as the solution
of the IVP (2.60). In particular, using the continuity of z (t, s) in s, we obtain
that is,
∂_s x (t, s∗) = lim_{s→s∗} (x (t, s) − x (t, s∗)) / (s − s∗) = lim_{s→s∗} z (t, s) = z (t, s∗).
Hence, the derivative y (t) = ∂_s x (t, s∗) exists and is equal to z (t, s∗), that is, y (t) satisfies
the IVP
{ y′ = a (t, s∗) y + b (t, s∗),  y (t_0) = 0. }
Note that by (2.59) and Lemma 2.15
a (t, s∗) = ϕ (t, x (t, s∗), s∗, x (t, s∗), s∗) = f_x (t, x (t, s∗), s∗)
and
b (t, s∗) = ψ (t, x (t, s∗), s∗, x (t, s∗), s∗) = f_s (t, x (t, s∗), s∗).
Hence, we obtain that y (t) satisfies the variational equation (2.52).
To finish the proof, we have to verify that x (t, s) is continuously differentiable in (t, s).
Here we come back to the general case s 2 Rm . The derivative ∂s x = y satisfies the IVP
(2.52) and, hence, is continuous in (t, s) by Theorem 2.11. Finally, for the derivative ∂t x
we have the identity
∂t x = f (t, x (t, s) , s) , (2.61)
which implies that ∂t x is also continuous in (t, s). Hence, x is continuously differentiable
in (t, s).
Remark. It follows from (2.61) that ∂t x is differentiable in s and, by the chain rule,
On the other hand, it follows from (2.52) that
Theorem 2.17 Under the conditions of Theorem 2.14, assume that, for some k ∈ N,
f (t, x, s) ∈ C^k (x, s). Then the maximal solution x (t, s) belongs to C^k (s). Moreover, for
any multiindex α of order |α| ≤ k and of dimension m (the same as that of s), we have
∂_t ∂_s^α x = ∂_s^α ∂_t x.   (2.65)
Here α = (α_1, ..., α_m), where the α_i are non-negative integers, |α| = α_1 + ... + α_m, and
∂_s^α = ∂^{|α|} / (∂s_1^{α_1} ... ∂s_m^{α_m}).
x′ = A (t) x + B (t),
where A (t) : I → R^{n×n}, B : I → R^n, and I is an open interval in R. If A (t) and B (t)
are continuous in t then, for any t_0 ∈ I and x_0 ∈ R^n, the IVP
{ x′ = A (t) x + B (t),  x (t_0) = x_0 }   (3.1)
has a unique solution defined on the full interval I (cf. Theorem 2.13). In the sequel,
we always assume that A (t) and B (t) are continuous on I and consider only solutions
defined on the entire interval I.
Theorem 3.1 A is a linear space and dim A = n. Consequently, if x_1, ..., x_n are n linearly
independent solutions to x′ = A (t) x then the general solution has the form
x (t) = C_1 x_1 (t) + ... + C_n x_n (t),
where C_1, ..., C_n are arbitrary constants.
Proof. The set of all functions I → R^n is a linear space with respect to the operations of
addition and multiplication by a constant. The zero element is the function which is constant
0 on I. We need to prove that the set of solutions A is a linear subspace of the space
of all functions. It suffices to show that A is closed under addition and under
multiplication by constants.
If x and y ∈ A then also x + y ∈ A because
(x + y)′ = x′ + y′ = Ax + Ay = A (x + y)
where all the functions a_k (t) are defined on an open interval I ⊂ R and are continuous on I.
As we know, such an ODE can be reduced to a vector ODE of the 1st order as follows.
Consider the vector function
x (t) = ( x (t), x′ (t), ..., x^{(n−1)} (t) ),   (3.4)
so that
x_1 = x, x_2 = x′, ..., x_{n−1} = x^{(n−2)}, x_n = x^{(n−1)}.
Then (3.3) is equivalent to the system
x′_1 = x_2
x′_2 = x_3
...
x′_{n−1} = x_n
x′_n = −a_1 x_n − a_2 x_{n−1} − ... − a_n x_1,
that is,
x′ = A (t) x,   (3.5)
where
A = (  0    1    0   ...  0
       0    0    1   ...  0
      ...  ...  ...  ... ...
       0    0    0   ...  1
      −a_n −a_{n−1} −a_{n−2} ... −a_1 ).
Since A (t) is continuous in t on I, we can assume that any solution x (t) of (3.5) is defined
on the entire interval I and, hence, the same is true for any solution x (t) of (3.3).
Denote now by Ã the set of all solutions of (3.3) defined on I.
Corollary. Ã is a linear space and dim Ã = n. Consequently, if x_1, ..., x_n are n linearly
independent solutions to x^{(n)} + a_1 (t) x^{(n−1)} + ... + a_n (t) x = 0 then the general solution
has the form
x (t) = C_1 x_1 (t) + ... + C_n x_n (t),
where C_1, ..., C_n are arbitrary constants.
Proof. The fact that Ã is a linear space is obvious (cf. the proof of Theorem 3.1).
The relation (3.4) defines a linear mapping from Ã to A. This mapping is obviously
injective (if x (t) ≡ 0 then x (t) ≡ 0) and surjective, because any solution x of (3.3)
gives back a solution x (t) of (3.5). Hence, Ã and A are linearly isomorphic, whence
dim Ã = dim A = n.
Proof. Let us prove by induction in n that the functions e^{λ_1 t}, ..., e^{λ_n t} are linearly
independent provided λ_1, ..., λ_n are distinct complex numbers. If n = 1 then the claim is
trivial, just because the exponential function is not identically zero. Inductive step from
n − 1 to n: assume that, for some complex constants C_1, ..., C_n and all t ∈ R,
C_1 e^{λ_1 t} + ... + C_n e^{λ_n t} = 0,   (3.7)
and prove that C_1 = ... = C_n = 0. Dividing (3.7) by e^{λ_n t} and setting μ_j = λ_j − λ_n, we
obtain
C_1 e^{μ_1 t} + ... + C_{n−1} e^{μ_{n−1} t} + C_n = 0.
Differentiating in t, we obtain
C_1 μ_1 e^{μ_1 t} + ... + C_{n−1} μ_{n−1} e^{μ_{n−1} t} = 0.
By the inductive hypothesis, we conclude that C_j μ_j = 0, whence by μ_j ≠ 0 we conclude that
C_j = 0, for all j = 1, ..., n − 1. Substituting into (3.7), we obtain also C_n = 0.
Since complex conjugation commutes with addition and multiplication of numbers,
the identity P (λ) = 0 implies P (λ̄) = 0 (since the a_k are real, we have ā_k = a_k). Next,
writing λ = α + iβ, we have
e^{λt} = e^{αt} (cos βt + i sin βt) and e^{λ̄t} = e^{αt} (cos βt − i sin βt),   (3.8)
so that e^{λt} and e^{λ̄t} are linear combinations of e^{αt} cos βt and e^{αt} sin βt. The converse is
also true, because
e^{αt} cos βt = (1/2) (e^{λt} + e^{λ̄t}) and e^{αt} sin βt = (1/(2i)) (e^{λt} − e^{λ̄t}).   (3.9)
Hence, replacing in the sequence e^{λ_1 t}, ..., e^{λ_n t} the functions e^{λt} and e^{λ̄t} by e^{αt} cos βt and
e^{αt} sin βt preserves the linear independence of the sequence.
Example. Consider the ODE
x00 ¡ 3x0 + 2x = 0.
The characteristic polynomial is P (λ) = λ2 ¡ 3λ + 2, which has the roots λ1 = 2 and
λ2 = 1. Hence, the linearly independent solutions are e2t and et , and the general solution
is C1 e2t + C2 et .
Example. Consider the ODE x00 +x = 0. The characteristic polynomial is P (λ) = λ2 +1,
which has the complex roots λ1 = i and λ2 = ¡i. Hence, we obtain the complex solutions
eit and e−it . Out of them, we can get also real linearly independent solutions. Indeed,
just replace these two functions by their two linear combinations (which corresponds to
a change of the basis in the space of solutions)
eit + e−it eit ¡ e−it
= cos t and = sin t.
2 2i
Hence, we conclude that cos t and sin t are linearly independent solutions and the general
solution is C1 cos t + C2 sin t.
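Both examples can be double-checked with a computer algebra system (a sketch of mine; sympy is an assumed dependency, the notes of course do this by hand):

```python
# Sketch: sympy reproduces the general solutions of the two examples above.
import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')
print(sp.dsolve(x(t).diff(t, 2) - 3*x(t).diff(t) + 2*x(t), x(t)))  # C1*exp(t) + C2*exp(2*t)
print(sp.dsolve(x(t).diff(t, 2) + x(t), x(t)))                     # C1*sin(t) + C2*cos(t)
```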
Example. Consider the ODE x‴ − x = 0. The characteristic polynomial is
P (λ) = λ³ − 1 = (λ − 1)(λ² + λ + 1), which has the roots λ_1 = 1 and λ_{2,3} = −1/2 ± i√3/2.
Hence, we obtain the three linearly independent real solutions
e^t, e^{−t/2} cos (√3 t / 2), e^{−t/2} sin (√3 t / 2),
and the real general solution is
C_1 e^t + e^{−t/2} ( C_2 cos (√3 t / 2) + C_3 sin (√3 t / 2) ).
What to do when P (λ) has fewer than n distinct roots? Recall the fundamental the-
orem of algebra (which is normally proved in a course of Complex Analysis): any poly-
nomial P (λ) of degree n with complex coefficients has exactly n complex roots counted
with multiplicity. What is the multiplicity of a root? If λ_0 is a root of P (λ) then its
multiplicity is the maximal natural number m such that P (λ) is divisible by (λ − λ_0)^m,
that is, the following identity holds
P (λ) = (λ ¡ λ0 )m Q (λ) ,
m1 + ... + mr = n
and, hence,
P (λ) = (λ ¡ λ1 )m1 ... (λ ¡ λr )mr .
In order to obtain n independent solutions to the ODE (3.6), each root λj should give
rise to mj independent solutions.
Theorem 3.3 Let λ1 , ..., λr be all the distinct complex roots of the characteristic polyno-
mial P (λ) with the multiplicities m1 , ..., mr , respectively. Then the following n functions
are linearly independent solutions of (3.6):
{ t^k e^{λ_j t} : j = 1, ..., r, k = 0, ..., m_j − 1 }.   (3.10)
Remark. Setting
P_j (t) = Σ_{k=0}^{m_j−1} C_{jk} t^k,
Hence, any solution to (3.6) has the form (3.12) where Pj is an arbitrary polynomial of t
of the degree at most mj ¡ 1.
Example. Consider the ODE x00 ¡ 2x0 + x = 0 which has the characteristic polynomial
P (λ) = λ2 ¡ 2λ + 1 = (λ ¡ 1)2 .
x (t) = (C1 + C2 t) et .
and it has the roots λ_1 = 0, λ_2 = i and λ_3 = −i, where λ_2 and λ_3 have multiplicity 2. The
following 5 functions are linearly independent solutions:
Replacing in the sequence (3.13) e^{it}, e^{−it} by cos t, sin t and t e^{it}, t e^{−it} by t cos t, t sin t, we
obtain the linearly independent real solutions
We make some preparation for the proof of Theorem 3.3. Given a polynomial P (λ) =
a_0 λ^n + a_1 λ^{n−1} + ... + a_n with complex coefficients, associate with it the differential operator
P (d/dt) = a_0 (d/dt)^n + a_1 (d/dt)^{n−1} + ... + a_n
 = a_0 d^n/dt^n + a_1 d^{n−1}/dt^{n−1} + ... + a_n,
where we use the convention that the “product” of differential operators is the composition.
That is, the operator P (d/dt) acts on a smooth enough function f (t) by the rule
P (d/dt) f = a_0 f^{(n)} + a_1 f^{(n−1)} + ... + a_n f.
It suffices to verify it for P (λ) = λ^k and then use the linearity of this identity. For such
P (λ) = λ^k, we have
P (d/dt) e^{λt} = d^k/dt^k e^{λt} = λ^k e^{λt} = P (λ) e^{λt},
which was to be proved.
Lemma 3.4 If f (t), g (t) are n times differentiable functions on an open interval then,
for any polynomial P of order at most n, the following identity holds:
P (d/dt) (fg) = Σ_{j=0}^n (1/j!) f^{(j)} P^{(j)} (d/dt) g.   (3.16)
It is an easy exercise to see directly that this identity is correct.
Proof. It suffices to prove the identity (3.16) in the case when P (λ) = λk , k · n,
because then for a general polynomial (3.16) will follow by taking linear combination of
those for λk . If P (λ) = λk then, for j · k
The latter identity is known from Analysis and is called the Leibniz formula 7 .
Proof. If P has a root λ with multiplicity m then we have, for all z ∈ C, the identity
P (z) = (z − λ)^m Q (z),
where Q is a polynomial such that Q (λ) ≠ 0. For any natural k, we have by the Leibniz
formula
P^{(k)} (z) = Σ_{j=0}^k (k choose j) ((z − λ)^m)^{(j)} Q^{(k−j)} (z).
If k < m then also j < m and
((z − λ)^m)^{(j)} = const · (z − λ)^{m−j},
which vanishes at z = λ. Hence, for k < m, we have P^{(k)} (λ) = 0. For k = m we have
again that all the derivatives ((z − λ)^m)^{(j)} vanish at z = λ provided j < k, while for j = k
we obtain
((z − λ)^m)^{(k)} = ((z − λ)^m)^{(m)} = m! ≠ 0.
Hence,
P^{(m)} (λ) = m! Q (λ) ≠ 0.
⁷ If k = 1 then (3.17) amounts to the familiar product rule
(f g)′ = f′ g + f g′.
Conversely, if (3.18) holds then, by the Taylor formula for a polynomial at λ, we have
P (z) = P (λ) + (P′ (λ) / 1!) (z − λ) + ... + (P^{(n)} (λ) / n!) (z − λ)^n
 = (P^{(m)} (λ) / m!) (z − λ)^m + ... + (P^{(n)} (λ) / n!) (z − λ)^n
 = (z − λ)^m Q (z),
where
Q (z) = P^{(m)} (λ) / m! + (P^{(m+1)} (λ) / (m + 1)!) (z − λ) + ... + (P^{(n)} (λ) / n!) (z − λ)^{n−m}.
Obviously, Q (λ) = P^{(m)} (λ) / m! ≠ 0, which implies that λ is a root of multiplicity m.
Lemma 3.6 If λ_1, ..., λ_r are distinct complex numbers and if, for some polynomials P_j (t),
Σ_{j=1}^r P_j (t) e^{λ_j t} = 0 for all t ∈ R,   (3.19)
then P_j (t) ≡ 0 for all j = 1, ..., r.
Choose some integer k > deg P_r, where deg P is the maximal power of t that enters P
with non-zero coefficient. Differentiating the above identity k times, we obtain
Σ_{j=1}^{r−1} Q_j (t) e^{μ_j t} = 0,
If j > k then (t^k)^{(j)} ≡ 0. If j ≤ k then j < m and, hence, P^{(j)} (λ) = 0 by hypothesis.
Hence, all the terms in the above sum vanish, whence
P (d/dt) (t^k e^{λt}) = 0,
that is, the function x (t) = t^k e^{λt} solves (3.14).
If λ_1, ..., λ_r are all the distinct complex roots of P (λ) and m_j is the multiplicity of λ_j, then
it follows that each function in the following sequence
{ t^k e^{λ_j t} : j = 1, ..., r, k = 0, ..., m_j − 1 }   (3.21)
is a solution of (3.14). Let us show that these functions are linearly independent. Clearly,
each linear combination of the functions (3.21) has the form
Σ_{j=1}^r Σ_{k=0}^{m_j−1} C_{jk} t^k e^{λ_j t} = Σ_{j=1}^r P_j (t) e^{λ_j t},   (3.22)
where P_j (t) = Σ_{k=0}^{m_j−1} C_{jk} t^k are polynomials. If the linear combination is identically zero
then by Lemma 3.6 P_j ≡ 0, which implies that all C_{jk} are 0. Hence, the functions (3.21)
are linearly independent, and by Theorem 3.1 the general solution of (3.14) has the form
(3.22).
Let us show that if λ = α + iβ is a complex (non-real) root of multiplicity m then
λ̄ = α − iβ is also a root of the same multiplicity m. Indeed, by Lemma 3.5, λ satisfies the
relations (3.18). Applying complex conjugation and using the fact that the coefficients
of P are real, we obtain that the same relations hold for λ̄ instead of λ, which implies
that λ̄ is also a root of multiplicity m.
The last claim, that every couple t^k e^{λt}, t^k e^{λ̄t} in (3.21) can be replaced by the real-valued
functions t^k e^{αt} cos βt, t^k e^{αt} sin βt, follows from the observation that the functions t^k e^{αt} cos βt,
t^k e^{αt} sin βt are linear combinations of t^k e^{λt}, t^k e^{λ̄t}, and vice versa, which one sees from the
identities
e^{αt} cos βt = (1/2) (e^{λt} + e^{λ̄t}), e^{αt} sin βt = (1/(2i)) (e^{λt} − e^{λ̄t}),
e^{λt} = e^{αt} (cos βt + i sin βt), e^{λ̄t} = e^{αt} (cos βt − i sin βt),
multiplied by t^k (compare the proof of Theorem 3.2).
Proof. If x (t) is also a solution of (3.23) then the function y (t) = x (t) ¡ x0 (t) solves
y 0 = Ay, whence by Theorem 3.1
and x (t) satisfies (3.24). Conversely, for all C1 , ...Cn , the function (3.25) solves y 0 = Ay,
whence it follows that the function x (t) = x0 (t) + y (t) solves (3.23).
Consider now a scalar ODE
where the function f (t) is a quasi-polynomial, that is, f has the form
f (t) = Σ_j R_j (t) e^{μ_j t},
where the R_j (t) are polynomials, the μ_j are complex numbers, and the sum is finite. It is obvious
that the sum and the product of two quasi-polynomials are again quasi-polynomials.
In particular, the following functions are quasi-polynomials
Then the equation (3.27) can be written shortly in the form P (d/dt) x = f, which will be
used below. We start with the following observation.
Claim. If f = c_1 f_1 + ... + c_k f_k and x_1 (t), ..., x_k (t) are solutions to the equations
P (d/dt) x_j = f_j, then x = c_1 x_1 + ... + c_k x_k solves the equation P (d/dt) x = f.
Proof. This is trivial because
P (d/dt) x = P (d/dt) Σ_j c_j x_j = Σ_j c_j P (d/dt) x_j = Σ_j c_j f_j = f.
Hence, we can assume that the function f in (3.27) is of the form f (t) = R (t) eμt
where R (t) is a polynomial.
To illustrate the method, which will be used in this Section, consider first the following
example.
Example. Consider the ODE
P (d/dt) x = e^{μt},   (3.28)
where μ is not a root of the characteristic polynomial P (λ) (the non-resonant case). We
claim that (3.28) has a particular solution of the form x (t) = a e^{μt}, where a is a complex
constant to be chosen. Indeed, we have by (3.15)
P (d/dt) (e^{μt}) = P (μ) e^{μt},
whence
P (d/dt) (a e^{μt}) = e^{μt}
provided
a = 1 / P (μ).   (3.29)
Consider some concrete examples of ODEs. Let us find a particular solution to the ODE
x″ + 2x′ + x = e^t.
Note that P (λ) = λ² + 2λ + 1 and μ = 1 is not a root of P. Look for a solution in the
form x (t) = a e^t. Substituting into the equation, we obtain
Consider another equation:
x″ + 2x′ + x = sin t.   (3.30)
Note that sin t is the imaginary part of e^{it}. So, we first solve
x″ + 2x′ + x = e^{it}
and then take the imaginary part of the solution. Looking for a solution in the form
x (t) = a e^{it}, we obtain
a = 1 / P (μ) = 1 / (i² + 2i + 1) = 1 / (2i) = −i/2.
Hence, the solution is
x = −(i/2) e^{it} = −(i/2) (cos t + i sin t) = (1/2) sin t − (i/2) cos t.
Therefore, its imaginary part x (t) = −(1/2) cos t solves the equation (3.30).
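A direct check of this particular solution (my own sketch; sympy assumed):

```python
# Sketch: verify that x(t) = -cos(t)/2 solves x'' + 2x' + x = sin(t).
import sympy as sp

t = sp.symbols('t')
x = -sp.cos(t)/2
print(sp.simplify(x.diff(t, 2) + 2*x.diff(t) + x - sp.sin(t)))   # 0
```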
Consider yet another ODE
x″ + 2x′ + x = e^{−t} cos t.   (3.31)
Here e^{−t} cos t is the real part of e^{μt} where μ = −1 + i. Hence, we first solve
x″ + 2x′ + x = e^{(−1+i)t}
and then take the real part. Since P (−1 + i) = (−1 + i)² + 2 (−1 + i) + 1 = −1, the complex
solution is x (t) = −e^{(−1+i)t} = −e^{−t} cos t − i e^{−t} sin t, and the solution to (3.31) is
x (t) = −e^{−t} cos t.
Finally, let us combine the above examples into one:
This time μ = ¡1 is a root of P (λ) = λ2 + 2λ + 1 and the above method does not work.
Indeed, if we look for a solution in the form x = ae−t then after substitution we get 0 in
the left hand side because e−t solves the homogeneous equation.
The case when μ is a root of P (λ) is referred to as a resonance. This case as well as
the case of the general quasi-polynomial in the right hand side is treated in the following
theorem.
where a is a constant that replaces Q (indeed, Q must have degree 0 and, hence, is a
constant). Substituting this into the equation, we obtain
a ( (t² e^{−t})″ + 2 (t² e^{−t})′ + t² e^{−t} ) = e^{−t},   (3.33)
that is, 2a e^{−t} = e^{−t}, whence a = 1/2 and
x (t) = (1/2) t² e^{−t}.
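Again a quick verification of the resonant particular solution (a sketch of mine; sympy assumed):

```python
# Sketch: verify that x(t) = t^2 e^{-t}/2 solves x'' + 2x' + x = e^{-t}.
import sympy as sp

t = sp.symbols('t')
x = t**2*sp.exp(-t)/2
print(sp.simplify(x.diff(t, 2) + 2*x.diff(t) + x - sp.exp(-t)))   # 0
```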
Consider one more example.
with the same μ = −1 and R (t) = t. Since deg R = 1, the polynomial Q must have
degree 1, that is, Q (t) = at + b. The coefficients a and b can be determined as follows.
Substituting
x (t) = (at + b) t² e^{−t} = (at³ + bt²) e^{−t}
into the equation, we obtain
x″ + 2x′ + x = ((at³ + bt²) e^{−t})″ + 2 ((at³ + bt²) e^{−t})′ + (at³ + bt²) e^{−t}
 = (2b + 6at) e^{−t}.
Comparing with the right hand side t e^{−t}, we must have
2b + 6at = t,
whence b = 0, a = 1/6, and
x (t) = (t³/6) e^{−t}.
By Lemma 3.4, the summation here runs from j = 0 to j = n, but we can allow any
j ≥ 0 because for j > n the derivative P^{(j)} is identically zero anyway. Furthermore, since
P^{(j)} (μ) = 0 for all j ≤ m − 1, we can restrict the summation to j ≥ m. Set
y (t) = (t^m Q (t))^{(m)}   (3.35)
and observe that y (t) is a polynomial of degree k, provided so is Q (t). Conversely, for
any polynomial y (t) of degree k, there is a polynomial Q (t) of degree k such that (3.35)
holds. Indeed, integrating (3.35) m times without adding constants and then dividing by
t^m, we obtain Q (t) as a polynomial of degree k.
It follows from (3.34) that y must satisfy the ODE
which we rewrite in the form
b_0 y + b_1 y′ + ... + b_i y^{(i)} + ... = R (t),   (3.36)
where
b_0 = P^{(m)} (μ) / m! ≠ 0.   (3.37)
Hence, the problem amounts to the following: given a polynomial
of degree k, prove that there exists a polynomial y (t) of degree k that satisfies (3.36). Let
us prove the existence of y by induction in k.
The inductive basis. If k = 0, then R (t) ´ r0 and y (t) ´ a, so that (3.36) becomes
ab0 = r0 whence a = r0 /b0 (where we use that b0 6= 0).
The inductive step from the values smaller than k to k. Represent y in the form
y = a t^k + z,   (3.38)
where z is a polynomial of degree < k. Substituting (3.38) into (3.36), we obtain the
equation for z:
b_0 z + b_1 z′ + ... + b_i z^{(i)} + ... = R (t) − ( a b_0 t^k + a b_1 (t^k)′ + ... + a b_k (t^k)^{(k)} ) =: R̃ (t).
Choosing a from the equation a b_0 = r_0, we obtain that the term t^k in the right hand side
here cancels out, whence it follows that R̃ (t) is a polynomial of degree < k. By the
inductive hypothesis, the equation
b_0 z + b_1 z′ + ... + b_i z^{(i)} + ... = R̃ (t)
has a solution z (t) which is a polynomial of degree < k. Hence, the function y = a t^k + z
solves (3.36) and is a polynomial of degree k.
Remark. If k = 0, that is, R (t) ≡ r_0 is a constant, then (3.36) yields
y = r_0 / b_0 = m! r_0 / P^{(m)} (μ).
Hence,
(t^m Q (t))^{(m)} = m! r_0 / P^{(m)} (μ),
whence Q (t) = r_0 / P^{(m)} (μ). Therefore, the ODE P (d/dt) x = r_0 e^{μt} has a particular solution
x (t) = ( r_0 / P^{(m)} (μ) ) t^m e^{μt}.   (3.39)
Example. Consider again the ODE x″ + 2x′ + x = e^{−t}. Then μ = −1 has multiplicity
m = 2, and R (t) ≡ 1. Hence, by the above Remark, we find a particular solution
x (t) = ( 1 / P″ (−1) ) t² e^{−t} = (1/2) t² e^{−t}.
which occurs in various physical phenomena. For example, (3.40) describes the movement
of a point body of mass m along the axis x, where the term px′ comes from the friction
forces, qx from the elastic forces, and f (t) is an external time-dependent force. Another
physical situation that is described by (3.40) is an electrical circuit:
[Diagram: an RLC circuit with voltage source V (t), inductance L, resistance R and capacitance C.]
As before, let R be the resistance, L be the inductance, and C be the capacitance of the
circuit. Let V (t) be the voltage of the power source in the circuit and x (t) be the current
in the circuit at time t. Then we have seen that the equation for x (t) is
L x″ + R x′ + x/C = V′.
If L > 0 then, dividing by L, we obtain an ODE of the form (3.40).
As an example of application of the above methods of solving such ODEs, we investigate
here the case when the function f (t) is periodic. More precisely, consider the ODE
x″ + p x′ + q x = A sin ωt,   (3.41)
where A, ω are given positive reals. The function A sin ωt is a model for a more general
periodic force, which makes good physical sense in all the above examples. For example,
in the case of an electrical circuit the external force has the form A sin ωt if the power source
is an electrical socket with alternating current (AC). The number ω is called the
frequency of the external force (note that the period is 2π/ω) or the external frequency, and
the number A is called the amplitude (the maximum value) of the external force.
Assume in the sequel that p ≥ 0 and q > 0, which is the physically most interesting case.
To find a particular solution of (3.41), let us consider the ODE with complex right hand
side:
x″ + p x′ + q x = A e^{iωt}.   (3.42)
Consider first the non-resonant case, when iω is not a root of the characteristic polynomial
P (λ) = λ² + pλ + q. Searching for the solution in the form c e^{iωt}, we obtain
c = A / P (iω) = A / (−ω² + piω + q) =: a + ib,
where
B = √(a² + b²) = |c| = A / √((q − ω²)² + ω² p²)   (3.44)
and ϕ ∈ [0, 2π) is determined from the identities
cos ϕ = a/B, sin ϕ = b/B.
The number B is the amplitude of the solution and ϕ is the phase.
To obtain the general solution to (3.41), we need to add to (3.43) the general solution
to the homogeneous equation
x″ + p x′ + q x = 0.
Let λ_1 and λ_2 be the roots of P (λ), that is,
λ_{1,2} = −p/2 ± √(p²/4 − q).
Consider the following possibilities for the roots.
λ_1 and λ_2 are real. Since p ≥ 0 and q > 0, we see that both λ_1 and λ_2 are strictly
negative. The general solution of the homogeneous equation has the form
C_1 e^{λ_1 t} + C_2 e^{λ_2 t} if λ_1 ≠ λ_2,
(C_1 + C_2 t) e^{λ_1 t} if λ_1 = λ_2.
In both cases, it decays exponentially in t as t → +∞. Hence, the general solution of
(3.41) has the form
As we see, when t → ∞ the leading term of x (t) is the above particular solution
B sin (ωt + ϕ). For the electrical circuit this means that the current quickly stabilizes
and becomes periodic with the same frequency ω as the external force.
λ_1 and λ_2 are complex, say λ_{1,2} = α ± iβ, where
α = −p/2 ≤ 0 and β = √(q − p²/4) > 0.
The general solution to the homogeneous equation is
The number β is called the natural frequency of the physical system in question (pendulum,
electrical circuit, spring) for the obvious reason: in the absence of the external force,
the system oscillates with the natural frequency β.
Hence, the general solution to (3.41) is
If α < 0 then the leading term is again B sin (ωt + ϕ). Here is a particular example of
such a function: sin t + 2e^{−t/4} sin πt.
[Plot: the function sin t + 2e^{−t/4} sin πt for 0 ≤ t ≤ 25.]
λ_1 and λ_2 are purely imaginary, that is, α = 0. In this case, p = 0, q = β², and the
equation has the form
x″ + β² x = A sin ωt.
The assumption that iω is not a root implies ω ≠ β. The general solution is
which is the sum of two sine waves with different frequencies, the natural frequency and
the external frequency. Here is a particular example of such a function: sin t + 2 sin πt.
[Plot: the function sin t + 2 sin πt for 0 ≤ t ≤ 25.]
Strictly speaking, in practice such electrical circuits do not occur since the resistance is
always positive.
Let us come back to the formula (3.44) for the amplitude B and, as an example of
its application, consider the following question: for what value of the external frequency
ω is the amplitude B maximal? Assuming that A does not depend on ω and using the
identity
B² = A² / (ω⁴ + (p² − 2q) ω² + q²),
we see that the maximum of B occurs when the denominator takes its minimum value. If
p² ≥ 2q then the minimum value occurs at ω = 0, which is not very interesting physically.
Assume that p² < 2q (in particular, this implies that p² < 4q and, hence, λ_1 and λ_2 are
complex). Then the maximum of B occurs when
ω² = −(1/2) (p² − 2q) = q − p²/2.
The value
ω_0 := √(q − p²/2)
is called the resonant frequency of the physical system in question. If the external force
has the resonant frequency then the system exhibits the highest response to this force.
This phenomenon is called a resonance.
Note for comparison that the natural frequency is equal to β = √(q − p²/4), which is
in general different from ω_0. In terms of ω_0 and β, we can write
B² = A² / (ω⁴ − 2ω_0² ω² + q²) = A² / ((ω² − ω_0²)² + q² − ω_0⁴)
 = A² / ((ω² − ω_0²)² + p² β²),
where we have used that
q² − ω_0⁴ = q² − (q − p²/2)² = q p² − p⁴/4 = p² β².
In particular, the maximum amplitude, which occurs when ω = ω_0, is B_max = A / (pβ).
In conclusion, consider the case when iω is a root of P (λ), that is,
(iω)² + p iω + q = 0,
which implies p = 0 and q = ω². In this case α = 0 and ω = ω_0 = β = √q, and the
equation has the form
x″ + ω² x = A sin ωt.
Considering the ODE
x″ + ω² x = A e^{iωt}
and searching for a particular solution in the form x (t) = c t e^{iωt}, we obtain by (3.39)
c = A / P′ (iω) = A / (2iω).
[Plot: a solution in the resonant case; the oscillations grow linearly in t, 0 ≤ t ≤ 25.]
3.6 The method of variation of parameters
3.6.1 A system of the 1st order
We present here the method of variation of parameters in order to solve a general linear
system
x0 = A (t) x + B (t)
where as before A (t) : I ! Rn×n and B (t) : I ! Rn are continuous. Let x1 (t) ,..., xn (t)
be n linearly independent solutions of the homogeneous system x0 = A (t) x, defined on
I. We start with the following observation.
Lemma 3.9 If the solutions x1 (t) , ..., xn (t) of the system x0 = A (t) x are linearly inde-
pendent then, for any t0 2 I, the vectors x1 (t0 ) , ..., xn (t0 ) are linearly independent.
Proof. Let t_0 ∈ I and let C_1, ..., C_n be constants such that
C_1 x_1 (t_0) + ... + C_n x_n (t_0) = 0.
Consider the function x (t) = C_1 x_1 (t) + ... + C_n x_n (t). Then x (t) solves the IVP
{ x′ = A (t) x,  x (t_0) = 0, }
whence by the uniqueness theorem x (t) ≡ 0. Since the solutions x_1, ..., x_n are independent,
it follows that C_1 = ... = C_n = 0, whence the independence of the vectors x_1 (t_0), ..., x_n (t_0)
follows.
Example. Consider the two vector functions
x_1 (t) = (cos t, sin t)^T and x_2 (t) = (sin t, cos t)^T,
which are obviously linearly independent. However, for t = π/4, we have
x_1 (t) = (√2/2, √2/2)^T = x_2 (t),
so that the vectors x_1 (π/4) and x_2 (π/4) are linearly dependent. Hence, x_1 (t) and x_2 (t)
cannot be solutions of the same system x′ = Ax.
For comparison, the functions
x_1 (t) = (cos t, sin t)^T and x_2 (t) = (− sin t, cos t)^T
are solutions of the same system
x′ = ( 0 −1; 1 0 ) x,
and, hence, the vectors x_1 (t) and x_2 (t) are linearly independent for any t. This follows
also from
det (x_1 | x_2) = det ( cos t  − sin t; sin t  cos t ) = 1 ≠ 0.
Given n linearly independent solutions to x′ = A (t) x, form the n × n matrix
X (t) = (x_1 (t) | x_2 (t) | ... | x_n (t)),
where the k-th column is the column-vector x_k (t), k = 1, ..., n. The matrix X is called
the fundamental matrix of the system x′ = Ax.
It follows from Lemma 3.9 that the columns of X (t) are linearly independent for any
t ∈ I, which in particular means that the inverse matrix X^{−1} (t) is also defined for all
t ∈ I. This allows us to solve the inhomogeneous system as follows.
is given by
x (t) = X (t) ∫ X^{−1} (t) B (t) dt.   (3.46)
Proof. Observe first that the fundamental matrix satisfies the matrix ODE
X′ = AX.
Indeed, this identity holds for any column x_k of X, whence it follows for the whole matrix.
Differentiating (3.46) in t and using the product rule, we obtain
x′ = X′ (t) ∫ X^{−1} (t) B (t) dt + X (t) (X^{−1} (t) B (t))
 = AX ∫ X^{−1} B (t) dt + B (t)
 = Ax + B (t).
Hence, x (t) solves (3.45). Let us show that (3.46) gives all the solutions. Note that the
integral in (3.46) is indefinite, so that it can be presented in the form
∫ X^{−1} (t) B (t) dt = V (t) + C,
where V (t) is a vector function and C = (C_1, ..., C_n) is an arbitrary constant vector.
Hence, (3.46) gives
where x_0 (t) = X (t) V (t) is a solution of (3.45). By Theorem 3.7 we conclude that x (t)
is indeed the general solution.
Second proof. Let us show a different way of deriving (3.46) that is convenient
in practical applications and also explains the term “variation of parameters”. Let us
look for a solution to (3.45) in the form
If C (t) denotes the column-vector with components C_1 (t), ..., C_n (t) then (3.48) can be
written in the form
X C′ = B,
whence
C′ = X^{−1} B,
C (t) = ∫ X^{−1} (t) B (t) dt,
and
x (t) = X C = X (t) ∫ X^{−1} (t) B (t) dt.
The term “variation of parameters” comes from the identity (3.47). Indeed, if C_1, ..., C_n
are constant parameters then this identity determines the general solution of the homogeneous
ODE x′ = Ax. By allowing C_1, ..., C_n to be variable, we obtain the general solution
to x′ = Ax + B.
Example. Consider the system
{ x_1′ = −x_2,  x_2′ = x_1 }
or, in the vector form,
x′ = ( 0 −1; 1 0 ) x.
It is easy to see that this system has the two independent solutions
x_1 (t) = (cos t, sin t)^T and x_2 (t) = (− sin t, cos t)^T,
and
X^{−1} = ( cos t  sin t; − sin t  cos t ).
Consider now the ODE
x′ = A (t) x + B (t),
where B (t) = (b_1 (t), b_2 (t))^T. By (3.46), we obtain the general solution
x = ( cos t  − sin t; sin t  cos t ) ∫ ( cos t  sin t; − sin t  cos t ) ( b_1 (t); b_2 (t) ) dt
 = ( cos t  − sin t; sin t  cos t ) ∫ ( b_1 (t) cos t + b_2 (t) sin t; −b_1 (t) sin t + b_2 (t) cos t ) dt.
Consider the particular example B (t) = (1, −t)^T. Then the integral is
∫ ( cos t − t sin t; − sin t − t cos t ) dt = ( t cos t + C_1; −t sin t + C_2 ),
whence
x = ( cos t  − sin t; sin t  cos t ) ( t cos t + C_1; −t sin t + C_2 )
 = ( C_1 cos t − C_2 sin t + t; C_1 sin t + C_2 cos t )
 = ( t; 0 ) + C_1 ( cos t; sin t ) + C_2 ( − sin t; cos t ).
where a_k (t) and f (t) are continuous functions on some interval I. Recall that it can be
reduced to the vector ODE
x′ = A (t) x + B (t),
where
x (t) = ( x (t); x′ (t); ...; x^{(n−1)} (t) )
and
A = (  0    1    0   ...  0
       0    0    1   ...  0
      ...  ...  ...  ... ...
       0    0    0   ...  1
      −a_n −a_{n−1} −a_{n−2} ... −a_1 )  and  B = ( 0; 0; ...; f ).
If x_1, ..., x_n are n linearly independent solutions to the homogeneous ODE
x^{(n)} + a_1 (t) x^{(n−1)} + ... + a_n (t) x = 0,
then, denoting by x_1, ..., x_n the corresponding vector solutions, we obtain the fundamental
matrix
X = (x_1 | x_2 | ... | x_n) = ( x_1   x_2   ...  x_n
                               x_1′  x_2′  ...  x_n′
                               ...   ...   ...  ...
                               x_1^{(n−1)}  x_2^{(n−1)}  ...  x_n^{(n−1)} ).
We need to multiply X^{−1} by B. Denote by y_{ik} the element of X^{−1} at position (i, k),
where i is the row index and k is the column index. Denote also by y_k the k-th column
of X^{−1}, that is, y_k = (y_{1k}, ..., y_{nk})^T. Then
X^{−1} B = ( y_{11} ... y_{1n}; ...; y_{n1} ... y_{nn} ) ( 0; ...; f ) = ( y_{1n} f; ...; y_{nn} f ) = f y_n,
and the general vector solution is
x = X (t) ∫ f (t) y_n (t) dt.
We need the function x (t), which is the first component of x. Therefore, we need only to
take the first row of X and multiply it by the column vector ∫ f (t) y_n (t) dt, whence
x (t) = Σ_{j=1}^n x_j (t) ∫ f (t) y_{jn} (t) dt.
is given by
x (t) = Σ_{j=1}^n x_j (t) ∫ f (t) y_{jn} (t) dt.   (3.49)
Let us show how one can use the method of variation of parameters directly, without
using the formula (3.49). Consider the ODE
where C_1 and C_2 are constant parameters. Let us look for the solution of (3.50) in the
form
x (t) = C_1 (t) cos t + C_2 (t) sin t,   (3.53)
which is obtained from (3.52) by replacing the constants by functions (hence the name
of the method, “variation of parameters”). To obtain the equations for the unknown
functions C_1 (t), C_2 (t), differentiate (3.53):
The first equation for C_1, C_2 comes from the requirement that the second line here (that
is, the sum of the terms with C_1′ and C_2′) must vanish, that is,
The motivation for this choice is as follows. Switching to the normal system, one must
have the identity
x (t) = C_1 (t) x_1 (t) + C_2 (t) x_2 (t),
which componentwise is
Differentiating the first line and subtracting the second line, we obtain (3.55).
It follows from (3.54) and (3.55) that
whence
x″ + x = −C_1′ sin t + C_2′ cos t
(note that the terms with C_1 and C_2 cancel out and that this will always be the case
provided all computations are done correctly). Hence, the second equation for C_1′ and C_2′
is
−C_1′ sin t + C_2′ cos t = f (t).
Solving the system of linear algebraic equations
{ C_1′ cos t + C_2′ sin t = 0,
  −C_1′ sin t + C_2′ cos t = f (t), }
we obtain
C_1′ = −f (t) sin t, C_2′ = f (t) cos t,
whence
C_1 = −∫ f (t) sin t dt, C_2 = ∫ f (t) cos t dt
and
x (t) = − cos t ∫ f (t) sin t dt + sin t ∫ f (t) cos t dt.
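As an illustration (a sketch of mine; sympy assumed), applying this formula with the sample right hand side f (t) = t gives x (t) = t, which indeed solves x″ + x = t:

```python
# Sketch: the variation-of-parameters formula for x'' + x = f(t), with f(t) = t.
import sympy as sp

t = sp.symbols('t')
f = t
x = -sp.cos(t)*sp.integrate(f*sp.sin(t), t) + sp.sin(t)*sp.integrate(f*sp.cos(t), t)
print(sp.simplify(x))                      # t
print(sp.simplify(x.diff(t, 2) + x - f))   # 0
```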
Lemma 3.11 (a) Let x_1, ..., x_n be a sequence of R^n-valued functions that solve a linear
system x′ = A (t) x, and let W (t) be their Wronskian. Then either W (t) ≡ 0 for all
t ∈ I and the functions x_1, ..., x_n are linearly dependent, or W (t) ≠ 0 for all t ∈ I and the
functions x_1, ..., x_n are linearly independent.
(b) Let x_1, ..., x_n be a sequence of real-valued functions that solve a linear ODE
x^{(n)} + a_1 (t) x^{(n−1)} + ... + a_n (t) x = 0,
and let W (t) be their Wronskian. Then either W (t) ≡ 0 for all t ∈ I and the functions
x_1, ..., x_n are linearly dependent, or W (t) ≠ 0 for all t ∈ I and the functions x_1, ..., x_n are
linearly independent.
Proof. (a) Indeed, if the functions x1 , ..., xn are linearly independent then, by Lemma
3.9, the vectors x1 (t) , ..., xn (t) are linearly independent for any value of t, which im-
plies W (t) 6= 0. If the functions x1 , ..., xn are linearly dependent then also the vectors
x1 (t) , ..., xn (t) are linearly dependent for any t, whence W (t) ´ 0.
(b) Define the vector function
x_k = ( x_k; x_k′; ...; x_k^{(n−1)} ),
so that x_1, ..., x_n is the sequence of vector functions that solve the vector ODE x′ = A (t) x.
The Wronskian of x_1, ..., x_n is obviously the same as the Wronskian of x_1, ..., x_n, and the
sequence x_1, ..., x_n is linearly independent if and only if so is x_1, ..., x_n. Hence, the rest
follows from part (a).
Theorem 3.12 (The Liouville formula) Let {x_i}_{i=1}^n be a sequence of n solutions of the
ODE x′ = A (t) x, where A : I → R^{n×n} is continuous. Then the Wronskian W (t) of this
sequence satisfies the identity
W (t) = W (t_0) exp ( ∫_{t_0}^t trace A (τ) dτ ),   (3.56)
for all t, t_0 ∈ I.
Recall that the trace (Spur) trace A of the matrix A is the sum of all the diagonal
entries of the matrix.
Proof. Let the entries of the matrix (x_1 | x_2 | ... | x_n) be x_{ij}, where i is the row index and
j is the column index; in particular, the components of the vector x_j are x_{1j}, x_{2j}, ..., x_{nj}.
Denote by r_i the i-th row of this matrix, that is, r_i = (x_{i1}, x_{i2}, ..., x_{in}); then
W = det ( r_1; r_2; ...; r_n ).
We use the following formula for differentiation of the determinant, which follows from
the full expansion of the determinant and the product rule:
W′ (t) = det ( r_1′; r_2; ...; r_n ) + det ( r_1; r_2′; ...; r_n ) + ... + det ( r_1; r_2; ...; r_n′ ).   (3.57)
Indeed, if f_1 (t), ..., f_n (t) are real-valued differentiable functions then the product rule
implies by induction
(f_1 f_2 ⋯ f_n)′ = f_1′ f_2 ⋯ f_n + f_1 f_2′ ⋯ f_n + ... + f_1 f_2 ⋯ f_n′.
Hence, when differentiating the full expansion of the determinant, each term of the determinant
gives rise to n terms where one of the multiples is replaced by its derivative.
Combining properly all such terms, we obtain that the derivative of the determinant is
the sum of n determinants where one of the rows is replaced by its derivative, that is,
(3.57).
The fact that each vector x_j satisfies the equation x_j′ = A x_j can be written in the
coordinate form as follows:
x_{ij}′ = Σ_{k=1}^n A_{ik} x_{kj}.   (3.58)
For any fixed i, the sequence {x_{ij}}_{j=1}^n is nothing other than the components of the row
r_i. Since the coefficients A_{ik} do not depend on j, (3.58) implies the same identity for the
rows:
r_i′ = Σ_{k=1}^n A_{ik} r_k.
That is, the derivative r_i′ of the i-th row is a linear combination of all the rows r_k. For example,
r_1′ = A_{11} r_1 + A_{12} r_2 + ... + A_{1n} r_n,
which implies that
det ( r_1′; r_2; ...; r_n ) = A_{11} det ( r_1; r_2; ...; r_n ) + A_{12} det ( r_2; r_2; ...; r_n ) + ... + A_{1n} det ( r_n; r_2; ...; r_n ).
All the determinants except for the first one vanish since they have two equal rows. Hence,
det ( r_1′; r_2; ...; r_n ) = A_{11} det ( r_1; r_2; ...; r_n ) = A_{11} W (t).
Evaluating similarly the other terms in (3.57), we obtain
W′ (t) = (A_{11} + A_{22} + ... + A_{nn}) W (t) = (trace A) W (t).
By Lemma 3.11, W (t) is either identically 0 or never zero. In the first case there is nothing
to prove. In the second case, we can solve the above ODE by the method of separation
of variables. Indeed, dividing it by W (t) and integrating in t, we obtain
ln ( W (t) / W (t_0) ) = ∫_{t_0}^t trace A (τ) dτ
(note that W (t) and W (t_0) have the same sign, so that the argument of ln is positive),
whence (3.56) follows.
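A quick numerical illustration of (3.56) (my own sketch, numpy/scipy assumed; a constant matrix A is chosen so that the columns of e^{At} give n independent solutions with W (t_0) = 1 at t_0 = 0):

```python
# Sketch: for x' = Ax with constant A, the Wronskian of the columns of e^{At}
# should equal exp(trace(A) * t).
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 2.0], [0.0, 3.0]])
t = 0.7
W_t = np.linalg.det(expm(A*t))        # Wronskian at time t (t0 = 0, W(t0) = 1)
print(W_t, np.exp(np.trace(A)*t))     # both equal exp(2.8), about 16.44
```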
Corollary. Consider a scalar ODE
x^{(n)} + a_1 (t) x^{(n−1)} + ... + a_n (t) x = 0,
where the a_k (t) are continuous functions on an interval I ⊂ R. If x_1 (t), ..., x_n (t) are n
solutions to this equation then their Wronskian W (t) satisfies the identity
W (t) = W (t_0) exp ( − ∫_{t_0}^t a_1 (τ) dτ ).   (3.59)
Since the Wronskian of the normal system coincides with W (t), (3.59) follows from (3.56)
because trace A = −a_1.
In the case of the ODE of the 2nd order
$$x'' + a_1(t)\,x' + a_2(t)\,x = 0,$$
the Liouville formula can help in finding the general solution if a particular solution is known. Indeed, if $x_0(t)$ is a particular non-zero solution and $x(t)$ is any other solution then we have by (3.59)
$$\det\begin{pmatrix} x_0 & x\\ x_0' & x'\end{pmatrix} = C\exp\left(-\int a_1(t)\,dt\right),$$
that is,
$$x_0 x' - x x_0' = C\exp\left(-\int a_1(t)\,dt\right).$$
${}^{9}$ To evaluate the integral $\int\frac{dt}{\tan^2 t} = \int\cot^2 t\,dt$, use the identity
$$(\cot t)' = -\cot^2 t - 1,$$
which yields
$$\int\cot^2 t\,dt = -t - \cot t + C.$$
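Dividing the last identity of the main text by $x_0^2$ gives $(x/x_0)' = C\,e^{-\int a_1\,dt}/x_0^2$, so a second solution is recovered by one more integration. Below is a small sympy sketch of this recipe (illustrative only; the test equation and the known solution $x_0 = t$ are assumptions of the example, not taken from the notes):

```python
import sympy as sp

t, C = sp.symbols('t C', positive=True)

# Hypothetical test equation: x'' - (2/t) x' + (2/t**2) x = 0, known solution x0 = t
a1 = -2/t
x0 = t

# From x0*x' - x*x0' = C*exp(-Integral(a1)):  (x/x0)' = C*exp(-Integral(a1)) / x0**2
weight = sp.simplify(sp.exp(-sp.integrate(a1, t)))    # here equals t**2
x = sp.simplify(x0 * sp.integrate(C * weight / x0**2, t))
print(x)                                              # C*t**2: a second, independent solution

# Verify that it indeed solves the equation
residual = x.diff(t, 2) - (2/t)*x.diff(t) + (2/t**2)*x
print(sp.simplify(residual))                          # 0
```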
3.8 Linear homogeneous systems with constant coefficients
Here we will be concerned with finding the general solution to linear systems of the form $x' = Ax$, where $A \in \mathbb{C}^{n\times n}$ is a constant $n\times n$ matrix with complex entries and $x(t)$ is a function from $\mathbb{R}$ to $\mathbb{C}^n$. As we know, it suffices to find $n$ linearly independent solutions and then take their linear combination. We start with a simple observation. Let us try to find a solution in the form $x = e^{\lambda t}v$ where $v$ is a non-zero vector in $\mathbb{C}^n$ that does not depend on $t$. Then the equation $x' = Ax$ becomes
$$\lambda e^{\lambda t} v = e^{\lambda t} A v,$$
that is, $Av = \lambda v$. Recall that any non-zero vector $v$ that satisfies the identity $Av = \lambda v$ for some constant $\lambda$ is called an eigenvector of $A$, and $\lambda$ is called the eigenvalue. Hence, the function $x(t) = e^{\lambda t}v$ is a non-trivial solution to $x' = Ax$ provided $v$ is an eigenvector of $A$ and $\lambda$ is the corresponding eigenvalue.
The fact that $\lambda$ is an eigenvalue means that the matrix $A - \lambda\,\mathrm{id}$ is not invertible, that is,
$$\det(A - \lambda\,\mathrm{id}) = 0. \qquad (3.61)$$
This equation is called the characteristic equation of the matrix $A$ and can be used to determine the eigenvalues. Then the eigenvector is determined from the equation
$$(A - \lambda\,\mathrm{id})\,v = 0. \qquad (3.62)$$
Note that the eigenvector is not unique; for example, if $v$ is an eigenvector then $cv$ is also an eigenvector for any non-zero constant $c$.
The function
$$P(\lambda) := \det(A - \lambda\,\mathrm{id})$$
is clearly a polynomial of $\lambda$ of order $n$. It is called the characteristic polynomial of the matrix $A$. Hence, the eigenvalues of $A$ are the roots of the characteristic polynomial $P(\lambda)$.
Since the vectors $v_1$ and $v_2$ are independent, we obtain the general solution in the form
$$x(t) = C_1 e^{t}\begin{pmatrix}1\\1\end{pmatrix} + C_2 e^{-t}\begin{pmatrix}1\\-1\end{pmatrix} = \begin{pmatrix}C_1 e^{t} + C_2 e^{-t}\\ C_1 e^{t} - C_2 e^{-t}\end{pmatrix}.$$
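For completeness, here is a minimal numpy sketch of the eigenvalue method just described (not from the notes; the matrix $A$ below is an arbitrary illustration):

```python
import numpy as np

# Arbitrary example matrix (illustration only)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues and eigenvectors: the columns of V are eigenvectors of A
lam, V = np.linalg.eig(A)

# Each pair (lambda_k, v_k) yields the solution x_k(t) = exp(lambda_k t) v_k
def solution(t, k):
    return np.exp(lam[k] * t) * V[:, k]

# Check the ODE x' = A x numerically for the first solution
t, h = 0.7, 1e-6
numeric_derivative = (solution(t + h, 0) - solution(t - h, 0)) / (2 * h)
print(np.allclose(numeric_derivative, A @ solution(t, 0), atol=1e-4))   # True
```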
The matrix of the system is
$$A = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix},$$
and the characteristic polynomial is
$$P(\lambda) = \det\begin{pmatrix}-\lambda & -1\\ 1 & -\lambda\end{pmatrix} = \lambda^2 + 1.$$
Hence, the characteristic equation is $\lambda^2 + 1 = 0$, whence $\lambda_1 = i$ and $\lambda_2 = -i$. For $\lambda = \lambda_1 = i$ we obtain the equation for the eigenvector $v = \binom{a}{b}$:
$$\begin{pmatrix}-i & -1\\ 1 & -i\end{pmatrix}\begin{pmatrix}a\\ b\end{pmatrix} = 0,$$
which amounts to the single equation $ia + b = 0$. Choosing $a = i$, we obtain $b = 1$, whence
$$v_1 = \begin{pmatrix}i\\ 1\end{pmatrix},$$
and the corresponding solution of the ODE is
$$x_1(t) = e^{it}\begin{pmatrix}i\\ 1\end{pmatrix} = \begin{pmatrix}-\sin t + i\cos t\\ \cos t + i\sin t\end{pmatrix}.$$
Since this solution is complex, we obtain the general real solution using the second claim of Lemma 3.13:
$$x(t) = C_1\operatorname{Re}x_1 + C_2\operatorname{Im}x_1 = C_1\begin{pmatrix}-\sin t\\ \cos t\end{pmatrix} + C_2\begin{pmatrix}\cos t\\ \sin t\end{pmatrix} = \begin{pmatrix}-C_1\sin t + C_2\cos t\\ C_1\cos t + C_2\sin t\end{pmatrix}.$$
3.8.1 Functions of operators and matrices
Recall that a scalar ODE $x' = Ax$ has the solution $x(t) = Ce^{At}$. Now, if $A$ is an $n\times n$ matrix, we may be able to use this formula if we define what $e^{At}$ is. It suffices to define what $e^A$ is for any matrix $A$. It is convenient to do this for linear operators acting in $\mathbb{C}^n$.
Recall that a linear operator in $\mathbb{C}^n$ is a mapping $A : \mathbb{C}^n \to \mathbb{C}^n$ such that, for all $x, y \in \mathbb{C}^n$ and $\lambda \in \mathbb{C}$,
$$A(x + y) = Ax + Ay, \qquad A(\lambda x) = \lambda Ax.$$
Denote by $L(\mathbb{C}^n)$ the set of all linear operators in $\mathbb{C}^n$, with addition and scalar multiplication defined pointwise: $(A + B)x = Ax + Bx$ and $(\lambda A)x = \lambda(Ax)$ for all $x \in \mathbb{C}^n$. With these operations, $L(\mathbb{C}^n)$ is a linear space over $\mathbb{C}$. Since any operator can be identified with an $n\times n$ matrix, the dimension of the linear space $L(\mathbb{C}^n)$ is $n^2$.
Apart from the linear structure, the product $AB$ of operators is defined in $L(\mathbb{C}^n)$ as composition, that is,
$$(AB)\,x = A(Bx).$$
Fix a norm $\|\cdot\|$ in $\mathbb{C}^n$, for example, the $1$-norm
$$\|x\|_1 = \sum_{k=1}^{n}|x_k|,$$
where $x_1, \dots, x_n$ are the components of the vector $x$. Define the associated operator norm in $L(\mathbb{C}^n)$ by
$$\|A\| = \sup_{x\in\mathbb{C}^n\setminus\{0\}}\frac{\|Ax\|}{\|x\|}. \qquad (3.64)$$
2. Clearly, $\|A\| \ge 0$. Let us show that $\|A\| > 0$ if $A \ne 0$. Indeed, if $A \ne 0$ then there is $x \in \mathbb{C}^n$ such that $Ax \ne 0$ and $\|Ax\| > 0$, whence
$$\|A\| \ge \frac{\|Ax\|}{\|x\|} > 0.$$
3. The triangle inequality holds:
$$\|A + B\| = \sup_{x}\frac{\|(A+B)x\|}{\|x\|} \le \sup_{x}\frac{\|Ax\| + \|Bx\|}{\|x\|} \le \sup_{x}\frac{\|Ax\|}{\|x\|} + \sup_{x}\frac{\|Bx\|}{\|x\|} = \|A\| + \|B\|.$$
4. Let us prove the scaling property: $\|\lambda A\| = |\lambda|\,\|A\|$ for any $\lambda\in\mathbb{C}$. Indeed, by (3.64),
$$\|\lambda A\| = \sup_{x}\frac{\|\lambda Ax\|}{\|x\|} = |\lambda|\,\sup_{x}\frac{\|Ax\|}{\|x\|} = |\lambda|\,\|A\|.$$
In addition to the general properties of a norm, the operator norm satisfies the inequality
$$\|AB\| \le \|A\|\,\|B\|. \qquad (3.65)$$
Indeed, it follows from (3.64) that $\|Ax\| \le \|A\|\,\|x\|$, whence
$$\|(AB)x\| = \|A(Bx)\| \le \|A\|\,\|Bx\| \le \|A\|\,\|B\|\,\|x\|,$$
and dividing by $\|x\|$ and taking the supremum over $x \ne 0$ yields (3.65).
We say that a sequence $\{A_k\}$ of operators converges to an operator $A$ if
$$\|A_k - A\| \to 0 \text{ as } k\to\infty.$$
Representing an operator $A$ as a matrix $(A_{ij})_{i,j=1}^n$, one can consider the $\infty$-norm on operators defined by
$$\|A\|_\infty = \max_{1\le i,j\le n}|A_{ij}|.$$
Clearly, the convergence in the $\infty$-norm is equivalent to the convergence of each component $A_{ij}$ separately. Since all norms in $L(\mathbb{C}^n)$ are equivalent, we see that convergence of a sequence of operators in any norm is equivalent to the convergence of the individual components of the operators.
Given a series $\sum_{k=1}^{\infty}A_k$ of operators, the sum of the series is defined as the limit of the sequence of partial sums $\sum_{k=1}^{N}A_k$ as $N\to\infty$. That is, $S = \sum_{k=1}^{\infty}A_k$ if
$$\left\|S - \sum_{k=1}^{N}A_k\right\| \to 0 \text{ as } N\to\infty.$$
Claim. Assume that
$$\sum_{k=1}^{\infty}\|A_k\| < \infty. \qquad (3.66)$$
Then the series $\sum_{k=1}^{\infty}A_k$ converges.
Proof. Indeed, since all norms in $L(\mathbb{C}^n)$ are equivalent, we can assume that the norm in (3.66) is the $\infty$-norm. Denoting by $(A_k)_{ij}$ the $ij$-components of the matrix $A_k$, we obtain that the condition (3.66) is equivalent to
$$\sum_{k=1}^{\infty}\left|(A_k)_{ij}\right| < \infty \qquad (3.67)$$
for any indices $1\le i,j\le n$. Then (3.67) implies that the numerical series
$$\sum_{k=1}^{\infty}(A_k)_{ij}$$
converges, which implies that the operator series $\sum_{k=1}^{\infty}A_k$ also converges.
If the condition (3.66) is satisfied then the series $\sum_{k=1}^{\infty}A_k$ is called absolutely convergent. Hence, the above Claim means that absolute convergence of an operator series implies the usual convergence.
Definition. If $A \in L(\mathbb{C}^n)$ then define $e^A \in L(\mathbb{C}^n)$ by means of the identity
$$e^A = \mathrm{id} + A + \frac{A^2}{2!} + \dots + \frac{A^k}{k!} + \dots = \sum_{k=0}^{\infty}\frac{A^k}{k!}. \qquad (3.68)$$
Lemma 3.14 The exponential series (3.68) converges for any $A \in L(\mathbb{C}^n)$.
Proof. It suffices to show that the series converges absolutely, that is,
$$\sum_{k=0}^{\infty}\left\|\frac{A^k}{k!}\right\| < \infty.$$
It follows from (3.65) that $\|A^k\| \le \|A\|^k$, whence
$$\sum_{k=0}^{\infty}\left\|\frac{A^k}{k!}\right\| \le \sum_{k=0}^{\infty}\frac{\|A\|^k}{k!} = e^{\|A\|} < \infty,$$
which finishes the proof.
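As a sanity check of the definition (3.68) (an illustration, not part of the notes; the matrix is an arbitrary choice), the partial sums of the exponential series can be compared with scipy's built-in matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

# Arbitrary test matrix (illustration only)
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])

# Partial sums of the exponential series id + A + A^2/2! + ...
S = np.eye(2)
term = np.eye(2)
for k in range(1, 30):
    term = term @ A / k            # term = A^k / k!
    S = S + term

print(np.allclose(S, expm(A)))     # True: the truncated series matches expm(A)
```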
Theorem 3.15 For any $A \in L(\mathbb{C}^n)$ the function $F(t) = e^{tA}$ satisfies the ODE $F' = AF$. Consequently, the general solution of the ODE $x' = Ax$ is given by $x = e^{tA}v$ where $v \in \mathbb{C}^n$ is an arbitrary vector.
Here $x = x(t)$ is as usual a $\mathbb{C}^n$-valued function on $\mathbb{R}$, while $F(t)$ is an $L(\mathbb{C}^n)$-valued function on $\mathbb{R}$. Since $L(\mathbb{C}^n)$ is linearly isomorphic to $\mathbb{C}^{n^2}$, we can also say that $F(t)$ is a $\mathbb{C}^{n^2}$-valued function on $\mathbb{R}$, which allows us to understand the ODE $F' = AF$ in the same sense as a general vector ODE. The novelty here is that we regard $A \in L(\mathbb{C}^n)$ as an operator acting on $L(\mathbb{C}^n)$ (that is, an element of $L(L(\mathbb{C}^n))$) by means of the operator multiplication.
Proof. We have by definition
$$F(t) = e^{tA} = \sum_{k=0}^{\infty}\frac{t^k A^k}{k!}.$$
It is easy to see (in the same way as in Lemma 3.14) that this series converges locally uniformly in $t$, which implies that $F$ is differentiable in $t$ and that its derivative can be computed term by term:
$$F'(t) = \sum_{k=1}^{\infty}\frac{k\,t^{k-1}A^k}{k!} = A\sum_{k=1}^{\infty}\frac{t^{k-1}A^{k-1}}{(k-1)!} = A\,e^{tA}.$$
It follows that $F' = AF$.
For the function $x(t) = e^{tA}v$, we have
$$x' = \left(e^{tA}v\right)' = \left(e^{tA}\right)'v = A e^{tA}v = Ax \qquad\text{and}\qquad e^{tA}v\big|_{t=0} = \mathrm{id}\,v = v,$$
so that $x(t) = e^{tA}v$ solves the IVP $x' = Ax$, $x(0) = v$. Conversely, if $x(t)$ is any solution of $x' = Ax$ and $v = x(0)$, then both $x(t)$ and $e^{tA}v$ solve the same initial value problem, whence the identity $x(t) = e^{tA}v$ follows by the uniqueness theorem.
Remark. If $v_1, \dots, v_n$ are linearly independent vectors in $\mathbb{C}^n$ then the solutions $e^{tA}v_1, \dots, e^{tA}v_n$ are also linearly independent and, hence, can be used to form a fundamental matrix. In particular, choosing $v_1, \dots, v_n$ to be the canonical basis in $\mathbb{C}^n$, we obtain that $e^{tA}v_k$ is the $k$-th column of the matrix $e^{tA}$. Hence, the matrix $e^{tA}$ is itself a fundamental matrix of the system $x' = Ax$.
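In computations this fact can be used directly. Here is a minimal sketch (illustrative; the matrix and the initial vector are arbitrary choices, not from the notes) comparing $x(t) = e^{tA}v$ with a generic numerical solver:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# Arbitrary illustrative system x' = A x with initial value v
A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])
v = np.array([1.0, 2.0])
t1 = 3.0

# Solution via the matrix exponential: x(t) = e^{tA} v
x_expm = expm(t1 * A) @ v

# Reference solution via a general-purpose ODE solver
sol = solve_ivp(lambda t, x: A @ x, (0.0, t1), v, rtol=1e-10, atol=1e-12)
print(np.allclose(x_expm, sol.y[:, -1]))     # True
```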
Example. Let $A$ be the diagonal matrix
$$A = \operatorname{diag}(\lambda_1, \dots, \lambda_n).$$
Then
$$A^k = \operatorname{diag}\!\left(\lambda_1^k, \dots, \lambda_n^k\right)$$
and
$$e^{tA} = \operatorname{diag}\!\left(e^{\lambda_1 t}, \dots, e^{\lambda_n t}\right).$$
Let now
$$A = \begin{pmatrix}0 & 1\\ 0 & 0\end{pmatrix}.$$
Then $A^2 = 0$ and all higher powers of $A$ are also $0$, and we obtain
$$e^{tA} = \mathrm{id} + tA = \begin{pmatrix}1 & t\\ 0 & 1\end{pmatrix}.$$
Note that, in general,
$$(A + B)^2 = (A + B)(A + B) = A^2 + AB + BA + B^2,$$
which equals $A^2 + 2AB + B^2$ only when $A$ and $B$ commute. Lemma 3.16 asserts that, for commuting operators $A$ and $B$,
$$e^{A+B} = e^A e^B.$$
Claim 3. If $A(t)$ and $B(t)$ are differentiable functions from $\mathbb{R}$ to $L(\mathbb{C}^n)$ then so is their product, and
$$\left(A(t)B(t)\right)' = A'(t)B(t) + A(t)B'(t), \qquad (3.70)$$
which is proved componentwise exactly as the product rule for scalar functions.
Now we can finish the proof of the lemma. Consider the function $F : \mathbb{R}\to L(\mathbb{C}^n)$ defined by
$$F(t) = e^{tA}e^{tB}.$$
Differentiating it using Theorem 3.15 and Claims 2 and 3, we obtain
$$F'(t) = \left(e^{tA}\right)'e^{tB} + e^{tA}\left(e^{tB}\right)' = Ae^{tA}e^{tB} + e^{tA}Be^{tB} = Ae^{tA}e^{tB} + Be^{tA}e^{tB} = (A + B)\,F(t),$$
where in the third equality we used that $B$ commutes with $e^{tA}$. On the other hand, by Theorem 3.15, the function $G(t) = e^{t(A+B)}$ satisfies the same equation
$$G' = (A + B)\,G.$$
Since $G(0) = F(0) = \mathrm{id}$, we obtain that the functions $F(t)$ and $G(t)$ solve the same IVP, whence by the uniqueness theorem they are identically equal. In particular, $F(1) = G(1)$, which means $e^A e^B = e^{A+B}$.
Alternative proof. Let us briefly discuss a direct algebraic proof of $e^{A+B} = e^A e^B$. One first proves the binomial formula
$$(A + B)^n = \sum_{k=0}^{n}\binom{n}{k}A^k B^{n-k}$$
using the fact that $A$ and $B$ commute (this can be done by induction in the same way as for numbers). Then we have
$$e^{A+B} = \sum_{n=0}^{\infty}\frac{(A+B)^n}{n!} = \sum_{n=0}^{\infty}\sum_{k=0}^{n}\frac{A^k B^{n-k}}{k!\,(n-k)!} = \left(\sum_{k=0}^{\infty}\frac{A^k}{k!}\right)\left(\sum_{m=0}^{\infty}\frac{B^m}{m!}\right) = e^A e^B.$$
Of course, one needs to justify the Cauchy product formula for absolutely convergent series of operators.
3.8.2 Jordan cells
Recall that a Jordan cell is an $n\times n$ matrix of the form
$$A = \begin{pmatrix}
\lambda & 1 & & 0\\
 & \lambda & \ddots & \\
 & & \ddots & 1\\
0 & & & \lambda
\end{pmatrix}. \qquad (3.71)$$
Here all the entries on the main diagonal are $\lambda$ and all the entries just above the main diagonal are $1$ (and all other entries are $0$). Let us use Lemma 3.16 in order to evaluate $e^{tA}$ where $A$ is a Jordan cell. Clearly, we have $A = \lambda\,\mathrm{id} + N$ where
$$N = \begin{pmatrix}
0 & 1 & & 0\\
 & 0 & \ddots & \\
 & & \ddots & 1\\
0 & & & 0
\end{pmatrix}. \qquad (3.72)$$
A matrix of the form (3.72) is called a nilpotent Jordan cell. Since the matrices $\lambda\,\mathrm{id}$ and $N$ commute (because $\mathrm{id}$ commutes with anything), Lemma 3.16 yields
$$e^{tA} = e^{t\lambda\,\mathrm{id}}\,e^{tN} = e^{\lambda t}\,e^{tN}. \qquad (3.73)$$
Hence, we need to evaluate $e^{tN}$, and for that we first evaluate the powers $N^2, N^3$, etc. Observe that the components of the matrix $N$ are as follows:
$$N_{ij} = \begin{cases}1, & \text{if } j = i + 1,\\ 0, & \text{otherwise,}\end{cases}$$
where $i$ is the row index and $j$ is the column index. It follows that
$$\left(N^2\right)_{ij} = \sum_{k=1}^{n}N_{ik}N_{kj} = \begin{cases}1, & \text{if } j = i + 2,\\ 0, & \text{otherwise,}\end{cases}$$
that is,
$$N^2 = \begin{pmatrix}
0 & 0 & 1 & & 0\\
 & \ddots & \ddots & \ddots & \\
 & & \ddots & \ddots & 1\\
 & & & \ddots & 0\\
0 & & & & 0
\end{pmatrix}.$$
Here the entries with value 1 are located on the diagonal that is two positions above the
main diagonal. Similarly, we obtain
$$N^k = \begin{pmatrix}
0 & \cdots & 0 & 1 & & 0\\
 & \ddots & & & \ddots & \\
 & & \ddots & & & 1\\
 & & & \ddots & & 0\\
 & & & & \ddots & \vdots\\
0 & & & & & 0
\end{pmatrix},$$
where the entries with value $1$ are located on the diagonal that is $k$ positions above the main diagonal, provided $k < n$, and $N^k = 0$ if $k \ge n$.
Any matrix $A$ with the property that $A^k = 0$ for some natural $k$ is called nilpotent. Hence, $N$ is a nilpotent matrix, which explains the term “nilpotent Jordan cell”. It follows that
$$e^{tN} = \mathrm{id} + \frac{t}{1!}N + \frac{t^2}{2!}N^2 + \dots + \frac{t^{n-1}}{(n-1)!}N^{n-1}
= \begin{pmatrix}
1 & \frac{t}{1!} & \frac{t^2}{2!} & \cdots & \frac{t^{n-1}}{(n-1)!}\\
0 & 1 & \frac{t}{1!} & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & \frac{t^2}{2!}\\
 & & & \ddots & \frac{t}{1!}\\
0 & \cdots & \cdots & 0 & 1
\end{pmatrix}. \qquad (3.74)$$
Combining with (3.73), we obtain the following statement.
Lemma 3.17 If $A$ is a Jordan cell (3.71) then, for any $t \in \mathbb{R}$,
$$e^{tA} = \begin{pmatrix}
e^{\lambda t} & \frac{t}{1!}e^{\lambda t} & \frac{t^2}{2!}e^{\lambda t} & \cdots & \frac{t^{n-1}}{(n-1)!}e^{\lambda t}\\
0 & e^{\lambda t} & \frac{t}{1!}e^{\lambda t} & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & \frac{t^2}{2!}e^{\lambda t}\\
 & & & \ddots & \frac{t}{1!}e^{\lambda t}\\
0 & \cdots & \cdots & 0 & e^{\lambda t}
\end{pmatrix}. \qquad (3.75)$$
By Theorem 3.15, the general solution of the system $x' = Ax$ is $x(t) = e^{tA}v$ where $v$ is an arbitrary vector from $\mathbb{C}^n$. Setting $v = (C_1, \dots, C_n)$, we obtain that the general solution is
$$x(t) = C_1 x_1 + \dots + C_n x_n,$$
where $x_1, \dots, x_n$ are the columns of the matrix $e^{tA}$ (which form a sequence of $n$ linearly independent solutions). Using (3.75), we obtain
$$x_1(t) = e^{\lambda t}\,(1, 0, \dots, 0),$$
$$x_2(t) = e^{\lambda t}\left(\frac{t}{1!}, 1, 0, \dots, 0\right),$$
$$x_3(t) = e^{\lambda t}\left(\frac{t^2}{2!}, \frac{t}{1!}, 1, 0, \dots, 0\right),$$
$$\dots$$
$$x_n(t) = e^{\lambda t}\left(\frac{t^{n-1}}{(n-1)!}, \dots, \frac{t}{1!}, 1\right).$$
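A short symbolic check of (3.73)-(3.75) for a $3\times 3$ Jordan cell (an illustration, not part of the notes):

```python
import sympy as sp

t, lam = sp.symbols('t lam')
n = 3

# Nilpotent Jordan cell N (ones just above the diagonal), cf. (3.72)
N = sp.zeros(n, n)
for i in range(n - 1):
    N[i, i + 1] = 1

# e^{tN} via the finite series (3.74), using N^n = 0
etN = sp.eye(n)
for k in range(1, n):
    etN += (t**k / sp.factorial(k)) * N**k

# e^{tA} = e^{lam t} e^{tN} for the Jordan cell A = lam*id + N, cf. (3.73)
A = lam * sp.eye(n) + N
etA = sp.exp(lam * t) * etN

# Verify that F(t) = e^{tA} satisfies F' = A F and F(0) = id, as in Theorem 3.15
print(sp.simplify(etA.diff(t) - A * etA))    # zero matrix
print(etA.subs(t, 0))                        # identity matrix
```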
That is, the matrix $C$ consists of the two blocks $A$ and $B$ located on the main diagonal, and all other entries are $0$.
Notation for this tensor product: $C = A \otimes B$.
Lemma 3.18 The following identity is true:
$$e^{A\otimes B} = e^A \otimes e^B. \qquad (3.76)$$
Proof. Observe first that, for blocks of matching sizes,
$$(A\otimes B)\left(A'\otimes B'\right) = \left(AA'\right)\otimes\left(BB'\right),$$
which follows easily from the rule of multiplication of matrices. Hence, the tensor product commutes with the matrix multiplication; in particular, $(A\otimes B)^k = A^k\otimes B^k$. It is also obvious that the tensor product commutes with addition of matrices and with taking limits. Therefore, we obtain
$$e^{A\otimes B} = \sum_{k=0}^{\infty}\frac{(A\otimes B)^k}{k!} = \sum_{k=0}^{\infty}\frac{A^k\otimes B^k}{k!} = \left(\sum_{k=0}^{\infty}\frac{A^k}{k!}\right)\otimes\left(\sum_{k=0}^{\infty}\frac{B^k}{k!}\right) = e^A\otimes e^B.$$
Definition. A tensor product of a finite number of Jordan cells is called a Jordan normal form.
That is, a Jordan normal form is a matrix of the form
$$J_1\otimes J_2\otimes\cdots\otimes J_k = \begin{pmatrix}
J_1 & & & 0\\
 & J_2 & & \\
 & & \ddots & \\
0 & & & J_k
\end{pmatrix},$$
where each $J_i$ is a Jordan cell and all entries outside the cells are $0$.
By Lemma 3.17, we obtain
$$e^{tJ_1} = \begin{pmatrix}e^{t} & te^{t}\\ 0 & e^{t}\end{pmatrix}
\qquad\text{and}\qquad
e^{tJ_2} = \begin{pmatrix}e^{2t} & te^{2t}\\ 0 & e^{2t}\end{pmatrix},$$
whence
$$e^{tA} = \begin{pmatrix}
e^{t} & te^{t} & 0 & 0\\
0 & e^{t} & 0 & 0\\
0 & 0 & e^{2t} & te^{2t}\\
0 & 0 & 0 & e^{2t}
\end{pmatrix}.$$
The columns of this matrix form $4$ linearly independent solutions
$$x_1 = \left(e^{t}, 0, 0, 0\right),\quad
x_2 = \left(te^{t}, e^{t}, 0, 0\right),\quad
x_3 = \left(0, 0, e^{2t}, 0\right),\quad
x_4 = \left(0, 0, te^{2t}, e^{2t}\right),$$
and the general solution is
$$x(t) = C_1 x_1 + C_2 x_2 + C_3 x_3 + C_4 x_4 = \left(C_1 e^{t} + C_2 te^{t},\; C_2 e^{t},\; C_3 e^{2t} + C_4 te^{2t},\; C_4 e^{2t}\right).$$
$$(Ax)^b = A^b x^b,$$
which should be true for all $x \in \mathbb{C}^n$, where in the right hand side we have the product of the $n\times n$ matrix $A^b$ and the column-vector $x^b$.
Clearly, $(b_i)^b = (0, \dots, 1, \dots, 0)$ where $1$ is at position $i$, which implies that $(Ab_i)^b = A^b(b_i)^b$ is the $i$-th column of $A^b$. In other words, we have the identity
$$A^b = \left((Ab_1)^b \,\big|\, (Ab_2)^b \,\big|\,\cdots\,\big|\, (Ab_n)^b\right),$$
that is, the $i$-th column of $A^b$ is the column vector $Ab_i$ written in the basis $b_1, \dots, b_n$.
Example. Consider the operator $A$ in $\mathbb{C}^2$ that is given in the canonical basis $e = \{e_1, e_2\}$ by the matrix
$$A^e = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}.$$
Consider another basis $b = \{b_1, b_2\}$ defined by
$$b_1 = e_1 - e_2 = \begin{pmatrix}1\\ -1\end{pmatrix}
\qquad\text{and}\qquad
b_2 = e_1 + e_2 = \begin{pmatrix}1\\ 1\end{pmatrix}.$$
Then
$$(Ab_1)^e = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}1\\ -1\end{pmatrix} = \begin{pmatrix}-1\\ 1\end{pmatrix}
\qquad\text{and}\qquad
(Ab_2)^e = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}\begin{pmatrix}1\\ 1\end{pmatrix} = \begin{pmatrix}1\\ 1\end{pmatrix}.$$
It follows that $Ab_1 = -b_1$ and $Ab_2 = b_2$, whence
$$A^b = \begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix}.$$
and the $k$-th column of $A^b - \lambda\,\mathrm{id}$ is the vector $(A - \lambda\,\mathrm{id})\,b_k$ written in the basis $b$, we conclude that
$$(A - \lambda\,\mathrm{id})\,b_j = 0,$$
$$(A - \lambda\,\mathrm{id})\,b_{j+1} = b_j,$$
$$(A - \lambda\,\mathrm{id})\,b_{j+2} = b_{j+1},$$
$$\cdots$$
$$(A - \lambda\,\mathrm{id})\,b_{j+p-1} = b_{j+p-2}.$$
In particular, $b_j$ is an eigenvector of $A$ with the eigenvalue $\lambda$. The vectors $b_{j+1}, \dots, b_{j+p-1}$ are called the generalized eigenvectors of $A$ (more precisely, $b_{j+1}$ is the 1st generalized eigenvector, $b_{j+2}$ is the 2nd generalized eigenvector, etc.). Hence, any Jordan chain contains exactly one eigenvector, and the remaining vectors are generalized eigenvectors.
Theorem 3.19 Consider the system $x' = Ax$ with a constant linear operator $A$ and let $A^b$ be the Jordan normal form of $A$. Then each Jordan cell $J$ of $A^b$ of dimension $p$ with $\lambda$ on the diagonal gives rise to $p$ linearly independent solutions as follows:
$$x_1(t) = e^{\lambda t}\,v_1,$$
$$x_2(t) = e^{\lambda t}\left(\frac{t}{1!}v_1 + v_2\right),$$
$$x_3(t) = e^{\lambda t}\left(\frac{t^2}{2!}v_1 + \frac{t}{1!}v_2 + v_3\right),$$
$$\dots$$
$$x_p(t) = e^{\lambda t}\left(\frac{t^{p-1}}{(p-1)!}v_1 + \dots + \frac{t}{1!}v_{p-1} + v_p\right),$$
where $\{v_1, \dots, v_p\}$ is the Jordan chain of $J$. The set of all $n$ solutions obtained across all Jordan cells is linearly independent.
Proof. In the Jordan basis $b$, the matrix $e^{tA^b}$ is block-diagonal, the diagonal blocks being the exponentials of the individual Jordan cells; the block corresponding to the given cell is $e^{tJ}$. By Theorem 3.15, the columns of this matrix give $n$ linearly independent solutions to the ODE $x' = Ax$. Out of these solutions, select the $p$ solutions that correspond to the $p$ columns of the cell $e^{tJ}$, that is,
$$x_1(t) = (\dots, \underbrace{e^{\lambda t}, 0, \dots, 0}_{p}, \dots),$$
$$x_2(t) = (\dots, \underbrace{\tfrac{t}{1!}e^{\lambda t}, e^{\lambda t}, 0, \dots, 0}_{p}, \dots),$$
$$\dots$$
$$x_p(t) = (\dots, \underbrace{\tfrac{t^{p-1}}{(p-1)!}e^{\lambda t}, \dots, \tfrac{t}{1!}e^{\lambda t}, e^{\lambda t}}_{p}, \dots),$$
where all the vectors are written in the basis b, the horizontal braces mark the columns of
the cell J, and all the terms outside the horizontal braces are zeros. Representing these
vectors in the coordinateless form via the Jordan chain v1 , ..., vp , we obtain the solutions
as in the statement of Theorem 3.19.
Let $\lambda$ be an eigenvalue of an operator $A$. Denote by $m$ the algebraic multiplicity of $\lambda$, that is, its multiplicity as a root of the characteristic polynomial${}^{11}$ $P(\lambda) = \det(A - \lambda\,\mathrm{id})$. Denote by $g$ the geometric multiplicity of $\lambda$, that is, the dimension of the eigenspace of $\lambda$:
$$g = \dim\ker(A - \lambda\,\mathrm{id}).$$
Then the eigenvalue $\lambda$ gives rise to $m$ linearly independent solutions of the form
$$x(t) = e^{\lambda t}\left(u_1 + u_2 t + \dots + u_s t^{s-1}\right), \qquad (3.78)$$
where $s = m - g + 1$ and $u_j$ are vectors that can be determined by substituting the above function into the equation $x' = Ax$.
The set of all $n$ solutions obtained in this way using all the eigenvalues of $A$ is linearly independent.
Remark. For practical use, one should substitute (3.78) into the system x0 = Ax con-
sidering uij as unknowns (where uij is the i-th component of the vector uj ) and solve the
resulting linear algebraic system with respect to uij . The result will contain m arbitrary
constants, and the solution in the form (3.78) will appear as a linear combination of m
independent solutions.
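The recipe of this Remark can be carried out symbolically. Below is a minimal sketch (illustration only; the matrix is a hypothetical example with a double eigenvalue $\lambda = 2$ of geometric multiplicity $1$, so $s = 2$):

```python
import sympy as sp

t = sp.symbols('t')
lam = 2                                    # eigenvalue with m = 2, g = 1, hence s = 2
A = sp.Matrix([[3, 1], [-1, 1]])           # hypothetical matrix, char. polynomial (lambda - 2)^2

u11, u21, u12, u22 = sp.symbols('u11 u21 u12 u22')
u1 = sp.Matrix([u11, u21]); u2 = sp.Matrix([u12, u22])

# Ansatz (3.78): x(t) = e^{lam t} (u1 + u2 t)
x = sp.exp(lam * t) * (u1 + u2 * t)

# Substitute into x' = A x and require the residual to vanish identically in t
residual = sp.simplify((x.diff(t) - A * x) / sp.exp(lam * t))
eqs = [c for comp in residual for c in sp.Poly(comp, t).all_coeffs()]
sol = sp.solve(eqs, [u11, u21, u12, u22], dict=True)
print(sol)   # two of the u_ij remain free: the m = 2 arbitrary constants of the Remark
```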
Proof. Let $p_1, \dots, p_g$ be the dimensions of all the Jordan cells with the eigenvalue $\lambda$ (as we know, the number of such cells is $g$). Then $\lambda$ occurs $p_1 + \dots + p_g$ times on the diagonal of the Jordan normal form, which implies
$$\sum_{j=1}^{g}p_j = m.$$
${}^{11}$ To compute $P(\lambda)$, one needs to write the operator $A$ in some basis $b$ as a matrix $A^b$ and then evaluate $\det(A^b - \lambda\,\mathrm{id})$. The characteristic polynomial does not depend on the choice of basis $b$. Indeed, if $b'$ is another basis then the relation between the matrices $A^b$ and $A^{b'}$ is given by $A^b = C A^{b'} C^{-1}$, where $C$ is the matrix of the change of basis. It follows that $A^b - \lambda\,\mathrm{id} = C\left(A^{b'} - \lambda\,\mathrm{id}\right)C^{-1}$, whence
$$\det\left(A^b - \lambda\,\mathrm{id}\right) = \det C\,\det\left(A^{b'} - \lambda\,\mathrm{id}\right)\det C^{-1} = \det\left(A^{b'} - \lambda\,\mathrm{id}\right).$$
${}^{12}$ If $\lambda$ occurs $k$ times on the diagonal of $A^b$ then $\lambda$ is a root of multiplicity $k$ of the characteristic polynomial of $A^b$, which coincides with that of $A$. Hence, $k = m$.
${}^{13}$ Note that each Jordan cell corresponds to exactly one eigenvector.
Hence, the total number of linearly independent solutions that are given by Theorem 3.19 for the eigenvalue $\lambda$ is equal to $m$. Let us show that each of these solutions has the form (3.78). Indeed, each solution of Theorem 3.19 is already of the form $e^{\lambda t}$ times a vector-valued polynomial of degree at most $p_j - 1$, for some $j$. To ensure that these solutions can be represented in the form (3.78), we only need to verify that $p_j - 1 \le s - 1$. Indeed, we have
$$\sum_{j=1}^{g}(p_j - 1) = \left(\sum_{j=1}^{g}p_j\right) - g = m - g = s - 1,$$
whence $p_j - 1 \le s - 1$ follows, which was to be proved.
Example. Consider the system $x' = Ax$ where
$$A = \begin{pmatrix}2 & 1\\ -1 & 4\end{pmatrix}.$$
The characteristic polynomial is
$$P(\lambda) = \det(A - \lambda\,\mathrm{id}) = (2 - \lambda)(4 - \lambda) + 1 = (\lambda - 3)^2,$$
and the only eigenvalue is $\lambda_1 = 3$ with the algebraic multiplicity $m_1 = 2$. The equation for an eigenvector $v$ is
$$(A - \lambda\,\mathrm{id})\,v = 0,$$
that is, for $v = (a, b)$,
$$\begin{pmatrix}-1 & 1\\ -1 & 1\end{pmatrix}\begin{pmatrix}a\\ b\end{pmatrix} = 0,$$
which is equivalent to $-a + b = 0$. Setting $a = 1$ and $b = 1$, we obtain the unique (up to a constant multiple) eigenvector
$$v_1 = \begin{pmatrix}1\\ 1\end{pmatrix}.$$
Hence, the geometric multiplicity is $g_1 = 1$, so that there is only one Jordan cell with the eigenvalue $\lambda_1$, which allows us to immediately determine the Jordan normal form of the given matrix:
$$\begin{pmatrix}3 & 1\\ 0 & 3\end{pmatrix}.$$
By Theorem 3.19, we obtain the solutions
$$x_1(t) = e^{3t}v_1,$$
$$x_2(t) = e^{3t}(t v_1 + v_2),$$
where $v_2$ is the 1st generalized eigenvector that can be determined from the equation
$$(A - \lambda\,\mathrm{id})\,v_2 = v_1.$$
Setting $v_2 = (a, b)$, we obtain the equation
$$\begin{pmatrix}-1 & 1\\ -1 & 1\end{pmatrix}\begin{pmatrix}a\\ b\end{pmatrix} = \begin{pmatrix}1\\ 1\end{pmatrix},$$
which is equivalent to $-a + b = 1$. Choosing $a = 0$ and $b = 1$, we obtain $v_2 = (0, 1)$, whence
$$x_2(t) = e^{3t}(t v_1 + v_2) = e^{3t}\begin{pmatrix}t\\ t + 1\end{pmatrix}.$$
Finally, the general solution is
$$x(t) = C_1 x_1 + C_2 x_2 = e^{3t}\begin{pmatrix}C_1 + C_2 t\\ C_1 + C_2(t + 1)\end{pmatrix}.$$
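A quick numerical cross-check of this example (illustrative, not part of the notes), using the matrix written out above:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0],
              [-1.0, 4.0]])
t = 0.8

# Since (A - 3I)^2 = 0, the exponential series truncates: e^{tA} = e^{3t} (I + t (A - 3I))
N = A - 3.0 * np.eye(2)
closed_form = np.exp(3.0 * t) * (np.eye(2) + t * N)
print(np.allclose(expm(t * A), closed_form))       # True

# The solution x2(t) = e^{3t} (t, t+1) from the example indeed satisfies x' = A x
x2 = lambda s: np.exp(3.0 * s) * np.array([s, s + 1.0])
h = 1e-6
print(np.allclose((x2(t + h) - x2(t - h)) / (2 * h), A @ x2(t), atol=1e-3))   # True
```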
Example. Consider the system $x' = Ax$ where
$$A = \begin{pmatrix}2 & 1 & 1\\ -2 & 0 & -1\\ 2 & 1 & 2\end{pmatrix}.$$
The characteristic polynomial is
$$P(\lambda) = \det(A - \lambda\,\mathrm{id}) = (2 - \lambda)(\lambda - 1)^2.$$
The roots are $\lambda_1 = 2$ with $m_1 = 1$ and $\lambda_2 = 1$ with $m_2 = 2$. The eigenvectors $v$ for $\lambda_1$ are determined from the equation
$$(A - \lambda_1\,\mathrm{id})\,v = 0,$$
whence, for $v = (a, b, c)$,
$$\begin{pmatrix}0 & 1 & 1\\ -2 & -2 & -1\\ 2 & 1 & 0\end{pmatrix}\begin{pmatrix}a\\ b\\ c\end{pmatrix} = 0,$$
that is,
$$\begin{cases}b + c = 0,\\ -2a - 2b - c = 0,\\ 2a + b = 0.\end{cases}$$
The second equation is a linear combination of the first and the last ones. Setting $a = 1$ we find $b = -2$ and $c = 2$, so that the unique (up to a constant multiple) eigenvector is
$$v = \begin{pmatrix}1\\ -2\\ 2\end{pmatrix},$$
which gives the first solution
$$x_1(t) = e^{2t}\begin{pmatrix}1\\ -2\\ 2\end{pmatrix}.$$
The eigenvectors for $\lambda_2 = 1$ satisfy the equation
$$(A - \lambda_2\,\mathrm{id})\,v = 0,$$
that is, for $v = (a, b, c)$,
$$\begin{pmatrix}1 & 1 & 1\\ -2 & -1 & -1\\ 2 & 1 & 1\end{pmatrix}\begin{pmatrix}a\\ b\\ c\end{pmatrix} = 0,$$
whence $a = 0$ and $b + c = 0$. Up to a constant multiple, the only eigenvector is $v_1 = (0, 1, -1)$. Therefore, $g_2 = 1$, that is, there is only one Jordan cell with the eigenvalue $\lambda_2$, which implies that the Jordan normal form of the given matrix is as follows:
$$\begin{pmatrix}2 & 0 & 0\\ 0 & 1 & 1\\ 0 & 0 & 1\end{pmatrix}.$$
By Theorem 3.19, the cell with $\lambda_2 = 1$ gives rise to two more solutions
$$x_2(t) = e^{t}v_1 = e^{t}\begin{pmatrix}0\\ 1\\ -1\end{pmatrix}$$
and
$$x_3(t) = e^{t}(t v_1 + v_2),$$
where $v_2$ is the first generalized eigenvector to be determined from the equation
$$(A - \lambda_2\,\mathrm{id})\,v_2 = v_1.$$
Setting $v_2 = (a, b, c)$, we obtain the system
$$\begin{cases}a + b + c = 0,\\ -2a - b - c = 1,\\ 2a + b + c = -1.\end{cases}$$
This system has a solution $a = -1$, $b = 0$ and $c = 1$. Hence,
$$v_2 = \begin{pmatrix}-1\\ 0\\ 1\end{pmatrix},$$
whence
$$x_3(t) = e^{t}(t v_1 + v_2) = e^{t}\begin{pmatrix}-1\\ t\\ 1 - t\end{pmatrix},$$
and the general solution is $x(t) = C_1 x_1 + C_2 x_2 + C_3 x_3$.
Recall that, by Theorem 2.14, the domain of the function $x(t, y)$ is an open subset of $\mathbb{R}^{n+1}$ and $x(t, y)$ is continuously differentiable in this domain.
The fact that $f$ does not depend on $t$ implies the following two consequences.
1. If $x(t)$ is a solution of (4.1) then $x(t - a)$ is also a solution of (4.1), for any $a \in \mathbb{R}$. In particular, the function $x(t - t_0, y)$ solves the following IVP:
$$\begin{cases}x' = f(x),\\ x(t_0) = y.\end{cases}$$
In other words, Lyapunov stability means that if $x(0)$ is close enough to $x_0$ then the solution $x(t)$ is defined for all $t > 0$ and stays uniformly close to $x_0$. If we replace in (4.2) the interval $(0, +\infty)$ by any bounded interval $[a, b]$ containing $0$, then the corresponding property holds automatically by the continuity of $x(t, y)$ in $y$. Hence, the main issue for the stability is the behavior of solutions as $t \to +\infty$.
Definition. A stationary point $x_0$ is called asymptotically stable for the system $x' = f(x)$ (or the system is called asymptotically stable at $x_0$) if it is Lyapunov stable and, in addition,
$$\|x(t, y) - x_0\| \to 0 \text{ as } t \to +\infty,$$
provided $\|y - x_0\|$ is small enough.
Observe that stability and asymptotic stability do not depend on the choice of the norm in $\mathbb{R}^n$, because all norms in $\mathbb{R}^n$ are equivalent.
${}^{14}$ In the literature one can find the following synonyms for the term “stationary point”: rest point, singular point, equilibrium point, fixed point.
4.2 Stability for a linear system
Consider a linear system $x' = Ax$ in $\mathbb{R}^n$ where $A$ is a constant operator. Clearly, $x = 0$ is a stationary point.
Theorem 4.1 If $\operatorname{Re}\lambda < 0$ for all complex eigenvalues $\lambda$ of $A$, then $0$ is asymptotically stable for the system $x' = Ax$. If $\operatorname{Re}\lambda > 0$ for some eigenvalue $\lambda$ of $A$, then $0$ is unstable.
Proof. By Theorem 3.19, the general complex solution of $x' = Ax$ has the form
$$x(t) = \sum_{k=1}^{n}C_k e^{\lambda_k t}P_k(t), \qquad (4.3)$$
where $C_k$ are arbitrary complex constants, $\lambda_1, \dots, \lambda_n$ are all the eigenvalues of $A$ listed with the algebraic multiplicity, and $P_k(t)$ are some vector-valued polynomials of $t$. The latter means that $P_k(t) = u_1 + u_2 t + \dots + u_s t^{s-1}$ for some $s \in \mathbb{N}$ and for some vectors $u_1, \dots, u_s$. Note that this solution is obtained by taking a linear combination of $n$ independent solutions $e^{\lambda_k t}P_k(t)$. Since
$$x(0) = \sum_{k=1}^{n}C_k P_k(0),$$
we see that the coefficients $C_k$ are the components of $x(0)$ in the basis $\{P_k(0)\}_{k=1}^{n}$.
It follows from (4.3) that
$$\|x(t)\| \le \sum_{k=1}^{n}\left|C_k e^{\lambda_k t}\right|\,\|P_k(t)\| \le \max_{k}|C_k|\,\sum_{k=1}^{n}e^{(\operatorname{Re}\lambda_k)t}\,\|P_k(t)\|.$$
Set
$$\alpha = \max_{k}\operatorname{Re}\lambda_k < 0.$$
Since $\max_k|C_k|$ is comparable to $\|x(0)\|_\infty$ (the $C_k$ being the coordinates of $x(0)$ in a fixed basis) and the norms $\|P_k(t)\|$ grow at most polynomially, we have
$$\sum_{k=1}^{n}e^{(\operatorname{Re}\lambda_k)t}\,\|P_k(t)\| \le C e^{\alpha t}\left(1 + t^{N}\right)$$
for all $t > 0$ and for some large enough constants $C$ and $N$. Hence, it follows that
$$\|x(t)\| \le C e^{\alpha t}\left(1 + t^{N}\right)\|x(0)\|_\infty. \qquad (4.4)$$
Since the function $e^{\alpha t}\left(1 + t^{N}\right)$ is bounded on $(0, +\infty)$, we obtain that
$$\|x(t)\| \le K\,\|x(0)\|$$
for some constant $K$, whence it follows that the stationary point $0$ is Lyapunov stable. Moreover, since
$$\left(1 + t^{N}\right)e^{\alpha t} \to 0 \text{ as } t \to +\infty,$$
we conclude from (4.4) that $\|x(t)\| \to 0$ as $t \to \infty$, that is, the stationary point $0$ is asymptotically stable.
Let now $\operatorname{Re}\lambda > 0$ for some eigenvalue $\lambda$. To prove that $0$ is unstable, it suffices to show that there exists an unbounded real solution $x(t)$, that is, a solution for which $\|x(t)\|$ is not bounded on $(0, +\infty)$ as a function of $t$. Indeed, if such a solution exists then the function $\varepsilon x(t)$ is also an unbounded solution for any $\varepsilon > 0$, while its initial value $\varepsilon x(0)$ can be made arbitrarily small by choosing $\varepsilon$ appropriately.
To construct an unbounded solution, consider an eigenvector $v$ of the eigenvalue $\lambda$. It gives rise to the solution
$$x(t) = e^{\lambda t}v,$$
for which
$$\|x(t)\| = \left|e^{\lambda t}\right|\,\|v\| = e^{t\operatorname{Re}\lambda}\,\|v\|.$$
Hence, $\|x(t)\|$ is unbounded. If $x(t)$ is a real solution then this finishes the proof. In general, if $x(t)$ is a complex solution then either $\operatorname{Re}x(t)$ or $\operatorname{Im}x(t)$ is unbounded (in fact, both are), whence the instability of $0$ follows.
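In practice, the criterion of Theorem 4.1 is checked by computing the spectrum numerically. A minimal sketch (not from the notes; the helper name and the test matrices are arbitrary choices):

```python
import numpy as np

def classify_linear_stability(A, tol=1e-12):
    """Classify the stationary point 0 of x' = A x by the real parts of the eigenvalues."""
    re = np.linalg.eigvals(A).real
    if np.all(re < -tol):
        return "asymptotically stable"
    if np.any(re > tol):
        return "unstable"
    return "borderline (some Re lambda = 0): Theorem 4.1 gives no answer"

print(classify_linear_stability(np.array([[-2.0, -1.0], [3.0, -4.0]])))   # asymptotically stable
print(classify_linear_stability(np.array([[0.0, 1.0], [-1.0, 0.0]])))     # borderline
```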
This theorem does not answer the question of what happens when $\operatorname{Re}\lambda = 0$. We will investigate this in the case $n = 2$, where we also give a more detailed description of the phase diagrams.
Consider now a linear system $x' = Ax$ in $\mathbb{R}^2$ where $A$ is a constant operator in $\mathbb{R}^2$. Let $b = \{b_1, b_2\}$ be the Jordan basis of $A$, so that $A^b$ has the Jordan normal form. Consider first the case when the Jordan normal form of $A$ has two Jordan cells, that is,
$$A^b = \begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}.$$
Then $b_1$ and $b_2$ are the eigenvectors of the eigenvalues $\lambda_1$ and $\lambda_2$, respectively, and the general solution is
$$x(t) = C_1 e^{\lambda_1 t}b_1 + C_2 e^{\lambda_2 t}b_2.$$
In other words, in the basis $b$,
$$x(t) = \left(C_1 e^{\lambda_1 t}, C_2 e^{\lambda_2 t}\right),$$
whence
$$\|x(t)\|_\infty \le e^{\alpha t}\|x(0)\|_\infty,$$
where
$$\alpha = \max\left(\operatorname{Re}\lambda_1, \operatorname{Re}\lambda_2\right).$$
If $\alpha \le 0$ then
$$\|x(t)\|_\infty \le \|x(0)\|_\infty,$$
which implies the Lyapunov stability. As we know from Theorem 4.1, if $\alpha > 0$ then the stationary point $0$ is unstable. Hence, in this particular situation, Lyapunov stability is equivalent to $\alpha \le 0$.
Let us construct the phase diagrams of the system $x' = Ax$ under the above assumptions.
Case $\lambda_1, \lambda_2$ are real.
Let $x_1(t)$ and $x_2(t)$ be the components of the solution $x(t)$ in the basis $\{b_1, b_2\}$. Then $x_1 = C_1 e^{\lambda_1 t}$ and $x_2 = C_2 e^{\lambda_2 t}$. Assuming that $\lambda_1, \lambda_2 \ne 0$ and eliminating $t$, we obtain the relation $x_1 = C\,x_2^{\gamma}$, where $\gamma = \lambda_1/\lambda_2$. If $\gamma > 0$ (that is, $\lambda_1$ and $\lambda_2$ are of the same sign) then the phase diagram is called a node.

[Plots omitted: phase diagrams of a node.]
If $\gamma < 0$ (that is, $\lambda_1$ and $\lambda_2$ are of different signs) then the phase diagram is called a saddle:

[Plot omitted: phase diagram of a saddle.]
Case $\lambda_1, \lambda_2$ are complex, $\lambda_{1,2} = \alpha \pm i\beta$ with $\beta \ne 0$. In suitable real coordinates the trajectories can be written in polar form as $r = r_0 e^{\alpha t}$, $\theta = \theta_0 \pm \beta t$. If $\alpha \ne 0$ then these equations define a logarithmic spiral, and the phase diagram is called a focus or a spiral:

[Plot omitted: phase diagram of a focus/spiral.]

If $\alpha = 0$ (that is, $\lambda_1, \lambda_2$ are purely imaginary) then the trajectories are closed orbits around the origin, and the phase diagram is called a center:

[Plot omitted: phase diagram of a center.]
In this case, the stationary point is stable but not asymptotically stable.
Consider now the case when the Jordan normal form of $A$ has only one Jordan cell, that is,
$$A^b = \begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}.$$
In this case, $\lambda$ must be real, because if $\lambda$ were a non-real root of the characteristic polynomial then its conjugate $\bar{\lambda}$ would also be a root, which is not possible since $\bar{\lambda}$ does not occur on the diagonal of $A^b$. Then the general solution is
$$x(t) = C_1 e^{\lambda t}b_1 + C_2 e^{\lambda t}(b_1 t + b_2) = (C_1 + C_2 t)\,e^{\lambda t}b_1 + C_2 e^{\lambda t}b_2,$$
whence $x(0) = C_1 b_1 + C_2 b_2$. That is, in the basis $b$, we can write $x(0) = (C_1, C_2)$ and
$$x(t) = \left(e^{\lambda t}(C_1 + C_2 t),\; e^{\lambda t}C_2\right), \qquad (4.5)$$
whence
$$\|x(t)\|_1 = e^{\lambda t}|C_1 + C_2 t| + e^{\lambda t}|C_2|.$$
If $\lambda < 0$ then we obtain again the asymptotic stability (which follows also from Theorem 4.1), while in the case $\lambda \ge 0$ the stationary point $0$ is unstable. Indeed, taking $C_1 = 0$ and $C_2 = 1$, we obtain a particular solution with the norm
$$\|x(t)\|_1 = e^{\lambda t}(t + 1),$$
which is unbounded.
If $\lambda \ne 0$ then it follows from (4.5) that the components $x_1, x_2$ of $x$ are related as follows:
$$\frac{x_1}{x_2} = \frac{C_1}{C_2} + t \qquad\text{and}\qquad t = \frac{1}{\lambda}\ln\frac{x_2}{C_2},$$
whence
$$x_1 = Cx_2 + \frac{x_2\ln|x_2|}{\lambda}$$
for some constant $C$. Here is the phase diagram in this case:
[Plot omitted: phase diagram for the case of a single Jordan cell.]
This phase diagram is also called a node. It is stable if $\lambda < 0$ and unstable if $\lambda > 0$. If $\lambda = 0$ then we obtain a degenerate phase diagram: parallel straight lines.
Hence, the main types of the phase diagrams are the node ($\lambda_1, \lambda_2$ are real, non-zero and of the same sign), the saddle ($\lambda_1, \lambda_2$ are real, non-zero and of opposite signs), the focus/spiral ($\lambda_1, \lambda_2$ are complex, non-real, with $\operatorname{Re}\lambda \ne 0$) and the center ($\lambda_1, \lambda_2$ are purely imaginary). Otherwise, the phase diagram consists of parallel straight lines or just dots, and is referred to as degenerate.
To summarize the stability investigation, let us emphasize that in the case Re λ = 0
both stability and instability can happen, depending on the structure of the Jordan normal
form.
4.3 Lyapunov’s theorem
Consider again an autonomous ODE $x' = f(x)$ where $f : \Omega \to \mathbb{R}^n$ is continuously differentiable and $\Omega$ is an open set in $\mathbb{R}^n$. Let $x_0$ be a stationary point of the system $x' = f(x)$, that is, $f(x_0) = 0$. We investigate the stability of the stationary point $x_0$.
Theorem 4.2 (Lyapunov’s theorem) Assume that $f \in C^2(\Omega)$ and set $A = f'(x_0)$ (that is, $A$ is the Jacobian matrix of $f$ at $x_0$). If $\operatorname{Re}\lambda < 0$ for all eigenvalues $\lambda$ of $A$ then the stationary point $x_0$ is asymptotically stable for the system $x' = f(x)$.
Remark. This theorem has a second part, which says the following: if $\operatorname{Re}\lambda > 0$ for some eigenvalue $\lambda$ of $A$ then $x_0$ is unstable for $x' = f(x)$. However, the proof of that statement is somewhat lengthy and will not be presented here.
Example. Consider the system
$$\begin{cases}x' = \sqrt{4 + 4y} - 2e^{x+y},\\ y' = \sin 3x + \ln(1 - 4y).\end{cases}$$
It is easy to see that the right hand side vanishes at $(0, 0)$ so that $(0, 0)$ is a stationary point. Setting
$$f(x, y) = \begin{pmatrix}\sqrt{4 + 4y} - 2e^{x+y}\\ \sin 3x + \ln(1 - 4y)\end{pmatrix},$$
we obtain
$$A = f'(0, 0) = \begin{pmatrix}\partial_x f_1 & \partial_y f_1\\ \partial_x f_2 & \partial_y f_2\end{pmatrix} = \begin{pmatrix}-2 & -1\\ 3 & -4\end{pmatrix}.$$
Another way to obtain this matrix is to expand each component of $f(x, y)$ by the Taylor formula:
$$f_1(x, y) = 2\sqrt{1 + y} - 2e^{x+y} = 2\left(1 + \frac{y}{2} + o(|y|)\right) - 2\left(1 + (x + y) + o(|x| + |y|)\right) = -2x - y + o(|x| + |y|)$$
and
$$f_2(x, y) = 3x + o(|x|) - 4y + o(|y|) = 3x - 4y + o(|x| + |y|).$$
Hence,
$$f(x, y) = \begin{pmatrix}-2 & -1\\ 3 & -4\end{pmatrix}\begin{pmatrix}x\\ y\end{pmatrix} + o(|x| + |y|),$$
whence we obtain the same matrix $A$.
The characteristic polynomial of $A$ is
$$\det\begin{pmatrix}-2 - \lambda & -1\\ 3 & -4 - \lambda\end{pmatrix} = \lambda^2 + 6\lambda + 11,$$
and its roots are $\lambda_{1,2} = -3 \pm i\sqrt{2}$.
Hence, Re λ < 0 for all λ, whence we conclude that 0 is asymptotically stable.
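A short symbolic cross-check of this example (illustrative, not part of the notes):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.Matrix([sp.sqrt(4 + 4*y) - 2*sp.exp(x + y),
               sp.sin(3*x) + sp.log(1 - 4*y)])

A = f.jacobian([x, y]).subs({x: 0, y: 0})
print(A)                 # Matrix([[-2, -1], [3, -4]])
print(A.eigenvals())     # {-3 - sqrt(2)*I: 1, -3 + sqrt(2)*I: 1}: both real parts negative
```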
The main tool for the proof of Theorem 4.2 is the following lemma, which is of independent interest. Recall that for any vector $v \in \mathbb{R}^n$ and a differentiable function $F$ in a domain in $\mathbb{R}^n$, the directional derivative $\partial_v F$ can be determined by
$$\partial_v F(x) = F'(x)\,v = \sum_{k=1}^{n}\frac{\partial F}{\partial x_k}(x)\,v_k.$$
Lemma 4.3 (Lyapunov’s lemma) Consider the system $x' = f(x)$ where $f \in C^1(\Omega)$, and let $x_0$ be a stationary point of it. Let $V(x)$ be a $C^1$ scalar function in an open set $U$ such that $x_0 \in U \subset \Omega$ and the following conditions hold:
1. $V(x_0) = 0$ and $V(x) > 0$ for all $x \in U \setminus \{x_0\}$.
2. For all $x \in U$,
$$\partial_{f(x)}V(x) \le 0. \qquad (4.6)$$
Then the stationary point $x_0$ is Lyapunov stable. If, in addition, there is a continuous function $W(x)$, positive in $U \setminus \{x_0\}$, such that
$$\partial_{f(x)}V(x) \le -W(x) \text{ for all } x \in U, \qquad (4.7)$$
then $x_0$ is asymptotically stable.
A function $V$ with the properties 1-2 is called a Lyapunov function. Note that the vector field $f(x)$ in the expression $\partial_{f(x)}V(x)$ depends on $x$. By definition, we have
$$\partial_{f(x)}V(x) = \sum_{k=1}^{n}\frac{\partial V}{\partial x_k}(x)\,f_k(x).$$
In this context, $\partial_f V$ is also called the orbital derivative of $V$ with respect to the ODE $x' = f(x)$.
Before the proof, let us show examples of Lyapunov functions.
Example. Consider the system $x' = Ax$ where $A \in L(\mathbb{R}^n)$. In order to investigate the stability of the stationary point $0$, consider the function
$$V(x) = \|x\|_2^2 = \sum_{k=1}^{n}x_k^2,$$
which is positive in $\mathbb{R}^n\setminus\{0\}$ and vanishes at $0$. Setting $f(x) = Ax$, we obtain for the components
$$f_k(x) = \sum_{j=1}^{n}A_{kj}x_j.$$
Since $\frac{\partial V}{\partial x_k} = 2x_k$, it follows that
$$\partial_f V = \sum_{k=1}^{n}\frac{\partial V}{\partial x_k}f_k = 2\sum_{j,k=1}^{n}A_{kj}x_j x_k.$$
The matrix $(A_{kj})$ is called non-positive definite if
$$\sum_{j,k=1}^{n}A_{kj}x_j x_k \le 0 \text{ for all } x \in \mathbb{R}^n.$$
Hence, if the matrix $A$ is non-positive definite then $\partial_f V \le 0$ in $\mathbb{R}^n$, so that $V$ is a Lyapunov function for the stationary point $0$.
Hence, V is indeed the Lyapunov function, and by Lemma 4.3 the stationary point (0, 0)
is Lyapunov stable.
Physically this has a simple meaning. The fact that F (x) < 0 for x > 0 and F (x) > 0
for x < 0 means that the force always acts in the direction of the origin thus trying to
return the displaced body to the stationary point, which causes the stability.
Proof of Lemma 4.3. By shrinking $U$, we can assume that $U$ is bounded and that $V$ is defined on the closure $\overline{U}$. Set
$$B_r = B(x_0, r) = \{x \in \mathbb{R}^n : \|x - x_0\| < r\}$$
and observe that, by the openness of $U$, $\overline{B_\varepsilon} \subset U$ provided $\varepsilon > 0$ is small enough. For any such $\varepsilon$, set
$$m(\varepsilon) = \inf_{x\in\overline{U}\setminus B_\varepsilon}V(x).$$
Since $V$ is continuous and $\overline{U}\setminus B_\varepsilon$ is a compact set (bounded and closed), by the minimal value theorem the infimum of $V$ is attained at some point. Since $V$ is positive away from $x_0$, we obtain $m(\varepsilon) > 0$. It follows from the definition of $m(\varepsilon)$ that
$$V(x) \ge m(\varepsilon) \text{ for all } x \in \overline{U}\setminus B_\varepsilon. \qquad (4.8)$$
Since $V(x_0) = 0$, for any given $\varepsilon > 0$ there is $\delta > 0$ so small that
$$V(x) < m(\varepsilon) \text{ for all } x \in B_\delta.$$
We will show that, for any initial point $y \in B_\delta$, the solution $x(t) = x(t, y)$ satisfies $x(t) \in B_\varepsilon$ for all $t > 0$, which means that the system is Lyapunov stable at $x_0$.
For any solution $x(t)$ in $U$, we have by the chain rule
$$\frac{d}{dt}V(x(t)) = V'(x)\,x'(t) = V'(x)\,f(x) = \partial_{f(x)}V(x) \le 0. \qquad (4.9)$$
Therefore, the function $V$ is decreasing along any solution $x(t)$ as long as $x(t)$ remains inside $U$.
If the initial point $y$ is in $B_\delta$ then $V(y) < m(\varepsilon)$ and, hence, $V(x(t)) < m(\varepsilon)$ for $t > 0$ as long as $x(t)$ is defined in $U$. It follows from (4.8) that $x(t) \in B_\varepsilon$. We are left to verify that $x(t)$ is defined${}^{15}$ for all $t > 0$. Indeed, assume that $x(t)$ is defined only for $t < T$ where $T$ is finite. By Theorem 2.8, if $t \to T-$, then the graph of the solution $x(t)$ must leave any compact subset of $\mathbb{R}\times U$, whereas the graph is contained in the set $[0, T]\times\overline{B_\varepsilon}$. This contradiction shows that $T = +\infty$, which finishes the proof of the first part.
For the second part, we obtain by (4.7) and (4.9)
$$\frac{d}{dt}V(x(t)) \le -W(x(t)).$$
${}^{15}$ Since $x(t)$ has been defined as the maximal solution in the domain $\mathbb{R}\times U$, the solution $x(t)$ is always contained in $U$ as long as it is defined.
It suffices to show that
$$V(x(t)) \to 0 \text{ as } t \to \infty,$$
since this will imply that $x(t) \to x_0$ (recall that $x_0$ is the only point where $V$ vanishes). Since $V(x(t))$ is decreasing in $t$, the limit
$$L = \lim_{t\to+\infty}V(x(t))$$
exists. Assume from the contrary that $L > 0$. Then, for all $t > 0$, $V(x(t)) \ge L$. By the continuity of $V$, there is $r > 0$ such that
$$V(x) < L \text{ for all } x \in B_r.$$
Hence, $x(t) \notin B_r$ for all $t > 0$. Set
$$c = \inf_{x\in\overline{U}\setminus B_r}W(x) > 0.$$
Then $\frac{d}{dt}V(x(t)) \le -c$ for all $t > 0$, whence it follows that $V(x(t)) < 0$ for large enough $t$. This contradiction finishes the proof.
Proof of Theorem 4.2. Without loss of generality, set $x_0 = 0$. Using that $f \in C^2$, we obtain by the Taylor formula, for any component $f_k$ of $f$,
$$f_k(x) = f_k(0) + \sum_{i=1}^{n}\partial_i f_k(0)\,x_i + \frac{1}{2}\sum_{i,j=1}^{n}\partial_{ij}f_k(0)\,x_i x_j + o\!\left(\|x\|^2\right) \text{ as } x\to 0.$$
Writing this in the vector form and using $f(0) = 0$, we obtain
$$f(x) = Ax + h(x),$$
where $A = f'(0)$ and $h(x) = O\!\left(\|x\|^2\right)$ as $x\to 0$. Hence, for any choice of the norms, there is a constant $C$ such that
$$\|h(x)\| \le C\|x\|^2 \qquad (4.10)$$
provided $\|x\|$ is small enough.
Assuming that $\operatorname{Re}\lambda < 0$ for all eigenvalues of $A$, consider the following function:
$$V(x) = \int_0^{\infty}\left\|e^{sA}x\right\|_2^2\,ds. \qquad (4.11)$$
The integral converges because, when all eigenvalues have negative real parts, $\left\|e^{sA}x\right\|_2$ decays exponentially as $s\to+\infty$ (cf. (4.4)). Let $v_1, \dots, v_n$ be the canonical basis in $\mathbb{R}^n$, so that $x = \sum_i x_i v_i$ and $e^{sA}x = \sum_i x_i e^{sA}v_i$.
Therefore,
$$\left\|e^{sA}x\right\|_2^2 = \left(e^{sA}x\right)\cdot\left(e^{sA}x\right) = \left(\sum_i x_i e^{sA}v_i\right)\cdot\left(\sum_j x_j e^{sA}v_j\right) = \sum_{i,j}x_i x_j\left(e^{sA}v_i\cdot e^{sA}v_j\right).$$
Integrating in $s$, we obtain
$$V(x) = \sum_{i,j}b_{ij}x_i x_j,$$
where $b_{ij} = \int_0^{\infty}\left(e^{sA}v_i\cdot e^{sA}v_j\right)ds$ are constants, which clearly implies that $V(x)$ is infinitely many times differentiable in $x$.
Remark. Usually we work with any norm in Rn . In the definition (4.11) of V (x), we
have specifically chosen the 2-norm to ensure the smoothness of V (x).
The function $V(x)$ is obviously non-negative and $V(x) = 0$ if and only if $x = 0$. In order to complete the proof of the fact that $V(x)$ is a Lyapunov function, we need to estimate $\partial_{f(x)}V(x)$. Let us first evaluate $\partial_{Ax}V(x)$ for any $x \in U$. Since the function $y(t) = e^{tA}x$ solves the ODE $y' = Ay$, we have by (4.9)
$$\partial_{Ay(t)}V(y(t)) = \frac{d}{dt}V(y(t)).$$
Setting $t = 0$ and noticing that $y(0) = x$, we obtain
$$\partial_{Ax}V(x) = \frac{d}{dt}V\!\left(e^{tA}x\right)\Big|_{t=0}. \qquad (4.13)$$
On the other hand,
$$V\!\left(e^{tA}x\right) = \int_0^{\infty}\left\|e^{sA}e^{tA}x\right\|_2^2\,ds = \int_0^{\infty}\left\|e^{(s+t)A}x\right\|_2^2\,ds = \int_t^{\infty}\left\|e^{\tau A}x\right\|_2^2\,d\tau,$$
whence
$$\frac{d}{dt}V\!\left(e^{tA}x\right)\Big|_{t=0} = -\left\|e^{tA}x\right\|_2^2\Big|_{t=0} = -\|x\|_2^2,$$
and, by (4.13), $\partial_{Ax}V(x) = -\|x\|_2^2$.
For a general $x$ we have $f(x) = Ax + h(x)$ and, hence,
$$\partial_{f(x)}V(x) = \partial_{Ax}V(x) + \partial_{h(x)}V(x) = -\|x\|_2^2 + V'(x)\cdot h(x) \le -\|x\|_2^2 + \|V'(x)\|_2\,\|h(x)\|_2,$$
where we have used the Cauchy-Schwarz inequality $u\cdot v \le \|u\|_2\|v\|_2$ for all $u, v \in \mathbb{R}^n$. Next, let us use the estimate (4.10) in the form
$$\|h(x)\|_2 \le C\|x\|_2^2,$$
which is true provided $\|x\|_2$ is small enough. Observe also that the function $V(x)$ has a minimum at $0$, which implies that $V'(0) = 0$. Hence, if $\|x\|_2$ is small enough then
$$\|V'(x)\|_2 \le \frac{1}{2}C^{-1}.$$
Combining together the above three lines, we obtain that, in a small neighborhood $U$ of $0$,
$$\partial_{f(x)}V(x) \le -\|x\|_2^2 + \frac{1}{2}\|x\|_2^2 = -\frac{1}{2}\|x\|_2^2.$$
Setting $W(x) = \frac{1}{2}\|x\|_2^2$, we conclude, by Lemma 4.3, that the ODE $x' = f(x)$ is asymptotically stable at $0$.
Now consider some examples of investigation of stationary points of an autonomous system $x' = f(x)$.
The first step is to find the stationary points, that is, to solve the equation $f(x) = 0$. In general, it may have many roots, and each root requires a separate investigation.
Let $x_0$ denote, as before, one of the stationary points of the system. The second step is to compute the matrix $A = f'(x_0)$. Of course, the matrix $A$ can be found componentwise as the Jacobian matrix, $A_{kj} = \partial_{x_j}f_k(x_0)$. However, in practice it is frequently more convenient to proceed as follows. Setting $X = x - x_0$, we obtain that the system $x' = f(x)$ transforms to
$$X' = f(x_0 + X) = f(x_0) + AX + o(\|X\|) \text{ as } X\to 0,$$
that is, to
$$X' = AX + o(\|X\|).$$
Hence, the linear term $AX$ appears in the right hand side if we throw away the terms of the order $o(\|X\|)$. The equation $X' = AX$ is called the linearized system for $x' = f(x)$ at $x_0$.
The third step is the investigation of the stability of the linearized system, which amounts to evaluating the eigenvalues of $A$ and, possibly, the Jordan normal form.
The fourth step is the conclusion about the stability of the non-linear system $x' = f(x)$ using Lyapunov's theorem or Lyapunov's lemma. If $\operatorname{Re}\lambda < 0$ for all eigenvalues $\lambda$ of $A$ then both the linearized and the non-linear system are asymptotically stable at $x_0$, and if $\operatorname{Re}\lambda > 0$ for some eigenvalue $\lambda$ then both are unstable. The other cases require additional investigation.
Example. Consider the system
$$\begin{cases}x' = y + xy,\\ y' = -x - xy.\end{cases} \qquad (4.14)$$
For the stationary points we have the equations
$$\begin{cases}y + xy = 0,\\ x + xy = 0,\end{cases}$$
whence we obtain two roots: $(x, y) = (0, 0)$ and $(x, y) = (-1, -1)$.
Consider first the stationary point $(-1, -1)$. Setting $X = x + 1$ and $Y = y + 1$, we obtain the system
$$\begin{cases}X' = (Y - 1)X = -X + XY = -X + o(\|(X, Y)\|),\\ Y' = -(X - 1)Y = Y - XY = Y + o(\|(X, Y)\|),\end{cases} \qquad (4.15)$$
whose linearization is
$$\begin{cases}X' = -X,\\ Y' = Y.\end{cases}$$
Hence, the matrix is
$$A = \begin{pmatrix}-1 & 0\\ 0 & 1\end{pmatrix},$$
and the eigenvalues are $-1$ and $+1$, so that the type of the stationary point is a saddle. The linearized and the non-linear system are unstable at $(-1, -1)$ because one of the eigenvalues is positive.
Consider now the stationary point $(0, 0)$. Near this point, the system can be written in the form
$$\begin{cases}x' = y + o(\|(x, y)\|),\\ y' = -x + o(\|(x, y)\|),\end{cases}$$
so that the linearized system is
$$\begin{cases}x' = y,\\ y' = -x.\end{cases}$$
Hence, the matrix is
$$A = \begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix},$$
and the eigenvalues are $\pm i$. Since they are purely imaginary, the type of the stationary point $(0, 0)$ is a center. Hence, the linearized system is stable at $(0, 0)$ but not asymptotically stable.
For the non-linear system (4.14), no conclusion can be drawn just from the eigenvalues. In this case, one can use the following Lyapunov function:
$$V(x, y) = x - \ln(x + 1) + y - \ln(y + 1),$$
which is defined for $x > -1$ and $y > -1$. Indeed, the function $x - \ln(x + 1)$ takes the minimum $0$ at $x = 0$ and is positive for $x \ne 0$. It follows that $V(x, y)$ takes the minimal value $0$ at $(0, 0)$ and is positive away from $(0, 0)$. The orbital derivative of $V$ is
$$\partial_f V = (y + xy)\,\partial_x V - (x + xy)\,\partial_y V = (y + xy)\left(1 - \frac{1}{x + 1}\right) - (x + xy)\left(1 - \frac{1}{y + 1}\right) = xy - xy = 0.$$
Hence, $V$ is a Lyapunov function, which implies that $(0, 0)$ is stable for the non-linear system.
Since $\partial_f V = 0$, it follows from (4.9) that $V$ remains constant along the trajectories of the system. Using this, one can easily show that $(0, 0)$ is not asymptotically stable and that the type of the stationary point $(0, 0)$ for the non-linear system is also a center. The phase trajectories of this system around $(0, 0)$ are shown in the diagram below.
[Plot omitted: phase trajectories of the system (4.14) around the stationary point (0, 0).]
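The computation of the orbital derivative above can also be confirmed symbolically (a small illustration, not part of the notes):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = (y + x*y, -x - x*y)                              # right-hand side of (4.14)
V = x - sp.log(x + 1) + y - sp.log(y + 1)            # Lyapunov function from the example

orbital_derivative = f[0]*sp.diff(V, x) + f[1]*sp.diff(V, y)
print(sp.simplify(orbital_derivative))               # 0: V is constant along trajectories
```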