
Math2221

Higher Theory and Applications


of Differential Equations

School of Maths and Stats, UNSW


Session 2, 2016

Version: October 19, 2016


Part I

Linear ODEs
Introduction

In first year, you studied second-order linear ODEs with constant


coefficients. We will see that the techniques you learned can be
extended to handle higher-order linear ODEs with variable
coefficients. A key idea, used repeatedly throughout the remainder
of the course, is linear superposition.
Outline

Linear differential operators

Differential operators with constant coefficients

Wronskians and linear independence

Methods for inhomogeneous equations

Solution via power series

Singular ODEs

Bessel and Legendre equations


Linear differential operators

In linear algebra, you have seen the advantages arising from the
compact notation Ax = b for a system of linear equations. A
similarly compact notation is of great value when dealing with a
linear differential equation

Lu = f.

Here, L is an operator (or transformation) that acts on a


function u to create a new function Lu.
Notation
Given coefficients a0 (x), a1 (x), . . . , am (x) we define the linear
differential operator L of order m,

Lu(x) = ∑_{j=0}^{m} a_j(x) D^j u(x)
      = a_m D^m u + a_{m−1} D^{m−1} u + · · · + a_0 u,        (1)

where D^j u = d^j u/dx^j (with D^0 u = u).

We refer to am as the leading coefficient of L. For simplicity, we


assume that each aj (x) is a smooth function of x.

The ODE Lu = f is said to be singular with respect to an
interval [a, b] if the leading coefficient a_m(x) vanishes for
some x ∈ [a, b]. (We also say L is singular on [a, b].)
Linearity

An operator of the form (1) is indeed linear: for any constants


c1 and c2 and any (m-times differentiable) functions u1 and u2 ,

L(c1 u1 + c2 u2 ) = c1 Lu1 + c2 Lu2 .

Hence, the set of solutions to the homogeneous equation Lu = 0


forms a vector space.
Example
Lu = (x − 3)u‴ − (1 + cos x)u′ + 6u is a linear differential
operator of order 3, with leading coefficient x − 3. Thus, L is
singular on [1, 4], but not singular on [0, 2].

Example
N(u) = u″ + u²u′ − u is a nonlinear differential operator of
order 2.
Linear initial-value problem
Consider a general mth-order linear differential operator

Lu = ∑_{j=0}^{m} a_j(x) D^j u.

Given f(x) and m initial values ν0, ν1, . . . , ν_{m−1} we seek
u = u(x) satisfying

Lu = f on [a, b],                                          (2)

with

u(a) = ν0 ,  u′(a) = ν1 ,  . . . ,  u^{(m−1)}(a) = ν_{m−1} .   (3)

Theorem
Assume that the ODE Lu = f is not singular with respect to [a, b],
and that f is continuous on [a, b]. Then the IVP (2) and (3) has a
unique solution.
Homogeneous problem

Theorem
Assume that the linear, mth-order differential operator L is not
singular on [a, b]. Then the set of all solutions to the homogeneous
equation Lu = 0 on [a, b] is a vector space of dimension m.

Proof.
Let V = { u : Lu = 0 on [a, b] } and define the linear
transformation Θ : V → Rm by

Θu = [u(a), u′(a), . . . , u^{(m−1)}(a)]ᵀ .

Uniqueness of solutions means that Θ is one-one, and existence


means that Θ is onto. Hence, Θ is an isomorphism, and therefore
the vector space V has dimension m.
General solution
If {u1 , u2 , . . . , um } is any basis for the solution space of Lu = 0,
then every solution can be written in a unique way as

u(x) = c1 u1(x) + c2 u2(x) + · · · + cm um(x) for a ≤ x ≤ b.   (4)

We refer to (4) as the general solution of the homogeneous


equation Lu = 0 on [a, b].

Linear superposition refers to this technique of constructing a new


solution out of a linear combination of old ones. Of course, this
trick works only because L is linear.

Example
The general solution to u″ − u′ − 2u = 0 is

u(x) = c1 e^{−x} + c2 e^{2x} .


Inhomogeneous problem

Consider the inhomogeneous equation Lu = f on [a, b], and fix a


particular solution uP .

For any solution u, the difference u − uP is a solution of the


homogeneous equation because

L(u − uP ) = Lu − LuP = f − f = 0 on [a, b].

Hence, u(x) − uP(x) = c1 u1(x) + · · · + cm um(x) for some
constants c1, . . . , cm, and so

u(x) = uP(x) + uH(x),   a ≤ x ≤ b,   where uH(x) = c1 u1(x) + · · · + cm um(x),

is the general solution of the inhomogeneous equation Lu = f.


Reduction of order

If we know (somehow) one solution u = u1 (x) to a second-order,


linear, homogeneous ODE

u″ + p(x)u′ + q(x)u = 0,

then, to find a second (linearly independent) solution, substitute


u = v(x)u1 (x) into the ODE and rearrange to obtain

(u1″ + pu1′ + qu1)v + u1 v″ + (2u1′ + pu1)v′ = 0,

where the first bracket vanishes because Lu1 = 0.

This is just a first-order, linear ODE for the derivative of the
unknown factor v(x): put w = v′, then

u1 w′ + (2u1′ + pu1)w = 0.
Reduction of order (continued)
Writing the ODE for w in the standard form

w′ + (2u1′ u1^{−1} + p)w = 0,

we seek an integrating factor

I(x) = exp( ∫ (2u1′ u1^{−1} + p) dx ) = u1² exp( ∫ p dx ),

so that

d/dx (Iw) = Iw′ + I′w = I [ w′ + (2u1′ u1^{−1} + p)w ] = 0.

Then Iw = C for some constant C, and so

v = ∫ C/I(x) dx.

Example
For the ODE u″ − 6u′ + 9u = 0, take u1 = e^{3x} and find v.
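
A quick sketch of this example (not in the original notes), following
the steps above: here p(x) = −6, so

I(x) = u1² exp( ∫ p dx ) = e^{6x} e^{−6x} = 1,

hence w = v′ = C is constant, v = Cx + D, and u = v u1 = (Cx + D)e^{3x}.
Taking C = 1, D = 0 gives the second solution u2 = x e^{3x}.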
Differential operators with constant coefficients

If L has constant coefficients, then the problem of solving Lu = 0


reduces to that of factorizing the polynomial having the same
coefficients. Some complications occur if the polynomial has any
repeated roots.
Characteristic polynomial

Suppose that a_j is constant for 0 ≤ j ≤ m, with a_m ≠ 0. We
define the associated polynomial of degree m,

p(z) = ∑_{j=0}^{m} a_j z^j = a_m z^m + a_{m−1} z^{m−1} + · · · + a_1 z + a_0 ,

so that, formally, L = p(D).

Since D^j e^{λx} = λ^j e^{λx} we have

p(D)e^{λx} = (a_m λ^m + a_{m−1} λ^{m−1} + · · · + a_1 λ + a_0) e^{λx} = p(λ)e^{λx} ,

and so
p(D)e^{λx} = 0 ⇐⇒ p(λ) = 0.
Factorization

By the fundamental theorem of algebra,

p(z) = a_m (z − λ1)^{k1} (z − λ2)^{k2} · · · (z − λr)^{kr}

where λ1 , λ2 , . . . , λr are the distinct roots of p, with


corresponding multiplicities k1 , k2 , . . . , kr satisfying

k1 + k2 + · · · + kr = m.

Lemma
(D − λ) x^j e^{λx} = j x^{j−1} e^{λx} for j ≥ 0.

Lemma
(D − λ)^k x^j e^{λx} = 0 for j = 0, 1, . . . , k − 1.
Proof

An elementary calculation gives

(D − λ) x^j e^{λx} = j x^{j−1} e^{λx} + x^j λ e^{λx} − λ x^j e^{λx} = j x^{j−1} e^{λx} ,

as claimed, and then

(D − λ)² x^j e^{λx} = (D − λ) j x^{j−1} e^{λx} = j(j − 1) x^{j−2} e^{λx} ,
   ⋮
(D − λ)^j x^j e^{λx} = j! e^{λx} ,
(D − λ)^{j+1} x^j e^{λx} = j! (D − λ)e^{λx} = 0,

so (D − λ)^k x^j e^{λx} = 0 for all k ≥ j + 1, that is, for j ≤ k − 1.


Basic solutions

Lemma
If (z − λ)^k is a factor of p(z) then the function u(x) = x^j e^{λx} is a
solution of Lu = 0 for 0 ≤ j ≤ k − 1.

Proof.
Write p(z) = (z − λ)^k q(z), so that q(z) is a polynomial of
degree m − k. It follows that

p(D) = (D − λ)^k q(D) = q(D)(D − λ)^k

and so for 0 ≤ j ≤ k − 1,

p(D) x^j e^{λx} = q(D)(D − λ)^k x^j e^{λx} = q(D)0 = 0.


General solution

Theorem
For the constant-coefficient case, the general solution of the
homogeneous equation Lu = 0 is

u(x) = ∑_{q=1}^{r} ∑_{l=0}^{k_q − 1} c_{ql} x^l e^{λ_q x} ,

where the c_{ql} are arbitrary constants.

Since (z − λ_q)^{k_q} is a factor of p(z),

Lu = ∑_{q=1}^{r} ∑_{l=0}^{k_q − 1} c_{ql} L x^l e^{λ_q x} = ∑_{q=1}^{r} ∑_{l=0}^{k_q − 1} c_{ql} × 0 = 0.

Linear independence is shown in the Technical Proofs.


Distinct real roots

Example
From the factorization

D⁴ − 2D³ − 11D² + 12D = (D + 3)D(D − 1)(D − 4)

we see that the general solution of

u^{(4)} − 2u‴ − 11u″ + 12u′ = 0

is
u = c1 e^{−3x} + c2 + c3 e^{x} + c4 e^{4x} .
Repeated real root

Example
From the factorization

D⁴ + 6D³ + 9D² − 4D − 12 = (D − 1)(D + 2)²(D + 3)

we see that the general solution of

u^{(4)} + 6u‴ + 9u″ − 4u′ − 12u = 0

is
u = c1 e^{x} + c2 e^{−2x} + c3 x e^{−2x} + c4 e^{−3x} .
Complex root

Example
From the factorization

D³ − 7D² + 17D − 15 = (D² − 4D + 5)(D − 3)
                    = (D − 2 − i)(D − 2 + i)(D − 3)

we see that the general solution of

u‴ − 7u″ + 17u′ − 15u = 0

is

u(x) = c1 e^{(2+i)x} + c2 e^{(2−i)x} + c3 e^{3x}
     = c4 e^{2x} cos x + c5 e^{2x} sin x + c3 e^{3x} .
Simple oscillator
Second-order ODEs arise naturally in classical mechanics.
Consider a particle of mass m that moves along the x-axis with
velocity v = ẋ = dx/dt under the influence of

an external applied force = f(t),


a frictional resistance force = −r(v)v,
a restoring force = −k(x)x.

Newton’s second law,

m ẍ = m d²x/dt² = f(t) − r(v)v − k(x)x,

leads to a second-order differential equation

m ẍ + r(ẋ) ẋ + k(x)x = f(t).

The simplest case is when r(v) = r0 > 0 and k(x) = k0 > 0 are
constant, giving a linear ODE with constant coefficients:

m ẍ + r0 ẋ + k0 x = f(t).

Typically interested in the case when the applied force is periodic


with frequency ω; for example,

f(t) = F sin ωt.

The general solution x(t) = xH (t) + xP (t).

We now show that xH (t) → 0 as t → ∞ and that xP (t) exists.


Since xP (t + T ) = xP (t) for T = 2π/ω, it follows that the
solution x(t) always tends to a periodic function with frequency ω,
regardless of the initial values x(0) and ẋ(0).
The characteristic equation is

mλ² + r0 λ + k0 = m(λ − λ₊)(λ − λ₋),

and the roots are

λ± = (−r0 ± √∆) / (2m)   where ∆ = r0² − 4mk0 .

If ∆ > 0, then λ₋ < λ₊ < 0 so, for any constants A and B,

xH(t) = A e^{λ₊ t} + B e^{λ₋ t} → 0 as t → ∞.

If ∆ = 0 then λ₋ = λ₊ < 0 so again xH(t) → 0.

If ∆ < 0 then √∆ = i√|∆| so Re λ₊ = Re λ₋ < 0, and again
xH(t) → 0.
Since Re λ± ≠ 0 the particular solution has the form

xP(t) = C cos ωt + E sin ωt,

and we find that

m ẍP + r0 ẋP + k0 xP = (−mω²C + r0 ωE + k0 C) cos ωt
                     + (−mω²E − r0 ωC + k0 E) sin ωt,

which equals F sin ωt iff

(k0 − mω²)C + r0 ωE = 0,
−r0 ωC + (k0 − mω²)E = F.

This 2 × 2 system has a unique solution since its determinant is

(k0 − mω²)² + (r0 ω)² > 0.


Wronskians and linear independence

We introduce a function, called the Wronskian, that provides us


with a way of testing whether a family of solutions to Lu = 0 is
linearly independent. The Wronskian also turns out to have several
other uses.
Wronskian
The Wronskian of the functions u1, u2, . . . , um is the
m × m determinant

W(x) = W(x; u1, u2, . . . , um) = det[ D^{i−1} u_j ].

For instance, if m = 2 then

W(x) = | u1   u2  | = u1 u2′ − u2 u1′ ,
       | u1′  u2′ |

and when m = 3,

W(x) = | u1   u2   u3  |
       | u1′  u2′  u3′ |
       | u1″  u2″  u3″ | .

Of course, W(x) is defined only when the functions are
differentiable m − 1 times.
Wronskian

Example
The Wronskian of the functions u1 = e^{2x}, u2 = x e^{2x} and
u3 = e^{−x} is

W = | e^{2x}    x e^{2x}                e^{−x}  |
    | 2e^{2x}   e^{2x} + 2x e^{2x}     −e^{−x}  | = 9e^{3x} .
    | 4e^{2x}   4e^{2x} + 4x e^{2x}     e^{−x}  |

Example
The Wronskian of the functions u1 = e^x cos 3x and u2 = e^x sin 3x
is

W = | e^x cos 3x                   e^x sin 3x                  | = 3e^{2x} .
    | e^x cos 3x − 3e^x sin 3x     e^x sin 3x + 3e^x cos 3x    |
Linearly dependent functions
Lemma
If u1, . . . , um are linearly dependent over an interval [a, b] then
W(x; u1, . . . , um) = 0 for a ≤ x ≤ b.

Example
The functions u1 = cosh x, u2 = sinh x and u3 = e^x are linearly
dependent because

cosh x + sinh x = (e^x + e^{−x})/2 + (e^x − e^{−x})/2 = e^x .

Their Wronskian is

W = | cosh x   sinh x   e^x |
    | sinh x   cosh x   e^x | = 0.
    | cosh x   sinh x   e^x |
Proof (for m = 3)
Assume that u1, u2, u3 are linearly dependent on the
interval [a, b], that is, there exist constants c1, c2, c3, not all zero,
such that

c1 u1(x) + c2 u2(x) + c3 u3(x) = 0 for a ≤ x ≤ b.

Differentiating, it follows that

c1 u1′(x) + c2 u2′(x) + c3 u3′(x) = 0,
c1 u1″(x) + c2 u2″(x) + c3 u3″(x) = 0,

so

| u1(x)   u2(x)   u3(x)  | | c1 |   | 0 |
| u1′(x)  u2′(x)  u3′(x) | | c2 | = | 0 |   for a ≤ x ≤ b.
| u1″(x)  u2″(x)  u3″(x) | | c3 |   | 0 |

This 3 × 3 matrix must be singular and thus W(x) = 0.


Wronskian satisfies a first-order ODE
Lemma
If u1, u2, . . . , um are solutions of Lu = 0 on the interval [a, b],
then their Wronskian satisfies

a_m(x) W′(x) + a_{m−1}(x) W(x) = 0,   a ≤ x ≤ b.

Example
The second-order ODE

u″ + 3u′ − 4u = 0

has solutions u1 = e^x and u2 = e^{−4x}. Their Wronskian is

W = | e^x    e^{−4x}  | = −5e^{−3x} ,
    | e^x   −4e^{−4x} |

and satisfies W′ + 3W = 0.
Proof (for m = 2)

Differentiating W = u1 u2′ − u1′ u2 we have

W′ = (u1′ u2′ + u1 u2″) − (u1″ u2 + u1′ u2′) = u1 u2″ − u1″ u2 ,

so

a2 W′ + a1 W = a2 (u1 u2″ − u1″ u2) + a1 (u1 u2′ − u1′ u2)
               + a0 (u1 u2 − u1 u2)      [the last term is zero]
             = u1 (a2 u2″ + a1 u2′ + a0 u2)
               − (a2 u1″ + a1 u1′ + a0 u1) u2
             = u1 (Lu2) − (Lu1) u2 = 0.
Linear independence of solutions

Theorem
Let u1, u2, . . . , um be solutions of a non-singular, linear,
homogeneous, mth-order ODE Lu = 0 on the interval [a, b].
Either
   W(x) = 0 for a ≤ x ≤ b and the m solutions are linearly dependent,
or else
   W(x) ≠ 0 for a ≤ x ≤ b and the m solutions are linearly independent.
Proof
The Wronskian satisfies

W′ + pW = 0 for a ≤ x ≤ b, where p = a_{m−1}/a_m .

Define an integrating factor

I(x) = exp( ∫ p(x) dx ) ≠ 0,

so that I′ = Ip and hence

(IW)′ = IW′ + IpW = I(W′ + pW) = 0.

Thus, I(x)W(x) = C for some constant C.

Either C = 0, in which case W(x) = 0 for all x ∈ [a, b], or else
C ≠ 0, in which case W(x) is never zero for x ∈ [a, b].
(Assume now that m = 3.) We already know that

u1 , u2 , u3 linearly dependent =⇒ W ≡ 0.

Hence, to complete the proof it suffices to show

W(a) = 0 =⇒ u1 , u2 , u3 linearly dependent.

If W(a) = 0, then there exist c1, c2, c3, not all zero, such that

| u1(x)   u2(x)   u3(x)  | | c1 |   | 0 |
| u1′(x)  u2′(x)  u3′(x) | | c2 | = | 0 |   at x = a,
| u1″(x)  u2″(x)  u3″(x) | | c3 |   | 0 |

so the function u(x) = c1 u1(x) + c2 u2(x) + c3 u3(x) satisfies

Lu = 0 for a ≤ x ≤ b, with u(a) = u′(a) = u″(a) = 0.

The solution of this initial-value problem is unique, so u(x) ≡ 0


and thus u1 , u2 , u3 are linearly dependent.
Interlacing of zeros

Theorem (Sturm separation theorem)


Assume a2(x) ≠ 0 for a ≤ x ≤ b. If u1 and u2 are linearly
independent solutions of

a2(x)u″ + a1(x)u′ + a0(x)u = 0

then u1 has exactly one zero between any two successive zeros
of u2 in the interval [a, b].

Example
The functions u1(x) = cos x and u2(x) = sin x are linearly
independent solutions of u″ + u = 0 on any interval [a, b].

Example
What about u″ − u = 0?
Proof
Suppose a ≤ α < β ≤ b with

u2(α) = 0 = u2(β) and u2(x) ≠ 0 for α < x < β.

Case 1: W(x) > 0 for a ≤ x ≤ b and u2(x) > 0 for α < x < β.
Thus,

u2′(α) = lim_{h→0⁺} u2(α + h)/h ≥ 0,
u2′(β) = lim_{h→0⁻} u2(β + h)/h ≤ 0.

Since W = u1 u2′ − u1′ u2 , we see

0 < W(α) = u1(α)u2′(α) and 0 < W(β) = u1(β)u2′(β),

so u1(α) > 0 and u1(β) < 0.


Proof (continued)
Case 2: W(x) > 0 for a ≤ x ≤ b and u2(x) < 0 for α < x < β.
A similar argument shows u1(α) < 0 and u1(β) > 0.

Case 3: W(x) < 0 for a ≤ x ≤ b and u2(x) > 0 for α < x < β.
A similar argument shows u1(α) < 0 and u1(β) > 0.

Case 4: W(x) < 0 for a ≤ x ≤ b and u2(x) < 0 for α < x < β.
A similar argument shows u1(α) > 0 and u1(β) < 0.

In all cases, u1 (α) and u1 (β) have opposite signs, and so by the
Intermediate Value Theorem u1 (ξ) = 0 for at least one ξ ∈ (α, β).

Finally, suppose for a contradiction that u1 has more than one


zero in (α, β), say u1 (ξ) = 0 = u1 (η) with α < ξ < η < β. Then,
by interchanging the roles of u1 and u2 , it follows that u2 must
have a zero in (ξ, η), contradicting the assumption that u2 (x) > 0
for α < x < β.
Methods for inhomogeneous equations

In first year, you learned the method of undetermined coefficients


for constructing a particular solution uP to an inhomogeneous
second-order linear ODE Lu = f in some simple cases. We will
study this method systematically for higher-order linear ODEs with
constant coefficients.

We also discuss variation of parameters, a technique that applies


for general L and f, but which requires the evaluation of possibly
very difficult integrals.
Superposition of solutions
We now consider methods for finding a particular solution uP
satisfying LuP = f.
First note that if

f(x) = c1 f1 (x) + c2 f2 (x)

and if we know uP1 and uP2 satisfying

LuP1 = f1 and LuP2 = f2 ,

then we can put

uP (x) = c1 uP1 (x) + c2 uP2 (x),

because by the linearity of L,

LuP = c1 LuP1 + c2 LuP2 = c1 f1 + c2 f2 = f.


Polynomial solutions
Let L = p(D) be a linear differential operator of order m with
constant coefficients.

Theorem
Assume that L1 ≠ 0, or equivalently a0 = p(0) ≠ 0. For any
integer r ≥ 0, there exists a unique polynomial uP of degree r such
that LuP = x^r .
For simplicity, we prove the result only for the case m = 2. Thus,

Lu = a2 u″ + a1 u′ + a0 u,

where a0, a1, a2 are constants with a2 ≠ 0 and a0 ≠ 0.


Look for uP in the form

uP(x) = ∑_{j=0}^{r} c_j x^j / j! .

We find that

LuP = a2 ∑_{j=2}^{r} c_j x^{j−2}/(j−2)! + a1 ∑_{j=1}^{r} c_j x^{j−1}/(j−1)! + a0 ∑_{j=0}^{r} c_j x^j/j!

    = a0 c_r x^r/r! + (a0 c_{r−1} + a1 c_r) x^{r−1}/(r−1)!
      + ∑_{j=0}^{r−2} (a2 c_{j+2} + a1 c_{j+1} + a0 c_j) x^j/j! ,

which equals x^r if and only if

a0 c_j + a1 c_{j+1} + a2 c_{j+2} = 0,   0 ≤ j ≤ r − 2,
a0 c_{r−1} + a1 c_r = 0,
a0 c_r = r!.

This upper triangular system is uniquely solvable because
a0 ≠ 0.
An example
Let Lu = 3u″ − u′ + 2u and suppose we want a particular
solution to Lu = 8x³. The theorem ensures that

uP(x) = C + Ex + Fx² + Gx³

works for some C, E, F, G. In fact,

LuP = (2C − E + 6F) + (2E − 2F + 18G)x + (2F − 3G)x² + 2Gx³

so

2C − E + 6F = 0,
2E − 2F + 18G = 0,
2F − 3G = 0,
2G = 8,

and back substitution gives uP = −33 − 30x + 6x² + 4x³.
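
The back substitution is easy to check mechanically; a minimal Julia
sketch (the unknowns C, E, F, G are collected into a vector) solves the
upper triangular system with the standard library:

using LinearAlgebra

# Coefficient matrix of the system for (C, E, F, G) from LuP = 8x^3,
# rows in the same order as the equations above.
A = [ 2 -1  6   0
      0  2 -2  18
      0  0  2  -3
      0  0  0   2 ]
b = [0, 0, 0, 8]

println(A \ b)   # [-33.0, -30.0, 6.0, 4.0]: uP = -33 - 30x + 6x^2 + 4x^3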


Exponential solutions
Theorem
Let L = p(D) and µ ∈ C. If p(µ) ≠ 0, then the function

uP(x) = e^{µx} / p(µ)

satisfies LuP = e^{µx}.

Proof.
Follows at once because p(D)e^{µx} = p(µ)e^{µx}.

Example
A particular solution of

u″ + 4u′ − 3u = 3e^{2x}

is
uP = e^{2x}/3 ,

since p(2) = 4 + 8 − 3 = 9.
Product of polynomial and exponential

Theorem
Let L = p(D) and assume that p(µ) ≠ 0. For any integer r ≥ 0,
there exists a unique polynomial v of degree r such that
uP = v(x)e^{µx} satisfies LuP = x^r e^{µx}.

Proof.
Again, for simplicity, we prove the result only for m = 2.
Put v = e^{−µx} u so that u = v e^{µx}, and observe that

Lu = [ a2 (v″ + 2µv′ + µ²v) + a1 (v′ + µv) + a0 v ] e^{µx}
   = [ a2 v″ + (a1 + 2a2 µ)v′ + (a2 µ² + a1 µ + a0)v ] e^{µx} .

Thus, Lu = e^{µx} q(D)v where q(z) = a2 z² + (a1 + 2a2 µ)z + p(µ),
and our earlier result yields the desired v satisfying q(D)v = x^r
because q(D)1 = q(0) = p(µ) ≠ 0.
An example
Consider
2u″ + u′ − 3u = 9x e^{−2x} .
Here,
p(z) = 2z² + z − 3,
so p(−2) = 3 ≠ 0 and a particular solution uP = (Cx + E)e^{−2x}
exists. In fact, we find that

p(D)uP = (3Cx − 7C + 3E)e^{−2x}

so
3C = 9 and −7C + 3E = 0.
Thus, C = 3 and E = 7, giving

uP = (3x + 7)e^{−2x} .
Polynomial solutions: the remaining case
Theorem
Let L = p(D) and assume p(0) = p′(0) = · · · = p^{(k−1)}(0) = 0
but p^{(k)}(0) ≠ 0, where 1 ≤ k ≤ m − 1. For any integer r ≥ 0,
there exists a unique polynomial v of degree r such that
uP(x) = x^k v(x) satisfies LuP = x^r .

For simplicity, we discuss only the case m = 2 and k = 1; thus,

p(z) = a2 z² + a1 z with a1 = p′(0) ≠ 0.

Write

v(x) = ∑_{j=0}^{r} c_j x^j / (j + 1)!

so that

uP(x) = x v(x) = ∑_{j=0}^{r} c_j x^{j+1}/(j + 1)! = ∑_{j=1}^{r+1} c_{j−1} x^j / j! .
Since

uP′(x) = ∑_{j=1}^{r+1} c_{j−1} x^{j−1}/(j − 1)! = ∑_{j=0}^{r} c_j x^j / j!

and

uP″(x) = ∑_{j=1}^{r} c_j x^{j−1}/(j − 1)! = ∑_{j=0}^{r−1} c_{j+1} x^j / j! ,

we have

LuP = a1 c_r x^r/r! + ∑_{j=0}^{r−1} (a2 c_{j+1} + a1 c_j) x^j / j! .

The assumption that a1 ≠ 0 ensures a unique solution to the lower
triangular system

a1 c_j + a2 c_{j+1} = 0,   0 ≤ j ≤ r − 1,
a1 c_r = r!.
An example
Let Lu = u‴ + 2u″ and seek a particular solution to Lu = 12x².
The theorem ensures that

uP = x²(C + Ex + Fx²) = Cx² + Ex³ + Fx⁴

works for some C, E, F. In fact,

LuP = (4C + 6E) + (12E + 24F)x + 24Fx²

so

4C + 6E = 0,
12E + 24F = 0,
24F = 12,

and back substitution gives

uP = (x²/2)(3 − 2x + x²).
Exponential times polynomial: remaining case
Lemma
If u(x) = w(x)e^{µx} then

p(D)u = e^{µx} q(D)w where q(z) = ∑_{j=0}^{m} p^{(j)}(µ) z^j / j! .

Theorem
Let L = p(D) and assume p(µ) = p′(µ) = · · · = p^{(k−1)}(µ) = 0
but p^{(k)}(µ) ≠ 0, where 1 ≤ k ≤ m − 1. For any integer r ≥ 0,
there exists a unique polynomial v of degree r such that
uP(x) = x^k v(x)e^{µx} satisfies LuP = x^r e^{µx}.

Proof.
Since q^{(j)}(0) = p^{(j)}(µ) for all j, there is a unique polynomial v of
degree r such that w(x) = x^k v(x) satisfies q(D)w = x^r and hence
p(D)uP = e^{µx} q(D)w = e^{µx} x^r .
Proof of Lemma

p(D)(w e^{µx}) = ∑_{k=0}^{m} a_k D^k (w e^{µx})
             = ∑_{k=0}^{m} a_k ∑_{j=0}^{k} (k choose j) (D^j w)(D^{k−j} e^{µx})
             = ∑_{k=0}^{m} a_k ∑_{j=0}^{k} k!/(j!(k−j)!) (D^j w) µ^{k−j} e^{µx}
             = ∑_{j=0}^{m} (D^j w)/j! ∑_{k=j}^{m} a_k k(k−1)(k−2)···(k−j+1) µ^{k−j} e^{µx}
             = ∑_{j=0}^{m} (D^j w)/j! p^{(j)}(µ) e^{µx}
             = e^{µx} q(D)w.
An example
Consider the ODE

Lu = 12e^{2x} where Lu = u‴ − 4u″ + 4u′ .

Here, L = p(D) for p(z) = z³ − 4z² + 4z = z(z − 2)², so
p(2) = p′(2) = 0 but p″(2) ≠ 0. Thus, we try

uP = C x² e^{2x}

and find

uP′ = C(2x + 2x²)e^{2x} ,  uP″ = C(2 + 8x + 4x²)e^{2x} ,
uP‴ = C(12 + 24x + 8x²)e^{2x} ,

so LuP = 4Ce^{2x} and we require 4C = 12. Therefore, a particular
solution is
uP = 3x² e^{2x} .
Complex conjugate roots
Consider

Lu = p(D)u = u‴ + u′ + 10u = 13e^x sin 2x.

Here, p(z) = z³ + z + 10 = [(z − 1)² + 4](z + 2), so p(1 ± 2i) = 0,
and

e^x sin 2x = e^x (e^{2ix} − e^{−2ix}) / (2i)

is a linear combination of e^{(1+2i)x} and e^{(1−2i)x}. Therefore put

uP(x) = C x e^{(1+2i)x} + E x e^{(1−2i)x} = x e^x (F cos 2x + G sin 2x).

We find that if F = −3/4 and G = −1/2 then

LuP = (−8F + 12G)e^x cos 2x + (−12F − 8G)e^x sin 2x = 13e^x sin 2x,

so
uP = −(x e^x / 4)(3 cos 2x + 2 sin 2x).
Variation of parameters
What if f is not a polynomial times an exponential, or if L does
not have constant coefficients?
Consider a linear, second-order, inhomogeneous ODE with leading
coefficient 1:

Lu = u″(x) + p(x)u′(x) + q(x)u(x) = f(x).        (5)

Let u1 (x) and u2 (x) be linearly independent solutions to the


homogeneous equation and let W(x) = W(x; u1 , u2 ) denote their
Wronskian. Thus,

Lu1 = 0,   Lu2 = 0,   W ≠ 0.

We seek v1 and v2 such that

u(x) = v1 (x)u1 (x) + v2 (x)u2 (x)

is a (particular) solution to Lu = f.
Variation of parameters (continued)
To simplify the expression

u′ = v1′ u1 + v1 u1′ + v2′ u2 + v2 u2′

we impose the condition v1′ u1 + v2′ u2 = 0; then (as if v1 and v2
were constant parameters)

u′ = v1 u1′ + v2 u2′ .

A short calculation now shows

Lu = v1 Lu1 + v2 Lu2 + v1′ u1′ + v2′ u2′ = v1′ u1′ + v2′ u2′ ,

since by assumption Lu1 = 0 = Lu2 .

Conclusion: u = v1 u1 + v2 u2 satisfies Lu = f if

v1′ u1 + v2′ u2 = 0,
v1′ u1′ + v2′ u2′ = f.
Thus, we have a pair of equations for the unknowns v1′ and v2′. In
matrix form,

| u1(x)   u2(x)  | | v1′(x) |   |  0   |
| u1′(x)  u2′(x) | | v2′(x) | = | f(x) | ,

so

| v1′(x) |             |  u2′(x)  −u2(x) | |  0   |
| v2′(x) | = (1/W(x))  | −u1′(x)   u1(x) | | f(x) | ,

or in other words,

v1′(x) = −u2(x)f(x)/W(x) and v2′(x) = u1(x)f(x)/W(x).

Example
Find the general solution to

3u″ − 6u′ + 30u = e^x tan 3x.
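
A worked sketch of this example (not in the original notes), reusing the
Wronskian computed earlier: dividing by the leading coefficient 3 gives
u″ − 2u′ + 10u = (1/3)e^x tan 3x, with homogeneous solutions
u1 = e^x cos 3x, u2 = e^x sin 3x and W = 3e^{2x}. Then

v1′ = −u2 f/W = −(1/9) sin 3x tan 3x,   v2′ = u1 f/W = (1/9) sin 3x,

so v2 = −(1/27) cos 3x and

v1 = −(1/9) ∫ sin²3x / cos 3x dx = −(1/27)( ln|sec 3x + tan 3x| − sin 3x ).

Combining, uP = v1 u1 + v2 u2 = −(1/27) e^x cos 3x ln|sec 3x + tan 3x|,
and the general solution is u = e^x (c1 cos 3x + c2 sin 3x) + uP.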


Solution via power series

If L has variable coefficients, then we cannot expect in general that


the solution of Lu = 0 is expressible in terms of elementary
functions like polynomials, trigonometric functions, exponentials
etc. Power series provide a flexible way to represent u in this case.
Constructing a series solution

Consider the initial-value problem

Lu = (1 − x²)u″ − 5xu′ − 4u = 0,   u(0) = 1,   u′(0) = 2.

Look for a solution in the form of a power series

u(x) = ∑_{k=0}^{∞} A_k x^k = A0 + A1 x + A2 x² + · · · .

Formal calculations show that

Lu = ∑_{k=0}^{∞} (k + 2)[ (k + 1)A_{k+2} − (k + 2)A_k ] x^k ,

and the initial conditions imply A0 = 1 and A1 = 2.


Convergence?
Since Lu is identically zero iff the coefficient of x^k vanishes for
every k, we obtain the recurrence relation

A_{k+2} = (k + 2)/(k + 1) A_k for k = 0, 1, 2, . . . .

Thus,

A0 = 1,  A1 = 2,  A2 = 2,  A3 = 3,  . . . ,

giving
u(x) = 1 + 2x + 2x² + 3x³ + · · · .

Since

lim_{k→∞} | A_{k+2} x^{k+2} / (A_k x^k) | = lim_{k→∞} (k + 2)/(k + 1) x² = x² ,

the ratio test shows that ∑_{j=0}^{∞} A_{2j} x^{2j} and ∑_{j=0}^{∞} A_{2j+1} x^{2j+1}
converge for x² < 1 but diverge for x² > 1.
General case
Consider a general second-order, linear, homogeneous ODE

Lu = a2(x)u″ + a1(x)u′ + a0(x)u = 0.

Equivalently,
u″ + p(x)u′ + q(x)u = 0,
where
p(x) = a1(x)/a2(x) and q(x) = a0(x)/a2(x).

Assume that a_j is analytic at 0 for 0 ≤ j ≤ 2, and that a2(0) ≠ 0.
Then p and q are analytic at 0, that is, they admit power series
expansions

p(z) = ∑_{k=0}^{∞} p_k z^k and q(z) = ∑_{k=0}^{∞} q_k z^k for |z| < ρ,

for some ρ > 0.


Formal expansions

If
u(z) = ∑_{k=0}^{∞} A_k z^k

then we find that

Lu(z) = (2A2 + p0 A1 + q0 A0)
      + (6A3 + 2p0 A2 + p1 A1 + q0 A1 + q1 A0)z + · · · ,

where, on the RHS, the coefficient of z^{n−1} for a general n ≥ 1 is

(n + 1)n A_{n+1} + ∑_{j=0}^{n−1} [ (n − j)p_j A_{n−j} + q_j A_{n−1−j} ].
Convergence theorem

Given u(0) and u′(0), we put

A0 = u(0) and A1 = u′(0),

and compute recursively

A_{n+1} = −1/(n(n + 1)) ∑_{j=0}^{n−1} [ (n − j)p_j A_{n−j} + q_j A_{n−1−j} ],   n ≥ 1.

Theorem
If the coefficients p(z) and q(z) are analytic for |z| < ρ, then the
formal power series for the solution u(z), constructed above, is
also analytic for |z| < ρ.
Previous example

Earlier we considered

Lu = (1 − x²)u″ − 5xu′ − 4u = 0,   u(0) = 1,   u′(0) = 2.

In this case,

p(z) = −5z/(1 − z²) = −5 ∑_{k=0}^{∞} z^{2k+1}

and

q(z) = −4/(1 − z²) = −4 ∑_{k=0}^{∞} z^{2k}

are analytic for |z| < 1, so the theorem guarantees that u(z), given
by the formal power series, is also analytic for |z| < 1.
Expansion about a point other than 0

Suppose we want a power series expansion about a point c 6= 0, for


instance because the initial conditions are given at x = c.

A simple change of the independent variable allows us to write

u = ∑_{k=0}^{∞} A_k (z − c)^k = ∑_{k=0}^{∞} A_k Z^k where Z = z − c.

Since du/dz = du/dZ and d²u/dz² = d²u/dZ², we obtain the
translated equation

d²u/dZ² + p(Z + c) du/dZ + q(Z + c)u = 0.

Now compute the A_k using the series expansions of p(Z + c) and
q(Z + c) in powers of Z.
Example

We construct a power series solution about z = 1 to Airy’s
equation:
u″ − zu = 0.

Put Z = z − 1 and find that u″ − zu = u″ − (Z + 1)u equals

∑_{k=0}^{∞} k(k − 1)A_k Z^{k−2} − ∑_{k=0}^{∞} A_k Z^{k+1} − ∑_{k=0}^{∞} A_k Z^k

= ∑_{k=0}^{∞} (k + 2)(k + 1)A_{k+2} Z^k − ∑_{k=1}^{∞} A_{k−1} Z^k − ∑_{k=0}^{∞} A_k Z^k

= (2A2 − A0) + ∑_{k=1}^{∞} [ (k + 2)(k + 1)A_{k+2} − A_{k−1} − A_k ] Z^k .

Thus, the coefficients must satisfy 2A2 − A0 = 0 and

(k + 2)(k + 1)A_{k+2} − A_{k−1} − A_k = 0 for all k ≥ 1,

so
A2 = A0/2 and A_{k+2} = (A_{k−1} + A_k) / ((k + 2)(k + 1)) for k ≥ 1.

We find that

u(z) = A0 [ 1 + (z − 1)²/2 + (z − 1)³/6 + (z − 1)⁴/24 + · · · ]
     + A1 [ (z − 1) + (z − 1)³/6 + (z − 1)⁴/12 + · · · ].
6 12
Singular ODEs

Recall that our basic existence and uniqueness theorem for Lu = f
assumes that L is not singular, that is, the leading coefficient of L
does not vanish on the interval of interest. However, some
important applications lead to singular ODEs, so we must now
address this case.
Singular ODEs of second order
Consider

a2(x)u″ + a1(x)u′ + a0(x)u = 0 for a ≤ x ≤ b,

and suppose that a2(x0) = 0 for some x0 with a < x0 < b, but
a2(x) ≠ 0 if x ≠ x0. Put

b_j(y) = a_j(x) and v(y) = u(x) where y = x − x0 ,

so that, with c = a − x0 < 0 < d = b − x0 ,

b2(y)v″ + b1(y)v′ + b0(y)v = 0 for c ≤ y ≤ d.

Since y = 0 when x = x0 , we have b2(0) = 0.

In this way, it suffices to consider the case when the leading


coefficient vanishes at the origin.
Cauchy–Euler ODE

A second-order Cauchy–Euler ODE has the form

Lu = ax²u″ + bxu′ + cu = f(x),

where a, b and c are constants, with a ≠ 0. This ODE is singular
at x = 0.

Noticing that

Lx^r = [ ar(r − 1) + br + c ] x^r ,

we see that u = x^r is a solution of the homogeneous equation
(f = 0) iff
ar(r − 1) + br + c = 0.
Factorization
Suppose ar(r − 1) + br + c = a(r − r1)(r − r2). If r1 ≠ r2 then
the general solution of the homogeneous equation Lu = 0 is

u(x) = C1 x^{r1} + C2 x^{r2} ,   x > 0.

Lemma
If r1 = r2 then the general solution of the homogeneous
Cauchy–Euler equation Lu = 0 is

u(x) = C1 x^{r1} + C2 x^{r1} log x,   x > 0.

Example
Solve x²u″ − xu′ + u = 0.

Example
Solve 2(x − 2)²u″ − 3(x − 2)u′ − 3u = 0.
Proof of the lemma
Since r1 = r2 the function F(x, r) = x^r satisfies

ax²F″ + bxF′ + cF = a(r − r1)² x^r ,

where the dash means ∂/∂x. Put

v(x) = ∂F/∂r |_{r=r1} = ∂/∂r (e^{r log x}) |_{r=r1} = e^{r1 log x} log x = x^{r1} log x

and observe that

ax²v″ + bxv′ + cv = [ ax² ∂²/∂x² (∂F/∂r) + bx ∂/∂x (∂F/∂r) + c ∂F/∂r ]_{r=r1}
                  = ∂/∂r [ ax² ∂²F/∂x² + bx ∂F/∂x + cF ]_{r=r1}
                  = ∂/∂r [ a(r − r1)² x^r ]_{r=r1}
                  = a[ 2(r − r1)x^r + (r − r1)² x^r log x ]_{r=r1} = 0.
More general singular ODEs
A number of important applications lead to ODEs that can be
written in the Frobenius normal form

z²u″ + zP(z)u′ + Q(z)u = 0,

where P(z) and Q(z) are analytic at z = 0:

P(z) = ∑_{k=0}^{∞} P_k z^k and Q(z) = ∑_{k=0}^{∞} Q_k z^k ,   |z| < ρ.   (6)

Notice that u″ + p(z)u′ + q(z)u = 0 with p(z) = z^{−1}P(z) and
q(z) = z^{−2}Q(z), but p and q are not analytic at z = 0 unless
P(0) = 0 and Q(0) = Q′(0) = 0.

So in general we cannot expect a solution u(z) to be analytic


at z = 0.
A clue
We can think of an ODE in Frobenius normal form as a
Cauchy–Euler ODE with variable coefficients.
For z near 0 we have P(z) ≈ P0 and Q(z) ≈ Q0 , so we might
expect u(z) to behave like a solution of

z²u″ + P0 zu′ + Q0 u = 0.

We therefore consider the indicial polynomial

I(r) = r(r − 1) + P0 r + Q0 = (r − r1)(r − r2).

If r1 ≠ r2 then the approximating Cauchy–Euler ODE has the
general solution c1 z^{r1} + c2 z^{r2}, so it is natural to seek a solution in
the form

u(z) = z^r ∑_{k=0}^{∞} A_k z^k = ∑_{k=0}^{∞} A_k z^{k+r} ,   |z| < ρ, with A0 ≠ 0.
An example

Consider
Lu = 2z²u″ + 7zu′ − (z² + 3)u = 0.
Here, P(z) = 7/2 and Q(z) = −(z² + 3)/2 are trivially analytic
at z = 0 (since they are polynomials).

The approximations P(z) ≈ P0 = 7/2 and Q(z) ≈ Q0 = −3/2 lead
to the Cauchy–Euler equation z²u″ + (7/2)zu′ − (3/2)u = 0, or

2z²u″ + 7zu′ − 3u = 0.

Thus, the indicial polynomial is

2r(r − 1) + 7r − 3 = 2r² + 5r − 3 = (2r − 1)(r + 3),

so r1 = 1/2 and r2 = −3.

Using

u = ∑_{k=0}^{∞} A_k z^{k+r} ,   u′ = ∑_{k=0}^{∞} (k + r)A_k z^{k+r−1} ,
u″ = ∑_{k=0}^{∞} (k + r)(k + r − 1)A_k z^{k+r−2} ,

we find that

Lu = (2z²u″ + 7zu′ − 3u) − z²u
   = ∑_{k=0}^{∞} [ 2(k + r)(k + r − 1) + 7(k + r) − 3 ] A_k z^{k+r} − ∑_{k=0}^{∞} A_k z^{k+r+2} .
Since

2(k + r)(k + r − 1) + 7(k + r) − 3 = 2(k + r)² + 5(k + r) − 3
                                   = [2(k + r) − 1][(k + r) + 3]

and

∑_{k=0}^{∞} A_k z^{k+r+2} = ∑_{k=2}^{∞} A_{k−2} z^{k+r} ,

it follows that

Lu = (2r − 1)(r + 3)A0 z^r + (2r + 1)(r + 4)A1 z^{r+1}
   + ∑_{k=2}^{∞} { (2k + 2r − 1)(k + r + 3)A_k − A_{k−2} } z^{k+r} .

Conclusion: u is a solution if r ∈ {1/2, −3} with

A1 = 0,   A_k = A_{k−2} / ((2k + 2r − 1)(k + r + 3)) for all k ≥ 2.

First solution: r = 1/2 with

A1 = 0,   A_k = A_{k−2} / (k(2k + 7)) for all k ≥ 2,

so
A2 = A0/22,   A3 = A1/39 = 0,   A4 = A2/60 = A0/1320,   . . .
and
u(z) = A0 z^{1/2} ( 1 + z²/22 + z⁴/1320 + · · · ).

Second solution: r = −3 with

A1 = 0,   A_k = A_{k−2} / (k(2k − 7)) for all k ≥ 2,

so
A2 = −A0/6,   A3 = −A1/3 = 0,   A4 = A2/4 = −A0/24,   . . .
and
u(z) = A0 z^{−3} ( 1 − z²/6 − z⁴/24 + · · · ).

General solution of Lu = 0:

u(z) = A z^{1/2} ( 1 + z²/22 + z⁴/1320 + · · · )
     + B z^{−3} ( 1 − z²/6 − z⁴/24 + · · · ).
General case

Now consider
z²u″ + zP(z)u′ + Q(z)u = 0
for P(z) and Q(z) satisfying (6). Formal manipulations show that
Lu(z) equals

I(r)A0 z^r + ∑_{k=1}^{∞} { I(k + r)A_k + ∑_{j=0}^{k−1} [(j + r)P_{k−j} + Q_{k−j}]A_j } z^{k+r} ,

so we define A0(r) = 1 and

A_k(r) = −1/I(k + r) ∑_{j=0}^{k−1} [(j + r)P_{k−j} + Q_{k−j}] A_j(r),   k ≥ 1,

provided I(k + r) ≠ 0 for all k ≥ 1.


Choice of exponent

The preceding calculations show that the series

F(z; r) = ∑_{k=0}^{∞} A_k(r) z^{k+r}

satisfies
z²F″ + zP(z)F′ + Q(z)F = I(r)z^r ,
with
I(r) = r(r − 1) + P0 r + Q0 = (r − r1)(r − r2).

Assume, with no loss of generality, that Re r1 ≥ Re r2. It follows
that I(k + r1) ≠ 0 for all integers k ≥ 1, and therefore
u1(z) = F(z; r1) is (formally) a solution.

If r1 − r2 is not a whole number, then a second, linearly
independent solution is u2(z) = F(z; r2).
Roots differing by an integer
Suppose that r1 = r2. In this case, I(r) = (r − r1)² and so

z²F″ + zP(z)F′ + Q(z)F = (r − r1)² z^r .

The function v = ∂F/∂r satisfies

z²v″ + zP(z)v′ + Q(z)v = 2(r − r1)z^r + (r − r1)² z^r log z,

and the RHS is zero if r = r1, so a second, linearly independent
solution is

u2(z) = ∂F/∂r (z; r1) = ∑_{k=0}^{∞} A_k′(r1) z^{k+r1} + ∑_{k=0}^{∞} A_k(r1) z^{k+r1} log z,

where the second sum equals u1(z) log z.

Even worse complications arise if r1 = r2 + n for an integer n ≥ 1.


Concluding remark about the indicial equation
We can find r1 and r2 easily, without any manipulation of series, to
quickly determine the qualitative behaviour of solutions as z → 0.

Example
To see that all nontrivial solutions of the ODE

z²u″ + 3(e^{3z} − 1)u′ + 15(cosh z)u = 0

are unbounded as z → 0, use Taylor expansion to write

z²u″ + 3( 3z + (3z)²/2! + (3z)³/3! + · · · )u′ + 15( 1 + z²/2! + z⁴/4! + · · · )u = 0.

The indicial polynomial is

r(r − 1) + 9r + 15 = r² + 8r + 15 = (r + 3)(r + 5),

which has roots r1 = −3 < 0 and r2 = −5 < 0.


Bessel and Legendre equations

We wrap up this part of the course with two particularly important


examples of second-order, linear ODEs with variable coefficients.
Both occur several times later in the course.
Friedrich Wilhelm Bessel 1784–1846
Bessel equation

The Bessel equation with parameter ν is

z²u″ + zu′ + (z² − ν²)u = 0.

This ODE is in Frobenius normal form, with indicial polynomial

I(r) = r(r − 1) + r − ν² = (r + ν)(r − ν),

and we seek a series solution

u(z) = ∑_{k=0}^{∞} A_k z^{k+r} .

We assume Re ν ≥ 0, so r1 = ν and r2 = −ν.


Recurrence relation
We find that if

(r + 1 + ν)(r + 1 − ν)A1 = 0,
(k + r + ν)(k + r − ν)A_k + A_{k−2} = 0,   k ≥ 2,

then
z²u″ + zu′ + (z² − ν²)u = (r + ν)(r − ν)A0 z^r .

Taking r = ν gives

A_k = −A_{k−2} / (k(k + 2ν)) for k ≥ 2,

so with A0 arbitrary and A1 = 0 we obtain

u(z) = A0 z^ν [ 1 − (z/2)²/(1 + ν) + (z/2)⁴/(2!(2 + ν)(1 + ν))
                − (z/2)⁶/(3!(3 + ν)(2 + ν)(1 + ν)) + · · · ].
Bessel function
With the normalisation

A0 = 1 / (2^ν Γ(1 + ν)),

the series solution is called the Bessel function of order ν and is
denoted

Jν(z) = (z/2)^ν/Γ(1 + ν) [ 1 − (z/2)²/(1 + ν) + (z/2)⁴/(2!(1 + ν)(2 + ν)) − · · · ].

From the functional equation Γ(1 + z) = zΓ(z) we see that

Jν(z) = (z/2)^ν/Γ(1 + ν) − (z/2)^{ν+2}/Γ(2 + ν) + (z/2)^{ν+4}/(2!Γ(3 + ν))
        − (z/2)^{ν+6}/(3!Γ(4 + ν)) + · · ·

and so

Jν(z) = ∑_{k=0}^{∞} (−1)^k (z/2)^{2k+ν} / ( k! Γ(k + 1 + ν) ).
Bessel function of negative order

If ν is not an integer, then a second, linearly independent, solution
is

J_{−ν}(z) = ∑_{k=0}^{∞} (−1)^k (z/2)^{2k−ν} / ( k! Γ(k + 1 − ν) ).

For an integer ν = n ∈ Z, since Γ(n + 1) = n! we have

J_n(z) = ∑_{k=0}^{∞} (−1)^k (z/2)^{2k+n} / ( k! (k + n)! ).

Also, since 1/Γ(z) = 0 for z = 0, −1, −2, . . . , we find that J_n and
J_{−n} are linearly dependent; in fact,

J_{−n}(z) = (−1)^n J_n(z).
Bessel functions of integer order
Neumann function
The Neumann function (or Bessel function of the second kind) is

Yν(z) = ( Jν(z) cos νπ − J_{−ν}(z) ) / sin νπ ,   if ν ∉ Z.

For n ∈ Z, L’Hospital’s rule shows that if ν → n then Yν(z) tends
to a finite limit
Y_n(z) = lim_{ν→n} Yν(z).

The functions Jν and Yν are linearly independent solutions of
Bessel’s equation for all complex ν.

As z → 0 with ν fixed,

Jν(z) ∼ (z/2)^ν / Γ(ν + 1),   ν ∉ {−1, −2, −3, . . .},

Y0(z) ∼ (2/π) log z,   Yν(z) ∼ −Γ(ν) / ( π(z/2)^ν ),   Re ν > 0.
Neumann functions of integer order
Interlacing of zeros
Legendre equation
The Legendre equation with parameter ν is

(1 − z²)u″ − 2zu′ + ν(ν + 1)u = 0.

This ODE is not singular at z = 0, so the solution has an ordinary
Taylor series expansion
u = ∑_{k=0}^{∞} A_k z^k .

The A_k must satisfy

(k + 1)(k + 2)A_{k+2} − [ k(k + 1) − ν(ν + 1) ]A_k = 0

for k ≥ 0, and since

k(k + 1) − ν(ν + 1) = (k − ν)(k + ν + 1),

the recurrence relation is

A_{k+2} = (k − ν)(k + ν + 1) / ( (k + 1)(k + 2) ) A_k for k ≥ 0.
General solution
We have
u(z) = A0 u0(z) + A1 u1(z)
where

u0(z) = 1 − ν(ν + 1)/2! z² + (ν − 2)ν(ν + 1)(ν + 3)/4! z⁴ − · · ·

and

u1(z) = z − (ν − 1)(ν + 2)/3! z³
          + (ν − 3)(ν − 1)(ν + 2)(ν + 4)/5! z⁵ − · · · .

Suppose now that ν = n is a non-negative integer. If n is even
then the series for u0(z) terminates, whereas if n is odd then the
series for u1(z) terminates.
Legendre polynomial

The terminating solution is called the Legendre polynomial of


degree n and is denoted by Pn (z) with the normalization

Pn (1) = 1.

The first few Legendre polynomials are

P0(z) = 1,                P3(z) = (1/2)(5z³ − 3z),
P1(z) = z,                P4(z) = (1/8)(35z⁴ − 30z² + 3),
P2(z) = (1/2)(3z² − 1),   P5(z) = (1/8)(63z⁵ − 70z³ + 15z).

Notice that Pn is an even or odd function according to whether n


is even or odd.
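
One way to generate these on a computer (a sketch using the series
recurrence above; the function name is ours) is to build the terminating
series and rescale it so that Pn(1) = 1:

# Coefficients of the terminating series for integer n, from
# A_{k+2} = (k-n)(k+n+1) / ((k+1)(k+2)) * A_k, normalised so P_n(1) = 1.
function legendre_coeffs(n)
    A = zeros(n + 1)
    A[n % 2 == 0 ? 1 : 2] = 1.0          # start from A_0 (n even) or A_1 (n odd)
    for k in (n % 2):2:(n - 2)
        A[k+3] = (k - n) * (k + n + 1) / ((k + 1) * (k + 2)) * A[k+1]
    end
    A ./ sum(A)                          # the value at z = 1 is the coefficient sum
end

println(legendre_coeffs(3))   # [0.0, -1.5, 0.0, 2.5]: P3(z) = (5z^3 - 3z)/2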
Behaviour of Legendre polynomials
Part II

Dynamical Systems
Introduction

In this part of the course, we survey several topics in the theory of


differential equations (DEs). None is discussed in any depth.
Instead, the aim is to introduce you to a few of the key themes in
this subject, some of which are developed further in our third-year
mathematics courses.
Outline

Examples and terminology

Existence and uniqueness

Practical solution methods

Linear dynamical systems

Higher-order, linear, scalar equations

Stability

Final remarks on nonlinear DEs


Examples and terminology

We begin with some examples of how systems of differential


equations arise in applications, and see how all such problems can
be formulated as a first-order system
dx/dt = F(x).
Such a formulation leads to a natural geometric interpretation of a
solution.
Predator and prey populations
Lotka–Volterra equations
Simplified ecology with two species:

F(t) = number of foxes at time t,


R(t) = number of rabbits at time t.

Assume populations large enough that F and R can be treated as


smoothly varying in time.

In the 1920s, Alfred Lotka and Vito Volterra independently
proposed the predator–prey model

dF/dt = −aF + αFR,   F(0) = F0 ,
dR/dt = bR − βFR,    R(0) = R0 .

Here, a, α, b and β are non-negative constants.
Zero predation case

If α = 0 = β then the system uncouples to give

dF/dt = −aF and dR/dt = bR,

so
F(t) = F0 e^{−at} and R(t) = R0 e^{bt} .

Interpretation: if the foxes fail to catch any rabbits, then the foxes
starve (F → 0) and the rabbits multiply without limit (R → ∞).
Interaction terms

Rewrite the equations as

(1/F) dF/dt = −a + αR,
(1/R) dR/dt = b − βF.

So αR is the relative rate of increase in the fox population due to
predation on rabbits, and βF is the relative rate of decrease in the
rabbit population due to predation by foxes.
Periodic solutions

Example with a = 1.0, α = 0.5, b = 1.5 and β = 0.75.


Vector field
Any first-order system of N ODEs in the form

dx1/dt = F1(x1, x2, . . . , xN),   x1(0) = x10 ,
dx2/dt = F2(x1, x2, . . . , xN),   x2(0) = x20 ,
   ⋮
dxN/dt = FN(x1, x2, . . . , xN),   xN(0) = xN0 ,

can be written in vector notation as

dx/dt = F(x),   x(0) = x0 .

The system of ODEs is determined by the vector
field F : R^N → R^N .
Predator–prey example

Recall
dF/dt = −aF + αFR,   F(0) = F0 ,
dR/dt = bR − βFR,    R(0) = R0 .

So defining

x = [F; R],   F(x) = [−aF + αFR; bR − βFR],   x0 = [F0; R0],

we have

dx/dt = [dF/dt; dR/dt] = [−aF + αFR; bR − βFR] = F(x),

with x(0) = x0 .
Geometric viewpoint
Think of the trajectory x(t) as a parametric curve in the phase
space R^N. Then dx/dt = F(x) means that, along any trajectory,
F(x(t)) always points in the forward tangent direction to the
trajectory, with the speed of x(t) equal to the magnitude of F.
Non-autonomous ODEs

A system of ODEs of the form

dx/dt = F(x)

is said to be autonomous.

For example, we have just seen that the Lotka–Volterra equations
are autonomous.

In a non-autonomous system, F may depend explicitly on t:

dx/dt = F(x, t).        (7)
Equivalent autonomous system

Given a non-autonomous system (7), let

y = [x; t] and G(y) = [F(x, t); 1].

If x = x(t) is a solution of (7), then y = y(t) is a solution of the
autonomous system

dy/dt = [dx/dt; dt/dt] = [F(x, t); 1] = G(y),

and vice versa.

Therefore, sufficient (in principle) to develop theory for the


autonomous case.
Second-order ODE
Consider an initial-value problem for a general (possibly
non-autonomous) second-order ODE

d²x/dt² = f(x, dx/dt, t), with x = x0 and dx/dt = y0 at t = 0.

If x = x(t) is a solution, and if we let y = dx/dt, then

dy/dt = d²x/dt² = f(x, dx/dt, t) = f(x, y, t),

that is, (x, y) is a solution of the first-order system

dx/dt = y,           x(0) = x0 ,
dy/dt = f(x, y, t),  y(0) = y0 .
Simple oscillator as first-order system
The second-order ODE

m ẍ + r0 ẋ + k0 x = f(t)

is equivalent to the (non-autonomous) pair of first-order ODEs

ẋ = v,
v̇ = (1/m)( f(t) − r0 v − k0 x ).

That is,
dx/dt = F(x, t),
where

x = [x; v] and F(x, t) = [ v ; m^{−1}( f(t) − r0 v − k0 x ) ].
Immune response to a viral infection

(Figures: a human T-cell; a simian virus)


Roy Anderson and Robert May, Infectious Diseases of Humans:
Dynamics and Control, Oxford University Press, 1991.

Simple model of infection with dependent variables

V(t) = density of the virus,


E(t) = density of the effector cells.

Suppose that in the absence of infection, behaviour of effector cells


(e.g., lymphocytes) described by

dE
= Λ − µE,
dt
for constants Λ > 0 (recruitment rate from bone marrow) and
µ > 0 (death rate).
If E0 is the initial density of effector cells, then

E(t) = E0 e^{−µt} + (Λ/µ)(1 − e^{−µt}),

so E(t) → Ê ≡ Λ/µ as t → ∞.

The presence of the virus causes E to grow, so that now

dE/dt = Λ − µE + εVE,

for a constant ε > 0 (coefficient of proliferation).
In the absence of an immune response (E = 0) the virus population
obeys
dV/dt = rV,
for a constant r > 0 (intrinsic growth rate), so V = V0 e^{rt} .

However, for E > 0,

dV/dt = rV − σVE = (r − σE)V,

for a constant σ > 0. Provided

r > σÊ = σΛ/µ,

we have dV/dt > 0, so infection can occur (V can grow).

The complete 2 × 2 dynamical system:

dE/dt = Λ − µE + εVE,   E(0) = E0 ,
dV/dt = rV − σVE,       V(0) = V0 .

Example with parameters

Λ = 1.0,   µ = 0.5,   ε = 0.01,   r = 1.25,   σ = 0.01,

and initial conditions E0 = 2.0 and V0 = 1.0.

Notice Ê = Λ/µ = 2, so r > σÊ.


Numerical solution
Phase portrait
Existence and uniqueness

The most fundamental question about a dynamical system


dx
= F(x, t)
dt
is
For a given initial value x0 , does a solution x(t) satisfying
x(0) = x0 exist, and if so is this solution unique?

Answer is yes, whenever the vector field F is Lipschitz.


Lipschitz constant

Definition
The number L is a Lipschitz constant for a function f : [a, b] → R
if
|f(x) − f(y)| ≤ L|x − y| for all x, y ∈ [a, b].

Example
Consider f(x) = 2x² − x + 1 for 0 ≤ x ≤ 1. Since

f(x) − f(y) = 2(x² − y²) − (x − y) = 2(x + y)(x − y) − (x − y)
            = (2x + 2y − 1)(x − y),

we have |f(x) − f(y)| = |2x + 2y − 1| |x − y|, so a Lipschitz constant
is
L = max_{x,y∈[0,1]} |2x + 2y − 1| = 3.
Lipschitz implies (uniformly) continuous

We say that the function f : [a, b] → R is Lipschitz if a Lipschitz


constant for f exists.

Theorem
If f is Lipschitz then f is (uniformly) continuous.

Proof.
Suppose L is a Lipschitz constant for f : [a, b] → R. Given ε > 0,
if δ = ε/L then

|f(x) − f(y)| ≤ L|x − y| < ε whenever |x − y| < δ,

so f is (uniformly) continuous on [a, b].


Continuous does not imply Lipschitz
Example
Consider the (uniformly) continuous function

f(x) = 3 + √x for 0 ≤ x ≤ 4.

In this case, if x, y ∈ (0, 4] then

f(x) − f(y) = √x − √y = (√x − √y) × (√x + √y)/(√x + √y)
            = (x − y)/(√x + √y),

so if a Lipschitz constant L exists then

L ≥ |f(x) − f(y)|/|x − y| = 1/(√x + √y)

for arbitrarily small x and y, a contradiction.


Continuously differentiable implies Lipschitz

Definition
A function f : I → R is C^k if f, f′, f″, . . . , f^{(k)} all exist and are
continuous on the interval I.

Theorem
For any closed and bounded interval I = [a, b], if f is C¹ on I then
L = max_{x∈I} |f′(x)| is a Lipschitz constant for f on I.

Proof.
Given a ≤ x < y ≤ b, the Mean Value Theorem says that there
exists a number c (depending on x and y) such that

f(x) − f(y) = f′(c)(x − y) with x < c < y,

so |f(x) − f(y)| = |f′(c)| |x − y| ≤ L|x − y|.


Equivalent integral equation

Consider an initial-value problem for a (scalar) ODE,

dx/dt = f(x) for t > 0, with x(0) = x0 .        (8)

If x = x(t) is a solution then

x(t) = x0 + ∫₀ᵗ f(x(s)) ds.                     (9)

Conversely, if x = x(t) is a continuous function satisfying the
(Volterra) integral equation (9) then x is a solution of the
initial-value problem (8).
Picard iterates

We try to solve (9) by fixed point iteration, letting

x1(t) = x0 ,
x2(t) = x0 + ∫₀ᵗ f(x1(s)) ds,
x3(t) = x0 + ∫₀ᵗ f(x2(s)) ds,

and in general,

xk(t) = x0 + ∫₀ᵗ f(x_{k−1}(s)) ds for k ≥ 1.      (10)
Increments
Subtracting

xk(t) = x0 + ∫₀ᵗ f(x_{k−1}(s)) ds

from

x_{k+1}(t) = x0 + ∫₀ᵗ f(xk(s)) ds

gives

x_{k+1}(t) − xk(t) = ∫₀ᵗ [ f(xk(s)) − f(x_{k−1}(s)) ] ds,

so if L is a Lipschitz constant for f then

|x_{k+1}(t) − xk(t)| ≤ L ∫₀ᵗ |xk(s) − x_{k−1}(s)| ds,   t ≥ 0.
Estimating the increments
Put

δk(t) = |x_{k+1}(t) − xk(t)| and ∆k(t) = max_{0≤s≤t} δk(s).

We have
δk(t) ≤ L ∫₀ᵗ δ_{k−1}(s) ds,

so for 0 ≤ t ≤ T,

δ2(t) ≤ L ∫₀ᵗ δ1(s) ds ≤ L ∫₀ᵗ ∆1(T) ds ≤ ∆1(T) Lt,
δ3(t) ≤ L ∫₀ᵗ δ2(s) ds ≤ L ∫₀ᵗ ∆1(T) Ls ds = ∆1(T) L² t²/2,
δ4(t) ≤ L ∫₀ᵗ δ3(s) ds ≤ L ∫₀ᵗ ∆1(T) L² s²/2 ds = ∆1(T) L³ t³/3!,
   ⋮
Convergence of the Picard iterates
By induction on k,

δk(t) ≤ ∆1(T) (Lt)^{k−1}/(k − 1)! for |t| ≤ T.

The telescoping sum

xk(t) − x0(t) = [x1(t) − x0(t)] + [x2(t) − x1(t)] + · · · + [xk(t) − x_{k−1}(t)]
             = ∑_{j=0}^{k−1} [x_{j+1}(t) − x_j(t)]

converges uniformly for |t| ≤ T because

∑_{j=0}^{∞} |x_{j+1}(t) − x_j(t)| ≤ ∆1(T) ∑_{j=1}^{∞} (Lt)^{j−1}/(j − 1)! = ∆1(T) e^{Lt} .
Thus, we can define x(t) by the uniformly convergent series

x(t) = lim_{k→∞} xk(t) = x0 + ∑_{j=0}^{∞} [x_{j+1}(t) − x_j(t)].

In turn, f(xk(s)) converges uniformly to f(x(s)) for |s| ≤ T, so by
sending k → ∞ in (10) we have

x(t) = lim_{k→∞} x_{k+1}(t) = x0 + lim_{k→∞} ∫₀ᵗ f(xk(s)) ds
     = x0 + ∫₀ᵗ lim_{k→∞} f(xk(s)) ds
     = x0 + ∫₀ᵗ f(x(s)) ds,

and therefore dx/dt = f(x(t)) with x(0) = x0 .
Lipschitz vector field

A vector field F : S → R^N is Lipschitz on S ⊆ R^N if

‖F(x) − F(y)‖ ≤ L‖x − y‖ for all x, y ∈ S.

Here,
‖x‖ = ( ∑_{j=1}^{N} x_j² )^{1/2}

denotes the Euclidean norm of the vector x ∈ R^N .

We say that F(x, t) is Lipschitz in x if

‖F(x, t) − F(y, t)‖ ≤ L‖x − y‖.

Local existence and uniqueness

Theorem
Let x0 ∈ R^N, fix r > 0 and τ > 0, and put

S = { (x, t) ∈ R^N × R : ‖x − x0‖ ≤ r and |t| ≤ τ }.

If F(x, t) is Lipschitz in x for (x, t) ∈ S, and if

‖F(x, t)‖ ≤ M for (x, t) ∈ S,

then there exists a unique C¹ function x(t) satisfying

dx/dt = F(x, t) for |t| ≤ min(r/M, τ), with x(0) = x0 .

Proof.
See Technical Proofs handout.
Example of non-uniqueness
The initial-value problem

dx/dt = 3x^{2/3} for t ≥ 0, with x(0) = 0,

has infinitely many solutions, namely, for any a ≥ 0,

x(t) = 0 for 0 ≤ t ≤ a, and x(t) = (t − a)³ for t > a.

In this case, f(x) = 3x^{2/3} is not Lipschitz on any neighbourhood
of 0 because

|f(x) − f(0)| / |x − 0| = (3x^{2/3} − 0)/x = 3x^{−1/3} → ∞ as x → 0.
Example of local, but not global, existence
The (separable) initial-value problem

dx/dt = 1 + x² for t ≥ 0, with x(0) = 1,

has a unique solution

x(t) = tan( t + π/4 ) for −3π/4 < t < π/4.

Applying the theorem with f(x) = 1 + x², x0 = 1, r > 0 and
τ = ∞, we find that

|f(x)| ≤ M = r² + 2r + 2 for |x − x0| ≤ r,

and min(r/M, τ) equals

r/(r² + 2r + 2) ≤ [ r/(r² + 2r + 2) ]_{r=√2} = (√2 − 1)/2 < π/4.
Distinct trajectories cannot intersect
Suppose that the trajectories of two solutions x1(t) and x2(t)
intersect, that is,
x1(t1) = x2(t2)
for some t1 and t2. If we define

y1(t) = x1(t1 + t) and y2(t) = x2(t2 + t),

then, for j ∈ {1, 2},

ẏj(t) = ẋj(tj + t) = F(xj(tj + t)) = F(yj(t)).

Since y1(0) = x1(t1) = x2(t2) = y2(0), uniqueness implies that
y1(t) = y2(t) for all t, or in other words,

x1(t1 + t) = x2(t2 + t) for all t.

Therefore, the two solutions trace out the same trajectory in phase
space.
Practical solution methods

This course emphasises analytical (paper and pencil) methods for


solving differential equations, but such an approach works only for
relatively simple problems. We now take a brief look at computer
approximations of the kind that are widely used for direct
numerical simulations in scientific and industrial modelling
(Math2301 and Math3101).
Discrete-time approximation

Initial-value problem in 1D:

dx/dt = f(x) for 0 < t < T, with x(0) = x0 .

Fix P > 0 and a step size ∆t = T/P. Let

tp = p ∆t for p = 0, 1, 2, . . . , P,

so
0 = t0 < t1 < t2 < · · · < tP = T.

Aim: compute numbers X1, X2, . . . , XP such that

x(tp) ≈ Xp for 1 ≤ p ≤ P.
Finite difference approximation

Since
dx/dt = lim_{∆t→0} ∆x/∆t = lim_{∆t→0} ( x(t + ∆t) − x(t) ) / ∆t,

if ∆t is small (and thus P is large), then

( x(t + ∆t) − x(t) ) / ∆t ≈ dx/dt = f(x(t)).

Also, when t = tp ,

x(tp + ∆t) = x(t_{p+1}) ≈ X_{p+1} and x(tp) ≈ Xp ,

which suggests we require

( X_{p+1} − Xp ) / ∆t = f(Xp).
Euler’s method
Rearranging:
X_{p+1} = Xp + f(Xp) ∆t.

Thus, given x0 we let X0 = x0 and calculate

X1 = X0 + f(X0) ∆t,
X2 = X1 + f(X1) ∆t,
X3 = X2 + f(X2) ∆t,
   ⋮
X_P = X_{P−1} + f(X_{P−1}) ∆t.

Easily programmed on a computer. For instance, using Julia,

Dt = T / P
X = zeros(P+1); X[1] = x0      # X[p+1] approximates x(t_p)
for p = 1:P
    X[p+1] = X[p] + f(X[p]) * Dt
end
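
To make the fragment self-contained (a sketch; the choices of f, T, P
and x0 here are ours, just for illustration), one can wrap it as:

f(x) = x          # dx/dt = x, so the exact solution is x(t) = e^t
T, P, x0 = 1.0, 100, 1.0

Dt = T / P
X = zeros(P+1); X[1] = x0
for p = 1:P
    X[p+1] = X[p] + f(X[p]) * Dt
end

println(X[end])   # ≈ 2.7048, close to e ≈ 2.71828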
Example
Consider
dx/dt = ax for t > 0, with x(0) = x0 ,

which has the solution x = x0 e^{at}. Euler’s method is

( X_{p+1} − Xp ) / ∆t = aXp for p = 0, 1, 2, . . . , with X0 = x0 ,

so X_{p+1} = (1 + a ∆t)Xp and therefore

Xp = (1 + a ∆t)^p x0 .

If we send ∆t → 0 and p → ∞, keeping t* = p ∆t fixed, then

(1 + a ∆t)^p = ( 1 + a t*/p )^p → e^{a t*} ,

showing that Xp ≈ x(t*) = x(tp).


Euler’s method with a = 1, P = 10 and T = 1
Digression: order notation

Definition
We write
φ(y) = O(ψ(y)) as y → a,

if there exist constants C > 0 and δ > 0 such that

|φ(y)| ≤ C|ψ(y)| whenever |y − a| ≤ δ.

This notation can be useful if ψ(y) is much simpler than φ(y).

Interpret
φ(y) = γ(y) + O(ψ(y))
as
φ(y) − γ(y) = O(ψ(y)).
Taylor expansions
Recall that if f is C^{m+1} then

f(x) = f(a) + f′(a)(x − a) + f″(a)(x − a)²/2 + · · ·
     + f^{(m)}(a)(x − a)^m/m! + R_m(x),

where the remainder term is

R_m(x) = ∫ₐˣ f^{(m+1)}(y) (x − y)^m/m! dy.

Theorem
If f is C^{m+1} on a neighbourhood of a, then

R_m(x) = O( (x − a)^{m+1} ) as x → a.
Proof.
There exist constants C and δ such that |f^{(m+1)}(y)| ≤ C for
|y − a| ≤ δ. If 0 ≤ x − a ≤ δ, then

|R_m(x)| ≤ C ∫ₐˣ (x − y)^m/m! dy = C (x − a)^{m+1}/(m + 1)!,

and similarly if −δ ≤ x − a < 0.

Example
As x → 0,

sin x = x + O(x³),
cos x = 1 − x²/2 + O(x⁴),
log(1 + x) = x − x²/2 + O(x³),
1/(1 + x²) = 1 − x² + O(x⁴).
How accurate is Euler’s method?
Taylor expansion shows that if x(t) is C², then as ∆t → 0,

x(t + ∆t) = x(t) + ẋ(t) ∆t + O(∆t²)
          = x(t) + f(x(t)) ∆t + O(∆t²),

so in particular when t = t0 ,

x(t1) = x(t0) + f(x(t0)) ∆t + O(∆t²).

Thus, after the first step of Euler’s method, since X0 = x0 = x(t0),

X1 = X0 + f(X0) ∆t
   = x(t0) + f(x(t0)) ∆t
   = x(t1) + O(∆t²).

What happens after p steps?


Theorem (Math2301)
Assume that we compute Xp ≈ x(tp) using Euler’s method. If L is
a Lipschitz constant for f, then for 0 ≤ tp ≤ T,

|Xp − x(tp)| ≤ (1/2) tp e^{L tp} ∆t max_{0≤t≤T} |ẍ(t)|
            ≤ C ∆t where C = (1/2) T e^{LT} max_{0≤t≤T} |ẍ(t)|.

Thus, for Euler’s method we have

Xp = x(tp) + O(∆t) as ∆t → 0.

Example
How many steps are needed to ensure the error is less than 10⁻⁴ if
T = 5, L = 1 and |ẍ(t)| ≤ 2? It is sufficient to satisfy C ∆t ≤ 10⁻⁴:

T e^{LT} (T/P) ≤ 10⁻⁴ ⇐⇒ P ≥ T² e^{LT} × 10⁴ ≈ 37,103,290.
A more efficient method

Since dx/dt = f(x), the chain rule gives

ẍ = d²x/dt² = f′(x) dx/dt = f′(x)f(x),

so by Taylor expansion,

x(t + ∆t) = x(t) + ẋ(t) ∆t + (1/2) ẍ(t) ∆t² + O(∆t³)
          = x(t) + f(x(t)) ∆t + (1/2) f′(x(t)) f(x(t)) ∆t² + O(∆t³).

We therefore define the Taylor method of order 2 by

X_{p+1} = Xp + f(Xp) ∆t + (1/2) f′(Xp)f(Xp) ∆t² ,

where the first two terms are the Euler step and the last term is a
correction.
Comparison
We saw earlier that for Euler’s method,

Xp = x(tp ) + O(∆t) as ∆t → 0.

Can show in a similar way that for the Taylor method of order 2,

Xp = x(tp ) + O(∆t2 ) as ∆t → 0.

Example
When T = 1 we have ∆t = 1/P, so

P ∆t ∆t2
10 0.1 0.01
100 0.01 0.0001
1000 0.001 0.000001
10000 0.0001 0.00000001
100000 0.00001 0.0000000001
Example

Consider
dx/dt = cos²x for t ≥ 0, with x(0) = 0,

which has the solution x = tan⁻¹(t). Since

f(x) = cos²x and f′(x) = −2 cos x sin x,

the Taylor method of order 2 is

X_{p+1} = Xp + f(Xp) ∆t + (1/2) f(Xp)f′(Xp) ∆t²
        = Xp + ∆t cos²Xp − ∆t² cos³Xp sin Xp

for p = 0, 1, 2, . . . , with X0 = 0.
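
A minimal Julia sketch of this scheme (variable names are ours),
compared against the exact solution x(T) = tan⁻¹(T):

T, P, x0 = 2.0, 100, 0.0
Dt = T / P

X = zeros(P+1); X[1] = x0
for p = 1:P
    x = X[p]
    X[p+1] = x + Dt*cos(x)^2 - Dt^2*cos(x)^3*sin(x)
end

println(abs(X[end] - atan(T)))   # error is O(Dt^2)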
Comparison using P = 8 steps
Systems of ODEs

Euler’s method works in the same way if F : R^N → R^N and

dx/dt = F(x) for 0 < t < T, with x(0) = x0 .

Put X0 = x0 and compute vectors Xp ≈ x(tp) by

X_{p+1} = Xp + ∆t F(Xp) for p = 0, 1, 2, . . . , P − 1.

Similarly, for the Taylor method of order 2,

X_{p+1} = Xp + ∆t F(Xp) + (1/2) ∆t² F′(Xp)F(Xp).

Here, F′(x) = [∂F_i/∂x_j] ∈ R^{N×N} denotes the Jacobian matrix.


Immune response example
Recall
dE/dt = Λ − µE + εVE,   E(0) = E0 ,
dV/dt = rV − σVE,       V(0) = V0 .

In this case,

F(E, V) = [ Λ − µE + εVE ; rV − σVE ],
F′(E, V) = [ −µ + εV   εE ; −σV   r − σE ],

and the Taylor method of order 2 is

[E_{p+1}; V_{p+1}] = [Ep; Vp] + ∆t [ Λ − µEp + εVpEp ; rVp − σVpEp ]
    + (∆t²/2) [ −µ + εVp   εEp ; −σVp   r − σEp ] [ Λ − µEp + εVpEp ; rVp − σVpEp ].
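
In code, one step of this scheme reads as follows (a sketch; the
parameter values are the ones quoted earlier, and the function names
are ours):

Λ, µ, ε, r, σ = 1.0, 0.5, 0.01, 1.25, 0.01

F(E, V)  = [Λ - µ*E + ε*V*E, r*V - σ*V*E]
dF(E, V) = [(-µ + ε*V) (ε*E); (-σ*V) (r - σ*E)]   # Jacobian F'

function taylor2_step(E, V, Dt)
    x = [E, V] + Dt*F(E, V) + (Dt^2/2)*(dF(E, V)*F(E, V))
    x[1], x[2]
end

function simulate(E, V, Dt, nsteps)
    for p in 1:nsteps
        E, V = taylor2_step(E, V, Dt)
    end
    E, V
end

println(simulate(2.0, 1.0, 0.01, 1000))   # (E, V) at t = 10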
How to avoid computing F 0
In 1895, Carl Runge proposed the scheme

Φ1 = F(Xp),
Φ2 = F(Xp + ∆t Φ1),
X_{p+1} = Xp + (1/2) ∆t (Φ1 + Φ2),

which requires two evaluations of F per step, but no knowledge
of F′. Since

Φ2 = F(Xp) + F′(Xp)(∆t Φ1) + O(∆t²)
   = F(Xp) + ∆t F′(Xp)F(Xp) + O(∆t²),

we have

X_{p+1} = Xp + ∆t F(Xp) + (1/2) ∆t² F′(Xp)F(Xp) + O(∆t³),

where the first three terms on the right form the Taylor method of
order 2.
Thus, up to O(∆t³), Runge’s method agrees with the Taylor
method of order 2, so in both cases we expect

Xp = x(tp) + O(∆t²) for p = 1, 2, . . . , P.

In 1901, Martin Kutta proposed a more elaborate scheme,

Φ1 = F(Xp),
Φ2 = F(Xp + (1/2) ∆t Φ1),
Φ3 = F(Xp + (1/2) ∆t Φ2),
Φ4 = F(Xp + ∆t Φ3),
X_{p+1} = Xp + (1/6) ∆t (Φ1 + 2Φ2 + 2Φ3 + Φ4),

and showed that it behaves like a Taylor method of order 4:

Xp = x(tp) + O(∆t⁴) for p = 1, 2, . . . , P.


Convergence rate

Denote the maximum error using P steps by

E_P = max_{0≤p≤P} ‖Xp − x(tp)‖,

and suppose that, for some positive constants C and r,

E_P ≈ C ∆t^r .

Since ∆t = T/P it follows that E_P ≈ C T^r P^{−r}, so

E_{P/2} / E_P ≈ C T^r (P/2)^{−r} / ( C T^r P^{−r} ) = 2^r

and thus
r ≈ log₂( E_{P/2} / E_P ).
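
For instance (a sketch), the observed rate between two rows of the
table below can be computed as:

# Estimate the convergence rate from errors at P/2 and P steps,
# e.g. the Euler column of the table below at P = 40.
rate(E_half, E) = log2(E_half / E)
println(rate(4.64, 2.21))   # ≈ 1.07, matching the tabulated value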
Simple test problem
For the system

dx/dt = 2x + 3y,    x(0) = 2,
dy/dt = −3x + 2y,   y(0) = −1,

we obtain the following maximum errors and convergence rates
(taking T = 1).

P      Euler     (rate)   RK2       (rate)   RK4       (rate)
20     4.64e+00           3.02e-01           4.44e-04
40     2.21e+00   1.07    7.77e-02   1.958   2.81e-05   3.981
80     1.06e+00   1.05    1.97e-02   1.979   1.77e-06   3.993
160    5.19e-01   1.03    4.96e-03   1.989   1.11e-07   3.997
320    2.56e-01   1.02    1.25e-03   1.995   6.92e-09   3.999
640    1.27e-01   1.01    3.12e-04   1.997   4.33e-10   3.999
Linear dynamical systems

Linear differential equations are generally much easier to solve than


nonlinear ones. Hence, most of this course is devoted to linear
problems. Fortunately, linear DEs suffice for describing many
important applications.

Also, understanding how the solutions of linear ODEs behave can


help us understand nonlinear ODEs.
First-order, linear systems of ODEs

We say that the N × N, first-order system of ODEs

dx/dt = F(x, t)

is linear if the RHS has the form

F(x, t) = A(t)x + b(t)

for some N × N matrix-valued function A(t) = [a_{ij}(t)] and a
vector-valued function b(t) = [b_i(t)].

The system is autonomous precisely when A and b are constant.


Global existence and uniqueness

We have a stronger existence result in the linear case.

Theorem
If A(t) and b(t) are continuous for 0 ≤ t ≤ T, then the linear
initial-value problem

dx/dt = A(t)x + b(t) for 0 ≤ t ≤ T, with x(0) = x0 ,

has a unique solution x(t) for 0 ≤ t ≤ T.

We now investigate the special case when A is constant and
b(t) ≡ 0:
dx/dt = Ax.
General solution via eigensystem
If Av = λv and we define x(t) = e^{λt} v, then

dx/dt = λe^{λt} v = e^{λt}(λv) = e^{λt}(Av) = A(e^{λt} v) = Ax,

that is, x is a solution of dx/dt = Ax.

If Avj = λj vj for 1 ≤ j ≤ N, then the linear combination

x(t) = ∑_{j=1}^{N} c_j e^{λ_j t} v_j        (11)

is also a solution because the ODE is linear and homogeneous.

Provided the vj are linearly independent, (11) is the general
solution because, given any x0 ∈ R^N, there exist unique c_j such that

x(0) = ∑_{j=1}^{N} c_j v_j = x0 .
Example
Consider
dx/dt = −5x + 2y,   x(0) = 5,
dy/dt = −6x + 3y,   y(0) = 7.

In this case,

A = [ −5  2 ; −6  3 ],   λ1 = −3, v1 = [1; 1],   λ2 = 1, v2 = [1; 3],

so the general solution is

x(t) = c1 e^{−3t} [1; 1] + c2 e^{t} [1; 3],

and the initial conditions imply c1 = 4 and c2 = 1, so

x(t) = 4e^{−3t} [1; 1] + e^{t} [1; 3],   that is,   x = 4e^{−3t} + e^{t} and y = 4e^{−3t} + 3e^{t} .
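
This is easy to verify numerically (a sketch using the LinearAlgebra
standard library):

using LinearAlgebra

A = [-5.0 2.0; -6.0 3.0]
λ, Q = eigen(A)            # eigenvalues [-3, 1] and eigenvector matrix
c = Q \ [5.0, 7.0]         # coefficients from x(0) = x0

t = 0.5
x = Q * (exp.(λ*t) .* c)   # Σ c_j e^(λ_j t) v_j
println(x ≈ [4exp(-3t) + exp(t), 4exp(-3t) + 3exp(t)])   # true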
Exponential of a matrix
Recall that the Taylor series

e^z = ∑_{k=0}^{∞} z^k/k! = 1 + z + z²/2! + z³/3! + · · ·

converges for all z ∈ C (Math2621).

By analogy, given A ∈ C^{N×N}, we define (Math2601)

e^A = ∑_{k=0}^{∞} A^k/k! = I + A + A²/2! + A³/3! + · · · .

This series always converges because, for any matrix norm
(Math2301),

‖e^A‖ ≤ ∑_{k=0}^{∞} ‖A^k‖/k! ≤ ∑_{k=0}^{∞} ‖A‖^k/k! = e^{‖A‖} .
Term-by-term differentiation
Given A and x0 , let

x(t) = e^{tA} x0 = ( I + tA + t²A²/2! + · · · + t^k A^k/k! + · · · ) x0 .

Since

d/dt e^{tA} = 0 + A + tA² + · · · + t^{k−1}A^k/(k − 1)! + · · ·
            = A( I + tA + · · · + t^{k−1}A^{k−1}/(k − 1)! + · · · ) = A e^{tA} ,

we have

dx/dt = A e^{tA} x0 = Ax and x(0) = I x0 = x0 .

But how can we calculate e^{tA} explicitly?
Diagonalising a matrix
Definition
A square matrix A ∈ CN×N is diagonalisable if there exists a
non-singular matrix Q ∈ CN×N such that Q−1 AQ is diagonal.

Theorem
A square matrix A ∈ CN×N is diagonalisable if and only if there
exists a basis {v1 , v2 , . . . , vN } for CN consisting of eigenvectors
of A. Indeed, if

Avj = λj vj for j = 1, 2, . . . , N,

and we put Q = [v1 v2 ··· vN], then Q^{−1}AQ = Λ where

Λ = diag( λ1, λ2, . . . , λN ).
Matrix powers
Consider a diagonalisable matrix A. Since Q^{−1}AQ = Λ, it follows that A has an eigenvalue decomposition

A = QΛQ^{−1}.

Thus

A² = (QΛQ^{−1})(QΛQ^{−1}) = QΛ(Q^{−1}Q)ΛQ^{−1} = QΛIΛQ^{−1} = QΛ²Q^{−1}

and

A³ = A²A = QΛ²Q^{−1}QΛQ^{−1} = QΛ³Q^{−1}.

In general, we see by induction on k that

A^k = QΛ^kQ^{−1}   for k = 0, 1, 2, . . . .


Since Λ is diagonal, so is

Λ² = diag(λ1, . . . , λN) diag(λ1, . . . , λN) = diag(λ1², . . . , λN²),

and in general,

Λ^k = diag( λ1^k, λ2^k, . . . , λN^k )   for k = 0, 1, 2, 3, . . . .
Example
If A = [ −5  2 ; −6  3 ] then

Λ = [ −3  0 ; 0  1 ],   Q = [ 1  1 ; 1  3 ],   Q^{−1} = (1/2) [ 3  −1 ; −1  1 ],

so

A^k = QΛ^kQ^{−1} = (1/2) [ 1  1 ; 1  3 ] [ (−3)^k  0 ; 0  1 ] [ 3  −1 ; −1  1 ]
    = (1/2) [ (−1)^k 3^{k+1} − 1   (−1)^{k+1} 3^k + 1 ; (−1)^k 3^{k+1} − 3   (−1)^{k+1} 3^k + 3 ].
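
The closed form is easily confirmed numerically; a short sketch assuming NumPy:

    import numpy as np

    A = np.array([[-5.0, 2.0], [-6.0, 3.0]])
    Q = np.array([[1.0, 1.0], [1.0, 3.0]])
    Qinv = np.linalg.inv(Q)
    lam = np.array([-3.0, 1.0])

    for k in range(5):
        direct = np.linalg.matrix_power(A, k)
        via_eig = Q @ np.diag(lam**k) @ Qinv
        closed = 0.5 * np.array([[(-1)**k * 3**(k+1) - 1, (-1)**(k+1) * 3**k + 1],
                                 [(-1)**k * 3**(k+1) - 3, (-1)**(k+1) * 3**k + 3]])
        assert np.allclose(direct, via_eig) and np.allclose(direct, closed)
    print("A^k = Q Λ^k Q^{-1} confirmed for k = 0, ..., 4")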
Polynomial of a matrix
For any polynomial

p(z) = c0 + c1 z + c2 z² + ··· + cm z^m

and any square matrix A, we define

p(A) = c0 I + c1 A + c2 A² + ··· + cm A^m.

When A is diagonalisable, A^k = QΛ^kQ^{−1}, so

p(A) = c0 QIQ^{−1} + c1 QΛQ^{−1} + c2 QΛ²Q^{−1} + ··· + cm QΛ^mQ^{−1}
     = Q ( c0 I + c1 Λ + c2 Λ² + ··· + cm Λ^m ) Q^{−1}
     = Q p(Λ) Q^{−1}.
Lemma
For any polynomial p and any diagonal matrix Λ,

p(Λ) = diag( p(λ1), p(λ2), . . . , p(λN) ).

Theorem
If two polynomials p and q are equal on the spectrum of a
diagonalisable matrix A, that is, if

p(λj ) = q(λj ) for j = 1, 2, . . . , N,

then p(A) = q(A).


Example
Recall that A = [ −5  2 ; −6  3 ] has eigenvalues λ1 = −3 and λ2 = 1. Let

p(z) = z² − 4   and   q(z) = −2z − 1,

and observe

p(−3) = 5 = q(−3)   and   p(1) = −3 = q(1).

We find

p(A) = A² − 4I = [ 9  −4 ; 12  −7 ] = −2A − I = q(A).
Exponential of a diagonalisable matrix

Theorem
If A = QΛQ−1 is diagonalisable, then

e^A = Q e^Λ Q^{−1}   and   e^Λ = diag( e^{λ1}, e^{λ2}, . . . , e^{λN} ).

Proof.
e^A equals

Σ_{k=0}^{∞} A^k/k! = Σ_{k=0}^{∞} QΛ^kQ^{−1}/k! = Q ( Σ_{k=0}^{∞} Λ^k/k! ) Q^{−1} = Q e^Λ Q^{−1}.
Example
Again put A = [ −5  2 ; −6  3 ]. We have

e^A = Q e^Λ Q^{−1} = (1/2) [ 1  1 ; 1  3 ] [ e^{−3}  0 ; 0  e ] [ 3  −1 ; −1  1 ]
    = (1/2) [ 3e^{−3} − e   −e^{−3} + e ; 3e^{−3} − 3e   −e^{−3} + 3e ].

Notice tA = Q(tΛ)Q^{−1}, so

e^{tA} = Q e^{tΛ} Q^{−1} = (1/2) [ 1  1 ; 1  3 ] [ e^{−3t}  0 ; 0  e^{t} ] [ 3  −1 ; −1  1 ]
       = (1/2) [ 3e^{−3t} − e^{t}   −e^{−3t} + e^{t} ; 3e^{−3t} − 3e^{t}   −e^{−3t} + 3e^{t} ].
A (maybe) simpler method
The following trick lets you compute e^A without finding Q.

If a polynomial p has the property

p(λj) = e^{λj}   for j = 1, 2, . . . , N,

then

e^A = Q e^Λ Q^{−1} = Q p(Λ) Q^{−1} = p(A).

For example, we can choose p with degree ≤ N − 1 using the Lagrange interpolation formula,

p(λ) = Σ_{j=1}^{N} e^{λj} Π_{r≠j} (λ − λr)/(λj − λr).
A 3 × 3 example

Problem: find e^{tA} for

A = [ 6  2  2 ; −2  8  4 ; 0  1  7 ].

Can show that A = QΛQ^{−1} where

Q = [ 1  2  2 ; −1  0  1 ; 1  1  1 ]   and   Λ = diag(6, 7, 8),

so

tA = Q(tΛ)Q^{−1}   for all t.

Fix t and put

p(λ) = e^{λ1 t} (λ − λ2)(λ − λ3)/[(λ1 − λ2)(λ1 − λ3)]
     + e^{λ2 t} (λ − λ1)(λ − λ3)/[(λ2 − λ1)(λ2 − λ3)]
     + e^{λ3 t} (λ − λ1)(λ − λ2)/[(λ3 − λ1)(λ3 − λ2)]
     = e^{6t} (λ − 7)(λ − 8)/[(6 − 7)(6 − 8)] + e^{7t} (λ − 6)(λ − 8)/[(7 − 6)(7 − 8)]
     + e^{8t} (λ − 6)(λ − 7)/[(8 − 6)(8 − 7)],

so that

p(6) = e^{6t},   p(7) = e^{7t},   p(8) = e^{8t}.

Thus,

e^{tA} = p(A) = (e^{6t}/2)(A − 7I)(A − 8I) − e^{7t}(A − 6I)(A − 8I) + (e^{8t}/2)(A − 6I)(A − 7I)

       = e^{6t} [ −1  0  2 ; 1  0  −2 ; −1  0  2 ] − e^{7t} [ −4  2  6 ; 0  0  0 ; −2  1  3 ]
       + e^{8t} [ −2  2  4 ; −1  1  2 ; −1  1  2 ],

which equals

[ −e^{6t} + 4e^{7t} − 2e^{8t}    −2e^{7t} + 2e^{8t}    2e^{6t} − 6e^{7t} + 4e^{8t} ;
  e^{6t} − e^{8t}                e^{8t}                −2e^{6t} + 2e^{8t} ;
  −e^{6t} + 2e^{7t} − e^{8t}     −e^{7t} + e^{8t}      2e^{6t} − 3e^{7t} + 2e^{8t} ].
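
A sketch (assuming NumPy and SciPy are available) that checks the interpolation trick against scipy.linalg.expm for this matrix:

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[6.0, 2.0, 2.0], [-2.0, 8.0, 4.0], [0.0, 1.0, 7.0]])
    lam = [6.0, 7.0, 8.0]
    t = 0.3
    I = np.eye(3)

    # Lagrange form: p(A) = sum_j e^{lam_j t} prod_{r != j} (A - lam_r I)/(lam_j - lam_r)
    P = np.zeros((3, 3))
    for j in range(3):
        term = np.exp(lam[j] * t) * I
        for r in range(3):
            if r != j:
                term = term @ (A - lam[r] * I) / (lam[j] - lam[r])
        P += term

    print(np.allclose(P, expm(t * A)))   # True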
Digression: higher-order, linear, scalar equations

Consider once again

Lu = Σ_{j=0}^{m} aj(x) D^j u.

We are now ready to prove the result stated earlier:

Theorem
Assume that the ODE Lu = f is not singular with respect to [a, b],
and that f is continuous on [a, b]. Then the IVP (2) and (3) has a
unique solution.
Equivalent first-order system
Sufficient to look at the case m = 3,

Lu = a3 u‴ + a2 u″ + a1 u′ + a0 u.

(General mth order case follows in the same way.)

Writing y1 = u, y2 = u′ and y3 = u″, we see that Lu = f if and only if

dy1/dx = y2,
dy2/dx = y3,                                            (12)
dy3/dx = (1/a3) ( f − a0 y1 − a1 y2 − a2 y3 ).

Can now see why to expect trouble if L is singular, that is, if the
leading coefficient a3 vanishes at any x in the interval of interest.
Equivalent first-order system: matrix version

We can write Lu = f as

dy/dx = A(x)y + b(x),

where, when m = 3,

A(x) = [ 0  1  0 ; 0  0  1 ; −a0/a3  −a1/a3  −a2/a3 ],

y(x) = ( u, u′, u″ )^T,   b(x) = ( 0, 0, f/a3 )^T.
Existence and uniqueness
Theorem
Assume that L is not singular on [a, b] and that f is continuous
on [a, b]. Then the IVP (2) and (3) has a unique solution.

Proof.
For m = 3, the IVP (2) and (3) is equivalent to

dy/dx = A(x)y + b(x),   y(a) = ( ν0, ν1, ν2 )^T,

where y1 = u, y2 = u′, y3 = u″ and

A(x) = [ 0  1  0 ; 0  0  1 ; −a0/a3  −a1/a3  −a2/a3 ],   b(x) = ( 0, 0, f/a3 )^T.
Stability

In many applications we are interested to know how the


solution x(t) behaves as t → ∞, and might not care much about
the precise details of the transient behaviour for finite t. A lot can
be discovered without the need to solve the DE, which is just as
well because finding x(t) explicitly is often difficult or impossible,
especially for nonlinear problems.
Equilibrium points

We say that a ∈ RN is an equilibrium point for the dynamical


system dx/dt = F(x) if
F(a) = 0.
In this case, the solution of
dx/dt = F(x)   for all t, with x(0) = a,
is just the constant function x(t) = a.
Equilibrium points for the viral infection model
Recall

dE/dt = Λ − µE + VE,
dV/dt = rV − σVE.
Here, (E, V) is an equilibrium point iff

Λ − µE + VE = 0,
rV − σVE = 0.

Thus, the only equilibrium points are at

E = Λ/µ   and   V = 0,

and at

E = r/σ   and   V = µ − σΛ/r.
Stable equilibrium

Definition
An equilibrium point a is stable if for every ε > 0 there exists δ > 0 such that whenever ‖x0 − a‖ < δ the solution of

dx/dt = F(x)   for t > 0, with x(0) = x0,

satisfies

‖x(t) − a‖ < ε   for all t > 0.

(In particular, x(t) must exist for all t > 0.)

In this case, if x0 → a then x(t) → a, uniformly for t > 0.

In particular, a trajectory stays close to a stable equilibrium


point a if it starts out sufficiently close to a.
Asymptotic stability

Definition
Let D be an open subset of RN that contains an equilibrium
point a. We say that a is asymptotically stable in D if a is stable
and, whenever x0 ∈ D, the solution of
dx/dt = F(x)   for t > 0, with x(0) = x0,

satisfies

x(t) → a   as t → ∞.
In this case D is called a domain of attraction for a.
Linear, constant-coefficient case

Consider

dx/dt = Ax + b   with x(0) = x0.                        (13)

Since

Ax + b = 0   ⟺   x = −A^{−1}b,

the only equilibrium point is a = −A^{−1}b. Moreover,

x(t) = a + e^{tA}(x0 − a)

because

dx/dt = A e^{tA}(x0 − a) = A(x − a) = Ax − Aa = Ax + b

and

x(0) = a + I(x0 − a) = x0.
Criteria for stability

Theorem
Let A be a diagonalisable matrix with eigenvalues λ1, λ2, . . . , λN. The equilibrium point a = −A^{−1}b of (13) is
1. stable if and only if Re λj ≤ 0 for all j;
2. asymptotically stable if and only if Re λj < 0 for all j.
In the second case, the domain of attraction is the whole of RN .
Proofs

1. Using the eigenvalue decomposition A = QΛQ^{−1}, we have

x(t) − a = e^{tA}(x0 − a) = Q e^{tΛ} Q^{−1}(x0 − a).

If Re λj ≤ 0 then 0 < |e^{λj t}| = e^{(Re λj)t} ≤ 1 for all t ≥ 0, so

‖e^{tΛ}w‖² = Σ_{j=1}^{N} |e^{λj t} wj|² ≤ Σ_{j=1}^{N} |wj|² = ‖w‖²,

implying that a is stable. Otherwise, Re λj > 0 for at least one j, and |e^{λj t}| → ∞ as t → ∞.

2. For asymptotic stability it is necessary and sufficient that e^{λj t} → 0 as t → ∞, for all j.
Example: stable (but not asymptotically stable)
Consider

dx/dt = Ax   where   A = [ 2  5 ; −1  −2 ].

Eigenvalues λ1 = i and λ2 = −i, so Re λ1 = 0 = Re λ2; hence stable.

Eigenvalue decomposition A = QΛQ^{−1} where

Λ = [ i  0 ; 0  −i ],   Q = [v1 v2] = [ 5  5 ; i − 2  −i − 2 ],
Q^{−1} = (1/10) [ 1 − 2i  −5i ; 1 + 2i  5i ],

and thus

e^{tA} = Q e^{tΛ} Q^{−1} = [ cos t + 2 sin t   5 sin t ; −sin t   cos t − 2 sin t ].
Example: asymptotically stable
Consider

dx/dt = Ax   where   A = [ 14  −9 ; 30  −19 ].

Eigenvalues λ1 = −1 and λ2 = −4, so Re λ1 < 0 and Re λ2 < 0; hence asymptotically stable.

Eigenvalue decomposition A = QΛQ^{−1} where

Λ = [ −1  0 ; 0  −4 ],   Q = [v1 v2] = [ 3  1 ; 5  2 ],   Q^{−1} = [ 2  −1 ; −5  3 ],

and thus

e^{tA} = Q e^{tΛ} Q^{−1} = [ 6e^{−t} − 5e^{−4t}   −3e^{−t} + 3e^{−4t} ; 10e^{−t} − 10e^{−4t}   −5e^{−t} + 6e^{−4t} ].
Example: unstable
Consider

dx/dt = Ax   where   A = [ −26  36 ; −18  25 ].

Eigenvalues λ1 = 1 and λ2 = −2, so Re λ1 > 0 and Re λ2 < 0; hence unstable.

Eigenvalue decomposition A = QΛQ^{−1} where

Λ = [ 1  0 ; 0  −2 ],   Q = [v1 v2] = [ 4  3 ; 3  2 ],   Q^{−1} = [ −2  3 ; 3  −4 ],

and thus

e^{tA} = Q e^{tΛ} Q^{−1} = [ −8e^{t} + 9e^{−2t}   12e^{t} − 12e^{−2t} ; −6e^{t} + 6e^{−2t}   9e^{t} − 8e^{−2t} ].
Linearization
Suppose that x0 is close to an equilibrium point a. If

dx/dt = F(x)   for all t, with x(0) = x0,               (14)

then for small t the difference y = x − a is small and satisfies

dy/dt = dx/dt = F(x) = F(a + y) ≈ F(a) + F′(a)y.

This suggests that if y0 = x0 − a and y is the solution of the linear dynamical system

dy/dt = F(a) + F′(a)y   for all t, with y(0) = y0,

then x(t) ≈ a + y(t) for small t. In particular, we can infer stability properties of (14) at an equilibrium point a from the eigenvalues of A = F′(a).
Viral infection model

Recall that

( dE/dt, dV/dt )^T = F(E, V) = ( Λ − µE + VE, rV − σVE )^T.

Thus,

F′(E, V) = [ −µ + V   E ; −σV   r − σE ],

so at the first equilibrium point,

E = Λ/µ = Ê,   V = 0,   F′(Λ/µ, 0) = [ −µ  Ê ; 0  r − σÊ ].

Hence, λ1 = −µ < 0 and λ2 = r − σÊ, which means that this equilibrium point is stable iff r ≤ σÊ.
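
This criterion is easy to test numerically. Here is a sketch assuming NumPy, with parameter values invented purely for illustration (no particular values are specified in the notes):

    import numpy as np

    Lam, mu, r, sigma = 1.0, 0.5, 0.4, 0.3   # hypothetical parameters

    def jacobian(E, V):
        return np.array([[-mu + V, E], [-sigma * V, r - sigma * E]])

    E_hat = Lam / mu                     # first equilibrium (E, V) = (Λ/µ, 0)
    eigs = np.linalg.eigvals(jacobian(E_hat, 0.0))
    print(eigs, "stable" if np.all(eigs.real <= 0) else "unstable")

With these values σÊ = 0.6 > r = 0.4, and the printed eigenvalues (−0.5 and −0.2) are indeed both negative.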
Damped pendulum

The angular deflection θ of a damped pendulum satisfies

mθ̈ + rθ̇ + k sin θ = 0,

where θ = 0 is the “down” position of the bob. For simplicity, take m = 1, so that we have the equivalent first-order system

dθ/dt = φ,
dφ/dt = −k sin θ − rφ.
The equilibrium points are

(θ, φ) = (nπ, 0)

for n ∈ {. . . , −2, −1, 0, 1, 2, . . .}.


Writing

d/dt ( θ, φ )^T = F(θ, φ) = ( φ, −k sin θ − rφ )^T,

we see that

F′(θ, φ) = [ 0  1 ; −k cos θ  −r ],

and the matrix

A = F′(nπ, 0) = [ 0  1 ; (−1)^{n+1}k  −r ]

has eigenvalues given by

det(λI − A) = | λ  −1 ; (−1)^n k  λ + r | = λ² + rλ + (−1)^n k = 0.
When n is even, we solve λ² + rλ + k = 0 to obtain

λ = λ± = (1/2)( −r ± i√(4k − r²) ),   0 ≤ r < 2√k,
λ = −√k (repeated),                   r = 2√k,
λ = λ± = (1/2)( −r ± √(r² − 4k) ),    r > 2√k.

However, when n is odd, we solve λ² + rλ − k = 0 to obtain

λ = λ± = (1/2)( −r ± √(r² + 4k) ),    r ≥ 0.

Conclusion: the equilibrium point at (θ, φ) = (nπ, 0) is classified as follows.

n even   r = 0           Re λ+ = Re λ− = 0   stable
n even   0 < r ≤ 2√k     Re λ+ = Re λ− < 0   asymptotically stable
n even   r > 2√k         λ− < λ+ < 0         asymptotically stable
n odd    r ≥ 0           λ− < 0 < λ+         unstable

Makes sense because θ = nπ is the “down” position if n is even,


but is the “up” position if n is odd.
Final remarks on nonlinear DEs

We conclude this part of the course by discussing a technique for


solving, or at least partially solving, some nonlinear problems.
Importantly, this technique allows us to deal with the 1D motion of
a particle moving under the influence of a conservative force.
First integrals
Definition
A function G : R^N → R is a first integral (or constant of the motion) for the system of ODEs

dx/dt = F(x)

if G(x(t)) is constant for every solution x(t).

By the chain rule,

(d/dt) G(x(t)) = Σ_{j=1}^{N} (∂G/∂xj)(dxj/dt) = ∇G(x) · dx/dt = ∇G(x) · F(x).

Geometric interpretation: G is a first integral iff

∇G(x) ⊥ F(x)   for all x.


Simple example
The function G(x, y) = x² + y² is a first integral of the linear system of ODEs

dx/dt = −y,
dy/dt = x.

In fact, putting F(x, y) = ( −y, x )^T, we have

∇G · F = ( 2x, 2y )^T · ( −y, x )^T = (2x)(−y) + (2y)(x) = 0,

or equivalently,

dG/dt = (∂G/∂x)(dx/dt) + (∂G/∂y)(dy/dt) = (2x)(−y) + (2y)(x) = 0.
Partial solutions

In effect, a first integral provides a partial solution of the ODE.

Putting C = G(x0), we know that x(t) is confined to the surface

G(x) = C.

If N = 2, the equation G(x1, x2) = C implicitly gives x2 = g(x1), so

dx1/dt = F1(x1, x2) = F1( x1, g(x1) )

and F1 becomes a known function of x1 alone. If we can then evaluate

t = ∫ dx1 / F1

to obtain t = t(x1), then implicitly we know x1 = x1(t) and finally x2 = g( x1(t) ).
A class of second-order ODE

Consider

ẍ = f(x)   or equivalently   d/dt ( x, ẋ )^T = ( ẋ, f(x) )^T,          (15)

and suppose that −V(x) is an indefinite integral of f, that is, −dV/dx = f(x). Since

(d/dt)( ẋ²/2 ) = ẋẍ   and   (d/dt) V(x) = (dV/dx)(dx/dt) = −f(x)ẋ,

if x = x(t) is a solution of (15) then

(d/dt)( ẋ²/2 + V(x) ) = ( ẍ − f(x) ) ẋ = 0,

so the function G(x, ẋ) = ẋ²/2 + V(x) is a first integral.
Example: undamped pendulum
The angular deflection of an undamped pendulum satisfies

mθ̈ + k sin θ = 0,                                      (16)

or

θ̈ + ω² sin θ = 0,   ω = √(k/m),

which has the form

θ̈ = f(θ) = −dV/dθ

with f(θ) = −ω² sin θ and V(θ) = −ω² cos θ. Thus,

(d/dt)( θ̇²/2 − ω² cos θ ) = θ̇θ̈ + ω²θ̇ sin θ = θ̇ ( θ̈ + ω² sin θ ) = 0,

and so every solution of (16) satisfies

θ̇²/2 − ω² cos θ = C.
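
A numerical check (a sketch assuming NumPy and SciPy) that this first integral really is constant along a computed trajectory; the value of ω is chosen arbitrarily:

    import numpy as np
    from scipy.integrate import solve_ivp

    omega = 1.0

    def rhs(t, y):
        theta, phi = y
        return [phi, -omega**2 * np.sin(theta)]

    sol = solve_ivp(rhs, [0.0, 20.0], [1.0, 0.0], rtol=1e-10, atol=1e-12)
    G = 0.5 * sol.y[1]**2 - omega**2 * np.cos(sol.y[0])
    print(G.max() - G.min())   # ≈ 0, up to the integration tolerance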
Partial solutions in higher dimensions
Suppose N > 2 and we know several (functionally independent)
first integrals G1 , G2 , . . . , Gk . Then the solution x(t) must lie on
the intersection of the surfaces Gj (x) = Cj for 1 6 j 6 k, where
Cj = Gj (x0 ).

We necessarily have k 6 N − 1.

If k = N − 1, then the implicit function theorem gives xj = gj(x1) for 2 ≤ j ≤ N, and thus F1(x) becomes a known function of x1, so, as before,

t = ∫ dx1 / F1.

In principle, we then know x1 = x1(t) and hence xj = gj( x1(t) ) for 2 ≤ j ≤ N.

However, a system might not have any first integrals.


Lorenz equations

Edward N. Lorenz, Journal of Atmospheric Sciences 20:130–141, 1963.

The 3 × 3 system of ODEs

dx/dt = −σx + σy,
dy/dt = rx − y − xz,
dz/dt = xy − bz,
is a very simplified model of convection in a thin layer of fluid
heated uniformly from below and cooled uniformly from above.
The dependent variables give the coefficients of 3 Fourier modes.
Lorenz equations

For r > 1, the system has three equilibrium points at

(0, 0, 0),   ( √(b(r−1)), √(b(r−1)), r − 1 ),   ( −√(b(r−1)), −√(b(r−1)), r − 1 ).

The parameter choices

r = 28,   σ = 10,   b = 8/3,

and initial conditions

x0 = y0 = z0 = 1/2,
lead to the trajectory shown on the next page.
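
A sketch (assuming NumPy, SciPy and Matplotlib are available) that integrates the system and draws such a trajectory:

    import numpy as np
    from scipy.integrate import solve_ivp
    import matplotlib.pyplot as plt

    r, sigma, b = 28.0, 10.0, 8.0 / 3.0

    def lorenz(t, u):
        x, y, z = u
        return [-sigma * x + sigma * y, r * x - y - x * z, x * y - b * z]

    sol = solve_ivp(lorenz, [0.0, 40.0], [0.5, 0.5, 0.5],
                    dense_output=True, rtol=1e-9, atol=1e-9)
    t = np.linspace(0.0, 40.0, 20000)
    x, y, z = sol.sol(t)

    ax = plt.figure().add_subplot(projection='3d')
    ax.plot(x, y, z, linewidth=0.4)
    plt.show()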
Lorenz equations

[Trajectory plot omitted.]

Lorenz equations

Since the solution does not lie on a smooth surface we conclude


that no first integral exists.

The Lorenz equations also exhibit a sensitive dependence on initial


conditions, as illustrated in the following video.

http://www.youtube.com/watch?v=FYE4JKAXSfY
Part III

Initial-Boundary Value Problems in 1D


Introduction

We have seen that an initial-value problem for a (nonsingular)


linear ODE Lu = f always has a unique solution. However, matters
are not so simple for a boundary-value problem: a solution might
not exist, or if one exists it might not be unique. One aim of this
part of the course is to understand when these different outcomes
occur with the help of some concepts adapted from linear algebra.

We also begin our study of partial differential equations (PDEs) by


considering two important examples, the wave equation and the
heat equation, in one spatial variable.
Outline

Two-point boundary value problems

Existence and uniqueness

Inner products and norms of functions

Self-adjoint differential operators

The vibrating string

Heat equation
Two-point boundary value problems

In an mth order initial-value problem we specify m initial


conditions at the left end of an interval. In an mth order
boundary-value problem, we again specify m conditions involving
the solution and its derivatives, but some apply at the left end and
some at the right end.

We will consider only second-order linear boundary-value problems,


so exactly one boundary condition will apply at each end of the
interval.
Initial conditions vs boundary conditions

Consider the second-order ODE

u″ + u = 0   for 0 < x < π,

whose general solution is

u(x) = A cos x + B sin x.

We obtain a unique solution by specifying initial conditions, that


is, specifying u(0) and u′(0).

What if we instead specify boundary conditions at x = 0 and


at x = π?
Three possibilities
Example
A unique solution u(x) = sin x exists satisfying

u′(0) = 1   and   u(π) = 0.

Example
No solution exists satisfying

u(0) = 0 and u(π) = 1.

Example
Infinitely many solutions u(x) = C sin x exist satisfying

u(0) = 0 and u(π) = 0.


Linear two-point boundary value problem
We want to solve

Lu = f for a < x < b, with B1 u = α1 and B2 u = α2 , (17)

where
Lu = a2 u″ + a1 u′ + a0 u
is a 2nd-order linear differential operator, and the boundary
functionals have the form

B1u = b11 u′(a) + b10 u(a),
B2u = b21 u′(b) + b20 u(b).

Example

u″ − u = x − 1          for 0 < x < log 2,
u = 2                   at x = 0,
u′ − 2u = 2 log 2 − 4   at x = log 2.
Existence and uniqueness

Having seen a couple of examples, we now investigate the


conditions that determine whether or not a linear boundary-value
problem has a unique solution.
Linearity (again)

Since L, B1 and B2 are all linear, the solutions of the homogeneous


BVP

Lu = 0 for a < x < b, with B1 u = 0 and B2 u = 0, (18)

form a vector space: if u1 and u2 are solutions of (18) then so is


u = c1 u1 + c2 u2 for any constants c1 and c2 .

Any two solutions of the inhomogeneous problem (17) differ by a


solution of the homogeneous problem, that is, if u1 and u2 satisfy
(17) then u = u1 − u2 satisfies (18).

If u1 satisfies (17) and u2 satisfies (18) then u = u1 + cu2


satisfies (17) for any constant c.
Existence and uniqueness

The preceding facts imply the following result.


Theorem (Uniqueness)
The inhomogeneous BVP (17) has at most one solution iff the
homogeneous BVP (18) has only the trivial solution u ≡ 0.

To investigate existence, suppose that the general solution of the


homogeneous equation Lu = 0 is uH = c1 u1 (x) + c2 u2 (x), and
suppose that uP (x) is a particular solution of the inhomogeneous
equation Lu = f. The general solution of Lu = f will then be

u(x) = uH (x) + uP (x) = c1 u1 (x) + c2 u2 (x) + uP (x).


Existence and uniqueness (continued)

To satisfy the boundary conditions in (17) we must choose c1 and


c2 so that

B1 (c1 u1 + c2 u2 + uP ) = α1 ,
B2 (c1 u1 + c2 u2 + uP ) = α2 ,

Since B1 and B2 are linear, the inhomogeneous BVP (17) has at


least one solution iff the 2 × 2 linear system

[ B1u1  B1u2 ; B2u1  B2u2 ] ( c1 ; c2 ) = ( α1 − B1uP ; α2 − B2uP )     (19)

has at least one solution [c1, c2]^T.


Similarly, u(x) = c1 u1(x) + c2 u2(x) is a solution of the homogeneous BVP (18) iff c1 and c2 satisfy

[ B1u1  B1u2 ; B2u1  B2u2 ] ( c1 ; c2 ) = ( 0 ; 0 ).

The 2 × 2 matrix on the left is non-singular iff this homogeneous


linear system has only the trivial solution c1 = c2 = 0.

Hence, the following is true.


Theorem
If the homogeneous problem (18) has only the trivial solution, then
for every choice of f, α1 and α2 the inhomogeneous problem (17)
has a unique solution.
Inner products and norms of functions

If the homogeneous problem (18) admits nontrivial solutions, then


the inhomogeneous problem (17) might or might not have any
solutions, depending on the data f, α1 and α2 . To formulate a
condition that guarantees existence we require a short digression
that introduces some ideas from functional analysis.
Inner product and norm

The inner product ⟨f, g⟩ of a pair of continuous functions f, g : [a, b] → R is defined by

⟨f, g⟩ = ∫_a^b f(x)g(x) dx.

The corresponding norm of f is defined by

‖f‖ = √⟨f, f⟩ = ( ∫_a^b [f(x)]² dx )^{1/2}.

We say that f and g are orthogonal if ⟨f, g⟩ = 0.


Example
If

(a, b) = (−1, 1),   f(x) = x,   g(x) = cos πx,

then

⟨f, g⟩ = 0,   ‖f‖ = √(2/3),   ‖g‖ = 1.

Thus, f and g are orthogonal over the interval (−1, 1).

Example
Later we will show that the Legendre polynomials are orthogonal over the interval (−1, 1):

∫_{−1}^{1} Pn(x)Pk(x) dx = 0   if n ≠ k.
Key properties

The inner product and norm for functions behave like the dot
product and norm for vectors in Rn .

For any continuous functions f, g, h and any constant α:

1. ⟨f + g, h⟩ = ⟨f, h⟩ + ⟨g, h⟩;
2. ⟨f, g + h⟩ = ⟨f, g⟩ + ⟨f, h⟩;
3. ⟨αf, g⟩ = α⟨f, g⟩ = ⟨f, αg⟩;
4. ⟨f, g⟩ = ⟨g, f⟩;
5. ‖f‖ = 0 if and only if f(x) = 0 for a < x < b.
Cauchy–Schwarz inequality
Theorem
|⟨f, g⟩| ≤ ‖f‖ ‖g‖.

Proof.
If f = 0 or g = 0 then ⟨f, g⟩ = 0 and ‖f‖‖g‖ = 0, so the result reduces to 0 ≤ 0.

Otherwise, f ≠ 0 and g ≠ 0, so ‖f‖ ≠ 0 and ‖g‖ ≠ 0, and we may define α = 1/‖f‖ and β = 1/‖g‖. Since

0 ≤ ‖αf ± βg‖² = ⟨αf ± βg, αf ± βg⟩
  = α²‖f‖² ± 2αβ⟨f, g⟩ + β²‖g‖² = 2( 1 ± αβ⟨f, g⟩ ),

we see that −1 ≤ αβ⟨f, g⟩ ≤ 1, so

|⟨f, g⟩| ≤ α^{−1}β^{−1} = ‖f‖ ‖g‖.


Triangle inequality

Corollary
‖f + g‖ ≤ ‖f‖ + ‖g‖.

Proof.

‖f + g‖² = ⟨f + g, f + g⟩
         = ‖f‖² + 2⟨f, g⟩ + ‖g‖²
         ≤ ‖f‖² + 2‖f‖‖g‖ + ‖g‖²
         = ( ‖f‖ + ‖g‖ )².
Self-adjoint differential operators

We now introduce a class of second-order, linear differential


operators that possess a number of special properties. Operators of
this type are of great importance in applications, as well as enjoying
a rich theory that will be developed here and later in the course.
Integration by parts

Consider a second-order, linear differential operator

Lu = a2 u″ + a1 u′ + a0 u.

Integrating by parts,

∫_a^b (a1 u′)v dx = [ u(a1 v) ]_a^b − ∫_a^b u(a1 v)′ dx,

∫_a^b (a2 u″)v dx = [ u′(a2 v) − u(a2 v)′ ]_a^b + ∫_a^b u(a2 v)″ dx,

so

⟨Lu, v⟩ = [ u′(a2 v) − u(a2 v)′ + u(a1 v) ]_a^b + ⟨ u, (a2 v)″ − (a1 v)′ + a0 v ⟩.
Adjoint operator and Lagrange identity
Thus, defining the formal adjoint

L*v = (a2 v)″ − (a1 v)′ + a0 v
    = a2 v″ + (2a2′ − a1)v′ + (a2″ − a1′ + a0)v

and the bilinear concomitant

P(u, v) = u′(a2 v) − u(a2 v)′ + u(a1 v),

we have the Lagrange identity

⟨Lu, v⟩ = ⟨u, L*v⟩ + [ P(u, v) ]_a^b.

The differentiated version of the Lagrange identity is

(Lu)v = u L*v + (d/dx) P(u, v).
Example
If

Lu = 3x u″ − (cos x) u′ + eˣ u

then

L*v = (3x v)″ + [(cos x) v]′ + eˣ v
    = 3x v″ + (6 + cos x) v′ + (eˣ − sin x) v

and

P(u, v) = u′(3x v) − u(3x v)′ − uv cos x
        = 3x (u′v − uv′) − (3 + cos x) uv.
Higher order differential operators
Consider an mth order, linear differential operator

Lu = Σ_{j=0}^{m} aj(x) D^j u.

Integrating by parts j times gives the identity

∫_a^b aj(x) D^j u(x) v(x) dx = ∫_a^b u(x) (−1)^j D^j [aj(x)v(x)] dx
    + [ Σ_{k=1}^{j} (−1)^{k−1} (D^{j−k}u) D^{k−1}(aj v) ]_a^b,

so, summing over j, the Lagrange identity holds with

L*v = Σ_{j=0}^{m} (−1)^j D^j [aj(x) v],

P(u, v) = Σ_{j=1}^{m} Σ_{k=1}^{j} (−1)^{k−1} (D^{j−k}u) D^{k−1}(aj v).
Formal self-adjointness
The operator L is formally self-adjoint if L* = L.

Theorem
A second-order, linear differential operator L is formally self-adjoint iff it can be written in the form

Lu = −(pu′)′ + qu = −pu″ − p′u′ + qu,                   (20)

in which case the Lagrange identity takes the form

(Lu)v − u(Lv) = −[ p(x)(u′v − uv′) ]′,                  (21)

or in other words, the bilinear concomitant is

P(u, v) = −p(x)(u′v − uv′).
Proof
If Lu = −(pu′)′ + qu = −pu″ − p′u′ + qu then

L*v = (−pv)″ + (p′v)′ + qv
    = −(p″v + 2p′v′ + pv″) + (p″v + p′v′) + qv
    = −(pv″ + p′v′) + qv = −(pv′)′ + qv = Lv,

so L is formally self-adjoint. Conversely, if Lu = a2 u″ + a1 u′ + a0 u and L* = L, then

a2 u″ + a1 u′ + a0 u = a2 u″ + (2a2′ − a1)u′ + (a2″ − a1′ + a0)u

for all u. Choosing u ≡ 1 and then u ≡ x, it follows that

a0 = a2″ − a1′ + a0   and   a1 = 2a2′ − a1.

Thus, a2′ = a1, so by putting p = −a2 and q = a0 we have a1 = −p′, implying that L has the form

Lu = −pu″ − p′u′ + qu = −(pu′)′ + qu.

Moreover, in this case, the Lagrange identity takes the form

⟨Lu, v⟩ = ∫_a^b [ −(pu′)′v + quv ] dx
        = [ −(pu′)v ]_a^b + ∫_a^b pu′v′ dx + ∫_a^b quv dx
        = [ −(pu′)v + u(pv′) ]_a^b + ∫_a^b [ −u(pv′)′ + quv ] dx
        = [ −p(u′v − uv′) ]_a^b + ⟨u, Lv⟩.
Example
Consider the Bessel equation

x² u″ + x u′ + (x² − ν²) u = f(x).

Dividing both sides by x gives Lu = −x^{−1} f(x), where

Lu = −(x u′)′ + (ν² x^{−1} − x) u.

Example
The Legendre equation

(1 − x²) u″ − 2x u′ + ν(ν + 1) u = f(x)

has the form Lu = −f(x) with

Lu = −[(1 − x²) u′]′ − ν(ν + 1) u.

Transforming to self-adjoint form
If we can evaluate the integrating factor

p(x) = exp( ∫ a1(x)/a2(x) dx ),

then we can transform a2 u″ + a1 u′ + a0 u = f(x) to self-adjoint form:

−p u″ − (p a1/a2) u′ − (p a0/a2) u = −p f(x)/a2,

−(p u″ + p′ u′) − (p a0/a2) u = −p f(x)/a2,

−(p u′)′ + q u = f̃(x),

where q = −p a0/a2 and f̃ = −p f/a2.


Example
Write the Cauchy–Euler ODE ax² u″ + bx u′ + cu = f(x) in
self-adjoint form.
Self-adjointness and boundary operators

Lemma
Any formally self-adjoint operator Lu = −(pu′)′ + qu satisfies the identity

⟨Lu, v⟩ − ⟨u, Lv⟩ = Σ_{i=1}^{2} ( Biu Riv − Riu Biv ),                  (22)

for all u and v, where

R1u = p(a)u(a)/b11   or   R1u = −p(a)u′(a)/b10,

and

R2u = −p(b)u(b)/b21   or   R2u = p(b)u′(b)/b20.
Proof
Suppose b11 ≠ 0 and b21 ≠ 0. At x = a,

+p(u′v − uv′) = ( b11 u′ + b10 u )( pv/b11 ) − ( pu/b11 )( b11 v′ + b10 v )
             = (B1u)(R1v) − (R1u)(B1v),

and at x = b,

−p(u′v − uv′) = ( b21 u′ + b20 u )( −pv/b21 ) − ( −pu/b21 )( b21 v′ + b20 v )
             = (B2u)(R2v) − (R2u)(B2v).

A similar argument works if b11 = 0 or b21 = 0.


Necessary condition for existence
If u is a solution of the BVP (17), and if v is a solution of the homogeneous problem

Lv = 0    for a < x < b,
B1v = 0   at x = a,                                     (23)
B2v = 0   at x = b,

then on the one hand

⟨Lu, v⟩ − ⟨u, Lv⟩ = ⟨f, v⟩ − ⟨u, 0⟩ = ⟨f, v⟩,

and on the other hand, by (22),

⟨Lu, v⟩ − ⟨u, Lv⟩ = α1 R1v − R1u × 0 + α2 R2v − R2u × 0.

Hence, the data f, α1 and α2 must satisfy

⟨f, v⟩ = α1 R1v + α2 R2v.                               (24)
Sufficient condition for existence?
It can be shown that, provided a2(x) ≠ 0 for a ≤ x ≤ b, the condition (24) is also sufficient for the existence of u, i.e., u exists iff (24) holds for all v satisfying (23).

Example
The 2-point BVP

u″ + u = f   for 0 < x < π,
u = α1       at x = 0,
u = α2       at x = π,

has a solution iff

∫_0^π f(x) sin x dx = α1 + α2.

In this case, if u is one solution then every other solution is of the form u + C sin x for some constant C.
Fredholm alternative

Theorem
Either the homogeneous problem (18) has only the trivial
solution v ≡ 0, in which case
the inhomogeneous problem (17) has a unique solution u
for every choice of f, α1 and α2 ,
or else the homogeneous problem admits non-trivial solutions, in
which case
the inhomogeneous problem (17) has a solution u iff f,
α1 and α2 satisfy (24) for every solution v of the
homogeneous problem (18).

In the latter case, u + Cv is also a solution of (17) for any


constant C.
Domain of a self-adjoint operator

Definition
Let D be a subspace of a vector space V with inner product h·, ·i,
and let L be a linear operator on V with domain D. We say that L
is self-adjoint if

⟨Lu, v⟩ = ⟨u, Lv⟩   for all u, v ∈ D.

The identity (22) shows that Lu = −(pu′)′ + qu is self-adjoint if D is the set of C² functions u : [a, b] → R satisfying the homogeneous boundary conditions

B1u = 0   at x = a,
B2u = 0   at x = b.
Example: Bessel operator

Recall the Bessel operator

Lu = −(xu′)′ + (ν²x^{−1} − x)u   for 1 ≤ x ≤ 2,

and let

B1u = u′,        R1u = u,    at x = 1,
B2u = 2u′ − u,   R2u = −u,   at x = 2,

so that with ⟨f, g⟩ = ∫_1^2 f(x)g(x) dx the Lagrange identity implies

⟨Lu, v⟩ − ⟨u, Lv⟩ = B1u R1v − R1u B1v + B2u R2v − R2u B2v.

Thus, if we let D denote the vector space of C² functions u : [1, 2] → R satisfying B1u = 0 at x = 1 and B2u = 0 at x = 2, then

⟨Lu, v⟩ = ⟨u, Lv⟩   for all u, v ∈ D.
The vibrating string

Our first example of an initial-boundary value problem for a partial


differential equation is a classical model for a vibrating string, such
as in a musical instrument like a piano. This example reveals a
connection between Fourier series and acoustics.
Notation and assumptions

Consider a stretched string having the following properties:


1. the string is fixed at x = 0 and x = ℓ;
2. the string is uniform, perfectly elastic and offers no resistance
to bending;
3. the weight of the string is negligible compared to tension;
4. the motion of the string is purely transverse and the deflection
is small.

Let

u = u(x, t) = transverse deflection at position x and time t,
T = T(x, t) = magnitude of tension,
θ = θ(x, t) = tangent angle.
Newton’s laws
Let ρ be the linear density (mass per unit length) of the string.

For the piece of string between x and x + ∆x,

longitudinal momentum = 0,
transverse momentum = ρ∆x ∂u/∂t.

No motion in the longitudinal direction, so

T(x + ∆x, t) cos θ(x + ∆x, t) = T(x, t) cos θ(x, t),            (25)

whereas for the transverse motion

ρ∆x (∂/∂t)(∂u/∂t) = T(x + ∆x, t) sin θ(x + ∆x, t) − T(x, t) sin θ(x, t).   (26)
Wave equation
We deduce from (25) that

T cos θ = T0

for some T0 independent of x, and we deduce from (26) that

ρ ∂²u/∂t² = [ (T sin θ)(x + ∆x, t) − (T sin θ)(x, t) ] / ∆x.

Taking the limit as ∆x → 0 gives

ρ ∂²u/∂t² = ∂(T sin θ)/∂x = T0 ∂(tan θ)/∂x.

But tan θ = ∂u/∂x, so u satisfies the wave equation

ρ ∂²u/∂t² = T0 ∂²u/∂x².
Initial-boundary value problem (IBVP)

Put c = √(T0/ρ) and suppose that the string is initially at rest with a known deflection u0(x); then

∂²u/∂t² − c² ∂²u/∂x² = 0,   0 < x < ℓ, t > 0,
u(0, t) = 0,                t > 0,
u(ℓ, t) = 0,                t > 0,                      (27)
u(x, 0) = u0(x),            0 < x < ℓ,
(∂u/∂t)(x, 0) = 0,          0 < x < ℓ.
Separation of variables
Look for solutions having a simple form:

u(x, t) = X(x)T(t).

Such a u satisfies the PDE iff

XT″ − c²X″T = 0.

Thus, we want

T″/(c²T) = X″/X.

Since T″/(c²T) is a function of t only, and X″/X is a function of x only, there must be a separation constant λ such that

T″/(c²T) = −λ = X″/X.
A homogeneous BVP
Since

u(0, t) = X(0)T(t),   u(ℓ, t) = X(ℓ)T(t),   (∂u/∂t)(x, 0) = X(x)T′(0),

we also want

X(0) = 0,   X(ℓ) = 0,   T′(0) = 0.

Thus, X should satisfy

X″ + λX = 0   for 0 < x < ℓ;
X = 0         at x = 0;
X = 0         at x = ℓ.

One solution is X ≡ 0, but non-zero solutions exist for special


choices of λ.
Case λ > 0
Write λ = ω² where ω > 0.

Thus, X″ + ω²X = 0, which has the general solution

X = A cos ωx + B sin ωx.

Since X(0) = A, the first boundary condition implies A = 0, so X = B sin ωx.

Since X(ℓ) = B sin ωℓ, the second boundary condition implies

B sin ωℓ = 0.

If B = 0, then X ≡ 0.
If B ≠ 0, then sin ωℓ = 0, so ωℓ = nπ for some n ∈ {1, 2, 3, . . . }. In this case, ω = nπ/ℓ, so

λ = (nπ/ℓ)²   and   X = B sin(nπx/ℓ).
Case λ = 0

If λ = 0 then X″ = 0, which has the general solution

X = Ax + B.

Since X(0) = B, the first boundary condition implies B = 0, so X = Ax.

Since X(ℓ) = Aℓ, the second boundary condition implies

Aℓ = 0.

But ℓ > 0, so A = 0 and thus X ≡ 0.


Case λ < 0
Write λ = −ω² where ω > 0.

Thus, X″ − ω²X = 0, which has the general solution

X = A cosh ωx + B sinh ωx.

Since X(0) = A, the first boundary condition implies A = 0, so X = B sinh ωx.

Since X(ℓ) = B sinh ωℓ, the second boundary condition implies

B sinh ωℓ = 0.

If B = 0, then X ≡ 0.

If B ≠ 0, then sinh ωℓ = 0, so ωℓ = 0, which is impossible if ω > 0.
Non-trivial solutions
Conclusion: we get a useful solution only when λ = λn and X = BXn, where

λn = (nπ/ℓ)²   and   Xn(x) = sin(nπx/ℓ)   for n = 1, 2, 3, . . . .

Recall that for T(t), we require T″ + λc²T = 0 and T′(0) = 0. Since λn c² = (nπc/ℓ)², the general solution is

T(t) = C cos(nπct/ℓ) + E sin(nπct/ℓ),

and the initial condition implies that E = 0, so T = CTn where

Tn(t) = cos(nπct/ℓ).
Modes
We call

un(x, t) = Xn(x)Tn(t) = sin(nπx/ℓ) cos(nπct/ℓ)

the nth normal mode of vibration or pure harmonic. The smallest number τn > 0 such that un(x, t + τn) = un(x, t) for all x and t is called the period of the nth normal mode. Since

(nπc/ℓ) τn = 2π,

we see that

τn = 2ℓ/(nc).

The corresponding frequency is

1/τn = nc/(2ℓ).
Summary

Each mode un satisfies all four homogeneous equations in (27),

∂²un/∂t² − c² ∂²un/∂x² = 0,
un(0, t) = 0,
un(ℓ, t) = 0,                                           (28)
(∂un/∂t)(x, 0) = 0,

but since

un(x, 0) = sin(nπx/ℓ),                                  (29)

the inhomogeneous initial condition will not be satisfied unless u0(x) happens to be sin(nπx/ℓ).
Superposition principle
Each equation in (28) is linear and homogeneous, so any superposition of normal modes,

u(x, t) = Σ_{n=1}^{∞} An sin(nπx/ℓ) cos(nπct/ℓ),        (30)

also satisfies (28).

We can now hope that (30) is the general solution of the initial-boundary value problem (27): for any given u0 we can find A1, A2, . . . such that the remaining condition holds, i.e.,

u(x, 0) = u0(x),   0 ≤ x ≤ ℓ.

But this just means that

u0(x) = Σ_{n=1}^{∞} An sin(nπx/ℓ),   0 ≤ x ≤ ℓ.         (31)
Fourier sine coefficients
Thus, the solution to (27) is given by (30) with

An = (2/ℓ) ∫_0^ℓ u0(x) sin(nπx/ℓ) dx.                   (32)

Example
If ℓ = 1 and the initial deflection is given by

u0(x) = 2x         for 0 < x < 1/2,
        2(1 − x)   for 1/2 < x < 1,

we find that

u(x, t) = (8/π²) Σ_{k=1}^{∞} [ (−1)^{k−1}/(2k−1)² ] sin( (2k−1)πx ) cos( (2k−1)πct ).
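
A sketch (assuming NumPy is available) that evaluates a truncated version of this series; the wave speed c is left as a parameter, and at t = 0 the value at x = 1/2 should approach the initial peak deflection 1.

    import numpy as np

    def u(x, t, c=1.0, K=50):
        total = np.zeros_like(x, dtype=float)
        for k in range(1, K + 1):
            n = 2 * k - 1
            total += ((-1)**(k - 1) / n**2
                      * np.sin(n * np.pi * x) * np.cos(n * np.pi * c * t))
        return 8.0 / np.pi**2 * total

    x = np.linspace(0.0, 1.0, 201)
    print(u(x, 0.0).max())   # ≈ 1, the initial peak at x = 1/2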
Partial sums

[Plots of partial sums of the series solution omitted.]
Heat equation

Our second example of an initial-boundary value problem is the


application that led, in 1807, to Fourier’s original paper that
introduced the idea of expanding an arbitrary function in a
trigonometric series.
Jean-Baptiste Joseph Fourier 1768–1830
Heat conduction in 1D
Let
u(x, t) = temperature at position x and time t.
Here, we assume that the temperature does not vary in the y and
z directions, and that the physical parameters

K = thermal conductivity,
ρ = mass density,
σ = specific heat,

are all constant (and positive).

Fourier’s law of heat conduction asserts that the heat flux q in the x-direction is given by

q = −K ∂u/∂x.
Conservation law for thermal energy
In a (fixed) region x1 < x < x2, the thermal energy per unit of cross-sectional area equals

∫_{x1}^{x2} ρσu dx.

The heat flux entering the region at x = x1 equals q(x1), and the heat flux entering the region at x = x2 equals −q(x2).

The rate of change of thermal energy must equal the total heat flux entering the region, so

(d/dt) ∫_{x1}^{x2} ρσu dx = q(x1) − q(x2),

∫_{x1}^{x2} ρσ ∂u/∂t dx = − ∫_{x1}^{x2} ∂q/∂x dx.
Heat equation
∫_{x1}^{x2} ( ρσ ∂u/∂t + ∂q/∂x ) dx = 0.

By Fourier’s law, q = −K ∂u/∂x, so

[1/(x2 − x1)] ∫_{x1}^{x2} ( ρσ ∂u/∂t − K ∂²u/∂x² ) dx = 0.

The LHS equals the average value of the integrand over the interval (x1, x2), which tends to the pointwise value at x = x1 if we let x2 → x1. Since the choice of x1 is arbitrary, we conclude that u must satisfy the heat equation,

ρσ ∂u/∂t − K ∂²u/∂x² = 0,

or equivalently,

ut − a uxx = 0,   where a = K/(ρσ).
Part IV

Generalised Fourier Series


Introduction

We have seen how a solution to the vibrating string problem can


be constructed using the Fourier sine series expansion of the initial
data. In this part of the course, we will study abstract Fourier
expansions that include the familiar trigonometric Fourier series as
just one special case. The connection with differential equations
comes about via the theory of Sturm–Liouville eigenproblems.

These generalised Fourier series will allow us to solve partial


differential equations by separating variables in curvilinear
coordinates.
Outline

Complete orthogonal systems

Sturm–Liouville problems

Fourier–Bessel series

Schrödinger equation
Complete orthogonal systems

We now introduce a more general class of inner product, and


investigate properties of orthogonal functions. Expanding a
function as a linear combination of orthogonal functions leads
naturally to the notion of a generalised Fourier series.
Weighted inner product

If w : (a, b) → R satisfies

w(x) > 0   for a < x < b,

then we define the inner product with weight function w by

⟨f, g⟩w = ⟨f, gw⟩ = ∫_a^b f(x)g(x)w(x) dx,

and the corresponding norm by

‖f‖w = √⟨f, f⟩w = ( ∫_a^b [f(x)]² w(x) dx )^{1/2}.

Two functions f and g are orthogonal with respect to w over the interval (a, b) if ⟨f, g⟩w = 0.
Example: Tchebyshev polynomials
The Tchebyshev polynomials T0, T1, T2, . . . are defined recursively by

T0(x) = 1,   T1(x) = x,
Tn+1(x) = 2xTn(x) − Tn−1(x)   for n = 1, 2, 3, . . . .

One can verify by induction on n that

Tn(cos θ) = cos(nθ).

Hence, the substitution x = cos θ shows that if n ≠ j then

∫_{−1}^{1} Tn(x)Tj(x)/√(1 − x²) dx = 0,

and so Tn is orthogonal to Tj with respect to the Tchebyshev weight 1/√(1 − x²).
The space L2

Let L²(a, b, w) denote the set of all (Lebesgue measurable) functions f : (a, b) → R such that ‖f‖w < ∞.

The Cauchy–Schwarz inequality states that if f, g ∈ L²(a, b, w) then ⟨f, g⟩w exists and is finite, with

|⟨f, g⟩w| ≤ ‖f‖w ‖g‖w.

Furthermore, L²(a, b, w) is a vector space with

‖λf‖w = |λ| ‖f‖w,   ‖f + g‖w ≤ ‖f‖w + ‖g‖w.


Orthogonal set of functions

A set of functions S ⊆ L2 (a, b, w) is said to be orthogonal if every


pair of functions in S is orthogonal and if no function is identically
zero on (a, b).

We say that S is orthonormal if, in addition, each function has


norm 1.

Lemma
If S is orthogonal then S is linearly independent.
Proof
Suppose that φ1, φ2, . . . , φN ∈ S satisfy

Σ_{j=1}^{N} Cj φj(x) = 0   for a ≤ x ≤ b.

On the one hand,

⟨ φk, Σ_{j=1}^{N} Cj φj ⟩w = ⟨φk, 0⟩w = 0   for 1 ≤ k ≤ N,

and on the other hand, because φk is orthogonal to φj for j ≠ k,

⟨ φk, Σ_{j=1}^{N} Cj φj ⟩w = Σ_{j=1}^{N} Cj ⟨φk, φj⟩w = Ck ⟨φk, φk⟩w,

so Ck ⟨φk, φk⟩w = 0. Since ⟨φk, φk⟩w = ‖φk‖²w > 0, it follows that Ck = 0 for 1 ≤ k ≤ N. Hence, S is linearly independent.
Generalised Pythagoras theorem

Lemma
If {φ1, . . . , φN} is orthogonal then, for any C1, . . . , CN ∈ R,

‖ Σ_{j=1}^{N} Cj φj ‖²w = Σ_{j=1}^{N} C²j ‖φj‖²w.

Proof.

LHS = ⟨ Σ_{j=1}^{N} Cj φj, Σ_{k=1}^{N} Ck φk ⟩w = Σ_{j=1}^{N} Σ_{k=1}^{N} Cj Ck ⟨φj, φk⟩w
    = Σ_{j=k} + Σ_{j≠k} = Σ_{j=1}^{N} C²j ⟨φj, φj⟩w + 0 = RHS.
Generalised Fourier coefficients

Lemma
If f is in the span of an orthogonal set of functions {φ1, φ2, . . . , φN} in L²(a, b, w), then the coefficients in the representation

f(x) = Σ_{j=1}^{N} Aj φj(x)

are given by

Aj = ⟨f, φj⟩w / ‖φj‖²w   for 1 ≤ j ≤ N.                 (33)

We call Aj the jth Fourier coefficient of f with respect to the given orthogonal set of functions.
Proof

Since

f(x) = Σ_{j=1}^{N} Aj φj(x),

if 1 ≤ k ≤ N then

⟨f, φk⟩w = ⟨ Σ_{j=1}^{N} Aj φj, φk ⟩w = Σ_{j=1}^{N} Aj ⟨φj, φk⟩w
         = Ak ⟨φk, φk⟩w + Σ_{j≠k} = Ak ‖φk‖²w + 0,

so

Ak = ⟨f, φk⟩w / ‖φk‖²w.
Classical Fourier sine coefficients
The sequence of functions

φn(x) = sin(nπx/ℓ),   n = 1, 2, 3, . . . ,

is orthogonal with respect to the weight w(x) = 1 over the interval (a, b) = (0, ℓ), that is,

⟨φn, φj⟩ = ∫_0^ℓ sin(nπx/ℓ) sin(jπx/ℓ) dx = 0   if n ≠ j.

Since

‖φn‖² = ∫_0^ℓ sin²(nπx/ℓ) dx = ℓ/2,

the Fourier (sine) coefficients of a function f(x) are

Aj = ⟨f, φj⟩ / ‖φj‖² = (2/ℓ) ∫_0^ℓ f(x) sin(jπx/ℓ) dx.
Least-squares approximation

Consider approximating a function f ∈ L²(a, b, w) by a function in the span of an orthogonal set {φ1, φ2, . . . , φN}, that is, finding coefficients Cj such that

f(x) ≈ Σ_{j=1}^{N} Cj φj(x)   for a ≤ x ≤ b.

We seek to choose the Cj so that the weighted mean-square error

‖ f − Σ_{j=1}^{N} Cj φj ‖²w = ∫_a^b [ f(x) − Σ_{j=1}^{N} Cj φj(x) ]² w(x) dx

is as small as possible.
Least-squares approximation

Lemma
For all C1, C2, . . . , CN, the weighted mean-square error satisfies

‖ f − Σ_{j=1}^{N} Cj φj ‖²w = ‖f‖²w − Σ_{j=1}^{N} A²j ‖φj‖²w + Σ_{j=1}^{N} (Cj − Aj)² ‖φj‖²w,

and so achieves its unique minimum when Cj equals the jth Fourier coefficient of f, that is, when

Cj = Aj = ⟨f, φj⟩w / ‖φj‖²w   for 1 ≤ j ≤ N.
Proof

‖ f − Σ_{j=1}^{N} Cj φj ‖²w = ⟨ f − Σ_{j=1}^{N} Cj φj, f − Σ_{k=1}^{N} Ck φk ⟩w
  = ⟨f, f⟩w − ⟨ f, Σ_{k=1}^{N} Ck φk ⟩w − ⟨ Σ_{j=1}^{N} Cj φj, f ⟩w + ⟨ Σ_{j=1}^{N} Cj φj, Σ_{k=1}^{N} Ck φk ⟩w
  = ‖f‖²w − Σ_{k=1}^{N} Ck ⟨f, φk⟩w − Σ_{j=1}^{N} Cj ⟨φj, f⟩w + Σ_{j=1}^{N} Σ_{k=1}^{N} Cj Ck ⟨φj, φk⟩w.

Since

Σ_{j=1}^{N} Σ_{k=1}^{N} Cj Ck ⟨φj, φk⟩w = Σ_{j=k} + Σ_{j≠k} = Σ_{j=1}^{N} C²j ‖φj‖²w + 0

and ⟨f, φj⟩w = Aj ‖φj‖²w, we can complete the square as follows:

‖ f − Σ_{j=1}^{N} Cj φj ‖²w = ‖f‖²w − 2 Σ_{j=1}^{N} Cj ⟨f, φj⟩w + Σ_{j=1}^{N} C²j ‖φj‖²w
  = ‖f‖²w − 2 Σ_{j=1}^{N} Cj Aj ‖φj‖²w + Σ_{j=1}^{N} C²j ‖φj‖²w
  = ‖f‖²w + Σ_{j=1}^{N} ( C²j − 2Cj Aj + A²j − A²j ) ‖φj‖²w
  = ‖f‖²w − Σ_{j=1}^{N} A²j ‖φj‖²w + Σ_{j=1}^{N} ( Cj − Aj )² ‖φj‖²w.
Conclusion: Let f ∈ L²(a, b, w) and put

Aj = ⟨f, φj⟩w / ‖φj‖²w.

If f belongs to the span of S = {φ1, . . . , φN}, then

f(x) = Σ_{j=1}^{N} Aj φj(x),   a ≤ x ≤ b,

but if f does not belong to the span of S then

f(x) ≈ Σ_{j=1}^{N} Aj φj(x),   a ≤ x ≤ b,

is the best approximation (in the weighted L²-norm) to f(x) by a function in the span of S.
Example of least-squares approximation
Define f : [0, 1] → R by

f(x) = 1   for 0 < x < 1/2,
       0   for 1/2 < x < 1,

and let φn(x) = sin nπx. Then

Aj = ⟨f, φj⟩ / ‖φj‖² = 2 ∫_0^1 f(x) sin jπx dx = 2 ∫_0^{1/2} sin jπx dx = 2Bj/(jπ),

where

Bj = 1 − cos(jπ/2) = 0   if j ≡ 0 (mod 4),
                     1   if j ≡ 1 or 3 (mod 4),
                     2   if j ≡ 2 (mod 4),

so

f(x) ≈ Σ_{j=1}^{4} Aj φj(x) = (2/π) ( sin πx + sin 2πx + (1/3) sin 3πx ).
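
A sketch (assuming NumPy is available) that evaluates this least-squares approximant directly from the closed-form coefficients:

    import numpy as np

    def A(j):
        Bj = 1.0 - np.cos(j * np.pi / 2.0)
        return 2.0 * Bj / (j * np.pi)

    def approx(x, N=4):
        # N-term least-squares sine approximation of the step function
        return sum(A(j) * np.sin(j * np.pi * x) for j in range(1, N + 1))

    x = np.linspace(0.0, 1.0, 11)
    print(np.round(approx(x), 3))   # crude 4-term fit to the step function

Increasing N sharpens the fit away from the jump at x = 1/2, which is what the convergence and Gibbs-phenomenon plots below illustrate.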
Convergence of least-squares approximation

[Plots of the partial sums f(x) ≈ Σ_{j=1}^{N} Aj φj(x) for increasing N omitted.]

Gibbs phenomenon

[Plot of the partial sum f(x) ≈ Σ_{j=1}^{100} Aj φj(x) omitted.]
Legendre polynomials
Recall that

P0(z) = 1,   P1(z) = z,   P2(z) = (3z² − 1)/2,   P3(z) = (5z³ − 3z)/2.

We will see later that these functions are orthogonal with respect to

⟨f, g⟩ = ∫_{−1}^{1} f(x)g(x) dx,

that is,

⟨Pn, Pj⟩ = 0   if n ≠ j.

It can also be shown that

‖Pn‖² = ∫_{−1}^{1} Pn(x)² dx = 2/(2n + 1).
Example of polynomial least-squares approximation
Let

f(x) = 1   for 0 < x < 1/2,
       0   for −1 < x < 0 or 1/2 < x < 1.

We find that

j          0      1      2        3          4
⟨f, Pj⟩    1/2    1/8    −3/16    −19/128    15/256
‖Pj‖²      2      2/3    2/5      2/7        2/9
Aj         1/4    3/16   −15/32   −133/256   135/512

so

f(x) ≈ Σ_{j=0}^{4} Aj Pj(x) = Σ_{j=0}^{4} [ ⟨f, Pj⟩/‖Pj‖² ] Pj(x)
     = (1/4)P0(x) + (3/16)P1(x) − (15/32)P2(x) − (133/256)P3(x) + (135/512)P4(x).
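
The table entries can be reproduced by numerical integration; a sketch assuming NumPy and SciPy are available:

    import numpy as np
    from scipy.integrate import quad
    from numpy.polynomial.legendre import Legendre

    def f(x):
        return 1.0 if 0.0 < x < 0.5 else 0.0

    for j in range(5):
        Pj = Legendre.basis(j)
        inner, _ = quad(lambda x: f(x) * Pj(x), -1.0, 1.0, points=[0.0, 0.5])
        Aj = inner / (2.0 / (2 * j + 1))    # ||P_j||^2 = 2/(2j+1)
        print(j, Aj)

The printed values agree with the fractions 1/4, 3/16, −15/32, −133/256, 135/512 above.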
Orthogonal sequences of functions

Now let S = {φ1, φ2, φ3, . . .} be a countably infinite orthogonal set in L²(a, b, w). Given any f ∈ L²(a, b, w) we write

f(x) ∼ Σ_{j=1}^{∞} Aj φj(x)

to indicate that Aj is the jth Fourier coefficient of f, i.e.,

Aj = ⟨f, φj⟩w / ‖φj‖²w   for all j ≥ 1.
Example
Fourier sine series: for f ∈ L²(0, π),

f(x) ∼ Σ_{j=1}^{∞} Aj sin jx,   Aj = (2/π) ∫_0^π f(x) sin jx dx.

Example
Fourier–Legendre series: for f ∈ L²(−1, 1),

f(x) ∼ Σ_{j=0}^{∞} Aj Pj(x),   Aj = [(2j + 1)/2] ∫_{−1}^{1} f(x)Pj(x) dx.
Example
Fourier–Tchebyshev series: recall that

T0(x) = 1,   T1(x) = x,   T2(x) = 2x² − 1,   T3(x) = 4x³ − 3x,   . . .

are orthogonal over (−1, 1) with respect to the weight function w(x) = 1/√(1 − x²). The substitution x = cos θ implies that

‖Tj‖²w = ∫_{−1}^{1} Tj(x)²/√(1 − x²) dx = π     if j = 0,
                                          π/2   if j ≥ 1,

so for f ∈ L²(−1, 1, w),

f(x) ∼ A0/2 + Σ_{j=1}^{∞} Aj Tj(x),   Aj = (2/π) ∫_{−1}^{1} f(x)Tj(x)/√(1 − x²) dx.
Example
Trigonometric Fourier series: the functions

1,   cos x,   sin x,   cos 2x,   sin 2x,   cos 3x,   sin 3x,   . . .

are orthogonal over (−π, π) with

∫_{−π}^{π} 1² dx = 2π   and   ∫_{−π}^{π} cos² jx dx = π = ∫_{−π}^{π} sin² jx dx,

so if f ∈ L²(−π, π) then

f(x) ∼ A0/2 + Σ_{j=1}^{∞} ( Aj cos jx + Bj sin jx ),

where

Aj = (1/π) ∫_{−π}^{π} f(x) cos jx dx   and   Bj = (1/π) ∫_{−π}^{π} f(x) sin jx dx.
Bessel inequality

For every N,

0 ≤ ‖ f − Σ_{j=1}^{N} Aj φj ‖²w = ‖f‖²w − Σ_{j=1}^{N} A²j ‖φj‖²w,

so

Σ_{j=1}^{N} A²j ‖φj‖²w ≤ ‖f‖²w,

and therefore the Fourier coefficients satisfy Bessel’s inequality

Σ_{j=1}^{∞} A²j ‖φj‖²w ≤ ‖f‖²w.
Example
The Fourier sine coefficients of f ∈ L²(0, π) satisfy

Σ_{j=1}^{∞} A²j ≤ (2/π) ∫_0^π f(x)² dx.

Example
The Fourier–Legendre coefficients of f ∈ L²(−1, 1) satisfy

Σ_{j=0}^{∞} A²j/(2j + 1) ≤ (1/2) ∫_{−1}^{1} f(x)² dx.

Example
For w(x) = 1/√(1 − x²), the Fourier–Tchebyshev coefficients of f ∈ L²(−1, 1, w) satisfy

A0²/2 + Σ_{j=1}^{∞} A²j ≤ (2/π) ∫_{−1}^{1} f(x)²/√(1 − x²) dx.

Example
The trigonometric Fourier coefficients of f ∈ L²(−π, π) satisfy

A0²/2 + Σ_{j=1}^{∞} ( A²j + B²j ) ≤ (1/π) ∫_{−π}^{π} f(x)² dx.
Completeness

An orthogonal set S is complete if there is no non-trivial function in L²(a, b, w) orthogonal to every function in S, i.e., if the condition

⟨f, φ⟩w = 0   for every φ ∈ S

implies that

‖f‖w = 0.

In particular, if S is a complete orthogonal set, then every proper subset of S fails to be complete.

Example
The set S = { sin jx : j ≥ 1 and j ≠ 7 } is not complete in L²(0, π) because sin 7x is orthogonal to every function in S.
Equivalent definitions of completeness

Theorem
If S = {φ1, φ2, . . .} is orthogonal in L²(a, b, w), then the following properties are equivalent:
1. S is complete;
2. for each f ∈ L²(a, b, w), if Aj denotes the jth Fourier coefficient of f then

   ‖ f − Σ_{j=1}^{N} Aj φj ‖w → 0   as N → ∞;

3. each function f ∈ L²(a, b, w) satisfies Parseval’s identity:

   ‖f‖²w = Σ_{j=1}^{∞} A²j ‖φj‖²w.
Proof that 2 ⇐⇒ 3

We saw already that

‖ f − Σ_{j=1}^{N} Aj φj ‖²w = ‖f‖²w − Σ_{j=1}^{N} A²j ‖φj‖²w,

so

LHS → 0   iff   RHS → 0.

Now observe that RHS → 0 means the same thing as

‖f‖²w = Σ_{j=1}^{∞} A²j ‖φj‖²w.
Proof that 3 =⇒ 1

Assume that Parseval’s identity holds for every f ∈ L²(a, b, w). If

⟨f, φj⟩w = 0   for all j ≥ 1,

then Aj = 0 for all j ≥ 1, so

‖f‖²w = Σ_{j=1}^{∞} A²j ‖φj‖²w = 0,

showing that f = 0. Therefore S is complete.


Proof that 1 =⇒ 2
Assume that S is complete. Given f ∈ L²(a, b, w), Bessel’s inequality implies that the Fourier series of f converges in L²(a, b, w), so we can define

g = f − Σ_{k=1}^{∞} Ak φk.

Since

⟨g, φj⟩w = ⟨f, φj⟩w − Σ_{k=1}^{∞} Ak ⟨φj, φk⟩w = ⟨f, φj⟩w − Aj ‖φj‖²w = 0

for every j ≥ 1, we conclude that g = 0 and therefore 2 holds:

f = Σ_{j=1}^{∞} Aj φj.
Least-squares error
Consider

f(x) ≈ Σ_{j=1}^{N} Aj φj(x)   for a < x < b,

where Aj = ⟨f, φj⟩w/‖φj‖²w is the jth Fourier coefficient of f. We have seen that this choice of Aj minimises the norm in L²(a, b, w) of the error

eN(x) = f(x) − Σ_{j=1}^{N} Aj φj(x).

Corollary
If S = {φ1, φ2, φ3, . . .} is a complete orthogonal sequence in L²(a, b, w), then for any f ∈ L²(a, b, w),

‖eN‖²w = Σ_{j=N+1}^{∞} A²j ‖φj‖²w.
Proof
We saw in the proof of the theorem (2 ⟺ 3) that

‖eN‖²w = ‖f‖²w − Σ_{j=1}^{N} A²j ‖φj‖²w.

Since Parseval’s identity holds,

‖f‖²w = Σ_{j=1}^{∞} A²j ‖φj‖²w,

and therefore

‖eN‖²w = Σ_{j=1}^{∞} A²j ‖φj‖²w − Σ_{j=1}^{N} A²j ‖φj‖²w = Σ_{j=N+1}^{∞} A²j ‖φj‖²w.
Parseval identity
Each of our four examples of orthogonal sequences is in fact complete.

Example
The Fourier sine coefficients of f ∈ L²(0, π) satisfy

Σ_{j=1}^{∞} A²j = (2/π) ∫_0^π f(x)² dx.

Example
The Fourier–Legendre coefficients of f ∈ L²(−1, 1) satisfy

Σ_{j=0}^{∞} A²j/(2j + 1) = (1/2) ∫_{−1}^{1} f(x)² dx.

Example
For w(x) = 1/√(1 − x²), the Fourier–Tchebyshev coefficients of f ∈ L²(−1, 1, w) satisfy

A0²/2 + Σ_{j=1}^{∞} A²j = (2/π) ∫_{−1}^{1} f(x)²/√(1 − x²) dx.

Example
The trigonometric Fourier coefficients of f ∈ L²(−π, π) satisfy

A0²/2 + Σ_{j=1}^{∞} ( A²j + B²j ) = (1/π) ∫_{−π}^{π} f(x)² dx.
Estimating the least-squares error
Consider

f(x) = x   for 0 < x < π.

This function has the Fourier sine series

f(x) ∼ 2 Σ_{n=1}^{∞} [ (−1)^{n+1}/n ] sin nx,

so

f(x) = 2 Σ_{n=1}^{N} [ (−1)^{n+1}/n ] sin nx + eN(x),

and the error term eN(x) satisfies

∫_0^π eN(x)² dx = ‖eN‖² = Σ_{n=N+1}^{∞} A²n ‖φn‖² = Σ_{n=N+1}^{∞} [ 2(−1)^{n+1}/n ]² (π/2).

Thus,

‖eN‖² = 2π Σ_{n=N+1}^{∞} 1/n² ≤ 2π ∫_N^∞ dt/t² = 2π/N,

and so

‖eN‖ = O(N^{−1/2}).

In other words, the root mean square error for the approximation

x ≈ 2 ( sin x − sin 2x/2 + ··· + (−1)^{N+1} sin Nx/N ),   0 < x < π,

tends to zero like N^{−1/2}.


Sturm–Liouville problems

Recall that the eigenvectors of a real N × N symmetric matrix form an orthogonal basis for R^N. In the same way, we will see that a type of homogeneous boundary-value problem generates a sequence of functions that are orthogonal with respect to an associated weight function.

The sequence of orthogonal functions turns out to be complete,


but the proof of this fact relies on results from functional analysis
(Math5605) about a compact linear operator on a Hilbert space.
Sturm–Liouville operator
An ODE of the form

[p(x)u′]′ + [λr(x) − q(x)]u = 0,   a < x < b,           (34)

is called a Sturm–Liouville equation. The coefficients p, q, r must


all be real-valued with

p(x) > 0 and r(x) > 0 for a < x < b.

Defining the formally self-adjoint differential operator

Lu = −[p(x)u′]′ + q(x)u,                                (35)

we can write (34) as

Lu = λru on (a, b).


Eigenfunctions and eigenvalues
Any non-trivial (possibly complex-valued) solution u satisfying Lu = λru on (a, b) (plus appropriate boundary conditions) is said to be an eigenfunction of L with eigenvalue λ. In this case, we refer to the pair (u, λ) as an eigenpair.

Example
Legendre’s equation

(1 − x2 )u 00 − 2xu 0 + ν(ν + 1)u = 0

is equivalent to

[(1 − x2 )u 0 ] 0 + ν(ν + 1)u = 0

which is of the form (34) with

p(x) = 1 − x2 , q(x) = 0, r(x) = 1, λ = ν(ν + 1).


Boundary conditions

Assume as before that p, q, r are real-valued with p(x) > 0 and r(x) > 0 for a < x < b. A regular Sturm–Liouville eigenproblem is of the form

Lu = λru                     for a < x < b,
B1u = b11 u′ + b10 u = 0     at x = a,                  (36)
B2u = b21 u′ + b20 u = 0     at x = b,

where a and b are finite with

p(a) ≠ 0   and   p(b) ≠ 0,

and where b10, b11, b20, b21 are real with

|b10| + |b11| ≠ 0   and   |b20| + |b21| ≠ 0.


Eigenfunctions are orthogonal
We now describe how a Sturm–Liouville problem leads to an
associated complete orthogonal system, and hence to a generalised
Fourier expansion.
Theorem
Let L be a Sturm–Liouville differential operator (35). If u,
v : [a, b] → C satisfy

Lu = λru on (a, b), with B1 u = 0 = B2 u,

and
Lv = µrv on (a, b), with B1 v = 0 = B2 v,
and if λ ≠ µ, then u and v are orthogonal on the interval (a, b) with respect to the weight function r(x), i.e.,

⟨u, v⟩r = ∫_a^b u(x)v(x)r(x) dx = 0.
Proof

Using the Lagrange identity,

(λ − µ)⟨u, v⟩r = (λ − µ) ∫_a^b u(x)v(x)r(x) dx
  = ∫_a^b [λr(x)u(x)] v(x) dx − ∫_a^b u(x) [µr(x)v(x)] dx
  = ∫_a^b Lu(x) v(x) dx − ∫_a^b u(x) Lv(x) dx
  = Σ_{j=1}^{2} ( Bju Rjv − Rju Bjv ) = 0,

since Bju = 0 = Bjv for j ∈ {1, 2}. By hypothesis, λ ≠ µ, so λ − µ ≠ 0, and we conclude that ⟨u, v⟩r = 0.
Eigenvalues are real

Theorem
Let L be a Sturm–Liouville differential operator (35). If
u : [a, b] → C is not identically zero and satisfies

Lu = λru on (a, b), with B1 u = 0 = B2 u,

then λ is real.

In the same way, we can prove that for a real, symmetric (or
complex, Hermitian) matrix:
• eigenvectors with distinct eigenvalues are orthogonal;
• every eigenvalue is real.
Proof

Since p(x), q(x), r(x) are each real, taking the complex conjugate of Lu = λru gives

Lū = −(pū′)′ + qū = λ̄rū,

so ū is an eigenfunction with eigenvalue λ̄. Thus, if λ ≠ λ̄ then

0 = ⟨u, ū⟩r = ∫_a^b u(x)ū(x)r(x) dx = ∫_a^b |u|² r dx,

implying u ≡ 0, a contradiction.
We conclude λ = λ̄, that is, λ is real.
We conclude λ = λ̄, that is, λ is real.
Completeness of the eigenfunctions

Theorem
The regular Sturm–Liouville problem (36) has an infinite sequence
of eigenfunctions φ1 , φ2 , φ3 , . . . with corresponding eigenvalues
λ1 , λ2 , λ3 , . . . and moreover:
1. the eigenfunctions φ1 , φ2 , φ3 , . . . form a complete
orthogonal system on the interval (a, b) with respect to the
weight function r(x);
2. the eigenvalues satisfy λ1 < λ2 < λ3 < · · · with λj → ∞ as
j → ∞.

In the same way, for every real symmetric (or complex, Hermitian)
n × n matrix A there is an orthogonal basis for R^n consisting of
eigenvectors of A.
Real trigonometric Fourier series
The various trigonometric Fourier series arise from the Sturm–Liouville ODE

u″ + λu = 0.

For the interval [−ℓ, ℓ], periodic boundary conditions

u(ℓ) = u(−ℓ)   and   u′(ℓ) = u′(−ℓ)

lead to the eigenfunctions

φ2j(x) = cos(jπx/ℓ)    for j = 0, 1, 2, . . . ,
φ2j−1(x) = sin(jπx/ℓ)  for j = 1, 2, . . . ,

and (non-distinct!) eigenvalues

λ0 = 0,   λ2j = λ2j−1 = (jπ/ℓ)²,   j ≥ 1.
Complex trigonometric Fourier series
Again taking

u″ + λu = 0,   u(ℓ) = u(−ℓ),   u′(ℓ) = u′(−ℓ),

we can instead take

φj(x) = e^{ijπx/ℓ}   and   λj = (jπ/ℓ)²   for j = 0, ±1, ±2, . . . .

Here, the φj are orthogonal with respect to the complex inner product

⟨f, g⟩ = ∫_{−ℓ}^{ℓ} f(x)ḡ(x) dx,

so

f(x) ∼ Σ_{j=−∞}^{∞} Aj e^{ijπx/ℓ},   Aj = (1/2ℓ) ∫_{−ℓ}^{ℓ} f(x)e^{−ijπx/ℓ} dx.
Half-range trigonometric Fourier series
Consider

u″ + λu = 0   for 0 < x < ℓ.

Homogeneous Dirichlet boundary conditions

u(0) = u(ℓ) = 0

lead to

φj(x) = sin(jπx/ℓ)   and   λj = (jπ/ℓ)²   for j = 1, 2, 3, . . . ,

whereas homogeneous Neumann boundary conditions

u′(0) = u′(ℓ) = 0

lead to

φj(x) = cos(jπx/ℓ)   and   λj = (jπ/ℓ)²   for j = 0, 1, 2, . . . .
Least-squares error for cosine expansion
Consider once again the function

f(x) = x   for 0 < x < π.

This time, instead of the sine series use the cosine series

f(x) ∼ π/2 + Σ_{n=1}^{∞} [ 2(cos nπ − 1)/(πn²) ] cos nx
     ∼ π/2 − (4/π) Σ_{k=1}^{∞} cos((2k − 1)x)/(2k − 1)²,

and estimate the error term eN(x) where

f(x) = π/2 + Σ_{n=1}^{N} [ 2(cos nπ − 1)/(πn²) ] cos nx + eN(x).

With N = 2K − 1,

‖eN‖² = Σ_{n=N+1}^{∞} A²n ‖φn‖² = Σ_{k=K}^{∞} [ 4/(π(2k − 1)²) ]² (π/2)
      = (8/π) Σ_{k=K}^{∞} 1/(2k − 1)⁴ = (1/2π) Σ_{k=K}^{∞} 1/(k − 1/2)⁴
      ≤ (1/2π) ∫_{K−1}^{∞} dt/(t − 1/2)⁴ = (1/6π)(K − 3/2)^{−3} = O(N^{−3}).

Thus, for this choice of f, the root mean square error using the cosine expansion is

‖eN‖ = O(N^{−3/2}),

versus O(N^{−1/2}) for the sine expansion.


Fourier–Bessel series

In this section, we study a singular Sturm–Liouville problem that


will arise when we deal with eigenproblems for partial differential
equations. The proof that the eigenfunctions are orthogonal needs
to be modified because of the different type of boundary condition
imposed at x = 0.
A singular Sturm–Liouville problem
Consider the parameterised Bessel equation of order ν ≥ 0,

x²u″ + xu′ + (k²x² − ν²)u = 0,

which can be written in Sturm–Liouville form as

(xu′)′ + (k²x − ν²x^{−1})u = 0,

i.e.,

p(x) = x,   r(x) = x,   λ = k²,   q(x) = ν²/x.

On the interval [0, ℓ], we do not have a regular Sturm–Liouville problem because

p(0) = 0

and q(x) is unbounded as x → 0⁺.
A condition ensuring λ ≥ 0

Lemma
Consider the ODE

(xu′)′ + (λx − ν²x^{−1})u = 0   for 0 < x < ℓ,          (37)

with boundary conditions

u(x) bounded and xu′(x) → 0 as x → 0⁺,
c1u′ + c0u = 0 at x = ℓ,                                (38)

and suppose that c0c1 ≥ 0.

1. If a non-trivial solution u exists, then λ ≥ 0.
2. A non-trivial solution u exists for λ = 0 only when ν = 0 and c0 = 0, in which case u is a constant.
Proof

Multiply (37) by u and integrate to obtain

∫_0^ℓ [ (xu′)′u + (λx − ν²x^{−1})u² ] dx = 0.

Integration by parts gives

∫_0^ℓ (xu′)′u dx = ℓu′(ℓ)u(ℓ) − ∫_0^ℓ x(u′)² dx,

since the first boundary condition ensures xu′(x)u(x) → 0 as x → 0, so

λ ∫_0^ℓ xu² dx = ∫_0^ℓ x(u′)² dx + ν² ∫_0^ℓ x^{−1}u² dx − ℓu′(ℓ)u(ℓ).

The boundary condition at x = ℓ gives

−u′(ℓ)u(ℓ) = u′(ℓ) [ −u(ℓ) ] = u′(ℓ)(c1/c0)u′(ℓ) = (c1/c0)[u′(ℓ)]²,

if c0 ≠ 0, and

−u′(ℓ)u(ℓ) = [ −u′(ℓ) ] u(ℓ) = (c0/c1)u(ℓ)u(ℓ) = (c0/c1)[u(ℓ)]²,

if c1 ≠ 0. In either case, −u′(ℓ)u(ℓ) ≥ 0 because c0c1 ≥ 0.

Hence, λ ≥ 0, proving part 1.

If λ = 0 then u satisfies

∫_0^ℓ x(u′)² dx + ν² ∫_0^ℓ x^{−1}u² dx = ℓu′(ℓ)u(ℓ) ≤ 0.

For a non-trivial u, it follows that u′ ≡ 0 and ν² = 0, so u is constant and ν = 0, proving part 2.
Bessel functions

Theorem
Assume that c0c1 ≥ 0. If c0 ≠ 0 or ν > 0, then the eigenvalues λ1, λ2, . . . and eigenfunctions φ1, φ2, . . . of the Bessel equation (37) with boundary conditions (38) are given by

λj = k²j   and   φj(x) = Jν(kjx)   for j ≥ 1,

where kj is the jth positive solution of the equation

c0 Jν(kℓ) + c1 k Jν′(kℓ) = 0.                           (39)

If c0 = 0 = ν, so that u′(ℓ) = 0 by (38), then we have an additional eigenvalue and eigenfunction

λ0 = 0   and   φ0(x) = 1.
Proof

First suppose that λ = k² > 0. Then (37) is equivalent to the parameterised Bessel equation

x²u″ + xu′ + (k²x² − ν²)u = 0,

which has the general solution u = AJν(kx) + BYν(kx). Since Yν(kx) is unbounded as x → 0, we must have B = 0, so u = AJν(kx) and u′ = AkJν′(kx). Thus, at x = ℓ,

c1u′ + c0u = A [ c1 k Jν′(kℓ) + c0 Jν(kℓ) ],

showing that the boundary condition holds iff k is a solution of (39), in which case Jν(kx) is an eigenfunction.

When λ = 0, we already know that c0 = 0 = ν and u = A for some constant A.
Orthogonality and normalization
Theorem
Assume that c0c1 ≥ 0. The Bessel eigenfunctions, satisfying (37) and (38), have the orthogonality property

∫_0^ℓ φi(x)φj(x) x dx = 0   if i ≠ j.

If c1 ≠ 0, then

∫_0^ℓ φj(x)² x dx = (1/2k²j) [ (ℓc0/c1)² + (kjℓ)² − ν² ] Jν(kjℓ)²   for j ≥ 1,

whereas if c1 = 0, so that φj(ℓ) = 0 by (38), then

∫_0^ℓ φj(x)² x dx = (ℓ²/2) Jν+1(kjℓ)²   for j ≥ 1.

In the case c0 = 0 = ν, we have ∫_0^ℓ φ0(x)² x dx = ℓ²/2.
Proof of orthogonality
Put Lu = −(xu′)′ + ν²x^{−1}u, so that Lφj = λj x φj. It suffices to show that

⟨Lu, v⟩ = ⟨u, Lv⟩   for all u, v ∈ D,

where D is the space of C² functions satisfying the boundary conditions (38).

In fact, if u, v ∈ D then x(u′v − uv′) → 0 as x → 0, so the Lagrange identity gives

⟨Lu, v⟩ − ⟨u, Lv⟩ = −ℓ [ u′(ℓ)v(ℓ) − u(ℓ)v′(ℓ) ],

and it suffices to prove that u′v − uv′ = 0 at x = ℓ. Using the boundary conditions c1u′ + c0u = 0 = c1v′ + c0v, we have

c0u′v = u′(c0v) = u′(−c1v′) = (−c1u′)v′ = (c0u)v′ = c0uv′

and c1u′v = (−c0u)v = u(−c0v) = u(c1v′) = c1uv′, so cj(u′v − uv′) = 0 for j ∈ {0, 1}. At least one of c0 and c1 is not zero, and thus u′v − uv′ = 0 at x = ℓ.
Fourier–Bessel expansion

For instance, when c1 = 0 we have

f(x) ∼ Σ_{j=1}^{∞} Aj Jν(kjx),   0 < x < ℓ,

with

Aj = [ ∫_0^ℓ f(x)Jν(kjx) x dx ] / [ ∫_0^ℓ Jν(kjx)² x dx ]
   = [ 2/(ℓ² Jν+1(kjℓ)²) ] ∫_0^ℓ f(x)Jν(kjx) x dx.
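
A sketch (assuming NumPy and SciPy are available) of this expansion for ν = 0, ℓ = 1, taking f ≡ 1 as hypothetical data purely for illustration: when c1 = 0, kjℓ is the jth positive zero of Jν.

    import numpy as np
    from scipy.special import jv, jn_zeros
    from scipy.integrate import quad

    nu, ell = 0, 1.0
    f = lambda x: 1.0                    # hypothetical data to expand

    zeros = jn_zeros(nu, 5)              # first five positive zeros of J_0
    for j, z in enumerate(zeros, start=1):
        kj = z / ell
        inner, _ = quad(lambda x: f(x) * jv(nu, kj * x) * x, 0.0, ell)
        Aj = 2.0 * inner / (ell**2 * jv(nu + 1, kj * ell)**2)
        print(j, Aj)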
Plots when ℓ = 1, ν = 0, c1 = 0

[Eigenfunction plots omitted.]

Partial sums

[Partial-sum plots omitted.]
Another singular Sturm–Liouville problem
Consider the Legendre equation with parameter ν,

(1 − x²)u″ − 2xu′ + ν(ν + 1)u = 0   for −1 < x < 1,

which can be written in Sturm–Liouville form as

[(1 − x²)u′]′ + ν(ν + 1)u = 0,

i.e.,

p(x) = 1 − x²,   r(x) = 1,   λ = ν(ν + 1),   q(x) = 0.

On the interval [−1, 1] we do not have a regular Sturm–Liouville problem because

p(−1) = 0 = p(1).

We assume without loss of generality that ν ≥ −1/2 because

ν(ν + 1) = µ(µ + 1)   if µ = −ν − 1.
Eigenvalues are positive

Consider the ODE
\[
[(1 - x^2)u']' + \lambda u = 0 \tag{40}
\]
with boundary conditions
\[
u(x) \text{ bounded and } (1 \mp x)u'(x) \to 0 \text{ as } x \to \pm 1. \tag{41}
\]
Multiplying the ODE by u and integrating (by parts), we find
\[
\lambda\int_{-1}^1 u^2\,dx = \int_{-1}^1 (1 - x^2)(u')^2\,dx,
\]
so if a non-trivial solution u exists then $\lambda \ge 0$.
Legendre polynomials are orthogonal
If ν is not an integer then the boundary-value problem (40)–(41)
has only the trivial solution u ≡ 0 (see the tutorial problems),
whereas if ν = n is a non-negative integer, then all solutions have
the form
u = CPn (x),
where Pn is the Legendre polynomial of degree n, so

φn (x) = Pn (x) and λn = n(n + 1)

for n = 0, 1, 2, 3, . . . .
Theorem
The Legendre polynomials are orthogonal over the interval $[-1, 1]$:
\[
\langle P_n, P_m\rangle = \int_{-1}^1 P_n(x)P_m(x)\,dx = 0 \quad\text{if } m \neq n.
\]
Proof
Since $r(x) = 1$, it suffices to show that the Legendre differential operator $Lu = -[(1 - x^2)u']'$ is self-adjoint:
\[
\langle Lu, v\rangle = \langle u, Lv\rangle \quad\text{for all } u, v \in D,
\]
where D is the space of $C^2$ functions on $(-1, 1)$ that satisfy the boundary conditions (41).
In fact, the Lagrange identity shows that
\[
\langle Lu, v\rangle - \langle u, Lv\rangle = \Bigl[-(1 - x^2)(u'v - uv')\Bigr]_{-1}^{1},
\]
which is zero by (41).
Plots of the Legendre polynomials $P_n(x)$ (figures omitted).
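The orthogonality relation is easy to confirm numerically; the following sketch (illustrative only) uses Gauss–Legendre quadrature, which integrates the polynomial products exactly:

```python
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.special import eval_legendre

# Gauss-Legendre nodes/weights integrate polynomials on [-1, 1]
# exactly up to degree 2*40 - 1, ample for the products below.
x, w = leggauss(40)

for n in range(4):
    for m in range(4):
        inner = np.sum(w * eval_legendre(n, x) * eval_legendre(m, x))
        print(f"<P{n},P{m}> = {inner:+.6f}", end="  ")
    print()
# Off-diagonal entries vanish; the diagonal entries equal 2/(2n+1).
```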
Schrödinger equation

To conclude this part of the course, we use separation of variables


to solve a partial differential equation that arises in quantum
theory, and has a similar structure to the heat equation but with
the imaginary unit i multiplying the time derivative term. As a
consequence, the modes that arise do not decay exponentially but
instead have a complex phase factor.
Erwin Schrödinger 1887–1961
Quantum wave function

The state of a quantum mechanical particle in 1D can be represented by a complex-valued wave function $\psi(x,t)$. We can interpret $|\psi(x,t)|^2$ as the probability density function for the position x of the particle, so that
\[
\int_a^b |\psi(x,t)|^2\,dx
\]
is the probability that the particle lies in the interval $[a, b]$ at time t. Thus, the wave function must have the normalisation
\[
\int_{-\infty}^{\infty} |\psi(x,t)|^2\,dx = 1.
\]
Time-dependent Schrödinger equation

In 1925, Erwin Schrödinger postulated that ψ obeys the PDE

∂ψ h2 ∂2 ψ
ih + − V(x, t)ψ = 0. (42)
∂t 2m ∂x2
Here, m is the mass of the particle, V(x, t) is the potential of the
applied force and h = h/(2π) where

h = 6.626 070 040 × 10−34 Joule-seconds

is Planck’s constant.
Separable solutions
Suppose that V does not depend on t, and look for separable solutions
\[
\psi(x,t) = u(x)T(t).
\]
We find that
\[
\frac{i\hbar T'}{T} = E = -\frac{\hbar^2}{2m}\frac{u''}{u} + V(x),
\]
where the separation constant E has the physical dimensions of energy.
Thus,
\[
T' = -\frac{iE}{\hbar}T,
\]
and we may take $T(t) = e^{-iEt/\hbar}$, so that
\[
\psi(x,t) = e^{-iEt/\hbar}u(x).
\]
Time-independent Schrödinger equation

The spatial dependence of the wave function satisfies
\[
\frac{\hbar^2}{2m}u'' + \bigl(E - V(x)\bigr)u = 0, \tag{43}
\]
which has the form of a Sturm–Liouville ODE (34) with
\[
p(x) = \frac{\hbar^2}{2m}, \quad \lambda = E, \quad r(x) = 1, \quad q(x) = V(x).
\]
If $(\phi_n, E_n)$ is the nth eigenpair, then $\phi_n$ and $E_n$ are real, and
\[
\psi_n(x,t) = e^{-iE_nt/\hbar}\phi_n(x)
\]
is a solution of the time-dependent Schrödinger equation (42).
Eigenstates

Since $|\psi_n(x,t)|^2 = \phi_n(x)^2$, the eigenfunction $\phi_n$ must be normalised to satisfy
\[
\int_{-\infty}^{\infty} \phi_n(x)^2\,dx = 1. \tag{44}
\]
In this case, $\psi_n$ is called the nth eigenstate. Physically, the eigenvalue $E_n$ is the energy of the particle in this eigenstate.
The general solution is then a superposition
\[
\psi(x,t) = \sum_{n=0}^\infty A_n\psi_n(x,t) = \sum_{n=0}^\infty A_ne^{-iE_nt/\hbar}\phi_n(x).
\]
Define the complex inner product and norm with weight function $r(x) = 1$:
\[
\langle f, g\rangle = \int_{-\infty}^{\infty} f(x)\overline{g(x)}\,dx \quad\text{and}\quad \|f\|^2 = \int_{-\infty}^{\infty} |f(x)|^2\,dx.
\]
The $\phi_n$ are orthonormal ($\langle\phi_n, \phi_j\rangle = \delta_{nj}$), so the amplitudes $A_n$ satisfy Parseval's identity:
\[
1 = \int_{-\infty}^{\infty} |\psi(x,t)|^2\,dx = \langle\psi(\cdot,t), \psi(\cdot,t)\rangle
= \Bigl\langle \sum_{n=0}^\infty A_ne^{-iE_nt/\hbar}\phi_n,\ \sum_{k=0}^\infty A_ke^{-iE_kt/\hbar}\phi_k \Bigr\rangle
\]
\[
= \sum_{n=0}^\infty\sum_{k=0}^\infty A_ne^{-iE_nt/\hbar}\,\overline{A_k}\,e^{+iE_kt/\hbar}\langle\phi_n, \phi_k\rangle
= \sum_{n=0}^\infty |A_n|^2.
\]
Quantum mechanical harmonic oscillator
For the classical harmonic oscillator, $m\ddot x + kx = 0$, the restoring force $-kx$ has the potential $V(x) = \frac12 kx^2$, and the motion of the particle is given by
\[
x_{\text{Classical}} = A\cos\omega t + B\sin\omega t \quad\text{where}\quad \omega = \sqrt{k/m}.
\]
Putting $V(x) = \frac12 kx^2 = \frac12 m\omega^2x^2$ in the time-independent Schrödinger equation (43), we have
\[
\frac{\hbar^2}{2m}u'' + \bigl(E - \tfrac12 m\omega^2x^2\bigr)u = 0. \tag{45}
\]
To simplify the coefficients, let
\[
v(y) = u(x) \quad\text{where}\quad y = (m\omega/\hbar)^{1/2}x,
\]
and obtain
\[
v'' + (2\alpha + 1 - y^2)v = 0, \qquad \alpha = \frac{E}{\hbar\omega} - \frac12.
\]
Series solutions
Next we put
\[
v(y) = e^{-y^2/2}w(y)
\]
and find that w satisfies Hermite's equation
\[
w'' - 2yw' + 2\alpha w = 0,
\]
which admits power series solutions
\[
w(y) = \sum_{k=0}^\infty B_ky^k.
\]
The recurrence relation for the coefficients is
\[
B_{k+2} = \frac{2k - 2\alpha}{(k+2)(k+1)}B_k \quad\text{for all } k \ge 0.
\]
Series solutions
Thus, $w(y) = B_0w_0(y) + B_1w_1(y)$ where
\[
w_0 = 1 - \frac{2\alpha}{2!}y^2 - \frac{2^2(2-\alpha)\alpha}{4!}y^4 - \frac{2^3(4-\alpha)(2-\alpha)\alpha}{6!}y^6 - \cdots,
\]
\[
w_1 = y + \frac{2(1-\alpha)}{3!}y^3 + \frac{2^2(3-\alpha)(1-\alpha)}{5!}y^5 + \cdots.
\]
It can be shown that if α is not a non-negative integer then $w_0$ and $w_1$ are both $O(e^{y^2})$, so $v(y)$ is $O(e^{y^2/2})$ and the normalisation condition (44) is impossible to satisfy.
If $\alpha = n$ is even, then $w_0$ is a polynomial of degree n but $w_1$ is again $O(e^{y^2})$.
If $\alpha = n$ is odd, then $w_1$ is a polynomial of degree n but $w_0$ is again $O(e^{y^2})$.
Hermite polynomials
The Hermite polynomial $H_n$ of degree n is the polynomial solution of
\[
w'' - 2yw' + 2nw = 0, \tag{46}
\]
normalised by the condition that the coefficient of $y^n$ equals $2^n$.
We conclude that the eigenpairs $(\phi_n, E_n)$ of (45) are given by
\[
\phi_n(x) = C_ne^{-y^2/2}H_n(y) \quad\text{and}\quad E_n = (n + \tfrac12)\hbar\omega
\]
for $n \in \{0, 1, 2, \dots\}$ and suitable $C_n$.
Writing the Hermite ODE (46) in self-adjoint form,
\[
(e^{-y^2}w')' + 2ne^{-y^2}w = 0,
\]
we see that
\[
\int_{-\infty}^{\infty} H_n(y)H_k(y)e^{-y^2}\,dy = 0, \quad n \neq k.
\]
Normalisation
It can be shown that
\[
\int_{-\infty}^{\infty} H_n(y)^2e^{-y^2}\,dy = 2^nn!\sqrt{\pi},
\]
so
\[
\int_{-\infty}^{\infty} \phi_n(x)^2\,dx
= C_n^2\left(\frac{\hbar}{m\omega}\right)^{1/2}\int_{-\infty}^{\infty}\bigl(e^{-y^2/2}H_n(y)\bigr)^2\,dy
= C_n^2\left(\frac{\hbar}{m\omega}\right)^{1/2}2^nn!\sqrt{\pi},
\]
and we therefore set
\[
C_n = \left(\frac{m\omega}{\hbar}\right)^{1/4}\left(\frac{2^{-n}}{n!\sqrt{\pi}}\right)^{1/2}.
\]
Plots of the standing waves $\phi_n$ and probability densities $|\psi_n|^2$ (figures omitted).
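In dimensionless units ($\hbar = m = \omega = 1$, so $y = x$ and $C_n = (2^{-n}/(n!\sqrt\pi))^{1/2}$), the eigenstates and the normalisation (44) can be checked numerically. A sketch; the grid extent and resolution are arbitrary choices:

```python
import numpy as np
from scipy.special import eval_hermite
from math import pi, sqrt, factorial

def phi(n, y):
    """Normalised oscillator eigenfunction (hbar = m = omega = 1)."""
    Cn = (2.0**(-n) / (factorial(n) * sqrt(pi)))**0.5
    return Cn * np.exp(-y**2 / 2) * eval_hermite(n, y)

y = np.linspace(-10, 10, 4001)
for n in range(4):
    norm = np.trapz(phi(n, y)**2, y)     # should equal 1 by (44)
    E = n + 0.5                          # E_n = (n + 1/2) in these units
    print(f"n={n}: norm = {norm:.6f}, E_n = {E}")
```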
Part V

Initial-Boundary Value Problems in 2D


Introduction

In this final part of the course, we use separation of variables to


solve partial differential equations involving two space variables.
First we see how the Fredholm alternative applies to
boundary-value problems in 2D (or 3D) for a class of self-adjoint
partial differential equations. Here, the second Green identity plays
the role of the Lagrange identity in 1D. Next we generalise the
Sturm–Liouville theory to obtain complete orthogonal systems of
eigenfunctions in 2D, allowing us to construct solutions to
initial-boundary value problems in 2D as generalised Fourier series.

The significance of Bessel’s equation will finally be revealed.


Outline

Elliptic differential operators

Green identities and boundary-value problems

Elliptic eigenproblems

Wave and diffusion equations


Elliptic differential operators

We begin by defining a class of second-order, linear partial


differential operators that are well-behaved in certain key respects.
These operators appear as the spatial part of three key partial
differential equations, each of which has many important
applications.
Reminder: notation from vector calculus
Partial derivative operator $\partial_j = \partial/\partial x_j$.
For a scalar field $u : \mathbb{R}^d \to \mathbb{R}$, the gradient is the vector field $\operatorname{grad} u : \mathbb{R}^d \to \mathbb{R}^d$ defined by
\[
\operatorname{grad} u = \nabla u = \sum_{j=1}^d (\partial_ju)\,e_j
= \begin{bmatrix} \partial_1u \\ \partial_2u \\ \vdots \\ \partial_du \end{bmatrix}.
\]
For a vector field $F : \mathbb{R}^d \to \mathbb{R}^d$, the divergence is the scalar field $\operatorname{div} F : \mathbb{R}^d \to \mathbb{R}$ defined by
\[
\operatorname{div} F = \nabla\cdot F = \sum_{j=1}^d \partial_jF_j = \partial_1F_1 + \partial_2F_2 + \cdots + \partial_dF_d.
\]
Second-order linear PDEs in $\mathbb{R}^d$
The most general second-order linear partial differential operator in $\mathbb{R}^d$ has the form
\[
Lu = -\sum_{j=1}^d\sum_{k=1}^d a_{jk}(x)\partial_j\partial_ku + \sum_{k=1}^d b_k(x)\partial_ku + c(x)u. \tag{47}
\]
Example
The Laplacian is defined by $\nabla^2u = \nabla\cdot(\nabla u) = \operatorname{div}(\operatorname{grad} u)$, that is,
\[
\nabla^2u = \sum_{j=1}^d \partial_j^2u = \partial_1^2u + \partial_2^2u + \cdots + \partial_d^2u.
\]
Thus, $-\nabla^2u$ has the form (47) with
\[
a_{jk}(x) = \delta_{jk}, \quad b_k(x) = 0, \quad c(x) = 0.
\]

Principal part
We call
\[
L_0u = -\sum_{j=1}^d\sum_{k=1}^d a_{jk}(x)\partial_j\partial_ku \tag{48}
\]
the principal part of the partial differential operator (47).
In many applications, L arises in the divergence form
\[
Lu = -\operatorname{div}(A\operatorname{grad} u) = -\sum_{j=1}^d\sum_{k=1}^d \partial_j\bigl(a_{jk}(x)\partial_ku\bigr),
\]
where $A = [a_{jk}]$. We can still write such an L in the general form (47) with the same principal part (48) and with lower-order coefficients
\[
b_k(x) = -\sum_{j=1}^d \partial_ja_{jk}(x) \quad\text{and}\quad c(x) = 0.
\]
Ellipticity
Definition
A second-order linear partial differential operator (47) is (uniformly) elliptic in a subset $\Omega \subseteq \mathbb{R}^d$ if there exists a positive constant c such that
\[
\xi^TA(x)\xi \ge c\|\xi\|^2 \quad\text{for all } x \in \Omega \text{ and } \xi \in \mathbb{R}^d.
\]
Written out explicitly, the ellipticity condition reads
\[
\sum_{j=1}^d\sum_{k=1}^d a_{jk}(x)\xi_j\xi_k \ge c\sum_{k=1}^d \xi_k^2.
\]
Example
The operator $L = -\nabla^2$ is elliptic (with $c = 1$) on any $\Omega \subseteq \mathbb{R}^d$, since
\[
\sum_{j=1}^d\sum_{k=1}^d \delta_{jk}\xi_j\xi_k = \sum_{k=1}^d \xi_k^2 = \|\xi\|^2.
\]
Example
The operator $L = -(\partial_1^2 + 2\partial_2^2 - \partial_3^2)$ is not elliptic in $\mathbb{R}^3$, since in this case the quadratic form
\[
\xi^TA\xi = [\xi_1\ \xi_2\ \xi_3]\begin{bmatrix}1 & 0 & 0\\ 0 & 2 & 0\\ 0 & 0 & -1\end{bmatrix}\begin{bmatrix}\xi_1\\ \xi_2\\ \xi_3\end{bmatrix} = \xi_1^2 + 2\xi_2^2 - \xi_3^2
\]
is negative if $\xi_1 = \xi_2 = 0$ and $\xi_3 \neq 0$.
Symmetry and skew-symmetry
Put
\[
a^{\mathrm{sy}}_{jk} = \frac12\bigl(a_{jk} + a_{kj}\bigr) = \text{symmetric part of } a_{jk},
\]
\[
a^{\mathrm{sk}}_{jk} = \frac12\bigl(a_{jk} - a_{kj}\bigr) = \text{skew-symmetric part of } a_{jk},
\]
so that
\[
a_{jk} = a^{\mathrm{sy}}_{jk} + a^{\mathrm{sk}}_{jk}, \qquad a^{\mathrm{sy}}_{kj} = a^{\mathrm{sy}}_{jk}, \qquad a^{\mathrm{sk}}_{kj} = -a^{\mathrm{sk}}_{jk}.
\]
When investigating if L is elliptic, it suffices to look at $a^{\mathrm{sy}}_{jk}$.

Lemma
\[
\sum_{j=1}^d\sum_{k=1}^d a_{jk}(x)\xi_j\xi_k = \sum_{j=1}^d\sum_{k=1}^d a^{\mathrm{sy}}_{jk}(x)\xi_j\xi_k.
\]
Proof
\[
\sum_{j=1}^d\sum_{k=1}^d a^{\mathrm{sk}}_{jk}(x)\xi_j\xi_k
= \sum_{j=1}^d\sum_{k=1}^d \bigl(-a^{\mathrm{sk}}_{kj}(x)\bigr)\xi_j\xi_k
= -\sum_{k=1}^d\sum_{j=1}^d a^{\mathrm{sk}}_{kj}(x)\xi_j\xi_k
= -\sum_{j=1}^d\sum_{k=1}^d a^{\mathrm{sk}}_{jk}(x)\xi_k\xi_j
= -\sum_{j=1}^d\sum_{k=1}^d a^{\mathrm{sk}}_{jk}(x)\xi_j\xi_k,
\]
so
\[
2\sum_{j=1}^d\sum_{k=1}^d a^{\mathrm{sk}}_{jk}(x)\xi_j\xi_k = 0.
\]
Theorem
Denote the eigenvalues of the real symmetric matrix $[a^{\mathrm{sy}}_{jk}(x)]$ by $\lambda_j(x)$ for $1 \le j \le d$. The operator (47) is elliptic on Ω if and only if there exists a positive constant c such that
\[
\lambda_j(x) \ge c \quad\text{for } 1 \le j \le d \text{ and all } x \in \Omega.
\]
Example
Show that $L = -(3\partial_1^2 + 2\partial_1\partial_2 + 2\partial_2^2)$ is elliptic.
Example
Show that $L = -(\partial_1^2 - 4\partial_1\partial_2 + \partial_2^2)$ is not elliptic.
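For constant coefficients the theorem reduces to a finite eigenvalue computation on the symmetric part of A. A sketch checking the two examples above (the coefficient matrices are read off from the operators, with each cross term split symmetrically):

```python
import numpy as np

def is_elliptic(A, tol=1e-12):
    """Return the eigenvalues of the symmetric part of A and whether
    the smallest one is positive (i.e. the operator is elliptic)."""
    A = np.asarray(A, dtype=float)
    sym = 0.5 * (A + A.T)
    lam = np.linalg.eigvalsh(sym)
    return lam, bool(lam.min() > tol)

# L = -(3 d1^2 + 2 d1 d2 + 2 d2^2): cross term splits as a12 = a21 = 1
print(is_elliptic([[3, 1], [1, 2]]))    # eigenvalues (5 +- sqrt 5)/2 > 0: elliptic
# L = -(d1^2 - 4 d1 d2 + d2^2): a12 = a21 = -2
print(is_elliptic([[1, -2], [-2, 1]]))  # eigenvalues 3 and -1: not elliptic
```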
Proof
Fix x; then the real symmetric matrix $A = [a^{\mathrm{sy}}_{jk}]$ is diagonalisable, that is, $\mathbb{R}^d$ has an orthonormal basis of eigenvectors $v_1, \dots, v_d$, and so
\[
Q^TAQ = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_d),
\]
where the matrix $Q = [v_1\ v_2\ \cdots\ v_d]$ is orthogonal: $Q^T = Q^{-1}$. Put $\eta = Q^T\xi$; then $\xi = Q\eta$, so
\[
\xi^TA\xi = (Q\eta)^TA(Q\eta) = \eta^T(Q^TAQ)\eta = \sum_{j=1}^d \lambda_j\eta_j^2,
\]
and $\|\xi\|^2 = (Q\eta)^T(Q\eta) = \eta^T(Q^TQ)\eta = \eta^TI\eta = \|\eta\|^2$, so
\[
\xi^TA\xi \ge c\|\xi\|^2 \iff \sum_{j=1}^d \lambda_j\eta_j^2 \ge c\|\eta\|^2.
\]
Three important PDEs
In the remainder of this course, we mainly focus on three PDEs.
1. Poisson equation (elliptic):
\[
-\nabla^2u = f.
\]
2. Diffusion equation or heat equation (parabolic):
\[
\frac{\partial u}{\partial t} - \nabla^2u = f.
\]
3. Wave equation (hyperbolic):
\[
\frac{\partial^2u}{\partial t^2} - \nabla^2u = f.
\]
Here, $u = u(x,t)$ for $t > 0$ and $x \in \Omega$, where Ω is a bounded, open subset of $\mathbb{R}^d$ with a piecewise smooth boundary.
Green identities and boundary-value problems

We saw how the Lagrange identity leads to the concept of


self-adjointness, and allowed us to understand the solvability of
boundary-value problems in 1D. In higher dimensions, the same
role is played by the second Green identity, which we use to prove
(the easy half of) the Fredholm alternative.
Inner products
We denote the (real) inner product in $L_2(\Omega)$ by
\[
\langle f, g\rangle = \langle f, g\rangle_\Omega = \int_\Omega fg\,dV =
\begin{cases}
\iint_\Omega fg\,dx\,dy, & d = 2,\\
\iiint_\Omega fg\,dx\,dy\,dz, & d = 3.
\end{cases}
\]
Sometimes we will require a weighted inner product
\[
\langle f, g\rangle_r = \langle f, g\rangle_{r,\Omega} = \int_\Omega fgr\,dV.
\]
We will also require an inner product on a part of the boundary, $\Gamma \subseteq \partial\Omega$:
\[
\langle f, g\rangle_\Gamma = \int_\Gamma fg\,dS,
\]
or a weighted version
\[
\langle f, g\rangle_{r,\Gamma} = \int_\Gamma fgr\,dS.
\]
A class of second-order elliptic operators
To generalise the Sturm–Liouville theory, we consider a second-order, linear partial differential operator of the form
\[
Lu = -\nabla\cdot(p\nabla u) + qu, \tag{49}
\]
with real-valued coefficients $p(x)$ and $q(x)$.
Example
If $d = 1$ then $\nabla = d/dx$, so $Lu = -(pu')' + qu$.
Example
If $d = 2$ then
\[
Lu = -(pu_x)_x - (pu_y)_y + qu
= -\frac{\partial}{\partial x}\Bigl(p(x,y)\frac{\partial u}{\partial x}\Bigr) - \frac{\partial}{\partial y}\Bigl(p(x,y)\frac{\partial u}{\partial y}\Bigr) + qu.
\]
Ellipticity
The principal part of L is in divergence form $-\nabla\cdot(A\nabla u)$ with $A = [p(x)\delta_{jk}] = p(x)I$, so
\[
\xi^TA\xi = p(x)\|\xi\|^2,
\]
and therefore L is elliptic on Ω iff
\[
p(x) \ge c > 0 \quad\text{for all } x \in \Omega.
\]
Example
If $p(x) = 1$ and $q(x) = 0$ then $L = -\nabla^2$.
Divergence theorem (Math2111)
Notation: $\overline\Omega = \Omega \cup \partial\Omega$ is the closure of Ω. Since Ω is bounded, the set $\overline\Omega$ is compact and hence, by the Extreme Value Theorem, any continuous function $f : \overline\Omega \to \mathbb{R}$ must be bounded.
Theorem
If a vector field $F : \overline\Omega \to \mathbb{R}^d$ is continuously differentiable then
\[
\int_\Omega \nabla\cdot F\,dV = \oint_{\partial\Omega} F\cdot n\,dS,
\]
where n is the outward unit normal to Ω.
Notation: $\partial u/\partial n = n\cdot\nabla u$.
First Green identity
Theorem
If u is $C^2$ and v is $C^1$ on $\overline\Omega$, then
\[
\langle Lu, v\rangle_\Omega = \int_\Omega \bigl(p\nabla u\cdot\nabla v + quv\bigr)\,dV - \oint_{\partial\Omega} p\frac{\partial u}{\partial n}v\,dS. \tag{50}
\]
Proof.
Since
\[
\nabla\cdot[(p\nabla u)v] = [\nabla\cdot(p\nabla u)]v + p\nabla u\cdot\nabla v,
\]
we have
\[
(Lu)v = p\nabla u\cdot\nabla v + quv - \nabla\cdot[(p\nabla u)v],
\]
and the result follows after applying the divergence theorem to $F = (p\nabla u)v$, because $F\cdot n = p(\partial u/\partial n)v$.
Second Green identity
Theorem
If u and v are $C^2$ on $\overline\Omega$, then
\[
\langle Lu, v\rangle_\Omega - \langle u, Lv\rangle_\Omega = \oint_{\partial\Omega} p\Bigl(u\frac{\partial v}{\partial n} - v\frac{\partial u}{\partial n}\Bigr)\,dS. \tag{51}
\]
Proof.
Since
\[
(Lu)v = p\nabla u\cdot\nabla v + quv - \nabla\cdot[(p\nabla u)v]
\]
and
\[
u\,Lv = p\nabla v\cdot\nabla u + qvu - \nabla\cdot[(p\nabla v)u],
\]
the Lagrange identity generalises to
\[
(Lu)v - u\,Lv = \nabla\cdot\bigl[p(u\nabla v - v\nabla u)\bigr],
\]
and the result follows after applying the divergence theorem.
Boundary conditions and self-adjointness

Suppose that ∂Ω is divided into two parts, ΓD and ΓN , where the D


and N stand for Dirichlet and Neumann boundary conditions.

The second Green identity (51) shows that
\[
\langle Lu, v\rangle_\Omega = \langle u, Lv\rangle_\Omega \quad\text{for all } u, v \in D, \tag{52}
\]
if we choose D, the domain of the partial differential operator L, to be the space of $C^2$ functions on $\overline\Omega$ satisfying the homogeneous boundary conditions
\[
u = 0 \text{ on } \Gamma_D, \qquad \frac{\partial u}{\partial n} = 0 \text{ on } \Gamma_N.
\]
Elliptic boundary value problem
Recall $Lu = -\nabla\cdot(p\nabla u) + qu$ and assume
\[
p(x) \ge c > 0 \quad\text{for all } x \in \Omega,
\]
so that L is elliptic.
Given a source term f, Dirichlet data $g_D$ and Neumann data $g_N$, we seek a solution u to the BVP
\[
Lu = f \text{ in } \Omega, \qquad u = g_D \text{ on } \Gamma_D, \qquad p\frac{\partial u}{\partial n} = g_N \text{ on } \Gamma_N. \tag{53}
\]
Technical requirement: for u to count as a solution it must have finite energy:
\[
\int_\Omega \bigl(|\nabla u|^2 + |u|^2\bigr)\,dV < \infty.
\]
Fredholm alternative
Either the homogeneous problem
\[
Lv = 0 \text{ in } \Omega, \qquad v = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial v}{\partial n} = 0 \text{ on } \Gamma_N, \tag{54}
\]
has only the trivial solution $v \equiv 0$, in which case the inhomogeneous problem (53) has a unique solution u for "every" choice of f, $g_D$ and $g_N$;
or else the homogeneous problem admits finitely many linearly independent solutions $v_1, v_2, \dots, v_m$, in which case the inhomogeneous problem (53) has a solution u iff the data f, $g_D$ and $g_N$ satisfy
\[
\int_\Omega fv_k\,dV = \int_{\Gamma_D} g_D\,p\frac{\partial v_k}{\partial n}\,dS - \int_{\Gamma_N} g_Nv_k\,dS
\]
for $k = 1, 2, \dots, m$.
Partial proof
Write the second Green identity (51) as
\[
\langle Lu, v\rangle_\Omega - \langle u, Lv\rangle_\Omega = \oint_{\partial\Omega}\Bigl(u\,p\frac{\partial v}{\partial n} - p\frac{\partial u}{\partial n}v\Bigr)\,dS.
\]
If u satisfies (53) and v satisfies (54) then, since $\partial\Omega = \Gamma_D \cup \Gamma_N$,
\[
\langle f, v\rangle_\Omega - \langle u, 0\rangle_\Omega
= \int_{\Gamma_D}\Bigl(g_D\,p\frac{\partial v}{\partial n} - p\frac{\partial u}{\partial n}\times 0\Bigr)\,dS
+ \int_{\Gamma_N}\bigl(u\times 0 - g_Nv\bigr)\,dS,
\]
and so
\[
\int_\Omega fv\,dV = \int_{\Gamma_D} g_D\,p\frac{\partial v}{\partial n}\,dS - \int_{\Gamma_N} g_Nv\,dS.
\]
A pure Neumann problem
Consider the BVP
\[
-\nabla^2u = f \text{ in } \Omega, \qquad \frac{\partial u}{\partial n} = g \text{ on } \partial\Omega. \tag{55}
\]
If v is a solution of the homogeneous problem,
\[
-\nabla^2v = 0 \text{ in } \Omega, \qquad \frac{\partial v}{\partial n} = 0 \text{ on } \partial\Omega,
\]
then by the first Green identity,
\[
0 = \int_\Omega (-\nabla^2v)\,v\,dV = \int_\Omega \|\nabla v\|^2\,dV - \oint_{\partial\Omega}\frac{\partial v}{\partial n}\,v\,dS = \int_\Omega \|\nabla v\|^2\,dV,
\]
so $\int_\Omega \|\nabla v\|^2\,dV = 0$. If Ω is connected then v must be constant, so the inhomogeneous problem (55) is solvable iff the data satisfy
\[
\int_\Omega f\,dV + \oint_{\partial\Omega} g\,dS = 0.
\]
Elliptic eigenproblems

We now study second-order, self-adjoint elliptic eigenproblems.


After looking at a few simple, concrete examples that can be
solved by separation of variables, we show some key properties of
the eigenfunctions that generalise earlier results for Sturm–Liouville
problems in 1D. We then see how eigenfunctions can be used to
construct a series solution to a boundary-value problem.
By analogy with the Sturm–Liouville problem in 1D, we now consider the PDE
\[
\nabla\cdot(p\nabla u) + (\lambda r - q)u = 0 \quad\text{in } \Omega. \tag{56}
\]
Here, the coefficients p, q, r are given real-valued, continuous functions on $\overline\Omega$, with
\[
p(x) \ge \text{const} > 0 \quad\text{and}\quad r(x) \ge 0 \quad\text{for } x \in \Omega,
\]
with $r(x)$ not identically zero.
Writing
\[
Lu = -\nabla\cdot(p\nabla u) + qu
\]
as before, the PDE (56) can be written in the form
\[
Lu = \lambda ru \quad\text{in } \Omega.
\]
Boundary conditions

Again suppose that ∂Ω is divided into two parts, ΓD and ΓN , where


the D and N stand for Dirichlet and Neumann boundary conditions.

If $u : \overline\Omega \to \mathbb{C}$ is not identically zero and if
\[
Lu = \lambda ru \text{ in } \Omega, \qquad u = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial u}{\partial n} = 0 \text{ on } \Gamma_N, \tag{57}
\]
then u is said to be an eigenfunction with eigenvalue λ, and we call $(u, \lambda)$ an eigenpair.

The results shown earlier for Sturm–Liouville problems and Fourier


series in 1D generalise in a natural way to this 2D or 3D setting.
Example: eigenproblem on the unit square
The eigenvalues $\lambda_{kn}$ and eigenfunctions $u = \phi_{kn}(x,y)$ for
\[
\nabla^2u + \lambda u = 0, \quad 0 < x < 1,\ 0 < y < 1,
\]
with boundary conditions
\[
u(x,0) = u(x,1) = 0, \quad 0 < x < 1,
\]
and
\[
\frac{\partial u}{\partial x}(0,y) = \frac{\partial u}{\partial x}(1,y) = 0, \quad 0 < y < 1,
\]
are given by
\[
\lambda_{kn} = (k^2 + n^2)\pi^2 \quad\text{and}\quad \phi_{kn}(x,y) = \cos(k\pi x)\sin(n\pi y)
\]
for $k = 0, 1, 2, \dots$ and $n = 1, 2, 3, \dots$.
Surface and contour plots of $\phi_{kn}(x,y) = \cos(k\pi x)\sin(n\pi y)$ (figures omitted).
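The spectrum is easy to enumerate by brute force; a sketch listing the smallest eigenvalues in increasing order:

```python
import numpy as np

# lambda_{kn} = (k^2 + n^2) pi^2 for k >= 0, n >= 1
pairs = [(k, n) for k in range(6) for n in range(1, 7)]
eigs = sorted(((k*k + n*n) * np.pi**2, k, n) for k, n in pairs)
for lam, k, n in eigs[:6]:
    print(f"lambda_{{{k}{n}}} = {lam:.4f}")
# smallest: pi^2, 2 pi^2, 4 pi^2, then 5 pi^2 twice (a double eigenvalue)
```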
Example: eigenproblem on the unit disk
Introduce polar coordinates in $\mathbb{R}^2$,
\[
x = r\cos\theta \quad\text{and}\quad y = r\sin\theta,
\]
and recall that
\[
\nabla^2u = \frac{1}{r}\frac{\partial}{\partial r}\Bigl(r\frac{\partial u}{\partial r}\Bigr) + \frac{1}{r^2}\frac{\partial^2u}{\partial\theta^2}.
\]
Consider
\[
\nabla^2u + \lambda u = 0, \quad 0 \le r < 1,\ -\pi < \theta \le \pi,
\]
with the boundary condition
\[
u(1,\theta) = 0, \quad -\pi \le \theta < \pi.
\]
The eigenvalues $\lambda = \lambda_{nj}$ are
\[
\lambda_{nj} = k_{nj}^2 \quad\text{where}\quad J_n(k_{nj}) = 0,
\]
for $n = 0, 1, 2, \dots$ and $j = 1, 2, 3, \dots$.
The corresponding eigenfunctions $u = \phi_{nj}$ are
\[
\phi_{0j}(r,\theta) = J_0(k_{0j}r),
\]
\[
\phi^C_{nj}(r,\theta) = J_n(k_{nj}r)\cos(n\theta), \quad n \ge 1,
\]
\[
\phi^S_{nj}(r,\theta) = J_n(k_{nj}r)\sin(n\theta), \quad n \ge 1.
\]
Surface and contour plots of $\phi^C_{nj} = J_n(k_{nj}r)\cos(n\theta)$ (figures omitted).
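Since $\lambda_{nj} = k_{nj}^2$ with $J_n(k_{nj}) = 0$, scipy's tabulated Bessel zeros give the disk spectrum directly; a sketch (the ranges of n and j are arbitrary):

```python
from scipy.special import jn_zeros

# lambda_{nj} = k_{nj}^2 where k_{nj} is the j-th positive zero of J_n
eigs = []
for n in range(4):
    for j, k in enumerate(jn_zeros(n, 4), start=1):
        eigs.append((k**2, n, j))
for lam, n, j in sorted(eigs)[:6]:
    mult = 1 if n == 0 else 2      # cos and sin eigenfunctions for n >= 1
    print(f"lambda_{{{n}{j}}} = {lam:.4f}  (multiplicity {mult})")
```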
Eigenproblem on a more complicated domain (Math5295)
Let
\[
\Omega_1 = \{(x,y) : x^2 + y^2 < 3^2\}, \qquad \Omega_2 = (-1,1)^2, \qquad \Omega = \Omega_1 \setminus \Omega_2,
\]
and consider the eigenproblem
\[
-\nabla^2u = \lambda u \text{ in } \Omega, \qquad u = 0 \text{ on } \Gamma_D, \qquad \frac{\partial u}{\partial n} = 0 \text{ on } \Gamma_N,
\]
where $\Gamma_D$ denotes the inner (square) boundary and $\Gamma_N$ the outer (circular) boundary. Compute approximate solutions using a finite element method with 254,336 degrees of freedom.
Computed eigenvalues: $\lambda_1 = 0.505$, $\lambda_2 = \lambda_3 = 0.680$, $\lambda_4 = 1.148$, $\lambda_5 = 1.251$ (eigenfunction plots omitted).
Eigenfunctions are orthogonal
Theorem
Let L be the partial differential operator (49). If u, v are (possibly complex-valued) functions satisfying
\[
Lu = \lambda ru, \quad Lv = \mu rv \quad\text{in } \Omega,
\]
\[
u = 0, \quad v = 0 \quad\text{on } \Gamma_D,
\]
\[
p\frac{\partial u}{\partial n} = 0, \quad p\frac{\partial v}{\partial n} = 0 \quad\text{on } \Gamma_N,
\]
and if $\lambda \neq \mu$, then u and v are orthogonal on Ω with respect to the weight function r, i.e.,
\[
\langle u, v\rangle_r = \int_\Omega uvr\,dV = 0.
\]
Eigenvalues are real

Theorem
If u is a nontrivial solution of the elliptic eigenproblem (57) then λ
is real. Moreover, both Re u and Im u are also solutions of (57).

Thus, it is always possible to choose purely real-valued


eigenfunctions for L.

The proofs proceed exactly as in the 1D case.


Proofs
The self-adjointness (52) of L implies that
\[
(\lambda - \mu)\langle u, v\rangle_r = \langle\lambda ru, v\rangle - \langle u, \mu rv\rangle = \langle Lu, v\rangle - \langle u, Lv\rangle = 0,
\]
so if $\lambda \neq \mu$ then $\langle u, v\rangle_r = 0$.
The coefficients p, q, r are real-valued, so we easily see that $(\bar u, \bar\lambda)$ is an eigenpair and therefore
\[
(\lambda - \bar\lambda)\langle u, \bar u\rangle_r = 0.
\]
Since $\langle u, \bar u\rangle_r = \int_\Omega |u|^2r\,dV > 0$, we conclude that $\lambda = \bar\lambda$, or in other words λ is real.
Finally, because u and $\bar u$ are eigenfunctions with eigenvalue λ, so are
\[
\operatorname{Re} u = \frac{u + \bar u}{2} \quad\text{and}\quad \operatorname{Im} u = \frac{u - \bar u}{2i}.
\]
Completeness of the eigenfunctions

If several eigenfunctions share the same eigenvalue then they are


not necessarily orthogonal. In this case, however, we can always
orthogonalise them using the Gram–Schmidt procedure
(Math2601).

In general, the elliptic eigenproblem (57) has a sequence of eigenvalues $\lambda = \lambda_j$ and corresponding eigenfunctions $u = \phi_j$ such that
1. the eigenvalues satisfy $\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots$ with $\lambda_j \to \infty$ as $j \to \infty$;
2. the eigenfunctions $\phi_1, \phi_2, \phi_3, \dots$ form a complete orthogonal system on Ω with respect to the weight function r.
Steklov eigenproblem
As before, let
\[
Lu = -\nabla\cdot(p\nabla u) + qu,
\]
but suppose now that $\partial\Omega$ is divided into three parts: Γ, $\Gamma_D$ and $\Gamma_N$. If $u : \overline\Omega \to \mathbb{C}$ is not identically zero and if
\[
Lu = 0 \text{ in } \Omega, \qquad
p\frac{\partial u}{\partial n} = \lambda ru \text{ on } \Gamma, \qquad
u = 0 \text{ on } \Gamma_D, \qquad
p\frac{\partial u}{\partial n} = 0 \text{ on } \Gamma_N, \tag{58}
\]
then we say that $(u, \lambda)$ is a Steklov eigenpair.
Example: a Steklov eigenproblem for the unit square
The eigenvalues $\lambda_n$ and eigenfunctions $u = \phi_n(x,y)$ for
\[
\nabla^2u = 0, \quad 0 < x < 1,\ 0 < y < 1,
\]
\[
\frac{\partial u}{\partial y}(x,1) = \lambda u(x,1), \quad 0 < x < 1,
\]
\[
u(0,y) = u(1,y) = 0, \quad 0 < y < 1,
\]
\[
\frac{\partial u}{\partial y}(x,0) = 0, \quad 0 < x < 1,
\]
are
\[
\lambda_n = n\pi\tanh(n\pi) \quad\text{and}\quad \phi_n(x,y) = \sin(n\pi x)\cosh(n\pi y)
\]
for $n = 1, 2, \dots$.
Surface and contour plots of $\phi_n(x,y) = \sin(n\pi x)\cosh(n\pi y)$ (figures omitted).
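Because $\tanh(n\pi) \to 1$ very rapidly, these Steklov eigenvalues are almost exactly $n\pi$; a quick numerical check:

```python
import numpy as np

n = np.arange(1, 6)
lam = n * np.pi * np.tanh(n * np.pi)   # lambda_n = n pi tanh(n pi)
print(lam)   # ~ [3.13, 6.28, 9.42, ...], i.e. lambda_n ~ n pi already for small n
```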
Steklov eigenvalues are real
The first Green identity with $v = \bar u$ gives
\[
\langle Lu, \bar u\rangle_\Omega = \int_\Omega\bigl(p|\nabla u|^2 + q|u|^2\bigr)\,dV - \oint_{\partial\Omega} p\frac{\partial u}{\partial n}\bar u\,dS.
\]
If u satisfies (58), then $Lu = 0$ in Ω, so
\[
\int_\Omega\bigl(p|\nabla u|^2 + q|u|^2\bigr)\,dV = \oint_{\partial\Omega} p\frac{\partial u}{\partial n}\bar u\,dS,
\]
and since $\partial\Omega = \Gamma\cup\Gamma_D\cup\Gamma_N$ the boundary conditions imply that
\[
\oint_{\partial\Omega} p\frac{\partial u}{\partial n}\bar u\,dS
= \int_\Gamma (\lambda ru)\bar u\,dS + \int_{\Gamma_D} p\frac{\partial u}{\partial n}\cdot\bar 0\,dS + \int_{\Gamma_N} 0\cdot\bar u\,dS.
\]
We conclude that λ is real, because
\[
\int_\Omega\bigl(p|\nabla u|^2 + q|u|^2\bigr)\,dV = \lambda\int_\Gamma r|u|^2\,dS.
\]
Steklov eigenfunctions are orthogonal over Γ
If u and v are eigenfunctions with eigenvalues λ and µ, then by the second Green identity,
\[
(\lambda - \mu)\int_\Gamma uvr\,dS = \int_\Gamma(\lambda ru)v\,dS - \int_\Gamma u(\mu rv)\,dS
= \int_\Gamma p\Bigl(\frac{\partial u}{\partial n}v - u\frac{\partial v}{\partial n}\Bigr)\,dS
\]
\[
= \int_{\partial\Omega} p\Bigl(\frac{\partial u}{\partial n}v - u\frac{\partial v}{\partial n}\Bigr)\,dS
= \langle u, Lv\rangle_\Omega - \langle Lu, v\rangle_\Omega
= \langle u, 0\rangle_\Omega - \langle 0, v\rangle_\Omega = 0,
\]
so
\[
\langle u, v\rangle_{r,\Gamma} = \int_\Gamma uvr\,dS = 0 \quad\text{if } \lambda \neq \mu.
\]
Completeness?
The second type of eigenproblem (58) has a sequence of eigenvalues $\lambda = \lambda_j$ and corresponding eigenfunctions $u = \phi_j$ such that
1. the eigenfunctions $\phi_1, \phi_2, \phi_3, \dots$ are orthogonal with respect to r on Γ;
2. the eigenvalues satisfy $\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots$ with $\lambda_j \to \infty$ as $j \to \infty$.
The eigenfunctions are complete in $L_2(\Gamma, r)$ iff the homogeneous BVP
\[
Lv = 0 \text{ in } \Omega, \qquad v = 0 \text{ on } \Gamma\cup\Gamma_D, \qquad p\frac{\partial v}{\partial n} = 0 \text{ on } \Gamma_N, \tag{59}
\]
has only the trivial solution $v \equiv 0$ in Ω.
Exceptional case
If (59) has a nontrivial solution v, then by the second Green identity
\[
\int_\Gamma \phi_j\,p\frac{\partial v}{\partial n}\,dS = \oint_{\partial\Omega} p\Bigl(\phi_j\frac{\partial v}{\partial n} - v\frac{\partial\phi_j}{\partial n}\Bigr)\,dS
= \langle L\phi_j, v\rangle_\Omega - \langle\phi_j, Lv\rangle_\Omega,
\]
and so, since $L\phi_j = 0 = Lv$ in Ω,
\[
\int_\Gamma \phi_j\,p\frac{\partial v}{\partial n}\,dS = 0.
\]
Thus, the function
\[
w = \frac{p}{r}\frac{\partial v}{\partial n}
\]
satisfies
\[
\langle\phi_j, w\rangle_{r,\Gamma} = 0 \quad\text{for all } j.
\]
Boundary-value problem with gD = 0 and gN = 0
Suppose that we have homogeneous boundary data:
\[
Lu = f \text{ in } \Omega, \qquad u = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial u}{\partial n} = 0 \text{ on } \Gamma_N.
\]
To construct u, find the eigenpairs $(\phi_j, \lambda_j)$ satisfying
\[
L\phi_j = \lambda_jr\phi_j \text{ in } \Omega, \qquad \phi_j = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial\phi_j}{\partial n} = 0 \text{ on } \Gamma_N.
\]
Then u exists iff
\[
\langle f, \phi_j\rangle_\Omega = 0 \quad\text{whenever } \lambda_j = 0,
\]
in which case (with arbitrary constants $C_j$)
\[
u(x) \sim \sum_{\lambda_j = 0} C_j\phi_j(x) + \sum_{\lambda_j \neq 0}\frac{\langle f, \phi_j\rangle_\Omega}{\lambda_j\|\phi_j\|^2_{r,\Omega}}\phi_j(x), \quad x \in \Omega.
\]
Boundary-value problem with f = 0 and gN = 0
Consider
\[
Lu = 0 \text{ in } \Omega, \qquad u = g_D \text{ on } \Gamma_D, \qquad p\frac{\partial u}{\partial n} = 0 \text{ on } \Gamma_N.
\]
Find the Steklov eigenpairs $(\phi_j, \lambda_j)$ satisfying
\[
L\phi_j = 0 \text{ in } \Omega, \qquad p\frac{\partial\phi_j}{\partial n} = \lambda_jr\phi_j \text{ on } \Gamma_D, \qquad p\frac{\partial\phi_j}{\partial n} = 0 \text{ on } \Gamma_N.
\]
If the homogeneous problem
\[
Lv = 0 \text{ in } \Omega, \qquad v = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial v}{\partial n} = 0 \text{ on } \Gamma_N,
\]
has only the trivial solution $v \equiv 0$, then
\[
u(x) \sim \sum_{j=1}^\infty \frac{\langle g_D, \phi_j\rangle_{r,\Gamma_D}}{\|\phi_j\|^2_{r,\Gamma_D}}\phi_j(x), \quad x \in \Omega.
\]
Boundary-value problem with f = 0 and gD = 0
Consider
\[
Lu = 0 \text{ in } \Omega, \qquad u = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial u}{\partial n} = g_N \text{ on } \Gamma_N.
\]
Find the Steklov eigenpairs $(\phi_j, \lambda_j)$ satisfying
\[
L\phi_j = 0 \text{ in } \Omega, \qquad \phi_j = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial\phi_j}{\partial n} = \lambda_jr\phi_j \text{ on } \Gamma_N.
\]
Then u exists iff
\[
\langle g_N, \phi_j\rangle_{\Gamma_N} = 0 \quad\text{whenever } \lambda_j = 0,
\]
in which case
\[
u(x) \sim \sum_{\lambda_j = 0} C_j\phi_j(x) + \sum_{\lambda_j \neq 0}\frac{\langle g_N, \phi_j\rangle_{\Gamma_N}}{\lambda_j\|\phi_j\|^2_{r,\Gamma_N}}\phi_j(x), \quad x \in \Omega.
\]
Example
Consider the following boundary-value problem on the unit square:
\[
\nabla^2u = 0, \quad 0 < x < 1,\ 0 < y < 1,
\]
\[
\frac{\partial u}{\partial y}(x,1) = 1, \quad 0 < x < 1,
\]
\[
u(0,y) = u(1,y) = 0, \quad 0 < y < 1,
\]
\[
\frac{\partial u}{\partial y}(x,0) = 0, \quad 0 < x < 1.
\]
Use the associated Steklov eigenproblem (solved previously) to show that
\[
u(x,y) \sim \frac{4}{\pi^2}\sum_{k=1}^\infty \frac{1}{(2k-1)^2}\,\sin\bigl((2k-1)\pi x\bigr)\,\frac{\cosh\bigl((2k-1)\pi y\bigr)}{\sinh\bigl((2k-1)\pi\bigr)}.
\]
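A truncation of this series is straightforward to evaluate; the sketch below sums it with the hyperbolic ratio rewritten in terms of decaying exponentials (to avoid overflow for large k), and checks the Neumann data $\partial u/\partial y(x,1) \approx 1$ by a one-sided difference. The truncation level and test point are arbitrary choices.

```python
import numpy as np

def u(x, y, terms=200):
    """Partial sum of the series solution on the unit square."""
    k = np.arange(1, terms + 1)
    m = 2 * k - 1                          # odd integers 1, 3, 5, ...
    # cosh(m pi y)/sinh(m pi), computed stably via exponentials
    ratio = np.exp(m * np.pi * (y - 1)) * (1 + np.exp(-2 * m * np.pi * y)) \
            / (1 - np.exp(-2 * m * np.pi))
    return (4 / np.pi**2) * np.sum(np.sin(m * np.pi * x) / m**2 * ratio)

x, h = 0.3, 1e-5
dudy = (u(x, 1.0) - u(x, 1.0 - h)) / h     # one-sided difference at y = 1
print(u(x, 0.5), dudy)                     # dudy should be close to 1
```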
Boundary-value problem with general data
To handle the most general situation,
\[
Lu = f \text{ in } \Omega, \qquad u = g_D \text{ on } \Gamma_D, \qquad p\frac{\partial u}{\partial n} = g_N \text{ on } \Gamma_N,
\]
we write u as a sum of three terms,
\[
u = u_1 + u_2 + u_3,
\]
where each term solves one of the cases treated above:
\[
Lu_1 = f \text{ in } \Omega, \qquad u_1 = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial u_1}{\partial n} = 0 \text{ on } \Gamma_N,
\]
\[
Lu_2 = 0 \text{ in } \Omega, \qquad u_2 = g_D \text{ on } \Gamma_D, \qquad p\frac{\partial u_2}{\partial n} = 0 \text{ on } \Gamma_N,
\]
\[
Lu_3 = 0 \text{ in } \Omega, \qquad u_3 = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial u_3}{\partial n} = g_N \text{ on } \Gamma_N.
\]
An alternative approach
If we can find $w : \Omega\cup\partial\Omega \to \mathbb{R}$ satisfying just the boundary conditions in (53),
\[
w = g_D \text{ on } \Gamma_D \quad\text{and}\quad p\frac{\partial w}{\partial n} = g_N \text{ on } \Gamma_N,
\]
then we can solve (53) by putting $f^\star = f - Lw$, solving
\[
Lu^\star = f^\star \text{ in } \Omega, \qquad u^\star = 0 \text{ on } \Gamma_D, \qquad p\frac{\partial u^\star}{\partial n} = 0 \text{ on } \Gamma_N,
\]
and then putting $u = u^\star + w$.
This will often be easier than finding three sets of eigensystems.
Example
Consider the PDE on the unit square,
\[
-\nabla^2u = x(1-x) \quad\text{for } 0 < x < 1 \text{ and } 0 < y < 1,
\]
subject to the boundary conditions
\[
u(x,0) = 1 \quad\text{and}\quad u(x,1) = 2 \quad\text{for } 0 < x < 1,
\]
plus
\[
\frac{\partial u}{\partial x}(0,y) = 0 = \frac{\partial u}{\partial x}(1,y) \quad\text{for } 0 < y < 1.
\]
Use a previously solved eigenproblem to show that
\[
u(x,y) \sim y + 1 + \frac{2}{3\pi^3}\sum_{m=1}^\infty \frac{\sin\bigl((2m-1)\pi y\bigr)}{(2m-1)^3}
- \frac{4}{\pi^5}\sum_{j=1}^\infty\sum_{m=1}^\infty \frac{\cos(2j\pi x)\sin\bigl((2m-1)\pi y\bigr)}{j^2(2m-1)\bigl[4j^2 + (2m-1)^2\bigr]}.
\]
Wave and diffusion equations

We now consider two time-dependent partial differential equations.


Both are solved in the same way as their 1D versions. The only
change is that we must use 2D eigenfunctions.
Vibrating membrane

Consider a stretched membrane in the shape of a planar region Ω.


Assuming that the membrane is fixed along ∂Ω, has initial
displacement u0 (x) and is initially at rest, we have the following
initial boundary value problem for the transverse
displacement $u = u(x,t)$:
\[
\frac{\partial^2u}{\partial t^2} - c^2\nabla^2u = 0 \text{ in } \Omega, \text{ for } t > 0,
\]
\[
u = u_0 \text{ and } \frac{\partial u}{\partial t} = 0 \text{ in } \Omega, \text{ when } t = 0, \tag{60}
\]
\[
u = 0 \text{ on } \partial\Omega, \text{ for } t > 0.
\]

This problem is a 2D version of the vibrating string problem. The constant $c^2$ depends on how tightly the membrane is stretched and on the density of the membrane.
Reduction to a sequence of ODEs
Let $\lambda_1, \lambda_2, \dots$ and $\phi_1, \phi_2, \dots$ be the eigenvalues and eigenfunctions of the Laplacian,
\[
-\nabla^2\phi_j = \lambda_j\phi_j \text{ in } \Omega, \quad\text{with } \phi_j = 0 \text{ on } \partial\Omega.
\]
Then
\[
u(x,t) \sim \sum_{j=1}^\infty u_j(t)\phi_j(x), \qquad u_j(t) = \frac{\langle u(\cdot,t), \phi_j\rangle_\Omega}{\langle\phi_j, \phi_j\rangle_\Omega},
\]
and
\[
\frac{\partial^2u}{\partial t^2} - c^2\nabla^2u = \sum_{j=1}^\infty\Bigl(\frac{d^2u_j}{dt^2} + c^2\lambda_ju_j\Bigr)\phi_j(x).
\]
Thus, $u_j$ satisfies a second-order ODE,
\[
\frac{d^2u_j}{dt^2} + c^2\lambda_ju_j = 0, \qquad u_j(0) = u_{0j} = \frac{\langle u_0, \phi_j\rangle_\Omega}{\langle\phi_j, \phi_j\rangle_\Omega}, \qquad \frac{du_j}{dt}(0) = 0.
\]
Normal modes of vibration
Solving the IVP for $u_j$, we find that
\[
u_j(t) = u_{0j}\cos(\omega_jt) \quad\text{where}\quad \omega_j = c\sqrt{\lambda_j},
\]
so
\[
u(x,t) \sim \sum_{j=1}^\infty u_{0j}\cos(\omega_jt)\phi_j(x).
\]
The separable solution
\[
\cos(\omega_jt)\phi_j(x)
\]
is the jth normal mode of vibration. The period and frequency of this mode are
\[
\tau_j = \frac{2\pi}{\omega_j} \quad\text{and}\quad \frac{1}{\tau_j} = \frac{\omega_j}{2\pi}.
\]
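For a concrete membrane, take Ω to be the unit disk with fixed boundary, whose Dirichlet eigenvalues were found earlier from the Bessel zeros. A sketch with the arbitrary normalisation c = 1:

```python
import numpy as np
from scipy.special import jn_zeros

c = 1.0
# Dirichlet eigenvalues of the unit disk: lambda = k_{nj}^2, J_n(k_{nj}) = 0
ks = sorted((k, n) for n in range(3) for k in jn_zeros(n, 3))
for k, n in ks[:5]:
    omega = c * k                      # omega_j = c sqrt(lambda_j)
    print(f"omega = {omega:.4f}, frequency = {omega/(2*np.pi):.4f}")
# Unlike a vibrating string, the ratios omega_j/omega_1 are irrational:
# the overtones of a drum are not harmonics of the fundamental.
```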
Concentration and flux
Consider molecules of a solute moving in a stationary fluid, and let
u(x, t) = mass concentration of solute,
q(x, t) = flux density vector of solute,
f(x, t) = volume density of solute sources,
at position x and time t. Thus,
\[
\int_\Omega u\,dV = \text{mass of solute in region } \Omega, \qquad
\int_\Omega f\,dV = \text{rate at which solute is added within } \Omega,
\]
and, for an oriented surface Γ,
\[
\int_\Gamma q\cdot n\,dS = \text{mass flux through } \Gamma.
\]
Diffusion
In 1855, Adolf Fick postulated that
\[
q = -K\nabla u, \tag{61}
\]
where the diffusivity K depends on physical properties of the solute and the fluid medium.
In a fixed region Ω, conservation of mass requires that
\[
\frac{d}{dt}\int_\Omega u\,dV = \oint_{\partial\Omega} q\cdot(-n)\,dS + \int_\Omega f\,dV.
\]
On the LHS,
\[
\frac{d}{dt}\int_\Omega u\,dV = \int_\Omega\frac{\partial u}{\partial t}\,dV,
\]
and on the RHS the divergence theorem gives
\[
\oint_{\partial\Omega} q\cdot(-n)\,dS = -\int_\Omega\nabla\cdot q\,dV = \int_\Omega\nabla\cdot(K\nabla u)\,dV.
\]
Diffusion
Thus,
\[
\frac{1}{\operatorname{vol}\Omega}\int_\Omega\Bigl(\frac{\partial u}{\partial t} - \nabla\cdot(K\nabla u) - f\Bigr)\,dV = 0.
\]
The LHS tends to the pointwise value of the integrand if we shrink Ω to a point, so u satisfies the diffusion equation,
\[
\frac{\partial u}{\partial t} - \nabla\cdot(K\nabla u) = f(x,t). \tag{62}
\]
In a steady state, the concentration u does not depend on t and so
\[
-\nabla\cdot(K\nabla u) = f(x).
\]
If K is independent of x, then
\[
\nabla\cdot(K\nabla u) = K\nabla^2u.
\]
Heat equation in 3D
If we let
u(x, t) = temperature, K = thermal conductivity,
q(x, t) = heat flux vector, ρ = mass density,
f(x, t) = volume density of heat sources, σ = specific heat,
then since Fourier's law,
\[
q = -K\nabla u,
\]
has the same form as Fick's law (61), conservation of thermal energy leads to the heat equation,
\[
\rho\sigma\frac{\partial u}{\partial t} - \nabla\cdot(K\nabla u) = f(x,t),
\]
in the same way that conservation of mass leads to the diffusion equation (62).
An application: the cooling-off problem
A body Ω is initially at a temperature $u_0$, and then cools down as heat leaves through $\partial\Omega$. A (linearised) model of this process is
\[
\frac{\partial u}{\partial t} - \nabla^2u = 0 \text{ in } \Omega, \text{ for } t > 0,
\]
\[
\frac{\partial u}{\partial n} + bu = 0 \text{ on } \partial\Omega, \text{ for } t > 0, \tag{63}
\]
\[
u = u_0 \text{ on } \Omega, \text{ when } t = 0,
\]
where $b > 0$ is a constant. The associated eigenproblem takes the form
\[
-\nabla^2\phi = \lambda\phi \text{ in } \Omega, \qquad \frac{\partial\phi}{\partial n} + b\phi = 0 \text{ on } \partial\Omega. \tag{64}
\]
We have not previously discussed such a Robin boundary condition.
Eigenvalues are strictly positive
The first Green identity (with $v = \bar\phi$) gives
\[
\lambda\int_\Omega|\phi|^2\,dV = \int_\Omega\|\nabla\phi\|^2\,dV - \oint_{\partial\Omega}\frac{\partial\phi}{\partial n}\bar\phi\,dS
= \int_\Omega\|\nabla\phi\|^2\,dV + b\oint_{\partial\Omega}|\phi|^2\,dS,
\]
so $\lambda \ge 0$. Moreover, if $\lambda = 0$ then
\[
\nabla\phi = 0 \text{ in } \Omega \quad\text{and}\quad \phi = 0 \text{ on } \partial\Omega,
\]
implying that φ is constant on each component of Ω, and that each of the constant values is zero; that is, $\phi \equiv 0$. Hence, every eigenvalue is strictly positive.
Radially symmetric example
Suppose now that Ω is the unit ball $\rho < 1$, and that $u_0 = u_0(\rho)$ is radially symmetric. Recall (from Math2111) that $\nabla^2u$ equals
\[
\frac{1}{\rho^2}\frac{\partial}{\partial\rho}\Bigl(\rho^2\frac{\partial u}{\partial\rho}\Bigr)
+ \frac{1}{\rho^2\sin\varphi}\frac{\partial}{\partial\varphi}\Bigl(\sin\varphi\frac{\partial u}{\partial\varphi}\Bigr)
+ \frac{1}{\rho^2\sin^2\varphi}\frac{\partial^2u}{\partial\theta^2},
\]
but from symmetry, u is independent of the angular variables φ and θ, that is, $u = u(\rho, t)$. Thus, it suffices to consider radially symmetric eigenfunctions $\phi = \phi(\rho)$ satisfying
\[
-\frac{1}{\rho^2}\frac{d}{d\rho}\Bigl(\rho^2\frac{d\phi}{d\rho}\Bigr) = \lambda\phi \quad\text{for } 0 < \rho < 1.
\]
Putting $\lambda = k^2$ with $k > 0$, we conclude that φ satisfies a spherical Bessel equation,
\[
(\rho^2\phi')' + k^2\rho^2\phi = 0.
\]
A tutorial problem shows that the function $v(\rho) = \rho^{1/2}\phi(\rho)$ satisfies the parameterised Bessel equation of order 1/2,
\[
\rho^2v'' + \rho v' + (k^2\rho^2 - \tfrac14)v = 0,
\]
so $v = AJ_{1/2}(k\rho) + BY_{1/2}(k\rho)$ and thus
\[
\phi = \rho^{-1/2}\bigl(AJ_{1/2}(k\rho) + BY_{1/2}(k\rho)\bigr).
\]
Here, $B = 0$ because φ is bounded at $\rho = 0$, and k must be chosen so that
\[
\frac{d\phi}{d\rho} + b\phi = 0 \quad\text{at } \rho = 1.
\]
Recall the identity
\[
\frac{1}{z}\frac{d}{dz}\bigl(z^{-\nu}J_\nu(z)\bigr) = -z^{-\nu-1}J_{\nu+1}(z)
\]
and put $s = k\rho$; then
\[
\frac{d}{d\rho}\bigl(\rho^{-1/2}J_{1/2}(k\rho)\bigr) = k^{1/2}\frac{ds}{d\rho}\frac{d}{ds}\bigl(s^{-1/2}J_{1/2}(s)\bigr)
= -k^{3/2}s^{-1/2}J_{3/2}(s) = -k\rho^{-1/2}J_{3/2}(k\rho),
\]
so the equation for $k > 0$ is
\[
-kJ_{3/2}(k) + bJ_{1/2}(k) = 0,
\]
which has a sequence of solutions $0 < k_1 < k_2 < k_3 < \cdots$ tending to ∞, and corresponding eigenfunctions
\[
\phi_j(\rho) = \rho^{-1/2}J_{1/2}(k_j\rho).
\]


Solutions $k_j$ when $b = 1$, and plots of the eigenfunctions $\phi_j$ (figures omitted).
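The equation $-kJ_{3/2}(k) + bJ_{1/2}(k) = 0$ can be solved by the same scan-and-bracket approach used for (39). Since $J_{1/2}$ and $J_{3/2}$ are elementary functions, one can check that for $b = 1$ the equation reduces to $k\cos k = 0$, so $k_j = (j - \frac12)\pi$; the sketch below uses this as a sanity check (illustrative only):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import jv

b = 1.0
f = lambda k: b * jv(0.5, k) - k * jv(1.5, k)

grid = np.linspace(0.1, 20.0, 2000)
vals = f(grid)
roots = [brentq(f, lo, hi) for lo, hi, flo, fhi in
         zip(grid[:-1], grid[1:], vals[:-1], vals[1:]) if flo * fhi < 0]
print(np.round(roots, 6))
# For b = 1 the equation reduces to k cos(k) = 0, so the roots
# should equal (j - 1/2) pi:
print(np.round((np.arange(1, len(roots) + 1) - 0.5) * np.pi, 6))
```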
Orthogonality
Put
\[
\langle f, g\rangle = \int_0^1 f(\rho)g(\rho)\,\rho^2\,d\rho;
\]
then
\[
\langle\phi_n, \phi_j\rangle = 0 \quad\text{if } n \neq j,
\]
since $u = \phi_n$ and $\lambda = k_n^2$ satisfy $(\rho^2u')' + \lambda\rho^2u = 0$. Note that $dV = 4\pi\rho^2\,d\rho$, so if we view $\phi_n$ and $\phi_j$ as functions on Ω, then
\[
\int_\Omega\phi_n\phi_j\,dV = 4\pi\int_0^1\phi_n\phi_j\,\rho^2\,d\rho = 0 \quad\text{if } n \neq j.
\]
Exercise: prove this orthogonality property using the second Green identity and (64).
Fourier modes
The Fourier expansion
\[
u(\rho, t) \sim \sum_{j=1}^\infty u_j(t)\phi_j(\rho), \qquad u_j(t) = \frac{\langle u(\cdot,t), \phi_j\rangle}{\|\phi_j\|^2},
\]
gives
\[
u_t - \nabla^2u \sim \sum_{j=1}^\infty\bigl(\dot u_j + \lambda_ju_j\bigr)\phi_j(\rho),
\]
so we require that
\[
\dot u_j + \lambda_ju_j = 0 \quad\text{for } t > 0, \quad\text{with } u_j(0) = u_{0j}.
\]
Thus,
\[
u_j(t) = u_{0j}e^{-\lambda_jt}.
\]
The series
\[
u(\rho, t) \sim \sum_{j=1}^\infty u_{0j}e^{-\lambda_jt}\phi_j(\rho), \quad 0 < \rho < 1,\ t > 0,
\]
satisfies the PDE and boundary condition. The initial condition gives
\[
u_0(\rho) = u(\rho, 0) \sim \sum_{j=1}^\infty u_{0j}\phi_j(\rho),
\]
so $u_{0j}$ must be the jth Fourier coefficient of the initial data:
\[
u_{0j} = \frac{\langle u_0, \phi_j\rangle}{\langle\phi_j, \phi_j\rangle}
= \frac{\int_0^1 u_0(\rho)\phi_j(\rho)\,\rho^2\,d\rho}{\int_0^1 \phi_j(\rho)^2\,\rho^2\,d\rho}.
\]
As expected, $u(\rho, t) \to 0$ as $t \to \infty$.
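Putting the pieces together for $b = 1$ and the constant initial temperature $u_0 \equiv 1$ (arbitrary illustrative choices), a sketch that computes the coefficients $u_{0j}$ by quadrature, using the roots $k_j = (j - \frac12)\pi$ valid when $b = 1$, and evaluates the series:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import jv

k = (np.arange(1, 9) - 0.5) * np.pi          # roots k_j for b = 1 (see above)
phi = lambda j, rho: jv(0.5, k[j] * rho) / np.sqrt(rho)

u0 = lambda rho: 1.0                          # initial temperature
coef = []
for j in range(len(k)):
    # lower limit nudged off the removable 0/0 of phi at rho = 0
    num, _ = quad(lambda r: u0(r) * phi(j, r) * r**2, 1e-9, 1.0)
    den, _ = quad(lambda r: phi(j, r)**2 * r**2, 1e-9, 1.0)
    coef.append(num / den)

def u(rho, t):
    """Partial sum of the cooling-off solution."""
    return sum(cj * np.exp(-kj**2 * t) * phi(j, rho)
               for j, (cj, kj) in enumerate(zip(coef, k)))

for t in (0.0, 0.1, 0.5):
    print(f"u(0.5, {t}) = {u(0.5, t):.4f}")   # decays to 0 as t grows
```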
