Lecture Notes MT 2019 Oct 07
Melanie Rupflin
October 7, 2019
Acknowledgement: These lecture notes and the problem sheets are based on material from
the lecture notes and exercises of Hilary Priestley from MT 2017 as well as some further sources,
such as the lecture notes of M. Struwe and the book of D. Werner on Funktionalanalysis.
Contents
1 Banach spaces
1.2 Examples
1.3 Completeness
2.1 Examples
4 Density
5 Separability
8 Spectral Theory
8.2.1 Examples
Chapter 1
Banach spaces
Recall:
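Definition 1. Let $X$ be a vector space (over either $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$).
A norm $\|\cdot\| : X \to \mathbb{R}$ is a function so that $\forall x, y \in X$, $\forall \lambda \in \mathbb{F}$
(N1) $\|x\| \ge 0$ with $\|x\| = 0$ if and only if $x = 0$,
(N2) $\|\lambda x\| = |\lambda|\,\|x\|$,
(N3) $\|x + y\| \le \|x\| + \|y\|$ (triangle inequality).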
We recall in particular that a set $\Omega \subset X$ is open if for every $x_0 \in \Omega$ there exists $r > 0$ so that
\[ B_r(x_0) := \{x \in X : \|x - x_0\| < r\} \subset \Omega. \]
Notation: We will use the convention that $A \subset B$ simply means that $A$ is a subset of $B$, not necessarily a proper subset, i.e. allowing for $A = B$. If our assumption is that $A$ is a proper subset of $B$ then we will either explicitly say so or write $A \subsetneq B$.
We also recall that two norms $\|\cdot\|$ and $\|\cdot\|'$ are equivalent if and only if there exists a constant $C > 0$ so that for all $x \in X$
\[ C^{-1}\|x\| \le \|x\|' \le C\|x\|, \]
or equivalently if there exist two constants $C_{1,2} \in \mathbb{R}$ so that for all $x \in X$
\[ \|x\| \le C_1\|x\|' \quad\text{and}\quad \|x\|' \le C_2\|x\|, \]
and that equivalent norms lead to equivalent definitions of convergence, Cauchy sequences, open and closed sets, and so on.
The key objects we study in this course are Banach spaces and linear maps between such spaces.
Definition 2. A normed space $(X, \|\cdot\|)$ is a Banach space if it is complete, i.e. if every Cauchy sequence in $X$ converges.
We first note that for any given subspace $Y$ of a normed space $(X, \|\cdot\|)$ we obtain a norm on $Y$ simply by restricting the given norm to $Y$. For the resulting normed space $(Y, \|\cdot\|)$ we have
Proposition 1.1. Let $(X, \|\cdot\|)$ be a Banach space, $Y \subset X$ a subspace. Then
\[ (Y, \|\cdot\|) \text{ is complete} \iff Y \text{ is closed in } X. \]
Proof.
"$\Rightarrow$":
Let $(y_n)$ be so that $y_n \in Y$, $y_n \to x \in X$. Then $(y_n)$ is a Cauchy sequence in $Y$ so converges in $(Y, \|\cdot\|)$ to some $y \in Y$. Hence $x = y \in Y$ by uniqueness of limits, so $Y$ is closed.
"$\Leftarrow$":
If $(y_n)$ is a Cauchy sequence in $(Y, \|\cdot\|)$, it is also a Cauchy sequence in $(X, \|\cdot\|)$ and must hence converge in $X$, say $y_n \to x \in X$. But as $Y$ is closed we must have that $x \in Y$ and hence that $(y_n)$ converges in $(Y, \|\cdot\|)$. Thus $Y$ is complete.
WARNING. Many properties of finite dimensional normed spaces are NOT true for general
infinite dimensional spaces, or maps between such spaces. A few examples of this are:
• Linear maps from $\mathbb{R}^n$ to $\mathbb{R}^m$ (or indeed, as we shall see later, linear maps from any finite dimensional space to any normed space $Y$) are always continuous
BUT
not all linear maps $L : (X, \|\cdot\|_X) \to (Y, \|\cdot\|_Y)$ from a Banach space $(X, \|\cdot\|_X)$ are continuous.
• Bounded, closed sets in $\mathbb{R}^n$ are compact (Heine-Borel theorem)
BUT
while compact sets are always bounded and closed, the converse is WRONG for infinite dimensional spaces.
• Every subspace of $\mathbb{R}^n$ is a closed set
BUT
not all subspaces of infinite dimensional spaces are closed.
Our intuition can further be wrong as we are used to thinking about Euclidean spaces $\mathbb{R}^n$ whose norm is induced by an inner product via $\|x\| = (x, x)^{1/2}$.
We recall that an inner product $(\cdot,\cdot) : X \times X \to \mathbb{F}$ is a map that is symmetric, $(x, y) = (y, x)$, if $\mathbb{F} = \mathbb{R}$, respectively hermitian, $(x, y) = \overline{(y, x)}$, if $\mathbb{F} = \mathbb{C}$, that is linear in the first variable and positive definite. We call a vector space $X$ together with an inner product $(\cdot,\cdot)$ an inner product space.
An important special case of Banach spaces are spaces whose norm is induced by an inner product
and these spaces will play a key role in the HT course B4.2 Functional Analysis 2.
Definition 3. A Hilbert space is an inner product space $(X, (\cdot,\cdot))$ which is complete (with respect to the induced norm $\|x\| = (x, x)^{1/2}$).
WARNING. There are several important properties that hold true in Rn , and more generally
in Hilbert spaces, but that do not hold for general Banach spaces. Examples of this include
• In $\mathbb{R}^n$ (and indeed any Hilbert space, cf. B4.2 Functional Analysis 2) minimal distances to closed subspaces are attained, i.e. given any closed subspace $S \subset X$ of a Hilbert space $X$ and any $p \in X$ there exists a unique element $s_0 \in S$ so that
\[ \|p - s_0\| = \inf_{s \in S} \|p - s\|. \]
1.2 Examples
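$(\mathbb{R}^n, \|\cdot\|_p)$, $1 \le p \le \infty$
where for $1 \le p < \infty$
\[ \|x\|_p := \Big(\sum_{i=1}^n |x_i|^p\Big)^{1/p} \]
respectively
\[ \|x\|_\infty := \sup_{i \in \{1,\dots,n\}} |x_i|. \]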
One can show that these are all norms, with the challenging bit being the proof of the $\Delta$-inequality
\[ \|x + y\|_p = \Big(\sum_i |x_i + y_i|^p\Big)^{1/p} \le \|x\|_p + \|y\|_p. \]
WARNING. This inequality does not hold if we were to extend the definition of $\|\cdot\|_p$ to $0 < p < 1$, and hence the above expression does not give a norm on $\mathbb{R}^n$ if $p < 1$.
A useful property to deal with the $p$-norms, $1 \le p \le \infty$ (and their generalisations to sequence and function spaces), is Hölder’s inequality.
Lemma 1.2 (Hölder’s inequality in $\mathbb{R}^n$). For $1 \le p, q \le \infty$ with
\[ \frac{1}{p} + \frac{1}{q} = 1 \tag{$\star$} \]
we have that for any $x, y \in \mathbb{R}^n$
\[ \Big|\sum_{i=1}^n x_i y_i\Big| \le \|x\|_p \|y\|_q. \]
In $(\star)$ we use the convention that $\frac{1}{p} = 0$ for $p = \infty$, and one often calls numbers $p, q \in [1, \infty]$ satisfying $(\star)$ conjugate exponents.
The proof of this inequality (both for $\mathbb{R}^n$ as well as the analogues for the sequence and function spaces $\ell^p$ and $L^p$) can be found in most textbooks. Here we will simply use Hölder’s inequality without proof.
Remark. As you will show on Problem sheet 1, we have that for all $1 \le p < \infty$
\[ \|x\|_\infty \le \|x\|_p \le n^{1/p}\|x\|_\infty. \]
Hence the $\infty$-norm is equivalent to every $p$-norm and thus, by transitivity, we have that $\|\cdot\|_p$ is equivalent to $\|\cdot\|_q$ for every $1 \le p, q \le \infty$.
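The norm comparison and Hölder’s inequality in $\mathbb{R}^n$ can be sanity-checked numerically. The following Python sketch only tests random vectors and is of course no proof; the helper names `p_norm`, `conjugate`, `check_norm_comparison` and `check_hoelder` are hypothetical and introduced here purely for illustration.

```python
import math
import random

def p_norm(x, p):
    """p-norm of a finite vector x; p = math.inf gives the sup norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def conjugate(p):
    """Conjugate exponent q with 1/p + 1/q = 1 (convention 1/inf = 0)."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1.0)

def check_norm_comparison(x, p, tol=1e-12):
    """Check ||x||_inf <= ||x||_p <= n^(1/p) ||x||_inf for 1 <= p < inf."""
    n = len(x)
    sup = p_norm(x, math.inf)
    mid = p_norm(x, p)
    return sup <= mid + tol and mid <= n ** (1.0 / p) * sup + tol

def check_hoelder(x, y, p, tol=1e-12):
    """Check |sum x_i y_i| <= ||x||_p ||y||_q for conjugate exponents p, q."""
    lhs = abs(sum(a * b for a, b in zip(x, y)))
    return lhs <= p_norm(x, p) * p_norm(y, conjugate(p)) + tol

random.seed(0)  # fixed seed so the illustration is reproducible
x = [random.uniform(-1, 1) for _ in range(5)]
y = [random.uniform(-1, 1) for _ in range(5)]
for p in (1, 1.5, 2, 3, 10):
    print(p, check_norm_comparison(x, p), check_hoelder(x, y, p))
```

Both checks should print `True` for every tested exponent, up to the small floating point tolerance `tol`.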
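An infinite dimensional analogue of $(\mathbb{R}^n, \|\cdot\|_p)$, respectively $(\mathbb{C}^n, \|\cdot\|_p)$, are the spaces of sequences $(\ell^p, \|\cdot\|_p)$, $1 \le p \le \infty$, where for $1 \le p < \infty$
\[ \ell^p := \Big\{(x_j)_{j\in\mathbb{N}} : \sum_{j=1}^\infty |x_j|^p < \infty\Big\} \]
while $\ell^\infty$ denotes the space of bounded sequences, equipped with $\|\cdot\|_p$ where for $1 \le p < \infty$
\[ \|x\|_{\ell^p} = \Big(\sum_{j=1}^\infty |x_j|^p\Big)^{1/p} \]
while for $p = \infty$
\[ \|(x_j)\|_\infty := \sup_j |x_j|. \]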
For any $1 \le p \le \infty$ we have that $(\ell^p, \|\cdot\|_p)$ is a normed space (where we define addition and scalar multiplication component-wise) and one can furthermore prove:
• the spaces $(\ell^p, \|\cdot\|_p)$ are all complete and hence Banach spaces; we carry out the proof of this for $p = 2$ in the next section.
• the Hölder inequality holds true, i.e. for every $1 \le p, q \le \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$ and any $(x_j) \in \ell^p$ and $(y_j) \in \ell^q$ we have that $\sum_j x_j y_j$ converges and
\[ \Big|\sum_j x_j y_j\Big| \le \|(x_j)\|_p \|(y_j)\|_q. \]
• for $p = 2$ the expression $(x, y) := \sum_j x_j \overline{y_j}$ is well defined and one can easily check that this is an inner product that induces the $\ell^2$-norm, hence making $(\ell^2, \|\cdot\|_2)$ into a Hilbert space.
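A further example is the subspace
\[ c_0 := \{(x_n) \in \ell^\infty : x_n \underset{n\to\infty}{\longrightarrow} 0\} \]
of $\ell^\infty$, which is closed and hence, when equipped with the $\ell^\infty$-norm, a Banach space.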
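Let $\Omega \subset \mathbb{R}$ be an interval, or more generally any measurable subset of $\mathbb{R}^n$. Consider for $1 \le p < \infty$ the space of functions
\[ \mathcal{L}^p(\Omega) := \Big\{f : \Omega \to \mathbb{R} \text{ measurable so that } \int_\Omega |f|^p\,dx < \infty\Big\} \]
respectively
\[ \mathcal{L}^\infty(\Omega) := \{f : \Omega \to \mathbb{R} \text{ measurable so that } |f| \le M \text{ a.e. for some } M \in \mathbb{R}\}. \]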
Here and in the following all integrals are computed with respect to the Lebesgue measure and
we shall only ever consider functions that are measurable so you may assume in any application
that the functions you encounter are measurable without having to provide a proof for this.
Conversely, we recall that not all measurable functions are integrable and that indeed for a general measurable function the integral might not even be defined, so justification is needed to consider integrals in general. However we also recall that the integral of a non-negative function $f$ is always defined, though it might be infinite.
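We equip these spaces with
\[ \|f\|_{L^p} := \Big(\int_\Omega |f|^p\,dx\Big)^{1/p} \]
respectively
\[ \|f\|_{L^\infty} := \operatorname{ess\,sup}|f| := \inf\{M : |f| \le M \text{ a.e.}\}. \]
We note that $\|\cdot\|_{L^p}$ is only a seminorm on $\mathcal{L}^p$ with $\|f - g\|_{L^p} = 0$ if and only if $f = g$ a.e. We can hence turn $(\mathcal{L}^p, \|\cdot\|_{L^p})$ into a normed space by taking the quotient with respect to the equivalence relation
\[ f \sim g \iff f = g \text{ a.e.} \]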
The resulting quotient space $L^p(\Omega)$ is one of the most important spaces of functions in the modern theory of PDE (as developed e.g. in the course C4.3 Functional analytic methods for PDEs) and has the following properties: For any (measurable) set $\Omega \subset \mathbb{R}^n$
• Hölder’s inequality holds: If $f \in L^p(\Omega)$ and $g \in L^q(\Omega)$ where $\frac{1}{p} + \frac{1}{q} = 1$ then their product $fg$ is integrable with
\[ \Big|\int_\Omega f g\,dx\Big| \le \|f\|_{L^p}\|g\|_{L^q}. \]
• For $p = 2$ the expression $(f, g) := \int_\Omega f g\,dx$ is a well defined inner product that induces the $L^2$ norm, so $L^2$ is a Hilbert space.
None of the $L^p$ norms are equivalent, though for bounded domains (and sets with finite measure), we can estimate the $L^p$ norm of functions by their $L^q$ norm if $p < q$ and have that for any $1 < p < q < \infty$
\[ L^\infty(\Omega) \subsetneq L^q(\Omega) \subsetneq L^p(\Omega) \subsetneq L^1(\Omega). \tag{1.1} \]
As an example consider $\Omega = (0, 2) \subset \mathbb{R}$ and $p = 2$, $q = 4$. Adding in a multiplication by the constant function $g = 1$ we can estimate, using Hölder’s inequality,
\[ \|f\|_{L^2}^2 = \int_0^2 |f|^2 \cdot 1\,dx \le \big\||f|^2\big\|_{L^2}\|1\|_{L^2} = \Big(\int_0^2 f^4\,dx\Big)^{1/2} \cdot \Big(\int_0^2 1\,dx\Big)^{1/2} = \sqrt{2}\,\|f\|_{L^4}^2, \]
so we get $\|f\|_{L^2} \le 2^{1/4}\|f\|_{L^4} \le \sqrt{2}\,\|f\|_{L^4}$ and in particular that every $f \in L^4((0, 2))$ is also an element of $L^2((0, 2))$. The general case is discussed on the first problem sheet.
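The estimate on $(0, 2)$ can be illustrated numerically. The sketch below approximates the integrals with a midpoint rule and compares $\|f\|_{L^2}$ with $2^{1/4}\|f\|_{L^4}$, the constant obtained by taking square roots in the estimate above. The function name `lp_norm`, the step count and the sample functions are assumptions made only for this illustration.

```python
import math

def lp_norm(f, a, b, p, steps=20000):
    """Approximate (int_a^b |f|^p dx)^(1/p) with the midpoint rule."""
    h = (b - a) / steps
    total = sum(abs(f(a + (i + 0.5) * h)) ** p for i in range(steps))
    return (total * h) ** (1.0 / p)

samples = {
    "x": lambda x: x,
    "sin": math.sin,
    "exp(-x)": lambda x: math.exp(-x),
}
for name, f in samples.items():
    l2 = lp_norm(f, 0.0, 2.0, 2)
    l4 = lp_norm(f, 0.0, 2.0, 4)
    # Compare ||f||_{L^2} with 2^(1/4) ||f||_{L^4} on (0, 2)
    print(name, l2, 2 ** 0.25 * l4, l2 <= 2 ** 0.25 * l4 + 1e-12)
```

For the constant function $f = 1$ both sides equal $\sqrt{2}$, so the constant $2^{1/4}$ cannot be improved on this interval.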
WARNING. The inclusion (1.1) is wrong for unbounded domains, e.g. the constant function $f = 1$ is an element of $L^\infty(\mathbb{R})$ but isn’t contained in any $L^p(\mathbb{R})$, $1 \le p < \infty$.
Remark. In practice it can be useful to extend $\|\cdot\|_{L^p}$ to a function from the space of all (measurable) functions to $[0, \infty) \cup \{\infty\}$ by simply setting $\|f\|_{L^p} = \infty$ if $\int |f|^p = \infty$ (respectively for $p = \infty$ if $f \notin L^\infty$), and we note that also with this ’abuse of notation’ the triangle and Hölder inequalities still hold (with the convention that $0 \cdot \infty = 0$ for Hölder’s inequality). Similarly we can extend $\|\cdot\|_p$ to a function that maps all sequences to $[0, \infty) \cup \{\infty\}$, but we stress that while this notation/convention can be useful and is used in the literature, these functions into $[0, \infty) \cup \{\infty\}$ are not norms, as a norm is by definition a function into $[0, \infty)$.
WARNING. Note that the inclusions of the function spaces $L^p(\Omega)$ for sets $\Omega$ with finite measure are the "other way around" compared with the inclusions of the sequence spaces $\ell^p$.
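For sequence spaces one has $\ell^p \subset \ell^q$ for $p < q$, and the standard example $x_j = 1/j$ lies in $\ell^2$ but not in $\ell^1$. The sketch below prints partial sums for this sequence; numerics can only suggest and not prove divergence, and the function name `partial_p_sum` is a hypothetical helper chosen for this illustration.

```python
import math

def partial_p_sum(p, n):
    """Partial sum sum_{j=1}^n |x_j|^p for the sequence x_j = 1/j."""
    return sum((1.0 / j) ** p for j in range(1, n + 1))

for n in (10, 1000, 100000):
    # The p = 2 sums stay below pi^2/6, the p = 1 sums grow like log(n)
    print(n, partial_p_sum(1, n), partial_p_sum(2, n), math.pi ** 2 / 6)
```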
• $\mathcal{F}^b(\Omega) := \{f : \Omega \to \mathbb{F} \text{ bounded}\}$
• $C_b(\Omega) := \{f : \Omega \to \mathbb{F} \text{ continuous and bounded}\}$
Similarly, on spaces of differentiable functions (with bounded derivatives) such as $C^1([0, 1])$ we will generally use norms that are built using the sup norm of both the function and its derivative such as $\|f\|_{C^1} := \|f\|_{\sup} + \|f'\|_{\sup}$.
It is important to note that convergence with respect to the supremum norm is the same as
uniform convergence of functions, so as seen in Prelims and Part A analysis lectures, one often
proves convergence of a given sequence fn in three steps: First we prove that the sequence
converges pointwise to some function f which is then the only candidate for the limit of fn as
uniform convergence implies pointwise convergence. We then need to check that f is in the
corresponding space and finally to establish uniform convergence of fn to f .
Given two normed spaces $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ we can define a norm on $X \times Y$ e.g. by
\[ \|(x, y)\| = (\|x\|^2 + \|y\|^2)^{1/2} \tag{1.2} \]
or more generally using any of the $p$-norms on $\mathbb{R}^2$ to define
\[ \|(x, y)\| := \|(\|x\|, \|y\|)\|_p = (\|x\|^p + \|y\|^p)^{1/p} \quad\text{respectively}\quad \|(x, y)\| := \max(\|x\|, \|y\|) \]
where here and in the following we simply write $\|\cdot\|$ instead of $\|\cdot\|_X$ and $\|\cdot\|_Y$ if it is clear from the context what norm we are using.
We note that for all of these norms on $X \times Y$ we obtain that $X \times Y$ is again a Banach space if both $X$ and $Y$ are Banach spaces. If $X$ and $Y$ are inner product spaces then one uses in general the norm (1.2) as for this choice of norm also the product $X \times Y$ will again be an inner product space with inner product $((x, y), (x', y')) = (x, x')_X + (y, y')_Y$, while none of the norms with $p \ne 2$ preserve the structure of an inner product space.
Sums of subspaces
Quotient spaces
Given a vector space $X$ and a seminorm $|\cdot|$ on $X$, i.e. a function $|\cdot| : X \to [0, \infty)$ satisfying (N2) and (N3), we can consider the quotient space $X/X_0$ where $X_0 := \{x \in X : |x| = 0\}$. Then one can define a norm on $X/X_0$ by defining $\|x + X_0\| := |x|$, see problem sheet 1 for details.
This is the process whereby the $L^p$ spaces are obtained from the corresponding $\mathcal{L}^p$ spaces by identifying functions which are equal a.e.
1.3 Completeness
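The spaces discussed above are all complete. The proof of completeness often follows the following rough pattern: Given a Cauchy sequence $(x_n)$ in a normed space $(X, \|\cdot\|_X)$
• identify a candidate $x$ for the limit, e.g. as a pointwise or componentwise limit,
• show that $x \in X$,
• show that $\|x_n - x\|_X \to 0$.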
We illustrate this by proving the completeness of some of the spaces introduced in the previous
section:
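Given a Cauchy sequence $(f_n)$ in $(C_b(\Omega), \|\cdot\|_{\sup})$ we have that for every $x \in \Omega$
\[ |f_n(x) - f_m(x)| \le \|f_n - f_m\|_{\sup} \underset{n,m\to\infty}{\longrightarrow} 0, \]
i.e. $(f_n(x))$ is a Cauchy sequence in $\mathbb{R}$ so, as $\mathbb{R}$ is complete, converges to some limit. We define as candidate for the limit of the sequence of functions $f_n$ the function $f(x) := \lim_{n\to\infty} f_n(x)$ obtained by this pointwise convergence and now show that $f \in C_b(\Omega)$ and $\|f_n - f\|_{\sup} \to 0$.
Proof. Let $\varepsilon > 0$. As $(f_n)$ is a Cauchy sequence, there exists some $N$ so that for every $n, m \ge N$
\[ \|f_n - f_m\|_{\sup} \le \varepsilon. \]
Passing to the limit $m \to \infty$ in $|f_n(x) - f_m(x)| \le \varepsilon$ for each fixed $x \in \Omega$ gives $|f_n(x) - f(x)| \le \varepsilon$. This implies in particular that $f$ is bounded, namely that $\sup_{x\in\Omega} |f(x)| \le \|f_N\|_{\sup} + \varepsilon$, and that for every $n \ge N$, $\|f - f_n\|_{\sup} \le \varepsilon$. As $\varepsilon > 0$ was arbitrary this proves that $f_n$ converges to $f$ with respect to the supremum norm. Finally we obtain that $f \in C_b(\Omega)$ as $f$ is the uniform limit of a sequence of continuous functions and hence continuous (cf. Analysis II and Part A Metric Spaces; this is proved using an $\varepsilon/3$ argument).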
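Let $(x^{(n)})$, $x^{(n)} = (x_j^{(n)})_{j\in\mathbb{N}}$, be a Cauchy sequence in $(\ell^2, \|\cdot\|_2)$. As for every $j \in \mathbb{N}$
\[ |x_j^{(n)} - x_j^{(m)}| \le \|x^{(n)} - x^{(m)}\|_2 \underset{n,m\to\infty}{\longrightarrow} 0 \]
the sequence $(x_j^{(n)})_n \subset \mathbb{R}$ is Cauchy so converges, say $x_j^{(n)} \underset{n\to\infty}{\longrightarrow} x_j$.
Claim: $x = (x_j) \in \ell^2$ and $\|x - x^{(n)}\|_2 \underset{n\to\infty}{\longrightarrow} 0$.
Proof: Let $\varepsilon > 0$. Then as $(x^{(n)})$ is Cauchy there exists $N$ so that for all $n, m \ge N$
\[ \|x^{(n)} - x^{(m)}\|_2 \le \varepsilon. \]
Thus for every $K \in \mathbb{N}$ and for all $n \ge N$ we have that
\[ \sum_{j=1}^K |x_j^{(n)} - x_j|^2 = \lim_{m\to\infty} \sum_{j=1}^K |x_j^{(n)} - x_j^{(m)}|^2 \le \varepsilon^2. \]
As this holds for every $K$ we can take $K \to \infty$ to get that $\|x^{(n)} - x\|_2^2 \le \varepsilon^2$ for every $n \ge N$. As $\varepsilon > 0$ was arbitrary, we thus obtain that $\|x^{(n)} - x\|_2 \underset{n\to\infty}{\longrightarrow} 0$. As above we also get that $x \in \ell^2$ as
\[ \|x\|_2 \overset{\Delta}{\le} \|x^{(n)} - x\|_2 + \|x^{(n)}\|_2 < \infty. \]
(Note that here we use the above mentioned "abuse of notation" of defining $\|\cdot\|_2$ for arbitrary sequences by setting $\|x\|_2 = \infty$ if $x \notin \ell^2$ to be able to already talk of $\|x\|_2$ when we do not yet know that $x \in \ell^2$.)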
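(ii) $\Rightarrow$ (i)
Suppose $x_{n_k} \to x$. Given any $\varepsilon > 0$, we can choose $N$ so that for all $n, m \ge N$
\[ \|x_n - x_m\| < \varepsilon/2 \]
and furthermore choose $K$ so that for $k \ge K$
\[ \|x_{n_k} - x\| < \varepsilon/2. \]
Then for any $n \ge N$ we can pick $k \ge K$ with $n_k \ge N$ and obtain $\|x_n - x\| \le \|x_n - x_{n_k}\| + \|x_{n_k} - x\| < \varepsilon$, so $x_n \to x$.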
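As a consequence we obtain that a normed space is complete if and only if absolute convergence of series implies convergence of series:
Corollary 1.4. Let $(X, \|\cdot\|)$ be a normed space. Then the following are equivalent
(i) $X$ is complete.
(ii) Every series $\sum_{n=1}^\infty x_n$ in $X$ with $\sum_{n=1}^\infty \|x_n\| < \infty$ converges in $X$.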
(ii) $\Rightarrow$ (i)
Let $(x_n)$ be a Cauchy sequence. Select a subsequence $(x_{n_j})$ so that
\[ \|x_{n_{j+1}} - x_{n_j}\| \le 2^{-j}, \]
where the existence of such a subsequence is ensured by the fact that $(x_n)$ is Cauchy. Then $\sum_{j=1}^\infty \|x_{n_{j+1}} - x_{n_j}\| \le 1 < \infty$ so (ii) ensures that $\sum_{j=1}^\infty (x_{n_{j+1}} - x_{n_j})$ converges. Hence $x_{n_k} = x_{n_1} + \sum_{j=1}^{k-1} (x_{n_{j+1}} - x_{n_j})$ converges, so $(x_n)$ has a convergent subsequence and must thus, by Lemma 1.3, itself converge.
Example (Examples of non-complete spaces). We can construct many examples of non-complete spaces by equipping a well known space such as $C_b$, $C^1$, $\ell^p$, $L^p$ with the ’wrong’ norm, or by choosing a subspace of a Banach space that is not closed. As an example we show that $C^0([0, 1])$ equipped with $\|f\|_{L^1} = \int_0^1 |f|\,dx$ is not complete.
For
\[ f_n(x) := \begin{cases} 1 - n^2 x & \text{for } x \in [0, \frac{1}{n^2}] \\ 0 & \text{else} \end{cases} \]
\[ \Big|\sum_{j=1}^n f_j(x) - f(x)\Big| \ge \sum_{j=1}^N \frac{1}{2} - f(x) \ge N/2 - M \ge 1 \]
and thus in particular $\big\|\sum_{j=1}^n f_j - f\big\|_{L^1} \ge \frac{1}{2N^2} \not\to 0$.
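The following sketch illustrates the two features of the functions $f_n$ that make this example work, under the reading that the series $\sum f_n$ is the object of interest: each $f_n$ is a triangle of height 1 and base $1/n^2$, so $\|f_n\|_{L^1} = \frac{1}{2n^2}$ and the norms are summable, while the partial sums take the value $N$ at $x = 0$ and are hence unbounded. The function names `f`, `l1_norm` and `partial_sum` are hypothetical helpers for this illustration.

```python
def f(n, x):
    """f_n(x) = 1 - n^2 x on [0, 1/n^2] and 0 elsewhere on [0, 1]."""
    return 1.0 - n * n * x if 0.0 <= x <= 1.0 / (n * n) else 0.0

def l1_norm(n, steps=1000):
    """Midpoint rule for int_0^1 |f_n| dx; f_n vanishes outside [0, 1/n^2]."""
    b = 1.0 / (n * n)
    h = b / steps
    return h * sum(abs(f(n, (i + 0.5) * h)) for i in range(steps))

def partial_sum(N, x):
    """Value of f_1(x) + ... + f_N(x)."""
    return sum(f(j, x) for j in range(1, N + 1))

print(l1_norm(3), 1.0 / (2 * 3 ** 2))          # numerical vs exact L^1 norm
print(sum(l1_norm(n) for n in range(1, 51)))   # partial sum of the norms
print(partial_sum(50, 0.0))                    # partial sums grow at x = 0
```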
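Chapter 2
Bounded linear operators
Definition 4. Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed spaces. A linear map $T : X \to Y$ is called a bounded linear operator if there exists $M \in \mathbb{R}$ so that for all $x \in X$
\[ \|Tx\|_Y \le M\|x\|_X. \tag{2.1} \]
We let
\[ L(X, Y) := \{T : X \to Y \text{ bounded linear operator}\} \]
which we always equip with the so called operator norm, which is defined by
\[ \|T\|_{L(X,Y)} := \inf\{M : \text{(2.1) holds true}\}, \quad T \in L(X, Y). \]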
We will often abbreviate the space L(X, X) of bounded linear operators from a normed space
(X, k · k) to itself by L(X).
We will later see that an important special case is the space of ’bounded linear functionals’,
i.e. bounded linear functions from a normed vector space to the corresponding field F = R
(respectively F = C for complex vector spaces) and this so called dual space X ∗ := L(X, F) will
be discussed in far more detail in chapters 6 and 7.
One can easily check that k·kL(X,Y ) is a norm on L(X, Y ) and as this is the only norm on L(X, Y )
that we shall use, we will often write for short kT k for the norm of an operator T ∈ L(X, Y )
(provided it is clear from the context what X and Y are and with respect to which norms on
X and Y the operator norm has to be computed). In applications the following equivalent
expressions for the norm of an operator are often more useful than the above definition
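Remark. For $T \in L(X, Y)$, $X \ne \{0\}$, we have
\[ \|T\|_{L(X,Y)} = \sup_{x \in X,\, x \ne 0} \frac{\|Tx\|}{\|x\|} = \sup_{x \in X,\, \|x\| = 1} \|Tx\| = \sup_{x \in X,\, \|x\| \le 1} \|Tx\| \]
and $\|Tx\| \le \|T\|_{L(X,Y)}\|x\|$ for all $x \in X$,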
CHAPTER 2. BOUNDED LINEAR OPERATORS 15
i.e. the infimum in the definition of the norm of a bounded linear operator is actually a minimum.
Conversely, the supremum in the above expressions for the norm of an operator is in general not
achieved, and we shall see examples of this later.
WARNING. T being a bounded linear operator does not mean that T (X) ⊂ Y is bounded.
Indeed, the only linear operator with a bounded image is the trivial operator that maps each
x ∈ X to T (x) = 0.
One of the main reasons why L(X, Y ) gives a very natural class of operators between normed
spaces is that it can be equivalently characterised as the space of continuous linear maps:
Proposition 2.1. Let (X, k · k) and (Y, k · k) be normed spaces and let T : X → Y be linear.
Then the following are equivalent:
(iv) T ∈ L(X, Y ).
(iii) $\Rightarrow$ (iv)
Suppose that $T$ is continuous at $x_0 = 0$ but that $T \notin L(X, Y)$, i.e. that there exists no $M \in \mathbb{R}$ so that the required inequality $\|Tx\| \le M\|x\|$ holds for all $x \in X$. Then there exists a sequence $(x_n)$ so that
\[ \|Tx_n\| > n\|x_n\|. \]
Then $\tilde{x}_n := \frac{x_n}{\|Tx_n\|}$ (which is well defined as $Tx_n \ne 0$) satisfies $\|\tilde{x}_n\| \le \frac{1}{n} \to 0$, i.e. converges to $x_0 = 0$. By continuity of $T$ at $0$ we must thus have that also $T\tilde{x}_n \to T(0) = 0 \in Y$ and hence $\|T\tilde{x}_n\| \to 0$ which contradicts the fact that by construction
\[ \|T\tilde{x}_n\| = \frac{1}{\|Tx_n\|}\|Tx_n\| = 1. \]
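(iv) $\Rightarrow$ (i)
Let $M \in \mathbb{R}$ be so that (2.1) holds. Then as $T$ is linear we obtain that for any $x, \tilde{x} \in X$
\[ \|Tx - T\tilde{x}\| = \|T(x - \tilde{x})\| \le M\|x - \tilde{x}\|. \]
In order to prove that a map $T : (X, \|\cdot\|) \to (Y, \|\cdot\|)$ is a bounded linear operator we need to
(1) show that $T$ is well defined, i.e. that $Tx \in Y$ for every $x \in X$,
(2) show that $T$ is linear,
(3) find some $M \in \mathbb{R}$ so that for all $x \in X$
\[ \|Tx\|_Y \le M\|x\|_X. \]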
As (1) and (3) often require similar arguments, in particular when working with spaces like $\ell^p$ or $L^p$ where the key step is to be able to bound a sum/integral respectively to prove that it is finite, one often discusses these two steps at the same time.
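We remark that to show that a linear map $T : X \to Y$ is an element of $L(X, Y)$ we just require some (possibly far from optimal) number $M$ for which (3) holds and that any such $M$ will be an upper bound on the operator norm. If we need to additionally determine the norm of $T$ then we usually proceed as follows:
(i) prove that $\|Tx\| \le M\|x\|$ for all $x \in X$ for a suitable candidate $M$, so that $\|T\| \le M$,
(ii) find a sequence $(x_n)$ in $X \setminus \{0\}$ so that
\[ \frac{\|Tx_n\|}{\|x_n\|} \to M. \]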
Instead of (ii) one might be tempted to try to find some element x ∈ X so that kT xk = M kxk,
but
WARNING. For general bounded linear operators, one cannot expect that there exists $x \in X$ so that $\|Tx\| = M\|x\|$, i.e. the supremum $\sup_{x \ne 0} \frac{\|Tx\|}{\|x\|}$ is in general not achieved.
We note that for any $T \in L(X, Y)$ both the kernel $\ker(T) := \{x \in X : T(x) = 0\}$ of $T$ and its image $TX := \{Tx : x \in X\}$ are subspaces (of $X$ respectively $Y$), but that while $\ker(T)$ is always closed, as it can be viewed as the preimage of the closed set $\{0\}$ under a continuous operator, the image $TX$ is in general not closed.
2.1 Examples
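Consider the left shift $L$, the right shift $R$ and the coordinate maps $\pi_k$, $k \in \mathbb{N}$, which are defined for sequences $x = (x_1, x_2, \dots)$ by $L(x) = (x_2, x_3, \dots)$, $R(x) = (0, x_1, x_2, \dots)$ and $\pi_k(x) = x_k$.
Claim: $L, R \in L(\ell^p) = L(\ell^p, \ell^p)$ with $\|L\| = \|R\| = 1$ while $\pi_k \in L(\ell^p, \mathbb{F}) = (\ell^p)^*$ also with $\|\pi_k\| = 1$.
Proof: Clearly all three operators are linear and well defined and for every $x \in \ell^p$ we have $\|Rx\|_p = \|x\|_p$ and hence of course $R \in L(\ell^p, \ell^p)$ with $\|R\| = 1$ (indeed $R$ preserves norms, i.e. is so called isometric, which is a much stronger property than merely having $\|R\| = 1$). For $L$ and $\pi_k$ we immediately see from the definition of the $\ell^p$ norm that
\[ \|Lx\|_p \le \|x\|_p \quad\text{and}\quad |\pi_k(x)| \le \|x\|_p \]
so that both are bounded linear operators (namely $L \in L(\ell^p, \ell^p)$ and $\pi_k \in (\ell^p)^*$) and the corresponding operator norms are bounded from above by $\|L\| \le 1$ and $\|\pi_k\| \le 1$. To see that also $\|L\| \ge 1$ we may use that $\|L(0, 1, 0, \dots)\|_p = \|(1, 0, \dots)\|_p = 1 = \|(0, 1, 0, \dots)\|_p$, while choosing $x = e^{(k)}$, the sequence that is defined by $e^{(k)} = (\delta_{kj})_{j\in\mathbb{N}}$, we also get that $1 = |\pi_k(x)| = \|x\|_p$ and hence that $\|\pi_k\| \ge 1$.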
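The behaviour of the shift operators can be explored on finitely supported sequences, stored as Python lists. This is a minimal sketch under the assumption that $L$ drops the first entry and $R$ prepends a zero (as suggested by the computation $L(0, 1, 0, \dots) = (1, 0, \dots)$ above); the names `left_shift`, `right_shift`, `coordinate` and `p_norm` are hypothetical.

```python
def p_norm(x, p):
    """ell^p norm (1 <= p < inf) of a finitely supported sequence given as a list."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def left_shift(x):
    """Assumed definition: L(x_1, x_2, x_3, ...) = (x_2, x_3, ...)."""
    return x[1:]

def right_shift(x):
    """Assumed definition: R(x_1, x_2, ...) = (0, x_1, x_2, ...)."""
    return [0.0] + x

def coordinate(x, k):
    """pi_k(x) = x_k with k counted from 1; entries beyond the list are 0."""
    return x[k - 1] if k <= len(x) else 0.0

x = [3.0, -4.0, 12.0]
p = 2
print(p_norm(right_shift(x), p), p_norm(x, p))   # R preserves the norm
print(p_norm(left_shift(x), p) <= p_norm(x, p))  # ||Lx|| <= ||x||
print(left_shift(right_shift(x)) == x)           # L undoes R
print(right_shift(left_shift(x)) == x)           # but R does not undo L
```

The last two lines show that $LR = \mathrm{Id}$ on these examples while $RL$ loses the first entry, which is a genuinely infinite dimensional phenomenon when read on $\ell^p$.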
Definition 5. We call a linear function $T : X \to Y$ isometric if for every $x \in X$ we have $\|Tx\| = \|x\|$.
We note that if $T \in L(X, Y)$ is both isometric and bijective, then we have that also $T^{-1}$ is linear and isometric (so in particular a bounded linear operator) as for every $y \in Y$
\[ \|T^{-1}y\| = \|T(T^{-1}y)\| = \|y\|. \]
Such a map is called an isometric isomorphism and the spaces $X$ and $Y$ are called isometrically isomorphic, written for short as $X \cong Y$.
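Let $X = C^0([0, 1])$, as always equipped with the supremum norm, and let $g \in C^0([0, 1])$. Then the map $T : X \to X$ given by $(Tf)(t) := g(t)f(t)$
is linear, well defined (as the product of continuous functions is continuous) and bounded as
\[ \|Tf\|_{\sup} = \sup_{t\in[0,1]} |g(t)f(t)| \le \|g\|_{\sup}\|f\|_{\sup}, \]
with equality for the constant function $f = 1$, so indeed $\|T\| = \|g\|_{\sup}$.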
Consider instead $g \in L^\infty([0, 1])$ and let $X = L^2([0, 1])$ (equipped of course with the $L^2$ norm). Then the map $T : X \to X$ defined as above is well defined as
\[ \int_0^1 |(Tf)(t)|^2\,dt = \int_0^1 f^2(t)g^2(t)\,dt \le \|g\|_{L^\infty}^2 \int_0^1 |f(t)|^2\,dt \]
so
\[ \|Tf\|_{L^2} \le \|g\|_{L^\infty}\|f\|_{L^2} \quad\text{for all } f \in X \]
and thus $\|T\| \le \|g\|_{L^\infty}$. Indeed one can show that $\|T\| = \|g\|_{L^\infty}$, though to prove this for general functions requires a careful argument using some techniques from Part A integration that are not used elsewhere in the course. We hence only consider as an example $g(t) = t$. Then $\|g\|_{L^\infty} = 1$, so the above calculation implies that $\|T\| \le 1$ while choosing $f_n := \chi_{[1-\frac{1}{n},1]}$ gives
\[ \|Tf_n\|_{L^2}^2 = \int_{1-\frac{1}{n}}^1 t^2\,dt \ge \frac{1}{n}\Big(1 - \frac{1}{n}\Big)^2, \]
so as $\|f_n\|_{L^2}^2 = \frac{1}{n}$ we have $\frac{\|Tf_n\|_{L^2}}{\|f_n\|_{L^2}} \ge 1 - \frac{1}{n} \to 1$ so also $\|T\| \ge 1$ and hence $\|T\| = 1 = \|g\|_{L^\infty}$.
At the same time one can show that for any $f \in L^2([0, 1])$ with $f \ne 0$
\[ \|Tf\|_{L^2} < \|f\|_{L^2} \]
(this proof is a nice exercise related to the Part A course in integration) so this gives an example of an operator for which the supremum $\sup_{f \ne 0} \frac{\|Tf\|}{\|f\|}$ is not attained for any element of the Banach space $X = L^2([0, 1])$.
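For $g(t) = t$ and $f_n = \chi_{[1-\frac{1}{n},1]}$ the quotient $\|Tf_n\|_{L^2}/\|f_n\|_{L^2}$ can be computed in closed form, since $\int_{1-1/n}^1 t^2\,dt = \frac{1}{n} - \frac{1}{n^2} + \frac{1}{3n^3}$. The sketch below evaluates it and shows that it approaches 1 from below without reaching it; the function name `ratio` is a hypothetical helper for this illustration.

```python
import math

def ratio(n):
    """||T f_n||_{L^2} / ||f_n||_{L^2} for g(t) = t and f_n the indicator of [1 - 1/n, 1].

    Uses int_{1-1/n}^1 t^2 dt = 1/n - 1/n^2 + 1/(3 n^3) and ||f_n||^2 = 1/n.
    """
    return math.sqrt(1.0 - 1.0 / n + 1.0 / (3.0 * n * n))

for n in (1, 2, 10, 100, 1000):
    # lower bound 1 - 1/n, the quotient itself, and whether it stays below 1
    print(n, 1.0 - 1.0 / n, ratio(n), ratio(n) < 1.0)
```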
There are several different norms on the space of matrices, including the analogues of the $p$-norms on $\mathbb{R}^n$. Particularly useful is the analogue of the Euclidean norm (i.e. of the case $p = 2$) given by
\[ \|A\| := \Big(\sum_{i,j} |a_{ij}|^2\Big)^{1/2} \]
which is also called the Hilbert-Schmidt norm and is widely used in Numerical Analysis. A useful
property of this norm is that it gives a simple way of obtaining an upper bound on the operator
norm of the corresponding map T : Rn → Rm
Lemma 2.2. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be defined by $Tx = Ax$ for some $A \in M_{m\times n}(\mathbb{R})$ where we equip $\mathbb{R}^n$ and $\mathbb{R}^m$ with the Euclidean norm. Then $T \in L(\mathbb{R}^n, \mathbb{R}^m)$ and its operator norm is bounded by the Hilbert-Schmidt norm of $A$
\[ \|T\| \le \|A\|. \]
Remark. For most matrices we have
\[ \|T\| < \|A\| \]
and computing $\|T\|$ can be difficult. For symmetric $n \times n$ matrices however we can easily show (using material from Prelims Linear Algebra) that
\[ \|T\| = \max\{|\lambda| : \lambda \text{ is an eigenvalue of } A\}. \]
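The gap between the operator norm and the Hilbert-Schmidt norm can be seen on symmetric $2 \times 2$ matrices. The sketch below relies on the standard fact that for a symmetric matrix the Euclidean operator norm equals the largest absolute value of an eigenvalue, and uses the explicit eigenvalue formula for $2 \times 2$ matrices; the names `hs_norm` and `op_norm_sym2` are hypothetical.

```python
import math

def hs_norm(A):
    """Hilbert-Schmidt norm: square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))

def op_norm_sym2(A):
    """Euclidean operator norm of a symmetric 2x2 matrix [[a, b], [b, d]].

    The eigenvalues are (a + d)/2 +- sqrt(((a - d)/2)^2 + b^2); the norm is
    the largest absolute eigenvalue (standard fact for symmetric matrices).
    """
    (a, b), (c, d) = A
    if b != c:
        raise ValueError("matrix must be symmetric")
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return max(abs(mean + radius), abs(mean - radius))

for A in ([[2.0, 1.0], [1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 0.0]]):
    print(A, op_norm_sym2(A), hs_norm(A))
```

For the last matrix, which has rank one, the two norms coincide, while for the identity the Hilbert-Schmidt norm is strictly larger.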
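Let $X = C([0, 3])$, as always equipped with the sup-norm. Given any $k \in C([0, 3] \times [0, 3])$ we map each $x \in X$ to the function $Tx : [0, 3] \to \mathbb{R}$ that is given by
\[ Tx(t) := \int_0^3 k(s, t)x(s)\,ds. \]
Here the integrand $s \mapsto k(s, t)x(s)$ is continuous, bounded by $\|k\|_{\sup}\|x\|_{\sup}$, and thus (Lebesgue) integrable over the bounded interval $[0, 3]$. The supremum norms of $k$ and $x$ are computed over the corresponding domains, i.e. $[0, 3] \times [0, 3]$ respectively $[0, 3]$.
Claim: $T \in L(X)$
Proof: $T$ is linear and for every $x \in X$ and $t \in [0, 3]$
\[ |Tx(t)| \le \int_0^3 |k(s, t)|\,|x(s)|\,ds \le 3\|k\|_{\sup}\|x\|_{\sup}. \]
Provided we show that $T : X \to X$ is actually well defined, we will thus obtain that $T \in L(X)$ with $\|T\| \le 3\|k\|_{\sup}$. To prove that $T$ is well defined we have to show that for any function $x \in C([0, 3])$ also $Tx$ is continuous on $[0, 3]$, i.e. that for any $t_0 \in [0, 3]$ and any sequence $t_n \to t_0$ we have $Tx(t_n) \to Tx(t_0)$. To this end we set $f_n(s) := k(s, t_n)x(s)$ and $f(s) := k(s, t_0)x(s)$ and observe that $f_n \to f$ pointwise on $[0, 3]$ (as $k$ is continuous) with $|f_n| \le \|k\|_{\sup}\|x\|_{\sup}$. The dominated convergence theorem hence gives
\[ Tx(t_n) = \int_0^3 f_n(s)\,ds \to \int_0^3 f(s)\,ds = Tx(t_0) \]
as claimed.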
An important property of the space of bounded linear operators is that it ”inherits” the com-
pleteness of the target space.
Theorem 2.3. Let (X, k · k) be any normed space and let (Y, k · k) be a Banach space. Then
L(X, Y ) (equipped with the operator norm) is complete and thus a Banach space.
Proof. Let $(T_n)$ be a Cauchy sequence in $L(X, Y)$. Then for every $x \in X$ we have that
\[ \|T_n x - T_m x\| \le \|T_n - T_m\|\,\|x\| \underset{n,m\to\infty}{\longrightarrow} 0, \]
so $(T_n x)$ is a Cauchy sequence in $Y$ and hence, as $Y$ is complete, converges to some element of $Y$ which we denote by $Tx$.
We now show that the resulting map $x \mapsto Tx$ is an element of $L(X, Y)$ and $T_n \to T$ in $L(X, Y)$, i.e. $\|T - T_n\| \to 0$.
We first note that the linearity of $T_n$ (and the algebra of limits (AOL)) implies that also $T$ is linear. Given any $\varepsilon > 0$ we now let $N$ be so that for $m, n \ge N$ we have $\|T_n - T_m\| \le \varepsilon$. Given any $x \in X$ and $n \ge N$ we thus have
\[ \|Tx - T_n x\| = \|\lim_{m\to\infty} T_m x - T_n x\| = \lim_{m\to\infty} \|T_m x - T_n x\| \le \varepsilon\|x\|. \]
Hence $T$ is bounded (as $\|Tx\| \le (\|T_N\| + \varepsilon)\|x\|$ for all $x$) and so an element of $L(X, Y)$ with $\|T - T_n\| \le \varepsilon$ for all $n \ge N$, so as $\varepsilon > 0$ was arbitrary we obtain that $T_n \to T$ in the sense of $L(X, Y)$.
We note in particular that if X is a Banach-space then the space L(X) := L(X, X) of bounded
linear operators from X to itself is a Banach space and that for any normed space (X, k · k) the
dual space X ∗ = L(X, R) (respectively X ∗ = L(X, C) if X is a complex vector space) is complete
as both R and C are complete.
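Given any normed spaces $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ and $(Z, \|\cdot\|_Z)$ and any linear operators $T \in L(X, Y)$ and $S \in L(Y, Z)$ we can consider the composition $ST = S \circ T : X \to Z$ and observe that $\|STx\|_Z \le \|S\|\,\|Tx\|_Y \le \|S\|\,\|T\|\,\|x\|_X$ for every $x \in X$. Hence
Proposition 2.4. The composition $ST$ of two bounded linear operators $S \in L(Y, Z)$ and $T \in L(X, Y)$ between normed spaces $X, Y, Z$ is again a bounded linear operator and we have
\[ \|ST\|_{L(X,Z)} \le \|S\|_{L(Y,Z)}\|T\|_{L(X,Y)}. \]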
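We note in particular that composition is continuous: if $T_n \to T$ in $L(X, Y)$ and $S_n \to S$ in $L(Y, Z)$ then
\[ \|S_n T_n - ST\| \le \|S_n - S\|\,\|T_n\| + \|S\|\,\|T_n - T\| \to 0, \]
where we use in the last step that $\|T_n\|$ is bounded since $T_n$ converges.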
We also note that for operators $T \in L(X)$ from a normed space $(X, \|\cdot\|)$ to itself we can consider the composition of $T$ with itself, and more generally powers $T^n = T \circ T \circ \dots \circ T \in L(X)$ which, by the above proposition, have norm
\[ \|T^n\| \le \|T\|^n. \]
We conclude in particular
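Remark. Let $X$ be a Banach space and let $A \in L(X)$. Then
\[ \exp(A) := \sum_{k=0}^\infty \frac{1}{k!}A^k \]
converges in $L(X)$. Indeed, as $\|A^k\| \le \|A\|^k$ we have
\[ \sum_{k=0}^\infty \Big\|\frac{1}{k!}A^k\Big\| \le \sum_{k=0}^\infty \frac{1}{k!}\|A\|^k = \exp(\|A\|) < \infty, \]
i.e. the series converges absolutely. As $X$ is complete and thus, by Theorem 2.3, also $L(X)$ is complete we hence obtain from Corollary 1.4 that the series converges.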
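In the finite dimensional case $X = \mathbb{R}^n$ the partial sums of the exponential series can be computed directly. The sketch below does this for small square matrices stored as nested lists and checks the result against the known closed form $\exp\begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix} = \begin{pmatrix}\cos 1 & -\sin 1\\ \sin 1 & \cos 1\end{pmatrix}$; the helper names `mat_mul` and `mat_exp` and the number of terms are assumptions of this illustration.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=30):
    """Partial sum sum_{k=0}^{terms-1} A^k / k! of the exponential series."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds A^k / k!, starting with k = 0
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

J = [[0.0, -1.0], [1.0, 0.0]]
E = mat_exp(J)
print(E)
print([[math.cos(1.0), -math.sin(1.0)], [math.sin(1.0), math.cos(1.0)]])
```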
In many applications, including spectral theory as discussed in chapter 8, the following lemma
turns out to be useful to prove that an operator is invertible:
Lemma 2.5 (Convergence of Neumann series). Let $X$ be a Banach space and let $T \in L(X)$ be so that $\|T\| < 1$. Then the operator $\mathrm{Id} - T$ is invertible with
\[ (\mathrm{Id} - T)^{-1} = \sum_{j=0}^\infty T^j \in L(X). \]
If we only talk about T : X → X being ’invertible as a function between sets’, we sometimes say
that T is algebraically invertible and that a function S : X → X is an algebraic inverse of T if
ST = T S = Id (but not necessarily S ∈ L(X)).
Corollary 2.6. Let $T \in L(X)$ be invertible. Then for any $S \in L(X)$ with $\|S\| < \|T^{-1}\|^{-1}$ we have that $T - S$ is invertible.
We will discuss the topic of invertibility of linear operators in more detail later in the course (see
chapter 8)
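Proof of Lemma 2.5. As $\|T\| < 1$ we know that $\sum_k \|T^k\| \le \sum_k \|T\|^k < \infty$ so, by Corollary 1.4 (as $L(X)$ is complete by Theorem 2.3), the series converges
\[ S_n := \sum_{k=0}^n T^k \underset{n\to\infty}{\longrightarrow} S = \sum_{k=0}^\infty T^k \quad\text{in } L(X). \]
As
\[ (\mathrm{Id} - T)S_n = \mathrm{Id} - T + T - T^2 + T^2 - \dots - T^n + T^n - T^{n+1} = \mathrm{Id} - T^{n+1} \]
and $\|T^{n+1}\| \le \|T\|^{n+1} \to 0$ we can pass to the limit $n \to \infty$ in the above expression to obtain that $(\mathrm{Id} - T)S = \mathrm{Id}$ and similarly $S(\mathrm{Id} - T) = \mathrm{Id}$ so $S = (\mathrm{Id} - T)^{-1}$.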
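The Neumann series can be tried out on a small matrix. In the sketch below the matrix `T` is an arbitrary example whose Hilbert-Schmidt norm is below 1 (so that, by Lemma 2.2, also its operator norm is below 1), and the helper names `mat_mul`, `identity` and `neumann_sum` are hypothetical.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_sum(T, terms=200):
    """Partial sum Id + T + T^2 + ... + T^terms of the Neumann series."""
    n = len(T)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = mat_mul(power, T)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

T = [[0.2, 0.1], [0.0, 0.3]]  # Hilbert-Schmidt norm sqrt(0.14) < 1
S = neumann_sum(T)
I_minus_T = [[1.0 - T[0][0], -T[0][1]], [-T[1][0], 1.0 - T[1][1]]]
print(S)
print(mat_mul(I_minus_T, S))  # should be close to the identity matrix
```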
Proof of Corollary 2.6. As $T$ is invertible (which by definition means that also $T^{-1} \in L(X)$) we can write $T - S = T(\mathrm{Id} - T^{-1}S)$ and note that $T^{-1}S \in L(X)$ with $\|T^{-1}S\|_{L(X)} \le \|T^{-1}\|\,\|S\| < 1$. By Lemma 2.5 we thus find that $(\mathrm{Id} - T^{-1}S)$ is invertible with $(\mathrm{Id} - T^{-1}S)^{-1} = \sum_{j=0}^\infty (T^{-1}S)^j \in L(X)$ and hence $T - S$ is the composition of two invertible operators and thus invertible, compare also Q.1 on Problem Sheet 2.
Remark. We obtain in particular that if $T \in L(X)$ is so that $\|\mathrm{Id} - T\| < 1$ then $T$ is invertible. Denoting by
\[ GL(X) := \{T \in L(X) : T \text{ is invertible}\} \]
we thus know that the open unit ball $B_1(\mathrm{Id}) := \{T \in L(X) : \|T - \mathrm{Id}\| < 1\}$ around the identity is fully contained in $GL(X)$ and more generally that for any $T \in GL(X)$ we have $B_\delta(T) \subset GL(X)$ for $\delta = \frac{1}{\|T^{-1}\|} > 0$, so $GL(X)$ is an open subset of $L(X)$.
Remark. As you will show on Problem sheet 2, for $S \in L(X)$ algebraically invertible we have that $S^{-1} \in L(X)$ if and only if
\[ \exists \delta > 0 \text{ so that } \forall x \in X \text{ we have } \|S(x)\| \ge \delta\|x\|. \tag{$\star$} \]
We will furthermore see that for any $S \in L(X, Y)$ satisfying $(\star)$ we have that the image $SX$ is closed, compare Proposition 8.1.
Chapter 3
Finite dimensional normed spaces
In this chapter we will explain why for finite dimensional spaces most of the questions raised in
the previous chapters do not arise, and hence why you never had to discuss issues of continuity,
completeness,... in your prelims/part A courses on Linear Algebra. We shall see in particular
that
We shall furthermore see that the Theorem of Heine-Borel seen in part A and Prelims for R and
Rn , that assures that bounded and closed sets in Rn are compact, remains valid in general finite
dimensional normed spaces and that indeed a normed space is finite dimensional if and only if
the assertion of this theorem holds.
To begin with, we prove the following important special case of the equivalence of norms, upon
which we shall later base the proof of this result for general finite dimensional spaces:
Proposition 3.1. Any norm $\|\cdot\|$ on $\mathbb{R}^m$, $m \in \mathbb{N}$, is equivalent to the euclidean norm $\|x\|_2 := \big(\sum_{i=1}^m x_i^2\big)^{1/2}$ and hence all norms on $\mathbb{R}^m$ are equivalent.
Proof. We first remark that the last part of the proposition simply follows from the transitivity of the relation of norms being equivalent, so it remains to show that for any norm $\|\cdot\|$ there exist constants $C_{1,2} \in \mathbb{R}$ so that for every $x \in \mathbb{R}^m$
\[ \|x\| \le C_1\|x\|_2 \quad\text{and}\quad \|x\|_2 \le C_2\|x\|. \]
To get the first inequality we note that for any $x = (x_1, \dots, x_m) = \sum_{i=1}^m x_i e_i \in \mathbb{R}^m$
\[ \|x\| \overset{\Delta}{\le} \sum_{i=1}^m |x_i|\,\|e_i\| \overset{\text{C.S.}}{\le} \Big(\sum_{i=1}^m |x_i|^2\Big)^{1/2}\Big(\sum_{i=1}^m \|e_i\|^2\Big)^{1/2} = C_1\|x\|_2 \tag{3.1} \]
where we set $C_1 := \big(\sum_{i=1}^m \|e_i\|^2\big)^{1/2}$.
For the proof of the reverse inequality we give two slightly different variants, which are however
based on the same core idea and use in particular the Theorem of Heine-Borel in Euclidean space
(Rm , k · k2 ).
CHAPTER 3. FINITE DIMENSIONAL NORMED SPACES 23
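Variant 1 (Using that continuous functions on compact sets achieve their minimum)
We note that the function $f(x) := \|x\|$ is a Lipschitz-continuous function from $(\mathbb{R}^m, \|\cdot\|_2)$ to $\mathbb{R}$ (though of course not an element of $L(\mathbb{R}^m, \mathbb{R})$ as it is not linear) as the reverse triangle inequality combined with (3.1) allows us to bound
\[ |f(x) - f(y)| = \big|\|x\| - \|y\|\big| \le \|x - y\| \le C_1\|x - y\|_2. \]
As the euclidean unit sphere $S := \{x \in \mathbb{R}^m : \|x\|_2 = 1\}$ is bounded and closed, and hence compact by the Theorem of Heine-Borel, $f$ attains its minimum on $S$ at some $x^* \in S$, and $f(x^*) = \|x^*\| > 0$ as $x^* \ne 0$. For any $x \ne 0$ we thus get $\|x\| = \|x\|_2\, f\big(\tfrac{x}{\|x\|_2}\big) \ge f(x^*)\|x\|_2$, i.e. the second inequality holds with $C_2 = f(x^*)^{-1}$.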
Variant 2 (Proof by contradiction) Suppose that there exists no constant $C_2$ so that the inequality $\|x\|_2 \le C_2\|x\|$ holds true for every $x \in \mathbb{R}^m$. Then we can choose a sequence of elements $x^{(n)} \in \mathbb{R}^m \setminus \{0\}$ so that $\|x^{(n)}\|_2 \ge n\|x^{(n)}\|$. The renormalised sequence $\tilde{x}^{(n)} = \frac{x^{(n)}}{\|x^{(n)}\|_2}$ then consists of elements of the euclidean unit sphere $S$ which as observed above is compact and thus has a subsequence that converges, $\tilde{x}^{(n_j)} \to x \in S$, with respect to the euclidean norm $\|\cdot\|_2$. As $x \in S$, we know that $x \ne 0$ and thus $\|x\| \ne 0$ which contradicts the fact that
\[ \|x\| \overset{\Delta}{\le} \|x - \tilde{x}^{(n_j)}\| + \|\tilde{x}^{(n_j)}\| \le C_1\|x - \tilde{x}^{(n_j)}\|_2 + \frac{1}{n_j} \to 0. \]
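The constants in the equivalence of norms can be made concrete for the 1-norm on $\mathbb{R}^m$. Since $\|e_i\|_1 = 1$, the proof gives $C_1 = \sqrt{m}$, and one checks directly that $C_2 = 1$ works for this particular norm. The sketch below tests both inequalities on random vectors; the helper names `norm1`, `norm2` and `constant_c1` are hypothetical and the choice of the 1-norm is an assumption made for illustration.

```python
import math
import random

def norm1(x):
    return sum(abs(t) for t in x)

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

def constant_c1(norm, m):
    """C_1 = (sum_i ||e_i||^2)^(1/2) for the standard basis e_1, ..., e_m."""
    basis = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    return math.sqrt(sum(norm(e) ** 2 for e in basis))

m = 4
C1 = constant_c1(norm1, m)   # equals sqrt(m) for the 1-norm
random.seed(1)               # fixed seed for reproducibility
ok = True
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(m)]
    ok = ok and norm1(x) <= C1 * norm2(x) + 1e-12   # ||x|| <= C_1 ||x||_2
    ok = ok and norm2(x) <= norm1(x) + 1e-12        # ||x||_2 <= C_2 ||x||, C_2 = 1
print(C1, ok)
```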
We note that the exact same proof (replacing all R with C) applies also if the field is F = C and
hence yields that all norms on Cm are equivalent. More generally we obtain
Theorem 3.2. Let $X$ be any finite dimensional space. Then any two norms $\|\cdot\|$ and $\|\cdot\|'$ on $X$ are equivalent.
To simplify the notation we again carry out the proof just for real vector spaces and note that
the exact same proof (with all R replaced by C) applies for complex vector spaces.
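Proof. Let $m = \dim(X)$. Choosing a basis $f_1, \dots, f_m$ of $X$ we know from Prelims Linear Algebra that the map
\[ Q : \mathbb{R}^m \ni (\mu_1, \dots, \mu_m) \mapsto \sum_{i=1}^m \mu_i f_i \in X \tag{3.2} \]
is a linear bijection. Given any two norms $\|\cdot\|_X$ and $\|\cdot\|'_X$ on $X$ we obtain two norms $\|\cdot\|_{\mathbb{R}^m}$ and $\|\cdot\|'_{\mathbb{R}^m}$ on $\mathbb{R}^m$ by defining for every $x \in \mathbb{R}^m$
\[ \|x\|_{\mathbb{R}^m} := \|Q(x)\|_X \quad\text{respectively}\quad \|x\|'_{\mathbb{R}^m} := \|Q(x)\|'_X. \]
We note that these norms are chosen so that the maps $Q : (\mathbb{R}^m, \|\cdot\|_{\mathbb{R}^m}) \to (X, \|\cdot\|_X)$ and $Q : (\mathbb{R}^m, \|\cdot\|'_{\mathbb{R}^m}) \to (X, \|\cdot\|'_X)$ are isometric and hence, as they are bijections, so are their inverses (i.e. both maps are isometric isomorphisms). By Proposition 3.1, all norms on $\mathbb{R}^m$ are equivalent and hence there exist constants $C_{1,2}$ so that
\[ \|x\|_{\mathbb{R}^m} \le C_1\|x\|'_{\mathbb{R}^m} \quad\text{and}\quad \|x\|'_{\mathbb{R}^m} \le C_2\|x\|_{\mathbb{R}^m} \]
for all $x \in \mathbb{R}^m$. For every $y \in X$ we thus obtain $\|y\|_X = \|Q^{-1}y\|_{\mathbb{R}^m} \le C_1\|Q^{-1}y\|'_{\mathbb{R}^m} = C_1\|y\|'_X$ and likewise $\|y\|'_X \le C_2\|y\|_X$.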
We can easily check that this defines a norm on the finite dimensional space $X$ which, by the previous theorem, must hence be equivalent to $\|\cdot\|_X$. In particular, there exists a constant $C \in \mathbb{R}$ so that
\[
\|Tx\|_Y \le \|x\|_T \le C \|x\|_X
\]
which ensures that $T$ is bounded and hence an element of $L(X, Y)$.
As we already know from Linear Algebra that $Q$ is a bijection, this corollary immediately follows from Theorem 3.3 which implies that the linear maps $Q$, $Q^{-1}$ are continuous.
Combining the equivalence of norms with the completeness of $\mathbb{R}$ and $\mathbb{C}$ furthermore allows us to prove
Theorem 3.5. Every finite dimensional normed space (X, k·k) is complete, i.e. a Banach space.
Proof. We first recall from Prelims Analysis and Part A Metric Spaces that $\mathbb{F}^n$, $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$, equipped with the euclidean norm $\|\cdot\|_2$ is complete and remark that this can be easily proved by showing that a sequence in $\mathbb{F}^n$ converges/is a Cauchy sequence if and only if all of its components converge/are Cauchy sequences in $\mathbb{F}$. (We stress that this statement is wrong in infinite dimensional spaces such as the sequence spaces $\ell^p$.)
Let now $Q$ be as in (3.2). Given a Cauchy sequence $(x_n)$ in $X$ we conclude that since $Q^{-1}$ is a bounded linear operator from $(X, \|\cdot\|_X)$ to $(\mathbb{F}^n, \|\cdot\|_2)$ we have
\[
\|Q^{-1}(x_n) - Q^{-1}(x_m)\|_2 = \|Q^{-1}(x_n - x_m)\|_2 \le \|Q^{-1}\| \, \|x_n - x_m\|_X \xrightarrow[n,m \to \infty]{} 0,
\]
i.e. that $Q^{-1}(x_n)$ is a Cauchy sequence in $(\mathbb{F}^n, \|\cdot\|_2)$ and therefore converges to some $y$. Setting $x = Q(y)$ we hence obtain that
At a more abstract level we could also argue as follows: $C([0, 2])$ is a proper subspace of $L^1([0, 2])$; however, as we shall see later, $C([0, 2])$ is dense in $L^1([0, 2])$, so the closure of $C([0, 2])$ in $L^1([0, 2])$ is $\overline{C([0, 2])}^{L^1} = L^1([0, 2]) \neq C([0, 2])$.
We recall that the Theorem of Heine-Borel ensures that every subset of $\mathbb{R}^n$ respectively of $\mathbb{C}^n$ that is bounded and closed is automatically compact. While the reverse implication, i.e. that a compact set is always bounded and closed, is valid in every normed space (and indeed more generally in every metric space), for general normed spaces closedness and boundedness do not imply compactness. Indeed, the analogue of the Heine-Borel Theorem holds true in a normed space if and only if the space has finite dimension:
Theorem 3.7. Let $(X, \|\cdot\|)$ be a normed space. Then the following are equivalent
(1) $\dim(X) < \infty$
(2) Every subset $Y \subset X$ that is bounded and closed is compact
(3) The unit sphere $S := \{x \in X : \|x\| = 1\}$ is compact.
Remark. We recall that by definition a set $K$ is compact if every open cover of $K$ has a finite subcover. We also recall that for metric spaces (and hence in particular for normed spaces) compactness is equivalent to sequential compactness, i.e. to the property that every sequence in $K$ has a subsequence which converges in $K$. A further useful equivalent characterisation of compactness in metric spaces is that $K$ is compact if and only if $K$ is complete and totally bounded (which means that for every $\varepsilon > 0$ there exists a finite $\varepsilon$-net, i.e. a finite set of points $x_1, \dots, x_m \in K$ so that $K \subset \bigcup_{i=1}^m B_\varepsilon(x_i)$).
For the difficult implication in the proof of Theorem 3.7, i.e. (3) ⇒ (1), we shall use the following useful property of closed subspaces of normed vector spaces.
Proposition 3.8 (Riesz-Lemma). Let $(X, \|\cdot\|)$ be a normed vector space and $Y \subsetneq X$ a closed subspace. Then to any $\varepsilon > 0$ there exists an element $x \in S \subset X$ in the unit sphere so that
\[
\operatorname{dist}(x, Y) := \inf\{ \|x - y\| : y \in Y \} \ge 1 - \varepsilon .
\]
Proof of Proposition 3.8. We can assume without loss of generality that $\varepsilon \in (0, 1)$.
As $Y \neq X$ is closed we know that the set $X \setminus Y$ is open and non-empty, so we can choose some $x^* \in X \setminus Y$ and use that $d := \operatorname{dist}(x^*, Y) > 0$, as $X \setminus Y$ must contain some ball $B_\delta(x^*)$ which ensures that $d \ge \delta > 0$.
By the definition of the infimum, we can now select $y^* \in Y$ so that $d \le \|x^* - y^*\| < \frac{d}{1 - \varepsilon}$ and claim that $x := \frac{x^* - y^*}{\|x^* - y^*\|}$ has the desired properties. Clearly $\|x\| = 1$, i.e. $x \in S$ as desired, and we furthermore have that
\[
\begin{aligned}
\operatorname{dist}(x, Y) = \inf_{y \in Y} \|x - y\|
&= \inf_{y \in Y} \Big\| \frac{x^*}{\|x^* - y^*\|} - \frac{y^*}{\|x^* - y^*\|} - y \Big\|
= \inf_{\tilde y \in Y} \Big\| \frac{x^*}{\|x^* - y^*\|} - \tilde y \Big\| \\
&= \inf_{\hat y \in Y} \Big\| \frac{x^* - \hat y}{\|x^* - y^*\|} \Big\|
= \frac{\operatorname{dist}(x^*, Y)}{\|x^* - y^*\|} \ge 1 - \varepsilon
\end{aligned}
\tag{3.3}
\]
where we used twice that $Y$ is a subspace, to replace the infimum over $y \in Y$ first by an infimum over $\tilde y = \frac{y^*}{\|x^* - y^*\|} + y$ and then by an infimum over $\hat y$, which is related to $\tilde y$ by $\tilde y = \frac{\hat y}{\|x^* - y^*\|}$.
(2) ⇒ (3):
This is trivial as $S$ is clearly closed and bounded.
(3) ⇒ (1):
We argue by contradiction and assume that $S$ is compact but $\dim(X) = \infty$. We may thus choose a sequence of linearly independent elements $y_k \in X$, $k \in \mathbb{N}$. Then the subspace $Y_k := \operatorname{span}\{y_1, \dots, y_k\} \subsetneq Y_{k+1}$ is finite dimensional and so, by Corollary 3.6, a closed proper subspace of $Y_{k+1}$. Applying Proposition 3.8 with $\varepsilon = \frac12$ (viewing $Y_k$ as a subspace of $Y_{k+1}$ instead of $X$) thus gives us a sequence of elements $x_k \in Y_{k+1} \cap S$ with $\operatorname{dist}(x_k, Y_k) \ge \frac12$. In particular for every $k > l$ we have $x_l \in Y_{l+1} \subset Y_k$ and hence $\|x_k - x_l\| \ge \operatorname{dist}(x_k, Y_{l+1}) \ge \operatorname{dist}(x_k, Y_k) \ge \frac12$, so no subsequence of $(x_k)$ can be a Cauchy sequence. Having thus constructed a sequence $(x_k)$ in $S \subset X$ that does not contain a convergent subsequence we conclude that $S$ is not sequentially compact and hence not compact, leading to a contradiction.
Remark. In the special case that $X$ is an inner product space, rather than a general normed space, the proof that (3) ⇒ (1) can be simplified significantly and does not require the use of Proposition 3.8: Given any sequence $y_k$ of linearly independent elements of $X$, we can apply the Gram-Schmidt method from Prelims Linear Algebra to obtain a sequence $x_k$ of orthonormal elements of $X$, which hence have the property that $\|x_k - x_l\|^2 = \|x_k\|^2 - 2(x_k, x_l) + \|x_l\|^2 = 2$ for $k \neq l$. This ensures that no subsequence of $(x_k)$ can be Cauchy and hence that $S$ is not sequentially compact.
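The distance computation in this remark can be checked numerically on a finite dimensional truncation. The sketch below uses the standard unit vectors of $\mathbb{R}^8$ as a stand-in for an orthonormal sequence; the dimension and the helper names are assumptions chosen for illustration only.

```python
import math


def unit_vector(k, dim):
    """The k-th standard unit vector e_k, truncated to dim coordinates."""
    return [1.0 if j == k else 0.0 for j in range(dim)]


def distance_2(x, y):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


dim = 8
vectors = [unit_vector(k, dim) for k in range(dim)]
distances = [
    distance_2(vectors[k], vectors[l])
    for k in range(dim)
    for l in range(k + 1, dim)
]
# Every pair of distinct orthonormal vectors is at distance sqrt(2),
# so no subsequence of an infinite orthonormal sequence can be Cauchy.
all_sqrt2 = all(abs(d - math.sqrt(2)) < 1e-12 for d in distances)
```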
Chapter 4
Density
We recall
Definition 7. Let $(X, \|\cdot\|)$ be a normed space. Then a subset $D \subset X$ is dense if its closure $\overline{D}$ is given by the whole space $X$, i.e. $\overline{D} = X$.
Remark. A useful equivalent characterisation is that a subset $D \subset X$ is dense in $X$ if and only if for every $x \in X$ there exists a sequence of elements $y_n \in D$ so that $y_n \to x$ as $n \to \infty$, i.e. $\|x - y_n\| \to 0$ as $n \to \infty$, or equivalently if and only if for every $x \in X$ and every $\varepsilon > 0$ there exists $y \in D$ so that $\|x - y\| < \varepsilon$.
An important feature of dense subsets $D$ of normed spaces is that a bounded linear operator on $X$ is fully determined by its values on $D$. This is particularly useful if we are working on a space that contains a subspace of "well-understood" objects, e.g. the space of polynomials in the space of real valued continuous functions or the space of real valued smooth functions on $[0, 1]$ in $(L^2([0, 1]), \|\cdot\|_{L^2})$.
Theorem 4.1. Let $(X, \|\cdot\|_X)$ be a normed space, let $Y$ be a dense subspace of $X$ (which we equip with the norm of $X$) and let $(Z, \|\cdot\|_Z)$ be a Banach space. Then any $T \in L(Y, Z)$ has a unique extension $\tilde T \in L(X, Z)$, i.e. there exists a unique bounded linear operator $\tilde T : X \to Z$ so that $\tilde T y = T y$ for every $y \in Y$ and we furthermore have that
\[
\|\tilde T\|_{L(X, Z)} = \|T\|_{L(Y, Z)} .
\]
We first prove the following simpler result which can be useful in applications.
Lemma 4.2. Let $(X, \|\cdot\|_X)$ be a normed space, $D \subset X$ a dense subset and let $(Z, \|\cdot\|_Z)$ be a normed space. Then for operators $T, S \in L(X, Z)$ we have
\[
T|_D = S|_D \iff T = S .
\]
CHAPTER 4. DENSITY 28
Proof of Lemma 4.2. We can prove the non-trivial direction "⇒" as follows: For any $x \in X$ we can choose a sequence $d_n \to x$ with $d_n \in D$ to conclude that since both $T$ and $S$ are continuous
\[
T x = \lim_{n \to \infty} T d_n = \lim_{n \to \infty} S d_n = S x .
\]
Proof of Theorem 4.1. Let x ∈ X be any element. Then as Y is dense there exists a sequence
yn of elements of Y so that yn → x.
Claim: $T y_n$ converges and the limit $z = \lim_{n \to \infty} T y_n$ depends only on $x$ and not on the chosen sequence $y_n$.
Once proven, this claim allows us to define $\tilde T x := \lim_{n \to \infty} T y_n$ to obtain a well defined map $\tilde T : X \to Z$. This map will be linear as $T$ is linear, as we can interchange limits and addition/scalar multiplication and know that the obtained limit is independent of the chosen approximating sequence. Furthermore
\[
\|\tilde T x\|_Z = \lim_{n \to \infty} \|T y_n\|_Z \le \|T\| \lim_{n \to \infty} \|y_n\|_X = \|T\| \, \|x\|_X
\]
so that $\tilde T \in L(X, Z)$ with $\|T\| \ge \|\tilde T\|$. The reverse inequality follows from the definition of the operator norm, as
\[
\|\tilde T\|_{L(X, Z)} = \sup_{x \in X, \|x\|_X = 1} \|\tilde T x\|_Z \ge \sup_{y \in Y, \|y\|_X = 1} \|\tilde T y\|_Z = \|T\|_{L(Y, Z)} .
\]
Hence, once the claim is proven, we obtain the desired extension which, by Lemma 4.2, is
furthermore unique.
It thus remains to prove the claim. To this end we remark that if $y_n \to x$ then $(y_n)$ is a Cauchy sequence and hence also
\[
\|T y_n - T y_m\|_Z \le \|T\| \, \|y_n - y_m\| \xrightarrow[n,m \to \infty]{} 0 .
\]
So (T yn ) is a Cauchy sequence in the Banach space Z and must thus converge to some limit
z. To prove that the limit does not depend on the choice of the sequence of elements of Y that
approximate x, let ỹn be any alternative sequence in Y that converges to x. Then the argument
above implies that T ỹn converges to some limit z̃ and one way to see that z = z̃ is to consider
a third sequence ŷn → x chosen as ŷn = yn for n odd and ŷn = ỹn for n even. Then also T ŷn
must converge to a limit ẑ which must agree with the limit of both of the subsequences T ŷ2n
and T ŷ2n+1 , i.e. we must have that z = ẑ = z̃ which establishes the claim and thus completes
the proof of the theorem.
The goal of this section is to identify suitable dense subspaces of the space C(K) = C(K, R) of
real-valued continuous functions on a compact subset K ⊂ Rn . As always we equip C(K) with
the sup-norm and recall that since continuous functions on compact sets are bounded this is well
defined.
We begin by exploring what properties are necessary for a subspace L ⊂ C(K) to be dense. To
this end we first note that given any two points p, q ∈ K with p 6= q we can choose a continuous
We now observe that since $C(K)$ contains a function $g$ with $g(p) \neq g(q)$, where $p \neq q$ are any given points, also $L$ must have this property: Indeed, if $L \subset C(K)$ is dense, then there must be a sequence $f_n \in L$ so that $\|f_n - g\|_{\sup} \to 0$ and hence in particular
\[
\begin{aligned}
|f_n(p) - f_n(q)| &\ge |g(p) - g(q)| - |f_n(p) - g(p)| - |f_n(q) - g(q)| \\
&\ge |g(p) - g(q)| - 2 \|f_n - g\|_{\sup} \to |g(p) - g(q)| > 0 .
\end{aligned}
\tag{4.1}
\]
A necessary condition for a subspace $L \subset C(K)$ to be dense is hence that it separates points.
Definition 8. We say that a subset $D \subset C(K)$ separates points if for all $p, q \in K$ with $p \neq q$ there exists a function $g \in D$ so that $g(p) \neq g(q)$.
Remark. It can be useful to note that for a subspace L ⊂ C(K) that contains the constant
functions, the following two properties are equivalent
The direction "⇐" is trivial, while the implication "⇒" can be seen as follows: Given $p \neq q$ we let $\tilde g$ be any function in $L$ for which $\tilde g(p) \neq \tilde g(q)$. Then defining $g(x) := \frac{\tilde g(x) - \tilde g(p)}{\tilde g(q) - \tilde g(p)}$, which is again an element of $L$ as $L$ is a subspace and contains the constant functions, we obtain the desired function with $g(p) = 0$ and $g(q) = 1$.
For our first density result for C(K) we furthermore want our subspace to be closed under the
operation of taking the (pointwise) maximum or minimum of two elements of L, i.e. to be a so
called linear sublattice:
We note that if $L$ is a linear sublattice then also the minimum and maximum of any finite number of elements $f_1, \dots, f_m$ is contained in $L$ since we can iteratively write $\max(f_1, \dots, f_m) = \max(f_1, \max(f_2, \dots, f_m)) = \dots$ and furthermore remark that $L$ is a sublattice if and only if
\[
f \in L \Rightarrow |f| \in L
\]
as one can easily check using e.g. that $|f| = \max(f, -f)$ and $\max(f, g) = \frac12 (f + g) + \frac12 |f - g|$. We stress again that here we consider real valued functions $f$ (and note that this definition would make no sense for complex valued functions $f$).
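The two identities behind this remark are easy to test pointwise. The following sketch (with arbitrarily chosen sample functions and hypothetical helper names) builds the maximum and the minimum of two functions using only linear operations and the absolute value, which is exactly what is needed to see that closure under $f \mapsto |f|$ gives closure under $\max$ and $\min$.

```python
def pointwise_max(f, g):
    """max(f, g) expressed through linear operations and the absolute value."""
    return lambda x: 0.5 * (f(x) + g(x)) + 0.5 * abs(f(x) - g(x))


def pointwise_min(f, g):
    """min(f, g) = (f + g)/2 - |f - g|/2, the companion identity."""
    return lambda x: 0.5 * (f(x) + g(x)) - 0.5 * abs(f(x) - g(x))


f = lambda x: x * x          # sample functions on [0, 1], chosen arbitrarily
g = lambda x: 1.0 - x
grid = [i / 100 for i in range(101)]
h_max = pointwise_max(f, g)
h_min = pointwise_min(f, g)
max_ok = all(abs(h_max(x) - max(f(x), g(x))) < 1e-12 for x in grid)
min_ok = all(abs(h_min(x) - min(f(x), g(x))) < 1e-12 for x in grid)
```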
We now prove our first main result of this section, which gives a density result for general
sublattices:
Theorem 4.3 (Stone-Weierstrass Theorem, lattice form). Let $K \subset \mathbb{R}^n$ be a compact set and let $C(K)$ be the space of continuous real-valued functions on $K$ which is equipped with the sup-norm.
Let L be a subspace of C(K) which is such that
Proof of Lemma 4.4. If $p = q$ then we simply choose $f_{p,q}$ to be the constant function $f_{p,p} \equiv f(p)$, which by assumption is an element of $L$. So let $p \neq q$. As $L$ separates points and contains the constant functions, we can choose $g \in L$ as in the above remark so that $g(p) = 0$ and $g(q) = 1$. Then $f_{p,q}$ defined by $f_{p,q}(x) := f(p) + (f(q) - f(p)) \cdot g(x)$ is a linear combination of elements of $L$ and hence also an element of $L$ and has the desired properties that $f_{p,q}(p) = f(p) + 0 = f(p)$ and $f_{p,q}(q) = f(p) + (f(q) - f(p)) \cdot 1 = f(q)$.
The final claim of the lemma now simply follows from the fact that $f - f_{p,q}$ is continuous and hence $U^\varepsilon_{p,q} := (f - f_{p,q})^{-1}(-\varepsilon, \varepsilon)$ is an open subset of $K$ that contains both $p$ and $q$.
Based on this lemma we can now prove the lattice version of the Theorem of Stone-Weierstrass as follows:
Proof. Given $f \in C(K)$ and $\varepsilon > 0$ we let $f_{p,q}$ and $U^\varepsilon_{p,q}$ be the functions and sets obtained in the above lemma. Let $p \in K$ be any given point, which we consider to be fixed for the first step of the proof. Then $\{U^\varepsilon_{p,q}\}_{q \in K}$ is an open cover of $K$ so we can find finitely many points $q_1, \dots, q_m$ (allowed to depend on the fixed point $p \in K$) so that $K = \bigcup_{i=1}^m U^\varepsilon_{p,q_i}$.
We recall that on the sets $U^\varepsilon_{p,q_i}$ we have $|f - f_{p,q_i}| < \varepsilon$ and hence in particular $f_{p,q_i} < f + \varepsilon$. Defining $g_p := \min(f_{p,q_1}, \dots, f_{p,q_m})$ we hence obtain a function $g_p \in L$ which satisfies $g_p < f + \varepsilon$ on all of $K$ and is furthermore so that $g_p(p) = f(p)$ as all functions $f_{p,q}$ have the property that $f_{p,q}(p) = f(p)$.
To turn these functions $g_p$, for which we have a good upper bound on $g_p - f$, into a function $g$ for which we have both a good upper and a good lower bound on $g - f$, we now want to take a suitable maximum of functions $g_{p_i}$. To this end we note that since each $g_p$ is continuous and $g_p(p) = f(p)$ we can choose an open neighbourhood $V_p$ of $p$ in $K$ (e.g. by setting $V_p := (f - g_p)^{-1}(-\varepsilon, \varepsilon)$) so that $g_p > f - \varepsilon$ on $V_p$. As above we can now use the compactness of $K$ to conclude that the open cover $K = \bigcup_{p \in K} V_p$ has a finite subcover $K = \bigcup_{i=1}^k V_{p_i}$ and finally set
\[
g := \max\{g_{p_1}, \dots, g_{p_k}\},
\]
which is in $L$ as $L$ is a sublattice. As $g$ is the maximum of functions that satisfy $g_{p_i} < f + \varepsilon$ on all of $K$ we have of course still $g < f + \varepsilon$ on $K$, but now know additionally that for every $x \in K$ there is some $i$ so that $x \in V_{p_i}$ and hence $g(x) \ge g_{p_i}(x) > f(x) - \varepsilon$. Combined we thus obtain that the element $g \in L$ that we constructed satisfies $\|f - g\|_{\sup} < \varepsilon$. As $\varepsilon > 0$ and $f \in C(K)$ were arbitrary this completes the proof that $L$ is dense in $C(K)$.
This theorem was first proven by Weierstrass in the case of K a compact interval, while the proof
of the more general form of the theorem given above is due to Stone. Hence one generally talks of
the Theorem of Weierstrass if K is a compact interval and of the Theorem of Stone-Weierstrass
otherwise.
We note that the space of polynomials trivially contains the constant functions and also separates points (for this already the linear functions would be sufficient). It has furthermore the extra structure of being a subalgebra of the algebra of continuous functions:
Definition 10. A subspace $A \subset C(K)$ is a subalgebra if $A$ contains the constant functions and
\[
f, g \in A \Rightarrow f g \in A .
\]
The Theorem of Weierstrass on the density of polynomials in C(K) is hence a special case of the
following more general result:
Theorem 4.6 (Stone-Weierstrass Theorem, subalgebra form). Let A ⊂ C(K) be a subalgebra
of C(K) which separates points. Then A is dense in C(K).
We derive this theorem from the lattice form of the Stone-Weierstrass theorem by using
Proposition 4.7. If A ⊂ C(K) is a subalgebra of C(K) that is closed then A is a linear
sublattice.
Based on this proposition which is proven below and the lattice form of the Stone-Weierstrass
theorem we can now immediately prove the subalgebra form of the theorem:
Proof of Theorem 4.6. Given a subalgebra $A$ as in the theorem we can easily check that $\bar A$ is also a subalgebra and hence, by Proposition 4.7, $\bar A$ is a linear sublattice. As $A$ contains the constant functions and separates points the same holds true also for $\bar A$, so we may apply the sublattice version of the Theorem of Stone-Weierstrass to conclude that $\bar A$ is dense in $C(K)$. As $\bar A$ is closed this means that $\bar A = C(K)$, i.e. that $A$ is dense in $C(K)$.
Proof of Proposition 4.7. We need to prove that if $f \in A$ then also $|f| \in A$. As $A$ is closed, this follows provided we can construct a sequence $f_n \in A$ which converges $f_n \to |f|$ in $C(K)$, i.e. uniformly. We note that it suffices to prove this claim for elements of $A$ with $\|f\| \le 1$ as we may then obtain an approximating sequence for $f \in A$ with $\|f\| > 1$ by setting $\tilde f = \frac{f}{\|f\|}$, choosing an approximating sequence $\tilde f_n \in A$ for $|\tilde f|$ and then deducing that $f_n := \|f\| \tilde f_n \in A$ converges to $|f|$.
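So let $f \in A$ with $\|f\| \le 1$ and define a sequence recursively by $f_0 := 0$ and $f_{k+1} := f_k + \frac12 (f^2 - f_k^2)$. As the $f_k$ are obtained by taking linear combinations and products of elements of the subalgebra $A$ we know that $f_k \in A$. We now show that for every $k \ge 0$ and every $x \in K$
\[
(\ast) \qquad 0 \le f_k(x) \le f_{k+1}(x) \le |f(x)| .
\]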
This clearly holds true if $k = 0$ since $f_0 = 0$ and $f_1 = \frac12 f^2 \ge f_0$. Furthermore, if $(\ast)$ holds true for $k$ then also
\[
0 \le f_k \le f_{k+1} = f_k + \frac12 (|f| - f_k)(|f| + f_k) \le f_k + |f| - f_k = |f|,
\]
and
\[
f_{k+2} - f_{k+1} = f_{k+1} - f_k - \frac12 (f_{k+1}^2 - f_k^2) = (f_{k+1} - f_k)\Big(1 - \frac12 (f_{k+1} + f_k)\Big) \ge 0
\]
so, by induction, $(\ast)$ holds true for every $k$. We hence conclude that for every $x \in K$ the sequence $f_k(x)$ is bounded above and monotone increasing and thus converges to some limit $g(x) \ge 0$ which, by (AOL), must satisfy $g(x) = g(x) + \frac12 (f^2(x) - g^2(x))$, i.e. must be given by $g(x) = |f(x)|$. This establishes that $f_n \to |f|$ pointwise. Finally, to prove that $f_n$ converges indeed in the sense of $C(K)$, i.e. uniformly, we can apply the following lemma, which is a generalisation of Dini's Theorem encountered in Prelims, to the decreasing sequence $|f| - f_n$.
Lemma 4.8. Let K be a compact subset of some metric space (M, d) and let gn : K → R be a
sequence of continuous functions which is decreasing, i.e. so that gn (x) ≥ gn+1 (x) for every x,
and which converges pointwise to zero. Then gn → 0 uniformly.
Proof. Let $\varepsilon > 0$ be any given number. We set $G_n := \{x \in K : g_n(x) < \varepsilon\}$ and note that since $g_n$ is monotone the sets $G_n$ are increasing, $G_1 \subset G_2 \subset \dots$. As the $g_n$ are continuous these sets are open in $K$ and as $g_n \to 0$ pointwise, we furthermore know that $\bigcup_{n \ge 1} G_n = K$. So as $K$ is compact, the open cover $\{G_j\}$ has a finite subcover and hence there exists a number $N$ so that $G_N = K$. Hence for all $x \in K$ and $n \ge N$ we have $0 \le g_n(x) \le g_N(x) < \varepsilon$ which proves that $g_n$ converges uniformly.
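The approximation of $|f|$ used in the proof of Proposition 4.7 can be observed numerically. The sketch below assumes that the iteration is $f_0 = 0$, $f_{k+1} = f_k + \frac12 (f^2 - f_k^2)$ (the recursion whose fixed point equation appears in the proof), takes $f(t) = t$ on $[-1, 1]$ as an arbitrary example with $\|f\| \le 1$, and replaces the sup-norm by a maximum over a finite grid. All names are illustrative.

```python
def abs_approximations(f_values, steps):
    """Iterate f_{k+1} = f_k + (f^2 - f_k^2)/2 pointwise, starting from f_0 = 0.

    f_values holds the values of f on a grid, assumed to satisfy |f| <= 1.
    Returns the list of iterates [f_0, f_1, ..., f_steps].
    """
    current = [0.0 for _ in f_values]
    iterates = [current]
    for _ in range(steps):
        current = [fk + 0.5 * (v * v - fk * fk) for fk, v in zip(current, f_values)]
        iterates.append(current)
    return iterates


def sup_error(f_values, fk_values):
    """Grid version of the sup-norm of |f| - f_k."""
    return max(abs(abs(v) - fk) for v, fk in zip(f_values, fk_values))


grid = [-1.0 + i / 100 for i in range(201)]   # sample points in [-1, 1]
f_values = grid                                # f(t) = t, so |f| <= 1
iterates = abs_approximations(f_values, 200)
errors = [sup_error(f_values, fk) for fk in iterates]
```

In this experiment the grid errors decrease monotonically, in line with Lemma 4.8, although the convergence is slow near the zeros of $f$.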
Remark (Non-examinable). There are various direct ways of proving Weierstrass's theorem on the density of polynomials e.g. in $C([0, 1])$. One can prove e.g. that given any $f \in C([0, 1])$ the so called Bernstein polynomials
\[
p_n(t) := \sum_{k=0}^n \binom{n}{k} t^k (1 - t)^{n-k} f\Big(\frac{k}{n}\Big)
\]
converge to $f$ uniformly, or follow the original proof of Weierstrass using the Weierstrass transform.
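As an illustration of this remark, the following sketch evaluates Bernstein polynomials and measures the error on a grid. The test function $f(t) = |t - \frac12|$, the grid size and the function names are assumptions made for this example; the maximum over a grid only approximates the sup-norm.

```python
import math


def bernstein(f, n, t):
    """Value at t of the n-th Bernstein polynomial of f on [0, 1]."""
    return sum(
        math.comb(n, k) * t ** k * (1.0 - t) ** (n - k) * f(k / n)
        for k in range(n + 1)
    )


def sup_error(f, n, grid_size=200):
    """Maximum of |f - p_n| over an equally spaced grid in [0, 1]."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(abs(f(t) - bernstein(f, n, t)) for t in grid)


f = lambda t: abs(t - 0.5)          # continuous but not differentiable at 1/2
errors = {n: sup_error(f, n) for n in (25, 100, 400)}
```

The errors shrink as $n$ grows, but only slowly, which is typical for Bernstein polynomials of functions that are merely Lipschitz.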
Example (An application of Weierstrass's theorem). We claim that the only continuous real valued function $f \in C[0, 1]$ for which
\[
(\star) \qquad \int_0^1 f(t) t^n \, dt = 0 \quad \text{for every } n \in \mathbb{N}
\]
is the zero function. To see this, we let $X = C[0, 1]$ (as always equipped with the sup norm) and we note that any function $f \in C[0, 1]$ induces a bounded linear functional $F \in X^* = L(X, \mathbb{R})$ defined by $F(x) = \int_0^1 f(t) x(t) \, dt$, where we note that $F$ is bounded since $|F x| \le \|f\|_{\sup} \|x\|_{\sup}$, so $\|F\|_{X^*} \le \|f\|_{\sup}$. If $f$ satisfies $(\star)$ then, by linearity, $F(p) = 0$ for every polynomial $p$. Since the polynomials are dense in $X$ we can thus apply Theorem 4.1 to obtain that $F = 0$, in particular $F(f) = \int_0^1 f^2(t) \, dt = 0$. But as $f^2 \ge 0$ this implies that $f^2 = 0$ a.e. and so, as $f$ is continuous, indeed $f = 0$.
In many applications where one works with Lp spaces, the following result is very useful
Theorem 4.9. For any 1 ≤ p < ∞ and any compact set K ⊂ Rn the space C ∞ (K) of smooth
functions is dense in Lp (K).
WARNING. This result is wrong for p = ∞ as you can easily see when trying to approximate
step functions by continuous functions.
The proof of this result is non-examinable (though the result and its applications are examinable),
and we only sketch the proof to introduce the important concept of mollifying an integrable
function to obtain a smooth function which is a good approximation to the given (in general not
even continuous) function. This concept is widely used in the theory of, and in applications to, PDEs
and more on this topic can be found in particular in the courses B4.3 Distribution Theory and
C4.1 Functional analytic methods for PDEs.
where $c > 0$ is chosen so that $\int_{\mathbb{R}^n} \phi(x) \, dx = 1$ and set $\phi_\varepsilon(x) := \frac{1}{\varepsilon^n} \phi(\frac{x}{\varepsilon})$. These smooth functions $\phi_\varepsilon$ (which are often called 'mollification kernels' or a family of 'standard mollifiers') have $\int_{\mathbb{R}^n} \phi_\varepsilon = 1$ and are zero outside of $B_\varepsilon(0)$. One can get a sequence $f_\varepsilon$ of smooth functions that approximates a given $f \in L^p(K)$ as follows: We extend $f$ by zero outside of $K$ to get a function that is defined on all of $\mathbb{R}^n$ and then set
\[
f_\varepsilon := \phi_\varepsilon \ast f, \quad \text{i.e. define } f_\varepsilon(x) := \int_{\mathbb{R}^n} \phi_\varepsilon(x - y) f(y) \, dy .
\]
Then one can easily check that $f_\varepsilon \in C^\infty(\mathbb{R}^n)$ with derivatives $D^\alpha f_\varepsilon = (D^\alpha \phi_\varepsilon) \ast f$ (follows from the differentiation theorem from Part A Integration) and one can indeed prove that $f_\varepsilon \to f$ in $L^p$ (though this proof requires more care and uses properties of $L^p$ functions that we do not require elsewhere in the course).
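A one-dimensional discretised version of this construction can be sketched as follows. As an assumption for the example, the kernel is taken to be the commonly used bump $\exp(-1/(1 - x^2))$ on $(-1, 1)$, normalised numerically rather than by an explicit constant $c$; integrals are replaced by Riemann sums on a uniform grid and $f$ is the indicator function of $[0, 1]$. All names are hypothetical.

```python
import math


def bump(x):
    """Unnormalised bump: exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def mollify(f_values, h, eps):
    """Discrete version of f_eps = phi_eps * f on a uniform grid of spacing h.

    The kernel weights are normalised so that they sum to one, mirroring the
    condition that phi_eps integrates to 1.
    """
    radius = round(eps / h)
    offsets = range(-radius, radius + 1)
    weights = [bump(j * h / eps) for j in offsets]
    total = sum(weights)
    weights = [w / total for w in weights]
    n = len(f_values)
    result = []
    for i in range(n):
        acc = 0.0
        for j, w in zip(offsets, weights):
            if 0 <= i - j < n:          # f is extended by zero outside the grid
                acc += w * f_values[i - j]
        result.append(acc)
    return result


def l1_distance(u, v, h):
    """Riemann-sum approximation of the L^1 distance."""
    return h * sum(abs(a - b) for a, b in zip(u, v))


h = 0.005
grid = [-1.0 + i * h for i in range(601)]                    # covers [-1, 2]
f_values = [1.0 if 0.0 <= x <= 1.0 else 0.0 for x in grid]   # indicator of [0, 1]
errors = {eps: l1_distance(f_values, mollify(f_values, h, eps), h)
          for eps in (0.2, 0.1, 0.05)}
```

The $L^1$ errors decrease roughly proportionally to $\varepsilon$ here, since the mollified function differs from the indicator only within distance $\varepsilon$ of the two jumps.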
Chapter 5
Separability
Many but not all spaces we have encountered so far have the following useful property
Definition 11. A normed space (X, k·k) is called separable if there exists a countable set D ⊂ X
which is dense. A space which is not separable is called inseparable.
Lemma 5.1. (i) Let $X$ be a vector space and let $\|\cdot\|$ and $\|\cdot\|'$ be two norms on $X$ that are equivalent. Then $(X, \|\cdot\|)$ and $(X, \|\cdot\|')$ are either both separable or both inseparable.
(ii) Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be two normed spaces which are isometrically isomorphic, i.e. so that there exists a linear bijection $i : X \to Y$ so that $\|i(x)\|_Y = \|x\|_X$ for all $x \in X$. Then $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ are either both separable or both inseparable.
Proof. As equivalent norms lead to the same notion of convergent sequences we obtain that a set $D \subset X$ is dense in $(X, \|\cdot\|)$ if and only if it is dense in $(X, \|\cdot\|')$. Hence (i) follows. Similarly, to obtain (ii) we note that if $D \subset X$ is dense and if $i : X \to Y$ is an isometric isomorphism then $\tilde D := i(D) \subset Y$ is dense as for any $y \in Y$ we can choose $d_n \in D$ so that $d_n \to x := i^{-1}(y)$ and thus get a sequence $\tilde d_n = i(d_n) \in \tilde D$ that converges to $y$ since
For simplicity of notation we will carry out this proof just for real normed spaces and remark
that the exact same proof, with R replaced by C and Q replaced by Q + iQ applies in the complex
case.
Proof. We first show that $\mathbb{R}^n$ equipped with the 1-norm is separable. Indeed, given any $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ and any $\varepsilon > 0$ we can use that $\mathbb{Q}$ is dense in $\mathbb{R}$ to choose $q_i \in \mathbb{Q}$ so that $|x_i - q_i| < \frac{\varepsilon}{n}$ and hence $\|x - (q_1, \dots, q_n)\|_1 < \varepsilon$. As $\mathbb{Q}^n$ is countable we thus get that $(\mathbb{R}^n, \|\cdot\|_1)$ is separable. As every other norm $\|\cdot\|$ on $\mathbb{R}^n$ is equivalent to $\|\cdot\|_1$ we thus get that also $(\mathbb{R}^n, \|\cdot\|)$ is separable thanks to Lemma 5.1.
Given any other real finite dimensional vector space $(X, \|\cdot\|_X)$ we let $Q : \mathbb{R}^n \to X$ be the isomorphism introduced in (3.2) and note that $Q$ is an isometric isomorphism if we equip $\mathbb{R}^n$ with the norm $\|x\| := \|Qx\|_X$. The separability of $(X, \|\cdot\|_X)$ thus follows from the separability of $(\mathbb{R}^n, \|\cdot\|)$ and Lemma 5.1.
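The first step of this proof, approximating a vector in $\mathbb{R}^n$ by a rational vector in the 1-norm, can be made explicit. The sketch below uses a common denominator $D > n/\varepsilon$ and rounds each coordinate; this particular way of choosing the $q_i$ is an assumption of the example (the proof only needs that some such $q_i$ exist), and floats are treated as the exact rational numbers they store.

```python
from fractions import Fraction
import math


def rational_approximation(x, eps):
    """Return rationals q_i with |x_i - q_i| < eps / n for every coordinate.

    Uses the common denominator D > n / eps and rounds each x_i * D to the
    nearest integer, so that |x_i - q_i| is about 1 / (2 D) < eps / n.
    """
    n = len(x)
    denominator = math.ceil(n / eps) + 1
    return [Fraction(round(t * denominator), denominator) for t in x]


def distance_1(x, q):
    """Exact 1-norm distance, treating each float x_i as the rational it stores."""
    return sum(abs(Fraction(t) - r) for t, r in zip(x, q))


x = [math.pi, -math.sqrt(2.0), 0.1, 1e-3]
eps = Fraction(1, 1000)
q = rational_approximation(x, eps)
```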
While many of the spaces we have seen so far are separable, not all of them are and the most
prominent examples of non-separable spaces are
Proposition 5.3 ($\ell^\infty$ and $L^\infty$ are inseparable). The sequence space $(\ell^\infty(\mathbb{F}), \|\cdot\|_\infty)$ and the function spaces $L^\infty(\Omega)$, $\Omega \subset \mathbb{R}^n$ any non-empty open set, are inseparable.
We provide the proof of this result for the sequence space $\ell^\infty$ and note that a very similar proof, using characteristic functions of sets, shows that also $L^\infty$ is inseparable.
Proof. We recall that the set $A := \{a = (a_1, a_2, \dots) : a_i \in \{0, 1\}\}$ is uncountable and note that for this subset of $\ell^\infty$ the distance of any two elements $a \neq \tilde a$ is $\|a - \tilde a\|_\infty = 1$.
Let now $D$ be any dense subset of $\ell^\infty$. Then given any $a \in A$ there must be an element $d_a \in D$ so that $\|d_a - a\|_\infty < \frac12$ and we define a function $f : A \to D$ by assigning to each $a$ such an element $d_a$. We note that $d_a = d_{\tilde a}$ implies that
\[
\|a - \tilde a\|_\infty = \|a - d_a + d_{\tilde a} - \tilde a\|_\infty \overset{\Delta}{\le} \|a - d_a\|_\infty + \|\tilde a - d_{\tilde a}\|_\infty < 1
\]
and hence that $a = \tilde a$, so this map is injective. Since $A$ is uncountable, we thus obtain that any dense subset of $\ell^\infty$ is uncountable. Hence $\ell^\infty$ is inseparable.
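The key fact used in this proof, that distinct 0/1 sequences are at sup-distance exactly 1, can be checked on finite truncations. The sketch below enumerates all 0/1 sequences of an arbitrarily chosen length $n = 6$; their number $2^n$ doubles with every additional coordinate, which is the finite shadow of the uncountability of $A$.

```python
from itertools import product


def sup_distance(a, b):
    """Sup-norm distance of two finite sequences of equal length."""
    return max(abs(s - t) for s, t in zip(a, b))


n = 6
binary_sequences = list(product((0, 1), repeat=n))   # 2**n elements
pairwise = [
    sup_distance(a, b)
    for i, a in enumerate(binary_sequences)
    for b in binary_sequences[i + 1:]
]
# Any two distinct 0/1 sequences differ by exactly 1 in some coordinate.
all_distance_one = all(d == 1 for d in pairwise)
```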
This lemma follows from a simple $\varepsilon/2$ argument: Given $x \in X$ and $\varepsilon > 0$ we use that $Y$ is dense in $X$ to choose $y \in Y$ so that $\|x - y\|_X < \frac{\varepsilon}{2}$ and then use that $D \subset Y$ is dense to choose $d \in D$ with $\|d - y\|_X < \frac{\varepsilon}{2}$.
Proposition 5.5. Let $(X, \|\cdot\|)$ be a normed space and suppose that there exists a countable set $S$ so that $\operatorname{span}(\bar S)$ is dense in $X$. Then $X$ is separable.
Here we recall that the span of a subset $S \subset X$ is the set of all finite linear combinations, i.e.
\[
\operatorname{span}(S) := \Big\{ \sum_{j=1}^N \lambda_j s_j : \lambda_j \in \mathbb{F}, \ s_j \in S, \ N \in \mathbb{N} \Big\}
\]
and note that since $\operatorname{span}(S) \subset \operatorname{span}(\bar S)$ the above proposition implies in particular that if there is a countable set $S$ whose span is dense in $X$ then $X$ is separable.
As before, we carry out the proof for real normed spaces and remark that the exact same proof, with $\mathbb{R}$ and $\mathbb{Q}$ replaced by $\mathbb{C}$ and $\mathbb{Q} + i\mathbb{Q}$, applies in the complex case.
\[
Y := \Big\{ \sum_{i=1}^N a_i s_i : a_i \in \mathbb{Q}, \ s_i \in S, \ N \in \mathbb{N} \Big\}
\]
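Indeed given any $x \in X$ and any $\varepsilon > 0$ we can first use that $\operatorname{span}(\bar S)$ is dense in $X$ to determine $\bar s_j \in \bar S$ and $a_j \in \mathbb{R}$, $j = 1, \dots, N$, so that $\|x - \sum_{j=1}^N a_j \bar s_j\| < \varepsilon/3$. In a second step we can now use that every element in the closure $\bar S$ of a set can be approximated by elements of the set $S$ to determine $s_j \in S$ so that for every $j$ we have $|a_j| \, \|s_j - \bar s_j\| < \frac{\varepsilon}{3N}$. Finally, we use that $\mathbb{Q}$ is dense in $\mathbb{R}$ to determine rational numbers $b_j$ so that for every $j$ also $|a_j - b_j| \cdot \|s_j\| < \frac{\varepsilon}{3N}$. All in all we hence obtain an element $y = \sum_{j=1}^N b_j s_j$ of $Y$ for which
\[
\begin{aligned}
\|x - y\| &\overset{\Delta}{\le} \Big\|x - \sum_{j=1}^N a_j \bar s_j\Big\| + \Big\| \sum_{j=1}^N (a_j \bar s_j - b_j s_j) \Big\| < \varepsilon/3 + \sum_{j=1}^N \|a_j \bar s_j - a_j s_j + a_j s_j - b_j s_j\| \\
&\overset{\Delta}{\le} \varepsilon/3 + \sum_{j=1}^N |a_j| \, \|s_j - \bar s_j\| + \sum_{j=1}^N |a_j - b_j| \, \|s_j\| < \varepsilon .
\end{aligned}
\tag{5.1}
\]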
To conclude the proof of the proposition it is thus enough to show that $Y$ is countable. Writing $S = \{s_1, s_2, \dots\}$ we obtain a surjective map $s : A \to Y$ from the set
\[
A := \bigcup_{N \in \mathbb{N}} \{(a_1, a_2, \dots) : a_k \in \mathbb{Q} \text{ and } a_k = 0 \text{ for every } k \ge N + 1\}
\]
of finite rational sequences to $Y$ by defining $s(a) := \sum_j a_j s_j$ for $a \in A$ (we note that this sum is well defined as only finitely many terms are non-zero). As $A$ is the countable union of sets that are bijective to $\mathbb{Q}^N$, we know that $A$ is countable and hence that also $Y$ is countable.
We are now in the position to prove that the following important Banach spaces are separable.
Proposition 5.6 (Separability of $\ell^p$ and $L^p$ for $1 \le p < \infty$ and of $C(K)$).
We remark that more generally Lp (Ω) is separable for arbitrary (measurable) domains Ω ⊂ Rn
and 1 ≤ p < ∞.
know that its span, i.e. the space of polynomials, is dense in C(K). By Proposition 5.5 we thus
get that C(K) is separable.
Proof of (ii): We let $Y := \operatorname{span}(S)$ where the countable set $S = \{e^{(k)}, k \in \mathbb{N}\}$ consists of all sequences $e^{(k)}$ for which $e^{(k)}_j = \delta_{jk}$.
Given any element $x = (x_1, x_2, \dots) \in \ell^p$ we can now use that, since $\sum_{j=1}^\infty |x_j|^p$ converges, the cut-off sequences $x^{(k)} := (x_1, \dots, x_k, 0, 0, \dots)$ approximate $x$ in the sense of $\ell^p$, namely $\|x - x^{(k)}\|_{\ell^p} = \big( \sum_{j \ge k+1} |x_j|^p \big)^{1/p} \to 0$ as $k \to \infty$. We thus conclude that $Y$ is dense in $\ell^p$ and thus obtain from Proposition 5.5 that $\ell^p$ is separable.
that if a sequence $f_n$ converges to some element $f \in C(K)$ in the usual sense of the sup-norm, then also
\[
\|f_n - f\|_{L^p} \le \mathcal{L}^n(K)^{1/p} \, \|f_n - f\|_{\sup} \to 0 .
\]
Hence any set $D \subset C(K)$ that is dense with respect to the sup-norm, will also be dense with respect to the $L^p$ norm. In particular the set of polynomials is dense also in $(C(K), \|\cdot\|_{L^p})$. Combined with the density of $C(K) \subset L^p(K)$ and Lemma 5.4 we thus conclude that the space of polynomials is dense in $L^p(K)$. The claim that $L^p$ is separable hence again follows from Proposition 5.5 and the fact that the space of polynomials is spanned by the countable set $\{x^\alpha, \alpha \in \mathbb{N}_0^n\}$ of monomials.
Proof of (iii), Variant 2 (using density of step functions) for $K = [a, b]$:
We use without proof the fact that the space of step functions, that is finite linear combinations of characteristic functions of intervals, is dense in $L^p$. We then note that given any interval $[c, d] \subset [a, b]$ with real endpoints, we can choose $c_n, d_n \in \mathbb{Q}$ so that $c_n \to c$ and $d_n \to d$ and that this guarantees that $\chi_{[c_n, d_n]} \to \chi_{[c, d]}$ in $L^p$ as $\|\chi_{[c, d]} - \chi_{[c_n, d_n]}\|_{L^p} \le (|c - c_n| + |d - d_n|)^{1/p} \to 0$. Hence also the span of all characteristic functions $\chi_{[c, d]}$ of intervals with rational endpoints is dense in $L^p$ and as the set of such functions $\{\chi_{[c, d]}, \ c < d, \ c, d \in \mathbb{Q}\}$ is countable we obtain from Proposition 5.5 that $L^p([a, b])$ is separable.
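The estimate for characteristic functions used in Variant 2 can be verified with exact rational arithmetic: the $p$-th power of the $L^p$ distance of two indicator functions is the length of the symmetric difference of the intervals. In the sketch below the endpoints $c, d$, the exponent $p = 2$ and the use of `Fraction.limit_denominator` to produce rational endpoints are choices made only for this illustration.

```python
from fractions import Fraction


def symmetric_difference_length(c, d, c_n, d_n):
    """Length of the symmetric difference of the intervals [c, d] and [c_n, d_n].

    This equals the p-th power of the L^p distance of the two indicator functions.
    """
    overlap = max(0, min(d, d_n) - max(c, c_n))
    return (d - c) + (d_n - c_n) - 2 * overlap


def lp_distance(c, d, c_n, d_n, p):
    """L^p distance of the indicator functions of [c, d] and [c_n, d_n]."""
    return float(symmetric_difference_length(c, d, c_n, d_n)) ** (1.0 / p)


c, d = 2 ** 0.5 - 1, 3 ** 0.5 - 1          # endpoints in [0, 1], stored as floats
p = 2
distances = []
bounds = []
for max_den in (10, 100, 1000, 10000):
    c_n = Fraction(c).limit_denominator(max_den)   # rational approximations
    d_n = Fraction(d).limit_denominator(max_den)
    distances.append(lp_distance(Fraction(c), Fraction(d), c_n, d_n, p))
    bounds.append(float(abs(Fraction(c) - c_n) + abs(Fraction(d) - d_n)) ** (1.0 / p))
```

Each entry of `distances` is bounded by the corresponding entry of `bounds`, which is the quantity $(|c - c_n| + |d - d_n|)^{1/p}$ from the proof.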
Once we have established that a space e.g. C(K) is separable, we get for free that also any
subspace (equipped with the same norm) is separable:
Proposition 5.7. Let (X, k · kX ) be a separable normed space and let Y be a subspace of X.
Then also (Y, k · kX ) is separable.
Here it is very important that the subspace is equipped with the norm of X, not any other norm.
E.g. we can see L∞ ([0, 1]) as a subspace of L1 ([0, 1]) and the above proposition implies that if
we were to equip L∞ ([0, 1]) with the L1 norm (which is not often done in practice as we would
end up with a space that is not complete) then this would give us a separable normed space,
while L∞ ([0, 1]) equipped with the ’correct norm’, i.e. the L∞ norm, is not separable (however
it is complete which in practice is more important).
Proof of Proposition 5.7. As $X$ is separable there exists a countable dense subset $D_X := \{x_k, k \in \mathbb{N}\} \subset X$. To prove that $Y$ is separable, we now need to determine a countable subset of $Y$ that is dense in $Y$. To this end, we use that (by the definition of the infimum) we can choose for any $k, n \in \mathbb{N}$ an element $y_{k,n} \in Y$ so that
\[
\|x_k - y_{k,n}\| \le \operatorname{dist}(x_k, Y) + \frac{1}{n} = \inf_{y \in Y} \|x_k - y\| + \frac{1}{n}
\]
and note that $D := \{y_{k,n}, \ k, n \in \mathbb{N}\}$ is countable. We claim that $D \subset Y$ is dense. Indeed, given any $y \in Y$ and any $\varepsilon > 0$ we can first use that $D_X$ is dense in $X$ to find some $x_k$ with $\|y - x_k\| < \frac{\varepsilon}{3}$, which we note implies in particular that $\operatorname{dist}(x_k, Y) < \frac{\varepsilon}{3}$. Choosing $n$ large enough so that $\frac{1}{n} < \frac{\varepsilon}{3}$ we hence know that $\|x_k - y_{k,n}\| < \operatorname{dist}(x_k, Y) + \frac{\varepsilon}{3} < \frac{2\varepsilon}{3}$ and hence get that
\[
\|y - y_{k,n}\| \le \|y - x_k\| + \|x_k - y_{k,n}\| < \frac{\varepsilon}{3} + \frac{2\varepsilon}{3} = \varepsilon .
\]
Finally we want to give a brief outlook on the use of separability and density of subspaces:
• Existence of a basis? One might ask whether for a separable space $(X, \|\cdot\|)$ there is a ’useful
notion’ of basis of a space, and whether in a separable space one can expect such a basis
to have good properties (e.g. be countable). There are several notions of basis for normed
spaces that will be discussed in Part C Functional Analysis, and while every space admits
a so called Hamel basis S (a set S so that every element of X can be written as a finite
linear combination of elements of S and so that every finite subset of S is
linearly independent), such Hamel bases are not much use in practice as one can show
that a Hamel basis of a Banach space is either finite (if X is finite dimensional) or else
uncountable (c.f. Part C Functional Analysis). A more useful notion of basis is that of
a Schauder basis (a set $\{s_1, s_2, \ldots\}$ so that every $x \in X$ has a unique norm-convergent
representation $x = \sum_{j=1}^{\infty} \lambda_j s_j$) and such a Schauder basis exists only for separable spaces
(though as you will see in Part C Functional Analysis not for every separable space).
In the special case of Hilbert spaces one can show the following stronger and very useful
result which you will prove in B4.2 Functional Analysis 2: Every separable Hilbert space
has a countable orthonormal basis $\{e_n\}$, where a set S is called an orthonormal basis of a
Hilbert space X if its elements are pairwise orthogonal, have $\|s\| = 1$ and span(S) is dense in X.
• Simplifications of proofs: The main application of separability in the present course will be
that it allows us to give proofs of some of our main results in the case of separable spaces,
most notably of the Theorem of Hahn-Banach that we discuss in the next section, that avoid
the use of Zorn’s lemma.
• In applications, it is often possible to reduce the proof of a property or inequality to first
proving the claim for a dense subset of ”nice” elements of the space, such as smooth
functions in case of Lp and then a second step that uses the density of such functions to
prove that this property extends to the whole space. Similarly, as a bounded linear operator
T ∈ L(Y, Z), Z a Banach space, that is defined on a dense subspace Y ⊂ X has a unique
extension to an element T ∈ L(X, Z), in many instances one defines operators first on a
dense subset of ”nice” elements (e.g. continuous functions) and then extends this operator
to the whole space.
Many instances of such arguments can be seen in the Part C course on Functional analytic
methods for PDEs.
• Approximating problems on infinite dimensional spaces by finite dimensional problems:
For separable spaces there exists a sequence of finite dimensional subspaces $Y_1 \subset Y_2 \subset \ldots$
of X so that $\bigcup_i Y_i$ is dense in X. This property is used in many instances (be it to try to
prove the existence of a solution of a problem, like a PDE, or more practically in numerics
to obtain an approximate solution) when considering problems on separable Banach spaces
(e.g. subspaces of $L^p$, $1 \le p < \infty$). The idea of this method (also called Galerkin’s
method) is to first determine solutions xn ∈ Yn of approximate problems defined on the
finite dimensional spaces Yn , where results from Linear Algebra such as the rank-nullity
theorem apply (and e.g. ensure that an operator T : Yj → Yj is invertible if and only if
it is injective) and then hope to obtain that xn converges to a solution x of the original
problem (in some sense, usually one only obtains so called ”weak convergence”, see Part C
courses on Functional Analysis and Fixed Point Methods for Nonlinear PDEs), respectively
in applications in numerical analysis that xn provides a good approximation of the solution.
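The Galerkin idea from the last bullet point can be illustrated by a small sketch. Everything specific in it is an assumption made for illustration and not taken from the notes: the model problem $-u'' = 1$ on $(0,1)$ with $u(0) = u(1) = 0$, the subspaces $Y_n = \mathrm{span}\{\sin(k\pi x) : k \le n\}$, and the function names. In this particularly benign linear example the finite dimensional systems are diagonal and the approximations converge uniformly, which is much better than what one can expect in general.

```python
import math

def galerkin_coefficients(n):
    """Coefficients c_1..c_n of the Galerkin solution in Y_n = span{sin(k*pi*x)}.

    Assumed model problem: -u'' = 1 on (0, 1), u(0) = u(1) = 0, in weak form
    a(u, v) = l(v) with a(u, v) = int u' v' dx and l(v) = int v dx.
    """
    coeffs = []
    for k in range(1, n + 1):
        # The sine functions are a-orthogonal, so the n x n system is diagonal:
        # a(phi_k, phi_k) = (k*pi)^2 / 2 and l(phi_k) = (1 - cos(k*pi)) / (k*pi).
        stiffness = (k * math.pi) ** 2 / 2
        load = (1 - math.cos(k * math.pi)) / (k * math.pi)
        coeffs.append(load / stiffness)
    return coeffs

def galerkin_solution(x, n):
    """Evaluate the approximate solution x_n in Y_n at the point x."""
    return sum(c * math.sin(k * math.pi * x)
               for k, c in enumerate(galerkin_coefficients(n), start=1))

def exact_solution(x):
    """Exact solution u(x) = x (1 - x) / 2 of the assumed model problem."""
    return x * (1 - x) / 2

if __name__ == "__main__":
    # Print the maximal error on a grid for increasing dimension n of Y_n.
    for n in (1, 3, 9, 27):
        err = max(abs(galerkin_solution(j / 100, n) - exact_solution(j / 100))
                  for j in range(101))
        print(n, err)
```

Here invertibility of each finite dimensional problem is immediate because the system is diagonal with non-zero entries; for a general choice of basis one would solve a full linear system instead.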
Chapter 6
The most important special case of linear operators between Banach spaces is the space of
bounded linear functionals, i.e. bounded linear maps into R (respectively C).
Definition 12. Let $(X, \|\cdot\|)$ be a normed space. Then the dual space of X is defined as
$X^* := L(X, \mathbb{F})$ equipped with the operator norm $\|f\|_{X^*} := \inf\{M : |f(x)| \le M\|x\| \text{ for every } x \in X\}$,
where as always $\mathbb{F} = \mathbb{R}$ if $(X, \|\cdot\|)$ is a real normed space, respectively $\mathbb{F} = \mathbb{C}$ for complex spaces.
We remark that since R (and C) are complete we know from Theorem 2.3 that the dual space of
any normed space is complete.
We note that if Y is a subspace of X (as always equipped with the same norm to turn
it into a normed space), then we can restrict any $f \in X^*$ to obtain an element $f|_Y$ of $Y^*$, where
we of course set $f|_Y(y) := f(y)$. We note that the definition of the operator norm immediately
implies that $\|f|_Y\|_{Y^*} \le \|f\|_{X^*}$.
Conversely we may ask whether we can extend a functional g ∈ Y ∗ to a bounded linear operator
G ∈ X ∗ , where we call such a G an extension of g provided G|Y = g.
We have already seen that if Y is dense in X such an extension not only exists, but is furthermore
unique and indeed the extension operator $E : Y^* \to X^*$ is an isometric isomorphism, compare
Theorem 4.1. While this result holds true for linear operators into a general Banach space,
the results that we will prove in this chapter are valid only for elements of the dual space, i.e.
functions that map into the corresponding field $\mathbb{F} = \mathbb{R}$ respectively $\mathbb{F} = \mathbb{C}$.
The main result that we prove in this chapter is the Theorem of Hahn-Banach, that assures in
particular that we can indeed extend any element f ∈ Y ∗ , Y an arbitrary subspace of X, to an
operator F ∈ X ∗ without increasing its operator norm, compare Theorem 6.1 below.
CHAPTER 6. THE THEOREM OF HAHN-BANACH AND APPLICATIONS 40
In this section we assume that X is a real vector space (which we can assume wlog to be
non-trivial, i.e. $X \neq \{0\}$) and will discuss complex spaces in the next section 6.2.
We discuss several versions and consequences of the Theorem of Hahn-Banach whose main pur-
pose is to establish the existence of linear functionals with some prescribed properties, such as
being an extension of a given functional from a subspace Y to the whole space X.
The results we state in this section are all valid for general (real) normed spaces (X, k · k), but
we shall only give their proofs in the case that X is separable to avoid the need for an argument
using Zorn’s lemma in favour of proofs that use the usual induction. The general version of Hahn-
Banach and further applications that go beyond the confines of this course will be discussed in
the Part C course on Functional Analysis.
The first version of the Theorem of Hahn-Banach shows that we can not only extend bounded
linear functionals from a subspace to the whole space, but we can do this in a way that does not
increase their operator norm, namely
Theorem 6.1 (Theorem of Hahn-Banach on the existence of a bounded extension). Let X be
a normed space, Y ⊂ X a subspace and let f ∈ Y ∗ be any given element of the dual space of Y .
Then there exists an extension $F \in X^*$ of f, i.e. an element F of $X^*$ with $F|_Y = f$, so that
\[ \|F\|_{X^*} = \|f\|_{Y^*}. \]
We note that every extension $F \in X^*$ of f satisfies $\|F\|_{X^*} \ge \|F|_Y\|_{Y^*} = \|f\|_{Y^*}$,
so to prove the above result it is enough to prove that there exists a linear extension F of f so
that $|F(x)| \le p(x)$ for all $x \in X$ where we set $p(x) := \|f\|\,\|x\|$. We note that as F is linear, this
condition is equivalent to having
\[ F(x) \le p(x) = \|f\|\,\|x\| \quad \text{for all } x \in X \]
as this then also implies that $-F(x) = F(-x) \le \|f\|\,\|x\|$. We also recall that we are dealing
with real vector spaces, and hence functionals with values in R, so the above inequality is well
defined.
Indeed the general version of the Hahn-Banach Theorem assures that such an extension exists
for a much larger class of functions p than just the p(x) = kf kkxk that we obtain in the context
of Theorem 6.1, namely for all p : X → R that are so called sublinear:
Definition 13. Let X be a real vector space. Then p : X → R is called sublinear if for every
x, y ∈ X and every λ ≥ 0 we have that
p(x + y) ≤ p(x) + p(y) and p(λx) = λp(x).
We note that we do not require that p is non-negative. We also note that every norm, and
indeed every seminorm, on X is a sublinear functional. There are also many other constructions
that yield sublinear functions that are important in applications (as discussed e.g. in Part C
Functional Analysis), such as the so called Minkowski functional associated to each convex set
C that contains the origin, compare section 6.4.
To get a simple example of a sublinear functional that is not induced by a semi-norm, we can
consider any linear function $p : X \to \mathbb{R}$, or, to get a more geometric example, consider $p : \mathbb{R}^n \to \mathbb{R}$
that is defined by $p(x) = \max(x_n, 0)$, i.e. that is given by the distance of a point x to the halfspace
$\{x : x_n \le 0\}$.
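The two defining properties of a sublinear function can be checked numerically for the geometric example $p(x) = \max(x_n, 0)$. The following is only a sanity check on random samples under assumed parameters (dimension, sample size, tolerance and function names are choices made here), not a proof.

```python
import random

def p(x):
    """Sublinear function p(x) = max(x_n, 0) on R^n (x is a list of floats)."""
    return max(x[-1], 0.0)

def check_sublinear(samples=1000, n=3, seed=0):
    """Test p(x + y) <= p(x) + p(y) and p(lam * x) = lam * p(x) for lam >= 0
    on random samples; returns False as soon as one sample violates a property."""
    rng = random.Random(seed)
    for _ in range(samples):
        x = [rng.uniform(-1, 1) for _ in range(n)]
        y = [rng.uniform(-1, 1) for _ in range(n)]
        lam = rng.uniform(0, 5)
        s = [a + b for a, b in zip(x, y)]
        if p(s) > p(x) + p(y) + 1e-12:
            return False
        if abs(p([lam * a for a in x]) - lam * p(x)) > 1e-12:
            return False
    return True
```

Note that `p([0.0, 1.0])` and `p([0.0, -1.0])` differ, so p is not symmetric and hence not a semi-norm.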
The general version of the Theorem of Hahn-Banach (for real vector spaces) is
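Theorem 6.2 (Theorem of Hahn-Banach (general sublinear version)). Let X be a real vector
space, $Y \subset X$ a subspace and $p : X \to \mathbb{R}$ sublinear. Suppose that $f : Y \to \mathbb{R}$ is a linear
functional with the property that
\[ f(y) \le p(y) \quad \text{for all } y \in Y. \]
Then there exists a linear extension $F : X \to \mathbb{R}$ of f so that
\[ F(x) \le p(x) \quad \text{for all } x \in X. \]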
In this course we shall only give the full proof of the more specific version of the Theorem of
Hahn-Banach stated in Theorem 6.1 and this only in the case that X is separable.
As the first step of the proof of this more restricted result and of the proof of the general version
of Hahn-Banach are however identical, we will carry out this first step in the general setting. This
first step is to prove that we can obtain the required extension to subspaces that are obtained
by ”adding one dimension” and relies crucially on scalars being real, rather than complex:
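Lemma 6.3 (1-step extension lemma). Let X be a real vector space, $p : X \to \mathbb{R}$ sublinear and
let $Y, \tilde Y$ be subspaces of X which are so that there exists some $x_0 \in X$ so that
\[ \tilde Y = \mathrm{span}(Y \cup \{x_0\}). \]
Then for any linear $f : Y \to \mathbb{R}$ for which $f(y) \le p(y)$ for all $y \in Y$ there exists a linear extension
$\tilde f : \tilde Y \to \mathbb{R}$ so that
\[ \tilde f(\tilde y) \le p(\tilde y) \quad \text{for all } \tilde y \in \tilde Y. \]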
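Proof of Lemma 6.3. If $x_0 \in Y$ then the claim is trivial as $\tilde Y = Y$. So suppose instead that
$x_0 \notin Y$. Then we can write every $\tilde y \in \tilde Y$ uniquely as
\[ \tilde y = y + \lambda x_0 \quad \text{for some } y \in Y,\ \lambda \in \mathbb{R}, \]
so given any number $r \in \mathbb{R}$ we obtain a well defined linear map $\tilde f_r : \tilde Y \to \mathbb{R}$ if we set
\[ \tilde f_r(y + \lambda x_0) := f(y) + \lambda r \]
and note that $\tilde f_r|_Y = f$ no matter how r is chosen. We now need to show that we can choose
r so that this function $\tilde f_r$ has the required property that $\tilde f_r(\tilde y) \le p(\tilde y)$ for all $\tilde y \in \tilde Y$, which is
equivalent to
\[ \lambda r \le p(y + \lambda x_0) - f(y) \quad \text{for all } y \in Y,\ \lambda \in \mathbb{R}. \tag{6.1} \]
We first note that for $\lambda = 0$ this is trivially true no matter how r is chosen as by assumption
$f \le p$ on Y.
For $\lambda > 0$ the above inequality (6.1) holds true if and only if
\[ r \le \lambda^{-1}\bigl(p(y + \lambda x_0) - f(y)\bigr) = p(\lambda^{-1} y + x_0) - f(\lambda^{-1} y) \]
for all $y \in Y$ or equivalently, setting $v = \lambda^{-1} y$ and using that Y is a vector space, if and only if
\[ r \le \inf_{v \in Y} \bigl( p(v + x_0) - f(v) \bigr). \tag{6.2} \]
For $\lambda < 0$ we write $\lambda = -|\lambda|$ to rewrite (6.1) as $-|\lambda| r \le p(y - |\lambda| x_0) - f(y)$. We hence obtain
that (6.1) is satisfied for all $\lambda < 0$ and $y \in Y$ if and only if
\[ r \ge -|\lambda|^{-1}\bigl(p(y - |\lambda| x_0) - f(y)\bigr) = f(|\lambda|^{-1} y) - p(|\lambda|^{-1} y - x_0) \]
for all such $\lambda$ and y, i.e. setting $w = |\lambda|^{-1} y$, if and only if
\[ r \ge \sup_{w \in Y} \bigl( f(w) - p(w - x_0) \bigr). \tag{6.3} \]
For $\tilde f_r$ to be the required extension we thus need to choose r so that both (6.2) and (6.3) hold,
which is possible provided
\[ \inf_{v \in Y} \bigl( p(v + x_0) - f(v) \bigr) \ge \sup_{w \in Y} \bigl( f(w) - p(w - x_0) \bigr). \]
This inequality holds since for any $v, w \in Y$
\[ \bigl(p(v + x_0) - f(v)\bigr) - \bigl(f(w) - p(w - x_0)\bigr) = p(v + x_0) + p(w - x_0) - f(v + w) \ge p(v + w) - f(v + w) \ge 0, \]
where we use the sublinearity of p in the second and the assumption that $f \le p$ on Y in the last
step.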
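As an illustration of the 1-step extension (a worked example that is not part of the original notes; the space, the sublinear function and the functional are assumptions chosen purely for concreteness), one can compute the admissible values of r explicitly in $\mathbb{R}^2$:

```latex
% Assumed data: X = \mathbb{R}^2, Y = \mathbb{R} \times \{0\}, x_0 = (0,1),
% p(x) = |x_1| + |x_2| and f(y_1, 0) = y_1, so that f \le p on Y.
\begin{align*}
\sup_{w \in Y} \bigl( f(w) - p(w - x_0) \bigr)
  &= \sup_{w_1 \in \mathbb{R}} \bigl( w_1 - |w_1| - 1 \bigr) = -1, \\
\inf_{v \in Y} \bigl( p(v + x_0) - f(v) \bigr)
  &= \inf_{v_1 \in \mathbb{R}} \bigl( |v_1| + 1 - v_1 \bigr) = 1 .
\end{align*}
% Hence exactly the values r \in [-1, 1] are admissible, and the extensions are
\[
  \tilde f_r(x_1, x_2) = x_1 + r\, x_2 , \qquad
  \tilde f_r(x) \le |x_1| + |x_2| = p(x) \quad \text{for } |r| \le 1 .
\]
```

In particular the extension is in general far from unique: here every $r \in [-1, 1]$ gives a valid extension.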
The proof of the general version of Hahn-Banach now uses this lemma together with an argument
based on Zorn’s lemma to obtain the desired extension as a maximal element of a partially ordered
set of pairs (Ỹ , f˜) of subspaces Ỹ of X that contain Y and extensions f˜ of f with f˜ ≤ p. As
mentioned, this will be carried out in detail in C4.1 Functional Analysis.
Proof of Theorem 6.1 for separable X. Let $(X, \|\cdot\|)$ be a separable space and let $D = \{x_1, x_2, \ldots\}$
be a dense subset of X. Given a subspace Y of X we now define an increasing sequence of
subspaces $Y_i$ of X that contain Y by setting $Y_0 = Y$ and then defining iteratively
\[ Y_i := \mathrm{span}(Y_{i-1} \cup \{x_i\}), \quad i = 1, 2, \ldots \]
Given $f \in Y^*$ we let $p : X \to \mathbb{R}$ be the sublinear function that is defined by $p(x) := \|f\|\,\|x\|$, set
$f_0 = f$ and now iteratively obtain linear extensions $f_{i+1} : Y_{i+1} \to \mathbb{R}$ of $f_i : Y_i \to \mathbb{R}$, i.e. linear
functions with $f_{i+1}|_{Y_i} = f_i$, satisfying $f_{i+1} \le p$ on $Y_{i+1}$ from Lemma 6.3. On $Y_\infty := \bigcup_{i=1}^{\infty} Y_i$ we
hence obtain a linear function $\tilde f : Y_\infty \to \mathbb{R}$ by defining $\tilde f(x) := f_i(x)$ for $x \in Y_i$, which is well
defined as by construction $f_i = f_j$ on $Y_i \cap Y_j = Y_{\min(i,j)}$. By construction $\tilde f|_Y = f_0|_{Y_0} = f$ and
$\tilde f \le p$ which, as observed previously (c.f. the remark made after the statement of the theorem)
implies that $\tilde f \in (Y_\infty)^*$ with
\[ \|\tilde f\|_{(Y_\infty)^*} = \|f\|_{Y^*}. \]
We finally note that since $D \subset X$ is dense and $D \subset Y_\infty$, we have that $Y_\infty$ is a dense subspace
of X so we can extend $\tilde f$ to the desired element $F \in X^*$ with $\|F\|_{X^*} = \|\tilde f\|_{(Y_\infty)^*} = \|f\|_{Y^*}$ using
Theorem 4.1.
WARNING. The Theorem of Hahn-Banach is specific to functionals, that is maps from a
vector space to the corresponding field F, and does not hold true for linear operators between
two normed spaces.
One can e.g. show that there is no continuous linear extension of the identity map $\mathrm{Id} : c_0 \to c_0$
to a map $f : \ell^\infty \to c_0$, where $c_0 \subset \ell^\infty$ denotes the closed subspace of all sequences that tend to
zero.
We remark that both the bounded extension version of the Theorem of Hahn-Banach, i.e.
Theorem 6.1, as well as the general version stated in Theorem 6.2 using sublinear functions, have an
extension to complex vector spaces. For simplicity here we only state and prove the former:
Theorem 6.4. Let (X, k · k) be a complex normed space, Y a subspace of X and f ∈ Y ∗ . Then
there exists an extension F ∈ X ∗ of f so that
kF kX ∗ = kf kY ∗ .
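Proof of Theorem 6.4. We write $f_1 := \mathrm{Re}(f)$ for the real part of f and note that
$f(y) = f_1(y) - i f_1(iy)$ for all $y \in Y$ since $\mathrm{Re}(f(iy)) = \mathrm{Re}(i f(y)) = -\mathrm{Im}(f(y))$.
As we can view $(X, \|\cdot\|)$ and its subspace Y also as real vector spaces (simply ignoring
the additional structure obtained from multiplying by i) we hence know that $f_1 : Y \to \mathbb{R}$ is a
bounded $\mathbb{R}$-linear functional with $|f_1(y)| \le |f(y)| \le \|f\|\,\|y\|$. Using the version of Hahn-Banach for
real normed spaces that we have already proven, we can thus extend $f_1$ to a bounded $\mathbb{R}$-linear
functional $F_1 : X \to \mathbb{R}$ that also satisfies
\[ |F_1(x)| \le \|f\|\,\|x\| \quad \text{for every } x \in X. \tag{6.4} \]
We now set $F(x) := F_1(x) - i F_1(ix)$ and claim that this is the desired $\mathbb{C}$-linear extension of f
with $\|F\| = \|f\|$. By construction $F|_Y = f$ and $F(ix) = iF(x)$ so we get that F is not only
$\mathbb{R}$-linear but indeed $\mathbb{C}$-linear, so it remains to check that $\|F\| \le \|f\|$, i.e. that for any $x \in X$ we
have $|F(x)| \le \|f\|\,\|x\|$.
Given any $x \in X$ we set $\theta = -\mathrm{Arg}(F(x)) \in \mathbb{R}$, which implies that $F(e^{i\theta} x) = e^{i\theta} F(x) =
e^{i\theta} |F(x)| e^{i\mathrm{Arg}(F(x))} = |F(x)|$ is real and hence $F(e^{i\theta} x) = \mathrm{Re}(F(e^{i\theta} x)) = F_1(e^{i\theta} x)$. We can thus
use (6.4) and the $\mathbb{C}$-linearity of F to obtain the desired estimate of
\[ |F(x)| = F(e^{i\theta} x) = F_1(e^{i\theta} x) \le \|f\|\,\|e^{i\theta} x\| = \|f\|\,\|x\|. \]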
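The construction $F(x) := F_1(x) - iF_1(ix)$ can be tested on a small finite dimensional example. The sketch below is an illustration under assumptions made here (the space $\mathbb{C}^n$, a functional given by a coefficient vector `a`, and the function names); it checks that F recovers the complex linear functional whose real part is $F_1$.

```python
def F1(x, a):
    """Real-linear functional F1(x) = Re(a_1 x_1 + ... + a_n x_n) on C^n."""
    return sum(aj * xj for aj, xj in zip(a, x)).real

def F(x, a):
    """Complexification F(x) = F1(x) - i F1(i x) used in the proof."""
    ix = [1j * xj for xj in x]
    return F1(x, a) - 1j * F1(ix, a)
```

For example, with `a = [1 + 2j, -0.5j]` and `x = [0.3 - 1j, 2 + 0.25j]` the value `F(x, a)` agrees up to rounding with `sum(aj * xj for aj, xj in zip(a, x))`, and multiplying x by i multiplies the value by i.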
As a first application of the Theorem of Hahn-Banach we obtain the following useful result
Proposition 6.5. Let $(X, \|\cdot\|)$ be a normed space. Then for any $x \in X \setminus \{0\}$ there exists an
element $f \in X^*$ with $\|f\| = 1$ so that $f(x) = \|x\|$.
Proof. Let $Y = \mathrm{span}(x)$ and define $g(\lambda x) = \lambda \|x\|$ for $\lambda \in \mathbb{F}$. Then $g \in Y^*$ with $\|g\| = 1$ and
hence g has an extension $f \in X^*$ with $\|f\| = 1$ and $f(x) = g(x) = \|x\|$.
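The functional provided by Proposition 6.5 need not be unique. A small worked example (an illustration added here, with the space and the point chosen as assumptions) shows this in $\mathbb{R}^2$ with the maximum norm:

```latex
% Assumed example: X = (\mathbb{R}^2, \|\cdot\|_\infty) and x = (1, 1), so \|x\|_\infty = 1.
\[
  f_t(y) := t\, y_1 + (1 - t)\, y_2 , \qquad t \in [0, 1],
\]
\[
  |f_t(y)| \le \bigl( t + (1 - t) \bigr) \|y\|_\infty = \|y\|_\infty ,
  \qquad f_t(x) = 1 = \|x\|_\infty ,
\]
% so \|f_t\| = 1 and f_t(x) = \|x\|_\infty for every t \in [0, 1].
```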
This result has several useful consequences, including the following ’dual characterisations’ of
the norms on X and its dual space X ∗
Corollary 6.6. Let $(X, \|\cdot\|)$, $X \neq \{0\}$, be a normed space. Then
(i) for every $x \in X$ we have $\|x\| = \sup_{f \in X^*, \|f\|_{X^*} = 1} |f(x)|$,
(ii) for every $f \in X^*$ we have $\|f\|_{X^*} = \sup_{x \in X, \|x\| = 1} |f(x)|$.
Proof. We already observed that the second statement is an easy consequence of the definition of
the operator norm on $X^*$. For the proof of (i) we observe that Proposition 6.5 implies that
$\|x\|_X \le \sup_{f \in X^*, \|f\|_{X^*} = 1} |f(x)|$ while the reverse inequality is trivially true since $|f(x)| \le \|f\|\,\|x\| = \|x\|$
for every $f \in X^*$ with $\|f\| = 1$.
We note that while the supremum in (ii) is in general not achieved, Proposition 6.5 implies that
the supremum in (i) is always achieved.
Corollary 6.7. Let $(X, \|\cdot\|)$ be a normed space. Then for any two elements $x \neq y$ of X there
exists an element $f \in X^*$ so that
\[ f(x) \neq f(y). \]
This corollary follows as Proposition 6.5 allows us to choose $f \in X^*$ so that $f(x - y) = \|x - y\| \neq 0$.
We first note that the kernel of an element $f \in X^* \setminus \{0\}$ has codimension 1, namely
Lemma 6.8. Let X be a normed space and let $f : X \to \mathbb{F}$, $\mathbb{F} = \mathbb{R}$ respectively $\mathbb{F} = \mathbb{C}$, be linear
so that $f \neq 0$. Then for any $x_0 \in X$ for which $f(x_0) \neq 0$ we have that
\[ \mathrm{span}(\ker(f) + \{x_0\}) = X. \]
Geometrically we can think of Corollary 6.7 as follows: As Lemma 6.8 implies that the kernel of
f has codimension 1 we can think of the sets $\{x : f(x) = \lambda\}$ as hyperplanes in X (that is shifts of
a subspace with codimension 1) that divide our space X into two parts, namely $\{x : f(x) < \lambda\}$
and $\{x : f(x) > \lambda\}$. The above corollary hence ensures that we can separate any two points by
a hyperplane, with x and y on either side of it.
A slightly more general form of this result that we can prove is that we can separate points from
closed subspaces
Proposition 6.9. Let $(X, \|\cdot\|)$ be a Banach space, Y a proper closed subspace of X. Then for
any $x_0 \in X \setminus Y$ there exists an element $f \in X^*$ with $\|f\| = 1$ so that
\[ f|_Y = 0 \quad \text{and} \quad f(x_0) = \mathrm{dist}(x_0, Y) = \inf_{y \in Y} \|x_0 - y\| > 0. \]
Proof of Proposition 6.9. We define a suitable linear map g on the subspace $U = \mathrm{span}(Y \cup \{x_0\})$
and then use Hahn-Banach to extend g to f.
To this end we note that every $u \in U$ can be written uniquely as $u = y + \lambda x_0$ for some $\lambda \in \mathbb{F}$
and $y \in Y$ so that defining
\[ g(y + \lambda x_0) := \lambda d, \quad d := \mathrm{dist}(x_0, Y) > 0, \]
gives a well defined linear map on U which has the property that $g(x_0) = d$ and $g|_Y = 0$.
We have $\|g\| \le 1$ since for any $y \in Y$ and $\lambda \neq 0$
\[ \|y + \lambda x_0\| = |\lambda|\, \|x_0 - (-\lambda^{-1} y)\| \ge |\lambda| \inf_{\tilde y \in Y} \|x_0 - \tilde y\| = |\lambda|\, d = |g(y + \lambda x_0)|. \]
To prove that $\|g\| = 1$ it hence remains to prove that also $\|g\| \ge 1$ or equivalently that for
any $\varepsilon > 0$ there exists $x \in U \setminus \{0\}$ so that $\frac{|g(x)|}{\|x\|} \ge 1 - \varepsilon$, where it is of course enough to consider
$\varepsilon \in (0, 1)$. To obtain such an x we note that by the definition of $d = \mathrm{dist}(x_0, Y)$, we can find for
any $c > d$ an element $y \in Y$ so that $\|x_0 - y\| < c$. We choose $c = \frac{1}{1 - \varepsilon} d$, which is strictly larger than
d since $d > 0$, and hence obtain an element $x_0 - y \in U$ for which
\[ \frac{|g(x_0 - y)|}{\|x_0 - y\|} = \frac{d}{\|x_0 - y\|} \ge \frac{d}{c} = 1 - \varepsilon \]
as required. Having thus shown that $\|g\| = 1$ we now obtain the required $f \in X^*$ with $f|_Y = 0$,
$f(x_0) = d$ and $\|f\| = \|g\| = 1$ by applying the Theorem of Hahn-Banach.
There are far stronger versions of such ’geometric forms of Hahn-Banach’ which will be discussed
in the Part C course on Functional Analysis. As already simple examples in $\mathbb{R}^2$ show, we cannot
expect to separate sets by straight lines without imposing some constraints on their geometry.
Unsurprisingly, a key role is played by the convexity of sets and one of the results of Part C
Functional Analysis will be to prove that if A is closed and B is compact, if the two sets are
disjoint and if both sets are convex, then these sets can be strictly separated by a hyperplane
in the sense that there exists an element $f \in X^*$ and a number $\lambda \in \mathbb{R}$ so that
\[ f(a) < \lambda < f(b) \quad \text{for all } a \in A \text{ and } b \in B. \]
The proof of this result uses the sublinear version of the Theorem of Hahn-Banach and the fact
that for an open convex set C containing the origin, one can define a sublinear function by
p(x) := inf{λ > 0 : x ∈ λC} (called the Minkowski functional).
Such general results play an important role also in applications to PDEs and in other advanced
topics in functional analysis but go beyond the remit of this course.
To formulate another useful consequence of Proposition 6.9 we introduce the following notation
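Definition 14. Given any subset $A \subset X$, we define the annihilator of A to be
\[ A^\circ := \{f \in X^* : f|_A = 0\}, \]
and given any subset $T \subset X^*$ we set
\[ T_\circ := \{x \in X : f(x) = 0 \text{ for all } f \in T\}. \]
Proposition 6.10. Let $(X, \|\cdot\|)$ be a normed space. Then the following hold true:
(i) Let $S \subset X$. Then span(S) is dense in X if and only if the annihilator of S is trivial, i.e.
$S^\circ = \{0\} \subset X^*$.
(ii) If $T \subset X^*$ is so that span(T) is dense in $X^*$ then $T_\circ = \{0\} \subset X$.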
Proof. (i) Suppose first that span(S) is dense. Then for any $f \in S^\circ$, we have by linearity that
also $f|_Y = 0$ where we set $Y = \mathrm{span}(S)$. As Y is dense in X we thus get that $f = 0$ by
Lemma 4.2.
Conversely, suppose that span(S) is not dense. Then $Y = \overline{\mathrm{span}(S)}$ is a closed proper
subspace of X so we can choose $x_0 \in X \setminus Y$ and apply Proposition 6.9 to obtain an $f \in X^*$
with $f|_Y = 0$ and $f(x_0) = \mathrm{dist}(x_0, Y) \neq 0$, so have found an element $f \neq 0$ of $S^\circ$.
(ii) Suppose that there exists $x \in T_\circ$ with $x \neq 0$. Then by Corollary 6.7 there exists $f \in X^*$
so that $f(x) \neq f(0) = 0$. If span(T) is dense in $X^*$ we can however find a sequence $(f_n)$ of
elements of span(T) that converges $f_n \to f$ in the sense of $X^*$. Note that since $x \in T_\circ$ we
have that $f_n(x) = 0$ which leads to a contradiction as $0 \neq f(x) = \lim_{n \to \infty} f_n(x)$.
We note that as the kernel of any element $f \in X^*$ is closed, we know that $T_\circ$ is an intersection
of closed subspaces and hence itself a closed subspace of X. Also one can easily check from the
definition that the annihilator of any set $A \subset X$ is a closed subspace of $X^*$. Furthermore we have
for any subspace A of X that
\[ \overline{A} = (A^\circ)_\circ. \]
Proof. ”⊂” As $(A^\circ)_\circ$ is closed, it suffices to prove that $A \subset (A^\circ)_\circ$. Let $a \in A$. Then by definition
of the annihilator of A we know that $f(a) = 0$ for any $f \in A^\circ$ and hence that $a \in (A^\circ)_\circ$.
In this chapter we further discuss the special properties of functionals, describe the dual spaces of
some important spaces encountered earlier, take a first look at the second dual $X^{**}$ of a normed
space, that is the dual space of the dual space $X^*$ of X, and explain how a space always embeds
into its second dual and how this can be used to view a non-complete normed space as a subspace
of a complete space.
To begin with, we note that for linear functionals, we have the following characterisation of
continuity
Lemma 7.1. Let X be a normed space and let f : X → F, F = R respectively F = C be linear.
Then the following are equivalent
ker(f ) is closed ⇐⇒ f ∈ X ∗ .
Proof. ”⇐” As f is continuous and {0} is closed we get that the preimage ker(f ) = f −1 ({0}) is
also closed.
”⇒” The claim is trivial if $f = 0$ so suppose that $f \neq 0$ and let $x_0$ be so that $f(x_0) \neq 0$, where
(after replacing $x_0$ by a multiple of $x_0$) we can assume without loss of generality that $f(x_0) = 1$.
We first note that since ker(f) is closed, we know that $\mathrm{dist}(x_0, \ker(f)) > 0$, compare problem
sheet 0. We now claim that for every $x \in X$
\[ |f(x)| \le \delta^{-1} \|x\| \quad \text{where } \delta := \mathrm{dist}(x_0, \ker(f)) > 0 \]
which will of course imply that $f \in X^*$.
This claim is trivial for $x \in \ker(f)$ so suppose instead that $f(x) \neq 0$. We note that since $f(x_0) = 1$
and since f is linear we have that $x - f(x) x_0 \in \ker(f)$. Hence also $-\frac{1}{f(x)}(x - f(x) x_0) \in \ker(f)$
and must thus have distance of at least $\delta$ from $x_0$ which implies that
\[ \delta \le \Bigl\| x_0 + \frac{1}{f(x)} (x - f(x) x_0) \Bigr\| = \frac{\|x\|}{|f(x)|} \]
and hence the claim.
CHAPTER 7. DUAL-SPACES, SECOND DUALS AND COMPLETION 48
We recall from Linear Algebra that if X is a finite dimensional space then we can associate to
each basis e1 , . . . , en of X a dual basis f1 , . . . , fn of X ∗ by defining fi (ej ) = δij . In particular X
and its dual X ∗ are isomorphic.
WARNING. As so often in this course, the finite dimensional case leads to the wrong intuition
for general normed spaces. In general, the dual space can have very different properties than a
space itself, e.g. we can have that X is separable while X ∗ is inseparable, we will have that the
dual space of any normed space is complete, even if the space X itself is not complete....
The one exception to this warning are Hilbert-spaces, for which you will prove the following
beautiful theorem in B4.2 Functional analysis 2
Theorem 7.2 (Riesz-Representation Theorem (contents of B4.2)). Let (X, (·, ·)) be a Hilbert-
space. Then the map ι : X → X ∗ defined by ι(x)(y) := (x, y) is an isometric isomorphism from
X to X ∗ .
To describe the dual space of a given space X, we would like to find another normed space Y
which is isometrically isomorphic to $X^*$, written for short as $X^* \cong Y$, i.e. for which there exists a
bijective linear map $L : Y \to X^*$ so that
\[ \|Ly\|_{X^*} = \|y\|_Y \quad \text{for all } y \in Y. \]
Often it is not too difficult to find a space Y and a map L so that $L : Y \to X^*$ is isometric, i.e.
so that $\|Ly\|_{X^*} = \|y\|_Y$ for all y, and hence also injective, but it can be difficult to find a space Y
that is large enough so that it represents all elements of $X^*$, i.e. so that the map L is surjective,
respectively to prove that a candidate Y for the dual space has this property.
In general, determining the dual space of a given normed space (X, k · k) can be difficult and
already the dual spaces of some very familiar spaces such as C([a, b]) or L∞ ([a, b]) can be com-
plicated and their description is beyond the scope of this course, though we remark that in both
cases the dual spaces can be identified with a suitable space of (signed) measures. Examples of
elements of $(C[0, 1])^*$ are e.g. the map $T : f \mapsto \int_{[0,1]} f g \, dx$ for any $g \in L^1([0, 1])$ but also maps
like $T : f \mapsto f(\frac{1}{2})$ which one can interpret as the integral of f with respect to a $\delta$-measure that
is concentrated at $x = \frac{1}{2}$.
On the other hand, for the sequence spaces $\ell^p$ and the function spaces $L^p$ we have the following
characterisations if 1 ≤ p < ∞, which we stress do not apply if p = ∞. For simplicity we consider
real valued functions, though the results and their proofs also apply in the complex case (with
some extra complex conjugates).
Theorem 7.3 (Dual space of $L^p$). Let $\Omega \subset \mathbb{R}^n$ be measurable, $1 \le p < \infty$ and let $1 < q \le \infty$ be
so that $\frac{1}{p} + \frac{1}{q} = 1$. Then $(L^p(\Omega))^* \cong L^q(\Omega)$, and an isometric isomorphism is given by the map
$L : L^q(\Omega) \to (L^p(\Omega))^*$ where for $f \in L^q(\Omega)$ the linear map $L(f) \in (L^p(\Omega))^*$ is defined by
\[ L(f) : L^p(\Omega) \ni g \mapsto \int_\Omega f g \, dx \in \mathbb{R}. \]
We will prove that the map L defined above is a well-defined linear isometric map but omit the
proof of the difficult part of the theorem, i.e. the fact that L is surjective, as this would require
techniques not used elsewhere in the course and as this proof is carried out in Part C Functional
Analysis.
Proof of Theorem 7.3 except for surjectivity of L. We note that since p, q are so that $\frac{1}{p} + \frac{1}{q} = 1$
we know from Hölder’s inequality that the product f g of two functions $f \in L^q(\Omega)$ and $g \in L^p(\Omega)$
is integrable and
\[ \Bigl| \int_\Omega f g \, dx \Bigr| \le \|f\|_{L^q} \|g\|_{L^p}. \tag{7.1} \]
We now remark that since the integral is linear, we have for all $f, \tilde f \in L^q(\Omega)$, $\lambda \in \mathbb{R}$
and any $g \in L^p(\Omega)$ that $L(f + \lambda \tilde f)(g) = Lf(g) + \lambda L\tilde f(g)$, i.e. that the map $L : f \mapsto Lf$ is
linear. Similarly, given any $f \in L^q(\Omega)$ we have for any $g, \tilde g \in L^p(\Omega)$ and any $\lambda \in \mathbb{R}$ that
$Lf(g + \lambda \tilde g) = Lf(g) + \lambda Lf(\tilde g)$ so Lf is a linear map from $L^p$ to $\mathbb{R}$ and indeed an element of
$(L^p(\Omega))^*$ as it is bounded with
\[ \|Lf\|_{(L^p(\Omega))^*} = \sup_{g \in L^p, g \neq 0} \frac{|Lf(g)|}{\|g\|_{L^p}} = \sup_{g \in L^p, g \neq 0} \frac{|\int f g|}{\|g\|_{L^p}} \le \sup_{g \in L^p, g \neq 0} \frac{\|f\|_{L^q} \|g\|_{L^p}}{\|g\|_{L^p}} = \|f\|_{L^q}. \]
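To prove that L is isometric it hence remains to show that
\[ \|Lf\|_{(L^p(\Omega))^*} \ge \|f\|_{L^q} \quad \text{for every } f \in L^q(\Omega). \tag{7.2} \]
This proof is a bit technical as we need to be careful with the exponents, so it can be useful to
first consider special cases such as $p = q = 2$ or $p = 1$, $q = \infty$, where the exponents are much
nicer, to see the structure of the argument before digesting the general case. We first treat the
case that $p > 1$ and hence $q < \infty$.
The estimate (7.2) is trivial if $f = 0$ so suppose that $f \neq 0$. We choose $g := |f|^{q-2} f$ so that
$f g = |f|^q$ and hence $L(f)(g) = \int_\Omega |f|^q \, dx = \|f\|_{L^q}^q$. We now note that since $\frac{1}{p} + \frac{1}{q} = 1$ we have
$(q - 1) p = q$ and so
\[ \|g\|_{L^p} = \Bigl( \int |g|^p \, dx \Bigr)^{1/p} = \Bigl( \int (|f|^{q-1})^p \, dx \Bigr)^{1 - \frac{1}{q}} = \Bigl( \int |f|^q \, dx \Bigr)^{\frac{1}{q}(q-1)} = \|f\|_{L^q}^{q-1}. \]
Hence $g \in L^p(\Omega) \setminus \{0\}$ with $\frac{Lf(g)}{\|g\|_{L^p}} = \|f\|_{L^q}$, which gives (7.2).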
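If $p = 1$ and hence $q = \infty$ then we prove that for any $\varepsilon > 0$ there exists a function $g_\varepsilon \in L^1$ so
that $\frac{Lf(g_\varepsilon)}{\|g_\varepsilon\|_{L^1}} \ge \|f\|_{L^\infty} - \varepsilon$. To this end we consider the set $A_\varepsilon := \{x : |f(x)| \ge \|f\|_{L^\infty} - \varepsilon\}$, which
is measurable (and well defined up to a null set). If this set has finite measure then we define
$g_\varepsilon(x) := \mathrm{sign}(f(x)) \cdot \chi_{A_\varepsilon}(x)$ which is in $L^1(\Omega)$ with $L^1$-norm equal to the measure of $A_\varepsilon$, which by
the definition of the $L^\infty$ norm is positive. As $f g_\varepsilon \ge (\|f\|_{L^\infty} - \varepsilon) \chi_{A_\varepsilon}$ we can thus immediately
check that $Lf(g_\varepsilon) \ge (\|f\|_{L^\infty} - \varepsilon) \|g_\varepsilon\|_{L^1}$ which gives the claimed bound. Finally, if $A_\varepsilon$ has infinite
measure, then we can replace $A_\varepsilon$ by any subset $\tilde A_\varepsilon \subset A_\varepsilon$ whose measure is finite and positive
and apply the above argument for the corresponding function $g_\varepsilon \in L^1$.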
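The choice $g := |f|^{q-2} f$ made in the proof for $1 < p < \infty$ can be checked numerically in the finite dimensional analogue where functions are replaced by finite sequences and integrals by sums. This setting, the sample data and the function names are assumptions made for this sketch.

```python
def norm(v, r):
    """l^r norm of a finite sequence v, for 1 <= r < infinity."""
    return sum(abs(t) ** r for t in v) ** (1.0 / r)

def extremal(f, q):
    """g = |f|^(q-2) f as in the proof; entries with f_j = 0 give g_j = 0."""
    return [abs(t) ** (q - 2) * t if t != 0 else 0.0 for t in f]

def pairing(f, g):
    """Finite dimensional analogue of the integral of f*g."""
    return sum(s * t for s, t in zip(f, g))
```

With `p = 3.0`, `q = 1.5` and `f = [1.0, -2.0, 0.5, 0.0]`, the quotient `pairing(f, g) / norm(g, p)` for `g = extremal(f, q)` agrees with `norm(f, q)` up to rounding, while for any other sequence `h` the quotient is at most `norm(f, q)` by Hölder's inequality.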
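Theorem 7.4 (Dual space of $\ell^p(\mathbb{R})$). Let $1 \le p < \infty$ and let $1 < q \le \infty$ be so that $\frac{1}{p} + \frac{1}{q} = 1$.
Then $(\ell^p)^* \cong \ell^q$, and an isometric isomorphism is given by the map $L : \ell^q(\mathbb{R}) \to (\ell^p(\mathbb{R}))^*$ where
for $x \in \ell^q$ the linear map $L(x) \in (\ell^p)^*$ is defined by
\[ L(x) : \ell^p(\mathbb{R}) \ni y \mapsto \sum_{j=1}^{\infty} x_j y_j \in \mathbb{R}. \]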
The proof that L is a well defined isometric linear map from $\ell^q$ to $(\ell^p)^*$ is exactly the same as
for the function spaces $L^p$ (replacing functions by sequences and integrals by sums), so we do not
repeat it.
Instead we explain how one can prove surjectivity of the map L in case of $p = 1$ to show that
indeed $(\ell^1)^* = \ell^\infty$:
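A sketch of the standard argument for $p = 1$ is as follows. It is supplied here as an illustration and is not necessarily the argument given in the lecture; the notation $e^{(j)}$ for the j-th unit sequence and $y^{(N)}$ for truncations is assumed.

```latex
% Sketch: every \varphi \in (\ell^1)^* is of the form L(x) for some x \in \ell^\infty.
Let $\varphi \in (\ell^1)^*$ and set $x_j := \varphi(e^{(j)})$. Then
\[
  |x_j| \le \|\varphi\| \, \|e^{(j)}\|_{\ell^1} = \|\varphi\| ,
  \quad \text{so } x = (x_j) \in \ell^\infty \text{ with } \|x\|_{\ell^\infty} \le \|\varphi\| .
\]
For $y \in \ell^1$ the truncations $y^{(N)} := \sum_{j=1}^{N} y_j e^{(j)}$ converge to $y$ in $\ell^1$,
so by linearity and continuity of $\varphi$
\[
  \varphi(y) = \lim_{N \to \infty} \varphi\bigl(y^{(N)}\bigr)
             = \lim_{N \to \infty} \sum_{j=1}^{N} x_j y_j
             = L(x)(y) ,
\]
i.e. $\varphi = L(x)$.
```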
The proof of surjectivity of L for general sequence spaces $\ell^p$, $1 \le p < \infty$, is very similar (though
one needs to be more careful with the exponents).
WARNING.
\[ (\ell^\infty)^* \not\cong \ell^1 \quad \text{and} \quad (L^\infty(\Omega))^* \not\cong L^1(\Omega). \]
While the analogues of the maps L defined in Theorems 7.3 and 7.4 also give isometric linear maps
from $L^1$ to $(L^\infty)^*$ (respectively $\ell^1$ to $(\ell^\infty)^*$) these maps are not surjective. For the sequence
spaces one can show that $\ell^1$ is isomorphic to the dual of a subspace of $\ell^\infty$, namely $(c_0)^* \cong \ell^1$,
where $c_0$ denotes the subspace of all sequences that converge to zero, compare problem sheet 4.
As the dual space (X ∗ , k · kX ∗ ) of a normed space (X, k · k) is again a normed space, we can
consider its dual space X ∗∗ which is called the second dual space or bidual space of X. The most
important property of this space is that it will always contain an isometric image of the space X
itself, obtained via the canonical map
i : X → X ∗∗ , i(x)(f ) := f (x) (7.3)
that maps each element x to the functional i(x) : X ∗ → R that evaluates elements f ∈ X ∗ at
the point x.
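Proposition 7.5. Let $(X, \|\cdot\|)$ be a normed space and let $i : X \to X^{**}$ be the canonical map
defined by (7.3). Then i is linear and isometric, i.e.
\[ \|i(x)\|_{X^{**}} = \|x\|_X \quad \text{for every } x \in X. \]
In general the canonical map i is not surjective, e.g.
\[ (L^1)^{**} \cong (L^\infty)^* \not\cong L^1. \]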
However, it turns out that for many important spaces the space X is isometrically isomorphic
to its bidual X ∗∗ . Spaces for which i(X) = X ∗∗ are called reflexive, and their properties will be
further analysed in part C Functional analysis. Reflexivity (and also separability) is in particular
relevant in applications, as it allows one to extract a subsequence of any given bounded sequence
that ’converges in the weak sense’ to some limit, a property that is hugely relevant as one often
tries to prove the existence of a solution of a problem (be it an abstract equation on some Banach
space, the existence of a minimiser in calculus of variations or a solution of a PDE) by considering
a sequence of approximate solutions (or solutions of approximations of the problem) and hoping
to find a subsequence of these approximate solutions that converges in some sense to a solution
of the original problem.
From the characterisation of the dual spaces of $\ell^p$ and $L^p$ obtained above we know in particular
that the spaces $\ell^p$ and $L^p(\Omega)$ are reflexive for $1 < p < \infty$.
Proof of Proposition 7.5. For each $x \in X$ the map $i(x)$ is clearly a linear map from $X^* \to \mathbb{R}$,
and i itself is linear. As for any $f \in X^*$ with $\|f\|_{X^*} = 1$
\[ |i(x)(f)| = |f(x)| \le \|f\|\,\|x\| = \|x\| \]
we have that
\[ \|i(x)\|_{X^{**}} = \sup_{f \in X^*, \|f\|_{X^*} = 1} |f(x)| \le \|x\|. \]
To see that also $\|i(x)\| \ge \|x\|$ we now choose $f \in X^*$ with $\|f\|_{X^*} = 1$ as in Proposition 6.5 so
that $f(x) = \|x\|$.
We note that this argument is essentially just a repetition of the proof of Corollary 6.6 (i) which
directly gives that ki(x)k = kxk.
As the dual space of any normed space is complete, we know in particular that X ∗∗ is complete
and hence that every closed subspace of X ∗∗ is itself a Banach space. This allows us to view any
non-complete normed space as a dense subspace of a Banach space
Corollary 7.6. Let $(X, \|\cdot\|)$ be any normed space. Then $(X, \|\cdot\|)$ is isometrically isomorphic to
$i(X)$ which can be seen as a dense subspace of the Banach space $(Y, \|\cdot\|_{X^{**}})$ where $Y = \overline{i(X)} \subset X^{**}$.
A Banach space $(Y, \|\cdot\|_Y)$ into which X embeds isometrically as a dense subset is called a
completion of X. Such a completion is determined up to isometric isomorphisms, i.e. given any
two spaces $Y$, $\tilde Y$ so that there exist isometric maps $J : X \to Y$ respectively $\tilde J : X \to \tilde Y$ for
which $J(X)$ (respectively $\tilde J(X)$) is dense in Y (respectively $\tilde Y$), we have that there is a (unique)
isometric isomorphism $I : Y \to \tilde Y$ so that
\[ \tilde J = I \circ J. \]
Indeed, this map I is determined as the unique extension of J˜ ◦ H, H : J(X) → X the inverse
of the bijective map J : X → J(X), from the dense subspace J(X) ⊂ Y to the whole space Y ,
compare Theorem 4.1.
Let $X$ and $Y$ be any vector spaces over the same field $\mathbb{F}$ and let $X' := \{L : X \to \mathbb{F} \text{ linear}\}$ and $Y' := \{L : Y \to \mathbb{F} \text{ linear}\}$ be the corresponding spaces of linear functionals (so far we do not introduce any norm on $X$ and $Y$, so it would make no sense to talk about continuity). Then we can associate to every linear map
$$T : X \to Y$$
the map
$$T' : Y' \to X'$$
where for any $f \in Y'$ we define $T'(f) \in X'$ by
$$(T'(f))(x) = f(T(x)),$$
and one easily checks that $T'(f)$ is indeed linear, and thus an element of $X'$, and that the map $f \mapsto T'(f)$ is also linear.
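As a sanity check of this definition, the following sketch works out the finite dimensional case. It assumes $X = \mathbb{F}^n$ and $Y = \mathbb{F}^m$ with their standard bases and identifies linear functionals with row vectors; the matrix $A$ and the row vector $a$ are names introduced only for this illustration.

```latex
% Assumption (illustration only): T(x) = A x for an m x n matrix A,
% and f in Y' is represented by a row vector a, i.e. f(y) = a y.
\[
  (T'(f))(x) \;=\; f(T(x)) \;=\; a\,(A x) \;=\; (a A)\,x
  \qquad \text{for all } x \in \mathbb{F}^n .
\]
% Hence T'(f) is represented by the row vector aA. Written with column
% vectors this is the map a^T \mapsto A^T a^T, so T' acts as the transpose of A.
```

This is consistent with the description of dual maps from Part A Linear Algebra, where the matrix of the dual map with respect to the dual bases is the transpose.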
We may now ask whether this construction works also in the setting of Functional Analysis,
where we work with normed spaces instead of just vector spaces and bounded linear operators
instead of just linear operators. The following proposition answers this question positively:
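Proposition 7.7 (dual operator). Let $(X, \|\cdot\|_X)$ and $(Y, \|\cdot\|_Y)$ be normed spaces over the same field $\mathbb{F} = \mathbb{R}$ or $\mathbb{F} = \mathbb{C}$ and let $T \in L(X, Y)$. Then the dual operator
$$T' : Y^* \to X^*, \qquad f \mapsto T'(f), \qquad T'(f)(x) := f(Tx) \qquad (7.4)$$
is a well defined bounded linear operator, i.e. $T' \in L(Y^*, X^*)$, with
$$\|T'\|_{L(Y^*, X^*)} = \|T\|_{L(X,Y)}.$$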
Proof. As already mentioned, the fact that for each $f \in Y^*$ the map $T'(f)$ is linear and that $T'$ itself is a linear operator is easily checked (and the proof is exactly the same as in the finite dimensional case that was covered in Part A Linear Algebra). We first show that $T'(f) \in X^*$ with
$$\|T'(f)\|_{X^*} \le \|T\|_{L(X,Y)} \|f\|_{Y^*} \quad \text{for every } f \in Y^* \qquad (7.5)$$
which ensures that $T'$ is a well defined operator in $L(Y^*, X^*)$ with $\|T'\|_{L(Y^*, X^*)} \le \|T\|_{L(X,Y)}$.
To see this we note that for every $x \in X$ with $\|x\|_X = 1$
$$|T'(f)(x)| = |f(Tx)| \le \|f\|_{Y^*} \|Tx\|_Y \le \|f\|_{Y^*} \|T\|_{L(X,Y)},$$
so that (7.5) follows from the definition of the operator norm. To see that also $\|T'\| \ge \|T\|$ we will prove that
$$\|Tx\|_Y \le \|T'\|_{L(Y^*, X^*)} \|x\|_X \quad \text{for all } x \in X \qquad (7.6)$$
which implies that $\|T\| = \inf\{M : \|Tx\| \le M\|x\| \text{ for all } x \in X\} \le \|T'\|$.
This estimate (7.6) trivially holds true for all $x \in \ker(T)$, so suppose that $Tx \ne 0$. Then Proposition 6.5 (which was a consequence of the Theorem of Hahn-Banach) gives us an element $f \in Y^*$ with $\|f\|_{Y^*} = 1$ so that
$$f(Tx) = \|Tx\|.$$
Hence
$$\|Tx\| = f(Tx) = (T'(f))(x) \le \|T'(f)\|_{X^*} \|x\| \le \|T'\| \|f\| \|x\| = \|T'\| \|x\|$$
as claimed.
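To illustrate the proposition, the following sketch computes the dual of the right shift on $\ell^1$. It assumes the standard identification of $(\ell^1)^*$ with $\ell^\infty$ via $f_y(x) = \sum_j x_j y_j$; the names $R$, $L$ and $f_y$ are introduced only for this example.

```latex
% Assumption (illustration only): R is the right shift on \ell^1,
%   R(x_1, x_2, \dots) = (0, x_1, x_2, \dots),
% and f_y \in (\ell^1)^* is the functional represented by y \in \ell^\infty.
\[
  (R'(f_y))(x) = f_y(Rx) = \sum_{j \ge 1} x_j\, y_{j+1} = f_{Ly}(x),
  \qquad L(y_1, y_2, \dots) := (y_2, y_3, \dots).
\]
% So under this identification R' is the left shift L on \ell^\infty, and
\[
  \|R\|_{L(\ell^1)} = 1 = \|L\|_{L(\ell^\infty)} ,
\]
% in agreement with \|T'\| = \|T\|.
```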
You have seen in Part A Linear Algebra that for finite dimensional spaces there are several
relations involving kernels of maps/dual maps and annihilators of the images of dual maps/maps.
Many of these relations have an analogue for general normed spaces, but one needs to be careful
in particular with statements that involve spaces, such as the images $TX$ or $T'Y^*$, that are in
general not closed, and such statements often require us to take the closure of the corresponding
sets. Some of these relations will be proven on problem sheet 4.
Chapter 8
Spectral Theory
Before we discuss the spectrum of linear operators, we make some more remarks about the invertibility of linear operators, where we recall that by definition $T \in L(X)$ is invertible if it is bijective and $T^{-1} \in L(X)$.
We also remark that if $T$ is algebraically invertible then (iii) holds if and only if there exists $\delta > 0$ so that
$$\|Tx\| \ge \delta \|x\| \quad \text{for all } x \in X. \qquad (8.1)$$
While (8.1) of course does not imply that the map is surjective, it gives the following useful information on the image of $T$.
Proposition 8.1 (closed range). Let $X$ be a Banach space and let $T \in L(X)$ be so that (8.1) holds true. Then $T$ is injective and $TX \subset X$ is closed. In particular, if $TX$ is additionally dense in $X$ then $T$ is invertible.
WARNING. This result is wrong if $X$ is not assumed to be complete, and we also remark that the image of a general bounded linear operator between Banach spaces need not be closed. As an example consider the inclusion map $i : (C[0, 1], \|\cdot\|_{\sup}) \to (L^1[0, 1], \|\cdot\|_{L^1})$, which is a bounded linear operator whose image is the subspace of $L^1$ given by all continuous functions. This subspace cannot be closed in $L^1$ as it is a dense proper subspace of $L^1$.
Proof. The only statement whose proof is not trivial is that the image $TX$ is closed, which we can prove as follows: Given any sequence $(y_n)$ in $TX$ which converges to some $y \in X$, we let $x_n \in X$ be so that $Tx_n = y_n$. Then (8.1) implies that
$$\|x_n - x_m\| \le \delta^{-1} \|T(x_n - x_m)\| = \delta^{-1} \|y_n - y_m\| \to 0 \quad \text{as } n, m \to \infty,$$
i.e. that also $(x_n)$ is Cauchy and thus, as $X$ is complete, that $x_n \to x$ for some $x \in X$. As $T$ is continuous we thus get that $y = \lim y_n = \lim Tx_n = Tx \in TX$.
Lemma 8.2. Let $(X, \|\cdot\|)$ be a normed space, $S, T \in L(X)$. Suppose that $ST = TS$ and that $ST$ is invertible. Then also $S$ and $T$ are invertible.
Proof. By symmetry it suffices to prove the claim for $T$, and we argue by contradiction. So suppose that the claim is false. Then we either have that $T$ is not surjective, which is impossible as in this case we would have that $ST(X) = TS(X) = T(SX) \subset TX \subsetneq X$, so $ST$ would not be surjective, or there exists no $\delta > 0$ so that (8.1) holds. In this case we can choose $x_n \in X \setminus \{0\}$ so that $\frac{\|Tx_n\|}{\|x_n\|} \to 0$ and thus conclude that also
$$\frac{\|STx_n\|}{\|x_n\|} \le \|S\|_{L(X)} \frac{\|Tx_n\|}{\|x_n\|} \to 0,$$
which means that (8.1) does not hold true for $ST$, and hence that $ST$ does not have a bounded inverse.
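The assumption $ST = TS$ in Lemma 8.2 cannot simply be dropped. A standard example, sketched here for the shift operators on $\ell^2$ (the names $L$ and $R$ are introduced only for this illustration), is the following.

```latex
% Assumption (illustration only): X = \ell^2 with
%   R(x_1, x_2, \dots) = (0, x_1, x_2, \dots)   (right shift),
%   L(x_1, x_2, \dots) = (x_2, x_3, \dots)      (left shift).
\[
  LR = \mathrm{Id}, \qquad
  RL(x_1, x_2, x_3, \dots) = (0, x_2, x_3, \dots) \ne \mathrm{Id}.
\]
% So LR is invertible, but L is not injective (L e^{(1)} = 0) and
% R is not surjective (e^{(1)} \notin R(\ell^2)); hence neither L nor R
% is invertible. This does not contradict Lemma 8.2 since LR \ne RL.
```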
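For the rest of the chapter we consider $(X, \|\cdot\|)$ to be a normed space over $\mathbb{F} = \mathbb{C}$.
Definition 15. Let $T \in L(X)$. The resolvent set of $T$ is
$$\rho(T) := \{\lambda \in \mathbb{C} : T - \lambda \mathrm{Id} \text{ is invertible}\},$$
and for $\lambda \in \rho(T)$ we call $R_\lambda(T) := (T - \lambda \mathrm{Id})^{-1}$ the resolvent operator. The spectrum of $T$ is
$$\sigma(T) := \mathbb{C} \setminus \rho(T).$$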
We note that the resolvent set ρ(T ) consists of all those λ ∈ C for which the equation
T x − λx = y
has a unique solution for each y which furthermore depends continuously on the right hand side
y.
$$\|Tx - \lambda x\| \ge \delta \|x\|$$
We note that every eigenvalue $\lambda$ is also an approximate eigenvalue (as we may simply choose $x_n = x$ for an element $x$ of $\ker(T - \lambda \mathrm{Id})$ that is normalised to $\|x\| = 1$), so we have
$$\sigma_P(T) \subset \sigma_{AP}(T) \subset \sigma(T).$$
We also remark that $\lambda$ is an approximate eigenvalue if and only if (ii) from above holds, as (ii) is equivalent to
$$\inf_{x \in X,\ \|x\| = 1} \frac{\|Tx - \lambda x\|}{\|x\|} = 0,$$
where the linearity of $T$ means that this infimum is the same whether it is computed over all of $X \setminus \{0\}$ or just over $\{x \in X : \|x\| = 1\}$.
Remark (Other subsets of the spectrum (Off syllabus)).
One often also divides up the spectrum into the following disjoint subsets
8.2.1 Examples
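Example 2. (An operator on $\ell^\infty$ for which $\sigma_P \ne \sigma_{AP}$) Consider the operator $T \in L(\ell^\infty)$ defined by $T(x) = \left(\frac{x_j}{j}\right)_{j \in \mathbb{N}}$. Then each $\lambda = \frac{1}{j}$ is an eigenvalue as we have $T(e^{(j)}) = \frac{1}{j} e^{(j)}$ for $e^{(j)} = (\delta_{jk})_{k \in \mathbb{N}}$. While $\lambda = 0$ is not an eigenvalue, it is clearly an approximate eigenvalue as e.g. $(e^{(k)})_{k \in \mathbb{N}}$ gives a sequence in $\ell^\infty$ with $\|e^{(k)}\|_\infty = 1$ and $\|T(e^{(k)})\|_\infty \to 0$.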
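The two claims about $\lambda = 0$ in this example can be checked directly; the short computation below is only a spelled-out version of the argument above.

```latex
% 0 is an approximate eigenvalue: the unit vectors e^{(k)} satisfy
\[
  \|e^{(k)}\|_\infty = 1, \qquad
  \|T e^{(k)} - 0 \cdot e^{(k)}\|_\infty
    = \Big\| \tfrac{1}{k}\, e^{(k)} \Big\|_\infty = \tfrac{1}{k} \to 0 .
\]
% 0 is not an eigenvalue: T is injective, since
\[
  Tx = 0 \;\Longrightarrow\; \tfrac{x_j}{j} = 0 \ \text{for all } j
        \;\Longrightarrow\; x = 0 .
\]
% Hence 0 \in \sigma_{AP}(T) \setminus \sigma_P(T).
```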
Example 3. (An integral operator) Let $X = C([0, 1])$, as always equipped with the sup norm, and consider $T \in L(X)$ defined by $Tx(t) := \int_0^t x(s)\,ds$.
Let now $\lambda \ne 0$. Then we can use that the proof of Picard's Theorem from DE1 shows that for any $y \in C([0, 1])$ the integral equation $Tx - \lambda x = y$ has a unique solution $x = (T - \lambda \mathrm{Id})^{-1}(y)$; here we note that for $y \in C^1$ the equation is equivalent to the initial value problem $x'(t) - \lambda^{-1} x(t) = -\lambda^{-1} y'(t)$ on $[0, 1]$ with $x(0) = -\lambda^{-1} y(0)$, but that the proof of Picard from DE1 actually applies to give the existence of a unique solution of the integral equation also just for $y$ continuous.
Furthermore the fact that this solution depends continuously on $y$ can e.g. be obtained from Gronwall's lemma. Hence $\lambda$ is not in the spectrum.
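For this particular operator the resolvent can also be written down explicitly. The following sketch is not the argument used above but an alternative check; it uses the auxiliary function $u(t) := \int_0^t x(s)\,ds$, a name introduced only for this illustration.

```latex
% With u(t) := \int_0^t x(s) ds the equation Tx - \lambda x = y becomes
%   u - \lambda u' = y,  u(0) = 0,
% a linear first order ODE for u, which can be solved with an integrating factor:
\[
  u(t) = -\frac{1}{\lambda} \int_0^t e^{(t-s)/\lambda}\, y(s)\, ds ,
  \qquad
  x(t) = u'(t) = -\frac{1}{\lambda}\, y(t)
         - \frac{1}{\lambda^2} \int_0^t e^{(t-s)/\lambda}\, y(s)\, ds .
\]
% Since |e^{(t-s)/\lambda}| \le e^{1/|\lambda|} for 0 \le s \le t \le 1, this gives
\[
  \|x\|_{\sup} \le \Big( \frac{1}{|\lambda|}
      + \frac{e^{1/|\lambda|}}{|\lambda|^2} \Big) \|y\|_{\sup} ,
\]
% i.e. the solution depends continuously on y.
```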
An alternative proof that σ(T ) = {0}, based on the general properties of the spectrum that we
prove in the following section, is carried out on problem sheet 4.
We may now check that every point in the closed unit disc is an approximate eigenvalue and indeed that
$$\sigma(T) = \sigma_{AP}(T) = \overline{B_1(0)}$$
as Theorem 8.3, that we prove below, ensures that always $\sigma(T) \subset \overline{B_{\|T\|}(0)}$, so in the present situation where $\|T\| = 1$
$$\overline{B_1(0)} \subset \sigma_{AP}(T) \subset \sigma(T) \subset \overline{B_1(0)}$$
so all these sets need to agree.
Our first main result about the spectrum of bounded linear operators is
Theorem 8.3 (Properties of the Spectrum of bounded linear operators on Banach spaces). Let $(X, \|\cdot\|)$ be a complex Banach space. Then for any $T \in L(X)$ we have
(i) The resolvent set $\rho(T) \subset \mathbb{C}$ is open and the map
$$\rho(T) \ni \lambda \mapsto R_\lambda(T)$$
is analytic, i.e. for any $\lambda_0 \in \rho(T)$ there exists a neighbourhood $U$ of $\lambda_0$ and 'coefficients' $A_j(\lambda_0, T) \in L(X)$ so that for every $\lambda \in U$ the resolvent operator is given by the convergent power series
$$R_\lambda(T) = \sum_{j=0}^{\infty} (\lambda - \lambda_0)^j A_j(\lambda_0, T).$$
(ii) The spectrum $\sigma(T)$ is non-empty, compact and for every $\lambda \in \sigma(T)$ we have $|\lambda| \le \|T\|_{L(X)}$.
One of the most important aspects of the above theorem is the last part, i.e. that every bounded
operator has non-empty spectrum. Here we crucially use that the vector space is over C. The
claim is not true if we were to only consider the real spectrum as you already know from Linear
Algebra.
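A familiar instance of this failure over $\mathbb{R}$ is a rotation of the plane by a right angle; the matrix $A$ below is introduced only for this illustration.

```latex
% Rotation by 90 degrees, viewed as an operator on R^2 and on C^2:
\[
  A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},
  \qquad
  \det(A - \lambda\, \mathrm{Id}) = \lambda^2 + 1 .
\]
% Over R the polynomial \lambda^2 + 1 has no roots, so A - \lambda Id is
% invertible for every real \lambda and A has no real spectrum.
% Over C we obtain \sigma(A) = \{ i, -i \}, which is non-empty as the
% theorem predicts.
```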
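Let $\lambda_0$ be any element of the resolvent set, i.e. so that $(T - \lambda_0 \mathrm{Id})$ is invertible, and denote by $R_{\lambda_0}(T)$ its continuous inverse. Corollary 2.6 then implies that for any $S \in L(X)$ with $\|S\| < \delta := \|R_{\lambda_0}(T)\|^{-1}$ also $T - \lambda_0 \mathrm{Id} - S$ is invertible and its inverse can be written as
$$(T - \lambda_0 \mathrm{Id} - S)^{-1} = \big( (T - \lambda_0 \mathrm{Id})(\mathrm{Id} - R_{\lambda_0}(T) S) \big)^{-1} = (\mathrm{Id} - R_{\lambda_0}(T) S)^{-1} R_{\lambda_0}(T) = \sum_{j=0}^{\infty} (R_{\lambda_0}(T) S)^j R_{\lambda_0}(T),$$
where the Neumann series converges since $\|R_{\lambda_0}(T) S\| \le \|R_{\lambda_0}(T)\| \|S\| = \delta^{-1} \|S\| < 1$.
Given any $\lambda \in \mathbb{C}$ with $|\lambda - \lambda_0| < \delta$, we may apply this argument to $S = (\lambda - \lambda_0) \mathrm{Id}$, which has $\|S\| = |\lambda - \lambda_0|$, to obtain that $T - \lambda \mathrm{Id} = T - \lambda_0 \mathrm{Id} - S$ is invertible with inverse
$$R_\lambda(T) = (T - \lambda \mathrm{Id})^{-1} = \sum_{j=0}^{\infty} (\lambda - \lambda_0)^j R_{\lambda_0}(T)^{j+1}. \qquad (8.3)$$
Hence any such $\lambda \in B_\delta(\lambda_0) \subset \mathbb{C}$ is in the resolvent set, so the resolvent set is open and the resolvent operator is analytic in $\lambda$.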
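The expansion (8.3) can be checked by hand in the simplest possible case. The sketch below assumes $X = \mathbb{C}$ and $Tx = ax$ for a fixed number $a \in \mathbb{C}$, a choice made only for illustration, so that $\sigma(T) = \{a\}$ and $R_\lambda(T)$ is multiplication by $(a - \lambda)^{-1}$.

```latex
% Assumption (illustration only): X = C, T x = a x, \lambda_0 \ne a.
% Then R_{\lambda_0}(T) = (a - \lambda_0)^{-1} and (8.3) reads
\[
  \sum_{j=0}^{\infty} \frac{(\lambda - \lambda_0)^j}{(a - \lambda_0)^{j+1}}
  = \frac{1}{a - \lambda_0} \cdot
    \frac{1}{1 - \frac{\lambda - \lambda_0}{a - \lambda_0}}
  = \frac{1}{a - \lambda} = R_\lambda(T),
\]
% a geometric series which converges exactly for
\[
  |\lambda - \lambda_0| < |a - \lambda_0| = \|R_{\lambda_0}(T)\|^{-1} = \delta .
\]
```

In this example the disc $B_\delta(\lambda_0)$ reaches precisely up to the spectrum $\{a\}$, so the radius $\delta$ obtained in the proof cannot be improved in general.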
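To prove (ii) we first note that (i) implies that the spectrum $\sigma(T) = \mathbb{C} \setminus \rho(T)$ is closed. Furthermore, given any $\lambda$ with $|\lambda| > \|T\|$ we have that $\|\frac{1}{\lambda} T\| = \frac{1}{|\lambda|} \|T\| < 1$ and we hence obtain from Lemma 2.5 that $\mathrm{Id} - \frac{1}{\lambda} T$ is invertible and hence so is $T - \lambda \mathrm{Id} = -\lambda(\mathrm{Id} - \frac{1}{\lambda} T)$. As we will later use it, we also note that the Neumann series gives the bound
$$\|R_\lambda(T)\| = \frac{1}{|\lambda|} \Big\| \sum_{j=0}^{\infty} (\lambda^{-1} T)^j \Big\| \le \frac{1}{|\lambda|} \cdot \frac{1}{1 - |\lambda|^{-1} \|T\|} = \frac{1}{|\lambda| - \|T\|}. \qquad (8.4)$$
In particular every such $\lambda$ is in $\rho(T)$, which establishes the claim that $\sigma(T) \subset \{\lambda \in \mathbb{C} : |\lambda| \le \|T\|\}$ and hence that the spectrum is both bounded and closed, so compact.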
It remains to prove that the spectrum of any operator is non-empty, which we will prove by contradiction, using both the Theorem of Hahn-Banach (applied to functionals on $L(X)$, i.e. elements of $(L(X))^*$, instead of $X^*$) and Liouville's Theorem from Complex Analysis that the only holomorphic maps $g : \mathbb{C} \to \mathbb{C}$ which are bounded are the constant maps.
So suppose that $\sigma(T)$ is empty. Then the resolvent operator $R_\lambda(T)$ is defined for all $\lambda \in \mathbb{C}$, so given any $f \in (L(X))^*$ we can define a function $g_f : \mathbb{C} \to \mathbb{C}$ by
$$g_f(\lambda) := f(R_\lambda(T)).$$
We note that this function is not only well defined, but furthermore that for any $\lambda_0 \in \mathbb{C}$ the function $g_f$ is analytic in a neighbourhood of $\lambda_0$, namely by (8.3) and the linearity and continuity of $f$
$$g_f(\lambda) = \sum_{j=0}^{\infty} (\lambda - \lambda_0)^j f(R_{\lambda_0}(T)^{j+1}). \qquad (8.5)$$
We now claim that $g_f$ is also bounded. To see this we first note that as $g_f$ is continuous, it is bounded on any compact set, in particular on the closed disc $\overline{B_{2\|T\|}(0)}$. On the other hand, for any $\lambda \in \mathbb{C}$ with $|\lambda| \ge 2\|T\|$ we know from (8.4) that $\|R_\lambda(T)\| \le \frac{1}{|\lambda| - \|T\|} \le \frac{1}{\|T\|}$ and hence
$$|g_f(\lambda)| \le \|f\| \, \|R_\lambda(T)\| \le \frac{\|f\|}{\|T\|}.$$
From the Theorem of Liouville we thus obtain that $g_f$ must be constant, $g_f(\lambda) = C_f$ for a constant that depends only on the element $f \in (L(X))^*$ used in the definition of $g_f$. Returning to the expansion (8.5) we thus conclude that all terms with $j \ge 1$ must be zero, i.e. that for any number $\lambda_0 \in \mathbb{C}$ and any $k \ge 2$ we have that
$$f(R_{\lambda_0}(T)^k) = 0 \quad \text{for every } f \in (L(X))^*.$$
But by the Theorem of Hahn-Banach, or rather its consequence that we stated in Proposition 6.5, this implies that all the operators $R_{\lambda_0}(T)^k$, $k \ge 2$, must be zero, which is of course wrong since all of these operators are powers of invertible operators and thus invertible.
where we note that these two operators commute. If $\lambda^j$ is not in the spectrum of $T^j$, then the operator on the left is invertible and hence by Lemma 8.2 also $T - \lambda \mathrm{Id}$ must be invertible and thus $\lambda$ cannot be in the spectrum of $T$.
We thus know that $|\lambda| \le \inf_{j \in \mathbb{N}} \|T^j\|^{1/j}$. Indeed, one can show that $\|T^j\|^{1/j}$ converges as $j \to \infty$ with $\lim_{j \to \infty} \|T^j\|^{1/j} = \inf_{j \in \mathbb{N}} \|T^j\|^{1/j}$ and that this number agrees with the so called spectral radius which is defined to be
$$r(T) := \sup\{|\lambda| : \lambda \in \sigma(T)\}.$$
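The spectral radius can be strictly smaller than the operator norm. A minimal example, sketched here for a nilpotent matrix on $\mathbb{C}^2$ with the Euclidean norm (the matrix $N$ is introduced only for this illustration), is the following.

```latex
% Assumption (illustration only): X = C^2 with the Euclidean norm and
\[
  N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
  \qquad N^2 = 0, \qquad \|N\| = 1 .
\]
% The only eigenvalue of N is 0, so \sigma(N) = \{0\} and r(N) = 0 < 1 = \|N\|.
% On the other hand \|N^j\|^{1/j} = 0 for all j \ge 2, so
\[
  \inf_{j \in \mathbb{N}} \|N^j\|^{1/j}
  = \lim_{j \to \infty} \|N^j\|^{1/j} = 0 = r(N) .
\]
```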
Theorem 8.5. Let $(X, \|\cdot\|)$ be a complex Banach space. Then for any $T \in L(X)$ we have
$$r(T) = \lim_{j \to \infty} \|T^j\|^{1/j} = \inf_{j \in \mathbb{N}} \|T^j\|^{1/j}.$$
We remark that Lemma 8.4 is only a very special case of the following result, which is our second
main result about the spectrum of bounded linear operators on Banach spaces:
Theorem 8.6. Let $X$ be a complex Banach space, $T \in L(X)$ and let $p$ be a complex polynomial. Then
$$\sigma(p(T)) = p(\sigma(T)) := \{p(\lambda) : \lambda \in \sigma(T)\}.$$
Here we set $p(T) := \sum_{j=0}^{n} a_j T^j$ if the polynomial $p$ is given by $p(z) = \sum_{j=0}^{n} a_j z^j$, with the usual convention that $T^0 = \mathrm{Id}$.
Proof. We first remark that if $p$ is constant, say $p = c \in \mathbb{C}$, then the spectrum of $p(T) = c\,\mathrm{Id}$ is simply $\{c\}$ while the fact that $\sigma(T)$ is non-empty implies that also $p(\sigma(T)) = \{c\}$. So suppose that $p$ has degree $n \ge 1$ and let $\mu \in \mathbb{C}$ be any given number. As we are working in $\mathbb{C}$ we can factorise $p(\cdot) - \mu$ and write it as $p(z) - \mu = \alpha(z - \beta_1(\mu)) \cdots (z - \beta_n(\mu))$ for some $\alpha \ne 0$ and equally factorise
$$p(T) - \mu \mathrm{Id} = \alpha(T - \beta_1(\mu) \mathrm{Id}) \cdots (T - \beta_n(\mu) \mathrm{Id}) \qquad (8.6)$$
where we note that all operators on the right hand side commute, which will allow us to apply Lemma 8.2.
We now note that since the zeros $\beta_j(\mu)$ of $p(\cdot) - \mu$ can be equivalently characterised as the solutions $t = \beta_j(\mu)$ of the equation $p(t) = \mu$ we have that
$$\mu \in p(\sigma(T)) \iff \beta_j(\mu) \in \sigma(T) \text{ for some } j \in \{1, \dots, n\}.$$
We then note that applying Lemma 8.2 to (8.6) yields that if $\beta_j(\mu) \in \sigma(T)$ for some $j$ then $p(T) - \mu \mathrm{Id}$ cannot be invertible, i.e. $\mu$ must be an element of $\sigma(p(T))$. Hence $p(\sigma(T)) \subset \sigma(p(T))$. We now prove that also $p(\sigma(T))^c \subset \sigma(p(T))^c$ and hence that $\sigma(p(T)) \subset p(\sigma(T))$. To see this we note that if $\mu \notin p(\sigma(T))$ then $\beta_j(\mu) \notin \sigma(T)$, so $T - \beta_j(\mu) \mathrm{Id}$ is invertible, for all $j = 1, \dots, n$. But then (8.6) shows that $p(T) - \mu \mathrm{Id}$ is a composition of invertible operators, so invertible, and thus $\mu \in \rho(p(T)) = \sigma(p(T))^c$.
This theorem can in particular be applied if a given operator can be written as a polynomial of
a simpler operator.
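As a small illustration of how the theorem is used, consider the integral operator $T$ from Example 3, for which $\sigma(T) = \{0\}$. The polynomial $p$ below is a choice made only for this example.

```latex
% Assumption (illustration only): T x(t) = \int_0^t x(s) ds on C([0,1]),
% with \sigma(T) = \{0\}, and p(z) = 1 + z + z^2.
\[
  \sigma(\mathrm{Id} + T + T^2) = \sigma(p(T)) = p(\sigma(T)) = \{ p(0) \} = \{ 1 \} .
\]
% In particular 0 \notin \sigma(p(T)), so Id + T + T^2 is invertible in L(X),
% without having to solve the corresponding integral equation explicitly.
% The factorisation (8.6) behind this reads, for p(z) = z^2 and \mu \ne 0,
\[
  T^2 - \mu\, \mathrm{Id}
  = (T - \sqrt{\mu}\, \mathrm{Id})(T + \sqrt{\mu}\, \mathrm{Id}),
\]
% where \sqrt{\mu} denotes either of the two complex square roots of \mu.
```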
As a final result of this course, we prove that there is the following close connection between the
spectrum of an operator and the spectrum of its dual operator $T' \in L(X^*)$:
Theorem 8.7. Let $(X, \|\cdot\|)$ be a Banach space, let $T \in L(X)$ and let $T' \in L(X^*)$ be the corresponding dual operator defined by $(T'f)(x) = f(Tx)$. Then
$$\sigma(T) = \sigma_{AP}(T) \cup \sigma_P(T').$$
Claim 1: $\sigma_P(T') \subset \sigma(T)$
and