Functional Analysis Notes

The lecture notes provide a detailed summary of the Functional Analysis course taught by Sisto Baldo, covering topics such as Lebesgue measure, integration theory, and various theorems related to measure and functional spaces. Each lecture focuses on different aspects of functional analysis, including the comparison between Lebesgue and Riemann integrals, properties of normed spaces, and theorems like Hahn-Banach and Ascoli-Arzelà. The notes emphasize the importance of attending classes and engaging with reference texts for a comprehensive understanding of the material.

Uploaded by

Angelo Oppio
Copyright
© All Rights Reserved

Lecture Notes of Functional Analysis - Part 1

Degree Course: Master’s Program in Mathematics


Teacher: Sisto Baldo

These notes are just a fairly detailed summary of what went on in class.
In no way are they meant as a replacement for actual classes, human
interaction with the teacher, and/or the reading of reference texts. You are of
course strongly encouraged to take advantage of ALL these different learning
opportunities.

Contents

1 Lecture of October 2, 2024 (3 hours)
Lebesgue measure: motivations, a brief survey on Peano-Jordan measures,
outer Lebesgue measure and its elementary properties. Abstract outer
measures. Measurable sets in the sense of Caratheodory. Properties of the
measure on measurable sets. Measurable function, simple function, integral.
Beppo-Levi theorem. Integrable and summable functions, some properties of the
integral. The theorems by Fatou and Lebesgue.

2 Lecture of October 3, 2024 (3 hours)
Comparison between Lebesgue and Riemann integral. Topological vector spaces,
normed spaces, Banach spaces. Characterization of continuous linear
functionals.

3 Lecture of October 9, 2024 (3 hours)
Dual norm and its completeness. $L^p$ spaces, the space $C^1$. The space of
linear and continuous maps between two normed spaces. Equivalence of norms.
Equivalence of all norms in $\mathbb{R}^n$. The Hahn-Banach Theorem.

4 Lecture of October 10, 2024 (3 hours)
Hahn-Banach Theorem and first consequences. Hölder inequality. Minkowski
inequality. Completeness of $L^p$, convergence in $L^p$ and almost everywhere.

5 Lecture of October 16, 2024 (3 hours)
Dual of $L^p$. Minkowski functional of a convex open neighborhood of 0.
Geometric consequences of the Hahn-Banach theorem: separation of convex sets.

6 Lecture of October 17, 2024 (3 hours)
Baire Lemma. Banach-Steinhaus Theorem. Bidual and reflexive spaces.
Open mapping theorem.

7 Lecture of October 23, 2024 (3 hours)
End of the proof of the Open Mapping Theorem. Closed Graph Theorem.
(Non)compactness of the closed unit ball: Riesz Lemma and Riesz Theorem.
Characterization of compact sets in a metric space. Ascoli-Arzelà Theorem
(statement).

8 Lecture of October 24, 2024 (3 hours)
Proof of the Ascoli-Arzelà Theorem. Weak convergence. Banach-Alaoglu
theorem. Some remarks on weak convergence. Weak closure of closed and
convex sets. Sequential weak lower semicontinuity of convex and continuous
functions. Existence of points of minimal norm in a closed convex set in a
reflexive Banach space. Scalar products, induced norm, parallelogram law.

9 Lecture of October 30, 2024 (3 hours)
Hilbert spaces. Projection on a closed convex set in a Hilbert space.
Characterization of the nearest point projection through the scalar product.
Orthogonal decomposition of a Hilbert space. Dual of a Hilbert space.
Orthonormal families and Fourier coefficients. Bessel's inequality.

10 Lecture of October 31, 2024 (3 hours)
Abstract Fourier series in a Hilbert space. Hilbert basis. Fourier series of
periodic functions. Completeness of the trigonometric system in $L^2(2\pi)$.
Weak convergence and weak compactness in a Hilbert space. (Beginning of the)
proof of the Banach-Alaoglu theorem in Hilbert spaces.

11 Lecture of November 6, 2024 (3 hours)
End of the proof of the Banach-Alaoglu theorem in Hilbert spaces. Lusin's
Theorem. Tietze's Theorem. Density of continuous functions in $L^p$.

12 Lecture of November 7, 2024 (3 hours)
Regularization by convolution. Hints on Radon Measures and the density of
continuous functions in $L^p(\mu)$. Caratheodory criterion. Absolute continuity
and the Radon-Nikodym theorem.

13 Lecture of November 13, 2024 (3 hours)
Signed measures. Hahn and Jordan decomposition of signed measures. Dual
of $L^p$.

14 Lecture of November 14, 2024 (3 hours)
Fundamental lemma of the calculus of variations. Sobolev spaces in dimension
one. AC functions. Sobolev spaces in dimension one and AC functions.
Compactness theorem for Sobolev spaces in dimension 1. Sobolev spaces in
dimension $n$.

15 Lecture of November 20, 2024 (3 hours)
There are discontinuous Sobolev functions in dimension $n$. Meyers-Serrin
theorem. Sobolev embedding theorem. Poincaré's Inequality. Existence of
weak solutions for the Poisson equation. Proof of the Sobolev embedding
theorem.

16 Lecture of November 21, 2024 (3 hours)
Regularity of weak solutions in dimension 1. Morrey's embedding theorem.
Sobolev-Morrey embedding in $W^{1,p}(\Omega)$ (with $\Omega$ a regular open
set). Rellich's compactness theorem. Weak convergence and weak compactness in
$W^{1,p}$.

1 Lecture of October 2, 2024 (3 hours)
The lecture begins with a very brief presentation of the course: syllabus,
learning material...
We will begin by recalling (or introducing) the basics of Lebesgue measure
and integration theory. Meanwhile, and with very little additional effort, we
will learn about abstract measures and integrals. For this and part of the
following lecture, in class we only gave a very brief and sketchy outline of
the material discussed in detail in the following pages, because most of the
students already covered the subject in bachelor classes.
In your previous calculus courses, you probably saw the definition of
Peano-Jordan measure, which is one of the simplest and most natural methods
of defining (in a rigorous way) the area of a subset of the plane, or the
volume of a subset of $\mathbb{R}^3$...
Let us recall the main definitions:
DEFINITION: An interval or rectangle in $\mathbb{R}^n$ is a subset
$I \subset \mathbb{R}^n$ which is the cartesian product of 1-dimensional
intervals: $I = (a_1, b_1) \times (a_2, b_2) \times \ldots \times (a_n, b_n)$.
We allow one or both of the endpoints of these 1-dimensional intervals
to be included, and we also allow empty or degenerate intervals. The measure
of the interval $I$ above is, by definition, the number
\[
|I| = \prod_{i=1}^{n} (b_i - a_i).
\]

One readily checks that for n = 2 our interval is a rectangle with edges
parallel to the axes (and its measure coincides with its area), while for n = 3
I will be a rectangular prism, and the measure is simply the volume.
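
The definition of $|I|$ can be tried out numerically. The following is only a minimal sketch: the function name `interval_measure` and the choice of representing an interval by its list of endpoint pairs are assumptions made for this illustration, not part of the notes.

```python
from math import prod

def interval_measure(sides):
    """Measure |I| of the interval I = (a1, b1) x ... x (an, bn).

    `sides` is a list of (a_i, b_i) pairs. An empty or degenerate
    factor (b_i <= a_i) contributes 0, so the whole product is 0.
    """
    return prod(max(b - a, 0.0) for a, b in sides)

# n = 2: a rectangle with edges 2 and 3 (its area).
rectangle = interval_measure([(0.0, 2.0), (1.0, 4.0)])
# n = 3: a rectangular prism with edges 1, 2, 3 (its volume).
prism = interval_measure([(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)])
# A degenerate interval of the plane (a horizontal segment) has measure 0.
degenerate = interval_measure([(0.0, 1.0), (5.0, 5.0)])
```

Whether the endpoints are included or not plays no role here, in agreement with the definition.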
A subset $A$ of $\mathbb{R}^n$ is called Peano-Jordan measurable if its
"n-dimensional volume" can be approximated, both from within and from without,
by means of finite unions of intervals. More precisely, we have the following
DEFINITION (Measurable set in the sense of Peano-Jordan): A subset
$A \subset \mathbb{R}^n$ is measurable in the sense of Peano-Jordan if it is
bounded and for every $\varepsilon > 0$ there are a finite number of intervals
$I_1, \ldots, I_N, J_1, \ldots, J_K \subset \mathbb{R}^n$ such that the $I_i$
have pairwise disjoint interiors, the $J_i$ have pairwise disjoint interiors,
\[
\bigcup_{i=1}^{N} I_i \subset A \subset \bigcup_{i=1}^{K} J_i
\]
and finally
\[
\sum_{i=1}^{K} |J_i| - \sum_{i=1}^{N} |I_i| \le \varepsilon.
\]
If this is the case, we define the Peano-Jordan measure of $A$ as
\[
\begin{aligned}
|A| &= \sup\Big\{ \sum_{i=1}^{N} |I_i| : I_i \text{ with pairwise disjoint interiors},\ \bigcup_{i=1}^{N} I_i \subset A \Big\} \\
    &= \inf\Big\{ \sum_{i=1}^{K} |J_i| : J_i \text{ with pairwise disjoint interiors},\ \bigcup_{i=1}^{K} J_i \supset A \Big\}.
\end{aligned}
\]
In the last expression, we can drop the requirement that the intervals $J_i$
have pairwise disjoint interiors: the infimum takes care of that!
We immediately check that rectangles are Peano-Jordan measurable, while
the set of points with rational coordinates within a rectangle is not.
Likewise, given a Riemann-integrable function $f : [a, b] \to \mathbb{R}$,
$f \ge 0$, the region bounded by the graph of $f$, the $x$ axis and the lines
$x = a$, $x = b$ is Peano-Jordan measurable, and its measure coincides with
the Riemann integral of $f$. More generally:
EXERCISE: Let $g, h : [a, b] \to \mathbb{R}$ be two Riemann-integrable
functions of one variable with $g(x) \le h(x)$ for all $x \in [a, b]$. Consider
the set $A = \{(x, y) \in \mathbb{R}^2 : x \in [a, b],\ g(x) \le y \le h(x)\}$.
Show that $A$ is Peano-Jordan measurable and
\[
|A| = \int_a^b (h(x) - g(x))\, dx.
\]
A set of this kind is called a simple set with respect to the x-axis... Simple
sets with respect to the y-axis are defined in a similar way, and there are
natural extensions in higher dimension.
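
The approximation from within and from without can be made concrete for a simple set. The sketch below is an illustration under extra assumptions that are not in the exercise: $g$ and $h$ are taken non-decreasing, so that their infimum and supremum on each subinterval are attained at the endpoints, and the name `inner_outer_sums` is hypothetical.

```python
def inner_outer_sums(g, h, a, b, n):
    """Inner and outer rectangle sums for A = {a <= x <= b, g(x) <= y <= h(x)}.

    Assumes g and h are non-decreasing on [a, b]. The strip over each of
    the n subintervals contributes one inner and one outer rectangle;
    neighbouring rectangles only share edges, so interiors are disjoint.
    """
    width = (b - a) / n
    inner = outer = 0.0
    for k in range(n):
        left, right = a + k * width, a + (k + 1) * width
        # Inner rectangle [left, right] x [sup g, inf h], dropped if empty.
        inner += width * max(h(left) - g(right), 0.0)
        # Outer rectangle [left, right] x [inf g, sup h].
        outer += width * (h(right) - g(left))
    return inner, outer

# Region between g(x) = 0 and h(x) = x^2 on [0, 1]; its area is 1/3.
inner, outer = inner_outer_sums(lambda x: 0.0, lambda x: x * x, 0.0, 1.0, 1000)
```

Here `outer - inner` is about $1/n$, so it can be made smaller than any given $\varepsilon$ by increasing `n`, which is exactly what the definition of Peano-Jordan measurability asks for.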
Peano-Jordan measure is a nice object, but it behaves badly with respect
to countable operations: it is certainly true that a finite union of
P.J.-measurable sets is again P.J.-measurable, but this is false for countable
unions: for instance, the set of points with rational coordinates within a
rectangle is a countable union of points, which are of course measurable.
For this and other reasons, it is convenient to introduce a more general
notion of measure, which will be Lebesgue measure. The definition is very
similar to that of P.J. measure, but we will allow countable unions of intervals.
DEFINITION (Outer Lebesgue Measure): If $A \subset \mathbb{R}^n$, its outer
Lebesgue measure is defined as
\[
m(A) = \inf\Big\{ \sum_{i=1}^{\infty} |I_i| : I_i \text{ intervals},\ \bigcup_{i=1}^{\infty} I_i \supset A \Big\}.
\]
Notice that we do not require that the intervals have disjoint interiors.
Moreover, since we allow degenerate or empty intervals, finite coverings are
possible.

Lebesgue measure enjoys the following elementary properties:
THEOREM (Elementary properties of Lebesgue outer measure): Let
$m : \mathcal{P}(\mathbb{R}^n) \to [0, +\infty]$ denote (outer) Lebesgue
measure [1]. The following hold:
(i) $m(\emptyset) = 0$, $m(\{x\}) = 0$ for every $x \in \mathbb{R}^n$.

(ii) If $A \subset \bigcup_{i=1}^{\infty} A_i$, with
$A, A_1, A_2, \ldots \subset \mathbb{R}^n$, then
\[
m(A) \le \sum_{i=1}^{\infty} m(A_i)
\]
(countable subadditivity of Lebesgue measure). In particular, $A \subset B$
implies $m(A) \le m(B)$ (monotonicity of Lebesgue measure).
(iii) In the definition of Lebesgue measure, we may ask without loss of
generality that the intervals $I_i$ are open.
(iv) $m(I) = |I|$ for every interval $I \subset \mathbb{R}^n$. Moreover,
$m(\mathbb{R}^n) = +\infty$.

PROOF: (i) is a simple exercise. To prove (ii), we begin by recalling that the
sum of a series of non-negative numbers does not depend on the order of
summation.
Fix $\varepsilon > 0$ and an index $i$: by definition of infimum, we find a
sequence of intervals $\{I^i_j\}_j$ such that
$\bigcup_{j=1}^{\infty} I^i_j \supset A_i$ and
\[
\sum_{j=1}^{\infty} |I^i_j| < m(A_i) + \frac{\varepsilon}{2^i}.
\]
Then $\{I^i_j\}_{i,j}$ is a countable covering of $A$ whose members are
intervals, and by definition of Lebesgue measure we get
\[
m(A) \le \sum_{i,j=1}^{\infty} |I^i_j| \le \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} |I^i_j|
\le \sum_{i=1}^{\infty} \Big( m(A_i) + \frac{\varepsilon}{2^i} \Big) = \sum_{i=1}^{\infty} m(A_i) + \varepsilon,
\]
and (ii) follows because $\varepsilon$ can be taken arbitrarily small.


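Monotonicity is an immediate consequence of (ii). Let us show (iii): if
$A \subset \mathbb{R}^n$, for every $\varepsilon > 0$ we can find intervals
$I_j$ such that $\bigcup_{j=1}^{\infty} I_j \supset A$ and
\[
\sum_{j=1}^{\infty} |I_j| < m(A) + \frac{\varepsilon}{2}.
\]
For every $j = 1, 2, \ldots$ let $I'_j \supset I_j$ be a slightly larger open
interval, chosen in such a way that
$|I'_j| < |I_j| + \frac{\varepsilon}{2^{j+1}}$. We then get
\[
\sum_{j=1}^{\infty} |I'_j| < \sum_{j=1}^{\infty} \Big( |I_j| + \frac{\varepsilon}{2^{j+1}} \Big)
< m(A) + \frac{\varepsilon}{2} + \frac{\varepsilon}{2}
\]
and (iii) is proved.

[1] $\mathcal{P}(\mathbb{R}^n)$ denotes the set of all subsets of $\mathbb{R}^n$.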


Surprisingly enough, (iv) is slightly harder to prove. The inequality
$m(I) \le |I|$ is obvious, because $I$ is a covering of itself. By (iii), the
opposite inequality immediately follows from the following
CLAIM: If $I$ is an interval, then for every sequence $I_j$ of open intervals
with $\bigcup_{j=1}^{\infty} I_j \supset I$ one has
\[
(*) \qquad |I| \le \sum_{j=1}^{\infty} |I_j|.
\]
Indeed, (*) is easy enough if there is only a finite number of intervals $I_j$,
a bit less in the general case of a countable covering. On the other hand,
if $J \subset I$ is closed and bounded, we can use compactness to choose a
finite number of intervals $I_1, I_2, \ldots, I_N$ in our covering in such a
way that we still have $J \subset \bigcup_{j=1}^{N} I_j$. Since (*) holds for
finite coverings, we deduce that
\[
|J| \le \sum_{j=1}^{N} |I_j| \le \sum_{j=1}^{\infty} |I_j|.
\]
But of course $J$ can be chosen in such a way that its measure is arbitrarily
close to that of $I$, and (*) is proved. Q.E.D.
As a trivial consequence of this theorem, every countable subset of
$\mathbb{R}^n$ has measure 0, because points have measure 0 and Lebesgue
measure is countably subadditive.
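
The same conclusion can be seen directly from the definition, with the $\varepsilon/2^i$ device used in the proof of (ii): the $i$-th point of the countable set is covered by an interval of length $\varepsilon/2^i$. The sketch below does this for the first 50 rationals of $[0,1]$ with exact arithmetic. The enumeration of the rationals and the helper names are choices made for this illustration only.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], enumerated by increasing denominator."""
    seen, out, den = set(), [], 1
    while len(out) < count:
        for num in range(den + 1):
            q = Fraction(num, den)
            if q not in seen:
                seen.add(q)
                out.append(q)
                if len(out) == count:
                    break
        den += 1
    return out

def covering(points, eps):
    """Cover the i-th point (i = 1, 2, ...) by an interval of length eps / 2**i."""
    return [(q - eps / 2 ** (i + 1), q + eps / 2 ** (i + 1))
            for i, q in enumerate(points, start=1)]

eps = Fraction(1, 100)
points = rationals_in_unit_interval(50)
# Sum of the lengths: eps * (1/2 + 1/4 + ... + 1/2**50), which is below eps.
total_length = sum(b - a for a, b in covering(points, eps))
```

However many points are covered, the total length stays below `eps`, so the outer measure of the whole countable set is at most $\varepsilon$ for every $\varepsilon > 0$.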

Lebesgue measure in $\mathbb{R}^n$ is only a particular case of a more general
object, called an outer measure:
DEFINITION (Outer measure): An outer measure on a set $X$ is a function
$\mu : \mathcal{P}(X) \to [0, +\infty]$ such that $\mu(\emptyset) = 0$, and
which is countably subadditive: if $A, A_1, A_2, A_3, \ldots \subset X$ and
$A \subset \bigcup_{j=1}^{\infty} A_j$, then
\[
\mu(A) \le \sum_{j=1}^{\infty} \mu(A_j).
\]

Of course, monotonicity of $\mu$ follows from countable subadditivity: if
$A \subset B$ then $\mu(A) \le \mu(B)$.
A new example of outer measure is obtained by restricting Lebesgue measure
to a subset $A_0 \subset \mathbb{R}^n$: this is the measure $\tilde m$ defined as
\[
\tilde m(A) := m(A \cap A_0).
\]
Another example is the measure $\delta_0$ (Dirac's delta at 0), which is the
measure on $\mathbb{R}^n$ defined by
\[
\delta_0(A) = \begin{cases} 1 & \text{if } 0 \in A, \\ 0 & \text{otherwise.} \end{cases}
\]
Yet another example is the counting measure, defined by
\[
\#(A) = \begin{cases} \text{number of elements in } A & \text{if } A \text{ is finite}, \\ +\infty & \text{otherwise.} \end{cases}
\]

In general, Lebesgue measure does not enjoy good properties on every
subset of $\mathbb{R}^n$: it behaves much better on a particular class of sets,
the measurable sets:
DEFINITION (Lebesgue-measurable sets in the sense of Caratheodory): A
set $A \subset \mathbb{R}^n$ is Lebesgue-measurable or $m$-measurable if the
following equality
\[
m(T) = m(T \cap A) + m(T \setminus A)
\]
holds for every subset $T \subset \mathbb{R}^n$. Roughly speaking, we are
requiring that $A$ "splits well" the measure of every subset of $\mathbb{R}^n$.
Notice that due to countable subadditivity we always have
$m(T) \le m(T \cap A) + m(T \setminus A)$: so, to prove measurability it is
enough to prove the opposite inequality
\[
m(T) \ge m(T \cap A) + m(T \setminus A) \qquad \forall T \subset \mathbb{R}^n.
\]
In the same way, given an outer measure $\mu$ on a set $X$, a set
$A \subset X$ is said to be $\mu$-measurable if
$\mu(T) = \mu(T \cap A) + \mu(T \setminus A)$ for every $T \subset X$.
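
On a finite set $X$ the Caratheodory condition can be checked by brute force over all test sets $T$. The sketch below is an illustration only: the three-point set, the helper names and in particular the outer measure `coarse` are hypothetical examples that do not appear in the notes. It shows that for the counting measure and for Dirac's delta every subset is measurable, while an outer measure may have very few measurable sets.

```python
from itertools import combinations

def subsets(X):
    """All subsets of the finite set X, as frozensets (smallest first)."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_measurable(A, mu, X):
    """Caratheodory test: mu(T) == mu(T & A) + mu(T - A) for every T in P(X)."""
    return all(mu(T) == mu(T & A) + mu(T - A) for T in subsets(X))

X = frozenset({0, 1, 2})

def counting(A):
    return len(A)

def dirac_at_0(A):
    return 1 if 0 in A else 0

def coarse(A):
    # Hypothetical outer measure: 0 on the empty set, 2 on X, 1 otherwise.
    if not A:
        return 0
    return 2 if A == X else 1

measurable_sets = {
    name: [set(A) for A in subsets(X) if is_measurable(A, mu, X)]
    for name, mu in [("counting", counting), ("dirac", dirac_at_0), ("coarse", coarse)]
}
```

For `coarse` the test fails, for instance, with $A = \{0\}$ and $T = \{0, 1\}$: $\mu(T) = 1$ while $\mu(T \cap A) + \mu(T \setminus A) = 2$.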
REMARK: In the near future we will need the following fact: if
$A \subset \mathbb{R}^n$ is Lebesgue-measurable and $\tilde m$ denotes the
restriction of Lebesgue measure to any subset $A_0 \subset \mathbb{R}^n$, then
$A$ is also $\tilde m$-measurable. Indeed, if $T \subset \mathbb{R}^n$ we have
\[
\begin{aligned}
\tilde m(T) = m(T \cap A_0) &= m((T \cap A_0) \cap A) + m((T \cap A_0) \setminus A) \\
&= m((T \cap A) \cap A_0) + m((T \setminus A) \cap A_0) = \tilde m(T \cap A) + \tilde m(T \setminus A).
\end{aligned}
\]
This is clearly still true, with the same proof, if $m$ and $\tilde m$ are
replaced with an arbitrary outer measure $\mu$ and its restriction $\tilde\mu$
to a set $A_0$.

The following theorem shows two main things: first, if we start with
measurable sets and take countable unions, complements, countable
intersections, we don't leave the category of measurable sets. Moreover,
Lebesgue measure or an abstract outer measure shows some very good properties
when restricted to the measurable sets. The main one is countable additivity:
the measure of the union of a countable family of pairwise disjoint measurable
sets is simply the sum of their measures.
THEOREM (Properties of measurable sets and of the measure restricted to
measurable sets): Let $\mu$ be an outer measure on a set $X$. The following
facts hold true:
(i) If $A$ is $\mu$-measurable, then $A^C = X \setminus A$ is
$\mu$-measurable. Moreover, if $\mu(A) = 0$ then $A$ is $\mu$-measurable.
(ii) Countable unions or intersections of $\mu$-measurable sets are
$\mu$-measurable.
(iii) If $\{A_i\}_i$ is a countable family of pairwise disjoint
$\mu$-measurable sets and $A = \bigcup_{i=1}^{\infty} A_i$, then
\[
\mu(A) = \sum_{i=1}^{\infty} \mu(A_i)
\]
(countable additivity of $\mu$ on measurable sets).
(iv) If $\{A_i\}$ is an increasing sequence of $\mu$-measurable sets, i.e. if
$A_1 \subset A_2 \subset A_3 \subset \ldots$, and
$A = \bigcup_{i=1}^{\infty} A_i$, then
\[
\mu(A) = \lim_{i \to +\infty} \mu(A_i).
\]
(v) If $\{A_i\}$ is a decreasing sequence of $\mu$-measurable sets, i.e. if
$A_1 \supset A_2 \supset A_3 \supset \ldots$, if $\mu(A_1) < +\infty$ and if
we denote $A = \bigcap_{i=1}^{\infty} A_i$, then
\[
\mu(A) = \lim_{i \to +\infty} \mu(A_i).
\]

PROOF: (i) is obvious if we remark that the measurability condition can be
rewritten as
\[
\mu(T) \ge \mu(T \cap A) + \mu(T \cap A^C) \qquad \forall T \subset X.
\]
Also, it is trivial that a set with measure 0 is measurable. In particular, we
deduce that $\emptyset$ and $X$ are $\mu$-measurable.
Let us show for the moment a weaker version of (ii): if $A$ and $B$ are
measurable, then $A \cup B$ is measurable. Indeed, if $T \subset X$ we have
\[
\begin{aligned}
\mu(T) &= \mu(T \cap A) + \mu(T \setminus A) \\
&= \mu((T \cap A) \cap B) + \mu((T \cap A) \setminus B) + \mu((T \setminus A) \cap B) + \mu((T \setminus A) \setminus B).
\end{aligned}
\]
Look at the last row: the union of the sets within the first 3 terms is exactly
$T \cap (A \cup B)$: by the subadditivity of $\mu$ we then infer that the sum
of those terms is $\ge \mu(T \cap (A \cup B))$. Since the set in the last term
is simply $T \setminus (A \cup B)$, we then get:
\[
\mu(T) \ge \mu(T \cap (A \cup B)) + \mu(T \setminus (A \cup B)),
\]
and $A \cup B$ is measurable.
From this and (i) we also get the measurability of $A \cap B$, since
$A \cap B = (A^C \cup B^C)^C$. By induction, it follows that finite unions and
intersections of measurable sets are measurable. We will complete the proof of
(ii) (i.e., for countable unions and intersections) only at the end.
Let us prove (iii): the claim is easy to show for the union of two disjoint
measurable sets $A$ and $B$, because
$\mu(A \cup B) = \mu((A \cup B) \cap A) + \mu((A \cup B) \setminus A) = \mu(A) + \mu(B)$.
By induction, (iii) holds for the union of a finite family of pairwise disjoint
measurable sets.
In the general case of a countable family, countable subadditivity already
gives $\mu(A) \le \sum_{i=1}^{\infty} \mu(A_i)$, while monotonicity ensures
that for any $N \in \mathbb{N}$
\[
\mu(A) \ge \mu\Big( \bigcup_{i=1}^{N} A_i \Big) = \sum_{i=1}^{N} \mu(A_i),
\]
where the last equality holds because we proved (iii) for finite unions. By
taking the supremum over all $N$ we get
\[
\mu(A) \ge \sum_{i=1}^{\infty} \mu(A_i),
\]
and (iii) is proved.


Let us show (iv): it suffices to apply (iii) on the sequence of pairwise
disjoint measurable sets defined by $B_1 = A_1$, $B_i = A_i \setminus A_{i-1}$
($i \ge 2$). We get
\[
\mu(A) = \sum_{i=1}^{\infty} \mu(B_i) = \lim_{N \to +\infty} \sum_{i=1}^{N} \mu(B_i) = \lim_{N \to +\infty} \mu(A_N).
\]
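We now prove (v): we define the increasing sequence of measurable sets
$B_i = A_1 \setminus A_i$, $i = 2, 3, \ldots$. It follows that
\[
A_1 = A \cup \bigcup_{i=2}^{\infty} B_i
\]
and by subadditivity and (iv) we get
\[
\mu(A_1) \le \mu(A) + \lim_{i \to +\infty} [\mu(A_1) - \mu(A_i)],
\]
where we used the identity $\mu(B_i) = \mu(A_1) - \mu(A_i)$, which follows
from additivity on the disjoint measurable sets $A_i$ and $B_i$ because
$\mu(A_1) < +\infty$. Hence $\lim_{i \to +\infty} \mu(A_i) \le \mu(A)$. The
opposite inequality holds by monotonicity, and (v) is proved.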
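
The assumption $\mu(A_1) < +\infty$ in (v) cannot be dropped. The following standard counterexample is not part of the notes and is added here only as an illustration, with $m$ the Lebesgue measure on $\mathbb{R}$:

```latex
\[
A_i = [i, +\infty), \qquad A_1 \supset A_2 \supset A_3 \supset \ldots, \qquad
A = \bigcap_{i=1}^{\infty} A_i = \emptyset,
\]
\[
m(A_i) = +\infty \ \text{ for every } i, \qquad \text{but} \qquad
m(A) = 0 \neq +\infty = \lim_{i \to +\infty} m(A_i).
\]
```

In the proof of (v), this is exactly the point where the subtraction $\mu(A_1) - \mu(A_i)$ would become meaningless.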
To conclude the proof of the theorem, we only need to show (ii).
Let $A = \bigcup_{i=1}^{\infty} A_i$, where the sets $A_i$ are measurable. We
must show that $A$ is measurable.
Let $T \subset X$ and consider the increasing sequence of measurable sets
$B_N := \bigcup_{i=1}^{N} A_i$: these are measurable because of the finite
version of (ii) we already proved. But they are also measurable for the outer
measure $\tilde\mu$ obtained by taking the restriction of $\mu$ to the set $T$
(i.e. the measure defined by $\tilde\mu(A) := \mu(T \cap A)$ for all
$A \subset X$). By monotonicity we get:
\[
(***) \qquad \mu(T) = \mu(T \cap B_N) + \mu(T \setminus B_N) \ge \mu(T \cap B_N) + \mu(T \setminus A).
\]
On the other hand, by applying (iv) to the measure $\tilde\mu$ we obtain
\[
\lim_{N \to +\infty} \mu(T \cap B_N) = \lim_{N \to +\infty} \tilde\mu(B_N) = \tilde\mu(A) = \mu(T \cap A)
\]
and measurability of $A$ follows by passing to the limit for $N \to +\infty$
in (***).
Measurability of $\bigcap_{i=1}^{\infty} A_i$ follows as usual by writing
\[
\bigcap_{i=1}^{\infty} A_i = \Big( \bigcup_{i=1}^{\infty} A_i^C \Big)^C.
\]
Q.E.D.
The family of measurable sets of an outer measure forms what is called a
σ-algebra. Moreover, an outer measure restricted to the family of its
measurable sets is called a measure:

DEFINITION (σ-algebra, measure): Given a set $X$, a family
$\mathcal{A} \subset \mathcal{P}(X)$ of subsets of $X$ is called a σ-algebra
if $X \in \mathcal{A}$, $A^C \in \mathcal{A}$ whenever $A \in \mathcal{A}$, and
if $\bigcup_{i=1}^{\infty} A_i \in \mathcal{A}$ whenever $A_i \in \mathcal{A}$
for all $i = 1, 2, \ldots$.
Given $X$ and a σ-algebra $\mathcal{A}$ on $X$, a measure is a function
$\mu : \mathcal{A} \to [0, +\infty]$ such that $\mu(\emptyset) = 0$ and which
is countably additive: if $A_1, A_2, A_3, \ldots$ belong to $\mathcal{A}$ and
are pairwise disjoint, then
\[
\mu\Big( \bigcup_{i=1}^{\infty} A_i \Big) = \sum_{i=1}^{\infty} \mu(A_i).
\]
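
For a family of subsets of a finite set the three axioms can be checked mechanically. The following is a small sketch under that finiteness assumption (the function name `is_sigma_algebra` and the example families are made up for the illustration): since a finite family contains only finitely many distinct sets, closure under countable unions reduces to closure under pairwise unions.

```python
def is_sigma_algebra(family, X):
    """Check the sigma-algebra axioms for a family of subsets of a finite set X."""
    family = {frozenset(A) for A in family}
    X = frozenset(X)
    return (X in family                                       # X belongs to the family
            and all(X - A in family for A in family)          # closed under complement
            and all(A | B in family for A in family for B in family))  # closed under union

X = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, X]
bad = [set(), {1}, {2}, X]      # the complements {2, 3, 4} and {1, 3, 4} are missing
trivial = [set(), X]
```

With these families, `is_sigma_algebra(good, X)` and `is_sigma_algebra(trivial, X)` hold, while `is_sigma_algebra(bad, X)` does not.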

After quite general results, which hold for all outer measures, let us go
back for the moment to Lebesgue measure: the following theorem shows that
there are plenty of Lebesgue measurable sets.
THEOREM (Regularity of Lebesgue measure): Open and closed subsets of
$\mathbb{R}^n$ are Lebesgue-measurable. Moreover, if $A \subset \mathbb{R}^n$
is Lebesgue-measurable, then for every $\varepsilon > 0$ there exist $B$ open,
$C$ closed with $C \subset A \subset B$ and $m(B \setminus C) < \varepsilon$.
To prove the theorem, we need an easy topological fact: the following
proposition shows that every open subset of $\mathbb{R}^n$ is a countable
union of intervals.
PROPOSITION: Every open set $A \subset \mathbb{R}^n$ is a countable union of
open intervals.
PROOF: Consider the family $\mathcal{F}$ of all cubes in $\mathbb{R}^n$ of the
type $(q_1 - r, q_1 + r) \times (q_2 - r, q_2 + r) \times \ldots \times (q_n - r, q_n + r)$,
where all $q_i$ and $r$ are rational numbers.
This is clearly a countable family of intervals.
Let us show that $A$ is the union of the following subfamily:
\[
\mathcal{F}' = \{I \in \mathcal{F} : I \subset A\}.
\]
Indeed, since $A$ is open, for every $x \in A$ there exists an open ball
$B_{r(x)}(x) \subset A$. Within this ball there is a cube with center in $x$,
within which we can find an element $I_x \in \mathcal{F}$ with $x \in I_x$:
this is true because the rationals are dense in $\mathbb{R}$. Since
$I_x \subset A$, we have $I_x \in \mathcal{F}'$. But then
\[
A = \bigcup_{I \in \mathcal{F}'} I.
\]
Q.E.D.
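
The density argument in this proof is constructive. The sketch below shows one possible way to produce the rational cube $I_x$; the function name `rational_cube`, the use of dyadic rationals and the sample point are assumptions made for the illustration. It takes as input the half-side `rho` of a cube centered at $x$ and contained in $A$ (for a ball of radius $r(x)$ one may take `rho` equal to $r(x)/\sqrt{n}$).

```python
from fractions import Fraction

def rational_cube(x, rho):
    """Return (q, r): a rational center q and a rational half-side r such that
    x lies in the open cube prod (q_i - r, q_i + r), and this cube lies in
    the open cube of half-side rho centered at x.
    """
    rho = Fraction(rho)
    r = Fraction(1, 2)
    while 2 * r >= rho:          # shrink r until 2r < rho
        r /= 2
    step = r / 2                 # grid of rational points with spacing r/2
    # Nearest grid point to each coordinate: |q_i - x_i| <= r/4 < r.
    q = [round(Fraction(xi) / step) * step for xi in x]
    return q, r

x = [0.3, 2.0 ** 0.5]            # a point of the open set A
rho = 0.01                       # assumed: the cube of half-side rho around x lies in A
q, r = rational_cube(x, rho)
```

Every point of the cube returned has distance less than $r + r/4 < 2r < \rho$ from $x$ in each coordinate, so the cube is contained in $A$ and belongs to $\mathcal{F}'$.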

Proof of the theorem on regularity of Lebesgue measure: It is an easy exercise
to show that intervals are Lebesgue-measurable: an interval is indeed a finite
intersection of half-spaces. Half-spaces are Lebesgue-measurable: let $S$ be a
half-space of the form $S = \{x \in \mathbb{R}^n : x_k > a\}$, with
$k \in \{1, 2, \ldots, n\}$ and $a \in \mathbb{R}$ fixed. If
$T \subset \mathbb{R}^n$, fix $\varepsilon > 0$ and let $\{I_i\}$ be a
countable family of intervals covering $T$ and such that
$\sum_{i=1}^{\infty} |I_i| < m(T) + \varepsilon$. Define $I'_i = I_i \cap S$,
$I''_i = I_i \cap (\mathbb{R}^n \setminus S)$: these are still (possibly empty)
intervals, the sum of whose measures is exactly $|I_i|$. Moreover, the family
$\{I'_i\}$ covers $T \cap S$, and $\{I''_i\}$ covers $T \cap S^C$: we then get
\[
m(T) + \varepsilon > \sum_{i=1}^{\infty} |I'_i| + \sum_{i=1}^{\infty} |I''_i| \ge m(T \cap S) + m(T \cap S^C)
\]
and measurability of $S$ follows because $\varepsilon$ is arbitrary.


Intervals are then measurable, and so are open sets (because they are
obtained as a countable union of intervals). Closed sets are measurable because
their complements are open.
Let now $A$ be measurable, $\varepsilon > 0$: we show that there exists an open
set $B \supset A$ with $m(B \setminus A) < \varepsilon/2$. Suppose for now that
$m(A) < +\infty$. By definition of Lebesgue measure, there are intervals
$I_1, I_2, \ldots$ with $\bigcup_{i=1}^{\infty} I_i \supset A$ and
$\sum_{i=1}^{\infty} |I_i| < m(A) + \varepsilon/2$. We know that we may suppose
without loss of generality that the intervals $I_i$ are open. If
$B = \bigcup_{i=1}^{\infty} I_i$, then $B$ is open and by subadditivity
\[
m(B) \le \sum_{i=1}^{\infty} m(I_i) < m(A) + \varepsilon/2,
\]
whence $m(B \setminus A) = m(B) - m(A) < \varepsilon/2$ (here we used
additivity on the measurable sets $A$ and $B \setminus A$, and
$m(A) < +\infty$).


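If $m(A) = +\infty$ the claim is still true: we take $\varepsilon > 0$ and we
show that there exists $B \supset A$, $B$ open, such that
$m(B \setminus A) < \varepsilon/2$.
To this end, consider the measurable sets $A_N = A \cap B_N(0)$,
$N = 1, 2, \ldots$, where $B_N(0)$ is the ball of radius $N$ centered at 0:
their measure is finite, and their union is $A$. For each of these we can find
an open set $U_N \supset A_N$ such that
$m(U_N \setminus A_N) < \frac{\varepsilon}{2^{N+1}}$: define
$B = \bigcup_{N=1}^{\infty} U_N$.
Now, $B$ is open and contains $A$, moreover
$B \setminus A \subset \bigcup_{N=1}^{\infty} (U_N \setminus A_N)$: by
subadditivity we get
\[
m(B \setminus A) \le \sum_{N=1}^{\infty} m(U_N \setminus A_N) < \frac{\varepsilon}{2}.
\]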
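To conclude the proof, we show that given a measurable $A$ and
$\varepsilon > 0$, there exists a closed set $C \subset A$ with
$m(A \setminus C) < \varepsilon/2$. It is enough to choose an open set
$F \supset A^C$ such that $m(F \setminus A^C) < \varepsilon/2$. Then $C = F^C$
is closed, $C \subset A$, and
$m(A \setminus C) = m(F \setminus A^C) < \varepsilon/2$. Finally,
$m(B \setminus C) \le m(B \setminus A) + m(A \setminus C) < \varepsilon$.
Q.E.D.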
The following exercise builds on the regularity of Lebesgue measure:
EXERCISE (Borel hull): Recall that the Borel σ-algebra in $\mathbb{R}^n$ (or
in a topological space) is, by definition, the smallest σ-algebra which
contains all open sets. Its elements are called Borel sets. Show that for any
Lebesgue measurable set $A$ there are Borel sets $B$, $C$ such that
$C \subset A \subset B$ and $m(B \setminus C) = 0$. As a consequence, every
Lebesgue measurable set is the union of a Borel set and a set of measure 0.

Although there are lots of Lebesgue measurable sets, not every subset of
$\mathbb{R}^n$ is measurable:
EXAMPLE (Vitali non-measurable set): Let $n = 1$, and consider the interval
$(0, 1) \subset \mathbb{R}$. Define the following equivalence relation on
$(0, 1)$: we will say that $x \sim y$ if and only if $x - y \in \mathbb{Q}$.
This equivalence relation gives us a partition of $(0, 1)$ into infinitely many
equivalence classes: we choose a set $A$ which contains exactly one element of
each class (of course, to do this we need the axiom of choice). We will show
that $A$ is not Lebesgue-measurable.
For each $q \in \mathbb{Q} \cap [0, 1)$ define the sets
$A_q = \{x + q : x \in A\}$. Since Lebesgue measure is obviously
translation-invariant, we have $m(A_q) = m(A)$. Moreover, we have
$m(A) = m(A_q) = m(A_q \cap (0, 1)) + m(A_q \setminus (0, 1))$ because
intervals are Lebesgue-measurable. Let now $B_q = A_q \setminus (0, 1)$ and
define $\tilde B_q = \{x : x + 1 \in B_q\}$: we have
$m(\tilde B_q) = m(B_q)$ by translation invariance.
We next denote $\tilde A_q = (A_q \cap (0, 1)) \cup \tilde B_q$: we clearly
have $m(\tilde A_q) = m(A)$.
Now, it is easy to check that the sets $\tilde A_q$ are pairwise disjoint for
$q \in \mathbb{Q} \cap [0, 1)$ and that $\bigcup_q \tilde A_q = [0, 1)$.
If we suppose by contradiction that $A$ is measurable, then so are the
sets $\tilde A_q$ and by countable additivity we get
\[
1 = m([0, 1)) = \sum_{q \in \mathbb{Q} \cap [0,1)} m(\tilde A_q) = \sum_{q \in \mathbb{Q} \cap [0,1)} m(A).
\]
This is impossible: the measure of $A$ is either zero or positive. If we had
$m(A) = 0$, the r.h.s. would be 0, while if $m(A) > 0$ it would be $+\infty$:
in neither case can it be equal to 1. Thus $A$ cannot be Lebesgue-measurable.

Before we can define the Lebesgue integral, we need to define the important
class of the measurable functions.
DEFINITION (measurable functions): Let $A \subset X$ be a measurable set
(w.r.t. a fixed outer measure, for instance Lebesgue measure on
$\mathbb{R}^n$), $f : A \to \overline{\mathbb{R}}$. With $\overline{\mathbb{R}}$
we denote the set $\mathbb{R} \cup \{+\infty\} \cup \{-\infty\}$: in this
context, we will adopt the "funny" convention that $0 \cdot \pm\infty = 0$,
while the expression $+\infty - \infty$ has to be considered meaningless as
usual.
The function $f$ is said to be measurable if for all $a \in \mathbb{R}$ the
sets $f^{-1}((a, +\infty]) = \{x \in A : f(x) > a\}$ are measurable.

The following result gives an equivalent (and more "topological")
characterization of measurability:
PROPOSITION (Characterization of measurable functions): A function
$f : A_0 \to \overline{\mathbb{R}}$ (with $A_0 \subset X$) is measurable iff
$f^{-1}(\{+\infty\})$, $f^{-1}(\{-\infty\})$ are measurable and $f^{-1}(U)$ is
measurable for every open set $U \subset \mathbb{R}$.
PROOF: If we know that $f^{-1}(\{+\infty\})$, $f^{-1}(\{-\infty\})$ are
measurable and $f^{-1}(U)$ is measurable for every open $U \subset \mathbb{R}$,
then $f$ is measurable since
$f^{-1}((a, +\infty]) = f^{-1}((a, +\infty)) \cup f^{-1}(\{+\infty\})$.
To see the other implication, suppose $f$ is measurable. We can write
\[
f^{-1}(\{+\infty\}) = \bigcap_{N=1}^{\infty} f^{-1}((N, +\infty]),
\]
whence $f^{-1}(\{+\infty\})$ is measurable because it is a countable
intersection of measurable sets.
From the measurability of $f$ it follows then that $f^{-1}((a, +\infty))$ is
measurable for every $a \in \mathbb{R}$. Also the sets $f^{-1}([a, +\infty))$,
$f^{-1}((-\infty, a))$ and $f^{-1}((-\infty, a])$ are measurable for every
$a \in \mathbb{R}$. Indeed,
$f^{-1}([a, +\infty)) = \bigcap_{N=1}^{\infty} f^{-1}((a - \frac{1}{N}, +\infty))$
is measurable, being the intersection of a countable family of measurable sets.
The preimages of left half-lines like $f^{-1}([-\infty, a))$ and
$f^{-1}([-\infty, a])$ are measurable because their complements are measurable.
Then also $f^{-1}(\{-\infty\})$ is measurable:
$f^{-1}(\{-\infty\}) = \bigcap_{N=1}^{\infty} f^{-1}([-\infty, -N])$... and
then also the sets $f^{-1}((-\infty, a))$ are measurable.
But then the preimages of open intervals are measurable, because
$f^{-1}((a, b)) = f^{-1}((-\infty, b)) \cap f^{-1}((a, +\infty))$.
If finally $U \subset \mathbb{R}$ is open, we write
$U = \bigcup_{i=1}^{\infty} I_i$, where $I_i \subset \mathbb{R}$ are open
intervals. Then $f^{-1}(U) = \bigcup_{i=1}^{\infty} f^{-1}(I_i)$ is measurable.
Q.E.D.
As a consequence, the domain $A$ of a measurable function is necessarily
measurable (because
$A = f^{-1}(\mathbb{R}) \cup f^{-1}(\{+\infty\}) \cup f^{-1}(\{-\infty\})$).
Moreover, a real valued continuous function defined on an open set of
$\mathbb{R}^n$ is certainly Lebesgue-measurable (Why?).

Measurable functions are "stable" under a whole lot of algebraic and limit
operations:
PROPOSITION (Stability of measurable functions): Suppose $f, g$ are
measurable functions, $\lambda \in \mathbb{R}$ and $\{f_n\}$ is a sequence of
measurable functions. Then

(i) the set $\{x : f(x) > g(x)\}$ is measurable;

(ii) if $\varphi : \mathbb{R} \to \mathbb{R}$ is continuous, then
$\varphi \circ f$ is measurable (in its domain);

(iii) the functions $f + g$, $\lambda f$, $|f|$, $\max\{f, g\}$,
$\min\{f, g\}$ and $fg$ are all measurable within their domains;

(iv) the functions $\sup f_n$, $\inf f_n$, $\limsup f_n$, $\liminf f_n$ and
$\lim f_n$ are all measurable in their domains.

PROOF: To show (i), observe that if $f(x) > g(x)$, there is a rational number
$q \in (g(x), f(x))$. We can then write
\[
\{x : f(x) > g(x)\} = \bigcup_{q \in \mathbb{Q}} \Big( f^{-1}((q, +\infty]) \cap g^{-1}([-\infty, q)) \Big),
\]
and we have obtained our set as a countable union of measurable sets.
(ii) is obvious for real valued functions, thanks to our characterization of
measurable functions: indeed, the preimage of open sets under $\varphi$ is
open! In the general case of functions taking values in
$\overline{\mathbb{R}}$, we must only specify the topology of
$\overline{\mathbb{R}}$: the open sets are simply unions of open sets in
$\mathbb{R}$ and of open half-lines. Once this is understood, the proof is as
before.
To see (iii), let $f$, $g$ be measurable and consider their sum $f + g$ (which
is defined on the intersection of the domains, minus the points where the sum
is of the form $+\infty - \infty$ or $-\infty + \infty$). This is measurable
because
\[
(f + g)^{-1}((a, +\infty]) = \{x : f(x) > a - g(x)\}
\]
is measurable by (i). From (ii) we then infer the measurability of
$\lambda f$, $|f|$ and $f^2$ (compositions with continuous functions). If $f$,
$g$ are real valued, we can write
\[
\begin{aligned}
\max\{f(x), g(x)\} &= \tfrac{1}{2} \big( f(x) + g(x) + |f(x) - g(x)| \big), \\
\min\{f(x), g(x)\} &= \tfrac{1}{2} \big( f(x) + g(x) - |f(x) - g(x)| \big), \\
f(x) g(x) &= \tfrac{1}{2} \big( (f(x) + g(x))^2 - f^2(x) - g^2(x) \big),
\end{aligned}
\]
whence the measurability of $\max\{f, g\}$, $\min\{f, g\}$ and $fg$. In the
general case, the above identities give measurability of the restrictions to
the measurable set where $f$ and $g$ are finite. What is left is easily
decomposed in a small number of measurable pieces, where the functions are
constant: for instance, $f(x)g(x)$ is identically $+\infty$ on the set
$\{x \in X : f(x) = +\infty,\ g(x) > 0\}$, it is zero on the set
$\{x \in X : f(x) = +\infty,\ g(x) = 0\}$, etc. In conclusion, we easily
deduce the measurability of $fg$.
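
The three algebraic identities used for $\max\{f, g\}$, $\min\{f, g\}$ and $fg$ are easy to sanity-check on a few real values. This is just a numerical spot check with made-up sample values and helper names, not a proof:

```python
def max_via_abs(a, b):
    return (a + b + abs(a - b)) / 2

def min_via_abs(a, b):
    return (a + b - abs(a - b)) / 2

def product_via_squares(a, b):
    return ((a + b) ** 2 - a ** 2 - b ** 2) / 2

# Pairs (f(x), g(x)) of real values at some hypothetical points x.
samples = [(-3, 5), (2, 2), (7, -1), (0, 4)]
checks = [
    (max_via_abs(a, b) == max(a, b),
     min_via_abs(a, b) == min(a, b),
     product_via_squares(a, b) == a * b)
    for a, b in samples
]
```

Each identity expresses the left-hand side through sums, scalar multiples, absolute values and squares only, which is why measurability follows from the previous steps.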
Consider now (iv) and let $f(x) = \sup\{f_n(x) : n = 1, 2, \ldots\}$. We have
$f^{-1}((a, +\infty]) = \bigcup_n f_n^{-1}((a, +\infty])$, so that $f$ is
measurable, and $\inf_n f_n(x)$ is likewise measurable.
The function $\liminf_{n \to +\infty} f_n(x)$ is measurable because
$\liminf_{n \to +\infty} f_n(x) = \sup_n \inf\{f_m(x) : m \ge n\}$, and so is
$\limsup_{n \to +\infty} f_n(x)$. The set where $\liminf_n f_n$ and
$\limsup_n f_n$ coincide is measurable: this is precisely the domain of
$\lim_n f_n$, which is thus measurable. Q.E.D.
An important subclass of measurable functions is that of simple functions:
in the definition of Lebesgue integral they play the crucial role step
functions have in the theory of Riemann integral.
Recall that given $A \subset X$, its characteristic function is
\[
1_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}
\]
DEFINITION: A simple function $\varphi : \mathbb{R}^n \to \mathbb{R}$ is a
finite linear combination of characteristic functions of measurable sets. In
other words, $\varphi$ is simple if there are measurable sets
$A_1, A_2, \ldots, A_N$ and real numbers $c_1, c_2, \ldots, c_N$ such that
$\varphi(x) = \sum_{i=1}^{N} c_i 1_{A_i}(x)$. With no loss of generality, we
may suppose that the sets $A_i$ are pairwise disjoint.
An equivalent definition is the following: simple functions are measurable
functions whose image is a finite subset of $\mathbb{R}$.
If $\varphi$ is simple and $\varphi(x) \ge 0$ for all $x$, we define in a
natural way its (Lebesgue) integral w.r.t. the measure $\mu$:
\[
\int_X \varphi(x)\, d\mu(x) = \sum_{i=1}^{N} c_i \mu(A_i).
\]

Observe that step functions in $\mathbb{R}^n$ are just simple functions for
which the sets $A_i$ are intervals: for this kind of functions (and Lebesgue
measure), the new definition of integral coincides with Riemann's. Moreover,
the integral of simple functions enjoys the usual properties of monotonicity,
homogeneity and additivity w.r.t. the integrand functions.
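
For step functions on the real line the formula $\sum_i c_i \mu(A_i)$ is immediate to evaluate. The sketch below is restricted, as an assumption of the example, to sets $A_i$ that are bounded intervals of $\mathbb{R}$ and to Lebesgue measure; the name `integrate_simple` is hypothetical.

```python
def integrate_simple(terms):
    """Integral of phi = sum_i c_i * 1_{A_i}, with each A_i an interval of R.

    `terms` is a list of (c_i, (a_i, b_i)) with c_i >= 0 and a_i <= b_i;
    the Lebesgue measure of the interval with endpoints a_i, b_i is b_i - a_i.
    """
    return sum(c * (b - a) for c, (a, b) in terms)

# phi = 2 on [0, 1), 5 on [1, 3), 0 elsewhere (pairwise disjoint sets).
phi = [(2.0, (0.0, 1.0)), (5.0, (1.0, 3.0))]
integral = integrate_simple(phi)                      # 2*1 + 5*2

# The same function written with overlapping sets: 2 on [0, 3) plus 3 on [1, 3).
phi_overlapping = [(2.0, (0.0, 3.0)), (3.0, (1.0, 3.0))]
integral_overlapping = integrate_simple(phi_overlapping)   # 2*3 + 3*2
```

The two representations of the same function give the same value, an instance of the additivity of the integral w.r.t. the integrand.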
As we will see, the Lebesgue integral of a non-negative measurable function
is defined in a way very similar to the (lower) Riemann integral, just by
replacing step functions with simple functions:
\[
\int f(x)\, d\mu(x) := \sup\Big\{ \int \varphi(x)\, d\mu(x) : \varphi \text{ simple},\ \varphi \le f \Big\}.
\]
However, to prove that this object has all the usual properties we expect
from the integral, we will need an approximation result: the following
fundamental theorem guarantees that every non-negative measurable function can
be approximated from below with an increasing sequence of simple functions.

THEOREM (Approximation of measurable functions with simple functions):
Let f : X → [0, +∞] be a measurable function. Then there exists a sequence
$\varphi_k : X \to [0, +\infty)$ of simple functions such that $f \ge \varphi_{k+1} \ge \varphi_k$ (k = 1, 2, 3, ...) and such that
$$\lim_{k\to+\infty} \varphi_k(x) = f(x) \qquad \forall x \in X.$$

PROOF: Consider the functions $\varphi_k(x) = \min\{k, 2^{-k}[2^k f(x)]\}$, where [·] denotes
the integer part (floor) function. Those functions are measurable and
take only a finite number of values (belonging to the finite set $\{j 2^{-k} : j = 0, 1, 2, \ldots, k 2^k\}$).
Moreover, $\varphi_k(x)$ is below f(x) and at each x ∈ X where
f(x) ∈ [0, k) we have $f(x) - \varphi_k(x) \le 2^{-k}$. It follows that $\varphi_k(x) \to f(x)$ at
every point where f(x) < +∞. On the other hand, if at a point x ∈ X we
have f(x) = +∞, then $\varphi_k(x) = k$ for each k and again $\varphi_k(x) \to f(x)$.
Finally, the sequence $\varphi_k$ is increasing because, thanks to our dyadic
discretization of the target space, each of the sets where $\varphi_k$ is constant is
partitioned into two or more measurable sets where $\varphi_{k+1}$ takes values which
are greater than or equal to that of $\varphi_k$.
The following is my attempt to visualize the construction of the functions
$\varphi_k$ with a GeoGebra worksheet (footnote 2).

Q.E.D.
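The construction in the proof is easy to experiment with numerically. The following Python sketch evaluates $\varphi_k = \min\{k, 2^{-k}[2^k f]\}$ pointwise. The names `phi` and `f`, and the sample choice f(x) = 1/x (with f(0) = +∞), are illustrative assumptions and do not come from the lecture.

```python
import math

def phi(k, value):
    """k-th dyadic approximation min{k, 2^-k * floor(2^k * value)} of a value in [0, +inf]."""
    if value == math.inf:
        # Where f = +inf the truncation at level k is the active constraint.
        return float(k)
    return min(float(k), math.floor(2**k * value) / 2**k)

def f(x):
    # Illustrative choice (an assumption): f(x) = 1/x on (0, 1], extended by +inf at x = 0.
    return math.inf if x == 0 else 1.0 / x

for x in (0.0, 0.3, 1.0):
    # For each fixed x, print phi_1(x), ..., phi_6(x).
    print(x, [phi(k, f(x)) for k in range(1, 7)])
```

For each fixed x the printed values increase with k and never exceed f(x); at x = 0, where f = +∞, they are simply k.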

We have finally reached the point where the Lebesgue integral of a non-negative
measurable function has to be introduced: as a matter of fact, I already told
you the definition.
2
http://www.geogebratube.org/material/show/id/51513

DEFINITION: The Lebesgue integral of a measurable function f : X → [0, +∞] is defined by
$$\int_X f(x)\,d\mu(x) = \sup\left\{ \int_X \varphi(x)\,d\mu(x) : \varphi \text{ simple},\ \varphi \le f \right\}.$$
More generally, if f : A → [0, +∞] is measurable, we define $\int_A f(x)\,d\mu(x)$ as
$\int_X \tilde f(x)\,d\mu(x)$, where $\tilde f : X \to [0, +\infty]$ is obtained by extending f to 0 outside A.

The following integral convergence result will be extremely important in
the theory of the Lebesgue integral.
THEOREM (Beppo Levi’s theorem, or monotone convergence theorem): Let $\{f_k\}$ be
a sequence of non-negative measurable functions, $f_k : X \to [0, +\infty]$, and
suppose the sequence is increasing: $f_{k+1}(x) \ge f_k(x)$ for every x ∈ X and for
every k = 1, 2, 3, .... Then, if we denote $f(x) = \lim_{k\to+\infty} f_k(x)$, we have
$$\int_X f(x)\,d\mu(x) = \lim_{k\to+\infty} \int_X f_k(x)\,d\mu(x).$$

PROOF: Notice that f is measurable, being the pointwise limit of measurable
functions. Moreover, the sequence $k \mapsto \int_X f_k(x)\,d\mu(x)$ is increasing: denote
by α its limit. As $f \ge f_k$ for each k, we obviously have $\int_X f(x)\,d\mu(x) \ge \alpha$:
in particular, if α = +∞ the theorem is trivially true. If α ∈ R, we need to
prove the opposite inequality
$$\int_X f(x)\,d\mu(x) \le \alpha.$$

To this end, let us fix c ∈ (0, 1) and a simple function s : X → [0, +∞)
with s ≤ f. The function s can be expressed as $s(x) = \sum_{j=1}^{N} s_j 1_{A_j}(x)$, where
the $A_j$ are pairwise disjoint measurable sets. Define $E_k = \{x \in X : f_k(x) \ge c\,s(x)\}$.
Thanks to the fact that $f_k \to f$ and c < 1, we have $\bigcup_{k=1}^{\infty} E_k = X$ (where s(x) > 0 we have
f(x) ≥ s(x) > c s(x), so $f_k(x) \ge c\,s(x)$ for k large), and
the sequence $E_k$ is increasing because so is the sequence $f_k$. We define next
$A_{j,k} = A_j \cap E_k$: thanks to the continuity of measure on increasing sequences,
we have $\mu(A_{j,k}) \to \mu(A_j)$ as k → +∞. We then infer
$$\alpha = \lim_{k\to+\infty} \int_X f_k(x)\,d\mu(x) \ge \lim_{k\to+\infty} \int_{E_k} f_k(x)\,d\mu(x) \ge \lim_{k\to+\infty} c \int_{E_k} s(x)\,d\mu(x)$$
$$= \lim_{k\to+\infty} c \sum_{j=1}^{N} s_j\,\mu(A_{j,k}) = c \sum_{j=1}^{N} s_j\,\mu(A_j) = c \int_X s(x)\,d\mu(x).$$
By taking the supremum over all simple functions s ≤ f and all c < 1, we
get the inequality we need. Q.E.D.
Let us see some important consequences of the monotone convergence
theorem:
(i) Additivity of the integral w.r.t. the integrand: Let f, g : X → [0, +∞] be
measurable functions. Then
$$\int_X (f(x) + g(x))\,d\mu(x) = \int_X f(x)\,d\mu(x) + \int_X g(x)\,d\mu(x).$$
Indeed, we can choose two increasing sequences of simple functions,
$\{s_k\}$, $\{u_k\}$ with $s_k \to f$, $u_k \to g$. The Lebesgue integral is clearly additive
on the set of simple functions: by the monotone convergence theorem
we can pass to the limit and we get the desired equality.
(ii) Countable additivity of the integral w.r.t. the integration set: If $\{A_i\}$ is
a sequence of pairwise disjoint measurable sets and f is a non-negative
measurable function defined on $A = \bigcup_{i=1}^{\infty} A_i$, then
$$\int_A f(x)\,d\mu(x) = \sum_{i=1}^{\infty} \int_{A_i} f(x)\,d\mu(x).$$
It is enough to consider the increasing sequence of measurable functions
$g_k(x) = f(x) \sum_{i=1}^{k} 1_{A_i}(x)$, whose limit is $g(x) = f(x)\,1_A(x)$. Again, the
thesis follows from Beppo Levi’s theorem.
(iii) Integration by series: If $\{f_k\}$ is a sequence of non-negative measurable
functions defined on a set A, then
$$\int_A \sum_{k=1}^{\infty} f_k(x)\,d\mu(x) = \sum_{k=1}^{\infty} \int_A f_k(x)\,d\mu(x).$$
To prove this, it is enough to apply the monotone convergence theorem
and the additivity of the Lebesgue integral to the partial sums of our
series.

DEFINITION (Integral of arbitrarily signed functions): How can we proceed
if we wish to integrate an arbitrarily signed, measurable function f : A → R?
We define the positive and negative part of f as follows:
$$f^+(x) := \max\{0, f(x)\}, \qquad f^-(x) := -\min\{0, f(x)\}.$$
We clearly have $f(x) = f^+(x) - f^-(x)$ and $|f(x)| = f^+(x) + f^-(x)$. Moreover,
both $f^+$ and $f^-$ are non-negative: if their integrals are not both +∞, f is
called Lebesgue-integrable and we define
$$\int_A f(x)\,d\mu(x) := \int_A f^+(x)\,d\mu(x) - \int_A f^-(x)\,d\mu(x).$$
If both the integrals of $f^+$ and $f^-$ are finite, f is said to be summable and
its integral is finite. Obviously, a measurable function is summable if and
only if the integral of |f| is finite.
Notice that additivity w.r.t. the integrand and the integration set still
hold for summable functions. In particular, as the integral is clearly 1-
homogeneous by its definition, Lebesgue integral is linear on the vector space
of summable functions.

2 Lecture of October 3, 2024 (3 hours)

To be totally convinced that Lebesgue’s theory is reasonable, it is important
to compare Lebesgue’s integral with Riemann’s.
Before we do that, let us introduce a useful notation: we say that a
certain property is true for almost every x ∈ A (shortly: a.e. in A, almost
everywhere in A), if the set of points where the property fails has measure 0.
Of course, this definition makes sense for Lebesgue measure, or for any
fixed outer measure µ: to avoid confusion, we will often say that something
is true “µ-a.e.”.
For instance, given two functions f, g : Rn → R, they are µ-a.e. equal if
µ({x : f (x) ̸= g(x)}) = 0.
It is a very simple exercise to check that a function which is a.e. equal to
a measurable function is also measurable (because sets with measure 0 are
measurable). Moreover, if we change a function on a set with measure 0, its
integral does not change.
The following result also holds:

PROPOSITION: Let f : A → [0, +∞] be a measurable function such that
$\int_A f(x)\,d\mu(x) = 0$. Then f = 0 µ-a.e. in A.

PROOF: We can write
$$\{x \in A : f(x) > 0\} = \bigcup_{n=1}^{\infty} \left\{x \in A : f(x) > \tfrac{1}{n}\right\}.$$
All the sets in the r.h.s. have measure 0: if we had $\mu(E_n) > 0$, where
$E_n = \{x \in A : f(x) > \tfrac{1}{n}\}$, we would get
$$\int_A f(x)\,d\mu(x) \ge \int_{E_n} f(x)\,d\mu(x) \ge \mu(E_n)/n > 0,$$
which contradicts the hypothesis. Q.E.D.


The following theorem shows that the Riemann integral coincides with the Lebesgue
integral (w.r.t. the Lebesgue measure) on the set of Riemann integrable functions
(bounded and on a bounded interval: for generalized Riemann integrals
things are slightly more complicated, see footnote 3).
We state and prove the result in dimension 1: the generalization to higher
dimensions is proved in the same way.
THEOREM: Let f : [a, b] → R be a bounded Riemann-integrable function.
Then f is Lebesgue-measurable and its Lebesgue and Riemann integrals co-
incide.
PROOF of the comparison between Riemann and Lebesgue integrals: Until
the proof is finished, we need to distinguish between Riemann and Lebesgue
integrals: given f : [a, b] → R, we denote by $\int_a^b f(x)\,dx$ its Lebesgue integral,
by $R\int_a^b f(x)\,dx$ its Riemann integral (provided they exist). But we need to
stress that at least for step functions we already know that Lebesgue and
Riemann integrals coincide.
3
As an exercise, show that the generalized Riemann integral of a non negative function
coincides with its Lebesgue integral whenever it exists. It is enough to use the monotone
convergence theorem and the result for ordinary Riemann integrals. The same holds when
the function is absolutely integrable (use the dominated convergence theorem). If instead
f is integrable in the generalized Riemann sense, but the integral of |f | is +∞, then f is
not Lebesgue integrable because both the integrals of f + and f − are +∞.

By the definition of Riemann integral, we find two sequences of step
functions $\{\psi_n\}$ and $\{\varphi_n\}$, with $\psi_n \ge f \ge \varphi_n$ and
$$\lim_{n\to+\infty} \int_a^b \psi_n\,dx = \lim_{n\to+\infty} \int_a^b \varphi_n\,dx = R\int_a^b f\,dx.$$

Let now $\psi(x) = \inf\{\psi_n(x) : n = 1, 2, \ldots\}$, $\varphi(x) = \sup\{\varphi_n(x) : n = 1, 2, \ldots\}$.
These two functions are measurable and $\varphi \le f \le \psi$. By monotonicity
of the integral we have
$$\int_a^b \psi_n(x)\,dx \ge \int_a^b \psi(x)\,dx,$$
whence, passing to the limit,
$$R\int_a^b f(x)\,dx \ge \int_a^b \psi(x)\,dx,$$
and likewise
$$R\int_a^b f(x)\,dx \le \int_a^b \varphi(x)\,dx.$$
As $\psi \ge \varphi$, the two inequalities give $0 \le \int_a^b (\psi - \varphi)\,dx \le 0$, i.e. $\int_a^b (\psi - \varphi)\,dx = 0$,
whence $\psi - \varphi = 0$ a.e., that is $\psi = \varphi = f$ a.e. in [a, b]. It follows that f is measurable and its Lebesgue
integral is equal to its Riemann integral. Q.E.D.
Actually, one could prove that a bounded function is Riemann integrable
if and only if it is continuous almost everywhere (Vitali Theorem). We will
not prove this result because of time constraints.
The next, very important theorem is a big improvement over the results
we had for the Riemann integral. But, to be honest, in class we completely
omitted the following rather lengthy discussion on the Fubini Theorem: we
will probably come back to it when we need it!
THEOREM (Fubini and Tonelli): Let $f : \mathbb{R}^2 \to \overline{\mathbb{R}}$ be a measurable function.
Then
(i) If f ≥ 0, then for a.e. y ∈ R the function $x \mapsto f(x, y)$ is measurable
on R. Moreover, the function $y \mapsto \int_{\mathbb{R}} f(x, y)\,dx$ is measurable and one has
$$(*) \qquad \int_{\mathbb{R}^2} f(x, y)\,dx\,dy = \int_{\mathbb{R}} \left( \int_{\mathbb{R}} f(x, y)\,dx \right) dy.$$
Obviously, the same holds also if we interchange the roles of x and y.
(ii) If f is R-valued and $\int_{\mathbb{R}} \left( \int_{\mathbb{R}} |f(x, y)|\,dx \right) dy < +\infty$, then f is summable.
The same holds if we reverse the order of integration.
(iii) If f is R-valued and summable, (i) still holds.

We will give a more general statement for a larger class of measures: this
will require the introduction of the product measure.
Notice for the moment that when f is not summable, the statement is no
longer true and the two iterated integrals can well be different (footnote 4).
The theorem is easy to generalize to higher dimension: the ambient space
could be Rn × Rk , while x ∈ Rn , y ∈ Rk ...
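The failure of the statement for non-summable functions can be checked by hand on the function of footnote 4. The following computation is a sketch added for illustration (it is not part of the lecture), for $f(x, y) = (x - y)/(x + y)^3$ on the square $(0, 1] \times (0, 1]$.

```latex
% Inner integral in y, for fixed x in (0,1]: write x - y = 2x - (x + y).
\begin{align*}
\int_0^1 \frac{x-y}{(x+y)^3}\,dy
  &= \int_0^1 \left( \frac{2x}{(x+y)^3} - \frac{1}{(x+y)^2} \right) dy
   = \left[ \frac{y}{(x+y)^2} \right]_{y=0}^{y=1}
   = \frac{1}{(1+x)^2}, \\
\int_0^1 \left( \int_0^1 \frac{x-y}{(x+y)^3}\,dy \right) dx
  &= \int_0^1 \frac{dx}{(1+x)^2} = \frac12 . \\
% Since f(y,x) = -f(x,y), exchanging the roles of x and y changes the sign.
\int_0^1 \left( \int_0^1 \frac{x-y}{(x+y)^3}\,dx \right) dy
  &= -\frac12 .
\end{align*}
```

The two iterated integrals are 1/2 and −1/2, so f cannot be summable on the square: otherwise part (iii) of the theorem would force them to agree.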

The following are two celebrated integral convergence theorems. Don’t
be fooled by the fact that the first one is called Fatou’s lemma: it is a
fundamental result!
THEOREM (Fatou’s Lemma): Let $f_k : X \to [0, +\infty]$ be a sequence of non-negative
measurable functions, $f(x) = \liminf_{k\to+\infty} f_k(x)$. Then
$$\int_X f(x)\,d\mu(x) \le \liminf_{k\to+\infty} \int_X f_k(x)\,d\mu(x).$$

PROOF: We already know that f is a non-negative measurable function. We
have $f(x) = \lim_{k\to+\infty} g_k(x)$, with $g_k(x) = \inf\{f_h(x) : h \ge k\}$. Now, the $g_k$
form an increasing sequence of non-negative measurable functions: by the
monotone convergence theorem we have
$$\int_X f(x)\,d\mu(x) = \lim_{k\to+\infty} \int_X g_k(x)\,d\mu(x).$$
Our thesis then follows from the monotonicity of the Lebesgue integral, as $g_k(x) \le f_k(x)$. Q.E.D.
Let me mention a couple of things: Fatou’s Lemma is in general false for
functions with arbitrary sign. Take for instance A = R, µ = m (Lebesgue
measure), $f_k(x) = -1/k$ (constant functions). Then $f_k(x) \to 0$, but (footnote 5)
$$\int_{\mathbb{R}} f_k(x)\,dx = -\infty, \qquad \int_{\mathbb{R}} 0\,dx = 0.$$

Since in this example the sequence fk is increasing, this also shows that the
monotone convergence theorem fails unless the functions are non-negative.
The same functions, with the opposite sign, also show that in Fatou’s Lemma
we may well have strict inequality.
4
Consider for instance the function $f(x, y) = (x - y)/(x + y)^3$: the two iterated integrals
on the square [0, 1] × [0, 1] are finite and different.
5
For Lebesgue measure we write dx instead of dm(x).
The following is probably the most famous result in Lebesgue’s theory:
THEOREM (Lebesgue’s dominated convergence theorem): Let $f_k : X \to \overline{\mathbb{R}}$
be a sequence of measurable functions, and suppose there exists a summable
function $\varphi : X \to [0, +\infty]$ such that $|f_k(x)| \le \varphi(x)$ for every k and for every
x. If the limit $f(x) = \lim_{k\to+\infty} f_k(x)$ exists, then
$$\lim_{k\to+\infty} \int_X |f_k(x) - f(x)|\,d\mu(x) = 0,$$
$$\int_X f(x)\,d\mu(x) = \lim_{k\to+\infty} \int_X f_k(x)\,d\mu(x).$$

PROOF: The limit function f is measurable, and it is also summable because
its absolute value is dominated by $\varphi$. Moreover, $|f_k(x) - f(x)| \le |f_k(x)| + |f(x)| \le 2\varphi(x)$.
It follows that the sequence $2\varphi(x) - |f_k(x) - f(x)|$ is non-negative,
and converges pointwise to $2\varphi(x)$. From Fatou’s Lemma
it then follows that
$$\liminf_{k\to+\infty} \int_X \left( 2\varphi(x) - |f_k(x) - f(x)| \right) d\mu(x) \ge \int_X 2\varphi(x)\,d\mu(x),$$
whence, subtracting the (finite) integral of $2\varphi$:
$$\limsup_{k\to+\infty} \int_X |f_k(x) - f(x)|\,d\mu(x) \le 0,$$
which is the first assertion of the thesis. The second assertion follows from
$$\left| \int_X f_k(x)\,d\mu(x) - \int_X f(x)\,d\mu(x) \right| \le \int_X |f_k(x) - f(x)|\,d\mu(x).$$
Q.E.D.
We now discuss Fubini’s Theorem: we begin with the definition of product
measure. In the following, I will give more details than we saw in class.
DEFINITION: Let µ be a measure on $\mathbb{R}^n$, ν a measure on $\mathbb{R}^m$ (footnote 6). The product
measure µ × ν is a map $\mu \times \nu : \mathcal{P}(\mathbb{R}^{n+m}) \to [0, +\infty]$ defined for every
$S \subset \mathbb{R}^{n+m}$ by
$$(\mu \times \nu)(S) = \inf\left\{ \sum_{i=1}^{\infty} \mu(A_i)\,\nu(B_i) : S \subset \bigcup_{i=1}^{\infty} (A_i \times B_i),\ A_i\ \mu\text{-measurable},\ B_i\ \nu\text{-measurable} \right\}.$$
6
More generally, µ can be a measure on a set X, ν a measure on a set Y . The product
measure is then defined on X × Y . All the following results hold in this more general
setting, with the obvious exception of those referring to Lebesgue measure.

It is a simple exercise to check that this is indeed an outer measure
(whence the name): proceed as for Lebesgue measure.
Moreover, the product of Lebesgue measures on euclidean spaces gives
the Lebesgue measure on the product space:
REMARK: Denote by $m_n$, $m_m$, $m_{n+m}$ the Lebesgue measure on $\mathbb{R}^n$, $\mathbb{R}^m$, $\mathbb{R}^{n+m}$
respectively. Then $m_{n+m} = m_n \times m_m$. Indeed, if in the coverings which
define the product measure we add the extra condition that $A_i$ and $B_i$ are intervals,
we obtain exactly the Lebesgue measure on the product space: we deduce that
$m_n \times m_m \le m_{n+m}$.
To get the opposite inequality, take a covering of $S \subset \mathbb{R}^{n+m}$ as in the
definition of product measure. Notice that we may suppose w.l.o.g. that
$m_n(A_i) \le 1$, $m_m(B_i) \le 1$ for every i (otherwise, subdivide the sets in the
covering into smaller, pairwise disjoint measurable sets). Fix ε > 0 (we may take ε ≤ 1): by
definition of Lebesgue measure, for every i we find a covering of $A_i$ with
intervals $I_{i,j}$ such that $\sum_j m_n(I_{i,j}) < m_n(A_i) + \varepsilon/2^i$, and likewise a covering of
$B_i$ with intervals $J_{i,k}$ such that $\sum_k m_m(J_{i,k}) < m_m(B_i) + \varepsilon/2^i$. The intervals
$I_{i,j} \times J_{i,k}$ are still a countable covering of S, and we easily check that the sum
of their measures is less than $\sum_i m_n(A_i)\,m_m(B_i) + 3\varepsilon$: the infimum taken over
coverings with products of measurable sets or with intervals is the same and
$m_{n+m} = m_n \times m_m$.

The following is the general version of Fubini’s Theorem. To understand
the statement, we need the notion of σ-finiteness:
DEFINITION: A measure µ on Rn is σ-finite if we can split Rn in a countable
partition of measurable sets with finite measure. More generally, a set is σ-
finite w.r.t. a measure µ if it is measurable and it can be covered with a
countable family of measurable sets with finite measure: it is easy to see
that the sets of this family can be chosen to be pairwise disjoint.
A function f : Rn → R is σ-finite if Rn can be split into a countable
union of measurable sets, over each of which f is summable.
EXERCISE: Show that Lebesgue measure is σ-finite. Show next that a
Lebesgue-measurable function $f : \mathbb{R}^n \to [0, +\infty]$ is σ-finite if and only if
$m(\{x \in \mathbb{R}^n : f(x) = +\infty\}) = 0$ (footnote 7).

7
Consider the sets $A_i = \{x \in \mathbb{R}^n : i \le f(x) < i + 1\}$: together with the set where f = +∞,
they form a partition of $\mathbb{R}^n$. If $m(A_i) < +\infty$, it is easy to check that $\int_{A_i} f(x)\,dx < +\infty$.
Otherwise, thanks to σ-finiteness, $A_i$ can be decomposed into a sequence of sets with
finite measure...

THEOREM (Fubini, general version): Let µ be a measure on X, ν a measure
on Y. Then the product measure µ × ν has the following property: if S ⊂
X × Y there exists a (µ × ν)-measurable set S′ such that S ⊂ S′ and
(µ × ν)(S) = (µ × ν)(S′) (S′ is called a measurable envelope of S). If A is
µ-measurable and B is ν-measurable then A × B is (µ × ν)-measurable and
(µ × ν)(A × B) = µ(A)ν(B).
If S ⊂ X × Y is σ-finite with respect to µ × ν, then the slices
$$S_y = \{x \in X : (x, y) \in S\}, \qquad S_x = \{y \in Y : (x, y) \in S\}$$
are µ-measurable for ν-almost every fixed y, and ν-measurable for µ-almost
every fixed x. Moreover
$$(\mu \times \nu)(S) = \int_Y \mu(S_y)\,d\nu(y) = \int_X \nu(S_x)\,d\mu(x).$$

If f : X × Y → [0, +∞] is σ-finite or if f : X × Y → R is summable
w.r.t. the measure µ × ν, then the maps
$$\varphi : x \mapsto \int_Y f(x, y)\,d\nu(y), \qquad \psi : y \mapsto \int_X f(x, y)\,d\mu(x)$$
are well-defined for µ-a.e. x and ν-a.e. y, and are µ-integrable and ν-integrable
respectively. Moreover the following holds:
$$\int_{X \times Y} f(x, y)\,d(\mu \times \nu)(x, y) = \int_X \varphi(x)\,d\mu(x) = \int_Y \psi(y)\,d\nu(y).$$

REMARK: For the Lebesgue measure and for nonnegative measurable functions,
our earlier statement of the Fubini theorem suggests that we do not
need the σ-finiteness hypothesis. Indeed, let’s ask ourselves what happens if
we compute the iterated integral of a function $f : \mathbb{R}^2 \to [0, +\infty]$ which is
measurable but not σ-finite w.r.t. Lebesgue measure in the plane. We have
to check that the iterated integral of this function is +∞, and so it “predicts”
correctly that the function is not summable.
But our hypothesis implies that the set S = {(x, y) : f(x, y) = +∞} has
positive Lebesgue measure (otherwise f would be σ-finite). Now, the Lebesgue
measure of S is obtained by integrating the measure of the slices $S_y$. Then
the set $\{y : m(S_y) > 0\}$ must have positive measure. For every y in that set
we obviously have $\int_{\mathbb{R}} f(x, y)\,dx = +\infty$ (because f ≡ +∞ on S).
In conclusion, when we compute the iterated integral of f we integrate
the constant +∞ on a set with positive measure: the result is +∞.

We omit the proof of Fubini’s Theorem: the interested readers can ask
me for references!

Having finished (for the moment) the part on measure theory, we will
begin to study the basics of linear functional analysis: in particular, we will
concentrate on Banach and Hilbert spaces. We will see many examples and
also - I fear - some complements on measure theory.
At the beginning, our interest will focus on linear algebra in infinite
dimensional spaces! As you probably already know, from a purely algebraic
viewpoint, there are few dissimilarities from the finite dimensional case: if
we accept the axiom of choice (as we will do), every vector space over R
or over C has a basis, i.e. a maximal set of linearly independent vectors.
Moreover, each element of the space can be written in a unique way as a
finite linear combination of basis vectors.
In analysis, however, this is not enough: for instance, as a bare minimum
we need a notion of continuity (and thus a topology or, better yet, a metric),
which is compatible with vector space operations. In other words, we at
least need a topology for which sum of vectors and product by a scalar are
continuous.
DEFINITION: Let X be a vector space over R or C, equipped with a topology τ.
The space (X, τ) is a topological vector space if and only if the vector
space operations (sum and product by a scalar) are continuous w.r.t. the
obvious product topologies induced by τ.

The simplest examples of topological vector spaces (and -almost- the only
examples we will touch in this course) are normed spaces:
DEFINITION (Norm, normed space): Let X be a vector space over R (or
over C, with the obvious changes...). A norm on X is a map ∥ · ∥ : X → R
such that
(i) ∥x∥ ≥ 0 ∀x ∈ X, ∥x∥ = 0 iff x = 0;
(ii) ∥λx∥ = |λ|∥x∥ for every x ∈ X, λ ∈ R (homogeneity);
(iii) ∥x + y∥ ≤ ∥x∥ + ∥y∥ for every x, y ∈ X (triangle inequality).
A vector space equipped with a norm is a normed space: it is a metric
space with the induced distance
d(x, y) := ∥x − y∥, x, y ∈ X.
It is very simple to verify that a normed space with the induced metric is a
topological vector space.
As an aside, let me notice that since a normed space is a metric space, it is
possible to test continuity on sequences: for instance a function f : X → R is
continuous iff for every x and every sequence xk → x we have f (xk ) → f (x).

Exactly as with the euclidean space, a property we will often need is
completeness:
DEFINITION (Banach space): A normed space X is a Banach space if it is
complete, i.e. if every Cauchy sequence in X converges.

The following are examples of normed spaces:

1. $\mathbb{R}^n$ is a normed space with the usual euclidean norm: if $x = (x_1, \ldots, x_n)$,
then $|x| = \sqrt{x_1^2 + \ldots + x_n^2}$. Norms on $\mathbb{R}^n$ are also $|x|_1 = \sum_{i=1}^{n} |x_i|$ and
$|x|_\infty = \max\{|x_i| : i = 1, \ldots, n\}$: it is a simple exercise to check this,
and it is also useful to draw the balls w.r.t. these metrics.
More general norms on $\mathbb{R}^n$ are the following:
$$|x|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p}, \qquad 1 \le p < +\infty.$$
In this case, checking the triangle inequality is not so simple: we will
see it later.
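While the proof of the triangle inequality for $|\cdot|_p$ is postponed, it can at least be tested numerically. The sketch below is only an experiment, not a proof: the helper names `p_norm`, `sup_norm` and `triangle_gap`, the random test vectors and the seed are all illustrative assumptions.

```python
import random

def p_norm(x, p):
    """|x|_p = (sum_i |x_i|^p)^(1/p) for 1 <= p < +inf."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def sup_norm(x):
    """|x|_inf = max_i |x_i|."""
    return max(abs(t) for t in x)

def triangle_gap(x, y, p):
    """|x|_p + |y|_p - |x + y|_p, which should never be negative."""
    s = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(s, p)

random.seed(0)  # arbitrary seed, only for reproducibility
worst = min(
    triangle_gap([random.uniform(-1, 1) for _ in range(5)],
                 [random.uniform(-1, 1) for _ in range(5)], p)
    for p in (1, 1.5, 2, 3, 10)
    for _ in range(1000)
)
print("smallest observed gap:", worst)

# |x|_p approaches |x|_inf as p grows.
x = [3.0, -4.0, 1.0]
print([p_norm(x, p) for p in (1, 2, 10, 100)], sup_norm(x))
```

Up to rounding errors the smallest observed gap stays non-negative, and the values of $|x|_p$ decrease towards $|x|_\infty$ as p increases.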

It is well known that $\mathbb{R}^n$ with the euclidean norm is complete. It is
complete also with all the other norms we mentioned: we will see in a few
days that on $\mathbb{R}^n$ all norms are equivalent in the sense that they induce the
same topology and have the same Cauchy sequences.
Let us continue with some other examples of normed spaces!

1. $(C^0([a, b]), \|\cdot\|_\infty)$ is also an example of a Banach space. Indeed, let
$\{f_n\}$ be a Cauchy sequence in our space: for every ε > 0 there exists
ν ∈ N such that $\|f_n - f_m\|_\infty < \varepsilon$ for every m, n ≥ ν. Then, for every
x ∈ [a, b] we have
$$(**) \qquad |f_m(x) - f_n(x)| < \varepsilon$$
for m, n ≥ ν, and the real sequence $\{f_n(x)\}$ is a Cauchy sequence. By
the completeness of R, there is a real number f(x) such that $f_n(x) \to f(x)$.
Passing to the limit as m → +∞ in (**) we get
$$|f_n(x) - f(x)| \le \varepsilon \qquad \forall n \ge \nu,$$
and taking the supremum over x ∈ [a, b] we get $\|f_n - f\|_\infty \le \varepsilon$ for n ≥ ν. Then
$f_n \to f$ uniformly. The function f is continuous, being the uniform limit
of continuous functions.

2. $(C^0([a, b]), \|\cdot\|_1)$, where $\|f\|_1 = \int_a^b |f(x)|\,dx$, is a normed space but not
a Banach space (we will check in a while that it is not complete). The
spaces $(C^0([a, b]), \|\cdot\|_\infty)$ and $(C^0([a, b]), \|\cdot\|_1)$ have different topologies:
it is easy to construct a sequence of continuous functions which converges
to 0 in the norm $\|\cdot\|_1$ but not in the norm $\|\cdot\|_\infty$. This has a
consequence which is surprising at first glance: the identity map
$$\mathrm{Id} : (C^0([a, b]), \|\cdot\|_1) \to (C^0([a, b]), \|\cdot\|_\infty)$$
is discontinuous at 0. So here is an example of a linear and invertible
map between two infinite dimensional normed spaces, which is not
even continuous!
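One possible sequence of the kind mentioned in this example, on [a, b] = [0, 1], is the "tent" $f_n(x) = \max\{0, 1 - nx\}$: its integral is 1/(2n), while its maximum is always 1. This particular choice is an assumption made here for illustration, and the following Python sketch (with hypothetical helper names `tent`, `l1_norm`, `sup_norm`) only approximates the two norms on a grid.

```python
def tent(n, x):
    """f_n(x) = max(0, 1 - n x) on [0, 1]: height 1, supported on [0, 1/n]."""
    return max(0.0, 1.0 - n * x)

def l1_norm(g, a=0.0, b=1.0, steps=20_000):
    """Midpoint-rule approximation of the integral of |g| over [a, b]."""
    h = (b - a) / steps
    return h * sum(abs(g(a + (i + 0.5) * h)) for i in range(steps))

def sup_norm(g, a=0.0, b=1.0, steps=20_000):
    """Maximum of |g| over a uniform grid of [a, b] (a lower bound for the sup norm)."""
    h = (b - a) / steps
    return max(abs(g(a + i * h)) for i in range(steps + 1))

for n in (1, 10, 100):
    f_n = lambda x, n=n: tent(n, x)
    # First value shrinks like 1/(2n); second value does not shrink at all.
    print(n, l1_norm(f_n), sup_norm(f_n))
```

So $f_n \to 0$ in $\|\cdot\|_1$ while $\|\mathrm{Id}(f_n)\|_\infty$ stays equal to 1, which is exactly the failure of continuity of the identity map at 0.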

In infinite dimension, there are always linear maps which are discontinu-
ous.
PROPOSITION: Let (X, ∥ · ∥) be a normed vector space over R with dim X = +∞. Then
there are discontinuous linear functionals T : X → R.
PROOF: We will exhibit a member T of the algebraic dual space of X which
is discontinuous. Take an algebraic basis B = {xα }α∈I of X (a maximal set
of linearly independent vectors): this is by assumption an infinite set. By
normalizing the basis vectors, we may assume as well that ∥xα ∥ = 1 for every
α ∈ I.
To define T , we only need to specify its values on the basis vectors: by
linearity, this characterizes the linear functional uniquely.
Choose a countable subset $B' = \{\tilde x_n\} \subset B$ and define $T(\tilde x_n) = n$, n = 1, 2, ...,
$T(x_\alpha) = 0$ if $x_\alpha \in B \setminus B'$. The linear functional defined in this
way is discontinuous: the sequence $y_n = \tilde x_n/\sqrt{n}$ converges to 0 in norm, but
$T(y_n) = \sqrt{n} \to +\infty$. Q.E.D.
Let us go back to examples:
• The space $(C^0([a, b]), \|\cdot\|_1)$ is not a Banach space, because it is not
complete. Take for instance [a, b] = [−1, 1] and consider the sequence
$$u_n(x) = \begin{cases} -1 & \text{if } x \le -1/n,\\ 1 & \text{if } x \ge 1/n,\\ nx & \text{if } -1/n < x < 1/n. \end{cases}$$
This is a Cauchy sequence w.r.t. the norm $\|\cdot\|_1$, but it does not
converge to any continuous function: indeed, it converges in the norm
to the discontinuous function sgn(x). In this course, we will study the
completion of this space: it is the Banach space $L^1([a, b])$ of Lebesgue
summable functions, with the norm $\|\cdot\|_1$.
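For this sequence a direct computation gives $\|u_n - u_m\|_1 = |1/n - 1/m|$ and $\|u_n - \mathrm{sgn}\|_1 = 1/n$, while $\|u_n - u_{2n}\|_\infty = 1/2$ for every n. The following Python sketch checks these values approximately; the helper names and the grid sizes are illustrative assumptions, and the integrals are computed with a simple midpoint rule.

```python
def u(n, x):
    """u_n(x): -1 for x <= -1/n, 1 for x >= 1/n, n x in between."""
    return max(-1.0, min(1.0, n * x))

def sgn(x):
    """Sign function, with sgn(0) = 0."""
    return (x > 0) - (x < 0)

def l1_distance(g, h, a=-1.0, b=1.0, steps=40_000):
    """Midpoint-rule approximation of the integral of |g - h| over [a, b]."""
    dx = (b - a) / steps
    return dx * sum(abs(g(a + (i + 0.5) * dx) - h(a + (i + 0.5) * dx))
                    for i in range(steps))

def sup_distance(g, h, a=-1.0, b=1.0, steps=40_000):
    """Maximum of |g - h| over a uniform grid of [a, b]."""
    dx = (b - a) / steps
    return max(abs(g(a + i * dx) - h(a + i * dx)) for i in range(steps + 1))

for n in (5, 50, 500):
    u_n = lambda x, n=n: u(n, x)
    u_2n = lambda x, n=n: u(2 * n, x)
    # Columns: n, ||u_n - u_2n||_1, ||u_n - sgn||_1, sup-distance of u_n and u_2n.
    print(n, l1_distance(u_n, u_2n), l1_distance(u_n, sgn), sup_distance(u_n, u_2n))
```

The first two columns tend to 0 as n grows, whereas the last one does not: the sequence is Cauchy in $\|\cdot\|_1$ but not in $\|\cdot\|_\infty$, consistently with the completeness of $(C^0([a, b]), \|\cdot\|_\infty)$.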

We will see later that any two norms over Rn are equivalent. This is
no longer the case in infinite dimension: we saw it is possible to define non
equivalent norms on the same space.

The proposition about the existence of discontinuous linear functionals
has a... sad but inevitable consequence: when an analyst mentions the dual of
a normed vector space X (or more generally of a topological vector space),
he never refers to the algebraic dual space, i.e. to the space of all linear
functionals over X, but rather to the vector subspace of the continuous linear
functionals, which is also called the topological dual space.
DEFINITION (topological dual space): Given a normed vector space (a topo-
logical vector space) X, its (topological) dual space is the vector space X ′ of
all continuous linear functionals T : X → R.

The following characterization holds:


THEOREM (Characterization of the continuous linear functionals on a normed
space): let (X, ∥·∥) be a normed vector space, T : X → R a linear functional.
The following facts are equivalent:
(i) T is continuous;

(ii) T is continuous at the origin;

(iii) T is a bounded functional (footnote 8): there exists a constant C > 0 such that

|T (x)| ≤ C∥x∥ ∀x ∈ X;

(iv) The kernel of T , ker(T ) is a closed subspace of X.

8
In the context of linear maps, “bounded” does not mean that the map has a bounded
image: this is false as soon as the map is non-zero. It means that the image of the unit
ball is bounded!

PROOF: Let us prove the theorem on the continuity of linear functionals on
a normed space we stated yesterday. Obviously (i) ⇒ (ii), while (ii) ⇒ (i)
comes from the fact that $|T(x) - T(\bar x)| = |T(x - \bar x)|$ by linearity. (iii) ⇒ (ii)
is also obvious.
Conversely, let us show that (ii) ⇒ (iii): by definition of continuity at
0 (with ε = 1), there exists δ > 0 such that |T(x)| ≤ 1 whenever ∥x∥ ≤ δ.
Then for every y ∈ X, y ≠ 0, we have
$$|T(y)| = \left| T\!\left( \frac{\|y\|}{\delta} \cdot \frac{\delta\,y}{\|y\|} \right) \right| = \frac{\|y\|}{\delta} \left| T\!\left( \frac{\delta\,y}{\|y\|} \right) \right| \le \frac{1}{\delta}\,\|y\|,$$
and T is bounded.
It is obvious that (i) ⇒ (iv), because the preimage of the closed set {0}
under the continuous function T is closed.
To finish the proof, we show that (iv) ⇒ (ii). If T ≡ 0 we have nothing to
prove. Otherwise, suppose ker(T) is closed and assume by contradiction that
T is discontinuous at 0. Then there is a sequence $\{x_n\} \subset X$ with $\|x_n\| \to 0$
and such that $T(x_n) \not\to 0$. Up to subsequences, this implies that there is a
constant c > 0 such that $|T(x_n)| \ge c$ for every n.
Take an arbitrary point y ∈ X and consider the sequence
$$y_n = y - \frac{x_n}{T(x_n)}\,T(y):$$
one immediately checks that $y_n \in \ker(T)$ and $y_n \to y$ in the
norm (because $\|x_n\|/|T(x_n)| \le \|x_n\|/c \to 0$). It follows that $y \in \overline{\ker(T)} = \ker(T)$, whence ker(T) = X and T ≡ 0,
which contradicts our hypothesis that T ≢ 0. By the way, this last part of
the proof shows that the kernel of a linear functional is either closed (and
the linear functional is continuous) or dense (and the linear functional is
discontinuous)! Q.E.D.

3 Lecture of October 9, 2024 (3 hours)

We discovered that a linear functional T : X → R is continuous if and only
if it is bounded, i.e. if there is a constant C > 0 such that
$$|T(x)| \le C\|x\| \qquad \forall x \in X.$$
The norm of T ∈ X′ is the smallest constant C for which the inequality
holds. Precisely, we define
$$(*) \qquad \|T\|_{X'} := \sup\left\{ \frac{|T(x)|}{\|x\|} : x \in X,\ x \ne 0 \right\} = \sup\{|T(x)| : x \in X,\ \|x\| \le 1\}.$$
The topological dual of a normed space, equipped with the dual norm, is
always complete!
THEOREM (dual of a normed space): Let X be a (not necessarily complete)
normed vector space over R. Consider the topological dual X ′ of X, equipped
with the dual norm (*). Then (X ′ , ∥ · ∥X ′ ) is a Banach space.
PROOF: Checking that ∥ · ∥X ′ is a norm is an easy exercise we leave to the
reader.
We need to show completeness: let $\{T_n\} \subset X'$ be a Cauchy sequence
in the dual norm $\|\cdot\|_{X'}$. For a fixed ε > 0, there exists ν ∈ N such that
$\|T_n - T_m\|_{X'} \le \varepsilon$ for m, n ≥ ν. Then, for every x ∈ X and m, n ≥ ν we have
$$(**) \qquad |T_n(x) - T_m(x)| \le \varepsilon\|x\|,$$
and the real sequence $\{T_n(x)\}$ is Cauchy and converges to a real number we
denote T(x). This pointwise limit T : X → R is clearly linear. To conclude,
we only need to verify that T is continuous and $T_n \to T$ in the norm of X′.
Passing to the limit as m → +∞ in (∗∗) we get
$$|T_n(x) - T(x)| \le \varepsilon\|x\| \qquad \forall x \in X,\ \forall n \ge \nu,$$
i.e. $\|T_n - T\|_{X'} \le \varepsilon$ for n ≥ ν. T is also bounded because, by the previous
inequality,
$$\|T\|_{X'} \le \|T_\nu\|_{X'} + \|T_\nu - T\|_{X'} < +\infty.$$
Q.E.D.

REMARK/EXERCISE: With a similar argument, you can prove the following
important fact. Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be normed spaces, with Y
a Banach space. Then the vector space L(X; Y) of continuous linear maps
between X and Y is a Banach space with the norm
$$\|T\|_{L(X;Y)} = \sup\{\|T(x)\|_Y : x \in X,\ \|x\|_X \le 1\}.$$

Let us give other important examples of Banach spaces (and one which
is not Banach):
1. The space $L^p(\Omega)$, $1 \le p < +\infty$, is defined as the set of measurable functions
$u : \Omega \to \mathbb{R}$ for which
$$\|u\|_{L^p(\Omega)} := \left( \int_\Omega |u(x)|^p\,dx \right)^{1/p}$$
is finite, quotiented w.r.t. the equivalence relation
$$u \sim v \iff u(x) = v(x) \text{ for a.e. } x \in \Omega.$$
This is a norm on $L^p$, as we will see later in the course.
More generally, if Ω is a set and µ is an outer measure on Ω, for a
µ-measurable function u : Ω → R we define
$$\|u\|_{L^p(\mu)} := \left( \int_\Omega |u(x)|^p\,d\mu(x) \right)^{1/p}.$$
The space $L^p(\mu)$ is then defined as above.
The spaces $L^\infty(\Omega)$, $L^\infty(\mu)$ are slightly harder to define: we postpone
the definition a little bit.

2. The space of sequences $\ell^p$ is defined as follows: given a real sequence
$\{x_n\}$ and p ∈ [1, +∞), let
$$\|\{x_n\}\|_{\ell^p} := \left( \sum_{n=1}^{\infty} |x_n|^p \right)^{1/p}.$$

The space ℓp is then the space of those sequences which have finite
norm.
We will now see that this space is a particular case of Lp (µ), obtained
when µ is the counting measure over N.
Indeed, one can check that given a sequence $\{a_n\}_{n\in\mathbb{N}}$ of nonnegative
real numbers, one has
$$\int_{\mathbb{N}} a_n\,d\mu(n) = \sum_{n=1}^{\infty} a_n.$$

Indeed, let $\{a^N_n\}_n$ be the sequence truncated to the first N terms, i.e.
$a^N_n = a_n$ if 1 ≤ n ≤ N, $a^N_n = 0$ otherwise. As N → +∞, these
sequences increase and converge pointwise to the original sequence. By
the monotone convergence theorem we thus get that the integral of {an }
is the limit of the integrals of the truncated sequences: but those are
simply the partial sums of the series (because the truncated sequences
are “simple functions” for the counting measure!).
By applying the dominated convergence theorem, one verifies that the
statement still holds for absolutely convergent series (with terms of
arbitrary sign).
As an exercise, we also checked that summability with respect to the
counting measure of an arbitrary family of numbers $\{a_\alpha\}_{\alpha\in A}$ (where A
is a possibly uncountable set of indices) implies that $\{\alpha \in A : a_\alpha \ne 0\}$
is at most countable. Indeed, we can write the latter set of indices as
the union of the following sequence of finite sets: $\{\alpha \in A : |a_\alpha| > 1/k\}$,
k = 1, 2, 3, ...
All the spaces ℓp , Lp (Ω), Lp (µ) are honest Banach spaces: details will
come later!

EXAMPLE: We show that the supremum in the definition of the dual norm
is not always a maximum. Consider the vector space $\ell^1 = \{\{x_n\}_{n\in\mathbb{N}} : \sum_{n=1}^{\infty} |x_n| < +\infty\}$, with the norm
$$\|\{x_n\}\|_{\ell^1} = \sum_{n=1}^{\infty} |x_n|.$$
We will show later on that this is a Banach space.


Consider next the linear functional $T : \ell^1 \to \mathbb{R}$ defined by
\[ T(\{x_n\}) = \sum_{n=1}^{\infty} (1 - 1/n)\, x_n. \]

One easily checks that this is well defined on ℓ1 (the series is absolutely
convergent) and it is linear. It is also bounded:

\[ (***) \qquad |T(\{x_n\})| \le \sum_{n=1}^{\infty} (1 - 1/n)\,|x_n| \le \|\{x_n\}\|_{\ell^1}, \]

and notice that the last inequality is strict if {xn } is different from the zero
sequence.
From (***) we infer ∥T ∥(ℓ1 )′ ≤ 1. On the other hand, the dual norm is 1:
there is a sequence ek in ℓ1 (a sequence of sequences...) such that ∥ek ∥ℓ1 = 1
and T (ek ) → 1. It suffices to choose as ek the “k-th canonical basis vector”,
i.e. the sequence whose k-th element is 1, while all other elements are 0: we
thus have T (ek ) = 1 − 1/k and

∥T ∥(ℓ1 )′ = 1.

The maximum in the definition of dual norm is not attained because (***)
is strict for all non zero sequences.
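The behaviour of T on the canonical basis vectors can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `T` and `e`, and the representation of a finitely supported sequence as a Python list, are assumptions made for the example and do not come from the notes.

```python
from fractions import Fraction

def T(x):
    """Apply T({x_n}) = sum_n (1 - 1/n) x_n to a finitely supported sequence.

    x is a list [x_1, x_2, ...]; all later terms are taken to be zero.
    """
    return sum((1 - Fraction(1, n)) * xn for n, xn in enumerate(x, start=1))

def e(k):
    """k-th canonical basis vector, stored as a list of length k."""
    return [0] * (k - 1) + [1]

# The values T(e_k) = 1 - 1/k get arbitrarily close to 1 but never reach it.
for k in (1, 2, 10, 1000):
    print(k, T(e(k)))
```

Exact rational arithmetic is used so that the strict inequality T(e_k) < 1 is not blurred by rounding.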
Let us discuss some further examples of infinite dimensional normed
spaces:

1. The space $C^1([a, b])$ with the norm $\|f\|_{C^1} := \|f\|_\infty + \|f'\|_\infty$ is a Banach
space.
Indeed, let us check that this obvious norm is complete. Let {fn } be
a Cauchy sequence in C 1 . Then both sequences {fn } and {fn′ } are
Cauchy sequences for the norm ∥ · ∥∞ , which is a complete norm on the
space C 0 : it follows that there are continuous functions f, g such that
fn → f and fn′ → g uniformly. Now, for every x ∈ [a, b] we have
\[ f_n(x) = f_n(a) + \int_a^x f_n'(t)\, dt. \]
Passing to the limit as $n \to +\infty$ we get
\[ f(x) = f(a) + \int_a^x g(t)\, dt, \]

whence f ∈ C 1 and f ′ = g.

2. The space $C^1([a, b])$ with the norm $\|\cdot\|_\infty$ is not complete: there are
sequences of $C^1$ functions which converge uniformly to functions which
are not differentiable. For instance, take $[a, b] = [-1, 1]$ and
$f_n(x) = \sqrt{x^2 + 1/n}$: this sequence converges uniformly to $|x|$.
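The uniform convergence in the last example can be observed numerically. The following is a rough sketch, with the function names and the sampling grid chosen only for illustration; the distance is largest at x = 0, where it equals 1/sqrt(n).

```python
import math

def f(n, x):
    """f_n(x) = sqrt(x^2 + 1/n), a smooth approximation of |x|."""
    return math.sqrt(x * x + 1.0 / n)

def sup_distance(n, samples=2001):
    """Approximate sup over [-1, 1] of |f_n(x) - |x|| on a uniform grid."""
    grid = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    return max(abs(f(n, x) - abs(x)) for x in grid)

# Compare the sampled sup distance with the value 1/sqrt(n) attained at x = 0.
for n in (1, 100, 10000):
    print(n, sup_distance(n), 1.0 / math.sqrt(n))
```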

In our discussion of the completeness of Rn with various norms, we briefly
touched the notion of equivalent norms: two norms on the same vector space
are said to be equivalent if they induce the same topology, i.e. if the open
sets for the two norms are the same.
Let now X be a vector space with two norms ∥·∥1 , ∥·∥2 . Let us check that
the two norms are equivalent if and only if there are two constants c, C > 0
such that
(∗) c∥x∥1 ≤ ∥x∥2 ≤ C∥x∥1 ∀x ∈ X.
Indeed, in a metric space the open sets are defined starting from open
balls: the topology is the same if and only if, given a ball for any one of the
two norms, it is possible to find a ball for the other norm (with a suitably
small radius) which is contained in the first.
By homogeneity of the norm, balls in a normed space are obtained
from the unit ball by scaling and translating. But the two inequalities (*)
imply that $B^{(1)}_{1/C}(0) \subset B^{(2)}_1(0)$ and $B^{(2)}_c(0) \subset B^{(1)}_1(0)$, where $B^{(i)}_r(0)$
denotes the open ball of radius $r$ for the norm $\|\cdot\|_i$.
Conversely, if the two norms induce the same topology, then (*) holds.
Indeed, the ball $B^{(1)}_1(0)$ is an open set for both norms: there exists $r > 0$ such
that $B^{(2)}_r(0) \subset B^{(1)}_1(0)$. Given $x \in X$, $x \ne 0$ we obviously have
$y = \frac{r}{2} \frac{x}{\|x\|_2} \in B^{(2)}_r(0)$, whence $\|y\|_1 < 1$ and $\frac{r}{2}\|x\|_1 \le \|x\|_2$. The other inequality is
proved by means of a similar argument.
Thanks to (*), we see that two equivalent norms have the same Cauchy
sequences: completeness is preserved when passing to equivalent norms. In-
stead, the norms ∥·∥∞ and ∥·∥1 on the space C 0 ([a, b]) of continuous functions
are not equivalent (the first is complete, the second is not). In fact, the topol-
ogy induced by the norm ∥ · ∥∞ is stronger (it has more open sets) than the
topology induced by ∥ · ∥1 ... Open balls for the norm ∥ · ∥∞ are not open
w.r.t. the norm ∥ · ∥1 , while the opposite is true.

We already mentioned the fact that all norms on Rn are equivalent. Here
is the proof:
THEOREM: All norms on Rn are equivalent.
PROOF: Because of transitivity of the equivalence between norms, it suffices
to show that the euclidean norm | · | is equivalent to any other norm ∥ · ∥. We
have to check that there are constants c, C > 0 such that c|x| ≤ ∥x∥ ≤ C|x|
for every x ∈ Rn . If x ̸= 0, dividing by |x| and using the homogeneity of
norms, we see that our inequalities are true if and only if

(∗∗) c ≤ ∥x∥ ≤ C ∀x ∈ Rn , |x| = 1.

Now, the function $x \mapsto \|x\|$ is continuous w.r.t. the euclidean topology.
As a matter of fact, it is Lipschitz continuous because of the following
inequalities:
\[ \bigl|\, \|x\| - \|y\| \,\bigr| \le \|x - y\| = \Bigl\| \sum_{i=1}^{n} (x_i - y_i) e_i \Bigr\| \le \sum_{i=1}^{n} |x_i - y_i| \, \|e_i\| \le |x - y| \Bigl( \sum_{i=1}^{n} \|e_i\| \Bigr), \]

where ei are the vectors of the canonical basis of Rn (and we used the ho-
mogeneity of the norm and the triangle inequality).
On the other hand, the unit sphere S = {x ∈ Rn : |x| = 1} is compact
for the euclidean topology, so (**) is fulfilled with c = min{∥x∥ : x ∈ S},
C = max{∥x∥ : x ∈ S}, which exist by Weierstrass theorem (and c ̸= 0
because the norm ∥ · ∥ is non degenerate). Q.E.D.
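For concrete norms on R^2 the constants c and C of the proof can be estimated by sampling the euclidean unit circle. The sketch below is only a numerical illustration: the function names and the number of sample points are assumptions, and sampling gives approximations of the true minimum and maximum.

```python
import math

def equivalence_constants(norm, samples=3600):
    """Approximate c = min and C = max of `norm` on the euclidean unit circle."""
    values = [norm((math.cos(2 * math.pi * k / samples),
                    math.sin(2 * math.pi * k / samples)))
              for k in range(samples)]
    return min(values), max(values)

def l1_norm(x):
    return abs(x[0]) + abs(x[1])

def sup_norm(x):
    return max(abs(x[0]), abs(x[1]))

print(equivalence_constants(l1_norm))   # close to (1, sqrt(2))
print(equivalence_constants(sup_norm))  # close to (1/sqrt(2), 1)
```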

From this result, it follows that all norms on a real, finite dimensional
vector space are equivalent, and that every linear isomorphism between such
a space and Rn with the euclidean norm is a homeomorphism (exercise!). So
every normed vector space of finite dimension n is isomorphic to Rn as a
normed space (that is, through a linear homeomorphism).

One of the most important and useful results for the study of the dual
space of a Banach space is the following:
THEOREM (Hahn-Banach): Let X be a real vector space, p : X → [0, +∞)
a map such that

(i) p(λx) = λp(x) for all x ∈ X and for all λ > 0 (positive homogeneity);

(ii) p(x + y) ≤ p(x) + p(y) for all x, y ∈ X (subadditivity).

Let Y be a proper subspace of X, T : Y → R a linear functional such
that T (x) ≤ p(x) for every x ∈ Y . Then there exists a linear functional
T̃ : X → R which extends T (i.e. T̃ (x) = T (x) for every x ∈ Y ) and such
that T̃ (x) ≤ p(x) for every x ∈ X.
As a particular case, if X is a normed space, every bounded linear
functional on Y extends to a bounded linear functional defined on the whole
space X and with the same dual norm.
Before we see the proof, let us derive some important consequences of the
Hahn-Banach theorem. One of these is the following: if x ∈ X there exists
T ∈ X ′ such that ∥T ∥ = 1 and T (x) = ∥x∥.
Indeed, this functional can be defined first on the straight line R{x} by
letting T (tx) = t∥x∥ (its dual norm is obviously 1), and then extended to
the whole space X with the Hahn-Banach theorem.
In particular, this also implies that for every x ∈ X one has

∥x∥ = max{T (x) : T ∈ X ′ , ∥T ∥ ≤ 1} :

one inequality is obvious, the equal sign follows by using the functional con-
structed above!
Another consequence of the theorem is the fact that the dual space of
a normed space separates points: given any two distinct vectors x, y ∈ X, we can
always find an element T of the dual space which takes different values on those
two points. Indeed, it is enough to choose a functional T which takes the value
∥x−y∥ on the vector x−y: it follows that T (x)−T (y) = T (x−y) ≠ 0. Notice
that this property does not hold for a general topological vector space: there
are examples of quite honest (metrizable and complete) topological vector
spaces, whose topological dual contains only the zero functional!
PROOF: The main part of the proof consists in proving the following assertion:
if Z is a proper subspace of X and T : Z → R is linear with T (x) ≤ p(x) for
every x ∈ Z, then T extends to a strictly larger subspace in such a way that
the inequality still holds for the extension.
To this aim, choose x0 ∈ X \ Z: we extend T to a functional T̃ defined on
the space Z ⊕ R{x0 }. By linearity we have, for every x ∈ Z and all t ∈ R:

T̃ (x + tx0 ) = T (x) + tα,

where α = T̃ (x0 ) is a real number we must choose in such a way that

T (x) + tα ≤ p(x + tx0 ) ∀x ∈ Z, ∀t ∈ R.

By using again the linearity of T and the positive homogeneity of p, and
by distinguishing the cases t > 0 and t < 0, we easily check that the last
inequality is equivalent to the following two:

T (x) + α ≤ p(x + x0 ) ∀x ∈ Z,
T (y) − α ≤ p(y − x0 ) ∀y ∈ Z,

that is, α must satisfy

T (y) − p(y − x0 ) ≤ α ≤ p(x + x0 ) − T (x) ∀x, y ∈ Z.

This is certainly possible, provided the left hand side is always less than or
equal to the right hand side, for every choice of x, y ∈ Z. But this is true
because T (x) + T (y) = T (x + y) ≤ p(x + y) ≤ p(x + x0 ) + p(y − x0 ) for all
x, y ∈ Z (the last step is subadditivity applied to x + y = (x + x0 ) + (y − x0 )).
Our claim is proved.
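The one-step extension can be made concrete in R^2. In the sketch below everything specific is an assumption chosen for illustration: p is the l^1 norm, Z is the horizontal axis, T(s, 0) = s/2 and x0 = (0, 1). The bounds for α are computed only on finitely many sample points of Z, so in general they approximate the true sup and inf.

```python
def p(x):
    """Sublinear function on R^2: here the l^1 norm (an illustrative choice)."""
    return abs(x[0]) + abs(x[1])

def T(z):
    """Functional on Z = {(s, 0)}: T(s, 0) = s / 2, so that T <= p on Z."""
    return z[0] / 2.0

x0 = (0.0, 1.0)                                        # vector outside Z
Z_samples = [(s / 10.0, 0.0) for s in range(-50, 51)]  # sample points of Z

# alpha must lie between these two bounds (computed on the samples only).
lower = max(T(y) - p((y[0] - x0[0], y[1] - x0[1])) for y in Z_samples)
upper = min(p((x[0] + x0[0], x[1] + x0[1])) - T(x) for x in Z_samples)

def extend(alpha):
    """Extension to Z + R x0 = R^2: (s, t) -> T(s, 0) + t * alpha."""
    return lambda v: T((v[0], 0.0)) + v[1] * alpha

print(lower, upper)  # admissible range for alpha on the sampled points
```

In this example every α between the two bounds gives an admissible extension, which also shows that the extension is in general not unique.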
Tomorrow we will deduce the general case from this claim, through an
appropriate application of the axiom of choice.

4 Lecture of October 10, 2024 (3 hours)


To conclude the proof of Hahn-Banach’s Theorem we need one of the forms
of the choice axiom, for instance Hausdorff’s maximality principle. Consider
indeed the family F of all linear functionals defined on subspaces Z of X
which contain Y , which are also extensions of the original functional T and
are dominated by the function p. We next put an order relation on F: if
R : Z1 → R and S : Z2 → R are two elements of F, then R ⪯ S iff
Z1 ⊂ Z2 and S extends R. By Hausdorff’s maximality principle, we can find
a maximal totally ordered subset G ⊂ F,

G = {Sσ : Zσ → R, σ ∈ I}.

This set has an upper bound: this is the functional $S$ defined on the subspace
\[ Z = \bigcup_{\sigma \in I} Z_\sigma \]
(why is it a subspace?) by $S(x) = S_\sigma(x)$ whenever $x \in Z_\sigma$ (this is a good
definition because G is totally ordered).
But then Z = X and S is the required extension: otherwise, the claim
we proved at the beginning of this discussion would allow us to extend S to
a strictly larger subspace, thus contradicting the maximality of the totally
ordered subset G: this concludes the proof of our main statement.
The “particular case” for normed spaces follows immediately by choosing
$p(x) = \|T\|_{Y'} \|x\|$: applying the inequality $\tilde T \le p$ to both $x$ and $-x$ gives
$|\tilde T(x)| \le \|T\|_{Y'} \|x\|$. Q.E.D.
EXERCISE: In general, given a linear functional T which is bounded on a
subspace Y of (X, ∥ · ∥), its extension T̃ ∈ X ′ given by the Hahn-Banach
theorem is not unique: there are easy examples also in finite dimensional
spaces. Show that if Y is a dense subspace of X, that is if $\overline{Y} = X$, then the
extension is unique.

Our discussion of the consequences of the Hahn-Banach theorem is far
from concluded. But before we proceed further, we will make a more
“concrete” digression, to build a good number of important examples of infinite
dimensional Banach spaces. In particular, some time ago we introduced
(without proofs) the spaces ℓp : let us verify that ∥ · ∥ℓp is indeed a
norm, and that those spaces are complete. The same will be done for the spaces
Lp (µ), with µ an arbitrary outer measure.
To this aim, we begin with a simple inequality in R.
If p ∈ (1, +∞), its conjugate exponent q is defined through the equation
\[ \frac{1}{p} + \frac{1}{q} = 1. \]
As a natural extension, the conjugate exponent of p = 1 is q = +∞, and vice
versa.
The following inequality holds (Young’s inequality):
LEMMA: If p, q are conjugate exponents, 1 < p < +∞, then
\[ ab \le \frac{1}{p} a^p + \frac{1}{q} b^q \qquad \forall a > 0,\ b > 0. \]

PROOF: By the concavity of the logarithm we get:
\[ \log(ab) = \frac{1}{p} \log a^p + \frac{1}{q} \log b^q \le \log\left( \frac{1}{p} a^p + \frac{1}{q} b^q \right), \]
whence the thesis.
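As a sanity check, Young's inequality can be tested on random inputs. This is a minimal sketch: the function name `young_gap`, the random ranges and the seed are arbitrary choices made for the example.

```python
import random

def young_gap(a, b, p):
    """Return a^p/p + b^q/q - a*b, where q is the conjugate exponent of p."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b

random.seed(0)
gaps = [young_gap(random.uniform(0.01, 5.0), random.uniform(0.01, 5.0),
                  random.uniform(1.1, 6.0)) for _ in range(10000)]
print(min(gaps))  # nonnegative up to rounding errors
```

The gap vanishes when a^p = b^q, for instance for a = b = 2 and p = q = 2.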

A very important tool in the study of the spaces ℓp , Lp is Hölder's inequality.
We will need it, in particular, to show that these are normed spaces.
The case p = 1 and q = +∞ will be analyzed after the main case, because we
first need to define the L∞ norm!

PROPOSITION (Hölder inequality): Let 1 ≤ p ≤ +∞, q its conjugate exponent.
Then for µ-measurable functions u, v : X → R the following holds:
\[ \int_X |u(x)v(x)| \, d\mu(x) \le \|u\|_{L^p(\mu)} \|v\|_{L^q(\mu)}. \]

PROOF: We first consider the case 1 < p < +∞: the case p = 1, q = +∞ is
postponed till later.
Hölder's inequality is a simple consequence of Young's inequality. Notice
first that the $L^p$ norms are homogeneous. We may then suppose without loss
of generality that $\|u\|_{L^p(\mu)} = \|v\|_{L^q(\mu)} = 1$. By Young's inequality we get
\[ |u(x)v(x)| \le \frac{1}{p} |u(x)|^p + \frac{1}{q} |v(x)|^q, \]
whence, integrating over X,
\[ \int_X |uv| \, d\mu \le \frac{1}{p} + \frac{1}{q} = 1, \]
which is the inequality we had to prove⁹.
To see the case p = 1, q = +∞, we first have to define the L∞ norm!
DEFINITION (L∞ (µ)): If µ is an outer measure on X and u : X → R is
measurable, define

∥u∥L∞ (µ) = esssup{|u(x)| : x ∈ X} := inf {t ∈ R : µ({x ∈ X : |u(x)| > t}) = 0} .

Loosely speaking, this is the sup of |u(x)| up to sets of measure zero. Of
course, L∞ (µ) will be defined as the space of the functions with finite norm,
quotiented by the usual equivalence relation.
REMARK: One has

µ({x ∈ X : |u(x)| > ∥u∥L∞ (µ) }) = 0.

Indeed, if $\|u\|_{L^\infty(\mu)} = +\infty$ there is nothing to prove, otherwise we can write
this set as a countable union of sets with zero measure:
\[ \{x \in X : |u(x)| > \|u\|_{L^\infty(\mu)}\} = \bigcup_{n=1}^{\infty} \Bigl\{x \in X : |u(x)| > \|u\|_{L^\infty(\mu)} + \frac{1}{n}\Bigr\}. \]

Let us conclude our proof and check that Hölder's inequality is true in
the limit case p = 1, q = +∞: we may suppose $\|v\|_{L^\infty} < +\infty$ (otherwise the
inequality is obvious). We know that the set $A = \{x \in X : |v(x)| > \|v\|_\infty\}$
has measure 0. It follows that the integral of any nonnegative measurable
function on $X$ and on $X \setminus A$ are the same. Then
\[ \int_X |u(x)v(x)| \, d\mu(x) \le \|v\|_\infty \int_X |u(x)| \, d\mu(x) = \|v\|_\infty \|u\|_{L^1}, \]
as we wanted. Q.E.D.

⁹ In case one of the norms is zero, or if one or both are +∞, the inequality is obvious!
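For the counting measure on a finite set, Hölder's inequality is a statement about finite sums and can be tested directly, including the limit cases. The code below is a hedged sketch: the helper names and the random test vectors are assumptions made for the example.

```python
import random

def lp_norm(x, p):
    """l^p norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def conjugate(p):
    """Conjugate exponent q with 1/p + 1/q = 1 (q = inf when p = 1)."""
    if p == 1.0:
        return float("inf")
    if p == float("inf"):
        return 1.0
    return p / (p - 1.0)

def holder_gap(u, v, p):
    """||u||_p * ||v||_q - sum |u_n v_n|; Hölder's inequality says this is >= 0."""
    lhs = sum(abs(a * b) for a, b in zip(u, v))
    return lp_norm(u, p) * lp_norm(v, conjugate(p)) - lhs

random.seed(1)
u = [random.uniform(-1, 1) for _ in range(50)]
v = [random.uniform(-1, 1) for _ in range(50)]
for p in (1.0, 1.5, 2.0, 4.0, float("inf")):
    print(p, holder_gap(u, v, p))
```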

With Hölder’s inequality we finally prove that Lp norms are norms:


THEOREM: If 1 ≤ p ≤ +∞, the spaces (Lp (µ), ∥ · ∥Lp (µ) ) are normed vector
spaces.
PROOF: $L^p$ norms are obviously nonnegative and homogeneous, they take
finite values on the $L^p$ spaces and are not degenerate. Indeed we saw that
\[ \int_X |u(x)|^p \, d\mu(x) = 0 \ \Rightarrow \ u(x) = 0 \ \text{for } \mu\text{-a.e. } x \in X, \]
and so the $L^p$ norm is non degenerate thanks to the equivalence relation we have
put on measurable functions.
We are left to prove the triangle inequality, which in this case is called
Minkowski’s inequality. Again, this is obvious in the limit cases p = 1 and
p = +∞.
If 1 < p < +∞, take two measurable functions u, v and apply Hölder's
inequality (recall that $q = p/(p-1)$):
\[ \int_X |u+v|^p \le \int_X |u|\,|u+v|^{p-1} + \int_X |v|\,|u+v|^{p-1} \le (\|u\|_p + \|v\|_p) \left( \int_X |u+v|^p \right)^{(p-1)/p}. \]
Minkowski's inequality is then obtained by dividing both sides by $\left( \int_X |u+v|^p \right)^{(p-1)/p}$.
But to do this, we must know that this quantity is finite as soon as $\|u\|_p$ and
$\|v\|_p$ are finite: this follows from the convexity of the function $s \mapsto s^p$ because
\[ \left( \frac{|u(x)| + |v(x)|}{2} \right)^p \le \frac{1}{2} |u(x)|^p + \frac{1}{2} |v(x)|^p. \]
Q.E.D.

REMARK: What we proved for the spaces Lp (µ) also applies to the spaces
ℓp : we already proved that the latter are just a particular case of the former, with
X = N and µ the counting measure.

43
We are now in position to prove that the spaces Lp (µ) are indeed Banach
spaces:
THEOREM (Riesz-Fischer): Let µ be an outer measure on a set X, 1 ≤
p ≤ +∞. Then the spaces Lp (µ) are complete. Moreover, given a sequence
{fn } which converges to some function f in the norm Lp , we can extract a
subsequence {fnk } such that fnk (x) → f (x) for µ-a.e. x ∈ X.
PROOF: We begin with the case 1 ≤ p < +∞: the case p = +∞ is different (and also
easier!) and will be shown at the end.
Let $\{f_n\}$ be a Cauchy sequence in $L^p(\mu)$: it is easy to construct an
increasing sequence of indices $n_k$ in such a way that
\[ \|f_{n_{k+1}} - f_{n_k}\|_{L^p} \le \frac{1}{2^k}, \qquad k = 1, 2, 3, \dots \]
Next, consider the functions
\[ g_K(x) = \sum_{k=1}^{K} |f_{n_{k+1}}(x) - f_{n_k}(x)|, \qquad g(x) = \sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)| \]
(where the last series makes perfect sense, because its terms are nonnegative).
By the triangle inequality and our choice of $n_k$, we immediately check that
$\|g_K\|_{L^p} \le 1$ for every $K$. Moreover, the monotone convergence theorem
ensures that $\|g_K\|_{L^p} \to \|g\|_{L^p}$, whence $g \in L^p(\mu)$, so that $g(x) < +\infty$ for
almost every $x \in X$.
Consider now the telescopic sums
\[ \sum_{k=1}^{K} \bigl( f_{n_{k+1}}(x) - f_{n_k}(x) \bigr) = f_{n_{K+1}}(x) - f_{n_1}(x) \]
and the corresponding series $\sum_{k=1}^{\infty} ( f_{n_{k+1}}(x) - f_{n_k}(x) )$: the latter is absolutely
convergent for almost all $x$ (the series of absolute values converges to $g(x)$),
so the limit
\[ \lim_{k \to +\infty} f_{n_k}(x) =: f(x) \]
exists for almost every $x \in X$. On the zero measure set where we do not
have convergence, we can define the pointwise limit in an arbitrary way, for
instance by putting $f(x) = 0$.
From the previous inequalities we also get
\[ |f_{n_k}(x)| \le |f_{n_1}(x)| + g(x) \in L^p(\mu), \]
and the pointwise limit satisfies the same inequality, so that $f \in L^p(\mu)$.
Finally, the dominated convergence theorem ensures that $f_{n_k} \to f$ in $L^p(\mu)$
(the sequence $f_{n_k} - f$ is dominated by $2g + |f_{n_1}| \in L^p$).
It is a very easy exercise to show that when a Cauchy sequence has a
convergent subsequence, the whole sequence converges to the same limit:
this fact holds in any metric space.
To finish the proof, we only have to check the last claim that whenever
fn → f in Lp we have a subsequence for which we have pointwise convergence
a.e. to f . Of course, if fn converges in Lp , it is a Cauchy sequence. The
theorem we just proved gives us a subsequence which converges a.e. and in
Lp to some function: this is necessarily equal a.e. to f by the uniqueness of
the limit in Lp .
Let us consider now the case p = +∞.
Let then $\{f_n\}$ be a Cauchy sequence in $L^\infty$. For every $k \in \mathbb{N}$ there exists
an index $n_k$ such that $\|f_m - f_n\|_\infty < \frac{1}{k}$ for every $m, n \ge n_k$. Then the sets
\[ A_k = \{x \in X : |f_n(x) - f_m(x)| > 1/k \ \text{for some } m, n \ge n_k\}, \qquad A = \bigcup_{k=1}^{\infty} A_k \]
all have measure zero (by definition of the $L^\infty$ norm, each $A_k$ being a countable
union of sets of measure zero). Then for every $x \in X \setminus A$ the sequence $f_n(x)$ is
a Cauchy sequence in R, and it converges to
a pointwise limit $f(x)$ (which we extend as above by putting for instance
$f(x) = 0$ for $x \in A$). Passing to the limit as $m \to +\infty$ in the inequality
\[ |f_n(x) - f_m(x)| \le 1/k \qquad \forall x \in X \setminus A,\ \forall m, n \ge n_k \]
we get
\[ |f_n(x) - f(x)| \le 1/k \qquad \forall x \in X \setminus A,\ \forall n \ge n_k, \]
whence $\|f_n - f\|_\infty \le 1/k$ for all $n \ge n_k$ and $f_n \to f$ in $L^\infty$.
Q.E.D.
REMARK: In general, if 1 ≤ p < +∞, convergence in Lp (µ) does not imply
convergence a.e. of the whole sequence. This is true in the space L∞ (see the
proof of the theorem!).
We now give an example of a sequence $\{u_i\}$ which converges to 0 in
$L^p([0, 1])$ (for every 1 ≤ p < +∞), but which does not converge to 0 at
any point of the interval [0, 1]. The construction goes as follows: if
$i \in [2^k, 2^{k+1} - 1] \cap \mathbb{N}$, we put
\[ u_i(x) = 1_{[0, 2^{-k}]}\left( x - \frac{i - 2^k}{2^k} \right). \]
[Animation of the sequence $\{u_i\}$ not reproduced here.]
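The sequence can be explored with a short script. The sketch below uses exact rational arithmetic; the function names are hypothetical, and the point x = 3/10 is an arbitrary choice. It shows that the integral of |u_i|^p is the interval length 2^(-k), which tends to 0, while u_i(x) = 1 for at least one index i in every dyadic block.

```python
from fractions import Fraction

def block(i):
    """Return (k, j) with i = 2^k + j and 0 <= j < 2^k, for an integer i >= 1."""
    k = i.bit_length() - 1
    return k, i - 2 ** k

def u(i, x):
    """u_i(x): indicator of the interval [j / 2^k, (j + 1) / 2^k]."""
    k, j = block(i)
    return 1 if Fraction(j, 2 ** k) <= x <= Fraction(j + 1, 2 ** k) else 0

def integral_of_u_p(i):
    """Integral over [0, 1] of |u_i|^p: the interval length 2^(-k), for any finite p."""
    k, _ = block(i)
    return Fraction(1, 2 ** k)

x = Fraction(3, 10)
hits = [i for i in range(1, 2 ** 8) if u(i, x) == 1]
print(hits)                      # indices i < 256 with u_i(x) = 1
print(integral_of_u_p(2 ** 7))   # 1/128
```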

We will next study the dual space of Lp (µ): we will see that in “most
cases”, it can be identified with Lq (µ), where q is the conjugate exponent to
p.
More precisely, we will consider the linear map
\[ \Phi : L^q(\mu) \to (L^p(\mu))', \qquad v \mapsto T_v, \]
where $T_v(u) := \int_X v(x)u(x) \, d\mu(x)$.
First of all, Tv (u) is well defined: indeed, by Hölder’s inequality the func-
tion uv is summable. Moreover, Tv is linear and, again by Hölder,

|Tv (u)| ≤ ∥v∥Lq · ∥u∥Lp ∀u ∈ Lp ,

so that Tv is continuous and ∥Tv ∥(Lp )′ ≤ ∥v∥Lq .


We will see that for 1 < p ≤ +∞ we actually have ∥Tv ∥(Lp )′ = ∥v∥Lq : the
map Φ is a linear isometry between the two spaces. The same holds also for
p = 1 if we make some mild assumption on the measure.
Moreover, for 1 ≤ p < +∞ the map Φ is surjective. So, for 1 ≤ p < +∞
(and µ “good enough” in case p = 1), the map Φ is actually a linear isometric
isomorphism between Lq and (Lp )′ .
We will prove part of this theorem next time. On the other hand, the
surjectivity of the map Φ (for finite p) will be proved much later, after we
develop the necessary tools.

5 Lecture of October 16, 2024 (3 hours)


Let us show that $\Phi : v \mapsto T_v$ is a linear isometry between $L^q(\mu)$ and $(L^p(\mu))'$.
By Hölder's inequality it follows that $\|T_v\|_{(L^p)'} \le \|v\|_{L^q}$. But choose (for $v \ne 0$)
$u(x) = \mathrm{sgn}(v(x)) \, |v(x)|^{q-1} / \|v\|_{L^q}^{q-1}$: one immediately checks that the $L^p$ norm
of this function is 1 and $T_v(u) = \|v\|_{L^q}$, so we get equality of the norms.
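For the counting measure on a finite set the norming function u can be computed explicitly. This is only a sketch under that assumption: the helper names, the exponent p = 3 and the test vector v are arbitrary choices.

```python
def lp_norm(x, p):
    """l^p norm of a finite sequence, 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def norming_element(v, q):
    """u_n = sgn(v_n) |v_n|^(q-1) / ||v||_q^(q-1), for a nonzero finite sequence v."""
    scale = lp_norm(v, q) ** (q - 1.0)
    sgn = lambda t: (t > 0) - (t < 0)
    return [sgn(t) * abs(t) ** (q - 1.0) / scale for t in v]

p = 3.0
q = p / (p - 1.0)          # conjugate exponent, here 3/2
v = [1.0, -2.0, 0.5, 0.0, 4.0]
u = norming_element(v, q)

pairing = sum(a * b for a, b in zip(u, v))   # T_v(u) for the counting measure
print(lp_norm(u, p), pairing, lp_norm(v, q))
```

The first printed value should be close to 1 and the last two should agree up to rounding, which is the equality of the norms in this special case.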

Notice that the proof works also for p = +∞ and q = 1.
We will discuss the case p = 1, q = +∞ next: in this case, if the measure
µ is sufficiently “nice”, then the map Φ is again a linear isometry. For
instance, this is true for the Lebesgue measure or for the counting measure
on N. But there are “pathological” measures for which the result is false.
We will see that the map Φ is also surjective for 1 ≤ p < +∞ (but not
for p = +∞). The proof of this fact will come later in the course thanks to
some powerful abstract tools from functional analysis.
Actually, to show surjectivity in the case 1 ≤ p < +∞ would be a relatively
easy exercise for the spaces ℓp , but we will prove the result later for a
much larger class of measures µ (the class of the σ-finite measures).
As a matter of fact, for 1 < p < +∞ the linear isometry Φ is always
an isomorphism between the Banach spaces Lq (µ) and (Lp (µ))′ , for every
measure µ.
The linear isometry Φ : Lq (µ) → (Lp (µ))′ in general is not surjective if
q = 1, p = +∞: we will show with an example that the dual space of ℓ∞ is
strictly larger than ℓ1 or, more precisely, its image under Φ.
On the contrary, for p = 1, q = ∞ and for “nice measures”, it is both an
isometry and surjective.
In order to be sure that the map $\Phi : L^\infty(\mu) \to (L^1(\mu))'$ defined above
is an isometry, we need a hypothesis on the measure: we need to know
that every measurable set with infinite measure has a measurable subset with
finite and strictly positive measure. Suppose that this is true, and notice
that for every $\varepsilon > 0$ there is a subset $A \subset X$ such that $0 < \mu(A) < +\infty$ and
$|v(x)| \ge \|v\|_{L^\infty} - \varepsilon$ for all $x \in A$ (if $\mu(\{x \in X : |v(x)| \ge \|v\|_{L^\infty} - \varepsilon\}) = +\infty$,
use the hypothesis on the measure to replace this set with a smaller one).
Consider the function
\[ u(x) = \begin{cases} \mathrm{sgn}(v(x))/\mu(A) & \text{if } x \in A, \\ 0 & \text{otherwise.} \end{cases} \]
Then $\|u\|_{L^1} = 1$ and $\int_X u(x)v(x) \, d\mu(x) \ge \|v\|_{L^\infty} - \varepsilon$. But $\varepsilon > 0$ is arbitrary, and
we deduce equality of the norms.
Again, in this case one can show that the map Φ is surjective: for “nice
enough” measures, the dual of $L^1$ can be identified with $L^\infty$.
Summarizing, the following theorem holds:
THEOREM (Duals of the $L^p$ spaces): Let µ be an outer measure on a set X.
Consider the map
\[ \Phi : L^q(\mu) \to (L^p(\mu))', \qquad v \mapsto T_v, \]
where $T_v(u) := \int_X u(x)v(x) \, d\mu(x)$ for every $u \in L^p(\mu)$. If 1 < p < +∞,
then the map Φ is an isometric isomorphism. If p = 1, q = +∞ and the
measure µ has the property that every measurable set with infinite measure
has a measurable subset with finite and strictly positive measure, then Φ is
again an isometric isomorphism.
Finally, if p = +∞, q = 1, then the map Φ is a linear isometry, but in
general it is not surjective.
As we said above, we will prove later the missing part of the theorem,
i.e. the surjectivity of Φ for finite p, and we will only do this for σ-finite
measures.

Instead, let us verify that the dual of ℓ∞ is strictly larger than ℓ1 : this
will require the Hahn-Banach Theorem.
EXAMPLE: Consider the linear subspace c of ℓ∞ of those sequences {ak }k
which have a finite limit as k → +∞. Define a linear functional T : c → R
as follows:
\[ T(\{a_k\}_k) = \lim_{k \to +\infty} a_k. \]

This is an element of the dual space of (c, ∥ · ∥ℓ∞ ): one immediately checks
its norm is 1.
By the Hahn-Banach theorem, T can be extended to an element of norm
1 of (ℓ∞ )′ , which we still denote by T : such a functional is known as a Banach limit.
I claim that $T \notin \Phi(\ell^1)$, i.e. there exists no sequence $\{y_k\}_k \in \ell^1$ such that
\[ (*) \qquad T(\{a_k\}) = \sum_{k=1}^{\infty} y_k a_k \qquad \forall \{a_k\} \in \ell^\infty. \]
Suppose by contradiction such a $\{y_k\}$ exists, and apply $T$ to the elements $e^h$
of the “canonical basis” (i.e. $e^h_k = \delta_{hk}$): obviously for each fixed $h$ we have
$T(e^h) = 0$, and plugging this into (*) we get $y_h = 0$. This contradicts the
fact that $T$ is not the zero functional.

We will now see a couple of important “geometric” consequences of the
Hahn-Banach Theorem, which tell us that sometimes a pair of convex sets
can be separated by means of a closed hyperplane.
To this aim, we need to introduce the fundamental notion of the Minkowski
functional of a convex set:
PROPOSITION (Minkowski functional): Let (X, ∥ · ∥) be a normed space, C
a convex open subset of X containing 0. We define the Minkowski functional
of C in the following way:
\[ p(x) = \inf\Bigl\{ t > 0 : \frac{x}{t} \in C \Bigr\}. \]
Then p(x) is a well-defined, real, positively homogeneous and subadditive
function¹¹. Moreover, there is a constant K > 0 such that

(∗) p(x) ≤ K∥x∥ ∀x ∈ X,

and finally
(∗∗) C = {x ∈ X : p(x) < 1}.

PROOF: Let $r > 0$ be such that $B_r(0) \subset C$ (possible because $C$ is open): for
every $x \in X$, $x \ne 0$, we then have $r \frac{x}{2\|x\|} \in C$, whence $p(x) \le \frac{2}{r}\|x\|$ and (*) is proved.
In particular, $p(x)$ is everywhere finite. Positive homogeneity is immediate.
Let us show (**): if $x \in C$, we use the fact that $C$ is open to find $\delta > 0$ such
that $(1 + \delta)x \in C$. It follows that $p(x) \le \frac{1}{1+\delta} < 1$. If conversely $p(x) < 1$
we can find $0 < t < 1$ such that $\frac{x}{t} \in C$. But then $x = t\left(\frac{x}{t}\right) + (1 - t)\,0 \in C$
thanks to convexity of $C$.
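We last show subadditivity (the triangle inequality) for the Minkowski
functional: let $x, y \in X$. By definition of $p$, and because $C$ is convex and
contains 0, for every $\varepsilon > 0$ we have
\[ x_0 = \frac{x}{p(x) + \varepsilon} \in C, \qquad y_0 = \frac{y}{p(y) + \varepsilon} \in C. \]
Take a convex combination $t x_0 + (1 - t) y_0$ with
\[ t = \frac{p(x) + \varepsilon}{p(x) + p(y) + 2\varepsilon}, \qquad (1 - t) = \frac{p(y) + \varepsilon}{p(x) + p(y) + 2\varepsilon}. \]
We thus deduce that
\[ \frac{x + y}{p(x) + p(y) + 2\varepsilon} \in C, \]
whence
\[ p(x + y) \le p(x) + p(y) + 2\varepsilon. \]
Since $\varepsilon$ is arbitrary, we have subadditivity. Q.E.D.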
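The Minkowski functional of a concrete convex set can be approximated by bisection on t, since for a convex set containing 0 the condition x/t ∈ C holds exactly for t > p(x). The sketch below is an illustration only: the set C (an open ellipse in R^2), the function names and the number of iterations are assumptions made for the example.

```python
import math
import random

def in_C(x):
    """Membership test for an open convex set containing 0: an ellipse (illustrative)."""
    return x[0] ** 2 / 4.0 + x[1] ** 2 < 1.0

def minkowski(x, inside=in_C, iterations=60):
    """Approximate p(x) = inf{t > 0 : x/t in C} by bisection on t."""
    hi = 1.0
    while not inside((x[0] / hi, x[1] / hi)):   # grow hi until x/hi lies in C
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if inside((x[0] / mid, x[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

def exact(x):
    """Closed form of the Minkowski functional of this ellipse."""
    return math.sqrt(x[0] ** 2 / 4.0 + x[1] ** 2)

random.seed(2)
points = [(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(200)]
print(max(abs(minkowski(x) - exact(x)) for x in points))  # small bisection error
```

Replacing `in_C` with the membership test of the open unit ball of a norm should return (approximately) that norm.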

We can now prove some very important results on the separation of convex
sets.

EXERCISE: Show that if C = B1 (0) is the unit open ball of our normed
space, the corresponding Minkowski functional is p(x) = ∥x∥.
¹¹ Compare with the statement of the Hahn-Banach theorem.
To show the geometric consequences of the Hahn-Banach theorem we will
need the following lemma, which is of independent interest:
LEMMA: Let C be a nonempty, convex open set in X, x0 ∈ X \ C. Then
there exists T ∈ X ′ such that

T (x) < T (x0 ) ∀x ∈ C.

PROOF: Choose $y \in C$ and define $\tilde C = C - \{y\}$, $\bar x = x_0 - y$. $\tilde C$ is a
convex open set containing the origin and $\bar x \notin \tilde C$. If $p$ denotes the Minkowski
functional of $\tilde C$, we have $p(\bar x) \ge 1$ while $\tilde C = \{x : p(x) < 1\}$.
On the 1-dimensional subspace $Y = \mathbb{R}\{\bar x\}$ consider the linear functional
$T \in Y'$ such that $T(t\bar x) = t\,p(\bar x)$. By positive homogeneity of the Minkowski
functional we immediately check that $T(t\bar x) \le p(t\bar x)$ for all $t \in \mathbb{R}$.
By Hahn-Banach, we can extend $T$ to a linear functional defined on the
whole of $X$, in such a way that $T(x) \le p(x)$ for every $x \in X$. We also have
$T \in X'$ thanks to (*) in the previous proposition.
Thus $T(x) < 1$ for all $x \in \tilde C$, while $T(\bar x) = p(\bar x) \ge 1$. By adding $y$
and using linearity of $T$ we deduce $T(x) < 1 + T(y)$ for all $x \in C$, while
$T(x_0) \ge 1 + T(y)$. Q.E.D.
Here is the “geometric version” of Hahn-Banach theorem:
THEOREM (Hahn-Banach, geometric version): Let (X, ∥ · ∥) be a normed
space, A and B nonempty, disjoint convex subsets of X. Then

(i) If A is open, there are T ∈ X ′ , T ̸= 0, α ∈ R such that

T (x) ≤ α ≤ T (y) ∀x ∈ A, ∀y ∈ B.

Geometrically, we can say that the (affine) closed hyperplane T (x) = α
separates the convex sets A and B.

(ii) If A and B are closed and A is also compact, then there are T ∈ X ′ ,
T ̸= 0, α ∈ R and ε > 0 such that

T (x) ≤ α − ε ∀x ∈ A, α + ε ≤ T (y) ∀y ∈ B.

Geometrically, the (affine) closed hyperplane T (x) = α strictly
separates the convex sets A and B.

PROOF: We first show (i): put C = A − B = {x − y : x ∈ A, y ∈ B}
(beware, this is an algebraic difference of sets!).

One immediately checks that $C$ is convex. It is also open, because we can
write $A - B = \bigcup_{y \in B} (A - \{y\})$.
Apply the lemma with x0 = 0: notice indeed that 0 ̸∈ C because A and
B are disjoint. We find T ∈ X ′ such that

T (x) < 0 = T (0) ∀x ∈ C,

whence (linearity of T )

T (x) < T (y) ∀x ∈ A, ∀y ∈ B.

(i) then follows by putting α = sup{T (x) : x ∈ A}.


To prove (ii), we set $A_\varepsilon = A + B_\varepsilon(0)$, $B_\varepsilon = B + B_\varepsilon(0)$. These are
obviously open convex sets. I claim that $A_\varepsilon \cap B_\varepsilon = \emptyset$ for small enough $\varepsilon$:
otherwise, we could find a sequence $z_n \in A_{1/n} \cap B_{1/n}$. Now, we could write
$z_n = a_n + w_n = b_n + w_n'$, with $a_n \in A$, $b_n \in B$ and $\|w_n\| < 1/n$, $\|w_n'\| < 1/n$.
Using the compactness of $A$ and up to subsequences, we would find then
$a_n \to a \in A$. But then (since $w_n \to 0$, $w_n' \to 0$) we would also have $b_n \to a$,
whence $a \in B$ (since $B$ is closed). This is impossible since $A \cap B = \emptyset$.
Take then a small enough ε and apply (i) to the convex sets Aε , Bε : we
find T ∈ X ′ and α ∈ R such that

T (a + w) ≤ α ∀a ∈ A, w ∈ Bε (0)
T (b + w′ ) ≥ α ∀b ∈ B, w′ ∈ Bε (0).

Passing to the sup on w and to the inf on w′ we get

T (a) + ε∥T ∥ ≤ α ∀a ∈ A
T (b) − ε∥T ∥ ≥ α ∀b ∈ B.

Q.E.D.

6 Lecture of October 17, 2024 (3 hours)


The following is an important consequence of the “geometric” Hahn-Banach
theorem:
COROLLARY: Let (X, ∥ · ∥) be a normed space, Y a vector subspace. Then
Y is dense iff every linear functional T ∈ X ′ which vanishes identically on
Y is the zero functional.

PROOF: If Y is dense, then obviously every continuous linear functional
which vanishes on Y is identically zero.
Conversely, suppose $Y$ is not dense, i.e. its closure $\overline{Y}$ is a proper subspace of $X$. We
will find a non-zero bounded linear functional which vanishes identically on
$Y$. To this end, take $x_0 \in X \setminus \overline{Y}$ and apply (ii) of the previous theorem to
the closed convex sets $\overline{Y}$ and $\{x_0\}$ (the second of which is compact). There exists
$T \in X'$ such that $T(x) < T(x_0)$ for every $x \in Y$. By linearity of $T$ we
immediately deduce $T \equiv 0$ on $Y$ (the image $T(Y)$ is a linear subspace of R which
is bounded from above, so it must be $\{0\}$). So, $T(x) = 0$ for every $x \in Y$, while
$T(x_0) > 0$. Q.E.D.

REMARK: Incidentally, the proof of the corollary also suggests that it is not
always possible to separate disjoint convex sets with a closed hyperplane: there is
no closed hyperplane separating a proper dense subspace of X and a point outside
the subspace!

Another important result for the study of dual spaces is the following
THEOREM (Banach-Steinhaus): Let (X, ∥ · ∥) be a Banach space, {Tk } ⊂
X ′ be a sequence of continuous linear functionals on X which is pointwise
bounded, i.e. such that

sup{|Tk (x)| : k ∈ N} < +∞ ∀x ∈ X.

Then the sequence is uniformly bounded: there exists C > 0 such that ∥Tk ∥X ′ ≤
C for every k ∈ N.
To prove the Banach-Steinhaus theorem we need the following important
result:
THEOREM (Baire): Let (X, d) be a complete metric space. If $\{F_k\}$ is a
sequence of closed sets in X with empty interiors, then $\bigcup_{k \in \mathbb{N}} F_k$ has empty
interior.
Passing to the complements, this means that in a complete metric space
a countable intersection of open and dense subsets is still dense.

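PROOF: Let $\Omega \subset X$ be a fixed nonempty open set: we need to show that
$\Omega \setminus \bigcup_{k \in \mathbb{N}} F_k$ is nonempty.
Choose $x_1 \in X$, $1 > r_1 > 0$ such that $\overline{B}_{r_1}(x_1) \subset \Omega$ and $\overline{B}_{r_1}(x_1) \cap F_1 = \emptyset$
(here $\overline{B}_r(x)$ denotes the closed ball):
this is possible because the complement of $F_1$ is open and dense. Now, there
are points of the complement of $F_2$ within $B_{r_1}(x_1)$ (because $F_2$ has empty
interior): we can choose $x_2$, $r_2$ such that $\overline{B}_{r_2}(x_2) \cap F_2 = \emptyset$, $\overline{B}_{r_2}(x_2) \subset B_{r_1}(x_1)$
and $r_2 < 1/2$.
Proceeding in the same way, we build a sequence $\{x_k\} \subset X$ and positive
real numbers $\{r_k\}$ such that
\[ \overline{B}_{r_k}(x_k) \cap F_k = \emptyset, \qquad \overline{B}_{r_k}(x_k) \subset B_{r_{k-1}}(x_{k-1}), \qquad r_k < 1/k. \]
Thanks to the fact that the radius of the balls goes to zero, one immediately
sees that $\{x_k\}$ is a Cauchy sequence. Its limit $x$ has the property that
\[ x \in \bigcap_{k \in \mathbb{N}} \overline{B}_{r_k}(x_k). \]
On the other hand,
\[ \bigcap_{k \in \mathbb{N}} \overline{B}_{r_k}(x_k) \subset \Omega \setminus \bigcup_{k \in \mathbb{N}} F_k, \]
so the latter set is nonempty. Q.E.D.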

We can now prove the theorem of Banach-Steinhaus:


PROOF of the Banach-Steinhaus theorem: Put

Fk = {x ∈ X : |Th (x)| ≤ k ∀h ∈ N}.

These are closed sets whose union is X by the pointwise boundedness hy-
pothesis. By the Baire theorem, there exists an index k ∈ N such that Fk
has a nonempty interior.
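Choose $\bar x \in X$, $r > 0$ in such a way that $B_r(\bar x) \subset F_k$: we get
\[ |T_h(\bar x + y)| \le k \qquad \forall y \in X,\ \|y\| < r,\ \forall h \in \mathbb{N}. \]
If $x \in X$, $\|x\| \le 1$, we get for every $h$:
\[ |T_h(x)| = \frac{2}{r} \Bigl| T_h\Bigl( \frac{r}{2} x \Bigr) \Bigr| = \frac{2}{r} \Bigl| T_h\Bigl( \bar x + \frac{r}{2} x - \bar x \Bigr) \Bigr| \le \frac{2}{r} \Bigl( \Bigl| T_h\Bigl( \bar x + \frac{r}{2} x \Bigr) \Bigr| + |T_h(\bar x)| \Bigr) \le \frac{4}{r} k. \]
Passing to the sup over $x$ we obtain our thesis. Q.E.D.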

REMARK/EXERCISE: The following is an easy but important consequence
of the Banach-Steinhaus Theorem: if $\{T_k\} \subset X'$ is a sequence of linear
functionals such that $T_k(x) \to T(x) \in \mathbb{R}$ for every $x \in X$ (i.e. $T_k$ converges
pointwise to some real function $T$), then $T \in X'$ and
\[ \|T\| \le \liminf_{k \to +\infty} \|T_k\|. \]
Prove this result and show with an example that in general we do not have
the convergence of $T_k$ to $T$ in the norm of $X'$. (This statement is a fairly
immediate consequence of the theorem: linearity of $T$ is trivial, and by
Banach-Steinhaus the functionals $T_k$ are equibounded in norm... hence their pointwise
limit is bounded. The inequality on the norm is an easy consequence.
Finally, to construct the required counterexample consider the space $\ell^2$ and the
sequence $e^k$ of the “dual basis” elements of $(\ell^2)'$ (i.e. $e^k(\{x_n\}) = x_k$).)

We introduce a couple of fundamental concepts.


DEFINITION (bidual, reflexive Banach space): Let (X, ∥ · ∥) be a normed
space. Let X ′′ be its bidual, i.e. the vector space of all continuous linear
functionals on X ′ .
As we well know from our first-year linear algebra course, there is a canon-
ical way to associate to each element x ∈ X an element of the bidual space.
Precisely, we associate to x the functional Sx : X ′ → R defined by

Sx (T ) = T (x) ∀T ∈ X ′ .

The map J : X → X ′′ sending x into Sx is a linear isometry: indeed, by


the Hahn-Banach theorem we can write

∥x∥ = max{T (x) : T ∈ X ′ , ∥T ∥X ′ ≤ 1} ∀x ∈ X,

whence ∥x∥ = ∥Sx ∥X ′′ .


In finite dimension, the map J is a (canonical) isomorphism between X
and X ′′ . This is no longer true if the dimension is infinite: in some cases
J(X) is a proper subset of X ′′ . A normed space is called reflexive precisely
when J(X) = X ′′ (i.e. if the space “coincides with its bidual”).
Thanks to our detailed study of the spaces ℓp , it is easy to see that ℓ1 and
ℓ∞ are not reflexive, while the spaces ℓp are for every 1 < p < +∞.
Reflexivity, like separability (see footnote 12), plays a crucial role in the theory of weak
topologies.

The injection of X in its bidual allows us to obtain another important


consequence of the Banach-Steinhaus Theorem:
REMARK/EXERCISE: Let (X, ∥ · ∥) be a Banach space, A a subset. Then
A is bounded if and only if for every T ∈ X ′ , the image T (A) is a bounded
subset of R. (HINT: Obviously, if A is bounded, then T (A) is bounded for
every T ∈ X ′ . To prove the converse, it suffices to show that every sequence
{xn } ⊂ A such that {T (xn )} is bounded for every T ∈ X ′ , is actually bounded
in norm. To this end, apply the Banach-Steinhaus Theorem to the sequence
Sxn ∈ X ′′ : it is bounded pointwise by our hypothesis, hence it is bounded in
norm. We can then conclude because ∥xn ∥ = ∥Sxn ∥X ′′ .)
(Footnote 12: A space is separable if it has a countable dense subset.)
L1 is not in general reflexive because the dual of L1 is L∞ , but the dual
of L∞ is in general strictly larger than L1 : this was shown in the case of the
spaces ℓ1 , ℓ∞ .
REMARK: We will now show a very concrete consequence of an abstract
result like the Banach-Steinhaus theorem: we prove that there are continuous
and 2π-periodic functions, whose Fourier series does not converge at some
point.
If f : R → R is a continuous, 2π-periodic function, we recall that its
Fourier series is

\[ \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[a_n \cos nx + b_n \sin nx\right], \]

where

\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt \, dt, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt \, dt. \]

There is a well-known convergence theorem, ensuring that if f is C 1 , then


its Fourier series converges at every point to f (x) (and convergence is also
uniform). However, the regularity required on f may seem “excessive”: to
compute Fourier coefficients, we certainly don’t need differentiability... But
we will show that, a priori, for f ∈ C 0 we cannot even be sure there is
convergence at every point!
To prove this, we need the following well-known formula, expressing the
N -th partial sum of the Fourier series:

\[ f_N(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(y)\,\frac{\sin\left((N + \frac12)(y - x)\right)}{2\sin((y - x)/2)}\, dy. \]

We show that there are continuous functions for which fN (0) does not
converge to f (0): to this aim, consider the Banach space X = C 0 (2π) of
continuous, 2π-periodic functions with the norm ∥ · ∥∞ . If we define

\[ T_N : f \mapsto \frac{1}{\pi}\int_{-\pi}^{\pi} f(y)\,\frac{\sin\left((N + \frac12)y\right)}{2\sin(y/2)}\, dy, \]

the functionals TN are well defined elements of (L∞ (2π))′ , and then also of
X ′ (because X is a closed subspace of L∞ ). Moreover, if we put

\[ g_N(y) = \frac{\sin\left((N + \frac12)y\right)}{2\sin(y/2)}, \]

one easily checks that ∥TN ∥X ′ = ∥gN ∥L1 ([−π,π]) (see footnote 13).
Now, from our construction of the functionals we have fN (0) = TN (f ). If
we had fN (0) → f (0) for every f ∈ X, in particular we would have

\[ \sup_{N} |T_N(f)| < +\infty \quad \forall f \in X. \]

By the Banach-Steinhaus theorem, we could conclude that the norms of the
functionals TN are equibounded: but this is false, because ∥gN ∥L1 → +∞ as
N → +∞ (see footnote 14).
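The growth of ∥gN ∥L1 can also be observed numerically. The following is only a sketch: the function name and the number of sample points are illustrative choices, not part of the notes, and the integral over [−π, π] is approximated with a plain midpoint rule. The computed values increase slowly with N (the growth is logarithmic).

```python
import math

def dirichlet_l1_norm(N, samples=20_000):
    """Midpoint-rule approximation of the integral over [-pi, pi] of
    |sin((N + 1/2) y) / (2 sin(y / 2))|, i.e. the L^1 norm of g_N."""
    h = 2.0 * math.pi / samples
    total = 0.0
    for i in range(samples):
        # With an even number of cells, no midpoint coincides with y = 0.
        y = -math.pi + (i + 0.5) * h
        total += abs(math.sin((N + 0.5) * y) / (2.0 * math.sin(y / 2.0)))
    return total * h

for N in (1, 4, 16, 64, 256):
    print(N, round(dirichlet_l1_norm(N), 3))
```

Multiplying N by 4 adds a roughly constant amount to the norm, which is what logarithmic growth predicts.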

A very important result is the following


THEOREM (of the open mapping): Let (X, ∥ · ∥X ), (Y, ∥ · ∥Y ) be Banach
spaces, T : X → Y a continuous and surjective linear map. Then T is open:
there exists r > 0 such that T (B1 (0)) ⊃ Br (0).
PROOF: To begin with, we show that there exists r > 0 such that

\[ (*) \qquad \overline{T(B_1(0))} \supset B_{2r}(0). \]

Indeed, by the surjectivity of T we have

\[ Y = \bigcup_{n=1}^{\infty} T(B_n(0)) = \bigcup_{n=1}^{\infty} \overline{T(B_n(0))}. \]

By Baire lemma we know that at least one of the closed sets in the last union
has nonempty interior.
But by the homogeneity of the norm and the linearity of T , all these sets are
homothetic, so the closure of T (B1 (0)) has nonempty interior.
This interior is a symmetric, convex open set (the closure of a symmetric,
convex set is symmetric and convex, the same holds for the interior. . . ):
we deduce that 0 belongs to the interior of the closure of T (B1 (0)) and (∗) is proved.
To conclude the proof of the open mapping theorem, we have to show
that
T (B1 (0)) ⊃ Br (0),
which is our thesis. We will see this next time!

7 Lecture of October 23, 2024 (3 hours)


Let us conclude the proof of the open mapping theorem.
(Footnote 13: Obviously, TN has this norm as an element of (L∞ )′ , whence ∥TN ∥X ′ ≤ ∥gN ∥L1 . On
the other hand, it is easy to construct a sequence σk of periodic continuous functions such
that σk (x) → sgn(gN (x)) as k → +∞ for a.e. x ∈ [−π, π] and such that ∥σk ∥∞ ≤ 1. By the
dominated convergence theorem we have TN (σk ) → ∥gN ∥L1 , whence the thesis.)
(Footnote 14: With a change of variables, and recalling that | sin t| ≤ |t|, this comes from the
non-summability of the function (sin t)/t on the half line [1, +∞).)

Let ∥y∥ < r: we look for a point x ∈ X such that ∥x∥ < 1 and T (x) = y.
Since by (∗) and homogeneity Br (0) is contained in the closure of T (B1/2 (0)), for every ε > 0 we can find
z ∈ X such that ∥z∥ < 1/2 and ∥y − T (z)∥ < ε. Choosing ε = r/2 we find
z1 ∈ X such that ∥z1 ∥ < 1/2 and ∥y − T (z1 )∥ < r/2.
As Br/2 (0) is contained in the closure of T (B1/4 (0)), by repeating the same argument with y − T (z1 )
in place of y and ε = r/4, we find z2 ∈ X such that ∥z2 ∥ < 1/4 and
∥y − T (z1 ) − T (z2 )∥ < r/4.
Proceeding in the same way, we construct a sequence {zn } ⊂ X such
that ∥zn ∥ < 1/2^n and ∥y − T (z1 + z2 + . . . + zn )∥ < r/2^n . The sequence
xn = z1 + z2 + . . . + zn is clearly a Cauchy sequence, whence xn → x in X.
We have ∥x∥ ≤ ∥z1 ∥ + ∥z2 ∥ + . . . < 1 and y = T (x) thanks to the continuity of T .
Q.E.D.
REMARKS/COROLLARIES: An important consequence of the theorem is
the following: if T : X → Y is an algebraic isomorphism between Banach
spaces and T is continuous, then the inverse map T −1 : Y → X is continuous
and T is a Banach space isomorphism. Indeed, the fact that T is open implies
the boundedness of T −1 .
Another important consequence: if ∥ · ∥ and ∥ · ∥′ are two Banach norms
on X, and there exists C > 0 such that ∥x∥ ≤ C∥x∥′ for every x ∈ X, then
the two norms are equivalent. It is enough to apply the previous remark to
the identity map between the two Banach spaces.

Another important consequence of the open mapping theorem is the


THEOREM (Of the closed graph): Let X, Y be Banach spaces, T : X → Y a
linear map. Then T is continuous if and only if its graph GT = {(x, T (x)) :
x ∈ X} is closed in X × Y .

Here, X × Y is endowed with the product topology, induced by the norm


∥(x, y)∥X×Y := ∥x∥X + ∥y∥Y .
PROOF: Suppose T is continuous and (xn , T (xn )) → (x, y). By the continu-
ity of T we get y = T (x) whence (x, y) ∈ GT : the graph is closed.
Suppose conversely that GT is closed. Then GT is a closed linear subspace
of the Banach space X × Y , and so is itself a Banach space.
The map Φ : x 7→ (x, T (x)) is clearly an algebraic isomorphism between
X and GT , with inverse map p1 (the projection on the first factor of X × Y ).
As p1 is continuous, by the open mapping theorem we get that Φ is also
continuous. But then T = p2 ◦ Φ is continuous (where p2 is projection on the
second factor). Q.E.D.
To conclude this first discussion of Banach spaces, let us examine com-
pactness. In finite dimension, we have an abundance of compact sets (all

closed bounded sets are compact!), and this has been useful to prove many
theorems. In infinite dimension, however, not all bounded and closed sets
are compact:
THEOREM (Riesz): Let (X, ∥ · ∥) be a normed space. Then the dimension
of X is finite if and only if the unit closed ball

B = {x ∈ X : ∥x∥ ≤ 1}

is compact.

To prove the theorem, we need the following result:


LEMMA (Riesz): Let (X, ∥·∥) be a normed space, Y a proper closed subspace.
Then there exists x ∈ X such that ∥x∥ = 1 and dist(x, Y ) ≥ 1/2.
PROOF: Choose x0 ∈ X \ Y . As Y is closed, the distance δ between x0 and
Y is positive. Moreover, by definition of distance there exists y0 ∈ Y such
that ∥x0 − y0 ∥ < 2δ. Put

\[ x = \frac{x_0 - y_0}{\|x_0 - y_0\|}. \]

Clearly ∥x∥ = 1, and if y ∈ Y we have:

\[ \|x - y\| = \frac{1}{\|x_0 - y_0\|}\,\bigl\|x_0 - (y_0 + y\,\|x_0 - y_0\|)\bigr\| \ge \frac{\delta}{\|x_0 - y_0\|} > \frac12, \]

where we used the fact that y0 + y∥x0 − y0 ∥ ∈ Y . Q.E.D.

PROOF of the theorem on the (non) compactness of the unit closed ball:
If the dimension of X is finite, then the closed unit ball is compact: we
can assume w.l.o.g. that we are in the case X = Rn , where all norms are
equivalent to the Euclidean norm. It follows that our ball is a bounded,
closed subset of Euclidean space, hence compact.
Conversely, suppose the dimension of X is infinite. We construct an
increasing sequence of subspaces Y1 ⊂ Y2 ⊂ Y3 . . . in such a way that
dim(Yk ) = k, k = 1, 2, . . .
Now fix x1 ∈ Y1 , ∥x1 ∥ = 1. By Riesz lemma with X = Y2 and Y = Y1 ,
we can find x2 ∈ Y2 such that ∥x2 ∥ = 1 and dist(x2 , Y1 ) > 1/2.
Proceeding in the same way we find a sequence {xk } such that ∥xk ∥ = 1,
xk ∈ Yk and dist(xk , Yk−1 ) > 1/2. This sequence contains only norm-one
vectors, and the distance between any two of its elements is larger than 1/2.
Such a sequence has obviously no Cauchy subsequences: the unit closed ball
is not compact. Q.E.D.
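In ℓ2 such a sequence can be written down explicitly: the canonical unit vectors have norm 1 and mutual distance √2. The sketch below checks this on a finite truncation; the helper names and the truncation dimension are illustrative assumptions.

```python
import math

def basis_vector(k, dim):
    """k-th canonical unit vector (0-based) in R^dim, a truncation of e_k in l^2."""
    return [1.0 if i == k else 0.0 for i in range(dim)]

def norm2(x):
    """Euclidean norm of a finite list of reals."""
    return math.sqrt(sum(t * t for t in x))

def min_pairwise_distance(vectors):
    """Smallest distance between two distinct vectors of the list."""
    return min(
        norm2([a - b for a, b in zip(u, v)])
        for i, u in enumerate(vectors)
        for v in vectors[i + 1:]
    )

dim = 50
vectors = [basis_vector(k, dim) for k in range(dim)]
print(all(abs(norm2(v) - 1.0) < 1e-12 for v in vectors))  # all of norm one
print(min_pairwise_distance(vectors))                      # mutual distance
```

Since the mutual distance does not shrink as the dimension grows, no subsequence can be a Cauchy sequence.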

In a metric space (and thus in a Banach space), compact subsets are
characterized as follows. We need a definition:
DEFINITION (Total boundedness): Let (X, d) be a metric space. A subset
K ⊂ X is totally bounded if, for every ε > 0, it is possible to cover K with a
finite number of balls with radius ε.
THEOREM: Let (X, d) be a metric space, K ⊂ X. Then K is compact if
and only if it is complete and totally bounded.
Moreover, a totally bounded subset K of a complete metric space is rela-
tively compact: from any sequence with values in K, we can extract a subse-
quence converging in X.
This characterization gives a good “geometric” idea of what compact sets
look like in an infinite dimensional space:
REMARK: Given a compact subset K of a normed space and any ε > 0,
there exists a finite dimensional subspace Yε whose distance from every point
of the set K is less than ε. Indeed, by the total boundedness we can cover
K by a finite number of balls Bε (x1 ),. . . ,Bε (xN ). We can then define Yε =
span{x1 , . . . , xN }.
We can summarize this by saying that in infinite dimension compact
sets are rather “skinny”...and so balls are not compact: in particular, it
is false that from a norm-bounded sequence we can extract a convergent
subsequence!
PROOF of the characterization of compact sets: We know that in a metric
space, sequential and topological compactness are the same.
Suppose now K is compact: we show that K is complete and totally
bounded. Let {xk } ⊂ K be a Cauchy sequence. By compactness, it has
a subsequence converging to some x ∈ K. But it is easy to check that if a
Cauchy sequence has a converging subsequence, the whole sequence converges
to the same limit!
Choose then ε > 0 and consider the family of open balls {Bε (x)}x∈K .
This is an open covering of K: by compactness, we can extract a finite
subcovering...which gives us a finite number of balls of radius ε covering K.
We will prove the converse next time!
We finish the proof of the characterization of compact sets we stated last
time. Suppose K is complete and totally bounded. Let {xk } ⊂ K be a
sequence in K: we show that it is possible to extract a subsequence which
converges to some point of K.
By the total boundedness, we can cover K with a finite number of open
balls of radius 1. Necessarily, infinitely many terms of the sequence will fall
within one of these balls, which we will call B1 . Let {x_k^{(1)}} be the subsequence
of {xk } formed by those elements which belong to B1 .
Cover now K with a finite number of balls of radius 1/2: within one of
those, which we call B2 , we will have infinitely many terms of {x_k^{(1)}}. Let
{x_k^{(2)}} be the subsequence of those elements of {x_k^{(1)}} which belong to B2 .
We proceed in the same way, covering K with balls having radius 1/3, 1/4...
By recurrence, we construct a sequence of subsequences such that {x_k^{(n)}}
is a subsequence of {x_k^{(n−1)}} and all its elements are contained within a ball
of radius 1/n.
We then take the diagonal subsequence, defined by x̃k = x_k^{(k)} (the k-th
element of the diagonal subsequence is the k-th element of the k-th subsequence).
The sequence {x̃k } is a subsequence of {x_k^{(n)}} for k ≥ n: in particular, it
is a subsequence of {xk }, and is obviously a Cauchy sequence (because, from
the n-th term on, it is contained within a ball of radius 1/n, and this is true
for every fixed n). Thus x̃k → x ∈ K by the completeness assumption.
The same proof works also when X is complete, while K is only totally
bounded: in that case, we only say that x ∈ X. Q.E.D.

We take advantage of the characterization of compactness through total


boundedness in the proof of the following result: we will prove the Ascoli-
Arzelà Theorem, which gives us sufficient conditions to have compactness in
C 0.
THEOREM (Ascoli-Arzelà): Let un : A → B be a sequence of continuous
functions, where A and B are compact metric spaces. If the sequence un is
equicontinuous, i.e. if for every ε > 0 there exists δ > 0 such that x, y ∈
A, dA (x, y) < δ imply dB (un (x), un (y)) < ε for every n, then un has a
subsequence which converges uniformly to some continuous function u : A →
B.
We will prove the theorem next time.

8 Lecture of October 24, 2024 (3 hours)


PROOF: Remark first that the set C 0 (A; B) of continuous functions from
A to B is a complete metric space with the uniform distance d(u, v) =
sup{dB (u(x), v(x)) : x ∈ A}.
To prove the theorem it suffices to show that F = {un : n ∈ N} is a
totally bounded subset of C 0 (A, B). Fix ε > 0, and use the total boundedness
of B to write B = B1 ∪B2 ∪. . .∪BN , where Bj are balls of radius ε. Use then

equicontinuity to find δ such that dA (x, y) < δ implies dB (un (x), un (y)) < ε,
and then the total boundedness of A to write A = A1 ∪ . . . ∪ AM , where Ai
are balls of radius δ and center ai .
For each multiindex (j1 , j2 , . . . jM ) ∈ {1, 2, . . . , N }^M (there is a finite number
of those) consider the set of functions

W(j1 ,j2 ,...jM ) = {u ∈ F : u(ai ) ∈ Bji , i = 1, . . . , M }.

Each element of the original sequence belongs to one of these sets. Moreover,
each set of functions is either empty, or its diameter is less than 5ε, and is thus
contained in a ball of radius 5ε: indeed, if u, v ∈ W(j1 ,j2 ,...jM ) and x ∈ A,
choose i such that x ∈ Ai . By the equicontinuity we get dB (u(x), v(x)) ≤
dB (u(x), u(ai )) + dB (u(ai ), v(ai )) + dB (v(ai ), v(x)) < 4ε.
We thus covered F with a finite number of balls of radius 5ε. Q.E.D.

REMARK: In the most common case of real valued functions, the Ascoli-
Arzelà theorem is usually stated as follows: each sequence of functions in
C 0 (A; R) (with A a compact metric space) which is equicontinuous and equi-
bounded, has a subsequence which converges uniformly to some continuous
function.
Indeed, equiboundedness ensures that the functions in the sequence take
values in the compact interval [−M, M ] for M large enough.

The Ascoli-Arzelà theorem is used to prove a lot of important theorems


in analysis: for instance, one can use it to prove the Peano theorem, ensuring
the existence of local solutions of the Cauchy problem for non-linear O.D.E.s
A possible cure to the lack of compactness of the closed ball is obtained
through the concept of weak convergence: if we are in a “good enough” Ba-
nach space, from every bounded sequence we can extract a weakly convergent
subsequence. We begin by giving the relevant definition:

DEFINITION: Let (X, ∥ · ∥) be a normed space. We say that a sequence


{xk } ⊂ X converges weakly to x ∈ X, and we write xk ⇀ x, if and only if

T (xk ) → T (x) ∀T ∈ X ′ .

EXAMPLES/REMARKS/EXERCISES: Observe first that a norm-convergent


sequence converges also weakly (by the continuity of each T ∈ X ′ ): strong
convergence implies weak convergence.
In finite dimension, strong and weak convergence coincide: indeed, a
sequence in Rn converges if and only if its components converge.

This is no longer true in infinite dimension: we will see in a moment that
in a reflexive Banach space there are always weakly convergent sequences
which do not converge in norm.
Explicit examples are easily obtained in the spaces ℓp and Lp (Ω), because
we know very well the dual spaces!
If 1 ≤ p < +∞, consider a sequence x^n = {x^n_k }k ∈ ℓp . By definition,
x^n ⇀ x = {xk }k in ℓp iff for every T ∈ (ℓp )′ we have T (x^n ) → T (x), i.e. iff

\[ \lim_{n \to +\infty} \sum_{k=1}^{\infty} x^n_k y_k = \sum_{k=1}^{\infty} x_k y_k \quad \forall \{y_k\} \in \ell^q. \]

For instance, the sequence en of the “basis vectors” in ℓp converges weakly


to 0 for 1 < p < +∞ (not in ℓ1 ), but it does not converge strongly.
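The pairing of a functional with a basis vector reduces to a single term, which makes both claims easy to check. The sketch below is illustrative only (the function names are assumptions): a functional on ℓp is represented by the sequence {yk } that defines it, passed as a Python function of the index k. For y in ℓ2 the values tend to 0, while the constant sequence yk = 1, which is bounded and therefore defines a functional on ℓ1 , gives the value 1 on every en .

```python
def pairing_with_basis_vector(y, n):
    """T_y(e_n) = sum_k (e_n)_k * y_k, which reduces to the single term y_n."""
    return y(n)

def y_square_summable(k):
    """y_k = 1/k: a sequence in l^2, hence a functional on l^2."""
    return 1.0 / k

def y_constant(k):
    """y_k = 1: a bounded sequence, hence a functional on l^1."""
    return 1.0

for n in (1, 10, 100, 1000):
    print(n,
          pairing_with_basis_vector(y_square_summable, n),
          pairing_with_basis_vector(y_constant, n))
```

The second column tends to 0, the third stays equal to 1: this is why en does not converge weakly to 0 in ℓ1 .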
Weak convergence in Lp (Ω) is similarly characterized: given {uk } ⊂ Lp (Ω)
(with 1 ≤ p < +∞), we have uk ⇀ u in Lp if and only if

\[ \lim_{k \to +\infty} \int_{\Omega} u_k(x)v(x)\, dx = \int_{\Omega} u(x)v(x)\, dx \quad \forall v \in L^q(\Omega). \]

Weak convergence is mainly useful because of the following


THEOREM (Banach-Alaoglu): If X is a reflexive Banach space, its closed
unit ball is (sequentially) weakly compact: from every norm-bounded sequence
it is possible to extract a weakly convergent subsequence.

We will not prove this theorem in its full generality, but later we will give
a proof in the particular case where X is a Hilbert space.

REMARK: If {xn } ⊂ X (with X a Banach space) is a sequence such that


xn ⇀ x, then sup{∥xn ∥ : n ∈ N} < +∞: every weakly convergent sequence
is norm-bounded. Indeed, from the definition of weak convergence we infer
that {T (xn )} is a bounded subset of R for every T ∈ X ′ : we already observed
that this implies the boundedness of {xn } (as a consequence of the Banach-
Steinhaus theorem).

REMARK: In an infinite dimensional, reflexive Banach space we always have


sequences which converge weakly but not strongly. Indeed, the closed unit
ball is not strongly compact: we constructed a sequence of vectors of norm 1,
which does not have strongly convergent subsequences. By Banach-Alaoglu,
the same sequence has a weakly convergent subsequence!

REMARK (we saw no details in class...): In the space ℓ1 (which is not
reflexive!) every weakly convergent sequence converges strongly. This is
really a pathological example, and the proof is not so easy!
It is easy to see that it suffices to prove that a sequence {x^n } ⊂ ℓ1 such
that x^n ⇀ 0 also converges strongly. By one of the previous remarks, we
have ∥x^n ∥ℓ1 ≤ C for every n.
We have to show that ∥x^n ∥ℓ1 → 0. Suppose by contradiction this is false:
up to subsequences, this implies that ∥x^n ∥ℓ1 ≥ c > 0 for every n. We show
that we can extract a further subsequence which does not converge weakly
to 0, thus contradicting our hypothesis.
Now, if x^n = {x^n_k }k , then

\[ \lim_{n \to +\infty} x^n_k = 0 \quad \forall k \in \mathbb{N}: \]

weak convergence to 0 implies convergence to 0 of all components (apply the
definition of weak convergence with {yk } = ek ). So, for every fixed N ∈ N
we have

\[ \lim_{n \to +\infty} \sum_{k=1}^{N} |x^n_k| = 0. \]

From this we see that we can choose a strictly increasing sequence of natural
numbers k1 < k2 < k3 < . . . and a subsequence of x^n (which we still denote
x^n ) in such a way that

\[ \sum_{k=k_n+1}^{k_{n+1}} |x^n_k| \ge \frac34 \|x^n\|_{\ell^1}, \quad n = 1, 2, 3, \ldots \]

Define now a sequence {yk } ∈ ℓ∞ as follows: if kn + 1 ≤ k ≤ kn+1 , then
yk = sgn(x^n_k ) (and, say, yk = 0 for k ≤ k1 ). Then, for every fixed n, we have

\[ T_{\{y_k\}}(x^n) = \sum_{k=k_n+1}^{k_{n+1}} |x^n_k| + \sum_{\text{other } k} x^n_k y_k \ge \frac34 \|x^n\|_{\ell^1} - \frac14 \|x^n\|_{\ell^1} \ge \frac12 c. \]

Obviously this sequence does not converge to zero as n → +∞, thus contradicting
the fact that x^n ⇀ 0.

EXERCISE: If X is a Banach space, C ⊂ X is a nonempty, convex and


strongly closed set (i.e., it is closed in the topology induced by the norm),
then C is sequentially weakly closed: if x ∈ X is the weak limit of a sequence
in C, then x ∈ C.

Let indeed xn ⇀ x, {xn } ⊂ C. Suppose by contradiction x ̸∈ C. We
can apply the geometric form of the Hahn-Banach theorem to the convex
sets {x} and C, the first of which is compact and the second closed. We
find T ∈ X ′ , T ̸= 0 and ε > 0 such that T (y) + ε < T (x) for every y ∈ C,
and in particular T (xn ) + ε < T (x) for every n. This is impossible because
T (xn ) → T (x) by definition of weak convergence!

EXERCISE: Let X be a Banach space, F : X → R be a convex continuous


function. Then F is (sequentially) weakly lower semicontinuous: for every
sequence xn ⇀ x we have F (x) ≤ lim inf_{n→+∞} F (xn ).
In particular, as the norm is a convex function, we have

\[ \|x\| \le \liminf_{n \to +\infty} \|x_n\| \quad \forall x_n \rightharpoonup x. \]

Let ℓ = lim inf n→+∞ F (xn ). If ℓ = +∞, there is nothing to prove. Let
then ℓ < +∞: choose s ∈ R, s > ℓ and consider the sublevel set

Cs = {x ∈ X : F (x) ≤ s}.

This is a closed convex set by the convexity and strong continuity of F .


By our previous exercise Cs is sequentially weakly closed. Moreover, up
to subsequences we may assume F (xn ) → ℓ: for n large enough we have
xn ∈ Cs . But then, by the weak closure, x ∈ Cs whence F (x) ≤ s. The
thesis follows because s > ℓ is arbitrary.

EXERCISE: If X is a reflexive Banach space, C a nonempty convex closed


subset, x0 ∈ X, show that there exists a point of C whose distance from x0 is
minimal.
Up to translations, we may assume x0 = 0: we show that C has a point
of minimal norm.
Let indeed {yn } ⊂ C be a sequence such that ∥yn ∥ → inf{∥y∥ : y ∈
C} = δ. Obviously, this sequence is norm bounded: by Banach-Alaoglu we
have a subsequence which converges weakly to some point y. By weak closure,
y ∈ C, and by the lower semicontinuity of the norm we conclude that y is
the minimum point: ∥y∥ ≤ lim inf ∥yn ∥ = δ.

EXERCISE (not seen in class): If the Banach space X is not reflexive, the
result of our previous exercise may fail. Consider indeed the space C 0 ([0, 1])
with the sup norm and the set

\[ C = \left\{ u \in C^0([0,1]) : \int_0^{1/2} u(x)\, dx - \int_{1/2}^{1} u(x)\, dx = 1 \right\}. \]

This is a closed convex set, and the infimum of the norms of its elements
is 1. On the other hand, C has no element of norm 1: the minimum is not
attained!
Closure and convexity of C are obvious: observe that

\[ \Phi : u \mapsto \int_0^{1/2} u(x)\, dx - \int_{1/2}^{1} u(x)\, dx \]

is a bounded linear functional, so C is a closed hyperplane.
We first check that dist(0, C) ≥ 1: indeed, no function with norm strictly
less than 1 belongs to C (our difference of integrals is less than or equal to

1/2 (ess sup{u(x) : x ∈ [0, 1/2]} − ess inf{u(x) : x ∈ [1/2, 1]}) ,

which is at most ∥u∥∞ ).
This same inequality tells us that if ∥u∥∞ = 1, to belong to C the
function u should be 1 a.e. on [0, 1/2], −1 a.e. on [1/2, 1]: no continuous
function has this property.
But the distance is exactly 1, as we see by considering the sequence of
continuous functions un (x) which is 1 + 1/n on [0, 1/2 − 1/(n + 1)], −1 − 1/n
on [1/2 + 1/(n + 1), 1] and is linear on [1/2 − 1/(n + 1), 1/2 + 1/(n + 1)]:
those functions belong to C, and ∥un ∥∞ = 1 + 1/n → 1.
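The claim that the functions un belong to C can be checked numerically. The sketch below is only an illustration: the names `u` and `phi` and the midpoint quadrature are assumptions made for the example, not part of the notes.

```python
def u(n, x):
    """Piecewise linear u_n: 1 + 1/n on the left, -1 - 1/n on the right,
    linear on [1/2 - 1/(n+1), 1/2 + 1/(n+1)]."""
    top = 1.0 + 1.0 / n
    half_width = 1.0 / (n + 1)
    if x <= 0.5 - half_width:
        return top
    if x >= 0.5 + half_width:
        return -top
    return -top * (x - 0.5) / half_width  # linear piece, vanishing at x = 1/2

def phi(n, samples=20_000):
    """Midpoint-rule value of int_0^{1/2} u_n dx - int_{1/2}^1 u_n dx."""
    h = 0.5 / samples
    left = sum(u(n, (i + 0.5) * h) for i in range(samples)) * h
    right = sum(u(n, 0.5 + (i + 0.5) * h) for i in range(samples)) * h
    return left - right

for n in (1, 2, 10, 100):
    print(n, phi(n), 1.0 + 1.0 / n)  # value of the functional, sup norm of u_n
```

The first printed value stays at 1 up to quadrature error, while the sup norm 1 + 1/n decreases to 1 without ever reaching it.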

We will now begin to discuss the theory of Hilbert spaces. Before we


give the definition, we recall the definition of scalar product and of the norm
induced by a scalar product.
DEFINITION: Let X be a real vector space. A scalar product over X is a
map

< ·, · >: X × X → R
(x, y) 7→ < x, y >

which is bilinear (i.e. linear in each of its arguments x and y), symmetric (i.e.
< x, y >=< y, x > for every x, y) and positive definite (i.e. < x, x >≥ 0,
with equality iff x = 0).
From a scalar product we get a norm on X as follows:

∥x∥ := < x, x >^{1/2} .

Of course, we have to verify that this is a norm. This, and other simple
facts, are summarized in the following proposition:
PROPOSITION: Let < ·, · > be a scalar product on X, ∥ · ∥ the induced
norm. Then the following hold

(i) For every x, y ∈ X we have the Cauchy-Schwarz inequality

| < x, y > | ≤ ∥x∥ ∥y∥;

(ii) The induced norm. . . is a norm;

(iii) The parallelogram law holds:

∥x + y∥^2 + ∥x − y∥^2 = 2(∥x∥^2 + ∥y∥^2 ) ∀x, y ∈ X;

(iv) The polarization law holds:

< x, y > = (1/4)(∥x + y∥^2 − ∥x − y∥^2 ) ∀x, y ∈ X.

PROOF: If x, y ∈ X and t ∈ R we have:

0 ≤ ∥ty + x∥^2 = < ty + x, ty + x > = t^2 ∥y∥^2 + 2t < x, y > + ∥x∥^2 .

The discriminant of this quadratic polynomial in t is thus less than or equal to 0:
this is exactly (i).
We show (ii): the norm is obviously homogeneous and non degenerate.
We have to prove the triangle inequality. For every x, y we have, by the
Cauchy-Schwarz inequality:

∥x + y∥^2 = < x + y, x + y > = ∥x∥^2 + 2 < x, y > + ∥y∥^2 ≤ ∥x∥^2 + 2∥x∥ ∥y∥ + ∥y∥^2 = (∥x∥ + ∥y∥)^2 .

(iii) and (iv) are easily proved by expanding the scalar products. (iii) is
called the parallelogram law because, if we interpret the vectors x and y as the
edges of a parallelogram, then x + y and x − y represent the diagonals. The
identity is then the expression of a well known result in euclidean geometry.
Q.E.D.

The parallelogram identity allows to characterize which norms are induced


by a scalar product:
PROPOSITION: Let (X, ∥ · ∥) be a normed space. Then the map

\[ a(x, y) := \frac14\left(\|x + y\|^2 - \|x - y\|^2\right), \quad x, y \in X \]

is a scalar product which induces the given norm if and only if the norm
satisfies the parallelogram law.

PROOF: If the norm is induced by a scalar product, we already know the
parallelogram law is satisfied, and the scalar product is recovered thanks to
the polarization identity.
Conversely, suppose the norm satisfies the parallelogram law and define
a(x, y) as in the statement. This function is symmetric and a(x, x) = ∥x∥^2 ≥
0 with equality iff x = 0. Moreover, a(x, 0) = a(0, y) = 0 and a(−x, y) =
−a(x, y). The function a(x, y) is also continuous.
Let now x1 , x2 , y ∈ X: from the parallelogram law we get

\[ (*) \qquad a(x_1, y) + a(x_2, y) = \frac14\left(\|x_1 + y\|^2 - \|x_1 - y\|^2 + \|x_2 + y\|^2 - \|x_2 - y\|^2\right) \]
\[ = \frac18\left(\|x_1 + x_2 + 2y\|^2 + \|x_1 - x_2\|^2 - \|x_1 + x_2 - 2y\|^2 - \|x_1 - x_2\|^2\right) = \frac12\, a(x_1 + x_2, 2y). \]

In particular, by letting x1 = x, x2 = 0 the last identity becomes

\[ (**) \qquad a(x, y) = \frac12\, a(x, 2y) \quad \forall x, y. \]

By replacing (**) within (*) we get

a(x1 , y) + a(x2 , y) = a(x1 + x2 , y) ∀x1 , x2 , y.

By applying this formula repeatedly we easily see that a(mx, y) = m a(x, y)
for every m ∈ Z. Then, by using again (**) and the symmetry:

\[ a\left(\frac{m}{2^n}x, y\right) = \frac{m}{2^n}\, a(x, y) \quad \forall x, y \in X,\ \forall m \in \mathbb{Z},\ n \in \mathbb{N}. \]

Now, the set of numbers of the form m/2^n is dense in R: by the continuity
of a we conclude that a(tx, y) = ta(x, y) for every x, y ∈ X and every t ∈ R:
a(x, y) is thus a scalar product. Q.E.D.
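The parallelogram law gives a quick numerical test for whether a norm can come from a scalar product. The sketch below (function names are illustrative assumptions) evaluates the defect of the law for the ℓ1 , ℓ2 and ℓ∞ norms on R2 : only the Euclidean norm gives zero.

```python
import math

def norm_p(x, p):
    """l^p norm on R^n; p = math.inf gives the max norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2).
    It vanishes for every x, y exactly when the norm is induced by a scalar product."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    return (norm_p(s, p) ** 2 + norm_p(d, p) ** 2
            - 2.0 * (norm_p(x, p) ** 2 + norm_p(y, p) ** 2))

x, y = [1.0, 0.0], [0.0, 1.0]
for p in (1, 2, math.inf):
    print(p, parallelogram_defect(x, y, p))
```

A single pair x, y with nonzero defect is enough to exclude that the norm is induced by a scalar product.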

9 Lecture of October 30, 2024 (3 hours)

DEFINITION (Hilbert space): A real vector space X, equipped with a scalar


product < ·, · > is a Hilbert space if it is a Banach space with the norm
induced by the scalar product.
EXAMPLES: Typical prototypes of Hilbert spaces are the spaces ℓ2 with
the scalar product < {xk }, {yk } > := Σ_{k=1}^{∞} xk yk and L2 (Ω) with the scalar
product < u, v > := ∫_Ω u(x)v(x) dx.

The following theorem is a stronger version of something we already know
is valid in a reflexive Banach space. But the proof will be independent from
the Banach-Alaoglu theorem, which we did not prove!
THEOREM (projection on a closed convex set): Let X be a Hilbert space, C
a nonempty, closed, convex subset of X, x0 ∈ X. Then there exists a unique
y ∈ C such that ∥x0 − y∥ = dist(x0 , C).
PROOF of the existence of the nearest point projection on a closed convex
subset of a Hilbert space: After a translation, we may suppose that x0 = 0:
we must now prove that in C there is a unique element of minimal norm.
Let now δ = inf{∥y∥ : y ∈ C}, and let {yn } ⊂ C be a sequence such that
∥yn ∥ → δ (such a sequence exists by the definition of infimum!).
We prove that {yn } is a Cauchy sequence in X: to this aim, consider
the parallelogram law with x/2, y/2 in place of x, y... We easily get the
identity

\[ \|x - y\|^2 = 2\left(\|x\|^2 + \|y\|^2\right) - 4\left\|\frac{x + y}{2}\right\|^2, \]

which holds for every x, y ∈ X. Notice also that, if x and y are in C, then
by convexity (x + y)/2 ∈ C: by the identity just obtained and the definition of δ
we get

\[ (***) \qquad \|y_n - y_m\|^2 = 2\left(\|y_n\|^2 + \|y_m\|^2\right) - 4\left\|\frac{y_n + y_m}{2}\right\|^2 \le 2\left(\|y_n\|^2 + \|y_m\|^2\right) - 4\delta^2. \]

The last quantity tends to 0 as m, n → +∞, so {yn } is a Cauchy sequence
and it converges to some point y ∈ X. Since C is closed, y ∈ C. Moreover
∥y∥ = δ by the continuity of the norm: y is our element of minimal norm in
C.
Let us prove uniqueness: if we have also ∥ỹ∥ = δ with ỹ ∈ C, we can
apply (***) with yn = y, ym = ỹ and we get

∥y − ỹ∥^2 ≤ 0,

whence y = ỹ. Q.E.D.

Let us discuss how to characterize the nearest point projection on a convex


set:
COROLLARY: In the hypotheses of the theorem (on existence and uniqueness
of the projection on a closed convex set in a Hilbert space), the point ȳ ∈ C
having minimal distance from x0 is characterized by the inequality

(∗) < x0 − ȳ, y − ȳ > ≤ 0 ∀y ∈ C.

In particular, if Y is a closed vector subspace of the Hilbert space X,
x0 ∈ X, then there exists a unique point ȳ ∈ Y having minimal distance from
x0 . This point is characterized by the orthogonality relation

< x0 − ȳ, y > = 0 ∀y ∈ Y.

PROOF: We show that ȳ ∈ C is the point of minimum distance iff (*) holds.
Suppose indeed (*) holds, and let y ∈ C. Then

∥x0 − y∥^2 = ∥x0 − ȳ + ȳ − y∥^2 = ∥x0 − ȳ∥^2 + ∥ȳ − y∥^2 − 2 < x0 − ȳ, y − ȳ > ≥ ∥x0 − ȳ∥^2

and ȳ is the point of minimum distance.
Conversely, let ȳ ∈ C be the point of minimum distance, y ∈ C. Then
we have, for t ∈ [0, 1], ty + (1 − t)ȳ ∈ C so that

(∗∗) ∥x0 − ȳ∥^2 ≤ ∥x0 − (ty + (1 − t)ȳ)∥^2 = ∥(x0 − ȳ) − t(y − ȳ)∥^2 = ∥x0 − ȳ∥^2 + t^2 ∥y − ȳ∥^2 − 2t < x0 − ȳ, y − ȳ >,

whence < x0 − ȳ, y − ȳ > ≤ (t/2) ∥y − ȳ∥^2 for t ∈ (0, 1], and (∗) follows by letting t → 0.
If C = Y , with Y a closed vector subspace of X, inequality (∗∗) must
hold for every t ∈ R: this is possible iff the scalar product < x0 − ȳ, y − ȳ >
is zero (and when y ranges in Y , y − ȳ exhausts all elements
of Y ). Q.E.D.
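The variational inequality (∗) can be tested numerically in a simple finite dimensional case. The sketch below assumes that C is the closed Euclidean unit ball of R3 , whose projection is obtained by rescaling; the function names, the point x0 and the random test points are illustrative choices.

```python
import math
import random

def dot(a, b):
    """Euclidean scalar product of two lists of reals."""
    return sum(s * t for s, t in zip(a, b))

def project_onto_unit_ball(x0):
    """Nearest point to x0 in the closed Euclidean unit ball."""
    scale = max(1.0, math.sqrt(dot(x0, x0)))
    return [t / scale for t in x0]

def random_point_in_ball(dim, rng):
    """Some point of the closed unit ball (uniformity is irrelevant here)."""
    v = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    scale = max(1.0, math.sqrt(dot(v, v)))
    return [t / scale for t in v]

rng = random.Random(0)
x0 = [3.0, -1.0, 2.0]
y_bar = project_onto_unit_ball(x0)
residual = [a - b for a, b in zip(x0, y_bar)]  # x0 minus its projection

# Largest value of <x0 - y_bar, y - y_bar> over the sampled points y of C.
worst = max(
    dot(residual, [a - b for a, b in zip(y, y_bar)])
    for y in (random_point_in_ball(3, rng) for _ in range(1000))
)
print(worst <= 1e-12)
```

The tolerance only absorbs rounding errors: in exact arithmetic the scalar product is nonpositive for every y in the ball.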

The last corollary is very important: we deduce that every Hilbert space
splits into the direct sum of any closed subspace and its orthogonal, with
continuous projections.
PROPOSITION: Let Y ⊂ X be a closed subspace of the Hilbert space X,
p : X → X the map that takes any x ∈ X to its closest point in the subspace
Y . Then p is linear and continuous and its restriction to Y is the identity
map. Moreover, x−p(x) is orthogonal to Y , and so we can write X = Y ⊕Y ⊥ ,
with continuous projections. Finally, ∥x∥2 = ∥p(x)∥2 + ∥x − p(x)∥2 for any
x ∈ X.
PROOF: By the previous corollary, p(x) is the unique point in Y such that
< x − p(x), y >= 0 for every y ∈ Y , i.e. the unique point of Y such that
x − p(x) ∈ Y ⊥ : for this reason, it is called the orthogonal projection of x on
Y.

Now p coincides with the identity map on Y . We show it is linear: let
x1 , x2 ∈ X, t ∈ R. Then we have 0 =< x1 − p(x1 ), y >=< x2 − p(x2 ), y >
for every y ∈ Y , and so

< x1 + tx2 − (p(x1 ) + tp(x2 )), y >= 0 ∀y ∈ Y,

whence p(x1 + tx2 ) = p(x1 ) + tp(x2 ).
If x ∈ X, since p(x) ∈ Y we get < x − p(x), p(x) >= 0, whence

∥x∥^2 =< p(x) + (x − p(x)), p(x) + (x − p(x)) >= ∥p(x)∥^2 + ∥x − p(x)∥^2 .

So p is continuous, because the identity implies

∥p(x)∥ ≤ ∥x∥,

i.e. the norm of p is less than or equal to 1 (actually, it is exactly 1 whenever
Y ̸= {0}, because p coincides with the identity map on Y ). Q.E.D.

We remark that there is an easy explicit formula for the orthogonal pro-
jection on a subspace of finite dimension:
REMARK: If Y is a finite dimensional subspace of X, and {e1 , . . . , en } is an
orthonormal basis of Y , then we have
$$ p(x) = \sum_{i=1}^{n} < x, e_i > e_i. $$
Moreover $\|p(x)\|^2 = \sum_{i=1}^{n} (< x, e_i >)^2$. Indeed, we only need to verify that
x − p(x) is orthogonal to every vector in Y : it is of course enough to check
this on the basis vectors. Now
$$ < x - p(x), e_j > = < x, e_j > - \sum_{i=1}^{n} < x, e_i > < e_i, e_j > = 0, $$
as we wanted. The expression for the norm of p(x) follows immediately from
the orthonormality of the basis vectors ei .
Notice that this result does not depend on the completeness of X: in
the projection theorem, completeness was needed to prove the existence of a
point of minimum distance. Here, we explicitly exhibit this point!
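
As a quick numerical sanity check of the projection formula, here is a minimal Python sketch in X = R³ with the usual scalar product. The subspace Y, its orthonormal basis and the vector x are hypothetical choices made only for illustration, and the helper names dot and project are ours.

```python
import math

def dot(u, v):
    """Usual scalar product in R^n."""
    return sum(a * b for a, b in zip(u, v))

def project(x, basis):
    """Orthogonal projection of x on span(basis); basis is assumed orthonormal."""
    p = [0.0] * len(x)
    for e in basis:
        c = dot(x, e)                      # coefficient < x, e_i >
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

s = 1.0 / math.sqrt(2.0)
basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]     # hypothetical orthonormal basis of Y
x = [3.0, 1.0, 2.0]
p = project(x, basis)                      # projection p(x)
r = [xi - pi for xi, pi in zip(x, p)]      # residual x - p(x)
```

Up to rounding, p(x) = (2, 2, 2) and x − p(x) = (1, −1, 0): the residual is orthogonal to both basis vectors, and ∥x∥² = 14 = 12 + 2 = ∥p(x)∥² + ∥x − p(x)∥².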

We next characterize the dual of a Hilbert space: for every continuous lin-
ear functional T ∈ X ′ there exists a unique y ∈ X such that T (x) =< y, x >
for every x ∈ X. In particular, the dual of X is isometrically isomorphic to
X:
THEOREM (Riesz representation theorem): Let X be a Hilbert space. Define
the map

Φ : X → X ′
y ↦ Ty

where, by definition, Ty (x) :=< y, x > for every x ∈ X. Then Φ is an
isometric isomorphism between X and X ′ .
PROOF: From the Cauchy-Schwarz inequality we get

|Ty (x)| = | < y, x > | ≤ ∥y∥ ∥x∥,

and the linear functional Ty is continuous with norm ≤ ∥y∥. On the other
hand, if y ≠ 0 we have Ty (y/∥y∥) = ∥y∥, whence ∥Ty ∥X ′ = ∥y∥.
So the linear map Φ : X → X ′ is a well defined isometry.
To conclude, we have just to show that Φ is surjective: for every T ∈ X ′
there is y ∈ X such that T = Ty .
Let Y = ker(T ). In case Y = X, we obviously have y = 0: we can thus
suppose Y is a closed, proper subspace of X. Let then x0 ∈ X \ Y , ȳ the
orthogonal projection of x0 on Y (notice that T (x0 − ȳ) = T (x0 ) ≠ 0,
because ȳ ∈ Y ). For every fixed x ∈ X we have
$$ x - \frac{T(x)}{T(x_0 - \bar{y})} (x_0 - \bar{y}) \in Y. $$
This vector must then be orthogonal to x0 − ȳ:
$$ < x_0 - \bar{y}, \; x - \frac{T(x)}{T(x_0 - \bar{y})} (x_0 - \bar{y}) > = 0, $$
whence with easy computations
$$ T(x) = < x, \; T(x_0 - \bar{y}) \frac{x_0 - \bar{y}}{\|x_0 - \bar{y}\|^2} >, $$
and our claim is proved with y = T (x0 − ȳ) (x0 − ȳ)/∥x0 − ȳ∥². Q.E.D.
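
The construction in the surjectivity part of the proof can be followed step by step in finite dimension. The sketch below works in R³ with the usual scalar product; the functional T, the orthonormal basis of ker(T ) and the point x0 are hypothetical choices of ours, made only to illustrate the formula for the representing vector.

```python
import math

def dot(u, v):
    """Usual scalar product in R^n."""
    return sum(a * b for a, b in zip(u, v))

def T(x):
    # hypothetical continuous linear functional on R^3
    return x[0] + 2 * x[1] + 2 * x[2]

# assumed orthonormal basis of ker(T): both vectors satisfy T(e) = 0
r5, r45 = math.sqrt(5), math.sqrt(45)
kernel_basis = [[2 / r5, -1 / r5, 0.0], [2 / r45, 4 / r45, -5 / r45]]

x0 = [1.0, 0.0, 0.0]                       # any point with T(x0) != 0
y_bar = [0.0, 0.0, 0.0]                    # orthogonal projection of x0 on ker(T)
for e in kernel_basis:
    c = dot(x0, e)
    y_bar = [yi + c * ei for yi, ei in zip(y_bar, e)]

d = [a - b for a, b in zip(x0, y_bar)]     # x0 - y_bar, orthogonal to ker(T)
scale = T(d) / dot(d, d)
y = [scale * di for di in d]               # representing vector given by the proof
```

Here the procedure returns, up to rounding, y = (1, 2, 2), which is exactly the vector of coefficients of T , as expected.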

Before we proceed, we need to define the sum of an arbitrary (not necessarily
countable) family of nonnegative numbers:
DEFINITION: Let {tα }α∈I be a family of nonnegative real numbers. We
define
$$ \sum_{\alpha \in I} t_\alpha = \sup\Big\{ \sum_{\alpha \in J} t_\alpha : J \subset I, \ J \text{ finite set} \Big\}. $$

The family {tα } is said to be summable if the sum is finite.


An equivalent definition is the following: the sum is the integral of {tα }
with respect to the counting measure on I.
Remark that if the set I is countable and {αn }n∈N is an enumeration,
then
$$ \sum_{\alpha \in I} t_\alpha = \sum_{n=1}^{\infty} t_{\alpha_n} $$

(and in particular the sum does not depend on the enumeration chosen).
REMARK: If {tα }α∈I is summable, then the set I ′ = {α ∈ I : tα > 0} is at
most countable.
Indeed, for every fixed n = 1, 2, 3, . . ., the set In = {α ∈ I : tα > 1/n} is
finite.
REMARK: If {cα }α∈I is a family of real numbers such that $\sum_{\alpha \in I} |c_\alpha| < +\infty$,
the sum $\sum_{\alpha \in I} c_\alpha$ is a well defined real number.
An easy way to define this sum is to take the integral of {cα } w.r.t. the
counting measure on I. Or, equivalently, we can enumerate the non-zero
terms and compute the sum of the series.
DEFINITION: Let I be a set of indices. Denote by ℓ²(I) the set of families
of real numbers {cα }α∈I such that the sum
$$ \sum_{\alpha \in I} c_\alpha^2 $$
is finite. This is a Hilbert space with the inner product
$$ < \{a_\alpha\}, \{b_\alpha\} > := \sum_{\alpha \in I} a_\alpha b_\alpha, $$
where the sum in the r.h.s. is absolutely convergent thanks to the Hölder
inequality in ℓ².

We now give the fundamental definition of the abstract Fourier series of
an element x of a Hilbert space X, with respect to some fixed orthonormal
family of vectors.
PROPOSITION (Bessel inequality): Let X be a Hilbert space, {eα }α∈I an
orthonormal family of elements of X (i.e. ∥eα ∥ = 1 for every α ∈ I and
< eα , eβ >= 0 whenever α, β ∈ I, α ≠ β). If x ∈ X, we define its Fourier
coefficients w.r.t. {eα } as the real numbers

cα =< x, eα >, α ∈ I.

Then the following Bessel inequality holds
$$ \sum_{\alpha \in I} |c_\alpha|^2 \le \|x\|^2. $$
In particular, at most countably many Fourier coefficients are non zero.


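PROOF: Let J ⊂ I be any finite set of indices. Then the vector
$\sum_{\alpha \in J} < x, e_\alpha > e_\alpha$ is the orthogonal projection of x on the space spanned by {eα }α∈J
and
$$ \|x\|^2 = \Big\| \sum_{\alpha \in J} < x, e_\alpha > e_\alpha \Big\|^2 + \Big\| x - \sum_{\alpha \in J} < x, e_\alpha > e_\alpha \Big\|^2 $$
whence (using orthonormality):
$$ \|x\|^2 \ge \Big\| \sum_{\alpha \in J} < x, e_\alpha > e_\alpha \Big\|^2 = \sum_{\alpha \in J} \| < x, e_\alpha > e_\alpha \|^2 = \sum_{\alpha \in J} c_\alpha^2. $$
Taking the supremum over all finite subsets J ⊂ I we get our claim. Q.E.D.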

Bessel inequality ensures that the Fourier coefficients {cα } of x ∈ X,
w.r.t. an orthonormal family {eα }α∈I , belong to the space ℓ²(I). Conversely,
every element of ℓ²(I) coincides with the family of Fourier coefficients of some
element of the space X:
THEOREM: Let X be a Hilbert space, {eα }α∈I a fixed orthonormal family.
For every {cα }α∈I ∈ ℓ2 (I) there exists an element x ∈ X such that

< x, eα >= cα ∀α ∈ I.

In other words, the linear map

Ψ : X → ℓ²(I)
x ↦ {< x, eα >}α∈I

is surjective.
PROOF: The coefficients cα are non zero at most for a countable family of
indices I ′ ⊂ I. Choose an enumeration of I ′ :

I ′ = {αk : k = 1, 2, 3 . . .}.

Put then
$$ x_n = \sum_{k=1}^{n} c_{\alpha_k} e_{\alpha_k}. $$
By the orthonormality of eα we have
$$ \|x_n - x_{n+h}\|^2 = \sum_{k=n+1}^{n+h} c_{\alpha_k}^2, $$
whence {xn } is a Cauchy sequence (because the series $\sum_{k=1}^{\infty} c_{\alpha_k}^2$ converges).
Then, by completeness, xn → x ∈ X. By the continuity of the scalar product,
$$ < x, e_\alpha > = \lim_{n \to +\infty} < x_n, e_\alpha > = c_\alpha $$
(consider the cases α ∈ I ′ and α ∈ I \ I ′ separately). Q.E.D.

10 Lecture of October 31, 2024 (3 hours)

REMARK: In the proof of the previous theorem, the point x was found as
the sum of the series $\sum_{k=1}^{\infty} c_{\alpha_k} e_{\alpha_k}$. We would very much like to write, for any
x ∈ X,
$$ x = \sum_{\alpha \in I} c_\alpha e_\alpha. $$
This is true if the orthonormal set is maximal, a statement which follows
immediately from the following theorem.
More generally, given any orthonormal system {eα }α∈I in X, x ∈ X and
cα =< x, eα >, we will see that the Fourier series
$$ \sum_{\alpha \in I} c_\alpha e_\alpha $$
is well defined and converges to an element of X: precisely, it converges to
the orthogonal projection p(x) of x on the closure of the space spanned by
the vectors {eα }.

The next result ensures that the map Ψ defined in the last theorem
is an isometric isomorphism as soon as the orthonormal system {eα } is
maximal. In that case, given x ∈ X we can always write
$$ x = \sum_{\alpha \in I} c_\alpha e_\alpha, $$
where cα =< x, eα > are the Fourier coefficients of x.
THEOREM (Abstract Fourier series): Let X be a Hilbert space, {eα }α∈I be
an orthonormal family in X. Then the following facts are equivalent:

(i) The family {eα }α∈I is maximal: if we add any vector of X to the family,
it is no longer orthonormal;

(ii) The space spanned by {eα }α∈I is dense in X;

(iii) Parseval identity holds: for every x ∈ X, if we denote by cα =< x, eα >
its Fourier coefficients, then
$$ \|x\|^2 = \sum_{\alpha \in I} |c_\alpha|^2. $$

In particular, when these conditions hold, the map Ψ defined in our previous
theorem is an isometric isomorphism between X and ℓ²(I). By the injectivity
of Ψ, the Fourier series
$$ \sum_{\alpha \in I} c_\alpha e_\alpha $$
defined in the proof of the theorem converges to x, and so its sum does not
depend on the enumeration chosen for the non zero Fourier coefficients.
PROOF: We show that (i) ⇒ (ii): suppose by contradiction that span{eα }
is not dense, let Y be its closure and let x0 ∈ X \ Y . Then, if p(x0 ) is the
orthogonal projection of x0 on Y , x0 − p(x0 ) is a non zero vector which is
orthogonal to all eα , against the maximality hypothesis.
We then show (ii) ⇒ (iii): given ε > 0, by (ii) for every x ∈ X we can
find a finite linear combination λ1 eα1 + λ2 eα2 + . . . + λN eαN such that
$$ \|x - (\lambda_1 e_{\alpha_1} + \lambda_2 e_{\alpha_2} + \ldots + \lambda_N e_{\alpha_N})\|^2 < \varepsilon. $$
This implies
$$ \|x - c_{\alpha_1} e_{\alpha_1} - c_{\alpha_2} e_{\alpha_2} - \ldots - c_{\alpha_N} e_{\alpha_N}\|^2 < \varepsilon $$
(by the minimality property of the orthogonal projection p(x) on Y = span{eα1 , . . . , eαN }),
whence $\varepsilon > \|x - p(x)\|^2 = \|x\|^2 - \|p(x)\|^2 = \|x\|^2 - \sum_{i=1}^{N} c_{\alpha_i}^2$ and thus
$$ \|x\|^2 \le \sum_{\alpha \in I} c_\alpha^2 + \varepsilon. $$
We already know that $\sum_{\alpha \in I} c_\alpha^2 \le \|x\|^2$ (Bessel inequality), so Parseval
identity is proved because ε is arbitrary.
Finally, we have to prove that (iii) ⇒ (i): let x0 ∈ X be orthogonal to
all the vectors eα . By the Parseval identity we have ∥x0 ∥ = 0, so x0 = 0 and
the orthonormal family {eα } is maximal. Q.E.D.
DEFINITION: A maximal orthonormal set in a Hilbert space is called a
Hilbert basis. It is easy to check that a Hilbert basis always exists (Zorn
lemma): in particular, every Hilbert space X is isomorphic and isometric to
ℓ2 (I) for a suitably chosen set of indices I.

You probably wonder how the abstract theory we just discussed is related
with the Fourier series in the traditional, trigonometric sense! Here is the
answer:
REMARK: Consider the space
$$ L^2(2\pi) = \Big\{ u : \mathbb{R} \to \mathbb{R} : u \text{ measurable, } 2\pi\text{-periodic, } \int_{-\pi}^{\pi} u^2(x)\,dx < +\infty \Big\}, $$
with the usual equivalence relation identifying functions which are a.e. equal.
This is a Hilbert space with the scalar product
$$ < u, v > = \int_{-\pi}^{\pi} u(x) v(x)\,dx. $$
Consider the following family of functions in L²(2π):
$$ \mathcal{F} = \Big\{ \frac{1}{\sqrt{2\pi}}, \ \frac{1}{\sqrt{\pi}} \sin nx, \ \frac{1}{\sqrt{\pi}} \cos nx : n = 1, 2, \ldots \Big\}. $$
One easily checks that this family is orthonormal, and that the abstract
Fourier series of u ∈ L2 (2π) w.r.t. this orthonormal set is precisely the usual
Fourier series.
Moreover, we will prove that the family F is maximal. As a consequence,
the classical Fourier series of a function in L2 (2π) converges in L2 to the
function itself.
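
The relation between Bessel inequality and Parseval identity can be observed numerically. The following Python sketch computes the Fourier coefficients with respect to the trigonometric family by a midpoint quadrature rule; the test function u(x) = x on (−π, π) and the number of quadrature nodes are hypothetical choices of ours, and the integrals are only approximated.

```python
import math

def integrate(f, a, b, n=4000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def inner(f, g):
    """Scalar product of L^2(2*pi): integral of f*g over [-pi, pi]."""
    return integrate(lambda x: f(x) * g(x), -math.pi, math.pi)

def u(x):
    # hypothetical test function: u(x) = x on (-pi, pi), extended periodically
    return x

norm_sq = inner(u, u)                      # exact value: 2*pi**3/3

def bessel_sum(N):
    """Sum of the squared Fourier coefficients over the constant and n = 1..N."""
    total = inner(u, lambda x: 1 / math.sqrt(2 * math.pi)) ** 2
    for n in range(1, N + 1):
        total += inner(u, lambda x: math.sin(n * x) / math.sqrt(math.pi)) ** 2
        total += inner(u, lambda x: math.cos(n * x) / math.sqrt(math.pi)) ** 2
    return total

partial = [bessel_sum(N) for N in (5, 20, 80)]
```

The partial sums increase with N and stay below ∥u∥² (Bessel); the gap shrinks as N grows, in agreement with Parseval identity for a maximal family.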

Notice that because the above family of functions is a Hilbert basis of
L²(2π), this space is separable. Recall that a topological space is separable
if it has a countable dense subset.
PROPOSITION: A Hilbert space X is separable iff it has a countable Hilbert
basis.
PROOF: If X has a countable Hilbert basis, the space generated by this
basis is dense in X. Consider the set of linear combinations with rational
coefficients of the basis elements: this is a dense countable set.
Conversely, let {xn } ⊂ X be a dense countable set. Apply the Gram-Schmidt
orthogonalization process to this set (discarding the vectors which
are linear combinations of the previous ones): we obtain a sequence of orthonormal
vectors {ek } which spans a subspace of X containing all vectors
xn , i.e. a dense subspace: we have a countable Hilbert basis. Q.E.D.
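
For the reader's convenience, here is a minimal Python sketch of the Gram-Schmidt process in R^n, in the variant which discards the vectors depending linearly on the previous ones. The tolerance tol used to detect linear dependence and the sample vectors are assumptions of ours, needed only because we work in floating point arithmetic.

```python
import math

def dot(u, v):
    """Usual scalar product in R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list spanning the same subspace as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(w, e)                  # component of w along e
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:                     # skip (numerically) dependent vectors
            basis.append([wi / norm for wi in w])
    return basis

# hypothetical input: the second vector is a multiple of the first one
vectors = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0]]
basis = gram_schmidt(vectors)
```

In this example the procedure returns two orthonormal vectors, because the second input vector is discarded.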
EXERCISE: Let X be a Hilbert space, {eα }α∈I a (not necessarily maximal)
orthonormal family. Show that for every x ∈ X the sum of the series
$$ \sum_{\alpha \in I} < x, e_\alpha > e_\alpha $$
is well defined. (HINT: Consider the closed subspace Y given by the closure of
span{eα }α∈I . Show that the series converges to the orthogonal projection of x on Y . . . )
The fact that the orthonormal system in L²(2π) given by
$$ \mathcal{F} = \Big\{ \frac{1}{\sqrt{2\pi}}, \ \frac{1}{\sqrt{\pi}} \cos nx, \ \frac{1}{\sqrt{\pi}} \sin nx, \ n = 1, 2, \ldots \Big\} $$

is maximal, and hence a Hilbert basis, comes from the Stone-Weierstrass
theorem, which ensures that every continuous and 2π-periodic function can
be approximated in the uniform norm with trigonometric polynomials (which
are, by definition, linear combinations of elements of F).
Since we will see very soon that every element of L²(2π) can be approximated
in the L² norm with continuous functions, it follows that it can also be
approximated with linear combinations of elements of F (uniform convergence
on [−π, π] implies convergence in the L² norm): the space spanned
by our orthonormal system is thus dense in L²(2π), and F is maximal.
We will see now the Stone-Weierstrass theorem for trigonometric poly-
nomials. For the proof we will need the fact that the set of trigonometric
polynomials is an algebra: the product of two of those is still a trigonometric
polynomial. This is easily checked if we express the sin and cos functions in
terms of complex exponentials: as a consequence, notice that any polynomial
in sin x, cos x is indeed a trigonometric polynomial.

THEOREM (Stone-Weierstrass): Let u : R → R be a 2π-periodic, continuous
function. Then for every ε > 0 there exists a trigonometric polynomial
v(x) such that ∥u − v∥∞ < ε.
PROOF: For every natural n, consider the following trigonometric polynomial:
$\phi_n(t) = c_n \left( \frac{1 + \cos t}{2} \right)^n$, where the constants cn are chosen in such a way
that $\int_{-\pi}^{\pi} \phi_n(t)\,dt = 1$.
If we draw the graph of these functions, we see non-negative periodic
functions which “concentrate” around the points 2kπ:

[Figure: graph of the functions ϕn for x between −3 and 3.]

We will see now how these functions will allow us to construct the desired
approximations of u with trigonometric polynomials.
We define the trigonometric polynomials approximating u as follows:
$$ u_n(x) = \int_{-\pi}^{\pi} u(x + t) \phi_n(t)\,dt. $$
These functions are obtained essentially by computing a “weighted average”
of u: we will see that un → u uniformly.
Before we do that, we need to show that un are indeed trigonometric
polynomials, which is not at all clear from the definition. . . To see that, it is
enough to change variables in the integral, by putting s = x + t: recalling that
all functions involved are periodic, we get:
$$ u_n(x) = \int_{-\pi}^{\pi} u(s) \phi_n(s - x)\,ds. $$
Since ϕn are trigonometric polynomials, by using the addition formulas for
sin and cos, and the linearity of the integral, we now see that the functions
un are indeed trigonometric polynomials!
We now need a simple estimate of the constants cn appearing in the
definition of ϕn : we have
$$ \frac{1}{c_n} = \int_{-\pi}^{\pi} \left( \frac{1 + \cos t}{2} \right)^n dt \ge \int_{-1/\sqrt{n}}^{1/\sqrt{n}} \left( \frac{1 + \cos t}{2} \right)^n dt \ge \frac{2}{\sqrt{n}} \left( \frac{1 + \cos(\frac{1}{\sqrt{n}})}{2} \right)^n. $$
The n-th power in the last expression converges to e^{−1/4}, so that cn ≤ k√n
for n large enough, with k a suitable positive constant.
To show that un → u uniformly, we use instead the original definition of
un . Since the functions ϕn are non negative with integral 1, we get
$$ (\ast) \quad |u_n(x) - u(x)| = \left| \int_{-\pi}^{\pi} (u(x + t) - u(x)) \phi_n(t)\,dt \right| \le \int_{-\pi}^{\pi} |u(x + t) - u(x)| \phi_n(t)\,dt. $$
Let M be an upper bound for |u|, and remark that u is uniformly continuous:
for every ε > 0 we find δ > 0 such that |x − y| < δ implies
|u(x) − u(y)| < ε.
Now, split the r.h.s. integral in (*) on the sets [−δ, δ] and [−π, −δ]∪[δ, π].
By our choice of δ we get
$$ \int_{-\delta}^{\delta} |u(x + t) - u(x)| \phi_n(t)\,dt \le \varepsilon \int_{-\delta}^{\delta} \phi_n(t)\,dt < \varepsilon. $$
On the other hand
$$ \int_{\delta}^{\pi} |u(x + t) - u(x)| \phi_n(t)\,dt \le 2M c_n \int_{\delta}^{\pi} \left( \frac{1 + \cos t}{2} \right)^n dt \le 2M k \pi \sqrt{n} \left( \frac{1 + \cos(\delta)}{2} \right)^n, $$
and the last quantity vanishes as n → +∞, uniformly in x (because it does
not depend on x!). The integral on [−π, −δ] is estimated in the same way:
for n large enough we thus get |un (x) − u(x)| < 2ε for every x. Q.E.D.
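
The uniform convergence un → u can also be observed numerically. The Python sketch below uses the kernels ϕn(t) = cn ((1 + cos t)/2)^n of the proof, with cn computed by a midpoint quadrature rule; the test function u(x) = |sin x|, the sample points and the values of n are hypothetical choices of ours, and the sup norm is only estimated on the sample points.

```python
import math

def integrate(f, a, b, n=2000):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def make_phi(n):
    """Kernel phi_n(t) = c_n * ((1 + cos t) / 2)**n with integral 1 on [-pi, pi]."""
    base = lambda t: ((1.0 + math.cos(t)) / 2.0) ** n
    c_n = 1.0 / integrate(base, -math.pi, math.pi)
    return lambda t: c_n * base(t)

def sup_error(u, n, xs):
    """Largest value of |u_n(x) - u(x)| over the sample points xs."""
    phi = make_phi(n)
    worst = 0.0
    for x in xs:
        u_n = integrate(lambda t: u(x + t) * phi(t), -math.pi, math.pi)
        worst = max(worst, abs(u_n - u(x)))
    return worst

u = lambda x: abs(math.sin(x))             # hypothetical continuous periodic function
xs = [-math.pi + k * math.pi / 12 for k in range(25)]
errors = [sup_error(u, n, xs) for n in (5, 20, 80, 320)]
```

The estimated errors decrease as n grows, although rather slowly: this is consistent with the fact that the kernels concentrate around 0 on a scale of order 1/√n.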

To finish our discussion on Hilbert spaces, we study weak convergence
in this setting. By Riesz’s representation theorem, weak convergence in a
Hilbert space X reads as follows: if {xn } ⊂ X, then
$$ (x_n \rightharpoonup x) \ \overset{\text{Def}}{\iff} \ (< y, x_n > \to < y, x > \ \forall y \in X). $$
The following are some interesting facts about weak convergence in a
Hilbert space:
PROPOSITION: Let X be a Hilbert space. Then

(i) If xn ⇀ x, then {xn } is bounded and ∥x∥ ≤ lim inf n→+∞ ∥xn ∥.

(ii) If {xn } is such that for every y ∈ X the limit $T(y) := \lim_{n \to +\infty} < x_n, y >$
exists and is finite, then there is a unique x ∈ X such that xn ⇀ x.

(iii) If xn ⇀ x and yn → y (strong convergence), then

< xn , yn >→< x, y > .

(iv) We have xn → x iff xn ⇀ x and ∥xn ∥ → ∥x∥.

PROOF: We already know (i) in a general Banach space. For (ii), apply the
Banach-Steinhaus theorem to the functionals Tn (y) :=< xn , y > (this shows
that the limit functional T is continuous), and the Riesz
representation theorem to the limit functional T (y).
To prove (iii), write
< xn , yn > − < x, y >= (< xn , yn > − < xn , y >) + (< xn , y > − < x, y >).
The second bracket converges to zero by definition of weak convergence. To
estimate the first bracket, notice that {xn } is bounded in norm by (i), and
apply the Cauchy-Schwarz inequality to < xn , yn − y >: the first bracket also
goes to 0 and (iii) is proved.
An implication of (iv) is obvious. For the other, write
∥xn − x∥2 = ∥xn ∥2 − 2 < xn , x > +∥x∥2
and apply weak convergence and convergence of the norms. Q.E.D.
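
A standard example shows that the inequality in (i) can be strict, and that the convergence of the norms cannot be dropped in (iv). We write it for an orthonormal sequence {en } in an infinite dimensional Hilbert space X (for instance, the canonical sequence in ℓ²(N)); the computation only uses Bessel inequality.

```latex
% \{e_n\} is an orthonormal sequence in X
\[
\sum_{n=1}^{\infty} < y, e_n >^2 \le \|y\|^2
\quad\Longrightarrow\quad
< y, e_n > \to 0 \quad \forall y \in X,
\qquad\text{i.e. } e_n \rightharpoonup 0 .
\]
\[
\|0\| = 0 < 1 = \liminf_{n\to+\infty} \|e_n\|,
\qquad
\|e_n - e_m\|^2 = 2 \quad (n \ne m).
\]
```

So en ⇀ 0, but no subsequence of {en } converges strongly.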

We now see the proof of the Banach-Alaoglu theorem in a separable
Hilbert space:
THEOREM: Let X be a separable Hilbert space, {xn }n∈N a bounded sequence
in X. Then there exists x ∈ X and a subsequence xnk of xn such that
xnk ⇀ x.
PROOF: By the boundedness hypothesis, there exists C > 0 such that
∥xn ∥ ≤ C for all n ∈ N. Let {ej }j∈N be a countable Hilbert basis of X (it
exists because X is separable). In particular, for every fixed j we have
| < xn , ej > | ≤ C ∀n ∈ N.
Consider now the Fourier coefficients < xn , e1 >: they form a bounded
sequence in R, and we can extract a subsequence $x_n^{(1)}$ of xn such that
$$ < x_n^{(1)}, e_1 > \to c_1 \in \mathbb{R}. $$
The real sequence $< x_n^{(1)}, e_2 >$ is also bounded: we extract a subsequence
$x_n^{(2)}$ of $x_n^{(1)}$ in such a way that $< x_n^{(2)}, e_2 > \to c_2 \in \mathbb{R}$.
Proceeding in this way, we construct by recursion a sequence of subsequences
$x_n^{(k)}$ such that $x_n^{(k)}$ is a subsequence of $x_n^{(k-1)}$, and such that the
Fourier coefficients satisfy
$$ \lim_{n \to +\infty} < x_n^{(k)}, e_j > = c_j, \quad j = 1, 2, \ldots, k. $$
Take the diagonal sequence defined by $\tilde{x}_n = x_n^{(n)}$. It is a subsequence of {xn }
with the property that
$$ \lim_{n \to +\infty} < \tilde{x}_n, e_j > = c_j \quad \forall j \in \mathbb{N}. $$
We will see next time that the diagonal sequence converges weakly to a
vector (which has the numbers cj as its Fourier coefficients), thus concluding
the proof. We will also see how to get rid of the separability assumption on
the Hilbert space X.

11 Lecture of November 6, 2024 (3 hours)


By the linearity of the limit and of the scalar product, if we put Y = span{ej :
j ∈ N} (a dense subspace), we have that for every y ∈ Y the following limit
exists and is finite
$$ \lim_{n \to +\infty} < \tilde{x}_n, y > = T(y), $$
and T : Y → R is obviously linear. Moreover, T is bounded (as a pointwise
limit of norm-bounded functionals), and so can be extended to a bounded
linear functional T defined on the whole space X. Let x ∈ X be such
that T (y) =< x, y > for every y ∈ X (this exists by Riesz representation
theorem... and its Fourier coefficients are clearly cj ). Our construction ensures
that < x̃n , y >→< x, y > for every y ∈ Y . As Y is dense in X, the same
holds for every y ∈ X: let us prove it!
Indeed, let y ∈ X, ε > 0. Since Y is dense in X, there exists ỹ ∈ Y such
that ∥ỹ − y∥ < ε. Hence
$$ < \tilde{x}_n, y > - < x, y > = < \tilde{x}_n, y - \tilde{y} > + (< \tilde{x}_n, \tilde{y} > - < x, \tilde{y} >) - < x, y - \tilde{y} >. $$
The modulus of the quantity between brackets is less than ε for large enough
n. The other two terms are estimated by Cε: take for instance the first, by
Cauchy-Schwarz we have

| < x̃n , y − ỹ > | ≤ ∥x̃n ∥ ∥y − ỹ∥ < Cε.

We then conclude that

| < x̃n , y > − < x, y > | < (2C + 1)ε

for large enough n. Q.E.D.


REMARK: The last theorem is easily extended to a non-separable Hilbert
space X. Indeed, if {xn } is a bounded sequence in X, define Z as the closure of

span{xn : n ∈ N}.

This is obviously a separable Hilbert space (the linear combinations with
coefficients in Q of the vectors xn are a countable dense subset): by our
previous result we find x ∈ Z and a subsequence xnk such that

< xnk , y >→< x, y >

for every y ∈ Z as k → +∞. But the same holds for every y ∈ Z ⊥ (because all
scalar products are zero!): it thus holds for every y ∈ X, because X = Z⊕Z ⊥ ,
so xnk ⇀ x in X.
REMARK: The argument we used in a separable Hilbert space applies, with
few modifications, in the case of a reflexive Banach space whose dual is sepa-
rable (it can be shown that this hypothesis is equivalent to ask that the space
is reflexive and separable). In this case, we must replace the Hilbert basis
{ej } with a countable family of elements of the dual space which generate a
dense subspace: indeed, in the proof the orthonormality of the basis vectors
was not used in any essential way!

Among the most important function spaces we met in this course are the
Lebesgue spaces Lp (Ω) (where Ω is an open subset of Rn equipped with the
Lebesgue measure). We will now study some properties of these spaces, which
are very important for the applications: in particular, we will see that every
function in Lp (Ω) (for finite p) can be approximated with regular functions.
The following theorem highlights a surprising relation between measur-
able functions and continuous functions:
THEOREM (Lusin): Let u : Ω → R be a measurable function, with Ω a
bounded and Lebesgue-measurable set. Then, for every ε > 0, there is a
compact set K ⊂ Ω such that |Ω \ K| < ε and such that the restriction of u
to K is continuous.
REMARK: Lusin’s Theorem does not contradict the fact that there are
measurable functions which are everywhere discontinuous: we are not saying
that points of K are continuity points for u, but only for its restriction!
PROOF: We now prove Lusin’s Theorem. For j = 1, 2, 3, . . ., write
$\mathbb{R} = \bigcup_{i=1}^{+\infty} I_{ij}$, with Iij disjoint intervals with length less than 1/j. Fix also points
yij ∈ Iij .
Let then Aij = u⁻¹(Iij ): those are pairwise disjoint measurable sets,
whose union is Ω. By regularity of Lebesgue measure, we can find compact
sets Kij ⊂ Aij such that $|A_{ij} \setminus K_{ij}| < \frac{\varepsilon}{2^{i+j}}$.
Obviously, $|\Omega \setminus \bigcup_{i=1}^{\infty} K_{ij}| < \frac{\varepsilon}{2^j}$, and by continuity of the measure on decreasing
sequences of sets we can choose Nj ∈ N such that
$$ \Big| \Omega \setminus \bigcup_{i=1}^{N_j} K_{ij} \Big| < \frac{\varepsilon}{2^j}. $$
Define $K_j = \bigcup_{i=1}^{N_j} K_{ij}$: this is a compact set. We then define uj : Kj → R
by uj (x) = yij for x ∈ Kij : we obtain a continuous function (it is constant on
Kij , and we have only a finite number of these sets, which are at a positive
distance from each other) with the property that |uj (x) − u(x)| < 1/j for
every x ∈ Kj .
If we then define $K = \bigcap_{j=1}^{\infty} K_j$, we get a compact set satisfying |Ω \ K| < ε
over which uj → u uniformly. It follows that the restriction of u to K is
continuous, being the uniform limit of continuous functions. Q.E.D.
The following is a well-known extension theorem:
THEOREM (Tietze): Let K ⊂ Rn be a compact set. If u : K → R is
continuous, there exists a continuous function ũ : Rn → R extending u (i.e.,
such that u(x) = ũ(x) for all x ∈ K) and such that ∥ũ∥∞ = ∥u∥∞ . Moreover,
if K ⊂ Ω, with Ω open in Rn , we can also require that $\tilde{u} \in C_C^0(\Omega)$.
PROOF: Put M = ∥u∥∞ and define the compact sets K1 = u⁻¹([−M, −M/3]),
K2 = u⁻¹([M/3, M ]): suppose for a moment they are both nonempty, and
let δ > 0 be their mutual distance. Then the function
$$ \tilde{u}_1(x) = \min\Big\{ M/3, \ -M/3 + \frac{2M}{3\delta} \, \mathrm{dist}(x, K_1) \Big\} $$
is continuous, everywhere defined and takes values between −M/3 and M/3.
Moreover, on K1 it takes the value −M/3, on K2 the value M/3. It follows
that $|\tilde{u}_1(x) - u(x)| \le \frac{2}{3} M$ for every x ∈ K. If K1 is empty, we obtain the
same result by putting ũ1 (x) = M/3 (constant function). A similar argument
works if K2 is empty.
We repeat the same construction for the function u2 = u − ũ1 : we find a
continuous function ũ2 which is defined everywhere, with $\|\tilde{u}_2\|_\infty \le \frac{2}{9} M$ and
such that $\|u - \tilde{u}_1 - \tilde{u}_2\|_\infty \le \frac{4}{9} M$. Proceeding in the same way, we find a
sequence ũk of continuous functions such that $\|\tilde{u}_k\|_\infty \le \frac{2^{k-1}}{3^k} M$ and such that
$$ (\ast) \quad \|u - \tilde{u}_1 - \tilde{u}_2 - \ldots - \tilde{u}_k\|_\infty \le \frac{2^k}{3^k} M \quad \text{in } K. $$

The series of continuous functions
$$ \sum_{k=1}^{\infty} \tilde{u}_k(x) $$
converges uniformly in Rn to some function ũ (because the series of the norms
converges). By using our estimates of the terms in this sum, we immediately
see that the norm of ũ is less than or equal to M . Moreover, by (*) ũ coincides
with u on the points of K.
Let now Ω ⊃ K be open. Take an open set Ω′ such that K ⊂ Ω′ ⊂⊂ Ω.
By using the distance function as in the construction of ũ1 , it is easy to
construct a continuous function ϕ : Rn → [0, 1] such that ϕ(x) = 1 for every
x ∈ K, ϕ(x) = 0 for every x ∈ Rn \ Ω′ . Then ϕ(x) · ũ(x) is a continuous
extension of u, compactly supported in Ω (and with the same uniform norm
as u). Q.E.D.
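
The iterative construction of the proof is easy to experiment with when K is a finite subset of R (which is of course compact). In the Python sketch below each step uses a function of the form min{M/3, −M/3 + s · dist(x, K1 )}, where we assume the slope s = 2M/(3δ), i.e. the choice which makes the function equal to M/3 on K2 ; the points, the values of u and the number of steps are hypothetical data chosen for illustration.

```python
def tietze_step(points, values):
    """One step of the construction: a function on R within 2M/3 of `values` on K."""
    M = max(abs(v) for v in values)
    if M == 0.0:
        return lambda x: 0.0
    K1 = [p for p, v in zip(points, values) if v <= -M / 3]
    K2 = [p for p, v in zip(points, values) if v >= M / 3]
    if not K1:                             # K1 empty: constant function M/3
        return lambda x: M / 3
    if not K2:                             # K2 empty: constant function -M/3
        return lambda x: -M / 3
    delta = min(abs(p - q) for p in K1 for q in K2)
    slope = 2 * M / (3 * delta)            # assumed slope, see the lead-in
    return lambda x: min(M / 3, -M / 3 + slope * min(abs(x - p) for p in K1))

def tietze_extend(points, values, steps=40):
    """Sum of the first `steps` pieces, together with the residual u - sum on K."""
    pieces = []
    residual = list(values)
    for _ in range(steps):
        piece = tietze_step(points, residual)
        pieces.append(piece)
        residual = [r - piece(p) for r, p in zip(residual, points)]
    return (lambda x: sum(f(x) for f in pieces)), residual

points = [0.0, 1.0, 2.5, 4.0]              # hypothetical compact set K
values = [1.0, -0.5, 0.25, -1.0]           # hypothetical values of u on K, M = 1
extension, residual = tietze_extend(points, values)
```

After k steps the residual on K is bounded by (2/3)^k M , and the partial sums stay between −M and M everywhere, in agreement with the estimates of the proof.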
The following result ensures the density of continuous functions in Lp :
THEOREM (Density of continuous functions in Lp ): Let 1 ≤ p < +∞, Ω
be an open set in Rn . Then continuous functions with compact support are
dense in Lp (Ω).
PROOF: We have to show that given u ∈ Lp (Ω) and ε > 0, we can find
$v \in C_C^0(\Omega)$ such that ∥u − v∥Lp < ε.
Suppose first Ω is bounded and ∥u∥∞ = M < +∞: we will remove these
restrictions later.
By Lusin’s theorem, we find a compact set K ⊂ Ω such that
$|\Omega \setminus K| < \left( \frac{\varepsilon}{2M} \right)^p$ and the restriction of u to K is continuous. By the Tietze extension
theorem, we find $v \in C_C^0(\Omega)$ such that ∥v∥∞ ≤ M and v ≡ u on K. Then
we get

∥u − v∥Lp (Ω) = ∥u − v∥Lp (Ω\K) ≤ ∥u∥Lp (Ω\K) + ∥v∥Lp (Ω\K) < ε,

as we wanted (each of the last two norms is at most M |Ω \ K|^{1/p} < ε/2).
If Ω is bounded, but u is unbounded, remark that the truncated functions
uM (x) = max{−M, min{M, u(x)}} converge to u in the Lp norm as M →
+∞ (dominated convergence theorem). Finally, if Ω is unbounded consider
the functions
$$ u_R(x) = \begin{cases} u(x) & \text{if } |x| < R, \\ 0 & \text{if } |x| \ge R. \end{cases} $$
By the dominated convergence theorem, we see that uR → u in Lp as R →
+∞. The functions uR are supported in Ω ∩ BR (0), which is a bounded open
set, so the previous result applies. Q.E.D.
REMARK: The density result is of course false for p = +∞. Indeed, contin-
uous functions are a closed proper subspace of L∞ (Ω).

Here are some consequences of the density of continuous functions in Lp .


REMARK: Lp (Ω) is separable for 1 ≤ p < +∞: it is easy to construct
a countable set of functions which is dense in the subspace of continuous
functions. For instance, take the step functions whose steps are intervals
with rational endpoints and have rational heights. By using the uniform
continuity, every continuous function with compact support is arbitrarily
close to a step function of this kind!
On the other hand, you can show as an exercise that L∞ (Ω) is not sepa-
rable.
EXERCISE: Another important consequence of the density of continuous
functions is the following continuity of translations in Lp (Rn ) (1 ≤ p < +∞):
let u ∈ Lp (Rn ), y ∈ Rn . Define uy (x) := u(x−y) (translated function). Show
that
$$ \lim_{y \to 0} \|u - u_y\|_{L^p} = 0. $$
(HINT: If $u \in C_C^0(\mathbb{R}^n)$, the result is an easy consequence of uniform continuity.
A generic function can be approximated by continuous, compactly
supported functions.)
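
To get a feeling for the statement (this is an illustration, not a solution of the exercise), one can compute everything explicitly for the characteristic function of an interval on the real line. The choice u = 1_{[0,1]} below is ours; the same computation shows that the continuity of translations fails for p = +∞.

```latex
% u = 1_{[0,1]} on \mathbb{R}, \quad u_y(x) = u(x - y), \quad 0 < |y| \le 1
\[
|u - u_y| = 1 \text{ on two intervals of length } |y|, \qquad |u - u_y| = 0 \text{ elsewhere},
\]
\[
\|u - u_y\|_{L^p} = (2|y|)^{1/p} \to 0 \quad \text{as } y \to 0 \quad (1 \le p < +\infty),
\qquad
\|u - u_y\|_{L^\infty} = 1 \quad \forall y \ne 0 .
\]
```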

12 Lecture of November 7, 2024 (3 hours)


We will now state and prove a “better” version of the density theorem, where
we show that $C_C^\infty$ is a dense subspace of Lp (Ω). This result can be proved
by applying a procedure called regularization by convolution: essentially, it is
the same trick we already used to prove the completeness of the trigonometric
system in L²(2π).
We begin with a regularity result:
LEMMA (Regularity of the convolution product): Let $u \in L^1_{loc}(\mathbb{R}^n)$,
$\phi \in C_C^1(\mathbb{R}^n)$. Then the function
$$ v(x) = \int_{\mathbb{R}^n} u(z) \phi(x - z)\,dz $$
is of class C¹ and
$$ \frac{\partial v}{\partial x_i}(x) = \int_{\mathbb{R}^n} u(z) \frac{\partial \phi}{\partial x_i}(x - z)\,dz. $$
By iterating this result, if $\phi \in C_C^\infty$, we get v ∈ C∞ .
PROOF: Since the integrand depends on x in a C¹ way, this is just a
theorem about differentiation under the integral sign. It is an easy enough
consequence of the dominated convergence theorem.
We begin by showing that v is continuous: let indeed x ∈ Rn , y ∈ Rn with
|y| ≤ 1. Let then K be a compact set containing the support of ϕ(x + y − ·)
for every y as above, M be the uniform norm of ϕ.
Then
$$ |v(x + y) - v(x)| \le \int_K |u(z)| \, |\phi(x + y - z) - \phi(x - z)|\,dz. $$
By continuity of ϕ, the integrand converges pointwise to 0 as y → 0. Convergence
is dominated by 2M |u|1K : we thus infer that v is continuous.
Next, we show that v is differentiable and that partial derivatives are as
in the statement: continuity of partial derivatives will then follow by repeating
the above argument with ϕ replaced by $\frac{\partial \phi}{\partial x_i}$. For fixed i = 1, . . . n and for all
0 < |h| < 1 we have
$$ \frac{v(x + h e_i) - v(x)}{h} = \int_K u(z) \frac{\phi(x + h e_i - z) - \phi(x - z)}{h}\,dz. $$
As h → 0, the integrand converges pointwise to $u(z) \frac{\partial \phi}{\partial x_i}(x - z)$ and convergence
is dominated by L|u(z)|1K , with L the Lipschitz constant of ϕ, as we
wanted. Q.E.D.
An Lp function is approximated by a sequence of regular functions, obtained
by computing the convolution product of the original function with
some $C_C^\infty$ maps called mollifiers.
THEOREM (Regularization by convolution): Let u ∈ Lp (Rn ) with 1 ≤ p <
+∞. Then there exists a sequence {uk } ⊂ C∞ (Rn ) such that uk → u in
Lp (Rn ).
PROOF: Let ϕ : Rn → R be a C∞ function such that ϕ(x) ≥ 0, ϕ(x) =
ϕ(−x) for every x and such that
$$ \mathrm{spt}\, \phi \subset B_1(0), \qquad \int_{\mathbb{R}^n} \phi(x)\,dx = 1. $$
Such a function is called a mollifier or bump function: for instance, we can
take $\phi(x) = c \exp\left( -\frac{1}{1 - |x|^2} \right)$ for |x| < 1, ϕ(x) = 0 for |x| ≥ 1, where c is a
positive constant chosen in such a way that the integral is 1.
We then define $\phi_k(x) = k^n \phi(kx)$: those functions share the main qualitative
properties of ϕ, but concentrate more and more around the origin,
because spt ϕk ⊂ B1/k (0).
Consider now the sequence of functions
$$ u_k(x) = \int_{\mathbb{R}^n} u(x - y) \phi_k(y)\,dy. $$
With a change of variables, the last expression can also be written
$u_k(x) = \int_{\mathbb{R}^n} u(z) \phi_k(x - z)\,dz$ and by the Lemma, we immediately see that
uk ∈ C∞ (Rn ): this process is called the regularization by convolution of u.
We show that uk → u in Lp : one has
$$ \|u_k - u\|^p_{L^p(\mathbb{R}^n)} = \int_{\mathbb{R}^n} \left| \int_{\mathbb{R}^n} (u(x - y) - u(x)) \phi_k(y)\,dy \right|^p dx. $$
In the inner integral, write $\phi_k(y) = \phi_k(y)^{1/p} \phi_k(y)^{1 - 1/p}$ and use Hölder
inequality: since ϕk has integral 1, the modulus of that integral is less than or equal to
$$ \left( \int_{\mathbb{R}^n} |u(x - y) - u(x)|^p \phi_k(y)\,dy \right)^{1/p}. $$
By replacing this upper bound in the above expression we thus obtain:
$$ \|u_k - u\|^p_{L^p(\mathbb{R}^n)} \le \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} |u(x - y) - u(x)|^p \phi_k(y)\,dy\,dx = \int_{B_{1/k}(0)} \phi_k(y) \int_{\mathbb{R}^n} |u(x - y) - u(x)|^p\,dx\,dy, $$
where we applied Fubini theorem and the properties of the support of ϕk .
Fix ε > 0: since in the double integral we have |y| ≤ 1/k, continuity of
translations in Lp ensures that for large enough k the inner integral is less
than ε: the whole expression is estimated with ε, and uk → u, as we wanted.
Q.E.D.
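
The regularization by convolution can be tested numerically in dimension n = 1. The Python sketch below uses the mollifier ϕ(x) = c exp(−1/(1 − x²)) mentioned in the proof, with all integrals approximated by a midpoint rule; the discontinuous function u, the interval of integration and the numbers of quadrature nodes are hypothetical choices of ours.

```python
import math

def integrate(f, a, b, n):
    """Midpoint rule on [a, b] with n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def bump(x):
    """Unnormalized mollifier exp(-1/(1 - x^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

C = 1.0 / integrate(bump, -1.0, 1.0, 2000)  # normalizing constant c (dimension 1)

def phi_k(k, y):
    """Rescaled mollifier k * phi(k * y), supported in [-1/k, 1/k]."""
    return k * C * bump(k * y)

def mollify(u, k, x, n=200):
    """u_k(x) = integral of u(x - y) * phi_k(y) over |y| < 1/k."""
    return integrate(lambda y: u(x - y) * phi_k(k, y), -1.0 / k, 1.0 / k, n)

def l2_error(u, k, a=-2.0, b=2.0, n=800):
    """Approximate L^2 norm of u_k - u, computed on [a, b]."""
    return math.sqrt(integrate(lambda x: (mollify(u, k, x) - u(x)) ** 2, a, b, n))

u = lambda x: 1.0 if abs(x) <= 0.5 else 0.0  # hypothetical discontinuous function
errors = {k: l2_error(u, k) for k in (2, 4, 8)}
```

For this step function uk differs from u only within distance 1/k from the two jumps, so the L² error decreases as k grows, roughly like 1/√k.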
REMARK: If Ω is open in Rn and u ∈ Lp (Ω) (1 ≤ p < +∞), then there is
a sequence $u_k \in C_C^\infty(\Omega)$ converging to u in Lp . Indeed, for every ε > 0 we
find $\tilde{u} \in C_C^0(\Omega)$ such that ∥u − ũ∥p < ε. We regularize ũ by convolution,
obtaining a sequence of functions ũk converging to ũ in Lp : for k large enough
we have ∥ũk − ũ∥p < ε, so that ∥ũk − u∥p < 2ε. Moreover, for k large enough
ũk is compactly supported in Ω: indeed, it is easy to check (by looking at
the definition of the regularized functions as convolution products) that the
support of ũk is contained in a neighborhood of radius 1/k of the support of
ũ.

We will now introduce some conditions, which ensure that a given measure
has regularity properties similar to those of Lebesgue measure: in particular,
we are interested in the possibility of approximating the measure of a set by
means of open and/or compact sets.
In the following, we will assume that X is a locally compact and separable
metric space.
DEFINITION: If X is as above, the Borel σ-algebra B is defined as the smallest
σ-algebra containing the open sets of X. An outer measure (resp. measure)
µ is said to be Borel if Borel sets are µ-measurable.
An outer measure (resp. measure) µ is Borel regular if every set (resp.
measurable set) A is contained in a Borel set B such that µ(A) = µ(B).
Finally, µ is a Radon measure if it is Borel-regular and µ(K) < +∞ for
every compact set K.

A Radon measure is regular in the same sense as the Lebesgue measure:
THEOREM (Approximation of the measure with open, closed, compact sets):
Let µ be a Borel regular outer measure on X. Suppose further there is a
sequence of open sets {Vj } such that $X = \bigcup_j V_j$ and µ(Vj ) < +∞ (a sort of
strengthened σ-finiteness). Then for every A ⊂ X we have

(∗) µ(A) = inf{µ(U ) : U open, U ⊃ A},


(∗∗) µ(A) = sup{µ(C) : C closed, C ⊂ A}.

If µ is just Borel (i.e. not Borel regular, same hypothesis on the sets Vj ),
the same relations hold for A a Borel set. If µ is a Radon measure, then the
above open sets Vj always exist. Moreover,

(∗ ∗ ∗) µ(A) = sup{µ(K) : K compact, K ⊂ A}.

PROOF (not seen in the lecture): We first show (*) for a Borel set A.
Suppose µ(X) < +∞: we will remove later this additional hypothesis.
Define A = {A Borel : (∗) holds}. We show that A is closed under countable

S
unions and intersections. Indeed, if A = An with An ∈ A, ther for every
n=1
n
S find open sets Un such that An ⊂ Un and µ(Un \ An ) < ε/2 .
ε > 0 we can
Then U = n=1 Un is an open set containing A and

!
[
µ(U \ A) ≤ µ (Un \ An ) < ε,
n=1

88

An , then B ⊂ ∞
T T
whence A ∈ A. If on the other hand B = n=1 Un and we
n=1
T∞
immediately check that µ( n=1 Un \ B) < ε. If we define VN = N
T
n=1 Un , we
have a decreasing sequence ofTopen sets with finite measure, all containing
B, such that µ(VN \ B) → µ( ∞ n=1 Un \ B). So for large enough N we have
µ(VN \ B) < ε and B ∈ A.
Obviously, the family $\mathcal{A}$ contains all open sets in X. As it is closed under
countable intersections, it also contains the closed sets: a closed set C in a
metric space can be expressed as a countable intersection of open sets by

$$C = \bigcap_{n=1}^{\infty} \{x \in X : \operatorname{dist}(x, C) < 1/n\}.$$

Define now $\mathcal{A}' = \{A \in \mathcal{A} : A^C \in \mathcal{A}\}$: this is clearly a σ-algebra, and it
contains the open sets. So $\mathcal{A}'$ is the Borel σ-algebra.
In case µ(X) = +∞, we use the open sets $V_j$ in the hypothesis: given a
Borel set $A$, we apply our previous result to the finite measures $\mu|_{V_j}$ (defined
by $\mu|_{V_j}(A) = \mu(A \cap V_j)$) and for every ε > 0 we find open sets $U_j$ such that
$U_j \cap V_j \supset A \cap V_j$ and $\mu(U_j \cap V_j) < \mu(A \cap V_j) + \varepsilon/2^j$. We immediately see that
the open set $U = \bigcup_{j=1}^{\infty} (U_j \cap V_j)$ contains $A$ and approximates its measure
within ε.
So (*) holds for Borel sets if we just have a Borel measure. If the measure
is also Borel regular, then (*) holds for every set.
(**) follows immediately by taking the complements.
Finally, if µ(K) < +∞ for every compact K and we recall that X is
separable and locally compact, it is easy to see that the sequence Vj in the
statement exists: indeed, in a separable metric space the topology has a
countable basis. Thanks to the local compactness, this can be replaced by a
countable basis of relatively compact open sets.
(***) easily follows, because every closed set is the union of an increasing
sequence of compact sets (use in a suitable way the closures of the relatively
compact open sets in the argument above...) Q.E.D.
When we construct (outer) measures on a metric space, it is useful to
have a criterion ensuring that Borel sets are measurable:
THEOREM (Caratheodory criterion): Let µ be an outer measure on X such
that µ(A ∪ B) = µ(A) + µ(B) whenever dist(A, B) > 0. Then µ is a Borel
measure.
PROOF (skipped in class): It is enough to show that every closed set C is
measurable, i.e. for every T ⊂ X we have
µ(T ) ≥ µ(T \ C) + µ(T ∩ C).

Consider the closed sets Cj = {x ∈ X : dist(x, C) ≤ 1/j}: since T \ Cj
has a strictly positive distance from C, we get µ(T ) ≥ µ((T \Cj )∪(T ∩C)) =
µ(T \ Cj ) + µ(T ∩ C).
To conclude, we just have to show that µ(T \ Cj ) → µ(T \ C) as j → +∞.
On the other hand, if we define
$$R_k = \Big\{x \in T : \frac{1}{k+1} < \operatorname{dist}(x, C) \le \frac{1}{k}\Big\}$$
then $T \setminus C = (T \setminus C_j) \cup \big(\bigcup_{k=j}^{\infty} R_k\big)$ and we can conclude thanks to countable
subadditivity, provided we show that

$$\lim_{j \to +\infty} \sum_{k=j}^{\infty} \mu(R_k) = 0.$$

This is true because the series $\sum_{k=1}^{\infty} \mu(R_k)$ converges (the j-th remainder
of a convergent series goes to 0 as j → +∞). Consider indeed any finite
sum of even terms of the series: using the additivity of the measure on sets at a
positive distance from each other, and monotonicity, we get
$\sum_{k=1}^{N} \mu(R_{2k}) = \mu\big(\bigcup_{k=1}^{N} R_{2k}\big) \le \mu(T)$. A similar bound holds of course for any finite sum of
odd terms, so the partial sums of the series are bounded from above by 2µ(T )
and the series converges (notice indeed that if we have µ(T ) = +∞, we have
nothing to prove!). Q.E.D.

EXAMPLE: The previous theorem shows for instance that the Hausdorff
measures are Borel measures. We didn’t see the actual definition in class, so
the following notes on Hausdorff measures and the Hausdorff dimension are
only for interested students. . .
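The α-dimensional Hausdorff measure of A ⊂ Rn is defined as follows:

$$H^{\alpha}(A) = \lim_{\delta \to 0^+} H^{\alpha}_{\delta}(A),$$

where

$$H^{\alpha}_{\delta}(A) = c(\alpha)\, \inf\Big\{ \sum_{k=1}^{\infty} (\operatorname{diam}(A_k))^{\alpha} : A_k \text{ closed},\ \operatorname{diam}(A_k) \le \delta,\ A \subset \bigcup_{k=1}^{\infty} A_k \Big\}.$$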

Here, c(α) is a renormalization constant which, for integer α, gives the
Lebesgue measure of the α-dimensional ball of diameter 1. This function
is extended to non-integer values of α by using Euler's Γ function [15].
It is easy to check that $H^1(C)$ gives the correct length of a rectifiable
curve C, while $H^2(S)$ gives the area of a regular surface S. Moreover, one
can show (but it is not easy!) that $H^3$ in $\mathbb{R}^3$ coincides with the 3-dimensional
Lebesgue measure (in $\mathbb{R}^n$, $H^n$ coincides with the n-dimensional Lebesgue measure).

[15] Precisely: $c(\alpha) = \dfrac{\Gamma(1/2)^{\alpha}}{\Gamma(\alpha/2+1)\, 2^{\alpha}}$, where Euler's Γ function is defined by
$\Gamma(t) = \int_0^{+\infty} s^{t-1} e^{-s}\, ds$.
The Hausdorff measure makes sense for every real value of α, and appears
for instance in the definition of the Hausdorff dimension of a set:

dimH (A) = inf{α > 0 : Hα (A) = 0} = sup{α > 0 : Hα (A) = +∞}.

Typically, fractal sets have non-integer Hausdorff dimension!
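
The following is a minimal numerical sketch of the last remark, not part of the lecture. For the middle-thirds Cantor set, the natural covering at stage k consists of 2^k closed intervals of length 3^(-k), so the sum appearing in the definition of the Hausdorff pre-measure (with δ = 3^(-k)) is at most 2^k 3^(-kα). The helper name `cantor_cover_sum` is hypothetical and the constant c(α) is dropped. The computation only bounds the measure from above: that the critical exponent log 2 / log 3 really is the Hausdorff dimension also requires a lower bound over all coverings, which is not checked here.

```python
import math

def cantor_cover_sum(alpha, k):
    """Sum of diam**alpha over the 2**k stage-k intervals, each of length 3**-k."""
    return (2 ** k) * (3.0 ** (-k * alpha))

# Exponent at which the covering sums neither blow up nor vanish.
critical = math.log(2) / math.log(3)

for alpha in (0.5, critical, 0.8):
    sums = [cantor_cover_sum(alpha, k) for k in (1, 10, 20)]
    print(f"alpha = {alpha:.4f}: {sums}")
```

Below the critical exponent the sums grow with k, above it they decay to 0, and at the critical exponent they stay equal to 1, which is consistent with a non-integer dimension for the Cantor set.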

REMARK: We remark that the proof of the Lusin theorem and the density
of continuous functions in Lp (Rn ) depend essentially on the possibility of
approximating the measure of a given set with open and compact sets. We
just saw that this is true also for Radon measures on a locally compact and
separable metric space: checking that continuous and compactly supported
functions are dense in Lp (µ) is now a lengthy but easy exercise!

Let µ be a measure on a set X, whose σ-algebra of measurable sets is
S. We can build a number of new measures as follows: given a measurable
u : X → [0, +∞], we define a measure ν on the σ-algebra S by

$$(****)\qquad \nu(A) = \int_A u(x)\, d\mu(x) \qquad \forall A \in S.$$

It is easy to verify that ν is a positive measure. Moreover, it is clear
that ν(A) = 0 whenever µ(A) = 0: we express this fact by saying that ν is
absolutely continuous with respect to µ.
DEFINITION: Given two measures µ, ν on the same σ-algebra S, we say
that ν is absolutely continuous with respect to µ (and we write ν << µ) if
A ∈ S, µ(A) = 0 implies ν(A) = 0.
If µ is a finite measure, then all measures which are absolutely continuous
with respect to µ can be written as in (****):
THEOREM (Radon-Nikodym): Let µ be a finite measure on X (i.e. µ(X) <
+∞), ν another finite measure defined on the same σ-algebra S and such
that ν << µ. Then there exists a function $w \in L^1(\mu)$, $w \ge 0$, such that

$$\nu(A) = \int_A w(x)\, d\mu(x) \qquad \forall A \in S.$$

PROOF of the Radon-Nikodym Theorem: Consider the measure ρ = µ + ν.
Observe that, by the absolute continuity of ν w.r.t. µ, two functions which
are a.e. equal with respect to µ or ρ are a.e. equal with respect to ν.
If $u \in L^1(\rho)$ define $T(u) := \int_X u\, d\nu$. This is a linear functional: moreover,
for $u \in L^2(\rho)$ (a subspace of $L^1(\rho)$, because ρ is finite), by the Cauchy-Schwarz inequality we get

$$|T(u)| \le \int_X |u|\, d\nu \le \int_X |u|\, d\rho \le \|u\|_{L^2(\rho)}\, \rho(X)^{1/2}.$$

This means that $T \in (L^2(\rho))'$: by the Riesz representation theorem (in a Hilbert
space), there exists a unique function $v \in L^2(\rho)$ such that

$$(I)\qquad \int_X u\, d\nu = \int_X vu\, d\rho \qquad \forall u \in L^2(\rho).$$

We would like to write

$$\int_X (1 - v)u\, d\nu = \int_X uv\, d\mu \qquad \forall u \in L^2(\rho)$$

and to choose $u = 1_E \dfrac{1}{1-v}$: if we knew that this function belongs to $L^2(\rho)$,
we would have our thesis with $w = v/(1 - v)$. But in general this is not
true... and we also risk dividing by 0: we need a more solid argument!
By applying (I) to the function $u = 1_E$, with E measurable, we obtain
$\nu(E) = \int_E v\, d\rho$. Since $0 \le \nu(E) \le \rho(E)$, we also get

$$(II)\qquad 0 \le \frac{1}{\rho(E)} \int_E v\, d\rho \le 1 \qquad \forall E \in S,\ \rho(E) > 0.$$

From this it follows that 0 ≤ v(x) ≤ 1 for ρ-almost every x ∈ X. Indeed, if
$E_n = \{x \in X : v(x) \ge 1 + 1/n\}$ had positive ρ-measure, the central term
in the previous formula (with $E = E_n$) would be strictly greater than 1. Then
$\rho(\{x \in X : v(x) > 1\}) = \rho\big(\bigcup_{n=1}^{\infty} E_n\big) = 0$. With a similar argument, we can show that v
cannot be strictly negative on a set with positive measure.
(I) then becomes (recall that ρ = µ + ν)

$$(III)\qquad \int_X (1 - v)u\, d\nu = \int_X uv\, d\mu \qquad \forall u \in L^2(\rho).$$

If $A = \{x \in X : v(x) = 1\}$, (III) with $u = 1_A$ gives µ(A) = 0, whence
ν(A) = 0. Outside this set of measure zero, the function 1/(1 − v) is well
defined... but not necessarily in $L^2(\rho)$.
Fix now a set $E \in S$: for every $n \in \mathbb{N}$ the functions
$v_n(x) = (1 + v(x) + v^2(x) + \dots + v^n(x))\,1_E(x)$ are bounded and so belong to $L^2(\rho)$. Plugging these in (III) we get

$$\int_E (1 - v^{n+1}(x))\, d\nu(x) = \int_E (v(x) + v^2(x) + \dots + v^{n+1}(x))\, d\mu(x).$$

The left hand side converges to ν(E) by the monotone convergence theorem
(the integrands grow to 1 for ν-a.e. x ∈ X, because the set where v = 1 is ν-negligible). The integrands in the r.h.s. grow
to $w(x) = \dfrac{v(x)}{1 - v(x)}$, and by Beppo Levi the integrals converge to $\int_E w(x)\, d\mu(x)$.
So we have $\nu(E) = \int_E w(x)\, d\mu(x)$. Summability of w comes from the fact
that ν is a finite measure. Q.E.D.
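
As a sanity check of the construction used in the proof, here is a small sketch on a finite set, where every measure is a list of point weights and every integral is a finite sum. The weights and the helper names `density_via_rho` and `integrate` are hypothetical illustration data, not taken from the lecture. The code follows the proof literally: it computes v as the density of ν with respect to ρ = µ + ν and then sets w = v/(1 − v).

```python
from fractions import Fraction

# Hypothetical example: weights of two finite measures on the points 0..3.
# nu vanishes wherever mu does, so nu << mu.
mu = [Fraction(1, 2), Fraction(0), Fraction(2), Fraction(1, 3)]
nu = [Fraction(3, 2), Fraction(0), Fraction(1), Fraction(0)]

def density_via_rho(mu, nu):
    """Follow the proof: v = d(nu)/d(rho) with rho = mu + nu, then w = v / (1 - v)."""
    w = []
    for m, n in zip(mu, nu):
        rho = m + n
        if rho == 0:
            w.append(Fraction(0))   # rho-null point: the value of w is irrelevant
            continue
        v = n / rho                 # 0 <= v < 1, since rho > 0 and nu << mu force m > 0
        w.append(v / (1 - v))
    return w

def integrate(w, mu, subset):
    """Integral of w over `subset` with respect to mu (a finite sum)."""
    return sum(w[i] * mu[i] for i in subset)

w = density_via_rho(mu, nu)
print(w)
```

On a finite set the density is of course just the ratio of the weights wherever µ is positive; the point of the sketch is only that the detour through ρ reproduces it.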

13 Lecture of November 13, 2024 (3 hours)


EXERCISE: Show that the Radon-Nikodym theorem is still true if µ and
ν are σ-finite measures. In that case, the function w in the thesis is not
necessarily summable. Show then that the theorem is false if the measures
are not σ-finite: take for instance ν the Lebesgue measure on R, µ the
counting measure: we have ν << µ, but the function w in the statement of
the Radon-Nikodym theorem cannot exist.
We will now introduce signed measures.
DEFINITION (Signed measure): Let S be a σ-algebra of subsets of X. A
(finite) signed measure is a function µ : S → R such that µ(∅) = 0 and

$$\mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu(A_n)$$

whenever $A_n \in S$ for every n and $A_m \cap A_n = \emptyset$ for
m ≠ n (countable additivity).
Notice that the requirement that the measure be countably additive is
actually very strong: if we change the order of the sets An , their union does
not change. This implies that the series in the r.h.s. must be absolutely
convergent.
Obviously, a signed measure does not enjoy the monotonicity property:
in general A ⊂ B does not imply µ(A) ≤ µ(B). Nevertheless, it is easy to
verify that the usual properties of continuity of the measure on increasing
and decreasing sequences of sets still hold.
DEFINITION (Positive and negative sets): Given a signed measure µ, a
measurable set P ∈ S is called positive if µ(E) ≥ 0 for every E ∈ S, E ⊂ P .
Likewise, a measurable set is called negative if every measurable subset has
measure ≤ 0. A null set is a measurable set whose measurable subsets all have
measure 0: it is both a positive and a negative set.

THEOREM (Hahn decomposition of a signed measure): Let µ be a signed
measure on X (whose σ-algebra of measurable sets is S). Then there exist
a positive set P ∈ S and a negative set N ∈ S such that P ∩ N = ∅ and
P ∪ N = X. Such a decomposition of X is called a Hahn decomposition: it
is unique up to null sets.

We will prove this theorem in a few moments.
Meanwhile, let us now derive one of the most important consequences of
Hahn decomposition! Every signed measure is the difference of two finite,
positive measures supported in disjoint sets:

DEFINITION (Positive, negative, total variation of a signed measure, Jordan
decomposition): Let µ be a signed measure on X, X = P ∪ N a Hahn
decomposition for µ. For every E ∈ S define

$\mu^+(E) = \mu(E \cap P)$ (Positive variation of µ),
$\mu^-(E) = -\mu(E \cap N)$ (Negative variation of µ),
$|\mu|(E) = \mu^+(E) + \mu^-(E)$ (Total variation of µ).

These are obviously positive measures, and $\mu = \mu^+ - \mu^-$ (Jordan decomposition
of the measure µ).

The following is a simple characterization of the total variation measure:
it is precisely the smallest positive measure which is greater than or equal to the
modulus of µ.
PROPOSITION: If µ is a signed measure, then for every A ∈ S one has

$$|\mu|(A) = \sup\Big\{ \sum_{n=1}^{\infty} |\mu(E_n)| : E_n \in S,\ A = \bigcup_{n=1}^{\infty} E_n,\ E_n \cap E_m = \emptyset \text{ for } m \ne n \Big\}.$$

PROOF (skipped in class): Let $A$, $E_n$ be as above, X = P ∪ N a Hahn
decomposition for µ, $\mu^+$ and $\mu^-$ its variations. Then

$$\sum_{n=1}^{\infty} |\mu(E_n)| = \sum_{n=1}^{\infty} |\mu^+(E_n) - \mu^-(E_n)| \le \sum_{n=1}^{\infty} (\mu^+(E_n) + \mu^-(E_n)) = \sum_{n=1}^{\infty} |\mu|(E_n) = |\mu|(A).$$

The supremum is actually a maximum: it suffices to decompose A in the two
sets A ∩ P and A ∩ N. Q.E.D.
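
The proposition can be checked by brute force on a finite set, where a signed measure is just a list of point weights. The sketch below uses hypothetical weights and helper names (`mu`, `partitions`, `total_variation`); on a finite set a Hahn decomposition is obtained by collecting the points of non-negative weight, which is an assumption specific to this toy setting.

```python
from fractions import Fraction

# Hypothetical signed measure on X = {0, 1, 2, 3}, given by point weights.
weights = [Fraction(2), Fraction(-3), Fraction(1, 2), Fraction(-1)]

def mu(E):
    return sum(weights[i] for i in E)

# Hahn decomposition: on a finite set we may take P = points of weight >= 0.
P = {i for i, w in enumerate(weights) if w >= 0}
N = set(range(len(weights))) - P

def mu_plus(E):
    return mu(set(E) & P)

def mu_minus(E):
    return -mu(set(E) & N)

def total_variation(E):
    return mu_plus(E) + mu_minus(E)

def partitions(items):
    """Yield every partition of the list `items` into nonempty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        yield [[first]] + part                      # `first` in a block of its own
        for k in range(len(part)):                  # or added to an existing block
            yield part[:k] + [[first] + part[k]] + part[k + 1:]

A = [0, 1, 2, 3]
best = max(sum(abs(mu(block)) for block in part) for part in partitions(A))
print(total_variation(A), best)
```

The maximum over all partitions agrees with the total variation, and it is already attained by the two-block partition into A ∩ P and A ∩ N, as stated in the proof.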
Let us prove the Hahn decomposition theorem for a signed measure.
Uniqueness of the Hahn decomposition up to null sets is obvious... existence
is much less so! We proceed in several steps.
CLAIM I: for every fixed measurable set M we have sup{µ(E) : E ∈
S, E ⊂ M } < +∞
To prove this, suppose by contradiction we have sup{µ(E) : E ∈ S, E ⊂
M } = +∞: we show that there are two disjoint measurable sets A and B
such that A ∪ B = M , |µ(A)| ≥ 1 and sup{µ(E) : E ∈ S, E ⊂ B} = +∞.
Indeed, thanks to our hypothesis that sup{µ(E) : E ∈ S, E ⊂ M } = +∞
we can choose a measurable set B such that µ(B) > 1 + |µ(M )|, and we set
A = M \ B. Then µ(M ) = µ(A) + µ(B) > µ(A) + 1 + |µ(M )|, whence
µ(A) < −1: both A and B have a measure whose modulus is bigger than 1.
Now,

sup{µ(E) : E ∈ S, E ⊂ B},
sup{µ(E) : E ∈ S, E ⊂ A}

are certainly not both finite, otherwise the same would be true for the same
sup made over all measurable subsets of M : our claim is proved by inter-
changing the roles of A and B if necessary.
The same procedure is then applied to B, which can be decomposed in
two sets with similar properties: iterating this step (starting from $B_0 = M$), we are able to construct
two sequences of measurable sets $A_n$, $B_n$ such that $A_n \cap B_n = \emptyset$, $A_n \cup B_n = B_{n-1}$,
$|\mu(A_n)| \ge 1$ and $\sup\{\mu(E) : E \in S, E \subset B_n\} = +\infty$. In particular,
the sets $A_n$ are pairwise disjoint and the absolute value of their measure is ≥ 1. By
countable additivity we have

$$\sum_{n=1}^{\infty} \mu(A_n) = \mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) \in \mathbb{R},$$

a contradiction because the terms of the series do not converge to 0!

This step is by far the most delicate in the proof of the theorem: we will
now be able to obtain our statement pretty quickly.
CLAIM II: for every A ∈ S and every ε > 0 we can find B ∈ S, B ⊂ A
such that µ(B) ≥ µ(A) and µ(E) > −ε for every E ⊂ B, E ∈ S.
Basically, we claim we can find an “almost positive” subset of A, whose
measure is ≥ µ(A). . .
Let indeed c = sup{µ(C) : C ⊂ A, C ∈ S}: obviously µ(A) ≤ c < +∞
(by Claim I), and so we can find a measurable subset B ⊂ A such that

µ(B) ≥ max{µ(A), c − ε/2}.

This set has the required properties: if we had E ⊂ B with µ(E) ≤ −ε, then
µ(B \ E) = µ(B) − µ(E) ≥ c + ε/2, against the definition of supremum.
In our third step our “almost positive” set becomes positive:
CLAIM III: if A ∈ S, there exists a positive set B ⊂ A such that µ(B) ≥
µ(A).

Apply indeed Claim II repeatedly, with ε = 1/n at the n-th step: we find a decreasing sequence of
measurable sets $A \supset A_1 \supset A_2 \supset A_3 \supset \dots$ such that $\mu(A_n) \ge \mu(A)$ and
$\mu(E) > -1/n$ for every measurable E, $E \subset A_n$.
Define $B = \bigcap_{n=1}^{\infty} A_n$. Then B ⊂ A and, by continuity of the measure on
decreasing sequences, µ(B) ≥ µ(A). Moreover, B is a positive set: if E ⊂ B is measurable,
then E is also a subset of $A_n$ for every n and so µ(E) > −1/n for every n, i.e. µ(E) ≥ 0.

With Claim III, we are now able to construct our Hahn decomposition:
let s = sup{µ(A) : A ∈ S}. Choose a sequence $A_n$ of measurable sets such
that $\mu(A_n) \to s$: by Claim III we can replace each of the sets $A_n$ with a
positive subset $B_n$ such that $\mu(B_n) \ge \mu(A_n)$, so that $\mu(B_n) \to s$. We then
define

$$P_n = \bigcup_{k=1}^{n} B_k:$$

this is an increasing sequence of positive sets such that $\mu(P_n) \to s$. We then
put

$$P = \bigcup_{n=1}^{\infty} P_n, \qquad N = X \setminus P.$$

By construction, µ(P ) = s. P is also positive: if E ⊂ P , E ∈ S, then the
sets $E \cap P_n$ are an increasing sequence of sets of non-negative measure whose union is E, so
that µ(E) ≥ 0. Finally, N is negative: if we had E ⊂ N measurable such
that µ(E) > 0, then µ(P ∪ E) = µ(P ) + µ(E) > s, a contradiction with the
definition of s. Q.E.D.

The following is very easy... but also very useful!

EXERCISE (Radon-Nikodym theorem for signed measures): Let µ be a finite
positive measure on X, ν a signed measure such that ν << µ. Then there
exists a function $v \in L^1(\mu)$ such that

$$\nu(E) = \int_E v(x)\, d\mu(x) \qquad \forall E \in S.$$

(HINT: Let X = P ∪ N be a Hahn decomposition for ν. Apply the Radon-Nikodym
theorem to the positive measures $\nu^+$ and $\nu^-$, which are concentrated on disjoint
sets...)

By means of the Radon-Nikodym theorem for signed measures, we are
finally able to prove that the dual space of $L^p(\mu)$ is $L^q(\mu)$, for 1 ≤ p < +∞
and for a finite positive measure µ [16]:

[16] Having proved this, the theorem is easily extended to the case where the measures are
σ-finite.

THEOREM: Let µ be a finite positive measure on X, 1 ≤ p < +∞. Then
for every $T \in (L^p(\mu))'$ there exists a unique function $v \in L^q(\mu)$ (with q the
conjugate exponent of p) such that

$$T(u) = \int_X u(x)v(x)\, d\mu(x) \qquad \forall u \in L^p(\mu).$$

Moreover, ∥T ∥ = ∥v∥Lq .
PROOF: We already proved that the map $\Phi : L^q \to (L^p)'$ sending every
$v \in L^q(\mu)$ into the functional $T_v : u \mapsto \int_X uv\, d\mu$ is a linear isometry. We
only have to show that Φ is surjective.
Let then T ∈ (Lp (µ))′ . Define ν(E) = T (1E ) (notice that 1E ∈ Lp
because µ is finite): we claim that ν is a signed measure on X, ν << µ.
Indeed, we obviously have ν(E) = 0 whenever µ(E) = 0. Moreover,
if A and B are measurable and disjoint, then 1A∪B = 1A + 1B whence
ν(A ∪ B) = ν(A) + ν(B) by the linearity of the functional.
Let's verify that ν is countably additive: let $A = \bigcup_{n=1}^{\infty} A_n$, where the sets $A_n$ are
measurable and pairwise disjoint. One immediately checks that

$$1_A(x) = \sum_{n=1}^{\infty} 1_{A_n}(x),$$

and the sequence of partial sums is dominated by $1_A$: since p < +∞, by dominated
convergence the above series converges in $L^p(\mu)$.
Then, by continuity of T we have $\nu(A) = \sum_{n=1}^{\infty} \nu(A_n)$ and ν is indeed a
signed measure.
The Radon-Nikodym theorem (for signed measures) gives us a function $v \in L^1(\mu)$ such that

$$(A)\qquad T(1_E) = \nu(E) = \int_E v(x)\, d\mu(x) \qquad \forall E \in S,$$

whence, by linearity,

$$(B)\qquad T(s) = \int_X s(x)v(x)\, d\mu(x) \qquad \forall s \text{ simple}.$$

We need to prove that $v \in L^q$: if this is true, we can replace the simple
function with any $u \in L^p$ because simple functions are dense in this space [17].
Indeed, it is enough to take a sequence $s_n$ of simple functions converging to u
in $L^p$: by the continuity of T we have $T(s_n) \to T(u)$, on the other hand by
Hölder we have $\int_X s_n(x)v(x)\, d\mu(x) \to \int_X u(x)v(x)\, d\mu(x)$.

[17] Every bounded function can be approximated uniformly with simple functions. Moreover,
every function $u \in L^p$ can be approximated in $L^p$ with a sequence of bounded functions
(take for instance $u_n(x) = \max\{-n, \min\{u(x), n\}\}$).
We now check that actually $v \in L^q(\mu)$, thus concluding the proof. Let us
begin with the case p = 1. We know from (A) that

$$\Big| \int_E v(x)\, d\mu(x) \Big| \le \|T\|\, \mu(E) \qquad \forall E \in S,$$

whence $\mu(\{x : v(x) > \|T\| + 1/n\}) = 0$ for every n (otherwise the inequality
would fail), and similarly $\mu(\{x : v(x) < -\|T\| - 1/n\}) = 0$, whence $\|v\|_{\infty} \le \|T\|$.
In the case 1 < p < +∞, (B) holds for every $s \in L^{\infty}(\mu)$ (because,
as we said above, every bounded function can be approximated uniformly
with simple functions). For $n \in \mathbb{N}$ let $E_n = \{x \in X : |v(x)| \le n\}$ and
define $s_n(x) = 1_{E_n}(x)\,|v(x)|^{q-1}\operatorname{sgn}(v(x))$. These functions are in $L^{\infty}(\mu)$, and
$|v(x)|^q = |s_n(x)|^p$ on $E_n$. We then get from (B)

$$\int_{E_n} |v(x)|^q\, d\mu(x) = \int_X s_n(x)v(x)\, d\mu(x) = T(s_n) \le \|T\|\, \|s_n\|_{L^p} = \|T\| \Big( \int_{E_n} |v(x)|^q\, d\mu(x) \Big)^{1/p},$$

i.e.

$$\Big( \int_{E_n} |v(x)|^q\, d\mu(x) \Big)^{1/q} \le \|T\|.$$

Passing to the limit as n → +∞ and using the monotone convergence theorem
we get $\|v\|_{L^q} \le \|T\|$. Q.E.D.

14 Lecture of November 14, 2024 (3 hours)


We now prove a small but important result we will need in the following:

LEMMA (fundamental lemma of the Calculus of Variations): Let $w \in L^1([a, b])$
be such that $\int_a^b w\phi\, dx = 0$ for every $\phi \in C_c^1((a, b))$. Then w = 0 a.e.
PROOF: Approximate the function sgn w(x) with a sequence $\phi_n$ of functions
in $C_c^1$, taking values in the interval [−1, 1] and converging a.e. By the
dominated convergence theorem we get then

$$0 = \int_a^b w\phi_n\, dx \to \int_a^b |w(x)|\, dx,$$

whence w = 0 a.e. Q.E.D.
To study problems involving differential equations (both O.D.E.s and
P.D.E.s), we need spaces of functions which are differentiable (in some ap-
propriate sense), and which have good compactness properties.
The Sobolev spaces W 1,p ([a, b]) are a family of spaces modelled on Lp
which fulfill perfectly both requirements. Before we give the definition, we
need the following important notion:
DEFINITION (Weak derivative): Let $u \in L^1([a, b])$. A function $v \in L^1([a, b])$
is a weak derivative of u if

$$\int_a^b u(x)\phi'(x)\, dx = -\int_a^b v(x)\phi(x)\, dx \qquad \forall \phi \in C_0^1([a, b]).$$

Notice that if $u \in C^1$, then its derivative is also a weak derivative: this is
just the integration by parts formula!
Conversely, we will see that if u has weak derivative v, and both u and v are continuous functions, then u is
differentiable with derivative v (this is the classical du Bois-Reymond lemma
in the Calculus of Variations).
Moreover, it is easy to see that the weak derivative is unique whenever it
exists: if v and ṽ are weak derivatives of u, it follows from the definition that
w = v − ṽ satisfies the hypothesis of the fundamental lemma of the calculus
of variations. It follows that v = ṽ a.e.
By the uniqueness of weak derivative, we are allowed to denote it by u′ .
DEFINITION (Sobolev spaces W 1,p ): If 1 ≤ p ≤ +∞ we define the Sobolev
space

W 1,p ([a, b]) = {u ∈ Lp ([a, b]) : there exists u′ ∈ Lp ([a, b]) weak derivative of u}.

On the space $W^{1,p}$ one usually puts one of the following two equivalent
norms:

$$\|u\|_{W^{1,p}} = \|u\|_{L^p} + \|u'\|_{L^p} \qquad \text{or} \qquad \|u\|_{W^{1,p}} = \big(\|u\|_{L^p}^p + \|u'\|_{L^p}^p\big)^{1/p}.$$

We will use indifferently the first or the second. The second is more appro-
priate in case p = 2, because it is induced by a scalar product, thus making
W 1,2 a Hilbert space.

REMARK: It is easy to verify that $W^{1,p}$ is a Banach space: if $\{u_k\} \subset W^{1,p}$
is a Cauchy sequence, then both $\{u_k\}$ and $\{u_k'\}$ are Cauchy sequences in $L^p$.
As $L^p$ is complete, we get u, v such that $u_k \to u$, $u_k' \to v$ in $L^p$.
We have to show that $u \in W^{1,p}$ and $v = u'$. Indeed, by definition of weak
derivative we have

$$\int_a^b u_k \phi'\, dx = -\int_a^b u_k' \phi\, dx \qquad \forall \phi \in C_0^1.$$

Passing to the limit as k → +∞ we get

$$\int_a^b u \phi'\, dx = -\int_a^b v \phi\, dx,$$

as we wanted. Notice that the same argument works also if we only have
$u_k \rightharpoonup u$, $u_k' \rightharpoonup v$ weakly in $L^p$: we will use this remark later on.

As we will see later, Sobolev spaces can be defined also in dimension
higher than one.
But in dimension 1 Sobolev functions are much better than in higher
dimension: they are continuous (after possibly changing them on a set of
measure 0) and their weak derivative coincides almost everywhere with their
classical derivative.
We need the following definition:
DEFINITION (Absolutely continuous functions): The space of absolutely
continuous functions is the set of all primitives of $L^1$ functions:

$$AC([a, b]) = \Big\{u : [a, b] \to \mathbb{R} : \exists v \in L^1([a, b]) \text{ s.t. } u(x) = u(a) + \int_a^x v(t)\, dt \ \ \forall x \in [a, b]\Big\}.$$

A theorem in real analysis which is not simple (nor overly difficult, to be
fair...) tells us that AC functions are differentiable a.e., and $u'(x) = v(x)$
for a.e. x: so the fundamental theorem of calculus holds, in the sense that u
is a primitive of its derivative.
Moreover, the following ε-δ characterization of AC functions holds:
A function u : [a, b] → R is absolutely continuous if and only if for every
ε > 0 there exists δ > 0 such that, for every finite collection $[a_1, b_1]$,
$[a_2, b_2]$, ..., $[a_N, b_N]$ of pairwise disjoint subintervals of [a, b] with
$\sum_{i=1}^{N} (b_i - a_i) < \delta$, one has

$$\sum_{i=1}^{N} |u(b_i) - u(a_i)| < \varepsilon.$$

This characterization shows that AC functions satisfy a property which is
a stronger version of uniform continuity: an absolutely continuous function is
in particular uniformly continuous. Moreover, it is an easy exercise to check
from the characterization that the product of two AC functions is still in AC.
Due to time constraints, we will omit the proof of the characterization,
and also of the differentiability a.e. of AC functions.
REMARK: From the fact that u : [a, b] → R is continuous and differentiable
a.e. we cannot conclude that u ∈ AC([a, b]). A famous counterexample is
the so-called Cantor’s staircase, a function which is continuous and increas-
ing in the interval [0, 1], whose image is the whole interval [0, 1]... and whose
derivative is 0 almost everywhere. Obviously, such a function is not a primitive
of its derivative!
[Figure: graph of Cantor's staircase.]
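
To see concretely why Cantor's staircase fails the ε-δ characterization, one can evaluate it exactly at rational points and look at the 2^k intervals that survive after k steps of the Cantor construction. The sketch below is an illustration under stated assumptions: the function names (`cantor`, `stage_intervals`, `total_length`, `total_increment`) are hypothetical, and the staircase is computed with the standard ternary-digit rule (stop at the first digit 1, halve the digits 2), truncated after `depth` digits.

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor's staircase at a rational x in [0, 1], read off the ternary digits of x."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)              # next ternary digit (0, 1 or 2)
        x -= digit
        if digit == 1:              # closure of a removed middle third: constant value there
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value

def stage_intervals(k):
    """The 2**k closed intervals left after k steps of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        intervals = [piece
                     for a, b in intervals
                     for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return intervals

def total_length(k):
    return sum(b - a for a, b in stage_intervals(k))

def total_increment(k):
    return sum(abs(cantor(b) - cantor(a)) for a, b in stage_intervals(k))

for k in (1, 3, 5):
    print(k, total_length(k), total_increment(k))
```

The total length (2/3)^k of the intervals can be made smaller than any δ, while the sum of the increments of the function over them stays equal to 1, so no δ works for ε < 1.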

We will now prove that the space of absolutely continuous functions coincides
with the Sobolev space $W^{1,1}([a, b])$. Precisely, each absolutely continuous
function belongs to the Sobolev space and, conversely, given $u \in W^{1,1}$
there exists an absolutely continuous function which coincides with u almost
everywhere.
We need two lemmas:
LEMMA 1 (du Bois-Reymond): If the weak derivative of $u \in W^{1,1}([a, b])$ is
0, then u is a.e. equal to a constant.
PROOF: Let $\psi \in C^0([a, b])$: define $w(x) = \psi(x) - \dfrac{1}{b-a} \int_a^b \psi(t)\, dt$ and

$$\phi(x) = \int_a^x w(t)\, dt.$$

Then $\phi \in C_0^1([a, b])$ (notice that $\phi(a) = \phi(b) = 0$) and by our hypothesis on u we get

$$0 = \int_a^b u(x)\phi'(x)\, dx = \int_a^b \Big[u(x)\psi(x) - u(x)\, \frac{1}{b-a} \int_a^b \psi(s)\, ds\Big]\, dx = \int_a^b \Big[u(x) - \frac{1}{b-a} \int_a^b u(s)\, ds\Big]\psi(x)\, dx.$$

By the fundamental lemma of the Calculus of Variations, this implies

$$u(x) = \frac{1}{b-a} \int_a^b u(s)\, ds \qquad \text{for a.e. } x \in [a, b].$$

Q.E.D.
LEMMA 2: If u ∈ AC([a, b]), then u ∈ W 1,1 ([a, b]). Moreover, the pointwise
derivative of u (which is defined a.e.) is also the weak derivative of u.
PROOF: From the definition of AC we know that u and u′ are both in L1 .
Let then ϕ ∈ C01 ([a, b]): obviously, ϕ ∈ AC.
Then the product uϕ is also absolutely continuous and we have $(u\phi)' = u'\phi + u\phi'$ a.e.
By integrating (and recalling that ϕ vanishes at a and b) we get

$$0 = \int_a^b (u\phi)'\, dx = \int_a^b (u'\phi + u\phi')\, dx,$$

and the weak derivative of u is u′. Q.E.D.

We finally prove that W 1,1 is essentially the same as AC:

THEOREM: Let $u \in W^{1,1}([a, b])$. Then there exists $\tilde{u} \in AC([a, b])$ such that
$u(x) = \tilde{u}(x)$ for a.e. x. So, after possibly changing u in a set of measure 0,
the weak derivative of u coincides with its classical derivative [18].
PROOF: Define $w(x) = \int_a^x u'(t)\, dt$. This is an absolutely continuous function
which, by LEMMA 2, belongs to $W^{1,1}$ and whose weak derivative is u′.
Then the weak derivative of the function u − w is 0, whence, by LEMMA 1,
u(x) − w(x) = c a.e., with c a constant. We can then define $\tilde{u}(x) = c + w(x)$.
Q.E.D.

[18] Recall the statement of LEMMA 2.

The notion of Sobolev spaces is easily extended to functions defined on
$\mathbb{R}^n$: if $\Omega \subset \mathbb{R}^n$ is open, $u \in L^1(\Omega)$, we say that $v \in L^1(\Omega)$ is the weak
derivative of u with respect to $x_i$ if and only if

$$\int_{\Omega} u(x) \frac{\partial \phi(x)}{\partial x_i}\, dx = -\int_{\Omega} v(x)\phi(x)\, dx \qquad \forall \phi \in C_0^1(\Omega).$$

We denote the weak derivative (if any) by $\dfrac{\partial u}{\partial x_i}$ (there is uniqueness of the
weak derivative as in the 1-dimensional case). The Sobolev spaces are then
defined in the obvious way:

$$W^{1,p}(\Omega) = \Big\{u \in L^p(\Omega) : \exists \text{ weak derivatives } \frac{\partial u}{\partial x_i} \in L^p(\Omega),\ i = 1, \dots, n\Big\}.$$

These are Banach spaces with the norm [19]

$$\|u\|_{W^{1,p}(\Omega)} = \|u\|_{L^p} + \sum_{i=1}^{n} \Big\| \frac{\partial u}{\partial x_i} \Big\|_{L^p}.$$

REMARK: Let us go back to the 1-dimensional case. If 1 < p ≤ +∞,
the functions in $W^{1,p}$ are also in $W^{1,1}$. Then if $u \in W^{1,p}$ there is no loss
of generality in supposing that u is absolutely continuous: it is enough to
choose the appropriate element in the equivalence class of u in $L^p$. In this
sense, the space $W^{1,p}$ is the space of those AC functions whose derivatives
belong to $L^p$. These functions, besides being AC, are also Hölder continuous
with exponent 1 − 1/p: if x, y ∈ [a, b] we have by Hölder inequality

$$|u(x) - u(y)| = \Big| \int_x^y u'(t)\, dt \Big| \le \|u'\|_{L^p}\, |x - y|^{1 - 1/p}.$$

Functions in $W^{1,\infty}$ are Lipschitz continuous: if we always choose the AC
representative in each equivalence class, we can say that $W^{1,\infty}$ coincides with
the space of Lipschitz continuous functions. Indeed, every Lipschitz continuous
function is absolutely continuous, and if u has Lipschitz constant L,
then its incremental quotients are bounded by L... and so |u′(x)| ≤ L at all
differentiability points: the derivative of a Lipschitz continuous function is
in $L^{\infty}$.

The remark we just made is key for the following important compactness
result:
THEOREM (weak compactness in $W^{1,p}$): Let $\{u_n\} \subset W^{1,p}([a, b])$, 1 < p <
+∞ (and suppose we have chosen the AC representative of each $u_n$). If there
exists a constant C > 0 such that $\|u_n'\|_{L^p} \le C$ for every n, and one of the
two following conditions holds:

(i) $|u_n(a)| \le C$ for every n,

(ii) $\|u_n\|_{L^p} \le C$ for every n,

then there exists $u \in W^{1,p}$ and a subsequence $\{u_{n_k}\}$ such that $u_{n_k} \to u$
uniformly, $u_{n_k}' \rightharpoonup u'$ weakly in $L^p$ as k → +∞.

[19] Or with the equivalent norm

$$\|u\|_{W^{1,p}(\Omega)} = \Big( \|u\|_{L^p}^p + \sum_{i=1}^{n} \Big\| \frac{\partial u}{\partial x_i} \Big\|_{L^p}^p \Big)^{1/p},$$

which is a Hilbert norm for p = 2.
PROOF: By the previous remark and the equiboundedness of the derivatives in
$L^p$, all our functions satisfy the following Hölder continuity estimate:

$$(*)\qquad |u_n(x) - u_n(y)| \le C|x - y|^{1 - 1/p} \qquad \forall x, y \in [a, b].$$

In particular, the functions $u_n$ are equicontinuous.
Suppose now (i) holds: by using (∗) we have for every x and n

$$|u_n(x)| \le |u_n(a)| + |u_n(x) - u_n(a)| \le C + C(b - a)^{1 - 1/p}$$

and the functions $u_n$ are also equibounded.
By the Ascoli-Arzelà theorem, and weak compactness in the reflexive
space $L^p$, we find $u \in C^0$, $v \in L^p$ and a subsequence $\{u_{n_k}\}$ such that $u_{n_k} \to u$
uniformly, $u_{n_k}' \rightharpoonup v$ weakly in $L^p$. As we remarked earlier (in proving the
completeness of the spaces $W^{1,p}$), this implies that $u \in W^{1,p}$ and $v = u'$.
We still have to prove that the same holds when we replace (i) with
(ii). But we actually have (ii) ⇒ (i), possibly with a different constant: indeed, writing u for any of the
functions $u_n$, we have, for every x ∈ [a, b],
$u(a) = u(x) - \int_a^x u'(t)\, dt$, so that $|u(a)| \le |u(x)| + \int_a^b |u'(t)|\, dt$. Integrating
both sides over [a, b] and dividing by b − a we get:

$$|u(a)| \le \frac{1}{b-a} \int_a^b |u(x)|\, dx + \int_a^b |u'(x)|\, dx,$$

and we conclude by using Hölder's inequality. Q.E.D.
REMARK: A stronger result holds for p = +∞: if $\{u_n\}$ is bounded in $W^{1,\infty}$ we can use
the theorem for every finite p to find a convergent subsequence. Moreover,
this sequence is equi-Lipschitz (because derivatives are equibounded in $L^{\infty}$),
so the limit is also Lipschitz continuous. Derivatives converge weakly in $L^p$
for finite p, but also weakly* in $L^{\infty}$ [20].
On the other hand, the compactness theorem is false for p = 1: it is easy
to construct a sequence of functions which is bounded in the $W^{1,1}$ norm,
which converges to a discontinuous function: for instance, take the following
functions on [−1, 1]:

$$u_n(x) = \begin{cases} -1 & \text{if } -1 \le x \le -1/n, \\ nx & \text{if } -1/n < x < 1/n, \\ 1 & \text{if } 1/n \le x \le 1. \end{cases}$$

[20] Weak* convergence can be defined in a space which is the dual X′ of a Banach space
X: instead of testing the weak convergence of a sequence in X′ on every linear functional
S ∈ X′′, we only test on the elements of J(X). In other words, $T_k \rightharpoonup^* T$ weakly* in X′ if
and only if $T_k(x) \to T(x)$ for every x ∈ X: in particular, $u_n \rightharpoonup^* u$ weakly* in $L^{\infty}$ if and
only if $\int_a^b u_n v\, dx \to \int_a^b uv\, dx$ for every $v \in L^1$. There is also a compactness result for the
weak* convergence: if X is separable, then bounded sequences in X′ are weakly* compact.
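
For the piecewise linear functions $u_n$ on [−1, 1] of the counterexample above, the relevant norms can be written in closed form, and a short exact computation shows why p = 1 is special. The sketch below is only an illustration: the helper name `norms` is hypothetical and p is restricted to positive integers so that exact rational arithmetic can be used.

```python
from fractions import Fraction

def norms(n, p):
    """Exact integrals for u_n on [-1, 1] (u_n' = n on (-1/n, 1/n), 0 elsewhere).

    Returns (integral of |u_n|**p, integral of |u_n'|**p) for an integer p >= 1.
    """
    n = Fraction(n)
    # |u_n| = 1 on a set of length 2 - 2/n; on (-1/n, 1/n) we have |u_n| = n|x|,
    # and the integral of (n|x|)**p over (-1/n, 1/n) equals 2 / (n (p + 1)).
    int_u = (2 - 2 / n) + 2 / (n * (p + 1))
    int_du = 2 * n ** (p - 1)          # n**p times the length 2/n of the interval
    return int_u, int_du

for n in (10, 100, 1000):
    print(n, norms(n, 1), norms(n, 2))
```

For p = 1 both integrals stay bounded by 2 as n grows, so the sequence is bounded in $W^{1,1}$, while for p = 2 the integral of $|u_n'|^2$ equals 2n and blows up: the hypothesis of the compactness theorem fails for every p > 1, as it must, since the pointwise limit is discontinuous.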

15 Lecture of November 20, 2024 (3 hours)

EXAMPLE: The situation is not so good in dimension n. A function
$u \in W^{1,p}(\Omega)$, with Ω an open subset of $\mathbb{R}^n$, is in general not even continuous.
For instance, consider the following function on the unit open ball of $\mathbb{R}^2$:

$$u(x, y) = \frac{1}{(x^2 + y^2)^{1/4}}.$$

This function is clearly discontinuous at (0, 0) (it is unbounded near the origin), but we will see in a moment
that $u \in W^{1,p}(B_1((0, 0)))$ for 1 ≤ p < 4/3. By using slightly more sophisticated
examples, one can show that there are discontinuous Sobolev functions
in $W^{1,p}$ for every 1 ≤ p ≤ n, where n is the dimension of the ambient
euclidean space.
Passing to polar coordinates (ρ, θ) and integrating, we immediately see
that $u \in L^p(B_1(0))$ for p < 4. Moreover, we have

$$|\nabla u(x, y)| = \frac{1}{2}\rho^{-3/2}.$$

By integrating over the unit ball, one sees that this function is in $L^p(B_1(0))$
for p < 4/3.
To show that $u \in W^{1,p}$, we need to verify that the pointwise derivatives of
u are also its weak derivatives. But this is easily proved by approximating u
with the $C^1$ functions

$$u_n(x, y) = \frac{1}{(x^2 + y^2 + \frac{1}{n})^{1/4}}.$$

Indeed, this sequence of regular functions converges to u in the $W^{1,p}$ norm,
and so $u \in W^{1,p}$.
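
The thresholds in this example come from one-dimensional radial integrals, which can be evaluated in closed form on the annulus ε ≤ ρ ≤ 1 and then followed as ε → 0. The sketch below is a numerical illustration only; the helper names `truncated_radial_integral` and `grad_integral` are hypothetical.

```python
import math

def truncated_radial_integral(power, eps):
    """Integral of rho**power over eps <= rho <= 1, in closed form."""
    if power == -1:
        return -math.log(eps)
    return (1 - eps ** (power + 1)) / (power + 1)

def grad_integral(p, eps):
    """Integral of |grad u|**p over the annulus eps <= rho <= 1, with |grad u| = rho**(-3/2) / 2.

    In polar coordinates this is 2*pi * (1/2)**p * integral of rho**(1 - 3p/2) d(rho).
    """
    return 2 * math.pi * 0.5 ** p * truncated_radial_integral(1 - 1.5 * p, eps)

for p in (1.2, 1.5):
    print(p, [grad_integral(p, eps) for eps in (1e-6, 1e-12, 1e-24)])
```

For p = 1.2 < 4/3 the values stabilize as ε shrinks, while for p = 1.5 > 4/3 they grow without bound, in agreement with the condition p < 4/3 found above.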

Luckily for us, although Sobolev functions in Rn are not necessarily con-
tinuous, they enjoy some important properties that make things easier.
For instance, one can prove that smooth functions are dense in W 1,p :
THEOREM (Meyers-Serrin): for any domain Ω ⊂ Rn and for every 1 ≤
p < +∞, the space C ∞ (Ω) ∩ W 1,p (Ω) is dense in W 1,p (Ω) with respect to the
Sobolev norm.
We will omit the proof of this result, which is rather technical. In the
special case, Ω = Rn , or if u ∈ W 1,p (Ω) is compactly supported, the proof
is easy enough and is obtained simply by regularizing by convolution (but
we omitted even this simpler proof in class): suppose indeed u ∈ W 1,p (Rn ).

105
Let ϕk be the sequence of C ∞ mollifiers we used earlier (to show that C ∞
functions are dense in Lp ) and define
Z
uk (x) = u(x − y)ϕk (y) dy.
Rn

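We already know that $u_k \to u$ in $L^p$. Moreover

$$\begin{aligned}
\frac{\partial u_k}{\partial x_i}(x)
&= \frac{\partial}{\partial x_i} \int_{\mathbb{R}^n} \varphi_k(x - y)u(y)\,dy
 = \int_{\mathbb{R}^n} \frac{\partial \varphi_k}{\partial x_i}(x - y)u(y)\,dy \\
&= -\int_{\mathbb{R}^n} \frac{\partial}{\partial y_i}\big[\varphi_k(x - y)\big]u(y)\,dy
 = \int_{\mathbb{R}^n} \varphi_k(x - y)D_i u(y)\,dy
 = \int_{\mathbb{R}^n} D_i u(x - z)\varphi_k(z)\,dz,
\end{aligned}$$

where $D_i u$ is the weak derivative of $u$ in direction $x_i$ (and we used the fact
that $y \mapsto \varphi_k(x - y)$ is an acceptable test function in the definition
of weak derivatives).
In other words, the partial derivatives of $u_k$ are the regularization by
convolution of the (weak) partial derivatives of the Sobolev function $u$, and so
$u_k \to u$ in $W^{1,p}(\mathbb{R}^n)$.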
Notice that, in the general case, the theorem says that a function in
$W^{1,p}(\Omega)$ can be approximated in norm with functions in $C^\infty(\Omega)$,
which in general are not continuous up to the boundary.

Another important fact is that Sobolev functions typically have a higher
summability than $L^p$. To understand this, we begin by introducing some
important subspaces:
DEFINITION: Let $\Omega \subset \mathbb{R}^n$ be a domain, $1 \leq p \leq +\infty$.
The space $W_0^{1,p}(\Omega)$ is the closure in the $W^{1,p}$ norm of $C_c^1(\Omega)$.

Morally, $W_0^{1,p}$ is the subspace of those Sobolev functions which are "zero
at the boundary", but as a definition this would make no immediate sense,
since for regular domains $\Omega$, the boundary $\partial\Omega$ is a set of
measure 0. The definition we gave is a natural surrogate.
One can prove that $W_0^{1,p}(\mathbb{R}^n) = W^{1,p}(\mathbb{R}^n)$, but this is not
the case for bounded domains $\Omega$. Indeed, the following important result holds:
THEOREM (Sobolev embedding Theorem, first version): Suppose $1 \leq p < n$
(where $n$ is the dimension of the ambient space). There is a constant $C > 0$,
depending on $n$ and $p$ but not on $u$ nor $\Omega$, such that

$$(\ast\ast) \qquad \|u\|_{L^{p^*}(\Omega)} \leq C\|\nabla u\|_{L^p(\Omega)} \qquad \forall u \in W_0^{1,p}(\Omega),$$

where $p^* = \frac{np}{n-p}$ is called the Sobolev exponent.
This theorem is a sort of regularity theorem ensuring that functions in
$W_0^{1,p}$ have a higher summability than $L^p$ (indeed, $p^* > p$).
The Sobolev exponent may seem mysterious, but it is easy to realize that it
is the unique exponent for which inequality (∗∗) can be true for every
$u \in C_c^1(\mathbb{R}^n)$. Precisely, suppose we have

$$\|u\|_{L^q(\mathbb{R}^n)} \leq C\|\nabla u\|_{L^p(\mathbb{R}^n)} \qquad \forall u \in C_c^1(\mathbb{R}^n).$$

Then necessarily $q = p^*$.
To show this, fix $u \in C_c^1(\mathbb{R}^n)$, $u \neq 0$. The inequality must hold,
with the same constant, also for every function of the form $v(x) = u(rx)$ with
$r > 0$. A simple change of variables leads to

$$\|u\|_{L^q(\mathbb{R}^n)} \leq C r^{1 + \frac{n}{q} - \frac{n}{p}} \|\nabla u\|_{L^p(\mathbb{R}^n)}.$$

The exponent of $r$ in the last inequality must be 0: otherwise we have a
contradiction, because by letting $r \to 0$ or $r \to +\infty$, we obtain that the
right hand side converges to 0, while the left hand side is strictly positive. The
exponent is 0 exactly when $q = p^*$.
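
The change of variables behind this scaling argument can be spelled out as follows (a sketch, with the substitution $y = rx$, $dy = r^n\,dx$):

```latex
% v(x) = u(rx) with r > 0; substitute y = rx, so dx = r^{-n} dy and \nabla v(x) = r\,\nabla u(rx).
\begin{align*}
\|v\|_{L^q(\mathbb{R}^n)}
  &= \Big(\int_{\mathbb{R}^n} |u(rx)|^q\,dx\Big)^{1/q}
   = r^{-n/q}\,\|u\|_{L^q(\mathbb{R}^n)}, \\
\|\nabla v\|_{L^p(\mathbb{R}^n)}
  &= \Big(\int_{\mathbb{R}^n} r^p\,|\nabla u(rx)|^p\,dx\Big)^{1/p}
   = r^{1-n/p}\,\|\nabla u\|_{L^p(\mathbb{R}^n)}, \\
1 + \frac{n}{q} - \frac{n}{p} = 0
  &\iff \frac{1}{q} = \frac{1}{p} - \frac{1}{n}
   \iff q = \frac{np}{n-p} = p^*.
\end{align*}
```

Inserting the first two lines into $\|v\|_{L^q} \leq C\|\nabla v\|_{L^p}$ and multiplying by $r^{n/q}$ gives exactly the displayed inequality with the factor $r^{1 + n/q - n/p}$.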

Before we prove the theorem, let us see a very important application: we
will prove the existence of weak solutions to Poisson's equation.
Consider the following differential problem: let $\Omega \subset \mathbb{R}^n$ be a
bounded open set, $f$ be a given function. We are looking for a function
$u \in C^2(\Omega) \cap C^0(\overline{\Omega})$ satisfying

$$(\ast) \qquad \begin{cases} \Delta u(x) = f(x) & \forall x \in \Omega, \\ u(x) = 0 & \forall x \in \partial\Omega \end{cases}$$

(Poisson's equation with homogeneous Dirichlet boundary conditions).
If we multiply the equation by $\varphi \in C_0^1(\Omega)$ and we integrate by
parts, we get the integral identity

$$(\ast\ast) \qquad \int_\Omega \big(\nabla u(x) \cdot \nabla\varphi(x) + f(x)\varphi(x)\big)\,dx = 0.$$

It is easy to see that a function $u \in C^2(\Omega)$ satisfying (∗∗) for every
$\varphi \in C_0^1(\Omega)$ is a solution to Poisson's equation, but a priori the
integral identity (∗∗) makes sense for less regular functions, for instance if
$u \in W^{1,2}(\Omega)$:
DEFINITION (Weak solution): We say that $u \in W_0^{1,2}(\Omega)$ is a weak solution
of the differential problem (∗) if it satisfies (∗∗) for every $\varphi \in C_0^1(\Omega)$.

REMARK: The boundary condition is included in the choice of the space
$W_0^{1,2}(\Omega)$. Moreover, by definition of the latter space, for a weak solution
identity (∗∗) is satisfied for every $\varphi \in W_0^{1,2}(\Omega)$.

To prove the existence of weak solutions we will need the following
LEMMA (Poincaré's Inequality): If $\Omega$ is bounded, there is a constant $C > 0$
(depending on $p$ and $\Omega$) such that

$$\|u\|_{L^p(\Omega)} \leq C\|\nabla u\|_{L^p(\Omega)} \qquad \forall u \in W_0^{1,p}(\Omega).$$

PROOF: By Hölder's inequality we have

$$\|u\|_{L^p(\Omega)} \leq |\Omega|^{\frac{p^* - p}{p^* p}} \|u\|_{L^{p^*}(\Omega)}$$

and the thesis follows from Sobolev's embedding theorem. Q.E.D.
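
The Hölder step can be unpacked as follows. This is a sketch for $1 \leq p < n$ (the range in which the embedding theorem above applies), using the conjugate exponents $p^*/p$ and $p^*/(p^* - p)$:

```latex
% Hoelder with exponents p^*/p and p^*/(p^*-p), applied to |u|^p \cdot 1 on \Omega.
\begin{align*}
\int_\Omega |u|^p \cdot 1\,dx
  &\le \Big(\int_\Omega |u|^{p^*}\,dx\Big)^{p/p^*}\,|\Omega|^{1 - p/p^*}, \\
\|u\|_{L^p(\Omega)}
  &\le |\Omega|^{\frac{1}{p} - \frac{1}{p^*}}\,\|u\|_{L^{p^*}(\Omega)},
  \qquad \frac{1}{p} - \frac{1}{p^*} = \frac{p^* - p}{p^* p} = \frac{1}{n}.
\end{align*}
```

So in this range the Poincaré constant can be taken as $|\Omega|^{1/n}$ times the Sobolev constant.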


We are now in a position to prove the existence and uniqueness of weak
solutions:
THEOREM (Existence of weak solutions): If $f \in L^2(\Omega)$, then there exists a
unique weak solution $u \in W_0^{1,2}(\Omega)$ to the differential problem (∗).
PROOF: Define the following bilinear form on $W_0^{1,2}(\Omega)$:

$$((u, v)) = \int_\Omega \nabla u(x) \cdot \nabla v(x)\,dx.$$

By Poincaré's inequality, this is a scalar product which induces a norm
equivalent to the usual Sobolev norm: $W_0^{1,2}(\Omega)$ equipped with this scalar
product is still a Hilbert space.
Consider then the linear functional $F : W_0^{1,2}(\Omega) \to \mathbb{R}$,
$\varphi \mapsto -\int_\Omega f(x)\varphi(x)\,dx$, so that (∗∗) reads

$$(\ast\ast\ast) \qquad ((u, \varphi)) = F(\varphi) \qquad \forall \varphi \in W_0^{1,2}(\Omega).$$

We have $F \in (W_0^{1,2}(\Omega))'$: indeed, by the Cauchy-Schwarz inequality and
Poincaré's inequality,

$$|F(\varphi)| \leq \|f\|_{L^2(\Omega)} \cdot \|\varphi\|_{L^2(\Omega)} \leq C\|f\|_{L^2(\Omega)} \cdot \|\nabla\varphi\|_{L^2(\Omega)}.$$

Then by the Riesz representation theorem on the dual of Hilbert spaces, there
exists exactly one $u \in W_0^{1,2}(\Omega)$ such that (∗ ∗ ∗) holds. Q.E.D.
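
A standard by-product of this argument, sketched here as an illustration, is a bound on the weak solution in terms of the datum. Testing (∗ ∗ ∗) with $\varphi = u$ and using the estimate on $F$ (with $C$ the Poincaré constant) gives:

```latex
% Test the weak formulation with \varphi = u; C is the constant in Poincare's inequality.
\begin{align*}
\|\nabla u\|_{L^2(\Omega)}^2 = ((u, u)) = F(u)
  &\le C\,\|f\|_{L^2(\Omega)}\,\|\nabla u\|_{L^2(\Omega)}, \\
\text{hence} \qquad \|\nabla u\|_{L^2(\Omega)}
  &\le C\,\|f\|_{L^2(\Omega)}.
\end{align*}
```

In particular the weak solution depends continuously on $f$ in the norm induced by $((\cdot, \cdot))$.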

Having seen this important example, we are more prepared to suffer a
little bit in proving the Sobolev embedding theorem. We use the following
lemma:
LEMMA (Gagliardo): Let $f_1, \ldots, f_n : \mathbb{R}^{n-1} \to \mathbb{R}$ be non
negative measurable functions. Then the following inequality holds:

$$\int_{\mathbb{R}^n} \prod_{i=1}^{n} f_i(\hat{x}_i)\,dx \leq \prod_{i=1}^{n} \left( \int_{\mathbb{R}^{n-1}} f_i^{n-1}(\hat{x}_i)\,d\hat{x}_i \right)^{\frac{1}{n-1}}.$$

Here, given a point $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$,
$\hat{x}_i \in \mathbb{R}^{n-1}$ denotes the point obtained from $x$ by omitting the
$i$-th component.
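PROOF (Omitted in class): The proof goes by induction over $n$: for $n = 2$
it is Fubini's Theorem. The inductive step is a bit technical but not
difficult: one has to use Hölder's inequality and its straightforward
generalization saying that

$$\int f_1 \cdot f_2 \cdot \ldots \cdot f_n \,d\mu \leq \|f_1\|_{L^n} \cdot \|f_2\|_{L^n} \cdot \ldots \cdot \|f_n\|_{L^n}.$$

Suppose the inequality is true for $n$, let us prove it for $n + 1$:

$$\begin{aligned}
\int_{\mathbb{R}^{n+1}} \prod_{i=1}^{n+1} f_i(\hat{x}_i)\,dx
&= \int_{\mathbb{R}^n} f_{n+1}(\hat{x}_{n+1})\,d\hat{x}_{n+1} \int_{\mathbb{R}} \prod_{i=1}^{n} f_i(\hat{x}_i)\,dx_{n+1} \\
&\leq \int_{\mathbb{R}^n} f_{n+1}(\hat{x}_{n+1}) \prod_{i=1}^{n} \left( \int_{\mathbb{R}} f_i^n(\hat{x}_i)\,dx_{n+1} \right)^{1/n} d\hat{x}_{n+1} \\
&\leq \left( \int_{\mathbb{R}^n} f_{n+1}^n(\hat{x}_{n+1})\,d\hat{x}_{n+1} \right)^{1/n} \left( \int_{\mathbb{R}^n} \prod_{i=1}^{n} \left( \int_{\mathbb{R}} f_i^n(\hat{x}_i)\,dx_{n+1} \right)^{\frac{1}{n-1}} d\hat{x}_{n+1} \right)^{\frac{n-1}{n}} \\
&\leq \prod_{i=1}^{n+1} \left( \int_{\mathbb{R}^n} f_i^n(\hat{x}_i)\,d\hat{x}_i \right)^{1/n},
\end{aligned}$$

where the first inequality is the generalized Hölder inequality in the variable
$x_{n+1}$, the second is Hölder's inequality with exponents $n$ and $n/(n-1)$, and
the last inequality comes from the inductive hypothesis. Q.E.D.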
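
To make the notation concrete, here is what the lemma says in the two lowest dimensions (a sketch; the $n = 3$ case is the classical Loomis-Whitney inequality):

```latex
% n = 2: \hat{x}_1 = x_2 and \hat{x}_2 = x_1; this is Fubini's theorem, with equality.
\[
\int_{\mathbb{R}^2} f_1(x_2)\,f_2(x_1)\,dx_1\,dx_2
  = \int_{\mathbb{R}} f_1(x_2)\,dx_2 \,\int_{\mathbb{R}} f_2(x_1)\,dx_1 .
\]
% n = 3: \hat{x}_1 = (x_2,x_3), \hat{x}_2 = (x_1,x_3), \hat{x}_3 = (x_1,x_2); exponent n-1 = 2.
\[
\int_{\mathbb{R}^3} f_1(x_2,x_3)\,f_2(x_1,x_3)\,f_3(x_1,x_2)\,dx
  \le \|f_1\|_{L^2(\mathbb{R}^2)}\,\|f_2\|_{L^2(\mathbb{R}^2)}\,\|f_3\|_{L^2(\mathbb{R}^2)} .
\]
```

In each factor the function $f_i$ does not depend on $x_i$, which is exactly the structure exploited in the proof of the embedding theorem.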
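We can now prove the embedding theorem:
PROOF OF SOBOLEV'S EMBEDDING FOR $p < n$: By definition of the
space $W_0^{1,p}(\Omega)$, it is clearly enough to prove the result for functions
$u \in C_c^1(\mathbb{R}^n)$. We begin by proving the inequality for $p = 1$ (in which
case we have $1^* = \frac{n}{n-1}$).
Now, for every $x \in \mathbb{R}^n$ and $i = 1, \ldots, n$ we have

$$u(x) = \int_{-\infty}^{x_i} \frac{\partial u}{\partial x_i}\,dx_i.$$

Thus

$$|u(x)| \leq \int_{\mathbb{R}} \left| \frac{\partial u}{\partial x_i} \right| dx_i \leq \int_{\mathbb{R}} |\nabla u|\,dx_i,$$

and, multiplying these $n$ inequalities, taking the power $\frac{1}{n-1}$ and
integrating,

$$\int_{\mathbb{R}^n} |u(x)|^{\frac{n}{n-1}}\,dx \leq \int_{\mathbb{R}^n} \prod_{i=1}^{n} \left( \int_{\mathbb{R}} |\nabla u(x)|\,dx_i \right)^{\frac{1}{n-1}} dx.$$

In the product, each of the expressions in brackets is independent of $x_i$
thanks to the integration: we can apply Gagliardo's Lemma and obtain

$$\int_{\mathbb{R}^n} |u(x)|^{\frac{n}{n-1}}\,dx \leq \prod_{i=1}^{n} \left( \int_{\mathbb{R}^{n-1}} d\hat{x}_i \int_{\mathbb{R}} |\nabla u(x)|\,dx_i \right)^{\frac{1}{n-1}} = \left( \int_{\mathbb{R}^n} |\nabla u|\,dx \right)^{\frac{n}{n-1}},$$

which is the result for $p = 1$.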


We will see tomorrow how to treat the case p > 1.

16 Lecture of November 21, 2024 (3 hours)


Let us conclude the proof of the Sobolev embedding theorem for $1 < p < n$.
Apply the inequality with $p = 1$ to the function $v(x) = |u(x)|^{1+r}$, where
$r > 0$ is to be chosen later. We have $|\nabla v(x)| = (r + 1)|u(x)|^r |\nabla u(x)|$.
Then, by using Hölder's inequality we obtain:

$$\left( \int_{\mathbb{R}^n} |u(x)|^{\frac{(r+1)n}{n-1}}\,dx \right)^{\frac{n-1}{n}} \leq (r + 1) \int_{\mathbb{R}^n} |u(x)|^r |\nabla u(x)|\,dx \leq (r + 1)\|\nabla u\|_{L^p} \left( \int_{\mathbb{R}^n} |u(x)|^{\frac{pr}{p-1}}\,dx \right)^{\frac{p-1}{p}}.$$

We can now choose $r$ in such a way that the exponents of $|u(x)|$ on both
sides of the inequality are equal: with easy computations, we find Sobolev's
inequality. Q.E.D.
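
The "easy computations" can be sketched as follows, for $u \neq 0$ and using the inequality for $p = 1$ in the form proved above (with constant 1). The explicit constant at the end is what this particular argument yields; it is not claimed to be optimal.

```latex
% Choose r so that both powers of |u| coincide, then divide by the integral on the right.
\begin{align*}
\frac{(r+1)n}{n-1} = \frac{pr}{p-1}
  &\iff r = \frac{n(p-1)}{n-p}
  \quad\Longrightarrow\quad \frac{pr}{p-1} = \frac{np}{n-p} = p^*, \\
\frac{n-1}{n} - \frac{p-1}{p}
  &= \frac{n-p}{np} = \frac{1}{p^*}, \\
\|u\|_{L^{p^*}(\mathbb{R}^n)}
  &\le (r+1)\,\|\nabla u\|_{L^p(\mathbb{R}^n)}
   = \frac{p(n-1)}{n-p}\,\|\nabla u\|_{L^p(\mathbb{R}^n)} .
\end{align*}
```

Note that $r > 0$ precisely because $1 < p < n$, and that the constant blows up as $p \to n$.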
If $p = n$ and the domain $\Omega$ is bounded, we can apply the theorem to all
smaller exponents and we find that a function $u \in W_0^{1,p}(\Omega)$ belongs to
$L^q(\Omega)$ for every $q < +\infty$. There are examples showing that such a
function is not necessarily in $L^\infty$.
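
One standard example of this kind is sketched below. It is not taken from the lecture; it assumes $n \geq 2$ and $\Omega = B_1(0)$, and $\sigma_{n-1}$ denotes the area of the unit sphere.

```latex
% Assumption: \Omega = B_1(0) \subset \mathbb{R}^n with n \ge 2.
% Radial integration with \rho = |x|, then the substitution s = \log(e/\rho), ds = -d\rho/\rho.
\begin{align*}
u(x) &= \log\log\frac{e}{|x|}, \qquad
  u = 0 \text{ on } \partial B_1(0), \qquad
  u(x) \to +\infty \text{ as } x \to 0, \\
|\nabla u(x)| &= \frac{1}{|x|\,\log(e/|x|)}, \\
\int_{B_1(0)} |\nabla u|^n\,dx
  &= \sigma_{n-1} \int_0^1 \frac{d\rho}{\rho\,\log^n(e/\rho)}
   = \sigma_{n-1} \int_1^{+\infty} \frac{ds}{s^n}
   = \frac{\sigma_{n-1}}{n-1} < +\infty .
\end{align*}
```

So the gradient is in $L^n$ while $u$ is unbounded near the origin.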

Let us go back for a minute to weak solutions of the Poisson equation:
REMARK: In case $f$ is a continuous function, it is possible to prove that the
weak solution is in $C^2(\Omega) \cap C^0(\overline{\Omega})$ and thus it is a
classical solution of (∗) (in dimension $n \geq 2$ one needs in fact a bit more
than continuity, for instance $f$ Hölder continuous and $\Omega$ regular).
The proof of this result in dimension $n$ is not at all obvious, but in
dimension 1 it becomes very easy: suppose $u \in W_0^{1,2}((a, b))$ is a weak
solution of the ODE $u''(x) = f(x)$, i.e. a solution of

$$\int_a^b \big(u'(x)\varphi'(x) + f(x)\varphi(x)\big)\,dx = 0 \qquad \forall \varphi \in C_0^1(a, b).$$ [21]

By definition of the weak derivative, this means that $f$ is the weak derivative
of $u'$: it follows that $u' \in W^{1,2}$ has a continuous weak derivative. But then
$u' \in C^1((a, b))$ and so $u \in C^2((a, b))$ and the differential equation is
satisfied in the pointwise sense.

For $p > n$ things are even better: functions in $W_0^{1,p}(\Omega)$ are Hölder
continuous.
THEOREM (Morrey): If $p > n$, there is a constant $C > 0$, depending only
on $p$ and $n$, such that

$$[u]_\alpha \leq C\|\nabla u\|_{L^p(\Omega)} \qquad \forall u \in W_0^{1,p}(\Omega),$$

where $\alpha = 1 - n/p$ and

$$[u]_\alpha = \sup\left\{ \frac{|u(x) - u(y)|}{|x - y|^\alpha} : x, y \in \Omega,\ x \neq y \right\}.$$

This is understood in the sense that in the equivalence class of $u$ in
$L^p(\Omega)$, there is a Hölder continuous function satisfying the estimate.
[21] Poincaré's Inequality is true also in dimension 1, so existence of the weak
solution is warranted.

PROOF: As before, it is enough to prove the estimate for functions
$u \in C_c^1(\mathbb{R}^n)$: indeed, every function in $W_0^{1,p}(\Omega)$ can be
approximated in the $W^{1,p}$ norm with functions of this kind, and by passing to
subsequences we can have convergence a.e. We then pass to the limit in the estimate.
Let then $u \in C_c^1(\mathbb{R}^n)$, and let $x, y \in \mathbb{R}^n$ be distinct
points. Let $\delta = |x - y|$ and define $S = B_\delta(x) \cap B_\delta(y)$. For
every $z \in S$ one has $|u(x) - u(y)| \leq |u(x) - u(z)| + |u(z) - u(y)|$.
Integrate both sides over $S$ with respect to $z$: we get

$$(\ast) \qquad |S|\,|u(x) - u(y)| \leq \int_S |u(z) - u(x)|\,dz + \int_S |u(z) - u(y)|\,dz.$$

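We clearly have $|S| = \kappa\delta^n$, with $\kappa$ a constant independent of
$\delta$. Let us estimate the first integral on the right hand side: for the other,
the computation is similar. Since $|z - x| \leq \delta$ for $z \in S$, we have

$$|u(z) - u(x)| = \left| \int_0^1 \frac{d}{dt}\,u(x + t(z - x))\,dt \right| = \left| \int_0^1 \nabla u(x + t(z - x)) \cdot (z - x)\,dt \right| \leq \delta \int_0^1 |\nabla u(x + t(z - x))|\,dt,$$

and, by putting $w = x + t(z - x)$ and using Hölder:

$$\begin{aligned}
\int_S |u(z) - u(x)|\,dz
&\leq \delta \int_0^1 dt \int_S |\nabla u(x + t(z - x))|\,dz
 \leq \delta \int_0^1 t^{-n}\,dt \int_{B_{t\delta}(x)} |\nabla u(w)|\,dw \\
&\leq \delta \int_0^1 t^{-n} (\omega_n t^n \delta^n)^{1 - 1/p}\,dt\ \|\nabla u\|_{L^p(\mathbb{R}^n)}.
\end{aligned}$$

Here, $\omega_n$ is the measure of the unit ball in $\mathbb{R}^n$. The integral in
$t$ is finite (the exponent is $-n/p > -1$).
Inequality (∗) then becomes

$$\kappa\delta^n |u(x) - u(y)| \leq C\delta^{1 + n - n/p} \|\nabla u\|_{L^p(\mathbb{R}^n)},$$

as we wanted. Q.E.D.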
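
For completeness, the integral in $t$ and the resulting constant can be made explicit. This is a sketch: the factor 2 accounts for the two integrals in (∗), the second being estimated in the same way on $B_{t\delta}(y)$.

```latex
% p > n, so -n/p > -1 and the integral of t^{-n/p} over (0,1) equals p/(p-n).
\begin{align*}
\int_0^1 t^{-n}\,(\omega_n t^n \delta^n)^{1-1/p}\,dt
  &= \omega_n^{1-1/p}\,\delta^{\,n-n/p} \int_0^1 t^{-n/p}\,dt
   = \frac{p}{p-n}\,\omega_n^{1-1/p}\,\delta^{\,n-n/p}, \\
\kappa\,\delta^n\,|u(x)-u(y)|
  &\le \frac{2p}{p-n}\,\omega_n^{1-1/p}\,\delta^{\,1+n-n/p}\,\|\nabla u\|_{L^p(\mathbb{R}^n)}, \\
|u(x)-u(y)|
  &\le \frac{2p\,\omega_n^{1-1/p}}{\kappa\,(p-n)}\,|x-y|^{1-n/p}\,\|\nabla u\|_{L^p(\mathbb{R}^n)} .
\end{align*}
```

The last line is the Hölder estimate with $\alpha = 1 - n/p$, and the constant depends only on $p$ and $n$.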

A natural question is now whether or not Sobolev's embeddings hold in
the space $W^{1,p}(\Omega)$ (and not only in the subspace $W_0^{1,p}(\Omega)$): is it
true that a function in $W^{1,p}(\Omega)$ is in $L^{p^*}(\Omega)$ for $p < n$ or in
$C^{0,\alpha}(\overline{\Omega})$ for $p > n$?
The answer is affirmative if the domain $\Omega$ is regular enough (otherwise
there are counterexamples).
Suppose for simplicity $\Omega$ is a domain of class $C^\infty$. Then the following
holds:
THEOREM (extension of Sobolev functions): Let $\Omega$ be a bounded domain
of class $C^\infty$, $\Omega'$ another domain with $\Omega \subset\subset \Omega'$.
Then there exists $C > 0$, depending only on $p$, $n$, $\Omega$ and $\Omega'$, such
that every $u \in W^{1,p}(\Omega)$ has an extension $\tilde{u} \in W_0^{1,p}(\Omega')$
such that

$$\|\tilde{u}\|_{W^{1,p}(\Omega')} \leq C\|u\|_{W^{1,p}(\Omega)}.$$

We omit the proof, which is not really complicated. One needs a
localization argument which uses a partition of unity and an extension by
reflection on the local charts.
If we apply the theorems by Sobolev and Morrey to the extended function
$\tilde{u}$, we easily obtain:
THEOREM (Sobolev-Morrey embedding in $W^{1,p}(\Omega)$): Let $\Omega$ be a regular,
bounded domain as in the extension theorem. If $1 \leq p < n$, there exists
a constant $C > 0$ (depending only on $p$, $n$ and $\Omega$) such that

$$\|u\|_{L^{p^*}(\Omega)} \leq C\|u\|_{W^{1,p}(\Omega)} \qquad \forall u \in W^{1,p}(\Omega).$$

If $p > n$, every function $u \in W^{1,p}(\Omega)$ is Hölder continuous with exponent
$\alpha = 1 - n/p$ and there exists $C > 0$ (depending only on $p$, $n$ and $\Omega$)
such that

$$[u]_\alpha \leq C\|u\|_{W^{1,p}(\Omega)} \qquad \forall u \in W^{1,p}(\Omega).$$

An important compactness result is the following:
THEOREM (Rellich): Let $\Omega$ be a bounded, regular domain, $1 \leq p < n$,
$1 \leq s < p^*$. Every bounded sequence in $W^{1,p}(\Omega)$ has a subsequence which
converges strongly in $L^s(\Omega)$.

To prove it, we need the following Lemma:
LEMMA (Strong compactness in $L^p$): Let $\Omega$ be bounded. Let $1 \leq p < +\infty$,
$\varphi_k$ the sequence of smooth mollifiers we used in regularization by
convolution. Let $F$ be a bounded subset of $L^p(\Omega)$ with the following
property: the regularized functions
$u_k(x) = \int_{\mathbb{R}^n} \varphi_k(x - y)u(y)\,dy$ converge to $u$ in
$L^p(\Omega)$ uniformly for $u \in F$. Then $F$ is precompact in $L^p(\Omega)$.
PROOF: We prove that $F$ is totally bounded in $L^p$. By assumption, for
every $\varepsilon > 0$ there exists $k \in \mathbb{N}$ such that
$\|u_k - u\|_{L^p(\Omega)} < \varepsilon$ for every $u \in F$.
Let $F_0 = \{u_k : u \in F\}$: if we show that $F_0$ can be covered by a finite
number of balls of radius $\varepsilon$, then $F$ can be covered by a finite number
of balls of radius $2\varepsilon$. It is then enough to show that $F_0$ is
precompact in $C^0$ and thus in $L^p$.
By Ascoli-Arzelà, this is certainly true if $F_0$ is equibounded in
$C^1(\overline{\Omega})$. But for $u \in F$ we have

$$u_k(x) = \int_\Omega \varphi_k(x - y)u(y)\,dy, \qquad \nabla u_k(x) = \int_\Omega \nabla\varphi_k(x - y)u(y)\,dy,$$

thus

$$|u_k(x)| \leq M|\Omega|^{1 - 1/p}\|u\|_{L^p} \leq C, \qquad |\nabla u_k(x)| \leq L|\Omega|^{1 - 1/p}\|u\|_{L^p} \leq C,$$

where $M$, $L$ are uniform bounds on $|\varphi_k|$, $|\nabla\varphi_k|$. Q.E.D.

PROOF of Rellich's Theorem: We first prove the theorem for $s = 1$, we then
extend the result to every exponent $s < p^*$.
Let $F$ be a bounded set in $W^{1,p}(\Omega)$ (so
$\sup\{\|u\|_{W^{1,p}(\Omega)} : u \in F\} < +\infty$).
We will show $F$ is relatively compact in $L^1(\Omega)$. Let $\varphi_k$ be the
sequence of mollifiers as in the previous lemma, $u \in F$, $u_k$ the
regularization by convolution of $u$. We have

$$(\ast) \qquad \int_\Omega |u(x) - u_k(x)|\,dx = \int_\Omega \left| \int_{B_{1/k}(0)} \varphi_k(y)(u(x - y) - u(x))\,dy \right| dx \leq \int_{B_{1/k}(0)} \varphi_k(y) \left( \int_\Omega |u(x - y) - u(x)|\,dx \right) dy.$$

By the lemma, we only need to show that the last expression in (∗) becomes
arbitrarily small for large $k$, uniformly for $u \in F$. To estimate the inner
integral, we observe first that if we fix a sufficiently large open set
$U \subset\subset \Omega$, then $\int_{\Omega \setminus U} |u(x - y) - u(x)|\,dx$ is
uniformly small for $u \in F$: indeed, by Hölder's inequality and Sobolev's
embedding theorem we have

$$\int_{\Omega \setminus U} |u(x - y) - u(x)|\,dx \leq 2|\Omega \setminus U|^{1 - 1/p^*}\|u\|_{L^{p^*}(\Omega)} \leq 2C|\Omega \setminus U|^{1 - 1/p^*}\|u\|_{W^{1,p}(\Omega)}.$$

Fix $\varepsilon > 0$: if $|\Omega \setminus U|$ is small enough, the last expression
is less than $\varepsilon$, independently of $u \in F$.
We next need to estimate $\int_U |u(x - y) - u(x)|\,dx$, with $y \in B_{1/k}(0)$. If
$k$ is large enough and $x \in U$, then $(x - y) \in \Omega$. So, for almost every
$x \in U$ and almost every $y \in B_{1/k}(0)$ we have [22]

$$|u(x - y) - u(x)| = \left| \int_0^1 \frac{d}{dt}\big(u(x - ty)\big)\,dt \right| \leq \frac{1}{k} \int_0^1 |\nabla u(x - ty)|\,dt,$$

whence

$$\int_U |u(x - y) - u(x)|\,dx \leq \frac{1}{k} \int_0^1 dt \int_U |\nabla u(x - ty)|\,dx \leq \frac{1}{k}|\Omega|^{1 - 1/p}\|u\|_{W^{1,p}(\Omega)}.$$

For $k$ large enough, the last quantity is $< \varepsilon$ for every $u \in F$. So, by
plugging the last two estimates into (∗) we get that there exists
$\bar{k} \in \mathbb{N}$ such that

$$\|u - u_k\|_{L^1(\Omega)} \leq 2\varepsilon \qquad \forall u \in F,\ \forall k \geq \bar{k}$$

and by the Lemma $F$ is relatively compact in $L^1$.

[22] By using the Meyers-Serrin Theorem and Fubini's Theorem, one can prove that a
Sobolev function is absolutely continuous on almost every segment. Alternatively,
again by Meyers-Serrin, we can suppose w.l.o.g. that our functions are regular.


Let now $1 < s < p^*$: we show that $F$ is relatively compact in $L^s(\Omega)$. We
just proved that each sequence $\{u_j\}_j \subset F$ has a subsequence converging in
$L^1$ (and almost everywhere, after possibly extracting a further subsequence):
let us still denote this subsequence by $\{u_j\}$ and by $u$ its limit. We claim that
$u_j \to u$ in $L^s(\Omega)$.
If $A \subset \Omega$ is measurable, we have by Hölder's inequality and the
equi-boundedness of $u_j$ in $L^{p^*}$

$$\|u_j\|_{L^s(A)} \leq \|u_j\|_{L^{p^*}(\Omega)}\,|A|^{\frac{1}{s} - \frac{1}{p^*}} \leq C|A|^{\frac{1}{s} - \frac{1}{p^*}}.$$

Thus, for every $\varepsilon > 0$ there exists $\delta > 0$ such that
$\|u_j\|_{L^s(A)} \leq \varepsilon$ for every $j \in \mathbb{N}$ and for every
$A \subset \Omega$ with $|A| < \delta$.
By Egorov's Theorem [23], there exists a measurable subset $C \subset \Omega$ such
that $|\Omega \setminus C| < \delta$ and $u_j \to u$ uniformly on $C$ (and a fortiori
in $L^s(C)$). Thus, for $j, k$ large enough we have

$$\|u_j - u_k\|_{L^s(\Omega)} \leq \|u_j - u_k\|_{L^s(C)} + \|u_j\|_{L^s(\Omega \setminus C)} + \|u_k\|_{L^s(\Omega \setminus C)} < 3\varepsilon,$$

and $\{u_j\}$ is a Cauchy sequence in $L^s$, as we wanted. Q.E.D.


[23] Egorov's Theorem: Let $A$ be a measurable set with $\mu(A) < +\infty$ and
$\{f_k\}$, $f$ measurable real-valued functions such that $f_k(x) \to f(x)$ for
$\mu$-almost every $x \in A$, $\varepsilon > 0$. Then there is a measurable set
$B \subset A$ such that $\mu(A \setminus B) < \varepsilon$ and $f_k \to f$ uniformly
on $B$.
(Sketch of proof: For every fixed $j = 1, 2, 3, \ldots$ consider the sequence of sets

$$C_{n,j} = \bigcup_{k=n}^{\infty} \left\{ x \in A : |f_k(x) - f(x)| > \frac{1}{j} \right\}.$$

It is a decreasing sequence, and the intersection has measure 0. So there exists
$n_j$ such that $\mu(C_{n_j,j}) < \varepsilon/2^j$. We easily check that
$C = \bigcup_{j=1}^{\infty} C_{n_j,j}$ is such that $\mu(C) < \varepsilon$ and
$f_k \to f$ uniformly on $A \setminus C$.)
As a consequence we have the following corollary, on which I only gave
some very brief hints in class.
COROLLARY: Let $\Omega$ be a bounded and regular open subset of $\mathbb{R}^n$,
$1 < p < +\infty$, $\{u_k\}_k$ be a bounded sequence in $W^{1,p}(\Omega)$.
If $1 < p < n$ there exists a subsequence $\{u_{k_h}\}_h$ and a function
$u \in W^{1,p}(\Omega)$ such that $u_{k_h} \to u$ strongly in $L^s(\Omega)$ for
$1 \leq s < p^*$ and $D_i u_{k_h} \rightharpoonup D_i u$ weakly in $L^p(\Omega)$
($i = 1, 2, \ldots, n$). If $p > n$, $u_{k_h} \to u$ uniformly.
PROOF: By Rellich's Theorem and the Banach-Alaoglu Theorem we can find a
subsequence such that $u_{k_h} \to u$ in $L^s(\Omega)$ and
$D_i u_{k_h} \rightharpoonup v_i$ weakly in $L^p(\Omega)$.
By passing to the limit in the definition of the weak partial derivatives, one
immediately checks that $u \in W^{1,p}(\Omega)$ and $v_i = D_i u$.
If $p > n$, then of course we can apply the previous theorem for every finite
$s$, but we can also obtain uniform convergence thanks to Morrey's Theorem:
indeed (by extracting a further subsequence if necessary) $u_{k_h}(x) \to u(x)$ a.e.
in $\Omega$. By Morrey's Theorem, we have $[u_{k_h}]_\alpha \leq C$ so the
subsequence is equicontinuous because it satisfies the inequality

$$(\ast) \qquad |u_{k_h}(x) - u_{k_h}(y)| \leq C|x - y|^\alpha \qquad \forall x, y \in \Omega,\ \forall h \in \mathbb{N}.$$

Choose a point $x_0 \in \Omega$ where we have pointwise convergence: the sequence
$u_{k_h}(x_0)$ is obviously bounded. Moreover, for every $x \in \Omega$ we have

$$|u_{k_h}(x)| \leq |u_{k_h}(x_0)| + |u_{k_h}(x) - u_{k_h}(x_0)| \leq C + C(\operatorname{diam}(\Omega))^\alpha$$

and $u_{k_h}$ is equibounded: we conclude by using the Ascoli-Arzelà Theorem.
Q.E.D.
