Integration 2
Oliver R. Díaz–Espinosa
SAMSI–Duke University
Current address: Precision Health Economics
Email address: [email protected]
2010 Mathematics Subject Classification. 28-02
Contents
Preface
Chapter 1. Elements of set theory
§1.1. Naive set theory
§1.2. Ordered sets and transfinite induction
§1.3. The Axiom of choice
§1.4. Cardinality
§1.5. Simple algebraic structures
§1.6. Exercises
Chapter 2. Elements of point set Topology
§2.1. General definitions
§2.2. Connected spaces
§2.3. Convergence
§2.4. Compactness
§2.5. Metric spaces
§2.6. Banach fixed point theorem
§2.7. Uniformities
§2.8. Product topology
§2.9. Urysohn metrization
§2.10. Arzelà–Ascoli theorem
§2.11. Locally compact Hausdorff spaces
§2.12. Exercises
Chapter 3. Basic measure theory
These notes originated during a short summer course on Topics in Probability for senior
undergraduate students at the University of Toronto. The original goal was to introduce
Lebesgue integration theory geared towards Probability. Over the course of three years,
mostly from my interactions with first year graduate students preparing for their qualifying
exams at the University of Toronto and at Duke University, these notes grew considerably
into what is now a full course of integration theory. Several topics in Probability (independence,
conditioning and Martingales) are included. This is intended to preserve the initial
spirit of these notes: to teach some topics in Probability.
The selection of topics and their order of appearance are based on my attempt to
provide a self-contained presentation of the subject. In particular, the first two Chapters
are included as a reference to the elements of set theory and point set topology which are
used later in the notes to construct examples or to lay the terrain for new material.
In preparing these notes, I have borrowed from the work of several authors from whom
I learned Integration and Probability theory: W. Rudin’s Real and Complex Analysis; D.
Cohn’s Measure theory; V. I. Bogachev’s Measure theory; O. Kallenberg’s Foundations of
Modern Probability; and K. Bichteler’s Integration: A functional approach.
There are several methods to introduce modern integration theory. We present two: the
classic method of Lebesgue, and the functional approach of Daniell. We will see that both
methods produce the same class of integrable functions. Paraphrasing Klaus Bichteler,
“Lebesgue’s method is based on ingenuity, Daniell’s approach is based on hindsight: to be
integrable, a function must not be too big and must be measurable”.
I hope that these notes assist other graduate students who are learning Integration
theory and the foundations of Probability. I apologize for all the typos that might appear
here and there.
Chapter 1. Elements of set theory
In this section we give a naive presentation of set theory and the real numbers, that is,
we do not provide either a rigorous axiomatic presentation of set theory, or a set theoretic
construction of the real numbers. Instead, we take the notion and existence of sets for granted
and assume that the reader is familiar with set operations such as union, intersection,
complementation, relations and functions. Although we assume the existence of the sets
of natural numbers N, the integers Z, the rational numbers Q and of real numbers R, we
indicate in the exercises at the end of this section how to rigorously construct the integers
from the natural numbers and zero, and the rational numbers from the integers. The
problem of constructing the real numbers from the rationals (achieved by Dedekind and
Cantor at the end of the 1800’s) is not discussed in these notes.
We will give a rather detailed presentation of notions of order, cardinality, the Axiom
of choice and some of its equivalences. The Axiom of choice is used in these notes to prove
fundamental existence results in analysis such as the Hahn–Banach extension theorem,
Vitali’s covering Lemma, Alexander’s covering theorem for compact sets, and Parseval’s
theorem on the existence of maximal orthonormal families in Hilbert spaces.
not. The notion of belonging or being an element of is denoted by the symbol ∈; thus, we
use x ∈ A to indicate that x is an element of A, and x ∉ A to indicate that x is not an
element of A. To avoid logical contradictions, it is convenient to postulate that no set is
an element of itself, that is, for any set A, A ∉ A. For instance, Russell’s paradox
considers R := {x : x ∉ x}. This is not a set, for if it were a set then R ∈ R iff R ∉ R.
Another example is the set of all sets. There is no such set, for if there were a set of all
sets, which we then denote by U, then U ∈ U.
Definition 1.1.1. Given two sets A and B, we say that A = B if for any x, x ∈ A iff x ∈ B.
That is, two sets are equal iff they have the same elements. We say that A is contained in
B (denoted by A ⊂ B) iff for any x, x ∈ A implies that x ∈ B.
Definition 1.1.2. Given a set A, there is a set called the power set of A and denoted by P(A)
such that Y ∈ P(A) iff Y ⊂ A.
Sets may have only one element x. Such a set, denoted by {x}, is called the singleton x. A
set with two distinct elements x, y is denoted by {x, y}. The ordered pair (x, y) is the set
defined by {{x}, {x, y}}.
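The set-theoretic encoding of ordered pairs can be tried out with immutable sets. The following is only a sketch: the helper name `ordered_pair` is hypothetical and Python's `frozenset` stands in for a set.

```python
def ordered_pair(x, y):
    """Encode (x, y) as the set {{x}, {x, y}} using immutable sets."""
    return frozenset({frozenset({x}), frozenset({x, y})})

# The unordered set {1, 2} does not remember which element came first ...
assert frozenset({1, 2}) == frozenset({2, 1})
# ... but the encoded pairs (1, 2) and (2, 1) are different sets.
assert ordered_pair(1, 2) != ordered_pair(2, 1)
# When x == y the encoding collapses to {{x}}, which is still a valid pair.
assert ordered_pair(3, 3) == frozenset({frozenset({3})})
```

On small inputs one can also check that two encoded pairs agree only when both coordinates agree, which is the content of Exercise 1.6.1.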
A property P is a proposition that for a given object x can be determined to be true
or false. Very often in Mathematics, given a set A and a proposition P , we define a set of
objects that belong to A for which the property P holds true. This set is denoted as
{x ∈ A : P (x)}
Remark 1.1.3. In defining sets through properties, we always restrict the objects to be
elements of an a priori established set. When the a priori established set is clear from the
context we often omit it and write instead {x : P (x)}.
Example 1.1.4. If N is the set of natural numbers 1, 2, 3, . . ., then {x ∈ N : x² − 3x + 2 = 0} =
{1, 2}.
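Defining a set through a property has a direct analogue in a set comprehension. In the sketch below the finite range `candidates` is an assumption that stands in for N, which cannot be enumerated in full.

```python
# Finite stand-in for N (an assumption of this sketch).
candidates = range(1, 101)

# {x in N : x^2 - 3x + 2 = 0}, restricted to the candidates above.
solutions = {x for x in candidates if x**2 - 3*x + 2 == 0}
```

Since x² − 3x + 2 = (x − 1)(x − 2), the comprehension yields the set {1, 2}.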
There is a set, the empty set or void set, denoted by ∅, that has no elements.
This can be expressed as the set of objects that are not equal to themselves, {x : x ≠ x}.
Since sets are fully characterized by their elements, there is only one empty set.
Throughout these notes, we will use the term collection or family to denote sets whose
elements are sets. Given a collection of sets 𝒜, we define its union as the set
⋃𝒜 := {x : for some A ∈ 𝒜, x ∈ A}
Similarly, the intersection of all elements of 𝒜 is defined as
⋂𝒜 := {x : for all A ∈ 𝒜, x ∈ A}
In particular, if A and B are two sets, then
A ∪ B := {x : x ∈ A or x ∈ B}
A ∩ B := {x : x ∈ A and x ∈ B}
When proving existence results using set theory it is often the case that one has a collection
of functions in which, for any two members, one is an extension of the other and, from this
collection, one constructs a function that extends every function in the collection. The following
elementary result summarizes this type of argument.
Lemma 1.1.8. Given sets A and B, assume that C is a collection of functions with domains
contained in A and images in B such that for any f, g ∈ C either f ⊂ g or g ⊂ f. Then
F := ⋃C is a function from ⋃{dom(f) : f ∈ C} to B.
Proof. We first show that F is a relation with dom(F) = ⋃{dom(f) : f ∈ C}. For any
x ∈ dom(F) there is y ∈ B such that (x, y) ∈ F. Thus (x, y) ∈ f for some f ∈ C, and so
x ∈ ⋃{dom(f) : f ∈ C}. Conversely, for any x ∈ ⋃{dom(f) : f ∈ C} there is f ∈ C such
that x ∈ dom(f). Hence there is y ∈ B for which (x, y) ∈ f. This means that (x, y) ∈ F,
that is, x ∈ dom(F).
Now we show that F is a function. Suppose (x, y) and (x, z) are elements in F . Then, there
are f, g ∈ C such that (x, y) ∈ f and (x, z) ∈ g. Without loss of generality assume that
f ⊂ g. Then (x, y) ∈ g and, as g is a function, y = z. This shows that F is a function.
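For finite families the lemma can be checked by hand. The sketch below represents functions as Python dictionaries (an assumption made for illustration) and uses a hypothetical helper `union_of_chain` that refuses families which are not comparable by extension.

```python
def union_of_chain(functions):
    """Union of a family of dicts that is totally ordered by extension."""
    union = {}
    for f in functions:
        for x, y in f.items():
            # Comparability of the family forces agreement on shared arguments.
            if x in union and union[x] != y:
                raise ValueError("family is not a chain under extension")
            union[x] = y
    return union

# Each function extends the previous one, so the union is again a function.
chain = [{0: "a"}, {0: "a", 1: "b"}, {0: "a", 1: "b", 2: "c"}]
F = union_of_chain(chain)
```

The domain of `F` is the union of the domains, and `F` restricts to every member of `chain`, as in the statement of the lemma.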
Some simplifying notation is in order. Suppose (X, ≤) is a partially ordered set. For
any x, y ∈ X, we will use the notation x < y to mean that x ≤ y but y ≰ x; also, we will
use y ≥ x to mean that x ≤ y.
The order type of the set ∅ (with the empty order) is denoted by 0. For any integer n ≥ 1,
the order type of Z_n := {0, . . . , n − 1} with the usual order is denoted by the integer
n. The order type of (Z_+, ≤) is denoted by ω.
Definition 1.2.8. Suppose (A, ≤) is a totally ordered set. For any x ∈ A, the set A_x :=
{y ∈ A : y < x} is called the initial segment of (A, ≤) at x. A subset S ⊂ A is an order
ideal of A if for any x, y ∈ A, x ∈ S and y ≤ x imply y ∈ S.
When the order ≤ is clear from the context, we will omit explicit reference to it.
Remark 1.2.9. The empty set is trivially an ideal of any totally ordered set. Evidently,
any initial segment of a totally ordered set is an ideal. The converse is not necessarily true.
For instance, if A has no last element, i.e. if A is not bounded above, then A is an ideal
but not an initial segment of A.
Lemma 1.2.10. Suppose (A, ≤) is a totally ordered set. The union of an arbitrary family
of ideals is an ideal. The intersection of an arbitrary collection of ideals is an ideal.
Proof. Suppose 𝒜 is a family of ideals in A. Let x ∈ ⋃𝒜 and assume y < x. There is
S ∈ 𝒜 such that x ∈ S and, as S is an ideal, we have that y ∈ S. Therefore, y ∈ ⋃𝒜. For
intersections, the proof is similar.
Theorem 1.2.12. Suppose (A, ≤) is a well ordered set. If S is an ideal of A then either
A = S or there exists a unique x ∈ A such that S = A_x.
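On a finite well-ordered set the dichotomy between the whole set and an initial segment can be verified by brute force. This is a sketch only: the five-element set `A` and the helper names are chosen for illustration.

```python
from itertools import combinations

A = list(range(5))   # a finite well-ordered set with the usual order

def is_ideal(S):
    """S is an ideal if it contains every element below each of its members."""
    return all(y in S for x in S for y in A if y <= x)

def initial_segment(x):
    """The initial segment of A at x: all elements strictly below x."""
    return frozenset(y for y in A if y < x)

subsets = [frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)]
ideals = [S for S in subsets if is_ideal(S)]
segments = {initial_segment(x): x for x in A}

# Every ideal is either A itself or the initial segment at exactly one x.
for S in ideals:
    assert S == frozenset(A) or S in segments
```

The dictionary `segments` has one key per element of `A`, which reflects the uniqueness of x in the theorem.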
Remark 1.2.13. The well–ordered assumption is needed in the Theorem above. Consider
for instance the set of real numbers R with the usual order. Any interval of the form (−∞, a]
where a ∈ R is an order–ideal; however, it is not an initial segment as in Definition 1.2.8.
Proof. Suppose the contrary, that is, the set B := {x ∈ A : f(x) < x} ≠ ∅. Let x_0 be
the first element of B. Then f(x_0) < x_0 and by hypothesis,
f(f(x_0)) < f(x_0). This means that f(x_0) ∈ B, which contradicts the choice of x_0.
Corollary 1.2.15. Suppose (A, ≤) is a well–ordered set. Then for any x ∈ A, the initial
segment A_x with the order inherited from (A, ≤) is not order–isomorphic to A.
Proof. Suppose for some x ∈ A there is an order isomorphism f : (A, ≤) → (A_x, ≤).
Then f, as a function from A into itself, satisfies the conditions of Theorem 1.2.14, and so
y ≤ f(y) for all y ∈ A. However, since f(A) = A_x we have in particular that f(x) < x.
This is a contradiction.
Theorem 1.2.16. Suppose (A, ≤) is a well–ordered set. For any two ideals S and T of A,
S and T are order–isomorphic iff either S = T = A or there is a unique x ∈ A such that
S = A_x = T.
Proof. Suppose g and h are two order isomorphisms from A to B. We will show that
g(x) = h(x) for all x ∈ A. Indeed, h⁻¹ ∘ g is an order isomorphism from A to itself.
Consequently, x ≤ h⁻¹(g(x)) for all x ∈ A. As h is an order isomorphism, we get that
h(x) ≤ h(h⁻¹(g(x))) = g(x) for all x ∈ A. The converse inequality is obtained by reversing
the roles of g and h.
Theorem 1.2.18. Let (A, ≤) and (B, ⪯) be two well–ordered sets. One and only one of the
following possibilities holds:
(i) A and B are order isomorphic.
(ii) There exists a unique x ∈ A such that A_x is order isomorphic to B.
(iii) There exists a unique y ∈ B such that A and B_y are order isomorphic.
Proof. Let a_0 and b_0 be the first elements of A and B respectively. Let E be the collection
of all ideals of A that are order isomorphic to some ideal of B. This collection is nonempty
since A_{a_0} = ∅ is order isomorphic to B_{b_0} = ∅. The set S := ⋃E is an ideal of A and we
will show that S ∈ E.
Suppose I_j (j = 1, 2) are ideals in E and let J_j (j = 1, 2) be ideals in B for which there are
(unique) order isomorphisms f_j : I_j → J_j. Clearly I_1 ∩ I_2 is an order ideal of I_1 and of I_2
which is order isomorphic to the order ideals f_1(I_1 ∩ I_2) and f_2(I_1 ∩ I_2) of B. By Theorem 1.2.16
Theorem 1.2.18 allows the introduction of a total order on order types. Suppose α and
β are two order types, and let (A, w) and (B, r) be well–ordered sets whose order types are α and
β respectively. Then α ≤ β iff (A, w) is order isomorphic to an ideal of (B, r), and α < β iff
(A, w) is order isomorphic to an initial segment of (B, r). Order types are also called ordinal
numbers.
Theorem 1.2.19. Let α be an order type larger than 0. Let P_α be the set of all order types
that are less than α. Then P_α is well–ordered and it is of order type α.
Proof. Let β be an order type with β < α. Let B and A be sets of order types β and α
respectively. Then, there is a unique x ∈ A such that B is order isomorphic to A_x. Setting
g(β) = x we define a function on P_α which is clearly an order isomorphism between P_α and
A. Therefore P_α is well ordered and has order type α.
We conclude this section with two results that generalize Mathematical induction.
Theorem 1.2.20. (Transfinite induction) Let (W, ≤) be a well ordered set. Suppose Q ⊂ W
is a set that satisfies the following condition: for any x ∈ W, W_x ⊂ Q implies x ∈ Q.
Then Q = W.
Proof. Let 0 denote the first element of W and for each x ∈ W, set W̄_x := W_x ∪ {x}. Let
T be the set of all x ∈ W for which there is a function f_x : W̄_x → E such that
(1.1) f_x(u) = R_u(f_x|W_u), u ∈ W̄_x.
We claim that 0 ∈ T. Since W_0 = ∅, the only function φ : W_0 → E is φ = ∅. Thus,
f_0 : W̄_0 → E given by f_0(0) = R_0(∅) satisfies condition (1.1).
For any x, y ∈ T, if x ≤ y then f_x = f_y|W̄_x. Suppose the opposite, that is, there are x, y ∈ T
with x < y such that f_x ≠ f_y|W̄_x. Let x_0 be the first element of the set {u ∈ W̄_x : f_x(u) ≠
f_y(u)}. Clearly 0 < x_0 since f_x(0) = R_0(∅) = f_y(0). Hence, W_{x_0} ≠ ∅ and f_x|W_{x_0} = f_y|W_{x_0}.
Consequently
f_x(x_0) = R_{x_0}(f_x|W_{x_0}) = R_{x_0}(f_y|W_{x_0}) = f_y(x_0)
which is a contradiction.
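A recursion of the form (1.1), in which the value at u is computed from the restriction of the function to the elements below u, can be illustrated on a finite well-ordered set. The sketch below is only an illustration under assumptions: the helper name `recurse` and the rule `R` are hypothetical, and a single rule is used for all u.

```python
def recurse(W, R):
    """Build f on the finite well-ordered set W with f(u) = R(u, f restricted to {v : v < u})."""
    f = {}
    for u in sorted(W):
        # At this point f is defined exactly on the initial segment at u.
        f[u] = R(u, dict(f))
    return f

# Rule: one plus the sum of all previously defined values.
f = recurse(range(6), lambda u, g: 1 + sum(g.values()))
```

With this rule the recursion produces f(u) = 2^u on {0, ..., 5}; the value at 0 is R applied to the empty function, as in the base case of the proof.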
Notice that the elements of the product of sets are in fact choice functions. The Axiom
of Choice states that the product of a non–empty family of non–empty sets has at least one element.
The axiom of choice is often used in other, equivalent forms. In these notes we will only make
use of the following equivalences:
Well–ordering (WO): Every set admits a well–order.
Hausdorff ’s maximal principle (HMP): For every partially ordered set (X, ≤)
there is a maximal chain (P, ≤).
Zorn’s lemma (ZL): Suppose (X, ≤) is a partially ordered set. If every chain P in
X is bounded above in X then X has a maximal element.
The following quote, attributed to Jerry Bona, states how surprising those equivalences are:
“The Axiom of Choice is obviously true, the well–ordering principle is obviously false, and
who can tell about Zorn’s lemma?” In other words, although the statement of the axiom
of choice seems to be intuitive and to a certain degree uncontroversial, the well–ordering
principle is rather difficult to accept, as trying to find an explicit well–order for the set of
real numbers R demonstrates, and Zorn’s lemma is not intuitive at all.
Theorem 1.3.3. AC, WO, HMP and ZL are equivalent.
Proof. AC implies WO. Let X be a nonempty set. The Axiom of choice implies that there is
a function c : P(X) \ {∅} → X such that c(A) ∈ A for all A ∈ P(X) \ {∅}. For convenience we
define f : P(X) \ {X} → X as f(B) = c(X \ B) for all B ∈ P(X) \ {X}. Set x_0 := f(∅) and
(A, ≤) = ({x_0}, {(x_0, x_0)}). Then A is a well–ordered set and, as A_{x_0} = {x ∈ A : x < x_0} = ∅,
x_0 = f(A_{x_0}). In general, we say that a well–ordered set (W, ≤) is an f–string if for any x ∈ W,
x = f({y ∈ W : y < x}). Clearly the first element of any f–string is x_0 = f(∅).
Finally, we claim that S = X. Suppose S ≠ X and let y = f(S). Then R = S ∪ {y} with
⪯ := ≤ ∪ {(x, y) : x ∈ S} is an f–string, and so R ∈ F, but this is clearly a contradiction.
WO implies HMP. Suppose (X, R) is a partially ordered set and let ⪯ be a well–order on X.
Let x_0 ∈ X be the first element of (X, ⪯). By transfinite induction, there is a unique
function f : X → X such that f(x_0) = x_0 and for any other x ∈ X
f(x) = x if {x} ∪ {f(y) : y ≺ x} is a chain in (X, R),
f(x) = x_0 otherwise.
We claim that P := f(X) is a chain in (X, R). Notice that for any x ∈ X, f(x) ∈ {x_0, x}. Let
x, y ∈ P and assume that y ≺ x. Then f(x) = x and x_0 ≺ x. Thus, either (a) x_0 ≺ y and
f(y) = y or (b) y = x_0. In either case, {x} ∪ {f(t) : t ≺ x} is a chain that contains y, and so xRy or yRx.
We now prove that P is a maximal chain. Suppose z ∈ X \ P. Then f(z) = x_0, and so
{z} ∪ {f(t) : t ≺ z} is not a chain in (X, R). Hence, there is t_0 ≺ z such that neither
f(t_0) R z nor z R f(t_0). Consequently P ∪ {z} is not a chain in (X, R). This shows that P
is a maximal chain.
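On a finite partially ordered set the construction in this step becomes a greedy scan along the well–order. The sketch below makes several assumptions for illustration: the set {1, ..., 12}, the divisibility relation as R, the usual order as the well–order, and the helper name `maximal_chain`.

```python
def maximal_chain(X, related):
    """Greedy scan: X is listed in a well-order and related(a, b) is the partial order R."""
    def comparable(a, b):
        return related(a, b) or related(b, a)

    chain = [X[0]]                     # the first element x0 always belongs to P
    for x in X[1:]:
        # The values f(y) for y before x are exactly the elements collected so far.
        if all(comparable(x, c) for c in chain):
            chain.append(x)            # this is the case f(x) = x
        # otherwise f(x) = x0, which contributes nothing new
    return chain

divides = lambda a, b: b % a == 0
P = maximal_chain(list(range(1, 13)), divides)
```

Here the scan keeps 1, 2, 4 and 8. Every other element of {1, ..., 12} is incomparable with some member of `P`, so `P` cannot be enlarged.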
HMP implies ZL. Suppose (X, R) is a partially ordered set in which any chain is bounded
above. Let P be a maximal chain and let m ∈ X be an upper bound of P . Then, as P ∪{m}
is a chain that contains the maximal chain P , we conclude that m ∈ P . This shows that
for any x ∈ X such that m R x, x ∈ P and so x = m. Therefore m is a maximal element
of (X, R).
ZL implies AC. Suppose I is a nonempty set and for each i ∈ I, A(i) is a nonempty set.
Let C be the set of all functions f such that dom(f) ⊂ I and f(i) ∈ A(i) for each i ∈ dom(f). As I is
not empty, there is i ∈ I and as A(i) is not empty there is a_i ∈ A(i). Thus f = {(i, a_i)} ∈ C
so C is nonempty. We partially order C by inclusion. By Lemma 1.1.8, for any chain P in
(C, ⊂) we have that ⋃P ∈ C. Hence the conditions of ZL hold and C has a maximal element
F. We claim that dom(F) = I; otherwise there exists x ∈ I \ dom(F) and, as A(x) ≠ ∅, there
is a_x ∈ A(x). Then F ∪ {(x, a_x)} ∈ C, contradicting the maximality of F.
1.4. Cardinality
An important concept in the theory of sets is the notion of cardinality. Two sets A and B
have the same cardinality or power if there is a bijective function f : A → B. In this case,
we also say that the sets A and B are equivalent, which is denoted by A ∼ B.
(a) When A = {1, . . . , n} := Z_n, with n ∈ N, then the set B is finite and its cardinality
is denoted by the integer n.
(b) If there is no bijection f : Z_n → B for any n ∈ N, then the set B is said to be
infinite.
(c) When A = N, then we say that the set B is infinite countable, and its cardinality
is denoted by ω.
between N and A:
a11 → a12   a13 → a14  ...
    ↙     ↗     ↙     ↗
a21   a22   a23   a24  ...
 ↓  ↗     ↙     ↗
a31   a32   a33   a34  ...
    ↙     ↗
a41   a42   a43   a44  ...
 ↓  ↗
 ⋮     ⋮     ⋮     ⋮
This is not the only way of producing a bijection between A and N.
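The zigzag path through the array can be written as a generator of index pairs. This is a sketch of one possible traversal, assuming 1-based indices (i, j) for the entry a_ij; the function name `zigzag` is hypothetical.

```python
from itertools import count, islice

def zigzag():
    """Yield index pairs (i, j), with i, j >= 1, along successive anti-diagonals."""
    for s in count(2):                 # s = i + j is constant on each anti-diagonal
        rows = range(1, s)             # possible row indices on this diagonal
        if s % 2 == 0:                 # even diagonals are traversed upward
            rows = reversed(rows)
        for i in rows:
            yield (i, s - i)

first_ten = list(islice(zigzag(), 10))
```

The first ten pairs are (1,1), (1,2), (2,1), (3,1), (2,2), (1,3), (1,4), (2,3), (3,2), (4,1). Each pair (i, j) is reached exactly once, on the diagonal i + j, so the n-th pair produced gives the image of n under a bijection of this kind.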
Proof. By the Axiom of choice there exists a choice function f on P(X) \ {∅}. Set h(0) :=
f(X) and by induction, there exists a unique function h : Z_+ → X such that for all
n ≥ 1, h(n) = f(X \ {h(0), . . . , h(n − 1)}). The function h : Z_+ → h(Z_+) is the desired
bijection.
Example 1.4.5. The set of all real numbers in the interval [0, 1] is uncountable. One
can use the fact that every real number admits a unique binary expansion with an infinite
number of 1s, combined with the result in the previous example. A mild modification of
this method, based on decimal expansions, can be used as well.
Theorem 1.4.6. A set A is infinite iff there is a proper subset B ⊂ A such that A ∼ B.
Proof. Clearly any finite set is not equivalent to any of its proper subsets. Thus, only
necessity needs a proof. If A is infinite countable then for any bijection f : N → A, let
B = {f (2n) : n ∈ N}. Clearly A ∼ B. Assume now that A is infinite and uncountable.
There exists an infinite countable set C ⊂ A. Since A is not countable, neither is A \ C.
Hence there is a countable set D ⊂ A \ C. There exists a bijection g : C ∪ D → D since
D ∪ C is a countable set. The function f (x) = x if x ∈ A \ (D ∪ C) and f (x) = g(x) if
x ∈ C ∪ D is a bijection from A onto B := A \ C = (A \ (C ∪ D)) ∪ D.
The following result provides a link between cardinality and well–ordering and it is a
direct consequence of AC.
Lemma 1.4.7. For any cardinal number A there exists a well–order type α such that
P_α has cardinality A.
Using properties of composition of functions it is easy to check that for any cardinal
numbers A, B and C, if A ⪯ B and B ⪯ C then A ⪯ C, and that A ⪯ A. The following result
states that for any pair of cardinal numbers A and B, only one of the following alternatives
may occur: (a) A = B, (b) A ≺ B, (c) B ≺ A.
Theorem 1.4.9. (Bernstein–Schröder) If A ⪯ B and B ⪯ A then A = B.
Proof. We present a proof that uses AC. Let A and B be sets with cardinality A and B
respectively. Suppose C ⊂ A and D ⊂ B are such that A ∼ D and B ∼ C. We well–order
the sets (A, ≤) and (B, ⪯). There exists a unique order isomorphism f from A to an order ideal
S_B of B and, since order isomorphisms are bijections, A ∼ D ∼ S_B. Similarly, there is an
order isomorphism g from B to an order ideal S_A of A and B ∼ C ∼ S_A. Hence, f(S_A)
is an order ideal of B contained in the ideal f(A) = S_B of B. Since f is order preserving,
f(S_A) is order isomorphic to B. By Corollary 1.2.15 and Theorem 1.2.16 we conclude that
f(S_A) = S_B = B. This shows that A ∼ B.
Corollary 1.4.10. For any pair of cardinal numbers A, B one and only one of the following
alternatives holds:
(i) A = B
(ii) A ≺ B
(iii) B ≺ A
Proof. By Lemma 1.4.7 there are well–ordered sets (A, ≤) and (B, ⪯) with cardinalities A
and B respectively. By Theorem 1.2.18 one and only one of the following holds: (a) A is
order isomorphic to B, (b) A is order isomorphic to an initial segment of B, or (c) B is
order isomorphic to an initial segment of A. In case (a) we obtain (i); cases (b) and (c), in
combination with the Bernstein–Schröder theorem, imply (ii) and (iii) respectively.
Theorem 1.4.11. There is a unique ordinal type Ω such that P_Ω = {α : α < Ω} is
uncountable and for any β ∈ P_Ω, β is countable. Furthermore, if C ⊂ P_Ω and C is
countable then there is β ∈ P_Ω such that C ⊂ {α : α < β}.
Proof. Let γ be an ordinal type such that P_γ has the same cardinality as that of R. If
for every β ∈ P_γ, {α ∈ P_γ : α < β} is countable then set Ω := γ; otherwise, as P_γ is well
ordered, let Ω be the smallest ordinal type in P_γ that is uncountable. Let C ⊂ P_Ω be countable
and set D := ⋃{P_α : α ∈ C}. Being the countable union of countable sets, D is countable.
Let β be the first element of P_Ω \ D. As each P_α with α ∈ C is an initial segment of P_Ω, D
is the initial segment P_β = {α ∈ P_Ω : α < β}.
Remark 1.4.12. The cardinality of the set N is denoted by ℵ₀. We have that P_ω has
cardinality ℵ₀. The cardinality of P_Ω is denoted as ℵ₁. The ordinal type Ω is the smallest one
such that P_Ω is uncountable. It follows that the cardinality of Ω is at most the cardinality c
of the set R. The continuum hypothesis (CH) is the assertion that ℵ₁ = c, that is, there
is no set whose cardinality is strictly between ℵ₀ and c. Two important results of set
theory, due to Gödel and Cohen, state that CH is independent of the axioms of ZF theory plus AC
and that AC is independent of the axioms of ZF theory.
For any set X, we will use #(X) to denote its cardinality. The following result due to
Cantor states that there is no largest cardinal number (or order type).
Theorem 1.4.13. (Cantor) For any set X, #(X) ≺ #(P(X)).
Proof. If X = ∅ then P(X) = {∅} and so #(∅) = 0 < 1 = #(P(∅)). For any set X ≠ ∅
consider the function h : X → P(X) given by x ↦ {x}. This is an injective function and
so #(X) ⪯ #(P(X)). Suppose U is a set whose cardinality is the same as that of P(U).
Then U ≠ ∅ and there exists a bijection f : U → P(U). Consider the set
S := {u ∈ U : u ∉ f(u)}
S ∈ P(U) since S ⊂ U. By assumption there exists a ∈ U such that f(a) = S; however, we have
that a ∈ S iff a ∉ S, which is a contradiction. This shows that no such set U exists.
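The diagonal set S used in this argument can be examined exhaustively when U is small. The sketch below takes a three-element U (an assumption for illustration), enumerates every function from U into its power set, and checks that S is never one of the values; the helper name `diagonal_set` is hypothetical.

```python
from itertools import combinations, product

U = [0, 1, 2]
power_set = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def diagonal_set(f):
    """The set S = {u in U : u not in f(u)} for a map f given as a dict from U to P(U)."""
    return frozenset(u for u in U if u not in f[u])

# Enumerate every function f : U -> P(U) and check that S is never a value of f.
never_hit = all(
    diagonal_set(dict(zip(U, images))) not in images
    for images in product(power_set, repeat=len(U))
)
```

The check covers all 8³ = 512 functions, so in particular none of them is onto P(U).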
The simplest example of a commutative group is (Z, +), the integers with the usual
addition from grade school.
Definition 1.5.3. A set R with two binary operations + and · is said to be a ring if
(a) (R, +) is a commutative group.
(b) For any a, b, c ∈ R,
a · (b · c) = (a · b) · c
(c) (distributive property) For any a, b, c ∈ R
a · (b + c) = a · b + a · c
(a + b) · c = a · c + b · c
The additive unit element in (R, +) is denoted as 0. The ring (R, +, ·) is a commutative
ring if
(d) for all a, b ∈ R
a · b = b · a
The ring (R, +, ·) is unital if
(e) there is e ∈ R such that for all a ∈ R
a·e=e·a=a
The commutative ring (R, +, ·) is an integral domain if
(f) For any a, b ∈ R, a ≠ 0 and a · b = 0 imply that b = 0.
The simplest example of a commutative ring with unit is (Z, +, ·), the set of integer
numbers with the operations of addition and multiplication studied in grade school. In
fact, Z is an integral domain with unit 1.
Definition 1.5.4. A commutative ring (F, +, ·) is a field if (F \ {0}, ·) is a group. A field
F is an ordered field if it has a total order < satisfying
(a) If a < b and c > 0, then a · c < b · c
(b) If a < b, then a + c < b + c for all c ∈ F .
The set P = {x ∈ F : x > 0} is the set of all positive elements of F . An ordered field is an
Archimedean field if for any a, b ∈ F with a > 0, there is an integer n ∈ N such that na > b
(here na is defined as 0a = 0, and for n ∈ N, na = a + (n − 1)a.)
The rational numbers Q and the real numbers R with the usual sum and product from
grade school are Archimedean fields. The field Q however is not order complete, whereas
R is. Cantor and Dedekind showed that the real numbers with the usual arithmetic operations
of addition and multiplication and order (R, +, ·, <) is the only (up to isomorphism)
Archimedean field that is order complete.
1.6. Exercises
Exercise 1.6.1. Given two ordered pairs (x, y) and (u, v) show that (x, y) = (u, v) iff x = u
and y = v.
Exercise 1.6.2. Suppose (A, ≤), (B, ⪯) and (C, R) are totally ordered sets.
(a) If A and B are order isomorphic, show that B and A are order isomorphic. (Hint:
If f : A → B is an order isomorphism from (A, ≤) to (B, ⪯), show that f⁻¹ is an
order isomorphism from B to A.)
(b) If f is an order isomorphism from A to B, show that for any x, y ∈ A, x < y iff
f(x) ≺ f(y).
(c) If A and B are order isomorphic and B and C are order isomorphic, show that A
and C are order isomorphic.
(d) If A and B are order isomorphic and A is well ordered, show that B is also well
ordered.
Exercise 1.6.3. Suppose (A, ≤) is a well ordered set. Show that for any x ∈ A, the set
Ā_x := A_x ∪ {x} = {y ∈ A : y ≤ x} is an ideal. If A is not bounded above (in A), show that
Ā_x is an initial segment.
Exercise 1.6.4. Show that AC is equivalent to each of the following statements:
Kuratowski’s lemma: If (X, ≤) is a partially ordered set and P is a chain in (X, ≤)
then there exists a maximal chain Q in (X, ≤) that contains P.
Zermelo’s principle: If 𝒜 is a family of nonempty pairwise disjoint sets then there
exists a set C such that for each A ∈ 𝒜, A ∩ C contains exactly one element.
Exercise 1.6.5. Prove the following statements:
1. Any subset C ⊂ N is either finite or countable.
2. Suppose that A is a set such that some function f : N → A is onto. Then A is
either finite or countable.
Exercise 1.6.6. The set of all rational numbers Q is countable.
Exercise 1.6.7. For any set X, let {0, 1}^X be the set of all functions from X into {0, 1}.
Show that #({0, 1}^X) = #(P(X)). (Hint: for each set A ⊂ X, consider the indicator
function 1_A(x) = 0 if x ∈ X \ A and 1_A(x) = 1 if x ∈ A.)
Exercise 1.6.8. Show that if M is an infinite set and A is a countable set, then M and
M ∪ A are equivalent (that is, they have the same cardinality). Show that [0, 1], (0, 1], [0, 1),
(0, 1) and R have the same cardinality.
Exercise 1.6.9. (Distributive formula) Let J be a non–empty set and {I_j : j ∈ J} be a
collection of nonempty sets. Assume that for each j ∈ J and i ∈ I_j there is associated
a set A^j_i. Let I = ∏_{j∈J} I_j. Prove that
(1.2) ⋃_{j∈J} ⋂_{i∈I_j} A^j_i = ⋂_{α∈I} ⋃_{j∈J} A^j_{α(j)}
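Before attempting the proof, it may help to test the distributive formula, a union of intersections rewritten as an intersection of unions indexed by the choice functions α, on small data. Everything in the sketch below is hypothetical illustration data: J has two elements, with two and three indices respectively.

```python
from itertools import product
from functools import reduce

# A[j][i] plays the role of the set indexed by j and i.
A = {
    0: {0: {1, 2, 3}, 1: {2, 3, 4}},
    1: {0: {3, 5}, 1: {3, 5, 6}, 2: {5, 3, 7}},
}
J = sorted(A)

def intersect(sets):
    return reduce(set.intersection, sets)

def unite(sets):
    return reduce(set.union, sets, set())

# Left-hand side: union over j of the intersection over i.
lhs = unite(intersect(A[j].values()) for j in J)

# Right-hand side: each alpha in the product of the index sets picks one index alpha(j) per j.
rhs = intersect(
    unite(A[j][i] for j, i in zip(J, alpha))
    for alpha in product(*(sorted(A[j]) for j in J))
)
```

Both sides evaluate to {2, 3, 5} for this data. In the finite case the product of index sets is enumerated by `itertools.product`; for infinite J the existence of the choice functions α is where the Axiom of choice enters.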
Let Z := (Z_+ × Z_+)/∼. For any (m, n) ∈ Z_+ × Z_+ we use [(m, n)] to denote the equivalence class of
(m, n). We define the following operations on Z:
[(m, n)] + [(p, q)] = [(m + p, n + q)]
[(m, n)] · [(p, q)] = [(mp + nq, mq + np)]
(a) Show that these operations are well defined, that is, if (m, n) ∼ (m′, n′) and (p, q) ∼
(p′, q′) then (m + p, n + q) ∼ (m′ + p′, n′ + q′) and (mp + nq, mq + np) ∼ (m′p′ +
n′q′, m′q′ + n′p′). (Hint: for multiplication, show first that (mp + nq, mq + np) ∼
(mp′ + nq′, mq′ + np′).)
(b) Show that (Z, +, ·) is a commutative ring with additive unit 0∗ := [(0, 0)] and
multiplicative unit 1∗ := [(1, 0)]. Furthermore, show that a ∈ Z \ {0∗} and
a · b = 0∗ imply that b = 0∗. For any [(m, n)] ∈ Z, we have that [(m, n)] + [(n, m)] =
[(0, 0)]. In particular, if n ∈ Z_+ and n∗ = [(n, 0)], we set −n∗ := [(0, n)].
(c) Show that the map τ : Z_+ → Z given by n ↦ [(n, 0)] preserves the addition and
product operations in Z_+, that is, τ(n + m) = τ(n) + τ(m) and τ(mn) = τ(m) · τ(n).
(d) We define a total order on Z by saying that a < b iff b − a ∈ {[(n, 0)] : n ∈ N}.
Show that τ preserves the usual order (0 < 1 < 2 < . . . < n < n + 1 < . . .) in Z_+,
and that [(m, n)] > 0∗ iff m > n in the order in Z_+. The order defined here is the
usual order . . . < −n − 1 < −n < . . . < −1 < 0 < 1 < . . . < m < m + 1 < . . .
learned in grade school.
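The operations on classes of pairs can be tried out numerically. The sketch below is an illustration under a stated assumption: it takes (m, n) ∼ (p, q) iff m + q = n + p, the usual relation for this construction, which does not appear in the text above. The class name `Int` is hypothetical.

```python
class Int:
    """Class [(m, n)] of a pair of non-negative integers, thought of as m - n."""

    def __init__(self, m, n):
        self.m, self.n = m, n

    def __eq__(self, other):
        # Assumed relation: (m, n) ~ (p, q) iff m + q == n + p.
        return self.m + other.n == self.n + other.m

    def __add__(self, other):
        return Int(self.m + other.m, self.n + other.n)

    def __mul__(self, other):
        return Int(self.m * other.m + self.n * other.n,
                   self.m * other.n + self.n * other.m)

    def __neg__(self):
        # The additive inverse of [(m, n)] is [(n, m)].
        return Int(self.n, self.m)

zero, one = Int(0, 0), Int(1, 0)
minus_two = Int(0, 2)
```

For instance, `minus_two * minus_two == Int(4, 0)` and `Int(3, 1) + (-Int(3, 1)) == zero` both hold, and replacing a pair by an equivalent one, such as `Int(2, 5)` by `Int(0, 3)`, does not change the class of a product.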
Exercise 1.6.11. The rational numbers Q can be constructed from the integers Z
by an equivalence relation on Z × (Z \ {0}) given by
(a, b) Q (c, d) iff ad = cb
Set Q := (Z × (Z \ {0}))/Q, and for any (a, b) ∈ Z × (Z \ {0}) we use a/b or [(a, b)]_Q to denote the equivalence
class of (a, b) in Q. We define the following algebraic operations on Q:
[(a, b)]_Q + [(c, d)]_Q = [(ad + cb, db)]_Q
[(a, b)]_Q · [(c, d)]_Q = [(ac, bd)]_Q
(a) Show that (Q, +, ·) is a field with additive unit 0_Q := [(0, 1)]_Q and
multiplicative unit [(1, 1)]_Q. Moreover, show that if p = [(a, b)]_Q, then the additive
inverse of p is −p := [(−a, b)]_Q = [(a, −b)]_Q; if p ≠ 0_Q, then the multiplicative
inverse of p is p⁻¹ = [(b, a)]_Q.
(b) Show that the map n ↦ [(n, 1)]_Q preserves the ring operations + and · on Z.
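The pair arithmetic of this exercise can be compared with Python's built-in rationals. In the sketch below pairs are plain tuples (a, b) with b ≠ 0, and the function names `equivalent`, `add` and `mul` are hypothetical.

```python
from fractions import Fraction

def equivalent(p, q):
    """(a, b) Q (c, d) iff ad = cb, for integer pairs with non-zero second entry."""
    (a, b), (c, d) = p, q
    return a * d == c * b

def add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + c * b, d * b)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)

half, third = (1, 2), (1, 3)
```

Here `add(half, third)` is equivalent to (5, 6) and `mul(half, third)` to (1, 6). Using the equivalent representative (2, 4) in place of (1, 2) gives an equivalent sum, and `Fraction(*add(half, third))` agrees with `Fraction(1, 2) + Fraction(1, 3)`.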
In this Section we give a brief presentation of topics on point set Topology we use in these
notes. In particular, we discuss convergence over nets and uniformities. Convergence of nets
will make discussion of continuity on topological vector spaces much simpler. Uniformities
will be useful to extend the notion of measurability in metric spaces. Our presentation is
not exhaustive; however, we tried to make this section as self–contained as possible.
Proof. Suppose x ∈ Ā. Let V_x be the set of all open sets that contain x. If V ∈ V_x then
A ∩ V ≠ ∅; otherwise, A ⊂ X \ V, which means that x ∈ Ā ⊂ X \ V, a contradiction. If x ∈ A
there is nothing else to do. If x ∉ A, then (V \ {x}) ∩ A ≠ ∅ for any V ∈ V_x, and so x ∈ A′.
Therefore, Ā ⊂ A ∪ A′. Conversely, if x ∈ A′ and F is a closed set that contains A, then
(X \ F) ∩ A = ∅, and so x ∈ F. Therefore, A ∪ A′ ⊂ Ā.
Definition 2.1.9. In any topological space (X, τ), the boundary of a set E ⊂ X is defined
as ∂E := Ē ∩ \overline{X \ E}.
Theorem 2.1.10. Let (X, τ ) be a topological space. Then
(2.1) ∂E = Ē \ E° = ∂(X \ E)
for all E ⊂ X.
Proof. The result follows from the identity (X \ A)° = X \ Ā which holds for any A ⊂ X.
In particular, A = X \ E gives the first identity.
Definition 2.1.11. Suppose (X, τ) is a topological space and ∅ ≠ Y ⊂ X. The collection
τ_Y = {Y ∩ U : U ∈ τ} defines a topology on Y. This topology is called the relative
topology induced by τ and (Y, τY ) is said to be a subspace of (X, τ ).
Definition 2.1.12. A function f between topological spaces (X, τ_X) and (Y, τ_Y) is said to
be continuous at a point x ∈ X if for any U ∈ τ_Y containing f(x), f⁻¹(U) ∈ τ_X; f is
continuous on X if it is continuous at every point of X.
Proof. Suppose X is normal. Since A and X \ U are disjoint closed sets, there are disjoint
open sets V, W such that A ⊂ V and (X \ U) ⊂ W. As V ⊂ X \ W and X \ W is closed,
we have that V̄ ⊂ X \ W. Consequently A ⊂ V ⊂ V̄ ⊂ U.
Proof. Suppose A and B are nonempty disjoint closed sets. For each x ∈ A there is a set V_x ∈ τ
around x such that V̄_x ∩ B = ∅. Similarly, for any y ∈ B there is a set U_y ∈ τ such that
Ū_y ∩ A = ∅. Thus, there are countable collections {P_n : n ∈ N} and {Q_n : n ∈ N} of open
sets such that A ⊂ ⋃_{x∈A} V_x = ⋃_n P_n and B ⊂ ⋃_{y∈B} U_y = ⋃_n Q_n. Let P*_n := P_n \ ⋃_{j=1}^n Q̄_j
and Q*_n := Q_n \ ⋃_{j=1}^n P̄_j. Clearly P*_n and Q*_n are open sets, P*_n ∩ Q*_m = ∅, and from
P_n ∩ A ⊂ P_n \ ⋃_j Q̄_j ⊂ P_n \ ⋃_{j=1}^n Q̄_j
Q_n ∩ B ⊂ Q_n \ ⋃_j P̄_j ⊂ Q_n \ ⋃_{j=1}^n P̄_j
we have that A ⊂ P := ⋃_n P*_n, B ⊂ Q := ⋃_n Q*_n, and P ∩ Q = ∅.
Proof. Suppose \overline{Y} is the union of two disjoint clopen sets A and B in \overline{Y}. Then A ∩ Y and
B ∩ Y are clopen in Y. Hence, either A ∩ Y = ∅ or B ∩ Y = ∅. Suppose Y ∩ B = ∅. Then
Y ⊂ A and so, \overline{Y} = \overline{A} = A since A is closed in \overline{Y}. Thus, B = ∅.
Theorem 2.2.4. Let 𝒜 be a family of connected subsets of a topological space X. Suppose
that no two members of 𝒜 are separated. Then, Y = ⋃{A : A ∈ 𝒜} is connected.
Proof. Fix x ∈ X. For any y ∈ X choose a connected set C_y such that {x, y} ⊂ C_y.
Then, {C_y : y ∈ X} satisfies the conditions of Theorem 2.2.4 and so, X = ⋃_{y∈X} C_y is
connected.
Example 2.2.6. As in Example 2.1.16, suppose (X, <, τ) is a totally ordered set with a
topology τ stronger than the order topology τo. If (X, τ) is connected, then the collection of
predecessor–successor pairs in X is empty, for if x < y, then x ∈ A_y = {u : u < y},
y ∈ B_x = {u : x < u} and X = A_y ∪ B_x. Since X is connected, and A_y and B_x are
nonempty open sets, A_y ∩ B_x ≠ ∅. Hence, any dense set Z in τ is not only order-dense, but
also dense in τo.
Example 2.2.7. Suppose (X, <) is order complete (see Definition 1.2.4[7]), and that there
are no predecessor–successor pairs. Then (X, τo ) is connected. To see this, suppose X =
A ∪ B where A and B are nonempty disjoint open (and hence clopen) sets. Let a ∈ A and
b ∈ B and without loss of generality assume a < b. Then Ia,b = A0 ∪ B0 where A0 = A ∩ Ia,b
and B0 = B ∩ Ia,b . Since X \ Ia,b = {u : u < a} ∪ {u : b < u}, Ia,b is closed in X; hence,
A0 and B0 are closed in X. As A0 is bounded by b, c = sup A0 exists. Since any open
neighborhood of c contains points of A0, and b ∉ A, c ∈ A0 and a ≤ c < b. Since A0 is
open in Ia,b there is an open neighborhood V of c in X such that V ∩ Ia,b ⊂ A0. Hence, for
some d > c, ∅ ≠ {u : c ≤ u < d} ⊂ A0. This means that there is z ∈ A0 such that c < z,
contradicting the definition of c. Therefore, X is connected.
Example 2.2.8. Theorem 2.2.4 implies that, for any point x in a topological space X, the
union C(x) of all connected subsets of X that contain x is connected. This set is closed in
X since \overline{C(x)} is connected and so, contained in C(x). The sets C(x), called the connected
components of X, form a partition of X. Indeed, if C(x) ∩ C(y) ≠ ∅, then C(x) ∪ C(y)
is connected and contains x and y. Then, by definition of C(x) and C(y), it follows that
C(x) = C(x) ∪ C(y) = C(y).
Theorem 2.2.9. Suppose X is a connected space. If M ⊂ X is connected and X \ M =
A ∪ B, where A and B are separated, then A ∪ M and B ∪ M are connected.
All Euclidean spaces, and any normed space in general, are locally connected.
Lemma 2.2.14. A space is locally connected iff the connected components of each open set
are open.
Conversely, suppose the connected components of each open set are open. Then, the
collection of all connected components of open sets in X forms a basis.
Corollary 2.2.15. If X is locally connected and f : X −→ Y is a continuous closed
function, then f (X) is locally connected.
Proof. Without loss of generality, we may assume that f(X) = Y. It suffices to show that
a component C of an open set U in Y is open. By continuity, f^{-1}(U) is open in X, and the
restriction f|_{f^{-1}(U)} : f^{-1}(U) → U is continuous. By Theorem 2.2.11 and Lemma 2.2.14,
f^{-1}(C) is the union of open components of the open set f^{-1}(U); hence f^{-1}(C) is open in
X. Consequently, X \ f^{-1}(C) = f^{-1}(Y \ C) is closed in X. Since f is a closed function,
Y \ C = f(f^{-1}(Y \ C)) is closed in Y and so, C is open in Y.
Proof. Fix x0 ∈ X. Let A be the collection of all x ∈ X for which there is a simple chain
from x0 to x. We claim that A is open. Indeed, if x ∈ A, there is a simple chain {U0, …, Un} ⊂ 𝒰
from x0 to x. For any y ∈ Un, either U0, …, Un is a simple chain from x0 to y, or U0, …, Un−1 is a
simple chain from x0 to y. The latter occurs when y ∈ Un−1 ∩ Un. Hence, A is open.
We now show that A is closed. Let y ∈ \overline{A} and let U ∈ 𝒰 be a set that contains y. Hence there exists
x ∈ A ∩ U. Let U0, …, Un be a simple chain from x0 to x with links in 𝒰. Let k be the smallest
integer between 0 and n such that Uk ∩ U ≠ ∅. Then, either U0, …, Uk, U is a simple chain
from x0 to y, or U0, …, Uk is a simple chain from x0 to y. Hence y ∈ A.
Corollary 2.2.20. Any open connected set in the Euclidean space Rn is arcwise connected.
Proof. Let U be an open connected subset of R^n and let ℬ be the collection of all open balls
contained in U. Lemma 2.2.19 implies that for any pair of points x, y ∈ U there is a chain
{B0, …, Bn} ⊂ ℬ from x to y. Let x_j be the center of the ball B_j for j = 0, …, n, and set
x_{−1} = x and x_{n+1} = y. Define the straight line segments
    γ_{j,j+1}(λ) = λ x_{j+1} + (1 − λ) x_j,   0 ≤ λ ≤ 1.
The polygonal path γ = γ_{−1,0} · γ_{0,1} · … · γ_{n−1,n} · γ_{n,n+1} contains a simple path in ⋃_{j=0}^n B_j ⊂ U
joining x to y.
2.3. Convergence
Suppose D is a nonempty set and let ⪯ be a relation from D to D. From Definition 1.2.2,
we recall that ⪯ is a pre–order if it is reflexive and transitive; ⪯ is a partial order if it is
an antisymmetric pre–order; ⪯ is a total order (or simply an order) if it is a partial order
and for each pair (x, y) ∈ D × D, either x ⪯ y or y ⪯ x. We introduce another related type of
pre–order.
Definition 2.3.1. A direction on D is a pre–order ⪯ on D such that for any n and m in
D there is k ∈ D such that n ⪯ k and m ⪯ k.
Example 2.3.2. (a) Any nonempty subset of R is an ordered and a directed set with
respect to the natural order ≤ of numbers.
(b) A collection D of subsets of a given set X is partially ordered by inverse inclusion;
that is, A ⪯ B iff A ⊃ B. If in addition, for any A and B in D, there is C ∈ D
such that C ⊂ A ∩ B, then D is also directed by inverse inclusion.
(c) A collection 𝒜 ⊂ R^S is partially ordered by f ≤ g iff f(x) ≤ g(x) for all x ∈ S. 𝒜
is directed if for any f and g in 𝒜, there is h ∈ 𝒜 such that max{f, g} ≤ h.
(d) If (D, ⪯_1) and (E, ⪯_2) are directed sets, then D × E is a directed set with respect
to the Cartesian direction: (n, a) ⪯ (m, b) iff n ⪯_1 m and a ⪯_2 b.
Conversely, suppose X is a topological space where each convergent net has a unique limit.
If X were not Hausdorff, then there would exist a pair of distinct points x and y such that for any
open sets V ∈ Vx and U ∈ Vy there is x_{V,U} ∈ V ∩ U. Then {x_{V,U} : (V, U) ∈ Vx × Vy} is a
net in X that converges to both x and y, which is a contradiction.
Definition 2.3.6. A net {ym : m ∈ E} is a subnet of the net {xn : n ∈ D} if there is a
function g : E → D such that ym = xg(m) , and for any n ∈ D, there is M ∈ E such that
m ≥ M implies that g(m) ≥ n.
Example 2.3.7. Suppose {xn : n ∈ N} is a sequence. Then
1. A subsequence of a sequence is clearly a subnet.
2. Let N × N be directed by the Cartesian direction: (n, m) ⪯ (k, ℓ) iff n ≤ k and
m ≤ ℓ. Then y_{n,m} = x_{n+m} is a subnet of x_n.
Example 2.3.8. Let x : (1, ∞) → (0, 1) be x(α) = 1/α and define y : (0, 1) → (0, 1) as
y(α) = α. If (1, ∞) is ordered by the natural order in the real line and (0, 1) is ordered as
t ⪯ s iff 1/t ≤ 1/s, then x and y are subnets of each other.
Definition 2.3.9. Let {xn : n ∈ D} be a net in a topological space X. A point x ∈ X is
a cluster point of {xn : n ∈ D} if for any V ∈ Vx and m ∈ D, there is n ≥ m such that
xn ∈ V .
Proof. If x is a cluster point of {x_n : n ∈ D}, then for any V ∈ V_x and n ∈ D there is
m ≥ n such that x_m ∈ V, that is, V ∩ A_n ≠ ∅. Therefore x ∈ \overline{A_n} for all n ∈ D.
Theorem 2.3.11. A point x is a cluster point of a net {xn : n ∈ D} iff there is a subnet
that converges to x.
Let x be a cluster point of a net {x_n : n ∈ D}. Let V_x be the collection of open neighborhoods
of x directed by inverse inclusion. For any V ∈ V_x and n ∈ D, choose g(n, V) ≥ n with
x_{g(n,V)} ∈ V. Setting y_{n,V} := x_{g(n,V)}, we have that {y_{n,V} : (n, V) ∈ D × V_x}, where D × V_x
is directed by the Cartesian direction, is a subnet of {x_n : n ∈ D} that converges to x, for if
U ∈ V_x and m ∈ D, then (n, V) ≥ (m, U) implies that g(n, V) ≥ n ≥ m and y_{n,V} ∈ V ⊂ U.
Theorem 2.3.12. A net {xn : n ∈ D} in a topological space X converges to x ∈ X iff
every subnet converges to x.
Proof. Suppose {xn : n ∈ D} converges to x and let {ym : m ∈ E} be any subnet. For
any V ∈ Vx there is N ∈ D such that n ≥ N implies that xn ∈ V . Let g : E → D as in the
definition of subnet. There is M ∈ E such that m ≥ M implies that g(m) ≥ N . Therefore,
m ≥ M implies that ym = xg(m) ∈ V .
Suppose {x_n : n ∈ D} is a net for which any subnet converges to x. We will show that x_n
converges to x by way of contradiction. If x_n ↛ x, then there is V ∈ V_x such that for any
n ∈ D, there is g(n) ≥ n with x_{g(n)} ∉ V. Hence, {y_n = x_{g(n)} : n ∈ D} is a subnet that does
not converge to x, which is a contradiction.
Example 2.3.13. Let Q ∩ [0, 1] be directed by the natural order in the real line. Then
x(α) = α is a net in [0, 1] which converges to 1. Any number 0 ≤ a < 1 is an accumulation
point of xα but not a cluster point.
Theorem 2.3.14. A point x ∈ \overline{A} iff there is a net {x_n : n ∈ D} ⊂ A such that x_n → x.
Proof. Suppose a net {x_n : n ∈ D} ⊂ A converges to x. Then for any V ∈ V_x, there exists
N ∈ D such that n ≥ N implies that x_n ∈ V. Thus, V ∩ A ≠ ∅ for all V ∈ V_x.
Conversely, suppose that f(x_α) → f(x) for any net x_α → x. If f were not continuous at
x, there would be U ∈ V_{f(x)} such that any V ∈ V_x contains a point x_V with f(x_V) ∉ U.
The net {x_V : V ∈ V_x} converges to x; however, f(x_V) fails to converge to f(x). This is a
contradiction.
The next result states that in first countable Hausdorff topological spaces, it is enough
to consider convergence of sequences to determine closure of sets and continuity of functions.
Theorem 2.3.16. If (X, τ ) is first countable, then:
(i) X is Hausdorff iff any convergent sequence in X has a unique limit.
(ii) A point x ∈ X is a cluster point of a sequence {xn : n ∈ Z+ } iff there exists a
subsequence that converges to x.
(iii) A sequence xn converges to x iff every subsequence converges to x.
(iv) x ∈ \overline{A} iff there is a sequence x_n ∈ A that converges to x.
(v) For any topological space (Y, τ′) and function f : X → Y, f is continuous at x iff
for any sequence x_n → x as n → ∞, f(x_n) → f(x) as n → ∞.
Proof. By hypothesis, any point x ∈ X has a countable local base V_x = {V_n : n ∈ N} and,
by replacing V_n with ⋂_{j=1}^n V_j if necessary, we may assume that V_{n+1} ⊂ V_n for all n ∈ N.
(i) Since any sequence is a net, only sufficiency remains to be proved. Suppose any convergent
sequence in X has a unique limit. Let x and y be points in X and let {V_n : n ∈ N}
and {U_n : n ∈ N} be decreasing local bases at x and y respectively. If V_n ∩ U_n ≠ ∅
for all n ∈ N then we can choose x_n ∈ V_n ∩ U_n. The sequence {x_n : n ∈ N} converges to
both x and y. Therefore, x = y.
(ii) Since a subsequence of a sequence is a subnet of the sequence, only necessity remains
to be proved. Suppose x is a cluster point of the sequence {xn : n ∈ N}. There is n1 ≥ 1
such that xn1 ∈ V1 ∈ Vx . Having found xn1 , . . . , xnk such that n1 < . . . < nk and xnj ∈ Vj
we choose xnk+1 ∈ Vk+1 such that nk+1 ≥ nk + 1, which is possible since x is a cluster point
of {xn : n ∈ N}. Therefore, {xnk : k ∈ N} is a subsequence that converges to x.
(iv) Since any sequence is a net, only necessity remains to be proved. If x ∈ \overline{A} then
V_n ∩ A ≠ ∅ for each V_n ∈ V_x. Choosing x_n ∈ V_n ∩ A for each n ∈ N, we obtain a sequence
x_n → x as n → ∞.
(v) Since any sequence is a net, only sufficiency remains to be proved. Suppose f(x_n) → f(x)
as n → ∞ whenever x_n is a sequence with x_n → x. If f fails to be continuous at x, then
there is a neighborhood U ∈ V_{f(x)} such that for any n ∈ N there is x_n ∈ V_n, V_n ∈ V_x,
with f(x_n) ∉ U. Then x_n is a sequence converging to x for which f(x_n) ↛ f(x). This is a
contradiction.
2.4. Compactness
Definition 2.4.1. A subset K of a topological space (X, τ ) is compact if every open cover
{Ui : i ∈ I} ⊂ τ of K admits a subcover {Ui : i ∈ J} with J ⊂ I finite.
Proof. Suppose X is compact and let ℱ be a collection of closed sets with the finite
intersection property. If ⋂{F : F ∈ ℱ} = ∅, then {X \ F : F ∈ ℱ} is an open cover of X,
and so there is a finite subcover {X \ F_j : F_j ∈ ℱ, j = 1, …, N}. Consequently, ⋂_{j=1}^N F_j = ∅,
which is a contradiction.
Conversely, suppose that every collection of closed subsets of X with the finite intersection property
has nonempty intersection. Let 𝒰 be an open cover of X. Then, ℱ = {X \ U : U ∈ 𝒰} is a
collection of closed sets with empty intersection, so ℱ cannot have the finite intersection
property. Consequently, there exists a finite subcollection ℱ′ ⊂ ℱ with empty intersection.
The collection {U : X \ U ∈ ℱ′} is a finite open subcover.
Example 2.4.3. A closed and bounded interval [a, b] in (R, | |) is compact. More generally,
in (R^n, ‖ ‖_2), any closed bounded box I = [a_1, b_1] × … × [a_n, b_n] is compact.
Proof. Suppose I_0 is a closed bounded box that has an open cover 𝒰 from which no finite
subcover can be extracted. Halve each side of I_0 to obtain 2^n boxes whose sides have the
same length. One of those boxes, say I_1, cannot be covered by a finite subcollection of
𝒰. By induction, we obtain a sequence of nested closed boxes I_{k+1} ⊂ I_k, each of which is
obtained by halving the sides of the previous one, and none of which can be covered by a finite
collection of sets in 𝒰. For each j = 1, …, n, let a_k^j < b_k^j be the endpoints of the j–th
side of the box I_k. Then a_k^j ≤ a_{k+1}^j ≤ b_{k+1}^j ≤ b_k^j and, as b_k^j − a_k^j = 2^{−k}(b_0^j − a_0^j), there
is x_* ∈ I_0 such that lim_k a_k^j = x_*^j = lim_k b_k^j for each j = 1, …, n. Let U ∈ 𝒰 be an open set
that contains x_*. For some ε > 0 the ball B(x_*; ε) is fully contained in U. Then for
all k ∈ N large enough I_k ⊂ B(x_*; ε) ⊂ U. This is a contradiction since no I_k can
be covered by a finite collection of sets in 𝒰.
Theorem 2.4.4. A topological space X is compact iff any net {xn : n ∈ D} in X has a
cluster point. That is, X is compact iff any net in X has a convergent subnet.
Proof. Suppose X is compact and let {x_n : n ∈ D} be a net in X. For each n ∈ D set
A_n = {x_m : m ∈ D, m ≥ n}. Since D is a directed set, the collection of all sets A_n has the
finite intersection property; hence, {\overline{A_n} : n ∈ D} also has the finite intersection property.
By compactness, there exists x ∈ ⋂_{n∈D} \overline{A_n}. From Theorem 2.3.10, it follows that x is a
cluster point of {x_n : n ∈ D}.
Conversely, assume that any net in X has a cluster point or equivalently, that every net in X
has a convergent subnet. Suppose ℱ is a collection of closed sets with the finite intersection
property. Let 𝒢 be the collection of all finite intersections of sets in ℱ and direct it by
inverse inclusion. Since ℱ ⊂ 𝒢, it is enough to show that ⋂{G : G ∈ 𝒢} ≠ ∅. For any
G ∈ 𝒢 choose x_G ∈ G. Then, the net {x_G : G ∈ 𝒢} has a cluster point x ∈ X. We claim
that x ∈ ⋂{G : G ∈ 𝒢}. Indeed, if G ∈ 𝒢 then, for any V ∈ V_x, there is H ∈ 𝒢 with H ⪰ G
(H ⊂ G) such that x_H ∈ H ∩ V ⊂ G ∩ V. Therefore, x ∈ \overline{G} = G.
The next result shows that in second countable Hausdorff spaces, sequences are enough
to determine compactness.
Theorem 2.4.5. Suppose X is a second countable Hausdorff space. X is compact iff
any sequence in X has a convergent subsequence.
In many applications, compactness comes along with the Hausdorff separation property.
In that case, compact sets are also closed sets. The following result offers another link
between these properties.
Lemma 2.4.7. Suppose τ1 ⊂ τ2 are topologies on X. If τ1 is Hausdorff and τ2 is compact,
then τ1 = τ2.
Theorem 2.4.8. (Alexander) Let (X, τ ) be a topological space and let S be a subbase for
τ . X is compact iff every cover of X by sets in S admits a finite subcover.
Proof. Only sufficiency needs to be proved. Suppose that every subbasic cover of X admits
a finite subcover. If X is not compact then the collection 𝔛 of all open covers of X that
do not admit a finite subcover is nonempty. 𝔛 is partially ordered by inclusion. Observe
that the union of a nonvoid chain in 𝔛 is also an open cover of X that belongs to 𝔛. By Zorn's lemma,
𝔛 contains a maximal cover 𝒱. Hence, X = ⋃𝒱 and if U ∈ τ \ 𝒱 then 𝒱 ∪ {U} admits a
finite subcover. Let 𝒲 = 𝒱 ∩ S. Since 𝒲 ⊂ 𝒱, no finite subfamily of 𝒲 covers X.
Consequently, since 𝒲 ⊂ S, 𝒲 does not cover X.
Let x ∈ X \ ⋃𝒲 and choose V ∈ 𝒱 such that x ∈ V. Since S is a subbase, there are
S_1, …, S_n ∈ S such that
    x ∈ ⋂_{j=1}^n S_j ⊂ V.
Since x ∈ X \ ⋃𝒲, we conclude that S_j ∉ 𝒱 for all j = 1, …, n. The maximality of 𝒱
implies that for each 1 ≤ j ≤ n, there is a set A_j which is a finite union of sets in 𝒱 such
that S_j ∪ A_j = X. Hence,
    V ∪ ⋃_{j=1}^n A_j ⊃ (⋂_{j=1}^n S_j) ∪ (⋃_{j=1}^n A_j) ⊃ ⋂_{j=1}^n (S_j ∪ A_j) = X.
The pair (X, d) is called a metric space. The metric d induces a topology τ_d with a base
given by the open balls B_r(x) = {y ∈ X : d(x, y) < r}. A topological space (X, τ) is
metrizable if there is a metric d on X such that τ = τ_d.
(iv) A sequence {xn } ⊂ X is convergent if there is x ∈ X such that, for any ε > 0,
there is N ∈ N so that, d(xn , x) < ε whenever n ≥ N .
(v) A sequence {xn : n ∈ N} ⊂ (X, d) is Cauchy if for any ε > 0, there exists N ∈ N
such that d(xn , xm ) < ε whenever n, m ≥ N .
(vi) (X, d) is said to be complete if any Cauchy sequence is convergent.
are normed spaces. Moreover, (R^n, | |) is separable and hence, it is a Polish space.
Lemma 2.5.7. Let (X, d) be a metric space.
(i) For any x, y ∈ X and A ⊂ X,
|d(x, A) − d(y, A)| ≤ d(x, y).
(ii) If A ⊂ B ⊂ X, then d(x, B) ≤ d(x, A) for all x ∈ X.
(iii) d(x, A) = d(x, \overline{A}). Furthermore, d(x, A) = 0 if and only if x ∈ \overline{A}.
Proof. (i) For any x, y ∈ X and a ∈ A we have that d(x, a) ≤ d(x, y) + d(a, y); thus,
d(x, A)−d(y, A) ≤ d(x, y). Changing the rôles of x and y we obtain that |d(x, A)−d(y, A)| ≤
d(x, y).
(ii) The set that defines d(x, A) is contained in the one that defines d(x, B).
(iii) Since A ⊂ \overline{A}, then by (ii) d(x, \overline{A}) ≤ d(x, A). For any a ∈ \overline{A}, let {a_n} ⊂ A be a sequence
that converges to a. Then, by (i),
    d(x, a) = lim_n d(x, a_n) ≥ d(x, A).
Also, by (i), if d(x, A) = 0 and a_n ∈ A is chosen so that lim_n d(x, a_n) = d(x, A), we conclude
that x ∈ \overline{A}.
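Parts (i) and (ii) of Lemma 2.5.7 can be checked numerically on a small example. The following Python sketch assumes the usual definition d(x, A) = inf{d(x, a) : a ∈ A} (this excerpt does not restate it) and works with finite subsets of R² under the Euclidean metric, where the infimum is a minimum. The names dist_to_set, A, B and points are illustrative only.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two points of R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def dist_to_set(x, A):
    """d(x, A) for a finite nonempty list A: the minimum of d(x, a) over a in A."""
    return min(dist(x, a) for a in A)

random.seed(0)
B = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]
A = B[:20]  # A is a subset of B

points = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(60)]

# (i): x -> d(x, A) is 1-Lipschitz (small slack for floating point rounding)
lipschitz_ok = all(
    abs(dist_to_set(x, A) - dist_to_set(y, A)) <= dist(x, y) + 1e-12
    for x in points
    for y in points
)

# (ii): enlarging the set can only decrease the distance
monotone_ok = all(dist_to_set(x, B) <= dist_to_set(x, A) for x in points)

print(lipschitz_ok, monotone_ok)
```

A random test of this kind proves nothing, of course; it only illustrates the two inequalities on sample data.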
Proof. Let x ∈ (A^δ)^ε, and for any r > 0, let a′ ∈ A^δ be such that d(x, a′) < d(x, A^δ) + r.
Then,
    d(x, A) ≤ d(x, a′) + d(a′, A) < d(x, A^δ) + r + δ ≤ δ + ε + r.
Letting r ↘ 0 implies that x ∈ A^{δ+ε}.
In any given topological space X, countable intersections of open sets are called Gδ sets,
and countable unions of closed sets are called Fσ sets.
Lemma 2.5.9. (Alexandroff) Let X be a subspace of a metric space (Y, d). If X admits
a complete metric ρ compatible with the subspace topology then, X is a Gδ subset of Y.
Conversely, if X is a Gδ set in a complete metric space (Y, d) then, X admits a complete
metric compatible with the subspace topology.
Proof. Suppose that X has a complete metric ρ that generates the subspace topology. We
use diam_ρ and diam_d to denote diameters with respect to ρ and d respectively. For each
n ∈ N let G_n be the collection of open sets V in X with diam_ρ(V) < 1/n. Each V ∈ G_n is of
the form V = U_V ∩ X for some open set U_V in Y. Let W_n = ⋃_{V∈G_n} U_V. Then, W_n is an
open subset of Y and, since X = ⋃G_n, X ⊂ W_n. Notice that X^{1/n} = {y ∈ Y : d(y, X) < 1/n} is
an open set in Y that contains X. We claim that
    X = ⋂_n (X^{1/n} ∩ W_n).
The inclusion X ⊂ ⋂_n (X^{1/n} ∩ W_n) is obvious. Suppose x ∈ ⋂_n (X^{1/n} ∩ W_n). Then d(x, X) = 0,
and so x ∈ \overline{X}. Let {x_m : m ∈ N} ⊂ X be a sequence such that d(x_m, x) → 0 as m → ∞. For each
n choose V_n ∈ G_n such that x ∈ U_{V_n} and diam_ρ(U_{V_n} ∩ X) < 1/n. There is an integer N_n such
Notice that the notion of uniform continuity is metric dependent. For S = R it is natural
to consider ρ(x, y) = |x − y|. Let Ub(X, d) denote the space of bounded real valued uniformly
continuous functions on (X, d).
Theorem 2.5.12. Let (X, d) be any metric space. There exists a complete metric space
(T, ρ) and an isometry f from X to T such that f (X) is dense in T . If (T ′ , ρ′ ) and f ′
satisfy the properties described above, then (T, ρ) and (T ′ , ρ′ ) are isometric.
Proof. Fix x0 ∈ X, and for each x ∈ X define φ_x(y) = d(x, y) − d(x0, y). From the triangle
inequality we obtain that max_{y∈X} |φ_x(y)| = d(x, x0), and max_{y∈X} |φ_x(y) − φ_z(y)| = d(x, z).
Then, the map x ↦ φ_x defines an isometry f from (X, d) into the space (B_b(X), ρ) of
bounded real functions on X equipped with the metric ρ(h, h′) = sup_{y∈X} |h(y) − h′(y)|,
which is complete. This proves that (X, d) is isometric to f(X), which is dense in the
complete metric space (T, ρ) := (\overline{f(X)}, ρ).
Suppose (T′, ρ′) is another complete metric space for which there is an isometry f′ from X
to T′ with \overline{f′(X)} = T′. Then, the map ξ = f′ ∘ f^{-1} : f(X) → f′(X) is clearly an isometry
with respect to the metrics ρ and ρ′. It is easy to extend ξ to an isometry from (T, ρ) onto
(T′, ρ′) by setting ξ(y) = lim_n ξ(y_n) for any sequence y_n in f(X) that converges to y.
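The isometry x ↦ φ_x used in the proof can be tested on a finite metric space, where every supremum is a maximum. The sketch below is only an illustration under assumptions of our own choosing: X is a finite random subset of the real line with d(x, y) = |x − y|, functions on X are stored as tuples of values, and the names phi, rho and max_defect are hypothetical.

```python
import itertools
import random

random.seed(1)
# A finite metric space: 30 points of the real line with d(x, y) = |x - y|
X = [random.uniform(-5, 5) for _ in range(30)]

def d(x, y):
    return abs(x - y)

x0 = X[0]  # the fixed base point of the construction

def phi(x):
    """The function phi_x(y) = d(x, y) - d(x0, y), stored as its values on X."""
    return tuple(d(x, y) - d(x0, y) for y in X)

def rho(h, g):
    """Sup metric between two bounded functions on X (given as tuples)."""
    return max(abs(a - b) for a, b in zip(h, g))

# Largest deviation of rho(phi_x, phi_z) from d(x, z) over all pairs in X
max_defect = max(
    abs(rho(phi(x), phi(z)) - d(x, z)) for x, z in itertools.combinations(X, 2)
)
print(max_defect)  # zero up to floating point rounding
```

The maximum in rho is attained at y = z (or y = x), which is the same observation that gives the second identity in the proof.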
Remark 2.5.13. A metric space (T, ρ) satisfying the conditions of Theorem 2.5.12 is called
a metric completion of (X, d). Observe that if (X, d) is complete, then X is isometric to
its completion, that is, no new points are added to the metric completion.
Lemma 2.5.14. Let (X, d) be a metric space for which any sequence (xn ) ⊂ X has a
convergent subsequence. For any open cover A of X, there exists a number δ > 0 such that
if C ⊂ X and diam(C) < δ, then C ⊂ A for some A ∈ A .
Proof. Suppose that any sequence in (X, d) admits a convergent subsequence. If X is not
totally bounded, then there exists ε > 0 such that every finite collection of discs of
radius ε fails to cover X. Let x_1 ∈ X be arbitrary. As B(x_1; ε) does not cover X, there
is x_2 ∈ X \ B(x_1; ε). By induction, we can construct a sequence {x_n : n ∈ N} such that
x_n ∈ X \ ⋃_{k=1}^{n−1} B(x_k; ε). Since d(x_n, x_m) ≥ ε for all m ≠ n, the sequence {x_n} has no
convergent subsequence. This is a contradiction.
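The inductive construction in this proof is a greedy selection: each new point is chosen outside the ε–balls around the points already selected. On a totally bounded set the selection must stop, and the selected centers then form an ε–net. The Python sketch below illustrates this on a finite random sample of the unit square with the Euclidean metric; the sample, the value of eps and the name greedy_eps_net are assumptions made for the illustration.

```python
import math
import random

def greedy_eps_net(points, eps):
    """Select points one at a time, keeping a point only if it lies
    outside the open eps-balls around all previously selected points."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

random.seed(2)
cloud = [(random.random(), random.random()) for _ in range(2000)]
eps = 0.2
centers = greedy_eps_net(cloud, eps)

# The centers are pairwise eps-separated ...
separated = all(
    math.dist(a, b) >= eps
    for i, a in enumerate(centers)
    for b in centers[i + 1:]
)
# ... and every sample point lies in the eps-ball around some center.
covered = all(any(math.dist(p, c) < eps for c in centers) for p in cloud)
print(len(centers), separated, covered)
```

In the proof, the failure of total boundedness is exactly what allows the selection to continue forever, producing an ε–separated sequence with no convergent subsequence.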
Theorem 2.5.17. Let (X, d) be a metric space. X is compact if and only if any sequence
{xn } ⊂ X has a convergent subsequence.
Proof. Assume that X is compact. If the sequence (xn ) ⊂ X has a finite number of
elements the conclusion follows easily. Assume that (xn ) has an infinite number of elements
and that it does not admit a convergent subsequence. Then any x ∈ X has an open
neighborhood Ux which contains at most one element of {xn }. By compactness, there is a
finite subcover of {Ux : x ∈ X}. This implies that {xn } is finite, contradiction.
Proof. Necessity follows from Lemma 2.5.16 and Theorem 2.5.17. To show sufficiency, let
(x_n) be a sequence in X. Total boundedness implies that there is a ball B_1 of radius
one which contains infinitely many elements of {x_n}. Let x_{n_1} be one such element.
Proceeding by induction, we obtain a strictly increasing sequence n_j of integers and balls
B_j of radius 1/j such that x_{n_j} ∈ B_1 ∩ · · · ∩ B_j, and B_1 ∩ · · · ∩ B_j contains infinitely
many elements of (x_n). Clearly (x_{n_j}) is a Cauchy sequence which, by completeness of X,
converges. This shows that every sequence in X has a convergent subsequence.
The following result has wide theoretical and practical applications in many areas.
Proof. First we show uniqueness. Suppose x∗ and y∗ are fixed points of f . Then d(x∗ , y∗ ) =
d(f (x∗ ), f (y∗ )) ≤ θd(x∗ , y∗ ). Since 0 < θ < 1, it follows that d(x∗ , y∗ ) = 0.
Theorem 2.6.3. (Caccioppoli fixed point theorem) Let (X, d) be a complete metric space.
Suppose f : X → X has the property that for each n ∈ N, there exists c_n > 0 such that
(a) d(f^n(x), f^n(y)) ≤ c_n d(x, y) for all x, y ∈ X.
(b) Σ_n c_n < ∞.
Then f has a unique fixed point in X.
Proof. Fix x_0 ∈ X and set x_n = f(x_{n−1}) for all n ≥ 1. From (a), f is Lipschitz continuous
and for all n > m
(2.2)   d(x_n, x_m) ≤ Σ_{j=m}^{n−1} d(x_{j+1}, x_j) = Σ_{j=m}^{n−1} d(f^j(x_1), f^j(x_0)) ≤ d(x_1, x_0) Σ_{j=m}^{n−1} c_j.
Condition (b) implies that the sequence of sums s_n := Σ_{j=1}^n c_j is Cauchy. By (2.2) we
conclude that {x_n : n ∈ N} is a Cauchy sequence in X, and so it converges to some point
x_* ∈ X. By continuity f(x_*) = lim_n f(x_n) = lim_n x_{n+1} = x_*.
Uniqueness follows from (a), for if f(y) = y then d(x_*, y) = d(f^n(x_*), f^n(y)) ≤ c_n d(x_*, y).
Since c_n → 0 by (b), letting n → ∞ shows that x_* = y.
Remark 2.6.4. Banach's contraction principle follows from Caccioppoli's theorem by taking
c_n = θ^n, where θ is the contraction coefficient.
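The proofs above are constructive: the fixed point is the limit of the iterates x_n = f(x_{n−1}). As an illustration, consider f = cos on the complete metric space R. Its best Lipschitz constant is c_1 = 1, so f is not a contraction on R; however, |(cos ∘ cos)′(x)| = |sin(cos x) sin x| ≤ sin 1 < 1, so that c_n ≤ (sin 1)^⌊n/2⌋ and condition (b) of Theorem 2.6.3 holds. The Python sketch below runs the iteration; the function name iterate_to_fixed_point, the starting point and the tolerance are arbitrary choices made for the example.

```python
import math

def iterate_to_fixed_point(f, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_n = f(x_{n-1}) until two consecutive iterates differ by
    less than tol; return the last iterate and the number of steps used."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# f = cos satisfies (a) and (b) with c_n <= (sin 1) ** (n // 2)
x_star, steps = iterate_to_fixed_point(math.cos, x0=3.0)
print(x_star, steps)
```

By uniqueness, any other starting point leads to the same limit, the unique solution of cos x = x.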
2.7. Uniformities
Definition 2.7.1. Let X be a non empty set and D a collection of pseudo-metrics on X.
The collection of sets of the form B(x; d, ε) = {y ∈ X : d(x, y) < ε} with d ∈ D and ε > 0,
defines a subbase for a topology τ(D) which we call the D–uniform topology.
If (Y, ρ) is a metric or pseudo-metric space, then a function f : X → Y is said to be
D–uniformly continuous if for any ε > 0 there are pseudo-metrics d1 , . . . , dn ∈ D and
δ > 0 such that max1≤j≤n dj (x, z) < δ implies ρ(f (x), f (z)) < ε.
Remarks 2.7.2. Suppose D is a collection of pseudometrics on X.
(i) If D separates points, that is, if sup_{d∈D} d(x, y) = 0 implies x = y, then τ(D) is
Hausdorff. Indeed, if x ≠ y then r := d(x, y) > 0 for some d ∈ D. The sets
B(x; d, r/2) and B(y; d, r/2) are disjoint neighborhoods of x and y respectively.
(ii) If f is D–uniformly continuous, then f is continuous on (X, τ(D)). Indeed, let
x0 ∈ X and set y0 = f(x0). For any ε > 0 there is δ > 0 and a pseudo-metric
d′ = max_{1≤j≤n} d_j, where d_j ∈ D, such that ρ(f(x), f(y)) < ε whenever d′(x, y) < δ.
Hence f(B(x0; d′, δ)) ⊂ B(y0; ρ, ε) and, since B(x0; d′, δ) is a neighborhood of x0
in τ(D), the continuity of f follows.
A net {xα : α ∈ A} ⊂ X is said to be a Cauchy net with respect to τ (D) iff for any
ε > 0 and pseudo-metrics d1 , . . . , dn ∈ D, there is α0 ∈ A such that α ≥ α0 and α′ ≥ α0
imply that max1≤j≤n dj (xα , xα′ ) < ε. The space (X, τ (D)) is complete iff any Cauchy net
is convergent.
Lemma 2.7.3. Suppose (Y, ρ) is a metric space. Y is complete iff any Cauchy net in Y
converges.
Proof. Since any sequence is a net, it remains to show that completeness implies the
convergence of every Cauchy net. Suppose (Y, ρ) is
complete and let {y_α : α ∈ A} be a Cauchy net. For each α ∈ A let A_α = {y_β : β ∈ A, β ≥
α}. For any n ∈ N there is α_n ∈ A such that diam(A_{α_n}) < 1/n. Let B_n = ⋂_{k=1}^n A_{α_k}. Since
A is a directed set, there exists ŷ_n ∈ B_n for each n. Consequently, {ŷ_n : n ∈ N} is a Cauchy
sequence in Y, and so it converges to a point y ∈ Y. For any ε > 0 let N > 2/ε be such that
ρ(ŷ_n, y) < ε/2 for all n ≥ N. If α ∈ A with α ≥ α_N,
    ρ(y_α, y) ≤ ρ(y_α, ŷ_N) + ρ(ŷ_N, y) ≤ 1/N + ε/2 < ε,
which shows that y_α → y.
Theorem 2.7.4. Suppose (X, τ (D)) is a D–uniform space, (E, ρ) is a complete metric
space and S ⊂ X is dense in X. If f : S → E is D–uniformly continuous, then there exists
a unique continuous extension f̂ of f to X.
There is α0 ∈ A for which α ≥ α0 implies max_{1≤j≤n} d_j(x_α, x) < δ/2. Hence {f(x_α) : α ∈ A}
is a Cauchy net in (E, ρ) and since E is complete, there is a unique y ∈ E such that
f(x_α) → y. If {y_β : β ∈ B} ⊂ S is another net converging to x, then there is β0 ∈ B
for which β ≥ β0 implies max_{1≤j≤n} d_j(y_β, x) < δ/2. Hence ρ(f(x_α), f(y_β)) < ε whenever α ≥ α0 and
β ≥ β0, which leads to lim_α f(x_α) = y = lim_β f(y_β). Consequently, f̂(x) = lim_α f(x_α) for
any net x_α → x is a well defined function which extends f to all of X and which is D–uniformly
continuous. Indeed, given ε > 0, there are δ > 0 and pseudometrics d_1, …, d_n ∈ D such
that (2.3) holds. Write d = max_{1≤j≤n} d_j. If d(x, y) < δ/3, x_α → x and y_β → y, then for some
α and β, d(x, x_α) ∨ d(y_β, y) < δ/3 and ρ(f̂(x), f(x_α)) ∨ ρ(f̂(y), f(y_β)) < ε/3. Hence
    d(x_α, y_β) ≤ d(x_α, x) + d(x, y) + d(y, y_β) < δ,
which implies
    ρ(f̂(x), f̂(y)) ≤ ρ(f̂(x), f(x_α)) + ρ(f(x_α), f(y_β)) + ρ(f(y_β), f̂(y)) < ε.
It follows that f̂ is D–uniformly continuous.
Proof. It is enough to show that any basic open set ⋂_{j=1}^n p_{i_j}^{-1}(U_{i_j}), where U_{i_j} is open in
X_{i_j}, contains an element of D. Choose y_{i_j} ∈ U_{i_j} for j = 1, …, n. Let x be the point defined by
x(i_j) = y_{i_j} for j = 1, …, n, and x(i) = x0(i) otherwise. Clearly x ∈ D.
Theorem 2.8.3. Let (X, τX ) and (Y, τY ) be topological spaces. Suppose that τY is Haus-
dorff. If f : X → Y is continuous then, Graphf := {(x, f (x)) : x ∈ X} is closed in
(X, τX ) × (Y, τY ).
Proof. Let (x, y) ∈ (X × Y) \ Graph f. Then y ≠ f(x) and there are open sets U, V ∈ τ_Y
such that y ∈ U, f(x) ∈ V and U ∩ V = ∅. By continuity there is W ∈ τ_X with x ∈ W such
that f(W) ⊂ V. It follows that W × U is an open neighborhood of (x, y) in (X, τ_X) × (Y, τ_Y)
such that (W × U) ∩ Graph f = ∅.
Lemma 2.8.4. Let {(X_i, τ_i) : i ∈ I} be a collection of topological spaces and let (X, τ) be its
product space. Given a topological space (Y, τ_Y) and a function f : Y → ∏_{i∈I} X_i, we have
that f is continuous iff p_i ∘ f : Y → X_i is continuous for each i ∈ I.
is continuous.
Example 2.8.7. The set N, as a subspace of the Euclidean space R, has the discrete
topology, where any subset of N is open in N. Recall that any positive integer n admits a
unique decomposition as n = 2^{α−1}(2β − 1), where α, β ∈ N. Then, the integer-valued maps
α(n) = α and β(n) = β are clearly continuous. Similarly, the map φ(α, β) = 2^{α−1}(2β − 1)
on N² is continuous. The map Φ : N → N^N given by
    Φ(n) = φ(n, ·) : m ↦ φ(n, m)
is continuous. Indeed, if p_m is the projection map in N^N onto the m–th component, we have
that (p_m ∘ Φ)(n) = 2^{n−1}(2m − 1) is continuous.
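The decomposition n = 2^{α−1}(2β − 1) is easy to compute: α − 1 is the number of times 2 divides n, and 2β − 1 is the remaining odd factor. The following Python sketch implements φ and its inverse and checks on an initial segment of N that they are mutually inverse. The function names phi and phi_inverse are illustrative.

```python
def phi(alpha, beta):
    """phi(alpha, beta) = 2^(alpha - 1) * (2 * beta - 1) for alpha, beta >= 1."""
    return 2 ** (alpha - 1) * (2 * beta - 1)

def phi_inverse(n):
    """Return the unique pair (alpha, beta) with phi(alpha, beta) = n, n >= 1."""
    alpha = 1
    while n % 2 == 0:      # strip the factors of 2
        n //= 2
        alpha += 1
    beta = (n + 1) // 2    # what is left is the odd number 2 * beta - 1
    return alpha, beta

# phi is a bijection between N x N and N (checked on a finite range)
round_trip = all(phi(*phi_inverse(n)) == n for n in range(1, 10_000))
pairs_ok = all(
    phi_inverse(phi(a, b)) == (a, b) for a in range(1, 12) for b in range(1, 50)
)
print(round_trip, pairs_ok, phi_inverse(12))
```

For instance, 12 = 2²·3 = 2^{3−1}(2·2 − 1), so phi_inverse(12) is the pair (3, 2).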
Lemma 2.8.8. For any topological space X, the product spaces X^N and (X^N)^N are
homeomorphic.
Proof. Let φ be as in Example 2.8.7. Define the function G : X^N → (X^N)^N by
    x = (x(n) : n ∈ N) ↦ ((x ∘ φ)(n, ·) : n ∈ N).
It can be seen that G is a bijection with inverse given by
    (ξ(n, ·) : n ∈ N) ↦ ((ξ ∘ φ^{-1})(n) : n ∈ N).
For any n ∈ N denote by p_n and π_n the projections in X^N and (X^N)^N respectively, onto
the corresponding n–th component. Let M_n := {φ(n, m) : m ∈ N}. Notice that π_n ∘ G
is the projection from X^N onto X^{M_n}, and so π_n ∘ G is continuous. Conversely, suppose
n = φ(α_n, β_n). Then p_n ∘ G^{-1} = p_{β_n} ∘ π_{α_n}, which is continuous. The conclusion follows.
Example 2.8.9. Consider the spaces N and {0, 1} as subspaces of R. By Lemma 2.8.8,
the product spaces {0, 1}^N and ({0, 1}^N)^N are homeomorphic. Notice that although both
N and {0, 1} have the discrete topology, the product spaces N^N and {0, 1}^N are not discrete.
Theorem 2.8.10. If {(X_i, τ_i) : i ∈ I} is a family of connected topological spaces, then the
product space X := ∏_{i∈I} X_i is connected.
Proof. Fix y^0 = (y^0_i : i ∈ I) ∈ X. By induction, we prove that if a point y^(n) differs from y^0
in only n components, then there is a connected set that contains both y^0 and y^(n). For
n = 1, suppose y^0_{i_1} ≠ y^(1)_{i_1} and y^0_i = y^(1)_i otherwise. The slice S^0_{i_1} = {x ∈ X : x_i = y^0_i, i ≠ i_1}
contains y^0 and y^(1), and being homeomorphic to X_{i_1}, it is connected by Theorem 2.2.10.
Suppose the claim is valid for k = 1, …, n − 1. Suppose y^(n) differs from y^0 in exactly
n components. Let y^(n−1) be a point that differs from y^0 in n − 1 components, and from y^(n)
in only one component. By the induction hypothesis, there are connected sets A_0 and A_1
that contain y^(n−1) and y^0, and y^(n−1) and y^(n) respectively. Hence, A_0 ∪ A_1 is connected
and contains y^0, y^(n−1), and y^(n). This completes the proof of the claim.
Let D be the set of all points in X that differ from y^0 in only finitely many components.
The claim along with Theorem 2.2.4 implies that D is connected. As D is dense, X = \overline{D} is
connected.
Example 2.8.11. Let [0, 1] be the unit interval with the topology inherited from the Euclidean
topology on R. For any set I, [0, 1]^I with the product topology is connected.
Theorem 2.8.12. Let {(Xi , τi ) : i ∈ I} be a collection of topological spaces.
(i) The product topology τ on $X = \prod_{i \in I} X_i$ is Hausdorff iff each τi is a Hausdorff topology.
(ii) If I is countable and each (Xi , τi ) is second countable, then (X, τ ) is second countable.
Proof. (i) Suppose each (Xi , τi ) is Hausdorff. Let $x, y \in X = \prod_{i \in I} X_i$, and assume x ≠ y.
Then $x_i \neq y_i$ for some i ∈ I. There are open neighborhoods U, V ∈ τi of $x_i$ and $y_i$
respectively such that U ∩ V = ∅. Then $x \in p_i^{-1}(U)$, $y \in p_i^{-1}(V)$ and
$p_i^{-1}(U) \cap p_i^{-1}(V) = p_i^{-1}(U \cap V) = \emptyset$.
Conversely, suppose (X, τ ) is Hausdorff. For each i ∈ I choose a slice Si as in Remark 2.8.6.
Being a subspace of X, Si is Hausdorff. Since Xi and Si are homeomorphic, we conclude
that Xi is Hausdorff.
Proof. As dn and dn ∧ 1 generate the same topology on Xn , we will assume without loss
of generality that dn ≤ 1. It is easy to check that ρ is a metric on X. To check that ρ is
compatible with the product topology τ we first show that any open ball B(x0 ; r) belongs
to τ . Suppose $\rho(x_0, x) < r$ and set $r^* = \rho(x, x_0) \wedge (r - \rho(x, x_0))$. Let N ∈ N be large enough
so that $\frac{1}{2^N} < \frac{r^*}{2}$. Then, the set
$$U = \{y \in X : d_n(x_n, y_n) < \tfrac{r^*}{2}, \ n = 1, \ldots, N\}$$
is open in τ , and for any y ∈ U
$$\rho(x, y) = \sum_{n=1}^{N} \frac{d_n(x_n, y_n)}{2^n} + \sum_{n>N} \frac{d_n(x_n, y_n)}{2^n}
\leq \sum_{n=1}^{N} \frac{d_n(x_n, y_n)}{2^n} + \frac{r^*}{2} < \frac{r^*}{2} + \frac{r^*}{2}.$$
This shows that $U \subset B(x; r^*) \subset B(x_0; r)$.
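The tail estimate used above (the terms with n > N contribute at most $2^{-N}$ once each $d_n \leq 1$) can be checked numerically. The following is a minimal sketch, assuming every factor is R with the metric |a − b| truncated at 1; the function name `rho_partial` and the sample points are illustrative choices, not part of the notes.

```python
def rho_partial(x, y, N):
    """First N terms of rho(x, y) = sum_{n>=1} min(|x_n - y_n|, 1) / 2**n."""
    return sum(min(abs(x(n) - y(n)), 1.0) / 2 ** n for n in range(1, N + 1))

# Three sample points of R^N, given as functions n -> n-th coordinate.
x = lambda n: 0.0
y = lambda n: 1.0 / n
z = lambda n: float(n)

N = 20
tail_bound = 2.0 ** (-N)          # bound for the sum over n > N
gap = rho_partial(x, z, 2 * N) - rho_partial(x, z, N)
```

Here `gap` is the contribution of the coordinates N < n ≤ 2N, which stays below `tail_bound`; this is why only finitely many coordinates need to be controlled to stay inside a ρ–ball.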
Proof. Define
$$d(x, y) = \sum_{n \geq 0} \frac{|f_n(x) - f_n(y)| \wedge 1}{2^n}, \qquad x, y \in X. \tag{2.5}$$
Since {fn } separates points in X, d is a metric on X. Since each fn is continuous and the
sum in (2.5) is uniformly convergent on X ×X, the metric d : X ×X → [0, ∞) is continuous.
Therefore, B(x; r) = {y ∈ X : d(x, y) < r} ∈ τ for all x ∈ X and r > 0. Consequently
τd ⊂ τ . Since τd is Hausdorff and τ is compact, from Lemma 2.4.7 we conclude that τd = τ .
Theorem 2.9.2. (Urysohn metrization theorem) Let (X, τ ) be a Hausdorff topological
space. X is metrizable and separable iff X is regular and second countable. In either
case, X is homeomorphic to a subset of [0, 1]N.
Finally, let d be a metric on [0, 1]N compatible with the product topology. Then ρ(x, y) :=
d(F (x), F (y)) metrizes (X, τ ).
Corollary 2.9.3. Let (X, ρ) be a separable metric space. Then, there is an equivalent metric
ρ̃ on X and an isometry $h : (X, \tilde\rho) \to [0, 1]^{\mathbb N}$. Moreover, the spaces $(U_b(X, \tilde\rho), \|\cdot\|_u)$ and
$(C_b(\overline{h(X)}), \|\cdot\|_u)$ are isometric.
Proof. Let d be a metric on $[0, 1]^{\mathbb N}$ compatible with the product topology. Then $([0, 1]^{\mathbb N}, d)$ is a
compact Polish space, and by Urysohn’s metrization theorem, (X, ρ) is homeomorphic to
some subset U of $[0, 1]^{\mathbb N}$. Let h be such a homeomorphism. Then ρ̃(x, y) := d(h(x), h(y))
defines a metric on X equivalent to ρ, $h : (X, \tilde\rho) \to [0, 1]^{\mathbb N}$ is an isometry, and (X, ρ̃) and
(h(X), d) are isometric.
Let $\widetilde X = \overline{h(X)}$, where the closure is taken with respect to the product topology on $[0, 1]^{\mathbb N}$. It
follows that $C_b(\widetilde X) \equiv U_b(\widetilde X)$, and the map $\Phi : U_b(\widetilde X, d) \to U_b(X, \tilde\rho)$ given by $f' \mapsto f' \circ h \in
U_b(X, \tilde\rho)$ satisfies $\|f'\|_u = \|\Phi(f')\|_u$. If $f \in U_b(X, \tilde\rho)$, then $f' = f \circ h^{-1} \in U_b(h(X), d)$, and
′ ′
Corollary 2.9.4. Every Polish space (X, ρ) is homeomorphic to a Gδ subset of [0, 1]N.
Proof. Let d be a metric on $[0, 1]^{\mathbb N}$ that metrizes the product topology. Then $([0, 1]^{\mathbb N}, d)$ is
a compact Polish space. By Urysohn’s metrization theorem X is homeomorphic to a subset
U of $[0, 1]^{\mathbb N}$. Let h be such a homeomorphism. Then $\tilde d(h(x), h(y)) := \rho(x, y)$ metrizes
U = h(X) as a subspace of $[0, 1]^{\mathbb N}$. By Alexandroff’s lemma, U is a Gδ subset of $[0, 1]^{\mathbb N}$.
Theorem 2.9.5. The continuous image of a compact metric space into a Hausdorff space
is compact and metrizable.
Proof. This is a consequence of Theorems 2.4.6, 2.2.10, 2.9.5, and Corollary 2.2.15.
Theorem 2.9.7. Suppose X is a compact Hausdorff space. If $\mathcal C = \{C_\alpha : \alpha \in I\}$ is
a collection of continua contained in X that is totally ordered by inclusion, then
$\bigcap_{\alpha \in I} C_\alpha$ is a nonempty continuum.
Proof. Since X is compact Hausdorff and $\mathcal C$ has the finite intersection property, $C = \bigcap_{\alpha \in I} C_\alpha$ is
nonempty and compact. Suppose C = A ∪ B, where A and B are nonempty disjoint closed
sets in C. Then A and B are disjoint compact subsets of X; consequently, there are disjoint
open sets U and V in X such that A ⊂ U and B ⊂ V . It follows that for any α ∈ I,
$C_\alpha \cap U$ and $C_\alpha \cap V$ are disjoint nonempty open sets in $C_\alpha$. Since each $C_\alpha$ is connected,
$K_\alpha := C_\alpha \cap (X \setminus (U \cup V)) \neq \emptyset$. Clearly $\{K_\alpha : \alpha \in I\}$ is a collection of compact subsets
of X which is totally ordered by inclusion. Hence $\bigcap_{\alpha \in I} K_\alpha = C \cap (X \setminus (U \cup V)) \neq \emptyset$;
however, C ⊂ U ∪ V and we reach a contradiction. Therefore, C is connected.
Definition 2.9.8. Suppose X is a T1 connected space. A point p ∈ X is called a cut point
of X if X \ {p} = A ∪ B where A and B are nonempty separated sets. All other points
are called noncut points.
Example 2.9.9. Every point in the unit interval [0, 1], with the exception of {0, 1}, is a
cut point. No point in the circle S1 is a cut point.
Lemma 2.9.10. If X is T1 compact connected, p ∈ X is a cut point and X \ {p} = A ∪ B
where A and B are separated, then A and B each contain a noncut point. In particular, if
X has more than one element, then it has at least two noncut points.
Proof. Suppose that each point x ∈ A is a cut point and induces the separation X \ {x} =
Ax ∪ Bx with p ∈ Bx . Since X is T1 , both Ax and Bx are open in X. The set B ∪ {p} is
connected by Theorem 2.2.9 and intersects Bx at p; hence
(2.6) B ∪ {p} ⊂ Bx , Ax ∪ {x} ⊂ A
If x, y ∈ A and y ∈ Ax , then x 6= y and Bx ∪ {x} ⊂ X \ {y} = Ay ∪ By . Since p ∈
(Bx ∪ {x}) ∩ By and Bx ∪ {x} is connected,
(2.7) Bx ∪ {x} ⊂ By , Ay ∪ {y} ⊂ Ax
The collection $\{A_x \cup \{x\} : x \in A\}$ is partially ordered by inclusion and, by Hausdorff’s
maximal principle, it contains a maximal chain $\mathcal L$. Since X is compact and $\mathcal L$ is a collection
of closed subsets that has the finite intersection property, $K = \bigcap \mathcal L$ is nonempty. If q ∈ K,
then $A_q \subset A$ as in (2.6). If $r \in A_q$, then $A_r \cup \{r\} \subset A_q$ but $q \notin A_r \cup \{r\}$ as in (2.7). This
implies that $A_r \cup \{r\} \notin \mathcal L$. On the other hand, if $A_x \cup \{x\} \in \mathcal L$ and x ≠ q, then $q \in A_x$ in
which case $A_q \cup \{q\} \subset A_x \cup \{x\}$, that is
$$A_r \cup \{r\} \subset A_x \cup \{x\}$$
for any $A_x \cup \{x\} \in \mathcal L$. Consequently, $\{A_r \cup \{r\}\} \cup \mathcal L$ is a chain that contains $\mathcal L$ properly,
contradicting the maximality of $\mathcal L$. The contradiction arose from assuming that all points
in A were cut points; therefore, A contains a noncut point. Applying a similar argument
to B shows that B contains a noncut point too.
Suppose X has more than one element. If X has no cut points, then all its elements are
noncut points. If X has a cut point p and X \ {p} = A ∪ B is a separation, then by the
first part of the lemma, each of A and B contains a noncut point. Since A and B are disjoint, X has
at least two noncut points.
Proof. Let N be the noncut points of a continuum X. Suppose there is a proper subcontin-
uum K such that N ⊂ K. Let x ∈ X \ K. Then, x is a cut point of X and X \ {x} = A ∪ B
for some nonempty separated sets A and B. Since K ⊂ X \ {x} and K is connected, K must be contained in A
or in B. Without loss of generality, suppose K ⊂ A. Since B ∪ {x} is closed and connected,
B ∪ {x} is also a proper subcontinuum; hence, it has at least two noncut points, one of
which, say y, is different from x. Then (B ∪ {x}) \ {y} and A ∪ {x} are connected and
contain x; thus,
$$\big((B \cup \{x\}) \setminus \{y\}\big) \cup \big(A \cup \{x\}\big) = X \setminus \{y\}$$
is connected. This means that y is a noncut point of X, but y ∈ B and B ⊂ X \ K ⊂ X \ N ,
which is a contradiction.
Given a connected set X and points a, b ∈ X, we define the set E(a, b) as the set
consisting of a and b, and all the cut points x ∈ X for which there is a separation X \ {x} =
A ∪ B where a ∈ A and b ∈ B. The latter points are said to separate a and b.
Lemma 2.9.12. Suppose (X, τ ) is a T1 connected space and a, b ∈ X. On E(a, b) define
x < y iff either x = a and x ≠ y, or x separates a and y. Then,
(i) (E(a, b), <) is totally ordered.
(ii) The order topology τo on E(a, b) is weaker than the subspace topology on E(a, b)
induced by τ .
Proof. (i) For each point p ∈ E(a, b) \ {a, b}, we will use the notation Ap and Bp for
separated sets such that X \ {p} = Ap ∪ Bp and a ∈ Ap and b ∈ Bp .
If x and y are distinct points of E(a, b) \ {a, b}, then either x ∈ Ay or x ∈ By . If the former,
x ∈ Ay , then By ∪ {y} is a connected subset of X \ {x} = Ax ∪ Bx . Since b ∈ (By ∪ {y}) ∩ Bx ,
By ∪ {y} ⊂ Bx . Then y ∈ Bx and so, x separates a and y, that is, x < y. If the latter,
the same argument with the roles of x and y exchanged shows that y separates a and x, and so y < x.
Suppose that x, y, z ∈ E(a, b) and x < y and y < z. Then y ∉ {a, b}, and x ∈ Ay . Hence
By ∪ {y} is a connected subset of X \ {x}. Consequently By ∪ {y} ⊂ Bx . A similar argument
shows that Bz ∪ {z} ⊂ By and so, Bz ∪ {z} ⊂ Bx . Hence z ∈ Bx which means that x < z.
Finally, since no point x separates a from itself, (E(a, b), <) is a totally ordered space.
(ii) Since {y} is closed, both Ay and By are open in X. This is enough to conclude that τo is
weaker than the subspace topology on E(a, b) induced by τ .
Theorem 2.9.13. If (X, τ ) is a continuum with exactly two noncut points, say a and b,
then X = E(a, b) and the subspace topology coincides with the order topology.
Proof. If x ∉ {a, b}, then x is a cut point and there is a separation X \ {x} = U ∪ V . By
Lemma 2.9.10, each of the open sets U and V contains a noncut point. Hence, either a ∈ U and b ∈ V
or a ∈ V and b ∈ U . In either case, x separates a and b, i.e., x ∈ E(a, b). Hence X = E(a, b).
By Lemma 2.9.12, the order topology τo is weaker than τ . Since (X, τ ) is compact and τo
is Hausdorff, τo = τ by Lemma 2.4.7.
Lemma 2.9.14. (Debreu) Suppose (X, τ ) is a topological space, and that X is totally
ordered and that the order topology τo ⊂ τ .
(i) (Open gap lemma) If X is second countable, then there exists a bounded strictly in-
creasing continuous function f : X → R. Furthermore, each connected component
of R \ f (X) is a singleton, a bounded open interval, or an infinite interval.
(ii) If (X, τ ) is connected and has a countable dense set, then X is homeomorphic to an
interval in R. If X has a first element (last element), then the interval is left–closed
(resp. right–closed).
Proof. (Ouwehand) (i) Suppose B is a countable basis for (X, τ ). For each V ∈ B choose
zV ∈ V and set Z ′ = {zV : V ∈ B}. Z ′ is dense in τ and also in τo since τo ⊂ τ .
Consider the collection C of all pairs (a, b) ∈ X × X for which a < b and {z : a < z < b} = ∅.
This collection is countable. To check this, for each (a, b) ∈ C choose $V_{a,b} \in \mathcal B$ such that
$a \in V_{a,b} \subset \{x : x < b\}$. If (a, b) ≠ (a′ , b′ ) then either a < b ≤ a′ < b′ in which case $a \in V_{a,b}$
but $a' \notin V_{a,b}$, or a′ < b′ ≤ a < b in which case $a' \in V_{a',b'}$ but $a \notin V_{a',b'}$. Hence, there is
an injective map from C into B and so, C is countable. Let Z ′′ = {a, b : (a, b) ∈ C} and
Z = Z ′ ∪ Z ′′ . If X has a minimum, say a, and/or a maximum, say b, we assume that they are
included in Z. The set Z thus constructed is dense in the order topology τo .
To prove that f is continuous, it suffices to show that f −1 (−∞, p) and f −1 (p, ∞) are open
in (X, τ0 ) for any p ∈ R. Fix p ∈ R and set A = f −1 (−∞, p) and B = f −1 (p, ∞). If p = f (x)
for some x ∈ X, then A = {u : u < x} and B = {u : x < u} since f is strictly increasing.
Suppose p ∈ / f (X). If A is empty, B = X. Similarly, if B is empty, A = X. If A and B are
not empty, then A′ = A ∩ Z and B ′ = B ∩ Z are nonempty since f = g, Z = A′ ∪ B ′ , and
a < b whenever (a, b) ∈ A′ × B ′ . In view of Claim I, it must be the case that either A′ has
a maximum, say a0 , and B ′ has a minimum, say b0 , or A has no maximum and B has no
minimum. In the former case, A = {u : u < b0 } and B = {u : a0 < u} since p ∉ f (X); in
the latter case, $\sup_A f = \sup_{A'} f = p = \inf_{B'} f = \inf_B f$ and so, $A = \bigcup_{x \in A} \{u : u < x\}$ and
To conclude the proof of (i), it is enough to show that any bounded component G of
R \ g(X) is an open interval. Let ℓ = inf G and u = sup G and assume ℓ < u. Define
C = {z ∈ Z : f (z) ≤ ℓ} and D = {z ∈ Z : u ≤ f (z)}. Clearly C and D are nonempty,
Z = C ∪ D, and for any (c, d) ∈ C × D, c ≤ d. Since supC f ≤ ℓ < u ≤ inf D f , by Claim
I we have that C has a maximum element c0 and D has a minimum element d0 . Hence
f (c0 ) = ℓ < u = f (d0 ). Consequently ℓ, u ∈ f (X) which means that G = (ℓ, u).
The next result characterizes all metric continua with exactly two noncut points.
Theorem 2.9.15. If X is a metric continuum with just two noncut points, then X is
homeomorphic to the unit interval I = [0, 1] with the usual topology.
Proof. Suppose a and b are the only two noncut points. Then X = E(a, b), and the order
topology on E(a, b) coincides with the metric topology by Theorem 2.9.13. With respect to
the order, a is the minimum element and b is the maximum element of X. As X is a compact
metric space, it has a countable base. The conclusion follows from Debreu’s lemma.
Proof. Suppose a and b are distinct points in X. By Lemma 2.2.19, there exists a simple
chain $\mathcal C_1 = \{U_{1,0}, \ldots, U_{1,n}\}$ from a to b made of open connected sets of diameter < 1. Let
$\mathcal V$ be the collection of all open sets V of X that have diameter < 1/2 and whose closure $\overline V$ is contained
in one of the links in $\mathcal C_1$. For each j = 1, . . . , n choose $x_j \in U_{1,j-1} \cap U_{1,j}$, and set $x_0 = a$
and $x_{n+1} = b$. Another application of Lemma 2.2.19 shows that for each j = 0, . . . , n, there
is a chain $\mathcal C(x_j, x_{j+1})$ from $x_j$ to $x_{j+1}$ of sets in $\mathcal V$ that are fully contained in $U_{1,j}$. We will
construct a simple chain from a to b as follows. In the simple chain $\mathcal C(a, x_1)$ from a to $x_1$, let
V ′ be the first link that intersects some link in the chain $\mathcal C(x_1, x_2)$. Set $\mathcal C_0'$ to be the chain
obtained by removing from $\mathcal C(a, x_1)$ all the links succeeding V ′ . Let V ′′ be the last link in
$\mathcal C(x_1, x_2)$ that intersects V ′ and set $\mathcal C_1'$ to be the chain obtained by removing from $\mathcal C(x_1, x_2)$
the links that precede V ′′ . Clearly $\mathcal C_0' \cup \mathcal C_1'$ is a simple chain from a to $x_2$. Repeating this
construction at all remaining intersections results in a simple chain $\mathcal C_2 = \{U_{2,0}, \ldots, U_{2,n_2}\}$
from a to b such that $\operatorname{diam}(U_{2,j}) < \frac12$, and for each j, $\overline{U_{2,j}} \subset U_{1,k}$ for some k. Continuing
this process, we obtain for each n ∈ N, a simple chain $\mathcal C_n$ from a to b with open connected
links of diameter $< 2^{1-n}$ and whose closures are contained in links of the chain $\mathcal C_{n-1}$. Let
$C_n$ be the union of the closures of the links in $\mathcal C_n$. Clearly $\{C_n : n \in \mathbb N\}$ is a decreasing
sequence of subcontinua of X. Hence $C = \bigcap_n C_n$ is a nonempty subcontinuum in X and
contains {a, b}.
Claim: C has only two noncut points, a and b. If x ∈ C \ {a, b}, then for any n ∈ N, one
or two links in the simple chain $\mathcal C_n$ contain x. Let $A_n$ be the union of the links preceding
them, and let $B_n$ be the union of all the links succeeding them. It follows that
$$C \setminus \{x\} = \Big(\bigcup_{n=1}^{\infty} (A_n \cap C)\Big) \cup \Big(\bigcup_{n=1}^{\infty} (B_n \cap C)\Big) =: A \cup B,$$
where A and B are nonempty disjoint open sets in C; that is, x is a cut point. Hence, by
Lemma 2.9.10, C has exactly two noncut points, namely a and b. It follows from Theorem 2.9.15 that C is
homeomorphic to the unit interval [0, 1] and so, C is an arc from a to b.
Lemma 2.10.1. Let (X, d) be a metric space. Suppose that for any ε > 0, there exist some
δ > 0, some metric space (W, ρ) and a map Φ : X → W such that Φ(X) is totally bounded,
and d(x, y) < ε whenever ρ(Φ(x), Φ(y)) < δ. Then, X is totally bounded.
Proof. Given ε > 0, choose δ > 0, W and Φ as in the statement of the Lemma. Then, there
exists a finite collection {V1 , . . . , Vn } of balls of diameter less than δ covering Φ(X). Consequently
$\{\Phi^{-1}(V_1), \ldots, \Phi^{-1}(V_n)\}$ covers X, and $\operatorname{diam}(\Phi^{-1}(V_j)) \leq \varepsilon$ for each j = 1, . . . , n. This shows
that X is indeed totally bounded.
Theorem 2.10.2. (Arzelà–Ascoli) Let (X, τ ) be a compact topological space and let (S, d)
be a complete metric space. F ⊂ C(X, S) is relatively compact iff F is equicontinuous and
{f (x) : f ∈ F} is relatively compact in S for each x ∈ X.
d(f (x), f (y)) ≤ d(f (x), fj (x)) + d(fj (x), fj (y)) + d(fj (y), f (y)) < ε, y ∈ U.
Therefore, F is equicontinuous at any point x ∈ X.
(ii) follows directly from Theorem 2.10.2 by noticing that a set in Rn is relatively compact
iff it is bounded.
Example 2.10.4. Let (X, d) be a compact metric space. Suppose F is a bounded collection
of functions on X such that |f (x) − f (y)| ≤ M d(x, y) for all x, y ∈ X and f ∈ F. Then,
F is totally bounded in C(X, R). In particular, the collection of all Lipschitz functions such
that $\|f\|_u + \|f\|_L \leq M$, where
$$\|f\|_L = \sup_{x \neq y} \frac{|f(x) - f(y)|}{d(x, y)},$$
is compact in C(X, R).
Proof. Let W be an open neighborhood of x with compact closure. Since W ∩ U also has
compact closure and contains x, we can assume without loss of generality that W ⊂ U .
If $W = \overline W$, there is nothing else to prove; otherwise, {x} and $\partial W = \overline W \setminus W$ are disjoint
nonempty compact sets. For any y ∈ ∂W , there are disjoint open sets $V_y$ and $H_y$ such
that $x \in V_y$ and $y \in H_y$. By compactness, there are finitely many $H_{y_1}, \ldots, H_{y_n}$ such that
$\partial W \subset \bigcup_{j=1}^{n} H_{y_j} =: H$. Define $V := W \cap \bigcap_{j=1}^{n} V_{y_j}$. Clearly x ∈ V , V ∩ H = ∅, V ⊂ W , and
$\overline V \subset X \setminus H$. Hence, $\overline V$ is compact and
$$x \in V \subset \overline V \subset \overline W \cap (X \setminus H) \subset \overline W \cap \big(W \cup (X \setminus \overline W)\big) = W \subset U.$$
Lemma 2.11.3. If (X, τ ) is a l.c.H. space then any basis B has a subset C whose closures
are compact, and which is itself a basis.
Proof. Let B be a basis for the topology. Let C be the collection of all sets in B
with compact closures. We prove now that C ≠ ∅ and C is a basis. Suppose U is an open
set and let x ∈ U . For some B ∈ B, x ∈ B ⊂ U . Let V be an open set with compact closure
such that $x \in V \subset \overline V \subset B$. Then, for some B ′ ∈ B, x ∈ B ′ ⊂ V and $\overline{B'}$ is compact.
Proof. Each point of A has an open neighborhood with compact closure contained in U .
By compactness, A can be covered with a finite collection of such neighborhoods. The union
V of the open sets in such a finite collection is the required set.
Theorem 2.11.5. (Urysohn’s Lemma) Let X be a l.c.H. space, let A ⊂ X be compact and
U ⊂ X be open with A ⊂ U . There exists f ∈ C00 (X) such that A ≺ f ≺ U .
Proof. The proof is just as the one for Lemma 2.1.20 with a few slight modifications that
we indicate in what follows. As before, let $D_0 = \{0, 1\}$, $D_n = \{\frac{a}{2^n} : 0 < a < 2^n, \ a \equiv 1
\bmod 2\}$ (n ≥ 1), and $D = \bigcup_n D_n$ the set of dyadic rational numbers. Define a chain $\{U_t\}_{t \in D}$
of subsets of X progressively by first setting $U_1 = A$ and $U_0 = U$. For n = 1, choose an
open set $U_{1/2}$ with compact closure such that $U_1 \subset U_{1/2} \subset \overline{U_{1/2}} \subset U_0$. Suppose sets $U_t$,
with $t \in \bigcup_{k=0}^{n-1} D_k$ and n ≥ 2, have been defined in such a way that $\overline{U_t} \subset U_s$ whenever s < t.
For $u = \frac{a}{2^n} \in D_n$, the numbers $s = \frac{a-1}{2^n}$ and $t = \frac{a+1}{2^n}$ belong to $\bigcup_{k=0}^{n-1} D_k$, and so the sets $U_s$ and $U_t$ are already
defined. Choose an open set $U_u$ with compact closure such that $\overline{U_t} \subset U_u \subset \overline{U_u} \subset U_s$. This
procedure defines a chain $\{U_t\}_{t \in D}$ of sets satisfying
$$\overline{U_t} \subset U_s \quad \text{whenever } s, t \in D \text{ and } s < t.$$
Lemma 2.11.6. Let X be a l.c.H space with a countable basis. There exists a countable
cover {Kn : n ∈ N} of X by compact sets such that $K_n \subsetneq \operatorname{Int}(K_{n+1})$.
Theorem 2.11.7. (One point compactification) Suppose (X, τ ) is a l.c.H space, and let
$\widehat X = X \cup \{\Delta\}$ where ∆ is a point not in X. Define $\widehat\tau$ as the collection of arbitrary unions of
sets in $\tau \cup \{\widehat X \setminus K : K \text{ compact in } (X, \tau)\}$. Then $(\widehat X, \widehat\tau)$ is a compact Hausdorff space. If
(X, τ ) is not compact, then X is an open dense set in $\widehat X$. If (X, τ ) is compact, then X is
an open and closed compact subset of $(\widehat X, \widehat\tau)$.
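Suppose $\mathcal U = \{U_\alpha : \alpha \in \mathcal A\} \subset \widehat\tau$ covers $\widehat X$. Then at least one $U_{\alpha_0}$ is of the form $\widehat X \setminus K$, where
K is compact in (X, τ ). This means that $\{U_\alpha \cap X : \alpha \neq \alpha_0\}$ is an open cover (in τ ) of K.
Hence there exists a finite collection of sets $U_{\alpha_1}, \ldots, U_{\alpha_n}$ in $\mathcal U$ such that $K \subset \bigcup_{j=1}^{n} X \cap U_{\alpha_j}$.
It follows that $\widehat X = \bigcup_{j=0}^{n} U_{\alpha_j}$, whence we conclude that $\widehat X$ is compact. Clearly X is an open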
Proof. Every x ∈ K has a neighborhood Vx whose closure is compact and contained in some
Gj . By compactness, K is covered by a finite collection V1 , . . . , Vk of such neighborhoods.
For each j = 1, . . . , n, let Hj be the union of those closures $\overline{V_\ell}$ that lie in Gj . Then, there are functions
gj ∈ C00 (X) such that Hj ≺ gj ≺ Gj . Define h1 = g1 and hj = (1 − g1 ) · . . . · (1 − gj−1 )gj
for 2 ≤ j ≤ n. Then 0 ≤ hj ≺ Gj . It is easy to verify by induction that
$$h = \sum_{j=1}^{n} h_j = 1 - (1 - g_1) \cdot \ldots \cdot (1 - g_n).$$
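The telescoping identity for h can be checked at a single point of X, where each $g_j$ is just a number in [0, 1]. The snippet below is a small numerical sketch; the function name `partition` and the sample values are illustrative assumptions.

```python
from math import prod

def partition(g):
    """Given g_1..g_n in [0, 1], return h_1..h_n with
    h_1 = g_1 and h_j = (1 - g_1)...(1 - g_{j-1}) g_j."""
    h, running = [], 1.0
    for gj in g:
        h.append(running * gj)       # running = (1 - g_1)...(1 - g_{j-1})
        running *= 1.0 - gj
    return h

g = [0.3, 0.0, 0.8, 1.0, 0.5]        # sample values of g_1, ..., g_5 at one point
h = partition(g)
lhs = sum(h)                                   # h_1 + ... + h_n
rhs = 1.0 - prod(1.0 - gj for gj in g)         # 1 - (1 - g_1)...(1 - g_n)
```

In particular, at a point where some $g_j$ equals 1 the sum of the $h_j$ equals 1, which is what makes the family a partition of unity on K.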
Proof. Since X has a countable base and is Hausdorff, its one point compactification
$\widehat X = X \cup \{\Delta\}$, of which X is an open subset, is also separable and Hausdorff. Indeed, there
exists a countable basis B = {Un ⊂ X} for X such that each $\overline{U_n}$ is compact in X. The family
of finite intersections of sets of the form $\widehat X \setminus \overline U$, U ∈ B, is a basis of open neighborhoods of
∆.
We will show that the space $\widehat X$ is metrizable by embedding it homeomorphically into the
cube $[0, 1]^{\mathbb N}$. Let C be the family of pairs (V, U ) with V, U ∈ B such that $\overline V \subset U$. By
Urysohn’s lemma, for each (V, U ) ∈ C there is $f_{V,U} \in C_{00}(X)$ such that $\overline V \prec f_{V,U} \prec U$. As X is
Hausdorff and f (∆) = 0 for all f ∈ C00 (X), the collection $F = \{f_{V,U} : (V, U) \in C\}$ separates
points in $\widehat X$. Let {(Vn , Un ) : n ∈ N} be an enumeration of C. The map $e : x \mapsto (f_{V_n,U_n}(x))$
maps $\widehat X$ continuously into the cube $[0, 1]^{\mathbb N}$. As F separates points of $\widehat X$, the map e is
injective. The compactness of $\widehat X$ implies that e is a homeomorphism between $\widehat X$ and
$e(\widehat X) \subset [0, 1]^{\mathbb N}$. Hence, X is homeomorphic to $e(X) \subset [0, 1]^{\mathbb N}$, which is open in $e(\widehat X)$.
To show that C00 (X) is separable in the uniform norm, observe that the collection E ⊂
C00 (X) of polynomials in F with rational coefficients is a ring that separates points in $\widehat X$.
Since $C_{00}(X) \subset \{f \in C(\widehat X) : f(\Delta) = 0\}$, the Stone–Weierstrass theorem 5.3.10 implies that
$$C_{00}(X) \subset \overline{E}^{\,u} = \{f \in C(\widehat X) : f(\Delta) = 0\} = C_0(X).$$
Corollary 2.11.10. If X is a compact metric space, then X is separable and C(X) is
separable in the topology of uniform convergence.
Proof. For any n ∈ N, there is a finite set Fn ⊂ X such that $X = \bigcup_{x \in F_n} B(x; 1/n)$. The
set $F = \bigcup_n F_n$ is a countable dense set in X and thus, X is separable.
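The construction of the finite sets $F_n$ can be imitated on a finite sample. The sketch below is only an illustration: a random sample of the unit square stands in for the compact metric space X, and `greedy_net` is a hypothetical helper that picks centers so that every sample point lies within distance less than 1/n of some center.

```python
import math
import random

def greedy_net(points, eps):
    """Scan the points once; keep a point as a new center only if it is
    at distance >= eps from every center kept so far."""
    centers = []
    for p in points:
        if all(math.dist(p, c) >= eps for c in centers):
            centers.append(p)
    return centers

random.seed(0)
points = [(random.random(), random.random()) for _ in range(300)]

# F_n: a finite 1/n-net of the sample, for n = 1, ..., 5.
nets = {n: greedy_net(points, 1.0 / n) for n in range(1, 6)}
F = set().union(*nets.values())      # union of the nets, as in the proof
```

The union F plays the role of the countable dense set: any sample point is within 1/n of F for every n considered.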
2.12. Exercises
Exercise 2.12.1. Suppose {(Xα , τα ) : α ∈ A } is a family of topological spaces. For each
α ∈ A , Xα′ := Xα × {α} is considered an exact copy of Xα in the sense that U × {α} is
declared open in Xα′ iff U ∈ τα . Define the disjoint union $X = \bigsqcup_\alpha X_\alpha := \bigcup_\alpha X_\alpha'$. Let τ be
the collection of all U ⊂ X such that U ∩ Xα′ is open in Xα′ for every α ∈ A . Show that τ is a topology on X,
that Xα′ is an open and closed subset of X and that {U ∩ Xα′ : U ∈ τ } = {V × {α} : V ∈ τα }
for all α ∈ A .
Exercise 2.12.2. In any topological space (X, τ ) show that
∂(A ∪ B) ⊂ ∂A ∪ ∂B
∂(A ∩ B) ⊂ ∂A ∪ ∂B
for all subsets A, B of X. (Hint: $\overline{A \cup B} \subset \overline A \cup \overline B$ and $A^o \cup B^o \subset (A \cup B)^o$)
Exercise 2.12.3. Let (X, τ ) be a topological space. Suppose there exists a family {(Yi , τi ) :
i ∈ I} of Hausdorff spaces and a collection F = {fi : X −→ Yi }i∈I of continuous functions
which separates points in X; i.e., for any x1 , x2 ∈ X, x1 6= x2 , there is f ∈ F such that
f (x1 ) 6= f (x2 ). Show that X is Hausdorff.
Exercise 2.12.4. Let X and Y be topological spaces and suppose that f : X → Y . Show
that the following statements are equivalent.
(a) f is continuous.
(a) Show that there is a box I := [a1 , b1 ] × . . . × [an , bn ], where −∞ < aj < bj < ∞
for each j = 1, . . . , n, such that A ⊂ I. Based on the previous statement, by
considering the sequence of numbers on each component of the vectors xn , it will
be enough to consider the case n = 1.
(b) Denote by ℓ(I) the length of the interval I. Divide I into two subintervals of the same
length. Choose one subinterval I1 that contains an infinite number of elements
of the sequence A to obtain a subsequence A1 ⊂ A ∩ I1 . Arguing by induction,
obtain a sequence of nested subintervals Ik+1 ⊂ Ik ⊂ I with $\ell(I_k) = 2^{-k} \ell(I)$, and
subsequences Ak+1 ⊂ Ak ⊂ A such that Ak+1 ⊂ Ik+1 ∩ Ak .
(c) Let αk and βk be the left and right end points of Ik . Show that αk ≤ αk+1 ≤
βk+1 ≤ βk for all k, and conclude that limk αk = limk βk := x∗ .
(d) Construct a subsequence $\{x_{m_k} : k \in \mathbb N\} \subset A$ so that $\lim_k x_{m_k} = x^*$.
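Steps (b) to (d) can be traced on a computer with a long finite sequence. The sketch below is an illustration under an explicit simplification: with finite data, "contains an infinite number of elements" is replaced by "contains at least as many of the remaining terms as the other half". The function name `bisect_subsequence` and the test sequence are hypothetical.

```python
def bisect_subsequence(xs, a, b, steps):
    """Follow the bisection argument on a finite list xs contained in [a, b].
    Returns the chosen indices m_1 < m_2 < ... and the last interval I_k."""
    idx = list(range(len(xs)))      # indices of the current subsequence A_k
    chosen, last = [], -1
    for _ in range(steps):
        mid = (a + b) / 2
        left = [i for i in idx if xs[i] <= mid]
        right = [i for i in idx if xs[i] > mid]
        # Finite stand-in for "the half containing infinitely many terms".
        if len(left) >= len(right):
            idx, b = left, mid
        else:
            idx, a = right, mid
        later = [i for i in idx if i > last]   # keep the indices increasing
        if not later:
            break
        last = later[0]
        chosen.append(last)
    return chosen, (a, b)

# A bounded sequence in [-2, 2] that oscillates between neighborhoods of -1 and 1.
xs = [(-1) ** n * (1 + 1 / (n + 1)) for n in range(2000)]
chosen, (lo, hi) = bisect_subsequence(xs, -2.0, 2.0, 8)
```

The interval endpoints `lo` and `hi` play the roles of $\alpha_k$ and $\beta_k$, and the terms `xs[m]` for `m` in `chosen` are trapped in intervals whose lengths halve at each step.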
Exercise 2.12.14. Let (X, d) be a metric space and let (xn : n ∈ Z+ ) be a sequence in X.
For each n ∈ Z+ define An = {xm : m ≥ n}. Show that
(a) (xn ) is Cauchy if and only if for any ε > 0 there exists an integer N > 0 such that
d(xn , xN ) < ε whenever n ≥ N .
(b) (xn ) is Cauchy iff limn→∞ diam(An ) = 0.
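Exercise 2.12.15. Suppose {(Xα , dα ) : α ∈ A } is a pairwise–disjoint family of metric
spaces. Show that
$$\rho(x, y) = \begin{cases} d_\alpha(x, y) \wedge 1 & \text{if } (x, y) \in X_\alpha \times X_\alpha \\ 2 & \text{if } (x, y) \in X_\alpha \times X_\beta, \ \alpha \neq \beta \end{cases} \tag{2.9}$$
is a metric compatible with the disjoint union topology on $\bigsqcup_\alpha X_\alpha$.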
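Formula (2.9) is easy to experiment with before attempting the proof. The following sketch is not a solution of the exercise; it only evaluates ρ on a few points of two tagged copies of R (an assumed example, with points written as pairs `(alpha, x)`) and checks the triangle inequality on that sample.

```python
from itertools import product

def rho(p, q, d):
    """p = (alpha, x), q = (beta, y); d maps each index alpha to the metric d_alpha."""
    (alpha, x), (beta, y) = p, q
    if alpha == beta:
        return min(d[alpha](x, y), 1.0)   # same component: truncated metric
    return 2.0                            # different components

# Two copies of the real line, tagged "a" and "b".
d = {"a": lambda x, y: abs(x - y), "b": lambda x, y: abs(x - y)}
pts = [("a", 0.0), ("a", 0.3), ("a", 5.0), ("b", 0.0), ("b", 0.9)]

triangle_ok = all(
    rho(p, r, d) <= rho(p, q, d) + rho(q, r, d) + 1e-12
    for p, q, r in product(pts, repeat=3)
)
```

Note that the ρ–ball of radius 1 around a point never leaves the component of that point, which is the reason each $X_\alpha$ ends up open and closed.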
Exercise 2.12.16. Suppose (X, d) and (Y, ρ) are metric spaces, and that d is complete. If
E is closed in X, f : E −→ Y is continuous, and
ρ(f (x), f (x′ )) ≥ d(x, x′ )
for all x and x′ in E, show that f (E) is closed.
Exercise 2.12.17. Show that if (X, d) and f are as in Caccioppoli’s theorem, then there is
N ≥ 1 such that $f^N$ is a contraction.
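Exercise 2.12.18. If A ⊂ B, show that $A^\varepsilon \subset B^\varepsilon$, $A_\varepsilon \subset B_\varepsilon$. Show that $A^\varepsilon = \bigcup_{a \in A} \{a\}^\varepsilon$
and $\bigcup_{a \in A} \{a\}_\varepsilon \subset A_\varepsilon$, where $\{a\}^\varepsilon$ and $\{a\}_\varepsilon$ are the open and closed balls of radius ε centered
at a respectively. In addition, if A is compact, then $\bigcup_{a \in A} \{a\}_\varepsilon = A_\varepsilon$. Show that
$\bigcap_{\varepsilon > 0} A^\varepsilon = \overline A = \bigcap_{\varepsilon > 0} A_\varepsilon$.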
Exercise 2.12.19. Let B(X) denote the set of all real valued bounded functions on a set
X. For f ∈ B(X) define its uniform norm by $\|f\|_u = \sup_{x \in X} |f(x)|$. Show that $\|\cdot\|_u$ induces
a metric on B(X). If (X, τ ) is a topological space and Cb (X) is the space of real bounded
continuous functions on X, show that $(C_b(X), \|\cdot\|_u)$ is a complete metric space.
Example 3.1.1. Consider the experiment of casting a regular die. The sample space Ω
is described by the number of points on the side facing up once the die comes to a rest.
Then, Ω = {1, 2, 3, 4, 5, 6} and there are up to $2^6$ different events; for instance, the event A
described by all the outcomes which have an odd number of points is A = {1, 3, 5}; the event
B of outcomes with fewer than five points is B = {1, 2, 3, 4}. In this case, one can assign
probabilities to individual outcomes and then define probabilities of all events by assigning
them the sum of the probabilities of their elements. For instance, P[{1}] = . . . = P[{6}] = 1/6
corresponds to the ideal fair die. In this case, P[A] = P[{1}] + P[{3}] + P[{5}] = 1/2;
similarly, P[B] = 2/3. The event B is more likely to happen than the event A.
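The numbers in this example can be reproduced by brute force. The snippet below is a small sketch using exact rational arithmetic; the names `omega`, `p`, `prob` and `events` are illustrative.

```python
from fractions import Fraction
from itertools import chain, combinations

omega = [1, 2, 3, 4, 5, 6]
p = {w: Fraction(1, 6) for w in omega}       # fair die: each outcome has probability 1/6

def prob(event):
    """Probability of an event as the sum of the probabilities of its outcomes."""
    return sum((p[w] for w in event), Fraction(0))

# All events, i.e. all subsets of omega.
events = list(chain.from_iterable(combinations(omega, r) for r in range(len(omega) + 1)))

A = {1, 3, 5}        # odd number of points
B = {1, 2, 3, 4}     # fewer than five points
```

Enumerating all subsets is only feasible because Ω is finite; the roulette example that follows shows why this approach does not extend to uncountable sample spaces.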
In the example of the die, probabilities are assigned to events by adding the probabilities
of their individual outcomes. This procedure however does not provide a good way to measure
the probabilities of events when the sample space Ω is not countable.
Example 3.1.2. Consider the angle registered between a fixed reference axis through the center
of a roulette and a marked point on the circumference of the roulette after one spins it. Then
Ω = [0, 2π). An ideal roulette has the property that it assigns the same probability to arcs
that have the same length. That is, if [a, b] ⊂ [0, 2π), then $P[[a, b]] = \frac{b - a}{2\pi}$. Observe that
this probability measure assigns probability zero to each individual outcome. P[Ω] = 1 but
The example of the roulette suggests that it is not always possible to start from indi-
vidual probabilities to construct a meaningful notion of probability of events. It is then
reasonable to assume that probabilities have been assigned to all events.
In order to determine probabilities of events, it seems reasonable to establish some ideal
structure on the collection of events, if any. We first introduce some structures that appear
often in the theory of integration.
Example 3.1.4. The collection E of intervals of the form (a, b] with −∞ < a < b < ∞ is a
semiring in R.
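The two closure properties behind this example can be made concrete. In the sketch below, which is only an illustration, an interval (a, b] is represented by the pair `(a, b)`, the empty set by `None` or by an empty list, and the helper names `intersect` and `difference` are hypothetical.

```python
def intersect(I, J):
    """Intersection of (a, b] and (c, d]; None stands for the empty set."""
    a, b = max(I[0], J[0]), min(I[1], J[1])
    return (a, b) if a < b else None

def difference(I, J):
    """(a, b] minus (c, d], written as a list of at most two
    pairwise disjoint half-open intervals."""
    (a, b), (c, d) = I, J
    pieces = [(a, min(b, c)), (max(a, d), b)]   # part to the left of J, part to the right
    return [(lo, hi) for lo, hi in pieces if lo < hi]

I = (0, 5)
inner = difference(I, (1, 3))      # J strictly inside I: two pieces remain
cover = difference(I, (-1, 6))     # J covers I: nothing remains
```

The intersection of two such intervals is again of the same form (or empty), and a difference is a finite disjoint union of such intervals, which is the pattern the definition of a semiring asks for.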
Proof. (i) We prove the first statement by induction. The statement holds for n = 1 by
definition of a semiring. Suppose the statement holds for some n ≥ 1. Then there are
pairwise disjoint sets $C_1, \ldots, C_k$ in $\mathcal R$ such that $A \setminus \bigcup_{j=1}^{n} A_j = \bigcup_{\ell=1}^{k} C_\ell$. From
$$A \setminus \bigcup_{j=1}^{n+1} A_j = \Big(A \setminus \bigcup_{j=1}^{n} A_j\Big) \setminus A_{n+1} = \bigcup_{\ell=1}^{k} (C_\ell \setminus A_{n+1})$$
it follows that for each ℓ = 1, . . . , k there are pairwise disjoint sets $D^\ell_1, \ldots, D^\ell_{s_\ell}$ in $\mathcal R$ such
that $C_\ell \setminus A_{n+1} = \bigcup_{m=1}^{s_\ell} D^\ell_m$. Clearly, $\{D^\ell_m : \ell = 1, \ldots, k, \ m = 1, \ldots, s_\ell\}$ is a collection of
pairwise disjoint sets in $\mathcal R$ and
$$A \setminus \bigcup_{j=1}^{n+1} A_j = \bigcup_{\ell=1}^{k} \bigcup_{m=1}^{s_\ell} D^\ell_m.$$
(ii) Let $B_1 = A_1$ and $B_n = A_n \setminus \bigcup_{j=1}^{n-1} A_j$. By (i) each set $B_n$ is the union of a finite
collection of pairwise disjoint sets in $\mathcal R$ and $\{B_n : n \in \mathbb N\}$ is a pairwise disjoint collection. (ii) follows from
$\bigcup_n A_n = \bigcup_n B_n$.
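The passage from the sets $A_n$ to the pairwise disjoint sets $B_n$ is used repeatedly in what follows, so a small sketch with finite sets may help. The function name `disjointify` and the sample sets are illustrative assumptions.

```python
def disjointify(sets):
    """Return B_1 = A_1 and B_n = A_n minus (A_1 u ... u A_{n-1})."""
    seen, out = set(), []
    for A in sets:
        out.append(set(A) - seen)   # remove everything already covered
        seen |= set(A)
    return out

A = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
B = disjointify(A)
```

The sets in `B` are pairwise disjoint and have the same union as the sets in `A`.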
Definition 3.1.6. A collection R of subsets of Ω is a ring if
(i) A ∪ B ∈ R,
(ii) and A \ B ∈ R whenever A, B ∈ R.
A ring R that is closed under countable unions is called a σ–ring. A ring R is called a δ–ring
if it is closed under countable intersections. A ring A is an algebra if
(iii) Ω ∈ A .
An algebra F which is also a σ–ring is called σ–algebra.
Example 3.1.7. For any set Ω, if R is a σ–ring of subsets of Ω, then R is a δ–ring. To check
this, let {An : n ∈ N} ⊂ R. Then $A_1 \setminus A_n \in R$ for any n ∈ N, and so $A_1 \setminus \bigcap_n A_n =
\bigcup_n (A_1 \setminus A_n) \in R$. Consequently, $\bigcap_n A_n = A_1 \setminus \big(A_1 \setminus \bigcap_n A_n\big) \in R$.
Example 3.1.8. The collection {∅, Ω} is a σ–algebra in Ω. It is the smallest one, and thus
it is called the trivial σ–algebra.
Example 3.1.9. The collection P(Ω) of all subsets of Ω, that is, P(Ω) = {A : A ⊂ Ω} is
clearly a σ–algebra in Ω. It is the largest one and it is called the power set.
Definition 3.1.10. Given a collection C of subsets of Ω, the σ–algebra generated by C,
denoted by σ(C), is the intersection of all σ–algebras containing C. If A and F are σ–
algebras in Ω and A ⊂ F , then A is said to be a sub σ–algebra of F .
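When Ω is finite, σ(C) can be computed by closing C under complements and pairwise unions until nothing new appears (on a finite set, countable unions reduce to finite ones). The sketch below illustrates this under that finiteness assumption; the function name `generated_sigma_algebra` and the example are hypothetical.

```python
from itertools import combinations

def generated_sigma_algebra(omega, collection):
    """Smallest family containing `collection` that is closed under
    complements and unions; valid as sigma(C) only for finite omega."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(c) for c in collection}
    while True:
        new = {omega - a for a in family} | {a | b for a, b in combinations(family, 2)}
        if new <= family:            # nothing new: the family is closed
            return family
        family |= new

omega = {1, 2, 3, 4}
sigma = generated_sigma_algebra(omega, [{1}, {1, 2}])
```

In this example the generated σ–algebra consists of all unions of the blocks {1}, {2} and {3, 4}; the set {3} is not in it because the generators never separate 3 from 4.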
Definition 3.1.11. Let (X, G) be a topological space. The σ–algebra generated by all open
sets G, denoted by B(X) is called the Borel σ–algebra.
Example 3.1.12. Consider the Euclidean space $(\mathbb R^n, |\cdot|)$. $B(\mathbb R^n)$ is generated by the
countable collection of open balls $\{B_r(x) : r \in \mathbb Q_+, \ x \in \mathbb Q^n\}$.
Proof. Let S be the minimal collection of sets of X containing the open and closed sets,
and which is closed under countable intersections and countable disjoint unions. Clearly
S ⊂ B(X). Consider
S0 := {A ∈ S : Ac ∈ S }
We will show that S0 is a σ–algebra. Clearly S0 contains the closed and open sets, and
it is closed under complementation. In particular, X ∈ S0 . If {An : n ∈ N} ⊂ S0
Example 3.2.2. The set function µ on the semiring E = {(a, b] : −∞ < a < b < ∞} of
sets in R given by µ((a, b]) = b − a is σ–additive.
Remark 3.2.3. Clearly, if (Ω, F , µ) is a measure space, then µ(∅) = 0. Also, the order
in which we take the union of the sets $A_n$ in the definition of a measure is not relevant.
Indeed, if f : N → N is any bijection, then $\sum_n \mu[A_{f(n)}] = \sum_n \mu[A_n]$ by Lemma A.1.1.
Example 3.2.4. (Counting measure) Let Ω be any set with σ–algebra P(Ω). For finite
sets let µ(A) = card(A) the cardinality of A; and µ(A) = ∞ otherwise. It is easy to check
that (Ω, P(Ω), µ) is a measure space.
Example 3.2.5. For any measure space (Ω, F , µ), the collection of sets R = {A ∈ F :
µ(A) < ∞} is a ring on Ω.
Theorem 3.2.6. A nonnegative charge µ on a measurable space (Ω, F ) is a measure iff
$\lim_n \mu(B_n) = \mu\big(\bigcup_m B_m\big)$ for every nondecreasing sequence {Bn : n ∈ N} ⊂ F .
Proof. Suppose first that µ is a measure and let {Bn : n ∈ N} ⊂ F be nondecreasing. Let
$A_1 = B_1$ and $A_n = B_n \setminus B_{n-1}$ for n ≥ 2. Then {An : n ∈ N} is a pairwise disjoint sequence
of measurable sets and $\bigcup_m A_m = \bigcup_m B_m$. Thus,
$$\mu\Big(\bigcup_m B_m\Big) = \sum_m \mu(A_m) = \lim_n \sum_{m=1}^{n} \mu(A_m) = \lim_n \mu\Big(\bigcup_{m=1}^{n} A_m\Big) = \lim_n \mu(B_n).$$
Conversely, suppose {An : n ∈ N} ⊂ F is a pairwise disjoint sequence and set $B_n =
\bigcup_{m=1}^{n} A_m$. Then {Bn : n ∈ N} ⊂ F increases to $\bigcup_m A_m$. Since µ is a charge,
$\mu(B_n) = \sum_{m=1}^{n} \mu(A_m)$ for every n. Hence
$$\sum_m \mu(A_m) = \lim_n \mu(B_n) = \mu\Big(\bigcup_m A_m\Big),$$
which shows that µ is a measure.
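Continuity from below can be observed on a simple example. The sketch below assumes a specific measure chosen for illustration (weight $2^{-k}$ at each integer k ≥ 1, so that the whole space has measure 1) and the increasing sets $B_n = \{1, \ldots, n\}$; the names `mu`, `B` and `values` are hypothetical.

```python
from fractions import Fraction

def mu(A):
    """Measure of a finite set A of positive integers, with weight 2**(-k) at k."""
    return sum((Fraction(1, 2 ** k) for k in A), Fraction(0))

# Nondecreasing sets B_1, B_2, ..., B_10 with B_n = {1, ..., n}.
B = [set(range(1, n + 1)) for n in range(1, 11)]
values = [mu(Bn) for Bn in B]        # mu(B_n) = 1 - 2**(-n)
```

The values increase towards 1, the measure of the union of all the $B_n$, in agreement with the theorem.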
Remark 3.2.7. The assumption µ[A1 ] < ∞ in Exercise 3.10.5 iii) cannot be dropped, as the next
example shows. Consider Ω = R and B any σ–algebra that contains all intervals of the
form [a, ∞). Let λ be the measure that assigns to each interval (finite or infinite) its length.
Clearly ∩n [n, ∞) = ∅, however λ([n, ∞)) = ∞ for each n.
Outer measures are interesting since they can be used to extend and/or construct
measures, as we will demonstrate below.
Definition 3.3.3. Let µ∗ be an outer measure on Ω and let E ⊂ Ω.
(a) If E satisfies
(3.3) µ∗ (A) = µ∗ (A ∩ E) + µ∗ (A ∩ E c ) for all A⊂Ω
then we say that E is µ∗ –measurable.
(b) If µ∗ (E) = 0, then we say that E is µ∗ –negligible.
The collection of all µ∗ –measurable subsets of Ω is denoted by Mµ∗ .
Theorem 3.3.4. If µ∗ is an outer measure on Ω, then Mµ∗ is a σ–algebra and contains
all µ∗ –negligible sets. Moreover, (Ω, Mµ∗ , µ∗ ) is a complete measure space.
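It remains to show that the collection of µ∗ –measurable sets is closed under countable unions.
Since any countable union of sets can be expressed as a countable union of pairwise disjoint
sets, namely $\bigcup_n A_n = \bigcup_n B_n$ where $B_1 = A_1$ and $B_n = A_n \setminus \big(\bigcup_{k=1}^{n-1} A_k\big)$, it suffices to consider a
pairwise disjoint sequence {An : n ∈ N}. We first prove by induction that
\[
\mu^*(E) = \sum_{k=1}^{n} \mu^*(E \cap A_k) + \mu^*\Big(E \cap \Big(\bigcup_{k=1}^{n} A_k\Big)^c\Big) \tag{3.4}
\]
for any E ⊂ Ω. For n = 1 this is just the definition of measurability. Assume the statement is true for n. Since
An+1 is µ∗ –measurable, we have that
\begin{align*}
\mu^*\Big(E \cap \Big(\bigcap_{k=1}^{n} A_k^c\Big)\Big)
&= \mu^*\Big(E \cap \Big(\bigcap_{k=1}^{n} A_k^c\Big) \cap A_{n+1}\Big) + \mu^*\Big(E \cap \Big(\bigcap_{k=1}^{n} A_k^c\Big) \cap A_{n+1}^c\Big) \\
&= \mu^*(E \cap A_{n+1}) + \mu^*\Big(E \cap \Big(\bigcap_{k=1}^{n+1} A_k^c\Big)\Big)
\end{align*}
where the last equality uses that $A_{n+1}$ is disjoint from $A_1, \ldots, A_n$.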
Proof. As ∅ ∈ E , finite additivity implies that µ(∅) = 0. Theorem 3.3.2, with h = µ, shows
that the set function µ∗ given by (3.1) is an outer measure, while Theorem 3.3.4 shows that
(Ω, Mµ∗ , µ∗ ) is a complete measure space. We will show (i) that µ∗ and µ coincide on E
and (ii) that Mµ∗ contains σ(E ).
(i) Suppose I ∈ E , and let {Ik : k ∈ N} be a countable cover of I in E . Then {I, ∅} and
{I ∩ Ik : k ∈ N} are also covers of I in E . By definition of µ∗ , the countable subadditivity
and finite additivity of µ it follows that
\[
\mu^*(I) \le \mu(I) \le \sum_k \mu(I \cap I_k) \le \sum_k \big[\mu(I_k \cap I) + \mu(I_k \setminus I)\big] = \sum_k \mu(I_k)
\]
Taking the infimum over all countable covers of I in E leads to µ∗ (I) = µ(I).
(ii) Let I ∈ E and let A ⊂ Ω. Given ε > 0 let {Ik : k ∈ N} ⊂ E be a cover of A with
\[
\sum_k \mu(I_k) \le \mu^*(A) + \varepsilon.
\]
Since $I_k = (I \cap I_k) \cup (I_k \cap I^c)$, and $I_k \cap I^c$ is a finite union of disjoint sets in E , say
$I_k \cap I^c = \bigcup_{j=1}^{N_k} I_{k,j}$, it follows that
\[
\mu(I_k) = \mu(I_k \cap I) + \mu(I_k \cap I^c) = \mu(I_k \cap I) + \sum_{j=1}^{N_k} \mu(I_{k,j})
\]
Therefore,
\begin{align*}
\mu^*(A) + \varepsilon \ge \sum_k \mu(I_k) &= \sum_k \mu(I_k \cap I) + \sum_k \mu(I_k \cap I^c) \\
&= \sum_k \mu(I_k \cap I) + \sum_{k,j} \mu(I_{k,j}) \\
&\ge \mu^*(A \cap I) + \mu^*(A \cap I^c)
\end{align*}
Letting ε → 0 leads to µ∗ (A) ≥ µ∗ (A∩I)+µ∗ (A∩I c ). This combined with the subadditivity
of the outer measure µ∗ shows that I is µ∗ –measurable.
Corollary 3.3.6. Let (E , µ) be as in Theorem 3.3.5, and let E ↑ denote the collection of
countable unions of sets in E .
(i) For any E ⊂ Ω, µ∗ (E) = inf{µ(C) : E ⊂ C ∈ E ↑ }. Moreover, there is B ∈ σ(E )
with E ⊂ B such that
\[
\mu^*(E) = \mu(B) \tag{3.5}
\]
(ii) For any increasing sequence {An : n ∈ N} of sets, $\mu^*\big(\bigcup_n A_n\big) = \lim_n \mu^*(A_n)$.
(iii) If $E = \bigcup_n E_n$, where {En : n ∈ N} ⊂ σ(E ) with µ(En ) < ∞ then, for any ε > 0,
there exists a cover of E by pairwise disjoint sets {An : n ∈ N} ⊂ E such that
\[
\mu\Big(\bigcup_n A_n \setminus E\Big) < \varepsilon
\]
(iv) If E ∈ σ(E ) and µ(E) < ∞, then for any ε > 0, there exists a finite set of pairwise
disjoint sets {Aj : j = 1, . . . , K} ⊂ E such that
\[
\mu\Big(E \,\triangle\, \bigcup_{j=1}^{K} A_j\Big) < \varepsilon
\]
Proof. Clearly µ∗ (E) ≤ µ∗ (B) = µ(B) for all B ∈ E ↑ with B ⊃ E. Thus, it suffices to
assume that µ∗ (E) < ∞.
(i) If r > µ∗ (E), then there are Cn ∈ E so that $E \subset C = \bigcup_n C_n \in \mathcal{E}^{\uparrow}$ and $\sum_n \mu(C_n) < r$.
As E ↑ ⊂ σ(E ), $\mu(C) = \mu^*(C) \le \sum_n \mu(C_n) < r$. The first statement follows by letting
r ↘ µ∗ (E).
To obtain (3.5), for each n ∈ N we choose Bn ∈ E ↑ with E ⊂ Bn and $\mu(B_n) < \mu^*(E) + \frac{1}{n}$. The set
$B = \bigcap_n B_n$ has the desired property.
(ii) By part (i), for each n there is Bn ∈ σ(E ) such that An ⊂ Bn and µ∗ (An ) = µ∗ (Bn ).
Let $E_n := \bigcap_{m \ge n} B_m$. Then An ⊂ En ⊂ Bn ∩ En+1 and En ∈ σ(E ) whence it follows that
µ∗ (An ) = µ∗ (En ). Consequently,
\[
\mu^*\Big(\bigcup_n A_n\Big) \le \mu^*\Big(\bigcup_n E_n\Big) = \lim_n \mu^*(E_n) = \lim_n \mu^*(A_n) \le \mu^*\Big(\bigcup_n A_n\Big)
\]
where the first equality follows from the fact that µ∗ is a measure on σ(E ).
\[
\mu\Big(\bigcup_{j=1}^{N} C_j \,\triangle\, E\Big) < \varepsilon
\]
The sets $B_1 = C_1$ and $B_n = C_n \setminus \bigcup_{j=1}^{n-1} C_j$ for 1 ≤ n ≤ N are pairwise disjoint, and each
one of them is the finite union of disjoint sets in E . This proves (iv).
(iii) Without loss of generality we may assume that the sets in {En : n ∈ N} are pairwise
disjoint. For each n ∈ N there is a cover {Bn,m : m ∈ N} ⊂ E of En such that
\[
\mu\Big(\bigcup_m B_{n,m} \setminus E_n\Big) < \frac{\varepsilon}{2^n}
\]
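Assume that $\mathcal{E}$ is a ring and suppose that $E \in \sigma(\mathcal{E})$ with $\mu^*(E) = \mu(E) < \infty$. For any
ε > 0 choose $B_n \in \mathcal{E}$ so that $E \subset \bigcup_n B_n = B$ and $\mu(B) < \mu(E) + \frac{\varepsilon}{2}$. Hence, $\mu(B \setminus E) < \frac{\varepsilon}{2}$.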
Since $A_k := \bigcup_{j=1}^{k} B_j \nearrow B$, we can choose k so that $\mu(B) - \mu(A_k) = \mu(B \setminus A_k) < \frac{\varepsilon}{2}$. Since
η(B) = η(E) + η(B \ E), η ≤ µ and η = µ on E , it follows that
Example 3.3.7. (Relative measure) Suppose (Ω, F , µ) is a measure space and let C ⊂ Ω
be an arbitrary nonempty subset. The collection FC := {C ∩ A : A ∈ F }, called the trace of F on C,
is clearly a σ–algebra on C. Let µ∗ be the outer measure induced by µ. Carathéodory’s
theorem extends µ to a σ–algebra Mµ containing F . If µ∗ (C) < ∞ then there is C ′ ∈ F
such that C ⊂ C ′ and µ∗ (C) = µ(C ′ ). For any A ∈ F there are sets D, E ∈ F such that
C ∩ A ⊂ D, C \ A ⊂ E, and
µ∗ (C ∩ A) = µ(D)
µ∗ (C \ A) = µ(E)
As C ∩ A ⊂ (D ∩ C ′ ) ∩ (D ∩ A) and C \ A ⊂ (E ∩ C ′ ) ∩ (E \ A), it follows that
µ∗ (C ∩ A) = µ(C ′ ∩ A) = µ(D ∩ A)
µ∗ (C \ A) = µ(C ′ \ A) = µ(E \ A)
Proof. Clearly µ(∅) = 0 and µ is finitely additive on E . We now prove that µ is countably
subadditive on E . If $(a, b] = \bigcup_{m=1}^{\infty} (a(m), b(m)]$, the right–continuity and positivity of the
increments of F imply that for any ε > 0, there are $a_\varepsilon$ and $b_\varepsilon(m)$ such that
\[
\mu((a, b]) < \mu((a_\varepsilon, b]) + \frac{\varepsilon}{2}; \qquad \mu((a(m), b_\varepsilon(m)]) < \mu((a(m), b(m)]) + \frac{\varepsilon}{2^{m+1}}
\]
Since the closed box $[a_\varepsilon, b]$ is compact and
\[
[a_\varepsilon, b] \subset (a, b] \subset \bigcup_{m=1}^{\infty} (a(m), b(m)] \subset \bigcup_{m=1}^{\infty} (a(m), b_\varepsilon(m)), \tag{3.6}
\]
there is N0 ∈ N such that $(a_\varepsilon, b] \subset [a_\varepsilon, b] \subset \bigcup_{m=1}^{N_0} (a(m), b_\varepsilon(m))$. Finite additivity implies
finite subadditivity on the semiring E , so
\begin{align*}
\mu((a, b]) < \mu((a_\varepsilon, b]) + \frac{\varepsilon}{2} &\le \sum_{m=1}^{N_0} \mu((a(m), b_\varepsilon(m)]) + \frac{\varepsilon}{2} \\
&\le \sum_{m=1}^{\infty} \mu((a(m), b(m)]) + \varepsilon
\end{align*}
Countable subadditivity of µ on E follows by letting ε ↘ 0. The conclusion follows from
Carathéodory’s extension theorem.
Theorem 3.4.2. Let (Rd , B(Rd ), µ) be a finite Borel measure space, and define the distribution
function of µ by F (x) := µ({y : y ≤ x}). Then
(i) F has nonnegative increments.
(ii) F is proper, i.e., $\lim_{\min_k x_k \nearrow \infty} F(x) = \mu(\mathbb{R}^d)$ and $\lim_{\min_k x_k \searrow -\infty} F(x) = 0$.
(iii) F is right continuous.
Conversely, if F satisfies (i)–(iii) then there is a finite measure µ on (Rd , B(Rd )) with
distribution F .
The following example shows that not every Lebesgue set is a Borel set.
Example 3.4.5. (Existence of a Lebesgue set that is not a Borel set.) Define the function
G : [0, 1] −→ [0, 1] by
G(y) = inf{x ∈ [0, 1] : F (x) = y}
where F is the devil’s stair function. It is easy to check that G takes values in the Cantor set
C1/3 . The continuity of F implies that F (G(y)) = y for all y ∈ [0, 1]. Thus, G is injective
and, since F is nondecreasing, so is G. Hence, G is measurable, for $G^{-1}([0, t))$ is an interval
for all t ∈ [0, 1]. Since G(E) is contained in the Cantor set, which has Lebesgue measure zero,
and the Lebesgue measure is complete, B = G(E) is Lebesgue measurable for any E ⊂ [0, 1]. Let E
be any non–Lebesgue measurable subset of [0, 1]. If B were a Borel set, then $G^{-1}(B)$ would
be Borel measurable, but $G^{-1}(B) = E$ by the injectivity of G, contradicting the choice of E.
3.4.2. Hausdorff measure on metric spaces. Suppose (X, d) is a metric space and let
g : R+ → R+ be a nondecreasing function with g(0) = 0. For each δ > 0 let Eδ be the collection
3.4. Two examples of construction by outer measures. 73
on P(X) is an outer measure. Since Eδ ⊂ Eδ′ for δ < δ′, it follows that $A \mapsto H^g(A) := \sup_{\delta > 0} H^g_\delta(A)$
is also an outer measure.
Lemma 3.4.6. If A, B ⊂ X and d(A, B) := inf{d(x, y) : x ∈ A, y ∈ B} > 0, then
$H^g(A \cup B) = H^g(A) + H^g(B)$.
Proof. Suppose 0 < δ < d(A, B) and let {Cn : n ∈ N} ⊂ Eδ be a cover of A ∪ B. Each
Cn intersects at most one of the sets A or B. Hence, we can split the cover {Cn } in two
according to whether A ∩ Cn = ∅ or B ∩ Cn = ∅. Consequently,
\[
\sum_n g(\operatorname{diam}(C_n)) \ge H^g_\delta(A) + H^g_\delta(B)
\]
whence we conclude that $H^g_\delta(A \cup B) \ge H^g_\delta(A) + H^g_\delta(B)$. The opposite inequality holds by
the subadditivity of $H^g_\delta$. The conclusion follows by letting δ → 0.
Definition 3.4.7. An outer measure µ∗ on a metric space that satisfies
µ∗ (A ∪ B) = µ∗ (A) + µ∗ (B) if d(A, B) > 0
is said to be a metric outer measure.
Theorem 3.4.8. ( Carathéodory) If µ∗ is a metric outer measure, then every Borel set is
µ∗ –measurable.
Proof. It is enough to show that any closed set F is µ∗ –measurable and, to that end, we
will show that
\[
\mu^*(E) \ge \mu^*(E \cap F) + \mu^*(E \setminus F) \tag{3.7}
\]
for any subset E with µ∗ (E) < ∞. For any set B and ε > 0, let $B^\varepsilon = \{x : d(x, B) < \varepsilon\}$.
Since F is closed, the sequence $E_n = E \setminus F^{1/n} = \{x \in E : d(x, F) \ge 1/n\}$ increases to
E \ F . Since $d(E_n, E \cap F) \ge 1/n$,
\[
\mu^*(E) \ge \mu^*((E \cap F) \cup E_n) \ge \mu^*(E \cap F) + \mu^*(E_n)
\]
and for any n,
\[
\mu^*(E \setminus F) = \mu^*\Big(E_n \cup \bigcup_{k > n} (E_k \setminus E_{k-1})\Big) \le \mu^*(E_n) + \sum_{k > n} \mu^*(E_k \setminus E_{k-1})
\]
Observe that $d(E_k \setminus E_{k-1}, E_{k+1+j} \setminus E_{k+j}) \ge \frac{j}{k(k+j)}$ for j ≥ 1. Indeed, for any $x \in E_k \setminus E_{k-1}$
and $y \in E_{k+j+1} \setminus E_{k+j}$ we have
\[
d(x, y) \ge d(x, F) - d(y, F) > \frac{1}{k} - \frac{1}{k+j} = \frac{j}{k(k+j)}.
\]
Theorem 3.4.8 implies that the collection $M^g(X)$ of $H^g$–measurable sets contains the Borel
sets of (X, d). By Theorem 3.3.4, $H^g$ extends to a complete measure on $(X, M^g(X))$. For
each $g_p(t) = t^p$, p ≥ 0, the measure $H^p := H^{g_p}$ is called the p–th Hausdorff measure on X.
Notice that $H^0$ is the counting measure.
Theorem 3.4.9. If $H^p(A) < \infty$, then $H^q(A) = 0$ for all q > p. If $H^q(A) > 0$, then
$H^p(A) = \infty$ for all p < q.
Proof. It suffices to prove the first statement, as the second statement is the contrapositive
of the first one. For any δ > 0, let {An : n ∈ N} ⊂ Eδ be a cover of A such that
\[
\sum_n (\operatorname{diam}(A_n))^p < H^p_\delta(A) + 1.
\]
Since diam(An ) ≤ δ, we have $(\operatorname{diam}(A_n))^q \le \delta^{q-p} (\operatorname{diam}(A_n))^p$.
Therefore, $H^q_\delta(A) \le \delta^{q-p}(H^p_\delta(A) + 1)$. Letting δ ↘ 0 we obtain that $H^q(A) = 0$.
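As a numerical illustration of Theorem 3.4.9 (not part of the original argument), the following Python sketch evaluates the covering sums of the middle–thirds Cantor set for its natural covers by the 2^n intervals of length 3^{-n}. These sums only bound $H^s_\delta$ from above with δ = 3^{-n}, so the output suggests, but does not prove, where the critical exponent sits. The function name covering_sum is hypothetical.

```python
import math

def covering_sum(n: int, s: float) -> float:
    """Sum of diam(I)**s over the 2**n intervals of length 3**(-n)
    that cover the middle-thirds Cantor set at stage n."""
    return (2 ** n) * (3.0 ** (-n)) ** s

# Exponent at which these particular sums stay constant: 2 * 3**(-s) == 1.
s_crit = math.log(2) / math.log(3)

for s in (0.5, s_crit, 0.8):
    sums = [round(covering_sum(n, s), 4) for n in (1, 5, 10, 20)]
    print(f"s = {s:.4f}: {sums}")
```

For exponents above log 2 / log 3 the sums decrease to 0 as n grows, and for exponents below it they grow without bound, which is the dichotomy described by the theorem.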
A function f between metric spaces (X, d) and (Y, ρ) is called Lipschitz of degree α > 0
if for some constant L ≥ 0
\[
\rho(f(x_1), f(x_2)) \le L\, d(x_1, x_2)^{\alpha}
\]
for all x1 , x2 ∈ X. Lipschitz functions of degree one are typically referred to simply as Lipschitz
functions and
\[
\operatorname{Lip}(f) := \sup_{x_1 \neq x_2} \frac{\rho(f(x_1), f(x_2))}{d(x_1, x_2)}
\]
is called the Lipschitz coefficient of f .
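Theorem 3.4.10. Let f be a Lipschitz function of degree α between metric spaces (X, d)
and (Y, ρ). For any s ≥ 0 and any A ⊂ X,
\[
H^{s/\alpha}(f(A)) \le L^{s/\alpha} H^{s}(A).
\]
Proof. Notice that $\operatorname{diam}(f(A)) \le L \operatorname{diam}(A)^{\alpha}$. Given δ > 0 let $\delta^* = L\delta^{\alpha}$. If {An } ⊂ Eδ
is a countable cover of A, then $\{f(A_n)\} \subset \mathcal{E}_{\delta^*}$ is a countable cover of f (A). Hence
\[
H^{s/\alpha}_{\delta^*}(f(A)) \le \sum_n \operatorname{diam}(f(A_n))^{s/\alpha} \le L^{s/\alpha} \sum_n \operatorname{diam}(A_n)^{s}.
\]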
for all x, y ∈ X and some constants 0 < a ≤ b. Then, for any s > 0 and any A ⊂ X
Proof. It follows from Theorem 3.4.10 that $H^s(f(A)) \le b^s H^s(A)$. To obtain the inequality
on the left hand side of (3.8), fix δ > 0 and consider any countable covering {Bn } of f (A) with
diam(Bn ) ≤ δ. Then $\{f^{-1}(B_n \cap f(X))\}$ is a countable covering of A and
\[
\operatorname{diam}(f^{-1}(B_n \cap f(X))) \le \frac{1}{a} \operatorname{diam}(B_n \cap f(X)) \le \frac{1}{a}\delta.
\]
Hence
\[
H^s_{a^{-1}\delta}(A) \le a^{-s} \sum_n \operatorname{diam}(B_n)^s
\]
whence we conclude that $H^s_{a^{-1}\delta}(A) \le a^{-s} H^s_\delta(f(A))$. The first conclusion follows by letting
δ → 0. For the last statement, set a = 1 = b.
There is a close connection between the Lebesgue measure λd and the d–th Hausdorff
measure $H^d$ on Rd . Each Hausdorff measure $H^p$, p ≥ 0, is translation invariant. Let $Q = (0, 1]^d$
and let δ > 0. Divide Q into $n^d$ non–overlapping cubes of side 1/n, with n so large that each cube
has diameter $\sqrt{d}/n < \delta$. It
follows that $H^d_\delta(Q) \le n^d\, n^{-d} d^{d/2}$ and thus, $H^d(Q) \le d^{d/2} < \infty$. On the other hand, if
{An : n ∈ N} ⊂ Eδ covers Q, then each An is contained in a closed ball of radius diam(An );
thus,
\[
\lambda_d(Q) \le \sum_n \lambda_d(A_n) \le \omega_d \sum_n (\operatorname{diam}(A_n))^d
\]
where ωd is the volume of the unit ball in Rd . Consequently, $\omega_d^{-1} \le H^d(Q)$. Therefore
there is a constant $\omega_d^{-1} \le a_d \le d^{d/2}$ such that $H^d = a_d \lambda_d$. We defer until Section 9.8 the
determination of the constant ad .
Proof. The intersection M of all monotone classes that contain A is also
a monotone class. Clearly M ⊂ σ(A). Define
M0 = {B ∈ M : X \ B ∈ M}
Clearly A ⊂ M0 . If {Bn : n ∈ N} ⊂ M0 is a monotone sequence, then {X \ Bn : n ∈ N} ⊂
M is also a monotone sequence. Thus limn Bn ∈ M, and X \ limn Bn = limn (X \ Bn ) ∈ M.
It follows that M0 is a monotone class, and so M = M0 .
Define
M1 = {B ∈ M : A ∈ A implies A ∪ B ∈ M}
Clearly A ⊂ M1 . If {Bn : n ∈ N} ⊂ M1 is a monotone sequence and A ∈ A then,
{Bn ∪ A : n ∈ N} is a monotone sequence in M. Thus limn Bn ∈ M, and A ∪ limn Bn =
limn (A ∪ Bn ) ∈ M. It follows that M1 is a monotone class, and so M1 = M.
Finally, define
M2 = {B ∈ M : A ∈ M implies A ∪ B ∈ M}
As M1 = M, we have that A ⊂ M2 . If {Bn : n ∈ N} ⊂ M2 is a monotone sequence, and
A ∈ M, then {A ∪ Bn : n ∈ N} is a monotone sequence in M. Thus limn Bn ∈ M, and
A ∪ limn Bn = limn (A ∪ Bn ) ∈ M. It follows that M2 is a monotone class, and so M2 = M.
Proof. By Exercise 3.10.12, it suffices to show that d(P) is a π–system itself. Let d(P) be the
intersection of all d–systems that contain P. Clearly σ(P) is a d–system that contains P,
thus d(P) ⊂ σ(P). It remains to prove that d(P) ⊃ σ(P). For that purpose, consider
H = {D ∈ D : D ∩ B ∈ D, ∀B ∈ P}
Clearly, P ⊂ H and H is a d–system. So d(P) ⊂ H. Similarly, let
A = {A ∈ D : A ∩ D ∈ D, ∀D ∈ D}
Then, P ⊂ A and A is a d–system, so A ⊃ d(P). This shows that d(P) is a π–system. It
follows that d(P) is a σ–algebra.
Proof. Let µn and νn be the finite measures on F defined by µn (A) := µ(A ∩ Cn ) and
νn (A) := ν(A ∩ Cn ). Then, since C is a π–system, it is easy to check that D = {D ∈ F :
µn (D) = νn (D)} is a d–system that contains C. Therefore, µn = νn for each n. For any
A ∈ F we have µ(A) = limn→∞ µn (A) = limn→∞ νn (A) = ν(A).
Theorem 3.5.6. Suppose E is a semiring on Ω, and µ is additive and countably subadditive
on E . If the Carathéodory extension is σ–finite on σ(E ), then Mµ = σ(E ) and the extension
is unique.
Proof. Suppose that σ(E ) ∋ En ր Ω with µ(En ) < ∞, and consider the finite measures
µn (·) = µ(· ∩ En ). To show that Mµ ⊂ σ(E ) it is enough to show that {E ∩ En : E ∈
Mµ } ⊂ σ(E ) for each n, and to that purpose, it suffices to assume that µ is finite. Let
E ∈ Mµ , and as in Corollary 3.3.6 let B ∈ σ(E ) be such that E ⊂ B and µ∗ (E) = µ(B).
Notice that µ(B) = µ∗ (B) = µ∗ (B ∩ E) + µ∗ (B \ E) and µ∗ (B \ E) = 0. Therefore,
E = B \ (B \ E) ∈ σ(E ).
Example 3.5.7. The Lebesgue measure λ and the Lebesgue–Stieltjes measure µF associated
to a right–continuous function F with nonnegative increments are the only measures
that assign $\prod_{k=1}^{d} (b_k - a_k)$ and $\prod_{k=1}^{d} \Delta_k(a_k, b_k) F$ respectively, to each d–dimensional interval
$(a, b] = \prod_{k=1}^{d} (a_k, b_k]$ where ak ≤ bk .
Lemma 3.6.6. If f : (T, T ) → (S, S) and g : (S, S) → (U, U ) are measurable functions,
then g ◦ f : (T, T ) → (U, U ) is measurable.
3.7. Universal completion 79
Proof. If A ∈ U , then $g^{-1}(A) \in \mathcal{S}$, and so $f^{-1}(g^{-1}(A)) \in \mathcal{T}$. Therefore, $(g \circ f)^{-1}(A) = f^{-1}(g^{-1}(A)) \in \mathcal{T}$.
Lemma 3.6.7. Let (Ω, F ) be a measurable space. A function f on Ω with values in a metric
space (S, d) is measurable iff g ◦ f : Ω −→ R is measurable for any real valued continuous
function g on S.
Using Example 3.4.5 we will show that not every Lebesgue measurable set is universally
measurable.
Example 3.7.2. (Existence of a Lebesgue set that is not universally measurable). Let G be
as in Example 3.4.5, and let E be a non–Lebesgue measurable set; in particular, E is not universally measurable.
Then B = G(E) is Lebesgue measurable but not universally measurable. Indeed, if B
were universally measurable, then the measurability of G would imply that E = G−1 (B) is
universally measurable, contradiction.
We will show that the Suslin operation on a collection E is exhaustive in the sense that
S(S(E)) = S(E). First we use a technical result about sequences of integers.
Lemma 3.8.2. The function β : N × N → N given by
\[
\beta(m, n) = 2^{m-1}(2n - 1) \tag{3.10}
\]
is a bijection. Let ϕ : N → N and ψ : N → N be given by
\[
\varphi(l) = m, \quad \psi(l) = n \quad \text{if } l = 2^{m-1}(2n - 1).
\]
Then β ◦ (ϕ, ψ) is the identity map on N, and (ϕ, ψ) ◦ β is the identity map on N × N.
Proof. It is clear that every positive integer l can be uniquely expressed as in (3.10). From this it
follows that β is bijective and $\beta^{-1} = (\varphi, \psi)$. Moreover,
ϕ ◦ β = ϕ, ψ◦β =ψ
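The pairing β of (3.10) and its inverse (ϕ, ψ) can be checked mechanically on a finite range. The Python sketch below is only an illustration; the names beta and phi_psi are hypothetical, and the assertions are a finite sanity check rather than a proof.

```python
def beta(m: int, n: int) -> int:
    """beta(m, n) = 2**(m-1) * (2n - 1), as in (3.10)."""
    return 2 ** (m - 1) * (2 * n - 1)

def phi_psi(l: int) -> tuple:
    """Return (phi(l), psi(l)): the unique (m, n) with l = beta(m, n)."""
    m = 1
    while l % 2 == 0:        # strip the factors of 2; there are m - 1 of them
        l //= 2
        m += 1
    return m, (l + 1) // 2   # the remaining odd part equals 2n - 1

# Round trips on a finite range.
assert all(beta(*phi_psi(l)) == l for l in range(1, 1001))
assert all(phi_psi(beta(m, n)) == (m, n)
           for m in range(1, 11) for n in range(1, 11))
```

The first assertion checks that β ◦ (ϕ, ψ) acts as the identity on {1, ..., 1000}; the second checks that (ϕ, ψ) ◦ β acts as the identity on a 10 × 10 block of N × N.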
Let $(\sigma, \tau) \in \mathbb{N}^{\mathbb{N}} \times \mathbb{N}^{\mathbb{N} \times \mathbb{N}}$. The equation η = Ψ(σ, τ ) has solution
(3.11) σ = ϕ ◦ η
(3.12) τ = ψ ◦ η ◦ β
which shows that Ψ is indeed a bijection. Finally, if the first l = β(m, n) values of Ψ(σ, τ ) are known then, as
m ≤ β(m, n), the values σ(1), . . . , σ(m) are obtained by using (3.11). As β(m, k) ≤ β(m, n)
for all 1 ≤ k ≤ n, the values of τ (m, 1), . . . , τ (m, n) are obtained by using (3.12).
Theorem 3.8.3. Let E be a nonempty collection of subsets of a nonempty set X.
(i) S(S(E)) = S(E); in particular, S(E) is closed under countable unions and countable
intersections.
(ii) If ∅ ∈ E and X \ A ∈ S(E) for each A ∈ E, then the σ–algebra σ(E) generated by
E is contained in S(E).
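Let Ψ, β, ϕ, and ψ be as in Lemma 3.8.2. For any k–tuple (η1 , . . . , ηk ), choose any
$\eta \in \mathbb{N}^{\mathbb{N}}$ such that η|k = (η1 , . . . , ηk ) and choose functions $\sigma \in \mathbb{N}^{\mathbb{N}}$ and $\tau \in \mathbb{N}^{\mathbb{N} \times \mathbb{N}}$ so that
η = Ψ(σ, τ ) as in Lemma 3.8.2. Then ηℓ = Ψ(σ, τ )(ℓ) for 1 ≤ ℓ ≤ k and, although the
functions σ and τ are not uniquely determined by (η1 , . . . , ηk ), the tuples
\[
(\sigma(1), \ldots, \sigma(\varphi(k))), \qquad (\tau(\varphi(k), 1), \ldots, \tau(\varphi(k), \psi(k)))
\]
are uniquely determined by (η1 , . . . , ηk ). Hence, we may define
\[
B(\eta_1, \ldots, \eta_k) = A^{(\tau(\varphi(k),1), \ldots, \tau(\varphi(k),\psi(k)))}_{(\sigma(1), \ldots, \sigma(\varphi(k)))} \in \mathcal{E}
\]
unambiguously. It follows that
\begin{align*}
\bigcup_{\eta \in \mathbb{N}^{\mathbb{N}}} \bigcap_{k=1}^{\infty} B_{\eta|k}
&= \bigcup_{\substack{\sigma \in \mathbb{N}^{\mathbb{N}} \\ \tau \in \mathbb{N}^{\mathbb{N} \times \mathbb{N}}}} \bigcap_{k=1}^{\infty} B\big(\Psi(\sigma, \tau)(1), \ldots, \Psi(\sigma, \tau)(k)\big) \\
&= \bigcup_{\substack{\sigma \in \mathbb{N}^{\mathbb{N}} \\ \tau \in \mathbb{N}^{\mathbb{N} \times \mathbb{N}}}} \bigcap_{k=1}^{\infty} A^{(\tau(\varphi(k),1), \ldots, \tau(\varphi(k),\psi(k)))}_{(\sigma(1), \ldots, \sigma(\varphi(k)))}
= \bigcup_{\sigma \in \mathbb{N}^{\mathbb{N}}} \bigcup_{\tau \in \mathbb{N}^{\mathbb{N} \times \mathbb{N}}} \bigcap_{m=1}^{\infty} \bigcap_{n=1}^{\infty} A^{(\tau(m,1), \ldots, \tau(m,n))}_{(\sigma(1), \ldots, \sigma(m))} \\
&= \bigcup_{\sigma \in \mathbb{N}^{\mathbb{N}}} \bigcap_{m=1}^{\infty} \bigcup_{g \in \mathbb{N}^{\mathbb{N}}} \bigcap_{n=1}^{\infty} A^{(g(1), \ldots, g(n))}_{(\sigma(1), \ldots, \sigma(m))}
= \bigcup_{\sigma \in \mathbb{N}^{\mathbb{N}}} \bigcap_{m=1}^{\infty} A_{(\sigma(1), \ldots, \sigma(m))} = A
\end{align*}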
This shows that S(S(E)) ⊂ S(E). The converse inclusion is obvious. The last statement is
discussed in Example 3.8
Although the Suslin A–operation involves an uncountable union of sets, it turns out that
the A–operation preserves measurability, which is quite a surprising result. Before we state
and prove this fact, we make a few observations related to the A–operation.
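For any $\alpha \in \mathbb{N}^n$, let
\[
S^{\alpha} := \bigcup_{\substack{f \in \mathbb{N}^{\mathbb{N}} \\ f|n \le \alpha}} \bigcap_{k=1}^{\infty} E_{f|k}, \qquad
S_{\alpha} := \bigcup_{\substack{\beta \in \mathbb{N}^n \\ \beta \le \alpha}} \bigcap_{k=1}^{n} E_{\beta|k}
\]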
The last statement follows from (3.17) in the case G = X, and Sg ⊂ A(E).
closed balls of radius $2^{-2}$, with centers in D, and that intersect $U_n$. By induction, for each
$(n_1, \ldots, n_k) \in \mathbb{N}^k$, let $\{U_{n_1,\ldots,n_k,n} : n \in \mathbb{N}\}$ be the collection of all closed balls of radius $2^{-k}$,
with centers in D, and that intersect $U_{n_1,\ldots,n_k}$. It is easy to check that
\[
X = \bigcup_n U_n, \qquad U_{n_1,\ldots,n_k} \subset \bigcup_n U_{n_1,\ldots,n_k,n}
\]
and
\[
X = \bigcup_{\alpha \in \mathbb{N}^k} U_{\alpha}, \qquad \operatorname{diam}(U_{\alpha}) = 2^{-k+1}
\]
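As X is complete, for each x ∈ X, there is a unique $g \in \mathbb{N}^{\mathbb{N}}$ such that $x \in U_{g|k}$ for each
k ∈ N. Hence
\[
X = \bigcup_{g \in \mathbb{N}^{\mathbb{N}}} \bigcap_{k=1}^{\infty} U_{g|k}, \qquad
\Omega \times X = \bigcup_{g \in \mathbb{N}^{\mathbb{N}}} \bigcap_{k=1}^{\infty} (\Omega \times U_{g|k})
\]
Then
\begin{align*}
A(\mathcal{E}) = (\Omega \times X) \cap A(\mathcal{E}) &= \bigcup_{g \in \mathbb{N}^{\mathbb{N}}} \bigcap_{k=1}^{\infty} (\Omega \times U_{g|k}) \cap A(\mathcal{E}) \\
&= \bigcup_{g, f \in \mathbb{N}^{\mathbb{N}}} \bigcap_{k=1}^{\infty} A_{f|k} \times (C_{f|k} \cap U_{g|k})
\end{align*}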
Notice that each $\hat{C}_{h|k}$ is closed and $\operatorname{diam}(\hat{C}_{h|k}) \le 2^{-[k/2]+1} \to 0$ as k → ∞. As the class of all
closed sets is closed under countable intersections, substituting $\hat{C}_{h|k}$ by
\[
\hat{C}_{h(1)} \cap \ldots \cap \hat{C}_{h(1),\ldots,h(k)},
\]
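we may assume without loss of generality that $\hat{C}_{h|k+1} \subset \hat{C}_{h|k}$ for all $h \in \mathbb{N}^{\mathbb{N}}$ and k ∈ N. As
X is complete, we have that $\hat{C}_h := \bigcap_{k=1}^{\infty} \hat{C}_{h|k} \neq \emptyset$ iff $\hat{C}_{h|k} \neq \emptyset$ for all k. When $\hat{C}_{h|k} = \emptyset$ we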
3.9. Measurable Isomorphism Theorem* 85
may redefine $\hat{A}_{h|k} = \emptyset$ without altering (3.18). Hence, if $\hat{A}_h := \bigcap_{k=1}^{\infty} \hat{A}_{h|k} \neq \emptyset$ then $\hat{C}_h \neq \emptyset$.
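It follows that
\[
\pi_{\Omega}(A(\mathcal{E})) = \bigcup_{h \in \mathbb{N}^{\mathbb{N}}} \pi_{\Omega}(\hat{A}_h \times \hat{C}_h) = \bigcup_{h \in \mathbb{N}^{\mathbb{N}}} \hat{A}_h
\]
By Theorem 3.8.4, $\pi_{\Omega}(A(\mathcal{E})) \in S(\mathcal{F}) \subset \overline{\mathcal{F}}^{\mu}$.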
The next two general results show that measurably isomorphic partitions lead to isomorphic
spaces, and that isomorphic spaces can be partitioned into isomorphic pieces.
Lemma 3.9.2. Suppose {En : n ∈ N} and {Fn : n ∈ N} are sequences of measurable sets
in (S, S ) and (T, T ) respectively. If $E_i \cap E_j = F_i \cap F_j = \emptyset$ for any i ≠ j and En and Fn
are measurably isomorphic for each n, then $\bigcup_n E_n$ and $\bigcup_n F_n$ are measurably isomorphic.
Proof. For each n ∈ N let fn be a measurable isomorphism between En and Fn . Define
$F : \bigcup_n E_n \to \bigcup_n F_n$ as $x \mapsto f_n(x)$ if x ∈ En . This is a well defined bijective function. To
prove measurability, notice that any Borel set $B \subset \bigcup_n F_n$ is of the form $B = \bigcup_n B_n$, where
Bn is a Borel subset of Fn . Hence $F^{-1}(B) = F^{-1}\big(\bigcup_n B_n\big) = \bigcup_n f_n^{-1}(B_n)$, which implies
that F is measurable. A similar argument proves that $F^{-1}$ is measurable.
Lemma 3.9.3. Suppose E and F are measurable sets in (S, S ), and suppose E and F are
measurably isomorphic. If E = E1 ∪ E2 and E1 ∩ E2 = ∅ then there are disjoint sets F1
and F2 such that F = F1 ∪ F2 and Ei and Fi , i = 1, 2, are isomorphic.
Proof. Let g be an isomorphism between E and F . Let Fi = g(Ei ). Clearly Ei and Fi are
isomorphic, F1 ∩ F2 = ∅, and F1 ∪ F2 = F .
Theorem 3.9.4. Let A, B, and C be measurable sets in (S, S ), and suppose A ⊂ B ⊂ C. If
A and C are isomorphic, then B and C are isomorphic.
the product topology. To check this is the case, observe that for any i ∈ I and open set
$U \in \tau_i$, $p_i^{-1}(U) \in \tau \subset \mathcal{B}\big(\prod_{i \in I} X_i\big)$. It follows that $\{p_i^{-1}(B) : i \in I, B \in \mathcal{B}(X_i)\} \subset \mathcal{B}\big(\prod_{i \in I} X_i\big)$,
and so $\bigotimes_{i \in I} \mathcal{B}(X_i) \subset \mathcal{B}\big(\prod_{i \in I} X_i\big)$. When I is countable, and each topological space Xi is
nice, the product σ–algebra and the Borel σ–algebra generated by the product
topology coincide.
Theorem 3.9.5. Let {(Xn , B(Xn )) : n ∈ N} be a sequence of second countable topological
spaces with the corresponding Borel σ–algebras. Then, the product σ–algebra $\bigotimes_n \mathcal{B}(X_n)$ and
the Borel σ–algebra $\mathcal{B}\big(\prod_n X_n\big)$ generated by the product topology coincide.
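Proof. It is enough to prove that $\mathcal{B}\big(\prod_n X_n\big) \subset \bigotimes_n \mathcal{B}(X_n)$. As each (Xn , τn ) is second
countable, the product topology τ is second countable. Moreover, if $\mathcal{B}_n$ is a countable basis
for τn , then $\mathcal{A} = \{p_n^{-1}(B) : n \in \mathbb{N}, B \in \mathcal{B}_n\}$ is a countable subbasis for τ , and the collection
of finite intersections in $\mathcal{A}$ forms a countable basis $\mathcal{B}$ for τ . Notice that $\mathcal{B} \subset \bigotimes_n \mathcal{B}(X_n)$.
As each open set in τ is a countable union of sets in $\mathcal{B}$, it follows that $\tau \subset \bigotimes_n \mathcal{B}(X_n)$.
Therefore $\mathcal{B}\big(\prod_n X_n\big) \subset \bigotimes_n \mathcal{B}(X_n)$.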
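Lemma 3.9.6. Suppose $\{f_n : (A_n, \mathcal{A}_n) \to (B_n, \mathcal{B}_n)\}$ is a sequence of measurable functions,
and let $F : \big(\prod_n A_n, \bigotimes_n \mathcal{A}_n\big) \to \big(\prod_n B_n, \bigotimes_n \mathcal{B}_n\big)$ be given by
\[
(x_n : n \in \mathbb{N}) \mapsto (f_n(x_n) : n \in \mathbb{N}).
\]
Then, F is measurable.
Proof. For any n and any $B \in \mathcal{B}_n$ denote by $\langle B \rangle_n = \{y \in \prod_m B_m : y_n \in B\}$. It suffices to
show that $F^{-1}(\langle B \rangle_n) \in \bigotimes_n \mathcal{A}_n$. It is easy to check that $F^{-1}(\langle B \rangle_n) = \langle f_n^{-1}(B) \rangle_n$. Therefore
F is measurable.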
Lemma 3.9.7. There is a Borel subset E ⊂ {0, 1}N that is isomorphic to [0, 1].
Proof. The metric $d(x, y) = \sum_n 2^{-n} |x_n - y_n|$ on $\{0, 1\}^{\mathbb{N}}$ is a metric compatible with the
product topology on $\{0, 1\}^{\mathbb{N}}$. As any number in [0, 1] has a binary expansion, the map
$\tau : \{0, 1\}^{\mathbb{N}} \to [0, 1]$ given by
\[
x \mapsto \sum_{n=1}^{\infty} \frac{x_n}{2^n}
\]
is surjective. τ is continuous since |τ (x) − τ (y)| ≤ d(x, y). The set $E = \{x \in \{0, 1\}^{\mathbb{N}} : x(n) = 0 \text{ i.o.}\} \cup \{\mathbb{1}\}$,
where $\mathbb{1}$ denotes the constant sequence of ones, is a Borel set in $\{0, 1\}^{\mathbb{N}}$, and the restriction of τ to E is a bijection
since every number in [0, 1) has a unique binary expansion with an infinite number of 0 bits.
It remains to show that $\tau^{-1} : [0, 1] \to E$ is measurable. Let $B_j = \{x \in \{0, 1\}^{\mathbb{N}} : x_j = 1\}$.
Then
\[
(\tau^{-1})^{-1}(B_j \cap E) = \tau(B_j \cap E) = \{1\} \cup \bigcup_{k=1}^{2^{j-1}} \Big[\frac{2k - 1}{2^j}, \frac{2k}{2^j}\Big)
\]
is a Borel subset of [0, 1].
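The description of τ(Bj ∩ E) as a union of dyadic intervals can be sanity checked numerically. The Python sketch below assumes that, for t ∈ [0, 1), the j–th digit of the expansion with infinitely many zeros is ⌊2^j t⌋ mod 2, and it takes the intervals to be half open on the right; the names digit and in_union are hypothetical.

```python
from fractions import Fraction
from math import floor

def digit(t: Fraction, j: int) -> int:
    """j-th binary digit of t in [0, 1), using the expansion with
    infinitely many zeros (dyadic rationals get the terminating one)."""
    return floor(t * 2 ** j) % 2

def in_union(t: Fraction, j: int) -> bool:
    """True iff t lies in [(2k-1)/2^j, 2k/2^j) for some k = 1, ..., 2^(j-1)."""
    return any(Fraction(2 * k - 1, 2 ** j) <= t < Fraction(2 * k, 2 ** j)
               for k in range(1, 2 ** (j - 1) + 1))

# Exhaustive comparison on a dyadic grid (a sanity check, not a proof).
grid = [Fraction(i, 64) for i in range(64)]
assert all((digit(t, j) == 1) == in_union(t, j)
           for t in grid for j in range(1, 6))
```

Exact rational arithmetic is used so that the endpoints of the dyadic intervals are tested without rounding error.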
Lemma 3.9.8. There is a Borel set E1 ⊂ {0, 1}N that is isomorphic to [0, 1]N.
Proof. Let E be as in Lemma 3.9.7 and define the map $\tau' : (\{0, 1\}^{\mathbb{N}})^{\mathbb{N}} \to [0, 1]^{\mathbb{N}}$ by
\[
(g(n, \cdot) : n \in \mathbb{N}) \mapsto (\tau(g(n, \cdot)) : n \in \mathbb{N}).
\]
Clearly τ ′ is surjective and continuous, and its restriction to $E^{\mathbb{N}}$ is a bijection. By Lemma 3.9.6
and Lemma 3.9.7, the restriction of τ ′ to $E^{\mathbb{N}}$ is an isomorphism.
The conclusion of the Lemma follows from the fact that $(\{0, 1\}^{\mathbb{N}})^{\mathbb{N}}$ and $\{0, 1\}^{\mathbb{N}}$ are
homeomorphic (see Example 2.8.9).
Theorem 3.9.9. Let X be a Polish space, and let B ∈ B(X). There exists a Borel set
$E_B \subset \{0, 1\}^{\mathbb{N}}$ that is isomorphic to B.
Proof. S contains the closed and open sets: Every closed subset of X is a Polish space,
and by Alexandroff’s lemma, every open set is a Polish subspace of X. From Theorem 3.9.10
it follows that S contains all closed and open sets.
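S is closed under countable disjoint unions: Suppose {An : n ∈ N} ⊂ S . For each n ∈ N,
there exists a continuous function $\varphi_n : \{n\} \times \mathbb{N}^{\mathbb{N}} \to X$ with $\varphi_n(\{n\} \times \mathbb{N}^{\mathbb{N}}) = A_n$. Since the
sets $\{n\} \times \mathbb{N}^{\mathbb{N}}$, n ∈ N, form a partition of $\mathbb{N}^{\mathbb{N}}$, we have that the function $\varphi : \mathbb{N}^{\mathbb{N}} \to \bigcup_n A_n$
defined by $\mathbf{n} \mapsto \varphi_m(\mathbf{n})$ if $\mathbf{n} \in \{m\} \times \mathbb{N}^{\mathbb{N}}$ is continuous and $\varphi(\mathbb{N}^{\mathbb{N}}) = \bigcup_n A_n$, which shows
that $\bigcup_n A_n \in \mathcal{S}$.
S is closed under countable intersections: Each subspace An is a separable metric space, and so
$\mathcal{B}\big(\prod_n A_n\big) = \bigotimes_n \mathcal{B}(A_n)$. Notice that $\Delta = \{x \in \prod_n A_n : x_n = x_1, n \in \mathbb{N}\}$ is a closed subset
of $\prod_n A_n$. The function $\Phi : (\mathbb{N}^{\mathbb{N}})^{\mathbb{N}} \to \prod_n A_n$ given by $(\mathbf{n}_k : k \in \mathbb{N}) \mapsto (\varphi_k(\mathbf{n}_k) : k \in \mathbb{N})$
is continuous; hence, $D := \Phi^{-1}(\Delta)$ is a closed subset of the Polish space $(\mathbb{N}^{\mathbb{N}})^{\mathbb{N}}$. Then,
there is a continuous surjection $G : \mathbb{N}^{\mathbb{N}} \to D$. Consequently, $p_1 \circ \Phi \circ G : \mathbb{N}^{\mathbb{N}} \to \bigcap_n A_n$ is a
continuous surjection.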
The conclusion follows from Theorem 3.1.13.
Lemma 3.9.12. Suppose X is a separable metric space. There exists a countable set N ⊂ X
such that for any x ∈ X \ N and open set Ux containing x, Ux ∩ (X \ N ) is uncountable.
Remark 3.9.13. Points in X \ N are called condensation points of X.
Proof. Let N be the set of points x in X which have a neighborhood $N_x$ that is at most
countable. Since X is separable, there is a set {xk } ⊂ N (at most countable) such that
$N = \bigcup_k N_{x_k}$. N is countable and satisfies the conditions in the Lemma.
Proof.
Theorem 3.9.14. Let X be a Polish space. For any uncountable Borel set B ⊂ X, there
is a compact set K ⊂ B that is isomorphic to {0, 1}N.
(X, τ ) is analytic. The following result makes the link between analytic sets and the Suslin
operation.
Theorem 3.9.16. Let (X, τ ) be a Polish space. A set A ⊂ X is analytic iff A = A(I)
where I is a Suslin scheme $\{E_{f|k} : f \in \mathbb{N}^{\mathbb{N}}\}$ of closed sets such that for any $f \in \mathbb{N}^{\mathbb{N}}$
(i) $E_{(f(1),\ldots,f(k+1))} \subset E_{(f(1),\ldots,f(k))}$.
(ii) $\operatorname{diam}(E_{(f(1),\ldots,f(k))}) \to 0$ as k → ∞.
It is left as an exercise to check that t(f, g) ≤ t(f, h) ∨ t(h, g) for all $f, g, h \in \mathbb{N}^{\mathbb{N}}$, whence
it follows that t is a metric. In this metric, $B(f, \tfrac{1}{m}) = \{f(1)\} \times \ldots \times \{f(m)\} \times \mathbb{N}^{\mathbb{N}}$ and so
t generates the product topology on $\mathbb{N}^{\mathbb{N}}$. Moreover, comparing t with the product metric
$d(f, g) = \sum_n \frac{|f(n) - g(n)| \wedge 1}{2^n}$, we have that d(f, g) ≤ t(f, g) and so t is a Polish metric on $\mathbb{N}^{\mathbb{N}}$.
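The two inequalities involving t can be explored on finite prefixes. The Python sketch below rests on an assumption that is not stated in this passage: that t(f, g) is the reciprocal of the first index at which f and g differ (and 0 if f = g), which is consistent with the description of the balls B(f, 1/m) given above. The helper names t and d, and the restriction to prefixes of length 4, are hypothetical choices for the illustration.

```python
from fractions import Fraction
from itertools import product

def t(f, g) -> Fraction:
    """Assumed form of the metric: 1 / (first index where f and g differ),
    with indices starting at 1; 0 if the finite prefixes agree."""
    for n, (a, b) in enumerate(zip(f, g), start=1):
        if a != b:
            return Fraction(1, n)
    return Fraction(0)

def d(f, g) -> Fraction:
    """Product metric sum_n min(|f(n) - g(n)|, 1) / 2**n, truncated to the prefix."""
    return sum((Fraction(min(abs(a - b), 1), 2 ** n)
                for n, (a, b) in enumerate(zip(f, g), start=1)), Fraction(0))

seqs = list(product((1, 2), repeat=4))   # all length-4 prefixes over {1, 2}

# Strong triangle inequality t(f, g) <= max(t(f, h), t(h, g)).
assert all(t(f, g) <= max(t(f, h), t(h, g))
           for f in seqs for g in seqs for h in seqs)
# Comparison with the (truncated) product metric.
assert all(d(f, g) <= t(f, g) for f in seqs for g in seqs)
```

Both checks are finite and only illustrate the inequalities under the stated assumption on t; they do not replace the exercise.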
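\[
M = \{f \in \mathbb{N}^{\mathbb{N}} : E_{f|k} \neq \emptyset \text{ for all } k \in \mathbb{N}\}
\]
Then $A = \bigcup_{f \in M} \bigcap_k E_{f|k}$, and there is a map φA : M → X given by $\varphi_A : f \mapsto \bigcap_k E_{f|k}$.
Notice that if $t(f, g) < \frac{1}{k}$ then $\varphi_A(g) \in E_{g|k} = E_{f|k}$. Since $\operatorname{diam}_{\rho}(E_{f|k}) \to 0$ as k → ∞, continuity of
φA follows.
We claim that M is a closed subset of $\mathbb{N}^{\mathbb{N}}$. Let $f \in M^c$, so that $E_{f|k} = \emptyset$ for some k ∈ N.
If $t(f, g) < \frac{1}{k}$ then $E_{g|k} = E_{f|k} = \emptyset$. Hence $B_t(f; \frac{1}{k}) \subset M^c$. Being a closed set, M is itself a
Polish space, and so by Theorem 3.9.10 there exists a continuous surjection $G : \mathbb{N}^{\mathbb{N}} \to M$.
The map φA ◦ G is a continuous map with $(\varphi_A \circ G)(\mathbb{N}^{\mathbb{N}}) = A$.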
Remark 3.9.17. Theorem 3.8.4 and the regular Suslin representation of analytic sets
imply that for any Polish space (X, τ ) and measure µ on (X, B(X)), the analytic sets are
included in $\overline{\mathcal{B}(X)}^{\mu}$.
3.10. Exercises
Exercise 3.10.1. Suppose that {Fi }i∈I is an arbitrary collection of algebras (or σ–algebras),
show that ∩i∈I Fi is also an algebra (respectively a σ–algebra).
Exercise 3.10.2. Let Ω be an uncountable set. Consider the collection A of all subsets
A ⊂ Ω such that either A or Ω \ A is countable. Is A a σ–algebra? (Here by countable we
mean either finite or countably infinite).
Exercise 3.10.3. Let C be a collection of subsets of Ω. Show that for each A ∈ σ(C) there
is a countable sub-family C0 of C such that A ∈ σ(C0 ). (Hint: Let F be the union of all
σ–algebras σ(L) where L runs over all the countable sub-families of C, and show that F is
a σ–algebra that satisfies C ⊂ F ⊂ σ(C).)
Exercise 3.10.4. Show that any positive finitely (countably) additive set function µ on a
semiring E of Ω is finitely (countably) subadditive.
Exercise 3.10.5. Suppose (Ω, F , µ) is a measure space. Show that
(a) If A, B ∈ F and B ⊂ A, then µ[B] ≤ µ[A]. If in addition µ[B] < ∞, then
µ[A \ B] = µ[A] − µ[B].
(b) For any {An } ⊂ F , $\mu\big[\bigcup_n A_n\big] \le \sum_n \mu[A_n]$.
Continuity properties. Let {An } ⊂ F .
(c) If An+1 ⊂ An for all n and µ[A1 ] < ∞, then $\mu\big[\bigcap_n A_n\big] = \lim_{n \to \infty} \mu[A_n]$. (Hint:
observe that $A_1 = D \cup \bigcup_n (A_n \setminus A_{n+1})$ where $D = \bigcap_n A_n$.)
(d) If An ⊂ An+1 for all n, then $\mu\big[\bigcup_n A_n\big] = \lim_{n \to \infty} \mu[A_n]$.
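Exercise 3.10.6. For any measure space (Ω, F , µ),
(a) Show that
\begin{align*}
\overline{\mathcal{F}}^{\mu} &= \{E \subset \Omega : \exists A, B \in \mathcal{F} \text{ with } A \subset E \subset B \text{ and } \mu^*(B \setminus A) = 0\} \\
&= \{E \subset \Omega : \exists B \in \mathcal{F} \text{ with } E \triangle B \in \mathcal{N}_{\mu}\}
\end{align*}
where µ∗ is the outer measure induced by µ.
(b) Show that the measure µ on F extends uniquely to $\overline{\mathcal{F}}^{\mu}$ by setting µ(E) := µ(A)
where A ∈ F and µ∗ (A△E) = 0. In fact, µ(E) = sup{µ(A) : A ∈ F , A ⊂ E}.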
Exercise 3.10.7. Let (E , µ) be as in Theorem 3.3.5. Show that for any increasing sequence
of subsets En ⊂ Ω, $\mu^*\big(\bigcup_n E_n\big) = \lim_n \mu^*(E_n)$.
Exercise 3.10.8. (Cantor sets). Consider the space ([0, 1], B([0, 1]), λ), where λ is the
Lebesgue measure. Let 0 < β ≤ 1/3, and set F0 = I0 = [0, 1]. Remove from F0 the middle
open interval of size β. This leaves two disjoint closed subintervals I11 , I12 of the same size
whose union we denote by F1 . Suppose that the set Fn has been constructed so that it is the
union of $2^n$ closed subintervals {Ink }k of the same size. From each subinterval Ink , remove
the middle open interval of size $\beta^{n+1}$, leaving $2^{n+1}$ disjoint closed subintervals {In+1,k }k of
the same size whose union we denote by Fn+1 . Let $C := \bigcap_n F_n$.
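As a small aid for experimenting with this construction, the following Python sketch tracks the total length of Fn using exact rational arithmetic. It assumes only the removal rule described above (at stage n, each of the 2^n intervals loses a middle piece of length β^{n+1}); the function names length_after and limit_length are hypothetical.

```python
from fractions import Fraction

def length_after(beta: Fraction, steps: int) -> Fraction:
    """Lebesgue measure of F_steps: at stage n = 0, 1, ..., steps - 1 each of
    the 2**n intervals loses a middle open interval of length beta**(n + 1)."""
    removed = sum((2 ** n * beta ** (n + 1) for n in range(steps)), Fraction(0))
    return 1 - removed

def limit_length(beta: Fraction) -> Fraction:
    """Value of 1 - sum_n 2**n * beta**(n+1) = 1 - beta / (1 - 2*beta), beta < 1/2."""
    return 1 - beta / (1 - 2 * beta)

print(length_after(Fraction(1, 3), 2))   # length of F_2 for beta = 1/3
print(limit_length(Fraction(1, 3)))      # limiting length for beta = 1/3
print(limit_length(Fraction(1, 4)))      # limiting length for beta = 1/4
```

With β = 1/3 the limiting length is 0, while β = 1/4 leaves a set of length 1/2, since the removed lengths form a geometric series with ratio 2β.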
Integration: measure
theoretic approach
The moment or mean of a random variable is the average value of the observable after repli-
cating the experiment a large number of times under the same conditions of the experiment.
Example 4.0.1. (Fair die) The set Ω = {1, 2, 3, 4, 5, 6} contains all the possible outcomes
of rolling a die. Let X denote the double of the number of dots facing up after the die
comes to rest. This is a random variable X(ω) = 2ω, ω ∈ Ω. If the die is fair, the mean
value of X is $2 \cdot \frac{1}{6} + 4 \cdot \frac{1}{6} + 6 \cdot \frac{1}{6} + 8 \cdot \frac{1}{6} + 10 \cdot \frac{1}{6} + 12 \cdot \frac{1}{6} = 7$.
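The arithmetic in this example can be reproduced with exact fractions. The short Python sketch below is only an illustration; the variable names are hypothetical.

```python
from fractions import Fraction

omega = range(1, 7)                       # outcomes of one roll
p = Fraction(1, 6)                        # fair die: each outcome has probability 1/6
mean_X = sum(2 * w * p for w in omega)    # X(w) = 2w, weighted by its probability
print(mean_X)                             # 7
```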
Example 4.0.2. In example 3.1.2 corresponding to the roulette spun around its center,
let Y = (cos ω, sin ω), where ω ∈ [0, 2π) is the angle observed after spinning the roulette
once. Y is a random variable with values on the unit circle. If the roulette is such that the
probability is uniformly distributed along [0, 2π), then the mean of Y is (0, 0).
Lemma 4.1.1. For any finite collection I = {A1 , . . . , An } of sets in a semiring R, there
exists another finite collection C = {C1 , . . . , Cm } of pairwise disjoint sets in R such that
(i) For each Cj ∈ C, there is Aℓ ∈ I with Cj ⊂ Aℓ .
(ii) For each Aℓ ∈ I, $A_\ell = \bigcup \{C_j \in \mathcal{C} : C_j \subset A_\ell\}$.
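For a concrete feel of the lemma, consider the semiring of half–open intervals (a, b] in R. The Python sketch below builds a disjoint refinement by cutting at every endpoint; this is one simple construction for this particular semiring, not necessarily the one used in the proof, and the function name refine is hypothetical.

```python
def refine(intervals):
    """Given half-open intervals (a, b], encoded as pairs (a, b) with a < b,
    return pairwise disjoint intervals (c, d] such that each piece lies in
    some input interval and each input is the union of the pieces inside it."""
    cuts = sorted({x for ab in intervals for x in ab})    # all endpoints
    pieces = []
    for c, d in zip(cuts, cuts[1:]):                      # consecutive endpoints
        if any(a <= c and d <= b for a, b in intervals):  # (c, d] inside some (a, b]
            pieces.append((c, d))
    return pieces

I = [(0, 3), (1, 4), (6, 7)]
C = refine(I)
print(C)   # [(0, 1), (1, 3), (3, 4), (6, 7)]
```

Because every endpoint of an input interval is a cut point, the pieces between a and b tile (a, b] exactly, which is property (ii); property (i) holds by the filter in the loop.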
Proof. Suppose {a1 , . . . , an } are all the non–zero values that φ takes. Each Aj := {φ =
aj } ∈ R. Suppose that φ has another representation
\[
\varphi = \sum_{k=1}^{m} b_k \mathbb{1}_{B_k}
\]
where {B1 , . . . , Bm } are pairwise disjoint and bk ≠ 0 for all k. We show that
\[
\sum_{j=1}^{n} a_j \mu(A_j) = \sum_{k=1}^{m} b_k \mu(B_k)
\]
First, notice that $\bigcup_{j=1}^{n} A_j = \bigcup_{k=1}^{m} B_k$, and that if Aj ∩ Bk ≠ ∅, then aj = bk . Hence
aj µ(Aj ∩ Bk ) = bk µ(Aj ∩ Bk ) for all 1 ≤ j ≤ n and 1 ≤ k ≤ m. This shows that
\begin{align*}
\sum_{j=1}^{n} a_j \mu(A_j) &= \sum_{j=1}^{n} \sum_{k=1}^{m} a_j \mu(A_j \cap B_k) \\
&= \sum_{k=1}^{m} \sum_{j=1}^{n} b_k \mu(A_j \cap B_k) = \sum_{k=1}^{m} b_k \mu(B_k)
\end{align*}
To show that (4.1) is linear on E(R) consider two measurable simple functions φ1 and φ2 .
Let I be the collection of all non–void level sets {φi = r}, i = 1, 2, with r ≠ 0 and let C be a
4.1. Simple functions and integration 99
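Hence
\[ \phi_1 + \phi_2 = \sum_{C \in \mathcal{C}} (\phi_1(C) + \phi_2(C)) 1_C. \]
As C is pairwise disjoint,
\[ \sum_{r \in \mathbb{R}\setminus\{0\}} r\,\mu(\{\phi_i = r\}) = \sum_{C \in \mathcal{C},\ \phi_i(C) \neq 0} \phi_i(C)\,\mu(C) \]
and so
\[ \sum_{r \in \mathbb{R}\setminus\{0\}} r\,\mu(\{\phi_1 + \phi_2 = r\}) = \sum_{r \in \mathbb{R}\setminus\{0\}} r\,\mu(\{\phi_1 = r\}) + \sum_{r \in \mathbb{R}\setminus\{0\}} r\,\mu(\{\phi_2 = r\}) = \sum_{C \in \mathcal{C},\ \phi_1(C)+\phi_2(C) \neq 0} (\phi_1(C) + \phi_2(C))\,\mu(C). \]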
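Finally, we show that the extension µ does not depend on how φ ∈ E(R) is represented as a
linear combination of indicator functions of sets in R. Suppose $\phi = \sum_{k=1}^{p} b_k 1_{B_k}$ and let C be as in
Lemma 4.1.1 for {B1, . . . , Bp}. Then
\[ \sum_{k=1}^{p} b_k 1_{B_k} = \sum_{k=1}^{p} b_k \sum \{1_C : C \in \mathcal{C},\ C \subset B_k\} = \sum_{C \in \mathcal{C}} \Big( \sum \{b_k : C \subset B_k\} \Big) 1_C \]
and
\[ \sum_{k=1}^{p} b_k \mu(B_k) = \sum_{k=1}^{p} b_k \sum \{\mu(C) : C \in \mathcal{C},\ C \subset B_k\} = \sum_{C \in \mathcal{C}} \Big( \sum \{b_k : C \subset B_k\} \Big) \mu(C) = \mu(\phi). \]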
Remark 4.1.3. If µ is nonnegative extended real valued, then the conclusion of Theorem 4.1.2
holds for all simple functions φ ≥ 0. The proof is exactly as before since only finite
summations of nonnegative extended real numbers are involved.
Proof. Suppose $\phi = \sum_{k=1}^{n} b_k 1_{A_k}$ where bk ≥ 0 are the distinct values of φ. For any pairwise
disjoint sequence {Ej : j ∈ N} ⊂ σ(R) with $E = \bigcup_j E_j$,
\[ \nu(E) = \sum_{k=1}^{n} b_k \mu(A_k \cap E) = \sum_{k=1}^{n} b_k \sum_{j=1}^{\infty} \mu(A_k \cap E_j) = \sum_{j=1}^{\infty} \sum_{k=1}^{n} b_k \mu(A_k \cap E_j) = \sum_{j=1}^{\infty} \nu(E_j). \]
This means that ν is countably additive. Clearly ν(E) ≥ 0 for all E ∈ F and the
proof is complete.
The goal of integration theory is to extend the linear functional µ, called the integral on
E(R), to a larger class of functions. Caratheodory’s extension theorem allows us to extend
first a measure over a semiring to a σ–algebra of sets. The collection of sets with finite
measure forms a ring of sets and Theorem 4.1.2 allows us to extend the measure linearly to
simple functions.
Proof. (i) Let φ : [0, ∞] → [0, ∞] be φ(x) = x and φ0(x) ≡ 0, $\phi_n(x) = 2^{-n}[2^n x] 1_{[0,n)}(x) + n 1_{[n,\infty]}(x)$
for n ≥ 1. If 0 ≤ x ≤ n then $0 \le x - \phi_n(x) \le \frac{1}{2^n}$, thus $\lim_n \phi_n(x) = x$. The
sequence sn := φn ◦ f has the desired properties.
4.2. Lebesgue Integration
(ii) Notice that $f = \sum_{n=1}^{\infty} (s_n - s_{n-1})$, and $2^n (s_n - s_{n-1})$ ∈ F is an indicator function.
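The dyadic approximation used in part (i) of the proof can be sketched in a few lines. The test function `f` and the point `w` below are hypothetical choices for illustration; `phi` takes the value n for x ≥ n, and `s` is the composition φn ◦ f.

```python
import math

def phi(n, x):
    """phi_n(x) = 2^{-n} * floor(2^n x) for x < n, and n for x >= n."""
    if n == 0:
        return 0.0
    if x >= n:
        return float(n)
    return math.floor(2**n * x) / 2**n

def s(n, f, w):
    """s_n = phi_n o f, a simple function that lies below f."""
    return phi(n, f(w))

f = lambda w: w * w  # hypothetical nonnegative function
w = 1.3
values = [s(n, f, w) for n in range(12)]  # nondecreasing in n
```

For this choice of `f` and `w`, the list `values` increases towards f(w), and for n larger than f(w) the error is at most 2^{-n}.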
Definition 4.2.3. For any measurable function f : Ω → [0, ∞], the integral of f over
E ∈ F is defined by
\[ \int_E f \, d\mu := \sup\Big\{ \int_E s \, d\mu : 0 \le s \le f,\ s \text{ is simple} \Big\} \tag{4.3} \]
Proof. Observe that the function $g_t(\omega) := t\, 1_{f^{-1}((t,\infty])}(\omega)$ is a simple measurable function and
$0 \le g_t \le f\, 1_{f^{-1}((t,\infty])} \le f$.
Proof. (i) Let $A_n = f^{-1}((n, \infty])$ and observe that $A_n \searrow f^{-1}(\{\infty\})$. By the
Chebyshev–Markov inequality,
\[ \mu(A_n) \le \frac{1}{n} \int_\Omega f \, d\mu; \]
the conclusion follows by letting n → ∞ since µ(A1) < ∞.
(ii) Let $B_n = f^{-1}((\tfrac{1}{n}, \infty])$ and observe that $B_n \nearrow f^{-1}((0, \infty])$. By the Chebyshev–Markov inequality,
\[ \mu(B_n) \le n \int_\Omega f \, d\mu = 0. \]
The conclusion follows immediately.
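The Chebyshev–Markov bound µ({f > t}) ≤ (1/t) ∫ f dµ used in this proof can be checked on a small example. The sketch below uses a hypothetical finite measure space (ten points of equal mass) and a hypothetical function f; exact fractions avoid rounding issues.

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..9, each of mass 1/10.
mu = {w: Fraction(1, 10) for w in range(10)}
f = {w: Fraction(w * w) for w in mu}  # a nonnegative function

# Integral of f with respect to mu.
integral = sum(f[w] * mu[w] for w in mu)

def tail(t):
    """mu({f > t})."""
    return sum(mu[w] for w in mu if f[w] > t)

# Chebyshev-Markov: mu({f > t}) <= (1/t) * integral for each t > 0.
checks = [tail(t) <= integral / t for t in (1, 5, 20, 50, 100)]
```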
A property P about Ω occurs almost surely if µ({ω ∈ Ω : P(ω) is false}) = 0. This
is commonly denoted by P occurs µ–a.s. In this context Corollary 4.2.5 states that (i) if
f ≥ 0 and $\int_\Omega f \, d\mu < \infty$ then f is finite µ–a.e.; (ii) if in addition the integral is zero, then f
is zero µ–a.s.
Example 4.2.6. In the Steinhaus space ([0, 1], B, λ), the functions 1[0,1]\Q and 1[0,1] are
equal λ–a.s; also, almost surely every ω ∈ [0, 1] has a binary expansion with an infinite
number of ones.
Let s be a simple function with 0 ≤ s ≤ f and 0 < c < 1. Consider the sets En = {ω ∈
Ω : c s(ω) ≤ fn(ω)}. Observe that En ⊂ En+1 for all n and that $\Omega = \bigcup_n E_n$. Indeed, if
f(ω) = 0 then ω ∈ E1; whereas if f(ω) > 0 then c s(ω) ≤ c f(ω) < f(ω) and so ω ∈ En for
some n. Consequently,
\[ \int_\Omega f_n \, d\mu \ge \int_{E_n} f_n \, d\mu \ge c \int_{E_n} s \, d\mu. \]
Letting n → ∞ we obtain that $\alpha \ge c \int_\Omega s \, d\mu$ by Theorem 4.1.4. Letting c ↗ 1 we obtain
\[ \alpha \ge \int_\Omega s \, d\mu. \tag{4.7} \]
Example 4.3.2. On ((0, 1), B((0, 1)), λ) the function $f(x) = \frac{1}{x^p}$ is integrable if p < 1.
Indeed, for p ≠ 1, monotone convergence gives
\[ \int_{(0,1)} f \, d\lambda = \lim_{n\to\infty} \int_{[n^{-1},1)} x^{-p} \, dx = \lim_{n\to\infty} \frac{1}{1-p} \Big( 1 - \frac{1}{n^{1-p}} \Big). \]
The limit is finite (1/(1 − p)) when p < 1 and infinity when p > 1. When p = 1,
\[ \int_{(0,1)} f \, d\lambda = \lim_{n\to\infty} \int_{[n^{-1},1)} \frac{1}{x} \, dx = \lim_{n\to\infty} \log(n) = \infty. \]
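The three regimes in Example 4.3.2 can be seen numerically by evaluating the truncated integrals over [1/n, 1) from the antiderivative. This is a minimal sketch; the function name `truncated_integral` is introduced here for illustration.

```python
import math

def truncated_integral(p, n):
    """Integral of x^(-p) over [1/n, 1), computed from the antiderivative."""
    if p == 1:
        return math.log(n)
    # (1/(1-p)) * (1 - 1/n^(1-p)), written with n^(p-1) = 1/n^(1-p)
    return (1 - n ** (p - 1)) / (1 - p)

for p in (0.5, 1, 1.5):
    print(p, [truncated_integral(p, 10 ** k) for k in (2, 4, 6)])
```

For p = 0.5 the values approach 1/(1 − p) = 2, while for p = 1 and p = 1.5 they grow without bound.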
4.3. Monotone Convergence
The set lim supn An , usually denoted by {An i.o}, is the set in which the events An occur
infinitely often.
Then, µf is a measure on F and for any measurable function g : Ω → [0, ∞] we have that
\[ \int_\Omega g \, d\mu_f = \int_\Omega g f \, d\mu \tag{4.10} \]
Observe that (4.10) holds if g = 1E with E ∈ F and so it holds also for any nonnegative
simple function. The result follows by monotone convergence and Lemma 4.2.2.
Remark 4.3.6. If two measurable functions f, g : Ω → [0, ∞] are equal µ–a.s. then µg(E) =
µf(E) for all E ∈ F. Indeed, if A = {ω ∈ Ω : f(ω) ≠ g(ω)} then $\int_A f \, d\mu = 0 = \int_A g \, d\mu$.
Since µf(E) = µf(E ∩ A) + µf(E \ A), it follows that µf(E) = µg(E). This shows that the
MCT and its equivalents can be restated by assuming that the hypotheses hold µ–almost
surely.
Proof. Consider the sequence $g_n(\omega) := \inf_{k \ge n} f_k(\omega)$. Observe that each gn is measurable,
0 ≤ gn ≤ gn+1, gn ≤ fn and $\lim_{n\to\infty} g_n = \liminf_{n\to\infty} f_n$. Thus,
\[ \int_\Omega g_n \, d\mu \le \int_\Omega f_n \, d\mu. \tag{4.12} \]
Letting n → ∞, the conclusion of the statement follows from the MCT.
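The inequality obtained from (4.12) can be strict. A standard illustration on [0, 1] with Lebesgue measure, sketched below under the assumption fn = n · 1_{(0,1/n)}, has integrals equal to 1 for every n while fn converges pointwise to 0. The helper names are chosen here for illustration.

```python
from fractions import Fraction

def f(n, x):
    """f_n = n * 1_{(0, 1/n)} evaluated at x in [0, 1]."""
    return n if 0 < x < Fraction(1, n) else 0

def integral(n):
    """Exact Lebesgue integral of f_n: height n times length 1/n."""
    return n * Fraction(1, n)

x = Fraction(1, 7)
# f_n(x) = 0 as soon as 1/n <= x, so lim inf f_n(x) = 0 at this point.
tail_values = [f(n, x) for n in range(7, 50)]
```

Here the integral of lim inf fn is 0, strictly smaller than lim inf of the integrals, which is 1.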
If a > 0, then af = (af)+ − (af)− = af+ − af−; whereas if a < 0 then af = (af)+ − (af)− =
−af− − (−af+). It follows immediately that $\int_\Omega af \, d\mu = a \int_\Omega f \, d\mu$.
The complex valued case follows from the real one by considering the real and imaginary
parts u = Re(f) and v = Im(f) separately, and from i f = i (u + i v) = −v + i u.
Theorem 4.3.11. If f ∈ L1(Ω, F, µ) then
\[ \Big| \int_\Omega f \, d\mu \Big| \le \int_\Omega |f| \, d\mu. \tag{4.16} \]
Equality in (4.16) holds iff there is a constant α ∈ C with |α| = 1 such that αf = |f| µ–a.s.
Proof. For an extended–real valued function f the result follows from −|f| ≤ f ≤ |f|. For
the complex valued case, denote by $z = \int_\Omega f \, d\mu \in \mathbb{C}$ and let α ∈ S^1 be such that αz = |z|.
Then by Theorem 3.10.19
\[ \Big| \int_\Omega f \, d\mu \Big| = \alpha \int_\Omega f \, d\mu = \int_\Omega \alpha f \, d\mu = \int_\Omega \mathrm{Re}(\alpha f) \, d\mu \le \int_\Omega |f| \, d\mu \tag{4.17} \]
where the last two relations in (4.17) follow from $|\int_\Omega f \, d\mu| \ge 0$ and Re(αf) ≤ |αf| = |f|
respectively.
If there is equality in (4.16) then, from |f | − Re(αf ) ≥ 0 and Corollary 4.2.5 we conclude
that |αf | = Re(αf ) a.s., that is, αf = Re(αf ) = |f | a.s.
Remark 4.3.12. If (Ω, F, µ) is a probability space and f ∈ L1(µ), then the integral
$\int_\Omega f \, d\mu$, commonly denoted by Eµ[f], is called the expectation or expected value of f
under µ. The mention of µ is omitted when µ is clear from the context.
Lemma 4.3.13. Suppose f ∈ L1, then for any ε > 0 there is δ > 0 such that for any
A ∈ F, if µ(A) < δ, then $|\int_A f \, d\mu| < \varepsilon$.
then f ∈ L1 and
\[ \lim_{n\to\infty} \int_\Omega |f_n - f| \, d\mu = 0, \qquad \lim_{n\to\infty} \int_\Omega f_n \, d\mu = \int_\Omega f \, d\mu \tag{4.20} \]
Proof. Without loss of generality, we can assume that pointwise convergence and (4.18)
hold everywhere.
Clearly |f| ≤ g and so, f ∈ L1. Since gn + g − |fn − f| ≥ 0, from Fatou’s lemma and (4.19)
we obtain
\[ \int_\Omega 2g \, d\mu \le \liminf_n \int_\Omega (g_n + g - |f - f_n|) \, d\mu = 2 \int_\Omega g \, d\mu + \liminf_n \Big( - \int_\Omega |f - f_n| \, d\mu \Big) = 2 \int_\Omega g \, d\mu - \limsup_n \int_\Omega |f - f_n| \, d\mu. \]
Since $\int_\Omega |f_n - f| \, d\mu \ge 0$, $\limsup_n \int_\Omega |f - f_n| \, d\mu = 0$. To conclude, notice that
$|\int_\Omega (f_n - f) \, d\mu| \le \int_\Omega |f_n - f| \, d\mu$.
Theorem 4.4.2. If {fn : n ∈ N} ⊂ L1 is a Cauchy sequence, then there is f ∈ L1 such
that $\lim_n \|f_n - f\| = 0$. If f̃ is any other such function, then f = f̃ µ–a.s.
Remark 4.4.3. This result says that after identifying all integrable functions that differ on
sets of measure zero, the resulting space L1 is a Banach space with norm $\|f\| := \int |f| \, d\mu$.
Proof. Since {fn} is a Cauchy sequence, there is a subsequence {f_{n_k} : k ∈ N} such that
$\|f_{n_{k+1}} - f_{n_k}\| < 2^{-k}$. Let
\[ g_k = f_{n_1} + \sum_{j=1}^{k} (f_{n_{j+1}} - f_{n_j}), \qquad G_k = |f_{n_1}| + \sum_{j=1}^{k} |f_{n_{j+1}} - f_{n_j}|. \]
The following application of dominated convergence will help illustrate the strength of
the monotone class theorem.
Theorem 4.4.4. Suppose µ and ν are finite measures on ([0, 1], B([0, 1])). If
$\int x^n \, \mu(dx) = \int x^n \, \nu(dx)$ for all n ∈ Z+ then µ = ν.
4.5. Riemann integral and Lebesgue integral on R.
The common value A(f ) in (4.23) is called the Riemann integral of f over [a, b].
Proof. Choose partitions Pn ⊂ Pn+1 such that U(f, Pn) − L(f, Pn) < 1/n. For each
partition Pn, let $m_{n,k} = \inf\{f(t) : t \in [t_{n,k-1}, t_{n,k}]\}$ and $M_{n,k} = \sup\{f(t) : t \in [t_{n,k-1}, t_{n,k}]\}$.
Let gn and hn be defined by gn(a) = hn(a); and $g_n(t) = m_{n,k}$, $h_n(t) = M_{n,k}$ on
$t \in (t_{n,k-1}, t_{n,k}]$. Clearly, gn ≤ gn+1 ≤ f ≤ hn+1 ≤ hn on [a, b] \ Pn, and
$\int_{[a,b]} g_n = L(f, P_n) \le U(f, P_n) = \int_{[a,b]} h_n$.
Dominated convergence implies $\int_{[a,b]} g(x)\,dx = \int_{[a,b]} h(x)\,dx = A(f)$. Thus, since
$g = \lim_n g_n \le f \le \lim_n h_n = h$, we have g = f = h a.s. Let D = {t ∈ [a, b] : g(t) < f(t)}.
Then, f is continuous at every point $x \notin \bigcup_n P_n \cup D$.
Example 4.5.3. The function $f = 1_{[0,1]\setminus\mathbb{Q}}$ belongs to L1([0, 1]) and $\int_{[0,1]} f \, d\lambda = 1$; however, f is not
Riemann integrable in [0, 1] since U(f, P) − L(f, P) = 1 for any partition P of [0, 1].
Let f be a real valued function defined on an interval [a, b]. The modulus of continuity
of f on a set T ⊂ [a, b] is defined as
\[ \Omega_f(T) := \sup\{f(x) - f(y) : x, y \in T\}. \]
For x ∈ [a, b], the modulus of continuity of f at x is defined as
\[ \omega_f(x) = \lim_{\delta \searrow 0} \Omega_f(B(x; \delta) \cap [a, b]) = \inf_{\delta > 0} \Omega_f(B(x; \delta) \cap [a, b]) \]
Lemma 4.5.4. If ωf(x) < ε for all x ∈ [c, d] ⊂ [a, b], then there exists δ > 0 such that Ωf(T) < ε
for all T ⊂ [c, d] with diam(T) < δ.
Proof. For any x ∈ [c, d] there is δx > 0 such that Ωf(B(x; δx) ∩ [c, d]) < ε. The collection
of all B(x; δx/2) forms an open cover of [c, d]. By compactness, there are x1, . . . , xk with
$[c, d] \subset \bigcup_{j=1}^{k} B(x_j; \delta_j/2)$, where $\delta_j := \delta_{x_j}$. Let δ = min{δj/2}. If T ⊂ [c, d] with diam(T) < δ, then T is fully
contained in at least one B(xj; δj), so Ωf(T) < ε.
Proof. Only sufficiency remains to be proved. For each r > 0, define Jr = {x ∈ [a, b] :
ωf(x) ≥ r}. Each Jr is a closed subset of [a, b], see Lemma 17.3.2, and the set of discontinuities
of f is $J = \bigcup_{k \in \mathbb{N}} J_{1/k}$. Then, each $J_{1/k}$ is a compact subset of measure zero; thus, for
each k, there is a finite collection Ak of open (w.r.t. [a, b]) intervals covering $J_{1/k}$ whose lengths
add up to less than 1/k. The complement of the union of the intervals in Ak is a finite collection Bk of
closed subintervals. By Lemma 4.5.4, there is δk > 0 such that if T ⊂ [a, b] \ ⋃Ak and
diam(T) < δk, then Ωf(T) < 1/k. Let Pk be the partition formed by the endpoints of the intervals in Ak,
and by the subintervals contained in Bk whose lengths are less than δk. Then,
U(f, Pk) − L(f, Pk) = S1 + S2
where S1 is formed by the subintervals in Ak and S2 by subintervals contained in Bk. Then
S1 ≤ (M − m)/k and S2 ≤ (b − a)/k; hence, for k large enough we have that U(f, Pk) −
L(f, Pk) < ε.
An important example of Riemann integrable functions are the so called piecewise
continuous functions. A function f on [a, b] is piecewise continuous if there exists a finite set
C ⊂ [a, b] such that f is continuous on [a, b] \ C, and f admits finite left limits and right
limits at every point in (a, b] and [a, b) respectively. A piecewise continuous function f on
[a, b] is piecewise differentiable if f is continuously differentiable outside of a countable
set D ⊂ [a, b], and its derivative f′, defined in [a, b] \ D, admits left and right limits at every
point of (a, b] and [a, b) respectively. A function is piecewise continuous (differentiable) in
R if it is piecewise continuous (differentiable) on any finite interval [a, b].
4.6. Integration under measurable transformations
Proof. It suffices to consider the case g ≥ 0 and then apply the conclusion of this case
to the real and imaginary parts u and v of g and the positive and negative parts of u and v. By
Lemma 4.2.2(ii) we have that $g = \sum_{n=1}^{\infty} \alpha_n 1_{A_n}$ where αn ≥ 0 are constants and An ∈ σ(f).
Thus $A_n = f^{-1}(B_n)$ for some Bn ∈ F′. Thus $g = \sum_{n=1}^{\infty} \alpha_n 1_{B_n} \circ f$.
Definition 4.6.2. Let T : (Ω, F, µ) → (R, R) be measurable. We define the induced measure
µT on R by
\[ \mu_T(B) := \mu(T^{-1}(B)) \tag{4.25} \]
for all B ∈ R. The measure $\mu_T := \mu \circ T^{-1}$ is called the push forward of µ by T. When µ
is a probability measure, the induced measure is called the law or distribution of T under
µ.
Theorem 4.6.3. Consider T : (Ω, F, µ) → (R, R) and the induced measure $\mu \circ T^{-1}$ on
R. Suppose that f is an extended real or complex valued function defined on (R, R). Then,
f ◦ T ∈ L1(Ω, F, µ) if and only if $f \in L_1(R, \mathcal{R}, \mu \circ T^{-1})$. Furthermore,
\[ \int_\Omega f \circ T \, d\mu = \int_R f \, d(\mu \circ T^{-1}) \tag{4.26} \]
Proof. The statement holds for indicator functions by (4.25) and thus, by linearity it holds
for simple functions. The extension to nonnegative real valued measurable functions follows
by monotone convergence. For general f , the conclusion follows by applying (4.26) to the
ℜ(f )+ , ℜ(f )− , ℑ(f )+ and ℑ(f )− separately.
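On a finite space the push forward (4.25) and the change of variables formula (4.26) reduce to finite sums, which makes them easy to check. The sketch below uses a hypothetical fair die measure, a hypothetical map T(w) = w mod 3 and a hypothetical integrand f; none of these appear in the text.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical example: fair die measure on {1, ..., 6}.
mu = {w: Fraction(1, 6) for w in range(1, 7)}

def T(w):
    return w % 3  # a map into {0, 1, 2}

def pushforward(mu, T):
    """mu_T(B) = mu(T^{-1}(B)), stored through its point masses."""
    nu = defaultdict(Fraction)
    for w, mass in mu.items():
        nu[T(w)] += mass
    return dict(nu)

mu_T = pushforward(mu, T)

def f(y):
    return y * y + 1

lhs = sum(f(T(w)) * mass for w, mass in mu.items())  # integral of f o T d mu
rhs = sum(f(y) * mass for y, mass in mu_T.items())   # integral of f d mu_T
```

Both sums agree, as (4.26) predicts; here each value of T carries mass 1/3 under the induced measure.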
Theorem 4.6.4. Suppose F : R −→ R is a nondecreasing right–continuous function
(F (t+) := limx→t+ F (x) = F (t)). Then, there exists a unique measure µ on (R, B(R))
such that µ((a, b]) = F (b) − F (a) for all a < b.
Proof. For $\alpha = \inf_{x \in \mathbb{R}} F(x)$ and $\beta = \sup_{x \in \mathbb{R}} F(x)$ let ((α, β), B((α, β)), λ) be a standard
Lebesgue space. Define the map X : (α, β) −→ R by
X(t) = inf{x ∈ R : F(x) ≥ t}
Monotonicity and right–continuity of F imply that X(t) ≤ x if and only if
t ≤ F(x). Hence X is measurable, and the induced measure $\mu := \mu_X = \lambda \circ X^{-1}$ on B(R) satisfies
$\mu((a, b]) = \lambda(X^{-1}((a, b])) = \lambda((F(a), F(b)]) = F(b) - F(a)$.
It follows that µ is a σ–finite measure on (R, B) such that µ((a, b]) = F(b) − F(a). Since the
collection of intervals {(a, b] : a < b} is a π–system that generates B(R), uniqueness follows
by Sierpinski’s theorem.
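The key step of the proof, namely that X(t) = inf{x : F(x) ≥ t} satisfies X(t) ≤ x iff t ≤ F(x), can be tested on a step function. The sketch below assumes a hypothetical right–continuous F with jumps at 0, 1 and 3 (so that α = 0 and β = 1); the lists `xs` and `cum` are illustrative data only.

```python
import bisect
from fractions import Fraction

# Hypothetical right-continuous step function F with jumps at the points xs.
xs = [0, 1, 3]
cum = [Fraction(1, 4), Fraction(3, 4), Fraction(1)]  # value of F from xs[i] on

def F(x):
    i = bisect.bisect_right(xs, x)  # number of jump points <= x
    return cum[i - 1] if i > 0 else Fraction(0)

def X(t):
    """X(t) = inf{x : F(x) >= t} for 0 < t < 1."""
    return xs[bisect.bisect_left(cum, t)]

ts = [Fraction(k, 20) for k in range(1, 20)]
grid = [Fraction(j, 2) for j in range(-2, 9)]
# X(t) <= x if and only if t <= F(x), checked on a grid of points.
equivalent = all((X(t) <= x) == (t <= F(x)) for t in ts for x in grid)
```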
Lemma 4.6.5. Suppose H : (a, b) → R, −∞ ≤ a < b ≤ ∞, is a non–decreasing function,
and define
\[ G(x) = H(x-) = \sup_{y < x} H(y) \qquad (a < x < b) \]
\[ F(x) = H(x+) = \inf_{x < z} H(z) \qquad (a < x < b). \]
Proof. For any a < x < y < z < b, the monotonicity of H implies that
(4.27) G(x) ≤ H(x) ≤ F (x) ≤ G(y) ≤ H(y) ≤ F (y) ≤ G(z) ≤ H(z) ≤ F (z).
Letting y ↗ z we obtain
G(x) ≤ H(x) ≤ F(x) ≤ G(z−) ≤ G(z) ≤ F(z−) ≤ G(z).
Thus, G(z) = F(z−). Letting x ↗ z gives
G(z−) ≤ G(z) ≤ G(z−).
Therefore, G(z) = G(z−) = F(z−). Similarly, by letting first y ↘ x, and then letting
z ↘ x in (4.27), we obtain that F(x) = F(x+) = G(x+).
Lemma 4.6.6. Given a right–continuous non–decreasing function F : (a, b) → R, let α =
inf a<x<b F (x), β = supa<x<b F (x) and G(x) = F (x−). For any α < q < β define
(4.28) Q(q) = inf{x ∈ (a, b) : F (x) ≥ q}
(4.29) Q+ (q) = sup{x ∈ (a, b) : G(x) ≤ q}.
Then, Q is a non–decreasing left continuous function, Q+ is a non–decreasing right contin-
uous function,
(4.30) F (Q(q)) ≥ q
(4.31) G(Q+ (q)) ≤ q,
Q(q) = Q+ (q−), and Q+ (q) = Q(q+).
Proof. The definition of Q and the monotonicity of F imply that (Q(q), ∞) ⊂ {x : F(x) ≥
q}. If Q(q) < zn ↘ Q(q) then F(zn) ≥ q, and by the right–continuity of F, F(Q(q)) ≥ q.
Consequently,
(4.32) F (x) ≥ q iff Q(q) ≤ x.
Similarly, the monotonicity of G implies that (−∞, Q+(q)) ⊂ {x : G(x) ≤ q}. If Q+(q) >
xn ↗ Q+(q), then G(xn) ≤ q and, by left–continuity of G, G(Q+(q)) ≤ q. Consequently,
(4.33) G(x) ≤ q iff x ≤ Q+ (q).
We claim that Q+ (q) ≤ Q(p) whenever q < p. Indeed, if x < Q+ (q) then, by (4.33),
G(x) ≤ G(x+) = F (x) ≤ q < p. From (4.32) we get that x < Q(p).
Consequently, for any α < q < p < β,
(4.34) Q(q) ≤ Q+ (q) ≤ Q(q+) ≤ Q+ (q+) ≤ Q(p) ≤ Q+ (p)
(4.35) Q(q) ≤ Q+ (q) ≤ Q(p−) ≤ Q+ (p−) ≤ Q(p) ≤ Q+ (p).
Applying G to each side of (4.34) leads to
G(Q+ (q)) ≤ G(Q(q+)) ≤ G(Q+ (q+)) ≤ p.
By letting p ↘ q we obtain that
G(Q+ (q)) ≤ G(Q(q+)) ≤ G(Q+ (q+)) ≤ q.
Therefore, Q+ (q) ≤ Q(q+) ≤ Q+ (q+) ≤ Q+ (q).
Similarly, by applying F to each side of (4.35) and then letting q ↗ p, we obtain Q(p) ≤
Q(p−) ≤ Q+(p−) ≤ Q(p).
Example 4.6.7. If µ is a finite measure on (R, B(R)) then F (x) = µ(−∞, x] and G(x) =
µ(−∞, x) are non–decreasing functions such that F (x) = F (x+) = G(x+) and G(x) =
G(x−) = F (x−).
Example 4.6.8. The standard normal distribution µ on B(R) is the probability measure
defined by
\[ \mu(dx) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx \]
Let T(x) = x^2, then for t ≥ 0
\[ \mu(T^{-1}((-\infty, t])) = \mu([-\sqrt{t}, \sqrt{t}]) = 2 \int_0^{\sqrt{t}} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx = \frac{1}{\sqrt{2\pi}} \int_0^t s^{-1/2} e^{-s/2} \, ds = \frac{1/2}{\Gamma(1/2)} \int_0^t (s/2)^{\frac{1}{2}-1} e^{-s/2} \, ds \]
Thus, the law of T is given by $\mu_T(dt) = \frac{1/2}{\Gamma(1/2)} (t/2)^{\frac{1}{2}-1} e^{-t/2} 1_{(0,\infty)}(t) \, dt$ which, in Statistics,
is known as the $\chi^2_1$–distribution.
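A numerical sanity check of Example 4.6.8: the distribution function of T is µ([−√t, √t]) = erf(√(t/2)), and its derivative should match the density stated above. The following sketch compares a central difference quotient with that density; the function names and the step size h are choices made here.

```python
import math

def cdf_T(t):
    """mu(T <= t) = mu([-sqrt(t), sqrt(t)]) for the standard normal mu."""
    return math.erf(math.sqrt(t / 2)) if t > 0 else 0.0

def chi2_1_density(t):
    """(1/2)/Gamma(1/2) * (t/2)^(1/2 - 1) * exp(-t/2) for t > 0."""
    return 0.5 / math.gamma(0.5) * (t / 2) ** (-0.5) * math.exp(-t / 2)

def numerical_derivative(g, t, h=1e-5):
    """Central difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

gaps = [abs(numerical_derivative(cdf_T, t) - chi2_1_density(t))
        for t in (0.5, 1.0, 2.0, 5.0)]
```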
4.7. Exercises
Exercise 4.7.1. In Steinhaus space ([0, 1], B([0, 1]), λ), give sequences fn and gn of
nonnegative measurable functions that converge to zero and such that
(a) $\lim_n \int_{[0,1]} f_n(t)\,dt = 1$
(b) $\lim_n \int_{[0,1]} g_n(t)\,dt = \infty$
Give sequences hn, pn of nonnegative measurable functions such that 0 = lim inf_n hn <
lim sup_n hn = ∞, and 0 = lim inf_n pn < lim sup_n pn = ∞ such that
(c) $1 = \liminf_n \int_{[0,1]} h_n(t)\,dt < \limsup_n \int_{[0,1]} h_n(t)\,dt = 2$
(d) $\lim_n \int_{[0,1]} p_n(t)\,dt = 0$
Exercise 4.7.2. Suppose that fn is a sequence of integrable functions such that fn ≥ fn+1
for all n. Show that
\[ \lim_{n\to\infty} \int_\Omega f_n \, d\mu = \int_\Omega f \, d\mu \]
Exercise 4.7.3. Suppose f is a continuous function in an interval [A, B]. For any A ≤ a <
b < B, show that
\[ \lim_{r\to\infty} r \int_a^b \big( f(t + \tfrac{1}{r}) - f(t) \big) \, dt = f(b) - f(a) \]
Exercise 4.7.4. Suppose f ∈ L1([0, ∞), B([0, ∞)), λ1). Find
\[ \lim_{r\to\infty} \frac{1}{r} \int_0^r x f(x) \, dx \]
Exercise 4.7.7. Let µ be a measure on the Borel space (R, B) such that the integral
$\int e^{tx} \, \mu(dx) < \infty$ for all t in an open interval I. Show that:
(a) If A is a compact subset of R, then µ(A) < ∞.
(b) For any t ∈ I the integral $\int |x| e^{tx} \, \mu(dx) < \infty$.
(c) The map φ given by $t \mapsto \int e^{tx} \, \mu(dx)$ is differentiable on I and
\[ \phi'(t) = \int x e^{tx} \, \mu(dx) \]
(Hint: Use the mean value theorem to show that $|e^u - 1| \le |u|(1 \vee e^u)$.)
Exercise 4.7.8. Show that the sequence $a_n = \int_{-\sqrt{n}}^{\sqrt{n}} \big(1 - \frac{x^2}{2n}\big)^n \, dx$ converges and identify the limit.
Exercise 4.7.14. (Direct Riemann integrable functions) Let f : [0, ∞) → [0, ∞). For any
h > 0 let $M_n^h(f) = \sup\{f(x) : x \in [(n-1)h, nh)\}$, and similarly $m_n^h(f) = \inf\{f(x) : x \in [(n-1)h, nh)\}$. Define
\[ \overline{f}^h(x) = \sum_{n=1}^{\infty} M_n^h(f) 1_{[(n-1)h,nh)}(x), \qquad U_h(f) = \int_{[0,\infty)} \overline{f}^h(x) \, dx \]
\[ \underline{f}^h(x) = \sum_{n=1}^{\infty} m_n^h(f) 1_{[(n-1)h,nh)}(x), \qquad L_h(f) = \int_{[0,\infty)} \underline{f}^h(x) \, dx \]
The function f is said to be direct Riemann integrable (d.R.i.) if $U_h(f) < \infty$ for all h > 0,
and $\lim_{h \to 0} (U_h(f) - L_h(f)) = 0$. Show that
(a) If $U_{h_0}(f) < \infty$ for some h0 > 0, then $U_h(f) < \infty$ for all h > 0.
(b) f is d.R.i. iff f is bounded, continuous a.s. and $U_{h_0}(f) < \infty$ for some h0 > 0.
(c) If f is d.R.i., then f ∈ L1([0, ∞), B([0, ∞)), λ).
(d) Suppose that f is bounded and continuous a.s. and g is a d.R.i. function. If f ≤ g,
then f is also d.R.i.
Exercise 4.7.15. (Quantile function) Let (Ω, F , P) and X be a probability space and a
real–valued measurable function on (Ω, F ) respectively. For any q ∈ (0, 1), a number zq
such that
P[X < zq ] ≤ q and P[X ≤ zq ] ≥ q,
is called a q–quantile of X. The functions Q and Q+ on (0, 1) defined by
Q(q) = inf{x ∈ R : P[X ≤ x] ≥ q}
Q+ (q) = sup{x ∈ R : P[X < x] ≤ q}
are non–decreasing left–continuous and right–continuous respectively. Show that
1. zq is a q–quantile of X iff −zq is a (1 − q)–quantile of −X.
2. Q(q) is the smallest q–quantile of X and Q+ (q) is the largest q–quantile of X.
3. Show that $\lambda \circ Q^{-1}((-\infty, x]) = P[X \le x]$. Thus, Q and X have the same law and for
any bounded measurable function φ on R, $\int_{(0,1)} \phi(Q(p)) \, dp = \int_\Omega \phi(X(\omega)) \, P(d\omega)$.
Exercise 4.7.16. Let µ be a measure on R and let f, g ∈ L+(µ) such that $\int_R f \, d\mu = \int_R g \, d\mu$.
If there is c ∈ R such that f(x) ≤ g(x) for almost all x < c and f(x) ≥ g(x) for almost all
x > c, show that
\[ (\mu \cdot f)((a, \infty)) \le (\mu \cdot g)((a, \infty)), \qquad a \in \mathbb{R} \]
where d(µ · f) = f(x) dµ and d(µ · g) = g(x) dµ.
Exercise 4.7.17. (Generalized Chebyshev–Markov inequality) Let (Ω, F, µ) be a measure
space, f : Ω −→ R measurable and suppose φ : R −→ [0, ∞) is a nondecreasing function
such that φ ◦ f ∈ L1(µ).
(a) For any function g : (R^2, B(R^2)) → [0, ∞) such that $g(x, a) \ge 1_{\{x > a\}}$, show that
\[ \mu(\{f > a\}) \le \frac{1}{\int_\Omega \phi \circ f \, d\mu} \int g(f(\omega), a)\, \phi(f(\omega)) \, \mu(d\omega), \qquad a \in \mathbb{R} \tag{4.38} \]
(Hint: Consider the induced measure $\mu_f = \mu \circ f^{-1}$ on B(R) and Exercise 4.7.16
with $\mu_f \cdot 1$ and $\frac{1}{\mu_f(\phi)} \mu_f \cdot \phi$.)
(b) Show that the Chebyshev–Markov inequality (4.4) follows as a particular example
of (4.38) (Hint: consider φ ≡ 1, f ≥ 0 and $g(x, a) = \frac{x}{a} 1_{\{x > a\}}$, where x, a ≥ 0).
(c) For any nonnegative nondecreasing function v, show that
\[ \mu(f > a) \le \frac{1}{v(a)} \int v \circ f \, d\mu, \qquad a \in \mathbb{R}. \]
(Hint: this can be proved directly as in the proof of the Chebyshev–Markov inequality
or from (4.38) with φ ≡ 1 and $g(x, a) = \frac{v(x)}{v(a)}$.)
Exercise 4.7.18. Let (Ω, F , µ) and (Ψ, H, ν) be finite measure spaces with µ(Ω) = ν(Ψ).
Assume that (R, σ(C)) is a measurable space where C is a π–system. If X, Y are measurable
functions in R defined on Ω and Ψ respectively, show that the induced measures µX and
νY on (R, σ(C)) are the same if and only if they coincide on C.
Exercise 4.7.19. (Decreasing rearrangement). Suppose f is a real–valued measurable
function on (Ω, F , µ). Let δf (t) = µ[|f | > t] and define
f ∗ (s) = inf{t : δf (t) ≤ s}
(a) Show that δf (t) ≤ s iff f ∗ (s) ≤ t. (b) Show that f ∗ is nonincreasing and right–
continuous. (c) Suppose that δf (s) < ∞ for all s > 0. Show that f ∗ (δf (s)) ≤ s and
δf(f∗(s)) ≤ s. (d) Let λ be Lebesgue measure on ([0, ∞), B([0, ∞))). Show that
λ[f∗ > t] = µ[|f| > t]
Thus f∗ and |f| have the same law. In particular, for any measurable function φ : [0, ∞] →
R, we have that $\int \phi(|f|(x)) \, \mu(dx) = \int_0^\infty \phi(f^*(s)) \, ds$.
Exercise 4.7.20. Suppose that (Ω, F , µ) is a finite measure space.
(i) Show that (F∗ )∗ = F∗ .
µ µ
(ii) Show that F ∗ = F .
Chapter 5
In this section we discuss two useful results from point set topology. The first result, known
as Baire’s Category Theorem, describes fat sets in topological spaces. The second result,
known as the Stone–Weierstrass Theorem, is one of the most important results in basic
Analysis. In its classical form, it states that continuous functions on a compact set can be
uniformly approximated by polynomials. This result will be very useful in Chapter 6 where
we discuss a functional approach to integration theory.
We will also develop in this section the functional counterpart of a monotone class of
sets. Monotone classes are useful to determine when two measures are equal.
The following result, known as the category theorem, has many theoretical applications in
mathematics.
5. Baire Category and Stone–Weierstrass theorem
Proof. Let {Un} be a sequence of dense open sets in X. Let B0 be a nonempty open set
in X. We will show that $B_0 \cap \bigcap_n U_n \neq \emptyset$. Given an integer n ≥ 1, suppose we have chosen
a nonempty open set Bn−1. Since Un is open and dense, there is a nonempty open set
Bn with $\overline{B_n} \subset U_n \cap B_{n-1}$. In case (a) we take Bn to be a ball of diameter less than 1/n; in
case (b) we take Bn to be an open set with compact closure, as in Lemma 2.11.2. Let
\[ K = \bigcap_{n \ge 1} \overline{B_n}. \]
In case (a), the centers of the balls Bn form a Cauchy sequence, which, by completeness,
implies that K ≠ ∅. In case (b), $\overline{B_n}$ is a decreasing sequence of nonempty compact sets and
its intersection K is therefore non–empty. In either case, $\emptyset \neq K \subset B_0 \cap \bigcap_n U_n$.
A space X where the intersection of any sequence of open dense sets is dense is called a
Baire space. Equivalently, X is a Baire space iff any sequence of nowhere dense closed sets
has union with empty interior. Indeed, {Un : n ∈ N} is a sequence of open dense subsets of
X iff {X \ Un : n ∈ N} is a sequence of nowhere dense closed sets. Thus, $\bigcap_n U_n$ is dense in
X iff $\bigcup_n (X \setminus U_n)$ has empty interior. From this observation, it follows that if X is a Baire
space, then X is of second category.
Example 5.1.3. The set Q is not a Gδ set. If it were, then $\mathbb{Q} = \bigcap_n U_n$ for some sequence
{Un} of open dense subsets of R. Since $\mathbb{R} = \bigcup_n (\mathbb{R} \setminus U_n) \cup \mathbb{Q}$ and Q is countable, it would follow
that R is of first category, which is false since R is complete.
Example 5.1.4. Let F : X → S be any function from a topological space X into a metric
space S. For any ε > 0 let Uε be the union of all open sets U ⊂ X such that diam(F(U)) < ε.
Clearly, F is continuous at x iff $x \in G = \bigcap_n U_{1/n}$. This shows that the set of continuity
points of the function F is a Gδ set. Consequently, there is no function from R into a metric space
S that is continuous exactly at the points of Q.
Example 5.1.5. Given a Gδ set G ⊂ R, there exists a function f on R that is continuous
at G and discontinuous anywhere else. Indeed, let Gn ⊂ R be a decreasing sequence of
open sets with $G = \bigcap_n G_n$ and $\psi_n = 1_{G_n} + 2 \cdot 1_{\mathbb{Q} \setminus G_n} - 2 \cdot 1_{\mathbb{Q}^c \setminus G_n}$. Then, $\Psi = \sum_n 2^{-n} \psi_n$ is continuous at G and
discontinuous anywhere else.
Definition 5.2.1. A vector space over a field F is a non empty set V with two operations:
addition that maps (x, y) ∈ V × V to an element x + y ∈ V, and a scalar product that maps
(λ, x) ∈ F × V to an element λx ∈ V. These operations satisfy
(a) x + y = y + x.
(b) x + (y + z) = (x + y) + z.
(c) There is 0 ∈ V such that 0 + x = x + 0 = x.
(d) For each x ∈ V there is −x ∈ V such that x + (−x) = (−x) + x = 0.
(e) λ(γ x) = (λγ) x
(f) For all x ∈ V, 1 x = x, and (−1)x = −x
(h) λ(x + y) = λ x + λ y.
(i) (λ + γ)x = λ x + γ x.
A vector ring (simply ring if the context is clear) is a vector space R with an additional
operation (product) mapping each (x, y) ∈ R × R to an element xy ∈ R satisfying the
following properties:
(j) x(yz) = (xy)z
(k) xy = yx, and (λx)y = x(λy) for all x, y ∈ R and λ ∈ F.
An algebra A is a ring that has an element e ∈ A such that ex = xe = x for all x ∈ A.
Example 5.2.2. Consider F^Ω, where F is either the set of real numbers R or the set of complex
numbers C. We define sum, scalar multiplication and product of functions pointwise, i.e.
(f + g)(x) = f(x) + g(x), (af)(x) = af(x), and (fg)(x) = f(x)g(x) for all x ∈ Ω and a ∈ F.
V ⊂ F^Ω is a vector space of functions if af + g ∈ V for all f, g ∈ V and a ∈ F.
(a) A vector space of functions V is a ring if f · g ∈ V for all f, g ∈ V.
(b) A ring of functions V is an algebra if it contains the constant function 1.
(c) A vector space of functions V is a vector lattice if f ∧ g ∈ V, and hence f ∨ g ∈ V,
for any f, g ∈ V.
Definition 5.2.3. Suppose V is a vector space over R. A partial order ≤ on V is compatible
with the linear structure of V if for any α > 0 and x, y, z ∈ V we have
x ≤ y =⇒ αx ≤ αy, x ≤ y =⇒ x + z ≤ y + z
In such case (V, ≤) is said to be a partially ordered vector space.
A partially ordered vector space V is a vector lattice if for any x, y ∈ V there is z ∈ V ,
denoted as x ∨ y, such that x ≤ z, y ≤ z and z ≤ u whenever x ≤ u and y ≤ u.
Example 5.2.4. A vector subspace V ⊂ RΩ is a vector lattice if f ∧ g := min{f, g} ∈ V (or
equivalently f ∨ g := max{f, g} ∈ V) for all f, g ∈ V.
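The equivalence of the two conditions in Example 5.2.4 rests on elementary pointwise identities. A short worked derivation, stated here for real valued functions f, g on Ω:

```latex
\[
  f \vee g = f + g - (f \wedge g) = -\bigl((-f) \wedge (-g)\bigr),
  \qquad
  f \wedge g = f + g - (f \vee g).
\]
```

Since V is a vector space, closure under ∧ gives closure under ∨ through the first identity, and conversely through the second.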
A set M ⊂ V is a linear subspace of V if M is also a linear space, that is, αx + y ∈ M
for all α ∈ F and x, y ∈ M. A linear subspace M of a partially ordered vector space V
majorizes V if for each x ∈ V, there is y ∈ M with x ≤ y.
Example 5.2.5. Let ℓ∞ denote the space of all bounded functions in R^N. The subspace c
of convergent sequences on R majorizes ℓ∞.
Proof. Since φα (x) → φ(x) for each x ∈ S and φ − φα ∈ C(S) for each α ∈ D, the sets
Uα = {φ − φα < ε}, with ε > 0 fixed and α ∈ D, form an increasing directed open cover of
S. Hence, S = Uα0 for some α0 ∈ D. Consequently, for any x ∈ S and α ≥ α0 ,
|φ(x) − φα (x)| = φ(x) − φα (x) ≤ φ(x) − φα0 (x) < ε.
This shows that {φα : α ∈ D} converges to φ uniformly.
Clearly Pn(t) converges to |t| uniformly over [−M − 1, M + 1] and Pn(0) = 0; while $\tilde{Q}_n(t)$
converges to $\frac{1}{2}(t + 1 - |t - 1|) = t \wedge 1$ uniformly over [−M, M + 2]. Since $\tilde{Q}_n(0) \to 0$ as
n → ∞, the sequence $Q_n(t) = \tilde{Q}_n(t) - \tilde{Q}_n(0)$ satisfies the conditions of the result.
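Lemma 5.3.4. For each n ∈ Z+ let gn be the function on [−1, 1] given by
\[ g_n(t) = \bigvee \Big\{ \frac{2kt}{2^n} - \frac{k^2}{2^{2n}} : k \in \mathbb{Z},\ |k| \le 2^n \Big\} = \bigvee \Big\{ \frac{2kt}{2^n} - \Big( \frac{2kt}{2^n} \wedge \frac{k^2}{2^{2n}} \Big) : k \in \mathbb{Z},\ |k| \le 2^n \Big\} \tag{5.3} \]
Then $0 \le g_{n-1}(t) \le g_n(t) \le t^2$ and $t^2 - g_n(t) \le \frac{1}{4^n}$ for all n ∈ N and t ∈ [−1, 1].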
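Proof. For each n ∈ N and k ∈ Z with |k| ≤ 2^n let $g_{n,k}(t) = \frac{2kt}{2^n} - \frac{k^2}{2^{2n}}$. Since $g_{n,0}(t) = 0$ and
$g_{n,k}(t) \vee 0 = \frac{2kt}{2^n} - \big( \frac{2kt}{2^n} \wedge \frac{k^2}{2^{2n}} \big)$, we have $g_n(t) = \bigvee \{ g_{n,k}(t) \vee 0 : k \in \mathbb{Z},\ |k| \le 2^n \}$ and $g_n(t) \le g_{n+1}(t)$
for all n ∈ Z+ and |t| ≤ 1. If $|t - \frac{k}{2^n}| \le \frac{1}{2^n}$, then
\[ 0 \le \Big( t - \frac{k}{2^n} \Big)^2 = t^2 - g_{n,k}(t) \le \frac{1}{4^n}. \]
As $[-1, 1] = \bigcup \big\{ [\frac{k-1}{2^n}, \frac{k+1}{2^n}] : k \in \mathbb{Z},\ |k| < 2^n \big\}$, the conclusion of the Lemma follows.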
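The bounds of Lemma 5.3.4 can be verified exactly on a grid, using the identity 2kt/2^n − k^2/2^{2n} = t^2 − (t − k/2^n)^2. The sketch below uses rational arithmetic; the grid `ts` and the range of n are arbitrary choices made here.

```python
from fractions import Fraction

def g(n, t):
    """g_n(t) = max over |k| <= 2^n of 2kt/2^n - k^2/2^(2n), for t in [-1, 1]."""
    N = 2 ** n
    return max(Fraction(2 * k) * t / N - Fraction(k * k, N * N)
               for k in range(-N, N + 1))

ts = [Fraction(j, 37) for j in range(-37, 38)]
# Check 0 <= g_{n-1} <= g_n <= t^2 and t^2 - g_n <= 4^{-n} on the grid.
ok = all(
    0 <= g(n - 1, t) <= g(n, t) <= t * t
    and t * t - g(n, t) <= Fraction(1, 4 ** n)
    for n in range(1, 6)
    for t in ts
)
```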
We will use Bb(Ω) to denote the collection of all bounded real–valued functions on Ω.
A subset E ⊂ R^Ω is closed under chopping if f ∧ 1 ∈ E for any f ∈ E. E is called a
Stone lattice if it is a vector lattice that is closed under chopping. E is called a ring lattice
closed under chopping if it is a ring and a Stone lattice.
Definition 5.3.5. Suppose E ⊂ R^Ω is a vector space. A function f ∈ R^Ω is E–confined if
there is ψ ∈ E such that $1_{\{f \neq 0\}} \le \psi$. In such case we say that ψ confines f. The set of all
functions in E that are E–confined is denoted by E00. E is self–confined if E00 = E.
Example 5.3.6. The spaces C00(R^n) (real continuous compactly supported functions in
R^n) and Cb(R^n) (real bounded continuous functions in R^n) are self–confined. The uniform
closure of C00(R^n), denoted by C0(R^n), is not self–confined.
Remark 5.3.7. If f1 and f2 are E–confined, then there are ψj ∈ E for j = 1, 2 such that
$1_{\{f_j \neq 0\}} \le \psi_j$. For any a ∈ R, {af1 + f2 ≠ 0} ⊂ {f1 ≠ 0} ∪ {f2 ≠ 0}. Therefore af1 + f2 is
confined by ψ1 + ψ2. Hence, if E is a vector space, so is E00. Since for any function f ∈ R^Ω,
f is E–confined iff |f| is E–confined, if E is a Stone lattice, so is E00.
Lemma 5.3.8. If E ⊂ Bb (Ω) is a Stone lattice, then E00 is dense in E with the uniform
topology.
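Proof. It is easy to check that $\overline{E}$ is a vector space whenever E is a vector space. Indeed,
let $\phi, \psi \in \overline{E}$ and φn, ψn ∈ E such that $\|\phi - \phi_n\|_u \vee \|\psi - \psi_n\|_u < \frac{1}{n}$. Then for any scalars a,
b,
\[ \|a\phi + b\psi - a\phi_n - b\psi_n\|_u \le |a| \|\phi - \phi_n\|_u + |b| \|\psi - \psi_n\|_u < \frac{|a| + |b|}{n} \]
Suppose E is a Stone lattice. Then |φn| and φn ∧ 1 belong to E for each n ∈ N. Since
\[ \| |\phi| - |\phi_n| \|_u \le \|\phi - \phi_n\|_u < \frac{1}{n}, \qquad \|\phi \wedge 1 - \phi_n \wedge 1\|_u \le \|\phi - \phi_n\|_u < \frac{1}{n}, \]
it follows that $\overline{E}$ is also a Stone lattice. To show that $\overline{E}$ is a ring it is enough to show that
$\phi^2 \in \overline{E}$ whenever $\phi \in \overline{E}$, for
\[ \phi\psi = \frac{1}{4} \big( (\phi + \psi)^2 - (\phi - \psi)^2 \big) \tag{5.4} \]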
Let $M = \sup_n \|\phi_n\|_u \vee \|\phi\|_u$. Let gn(t) be as in Lemma 5.3.4 and define $G_n = M^2 g_n\big(\frac{\phi_n}{M}\big)$.
Then (Gn : n ∈ Z+) ⊂ E, $\|G_n\|_u \le M^2$ and
\[ \|\phi^2 - G_n\|_u \le \|\phi^2 - \phi_n^2\|_u + \|\phi_n^2 - G_n\|_u \le 2M \|\phi - \phi_n\|_u + \frac{M^2}{4^n}. \]
Therefore $\phi^2 \in \overline{E}$. The polarization identity (5.4) implies that $\overline{E}$ is a ring.
Let Pn(t) and Qn(t) be sequences of polynomials in [−M, M] that vanish at t = 0 such that
$||t| - P_n(t)| \vee |t \wedge 1 - Q_n(t)| \le \frac{1}{n}$ for all |t| ≤ M. Then Pn(φn), Qn(φn) ∈ E and
Proof. As C(S) is a closed subspace of Bb(S) under uniform norm, it follows from Theorem
5.3.9 that E is a ring lattice closed under chopping contained in CZ(S). The space E ⊕ R :=
{φ + r : φ ∈ E, r ∈ R} is a ring lattice closed under chopping of continuous functions on S
and contains the constant functions. Indeed, for any φ, ψ ∈ E and r, s ∈ R with r ≤ s,
\[ (\phi + r) \wedge (\psi + s) = (\phi - \psi) \wedge (s - r) + \psi + r \in E \oplus \mathbb{R}. \]
Claim: E ⊕ R = CZ(S) ⊕ R. Let f ∈ CZ(S) ⊕ R. We will show that for any ε > 0 there
is φε ∈ E ⊕ R such that $\|\phi_\varepsilon - f\|_u < \varepsilon$. For any (t, s) ∈ S × S define a function $\phi_{t,s}$ ∈ C(S) as
follows:
(a) If t ≠ s and either t ∈ S0 or s ∈ S0, choose $\psi_{t,s}$ ∈ E such that $\psi_{t,s}(t) \neq \psi_{t,s}(s)$. Let
\[ \phi_{t,s}(x) := f(s) + \frac{f(t) - f(s)}{\psi_{t,s}(t) - \psi_{t,s}(s)} \big( \psi_{t,s}(x) - \psi_{t,s}(s) \big) \]
(b) If t = s or (t, s) ∈ ZE define $\phi_{t,s}(x) \equiv f(t)$.
Clearly $\phi_{t,s}$ ∈ E ⊕ R, and
\[ \phi_{t,s}(t) = f(t), \qquad \phi_{t,s}(s) = f(s). \]
For each t, the sets $U_s^t = \{\phi_{t,s} > f - \varepsilon\}$, s ∈ S, form an open cover of S, for $s \in U_s^t$
for each s ∈ S. Hence, there exists a finite subcover $\{U_{s_k}^t : k = 1, \ldots, n\}$. The function
$f^t := \bigvee_{k=1}^{n} \phi_{t,s_k}$ belongs to E ⊕ R, $f^t > f - \varepsilon$, and $f^t(t) = f(t)$. It follows that the
sets $V_t = \{f^t < f + \varepsilon\}$, t ∈ S, form an open cover of S. Hence, there is a finite subcover
$\{V_{t_j} : j = 1, \ldots, m\}$, and the function $\phi_\varepsilon := \bigwedge_{j=1}^{m} f^{t_j}$ belongs to E ⊕ R, and $|f(x) - \phi_\varepsilon(x)| < \varepsilon$
for all x ∈ S. This shows that f ∈ E and completes the proof of the claim.
Proof. The space of polynomials E is a ring and separates points. Thus, if f is a continuous
function on [a, b], there is a sequence of polynomials pn such that $\|f - p_n\|_u \le 4^{-n}$. The
sequence of polynomials $P_n = p_n - a 2^{-n}$, where a > 0 is to be determined, converges
uniformly to f. Since $\|p_{n+1} - p_n\|_u \le \frac{5}{4^{n+1}}$,
\[ P_{n+1}(t) - P_n(t) \ge \frac{a}{2^{n+1}} - \frac{5}{2^{2n+2}} = \frac{1}{2^{n+1}} \Big( a - \frac{5}{2^{n+1}} \Big) \]
For a = 5/2, we obtain $P_{n+1} - P_n \ge \frac{5}{2^{n+2}} \big( 1 - \frac{1}{2^n} \big) \ge 0$. Similarly, the sequence of polynomials
$Q_n = p_n + \frac{5}{2^{n+1}}$ decreases uniformly to f.
If $\phi \in \overline{E}^u$ let $\phi_n \in E$ be a sequence such that $\|\phi - \phi_n\|_u \le 4^{-n}$. The sequences $\Phi_n = \phi_n - \frac{5}{2^{n+1}}$
and $\Psi_n = \phi_n + \frac{5}{2^{n+1}}$ uniformly increase and decrease to $\phi$ respectively.
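The shifting trick in the proof above can be checked numerically. The following Python sketch is only an illustration under stated assumptions, not part of the argument: it takes $f = \exp$ on $[-1, 1]$, uses Taylor polynomials as the approximants $p_n$ with $\|f - p_n\|_u \le 4^{-n}$ (verified on a finite grid only), and tests that $P_n = p_n - \frac{5}{2} 2^{-n}$ increases while $Q_n = p_n + \frac{5}{2} 2^{-n}$ decreases. The function, the interval, the grid and all helper names are choices made for this example.

```python
import math

# Finite grid on [-1, 1]; it stands in for the interval [a, b] of the proof.
GRID = [-1.0 + 2.0 * i / 400 for i in range(401)]


def taylor_exp(t, degree):
    """Taylor polynomial of exp at 0; plays the role of the approximant p_n."""
    return sum(t ** k / math.factorial(k) for k in range(degree + 1))


def degree_for(n):
    """Smallest degree whose error on GRID is at most 4**(-n)."""
    d = 0
    while max(abs(math.exp(t) - taylor_exp(t, d)) for t in GRID) > 4.0 ** (-n):
        d += 1
    return d


def shifted(n, sign, a=2.5):
    """Values on GRID of p_n + sign * a * 2**(-n), with a = 5/2 as in the proof."""
    d = degree_for(n)
    return [taylor_exp(t, d) + sign * a * 2.0 ** (-n) for t in GRID]


lower = [shifted(n, -1) for n in range(1, 7)]  # P_1, ..., P_6
upper = [shifted(n, +1) for n in range(1, 7)]  # Q_1, ..., Q_6

# Monotonicity in n, checked pointwise on the grid.
increasing = all(x <= y for P, P_next in zip(lower, lower[1:]) for x, y in zip(P, P_next))
decreasing = all(x >= y for Q, Q_next in zip(upper, upper[1:]) for x, y in zip(Q, Q_next))
print(increasing, decreasing)
```

The grid check is weaker than a uniform bound on the whole interval; it only illustrates why the constant $a = 5/2$ is large enough to absorb the error $\|p_{n+1} - p_n\|_u \le 5/4^{n+1}$.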
Example 5.3.14. We show that the function $t \mapsto t \wedge 1$ can be uniformly approximated on any
interval $[0, M]$ by an increasing sequence of nonnegative polynomials $g_n(t)$ that vanish
only at 0. Indeed, Theorem 5.3.13 provides a sequence of polynomials $g_n^0(t)$ that increase
uniformly to $G(t) = 1 \wedge \frac{1}{t}$ on $[0, M]$. For some $n_0 \in \mathbb{N}$ large enough, the sequence $\{g_n(t) =
t g_n^0(t) : n \ge n_0\}$ satisfies the conclusion of the statement. Therefore, if E is a ring of
bounded functions and φ ∈ E+ , there is a nondecreasing sequence of functions ψn ∈ E+
which increases uniformly to 1 ∧ φ.
Example 5.3.15. For any interval IM = [−M, M ], M > 0, there exists a sequence of
polynomials $q_n(t)$ with $|q_n(t)| \le |q_{n+1}(t)| \le |t \wedge 1|$ such that $q_n$ converges to $t \wedge 1$ uniformly on
$I_M$. Indeed, for $H(t) = 1 \wedge \frac{1}{t_+}$, there is a sequence of polynomials $h_n^0$ such that $0 \le h_n^0 \nearrow H$
uniformly on $I_M$. Then $q_n(t) = t h_n^0(t)$ converges uniformly to $t \wedge 1$ and $|q_n(t)| \le |q_{n+1}(t)| \le
|1 \wedge t|$ for all $t \in I_M$.
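Example 5.3.16. Consider the continuous functions $\psi(t) = t - t \wedge 1 = (t - 1)_+$ and
$\phi(t) = 1 \wedge a(t - 1)_+$ over $[-M, M]$. The functions
$$\psi^0(t) = \frac{1}{t^2}\psi(t) = \Big(\frac{1}{t} - \frac{1}{t^2}\Big) 1_{[1,M]}(t)$$
$$\phi^0(t) = \frac{1}{t^2}\phi(t) = a\Big(\frac{1}{t} - \frac{1}{t^2}\Big) 1_{[1,\,1+\frac{1}{a}]}(t) + \frac{1}{t^2} 1_{(1+\frac{1}{a},\,M]}(t),$$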
being continuous on $[-M, M]$, are the uniform limits of nondecreasing and of nonincreasing
sequences of polynomials. Suppose the sequences of polynomials $q_n^0(t)$ and $p_n^0(t)$ decrease
uniformly to $\psi^0$ and $\phi^0$ on $[-M, M]$ respectively. Then $\psi$ and $\phi$ are the uniform limits
of the nonincreasing sequences $Q_n^0(t) = t^2 q_n^0(t)$ and $P_n^0(t) = t^2 p_n^0(t)$ on $[-M, M]$ respectively.
Similarly, if the sequences of polynomials $q_n^1(t)$ and $p_n^1(t)$ increase uniformly to $\psi^0$ and $\phi^0$
on $[-M, M]$ respectively, then $Q_n^1(t) = t^2 q_n^1(t)$ and $P_n^1(t) = t^2 p_n^1(t)$ increase uniformly to $\psi$
and $\phi$ on $[-M, M]$ respectively.
The Stone–Weierstrass theorem can be easily extended to the setting of locally compact
Hausdorff topologies.
Theorem 5.3.17. Suppose (X, τ ) is a locally compact Hausdorff space and let E ⊂ C0 (X)
be a Stone lattice or ring. Define ZE = {x ∈ X : φ(x) = 0, ∀φ ∈ E}. Then, E = {φ ∈
C0 (X) : φ(z) = 0, ∀z ∈ ZE }.
Proof. Let B be the collection of all closed open balls contained in Ω that have rational
centers and radii. For each B ∈ B let φB be a continuous function in Rn supported in
B with φB (B) = [0, 1]. The collection of polynomials on {φB : B ∈ B} with rational
coefficients is countable, separates points of Ω and is a ring E ⊂ C0 (Ω, R).
Theorem 5.3.19. (Weierstrass extension) Suppose E is a collection of bounded functions
on a set S, and that E is either a Stone lattice or a ring. Let S0 ⊂ S. A real function f on
S0 can be approximated uniformly on S0 by functions in E if and only if f is the restriction
to S0 of some function fe ∈ E.
u u
implies that f ∈ E. If E separates points, then either E = C(S, C), or E = {f ∈ C(S, C) :
f (z) = 0} when there is z ∈ S such that g(z) = 0 for all g ∈ E.
Proof. For any $f \in E$, its real and imaginary parts $\mathrm{Re}(f) = \frac{1}{2}(f + \bar{f})$, $\mathrm{Im}(f) = \frac{1}{2i}(f - \bar{f})$
are real functions in E. The set of real functions ER in E is a ring of real bounded functions
which separates points. By the Stone–Weierstrass theorem, ER = {f ∈ C(S, R) : f (z) = 0} if
z is the common zero of E, or ER = C(S, R) otherwise. In any case, one can approximate
the real and imaginary parts of an arbitrary complex continuous function separately.
This space is compact Hausdorff and the projections PE = {pφ : φ ∈ E} define the uniformity
d˜ψ ({xφ : φ ∈ E}, {yφ : φ ∈ E}) = |xψ − yψ |.
The topology associated with this uniformity is the same as the product topology. The map
J : S → Π given by J : x 7→ {φ(x) : φ ∈ E} is continuous on (S, τ (D(E))) and KS = J(S)
(the closure of J(S) in Π) being a closed subset of a compact set, is compact.
Since J(x) = J(y) iff dφ (x, y) = 0 for all φ ∈ E, if f is E–uniformly continuous function
and J(x) = J(y), then f (x) = f (y). Hence, there is a unique map f ′ : J(S) → R such
that f = f ′ ◦ J. Moreover, the E–uniform continuity of f implies the PE –uniform continuity
of f ′ and by Theorem 2.7.4, f ′ admits a unique continuous extension fˆ on KS . For each
φ ∈ E let φ̂ be the extension of φ′ to KS (notice that φ′ is the projection pφ ). The collection
Ê = {φ̂ : φ ∈ E} is a Stone lattice or a ring of continuous functions on KS , as the case
might be, which separates points of KS .
If there is $z \in K_S$ at which all $\hat\phi$ vanish, then the Stone–Weierstrass theorem shows that
$\hat f - \hat f(z) \in \overline{\hat E}^u$. Hence $f = \hat f \circ J$ is the sum of the constant $\hat f(z)$ and a function in $\overline{E}^u$. If there
is no such z, then the Stone–Weierstrass theorem shows that $\hat f \in \overline{\hat E}^u$, so that $f = \hat f \circ J \in \overline{E}^u$.
5.5. Monotone classes of functions 127
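It remains to show that $\mathcal{A}^m$ is closed under multiplication. Since $f \cdot g = f \cdot g_+ - f \cdot g_-$, it
is enough to show that $f \cdot g \in \mathcal{A}^m$ whenever $f \in \mathcal{A}^m$ and $0 \le g \in \mathcal{A}^m$. Define
$$E^* = \{f \in \mathcal{A}^m : f \cdot g \in \mathcal{A}^m,\ \forall\, 0 \le g \in \mathcal{A}\}.$$
Clearly, $\mathcal{A} \subset E^*$ and $E^*$ is a monotone (resp. bounded monotone) class. Hence, $E^* = \mathcal{A}^m$
and $f \cdot g = f \cdot g_+ - f \cdot g_- \in \mathcal{A}^m$ for all $f \in \mathcal{A}^m$ and $g \in \mathcal{A}$. Let
$$E^\bullet = \{f \in \mathcal{A}^m : f \cdot g \in \mathcal{A}^m,\ \forall\, 0 \le g \in \mathcal{A}^m\}.$$
As $\mathcal{A} \subset E^\bullet$, and $E^\bullet$ is a monotone (resp. bounded monotone) class, we have that $E^\bullet = \mathcal{A}^m$.
Notice that $\mathcal{A}^m$ is also closed under taking limits of convergent (resp. bounded convergent)
sequences, for if $f_n \in \mathcal{A}^m$ converges to $f$, then $f = \bigvee_n \bigwedge_{m \ge n} f_m \in \mathcal{A}^m$.
Since $\mathcal{A}^m$ is an algebra closed under limits of convergent sequences, the collection of sets
$A$ with $1_A \in \mathcal{A}^m$ forms a σ–algebra. For any $f \in M$, $n \in \mathbb{N}$, and $r \in \mathbb{R}$, we have that $h_n =
(n(f - r)_+) \wedge 1 \in \mathcal{A}^m$ and $h_n \le h_{n+1}$. Hence, $\lim_n h_n = 1_{\{f > r\}} \in \mathcal{A}^m$ from whence it follows
that $\sigma(M) \subset \mathcal{A}^m$. Therefore, the family of all real valued σ(M)–measurable functions is
contained in $\mathcal{A}^m \subset V$.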
Theorem 5.5.3. (Complex bounded class theorem) Suppose V is a complex vector space
of complex valued functions containing the constants, and that V is also a complex bounded
class. If M ⊂ V is a complex multiplicative class, then V contains the collection of all bounded
complex–valued σ(M)–measurable functions.
Proof. The family of all complex linear combinations of functions in M ∪ {1} is a complex
algebra A of bounded functions in V which is closed under complex conjugation. Hence,
the real valued functions in A form a real algebra Ar of bounded functions contained in the
collection Vr of real valued bounded functions in V. Clearly, Vr is a real vector space and a
bounded monotone class. As in the real monotone class theorem, we conclude that the space
of bounded real valued σ(M)–measurable functions is contained in Vr . The conclusion of
the Theorem follows immediately.
Example 5.6.3. The support of a real valued function f on a topological space (S, τ) is defined
as $\mathrm{supp}(f) = \overline{\{f \neq 0\}}$. The space of all real continuous functions with compact support
on S is denoted by $C_{00}(S)$. A continuous real function f is said to vanish at infinity if
$|f|^{-1}([\varepsilon, \infty))$ is compact in S for all ε > 0. The space of all real continuous functions on
S that vanish at infinity is denoted by $C_0(S)$. The space of all real bounded continuous
functions on S is denoted by $C_b(S)$. Evidently,
$$C_{00}(S) \subset C_0(S) \subset C_b(S).$$
Moreover, under the uniform norm topology on $C_b(S)$, $\overline{C_{00}(S)}^u = C_0(S)$. Let $M(S)$ denote
the space of real valued Borel measurable functions on S. In general,
$$(5.5)\qquad C_{00}(S)^\Sigma \subset C_0(S)^\Sigma \subset C_b(S)^\Sigma \subset C(S)^\Sigma \subset M(S).$$
The family $C_b(S)^\Sigma$ is known as the space of Baire functions and its sets are called
Baire sets. If S is locally compact, second countable Hausdorff, the families of sequential
limits in (5.5) coincide; if S is a metric space, $C_b(S)^\Sigma = M(S)$. (See Exercise 5.8.7.)
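Lemma 5.6.4. Suppose (S, d) is a metric space and let p ∈ S be fixed. For any nonempty
collection $E \subset S^\Omega$
$$(5.6)\qquad E_S^\Sigma = \{f \in E_S^\Sigma : \exists\, E_f \subset E \text{ countable with } f \in (E_f)_S^\Sigma\}$$
$$(5.7)\qquad E_S^\Sigma = \Big\{f \in E_S^\Sigma : \exists\, (\phi_n : n \in \mathbb{N}) \subset E \text{ with } \{f \neq p\} \subset \bigcup_{n \in \mathbb{N}} \{\phi_n \neq p\}\Big\}.$$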
Proof. Let A and B be the sets on the right hand side of (5.6) and (5.7) respectively. Clearly
$E \subset A \cap B$. Suppose the sequences $(f_n) \subset A$, $(g_n) \subset B$ converge pointwise to f and g
respectively.
For each $n \in \mathbb{N}$ let $E_n \subset E$ be a countable collection with $f_n \in (E_n)_S^\Sigma$. Then $E_* = \bigcup_n E_n$
is countable and $(f_n) \subset (E_*)_S^\Sigma$. Hence $f \in (E_*)_S^\Sigma$, and so $f \in A$. This shows that A is
sequentially closed.
As $S \setminus \{p\}$ is open in S, $\{g \neq p\} \subset \bigcup_n \{g_n \neq p\}$. For each $n \in \mathbb{N}$ there is a sequence $(\phi_{n,m} :
m \in \mathbb{N}) \subset E$ with $\{g_n \neq p\} \subset \bigcup_m \{\phi_{n,m} \neq p\}$. Then $g \in B$, for $\{g \neq p\} \subset \bigcup_{n,m} \{\phi_{n,m} \neq p\}$.
This shows that B is sequentially closed.
Lemma 5.6.5. Let E ⊂ RΩ .
(i) If E is closed under +, −, ·, ∨, ∧, ∧1 or | |, then so is ERΣ .
If E ⊂ Bb (Ω) is a Stone lattice or a ring then,
(ii) ERΣ is a ring lattice closed under chopping.
(iii) The collection R(E) of sets in ERΣ is the same as the σ–ring, Rσ (E), generated by
all sets of the form φ−1 (I) where φ ∈ E and I is any interval in R \ {0}.
Proof. (i) Let ⋄ denote any of the operations in {+, −, ·, ∨, ∧} and define
E ⋄ = {f ∈ ERΣ : f ⋄ g ∈ ERΣ , g ⋄ f ∈ ERΣ , ∀ g ∈ E}.
If E is closed under ⋄ then E ⊂ E ⋄ . It is easy to check that E ⋄ is sequentially closed. Hence
E ⋄ = ERΣ . Define
E⋄⋄ = {f ∈ ERΣ : f ⋄ g ∈ ERΣ , g ⋄ f ∈ ERΣ , ∀ g ∈ ERΣ }.
Then, E ⊂ E⋄⋄ . It is easy to check that E⋄⋄ is sequentially closed. Hence E⋄⋄ = ERΣ . A similar
proof shows that ERΣ is closed under ∧1 or | |, when E is closed under one or the other
operation respectively.
(ii) As $\overline{E}^u \subset E_{\mathbb{R}}^\Sigma$, $(\overline{E}^u)_{\mathbb{R}}^\Sigma = E_{\mathbb{R}}^\Sigma$. By Theorem 5.3.9 $\overline{E}^u$ is a ring lattice closed under chopping.
The conclusion follows from (i).
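(iii): As $1_{A \setminus B} = 1_A - 1_A \wedge 1_B$ and $1_{\bigcup_n A_n} = \bigvee_n 1_{A_n}$, R(E) is closed under proper differences
and countable unions, and so is a σ–ring. Since $1_{\{f > 1\}} = \lim_n 1 \wedge (n(f - f \wedge 1))$, $\{f > r\} \in R(E)$
for any $f \in E_{\mathbb{R}}^\Sigma$ and any $r > 0$. Thus $\{f \ge r\} = \bigcap_n \{f > r(1 - \frac{1}{n})\}$ and $\{f > 0\} = \bigcup_n \{f > \frac{1}{n}\}$
belong to $E_{\mathbb{R}}^\Sigma$. Replacing f by −f shows that $\{f < -r\}$, $\{f \le -r\}$ and $\{f < 0\}$ also
belong to $E_{\mathbb{R}}^\Sigma$. Consequently $f^{-1}(I) \in R(E)$ for any $f \in E_{\mathbb{R}}^\Sigma$ and any interval $I \subset \mathbb{R} \setminus \{0\}$;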
Let E ∗ denote the collection of real–valued functions f such that {f > r} ∈ Rσ (E) for all
r > 0. It follows that for any f ∈ E ∗ and any interval I contained in R\{0}, f −1 (I) ∈ Rσ (E).
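Thus, for any $f \in E^*$ the sequences $(s_n^+)$ and $(s_n^-)$ defined by
$$s_n^+ = \sum_{k=0}^{\infty} \frac{k}{2^n}\, 1_{\{k < 2^n f \le k+1\}}$$
$$s_n^- = \sum_{k=0}^{\infty} \frac{k}{2^n}\, 1_{\{k < -2^n f \le k+1\}}$$
belong to $E_{\mathbb{R}}^\Sigma$.
As $s_n^+ \to f_+$ and $s_n^- \to f_-$ pointwise, $f \in E_{\mathbb{R}}^\Sigma$. We claim that $E^*$ is sequentially
closed. Indeed, if $E^* \ni f_n \to f$ pointwise then, as $R_\sigma(E)$ is closed under countable unions
and intersections,
$$\{f > r\} = \bigcup_k \bigcup_N \bigcap_{n \ge N} \{f_n > r + \tfrac{1}{k}\} \in R_\sigma(E).$$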
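The dyadic sums $s_n^{\pm}$ in the argument above can be evaluated pointwise. The Python sketch below is an illustration only: it treats a single finite value x = f(ω) and uses hypothetical helper names (`s_plus`, `s_minus`, `s`) that do not appear in the notes. At a point where $f = x > 0$, exactly one indicator in the sum for $s_n^+$ is active, namely the one with $k = \lceil 2^n x \rceil - 1$.

```python
import math


def s_plus(x, n):
    """Value of s_n^+ at a point where f takes the finite value x.

    Picks the unique integer k >= 0 with k < 2**n * x <= k + 1 and returns
    k / 2**n; returns 0 when x <= 0, where no indicator in the sum is active.
    """
    if x <= 0:
        return 0.0
    k = math.ceil(2 ** n * x) - 1
    return k / 2 ** n


def s_minus(x, n):
    """Value of s_n^- at the same point: the same recipe applied to -x."""
    return s_plus(-x, n)


def s(x, n):
    """Value of s_n = s_n^+ - s_n^-."""
    return s_plus(x, n) - s_minus(x, n)


for n in range(6):
    print(n, s_plus(math.pi, n), s(-1.3, n))
```

For $x > 0$ the values satisfy $0 < x - s_n^+ \le 2^{-n}$ and are nondecreasing in $n$, which is the pointwise convergence $s_n^+ \to f_+$ used in the proof, together with the bound $|s_n| \le |f|$.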
Proof. Suppose $\{\phi_n\} \subset E$ satisfies $\psi = \sup_n \phi_n > 0$. Then, $1 = 1_{\{\psi > 0\}} = \bigvee_n 1_{\{\phi_n > 0\}} \in E_{\mathbb{R}}^\Sigma$.
Hence $E_{\mathbb{R}}^\Sigma$ is an algebra; consequently, R(E) is a σ–algebra.
If $E_{\mathbb{R}}^\Sigma$ is an algebra, then $1 \in E_{\mathbb{R}}^\Sigma$. Hence there is a sequence $\{\phi_n\} \subset E$ such that $\bigcup_n \{\phi_n \neq
0\} \supset \{1 \neq 0\} = \Omega$. If E is a vector lattice then $|\phi_n| \in E$, and $\sup_n |\phi_n| > 0$ on Ω. If E
is merely a ring then, for each n there is a sequence $(\psi_{m,n}) \subset E_+$ such that $\psi_{m,n} \nearrow |\phi_n|$
uniformly. Therefore $\psi = \sup_{m,n} \psi_{m,n} > 0$ on Ω. The last statement follows directly from
Lemma 5.6.5(iii).
Example 5.6.7. Suppose S is a topological space. The collection of Baire sets (sets in
$C_b(S)^\Sigma$) is the σ–algebra generated by $C_b(S)$, and we will refer to it as the Baire
σ–algebra. If S is metrizable, then the family of Baire sets coincides with the Borel σ–algebra.
Proof. Let d be a complete metric with d < 1 that generates the topology in Y and let
D = (yn ) ⊂ Y be a dense sequence. We will show there is a sequence of measurable
functions fn : X → Y such that
We start by defining $f_0(x) \equiv y_0$ so that (i) holds. Proceeding by induction, assume that
$f_n$ has been defined so that (i) holds. Since F is weakly measurable and $f_n$ is measurable, it
follows that
$$A_k = f_n^{-1}\big(B(y_k; \tfrac{1}{2^n})\big) \cap \big\{x \in X : F(x) \cap B(y_k; \tfrac{1}{2^{n+1}}) \neq \emptyset\big\} \in \mathcal{A}$$
for each $y_k \in D$. Given $x \in X$, let $s \in F(x)$ be such that $d(s, f_n(x)) < 2^{-n}$. Since D is
dense, there is $y_k \in D$ such that $d(s, y_k) < \min\{2^{-n-1},\, 2^{-n} - d(s, f_n(x))\}$; consequently,
$d(y_k, F(x)) < 2^{-n-1}$, $x \in A_k$ and $\bigcup_k A_k = X$. Let $\{B_k\} \subset \mathcal{A}$ be a sequence of pairwise
disjoint sets such that $B_k \subset A_k$ and $\bigcup_k B_k = \bigcup_k A_k = X$. By letting $f_{n+1}(x) = y_k$ whenever
$x \in B_k$, we obtain a measurable function $f_{n+1}$ satisfying (i) and (ii).
5.8. Exercises
Exercise 5.8.1. Let V be a vector space over R.
(i) If (V, ≤) is a partially ordered vector space, show that C = {x ∈ V : x ≥ 0} is a
convex pointed cone, i.e.
(a) αx ∈ C for all α ≥ 0 and x ∈ C,
(b) αx + (1 − α)y ∈ C for all 0 ≤ α ≤ 1 and x, y ∈ C,
(c) C ∩ (−C) = {0}.
(ii) Conversely, if C is a convex pointed cone then the relation x ≤ y iff y − x ∈ C
defines a vector order on V with {x ≥ 0} = C.
(iii) Show that a partially ordered vector space V is a vector lattice iff for any x, y ∈ V
there exists w ∈ V , denoted by x ∧ y, such that w ≤ x, w ≤ y and v ≤ w whenever
v ≤ x and v ≤ y.
Exercise 5.8.2. Show that M majorizes V iff M minorizes V , i.e., for any x ∈ V , there is
y ∈ M with y ≤ x.
Exercise 5.8.3. Let X and Y be locally compact Hausdorff topological spaces. Show that
the ring $E \subset C_{00}(X \times Y)$ of all functions of the form $f(x, y) = \sum_{k=1}^n \phi_k(x)\psi_k(y)$ where
$\phi_k \in C_{00}(X)$, $\psi_k \in C_{00}(Y)$, and $n \in \mathbb{N}$, is dense in $(C_0(X \times Y), \|\ \|_u)$. (Hint: $\overline{C_{00}(X \times Y)}^u =
C_0(X \times Y)$. Show that any $g \in C_{00}(X \times Y)$ can be approximated uniformly by functions in
E.)
Exercise 5.8.4. Show that the collection of trigonometric polynomials
$$p(\theta) = \sum_{k=-m}^{n} c_k e^{i\theta k}$$
is uniformly dense in the set of complex periodic continuous functions on [−π, π]. Show that
the set of real trigonometric functions
$$g(\theta) = \sum_{k=0}^{n} a_k \cos(k\theta) + b_k \sin(k\theta)$$
is uniformly dense in the set of real periodic continuous functions on [−π, π].
Exercise 5.8.5. Show that if $E' \subset E \subset S^\Omega$ then $(E')_S^\Sigma \subset E_S^\Sigma$ and $(E_S^\Sigma)_S^\Sigma = E_S^\Sigma$.
(b) If S is a nonempty subspace of a metric space (T, d), then $E_S^\Sigma \subset E_T^\Sigma \cap S^\Omega$.
Exercise 5.8.7. Let (S, τ ) be a topological space.
(a) Show that (5.5) holds.
(b) If S is metrizable, show that $C_b(S)^\Sigma = M(S)$.
(c) If S is a locally compact, second countable Hausdorff space, show all classes in (5.5)
coincide. (Hint: Show that for any compact set K and open set U , there are
sequences fn , gn ∈ C00 (S) such that fn ց 1K and gn ր 1U .)
(d) For the Euclidean space $\mathbb{R}^d$, show that the sequential closure of the set of polynomials
in $\mathbb{R}^d$ is $M(\mathbb{R}^d)$.
Exercise 5.8.8. Suppose E ⊂ Bb (Ω) is a Stone lattice. Let f ∈ ERΣ . Show that
(a) (f ∧ r) ∨ (−r) ∈ ERΣ for any r > 0.
(b) The sets {f > r} and {|f | = ∞} belong to ERΣ for all r > 0.
(c) For any set A ∈ ERΣ , f 1Ac ∈ ERΣ , and so f 1{|f |6=∞} ∈ ERΣ .
Chapter 6
Integration: functional
approach
In this chapter we discuss an approach to integration (Daniell integration) that does not use
any measure theoretic considerations. Daniell’s direct and elegant approach to integration
exploits the continuity properties of a linear functional (elementary integral) defined on a
set of integrands which has a minimal required algebraic and/or order structure. Then,
through the introduction of a seminorm, it extends the elementary integral to the largest
possible space of functions so that linearity and dominated convergence hold. Measurability
is in turn defined in terms of local properties of integrable functions. Carathéodory’s
cut condition (3.3) of measurability is obtained as a consequence of the extension, and a
measure theoretic representation follows as a result.
The following result summarizes the properties of the upper Riemann–Jordan integral.
Theorem 6.1.2. The upper integral I # satisfies the following properties:
(i) (positive homogeneity) I # (rf ) = rI # (f ) for any scalar r ≥ 0 and any f ∈ F # .
(ii) (subadditivity) I # (f + g) ≤ I # (f ) + I # (g) for any f, g ∈ F # .
(iii) (increasing monotonicity) If f, g ∈ F # and f ≤ g, then I # (f ) ≤ I # (g).
(iv) (majorization) For any φ ∈ E(R), |I(φ)| ≤ I # (|φ|).
The lower integral I# is positive homogeneous and monotone increasing and satisfies
(ii)’ (superadditivity) I# (f + g) ≥ I# (f ) + I# (g) for any f, g ∈ F # .
(ii) For any ε > 0 there exist φ, ψ ∈ E(R) such that f ≤ φ, g ≤ ψ and
$$I(\phi) < I^{\#}(f) + \frac{\varepsilon}{2}, \qquad I(\psi) < I^{\#}(g) + \frac{\varepsilon}{2}.$$
As f + g ≤ (φ + ψ) ∈ E(R), it follows that
I # (f + g) ≤ I(φ + ψ) = I(φ) + I(ψ) ≤ I # (f ) + I # (g) + ε
Since ε > 0 is arbitrary, (ii) follows.
(iv) As E(R) is a lattice, if φ ∈ E(R) then ±|φ| ∈ E(R). As I is positive, the result follows
from I(|φ| ± φ) ≥ 0.
For the last statement we can follow similar arguments as above. A more direct proof,
however, can be obtained by noticing that I# (f ) = −I # (−f ).
Corollary 6.1.3. For any f, g ∈ F # , |I # (f ) − I # (g)| ≤ I # (|f − g|).
6.1. The Riemann integral revisited 137
Proof. For any $v \in \overline{V}$ there exists a sequence $(v_n : n \in \mathbb{N}) \subset V$ such that $\lim_n \|v - v_n\| = 0$.
As $|\Lambda(v_n) - \Lambda(v_m)| \le \|v_n - v_m\|$ it follows that $\Lambda v_n$ converges. If $(u_n : n \in \mathbb{N}) \subset V$ also
converges to v in ‖ ‖, then $|\Lambda v_n - \Lambda u_n| \le \|v_n - u_n\| \le \|v_n - v\| + \|u_n - v\|$. This shows
that $\Lambda u_n$ converges and that $\lim_n \Lambda u_n = \lim_n \Lambda v_n$. We define $\Lambda v := \lim_n \Lambda v_n$. Clearly Λ is
a linear extension of Λ to $\overline{V}$ and $|\Lambda v| = \lim_n |\Lambda v_n| \le \lim_n \|v_n\| = \|v\|$.
That L# is a ring lattice closed under chopping follows from solidity of the
seminorm ‖ ‖# and the inequalities
$$\big||f| - |\phi|\big| \le |f - \phi|$$
$$|f \wedge 1 - \phi \wedge 1| \le |f - \phi|$$
$$|fg - \phi\psi| \le \|f\|_u |g - \psi| + \|\psi\|_u |f - \phi|$$
The observation that E is self–confined has no bearing on the algebraic and order struc-
ture of L# . It has an effect in estimating the limit of the integral of sequences of Riemann
integrable functions that converge uniformly.
Theorem 6.1.8. (Uniform dominated convergence theorem) Suppose the sequence (fn :
n ∈ N) ⊂ L# converges uniformly to some function f . If |fn | ≤ g for all n ∈ N and some
function g ∈ L# , then f ∈ L# , kfn − f k# → 0 and limn I(fn ) = I(f ).
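Proof. Suppose $[-m, m]$ contains the support of $\phi_1$, and hence of all $\phi_n$. For each n let
$\{x_{jn} : 1 \le j \le k_n\}$ be the points of discontinuity of $\phi_n$. For each ε > 0 and n ∈ N let
$$B_n = \bigcup_{\ell=1}^{n} \bigcup_{j=1}^{k_\ell} \big(x_{j\ell} - \varepsilon 2^{-j-\ell-1},\ x_{j\ell} + \varepsilon 2^{-j-\ell-1}\big)$$
$$\widetilde{B}_n = \bigcup_{\ell=1}^{n} \bigcup_{j=1}^{k_\ell} \big(x_{j\ell} - \varepsilon 2^{-j-\ell-1},\ x_{j\ell} + \varepsilon 2^{-j-\ell-1}\big].$$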
Lemma 6.1.9 is a modest version of monotone convergence. Not only does it use
the algebraic structure of the space of step functions, but it also takes advantage of the
topological properties of the real line.
It can be shown (Exercise 6.8.3) that δ–continuity is equivalent to the following prop-
erties
(a) (σ–continuity) If φn ≤ φn+1 ∈ E and supn φn ∈ E, then limn I(φn ) = I(supn φn ).
(b) (σ–additivity) If $0 \le \varphi_n \in E$ and $\sum_n \varphi_n \in E$, then $I(\sum_n \varphi_n) = \sum_n I(\varphi_n)$.
Remark 6.2.3. Not all elementary integrals are σ–continuous. Let Ω = N. The space c of
all convergent sequences in R is an algebra lattice. The positive linear functional
$$I(\phi) = \lim_n \phi(n), \qquad \phi \in c$$
defines a positive elementary integral on E = c which is not σ–additive. To check the last
statement, consider the sequence $\{\varphi_m = 1_{\{1,\ldots,m\}} : m \in \mathbb{N}\}$. Then $\varphi_m \nearrow 1 \in E$; however,
$0 = \lim_m I(\varphi_m) < 1 = I(1)$.
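The counterexample in Remark 6.2.3 can be traced in a few lines of Python. This is a minimal sketch under a simplifying assumption: it models only eventually constant sequences (a subspace of c), for which the limit is simply the tail value. The class `EventuallyConstant` and the helpers `limit_integral` and `indicator_up_to` are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EventuallyConstant:
    """A sequence phi(1), phi(2), ... equal to `tail` beyond the listed `head`."""
    head: Tuple[float, ...]
    tail: float

    def __call__(self, n: int) -> float:
        return self.head[n - 1] if n <= len(self.head) else self.tail


def limit_integral(phi: EventuallyConstant) -> float:
    """I(phi) = lim_n phi(n), which is the tail value in this model."""
    return phi.tail


def indicator_up_to(m: int) -> EventuallyConstant:
    """The indicator function of {1, ..., m} on N."""
    return EventuallyConstant(head=(1.0,) * m, tail=0.0)


one = EventuallyConstant(head=(), tail=1.0)  # the constant function 1
values = [limit_integral(indicator_up_to(m)) for m in range(1, 6)]
print(values, limit_integral(one))
```

Each indicator has limit 0 although the indicators increase pointwise to the constant 1, whose limit is 1; this is the failure of σ–continuity described in the remark.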
Remark 6.2.4. Exercise 6.8.4 shows why measure theory considers rings of sets as those
that can be measured so that the measure is additive.
Lemma 6.3.1. For any vector space E, E ↑ is closed under addition, multiplication by
nonnegative scalars and taking countable suprema. If in addition E is a vector lattice, then
E ↑ is also closed under taking finite infima.
6.3. Daniell’s mean 141
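Proof. Suppose $E^\uparrow \ni h_n$ and $h = \sup_n h_n$. For each $n \in \mathbb{N}$ let $\{\phi_{n,k}\} \subset E$ be such that
$h_n = \sup_k \phi_{n,k}$. Then, $h_1 + h_2 = \sup_{n,m}(\phi_{1,n} + \phi_{2,m})$, $r h_1 = \sup_n r\phi_{1,n}$ for any $r \ge 0$ and
$h = \sup_{n,k} \phi_{n,k}$. The first statement follows as each of the collections $\{\phi_{1,n} + \phi_{2,k} : n, k \in \mathbb{N}\}$,
$\{r\phi_{1,n} : n \in \mathbb{N}\}$ and $\{\phi_{n,k} : n, k \in \mathbb{N}\}$ is countable.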
If E is a vector lattice, then ψn,m = φ1,n ∧ φ2,m ∈ E and h1 ∧ h2 = supn,m ψn,m . The second
statement follows as {ψnm } is countable.
Example 6.3.2. Suppose E is a ring. Then |φ|, (φ − 1)+ , 1 ∧ φ and 1 ∧ a(φ − 1)+ are
elements of E ↑ for any φ ∈ E and a > 0. Indeed, let M = kφku . By Lemma 5.3.2 and
Example 5.3.16 the maps $t \mapsto |t|$, $t \mapsto (t - 1)_+$ and $t \mapsto 1 \wedge a(t - 1)_+$ are the uniform limits on
[−M, M ] of monotone increasing and monotone decreasing sequences of polynomials that
vanish at t = 0. Consequently, $|\phi|$, $(\phi - 1)_+$ and $1 \wedge a(\phi - 1)_+$ are uniform limits of monotone
decreasing and monotone increasing sequences of elements in E. As φ ∧ 1 = φ − (φ − 1)+ ,
φ ∧ 1 is the uniform limit of an increasing sequence in E.
Definition 6.3.3. Suppose I is a positive σ–continuous elementary integral on a vector
lattice E ⊂ Bb (Ω). The Daniell upper integral of a function h ∈ E ↑ is defined by
$$(6.4)\qquad \int^* h\, dI = I^*(h) = \sup\{I(\phi) : \phi \in E,\ \phi \le h\}$$
It is clear from the definition above that I ∗ (φ) = I(φ) for all φ ∈ E, and that expres-
sions (6.4) and (6.5) coincide on E ↑ . The following result summarizes the properties of
I ∗.
Theorem 6.3.4. Suppose I is a positive σ–continuous elementary integral on a vector
lattice E ⊂ Bb (Ω). Then Daniell’s upper integral I ∗ has the following properties:
(i) I ∗ is nondecreasing and positive homogeneous.
(ii) If {hn } ⊂ E ↑ is a nondecreasing sequence, then I ∗ (hn ) ր I ∗ (supn hn ).
(iii) I ∗ is additive on E ↑ .
(iv) $I^*$ is countably subadditive, i.e., if $f_n \ge 0$ then $I^*(\sum_n f_n) \le \sum_n I^*(f_n)$.
Proof. (i) Increasing monotonicity follows directly from (6.4) and (6.5). Positive homo-
geneity is a consequence of Lemma 6.3.1 and linearity of I on E.
(ii) Suppose E ↑ ∋ hn ր h. Then supn I ∗ (hn ) ≤ I ∗ (h) by the increasing monotonicity of I ∗ .
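For each n let $\{\phi_{n,m} : m \in \mathbb{N}\} \subset E$ be such that $\phi_{n,m} \nearrow h_n$ and define the sequence
$\psi_k = \max_{0 \le n,m \le k} \phi_{n,m}$. If $a < I^*(h)$, let $E \ni \phi \le h$ be such that $a < I(\phi)$. Then, $E \ni \varphi_k = \psi_k \wedge \phi \le h_k$
and $\varphi_k \nearrow \phi$. Since (E, I) is σ–continuous, we have that
$$a < I(\phi) = \lim_k I(\varphi_k) \le \lim_k I^*(h_k)$$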
(iv) It is enough to assume that $\sum_n I^*(f_n) < \infty$. For ε > 0 and each n, let $E^\uparrow \ni h_n \ge f_n$
be such that $I^*(h_n) < I^*(f_n) + 2^{-n}\varepsilon$. Parts (ii) and (iii) and Lemma 6.3.1 imply
$$I^*\Big(\sum_n f_n\Big) \le I^*\Big(\sum_n h_n\Big) = \lim_n I^*\Big(\sum_{k=1}^{n} h_k\Big) = \lim_n \sum_{k=1}^{n} I^*(h_k) = \sum_n I^*(h_n) \le \sum_n I^*(f_n) + \varepsilon.$$
(iii) (Countable subadditivity) If $\{f_n\}$ is a sequence of nonnegative extended real–valued functions,
then $\|\sum_n f_n\|^* \le \sum_n \|f_n\|^*$.
(iv) (Continuity) If $\{\phi_n : n \in \mathbb{N}\} \subset E_+$ and $\sup_n \|\sum_{k=1}^{n} \phi_k\|^* < \infty$, then $\lim_n \|\phi_n\|^* = 0$.
(v) For any φ ∈ E, I(φ) ≤ kφk∗ .
As E is a vector lattice, φ ∈ E implies |φ| ∈ E; hence, kφk∗ = I(|φ|) < ∞ for all φ ∈ E.
For a comparison between a mean (Daniell mean) and the Jordan seminorm, see Exer-
cise 6.8.7.
Definition 6.3.7. Let $E \subset B_b(\Omega)$ be a vector space. A functional ‖ ‖ on $\overline{\mathbb{R}}^\Omega$ that is finite on E
and satisfies (i)–(iv) in Theorem 6.3.5 is called a mean for E. A mean is said to dominate
the elementary integral (E, I) if (v) holds.
Remark 6.3.8. Notice that solidity of a mean k k implies that k|f |k = kf k. When E is a
vector lattice, the Daniell mean kf k∗ = I(|f |) dominates the elementary integral.
Theorem 6.3.9. (Chebyshev’s inequality.) If ‖ ‖ is a mean for E, then
$$(6.6)\qquad \|\{f > \lambda\}\| \le \|\{|f| > \lambda\}\| \le \frac{1}{\lambda}\|f\|$$
for any $f \in \overline{\mathbb{R}}^\Omega$ and λ > 0.
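Chebyshev's inequality can be sanity-checked on a finite model. The Python sketch below is an illustration under stated assumptions, not the general setting of the theorem: Ω is a five-point set, the elementary integral is a weighted sum with hypothetical positive weights, and the mean is $\|f\| = \sum_i w_i |f_i|$, with the mean of a set understood as the mean of its indicator. The helper names `mean` and `indicator` and all numerical values are made up for the example.

```python
def mean(f, weights):
    """Mean of the weighted counting integral: sum_i w_i * |f_i|."""
    return sum(w * abs(x) for x, w in zip(f, weights))


def indicator(f, predicate):
    """Indicator function of the set {omega : predicate(f(omega))}."""
    return [1.0 if predicate(x) else 0.0 for x in f]


# Hypothetical data on a five-point space.
weights = [0.5, 1.0, 2.0, 0.25, 1.5]
f = [3.0, -4.0, 0.5, 10.0, -0.1]
lam = 2.0

lhs = mean(indicator(f, lambda x: x > lam), weights)       # mean of {f > lam}
mid = mean(indicator(f, lambda x: abs(x) > lam), weights)  # mean of {|f| > lam}
rhs = mean(f, weights) / lam                               # (1 / lam) * mean of f

print(lhs <= mid <= rhs)
```

The first inequality is monotonicity (solidity) of the mean, since $\{f > \lambda\} \subset \{|f| > \lambda\}$; the second follows from the pointwise bound $\lambda 1_{\{|f| > \lambda\}} \le |f|$.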
Proof. Statement (i) follows from solidity, absolute homogeneity and countable subaddi-
tivity of the mean, and from the inequalities |a f + g| ≤ |a||f | + |g|, |f ∨ g| ≤ |f | + |g|,
|f ∧ g| ≤ |f | + |g|, and |f ∧ 1| ≤ |f |.
(ii) Suppose that $\{f_n\} \subset F$ is a Cauchy sequence. By Lemma 6.3.11, we can assume without
loss of generality that $|f_n(\omega)| < \infty$ for all n and all ω ∈ Ω. Choose a subsequence $\{f_{n_k}\}$
such that $\sup_{n \ge n_k} \|f_n - f_{n_k}\| < 2^{-k}$. Then $g = \sum_k |f_{n_{k+1}} - f_{n_k}| \in F$. Hence $B = \{g = \infty\}$
is negligible,
$$f(x) = f_{n_1}(x) + \sum_{k=1}^{\infty} \big(f_{n_{k+1}}(x) - f_{n_k}(x)\big) = \lim_k f_{n_k}(x)$$
absolutely on $B^c$, and $\|f\| < \|f_{n_1}\| + 1 < \infty$. For each k, if $n \ge n_k$ then
$$\|f - f_n\| \le \|f_n - f_{n_k}\| + \|f - f_{n_k}\| \le 2^{-k} + \|1_{B^c}(f - f_{n_k})\| \le 2^{-k} + \Big\|\sum_{m \ge k} 1_{B^c}(f_{n_{m+1}} - f_{n_m})\Big\| \le 2^{-k} + 2^{-k+1} \to 0.$$
(iii) If fn converges to f in mean, then (fn ) is a Cauchy sequence in mean. By part (ii)
there is a subsequence {fnk } and a function f ′ ∈ F to which fnk converges in mean and
almost surely. It follows that f and f ′ are finite k k–a.s., and f = f ′ k k–a.s.
The following result is a simple version of monotone convergence for pointwise limits of
elementary functions.
Lemma 6.3.13. Suppose (φn ) ⊂ E is a monotone increasing sequence with supn kφn k < ∞.
Then supm φm ∈ L1 and limn k supm φm − φn k = 0.
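Proof. We claim that $(\phi_n)$ is a Cauchy sequence in $L_1$; otherwise, there is ε > 0 and a
subsequence $\phi_{n_k}$ such that $\|\phi_{n_k} - \phi_{n_{k-1}}\| \ge \varepsilon$. However, as $(\phi_{n_k} - \phi_{n_{k-1}}) \subset E_+$ and
$$\sup_K \Big\|\sum_{k=1}^{K} (\phi_{n_k} - \phi_{n_{k-1}})\Big\| = \sup_K \|\phi_{n_K} - \phi_{n_0}\| \le 2 \sup_m \|\phi_m\| < \infty,$$
the continuity property of the mean gives $\lim_k \|\phi_{n_k} - \phi_{n_{k-1}}\| = 0$, which is a contradiction. Therefore, $\|\sup_m \phi_m - \phi_n\| \to 0$ by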
Theorem 6.3.12(b,c).
Lemma 6.3.14. Assume E ⊂ Bb (Ω) is a Stone lattice or a ring. Let k k be a mean for E.
(i) For any $\phi \in E$, $|\phi|, \phi^2 \in \overline{E}^u \cap L_1(\|\ \|)$.
(ii) If $\phi \in E_+$ then $\phi \wedge 1 \in L_1$.
(iii) If $f \in L_1^+(\|\ \|)$, then there exists a sequence $\{\psi_n\} \subset E_+$ such that $\|f - \psi_n\| \to 0$.
Proof. (i) Suppose $\lim_n \|f_n - f\| = 0$ where $\{f_n : n \in \mathbb{N}\} \subset L_1$. Then, for any $f_n$ there
exists $\phi_n \in E$ such that $\|f_n - \phi_n\| < \frac{1}{n}$. Consequently
$$\|f - \phi_n\| \le \|f - f_n\| + \|f_n - \phi_n\| \to 0$$
as $n \to \infty$. Therefore $L_1$ is a closed linear subspace of F.
Suppose $f, g \in L_1$, $a \in \mathbb{R}$ and let $(\phi_n : n \in \mathbb{N})$ and $(\psi_n : n \in \mathbb{N})$ be sequences in E such that
$\lim_n \|\phi_n - f\| = \lim_n \|\psi_n - g\| = 0$. Then
$$(6.7)\qquad |a f + g - (a\phi_n + \psi_n)| \le |a||f - \phi_n| + |g - \psi_n|$$
$$(6.8)\qquad \big||f| - |\phi_n|\big| \le |f - \phi_n|$$
$$(6.9)\qquad |f \wedge 1 - \phi_n \wedge 1| \le |f - \phi_n|.$$
Solidity, absolute homogeneity and subadditivity of k k imply that af + g ∈ L1 .
If E is a lattice then (6.8) and (6.9) imply that |f |, f ∧ 1 ∈ L1 .
If E is merely a ring then (|φn | : n ∈ N) ⊂ L1 by Lemma 6.3.14(i), and so |f | ∈ L1
by (6.8). Consequently f+ , f− ∈ L1 whenever f ∈ L1 . Since f ∧ 1 = f+ ∧ 1 − f− , to
show that f ∧ 1 ∈ L1 it is enough to assume that f ≥ 0. In such case, there is a sequence
$(\phi_n : n \in \mathbb{N}) \subset E_+$ such that $\|f - \phi_n\| \to 0$ by Lemma 6.3.14. Since $(\phi_n \wedge 1 : n \in \mathbb{N}) \subset L_1$,
$\|f \wedge 1 - \phi_n \wedge 1\| \to 0$ by (6.9).
(ii) We first show that $g\phi \in L_1$ whenever $\phi \in E$. If $g \in \overline{E}^u$ and $(\phi_n) \subset E$ converges uniformly
to g then, $\phi\phi_n \in L_1(\|\ \|)$ by Lemma 6.3.14(i). As $\|\phi\phi_n - \phi g\| \le \|\phi_n - g\|_u \|\phi\| \to 0$, $\phi g \in L_1$.
If g is integrable and bounded and (φn ) ⊂ E is such that kg − φn k → 0 then, φφn ∈ L1 by
Lemma 6.3.14(i). As kφg − φφn k ≤ kφku kg − φn k → 0, φg ∈ L1 .
For a general f ∈ L1 , let (ψn ) ⊂ E be such that kf − ψn k → 0. Then gψn ∈ L1 for all n
and, since kf g − ψn gk ≤ kgku kf − ψn k → 0, f g ∈ L1 .
Remark 6.3.16. Statement (ii) in Theorem 6.3.15 says that the collection of bounded
integrable functions is an algebraic ring contained in L1 (k k).
Without loss of generality we may assume that |fn | < ∞ on Ω for all n. It is enough to
consider the case when fn ր f pointwise everywhere, for if fn ց f then f1 − fn ր f1 − f .
We claim that $(f_n)$ is a Cauchy sequence in $L_1$; otherwise, for some ε > 0 there would be a
subsequence $\{f_{n_k}\}$ such that $\inf_k \|f_{n_{k+1}} - f_{n_k}\| > \varepsilon$. As $(f_{n_{k+1}} - f_{n_k}) \subset L_1^+$ and
$$\sup_K \Big\|\sum_{k=1}^{K} (f_{n_{k+1}} - f_{n_k})\Big\| \le \sup_n \|f_n\| + \|f_{n_1}\| < \infty,$$
Proof. Without loss of generality we may assume that all conditions happen everywhere.
By Theorem 6.3.12(d) and Daniell’s monotone convergence,
gn = sup{|fk − fm | : k, m ≥ n} ∈ L1 , n ∈ N.
Since gn ց 0 and 0 ≤ gn ≤ 2g for all n, kgn k → 0 by monotone convergence. Since
kfk − fm k ≤ kgn k for all k, m ≥ n, (fn ) is a Cauchy sequence in L1 . By Theorem 6.3.12(c),
fn converges in mean to f .
Lemma 6.4.9. If E is a Stone lattice or a ring then, for any a > 0 and h ∈ E ↑ , the function
1{h>a} ∈ E ↑ .
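Proof. For any $\phi \in E$ let $\phi_n = 1 \wedge n(\phi - \phi \wedge 1)$. If E is a Stone lattice then $\phi_n \in E$.
If E is merely a ring, then $\phi_n \in E^\uparrow$ as in Example 6.3.2. Thus $1_{\{\phi > 1\}} = \sup_n \phi_n \in E^\uparrow$ by
Lemma 6.3.1. Therefore, if $h = \sup_n \varphi_n$ where $(\varphi_n) \subset E$ then, $\{h > a\} = \bigcup_n \{\varphi_n > a\} \in
E^{\uparrow\uparrow} = E^\uparrow$.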
Theorem 6.4.10. Let I be a positive σ–continuous elementary integral on a Stone lattice
E ⊂ Bb (Ω), and let k k∗ be its Daniell’s mean. Then
(6.10) kAk∗ = inf{kBk∗ : A ⊂ B ∈ E ↑ } = inf{kBk∗ : A ⊂ B ∈ L1 }
for all A ∈ F. Moreover, for any A ∈ F there is a set B ∈ E ↑↓ ∩ L1 such that A ⊂ B and
kAk∗ = kBk∗ .
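Proof. If $\|A\|^* < \infty$, then there is a sequence $(h_n) \subset E^\uparrow \cap F$ such that $1_A \le h_n$ and
$\lim_n \|h_n\|^* = \|A\|^*$. For any $0 < \varepsilon < 1$
$$1_A \le 1_{\{h_n > 1 - \varepsilon\}} \le \frac{h_n}{1 - \varepsilon};$$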
Then $\{s_n = s_n^+ - s_n^-\}$ is a sequence of integrable simple functions that converge to f
on $\{|f| \neq \infty\}$ and such that $|s_n| \le |f|$. By dominated convergence we conclude that
$\|s_n - f\| \to 0$.
6.5. Extension of the Integral 149
A function $f \in \overline{\mathbb{R}}^\Omega$ is called σ–finite with respect to a mean ‖ ‖ if $\{f \neq 0\}$ is covered by
a sequence of ‖ ‖–integrable sets. A mean ‖ ‖ is said to be σ–finite if the constant function
1 is σ–finite with respect to ‖ ‖. Any $f \in L_1(\|\ \|)$ is σ–finite, for $\{|f| > 0\} = \bigcup_n \{|f| > \frac{1}{n}\}$,
and $\{|f| > 1/n\} \in L_1(\|\ \|)$ for all n.
Theorem 6.4.12. If the Daniell mean k k∗ of a positive σ–continuous elementary integral
I on a Stone lattice E ⊂ Bb (Ω) is σ–finite then, there exists a sequence (φn ) ⊂ E such that
supn φn ≡ 1.
for any f ∈ L1 (k k) and {φn } ⊂ E with kf − φn k → 0. For any f ∈ L1 (k k), I(f ) is the
Daniell integral of f .
Theorem 6.5.1. (Integral Extension) Let I be a positive σ–continuous elementary integral
on a Stone lattice or a ring E ⊂ Bb (Ω). If k k is a mean for E that dominates I then,
Proof. (i) Suppose f, g ∈ L1 , a ∈ R and let {φn } and {ψn } be sequences in E which
converge in mean to f and g respectively. As kaf + g − (aφn + ψn )k → 0, the linearity of I
on E implies that
$$I(af + g) = \lim_n I(a\phi_n + \psi_n) = aI(f) + I(g).$$
We conclude this section by extending the set of integrable functions to include
complex–valued functions.
Example 6.5.4. Let $E \otimes \mathbb{C} = \{\phi + i\psi : \phi, \psi \in E\}$ be the complex linear span of E. For
any $f \in \mathbb{C}^\Omega$, it is natural to define its seminorm $\|f\|_{\mathbb{C}}^*$ as
$$\|f\|_{\mathbb{C}}^* = \big\||f|\big\|^*.$$
6.6. Alternative extension of the Daniell integral 151
It is obvious that the family $(F_{\mathbb{C}}^*, \|\ \|_{\mathbb{C}}^*)$ of complex–valued functions $f \in \mathbb{C}^\Omega$ with $\|f\|_{\mathbb{C}}^* < \infty$
is a complete complex normed space. The space $L_1(\mathbb{C}, \|\ \|^*)$ of complex–valued integrable
functions is then defined as the closure of $E \otimes \mathbb{C}$ in $F_{\mathbb{C}}^*$. It is easy to check that $f = u + iv \in
L_1(\mathbb{C}, \|\ \|^*)$ iff u and v are in $L_1(\|\ \|^*)$; furthermore, if $\{g_n := \phi_n + i\psi_n : n \in \mathbb{N}\} \subset E \otimes \mathbb{C}$
is a sequence such that $\lim_{n \to \infty} \|f - g_n\|_{\mathbb{C}}^* = 0$, then $\lim_{n \to \infty} (\|\phi_n - u\|^* + \|\psi_n - v\|^*) = 0$.
This means that I can be uniquely extended to $L_1(\mathbb{C}, \|\ \|^*)$ by setting $I(f) := I(u) + i\,I(v)$.
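The complex extension can be illustrated on a finite model. The Python sketch below is an assumption-laden example rather than the general construction: Ω has three points, the elementary integral is a weighted sum with hypothetical positive weights, and the complex seminorm is the weighted sum of moduli. It checks that $I(u) + i\,I(v)$ agrees with summing the complex values directly and that $|I(f)|$ is dominated by the seminorm of f. All names and numbers are made up for the illustration.

```python
def integral(f, weights):
    """Weighted sum sum_i w_i * f_i; works for real or complex values."""
    return sum(w * z for z, w in zip(f, weights))


def complex_mean(f, weights):
    """Seminorm of a complex-valued f: the mean of its modulus |f|."""
    return sum(w * abs(z) for z, w in zip(f, weights))


# Hypothetical data on a three-point space.
weights = [0.5, 1.0, 2.0]
f = [1 + 2j, -3j, 0.5 - 1j]

u = [z.real for z in f]  # real part of f
v = [z.imag for z in f]  # imaginary part of f

# Extension by real and imaginary parts: I(f) := I(u) + i I(v).
I_f = complex(integral(u, weights), integral(v, weights))

print(I_f, complex_mean(f, weights))
```

In this model the real and imaginary parts are each dominated by the modulus, so $\|u\|^* \le \|f\|_{\mathbb{C}}^*$ and $\|v\|^* \le \|f\|_{\mathbb{C}}^*$, which is the mechanism behind the equivalence stated in the example.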
It is easy to check that I∗ (φ) = I(φ) for all φ ∈ E and that (6.14) and (6.15) coincide
on E ↓ .
Theorem 6.6.2. Let E be a vector lattice. Then
(i) E ↓ is closed under addition, multiplication by non negative scalars, countable infima
and finite suprema.
(ii) I∗ is nondecreasing and positive homogeneous.
(iii) I∗ f ≤ I ∗ f for any numerical function f .
(iv) If {gn } ⊂ E ↓ is a nonincreasing sequence, then I∗ (gn ) ց I∗ (inf n gn ).
(v) I∗ is additive on E ↓ .
(vi) $I_*$ is σ–superadditive, i.e., if $f_n \ge 0$ then $I_*(\sum_n f_n) \ge \sum_n I_*(f_n)$.
Proof. Observe that E ↓ = −E ↑ . (i) follows directly from Lemma 6.3.1. (ii) follows directly
from the definition of I∗ . The observation above implies that
(6.16) I∗ f = −I ∗ (−f )
for any numerical function f . (iv) and (v) are consequences of (6.16) and Theorem 6.3.4[i,ii,
iii].
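To prove (iii), consider $g \in E^\downarrow$ and $h \in E^\uparrow$ so that $g \le f \le h$. (i) and (6.16) imply that
$0 \le f - g \le h - g \in E_+^\uparrow$ and $0 \le I^*(h - g) = I^*(h) + I^*(-g) = I^*(h) - I_*(g)$.
Taking the supremum over g and the infimum over h yields $I_* f \le I^* f$.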
152 6. Integration: functional approach
↓
To prove superadditivity, let fn ≥ gn ∈ E+ such that I∗ (gn ) > I∗ (fn ) − 2−n ε. For any fixed
integer N ,
X X X N
I∗ fn ≥ I ∗ gn ≥ I∗ (gn )
n n n=1
N
X
> I∗ (fn ) − 2−n ε .
n=1
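The last estimate can be completed as in the following sketch. It assumes, as the argument above appears to, that ε > 0 is arbitrary and that each I_∗(fn) is finite; the symbols are those of the proof and nothing new is introduced.

```latex
% Sketch: finishing the sigma-superadditivity estimate.
% Assumptions: \varepsilon > 0 arbitrary, f_n \ge 0, I_*(f_n) < \infty.
\begin{align*}
I_*\Bigl(\sum_n f_n\Bigr)
  &> \sum_{n=1}^{N}\bigl(I_*(f_n) - 2^{-n}\varepsilon\bigr)
   = \sum_{n=1}^{N} I_*(f_n) - \varepsilon\,\bigl(1 - 2^{-N}\bigr) \\
  &\ge \sum_{n=1}^{N} I_*(f_n) - \varepsilon .
\end{align*}
% Let N \to \infty first, then \varepsilon \downarrow 0:
\[
I_*\Bigl(\sum_n f_n\Bigr) \;\ge\; \sum_{n} I_*(f_n).
\]
```

The geometric weights 2^{−n} are chosen only so that the accumulated error stays below ε, whatever the number N of terms.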
Lemma 6.7.1. Suppose I is a positive linear functional on C00(X). If {φ : φ ∈ Φ} ⊂ C00^+(X)
is an increasingly directed subset and sup_{φ∈Φ} φ ∈ C00(X), then
\[ I\Bigl(\sup_{\varphi\in\Phi} \varphi\Bigr) = \sup_{\varphi\in\Phi} I(\varphi). \tag{6.17} \]
Proof. Since Φ ⊂ C00^+(X) is an increasingly directed family and by hypothesis g := sup_{φ∈Φ} φ ∈
C00(X), we have that supp(g) is compact and supp(φ) ⊂ supp(g) for all φ ∈ Φ. By Dini's
lemma, ‖g − φ‖u → 0 along the directed set Φ. Urysohn's lemma provides a function
ψ ∈ C00(X) such that supp(g) ≺ ψ. Hence, g − φ = |g − φ| ≤ ‖g − φ‖u · ψ for all φ ∈ Φ.
Therefore
\[ |I(g) - I(\varphi)| = I(g) - I(\varphi) = I(g - \varphi) \le \|g - \varphi\|_u \, I(\psi) \to 0 \]
along the directed set Φ.
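The pointwise bound used in this proof can be spelled out as follows. The sketch assumes the usual reading of the relation ≺ from Urysohn's lemma, namely that ψ ∈ C00(X), 0 ≤ ψ ≤ 1 and ψ = 1 on supp(g); this reading is an assumption, since the notation is not defined in this section.

```latex
% Sketch, assuming supp(g) \prec \psi means:
%   \psi \in C_{00}(X), \quad 0 \le \psi \le 1, \quad \psi = 1 on supp(g).
% Since 0 \le \varphi \le g, both g and \varphi vanish off supp(g), hence
\[
0 \;\le\; g - \varphi
  \;\le\; \|g - \varphi\|_u \,\mathbf{1}_{\operatorname{supp}(g)}
  \;\le\; \|g - \varphi\|_u \,\psi .
\]
% Positivity and linearity of I then give
\[
0 \;\le\; I(g) - I(\varphi) \;=\; I(g - \varphi)
  \;\le\; \|g - \varphi\|_u \, I(\psi).
\]
```

The role of ψ is to replace the indicator of supp(g), which need not be continuous, by an element of C00(X) on which I is defined.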
Lemma 6.7.1 shows that any positive linear functional I on C00(X) leads to an elementary
integral with a stronger version of σ–continuity. Positive linear functionals on
C00(X) are called positive Radon measures. A more detailed description of Radon measures
is developed in Section 7.7.
Definition 6.7.2. Suppose E ⊂ Bb(Ω). Let E⇑ denote the collection of extended real–valued
functions that are the pointwise suprema of arbitrary collections in E.
(a) An elementary integral (E, I) is order continuous if (6.17) holds for every increasingly
directed collection Φ ⊂ E with sup_{φ∈Φ} φ ∈ E.
(b) A mean ‖ ‖ for E is said to be order continuous if sup_{h∈H} ‖h‖ = ‖sup_{h∈H} h‖ for
any increasingly directed family H ⊂ E⇑_+.
Positive homogeneity is obvious. Additivity follows from (ii) and the continuity of I• along
increasingly directed sets. If h1, h2 are elements in E⇑, then there are increasingly directed
nets {φβ : β ∈ B1} and {ψβ′ : β′ ∈ B2} in E such that φβ ↗ h1 and ψβ′ ↗ h2. As B1 × B2 is
a directed set with respect to the Cartesian order, φβ + ψβ′ ↗ h1 + h2. Therefore
\begin{align*}
I^{\bullet}(h_1 + h_2) &= I^{\bullet}\Bigl(\sup_{\beta,\beta'} (\varphi_\beta + \psi_{\beta'})\Bigr) = \sup_{\beta,\beta'} I^{\bullet}(\varphi_\beta + \psi_{\beta'}) = \sup_{\beta} I(\varphi_\beta) + \sup_{\beta'} I(\psi_{\beta'}) \\
&= I^{\bullet}\bigl(\sup_{\beta} \varphi_\beta\bigr) + I^{\bullet}\bigl(\sup_{\beta'} \psi_{\beta'}\bigr) = I^{\bullet}(h_1) + I^{\bullet}(h_2).
\end{align*}
(iv) The monotonicity and positive homogeneity of I• are obvious. The subadditivity
of I• follows in the same way in which I^∗ is shown to be subadditive. Let {fn : n ∈ N} be a sequence of
nonnegative extended real–valued functions. It is enough to consider Σ_n I•(fn) < ∞. For any
ε > 0 and each n, there exists hn ∈ E⇑ such that hn ≥ fn and I•(hn) < I•(fn) + 2^{−n}ε. By
parts (i) and (iii),
\begin{align*}
I^{\bullet}\Bigl(\sum_{n} f_n\Bigr) &\le I^{\bullet}\Bigl(\sum_{n} h_n\Bigr) = \lim_{n} I^{\bullet}\Bigl(\sum_{k=1}^{n} h_k\Bigr) \\
&= \lim_{n} \sum_{k=1}^{n} I^{\bullet}(h_k) = \sum_{n} I^{\bullet}(h_n) \le \sum_{n} I^{\bullet}(f_n) + \varepsilon.
\end{align*}
Subadditivity follows by taking ε ↘ 0.
(v) Parts (ii), (iii) and (iv) imply that ‖ ‖• is an order continuous mean dominating the
elementary integral. The last statement follows from E↑ ⊂ E⇑.
Example 6.7.7. If X is l.c.H and I is a positive linear functional on C00(X), then the functions
in C00⇑(X) are lower semicontinuous. Moreover, by Theorem B.1.5, every nonnegative lower
semicontinuous function f belongs to E⇑. For any lower semicontinuous function f ≥ 0,
by definition. If I•(f) < ∞, then Theorem 6.7.6 implies f ∈ L1(‖ ‖•) and, for any r > 0,
the open sets {f > r} are also integrable.
Theorem 6.7.8. Let ‖ ‖ be an order continuous mean for a Stone lattice or a ring E ⊂
Bb(Ω). Let Φ ⊂ E with sup_{φ∈Φ} ‖φ‖ < ∞. If Φ is increasingly directed or decreasingly
directed, then sup Φ ∈ L1 or inf Φ ∈ L1, respectively, and Φ → lim Φ in L1(‖ ‖). In
particular, if E is a vector lattice, then E⇑ ∩ F(‖ ‖) ⊂ L1(‖ ‖).
We claim that Φ is a Cauchy net in L1(‖ ‖). If that were not the case, then for some ε > 0
there would exist an increasing sequence (φn) ⊂ Φ such that ‖φn+1 − φn‖ > ε.
However, as {φn+1 − φn : n ∈ N} ⊂ E+ and
\[ \sup_{N} \Bigl\| \sum_{n=1}^{N} (\varphi_{n+1} - \varphi_n) \Bigr\| \le 2 \sup_{\varphi\in\Phi} \|\varphi\| < \infty, \]
Since Φ is a Cauchy net, given ε > 0 there exists φ0 ∈ Φ such that ‖φ − φ′‖ < ε whenever
φ, φ′ ∈ Φ and φ, φ′ ≥ φ0. As ‖ ‖ is order continuous, for all φ ≥ φ0
\[ \| \sup \Phi - \varphi \| = \Bigl\| \sup_{\varphi'\in\Phi,\ \varphi'\ge\varphi} (\varphi' - \varphi) \Bigr\| = \sup_{\varphi'\in\Phi,\ \varphi'\ge\varphi} \|\varphi' - \varphi\| \le \varepsilon. \]
Remark 6.7.9. When E is a ring lattice closed under chopping, the conclusion of Theorem 6.7.8
still holds when Φ is an increasingly directed set in E⇑ with sup_{φ∈Φ} ‖φ‖ < ∞
(Exercise 6.8.11).
6.8. Exercises
Exercise 6.8.1. This exercise studies further properties of the Jordan–Riemann seminorm.
(a) Show that I_#(φ) = I(φ) = I^#(φ) for all φ ∈ E(R).
(b) Show that I_#(f) and I^#(f) coincide with the Riemann–Darboux lower and upper
integrals (4.21), (4.22) introduced in Section 4.5.
(c) Let F^# denote the set of functions on R for which −∞ < I_#(f) ≤ I^#(f) < ∞.
Show that F^# ⊂ Bb(R).
(d) If f ∈ F^#, show that |f| ∈ F^#.
Exercise 6.8.2. Let Ω be any nonempty set, E ⊂ Bb(Ω) a ring lattice closed under
chopping, and I : E → R a positive elementary integral on Ω.
(i) Develop the Riemann integral of (E, I).
(ii) As an example, treat the Riemann integral on R2 and on Rn.
Suppose, in addition, that E is self–confined, that is, for each φ ∈ E, there is ψ ∈ E such
that 1{φ≠0} ≤ ψ.
(iii) Show that the uniform dominated convergence theorem holds.
Exercise 6.8.3. Show that δ–continuity is equivalent to σ–continuity and σ–additivity.
Exercise 6.8.4. Let Ω be a nonempty set and B a nonempty collection of subsets of Ω.
Show that the collection E(B) of simple functions over B forms a vector space iff B is
a ring of sets. In such a case, show that E(B) is a lattice ring. Show that if 1 ∈ E(B), then
E(B) is an algebra lattice of functions and that B is an algebra of sets.
Exercise 6.8.5. In each of the examples above show that E is a ring lattice closed under
chopping and that (E, I) is a positive σ–continuous elementary integral.
Exercise 6.8.6. Suppose E1 ⊂ Bb (Ω1 ) and E2 ⊂ Bb (Ω2 ) are ring lattices closed under
chopping. Define E ⊂ Bb (Ω1 × Ω2 ) as the collection of all functions of the form
\[ \phi(x, y) = \sum_{j=1}^{N} \varphi_j(x)\,\psi_j(y) \]
Exercise 6.8.7. Let E ⊂ Bb(Ω) be a self–confined ring lattice closed under chopping.
Suppose I is a positive σ–continuous elementary integral on E. Let I^∗ be the upper Daniell
integral of I and ‖ ‖∗ the corresponding Daniell mean. Similarly, let I^# be the Jordan upper
integral of I and ‖ ‖# the corresponding Jordan seminorm. Both I^∗ and I^# coincide on E.
Show that
\[ I^{*}(f) \le I^{\#}(f), \qquad \|f\|^{*} \le \|f\|^{\#} \]
for all f ∈ R̄^Ω.
Exercise 6.8.8. If f ∈ L1^+ is bounded above by a > 0, show that there exists a sequence
(φn) ⊂ E with 0 ≤ φn ≤ a such that limn ‖f − φn‖ = 0. (Hint: without loss of generality
assume a = 1 and use Example 5.3.14 together with Lemma 6.3.13.)
Exercise 6.8.9. Suppose ‖ ‖∗ is the Daniell mean of some positive σ–continuous elementary
integral I on a vector lattice E ⊂ Bb(Ω).
(a) For any f ∈ L1(‖ ‖∗) and g ∈ R̄^Ω, show that I(f) + I^∗(g) = I^∗(f + g).
(b) (Generalized Lebesgue dominated convergence) Suppose (fn), (gn) are sequences
in L1(‖ ‖∗) such that |fn| ≤ gn. Assume that gn converges ‖ ‖∗–a.s. to some
g ∈ L1(‖ ‖∗) and that limn I(gn) = I(g). If fn converges ‖ ‖∗–a.s. to some function
f, show that f ∈ L1(‖ ‖∗) and ‖f − fn‖∗ → 0.
Exercise 6.8.10. Show that if E contains a countable dense subset in Ē^u, then E↑ = E⇑.
Exercise 6.8.11. Suppose E ⊂ Bb(Ω) is a ring lattice closed under chopping and that
‖ ‖ is an order continuous mean for E. If H ⊂ E⇑ is an increasingly directed net and
sup_{h∈H} ‖h‖ < ∞, show that sup H ∈ L1 and H → sup H in mean. (Hint: For each h ∈ H
let Φh = {φ ∈ E : φ ≤ h} and define Φ = ∪_{h∈H} Φh. Then Φ → sup H in L1.)
Chapter 7
Daniell Measurability
Theorem 7.1.1. Let f ∈ L1 and ε > 0. There exists a set U ∈ E↑ with ‖U‖ < ε and a
function h ∈ Ē^u (uniform closure of E) such that f = h on U^c.
Proof. As in the proof of Theorem 6.3.12, let {φn} ⊂ E be such that ‖f − φn‖ → 0. Passing
through a subsequence if necessary, we may assume that ‖φn − φn−1‖ ≤ 2^{−n−1} for all n ≥ 1.
Let ψ0 = φ0 and ψn = φn − φn−1 so that f = Σ_{n=0}^∞ ψn in mean and almost surely. Define
f′ = Σ_{n=0}^∞ ψn where the series is defined and zero otherwise. The functions
\[ g_n = \sum_{k=1}^{n} k\,|\psi_k|, \qquad g = \sum_{k=1}^{\infty} k\,|\psi_k| \]
belong to E↑ ∩ L1(‖ ‖∗), ‖gn − g‖ → 0 by monotone convergence, and 1{g>M} ∈ E↑ ∩ L1 for any
M > 0 by Lemmas 6.4.7 and 6.4.9. Let M be large enough so that ‖{g > M}‖ ≤ (1/M)‖g‖ < ε
and set U = {g > M} ∪ {f ≠ f′}. On U^c the series Σ_n ψn converges absolutely and
\[ \Bigl| f' - \sum_{k=0}^{n} \psi_k \Bigr| \le \sum_{k>n} |\psi_k| \le \frac{1}{n} \sum_{k>n} k\,|\psi_k| \le \frac{g}{n} \le \frac{M}{n}. \]
Hence φn = Σ_{k=0}^n ψk converges to f uniformly on U^c. By Weierstrass extension (Corollary 5.3.19),
there is h ∈ Ē^u such that f = h on U^c.
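The two estimates that drive this proof can be written out as follows. This is only a sketch: it uses solidity, homogeneity and countable subadditivity of the mean ‖ ‖, together with the bound ‖ψk‖ ≤ 2^{−k−1} for k ≥ 2 that follows from the choice of the subsequence.

```latex
% (1) Chebyshev-type bound: on \{g > M\} one has 1 < g/M, so
%     \mathbf{1}_{\{g>M\}} \le g/M pointwise, and by solidity and homogeneity
\[
\|\{g > M\}\| \;\le\; \Bigl\|\tfrac{1}{M}\,g\Bigr\| \;=\; \tfrac{1}{M}\,\|g\| .
\]
% (2) Finiteness of \|g\| by countable subadditivity:
\[
\|g\| \;\le\; \sum_{k\ge 1} k\,\|\psi_k\|
      \;\le\; \|\psi_1\| + \sum_{k\ge 2} k\,2^{-k-1} \;<\; \infty .
\]
% (3) Tail bound: for k > n we have 1 \le k/n, hence on \{g \le M\} \supset U^c
\[
\sum_{k>n} |\psi_k| \;\le\; \frac{1}{n}\sum_{k>n} k\,|\psi_k|
  \;\le\; \frac{g}{n} \;\le\; \frac{M}{n} \;\xrightarrow[n\to\infty]{}\; 0 .
\]
```

The weights k in the definition of g are what turn integrability of g into a uniform rate M/n for the tails outside the small set U.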
Theorem 7.1.2. Let {fn} ⊂ L1 and assume that fn converges to f almost surely on a set
A ∈ L1. Then, for any ε > 0 there is an integrable set A0 ⊂ A on which fn converges
uniformly to f, and such that ‖A \ A0‖ < ε.
Proof. For each n, k ≥ 1 the set S(n, k) = A ∩ ∩_{i,j≥n} {|fi − fj| ≤ 1/k} is integrable. For
k fixed, S(n, k) ↗ A almost surely as n ↗ ∞. Hence by Daniell's monotone convergence
theorem, there is a sequence of integers nk < nk+1 such that ‖A \ S(nk, k)‖ < ε2^{−k}. Again,
by Daniell's monotone convergence, A′0 = ∩_k S(nk, k) ∈ L1 and ‖A \ A′0‖ < ε. By hypothesis
the complement of the set U where (fn) converges to f is ‖ ‖–negligible. It follows that
A0 = A′0 ∩ U is an integrable set with ‖A \ A0‖ < ε on which fn converges to f uniformly.
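The two claims at the end of this proof can be checked as in the sketch below, which only uses the countable subadditivity of the mean and the definition of the sets S(n, k); no new objects are introduced.

```latex
% Size of the discarded part:
\[
\|A \setminus A_0'\|
  = \Bigl\| \bigcup_{k\ge 1} \bigl(A \setminus S(n_k,k)\bigr) \Bigr\|
  \le \sum_{k\ge 1} \|A \setminus S(n_k,k)\|
  < \sum_{k\ge 1} \varepsilon\,2^{-k} = \varepsilon .
\]
% Uniform Cauchy property on A_0' \subset S(n_k,k):
\[
\sup_{x \in A_0'} |f_i(x) - f_j(x)| \le \tfrac{1}{k}
  \qquad \text{for all } i, j \ge n_k .
\]
% On A_0 = A_0' \cap U the limit f exists; letting j \to \infty gives
\[
\sup_{x \in A_0} |f_i(x) - f(x)| \le \tfrac{1}{k}
  \qquad \text{for all } i \ge n_k .
\]
```

Since k is arbitrary, the last line is exactly uniform convergence of fn to f on A0.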
Definition 7.1.3 and Littlewood's principles imply that L1(‖ ‖) ⊂ MR(‖ ‖). Measurability
of constant functions follows from the measurability of 1A for any A ∈ L1. As we will
see in Theorem 7.1.6, MR(‖ ‖) contains σ(E).
Lemma 7.1.4. Suppose (fn : n ∈ N) ⊂ MR(‖ ‖). Then, for any A ∈ L1 and ε > 0, there
exist an integrable set B ⊂ A and a sequence (gn : n ∈ N) ⊂ Ē^u such that ‖A \ B‖ < ε and
each fn = gn on B.
Proof. Set A−1 = A. Let L1 ∋ A0 ⊂ A−1 and g0 ∈ Ē^u be such that ‖A−1 \ A0‖ < ε/2
and f0 = g0 on A0. Suppose that sets Ak ⊂ Ak−1 ∈ L1, k = 0, . . . , n and functions
g0, . . . , gn ∈ Ē^u have been chosen so that ‖Ak−1 \ Ak‖ < ε2^{−k−1} and fk = gk on Ak. Choose
L1 ∋ An+1 ⊂ An and gn+1 ∈ Ē^u so that ‖An \ An+1‖ < 2^{−n−2}ε and fn+1 = gn+1 on An+1.
Monotone convergence implies that ∩_n An := B ∈ L1. Moreover, the subadditivity of
the mean shows that
\[ \|A \setminus B\| = \Bigl\| \bigcup_{n\ge 0} (A_{n-1} \setminus A_n) \Bigr\| \le \Bigl\| \sum_{n} (\mathbf{1}_{A_{n-1}} - \mathbf{1}_{A_n}) \Bigr\| \le \sum_{n} \varepsilon\,2^{-n-1} = \varepsilon. \]
‖B0 \ B‖ < ε/2 and ‖fn − f‖B,u = ‖gn − f‖B,u → 0. We conclude that f is the restriction
of a function g ∈ Ē^u on B. Therefore, f is measurable.
Recall from Lemma 5.6.5 that if E is either a Stone lattice or a ring, its sequential closure
E^Σ_R is a ring lattice closed under chopping and the σ–ring generated by the collection of sets
φ^{−1}(I), where φ ∈ E and I is an interval in R \ {0}, coincides with the collection of sets
in E^Σ_R. The following theorem makes the connection between Daniell measurable functions
and the collection of measurable functions generated by E in terms of algebraic and order
permanence properties.
We now show that 1Ω ∈ MR(‖ ‖). Let A ∈ L1; then by Theorem 7.1.1 there is a set U ∈ L1
and g ∈ Ē^u such that ‖U‖ < ε and 1A = g on U^c. If A0 = A \ U, then ‖A \ A0‖ < ε and
g = 1Ω on A0.
Finally, MR(‖ ‖) is sequentially closed by Egorov's theorem. Since E ⊂ MR(‖ ‖), we conclude
that E^Σ_R ⊂ MR(‖ ‖).
Proof. If f ∈ MR(‖ ‖), then fn = 1 ∧ n(f − f ∧ 1) ∈ MR(‖ ‖). Hence, limn fn = 1{f>1} ∈
MR(‖ ‖). For any r > 0 let 0 < dn < r so that dn ↗ r. Then {f > r} = {f/r > 1} ∈ M(‖ ‖)
and {f ≥ r} = ∩_n {f > dn} ∈ M(‖ ‖). For 0 < dn ↘ 0, {f > 0} = ∪_n {f > dn} ∈ M(‖ ‖).
By using −f instead of f we obtain that {f < −r}, {f ≤ −r}, {f < 0} ∈ M(‖ ‖).
Since −f ∈ MR(‖ ‖), we have 1{f>−r} = 1 − 1{−f≥r} ∈ MR(‖ ‖). Similarly, 1{f≥−r} =
1 − 1{−f>r} ∈ MR(‖ ‖), 1{f≤0} = 1 − 1{f>0} ∈ MR(‖ ‖) and 1{f≥0} = 1 − 1{f<0} ∈ MR(‖ ‖).
Conversely, suppose that f is real valued and {f > d} ∈ M(‖ ‖) for all d ∈ D. For any
r ∈ R let D ∋ dn ↘ r. Then {f > r} = ∪_n {f > dn} ∈ M(‖ ‖). Similar arguments show
that {f ≥ r}, {f < r}, and {f ≤ r} are in M(‖ ‖) for all r ∈ R. Hence
\[ f_n = 2^{-n}\,[2^{n} f]\,\mathbf{1}_{\{|f|\le n\}} = \sum_{k=-n2^{n}}^{n2^{n}} \frac{k}{2^{n}}\,\mathbf{1}_{\{k \le 2^{n} f < k+1\}} \in M_{\mathbb{R}}(\|\ \|). \]
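To see why the functions fn approximate f, the following sketch records the error bound and a small numerical illustration. It assumes that [t] denotes the integer part (floor) of t, which is how the display above is read here; the value f(x) = 0.7 is a hypothetical example chosen only for illustration.

```latex
% Assumption: [t] is the largest integer \le t.
% For |f(x)| \le n:  2^n f(x) - 1 < [2^n f(x)] \le 2^n f(x), hence
\[
0 \;\le\; f(x) - f_n(x) \;<\; 2^{-n}
  \qquad \text{whenever } |f(x)| \le n .
\]
% Hypothetical value f(x) = 0.7:
\[
\begin{array}{c|c|c|c}
 n & 2^{n} f(x) & [2^{n} f(x)] & f_n(x) \\ \hline
 1 & 1.4  & 1  & 0.5    \\
 2 & 2.8  & 2  & 0.5    \\
 3 & 5.6  & 5  & 0.625  \\
 4 & 11.2 & 11 & 0.6875
\end{array}
\]
```

For a real–valued f every point x eventually satisfies |f(x)| ≤ n, so fn → f pointwise, and the convergence is uniform on any set where f is bounded.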
The following result is a consequence of Egorov's theorem and states that measurable
functions are uniform limits of integrable functions on large integrable sets.
Theorem 7.1.9. Suppose D is a dense set in L1. A real–valued function f is measurable
iff for every set A ∈ L1 and ε > 0, there is L1 ∋ A0 ⊂ A with ‖A \ A0‖ < ε on which f is
the uniform limit of a sequence in D.
Proof. Suppose f ∈ MR(‖ ‖). For A ∈ L1 and ε > 0 there are L1 ∋ A′0 ⊂ A and g ∈ Ē^u such
that ‖A \ A′0‖ ≤ ε/2 and f = g on A′0. Since g1A′0 ∈ L1, there is a sequence {dn} ⊂ D that
converges in mean and almost surely to g1A′0. By Egorov's theorem there is L1 ∋ A0 ⊂ A′0
with ‖A′0 \ A0‖ < ε/2 on which dn converges uniformly to f.
Conversely, let A ∈ L1, ε > 0 and suppose there is an integrable set A′0 ⊂ A with ‖A \ A′0‖ <
ε/2 on which f is the uniform limit of a sequence (dn : n ∈ N) ⊂ D. For some integer N,
n ≥ N implies ‖dn − f‖u,A′0 < ε/2. As |dn 1A′0| ≤ ε1A′0 + |dN| for all n ≥ N, f1A′0 ∈ L1 by
dominated convergence. Consequently, there is an integrable set A0 ⊂ A′0 and g ∈ Ē^u such
that ‖A′0 \ A0‖ < ε/2 and f1A0 = g1A0.
7.2. Localization
The following result shows that a function that is measurable on each set of a countable
collection G of integrable sets is also measurable on any integrable piece of ∪G.
Theorem 7.2.1. (Localization) Suppose (An : n ∈ N) is a sequence of integrable sets. If f
is measurable on each An, then
(i) f is measurable on each integrable subset of A1.
(ii) f is measurable on A1 ∪ A2.
(iii) f is measurable on any integrable subset of A = ∪_n An.
Proof. Only sufficiency requires a proof. Assume A ∈ L1 and let (φn) ⊂ E+ be a sequence
converging to 1A in mean and ‖ ‖–a.s. Then A ⊂ ∪_n {φn > 1/2} ‖ ‖–a.s. The conclusion
follows from Theorem 7.2.1(iii).
7.3. Integrability criteria
Proof. Suppose g is bounded and g ∈ MR(‖ ‖). Then, for any B ∈ L1 and ε > 0, there are
integrable sets Bk ⊂ B and functions φk ∈ Ē^u such that ‖B \ Bk‖ < 2^{−k} and g1Bk = φk1Bk.
Clearly the sequence gk = φk1Bk ∈ L1 converges to g pointwise on B′ = ∪_k ∩_{m≥k} Bm.
Since ‖B \ B′‖ ≤ ‖∪_{m≥k} (B \ Bm)‖ ≤ 2^{−k+1} for all k, we conclude that gk converges almost
surely to g1B. As |gk| ≤ ‖g‖u 1B, g1B ∈ L1 by dominated convergence.
Conversely, suppose g1B ∈ L1 for all B ∈ L1. Fix A ∈ L1. Then, for any ε > 0
there is an integrable set A0 ⊂ A and a function φ ∈ Ē^u such that ‖A \ A0‖ < ε and
g1A1A0 = g1A0 = φ1A0. This shows that g is measurable on every integrable set.
Corollary 7.3.2. If g ∈ MR(‖ ‖) is bounded, then gf ∈ L1 for all f ∈ L1.
Proof. If f ∈ L1, then hn = g1{|f|>1/n} f ∈ L1 for each n. As |hn| ≤ ‖g‖u |f| and
hn → gf, we obtain that gf ∈ L1 by dominated convergence.
Proof. (i) If f ∈ L1 then f ∈ F, ‖{|f| = ∞}‖ = 0, and there exists a sequence (φn) ⊂ E
converging to f in mean and pointwise almost surely. The set C of all points where (φn)
converges is given by
\[ C = \bigcap_{k} \bigcup_{N} \bigcap_{n,m\ge N} \Bigl\{ |\varphi_n - \varphi_m| \le \tfrac{1}{k} \Bigr\}. \]
By Lemma 5.6.5[(ii)], each set {|φn − φm| > 1/k} belongs to E^Σ_R; hence, by Lemma 5.6.5[(ii),(iii)],
Ω \ C ∈ E^Σ_R and 1Cφn = (φn − 1C^cφn) ∈ E^Σ_R. Consequently h = lim supn 1Cφn ∈ E^Σ_R ⊂
MR(‖ ‖) and f = h almost surely.
(ii) (Necessity) If f ∈ L1(‖ ‖) then f is measurable. For each n ∈ N, An = {|f| > 1/n} ∈ L1,
and since {f ≠ 0} = ∪_n An, f is σ–finite.
(Sufficiency) Suppose f ∈ F ∩ MR(‖ ‖). We claim that f1A ∈ L1 for any A ∈ L1. Indeed, for
each k ∈ N there is an integrable set Ak ⊂ A and a function gk ∈ Ē^u such that ‖A \ Ak‖ < 2^{−k}
and f = gk on Ak. By 6.3.15(ii) each fk := f1Ak is integrable. Clearly (fk) converges to f
pointwise on A′ = ∪_k ∩_{m≥k} Am. Since
\[ \|A \setminus A'\| \le \sum_{m\ge k} \|A \setminus A_m\| \le \sum_{m\ge k} 2^{-m} = 2^{-k+1} \to 0, \]
If {f ≠ 0} is σ–finite, then there is an increasing sequence {An} of integrable sets such that
1An ↗ 1A ≥ 1{f≠0}. As (f1An) ⊂ L1 is dominated by |f| ∈ F and f1An → f pointwise, we
have that f ∈ L1 by dominated convergence.
To prove the last assertion, suppose first that f ∈ E^Σ_R ∩ F(‖ ‖). By Lemma 5.6.4 there
exists a sequence {φn} ⊂ E such that {f ≠ 0} ⊂ ∪_n {φn ≠ 0}. As each {φn ≠ 0} is
σ–finite, so is {f ≠ 0}. Therefore f ∈ L1 by the first statement. If f ∈ E^Σ_R ∩ F, then
fm = (−m) ∨ (f ∧ m) ∈ E^Σ_R ∩ F and so fm ∈ L1. As fm → f everywhere, f ∈ L1 by
dominated convergence.
7.4. Absolute continuity
Proof. Let f ∈ MR(‖ ‖). By Corollary 7.2.2 it is enough to show that f is ‖ ‖♭–measurable
on any set A of the form {φ > r} where φ ∈ E and r > 0.
We first prove that {G : G ⊂ A, G ∈ L1(‖ ‖)} ⊂ L1(‖ ‖♭). As E ⊂ L1(‖ ‖♭) ∩ L1(‖ ‖),
A ∈ L1(‖ ‖♭) ∩ L1(‖ ‖). Let (φn : n ∈ N) ⊂ E be a sequence with 0 ≤ φn ≤ 1 that converges
We claim that for any ε > 0 there exists δ > 0 such that G ⊂ A, G ∈ L1(‖ ‖) and ‖G‖ < δ
imply ‖G‖♭ < ε. Otherwise, there are ε0 > 0 and a sequence of ‖ ‖–integrable sets Gn ⊂ A
such that ‖Gn‖ < 2^{−n} but ‖Gn‖♭ ≥ ε0. Setting G = ∩_n ∪_{m≥n} Gm we obtain that ‖G‖ = 0.
Monotone convergence, however, implies that ∞ > ‖A‖♭ ≥ ‖G‖♭ = limn ‖∪_{m≥n} Gm‖♭ ≥
lim supn ‖Gn‖♭ ≥ ε0. This is a contradiction to ‖ ‖♭ ≪ ‖ ‖.
(i) ‖ ‖♭ ≪ ‖ ‖.
(ii) f ∈ L1(‖ ‖♭) iff fg ∈ L1(‖ ‖).
(iii) f ∈ MR(‖ ‖♭) iff fg ∈ MR(‖ ‖).
To prove the next two statements, consider the function ξg := (1/g) 1{g≠0}. Since g ∈ L1^loc(‖ ‖) ⊂
MR(‖ ‖), ξg ∈ MR(‖ ‖).
Conversely, suppose that fg ∈ L1(‖ ‖). For any φ ∈ E, φξg ∈ MR(‖ ‖) and ‖φξg‖♭ ≤ ‖φ‖ <
∞. Since ‖ ‖♭ ≪ ‖ ‖, φξg ∈ MR(‖ ‖♭). From {φξg ≠ 0} ⊂ {φ ≠ 0}, it follows that φξg is
σ–finite with respect to ‖ ‖♭. By Theorem 7.3.3(ii), φg^{−1}1{g≠0} ∈ L1(‖ ‖♭). Therefore, as
(iii) If fg ∈ MR(‖ ‖), then f1{g≠0} = fgξg ∈ MR(‖ ‖). Since ‖ ‖♭ ≪ ‖ ‖, f1{g≠0} ∈
MR(‖ ‖♭). As ‖f1{g=0}‖♭ = 0, f1{g=0} ∈ L1(‖ ‖♭) ⊂ MR(‖ ‖♭). Therefore f ∈ MR(‖ ‖♭).
Proof. We show first that any Daniell–measurable set satisfies the cut condition (7.3).
Denote M := M(‖ ‖). Suppose A ∈ M and let E ⊂ Ω. If ‖E‖∗ = ∞, then (7.3)
holds by subadditivity. If ‖E‖∗ < ∞, then by Theorem 6.4.10, there is B ∈ L1 such that
B ⊃ E and ‖E‖∗ = ‖B‖∗. By Lemma 7.3.1 both B ∩ A and B \ A are integrable. The
subadditivity of the mean and Theorem 6.5.1 imply that
\begin{align*}
\|\mathbf{1}_E\|^{*} &\le \|\mathbf{1}_{E\cap A}\|^{*} + \|\mathbf{1}_{E\setminus A}\|^{*} \le \|\mathbf{1}_{B\cap A}\|^{*} + \|\mathbf{1}_{B\setminus A}\|^{*} \\
&= I(\mathbf{1}_{B\cap A}) + I(\mathbf{1}_{B\setminus A}) = I(\mathbf{1}_B) = \|\mathbf{1}_E\|^{*}.
\end{align*}
Therefore A satisfies (7.3).
We claim that the collection M∗ of sets satisfying (7.3) is an algebra. Clearly M∗ is closed
under complementation. If A, B belong to M∗ and E ⊂ Ω, then
\begin{align*}
\|E\|^{*} &\le \|E \cap (A \cup B)\|^{*} + \|E \cap (A^{c} \cap B^{c})\|^{*} \\
&= \|E \cap A\|^{*} + \|(E \cap B) \cap A^{c}\|^{*} + \|(E \cap A^{c}) \cap B^{c}\|^{*} \\
&= \|E \cap A\|^{*} + \|(E \cap A^{c}) \cap B\|^{*} + \|(E \cap A^{c}) \cap B^{c}\|^{*} \\
&= \|E \cap A\|^{*} + \|E \cap A^{c}\|^{*} = \|E\|^{*}.
\end{align*}
This shows that M∗ is an algebra.
To conclude the proof, we now show that M∗ ⊂ M. Suppose A ∈ M∗ and let E ∈ L1. The
first part of the proof shows that M∗ contains the integrable sets; hence, E ∩ A ∈ M∗. By
Theorem 6.4.10, there exists an integrable set B such that A ∩ E ⊂ B and ‖E ∩ A‖∗ = ‖B‖∗.
From
\begin{align*}
\|B\|^{*} &= \|B \cap (E \cap A)\|^{*} + \|B \setminus (E \cap A)\|^{*} \\
&= \|B\|^{*} + \|B \setminus (E \cap A)\|^{*},
\end{align*}
we obtain that ‖B \ (E ∩ A)‖∗ = 0. Hence E ∩ A ∈ L1 for any E ∈ L1 and, by Lemma 7.3.1,
A ∈ M. Incidentally, this argument also shows that M∗ = M is a σ–algebra.
Lemma 7.5.2. Let µ∗ be as in (7.1), and let ‖ ‖∗ be Daniell's mean. Then µ∗ = ‖ ‖∗.
Proof. Let R be the δ–ring generated by the collection of sets of the form φ^{−1}(I) where
φ ∈ E and I is an interval of the form (a, ∞) with a > 0. By definition, µ∗ = ‖ ‖∗
on R. For any A ⊂ Ω, if µ∗(A) < ∞ then there exists {An : n ∈ N} ⊂ R such that
A ⊂ ∪_n An and Σ_n I(An) < µ∗(A) + ε. It follows from Daniell's dominated convergence that
B = ∪_n An ∈ R↑ ∩ L1 which, together with Theorem 6.4.10, implies that µ∗(A) = inf{I(B) :
A ⊂ B ∈ R↑} = ‖A‖∗.
Proof. All statements are consequences of the Daniell integral extension theorem, Theorem 7.5.1,
and Lemma 7.5.2.
The last statement follows from Corollary 3.3.6, Theorem 3.5.6, Lemma 5.6.5, and Theorem 5.6.6.
7.6. Maximality
Suppose ‖ ‖ and ‖ ‖♮ are two means for a Stone lattice or a ring E ⊂ Bb(Ω). If ‖ ‖ ≤ ‖ ‖♮
on R̄^Ω, then clearly L1(‖ ‖♮) ⊂ L1(‖ ‖). In this section we will show that given a mean ‖ ‖
for E there exists a maximal mean ‖ ‖♮ that coincides with ‖ ‖ on E such that ‖ ‖ ≤ ‖ ‖♮
on R̄^Ω. In particular, we will show that the Daniell mean associated to any positive
σ–continuous elementary integral (E, I) is the maximal mean with ‖φ‖∗ = I(|φ|) for all
φ ∈ E for which Cauchy sequences converge and dominated convergence holds.
Lemma 7.6.1. Suppose that E ⊂ Bb(Ω) is either a Stone lattice or a ring, and let ‖ ‖ be
a mean for E. If ‖ ‖♮ is another mean for E such that ‖φ‖ ≤ ‖φ‖♮ for all φ ∈ E, then
‖h‖ ≤ ‖h‖♮ for all h ∈ E^Σ_R.
An immediate consequence of Lemma 7.6.1 is that two means that coincide on E will
also do so on E^Σ_R. The next result shows that among all means that agree with a particular
mean on E+, there exists one, which we call maximal, that dominates the rest of them.
Theorem 7.6.2. For any mean ‖ ‖ for E, there exists a unique maximal mean ‖ ‖♮ that
agrees with ‖ ‖ on E. If ‖ ‖ is order continuous, then there exists a unique maximal order
continuous mean ‖ ‖∨ that agrees with ‖ ‖ on E and ‖ ‖∨ ≤ ‖ ‖♮.
Proof. Let M(‖ ‖) be the collection of all means on E that agree with ‖ ‖ on E+. Define
(7.5) ‖f‖♮ = sup{‖f‖♭ : ‖ ‖♭ ∈ M(‖ ‖)}.
Clearly ‖ ‖♮ coincides with ‖ ‖ on E+, whence continuity on E+ follows. Absolute homogeneity
and solidity are easy to verify. It remains to show that ‖ ‖♮ is countably subadditive.
Let {fn} be a sequence of nonnegative functions. Then, for any ‖ ‖♭ ∈ M(‖ ‖) it follows
that
\[ \Bigl\| \sum_{n} f_n \Bigr\|^{\flat} \le \sum_{n} \|f_n\|^{\flat} \le \sum_{n} \|f_n\|^{\natural}. \]
Taking suprema over all ‖ ‖♭ ∈ M(‖ ‖) leads to ‖Σ_n fn‖♮ ≤ Σ_n ‖fn‖♮.
For the second statement consider the collection M•(‖ ‖) of all order continuous means that
agree with ‖ ‖ on E. The arguments used above show that ‖f‖∨ = sup{‖f‖♭ : ‖ ‖♭ ∈
M•(‖ ‖)} is a mean for E which dominates ‖ ‖ and agrees with ‖ ‖ on E. To show that ‖ ‖∨
is in fact order continuous, suppose H ⊂ E⇑_+ is increasingly directed. For any ‖ ‖♭ ∈ M•(‖ ‖)
we have
\[ \| \sup H \|^{\flat} = \sup_{h\in H} \|h\|^{\flat} \le \sup_{h\in H} \|h\|^{\vee} \]
Proof. Denote the right hand side of (7.6) by ‖f‖⋄. Clearly ‖ ‖⋄ agrees with ‖ ‖ on E and
thus, ‖ ‖⋄ is σ–continuous on E+. It is easy to check that ‖ ‖⋄ is absolutely homogeneous and
solid. To show that ‖ ‖⋄ is countably subadditive, suppose Σ_n ‖fn‖⋄ < r < ∞. There exist
functions hn ∈ E^Σ_R such that |fn| ≤ hn and Σ_n ‖hn‖ < r. Since |Σ_n fn| ≤ Σ_n |fn| ≤ Σ_n hn
and Σ_n hn ∈ E^Σ_R, the subadditivity of ‖ ‖ implies ‖Σ_n fn‖⋄ ≤ ‖Σ_n hn‖ ≤ Σ_n ‖hn‖ < r.
Therefore ‖ ‖⋄ is a mean for E.
The following result generalizes Theorem 6.4.10 to the setting of a maximal mean.
Lemma 7.6.5. Suppose ‖ ‖♮ is a maximal mean for E. Then, for any f ∈ F(‖ ‖♮)+, there
exists h ∈ L1(‖ ‖♮) ∩ E^Σ_R with f ≤ h such that ‖f‖♮ = ‖h‖♮. If f is a set, then h can be
chosen to be a set too.
Remark 7.6.6. The function h in Lemma 7.6.5 is called an upper envelope of f.
Proof. By Theorem 7.6.4 there exist functions hn ∈ E^Σ_R such that f = |f| ≤ hn and
‖hn‖♮ ≤ ‖f‖♮ + 1/n. Notice that hn ∈ L1(‖ ‖♮) by Theorem 7.3.3(ii). An application of
monotone convergence shows that h = ∧_n hn is integrable and satisfies the conditions of
the Lemma. If f = 1A, then 1{h≥1} is a smaller upper envelope of f.
Theorem 7.6.7. Suppose E ⊂ Bb(Ω).
(i) If E is a Stone lattice, I is a positive σ–continuous elementary integral on E, and
‖ ‖∗ is the corresponding Daniell mean, then ‖ ‖∗ is the maximal mean that agrees
with I on E+.
Proof. (i) This is already proved in Remark 7.6.3. For a different proof, notice that E↑ ⊂
E^Σ_R. Then, by Theorem 7.6.4,
(ii) If supn ‖fn‖♮ = ∞ there is nothing to prove. Assume supn ‖fn‖♮ < ∞. For each n ∈ N
let hn ∈ E^Σ_R ∩ L1(‖ ‖♮) be an upper envelope of fn = |fn|. The sequence f̄n = inf_{k≥n} hk ∈ L1 is
nondecreasing and fn ≤ f̄n ≤ hn; thus, ‖fn‖♮ = ‖f̄n‖♮ = ‖hn‖♮. By monotone convergence
f̄n converges in mean to supn f̄n. Therefore
\[ \sup_{n} \|f_n\|^{\natural} = \sup_{n} \|\bar f_n\|^{\natural} = \lim_{n} \|\bar f_n\|^{\natural} = \bigl\| \sup_{n} \bar f_n \bigr\|^{\natural} \ge \bigl\| \sup_{n} f_n \bigr\|^{\natural}. \]
The converse inequality supn ‖fn‖♮ ≤ ‖supn fn‖♮ follows from solidity since fn ≥ 0.
(ii) For any φ ∈ E with K ≺ φ we have that ‖K‖• ≤ ‖φ‖• = I(φ) < ∞. Such functions
φ exist by Urysohn's lemma. It is clear that ‖K‖• ≤ inf{I(φ) : K ≺ φ}. The opposite
inequality will follow immediately once we prove (iv).
(iv) Only the case ‖A‖• < ∞ requires a proof. Since all h ∈ E⇑ are lower semicontinuous,
we have that {h > r} ∈ G for all r. For any 0 < δ < 1 there is a function h ∈ E⇑ ∩ F(‖ ‖•)
such that 1A ≤ h and ‖h‖• ≤ (1 + δ)‖A‖•. Then 1A ≤ 1{h≥1} ≤ 1{h>1−δ} ≤ h/(1 − δ). By
Theorem 6.7.6 h ∈ L1(‖ ‖•), and thus, 1{h≥1} ∈ L1(‖ ‖•) and 1{h>1−δ} ∈ L1(‖ ‖•) ∩ E⇑.
From
\[ \|A\|^{\bullet} \le \|\{h \ge 1\}\|^{\bullet} \le \|\{h > 1 - \delta\}\|^{\bullet} \le \frac{\|h\|^{\bullet}}{1-\delta} \le \frac{1+\delta}{1-\delta}\,\|A\|^{\bullet}, \]
we conclude that
\begin{align*}
\|A\|^{\bullet} &= \inf\{ \|\mathbf{1}_{\{h>r\}}\|^{\bullet} : \mathbf{1}_A \le h \in E^{\Uparrow},\ 0 < r < 1 \} \\
&= \inf\{ \|G\|^{\bullet} : A \subset G \in \mathcal{G} \}.
\end{align*}
We now conclude the proof of (7.8). If K ∈ K then ‖K‖• < ∞ and so there is G ∈ G
such that K ⊂ G and ‖G‖• < ‖K‖• + ε. By Urysohn's lemma, there is ψ ∈ E such that
K ≺ ψ ≺ G. It follows immediately that inf{I(φ) : K ≺ φ} ≤ ‖K‖• + ε.
(v) Let F be the collection of all subsets of X that have finite ‖ ‖• and which satisfy (7.11).
It follows that F ⊂ L1(‖ ‖•), for if A ∈ F, then there is a sequence {Kn : n ∈ N} ⊂ K, and
by (iv), a sequence {Gn : n ∈ N} ⊂ G ∩ L1 such that Kn ⊂ A ⊂ Gn with ‖A‖• − 1/n < ‖Kn‖•
and ‖Gn‖• < ‖A‖• + 1/n. Hence,
\[ \|\mathbf{1}_A - \mathbf{1}_{K_n}\|^{\bullet} \le \|\mathbf{1}_{G_n} - \mathbf{1}_{K_n}\|^{\bullet} = \|G_n\|^{\bullet} - \|K_n\|^{\bullet} < \frac{2}{n} \to 0, \]
whence we conclude that A ∈ L1. Clearly K ⊂ F, and by (7.10), G ∩ L1 ⊂ F.
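We claim that F is a ring which is closed under countable unions of finite ‖ ‖•–mean, that is,
if {An : n ∈ N} ⊂ F and ‖∪_n An‖• < ∞, then ∪_n An ∈ F. First, suppose that Aj ∈ F
for j = 1, 2. Given ε > 0, choose Kj ∈ K and Gj ∈ G with Kj ⊂ Aj ⊂ Gj such that
\[ \|\mathbf{1}_{G_j} - \mathbf{1}_{K_j}\|^{\bullet} < \frac{\varepsilon}{2}. \]
Then K1 \ G2 ∈ K, and since A1 \ A2 ∈ L1,
\[ \|\mathbf{1}_{A_1\setminus A_2} - \mathbf{1}_{K_1\setminus G_2}\|^{\bullet} \le \|\mathbf{1}_{A_1} - \mathbf{1}_{K_1}\|^{\bullet} + \|\mathbf{1}_{G_2} - \mathbf{1}_{A_2}\|^{\bullet} < \varepsilon. \]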
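This shows that A1 \ A2 ∈ F. Now suppose ‖∪_n An‖• < ∞ where {An : n ∈ N} ⊂ F.
Given ε > 0 choose Kn ∈ K such that ‖1An − 1Kn‖• < 2^{−n−1}ε. It follows that
\[ \bigl\| \mathbf{1}_{\bigcup_n A_n} - \mathbf{1}_{\bigcup_n K_n} \bigr\|^{\bullet} \le \sum_{n} \|\mathbf{1}_{A_n} - \mathbf{1}_{K_n}\|^{\bullet} < \frac{\varepsilon}{2}. \]
Choose N large enough so that ‖1_{∪_n Kn} − 1_{∪_{j=1}^N Kj}‖• < ε/2. Then
\[ \bigl\| \mathbf{1}_{\bigcup_n A_n} - \mathbf{1}_{\bigcup_{j=1}^{N} K_j} \bigr\|^{\bullet} < \varepsilon, \]
whence we conclude that ∪_n An ∈ F.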
Theorem 7.7.1 states that the Daniell–Stone mean ‖ ‖• for a positive elementary integral
(C00(X), I) is regular if X is l.c.H. The following result gives a unique integral representation
of the elementary integral in terms of an associated Radon measure.
For the remainder of this section, we will assume that X is a l.c.H. space.
Theorem 7.7.3. (Riesz–Markov representation theorem) If I is a positive Radon measure
on C00(X) and ‖ ‖• is the corresponding Daniell–Stone mean, then the restriction µ of ‖ ‖•
to M(‖ ‖•) is the unique complete Radon measure defined on M(‖ ‖•) ⊃ B(X) such that
\[ I(f) = \int_X f \, d\mu, \qquad f \in C_{00}(X). \]
In addition, if I is bounded, then µ is finite and ‖I‖ = µ(X) = ‖1‖•.
Proof. Lemma 6.7.1 shows that (C00(X), I) is an order–continuous elementary integral.
Theorem 7.7.1 shows that the restriction µ to M(‖ ‖•) is regular and that B(X) ⊂ M(‖ ‖•).
The conclusion to the first statement follows from Theorem 7.5.3.
If I is bounded then |I(f)| ≤ ‖I‖‖f‖u for f ∈ C00(X) and, by regularity, µ(X) ≤ ‖I‖.
Conversely, |I(f)| = |∫_X f dµ| ≤ ‖f‖u µ(X), and so ‖I‖ ≤ µ(X).
Example 7.7.4. Lebesgue measure on (Rd, B(Rd)) is a σ–finite regular Radon measure.
More generally, any Lebesgue–Stieltjes measure µ corresponding to a right–continuous function
with nonnegative increments is a σ–finite Radon measure on Borel sets.
7.7. Integration on locally compact Hausdorff spaces
Theorem 7.7.5. (Lusin's theorem) Let (X, M, µ) be a Radon measure space, and let f be
a complex measurable function on X. If A ∈ M, µ(A) < ∞ and {f ≠ 0} ⊂ A, then for
every ε > 0, there is g ∈ C00(X) such that
Proof. For real valued functions, the first part is just a restatement of the definition of Daniell
measurability of functions for the Daniell mean induced by µ. For complex functions the
result follows by applying the real–valued result to the real and imaginary parts of any complex
function.
For g′ ∈ C00(X) with µ({f ≠ g′}) < ε, set g := ϕ(g′). Then ‖g‖u ≤ ‖f‖u and, since
{f ≠ g} ⊂ {f ≠ g′}, (7.12) holds.
Corollary 7.7.6. Let f and A be as in Lusin’s theorem. There exists a sequence {gn } ⊂
C00 (X) such that kgn ku ≤ kf ku and gn → f µ–a.s.
Proof. For each n ∈ N, let gn ∈ C00(X) be such that ‖gn‖u ≤ ‖f‖u and µ(En) < 2^{−n},
where En = {f ≠ gn}. By Borel–Cantelli µ(En i.o.) = 0; so, for µ–almost all x, there is Nx
such that f(x) = gn(x) for all n ≥ Nx.
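The Borel–Cantelli step can be made explicit as in the sketch below, which only uses the bound µ(En) < 2^{−n} from the proof and the countable subadditivity of µ.

```latex
% \{E_n \text{ i.o.}\} = \bigcap_{N} \bigcup_{n \ge N} E_n, so for every N
\[
\mu\bigl(E_n \text{ i.o.}\bigr)
  \;\le\; \mu\Bigl(\bigcup_{n\ge N} E_n\Bigr)
  \;\le\; \sum_{n\ge N} \mu(E_n)
  \;<\; \sum_{n\ge N} 2^{-n} \;=\; 2^{-N+1}
  \;\xrightarrow[N\to\infty]{}\; 0 .
\]
% Outside this null set, x belongs to only finitely many E_n,
% i.e. g_n(x) = f(x) for all n larger than some N_x.
```

Any summable sequence of bounds would serve equally well here; the choice 2^{−n} is simply convenient.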
Proof. First we consider the case f ≥ 0. Let 0 ≤ sn ↗ f be as in Lemma 4.2.2 and set
tn = sn − sn−1. Then 2^n tn = 1Tn for some Tn ∈ Mµ, and so f = Σ_{n≥1} 2^{−n}1Tn. For each n,
there exist Kn ∈ K and Gn ∈ G with Kn ⊂ Tn ⊂ Gn such that µ(Gn \ Kn) < ε/2. Choose
N large enough so that Σ_{n>N} 2^{−n}µ(Tn) = ∫ f dµ − Σ_{n=1}^N 2^{−n}µ(Tn) < ε/2, and define
u = Σ_{n=1}^N 2^{−n}1Kn, v = Σ_{n=1}^∞ 2^{−n}1Gn. Then u and v are u.s.c. and l.s.c. respectively,
u ≤ f ≤ v, and
\begin{align*}
\int (v - u)\, d\mu &= \int \sum_{n=1}^{N} 2^{-n} (\mathbf{1}_{G_n} - \mathbf{1}_{K_n})\, d\mu + \int \sum_{n>N} 2^{-n}\,\mathbf{1}_{G_n}\, d\mu \\
&\le \int \sum_{n=1}^{\infty} 2^{-n} (\mathbf{1}_{G_n} - \mathbf{1}_{K_n})\, d\mu + \int \sum_{n>N} 2^{-n}\,\mathbf{1}_{T_n}\, d\mu < \varepsilon.
\end{align*}
7.8. Exercises
Exercise 7.8.1. Suppose A is a measurable set. Show that a function f is measurable on
every integrable subset of A iff f 1A is measurable.
Exercise 7.8.2. Suppose E is a ring lattice closed under chopping and that ‖ ‖ is an
order–continuous mean for E. Show that the functions in E⇑ are ‖ ‖–measurable.
Exercise 7.8.3. If g ∈ L1^loc(‖ ‖), show that the function ‖ ‖♭ : f ↦ ‖fg‖ on R̄^Ω is a mean
for E. Show that any ‖ ‖–negligible set is a ‖ ‖♭–negligible set.
Exercise 7.8.4. Suppose ‖ ‖♮ is a maximal mean on a Stone lattice or a ring E ⊂ Bb(Ω).
Show that f ∈ L1(‖ ‖♮) iff f ∈ MR ∩ F(‖ ‖♮).
Exercise 7.8.5. If A and B are atoms of ‖ ‖, show that either ‖A ∩ B‖ = 0 or ‖A‖ = ‖B‖.
Exercise 7.8.6. Suppose µ is a nonatomic measure on (Ω, F) with µ(Ω) = ∞. Show that
for any 0 ≤ u < ∞, there is an A ∈ F such that µ(A) = u. (Hint: Since µ is nonatomic, the
collection B := {B ∈ F : 0 < µ(B) < ∞} is not empty. Show that a := sup_{B∈B} µ(B) = ∞.)
Exercise 7.8.7. Suppose X is l.c.H and let I be a positive linear functional on C00 (X).
Suppose that {φn : n ∈ N} ⊂ C00 (X) converges uniformly to some function φ and that there
is a compact set K ⊂ X that contains the support of all functions in the sequence. Show
that φ ∈ C00(X) and that limn ‖φn − φ‖• = 0.
Exercise 7.8.8. (Localization of an elementary integral.) Suppose X is a l.c.H space.
Assume there is a collection of pairs {(Wα, Iα) : α ∈ A} such that {Wα : α ∈ A} is an open cover
of X, Iα is a positive linear functional on C00(Wα), and Iα and Iβ coincide on C00(Wα ∩ Wβ).
Show that there is a unique positive linear functional I on C00(X) such that its restriction to
C00(Wα) is Iα. (Hint: let f ∈ C00(X) and K a compact set containing supp(f). Use a partition of
unity of K subordinated to a finite cover {Wαj : j = 1, . . . , n} of K (see Lemma 2.11.8),
and define I(f) = Σ_{j=1}^n Iαj(φj f) where K ⊂ ∪_{j=1}^n Wαj and supp(φj) ⊂ Wαj.)
Exercise 7.8.9. Suppose µ is a Borel measure on a topological space (X, τ ). The support
of µ is defined as
supp(µ) = {x ∈ X : ∀U ∈ τ, x ∈ U implies µ(U ) > 0}.
(a) Show that supp(µ) is a closed set.
(b) Show that if (X, τ ) has a countable base, then µ(X \ supp(µ)) = 0.
(c) Show that if X is l.c.H and µ is a Radon measure on X, then µ(X \ supp(µ)) = 0.
(Hint: If G = X \ supp(µ) then G = ∪{V : V open, µ(V) = 0}.)
Exercise 7.8.10. Suppose µ is a Borel measure on Rn . If f ∈ Cb (Rn ) and f ≡ c µ–a.s.,
show that f (x) = c for all x ∈ supp µ.
Chapter 8
Lp spaces
In this section we develop the theory of p–th integrable functions. Lp spaces are fundamental
objects in applications of integration theory.
Geometrically, if ϕ is convex and a < x < u < y < b then the point (u, ϕ(u)) on the
graph of ϕ lies below the straight line joining (x, ϕ(x)) and (y, ϕ(y)). It is easy to check
that (8.1) is equivalent to any of the inequalities
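$$\frac{\varphi(u)-\varphi(x)}{u-x} \le \frac{\varphi(y)-\varphi(x)}{y-x} \le \frac{\varphi(y)-\varphi(u)}{y-u}. \tag{8.2}$$
For fixed $a < x < b$, inequalities (8.2) show that the map $u \mapsto \frac{\varphi(u)-\varphi(x)}{u-x}$ decreases as $u \searrow x$
and increases as $u \nearrow x$. Consequently, the maps
$$\alpha(x) := \sup_{a<u<x} \frac{\varphi(u)-\varphi(x)}{u-x}, \qquad \beta(x) := \inf_{x<v<b} \frac{\varphi(v)-\varphi(x)}{v-x} \tag{8.3}$$
satisfy
$$\alpha(x) \le \beta(x) \le \alpha(y), \qquad a < x < y < b. \tag{8.4}$$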
Lemma 8.1.2. The functions α and β are monotone increasing and left continuous and
right continuous respectively. Furthermore, α(x+) = β(x) and α(x) = β(x−).
Proof. Let $x \in (a, b)$ be fixed, and consider the sequence $x_n = x + \frac{1}{n}$. From (8.4), it follows
that $\beta(x) \le \alpha(x + \frac{1}{n}) \le \beta(x + \frac{1}{n}) \le n\bigl(\varphi(x + \frac{2}{n}) - \varphi(x + \frac{1}{n})\bigr)$. Letting $n \nearrow \infty$, we obtain
$\beta(x) \le \alpha(x+) \le \beta(x+) \le \beta(x)$. The corresponding statement for left limits follows by
using $x_n = x - \frac{1}{n}$ instead.
Since the functions α and β are nondecreasing, we conclude that, except for a countable
set of common discontinuities where jumps are equal, α = β on (a, b).
Theorem 8.1.3. If ϕ : (a, b) → R is convex, then ϕ is continuous; moreover, ϕ is differentiable
everywhere, except on a countable set.
Proof. Suppose a < x < y < b and let x = x0 < . . . < xn = y. Then
β(xm−1 )(xm − xm−1 ) ≤ ϕ(xm ) − ϕ(xm−1 ) ≤ α(xm )(xm − xm−1 )
Adding all terms gives
$$\sum_{m=1}^{n} \beta(x_{m-1})(x_m - x_{m-1}) \le \varphi(y) - \varphi(x) \le \sum_{m=1}^{n} \alpha(x_m)(x_m - x_{m-1}).$$
Consequently, $\varphi(y) - \varphi(x) = \int_x^y \beta(t)\,dt = \int_x^y \alpha(s)\,ds$; hence, ϕ is absolutely continuous
on any closed interval, and differentiable everywhere except in the countable set N of
discontinuities of β.
Theorem 8.1.4. (a) If ϕ is convex in (a, b), there is a unique Borel measure µ on B((a, b))
such that
(8.5) µ((x, y]) = βϕ (y) − βϕ (x); µ([x, y)) = αϕ (y) − αϕ (x),
where αϕ and βϕ are the left and right derivatives defined as in (8.3). (b) Conversely, if µ
is a Borel measure on B((a, b)), then there exists a convex function ϕ such that (8.5) holds.
(c) If ψ is another such convex function, then ϕ − ψ is linear.
Proof. (a) By Theorem 8.1.3, $\varphi(y) - \varphi(x) = \int_x^y \beta_\varphi(t)\,dt$. Since βϕ is nondecreasing and
right continuous, Theorem 4.6.4 shows that there is a unique Borel measure µ on B((a, b))
such that µ((x, y]) = βϕ (y) − βϕ (x) whenever a < x < y < b. The last identity in (8.5)
follows from βϕ (x−) = αϕ (x) for a < x < b.
(b) Given a Borel measure µ on B((a, b)), the function
$$g(x) = \begin{cases} \mu((x_0, x]) & \text{if } x_0 \le x \\ -\mu((x, x_0]) & \text{if } x_0 \ge x, \end{cases} \tag{8.6}$$
where $a < x_0 < b$ is fixed, is nondecreasing and right continuous. If $\varphi(x) = \int_{x_0}^{x} g(t)\,dt$ then,
for any $a < x < y < b$,
$$g(x)(y - x) \le \int_{[x,y]} g(t)\,dt = \varphi(y) - \varphi(x) \le g(y)(y - x).$$
Hence,
$$\frac{\varphi(u)-\varphi(x)}{u-x} \le g(u) \le \frac{\varphi(y)-\varphi(u)}{y-u}$$
8.2. Jensen’s Inequality 179
for all a < x < u < y < b; therefore, ϕ is convex. Moreover, $\beta_\varphi(x) = g(x)$, $\alpha_\varphi(x) = g(x-)$,
and (8.5) holds.
(c) If ψ is another convex function for which (8.5) holds, then
βϕ (y) − βψ (y) = βϕ (x) − βψ (x)
for any a < x < y < b. Consequently, βϕ = βψ + C for some constant C and, for x0 fixed,
ψ(x) = ψ(x0 ) − ϕ(x0 ) + C(x − x0 ) + ϕ(x).
Example 8.1.5. Consider the convex functions $f(x) = \frac{1}{2}|x|$, $g(x) = x^-$, $h(x) = x^+$ and
$p(x) = \frac{1}{2}x^2$ on $\mathbb{R}$. Then $\delta_0 = \mu_f = \mu_g = \mu_h$, whereas $\mu_p = \lambda$.
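The identity µ((x, y]) = βϕ(y) − βϕ(x) of Theorem 8.1.4 can be checked numerically on the four functions of Example 8.1.5. The following is only a rough sketch: the helper names `right_derivative` and `mu_interval` are hypothetical, and the right derivative is approximated by a forward difference quotient with a small step rather than computed exactly.

```python
def right_derivative(phi, x, step=1e-7):
    """Forward difference quotient, an approximation of beta_phi(x)."""
    return (phi(x + step) - phi(x)) / step

def mu_interval(phi, x, y):
    """Approximate mu_phi((x, y]) = beta_phi(y) - beta_phi(x)."""
    return right_derivative(phi, y) - right_derivative(phi, x)

f = lambda x: 0.5 * abs(x)       # f(x) = |x| / 2
g = lambda x: max(-x, 0.0)       # g(x) = negative part of x
h = lambda x: max(x, 0.0)        # h(x) = positive part of x
p = lambda x: 0.5 * x * x        # p(x) = x^2 / 2

for name, phi in [("f", f), ("g", g), ("h", h), ("p", p)]:
    # (-1, 1] contains the origin, (0, 1] does not
    print(name, round(mu_interval(phi, -1.0, 1.0), 4), round(mu_interval(phi, 0.0, 1.0), 4))
```

For f, g and h the interval (−1, 1] receives mass close to 1 and (0, 1] receives mass close to 0, as expected for δ0; for p the two values are close to the interval lengths 2 and 1.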
If ϕ ◦ X ∈ L1 , equality in (8.7) holds iff there are constants α, β such that ϕ(X) = αX + β
µ–a.s. Hence, if ϕ is strictly convex, equality in (8.7) holds iff X is constant µ–a.s.
If X ∈ L1 , then equality in (8.10) occurs iff X is a nonnegative constant P–a.s. The terms
on the left and right of (8.10) are called the geometric and arithmetic means respectively.
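As a small numerical illustration, and assuming (8.10) is the usual inequality exp(E[log X]) ≤ E[X] between the geometric and arithmetic means of a positive random variable, the sketch below evaluates both sides for a random variable taking finitely many values. The function names are hypothetical and only serve this illustration.

```python
import math

def geometric_mean(values, probs):
    """exp(E[log X]) for X taking values[i] > 0 with probability probs[i]."""
    return math.exp(sum(p * math.log(v) for v, p in zip(values, probs)))

def arithmetic_mean(values, probs):
    """E[X] for the same discrete random variable."""
    return sum(p * v for v, p in zip(values, probs))

# Non-constant X: strict inequality is expected.
print(geometric_mean([1.0, 4.0], [0.5, 0.5]), arithmetic_mean([1.0, 4.0], [0.5, 0.5]))
# Constant X: the two means should agree up to rounding.
print(geometric_mean([3.0, 3.0], [0.5, 0.5]), arithmetic_mean([3.0, 3.0], [0.5, 0.5]))
```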
8.3. Lp spaces
In this section we will introduce the spaces $L_p$ starting from a mean $\|\cdot\|^*$ on a lattice ring
E. We will assume throughout this section that $\|\cdot\|^*$ is continuous along increasing
sequences of nonnegative functions, that is,
$$\|f_n\|^* \nearrow \bigl\|\sup_n f_n\bigr\|^* \tag{8.11}$$
for any nondecreasing sequence $\{f_n\} \subset \overline{\mathbb{R}}_+^{\Omega}$.
Two important examples of this instance are maximal means and the Daniell mean of a
positive σ–continuous elementary integral (E, I).
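Definition 8.3.1. The p–norm, $0 < p \le \infty$, of a function $f \in \overline{\mathbb{R}}^{\Omega}$ is defined as
$$\|f\|_p = \begin{cases} \bigl\| |f|^p \bigr\|^* & \text{if } 0 < p < 1 \\ \bigl( \bigl\| |f|^p \bigr\|^* \bigr)^{1/p} & \text{if } 1 \le p < \infty \\ \inf\{\alpha > 0 : \|\{|f| > \alpha\}\|^* = 0\} & \text{if } p = \infty \end{cases} \tag{8.12}$$
Denote by $F_p(\|\cdot\|^*)$ the collection of functions $f \in \overline{\mathbb{R}}^{\Omega}$ such that $\|f\|_p < \infty$.
It is clear from the definition above that $\|\cdot\|_1 = \|\cdot\|^*$, $\|f\|_p = 0$ iff $\|f\|^* = 0$, and
$\bigl\|\{f > \|f\|_\infty\}\bigr\|^* = 0$.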
Theorem 8.3.2. (Hölder's inequality) Suppose that $p, q \ge 1$ and $\frac{1}{p} + \frac{1}{q} = 1$. For any
$f, g \in \overline{\mathbb{R}}^{\Omega}$
$$\|fg\|_1 \le \|f\|_p \|g\|_q. \tag{8.13}$$
If $1 < p < \infty$ and $\|f\|_p \vee \|g\|_q < \infty$, then equality in (8.13) holds iff either $\|f\|_p \wedge \|g\|_q = 0$
or there is a constant $c > 0$ such that $|f|^p = c|g|^q$ a.s. If $p = 1$ and $\|f\|_1 \wedge \|g\|_\infty < \infty$, then
equality in (8.13) occurs iff $|fg| = \|g\|_\infty |f|$ a.s.
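Hölder's inequality and its equality case can be illustrated on a finite measure space. The sketch below is an assumption-laden toy: Ω is a three-point set with weights `w`, the mean is the weighted sum, and the helper names `p_norm` and `holder_gap` are hypothetical. The gap is positive for a generic pair and vanishes (up to rounding) when |g|^q = |f|^p.

```python
def p_norm(f, w, p):
    """(sum_i w_i |f_i|^p)^(1/p): the p-norm for a measure with point masses w_i."""
    return sum(wi * abs(fi) ** p for fi, wi in zip(f, w)) ** (1.0 / p)

def holder_gap(f, g, w, p):
    """||f||_p ||g||_q - ||fg||_1 with 1/p + 1/q = 1; nonnegative by Hölder."""
    q = p / (p - 1.0)
    lhs = sum(wi * abs(fi * gi) for fi, gi, wi in zip(f, g, w))
    return p_norm(f, w, p) * p_norm(g, w, q) - lhs

w = [0.2, 0.3, 0.5]
f = [1.0, -2.0, 3.0]

# Generic g: strict inequality.
print(holder_gap(f, [0.5, 4.0, -1.0], w, 3.0))
# g = |f|^(p-1), so that |g|^q = |f|^p: equality up to rounding.
print(holder_gap(f, [abs(x) ** 2 for x in f], w, 3.0))
```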
Theorem 8.3.4. (Minkowski's inequality) Suppose $1 \le p \le \infty$ and let $f, g \in \overline{\mathbb{R}}^{\Omega}$ be such
that $\{f = \pm\infty\} \cap \{g = \mp\infty\}$ is negligible. Then
$$\|f + g\|_p \le \|f\|_p + \|g\|_p. \tag{8.15}$$
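Proof. Suppose $1 \le p < \infty$. (i) Absolute homogeneity and solidity are easy to check.
Finite subadditivity follows from Minkowski's inequality. To check countable subadditivity,
let $\{f_n\} \subset \overline{\mathbb{R}}_+^{\Omega}$. Then $\bigl|\sum_{k=1}^{n} f_k\bigr|^p \nearrow \bigl|\sum_{n} f_n\bigr|^p$ pointwise. Since $\|\cdot\|^*$ is continuous
along nonnegative monotone increasing sequences, $\bigl\| |\sum_{n=1}^{\infty} f_n|^p \bigr\|^* = \lim_n \bigl\| |\sum_{k=1}^{n} f_k|^p \bigr\|^*$.
Therefore,
$$\Bigl\|\sum_{n=1}^{\infty} f_n\Bigr\|_p = \lim_n \Bigl\|\sum_{k=1}^{n} f_k\Bigr\|_p \le \lim_n \sum_{k=1}^{n} \|f_k\|_p = \sum_{n=1}^{\infty} \|f_n\|_p.$$
To check E–continuity, let $\{\phi_n\} \subset E_+$ be such that $\sup_n \bigl\|\sum_{k=1}^{n} \phi_k\bigr\|_p < \infty$. If $\psi_n = \sum_{k=1}^{n} \phi_k \in E_+$,
then $\psi_n^p \in E_{\mathbb{R}}^{\Sigma}$ by Corollary 7.3.4 and, since $\psi_n^p \le \|\psi_n^{p-1}\|_u\, \psi_n$, $\psi_n^p \in L_1(\|\cdot\|^*)$.
As $\{\psi_n^p\}$ is an increasing sequence of integrable functions, $\bigl\|\psi_n^p - \bigl(\sum_{n=1}^{\infty} \phi_n\bigr)^p\bigr\|^* \to 0$. The
elementary inequality $1 + t^p \le (1 + t)^p$, where $p \ge 1$ and $t \ge 0$, shows that
$\|\phi_n\|_p^p \le \|\psi_n^p\|^* - \|\psi_{n-1}^p\|^* \to 0$. This shows that $\|\cdot\|_p$ is a mean.
(ii) Suppose that $\|\cdot\|^*$ is maximal and let $\|\cdot\|_p^{\natural}$ be the maximal mean that coincides with
$\|\cdot\|_p$ on $E_+$. If $\|f\|_p < \infty$, then by the maximality of $\|\cdot\|^*$, there exists $0 \le h \in E_{\mathbb{R}}^{\Sigma}$ such that
$|f|^p \le h$ and $\bigl\| |f|^p \bigr\|^* = \|h\|^*$. As $h^{1/p} \in E_{\mathbb{R}}^{\Sigma}$ and $|f| \le h^{1/p}$, $\|f\|_p = \|h^{1/p}\|_p = \|h^{1/p}\|_p^{\natural} \ge \|f\|_p^{\natural}$.
Therefore, by Theorem 7.6.4, $\|f\|_p^{\natural} = \|f\|_p$.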
it follows that k{f > b}k∗ = 0; consequently kf k∞ ≤ b. This shows that k k∞ is continuous
along nonnegative increasing sequences. Subadditivity follows immediately.
Proof. Since k kp is a mean for E, statements (i), (ii) and (iii) hold by Theorem 6.3.12.
Statement (iv) is a direct consequence of Daniell–Lebesgue dominated convergence theorem.
Theorem 8.3.7. Let 1 ≤ p, r < ∞. A function f ∈ Lp iff f |f |(p/r)−1 ∈ Lr . In particular,
for all 1 ≤ p < ∞, 1A ∈ Lp iff 1A ∈ L1 .
Proof. If $f \in L_p$, then there is a sequence $\{\psi_n\} \subset E$ such that $\sum_n \|\psi_n\|_p < \infty$ and
$f = \sum_n \psi_n$ almost surely. Let $G(x) = x|x|^{(p/r)-1} 1_{\mathbb{R}\setminus\{0\}}(x)$ and define
$$\Psi_n = G\Bigl(\sum_{k=0}^{n} \psi_k\Bigr).$$
Clearly $\Psi_n \in E^{\Sigma}$, $\Psi_n \to G(f) = f|f|^{(p/r)-1}$, and $|\Psi_n| \le h := \bigl(\sum_n |\psi_n|\bigr)^{p/r} \in F_r$. By
Corollary 7.3.4, $\Psi_n \in L_r$ and, by dominated convergence, $f|f|^{(p/r)-1} \in L_r$. The converse
statement follows by interchanging $p$ with $r$ and $f$ with $f|f|^{(p/r)-1}$.
The last assertion follows from $G(1_A) = 1_A$.
Corollary 8.3.8. Assume 1 ≤ p < ∞. Then MR(‖·‖∗ ) = MR(‖·‖p ). Moreover, for any
real valued function f , f ∈ Lp if f ∈ MR ∩ Fp and {f ≠ 0} is σ–finite. If ‖·‖∗ is maximal,
then Lp ∩ RΩ = MR ∩ Fp .
Proof. The first statement is a consequence of the fact that Lp contains the same sets for all
1 ≤ p < ∞ (Theorem 8.3.7), and that $\|1_A\|_p^p = \|1_A\|^*$. The second follows from Theorem 7.3.3
and maximality of the mean.
Remark 8.3.9. When p = ∞, the closure of E in F∞ is too small to be useful. Instead, we
define L∞ = MR(‖·‖∗ ) ∩ F∞ (‖·‖∗ ).
Corollary 8.3.10. If $f \in L_p(\|\cdot\|^*)$, $1 \le p < \infty$ and $\frac{1}{p} + \frac{1}{q} = 1$, then
$$\|f\|_p = \max\{\|fg\|^* : g \in L_q,\ \|g\|_q = 1\}. \tag{8.16}$$
If $p = \infty$ and $\{f \ne 0\}$ is σ–finite, then (8.16) holds with $q = 1$ and sup instead of max.
The most important instance of the theory of integration is when (E, I) is an elementary
integral and $\|\cdot\|^*$ is its Daniell mean. Then, by the Stone–Daniell representation theorem,
we can associate to the extension I a measure µ so that $I(1_A) = \mu(A)$ for A ∈ M and
$I(f) = \int f\,d\mu$ for all $f \in L_1(\|\cdot\|^*)$.
The extension to complex–valued functions represents no extra effort in view of Section 6.5.4.
Almost by design, we have the following results:
Theorem 8.3.12. Let Ss be the collection of all measurable complex simple functions, and
S = Ss ∩ L1 . Then, S is dense in Lp (Ω, M , µ) for all 1 ≤ p < ∞, and Ss is dense in
L∞ (X).
Proof. Clearly $C_{00}(X) \subset L_p(X)$. Since the space S of integrable simple functions is dense
in $L_p(X)$, it suffices to show that the indicator of any integrable set A can be approximated in $L_p(X)$ by functions
in $C_{00}$. By regularity, for any ε > 0 there are K ∈ K and G ∈ G with K ⊂ A ⊂ G such that
µ(G \ K) < ε. If $f \in C_{00}(X)$ with $1_K \le f \le 1_G$, then $\|1_A - f\|_p \le \|1_G - 1_K\|_p \le \varepsilon^{1/p}$.
Example 8.3.14. Let µ be a regular Radon measure on $\mathbb{R}^n$. Then $L_p(\mathbb{R}^n, \mu)$ is separable for each
1 ≤ p < ∞. This follows from the density of $C_{00}(\mathbb{R}^n)$ in $L_p(\mathbb{R}^n, \mu)$ and the fact (see
Theorem 5.3.17) that there is a countable collection E of polynomials in $C_{00}(\mathbb{R}^n)$ which is
uniformly dense in $C_0(\mathbb{R}^n)$.
Example 8.3.15. We prove in this example that $L_\infty(\mathbb{R}^n, \lambda_n)$ is not separable. Suppose S
is dense in $L_\infty(\mathbb{R}^n, \lambda_n)$. We will show that S is necessarily uncountable. Fix r > 0 and for
each $x \in \mathbb{R}^n$ define $f_x = 1_{B(x;r)}$. As $\|f_x - f_y\|_\infty = 1$ whenever $x \ne y$, each g ∈ S may be in
at most one ball $B(f_x; \frac{1}{2})$. Since S is dense, we can conclude that S is uncountable.
Example 8.3.16. Clearly $C([-1, 1]) \subset L_\infty([-1, 1], \lambda_1)$. Let $f = 1_{[0,1]}$. If $0 < \varepsilon < \frac{1}{4}$, the
ball B(f ; ε) in $L_\infty([-1, 1])$ does not contain any function in C([−1, 1]). This shows that
C([−1, 1]) is not dense in $(L_\infty([-1, 1]), \|\cdot\|_\infty)$.
Example 8.3.17. Let $H = \mathrm{span}\{\gamma_t(x) = e^{ixt} : x, t \in \mathbb{R}\}$. If (R, B(R), µ) is a finite measure
space, then H is dense in $L_p(\mu)$ for all 1 ≤ p < ∞.
Proof. It suffices to show that for any ε > 0 and $f \in C_{00}(\mathbb{R})$, there is g ∈ H such that
$\|f - g\|_p < \varepsilon$. Let A > 0 be large enough so that $\mathrm{supp}(f) \subset [-A, A]$ and
$\mu([-A, A]^c) < \bigl(\frac{\varepsilon}{2(\|f\|_u + 1)}\bigr)^p$. By Stone–Weierstrass, $\mathrm{span}\{\gamma_{2\pi n/A}(x) = e^{i2\pi n x/A} : n \in \mathbb{Z}\}$ is uniformly dense
in the space of continuous periodic functions of period A. Therefore, there is g ∈ H such that
$\|f - g\|_{[-A,A],u} < 1 \wedge \varepsilon/(2\mu(\mathbb{R})^{1/p})$. Then
$$\|f - g\|_p \le \|(f - g)1_{[-A,A]}\|_p + \|g 1_{[-A,A]^c}\|_p \le \frac{\varepsilon}{2}\Bigl(\frac{\mu([-A, A])}{\mu(\mathbb{R})}\Bigr)^{1/p} + (\|f\|_u + 1)\,\mu([-A, A]^c)^{1/p} < \varepsilon.$$
Therefore, H is dense in $L_p$, 1 ≤ p < ∞.
Theorem 8.3.18. Let (Ω, F , µ) be a measure space and 1 ≤ q ≤ ∞. Suppose f is a
measurable function such that
$$M_f = \sup\Bigl\{ \Bigl|\int f g\,d\mu\Bigr| : g \in L_q(\mu, \mathbb{C}),\ \|g\|_q = 1 \Bigr\} < \infty.$$
If $\{f \ne 0\}$ is σ–finite then $f \in L_p(\mu, \mathbb{C})$, where $\frac{1}{p} + \frac{1}{q} = 1$, and $\|f\|_p = M_f$.
Proof. For p = 1 the statement follows immediately by taking $g = \frac{\overline{f}}{|f|} 1_{\{|f| \ne 0\}}$.
Assume 1 < p < ∞. For any E ∈ F with $E \subset \{|f| \ne 0\}$ and µ(E) < ∞, we will show
that $\|f 1_E\|_p \le M_f$. This would imply that $f \in L_p$ and that $\|f\|_p \le M_f$. The reverse
inequality follows from Hölder's inequality. Let $f_n$ be a sequence of simple functions such
that $|f_n| \le |f|$ and $f_n \to f$. Then $h_n = f_n 1_E$ belongs to $L_s$ for all s > 0, $|h_n| \le |f| 1_E$ and
$h_n \to f 1_E$. If
$$\phi_n = \frac{\overline{f}}{|f|}\,\frac{|h_n|^{p-1}}{\|h_n\|_p^{p-1}}\,1_E,$$
then $\|\phi_n\|_q = 1$ and
$$\|f 1_E\|_p \le \liminf_n \|h_n\|_p = \liminf_n \int |\phi_n h_n|\,d\mu \le \liminf_n \int \phi_n f\,d\mu \le M_f.$$
8.4. Riesz representation. 185
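For p = ∞, let ε > 0 and $A_\varepsilon = \{|f| > M_f + \varepsilon\}$. If $\mu(A_\varepsilon) > 0$, then for any $E \subset A_\varepsilon$ with
$0 < \mu(E) < \infty$, let $g = \frac{\overline{f}\, 1_E}{|f|\,\mu(E)}$ so that $g \in L_1$ and $\|g\|_1 = 1$. Then
$$\int f g\,d\mu = \frac{1}{\mu(E)} \int_E |f|\,d\mu \ge M_f + \varepsilon,$$
contradicting the definition of $M_f$. Therefore $\|f\|_\infty \le M_f$. The inequality $M_f \le \|f\|_\infty$ follows by Hölder's
inequality.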
Proof. The solidity of the seminorm $\|\cdot\|^*_p$ implies that $L_p(\|\cdot\|^*)$ is a Banach lattice for all
1 ≤ p ≤ ∞. The difficult part is to show order completeness. By Lemma 8.4.1 it is enough
to show that any nonempty set G of positive elements in $L_p$ that is bounded above and
closed under finite suprema has a supremum in $L_p$. Let $r = \sup_{\dot g \in G} \|\dot g\|_p < \infty$.
We claim that $\dot g^*$ satisfies the conditions of the theorem. First, we show that $\dot g^*$ is an upper
bound. Since $\|\dot g_n \vee \dot g\|_p \le r = \|\dot g^*\|_p$ for any $\dot g \in G$, $\|\dot g^* \vee \dot g\|_p \le \|\dot g^*\|_p$. As $\|\cdot\|_p$ is strictly
increasing, $\dot g \le \dot g^*$ for all $\dot g \in G$.
We now show that $\dot g^*$ is the least upper bound of G. If $\dot f$ is another upper bound of G, then
so is $\dot g' = \dot g^* \wedge \dot f$. Then $r \le \|\dot g'\|_p \le \|\dot g^*\|_p = r$. Since $\|\cdot\|_p$ is strictly increasing, it follows
that $\dot g' = \dot g^*$, i.e., $\dot g^* \le \dot f$.
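$$\Lambda_\gamma(f) = \int f\gamma\,dm = \int f 1_{\{\gamma>0\}}\gamma\,dm \le \int f 1_{\{\gamma>0\}} g\,dm = \Lambda_g\bigl(f 1_{\{\gamma>0\}}\bigr).$$
Conversely, if $\gamma \in L_q^+$ satisfies $\Lambda_\gamma(f) \le \Lambda_g\bigl(f 1_{\{\gamma>0\}}\bigr)$ for all $f \in L_p^+$, then $\gamma \le g_+$ $\|\cdot\|^*$–a.s.
Therefore, $\dot g_+$ is the least upper bound of the family
$$G = \bigl\{\dot\gamma \in L_q^+ : \Lambda_\gamma(\dot f) \le \Lambda_g\bigl(f 1_{\{\gamma>0\}}\bigr) \text{ for all } \dot f \in L_p^+\bigr\}. \tag{8.17}$$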
The next result shows that any continuous linear functional on Lp is of the form Λg for
some g ∈ Lq .
Theorem 8.4.3. (Riesz representation theorem) Suppose (E, m) is a positive σ–continuous
elementary integral and let $\|\cdot\|^*$ be its Daniell mean. If either 1 < p < ∞ or p = 1 and $\|\cdot\|^*$
is σ–finite, then for any $\Lambda \in L_p^*$ there exists a unique $g \in L_q$ such that $\Lambda = \Lambda_g$.
Remark 8.4.4. Theorem 8.4.3 states that if 1 < p < ∞, or p = 1 and $\|\cdot\|^*$ is σ–finite, then $L_p^*(\|\cdot\|^*, \mathbb{R})$ and
$L_q(\|\cdot\|^*, \mathbb{R})$, where $\frac{1}{p} + \frac{1}{q} = 1$, are isometrically isomorphic spaces, that is, the map $g \mapsto \Lambda_g$ is a
linear isometry from $L_q$ onto $L_p^*$.
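Proof. Let $G = \bigl\{\gamma \in L_q^+ : \Lambda_\gamma(f) \le \Lambda\bigl(f 1_{\{\gamma>0\}}\bigr) \text{ for all } \dot f \in L_p^+\bigr\}$. We claim that G is
nonempty, order–directed and $\|\cdot\|_q$–bounded. First notice that $G \ne \emptyset$ as it contains $\dot 0$. If
$\gamma_1, \gamma_2 \in G$, then
$$\Lambda_{\gamma_1 \vee \gamma_2}(f) = \int f 1_{\{\gamma_2 < \gamma_1\}}\gamma_1\,dm + \int f 1_{\{\gamma_2 \ge \gamma_1\}}\gamma_2\,dm$$
$$\le \Lambda_{\gamma_1}\bigl(f 1_{\{\gamma_2 < \gamma_1\}} 1_{\{\gamma_1 > 0\}}\bigr) + \Lambda_{\gamma_2}\bigl(f 1_{\{\gamma_2 \ge \gamma_1\}} 1_{\{\gamma_2 > 0\}}\bigr) \le \Lambda\bigl(f 1_{\{\gamma_1 \vee \gamma_2 > 0\}}\bigr),$$
which shows that $\gamma_1 \vee \gamma_2 \in G$. For any $\gamma \in G$,
$$\|\gamma\|_q = \sup\Bigl\{\int f\gamma\,dm : f \in L_p^+,\ \|f\|_p = 1\Bigr\} \le \sup\bigl\{\Lambda\bigl(f 1_{\{\gamma>0\}}\bigr) : f \in L_p^+,\ \|f\|_p = 1\bigr\}$$
$$\le \sup\bigl\{\Lambda(f) : f \in L_p,\ \|f\|_p = 1\bigr\} = \|\Lambda\| < \infty.$$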
If α0 = 0 set D = A, otherwise there exists an integrable set A1 ⊂ A such that Λ(1 b A )>
1
α0 /2. Proceeding by induction, suppose we have found integrable sets A1 , . . . , An such that
j−1
[
Aj ⊂ A \ Ak
k=1
n m
[ o
αm b B ) : B ⊂ An ⊂ A \
= sup Λ(1 Ak
k=1
for all $f \in L_p^+$. As $\dot u \in G$,
$$\int_{D^c} (u + r 1_D) f\,dm = \int u f 1_{D^c}\,dm \le \Lambda\bigl(f 1_{D^c} 1_{\{u>0\}}\bigr) \tag{8.19}$$
for all f ∈ L+ ∗ b
p . Hence u̇ + r 1̇D ∈ G , and we must have that k1D k = 0. Then 0 ≤ Λ(1C ) =
b A ) < 0 which is a contradiction.
Λ(1
for all $f \in L_p^+$. For general Λ we consider $\Lambda_u - \Lambda$ and obtain $v \in L_q^+$ such that $\Lambda_u - \Lambda = \Lambda_v$.
Then $\Lambda = \Lambda_g$ with $u = g_+$ and $v = g_-$.
8.5. Reverse Borel–Cantelli theorem 189
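Proof. Without loss of generality, we assume that $P[A_n] > 0$ for all n. Let $f_n = \sum_{k=1}^{n} 1_{A_k}$,
$f = \sum_{n \ge 1} 1_{A_n}$, and for any 0 < λ < 1, define $B_{n,\lambda} = \{f_n > \lambda E[f_n]\}$. Observe that
$$A = \bigcap_{n \ge 1} \bigcup_{k \ge n} A_k = \{f = \infty\} \supset \bigcap_{n \ge 1} \bigcup_{k \ge n} B_{k,\lambda} = B_\lambda;$$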
The next result is a partial converse to the Borel-Cantelli theorem discussed in Corol-
lary 4.3.4.
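Theorem 8.5.3. (Borel–Cantelli, II) Suppose $\{A_n\} \subset F$ is such that for any $i \ne j$,
$P[A_i \cap A_j] \le P[A_i]P[A_j]$. If $\sum_n P[A_n] = \infty$, then $P\bigl[\bigcap_{n \ge 1} \bigcup_{k \ge n} A_k\bigr] = 1$.
Proof. Denote by $A = \bigcap_{n \ge 1} \bigcup_{k \ge n} A_k$. Let $a_n = \sum_{k=1}^{n} P[A_k]$, $b_n = \sum_{i \ne j} P[A_i]P[A_j]$ (the sum
running over indices $i, j \le n$), and $c_n = \sum_{k=1}^{n} P^2[A_k]$. By Kochen–Stone's lemma we have
$$P[A] \ge \limsup_n \frac{c_n + b_n}{a_n + b_n}.$$
From $a_n^2 = c_n + b_n \le a_n + b_n$ and $a_n \nearrow \infty$, it follows that $b_n \nearrow \infty$ and $\lim_n \frac{c_n}{b_n} = 0 = \lim_n \frac{a_n}{b_n}$.
Therefore, P[A] = 1.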
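The lower bound (c_n + b_n)/(a_n + b_n) from the proof can be evaluated numerically. The sketch below assumes, purely for illustration, the probabilities P[A_k] = 1/k (a divergent series), and uses b_n = a_n² − c_n as in the proof; the function name `kochen_stone_bound` is hypothetical. The ratio stays below 1 and increases slowly toward 1, reflecting the logarithmic growth of a_n.

```python
def kochen_stone_bound(probs):
    """(c_n + b_n) / (a_n + b_n) for the given list of probabilities P[A_1..A_n]."""
    a = sum(probs)                   # a_n = sum of P[A_k]
    c = sum(p * p for p in probs)    # c_n = sum of P[A_k]^2
    b = a * a - c                    # b_n = sum over i != j of P[A_i] P[A_j]
    return (c + b) / (a + b)

for n in (10, 1000, 100000):
    probs = [1.0 / k for k in range(1, n + 1)]
    print(n, kochen_stone_bound(probs))
```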
Remark 8.6.2. Unless S is separable, if both fn and f are Borel–measurable, the map
ω 7→ ρ(fn (ω), f (ω)) may fail to be Borel–measurable (see Theorem 9.4.3 in Section 9.4).
However, when {fn : n ∈ N} ⊂ MS we have that {ρ(fn , f ) > δ} ∈ MR, and so the set
in (8.22) is k k–integrable for any A ∈ L1 .
Remark 8.6.3. Convergence in measure is of particular interest when (Ω, M , µ) is a finite
measure space. In this case, {fn : n ∈ N} ⊂ S Ω converges in measure to f if and only if
$$\lim_n \mu^*\bigl(\rho(f_n, f) > \delta\bigr) = 0$$
for every δ > 0.
Denote by L0 (k k) the space of all almost surely defined real (or complex)–valued
measurable functions on Ω. We will show that there exists a topology that is consistent
with convergence in measure of functions.
Theorem 8.6.4. Let {fn : n ∈ N} ⊂ RΩ .
(i) If fn converges in mean to f , then fn converges in measure to f .
Suppose {fn : n ∈ N} ⊂ MR.
(ii) If fn converges k k–a.s. to f , then fn converges in measure to f .
(iii) If fn converges in measure to f then, f ∈ MR and there exists a subsequence
{fmj : j ∈ N} that converges to f a.s.
Proof. (i) If $\|f_n - f\| \to 0$, then for any δ > 0, $\bigl\|\{|f - f_n| > \delta\}\bigr\| \le \frac{1}{\delta}\|f - f_n\| \to 0$. This
shows that convergence in mean implies convergence in measure.
For the remainder of this section we will assume that (Ω, M , µ) is a finite measure space.
For any measurable complex (or extended real) valued function f define
$$\|f\|_0 = \inf\{\varepsilon > 0 : \mu^*(|f| > \varepsilon) \le \varepsilon\},$$
where µ∗ is the Daniell mean associated to µ. When f ∈ MR, µ∗ may be replaced by µ.
Lemma 8.6.5. Let $f$ and $g$ be elements of $\overline{\mathbb{R}}^{\Omega}$.
(i) $\mu^*(|f| > \|f\|_0) \le \|f\|_0 \le \mu(\Omega)$.
(ii) If $\mu^*(\{f = \pm\infty\} \cap \{g = \mp\infty\}) = 0$, then $\|f + g\|_0 \le \|f\|_0 + \|g\|_0$.
(iii) $\|f\|_0 \le \|g\|_0$ whenever $|f| \le |g|$.
(iv) $\|rf\|_0 \le (r \vee 1)\|f\|_0$.
(v) If f ∈ MR with µ(|f | = ∞) = 0, then $\lim_{r \to 0} \|rf\|_0 = 0$.
Proof. (i) Since $\mu^*(|f| > \mu(\Omega)) \le \mu(\Omega)$, $\|f\|_0 \le \mu(\Omega)$. Let $\varepsilon_n \searrow \|f\|_0$ with
$\mu^*(|f| > \varepsilon_n) \le \varepsilon_n$. By Theorem 7.6.7, µ∗ is maximal and continuous along arbitrary nonnegative
nondecreasing sequences. Hence, since $\{|f| > \varepsilon_n\} \nearrow \{|f| > \|f\|_0\}$,
$$\mu^*(|f| > \|f\|_0) = \sup_n \mu^*(|f| > \varepsilon_n) \le \|f\|_0.$$
(iv) Suppose 0 < r ≤ 1. Whenever $\mu^*(|f| > a) \le a$, $\mu^*(r|f| > a) = \mu^*(|f| > a/r) \le \mu^*(|f| > a) \le a$.
Hence $\|rf\|_0 \le \|f\|_0$. Suppose r > 1. As $\mu^*(r|f| > r\|f\|_0) = \mu^*(|f| > \|f\|_0) \le \|f\|_0 \le r\|f\|_0$,
$\|rf\|_0 \le r\|f\|_0$.
(v) Suppose $\|f\|_0 \ne 0$. For any ε > 0, $\lim_{r \to 0} \mu(|f| > \varepsilon/r) = \mu(|f| = \infty) = 0$. Hence, there
is δ > 0 such that 0 < r < δ implies µ(r|f | > ε) < ε. Therefore, $\|rf\|_0 \le \varepsilon$ whenever
0 < r < δ.
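To make the functional ‖f‖0 concrete, the sketch below computes it by bisection on a finite measure space whose points carry the weights `weights`. This is a toy setting chosen for illustration; the function name `norm0` and the tolerance are assumptions, not part of the text. Bisection is justified because ε ↦ µ(|f| > ε) is nonincreasing, so the set of admissible ε is an interval unbounded to the right.

```python
def norm0(values, weights, tol=1e-9):
    """inf{eps > 0 : mu(|f| > eps) <= eps} for f = values on atoms of mass weights."""
    def tail(eps):
        # mu(|f| > eps)
        return sum(w for v, w in zip(values, weights) if abs(v) > eps)

    if tail(0.0) <= 0.0:
        return 0.0                      # f vanishes mu-a.e.
    lo, hi = 0.0, sum(weights)          # eps = mu(Omega) is always admissible
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail(mid) <= mid:
            hi = mid                    # mid is admissible
        else:
            lo = mid                    # mid is not admissible
    return hi

w = [0.5, 0.3, 0.2]
f = [0.1, 0.2, 3.0]
g = [1.0, 1.0, 1.0]
print(norm0(f, w))                                  # close to 0.2
print(norm0(g, w))                                  # close to mu(Omega) = 1
print(norm0([x + y for x, y in zip(f, g)], w))      # at most norm0(f) + norm0(g)
print(norm0([3.0 * x for x in f], w))               # at most 3 * norm0(f)
```

Note that a large constant function has ‖·‖0 equal to µ(Ω), which is the bound in Lemma 8.6.5(i).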
Lemma 8.6.7 and Exercise 8.9.19 allow us to put a metric on the space M (S) of all
measurable functions with values in (S, ρ) whose convergence is equivalent to convergence in measure.
Theorem 8.6.8. Suppose that µ(Ω) < ∞. Let (S, ρ) be a separable metric space, and F :
[0, ∞) → [0, ∞) be a bounded nondecreasing continuous subadditive function with F (t) = 0
iff t = 0. For any pair of measurable functions f , g with values in S, define
$$D_F(f, g) = \int F(\rho(f, g))\,d\mu. \tag{8.25}$$
Proof. Only the last statement requires a proof. Suppose (S, ρ) is complete and let $(f_n)$ be a
Cauchy sequence in $(M(S), D_F)$. Then by (8.24)
$$\lim_{M \to \infty} \sup_{n,m \ge M} \mu(\rho(f_n, f_m) > \varepsilon) \le \frac{1}{F(\varepsilon)} \lim_{M \to \infty} \sup_{n,m \ge M} D_F(f_n, f_m) = 0.$$
Hence, there are integers $n_k < n_{k+1}$ such that $\sup_{n,m \ge n_k} \mu(\rho(f_n, f_m) > 2^{-k}) < 2^{-k}$, so
$\sum_k \mu(\rho(f_{n_{k+1}}, f_{n_k}) > 2^{-k}) < \infty$. By the Borel–Cantelli lemma, the set $A = \{\rho(f_{n_{k+1}}, f_{n_k}) > 2^{-k} \text{ i.o.}\}$
has µ–measure zero; hence, $\{f_{n_k}\}$ is a Cauchy sequence in (S, ρ) µ–a.s. Completeness
of (S, ρ) implies that $f_{n_k}$ converges µ–a.s. to a measurable function f . The dominated
convergence theorem implies that $\lim_k D_F(f_{n_k}, f) = 0$. Therefore $\lim_n D_F(f_n, f) = 0$, for in any
metric space, a Cauchy sequence that has a convergent subsequence is in fact convergent.
Theorem 8.6.9. Let (Ω, M , µ) be a finite measure space, and (S, ρ) be a separable metric
space.
(i) If fn (ω) converges to f (ω) pointwise µ–a.s. then fn converges to f in measure.
(ii) If fn converges in measure to f , then there is a subsequence fnk such that fnk (ω) →
f (ω) for µ–almost all ω ∈ Ω.
Proof. (i) $f_n \to f$ a.s. is equivalent to $\rho(f_n, f) \wedge 1 \to 0$ a.s. The conclusion follows from
Lemma 8.6.7 with $F(t) = t \wedge 1$, and dominated convergence.
(ii) Choose a subsequence $n_k < n_{k+1}$ such that $\mu(\{\rho(f_{n_k}, f) > 1/k\}) < 1/2^k$. Then
$\sum_k \mu(\{\rho(f_{n_k}, f) > 1/k\}) < \infty$ and, by Borel–Cantelli, $f_{n_k}$ converges pointwise to f outside
the set $A = \{\rho(f_{n_k}, f) > 1/k \text{ i.o.}\}$, which has measure zero.
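The gap between parts (i) and (ii) is illustrated by the classical "typewriter" sequence on [0, 1) with Lebesgue measure, which is not taken from the text: for n = 2^k + j with 0 ≤ j < 2^k, let f_n be the indicator of [j/2^k, (j+1)/2^k). The sketch below (hypothetical helper names) shows that the measure of {f_n > δ} tends to 0, that f_n(x) = 1 once in every dyadic generation, so there is no pointwise convergence, and that the subsequence f_{2^k} converges to 0 at every x > 0.

```python
def typewriter_interval(n):
    """Endpoints of [j/2^k, (j+1)/2^k) where n = 2^k + j, 0 <= j < 2^k, n >= 1."""
    k = n.bit_length() - 1
    j = n - (1 << k)
    return j / 2 ** k, (j + 1) / 2 ** k

def f(n, x):
    """Value at x of the n-th typewriter function (an indicator)."""
    left, right = typewriter_interval(n)
    return 1.0 if left <= x < right else 0.0

def measure_exceeds(n):
    """Lebesgue measure of {f_n > delta} for any 0 < delta < 1."""
    left, right = typewriter_interval(n)
    return right - left

x = 0.3
print([measure_exceeds(2 ** k) for k in range(6)])   # 1, 1/2, 1/4, ...
# In each generation 2^k <= n < 2^(k+1), exactly one f_n equals 1 at x.
print([sum(f(n, x) for n in range(2 ** k, 2 ** (k + 1))) for k in range(6)])
# The subsequence n_k = 2^k is eventually 0 at x.
print([f(2 ** k, x) for k in range(6)])
```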
Corollary 8.6.10. Assume µ(Ω) < ∞. Then fn converges in measure to f if and only if
every subsequence fn′ has a further subsequence fn′k → f pointwise µ–a.s.
Proof. Let wn = supk,m≥n ρ(fk , fm ). The completeness of (S, ρ) implies that {fn } con-
verges µ–a.s. iff wn converges to zero µ–a.s.
If {fn } converges µ–a.s., then from supk ρ(fn+k , fn ) ≤ wn , we conclude that both wn and
supk ρ(fn+k , fn ) converge to 0 µ–a.s., and so in measure.
As $|f - f^{g}_{-g}| = (|f| - g)_+$, it follows that I is uniformly integrable iff
$$\inf_{0 \le g \in L_1} \sup_{f \in I} \int (|f| - g)_+\,d\mu = 0. \tag{8.29}$$
If in addition µ(Ω) < ∞, then uniform integrability is equivalent to either of the following
conditions
(i) $\inf_{a>0} \sup_{f \in I} \int (|f| - a)_+\,d\mu = 0$
(ii) $\inf_{a>0} \sup_{f \in I} \int_{\{|f|>a\}} |f|\,d\mu = 0$
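Proof. Since $(|f| - g)_+ 1_{\{|f| \ge g\}} \le |f| 1_{\{|f| \ge g\}}$, (8.29) follows from (8.30).
Suppose that (8.29) holds, and for each ε > 0 choose $0 \le g_\varepsilon \in L_1$ so that
$$\sup_{f \in I} \int (|f| - g_\varepsilon)_+\,d\mu < \frac{\varepsilon}{2}.$$
If $\tilde g_\varepsilon = 2 g_{\varepsilon/2}$, then $|f| 1_{\{|f| > \tilde g_\varepsilon\}} \le 2(|f| - g_{\varepsilon/2})_+$. Therefore,
$$\sup_{f \in I} \int_{\{|f| > \tilde g_\varepsilon\}} |f|\,d\mu < \frac{\varepsilon}{2}$$
and (8.30) follows.
Assume in addition that µ(Ω) < ∞. Repeating the arguments used in the proof of the
equivalence between (8.30) and (8.29) shows that (i) and (ii) are equivalent. Clearly (i)
implies (8.30), since the infimum in (i) is taken over a smaller set of integrable functions,
namely the set of all constants. It remains to show that (8.29) implies (ii). For ε > 0, let
$g_\varepsilon$ and $\tilde g_\varepsilon$ be as before, and choose $a_\varepsilon > 0$ so that $\int_{\{\tilde g_\varepsilon > a_\varepsilon\}} \tilde g_\varepsilon\,d\mu < \varepsilon/2$. From
$$|f| 1_{\{|f| > a_\varepsilon\}} \le |f| 1_{\{|f| > \tilde g_\varepsilon\}} + \tilde g_\varepsilon 1_{\{\tilde g_\varepsilon > a_\varepsilon\}},$$
it follows that
$$\sup_{f \in I} \int_{\{|f| > a_\varepsilon\}} |f|\,d\mu \le \varepsilon.$$
Therefore (ii) holds.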
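Condition (ii) can be tested on two standard families on [0, 1] with Lebesgue measure; these families are illustrative choices, not examples from the text. For f_n = n·1_[0,1/n] the tail integral over {f_n > a} equals 1 as soon as n > a, so the supremum never decays; for f_n = √n·1_[0,1/n] it equals 1/√n when √n > a, so the supremum is at most 1/a. The sketch truncates the supremum at a hypothetical cutoff `n_max`.

```python
import math

def tail_integral(height, width, a):
    """Integral of f over {f > a} for f = height * 1_[0, width]."""
    return height * width if height > a else 0.0

def sup_tail(height_of, a, n_max=10_000):
    """max over 1 <= n <= n_max of the tail integral of height_of(n) * 1_[0, 1/n]."""
    return max(tail_integral(height_of(n), 1.0 / n, a) for n in range(1, n_max + 1))

spike = lambda n: float(n)        # f_n = n * 1_[0,1/n]: not uniformly integrable
mild = lambda n: math.sqrt(n)     # f_n = sqrt(n) * 1_[0,1/n]: uniformly integrable

for a in (5.0, 50.0):
    print(a, sup_tail(spike, a), sup_tail(mild, a))
```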
Lemma 8.7.3. Suppose that µ is σ–finite, then there exists h > 0 with h ∈ L1 (Ω, F , µ).
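Proof. (b) Suppose µ is σ–finite and I is uniformly integrable. For any ε > 0 let $0 \le \tilde g_\varepsilon \in L_1$
be such that $\sup_{f \in I} \int_{\{|f| > \tilde g_\varepsilon\}} |f|\,d\mu < \varepsilon$. Since $|f| \le |f| 1_{\{|f| > \tilde g_1\}} + \tilde g_1$, (i) follows by
integration.
By Lemma 8.7.3 there exists a strictly positive function $h \in L_1(\mu)$. As $1_{\{\tilde g_{\varepsilon/3} > nh\}} \to 0$
pointwise, by dominated convergence there is an integer $n_\varepsilon$ such that
$$\int_{\{\tilde g_{\varepsilon/3} > n_\varepsilon h\}} \tilde g_{\varepsilon/3}\,d\mu < \frac{\varepsilon}{3}.$$
For any A ∈ F , $|f| 1_A \le |f| 1_{\{|f| > \tilde g_{\varepsilon/3}\}} + \tilde g_{\varepsilon/3} 1_{\{\tilde g_{\varepsilon/3} > n_\varepsilon h\}} + n_\varepsilon h 1_A$. Hence, if $\delta_\varepsilon := \frac{\varepsilon}{3 n_\varepsilon}$, then
$\int_A h\,d\mu < \delta_\varepsilon$ implies that $\sup_{f \in I} \int_A |f|\,d\mu \le \varepsilon$.
(a) Suppose $h \in L_1^+$ satisfies (ii) and let $\alpha = \sup_{f \in I} \int |f|\,d\mu$. For any c > 0
$$\int_{\{|f| > ch\}} h\,d\mu \le \frac{1}{c} \int_{\{|f| > ch\}} |f|\,d\mu \le \frac{1}{c} \int |f|\,d\mu \le \frac{\alpha}{c}.$$
Consequently, if $c > \alpha/\delta_\varepsilon$ then $\int_{\{|f| > ch\}} h\,d\mu < \delta_\varepsilon$; thus, $\sup_{f \in I} \int_{\{|f| > ch\}} |f|\,d\mu < \varepsilon$.
(c) Suppose µ(Ω) < ∞. Assume that (ii) holds. For ε > 0 let $\delta_\varepsilon > 0$ be as in (ii). Since
$1_{\{h \ge k\}} \searrow 0$ as $k \nearrow \infty$, by monotone convergence we can choose $k_\varepsilon$ so that $\int_{\{h > k_\varepsilon\}} h\,d\mu < \frac{\delta_\varepsilon}{2}$.
If $\mu(A) < \frac{\delta_\varepsilon}{2 k_\varepsilon}$, then $\int_A h\,d\mu < \delta_\varepsilon$ and (ii)' follows.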
Theorem 8.7.5. Suppose that µ is σ–finite and let fn ∈ L1 (Ω, F , µ), n ∈ N. The following
statements are equivalent.
(i) There is f ∈ L1 to which fn converges in L1 .
(ii) fn is a Cauchy sequence in L1 .
(iii) {fn } is uniformly integrable and there is a measurable function f to which fn
converges in measure.
8.7. Uniform Integrability 197
Suppose (i) holds. The Markov–Chebyshev inequality implies that $\mu(|f_n - f| > \varepsilon) \le \frac{1}{\varepsilon}\|f_n - f\|_1$.
Convergence in measure follows. Given ε > 0, choose $n_\varepsilon$ so that $\|f_n - f_m\|_1 < \varepsilon$
for all $n, m \ge n_\varepsilon$. Since $x \mapsto (a - x)_+$ is nonincreasing, letting $g_\varepsilon = \max\{|f_1|, \dots, |f_{n_\varepsilon}|\}$,
we obtain
$$\int (|f_n| - g_\varepsilon)_+\,d\mu < \varepsilon$$
It remains to prove that (iii) implies (i). Suppose the contrary. Then, there is ε > 0 and a
subsequence $(f_{n_k})$ such that
$$\inf_k \|f_{n_k} - f\|_1 \ge \varepsilon. \tag{8.31}$$
Since $f_n$ converges to f in measure, we may assume without loss of generality that $f_{n_k}$
converges to f µ–a.s. By Fatou's Lemma and Theorem 8.7.4 it follows that $f \in L_1$,
since $\int |f|\,d\mu \le \liminf_k \int |f_{n_k}|\,d\mu < \infty$. Thus, the sequence $\{f_{n_k} - f\}_k$ is also uniformly
integrable. Choose $0 \le g \in L_1$ so that $\sup_n \int_{\{|f_n - f| > g\}} |f_n - f|\,d\mu < \frac{\varepsilon}{2}$. If $g_k = |f_{n_k} - f| \wedge g$,
then $\lim_k g_k = 0$ a.s. Since $g - g_k \ge 0$, Fatou's lemma gives
$$0 \le \limsup_k \int g_k\,d\mu = \int g\,d\mu - \liminf_k \int (g - g_k)\,d\mu \le 0. \tag{8.32}$$
We conclude this section with a well known result that is in fact equivalent to Theo-
rem 8.7.5.
Theorem 8.7.6. (Vitali's convergence theorem) Suppose 1 ≤ p < ∞, let $\{f_n : n \in \mathbb{N}\} \subset L_p(\mu)$
and let f be F –measurable. Then, $\|f_n - f\|_p \to 0$ iff $\{f_n\}$ and $f$ satisfy the
following conditions:
(i) fn converges to f in µ–measure.
(ii) For any ε > 0, there is E ∈ F with µ(E) < ∞ such that
$$\sup_n \int_{\Omega \setminus E} |f_n|^p\,d\mu < \varepsilon$$
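Proof. Suppose $\|f - f_n\|_p \to 0$. Then (i) holds clearly. For ε > 0, there is $n_\varepsilon \in \mathbb{N}$ such
that $\|f - f_n\|_p < \frac{\varepsilon^{1/p}}{2}$ for all $n \ge n_\varepsilon$. Let $A_\varepsilon$ and $B_\varepsilon$ be measurable sets of finite measure
such that
$$\int_{\Omega \setminus A_\varepsilon} |f|^p\,d\mu < \frac{\varepsilon}{2^p}, \qquad \int_{\Omega \setminus B_\varepsilon} \max_{1 \le j \le n_\varepsilon} |f_j|^p\,d\mu < \varepsilon.$$
Set $C = A_\varepsilon \cup B_\varepsilon$. Then, for any $n \ge n_\varepsilon$,
$$\|1_{\Omega \setminus C} f_n\|_p \le \|f_n - f\|_p + \|1_{\Omega \setminus C} f\|_p < \varepsilon^{1/p}.$$
Thus (ii) holds. Similarly, choose $\delta_\varepsilon > 0$ such that $\mu(A) < \delta_\varepsilon$ implies that
$$\int_A |f|^p\,d\mu < \frac{\varepsilon}{2^p}, \qquad \int_A \max_{1 \le j \le n_\varepsilon} |f_j|^p\,d\mu < \varepsilon.$$
Then, for $n \ge n_\varepsilon$,
$$\|1_A f_n\|_p \le \|f_n - f\|_p + \|1_A f\|_p < \varepsilon^{1/p}.$$
Thus (iii) holds.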
Conversely, suppose (i)–(iii) hold. We will show that any subsequence of $(f_n)$ has a
subsequence which converges to f in $L_p$. By (i), without loss of generality, suppose $f_n \to f$
µ–a.s.
Given ε > 0, choose E ∈ F with µ(E) < ∞ and δ > 0 so that
$$\sup_n \int_{\Omega \setminus E} |f_n|^p\,d\mu < \frac{\varepsilon}{4^p}, \qquad \sup_n \int_A |f_n|^p\,d\mu < \frac{\varepsilon}{4^p}$$
whenever µ(A) < δ. By Fatou's lemma, $\int_{\Omega \setminus E} |f|^p\,d\mu \le \frac{\varepsilon}{4^p}$ and $\int_A |f|^p\,d\mu \le \frac{\varepsilon}{4^p}$ whenever
µ(A) < δ. By Egorov's theorem, there is a measurable set C ⊂ E with µ(E \ C) < δ such
that $\|f_n - f\|_{u,C} \to 0$. Consequently
$$\|f - f_n\|_p \le \|(f - f_n)1_{\Omega \setminus E}\|_p + \|(f - f_n)1_{E \setminus C}\|_p + \|(f - f_n)1_C\|_p \le \varepsilon^{1/p} + \|f - f_n\|_{u,C}\,\mu(C)^{1/p}.$$
It follows that $\limsup_n \|f - f_n\|_p \le \varepsilon^{1/p}$. Therefore, $\|f - f_n\|_p \to 0$.
The notion of an atom is more relevant in the context of the Daniell mean $\|\cdot\|^*$ of
a positive σ–continuous elementary integral (E, I), for in this setting $\|\cdot\|^*$ is σ–additive on
the family of measurable sets $M(\|\cdot\|^*)$.
Theorem 8.8.2. (Saks) Let $\|\cdot\|^*$ be the Daniell mean associated to a positive
σ–continuous elementary integral (E, I).
(i) If $E \in L_1$ and $\|E\|^* > 0$ then, for any ε > 0 there exists a finite collection of
pairwise disjoint measurable sets $E_1, \dots, E_{n_\varepsilon}$ such that $E = \bigcup_{j=1}^{n_\varepsilon} E_j$ and either
$\|E_j\|^* \le \varepsilon$ or $E_j$ is an atom of $\|\cdot\|^*$ with $\|E_j\|^* > \varepsilon$.
(ii) If $\|\cdot\|^*$ has no atoms and $E \in L_1$ then, for any $0 < \alpha < \|E\|^*$, there exists $D \in L_1$
with D ⊂ E such that $\|D\|^* = \alpha$.
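Proof. (i) Since $\|E\|^* < \infty$, there are at most a finite number of atoms $E_1, \dots, E_\ell \subset E$
with $\|E_j\|^* > \varepsilon$. Let $A = E \setminus \bigcup_{j=1}^{\ell} E_j$. If $\|A\|^* = 0$, the desired partition is given by
$\{E_j : 1 \le j \le \ell\} \cup \{A\}$. Suppose $\|A\|^* > 0$.
Claim: Any nonnegligible measurable set B ⊂ A contains a set $F \in L_1$ such that
$0 < \|F\|^* \le \varepsilon$. Suppose that is not the case. Then there is an integrable set B ⊂ A with
$\|B\|^* > 0$ whose nonnegligible measurable subsets have Daniell mean larger than ε. In
particular, $\|B\|^* > \varepsilon$ and thus, B is not an atom of $\|\cdot\|^*$. Consequently, there is a measurable
set $G_1 \subset B$ such that $0 < \|G_1\|^* < \|B\|^*$. It follows that both $\|G_1\|^*$ and $\|B \setminus G_1\|^*$ are
larger than ε; thus, $B \setminus G_1$ is not an atom of $\|\cdot\|^*$ and so, there exists $G_2 \subset B \setminus G_1$ such
that $0 < \|G_2\|^* < \|B \setminus G_1\|^*$. Proceeding by induction, we obtain a sequence of pairwise
disjoint sets $G_n \subset B$ with $\|G_n\|^* > \varepsilon$, which contradicts integrability of B.
From the claim, we conclude that for any integrable B ⊂ A with $\|B\|^* > 0$,
$$0 < \beta(B; \varepsilon) := \sup\{\|H\|^* : H \in L_1,\ H \subset B,\ \|H\|^* \le \varepsilon\} \le \varepsilon.$$
Let $H_1$ be an integrable subset of A such that
$$\frac{\beta(A; \varepsilon)}{2} < \|H_1\|^* \le \varepsilon.$$
Proceeding by induction, we obtain a countable collection (possibly finite) of integrable
subsets $H_n$ of A such that, if $\bigl\|A \setminus \bigcup_{j=1}^{n} H_j\bigr\|^* > 0$, then $H_{n+1} \subset A \setminus \bigcup_{j=1}^{n} H_j$ and
$$\frac{\beta\bigl(A \setminus \bigcup_{j=1}^{n} H_j; \varepsilon\bigr)}{2} < \|H_{n+1}\|^* \le \varepsilon.$$
Since $\sum_n \|H_n\|^* = \bigl\|\bigcup_n H_n\bigr\|^* \le \|A\|^* < \infty$, $\lim_n \|H_n\|^* = 0$. Hence
$$\beta\Bigl(A \setminus \bigcup_n H_n; \varepsilon\Bigr) \le \beta\Bigl(A \setminus \bigcup_{j=1}^{n} H_j; \varepsilon\Bigr) \le 2\|H_{n+1}\|^* \to 0$$
and so, $\bigl\|A \setminus \bigcup_n H_n\bigr\|^* = 0$. Choose an integer $N_\varepsilon$ large enough so that $\sum_{n > N_\varepsilon} \|H_n\|^* < \varepsilon$.
Set $E_{\ell+1} := H_1, \dots, E_{\ell+N_\varepsilon} := H_{N_\varepsilon}$ and $E_{\ell+N_\varepsilon+1} := \bigl(A \setminus \bigcup_n H_n\bigr) \cup \bigcup_{j > N_\varepsilon} H_j$. The collection
$\{E_j : j = 1, \dots, \ell + N_\varepsilon + 1\}$ has the desired properties.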
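(ii) Fix a sequence $\varepsilon_n \searrow 0$ with $\varepsilon_1 < \alpha$. By part (i), there exists a measurable set $D_1 \subset E$
such that
$$\alpha - \varepsilon_1 \le \|D_1\|^* \le \alpha.$$
Proceeding by induction, suppose we have constructed a collection of measurable sets
$D_1 \subset \dots \subset D_n \subset E$ such that
$$\alpha - \varepsilon_n \le \|D_n\|^* \le \alpha.$$
If $\|D_n\|^* = \alpha$ we are done, otherwise there is a set $B_{n+1} \subset E \setminus D_n$ such that
$$\alpha - \|D_n\|^* - \Bigl(\varepsilon_{n+1} \wedge \frac{\alpha - \|D_n\|^*}{2}\Bigr) \le \|B_{n+1}\|^* \le \alpha - \|D_n\|^*.$$
Setting $D_{n+1} := D_n \cup B_{n+1}$, we obtain a measurable set such that
$$\alpha - \varepsilon_{n+1} \le \|D_n \cup B_{n+1}\|^* = \|D_n\|^* + \|B_{n+1}\|^* \le \alpha.$$
Let $D = \bigcup_n D_n$. Clearly D is a measurable subset of E with $\|D\|^* = \alpha$.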
We conclude this section with some measure theoretical results concerning the range of
certain finite-dimensional vector-valued measures, which extend Saks's Theorem 8.8.2(ii).
Theorem 8.8.3. (Lyapunov's convexity theorem) Suppose $\mu_1, \dots, \mu_n$ are signed measures
of finite total variation on a measurable space (Ω, F ). Denote by $M_b(\Omega)$ the space of bounded
F –measurable functions on Ω. Then,
(i) the set
$$K := \Bigl\{\Bigl(\int g\,d\mu_1, \dots, \int g\,d\mu_n\Bigr) : g \in M_b(\Omega),\ 0 \le g \le 1\Bigr\}$$
Proof. (i) Let $\mu := |\mu_1| + \dots + |\mu_n|$. Then µ is a finite measure and $\mu_j \ll \mu$ for each
j = 1, . . . , n. The Radon–Nikodym theorem implies that there are functions $f_j \in L_1(\mu)$
such that $d\mu_j = f_j\,d\mu$. Since for any $f \in L_\infty(\mu)$ there is a function $f' \in M_b(\Omega)$ such that
$f = f'$ µ–a.s., we may consider functions in $L_\infty(\mu)$ instead of $M_b(\Omega)$. Let $\Lambda : L_\infty(\mu) \to \mathbb{R}^n$
be the map
$$\Lambda(g) := \Bigl(\int g f_1\,d\mu, \dots, \int g f_n\,d\mu\Bigr).$$
Since $L_1(\mu)^* = L_\infty(\mu)$, Λ is a weak∗–continuous linear map. Notice that
$g \in H := \{h \in L_\infty(\mu) : 0 \le h \le 1\}$ iff $0 \le \int g f\,d\mu \le \int f\,d\mu$ for all $f \in L_1^+(\mu)$; hence, the convex set
H is a weak∗–closed subset of the unit ball in $L_\infty(\mu)$. By Alaoglu's
theorem H is weak∗–compact, and so K = Λ(H) is compact in $\mathbb{R}^n$.
8.8. Lyapunov’s convexity theorem 201
Theorem 8.8.4. Suppose µ1 , . . . , µn+1 are signed measures on (Ω, F ) of finite total vari-
ation, and let H := {g ∈ Mb (Ω) : 0 ≤ g ≤ 1}. Define Λ : g 7→ (µ1 g, . . . , µn g) on H and set
K := Λ(H).
(i) If c ∈ K, then there exist $\phi^*, \phi_* \in H$ such that
$$\phi^* = \arg\max\{\mu_{n+1} g : g \in H,\ \Lambda g = c\} \tag{8.34}$$
$$\phi_* = \arg\min\{\mu_{n+1} g : g \in H,\ \Lambda g = c\} \tag{8.35}$$
Suppose $\mu_j \ll \nu$ and $f_j = \frac{d\mu_j}{d\nu}$ for all j = 1, . . . , n + 1 and some σ–finite measure ν (e.g.
$\nu = |\mu_1| + \dots + |\mu_{n+1}|$).
(ii) If there exist $g^* \in H$ and $(a_1, \dots, a_n) \in \mathbb{R}^n$ such that $\Lambda g^* = c$ and
$$\begin{aligned} g^*(x) &= 1 \quad \text{when } f_{n+1}(x) > a_1 f_1(x) + \dots + a_n f_n(x) \\ g^*(x) &= 0 \quad \text{when } f_{n+1}(x) < a_1 f_1(x) + \dots + a_n f_n(x) \end{aligned} \tag{8.36}$$
then $g^*$ solves (8.34). Any other solution g to (8.34) satisfies $g = g^*$ ν–a.s. on
$\{f_{n+1} \ne a_1 f_1 + \dots + a_n f_n\}$.
(iii) If there exist $g_* \in H$ and $(b_1, \dots, b_n) \in \mathbb{R}^n$ such that $\Lambda g_* = c$ and
$$\begin{aligned} g_*(x) &= 1 \quad \text{when } f_{n+1}(x) < b_1 f_1(x) + \dots + b_n f_n(x) \\ g_*(x) &= 0 \quad \text{when } f_{n+1}(x) > b_1 f_1(x) + \dots + b_n f_n(x) \end{aligned} \tag{8.37}$$
then $g_*$ solves (8.35). Any other solution g to (8.35) satisfies $g = g_*$ ν–a.s. on
$\{f_{n+1} \ne b_1 f_1 + \dots + b_n f_n\}$.
Proof. (i) The first statement follows from the $\sigma(L_\infty(\nu), L_1(\nu))$–continuity of Λ and the
$\sigma(L_\infty(\nu), L_1(\nu))$–compactness of $H \cap L_\infty(\nu)$, for $\{g \in L_\infty : 0 \le g \le 1,\ \Lambda g = c\} = \Lambda^{-1}(\{c\})$.
(ii) Suppose g ∈ H and Λg = c. If $g^*(x) > g(x)$ then $f_{n+1}(x) \ge a_1 f_1(x) + \dots + a_n f_n(x)$,
whereas if $g^*(x) < g(x)$ then $f_{n+1}(x) \le a_1 f_1(x) + \dots + a_n f_n(x)$. Consequently
$$I := \int \bigl(g^*(x) - g(x)\bigr)\bigl(f_{n+1}(x) - a_1 f_1(x) - \dots - a_n f_n(x)\bigr)\,\nu(dx) \ge 0$$
If g also solves (8.34) then I = 0, which means that the set $\{g^* \ne g,\ f_{n+1} \ne a_1 f_1 + \dots + a_n f_n\}$
is ν–negligible. Therefore $g = g^*$ ν–a.s. on $\{f_{n+1} \ne a_1 f_1 + \dots + a_n f_n\}$.
(iii) may be obtained from part (ii) applied to $-\mu_j$, j = 1, . . . , n + 1, and −c in place of $\mu_j$,
j = 1, . . . , n + 1, and c.
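Part (ii) of Theorem 8.8.4 has the flavour of a threshold rule, which can be checked by brute force in a toy case. The sketch below assumes n = 1, Ω a five-point set with ν the counting measure, hypothetical densities `f1`, `f2`, and a hypothetical multiplier `a`; it compares the threshold function g* with all competitors on a coarse grid of values {0, 1/2, 1} satisfying the same constraint. Since `f1` is constant, the constraint µ1 g = c reduces to a condition on the sum of g, which avoids rounding issues. This is an illustration under these assumptions only, not a proof.

```python
from itertools import product

f1 = [0.2, 0.2, 0.2, 0.2, 0.2]        # density of mu_1 (hypothetical, uniform)
f2 = [0.10, 0.30, 0.20, 0.25, 0.15]   # density of mu_2 (hypothetical)
a = 1.1                                # hypothetical multiplier in (8.36)

def integral(g, f):
    """Integral of g against the measure with density f (counting measure nu)."""
    return sum(gi * fi for gi, fi in zip(g, f))

# Threshold rule (8.36): g* = 1 where f2 > a*f1, g* = 0 where f2 < a*f1.
g_star = [1.0 if f2[i] > a * f1[i] else 0.0 for i in range(5)]
c = integral(g_star, f1)               # the constraint value attained by g*

# Competitors on the grid {0, 0.5, 1}^5 with mu_1 g = c, i.e. sum(g) = sum(g*).
competitors = [g for g in product([0.0, 0.5, 1.0], repeat=5) if sum(g) == sum(g_star)]
best = max(integral(g, f2) for g in competitors)

print(g_star, c, integral(g_star, f2), best)
```

No competitor on the grid exceeds the value of µ2 attained by g*.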
Theorem 8.8.5. Under the assumptions and notation of Theorem 8.8.4, if c is in the
relative interior of K, there exist $g_*, g^* \in H$ with $\Lambda g_* = \Lambda g^* = c$ satisfying (8.37), (8.35)
and (8.36), (8.34) respectively. Moreover, $\mu_{n+1} g_* < \mu_{n+1} g^*$ unless $\mu_{n+1} = a_1 \mu_1 + \dots + a_n \mu_n$
for some $(a_1, \dots, a_n) \in \mathbb{R}^n$.
Proof. The set $L = \{(\Lambda g, \mu_{n+1} g) : g \in H\}$ is a compact convex subset of $\mathbb{R}^{n+1}$. Let $\pi^{(n)} : (x_1, \ldots, x_{n+1}) \mapsto (x_1, \ldots, x_n)$ and $\pi_{n+1} : (x_1, \ldots, x_{n+1}) \mapsto x_{n+1}$. Clearly $K = \pi^{(n)}(L)$, and $\pi_{n+1}\big(L \cap (\pi^{(n)})^{-1}(\{c\})\big)$ is a nonempty compact interval $[c_*, c^*]$. There are two alternatives: either $c_* = c^*$ or $c_* < c^*$.
Case $c_* = c^*$: We claim that $L$ is contained in a non-vertical hyperplane containing the origin, that is, for some $(a_1, \ldots, a_n) \in \mathbb{R}^n$,
\[
u_{n+1} = \sum_{j=1}^n a_j u_j, \qquad (u_1, \ldots, u_{n+1}) \in L. \tag{8.38}
\]
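We show that for any $\mathbf{c}' \in K \setminus \{\mathbf{c}\}$ there exists a unique $c' \in \mathbb{R}$ such that $(\mathbf{c}', c') \in L$; here bold letters denote points of $K \subset \mathbb{R}^n$ (so $\mathbf{c}$ is the point $c$ fixed in the theorem) and plain letters denote real numbers. Suppose this is not the case and that for some $\mathbf{c}' \in K \setminus \{\mathbf{c}\}$ there are $c'_{*}, {c'}^{*} \in \mathbb{R}$ with $c'_{*} < {c'}^{*}$ and such that $(\mathbf{c}', c'_{*}), (\mathbf{c}', {c'}^{*}) \in L$. As $\mathbf{c}$ is in the relative interior of $K$, there exists a point $(\mathbf{c}'', c'') \in L$ such that $\mathbf{c}''$ lies on the line containing $\mathbf{c}$ and $\mathbf{c}'$ so that $\mathbf{c}$ is in the interior of the straight segment from $\mathbf{c}''$ to $\mathbf{c}'$, that is
\[
\mathbf{c} = t\mathbf{c}'' + (1-t)\mathbf{c}'
\]
for some $0 < t < 1$. This implies that
\[
\begin{aligned}
t(\mathbf{c}'', c'') + (1-t)(\mathbf{c}', c'_{*}) &= \big(\mathbf{c},\ t c'' + (1-t) c'_{*}\big) \in L,\\
t(\mathbf{c}'', c'') + (1-t)(\mathbf{c}', {c'}^{*}) &= \big(\mathbf{c},\ t c'' + (1-t) {c'}^{*}\big) \in L,
\end{aligned}
\]
but as $c'_{*} < {c'}^{*}$ and $0 < t < 1$, this contradicts the fact that $c_* = c^*$. Consequently, $L$ is a convex set that intersects any vertical line in at most one point, i.e., $L$ is contained in a non-vertical hyperplane through the origin and (8.38) holds for some $(a_1, \ldots, a_n) \in \mathbb{R}^n$.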
This means that for any $g \in H$
\[
\mu_{n+1} g - (a_1 \mu_1 g + \ldots + a_n \mu_n g) = \int g\,\big(f_{n+1} - (a_1 f_1 + \ldots + a_n f_n)\big)\,d\nu = 0,
\]
that is, $f_{n+1} = \sum_{j=1}^n a_j f_j$ $\nu$–a.s. Choosing $g^* \in H$ with $(\Lambda g^*, \mu_{n+1} g^*) = (c, c^*)$ and setting $g_* = g^*$ we have that (8.36) and (8.37) hold vacuously.
Case $c_* < c^*$: Choose $g_*, g^* \in H$ so that $(\Lambda g_*, \mu_{n+1} g_*) = (c, c_*)$ and $(\Lambda g^*, \mu_{n+1} g^*) = (c, c^*)$.
In particular, (8.39) holds for any $g \in H$ that takes the value 1 on $\{f_{n+1} - (a_1 f_1 + \ldots + a_n f_n) > 0\}$ and 0 on $\{f_{n+1} - (a_1 f_1 + \ldots + a_n f_n) < 0\}$. This implies that $g^*$ satisfies the desired conditions. Similarly, (8.40) holds for any $g \in H$ taking the value 1 on $\{f_{n+1} - (b_1 f_1 + \ldots + b_n f_n) < 0\}$ and 0 on $\{f_{n+1} - (b_1 f_1 + \ldots + b_n f_n) > 0\}$, which implies that $g_*$ satisfies the desired conditions.
Corollary 8.8.6. Suppose $\mu_1, \ldots, \mu_n, \mu_{n+1}$ are probability measures on $(\Omega, \mathcal{F})$. Assume $\mu_j \ll \nu$ for some σ–finite measure $\nu$ on $(\Omega, \mathcal{F})$ and all $j = 1, \ldots, n+1$. Let $0 < \alpha < 1$ and let $g_*$ and $g^*$ be the solutions to
\[
\begin{aligned}
g_* &= \arg\min\{\mu_{n+1} g : g \in H,\ \mu_j g = \alpha,\ 1 \le j \le n\},\\
g^* &= \arg\max\{\mu_{n+1} g : g \in H,\ \mu_j g = \alpha,\ 1 \le j \le n\}.
\end{aligned}
\tag{8.41}
\]
Then, either $\mu_{n+1} g_* < \alpha < \mu_{n+1} g^*$ or $\mu_{n+1} = a_1 \mu_1 + \ldots + a_n \mu_n$ for some $(a_1, \ldots, a_n) \in [0,1]^n$ with $a_1 + \ldots + a_n = 1$.
Proof. Without loss of generality, we may assume that $\mu_1, \ldots, \mu_n$ are linearly independent. We proceed by induction. If $n = 1$ then, as $\alpha = \alpha\mu_1(\Omega) + (1-\alpha)\mu_1(\emptyset) \in (0,1)$, it follows that $\alpha$ is an interior point of $K = \{\mu_1 g : g \in H\}$ and so the solutions $g_*$ and $g^*$ to (8.41) satisfy $\mu_2 g_* < \mu_2 g^*$ unless $\mu_2 = \mu_1$. When $\mu_2 \neq \mu_1$, it follows from Theorem 8.8.4 that $\mu_2 g_* < \mu_2 \tau = \alpha < \mu_2 g^*$. This proves the statement for $n = 1$. Suppose that the statement of the Corollary holds for $1, \ldots, n$. Then, for each $j = 1, \ldots, n$ there exist $g_{*j}$ and $g^*_j$ such that $\mu_n g_{*j} < \alpha < \mu_n g^*_j$. It follows that the point $\alpha = (\alpha, \ldots, \alpha) \in \mathbb{R}^n$ is an interior point of $K = \{(\mu_1 g, \ldots, \mu_n g) : g \in H\}$. Consequently, by Theorem 8.8.5 the solutions $g_*$ and $g^*$ to (8.41) satisfy $\mu_{n+1} g_* < \mu_{n+1} g^*$ unless $\mu_{n+1}$ is a convex combination of $\mu_1, \ldots, \mu_n$. If $\mu_{n+1}$ is not in the convex hull of $\{\mu_1, \ldots, \mu_n\}$, it follows from Theorem 8.8.4 that $\mu_{n+1} g_* < \mu_{n+1} \tau = \alpha < \mu_{n+1} g^*$. This concludes the proof by induction.
The following application of the Lyapunov convexity theorem shows the existence of
consensus partitions for nonatomic finite measures.
Theorem 8.8.7. (Dubins–Spanier) Let $\mu_1, \ldots, \mu_m$ be nonatomic signed measures of finite variation on a measurable space $(\Omega, \mathcal{F})$. Given $\alpha_1, \ldots, \alpha_n \ge 0$ with $\sum_{j=1}^n \alpha_j = 1$, there is a measurable partition $\{A_1, \ldots, A_n\}$ of $\Omega$ such that $\mu_i(A_j) = \alpha_j \mu_i(\Omega)$ for all $i = 1, \ldots, m$, $j = 1, \ldots, n$.
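Proof. As $(1-\alpha)\mu_i(\emptyset) + \alpha\mu_i(\Omega) = \alpha\mu_i(\Omega)$ for all $0 \le \alpha \le 1$, Lyapunov’s convexity theorem[(ii)] implies that there exists a measurable set $A_1 \subset \Omega$ such that
\[
\mu_i(A_1) = \alpha_1 \mu_i(\Omega), \qquad i = 1, \ldots, m.
\]
Similarly, there exists a measurable set $A_2 \subset \Omega \setminus A_1$ such that
\[
\mu_i(A_2) = \frac{\alpha_2}{\alpha_2 + \ldots + \alpha_n}\,\mu_i(\Omega \setminus A_1) = \alpha_2 \mu_i(\Omega), \qquad i = 1, \ldots, m,
\]
where we interpret $\alpha_2/(\alpha_2 + \ldots + \alpha_n) = 0$ if $\alpha_2 = \ldots = \alpha_n = 0$. Continuing this way, for any $j = 1, \ldots, n-1$ there is a measurable set $A_j \subset \Omega \setminus \bigcup_{\ell=1}^{j-1} A_\ell$ such that
\[
\mu_i(A_j) = \alpha_j \mu_i(\Omega), \qquad i = 1, \ldots, m.
\]
Let $A_n := \Omega \setminus \bigcup_{j=1}^{n-1} A_j$. Then
\[
\mu_i(A_n) = (1 - \alpha_1 - \ldots - \alpha_{n-1})\,\mu_i(\Omega) = \alpha_n \mu_i(\Omega)
\]
for all $i = 1, \ldots, m$. Thus $\{A_j : j = 1, \ldots, n\}$ is the desired partition.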
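A concrete instance may help to visualize the Dubins–Spanier theorem. The sketch below is only an illustration under assumed data, not part of the proof: it takes Ω = [0, 1] with the two probability measures µ1 = Lebesgue measure and µ2(dx) = 2x dx (both chosen here for convenience), and uses the fact that any set symmetric about 1/2 has the same mass under both. The helper names `consensus_partition`, `mu1` and `mu2` are hypothetical.

```python
from fractions import Fraction

def consensus_partition(alphas):
    """Pieces A_j = {x : r_{j-1} <= |x - 1/2| < r_j}, each given as two intervals."""
    pieces, r_prev = [], Fraction(0)
    half = Fraction(1, 2)
    for a in alphas:
        r = r_prev + Fraction(a) / 2
        # two intervals placed symmetrically about 1/2
        pieces.append([(half - r, half - r_prev), (half + r_prev, half + r)])
        r_prev = r
    return pieces

def mu1(intervals):
    """Lebesgue measure of a finite union of disjoint intervals."""
    return sum(hi - lo for lo, hi in intervals)

def mu2(intervals):
    """Measure with density 2x: the mass of (lo, hi) is hi^2 - lo^2."""
    return sum(hi * hi - lo * lo for lo, hi in intervals)

alphas = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
parts = consensus_partition(alphas)
shares = [(mu1(p), mu2(p)) for p in parts]
```

Each pair in `shares` reports the mass that the two measures assign to one piece; exact rational arithmetic avoids rounding issues. Endpoints are irrelevant because both measures are nonatomic.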
8.9. Exercises
Exercise 8.9.1. Suppose f is a differentiable function in (a, b). Show that f is convex
if and only if f ′ is nondecreasing. In that case, αf = βf = f ′ . If in addition f is twice
differentiable, show that f is convex if and only if f ′′ (x) ≥ 0 for all a < x < b.
Exercise 8.9.2. (Young’s inequality) Suppose that $g : [0, \infty) \to [0, \infty)$ is continuous and strictly increasing with $g(0) = 0$. Let $h = g^{-1}$ be the inverse of $g$. Define $\Phi(x) = \int_0^x g(u)\,du$ and $\Psi(y) = \int_0^y h(u)\,du$. Show that
\[
ab \le \Phi(a) + \Psi(b), \qquad a, b \ge 0.
\]
(Hint: Plot a graph of $g$ and compare the area of a rectangle with sides $a$ and $b$ with the areas under the graphs of $g$ and $h$.)
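Before attempting the proof, the inequality can be checked numerically in the classical special case g(u) = u^(p−1) with p > 1 (an assumed choice made here for illustration), for which Φ(a) = a^p/p and Ψ(b) = b^q/q with q = p/(p−1). The function name `young_gap` is hypothetical.

```python
def young_gap(a, b, p):
    """Phi(a) + Psi(b) - a*b for g(u) = u**(p-1), with p > 1."""
    q = p / (p - 1)      # conjugate exponent, 1/p + 1/q = 1
    phi = a ** p / p     # integral of u**(p-1) over [0, a]
    psi = b ** q / q     # integral of the inverse u**(1/(p-1)) over [0, b]
    return phi + psi - a * b

# Evaluate the gap on a small grid of (a, b) and several exponents.
gaps = [
    young_gap(a / 4, b / 4, p)
    for a in range(13)
    for b in range(13)
    for p in (1.5, 2.0, 3.0)
]
min_gap = min(gaps)

# On the graph of g, that is b = g(a), the two areas fill the rectangle exactly.
equality_gap = young_gap(2.0, 2.0 ** 2, 3.0)
```

The gap is nonnegative on the whole grid and vanishes (up to rounding) when b = g(a), which is the situation suggested by the picture in the hint.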
Exercise 8.9.3. (a) Given a function $\varphi : (a, b) \to (0, \infty)$, show that if $x \mapsto \log(\varphi(x))$ is convex, then so is $\varphi$. (b) Given a function $\psi : (0, \infty) \to \mathbb{R}$, show that $\psi$ is convex iff the function $\psi^*(x) = x\psi\big(\tfrac{1}{x}\big)$ is convex.
Exercise 8.9.4. The following inequality is a slight generalization of Hölder’s inequality. Let $f_j \in \mathbb{R}^{\Omega}$ and $p_j \in \mathbb{R}_+$ ($j = 1, \ldots, n$) with $\sum_j \frac{1}{p_j} = 1$. Show that
\[
\|f_1 \cdots f_n\|_1 \le \|f_1\|_{p_1} \cdots \|f_n\|_{p_n}.
\]
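As a sanity check of the generalized Hölder inequality, the following sketch evaluates both sides on a three-point measure space. The weights, the functions and the exponents (2, 3, 6) are arbitrary illustrative choices, and `lp_norm` is a hypothetical helper.

```python
def lp_norm(values, weights, p):
    """L_p norm of a function on a finite measure space with point masses `weights`."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

weights = [0.2, 0.5, 0.3]                                   # a measure on three points
fs = [[1.0, -2.0, 0.5], [3.0, 1.0, -1.0], [0.5, 2.0, 4.0]]  # f_1, f_2, f_3
ps = [2.0, 3.0, 6.0]                                        # 1/2 + 1/3 + 1/6 = 1

product = [fs[0][i] * fs[1][i] * fs[2][i] for i in range(3)]
lhs = lp_norm(product, weights, 1.0)   # L_1 norm of f_1 f_2 f_3

rhs = 1.0
for f, p in zip(fs, ps):
    rhs *= lp_norm(f, weights, p)      # product of the L_{p_j} norms
```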
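If $\alpha \ge 1$, show that
\[
\begin{aligned}
\int |f^{\alpha} - g^{\alpha}|^{p/\alpha}\,d\mu
&\le \alpha^{p/\alpha} \int (f \vee g)^{\frac{\alpha-1}{\alpha}p}\,|f - g|^{p/\alpha}\,d\mu\\
&\le \alpha^{p/\alpha} \Big(\int (f \vee g)^{p}\,d\mu\Big)^{1-\frac{1}{\alpha}} \Big(\int |f - g|^{p}\,d\mu\Big)^{\frac{1}{\alpha}}\\
&\le \alpha^{p/\alpha} \Big(\int (f + g)^{p}\,d\mu\Big)^{1-\frac{1}{\alpha}} \Big(\int |f - g|^{p}\,d\mu\Big)^{\frac{1}{\alpha}}.
\end{aligned}
\]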
Exercise 8.9.16. Show that $\|0\|_0 = 0$ and $\|1_A\|_0 = 1 \wedge \mu(A)$ for all $A \in \mathcal{M}$.
Exercise 8.9.20. Suppose that fn converges to f in Lp for some 1 ≤ p < ∞. Show that
fn converges to f in measure.
Exercise 8.9.21. Consider the space $L_\infty([0,1], \mathcal{B}([0,1]), \lambda)$. Show that there is a bounded linear functional $\Lambda \neq 0$ on $L_\infty$ that vanishes on $C([0,1])$. Conclude that there is no $g \in L_1([0,1], \mathcal{B}([0,1]), \lambda)$ such that $\Lambda f = \int_{[0,1]} f g\,d\lambda$ for all $f \in L_\infty$. Thus, $(L_\infty)^* \neq L_1$.
Exercise 8.9.22. Let $\mathcal{A}$ be the collection of all subsets $A$ of $[0,1]$ such that $A$ or $[0,1] \setminus A$ is at most countable. This is a σ–algebra. Let $\mu$ be the counting measure on $\mathcal{A}$. Show that $f \in L_1(\mu)$ iff $C(f) := \{f \neq 0\}$ is countable and $\sum_{x \in C(f)} |f(x)| < \infty$. Let $g(x) = x$ for all $x \in [0,1]$. Show that $g$ is not $\mathcal{A}$–measurable; however, $fg \in L_1(\mu)$ whenever $f \in L_1(\mu)$. Show that the linear functional $\Lambda : L_1(\mu) \to \mathbb{R}$,
\[
\Lambda(f) := \int_{[0,1]} f(x) g(x)\,\mu(dx) = \sum_{x \in C(f)} x f(x),
\]
is continuous. Conclude that $(L_1(\mu))^* \neq L_\infty(\mu)$ in this situation. (Observe that $\mu$ is not σ–finite.)
Exercise 8.9.23. Suppose $(\Omega, \mathcal{M}, P)$ is a probability space. Show that for any $0 \le p \le \infty$, the dimension of the vector space $L_p$ is given by
\[
\dim(L_p) = \max\Big\{ n \in \mathbb{Z}_+ : \exists A_1, \ldots, A_n \in \mathcal{M} \text{ disjoint},\ \Omega = \bigcup_{j=1}^n A_j,\ P[A_j] > 0 \Big\}.
\]
(Hint: If $\{A_n\}$ is a finite partition of $\Omega$ with $0 < P(A_n) < 1$ then $\{1_{A_n}\}$ is a linearly independent set in $L_p$ for all $0 \le p \le \infty$.)
Exercise 8.9.24. Suppose that $\mu(\Omega) < \infty$. For any measurable functions $f$ and $g$ with values in a separable metric space $(S, \rho)$ define
\[
\alpha(f, g) := \|\rho(f, g)\|_0 = \inf\{\varepsilon > 0 : \mu(\rho(f, g) > \varepsilon) \le \varepsilon\}. \tag{8.42}
\]
Show that
(a) $\alpha$ defines a metric on $M(S)$.
(b) $f_n$ converges to $f$ in measure if and only if $\lim_n \alpha(f_n, f) = 0$.
(Hint: Show that $\frac{D_F(f,g)}{\mu(\Omega)+1} \le \alpha(f, g) \le \sqrt{D_F(f, g)}$, where $F(t) = t \wedge 1$.)
Exercise 8.9.29. Suppose $\mu_1, \ldots, \mu_{n+1}$ are signed measures on $(\Omega, \mathcal{F})$ of finite total variation, and let $H := \{g \in M_b(\Omega) : 0 \le g \le 1\}$. Define $\Lambda : g \mapsto (\mu_1 g, \ldots, \mu_n g)$ on $H$ and set $K := \Lambda(H)$. Suppose $g^* \in H$ and $\Lambda g^* = c$ for some $c \in K$. If
\[
\begin{aligned}
g^*(x) &= 1 \quad \text{when } f_{n+1}(x) > a_1 f_1(x) + \ldots + a_n f_n(x),\\
g^*(x) &= 0 \quad \text{when } f_{n+1}(x) < a_1 f_1(x) + \ldots + a_n f_n(x),
\end{aligned}
\]
with $a_j \ge 0$ for all $j = 1, \ldots, n$, show that $g^* = \arg\max\{\mu_{n+1} g : g \in H,\ \mu_j g \le c_j,\ j = 1, \ldots, n\}$.
Exercise 8.9.30. Let $\nu$ be a σ–finite measure on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$. Suppose $f$ is a probability density w.r.t. $\nu$ and that $\int |t| f(t)\,\nu(dt) < \infty$. For any $0 < \alpha < 1$ show that $\big(\alpha,\ \alpha \int t f(t)\,\nu(dt)\big)$ is an interior point of the compact convex set
\[
K = \Big\{ \Big( \int g(t) f(t)\,\nu(dt),\ \int g(t)\,t f(t)\,\nu(dt) \Big) : 0 \le g \le 1,\ g \in L_\infty(\nu) \Big\}.
\]
(Hint: Set $\mu_1(dt) = f(t)\,\nu(dt)$ and $\mu_2(dt) = t f(t)\,\nu(dt)$. Apply Theorem 8.8.3[(i)] to show that $\alpha$ is an interior point of the image of $g \mapsto \mu_1 g$, $g \in H$. Use Theorems 8.8.4, 8.8.5 and comparison with $g(t) \equiv \alpha$.)
Chapter 9
Finite product of
elementary integrals
is a well defined elementary integral. Indeed, if φ is of the form (9.1), then for each x ∈ X,
φx is a function in EY . So we can apply mY to φx and
\[
\int \phi_x(y)\,m_Y(dy) = m_Y(\phi_x) = \sum_{j=1}^N \phi^X_j(x)\,m_Y(\psi^Y_j)
\]
is independent of the representation (9.1). Thus, the map $x \mapsto \int \phi_x(y)\,m_Y(dy) = m_Y(\phi_x)$ is a well defined function in $E_X$, and so we can apply $m_X$ to it and obtain
\[
m_X\Big( \sum_{j=1}^N \phi^X_j\,m_Y(\psi^Y_j) \Big) = \sum_{j=1}^N m_X(\phi^X_j)\,m_Y(\psi^Y_j) = m(\phi).
\]
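Hence
\[
\|\phi\|^{\flat} \le \sum_{j=1}^N \|\phi^X_j\|^*_{m_X}\,\|\phi^Y_j\|^*_{m_Y} < \infty
\]
and
\[
\begin{aligned}
|m(\phi)| = \Big| \int\!\!\int \phi(x, y)\,m_Y(dy)\,m_X(dx) \Big|
&\le \int^* \Big| \int \phi(x, y)\,m_Y(dy) \Big|\,m_X(dx)\\
&\le \int^*\!\!\int |\phi(x, y)|\,m_Y(dy)\,m_X(dx) = \|\phi\|^{\flat}.
\end{aligned}
\]
Equality holds if $\phi \ge 0$. Absolute homogeneity and solidity of $\|\cdot\|^{\flat}$ follow directly from the absolute homogeneity and solidity of $\|\cdot\|^*_{m_X}$ and $\|\cdot\|^*_{m_Y}$.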
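The subadditivity of $m^*_X$ and $m^*_Y$ implies that for any pair of functions $f, g \in \mathbb{R}^{X \times Y}$,
\[
\|f + g\|^{\flat} = \int^*\!\!\int^* |f + g|\,dm_Y\,dm_X \le \int^* \Big( \int^* |f|\,dm_Y + \int^* |g|\,dm_Y \Big)\,dm_X \le \|f\|^{\flat} + \|g\|^{\flat}.
\]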
We claim that $\|\cdot\|^{\flat}$ is continuous along increasing sequences, that is, $\sup_n \|f_n\|^{\flat} = \|\sup_n f_n\|^{\flat}$ whenever $0 \le f_n \nearrow f := \sup_n f_n$. Indeed, for any $x \in X$, $0 \le (f_n)_x \nearrow f_x$. By Theorem 7.6.7, $\int^* f_n(x, y)\,m_Y(dy)$ increases to $\int^* f(x, y)\,m_Y(dy)$. By the same token, $\int^*\!\int^* f_n(x, y)\,m_Y(dy)\,m_X(dx)$ increases to $\int^*\!\int^* f(x, y)\,m_Y(dy)\,m_X(dx)$ and the claim follows.
Continuity along nonnegative increasing sequences, combined with subadditivity, implies that $\|\cdot\|^{\flat}$ is countably subadditive.
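Suppose $(\phi_n : n \in \mathbb{N}) \subset E_+$. Then
\[
\Big\| \sum_{j=1}^N \phi_j \Big\|^{\flat} = m\Big( \sum_{j=1}^N \phi_j \Big) = \sum_{j=1}^N m(\phi_j) = \sum_{j=1}^N \|\phi_j\|^{\flat}.
\]
Hence, $\sup_N \big\| \sum_{j=1}^N \phi_j \big\|^{\flat} < \infty$ implies that $m(\phi_j) = \|\phi_j\|^{\flat} \to 0$ as $j \to \infty$. Therefore, $\|\cdot\|^{\flat}$ is a mean for $E$.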
Now that we have a mean k k♭ that dominates the elementary integral m, we can extend
m uniquely to L1 (k k♭ ) as in Theorem 6.5.1 so that all the good properties of integration
such as linearity and dominated convergence hold.
Theorem 9.1.3. If $f \in L_1(\|\cdot\|^{\flat})$, then:
(i) For $\|\cdot\|^*_{m_X}$–a.a. $x \in X$, the function $f_x \in L_1(\|\cdot\|_{m_Y})$.
(ii) The $\|\cdot\|^*_{m_X}$–a.s. defined function $F : x \mapsto \int f(x, y)\,m_Y(dy)$ is $\|\cdot\|^*_{m_X}$–integrable.
(iii) $\int F(x)\,m_X(dx) = \int f\,dm$, that is,
\[
\int f\,dm = \int F(x)\,m_X(dx) = \int\!\!\int f(x, y)\,m_Y(dy)\,m_X(dx).
\]
Proof. First notice that if $\|g\|^{\flat} = 0$, then the function $G : x \mapsto \int^* |g|(x, y)\,m_Y(dy)$ is defined $\|\cdot\|^*_{m_X}$–a.s. and $\|G\|^*_{m_X} = 0$. Consequently, for $\|\cdot\|^*_{m_X}$–almost all $x \in X$, the map $g_x$ is $\|\cdot\|^*_{m_Y}$–negligible.
is k k∗mX –negligible, that is, k1N1 k∗mX = 0. Again, by (9.2), the set
\[
N_2 = \Big\{ x \in X : \int^* \sum_n |\phi^{(n)}(x, y)|\,m_Y(dy) = \infty \Big\}
\]
is $\|\cdot\|^*_{m_X}$–negligible.
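Let $s^{(n)} = \sum_{k=1}^n \phi^{(k)}$ and $\Phi = \sum_n |\phi^{(n)}|$. For all $x \in X \setminus (N_1 \cup N_2)$, the sequence $(s^{(n)}_x) \subset E_Y$ converges to $f_x$ $\|\cdot\|^*_{m_Y}$–a.s. and $\|\Phi_x\|_{m_Y} < \infty$. Since $|s^{(n)}| \le \Phi$ for all $n \in \mathbb{N}$, $f_x$ is $\|\cdot\|_{m_Y}$–integrable and $I_n(x) = \int s^{(n)}(x, y)\,m_Y(dy) \to \int f(x, y)\,m_Y(dy) = F(x)$. This shows that (i) holds. Clearly $(I_n) \subset E_X$,
\[
|I_n(x)| \le \int \sum_{k=1}^n |\phi^{(k)}(x, y)|\,m_Y(dy) \le \int \Phi(x, y)\,m_Y(dy),
\]
and
\[
\|\Phi\|^{\flat} = \int\!\!\int \Phi(x, y)\,m_Y(dy)\,m_X(dx) \le \sum_n \|\phi^{(n)}\|^{\flat} < \infty.
\]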
Fubini’s theorem on its own is not useful unless we know beforehand that the function of interest is already integrable in the product mean. The following result states conditions for integrability in terms of measurability and iterated integration.
Theorem 9.2.3. (Fubini–Tonelli) Suppose $f \in M_{\mathbb{R}}(\|\cdot\|^*_m)$ is σ–finite. Then $f \in L_1(\|\cdot\|^*_m)$ iff one of the iterated upper integrals
\[
\int^*\!\!\int^* |f(x, y)|\,m_Y(dy)\,m_X(dx), \quad \text{or} \quad \int^*\!\!\int^* |f(x, y)|\,m_X(dx)\,m_Y(dy)
\]
is finite. In either case, both integrals coincide and are equal to $\|f\|^*_m$, and (9.3) holds.
(Sufficiency) Let $\|\cdot\|^{\flat}$ be an iterated mean and assume $\|f\|^{\flat} < \infty$. We will show that $f 1_A \in L_1(\|\cdot\|^*_m)$ for any $m$–integrable set $A \subset X \times Y$. Indeed, there is a sequence of pairwise disjoint $m$–integrable sets $A_n \subset A$ and a sequence of functions $(\phi_n) \subset E^{u}$ such that $f 1_{A_n} = \phi_n 1_{A_n}$ and $\|A \setminus \bigcup_n A_n\|^*_m = 0$. Let $g_n = \sum_{k=1}^n \phi_k 1_{A_k}$ and $G_n = \sum_{k=1}^n |\phi_k| 1_{A_k}$. Then $|g_n| = G_n \le |f| 1_A$ $m$–a.s. The sequence $\{G_n : n \in \mathbb{N}\} \subset L_1(\|\cdot\|^*_m) \subset L_1(\|\cdot\|^{\flat})$ increases $m$–a.s., and hence $\|\cdot\|^{\flat}$–a.s., to $|f| 1_A$ and $\|f 1_A\|^{\flat} \le \|f\|^{\flat} < \infty$; hence $\|G_n - |f| 1_A\|^{\flat} = \|g_n - f 1_A\|^{\flat} \to 0$ by dominated convergence. This means that $\sup_n \|G_n\|^{\flat} = \|f 1_A\|^{\flat}$ and, as $\|\cdot\|^*_m$ is a maximal mean,
\[
\|f 1_A\|^{\flat} = \sup_n \|G_n\|^{\flat} = \sup_n \|G_n\|^*_m = \|f 1_A\|^*_m.
\]
By dominated convergence we get that $\|g_n - f 1_A\|^*_m \to 0$ and $f 1_A \in L_1(\|\cdot\|^*_m)$.
To conclude the proof, let $(B_n)$ be a sequence of pairwise disjoint $m$–integrable sets such that $\bigcup_n B_n = \{f \neq 0\}$. It follows that each $f 1_{B_n} \in L_1(\|\cdot\|^*_m) \subset L_1(\|\cdot\|^{\flat})$. Since $|\sum_{k=1}^n f 1_{B_k}| \nearrow |f|$ and $\|f\|^{\flat} < \infty$, $f \in L_1(\|\cdot\|^{\flat})$ and $\|f - \sum_{k=1}^n f 1_{B_k}\|^{\flat} \to 0$. The same argument used to prove the claim above shows that $\|f - \sum_{k=1}^n f 1_{B_k}\|^*_m \to 0$ and that $f \in L_1(\|\cdot\|^*_m)$.
Corollary 9.2.4. Let f ∈ RX and g ∈ RY .
(i) If kf k∗mX = kgk∗mY = 0, then kf gk∗m = 0.
Proof. (i) Suppose $f$ is $m_X$–negligible and $g$ is $m_Y$–negligible. For any $\varepsilon > 0$ there are $h_X \in (E_X)^{\uparrow}_+$ and $h_Y \in (E_Y)^{\uparrow}_+$ with $|f| \le h_X$ and $|g| \le h_Y$ such that $\|h_X\|^*_{m_X} < \varepsilon$ and $\|h_Y\|^*_{m_Y} < \varepsilon$. Since $h_X h_Y \in (E_X \otimes E_Y)^{\uparrow}_+$, $\|h_X h_Y\|^*_m = \|h_X h_Y\|^{\flat} = \|h_X\|^*_{m_X} \|h_Y\|^*_{m_Y} < \varepsilon^2$. Consequently, by solidity, $\|f g\|^*_m = 0$.
Suppose $f \in L_1(m_X)$ and $g \in L_1(m_Y)$. There are sequences $(\phi^X_n) \subset E_X$ and $(\phi^Y_n) \subset E_Y$ such that $\phi^X_n \to f$ in $L_1(m_X)$ and $m_X$–a.s. and $\phi^Y_n \to g$ in $L_1(m_Y)$ and $m_Y$–a.s. By (i), $\phi^X_n \phi^Y_n \to f g$ $m_X \otimes m_Y$–a.s. and
\[
\lim_n \|\phi^X_n \phi^Y_n\|^*_m = \lim_n \|\phi^X_n \phi^Y_n\|^{\flat} = \|f\|^*_{m_X} \|g\|^*_{m_Y} < \infty.
\]
By Daniell–Fatou’s lemma, $f g \in L_1(m)$.
Suppose $f$ is $m_X$–measurable and $g$ is $m_Y$–measurable. Then it is clear that $f g$ is measurable on any integrable box, that is, on any set of the form $A_X \times A_Y$ where $A_X \in L_1(m_X)$ and $A_Y \in L_1(m_Y)$. We claim that any integrable set $A \in L_1(m)$ is $m$–a.s. contained in a countable union of integrable boxes. Observe that if $\phi = \sum_{j=1}^N \phi^X_j \phi^Y_j$, where $\phi^X_j \in E_X$ and $\phi^Y_j \in E_Y$, then
\[
\{\phi \neq 0\} \subset \bigcup_{j=1}^N \{\phi^X_j \neq 0\} \times \{\phi^Y_j \neq 0\}.
\]
Given two positive Radon measures (C00 (X), mX ) and (C00 (Y ), mY ), where X and Y
are l.c.H spaces, the product mX ⊗ mY constructed from C00 (X) ⊗ C00 (Y ) defines a Radon
measure on C00 (X × Y ).
Theorem 9.2.6. Suppose $(C_{00}(X), m_X)$ and $(C_{00}(Y), m_Y)$ are positive Radon measures on locally compact Hausdorff spaces $X$ and $Y$. If $f \in C_{00}(X \times Y)$, then
(i) $f \in L_1(m_X \otimes m_Y)$, and the maps
\[
F(x) = \int_Y f(x, y)\,m_Y(dy), \qquad G(y) = \int_X f(x, y)\,m_X(dx)
\]
are continuous of compact support in $X$ and $Y$ respectively.
If g ∈ L1 (mX ⊗ mY ), then
(ii) Eg := {g 6= 0} is mX ⊗ mY –a.s. σ–compact.
Proof. The Stone–Weierstrass theorem 5.3.17 implies that C00 (X × Y ) ⊂ C00 (X) ⊗ C00 (Y );
thus, C00 (X × Y ) ⊂ M (mX ⊗ mY ).
(i) For f ∈ C00 (X × Y ), let U ⊂ X and V ⊂ Y be open relatively compact sets such that
πX (supp(f )) × πY (supp(f )) ⊂ U × V where πX and πY are the projections onto X and
Y respectively. By Urysohn’s lemma, there are φ ∈ C00 (X) and ψ ∈ C00 (Y ) such that
πX (supp(f )) ≺ φ ≺ U and πY (supp(f )) ≺ ψ ≺ V . Hence, 1supp(f ) (x, y) ≺ φX (x)ψY (y) for
all (x, y) ∈ X × Y . The integrability of f follows from Fubini–Tonelli’s theorem.
It is clear that $F : x \mapsto \int_Y f(x, y)\,m_Y(dy)$ is supported in $U$. Fix $x_0 \in X$ and let $\varepsilon > 0$. For any $y \in Y$ there are neighborhoods $U_y$ of $x_0$ and $V_y$ of $y$ such that
\[
|f(x, z) - f(x_0, y)| < \varepsilon, \qquad (x, z) \in U_y \times V_y.
\]
Let $\{V_{y_k} : k = 1, \ldots, N\}$ be a finite subcover of $\pi_Y(\mathrm{supp}(f))$ and set $W = \bigcap_{k=1}^N U_{y_k}$. Then
\[
|f(x, y) - f(x_0, y)| \le \varepsilon\,(\phi(x_0) + \phi(x))\,\psi(y), \qquad x \in W,\ y \in Y.
\]
Consequently
\[
|F(x) - F(x_0)| = \Big| \int_Y \big(f(x, y) - f(x_0, y)\big)\,m_Y(dy) \Big| \le \varepsilon\,|\phi(x_0) + \phi(x)| \int_Y |\psi(y)|\,m_Y(dy).
\]
This proves that $F \in C_{00}(X)$. A similar argument shows that $G \in C_{00}(Y)$.
Example 9.2.8. Fubini’s theorem implies that when $(C_{00}(X), m_X)$ and $(C_{00}(Y), m_Y)$ are positive Radon measures and $f \in L_1(X \times Y, m_X \otimes m_Y)$, the maps $x \mapsto \int_Y f_x\,dm_Y$ and $y \mapsto \int_X f^y\,dm_X$ are $m_X$–measurable and $m_Y$–measurable respectively. However, they may fail to be Borel measurable. As before, consider the l.c.H spaces $X = \mathbb{R}$ with the usual topology and $Y = \mathbb{R}$ with the discrete topology, and let $\Delta$ be the diagonal in $X \times Y$. The atomic measure $\delta_0$ and the counting measure $\#$ are Radon measures on $X$ and $Y$ respectively. Let $A \subset \mathbb{R}$ be a non–Borel set containing 0. The set $\Delta_A := \Delta \cap (X \times A)$ is a Borel set in $X \times Y$ and $\delta_0 \otimes \#(\Delta_A) = 1$. It follows that $1_{\Delta_A} \in L_1(X \times Y, \delta_0 \otimes \#)$; however, $1_A(x) = \int_Y (1_{\Delta_A})_x(y)\,\#(dy)$ is not a Borel function on $X$.
Proof. Without loss of generality we can assume that $f \ge 0$. The case $p = 1$ is a restatement of Fubini’s theorem. Suppose that $p > 1$ and let $H(x) = \int_Y f(x, y)\,\nu(dy)$. From Fubini’s theorem and then Hölder’s inequality we obtain
\[
\begin{aligned}
\|H\|^p_{L_p(\mu)} &= \int_X \int_Y f(x, y)\,\nu(dy)\,H^{p-1}(x)\,\mu(dx) = \int_Y \int_X f(x, y) H^{p-1}(x)\,\mu(dx)\,\nu(dy)\\
&\le \int_Y \Big( \int_X |f(x, y)|^p\,\mu(dx) \Big)^{1/p} \|H\|^{p-1}_{L_p(\mu)}\,\nu(dy),
\end{aligned}
\]
and the conclusion follows immediately if $\|H\|_p < \infty$. If $\|H\|_p = \infty$, choose monotone sequences of sets $A_n \subset X$ and $B_n \subset Y$ such that $\mu(A_n) \vee \nu(B_n) < \infty$, and for any $k \in \mathbb{N}$ define $f_k = f \wedge k$. Then
\[
\Big( \int_{A_n} \Big( \int_{B_m} f_k(x, y)\,\nu(dy) \Big)^p \mu(dx) \Big)^{1/p} \le \int_{B_m} \Big( \int_{A_n} |f_k(x, y)|^p\,\mu(dx) \Big)^{1/p} \nu(dy).
\]
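The inequality treated in this proof (Minkowski’s integral inequality) can be tested on finite measure spaces, where both sides are finite sums. The sketch below is an illustration only; the grid values, the weights and the function name `minkowski_sides` are made up for the example.

```python
def minkowski_sides(f, mu, nu, p):
    """Both sides of the inequality for f[i][j] >= 0 on a finite grid.

    mu[i] is the mass of the i-th point of X, nu[j] the mass of the j-th point of Y.
    """
    # H(x_i) = integral of f(x_i, y) with respect to nu
    inner = [sum(n * f[i][j] for j, n in enumerate(nu)) for i in range(len(mu))]
    lhs = sum(m * h ** p for m, h in zip(mu, inner)) ** (1.0 / p)
    # integral over y of the L_p(mu) norm of x -> f(x, y)
    rhs = sum(
        n * sum(m * f[i][j] ** p for i, m in enumerate(mu)) ** (1.0 / p)
        for j, n in enumerate(nu)
    )
    return lhs, rhs

f = [[1.0, 0.5, 2.0], [0.0, 3.0, 1.0]]
mu = [0.4, 1.1]
nu = [0.3, 0.2, 0.5]
results = [minkowski_sides(f, mu, nu, p) for p in (1.0, 1.5, 2.0, 4.0)]
```

For p = 1 the two sides agree, which is the Fubini case mentioned at the start of the proof; for p > 1 the left side is at most the right side.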
Suppose EX ⊂ Bb (X) is a ring lattice closed under chipping and let µ be a positive
σ–finite elementary integral on EX . Let ν be any Radon (Borel) elementary integral on the
Borel measurable space ([0, ∞), B([0, ∞))). From Corollary 9.2.4, it follows that for any
measurable function f : X → [0, ∞], the set
E = {(x, t) ∈ X × [0, ∞) : f (x) > t}
is measurable for the product ν ⊗ µ, since the function (x, t) ∈ X × [0, ∞) ↦ f (x) − t is
measurable.
Theorem 9.3.3. Let $\nu$ be a Radon measure (Borel measure) on the half line $[0, \infty)$. If $f \in L_1^+(\mu)$ then
\[
\int_X \nu\big([0, f(x))\big)\,\mu(dx) = \int_0^\infty \mu(\{f > t\})\,\nu(dt). \tag{9.6}
\]
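Formula (9.6) can be verified exactly when µ is a finite sum of point masses. In the sketch below, which is an illustration under assumed data, ν(dt) = p t^(p−1) dt, so that ν([0, s)) = s^p and the left side is the integral of f^p; the right side is computed level by level, since t ↦ µ(f > t) is a step function. The name `layer_cake_sides` is hypothetical.

```python
def layer_cake_sides(values, weights, p):
    """Both sides of (9.6) for mu = sum of weights[i] * delta_i and nu(dt) = p t^(p-1) dt."""
    # Left side: nu([0, f(x))) = f(x)**p integrated against mu.
    lhs = sum(w * v ** p for v, w in zip(values, weights))
    # Right side: mu(f > t) is constant for t between consecutive values of f.
    rhs, prev = 0.0, 0.0
    for level in sorted(set(values)):
        tail = sum(w for v, w in zip(values, weights) if v >= level)  # mu(f > t), prev <= t < level
        rhs += tail * (level ** p - prev ** p)                        # nu([prev, level))
        prev = level
    return lhs, rhs

values = [0.5, 2.0, 2.0, 3.5]     # values of f at four points
weights = [1.0, 0.25, 0.5, 2.0]   # masses of those points under mu
sides = [layer_cake_sides(values, weights, p) for p in (1.0, 2.5)]
```

With p = 1 the measure ν is Lebesgue measure and (9.6) reduces to the familiar identity expressing the integral of f as the integral of its distribution function.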
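Proof. Statement (i) clearly holds for sets of the form $A \times B$ with $A \in \mathcal{A}$ and $B \in \mathcal{B}$. For each $x \in X$ and $y \in Y$, consider the collections $\mathcal{D}^y = \{D \in \mathcal{A} \otimes \mathcal{B} : D^y \in \mathcal{A}\}$ and $\mathcal{D}_x = \{D \in \mathcal{A} \otimes \mathcal{B} : D_x \in \mathcal{B}\}$. It is easy to check that if $E \subset F \subset X \times Y$ and $A_n \subset X \times Y$ for $n \in \mathbb{Z}_+$, then
\[
(F \setminus E)_x = F_x \setminus E_x, \qquad \Big( \bigcup_n A_n \Big)_x = \bigcup_n (A_n)_x.
\]
Similar results hold for the corresponding $y$–sections. From these observations, it follows that $\mathcal{D}_x$ and $\mathcal{D}^y$ are both d–systems containing the π–system $\{A \times B : A \in \mathcal{A}, B \in \mathcal{B}\}$. Therefore, $\mathcal{D}_x = \mathcal{A} \otimes \mathcal{B} = \mathcal{D}^y$.
Statement (ii) follows from noticing that $(f^{-1}(B))_x = (f_x)^{-1}(B)$ and $(f^{-1}(B))^y = (f^y)^{-1}(B)$ hold for any $B \subset \mathbb{R}$.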
Theorem 9.4.2. Let (X × Y, A ⊗ B) be the product space of the measurable spaces (X, A)
and (Y, B). For any C ∈ A ⊗ B, the collection of sections {Cx : x ∈ X} has at most the
cardinality of the continuum. In particular, if ∆ = {(x, x) : x ∈ X} ∈ A ⊗ A, then X has
at most the cardinality of the continuum.
Indeed, (9.8) holds for each An × Bn ∈ S , and the collection D of subsets of X × Y for
which (9.8) holds is a σ–algebra. Therefore, there is a one–to–one map between the different
sections {Cx : x ∈ X} and the different sequences {(1An (x))n∈Z+ : x ∈ X} ⊂ {0, 1}Z+ .
The last statement follows from the fact that ∆x = {x} for each x ∈ X.
Theorem 9.4.3. Let (X, τX ) and (Y, τY ) be two topological spaces and let τX×Y be the
product topology on X × Y . If BX , BY and BX×Y are the corresponding Borel σ–algebras,
then BX ⊗ BY ⊂ BX×Y . Equality holds if both X and Y are second countable.
Proof. Since the integral extension of n to L1 (k k∗n ) is linear and positive, nG is linear and
positive. For any sequence (φk ) ⊂ E such that φk ց 0 we have that (φk ◦ G) ⊂ L1 (k k∗n ),
φk ◦ G ց 0. Therefore, by monotone convergence, nG (φk ) = n(φk ◦ G) = kφk ◦ Gk∗n → 0.
This shows that nG is a positive σ–continuous elementary integral on E2 .
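The properties of the Daniell mean $\|\cdot\|^*_n$ imply that the functional $\|\cdot\|^{\flat} : f \mapsto \|f \circ G\|^*_n$ on $\mathbb{R}^{\Omega_2}$ is a mean for $E_2$ which coincides with the Daniell mean $\|\cdot\|^*_{n_G}$ on $E_2$. By maximality, $\|\cdot\|^{\flat} = \|\cdot\|^*_{n_G}$ on $L_1(\|\cdot\|^*_{n_G})$. Hence
\[
\|\phi \circ G - f \circ G\|^*_n = \|\phi - f\|^*_{n_G}
\]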
for any f ∈ L1 (k k∗nG ) and φ ∈ E2 . Consequently f ∈ L1 (k k∗nG ) iff f ◦ G ∈ L1 (k k∗n ).
Therefore, if (φk ) ⊂ E2 converges to f in L1 (k k∗nG ), then n(f ◦ G) = limk n(φk ◦ G) =
limk nG (φk ) = nG (f ).
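Suppose $f \in M_{\mathbb{R}}(E_2, \|\cdot\|^*_{n_G})$. Then, for any $B \in L_1(\|\cdot\|^*_{n_G})$ and $\varepsilon > 0$, there are $\psi \in E_2^{u}$ and $B_0 \subset B$, $B_0 \in L_1(\|\cdot\|^*_{n_G})$, such that $\|B \setminus B_0\|^*_{n_G} < \frac{\varepsilon}{2}$ and $f 1_{B_0} = \psi 1_{B_0}$. As $(\psi 1_{B_0}) \circ G \in L_1(\|\cdot\|^*_n)$, there is a function $\varphi \in E_1^{u}$ and a $\|\cdot\|^*_n$–integrable set $A \subset G^{-1}(B_0)$ with $\|G^{-1}(B_0) \setminus A\|^*_n < \frac{\varepsilon}{2}$ on which $(\psi 1_{B_0}) \circ G = \varphi$. Thus $\|G^{-1}(B) \setminus A\|^*_n < \varepsilon$ and $(f \circ G) 1_A = \varphi 1_A$.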
The last assertion follows from the second statement of the Theorem, the identity
\[
G^{-1}\Big( \bigcup_n B_n \Big) = \bigcup_n G^{-1}(B_n)
\]
9.6.1. Vitali’s covering theorem. We start by discussing two technical results about
coverings of sets in Rn by closed balls. These results will be used in our proof of the change
of variable theorem and in the proof of the equivalence of Lebesgue’s measure and Hausdorff’s
measure H n on Rn .
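Lemma 9.6.1. (Vitali’s covering Lemma.) Let $(X, d)$ be a separable metric space and $\mathcal{B}$ a collection of closed balls. For any $B \in \mathcal{B}$, let $\hat{B}$ be the concentric closed ball with $\mathrm{diam}(\hat{B}) = 5\,\mathrm{diam}(B)$.
(i) If
(a) $\mathrm{diam}(B) > 0$ for all $B \in \mathcal{B}$
(b) $D := \sup_{B \in \mathcal{B}} \mathrm{diam}(B) < \infty$
then, there exists a countable collection $\mathcal{G} \subset \mathcal{B}$ of pairwise disjoint sets such that for any $B \in \mathcal{B}$, there is $B' \in \mathcal{G}$ satisfying $B \cap B' \neq \emptyset$ and $B \subset \hat{B'}$.
Suppose $\emptyset \neq A \subset \bigcup \mathcal{B}$. (ii) In addition to (a) and (b), if
(c) $\inf\{\mathrm{diam}(B) : x \in B, B \in \mathcal{B}\} = 0$ for any $x \in A$
then, there exists a countable collection $\mathcal{G}$ of pairwise disjoint balls in $\mathcal{B}$ such that for any finite collection $\{B_1, \ldots, B_m\} \subset \mathcal{B}$,
\[
A \setminus \bigcup_{k=1}^m B_k \subset \bigcup_{B \in \mathcal{G} \setminus \{B_1, \ldots, B_m\}} \hat{B}. \tag{9.9}
\]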
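For a finite family of balls, the selection in part (i) can be carried out by a greedy procedure: scan the balls from largest to smallest and keep a ball whenever it is disjoint from those already kept. The sketch below illustrates this on the real line; it is a simplification for finite families and not the argument needed for the general lemma, and the names `vitali_select` and `covered_by_enlargement` are hypothetical.

```python
import random

def vitali_select(balls):
    """Greedy selection from closed intervals [c - r, c + r], given as (c, r) with r > 0."""
    chosen = []
    for c, r in sorted(balls, key=lambda b: -b[1]):        # largest radius first
        # two closed balls are disjoint iff the centers are farther apart than r + r2
        if all(abs(c - c2) > r + r2 for c2, r2 in chosen):
            chosen.append((c, r))
    return chosen

def covered_by_enlargement(ball, chosen, factor=5.0):
    """True if `ball` meets some chosen ball B' and lies inside B' enlarged by `factor`."""
    c, r = ball
    return any(
        abs(c - c2) <= r + r2 and abs(c - c2) + r <= factor * r2
        for c2, r2 in chosen
    )

rng = random.Random(0)
balls = [(rng.uniform(0, 10), rng.uniform(0.05, 1.0)) for _ in range(200)]
chosen = vitali_select(balls)
all_covered = all(covered_by_enlargement(b, chosen) for b in balls)
```

A rejected ball meets a previously chosen ball of radius at least its own, so it is contained in the concentric ball of three times that radius; the factor 5 of the lemma therefore holds with room to spare in the finite case.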
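Proceeding by induction, suppose pairwise disjoint balls $B_1, \ldots, B_{n_k}$ in $\mathcal{B}$ have been chosen so that
\[
\lambda_d\Big( U \setminus \bigcup_{j=1}^{n_k} B_j \Big) < \theta^k \lambda_d(U).
\]
Let $\mathcal{B}_k$ be the set of all closed balls in $\mathcal{B}$ that are contained in the open set $U_k := U \setminus \bigcup_{j=1}^{n_k} B_j$. Clearly $\mathcal{B}_k$ covers $U_k$. Applying the same argument to $U_k$ in place of $U$, we obtain disjoint balls $B_{n_k+1}, \ldots, B_{n_{k+1}}$ in $\mathcal{B}_k$ such that
\[
\lambda_d\Big( U \setminus \bigcup_{j=1}^{n_{k+1}} B_j \Big) = \lambda_d\Big( U_k \setminus \bigcup_{j=n_k+1}^{n_{k+1}} B_j \Big) < \theta \lambda_d(U_k) < \theta^{k+1} \lambda_d(U).
\]
The collection $\{B_n : n \in \mathbb{N}\}$ of all such balls is pairwise disjoint and, by letting $k \to \infty$, $\lambda_d(U \setminus \bigcup_n B_n) = 0$.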
Lemma 9.6.5. If $G$ is differentiable at the point $x \in \Omega$ then, for any $\varepsilon > 0$ there exists $\delta > 0$ such that whenever $0 < r \le \delta$,
\[
\lambda^*\big( G(B(x; r)) \big) \le \big( |J_G(x)| + \varepsilon \big)\,\lambda\big( B(x; r) \big)
\]
where $\lambda^*$ is the outer measure (or the Daniell–mean) associated with the Lebesgue measure on $\mathbb{R}^n$.
Proof. First we consider the case when $\det(T) = 0$. In this case, $T(\mathbb{R}^n)$ is a linear subspace of dimension $m := \mathrm{rank}(T) < n$. Given $\varepsilon > 0$ we will determine a small number $\epsilon_1 > 0$ and a corresponding $\delta > 0$ for which (9.10) holds. For any $0 < r \le \delta$, all points of $G(B(x; r))$ lie within a distance $\epsilon_1 r$ of $B(G(x); \|T\| r) \cap \{G(x) + T v : v \in \mathbb{R}^n\}$. Hence, $G(B(x; r))$ is contained in a box with $m$ sides of length $2(\|T\| + \epsilon_1) r$ and $n - m$ sides of length $2 \epsilon_1 r$. Consequently
\[
\lambda^*\big( G(B(x; r)) \big) \le 2^n (\|T\| + \epsilon_1)^m \epsilon_1^{\,n-m} r^n = c_n (\|T\| + \epsilon_1)^m \epsilon_1^{\,n-m}\,\lambda(B(x; r))
\]
where $c_n$ is a parameter that depends only on the dimension $n$. It is enough to choose $\epsilon_1 > 0$ small enough so that $c_n (\|T\| + \epsilon_1)^m \epsilon_1^{\,n-m} < \varepsilon$.
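We now assume that $\det(T) \neq 0$. For any $\varepsilon > 0$ we will choose a small $\epsilon_1 > 0$ and a corresponding $\delta > 0$ so that (9.10) holds. So, if $0 < r \le \delta$ then
\[
\frac{|T^{-1} G(y) - T^{-1} G(x)|}{|y - x|} \le 1 + \epsilon_1 \|T^{-1}\|
\]
for all $y \in B(x; r)$. This means that $T^{-1} G\big(B(x; r)\big) \subset B\big(T^{-1} G(x); (1 + \epsilon_1 \|T^{-1}\|) r\big)$. Therefore
\[
\lambda^*\big( T^{-1} G(B(x; r)) \big) \le \big(1 + \epsilon_1 \|T^{-1}\|\big)^n \lambda\big( B(T^{-1} G(x); r) \big) = \big(1 + \epsilon_1 \|T^{-1}\|\big)^n \lambda\big( B(x; r) \big).
\]
By Theorem 9.6.3[ii], $\lambda^*\big( T^{-1} G(B(x; r)) \big) = |\det(T^{-1})|\,\lambda^*\big( G(B(x; r)) \big)$, and so
\[
\lambda^*\big( G(B(x; r)) \big) \le |\det(T)|\,\big(1 + \epsilon_1 \|T^{-1}\|\big)^n \lambda\big( B(x; r) \big).
\]
It suffices to choose $\epsilon_1 > 0$ small enough so that $|\det(T)|\,(1 + \epsilon_1 \|T^{-1}\|)^n < |\det(T)| + \varepsilon$.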
Remark 9.6.7. The last statement with $J_G(x) = 0$ for all $x \in E$ is a special version of Sard’s theorem where domain and range are of the same dimension. A point $y \in G(\Omega)$ is a critical value if there is $x \in \Omega$ such that $y = G(x)$, $G$ is differentiable at $x$ and $J_G(x) = 0$. Sard’s theorem states that the set of critical values of a function $G$ is Lebesgue negligible.
Proof. We first consider the case where $E$ is bounded. By the outer regularity of Lebesgue measure, for any $\varepsilon > 0$ there is an open set $U$ such that $E \subset U$ and $\lambda(U) < \lambda^*(E) + \varepsilon$. By Lemma 9.6.5, for each $x \in E$ there is $\delta_x > 0$ such that for all $0 < r \le \delta_x$,
\[
\lambda^*\big( G(B(x; r)) \big) \le (M + \varepsilon)\,\lambda(B(x; r)).
\]
The family $\mathcal{B}$ of closed balls $B(x; r)$ where $x \in E$, $0 < r \le \frac{\min(\delta_x, 1)}{5}$ satisfies the conditions of Vitali’s covering lemma. Hence, there exists a sequence $\mathcal{G} = \{B_k : k \in \mathbb{N}\}$ of pairwise disjoint balls in $\mathcal{B}$ such that
\[
E \subset \bigcup_{j=1}^k B_j \cup \bigcup_{j=k+1}^\infty \hat{B}_j
\]
Proof. First consider $f = 1_B$ where $B$ is a Borel set. The continuity of $G$ implies its Borel measurability, and so $f \circ G = 1_{G^{-1}(B)}$ is Borel measurable. Applying Theorem 9.6.8 with $E = G^{-1}(B)$ and noticing that $G(E) = G(\Omega) \cap B$ leads to
\[
\int_{G(\Omega)} 1_B(y)\,dy = \lambda(G(\Omega) \cap B) \le \int_{G^{-1}(B)} |J_G(x)|\,dx = \int_{\Omega} (1_B \circ G)(x)\,|J_G(x)|\,dx.
\]
By linearity (9.13) holds for nonnegative simple functions and by monotone convergence the conclusion extends to all nonnegative Borel functions.
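Proof. If $f \in M_{\mathbb{R}}(G(\Omega))$ then there is $\hat{f} \in \mathcal{B}(G(\Omega))$ such that $\lambda(\{f \neq \hat{f}\}) = 0$. Applying Theorem 9.6.8 to the function $G^{-1}$ and the set $\{f \neq \hat{f}\}$ gives
\[
\lambda\big( \{f \circ G \neq \hat{f} \circ G\} \big) = \lambda\big( G^{-1}(\{f \neq \hat{f}\}) \big) = 0.
\]
Conversely, if $f \circ G \in M_{\mathbb{R}}(\Omega)$, there exists $h \in \mathcal{B}(\Omega)$ such that $\lambda(\{f \circ G \neq h\}) = 0$. The continuity of $G$ implies that $h \circ G^{-1} \in \mathcal{B}(G(\Omega))$. Applying Theorem 9.6.8 to the function $G$ and the set $\{f \circ G \neq h\}$ implies that
\[
\lambda\big( \{f \neq h \circ G^{-1}\} \big) = \lambda\big( G(\{f \circ G \neq h\}) \big) = 0.
\]
This argument shows that it is enough to consider Borel measurability in proving (i)–(iii).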
Set $\Omega_2 := G(\Omega)$ so that $G^{-1}(\Omega_2) = \Omega$. For any $g \in \mathcal{B}_+(\Omega_2)$, another application of Theorem 9.6.9 with $G^{-1}$ in place of $G$ and $\Omega_2$ in place of $\Omega$ gives
\[
\int_{G^{-1}(\Omega_2)} g(x)\,dx \le \int_{\Omega_2} (g \circ G^{-1})(y)\,|J_{G^{-1}}(y)|\,dy. \tag{9.16}
\]
In particular, we consider $g(x) := (f \circ G)(x)\,|J_G(x)|$. The inverse function theorem shows that $J_G(x) \neq 0$ for all $x \in \Omega$ and that $J_{G^{-1}}(y) = \big( J_G(G^{-1}(y)) \big)^{-1}$. Hence, inequality (9.16) reduces to
\[
\int_{\Omega} (f \circ G)(x)\,|J_G(x)|\,dx \le \int_{G(\Omega)} f(y)\,|J_G(G^{-1}(y))|\,|J_G(G^{-1}(y))|^{-1}\,dy = \int_{G(\Omega)} f(y)\,dy. \tag{9.17}
\]
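To see this, we apply Fubini’s theorem and the change of variable $(x, y) \mapsto (x, x + y)$:
\[
\begin{aligned}
\Gamma(a)\Gamma(b) &= \int_{(0,\infty)^2} x^{a-1} e^{-x} y^{b-1} e^{-y}\,dx\,dy\\
&= \int_0^\infty \int_u^\infty e^{-v} u^{a-1} (v - u)^{b-1}\,dv\,du\\
&= \int_0^\infty \int_0^v e^{-v} u^{a-1} (v - u)^{b-1}\,du\,dv.
\end{aligned}
\]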
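Substituting u = vs in the inner integral of the last line turns it into v^(a+b−1) times the Beta integral of s^(a−1)(1−s)^(b−1) over (0, 1), which is how a computation of this kind usually leads to Γ(a)Γ(b) = B(a, b)Γ(a + b). Assuming that this is the identity under discussion, the sketch below checks it numerically for a = 2, b = 3 with a plain midpoint rule; `beta_midpoint` is a hypothetical helper.

```python
import math

def beta_midpoint(a, b, n=20000):
    """Midpoint-rule approximation of the integral of s**(a-1) * (1-s)**(b-1) over (0, 1)."""
    h = 1.0 / n
    return h * sum(
        ((k + 0.5) * h) ** (a - 1) * (1.0 - (k + 0.5) * h) ** (b - 1)
        for k in range(n)
    )

a, b = 2.0, 3.0
lhs = math.gamma(a) * math.gamma(b)                 # Gamma(2) * Gamma(3)
rhs = beta_midpoint(a, b) * math.gamma(a + b)       # B(2, 3) * Gamma(5)
```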
Setting $G(t) := (t_1/t_n, \ldots, t_{n-1}/t_n, t_n)$ we obtain that $|J_G(t)| = t_n^{-(n-1)}$, and $D_{n-1} := G(T(\mathbb{R}^n_+)) = \{v \in \mathbb{R}^n_+ : v_1, \ldots, v_n > 0,\ v_1 + \ldots + v_{n-1} < 1\}$. Hence
\[
I = \Big( \int_{D_{n-1}} v_1^{a_1 - 1} \cdots v_{n-1}^{a_{n-1} - 1} \big(1 - (v_1 + \ldots + v_{n-1})\big)^{a_n - 1}\,dv_1 \ldots dv_{n-1} \Big) \int_0^\infty f(v)\,v^{\alpha - 1}\,dv.
\]
It follows that
\[
\int_{\mathbb{R}^n_+} f(x_1 + \ldots + x_n)\,x_1^{a_1 - 1} \cdots x_n^{a_n - 1}\,dx = B(a_1, \ldots, a_n) \int_0^\infty f(s)\,s^{a_1 + \ldots + a_n - 1}\,ds.
\]
Example 9.6.13. (Order Statistics) Let $\nu(dx) = f(x)\,\lambda_1(dx)$ be a probability measure and $\nu_n(dx) = f(x_1) \cdots f(x_n)\,\lambda_n(dx)$ the product measure. The map $T : x = (x_1, \ldots, x_n) \mapsto (x_{(1)}, \ldots, x_{(n)})$, where $x_{(k)}$ is the $k$–th smallest element in $x$, is called the $n$–order statistic map. If $B = \{(x_{(1)}, \ldots, x_{(n)}) : x_{(1)} < \ldots < x_{(n)}\}$, then for each permutation $\sigma$ on $\{1, \ldots, n\}$, there is a one to one linear map $P_\sigma : B \to \mathbb{R}^n$ such that $T \circ P_\sigma = I_n$, the identity on $B$. It is easy to check that $\lambda_n\big(\mathbb{R}^n \setminus \bigcup_{\sigma \in \Sigma_n} P_\sigma(B)\big) = 0$. The maps $P_\sigma$ are represented by matrices with exactly one 1 in each row and each column and $\det(P_\sigma) = \pm 1$. Then, $\lambda_n \circ T^{-1} \ll \lambda_n$ and
\[
f_T(x_{(1)}, \ldots, x_{(n)}) = \sum_{\sigma \in \Sigma_n} |\det(P_\sigma)|\,f(x_{\sigma(1)}) \cdots f(x_{\sigma(n)}) = n!\,f(x_1) \cdots f(x_n).
\]
9.6.4. Non invertible smooth functions. In this section assume that $\Omega \subset \mathbb{R}^n$ is open and that $G : \Omega \to \mathbb{R}^n$ is continuously differentiable on $\Omega$. Let $H^0$ be the counting measure on $\mathbb{R}^n$ and, for any set $E \subset \Omega$, define the map $h_E : G(\Omega) \to \mathbb{R}_+$ as
\[
h_E(y) = H^0\big( E \cap G^{-1}(y) \big) = \sum_{x \in G^{-1}(\{y\})} 1_E(x).
\]
Theorem 9.6.14. For any Lebesgue measurable set E ⊂ Ω, hE ∈ MRn (G(Ω), λ) and
\[
\int_{\mathbb{R}^n} h_E(y)\,dy = \int_E |J_G(x)|\,dx. \tag{9.18}
\]
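A one-dimensional example makes the role of the multiplicity function h_E in (9.18) concrete. The sketch below assumes, purely for illustration, G(x) = x² on Ω = E = (−1, 2): points y in (0, 1) have two preimages in E, points in (1, 4) have one, and both sides of (9.18) equal 5. The helper names `multiplicity` and `midpoint` are hypothetical.

```python
import math

def multiplicity(y, lo=-1.0, hi=2.0):
    """h_E(y): number of x in E = (lo, hi) with x * x == y (the single point y = 0 is ignored)."""
    if y <= 0:
        return 0
    root = math.sqrt(y)
    return sum(1 for x in (-root, root) if lo < x < hi)

def midpoint(func, a, b, n=40000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

lhs = midpoint(multiplicity, 0.0, 4.0)             # integral of h_E over G(E)
rhs = midpoint(lambda x: abs(2 * x), -1.0, 2.0)    # integral of |J_G| over E
```

Without the multiplicity the left side would be the length of G(E), namely 4, which undercounts the part of E that is folded onto (0, 1).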
Proof. (1) First assume that $E \subset \Omega$ is open and that $E \cap \{J_G = 0\} = \emptyset$. The inverse function theorem implies that for each $x \in E$ there is $\delta_x > 0$ such that, if $0 < r \le \delta_x$ then, $B(x; r) \subset E$ and $G$ is a $C^1$–diffeomorphism from $B(x; r)$ onto the open set $G(B(x; r))$. By Theorem 9.6.10
\[
\lambda\big( G(B(x; r)) \big) = \int_{B(x; r)} |J_G(z)|\,dz.
\]
The collection $\mathcal{B}$ of all closed balls $B(x; r)$ with $x \in E$ and $0 < r \le \min(\delta_x, 1)$ satisfies the conditions of Vitali’s covering theorem. Hence, there exists a pairwise disjoint sequence $\{B_k : k \in \mathbb{N}\} \subset \mathcal{B}$ such that $N := E \setminus \bigcup_k B_k$ is Lebesgue negligible. Consequently
\[
\int_{\mathbb{R}^n} \sum_k 1_{G(B_k)}\,d\lambda = \sum_k \int_{B_k} |J_G|\,d\lambda = \int_{\bigcup_k B_k} |J_G|\,d\lambda = \int_E |J_G|\,d\lambda.
\]
Since $G$ is one to one on each $B_k$, $\sum_k 1_{G(B_k)} = h_{\bigcup_k B_k} \le h_E$. It is easy to check that
\[
\{h_{\bigcup_k B_k} < h_E\} \subset G\Big( E \setminus \bigcup_k B_k \Big) = G(N).
\]
Thus it suffices to show that $h_{E_1} = h_E$ a.s. to see that (9.18) holds. It is clear that $h_{E_1} \le h_E$. From
\[
\{h_{E_1} < h_E\} \subset G(E \setminus E_1) = G\big( E \cap \{J_G = 0\} \big)
\]
we conclude that $h_{E_1} = h_E$ a.s.
The compactness of U and the continuity of JG imply that the right hand side of (9.19)
and (9.20) are finite. Hence hU − hU \E = hE a.s. and, by subtracting (9.20) from (9.19),
we obtain (9.18).
Since $h_{\bigcup_j K_j} \le h_E$ and $\{h_{\bigcup_j K_j} < h_E\} \subset G(N)$, $h_E = h_{\bigcup_j K_j}$ a.s. and (9.18) holds.
Proof. We prove (9.21) for Borel sets first. Let B ⊂ Rn be a Borel set and set E = G−1 (B).
Then,
Proof. By Theorem 9.6.14, (9.22) holds for Lebesgue sets $E \subset \Omega$. Notice that for any nonnegative function $\varphi$ and any $y \in G(\Omega)$
\[
\int_{G^{-1}(\{y\})} \varphi(t)\,H^0(dt) = \sum_{x \in G^{-1}(\{y\})} \varphi(x).
\]
Then, by linearity, (9.22) extends to nonnegative Lebesgue simple functions. Finally, by monotone convergence arguments, (9.22) extends to nonnegative Lebesgue measurable functions.
Proof. It is enough to consider the case where $f$ is $C^\infty(B(0; 1))$. To see that this is the case, suppose the result holds for all continuous functions which are $C^\infty$ on $B(0; 1)$. Let $\varepsilon > 0$. By the Stone–Weierstrass theorem there are polynomials $P_1(x), \ldots, P_n(x)$ such that $\|f - P\|_u = \sup_{x \in B(0;1)} |f(x) - P(x)|_2 < \varepsilon$, where $P = (P_1, \ldots, P_n)^\top$. Setting $P_\varepsilon := \frac{1}{1+\varepsilon} P$ we obtain that $\|P_\varepsilon\|_u \le 1$ and
\[
\|f - P_\varepsilon\|_u \le \frac{1}{1+\varepsilon} \big( \|f - P\|_u + \varepsilon \|f\|_u \big) < 2\varepsilon.
\]
As $P_\varepsilon \in C^\infty(\mathbb{R}^n)$, there exists $x_\varepsilon \in B(0; 1)$ such that $P_\varepsilon(x_\varepsilon) = x_\varepsilon$. Hence $|f(x_\varepsilon) - x_\varepsilon| < 2\varepsilon$. By compactness, there is a sequence $\varepsilon_n \to 0$ such that $x_{\varepsilon_n} \to x^*$ for some $x^* \in B(0; 1)$. By continuity it follows that $f(x^*) = x^*$.
We will assume that the statement is false and will reach a contradiction. Suppose $f : \overline{B}(0; 1) \to \overline{B}(0; 1)$ is continuous on $\overline{B}(0; 1)$, of class $C^\infty$ on $B(0; 1)$ and such that $f(x) \neq x$ for all $x \in \overline{B}(0; 1)$. Then, for each $x \in \overline{B}(0; 1)$ the equation
\[
F(\tau, x) := |\tau f(x) + (1 - \tau) x|^2 - 1 = 0
\]
has exactly two solutions,
\[
\tau_\pm(x) = \frac{-\langle x, f(x) - x \rangle \pm \sqrt{\langle x, f(x) - x \rangle^2 + |x - f(x)|^2 (1 - |x|^2)}}{|x - f(x)|^2}.
\]
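The formula for τ± is the quadratic formula applied to |x + τ(f(x) − x)|² = 1, viewed as a quadratic equation in τ. The sketch below checks it at one sample point; the values chosen for x and f(x) are hypothetical (any pair with |x| < 1, |f(x)| ≤ 1 and f(x) ≠ x would do), and `tau_pm` and `F` are illustrative helper names.

```python
import math

def tau_pm(x, fx):
    """Roots (tau_minus, tau_plus) of |tau * f(x) + (1 - tau) * x|^2 = 1."""
    d = [fi - xi for fi, xi in zip(fx, x)]      # f(x) - x
    xd = sum(xi * di for xi, di in zip(x, d))   # <x, f(x) - x>
    dd = sum(di * di for di in d)               # |f(x) - x|^2
    xx = sum(xi * xi for xi in x)               # |x|^2
    disc = math.sqrt(xd * xd + dd * (1.0 - xx))
    return (-xd - disc) / dd, (-xd + disc) / dd

def F(tau, x, fx):
    """F(tau, x) = |tau * f(x) + (1 - tau) * x|^2 - 1."""
    return sum((tau * fi + (1.0 - tau) * xi) ** 2 for fi, xi in zip(fx, x)) - 1.0

x = (0.3, -0.4)     # a point in the open unit disc
fx = (0.5, 0.2)     # hypothetical value of f(x) in the closed unit disc, f(x) != x
tau_minus, tau_plus = tau_pm(x, fx)
```

Geometrically, τ− and τ+ locate the two points where the line through x and f(x) meets the unit sphere; for |x| < 1 the root τ− is negative, so the corresponding point lies on the side of x opposite to f(x).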
By assumption x 6= f (x) for all x ∈ B(0; 1); thus, as B(0; 1) is compact, inf x∈B(0;1) |f (x) −
x| > 0. This implies that
(a) hx, f (x) − xi < 0 for all x ∈ B(0; 1).
(b) τ− ∈ C(B(0; 1)).
(c) τ− (x) = 0 whenever |x| = 1.
(d) τ− (x) < 0 whenever |x| < 1.
The expression to the right of the equality in (9.23) is, as a function of x, continuous
on B(0; 1) and strictly negative by observation (a) above. Similarly, the right hand side
of (9.24) is, as a function of x, continuous on B(0; 1). These observations, together with
the implicit function theorem, imply that τ− is C ∞ on B(0; 1) and that sup_{x∈B(0;1)} ‖τ−′ (x)‖ < ∞.
This means that for each 0 ≤ t < 1/(1 + α), the map Φt is injective on B(0; 1). Since
Φ′t (x) = I + t(G′ (x) − I) and G′ is continuous and bounded on B(0; 1), there exists 0 < δ <
1/(1 + α) such that for all 0 ≤ t < δ, Φ′t (x) is invertible for all x ∈ B(0; 1). It follows from the
inverse function theorem that each map Φt with 0 ≤ t < α is a local diffeomorphism. As
all maps Φt with t ∈ [0, α) are injective, we conclude that for each t ∈ [0, α)
(g) Φt (B(0; 1)) is an open subset of B(0; 1),
(h) Φt is a C 1 –diffeomorphism from B(0; 1) to Φt (B(0; 1)).
We claim that for all t ∈ [0, δ), Φt (B(0; 1)) = B(0; 1). Observation [(e)] states that
Φt (Sn−1 ) = Sn−1 . Hence Φt (B(0; 1)) ∩ B(0; 1) = Φt (B(0; 1)) for all 0 ≤ t < 1. This
implies that for each 0 ≤ t < α Φt (B(0; 1)) is both open and closed in B(0; 1). The fact
that B(0; 1) is connected implies that Φt (B(0; 1)) = B(0; 1) for all 0 ≤ t < δ.
Define ρ(t) := ∫_{B(0;1)} det(Φ′t(x)) dx. It is clear that ρ(t) is a polynomial in t of degree
at most n. Since Φ′0(x) = I and (t, x) ↦ Φ′t(x) is continuous on [0, 1) × B(0; 1) then
inf_{x∈B(0;1)} det(Φ′t(x)) > 0 for all 0 ≤ t < δ. By the change of variables theorem 9.6.10, for
all 0 ≤ t < δ
\[
\rho(t) = \int_{B(0;1)} \det(\Phi_t')\, d\lambda = \int_{\Phi_t(B(0;1))} d\lambda = \lambda\bigl(\Phi_t(B(0;1))\bigr) = \lambda\bigl(B(0;1)\bigr) =: \omega_n .
\]
It follows that ρ(t) = ωn for all t and so ρ(1) = ωn . However since |Φ1 (x)|2 = |G(x)|2 = 1,
it follows that (G(x))⊤ G′ (x) = 0 for all x ∈ B(0; 1). This means that det(G′ (x)) = 0 for all
x ∈ B(0; 1), and so ρ(1) = 0. This is a contradiction.
Theorem 9.7.2. There is a unique Borel measure σ on Sn−1 such that λ∗ = ρ × σ. More-
over, for any f in B+ (Rn ) or L1 (Rn ),
\[
\int_{\mathbb{R}^n} f\, d\lambda = \int_0^\infty \!\! \int_{S^{n-1}} f(ru)\, r^{n-1}\, dr\, \sigma(du) \tag{9.25}
\]
Hence λ∗ and ρ × σ coincide on the class of sets C = {(a, b] × E : 0 < a < b, E ∈ B(Sn−1)},
which is a π–system generating all Borel sets in Rn \ {0}. Therefore, from Theorem 3.5.5,
we conclude that λ∗ = ρ × σ.
Corollary 9.7.3. The measure σ on (Sn−1 , B(Sn−1 )) is invariant under orthogonal trans-
formations.
Example 9.7.4. Let a ∈ Rn be fixed and let {ei : i = 1, . . . , n} be the standard orthonormal
basis of Rn. For each i let Ti be any orthogonal map that maps a/|a| to ei. Then
\[
\frac{1}{|a|^2} \int_{S^{n-1}} (a \cdot u)^2\, \sigma(du) = \int_{S^{n-1}} (e_i \cdot u)^2\, \sigma(du).
\]
Thus,
\[
\int_{S^{n-1}} (a \cdot u)^2\, \sigma(du) = \frac{|a|^2}{n} \sum_{i=1}^{n} \int_{S^{n-1}} (e_i \cdot u)^2\, \sigma(du) = |a|^2 \omega_n .
\]
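The identity of Example 9.7.4 can be sanity-checked numerically in the plane, where S^1 is parametrized by the angle and ω_2 = π. The following is only a sketch under that assumption (n = 2); the function name and the test vector are illustrative and not part of the notes.

```python
import math


def circle_quadratic_integral(a, steps=2000):
    """Midpoint-rule approximation of the integral of (a . u)^2 over the unit circle S^1."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        # u = (cos(theta), sin(theta)) runs over S^1; sigma(du) becomes d(theta)
        total += (a[0] * math.cos(theta) + a[1] * math.sin(theta)) ** 2 * h
    return total


a = (3.0, -4.0)                            # illustrative vector
lhs = circle_quadratic_integral(a)
rhs = (a[0] ** 2 + a[1] ** 2) * math.pi    # |a|^2 * omega_2, with omega_2 = pi
```

The two numbers agree up to rounding, since the midpoint rule integrates trigonometric polynomials of low degree over a full period essentially exactly.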
Example 9.7.5. Consider the function f(x) = e^{−|x|²}. Fubini's theorem, a change to polar
coordinates, and then a change of variables u = r², gives
\[
\Bigl(\int_{\mathbb{R}} e^{-x^2}\, dx\Bigr)^{n} = \int_{\mathbb{R}^n} e^{-|x|^2}\, dx
= \sigma(S^{n-1}) \int_0^\infty e^{-r^2} r^{n-1}\, dr
= \frac{\sigma(S^{n-1})}{2} \int_0^\infty e^{-u} u^{n/2-1}\, du ,
\]
whence we conclude that σ(S^{n−1}) = 2π^{n/2}/Γ(n/2). If g = 1_{B(0;1)} in (9.26), we obtain that
where ρ ≥ 0 and (ϕ1, . . . , ϕn−1) ∈ [0, 2π] × [−π, π]^{n−2}. It is easy to check that
Φ : (0, ∞) × (0, 2π) × (0, π)^{n−2} → Rn \ ({0} × R+ × R^{n−2}) is a diffeomorphism and that
\[
|\det(\Phi')| = \rho^{\,n-1} \prod_{j=2}^{n-1} \sin^{j-1} \varphi_j .
\]
If ρ = 1, we obtain a representation of the surface area dσn−1 on S^{n−1} \ ({0} × R+ × R^{n−2})
in terms of the parameters (ϕ1, . . . , ϕn−1) ∈ (0, 2π) × (0, π)^{n−2}:
\[
\begin{aligned}
\sigma_{n-1}(d\varphi_1, \ldots, d\varphi_{n-1}) &= \sin^{n-2}\varphi_{n-1} \cdots \sin\varphi_2 \; d\varphi_1 \cdots d\varphi_{n-1} \\
&= \sin^{n-2}\varphi_{n-1} \cdot \sigma_{n-2}(d\varphi_1, \ldots, d\varphi_{n-2})\, d\varphi_{n-1} .
\end{aligned}
\]
As an application of this relation, we compute the following integral.
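\[
\begin{aligned}
\int_{\mathbb{R}^n} \frac{1}{(1 + |x|^2)^{(n+1)/2}}\, dx
&= \sigma(S^{n-1}) \int_0^\infty \frac{r^{n-1}}{(1 + r^2)^{(n+1)/2}}\, dr \\
&= \sigma(S^{n-1}) \int_0^{\pi/2} \sin^{n-1}\theta\, d\theta = \frac{\sigma(S^{n})}{2} = \frac{\pi^{(n+1)/2}}{\Gamma((n+1)/2)} .
\end{aligned}
\]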
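The last chain of equalities can be checked numerically for small n. The sketch below uses the formula σ(S^{n−1}) = 2π^{n/2}/Γ(n/2) from Example 9.7.5 and a plain midpoint rule for the sine integral; the function names and the step count are illustrative choices, not part of the notes.

```python
import math


def sphere_area(n):
    """sigma(S^{n-1}) = 2 pi^{n/2} / Gamma(n/2), as in Example 9.7.5."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)


def sin_power_integral(k, steps=20000):
    """Midpoint rule for the integral of sin^k(theta) over [0, pi/2]."""
    h = (math.pi / 2.0) / steps
    return sum(math.sin((i + 0.5) * h) ** k for i in range(steps)) * h


def left_side(n):
    # sigma(S^{n-1}) times the integral of sin^{n-1} over [0, pi/2]
    return sphere_area(n) * sin_power_integral(n - 1)


def right_side(n):
    # pi^{(n+1)/2} / Gamma((n+1)/2), which is sigma(S^n) / 2
    return math.pi ** ((n + 1) / 2.0) / math.gamma((n + 1) / 2.0)
```

For n = 1 both sides reduce to π, the integral of 1/(1 + x²) over the real line.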
To show (9.27) for general bounded sets we will use a technique known as Steiner
symmetrization, which generates from A a finite sequence of increasingly symmetric sets of
the same volume and comparable radii.
For each v ∈ S^{d−1}, we will use the notation ℓ(v) = {tv : t ∈ R} and {v}⊥ = {u ∈
R^d : v · u = 0} for the straight line through the origin parallel to v and the orthogonal
complement of v respectively. Given v ∈ S^{d−1} and x ∈ R^d, we will denote by λ_{x,v} the
measure on B(R^d) induced by the map R ∋ t ↦ x + tv, that is,
λ_{x,v}(A) = λ∗_1({t : x + tv ∈ A}), A ⊂ R^d.
The Steiner symmetrization of A with respect to v ∈ S^{d−1} is defined as
S(A; v) = {x + tv : x ∈ {v}⊥, |t| < ½ λ_{x,v}(A)}.
Geometrically, S(A; v) is constructed by bundling together line segments, each of which is
obtained by taking the intersection of A with x + ℓ(v) (x ⊥ v), squashing it to remove gaps,
and then sliding the resulting interval along x + ℓ(v) to center it at x.
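The geometric description above can be illustrated on a toy set whose slices along x + ℓ(v) are finite unions of intervals. The sketch below is an assumption-laden illustration: the set A is encoded as a dictionary from base points x to lists of disjoint intervals of the parameter t, and the function name is hypothetical.

```python
def steiner_symmetrize_slices(slices):
    """Symmetrize a set given slice by slice.

    `slices` maps a base point x (in the hyperplane orthogonal to v) to a list of
    disjoint intervals (lo, hi) of the parameter t with x + t v in A.
    Each nonempty slice is replaced by the centered interval (-L/2, L/2),
    where L is the total length of the slice.
    """
    result = {}
    for x, intervals in slices.items():
        length = sum(hi - lo for lo, hi in intervals)   # lambda_{x,v}(A)
        if length > 0:
            result[x] = (-length / 2.0, length / 2.0)   # squash gaps, then center at x
    return result


# Illustrative set: one slice with a gap, one off-center slice, one empty slice.
A = {0.0: [(1.0, 2.0), (3.0, 4.5)], 1.0: [(-1.0, 0.0)], 2.0: []}
S = steiner_symmetrize_slices(A)
```

Each slice keeps its length, which is the slice-wise reason why the symmetrization preserves Lebesgue measure.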
Remarks 9.8.1. The following observations can be checked in a straightforward manner.
(i) If A ⊂ B ⊂ Rd , then S(A; v) ⊂ S(B; v).
(ii) x + tv ∈ S(A; v) iff x − tv ∈ S(A; v).
(iii) If R is a linear unitary operator on Rd (i.e., R⊺ = R−1 ), R S(A; v) = S(R(A); Rv).
Lemma 9.8.2. Let A ∈ B(Rd ) be bounded. Then, for all v ∈ Sd−1 , S(A; v) ∈ B(Rd ),
λd(S(A; v)) = λd(A) and rad(S(A; v)) ≤ rad(A). If R is a unitary transformation of R^d
such that R(ℓ(v)) = ℓ(v) and R(A) = A, then R S(A; v) = S(A; v).
Remarks 9.8.1 also imply that the qualities and quantities under consideration (the
measurability of S(A; v) together with its Lebesgue measure and radius) are independent of
any particular choice of coordinate system. Hence, without loss of generality, we can
assume that v = e_d = [0, . . . , 0, 1]⊤. This way,
\[
\begin{aligned}
S(A; v) &= \{(\xi, t) \in \mathbb{R}^{d-1} \times \mathbb{R} : -f(\xi) < t < f(\xi)\} \\
&= \{(\xi, t) \in \mathbb{R}^{d-1} \times [0, \infty) : f(\xi) > t\} \cup \{(\xi, t) \in \mathbb{R}^{d-1} \times (-\infty, 0] : -t < f(\xi)\},
\end{aligned}
\]
where f(ξ) = ½ ∫_R 1_A((ξ, t)) λ1(dt). By Fubini–Tonelli's theorem, f is B(R^{d−1})–measurable;
hence, S(A; v) ∈ B(R^d). By Theorem 9.3.3 we have that
\[
\lambda_d(S(A; v)) = 2 \int_{\mathbb{R}^{d-1}} f(\xi)\, \lambda_{d-1}(d\xi) = \lambda_d(A).
\]
We now prove that the radius of the symmetrization of A does not exceed the radius of A.
Since S(A; v) ⊂ S(Ā; v) and rad(Ā) = rad(A), where Ā denotes the closure of A, we can assume without loss of generality
that A is compact. For any pair of points x and y in S(A; v), let ξ, τ ∈ Rd−1 and s, t ∈ R
be such that x = (ξ, s) and y = (τ, t). Define
M ± (x) = ± sup{r : (ξ, ±r) ∈ A}, M ± (y) = ± sup{r : (τ, ±r) ∈ A}.
The compactness of A implies that X ± = (ξ, M ± (x)) and Y ± = (τ, M ± (y)) are in A.
Moreover,
2|s| ≤ λ(ξ,0),v (A) ≤ M + (x) − M − (x)
2|t| ≤ λ(τ,0),v (A) ≤ M + (y) − M − (y);
therefore,
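\[
\begin{aligned}
\bigl(M^+(y) - M^-(x)\bigr) \vee \bigl(M^+(x) - M^-(y)\bigr)
&\ge \frac{1}{2}\Bigl[\bigl(M^+(y) - M^-(x)\bigr) + \bigl(M^+(x) - M^-(y)\bigr)\Bigr] \\
&= \frac{1}{2}\bigl(M^+(y) - M^-(y)\bigr) + \frac{1}{2}\bigl(M^+(x) - M^-(x)\bigr) \ge |s| + |t| .
\end{aligned}
\]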
Consequently,
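\[
\begin{aligned}
|y - x|^2 = |\tau - \xi|^2 + |t - s|^2 &\le |\tau - \xi|^2 + (|t| + |s|)^2 \\
&\le |\tau - \xi|^2 + \Bigl(\bigl(M^+(y) - M^-(x)\bigr) \vee \bigl(M^+(x) - M^-(y)\bigr)\Bigr)^2 \\
&= |Y^+ - X^-|^2 \vee |X^+ - Y^-|^2 \le 4 \operatorname{rad}^2(A);
\end{aligned}
\]
that is, rad(S(A; v)) ≤ rad(A).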
Theorem 9.8.3. The inequality (9.27) holds for any bounded A ⊂ Rd .
Proof. Since the closure Ā of A is compact and hence measurable, rad(Ā) = rad(A), and
λ∗d(A) ≤ λd(Ā), it suffices to assume that A is compact. Consider the canonical orthonormal
basis {e1, . . . , ed} of R^d and define recursively A0 = A, An = S(An−1; en) for n = 1, . . . , d.
It follows that λd(An) = λd(A) and rad(An) ≤ rad(A) for all 1 ≤ n ≤ d. The crucial part of
this construction is that by Remark 9.8.1(iii), the unitary operators Rn : x ↦ x − 2(x · en)en
satisfy Rm(An) = An for all 1 ≤ m ≤ n ≤ d. For n = d in particular, this means that
−Ad = Ad, that is, Ad is symmetric. Therefore,
\[
\lambda_d(A) = \lambda_d(A_d) \le \omega_d \operatorname{rad}(A_d)^d \le \omega_d \operatorname{rad}(A)^d .
\]
Proof. We have already shown in Section 3.4.2 that H d = ad λd for some constant ωd−1 ≤
ad ≤ dd/2 .
9.9. Laplace’s method 241
Let A ∈ B(Rd ) and let {An } be a countable cover of A by sets of diameter at most δ. Then
\[
\lambda_d(A) \le \sum_n \lambda_d^*(A_n) \le 2^{-d} \omega_d \sum_n (\operatorname{diam}(A_n))^d .
\]
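To obtain the reverse inequality we will make use of Vitali's covering theorem. Given
δ > 0, there is a countable collection of pairwise disjoint closed balls Bn with radius 0 <
rn < δ such that
\[
\lambda_d\Bigl(Q \setminus \bigcup_n B_n\Bigr) = 0 = a_d^{-1} H^d\Bigl(Q \setminus \bigcup_n B_n\Bigr).
\]
Thus,
\[
\begin{aligned}
H^d_\delta(Q) \le H^d_\delta\Bigl(\bigcup_n B_n\Bigr) &\le \sum_n H^d_\delta(B_n) \le \sum_n (\operatorname{diam}(B_n))^d \\
&= c_d \sum_n \lambda_d(B_n) = c_d\, \lambda_d\Bigl(\bigcup_n B_n\Bigr) = c_d .
\end{aligned}
\]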
Therefore, H d = cd λd .
Proof. Without loss of generality, we may assume that x0 = 0. As D²g(0) is strictly positive
definite and g is in C² near x0, there exists R > 0 small enough so that f ∈ C(B(0; R)),
g ∈ C²(B(0; R)) and
(9.29) g(x) ≥ g− + c|x|2 .
By (iii) and (iv), for s > α we have that
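\[
s^{n/2} e^{s g_-} \int_{\{x \in D : |x| > R\}} e^{-s g(x)} f(x)\, dx \le C_\alpha\, s^{n/2} e^{\alpha g_R} e^{-s(g_R - g_-)} \xrightarrow{\; s \to \infty \;} 0 \tag{9.30}
\]
Using the change of variables y = s^{1/2} x in the integral over B(0; R) leads to
\[
s^{n/2} e^{s g_-} \int_{B(0;R)} e^{-s g(x)} f(x)\, dx
= \int_{B(0;\, s^{1/2} R)} \exp\Bigl(-s\bigl(g(s^{-1/2} y) - g_-\bigr)\Bigr)\, f\bigl(s^{-1/2} y\bigr)\, dy \tag{9.31}
\]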
The continuity of f over B(0; R) together with (9.29) shows that the integrand in (9.31) is
bounded by ‖f‖_{u(B(0;R))} e^{−c|y|²} and, as s → ∞, converges pointwise to f(x0) exp(−½ y⊺Ay).
Hence, by dominated convergence, the integral in the right hand side of (9.31) converges to
the expression in the right hand side of (9.28).
Example 9.9.2. Using Laplace’s method we will derive the classical first-order asymptotic
expansion of the gamma function which is known as Stirling’s formula. Using the change
of variable y = x/s we obtain that
\[
\Gamma(s + 1) = \int_0^\infty x^{s} e^{-x}\, dx = s^{s+1} \int_0^\infty \exp(-s\, g(y))\, dy
\]
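As a numerical companion to this example, the sketch below compares Γ(s + 1) with the classical Stirling approximation √(2πs) (s/e)^s. It works on the logarithmic scale to avoid overflow; the function name is illustrative and the comparison is a sanity check, not a derivation.

```python
import math


def stirling_ratio(s):
    """Return Gamma(s+1) / (sqrt(2 pi s) (s/e)^s), computed via logarithms."""
    log_gamma = math.lgamma(s + 1.0)
    log_stirling = 0.5 * math.log(2.0 * math.pi * s) + s * (math.log(s) - 1.0)
    return math.exp(log_gamma - log_stirling)


# The ratio decreases towards 1 as s grows.
ratios = [stirling_ratio(s) for s in (10.0, 100.0, 1000.0)]
```

The ratios stay above 1 and approach it as s increases, consistent with a first-order asymptotic expansion.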
9.10. Exercises
Exercise 9.10.1. Suppose (EX, mX) and (EY, mY) are σ–finite elementary integrals, and
let ‖ ‖∗m, m = mX ⊗ mY, be the Daniell product mean. Show that m is σ–finite
and that for any set A ∈ M(‖ ‖∗m),
\[
m(A) = \int_X \int_Y 1_A(x, y)\, m_Y(dy)\, m_X(dx) = \int_Y \int_X 1_A(x, y)\, m_X(dx)\, m_Y(dy)
\]
Exercise 9.10.2. Consider X = R with the usual topology, Y = R with the discrete
topology, and let λ1 and # be the Lebesgue and counting measures on X and Y
respectively. Show that (a) the diagonal ∆ in X × Y has measure (λ1 ⊗ #)(∆) = ∞, and
(b) inf{m(K) : K ∈ K, K ⊂ ∆} = 0. (Hint: For (a), use the outer regularity of the Radon
measure λ1 ⊗ #; for (b), show that every compact subset of ∆ is a finite set.)
Exercise 9.10.4. Suppose ϕ : (0, ∞) → R is convex. Recall that ϕ∗(x) = xϕ(1/x). Show
that
ϕ(0) := lim_{x↘0} ϕ(x) = ϕ(1) − D+ϕ(1) + ∫_{(0,1]} t µϕ(dt),
ϕ∗(0) := lim_{x↘0} ϕ∗(x) = D+ϕ(1) + µϕ(1, ∞).
Exercise 9.10.5. Let k(s, t) be a complex–valued measurable function in (0, ∞)² such that
k(αs, αt) = α^{−1} k(s, t) for all α > 0. Suppose that t ↦ k(1, t) t^{−1/p} is Lebesgue integrable
in (0, ∞) for some 1 < p. Show that for every f ∈ Lp(0, ∞),
\[
(Kf)(s) = \int_0^\infty k(s, t) f(t)\, dt
\]
satisfies ‖Kf‖_p ≤ C(k, p)‖f‖_p, where C = C(k, p) is a constant depending on k and p only.
When k(s, t) = (1/s) 1_{t<s}, K is called Hardy's operator. Find C(k, p) in this case.
Exercise 9.10.6. Let (X, A, µ) = ([0, 1], B([0, 1]), λ) = (Y, B, ν). Let 0 = δ1 < δ2 < . . . <
δn → 1. Let gn(x) = an 1_{(δn, δn+1]}(x) so that ∫_{[0,1]} gn(t) dt = 1. Define the function
\[
f(x, y) = \sum_{n=1}^{\infty} \bigl(g_n(x) - g_{n+1}(x)\bigr) g_n(y)
\]
Show that ∫_{[0,1]} ∫_{[0,1]} f(x, y) dy dx = 1 ≠ 0 = ∫_{[0,1]} ∫_{[0,1]} f(x, y) dx dy. Show that
|f| ∉ L1([0, 1]², B([0, 1]²), λ2).
Exercise 9.10.7. Let f(x, y) = (x² − y²)/(x² + y²)² for (x, y) ∈ (0, 1]². Is f ∈ L1((0, 1]², λ)?
Exercise 9.10.8. If f(n, x) = e^{−nx} − 2e^{−2nx}, (n, x) ∈ N × (0, ∞), show that f ∉ L1(# ⊗ λ),
where # is the counting measure on N and λ is Lebesgue measure on (0, ∞). (Hint: Check
that ∫_0^∞ Σ_n f(n, x) dx ≠ Σ_n ∫_0^∞ f(n, x) dx.)
Exercise 9.10.9. Let X = [0, 1] = Y, F the Borel σ–algebra on [0, 1]. Let λ and ν be the
Lebesgue and counting measure respectively. Let ∆ be the diagonal in [0, 1]². Compute
the iterated integrals ∫_{[0,1]} ∫_{[0,1]} 1∆ dλ dν and ∫_{[0,1]} ∫_{[0,1]} 1∆ dν dλ. Show that 1∆ is not
σ–finite and thus, not integrable w.r.t. λ ⊗ ν.
Exercise 9.10.10. Let µ be a measure on space (Ω, F ). If f ∈ Lp (µ) for some 1 ≤ p < ∞,
show that
\[
\int_\Omega |f|^p\, d\mu = p \int_0^\infty t^{p-1} \mu(|f| > t)\, dt \tag{9.33}
\]
Show that f ∈ Lp(µ) iff Σ_{k∈Z} 2^{kp} µ(|f| > 2^k) converges.
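Formula (9.33) can be checked by hand on a finitely supported measure, where the tail t ↦ µ(|f| > t) is a step function and the right hand side integrates in closed form. The sketch below assumes such a toy setting (point masses given by a list of weights); the function names and the sample data are illustrative.

```python
def lp_integral_direct(values, weights, p):
    """Left side of (9.33): sum of |f|^p against the point masses."""
    return sum(abs(v) ** p * w for v, w in zip(values, weights))


def lp_integral_layer_cake(values, weights, p):
    """Right side of (9.33) for a finitely supported f.

    Between two consecutive levels of |f| the tail mu(|f| > t) is constant,
    so p * t^{p-1} integrates to a difference of p-th powers on each piece.
    """
    levels = sorted(set(abs(v) for v in values))
    total, prev = 0.0, 0.0
    for level in levels:
        # For prev <= t < level, {|f| > t} coincides with {|f| >= level}.
        tail = sum(w for v, w in zip(values, weights) if abs(v) >= level)
        total += (level ** p - prev ** p) * tail
        prev = level
    return total


values = [-2.0, 0.5, 3.0, 0.5]      # illustrative values of f
weights = [0.1, 0.2, 0.3, 0.4]      # illustrative point masses of mu
direct = lp_integral_direct(values, weights, 3)
layered = lp_integral_layer_cake(values, weights, 3)
```

Both computations return the same number up to rounding, which is the discrete shadow of the Fubini argument behind (9.33).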
Exercise 9.10.11. Let (X, A, µ) be a σ–finite measure space. Let f ∈ A be fixed and
assume that µ({f ≤ t}) < ∞ for all t ∈ R. Consider the collection C = {g ∈ A : 0 ≤ g ≤
1, ∫ g dµ = G}, where G > 0 is fixed. Show that the function
g∗ = 1{f <s} + c1{f =s} ,
where s = sup{t : µ(f < t) ≤ G} and c is chosen so that
G = µ({f < s}) + cµ({f = s}),
minimizes the problem
\[
I = \inf_{g \in \mathcal{C}} \int_X f(x) g(x)\, \mu(dx).
\]
Exercise 9.10.14. Show that the stretching factor 5 in Vitali’s covering Lemma can be
reduced to any factor θ > 3.
Exercise 9.10.15. Let µ be a probability measure on (R, B(R)) and define Fµ(x) :=
µ(−∞, x]. Suppose that ∫ |y| µ(dy) < ∞.
(a) Show that
\[
\Psi(x) := \int_{-\infty}^{x} F_\mu(z)\, dz = \int_{\mathbb{R}} (x - y)_+\, \mu(dy) < \infty, \qquad x \in \mathbb{R}
\]
Exercise 9.10.17. Let S and T be linear operators on Rd . If |Sx| ≤ |T x| for all x, show
that | det(S)| ≤ | det(T )|. (Hint: If det(T ) ≠ 0 then S ◦ T −1 (B(0; 1)) ⊂ B(0; 1).)
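Show that the Lebesgue measure of the set E = {x ∈ Rn+ : x_1^{b_1} + . . . + x_n^{b_n} < r} is given by
\[
\lambda_n(E) = \frac{\Gamma\bigl(\tfrac{1}{b_1} + 1\bigr) \cdots \Gamma\bigl(\tfrac{1}{b_n} + 1\bigr)}{\Gamma\bigl(\tfrac{1}{b_1} + \ldots + \tfrac{1}{b_n} + 1\bigr)}\; r^{\frac{1}{b_1} + \ldots + \frac{1}{b_n}}
\]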
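Two familiar special cases give a quick plausibility check of this formula without solving the exercise: for b = (2, 2) and r = 1 the set E is a quarter of the unit disc, and for b = (1, 1, 1) it is a simplex with legs of length r. The sketch below only evaluates the stated formula; the function name is illustrative.

```python
import math


def volume_formula(b, r):
    """Evaluate the stated formula for lambda_n(E) with exponents b = (b_1, ..., b_n)."""
    inv = [1.0 / bi for bi in b]
    numerator = math.prod(math.gamma(x + 1.0) for x in inv)
    return numerator / math.gamma(sum(inv) + 1.0) * r ** sum(inv)


quarter_disc = volume_formula([2.0, 2.0], 1.0)     # area of {x, y > 0 : x^2 + y^2 < 1}
simplex = volume_formula([1.0, 1.0, 1.0], 2.0)     # volume of {x_i > 0 : x_1 + x_2 + x_3 < 2}
```

The values agree with π/4 and 2³/6 respectively.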
Exercise 9.10.19. Show that ∫_{S^{n−1}} v · u σ(du) = 0 for any v ∈ Rn. (Hint: Let P be any
orthogonal transformation such that P v = −v.)
Exercise 9.10.20. For any a and b in Rn , show that
\[
\int_{S^{n-1}} (a \cdot u)(b \cdot u)\, \sigma(du) = (a \cdot b)\, \omega_n
\]
where ωn = σ(S^{n−1})/n is the Lebesgue measure of the unit ball B(0; 1) in Rn. (Hint: Consider
the orthogonal transformation R such that Ra = −a and R is the identity in the orthogonal
complement {a}⊥ .)
Chapter 10
In this section we will develop a theory of integration that extends the previous discussion
of positive elementary integrals to signed elementary integrals. For a given signed measure
m, under some simple technical condition, we will show that there is an optimal mean ‖ ‖
dominating m and −m. From there, using the integration theory developed for positive
elementary integrals, we extend m to L1(‖ ‖). We will show that the space of integrals has
a rich algebraic and order structure.
Example 10.1.1. The simplest examples of signed elementary integrals are those obtained
by difference of positive integrals; more precisely, if (E, m1 ) and (E, m2 ) are elementary
positive integrals, then m = m1 − m2 is a signed elementary integral on E.
248 10. Signed and Complex measures
An elementary integral (E, m) is said to be of finite variation if (10.1) holds for all ψ ∈ E+ .
The map ψ ↦ |m|(ψ) on E+ is called the variation of m.
Remark 10.1.4. As −φ ∈ E iff φ ∈ E, we have that |m|(ψ) = sup{m(φ) : φ ∈ E, |φ| ≤ ψ} =
sup{|m(φ)| : φ ∈ E, |φ| ≤ ψ}. As |ψ| = | − ψ|, we also have that |m(ψ)| = m(ψ) ∨ m(−ψ) ≤
|m|(ψ) for all ψ ∈ E+ . It is clear that m = |m| whenever m is a positive elementary integral.
Example 10.1.5. Given a measurable space (Ω, F ), let E := B(Ω, F ) be the space of
bounded real valued measurable functions with the sup norm. If Λ is a bounded linear
functional on E, then Λ is of finite variation. Indeed, for any ψ ∈ E and φ ∈ E with |φ| ≤ ψ,
\[
|\Lambda \varphi| \le \|\Lambda\| \|\varphi\|_u \le \|\Lambda\| \|\psi\|_u
\]
This shows that |Λ|(ψ) ≤ ‖Λ‖‖ψ‖_u < ∞ for all ψ ∈ E+. In particular |Λ|(1) ≤ ‖Λ‖. Let
ψn ∈ E with ‖ψn‖_u ≤ 1 be such that ‖Λ‖ = lim_n |Λψn|. Then ‖Λ‖ ≤ |Λ|(1). This shows that
|Λ|(1) = ‖Λ‖.
Lemma 10.1.6. Suppose (E, m) is a signed elementary integral of finite variation. The
variation map |m| on E+ is additive and positive homogeneous.
Proof. Positive homogeneity follows directly from the definition of | |. Also, from the
definition of variation it follows that |m|(ψ) ≤ |m|(ϕ) whenever ψ, ϕ ∈ E+ and ψ ≤ ϕ.
Let ψ1 and ψ2 be nonnegative elementary functions and let ε > 0.
There are functions φj ∈ E, j = 1, 2, such that |φj| ≤ ψj and m(φj) > |m|(ψj) − ε/2. As
|φ1 + φ2 | ≤ ψ1 + ψ2 ,
|m|(ψ1 ) + |m|(ψ2 ) − ε < m(φ1 ) + m(φ2 ) = m(φ1 + φ2 ) ≤ |m|(ψ1 + ψ2 ).
Consequently, |m| is superadditive on E+ .
Now we show that |m| is subadditive on E+ . Let φ ∈ E such that |φ| ≤ ψ1 + ψ2 and
|m|(ψ1 +ψ2 )−ε < m(φ). Notice that 0 ≤ ψ1′ = ψ1 ∧|φ| ≤ ψ1 and 0 ≤ ψ2′ = |φ|−ψ1 ∧|φ| ≤ ψ2 .
Consider the functions
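\[
\begin{array}{ll}
\varphi_{1+} = \varphi_+ \wedge \psi_1' & \quad \varphi_{2+} = \varphi_+ - \varphi_+ \wedge \psi_1' \\
\varphi_{1-} = \psi_1' - \varphi_+ \wedge \psi_1' & \quad \varphi_{2-} = \varphi_- + \varphi_+ \wedge \psi_1' - \psi_1'
\end{array} \tag{10.2}
\]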
The functions in the array (10.2) are in E+ ; its columns add up to ψ1′ and ψ2′ respectively;
its rows add up to φ+ and φ− respectively. Hence
|m|(ψ1 + ψ2 ) − ε < m(φ) = m(φ+ − φ− ) = m(φ1+ − φ1− ) + m(φ2+ − φ2− )
≤ |m|(ψ1′ ) + |m|(ψ2′ ) ≤ |m|(ψ1 ) + |m|(ψ2 ).
Subadditivity follows immediately.
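The bookkeeping behind the array (10.2) is purely pointwise, so it can be verified on numbers. The sketch below evaluates the four entries at a single point, given values of φ, ψ1 and ψ2 with |φ| ≤ ψ1 + ψ2, and checks nonnegativity together with the row and column sums claimed in the proof. The function names and the dyadic test grid are illustrative choices.

```python
def split_by_array(phi, psi1, psi2):
    """Pointwise entries of the array (10.2); assumes psi1, psi2 >= 0 and |phi| <= psi1 + psi2."""
    phi_plus, phi_minus = max(phi, 0.0), max(-phi, 0.0)
    psi1_prime = min(psi1, abs(phi))        # psi_1' = psi_1 ^ |phi|
    psi2_prime = abs(phi) - psi1_prime      # psi_2' = |phi| - psi_1 ^ |phi|
    phi_1p = min(phi_plus, psi1_prime)
    phi_2p = phi_plus - phi_1p
    phi_1m = psi1_prime - phi_1p
    phi_2m = phi_minus + phi_1p - psi1_prime
    return (phi_1p, phi_2p, phi_1m, phi_2m), (psi1_prime, psi2_prime)


def array_is_consistent(phi, psi1, psi2):
    (p1, p2, m1, m2), (q1, q2) = split_by_array(phi, psi1, psi2)
    nonnegative = min(p1, p2, m1, m2) >= 0.0
    columns = (p1 + m1 == q1) and (p2 + m2 == q2)          # columns add up to psi_1', psi_2'
    rows = (p1 + p2 == max(phi, 0.0)) and (m1 + m2 == max(-phi, 0.0))  # rows add up to phi_+, phi_-
    bounds = q1 <= psi1 and q2 <= psi2
    return nonnegative and columns and rows and bounds


# Dyadic grid, so that all sums below are exact in floating point.
grid = [k / 4.0 for k in range(-8, 9)]
all_ok = all(
    array_is_consistent(phi, psi1, psi2)
    for phi in grid
    for psi1 in grid if psi1 >= 0
    for psi2 in grid if psi2 >= 0
    if abs(phi) <= psi1 + psi2
)
```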
Theorem 10.1.7. If (E, m) is a signed elementary integral of finite variation then the
variation map |m| admits a unique linear extension to E. This extension, denoted also by
|m|, is the minimal positive elementary integral on E such that
(10.3) |m(φ)| ≤ |m|(|φ|)
10.1. Real valued elementary integrals 249
for all φ ∈ E.
Proof. By Lemma 10.1.6 the variation map | | is additive and positive homogeneous on E+ .
Hence, for any φ ∈ E we can define |m|(φ) = |m|(φ+ )−|m|(φ− ). Furthermore, if φ = φ1 −φ2
with φ1 and φ2 in E+ , then φ+ + φ2 = φ1 + φ− whence it follows that |m|(φ+ ) − |m|(φ− ) =
|m|(φ1 ) − |m|(φ2 ). This shows that the value |m|(φ) is independent of how we choose to
express φ as the difference of nonnegative elementary functions. Consequently,
|m(φ)| ≤ |m(φ+ )| + |m(φ− )| ≤ |m|(φ+ + φ− ) = |m|(|φ|)
Suppose n is a positive elementary integral on E such that |m(ψ)| ≤ n(ψ) for all ψ ∈ E+ .
Then, for any φ ∈ E such that |φ| ≤ ψ,
|m(φ)| ≤ |m(φ+ )| + |m(φ− )| ≤ n(φ+ ) + n(φ− ) = n(|φ|) ≤ n(ψ).
Taking the suprema over all such φ we obtain that |m|(ψ) ≤ n(ψ). Consequently, for any
φ ∈ E we have |m(φ)| ≤ |m|(|φ|) ≤ n(|φ|).
The following result provides an alternative representation for the variation of a signed
elementary integral.
Proof. By Theorem 10.1.7 the variation |m| is a positive elementary integral. It remains
to show that |m| is σ–additive whenever m is so. Let (ψn ) be an increasing sequence in
E. By replacing ψn by ψn − ψ1 if necessary, we may assume without loss of generality
that (ψn ) ⊂ E+ . Let ψ = supn ψn . Clearly supn |m|(ψn ) ≤ |m|(ψ). For the converse
inequality, for any ε > 0 choose φ ∈ E with |φ| ≤ ψ such that |m|(ψ) − ε < m(φ).
The sequences (ψn ∧ φ+ ) and (ψn ∧ φ− ) in E+ increase to φ+ and φ− respectively. As
m is σ–additive, and hence σ–continuous, we have that limn m(ψn ∧ φ+ ) = m(φ+ ) and
limn m(ψn ∧ φ− ) = m(φ− ). Hence, for some N ∈ N, m(ψn ∧ φ+ ) − m(ψn ∧ φ− ) > |m|(ψ) − ε
for all n ≥ N . As |ψn ∧ φ+ − ψn ∧ φ− | ≤ ψn , n ≥ N implies that
\[
\sup_k |m|(\psi_k) \ge |m|(\psi_n) \ge m(\psi_n \wedge \varphi_+) - m(\psi_n \wedge \varphi_-) \ge |m|(\psi) - \varepsilon .
\]
is called the total variation of the elementary integral m. If 1 ∈ L1(m), that is, ‖m‖_TV :=
∫ 1 d|m| < ∞, then m is said to be of finite total variation, or simply, that m is a finite
elementary integral.
If m is only additive, then |m| is also additive on E. We use the Jordan seminorm ‖ ‖#_m
instead of the Daniell mean and define L#(m) to be L#(|m|), the closure of E on (F, ‖ ‖#_m).
As in Section 6.1
\[
|m(f)| \le |m|(|f|) = \|f\|^{\#}_m, \qquad f \in L^{\#}(|m|)
\]
The procedure described above can be applied to linear functionals (not necessarily
positive) in C00 (X) where X is l.c.H. to produce signed–Radon measures.
Theorem 10.2.1. Suppose X is a l.c.H. space. A linear functional m on C00 (X) has finite
variation iff m has the following property:
Property R: If (φn : n ∈ N) is a sequence of functions in C00 (X) whose supports are
contained in a common compact set and which converges uniformly to a function
φ then, limn m(φn ) = m(φ).
In either case, m is order continuous.
10.3. Signed measures 251
Proof. Suppose m is not of finite variation. Then, there exists ψ ∈ C00+(X) and a sequence
(φn : n ∈ N) ⊂ C00(X) with |φn| ≤ ψ such that |m(φn)| > 2^n. Each function gn := 2^{−n} φn
vanishes outside of supp(ψ) and ‖gn‖_u → 0. Since |m(gn)| > 1, it follows that m does not
satisfy the Radon property.
Conversely, suppose m does not satisfy the Radon property. Then, there is a compact set K
and a sequence of functions φn ∈ C00(X) whose supports are contained in K, which converges
uniformly to some function φ ∈ C00(X) and such that ε := inf_n |m(φn − φ)| > 0. Without
loss of generality suppose that ‖φn − φ‖_u < 2^{−n}. Let ψ ∈ C00(X) be such that 1K ≺ ψ ≤ 1.
For each n define
ψn = sign(m(φn − φ)) · (φn − φ)
so that m(ψn) ≥ 0. If Ψn := Σ_{k=1}^{n} ψk, then clearly |Ψn| ≤ ψ and |m(Ψn)| = m(Ψn) ≥ nε.
Therefore m has infinite variation at ψ.
By Lemma 6.7.1, if m is a Radon measure, then |m| is order continuous. Consequently, for
any increasing directed family Φ ⊂ C00 (X) with lim Φ = ψ,
\[
\lim_{\varphi \in \Phi} |m(\psi - \varphi)| \le \lim_{\varphi \in \Phi} |m|(\psi - \varphi) = 0
\]
Therefore, m is order–continuous.
Definition 10.2.2. A linear functional m satisfying property R in Theorem 10.2.1 is called
a (real valued) Radon measure.
Example 10.2.3. With λ as the Lebesgue measure on R, the linear functional
\[
m(f) := \int_{\mathbb{R}} f(x) \sin(x)\, \lambda(dx), \qquad f \in C_{00}(\mathbb{R})
\]
is a real valued Radon measure, and its variation is given by |m|(f) = ∫_R f(x)| sin(x)| λ(dx).
Note that m is not defined in all of M|m|; for example, m(R) is not defined. In contrast, |m| is a
positive Radon measure defined in all of M|m|. Moreover, |m|(R) = ∞.
Conversely, if ν is a real nonnegative additive function in R such that |µ(A)| ≤ ν(A) for
all A ∈ R, then its extension n dominates m and −m on E(R), that is, |m(ψ)| ≤ n(ψ)
for all ψ ∈ E+ (R). This implies that m has finite variation and that |m| ≤ n and |µ| ≤ ν.
Consequently, |µ| is the smallest positive measure that dominates µ and −µ.
Theorem 10.3.1. Suppose µ is a real–valued additive function in a ring of functions R
and let m be its linear extension to E(R).
(i) m is of finite variation iff there exists a positive additive function ν on R such that
(10.5) |µ(A)| ≤ ν(A), A ∈ R.
(ii) If m is of bounded variation, then the restriction |µ| of |m| to R is the smallest positive
additive function in R satisfying (10.5); moreover,
(10.6) |µ|(A) = sup{µ(A1 ) − µ(A \ A1 ) : A1 ∈ R, A1 ⊂ A}
(iii) m is σ–continuous iff µ is σ–additive.
Remark 10.3.2. |µ| is called the variation measure of µ. The total variation of a
measure µ is defined as kµkT V := |µ|(Ω). When kµkT V < ∞, we say that µ is of finite total
variation, or simply that µ is a finite measure.
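Formula (10.6) is easy to experiment with when µ is a finite combination of point masses. The sketch below is a toy setting, not the general case: the signed measure is a dictionary of integer weights, and the supremum in (10.6) is evaluated by brute force over all subsets A1 of A. Names and sample weights are illustrative.

```python
from itertools import combinations


def variation_by_formula(weights):
    """Brute-force evaluation of sup { mu(A1) - mu(A minus A1) : A1 subset of A } over the whole space A."""
    points = list(weights)
    total = sum(weights.values())
    best = float("-inf")
    for size in range(len(points) + 1):
        for subset in combinations(points, size):
            mass = sum(weights[x] for x in subset)
            best = max(best, mass - (total - mass))   # mu(A1) - mu(A minus A1)
    return best


weights = {"a": 2, "b": -3, "c": 1, "d": -1}   # illustrative point masses of mu
variation = variation_by_formula(weights)
```

In this toy case the supremum is attained by taking A1 to be the set of points with nonnegative mass, and the value equals the sum of the absolute values of the weights.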
Proof. The arguments given above prove (i) and half of (ii).
Proof of equation (10.6): Let m be the linear extension of µ to E(R), and let ν(A) denote
the right hand side of (10.6). Clearly |µ(A)| ≤ ν(A) and, since R ⊂ E(R), ν(A) ≤ |m|(A)
by (10.4). We will show that ν is an additive function in R. Let B1 and B2 be disjoint sets in
R and let ε > 0. Let A1 ⊂ B1 and A2 ⊂ B2 be sets in R such that
\[
\begin{aligned}
\nu(B_1) - \frac{\varepsilon}{2} &< \mu(A_1) - \mu(B_1 \setminus A_1) \\
\nu(B_2) - \frac{\varepsilon}{2} &< \mu(A_2) - \mu(B_2 \setminus A_2).
\end{aligned}
\]
In this case, (B1 ∪ B2) \ (A1 ∪ A2) = (B1 \ A1) ∪ (B2 \ A2) and so,
\[
\nu(B_1) + \nu(B_2) - \varepsilon < \mu(A_1 \cup A_2) - \mu\bigl((B_1 \cup B_2) \setminus (A_1 \cup A_2)\bigr) \le \nu(B_1 \cup B_2).
\]
This shows that ν is superadditive.
Let A ⊂ B1 ∪ B2 be a set in R such that ν(B1 ∪ B2) − ε < µ(A) − µ((B1 ∪ B2) \ A). Set
\[
\begin{array}{ll}
A_{1+} = A \wedge B_1 & \quad A_{2+} = A - (A \wedge B_1) \\
A_{1-} = B_1 - (A \wedge B_1) & \quad A_{2-} = B_2 + (A \wedge B_1) - A
\end{array} \tag{10.7}
\]
The terms in (10.7) are pairwise disjoint sets in R since A2− = ((B1 ∪B2 )\A)∩B2 = B2 \A.
The union by rows in (10.7) is A and (B1 ∪ B2 ) \ A, while the union by columns is B1 and
B2 . Therefore
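\[
\begin{aligned}
\nu(B_1) + \nu(B_2) &\ge \mu(A_{1+}) - \mu(A_{1-}) + \mu(A_{2+}) - \mu(A_{2-}) \\
&= \mu(A_{1+} \cup A_{2+}) - \mu(A_{1-} \cup A_{2-}) = \mu(A) - \mu\bigl((B_1 \cup B_2) \setminus A\bigr) \\
&> \nu(B_1 \cup B_2) - \varepsilon .
\end{aligned}
\]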
This shows that ν is subadditive. Therefore ν is additive and dominates µ. The first part
of (ii) implies that |µ| = ν.
Remark 10.3.5. Since the union of a sequence of sets is independent of any rearrangement
of the sequence, the series in (ii) is absolutely convergent whenever it is finite. By definition,
a signed measure µ takes at most one value in {−∞, ∞}.
The restriction of a signed measure to the ring R(F ) of measurable sets A ∈ F with
|µ(A)| < ∞ is clearly σ–additive and its linear extension to the space of simple functions
E(R) is a σ–continuous elementary integral. The converse is not necessarily true, as the
next example shows.
Example 10.3.6. The function ν(A) = ∫_A (f(x) − g(x)) dx on B(R), where f, g ∈ L1+(λ), is
a signed measure on B(R). The function µ(A) = ∫_A x dx is not a signed measure on B(R);
however, µ is σ–additive on the ring of Borel sets with finite Lebesgue measure.
If µ is of finite variation on R(F ) then (10.6) holds, the measures µ+ and µ− are well
defined and satisfy µ = µ+ − µ− and |µ| = µ+ + µ− on R(F ). In the remainder of this
section we will extend these identities to all of F , even in the case where µ fails to be of
finite variation.
Theorem 10.3.8. Suppose that µ is a signed measure on (Ω, F ). If −∞ < µ(A) < 0, then
there is a negative set B with B ⊂ A and µ(B) ≤ µ(A).
Theorem 10.3.9. (Hahn decomposition theorem) Let (Ω, F ) be a measurable space and µ
a signed measure on F . There is a positive set P and a negative set N such that Ω = P ∪ N
and P ∩ N = ∅.
Proof. Without loss of generality we may assume that µ does not take the value −∞. Let
N denote the family of all negative sets and let η = inf{µ(E) : E ∈ N }. Since ∅ ∈ N , then
−∞ ≤ η ≤ µ(∅) = 0. Let An ∈ N be a sequence such that µ(An ) ↘ η. The sets B1 = A1 ,
Bn+1 = An+1 \ ⋃_{k=1}^{n} Ak (n ∈ N) are negative and pairwise disjoint and N = ⋃_n An = ⋃_n Bn ;
hence, N ∈ N and η ≤ µ(N ) ≤ µ(An ). Consequently, −∞ < µ(N ) = η.
We will show that P = Ω \ N is a positive set. Suppose that there is a measurable set E ⊂ P
with µ(E) < 0. By Theorem 10.3.8 there is a negative set B ⊂ E with µ(B) ≤ µ(E). Since
N and B are disjoint negative sets, N ∪ B ∈ N and µ(N ∪ B) = µ(N ) + µ(B) < µ(N ) = η,
contradicting the choice of η. Therefore P is a positive set.
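In a finite setting the construction in this proof can be carried out explicitly. The sketch below assumes a signed measure given by finitely many integer point masses (a toy case, far from the generality of the theorem): it builds a Hahn decomposition, the corresponding positive and negative parts, and computes by brute force the infimum of µ over all subsets, which in this finite case coincides with µ(N). All names are illustrative.

```python
from itertools import combinations


def hahn_jordan(weights):
    """Hahn decomposition (P, N) and the parts mu_plus, mu_minus for point masses."""
    P = {x for x, w in weights.items() if w >= 0}
    N = set(weights) - P
    mu_plus = {x: max(w, 0) for x, w in weights.items()}
    mu_minus = {x: max(-w, 0) for x, w in weights.items()}
    return P, N, mu_plus, mu_minus


def infimum_over_subsets(weights):
    """Brute-force infimum of mu(E) over all subsets E of the finite space."""
    points = list(weights)
    return min(
        sum(weights[x] for x in subset)
        for size in range(len(points) + 1)
        for subset in combinations(points, size)
    )


weights = {"a": 2, "b": -3, "c": 1, "d": -1}   # illustrative signed point masses
P, N, mu_plus, mu_minus = hahn_jordan(weights)
eta = infimum_over_subsets(weights)
```

Here every subset of N has nonpositive mass and every subset of P has nonnegative mass, and µ(N) attains the infimum, mirroring the role of η in the proof.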
Definition 10.3.10. Let (Ω, F ) be a measurable space. Two measures µ and ν are
mutually singular, denoted by µ ⊥ ν, if there is A ∈ F such that µ(A) = 0 = ν(Ω \ A).
Theorem 10.3.11. (Jordan decomposition theorem) Let (Ω, F ) be a measurable space and
µ a signed measure. There is a unique pair of measures µ+ and µ− such that
(10.8) µ = µ+ − µ− , µ+ ⊥ µ −
Set |µ| := µ+ + µ− . If (P, N ) and (S, Q) are two Hahn decompositions of Ω with respect to
µ then P = S and N = Q |µ|–a.s.
We will give a different description of the variation function that extends to complex
measures. For any A ∈ F , let PA denote the collection of all the countable measurable
partitions of A, and define
\[
V_\mu(A) = \sup\Bigl\{ \sum_j |\mu(A_j)| : \{A_j\} \in \mathcal{P}_A \Bigr\} \tag{10.12}
\]
Theorem 10.3.12. Vµ = |µ|. If ν is a measure on (Ω, F ) such that |µ(A)| ≤ ν(A) for all
A ∈ F , then |µ| ≤ ν.
Proof. It follows from (10.12) that Vµ (∅) = 0 and |µ(A)| ≤ Vµ (A). Suppose that En ∈ F is
a pairwise disjoint sequence whose union is E, and let Am ∈ F be any countable partition
of E. Then {Am ∩ En : n ∈ N} is a countable partition of Am and {Am ∩ En : m ∈ N} is a
countable partition of En . Hence
\[
\sum_m |\mu(A_m)| \le \sum_m \sum_n |\mu(A_m \cap E_n)| = \sum_n \sum_m |\mu(A_m \cap E_n)|,
\]
whence it follows that Vµ(⋃_n En) ≤ Σ_n Vµ(En). It remains to show that the last
inequality holds in the opposite direction. To that purpose, let tn ∈ R be a sequence such
Taking linear combinations (aµ + bν)(E) := aµ(E) + bν(E), we conclude that the space
Mr (Ω, F ) (Mc (Ω, F )) of real (complex) measures of finite total variation forms a real
(complex) vector space with norm µ ↦ ‖µ‖_TV = |µ|(Ω).
Theorem 10.3.14. The space of complex measures Mc (Ω, F ) with the total variation norm
is a Banach space.
Proof. Suppose that (µn) is a Cauchy sequence; then |µn(E) − µm(E)| ≤ ‖µn − µm‖_TV for all E ∈ F.
This means that (µn) is a Cauchy sequence of bounded functions defined on F. Hence
µn converges uniformly to a bounded function µ on F. Clearly µ(∅) = 0 and µ is finitely
additive. To show that µ is countably additive, suppose Em ∈ F increases to its union E.
Given ε > 0, there is N such that sup_{A∈F} |µn(A) − µ(A)| < ε/3 for all n ≥ N. The countable
additivity of µN implies that µN is continuous on F, that is lim_m µN(Em) = µN(E). Thus,
for some m0, |µN(E \ Em)| = |µN(E) − µN(Em)| < ε/3 whenever m ≥ m0. Therefore,
|µ(E) − µ(Em )| ≤ |µ(E) − µN (E)| + |µN (E) − µN (Em )| + |µN (Em ) − µ(Em )| < ε
for all m ≥ m0 . This shows that µ is a complex measure.
Theorem 10.4.1. (M, ≤) is an order complete vector lattice, that is, if B ⊂ M has an
upper bound in M, then it has a least upper bound in M.
Theorem 10.4.1 and (b) show that the space M∗F V (E) of elementary integrals on E of total
finite variation is a Banach space and a vector lattice.
Theorem 10.4.3. For any m1 , m2 , n ∈ M+ , (m1 + m2 ) ∧ n ≤ m1 ∧ n + m2 ∧ n.
Theorem 10.4.6. (Riesz) Let M be an order complete vector lattice. Then, for any G ⊂ M,
G ⊥ is a band. Moreover, (G ⊥ )⊥ is the band (G) generated by G and every m ∈ M has a unique
decomposition m = m|| + m⊥ with m|| ∈ (G) and m⊥ ∈ G ⊥ .
The first part of the proof shows that (G ⊥ )⊥ is an ideal containing G. Hence (G) ⊂ (G ⊥ )⊥
and (G) ∩ G ⊥ = {0}. For any m ∈ M+ let
\[
m_{||} = \bigvee \bigl\{ n \in (\mathcal{G}) : n \le m \bigr\} \tag{10.16}
\]
and m⊥ = m − m|| . As (G) is a band, we have that m|| ∈ (G)+ and m⊥ ≥ 0. We claim that
m⊥ ∈ G ⊥ . For n ∈ G, m⊥ ∧ |n| = (m − m|| ) ∧ |n| ∈ (G)+ , and so m|| + (m⊥ ∧ |n|) ∈ (G)+ . As
m|| + (m⊥ ∧ |n|) ≤ m, (10.16) implies that m|| + (m⊥ ∧ |n|) ≤ m|| . Therefore m⊥ ∧ |n| = 0.
Proof. Theorem 10.1.9 implies that M∗(E) is an ideal, for m is σ–continuous iff |m| is
σ–continuous. Suppose B ⊂ M∗(E) has least upper bound n = ⋁B in M(E). Without loss
of generality, we may assume that B is increasingly directed and contained in M∗+(E). If
(φn : n ∈ N) ⊂ E+ and φn ↗ φ ∈ E then, as in the proof of Theorem 10.4.1,
\[
\Bigl(\bigvee B\Bigr)(\varphi) = \sup_{m \in B} m(\varphi) = \sup_{m \in B,\, n \in \mathbb{N}} m(\varphi_n) = \sup_n \sup_{m \in B} m(\varphi_n) = \sup_n \Bigl(\bigvee B\Bigr)(\varphi_n).
\]
This shows that ⋁B ∈ M∗(E). Therefore, M∗(E) is a band in M(E). A similar proof shows
that M• (E) is a band in M∗ (E).
Let E σ = {h ∈ ERΣ : ∃φ ∈ E, |h| ≤ φ}. It is clear that E ⊂ E σ ⊂ L1 (|m|) for any signed
elementary integral m on E. E σ contains all sets of the form {φ > r} where φ ∈ E and r > 0
for 1{φ>r} ≤ φ+/r. As E is a ring lattice closed under chopping, so is E σ by Lemma 5.6.5.
Theorem 10.4.12. Suppose (E, m) is a signed elementary integral over a ring lattice E
closed under chopping. Then
(10.18) |m|(h) = sup{|m(ψ)| : ψ ∈ E σ , |ψ| ≤ h}
for any h ∈ E σ + .
Remark 10.4.13. In (10.18) it is understood that m(ψ) stands for the value at ψ of the
extension of m to all L1 (|m|).
Proof. For each h ∈ E σ + let ν(h) denote the value of the right hand side of (10.18). As E σ is
a ring lattice closed under chopping, (E σ , m) is a signed elementary integral whose variation
is given by ν. If ψ ∈ E σ and |ψ| ≤ h ∈ E σ + , then |m(ψ)| ≤ |m|(|ψ|) ≤ |m|(h), and so
ν(h) ≤ |m|(h). Hence, ν is finite. On the other hand, for all ψ ∈ E+
|m|(ψ) = sup{|m(φ)| : φ ∈ E, |φ| ≤ ψ} ≤ sup{|m(φ)| : φ ∈ E σ |φ| ≤ ψ} = ν(ψ).
10.4. The space of elementary integrals 261
Consequently ν and |m| coincide on E+ , and so the Daniell means ‖ ‖ν and ‖ ‖|m| associated
to ν and to |m| respectively coincide on E+ . Therefore, by Lemma 7.6.1, ν(h) = ‖h‖ν =
‖h‖|m| = |m|(h) for all h ∈ ERΣ .
Theorem 10.4.14. (Hahn) Let m, n ∈ M∗ (E).
(i) m ⊥ n iff any set B ∈ E σ admits a partition {Bn , Bm } ⊂ E σ such that |m|(Bn ) =
0 = |n|(Bm ).
(ii) m ≪ n iff for any set N ∈ E σ , |n|(N ) = 0 implies |m|(N ) = 0.
Proof. (i) If every set B ∈ E σ admits the decomposition stated above then, from (10.4),
|m| ∧ |n|(B) = 0 and so |m| ∧ |n| ≡ 0. Conversely, suppose m ⊥ n and let B be a set in E σ .
Without loss of generality suppose m and n are positive. By (10.4), for each k ∈ N there
exists a pair of functions ψk , φk ∈ E σ + such that
(10.19) 1B = ψk + φk
and
m(ψk ) + n(φk ) = ‖ψk ‖∗m + ‖φk ‖∗n ≤ 2^{−k} .
Then 0 ≤ ψk ≤ 1 converges to 0 in ‖ ‖∗m -mean and ‖ ‖∗m –a.s., and the same conclusion holds
for φk with ‖ ‖∗n in place of ‖ ‖∗m . By (10.19) the set C where (ψk ) converges coincides with
the set where (φk ) converges. As Ω \ C ∈ ERΣ , 1C ψk and 1C φk belong to E σ . Let
\[
B_n = \{\liminf_k 1_C \psi_k > 0\}, \qquad B_m = B \setminus B_n .
\]
Then Bn , Bm ∈ E σ + and Bm ⊂ {lim inf_k 1C φk > 0}. Since m(Bn ) = 0 = n(Bm ), Bm and
Bn provide the desired decomposition.
(ii) If m ≪ n then |m| = |m|_{||} = ⋁_k |m| ∧ (k|n|). As {|m| ∧ (k|n|) : k ∈ N} is increasingly
directed, |m|(ψ) = supk |m| ∧ (k|n|) (ψ) for all ψ ∈ E σ ; therefore, if |n|(B) = 0 at some set
B ∈ E σ , then |m|(B) = 0. Conversely, suppose |m|(B) = 0 whenever 1B ∈ E σ with |n|(B) =
0. For any D ∈ E σ , let D1 , D2 be a disjoint partition of D so that |m|⊥ (D2 ) = 0 = |n|(D1 ).
Then |m|⊥ (D1 ) ≤ |m|(D1 ) = 0, and so |m|⊥ (D) = 0. Therefore, |m| = |m||| ∈ (n).
Remark 10.4.15. If µ, ν are positive measures on a measurable space (Ω, F ), then ν ≪ µ
iff for any A ∈ F , ν(A) = 0 whenever µ(A) = 0. Indeed, let E be the space of simple
functions φ = Σ_{j=1}^{n} aj 1Aj such that n ∈ N, aj ∈ R, Aj ∈ F , and µ(Aj ) < ∞. As
elementary integrals on E, µ and ν are in M∗ (E). Any set A ∈ F with µ(A) < ∞ is in E σ .
The conclusion follows by Hahn’s theorem (ii).
Example 10.4.16. (Lebesgue decomposition) Suppose µ and ν are σ–finite measures on
(Ω, F ). Then there are unique measures νa and νs with νa ≪ µ and νs ⊥ µ such that
ν = νa + νs . It is enough to assume that ν(Ω) < ∞. Let Nµ be the collection of all µ–negligible sets,
that is Nµ = {B ∈ F : µ(B) = 0}. Choose an increasing sequence {Bj : j ∈ N} ⊂ Nµ such
that
lim_j ν(Bj ) = sup{ν(B) : B ∈ Nµ }.
Let N = ⋃_j Bj , and notice that µ(N ) = lim_j µ(Bj ) = 0, and ν(N ) = lim_j ν(Bj ) =
sup{ν(B) : B ∈ Nµ }. Then ν = νa + νs where νa (A) := ν(A \ N ) and νs (A) := ν(A ∩ N ).
Then νs ⊥ µ and (N, N c ) is the Hahn partition of Ω as in Hahn’s theorem (i). We claim
that νa ≪ µ. To prove this it suffices to show that for any B ∈ F with B ⊂ N c and
µ(B) = 0, ν(B) = 0 holds. If this were not the case then N ∪ B ∈ Nµ , and
ν(N ∪ B) = ν(N ) + ν(B) > ν(N )
which is a contradiction. Uniqueness follows from Riesz’s decomposition. A more direct
proof follows from noticing that for any σ–finite measure ν, ν ≪ µ and ν ⊥ µ iff ν = 0.
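On a finite set the construction of Example 10.4.16 can be carried out by hand, since the largest µ–negligible set is just {µ = 0}. The following Python sketch illustrates this special case only; the function name `lebesgue_decomposition` and the sample weights are hypothetical and are not part of the text.

```python
from fractions import Fraction

def lebesgue_decomposition(mu, nu):
    """Split nu = nu_a + nu_s relative to mu, for point masses on a finite set.

    mu and nu are dicts mapping each point to a nonnegative weight.
    Returns (nu_a, nu_s, density, null_set).
    """
    points = set(mu) | set(nu)
    # On a finite set the largest mu-negligible set N is simply {mu = 0}.
    null_set = {w for w in points if mu.get(w, 0) == 0}
    nu_s = {w: Fraction(nu.get(w, 0)) for w in null_set}           # nu(. ∩ N)
    nu_a = {w: Fraction(nu.get(w, 0)) for w in points - null_set}  # nu(. \ N)
    # Density of nu_a with respect to mu; it is only defined off N.
    density = {w: nu_a[w] / Fraction(mu[w]) for w in nu_a}
    return nu_a, nu_s, density, null_set

# Hypothetical example on the three-point set {a, b, c}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu = {"a": Fraction(1, 4), "b": Fraction(0), "c": Fraction(3, 4)}
nu_a, nu_s, density, null_set = lebesgue_decomposition(mu, nu)
```

Here νs is carried by N = {c}, where µ vanishes, and νa has density νa({ω})/µ({ω}) at each point ω outside N.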
Example 10.4.17. (Hahn–Jordan decomposition) For any m ∈ M∗ we know that m+ ⊥
m− . Hence, any set B ∈ E σ admits a partition {B− , B+ } ⊂ E σ such that m+ (B− ) = 0 =
m− (B+ ). It follows that for any ‖·‖∗_{|m|} –integrable sets E ⊂ B− and F ⊂ B+ , m(E) ≤ 0
and m(F ) ≥ 0. If m is σ–finite, then there exists a partition {N, P } ⊂ E Σ of Ω such that
m(A) ≤ 0 and m(B) ≥ 0 for all ‖·‖∗_m –integrable sets A ⊂ N and B ⊂ P .
Proof. The inequality to the left follows directly from the definition of a Daniell mean.
To show the right hand side inequality we first show that it is enough to consider g ∈
ERΣ ∩ L^loc_1 (n). By Theorem 6.4.12, there exists a nondecreasing sequence (φk : k ∈ N) ⊂ E+
with supk φk ≡ 1. By Theorem 7.6.7, for each k ∈ N there is hk ∈ ERΣ such that
|g|1{φk > 1 } ≤ hk , |g|1{φk > 1 } = hk k k∗n|g| –a.s.
k k
1{h=∞} ∈ ERΣ and ‖1{h=∞}‖∗_n = 0; hence, γ = h1{h≠∞} ∈ ERΣ and ‖f g‖∗_n = ‖f γ‖∗_n for all
f ∈ R^Ω . This proves our claim.
For the rest of the proof we will assume that g ∈ ERΣ ∩ L^loc_1 (n). As Daniell means are
maximal on E, ‖·‖_{gn} ≤ ‖·‖∗_{n|g|} . Since 1{|g|=0} ∈ ERΣ , ‖1{|g|=0}‖∗_{n|g|} = ‖1{|g|=0} g‖∗_n = 0. By
Lemma 7.6.5, for any f ∈ R^Ω with ‖f g‖∗_n < ∞, there exists h ∈ ERΣ such that |f g| ≤ h and
‖f g‖∗_n = ‖h‖∗_n . From
\[ 1_{\{|g|>0\}}\,|f| \le \frac{1_{\{|g|>0\}}}{|g|}\, h \in \mathcal{E}^{\Sigma}_{\mathbb{R}} \]
we obtain that
\[ \|f\|^{*}_{n_{|g|}} = \big\| 1_{\{|g|>0\}}\,|f| \big\|^{*}_{n_{|g|}} \le \|h\, 1_{\{|g|>0\}}\|^{*}_{n} \le \|h\|^{*}_{n} = \|f g\|^{*}_{n}. \]
Proof. By Theorem 7.4.4 and Lemma 10.5.1 f ∈ L1 (n|g| ) iff f g ∈ L1 (n). Thus, if (ψm :
m ∈ N) ⊂ E converges to f in L1 (n|g| ) then (ψm g : m ∈ N) ⊂ L1 (n) converges to f g in
L1 (n). Consequently,
ng (f ) = lim_m ng (ψm ) = lim_m n(ψm g) = n(f g), f ∈ L1 (n|g| ).
Proof. Let (φk : k ∈ N) ⊂ E+ be an increasing sequence such that sup_k φk = 1. Then
Ak = {φk > 1/k} ∈ E σ and Ω = ⋃_k Ak . If ‖1B‖∗_n = 0, there is a subset N ∈ ERΣ such that
1B ≤ 1N and 1B = 1N ‖·‖∗_n –a.s. Hence, N ∩ Ak ∈ E σ .
The implication (i) implies (ii) follows from ‖1B‖∗_m ≤ ‖1N‖∗_m ≤ Σ_k ‖1N∩Ak‖∗_m and Hahn’s
theorem (ii). The implication (ii) implies (i) is also a consequence of Hahn’s theorem (ii).
If (iii) holds then, by Theorem 10.5.2 and Lemma 10.5.1, ‖f‖∗_{|m|} = ‖f g‖∗_n for all f ∈ R^Ω .
Hence (ii) holds. If g ∈ L1 (n), then 1 ∈ L1 (n|g| ) and ‖g‖∗_n = ‖1‖_{n|g|} = ‖ng‖T V .
Corollary 10.5.4. If n ∈ M∗ (E) is σ–finite, then m ∈ M∗ (E) admits a unique decomposition
of the form
(10.24) m = m^∥ + m^⊥ = ng + m^⊥
where m^∥ ≪ n, m^⊥ ⊥ n and g = dm^∥/dn ∈ L^loc_1 (n). We refer to (10.24) as the Radon–
Nikodym decomposition of m with respect to n.
Proof. Riesz’s theorem provides a unique decomposition m = m^∥ + m^⊥ where m^∥ ≪ n, and
m^⊥ ⊥ n. By Corollary 10.4.9 m^∥ ∈ M∗ (E). The conclusion follows from Radon–Nikodym’s
theorem (iii).
whence we conclude that |µ − ν|(A) ≤ ∫_{∆^c} (1A×Ω + 1Ω×A ) dΓ. In particular,
\[ \|\mu-\nu\|_{TV} \le 2 \inf_{\Gamma\in C(\mu,\nu)} \Gamma(\Delta^{c}) \tag{10.25} \]
and so, Γ̃(∆) = a. As a consequence,
\[ \|\mu-\nu\|_{TV} = \int |f-f'|\,d\lambda = 2\Big(1-\int (f\wedge f')\,d\lambda\Big) = 2(1-a) = 2\,\tilde\Gamma(\Delta^{c}). \]
It is very often the case in applications of coupling in Probability theory that the space
(Ω, F ) is a complete separable metric space with the Borel σ–algebra.
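For probability vectors on a finite set, with λ the counting measure, the identity ‖µ − ν‖T V = 2(1 − a) and a coupling attaining the bound (10.25) can be checked directly. The sketch below is only an illustration under that assumption; the names `overlap`, `tv_norm` and `maximal_coupling`, and the particular product form used off the diagonal, are hypothetical choices and are not taken from the text.

```python
def overlap(f, g):
    """Mass a of the measure mu ∧ nu: the sum of the pointwise minimum f ∧ g."""
    return sum(min(f[x], g[x]) for x in f)

def tv_norm(f, g):
    """Total variation norm of mu - nu for densities f, g w.r.t. counting measure."""
    return sum(abs(f[x] - g[x]) for x in f)

def maximal_coupling(f, g):
    """A coupling with mass a on the diagonal; f and g must share the same keys."""
    a = overlap(f, g)
    gamma = {}
    for x in f:
        for y in g:
            diag = min(f[x], g[x]) if x == y else 0.0
            off = 0.0
            if a < 1.0:
                # The excess masses (f - f∧g) and (g - f∧g) are spread as a product.
                off = (f[x] - min(f[x], g[x])) * (g[y] - min(f[y], g[y])) / (1.0 - a)
            gamma[(x, y)] = diag + off
    return gamma

f = {0: 0.5, 1: 0.3, 2: 0.2}
g = {0: 0.2, 1: 0.3, 2: 0.5}
a = overlap(f, g)
gamma = maximal_coupling(f, g)
off_diagonal = sum(p for (x, y), p in gamma.items() if x != y)
```

In this example the off-diagonal mass of the coupling equals 1 − a, which is half of the total variation norm.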
10.6.2. Change of variables in Rn . Here we are concerned with smooth changes of
variables in integrals with respect to Lebesgue measure in Rn .
Theorem 10.6.1. Suppose Ω is an open set in Rn and let G : Ω → G(Ω) be a diffeomor-
phism. If µ is a measure on B(Ω) and µ ≪ λ, where λ is Lebesgue’s measure on B(Ω),
then the induced measure µ ◦ G−1 ≪ λ and
\[ \frac{d(\mu\circ G^{-1})}{d\lambda}(u) = f(G^{-1}(u))\,\big|\det (G^{-1})'(u)\big| = \frac{f(G^{-1}(u))}{\big|\det G'(G^{-1}(u))\big|}, \]
where f = dµ/dλ.
Example 10.6.2. Suppose (R2 , B(R2 ), µ) is a measure space such that µ is absolutely
continuous with respect to Lebesgue measure λ2 on B(R2 ), and let f = dµ/dλ2 . Consider the
transformation T : (x, y) ↦ (x/y, y) on Ω = {(x, y) : y ≠ 0}. It is clear that T is a
diffeomorphism of Ω onto itself. We have that T −1 (u, v) = (uv, v), and det T ′ (x, y) = 1/y.
Consequently, the measure µ ◦ T −1 ≪ λ2 and
\[ \frac{d(\mu\circ T^{-1})}{d\lambda_2}(u,v) = |v|\,f(uv,v), \qquad (u,v)\in\Omega. \]
A similar conclusion is obtained if Ω = R × (0, ∞) or Ω = R × (−∞, 0).
It is now easy to show that the measure ν on (R, B(R)) induced by the map T1 :
(x, y) ↦ x/y, i.e. ν(du) := (µ ◦ T −1 )(du × (R \ {0})), is absolutely continuous with respect to
Lebesgue measure λ1 over the real line, and that
\[ \frac{d\nu}{d\lambda_1}(u) = \int_{\mathbb{R}} |v|\,f(uv,v)\,dv. \]
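As a sanity check of the formula for dν/dλ1, one may take f to be the density of two independent standard normal coordinates, in which case the ratio x/y has the Cauchy density 1/(π(1 + u²)). The Python sketch below evaluates the integral with a midpoint rule; the choice of f, the truncation `half_width` and the number of steps are assumptions made only for this illustration.

```python
import math

def normal_pair_density(x, y):
    """Density of two independent standard normal coordinates (assumed example)."""
    return math.exp(-(x * x + y * y) / 2) / (2 * math.pi)

def ratio_density(u, f, half_width=12.0, steps=24000):
    """Midpoint-rule approximation of the integral of |v| f(uv, v) over v in R.

    The integral is truncated to [-half_width, half_width]; with an even number
    of steps the kink of |v| at 0 falls on a cell boundary.
    """
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        v = -half_width + (i + 0.5) * h
        total += abs(v) * f(u * v, v)
    return total * h

u = 0.7
approx = ratio_density(u, normal_pair_density)
exact = 1 / (math.pi * (1 + u * u))  # Cauchy density at u
```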
Then Ξ is a convex subset of R^p , and the map Λ : η ↦ log ∫ e^{η·T (x)} ν(dx) is convex on Ξ.
Suppose Ξo ≠ ∅ and define
\[ G(z) := \int e^{z^{\intercal} T(x)}\,\nu(dx), \qquad z = \eta + i\beta \in \Xi^{o} + i\mathbb{R}^{p}. \]
For the last statement it suffices to consider the case p = 1. Let µ = ν ◦ T −1 . Fix
z = η + iβ ∈ Ξo + iR. For δ > 0 small enough, |h| ≤ δ implies that z + h ∈ Ξo + iR, and so
\[ \Big|\frac{e^{th}-1}{h}\Big| \le \frac{e^{\delta|t|}-1}{\delta} \le \frac{e^{\delta t}+e^{-\delta t}}{\delta}. \]
As ∫ (e^{(η+δ)t} + e^{(η−δ)t} ) µ(dt) < ∞, we obtain from dominated convergence that G is analytic
and G′ (z) = ∫ T (x) e^{zT (x)} ν(dx). Equation 10.27 follows by repeating the same argument as
above with T^n (x) ν(dx) in place of ν(dx).
Corollary 10.6.6. Suppose {Pθ : θ ∈ ∆} is a family of exponential type with natural
parameter η = η(θ) and Ξ = η(∆) ⊂ Rp open. Then
\[ 0 = \int_X D_\theta f_\theta(x)\,\nu(dx) \]
‖µn − µ‖T V → 0.
Proof. Sufficiency: Without loss of generality assume ν is a finite signed measure. Suppose
that for any ε > 0, there is δ > 0 such that |ν(A)| < ε whenever A ∈ F and µ(A) < δ.
If µ(E) = 0 then |ν(E)| < ε for all ε > 0; consequently ν(E) = 0. If (P, N ) is a Hahn
decomposition of ν, then ν+ (E) = ν(E ∩P ) = 0 = ν− (E) = ν(E ∩N ); therefore, |ν|(E) = 0.
Proof. Sufficiency is obvious as |ν(A)| ≤ |ν|(A) for all A ∈ F . To prove necessity, assume
without loss of generality, that all the elements in G are signed measures of finite total
variation. For each ν ∈ G let (Pν , Nν ) be a Hahn decomposition of ν. Then, for ε > 0, there
is δ > 0 such that if µ(A) < δ then supν∈G |ν(A)| < ε. Then µ(A∩Pν )∨µ(A∩Nν ) ≤ µ(A) < δ
implies ν(A ∩ Pν ) ∨ (−ν(A ∩ Nν )) = ν+ (A) ∨ ν− (A) < ε for all ν ∈ G. This means that
supν∈G |ν|(A) ≤ 2ε.
Proof. (i) The limit set function µ is clearly a monotone finitely additive function in F with
µ(∅) = 0. For any pairwise disjoint sequence {An } ⊂ F with union A, the monotonicity and
additivity of µ imply that µ(A) ≥ Σ_{k=1}^{n} µ(Ak ) for all n. Thus Σ_{k=1}^{∞} µ(Ak ) ≤ µ(A). On the
other hand, for any c < µ(A), there is N such that c < µN (A) = Σ_k µN (Ak ) ≤ Σ_k µ(Ak ).
Letting c ↗ µ(A), we obtain µ(A) ≤ Σ_{k=1}^{∞} µ(Ak ). It follows that µ is countably additive.
(ii) It suffices to assume that ν is a probability measure. Indeed, for any measurable
partition {Ek : k ∈ N} of Ω such that 0 < ν(Ek ) < ∞, the measure ν′ = Σ_k (2^{−k}/ν(Ek )) 1Ek dν
is equivalent to ν. We may replace ν with ν′ .
Proof. Apply Vitali–Hahn–Saks theorem with Σ_k (2^{−k}/(1 + ‖µk‖)) |µk | in place of ν.
We conclude this section with a result that is useful in the foundations of Statistics.
Theorem 10.7.6. (Halmos–Savage) Let P be a family of complex measures (or finite signed
measures), not all of them zero, on a measurable space (Ω, F ). Suppose ν is a σ–finite
measure on (Ω, F ) such that µ ≪ ν for all µ ∈ P. Then, there exists a probability measure
m ≪ ν on (Ω, F ) such that sup{|µ(A)| : µ ∈ P} = 0 iff m(A) = 0.
Consider the collection C of sets C ∈ F for which there is µC ∈ P with µC (C) > 0 and
dµC/dν + 1_{C^c} > 0 ν–a.s. Choose a sequence {Cn } ⊂ C such that ν(Cn ) ↗ sup_{C∈C} ν(C). Define
C0 = ⋃_{n≥1} Cn and let m = Σ_n 2^{−n} µn , where µn is a choice for µCn . Clearly m(C0 ) > 0
and dm/dν + 1_{C0^c} > 0; hence, C0 ∈ C . It is clear that sup{µ(A) : µ ∈ P} = 0 implies that
m(A) = 0. To prove the converse implication, suppose that m(A) = 0. It follows from
\[ 0 = m(A\cap C_0) \ge \sum_n 2^{-n}\int_{A\cap C_n} \frac{d\mu_n}{d\nu}\,d\nu \ge 0 \]
that ν(A ∩ C0 ) = 0; hence, µ(A ∩ C0 ) = 0 for all µ ∈ P. We will show that µ(A ∩ C0^c ) = 0
whenever µ ∈ P. Set B = {dµ/dν > 0} and notice that µ(A) = µ(A ∩ B ∩ C0^c ). If µ(A ∩ B ∩ C0^c ) >
0, then dµ/dν + 1_{(A∩B∩C0^c)^c} > 0 by definition of B. Hence, A ∩ B ∩ C0^c ∈ C and
\[ \nu\big(C_0 \cup (A\cap B\cap C_0^{c})\big) > \nu(C_0) = \sup_{C\in\mathcal{C}} \nu(C), \]
which is a contradiction. Therefore, µ ≪ m ≪ ν for all µ ∈ P.
10.8. Exercises
Exercise 10.8.1. Let µ be a finitely additive function in F with values in R. Suppose
that lim_k µ(Ak ) = 0 for any sequence of measurable sets Ak ↘ ∅. Show that µ is a signed
measure. (Hint: If {An : n ∈ N} is a sequence of pairwise disjoint sets, ⋃_{m≥n} Am ↘ ∅.)
Exercise 10.8.2. Show that (M, ≤) is a partially ordered vector space, that is, ≤ is a
partial order and for any r ∈ [0, ∞) and n, m, k ∈ M, n ≤ m implies rn ≤ rm and
n + k ≤ m + k.
Exercise 10.8.3. Suppose that m, n, k, l ∈ M and r ∈ [0, ∞). Show that
(a) k + (m ∧ n) = (k + m) ∧ (k + n) and k + (m ∨ n) = (k + m) ∨ (k + n).
(b) r(m ∧ n) = (rm) ∧ (rn) and r(m ∨ n) = (rm) ∨ (rn).
(c) (m ∧ n) ∧ k = m ∧ (n ∧ k) and (m ∨ n) ∨ k = m ∨ (n ∨ k)
(d) m ∧ n + m ∨ n = m + n.
(e) |m + n| ≤ |m| + |n|.
(f) If m ≤ n and k ≤ l, then m ∧ k ≤ n ∧ l and m ∨ k ≤ n ∨ l.
Exercise 10.8.4. Using (10.13) and equation (10.4) show that
(10.28) (m ∧ n)(ψ) = inf{m(φ1 ) + n(φ2 ) : φi ∈ E+ , φ1 + φ2 = ψ}
for any ψ ∈ E+ .
Exercise 10.8.5. Let (Ω, B) be a measurable space. For any pair of finite signed measures
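µ and ν define
\[ \mu\vee\nu = \tfrac12\big(\mu+\nu+|\mu-\nu|\big), \qquad \mu\wedge\nu = \tfrac12\big(\mu+\nu-|\mu-\nu|\big). \]
Show that µ ∨ ν and µ ∧ ν are finite signed measures such that:
(a) For all F ∈ B,
\[ (\mu\vee\nu)(F) = \sup\{\mu(E)+\nu(F\setminus E) : \mathcal{B}\ni E\subset F\}, \]
\[ (\mu\wedge\nu)(F) = \inf\{\mu(E)+\nu(F\setminus E) : \mathcal{B}\ni E\subset F\}. \]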
(b) µ ≤ µ ∨ ν, ν ≤ µ ∨ ν, and if τ is a signed measure such that µ ≤ τ and ν ≤ τ , then
µ ∨ ν ≤ τ.
(c) µ ≥ µ ∧ ν, ν ≥ µ ∧ ν, and if λ is a signed measure such that µ ≥ λ and ν ≥ λ, then
µ ∧ ν ≥ λ.
In particular, µ+ = µ ∨ 0 and µ− = (−µ) ∨ 0 = −(µ ∧ 0).
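For signed measures on a finite set the two descriptions of µ ∨ ν and µ ∧ ν in Exercise 10.8.5 can be compared by brute force, since |µ − ν| then has point masses |µ({ω}) − ν({ω})|. The following Python sketch is an illustration of part (a) in that setting only, not a solution of the exercise; all function names and the sample weights are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def join(mu, nu):
    """mu ∨ nu = (mu + nu + |mu - nu|)/2, computed point by point."""
    return {w: (mu[w] + nu[w] + abs(mu[w] - nu[w])) / 2 for w in mu}

def meet(mu, nu):
    """mu ∧ nu = (mu + nu - |mu - nu|)/2, computed point by point."""
    return {w: (mu[w] + nu[w] - abs(mu[w] - nu[w])) / 2 for w in mu}

def mass(m, subset):
    """Value of the signed measure m on a finite set of points."""
    return sum((m[w] for w in subset), Fraction(0))

def subsets(points):
    """All subsets of a finite set, as frozensets."""
    pts = sorted(points)
    for r in range(len(pts) + 1):
        for combo in combinations(pts, r):
            yield frozenset(combo)

def join_by_sup(mu, nu, F):
    return max(mass(mu, E) + mass(nu, F - E) for E in subsets(F))

def meet_by_inf(mu, nu, F):
    return min(mass(mu, E) + mass(nu, F - E) for E in subsets(F))

mu = {1: Fraction(2), 2: Fraction(-1), 3: Fraction(0)}
nu = {1: Fraction(-3), 2: Fraction(4), 3: Fraction(1)}
whole = frozenset(mu)
```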
Exercise 10.8.6. Suppose µ, ν are signed measures of finite variation on a measurable space
(Ω, F ). If µ, ν ≪ λ for some σ–finite measure λ, and f = dµ/dλ and f ′ = dν/dλ, show
that d|µ − ν|/dλ = |f − f ′ |, and d(µ ∧ ν)/dλ = f ∧ f ′ .
Exercise 10.8.7. Let µ and ν be probability measures on (Ω, F ). Show that
\[ \tfrac12\|\mu-\nu\|_{TV} = (\mu-\nu)_+(\Omega) = (\mu-\nu)_-(\Omega) = 1-(\mu\wedge\nu)(\Omega) = \sup_{A\in\mathcal{F}} |\mu(A)-\nu(A)|. \]
Exercise 10.8.9. Let m, n ∈ M∗ (E) and suppose there exists a sequence (φk ) ⊂ E with
supk φk = 1. If m ⊥ n, then there exists a partition {A1 , A2 } ⊂ E Σ of Ω such that
|m|(A1 ) = 0 = |n|(A2 ). (Hint: Ω is the countable union of sets in E σ .)
Exercise 10.8.10. Let m be a σ–finite elementary integral over a ring lattice closed under
chopping E ⊂ Bb (Ω). For any countable collection F ⊂ L0 (E, m), show that there is an
elementary integral n ≪ m such that F ⊂ L1 (E, n).
Exercise 10.8.11. Suppose µ is a measure on (R, B(R)). The map G : x ↦ x^2 induces a
measure on ([0, ∞), B([0, ∞))). If µ ≪ λ, show that ν = µ ◦ G−1 ≪ λ and
\[ \frac{d\nu}{d\lambda}(t) = \frac{1}{2\sqrt{t}}\big(f(-\sqrt{t})+f(\sqrt{t})\big)\,1_{(0,\infty)}(t) \]
where f = dµ/dλ. In particular, if µ(dx) = (1/√(2π)) e^{−x^2/2} dx, we obtain the χ^2_1 –measure
\[ \chi^2_1(dt) = \frac{1}{\sqrt{2\pi t}}\,e^{-t/2}\,1_{(0,\infty)}(t)\,dt. \]
Exercise 10.8.12. (Box–Muller) Let µ = 1(0,1)2 · λ2 , where λ2 is Lebesgue’s measure on
(R2 , B(R2 )). Let T : (0, 1)2 → R2 be defined by
\[ T(u_1,u_2) = \Big(\sqrt{-2\log(u_1)}\,\cos(2\pi u_2),\ \sqrt{-2\log(u_1)}\,\sin(2\pi u_2)\Big)^{\intercal}. \]
Show that T is a diffeomorphism from (0, 1)2 to R2 \ (R+ × {0}). Conclude that the induced
measure µ ◦ T −1 is the normal distribution on R2 with mean 0 and covariance matrix I2
(the two–by–two identity matrix).
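Before attempting Exercise 10.8.12 it may help to see numerically what Theorem 10.6.1 predicts: the density of µ ◦ T −1 at T (u) is 1/| det T ′ (u)|, which should agree with the standard normal density on R2. The Python sketch below estimates det T ′ by central finite differences; the function names, the step size `h` and the test point are assumptions made for this illustration, and the sketch is not a proof.

```python
import math

def box_muller(u1, u2):
    """The map T of Exercise 10.8.12."""
    r = math.sqrt(-2 * math.log(u1))
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

def box_muller_inverse(x, y):
    """Inverse of T off the half-line R+ x {0}."""
    u1 = math.exp(-(x * x + y * y) / 2)
    u2 = (math.atan2(y, x) / (2 * math.pi)) % 1.0
    return u1, u2

def jacobian_det(u1, u2, h=1e-6):
    """Central finite-difference estimate of det T'(u1, u2)."""
    xp, yp = box_muller(u1 + h, u2)
    xm, ym = box_muller(u1 - h, u2)
    dx_du1, dy_du1 = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    xp, yp = box_muller(u1, u2 + h)
    xm, ym = box_muller(u1, u2 - h)
    dx_du2, dy_du2 = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    return dx_du1 * dy_du2 - dx_du2 * dy_du1

u1, u2 = 0.3, 0.2
x, y = box_muller(u1, u2)
induced_density = 1 / abs(jacobian_det(u1, u2))
normal_density = math.exp(-(x * x + y * y) / 2) / (2 * math.pi)
```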
Exercise 10.8.13. Let λ2 be Lebesgue’s measure on (R2 , B(R2 )). Let
\[ D := \{(u_1,u_2) : u_1, u_2 > 0,\ u_1^{1/\alpha} + u_2^{1/\beta} < 1\} \]
for α, β > 0 and let c = λ2 (D). Let µ := (1/c) 1D (u1 , u2 ) · λ2 . Notice that D ⊂ (0, 1)2 . Let
\[ X(u_1,u_2) = \frac{u_1^{1/\alpha}}{u_1^{1/\alpha} + u_2^{1/\beta}}. \]
Show that the induced law Bα,β := µ◦X −1 is absolutely continuous with respect to Lebesgue
measure λ1 on R, and
\[ \frac{dB_{\alpha,\beta}}{d\lambda_1}(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}\,1_{(0,1)}(x). \]
This is the beta distribution with parameters α and β (Hint: See Example 9.6.11).
Exercise 10.8.14. Let µ and ν be measures on (R, B(R)) with µ ≪ ν. Suppose that
µ((−∞, c]) < µ(R) < ∞ and consider the map T : x ↦ x ∧ c. Show that the induced measure
µ ◦ T −1 satisfies d(µ ◦ T −1 ) = f 1(−∞,c) dν + µ([c, ∞)) dδc , where f = dµ/dν.
Exercise 10.8.15. The condition ν(Ω) < ∞ in Lemma 10.7.2 is necessary, as the following
exercise shows. Consider (R, B(R), λ) and define ν(A) := ∫_A |x| dx. Show that ν ≪ λ;
however, for no ε > 0 does there exist δ > 0 such that A ∈ B(R), λ(A) < δ implies ν(A) < ε.
Chapter 11
Differentiation
In this chapter we apply the results of the previous chapters to the case of Borel σ–finite
measures in Rd. In particular, we extend the Fundamental Theorem of Calculus to the
setting of Lebesgue integration.
One way to study the existence of (11.1) at a point x is to compare the variation measure
|µ| with the Lebesgue measure through a maximal ratio.
Definition 11.1.2. Hardy’s maximal function Mµ of µ at x is given by
\[ M\mu(x) = \sup_{r>0} \frac{|\mu|(B(x;r))}{\lambda(B(x;r))} \tag{11.2} \]
Lemma 11.1.3. The map x 7→ Mµ (x) is lower semicontinuous.
Proof. Without loss of generality we assume that µ ≥ 0. For any t > 0 we show that
Et = {Mµ > t} is open. For x ∈ Et , there is r > 0 such that µ(B(x; r)) = pλ(B(x; r)) with
p > t. Choose δ > 0 small enough so that
(r + δ)^d < r^d p/t.
Observe that if y ∈ B(x; δ), then B(x; r) ⊂ B(y; r + δ). By translation invariance and the
dilation property of λ,
\[ \mu(B(y;r+\delta)) \ge \mu(B(x;r)) = p\,\lambda(B(x;r)) = \frac{p\,r^{d}}{(r+\delta)^{d}}\,\lambda(B(y;r+\delta)) > t\,\lambda(B(y;r+\delta)). \]
Therefore, B(x; δ) ⊂ Et .
The next result is a covering Lemma that depends on the properties of Lebesgue mea-
sure.
Lemma 11.1.4. Let W be the union of a finite collection of open balls B(xi ; ri ), i =
1, . . . , N . Then, there exists a set S ⊂ {1, . . . , N } such that
(a) The balls B(xj ; rj ), j ∈ S are pairwise disjoint;
(b) W ⊂ ∪j∈S B(xj ; 3rj );
(c) λ(W ) ≤ 3^d Σ_{j∈S} λ(B(xj ; rj )).
By construction, (a) holds. Observe that in any metric space, if r′ ≤ r and B(x; r′ ) ∩
B(y; r) ≠ ∅, then B(x; r′ ) ⊂ B(y; 3r). Thus, (b) follows. Finally, (c) follows from (b) and
the dilation property of λ: λ(B(0; ar)) = a^d λ(B(0; r)).
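The selection in Lemma 11.1.4 is usually made greedily: run through the balls in order of decreasing radius and keep a ball whenever it is disjoint from all balls already kept. The Python sketch below assumes this standard construction (the function name `select_disjoint` and the sample balls are hypothetical) and checks properties (a) and (b) on a small planar example.

```python
import math

def select_disjoint(balls):
    """Greedy selection for a finite list of open balls (center, radius).

    Balls are visited by decreasing radius; a ball is kept when it is disjoint
    from every ball kept so far. Returns the list S of kept indices.
    """
    order = sorted(range(len(balls)), key=lambda i: -balls[i][1])
    chosen = []
    for i in order:
        center_i, radius_i = balls[i]
        # Two open balls are disjoint iff the centers are at least r_i + r_j apart.
        if all(math.dist(center_i, balls[j][0]) >= radius_i + balls[j][1]
               for j in chosen):
            chosen.append(i)
    return chosen

balls = [((0.0, 0.0), 1.0), ((1.5, 0.0), 1.0), ((3.0, 0.0), 0.5),
         ((10.0, 0.0), 2.0), ((0.2, 0.1), 0.3)]
S = select_disjoint(balls)
```

Every discarded ball meets a kept ball of radius at least as large, which is exactly the situation of the metric space observation used for (b).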
Proof. Let K ⊂ {Mµ > t} be compact. Any x ∈ K is the center of a ball Bx with |µ|(Bx ) >
tλ(Bx ). By compactness, there is a finite subcover of K by balls Bx ; Lemma 11.1.4 implies
the existence of a finite subcollection of pairwise disjoint balls {Bj : j = 1, . . . , N } such that
\[ \lambda(K) \le 3^{d}\sum_{j=1}^{N}\lambda(B_j) \le \frac{3^{d}}{t}\sum_{j=1}^{N}|\mu|(B_j) \le \frac{3^{d}}{t}\,\|\mu\|_{TV}. \]
Since Lebesgue measure λ is regular, (11.3) follows by taking the supremum over all compact
K ⊂ {Mµ > t}.
(i) If p = 1, then
\[ \lambda(\{Mf>t\}) \le \frac{3^{d}}{t}\,\|f\|_1 \tag{11.4} \]
(ii) If 1 < p ≤ ∞, then M f ∈ Lp and there is a constant Cp such that
\[ \|Mf\|_p \le C_p\,\|f\|_p \tag{11.5} \]
Therefore ‖M f‖∞ ≤ ‖f‖∞ . If 1 < p < ∞ then, for 0 < c < 1 and t > 0, let gt = f 1{f >ct} and ht = f − gt , where we may assume f ≥ 0 since M f = M |f |.
Clearly ht ∈ L∞ and, from Chebyshev’s and Hölder’s inequalities, we also have that gt ∈ L1 .
Hence, M f ≤ M gt + M ht ≤ M gt + ct, and so, {M f > t} ⊂ {M gt > (1 − c)t}. By Hardy–
Littlewood’s theorem,
\[ \lambda(\{Mf>t\}) \le \lambda(\{Mg_t>(1-c)t\}) \le \frac{3^{d}}{t(1-c)}\,\|g_t\|_1 = \frac{3^{d}}{t(1-c)}\int_{\{f>ct\}} f(x)\,dx. \]
This proves all the statements in this Theorem. The constant Cp in this case can be chosen
to be minimal by letting c = (p − 1)/p = 1/q, where q = p/(p − 1); this gives Cp ≈ (3^d epq)^{1/p} .
The following result states that almost every point in Rd is a Lebesgue point of f .
Theorem 11.1.7. If µ ≪ λ and f = dµ/dλ, then Dµ exists λ–a.s. and f = Dµ λ–a.s.
Proof. It suffices to assume that f ∈ L1 (Rd , B(Rd ), λ). For each f ∈ L1 (λ) define the maps
Tr f , with r > 0, and T f by
\[ (T_r f)(x) = \frac{1}{\lambda(B(x;r))}\int_{B(x;r)} |f(y)-f(x)|\,\lambda(dy), \]
\[ (Tf)(x) = \limsup_{r\searrow 0}\,(T_r f)(x). \]
The following result shows that the symmetric derivative of a measure that is singular with
respect to Lebesgue measure vanishes λ–almost surely.
Theorem 11.1.8. If µ ⊥ λ, then Dµ = 0 λ–a.s.
Since µ ⊥ λ, there is a set E such that λ(E) = 0 = µ(Rd \ E). Given ε > 0, there is,
by regularity, a compact K ⊂ E such that µ(K) > ‖µ‖T V − ε. If µ1 (·) = µ(· ∩ K) and
µ2 (·) = µ(· ∩ K c ), then ‖µ2‖ < ε and D̄µ1 (x) = Dµ1 (x) = 0 for any x ∈ K c . Hence
D̄µ (x) = D̄µ2 (x) ≤ Mµ2 (x), x ∈ K c .
Therefore, {D̄µ > t} = ({D̄µ > t} ∩ K) ∪ ({D̄µ > t} ∩ K c ) ⊂ K ∪ {D̄µ2 > t}. Since
λ(K) ≤ λ(E) = 0, Hardy–Littlewood’s lemma implies that
\[ \lambda(\{\bar D\mu>t\}) \le \frac{3^{d}}{t}\,\|\mu_2\| < \frac{3^{d}}{t}\,\varepsilon. \]
Letting ε ↘ 0 gives λ({D̄µ > t}) = 0 for all t > 0. We conclude that Dµ exists and
Dµ = D̄µ = 0 λ–a.s.
Corollary 11.1.9. Let µ = µa + µs = f dλ + µs be the Radon–Nikodym decomposition of
a complex or signed measure µ in B(Rd ). Then Dµ exists and Dµ = f λ–a.s.
Remark 11.1.10. In the case µ ≪ λ, open balls B(x; r) can be replaced by other types
of sets whose Lebesgue measures are proportional to those of a ball. For instance, we
can consider sets E(x; r) ⊂ B(x; r) for which there is a fixed number a > 0 such that
λ(E(x; r)) ≥ aλ(B(x; r)). In such case,
\[ \frac{1}{\lambda(E(x;r))}\int_{E(x;r)} |f(y)-f(x)|\,\lambda(dy) \le \frac{1}{a\,\lambda(B(x;r))}\int_{B(x;r)} |f(y)-f(x)|\,\lambda(dy). \]
Proof. Let µ be the unique measure on B([a, b]) such that µ((x, y]) = f (y) − f (x). The
absolute continuity of f means that µ ≪ λ since µ([a, b]) = f (b) − f (a) < ∞. Let g be
the Radon–Nikodym derivative of µ w.r.t. λ. By Theorem 11.1.7 we have that Dµ = g a.s.
Since
\[ D\mu(x) = \lim_{h\to 0}\frac{\mu((x-h,x])}{\lambda((x-h,x])} = \lim_{h\to 0}\frac{\mu((x,x+h])}{\lambda((x,x+h])} = g(x), \]
we have that f is differentiable λ–a.s. with f ′ = g, and that
\[ f(x)-f(a) = \mu((a,x]) = \int_{(a,x]} f'(t)\,dt. \]
If Vf (b) < ∞, we say that f is a function of finite variation on [a, b]. It is easy to verify
that a function that is absolutely continuous on an interval [a, b] is automatically of finite
variation.
Proof. Let a < x < y < b. For any partition a = t0 < . . . < tn = x we have
\[ \sum_{j=1}^{n} |f(t_j)-f(t_{j-1})| + |f(y)-f(x)| \le V_f(y). \]
Hence Vf (x) + |f (y) − f (x)| ≤ Vf (y). Therefore Vf (x) + f (x) − f (y) ≤ Vf (y) and Vf (x) +
f (y) − f (x) ≤ Vf (y).
Suppose that f is absolutely continuous. Then, given ε > 0, there is δ > 0 such that
whenever Σ_j (bj − aj ) < δ, we have Σ_j |f (bj ) − f (aj )| < ε/2. For each interval [aj , bj ],
choose a partition Pj = {tj,k } ⊂ [aj , bj ] such that
\[ V_f(b_j)-V_f(a_j)-\frac{\varepsilon}{2^{j+1}} < \sum_k |f(t_{j,k})-f(t_{j,k-1})|. \]
Since Σ_j Σ_k (tj,k − tj,k−1 ) = Σ_j (bj − aj ) < δ, we conclude that
\[ \sum_j |V_f(b_j)-V_f(a_j)| < \frac{\varepsilon}{2} + \sum_j\sum_k |f(t_{j,k})-f(t_{j,k-1})| < \varepsilon. \]
Example 11.2.4. If f is a Lipschitz function in [a, b], then it is clearly of bounded variation
and absolutely continuous. Then f is differentiable λ–a.e., f ′ ∈ L1 ([a, b]) and (11.8)
holds.
Proof. Let C be the at most countable set where f ′ does not exist. We extend f ′ to [a, b]
by setting f ′ (x) = 0 for x ∈ C. Since f ′ ∈ L1 ([a, b]), there is a l.s.c. g on [a, b] such that
f ′ ≤ g and ∫_{[a,b]} g(t) dt < ∫_{[a,b]} f ′ (t) dt + ε. For any η > 0, define
\[ F_\eta(x) = \int_{[a,x]} g(t)\,dt - (f(x)-f(a)) + \eta(x-a), \qquad a\le x\le b. \]
By adding a small constant to g if necessary, we can assume that f ′ < g. The lower
semicontinuity of g implies that for every x ∈ [a, b] \ C there is δx > 0 such that
\[ g(t) > f'(x), \qquad \frac{f(t)-f(x)}{t-x} < f'(x)+\eta \]
for all t ∈ (x, x + δx ). Hence,
\[ F_\eta(t)-F_\eta(x) = \int_{[x,t]} g(s)\,ds - (f(t)-f(x)) + \eta(t-x) > f'(x)(t-x) - (f'(x)+\eta)(t-x) + \eta(t-x) = 0. \]
We claim that Fη is strictly increasing. Suppose that for some a ≤ x1 < x2 ≤ b we have
Fη (x1 ) > Fη (x2 ). For each Fη (x2 ) < y < Fη (x1 ) define
xy = sup{x ∈ [x1 , x2 ] : Fη (x) ≥ y}
The continuity of Fη implies that Fη (xy ) = y and xy ∈ C. Since C is countable, we reach a
contradiction.
As Fη (a) = 0, we have that Fη (x) > 0 for all a < x ≤ b. Letting η → 0 gives
\[ f(x)-f(a) \le \int_{[a,x]} g(t)\,dt < \int_{[a,x]} f'(t)\,dt + \varepsilon, \]
and since ε is arbitrary, we conclude that
\[ f(x)-f(a) \le \int_{[a,x]} f'(t)\,dt. \]
The reverse inequality follows by taking −f instead of f .
where G(t−) = lim_{s↗t} G(s) and µG , µF are the signed measures induced by G and F
respectively.
Proof. By Lemma 11.2.2 we can assume without loss of generality that G and F are
nondecreasing. Let µG and µF be the unique Borel measures on (a, b] such that µG ((α, β]) =
G(β) − G(α), µF ((α, β]) = F (β) − F (α) for all (α, β] ⊂ I, −∞ < α < β < ∞. By Fubini’s
theorem,
\[ (F(b)-F(a))(G(b)-G(a)) = \int_{(a,b]\times(a,b]} \mu_F\otimes\mu_G(dt,ds) \]
\[ = \int_{(a,b]}\Big(\int_{(a,s]}\mu_F(dt)\Big)\mu_G(ds) + \int_{(a,b]}\Big(\int_{(a,t)}\mu_G(ds)\Big)\mu_F(dt) \]
\[ = \int_{(a,b]} F(s)\,\mu_G(ds) + \int_{(a,b]} G(t-)\,\mu_F(dt) - F(a)\mu_G((a,b]) - G(a)\mu_F((a,b]). \]
Denoting by ∆G(t) = G(t) − G(t−) the size of the jump of G at t, we can express (11.9)
as
\[ \int_{(a,b]} F(t)\,\mu_G(dt) = F(b)G(b)-F(a)G(a) - \int_{(a,b]} G(t)\,\mu_F(dt) + \sum_{a<t\le b}\Delta G(t)\,\Delta F(t). \]
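For right–continuous step functions the integrals in the integration by parts formula reduce to finite sums over the jump times, so both forms of the identity can be verified by hand. The Python sketch below does this for one pair of step functions; the helper names and the particular jumps are hypothetical and serve only as an illustration.

```python
def step_value(start, jumps, t, left_limit=False):
    """Value at t (or left limit at t) of a right-continuous step function.

    start is the value before the first jump; jumps maps jump times to sizes.
    """
    total = start
    for s, size in jumps.items():
        if s < t or (s == t and not left_limit):
            total += size
    return total

def stieltjes_sum(integrand, jumps, a, b):
    """Integral of integrand over (a, b] against the pure-jump measure of jumps."""
    return sum(integrand(s) * size for s, size in jumps.items() if a < s <= b)

a, b = 0, 4
F0, F_jumps = 1, {1: 2, 2: -1}   # F = 1 on [0,1), 3 on [1,2), 2 on [2,4]
G0, G_jumps = 3, {1: 1, 3: 4}    # G = 3 on [0,1), 4 on [1,3), 8 on [3,4]

def F(t): return step_value(F0, F_jumps, t)
def G(t): return step_value(G0, G_jumps, t)
def G_left(t): return step_value(G0, G_jumps, t, left_limit=True)

int_F_dG = stieltjes_sum(F, G_jumps, a, b)
int_Gleft_dF = stieltjes_sum(G_left, F_jumps, a, b)
int_G_dF = stieltjes_sum(G, F_jumps, a, b)
jump_term = sum(F_jumps.get(t, 0) * G_jumps.get(t, 0)
                for t in set(F_jumps) | set(G_jumps) if a < t <= b)
boundary = F(b) * G(b) - F(a) * G(a)
```

The only common jump time is t = 1, which is where the correction term ∆G(t)∆F (t) contributes.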
Remark 11.3.2. (Differential notation) The integration by parts formula (11.9) is commonly
written as
d(F G) = F dG + G− dF
where G− (t) = G(t−) and dG stands for the Lebesgue–Stieltjes measure µG .
Example 11.3.3. The function f (x) = (sin x / x) 1(0,∞) (x) is not integrable. However,
\[ \lim_{x\to\infty}\int_0^{x}\frac{\sin t}{t}\,dt = \frac{\pi}{2}. \]
Indeed, first notice that ∫_0^∞ |f (t)| dt ≥ (2/π) Σ_n 1/n = ∞, since the integral of |f | over [(n − 1)π, nπ] is at least 2/(nπ).
As for the second statement, notice that F (x, y) = e^{−xy} sin x is integrable over any region
of the form {0 < x ≤ a, y > 0}. By Fubini’s theorem
\[ \int_0^{a}\!\!\int_0^{\infty} e^{-xy}\sin x\,dy\,dx = \int_0^{a}\frac{\sin x}{x}\,dx = \int_0^{\infty}\!\!\int_0^{a} e^{-xy}\sin x\,dx\,dy. \]
Integrating by parts we obtain
\[ \int_0^{a} e^{-xy}\sin x\,dx = -e^{-ay}\cos a + 1 - \int_0^{a} y\,e^{-xy}\cos x\,dx, \]
\[ \int_0^{a} y\,e^{-xy}\cos x\,dx = y\,e^{-ay}\sin a + \int_0^{a} y^{2}e^{-xy}\sin x\,dx. \]
Collecting and rearranging all terms gives
\[ \int_0^{a} e^{-xy}\sin x\,dx = \frac{1}{1+y^{2}}\Big(1 - e^{-ay}\cos a - y\,e^{-ay}\sin a\Big). \]
Hence
\[ \int_0^{a}\frac{\sin x}{x}\,dx = \frac{\pi}{2} - \cos a\int_0^{\infty}\frac{e^{-ay}}{1+y^{2}}\,dy - \sin a\int_0^{\infty}\frac{y\,e^{-ay}}{1+y^{2}}\,dy. \]
The conclusion follows by letting a → ∞.
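The closed form obtained in Example 11.3.3 for the inner integral of e^{−xy} sin x over 0 ≤ x ≤ a can be compared with a direct numerical quadrature. The Python sketch below uses a midpoint rule; the function names, the number of steps and the test values of a and y are assumptions made for this illustration.

```python
import math

def closed_form(a, y):
    """(1 - e^{-ay} cos a - y e^{-ay} sin a) / (1 + y^2)."""
    decay = math.exp(-a * y)
    return (1 - decay * math.cos(a) - y * decay * math.sin(a)) / (1 + y * y)

def midpoint(func, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

def laplace_sine_integral(a, y):
    """Numerical value of the integral of e^{-xy} sin x over 0 <= x <= a."""
    return midpoint(lambda x: math.exp(-x * y) * math.sin(x), 0.0, a)

a, y = 3.0, 0.5
numeric = laplace_sine_integral(a, y)
exact = closed_form(a, y)
```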
Example 11.3.4. Suppose F is a right–continuous function of locally finite variation on
I = [0, ∞) and that inf t∈[a,b] |F (t)| > 0 for any [a, b] ⊂ I. Then 1/F is also right–continuous
and of locally finite variation on I. Applying (11.9) with G = 1/F we obtain
\[ 0 = F\,d\Big(\frac{1}{F}\Big) + \frac{1}{F_-}\,dF. \]
The uniqueness of the Radon–Nikodym derivative implies that
\[ d\Big(\frac{1}{F}\Big) = -\frac{1}{F(t)\,F(t-)}\,dF. \]
Example 11.3.5. If G is a continuous function of locally finite variation then,
(11.10) dG^n = nG^{n−1} (t) dG
for each n ∈ Z+ . For n = 1 this is evidently true. By induction assume that equation (11.10)
holds for n ≥ 1. Then, an application of (11.9) implies
d(G^{n+1} ) = G(t) dG^n + G^n (t) dG = nG^n (t) dG + G^n (t) dG
= (n + 1)G^n (t) dG.
A simple consequence of (11.10) is
d e^{G(t)} = e^{G(t)} dG(t)
for any continuous function G of locally finite variation on I.
Lemma 11.3.6. Suppose G is right–continuous nondecreasing in the interval [0, T ) (0 <
T ≤ ∞). Then, for any n ∈ N
\[ \int_{(0,t]} G^{n-1}(s-)\,\mu_G(ds) \le \frac{G^{n}(t)-G^{n}(0)}{n} \le \int_{(0,t]} G^{n-1}(s)\,\mu_G(ds) \]
for all 0 < t < T . (In differential notation, nG_−^{n−1} dG ≤ dG^n ≤ nG^{n−1} dG.)
Taking v ≡ −1 shows that S is the unique solution to the equation above with ‖S1(0,t]‖_u =
1 < ∞. Therefore
\[ 1-F(t) = \exp\big(-Q^{c}(t)\big)\prod_{0<x_j\le t}\big(1-\Delta Q(x_j)\big). \]
then,
\[ x(t) \le \alpha(t) + \int_{(0,t)} \alpha(s)\exp(\mu(s,t))\,\mu(ds) \tag{11.15} \]
Proof. Set h(t) to be the right hand side of (11.13). By the fundamental theorem of
Calculus
ḣ(t) = α̇(t) + β(t)x(t) ≤ α̇(t) + β(t)h(t).
This implies that
\[ \Big(\exp\Big(-\int_a^{t}\beta(r)\,dr\Big)h(t)\Big)' \le \dot\alpha(t)\exp\Big(-\int_a^{t}\beta(r)\,dr\Big). \]
Integrating over [a, t]
\[ h(t) \le \alpha(a)\exp\Big(\int_a^{t}\beta(r)\,dr\Big) + \int_a^{t}\dot\alpha(s)\exp\Big(\int_s^{t}\beta(r)\,dr\Big)ds. \tag{11.16} \]
Integration by parts leads to
\[ x(t) \le h(t) \le \alpha(t) + \int_a^{t}\alpha(s)\beta(s)\exp\Big(\int_s^{t}\beta(r)\,dr\Big)ds. \]
If α is non–decreasing, then α̇ ≥ 0 and, since β ≥ 0, (11.16) reduces to
\[ x(t) \le h(t) \le \alpha(a)\exp\Big(\int_a^{t}\beta(r)\,dr\Big) + \int_a^{t}\dot\alpha(s)\exp\Big(\int_a^{t}\beta(r)\,dr\Big)ds \le \alpha(t)\exp\Big(\int_a^{t}\beta(r)\,dr\Big). \]
converges absolutely and uniformly on any compact K ⊂ B(a; R) and diverges for all z with
|z − a| > R. The number R is called the radius of convergence of f .
Proof. Since lim sup_n |cn (z − a)^n |^{1/n} = |z − a|/R, the first and second statements follow from
Theorem A.1.4[i,ii]. If K ⊂ B(a; R) is compact, then K ⊂ B(a; r) ⊂ B(a; R) for some
0 < r < R. The last statement follows from the first one.
In the rest of this section we will use complex measures to derive several properties of
analytic functions. We start with the following fundamental result.
Theorem 11.4.2. Let µ be a complex measure on a measurable space (Ω, F ) and let ϕ be
a complex–valued measurable function on Ω. Suppose D ⊂ C is an open set which does not
intersect ϕ(Ω). Then, the map f : D → C given by
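\[ f(z) = \int_\Omega \frac{\mu(d\omega)}{\varphi(\omega)-z} \]
where
\[ c_n = \int_\Omega \frac{\mu(d\omega)}{(\varphi(\omega)-a)^{n+1}}, \qquad |c_n| \le \frac{\|\mu\|_{TV}}{r^{n+1}}, \qquad n\in\mathbb{Z}_+. \tag{11.19} \]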
If R is the radius of convergence of (11.18), then r ≤ R.
Therefore,
\[ \frac{f(z)}{z} = \frac{\alpha-i\beta}{2} + \frac{\alpha+i\beta}{2}\,\frac{\bar z}{z} + \frac{\eta(z)}{z}. \]
Thus, f is holomorphic at 0 only if α = −iβ which is equivalent to the Cauchy–Riemann
equations (11.20).
Conversely, if f is holomorphic at z0 , it is obvious that f is differentiable as a function in
the plane. The Cauchy–Riemann equations follow by comparing the real and imaginary
parts in
\[ f'(0) = \lim_{h\to 0}\frac{f(h)-f(0)}{h} = \lim_{k\to 0}\frac{f(ik)-f(0)}{ik}. \]
The following result shows that a function that is analytic around a point a ∈ D, is also
holomorphic at any point close enough to a.
Theorem 11.4.6. Suppose that the power series
\[ f(z) = \sum_{n=0}^{\infty} c_n (z-a)^{n} \tag{11.21} \]
converges inside the disk B(a; r), r > 0. Then, f is holomorphic and analytic in
B(a; r), f admits derivatives f^{(k)} of any order k ∈ Z+ , all of which are holomorphic and
analytic in B(a; r). Moreover,
\[ f^{(k)}(z) = \sum_{n=k}^{\infty} \frac{n!}{(n-k)!}\,c_n (z-a)^{n-k}, \qquad z\in B(a;r), \tag{11.22} \]
and cn = f^{(n)} (a)/n! for each n ∈ Z+ .
Proof. We first show that (a) f is analytic at any point w ∈ B(a; r), and then that (b) f and
f ′ are analytic and holomorphic on B(a; r). For derivatives of order k > 1, the statement
will follow by applications of (a) and (b) inductively on f (k−1) . The last statement follows
by setting z = a in (11.22).
The convergence of the power series f in B(a; r) implies that r ≤ 1/ lim sup_n |cn |^{1/n} . Since
lim_n n^{1/n} = 1, we conclude that the power series (11.22) (k = 1) converges absolutely in
B(a; r). Let w ∈ B(a; r) and choose δ > 0 so that ρ := |a − w| + δ < r. Then, for any
z ∈ B(w; δ) we have that
\[ \sum_{n=0}^{\infty} c_n (z-a)^{n} = \sum_{n=0}^{\infty} c_n \sum_{j=0}^{n}\binom{n}{j}(z-w)^{j}(w-a)^{n-j} = \sum_{n=0}^{\infty}\sum_{j=0}^{\infty} c_{n,j}(z), \]
where c_{n,j} (z) = \binom{n}{j} cn (z − w)^j (w − a)^{n−j} 1[0,n] (j). Observe that if u = a + |a − w| + |w − z|,
then |u − a| < r and thus,
\[ \sum_{n=0}^{\infty}\sum_{j=0}^{\infty} |c_{n,j}(z)| = \sum_{n=0}^{\infty} |c_n|\,\big(|z-w|+|w-a|\big)^{n} = \sum_{n=0}^{\infty} |c_n|\,(u-a)^{n}. \]
Example 11.4.9. (Complex powers) Let Lθ0 be the branch of logarithm defined on Ωθ0 =
{z ∈ C : |z| > 0, θ0 < arg(z) < θ0 + 2π}. For any α ∈ C, the complex power function pα
on Ωθ0 is defined as
pα : z ↦ z^α := exp(αLθ0 (z)), z ∈ Ωθ0
Then, pα ∈ H(Ωθ0 ) and p′α (z) = αz^{α−1} on Ωθ0 . If α ∈ Z, then pα coincides with the usual
integer power function restricted to Ωθ0 .
Corollary 11.4.10. If f (z) = Σ_{n=0}^{∞} cn (z − a)^n for all z ∈ B(a; r) and f ′ ≡ 0, then f ≡ c0 .
Proof. If f ′ ≡ 0, then ncn = 0 for all n ∈ N; hence, f (z) = c0 for all z ∈ B(a; r).
The following result, based on Theorems 11.4.2 and 11.4.6, plays a very important role in
the theory of complex functions.
Theorem 11.5.1. Let γ be a closed path in the complex plane and D = C \ γ ∗ . The map
on D defined by
\[ \mathrm{Ind}_\gamma(z) = \frac{1}{2\pi i}\int_\gamma \frac{d\xi}{\xi-z} \]
is an integer valued function, constant on each connected component of D and 0 in the
unbounded component of D.
Proof. Let z ∈ D be fixed and let the interval [a, b] be the parameter domain of the closed
path γ. Consider the map
\[ \varphi(t) = \exp\Big(\int_a^{t}\frac{\gamma'(s)}{\gamma(s)-z}\,ds\Big), \qquad t\in[a,b]. \]
We will show that ϕ(b) = 1. The fundamental theorem of calculus implies that
\[ \varphi'(t) = \frac{\varphi(t)\,\gamma'(t)}{\gamma(t)-z}, \]
which in turn, implies that
\[ \frac{d}{dt}\Big(\frac{\varphi}{\gamma-z}\Big) = 0. \]
Consequently, the map ϕ/(γ − z) is a constant function over the interval [a, b]. In particular,
\[ \frac{\varphi(b)}{\gamma(b)-z} = \frac{\varphi(a)}{\gamma(a)-z} = \frac{1}{\gamma(b)-z} \]
since ϕ(a) = 1 and γ(b) = γ(a). Therefore, ϕ(b) = 1 and thus, as ϕ(b) = exp(2πi Indγ (z)), Ind(z) ∈ Z.
To prove the last statement, observe that Ind is analytic on D by Theorem 11.4.2; being an
integer valued function, it follows that Ind is constant on each connected component of D.
Since γ ∗ is compact, we can choose a ball large enough that contains it. The complement of
this ball is contained in one connected component of D; thus, D has a unique unbounded
component. Since
\[ |\mathrm{Ind}(z)| \le \frac{\Lambda(\gamma)}{\mathrm{dist}(z,\gamma^{*})}, \]
we conclude that Ind(z) = 0 for all z in the unbounded component of D.
Indeed, consider the parameterization γ(t) = a + reit , with 0 ≤ t ≤ 2π. By Theorem 11.5.1
it is enough to consider z = a. Then,
\[ \mathrm{Ind}_\gamma(a) = \frac{1}{2\pi i}\int_\gamma \frac{dz}{z-a} = \frac{r}{2\pi}\int_0^{2\pi} e^{it}\,(re^{it})^{-1}\,dt = 1. \]
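The index of Theorem 11.5.1 can also be approximated numerically by applying a quadrature rule to the integral of γ′ (t)/(γ(t) − z) over the parameter interval. The Python sketch below does this for a circle; the function name `winding_number`, the midpoint rule and the sample circle are assumptions made for this illustration.

```python
import cmath
import math

def winding_number(path, dpath, z, a=0.0, b=2 * math.pi, steps=2000):
    """Midpoint-rule approximation of (1/(2 pi i)) * integral of dxi/(xi - z).

    path(t) parametrizes the closed path on [a, b] and dpath(t) is its derivative.
    """
    h = (b - a) / steps
    total = 0j
    for i in range(steps):
        t = a + (i + 0.5) * h
        total += dpath(t) / (path(t) - z)
    return total * h / (2j * math.pi)

center, radius = 1 + 1j, 2.0

def circle(t):
    return center + radius * cmath.exp(1j * t)

def dcircle(t):
    return 1j * radius * cmath.exp(1j * t)

inside = winding_number(circle, dcircle, 1.5 + 0.5j)
outside = winding_number(circle, dcircle, 5 + 0j)
```

The computed index is close to 1 at a point inside the circle and close to 0 at a point of the unbounded component, in agreement with the theorem.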
Lemma 11.5.3. If f is the derivative of a function F ∈ H(D), then
\[ \int_\varphi f = 0 \]
for any closed path ϕ in D. In particular, ∫_ϕ z^n dz = 0 for all integers n ≠ −1 and any
closed path ϕ in C \ {0} (in C when n ≥ 0).
Theorem 11.5.4. (Cauchy’s theorem for a triangle) Let D be an open set in C and p ∈ D.
If f ∈ H(D \ {p}) and f ∈ C(D) then ∫_{∂△} f = 0 for every triangle △ ⊂ D.
Proof. Let A, B and C be the vertices of the triangle △ := △0 and consider ∂△0 as the
piecewise linear curve that goes from A to B, from B to C and then from C to A.
Case (a) Assume first that p ∉ △0 . Let C ′ , A′ and B ′ be the midpoints of the segments
AB, BC and CA respectively. By joining the midpoints with linear segments we divide the
triangle △0 in four congruent sub-triangles and obtain
\[ \int_{\partial\triangle} f(z)\,dz = \sum_{j=1}^{4}\int_{\partial\triangle_j} f(z)\,dz. \]
Observe that 2^{−n} diam(△0 ) = diam(△n ) ≤ Λ(∂△n ) = 2^{−n} Λ(∂△0 ); hence, the intersection
⋂_n △n consists of a single point z0 ∈ △0 . Also, since f is holomorphic at z0 , given ε > 0,
there is δ > 0 such that
|f (z) − f (z0 ) − f ′ (z0 )(z − z0 )| < ε|z − z0 |
whenever |z − z0 | < δ. By Lemma 11.5.3, we obtain that for all n large enough
\[ \Big|\int_{\partial\triangle_n} f(z)\,dz\Big| = \Big|\int_{\partial\triangle_n}\big(f(z)-f(z_0)-f'(z_0)(z-z_0)\big)\,dz\Big| \le \Lambda^{2}(\partial\triangle_0)\,\frac{\varepsilon}{4^{n}}. \tag{11.25} \]
Combining (11.25) with (11.24) and letting ε → 0 we obtain ∫_{∂△0} f (z) dz = 0.
Case (b) Assume p is one of the vertices of △0 , say A. The continuity of f at p implies that
for any ε > 0, there is δ > 0 such that |f (z) − f (p)| < ε whenever |z − p| < δ. Let X
and Y be points on AB and AC within δ distance from A and consider the triangles AXY ,
XBC and CY X. From Case (a) we have that
\[ \Big|\int_{\partial\triangle_0} f(z)\,dz\Big| = \Big|\int_{\partial\triangle_{AXY}} f(z)\,dz\Big| = \Big|\int_{\partial\triangle_{AXY}}\big(f(z)-f(p)\big)\,dz\Big| \le 4\delta\varepsilon. \]
Therefore, ∫_{∂△0} f = 0.
Case (c) Suppose p is in the interior of △0 . By considering the triangles ABp, BCp and CAp, Case (b) shows
that ∫_{∂△0} f = 0.
Theorem 11.5.5. (Morera’s theorem) Suppose D is an open convex subset in the complex
plane and let f be a continuous function in D. Then, ∫_{∂△} f = 0 for any triangle △ ⊂ D if
and only if there is F ∈ H(D) such that F ′ = f .
292 11. Differentiation
where
\[
c_n = \frac{f^{(n)}(a)}{n!} = \frac{1}{2\pi i}\int_\gamma \frac{f(\xi)}{(\xi - a)^{n+1}}\,d\xi, \tag{11.28}
\]
and γ is the positively oriented circle of radius r centered at a. Moreover,
\[
|f^{(n)}(a)| \le \frac{n!}{2\pi}\int_\gamma \frac{|f(\xi)|}{r^{n+1}}\,|d\xi| \le \frac{n!\,M}{r^n}. \tag{11.29}
\]
If R is the radius of convergence of the series (11.27), then r < R. The inequalities
(11.29) are known as Cauchy estimates.
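As a numerical illustration of (11.28) and (11.29), the following sketch approximates the contour integral by an equally spaced Riemann sum on the circle. The function names, the sample count and the choice f = exp are assumptions made for this example only; M is taken to be the sampled maximum of |f| on the circle.

```python
import cmath
import math

def cauchy_coefficient(f, a, r, n, samples=256):
    """Approximate c_n = (1/(2*pi*i)) * integral of f(xi)/(xi-a)^(n+1) over |xi-a| = r."""
    total = 0j
    for k in range(samples):
        w = cmath.exp(2j * math.pi * k / samples)  # point e^{it} on the unit circle
        # With xi = a + r*w we have d(xi) = i*r*w dt, so the integrand
        # reduces to f(xi) / (r*w)^n and the integral to an average over t.
        total += f(a + r * w) / (r * w) ** n
    return total / samples

def cauchy_bound(f, a, r, n, samples=256):
    """Right-hand side n! * M / r^n of (11.29), with M the sampled maximum of |f| on the circle."""
    M = max(abs(f(a + r * cmath.exp(2j * math.pi * k / samples))) for k in range(samples))
    return math.factorial(n) * M / r ** n
```

For f = exp and a = 0 the coefficients should be close to 1/n!, and n! |c_n| should stay below the Cauchy bound.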
Proof. Only necessity needs to be proved. For any a ∈ D let 0 < r < q be such that
B(a; r) ⊂ B(a; q) ⊂ D. Let γ be the positively oriented circle of radius r centered at a.
Applying Cauchy’s theorem on the convex set B(a; q) we obtain that
\[
f(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(\xi)}{\xi - z}\,d\xi, \qquad z \in B(a; r),
\]
since Indγ (z) = 1 for all z ∈ B(a; r). All conclusions follow from Theorem 11.4.2 and
Theorem 11.4.6.
Corollary 11.5.8. Suppose f ∈ C(D), where D is an open set in the plane. If
$\int_{\partial\triangle} f(z)\,dz = 0$ for any closed triangle △ ⊂ D, then f ∈ H(D).
Proof. Let h(z) = (z − a)² f(z) if z ∈ D := B(a; R) \ {a} and h(a) = 0. It is easy to check
that h ∈ H(B(a; R)) and that h′(a) = 0. Hence h admits a power series expansion
\[
h(z) = \sum_{n\ge 2} c_n (z - a)^n = (z - a)^2 \sum_{n\ge 0} c_{n+2}(z - a)^n,
\]
whence it follows that $f(z) = \sum_{n\ge 0} c_{n+2}(z - a)^n$ for all z ∈ D and $\lim_{z\to a} f(z) = c_2$. Setting
f(a) = c2 we obtain that f ∈ H(B(a; R)).
Theorem 11.5.11. Suppose {fn : n ∈ N} ⊂ H(D) converges to a function f uniformly on
compact subsets of D. Then f ∈ H(D) and fn′ also converges to f ′ uniformly on compact
subsets of D.
Proof. Suppose there is B(a; r) ⊂ U for which the opposite holds. From Cauchy's formula
\[
|f(a)| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(a + re^{i\theta})|\,d\theta \le |f(a)|,
\]
it follows that |f| ≡ |f(a)| on ∂B(a; r). If $f(z) = \sum_{n\ge 0} a_n (z - a)^n$ is the power series
expansion of f around a, we obtain by dominated convergence that
\begin{align*}
|f(a)|^2 &= \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(a + re^{i\theta})|^2\,d\theta \\
         &= \frac{1}{2\pi}\int_{-\pi}^{\pi} \Bigl|\sum_{n\ge 0} a_n r^n e^{in\theta}\Bigr|^2\,d\theta \\
         &= \sum_{n\ge 0} |a_n|^2 r^{2n}.
\end{align*}
Since |a0| = |f(a)|, we get an = 0 for all n ≥ 1, which means that f ≡ a0 = f(a) in B(a; r). As U
is open and connected, it follows that f is constant, contradicting the assumption on f.
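Remark 11.5.13. The behavior of an analytic function near the boundary of its disk of
convergence may be very complicated, as the following examples demonstrate.
(a) The power series $\sum_{n=0}^{\infty} z^n$ and $\sum_{n=0}^{\infty} n z^n$ diverge at every point z ∈ S¹. At z = 1,
both series diverge to +∞. For z ∈ S¹ \ {1} the partial sums of each series oscillate.
(b) The power series $\sum_{n=1}^{\infty} \frac{z^n}{n}$ converges at every point z ∈ S¹ \ {1}. To see this, set
$S_N = \sum_{n=1}^{N} z^n$. Then, by summation by parts,
\[
\sum_{n=M}^{N} \frac{z^n}{n} = \frac{1}{N} S_N - \frac{1}{M} S_{M-1}
  - \sum_{n=M}^{N-1} \Bigl(\frac{1}{n+1} - \frac{1}{n}\Bigr) S_n.
\]
Since |S_n| ≤ 2/|1 − z| for all n, it follows that
\[
\Bigl|\sum_{n=M}^{N} \frac{z^n}{n}\Bigr|
  \le \frac{2}{|1 - z|}\Bigl(\frac{1}{N} + \frac{1}{M}\Bigr) + \frac{2}{|1 - z|}\Bigl(\frac{1}{M} - \frac{1}{N}\Bigr)
  \le \frac{4}{|1 - z|}\,\frac{1}{M}.
\]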
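The summation by parts identity and the resulting tail bound in (b) can be checked numerically. The sketch below is only an illustration; the function names and the sample point on the unit circle are assumptions for this example.

```python
import cmath

def tail_sum(z, M, N):
    """Direct evaluation of sum_{n=M}^{N} z^n / n."""
    return sum(z ** n / n for n in range(M, N + 1))

def abel_form(z, M, N):
    """The same sum, rewritten by summation by parts in terms of S_n = z + ... + z^n."""
    S = [0j]
    for n in range(1, N + 1):
        S.append(S[-1] + z ** n)
    boundary = S[N] / N - S[M - 1] / M
    correction = sum((1 / (n + 1) - 1 / n) * S[n] for n in range(M, N))
    return boundary - correction

def tail_bound(z, M):
    """The bound 4 / (|1 - z| * M) from the remark, valid for |z| = 1, z != 1."""
    return 4 / (abs(1 - z) * M)
```

For a point such as z = exp(i) the two evaluations should agree up to rounding, and the modulus of the tail should stay below the bound, which tends to 0 as M grows.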
Proof. Assume the statement is false. Then, for any m ∈ N, there is n ∈ N such that
|f(z)| > m whenever 1 − 1/n < |z| < 1. This implies that the number of zeroes of f in U is
finite. Let p be a polynomial with the same zeroes, including multiplicities, as f. Then
g = p/f ∈ H(U) has no zeroes in U. It follows that $\lim_{|z|\to 1} g(z) = 0$. This contradicts the
maximum modulus principle.
has radius of convergence 1. Hence f ∈ H(U) and has no analytical extension to any open
set containing U. For any rational number p/m we have that along the ray
{r exp(2πi p/m) : 0 ≤ r < 1}, $\lim_{r\to 1} |f(r\exp(2\pi i\,p/m))| = \infty$.
Proof. f ∈ H(D) implies that f ∈ C ∞ (D). The conclusion follows from the Cauchy–
Riemann equations ux = vy , uy = −vx .
Example 11.5.17. As in Example 11.4.4, for any complex measure µ on S¹, the function
\[
F(z) = \int_{S^1} \frac{e^{it} + z}{e^{it} - z}\,\mu(d e^{it})
\]
is analytic on B(0; 1). As linear combinations of harmonic functions are harmonic, it
follows that
\[
U(z) = \int_{S^1} \operatorname{Re}\Bigl(\frac{e^{it} + z}{e^{it} - z}\Bigr)\,\mu(d e^{it})
\]
is harmonic on B(0; 1). The kernel $P(e^{it}, z) = \operatorname{Re}\bigl(\frac{e^{it} + z}{e^{it} - z}\bigr)$ is called the Poisson kernel on
the unit disk.
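A direct computation gives the standard closed form Re((e^{it} + z)/(e^{it} − z)) = (1 − |z|²)/|e^{it} − z|², which is not stated above but is easy to verify. The sketch below compares the two expressions and checks that the kernel averages to 1 over the circle; the function names and the sample count are assumptions for this illustration.

```python
import cmath
import math

def poisson_kernel(t, z):
    """Re((e^{it} + z) / (e^{it} - z)) for |z| < 1."""
    w = cmath.exp(1j * t)
    return ((w + z) / (w - z)).real

def poisson_closed_form(t, z):
    """The closed form (1 - |z|^2) / |e^{it} - z|^2."""
    return (1 - abs(z) ** 2) / abs(cmath.exp(1j * t) - z) ** 2

def circle_mean(z, samples=512):
    """Average of the kernel over equally spaced points of the unit circle."""
    return sum(poisson_kernel(2 * math.pi * k / samples, z) for k in range(samples)) / samples
```

The average equal to 1 corresponds to taking µ to be normalized arc length, for which U is the constant function 1.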
Proof. Suppose that |f(z)| ≤ M for all z ∈ C. The Cauchy estimates (11.29) imply that
\[
|f^{(n)}(0)| \le \frac{n!\,M}{r^n} \qquad (n \in \mathbb{N})
\]
for every r > 0. Letting r → ∞ gives f^{(n)}(0) = 0 for all n ∈ N. Therefore, f(z) ≡ f(0).
Proof. Let A be the set of all limit points of Z(f) in D. By continuity, Z(f) is closed in
D and A ⊂ Z(f). Since A is closed in Z(f), A is closed in D. The first statement will
follow if we show that A is open in D.
In the case where A = ∅, the second statement follows from the representation (11.30)
when not all cn are zero.
The next result gives some conditions under which analytic functions in open domains
may be extended to larger domains.
Corollary 11.5.21. Let U and V be connected open sets in C and suppose f ∈ H(U) and
g ∈ H(V). If U ∩ V ≠ ∅ and {z ∈ U ∩ V : f(z) = g(z)} admits a limit point in each
component of U ∩ V, then
\[
h(z) := \begin{cases} f(z) & \text{if } z \in U \\ g(z) & \text{if } z \in V \end{cases}
\]
is a well defined function, and it is the only analytic function on U ∪ V whose restriction
to U (or to V) equals f (respectively g).
The objects Γ are called chains, and if all γk are closed paths, then Γ is called a cycle. If
each path γk is replaced by its opposite path (denoted formally by −γk), given by
t ↦ γk(b + a − t) (t ∈ [a, b]), then the resulting chain −Γ satisfies
\[
\int_{-\Gamma} f = -\int_{\Gamma} f, \qquad f \in C(\Gamma^*).
\]
If z ∈ C \ Γ*, then the index Ind_Γ(z) is defined by
\[
\operatorname{Ind}_\Gamma(z) = \sum_{k=1}^{n} \operatorname{Ind}_{\gamma_k}(z).
\]
Suppose D ⊂ C is a non-empty open set and γ, η are chains in D, i.e., γ* ∪ η* ⊂ D. If
Ind_γ(z) = 0 for all z ∈ C \ D, then γ is said to be homologous to 0 in D, denoted by
γ ∼ 0. If γ − η ∼ 0, then Ind_γ(z) = Ind_η(z) for all z ∈ C \ D; in such case, γ is said to be
homologous to η in D, denoted by γ ∼ η.
The following result extends theorem 11.5.6 to cycles homologous to 0.
Theorem 11.5.22. (General Cauchy's theorem) Suppose f ∈ H(D) where D is a non-empty
open set in the complex plane. If γ is a cycle in D and γ ∼ 0, then
\[
f(z)\operatorname{Ind}_\gamma(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(w)}{w - z}\,dw, \qquad z \in D \setminus \gamma^* \tag{11.31}
\]
and
\[
\int_\gamma f(w)\,dw = 0. \tag{11.32}
\]
If γ1 and γ2 are cycles in D and γ1 ∼ γ2, then
\[
\int_{\gamma_1} f(w)\,dw = \int_{\gamma_2} f(w)\,dw. \tag{11.33}
\]
By Corollary 11.5.8, the map z 7→ g(w, z) is holomorphic on D for all w ∈ D fixed and so,
the integral in parenthesis in (11.34) is zero. Hence h ∈ H(D) by Corollary 11.5.8.
To prove (11.32), fix a ∈ D \ γ* and define F(z) = (z − a)f(z). By the first part of the
proof
\[
\int_\gamma f(w)\,dw = \int_\gamma \frac{F(w)}{w - a}\,dw = 2\pi i\,F(a)\operatorname{Ind}_\gamma(a) = 0.
\]
Proof. The last statement will follow as a consequence of the first statement and the general
Cauchy theorem.
By compactness, η′ := d(K, C \ Ω) > 0. Construct a grid of vertical and horizontal lines
forming squares whose edges lie in the grid and have length η := η′/2. Since K is compact,
only a finite number of those squares, say Q1, ..., Qm, intersect K. The choice of η ensures
that these squares are contained in Ω. Orient the boundary of each such square
Qj = [nj η, (nj + 1)η] × [mj η, (mj + 1)η] counterclockwise, that is
\[
\partial Q_j = \gamma_{j1} \dot{+} \gamma_{j2} \dot{+} \gamma_{j3} \dot{+} \gamma_{j4}
\]
where γjk, k = 1, ..., 4, are the directed edges from η(nj, mj) to η(nj + 1, mj), from
η(nj + 1, mj) to η(nj + 1, mj + 1), from η(nj + 1, mj + 1) to η(nj, mj + 1), and from
η(nj, mj + 1) to η(nj, mj) respectively. Clearly
\[
\operatorname{Ind}_{\partial Q_j}(z) = \begin{cases} 1 & \text{if } z \in \operatorname{Int}(Q_j) \\ 0 & \text{if } z \in \mathbb{C} \setminus Q_j \end{cases}
\]
Let Σ be the collection of all directed edges γjk (1 ≤ j ≤ m, 1 ≤ k ≤ 4). Remove from
Σ those directed edges whose opposites also appear in Σ. Let Φ be the remaining set of
directed edges. None of the edges in Φ intersects K, for if an edge ℓ of some square Qj
intersects K, then there is exactly one other square Qj′ having ℓ as common side. Hence ℓ
appears twice with opposite orientation and so, it is an edge that is removed from Σ. We
claim that the edges in Φ form a cycle. To see this, notice that Φ is balanced in the sense
that for each vertex p appearing in Φ, the number of edges having p as initial point is the
same as the number of edges having p as an end point. Now, starting with a vertex p = p0,
choose γ1 = [p, p1] ∈ Φ. Having chosen k distinct oriented edges γj = [pj−1, pj], 1 ≤ j ≤ k,
we stop if p = pk, in which case we have a closed path based at p. If p ≠ pk, and exactly
r of the edges γ1, ..., γk have pk as an end point, then exactly r − 1 of those edges
have pk as initial point. Since Φ is balanced, there is another edge γk+1 ∈ Φ whose initial
point is pk. Since Φ is finite, at some finite step n we get an edge γn = [pn−1, p]. The edges
γ1, ..., γn form a closed path based at p.
The remaining members of Φ clearly form a balanced collection of edges, so the same
construction may be applied to them. This shows that Φ has a finite partition Φ1, ..., Φt,
each of which forms a closed path Γ1, ..., Γt. The sum of those closed paths is a cycle.
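The cancellation of opposite edges and the extraction of closed paths described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: squares are identified by the integer pair (nj, mj) of their lower-left corner, the scale η is dropped, and all function names are hypothetical.

```python
from collections import Counter

def boundary_edges(squares):
    """Directed edges of the counterclockwise square boundaries, minus those whose opposite also occurs."""
    sigma = set()
    for n, m in squares:
        corners = [(n, m), (n + 1, m), (n + 1, m + 1), (n, m + 1)]  # counterclockwise order
        for k in range(4):
            sigma.add((corners[k], corners[(k + 1) % 4]))
    return {(p, q) for (p, q) in sigma if (q, p) not in sigma}

def is_balanced(edges):
    """True if every vertex is the initial point of as many edges as it is an end point."""
    return Counter(p for p, _ in edges) == Counter(q for _, q in edges)

def closed_paths(edges):
    """Split a balanced edge set into closed paths by walking until the base point recurs."""
    outgoing = {}
    for p, q in edges:
        outgoing.setdefault(p, []).append(q)
    paths = []
    for base in list(outgoing):
        while outgoing[base]:
            path, current = [base], base
            while True:
                current = outgoing[current].pop()  # balance guarantees an unused outgoing edge
                path.append(current)
                if current == base:
                    break
            paths.append(path)
    return paths
```

For a 2 × 2 block of squares all interior edges cancel and a single closed path of 8 edges remains. For two squares touching only at a corner, the walk may return one or two closed paths depending on the order in which edges are visited, but every remaining edge is used exactly once.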
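By construction, $\operatorname{Ind}_\Gamma(z) = \sum_{j=1}^{m} \operatorname{Ind}_{\partial Q_j}(z)$ for each z that is not in the boundary of any
Qj. Hence
\[
\operatorname{Ind}_\Gamma(z) = \begin{cases} 1 & \text{if } z \in \bigcup_{j=1}^{m} \operatorname{Int}(Q_j) \\ 0 & \text{if } z \in \mathbb{C} \setminus \bigcup_{j=1}^{m} Q_j \end{cases}
\]
If $z \in K \cap \bigcup_{j=1}^{m} \partial Q_j$, then z ∉ Γ* and z is a limit point of the interior of some Qj. Since
the function z ↦ Ind_Γ(z) is constant in each component of the complement of Γ*, it follows
that Ind_Γ(z) = 1. Consequently
\[
\operatorname{Ind}_\Gamma(z) = \begin{cases} 1 & \text{if } z \in K \\ 0 & \text{if } z \notin \Omega \end{cases}
\]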
The following results give conditions under which two closed paths γ0 and γ1
in an open set D are homologous. Two closed curves γ0 and γ1 in a topological space
X parameterized by the same interval [a, b] are homotopic if there is a continuous map
H : [0, 1] × [a, b] → X such that
H(0, ·) = γ0(·), H(1, ·) = γ1(·), H(s, a) = H(s, b)
for all 0 ≤ s ≤ 1.
If X is a path connected topological space and every closed curve is homotopic to a
constant curve γ1 (a point), then X is said to be simply connected.
Lemma 11.5.24. Let γ0 and γ1 be closed paths in C parameterized by the interval [a, b]. If
there is α ∈ C such that
(11.35) |γ1 (t) − γ0 (t)| < |α − γ0 (t)|, a≤t≤b
then, Indγ0 (α) = Indγ1 (α).
Proof. Without loss of generality, suppose that γ0 and γ1 are both parameterized by I =
[0, 1]. There exists a continuous function H : I² → D such that H(0, ·) = γ0, H(1, ·) = γ1
and H(s, 0) = H(s, 1) for all 0 ≤ s ≤ 1. Let α ∈ C \ D. Since H(I²) is compact, there
is ε > 0 such that
\[
\inf_{0 \le s, t \le 1} |H(s, t) - \alpha| > 2\varepsilon. \tag{11.37}
\]
From (11.41), (11.42), and n + 2 applications of Lemma 11.5.24 we conclude that α has the
same index with respect to the paths γ0, g0, ..., gn, γ1.
Remark 11.5.26. The polygonal paths were taken instead of the closed curves
γk(·) := H(k/n, ·) because H may not be differentiable. It is possible to extend the definition
of index to continuous curves by approximating them uniformly by smooth paths (Weierstrass
theorem with trigonometric polynomials); then, an application of Theorem 11.5.25 justifies
that this procedure does not depend on the particular approximation.
Lemma 11.5.27. Suppose D ⊂ C is a simply connected open set. If f ∈ H(D), then there
exists F ∈ H(D) such that F′ = f. Any two such F differ by a constant.
Proof. The assumption on D implies that $\int_\gamma f(w)\,dw = 0$ for every closed path γ in D.
Therefore, for fixed z0 ∈ D, the function
\[
F(z) = \int_{\eta(z_0, z)} f(w)\,dw,
\]
where η(z0, z) is any path in D joining z0 to z, is well defined. For any z ∈ D and ε > 0,
there is a neighborhood B(z; r) ⊂ D of z such that |f(w) − f(z)| < ε for all w ∈ B(z; r).
Choosing η(z, z + h) as the straight line segment joining z to z + h for all h with
0 < |h| < r gives
\[
\Bigl|\frac{F(z + h) - F(z)}{h} - f(z)\Bigr| \le \frac{1}{|h|}\int_{\eta(z, z+h)} |f(w) - f(z)|\,|dw| < \varepsilon.
\]
This shows that F ∈ H(D) and F′(z) = f(z) for all z ∈ D.
If G ∈ H(D) satisfies G′ = f then H = F − G satisfies H′ ≡ 0. Since D is connected, it
follows that H is a constant function.
Theorem 11.5.28. Suppose D is open and simply connected. If f ∈ H(D) and f(z) ≠ 0
for all z ∈ D, then there exists g ∈ H(D) such that f = exp ◦g. Any two such g differ by a
constant integer multiple of 2πi.
If F is given by the right hand side of (11.43), then F ∈ H(D), and F ′ (z) = 1/z on D. As
D is connected, L and F differ by a constant in D, and since L(1) = 0 = F (1), we conclude
that L ≡ F . The function L given by (11.43) coincides with the restriction to B(1; 1) of
the principal logarithm function introduced in Example 11.4.8.
Example 11.5.30. (Complex binomial expansion) Let log(re^{iθ}) = log(r) + iθ be the
principal logarithm function on Ω := {re^{iθ} : r > 0, −π < θ < π} as in Example 11.4.8. For any
α ∈ C and k ∈ Z+ define α^{(k)} = 1 if k = 0 and α^{(k)} = α · ... · (α − k + 1) otherwise. Define
$\binom{\alpha}{k} := \frac{\alpha^{(k)}}{k!}$. Suppose α ∈ C \ Z+ and let $h_\alpha(z) = (1 + z)^\alpha := \exp(\alpha \log(1 + z))$. Repeated
differentiation gives $h_\alpha^{(k)}(z) = \alpha^{(k)} h_{\alpha - k}(z)$ and so, $h_\alpha^{(k)}(0) = \alpha^{(k)} \neq 0$ for all k ∈ Z+. It
follows that hα has power series expansion around 0 given by
\[
h_\alpha(z) = (1 + z)^\alpha = \sum_{k=0}^{\infty} \binom{\alpha}{k} z^k, \qquad |z| < 1. \tag{11.44}
\]
Indeed, setting $a_k := \binom{\alpha}{k}$, we have that
\[
R := \lim_{k\to\infty} \Bigl|\frac{a_{k+1}}{a_k}\Bigr| = \lim_{k\to\infty} \frac{|\alpha - k|}{k + 1} = 1.
\]
Hence the radius of convergence of the power series (11.44) is 1/R = 1. Notice that if
α ∈ Z+ then equation (11.44) coincides with the usual binomial expansion of elementary
algebra.
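The expansion (11.44) can be compared numerically with exp(α log(1 + z)), where the logarithm is the principal branch (this is what `cmath.log` computes). The sketch below is an illustration only; the function names, the truncation at 200 terms and the test values of α and z are assumptions.

```python
import cmath

def binomial_series(alpha, z, terms=200):
    """Partial sum of sum_k binom(alpha, k) z^k, with the coefficients built recursively."""
    total = 0j
    coeff = 1 + 0j  # binom(alpha, 0)
    for k in range(terms):
        total += coeff * z ** k
        coeff *= (alpha - k) / (k + 1)  # binom(alpha, k+1) = binom(alpha, k) * (alpha - k) / (k + 1)
    return total

def principal_power(alpha, z):
    """(1 + z)^alpha defined through the principal logarithm."""
    return cmath.exp(alpha * cmath.log(1 + z))
```

The recursion for the coefficients is the same ratio a_{k+1}/a_k = (α − k)/(k + 1) used above to compute the radius of convergence.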
Theorem 11.5.31. Suppose D is a simply connected region in C and f ∈ H(D). If
0 ∉ f(D), then the map z ↦ log |f(z)| is harmonic on D and
\[
\log |f(z_0)| = \frac{1}{2\pi}\int_0^{2\pi} \log |f(z_0 + re^{i\theta})|\,d\theta
\]
whenever B(z0; r) ⊂ D.
11.6. Singularities
The next result concerns holomorphic functions in regions with holes.
Theorem 11.6.1. (Laurent–Weierstrass) Let D be an open set in the complex plane
containing an annulus A(a; r1, r2) = {z ∈ C : r1 ≤ |z − a| ≤ r2}. For r1 ≤ r ≤ r2, let γr(a)
denote the positively oriented circle of radius r centered at a. If f ∈ H(D), then
\[
f(z) = \sum_{n \in \mathbb{Z}} c_n (z - a)^n, \qquad z \in A(a; r_1, r_2) \tag{11.45}
\]
where
\[
c_n = \frac{1}{2\pi i}\int_{\gamma_r(a)} \frac{f(w)}{(w - a)^{n+1}}\,dw, \qquad n \in \mathbb{Z}. \tag{11.46}
\]
The series (11.45) converges absolutely and uniformly over A(a; r1, r2).
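Formula (11.46) can be tested numerically by replacing the contour integral with an equally spaced sum over the circle. The sketch below is illustrative only; the function names and the test function exp(1/z) + 1/(1 − z), which is holomorphic on 0 < |z| < 1, are assumptions for this example.

```python
import cmath
import math

def laurent_coefficient(f, a, r, n, samples=512):
    """Approximate c_n = (1/(2*pi*i)) * integral of f(w)/(w-a)^(n+1) over |w-a| = r."""
    total = 0j
    for k in range(samples):
        u = cmath.exp(2j * math.pi * k / samples)
        # w = a + r*u and dw = i*r*u dt, so the integrand reduces to f(w) / (r*u)^n.
        total += f(a + r * u) / (r * u) ** n
    return total / samples

def example_function(z):
    """exp(1/z) + 1/(1 - z): Laurent coefficients 1/m! for z^(-m) and 1 for z^n, n >= 0."""
    return cmath.exp(1 / z) + 1 / (1 - z)
```

On the circle of radius 1/2 the computed coefficients should be close to c_0 = 2, c_{-1} = 1, c_{-2} = 1/2 and c_n = 1 for n ≥ 1.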
Proof. Since D is open, there exist R1 < r1 < r2 < R2 such that
A(a; r1, r2) ⊂ A(a; R1, R2) ⊂ D. For any z ∈ A(a; r1, r2), Corollary 11.5.8 shows that the
function
\[
g(\xi) = \begin{cases} \dfrac{f(\xi) - f(z)}{\xi - z} & \text{if } \xi \in D \setminus \{z\} \\[1ex] f'(z) & \text{if } \xi = z \end{cases}
\]
is holomorphic on D. Since γR2 and γR1 are homotopic, γR2 ∼ γR1 and so,
\[
\int_{\gamma_{R_1}(a)} g(\xi)\,d\xi = \int_{\gamma_{R_2}(a)} g(\xi)\,d\xi. \tag{11.47}
\]
Since r1 < |z − a| < r2, the integrands in (11.47) can be written as
$g(\xi) = \frac{f(\xi)}{\xi - z} - \frac{f(z)}{\xi - z}$. After substitution and transposition of terms we obtain
\[
f(z)\Bigl(\int_{\gamma_{R_2}(a)} \frac{d\xi}{\xi - z} - \int_{\gamma_{R_1}(a)} \frac{d\xi}{\xi - z}\Bigr)
  = \int_{\gamma_{R_2}(a)} \frac{f(\xi)}{\xi - z}\,d\xi - \int_{\gamma_{R_1}(a)} \frac{f(\xi)}{\xi - z}\,d\xi.
\]
It follows from Theorem 11.4.2 that f2 ∈ H(B(a; r2)), and admits a power series
\[
f_2(z) = \sum_{n=0}^{\infty} c_n (z - a)^n, \qquad z \in B(a; r_2), \tag{11.48}
\]
with $c_n = \frac{1}{2\pi i}\int_{\gamma_{R_2}(a)} \frac{f(\xi)}{(\xi - a)^{n+1}}\,d\xi$ for all n ∈ Z+. Similarly, f1 ∈ H(C \ B(a; r1)). For
ξ ∈ γ*R1 and |z − a| > r1,
\[
\Bigl|\frac{\xi - a}{z - a}\Bigr| < \frac{R_1}{r_1} < 1.
\]
Thus
\[
\frac{1}{\xi - z} = -\frac{1}{z - a}\,\frac{1}{1 - \frac{\xi - a}{z - a}} = -\sum_{n=1}^{\infty} \frac{(\xi - a)^{n-1}}{(z - a)^n}.
\]
Remark 11.6.2. The terms f1 defined by (11.49), and f2 defined by (11.48) are called
principal and regular parts of f respectively.
Theorem 11.6.3. Suppose f ∈ H(D) where D = B(a; R) \ {a}. One and only one of the
following holds:
(i) The point z = a is a removable singularity.
(ii) There exist m ∈ N and complex numbers c−1, ..., c−m, with c−m ≠ 0, such that
\[
f(z) - \sum_{k=1}^{m} \frac{c_{-k}}{(z - a)^k}
\]
Proof. Suppose (iii) does not hold. Then there are numbers 0 < ρ ≤ R, δ > 0 and a point
w ∈ C such that z ∈ B(a; ρ) \ {a} implies |f(z) − w| > δ. It follows that g : z ↦ 1/(f(z) − w)
is bounded and holomorphic on B(a; ρ) \ {a}; hence a is a removable singularity of g and
g ∈ H(B(a; ρ)) by setting $g(a) = \lim_{z\to a} g(z)$. Then g has a zero at a of order m ∈ Z+ and
g(z) = (z − a)^m h(z) where h ∈ H(B(a; ρ)) and h(a) ≠ 0. Thus φ = 1/h ∈ H(B(a; ρ)) admits a
power series $\varphi(z) = \sum_{n\ge 0} c_n (z - a)^n$ where c0 ≠ 0. Since f = w + 1/g, it follows that
\[
f(z) = \frac{1}{(z - a)^m} \sum_{n\ge 0} c'_n (z - a)^n,
\]
where c′m = w + cm and c′n = cn for all n ≠ m. If m = 0, that is g(a) ≠ 0, then (i) holds;
whereas if m ≥ 1, then (ii) holds.
Remark 11.6.4. The coefficient c−1 in the Laurent expansion (11.45) is called the residue
of f at a, and it is denoted by Res(f; a). The Laurent–Weierstrass theorem, together with
the general Cauchy theorem, implies that
\[
\frac{1}{2\pi i}\int_\gamma f(z)\,dz = \operatorname{Res}(f; a)\operatorname{Ind}_\gamma(a)
  = \operatorname{Res}(f_1; a)\operatorname{Ind}_\gamma(a) = \frac{1}{2\pi i}\int_\gamma f_1(z)\,dz
\]
for any cycle γ ∼ 0 in D such that a ∉ γ*.
A function f that is analytic on an open set D ⊂ C except for a discrete set of points A,
all of which are poles, is said to be meromorphic. A function f is said to be meromorphic
at z0 if it is meromorphic on a neighborhood U of z0 .
Theorem 11.6.5. (Theorem of residues) Suppose f ∈ H(D \ A) where A ⊂ D is a discrete
set at which f has singularities. If γ ∼ 0 in D and A ∩ γ* = ∅ then,
\[
\frac{1}{2\pi i}\int_\gamma f(z)\,dz = \sum_{a \in A} \operatorname{Res}(f; a)\operatorname{Ind}_\gamma(a). \tag{11.50}
\]
Proof. Let B = {a ∈ A : Ind_γ(a) ≠ 0}. Since A has no limit points in D, A is countable
and closed in D; hence, D \ A is open. Ind_γ is constant in each component of C \ γ*, vanishes
on the unbounded component of C \ γ*, and also vanishes on any component intersecting
C \ D. It follows that B is finite. Let a1, ..., an be the points of B and Q1, ..., Qn be
the principal parts of f at a1, ..., an respectively. Then, D0 = D \ (A \ B) is open and
$F = f - \sum_{k=1}^{n} Q_k \in H(D_0)$, for the singularities are removable. From the general Cauchy
theorem 11.5.22 we obtain $\int_\gamma F(z)\,dz = 0$. As Res(f; ak) = Res(Qk; ak),
\[
\frac{1}{2\pi i}\int_\gamma f(z)\,dz = \sum_{k=1}^{n} \frac{1}{2\pi i}\int_\gamma Q_k(z)\,dz
  = \sum_{k=1}^{n} \operatorname{Res}(Q_k; a_k)\operatorname{Ind}_\gamma(a_k).
\]
This is formula (11.50) since Ind_γ(a) = 0 for all a ∈ A \ B.
The formula of residues (11.50) is often used to obtain explicit expressions of integrals
over infinite intervals of the real line.
Example 11.6.6. To evaluate $\int_{\mathbb{R}} \frac{dx}{1 + x^4}$ consider the function $f(z) = \frac{1}{1 + z^4}$. Let γR be the
closed path obtained by joining the straight line segment ℓR from (−R, 0) to (R, 0), and
the upper semicircle cR of radius R centered at the origin (see Figure 1). The function f
has exactly four simple poles, namely $z_k = e^{i\pi(2k+1)/4}$ with k = 0, ..., 3, of which z2, z3 lie
in the unbounded component of C \ γR*. The residues at z0 and z1 are given by
\[
\lim_{z \to z_0} (z - z_0) f(z) = \frac{1}{4 z_0^3} = -\frac{1 + i}{4\sqrt{2}}, \qquad
\lim_{z \to z_1} (z - z_1) f(z) = \frac{1}{4 z_1^3} = \frac{1 - i}{4\sqrt{2}}.
\]
[Figure 1: the closed path γR, with marked points −R, R and iR.]
Therefore, $\int_{\mathbb{R}} \frac{dx}{1 + x^4} = \frac{\sqrt{2}\,\pi}{2}$.
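As a sanity check, 2πi times the sum of the two residues in the upper half plane can be compared with a direct numerical approximation of the integral. The sketch below is an illustration; the function names, the truncation radius and the midpoint rule are assumptions chosen for simplicity.

```python
import cmath
import math

def residue_formula_value():
    """2*pi*i times the sum of the residues 1/(4 z^3) at the poles in the upper half plane."""
    poles = [cmath.exp(1j * math.pi * (2 * k + 1) / 4) for k in (0, 1)]
    return (2j * math.pi * sum(1 / (4 * z ** 3) for z in poles)).real

def truncated_integral(R=100.0, steps=200000):
    """Midpoint rule for the integral of 1/(1 + x^4) over [-R, R]."""
    h = 2 * R / steps
    return h * sum(1 / (1 + (-R + (i + 0.5) * h) ** 4) for i in range(steps))
```

The neglected tails are of order 1/R³, so the two values should agree to several decimal places, and both should be close to π/√2.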
Example 11.6.7. For any 0 < a < 1 the function $f_a(x) = \frac{e^{ax}}{1 + e^x}$ is integrable with respect to
Lebesgue measure on (R, B(R)). To evaluate $I_a = \int_{\mathbb{R}} f_a(x)\,dx$ consider the rectangular
path γR with base on the segment from (−R, 0) to (R, 0) and height 2π (see Figure 2).
[Figure 2: the rectangular path γR, with marked points −R, 0, R and i2π.]
The function fa is meromorphic on C and has simple poles zk = iπ(2k − 1), k ∈ Z, all of
which, with the exception of z1 = iπ, are in the unbounded component of C \ γR*. Now
\[
\operatorname{Res}(f_a; i\pi) = \lim_{z \to i\pi} (z - i\pi) f_a(z) = -e^{a\pi i}.
\]
Thus, $\int_{\gamma_R} f_a = -2\pi i\,e^{a\pi i}$. Along the left ($v_R^1$) and right ($v_R^2$) vertical sides of γR we have
\[
\Bigl|\int_{v_R^1 \dot{+} v_R^2} f_a\Bigr| \le 2\pi\Bigl(\frac{e^{aR}}{e^R - 1} + \frac{e^{-Ra}}{1 - e^{-R}}\Bigr) \xrightarrow{R \to \infty} 0.
\]
Along the base $h_R^1$ and the opposite horizontal side $h_R^2$ we have
\[
\int_{h_R^1 \dot{+} h_R^2} f_a = (1 - e^{2a\pi i})\int_{[-R, R]} f_a(x)\,dx \xrightarrow{R \to \infty} (1 - e^{2a\pi i})\int_{\mathbb{R}} f_a(x)\,dx.
\]
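Combining the three displays as R → ∞ gives −2πi e^{aπi} = (1 − e^{2aπi}) I_a, that is, I_a = π/sin(aπ). The following sketch checks this value numerically; the function names, the truncation radius and the midpoint rule are assumptions made for the illustration.

```python
import math

def f_a(x, a):
    """e^{ax} / (1 + e^x), rewritten for x > 0 so that no large exponentials are formed."""
    if x > 0:
        return math.exp((a - 1) * x) / (1 + math.exp(-x))
    return math.exp(a * x) / (1 + math.exp(x))

def truncated_integral(a, R=60.0, steps=120000):
    """Midpoint rule for the integral of f_a over [-R, R]."""
    h = 2 * R / steps
    return h * sum(f_a(-R + (i + 0.5) * h, a) for i in range(steps))
```

For a = 1/2 the result should be close to π, and for other values of a in (0, 1) close to π/sin(aπ), up to the exponentially small tails that are cut off.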
The theory of holomorphic functions we have presented can also be applied to solve
certain linear differential equations.
Example 11.6.8. Suppose f ∈ H(B(0; r) \ {0}) with Laurent expansion $f(z) = \sum_{n \in \mathbb{Z}} a_n z^n$.
The region D = B(0; r) \ ((−∞, 0] × {0}) is open and simply connected. For any constant c ∈ C,
the function
\[
F_c(z) = \int^{z} f = c + a_{-1}\log(z) + \sum_{n \neq 0} \frac{a_{n-1}}{n} z^n,
\]
where log is the principal branch of the logarithm, satisfies Fc ∈ H(D) and F′c(z) = f(z) for
all z ∈ D. This provides a method to solve the differential equation
\[
w'(z) + f(z)w(z) = 0, \qquad z \in D,
\]
namely,
\[
w(z) = \exp\bigl(-F_c(z)\bigr) = C z^{-a_{-1}} \exp\Bigl(-\sum_{n \neq 0} \frac{a_{n-1}}{n} z^n\Bigr), \qquad C = e^{-c}.
\]
Example 11.6.9. (Frobenius–Fuchs method) Consider the second order linear differential
equation
\[
w''(z) + \frac{P(z)}{z} w'(z) + \frac{Q(z)}{z^2} w(z) = 0 \tag{11.51}
\]
where P and Q are analytic in a neighborhood B(0; a) of 0. Under these assumptions, the
point z = 0 is said to be a regular singular point of the differential equation (11.51). On
the region D = B(0; a) \ ((−∞, 0] × {0}) we propose a solution of the form
\[
w(z) = z^r \Bigl(1 + \sum_{n=1}^{\infty} a_n z^n\Bigr).
\]
Let $P(z) = \sum_{n\ge 0} p_n z^n$ and $Q(z) = \sum_{n\ge 0} q_n z^n$. With the convention a0 = 1, a simple
computation shows that
\begin{align*}
z w'(z) &= z^r \sum_{n\ge 0} a_n (r + n) z^n, \\
z^2 w''(z) &= z^r \sum_{n\ge 0} a_n (r + n)(r + n - 1) z^n, \\
z^2 w''(z) + z P(z) w'(z) + Q(z) w(z) &= z^r \Bigl[\sum_{n\ge 0} a_n (r + n)(r + n - 1) z^n
   + \sum_{n\ge 0} \Bigl(\sum_{m=0}^{n} a_m (r + m) p_{n-m}\Bigr) z^n \\
   &\qquad + \sum_{n\ge 0} \Bigl(\sum_{m=0}^{n} a_m q_{n-m}\Bigr) z^n\Bigr] = 0.
\end{align*}
Equating the coefficient of the 0-th power to 0 gives I(r) := r(r − 1) + p0 r + q0 = 0.
This equation is known as the indicial equation of (11.51). For n ≥ 1, equating the
coefficient of the n–th power to 0 gives
\[
a_n \bigl[(r + n)(r + n - 1) + (r + n) p_0 + q_0\bigr] + \sum_{m=0}^{n-1} a_m \bigl[(r + m) p_{n-m} + q_{n-m}\bigr] = 0.
\]
We set a0 = 1, and let α and β be the two solutions to I(r) = 0, arranged so that Re(α − β) ≥ 0.
Setting s = α − β, we obtain p0 − 1 = −(α + β) = −2α + s. Hence, for all n ≥ 1
\[
I(\alpha + n) = I(\alpha) + n(n + s) = n(n + s) \neq 0.
\]
This shows that the recurrence equation (11.52), with r = α, has a unique solution given by
\[
a_n = -\frac{\sum_{m=0}^{n-1} a_m \bigl[(r + m) p_{n-m} + q_{n-m}\bigr]}{I(\alpha + n)}.
\]
To prove that the formal series $w(z) = z^\alpha \bigl(1 + \sum_{n\ge 1} a_n z^n\bigr) = z^\alpha f(z)$ is indeed a solution
to (11.51), it suffices to show that the series f converges in an open disk around z = 0. As
P, Q ∈ H(B(0; a)), Cauchy's estimates show that for some 0 < ρ < a and M > 1
\[
|p_n| \le \frac{M}{\rho^n}, \qquad |q_n| \le \frac{M}{\rho^n}, \qquad |\alpha p_n + q_n| \le \frac{M}{\rho^n}
\]
for all n ≥ 1. For n = 1 we have $|a_1| = \frac{|r p_1 + q_1|}{|s + 1|} \le \frac{M}{\rho}$. By induction, assume $|a_j| \le \frac{M^j}{\rho^j}$ for
all 1 ≤ j ≤ n − 1. Then
\[
|a_n| \le \frac{\sum_{m=0}^{n-1} |a_m| \bigl(|r p_{n-m} + q_{n-m}| + m |p_{n-m}|\bigr)}{n\,|n + s|}
  \le \frac{M^n}{\rho^n}\,\frac{n + \frac{n(n-1)}{2}}{n^2} < \frac{M^n}{\rho^n}.
\]
This completes the induction argument. It follows that f has radius of convergence
R ≥ ρ/M > 0.
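The recurrence for the coefficients a_n is easy to run in exact arithmetic. The sketch below is an illustration under assumptions that are not part of the text: the function name is hypothetical, P and Q are given by finite lists of Taylor coefficients, and the test case is Bessel's equation of order 0, z²w″ + zw′ + z²w = 0, for which P ≡ 1, Q(z) = z² and the indicial root is α = 0.

```python
from fractions import Fraction

def frobenius_coefficients(p, q, alpha, count):
    """Return a_0, ..., a_{count-1} from the recurrence with r = alpha and a_0 = 1.

    p and q are lists of Taylor coefficients of P and Q (missing entries count as 0).
    """
    def coef(c, n):
        return c[n] if n < len(c) else 0

    def indicial(r):
        return r * (r - 1) + p[0] * r + q[0]  # I(r) = r(r-1) + p_0 r + q_0

    a = [Fraction(1)]
    for n in range(1, count):
        s = sum(a[m] * ((alpha + m) * coef(p, n - m) + coef(q, n - m)) for m in range(n))
        a.append(-s / indicial(alpha + n))
    return a
```

In the Bessel case the output should reproduce the well known coefficients (−1)^k / (4^k (k!)²) of z^{2k} in the series of J_0, with all odd coefficients equal to 0.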
We conclude this section with a result stating that in the complex plane one can
construct functions that have singularities in an arbitrary discrete set A and arbitrary
principal parts around points in A.
Theorem 11.6.10. (Mittag–Leffler) Let {an : n ∈ N} ⊂ C be a sequence such that
$\lim_n |a_n| = \infty$. For each n ∈ N let $P_n(z) = \sum_{k=1}^{\infty} c_{n,k} (z - a_n)^{-k}$ be a Laurent series
converging on C \ {an}. Then there is a holomorphic function f : C \ {an : n ∈ N} → C
such that for all n ∈ N, the principal part of f at an is Pn.
any compact set K ⊂ C \ A, there is N = N_K ∈ N such that $2\operatorname{dist}(0, K) < \inf_{n \ge N} |a_n|$. Hence
$\|P_n - Q_n\|_K \le 2^{-n}$ for all n ≥ N and so,
\[
\sum_{n=1}^{\infty} \|P_n - Q_n\|_K \le \sum_{n=1}^{N-1} \|P_n - Q_n\|_K + \sum_{n=N}^{\infty} 2^{-n} < \infty.
\]
Consequently, $f := \sum_{n=1}^{\infty} (P_n - Q_n) \in H(\mathbb{C} \setminus A)$. It remains to check that f has the correct
principal parts. To this end, consider ak ∈ A, and let
\[
f_k = \sum_{n \neq k} (P_n - Q_n) = f - (P_k - Q_k)
\]
so that f = (fk − Qk) + Pk. The first term of this sum is holomorphic near ak and so, Pk
is the principal part of f at ak.
Corollary 11.7.2. (Rouché) Let D ⊂ C be open and let γ be a closed path such that γ ∼ 0 in
D. Suppose that Ind_γ(z) ∈ {0, 1} for all z ∈ D \ γ* and let D1 = {z ∈ D : Ind_γ(z) = 1}. If
f, g ∈ H(D) and
\[
|f(z) - g(z)| < |f(z)|, \qquad z \in \gamma^*, \tag{11.54}
\]
then Ng = Nf, where Ng and Nf are the numbers of zeroes of g and f in D1, counted according
to their multiplicity.
Proof. From (11.54) it follows that neither f nor g has zeroes in γ*. If Γ1 = f ◦ γ and
Γ0 = g ◦ γ, then Lemma 11.5.24 and Theorem 11.7.1 show that
Ng = Ind_{Γ0}(0) = Ind_{Γ1}(0) = Nf.
We will conclude this section with another remarkable integral equation involving the
number of zeroes of an analytic function in a ball B(0; r).
Lemma 11.7.4. For all ρ ∈ R, $\frac{1}{2\pi}\int_0^{2\pi} \log\bigl|1 + \rho e^{it}\bigr|\,dt = \log^+(|\rho|)$.
Proof. Let us denote g(t, ρ) = log |1 + ρe^{it}|. Assume first that |ρ| < 1. Then
\[
\log|1 + \rho e^{it}| = \frac{1}{2}\bigl(\log(1 + \rho e^{it}) + \log(1 + \rho e^{-it})\bigr)
  = \sum_{n\ge 1} \frac{(-1)^{n+1}}{n} \rho^n \cos(nt).
\]
11.7. Zeroes of analytic functions 311
Since $\sum_{n\ge 1} \frac{|\rho|^n}{n} < \infty$, by dominated convergence, we conclude that t ↦ g(t, ρ) is integrable
for every |ρ| < 1 and $\frac{1}{2\pi}\int_0^{2\pi} \log\bigl|1 + \rho e^{it}\bigr|\,dt = 0$.
Similarly, we obtain that $\frac{1}{2\pi}\int_0^{2\pi} g(t, -1)\,dt = 0$.
Theorem 11.7.5. (Jensen's formula) Suppose f ∈ H(B(0; R)) and f(0) ≠ 0. For any 0 <
r < R, let α1, ..., α_{n_r} be the zeroes of f in B(0; r) repeated according to their multiplicities.
Then
\[
|f(0)| \prod_{k=1}^{n_r} \frac{r}{|\alpha_k|} = \exp\Bigl(\frac{1}{2\pi}\int_{-\pi}^{\pi} \log|f(re^{i\theta})|\,d\theta\Bigr). \tag{11.56}
\]
The map $l(r) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \log|f(re^{i\theta})|\,d\theta$, 0 < r < R, is non–decreasing on (0, R) and
\[
\log|f(0)| = \lim_{r \to 0} l(r).
\]
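Jensen's formula (11.56) can be checked numerically for a polynomial with known zeroes. The sketch below is an illustration under stated assumptions: f is taken to be the monic polynomial with the given zeroes, none of which is 0 or lies on the circle |z| = r, and the function name and sample count are hypothetical.

```python
import cmath
import math

def jensen_sides(zeros, r, samples=2048):
    """Return (left, right) sides of (11.56) for f(z) = prod (z - alpha) over the given zeros."""
    def f(z):
        value = 1 + 0j
        for alpha in zeros:
            value *= z - alpha
        return value

    left = abs(f(0))
    for alpha in zeros:
        if abs(alpha) < r:  # only the zeroes inside B(0; r) contribute
            left *= r / abs(alpha)
    mean_log = sum(
        math.log(abs(f(r * cmath.exp(2j * math.pi * k / samples)))) for k in range(samples)
    ) / samples
    return left, math.exp(mean_log)
```

With zeroes 0.5, −0.25i and 2 and radius r = 1, both sides should be close to 2: the two zeroes inside the unit disk contribute the factors 1/0.5 and 1/0.25, and |f(0)| = 0.25.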
Proof. Fix 0 < r < R. Suppose f has m_r zeroes in the closed ball of radius r centered at 0,
so that {α1, ..., α_{n_r}} ⊂ B(0; r) and |α_{n_r+1}| = ... = |α_{m_r}| = r. The function
\[
g(z) = f(z) \prod_{j=1}^{n_r} \frac{r^2 - \overline{\alpha_j}\,z}{r(\alpha_j - z)} \prod_{j=n_r+1}^{m_r} \frac{\alpha_j}{\alpha_j - z} \tag{11.57}
\]
is analytic on B(0; R) and has no zeroes in B(0; s) for some r < s < R. By Theorem 11.5.28
g = exp ◦h for some h ∈ H(B(0; s)). Then, log |g| = Re(h) is harmonic and satisfies the
mean–value property
\[
\log|g(0)| = \frac{1}{2\pi}\int_{-\pi}^{\pi} \log|g(re^{i\theta})|\,d\theta.
\]
By (11.57)
\[
|g(0)| = |f(0)| \prod_{j=1}^{n_r} \frac{r}{|\alpha_j|}.
\]
Each factor in the first product in (11.57) has modulus one on the circle of radius r, for if
z = re^{iθ} and 1 ≤ j ≤ n_r,
\[
\Bigl|\frac{r^2 - \overline{\alpha_j}\,z}{r(\alpha_j - z)}\Bigr| = \Bigl|\frac{re^{-i\theta} - \overline{\alpha_j}}{\alpha_j - re^{i\theta}}\Bigr| = 1.
\]
If α_j = re^{iθ_j}, n_r + 1 ≤ j ≤ m_r, then
\[
\log|g(re^{i\theta})| = \log|f(re^{i\theta})| - \sum_{j=n_r+1}^{m_r} \log\bigl|1 - e^{i(\theta - \theta_j)}\bigr|. \tag{11.58}
\]
Identity (11.56) follows from (11.58) by integration over [0, 2π] and application of Lemma 11.7.4.
The second statement follows by noticing that if r < s, then m_r ≤ m_s, 1 ≤ r/|α_j| ≤ s/|α_j|
for each j = 1, ..., m_r, and 1 ≤ s/|α_j| for each j = m_r + 1, ..., m_s. The last statement
corresponds to the case where there are no zeroes in B(0; r) when r is small. It follows by
the continuity of f and dominated convergence by choosing 0 < r0 < R small enough so
that $\frac{|f(0)|}{2} < |f(z)| < \frac{3}{2}|f(0)|$ whenever |z| < r0.
that l(r) is still non–decreasing on (0, R), and that limr→0 l(r) = −∞ = log |f (0)|.
Corollary 11.7.7. Suppose f ∈ H(B(0; R)) and f(0) ≠ 0. For any 0 < r < R, let n(r) be
the number of zeroes of f in B(0; r), counting multiplicities. Then
\[
\int_0^r \frac{n(s)}{s}\,ds = \frac{1}{2\pi}\int_0^{2\pi} \log|f(re^{i\theta})|\,d\theta - \log|f(0)|. \tag{11.59}
\]
Proof. Fix 0 < r < R and suppose α1, ..., α_{n(r)} are the zeroes of f in B(0; r) repeated
according to their multiplicity. Then
\begin{align*}
\sum_{k=1}^{n(r)} \log\frac{r}{|\alpha_k|} &= \sum_{k=1}^{n(r)} \int_{|\alpha_k|}^{r} \frac{ds}{s}
   = \sum_{k=1}^{n(r)} \int_0^r \mathbf{1}(|\alpha_k| < s)\,\frac{ds}{s} \\
 &= \int_0^r \sum_{k=1}^{n(r)} \mathbf{1}(|\alpha_k| < s)\,\frac{ds}{s} = \int_0^r \frac{n(s)}{s}\,ds.
\end{align*}
Identity (11.59) follows from Jensen's formula (11.56). As n(r) is nondecreasing, the last
statement follows by comparing $\int_{r/2}^{r} \frac{n(s)}{s}\,ds$ and the right hand side of (11.59).
Conversely, suppose that for any ε > 0 there is N for which (11.60) holds. Then, for ε = 1/2
there is an integer N0 such that
\[
\frac{1}{2} < |a_{N_0} \cdots a_n| < \frac{3}{2}, \qquad n > N_0. \tag{11.62}
\]
This implies that an ≠ 0 for all n ≥ N0. Let $q_n := \prod_{k=N_0}^{n} a_k$, n ≥ N0. Then, for any other
ε > 0 we can choose N > N0 such that
\[
\Bigl|\frac{q_{n+k}}{q_n} - 1\Bigr| < \frac{2}{3}\varepsilon, \qquad n \ge N,\ k \ge 0.
\]
Consequently
\[
|q_{n+k} - q_n| < \frac{2}{3}\varepsilon\,|q_n| < \varepsilon, \qquad n \ge N,\ k \ge 0.
\]
Hence {qn : n ∈ N} is a Cauchy sequence and by (11.62), qn converges to some q ≠ 0.
The infinite product $\prod_{n=1}^{\infty} (1 + a_n)$ is absolutely convergent if $\prod_{n=1}^{\infty} (1 + |a_n|)$ is
convergent.
Theorem 11.8.3. Absolute convergence of $\prod_{n=1}^{\infty} (1 + a_n)$ implies proper convergence.
The product $\prod_{n=1}^{\infty} (1 + a_n)$ converges absolutely iff $\sum_{n=1}^{\infty} |a_n| < \infty$. Absolute convergence
implies that $\prod_{n=1}^{\infty} (1 + a_n) = 0$ iff an = −1 for some n.
Suppose (bk : k ∈ N) is a rearrangement of (an : n ∈ N). Then $\prod_n (1 + a_n)$ converges
absolutely iff $\prod_k (1 + b_k)$ converges absolutely. In either case,
\[
\prod_n (1 + a_n) = \prod_k (1 + b_k).
\]
As both (sn : n ∈ N) and (qn : n ∈ N) are monotone nondecreasing sequences, the
convergence of one implies the boundedness, and hence convergence, of the other one. In
either case, there is N ∈ N for which |an| ≤ 1/2, and thus an ≠ −1, whenever n ≥ N.
Suppose $\prod_n (1 + a_n)$ converges absolutely. Let bk = a_{g(k)} where g is a permutation of N.
Let $p_n = \prod_{j=1}^{n} (1 + a_j)$, $q_k = \prod_{j=1}^{k} (1 + b_j)$ and $p = \prod_{n=1}^{\infty} (1 + a_n)$. There is a constant
C > 0 such that |pn| ≤ C for all n. Given 0 < ε < 1, there is an integer N such that
11.8. Entire functions 315
n ≥ N implies that $\sum_{j\ge n} |a_j| < \varepsilon$ and |pn − p| < ε. There is an integer M such that
{1, ..., N} ⊂ {g(1), ..., g(M)}. For m ≥ M,
\begin{align*}
|q_m - p| \le |q_m - p_N| + |p_N - p| &\le |p_N| \Bigl(\prod_{n > N} (1 + |a_n|) - 1\Bigr) + \varepsilon \\
 &\le C(e^{\varepsilon} - 1) + \varepsilon < (2C + 1)\varepsilon.
\end{align*}
The third statement follows immediately.
Proof. (i) for each z ∈ D we can write Fn (z) = 1+an (z) with |an (z)| ≤ cn . The convergence
of Pm (z) follows from Theorem 11.8.3. As each Pm is holomorphic on D, we conclude that
the limit P ∈ H(D). Moreover, if for some z ∈ D, Fn (z) 6= 0 for all n, then {Pm (z)} is
bounded away from zero, that is inf m |Pm (z)| > 0.
Proof. If {zn} is finite, the result is immediate. Suppose {zn} is infinite. As f is entire and
not constant, it follows that |zn| → ∞. There exists a sequence {pn : n ∈ N} ⊂ Z+ (pn = n − 1
is one example) such that (11.65) in Theorem 11.8.6 holds. Hence $P(z) = \prod_{n=1}^{\infty} E_{p_n}\bigl(\frac{z}{z_n}\bigr)$
is an entire function whose zeroes are {zn : n ∈ N}. It follows that the function h = f/P
is entire and has no zeros in C. Theorem 11.5.28 implies that there is an entire function g
such that h = exp ◦g.
Example 11.8.9. The function $f(z) = \frac{\sin(\pi z)}{\pi z}$ is entire and has only zeros of order one at
each n ∈ Z \ {0}. Then, with pn ≡ 1,
\[
\frac{\sin(\pi z)}{\pi} = e^{g(z)}\,z \prod_{n=1}^{\infty} \Bigl(1 - \frac{z^2}{n^2}\Bigr)
\]
for some entire function g. We will show that e^{g(z)} ≡ 1. The function
\[
w(z) = \pi\cot(\pi z) = \pi\,\frac{\cos(\pi z)}{\sin(\pi z)}
\]
is meromorphic on C with simple poles (order one) in Z. The function
\[
h(z) = \frac{1}{z} + \sum_{n=1}^{\infty} \Bigl(\frac{1}{z + n} + \frac{1}{z - n}\Bigr) = \frac{1}{z} + \sum_{n=1}^{\infty} \frac{2z}{z^2 - n^2} \tag{11.68}
\]
is also meromorphic on C with simple poles in Z. We will show that w ≡ h. Let ϕ = w − h.
Then
(a) ϕ is entire.
(b) ϕ(z) = ϕ(z + 1) and ϕ(−z) = −ϕ(z).
(a) is obvious since both w and h have only poles of order one at each integer, with residues
equal to 1. Thus ϕ ∈ H(C \ Z) has a removable singularity at each n ∈ Z.
To prove (b), it is enough to show that h is periodic with period 1. Let
\[
h_N(z) = \sum_{n=-N}^{N} \frac{1}{z - n}.
\]
This sequence converges uniformly to h on compact subsets of C \ Z. Since
\[
h_N(z + 1) = \sum_{n=-N}^{N} \frac{1}{z - (n - 1)} = \sum_{n=-(N+1)}^{N-1} \frac{1}{z - n}
  = h_N(z) + \frac{1}{z + 1 + N} - \frac{1}{z - N},
\]
11.9. Exercises
Exercise 11.9.1. Let µ be a complex measure on B(Rd ) and let Mµ be its Hardy’s maximal
function. Show that Mµ < ∞ λd –a.a.
Exercise 11.9.2. For any a < x < y < b, show that
\[
V_f(y) - V_f(x) = \sup\Bigl\{\sum_{j=1}^{n} |f(t_j) - f(t_{j-1})| : x = t_0 < \ldots < t_n = y,\ n \in \mathbb{N}\Bigr\}.
\]
This means that the variation of f over any subinterval [x, y] ⊂ [a, b] is given by the
difference of the variations over [a, y] and [a, x].
Exercise 11.9.3. Show that the function f(t) = t sin(t^{−1}) if t ≠ 0 and f(0) = 0 is not of
bounded variation over any interval containing 0.
Exercise 11.9.4. Suppose f and g are absolutely continuous functions over [a, b] and let
α ∈ C. Show that f + αg, f · g and exp ◦f are absolutely continuous.
Exercise 11.9.5. Define $I_\pm(s, R) := \int_1^R \exp\bigl(i\bigl(\pm\frac{t^3}{3} + st\bigr)\bigr)\,dt$. Show that $\lim_{R\to\infty} I_\pm(s, R)$
exists for any s ∈ R. Conclude that $\lim_{R\to\infty} \int_{-R}^{R} \cos\bigl(\frac{t^3}{3} + st\bigr)\,dt$ exists for all s ∈ R. (Hint:
Notice that $\frac{d}{dt}\bigl(e^{i t^3/3}\bigr) = i t^2 e^{i t^3/3}$ and use integration by parts.)
Exercise 11.9.6. Suppose F and G are functions on R+ which are of bounded variation
over any interval [a, b] ⊂ R+. Suppose that $G(\infty) := \lim_{R\to\infty} G(R)$ exists. Show that
\[
\int_{(a,b]} F(s)\,\mu_G(ds) = F(a)\bigl(G(\infty) - G(a)\bigr) - F(b)\bigl(G(\infty) - G(b)\bigr)
  + \int_{(a,b]} \bigl(G(\infty) - G(t-)\bigr)\,\mu_F(dt).
\]
Remark 11.9.7. The existence of limR→∞ G(R) does not mean that µG (R+ ) is finite. If
limR→∞ VG (R) < ∞, then |µG |(R+ ) < ∞ and so |µG (R+ )| < ∞.
Exercise 11.9.8. Let µ be a Borel measure on an interval I in the real line. For any
a, b ∈ I with a < b, show that
\[
\int \mathbf{1}(a < s_n < \ldots < s_1 < b)\,\mu(ds_1) \otimes \cdots \otimes \mu(ds_n) \le \frac{1}{n!}\bigl(\mu(a, b)\bigr)^n.
\]
If in addition, µ is continuous, show that
\[
\int \mathbf{1}(a < s_n \le \ldots \le s_1 \le b)\,\mu(ds_1) \otimes \cdots \otimes \mu(ds_n) \le \frac{1}{n!}\bigl(\mu(a, b]\bigr)^n.
\]
(Hint: Define G(t) := µ(a, t]. Apply Fubini's theorem together with Lemma 11.3.6.)
Exercise 11.9.9. Let {fn : n ∈ N} be a sequence of differentiable functions on an open
interval I such that $f_n' \in L_1^{loc}(I)$. Assume that fn and fn′ converge uniformly on compact
subsets of I to functions f and g respectively. Show that f is λ–a.s. differentiable and
that f′(x) = g(x) at every point x of differentiability. (Hint: For fixed x ∈ I, consider
$\varphi_n(h) = \frac{1}{2h}\int_{[x-h, x+h]} f_n'(t)\,dt$, h > 0. Then, show that $\frac{f_n(x+h) - f_n(x-h)}{2h} = \varphi_n(h)$ converges
uniformly to $\frac{1}{2h}\int_{[x-h, x+h]} g(t)\,dt$.)
Exercise 11.9.10. Let D be an open subset in C and suppose f ∈ H(D). Show that
Cauchy–Riemann’s equation in polar coordinates is given by
\[ \frac{\partial g}{\partial r} = \frac{1}{ir}\,\frac{\partial g}{\partial \theta}, \]
where $g(r,\theta) = f(re^{i\theta})$.
320 11. Differentiation
Exercise 11.9.11. Suppose the double series $\sum_{(n,m)\in\mathbb{N}^2} a(n) z^{nm}$ converges absolutely on
B(0; 1) and call its sum S(z). Show that each of the following series converges absolutely in
B(0; 1) as well, and has sum S(z):
\[ \sum_{n=1}^{\infty} a(n)\,\frac{z^n}{1 - z^n}, \qquad \sum_{n=1}^{\infty} A(n) z^n, \quad \text{where } A(n) = \sum_{d \mid n} a(d). \]
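As a sanity check of the identity in Exercise 11.9.11 (not a proof), the two series can be compared numerically after truncation. The coefficient function a(n) = 1/n, the point z = 0.3 + 0.4i and the truncation level below are arbitrary choices made for this sketch.

```python
def lambert_sum(a, z, terms=80):
    """Truncation of sum_{n>=1} a(n) z^n / (1 - z^n)."""
    return sum(a(n) * z**n / (1 - z**n) for n in range(1, terms + 1))


def divisor_sum(a, n):
    """A(n) = sum of a(d) over the divisors d of n."""
    return sum(a(d) for d in range(1, n + 1) if n % d == 0)


def power_series_sum(a, z, terms=80):
    """Truncation of sum_{n>=1} A(n) z^n."""
    return sum(divisor_sum(a, n) * z**n for n in range(1, terms + 1))


if __name__ == "__main__":
    a = lambda n: 1.0 / n          # illustrative coefficients
    z = 0.3 + 0.4j                 # |z| = 0.5 < 1
    print(lambert_sum(a, z))
    print(power_series_sum(a, z))
```

Both truncations differ from the common limit by a tail of order |z|^80, so for |z| = 0.5 the printed values agree to machine precision.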
Show that f ∈ H(D). (Hint: Use Morera’s theorem together with Fubini’s theorem.)
Exercise 11.9.15. Determine the regions in which the following functions are holomorphic:
\[ f(z) = \int_0^1 \frac{dt}{1 + tz}, \qquad g(z) = \int_0^\infty \frac{e^{tz}}{1 + t^2}\,dt, \qquad h(z) = \int_{-1}^{1} \frac{e^{tz}}{1 + t^2}\,dt. \]
Exercise 11.9.16. Let z0 ∈ C and c > 0. Define the path ξ : t ↦ z0 + itc, −1 ≤ t ≤ 1. For
x > 0 define
\[ g(x) = \frac{1}{2\pi i}\int_\xi \Big(\frac{1}{z - z_0 - x} - \frac{1}{z - z_0 + x}\Big)\,dz. \]
Estimate limx→0 g(x).
11.9. Exercises 321
Exercise 11.9.17. (Gamma function reprise) Show that $\Gamma(z) = \int_0^\infty e^{-t} t^{z-1}\,dt$ defines
an analytic function in the half plane H = {z ∈ C : Re(z) > 0}, and that it satisfies
Γ(z + 1) = zΓ(z) for all z ∈ H. (Hint: Define $F_n(z) = \int_{(1/n,n]} e^{-t} t^{z-1}\,dt$. Apply the
result from Exercise 11.9.14 to show that on any strip Sa,b = {z ∈ C : a < Re(z) < b}
(0 < a < b < ∞), Fn is analytic and Fn converges to Γ uniformly. For the last statement,
use integration by parts.)
Exercise 11.9.18. Show that
(a) $S_1(z) = \int_{(1,\infty)} e^{-t} t^{z-1}\,dt$ is an entire function.
(b) For Re(z) > 0, $\int_{(0,1]} e^{-t} t^{z-1}\,dt = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,(n+z)}$.
(c) Show that $S_0(z) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,(n+z)}$ is meromorphic with only simple poles at the points
0, −1, −2, . . . (Hint: For fixed R > 0, split the series at some integer N > 2R. Show that the finite
sum is meromorphic with poles in {0, −1, . . . , −N} and that the remaining series converges
uniformly since
\[ \Big|\frac{(-1)^n}{n!\,(n + z)}\Big| \le \frac{1}{n!\,R} \]
for 2R < N < n and |z| ≤ R.)
(d) Conclude that Γ can be extended as a meromorphic function in C with only simple
poles at the points 0, −1, −2, . . .
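The series in parts (b) and (c) can be explored numerically. The sketch below is only an illustration: the helper name `S0` and the truncation at 60 terms are choices made here, and the reference values come from the elementary integrals $\int_0^1 e^{-t}\,dt = 1 - e^{-1}$ and $\int_0^1 t e^{-t}\,dt = 1 - 2e^{-1}$.

```python
import math


def S0(z, terms=60):
    """Partial sum of sum_{n>=0} (-1)^n / (n! (n+z)); z must avoid 0, -1, -2, ..."""
    return sum((-1) ** n / (math.factorial(n) * (n + z)) for n in range(terms))


if __name__ == "__main__":
    # Part (b) at z = 1 and z = 2, compared with the elementary integrals.
    print(S0(1), 1 - math.exp(-1))
    print(S0(2), 1 - 2 * math.exp(-1))
    # Gamma(1) = S0(1) + S1(1), where S1(1) = int_1^infty e^{-t} dt = e^{-1}.
    print(S0(1) + math.exp(-1), math.gamma(1.0))
    # Near z = -1 the series behaves like (-1)^1 / (1! (z + 1)): a simple pole with residue -1.
    z = -1 + 1e-6
    print((z + 1) * S0(z))
    # The series also makes sense for complex z away from the poles.
    print(S0(0.5 + 2j))
```

The product (z + 1) S0(z) staying close to −1 near z = −1 is consistent with the simple pole predicted in part (c).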
Exercise 11.9.19. Prove that on the strip S0,1 = {z ∈ C : 0 < Re(z) < 1}
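(a) $\lim_{R\to\infty} \int_0^R \cos(t)\, t^{z-1}\,dt = \Gamma(z)\cos\frac{\pi z}{2}$.
(b) $\lim_{R\to\infty} \int_0^R \sin(t)\, t^{z-1}\,dt = \Gamma(z)\sin\frac{\pi z}{2}$.
(Hint: Use the contour shown in Figure 3.)
[Figure 3. Contour with marked points ε and R on the real axis and iε and iR on the imaginary axis.]
(c) Show that the equation in (b) can be extended by analytic continuation to −1 <
Re(z) < 1, and as a consequence
\[ \lim_{R\to\infty} \int_0^R \frac{\sin x}{x}\,dx = \frac{\pi}{2} \quad\text{and}\quad \lim_{R\to\infty} \int_0^R \frac{\sin x}{x^{3/2}}\,dx = \sqrt{2\pi}. \]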
Exercise 11.9.20. (Frobenius–Fuchs method, cont.) Consider the second order differential
equation (11.51) and suppose that the indicial equation I(r) = r(r − 1) + p0 r + q0 = 0 has
solutions α, β such that Re(α − β) ≥ 0. Show that
(a) If s = α − β ∉ Z+ then there are two solutions to (11.51) of the form
\[ w_1(z) = z^{\alpha}\Big(1 + \sum_{k\ge1} a_k z^k\Big), \qquad w_2(z) = z^{\beta}\Big(1 + \sum_{k\ge1} b_k z^k\Big), \]
where the series have positive radii of convergence. {w1 , w2 } is a linearly independent
system of solutions.
(b) If n = α − β ∈ Z+ then there are solutions to (11.51) of the form
\[ w_1(z) = z^{\alpha}\Big(1 + \sum_{k\ge1} a_k z^k\Big), \qquad w_2(z) = z^{\beta}\Big(1 + \sum_{k\ge1} b_k z^k\Big) + C\,w_1(z)\log z \]
for some constant C, and where the power series have positive radii of convergence.
(Hint: suppose there is a solution of the form w(z) = w1(z)h(z) for some function h,
where w1(z) = z^α f(z) with f analytic in a disk around 0 and f(0) = 1. This gives a
first order equation on h′ given by
\[ h'' + \Big(\frac{2\alpha}{z} + 2\,\frac{f'}{f} + \frac{p(z)}{z}\Big)h' = 0. \]
Show that a solution to this reduced equation is of the form
\[ h'(z) = z^{-n-1}\Big(1 + \sum_{k\ge1} c_k z^k\Big).) \]
Exercise 11.9.21. Suppose f is an entire function and |f (z)| ≤ A|z|k for some constant
A > 0, k ∈ N and all z large enough. Show that f is a polynomial of degree at most k + 1
(Hint: Use Cauchy estimates).
Exercise 11.9.22. Suppose f is an entire function and that for some ρ > 0 there are
constants A, B such that |f (z)| ≤ A exp(B|z|ρ ). The infimum ρf of all such ρ is called the
order of growth of f . Show that
(a) There exists a constant C depending only on f such that
n(r) ≤ Crρ , r > 0.
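(b) If {αk : k ∈ N} are the zeroes of f that are different from 0 then, for any s > ρ,
$\sum_{k=1}^{\infty} \frac{1}{|\alpha_k|^s} < \infty$. (Hint: notice that
\[ \sum_{|\alpha_k|\ge1} \frac{1}{|\alpha_k|^s} = \sum_{j\ge0}\ \sum_{2^j \le |\alpha_k| < 2^{j+1}} \frac{1}{|\alpha_k|^s}.) \]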
for all m ∈ N where B2m is the 2m–th Bernoulli number. (Hint: Use identity (11.68) in
Example 11.8.9. Equate the coefficients of the power series (11.71) and (11.70).)
wherever the limit exists. If f(z0) exists for some z0 = σ0 + iξ0, show that f(z) exists in
the set ∆(σ0) := {z ∈ C : Re(z) > σ0}. If $t \mapsto e^{-\sigma_0 t} \in L_1(|\mu|)$ for some σ0 ∈ R, show that
$f \in H(\Delta(\sigma_0)) \cap C(\overline{\Delta(\sigma_0)})$.
Exercise 11.9.26. Show that $\hat f_p : t \mapsto \int_{[-1,1]} (1 - x^2)^{-p} e^{-itx}\,dx$ is entire for each p < 1 and
that
\[ \hat f_p(t) = \frac{\sqrt{\pi}\,\Gamma(1-p)\,2^{\frac12 - p}}{t^{\frac12 - p}}\, J_{\frac12 - p}(t). \]
Some Elements of
Functional Analysis
In this section we discuss a few results on the theory of continuous linear maps on topological
vector spaces which will be useful throughout the following sections, in particular in the
study of further representation theorems, addressed in Chapter 13, and in the study of weak
convergence of measures, addressed in Chapter 17.
Example 12.1.1. The Euclidean spaces $(F^n, \|\cdot\|_2)$ defined in Example 2.5.6 are the simplest
examples of Banach spaces.
Example 12.1.2. Let K be a compact topological space. The space C(K) of complex or
real continuous functions with $\|f\|_u := \sup_{x\in K} |f(x)|$ is a Banach space.
326 12. Some Elements of Functional Analysis
Example 12.1.3. We saw in Chapter 8 that if (Ω, F , µ) is a measure space, then for each
1 ≤ p ≤ ∞, Lp (µ) is a Banach space.
Let $a \in \overline{A}$ and $b \in \overline{B}$ and suppose W is an open set with a + b ∈ W . Then, there exist open
neighborhoods V1 and V2 of a and b respectively such that V1 + V2 ⊂ W . Since $a \in \overline{A}$ and
$b \in \overline{B}$, there are points x ∈ V1 ∩ A and y ∈ V2 ∩ B. Therefore,
x + y ∈ (A + B) ∩ (V1 + V2 ) ⊂ (A + B) ∩ W
and thus, $a + b \in \overline{A + B}$.
Example 12.1.6. In R consider the sets $A = \{n + \frac{1}{n+1} : n \in \mathbb{N}\}$ and B = {−n : n ∈ N}.
Clearly A and B are both closed subsets of R; however, A + B is not closed since $\{\frac{1}{n} : n \in \mathbb{N}\} \subset A + B$
but 0 ∉ A + B.
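Example 12.1.6 can be checked on finitely many elements with exact rational arithmetic. The sketch below assumes the convention N = {1, 2, . . .}, under which the sums (n + 1/(n+1)) − n = 1/(n+1) accumulate at 0; the function names are illustrative only.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """Element n + 1/(n+1) of the set A."""
    return Fraction(n) + Fraction(1, n + 1)


def sums(limit: int) -> set:
    """All sums a(n) + (-m) with 1 <= n, m <= limit, a finite piece of A + B."""
    return {a(n) - m for n in range(1, limit + 1) for m in range(1, limit + 1)}


if __name__ == "__main__":
    finite_piece = sums(30)
    # The points 1/(n+1) belong to A + B and approach 0 ...
    print(sorted(s for s in finite_piece if 0 < s < 1)[:5])
    # ... but 0 itself is never attained, since n + 1/(n+1) is not an integer.
    print(Fraction(0) in finite_piece)
```

Every element of A + B has denominator n + 1 ≥ 2 in lowest terms, which is why 0 cannot occur although it is a limit point.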
Theorem 12.1.7. Let X be a topological vector space and ∅ ≠ A ⊂ X. Then
(a) If V is open, so is A + V .
(b) $\overline{A} = \bigcap \{A + V : V \text{ open},\ 0 \in V\}$.
Corollary 12.1.9 implies that any topological vector space is Hausdorff regular, that
is, for any point x ∈ X and closed subset F ⊂ X such that x ∉ F , there exists an open
neighborhood V ⊂ X of 0 such that x + V and F + V are disjoint.
Definition 12.1.10. Let X be a vector space.
(a) B ⊂ X is balanced if λB ⊂ B for any λ ∈ F with |λ| ≤ 1.
(b) C ⊂ X is convex if λC + (1 − λ)C ⊂ C for all λ ∈ [0, 1].
(c) A ⊂ X is affine if αA + (1 − α)A ⊂ A for all α ∈ R.
If X is a topological vector space,
Proof. For any open neighborhood V of 0 let U be a balanced neighborhood such that
U +U ⊂V.
Suppose $x_n \to x$ as n → ∞. Then, for all n, m large enough, x_m , x_n ∈ x + U and so, x_n − x_m =
(x_n − x) + (x − x_m ) ∈ U + (−U ) = U + U ⊂ V .
Suppose E is not bounded and let W be an open neighborhood of 0. For any integer n
there is x_n ∈ E \ (nW ). Then $\frac{1}{n} x_n$ does not converge to 0.
Example 12.1.19. The ball $B(0; r) = \{f \in L_0 : \|f\|_0 < r\}$ is balanced for any r > 0;
however, L0 is not locally convex in general. As a counterexample, consider the probability
space ((0, 1], B((0, 1]), λ). Define
\[ f_0 \equiv 1, \qquad f_n = 2^k\, 1_{(2^{-k}(l-1),\, 2^{-k} l]} \]
(iii) can be obtained from (ii) and the proof is left as an exercise.
(iv) Suppose that d is translation invariant metric on X compatible with the topology τ .
We can define a metric ρ on X/M by setting
ρ(π(x), π(y)) := inf{d(x − y, z) : z ∈ M } = d(x − y, M )
Notice that d(x − y, M ) = 0 iff $x - y \in \overline{M} = M$. Hence
ρ(π(x), π(y)) = ρ(π(x) − π(y), π(0)),
and ρ(π(x), π(y)) = 0 iff π(x) = π(y). Since
d(x − y, z) = d(0, z + y − x) = d(−z, y − x) = d(y − x, −z),
it follows that ρ(π(x), π(y)) = ρ(π(y), π(x)). From
d(x − y, z) ≤ d(x − y, u − y + z ′ ) + d(u − y + z ′ , z) = d(x − u, z ′ ) + d(u − y, z − z ′ ),
we conclude that ρ(π(x), π(y)) ≤ ρ(π(x), π(u)) + ρ(π(u), π(y)). This shows that ρ is a
translation invariant metric on X/M . Since d(x, 0) = d(x + z, z) for all z,
π {x : d(x, 0) < r} = {π(x) : ρ(π(x), π(0)) < r}
From (ii), it follows that if d is a translation invariant metric that generates the topology τ
on X then, ρ is a translation invariant metric on X/M that generates τM .
Suppose that d is a complete translation invariant metric generating τ , and let {π(xn ) :
n ∈ N} be a Cauchy sequence in (X/M, ρ). Without loss of generality we may assume that
$\rho(\pi(x_{n+1} - x_n), \pi(0)) < 2^{-n}$. Set z1 = 0 and choose z2 ∈ M such that
\[ d(x_2 + z_2 - (x_1 + z_1), 0) < \frac{1}{2}. \]
12.3. Locally convex spaces 333
(b) If A is convex and absorbent, then {t > 0 : t−1 x ∈ A} is an infinite interval (either open
or closed) whose left end point is µA (x), for if $t^{-1}x \in A$ and s > t, then
\[ s^{-1}x = \Big(1 - \frac{t}{s}\Big)\,0 + \frac{t}{s}\,t^{-1}x \in A. \]
Hence, if µA (x) < r and µA (y) < t, then $r^{-1}x$ and $t^{-1}y$ belong to A. The convexity of A
implies
\[ (r + t)^{-1}(x + y) = \frac{r}{r+t}\,r^{-1}x + \frac{t}{r+t}\,t^{-1}y \in A. \]
Thus, µA (x + y) ≤ r + t. Letting r ↘ µA (x) and t ↘ µA (y) completes the proof.
(c) If A is balanced, convex and absorbent then for any θ ∈ F with |θ| = 1 we have that
θ−1 A = A. Then, {t > 0 : t−1 x ∈ θ−1 A} = {t > 0 : t−1 x ∈ A} and consequently,
µA (θx) = µA (x). For a general λ ∈ F we have that λ = |λ|θ for some θ ∈ F with |θ| = 1.
Therefore,
µA (λx) = |λ|µA (θx) = |λ|µA (x).
Therefore, if A is an absorbent balanced convex set µA is a seminorm.
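The Minkowski functional µA (x) = inf{t > 0 : t^{-1}x ∈ A} can be approximated by bisection, because for a convex absorbent set the set {t > 0 : t^{-1}x ∈ A} is an upper ray. The sketch below is an illustration under assumptions made here: A is the open ellipse {(u, v) : u²/4 + v² < 1} in R², for which µA is known in closed form, and `minkowski`, `in_ellipse` are hypothetical helper names.

```python
import math


def minkowski(in_A, x, hi=1e6, tol=1e-12):
    """Approximate mu_A(x) by bisection.

    Assumes A is convex and absorbent (so membership of x/t in A is monotone in t)
    and that x/hi already belongs to A.
    """
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if in_A(tuple(c / mid for c in x)):
            hi = mid   # x/mid lies in A, so mu_A(x) <= mid
        else:
            lo = mid   # x/mid lies outside A, so mu_A(x) >= mid
    return hi


def in_ellipse(p):
    """Membership test for the balanced convex set {(u, v): u^2/4 + v^2 < 1}."""
    return p[0] ** 2 / 4.0 + p[1] ** 2 < 1.0


if __name__ == "__main__":
    x, y = (3.0, 4.0), (-1.0, 2.0)
    s = (x[0] + y[0], x[1] + y[1])
    print(minkowski(in_ellipse, x), math.sqrt(x[0] ** 2 / 4 + x[1] ** 2))
    # Subadditivity and absolute homogeneity, as in parts (b) and (c).
    print(minkowski(in_ellipse, s) <= minkowski(in_ellipse, x) + minkowski(in_ellipse, y) + 1e-9)
    print(minkowski(in_ellipse, (-6.0, -8.0)), 2 * minkowski(in_ellipse, x))
```

For this particular set the functional is the norm (u, v) ↦ (u²/4 + v²)^{1/2}, so the numerical values can be compared with the exact ones.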
Theorem 12.3.3. Suppose that A is convex and absorbent. Let B = {x ∈ X : µA (x) < 1}
and C = {x ∈ X : µA (x) ≤ 1}. Then, B ⊂ A ⊂ C and µB = µA = µC .
Proof. Since A is convex and absorbent, s > µA (x) implies that $s^{-1}x \in A$. Thus, if
µA (x) < 1 then $x = 1^{-1}x \in A$, that is, B ⊂ A. It is obvious that A ⊂ C. It is easy to see that
B and C are convex; since $\mu_A\big(\frac{1}{\mu_A(x)+1}\,x\big) < 1$, it follows that B and C are also absorbent.
By definition of the Minkowski functional it follows that µC ≤ µA ≤ µB . For x ∈ X fixed,
consider µC (x) < s < t. Then $s^{-1}x \in C$ and $\mu_A(t^{-1}x) = \frac{s}{t}\,\mu_A(s^{-1}x) < 1$; hence $t^{-1}x \in B$ and
µB (x) ≤ t. By letting t ↘ µC (x) we obtain that µB (x) ≤ µC (x).
Theorem 12.3.4. Suppose ρ is a seminorm on a linear space X and set B = {x ∈ X :
ρ(x) < 1}. Then, B is balanced, convex and absorbent, and ρ = µB .
Proof. It is clear that B is a balanced convex set, and since $\rho\big(\frac{1}{\rho(x)+1}\,x\big) < 1$, it follows
that B is also absorbent. For each t > 0 and x ∈ X, ρ(t−1 x) < 1 iff ρ(x) < t; hence,
{t > 0 : t−1 x ∈ B} = {t > 0 : ρ(x) < t}, and so ρ(x) = µB (x).
Theorem 12.3.5. Suppose (X, τ ) is a locally convex topological linear space and let V be
a local convex balanced base at 0 ∈ X. Then,
(i) V = {x ∈ X : µV (x) < 1} for each V ∈ V.
(ii) {µV : V ∈ V} is a family of continuous seminorms that separates points in X.
(iii) τ is generated by {µV : V ∈ V}.
Conversely, if {ρα : α ∈ A} is a family of seminorms that separate points of X then,
Proof. (i) Theorem 12.3.3 shows that {x ∈ X : µV (x) < 1} ⊂ V . Let x ∈ V . The
continuity of λ 7→ λx implies that there is a real number 0 < λ < 1 such that λ−1 x ∈ V .
Thus µV (x) < 1, and so V ⊂ {x ∈ X : µV (x) < 1}.
(iii) follows from (ii) since τ is generated by finite intersections of sets of the form x + tV =
{y : µV (x − y) < t} where x ∈ X and t > 0.
(iv) For any α ∈ A and t > 0, Vα (0; t) = {x ∈ X : ρα (x) < t} = tVα (0; 1) is a balanced,
convex, and absorbent set. Hence, by Theorem 12.3.4, $\mu_{V_\alpha(0;1)} = \rho_\alpha$. Consequently, the
collection of all finite intersections of sets of the form Vα (0; 1/n), α ∈ A and n ∈ N, is a
balanced convex local base at 0 for some topology τ on X.
It remains to show that (x, y) ↦ x + y and (λ, x) ↦ λx are continuous. If $V = \bigcap_{j=1}^{n} V_{\alpha_j}(0; \varepsilon_j)$
for some n ≥ 1, α1 , . . . , αn ∈ A and positive ε1 , . . . , εn then
\[ \tfrac{1}{2}V + \tfrac{1}{2}V \subset V. \]
Continuity of (x, y) ↦ x + y follows.
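Let α0 ∈ F and x0 ∈ X be fixed. If |α − α0 | < δ and $\max_{1\le j\le n} \rho_{\alpha_j}(x - x_0) < \delta$ for some
δ > 0, then
\[ \rho_{\alpha_j}(\alpha x - \alpha_0 x_0) \le |\alpha|\,\rho_{\alpha_j}(x - x_0) + |\alpha - \alpha_0|\,\rho_{\alpha_j}(x_0) \le (\delta + |\alpha_0|)\delta + \delta \max_{1\le j\le n} \rho_{\alpha_j}(x_0). \]
For δ small enough, $(\delta + |\alpha_0|)\delta + \delta \max_{1\le j\le n} \rho_{\alpha_j}(x_0) < \min_{1\le j\le n} \varepsilon_j$. Continuity of (α, x) ↦
αx follows.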
Proof. Theorem 12.3.5 shows that the collection of sets {x : ρn (x) < r}, n ∈ N and r > 0,
defines a convex balanced local base at 0 for a linear topology τ on X. It is easy to verify
that
\[ d(x, y) := \max_n\, 2^{-n}\,\frac{\rho_n(x - y)}{1 + \rho_n(x - y)} \]
is an invariant metric on X. We now show that d is compatible with τ . For any r > 0 let
N be the first integer for which $2^{-n} \le r$ for all n > N . Then,
\[ (12.4)\qquad \bigcap_{k=1}^{N} \Big\{x \in X : \rho_k(x) < \frac{r}{2^{-k} - r}\Big\} = B_d(0; r) = \{x \in X : d(x, 0) < r\}. \]
Hence Bd (0; r) ∈ τ and the identity map I from (X, τ ) into (X, d) is continuous. In
passing, (12.4) also shows that the balls Bd (0; r) are balanced and convex.
Conversely, consider the basic set $V = \bigcap_{j=1}^{m} \{x \in X : \rho_j(x) < r_j\}$. Fix a positive number r
less than $\min_{1\le j\le m} \frac{2^{-j} r_j}{1 + r_j}$. If d(x, 0) < r then
\[ 2^{-j}\,\frac{\rho_j(x)}{1 + \rho_j(x)} < r < 2^{-j}\,\frac{r_j}{1 + r_j} \]
for all 1 ≤ j ≤ m. Hence ρj (x) < rj for all 1 ≤ j ≤ m, that is, Bd (0; r) ⊂ V . This shows
that the identity map $I^{-1}$ from (X, d) to (X, τ ) is continuous.
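The metric d and the identity (12.4) can be tested on a toy example. The sketch below makes assumptions that are not in the text: X is replaced by R^6 with the nondecreasing seminorms ρ_n(x) = max_{k≤n} |x_k|, n = 1, . . . , 6, so the maximum defining d runs over finitely many terms only.

```python
import random


def rho(n, x):
    """Illustrative seminorm rho_n(x) = max(|x_1|, ..., |x_n|)."""
    return max(abs(c) for c in x[:n])


def d(x, y):
    """d(x, y) = max_n 2^{-n} rho_n(x - y) / (1 + rho_n(x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    return max(2.0 ** -n * rho(n, diff) / (1.0 + rho(n, diff))
               for n in range(1, len(x) + 1))


def in_seminorm_box(x, r):
    """Left-hand side of (12.4): rho_k(x) < r / (2^{-k} - r) whenever 2^{-k} > r."""
    return all(rho(k, x) < r / (2.0 ** -k - r)
               for k in range(1, len(x) + 1) if 2.0 ** -k > r)


if __name__ == "__main__":
    random.seed(1)
    zero = [0.0] * 6
    for _ in range(5):
        x = [random.uniform(-1, 1) for _ in range(6)]
        r = random.uniform(0.05, 0.5)
        print(round(d(x, zero), 4), round(r, 4), d(x, zero) < r, in_seminorm_box(x, r))
```

In each printed row the last two columns agree, which is the content of (12.4) for this toy family of seminorms.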
Example 12.3.7. (The space $C^\infty(\Omega)$.) Suppose Ω ⊂ R^n is a nonempty open set. Let
{Kn : n ∈ N} be a cover of Ω by compact sets so that $K_n \subset K_{n+1}^{o}$. Let $C^\infty(\Omega)$ be the
collection of all infinitely differentiable real valued functions in Ω. For each n ∈ N define
the seminorms
\[ p_n(\phi) = \sup\big\{ |\phi^{(k)}(x)| : x \in K_n,\ |k| \le n \big\} \]
where $\phi^{(k)} = \frac{\partial^{|k|}\phi}{\partial x_1^{k_1}\cdots\partial x_n^{k_n}}$ and $|k| = \sum_{j=1}^{n} k_j$. Clearly {pn : n ∈ N} separates points of $C^\infty(\Omega)$.
By Theorem 12.3.6, the topology on $C^\infty(\Omega)$ induced by {pn : n ∈ N} is metrizable by a
translation invariant metric d. We claim that $(C^\infty(\Omega), d)$ is complete. Suppose {φn : n ∈ N}
is a Cauchy sequence. For any N ∈ N let $V_N = \{\phi : p_N(\phi) < \frac{1}{N}\}$. Then, for each N , there
is n0 such that $|\partial^k\phi_n - \partial^k\phi_m| < \frac{1}{N}$ on KN whenever n, m ≥ n0 and |k| ≤ N . It follows
that $\partial^k\phi_n$ converges uniformly on compact subsets of Ω to a function gk . In particular,
φn → g0 as n → ∞. It is an easy exercise to show that $g_0 \in C^\infty(\Omega)$ and that $D^k g_0 = g_k$.
The following result gives a full analytic description of the convex hull of a set.
Theorem 12.3.10. For any linear topological space X:
(i) The intersection of any collection of convex sets is convex
(ii) For any A ⊂ X, co(A) = ∩{C : A ⊂ C, C is convex} and
\[ (12.5)\qquad \mathrm{co}(A) = \Big\{ \sum_{j=1}^{N} \lambda_j x_j : N \ge 1,\ \lambda_j \ge 0,\ \sum_{j=1}^{N} \lambda_j = 1,\ x_j \in A \Big\}. \]
Proof. (i) Suppose $\mathcal{C}$ is a collection of convex subsets of X and let x, y be points in $\bigcap\mathcal{C}$.
Then, for any λ ∈ [0, 1] we have that λx + (1 − λ)y ∈ C for each $C \in \mathcal{C}$. Therefore
$\lambda x + (1 - \lambda)y \in \bigcap\mathcal{C}$.
(ii) Denote the sets on the left-hand side and right-hand side of (12.5) by C and D respectively.
We claim that for any points x1 , . . . , xn in C and nonnegative numbers λ1 , . . . , λn
with $\sum_{k=1}^{n} \lambda_k = 1$, we have $\sum_{k=1}^{n} \lambda_k x_k \in C$. Since C is convex, the claim holds for n ≤ 2 by
definition. Assume the statement is valid for n − 1 ≥ 2. Let λ1 , . . . , λn be nonnegative
numbers with 0 < λn < 1 and $\sum_{j=1}^{n} \lambda_j = 1$. Then, for any set of points x1 , . . . , xn in C,
\[ \sum_{j=1}^{n} \lambda_j x_j = \lambda_n x_n + (1 - \lambda_n) \sum_{j=1}^{n-1} \frac{\lambda_j}{1 - \lambda_n}\, x_j \in C. \]
Since A ⊂ C, this shows that D ⊂ C.
To complete the proof it is enough to show that D is convex. For any pair of points x
and y in D, there exist N ≥ 1, points xj ∈ A and two sets of nonnegative numbers
{λ1 , . . . , λN } and {λ′1 , . . . , λ′N } with $\sum_{j=1}^{N} \lambda_j = 1 = \sum_{j=1}^{N} \lambda'_j$ such that $x = \sum_{j=1}^{N} \lambda_j x_j$ and
$y = \sum_{j=1}^{N} \lambda'_j x_j$. Thus, for any α ∈ [0, 1],
\[ \alpha x + (1 - \alpha)y = \sum_{j=1}^{N} \big(\alpha\lambda_j + (1 - \alpha)\lambda'_j\big) x_j \in D \]
since $\alpha\lambda_j + (1 - \alpha)\lambda'_j \ge 0$ and $\sum_{j=1}^{N} \big(\alpha\lambda_j + (1 - \alpha)\lambda'_j\big) = 1$.
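(iii) It is clear that co(Ab ) ⊂ co◦ (A). From part (3) we obtain that
\[ \mathrm{co}(A_b) = \Big\{ \sum_{j=1}^{N} \lambda_j \alpha_j x_j : N \ge 1,\ |\alpha_j| \le 1,\ \lambda_j \ge 0,\ \sum_{j=1}^{N} \lambda_j = 1,\ x_j \in A \Big\} \]
\[ (12.7)\qquad\qquad = \Big\{ \sum_{j=1}^{N} \lambda_j x_j : N \ge 1,\ \sum_{j=1}^{N} |\lambda_j| \le 1,\ x_j \in A \Big\}. \]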
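Let (α, a) and (β, b) be points in S × A. Let J = {1 ≤ k ≤ n : αk + βk ≠ 0}. Then, for any
0 < λ < 1
\[ \lambda f(\alpha, a) + (1 - \lambda) f(\beta, b) = \sum_{k=1}^{n} \big(\lambda\alpha_k a_k + (1 - \lambda)\beta_k b_k\big) \]
\[ = \sum_{k\in J} \big(\lambda\alpha_k + (1 - \lambda)\beta_k\big)\, \frac{\lambda\alpha_k a_k + (1 - \lambda)\beta_k b_k}{\lambda\alpha_k + (1 - \lambda)\beta_k} = f(\lambda\alpha + (1 - \lambda)\beta, c), \]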
The last statement follows from the fact that the closure of a totally bounded set is also
totally bounded and that, in complete metric spaces, a set is compact iff it is closed and
totally bounded.
Proof. Suppose $x = \sum_{j=1}^{k} \lambda_j x_j$ where λj > 0, $\sum_{j=1}^{k} \lambda_j = 1$, and xj ∈ C. Suppose k > n+1.
We will show that x is in the convex hull of a proper subset of {x1 , . . . , xk }. As k − 1 > n,
the vectors x2 − x1 , . . . , xk − x1 are linearly dependent; thus, $\sum_{j=2}^{k} c_j (x_j - x_1) = 0$ for some
scalars cj , one of which is strictly positive. Let $c_1 = -\sum_{j=2}^{k} c_j$ and c := min{λj /cj : cj > 0}.
Then, $\sum_{j=1}^{k} c_j = 0$, $\sum_{j=1}^{k} c_j x_j = 0$, c > 0, λj − ccj ≥ 0 for all j and λm − ccm = 0 for some m. As
$x = \sum_{j=1}^{k} (\lambda_j - c c_j) x_j$ and $\sum_{j=1}^{k} (\lambda_j - c c_j) = 1$, x is in the convex hull of a proper subset of {x1 , . . . , xk }.
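The reduction step in this proof is constructive and can be sketched as an algorithm. The code below is one possible implementation under assumptions made here: points have rational coordinates, exact arithmetic is done with `fractions.Fraction`, and the helper names `null_vector` and `caratheodory` are hypothetical.

```python
from fractions import Fraction


def null_vector(vectors):
    """Return a nonzero list c with sum_j c[j] * vectors[j] = 0, or None if none exists."""
    p, n = len(vectors), len(vectors[0])
    # n x p matrix whose j-th column is vectors[j]; bring it to reduced row echelon form.
    M = [[Fraction(vectors[j][i]) for j in range(p)] for i in range(n)]
    pivots, row = [], 0
    for col in range(p):
        if row == n:
            break
        pr = next((i for i in range(row, n) if M[i][col] != 0), None)
        if pr is None:
            continue
        M[row], M[pr] = M[pr], M[row]
        piv = M[row][col]
        M[row] = [v / piv for v in M[row]]
        for i in range(n):
            if i != row and M[i][col] != 0:
                factor = M[i][col]
                M[i] = [a - factor * b for a, b in zip(M[i], M[row])]
        pivots.append(col)
        row += 1
    free = next((j for j in range(p) if j not in pivots), None)
    if free is None:
        return None
    c = [Fraction(0)] * p
    c[free] = Fraction(1)
    for r, col in enumerate(pivots):
        c[col] = -M[r][free]
    return c


def caratheodory(points, weights):
    """Rewrite sum_j weights[j] * points[j] using at most n + 1 of the points."""
    pairs = [(tuple(Fraction(c) for c in p), Fraction(w))
             for p, w in zip(points, weights) if w > 0]
    pts, lam = [p for p, _ in pairs], [w for _, w in pairs]
    n = len(pts[0])
    while len(pts) > n + 1:
        diffs = [[a - b for a, b in zip(p, pts[0])] for p in pts[1:]]
        tail = null_vector(diffs)              # exists because len(diffs) > n
        if all(v <= 0 for v in tail):
            tail = [-v for v in tail]          # ensure some coefficient is strictly positive
        c = [-sum(tail)] + tail                # now sum(c) = 0 and sum_j c[j] * pts[j] = 0
        step = min(l / cj for l, cj in zip(lam, c) if cj > 0)
        lam = [l - step * cj for l, cj in zip(lam, c)]
        keep = [j for j, l in enumerate(lam) if l > 0]
        pts, lam = [pts[j] for j in keep], [lam[j] for j in keep]
    return pts, lam


if __name__ == "__main__":
    square_plus = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 1)]
    pts, lam = caratheodory(square_plus, [Fraction(1, 5)] * 5)
    print(pts, lam)
```

Each pass through the loop removes at least one point while keeping the weights nonnegative with sum 1 and the represented point unchanged, mirroring the argument in the proof.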
Proof. Let S be the simplex in $\mathbb{R}^{n+1}$ consisting of points (λ1 , . . . , λn+1 ) such that λj ≥ 0
and $\sum_{j=1}^{n+1} \lambda_j = 1$. Consider the function $f : S \times K^{n+1} \to \mathbb{R}^n$ given by $f(\lambda, x) = \sum_{j=1}^{n+1} \lambda_j x_j$.
By Lemma 12.3.12, $\mathrm{co}(K) = f(S \times K^{n+1})$. The compactness of $S \times K^{n+1}$ and continuity of
f imply that co(K) is compact.
Proof. To check that B forms a basis for a topology we first prove that for each V ∈ B
and x0 ∈ V , V − x0 is absorbent. Fix V ∈ B and let x0 ∈ V and x ∈ X. Then x0 ∈ Xi
and x ∈ Xj for some i, j ∈ I. Since I is directed, there is k ∈ I such that i ≤ k and j ≤ k,
so that x0 , x ∈ Xk . Then (V ∩ Xk ) − x0 is a neighborhood in τk around 0 and so there is
ε > 0 such that x0 + λx ∈ V ∩ Xk ⊂ Vk for all |λ| < ε.
12.4. Inductive limit topology 341
(i) To show that the addition operation is continuous it is enough to prove that it is
continuous at (0, 0). This follows from the observation that $\frac12 V + \frac12 V \subset V$ for any V ∈ B.
(ii) To prove continuity of the scalar product, fix x0 ∈ X and λ0 ∈ F. Notice that
λx − λ0 x0 = λ(x − x0 ) + (λ − λ0 )x0
Given V ∈ B, there is ε > 0 such that $(\lambda - \lambda_0)x_0 \in \frac12 V$ whenever |λ − λ0 | < ε. Setting
$\delta := \frac{1}{2(\varepsilon + |\lambda_0|)}$, we have that if x ∈ x0 + δV and |λ − λ0 | < ε, then $\lambda(x - x_0) \in \lambda\delta V \subset \frac12 V$.
The continuity of the scalar product follows from this.
where m ∈ Z+ and r > 0, are contained in τ , we have that for each K ∈ K(Ω) the topology
induced by τ on DK coincides with the original topology τK .
Theorem 12.4.5. (D(Ω), τ ), as in Example 12.4.4, is a complete locally convex space. If
E ⊂ D(Ω) is bounded then
\[ (12.9)\qquad \sup_{\phi\in E} p_m(\phi) < \infty \]
for all m ∈ Z+ , and $\overline{E}$ is compact in τ (that is, (D(Ω), τ ) has the Heine–Borel property).
Proof. First we prove that E ⊂ D(Ω) is bounded iff E ⊂ DK for some K ∈ K(Ω). Consider
a sequence {Kn : n ∈ N} ⊂ K(Ω) such that Kn ⊂ int Kn+1 with $\Omega = \bigcup_n K_n$. Suppose E
is contained in no DK with K ∈ K(Ω). Then, there are functions φn ∈ E and points
xn ∈ Ω \ Kn such that |φn (xn )| > 0. Define the set
\[ W := \bigcap_{n\ge1} \Big\{\phi \in D : |\phi(x_n)| < \frac{1}{n}\,|\phi_n(x_n)|\Big\}. \]
If K ∈ K(Ω), then K ⊂ Km for some m, and φ(xn ) = 0 for all φ ∈ DK and n ≥ m. Hence W ∩ DK ∈ τK ,
since each set $\{\phi \in D_K : |\phi(x_n)| < \frac{1}{n}|\phi_n(x_n)|\}$, 1 ≤ n < m, is open in τK . By
definition of the inductive topology, W ∈ τ . As φn ∉ nW , no set of the form rW contains
E; hence, E is not bounded. Therefore, if E is bounded then there is K ∈ K(Ω) so that
E ⊂ DK . Since τK coincides with the topology on DK induced by τ , E is bounded in
(DK , τK ) and (12.9) holds for all m ∈ Z+ .
We now show that $\overline{E}$ is compact. Since E is bounded, E ⊂ DK for some K ∈ K(Ω), and E is
bounded in DK . Since $\sup_{\phi\in E} p_m(\phi) < \infty$ for each m ∈ N, $\{\partial^\alpha\phi : \phi \in E\}$ is equicontinuous
(in the sup norm in C(K)) for each $\alpha \in \mathbb{Z}^n_+$. The Arzelà–Ascoli theorem and Cantor’s
diagonal process imply that any sequence in E contains a subsequence {φm : m ∈ N} ⊂ E
for which $\partial^\alpha\phi_m$ converges uniformly for each α. From this, it follows that $\overline{E}$ is compact in DK and
hence, in D.
The description of bounded sets and Cauchy sequences in D(Ω) was simple since each
DK , K ∈ K(Ω), is a Fréchet space generated by a countable collection of seminorms. The
space D(Ω) is the archetype of a countable inductive system {X, (Xn , τn ) : n ∈ N} in which
Xn is a closed subset of (Xn+1 , τn+1 ), and τn coincides with the relative topology on Xn
induced by τn+1 . These systems are called strict inductive systems. We will conclude
this section by presenting a result that describes bounded sets in strict inductive systems.
Lemma 12.4.6. Suppose X is a locally convex topological vector space and that M is a
linear subspace of X equipped with the induced topology from X. Let V ⊂ M be an open
convex balanced neighborhood of 0 in M .
(i) There exists an open balanced neighborhood W of 0 in X such that V = W ∩ M .
(ii) If M is closed in X then, for any x0 ∈ X \ M there exists an open convex balanced
neighborhood W of 0 in X such that W ∩ M = V and x0 ∈ X \ W .
(ii) If M is a closed linear subspace of X then X/M with the quotient topology is a locally
convex linear vector space and so it is also Hausdorff. Let π : X → X/M be the quotient
map. If x0 ∉ M then there is an open neighborhood Ṽ of 0 in X/M which does not contain
x0 + M . Thus, $\pi^{-1}(\tilde V)$ is a convex balanced open neighborhood of 0 in X which does not
intersect x0 + M . Consequently, there is an open balanced neighborhood U of 0 in X such
that U ∩ M ⊂ V , and
(12.10) U ∩ (x0 + M ) = ∅
As in the proof of part (i), the set W = co◦ (U ∪ V ) is a convex balanced neighborhood of
0 in X such that W ∩ M = V . We will show that x0 ∉ W . If x0 ∈ W , then x0 = αx + βy
for some x ∈ U , y ∈ V , and α, β ∈ F with |α| + |β| ≤ 1. If α = 0 then x0 = βy ∈ V ⊂ M ,
which contradicts the assumption x0 ∉ M . If α ≠ 0 then, since U is balanced,
αx = x0 − βy ∈ (x0 + M ) ∩ U
which contradicts (12.10). Therefore x0 ∉ W .
Theorem 12.4.7. (Dieudonné–Schwartz) Let X be a vector space and {(Xn , τn ) : n ∈ N}
an inductive sequence of locally convex vector spaces Xn ⊂ X such that for each n ∈ N, τn
is the topology on Xn induced by τn+1 , and Xn is a closed subset in (Xn+1 , τn+1 ). Then,
(i) (X, τ ) is a locally convex topological vector space, and for each n ∈ N the topology
induced on Xn by τ is the same as τn , and Xn is closed in (X, τ ).
(ii) B ⊂ X is bounded in (X, τ ) iff B ⊂ Xn for some n and B is bounded in (Xn , τn ).
Proof. (i) Fix n ∈ N and let Vn be an open convex balanced neighborhood of 0 in (Xn , τn ).
By Lemma 12.4.6(i), there exists an open convex balanced neighborhood Vn+1 of 0 in
(Xn+1 , τn+1 ) such that Vn+1 ∩ Xn = Vn . Continuing by induction we obtain an increasing
sequence {Vn+j : j ∈ Z+ } of convex balanced sets such that Vn+j ∈ τn+j and Vn+j+1 ∩ Xn+j =
Vn+j . It is easy to check that $V = \bigcup_{k\ge n} V_k$ is a convex balanced set and
\[ V \cap X_j = \begin{cases} V_n \cap X_j & \text{if } j < n \\ V_j & \text{if } n \le j \end{cases} \]
Hence V ∈ τ and Vn ∈ {Xn ∩ U : U ∈ τ }. This implies that τn and the induced topology
on Xn by τ are equal. We now show that X \ Xn ∈ τ . Let x ∈ X \ Xn . Then x ∈ Xn+p
for some integer p ≥ 1. Since Xn is closed in Xn+p , Theorem 12.1.8 shows that there is an
open convex balanced neighborhood Vn+p of 0 in τn+p such that
(x + Vn+p ) ∩ Xn = ∅.
The first part of the proof shows that there is a convex balanced neighborhood V ∈ τ of 0
such that V ∩ Xn+p = Vn+p . Consequently
(x + V ) ∩ Xn = (x + V ) ∩ Xn+p ∩ Xn = (x + Vn+p ) ∩ Xn = ∅.
This shows that X \ Xn is open in τ .
It remains to show that {0} is closed in (X, τ ). Let x ∈ X \ {0}. Then x ∈ Xn for some
n ∈ N. Then there is a convex balanced neighborhood Vn of 0 in τn such that x ∉ Vn . By
the first part of the proof there is an open neighborhood V of 0 in τ such that V ∩ Xn = Vn .
Then, x ∉ V and so {0} is closed in (X, τ ).
(ii) Suppose B is a bounded set in (Xn , τn ). Let V be an open convex balanced neighborhood
of 0 in τ . Then Vn := V ∩ Xn is a convex balanced neighborhood of 0 in τn . There is t0
such that t ≥ t0 implies that B ⊂ tVn ⊂ tV . This shows that B is also bounded in (X, τ ).
Conversely, suppose B ⊂ X is not contained in any Xn . There is a sequence {xn : n ∈ N}
such that xn ∈ B \ Xn . We can extract a subsequence such that $x_{n_k} \in X_{n_{k+1}} \setminus X_{n_k}$. Clearly
$x_{n_1} \neq 0$. Thus there is an open convex balanced neighborhood V2 of 0 in $\tau_{n_2}$ such that
$x_{n_1} \notin V_2$. Since $\frac12 x_{n_2} \in X_{n_3} \setminus X_{n_2}$, by Lemma 12.4.6(ii), there is a convex balanced
neighborhood V3 of 0 in $\tau_{n_3}$ such that $V_3 \cap X_{n_2} = V_2$ and $\frac12 x_{n_2} \notin V_3$. Proceeding by
induction, we obtain an increasing sequence {Vk : k ∈ N} of convex balanced sets such that
$V_k \in \tau_{n_k}$, $V_{k+1} \cap X_{n_k} = V_k$, and $\frac1k x_{n_k} \notin V_{k+1}$. The set $V = \bigcup_k V_k$ is a convex balanced
subset of X and V ∩ Xn ∈ τn for each n ∈ N, that is, V is a neighborhood of 0 in τ . By
construction the sequence $\{\frac1k x_{n_k}\} \subset X \setminus V$ and, since X \ V is closed, it does not converge
to 0. Hence, from Theorem 12.1.17, it follows that B is not bounded. Therefore, if B is
bounded in (X, τ ), then B ⊂ Xn for some n.
Suppose B is bounded and B ⊂ Xn . Let Vn be a convex balanced neighborhood of 0
in τn . By Lemma 12.4.6 there is a convex balanced neighborhood V of 0 in τ such that
V ∩ Xn = Vn . As B is bounded, there is t0 > 0 such that B ⊂ tV for all t ≥ t0 . Thus
B = B ∩ Xn ⊂ (tV ) ∩ Xn = t(V ∩ Xn ) = tVn for all t ≥ t0 .
Remark 12.4.8. Under the assumptions of Dieudonné–Schwartz theorem, a sequence {xn :
n ∈ N} is convergent in the inductive limit topology (X, τ ) iff there exists Xk such that
{xn : n ∈ N} ⊂ Xk , and the sequence converges in (Xk , τk ).
12.5. Continuous linear transformations 345
Example 12.4.9. Suppose Ω is a locally compact second countable Hausdorff space. The
space C00 (Ω) with the inductive limit topology τ described in Example 12.4.3 is a strict
inductive system. Each space CK is closed in (C00 (Ω), τ ). A sequence {φn } is convergent
in τ iff there is a compact set K ⊂ Ω such that {φn : n ∈ N} ⊂ CK , and φn converges
uniformly.
Example 12.4.10. Let Ω ⊂ R^n be an open set. The space D(Ω) described in Example 12.4.4
is a strict inductive system. Each DK is a closed subset of D, and a sequence {φn } ⊂ D
is convergent iff there is a compact set K such that {φn : n ∈ N} ⊂ DK , and a function
φ ∈ DK such that limn pm (φn − φ) = 0 for each m ∈ Z+ .
Example 12.4.11. Suppose ψ, φ ∈ D(R^n ). Let K1 = supp(ψ) and K2 = supp(φ) and
K = K1 + K2 . The map F : x ↦ ψ(x)τx φ is a continuous map from R^n to DK . Indeed,
F (x) = 0 for all $x \in K_1^c$ and supp(F (x)) ⊂ K1 + K2 for all x ∈ K1 . Since each ϕ ∈ D is
uniformly continuous, for any sequence xm → x in R^n and any $\alpha \in \mathbb{Z}^n_+$, we have that
$\tau_{x_m}\partial^\alpha\phi \to \tau_x\partial^\alpha\phi$ uniformly as m → ∞. Hence F (xm ) → F (x) as m → ∞.
(iv)⇒(i): Suppose (iv) holds but Λ fails to be continuous. Let {Vn : n ∈ N} be a local
neighborhood base at 0. There exists an open neighborhood W of 0 in Y such that, for any n
there is $x_n \in V_n \setminus \Lambda^{-1}(W)$. Then xn → 0 but Λxn ↛ 0, contradicting assumption (iv).
Corollary 12.5.2. Suppose (X, τ ) is a locally convex space and that τ is generated by a
countable nondecreasing family of seminorms {ρm : m ∈ N}. Then, Λ ∈ X ∗ iff there exists
a constant C > 0 and N ∈ N such that
(12.11) |Λx| ≤ CρN (x), x∈X
Proof. For any x ∈ X, m ∈ N and r > 0 define Vm (x; r) = {y ∈ X : ρm (x − y) < r}. Since
ρm ≤ ρm+1 for all m ∈ N, the collection of all sets Vm (x; r) forms a base for the topology τ .
Suppose (12.11) holds. Given ε > 0, let V := VN (0; ε/C). Then, Λ(V ) ⊂ B(0; ε). This
implies that Λ is continuous.
Conversely, if Λ is continuous, there is N and δ > 0 such that |Λ(x)| < 1 whenever x ∈
VN (0; δ). Consequently, $|\Lambda(x)| \le \frac{2}{\delta}\,\rho_N(x)$ for any x ∈ X.
Example 12.5.3. (Continuous linear maps on D(Ω)) Suppose Y is a topological space, Ω
is an open subset of Rn , and Λ : D(Ω) → Y is a linear map. Theorem 12.5.1 implies that if
Λ is continuous, then Λ is bounded. Although (D(Ω), τ ) is not metrizable, we know that for
any compact K ⊂ Ω, the relative topology on DK induced by τ coincides with the topology
τK generated by the countable seminorms pm (φ) given. This will allow us to establish the
equivalence between continuity and boundedness of Λ.
Theorem 12.5.4. Suppose Λ : D(Ω) → Y is a linear map where Y is a locally convex
space. The following statements are equivalent.
(i) Λ is continuous.
(ii) Λ is bounded.
(iii) For any sequence φn → 0 in D(Ω), Λ(φn ) → 0 in Y .
(iv) The restriction of Λ to any DK , K ∈ K(Ω), is continuous.
Suppose that (iv) holds. This is the only place where the locally convex assumption on Y is
used. Let U be an open balanced convex set in Y and set V = Λ−1 (U ). Then V is balanced
and convex. By assumption V ∩ DK ∈ τK . By definition of the inductive topology τ on
D(Ω) (see Theorem 12.4.2), it follows that V ∈ τ . Therefore, Λ is continuous.
Example 12.5.5. Let $\alpha \in \mathbb{Z}^n_+$ and $\psi \in C^\infty(\Omega)$. The maps
Ψ : φ ↦ ψφ
$D^\alpha : \phi \mapsto \partial^\alpha\phi$
from D(Ω) to itself are continuous. By Theorem 12.5.4, it is enough to show that if
φm → 0 in D(Ω) as m → ∞, then ψφm and $\partial^\alpha\phi_m$ converge to 0 in D(Ω). There is K ∈ K(Ω)
such that {φm } ⊂ DK , and $\partial^\beta\phi_m \to 0$ uniformly as m → ∞ for all $\beta \in \mathbb{Z}^n_+$.
Proof. As T ∈ L(X, Y ) is bounded, ‖T‖ < ∞. From $\|(T + \alpha U)x\|_Y \le \|Tx\|_Y + |\alpha|\,\|Ux\|_Y$,
it follows that $\|T\| := \sup_{\|x\|_X = 1} \|Tx\|_Y$ is a norm.
Suppose Y is a Banach space, and assume (Tn : n ∈ N) is a Cauchy sequence in L(X, Y ).
Since
\[ \|T_n x - T_m x\|_Y \le \|T_n - T_m\|\,\|x\|_X \]
for all x ∈ X, it follows that {Tn : n ∈ N} converges pointwise to some function T : X → Y .
Since
\[ \|T(\alpha u + v) - \alpha T(u) - T(v)\|_Y \le \|T(\alpha u + v) - T_n(\alpha u + v)\|_Y + \|\alpha T(u) + T(v) - \alpha T_n(u) - T_n(v)\|_Y, \]
by passing to the limit as n → ∞ we conclude that T is a linear map. Given ε > 0, there
is an integer N such that n > m ≥ N implies that $\|(T_n - T_m)x\| \le \|T_n - T_m\| < \varepsilon$ for all
x with ‖x‖ = 1. Letting n → ∞ we obtain that $\sup_{\|x\|=1} \|(T - T_m)x\|_Y \le \varepsilon$ for all m ≥ N .
This shows that Tn → T in L(X, Y ) as n → ∞.
Definition 12.6.2. A normed ring (A, +, ·, ‖·‖) over the field F is a Banach ring if (A, ‖·‖)
is a nontrivial Banach space, and for any x, y ∈ A
\[ \|xy\| \le \|x\|\,\|y\|. \]
(A, +, ·, ‖·‖) is a Banach algebra if A is a Banach unital ring whose unit e satisfies ‖e‖ = 1.
Remark 12.6.3. If A is a Banach ring, then the product (x, y) ↦ xy is continuous in
A × A. This follows from
\[ \|xy - x'y'\| = \|x(y - y') - (x' - x)y'\| \le \|x\|\,\|y - y'\| + \|y'\|\,\|x - x'\|. \]
12.6. Banach algebra of linear operators on a Banach spaces 349
Remark 12.6.4. If A is a Banach ring with a unit e, then $\|e\| = \|ee\| \le \|e\|^2$ and so,
‖e‖ ≥ 1. For each a ∈ A define $L_a x = ax$. Then $L_a \in L(A)$ since $\|L_a x\| = \|ax\| \le \|a\|\,\|x\|$.
Clearly $a \mapsto L_a$ is an algebra isomorphism from A onto $\tilde{A} := \{L_a : a \in A\}$. Since
$\|L_a\| \le \|a\|$ and $\|a\| = \|L_a e\| \le \|e\|\,\|L_a\|$, A and $\tilde{A}$ are linearly homeomorphic. More
importantly, the norms $a \mapsto |||a||| := \|L_a\|$ and $a \mapsto \|a\|$ are equivalent. Under the norm
||| · |||, A becomes a Banach algebra since $L_e = I$.
Remark 12.6.5. Suppose A is a Banach ring. On $\tilde{A} := A \times \mathbb{C}$ define
(a) (x, α) + c(y, β) = (x + cy, α + cβ) for all x, y ∈ A and α, β, c ∈ C.
(b) (x, α) · (y, β) = (xy + αy + βx, αβ) for all x, y ∈ A and α, β ∈ C.
(c) ‖(x, α)‖ = ‖x‖ + |α| for all x ∈ A and α ∈ C.
Under these operations and norm, $(\tilde{A}, +, \cdot)$ is a Banach algebra with unit (0, 1). Indeed,
‖(x, α) · (y, β)‖ ≤ (‖x‖ + |α|)(‖y‖ + |β|) = ‖(x, α)‖‖(y, β)‖, and ‖(0, 1)‖ = 1. Clearly
x ↦ (x, 0) is an isometric isomorphism from A onto A × {0}. Notice that (x, α) · (y, 0) =
(xy + αy, 0) ∈ A × {0} and (y, 0) · (x, α) = (yx + αy, 0) ∈ A × {0} for all x, y ∈ A and
α ∈ C. By identifying A with A × {0} we have that any non–unital Banach ring is a closed
ideal of codimension one in a Banach algebra.
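The operations (b) and (c) of Remark 12.6.5 are easy to experiment with. In the sketch below the ring A is taken, purely for illustration, to be the 2 × 2 real matrices with the maximum absolute row sum norm (a submultiplicative norm), and scalars are real; none of these choices comes from the text, and the construction itself does not depend on them.

```python
import random


def mat_mul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_comb(x, a, y, b):
    """Entrywise linear combination a*x + b*y."""
    return [[a * x[i][j] + b * y[i][j] for j in range(2)] for i in range(2)]


def mat_norm(x):
    """Maximum absolute row sum, a submultiplicative norm on matrices."""
    return max(sum(abs(v) for v in row) for row in x)


def u_mul(p, q):
    """(x, alpha) * (y, beta) = (xy + alpha*y + beta*x, alpha*beta)."""
    (x, alpha), (y, beta) = p, q
    first = mat_comb(mat_mul(x, y), 1.0, mat_comb(y, alpha, x, beta), 1.0)
    return (first, alpha * beta)


def u_norm(p):
    """||(x, alpha)|| = ||x|| + |alpha|."""
    return mat_norm(p[0]) + abs(p[1])


UNIT = ([[0.0, 0.0], [0.0, 0.0]], 1.0)   # the element (0, 1)

if __name__ == "__main__":
    random.seed(2)
    rand = lambda: ([[random.uniform(-2, 2) for _ in range(2)] for _ in range(2)],
                    random.uniform(-2, 2))
    p, q = rand(), rand()
    print(u_mul(UNIT, p) == p, u_mul(p, UNIT) == p, u_norm(UNIT))
    print(u_norm(u_mul(p, q)) <= u_norm(p) * u_norm(q))
```

The printed checks correspond to the two facts used in the remark: (0, 1) acts as a unit of norm 1, and the norm in (c) is submultiplicative for the product in (b).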
Example 12.6.6. If X is a locally compact Hausdorff space, C0 (X) under the uniform
norm and the pointwise sum, product and scalar product is a Banach ring. C0 (X) × R with
the operations and norm defined in Remark 12.6.5 is a Banach algebra. This algebra is
homeomorphic to C(X ∪ {∆}) where X ∪ {∆} is the one–point compactification of X.
Remark 12.6.7. If A is a Banach algebra, x ∈ A is invertible if there is y ∈ A such that
xy = e = yx. Clearly such y, if it exists, is unique, and will be denoted by $x^{-1}$. The
collection GA of all invertible elements in A contains the unit element e, and is a group
under multiplication. Indeed, if x, y ∈ GA then xy ∈ GA and $(xy)^{-1} = y^{-1}x^{-1}$.
Example 12.6.8. Suppose (X, ‖ ‖) is a complex Banach space. Under operator addition,
scalar multiplication, and composition, L(X) is an algebra whose unit is the identity
map I. With respect to the operator norm
\[ \|T\| = \sup_{\|x\|=1} \|Tx\|, \]
L(X) is a Banach algebra (see Exercise 12.17.16). The group GL(X) := G_{L(X)}, called the
general linear group of X, consists of all bijective maps T ∈ L(X) for which T⁻¹ ∈ L(X).
It will be shown below that GL(X) is an open subset of L(X) and that it is a topological
group, that is, the group multiplication (composition) in GL(X) and the map T ↦ T⁻¹ are
continuous with respect to the operator norm.
Lemma 12.6.9. Suppose A is a Banach algebra. If x ∈ A and ‖x‖ < 1, then (e − x) ∈ G_A
and
\[ (e - x)^{-1} = \sum_{n=0}^{\infty} x^n. \]
350 12. Some Elements of Functional Analysis
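Proof. Let s_n = Σ_{k=0}^n x^k. Notice that (e − x)s_n = s_n(e − x) = e − x^{n+1}; hence, if s_n
converges in A to some element s, then s = (e − x)⁻¹, because ‖x^{n+1}‖ ≤ ‖x‖^{n+1} → 0.
To prove convergence, we will show that {s_n : n ∈ N} is a Cauchy sequence. For n > m we have that
\[ \|s_n - s_m\| \le \sum_{k=m+1}^{n} \|x\|^k \le \frac{\|x\|^{m+1}}{1 - \|x\|}. \]
The right-hand side tends to 0 as m → ∞ and A is complete, so s_n converges.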
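The identity (e − x)s_n = e − x^{n+1} used in the proof of Lemma 12.6.9 is a telescoping sum. A short sketch of the computation, with the convention x⁰ = e (an assumption about notation that the proof uses implicitly):

```latex
\begin{align*}
(e - x)\,s_n
  &= \sum_{k=0}^{n} x^k - \sum_{k=0}^{n} x^{k+1}
   = \sum_{k=0}^{n} x^k - \sum_{k=1}^{n+1} x^k
   = e - x^{n+1}, \\
\|e - (e - x)\,s_n\| &= \|x^{n+1}\| \le \|x\|^{n+1} \xrightarrow[n\to\infty]{} 0 .
\end{align*}
```

The same computation with the factors in the other order gives s_n(e − x) = e − x^{n+1}, since x commutes with its own powers.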
Theorem 12.6.10. If A is a Banach algebra, then the set G_A of all x ∈ A for which x⁻¹
exists is open in (A, ‖ ‖). Moreover, if x ∈ G_A, then for ‖h‖‖x⁻¹‖ < 1
Let f be the map x ↦ x⁻¹ on G_A. For any h ∈ A such that ‖h‖ < 1/‖x⁻¹‖ we have that
\begin{align*}
(x + h)^{-1} &= (e + x^{-1}h)^{-1}x^{-1} = \sum_{n\ge 0} (-1)^n (x^{-1}h)^n x^{-1} \\
 &= x^{-1} - x^{-1}hx^{-1} + x^{-1}(hx^{-1})^2 \sum_{n\ge 0} (-1)^n (hx^{-1})^n .
\end{align*}
Hence
\[ \frac{\|(x + h)^{-1} - x^{-1} + x^{-1}hx^{-1}\|}{\|h\|} \le \frac{\|x^{-1}\|^3\,\|h\|}{1 - \|x^{-1}\|\,\|h\|} \xrightarrow{h\to 0} 0 \]
Lemma 12.6.11. Suppose x ∈ ∂G_A in A. If {x_n : n ∈ N} ⊂ G_A and ‖x − x_n‖ → 0 as n → ∞,
then lim_n ‖x_n⁻¹‖ = ∞.
Proof. Suppose the conclusion is false. Then, for some M > 0 and subsequence {x_{n_k} : k ∈ N}
we have sup_k ‖x_{n_k}⁻¹‖ < M. Let K be large enough so that ‖x_{n_K} − x‖ < 1/M. Then
\[ \|e - x_{n_K}^{-1}x\| = \|x_{n_K}^{-1}(x_{n_K} - x)\| \le \|x_{n_K}^{-1}\|\,\|x_{n_K} - x\| < 1 \]
Hence, x_{n_K}⁻¹x ∈ G_A and so x = x_{n_K}(x_{n_K}⁻¹x) ∈ G_A. Since G_A is open, x ∉ ∂(G_A), which is a
contradiction.
12.7. Finite dimensional spaces 351
Proof. (i) Suppose Λ is a linear map on F^n into X. Set y_j = Λ(e_j) where {e_j : j = 1, . . . , n}
is the standard coordinate system e_j(k) = 1_{{j}}(k). Then
\[ \Lambda(x) = \sum_{j=1}^{n} \pi_j(x)\,y_j = \sum_{j=1}^{n} S_j \circ \pi_j(x) \]
(ii) Let {y_j : j = 1, . . . , n} be a basis of Y and define the linear map Λ on F^n into X by
setting Λ(e_j) = y_j. Then Λ is an isomorphism between F^n and Y which, by part (i), is
continuous.
We will show that Λ⁻¹ : Y → F^n is bounded in a neighborhood of 0 in Y. Let B be the open
unit ball in F^n and S^{n−1} = ∂B the unit sphere in F^n. As S^{n−1} is compact and Λ(x) = 0
iff x = 0, K := Λ(S^{n−1}) is a compact subset of X and 0 ∉ K. By Theorems 12.1.8 and 12.1.13
there is an open balanced neighborhood V of 0 in X such that V ∩ (K + V) = ∅. Then
U = Λ⁻¹(V ∩ Y) = Λ⁻¹(V) is an open neighborhood of 0 in F^n such that U ∩ S^{n−1} = ∅.
Since V is balanced, the linearity of Λ⁻¹ implies that U is balanced, and so U is a connected
subset of F^n. It follows that U is contained in B. This establishes that Λ is a linear
homeomorphism between F^n and Y.
for some finite collection of points x1 , . . . , xm ∈ X. The linear space Y generated by such
points xj is finite dimensional and thus, it is closed in X. Then
\[ V \subset Y + \tfrac12 V \subset Y + \tfrac12 Y + \tfrac14 V = Y + \tfrac14 V \]
The same argument shows that V ⊂ Y + 2^{−n}V for each n ∈ N. By Theorem 12.1.7
\[ V \subset \bigcap_{n\in\mathbb{N}} (Y + 2^{-n}V) = \overline{Y} = Y. \]
Proof. Let π : X → X/M be the quotient map. Clearly π(F ) is a finite dimensional linear
subspace of X/M and closed in (X/M, τM ) by Theorem 12.7.1. Therefore π −1 (π(F )) =
F + M is a closed subset of X.
Remark 12.7.3. The assumption that F is finite dimensional is necessary. See Exercise 12.17.4.
Proof. If K is a singleton, then the statement holds with m = 0. Assume that K is not a
singleton. Fix p ∈ K and let M_p := span(K − p). If m := dim(M_p) then, there are m linearly
independent vectors {x_1, . . . , x_m} ⊂ K − p that generate M_p. Let I_p be the interior of K − p
relative to M_p. Consider the norm ‖ ‖_{M_p} on M_p defined by ‖Σ_{j=1}^m α_j x_j‖_{M_p} := Σ_{j=1}^m |α_j|.
Theorem 12.7.1 shows that this norm induces the same topology on M_p as the topology
induced by (R^m, ‖ ‖_2). Thus I_p is an open convex set in (M_p, ‖ ‖_{M_p}).
We claim that I_p ≠ ∅ which, by Lemma 12.2, would imply that Ī_p ∩ M_p = K − p. First
notice that any point of the form x = Σ_{j=1}^m α_j x_j with 0 ≤ α_j and Σ_{j=1}^m α_j < 1 belongs to
K − p since 0 ∈ K − p. If w := (1/(2m)) Σ_{j=1}^m x_j then, the set
\[ U(w;\varepsilon) := \{x \in M_p : \|x - w\|_{M_p} < \varepsilon\} \subset K - p \]
for all 0 < ε < 1/(2m). Indeed, if x ∈ U(w; ε), then x − w = Σ_{j=1}^m ε_j x_j where |ε_j| < 1/(2m). Then
x = Σ_{j=1}^m (1/(2m) + ε_j) x_j, Σ_{j=1}^m (1/(2m) + ε_j) < 1, and 1/(2m) + ε_j > 0. This completes the proof of
the claim.
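The two inequalities that close the claim can be spelled out as follows. This is only a sketch: it assumes, as in the proof, that x − w = Σ_j ε_j x_j with ‖x − w‖_{M_p} = Σ_j |ε_j| < ε < 1/(2m).

```latex
\[
\sum_{j=1}^{m}\Bigl(\tfrac{1}{2m}+\epsilon_j\Bigr)
 \le \tfrac12 + \sum_{j=1}^{m}|\epsilon_j|
 = \tfrac12 + \|x-w\|_{M_p}
 < \tfrac12 + \tfrac{1}{2m} \le 1,
\qquad
\tfrac{1}{2m}+\epsilon_j \ge \tfrac{1}{2m}-|\epsilon_j| > 0 .
\]
```

Hence x is a combination of x_1, . . . , x_m with strictly positive coefficients of total mass less than 1, which is the form already shown to lie in K − p.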
Proof. By Theorem 12.7.6, K is homeomorphic to the unit ball in some Euclidean space R^m.
The conclusion follows from Brouwer’s fixed point theorem.
Proof. Suppose that f has no fixed point in K. Then, its graph G = {(x, f(x)) : x ∈ K} ⊂
X × X is a compact set that does not intersect the diagonal ∆ in X × X. Thus, there is a
convex balanced neighborhood V of 0 such that
\[ \bigl(G + (V \times V)\bigr) \cap \Delta = \emptyset \]
This implies that
\[ f(x) \notin x + V, \qquad x \in K; \tag{12.13} \]
otherwise, we would have (x + v, f(x)) ∈ (G + (V × V)) ∩ ∆ for some x ∈ K and v ∈ V. Let µ
be the Minkowski functional of V. By Theorem 12.3.5 µ is a continuous seminorm and
V = {x ∈ X : µ(x) < 1}. Define the function
\[ \alpha(x) = (1 - \mu(x))_+ \]
Clearly, α⁻¹({0}) = X \ V. Let x_1, . . . , x_n ∈ K be such that K ⊂ ∪_{j=1}^n (x_j + V) and define
the functions β_j in K as
\[ \beta_j(x) := \frac{\alpha(x - x_j)}{\sum_{k=1}^{n} \alpha(x - x_k)}, \qquad j = 1, \ldots, n \]
These functions are well defined since the denominator is positive in K.
The set H := co({x_1, . . . , x_n}) is a finite dimensional compact set. Define the function g on
K by
\[ g(x) := \sum_{j=1}^{n} \beta_j(x)\,x_j . \]
Clearly g is continuous and g(K) ⊂ H. The same holds for the function g ◦ f. By Brouwer’s
fixed point theorem, there exists p ∈ H such that g(f(p)) = p. Since β_j(x) = 0 outside x_j + V, we
12.9. Uniform boundedness 355
have that
\[ x - g(x) = \sum_{j=1}^{n} \beta_j(x)\,(x - x_j) \in \operatorname{co}(V) = V \]
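The displayed membership relies on the fact that the β_j form a partition of unity on K. A sketch of the intermediate steps, assuming only the definitions of α, β_j and g given in the proof:

```latex
\begin{align*}
\sum_{j=1}^{n}\beta_j(x) &= 1, \qquad \beta_j(x) \ge 0, \qquad x \in K, \\
x - g(x) &= \sum_{j=1}^{n}\beta_j(x)\,x - \sum_{j=1}^{n}\beta_j(x)\,x_j
          = \sum_{j=1}^{n}\beta_j(x)\,(x - x_j), \\
\beta_j(x) > 0 &\iff \alpha(x - x_j) > 0 \iff \mu(x - x_j) < 1 \iff x - x_j \in V .
\end{align*}
```

So x − g(x) is a convex combination of points of V, and it belongs to V because V is convex.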
Γ is uniformly bounded if for any bounded set E ⊂ X there exists a bounded set F ⊂ Y
such that Λ(E) ⊂ F for all Λ ∈ Γ.
If x ∈ B, then Γ(x) ⊂ nU for some n ∈ N, and so Λ(n⁻¹x) ∈ U ⊂ Ū for all Λ ∈ Γ. Hence
\[ B \subset \bigcup_{n\in\mathbb{N}} nD . \]
As B is of second category and D is closed, D must have an interior point x.
Therefore, there exists a neighborhood V of 0 in X such that V ⊂ x − D, and
\[ \Lambda(V) \subset \Lambda(x) - \Lambda(D) \subset \overline{U} - \overline{U} \subset W \tag{12.14} \]
for each Λ ∈ Γ. This shows that Γ is equicontinuous.
This shows that Γ is uniformly bounded. In particular, as {x} is bounded in X for any
x ∈ X, we have that Γ(x) is bounded in Y .
Corollary 12.9.3. Suppose X is an F–space and Γ is a collection of continuous maps from
X into a topological vector space Y . If Γ(x) = {Λ(x) : Λ ∈ Γ} is bounded in Y for each
x ∈ X, then Γ is equicontinuous.
Proof. By Baire’s category theorem, X is of second category. The conclusion follows from
the Banach–Steinhaus theorem.
Proof. Let C = {Λ(x) : x ∈ K, Λ ∈ Γ}. As in the proof of the Banach–Steinhaus theorem, for any
open neighborhood W of 0 in Y, let U be a balanced open set in Y such that Ū + Ū ⊂ W and
define D = ∩_{Λ∈Γ} Λ⁻¹(Ū). For x ∈ K, there is an integer n such that Γ(x) ⊂ nU. Hence
\[ K = \bigcup_{n\in\mathbb{N}} (K \cap nD). \]
As K is a compact Hausdorff space, the Baire category theorem implies that K is of second
category in the relative topology. Since D is closed, for some integer n, x_0 ∈ K and a
neighborhood V of 0 in X,
\[ K \cap (x_0 + V) \subset nD. \tag{12.15} \]
Since compact sets are bounded, there exists an integer m such that
\[ K - x_0 \subset mV \tag{12.16} \]
The convexity assumption in Theorem 12.9.4 cannot be removed, as the following
example shows.
Example 12.9.5. Consider the sequence x_n ∈ ℓ²(C) defined as x_n(m) = (1/n) 1_{{n}}(m) for
n ≥ 1 and x_0 ≡ 0, and let K := {x_n : n ≥ 0}. Let Λ_n be the linear map on ℓ²(C) given by
Λ_n x = Σ_{k=1}^n k² x(k). As
x_n → 0 in ℓ²(C), we have that K is compact in ℓ²(C). If Γ(x) denotes the orbit of x, then
Γ(x_0) = {0}, Γ(x_1) = {1} and Γ(x_m) = {0, m} for all m ≥ 2. Hence Γ(x) is bounded for
each x ∈ K; however, {Λ_n(x) : x ∈ K, n ∈ N} = Z_+ is not bounded.
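The orbits in Example 12.9.5 can be computed explicitly. The following sketch evaluates Λ_n on x_m for m ≥ 1, using only the definitions in the example:

```latex
\[
\Lambda_n x_m = \sum_{k=1}^{n} k^2\, x_m(k)
  = \sum_{k=1}^{n} k^2\,\frac{1}{m}\,\mathbf{1}_{\{m\}}(k)
  = \begin{cases} m, & n \ge m,\\ 0, & n < m, \end{cases}
  \qquad m \ge 1 .
\]
```

For m = 1 the case n < m never occurs, which is why the orbit of x_1 is {1}; for m ≥ 2 both cases occur, giving {0, m}. Each orbit is finite, but the value m grows without bound as m ranges over the points of the compact set.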
Definition 12.9.6. (Bilinear mappings) Suppose X, Y, and Z are vector spaces. A function
B : X × Y → Z is a bilinear map if for each (x, y) ∈ X × Y, the maps B(x, ·) : Y → Z and
B(·, y) : X → Z defined as u ↦ B(x, u) and v ↦ B(v, y) respectively, are linear.
Theorem 12.9.7. Suppose X is an F–space, Y and Z are topological vector spaces. If
B : X × Y → Z is a bilinear map such that for each (x, y) ∈ X × Y, the maps B(x, ·) and
B(·, y) are continuous, then B(x_n, y_n) → B(x_0, y_0) in Z as n → ∞ whenever x_n → x_0 in X
and y_n → y_0 in Y.
Proof. For each n ∈ N the map bn (x) = B(x, yn ) is continuous in X. Since y 7→ B(x, y)
is continuous in Y , bn (x) → B(x, y0 ) in Z. Since Cauchy sequences are bounded, {bn (x) :
n ∈ N} is bounded in Z for each x ∈ X. Corollary 12.9.3 of the Banach–Steinhaus theorem
implies that the maps {bn : n ∈ N} are equicontinuous. Let U and W be open neighborhoods
of 0 ∈ Z such that U + U ⊂ W . There is a neighborhood V of 0 ∈ X such that
bn (V ) ⊂ U, n∈N
There is N ∈ N such that for all n ≥ N
xn − x0 ∈ V
B(x0 , yn ) − B(x0 , y0 ) ∈ U
Hence, for all such n
B(xn , yn ) − B(x0 , y0 ) = bn (xn − x0 ) + B(x0 , yn − y0 ) ∈ bn (V ) + U ⊂ U + U ⊂ W
This means that limn B(xn , yn ) = B(x0 , y0 ) in Z.
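The decomposition used in the last display of the proof follows from bilinearity alone. A sketch, with b_n(x) = B(x, y_n) as in the proof:

```latex
\begin{align*}
B(x_n, y_n) - B(x_0, y_0)
  &= \bigl(B(x_n, y_n) - B(x_0, y_n)\bigr) + \bigl(B(x_0, y_n) - B(x_0, y_0)\bigr) \\
  &= B(x_n - x_0,\, y_n) + B(x_0,\, y_n - y_0) \\
  &= b_n(x_n - x_0) + B(x_0,\, y_n - y_0).
\end{align*}
```

The first term is controlled uniformly in n by equicontinuity of {b_n}, and the second by continuity of B(x_0, ·).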
Remark 12.9.8. When Y in Theorem 12.9.7 is metrizable, the product X × Y is a metrizable
topological linear space (with sum and scalar product defined as (x, y) + a(x′, y′) =
(x + ax′, y + ay′)). It follows that the bilinear map is a continuous map.
Proof. The equivalence of (i) and (ii) follows from Theorem 12.5.7 when Y = F.
(i) implies (iii) since {0} is closed in F. (iii) implies (iv) since by assumption x∗ is not
identically zero.
(iv) implies (ii): Let x ∈ X \ ker(x∗). Then, for some balanced open neighborhood B of 0
\[ (x + B) \cap \ker(x^*) = \emptyset. \tag{12.17} \]
Since x∗(B) is a balanced subset of F, either x∗(B) is bounded and (ii) follows, or x∗(B) = F.
In the latter case, there is y ∈ B such that x∗(y) = −x∗(x). This means that (x + B) ∩
ker(x∗) ≠ ∅, contradicting (12.17).
Theorem 12.10.3. Suppose x∗ ∈ X ∗ \ {0}. Then ker(x∗ ) is a closed subspace of X of
codimension 1 and there is x0 ∈ X such that X = ker(x∗ ) ⊕ span({x0 }).
To see that, set p(x) := lim sup_d x(d) and notice that M, f and p satisfy the conditions of the
Hahn–Banach theorem. The last statement follows from the observation that −p(−x) =
lim inf_d x(d). Banach limits can be used to prove the existence of charges that are not
measures (Exercise 12.17.6).
Corollary 12.10.6. Let X be a complex linear space and L ⊂ X be a linear subspace.
Suppose that ρ : X → [0, ∞) satisfies ρ(x + y) ≤ ρ(x) + ρ(y) and ρ(a x) = |a|ρ(x) for all
x, y ∈ X and a ∈ C. If f : L → C is a linear functional such that |f (x)| ≤ ρ(x), then there
is a linear functional F on X such that F = f on L and |F | ≤ ρ on X.
Proof. Let u and v be the real and imaginary part of f respectively. By considering L as
a linear space over R, it follows that u and v are real linear functionals on L. Consequently,
from
\begin{align*}
f(ix) &= i f(x) = i\,u(x) - v(x) \\
f(ix) &= u(ix) + i\,v(ix)
\end{align*}
it follows that v(x) = −u(ix) and so, f(x) = u(x) − i u(ix). Since
u(x) ≤ |f (x)| ≤ ρ(x) x ∈ L,
by Theorem 12.10.4, there is a real functional U on X such that U = u on L and U (x) ≤
ρ(x). It is easy to check that
F (x) = U (x) − iU (i x)
defines a complex linear functional on X as a complex linear space. For any x ∈ X, let
a ∈ C with |a| = 1 such that |F (x)| = aF (x). Then,
|F (x)| = F (a x) = U (a x) ≤ ρ(a x) = ρ(x)
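The claim that F(x) = U(x) − iU(ix) is complex linear reduces to checking F(ix) = iF(x), since F is already additive and real homogeneous. A sketch of that check, using only the real linearity of U:

```latex
\begin{align*}
F(ix) &= U(ix) - i\,U(i\cdot ix) = U(ix) - i\,U(-x) = U(ix) + i\,U(x) \\
      &= i\bigl(U(x) - i\,U(ix)\bigr) = i\,F(x).
\end{align*}
```

On L the same formula gives F(x) = u(x) − i u(ix) = f(x), so F extends f.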
Proof. The result follows directly from Corollary 12.10.6 of the Hahn–Banach theorem
with ρ(x) := ‖Λ‖‖x‖.
12.10. Duality and separation theorems 361
Conversely, if F is a bounded linear functional such that F(H) = {0} and F(x) ≠ 0 then,
since H ⊂ F⁻¹({0}), it follows that x ∉ H.
The last statement follows from the first part by taking H = {0}.
Corollary 12.10.10. Let X be a normed linear space and denote by X ∗ the space of all
continuous linear functionals on X. Then, for any x ∈ X,
(12.22) kxk = max |x∗ (x)|
{x∗ ∈X ∗ :kx∗ k=1}
Proof. Since |x∗ (x)| ≤ kx∗ kkxk, the right hand side of (12.22) is at most kxk. For x = 0
there is nothing to prove; if x 6= 0, then (12.22) follows by taking a linear functional x∗ such
that x∗ (x) = kxk and kx∗ k = 1.
Remark 12.10.11. If (X, ‖ ‖_X) is a Banach space, then X∗ is a Banach space under the
norm ‖x∗‖ := sup{|x∗(x)| : ‖x‖_X ≤ 1}. By the same token, X∗∗ = (X∗)∗ is a Banach space
under the corresponding sup norm. X∗∗ contains a copy of X, namely the functionals of
the form f_x : x∗ ↦ x∗(x). Corollary 12.10.10 implies that the map F : X → X∗∗ given by
x ↦ f_x is an injective isometry. If F is also onto, then X is said to be a reflexive space.
Example 12.10.12. Suppose X is a Banach space and M ⊂ X is a closed linear subspace.
It is easy to check that M^⊥ is closed in X∗ under the induced norm topology. Theorem 12.10.8
implies that any linear functional m∗ ∈ M∗ has an extension x∗_{m∗} ∈ X∗ with
‖m∗‖ = ‖x∗_{m∗}‖. If x∗_2 ∈ X∗ is another extension of m∗ to X then x∗_{m∗} − x∗_2 ∈ M^⊥ and
‖m∗‖ ≤ ‖x∗_2‖. Hence, σ : M∗ → X∗/M^⊥ given by m∗ ↦ x∗_{m∗} + M^⊥ is a well defined linear
map. Moreover, from
\[ \|m^*\| \le \|x^*_{m^*} + M^\perp\|_{\tau_q} := \inf_{y^* \in M^\perp} \|x^*_{m^*} + y^*\| \le \|x^*_{m^*}\| = \|m^*\| \]
Proof. It suffices to show that Λ(B) is an open neighborhood of 0 ∈ F for any balanced
open neighborhood B of 0 ∈ X. If x ∈ B is such that r_x = |Λ(x)| ≠ 0, then Λ(B) ⊃ D_{r_x}
where D_{r_x} = {z ∈ F : |z| ≤ r_x}. Then, there are two alternatives: either Λ(B) = F
or r := sup_{x∈B} |Λ(x)| < ∞. In the former alternative there is nothing else to prove and
Λ is unbounded; in the latter, Λ ∈ X∗ and we claim that Λ(B) = D_r. It is clear that
D_r ⊂ Λ(B). Suppose there is x ∈ B such that Λ(x) = r. The continuity of the scalar
product α ↦ αx implies that there is t > 1 such that tx ∈ B. But then, Λ(tx) > r, which
is a contradiction.
The core of a set A, denoted by core(A), is the set of points x ∈ A such that A − x is
absorbent. Points in core(A) are called core points of A. Theorem 12.1.15(a) shows that
Ao ⊂ core(A).
Lemma 12.10.14. If A is a nonempty subset of a topological vector space X and a nonzero
linear functional f on X satisfies Re(f )(x) ≥ a for all x ∈ A, then Re(f )(x) > a for all
x ∈ Ao .
Proof. It is enough to consider the case where the scalar field F = R since, once the real
case is proved with a linear functional Λ_1, the unique complex linear functional Λ whose real
part is given by Λ_1 gives the stated separation.
If A is open in X then, as linear functionals are open, Λ(A) is open in F. (ii) follows from
(i) by taking s as the right–endpoint of the open interval Λ(A).
(iii) If X is locally convex, A convex and compact and B convex and closed, then by
Theorem 12.1.8 there is a convex neighborhood V of 0 such that (A + V) ∩ (B + V) = ∅. From
(b), there is a continuous linear functional Λ and a real number s such that Λ(x) < s ≤ Λ(y)
for all x ∈ A + V and y ∈ B + V. Since A is compact and Λ is continuous, the latter attains
its maximum value on A at some point x_0 ∈ A. (iii) follows by setting t = Λ(x_0).
Corollary 12.10.16. If X is locally convex, then X ∗ separates points.
Proof. If x, y ∈ X and x 6= y then {x} and {y} are disjoint compact, and hence closed,
sets in X. The conclusion follows from Theorem 12.10.15(iii).
Corollary 12.10.17. Suppose X is locally convex, B closed, balanced and convex, and x_0 ∉ B.
Then, there exists Λ ∈ X∗ such that |Λ| ≤ 1 on B and Λ(x_0) > 1.
Proof. By Theorem 12.10.15 there are Λ_1 ∈ X∗ and real numbers t < s such that Λ_1(x_0) ∈
(−∞, t) × R and Λ_1(B) ⊂ (s, ∞) × R. Since B is balanced, it follows that s < 0 and that
K = Λ_1(B) ⊂ C is a bounded closed ball around 0. If Λ_1(x_0) = Re^{iθ}, then there is
0 < r < R such that |Λ_1| ≤ r on K. The function Λ = r⁻¹e^{−iθ}Λ_1 satisfies the desired
properties.
The following result extends Theorem 12.10.9 to the setting of locally convex spaces.
Theorem 12.10.18. Suppose X is a locally convex topological vector space and Y a linear
subspace of X. If x ∈ X \ Ȳ, then there exists x∗ ∈ Y^⊥ such that x∗(x) = 1.
Proof. Theorem 12.10.15(iii) with A = {x} and B = Ȳ (the closure of Y) implies that there exists Λ ∈ X∗
and a constant s ∈ R such that Re(Λ(y)) < s < Re(Λ(x)) for all y ∈ Y. As Y is a vector
space, it follows that Λ(Y) = {0} and Λ(x) ≠ 0. The functional x∗ = (1/Λ(x)) Λ satisfies the
conditions of the theorem.
Corollary 12.10.19. Suppose X is locally convex and M ⊂ X is a linear subspace. If
f ∈ M∗, then there is Λ ∈ X∗ such that Λ = f on M.
For any convex closed set B in a topological vector space, let P_B be the collection of
half spaces P_{Λ,c} = {x : Re(Λ(x)) ≤ c}, Λ ∈ X∗ and c ∈ R, that contain B. The next result
states that closed convex sets in a locally convex topological space are completely described
by the dual X∗.
Theorem 12.10.22. Let (X, τ ) be a locally convex topological vector space with dual X ∗ .
If B ⊂ X is a closed convex set, then B = ∩PB . Consequently, all locally convex topologies
on X with a common dual X ∗ have the same closed convex subsets.
12.11. Weak topology 365
The last statement follows from the fact that half spaces are defined only in terms of X ∗ .
Proof. If Λ ∈ (X, σ(X, X′))∗, then there exists a weak neighborhood of 0 of the form
V = {x : |Λ_j(x)| < ε, j = 1, . . . , n}, Λ_1, . . . , Λ_n ∈ X′, such that x ∈ V implies |Λ(x)| < 1.
If x ∈ ∩_{j=1}^n N_{Λ_j}, then |aΛ(x)| < 1 for all a ∈ F; therefore, Λ(x) = 0. It follows from
Lemma 12.11.4 that Λ = Σ_{j=1}^n λ_j Λ_j for some scalars λ_1, . . . , λ_n and thus, Λ ∈ X′.
To prove the last statement, notice that the topology σ(X, X ′ ) is generated by the separating
family of seminorms ρΛ (x) := |Λ(x)|, Λ ∈ X ′ . By Theorem 12.3.5, (X, σ(X, X ′ )) is a
Hausdorff locally convex topological linear space.
Theorem 12.11.10. Suppose (X, τ) is a locally convex topological vector space, and let X∗
be its dual space equipped with the weak∗ topology. Let M and N be linear subspaces of X
and X∗ respectively. Then
\begin{align*}
{}^{\perp}(M^{\perp}) &= \overline{M}^{\,\tau} \\
({}^{\perp}N)^{\perp} &= \overline{N}^{\,w^*}
\end{align*}
where M̄^τ and N̄^{w∗} denote the closures of M and N in (X, τ) and (X∗, σ(X∗, X))
respectively.
Proof. If x ∈ M then x∗(x) = 0 for all x∗ ∈ M^⊥; hence, x ∈ ⊥(M^⊥). Since ⊥(M^⊥) is closed
in (X, τ), it follows that M̄^τ ⊂ ⊥(M^⊥). Conversely, if x ∉ M̄^τ then, by Theorem 12.10.18,
there is x∗ ∈ M^⊥ such that x∗(x) ≠ 0 and so, x ∉ ⊥(M^⊥). This shows that X \ M̄^τ ⊂
X \ ⊥(M^⊥).
If x∗ ∈ N then x∗(x) = 0 for all x ∈ ⊥N; hence, x∗ ∈ (⊥N)^⊥. Since (⊥N)^⊥ is closed
in (X∗, σ(X∗, X)), it follows that N̄^{w∗} ⊂ (⊥N)^⊥. Conversely, as (X∗, σ(X∗, X)) is a locally
convex space whose dual is X, if x∗ ∉ N̄^{w∗} then, by Theorem 12.10.18 applied to
(X∗, σ(X∗, X)), there is x ∈ ⊥N such that x∗(x) ≠ 0 and so, x∗ ∉ (⊥N)^⊥. This shows that
X∗ \ N̄^{w∗} ⊂ X∗ \ (⊥N)^⊥.
The following result is a weaker version of Theorem 12.10.15(iii) which does not require
local convexity.
Theorem 12.11.11. Suppose X is a topological vector space whose dual X∗ separates points.
If A and B are non–empty disjoint compact convex subsets of X, then there exists Λ ∈ X∗
such that
\[ \sup_{x\in A} \operatorname{Re}(\Lambda x) < \inf_{y\in B} \operatorname{Re}(\Lambda y) \tag{12.24} \]
Proof. By Theorem 12.11.5, the weak topology σ(X, X∗) is a Hausdorff locally convex
topology on X. Consequently, A and B are nonempty disjoint convex σ(X, X∗)–weakly
compact, and hence weakly closed, subsets of X. Therefore, there is Λ ∈ (X, σ(X, X∗))∗ = X∗ for which (12.24) holds.
Corollary 12.11.12. Suppose X1′ and X2′ are linear spaces of linear functionals on X which
separate points. X1′ ⊂ X2′ iff σ(X, X1′ ) ⊂ σ(X, X2′ ).
Proof. If X_1′ ⊂ X_2′ then clearly σ(X, X_1′) ⊂ σ(X, X_2′) by definition of weak topology.
Conversely, if σ(X, X_1′) ⊂ σ(X, X_2′) then C((X, σ(X, X_1′)); F) ⊂ C((X, σ(X, X_2′)); F) and
thus, X_1′ = (X, σ(X, X_1′))′ ⊂ (X, σ(X, X_2′))′ = X_2′ by Theorem 12.11.5.
Theorem 12.11.13. Assume (X, τ) is a locally convex topological vector space with dual
X′. For any non–empty convex set E ⊂ X, the closure Ē of E in τ and the closure Ē^w of
E in σ(X, X′) coincide.
Proof. As Ē^w is weakly closed, it is originally closed; hence, Ē ⊂ Ē^w. Conversely,
by the separation theorem 12.10.15(iii), for any x ∉ Ē there exist Λ ∈ X′ and s ∈ R
such that Re(Λ(x)) < s < Re(Λ(y)) for all y ∈ Ē. Thus, V = {z ∈ X : Re(Λ(z)) < s} is
a weak–neighborhood of x that does not contain points in Ē. It follows that Ē is weakly
closed; therefore, Ē^w ⊂ Ē.
Proof. Let P be the collection of all nonempty compact extreme sets of K. This is a
nonempty collection since K ∈ P. It is clear that if ∅ ≠ C ⊂ P then, either ∩C = ∅ or
∩C ∈ P.
(i) Fix S ∈ P and Λ ∈ X∗, let µ := max_{z∈S} Re Λz and S_Λ := {z ∈ S : Re Λz = µ}. Clearly S_Λ ≠ ∅. Suppose that
for some 0 < λ < 1 and points x, y ∈ K, z := λx + (1 − λ)y ∈ S_Λ. Since z ∈ S, we have
x, y ∈ S and
µ = Re Λ(λx + (1 − λ)y) = λ Re(Λx) + (1 − λ) Re(Λy).
This implies that Re(Λx) = µ = Re(Λy), and so x, y ∈ S_Λ. Therefore S_Λ ∈ P.
(ii) We now prove that E(K) is not empty. Fix any S ∈ P and let P(S) be the collection of all
sets in P that are contained in S. By definition S ∈ P(S) and so P(S) is not empty. Order
P(S) by inclusion. By Hausdorff’s maximal principle there is a maximal chain C ⊂ P(S).
Since C satisfies the finite intersection property, we have that ∅ ≠ ∩C ∈ P(S). The
12.12. Some compactness theorems in linear spaces 369
Remark 12.12.6. The set K in (12.25) is called the polar of V. Banach–Alaoglu’s theorem
states that the polar V^∘ of any open neighborhood V of 0 ∈ X is σ(X∗, X)–compact
in X∗.
then we conclude that K is compact. Indeed, (b) implies that K is τ2 –compact and (a)
implies that K is τ1 –compact.
(b): Let f0 be an element in the closure of K in P . We will show that f0 ∈ X ∗ and that
|f0 (x)| ≤ 1 for all x ∈ B. Given x ∈ X and ε > 0, let
W (x; ε) = {f ∈ P : |f (x) − f0 (x)| < ε}
For any x, y ∈ X, a, b ∈ C, W (x; ε) ∩ W (y; ε) ∩ W (ax + by; ε) is open in P ; thus, it contains
a function f ∈ K. From
|f0 (ax + by) − af0 (x) − bf0 (y)| ≤|f0 (ax + by) − f (ax + by)|
+ |a||f0 (x) − f (x)| + |b||f0 (y) − f (y)|
≤(1 + |a| + |b|)ε,
we conclude that f0 ∈ X ∗ . Similarly, for x ∈ B and ε > 0, let f ∈ W (x; ε) ∩ K. Then
|f0 (x)| ≤ |f0 (x) − f (x)| + |f (x)| < ε + 1.
We conclude that f0 ∈ K, and (b) follows.
Example 12.12.7. If X is a Banach space, then the unit ball {x∗ ∈ X∗ : ‖x∗‖ ≤ 1}
in X∗ is weak∗ compact. More generally, the weak∗ closure of a bounded set in X∗ is
σ(X∗, X)–compact.
Example 12.12.8. Suppose (Ω, F, µ) is a nonatomic measure space. Let B be the closed
unit ball in (L¹(µ), ‖ ‖₁). We claim that E(B) = ∅. First, if f ∈ L¹ with 0 < ‖f‖₁ < 1 then,
from
\[ f = (1 - \|f\|_1)\,0 + \|f\|_1\,\frac{f}{\|f\|_1}, \qquad 0 = \tfrac12\bigl(f + (-f)\bigr) \]
it follows that neither 0 nor f is an extreme point. If ‖f‖₁ = 1, we claim that the measure
µ_f(dx) := |f| · µ(dx) is nonatomic, for if µ_f(A) > 0, B ⊂ A ∩ {f ≠ 0}, and µ_f(B) = 0,
then µ(B) = 0. By Saks theorem (Theorem 8.8.2), there is a set A ∈ F for which µ_f(A) = 1/2.
Setting g = 2f 1_A and h = 2f 1_{A^c}, we have that ‖g‖₁ = ‖h‖₁ = 1 and
\[ f = \tfrac12 (g + h). \]
This shows that f is not an extreme point. As a consequence, (L¹(µ), ‖ ‖₁) cannot be
isometrically isomorphic to the dual space of any Banach space. Otherwise B would be weak∗
compact and so, E(B) ≠ ∅ by Lemma 12.12.2, which is a contradiction.
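The norm computations behind the decomposition of f in Example 12.12.8 can be sketched as follows, assuming ‖f‖₁ = 1 and µ_f(A) = 1/2 as in the example:

```latex
\begin{align*}
\|g\|_1 &= \int 2|f|\,\mathbf{1}_A \, d\mu = 2\mu_f(A) = 1, &
\|h\|_1 &= 2\mu_f(A^c) = 2\bigl(\|f\|_1 - \mu_f(A)\bigr) = 1, \\
\tfrac12 (g + h) &= f\,\mathbf{1}_A + f\,\mathbf{1}_{A^c} = f, &
\|g - h\|_1 &= \int 2|f| \, d\mu = 2 \neq 0 .
\end{align*}
```

The last identity shows that g and h are distinct points of the unit ball, so f is a proper midpoint and cannot be extreme.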
Alaoglu’s theorem is very useful under the additional assumption that X is separable,
for then the weak∗ –ball (12.25) is also sequentially compact, that is, any sequence {x∗n } ⊂ K
has a weak∗ –convergent subsequence.
Theorem 12.12.9. Let X be a separable topological vector space. If K ⊂ X ∗ is weak∗ –
compact, then K is metrizable.
Proof. Let {xn } be a countable dense subset of X. Each linear functional x bn : Λ 7→ Λ(xn ) is
weak∗ –continuous. If x cn (Λ′ ) for all n, then Λ = Λ′ for they are continuous functions
cn (Λ) = x
that coincide on a dense set of X. Thus {b xn } separates points in X ∗ . By Theorem 2.9.1,
we conclude that K is metrizable.
Example 12.12.10. Let µ ≥ 0 be a Radon measure on (R^d, B(R^d)). The unit ball in L^∞(µ)
is weak∗ compact and metrizable. Indeed, first notice that C_00(R^d) is dense in L¹(µ). Let {G_n :
n ∈ N} be a sequence of open sets such that Ḡ_n is compact in R^d and Ḡ_n ⊂ G_{n+1} ↗ R^d.
From Urysohn’s lemma we obtain a sequence {φ_n : n ∈ N} ⊂ C_00(R^d) with Ḡ_n ≺ φ_n ≺ G_{n+1}.
Let R be the collection of polynomials in R^d with rational coefficients. By the Stone–Weierstrass
theorem, D := {φ_n p : n ∈ N, p ∈ R} is a countable dense subset of (C_00(R^d), ‖ ‖_u). As µ is finite on
compact sets, it follows that D is dense in L¹(µ).
Definition 12.12.11. Let A ⊂ X and B ⊂ X∗. The polar of A and the dual polar of
B are the sets A^∘ in X∗ and ^∘B in X defined by
\begin{align*}
A^{\circ} &= \{\Lambda \in X^* : |\Lambda(x)| \le 1,\ x \in A\} \\
{}^{\circ}B &= \{x \in X : |\Lambda(x)| \le 1,\ \Lambda \in B\}
\end{align*}
respectively.
Lemma 12.12.12. Suppose X is a topological vector space with dual X∗. Let ∅ ≠ A ⊂ X
and ∅ ≠ B ⊂ X∗. Then, A^∘ is convex, balanced and weak∗–closed; similarly, ^∘B is convex,
balanced and closed in X.
it follows that A^∘ and ^∘B are weak∗–closed and closed in X∗ and X respectively.
Theorem 12.12.13. (Bipolar theorem) Suppose X is a locally convex topological vector
space with dual X∗ and let ∅ ≠ A ⊂ X and ∅ ≠ B ⊂ X∗. Then, ^∘(A^∘) is the closure in
X of the balanced convex hull of A. Similarly, (^∘B)^∘ is the weak∗–closure of the balanced
convex hull of B.
Proof. It is clear that A ⊂ ^∘(A^∘). Since the latter set is balanced, convex and closed
in X, it contains the closure of co◦(A). Suppose there is x in ^∘(A^∘) that does not belong to the closure of co◦(A). By Corollary 12.10.17,
there is Λ ∈ X∗ such that |Λ| ≤ 1 on co◦(A) and Λ(x) > 1. The first condition implies that
Λ ∈ A^∘, and so |Λ(x)| ≤ 1. This is a contradiction.
Since (X∗, σ(X∗, X)) is locally convex and has X as its dual, the second statement goes
through step by step as above, exchanging the roles of X and X∗.
In the following result, we combine the Banach–Alaoglu theorem and the compact
version of the Banach–Steinhaus theorem to show that in locally convex topological spaces,
weakly bounded sets are originally bounded.
Theorem 12.12.14. Suppose (X, τ ) is a locally convex topological vector space, and let X ∗
be its dual space. A non empty subset E in X is bounded in τ iff E is bounded in σ(X, X ∗ ).
The following result fully describes weakly compact sets in Banach spaces.
Theorem 12.12.15. (Eberlein–Smulian) Let X be a Banach space with dual space X ∗ . A
set K ⊂ X is σ(X, X ∗ )–compact iff any sequence in K has a σ(X, X ∗ )–weakly convergent
subsequence in K.
Then, for some subsequence {xnk : k ∈ N} and x ∈ K we have that Λ(xnk ) → Λ(x)
as k → ∞ for all Λ ∈ X ∗ . As X ∗ is a Banach space, the Banach–Steinhaus theorem
implies that (xn ) is norm bounded in X ∗∗ and hence, in X. This contradicts the fact that
limk kxnk k = ∞.
By the Banach–Alaoglu theorem, the σ(X∗∗, X∗)–closure of K, denoted by K̄^{w∗∗}, is
σ(X∗∗, X∗)–compact. We will show that K̄^{w∗∗} ⊂ X by constructing a sequence (x_n) ⊂ X
which converges to a given x∗∗ ∈ K̄^{w∗∗} in σ(X∗∗, X∗). The conclusion of the theorem would then follow
from Theorem (12.11.1). Fix x∗∗ ∈ K̄^{w∗∗} and choose any x∗_1 ∈ X∗ with ‖x∗_1‖ = 1. Then,
there exists x_1 ∈ K such that
\[ |(x^{**} - x_1)(x_1^*)| < 1. \]
We continue by induction. Suppose that {x_1, . . . , x_n} ⊂ X, {x∗_1, . . . , x∗_{k_n}} ⊂ X∗ and
{k_1, . . . , k_n} ⊂ N, have been constructed so that
(1) 1 = k_1 < . . . < k_n.
(2) ‖x∗_j‖ = 1, j = 1, . . . , k_n.
(3) max{|y∗∗(x∗_j)| : j = 1, . . . , k_n} > ½‖y∗∗‖ for all y∗∗ in
E_n = span(x∗∗, x∗∗ − x_1, . . . , x∗∗ − x_n) ⊂ X∗∗.
(4) max{|(x∗∗ − x_n)(x∗_j)| : j = 1, . . . , k_n} < 1/n.
As E_n is finite dimensional, it is a closed subspace of (X∗∗, ‖ ‖) and the sphere
S^{n−1} = {y∗∗ ∈ E_n : ‖y∗∗‖ = 1} is compact. Hence, there are points y∗∗_{k_n+1}, . . . , y∗∗_{k_{n+1}} in
S^{n−1} such that
\[ S^{n-1} \subset \bigcup_{j=k_n+1}^{k_{n+1}} \Bigl\{ y^{**} \in E_n : \|y^{**} - y_j^{**}\| < \tfrac14 \Bigr\}. \tag{12.28} \]
For each j = k_n + 1, . . . , k_{n+1} choose x∗_j ∈ X∗ so that ‖x∗_j‖ = 1 and |y∗∗_j(x∗_j)| > 3/4.
From (12.28) it follows that
\[ \max\{|y^{**}(x_j^*)| : j = k_n + 1, \ldots, k_{n+1}\} > \tfrac12 \|y^{**}\| \]
for all y∗∗ ∈ E_n. For each ℓ = 1, . . . , n + 1 define
\[ V_\ell = \bigcap_{j=1}^{k_\ell} \Bigl\{ y^{**} \in X^{**} : |(x^{**} - y^{**})(x_j^*)| < \tfrac{1}{\ell} \Bigr\}. \]
As x∗∗ ∈ K̄^{w∗∗}, there exists a point x_{n+1} ∈ K ∩ V_{n+1}. The sequence (x_n) ⊂ X thus
constructed satisfies x_n ∈ V_n and, by (3), it follows that
\[ \sup\{|y^{**}(x_j^*)| : j \in \mathbb{N}\} \ge \tfrac12 \|y^{**}\| \tag{12.29} \]
for all y∗∗ in the closure Ē of E = ∪_n E_n in (X∗∗, ‖ ‖).
By hypothesis, there exist a subsequence x_{n_m} and a point x ∈ K to which x_{n_m} converges in
σ(X, X∗). By Theorem (12.11.13), x belongs to the closure of span(x_n : n ∈ N) in (X, ‖ ‖);
consequently, x∗∗ − x ∈ Ē. Fix j ∈ N. Then, for any ε > 0 there is M > k_j such that
|(x − x_{n_m})(x∗_j)| < ε for m ≥ M. For all such m we have that n_m ≥ m ≥ M > k_j ≥ j and
\[ |(x^{**} - x)(x_j^*)| \le |(x^{**} - x_{n_m})(x_j^*)| + |(x_{n_m} - x)(x_j^*)| \le \frac{1}{n_m} + \varepsilon. \]
It follows that |x∗∗(x∗_j) − x(x∗_j)| = 0 for all j ∈ N. Therefore, from (12.29), x = x∗∗.
Proof. To show that Λ is an open map it suffices to show that for any open neighborhood
V of 0 in X, the set Λ(V ) contains an open neighborhood of 0 in Y .
the sequence of partial sums x1 + . . . + xn converges to some point x ∈ X with d(x, 0) < r.
Consequently
\[ \Lambda(x) = \lim_{n\to\infty} \sum_{k=1}^{n} \Lambda(x_k) = \lim_{n\to\infty} \sum_{k=1}^{n} (y_k - y_{k+1}) = \lim_{n\to\infty} (y_1 - y_{n+1}) = y_1 . \]
The second statement follows directly from the first since Λ(X) is an open linear subspace
of Y. To prove the last statement, notice that N = Λ⁻¹({0}) is a closed subspace of X, and
by Theorem 12.2.1[(iv)], X/N inherits the metric (F-space, Fréchet, normed) properties of
X. Let π : X → X/N be the quotient map. Since π(X) = X/N and x − y ∈ N implies
Λx = Λy, there exists f : X/N → Y such that Λ = f ◦ π. Since Λ(X) = Y and f is
one-to-one, f is a linear isomorphism. Since Λ is continuous, for any open set V in Y the
set π⁻¹(f⁻¹(V)) = Λ⁻¹(V) is open in X; hence, by definition of the quotient topology,
f⁻¹(V) is open in X/N and so, f is continuous. To show that f⁻¹ is continuous, it is enough
to show that f is open. This follows from the identity
f(U) = f(π(π⁻¹(U))) = Λ(π⁻¹(U)),
the continuity of π, and the fact that Λ is open.
Proof. (i) is an immediate consequence of the open mapping theorem since Λ(X) = Y is
a complete metric space.
(ii) For any open set V ⊂ X, (Λ−1 )−1 (V ) = Λ(V ) is open in Y . Thus, Λ−1 is continuous.
Proof. The identity map I : x 7→ x from (X, τ2 ) into (X, τ1 ) is continuous and bijective.
Therefore, by Corollary 12.13.2(ii) τ1 = τ2 .
A map f from a topological space (X, τX ) into a topological space (Y, τY ) has a closed
graph if {(x, f (x)) : x ∈ X} is closed in the product space (X × Y, τX ⊗ τY ).
Theorem 12.13.4. (Closed graph theorem) Suppose X and Y are F–spaces. If Λ is a
linear map from X to Y whose graph G is closed in X × Y , then Λ is continuous.
A linear map Λ : X → Y induces a linear map from the space Y ♯ of all linear functions
on Y into the space X ♯ of all linear functions on X, namely Λ† : f 7→ f ◦ Λ. Clearly Λ† is
a linear map from Y♯ to X♯. When Λ is continuous, Λ†(Y∗) ⊂ X∗. In this situation, the
restriction of Λ† to Y ∗ is called the transpose of Λ.
Lemma 12.13.6. When X and Y are topological linear spaces for which their duals X ∗ and
Y ∗ separate points, then Λ† ∈ L(Y ∗ , X ∗ ) where Y ∗ and X ∗ are given the weak∗ topologies.
Proof. For any y∗ ∈ Y∗ let {y∗_n : n ∈ D} be a net that converges to y∗ in σ(Y∗, Y). Then,
for any x ∈ X
\[ \lim_n (\Lambda^{\dagger} y_n^*)(x) = \lim_n y_n^*(\Lambda x) = y^*(\Lambda x) = (\Lambda^{\dagger} y^*)(x). \]
⊥
(ii) Since Range(Λ) = Range(Λ)⊥ , part (i) implies that Λ has dense image in Y iff
ker(Λ† ) = {0∗ }, or equivalently, iff Λ† is injective.
(iii) Equip X ∗ and Y ∗ with the corresponding weak∗ topologies. Part (i) implies that Λ is
injective iff ⊥ Range(Λ† ) = {0}. Since the dual of (X ∗ , σ(X ∗ , X)) is X, Theorem 12.10.18
(applied to the locally convex space (X ∗ , σ(X ∗ , X))) implies that ⊥ Range(Λ† ) = {0} iff
$$\overline{\operatorname{Range}(\Lambda^\dagger)}^{\,w^*} = \big({}^{\perp}\operatorname{Range}(\Lambda^\dagger)\big)^{\perp} = X^*.$$
When X and Y are Banach spaces, the conclusion of Lemma 12.13.6 can be strengthened.
Theorem 12.13.8. Let X and Y be Banach spaces and equip X ∗ and Y ∗ with the corresponding norm topologies.
(i) Λ ∈ L(X, Y ) iff Λ† ∈ L(Y ∗ , X ∗ ).
(ii) The map σ : Λ 7→ Λ† is a linear isometry from L(X, Y ) into L(Y ∗ , X ∗ ).
This shows that σ is an isometry from L(X, Y ) into L(Y ∗ , X ∗ ). The linearity of σ is left as
an exercise.
(i) Necessity follows from (ii). As for sufficiency, suppose Λ† ∈ L(Y ∗ , X ∗ ). Let (x, y) belong to the closure of
Graph(Λ). Choose a sequence $\{x_n : n \in \mathbb{N}\} \subset X$ such that $\|x_n - x\|_X + \|\Lambda x_n - y\|_Y \to 0$.
Since $f \circ \Lambda = \Lambda^\dagger f \in X^*$ is continuous for any $f \in Y^*$, we have that $(f \circ \Lambda)(x) = \lim_n (f \circ \Lambda)(x_n) = f(y)$; hence,
$f(\Lambda x) = f(y)$ for all $f \in Y^*$. Theorem 12.10.18 implies that Λx = y, so Graph(Λ) is closed. Continuity of Λ
follows from the closed graph theorem.
Example 12.13.9. (Dual of a quotient space) Suppose M is a closed linear subspace of a
Banach space X. We know that Y = X/M equipped with the norm induced by the quotient
topology is a Banach space. The quotient map π : x ↦ x + M belongs to L(X, Y ) and its
transpose π † belongs to L(Y ∗ , X ∗ ). Since π(x) = 0 + M for any x ∈ M , π † (Y ∗ ) ⊂ M ⊥ . We
claim that π † (Y ∗ ) = M ⊥ . Fix x∗ ∈ M ⊥ and let N = ker(x∗ ). N is a closed linear subspace
of X and M ⊂ N . If π(x) = π(y), then x − y ∈ M ⊂ N and x∗ (x) = x∗ (y). Hence, there
is a unique map Λ : Y → F such that Λ ◦ π = x∗ . It is easy to check that Λ is linear. It
follows from the definition of the quotient topology that Λ ∈ Y ∗ and π † (Λ) = x∗ . Therefore
π † (Y ∗ ) = M ⊥ . If U is the unit ball in X, then π(U ) is the unit ball in Y . Hence
Example 12.13.10. (Quotient in the dual space) Suppose M is a closed linear subspace
of a Banach space X. We know that Z = X ∗ /M ⊥ equipped with the norm induced by the
quotient topology is a Banach space. By the Hahn–Banach theorem every $m^* \in M^*$ admits
an extension $x^* \in X^*$, and if $x_1^*$ and $x_2^*$ are two such extensions, then $x_1^* - x_2^* \in M^\perp$. Thus,
the map $\tau : M^* \to Z$ given by $m^* \mapsto x^* + M^\perp$, where $x^*$ extends $m^*$, is a well defined
linear map. It is injective, since $\tau m^* = 0 + M^\perp$ forces $m^* = x^*|_M = 0$, and since the restriction of any $x^* \in X^*$ to M belongs to $M^*$, it is also
surjective. We claim that τ is an isometry. Fix $m^* \in M^*$. Notice that for
any extension $x^*$ of $m^*$ we have that $\|m^*\|_{M^*} \le \|x^*\|_{X^*}$. The Hahn–Banach theorem provides an
extension $x^*_{m^*} \in X^*$ of $m^*$ such that $\|m^*\|_{M^*} = \|x^*_{m^*}\|_{X^*}$. Then
$$\|m^*\|_{M^*} \le \inf\{\|x^*_{m^*} + y^*\|_{X^*} : y^* \in M^\perp\} = \|\tau m^*\|_Z \le \|x^*_{m^*}\|_{X^*} = \|m^*\|_{M^*}.$$
Therefore, $X^*/M^\perp$ and $M^*$ are isometrically isomorphic.
In the remainder of this section we focus on linear maps between Banach spaces. The
following results state equivalent forms of the open mapping theorem in this setting.
Theorem 12.13.11. Let U and V be the open unit disks in the Banach spaces X and Y
respectively. Suppose T ∈ L(X, Y ) and let δ > 0. The following statements are equivalent.
(i) $\|T^\dagger y^*\| \ge \delta\|y^*\|$ for every $y^* \in Y^*$.
(ii) $\delta V \subset \overline{T(U)}$.
(iii) $\delta V \subset T(U)$.
Moreover, T satisfies T (X) = Y iff any (and hence all) of (i)–(iii) holds for some δ > 0.
(ii) implies (iii). Statement (ii) says that $\delta V \subset \overline{T(U)}$. Then, for any $y \in Y \setminus \{0\}$ and
ε > 0 there is $x'$ such that $\|x'\| \le 1$ and $\|\delta^{-1}Tx' - \|y\|^{-1}y\| < \|y\|^{-1}\varepsilon$. This means that for
any y ∈ Y and ε > 0 there is x ∈ X with $\|x\| \le \|y\|$ such that $\|\delta^{-1}Tx - y\| < \varepsilon$.
Fix $y_1 \in V$ and choose a sequence of positive numbers $\varepsilon_n > 0$ such that
$$\sum_{n\ge 1}\varepsilon_n < 1 - \|y_1\|.$$
By induction, once $y_n$ has been picked, there is $x_n \in X$ such that $\|x_n\| \le \|y_n\|$ and $\|y_n - \delta^{-1}Tx_n\| < \varepsilon_n$. Set
$$y_{n+1} := y_n - \delta^{-1}Tx_n$$
12.13. The open map theorem 379
(iii) implies (i). By definition of the operator norm and of the transpose
$$\|T^\dagger y^*\| = \sup\{|(T^\dagger y^*)(x)| : x \in U\} = \sup\{|y^*(Tx)| : x \in U\} = \sup\{|y^*(y)| : y \in T(U)\} \ge \sup\{|y^*(y)| : y \in \delta V\} = \delta\|y^*\|$$
Necessity in the last statement is a direct consequence of the open map theorem. Sufficiency is
clear from (iii).
Theorem 12.13.12. Suppose X and Y are Banach spaces and T ∈ L(X, Y ). Then the
following statements are equivalent
(i) Range(T ) is norm closed in Y .
(ii) Range(T † ) is σ(X ∗ , X)–closed in X ∗ .
(iii) Range(T † ) is norm closed in X ∗ .
Proof. Clearly (ii) implies (iii). We will prove that (i) implies (ii) and that (iii) implies (i).
(i) implies (ii): Let U be the open unit ball in X. By Theorems 12.13.7 and 12.11.10,
$\ker(T)^\perp$ is the closure of Range(T † ) in $(X^*, \sigma(X^*, X))$. To prove that (ii) holds it is enough
to show that $\ker(T)^\perp \subset \operatorname{Range}(T^\dagger)$. Fix $x^* \in \ker(T)^\perp$. Since Range(T ) is closed in Y , it is
also a Banach space; hence, T : X → Range(T ) is an open map. Theorem 12.13.11 implies
that $\delta V \subset T(U)$ for some δ > 0, where V is the open unit ball in Range(T ). Thus, for every y ∈ Range(T ) there is x ∈ X such that
y = T x and $\|x\| \le \frac{2}{\delta}\|y\|$.
This shows that Λ is continuous. The Hahn–Banach theorem provides an extension y ∗ ∈ Y ∗
of Λ. Then T † y ∗ (x) = y ∗ (T x) = Λ(T x) = x∗ (x) for all x ∈ X. Hence T † y ∗ = x∗ , that is
x∗ ∈ Range(T † ).
(iii) implies (i): Let W be the norm closure of Range(T ) in Y and define the map S :
X → W by Sx = T x. Since Range(S) = Range(T ) is dense in W , by Theorem 12.13.7[(ii)],
$S^\dagger : W^* \to X^*$ is injective. By the Hahn–Banach theorem, if $w^* \in W^*$ then there is $y^* \in Y^*$
with $\|w^*\| = \|y^*\|$ that extends $w^*$ to Y . For any x ∈ X
$$(T^\dagger y^*)(x) = y^*(Tx) = w^*(Tx) = w^*(Sx) = (S^\dagger w^*)(x)$$
If $|\lambda| > \|x\|$, then $e - \lambda^{-1}x$ is invertible and so, $\lambda e - x = \lambda(e - \lambda^{-1}x) \in G_A$. Hence
$\sigma(x) \subset \overline{B}(0; \|x\|)$ and σ(x) is compact.
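If σ(x) = ∅, then ρ(x) = C and for any $y^* \in A^*$, $g_{y^*}(\lambda) = y^*\big((\lambda e - x)^{-1}\big)$ is an entire
function. Since $\|(\lambda e - x)^{-1}\| \le \frac{1}{|\lambda|}\,\frac{1}{1 - |\lambda|^{-1}\|x\|}$ for all $|\lambda| > \|x\|$, $\lim_{|\lambda|\to\infty} g_{y^*}(\lambda) = 0$. By
Liouville’s theorem, $g_{y^*} \equiv 0$ for all $y^* \in A^*$; by the Hahn–Banach theorem, $(\lambda e - x)^{-1} = 0$ for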
For the rest of this section we focus primarily on the Banach algebra L(X) of bounded
linear operators on a non–trivial complex Banach space X. For T ∈ L(X), the set σP (T )
of all λ ∈ σ(T ) for which T − λI is not injective is called the point spectrum of T , and its
elements are called eigenvalues of T . For λ ∈ σP (T ), ker(T − λI) is called the eigenspace of λ,
and its nonzero elements are called eigenvectors.
Corollary 12.14.4. Suppose X is a complex Banach space. For any T ∈ L(X), σ(T ) is a
non–empty compact set in C. If λ ∈ ∂(σ(T )) in C, then $\inf_{\|x\|=1}\|(T - \lambda I)x\| = 0$.
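Proof. Only the second statement needs to be proved. Suppose λ is in the boundary
of σ(T ). Let $\{\lambda_n : n \in \mathbb{N}\} \subset \rho(T)$ be such that $\lambda_n \to \lambda$. Lemma 12.6.11 implies that
$\lim_n \|(T - \lambda_n I)^{-1}\| = \infty$. For all n large enough, there are $x_n \in X$ with $\|x_n\| = 1$ such that
$$\|(T - \lambda_n I)^{-1}x_n\| > \|(T - \lambda_n I)^{-1}\| - \frac{1}{n} > 0.$$
Let $\alpha_n = \|(T - \lambda_n I)^{-1}x_n\|$ and set $y_n = \alpha_n^{-1}(T - \lambda_n I)^{-1}x_n$. Then $\|y_n\| = 1$, $\alpha_n \to \infty$, and
$$\|(T - \lambda I)y_n\| \le \|(T - \lambda_n I)y_n\| + |\lambda - \lambda_n| = \alpha_n^{-1} + |\lambda - \lambda_n| \xrightarrow{n\to\infty} 0.$$
This shows that $\inf_{\|y\|=1}\|(T - \lambda I)y\| = 0$.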
Theorem 12.14.5. Suppose X is a Banach space and T ∈ L(X). Then σ(T ) = σ(T † ).
Proof. Notice that λ ∈ ρ(T ) iff (T − λI) is invertible as an element of L(X), and this
happens iff (T † − λI) is invertible as an element of the Banach algebra L(X ∗ ).
Example 12.14.6. Consider the maps S and T from CN to itself given by
Sx(n) = x(n − 1)1(n ≥ 2)
T x(n) = x(n + 1)
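For 1 ≤ p ≤ ∞ and $\frac{1}{p} + \frac{1}{q} = 1$, set $S_p$ and $T_q$ as the restrictions of S and T to $\ell_p$ and $\ell_q$
respectively. It is easy to check that for 1 ≤ p < ∞, $S_p^\dagger = T_q$. When p = ∞, we have
$T_1^\dagger = S_\infty$. Then $\sigma(T_q) = \sigma(S_p)$. Since $S_p$ is an isometry that is not onto, $0 \in \sigma(S_p)$ and $\sigma(S_p) \subset \overline{B}(0; 1)$.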
For any λ 6= 0 with |λ| < 1, if Sx = λx, then λx(1) = 0, and x(n − 1) = λx(n) for n ≥ 2.
From this, it follows that x ≡ 0. Thus
σP (Sp ) = ∅
Similarly, if |λ| ≤ 1 and T x = λx for some x ∈ ℓq , then x(n + 1) = λx(n). From this it
follows that x(n) = x(1)λn−1 for all n ∈ N and so, |λ| < 1. From this and Example 12.13.9
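it follows that if |λ| < 1
$$1 = \dim\ker(T_q - \lambda I) = \dim\ker(S_p^\dagger - \lambda I) = \dim\operatorname{Range}(S_p - \lambda I)^{\perp} = \dim\big(\ell_p/\operatorname{Range}(S_p - \lambda I)\big)^{*},$$
Finally, for $\lambda \notin S^1$,
$$\|(S_p - \lambda I)x\| \ge \big|\,\|S_p x\| - |\lambda|\|x\|\,\big| = \big|1 - |\lambda|\big|\,\|x\|$$
for all $x \in \ell_p$. This shows that when |λ| ≠ 1, $(S_p - \lambda I)$ is injective and $\operatorname{Range}(S_p - \lambda I)$ is
closed in $\ell_p$. Therefore, $\{\lambda \in \mathbb{C} : \inf_{\|x\|=1}\|(S_p - \lambda I)x\| = 0\} = S^1$.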
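The transpose relation between the right and left shifts, and the geometric eigenvectors of the left shift, can be checked numerically on finitely supported sequences. The following is a minimal sketch and not part of the original notes: the helper names `right_shift`, `left_shift` and `pairing` are hypothetical, and a sequence is stored as a Python list that is implicitly padded with zeros.

```python
import math

def right_shift(x):
    """S x = (0, x(1), x(2), ...) for a finitely supported sequence x."""
    return [0.0] + list(x)

def left_shift(x):
    """T x = (x(2), x(3), ...)."""
    return list(x[1:])

def pairing(x, y):
    """Bilinear pairing sum_n x(n) y(n); zip truncates to the common support."""
    return sum(a * b for a, b in zip(x, y))

x = [1.0, -2.0, 0.5]
y = [3.0, 4.0, -1.0, 2.0]

# Transpose relation <S x, y> = <x, T y>.
lhs = pairing(right_shift(x), y)
rhs = pairing(x, left_shift(y))

# S is an isometry (checked here for p = 2 only).
norm_x = math.sqrt(pairing(x, x))
norm_Sx = math.sqrt(pairing(right_shift(x), right_shift(x)))

# For |lam| < 1 the sequence v(n) = lam**(n-1) satisfies (T v)(n) = lam * v(n).
# Only a finite window of v is stored, so the comparison drops the last entry.
lam = 0.5
v = [lam ** n for n in range(20)]
Tv = left_shift(v)
eigen_gap = max(abs(a - lam * b) for a, b in zip(Tv, v))
```

The check of the isometry property uses the ℓ2 norm only for convenience; the same computation works for any finite p.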
In these notes, we will only consider the case where X and Y are both Banach spaces.
Let U be the unit ball in X. Since Y is a complete normed space, totally bounded sets in Y
are relatively compact; hence, T is completely continuous iff $\overline{T(U)}$ is a compact subset of Y .
In this setting, completely continuous maps are called compact operators. We will use
Lc (X, Y ) to denote the set of compact operators. Clearly, T ∈ Lc (X, Y ) iff any bounded
sequence {xn : n ∈ N} ⊂ X admits a subsequence such that {T xnk : k ∈ N} converges in Y .
Example 12.15.3. Let Ω be an open bounded subset of $\mathbb{R}^d$. The space $X = C(\overline{\Omega})$
equipped with the sup norm is a Banach space. Suppose $K \in C(\overline{\Omega} \times \overline{\Omega})$. The map
$Tx(t) := \int_\Omega K(t, s)x(s)\,ds$ defines a bounded operator on $C(\overline{\Omega})$. An application of the Arzelà–Ascoli
theorem shows that T is a compact operator (see Exercise 12.17.31).
Theorem 12.15.4. Suppose X, Y and Z are Banach spaces. The collection of compact
operators Lc (X, Y ) is a closed linear subspace of L(X, Y ) with its norm topology. Furthermore,
if S ∈ L(X, Y ) and T ∈ L(Y, Z), and either S or T is compact, then T S ∈ Lc (X, Z).
Let U be the unit ball in X. Suppose {Tn : n ∈ N} ⊂ Lc (X, Y ) converges in operator norm
to T . To show that T ∈ Lc (X, Y ) it is enough to show that T (U ) is totally bounded. Given
ε > 0, choose $T_N$ so that $\|T_N - T\| < \frac{\varepsilon}{3}$. Then, there is a finite collection $\{x_1, \ldots, x_m\} \subset U$
such that $T_N(U) \subset \bigcup_{j=1}^m B\big(T_N x_j; \frac{\varepsilon}{3}\big)$. For x ∈ U , choose $x_j$ so that $\|T_N x - T_N x_j\| < \frac{\varepsilon}{3}$.
Since
$$\|Tx - Tx_j\| \le \|Tx - T_N x\| + \|T_N x - T_N x_j\| + \|T_N x_j - Tx_j\| < \varepsilon,$$
$T(U) \subset \bigcup_{j=1}^m B(Tx_j; \varepsilon)$. This shows that T (U ) is totally bounded.
Suppose S ∈ Lc (X, Y ) and T ∈ L(Y, Z). Let U be the unit ball in X. Since S is compact,
$\overline{S(U)}$ is compact in Y and so $T(\overline{S(U)})$ is compact in Z. Hence $\overline{T(S(U))}$ is compact in Z.
Similarly, if S ∈ L(X, Y ) and T ∈ Lc (Y, Z), then $\overline{T(S(U))}$ is compact in Z since S(U ) is bounded in Y .
Proof. (i) is a consequence of the fact that a subspace of Y of finite dimension n is
homeomorphic to the Euclidean space Fn . There, a set is compact iff it is closed and bounded.
(ii) If Range(T ) is closed, then it is itself a Banach space. By the open map theorem,
T : X → Range(T ) is an open map. If T is compact, then Range(T ) is locally compact;
hence, by Theorem 12.7.1[(iii)], Range(T ) is of finite dimension.
(iii) Suppose λ ≠ 0. Clearly Y = ker(T − λI) is a closed linear subspace of X. Since $\lambda^{-1}Ty = y$
for all y ∈ Y , the restriction of T to Y is a continuous linear map onto Y . Part (ii) implies
that Y is of finite dimension.
(iv) If 0 ∉ σ(T ) then T ∈ GL(X) and so Range(T ) = X. Since T is compact, part (ii)
implies that dim(X) < ∞ contradicting the assumption in the statement.
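Example 12.15.6. Let 1 ≤ p < ∞. Suppose $\{\alpha_n : n \in \mathbb{N}\} \subset \mathbb{C}$ is bounded and let
$m := \sup_n |\alpha_n|$. Let $A : \ell_p \to \ell_p$ be the operator defined by $Ax(n) = \alpha_n x(n)$. It is easy to check
that $\|Ax\|_{\ell_p} \le m\|x\|_{\ell_p}$ and that $\|A\| = m$. Furthermore $\{\alpha_n : n \in \mathbb{N}\} \subset \sigma_P(A) \subset \sigma(A)$.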
For each m ∈ N define Am : ℓp → ℓp by
Am x(n) = αn x(n)1(n ≤ m)
Each $A_m$ has finite dimensional range and so, it is compact. Notice that $\|A - A_m\| = \sup_{n>m}|\alpha_n|$. Therefore, if $\alpha_n \to 0$ as n → ∞ then A is compact. Conversely, if A is compact,
then we must have that $\alpha_n \to 0$ as n → ∞.
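The approximation error of the finite-rank truncations is the tail supremum of the multipliers, which is easy to tabulate. The sketch below is illustrative only: the function name `tail_sup`, the window size `N` and the two sample sequences are assumptions made for this example, and a finite window can only suggest the behaviour of the infinite sequence.

```python
def tail_sup(alpha, m):
    """sup_{n > m} |alpha_n| over the stored window (alpha[0] holds alpha_1)."""
    return max((abs(a) for a in alpha[m:]), default=0.0)

N = 1000  # size of the finite window (an assumption of this sketch)
decaying = [1.0 / n for n in range(1, N + 1)]   # alpha_n = 1/n, tends to 0
constant = [1.0] * N                            # alpha_n = 1, so A = I

# Error of the truncation A_m for a few values of m.
for m in (1, 10, 100):
    print(m, tail_sup(decaying, m), tail_sup(constant, m))
```

For the decaying sequence the error shrinks with m, in line with compactness of A; for the constant sequence it stays equal to 1, matching the fact that the identity on an infinite-dimensional space is not compact.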
Theorem 12.15.7. (Schauder) Suppose X, Y are Banach spaces, and let T ∈ L(X, Y ).
T ∈ Lc (X, Y ) iff T † ∈ Lc (Y ∗ , X ∗ ).
Suppose T is compact. Let U be the unit disk in X, and let $\{y_n^* : n \in \mathbb{N}\}$ be a sequence
in the unit disk of $Y^*$. For each n, denote by $f_n$ the restriction of $y_n^*$ to T (U ). Since
$|f_n(y) - f_n(y')| = |y_n^*(y - y')| \le \|y - y'\|$, $\{f_n : n \in \mathbb{N}\}$ is an equicontinuous sequence
in C(T (U ), F). Clearly $\sup_n |f_n(y)| \le \|T\|$ for all y ∈ T (U ). Hence, by Arzelà–Ascoli’s
12.15. Compact operators 385
theorem, {fn : n ∈ N} is relatively compact in C(T (U ), F) and so, there exists a subsequence
fnk that converges to some f ∈ C(T (U ), F). From
$$\|T^\dagger y^*_{n_k} - T^\dagger y^*_{n_j}\| = \sup_{x \in U}|(T^\dagger y^*_{n_k} - T^\dagger y^*_{n_j})(x)| = \sup_{x \in U}|(y^*_{n_k} - y^*_{n_j})(Tx)| = \sup_{y \in T(U)}|f_{n_k}(y) - f_{n_j}(y)|$$
Suppose T † is compact. The first part of the Theorem implies that T †† ∈ Lc (X ∗∗ , Y ∗∗ ). Let
φ : X → X ∗∗ and ψ : Y → Y ∗∗ be the standard isometric embeddings given by $\varphi : x \mapsto e_x$,
where $e_x(x^*) = x^*(x)$ for all $x^* \in X^*$ and $\psi : y \mapsto e_y$, where $e_y(y^*) = y^*(y)$ for all $y^* \in Y^*$.
Then
ψ(T x) (y ∗ ) = eT x (y ∗ ) = y ∗ (T x) = T † y ∗ (x)
= ex (T † y ∗ ) = φ(x) (T † y ∗ ) = T †† (φ(x)) (y ∗ )
for all y ∗ ∈ Y ∗ and x ∈ X. This means that ψ ◦ T = T †† ◦ φ. Since φ is an isometry, φ(U )
is contained in the unit disc U ∗∗ of X ∗∗ . Hence
ψT (U ) ⊂ T †† (φ(U )) ⊂ T †† (U ∗∗ )
It follows that ψ(T (U )) is totally bounded in Y ∗∗ . Since ψ is an isometry, it follows that
T (U ) is totally bounded in Y ; therefore, T is compact.
The rest of this section is dedicated to the analysis of the spectrum of compact operators
in X.
Theorem 12.15.8. If X is a Banach space and T ∈ Lc (X), then Range(T − λI) is closed
in X for all λ ≠ 0.
Proof. By Theorem 12.15.5[(iii)], dim ker(T − λI) < ∞. By Corollary 12.10.20, there
exists a closed linear subspace M such that X = ker(T − λI) ⊕ M . Let S be the restriction
of T − λI to M . Then S ∈ L(M, X), S is injective, and Range(S) = Range(T − λI). To
show that Range(S) is closed it suffices to show that for some r > 0
(12.34) $\|Sx\| \ge r\|x\|, \quad x \in M.$
If (12.34) does not hold for any r > 0 then, for any n ∈ N there is $x_n \in M$ with $\|x_n\| = 1$ such
that $\|Sx_n\| < \frac{1}{n}$. Then $Sx_n \to 0$ and by compactness of T , after passage to a subsequence,
$Tx_n$ converges to some $x_0 \in X$. Hence $\lambda x_n = Tx_n - Sx_n \to x_0$. Since M is closed, $x_0 \in M$,
and $\|x_0\| = |\lambda| > 0$. However, by continuity of S
$$Sx_0 = \lim_n S(\lambda x_n) = 0,$$
which contradicts the injectivity of S. Therefore (12.34) holds for some r > 0.
The following technical results will be used to give a full description of the spectrum of
compact operators.
hence dim M ⊥ ≥ k.
Lemma 12.15.10. Suppose M is a proper closed linear subspace of a Banach space X. For
any r > 1 there exists x ∈ X such that $\|x\| < r$ and d(x, M ) = 1.
Proof. We first show that if either (i) or (ii) is false, then there are closed subspaces
Mn and scalars λn such that
(a) {Mn : n ∈ N} is a strictly increasing sequence of closed subspaces of X.
(b) T (Mn ) ⊂ Mn for all n ∈ N.
(c) c := inf n |λn | > 0
(d) (T − λn I)(Mn ) ⊂ Mn−1 for all integer n ≥ 2.
Suppose (ii) is false. Let $\{\lambda_n\}$ be a sequence of distinct eigenvalues with $|\lambda_n| > r$. To each
$\lambda_n$ choose a unit–norm eigenvector $x_n$ and define $M_n = \operatorname{span}\{x_1, \ldots, x_n\}$. Each $M_n$ is finite
dimensional and hence closed. We prove by induction that $\{x_1, \ldots, x_n\}$ is a linearly independent
set for each n. For n = 1 this is trivial. Assume the statement holds for n ≥ 1.
Suppose
0 = a1 x1 + . . . + an xn + an+1 xn+1
Applying T gives
0 = a1 λ1 x1 + . . . + an λn xn + an+1 λn+1 xn+1
Consequently
0 = a1 (λn+1 − λ1 )x1 + . . . + an (λn+1 − λn )xn
As λj ≠ λn+1 for all 1 ≤ j ≤ n, we conclude that aj = 0 for all 1 ≤ j ≤ n. Hence
an+1 xn+1 = 0 and so an+1 = 0. We conclude that Mn is properly contained in Mn+1 .
Clearly T (Mn ) ⊂ Mn . Notice that if x ∈ Mn and
x = a1 x1 + . . . + an xn ,
then
(T − λn )x = a1 (λ1 − λn )x1 + . . . + an−1 (λn−1 − λn )xn−1 ∈ Mn−1
This shows that (T − λn )(Mn ) ⊂ Mn−1 .
Having shown the existence of spaces Mn and scalars λn satisfying (a)–(d) we obtain from
Lemma 12.15.10 vectors yn ∈ Mn such that
$$\|y_n\| < 2, \qquad d(y_n, M_{n-1}) = 1$$
for all integers n ≥ 2. For 2 ≤ m < n we have T ym ∈ Mm ⊂ Mn−1 and (T − λn )yn ∈ Mn−1 .
Hence
$$\|Ty_n - Ty_m\| = \big\|\lambda_n y_n - \big(Ty_m - (T - \lambda_n)y_n\big)\big\| = |\lambda_n|\,\big\|y_n - \lambda_n^{-1}\big(Ty_m - (T - \lambda_n)y_n\big)\big\| \ge c\,d(y_n, M_{n-1}) = c > 0.$$
This shows that $\{Ty_n : n \in \mathbb{N}\}$ does not admit a convergent subsequence, which is in
contradiction to the compactness of T . Therefore (i) and (ii) hold.
We now present the main result of this section.
Theorem 12.15.12. (Spectral theorem for compact operators) Suppose X is a Banach
space and T ∈ Lc (X). For any scalar λ ≠ 0
(i) The numbers defined below are all finite and equal:
α = dim ker(T − λI)
β = dim X/(Range(T − λI))
α∗ = dim ker(T † − λI)
β ∗ = dim X ∗ /(Range(T † − λI))
(ii) If in addition λ ∈ σ(T ), then λ is an eigenvalue of T and T † .
(iii) σ(T ) is compact, at most countable and it has at most one limit point, namely 0.
Proof. Since T is compact iff T † is compact, Theorem 12.15.5 implies that α and
α∗ are finite. Set $T_\lambda := T - \lambda I$.
Let Y = X with the norm topology and M = Range(T − λI). Then M is closed in Y and,
by Theorem 12.13.7[(i)], the annihilator of M is ker(T † − λI). Lemma 12.15.9 implies that
(12.35) β ≤ α∗
Set Y = X ∗ with the weak∗ –topology and M = Range(T † − λI). By Theorem 12.13.12, M
is a closed subspace of Y . By Theorem 12.13.7[(i)], the annihilator of M is ker(T − λI).
Lemma 12.15.9 implies that
(12.36) β∗ ≤ α
We now show that
(12.37) α≤β
Assume (12.37) is false. Since β < α < ∞, Corollaries 12.7.4 and 12.10.20 imply that
there are closed subspaces E and F in X, with dim(F ) = β such that
X = ker(Tλ ) ⊕ E = Range(Tλ ) ⊕ F
Each x ∈ X has a unique representation x = x1 + x2 with x1 ∈ ker(Tλ ) and x2 ∈ E. Let
π : X → ker(Tλ ) be given by $x \mapsto x_1$. Clearly π is linear. We claim that π is continuous.
Suppose $(x, y) \in \overline{\operatorname{Graph}(\pi)}$ and let $(x_n, \pi(x_n)) \to (x, y)$ in $X \times \ker(T_\lambda)$. Then, $y \in \ker(T_\lambda)$
and
$$z := x - y = \lim_n (x_n - \pi(x_n)) \in E$$
(iii) Theorem 12.14.2 shows that σ(T ) is compact. Part (ii) shows that σ(T ) \ {0} consists only
of eigenvalues. By Theorem 12.15.11, the set of nonzero eigenvalues of T is at most countable,
with 0 as the only possible accumulation point. If dim(X) < ∞, σ(T ) is finite. If
dim(X) = ∞ then 0 ∈ σ(T ).
Lemma 12.16.1. (Cauchy–Schwarz) If H is a vector space with inner product (·, ·), then
(12.40) $|(x, y)| \le \|x\|\|y\|$
for all x, y ∈ H.
Proof. We will assume that F = C as the real case is simple to check. It is enough to
assume that both x and y are not the zero vector; then, for any α ∈ C
$$0 \le \|x - \alpha y\|^2 = \|x\|^2 - 2\operatorname{Re}\big(\overline{\alpha}(x, y)\big) + |\alpha|^2\|y\|^2.$$
Letting $\alpha = \frac{1}{\|y\|^2}(x, y)$ we obtain that
$$0 \le \|x\|^2 - \frac{|(x, y)|^2}{\|y\|^2},$$
whence (12.40) follows.
Corollary 12.16.2. If H is a vector space with an inner product (·, ·), then $(H, \|\cdot\|)$ is a
normed space.
Proof. We will only prove the triangle inequality as the other properties of a norm are easy
to verify. For any x, y ∈ H
$$\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}(x, y) + \|y\|^2 \le \|x\|^2 + 2|(x, y)| + \|y\|^2 \le \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2.$$
The conclusion follows immediately.
The following relations between the inner product and the induced norm play a very
important role in applications.
Lemma 12.16.3. If H is an inner product vector space and $\|\cdot\|$ is the induced norm, then
(12.41) $\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$
and
(12.42) $(x, y) = \frac{1}{4}\big(\|x + y\|^2 - \|x - y\|^2\big) + \frac{i}{4}\big(\|x + iy\|^2 - \|x - iy\|^2\big)$
for all x, y ∈ H.
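Both identities of the lemma can be sanity-checked in a finite-dimensional complex space. The sketch below is not part of the original notes; it assumes the convention used in these notes (inner product linear in the first argument and conjugate-linear in the second), and the helper names `inner`, `norm_sq` and `comb` are hypothetical.

```python
def inner(x, y):
    """(x, y) = sum_k x_k * conj(y_k): linear in x, conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """Squared induced norm ||x||^2 = (x, x)."""
    return inner(x, x).real

def comb(x, y, c):
    """Return x + c*y componentwise."""
    return [a + c * b for a, b in zip(x, y)]

x = [1 + 2j, -1j, 0.5 + 0j]
y = [2 - 1j, 3 + 0j, 1 + 1j]

# Parallelogram law (12.41).
par_lhs = norm_sq(comb(x, y, 1)) + norm_sq(comb(x, y, -1))
par_rhs = 2 * norm_sq(x) + 2 * norm_sq(y)

# Polarization identity (12.42): the inner product recovered from norms only.
pol = (0.25 * (norm_sq(comb(x, y, 1)) - norm_sq(comb(x, y, -1)))
       + 0.25j * (norm_sq(comb(x, y, 1j)) - norm_sq(comb(x, y, -1j))))
```

With the opposite convention (conjugate-linear in the first argument) the sign of the imaginary part in the polarization identity flips, which is why the convention is stated explicitly here.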
The identity (12.41) is known as the parallelogram law. The next result shows that
the parallelogram law is a defining property of any inner product space.
Theorem 12.16.4. (von Neumann–Jordan) A normed space $(H, \|\cdot\|)$ has an inner product
(·, ·) with $\|x\| = \sqrt{(x, x)}$ if and only if (12.41) holds.
Proof. Only sufficiency requires a proof at this point. Let (x, y) be defined by equa-
tion (12.42). We will show that (·, ·) satisfies properties (a)–(d). Observe that the continuity
of the norm implies the continuity of the inner product.
It is clear that (x, 0) = 0, $(y, x) = \overline{(x, y)}$ and that (x, iy) = −i(x, y). Since $|1 + i| = \sqrt{2} = |1 - i|$, it follows that (x, x) ≥ 0; moreover, since $\|\cdot\|$ is a norm, we have that (x, x) = 0
only if x = 0.
12.16. Hilbert Spaces 391
V ⊥ = {u ∈ H : (v, u) = 0, ∀v ∈ V }.
The following concept is a slight generalization of inner product on a general linear space.
Definition 12.16.5. Suppose X is a complex linear space (no topology needed). A map
f : X × X → C is said to be sesquilinear if for all x, y, z ∈ X and
α ∈ C
(i) f (x + αy, z) = f (x, z) + αf (y, z).
(ii) $f(x, y + \alpha z) = f(x, y) + \overline{\alpha}f(x, z)$.
If, in addition, f satisfies $f(y, x) = \overline{f(x, y)}$ for all x, y ∈ X, then f is said to be symmetric.
Example 12.16.6. The inner product on a complex vector space H is a symmetric sesquilin-
ear map. For any linear map A : H → H on an inner product space, the map f (x, y) :=
(Ax, y) is sesquilinear (but not necessarily symmetric).
Proof. If f is symmetric then, $f(x, x) = \overline{f(x, x)}$ for all x ∈ X. This means that f (x, x) is
real.
Conversely, suppose f˜(x) := f (x, x) is real for all x ∈ X. For any x, y ∈ H, a simple
calculation gives
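Since $\tilde f(\lambda x) = |\lambda|^2\tilde f(x)$ for all x ∈ X and λ ∈ C, and $i^{-1} = -i$,
$$\begin{aligned}
\overline{f(y, x)} &= \frac{1}{4}\big(\tilde f(x + y) - \tilde f(x - y)\big) - \frac{i}{4}\big(\tilde f(y + ix) - \tilde f(y - ix)\big)\\
&= \frac{1}{4}\big(\tilde f(x + y) - \tilde f(x - y)\big) + \frac{i}{4}\big(\tilde f(y - ix) - \tilde f(y + ix)\big)\\
&= \frac{1}{4}\big(\tilde f(x + y) - \tilde f(x - y)\big) + \frac{i}{4}\big(\tilde f(x + iy) - \tilde f(x - iy)\big) = f(x, y)
\end{aligned}$$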
12.16.1. Hilbert spaces and the Projection Theorem. A Hilbert space is a vector
space with an inner product such that, under the induced norm $x \mapsto \sqrt{(x, x)}$, $(H, \|\cdot\|)$ is
a complete normed space.
Theorem 12.16.8. (The projection theorem) Let M be a nonempty closed convex subset
of a Hilbert space H. For any x0 ∈ H, there exists a unique y0 ∈ M such that
(12.45) $\|x_0 - y_0\| = \inf\{\|x_0 - y\| : y \in M\}$
Proof. Let d be the right hand side of (12.45) and let $(y_n) \subset M$ be a sequence such that
$\|x_0 - y_n\| \to d$ as n → ∞. By the parallelogram law,
$$4\Big\|x_0 - \frac{y_n + y_m}{2}\Big\|^2 + \|y_n - y_m\|^2 = 2\|x_0 - y_n\|^2 + 2\|x_0 - y_m\|^2.$$
Since $\frac{y_n + y_m}{2} \in M$, it follows that
$$\|y_n - y_m\|^2 \le 2\|x_0 - y_n\|^2 + 2\|x_0 - y_m\|^2 - 4d^2 \to 0$$
as n, m → ∞. Therefore, $(y_n)$ is a Cauchy sequence in H, and by completeness and the
closedness of M , there exists $y_0 \in M$ such that $\lim_n \|y_n - y_0\| = 0$ and thus, $\|x_0 - y_0\| = d$.
To show uniqueness, suppose there is another $y^* \in M$ such that $\|x_0 - y^*\| = d$. Since
$\frac{y_0 + y^*}{2} \in M$, it follows from the parallelogram law that
$$\|y_0 - y^*\|^2 = 4d^2 - 4\Big\|x_0 - \frac{y_0 + y^*}{2}\Big\|^2 \le 4d^2 - 4d^2 = 0;$$
that is, $y_0 = y^*$.
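In the Euclidean plane the minimizer of the projection theorem is explicit when M is the closed unit ball, which gives a quick numerical illustration. The sketch below is only an example under that assumption: the function names `project_onto_unit_ball` and `dist`, the point `x0` and the comparison grid are all hypothetical choices, not part of the original notes.

```python
import math

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def project_onto_unit_ball(x):
    """Nearest point to x in the closed unit ball {y : ||y|| <= 1}."""
    r = norm(x)
    return list(x) if r <= 1.0 else [a / r for a in x]

def dist(u, v):
    return norm([a - b for a, b in zip(u, v)])

x0 = [3.0, 4.0]
y0 = project_onto_unit_ball(x0)
d = dist(x0, y0)

# Compare with a deterministic polar grid of points of the ball:
# no grid point should be closer to x0 than y0 is.
grid = [[r / 10 * math.cos(2 * math.pi * k / 360),
         r / 10 * math.sin(2 * math.pi * k / 360)]
        for r in range(11) for k in range(360)]
best_on_grid = min(dist(x0, y) for y in grid)
```

A grid search cannot prove minimality; it only illustrates that the radial rescaling of `x0` attains the infimum in (12.45) for this particular convex set.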
Corollary 12.16.9. Suppose M is a nonempty closed convex subset of a Hilbert space H. For each x ∈ H
let $P_M(x)$ be the unique vector in M such that $\|x - P_M(x)\| = \inf_{y \in M}\|x - y\|$. Then
(i) $P_M(x) = x$ if x ∈ M and $P_M(x) \in \partial M$ if x ∉ M .
(ii) $\sup_{y \in M}\operatorname{Re}(x - P_M(x), y) \le \operatorname{Re}(x - P_M(x), P_M(x)) \le \operatorname{Re}(x - P_M(x), x)$, that is,
the hyperplane through $P_M(x)$ defined by $v := x - P_M(x)$ separates M from x.
(iii) For all x, y in H we have that $\|P_M(x) - P_M(y)\| \le \|x - y\|$, that is, the map
$x \mapsto P_M(x)$ is continuous.
(iv) If M is a closed linear subspace of H, then for any x ∈ H, $x - P_M(x) \in M^\perp$.
$(1 - \lambda)\|x - P_M(x)\| < \|x - P_M(x)\|$ whenever 0 < λ < 1. This is a contradiction to the
definition of $P_M(x)$. Therefore $P_M(x) \in \partial M$.
Proof. We first show uniqueness. If (x, y) = (x, y ′ ) for all x ∈ H, then (x, y − y ′ ) = 0 for
all x ∈ H. In particular, if x = y − y ′ , we conclude that $\|y - y'\| = 0$; therefore, y = y ′ .
If $y^* \equiv 0$, then y = 0 represents the functional. Suppose that $y^* \ne 0$ and let u ∈ H
so that $y^*(u) \ne 0$. By continuity, $M = \{x : y^*(x) = 0\}$ is a closed linear subspace. Let
u = P u + Qu, with P u ∈ M and $Qu \in M^\perp$. Then, Qu ≠ 0 and $v = \frac{1}{\|Qu\|}Qu$ is a well
defined unit vector in $M^\perp$. If $w = y^*(x)\,v - y^*(v)\,x$, then $y^*(w) = 0$ and
$$0 = (w, v) = y^*(x) - y^*(v)\,(x, v).$$
Hence, if $y = \overline{y^*(v)}\,v$, then $y^*(x) = (x, y) = V(y)(x)$ for all x ∈ H.
Proof. Consider the space $G = \overline{\operatorname{span}}(\{x_n : n \in \mathbb{Z}\})$. G is a separable Hilbert space, and
by Alaoglu’s theorem and Theorem 12.12.9 there exist x ∈ G and a subsequence $x_{n_k}$ such
that $x_{n_k} \to x$ in σ(G, G), that is, for any g ∈ G, $\lim_k \langle g, x_{n_k}\rangle = \langle g, x\rangle$. Let $P_G$ be the
orthogonal projection from H onto G. Then, for any u ∈ H, $u = P_G u + (I - P_G)u$ and
$\langle g, (I - P_G)u\rangle = 0$ for all g ∈ G. Hence $\lim_k \langle u, x_{n_k}\rangle = \lim_k \langle P_G u, x_{n_k}\rangle = \langle P_G u, x\rangle = \langle u, x\rangle$. Therefore
$x_{n_k} \to x$ in σ(H, H).
Theorem 12.16.16. For any T ∈ L(H, H) there exists a unique T ∗ ∈ L(H, H) such that
(T x, y) = (x, T ∗ y)
for all x, y ∈ H. Moreover, T ∗ ∈ L(H, H) and kT ∗ k = kT k. The operator T ∗ is called
adjoint of T . The adjoint and the transpose of T are related by
T ∗ = V −1 T † V.
where V is the Riesz representation map in Theorem 12.16.13. Furthermore, the map
$T \mapsto T^*$ on L(H) satisfies
(i) $(\lambda T + S)^* = \overline{\lambda}T^* + S^*$ for all λ ∈ C and T, S ∈ L(H).
(ii) (T S)∗ = S ∗ T ∗ for all T, S ∈ L(H).
(iii) (T ∗ )∗ = T for all T ∈ L(H).
(iv) If T is invertible, then so is $T^*$ and $(T^*)^{-1} = (T^{-1})^*$.
Proof. For fixed y ∈ H, the map x ↦ (T x, y) is linear and bounded. Therefore, the Riesz
representation theorem implies that there is a unique T ∗ y ∈ H such that
(12.48) (T x, y) = (x, T ∗ y),
x ∈ H.
The left hand side of (12.48) can be expressed as T † V (y) (x) = V (y) ◦ T (x), whereas
the right hand side of (12.48) can be expressed as V (T ∗ y) (x). Therefore T † ◦ V = V ◦ T ∗ .
Proof. The first inequality has been proved already. Let x ∈ H with $\|x\| = 1$. It follows
from
$$\|Tx\|^2 = (Tx, Tx) = (x, T^*Tx) \le \|x\|\|T^*Tx\| \le \|T^*T\| \le \|T^*\|\|T\| = \|T\|^2$$
that $\|T\|^2 \le \|T^*T\| \le \|T\|^2$.
Proof. The first statement follows from σ(T ) = σ(T † ) and the fact that
$$T^* - \overline{\lambda}I = (T - \lambda I)^* = V^{-1} \circ (T - \lambda I)^\dagger \circ V = V^{-1} \circ (T^\dagger - \lambda I^\dagger) \circ V$$
The last statement follows from the fact that V is an isometry (conjugate-linear though).
The adjoint map on L(H) is an example of a more general concept which we define
below.
Definition 12.16.19. A C ∗ –algebra is a complex Banach algebra A together with a map
∗ from A into itself (called involution) that satisfies
(a) $(\lambda x)^* = \overline{\lambda}x^*$ for any λ ∈ C and x ∈ A.
(b) (x∗ )∗ = x for all x ∈ A.
(c) (xy)∗ = y ∗ x∗ for all x, y ∈ A.
(d) $\|x^*x\| = \|x\|^2$ for all x ∈ A.
Remark 12.16.20. In a C ∗ –algebra, (a)–(d) imply that $\|x^*\| = \|x\|$. Indeed, $\|x\|^2 = \|x^*x\| \le \|x^*\|\|x\|$ implies $\|x\| \le \|x^*\|$. Applying this to $x^*$ gives $\|x^*\| \le \|(x^*)^*\| = \|x\|$.
Example 12.16.21. By Corollary 12.16.18, L(H) is a C ∗ –algebra.
Theorem 12.16.22. Any non–unital complex Banach ring A with an involution operator
is isometrically ∗–isomorphic to a C ∗ –subalgebra of codimension one in a C ∗ –algebra.
Proof. Suppose A is a non–unital ring with an involution operator. For any a ∈ A and
λ ∈ C the operator $L_a + \lambda I$, where $L_a x = ax$, belongs to L(A). Since $\|L_a x\| = \|ax\| \le \|a\|\|x\|$,
$\|L_a\| \le \|a\|$. On the other hand,
$$\|L_a a^*\| = \|aa^*\| = \|a\|^2 = \|a^*\|^2 = \|a\|\|a^*\|$$
Hence, $\|L_a\| = \|a\|$, that is, $a \mapsto L_a$ is an isometric homomorphism from A into L(A). By
defining $(L_a + \lambda I)^* := L_{a^*} + \overline{\lambda}I$, we have that $a \mapsto L_a$ preserves the involution, that is
$a^* \mapsto L_{a^*} = (L_a)^*$. It is readily seen that $\tilde A := \{L_a + \lambda I : (a, \lambda) \in A \times \mathbb{C}\}$ is a subalgebra
in L(A) and has unit $L_0 + I = I$.
Claim I: $\tilde A$ is closed in L(A). First notice that if $\|L_a + \lambda I\| = 0$ then λ = 0; otherwise
$(-\lambda^{-1}a)x = x$ and $x(-\overline{\lambda}^{-1}a^*) = x$ for all x ∈ A. Consequently
$$-\lambda^{-1}a = (-\lambda^{-1}a)(-\overline{\lambda}^{-1}a^*) = -\overline{\lambda}^{-1}a^*$$
This means that A has a unit which contradicts the assumption on A. Hence λ = 0 and
$\|L_a\| = 0$. Since $a \mapsto L_a$ is an isometry, we have that a = 0. Let $\varphi(L_a + \lambda I) := \lambda$. This is
a well defined linear map on $\tilde A$ since $L_a + \lambda I = L_b + \beta I$ implies that $L_{a-b} + (\lambda - \beta)I = 0$
and so, a = b and λ = β. The kernel of φ is $\{L_a : a \in A\}$ which is a closed subspace
of L(A), for $a \mapsto L_a$ is an isometry and A is a Banach space. This shows that $\varphi \in \tilde A^*$.
Proof. The sesquilinear map f (x, y) = (Ax, y) is symmetric iff A is self–adjoint. The
conclusion follows from Lemma 12.16.7.
Proof. This is similar to the decomposition of complex numbers into their real and imaginary
parts. If such a decomposition exists, then
$$T + T^* = (R + iJ) + (R^* - iJ^*) = 2R$$
$$T - T^* = (R + iJ) - (R^* - iJ^*) = 2iJ$$
Therefore, the decomposition exists, is unique and
$$R = \frac{1}{2}(T + T^*), \qquad J = \frac{1}{2i}(T - T^*)$$
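For matrices, the real and imaginary parts of an operator can be computed directly from the two formulas above. The following sketch is illustrative only: it works with 2×2 complex matrices stored as nested lists, and the helper names `adjoint`, `combine` and `max_diff` as well as the sample matrix are assumptions made for this example.

```python
def adjoint(A):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def combine(A, B, a, b):
    """Entrywise a*A + b*B."""
    return [[a * p + b * q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def max_diff(A, B):
    """Largest entrywise distance between two matrices of the same shape."""
    return max(abs(p - q) for ra, rb in zip(A, B) for p, q in zip(ra, rb))

T = [[1 + 2j, 3 - 1j],
     [0.5j, -2 + 0j]]
Ts = adjoint(T)

R = combine(T, Ts, 0.5, 0.5)               # R = (T + T*)/2
J = combine(T, Ts, 1 / (2j), -1 / (2j))    # J = (T - T*)/(2i)
recombined = combine(R, J, 1, 1j)          # R + iJ, which should equal T
```

Both `R` and `J` come out self-adjoint, and `recombined` agrees with `T` up to rounding, mirroring the decomposition of a complex number into real and imaginary parts.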
Theorem 12.16.30. Suppose H is a Hilbert space. If T ∈ L(H) is self–adjoint, then
To prove (ii) it is enough to assume that $\|x\| = \|y\| = 1$. It follows from the self-adjointness
of T that
$$(Tx, y) = \frac{1}{4}\Big[\big(T(x + y), x + y\big) - \big(T(x - y), x - y\big) + i\Big(\big(T(x + iy), x + iy\big) - \big(T(x - iy), x - iy\big)\Big)\Big]$$
Proof. Since (T x, x) = (x, T ∗ x), then (T x, x) = 0 for all x implies that (T ∗ x, x) = 0 for
all x. Let T = R + iJ be the real and imaginary decomposition of T . It follows from the
expressions for R and J that (Rx, x) = 0 = (Jx, x) for all x ∈ H. Since R and J are
self-adjoint, we conclude that $\|R\| = |||R||| = 0 = |||J||| = \|J\|$. Therefore R = 0 = J and so
T = 0.
Theorem 12.16.32. Suppose H is a (complex) Hilbert space and let T ∈ L(H). The
following statements are equivalent.
(i) T is normal.
(ii) $\|Tx\| = \|T^*x\|$ for all x ∈ H.
(iii) The real and imaginary parts of T as in Lemma 12.16.29 commute.
Clearly
$$\|x\|^2 = \|P_I x\|^2 + \|Q_I x\|^2 \ge \|P_I x\|^2 = \sum_{n \in I}|(x, e_n)|^2$$
Inequality (12.50) follows by taking the supremum over all finite subsets I ⊂ N .
Theorem 12.16.34. (Parseval) Let H be a Hilbert space. There exists a maximal family
G ⊂ H of orthonormal vectors such that $H = \overline{\operatorname{span}}(G)$. If in addition, H is separable, then
G is countable and
(12.51) $\lim_{N\to\infty}\Big\|x - \sum_{n=1}^{N}(x, e_n)e_n\Big\| = 0, \quad\text{and}\quad \|x\|^2 = \sum_n |(x, e_n)|^2$
Proof. Consider the family S of all collections of orthonormal vectors partially ordered by
inclusion. It is clear that $\bigcup C$ is an orthonormal family of vectors whenever C is a chain of
orthonormal families. By Zorn’s lemma, there exists a maximal orthonormal family G in H.
Let $M = \overline{\operatorname{span}}(G)$. We claim that M = H. If not, there is u ∈ H \ M and u = P u + Qu
for some P u ∈ M , $Qu \in M^\perp$, and Qu ≠ 0. It follows that $G \cup \{\frac{1}{\|Qu\|}Qu\}$ is an orthonormal
collection, in contradiction to the maximality of G. Therefore $H = \overline{\operatorname{span}}(G)$.
If H is separable, then any maximal orthonormal class G is countable. Indeed, for any distinct
orthonormal vectors e and e′ we have that $\|e - e'\| = \sqrt{2}$. If S is a countable dense subset
of H, then for each e ∈ G one can choose u(e) ∈ S such that $\|e - u(e)\| < \sqrt{2}/4$. It follows
that $\|u(e) - u(e')\| \ge \sqrt{2}/2$ whenever e ≠ e′, so $e \mapsto u(e)$ is injective; consequently, G is countable.
Let $(e_n : n \in \mathbb{N})$ be an enumeration of the elements of G. Bessel’s inequality implies that
the sequence $s_n = \sum_{k=1}^{n}(x, e_k)e_k$ is a Cauchy sequence in H; let $s = \sum_n (x, e_n)e_n$ denote its limit.
Since $(x - s, e_m) = 0$ for all m ∈ N, it follows that $(x - s) \in \operatorname{span}(G)^\perp = H^\perp = \{0\}$. A
simple calculation shows that
$$\Big\|x - \sum_{k=1}^{n}(x, e_k)e_k\Big\|^2 = \|x\|^2 - 2\sum_{k=1}^{n}|(x, e_k)|^2 + \sum_{k=1}^{n}|(x, e_k)|^2.$$
After letting n → ∞, (12.51) follows immediately.
Theorem 12.16.35. (Gram–Schmidt orthogonalization) Suppose $\{x_n : n \in \mathbb{N}\} \subset H$ is a
sequence of linearly independent vectors in a Hilbert space H. Let $M_0 = \{0\}$, and for n ≥ 1
let $M_n = \operatorname{span}(x_1, \ldots, x_n)$. There exists an orthonormal sequence $\{u_n : n \in \mathbb{N}\} \subset H$ such
that for each n ∈ N
(i) $u_n \in M_n \cap M_{n-1}^\perp$.
(ii) $M_n = \operatorname{span}(u_1, \ldots, u_n)$.
If $\{v_n : n \in \mathbb{N}\}$ is another orthonormal sequence satisfying (i)–(ii), then $v_n = \lambda_n u_n$, where
$\lambda_n \in S^1$.
Proof. For n = 1 define $u_1 = \|x_1\|^{-1}x_1$. Clearly conditions (i)–(ii) are satisfied. Assume
vectors $\{u_1, \ldots, u_n\}$, n ≥ 1, have been constructed so that (i) and (ii) hold. Let $P_n$ be the
orthogonal projection from H onto $M_n$. Define
$$u'_{n+1} = (I - P_n)x_{n+1}, \qquad u_{n+1} = \frac{1}{\|u'_{n+1}\|}u'_{n+1}$$
Clearly $u'_{n+1} \in M_{n+1} \cap M_n^\perp$, and since the vectors $x_1, \ldots, x_{n+1}$ are linearly independent, $\|u'_{n+1}\| > 0$. Since
$$P_n x_{n+1} = \sum_{j=1}^{n}(x_{n+1}, u_j)u_j, \qquad x_{n+1} = P_n x_{n+1} + \|u'_{n+1}\|\,u_{n+1}$$
it follows that $M_{n+1} = \operatorname{span}(u_1, \ldots, u_{n+1})$. This completes our construction. The last
statement is easily proved by induction.
Necessity. Let U be the unit ball in H and let $L = \overline{T(H)}$. L is a Hilbert space and since T (U )
is totally bounded, L is separable. Therefore, by Parseval’s theorem L admits a sequence
of orthonormal vectors $\Phi = \{\varphi_n : n \in \mathbb{N}\}$ such that $\overline{\operatorname{span}}(\Phi) = L$. Let $P_n$ be the orthogonal projection
from H onto $\operatorname{span}\{\varphi_j : 1 \le j \le n\}$. Each $T_n := P_n T$ is a bounded operator of finite-dimensional
range.
To complete the proof of the Theorem, it suffices to show that the sequence of functions
$g_n(y) = \|(I - P_n)y\|$ defined in K converges uniformly to 0 along some subsequence.
Bessel’s inequality shows that $g_n$ converges pointwise to 0 since
$$g_n^2(y) = \sum_{k > n}|(y, \varphi_k)|^2 \to 0$$
Proof. By Theorem 12.16.30, there is a sequence of unit vectors {xn : n ∈ N} such that
limn |(T xn , xn )| = ‖T ‖. Since T is compact and x ↦ (T x, x) is real, without loss of
404 12. Some Elements of Functional Analysis
Proof. Suppose T x = λx and T y = µy for nonzero x and y. Both λ and µ are real. Thus
λ(x, y) = (T x, y) = (x, T y) = µ(x, y).
As a consequence (λ − µ)(x, y) = 0. Since λ ≠ µ, (x, y) = 0.
Theorem 12.16.39. Suppose T is a self–adjoint compact operator on a Hilbert space H.
Then, σ(T ) is at most countable and the elements of σ(T ) \ {0} are eigenvalues of T that have at most one
accumulation point, namely 0.
Let Φ = {λn } be the list of all distinct nonzero eigenvalues of T ordered decreasingly according to magnitude, i.e., |λn+1 | ≤ |λn |. If Pn is the orthogonal projection from H onto Nn = ker(T −λn I),
then
(12.52) \[ T = \sum_{n} \lambda_n P_n . \]
Proof. The first statement is a consequence of Theorem 12.15.12, which also implies that
each Pn is of finite rank. Define
\[ M_n := N_1 \oplus \cdots \oplus N_n, \qquad M := \operatorname{span} \bigcup_n M_n . \]
When the sequence Φ of nonzero eigenvalues has a finite number k of elements, then
\[ H = N_1 \oplus \cdots \oplus N_k \oplus N_0 . \]
To prove (12.52) in the case when Φ is infinite, it suffices to show that $T_n = T P_{M_n}$ converges
to T in the operator norm. For each n, the restriction $T_{M_n^{\perp}}$ of T to $M_n^{\perp}$ is a self–adjoint compact operator all of whose distinct nonzero eigenvalues are {λk : k > n}. Lemma 12.16.37
implies that $\|T_{M_n^{\perp}}\| = |\lambda_{n+1}|$. Hence
\[ \|T - T_n\| = \Big\| T - \sum_{j=1}^{n} \lambda_j P_j \Big\| = \|T - T P_{M_n}\| = \|T(I - P_{M_n})\| \le \|T_{M_n^{\perp}}\| \, \|I - P_{M_n}\| \le |\lambda_{n+1}| \xrightarrow{n\to\infty} 0 \]
since limn λn = 0.
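A finite-dimensional instance may help fix ideas about the expansion T = Σ λn Pn and the error after truncation. The sketch below assumes H = R^2 and the symmetric matrix T = [[2, 1], [1, 2]], whose eigenpairs (3 with (1, 1)/√2 and 1 with (1, −1)/√2) are entered by hand; the helper names outer and combine are illustrative only.

```python
import math

def outer(u):
    """Matrix of the orthogonal projection onto span(u), for a unit vector u."""
    return [[a * b for b in u] for a in u]

def combine(terms, n):
    """Sum of lam * P over (lam, P) pairs, as an n x n matrix."""
    return [[sum(lam * P[i][j] for lam, P in terms) for j in range(n)] for i in range(n)]

T = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
# eigenpairs of T, ordered by decreasing magnitude of the eigenvalue
eig = [(3.0, outer([s, s])), (1.0, outer([s, -s]))]

T_full = combine(eig, 2)     # lambda_1 P_1 + lambda_2 P_2
T_1 = combine(eig[:1], 2)    # truncation after the first eigenvalue
residual = [[T[i][j] - T_1[i][j] for j in range(2)] for i in range(2)]
```

Here residual equals λ2 P2, whose operator norm is |λ2| = 1, in line with the bound ‖T − Tn‖ ≤ |λn+1| used in the proof.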
Proof. As in Theorem 12.16.39, let Φ = {λn : n ∈ N} be the sequence of all distinct nonzero
eigenvalues of A ordered decreasingly in order of magnitude (|λn+1 | ≤ |λn |), Nn = ker(T −
λn I), and Pn be the orthogonal projection from H onto Nn . Then
\[ A = \sum_{n} \lambda_n P_n \]
where {µm } are all the distinct nonzero eigenvalues of S0 ordered decreasingly according
to magnitude, and Em is the projection from N0 onto N0,m = ker(S0 − µm I|N0 ). Each
$k'_m = \dim(N_{0,m}) < \infty$ and so, N0,m has a finite orthonormal basis $\{e'_{m,j} : 1 \le j \le k'_m\}$. Rearranging the order of the {µm } and $\bigcup_m \{e'_{m,j} : 1 \le j \le k'_m\}$ we obtain a
sequence of orthonormal vectors {e0,m } ⊂ N0 and a sequence of real numbers {µ0,m } (not
necessarily distinct numbers) such that Se0,m = µ0,m e0,m . Then, setting λ0 = 0 and
k0 = dim span({e0,m }) we get
\[ Sx = \sum_{n\ge 0} \sum_{j=1}^{k_n} \mu_{n,j}\, (x, e_{n,j})\, e_{n,j}, \qquad Tx = \sum_{n\ge 0} \sum_{j=1}^{k_n} \lambda_n\, (x, e_{n,j})\, e_{n,j} \]
for all x ∈ H. The remainder of the proof consists of rearranging the double sequences into
one keeping the relation between eigenvalues and corresponding eigenvectors.
12.17. Exercises
Exercise 12.17.1. Let α ∈ F \ {0} and A ⊂ X. Show that (αA)◦ = αA◦ .
Exercise 12.17.2. Let X be a topological vector space. Suppose Y is a linear subspace of
X. Show that
(a) Y has non empty interior iff Y = X (Hint: Use Theorem 12.1.15(a)).
(b) Y is bounded iff Y = {0} (Hint: let x ∈ Y \ {0} and choose a neighborhood V ∋ 0
which does not contain x).
Exercise 12.17.3. Suppose X is a linear topological space, A ⊂ X is nonempty compact
and B ⊂ X nonempty closed. Show that A + B is closed in X.
Exercise 12.17.4. For each n ∈ Z let $e_n(t) := e^{int}$ (|t| ≤ π). Define
\[ f_n = e_{-n} + n e_n, \qquad (n \in \mathbb{N}) \]
Let X1 be the closure in L2 (−π, π) of the linear span of the functions {en : n ∈ Z+ }, and
let X2 be the closure in L2 (−π, π) of the linear span of {fn }. Show that X1 + X2 is dense
in L2 (−π, π) but it is not closed. For instance,
\[ x = \sum_{n=1}^{\infty} \frac{1}{n}\, e_{-n} \]
for φ ∈ D(Rn ). This gives some justification to the name the derivative of a
distribution. (Hint: Use Fubini’s theorem together with integration by parts.)
(c) For any u ∈ D∗ (Rn ) define τx u(φ) := u(τ−x φ). Show that τx u ∈ D∗ (Rn ) for any
x ∈ Rn . For φ ∈ D(Rn ) fixed, show that x ↦ τx u(φ) is continuous.
Exercise 12.17.11. For any set A in a vector space X, show that $\operatorname{co}(A) = \bigcap \{C : A \subset C,\ C \text{ convex}\}$. If X is a topological vector space, show that $\overline{\operatorname{co}}(A)$ is the smallest closed
convex set that contains A.
Exercise 12.17.12. Suppose A is a non–empty subset of a real vector space X. The
minimal affine set that contains A is defined as the intersection of all affine subspaces in X
that contain A. For any a ∈ A, show that
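\[ \operatorname{aff}(A) = \Big\{ \sum_{k=1}^{n} \alpha_k x_k : n \in \mathbb{N},\ \alpha_k \in \mathbb{R},\ \sum_{k=1}^{n} \alpha_k = 1,\ x_k \in A \Big\} = a + \operatorname{span}(A - a). \]
Show that the smallest closed affine space that contains A is given by $\overline{\operatorname{aff}}(A) = a + \overline{\operatorname{span}}(A - a)$ for all a ∈ A.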
Exercise 12.17.13. Let µ be a probability measure on (R, B(R)) such that F (x) =
µ(−∞, x] is continuous. Show that L0 (µ) is not locally convex.
Exercise 12.17.14. Let C ⊂ X be a convex set in a real vector space X. The relative
interior of C, denoted by ri(C), is defined as the interior of C relative to aff(C). For any
a ∈ C, show that
(a) ri(C) = a + int(C − a), where int(C − a) is the interior of C relative to the vector
space span(C − a).
(b) If ri(C) ≠ ∅, show that $\overline{\operatorname{ri}(C)} = \overline{C}$.
Exercise 12.17.15. For any nonempty subset A ⊂ X, show that
\[ \operatorname{cone}(A) = \Big\{ \sum_{j=1}^{n} \lambda_j x_j : n \in \mathbb{N},\ \lambda_j \ge 0,\ x_j \in A \Big\} \]
Exercise 12.17.18. Show that there is a solution f ∈ C([0, 1]) to the equation
\[ f(x) = \int_0^1 \sin\big(t + f^2(x)\big)\, dt \]
for all x ∈ [0, 1]. (Hint: Use Exercise 12.17.17.)
Exercise 12.17.19. Suppose X is an F–space, Y is a normed space and Γ is a collection
of continuous maps from X into Y . Let B be the set of all points x ∈ X whose orbit
Γ(x) = {Λ(x) : Λ ∈ Γ} is bounded. If B is of first category, show that $X \setminus B = \{x \in X : \sup_{\Lambda\in\Gamma} \|\Lambda(x)\| = \infty\}$ is a dense Gδ set in X. (Hint: Consider the map $\varphi : x \mapsto \sup_{\Lambda\in\Gamma} \|\Lambda(x)\|$. As ϕ is lower semicontinuous, $V_n = \varphi^{-1}((n, \infty))$ is open in X for any
n ∈ N. Show that Vn is dense in X.)
Exercise 12.17.20. Let X be an F–space and let Γ = {Λn : n ∈ N} be a sequence of
continuous linear maps from X into a topological vector space Y such that Λn x converges
to a point Λx for each x ∈ X. Show that Λ is a continuous linear map from X to Y .
Exercise 12.17.21. Suppose (X, ‖ · ‖X ), (Y, ‖ · ‖Y ) and (Z, ‖ · ‖Z ) are Banach spaces, and
B : X × Y → Z is a bilinear map continuous separately on each component. Show that B
is continuous as a map from the product space X × Y to Z. Show that there is M > 0 such
that ‖B(x, y)‖Z ≤ M ‖x‖X ‖y‖Y for all (x, y) ∈ X × Y .
Exercise 12.17.22. Let X be a topological vector space over F with topological dual space
X ∗.
(a) Show that the space X × F with addition and scalar multiplication given by
λ(x, α) + (y, β) = (λx + y, λα + β)
is a topological vector space over F when F has the Euclidean topology.
(b) Show that (X × F)∗ = X ∗ × F.
Exercise 12.17.23. Consider the measure space $(\mathbb{N}, 2^{\mathbb{N}}, \#)$ where # is the counting measure. The spaces Lp (#) on $(\mathbb{N}, 2^{\mathbb{N}})$ will be denoted by ℓp . Let c0 be the subspace of all
f ∈ ℓ∞ such that limm→∞ f (m) = 0. Show that
(a) c0 is a closed subspace of ℓ∞ .
(b) $c_0^* = \ell_1$, that is, for any $L \in c_0^*$, there is a sequence l ∈ ℓ1 such that
\[ \|L\| = \|l\|_1 = \sum_{n\ge 1} |l(n)| < \infty \]
and $L(f) = \sum_{n\ge 1} l(n) f(n)$ for all f ∈ c0 .
(Hint: Given $L \in c_0^*$, let γn = L(1{n} ). For f ∈ c0 define $f_n = \sum_{k=1}^{n} f(k) 1_{\{k\}}$. Show that
‖fn − f ‖∞ → 0.) Contrast the conclusion in (b) with Example 12.12.8.
Exercise 12.17.24. Suppose X is a locally convex topological vector space with dual X ′ .
Suppose K is σ(X, X ′ )–compact. If there is a countable set in X ′ that separates points of
K, show that K is originally bounded and metrizable. (Hint: K is weakly bounded and
hence, originally bounded. Use Theorem 2.9.1)
Exercise 12.17.25. Let X, Y and Z be Banach spaces. For any T ∈ L(X, Y ), S ∈ L(Y, Z)
and a ∈ F show that
(i) (aT + S)† = aT † + S † .
(ii) (ST )† = T † S † .
(iii) If T is bijective, then so is T † and $(T^{-1})^{\dagger} = (T^{\dagger})^{-1} \in L(X^*, Y^*)$.
Suppose X = Y and we identify X as a subspace of the double dual X ∗∗ through the map
x ↦ x̂ where x̂(x∗ ) = x∗ (x).
(iv) T †† |X = T .
Exercise 12.17.27. Suppose X is a Banach space. Show that the set Surj(X) of all
bounded linear surjective maps is open in L(X) with the operator norm (Hint: Apply
Theorem 12.13.12).
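Exercise 12.17.29. Suppose H is a Hilbert space and let T ∈ L(H). Show that $|||T||| := \sup_{\|x\|=1} |(Tx, x)| \le \|T\|$. If T is self–adjoint, show that ‖T ‖ = |||T ||| (Hint: Show that
\[ \operatorname{Re}\big(\alpha\overline{\beta}(Tx, y)\big) = \frac{1}{4}\Big[ \big(T(\alpha x + \beta y), \alpha x + \beta y\big) - \big(T(\alpha x - \beta y), \alpha x - \beta y\big) \Big] \le |||T|||\, \big( |\alpha|^2 \|x\|^2 + |\beta|^2 \|y\|^2 \big) \] )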
Inequality (12.53) is the generalized Cauchy inequality. (Hint: f (x, y) := (Ax, y) satisfies
the properties of an inner product, except possibly for the condition f (x, x) = 0 iff x = 0. The
proof of the Cauchy–Schwarz inequality goes through in this case).
Exercise 12.17.31. Suppose Ω is an open bounded subset of Rd , and let $K \in C(\overline{\Omega} \times \overline{\Omega})$.
Show that the operator $Tx(t) = \int_{\Omega} K(t, s)x(s)\, ds$ defined on $C(\overline{\Omega})$ is a compact operator in
$L(C(\overline{\Omega}))$.
Exercise 12.17.32. Let I = [a, b], a < b. On C 2 (I) define the norm |||x||| = ‖x‖u + ‖x′ ‖u +
‖x′′ ‖u . Under this norm C 2 (I) is a Banach space. Define the map L : C 2 (I) → C 0 (I) by
Lx(t) = a0 (t)x′′ (t) + a1 (t)x′ (t) + a2 (t)x(t)
where aj ∈ C 2−j (I) for j = 0, 1, 2, and a0 > 0. Show that
(a) L ∈ L(C 2 (I), C 0 (I)).
(b) dim(ker(L)) = 2 (Hint: there are unique solutions to the initial value problems
Lx = 0 with x(a) = 0, x′ (a) = 1 and x(a) = 1, x′ (a) = 0 respectively.)
Exercise 12.17.33. In this exercise, D is the differential operator. Show that the n–th term
in each of the sequences {pn : n ∈ N} defined below is a polynomial of degree n, and that
each sequence is orthogonal in a corresponding L2 space.
(a) Legendre polynomials
\[ P_n(x) := \frac{\sqrt{2n+1}}{n!\, 2^n \sqrt{2}}\, D^n (1 - x^2)^n, \qquad \big([-1, 1], \mathscr{B}([-1, 1]), dx\big) \]
(b) Laguerre polynomials
\[ L_n(x) := \frac{e^x}{n!}\, D^n (e^{-x} x^n) = \frac{1}{n!}\, (D - 1)^n x^n, \qquad \big((0, \infty), \mathscr{B}(0, \infty), e^{-x}\, dx\big) \]
(c) Hermite polynomials
\[ H_n(x) := (-1)^n e^{x^2/2} D^n (e^{-x^2/2}), \qquad \big(\mathbb{R}, \mathscr{B}(\mathbb{R}), \tfrac{1}{\sqrt{2\pi}} e^{-x^2/2}\, dx\big) \]
In Section 15.1, it will be seen that these sequences are complete orthogonal systems in
their respective L2 spaces.
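Before attempting a proof, the Legendre case (a) can be sanity-checked for small n with exact rational arithmetic. The sketch below is only a check for n ≤ 4, not a solution of the exercise; it works with D^n (1 − x^2)^n without the normalizing constant, and the helper names (poly_mul, poly_diff, rodrigues, integral_minus1_to_1) are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial; the zero polynomial is returned as [0]."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def rodrigues(n):
    """Coefficients of D^n (1 - x^2)^n, i.e. P_n without its normalizing constant."""
    p = [Fraction(1)]
    for _ in range(n):
        p = poly_mul(p, [Fraction(1), Fraction(0), Fraction(-1)])
    for _ in range(n):
        p = poly_diff(p)
    return p

def integral_minus1_to_1(p):
    """Exact integral over [-1, 1]: x^k integrates to (1 - (-1)^(k+1)) / (k+1)."""
    return sum(c * (1 - Fraction(-1) ** (k + 1)) / (k + 1) for k, c in enumerate(p))

polys = [rodrigues(n) for n in range(5)]
# Gram matrix of the unnormalized polynomials in L2([-1, 1], dx)
gram = [[integral_minus1_to_1(poly_mul(p, q)) for q in polys] for p in polys]
```

The off-diagonal entries of gram vanish, and the diagonal entries equal (n! 2^n)^2 · 2/(2n + 1), which is exactly what the constant √(2n + 1)/(n! 2^n √2) in (a) normalizes to 1.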
Exercise 12.17.34. On L2 [0, 1], define the operator Ax(t) = tx(t). Show that A is a
self–adjoint bounded operator, ‖A‖ = 1, and σP (A) = ∅. What is σ(A)?
Exercise 12.17.35. Let H be a Hilbert space. Suppose T is a compact normal operator on
H. Show that there are a sequence of complex numbers {λn } and an orthonormal sequence
of vectors {en } ⊂ H such that
\[ Tx = \sum_{n} \lambda_n (x, e_n) e_n, \qquad x \in H. \]
(Hint: use the real–imaginary decomposition of T .)
Chapter 13
which means that the map f ↦ Λf is an isometry from (L1 , ‖ ‖1 ) into (L∗∞ , ‖ ‖). As
a consequence, L1 (µ) is norm–closed in L∗∞ (µ) and, by Theorem 12.11.13, σ(L∗∞ , L∞ )–closed.
The following example shows that if p = 1, the statement of Theorem 8.4.3 may not
hold if µ is not σ–finite.
Example 13.1.1. Suppose Ω is uncountable. Let F = P(Ω) and let B be the sub σ–
algebra generated by the countable subsets of Ω. Let µ be the counting measure on F and
let µ0 be its restriction to B. Then L1 (µ) = L1 (µ0 ) consists of all functions equal to zero
except on countable subsets of Ω; L∞ (µ0 ) is the collection of all functions that are constant
except on countable subsets of Ω; L∞ (µ) is the collection of all bounded functions. It is
clear that $(L_1(\Omega))^* = L_\infty(\mu) \supset L_\infty(\mu_0)$. Let A and B be uncountable with A ∪ B = Ω and
define $\Lambda(f) = \sum_{x\in A} f(x)$. Then Λ is a continuous linear functional on L1 (µ0 ) with ‖Λ‖ = 1
and if $\Lambda f = \int f g\, d\mu$, then g = 1A ∉ L∞ (µ0 ).
Example 13.1.2. Suppose (Ω, F , µ) is a σ–finite measure space. For any 1 ≤ p < ∞,
the collection S ∗ of simple integrable functions is dense in Lp , and by Alaoglu’s theorem,
the dual unit ball Bq = {g ∈ Lq : kgkq ≤ 1}, 1 < q ≤ ∞, is σ(Lq , Lp )–compact. If F is
countably generated then, S ∗ has a countable subset that is dense in Lp ; in which case,
the topology σ(Lq , Lp ) on Bq is metrizable, and so Bq is sequentially compact.
414 13. More results on duality
We conclude this section with a result that describes uniform integrability in terms of
weak compactness.
Lemma 13.1.3. Let (fn ) be a bounded sequence in L1 (Ω, F , ν) . (a) If µn = fn dν con-
verges setwise, then (fn ) is uniformly integrable, and there is f ∈ L1 (Ω, F , ν) to which (fn )
converges weakly in σ(L1 , L∞ ). (b) In addition, if ν is finite and fn → f in ν–measure,
then fn → f in L1 .
We now show that (fn ) converges to f in σ(L1 , L∞ ). Let M := supn ‖fn ‖1 . Setwise
convergence of µn = fn dν to µ = f dν implies that $\int s f_n\, d\nu \to \int s f\, d\nu$ for all simple
functions s. As simple functions are dense in L∞ (ν), for any g ∈ L∞ and ε > 0, there exists
a simple function s such that $\|g - s\|_\infty < \frac{\varepsilon}{3(M+1)}$. For such a simple function s, there exists
an integer Nε such that n ≥ Nε implies $\big| \int s(f_n - f)\, d\nu \big| < \frac{\varepsilon}{3}$. Combining these facts, we
obtain that
\[ \Big| \int g(f_n - f)\, d\nu \Big| \le \Big| \int (g - s) f_n\, d\nu \Big| + \Big| \int s(f_n - f)\, d\nu \Big| + \Big| \int (s - g) f\, d\nu \Big| \le 2\|g - s\|_\infty M + \frac{\varepsilon}{3} < \varepsilon . \]
(b) Suppose that, in addition, ν is finite and fn → f in ν–measure. Then, for any ε > 0 we
have limn ν(|fn − f | > ε) = 0. From the uniform integrability of (fn ) and the inequality
\[ \|f_n - f\|_1 \le \varepsilon\,\nu(\Omega) + \int_{\{|f_n - f| > \varepsilon\}} |f_n|\, d\nu + \int_{\{|f_n - f| > \varepsilon\}} |f|\, d\nu \]
Proof. We consider the case where µ is a probability measure. The general case can be
derived from this one.
Now we prove that K is uniformly integrable. If that were not the case, there would be a
number ε > 0 and sequences (En ) ⊂ F and (fn ) ⊂ K such that
(13.1) \[ \mu(E_n) < \frac{1}{n}, \qquad \int_{E_n} |f_n|\, d\mu \ge \varepsilon . \]
Let (fn′ ) be a σ(L1 , L∞ )–convergent subsequence. Then by Lemma 13.1.3, (fn′ ) is uniformly
integrable, and so $\lim_{n'} \int_{E_{n'}} |f_{n'}|\, d\mu = 0$. This is a contradiction to (13.1).
Conversely, if K is uniformly integrable, then K is bounded in L1 and thus, the closure $\overline{K}^{w*}$
of K in σ(L∗∞ , L∞ ) is σ(L∗∞ , L∞ )–compact. For any $\Lambda \in \overline{K}^{w*}$, the map E ↦ Λ1E is clearly
a bounded finitely additive function in F ; for if (fα ) is a net in K such that limα fα = Λ
in σ(L∗∞ , L∞ ), then
\[ |\Lambda 1_E| = \Big| \lim_{\alpha} \int_E f_\alpha\, d\mu \Big| \le \sup_{f\in K} \int_E |f|\, d\mu \le \sup_{f\in K} \|f\|_1 < \infty \]
for all E ∈ F . We will show now that in fact Λ is countably additive (hence, a measure)
and that Λ ≪ µ. Indeed, since K is uniformly integrable, for any ε > 0 there is δ > 0
such that µ(E) < δ implies $|\Lambda 1_E| \le \sup_{f\in K} \int_E |f|\, d\mu < \varepsilon$. As µ is finite, if En ↘ ∅, then
µ(En ) → 0, and so limn Λ1En = 0. This shows that Λ is a finite signed or complex measure
and Λ ≪ µ. Consequently, Λ = f dµ for some f ∈ L1 and so $\overline{K}^{w*} \subset L_1$. Therefore K is
relatively σ(L1 , L∞ )–compact.
Proof. Suppose Λ ∈ (L∞ (µ))∗ . Let mΛ be the restriction of Λ to the set of simple functions
E(F ) on F . The arguments above show that mΛ is additive and of finite variation. If
for all 0 ≤ t ≤ 1.
13.3. Lp –Interpolation Theorems 417
Theorem 13.3.2. (M. Riesz). Suppose (X, MX , µ) and (Y, MY , ν) are measure spaces, ν
is semifinite, and 1 ≤ p0 , p1 , q0 , q1 ≤ ∞. For any 0 < t < 1 define pt and qt as
\[ \frac{1}{p_t} = \frac{1-t}{p_0} + \frac{t}{p_1}, \qquad \frac{1}{q_t} = \frac{1-t}{q_0} + \frac{t}{q_1}. \]
Suppose T is a linear operator on Lp0 (µ) + Lp1 (µ) into Lq0 (ν) + Lq1 (ν) such that T is
bounded from Lpj (µ) to Lqj (ν), that is, for some constants M0 , M1 , ‖T f ‖qj ≤ Mj ‖f ‖pj for
all f ∈ Lpj (µ) (j = 0, 1). Then T is bounded from Lpt (µ) to Lqt (ν) for all 0 < t < 1, and
Proof. For each number 1 ≤ p ≤ ∞, we use p′ to denote its conjugate; that is, $\frac{1}{p} + \frac{1}{p'} = 1$.
Fix 0 < t < 1. We first assume that pt < ∞ and qt > 1. This excludes the cases
p0 = ∞ = p1 and q0 = 1 = q1 . Let SX denote the collection of µ–integrable simple
functions on X; similarly for SY . As the collection SX is dense in Lp (µ) for all 1 ≤ p < ∞,
it suffices to prove (13.3) for functions in SX . Corollary 8.3.10 and the density of SY in $L_{q_t'}$
imply that if f ∈ SX , then
(13.4) \[ \|Tf\|_{q_t} = \sup\Big\{ \Big| \int_Y Tf\, g\, d\nu \Big| : g \in S_Y,\ \|g\|_{q_t'} = 1 \Big\}. \]
Let $f = \sum_{j=1}^{m} a_j 1_{A_j} \in S_X$ and $g = \sum_{k=1}^{n} b_k 1_{B_k} \in S_Y$, where all aj and bk are not zero and
the sets in {Aj } and {Bk } are disjoint, be such that $\|f\|_{p_t} = 1 = \|g\|_{q_t'}$. Then $Tf \in L_{q_0} \cap L_{q_1}$
and thus, $Tf \in L_{q_t}$ for any qt between q0 and q1 . Consider the functions α and β on C
given by
\[ \alpha(z) = \frac{1-z}{p_0} + \frac{z}{p_1}, \qquad \beta(z) = \frac{1-z}{q_0} + \frac{z}{q_1}. \]
This shows that (13.4) holds for any f ∈ SX with ‖f ‖pt = 1, whence we conclude that (13.3)
holds.
If p0 = ∞ = p1 , the conclusion follows directly from Hölder’s inequality, for in such case
f ∈ L∞ (µ) implies $Tf \in L_{q_0}(\nu) \cap L_{q_1}(\nu)$. Therefore,
\[ \int |Tf|^{q_t}\, d\nu \le \|Tf\|_{q_0}^{(1-t)q_t}\, \|Tf\|_{q_1}^{t q_t} \le M_0^{(1-t)q_t}\, M_1^{t q_t}\, \|f\|_\infty^{q_t} . \]
Remark 13.3.3. The assumption that ν is semifinite is used only when q0 = ∞ = q1 ,
where qt = ∞ for all 0 < t < 1 and Corollary 8.3.10 still applies.
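The interpolated exponents in Theorem 13.3.2 are easy to compute exactly. The helper below is a small sketch; the function name interpolated_exponent and the convention that None encodes the exponent ∞ are assumptions made for this illustration. The sample endpoints (p0, q0) = (1, ∞) and (p1, q1) = (2, 2) are those of the classical Hausdorff–Young setting.

```python
from fractions import Fraction

def interpolated_exponent(p0, p1, t):
    """Return p_t with 1/p_t = (1-t)/p0 + t/p1; None stands for the exponent infinity."""
    inv0 = Fraction(0) if p0 is None else 1 / Fraction(p0)
    inv1 = Fraction(0) if p1 is None else 1 / Fraction(p1)
    inv_t = (1 - t) * inv0 + t * inv1
    return None if inv_t == 0 else 1 / inv_t

t = Fraction(1, 2)
p_t = interpolated_exponent(1, 2, t)       # between L1 and L2
q_t = interpolated_exponent(None, 2, t)    # between L-infinity and L2
```

With t = 1/2 this gives p_t = 4/3 and q_t = 4, which are conjugate exponents, as expected in that setting.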
Definition 13.3.4. Suppose 1 ≤ p ≤ ∞, 1 ≤ q ≤ ∞. A mapping T : Lp −→ L0 is
said to be of strong–type (p, q) if for f ∈ Lp
(13.7) \[ \|Tf\|_q \le A\|f\|_p , \]
where A is a constant not depending on f .
If q < ∞, then T is said to be of weak–type (p, q) if there is a constant A such that
(13.8) \[ \nu(|Tf| > \alpha) \le \Big( \frac{A\|f\|_p}{\alpha} \Big)^{q} \]
for all α > 0. If q = ∞, weak–type (p, ∞) is the same as strong–type (p, ∞).
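It follows from a straightforward application of Chebyshev’s inequality that strong–type (p, q) implies
weak–type (p, q):
\[ \nu(|Tf| > \alpha) \le \Big( \frac{\|Tf\|_q}{\alpha} \Big)^{q} \le \Big( \frac{A\|f\|_p}{\alpha} \Big)^{q} . \]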
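The Chebyshev bound behind this implication, ν(|Tf| > α) ≤ (‖Tf‖q /α)^q, can be checked on a toy example. The sketch below assumes ν is the counting measure on a six-point set and treats the list Tf as the values of T f there; the names lq_norm and level_set_measure, as well as the sample values, are illustrative only.

```python
def lq_norm(values, q):
    """L_q norm with respect to the counting measure on a finite set."""
    return sum(abs(v) ** q for v in values) ** (1.0 / q)

def level_set_measure(values, alpha):
    """Counting measure of the level set {|g| > alpha}."""
    return sum(1 for v in values if abs(v) > alpha)

Tf = [0.2, 1.7, -3.1, 0.9, 2.4, -0.05]
q = 3.0
checks = []
for alpha in (0.1, 0.5, 1.0, 2.0, 3.0, 5.0):
    lhs = level_set_measure(Tf, alpha)
    rhs = (lq_norm(Tf, q) / alpha) ** q
    checks.append((alpha, lhs, rhs, lhs <= rhs))
```

Every tuple in checks ends with True. The bound is far from sharp for small α, which is one way to see why weak–type is a genuinely weaker requirement than strong–type.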
Theorem 13.3.5. (Marcinkiewicz) Suppose (X, F , µ) and (Y, B, ν) are σ–finite measure
spaces. Let 1 ≤ s < r ≤ ∞ and suppose T is a subadditive map from Ls (µ) + Lr (µ) to
the space MY of B–measurable functions. If T is simultaneously of weak–type (s, s) and
weak–type (r, r), then T is of strong–type (p, p) for all s < p < r. More explicitly, suppose
that for all f, g ∈ Ls + Lr and c ∈ C
(i) |T (f + g) (x)| ≤ |T f (x)| + |T g(x)|.
(ii) |T cf | = |c||T f |.
(iii) $\nu(|Tf| > \alpha) \le \big( \frac{A_s}{\alpha} \|f\|_s \big)^{s}$ when f ∈ Ls .
(iv) $\nu(|Tf| > \alpha) \le \big( \frac{A_r}{\alpha} \|f\|_r \big)^{r}$ when f ∈ Lr .
Then
\[ \|Tf\|_p \le A_p \|f\|_p , \qquad f \in L_p . \]
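Proof. We first consider the case r < ∞. Let f ∈ Lp and define the function λ(α) =
ν({|T f | > α}). For α > 0, we have that f = f 1{|f |>α} + f 1{|f |≤α} , so that f1 = f 1{|f |>α} ∈ Ls
and f2 = f 1{|f |≤α} ∈ Lr . Condition (i) implies that
\[ \{|Tf| > \alpha\} \subset \{|Tf_1| > \alpha/2\} \cup \{|Tf_2| > \alpha/2\} . \]
Hence, by (iii) and (iv),
(13.9) \[ \lambda(\alpha) \le \Big( \frac{2A_s}{\alpha} \Big)^{s} \int_{\{|f|>\alpha\}} |f|^s\, d\mu + \Big( \frac{2A_r}{\alpha} \Big)^{r} \int_{\{|f|\le\alpha\}} |f|^r\, d\mu . \]
By Fubini’s theorem
\[ \int |Tf|^p\, d\nu = p \int_0^\infty \alpha^{p-1} \lambda(\alpha)\, d\alpha . \]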
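The identity ∫ |g|^p dν = p ∫_0^∞ α^{p−1} ν(|g| > α) dα used here (with g = T f) can be verified exactly when ν is the counting measure on a finite set, because the distribution function α ↦ ν(|g| > α) is then a step function. The sketch below makes that assumption; the names distribution and layer_cake and the sample data are illustrative.

```python
def distribution(values, alpha):
    """Counting measure of {|g| > alpha}, i.e. the distribution function of g."""
    return sum(1 for v in values if abs(v) > alpha)

def layer_cake(values, p):
    """p * integral_0^inf alpha^(p-1) * distribution(alpha) d(alpha), piece by piece."""
    cuts = sorted({0.0} | {abs(v) for v in values})
    total = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        # the distribution function is constant on [lo, hi), and the integral of
        # p * alpha^(p-1) over that interval is hi^p - lo^p
        total += distribution(values, lo) * (hi ** p - lo ** p)
    return total

g = [0.5, -2.0, 3.0, 1.5]
p = 2.5
lhs = sum(abs(v) ** p for v in g)   # integral of |g|^p against the counting measure
rhs = layer_cake(g, p)
```

Both lhs and rhs agree up to rounding; beyond the largest value of |g| the distribution function is zero, so no tail term is needed.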
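Multiplying both sides of (13.9) by $\alpha^{p-1}$ and integrating with respect to α gives
\[ \int_0^\infty \alpha^{p-1} \alpha^{-s} \int_{\{|f|>\alpha\}} |f|^s\, d\mu\, d\alpha = \int |f|^s \int_0^{|f|} \alpha^{p-s-1}\, d\alpha\, d\mu = \frac{1}{p-s} \int |f|^s |f|^{p-s}\, d\mu . \]
Similarly,
\[ \int_0^\infty \alpha^{p-1} \alpha^{-r} \int_{\{|f|\le\alpha\}} |f|^r\, d\mu\, d\alpha = \int |f|^r \int_{|f|}^\infty \alpha^{p-1-r}\, d\alpha\, d\mu = \frac{1}{r-p} \int |f|^r |f|^{p-r}\, d\mu . \]
Consequently,
\[ \|Tf\|_p \le A_p \|f\|_p , \qquad (A_p)^p = \Big( \frac{(2A_s)^s}{p-s} + \frac{(2A_r)^r}{r-p} \Big)\, p . \]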
Just as we did before, we multiply by p αp−1 both sides of the previous inequality, integrate
with respect to α and apply Fubini’s theorem to get:
If T ∗ is of weak (p, q)–type, 1 ≤ p, q < ∞, then the set {f ∈ Lq (µ) : limt→t0 Tt f (y) =
f (y) ν–a.s} is closed in Lp (µ)
Proof. The conclusion is obviously true for all f ∈ C00 (Rd ). The operators $T_r f(x) = \frac{1}{\lambda_d(B(x;r))} \int_{B(x;r)} f\, d\lambda_d$ clearly map L1 (Rd , λd ) into itself, and T ∗ is Hardy’s maximal function
M . As M is of weak–(1, 1) type, the result follows from Theorem 13.3.7.
Proof. Let S be a countable dense set in Ω, and let {Bn : n ∈ N} be the sequence of all closed balls
whose centers pn lie in S, whose radii rn are rational, and that are contained in some
member of U . For each Bn = B(pn ; rn ) set Vn = B(pn ; rn /2). Clearly {Vn : n ∈ N} is an
open cover of Ω. For each n ∈ N, let φn be a mollifier such that Vn ≺ φn ≺ Bn . Define
ψ1 = φ1 , and inductively ψn+1 = (1 − φ1 ) · . . . · (1 − φn )φn+1 . Clearly 0 ≤ ψn ≺ Bn . It is
easy to check by induction that for any n ∈ N,
\[ \psi_1 + \cdots + \psi_n = 1 - (1 - \varphi_1) \cdots (1 - \varphi_n) . \]
If K ⊂ Ω is compact, then K ⊂ V1 ∪ . . . ∪ Vn for some n and so,
(13.10) \[ \psi_1(x) + \cdots + \psi_n(x) = 1, \qquad x \in V_1 \cup \cdots \cup V_n . \]
From (13.10) it follows that {(Vn , ψn ) : n ∈ N} satisfies (i)–(v).
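The telescoping identity ψ1 + . . . + ψn = 1 − (1 − φ1 ) · · · (1 − φn ) is purely pointwise, so it can be checked on the values of the φj at a single point x. The sketch below does that; the function name psi_values and the sample values are illustrative, with the third value equal to 1 to mimic a point x lying in V3.

```python
def psi_values(phis):
    """psi_1 = phi_1, psi_{n+1} = (1 - phi_1)...(1 - phi_n) phi_{n+1}, at one point x."""
    out, carry = [], 1.0
    for phi in phis:
        out.append(carry * phi)   # carry holds (1 - phi_1)...(1 - phi_n)
        carry *= 1.0 - phi
    return out

phis = [0.3, 0.0, 1.0, 0.6]       # values phi_1(x), ..., phi_4(x)
psis = psi_values(phis)

# running comparison of the partial sums with 1 - prod(1 - phi_j)
partial, product, gaps = 0.0, 1.0, []
for phi, psi in zip(phis, psis):
    partial += psi
    product *= 1.0 - phi
    gaps.append(abs(partial - (1.0 - product)))
```

Since φ3 (x) = 1, every ψj with j > 3 vanishes at x and the partial sums equal 1 from the third term on, which is the mechanism behind (13.10).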
For each φ ∈ D(Ω), the summation in (13.12) is in fact finite. Clearly Λ is linear on D(Ω).
To prove continuity, suppose φn → 0 in D(Ω). Then, there is a compact set K ⊂ Ω such
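that supp(φn ) ⊂ K for all n. Let m be as in Theorem 13.4.1(v), so that
\[ \Lambda(\varphi_n) = \sum_{j=1}^{m} \Lambda_{U_j}(\psi_j \varphi_n), \qquad n \in \mathbb{N}. \]
Since $\psi_j \varphi_n \xrightarrow{n\to\infty} 0$ in D(Uj ) for each j, $\Lambda(\varphi_n) \xrightarrow{n\to\infty} 0$. This means (see Exercise 12.17.8)
that Λ ∈ D′ (Ω).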
Definition 13.4.3. Suppose Λ ∈ D′ (Ω). Let WΛ be the union of all open sets in Ω where
Λ vanishes. The support of Λ is defined as SΛ = Ω \ WΛ .
Theorem 13.4.4. If Λ ∈ D′ (Ω) has support SΛ then,
(i) Λ vanishes off SΛ .
(ii) If φ ∈ D(Ω) and supp(φ) ∩ SΛ = ∅, then Λ(φ) = 0.
(iii) If SΛ = ∅, then Λ ≡ 0.
(iv) If SΛ ⊂ W ⊂ Ω for some open W and ψ ∈ C ∞ (Ω) is such that ψ|W ≡ 1, then
ψ · Λ = Λ.
Only finitely many terms in the sum are different from 0. Since ψn φ ∈ D(U ) for some open set U
where Λ vanishes,
\[ \Lambda(\varphi) = \sum_{n\ge 1} \Lambda(\psi_n \varphi) = 0 . \]
Lemma 13.5.1. Suppose |µ| is a regular finite measure on B(X). Then ‖Λµ ‖ = ‖µ‖T V .
Proof. We only need to prove that ‖Λµ ‖ ≥ ‖µ‖T V . Since |µ| is finite and regular, for any
measurable set A and ε > 0, there is a compact set K ⊂ A such that |µ|(A \ K) < ε.
Let {Aj : 1 ≤ j ≤ n} be a finite partition of X such that $\|\mu\|_{TV} < \sum_{j=1}^{n} |\mu(A_j)| + \varepsilon/2$. For
each 1 ≤ j ≤ n, let Kj ⊂ Aj be a compact set such that $|\mu|(A_j \setminus K_j) < 2^{-(j+1)}\varepsilon$. Then
\[ \|\mu\|_{TV} < \sum_{j=1}^{n} |\mu(K_j)| + \varepsilon \le \sum_{j=1}^{n} |\mu|(K_j) + \varepsilon . \]
Since the Kj s are compact and pairwise disjoint, by Urysohn’s lemma there is a function
f ∈ C00 (X) with ‖f ‖u = 1 such that $f(x) = \overline{\mu(K_j)} / |\mu(K_j)|$
for x ∈ Kj . Let $K = \bigcup_{j=1}^{n} K_j$. Then
\[ \int_K f\, d\mu = \sum_{j=1}^{n} |\mu(K_j)| > \|\mu\|_{TV} - \varepsilon , \qquad \Big| \int_{K^c} f\, d\mu \Big| \le \|f\|_u\, |\mu|(K^c) \le \varepsilon . \]
Hence $\|\Lambda_\mu\| \ge \big| \int_X f\, d\mu \big| \ge \big| \int_K f\, d\mu \big| - \big| \int_{K^c} f\, d\mu \big| \ge \|\mu\|_{TV} - 2\varepsilon$. Therefore, ‖Λµ ‖ ≥
‖µ‖T V .
13.5. Riesz duality between C0 (X) and M (X) 425
Lemma 13.5.2. Suppose Λ is a real bounded linear functional on the space C0 (X). There
exists a pair of positive bounded linear functionals Λ+ and Λ− on C0 (X) such that Λ =
Λ+ − Λ− .
Theorem 13.5.3. (Riesz representation theorem) Let X be a l.c.H. topological space. Suppose that Λ is a complex or real bounded linear functional on C0 (X). Then, there is a unique
regular complex (finite signed) measure µΛ on B(X) such that
(13.13) \[ \Lambda(f) = \int_X f\, d\mu_\Lambda , \qquad f \in C_0(X) . \]
Proof. It suffices to consider the case of a real bounded linear functional, for if Λr = ℜ(Λ),
then Λ(f ) = Λr (f ) − i Λr (i f ).
Let Λ be a real bounded linear functional on C0 (X). Then there is a pair of positive bounded
linear functionals Λ+ , Λ− such that Λ = Λ+ − Λ− . By Riesz’ representation theorem 7.7.3,
there is a pair of regular finite measures µ+ and µ− such that $\Lambda_\pm(f) = \int_X f\, d\mu_\pm$ on C0 (X).
Let µΛ = µ+ − µ− . Hence $\Lambda(f) = \int_X f\, d\mu_\Lambda$ and, by Lemma 13.5.1, ‖Λ‖ = ‖µΛ ‖.
To prove uniqueness, suppose that ν is a finite regular signed measure and that $\int_X f\, d\nu = 0$ for
all f ∈ C0 (X). Let ν = ν+ − ν− be the Jordan decomposition of ν. The linear functionals
$\Lambda_\pm(f) = \int_X f\, d\nu_\pm$ are bounded. The assumption $\int_X f\, d\nu = 0$ implies that Λ+ = Λ− . The
Riesz representation theorem 7.7.3 shows that ν+ = ν− . Since ν+ ⊥ ν− , we have that
ν+ = ν− = 0.
If we denote by M(X) the space of complex (real–valued, of finite total variation) measures on B(X), and by C0∗ (X) the space of complex (real) bounded linear functionals on
C0 (X), the Riesz duality principle states that the map µ ↦ Λµ from M(X) to C0∗ (X) is an
isometry.
Corollary 13.5.4. Suppose X is a Hausdorff compact topological space. The set P(X) of
Borel probability measures on X is a weak∗ –compact convex subset of M(X).
Proof. Convexity is obvious. By the Riesz representation theorem C ∗ (X) = M(X). Since
\[ P(X) \subset \Big\{ \mu : \Big| \int f\, d\mu \Big| \le 1,\ \|f\|_u \le 1 \Big\} \]
where p− , . . . , pk are polynomials and a1 , . . . , ak are the distinct zeroes of Q. The order of
pj , 1 ≤ j ≤ k, corresponds to the order of multiplicity of aj .
The problem we will study below is that of approximating holomorphic functions in an
open set by rational functions with a prescribed set of poles. We first state an auxiliary
topological result about the complex plane.
Lemma 13.6.1. Let Ω be a nonempty open set in C. There exists a sequence of compact
sets Kn such that
(i) Kn ⊂ Int(Kn+1 )
(ii) $\Omega = \bigcup_n K_n$
(iii) Every component of S2 \ Kn contains a component of S2 \ Ω, where S2 is the
one point compactification C ∪ {∞} of C.
Hence $|y - \omega| > \frac{1}{n+1}$ and so, $d(z, \Omega^c) \ge \frac{1}{n+1}$.
(ii) If z ∈ Ω, d(z, Ωc ) > 0. Let m ∈ N be large enough so that |z| ≤ m and $\frac{1}{m} < d(z, \Omega^c)$.
Then, z ∈ Km .
(iii) For each n ∈ N, set B(∞; n) := {z : |z| > n}. Let C be a connected component of
Vn := S2 \ Kn . Since
\[ S^2 \setminus \Omega \subset V_n = B(\infty; n) \cup \bigcup_{a \notin \Omega} B\big(a; \tfrac{1}{n}\big) , \]
C is open and contains at least one of the open discs B(a; 1/n) where a ∈ {∞} ∪ (C \ Ω) = S2 \ Ω.
Say B(a0 ; 1/n) ⊂ C. Since discs are connected, C intersects the connected component Q of a0
in S2 \ Ω. Since connected components are pairwise disjoint, Q ⊂ C.
Theorem 13.6.2. (Runge) Suppose ∅ ≠ K ⊂ Ω ⊂ C, where K and Ω are compact and open
respectively. Let A = {aj } be a set that contains one point in each component of S2 \ K. If
f ∈ H(Ω), for every ε > 0 there exists a rational function R whose poles lie in A such that
(13.14) \[ \sup_{z\in K} |f(z) - R(z)| < \varepsilon . \]
Proof. Let R be the subspace of rational functions contained in C(K), and whose poles lie
in A. The statement of the theorem is equivalent to saying that if f ∈ H(Ω), then f is in the
uniform closure of R in C(K). By the Hahn–Banach theorem 12.10.9, this is equivalent to
saying that if µ ∈ C ∗ (K) and µ ∈ R⊥ , then µ(f ) = 0. By the Riesz representation theorem,
C ∗ (K) is the space M(K) of complex (and thus of finite variation) Borel measures on K.
Suppose then that µ ∈ M(K) is such that $\int_K R\, d\mu = 0$ for all R ∈ R. Define
\[ h(z) := \int_K \frac{1}{w - z}\, \mu(dw), \qquad z \in S^2 \setminus K . \]
Theorem 11.4.2 together with Remark 11.4.3 imply that h ∈ H(S2 \ K). We claim that
h ≡ 0 on S2 \ K. Suppose Cj is the component of S2 \ K that contains aj .
Case aj ∈ C: For some r > 0, B(aj ; r) ⊂ Cj . For fixed z ∈ B(aj ; r) and w ∈ K,
|z − aj | < r ≤ |w − aj | and so,
(13.15) \[ \frac{1}{w - z} = \lim_{N\to\infty} \sum_{n=0}^{N} \frac{(z - a_j)^n}{(w - a_j)^{n+1}} \]
uniformly for w ∈ K. The truncated sums in (13.15) belong to R and so, they vanish under
µ. Hence h(z) = 0 for z ∈ B(aj ; r) and, since Cj is connected, h ≡ 0 on Cj .
Case aj = ∞: There is r > 0 such that B(∞; r) ⊂ Cj . For fixed z with |z| > r,
(13.16) \[ \frac{1}{w - z} = -\lim_{N\to\infty} \sum_{n=0}^{N} \frac{w^n}{z^{n+1}} \]
uniformly for w ∈ K. Again, the truncated sums in (13.16) belong to R and, by similar
arguments as before, h ≡ 0 on Cj .
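Both expansions of the Cauchy kernel are geometric series, and their convergence can be observed numerically. In the sketch below the points w, a, z and z_far are arbitrary sample values chosen so that |z − a| < |w − a| and |z_far| > |w|; the function names are illustrative. For the pole at infinity the identity being checked is 1/(w − z) = −Σ w^n /z^{n+1}, with the minus sign coming from factoring −z out of w − z.

```python
def cauchy_kernel_partial(w, z, a, N):
    """Sum over n = 0..N of (z - a)^n / (w - a)^(n+1): expansion around a finite pole a."""
    return sum((z - a) ** n / (w - a) ** (n + 1) for n in range(N + 1))

def cauchy_kernel_partial_at_infinity(w, z, N):
    """Minus the sum over n = 0..N of w^n / z^(n+1): expansion for the pole at infinity."""
    return -sum(w ** n / z ** (n + 1) for n in range(N + 1))

w = 1.0 + 0.5j                     # a sample point of K
a, z = 3.0 + 0.0j, 3.2 + 0.1j      # |z - a| < |w - a|
err_finite = abs(cauchy_kernel_partial(w, z, a, 60) - 1 / (w - z))

z_far = 10.0 - 4.0j                # |z_far| > |w|
err_infinity = abs(cauchy_kernel_partial_at_infinity(w, z_far, 60) - 1 / (w - z_far))
```

Each partial sum is, as a function of w, a rational function with its only pole at a (respectively a polynomial in w), which is why it belongs to the class R in the proof.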
Let Γ be a cycle in Ω such that Γ ∼ 0 in Ω and IndΓ (z) = 1 for all z ∈ K. Since Γ∗ ⊂
K c , Cauchy’s general theorem and Fubini’s theorem (notice that the integrand involved is
continuous on the compact set K × Γ∗ ) imply that
\[ \int_K f\, d\mu = \int_K \Big( \frac{1}{2\pi i} \int_\Gamma \frac{f(w)}{w - z}\, dw \Big) \mu(dz) = \frac{1}{2\pi i} \int_\Gamma f(w) \Big( \int_K \frac{1}{w - z}\, \mu(dz) \Big) dw = -\frac{1}{2\pi i} \int_\Gamma f(w) h(w)\, dw = 0 . \]
This shows that µ(f ) = 0 for all µ ∈ R⊥ .
Proof. The assumptions imply that S2 \ K is an open connected set containing ∞. The
conclusion follows by applying Runge’s theorem with A = {∞}.
Theorem 13.6.4. Let Ω be a nonempty open set in C, and A a set that has one point in each
component of S2 \ Ω (A could be uncountable). If f ∈ H(Ω), there exists a sequence of
rational functions Rn whose poles are all in A, such that Rn converges to f uniformly on
compact subsets of Ω. In particular, when S2 \ Ω is connected, one may take A = {∞} and
get Rn to be polynomials.
Proof. Let {Kn } be a sequence of compact sets as in Lemma 13.6.1. Since each component
of Vn := S2 \ Kn contains a component of S2 \ Ω, each component of Vn contains a point in A. Thus, by
Runge’s theorem, for each n ∈ N there exists a rational function Rn whose poles are in A
such that
\[ \sup_{z\in K_n} |R_n(z) - f(z)| < \frac{1}{n} . \]
For any compact set K ⊂ Ω, there is n0 ∈ N such that K ⊂ Kn for all n ≥ n0 . The
conclusion follows.
13.7. Exercises
Exercise 13.7.1. Suppose that Λ is a bounded linear functional on Lp (Ω). For A ∈ F ,
define FA = {F ∩ A : F ∈ F }; denote by µA the restriction of µ to FA ; for any real or
complex valued function f on A, define fA as fA = f on A and zero elsewhere. Show that
ΛA : f ↦ Λ(fA ) is a bounded linear functional on Lp (A) with ‖ΛA ‖ ≤ ‖Λ‖.
Exercise 13.7.2. Let X = C k ([0, 1]) be the space of functions that admit continuous derivatives of order k. Define $\|f\|_k := \|f\|_u + \|f'\|_u + \cdots + \|f^{(k)}\|_u$. Show that ‖ ‖k is a complete
Calculus on Banach spaces
In this Section we give a brief presentation of Calculus on Banach spaces. We cover three
topics: integration, differentiation, and optimization. First, following the steps of Daniell’s
approach to integration, we extend the notion of measurability, and introduce Bochner’s
integral, a form of integration defined for functions taking values in a Banach space. The
second part of these notes is dedicated to differentiation. We extend the notion of derivative
to functions between Banach spaces, and present important results such as the
mean value theorem and the implicit function theorem. Lastly, we give a brief introduction
to the problem of optimization, where the objective and constraints are defined in Banach
spaces.
Suppose (Ω, M , µ) is a measure space. Let µ∗ be the Daniell mean associated to the
elementary integral µ on the space E of simple integrable M –measurable functions. A Borel
measurable function on Ω with values in a general metric space may fail to be in MS (µ∗ ).
For separable metric spaces we have the following result.
Lemma 14.1.3. Suppose (Ω, M , µ) is a measure space and let µ∗ be the Daniell–mean
associated to µ. If (S, d) is separable and f : (Ω, M ) −→ (S, B(S)) is measurable, then f ∈ MS (µ∗ ).
The general Stone–Weierstrass theorem shows that when (S, ρ) = (R, | |), Defini-
tions 14.1.1 and 7.1.3 are equivalent. Egorov’s theorem extends in the general setting
of Definition 14.1.1.
Theorem 14.1.4. (Extended Egorov’s theorem) Suppose that (fn : n ∈ N) ⊂ MS (‖ ‖)
converges almost surely to f . Then f is measurable and, for any A ∈ L1 and ε > 0, there
is L1 ∋ A0 ⊂ A with ‖A \ A0 ‖ < ε on which convergence is uniform.
Proof. By repeating the proof of Lemma 7.1.4 we obtain a set L1 ∋ A′0 ⊂ A with ‖A \ A′0 ‖ <
ε/2 on which each fn is E–uniformly continuous.
will be called E–valued elementary functions. The collection of all such functions will
be denoted as E ⊗ E.
The following result summarizes the properties of $\|\ \|^*_E$; this in turn, will be used to
define E–valued integrable functions.
Theorem 14.2.1. Let FE be the space of E–valued almost surely defined functions for
which the seminorm (14.1) is finite.
(i) (FE , $\|\ \|^*_E$) is a complete seminormed space.
(ii) If {fn } ⊂ E Ω converges to f in $\|\ \|^*_E$–mean, then there is a subsequence that
converges to f almost surely.
Let L1 (E) be the closure of E ⊗ E in FE . If f ∈ L1 (E), then
(iii) f is measurable and |f |E ∈ L1 (R) := L1 (‖ ‖∗ ),
(iv) for any ε > 0, there is a set U ∈ L1 (R) with ‖U ‖∗ < ε such that f is the uniform
limit of a sequence in E ⊗ E on U c .
Remark 14.2.2. Unless it is clear from the context, we will explicitly specify the Banach
space E in L1 (E) to distinguish it from the space of numerical integrable functions L1 (‖ ‖∗ ).
Proof. (i) and (ii) follow by repeating all the steps of the proof of Theorem 6.3.12, substituting the absolute value | | by the E–norm | |E .
To prove (iii), we first consider functions of the form $\Phi = \sum_j e_j \varphi_j$ as in (14.2). Since
\[ \big| |\Phi(x)|_E - |\Phi(y)|_E \big| \le |\Phi(x) - \Phi(y)|_E \le \sum_j |e_j|_E\, |\varphi_j(x) - \varphi_j(y)| , \]
it follows that Φ and |Φ|E are E–valued and R–valued E–uniformly continuous respectively.
By the Stone–Weierstrass theorem, |Φ|E is the sum of a constant a ∈ R and a function
$\varphi \in \overline{E}^{u}$. Hence, $|\Phi|_E \in F(\|\ \|^*) \cap M_{\mathbb{R}}(\|\ \|^*)$ and we conclude that $|\Phi|_E \in L_1(\|\ \|^*)$. For
general f ∈ L1 (E), let Φn ∈ E ⊗ E be a sequence converging to f almost surely and in
$\|\ \|^*_E$–norm. Egorov’s theorem 14.1.4 shows that f is measurable. Since
\[ \big\| |f|_E - |\Phi_n|_E \big\|^* \le \big\| |f - \Phi_n|_E \big\|^* = \|f - \Phi_n\|^*_E \to 0 \]
Statement (iv) is proved by a slight modification of the proof of Theorem 7.1.1. Choose a
sequence {Φn } ⊂ E ⊗ E that converges to f in mean and such that $\|\Phi_n - \Phi_{n-1}\|^*_E < 2^{-n-1}$.
Setting Ψ0 = Φ0 and Ψn = Φn − Φn−1 for n ≥ 1 we have that $f = \sum_n \Psi_n$ in mean and
almost surely. Let $f' = \sum_n \Psi_n$ where the series converges, and zero otherwise.
The real valued sequence $\psi_n = \sum_{k=1}^{n} k|\Psi_k|_E$ converges in L1 and almost surely to $\psi = \sum_n n|\Psi_n|_E$. For any M and K
\[ \|\{\psi > M\}\|^* \le \frac{1}{M} \|\psi\|^* \le \frac{1}{M} \Big( \sum_{k=1}^{K} k \|\Psi_k\|^*_E + \sum_{k>K} \frac{k}{2^k} \Big) ; \]
thus, for K and M large enough we have $\|\{\psi > M\}\|^* < \varepsilon$. For such M define U = {ψ >
M }. Then U ∈ L1 (‖ ‖∗ ) and on U c ,
\[ \Big| f' - \sum_{k=1}^{n} \Psi_k \Big|_E \le \sum_{k>n} |\Psi_k|_E \le \frac{1}{n} \sum_{k>n} k|\Psi_k|_E \le \frac{\psi}{n} \le \frac{M}{n} . \]
Therefore, on Uc ∩ {f ′ = 0}, f is the uniform limit of a sequence in E ⊗ E.
Parts (iii) and (iv) of Theorem 14.2.1 make a connection between integrability of E–valued
functions, E–uniform continuity, and uniform limits of E ⊗ E functions. This motivates the
following stronger notion of measurability.
Definition 14.2.4. An E–valued function f is strongly measurable if for any set A ∈ L1
and ε > 0, there exists an integrable set A0 ⊂ A with kA \ A0 k∗ < ε on which f is the
uniform limit of a sequence in E ⊗ E.
As any E ⊗ E–function is E–uniformly continuous and thus measurable, strong measurability
implies measurability. It is easy to check that the collection of strongly measurable
functions is a linear space.
Example 14.2.5. (Strong measurability of functions in C(R, E)). Let ∆ be a subinterval
of R. Consider E the space of step functions in (∆, B(∆)) and let k kλ be Daniell’s mean
associated to Lebesgue’s measure on R. If u ∈ C(∆; E), then u is strongly measurable. To
check this, define
\[ u_n(t) := \sum_{k=-2^n}^{2^n-1} u\Big(\frac{k}{2^n}\Big)\, 1_{[\frac{k}{2^n},\, \frac{k+1}{2^n}) \cap \Delta}(t), \qquad n \in \mathbb{N}. \]
For any m, {B(y_n; 1/m) : n ∈ N} covers V. Set D_1^m = B(y_1; 1/m) and
D_{n+1}^m = B(y_{n+1}; 1/m) \ ∪_{j=1}^n B(y_j; 1/m) for n ≥ 1. Then Φ_m = Σ_n y_n 1_{D_n^m}
clearly satisfies ‖f − Φ_m ∘ f‖_u ≤ 1/m for each m.
Notice that 1 ≡ Σ_n 1_{D_n^m} ∘ f for each m. By Egorov's theorem, given a set A0 ∈ L1(‖ ‖*)
and ε > 0, there is an integrable subset A1 ⊂ A0 with ‖A0 \ A1‖* < ε/2 on which
Σ_n 1_{D_n^1} ∘ f converges uniformly to 1. Thus, all but finitely many 1_{D_n^1} ∘ f vanish
on A1. As a consequence, there exists ψ1 ∈ E ⊗ E such that
\[ \big\|\, |\Phi_1 \circ f - \psi_1|_E \,\big\|_{u,A_1} < 1, \quad\text{and so}\quad \big\|\, |f - \psi_1|_E \,\big\|_{u,A_1} < 2. \]
Repeating this argument inductively we obtain a decreasing sequence of sets (Am : m ∈
N) ⊂ L1(‖ ‖*) and a sequence (ψm : m ∈ N) ⊂ E ⊗ E such that
\[ \|A_{m-1} \setminus A_m\|^* < \frac{\varepsilon}{2^m}, \qquad \big\|\, |f - \psi_m|_E \,\big\|_{u,A_m} < \frac{2}{m}. \]
By monotone convergence B := ∩_m A_m ∈ L1(‖ ‖*) and by construction
‖A0 \ B‖* = Σ_m ‖A_{m−1} \ A_m‖* < ε. On B we have that ψm → f uniformly. This shows
that f is strongly measurable.
The next result gives necessary and sufficient conditions for a function f ∈ E Ω to be
integrable.
Theorem 14.2.7. f ∈ L1(E) iff f is strongly measurable and |f|_E ∈ L1(‖ ‖*). In either
case, there exists f̃ ∈ L1(E) such that ‖f − f̃‖*_E = 0 and f̃(Ω) is separable.
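it follows that
\[ |I(f)|_E = \lim_n |I(\Phi_n)|_E \le \lim_n I(|\Phi_n|_E) = \lim_n \|\Phi_n\|^*_E = I(|f|_E) = \|f\|^*_E, \]
and
\[ |I(\Lambda f - \Lambda\Phi_n)| \le I\big(|\Lambda f - \Lambda\Phi_n|\big) \le \|\Lambda\|\, I\big(|f - \Phi_n|_E\big) = \|\Lambda\|\,\|f - \Phi_n\|^*_E. \]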
Proof. Let V be the closed linear space generated by f(Ω). If I(f) ∉ V, then there
exists Λ ∈ E* such that Λ(V) = {0} and Λ(I(f)) ≠ 0. However, Λ(I(f)) = I(Λf) = 0
which is a contradiction.
The following result extends Remark 14.3.2 to the setting of closed operators (not
necessarily bounded). B ⊂ E × F is a closed linear map if B is a closed linear subspace
of E × F such that (0, y) ∈ B implies that y = 0. The domain of B is defined as
dom(B) = {x ∈ E : ∃y ∈ F with (x, y) ∈ B}. Similarly, the range of B is defined as
range(B) = {y ∈ F : ∃x ∈ E with (x, y) ∈ B}. If (x, y) ∈ B, then we write y = Bx.
Theorem 14.3.4. (Hille) Let B ⊂ E × F be a closed linear map. Suppose f ∈ L1(‖ ‖*_E)
and that f(Ω) ⊂ dom(B). If Bf ∈ L1(‖ ‖*_F) then I(f) ∈ dom(B) and B(I(f)) = I(B(f)).
We conclude this section with a simple fundamental theorem of Calculus for Banach
valued integrals over compact intervals.
Theorem 14.3.5. If f ∈ C^1([a, b]; E), then f′ ∈ L1([a, b], ‖ ‖^λ_E) and
\[ f(b) - f(a) = \int_a^b f'(t)\,dt. \]
Proof. Integrability of f′ follows immediately from continuity, see Examples 14.2.5 and 14.2.9.
For any Λ ∈ E* we have that φ_Λ = Λ ∘ f ∈ C^1([a, b]; R) and, by the fundamental theorem
of Calculus,
\[ \Lambda\big(f(b) - f(a)\big) = \int_a^b \frac{d}{dt}(\Lambda\circ f)(t)\,dt = \Lambda\Big(\int_a^b f'(t)\,dt\Big). \]
The conclusion follows from the version of the Hahn–Banach extension theorem stated in
Theorem 12.10.9.
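The identity in Theorem 14.3.5 can be checked numerically in a finite dimensional case. The following is only a sketch under stated assumptions: E = R^3 with the Euclidean norm, a hypothetical curve `f` chosen for illustration, and a componentwise Simpson rule standing in for the Banach valued integral.

```python
import math

def f(t):
    # Hypothetical C^1 curve with values in E = R^3 (not from the notes).
    return (math.cos(t), math.sin(t), t * t)

def f_prime(t):
    # Derivative of f, computed by hand.
    return (-math.sin(t), math.cos(t), 2.0 * t)

def integrate(g, a, b, n=2000):
    """Composite Simpson rule applied componentwise to an R^3-valued g."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = [0.0, 0.0, 0.0]
    for k in range(n + 1):
        w = 1 if k in (0, n) else (4 if k % 2 else 2)
        v = g(a + k * h)
        for i in range(3):
            total[i] += w * v[i]
    return tuple(h / 3.0 * s for s in total)

def ftc_gap(a, b):
    """Euclidean norm of f(b) - f(a) - (integral of f' over [a, b])."""
    integral = integrate(f_prime, a, b)
    diff = [f(b)[i] - f(a)[i] - integral[i] for i in range(3)]
    return math.sqrt(sum(d * d for d in diff))

print(ftc_gap(0.0, 2.0))
```

The printed gap should be at the level of quadrature and rounding error, which is what the theorem predicts for this example.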
where the supremum is taken over all partitions P of [a, b]. The term ℓϕ(a, b) is the arc
length of the curve ϕ over [a, b].
A path ϕ in E is a continuous function ϕ : [a, b] → E such that for some
partition a = t0 < . . . < tn = b, ϕ ∈ C^1([t_{k−1}, t_k]) for each k. In Example 14.6.6 we
show that if ϕ is a path in E defined on [a, b] then,
\[ \ell_\phi(a,b) = \int_a^b |\phi'(t)|_E\,dt. \tag{14.5} \]
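Formula (14.5) can be illustrated numerically by comparing polygonal sums with the integral of |ϕ′|. This is a sketch only: the path `phi` (a circle of radius 2 in E = R^2) and the helper names are assumptions made for the illustration.

```python
import math

def phi(t):
    # Hypothetical path in E = R^2: a circle of radius 2.
    return (2.0 * math.cos(t), 2.0 * math.sin(t))

def phi_prime_norm(t):
    # |phi'(t)|_E for the circle above; constant and equal to the radius.
    return 2.0

def polygonal_length(a, b, n):
    """Sum of |phi(t_k) - phi(t_{k-1})| over a uniform partition with n cells."""
    total = 0.0
    prev = phi(a)
    for k in range(1, n + 1):
        cur = phi(a + (b - a) * k / n)
        total += math.hypot(cur[0] - prev[0], cur[1] - prev[1])
        prev = cur
    return total

def integral_length(a, b, n=1000):
    """Midpoint rule for the integral of |phi'(t)| over [a, b]."""
    h = (b - a) / n
    return sum(phi_prime_norm(a + (k + 0.5) * h) for k in range(n)) * h
```

Refining the partition increases the polygonal sums towards the value of the integral, in line with the definition of arc length as a supremum over partitions.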
If ϕ is a path in [a, b] and f : ϕ* → L(E, F), the path integral of f over ϕ is defined as
\[ \int_\phi f := \int_a^b f(\phi(t))\,\phi'(t)\,dt. \]
Proof. It is enough to assume that ϕ is continuously differentiable over [a, b]. By the chain
rule, d/dt (f ∘ ϕ)(t) = f′(ϕ(t))ϕ′(t). The conclusion follows from the fundamental theorem of
Calculus 14.3.5.
Remark 14.4.2. If X is a Banach space, then Theorems 14.2.6 and 14.2.7 show that
Bochner integrability implies Pettis integrability and in this case, the value of the integral
is uniquely defined. If X is a topological vector space where X* separates points, then the
Pettis integral of a weakly measurable function, when it exists, is uniquely defined.
Theorem 14.4.3. Suppose µ is a Borel probability measure on a compact Hausdorff space Ω.
Let X be a topological vector space where X* separates points. If φ : Ω → X is continuous
and the closed convex hull of φ(Ω), co(φ(Ω)), is compact in X, then φ is Pettis integrable
and y := ∫_Ω φ dµ ∈ co(φ(Ω)).
Proof. We consider the case of real vector spaces. The complex case follows from this by
doubling dimension in the arguments detailed below.
In particular,
\[ \sum_{j=1}^n c_j \Lambda_j\varphi(\omega) < \sum_{j=1}^n c_j t_j, \qquad \omega \in \Omega. \]
Since µ is a probability measure, it follows that c·m < c·t. This shows that m ∈ co(L(φ(Ω))).
Since L is linear, co(L(φ(Ω))) = L(H) and so, there exists y ∈ H such that m = Ly. Thus
Λ_j y = ∫_Ω (Λ_j φ) dµ, 1 ≤ j ≤ n and E_L ≠ ∅.
supp(ψ(x)τ_x φ) ⊂ K1 + K2,
and {ψ(x)τ_x φ : x ∈ K1} is a compact subset of D_{K1+K2}. As D_{K1+K2} is Fréchet, for any
distribution u ∈ D*(R^n)
\[ u\Big(\int_{K_1}\psi(x)\,\tau_x\varphi\,dx\Big) = \int_{K_1}\psi(x)\,u(\tau_x\varphi)\,dx. \]
Theorems 12.3.11 and 14.4.3 imply that the integrals in (14.8) and (14.9) exist. By the
general Cauchy theorem 11.31, the formulas there hold for Λ◦f in place of f , where Λ ∈ X ∗ ;
hence, the identities hold by the definition of the Pettis integral.
We now prove that f is strongly holomorphic. Let a ∈ Ω and r > 0 as before. Let Γ
be the circle of radius r centered at a. It follows from Theorems 12.3.11 and 14.4.3 that
(1/2πi) ∫_Γ f(w)/(w − a)^2 dw exists. A simple calculation gives
\[ \frac{f(z)-f(a)}{z-a} - \frac{1}{2\pi i}\int_\Gamma \frac{f(w)}{(w-a)^2}\,dw = (z-a)\,\frac{1}{2\pi i}\int_\Gamma \frac{f(w)}{(w-a)^2(w-z)}\,dw. \tag{14.10} \]
Let V be any balanced convex neighborhood of 0 in X. Define g(z) as the integral in the
right–hand–side of (14.10). As K = {f(w) : w ∈ Γ*} is compact in X, K ⊂ t0 V for some t0 > 0.
Since dw = i r e^{iθ} dθ on a + rS^1, for |z − a| < r/2 we have that
\[ |(w-a)^{-2}(w-z)^{-1}|\,|dw| \le 2r^{-2}\,d\theta. \]
It follows that the integrand in g is contained in 2r^{−2}K ⊂ 2r^{−2}t0 V; consequently,
g(z) ∈ 2r^{−2}t0 V. This shows that g(z) → 0 in X as z → a; hence, f is holomorphic.
Example 14.5.3. Suppose A is a Banach algebra and let x ∈ A. Let σ(x), ρ(x) and
r(x) be the spectrum, resolvent set and spectral radius of x. Theorem 12.14.2 shows that
f(λ) = (λe − x)^{−1} is weakly holomorphic on ρ(x) = C \ σ(x). Hence, f(λ) and the functions
λ^n f(λ), n ∈ Z+, are strongly holomorphic on ρ(x). If |λ| > ‖x‖, f(λ) = Σ_{m=0}^∞ λ^{−m−1} x^m
absolutely and uniformly on compact subsets of C \ B(0; ‖x‖). Denoting by Γr the circle of
radius r centered at 0, by Theorem 14.4.5, we have that for r > ‖x‖
\[ x^n = \frac{1}{2\pi i}\int_{\Gamma_r}\lambda^n f(\lambda)\,d\lambda, \qquad n\in\mathbb{Z}_+. \tag{14.11} \]
Since ρ(x) contains all λ with |λ| > r(x), Theorem 14.5.2 implies that the condition r > ‖x‖
in (14.11) can be replaced by r > r(x). For such r, let M(r) = max{‖f(λ)‖ : |λ| = r}. Then
\[ \|x^n\| \le r^{n+1} M(r). \]
Consequently limsup_n ‖x^n‖^{1/n} ≤ r and so, limsup_n ‖x^n‖^{1/n} ≤ r(x).
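The two conclusions of Example 14.5.3 can be observed in the Banach algebra of 2 × 2 matrices. The sketch below rests on assumptions made for illustration: a hypothetical element `X` with r(x) = 0.5 but norm close to 10, the maximum row sum norm, and a trapezoid rule on the circle |λ| = r in place of the contour integral in (14.11).

```python
import cmath
import math

# Hypothetical element of the algebra: spectral radius 0.5, norm about 10.
X = [[0.5, 10.0], [0.0, 0.5]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power(a, n):
    out = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        out = matmul(out, a)
    return out

def norm(a):
    # Maximum absolute row sum: the operator norm induced by the sup norm on C^2.
    return max(sum(abs(v) for v in row) for row in a)

def resolvent(lam, a):
    """(lam*e - a)^{-1} for a 2x2 matrix, by the adjugate formula."""
    m = [[lam - a[0][0], -a[0][1]], [-a[1][0], lam - a[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def cauchy_power(a, n, r, points=400):
    """Trapezoid rule for (1/(2*pi*i)) * integral over |lam| = r of lam^n (lam*e - a)^{-1} dlam."""
    acc = [[0j, 0j], [0j, 0j]]
    for k in range(points):
        lam = r * cmath.exp(2j * math.pi * k / points)
        res = resolvent(lam, a)
        # d(lam) = i*lam*d(theta), so the factors i and 2*pi cancel.
        w = lam ** (n + 1) / points
        for i in range(2):
            for j in range(2):
                acc[i][j] += w * res[i][j]
    return acc

print(norm(power(X, 5)) ** (1 / 5), norm(power(X, 200)) ** (1 / 200))
print(cauchy_power(X, 3, 1.0))
```

Here the contour radius r = 1 lies strictly between r(x) and ‖x‖, which is the situation covered by the remark that r > ‖x‖ can be replaced by r > r(x).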
The following result is an extension of the real-variable mean value theorem (see
Exercise 14.10.11) to the setting of differentiable functions in Banach spaces.
Theorem 14.6.3. (Mean value theorem) Suppose F ∈ C 1 (U, Y ) where U ⊂ X is convex.
For any x, y ∈ U ,
(14.15) kF (x) − F (y)k ≤ M (x, y) kx − yk
where M (x, y) = sup0≤t≤1 kF ′ (x + t(y − x))k.
Proof. The last statement is the simplest to prove. The differentiability of F implies that
for any unit vector u ∈ X
\[ \lim_{t\to 0}\frac{F(x+tu)-F(x)}{t} = F'(x)u, \]
where the limit is taken over t in the scalar field F. Therefore, from (14.16), we conclude that
\[ \sup_{\|u\|=1}\|F'(x)u\| \le M. \]
For the first statement, it is enough to assume that X and Y are Banach spaces over R.
Let x, y ∈ U be fixed. Let v ∈ Y* with ‖v‖ = 1. Define ϕ : [0, 1] → R as ϕ(t) =
(v ∘ F)(x + t(y − x)). Then, ϕ is differentiable in (0, 1) and, by the real–valued mean value
theorem, there is t* ∈ (0, 1) such that ϕ(1) − ϕ(0) = ϕ′(t*). Hence
\[ v\big(F(y)-F(x)\big) = v\big(F'(x+t_*(y-x))(y-x)\big) \le \|F'(x+t_*(y-x))\|\,\|y-x\| \le M(x,y)\,\|y-x\|. \]
Consequently, ‖F(y) − F(x)‖ = sup_{‖v‖=1} v(F(y) − F(x)) ≤ M(x, y)‖y − x‖.
The conclusion follows immediately.
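Inequality (14.15) can be tested numerically for a concrete map. The following sketch assumes X = Y = R^2 with Euclidean norms and a hypothetical map `F`; the supremum M(x, y) is approximated from below on a grid of the segment, so the comparison is only indicative.

```python
import math

def F(p):
    # Hypothetical C^1 map from R^2 to R^2.
    x, y = p
    return (math.sin(x) + y * y, x * y)

def jacobian_norm(p):
    """Largest singular value of F'(p), i.e. its operator norm for Euclidean norms."""
    x, y = p
    a, b, c, d = math.cos(x), 2.0 * y, y, x
    s = a * a + b * b + c * c + d * d
    det = a * d - b * c
    return math.sqrt((s + math.sqrt(max(s * s - 4.0 * det * det, 0.0))) / 2.0)

def mean_value_sides(p, q, samples=1000):
    """Return (|F(p) - F(q)|, M(p, q) * |p - q|) with the sup taken over a grid in t."""
    lhs = math.dist(F(p), F(q))
    m = max(
        jacobian_norm((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
        for t in (k / samples for k in range(samples + 1))
    )
    return lhs, m * math.dist(p, q)

print(mean_value_sides((0.0, 0.0), (1.0, 2.0)))
```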
The following results are immediate consequences of the mean value theorem.
Corollary 14.6.4. Suppose U ⊂ X is an open connected set in the Banach space X. A
function F ∈ C 1 (U, Y ) is constant iff F ′ = 0.
Proof. Exercise.
Corollary 14.6.5. Let X, Y be Banach spaces and U ⊂ X open. Suppose f ∈ C^1(U, Y).
For any x0 ∈ U and ε > 0, there exists a ball B(x0; r) ⊂ U such that
\[ \|f(u)-f(v)-f'(x_0)(u-v)\| < \varepsilon\|u-v\|, \qquad u,v\in B(x_0;r). \]
Proof. Consider the function g(x) = f(x) − f′(x0)x. The continuity of f′ at x0 implies
that there exists a ball B(x0; r) ⊂ U such that ‖f′(x) − f′(x0)‖ < ε whenever x ∈ B(x0; r).
Since tv + (1 − t)u ∈ B(x0; r) for any u, v ∈ B(x0; r) and 0 ≤ t ≤ 1,
\[ \begin{aligned} \|g(u)-g(v)\| &= \|f(u)-f(v)-f'(x_0)(u-v)\| \\ &\le \sup_{0\le t\le 1}\|g'(u+t(v-u))\|\,\|u-v\| \\ &= \sup_{0\le t\le 1}\|f'(tv+(1-t)u)-f'(x_0)\|\,\|u-v\| < \varepsilon\|u-v\|. \end{aligned} \]
Example 14.6.6. We are now in the position of proving that if ϕ ∈ C^1([a, b], E) then ϕ
is rectifiable and that ℓϕ(a, b) = ∫_{(a,b]} |ϕ′(t)|_E dt. Given ε > 0, there is δ > 0 such that
|t − s| < δ implies |ϕ′(t) − ϕ′(s)|_E < ε. Let B(x_j; δ/2), j = 1, . . . , N be a finite cover of [a, b].
If |s − t| < δ/2, then s, t ∈ B(x_j; δ) for some j and so, setting g_j(t) = ϕ(t) − ϕ′(x_j)t, we have
that
\[ \begin{aligned} |\phi(t)-\phi(s)-\phi'(x_j)(t-s)|_E &\le \sup_{0\le\lambda\le 1}|g_j'(t+\lambda(s-t))|_E\,|t-s| \\ &= \sup_{0\le\lambda\le 1}|\phi'(t+\lambda(s-t))-\phi'(x_j)|_E\,|t-s| < \varepsilon|t-s|. \end{aligned} \]
Let a = t0 < . . . < tn = b be any partition such that max_{0≤k≤n−1}(t_{k+1} − t_k) < δ/2. For each
j = 0, . . . , n − 1, there is x_{k_j} with t_{j+1}, t_j ∈ B(x_{k_j}; δ). Then
\[ \begin{aligned} \Big|\sum_{j=0}^{n-1}|\phi(t_{j+1})-\phi(t_j)|_E - \sum_{j=0}^{n-1}|\phi'(t_j)|_E\,(t_{j+1}-t_j)\Big| &\le \sum_{j=0}^{n-1}|\phi(t_{j+1})-\phi(t_j)-\phi'(t_j)(t_{j+1}-t_j)|_E \\ &\le \varepsilon(b-a) + \sum_{j=0}^{n-1}|\phi'(x_{k_j})-\phi'(t_j)|_E\,(t_{j+1}-t_j) \\ &\le 2\varepsilon(b-a). \end{aligned} \]
The conclusion follows immediately.
As a consequence
\[ \begin{aligned} e\big(F(x+v)-F(x)-L(x)v\big) &= e\Big(\int_{(0,1]}\big(L(x+tv)-L(x)\big)v\,dt\Big) \\ &\le \|v\|\int_{(0,1]}\|L(x+tv)-L(x)\|\,dt. \end{aligned} \]
Therefore
\[ F(x+v)-F(x)-L(x)v = o(v) \]
since L is continuous at x.
Theorem 14.7.1. (Uniform contraction principle) Suppose W and V are closed and open
subsets of Banach spaces X and Y respectively. Let F : W × V → W be a uniform
contraction and let x*(y) be the unique fixed point of F(·, y) : W → W.
(i) If F ∈ C(W × V, X) then, x* ∈ C(V, X).
Suppose W = Ū, the closure of an open subset U of X, and that F(Ū × V) ⊂ U.
(ii) If F ∈ C(Ū × V, X) and F ∈ C^r(U × V, X) (r ≥ 1) then, x* ∈ C^r(V, X) and
\[ x_*'(y) = \big(I - \partial_x F(x_*(y),y)\big)^{-1}\,\partial_y F(x_*(y),y), \qquad y\in V. \tag{14.18} \]
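Formula (14.18) can be checked on a scalar example. The sketch below assumes X = Y = R and a hypothetical uniform contraction `F(x, y) = cos(x)/2 + y`; it computes the fixed point x*(y) by Banach iteration and compares the right hand side of (14.18) with a difference quotient.

```python
import math

def F(x, y):
    # Hypothetical uniform contraction on X = Y = R: |dF/dx| <= 1/2 for every y.
    return 0.5 * math.cos(x) + y

def fixed_point(y, tol=1e-14, max_iter=200):
    """Banach iteration x <- F(x, y), started at 0."""
    x = 0.0
    for _ in range(max_iter):
        nxt = F(x, y)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    return x

def derivative_formula(y):
    """Right-hand side of (14.18): (1 - dF/dx)^{-1} dF/dy at (x*(y), y)."""
    x = fixed_point(y)
    dFdx = -0.5 * math.sin(x)
    dFdy = 1.0
    return dFdy / (1.0 - dFdx)

def derivative_numeric(y, h=1e-5):
    """Central difference quotient of y -> x*(y)."""
    return (fixed_point(y + h) - fixed_point(y - h)) / (2.0 * h)

print(derivative_formula(0.3), derivative_numeric(0.3))
```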
(ii) The assumption F (U ×V ) ⊂ U implies that x∗ maps V into U since x∗ (y) = F (x∗ (y), y).
By the chain rule,
The inverse part of mean value theorem 14.6.3 along with (14.17) shows that
Hence T is a uniform contraction and, by the first part of the proof, T has a continuous fixed
point z : V → L(Y, X). We will show that z is in fact the derivative of x∗ . We fix y ∈ V ,
and set B(y) = ∂x F (x∗ (y), y), A(y) = ∂y F (x∗ (y), y). Let h(k) := x∗ (y + k) − x∗ (y) for all
k small enough. The fixed point property of x* and z together with the differentiability of
F implies that for all k small enough
\[ \begin{aligned} (I-B(y))\big(h(k)-z(y)k\big) &= F(x_*(y+k),y+k) - F(x_*(y),y) - B(y)h(k) - A(y)k \\ &= F(x_*(y)+h(k),y+k) - F(x_*(y),y) - B(y)h(k) - A(y)k \\ &=: P(h(k),k), \end{aligned} \]
where ‖P(h, k)‖/(‖h‖ + ‖k‖) → 0 as (h, k) → (0, 0). From (14.21), (I − B(y)) ∈ L(X) is an
invertible operator with (I − B(y))^{−1} ∈ L(X). This shows that
\[ x_*(y+k) = x_*(y) + z(y)k + r(k), \]
where r(k) = o(k) as k → 0.
For r > 1, the result follows by induction. Suppose the result holds for r − 1, then at least
x* ∈ C^{r−1}(V, X). The fact that x* satisfies (14.19) implies that
\[ x_*'(y) = \big(I - \partial_x F(x_*(y),y)\big)^{-1}\,\partial_y F(x_*(y),y). \tag{14.22} \]
Since the map T ↦ T^{−1} from GL(X) to GL(X) is differentiable, it follows that x* ∈ C^r(V)
whenever F ∈ C^r(U × V, Y).
Remark 14.7.2. The continuity of x∗ in Theorem 14.7.1 holds if one assumes F ∈ C(U ×
V, X) and F (U × V ) ⊂ U . Theorem 14.7.1 holds when F ∈ C r (W × V, X), where W ⊂ X
is an open subset containing U and F (U × V ) ⊂ U .
Proof. Define G : Ω → X by
\[ G(x,y) = x - \big(\partial_x F(x_0,y_0)\big)^{-1}\big(F(x,y)-F(x_0,y_0)\big). \]
Observe that G has the same smoothness as F; moreover, x = G(x, y) is equivalent to
F(x, y) = F(x0, y0). Since ∂_x G(x0, y0) = 0, for any 0 < θ < 1 there exist open balls U and
V1 around x0 and y0 respectively, such that U × V1 ⊂ Ω and sup_{(x,y)∈U×V1} ‖∂_x G(x, y)‖ ≤
θ < 1. The mean value theorem implies that
\[ \|G(x,y)-G(x',y)\| \le \theta\|x-x'\|, \qquad x,x'\in U,\ y\in V_1. \]
Proof. Applying the implicit function theorem to F (x, y) = y − f (x) gives neighborhoods
U ′ ⊂ W and V ⊂ Y around x0 and y0 = f (x0 ) respectively, such that for each y ∈ V ,
there exists a unique g(y) ∈ U ′ satisfying y = f (g(y)). Moreover, the relation g : y 7→
g(y) is necessarily in C r (V, X). This uniqueness shows that f is injective in U ′ . The set
U = U ′ ∩ f −1 (V ) is an open neighborhood of x0 with V = f (U ), and thus, f : U −→ V
is a bijective function whose inverse is f^{−1} = g. Finally, equation (14.24) follows directly
from (14.23).
for all t ∈ I and x, y ∈ U. Then, for any (t0, x0) ∈ D, there exists δ > 0 and a function
x(·; (t0 , x0 )) ∈ C 1 ((t0 − δ, t0 + δ); Rn ) satisfying (14.25). Furthermore, there is η > 0 such
that on (t0 − η, t0 + η)2 × B(x0 ; η), the map (t, (τ, x)) 7→ x(t; (τ, x)) : I → Rn is continuous.
Proof. Existence and uniqueness of solutions. We first prove that for any point (t0, x0) ∈ D
we can find δ > 0 and a neighborhood V where (14.25) admits a solution with initial
conditions (τ, x) ∈ V in the interval (τ − δ, τ + δ). Let a, b > 0 be such that (t0, x0) ∈
K := [t0 − a, t0 + a] × B(x0; b) ⊂ D and (14.26) holds. Let m = sup_{(t,x)∈K} |f(t, x)| and
choose δ > 0 so that (i) δ < a/2, (ii) mδ ≤ b/2, and (iii) Lδ < 1. Define F as the family of
all continuous functions ϕ on Iδ := [−δ, δ] such that ϕ(0) = 0 and ‖ϕ‖_{u(Iδ)} ≤ b/2, equipped
with the uniform norm. For each (τ, x) ∈ K′ := (t0 − δ, t0 + δ) × B(x0; b/2), define the
transformation T_{(τ,x)} on F by
\[ T_{(\tau,x)}\phi(t) = \int_\tau^{t+\tau} f\big(s,\phi(s-\tau)+x\big)\,ds, \qquad t\in I_\delta. \]
Consequently, x(·, (τ, x)) is the only function on Iδ (τ ) solving (14.25) with x(τ ; (τ, x)) = x.
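The fixed point construction behind the transformation T_{(τ,x)} can be imitated numerically. The sketch below is an illustration under assumptions: a hypothetical vector field `f(s, x) = x` (Lipschitz constant L = 1), only the forward half [0, δ] of the interval, and a cumulative trapezoid rule in place of the exact integral.

```python
import math

def f(s, x):
    # Hypothetical vector field, Lipschitz in x with constant L = 1.
    return x

def picard_step(phi, grid, tau, x0):
    """Apply T_(tau, x0): (T phi)(t) = integral from tau to t + tau of f(s, phi(s - tau) + x0) ds.

    phi holds the values of the current iterate on grid (an increasing list
    starting at 0); the integral is approximated by the cumulative trapezoid rule.
    """
    values = [f(tau + t, p + x0) for t, p in zip(grid, phi)]
    out = [0.0]
    for k in range(1, len(grid)):
        out.append(out[-1] + 0.5 * (values[k] + values[k - 1]) * (grid[k] - grid[k - 1]))
    return out

def picard(tau, x0, delta, steps=2000, iterations=30):
    """Iterate T starting from phi = 0 and return the grid and x(t + tau) = phi(t) + x0."""
    grid = [delta * k / steps for k in range(steps + 1)]
    phi = [0.0] * len(grid)
    for _ in range(iterations):
        phi = picard_step(phi, grid, tau, x0)
    return grid, [p + x0 for p in phi]

grid, xs = picard(0.0, 1.0, 0.5)
print(xs[-1], math.exp(0.5))
```

With δ = 0.5 the condition Lδ < 1 holds, and the iterates approach the solution t ↦ e^t of ẋ = x, x(0) = 1 on [0, 0.5].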
that
|ϕ∗(t1 ,x1 ) (t) − ϕ∗(t2 ,x2 ) (s)| ≤ |ϕ∗(t1 ,x1 ) (t) − ϕ∗(t1 ,x1 ) (s)| + |ϕ∗(t1 ,x1 ) (s) − ϕ∗(t2 ,x2 ) (s)| < ε.
It follows that x : (t, (τ, x)) 7→ x(t; (τ, x)) is continuous on V = {(t, (τ, x)) : (τ, x) ∈
K ′ , |t − τ | < δ}.
For each point (t0, x0) in the domain of the vector field f, the solution x(t) = x(·; (t0, x0))
to (14.25) provided by Theorem 14.8.1 is only defined in a neighborhood of t0. Such a solution
can be extended uniquely to a continuously differentiable function defined in a maximal
interval. Suppose y(t) and z(t) are solutions to (14.25) with y(t0) = x0 = z(t0) defined in
an interval J containing Iδ(t0). We claim that y(t) ≡ z(t). Otherwise, there is an interval
Iδ(t0) ⊂ [a, b] ⊂ J such that y = z on [a, b] but in any neighborhood of b (or of a) there
is a point t′ with y(t′) ≠ z(t′). Applying Picard's construction around the point (b, y(b)) we
obtain a unique solution φ to the problem (14.25) with φ(b) = y(b) in an interval containing
b. This is a contradiction to y ≠ z. Hence, x(·; (t0, x0)) can be extended uniquely to a
maximal interval Iδ(t0) ⊂ J(t0, x0) as a continuously differentiable function (also denoted by
x) satisfying (14.25) on J(t0, x0).
Local continuity of solutions to (14.25) can be extended to the whole domain (maximal
interval) of definition.
Theorem 14.8.2. Consider the initial value problems
\[ \begin{aligned} \dot x(t) &= f(t,x(t)), & x(t_0) &= x_0, \\ \dot y(t) &= g(t,y(t)), & y(t_0) &= y_0. \end{aligned} \]
Assume that g satisfies the conditions of Theorem 14.8.1 and that
|f (t, x) − f (t, y)| ≤ L|x − y|
then
\[ |x(t;(t_0,x_0)) - y(t;(t_0,y_0))| \le \big(\delta + \varepsilon|t-t_0|\big)\exp\big(L|t-t_0|\big) \]
for all t in the intersection of the maximal domains of definition of x(·; (t0, x0)) and y(·; (t0, y0)).
Proof. For simplicity we set x(t) = x(t; (t0, x0)) and y(t) = y(t; (t0, y0)). Then, for t ≥ t0
\[ \begin{aligned} |x(t)-y(t)| &\le |x_0-y_0| + \Big|\int_{t_0}^t \big(f(s,x(s))-g(s,y(s))\big)\,ds\Big| \\ &\le \delta + \int_{t_0}^t |f(s,y(s))-g(s,y(s))|\,ds + \int_{t_0}^t |f(s,x(s))-f(s,y(s))|\,ds \\ &\le \delta + \varepsilon(t-t_0) + L\int_{t_0}^t |x(s)-y(s)|\,ds. \end{aligned} \]
Applying Gronwall's inequality with u(t) = |x(t) − y(t)|, α(t) = δ + ε(t − t0) and β = L we
obtain that
\[ |x(t)-y(t)| \le \big(\delta+\varepsilon(t-t_0)\big)\exp\big(L(t-t_0)\big). \]
For t ≤ t0 the proof is similar.
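The estimate just proved can be observed numerically. The sketch below assumes, as the proof suggests, that |x0 − y0| ≤ δ and |f − g| ≤ ε; the fields `f` and `g`, the constants `EPS`, `DELTA`, `LIP`, and the Runge–Kutta integrator are hypothetical choices made for the illustration.

```python
import math

def rk4(field, t0, x0, t1, steps=1000):
    """Classical Runge-Kutta integration of x' = field(t, x) from t0 to t1."""
    h = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = field(t, x)
        k2 = field(t + h / 2, x + h * k1 / 2)
        k3 = field(t + h / 2, x + h * k2 / 2)
        k4 = field(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

EPS, DELTA, LIP = 0.05, 0.01, 1.0

def f(t, x):
    # Lipschitz in x with constant LIP = 1.
    return math.sin(x)

def g(t, y):
    # Perturbation of f with |f - g| <= EPS everywhere.
    return math.sin(y) + EPS * math.cos(t)

def gap_and_bound(t, t0=0.0, x0=1.0):
    """Return |x(t) - y(t)| and the bound (DELTA + EPS|t - t0|) exp(LIP|t - t0|)."""
    gap = abs(rk4(f, t0, x0, t) - rk4(g, t0, x0 + DELTA, t))
    bound = (DELTA + EPS * abs(t - t0)) * math.exp(LIP * abs(t - t0))
    return gap, bound

print(gap_and_bound(1.0))
```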
Proof. By hypothesis, there is δ > 0 such that F(x) ≥ F(x0) for all x ∈ B(x0; δ) ⊂ U.
For any unit vector u ∈ X and |t| < δ we have that x0 + tu ∈ B(x0; δ). Define
g_u : t ↦ F(x0 + tu). Clearly g_u has a local minimum at t = 0 and it is differentiable at
t = 0. By a classical result of real–valued Calculus g_u′(0) = F′(x0)u = 0. As this holds for
any unit vector in X, we conclude that F′(x0) = 0.
In the remainder of this section, we will consider the problem of finding local extreme
points of a function under functional constraints.
Theorem 14.9.2. (Surjective Theorem) Let X, Y be Banach spaces and Ω ⊂ X open.
Assume that f ∈ C^1(Ω, Y) and that for some x0 ∈ Ω, f′(x0) has a right inverse in
L(Y, X). Then, f(Ω) contains an open ball around f(x0).
Proof. Let L ∈ L(Y, X) be a right inverse of A = f′(x0) and let c = ‖L‖. By
Corollary 14.6.5, there exists a ball B(x0; δ) ⊂ Ω such that
\[ \|f(u)-f(v)-A(u-v)\| < \frac{1}{2c}\|u-v\|, \qquad u,v\in B(x_0;\delta). \]
We will show that the ball B(y0; δ/(2c)) ⊂ f(Ω), where y0 = f(x0). For y ∈ B(y0; δ/(2c)),
we define inductively a sequence (xn : n ≥ 0) as follows. Starting at x0, we let
x1 = x0 + L(y − y0), and for n ≥ 1
\[ x_{n+1} = x_n - L\big(f(x_n)-f(x_{n-1})-A(x_n-x_{n-1})\big). \tag{14.27} \]
We show by induction that xn ∈ B(x0; δ) and ‖xn − xn−1‖ ≤ δ2^{−n}. Indeed, for n = 1
\[ \|x_1-x_0\| \le c\|y-y_0\| \le \frac{\delta}{2}, \]
and if the statement holds for n ≥ 1, then
\[ \|x_{n+1}-x_n\| \le c\,\|f(x_n)-f(x_{n-1})-A(x_n-x_{n-1})\| \le \frac{1}{2}\|x_n-x_{n-1}\| \le \delta 2^{-n-1}. \]
Hence,
\[ \|x_{n+1}-x_0\| \le \sum_{k=1}^{n+1}\|x_k-x_{k-1}\| \le \sum_{k=1}^{n+1}\delta 2^{-k} < \delta. \]
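The iteration (14.27) is easy to run in a small example. The sketch below assumes X = R^2, Y = R, a hypothetical map `f` with f′(0, 0)v = v1 + v2, and the right inverse `L(s) = (s/2, s/2)`; none of these choices come from the notes.

```python
def f(x):
    # Hypothetical C^1 map from X = R^2 to Y = R with f(0, 0) = 0.
    return x[0] + x[1] + x[0] * x[1]

def A(v):
    # A = f'(x0) at x0 = (0, 0): the linear functional v -> v1 + v2.
    return v[0] + v[1]

def L(s):
    # A right inverse of A: A(L(s)) = s for every real s.
    return (s / 2.0, s / 2.0)

def solve(y, x0=(0.0, 0.0), iterations=25):
    """Iteration (14.27): returns x with f(x) close to y, for y near f(x0)."""
    prev = x0
    step = L(y - f(x0))
    cur = (x0[0] + step[0], x0[1] + step[1])
    for _ in range(iterations):
        diff = (cur[0] - prev[0], cur[1] - prev[1])
        corr = L(f(cur) - f(prev) - A(diff))
        prev, cur = cur, (cur[0] - corr[0], cur[1] - corr[1])
    return cur

x = solve(0.1)
print(x, f(x))
```

Since A(x_{n+1} − x_n) cancels the previous linearization error, f(x_n) + A(x_{n+1} − x_n) = y at every step, so f(x_n) approaches y as the increments shrink.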
Proof. It suffices to consider the case when Y = F^n. Let {e_k : 1 ≤ k ≤ n} be the standard
basis for F^n. If A = f′(x0) is surjective, then there are {u_k : 1 ≤ k ≤ n} in X such
that Au_k = e_k. Define Le_k = u_k, k = 1, . . . , n and linearly extend L to all of Y. For any
y = Σ_{k=1}^n α_k e_k, we have
\[ \|Ly\| \le \sum_{k=1}^n |\alpha_k|\,\|u_k\| \le \Big(\sum_{k=1}^n |\alpha_k|^2\Big)^{1/2}\Big(\sum_{k=1}^n \|u_k\|^2\Big)^{1/2} = \|y\|\Big(\sum_{k=1}^n \|u_k\|^2\Big)^{1/2}. \]
This shows that L ∈ L(Y, X). Therefore, the conclusion of the result follows immediately
from the surjective theorem.
The following application of the preceding corollary of the Surjective Theorem concerns the
problem of nonlinear optimization with constraints.
Theorem 14.9.4. (Lagrange Multipliers) Let f, g1, . . . , gn be functions in C^1(Ω, R), where
Ω ⊂ X is an open subset of a Banach space X. Let M = {x ∈ Ω : g1(x) = · · · = gn(x) = 0}.
If x0 is a local maximum point of f restricted to M, then there exists a nontrivial linear
relation of the form
\[ \mu f'(x_0) + \sum_{k=1}^n \lambda_k g_k'(x_0) = 0. \tag{14.28} \]
Moreover, if {g_k′(x0) : 1 ≤ k ≤ n} is a linearly independent family in L(X, R), then µ ≠ 0
and there is a unique solution to
\[ f'(x_0) + \sum_{k=1}^n \lambda_k g_k'(x_0) = 0. \tag{14.29} \]
Proof. Let U be a ball around x0 such that f(x0) ≥ f(x) for all x ∈ U ∩ M. Let F : U →
R^{n+1} be the function given by
\[ F(x) = (f(x), g_1(x), \ldots, g_n(x))^\top. \]
For any r > f(x0), the vector (r, 0, . . . , 0)^⊤ ∉ F(U). Hence, F(U) does not contain any
open neighborhood of the point (f(x0), g1(x0), . . . , gn(x0))^⊤ = (f(x0), 0, . . . , 0)^⊤. Then,
F′(x0) is not surjective. Therefore, the range V of F′(x0) is a proper subspace of R^{n+1}. Let
(µ, λ)^⊤ = (µ, λ1, . . . , λn)^⊤ be a nonzero element in V^⊥. Then
\[ \mu f'(x_0)v + \sum_{k=1}^n \lambda_k g_k'(x_0)v = 0 \]
for all v ∈ X and (14.28) follows.
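Relation (14.29) can be observed in a two dimensional example. The sketch below is only an illustration: X = R^2, the hypothetical objective f(x, y) = x + y, and the single constraint g(x, y) = x² + y² − 1; the maximizer on M is located by brute force along a parametrization of the circle.

```python
import math

def f(p):
    return p[0] + p[1]

def grad_f(p):
    return (1.0, 1.0)

def g(p):
    return p[0] ** 2 + p[1] ** 2 - 1.0

def grad_g(p):
    return (2.0 * p[0], 2.0 * p[1])

def constrained_max(samples=100000):
    """Brute-force maximizer of f on M = {g = 0}, parametrized by the angle."""
    def point(k):
        theta = 2.0 * math.pi * k / samples
        return (math.cos(theta), math.sin(theta))
    best = max(range(samples), key=lambda k: f(point(k)))
    return point(best)

def multiplier(p):
    """Least-squares lambda with grad_f(p) + lambda * grad_g(p) = 0, and the residual norm."""
    gf, gg = grad_f(p), grad_g(p)
    lam = -(gf[0] * gg[0] + gf[1] * gg[1]) / (gg[0] ** 2 + gg[1] ** 2)
    residual = math.hypot(gf[0] + lam * gg[0], gf[1] + lam * gg[1])
    return lam, residual

p = constrained_max()
print(p, multiplier(p))
```

At the maximizer the residual of f′(x0) + λ g′(x0) is numerically zero, while at a generic point of M it is not.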
14.10. Exercises
Exercise 14.10.1. Suppose that E is a Banach space. Show that if f is an E–valued
measurable function, then Λf is measurable for all Λ ∈ E ∗ , where E ∗ is the space of
continuous linear functionals on E.
Exercise 14.10.2. Any f ∈ C^Ω has a unique representation f = u + iv, where u, v ∈ R^Ω.
Show that f is measurable iff u, v ∈ MR. In either case, |f| = √(u² + v²) ∈ MR.
Exercise 14.10.3. Show that k k∗ defines a complete seminorm on the space FC∗ of
complex–valued functions with finite mean k k∗C.
Exercise 14.10.4. Let CΩ ∋ f = u + i v, where u, v ∈ RΩ . Show that
The next four exercises deal with basic results on real valued functions of a single
variable.
Exercise 14.10.9. Suppose f : (a, b) → R has a local minimum at some point x0 ∈ (a, b).
If f is differentiable at x0 , show that f ′ (x0 ) = 0.
Exercise 14.10.10. (Rolle's theorem) Suppose f : [a, b] → R (−∞ < a < b < ∞) is a
continuous function, and that f is differentiable in (a, b). If f(a) = f(b), show that there is
c ∈ (a, b) such that f ′ (c) = 0.
Exercise 14.10.11. (General Mean value theorem) Suppose f, g : [a, b] → R (−∞ < a <
b < ∞) are continuous functions, and that both f and g are differentiable in (a, b). Show
that there is a point c ∈ (a, b) such that g ′ (c)(f (b) − f (a)) = f ′ (c)(g(b) − g(a)). (Hint:
consider the function h(x) = (f(x) − f(a))(g(b) − g(a)) − (f(b) − f(a))(g(x) − g(a)).) This
result is known as Cauchy's mean value theorem; the case g(x) = x is the classical mean value theorem of Lagrange.
Exercise 14.10.12. (L’Hôpital’s rule) Suppose f and g are real valued functions defined
in an interval I. For a ∈ I, suppose that limx→a f (x) = 0 = limx→a g(x).
(a) If f and g are differentiable at a and g′(a) ≠ 0, show that
\[ \lim_{x\to a}\frac{f(x)}{g(x)} = \frac{f'(a)}{g'(a)}. \]
(Hint: Without loss of generality, assume f(a) = 0 = g(a). Apply the mean value
theorem (general) to f and g over the interval with endpoints a and x ∈ I.)
Exercise 14.10.13. (Taylor approximation) Suppose f is a real-valued function defined in
an interval I.
(a) (Peano's residual) Suppose f has n finite derivatives at a ∈ I, that is, f′(a), . . . , f^{(n)}(a)
exist. Show that
\[ r_n(x) := f(x) - \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x-a)^k = o\big((x-a)^n\big). \]
(Hint: Apply L'Hôpital's rule (a) and (b) to r_n(x) and (x − a)^n.)
Suppose f and g admit n ≥ 0 continuous derivatives in (α, β) ⊂ I, and that f (n+1) and
g (n+1) exist in (α, β). Let α < a < β and fix x ∈ (α, β). Define
\[ F(t) = \sum_{k=0}^n \frac{f^{(k)}(t)}{k!}(x-t)^k, \qquad G(t) = \sum_{k=0}^n \frac{g^{(k)}(t)}{k!}(x-t)^k. \]
(b) Show that F(x) = f(x) and F′(t) = ((x − t)^n / n!) f^{(n+1)}(t), and similarly, G(x) = g(x),
G′(t) = ((x − t)^n / n!) g^{(n+1)}(t).
(c) For any x ∈ [α, β] and x ≠ a, show that there is ξ between a and x such that
\[ \Big(f(x) - \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x-a)^k\Big)\, g^{(n+1)}(\xi) = \Big(g(x) - \sum_{k=0}^n \frac{g^{(k)}(a)}{k!}(x-a)^k\Big)\, f^{(n+1)}(\xi). \]
(d) (Lagrange's residual) Show that there is a point ξ between a and x such that
\[ f(x) = \sum_{k=0}^n \frac{f^{(k)}(a)}{k!}(x-a)^k + \frac{f^{(n+1)}(\xi)}{(n+1)!}(x-a)^{n+1}. \tag{14.30} \]
Exercise 14.10.14. In Theorem 12.6.10 it was shown that the group GL(X) of bounded
operators on a Banach space X whose inverses are also bounded is open in L(X). Show
that the map T 7→ T −1 on GL(X) is differentiable and compute its derivative.
Exercise 14.10.15. Let X and Y be two normed spaces. If T ∈ L(X, Y), show that
x ↦ T(x) is differentiable everywhere and that T′(x) = T.
Chapter 15
Observe that g_t(x) := e^{ix·t} satisfies |g_t| ≡ 1; hence g_t ∈ L1(|µ|) for all t ∈ R^n and so, µ̂
is a well defined function from R^n to C.
Example 15.1.2. The Bernoulli measure η^p_{a,b} (0 ≤ p ≤ 1, a ≠ b) on R is given by
η^p_{a,b} := (1 − p)δ_a + pδ_b. This measure corresponds to flipping a biased coin that results in
heads up (with a value of b) with probability p or tails up (with a value of a) with probability
1 − p. Its characteristic function is given by (η^p_{a,b})^∧(t) = (1 − p)e^{ita} + pe^{itb}. Special cases are
the symmetric Bernoulli measure where η = ½(δ_{−1} + δ_1), in which case η̂(t) = cos t; the
Bernoulli 0–1 measure with probability of success η^p_{0,1}({1}) = p where η^p_{0,1} = pδ_1 + (1 − p)δ_0,
in which case (η^p_{0,1})^∧(t) = pe^{it} + (1 − p).
Example 15.1.3. The uniform distribution on R over (a, b) is the measure
U_{a,b}(dx) = (1/(b − a)) 1_{(a,b)}(x) dx. Its characteristic function is
\[ \widehat{U_{a,b}}(t) = \frac{e^{ibt}-e^{iat}}{it(b-a)}. \]
Example 15.1.4. The exponential distribution E(λ) with parameter λ > 0 is given by
µ(dx; λ) = λe^{−λx} 1_{[0,∞)}(x) dx. Its characteristic function is
\[ \hat\mu(t;\lambda) = \int_0^\infty \lambda e^{ixt}e^{-\lambda x}\,dx = \lambda\,\frac{e^{(it-\lambda)x}}{it-\lambda}\Big|_0^\infty = \frac{\lambda}{\lambda-it}. \]
For λ = 1, the reflected exponential is µ_r(dx) = e^x 1_{(−∞,0]}(x) dx and thus, µ̂_r(t) = 1/(1 + it).
The double exponential distribution ν(dx) = ½(µ_r + µ)(dx) = ½e^{−|x|} dx has characteristic
function
\[ \hat\nu(t) = \frac{1}{2(1-it)} + \frac{1}{2(1+it)} = \frac{1}{1+t^2}. \]
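The closed forms in Example 15.1.4 can be compared with direct numerical integration. The sketch below is an illustration under assumptions: the integrals are truncated at a hypothetical `cutoff` and evaluated with a composite Simpson rule, so agreement is only up to quadrature error.

```python
import cmath
import math

def simpson(h_func, a, b, n=20000):
    """Composite Simpson rule for a complex-valued integrand (n even)."""
    step = (b - a) / n
    total = h_func(a) + h_func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * h_func(a + k * step)
    return total * step / 3.0

def exponential_cf(t, lam, cutoff=60.0):
    """Numerical characteristic function of E(lam); the tail beyond cutoff is dropped."""
    return simpson(lambda x: lam * cmath.exp(1j * x * t - lam * x), 0.0, cutoff)

def double_exponential_cf(t, cutoff=60.0):
    """Numerical characteristic function of (1/2) e^{-|x|} dx."""
    return simpson(lambda x: 0.5 * cmath.exp(1j * x * t - abs(x)), -cutoff, cutoff)

print(exponential_cf(1.5, 2.0), 2.0 / (2.0 - 1.5j))
print(double_exponential_cf(2.0), 1.0 / (1.0 + 2.0 ** 2))
```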
Theorem 15.1.5. Suppose that µ and ν are complex measures (measures of finite variation)
on B(R^d). Then, µ = ν iff µ̂ = ν̂.
Proof. Denote by f_t(x) = exp(i x · t) and consider the collection M of all such functions;
observe that f_0 ≡ 1 ∈ M. This is a complex multiplicative family contained in the space of
all bounded complex valued Borel measurable functions V. The latter is a complex vector
space and a bounded class. By the Complex Bounded Class Theorem, V contains all
the bounded complex valued σ(M)–measurable functions, which contains in particular all
functions of the form 1_B, B ∈ σ(M). Since µ and ν coincide in M, then by Dominated
Convergence they also coincide in σ(M). Consider the maps γ_t(x) = t · x, with t ∈ R^d and
observe that they generate B(R^d). Since
\[ \gamma_t(x) = t\cdot x = -i\lim_n n\big(f_{t/n}(x)-f_0(x)\big), \]
Proof. Without loss of generality assume t > 0. Then for some θ ∈ (−π, π],
\[ 1 = e^{-i\theta}\hat\mu(t) = \int\cos(xt-\theta)\,\mu(dx). \]
By Theorem 4.3.11, x ↦ cos(xt − θ) = 1 µ–a.s. Hence, supp µ ⊂ θ/t + (2π/t)Z.
Theorem 15.1.7. Suppose µ is a probability measure on R^n. If |µ̂(t)| = 1 in a small
neighborhood of 0, then µ = δ_b for some b ∈ R^n.
Given a positive finite measure µ, its reflection µr is given by µr (A) = µ(−A) for all
A ∈ B(Rn ). Then, for any bounded measurable function f
\[ \int f(x)\,\mu_r(dx) = \int f(-x)\,\mu(dx) \]
real and
\[ \hat\mu(t) = \int\cos(x\cdot t)\,\mu(dx). \]
15.1.1. Smoothness of the Fourier transform. Here we present an analysis of the
relation between moments of a measure and the degree of smoothness of its Fourier transform.
For any t ∈ R^n and α ∈ Z^n_+ we denote t^α = t_1^{α_1} · · · t_n^{α_n}, |α| = Σ_{j=1}^n α_j, and
α! = α_1! · · · α_n!.
Lemma 15.1.8. For any n ∈ Z+ and x ∈ R
\[ \Big| e^{ix} - \sum_{k=0}^n \frac{(ix)^k}{k!} \Big| \le \min\Big\{ \frac{|x|^{n+1}}{(n+1)!},\ \frac{2|x|^n}{n!} \Big\}. \tag{15.1} \]
Proof. Let h_{−1}(x) := e^{ix}, and h_n(x) := e^{ix} − Σ_{k=0}^n (ix)^k/k! for n ≥ 0. It is easy to check that
\[ h_n(x) = i\int_0^x h_{n-1}(s)\,ds, \qquad n \ge 0. \tag{15.2} \]
Since |h_0(t)| ≤ 2 and |h_{−1}| = 1, it follows from (15.2) that |h_0(x)| ≤ |x| ∧ 2. By induction,
if (15.1) holds for n − 1 then, from (15.2) we obtain that
\[ |h_n(x)| \le \min\Big\{ \frac{|x|^{n+1}}{(n+1)!},\ \frac{2|x|^n}{n!} \Big\}. \]
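The bound (15.1) can be spot checked on a grid. The sketch below is a plain numerical comparison with hypothetical helper names; a small absolute slack is needed in any such check because the remainder is computed in floating point.

```python
import cmath
import math

def remainder(x, n):
    """|e^{ix} - sum_{k<=n} (ix)^k / k!|."""
    partial = sum((1j * x) ** k / math.factorial(k) for k in range(n + 1))
    return abs(cmath.exp(1j * x) - partial)

def bound(x, n):
    """Right-hand side of (15.1)."""
    return min(abs(x) ** (n + 1) / math.factorial(n + 1),
               2.0 * abs(x) ** n / math.factorial(n))

for x in (0.1, 1.0, 5.0, 20.0):
    print(x, remainder(x, 3), bound(x, 3))
```

For small |x| the first term of the minimum is the sharper one, while for large |x| the second term takes over.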
Theorem 15.1.9. Suppose that µ is a complex measure on (R^n, B(R^n)). If
\[ \int_{\mathbb{R}^n}|x_j|^m\,|\mu|(dx) < \infty, \]
then the partial derivative ∂_j^m µ̂ exists, is uniformly continuous, and
\[ \partial_j^k\hat\mu(t) = i^k\int_{\mathbb{R}^n}x_j^k\,e^{ix\cdot t}\,\mu(dx), \qquad 0\le k\le m. \tag{15.3} \]
Moreover, if |x|^m = (Σ_{j=1}^n x_j²)^{m/2} ∈ L1(|µ|), then µ̂ ∈ C^m(R^n), and
\[ \hat\mu(t) = \sum_{0\le|\alpha|\le m}\frac{i^{|\alpha|}}{\alpha!}\,t^\alpha\int x^\alpha\,\mu(dx) + o(|t|^m). \tag{15.4} \]
Proof. Let f = dµ/d|µ|. Since ‖µ‖ is finite, ∫|x_j|^k |µ|(dx) < ∞ for all 0 ≤ k ≤ m. We proceed
by induction. For k = 0 there is nothing to prove. Suppose the statement is valid for
0 ≤ k < m. Since |(e^{ix_j h} − 1)/h| ≤ |x_j|, by dominated convergence we get that
\[ \begin{aligned} \lim_{h\to 0}\frac{(\partial_j^k\hat\mu)(t+he_j)-(\partial_j^k\hat\mu)(t)}{h} &= \lim_{h\to 0}\int (ix_j)^k e^{ix\cdot t}\,\frac{e^{ix_jh}-1}{h}\,f(x)\,|\mu|(dx) \\ &= i^{k+1}\int x_j^{k+1}e^{ix\cdot t}f(x)\,|\mu|(dx). \end{aligned} \]
Proof. We give a simple ODE proof of this fact. First note that
\[ \hat\mu(t) = \frac{1}{\sqrt{2\pi}}\int e^{ixt}e^{-x^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\int\cos(xt)\,e^{-x^2/2}\,dx. \]
Theorem 15.1.9 and integration by parts show that
\[ \hat\mu'(t) = -\frac{1}{\sqrt{2\pi}}\int x\sin(xt)\,e^{-x^2/2}\,dx = -\frac{1}{\sqrt{2\pi}}\int t\cos(xt)\,e^{-x^2/2}\,dx = -t\hat\mu(t). \]
Therefore, µ̂ satisfies the equation
\[ \hat\mu'(t) + t\hat\mu(t) = 0; \qquad \hat\mu(0) = 1. \]
The unique solution to this initial value problem is µ̂(t) = e^{−t²/2}.
The last statement follows from (15.4) and µ̂(t) = Σ_{n=0}^∞ (i^{2n}/(2^n n!)) t^{2n}.
The following result relates the smoothness of the Fourier transform of a measure to the
existence of finite moments.
Theorem 15.1.11. Let µ be a finite positive measure on (R^n, B(R^n)). If ∂^α µ̂(0) exists and
is finite for all |α| = 2m then, µ̂ ∈ C^{2m}(R^n); furthermore, for all α ∈ Z^n_+ with |α| = 2m,
∫|x^α| µ(dx) < ∞ and ∂^α µ̂(t) = i^{|α|} ∫ x^α e^{ix·t} µ(dx).
By Fatou's lemma
\[ \int x_j^2\,\mu(dx) \le \liminf_{h\to 0}\, 2\int\frac{1-\cos(hx_j)}{h^2}\,\mu(dx) = -\limsup_{h\to 0}\frac{\hat\mu(hu_j)-2\hat\mu(0)+\hat\mu(-hu_j)}{h^2} = -\partial_j^2\hat\mu(0). \]
Hence the claim holds for k = 1.
for all 1 ≤ j ≤ n. By applying the case k = 1 to each measure of the form x_j^{2k} µ(dx) we obtain
that ∫ x_j^{2(k+1)} µ(dx) < ∞ for all 1 ≤ j ≤ n and that
∂_j^{2(k+1)} µ̂(t) = i^{2(k+1)} ∫ x_j^{2(k+1)} e^{ix·t} µ(dx).
This completes our induction argument.
It follows from
\[ |x_1+\ldots+x_n|^{2m} \le n^{2m-1}\big(x_1^{2m}+\ldots+x_n^{2m}\big) \]
that x^α ∈ L1(µ) for all α ∈ Z^n_+ with |α| = 2m, and from Theorem 15.1.9,
∂^α µ̂(t) = i^{2m} ∫ x^α e^{ix·t} µ(dx).
Lemma 15.1.12. Suppose µ is a complex measure on (R, B(R)) of finite variation. If
∫ e^{δ0|x|} |µ|(dx) < ∞ for some δ0 > 0, then µ̂(z) = ∫ e^{izx} µ(dx) has an analytic extension to
the strip D = {z ∈ C : |Im(z)| < δ0}. Furthermore, for any z ∈ D
\[ \hat\mu'(z) = i\int x\,e^{izx}\,\mu(dx). \]
Proof. The ideas in the proof of Theorem 10.6.5 provide a proof for the present lemma.
Set f := dµ/d|µ|. Then |f| = 1 |µ|–a.s. and µ = f · |µ|. As |e^{izx}| ≤ e^{δ0|x|} for any z ∈ D, the map
z ↦ ∫ e^{izx} µ(dx) is a continuous extension of µ̂ to D. For a + ib = z ∈ D fixed, let δ1 > 0 be
such that B(z; δ1) ⊂ D. Clearly, |b| + δ1 < δ0 and, since
\[ \delta_1|x e^{izx}| \le e^{\delta_0|x|}, \]
xe^{izx} ∈ L1(|µ|(dx)). The convexity of the exponential function implies that for any |h| < δ1,
\[ \Big|\frac{e^{ihx}-1}{h}\Big| \le \frac{e^{|x||h|}-1}{|h|} \le \frac{e^{|x|\delta_1}-1}{\delta_1} \le \frac{e^{\delta_1|x|}+e^{-\delta_1|x|}}{\delta_1}. \]
Dominated convergence implies that µ̂′(z) exists and
\[ \begin{aligned} \hat\mu'(z) = \lim_{h\to 0}\frac{\hat\mu(z+h)-\hat\mu(z)}{h} &= \int e^{izx}\lim_{h\to 0}\frac{e^{ihx}-1}{h}\,f(x)\,|\mu|(dx) \\ &= i\int x e^{izx} f(x)\,|\mu|(dx) = i\int x e^{izx}\,\mu(dx). \end{aligned} \]
This shows that µ̂ ∈ H(D).
can be extended analytically to the strip H = {z ∈ C : |Im(z)| < ½δ}. Our assumption
implies that (µ_h)^∧(n)(0) = i^n (p_n, h) = 0, where (µ_h)^∧(n) denotes the n–th derivative of
(µ_h)^∧, and so (µ_h)^∧(z) ≡ 0. This means that h = 0 µ–a.s.
which is a contradiction.
Remark 15.1.18. For any positive number a and any vector h we define the dilation by a, $\delta_a$, and the translation by h, $\tau_h$, as the operators mapping any function g(x) into g(ax) and g(x − h) respectively. It is left as an exercise (see Exercise 15.9.4) to show that the Fourier transform satisfies
(a’) $(e^{2\pi i x\cdot h} f(x))^\wedge(y) = (\tau_h\hat f)(y)$.
(b’) $(\tau_h f)^\wedge(y) = e^{-2\pi i y\cdot h}\hat f(y)$.
(d’) $(\delta_a f)^\wedge(y) = a^{-n}\hat f(a^{-1}y)$.
Theorem 15.1.19. Suppose $f \in L_1(\mathbb{R}^n, \lambda_n)$. Let A be an invertible linear transformation on $\mathbb{R}^n$ and set $f_A = f\circ A$. Then,
$$\widehat{f_A}(y) = \frac{1}{|\det(A)|}\,\hat f\big((A^\intercal)^{-1}y\big).$$
In particular, if f is a radial function, so is $\hat f$.
Proof. The first statement is a direct application of the change of variables formula for Lebesgue measure on $\mathbb{R}^n$. For the last statement, recall that f is radial iff $f = f_U$ for all orthogonal linear transformations U (i.e. $U \in O(n)$). Since $|\det(U)| = 1$ and $(U^\intercal)^{-1} = U$, we get $\hat f(y) = \widehat{f_U}(y) = \hat f(Uy)$ for all $U \in O(n)$ and so, $\hat f$ is radial.
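As a numerical sanity check of the formula in Theorem 15.1.19, the following Python sketch compares both sides for the Gaussian $f(x) = e^{-\pi|x|^2}$ on $\mathbb{R}^2$, using that this f is its own Fourier transform under the convention $\hat f(y) = \int f(x)e^{-2\pi i x\cdot y}\,dx$. The matrix `A`, the test point `y`, the grid parameters and all function names are illustrative assumptions, not part of the notes; the integral is approximated by a plain Riemann sum.

```python
import cmath
import math

def gaussian(x1, x2):
    """f(x) = exp(-pi |x|^2), which is its own Fourier transform."""
    return math.exp(-math.pi * (x1 * x1 + x2 * x2))

# Illustrative invertible matrix (an assumption of this sketch); det(A) = 2.
A = ((2.0, 1.0), (0.0, 1.0))
DET_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]

def f_A(x1, x2):
    """f_A = f o A."""
    return gaussian(A[0][0] * x1 + A[0][1] * x2, A[1][0] * x1 + A[1][1] * x2)

def fourier_2d(func, y, half_width=5.0, step=0.1):
    """Riemann-sum approximation of the integral of func(x) exp(-2 pi i x.y) over R^2."""
    n = int(round(half_width / step))
    total = 0j
    for i in range(-n, n + 1):
        x1 = i * step
        for j in range(-n, n + 1):
            x2 = j * step
            total += func(x1, x2) * cmath.exp(-2j * math.pi * (x1 * y[0] + x2 * y[1]))
    return total * step * step

def inverse_transpose_apply(y):
    """Return (A^T)^{-1} y for the 2x2 matrix A."""
    (a, b), (c, d) = A
    return ((d * y[0] - c * y[1]) / DET_A, (-b * y[0] + a * y[1]) / DET_A)

y = (0.3, -0.2)
lhs = fourier_2d(f_A, y)                                   # numerical transform of f o A
rhs = gaussian(*inverse_transpose_apply(y)) / abs(DET_A)   # right hand side of the theorem
print(abs(lhs - rhs))
```

The printed difference should be at the level of rounding error, since Riemann sums of Gaussians on a uniform grid converge very quickly.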
The following result will be very useful when we study regularity properties of the Fourier
transform of integrable functions, as well as of the operations discussed in Section 15.2.
Theorem 15.1.20. Suppose 1 ≤ p < ∞, and let f ∈ Lp (Rn , λn ). Then, the mapping
τ : Rn −→ Lp (Rn , λn ) given by t 7→ τt f = f (· − t) is uniformly continuous.
Proof. We first prove the result for continuous functions of compact support. Suppose that $g \in C_{00}(\mathbb{R}^n)$ and that $\mathrm{supp}(g) \subset B(0, a)$; then g is uniformly continuous. Given ε > 0, by uniform continuity of g there is a 0 < δ < a such that |s − t| < δ implies
$$|g(s) - g(t)| < (\lambda(B(0, 3a)))^{-1/p}\,\varepsilon.$$
Hence, for |s − t| < δ,
$$\int |g(x - t) - g(x - s)|^p\,dx = \|\tau_t g - \tau_s g\|_p^p = \|\tau_{t-s} g - g\|_p^p < \varepsilon^p.$$
Proof. The first statement is clear from the definition of $\hat f$. Uniform continuity follows from
$$|\hat f(y) - \hat f(s)| \le \int_{\mathbb{R}^n} |f(x)|\,|e^{-2\pi i x\cdot(y-s)} - 1|\,dx$$
and dominated convergence. To prove that $\hat f$ vanishes at infinity, notice that since $e^{\pi i} = -1$ then
$$\hat f(y) = -\int f(x)\,e^{-2\pi i\left(x + \frac{y}{2|y|^2}\right)\cdot y}\,dx = -\int f\Big(x - \frac{y}{2|y|^2}\Big)e^{-2\pi i x\cdot y}\,dx.$$
Hence,
$$2\hat f(y) = \int \Big(f(x) - f\Big(x - \frac{y}{2|y|^2}\Big)\Big)e^{-2\pi i x\cdot y}\,dx,$$
whence $2|\hat f(y)| \le \|f - \tau_h f\|_1$ with $h = \frac{y}{2|y|^2}$. From Theorem 15.1.20 we conclude that $\hat f(y) \to 0$ as $|y| \to \infty$.
15.2. Convolution
Definition 15.2.1. Suppose that µ and ν are two complex Borel measures on $\mathbb{R}^n$. The convolution of µ with ν is the Borel measure µ ∗ ν defined by
$$(\mu * \nu)(E) = \int_{\mathbb{R}^n\times\mathbb{R}^n} 1_E(x + y)\,(\mu\otimes\nu)(dx, dy) = \int_{\mathbb{R}^n}\nu(E - x)\,\mu(dx) = \int_{\mathbb{R}^n}\mu(E - y)\,\nu(dy). \tag{15.5}$$
(Here, (15.5) follows from Fubini’s theorem. Thus µ ∗ ν = ν ∗ µ.)
Proof. By definition
$$\int g\,d(\mu * \nu) = \int_{\mathbb{R}^n\times\mathbb{R}^n} g(x + y)\,(\mu\otimes\nu)(dx, dy)$$
for all g ∈ L1 (µ ⊗ ν). Then, the first two statements follow directly from Radon–Nikodym’s
theorem together with Fubini’s theorem. To obtain (15.6), for each t ∈ Rn , define gt (x) :=
Proof. We will first show that supp(µ) + supp(ν) ⊂ supp(µ ∗ ν). Let $x_0 \in \mathrm{supp}(\mu)$ and $y_0 \in \mathrm{supp}(\nu)$. It is enough to show that $(\mu*\nu)(x_0 + y_0 + U) > 0$ for any open neighborhood U of 0. Choose an open neighborhood V of 0 such that V + V ⊂ U. Then, $1_{\{x_0+V\}}(x)\,1_{\{y_0+V\}}(y) \le 1_{\{x_0+y_0+U\}}(x + y)$. Integration with respect to µ ⊗ ν gives the desired result, for $\mu(x_0 + V)\,\nu(y_0 + V) > 0$.
To obtain the converse inclusion, suppose that z ∈ supp(µ ∗ ν). Let X = supp(µ) and Y = supp(ν). Then, for any ε > 0
$$0 < (\mu*\nu)\big(B(z;\varepsilon)\big) = \int_X \nu\big(Y\cap(B(z;\varepsilon) - x)\big)\,\mu(dx).$$
This means that for some x ∈ X, $\nu\big(Y\cap(B(z;\varepsilon) - x)\big) > 0$. This in turn implies that there exists $y \in Y\cap(B(z;\varepsilon) - x)$, that is $x + y \in B(z;\varepsilon)\cap(X + Y)$. This shows that $z \in \overline{X + Y}$.
Example 15.2.5. Suppose that µ is a (positive) Radon measure on [0, ∞). The renewal measure associated to µ is defined as $U = \sum_{n=0}^\infty \mu^{*n}$, where $\mu^{*0} := \delta_0$. It is left as an exercise to show that U is a Radon measure on $\mathbb{R}_+$ when µ(0, ∞) > 0 (see Exercise 15.9.18). Here we show that (a) $\mathrm{supp}(U) = \overline{\bigcup_{n=0}^\infty \mathrm{supp}(\mu^{*n})}$, and (b) supp(U) is closed under addition.
To check (a), set $F_n := \mathrm{supp}(\mu^{*n})$ and let $F = \bigcup_n F_n$. Clearly $F \subset \overline{F} \subset \mathrm{supp}(U)$. Each $(F_n)^c$ is open in $\mathbb{R}_+$ and $\mu^{*n}(F_n^c) = 0$; hence, $U(\overline{F}^c) = \sum_n \mu^{*n}(\overline{F}^c) \le \sum_n \mu^{*n}(F_n^c) = 0$. This means that $\mathrm{supp}(U) \subset \overline{F}$ since supp(U) is the smallest closed set whose complement has zero U–measure. To check (b), let x, y ∈ supp(U) and let W be a ball around 0. Let V be another ball around 0 such that V + V ⊂ W. Then, for some n, m ∈ N, $\mu^{*n}(x + V)\,\mu^{*m}(y + V) > 0$. Hence $\mu^{*(n+m)}(x + y + W) > 0$. This shows that x + y ∈ supp(U).
Example 15.2.6. Let µ be a complex Borel measure on $\mathbb{R}^n$. For $f \in L_1^{loc}(\mathbb{R}^n, \lambda_n)$, consider the measure $\nu_f(dx) := f(x)\,dx$. This defines a linear operator $T : f \mapsto \nu_f * \mu$. It is obvious that $\nu_f * \mu \ll \lambda_n$ and that
$$\frac{d(\nu_f * \mu)}{d\lambda_n}(x) = \int f(x - y)\,\mu(dy).$$
We define $(\mu * f)(x) := \int f(x - y)\,\mu(dy)$ for any $f \in L_1^{loc}(\lambda_n)$. If $f \in L_1$, then by Fubini’s theorem we have
$$\|Tf\|_1 \le \|\mu\|\,\|f\|_1$$
where ‖µ‖ is the total variation of µ. If $f \in L_\infty(\mathbb{R}^n)$ then
$$\|Tf\|_\infty \le \|\mu\|\,\|f\|_\infty.$$
Both the Marcinkiewicz and the Riesz interpolation theorems show that T is of strong type (p, p) for any p such that 1 < p < ∞. Furthermore, the Riesz theorem gives
$$\|Tf\|_p \le \|\mu\|\,\|f\|_p.$$
From Example 15.2.6 it follows that when $\nu_f = f\,d\lambda_n$ and $\nu_g = g\,d\lambda_n$ then, $\nu_f * \nu_g \ll \lambda_n$ and $\frac{d(\nu_f*\nu_g)}{d\lambda_n}(x) = \int_{\mathbb{R}^n} f(x - y)\,g(y)\,dy$. This leads to the following definition.
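Proof. (i) Let z ∈ supp(f ∗ g). We will show that $B(z;\varepsilon)\cap(\mathrm{supp}(f) + \mathrm{supp}(g)) \ne \emptyset$ for every ε > 0. By definition, there exists $z_\varepsilon \in B(z;\varepsilon)$ such that
$$0 < |f*g(z_\varepsilon)| \le \int |f(z_\varepsilon - y)\,g(y)|\,dy = \int_{\mathrm{supp}(g)} |f(z_\varepsilon - y)\,g(y)|\,dy.$$
Then, $|f(z_\varepsilon - y)\,g(y)| > 0$ for some y ∈ supp(g). Hence, $z_\varepsilon \in \mathrm{supp}(f) + \mathrm{supp}(g)$.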
(ii) follows by applying Theorem 15.2.2 to the measures $\nu_f = f\,d\lambda_n$ and $\nu_g = g\,d\lambda_n$, and by recalling that by definition, $\hat f(t) = \widehat{\mu_f}(-2\pi t)$ for all $f \in L_1(\mathbb{R}^n, \lambda_n)$. A more direct proof may be obtained by direct application of Fubini’s theorem along with the translation invariance of Lebesgue’s measure, for instance, $|(f*g)(x)| \le \int |f(x - y)||g(y)|\,dy$, and
$$\|f*g\|_1 \le \int\!\!\int |f(x - y)||g(y)|\,dy\,dx = \int |g(y)|\int |f(x - y)|\,dx\,dy = \|f\|_1\|g\|_1.$$
Proof. Suppose there is such g. Then $\hat g\hat f = \hat f$ for all $f \in L_1(\mathbb{R}^n, \lambda_n)$. Taking $f(x) = e^{-\pi|x|^2}$
Remark 15.2.10. The space $L_1(\lambda_n)$ with the addition operation and scalar product induced by pointwise evaluation, is a complex Banach space. Convolution makes $L_1(\lambda_n)$ a Banach ring, and Corollary 15.2.9 implies that $L_1(\lambda_n)$ has no unit. The Radon–Nikodym theorem shows that the map from $L_1(\lambda_n)$ to the space of Borel complex measures $M(\mathbb{R}^n)$ given by $f \mapsto f\cdot\lambda_n$ is an isometry. Hence, by considering $L_1(\lambda_n)$ as a subspace of $M(\mathbb{R}^n)$, we have that $\mathrm{span}(L_1(\lambda_n)\cup\{\delta_0\})$ is a Banach algebra with unit $\delta_0$. Indeed, for any $f, g \in L_1(\mathbb{R}^n)$ and a, b ∈ C,
$$(f + a\delta_0)*(g + b\delta_0) = (f*g + ag + bf) + ab\,\delta_0,$$
$$\|f + a\delta_0\|_{TV} = \int_{\mathbb{R}^n}|f|\,d\lambda_n + |a|\,\delta_0(\mathbb{R}^n) = \|f\|_1 + |a|.$$
Theorem 15.2.11. (Young) Let 1 ≤ r, p, q ≤ ∞ satisfy $\frac1r = \frac1p + \frac1q - 1$. If $f \in L_p(\mathbb{R}^n, \lambda_n)$ and $g \in L_q(\mathbb{R}^n, \lambda_n)$, then $f*g \in L_r(\lambda_n)$ and
$$\|f*g\|_r \le \|f\|_p\,\|g\|_q.$$
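Young's inequality can be illustrated numerically in the discrete setting. The sketch below is not the $\mathbb{R}^n$ statement of Theorem 15.2.11 itself: it checks the analogous inequality for finitely supported sequences on $\mathbb{Z}$ with counting measure, where the same bound holds with constant 1. The random test sequences, the seed, the exponent pairs and the helper names are assumptions made only for this illustration.

```python
import random

def lp_norm(seq, p):
    """l_p norm of a finite sequence (p may be float('inf'))."""
    if p == float("inf"):
        return max(abs(v) for v in seq)
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)

def convolve(f, g):
    """Discrete convolution (f * g)(n) = sum_k f(k) g(n - k)."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def young_gap(f, g, p, q):
    """Return ||f||_p ||g||_q - ||f * g||_r, where 1/r = 1/p + 1/q - 1."""
    inv_r = 1.0 / p + 1.0 / q - 1.0
    r = float("inf") if inv_r == 0 else 1.0 / inv_r
    return lp_norm(f, p) * lp_norm(g, q) - lp_norm(convolve(f, g), r)

rng = random.Random(0)
f = [rng.uniform(-1, 1) for _ in range(40)]
g = [rng.uniform(-1, 1) for _ in range(25)]
pairs = [(1.0, 1.0), (1.0, 2.0), (1.5, 1.5), (2.0, 2.0)]
gaps = [young_gap(f, g, p, q) for p, q in pairs]
print(gaps)  # every entry should be nonnegative
```

The pair (2, 2) corresponds to r = ∞, which is the case of conjugate exponents.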
Proof. Without loss of generality, we may assume that 1 ≤ p < ∞. By Hölder’s inequality and translation invariance of Lebesgue measure we have
$$|(f*g)(x + h) - (f*g)(x + k)| \le \int |f(x + h - y) - f(x + k - y)|\,|g(y)|\,dy \le \|\tau_{-(k-h)}f - f\|_p\,\|g\|_q.$$
Uniform continuity follows directly from Theorem 15.1.20.
To prove the last statement, let $\{f_k\}\cup\{g_k\} \subset C_{00}(\mathbb{R}^n)$ be such that $\lim_k\|f_k - f\|_p = 0 = \lim_k\|g_k - g\|_q$ and $\mathrm{supp}(f_k)\cup\mathrm{supp}(g_k) \subset B(0; a_k)$. Then, $f_k*g_k \in C_{00}(\mathbb{R}^n)$, $\mathrm{supp}(f_k*g_k) \subset B(0; 2a_k)$ and, by Hölder’s inequality,
$$\|f*g - f_k*g_k\|_u \le \|f - f_k\|_p\|g\|_q + \|f_k\|_p\|g - g_k\|_q.$$
We conclude that $f*g \in C_0$ and hence, it is uniformly continuous.
Theorem 15.2.13. Let f ∈ L1 (Rn , λn ). If ϕ ∈ C k (Rn ) and ∂ α ϕ is bounded for all 0 ≤
|α| ≤ k, then f ∗ ϕ ∈ C k and ∂ α (f ∗ ϕ) = f ∗ (∂ α ϕ) = (∂ α ϕ) ∗ f .
For the following result we will make use of Stokes’ theorem from differential topology.
Lemma 15.2.14. Suppose f and g are functions in $C^1(\mathbb{R}^d)$ such that $g\,\partial_j f$ and $f\,\partial_j g$ are in $L_1(\mathbb{R}^d, \lambda_d)$. If $\lim_{|x|\to\infty}|x|^{d-1}f(x)g(x) = 0$ then
$$\int f\,\partial_j g = -\int g\,\partial_j f. \tag{15.8}$$
Proof. Let $B_r$ denote the ball of radius r in $\mathbb{R}^d$ centered at 0, $S_r = \partial B_r$, $\sigma_r(du)$ the surface measure on $S_r$, and u(x) = x/‖x‖ the outer unit normal vector to $S_r$ in the direction of x. By Stokes’ theorem
$$\int_{B_r} f\,\partial_j g = \int_{S_r} f g\,u_j\,d\sigma_r - \int_{B_r} g\,\partial_j f = r^{d-1}\int_{S_1} f(ru)\,g(ru)\,u_j\,\sigma_1(du) - \int_{B_r} g\,\partial_j f,$$
Remark 15.2.15. In the setting of probability theory, if X and Y are independent random
vectors in Rn defined on a common probability space, then the law of X + Y is the
convolution of the laws of X and Y .
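For discrete laws the statement of Remark 15.2.15 reduces to a finite sum, which the following Python sketch makes concrete for two independent fair dice. The dictionary representation of a probability mass function and the function name `convolve_pmf` are choices made for this illustration only.

```python
from fractions import Fraction

def convolve_pmf(p, q):
    """Law of X + Y for independent X ~ p and Y ~ q: (p * q)(z) = sum_x p(x) q(z - x)."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, Fraction(0)) + px * qy
    return out

die = {face: Fraction(1, 6) for face in range(1, 7)}   # law of one fair die
two_dice = convolve_pmf(die, die)                       # law of the sum of two dice

for total in sorted(two_dice):
    print(total, two_dice[total])
```

Exact rational arithmetic is used so that the total mass of the convolution can be compared with 1 without rounding.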
15.2.1. Convolution of distributions and test functions. Using the notion of convo-
lution of a locally integrable function with a complex measure introduced in Example 15.2.6,
we show how to define convolution of distributions and test functions.
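Suppose $u \in \mathcal{D}^*(\mathbb{R}^n)$. Since the maps $\phi \mapsto \tau_x\phi$ and $\phi \mapsto \tilde\phi$, where $\tau_x\phi(y) = \phi(y - x)$ and $\tilde\phi(y) = \phi(-y)$, are continuous maps from $\mathcal{D}(\mathbb{R}^n)$ into itself, it follows that
$$(\tau_x u)(\phi) := u(\tau_{-x}\phi), \qquad (u*\phi)(x) := u(\tau_x\tilde\phi), \qquad \phi \in \mathcal{D}(\mathbb{R}^n),$$
are well defined for each $x \in \mathbb{R}^n$; in particular, $\tau_x u$ is a distribution. Recall that for any complex measure µ, $u_\mu(\phi) = \int\phi\,d\mu$. For each $x \in \mathbb{R}^n$ define the measure $(\tau_x\mu)(A) := \mu(A - x)$. Then
$$\tau_x u_\mu(\phi) = \int\tau_{-x}\phi\,d\mu = \int\phi(y + x)\,\mu(dy) = u_{\tau_x\mu}(\phi),$$
$$(u_\mu*\phi)(x) = \int\tau_x\tilde\phi(y)\,\mu(dy) = \int\phi(x - y)\,\mu(dy) = \mu*\phi(x).$$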
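Proof. (i) Notice that $\tau_y\tau_{-x} = \tau_{y-x}$ and $\widetilde{\tau_x\phi}(z) = \tau_x\phi(-z) = \phi(-z - x) = \tilde\phi(z + x) = \tau_{-x}\tilde\phi(z)$. Consequently, for any $y \in \mathbb{R}^n$ and $x \in \mathbb{R}^n$
$$\big(\tau_x(u*\phi)\big)(y) = u*\phi(y - x) = u(\tau_{y-x}\tilde\phi),$$
$$\big((\tau_x u)*\phi\big)(y) = (\tau_x u)(\tau_y\tilde\phi) = u(\tau_{y-x}\tilde\phi),$$
$$\big(u*(\tau_x\phi)\big)(y) = u(\tau_y\widetilde{\tau_x\phi}) = u(\tau_{y-x}\tilde\phi).$$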
15.3. Approximation to the identity 473
that is,
$$\tau_x\widetilde{\partial^\alpha\phi} = (-1)^{|\alpha|}\,\partial^\alpha(\tau_x\tilde\phi). \tag{15.9}$$
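Consider a collection $\{K_\varepsilon : \varepsilon > 0\} \subset L_1(\mathbb{R}^n, \lambda_n)$ that satisfies the following properties:
(i) $\int_{\mathbb{R}^n} K_\varepsilon(x)\,dx = a$ for all ε > 0.
(ii) $\sup_{\varepsilon>0}\|K_\varepsilon\|_1 < \infty$.
(iii) $\int_{|x|>\delta}|K_\varepsilon(x)|\,dx \to 0$ as ε → 0, for each δ > 0.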
Theorem 15.3.1. Suppose $\{K_\varepsilon : \varepsilon > 0\} \subset L_1(\mathbb{R}^n, \lambda_n)$ satisfy (i)–(iii) above. Then, for any $f \in L_p(\mathbb{R}^n, \lambda_n)$, 1 ≤ p < ∞,
$$\lim_{\varepsilon\to0}\|f*K_\varepsilon - a f\|_p = 0. \tag{15.10}$$
Theorem 15.1.20 along with assumption (ii) implies that for any η > 0, there exists δ > 0 such that $M\|\tau_y f - f\|_p < \eta/2$ whenever |y| ≤ δ. By assumption (iii), for some ε′ > 0 we have $2\|f\|_p\int_{|y|>\delta}|K_\varepsilon(y)|\,dy < \eta/2$ whenever ε < ε′. Combining these facts, we obtain
$$\|f*K_\varepsilon - af\|_p \le \int_{|y|\le\delta}\|\tau_y f - f\|_p\,|K_\varepsilon(y)|\,dy + \int_{|y|>\delta}\|\tau_y f - f\|_p\,|K_\varepsilon(y)|\,dy \le \frac{\eta}{2} + \frac{\eta}{2}$$
whenever 0 < ε < ε′.
The second statement follows similarly. Let η > 0 be fixed. If f is continuous at x, then for some δ > 0, |x − u| ≤ δ implies that $|f(x) - f(u)| < \frac{\eta}{2M}$. For such δ > 0, there is ε′ > 0 such that $2\|f\|_\infty\int_{|x|>\delta}|K_\varepsilon(x)|\,dx < \frac{\eta}{2}$ whenever 0 < ε < ε′. Putting these statements together gives
$$|f*K_\varepsilon(x) - af(x)| \le \int |f(x - y) - f(x)|\,|K_\varepsilon|(y)\,dy \le \Big(\int_{|y|\le\delta} + \int_{|y|>\delta}\Big)|f(x - y) - f(x)|\,|K_\varepsilon|(y)\,dy \le \frac{\eta}{2} + \frac{\eta}{2}. \tag{15.11}$$
If f is bounded and uniformly continuous, then δ > 0 can be chosen so that
$$\sup_{|v-u|<\delta}|f(v) - f(u)| < \frac{\eta}{2M}.$$
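of integrable functions. For any $\phi \in L_1(\mathbb{R}^n, \lambda_n)$, define $\phi_\varepsilon(x) = \varepsilon^{-n}\phi(\varepsilon^{-1}x)$. It is a simple exercise to show that $\{\phi_\varepsilon : \varepsilon > 0\}$ satisfies (i) and (ii); as for (iii),
$$\int_{\{|x|>\delta\}}\varepsilon^{-n}|\phi(\varepsilon^{-1}x)|\,dx = \int_{\{|u|>\delta/\varepsilon\}}|\phi(u)|\,du \xrightarrow{\varepsilon\to0} 0.$$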
Example 15.3.2. (Mollification) Let $U \subset \mathbb{R}^n$ be an open set and let f be a function that is locally integrable in U; that is, $f \in L_1(V)$ for any compact subset V of U. For any ε > 0, let $U_\varepsilon = \{x \in U : d(x, \partial U) > \varepsilon\}$. Let η be a nonnegative function in $\mathcal{D}(\mathbb{R}^n)$ with support in B(0; 1) and $\int\eta(x)\,dx = 1$, and define $\eta_\varepsilon(x) = \varepsilon^{-n}\eta(\varepsilon^{-1}x)$ for all ε > 0. The mollification of f by η is defined as
$$f^\varepsilon(x) := \int_U \eta_\varepsilon(x - y)f(y)\,dy = \int_{B(x;\varepsilon)}\eta_\varepsilon(x - y)f(y)\,dy, \qquad x \in U_\varepsilon.$$
Lemma 15.3.3. Let η ≥ 0 be a mollifier with support in B(0; 1) and $\int\eta(x)\,dx = 1$. Suppose $f \in L_1^{loc}(U)$ and let $f^\varepsilon$ be its mollification by η. Then,
(i) $\eta_\varepsilon*f \in C^\infty(U_\varepsilon)$, and $\partial^\alpha f^\varepsilon(x) = \partial^\alpha\eta_\varepsilon*f(x)$ for all $x \in U_\varepsilon$ and $\alpha \in \mathbb{Z}^n_+$.
(ii) $f^\varepsilon$ converges to f a.s. in U as ε → 0.
(iii) If f ∈ C(U) then the convergence in (ii) is uniform in compact subsets of U.
(iv) Suppose 1 ≤ p < ∞ and $f \in L_p^{loc}(U)$. For any relatively compact set $V \subset \overline{V} \subset U$,
$$\|f^\varepsilon - f\|_{L_p(V)} \xrightarrow{\varepsilon\to0} 0.$$
Proof. (i) Fix ε > 0 and let x ∈ Uε . Then, there is δ > 0 such that
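(ii) For x ∈ U, there is $\varepsilon_0 > 0$ such that $x \in U_\varepsilon$ for all $0 < \varepsilon \le \varepsilon_0$. Let $C = \|\eta\|_\infty\,\omega_n$. By Theorem 11.1.7
$$|f^\varepsilon(x) - f(x)| = \Big|\int_{B(x;\varepsilon)}\eta_\varepsilon(x - y)\big(f(y) - f(x)\big)\,dy\Big| \le \frac{1}{\varepsilon^n}\int_{B(x;\varepsilon)}\eta\Big(\frac{x - y}{\varepsilon}\Big)|f(y) - f(x)|\,dy$$
$$\le C\,\frac{1}{\lambda_n(B(x;\varepsilon))}\int_{B(x;\varepsilon)}|f(y) - f(x)|\,dy \xrightarrow{\varepsilon\to0} 0 \tag{15.12}$$
whenever x is a Lebesgue point of f. Hence, $f^\varepsilon \to f$ a.s. in U as ε → 0.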
(iii) If V is a relatively compact subset of U then there is another relatively compact set W with $\overline{V} \subset W \subset \overline{W} \subset U$. The function f is uniformly continuous on W and so, the limit in (15.12) is uniform in x ∈ V.
(iv) Let W be relatively compact such that $\overline{V} \subset W \subset \overline{W} \subset U$. For all ε > 0 small enough $W \subset U_\varepsilon$. By assumption $1_W f \in L_p(\mathbb{R}^n)$, and for any x ∈ V,
$$f^\varepsilon(x) = \int_{B(x;\varepsilon)}\eta_\varepsilon(x - y)f(y)\,dy = \int\eta_\varepsilon(x - y)\,1_W f(y)\,dy = \eta_\varepsilon*(1_W f)(x).$$
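Proof. Fix 1 ≤ p < ∞ and let $f \in L_p(\mathbb{R}^n)$. Let $\eta \in \mathcal{D}(\mathbb{R}^n)$ be a mollifier such that $\int\eta(x)\,dx = 1$, and define $\eta_\varepsilon(x) = \varepsilon^{-n}\eta(\varepsilon^{-1}x)$. Given δ > 0, there is $g \in C_{00}(\mathbb{R}^n)$ such that $\|f - g\|_p < \frac{\delta}{2}$. Since $\mathrm{supp}(\eta_\varepsilon*g) \subset B(0;\varepsilon) + \mathrm{supp}(g)$, $\{\eta_\varepsilon*g : \varepsilon > 0\} \subset \mathcal{D}(\mathbb{R}^n)$ by Theorem 15.2.13. It follows from Theorem 15.3.1 that $\|\eta_\varepsilon*g - g\|_p \to 0$ as ε → 0. Hence, for all ε > 0 small enough we have that $\|g*\eta_\varepsilon - g\|_p < \frac{\delta}{2}$, and
$$\|f - g*\eta_\varepsilon\|_p \le \|f - g\|_p + \|g - g*\eta_\varepsilon\|_p < \delta.$$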
The following two classical examples are very important and will be used in the analysis
of the invertibility of the Fourier transform.
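where $c_n = \Gamma\big(\frac{n+1}{2}\big)\,\pi^{-(n+1)/2}$. Integration in polar coordinates followed by the change of variables r = tan θ gives
$$\int_{\mathbb{R}^n}\frac{1}{(1 + |x|^2)^{(n+1)/2}}\,dx = \sigma_{n-1}\int_0^\infty\frac{r^{n-1}}{(1 + r^2)^{(n+1)/2}}\,dr = \sigma_{n-1}\int_0^{\pi/2}\sin^{n-1}\theta\,d\theta = \frac12\sigma_n = \frac{\pi^{(n+1)/2}}{\Gamma[(n+1)/2]}.$$
Observe that P satisfies the condition in Theorem 15.3.9. Thus, the family of kernels $P_\varepsilon(x) = c_n\,\frac{\varepsilon}{(\varepsilon^2 + |x|^2)^{(n+1)/2}}$ is an approximation to the identity.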
Example 15.3.6. In this example we show that the Poisson kernel $P_\varepsilon$ introduced in Example 15.3.5 is related to the function $\rho(x) = e^{-2\pi|x|}$ through the identity $\hat\rho(y) = P_1(y)$. Using the inverse Fourier transform of the Cauchy distribution in R and applying Fubini’s theorem we obtain that for β > 0,
$$e^{-\beta} = \int_0^\infty\frac{2\cos\beta x}{\pi(1 + x^2)}\,dx = \frac{2}{\pi}\int_0^\infty\cos(\beta x)\int_0^\infty e^{-u(1+x^2)}\,du\,dx = \frac{2}{\pi}\int_0^\infty e^{-u}\int_0^\infty e^{-ux^2}\cos(\beta x)\,dx\,du$$
$$= \frac{1}{\pi}\int_0^\infty e^{-u}\int_{-\infty}^\infty e^{-ux^2}e^{-i\beta x}\,dx\,du = \int_0^\infty\frac{e^{-u}}{\sqrt{\pi u}}\,e^{-\frac{\beta^2}{4u}}\,du.$$
satisfies the conditions of Theorem 15.3.9 and $\int_{\mathbb{R}^n}W(x)\,dx = 1$. Hence, the collection of functions $W_\varepsilon(x) = \frac{1}{(2\pi\varepsilon^2)^{n/2}}\,e^{-|x|^2/2\varepsilon^2}$ is an approximation to the identity.
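The convergence $\|f*W_\varepsilon - f\|_1 \to 0$ for the Gaussian kernels can be observed in a small experiment. The sketch below works in dimension n = 1 with the illustrative choice $f = 1_{[-1,1]}$ (an assumption of this example), for which $f*W_\varepsilon$ has a closed form in terms of the normal distribution function; the $L_1$ distance is then approximated by a midpoint rule. Function names and grid parameters are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def indicator(x):
    """f = 1_[-1, 1]."""
    return 1.0 if -1.0 <= x <= 1.0 else 0.0

def smoothed(x, eps):
    """(f * W_eps)(x) for f = 1_[-1,1] and W_eps the N(0, eps^2) density, in closed form."""
    return normal_cdf((x + 1.0) / eps) - normal_cdf((x - 1.0) / eps)

def l1_error(eps, half_width=6.0, step=1e-3):
    """Midpoint-rule approximation of the L1 distance between f * W_eps and f."""
    cells = int(round(2 * half_width / step))
    total = 0.0
    for i in range(cells):
        x = -half_width + (i + 0.5) * step
        total += abs(smoothed(x, eps) - indicator(x))
    return total * step

epsilons = (0.4, 0.2, 0.1, 0.05)
errors = [l1_error(eps) for eps in epsilons]
for eps, err in zip(epsilons, errors):
    print(eps, err)
```

For this particular f the error is essentially linear in ε, because the smoothing only affects a neighborhood of width of order ε around each of the two jumps.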
The Poisson and the Gaussian kernels given above are radial, that is, they come from
renormalization of integrable radial functions. A large class of good kernels {Kε : ε > 0}
found in applications can be dominated by a family of radial kernels, and it is possible to
obtain a.s. convergence results.
Lemma 15.3.8. Let µ be either a complex measure or a σ–finite measure on $\mathcal{B}(\mathbb{R}^n)$. Suppose $\psi \in L_1(\mathbb{R}^n, \lambda_n)$ is a nonnegative decreasing radial function. Then, for any $x \in \mathbb{R}^n$
$$|\psi*\mu(x)| \le M\mu(x)\,\|\psi\|_1.$$
In particular, if $f \in L_1^{loc}(\mathbb{R}^n, \lambda_n)$ then,
$$|\psi*f(x)| \le Mf(x)\,\|\psi\|_1.$$
Proof. Fix $x \in \mathbb{R}^n$ and let $\mu_x$ be the measure given by $\mu_x(A) = \mu(A + x)$. If $E = \{(y, t) \in \mathbb{R}^n\times[0,\infty) : \psi(y) > t\}$ then, by assumption on ψ, $E^t = \{y : \psi(y) > t\}$ is a ball around the origin. Fubini’s theorem implies that
$$|\psi*\mu(x)| \le \int\psi(y)\,|\mu_x|(dy) = \int_{\mathbb{R}^n}\int_0^\infty 1_E\,dt\,d|\mu_x| = \int_0^\infty|\mu_x|(\psi > t)\,dt \le M\mu_x(0)\int_0^\infty\lambda(\psi > t)\,dt = M\mu(x)\,\|\psi\|_1.$$
Theorem 15.3.9. Let $\{K_\varepsilon : \varepsilon > 0\}$ be a family of good kernels in $\mathbb{R}^n$ such that $a = \int_{\mathbb{R}^n}K_\varepsilon(x)\,dx$. Suppose $\varphi_0$ is a nonnegative, decreasing function in [0, ∞) such that
Proof. Let $T_r f(x) = \frac{1}{\lambda_n(B(x;r))}\int_{B(0;r)}|f(x - y) - f(x)|\,dy$. Since $f \in L_p$, then Mf(x) < ∞ and $\lim_{r\to0}T_r f(x) = 0$ at every Lebesgue point x of f. Let x be such a point. (i) Since ψ is a $\lambda_n$–integrable nonnegative radial function, it follows that
$$\int_{\mathbb{R}^n}\psi(x)\,dx \ge \int_{r/2\le|x|\le r}\psi(x)\,dx = \int_{r/2}^r\int_{S^{n-1}}\varphi_0(s)\,s^{n-1}\,\sigma_{n-1}(du)\,ds \ge \omega_n\,\frac{2^n - 1}{2^n}\,r^n\varphi_0(r). \tag{15.14}$$
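Let $I_k$ denote the k–th term in the sum (15.15). Since ψ is nonincreasing, we have
$$I_k \le \frac{c_n}{\omega_n(2^{k+1}\varepsilon)^n}\int_{2^k\varepsilon<|y|\le2^{k+1}\varepsilon}|f(x - y) - f(x)|\Big(\int_{\frac{|y|}{2\varepsilon}<|z|\le\frac{|y|}{\varepsilon}}\psi(z)\,dz\Big)dy$$
$$\le \frac{c_n}{\omega_n(2^{k+1}\varepsilon)^n}\int_{|y|\le2^{k+1}\varepsilon}|f(x - y) - f(x)|\,dy\int_{2^{k-1}<|z|\le2^{k+1}}\psi(z)\,dz \le c_n\,T_{2^{k+1}\varepsilon}f(x)\int_{2^{k-1}<|z|\le2^{k+1}}\psi(z)\,dz.$$
Since $\sum_{k\in\mathbb{Z}}\int_{2^{k-1}<|z|\le2^{k+1}}\psi(z)\,dz \le 2\|\psi\|_1 < \infty$, for any $\varepsilon_1 > 0$ there is $K_0$ big enough so that
$$\sum_{|k|>K_0}\int_{2^{k-1}<|z|\le2^{k+1}}\psi(z)\,dz < \varepsilon_1.$$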
where $c_n = \frac{1}{2\pi}\int_0^{2\pi}f(y)e^{-iny}\,dy$. For functions in $L_2(\mathbb{S}^1)$, the Fourier series (15.16) has a precise geometrical interpretation.
Conversely, if $(c_n) \in L_2(\mathbb{Z})$, then $S_N(x) = \sum_{|n|\le N}c_n e^{inx}$ converges to a function $f \in L_2(\mathbb{S}^1)$ whose n–th Fourier coefficient is $c_n$.
Proof. The space $L_2(\mathbb{S}^1)$ is a Hilbert space with inner product $\langle f, g\rangle = \frac{1}{2\pi}\int_{-\pi}^\pi f(x)\overline{g(x)}\,dx$. The sequence $E = \{e_n : n \in \mathbb{Z}\}$, where $e_n(x) = e^{inx}$, is an orthonormal collection in $L_2(\mathbb{S}^1)$ that separates points of $\mathbb{S}^1$. On the other hand, $e_je_k = e_{j+k}$, $\overline{e_k} = e_{-k}$ and $e_0 = 1$. By the Stone–Weierstrass theorem the algebra A generated by E is dense in $C(\mathbb{S}^1)$, and so A is dense in $L_2(\mathbb{S}^1)$. The first conclusion follows from Parseval’s theorem.
Conversely, as $\|S_n - S_m\|^2_{L_2(\mathbb{S}^1)} = \sum_{n<|j|\le m}|c_j|^2$ for all n < m, it follows that $S_n$ is Cauchy in $L_2(\mathbb{S}^1)$. Hence, $S_n$ converges to some $f \in L_2(\mathbb{S}^1)$ and $c_n = \langle f, e_n\rangle$ for all n ∈ Z.
Example 15.4.2. (sawtooth function) Let f be the 2π–periodic piecewise smooth odd function defined as f(0) = 0 and $f(x) = \frac12(\pi - x)$ for all 0 < x < 2π. Then $f \in L_2(\mathbb{S}^1)$ and its Fourier series is given by
$$f(x) \sim \frac{1}{2i}\sum_{|n|\ge1}\frac{e^{inx}}{n} = \sum_{n=1}^\infty\frac{\sin nx}{n}.$$
We will study the convergence of Fourier series by studying a particular kernel operator. For each $n \in \mathbb{Z}_+$ consider the sum
$$D_n(x) = \sum_{|k|\le n}e^{ikx} = \frac{e^{-inx} - e^{i(n+1)x}}{1 - e^{ix}} = \frac{\sin\big((n + \frac12)x\big)}{\sin(x/2)}.$$
The n–th Dirichlet kernel in the unit circle is given by $\frac{1}{2\pi}1_{[-\pi,\pi]}D_n$. If $f \in L_1(\mathbb{S}^1)$ then, from the periodicity of $D_n$ and f, we have that
$$S_nf(x) = \sum_{|k|\le n}c_ke^{ikx} = \frac{1}{2\pi}\int_{-\pi}^\pi D_n(x - y)f(y)\,dy = \frac{1}{2\pi}\int_{-\pi}^\pi D_n(y)f(x - y)\,dy = \frac{1}{2\pi}\int_{-\pi}^\pi D_n(y)f(x + y)\,dy.$$
Notice that $\tilde f = f\,1_{[-2\pi,2\pi]} \in L_1(\mathbb{R})$. For any |x| ≤ π we have [x − π, x + π] ⊂ [−2π, 2π]; thus,
$$\int_{\mathbb{R}}\tilde f(x - y)e^{iky}1_{[-\pi,\pi]}(y)\,dy = \int_{-\pi}^\pi f(x - y)e^{iky}\,dy = e^{ikx}\int_{-\pi}^\pi f(y)e^{-iky}\,dy = e^{ikx}\int_{-\pi}^\pi\tilde f(y)e^{-iky}\,dy \tag{15.17}$$
for all k ∈ Z. Hence, $S_nf(x) = \frac{1}{2\pi}\,\tilde f*\big(1_{[-\pi,\pi]}D_n\big)(x)$ for all |x| ≤ π and $n \in \mathbb{Z}_+$.
Notice that for each n ∈ N, $\frac{1}{2\pi}\int_{-\pi}^\pi D_n(x)\,dx = 1$. As |sin(t)| ≤ |t|,
$$\int_{-\pi}^\pi|D_n(y)|\,dy \ge 2\int_{-\pi}^\pi\frac{|\sin((n + \frac12)y)|}{|y|}\,dy = 4\int_0^{(n+\frac12)\pi}\frac{|\sin t|}{|t|}\,dt \ge 4\int_\pi^{n\pi}\frac{|\sin t|}{|t|}\,dt$$
$$= 4\sum_{k=1}^{n-1}\int_{k\pi}^{(k+1)\pi}\frac{|\sin t|}{|t|}\,dt \ge \frac{8}{\pi}\sum_{k=1}^{n-1}\frac{1}{k+1} \ge \frac{8}{\pi}\log\frac{n+1}{2}.$$
From this, we conclude that the Dirichlet kernels do not constitute a family of good kernels. This in turn suggests that the convergence of Fourier series is intricate, and may even fail for continuous functions.
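The logarithmic growth of $\int_{-\pi}^\pi|D_n|$ can be observed numerically. The following Python sketch approximates the integral with a midpoint rule and prints it next to the reference value $(8/\pi)\log n$; the number of cells and the function names are assumptions of this illustration.

```python
import math

def dirichlet(n, x):
    """D_n(x) = sin((n + 1/2) x) / sin(x / 2), for x not a multiple of 2 pi."""
    return math.sin((n + 0.5) * x) / math.sin(0.5 * x)

def l1_norm_dirichlet(n, cells=20000):
    """Midpoint-rule approximation of the integral of |D_n| over [-pi, pi]."""
    step = math.pi / cells
    half = sum(abs(dirichlet(n, (i + 0.5) * step)) for i in range(cells))
    return 2.0 * half * step   # |D_n| is even, so integrate over [0, pi] and double

ns = [1, 2, 4, 8, 16, 32, 64]
norms = [l1_norm_dirichlet(n) for n in ns]
for n, value in zip(ns, norms):
    print(n, round(value, 4), round(8.0 / math.pi * math.log(n), 4))
```

The midpoint rule never evaluates the kernel at x = 0, so the removable singularity there causes no division by zero.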
The following result is the analogue of Theorem 15.1.20 for integrable functions on $\mathbb{S}^1$.
Lemma 15.4.3. Suppose 1 ≤ p < ∞ and $f \in L_p(\mathbb{S}^1)$. The mapping $\tau : \mathbb{S}^1 \to L_p(\mathbb{S}^1)$ given by $h \mapsto f(\cdot - h)$ is uniformly continuous.
Proof. Since $\|f\|_{L_p(\mathbb{S}^1)} = (2\pi)^{-1/p}\|1_{[0,2\pi]}f\|_{L_p(\mathbb{R})}$ for all $f \in L_p(\mathbb{S}^1)$, it suffices to estimate the $L_p(\mathbb{R})$–norm of $1_{[0,2\pi]}(f - \tau_hf)$. Notice that
By Theorem 15.1.20, $\|\tilde f - \tau_h\tilde f\|_{L_p(\mathbb{R})} \to 0$ as h → 0. For the second term on the right hand side of (15.18) we have
$$\int_{\mathbb{R}}\big|1_{[0,2\pi]}(y) - 1_{[h,2\pi+h]}(y)\big|^p\,|f(y - h)|^p\,dy = 2\int_{\mathbb{R}}1_{[-|h|,|h|]}(y)\,|f(y)|^p\,dy \xrightarrow{h\to0} 0.$$
Consequently, $\lim_{h\to0}\|f - \tau_hf\|_{L_p(\mathbb{S}^1)} = 0$.
Theorem 15.4.4. (Riemann–Lebesgue) If $f \in L_1(\mathbb{S}^1)$ and $c_n(f) := \frac{1}{2\pi}\int_{-\pi}^\pi f(y)e^{-iny}\,dy$ then, $\lim_{|n|\to\infty}c_n(f) = 0$.
We will use the Riemann–Lebesgue theorem to address the problem of pointwise convergence of the Fourier partial sums $S_n$ of (15.16). From
$$S_nf(x) = \frac{1}{2\pi}\int_{-\pi}^\pi D_n(y)f(x - y)\,dy = \frac{1}{2\pi}\int_{-\pi}^\pi D_n(y)f(x + y)\,dy$$
we obtain that
$$S_nf(x) - f(x) = \frac{1}{2\pi}\int_{-\pi}^\pi\frac{f(x - y) + f(x + y) - 2f(x)}{2}\,\frac{\sin\big((n + \frac12)y\big)}{\sin(y/2)}\,dy \tag{15.19}$$
$$= \frac{1}{2\pi}\int_{-\pi}^\pi\frac{f(x - y) + f(x + y) - 2f(x)}{2}\cos(ny)\,dy \tag{15.20}$$
$$+ \frac{1}{2\pi}\int_{-\pi}^\pi\frac{f(x - y) + f(x + y) - 2f(x)}{2}\cot(y/2)\sin(ny)\,dy. \tag{15.21}$$
The term (15.20) goes to zero as n → ∞ by the Riemann–Lebesgue Theorem. Convergence of the second term (15.21) provides a criterion for the convergence of $S_nf$.
Theorem 15.4.5. (Dini’s test) Suppose $f \in L_1(\mathbb{S}^1)$. If
$$\int_0^\pi|f(x - y) + f(x + y) - 2f(x)|\cot(y/2)\,dy < \infty \tag{15.22}$$
at a point x ∈ [−π, π] then, $\lim_nS_nf(x) = f(x)$.
Since $\frac{2}{\pi} \le \frac{\sin t}{t} \le 1$ for $|t| \le \frac{\pi}{2}$, condition (15.22) holds whenever
$$\int_0^\pi\frac{|f(x - y) + f(x + y) - 2f(x)|}{y}\,dy < \infty. \tag{15.23}$$
Suppose f has a jump discontinuity at x. Modifying the value of f on sets of measure zero does not change the value of the Fourier coefficients so, we set $f(x) = \frac{f(x-) + f(x+)}{2}$. Hence, if
$$\int_0^\pi\frac{|f(x - y) - f(x-)|}{y}\,dy < \infty \quad\text{and}\quad \int_0^\pi\frac{|f(x + y) - f(x+)|}{y}\,dy < \infty, \tag{15.24}$$
then $\lim_{n\to\infty}S_nf(x) = \frac{f(x-) + f(x+)}{2}$. We have the following result.
Corollary 15.4.6. If f is piecewise differentiable on $\mathbb{S}^1$ then,
$$\lim_{n\to\infty}S_nf(x) = \frac{f(x-) + f(x+)}{2}$$
for all x ∈ [−π, π].
which causes ripples to form along the graph of f around the point of discontinuity evenly
in both directions. The overshoot is about 9% of the length of the jump in both directions.
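The size of the overshoot can be checked on the sawtooth function of Example 15.4.2, whose partial sums are $S_nf(x) = \sum_{k=1}^n\sin(kx)/k$ and whose jump at 0 has length π. The sketch below evaluates $S_nf$ at $x = \pi/(n+1)$, the first critical point of $S_nf$ to the right of the jump (obtained by solving $\sum_{k=1}^n\cos(kx) = 0$), and reports the excess over $f(0+) = \pi/2$ as a fraction of the jump. The function names and the chosen values of n are illustrative.

```python
import math

def partial_sum(n, x):
    """S_n f(x) = sum_{k=1}^n sin(kx)/k for the sawtooth f(x) = (pi - x)/2 on (0, 2 pi)."""
    return sum(math.sin(k * x) / k for k in range(1, n + 1))

def overshoot_fraction(n):
    """Excess of S_n f over f(0+) = pi/2 at x = pi/(n+1), divided by the jump length pi."""
    peak = partial_sum(n, math.pi / (n + 1))
    return (peak - math.pi / 2.0) / math.pi

for n in (10, 100, 1000):
    print(n, round(overshoot_fraction(n), 4))
```

As n grows the printed fraction increases towards $\frac{1}{\pi}\int_0^\pi\frac{\sin t}{t}\,dt - \frac12 \approx 0.0895$, which is the "about 9%" quoted above.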
Although the Dirichlet kernel is not within the class of good kernels, its Cesàro and Abel sums are well behaved. For each n ∈ N, consider the averages
$$K_n(y) = \frac{1}{n}\sum_{k=0}^{n-1}D_k(y) = \frac{1}{n}\,\frac{1 - \cos(ny)}{1 - \cos(y)} = \frac{\sin^2(ny/2)}{n\sin^2(y/2)}$$
with the convention that sin(t)/t = 1 at t = 0. The n–th Fejér kernel in the unit circle is defined as $\frac{1}{2\pi}1_{[-\pi,\pi]}K_n$. The Cesàro sum of the Fourier series $f \sim \sum_{n\in\mathbb{Z}}c_n(f)e^{inx}$ of a function $f \in L_1(\mathbb{S}^1)$ is given by
$$\sigma_nf(x) := \frac{1}{n}\sum_{k=0}^{n-1}S_kf(x) = \frac{1}{2\pi}\int_{-\pi}^\pi f(y)K_n(x - y)\,dy.$$
From (15.17) we have that $\sigma_nf(x) = \frac{1}{2\pi}\,\tilde f*\big(1_{[-\pi,\pi]}K_n\big)(x)$ for all |x| ≤ π, where $\tilde f = f\,1_{[-2\pi,2\pi]}$.
Theorem 15.4.8. Suppose f ∈ Lp (S1 ) where 1 ≤ p < ∞. Then, σn f converges to f as
n → ∞ in Lp (S1 ) and pointwise at every Lebesgue point of f . In particular, if f ∈ Lp (S1 )
is such that cn (f ) = 0 for all n, then f ≡ 0 a.s.
Proof. We claim that the Fejér kernels form a family of good kernels. First, for each n ∈ N, $K_n \ge 0$, and
$$\frac{1}{2\pi}\int_{-\pi}^\pi K_n(y)\,dy = 1.$$
If 0 < δ < |y| ≤ π then $K_n(y) \le \frac{1}{n\sin^2(\delta/2)}$; hence,
$$\lim_{n\to\infty}\frac{1}{2\pi}\int_{\delta<|y|\le\pi}K_n(y)\,dy \le \lim_{n\to\infty}\frac{2(\pi - \delta)}{2\pi n\sin^2(\delta/2)} = 0.$$
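The two facts used here, the closed form of the Fejér kernel and the good behavior of Cesàro means, can be checked numerically. The sketch below compares the average of Dirichlet kernels with $\sin^2(ny/2)/(n\sin^2(y/2))$ at a few points, and verifies that the Cesàro means of the sawtooth function of Example 15.4.2 stay below $\sup|f| = \pi/2$, in contrast with the partial sums. The grid, the value n = 50 and the helper names are assumptions of this illustration.

```python
import math

def dirichlet(k, y):
    """D_k(y) = sum_{|j| <= k} exp(ijy) = 1 + 2 sum_{j=1}^k cos(jy)."""
    return 1.0 + 2.0 * sum(math.cos(j * y) for j in range(1, k + 1))

def fejer_average(n, y):
    """K_n(y) computed directly as the average of D_0, ..., D_{n-1}."""
    return sum(dirichlet(k, y) for k in range(n)) / n

def fejer_closed_form(n, y):
    """K_n(y) = sin^2(ny/2) / (n sin^2(y/2)) for y not a multiple of 2 pi."""
    return math.sin(0.5 * n * y) ** 2 / (n * math.sin(0.5 * y) ** 2)

def cesaro_mean_sawtooth(n, x):
    """sigma_n f(x) for f(x) = (pi - x)/2: the coefficient of sin(kx)/k is damped by 1 - k/n."""
    return sum((1.0 - k / n) * math.sin(k * x) / k for k in range(1, n))

n = 50
grid = [math.pi * (i + 0.5) / 400 for i in range(400)]   # points in (0, pi)
max_gap = max(abs(fejer_average(n, y) - fejer_closed_form(n, y)) for y in grid[::40])
max_cesaro = max(cesaro_mean_sawtooth(n, x) for x in grid)
print(max_gap, max_cesaro, math.pi / 2)
```

Because $K_n \ge 0$ and $\frac{1}{2\pi}\int K_n = 1$, each $\sigma_nf(x)$ is a weighted average of values of f, which explains why no overshoot appears.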
As with Cesàro sums, Abel sums can be expressed in terms of a family of convolution operators. For each 0 < r < 1 consider the function
$$P_r(x) = \frac{1 - r^2}{1 - 2r\cos(x) + r^2}.$$
It is easy to check that
$$P_r(x) = \sum_{n\in\mathbb{Z}}r^{|n|}e^{ixn} = \mathrm{Re}\Big(\frac{1 + z}{1 - z}\Big) = \frac{1 - |z|^2}{|1 - z|^2},$$
where $z = re^{ix}$ and |x| ≤ π. The r–th Poisson kernel in the unit disc is defined as $\frac{1}{2\pi}P_r1_{[-\pi,\pi]}$. It is easy to see that for any $f \in L_1(\mathbb{S}^1)$,
$$\sum_{n\in\mathbb{Z}}c_nr^{|n|}e^{inx} = \frac{1}{2\pi}\int_{-\pi}^\pi f(y)P_r(x - y)\,dy = \frac{1}{2\pi}\int_{-\pi}^\pi P_r(y)f(x - y)\,dy.$$
Hence, $A_rf(x) = \frac{1}{2\pi}\,\tilde f*\big(1_{[-\pi,\pi]}P_r\big)(x)$ for all |x| ≤ π and 0 < r < 1, where $\tilde f = 1_{[-2\pi,2\pi]}f$.
Consequently
$$\lim_{r\to1-}\int_{\eta<|x|\le\pi}P_r(x)\,dx = 0.$$
We now show that $A_rf \to f$ pointwise a.s. in $\mathbb{S}^1$ as r → 1−. Fix |x| ≤ π. The function $g(r) = 1 - 2r\cos(x) + r^2$ attains its minimum value within [0, 1] at r = cos(x). Thus,
$$P_r(x) \le 2(1 - r)\frac{1}{\sin^2(x)}.$$
Hence, for |x| ≤ π/2
$$P_r(x) \le \frac{\pi^2}{2}(1 - r)\frac{1}{x^2} \le 2\pi^2(1 - r)\frac{1}{x^2}$$
while for π/2 ≤ |x| ≤ π, $1 - 2r\cos(x) + r^2 \ge 1$, and so
$$P_r(x) \le 2(1 - r) \le 2\pi^2(1 - r)\frac{1}{x^2}.$$
Define
$$\psi(x) = 2\cdot1_{|x|\le1} + \pi\frac{1}{x^2}1_{|x|>1}.$$
Then, ψ is a nonnegative integrable radial function for which
$$\frac{1}{2\pi}P_r(x)1_{[-\pi,\pi]}(x) \le \psi_{1-r}(x) := (1 - r)^{-1}\psi\big((1 - r)^{-1}x\big).$$
Pointwise a.s. convergence follows from Theorem 15.3.9. The last statement follows immediately.
Example 15.4.10. If log is the principal branch of the logarithmic function, we have that $-\log(1 - z) = \sum_{n=1}^\infty\frac{z^n}{n}$ for all |z| < 1. If $z = re^{i\theta}$ with 0 < r < 1 then, the Abel sum of the sawtooth function $f(\theta) = \frac{1}{2i}\sum_{|n|\ge1}\frac{e^{in\theta}}{n} = \sum_{n=1}^\infty\frac{\sin(n\theta)}{n}$ is given by
$$A_rf(\theta) = \sum_{n=1}^\infty\frac{r^n\sin(n\theta)}{n} = \frac{1}{2i}\sum_{|n|\ge1}\frac{r^{|n|}e^{in\theta}}{n} = \frac{1}{2i}\sum_{n=1}^\infty\frac{r^n}{n}\big(e^{in\theta} - e^{-in\theta}\big)$$
$$= -\frac{1}{2i}\big(\log(1 - re^{i\theta}) - \log(1 - re^{-i\theta})\big) = \mathrm{Im}\big(-\log(1 - re^{i\theta})\big) = -\arg(1 - re^{i\theta}).$$
15.4. Fourier series 487
It follows that $f(\theta) = \lim_{r\to1-}A_rf(\theta)$ for all θ. For 0 < θ < 2π, we obtain another expression for f, namely $f(\theta) = \lim_{r\to1-}A_rf(\theta) = -\arg(1 - e^{i\theta})$. Let us now consider
$$-\log(1 - re^{i\theta}) = \sum_{n=1}^\infty\frac{r^n\cos(n\theta)}{n} + i\sum_{n=1}^\infty\frac{r^n\sin(n\theta)}{n} = -\log(|1 - re^{i\theta}|) - i\arg(1 - re^{i\theta}). \tag{15.25}$$
The imaginary part of the right hand side of (15.25) converges to f(θ) for every θ. For 0 < θ < 2π, the first term of the right hand side of (15.25) converges to the 2π–periodic even function $g(\theta) := -\log(|1 - e^{i\theta}|) = -\log\big(2|\sin(\theta/2)|\big)$. Notice that g is unbounded and that $\lim_{\theta\to0}g(\theta) = \infty = \lim_{\theta\to2\pi}g(\theta)$. Since sin(t) ≅ t as t → 0 and $\lim_{t\to0+}t^\alpha\log(t) = 0$ for any α > 0, we have that $g \in L_p(\mathbb{S}^1)$ for all p ≥ 1. Since $\theta \mapsto \sum_{n=1}^\infty\frac{\cos(n\theta)}{n}$ is a square integrable function on $\mathbb{S}^1$, it follows that $\log\big(2|\sin(\theta/2)|\big) = -\sum_{n=1}^\infty\frac{\cos(n\theta)}{n}$.
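The identity between the cosine series and the logarithm can be tested through its Abel sums. The following Python sketch checks that $\sum_{n\ge1}r^n\cos(n\theta)/n = -\log|1 - re^{i\theta}|$ for a fixed r < 1, and that the right hand side approaches $-\log(2|\sin(\theta/2)|)$ as r → 1−. The values of r, θ, the truncation level and the function names are illustrative assumptions.

```python
import cmath
import math

def abel_cosine_sum(r, theta, terms=2000):
    """Truncated series sum_{n=1}^{terms} r^n cos(n theta) / n."""
    return sum(r ** n * math.cos(n * theta) / n for n in range(1, terms + 1))

def minus_log_modulus(r, theta):
    """-log|1 - r exp(i theta)|, the real part of -log(1 - r exp(i theta))."""
    return -math.log(abs(1.0 - r * cmath.exp(1j * theta)))

theta = 1.0
# For r = 0.95 the neglected tail of the series is of order 0.95**2000, far below rounding error.
series_gap = abs(abel_cosine_sum(0.95, theta) - minus_log_modulus(0.95, theta))
# As r -> 1-, the Abel sum approaches -log(2 |sin(theta/2)|).
limit_gap = abs(minus_log_modulus(1.0 - 1e-9, theta) + math.log(2.0 * abs(math.sin(0.5 * theta))))
print(series_gap, limit_gap)
```

Summing the series directly at r = 1 is also possible, but it converges only conditionally and slowly, which is why the Abel sums are used here.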
Remark 15.4.11. The statement of Theorem 15.4.1 holds also for $L_2(\mathbb{T}^n)$, n ≥ 1. The collection $E = \{E_k : k \in \mathbb{Z}^n\} \subset L_2(\mathbb{T}^n)$ given by $E_k(x) = e^{2\pi ik\cdot x}$ separates points of $\mathbb{T}^n$. Since $E_kE_j = E_{k+j}$, $E_0 \equiv 1$ and $\overline{E_k} = E_{-k}$, the linear span A generated by E is a dense algebra in $C(\mathbb{T}^n)$. Since $C(\mathbb{T}^n)$ is dense in $L_p(\mathbb{T}^n)$ for all 1 ≤ p < ∞, we conclude that A is also dense in $L_p(\mathbb{T}^n)$. It is easy to check that $\int_{[0,1)^n}E_k(x)\overline{E_j(x)}\,dx = \delta_{jk}$. Therefore, for any $f \in L_2(\mathbb{T}^n)$, $\sum_{k\in\mathbb{Z}^n}\langle f|E_k\rangle E_k$ converges to f in $L_2(\mathbb{T}^n)$.
The following example makes a connection between Fourier series and Fourier integrals.
Example 15.4.12. (Poisson summation formula) If $f \in L_1$ then the series defining the map $Pf : x \mapsto \sum_{k\in\mathbb{Z}^n}f(x + k)$ converges absolutely a.s. Indeed, set $Q = \big[-\frac12, \frac12\big)^n$. Then $\mathbb{R}^n = \bigcup_{k\in\mathbb{Z}^n}(Q + k)$. From
$$\int_Q\Big|\sum_{k\in\mathbb{Z}^n}f(x + k)\Big|\,dx \le \int_Q\sum_{k\in\mathbb{Z}^n}|f(x + k)|\,dx = \sum_{k\in\mathbb{Z}^n}\int_{Q+k}|f(x)|\,dx = \|f\|_1$$
we conclude that $\sum_{k\in\mathbb{Z}^n}f(x + k)$ converges absolutely a.s. and in $L_1(Q)$ to some function $Pf \in L_1(Q)$. We can extend Pf periodically to almost all $\mathbb{R}^n$ by noticing that Pf(x + ℓ) = Pf(x) for all $\ell \in \mathbb{Z}^n$ and for all x ∈ Q where Pf converges. Thus Pf can be considered as a function on $\mathbb{T}^n$. Moreover, by applying Fubini’s theorem we obtain that the ℓ–th Fourier coefficient of Pf is given by
$$\int_Q\Big(\sum_{k\in\mathbb{Z}^n}f(x + k)\Big)e^{-2\pi ix\cdot\ell}\,dx = \sum_{k\in\mathbb{Z}^n}\int_Qf(x + k)e^{-2\pi ix\cdot\ell}\,dx = \sum_{k\in\mathbb{Z}^n}\int_{Q+k}f(x)e^{-2\pi i(x-k)\cdot\ell}\,dx$$
$$= \sum_{k\in\mathbb{Z}^n}\int_{Q+k}f(x)e^{-2\pi ix\cdot\ell}\,dx = \int_{\mathbb{R}^n}f(x)e^{-2\pi ix\cdot\ell}\,dx = \hat f(\ell).$$
Suppose there is a nonincreasing function $\psi_0$ in [0, ∞) such that $|f(x)| \le \psi_0(\|x\|)$ and that $\psi_0\circ\|\cdot\| \in L_1(\mathbb{R}^n)$. For any x ∈ Q and $k \in \mathbb{Z}^n$, $\|k + x\| \ge \frac{1}{2\sqrt n}\|k\|$, and so
a.s. If $\hat f$ is also dominated by a nonincreasing radial function $\varphi \in L_1$, then the right hand side of (15.26) converges absolutely and uniformly on $\mathbb{T}^n$ and thus, it is also continuous. If $f \in C(\mathbb{R}^n)$ then both series in (15.26) are absolutely and uniformly convergent; hence, (15.26) holds everywhere on $\mathbb{T}^n$. In particular, for x = 0 we have that $\sum_{k\in\mathbb{Z}^n}f(k) = \sum_{k\in\mathbb{Z}^n}\hat f(k)$.
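Example 15.4.13. The Poisson summation obtained from periodization of the Gaussian kernel $\varphi_\varepsilon(x) = \varepsilon^{-n}e^{-\pi\|x\|^2/\varepsilon^2}$ in $\mathbb{R}^n$ is given by
$$\varepsilon^{-n}\sum_{m\in\mathbb{Z}^n}e^{-\frac{\pi\|x-m\|^2}{\varepsilon^2}} = \sum_{m\in\mathbb{Z}^n}e^{-\pi\varepsilon^2\|m\|^2}e^{2\pi im\cdot x}.$$
For n = 1, the function $\Theta(x; t) := \sum_{m\in\mathbb{Z}}e^{-\pi tm^2}e^{2\pi imx}$ is called the theta function.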
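For n = 1 the Poisson summation identity for the Gaussian kernel is easy to test numerically, since both series converge extremely fast. The sketch below evaluates the periodized Gaussian $\varepsilon^{-1}\sum_me^{-\pi(x-m)^2/\varepsilon^2}$ and the Fourier side $\sum_me^{-\pi\varepsilon^2m^2}e^{2\pi imx}$ (written with cosines, since the imaginary parts cancel in pairs). The truncation level, the test points and the function names are assumptions of this illustration.

```python
import math

def periodized_gaussian(x, eps, cutoff=60):
    """eps^{-1} * sum over |m| <= cutoff of exp(-pi (x - m)^2 / eps^2)."""
    return sum(math.exp(-math.pi * (x - m) ** 2 / eps ** 2)
               for m in range(-cutoff, cutoff + 1)) / eps

def fourier_side(x, eps, cutoff=60):
    """Sum over |m| <= cutoff of exp(-pi eps^2 m^2) cos(2 pi m x)."""
    return sum(math.exp(-math.pi * eps ** 2 * m * m) * math.cos(2.0 * math.pi * m * x)
               for m in range(-cutoff, cutoff + 1))

cases = [(0.3, 0.7), (0.0, 2.0), (0.5, 1.0)]   # (x, eps) pairs chosen for illustration
gaps = [abs(periodized_gaussian(x, eps) - fourier_side(x, eps)) for x, eps in cases]
print(gaps)
```

Small ε makes the left hand side converge quickly and the right hand side slowly, and conversely for large ε; this trade-off is what makes the theta function identity useful in practice.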
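Example 15.4.14. The Poisson summation obtained from periodization of the Poisson kernel (in $\mathbb{R}^n$) is given by
$$\frac{\Gamma[(n+1)/2]}{\pi^{\frac{n+1}{2}}}\sum_{m\in\mathbb{Z}^n}\frac{\varepsilon}{(\varepsilon^2 + \|x - m\|^2)^{\frac{n+1}{2}}} = \sum_{m\in\mathbb{Z}^n}e^{-2\pi\varepsilon\|m\|}e^{-2\pi ix\cdot m}.$$
For n = 1 and x = 0 we get that
$$\sum_{m\in\mathbb{Z}}\frac{1}{\varepsilon^2 + m^2} = \frac{\pi}{\varepsilon}\,\frac{1 + e^{-2\pi\varepsilon}}{1 - e^{-2\pi\varepsilon}}$$
which means that
$$2\sum_{m=1}^\infty\frac{1}{\varepsilon^2 + m^2} = \frac{\pi}{\varepsilon}\,\frac{1 + e^{-2\pi\varepsilon}}{1 - e^{-2\pi\varepsilon}} - \frac{1}{\varepsilon^2} = \frac{\frac23\pi^3\varepsilon^3 + o(\varepsilon^3)}{2\pi\varepsilon^3 + o(\varepsilon^3)} \to \frac{\pi^2}{3}$$
as ε → 0. This gives another proof that $\sum_{m=1}^\infty\frac{1}{m^2} = \frac{\pi^2}{6}$.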
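Both the closed form of $\sum_{m\in\mathbb{Z}}(\varepsilon^2 + m^2)^{-1}$ and the limit leading to $\pi^2/6$ can be checked numerically. In the sketch below the lattice sum is truncated and its tail is estimated by an integral; the cutoff, the tail estimate and the function names are assumptions made for this illustration.

```python
import math

def lattice_sum(eps, cutoff=100000):
    """Sum of 1/(eps^2 + m^2) over |m| <= cutoff, plus an integral estimate of the tail."""
    head = 1.0 / eps ** 2 + 2.0 * sum(1.0 / (eps * eps + m * m) for m in range(1, cutoff + 1))
    tail = 2.0 / (cutoff + 0.5)   # approximates 2 * sum_{m > cutoff} 1/m^2
    return head + tail

def closed_form(eps):
    """(pi/eps) (1 + exp(-2 pi eps)) / (1 - exp(-2 pi eps))."""
    q = math.exp(-2.0 * math.pi * eps)
    return math.pi / eps * (1.0 + q) / (1.0 - q)

def positive_sum(eps):
    """sum_{m >= 1} 1/(eps^2 + m^2), obtained from the closed form."""
    return 0.5 * (closed_form(eps) - 1.0 / eps ** 2)

print(lattice_sum(0.5), closed_form(0.5))
print(positive_sum(1e-3), math.pi ** 2 / 6)
```

For very small ε the subtraction in `positive_sum` loses a few digits to cancellation, so ε should not be taken much smaller than in this example.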
Similarly,
$$J_T(a) := \frac{1}{2T}\int_{-T}^Te^{-iat}\hat\mu(t)\,dt = \int_{\mathbb{R}}\frac{1}{T}\int_0^T\cos(t|x - a|)\,dt\,\mu(dx) = \mu(\{a\}) + \int_{\{a\}^c}\frac{\sin(T|x - a|)}{T|x - a|}\,\mu(dx).$$
By dominated convergence, $\lim_{T\to\infty}J_T(a) = \mu(\{a\})$.
(ii) The second statement of the theorem follows from part (i) applied to the measure $\mu_f(dx) = f(x)\,dx$, where $f \in L_1(\lambda)$.
(iii) If $\hat\mu \in L_1$ then, for any a ∈ R
$$|J_T(a)| \le \frac{\|\hat\mu\|_1}{2T} \to 0,$$
Example 15.5.3. The double exponential distribution $\nu(dx) = \frac12e^{-|x|}\,dx$ has characteristic function $\hat\nu(t) = \frac{1}{1+t^2} \in L_1(\mathbb{R})$. Therefore, by (15.30)
$$\frac12e^{-|x|} = \frac{1}{2\pi}\int_{\mathbb{R}}\frac{e^{-itx}}{1 + t^2}\,dt.$$
Consequently, the Cauchy distribution $\rho(dx) = \frac{1}{\pi(1+x^2)}\,dx$ has characteristic function given by $\hat\rho(t) = e^{-|t|}$.
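The inversion identity $\frac12e^{-|x|} = \frac{1}{2\pi}\int_{\mathbb{R}}\frac{e^{-itx}}{1+t^2}\,dt$ can be checked with a direct quadrature. The Python sketch below uses the evenness of the integrand to reduce the integral to $\frac{1}{\pi}\int_0^\infty\frac{\cos(tx)}{1+t^2}\,dt$ and truncates it at a finite cutoff; the cutoff, the step size and the function name are assumptions, and the truncation limits the accuracy to roughly $1/(\pi\cdot\text{cutoff})$.

```python
import math

def inverse_transform(x, cutoff=2000.0, step=0.01):
    """Midpoint-rule approximation of (1/(2 pi)) * integral of exp(-itx)/(1+t^2) over [-cutoff, cutoff]."""
    cells = int(round(cutoff / step))
    total = 0.0
    for i in range(cells):
        t = (i + 0.5) * step
        total += math.cos(t * x) / (1.0 + t * t)
    # The integrand is even in t, so the integral over R is twice the one over [0, inf).
    return total * step / math.pi

for x in (0.0, 0.5, 2.0):
    print(x, inverse_transform(x), 0.5 * math.exp(-abs(x)))
```

The slow $1/t^2$ decay of the characteristic function is what forces the large cutoff; it reflects the lack of smoothness of $e^{-|x|}$ at the origin.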
The following result provides another way to invert the Fourier transform of functions
in L1 as a limit of regular functions. This approach involves convolution operations and
provides L1 and a.s. convergence.
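Theorem 15.5.5. (Inversion Theorem of the Fourier transform in $L_1(\mathbb{R}^n, \lambda_n)$). Suppose $\varphi \in L_1(\mathbb{R}^n, \lambda_n)$ is such that $\hat\varphi \in L_1(\mathbb{R}^n, \lambda_n)$ and $\int\hat\varphi(t)\,dt = 1$. For any $f \in L_1(\mathbb{R}^n, \lambda_n)$ and ε > 0 define
$$S_\varepsilon(f, x) = \int\varphi(\varepsilon s)\hat f(s)e^{i2\pi x\cdot s}\,ds. \tag{15.32}$$
Then, $\|S_\varepsilon(f) - f\|_1 \to 0$ as ε → 0. If $\hat f \in L_1(\mathbb{R}^n, \lambda_n)$ then, $S_\varepsilon(f)$ converges pointwise to f at every Lebesgue point of f. Furthermore,
$$f(x) = \lim_{\varepsilon\to0}S_\varepsilon(f, x) = \int e^{2\pi it\cdot x}\hat f(t)\,dt. \tag{15.33}$$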
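Proof. Let $K(x) = \hat\varphi(-x)$ and define $K_\varepsilon(x) = \varepsilon^{-n}K(\varepsilon^{-1}x)$ for all ε > 0. By Fubini’s theorem,
$$K_\varepsilon*f(x) = \varepsilon^{-n}\int\Big(\int e^{i2\pi\varepsilon^{-1}(x-y)\cdot s}\varphi(s)\,ds\Big)f(y)\,dy = \varepsilon^{-n}\int\varphi(s)\int e^{i2\pi\varepsilon^{-1}(x-y)\cdot s}f(y)\,dy\,ds = \int\varphi(\varepsilon s)\hat f(s)e^{i2\pi x\cdot s}\,ds,$$
where the last equality follows by a change of variables $s \mapsto \varepsilon^{-1}s$. The first conclusion follows from Theorem 15.3.1.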
For the second statement consider $\varphi(x) = e^{-\pi|x|^2}$. Then, φ(x) = φ(−x) and
$$\hat\varphi(t) = \int_{\mathbb{R}^n}e^{-2\pi ix\cdot t}e^{-\pi|x|^2}\,dx = \prod_{j=1}^n\int_{-\infty}^\infty e^{-2\pi ix_jt_j}e^{-\pi x_j^2}\,dx_j = e^{-\pi|t|^2} = \varphi(t).$$
Hence $\int\hat\varphi(t)\,dt = 1 = \int\varphi(x)\,dx$. Clearly the kernels $K_\varepsilon$ defined in the first part of the proof satisfy the conditions of Theorem 15.3.9; thus, the left hand side of (15.32) converges to f pointwise at every Lebesgue point of f as ε → 0. By dominated convergence, the right hand side of (15.32) converges to $\hat{\hat f}(-x)$ pointwise as ε → 0, and (15.33) follows. The last statement is a consequence of the Riemann–Lebesgue lemma 15.1.21.
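Proof. For $\phi(x) = e^{-\pi|x|^2}$, we have that $K_\varepsilon(x) = \varepsilon^{-n}\hat\phi(\varepsilon^{-1}x) = \varepsilon^{-n}\phi(\varepsilon^{-1}x)$ and so, $S_\varepsilon(f) \equiv K_\varepsilon*f$ in $\mathbb{R}^n$. If f is continuous at 0, then 0 is a Lebesgue point of f. It follows from Theorem 15.3.9 that $\lim_{\varepsilon\to0}S_\varepsilon(f, 0) = \lim_{\varepsilon\to0}K_\varepsilon*f(0) = f(0)$.
If $\hat f \ge 0$ then, by monotone convergence, $\hat f \in L_1$ and $\int\hat f = \lim_{\varepsilon\to0}S_\varepsilon(f, 0)$.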
Riesz’ interpolation theorem extends the Fourier transform to all Lp spaces with 1 <
p < 2.
Proof. From Riemann–Lebesgue’s lemma and Plancherel’s theorem we know that the
Fourier transform F is a linear map on L1 (Rn ) + L2 (Rn ) into L∞ (Rn ) + L2 (Rn ) such that
kF(f )k∞ ≤ kf k1 and kF(g)k2 = kgk2 for all f ∈ L1 and g ∈ L2 . By Riesz’s interpolation
theorem, for any 0 < θ < 1 we can define the Fourier transform as a bounded operator on
Lpθ into Lqθ with kF(f )kqθ ≤ kf kpθ .
Example 15.7.1. The family of functions $\varphi_\alpha(x) = e^{-\alpha|x|^2}$, α > 0, is contained in S.
The next result makes a connection between $\mathcal{D}(\mathbb{R}^n)$ (with the strictly inductive limit topology τ defined in Example 12.4.4), and S with the Fréchet topology ρ induced by the seminorms $\rho_m$.
Theorem 15.7.2.
(i) D(Rn ) is dense in (S, ρ), and S is dense in (Lp (λn ), k kp ) for all 1 ≤ p < ∞.
(ii) The inclusion map ι : (D(Rn ), τ ) → (S, ρ) is continuous.
(iii) For any 1 ≤ p < ∞ there is a constant 0 < C = C(n, p) < ∞ such that
kφkp ≤ C ρn (φ), φ∈S
Hence, the inclusion map j : (S, ρ) → (Lp (λn ), k kp ) is continuous.
15.7. Schwartz functions 495
Proof. (ii) For each compact $K \subset \mathbb{R}^n$, the topology induced on $\mathcal{D}_K$ by ρ is the same as the topology $\tau_K$ induced on $\mathcal{D}_K$ by the seminorms $p_m(\phi) = \sup\{|\partial^\alpha\phi(x)| : x \in \mathbb{R}^n, |\alpha| \le m\}$, $m \in \mathbb{Z}_+$, since $(1 + |x|^2)^m$ is bounded on K for each $m \in \mathbb{Z}_+$. This shows that the restriction of ι to $\mathcal{D}_K$ is continuous. Thus, by Theorem 12.5.4, ι is continuous.
(i) Let φ ∈ S. Choose $\eta \in \mathcal{D}(\mathbb{R}^n)$ with 0 ≤ η ≤ 1 such that η ≡ 1 in the unit ball and zero outside the ball of radius 2. Define $\phi_r(x) := \phi(x)\eta(rx)$ for r > 0. Clearly $\phi_r \in \mathcal{D}(\mathbb{R}^n)$. We claim that $\phi_r \to \phi$ in S as r → 0. Indeed, by the Leibniz formula for differentiation, for any polynomial P(x) and $\alpha \in \mathbb{Z}^n_+$
$$P(x)\,\partial^\alpha(\phi - \phi_r)(x) = P(x)\sum_{0\le\beta\le\alpha}\binom{\alpha}{\beta}(\partial^{\alpha-\beta}\phi)(x)\,r^{|\beta|}\big(\partial^\beta(1 - \eta)\big)(rx). \tag{15.36}$$
The dual space of (S(Rn ), ρ) is called the space of tempered distributions (see Exer-
cise 15.9.28).
Example 15.7.3. Suppose µ is a positive Radon measure on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ such that
$$C := \int_{\mathbb{R}^n}(1 + |x|^2)^{-N}\,\mu(dx) < \infty$$
for some N ∈ N. The map $u_\mu : \phi \mapsto \int\phi\,d\mu$ is a tempered distribution. To see that, suppose $\phi_m \to 0$ in S. Then $\|(1 + |x|^2)^N\phi_m(x)\|_\infty \to 0$ as m → ∞. Consequently, $|u_\mu(\phi_m)| \le \|(1 + |x|^2)^N\phi_m(x)\|_\infty\,C \to 0$.
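Theorem 15.7.4. The Fourier transform $\mathcal F$ maps the space $\mathcal S$ onto itself; moreover, $\mathcal F : (\mathcal S, \rho) \to (\mathcal S, \rho)$ is a continuous bijection and $\mathcal F^{-1} = \mathcal F^3$. For any polynomial $P$ on $\mathbb R^n$,
\[ P(-\partial)\hat\varphi(t) = \big(P(2\pi i x)\varphi(x)\big)^\wedge(t) \tag{15.37} \]
\[ \widehat{P(\partial)\varphi}(t) = P(2\pi i t)\,\hat\varphi(t) \tag{15.38} \]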
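Proof. Suppose $\phi \in \mathcal S$. Then, $x^\alpha\phi(x)$ is integrable for any $\alpha \in \mathbb Z_+^n$. By Theorem 15.1.9, $\hat\phi \in C^\infty(\mathbb R^n)$ and (15.37) holds for $P(x) = x^\alpha$ and hence, for any polynomial by linearity. Consequently $\widehat{\partial^\alpha\phi}(t) = (2\pi i t)^\alpha\hat\phi(t)$ and (15.38) follows. Since $x^\alpha\phi(x)$ and $\partial^\beta(x^\alpha\phi(x))$ are both in $\mathcal S$ for any $\alpha, \beta \in \mathbb Z_+^n$, applying (15.37) first and then (15.38) we obtain that
\begin{align*}
\big|(2i\pi t)^\beta (\partial^\alpha\hat\phi)(t)\big| &= \big|(2i\pi t)^\beta \big((-2i\pi x)^\alpha\phi(x)\big)^\wedge(t)\big| \\
&= \big|\big(\partial^\beta\big((-2i\pi x)^\alpha\phi(x)\big)\big)^\wedge(t)\big| \\
&\le (2\pi)^{|\alpha|}\,\big\|\partial^\beta\big(x^\alpha\phi(x)\big)\big\|_1 < \infty.
\end{align*}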
The Fourier inversion theorem 15.5.5 implies that $\mathcal F$ restricted to $\mathcal S$ is bijective and that for any $\phi \in \mathcal S$, $\phi(-x) = \mathcal F^2\phi(x)$. This implies that $\mathcal F^4\phi = \phi$, which means that $\mathcal F^{-1} = \mathcal F^3$.
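To prove continuity of $\mathcal F$ on $\mathcal S$ we use the closed graph theorem. Suppose $\phi_m \to \phi$ in $(\mathcal S, \rho)$ and that for some $\psi \in \mathcal S$, $\widehat{\phi_m} \xrightarrow{m\to\infty} \psi$ in $\mathcal S$. Then $\phi_m \xrightarrow{m\to\infty} \phi$ in $L^1(\lambda_n)$ since $(1 + |x|^2)^n\,|\phi_m(x) - \phi(x)| \xrightarrow{m\to\infty} 0$ uniformly. Hence $\widehat{\phi_m} \xrightarrow{m\to\infty} \hat\phi$ uniformly, which means that $\psi = \hat\phi$.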
Remark 15.7.5. Theorem 15.7.4 implies the existence of smooth Lebesgue integrable func-
tions whose Fourier transform is not only smooth, but also has compact support.
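Corollary 15.7.6. If $u \in \mathcal S^*$, the map $\hat u : \mathcal S \to \mathbb C$ given by $\hat u(\phi) := u(\hat\phi)$, where $\hat\phi = \mathcal F(\phi)$, is a tempered distribution. Furthermore $\Phi : \mathcal S^* \to \mathcal S^*$ given by $u \mapsto \hat u$ is a continuous bijection and $\Phi^{-1} = \Phi^3$.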
when u, φ ∈ L1 .
A function $f \in L^p(\mathbb R^n)$ has a partial derivative with respect to the $k$–th coordinate in the sense of $L^p(\mathbb R^n)$ if there exists $g \in L^p(\mathbb R^n)$ such that
\[ \lim_{h_k \to 0} \Big\| \frac{f(\cdot + h_k e_k) - f(\cdot)}{h_k} - g(\cdot) \Big\|_p = 0. \]
In such case $g$ is a.s. unique. Clearly, if $f$ admits a partial derivative $\partial_k f$ in $\mathbb R^n$ in the sense of differential Calculus as well as a partial derivative $g$ with respect to $x_k$ in the sense of $L^p(\mathbb R^n)$, then $g = \partial_k f$ a.s.
Lemma 15.7.8. If $f \in L^1(\mathbb R^n)$ has a partial derivative $g$ w.r.t. $x_k$ in the sense of $L^1(\mathbb R^n)$ then,
\[ \hat g(t) = 2\pi i\, t_k\, \hat f(t) \]
for all $t \in \mathbb R^n$.
and by letting hk → 0.
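Theorem 15.7.9. If $\psi \in \mathcal S$, then
\[ \frac{\tau_{-he_j}\psi - \psi}{h} \xrightarrow{h \to 0} \partial_{x_j}\psi \]
in $\mathcal S$ and in $L^p(\lambda_n)$ for each $1 \le j \le n$ and $1 \le p < \infty$.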
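Proof. Let $\eta_h = \frac{\tau_{-he_j} - \tau_0}{h}$. From
\[ 1 + |x| \le 1 + |x - y| + |y| \le (1 + |x - y|)(1 + |y|), \]
and Jensen's inequality we obtain that
\[ (1 + |x|^2) \le (1 + |x|)^2 \le (1 + |x - y|)^2 (1 + |y|)^2 \le 4(1 + |x - y|^2)(1 + |y|^2). \]
Let $N \in \mathbb Z_+$ and $\alpha \in \mathbb Z_+^n$ with $|\alpha| \le N$. There is a constant $A = A(\psi, N, \alpha) > 0$ such that
\begin{align*}
(1 + |x|^2)^N \big|\partial^\alpha(\eta_h\psi - \partial_{x_j}\psi)(x)\big| &= (1 + |x|^2)^N \big|\partial_{x_j}\partial^\alpha\psi(x + \theta h e_j) - \partial_{x_j}\partial^\alpha\psi(x)\big| \\
&\le (1 + |x|^2)^N \big|\partial^2_{x_j}\partial^\alpha\psi(x + \xi\theta h e_j)\big|\,|h| \\
&\le A|h|\,\frac{(1 + |x|^2)^N}{(1 + |x + \xi\theta h e_j|^2)^N} \le 4^N (1 + |h|^2)^N A|h|
\end{align*}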
where $\xi, \theta \in (0, 1)$ result from applications of the mean value theorem. Convergence in $\mathcal S$ and in $L^p(\lambda_n)$ follow from
\[ \rho_N(\eta_h\psi - \partial_{x_j}\psi) \le 4^N (1 + |h|^2)^N A|h| \xrightarrow{h \to 0} 0, \]
and Theorem 15.7.2(iii).
Example 15.8.1. We have already seen that any analytic function, and thus its real and imaginary parts, on an open subset $U \subset \mathbb C$ satisfy the mean value property. In fact, in this case the mean value property coincides with the Cauchy formula $f(a) = \frac{1}{2\pi i}\int_{\gamma_r(a)} \frac{f(z)}{z - a}\,dz$, where $\gamma_r(a)(t) = a + re^{it}$, $0 \le t \le 2\pi$.
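The mean value property of Example 15.8.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the test function $f(z) = z^2 + e^z$, the center `a`, the radius `r` and the helper name `circle_mean` are hypothetical choices, not taken from the text, and the circle average is approximated by an equally spaced sum.

```python
import cmath
import math

def circle_mean(f, a, r, n=256):
    """Approximate (1 / 2pi) * integral over [0, 2pi] of f(a + r e^{it}) dt
    by an equally spaced Riemann sum with n nodes."""
    total = 0j
    for k in range(n):
        t = 2 * math.pi * k / n
        total += f(a + r * cmath.exp(1j * t))
    return total / n

# Hypothetical test function: entire, hence analytic on any open set U.
f = lambda z: z * z + cmath.exp(z)

a, r = 0.3 - 0.2j, 1.5
mean = circle_mean(f, a, r)
print(abs(mean - f(a)))            # discrepancy for f itself
print(abs(mean.real - f(a).real))  # the real part has the same property
```

For analytic integrands the equally spaced sum converges very quickly, so a modest number of nodes is enough for this check.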
Theorem 15.8.2. If u ∈ C(U ) satisfies the mean–value property in U , then u ∈ C ∞ (U ).
Proof. Let η(x) = ψ(|x|) be a mollifier with support B(0; 1) with mass one. As before,
uε = ηε ∗ u denotes the mollification of u with ηε . We will show that u = uε in Uε . Indeed,
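if $x \in U_\varepsilon$, then by using polar coordinates, we obtain
\begin{align*}
u^\varepsilon(x) &= \frac{1}{\varepsilon^n}\int_U \psi\Big(\frac{|x - y|}{\varepsilon}\Big) u(y)\,dy \\
&= \frac{1}{\varepsilon^n}\int_{B(0;\varepsilon)} \psi\Big(\frac{|y|}{\varepsilon}\Big) u(x - y)\,dy \\
&= \int_{B(0;1)} \eta(y)\,u(x - \varepsilon y)\,dy \\
&= \int_0^1 \psi(r)\,r^{n-1}\int_{\mathbb S^{n-1}} u(x - r\varepsilon z)\,\sigma_{n-1}(dz)\,dr \\
&= u(x)\,\sigma_{n-1}(\mathbb S^{n-1})\int_0^1 \psi(r)\,r^{n-1}\,dr = u(x).
\end{align*}
Thus $u \in C^\infty(U_\varepsilon)$ for any $\varepsilon > 0$.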
15.8. Harmonic functions 499
A function $u \in C^2(U)$ is harmonic in $U$ if
\[ \triangle u(x) = \nabla \cdot \nabla u(x) = \sum_{j=1}^n \partial^2_{x_j x_j} u(x) = 0, \qquad x \in U \tag{15.41} \]
Proof. Fix $x_0 \in \Omega$. For any $r > 0$ such that $B(x_0; r) \subset \Omega$, each $u_n$ satisfies the mean–value property. From the hypothesis of the Theorem we have that $u$ is continuous on $\Omega$ and $\{u_n : n \in \mathbb N\}$ is bounded on $B(x_0; r)$. By dominated convergence, we obtain that
\[ u(x_0) = \lim_n u_n(x_0) = \lim_n \frac{1}{\omega_d r^d}\int_{B(x_0;r)} u_n(x)\,dx = \frac{1}{\omega_d r^d}\int_{B(x_0;r)} u(x)\,dx \]
Proof. Suppose that $u$ attains its maximum at some point $x_0 \in U$, that is, $u(x_0) \ge u(y)$ for all $y \in U$. As
\[ u(x_0) = \frac{1}{r^n \omega_n}\int_{B(x_0;r)} u(y)\,dy \]
for any ball $B(x_0; r) \subset U$, it follows from Corollary 4.2.5(ii) that $u \equiv u(x_0)$ on $B(x_0; r)$. Hence the set $\{x \in U : u(x) = u(x_0)\}$ is both closed and open in $U$. Therefore, $u \equiv u(x_0)$ in the connected component of $x_0$ in $U$.
We will use the results on harmonic functions discussed above to study harmonic func-
tions in the unit disc of the complex plane.
If $f \in C(\mathbb S^1)$, and $P_r$ is the Poisson kernel in the unit disc then, as in Example 11.5.17 (with $\mu(d\theta) = f(e^{i\theta})\,d\theta$ on $\mathbb S^1$), $u = P_r * f$ is harmonic on $B(0;1)$. Let $Hf(z) = P_r * f(\theta)$ for $z = re^{i\theta} \in B(0;1)$ and $Hf(z) = f(e^{i\theta})$ if $|z| = 1$. Then, $Hf$ is bounded in $\overline{B}(0;1)$ and
\[ \|Hf\|_{u(\overline{B}(0;1))} \le \|f\|_{u(\mathbb S^1)}. \tag{15.43} \]
For each $e_k(\theta) = e^{ik\theta}$, $k \in \mathbb Z$, we have that $He_k(re^{i\theta}) = r^{|k|}e_k(\theta)$; consequently, $Hg \in C(\overline{B}(0;1))$ and $Hg(e^{i\theta}) = g(e^{i\theta})$ for any trigonometric polynomial $g$. As trigonometric polynomials are dense in $C(\mathbb S^1)$, we conclude from (15.43) that $Hf \in C(\overline{B}(0;1))$ and $Hf(e^{i\theta}) = f(e^{i\theta})$. In the remainder of this section, we use $P[f]$ to denote the function $Hf$ on $\overline{B}(0;1)$ introduced above.
The next result shows that any continuous function $f$ on $\overline{B}(0;1)$ that is harmonic on $B(0;1)$ is obtained by applying the Poisson kernel to the restriction of $f$ to $\partial B(0;1) = \mathbb S^1$. For any function $u$ on $B(0;1)$, we use $u_r$ to denote the map on $\mathbb S^1$ given by $\theta \mapsto u(re^{i\theta})$. For
any function f on S1 .
Theorem 15.8.7. If u ∈ C(B(0; 1)) and harmonic on B(0; 1), then u = P [u].
Proof. As f is harmonic on B(0; 1), so is fr (z) = f (rz); moreover, fr ∈ C(B(0; 1)). The-
orem 15.8.7 shows that for any 0 ≤ r, ρ < 1, frρ (θ) = f (rρeiθ ) = fr (ρeiθ ) = Pρ ∗ fr (θ).
15.9. Exercises
Exercise 15.9.1. Suppose that $\hat\mu(t)$ is the characteristic function of a finite positive measure $\mu$ on $(\mathbb R^n, \mathcal B(\mathbb R^n))$. Show that
(a) $\hat\mu(-t) = \overline{\hat\mu(t)}$.
(b) $\hat\mu$ is uniformly continuous and $|\hat\mu(t)| \le \hat\mu(0) = \mu(\mathbb R^n)$.
(c) $\hat\mu$ is positive definite, i.e., $\sum_{k,j=1}^m \hat\mu(t_k - t_j)\,z_k\overline{z_j} \ge 0$ for all $t_j \in \mathbb R^n$ and $z_j \in \mathbb C$, $j = 1, \ldots, m$.
(d) For any $g \in L^1(\mathbb R^n)$, show that $\int g(x)\,\hat\mu(x - y)\,\overline{g(y)}\,dx \otimes dy \ge 0$.
Remark 15.9.2. In Section 18.5 we will show that a function $\varphi$ that satisfies conditions (b) and (c) of Exercise 15.9.1 is in fact the characteristic function of a finite measure $\mu$ in $\mathcal B(\mathbb R^d)$.
Exercise 15.9.3. (Hamburger moment problem.) The question is whether given a sequence of real numbers $\{m_n : n \in \mathbb Z_+\}$, $m_0 = 1$, there is a unique probability measure $\mu$ on $(\mathbb R, \mathcal B(\mathbb R))$ such that $m_n = \int x^n\,\mu(dx)$. Suppose there is one such probability measure, and define $M_n := \int |x|^n\,\mu(dx)$. Assume
\[ r := \limsup_n \frac{1}{2n}\sqrt[2n]{M_{2n}} < \infty. \]
Show that
(a) $r = \limsup_n \frac1n \sqrt[n]{M_n}$. (Hint: By Cauchy–Schwarz, $M_{2n+1} \le (M_{2n+2}M_{2n})^{1/2}$.)
(b) For $|z| < \frac{1}{re}$, $\int e^{|xz|}\,\mu(dx) < \infty$. (Hint: Given $\varepsilon > 0$, there is $N \in \mathbb N$ such that $M_n < n^n(r + \varepsilon)^n$ whenever $n \ge N$. Thus
\[ \frac{|z|^n M_n}{n!} \le |e\,z\,(r + \varepsilon)|^n \]
and so, $\sum_{n \ge 0} \frac{M_n z^n}{n!}$ converges for $|z| < 1/(re)$.)
(c) $\hat\mu$ admits an analytic extension to the strip $D = \{z \in \mathbb C : |\operatorname{Im}(z)| < \frac{1}{re}\}$.
Conclude that $\mu$ must be unique.
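Exercise 15.9.4. Suppose $f \in L^1$ and $h \in \mathbb R^n$, $\alpha \in \mathbb R$. Show that
(a) If $g(x) = f(x)e^{2\pi i x \cdot h}$ then $\hat g(y) = \hat f(y - h)$.
(b) If $g(x) = f(x - h)$ then $\hat g(y) = \hat f(y)e^{-2\pi i h \cdot y}$.
(c) If $g(x) = \overline{f(-x)}$, then $\hat g(y) = \overline{\hat f(y)}$.
(d) If $g(x) = f(x/\alpha)$ and $\alpha > 0$, then $\hat g(y) = \alpha^n \hat f(\alpha y)$.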
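Exercise 15.9.10. The Bessel kernel of order $\alpha > 0$ is the function on $\mathbb R^n$ defined by
\[ G_\alpha(x) = \frac{1}{(4\pi)^{n/2}\,\Gamma\big(\frac{\alpha}{2}\big)} \int_0^\infty \exp\Big(-\frac{|x|^2}{4t} - t\Big)\, t^{\frac{\alpha - n}{2} - 1}\,dt \]
(a) Show that $\{G_\alpha : \alpha > 0\} \subset L^1(\mathbb R^n, \lambda_n)$ and $\int G_\alpha\,d\lambda_n = 1$.
(b) Show that $\widehat{G_\alpha}(\xi) = (1 + |2\pi\xi|^2)^{-\frac{\alpha}{2}}$, and that $G_\alpha * G_\beta = G_{\alpha+\beta}$.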
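Exercise 15.9.11. The gamma distribution with parameters $u > 0$ and $\theta > 0$ is defined as the Borel measure $\gamma_{u,\theta}$ on $\mathbb R$ with
\[ \frac{d\gamma_{u,\theta}}{dx}(x) = \frac{\theta^u}{\Gamma(u)}\,x^{u-1}e^{-\theta x}\,1_{(0,\infty)}(x). \]
The cases $\gamma_{1,\theta}$ and $\gamma_{1/2,1/2}$ correspond to the exponential $E(\theta)$ and $\chi^2_1$ distributions respectively. For $n \in \mathbb N$, $\gamma_{n,\theta}$ is known as the Erlang distribution $E(\theta, n)$. Show that
(a) $\int e^{sx}\,\gamma_{u,\theta}(dx) = \theta^u(\theta - s)^{-u} < \infty$ for all $s \in (-\infty, \theta)$.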
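(b) The map $G_{u,\theta}(z) = \int e^{zx}\,\gamma_{u,\theta}(dx)$ is analytic on $(-\infty, \theta) \times \mathbb R \subset \mathbb C$.
(c) $G_{u,\theta}(z) = G_{u,1}\big(\frac{z}{\theta}\big) = 1 + \sum_{n \ge 1} \frac{\Gamma(u+n)}{\Gamma(u)}\frac{z^n}{n!\,\theta^n} = \theta^u(\theta - z)^{-u}$ for all $|z| < \theta$.
(d) The characteristic function $\widehat{\gamma_{u,\theta}}(s) = \theta^u(\theta - is)^{-u}$.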
Exercise 15.9.12. Let $\gamma_{u,\theta}$ be the Gamma measure on $\big((0,\infty), \mathcal B((0,\infty))\big)$. Show that the measure on $(0,\infty)$ induced by the function $Y : x \mapsto \frac1x$ is absolutely continuous w.r.t. the Lebesgue measure on $(0,\infty)$ with Radon–Nikodym derivative given by
\[ f_{u,\theta}(y) = \frac{\theta^u}{\Gamma(u)}\,y^{-(u+1)}e^{-\frac{\theta}{y}}\,1_{(0,\infty)}(y) \]
This induced measure is called the inverse–gamma distribution, and is denoted by $Ig(u, \theta)$.
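Exercise 15.9.13. Let $\mu$ and $\nu$ be two complex measures in $\mathcal B(\mathbb R^n)$. Assume that $\mu \ll \lambda_n$ and that $f = \frac{d\mu}{d\lambda_n}$. Show that $\mu * \nu \ll \lambda_n$ and that $\frac{d(\mu * \nu)}{d\lambda_n}(x) = f * \nu(x) := \int_{\mathbb R^n} f(x - y)\,\nu(dy)$.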
Exercise 15.9.14. Let $A, B \in \mathcal B(\mathbb R^n)$ be such that $\lambda(A), \lambda(B) > 0$. Show that $A + B$ contains an open ball. (Hint: Without loss of generality assume that $A$ and $B$ are compact. Then $1_A * 1_B$ is continuous and not identically zero.)
Exercise 15.9.15. For each $a > 0$ define $f_a(x) = 1_{[-a,a]} * 1_{[-1,1]}(x)$. Then $\widehat{f_a}(t) = \frac{1}{(\pi t)^2}\sin(2\pi a t)\sin(2\pi t) \in L^1(\mathbb R)$. Show that $\|f_a\|_u = 4$ and that $\lim_{a \to \infty}\|\widehat{f_a}\|_1 = \infty$. Conclude from the open mapping theorem that the Fourier transform map $f \mapsto \hat f$ from $L^1$ to $C_0$ is not surjective.
Exercise 15.9.16. If $\gamma_{u,\theta}$ denotes the gamma measure with parameters $u > 0$, $\theta > 0$, then $\gamma_{u_1,\theta} * \gamma_{u_2,\theta} = \gamma_{u_1+u_2,\theta}$. In particular (a) the convolution of the exponential distribution $E(\theta)$ $n$ times with itself is the Erlang $E(n, \theta)$ distribution; (b) the convolution $\chi^2_1 * \chi^2_1$ is the exponential distribution $E(1/2)$.
Exercise 15.9.17. Suppose that µ is a finite Borel measure on Rn such that µ ∗ ν = ν for
some Borel measure ν not identically zero. Show that µ = δ0 .
Exercise 15.9.18. Let $U$ be the renewal measure associated to a positive Radon measure $\mu$ on $\mathbb R_+$. Let $\lambda$ be Lebesgue's measure on $[0, \infty)$. Suppose $\nu$ is another Radon measure on $[0, \infty)$, and $z$ is a measurable function on $[0, \infty)$ bounded in compact sets. Show that
(a) $U$ is a Radon measure on $([0,\infty), \mathcal B([0,\infty)))$. (Hint: For finite $\mu$ consider $\check\mu(s) = \int_{[0,\infty)} e^{-sx}\,\mu(dx)$. For infinite $\mu$, consider $\mu_t(dx) = 1_{[0,t]}(x)\mu(dx)$ and check that $\mu_t^{*n}[0, s] = \mu^{*n}[0, s]$ for all $n$ and $0 \le s \le t$.)
Exercise 15.9.23. Assume $g \in L^2(\lambda_n)$. If $\mu$ is a complex Borel measure, show that $\mathcal F(g * \mu) = \hat g\,\hat\mu$. (Hint: Consider first functions $g \in L^1 \cap L^2$.)
Exercise 15.9.24. Let $g \in L^2(\lambda_n)$ and suppose that $\hat g \in L^\infty(\mathbb R^n, \lambda_n)$. If $f \in L^2(\lambda_n)$, show that $f * g \in L^2(\lambda_n)$ and $\mathcal F(f * g) = \hat f\,\hat g$. (Hint: Consider first functions $f \in L^1 \cap L^2$.)
Exercise 15.9.25. Show that D is dense in Lp (µ), 1 ≤ p < ∞, for any regular measure
µ on B(Rn ). (Hint: Use the density of C00 (Rn ) combined with Stone–Weierstrass theorem
and Exercise 13.7.4.)
Exercise 15.9.26. Suppose $\phi \in L^p$ admits a partial derivative $\partial_{x_j}\phi$ at every point $x \in \mathbb R^n$ and that $|\partial_{x_j}\phi(x)| \le \frac{A}{(1 + |x|^2)^{(n+\alpha)/2}}$ for some constants $A > 0$ and $\alpha > 0$. Show that $\partial_{x_j}\phi$ is also the $L^p$ partial derivative of $\phi$.
Exercise 15.9.27. Show that the shift operator $\tau_h : \phi(x) \mapsto \phi(x - h)$ is continuous on $\mathcal S$ and that $\tau_h\phi \xrightarrow{h \to 0} \phi$ in $\mathcal S$.
Exercise 15.9.28. For any tempered distribution L show that
(a) uL = L ◦ ι, where ι : D(Rn ) → S is the inclusion map, is a distribution in D(Rn ).
(b) There is a unique uL ∈ D∗ (Rn ) such that uL = L ◦ ι.
(c) For any α ∈ Zn+ , polynomial P , and g ∈ S the following are also tempered distri-
butions: Dα L(φ) := (−1)|α| L(Dα φ), P · L(φ) := L(P φ), g · L(φ) := L(gφ).
Exercise 15.9.29. Show that (15.39) and (15.40) are equivalent.
Exercise 15.9.30. Find all radial functions in $\mathbb R^n$ that are harmonic. (Hint: Suppose $u(x) = v(r)$, where $r = |x| = (x_1^2 + \ldots + x_n^2)^{1/2}$. Show that $\triangle u = v''(r) + \frac{n-1}{r}v'(r)$ on $\mathbb R^n \setminus \{0\}$.)
Chapter 16
Countable product of
probability spaces
Products of measurable spaces are common in Probability theory as they provide the natural setting for the study of sequences of random variables and, more generally, the construction of random processes.
16.2. Independence
The concept of independence plays an important role in Probability and Statistics. It is related to the notion that one can repeat an experiment whose outcomes neither influence nor are influenced by the outcomes of other experiments.
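Definition 16.2.1. Consider a probability space $(\Omega, \mathcal F, \mathbb P)$. The sets in a collection $\mathcal C \subset \mathcal F$ are mutually independent if for any finite sub-collection $\mathcal D \subset \mathcal C$,
\[ \mathbb P\Big[\bigcap_{C \in \mathcal D} C\Big] = \prod_{C \in \mathcal D} \mathbb P[C]. \]
The collections in $\{\mathcal C_t \subset \mathcal F : t \in T\}$ are independent if for any finite $I \subset T$ and any choice $C_i \in \mathcal C_i$, $i \in I$,
\[ \mathbb P\Big[\bigcap_{i \in I} C_i\Big] = \prod_{i \in I} \mathbb P[C_i]. \]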
Example 16.2.2. Consider $([0,1], \mathcal B([0,1]), \lambda)$. The sets $A = [0, \frac12]$ and $B = [\frac14, \frac34]$ are independent, for $\lambda(A \cap B) = \frac14 = \lambda(A)\lambda(B)$.
Recall that every $x \in [0,1]$ has a unique binary expansion $x = \sum_{n \ge 1} r_n/2^n$ where $r_n \in \{0,1\}$, and $\sum_{n \ge 1} r_n = \infty$ for $x > 0$. Observe that for each $n \in \mathbb N$, the $n$–th bit map $x \mapsto r_n(x)$ defines a measurable function from $([0,1], \mathcal B([0,1]))$ to $(\{0,1\}, \mathcal P(\{0,1\}))$. Therefore, the map $\beta : [0,1] \to \{0,1\}^{\mathbb N}$ given by $x \mapsto (r_n(x))$ is measurable. The next result is a mathematical formulation of tossing a fair coin.
Lemma 16.2.4. Suppose $\theta \sim U[0,1]$, and let $\{X_n = r_n \circ \theta\}$ be its binary expansion. Then, $\{X_n\}$ is an i.i.d. Bernoulli sequence with rate $p = \frac12$. Conversely, if $(X_n)$ is an i.i.d. Bernoulli sequence with rate $p = \frac12$, then $\theta = \sum_{n \ge 1} 2^{-n}X_n \sim U[0,1]$.
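Proof. Suppose that $\theta \sim U(0,1)$. For any $N \in \mathbb N$ and $k_1, \ldots, k_N \in \{0,1\}$,
\begin{align*}
\bigcap_{j=1}^N \{x \in (0,1] : r_j(x) = k_j\} &= \Big(\sum_{j=1}^N \frac{k_j}{2^j},\ \sum_{j=1}^N \frac{k_j}{2^j} + \frac{1}{2^N}\Big] \\
\{x \in (0,1] : r_N(x) = 0\} &= \bigcup_{j=0}^{2^{N-1}-1} \Big(\frac{2j}{2^N},\ \frac{2j+1}{2^N}\Big] \\
\{x \in (0,1] : r_N(x) = 1\} &= \bigcup_{j=0}^{2^{N-1}-1} \Big(\frac{2j+1}{2^N},\ \frac{2(j+1)}{2^N}\Big]
\end{align*}
It follows immediately that $\mathbb P\big[\bigcap_{j=1}^N \{X_j = k_j\}\big] = \frac{1}{2^N} = \prod_{j=1}^N \mathbb P[X_j = k_j]$. Hence $\{X_n\}$ is a Bernoulli sequence with rate $\frac12$.
Conversely, suppose $\{X_n : n \ge 1\}$ is a Bernoulli sequence with rate $\frac12$. If $\tilde\theta \sim U(0,1)$, then the first part shows that the sequence of bits $\{\tilde X_n\} \stackrel{d}{=} \{X_n\}$. Therefore,
\[ \theta := \sum_{n \ge 1} 2^{-n}X_n \stackrel{d}{=} \sum_{n \ge 1} 2^{-n}\tilde X_n = \tilde\theta \]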
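The dyadic-interval description used in the proof can be checked on a finite grid. The following is a minimal sketch, assuming exact rational arithmetic; the helper name `bits`, the grid of 64 points and the prefix length `N = 3` are illustrative choices that do not appear in the text.

```python
from fractions import Fraction
from itertools import product

def bits(x, n):
    """First n digits r_1(x), ..., r_n(x) of the non-terminating binary
    expansion of x in (0, 1]."""
    digits = []
    for _ in range(n):
        x = 2 * x
        if x > 1:            # strict inequality keeps the expansion non-terminating
            digits.append(1)
            x -= 1
        else:
            digits.append(0)
    return digits

N = 3
grid = [Fraction(j, 64) for j in range(1, 65)]    # equally spaced points of (0, 1]
for k in product([0, 1], repeat=N):
    left = sum(Fraction(kj, 2 ** (j + 1)) for j, kj in enumerate(k))
    right = left + Fraction(1, 2 ** N)
    cell = [x for x in grid if bits(x, N) == list(k)]
    # Points sharing the prefix k are exactly the grid points in (left, right].
    assert cell == [x for x in grid if left < x <= right]
    assert len(cell) == len(grid) // 2 ** N       # each prefix gets mass 1 / 2^N
```

Each of the $2^N$ prefixes receives the same share of the grid, which is the finite counterpart of $\mathbb P[\bigcap_j \{X_j = k_j\}] = 2^{-N}$.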
One can generate a $U[0,1]$ i.i.d. sequence out of a single $U[0,1]$ random variable.
Lemma 16.2.5. There exists a sequence $(f_n)$ of measurable functions on $[0,1]$ such that for any $\theta \sim U[0,1]$, $(f_n(\theta))$ is an i.i.d. sequence of random variables with $f_1(\theta) \sim U[0,1]$.
Proof. Reorder the sequence $(r_m)$ of binary bit maps into a two–dimensional array $(h_{n,j} : n, j \in \mathbb N)$, and define the function $f_n := \sum_{j \ge 1} \frac{h_{n,j}}{2^j}$ on $[0,1]$ for each $n$. By Lemma 16.2.4, $\{X_n = r_n \circ \theta\}$ forms a Bernoulli sequence with rate $p = \frac12$. Thus, the collections $\sigma(X_{n,j} : j \ge 1)$, where $X_{n,j} = h_{n,j} \circ \theta$, are independent. Again, by Lemma 16.2.4, it follows that $(f_n(\theta))$ is an i.i.d. sequence of $U[0,1]$ random variables.
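The reordering of bits in the proof of Lemma 16.2.5 can be sketched with one concrete enumeration. The diagonal enumeration in `index`, the helper `split` and the truncation depth are assumptions made for this illustration (the text does not fix a particular reordering), and pseudo-random bits stand in for the bits of a single uniform variable.

```python
import random

def index(n, j):
    """Position (1-based) of h_{n,j} in the bit sequence (r_m), using the
    diagonal enumeration of N x N (one possible reordering)."""
    return (n + j - 2) * (n + j - 1) // 2 + n

def split(bit_seq, rows, depth):
    """Truncations sum_{j <= depth} h_{n,j} / 2^j of f_1, ..., f_rows;
    bit_seq[m - 1] plays the role of r_m."""
    return [sum(bit_seq[index(n, j) - 1] / 2 ** j for j in range(1, depth + 1))
            for n in range(1, rows + 1)]

rng = random.Random(0)                  # stand-in for the fair bits of one theta
rows, depth = 4, 20
bit_seq = [rng.getrandbits(1) for _ in range(index(rows, depth))]
values = split(bit_seq, rows, depth)
print(values)                           # four numbers in [0, 1) built from disjoint bit sets
```

Because `index` is injective, different rows never reuse a bit, which is what makes the resulting variables independent.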
Proof. It suffices to assume that each $(S_n, \mathcal S_n) = ([0,1], \mathcal B([0,1]))$. Lemma 16.2.5 provides a $U[0,1]$–distributed i.i.d. sequence $(f_n)$ of random variables defined on $[0,1]$. Theorem 4.6.4 shows that for each $n$, there is a map $T_n : [0,1] \to S_n$ such that $\lambda \circ T_n^{-1} = \mu_n$. The map $F$ given by $x \mapsto (T_n(f_n(x)))$ has the stated properties.
Proof. Consider first maps of the form $1_{A \times B}$ with $A \in \mathcal S$ and $B \in \mathcal T$. Then, a monotone class argument shows that $\mu(s, E)$ is $\mathcal S$–measurable for any $E \in \mathcal S \otimes \mathcal T$. By linearity and monotone convergence we extend the $\mathcal S$–measurability of $\mu_g(s)$ to arbitrary $\mathcal S \otimes \mathcal T$–measurable functions $g$.
The following result establishes the existence of a unique probability measure on any
countable product of measurable spaces, where a compatible collection of stochastic kernels
that involve finite–dimensional projections is prescribed. No topological restrictions need
to be imposed on the spaces.
Theorem 16.3.3. (Ionescu Tulcea) For any measurable spaces $(S_n, \mathcal S_n)$ and stochastic kernels $\mu_n$ from $S_1 \times \ldots \times S_{n-1}$ to $S_n$, where $\mu_1$ is a measure on $(S_1, \mathcal S_1)$, there exists a unique probability measure on $\bigotimes_n \mathcal S_n$ such that for any $k$, the law of the projection $(p_1, \ldots, p_k) : \prod_n S_n \to S_1 \times \ldots \times S_k$ is $\mu_1 \otimes \ldots \otimes \mu_k$.
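Proof. Let $\Omega = \prod_n S_n$ and $\mathcal F = \bigotimes_n \mathcal S_n$. For each $n$, let $T_n = \prod_{j > n} S_j$, $\widetilde{\mathcal F}_n = \bigotimes_{j=1}^n \mathcal S_j$, $\mathcal F_n = \widetilde{\mathcal F}_n \times T_n$ and $\mathcal C = \bigcup_n \mathcal F_n$. Observe that $\mathcal C$ is an algebra and that $\sigma(\mathcal C) = \mathcal F$. For each $A \in \widetilde{\mathcal F}_n$, define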
Proof. Consider the measures µn as kernels from S1 × . . . × Sn−1 to Sn and apply Ionescu
Tulcea’s theorem.
Events in $\tau(\{X_i : i \in I\})$ are those whose occurrence is independent of any fixed finite subfamily of $\{X_i : i \in I\}$.
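Proof. The inclusion $\subset$ is obvious. To prove inclusion in the opposite direction, suppose $J \subset I$ is finite. There exists $N \in \mathbb N$ such that $J \subset J_N$. Then
\[ \bigcap_{n=1}^\infty \sigma\Big(\bigcup_{j \in I \setminus J_n} \mathcal F_j\Big) \subset \bigcap_{n=1}^N \sigma\Big(\bigcup_{j \in I \setminus J_n} \mathcal F_j\Big) \subset \sigma\Big(\bigcup_{j \in I \setminus J_N} \mathcal F_j\Big) \subset \sigma\Big(\bigcup_{j \in I \setminus J} \mathcal F_j\Big). \tag{16.6} \]
The left hand side of (16.6) is independent of $J$. Taking the intersection over finite subsets of $I$ gives the reverse inclusion.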
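The set $B = \bigcup_{k=1}^N A_{j_k}$ belongs to $\mathcal E_J$, with $J = \{j_1, \ldots, j_N\}$. Since $A \in \sigma\big(\bigcup_{j \in I \setminus J} A_j\big)$, $A$ and $B$ are independent, and so
\[ \varepsilon > \mathbb P[A \setminus B] = \mathbb P[A](1 - \mathbb P[B]) > \mathbb P[A](1 - \mathbb P[A] - \varepsilon). \]
Letting $\varepsilon \searrow 0$ yields $0 = \mathbb P[A](1 - \mathbb P[A])$.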
collection P of sets in F ⊗I which are invariant under finite permutations forms a σ–algebra
called symmetric or exchangeable σ–algebra. It is easy to verify that
Iθ ⊂ T ⊂ P
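Example 16.5.3. Suppose $(S, \mathcal F) = (\mathbb R, \mathcal B(\mathbb R))$, and let $A, B \in \mathcal B(\mathbb R)$.
(i) $\{x \in \mathbb R^{\mathbb Z_+} : \lim_n x(n) \in A\} \in \mathcal I_\theta$.
(ii) $\{x \in \mathbb R^{\mathbb Z_+} : \lim_n x(2n) \in A,\ \lim_n x(2n+1) \in B\} \in \mathcal T \setminus \mathcal I_\theta$.
(iii) $\{x \in \mathbb R^{\mathbb Z_+} : \sum_{n=1}^\infty x(n) \ge 0\} \in \mathcal P \setminus \mathcal T$.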
Theorem 16.5.4. (Hewitt–Savage 0–1 law.) Suppose that the family of projections $\{p_j : j \in I\}$ is i.i.d. If $A \in \mathcal P$, then $\mathbb P(A) \in \{0, 1\}$.
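Proof. We will consider the case $I = \mathbb Z$. Let $A \in \mathcal P$ and let $B_k \in \mathcal F_{-n_k}^{n_k}$ be a sequence such that $\lim_k \mathbb P[A \triangle B_k] = 0$. For each $j$, let $\operatorname{sign}(j) = -1_{\{n < 0\}}(j) + 1_{\{n \ge 0\}}(j)$. Then,
\[ \pi_k(j) = \begin{cases} j + \operatorname{sign}(n_k - j)(2n_k + 1) & \text{if } -n_k \le j \le 3n_k + 1 \\ j & \text{otherwise} \end{cases} \]
is a finite permutation with $\pi_k \circ \pi_k = \mathrm{Id}$ and $B'_k = \pi_k^{-1}(B_k) \in \mathcal F_{n_k+1}^\infty$. Hence
\[ \mathbb P[A \triangle B'_k] = \mathbb P[\pi_k^{-1}(A \triangle B_k)] = \mathbb P[A \triangle B_k] \to 0 \]
By independence, $\mathbb P[A] = \mathbb P[A \cap A] = \lim_k \mathbb P[B'_k \cap B_k] = \lim_k \mathbb P[B'_k]\mathbb P[B_k] = \mathbb P[A]^2$.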
theorem, they occur infinitely often. The life span of a monkey is of the order of $10^8$ seconds; as a result, it is unlikely that this famous quote will ever be typed this way.
16.6. Symmetrization
Let (Ω, F , P) be a probability space and let X be a real–valued random variable on Ω.
Definition 16.6.1. A probability measure µ on (R, B(R)) is symmetric if µ(A) = µ(−A)
for all A ∈ B(R). A random variable X defined on (Ω, F , P) is symmetric if its law
µX = P ◦ X −1 is symmetric.
Example 16.6.2. Suppose X and X̃ are two i.i.d. random variables in Ω. Then, Y = X −X̃
is symmetric. Y is called a symmetrization of X.
A median $m$ for $X$ is a $\frac12$–quantile of the distribution of $X$, that is
\[ \max\{\mathbb P[X < m], \mathbb P[X > m]\} \le \frac12 \]
Proposition 16.6.3. If $X \in L^1(\mathbb P)$ and $m$ is a median for $X$ then,
\[ \mathbb E[|X - m|] = \inf_{c \in \mathbb R} \mathbb E[|X - c|]. \]
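Proposition 16.6.3 can be illustrated on an empirical distribution. The sketch below assumes a small hypothetical sample with equal weights and a finite grid of candidate centers; the names `mean_abs_dev`, `sample` and `candidates` are introduced only for this example.

```python
import statistics

def mean_abs_dev(sample, c):
    """E[|X - c|] when X is uniform on the finite list `sample`."""
    return sum(abs(x - c) for x in sample) / len(sample)

sample = [0.2, 1.0, 1.5, 3.0, 10.0]       # hypothetical data, equal weights
m = statistics.median(sample)             # a median of the empirical law
candidates = [k / 100 for k in range(-200, 1201)]
best = min(candidates, key=lambda c: mean_abs_dev(sample, c))
print(m, best, mean_abs_dev(sample, m))
```

The large outlier moves the mean far from the bulk of the data, but the minimizer of the mean absolute deviation stays at the median.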
Proof. Let $\{\tilde X_n\}$ be an independent copy of $\{X_n\}$. Observe that for each $n$, $Z_n = S_n - \tilde S_n = \sum_{k=1}^n (X_k - \tilde X_k)$ is a symmetrization of both $S_n - na_n$ and $S_n$. Let $m$ be a median for $X$; then, combining Lemma 16.7 and Theorem 16.6.8, for all $n$ large enough, we obtain
1 1
−nP |X1 −X̃1 |>2nε
2P |Sn − nan | > nε ≥ P |Zn | > 2nε ≥ 1−e
2 2
1 1 1 1
≥ 1 − e− 2 nP |X1 −m|>2nε ≥ 1 − e− 2 nP |X1 |>2nε−|m| .
2 2
Thus, if $\lim_n \mathbb P[|S_n - na_n| > n\varepsilon] = 0$, then $\lim_{x \to \infty} x\,\mathbb P[|X_1| > x] = 0$.
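Proof. Let $T = \inf\{k \ge 1 : |S_k| > \varepsilon\}$ and define $A_k = \{T = k\}$. Observe that $B_n = \{\max_{1 \le k \le n} |S_k| > \varepsilon\} = \bigcup_{k=1}^n A_k$, and that $S_k 1_{A_k}$ is independent from $S_n - S_k$ for all $1 \le k \le n$. Hence
\begin{align*}
\mathbb E[|S_n|^2] \ge \mathbb E[|S_n|^2 1_{B_n}] &= \sum_{k=1}^n \mathbb E[|S_n|^2 1_{A_k}] \ge \sum_{k=1}^n \mathbb E\big[(|S_k|^2 + 2S_k(S_n - S_k))1_{A_k}\big] \\
&= \sum_{k=1}^n \mathbb E[|S_k|^2 1_{A_k}] \ge \varepsilon^2 \sum_{k=1}^n \mathbb P[A_k] = \varepsilon^2\,\mathbb P[B_n]
\end{align*}
On the other hand, $|S_k|1_{A_k} \le (|X_k| + |S_{k-1}|)1_{A_k} \le 1_{A_k}(R + \varepsilon)$; hence
\begin{align*}
\mathbb E[|S_n|^2 1_{B_n}] &= \sum_{k=1}^n \mathbb E\big[(S_k^2 + (S_n - S_k)^2)1_{A_k}\big] = \sum_{k=1}^n \Big(\mathbb E[S_k^2 1_{A_k}] + \mathbb P[A_k]\,\mathbb E[|S_n - S_k|^2]\Big) \\
&\le \big((R + \varepsilon)^2 + \mathbb E[|S_n|^2]\big)\sum_{k=1}^n \mathbb P[A_k] = \mathbb P[B_n]\big((R + \varepsilon)^2 + \mathbb E[|S_n|^2]\big).
\end{align*}
Therefore
\[ \mathbb P[B_n] \ge \frac{\mathbb E[|S_n|^2] - \varepsilon^2}{(R + \varepsilon)^2 + \mathbb E[|S_n|^2] - \varepsilon^2} \ge 1 - \frac{(R + \varepsilon)^2}{\mathbb E[|S_n|^2]} \]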
is injective. To check this, suppose $\omega \ne \omega'$ and let $k$ be the first component such that $\omega_k \ne \omega'_k$. Then
\[ |X'(\omega) - X'(\omega')| \ge 2c_{n_k} - 2\sum_{j \ge 1} c_{n_{k+j}} > 2c_{n_k} - 2c_{n_k}\sum_{j \ge 1} 2^{-j} = 0. \]
hence, $\lim_{n \to \infty} \sup_{l,k \ge n} m_{l,k} = 0$. Therefore, given $\varepsilon > 0$, there is $n_0$ such that
\[ \sup_{k,l \ge n} |m_{k,l}| < \varepsilon, \qquad n \ge n_0 \]
16.8.1. Weak law of large numbers. We start this section with the statement and proof of two technical results that will be useful in the proof of both the weak and strong versions of the LLN.
Lemma 16.8.1. (Cesàro average) Let $\{\mu_t : t > 0\}$ be a family of probability measures on $[0, \infty)$ such that $\lim_{t \to \infty} \mu_t([0, a]) = 0$ for any $a \ge 0$. If $f$ is a bounded measurable function in $[0, \infty)$ and $\lim_{t \to \infty} f(t) = b$, then
\[ \lim_{t \to \infty} \int_{[0,\infty)} f(s)\,\mu_t(ds) = b. \]
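Proof. Given $\varepsilon > 0$, choose $a > 0$ such that $|f(t) - b| < \varepsilon$ whenever $t \ge a$. Then
\begin{align*}
\Big|\int_{[0,\infty)} f(s)\,\mu_t(ds) - b\Big| &\le \int_{[0,a]} |f(s) - b|\,\mu_t(ds) + \int_{(a,\infty)} |f(s) - b|\,\mu_t(ds) \\
&\le 2\|f\|_u\,\mu_t[0, a] + \varepsilon(1 - \mu_t[0, a])
\end{align*}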
\[ \lim_{n \to \infty} \frac{1}{a_n}\sum_{k=1}^n x_k = 0 \tag{16.12} \]
Proof. Let $b_n = \sum_{k=1}^n \frac{x_k}{a_k}$ and $a_0 = 0 = b_0$ so that $x_n = a_n(b_n - b_{n-1})$. If $s_n = \sum_{k=1}^n x_k$, then summation by parts gives
\begin{align*}
s_n = \sum_{k=1}^n a_k(b_k - b_{k-1}) &= a_n b_n + \sum_{k=2}^n a_{k-1}b_{k-1} - \sum_{k=1}^n a_k b_{k-1} \\
&= a_n b_n - \sum_{k=1}^n (a_k - a_{k-1})b_{k-1}.
\end{align*}
Hence
\[ \frac{s_n}{a_n} = b_n - \sum_{k=1}^n \frac{(a_k - a_{k-1})}{a_n}\,b_{k-1}. \tag{16.13} \]
Let $b = \lim_n b_n$. As $a_{k-1} \le a_k \nearrow \infty$, the sum on the right of (16.13) converges to $b$ by Cesàro's lemma and thus, (16.12) follows.
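The summation-by-parts identity and the conclusion (16.12) can be checked numerically on one example. The choice $x_k = (-1)^k\sqrt{k}$, $a_k = k$ below is a hypothetical illustration (the series $\sum_k x_k/a_k$ is an alternating series with terms decreasing to zero, hence convergent); the function name `kronecker_check` is likewise an assumption of this sketch.

```python
import math

def kronecker_check(x, a):
    """Return (s_n / a_n, right-hand side of (16.13)) for finite lists x and a,
    with the conventions a_0 = 0 and b_0 = 0."""
    n = len(x)
    b = [0.0]                                   # b_0 = 0
    for k in range(n):
        b.append(b[-1] + x[k] / a[k])           # b_k = sum_{i <= k} x_i / a_i
    a_ext = [0.0] + list(a)                     # a_0 = 0
    lhs = sum(x) / a_ext[n]
    rhs = b[n] - sum((a_ext[k] - a_ext[k - 1]) * b[k - 1]
                     for k in range(1, n + 1)) / a_ext[n]
    return lhs, rhs

n = 100_000
x = [(-1) ** k * math.sqrt(k) for k in range(1, n + 1)]
a = [float(k) for k in range(1, n + 1)]
lhs, rhs = kronecker_check(x, a)
print(lhs, rhs)      # both sides agree up to rounding and are close to 0
```

The partial sums $s_n$ themselves do not converge here; only the normalized quantity $s_n/a_n$ tends to zero, as the lemma asserts.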
Theorem 16.8.3. For each $n$, let $X_{n,m}$, $m = 1, \ldots, m_n$, be independent random variables. Let $\{b_n\} \subset (0, \infty)$ be a numeric sequence with $b_n \to \infty$ and define the truncated sequence $\tilde X_{n,m} = X_{n,m}1_{\{|X_{n,m}| \le b_n\}}$. Suppose that
(a) $\lim_{n \to \infty} \sum_{m=1}^{m_n} \mathbb P[|X_{n,m}| > b_n] = 0$;
(b) $\lim_{n \to \infty} \frac{1}{b_n^2}\sum_{m=1}^{m_n} \mathbb E[|\tilde X_{n,m}|^2] = 0$.
Let $S_n = \sum_{m=1}^{m_n} X_{n,m}$ and $a_n = \mathbb E\big[\sum_{m=1}^{m_n} \tilde X_{n,m}\big]$. Then, $\frac{S_n - a_n}{b_n}$ converges to 0 in probability.
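\begin{align*}
\mathbb E[|\tilde X_1|^2] &= \int_0^\infty 2t\,\mathbb P[|\tilde X_1| > t]\,dt = \int_0^n 2t\,\mathbb P[|\tilde X_1| > t]\,dt \\
&= \int_0^n 2t\,\mathbb P[n \ge |X_1| > t]\,dt \le \int_0^n 2t\,\mathbb P[|X_1| > t]\,dt.
\end{align*}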
16.8.2. Strong law of large numbers. The following result is one version of the strong
law of large numbers for i.i.d. random variables.
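Proof. Sufficiency: Assume that $\mathbb E[|X_1|^p] < \infty$ and also that $\mathbb E[X_1] = 0$ if $p > 1$. Let $\tilde X_n = X_n 1_{\{|X_n| \le n^{1/p}\}}$. By Fubini's theorem, we have that
\begin{align*}
\sum_n \mathbb P[X_n \ne \tilde X_n] = \sum_n \mathbb P[|X_n| > n^{1/p}] &\le \sum_n \int_{n-1}^n \mathbb P[|X_n| > t^{1/p}]\,dt \\
&= \int_0^\infty \mathbb P[|X|^p > t]\,dt = \mathbb E[|X|^p] < \infty
\end{align*}
The Borel–Cantelli lemma shows that $\mathbb P[X_n \ne \tilde X_n \text{ i.o.}] = 0$. Consequently, to show that $A_n(p)$ converges it is enough to show that $\frac{1}{n^{1/p}}\sum_{k=1}^n \tilde X_k \to 0$ $\mathbb P$–a.s. By Kronecker's lemma, it suffices to show that $\sum_n \frac{\tilde X_n}{n^{1/p}}$ converges $\mathbb P$–a.s.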
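If $p > 1$, then by Kolmogorov's lemma, it is enough to show that $\sum_n \operatorname{var}\big(\frac{1}{n^{1/p}}\tilde X_n\big) = \sum_n \frac{1}{n^{2/p}}\operatorname{var}(\tilde X_n)$ and $\sum_n \frac{1}{n^{1/p}}\mathbb E[\tilde X_n]$ converge. For the former series, the following estimate holds for $p \ge 1$
\begin{align*}
\sum_n \frac{1}{n^{2/p}}\operatorname{var}(\tilde X_n) &\le \sum_n \frac{1}{n^{2/p}}\,\mathbb E[|X_n|^2; |X_n| \le n^{1/p}] \\
&\le 2^{2/p}\int_0^\infty \frac{1}{t^{2/p}}\,\mathbb E[|X_1|^2; |X_1| \le t^{1/p}]\,dt \\
&= 4^{1/p}\,\mathbb E\Big[|X_1|^2\int_{|X_1|^p}^\infty \frac{1}{t^{2/p}}\,dt\Big] = \frac{4^{1/p}\,p}{2 - p}\,\mathbb E[|X_1|^p] < \infty
\end{align*}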
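As for the latter series, observe that $\mathbb E[\tilde X_n] = -\mathbb E[X_n; |X_n| > n^{1/p}]$. Hence, for $p > 1$
\begin{align*}
\sum_n \frac{1}{n^{1/p}}|\mathbb E[\tilde X_n]| &\le \sum_n \frac{1}{n^{1/p}}\,\mathbb E[|X_n|; |X_n| > n^{1/p}] \\
&\le \int_0^\infty \frac{1}{t^{1/p}}\,\mathbb E[|X_1|; |X_1| > t^{1/p}]\,dt \\
&= \mathbb E\Big[|X_1|\int_0^{|X_1|^p} \frac{1}{t^{1/p}}\,dt\Big] = \frac{p}{p - 1}\,\mathbb E[|X_1|^p] < \infty.
\end{align*}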
In the special case $p = 1$, notice that $\frac1n\sum_{k=1}^n \mathbb E[X_k; |X_k| \le k] \to 0$ since by dominated convergence $\mathbb E[X_n; |X_n| \le n] = \mathbb E[X_1; |X_1| \le n] \to \mathbb E[X_1] = 0$. Hence, it suffices to show
that $\frac1n\sum_{k=1}^n (\tilde X_k - \mathbb E[\tilde X_k]) \to 0$, which follows from the previous estimate $\sum_n \frac{1}{n^{2/p}}\operatorname{var}(\tilde X_n) \le C_p\,\mathbb E[|X_1|^p]$ with $p = 1$ and Kolmogorov's lemma.
Necessity: Assume that $A_p := \lim_n \frac{1}{n^{1/p}}S_n$ converges $\mathbb P$–a.s. Then
\[ \frac{X_n}{n^{1/p}} = \frac{S_n}{n^{1/p}} - \Big(\frac{n-1}{n}\Big)^{1/p}\frac{S_{n-1}}{(n-1)^{1/p}} \to 0 \]
Consequently, $\mathbb P[|X_n| > n^{1/p} \text{ i.o.}] = 0$ and by the reversed Borel–Cantelli lemma and Fubini's theorem
\[ \mathbb E[|X_1|^p] = \int_0^\infty \mathbb P[|X_1|^p > t]\,dt \le 1 + \sum_{n \ge 1} \mathbb P[|X_1| > n^{1/p}] < \infty. \]
The proof of sufficiency shows that $A_p = 0$ for $p < 1$ and $A_1 = \mathbb E[X_1]$. If $p > 1$, the proof of sufficiency shows that
\[ \frac{1}{n^{1/p}}\sum_{k=1}^n (X_k - \mathbb E[X_k]) = \frac{1}{n^{1/p}}(S_n - n\mathbb E[X_1]) \to 0 \]
Proof. Only $L^1$ needs to be proved. For bounded $X_1$, the conclusion of the statement follows from dominated convergence. For general $X_1$, use the truncation $X_n^m = X_n 1_{\{|X_n| \le m\}}$. By dominated convergence, for any $\varepsilon$, there is $m_0$ such that $\|X_1^m - X_1\|_1 < \varepsilon/3$ for all $m \ge m_0$. Let $S_n^m = \sum_{k=1}^n X_k^m$, then
\begin{align*}
\|S_n - \mathbb E[X_1]\|_1 &\le \|S_n - S_n^{m_0}\|_1 + \|S_n^{m_0} - \mathbb E[X_1^{m_0}]\|_1 + \|X_1^{m_0} - X_1\|_1 \\
&\le 2\|X_1^{m_0} - X_1\|_1 + \|S_n^{m_0} - \mathbb E[X_1^{m_0}]\|_1 < 2\varepsilon/3 + \|S_n^{m_0} - \mathbb E[X_1^{m_0}]\|_1
\end{align*}
The conclusion follows by first letting $n \to \infty$, and then $\varepsilon \to 0$.
Proof. The Hewitt–Savage 0–1 law implies that for some constant $c \in \overline{\mathbb R}$, $\liminf_n S_n = c$ $\mathbb P$–a.s. Since $Y$ is an i.i.d. sequence, $\{S_{n+1} - Y_1 : n \in \mathbb N\}$ and $\{S_n : n \in \mathbb N\}$ have the same distribution; therefore, $c - Y_1 = c$ $\mathbb P$–a.s. If $c$ is finite, then $Y_1 = 0$ $\mathbb P$–a.s., which in turn implies (i). If $Y_1 \not\equiv 0$, then $c$ is either $+\infty$ or $-\infty$. The same analysis applies to $\limsup_n S_n$. Clearly the combination $\limsup_n S_n = -\infty$ and $\liminf_n S_n = +\infty$ cannot occur. This proves the theorem.
In the remainder of this section we will analyze how often a random walk on $\mathbb R^d$ returns near a point $x \in \mathbb R^d$.
Definition 16.9.2. Suppose $S = \{S_n : n \in \mathbb Z_+\}$ is a random walk in $\mathbb R^d$. A point $x$ is said to be a recurrent point for $S$ if for every $\varepsilon > 0$, $\mathbb P[\|S_n - x\| < \varepsilon \text{ i.o.}] = 1$. A point $x$ is said to be a possible value for $S$ if for any $\varepsilon > 0$, there is $n \in \mathbb N$ such that $\mathbb P[\|S_n - x\| < \varepsilon] > 0$.
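Notice that
\[ \{\|S_k - x\| < \varepsilon\} \cap \bigcap_{n \ge m} \{\|S_{n+k} - S_k - (y - x)\| \ge 2\varepsilon\} \subset \bigcap_{n \ge m} \{\|S_{n+k} - y\| \ge \varepsilon\} \]
This contradicts the assumption $y \in V$, which implies that $\mathbb P[\|S_n - y\| < \varepsilon \text{ i.o.}] = 1$. Therefore $y - x \in V$.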
16.9. Random Walks 525
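Proof. The first statement is a direct consequence of Borel–Cantelli's theorem. For the second part, set $F = \{\|S_n\| < \varepsilon \text{ i.o.}\}^c$. Looking at the last time $\|S_n\| < \varepsilon$ we obtain
\begin{align*}
\mathbb P[F] &= \sum_{m \ge 0} \mathbb P\Big[\{\|S_m\| < \varepsilon\} \cap \bigcap_{n \ge m+1} \{\|S_n\| \ge \varepsilon\}\Big] \\
&\ge \sum_{m \ge 0} \mathbb P\Big[\{\|S_m\| < \varepsilon\} \cap \bigcap_{n \ge m+1} \{\|S_n - S_m\| \ge 2\varepsilon\}\Big] \\
&= \sum_{m \ge 0} \mathbb P[\|S_m\| < \varepsilon]\;\mathbb P\Big[\bigcap_{n \ge 1} \{\|S_n\| \ge 2\varepsilon\}\Big]
\end{align*}
Since $\mathbb P[F] < 1$ and $\sum_{m \ge 0} \mathbb P[\|S_m\| < \varepsilon] = \infty$, we conclude that $\mathbb P\big[\bigcap_{n \ge 1} \{\|S_n\| \ge 2\varepsilon\}\big] = 0$. Let $k \ge 2$ and set
\[ A(m, k) := \{\|S_m\| < \varepsilon\} \cap \bigcap_{n \ge m+k} \{\|S_n\| \ge \varepsilon\} \]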
The next result shows that convergence of the series in Lemma 16.9.4 is independent of $\varepsilon > 0$. To make things simpler, we will use the uniform norm $\|x\| = \max_{1 \le j \le d} |x_j|$ on $\mathbb R^d$.
Lemma 16.9.5. For any integer $m \ge 2$
\[ \sum_{n \ge 0} \mathbb P[\|S_n\| < m\varepsilon] \le (2m)^d \sum_{n \ge 0} \mathbb P[\|S_n\| < \varepsilon] \]
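Proof. Dividing the $d$–cube $(-m\varepsilon, m\varepsilon)^d$ in $(2m)^d$ cubes of size $\varepsilon$ we obtain that
\[ \sum_{n \ge 0} \mathbb P[\|S_n\| < m\varepsilon] \le \sum_{n \ge 0}\sum_k \mathbb P[S_n \in k\varepsilon + [0, \varepsilon)^d] \]
Since the events $\{T_k = \ell\}$ and $\{\|S_n - S_\ell\| < \varepsilon\}$ are independent, we further obtain that
\begin{align*}
\sum_{\ell \ge 0}\sum_{n \ge \ell} \mathbb P\big[\|S_n - S_\ell\| < \varepsilon,\ T_k = \ell\big] &= \sum_{\ell \ge 0} \mathbb P[T_k = \ell]\sum_{n \ge 0} \mathbb P[\|S_n\| < \varepsilon] \\
&\le \sum_{n \ge 0} \mathbb P[\|S_n\| < \varepsilon]
\end{align*}
As the cardinality of $\{-m, \ldots, m-1\}^d$ is $(2m)^d$, the conclusion of the lemma follows.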
Theorem 16.9.6. For any random walk $S_n$, $V = \emptyset$ iff $\sum_{n \ge 0} \mathbb P[\|S_n\| < \varepsilon] < \infty$ for some (and hence all) $\varepsilon > 0$.
Proof. Suppose $V$ is not a lattice, i.e., there is no $h > 0$ for which $V/h \subset \mathbb Z$. We claim that $m := \inf\{x \in V : x > 0\} = 0$. Suppose $m > 0$. Then, there is $d \in V$ such that $mq < d < m(q + 1)$ for some $q \in \mathbb N$. Hence
\[ m < \frac{d}{q} < m\Big(1 + \frac1q\Big) \]
Then, there is $x \in V$ such that $m \le x < \frac{d}{q}$; consequently,
\[ 0 < d - xq < q(m - x) + m \le m \]
Since $d - xq \in V$, this contradicts the definition of $m$. Therefore, $m = 0$. To conclude, for any $\varepsilon > 0$ choose $v \in G$ with $0 < v < \varepsilon$. Since $G \cap (0, \infty) = \bigcup_{n \ge 0} (nv, (n+1)v]$ and $\{nv : n \in \mathbb N\} \subset V$, we conclude that any $x \in (0, \infty)$ is within $\varepsilon$–distance of $V$. This shows that $V$ is dense in $\mathbb R$.
We conclude this section with an important result for one–dimensional random walks.
Theorem 16.9.8. (Chung–Fuchs) Suppose $S_n$ is a random walk on $\mathbb R$. If $\frac1n S_n \to 0$ in probability, then $V \ne \emptyset$.
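Proof. By Theorem 16.9.6, it suffices to show that $\sum_{n \ge 0} \mathbb P[|S_n| < 1] = \infty$. By Lemma 16.9.5, for any $m \ge 2$ and $L \in \mathbb N$
\[ \sum_{n \ge 0} \mathbb P[|S_n| < 1] \ge \frac{1}{2m}\sum_{n \ge 0} \mathbb P[|S_n| < m] \ge \frac{1}{2m}\sum_{n=0}^{Lm} \mathbb P\Big[|S_n| < \frac{n}{L}\Big] \]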
16.10. Exercises
Exercise 16.10.1. Show that if $\{\mathcal C_t : t \in T\}$ is an independent family of $\pi$–systems then, the $\sigma$–algebras $\sigma(\mathcal C_t)$, $t \in T$, are independent.
Exercise 16.10.2. Suppose $X, Y$ are identically distributed random variables and that $X > 0$ and $\mathbb E[X] < \infty$. Show that $\mathbb E\big[\frac{Y}{X}\big] > 1$, unless $X$ is constant a.s.
Exercise 16.10.4. Suppose X and Y are independent Rd –valued random vectors (that is,
Rd –valued measurable functions) defined on a common probability space (Ω, F , P). Let µX
and µY be the laws of X and Y respectively. Show that the law µZ of Z = X + Y is given
by the convolution µX ∗ µY .
Exercise 16.10.6. Let $(\epsilon_n : n \in \mathbb N)$ be an i.i.d. sequence of Bernoulli random variables with $p = 1/2$. Let
\[ X = \sum_{n \ge 1} \frac{\epsilon_n}{3^n} \]
Show that $X$ has the Cantor Devil's stairs distribution defined in Example 3.4.4. Find $\mathbb E[X]$ and $\operatorname{var}[X]$.
Exercise 16.10.7. If $Z$ is a compound random walk subordinated by $N$ with $P$–distributed steps, show that
(i) $\varphi_Z(t) = \mathbb E[e^{itZ}] = \sum_{n=0}^\infty \varphi_P^n(t)\,\mathbb P[N = n] = \mathbb E[(\varphi_P(t))^N]$.
(ii) If $N$ is Poisson distributed with parameter $\lambda$ then $\varphi_Z(t) = \exp\big(\lambda(\varphi_P(t) - 1)\big)$.
(iii) If $N$ is geometric with parameter $p$, then $\varphi_Z(t) = \frac{p}{1 - (1 - p)\varphi_P(t)}$.
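The closed forms in (ii) and (iii) can be compared with the truncated series from (i). The sketch below rests on assumptions that are not in the text: the step law $P$ is taken to be uniform on $\{-1, +1\}$ (so $\varphi_P(t) = \cos t$), the geometric law is taken on $\{0, 1, 2, \ldots\}$ with $\mathbb P[N = n] = p(1-p)^n$, and the parameter values and function names are illustrative.

```python
import cmath
import math

def phi_step(t):
    """Characteristic function of a hypothetical step law P: uniform on {-1, +1}."""
    return complex(math.cos(t), 0.0)

def phi_compound(t, pmf, n_max=60):
    """Truncated series sum_{n < n_max} phi_P(t)^n * P[N = n] from part (i)."""
    return sum(phi_step(t) ** n * pmf(n) for n in range(n_max))

lam, p, t = 2.5, 0.3, 0.7
poisson = lambda n: math.exp(-lam) * lam ** n / math.factorial(n)
geometric = lambda n: p * (1 - p) ** n        # assumes N takes values 0, 1, 2, ...

closed_poisson = cmath.exp(lam * (phi_step(t) - 1))
closed_geometric = p / (1 - (1 - p) * phi_step(t))
print(abs(phi_compound(t, poisson) - closed_poisson))
print(abs(phi_compound(t, geometric) - closed_geometric))
```

Both printed discrepancies are at the level of the truncation and rounding error, which is consistent with the two formulas under the stated conventions.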
Exercise 16.10.8. Let $(S, \mathcal F, \mu)$ be a probability space and suppose that $I$ is a countable set of indices. If the set of projections $\{p_j : j \in I\}$ is i.i.d. on $(S^I, \mathcal F^{\otimes I}, \mu^{\otimes I})$, show that $\mathbb P\pi^{-1} = \mathbb P$ for every finite permutation $\pi$ of $I$. (Hint: Consider first the collection of all finite dimensional elementary cylinders.)
Exercise 16.10.9. For any $q \in (0, 1)$ consider the function
\[ \phi_q(x) = \big(q - 1_{(-\infty,0]}(x)\big)\,x. \]
It is easy to check that $\phi_{1-q}(-x) = \phi_q(x)$. Show that for any $a \le b$
\[ \phi_q(x - b) - \phi_q(x - a) = (b - x)1_{(a,b]}(x) + (b - a)\big(1_{(-\infty,a]}(x) - q\big). \]
If $X \in L^1(\mathbb P)$ and $z_q$ is a $q$–th quantile of $X$, show that
\[ \mathbb E[\phi_q(X - z_q)] = \min_{a \in \mathbb R} \mathbb E[\phi_q(X - a)]. \]
Observe that Proposition 16.6.3 follows by taking $q = 1/2$.
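Both claims of Exercise 16.10.9 can be spot-checked numerically before attempting a proof. The following sketch assumes an empirical distribution on five hypothetical data points and a finite grid of test values; the names `phi`, `difference_rhs` and `expected_loss` are introduced only for this illustration.

```python
def phi(q, x):
    """phi_q(x) = (q - 1_{(-inf, 0]}(x)) * x."""
    return (q - (1.0 if x <= 0 else 0.0)) * x

def difference_rhs(q, x, a, b):
    """Right-hand side of the displayed identity, for a <= b."""
    in_ab = 1.0 if a < x <= b else 0.0
    below_a = 1.0 if x <= a else 0.0
    return (b - x) * in_ab + (b - a) * (below_a - q)

def expected_loss(sample, q, a):
    """E[phi_q(X - a)] when X is uniform on the finite list `sample`."""
    return sum(phi(q, x - a) for x in sample) / len(sample)

sample = [0.2, 1.0, 1.5, 3.0, 10.0]      # hypothetical data, equal weights
q = 0.3
z_q = 1.0                                # P[X < 1] = 0.2 <= q <= 0.4 = P[X <= 1]
grid = [k / 50 for k in range(-100, 601)]
max_gap = max(abs(phi(q, x - b) - phi(q, x - a) - difference_rhs(q, x, a, b))
              for x in grid for a in (-1.0, 0.5) for b in (0.5, 2.0))
print(max_gap)                           # identity holds up to rounding
print(expected_loss(sample, q, z_q))     # loss at the q-th quantile
```

A grid search over `expected_loss` then gives a direct numerical comparison with the value at `z_q`.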
Exercise 16.10.10. Suppose $\mu$ is the step distribution of a random walk $S$. Show that $U = \bigcup_{n \ge 1} \operatorname{supp}(\mu^{*n})$, and that $U$ is closed under addition.
Exercise 16.10.11. Let {Sn : n ∈ Z+ } be a random walk on Z with steps Yn = Sn − Sn−1 ,
n ≥ 1. Suppose E[|Y1 |] < ∞ and that Y1 is aperiodic, that is, the greatest common divisor
of {m : P[Y1 = m] > 0} is 1. Show that P[Sn = x i.o.] = 1 for any x ∈ Z.
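Exercise 16.10.12. Suppose $S$ is a random walk on $\mathbb R_+$. Let $\mu$ be the step distribution and assume $\mu(\{0\}) < 1$. For any $t \ge 0$ define $N(t) = \sum_{n \ge 0} 1_{[0,t]}(S_n)$. Show that
\[ \lim_{t \to \infty} \frac{N(t)}{t} = \frac1m, \qquad \mathbb P\text{–a.s.} \]
where $m = \int_{[0,\infty)} x\,\mu(dx) \in \mathbb R_+$. (Hint: $S_{N_t - 1} \le t < S_{N_t}$ and $\lim_{n \to \infty} S_n = \infty$ $\mathbb P$–a.s.)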
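The limit in Exercise 16.10.12 can be explored by simulation. This is only a sketch under stated assumptions: the step law is taken to be exponential with mean 2 (a hypothetical choice), the walk starts at $S_0 = 0$, and the seed, horizon and function name `renewal_count` are illustrative.

```python
import random

def renewal_count(t, draw_step, rng):
    """N(t) = #{n >= 0 : S_n <= t} for a walk with S_0 = 0 and nonnegative steps."""
    s, count = 0.0, 0
    while s <= t:
        count += 1              # counts the current S_n, which is <= t
        s += draw_step(rng)     # move to S_{n+1}
    return count

rng = random.Random(12345)
t = 20_000.0
mean_step = 2.0                 # hypothetical step law: exponential with mean 2
n_t = renewal_count(t, lambda r: r.expovariate(1 / mean_step), rng)
print(n_t / t, 1 / mean_step)   # the two numbers should be close for large t
```

On exit of the loop the partial sum has just exceeded `t`, which mirrors the hint $S_{N_t - 1} \le t < S_{N_t}$.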
Chapter 17
Weak convergence of
measures
Weak convergence of measures plays an important role in probability theory, statistics and their applications. The central limit theorem, for instance, is one such important and widely used application. In this chapter we present the theoretical framework of weak convergence of measures. In the following chapter we discuss the setting of Euclidean spaces and the Central Limit Theorem for independent random variables.
We equip $\mathcal M$ with the weak* topology $\sigma(\mathcal M, \mathcal W)$. In particular, when $\mathcal W$ separates points of $\mathcal M$, Theorem 12.11.5 implies that $(\mathcal M, \sigma(\mathcal M, \mathcal W))$ is a locally convex Hausdorff topological vector space whose dual is $\mathcal W$. In such case, limits of convergent nets in $(\mathcal M, \sigma(\mathcal M, \mathcal W))$ are uniquely defined.
When $S$ is a metric space, it is natural to consider the dual pair $(\mathcal{M}, \mathcal{W}_0)$ where $\mathcal{W}_0 = C_b(S)$,
as $\mathcal{M}(S)$ is contained in the dual space of $(C_b(S), \|\cdot\|_u)$, and $C_b(S)$ separates the Borel
measures. When $S$ is a locally compact Hausdorff topological space then, based on Riesz's
representation theorem, it is natural to consider the dual pair $(\mathcal{M}, \mathcal{W}_k)$ where $\mathcal{W}_1 = C_{00}(S)$
or $\mathcal{W}_2 = C_0(S)$. If $S$ is a compact metric space, then the dual pairs $(\mathcal{M}, \mathcal{W}_k)$,
$k = 0, 1, 2$, coincide. In Probability theory one is mainly concerned with $\mathcal{M}^+_1$ as a subspace
of $(\mathcal{M}, \sigma(\mathcal{M}, \mathcal{W}_0))$.
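Definition 17.1.1. Let $\mathcal{W}$ be a linear space of bounded measurable functions on $S$. A net
$\{\mu_\alpha : \alpha \in D\} \subset \mathcal{M}(S)$ converges $\mathcal{W}$–weakly to $\mu \in \mathcal{M}(S)$, denoted by $\mu_\alpha \xrightarrow{w} \mu$, if
\[ \lim_\alpha \int_S f\,d\mu_\alpha = \int_S f\,d\mu \]
for all $f \in \mathcal{W}$. If $\mathcal{W} = C_b(S)$ we simply say that $\mu_\alpha$ converges weakly to $\mu$, which we
denote by $\mu_\alpha \Rightarrow \mu$. If $S$ is locally compact and Hausdorff and $\mathcal{W} = C_{00}(S)$, then we say
that $\mu_\alpha$ converges vaguely to $\mu$, which we denote by $\mu_\alpha \xrightarrow{v} \mu$; if $\mathcal{W} = C_0(S)$, then we say
that $\mu_\alpha$ converges vaguely* to $\mu$, which we denote by $\mu_\alpha \xrightarrow{w^*} \mu$.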
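Example 17.1.2. If $\{\mu_\alpha : \alpha \in D\}$ and $\mu$ are finite measures on $(S, \mathcal{B}(S))$ and $\|\mu_\alpha - \mu\|_{TV} \to 0$,
then $\bigl|\int f\,d(\mu_\alpha - \mu)\bigr| \le \|f\|_u \|\mu_\alpha - \mu\|_{TV}$ for any $f \in C_b(S)$. Therefore, $\mu_\alpha \Rightarrow \mu$.
The converse is not necessarily true. For instance, consider $\mu_n = \delta_{1/n}$, $n \in \mathbb{N}$, and $\mu = \delta_0$
on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$. Clearly $\mu_n \Rightarrow \mu$; however, $\|\mu_n - \mu\|_{TV} = 2$ for all $n$.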
Example 17.1.3. If $S$ is locally compact Hausdorff then $\mu_\alpha \xrightarrow{w^*} \mu$ iff $\sup_\alpha \|\mu_\alpha\|_{TV} < \infty$
and $\mu_\alpha \xrightarrow{v} \mu$. This follows from the fact that $C_{00}(S)$ is dense in $C_0(S)$. To see that the net
$\{\mu_\alpha\}$ needs to be bounded, consider the example $S = (0, \infty)$ and the sequence $\mu_n = n\delta_{1/n}$.
Then $\mu_n \xrightarrow{v} 0$; however, $\mu_n$ does not weak*–converge, as any function $f \in C_0(S)$ such that
$f(1/n) \sim 1/\sqrt{n}$ will show.
The weak topology $\sigma(\mathcal{M}(S), C_b(S))$ may be too restrictive, since only bounded continuous
functions are admitted as test functions, as the following example shows.
Example 17.1.4. On $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$, the sequence $\mu_n = \bigl(1 - \frac{1}{n}\bigr)\delta_0 + \frac{1}{n}\delta_n$ converges weakly to
$\delta_0$. Consider the (unbounded) continuous function $\psi(x) = x$. Then $\mu_n(\psi) = 1 \ne 0 = \delta_0(\psi)$ for all
$n \in \mathbb{N}$, and so $\mu_n(\psi) \not\to \delta_0(\psi)$ as $n \to \infty$.
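The phenomenon in Example 17.1.4 is easy to reproduce numerically. The following sketch integrates a bounded continuous test function and the unbounded function ψ(x) = x against the two-atom measures of the example; the choice of `math.atan` as bounded test function and the helper name `integrate` are assumptions made purely for illustration.

```python
import math

def integrate(f, n):
    """mu_n(f) for mu_n = (1 - 1/n) * delta_0 + (1/n) * delta_n."""
    return (1 - 1 / n) * f(0.0) + f(float(n)) / n

bounded = math.atan          # bounded continuous test function (arbitrary choice)

def psi(x):
    """The unbounded continuous function of the example."""
    return x

ns = [10, 1000, 100000]
bounded_vals = [integrate(bounded, n) for n in ns]   # tends to atan(0) = 0
psi_vals = [integrate(psi, n) for n in ns]           # equals 1 for every n
```

The mass 1/n escaping to infinity is invisible to bounded test functions but contributes a constant amount to the integral of ψ.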
We present below one extension of the theory of weak convergence developed thus far,
which enlarges the collection of test functions to include some unbounded functions and
which is useful in many applications. Suppose $\psi \in C(S)$ with $\psi \ge 1$. Let
$\mathcal{M}^\psi(S) = \{\mu \in \mathcal{M}(S) : \int_S \psi\,d|\mu| < \infty\}$ and $C^\psi(S) = \{f \in C(S) : \psi^{-1} f \in C_b(S)\}$. Equip $\mathcal{M}^\psi(S)$
with the weak topology $\sigma(\mathcal{M}^\psi(S), C^\psi(S))$. As $\|\psi^{-1}\mu\|_{TV} < \infty$ for all $\mu \in \mathcal{M}(S)$, the
map $\Psi : \mathcal{M}(S) \to \mathcal{M}^\psi(S)$ given by $\mu \mapsto \frac{1}{\psi}\cdot\mu$ is well defined, and $\Psi(\mu)f = \int f\,\frac{1}{\psi}\,d\mu$ for all
$f \in C^\psi(S)$. Since $C_b(S) \subset C^\psi(S)$, the weak topology $\sigma(\mathcal{M}^\psi(S), C^\psi(S))$ on $\mathcal{M}^\psi(S)$ is stronger
than the relative topology on $\mathcal{M}^\psi(S)$ inherited as a subspace of $(\mathcal{M}(S), \sigma(\mathcal{M}(S), C_b(S)))$.
The following theorem shows that results about weak convergence in $\sigma(\mathcal{M}(S), C_b(S))$
can be translated into results about weak convergence in $\sigma(\mathcal{M}^\psi(S), C^\psi(S))$.
Theorem 17.1.5. The map $\Psi$ is a homeomorphism between $(\mathcal{M}(S), \sigma(\mathcal{M}(S), C_b(S)))$ and
$(\mathcal{M}^\psi(S), \sigma(\mathcal{M}^\psi(S), C^\psi(S)))$.
17.2. Weak convergence of measures on metric spaces 531
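Proof. Notice that $\mu \in \mathcal{M}(S)$ iff $\frac{1}{\psi}\cdot\mu \in \mathcal{M}^\psi(S)$, and that $f \in C_b(S)$ iff $\psi f \in C^\psi(S)$. Let
$\mu \in \mathcal{M}(S)$, $f_1, \dots, f_N \in C_b(S)$, $N \in \mathbb{N}$, and $\varepsilon > 0$. Consider the neighborhood
\[ U_\varepsilon(\mu; f_1, \dots, f_N) := \Bigl\{ \nu \in \mathcal{M}(S) : \Bigl|\int f_j\,(d\nu - d\mu)\Bigr| < \varepsilon,\ j = 1, \dots, N \Bigr\} \]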
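Proof. (i): Suppose that $\mu_\alpha \Rightarrow \mu$ and let $g \in L_b(S)$ with $g \ge c$. By Theorem B.1.6, there
is a sequence $g_k$ of bounded Lipschitz functions such that $c \le g_k \le g_{k+1} \nearrow g$. Hence, for
each $k$
\[ \liminf_\alpha \int g\,d\mu_\alpha \ge \liminf_\alpha \int g_k\,d\mu_\alpha = \int g_k\,d\mu. \]
As $\mu(S) < \infty$, by monotone convergence we obtain that $\liminf_\alpha \int g\,d\mu_\alpha \ge \int g\,d\mu$.
Conversely, suppose $f \in C_b(S)$. Since $C_b(S) \subset L_b(S)$, both $f$ and $-f$ are in $L_b(S)$, so
\[ \liminf_\alpha \int f\,d\mu_\alpha \ge \int f\,d\mu, \qquad \liminf_\alpha \int -f\,d\mu_\alpha \ge \int -f\,d\mu. \]
Therefore, $\lim_\alpha \int f\,d\mu_\alpha = \int f\,d\mu$.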
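(ii) Let $0 \le f \in L_b(S)$ and let $f_k \in C_b(S)$ be such that $0 \le f_k \nearrow f$ pointwise. Since $S$ is
locally compact and separable, there is a sequence of open sets $V_j$ with compact closure such
that $\overline{V}_j \subset V_{j+1} \nearrow S$. Choose $v_j \in C_{00}(S)$ so that $\mathbb{1}_{\overline{V}_j} \le v_j \le \mathbb{1}_{V_{j+1}}$ and $\operatorname{supp} v_j \subset V_{j+1}$.
Let $f_{kj} = f_k v_j$; clearly $f_{kj} \in C_{00}(S)$ and $f_{kj} \nearrow f_k$ as $j \nearrow \infty$. Then for all $k$ and $j$
\[ \liminf_\alpha \int f\,d\mu_\alpha \ge \liminf_\alpha \int f_k\,d\mu_\alpha \ge \liminf_\alpha \int f_{kj}\,d\mu_\alpha = \int f_{kj}\,d\mu \]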
Let $U_b(S) \subset C_b(S)$ denote the collection of all bounded uniformly continuous functions
on $S$. Then $\mathrm{Lip}_b(S) \subset U_b(S) \subset C_b(S)$, and so, by Corollary 12.11.12,
$\sigma(\mathcal{M}(S), \mathrm{Lip}_b(S)) \subset \sigma(\mathcal{M}(S), U_b(S)) \subset \sigma(\mathcal{M}(S), C_b(S))$.
These weak topologies coincide on the cone $\mathcal{M}^+(S)$.
Corollary 17.2.2. Let (S, d) be a metric space. A net {µα : α ∈ D} ⊂ M+ (S) converges
weakly to µ ∈ M(S) if and only if µα f → µf for all f ∈ Lipb (S).
Proof. Suppose $\lim_\alpha \mu_\alpha f = \mu f$ for each $f \in \mathrm{Lip}_b(S)$. We claim that $\mu \in \mathcal{M}^+(S)$. Indeed,
for any function $g \in C_b^+(S)$ there is, by Theorem B.1.6, a sequence $\{f_n : n \in \mathbb{N}\} \subset \mathrm{Lip}_b^+(S)$
such that $f_n \nearrow g$. Then $0 \le \lim_\alpha \mu_\alpha f_n = \mu f_n$. By dominated convergence $\lim_n \mu f_n = \mu g$.
Hence $\mu g \ge 0$ for every $g \in C_b^+(S)$. The conclusion follows as a consequence of
Theorem B.1.6 along with Theorem 17.2.1(i).
Theorem 17.2.3. (Portmanteau theorem) Let $(S, d)$ be a metric space, $\mu \in \mathcal{M}^+(S)$ and
suppose $\{\mu_\alpha : \alpha \in D\}$ is a net in $\mathcal{M}^+(S)$. The following statements are equivalent:
(i) $\mu_\alpha$ converges weakly to $\mu$.
(ii) $\limsup_\alpha \mu_\alpha(S) \le \mu(S)$ and $\liminf_\alpha \mu_\alpha(U) \ge \mu(U)$ for each open set $U$.
(iii) $\liminf_\alpha \mu_\alpha(S) \ge \mu(S)$ and $\limsup_\alpha \mu_\alpha(F) \le \mu(F)$ for each closed set $F$.
(iv) $\lim_\alpha \mu_\alpha(A) = \mu(A)$ for each Borel set $A$ such that $\mu(\partial A) = 0$.
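The inequalities in (ii) and (iii) can be strict, and the boundary condition in (iv) cannot be dropped. A minimal sketch with the point masses $\mu_n = \delta_{1/n} \Rightarrow \delta_0$ on the real line makes this concrete; the three sets used below and the helper name `dirac_mass` are choices made for this illustration only.

```python
def dirac_mass(point, indicator):
    """Mass that the Dirac measure at `point` gives to the set with this indicator."""
    return 1 if indicator(point) else 0

def in_open(x):
    return 0 < x < 1        # U = (0, 1): open, and 0 lies on its boundary

def in_closed(x):
    return x == 0           # F = {0}: closed

def in_cont(x):
    return -1 < x <= 2      # A = (-1, 2]: delta_0 gives no mass to its boundary {-1, 2}

ns = range(2, 200)
open_masses = [dirac_mass(1 / n, in_open) for n in ns]      # always 1
closed_masses = [dirac_mass(1 / n, in_closed) for n in ns]  # always 0
cont_masses = [dirac_mass(1 / n, in_cont) for n in ns]      # always 1

mu_open = dirac_mass(0, in_open)      # 0, so liminf mu_n(U) = 1 > mu(U)
mu_closed = dirac_mass(0, in_closed)  # 1, so limsup mu_n(F) = 0 < mu(F)
mu_cont = dirac_mass(0, in_cont)      # 1, equal to lim mu_n(A)
```

Here $U$ and $F$ both have the atom 0 of the limit on their boundary, which is exactly why $\mu_n(U)$ and $\mu_n(F)$ fail to converge to $\mu(U)$ and $\mu(F)$.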
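Proof. (i) $\Longrightarrow$ (ii): If (i) holds then $\lim_\alpha \mu_\alpha(S) = \mu(S)$. For any open set $U$, the indicator
$\mathbb{1}_U$ is a bounded lower semicontinuous function. Therefore, by Theorem 17.2.1(i), (ii) holds.
(iii) $\Longrightarrow$ (iv): Suppose that $A$ is such that $\mu(\partial A) = 0$. Denote by $A^o$ the interior of $A$ and by
$\overline{A}$ its closure. Then, since $\mu_\alpha(A^o) \le \mu_\alpha(A) \le \mu_\alpha(\overline{A})$, we obtain
\[ \mu(A^o) \le \liminf_\alpha \mu_\alpha(A) \le \limsup_\alpha \mu_\alpha(A) \le \mu(\overline{A}) \]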
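(iv) $\Longrightarrow$ (i): Since $\partial S = \emptyset$, we have that $\lim_\alpha \mu_\alpha(S) = \mu(S)$. Suppose $f \in C_b(S)$ with $f \ge c$,
so that $0 \le g := f - c \in C_b(S)$. The sets $F_t := \{g = t\} \in \mathcal{B}(S)$, $t \ge 0$, are pairwise disjoint.
Since $\mu(S) < \infty$, $\mu(F_t) = 0$ for all but countably many $t \ge 0$. For any $\delta > 0$ and $k \in \mathbb{Z}_+$ define
$B_{k,\delta} := \{k\delta \le g \le (k+1)\delta\}$. As $g$ is bounded, for each $\delta > 0$ there is $N_\delta \in \mathbb{N}$ such that
$B_{k,\delta} = \emptyset$ for all $k > N_\delta$; since $g$ is continuous, $\partial B_{k,\delta} \subset F_{k\delta} \cup F_{(k+1)\delta}$. It follows that for any
$n \ge 1$, there are uncountably many $0 < \delta < \frac{1}{n}$ such that
\[ \mu(F_{k\delta}) = 0 \quad \text{for all } k \in \mathbb{N}. \tag{17.2} \]
For each such $\delta$ we have the estimate
\[ \int g\,d\mu - \delta\mu(S) \le \sum_{k=0}^{N_\delta} k\delta\,\mu(B_{k,\delta}) = \lim_\alpha \sum_{k=0}^{N_\delta} k\delta\,\mu_\alpha(B_{k,\delta}) \le \liminf_\alpha \int g\,d\mu_\alpha \]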
Proof. δxα ⇒ δx iff limα f (xα ) = f (x) for all f ∈ Cb (S). The particular choice f (y) =
1 ∧ d(y, x) shows that xα → x.
Lemma 17.2.5. The set {δx : x ∈ S} is weakly closed in M+ (S).
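Proof. Suppose that $\delta_{x_\alpha} \Rightarrow \mu$. As $\mathcal{M}^+(S)$ is closed in $(\mathcal{M}(S), \sigma(\mathcal{M}(S), C_b(S)))$ and
$\{\delta_x : x \in S\} \subset \mathcal{M}^+(S)$, we have $\mu \in \mathcal{M}^+(S)$. Let $x \in \operatorname{supp}(\mu)$. For any open neighborhood $V$ of $x$,
let $f \in C_b(S)$ be such that $0 \le f \le 1$, $f(x) = 1$ and $f = 0$ on $S \setminus V$. Clearly $\int f\,d\mu > 0$
and, as $\lim_\alpha \delta_{x_\alpha} f = \lim_\alpha f(x_\alpha) = \mu f$, there exists $\alpha_0 \in D$ such that $\alpha \ge \alpha_0$ implies that
$x_\alpha \in V$. Therefore, $x_\alpha \to x$ and $\delta_x = \mu$.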
Theorem 17.2.6. For any metric space (S, ρ), co(δx : x ∈ S) is σ(M(S), Cb (S))–dense in
M+1 (S).
The next result gives sufficient conditions for the uniform convergence of integrals with
respect to a weakly convergent net of positive measures.
Theorem 17.2.8. (R. Rao) Let $(S, d)$ be a separable metric space and suppose that the net
$\{\mu_\alpha : \alpha \in D\} \subset \mathcal{M}^+(S)$ converges weakly to a nonnegative measure $\mu$. If $\Gamma \subset C_b(S)$ is
uniformly bounded (i.e., $\sup_{f \in \Gamma} \|f\|_u < \infty$) and equicontinuous (i.e., for any $x \in S$ and
$\varepsilon > 0$ there is $\delta > 0$ such that $d(x, y) < \delta$ implies that $|f(x) - f(y)| < \varepsilon$ for all $f \in \Gamma$), then
\[ \lim_\alpha \sup_{f \in \Gamma} \Bigl| \int f\,d(\mu_\alpha - \mu) \Bigr| = 0 \]
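Proof. Let $M := \sup_{f \in \Gamma} \|f\|_u$. For any $x \in S$ and $\varepsilon > 0$ there is an open ball $B_x$ centered
at $x$ such that $\mu(\partial B_x) = 0$ and $|f(x) - f(y)| < \varepsilon$ for all $y \in B_x$ and $f \in \Gamma$. Since $S$ is
separable, $S = \bigcup_{n \in \mathbb{N}} B_{x_n}$ for some countable subcollection of balls. Set $A_1 = B_{x_1}$, and
$A_n = B_{x_n} \setminus \bigcup_{j=1}^{n-1} A_j$ for $n > 1$. It follows that $\{A_n : n \in \mathbb{N}\}$ is a pairwise disjoint collection
of Borel sets covering $S$ with $\mu(\partial A_n) = 0$ for all $n \in \mathbb{N}$. Define
\[ \nu := \sum_n \mu(A_n)\,\delta_{x_n}, \qquad \nu_\alpha := \sum_n \mu_\alpha(A_n)\,\delta_{x_n}. \]
For any $\delta > 0$, there is $N \in \mathbb{N}$ large enough such that $\mu\bigl(S \setminus \bigcup_{n=1}^N A_n\bigr) < \delta$. Since
$\partial\bigl(S \setminus \bigcup_{n=1}^N A_n\bigr) \subset \bigcup_{n=1}^N \partial A_n$, we have that $\lim_\alpha \mu_\alpha\bigl(S \setminus \bigcup_{n=1}^N A_n\bigr) = \mu\bigl(S \setminus \bigcup_{n=1}^N A_n\bigr)$.
Hence, for any $f \in \Gamma$
\[ \Bigl| \int_S f\,d(\nu_\alpha - \nu) \Bigr| \le M \Bigl( \sum_{n=1}^N |\mu_\alpha(A_n) - \mu(A_n)| + \sum_{n > N} |\mu_\alpha(A_n) - \mu(A_n)| \Bigr) \]
\[ \le M \sum_{n=1}^N |\mu_\alpha(A_n) - \mu(A_n)| + M \Bigl( \mu_\alpha\Bigl(S \setminus \bigcup_{n=1}^N A_n\Bigr) + \mu\Bigl(S \setminus \bigcup_{n=1}^N A_n\Bigr) \Bigr). \]
Passing to the limit we obtain that $\limsup_\alpha \sup_{f \in \Gamma} \bigl| \int_S f\,d(\nu_\alpha - \nu) \bigr| \le 2M\delta$. As $\delta$ may be
arbitrarily small, we conclude that
\[ \lim_\alpha \sup_{f \in \Gamma} \Bigl| \int_S f\,d(\nu_\alpha - \nu) \Bigr| = 0. \tag{17.3} \]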
We will use Rao's theorem to show that $\mathcal{M}^+(S)$, as a subspace of $(\mathcal{M}(S), \sigma(\mathcal{M}(S), C_b(S)))$,
is metrizable. Recall that a function $f$ on $S$ is Lipschitz iff
\[ \mathrm{Lip}(f) = \sup_{x \ne y} \frac{|f(x) - f(y)|}{d(x, y)} < \infty. \]
We denote by $\mathrm{Lip}_1(S)$ the collection of all bounded Lipschitz functions with
$\|f\|_{BL} := \|f\|_u + \mathrm{Lip}(f) \le 1$.
Theorem 17.2.9. (Kantorovich–Rubinstein) Let $(S, d)$ be a separable metric space. For
any $\mu \in \mathcal{M}(S)$ define
\[ \|\mu\|^* := \sup\Bigl\{ \int_S f\,d\mu : f \in \mathrm{Lip}_1(S) \Bigr\} \tag{17.5} \]
Proof. We prove that $\|\mu\|^* = 0$ implies that $\mu \equiv 0$. The remaining details that show
that $\|\cdot\|^*$ is a norm on $\mathcal{M}(S)$ are left as an exercise. Let $F \subset S$ be a closed set. Then
$f_n(x) := 1 \wedge n\,d(x, F)$ is a sequence of bounded Lipschitz functions such that $f_n \nearrow \mathbb{1}_{S \setminus F}$.
Then $\bigl|\int_S f_n\,d\mu\bigr| \le \|f_n\|_{BL} \|\mu\|^* = 0$. By monotone convergence $|\mu(S \setminus F)| = \lim_n \bigl|\int_S f_n\,d\mu\bigr|$.
Thus, $|\mu(S \setminus F)| = 0$ for any closed set $F$. As $\mu$ is a finite measure, Sierpinski's monotone
class theorem implies that $\mu(A) = 0$ for any Borel set $A$.
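We claim that for any net $\{\mu_\alpha : \alpha \in D\} \subset \mathcal{M}^+(S)$ and $\mu \in \mathcal{M}(S)$, $\mu_\alpha \Rightarrow \mu$ iff $\mu \in \mathcal{M}^+(S)$
and $\lim_\alpha \|\mu_\alpha - \mu\|^* = 0$. To prove necessity, notice that $\mathcal{M}^+(S)$ is closed in $\sigma(\mathcal{M}(S), C_b(S))$.
Thus, $\mu_\alpha \Rightarrow \mu$ implies that $\mu \in \mathcal{M}^+(S)$. A direct application of Rao's theorem with
$\Gamma = \mathrm{Lip}_1(S)$ shows that $\lim_\alpha \|\mu_\alpha - \mu\|^* = 0$. To prove sufficiency, notice that for any
bounded Lipschitz function $f$, $\lim_\alpha \bigl|\int_S f\,d(\mu_\alpha - \mu)\bigr| \le \|f\|_{BL} \lim_\alpha \|\mu_\alpha - \mu\|^* = 0$. This shows that
$\lim_\alpha \mu_\alpha f = \mu f$ for all $f \in \mathrm{Lip}_b(S)$. The conclusion of the claim follows from Corollary 17.2.2.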
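To see how the norm (17.5) differs from the total variation norm, one can estimate $\|\delta_x - \delta_y\|^*$ for two point masses at distance $r = d(x, y)$. The sketch below rests on two assumptions that are not part of the text: first, that only the pair of values $(f(x), f(y)) = (u, v)$ matters, because for $u \ge v$ the function $z \mapsto \max(v, u - L\,d(z, x))$ with $L = (u - v)/r$ realizes sup norm $\max(|u|, |v|)$ and Lipschitz constant $L$; second, that the resulting supremum equals $2r/(2 + r)$, a closed form derived for this illustration. The function name and grid size are hypothetical.

```python
def bl_norm_two_diracs(dist, steps=200):
    """Grid estimate of sup{ f(x) - f(y) : ||f||_u + Lip(f) <= 1 } when d(x, y) = dist.

    Searches over u = f(x) in [0, 1] and v = f(y) in [-1, 0]; the constraint
    max(|u|, |v|) + (u - v) / dist <= 1 encodes ||f||_BL <= 1 for the extension
    described in the lead-in.
    """
    best = 0.0
    for i in range(steps + 1):
        u = i / steps
        for j in range(steps + 1):
            v = -j / steps
            if max(u, -v) + (u - v) / dist <= 1.0:
                best = max(best, u - v)
    return best

distances = [1.0, 0.1, 0.01]
estimates = [bl_norm_two_diracs(r) for r in distances]
closed_forms = [2 * r / (2 + r) for r in distances]   # assumed closed form
```

Under these assumptions the estimates shrink to zero with the distance, in line with $\delta_{1/n} \Rightarrow \delta_0$, whereas the total variation distance between two distinct point masses is always 2.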
If $(S, d)$ is a locally compact separable metric space then $C_0(S)$ is separable with respect
to the uniform norm. The Riesz representation theorem along with Alaoglu's theorem
implies that the ball $\mathcal{M}_{\le 1} := \{\mu \in \mathcal{M}(S) : \|\mu\|_{TV} \le 1\}$ is weak*–compact, metrizable
and thus, sequentially weak*–compact. We have shown above that if $(S, d)$ is a separable
metric space then $(\mathcal{M}^+(S), \sigma(\mathcal{M}(S), C_b(S)))$ is metrizable. Consequently, $\mathcal{M}^+_1(S)$ is a convex
weakly–closed and metrizable subspace of $\mathcal{M}(S)$. To emphasize the connection between
weak convergence and weak topology, we will give a different proof of the metrizability of
the weak topology on $\mathcal{M}^+_{\le 1}(S)$.
Theorem 17.2.10. Let $(S, d)$ be a metric space. If $(S, d)$ is separable, then
\[ \mathcal{M}^+_{\le 1} := \{\mu \in \mathcal{M}^+(S) : \|\mu\|_{TV} \le 1\} \]
is a separable and metrizable closed subset of $\mathcal{M}(S)$. Furthermore, $(S, d)$ is separable iff
$(\mathcal{M}^+_1(S), \sigma(\mathcal{M}(S), C_b(S)))$ is separable and metrizable.
Proof. It is obvious that $\mathcal{M}^+_{\le 1}(S)$ is a closed set in $(\mathcal{M}(S), \sigma(\mathcal{M}(S), C_b(S)))$.
If $(S, d)$ is a separable metric space then, by Theorem 2.9.3, there is an equivalent metric
$e$ on $S$ so that (a) $(S, e)$ is isometrically homeomorphic to a dense set of a compact
subset $\widehat{S}$ of $[0, 1]^{\mathbb{N}}$, and (b) $(U_b(S, e), \|\cdot\|_u)$ is isometric to $(C_b(\widehat{S}), \|\cdot\|_u)$. Conclusion (a),
together with the Riesz representation theorem, implies that $(C_b(\widehat{S}), \|\cdot\|_u)^* = (\mathcal{M}(\widehat{S}), \|\cdot\|_{TV})$;
conclusion (b) implies that $(\mathcal{M}(S), \sigma(\mathcal{M}(S), U_b(S)))$ can be embedded as a
subspace of $(\mathcal{M}(\widehat{S}), \sigma(\mathcal{M}(\widehat{S}), C_b(\widehat{S})))$. By Theorem 2.11.9, $C_b(\widehat{S})$ is separable which, along
with Alaoglu's theorem and Theorem 12.12.9, implies that $\{\mu \in \mathcal{M}(\widehat{S}) : \|\mu\|_{TV} \le 1\}$ is a
$\sigma(\mathcal{M}(\widehat{S}), C_b(\widehat{S}))$–compact, metrizable, and hence separable, space. As $\sigma(\mathcal{M}(S), U_b(S))$ and
$\sigma(\mathcal{M}(S), C_b(S))$ coincide on $\mathcal{M}^+(S)$, it follows that $\mathcal{M}^+_{\le 1}(S)$ is metrizable and separable.
For the last statement, Lemma 17.2.4 shows that the map $x \mapsto \delta_x$ is a continuous embedding
of $S$ into $\mathcal{M}^+_1(S)$. Therefore, if $\mathcal{M}^+_1(S)$ is separable, so is $S$.
If $S$ is a separable locally compact Hausdorff (l.c.H.) topological space then, from Theorem 2.11.9,
$C_{00}(S)$ is separable. By the Riesz representation theorem, $(C_{00}(S), \|\cdot\|_u)^* = (\mathcal{M}(S), \|\cdot\|_{TV})$. Hence, by
Theorem 12.12.9, any norm–bounded subset of $\mathcal{M}(S)$ is relatively weak*–compact and
metrizable and thus, separable. Therefore, on norm–bounded sets the weak* topology is
completely determined by vague convergence of sequences of measures. The following result is the
analogue of the Portmanteau theorem for vague convergence in locally compact separable
(l.c.s.) metric spaces.
Theorem 17.2.11. Suppose $(S, d)$ is a locally compact separable metric space, $\{\mu_n : n \in \mathbb{N}\}$
is a sequence in $\mathcal{M}^+(S)$, and $\mu \in \mathcal{M}^+(S)$. If $\mu_n \xrightarrow{v} \mu$, then
(i) lim inf n µn (U ) ≥ µ(U ) for all U open,
(ii) lim supn µn (K) ≤ µ(K) for all K compact,
(iii) $\lim_n \mu_n(A) = \mu(A)$ for every Borel set $A$ with compact closure and $\mu(\partial A) = 0$.
Conversely, assume that $\sup_n \|\mu_n\|_{TV} < \infty$. If either (i) and (ii) hold, or if (iii) holds, then
$\mu_n \xrightarrow{v} \mu$.
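Proof. (i): Assume that $\mu_n \xrightarrow{v} \mu$. As $\mathbb{1}_U$ is lower semicontinuous for any open set $U$, (i) is
a direct consequence of Theorem 17.2.1(ii).
(ii): Suppose $K$ is compact and consider the open sets $K^\varepsilon = \{s \in S : d(s, K) < \varepsilon\}$. There
is an open set $U_\varepsilon$ such that $K \subset U_\varepsilon \subset \overline{U}_\varepsilon \subset K^\varepsilon$ and $\overline{U}_\varepsilon$ is compact. Let $f_\varepsilon \in C_{00}(S)$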
17.3. Weak convergence under continuous transformations 537
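be such that $\operatorname{supp}(f_\varepsilon) \subset \overline{U}_\varepsilon$ and $\mathbb{1}_K \le f_\varepsilon \le \mathbb{1}_{\overline{U}_\varepsilon}$. Hence, $\limsup_n \mu_n(K) \le \lim_n \int f_\varepsilon\,d\mu_n = \int f_\varepsilon\,d\mu \le \mu(\overline{U}_\varepsilon)$.
Since $\mu(\overline{U}_\varepsilon) < \infty$, (ii) follows by letting $\varepsilon \searrow 0$.
(iii): We show that (i) and (ii) imply (iii). Suppose $A$ is Borel measurable; then $\mu_n(A^o) \le \mu_n(A) \le \mu_n(\overline{A})$.
If $\overline{A}$ is compact and $\mu(\partial A) = 0$, then (iii) follows from (i) and (ii).
For the last statement, assume that (iii) holds and that $\sup_n \|\mu_n\|_{TV} < \infty$. Let $0 \le f \in C_{00}(S)$,
$b = \|f\|_u$, and $K = \operatorname{supp} f$. By Fubini's theorem we have that $\int f\,d\mu = \int_0^b \mu(f > t)\,dt$. Since
$\mu(S) < \infty$ and $\partial\{f > t\} \subset \{f = t\} \subset K$ for $t > 0$, we obtain that $\mu(\partial\{f > t\}) \le \mu(f = t) = 0$
for almost all $t \ge 0$. Therefore $\lim_n \mu_n(f > t) = \mu(f > t)$ for almost all $t$. The assumption
$\sup_n \|\mu_n\|_{TV} < \infty$ and dominated convergence imply that
\[ \int f\,d\mu = \int_0^b \mu(f > t)\,dt = \lim_n \int_0^b \mu_n(f > t)\,dt = \lim_n \int f\,d\mu_n. \]
Since $f_+, f_- \in C_{00}(S)$ whenever $f \in C_{00}(S)$, vague convergence follows.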
Theorem 17.2.12. Let $(S, d)$ be a l.c.s. metric space and let $\{\mu_n, \mu\}$ be finite measures in
$\mathcal{M}^+(S)$. The following statements are equivalent.
(i) $\mu_n \Rightarrow \mu$.
(ii) $\mu_n \xrightarrow{v} \mu$ and $\mu_n(S) \to \mu(S)$.
Proof. Clearly (i) implies (ii). Conversely, assume (ii), and let $f \in L_b(S)$ with $c \le f$ for
some constant $c$. Then $0 \le f - c \in L_b(S)$ and by Theorem 17.2.1(ii) $\liminf_n \int (f - c)\,d\mu_n \ge \int (f - c)\,d\mu$.
The assumption $\mu_n(S) \to \mu(S)$ implies that
\[ \liminf_n \int f\,d\mu_n \ge \int f\,d\mu. \]
Weak convergence behaves well also under a.s continuous transformations. Let h be a
function on an arbitrary space X with values in a metric space (S ′ , d′ ). For any T ⊂ S, the
modulus of continuity of h on T is defined as
Lemma 17.3.2. Let S and S ′ be metric spaces and let h : S → S ′ . For any r > 0, the set
Jr = {x ∈ S : ωh (x) ≥ r} is closed.
Proof. If x ∈ Jrc , there is δ > 0 such that Ωh (B(x; δ)) < r. Clearly B(x; δ) ⊂ Jrc .
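Proof. (i) Clearly $\lim_\alpha \mu_\alpha h^{-1}(S') = \lim_\alpha \mu_\alpha(S) = \mu(S) = \mu h^{-1}(S')$. For any closed set
$F \subset S'$, we have
\[ h^{-1}(F) \subset \overline{h^{-1}(F)} \subset D_h \cup h^{-1}(F). \]
If $\mu(D_h) = 0$ then $\mu\bigl(\overline{h^{-1}(F)}\bigr) = \mu(h^{-1}(F))$. By the Portmanteau theorem
(ii) Let $f(x) = ((-M) \vee x) \wedge M$ where $M = \|h\|_u$. As $h = f \circ h$ and $f \in C_b(\mathbb{R})$, by part (i)
\[ \int h\,d\mu_\alpha = \int f \circ h\,d\mu_\alpha = \int f\,d(\mu_\alpha \circ h^{-1}) \longrightarrow \int f\,d(\mu \circ h^{-1}) = \int f \circ h\,d\mu = \int h\,d\mu. \]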
Proof. Consider the function $h_a(x) = |x|\,\mathbb{1}_{\{|x| \le a\}}$. Notice that $h_a$ is continuous everywhere
except on $D_{h_a} = \{\pm a\}$. With the exception of at most countably many points $a$, we have
that $\mu(\{\pm a\}) = 0$. For such typical $a$, Theorem 17.3.4 shows that
\[ \int_{\{|x| \le a\}} |x|\,\mu(dx) = \lim_n \int_{\{|x| \le a\}} |x|\,\mu_n(dx) \le \liminf_n \int |x|\,\mu_n(dx). \]
Let (X, B) be a topological space with its Borel σ–algebra. A measurable function
V : X −→ [0, ∞] is precompact or norm–like if V −1 ([0, r]) is compact for any 0 ≤ r < ∞.
Theorem 17.4.3. A collection $\Pi \subset \mathcal{M}(X)$ is bounded in total variation and tight iff there
exists a precompact function $V \ge 1$ such that
\[ \sup_{\mu \in \Pi} \int V\,d|\mu| < \infty \]
Proof. Suppose there is a precompact function $V \ge 1$ with $a := \sup_{\mu \in \Pi} \int V\,d|\mu| < \infty$.
Then
\[ \sup_{\mu \in \Pi} \|\mu\|_{TV} \le \sup_{\mu \in \Pi} \int V\,d|\mu| < \infty. \]
For any $\varepsilon > 0$ let $r > 1$ be such that $r > a/\varepsilon$. The set $K = V^{-1}([1, r])$ is compact and
\[ \sup_{\mu \in \Pi} |\mu|(K^c) \le \sup_{\mu \in \Pi} |\mu|(V > r) \le \frac{a}{r} < \varepsilon. \]
Therefore, $\Pi$ is bounded in total variation and tight.
Conversely, suppose $\Pi$ is of bounded total variation and tight. There exists a sequence of compact sets
$K_1 \subset K_2 \subset \dots$ such that
\[ |\mu|(K_n^c) < 2^{-n}, \qquad \mu \in \Pi. \]
Let $V(x) = 1 + \sum_{n=1}^\infty \mathbb{1}_{K_n^c}(x)$. For any $r > 0$ let $n_r = [r] + 1$. As $\{V \le r\} \subset \{V \le n_r\} \subset K_{n_r}$,
$V$ is precompact and
\[ \sup_{\mu \in \Pi} \int V\,d|\mu| \le \sup_{\mu \in \Pi} \|\mu\|_{TV} + 1 < \infty \]
Definition 17.4.4. Let (S, d) be a metric space and µ ∈ M(S). A set A ∈ B(S) is inner
regular with respect to µ if
(17.6) |µ|(A) = sup{|µ|(K) : K compact, K ⊂ A},
and µ is inner regular, or simply regular , if (17.6) holds for all A ∈ B(S).
If (17.6) holds with supremum over closed sets, then A is closed regular .
Lemma 17.4.5. Let (S, d) be a metric space.
(i) If µ ∈ M(S) is tight, then the family R of measurable sets A such that A and S \ A
are inner regular is a σ–algebra.
(ii) For any finite measure µ ∈ M(S), the collection RF of measurable sets A such
that A and S \ A are closed regular is a σ–algebra.
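Proof. (i) Without loss of generality, we may assume $\mu$ is a finite nonnegative tight measure.
Let $\mathcal{R}$ be the collection of Borel sets $A$ such that $A$ and $S \setminus A$ are both regular. Clearly
$S \in \mathcal{R}$, and $A \in \mathcal{R}$ if and only if $S \setminus A \in \mathcal{R}$. Suppose $\{A_n : n \in \mathbb{N}\} \subset \mathcal{R}$ and set $A = \bigcup_n A_n$.
For each $n$, there are compact sets $K_n \subset A_n$ and $L_n \subset S \setminus A_n$ such that $\mu(A_n \setminus K_n) < \varepsilon 2^{-n-1}$
and $\mu((S \setminus A_n) \setminus L_n) < \varepsilon 2^{-n}$. Choose $N$ large enough so that $\mu\bigl(A \setminus \bigcup_{k=1}^N A_k\bigr) < \varepsilon/2$. The
sets $F = \bigcup_{j=1}^N K_j$ and $L = \bigcap_n L_n$ are compact and
\[ \mu(A \setminus F) \le \varepsilon/2 + \varepsilon \sum_{j=1}^N 2^{-j-1}, \]
\[ \mu((S \setminus A) \setminus L) \le \varepsilon \sum_n 2^{-n} = \varepsilon. \]
(ii) The same proof with closed in place of compact shows that $\mathcal{R}_F = \{A \in \mathcal{B}(S) :
A \text{ and } S \setminus A \text{ closed regular}\}$ is a $\sigma$–algebra.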
Theorem 17.4.6. Let (S, d) be a metric space and µ ∈ M(S). µ is inner regular iff the
singleton {µ} is tight.
Proof. Necessity is obvious. To prove sufficiency, we may assume without loss of generality
that $\mu$ is a nonnegative tight measure on $\mathcal{B}(S)$. We show first that $\mu$ is closed regular. For
any open set $U$, let $F = S \setminus U$. The sequence of closed sets $F_n = \{x \in S : d(x, F) \ge \frac{1}{n}\}$,
$n \in \mathbb{N}$, satisfies $F_n \nearrow U$. Consequently $U$ is closed regular. By Lemma 17.4.5 we conclude
that $\mu$ is closed regular.
17.4. Tightness and Prohorov’s theorem 541
For ε > 0, let K be a compact set such that µ(S \ K) < ε/2. For any B ∈ B(S),
let F ⊂ B be a closed set with µ(B \ F ) < ε/2. Hence L = F ∩ K is compact and
µ(B \ L) ≤ µ(B \ F ) + µ(F \ L) < ε. This shows that µ is inner regular.
Theorem 17.4.7. (Ulam) If (S, d) is a complete separable metric space and µ ∈ M(S),
then µ is tight.
The above conditions are equivalent on a complete metric space (S, d) if each µ ∈ Π is tight.
On any metric space (S, d), (ii) implies (i).
Proof. Assume (i). First we show that Π is bounded in total variation. Suppose that there
is {µn } with kµn kT V > n and let {µn′ } be a convergent subsequence; then, supn′ |µn′ f | < ∞
for any f ∈ Cb (S). Since Π ⊂ Cb (S)∗ and Cb (S) is a Banach space, it follows from the
Banach–Steinhaus theorem that {µn′ } is bounded with respect to the total variation norm.
This contradicts the choice of {µn }.
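We show now that $\Pi$ is tight. Suppose that $\Pi$ fails to be tight. Then there exists $\varepsilon > 0$
such that for any compact $K \subset S$, one can find $\mu_K \in \Pi$ such that $\|\mu_K\| \ge |\mu_K|(K) + \varepsilon$. In
particular, there is $\mu_1 \in \Pi$ with $\|\mu_1\| > \varepsilon$. Ulam's theorem provides a compact set $K_1 \subset S$
with $|\mu_1|(K_1) > \varepsilon$. By Lemma 17.4.8, there is $\mu_2 \in \Pi$ such that $|\mu_2|(S \setminus K_1^\varepsilon) > \varepsilon$. Let
$K_2 \subset S \setminus K_1^\varepsilon$ be a compact set so that $|\mu_2|(K_2) > \varepsilon$. By induction, having constructed a
compact set $K_m$ and $\mu_m \in \Pi$ with $K_m \subset S \setminus \bigcup_{j=1}^{m-1} K_j^\varepsilon$ and $|\mu_m|(K_m) > \varepsilon$, we can find
$\mu_{m+1} \in \Pi$ so that $|\mu_{m+1}|\bigl(S \setminus \bigcup_{j=1}^m K_j^\varepsilon\bigr) > \varepsilon$. Let $K_{m+1} \subset S \setminus \bigcup_{j=1}^m K_j^\varepsilon$ be a compact set
such that $|\mu_{m+1}|(K_{m+1}) > \varepsilon$. This construction yields the sequence $\{U_m\} = \{\operatorname{Int}(K_m^{\varepsilon/4})\}$ of
pairwise disjoint open sets. For each $m$, we choose $f_m \in C_b(S)$ such that $\mathbb{1}_{K_m} \le f_m \le \mathbb{1}_{U_m}$.
Clearly, $\sum_m f_m = \sum_m |f_m| \le 1$ and
\[ \int_S f_m\,d|\mu_m| = \int_{U_m} f_m\,d|\mu_m| > \varepsilon \tag{17.7} \]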
Denote by $c = \sup_{\mu \in \Pi} \|\mu\|_{TV}$. Since any compact set $K \subset S$ is a separable metric space,
it follows that $C(K)$ is a separable Banach space. Alaoglu's theorem implies that
$B_K = \{\mu \in \mathcal{M}(K) : \|\mu\|_{TV} \le c\}$ is weak*–compact. The separability of $C(K)$ implies that
$B_K$ is metrizable; hence, sequentially compact. A standard diagonal argument shows that
any sequence $\{\mu_m\} \subset \Pi$ has a subsequence $\{\mu_{m_k}\}$ that weak*–converges in each space
$(\mathcal{M}(K_n), C(K_n))_{w^*}$.
17.5. Vague convergence for σ–finite measures 543
We will show that $\mu_{m_k}$ converges weakly on $\mathcal{M}(S)$. For $f \in C_b(S)$ and $\varepsilon > 0$, let $m_0$ be so
that $2^{-m_0} < \varepsilon$. Then, for all $n, k$ with $m_n \ge m_k > m_0$ we have
\[ \Bigl| \int_S f\,d(\mu_{m_n} - \mu_{m_k}) \Bigr| \le \varepsilon \|f\|_u + \Bigl| \int_{K_{m_0}} f\,d(\mu_{m_n} - \mu_{m_k}) \Bigr| \tag{17.8} \]
The choice of $\{\mu_{m_k}\}$ implies that $\int_S f\,d\mu_{m_k}$ is a numerical Cauchy sequence. Let us denote
the limit by $\mu^*(f) = \lim_k \int_S f\,d\mu_{m_k}$. It is clear that $\mu^*$ is a bounded linear functional on
$C_b(S)$. It remains to show that $\mu^*$ is a measure on $\sigma(C_b(S)) = \mathcal{B}(S)$. By considering the
families $\{|\mu| : \mu \in \Pi\}$ and $\{\mu_+ : \mu \in \Pi\}$, we can assume without loss of generality that
$\Pi \subset \mathcal{M}^+(S)$. So, let $\{f_n\} \subset C_b(S)$ be a nonincreasing sequence with $f_n \searrow 0$ pointwise. By
Dini's theorem, $f_n \searrow 0$ uniformly on the compact set $K_{m_0}$ ($m_0$ as before). Therefore, for
all $n$ large enough, we have that
\[ 0 \le \int_S f_n\,d\mu_{m_k} \le \|f_1\|_u\,\varepsilon + \int_{K_{m_0}} f_n\,d\mu_{m_k} \le (\|f_1\|_u + c)\,\varepsilon, \]
from which we conclude that $\lim_n \mu^*(f_n) = 0$. The Daniell–Stone theorem implies that $\mu^*$
is a measure.
Remark 17.4.10. If $\mu_n$ converges weakly to $\mu$, $|\mu_n|$ may fail to converge. Consider for
example $\mu_n = \delta_0 - \delta_{1/n}$ for $n$ even and $\mu_n = \frac{1}{n}\delta_n$ for $n$ odd. Then, $\mu_n \Rightarrow 0$ but $|\mu_n|$ does not
converge weakly.
If both $\mu_n$ and $|\mu_n|$ converge weakly, say to $\mu$ and $\nu$ respectively, then it might be that
$\nu \ne |\mu|$. Consider $\mu_n = \delta_0 - \delta_{1/n}$. Then $\mu_n \Rightarrow 0$, while $|\mu_n| \Rightarrow 2\delta_0$.
If $|\mu_n|$ converges weakly, $\mu_n$ might fail to converge. Consider $\mu_n = \delta_0 - \delta_{1/n}$ for $n$ even and
$\mu_n = 2\delta_0$ for $n$ odd. Then $|\mu_n| \Rightarrow 2\delta_0$, but $\mu_n$ fails to converge weakly.
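The first two examples of Remark 17.4.10 can be checked against a single bounded continuous test function. In the sketch below, signed measures with finitely many distinct atoms are stored as dictionaries `{point: weight}`; this representation, the helper names, and the test function `math.cos` are assumptions made for illustration, and one test function only illustrates (it does not prove) weak convergence.

```python
import math

def mu(n, alternating=False):
    """delta_0 - delta_{1/n}; with alternating=True, odd n gives (1/n) * delta_n instead."""
    if alternating and n % 2 == 1:
        return {float(n): 1.0 / n}
    return {0.0: 1.0, 1.0 / n: -1.0}

def abs_measure(measure):
    """Total variation measure |mu| of a signed measure with distinct atoms."""
    return {x: abs(w) for x, w in measure.items()}

def integrate(measure, f):
    """Integral of f with respect to a finitely supported signed measure."""
    return sum(w * f(x) for x, w in measure.items())

f = math.cos                  # bounded continuous test function with f(0) = 1
even, odd = 10**6, 10**6 + 1

signed_even = integrate(mu(even, alternating=True), f)            # close to 0
signed_odd = integrate(mu(odd, alternating=True), f)              # close to 0
abs_even = integrate(abs_measure(mu(even, alternating=True)), f)  # close to 2
abs_odd = integrate(abs_measure(mu(odd, alternating=True)), f)    # close to 0
```

Along even indices $|\mu_n|(f)$ stays near $2f(0)$, while along odd indices it is near 0, so $|\mu_n|(f)$ has no limit even though $\mu_n(f) \to 0$.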
We show that $\mu_{m_k}$ converges vaguely to some measure $\nu$ in $(S, \mathcal{B}(S))$ that is finite on
compact sets. Let $f \in C_{00}(S)$ and suppose $\operatorname{supp}(f) \subset K_{m_0}$. Then, by the choice of $\{\mu_{m_k}\}$,
the numerical sequence $\int f\,d\mu_{m_k} = \int_{K_{m_0}} f\,d\mu_{m_k}$ converges. Clearly, $L(f) = \lim_k \int f\,d\mu_{m_k}$
is a positive linear functional on $C_{00}(S)$. By the Daniell–Stone theorem, it suffices to show
that $L$ is $\delta$–continuous. Let $\{f_n\} \subset C_{00}(S)$ be a decreasing sequence converging to zero. If
$K_p \supset \operatorname{supp}(f_1)$, then $K_p \supset \operatorname{supp}(f_n)$ and, by Dini's theorem, $f_n \searrow 0$ uniformly. From
\[ L(f_n) = \lim_k \int_{K_p} f_n\,d\mu_{m_k} \le \|f_n\|\,c(K_p), \]
Proof. Let $G$ be open. Then, for any $x \in G$ there is $\varepsilon > 0$ and $A \in \mathcal{U}$ such that
$x \in A^o \subset A \subset B(x; \varepsilon) \subset G$. Since $S$ is separable, there is a finite or infinite sequence
$A_n \in \mathcal{U}$ such that $G \subset \bigcup_n A_n^o$ and $A_n \subset G$. Hence, $G = \bigcup_n A_n$ and $\mathcal{U}$ satisfies the
hypotheses in Theorem 17.6.1.
Corollary 17.6.3. Let V be the π–system generated by the collection of open balls B(x; ε).
If S is separable and µn (A) → µ(A) for any A ∈ V ∪ {S} with µ(∂A) = 0, then µn ⇒ µ.
Proof. Since $\partial B(x; \varepsilon) \subset \{y \in S : \rho(x, y) = \varepsilon\}$, the boundaries of the open balls around a
point $x$ are pairwise disjoint; hence, all but countably many have zero $\mu$–measure. Since
$\partial(A \cap B) \subset \partial A \cup \partial B$, the collection $\mathcal{U}$ of finite intersections of open balls with zero
$\mu$–measure boundary satisfies the hypothesis of Corollary 17.6.2.
17.7. Uniform integrability and weak convergence of measures
Proof. For each $\alpha > 0$ consider the function $g_\alpha(x) = x\,\mathbb{1}_{\{|x| \le \alpha\}}$. By Theorem 17.3.4,
$\int g_\alpha(X)\,d\mu = \lim_n \int g_\alpha(X_n)\,d\mu$ for all but countably many $\alpha \ge 0$. Observe that
\[ |\mu(X_n - X)| \le |\mu(g_\alpha(X_n) - g_\alpha(X))| + \sup_n \int_{\{|X_n| > \alpha\}} |X_n|\,d\mu + \int_{\{|X| > \alpha\}} |X|\,d\mu. \tag{17.9} \]
For such $\alpha$, the first term on the right of (17.9) converges to zero as $n \to \infty$; the second term
converges to zero as $\alpha \to \infty$ by the uniform integrability of $\{X_n\}$, as in Theorem 8.7.4(iii); the third
converges to zero as $\alpha \to \infty$ by dominated convergence.
Corollary 17.7.2. Let Xn and X be measurable functions in topological space S defined on
a finite measure space (Ω, F , µ) and ϕ is a continuous real–valued function in S. Suppose
that Xn ⇒ X. If {ϕ(Xn )} is uniformly integrable, then limn µϕ(Xn ) = µϕ(X).
Proof. Let $f \in C_b(\mathbb{R})$. The continuity of $x \mapsto x f(x)$ implies that $X_n f(X_n) \Rightarrow X f(X)$. Since
$|f(X_n) X_n| \le \|f\|_\infty |X_n|$, Theorem 8.7.4 implies that $\{f(X_n) X_n\}$ is uniformly integrable.
From Theorem 17.7.1 we obtain $\int f(X_n) X_n\,d\mu \to \int f(X) X\,d\mu$.
Proof. It follows the same steps as that of Theorem 17.7.1, replacing $\mu f(X_n)$ with $\int f(x)\,\mu_n(dx)$
and $\mu f(X)$ with $\int f(x)\,\mu(dx)$ for all functions $f$ involved.
Theorem 17.8.1. Let X and Xn random variables on a probability space (Ω, F , µ) with
values in a metric space (S, d).
(i) If Xn converges in measure to X, then Xn ⇒ X, that is µ ◦ Xn−1 =⇒ µ ◦ X −1 .
(ii) Let a ∈ S. Then Xn =⇒ a if and only if Xn → a in measure.
Convergence in law of a sequence of random variables does not provide any
pointwise information about the random variables; indeed, each random variable may
be defined on a different probability space. When the probability laws are defined on a
nice space, however, it is possible to construct a single probability space supporting a sequence
of random variables with the prescribed laws that converges pointwise
to a random variable with the prescribed limiting law.
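Lemma 17.8.2. (Kallenberg) Suppose $\kappa$ and $\{\kappa_n\}$ are random variables in $S = \{1, \dots, m\}$
such that $\kappa_n \Rightarrow \kappa$. If $\theta \sim U(0, 1)$ and $\theta$ and $\kappa$ are independent, then there are measurable
functions $f_n : S \otimes [0, 1] \longrightarrow S$ such that $\tilde\kappa_n = f_n(\kappa, \theta) \stackrel{d}{=} \kappa_n$ and $\tilde\kappa_n \to \kappa$ almost surely as
$n \to \infty$.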
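Proof. Let $\mu_n$ and $\mu$ be the laws of $\kappa_n$ and $\kappa$ respectively and denote $p^n_j = \mu_n(\{j\})$ and
$p_j = \mu(\{j\})$. For each $n \in \mathbb{N}$, let $J_n$ be the set of $j \in S$ such that $p^n_j \le p_j$. For each $j \in J_n$,
divide the interval $\Delta_j = [p^n_j / p_j, 1]$ into $\#(J_n^c)$ disjoint subintervals $\Delta_{j,i}$, $i \in J_n^c$, so that
\[ |\Delta_{j,i}| = \alpha_i \Bigl( 1 - \frac{p^n_j}{p_j} \Bigr), \qquad \alpha_i = \frac{p^n_i - p_i}{\sum_{j \in J_n^c} (p^n_j - p_j)}, \quad i \in J_n^c. \]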
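The subdivision above suggests a concrete coupling, which the following sketch computes exactly. Since the remainder of the proof is not reproduced here, the rule used below is an assumed completion: if $\kappa = j \in J_n$ then $\tilde\kappa_n = j$ when $\theta < p^n_j/p_j$ and $\tilde\kappa_n = i$ when $\theta \in \Delta_{j,i}$, while $\tilde\kappa_n = \kappa$ when $\kappa \in J_n^c$. The function name and the two example laws are hypothetical, and states are indexed from 0 for convenience.

```python
from fractions import Fraction

def coupled_law(p, pn):
    """Law of f_n(kappa, theta) and P[f_n(kappa, theta) = kappa] under the assumed rule.

    p, pn: probability vectors (lists of Fractions) for kappa and kappa_n.
    kappa ~ p and theta ~ U(0, 1) are independent, so the interval lengths
    |Delta_{j,i}| are exactly the conditional probabilities of moving from j to i.
    """
    m = len(p)
    J = [j for j in range(m) if pn[j] <= p[j]]    # states that give up mass
    Jc = [j for j in range(m) if pn[j] > p[j]]    # states that receive mass
    excess = sum(pn[i] - p[i] for i in Jc)
    law = [Fraction(0)] * m
    agree = Fraction(0)
    for j in Jc:                                  # kappa in J_n^c is left unchanged
        law[j] += p[j]
        agree += p[j]
    for j in J:
        if p[j] == 0:
            continue                              # then pn[j] = 0 as well
        keep = pn[j] / p[j]                       # left endpoint of Delta_j
        law[j] += p[j] * keep
        agree += p[j] * keep
        for i in Jc:
            alpha = (pn[i] - p[i]) / excess
            law[i] += p[j] * alpha * (1 - keep)   # p_j * |Delta_{j,i}|
    return law, agree

p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
pn_far = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
pn_near = [Fraction(49, 100), Fraction(1, 3), Fraction(53, 300)]

law_far, agree_far = coupled_law(p, pn_far)
law_near, agree_near = coupled_law(p, pn_near)
```

In both cases the coupled variable has law `pn` exactly, and the probability of agreeing with $\kappa$ equals $\sum_j \min(p_j, p^n_j)$, which tends to 1 as $p^n \to p$.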
17.8. Weak convergence on probability spaces 547
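Proof. For any $p \in \mathbb{N}$ let $\{B^p_k : k \in \mathbb{N}\}$ be a partition of $S$ by measurable sets such that
$\sup_k \operatorname{diam}(B^p_k) < 2^{-p}$ and $\mu(\partial B^p_k) = 0$ for each $k$. Choose $m_p$ so that $\mu\bigl(\bigcup_{k=1}^{m_p} B^p_k\bigr) > 1 - 2^{-p}$
and define
\[ A^p_0 = \Bigl( \bigcup_{k=1}^{m_p} B^p_k \Bigr)^c, \qquad A^p_k = B^p_k, \quad 1 \le k \le m_p. \]
For each $n, p \in \mathbb{N}$ and $0 \le k \le m_p$, let
\[ \mu^p_{n,k}(\cdot) = \begin{cases} \mu_n(\cdot \mid A^p_k) & \text{if } \mu_n(A^p_k) \ne 0, \\ \mu & \text{otherwise.} \end{cases} \]
By Corollary 16.3.4 there exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and random variables
$G^p_{n,k} \stackrel{d}{=} \mu^p_{n,k}$, $X \stackrel{d}{=} \mu$ and $\theta \stackrel{d}{=} U(0, 1)$ on $\Omega$ such that $\{G^p_{n,k}\}$ and $(X, \theta)$ are independent, as well as
$X$ and $\theta$.
Let $\{Y_n\}$ be random variables in $S$ such that $Y_n \stackrel{d}{=} \mu_n$ (not necessarily defined on a common
probability space). For each $p$, define $K^p : s \mapsto \sum_{k=0}^{m_p} k\,\mathbb{1}_{A^p_k}(s)$, and set $\kappa^p = K^p(X)$,
$\kappa^p_n = K^p(Y_n)$. Since $\lim_n \mu_n(A^p_k) = \mu(A^p_k)$ for each $1 \le k \le m_p$, it follows that $\kappa^p_n \Rightarrow \kappa^p$ as
$n \to \infty$. Consequently, by Lemma 17.8.2, there exist random variables $\tilde\kappa^p_n = \tilde\kappa^p_n(X, \theta)$ such
that $\tilde\kappa^p_n \stackrel{d}{=} \kappa^p_n$ and $\tilde\kappa^p_n \to \kappa^p$ $\mathbb{P}$–a.s. as $n \to \infty$. Define
\[ X^p_n = G^p_{n,k} \quad \text{on } \{\tilde\kappa^p_n = k\}, \]
and observe
\[ \mathbb{P}[X^p_n \in A] = \sum_{k=0}^{m_p} \mathbb{P}[G^p_{n,k} \in A,\ \tilde\kappa^p_n = k] = \sum_{k=0}^{m_p} \mu_n(A \cap A^p_k) = \mu_n(A) \]
for any $A \in \mathcal{B}(S)$, that is, $X^p_n \stackrel{d}{=} \mu_n$ for each $n, p \in \mathbb{N}$. Since $X \in A^p_{\kappa^p}$ and $X^p_n \in A^p_{\tilde\kappa^p_n}$
$\mathbb{P}$–a.s., we have that
\[ \{d(X^p_n, X) > 2^{-p}\} \subset \{\tilde\kappa^p_n \ne \kappa^p\} \cup \{X \in A^p_0\}. \]
Since $\tilde\kappa^p_n \to \kappa^p$ $\mathbb{P}$–a.s. and $\{\tilde\kappa^p_n \ne \kappa^p\} = \{|\tilde\kappa^p_n - \kappa^p| \ge 1\}$, then $\lim_m \mathbb{P}\bigl[\bigcup_{n \ge m} \{\tilde\kappa^p_n \ne \kappa^p\}\bigr] = 0$,
and from $\mu(A^p_0) < 2^{-p}$, we conclude that there is $n_p \in \mathbb{N}$ such that
\[ \mathbb{P}\Bigl[ \bigcup_{n \ge n_p} \{\tilde\kappa^p_n \ne \kappa^p\} \Bigr] < 2^{-p}. \]
We may assume that $n_1 < n_2 < \dots$. By the Borel–Cantelli lemma, we have that $\sup_{n \ge n_p} d(X^p_n, X) \le 2^{-p}$
$\mathbb{P}$–a.s. for all but finitely many $p$. If $X_n = X^p_n$ for $n_p \le n < n_{p+1}$, then $X_n \stackrel{d}{=} \mu_n$ and $X_n \to X$
$\mathbb{P}$–a.s. as $n \to \infty$.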
The following result, very useful in applications, states that sequences of random variables
that are close to one another have the same weak limit distribution.
Theorem 17.8.4. (Slutsky) Let {Xn }, {Yn } and X be random variables in (S, d) defined on
a probability space (Ω, F , P). If Xn ⇒ X and d(Xn , Yn ) → 0 in probability, then Yn ⇒ X.
If $F$ is closed, then $F_\varepsilon \searrow F$ as $\varepsilon \searrow 0$ and the result follows from the Portmanteau theorem
(iii).
Corollary 17.8.5. Let $X_n$ and $Y_n$ be real or complex valued random variables defined on
a common probability space, and let $c$ be a real or complex constant. Assume that $X_n \Rightarrow X$
and $Y_n \Rightarrow c$. Then
(i) $X_n + Y_n \Rightarrow X + c$.
(ii) $Y_n X_n \Rightarrow cX$.
(iii) If $c \ne 0$, then $X_n / Y_n \Rightarrow X / c$.
Proof. Clearly $[X_n, c]^\top \Rightarrow [X, c]^\top$. As $d_2([X_n, Y_n]^\top, [X_n, c]^\top) = |Y_n - c|$ converges to 0
in measure, $[X_n, Y_n]^\top \Rightarrow [X, c]^\top$ by Theorem 17.8.4. Consequently, $f(X_n, Y_n) \Rightarrow f(X, c)$ for any $f$ which is
$\mathbb{P} \otimes \delta_c$–a.s. continuous. (i) and (ii) follow from the particular continuous cases $[x, y]^\top \mapsto x + y$
and $[x, y]^\top \mapsto yx$. (iii) follows from the particular case $h : [x, y]^\top \mapsto x / y$, since the set of
discontinuities of $h$ is $D_h = \mathbb{C} \times \{0\}$ and $\mathbb{P}[(X, c) \in \mathbb{C} \times \{0\}] = \mathbb{P}[X \in \mathbb{C}]\,\delta_c(\{0\}) = 0$.
17.9. Exercises
Exercise 17.9.1. Let (S, τ ) be a Hausdorff topological space. Suppose W is a linear
subspace of Cb (S) which separates points of M(S). Show that
(a) M+ (S) is a closed convex pointed cone of (M, σ(M, W)).
(b) The collection $\mathcal{M}^+_1(S) = \{\mu \in \mathcal{M}^+(S) : \mu(S) = 1\}$ of Baire probability measures
on $S$ is a closed convex subset of $(\mathcal{M}, \sigma(\mathcal{M}, \mathcal{W}))$.
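Exercise 17.9.2. Let $(S, d)$ be a metric space. Suppose $\psi \in C(S)$ and $\psi \ge 1$. Show that
$\operatorname{span}\{\delta_x : x \in S\}$, $\operatorname{co}\{a\delta_x : x \in S,\ a \ge 0\}$, and $\operatorname{co}\{\delta_x : x \in S\}$ are $\sigma(\mathcal{M}^\psi(S), C^\psi(S))$–dense
in $\mathcal{M}^\psi(S)$, $\mathcal{M}^\psi_+(S)$, and $\mathcal{M}^\psi(S) \cap \mathcal{M}^+_1(S)$ respectively.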
Exercise 17.9.3. Suppose $(K, \mathcal{B}(K))$ is a compact metric space with Borel $\sigma$–algebra, and
let $D$ be a dense subset of $K$. Show that $(\mathcal{M}(D), \sigma(\mathcal{M}(D), U_b(D)))$ coincides with the subspace
$\mathcal{M}_D(K) \subset (\mathcal{M}(K), \sigma(\mathcal{M}(K), C_b(K)))$ of all measures $\nu$ on $K$ such that $\nu(K \setminus D) = 0$.
Exercise 17.9.4. Suppose $f \in L_b(S)$, where $(S, d)$ is a metric space. Show that the map
$\mu \mapsto \int f\,d\mu$ on $\mathcal{M}^+(S)$ with the relative weak topology $\sigma(\mathcal{M}(S), C_b(S))$ is lower semicontinuous.
Similarly, if $f \in U_b(S)$, then $\mu \mapsto \int f\,d\mu$ is upper semicontinuous.
Exercise 17.9.5. Complete the proof that k k∗ given by (17.5) defines a norm on the space
M(S), where (S, d) is metric space (not necessarily separable).
Exercise 17.9.6. Under the assumptions of Lemma 17.3.2, given $\delta > 0$ define $h_\delta(x) :=
\Omega_h(B(x; \delta))$. For any $r > 0$ show that the set $A^\delta_r := \{x \in S : h_\delta(x) > r\}$ is open in $(S, d)$.
Show that $\bigcap_{\delta > 0} A^\delta_r = J_r$.
Exercise 17.9.7. Suppose Xn and X are random variables in a metric space (S, d) and
that Xn converges in probability to X. Suppose that h : S → (S ′ , d′ ) is continuous on a set
C ⊂ S and that P[X ∈ C] = 1. Show that h(Xn ) converges in probability to h(X). (Hint:
Fix ε > 0. For any δ > 0,
P∗ [d′ (h(Xn ), h(X)) > ε] ≤ P∗ [d(Xn , X) ≥ δ]
+ P∗ [d(Xn , X) < δ, d′ (h(Xn ), h(X)) > ε]
≤ P∗ [d(Xn , X) ≥ δ] + P[X ∈ Aδε ].
Use Exercise 17.9.6.)
Exercise 17.9.8. Suppose $\mu_n$, $n \in \mathbb{Z}_+$, and $\mu$ are probability measures on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$.
On $((0, 1), \mathcal{B}((0, 1)), \lambda_1)$ define $X_n(t) = \inf\{x : F_n(x) \ge t\}$ and $X(t) = \inf\{x : F(x) \ge t\}$,
where $F_n(x) = \mu_n(-\infty, x]$ (similarly for $F$). If $\mu_n \Rightarrow \mu$, show that $X_n \to X$ pointwise.
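For finitely supported laws the quantile transform of Exercise 17.9.8 can be computed directly. The sketch below is an illustration under stated assumptions: laws are given as sorted lists of `(value, probability)` pairs with exact rational weights, and the helper name `quantile_function` and the test laws $\mu_n = (1 - \frac{1}{n})\delta_0 + \frac{1}{n}\delta_n$, $\mu = \delta_0$ are choices made here.

```python
import bisect
from fractions import Fraction
from itertools import accumulate

def quantile_function(atoms):
    """Return X(t) = inf{x : F(x) >= t} for a law given as sorted (value, prob) pairs."""
    values = [x for x, _ in atoms]
    cdf = list(accumulate(p for _, p in atoms))   # F evaluated at the atoms
    def X(t):
        k = bisect.bisect_left(cdf, t)            # first index with cdf[k] >= t
        return values[min(k, len(values) - 1)]
    return X

def mu_n(n):
    """(1 - 1/n) * delta_0 + (1/n) * delta_n as a sorted list of atoms."""
    return [(0, 1 - Fraction(1, n)), (n, Fraction(1, n))]

t = Fraction(9, 10)
X_limit = quantile_function([(0, Fraction(1))])   # quantile function of delta_0
trajectory = {n: quantile_function(mu_n(n))(t) for n in (5, 10, 100)}
```

At the fixed point $t = 9/10$ the value $X_n(t)$ equals $n$ while $1 - \frac{1}{n} < t$ and drops to $0 = X(t)$ from $n = 10$ on, even though $\sup_t X_n(t) = n$ is unbounded.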
Exercise 17.9.9. Let X, {Xn } be real valued random variables. Suppose Xn ⇒ X and let
{an } be a numerical sequence.
(i) Suppose that Xn + an converges in distribution. Show that an converges.
(ii) Show that if an > 0, an Xn ⇒ X, and X is not identically zero, then limn an = 1.
(iii) Show that if an → ∞ and an Xn converges in law, then Xn → 0 in measure.
Chapter 18
Weak convergence in
Euclidean spaces
To prove sufficiency we assume without loss of generality that µn(R^d) = µ(R^d). Each d–
dimensional interval (a, b] is determined by the 2d hyperplanes that contain its faces.
Let U be the class of d–dimensional intervals for which the hyperplanes containing their
faces have µ–measure zero. Notice that there are at most countably many hyperplanes
(orthogonal to one of the main axes) with positive µ–measure. For each A = (a, b] ∈ U, let
V_A be the set of vertices of A; then, each v ∈ V_A is a point of continuity for F. Since
\[ \mu_n(A) = \sum_{v\in V_A} (-1)^{p(v)} F_n(v), \qquad \text{where } p(v) = \sum_{k=1}^{d} 1_{\{v_k = a_k\}}, \]
we conclude that µn(A) → µ(A) for each A ∈ U. Therefore, by Corollary 17.6.2, µn ⇒ µ.
Theorem 18.1.2. Let µn be a sequence of measures in M+(B(R^d)) with distributions Fn,
and let F be a right–continuous function in R^d with positive increments. If sup_n ‖µn‖_TV < ∞
and Fn(x) → F(x) for each point x of continuity of F, then $\mu_n \xrightarrow{v} \mu$, where µ is
the Lebesgue–Stieltjes measure with
\[ \mu((a,b]) = \prod_{k=1}^{d} \Delta_k(a_k,b_k)\,F \]
for each d–dimensional interval (a, b].
Proof. We will show that µn(A) → µ(A) for any Borel measurable set A such that µ(∂A) = 0
and the closure Ā is compact. Let U be the collection of d–dimensional intervals for which
the parallel hyperplanes containing their faces have zero µ–measure. Clearly Ā is compact
and µ(∂A) = 0 if A ∈ U, and as in the proof of Theorem 17.6.1, lim inf_n µn(G) ≥ µ(G) for any
open set G. By Theorem 17.2.11, it suffices to prove that lim sup_n µn(K) ≤ µ(K) holds for any
compact set K. If K ⊂ R^d is compact then, for any ε > 0, one can choose an open set
V = ⋃_{j=1}^{n} (a_j, b_j) ⊃ K such that (a_j, b_j] ∈ U and µ(V) < µ(K) + ε. As in the proof of
Theorems 17.6.1 and 18.1.1, lim_n µn(V) = µ(V), so lim sup_n µn(K) ≤ lim_n µn(V) = µ(V) <
µ(K) + ε.
Theorem 18.1.3. (Helly’s selection theorem) Any sequence of uniformly bounded measures
in (Rd , B(Rd )) has a weak*–convergent subsequence.
Proof. Without loss of generality, assume that sup_n ‖µn‖_TV ≤ 1. A short proof follows
from Alaoglu’s theorem and the Riesz representation theorem. Indeed, C0(R^d)∗ = M(R^d) and the
closed ball B = {µ ∈ M(R^d) : ‖µ‖_TV ≤ 1} is weak* compact. Since C0(R^d) is separable,
B with the weak* topology is metrizable and hence sequentially compact.
If µn is a sequence of probability measures on (Rd , B(Rd )), the vague limit in Helly’s
selection theorem may not be a probability measure since some mass may escape to infinity.
We conclude this section with a result that extends the notion of approximations to the
identity to measures in B(Rn ).
Theorem 18.1.4. Let {Kε : ε > 0} be a family of good kernels on R^n such that ∫ Kε dλn = a
for all ε > 0. For any complex measure µ on (R^n, B(R^n)), (Kε · λn) ∗ µ converges weakly to
aµ as ε → 0.
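Proof. Fix f ∈ C_b(R^n). Applying Fubini’s theorem and using the translation invariance of
Lebesgue measure we have that
\[
\begin{aligned}
\Big|\int f(z)\,\big((K_\varepsilon\cdot\lambda_n) * \mu\big)(dz) - a\int f\,d\mu\Big|
&= \Big|\int\!\!\int \big(f(x+y) - f(y)\big)\,K_\varepsilon(x)\,dx\,\mu(dy)\Big| \\
&\le \int\!\!\int |f(x+y) - f(y)|\,|K_\varepsilon(x)|\,dx\,|\mu|(dy).
\end{aligned}
\]
Let M = sup_{ε>0} ‖Kε‖_1. For any η > 0, there exists a compact subset K of R^n such that
\[ 2\|f\|_u\,M\,|\mu|(K^c) < \frac{\eta}{3}. \tag{18.1} \]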
18.1. Weak convergence and distribution functions 553
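Proof. Set K(x) = ϕ̂(−x) and let Kε(x) = ε^{−n} K(ε^{−1}x) for all ε > 0. Using Fubini’s
theorem and the translation invariance of λn we obtain that
\[
\begin{aligned}
K_\varepsilon * \mu(x) &= \varepsilon^{-n}\int \hat\varphi\big(\varepsilon^{-1}(y-x)\big)\,\mu(dy) \\
&= \varepsilon^{-n}\int\!\!\int e^{-i2\pi\varepsilon^{-1}(y-x)\cdot s}\,\varphi(s)\,ds\,\mu(dy) \\
&= \int\!\!\int e^{-i2\pi(y-x)\cdot s}\,\varphi(\varepsilon s)\,ds\,\mu(dy) \\
&= \int e^{i2\pi x\cdot s}\,\varphi(\varepsilon s)\int e^{-i2\pi y\cdot s}\,\mu(dy)\,ds \\
&= \int \varphi(\varepsilon s)\,e^{i2\pi x\cdot s}\,\hat\mu(-2\pi s)\,ds.
\end{aligned}
\]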
The first statement follows from Theorem 18.1.4. The second statement follows from the
observations in Remark 15.5.2 and by Theorem 15.3.8(i). It can also be proved directly
by considering ϕ(x) = e^{−π|x|²}. By dominated convergence, for any f ∈ C00(R^n)
\[ \int f\,d\mu = \lim_{\varepsilon\to 0}\int f(x)\,S_\varepsilon(\mu,x)\,dx = \int f(x)\int e^{i2\pi x\cdot s}\,\hat\mu(s)\,ds\,dx. \]
Then ν(dx) = µ(dx) − (∫ e^{i2πx·s} µ̂(s) ds) dx is a complex measure that vanishes on C00(R^n).
It follows from the Riesz representation theorem that ν = 0.
Remark 18.1.6. Theorem 18.1.4 offers a direct proof that the Fourier transform separates
complex measures in B(R^n). Indeed, if µ̂ = ν̂ then Sε(µ) = Sε(ν), and so for all
f ∈ C_b(R^n)
\[ \mu f = \lim_{\varepsilon\to 0}\int f\,S_\varepsilon(\mu)\,d\lambda_n = \lim_{\varepsilon\to 0}\int f\,S_\varepsilon(\nu)\,d\lambda_n = \nu f. \]
Proof. Given ε > 0, let M > 0 be such that µ(|x| > M) < ε/2. Let f be a continuous
function such that 1_{R^d \ B(0;2M)} ≤ f ≤ 1_{R^d \ B(0;M)}, for instance
\[ f(x) = 0 \vee \Big(1 \wedge \Big(\frac{|x|}{M} - 1\Big)\Big). \]
Then lim sup_n µn(|x| > 2M) ≤ lim_n ∫ f dµn = ∫ f dµ < ε/2; hence, there is n0 such that
sup_{n>n0} µn(|x| > 2M) < ε. For 1 ≤ n ≤ n0, let Mn > 0 be such that µn(|x| > Mn) < ε.
Therefore, if J = 2M ∨ max_{1≤n≤n0} Mn then sup_n µn(|x| > J) < ε.
Lemma 18.2.2. Let {µn}, µ be measures in M+(R^d) such that $\mu_n \xrightarrow{v} \mu$. Then, {µn} is
tight if and only if lim_n µn(R^d) = µ(R^d), in which case µn ⇒ µ.
Proof. Assume {µn} is tight and let f ∈ C_b(R^d). For ε > 0, there is r > 0 such that
µ(|x| > r) ∨ sup_n µn(|x| > r) < ε/(2(‖f‖_u + 1)). Let g_r ∈ C00(R^d) be such that 1_{B(0;r)} ≤ g_r ≤ 1.
Then
\[
\begin{aligned}
\Big|\int f\,d(\mu_n-\mu)\Big| &\le \Big|\int (f - f g_r)\,d\mu_n\Big| + \Big|\int f g_r\,d(\mu_n-\mu)\Big| + \Big|\int (f - f g_r)\,d\mu\Big| \\
&\le \Big|\int f g_r\,d(\mu_n-\mu)\Big| + \|f\|_u\,(\mu_n+\mu)(|x|>r) \\
&< \Big|\int f g_r\,d(\mu_n-\mu)\Big| + \varepsilon.
\end{aligned}
\]
Letting n ր ∞ and then ε ց 0 we obtain lim_n ∫ f dµn = ∫ f dµ.
Sufficiency follows from Theorem 17.2.12 and Lemma 18.2.1.
Proof. For any sequence {µn} ⊂ Π there is, by Helly’s selection theorem, a subsequence
µn′ and a finite measure µ such that $\mu_{n'} \xrightarrow{v} \mu$. If Π is tight, then Lemma 18.2.2 implies that
µn′ ⇒ µ.
Conversely, if Π is not tight, then for some ε > 0 there is a sequence {µn} ⊂ Π such
that µn(|x| > n) ≥ ε. By hypothesis, there is a subsequence {µn′} such that µn′ ⇒ µ and
hence, by Lemma 18.2.2, {µn′} is tight. Therefore, sup_{n′} µn′(|x| > M) < ε for some M > 0.
This contradicts the choice of µn.
Proof. It is clear that (a) or (b) imply (c). Theorem 16.7.5 states that (a) and (b) are
equivalent. Thus, it suffices to show that (c) implies (b). Let Sn = Σ_{k=1}^{n} Xk, and let µ be
the Borel probability measure on R to which Sn converges. Then, {Sn} is uniformly tight
and so is the family {Sm − Sn : m < n}. We will show that {Sm} is a Cauchy sequence in
measure. If that were not the case, there is a sequence Yj := S_{mj} − S_{nj} such that
\[ \inf_j D(Y_j, 0) = \inf_j \int |Y_j| \wedge 1\,dP > 0 \tag{18.4} \]
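Proof. It suffices to assume that µ(R^d) = 1. Since (sin x)/x ≤ 1/2 for all |x| ≥ 2, we obtain
\[
\begin{aligned}
\int_{-c}^{c} \big(\mu(\mathbb{R}^d) - \hat\mu(ta)\big)\,dt &= \int \Big(\int_{-c}^{c} (1 - e^{ita\cdot x})\,dt\Big)\,\mu(dx) \\
&= 2c\int \Big(1 - \frac{\sin(c\,a\cdot x)}{c\,a\cdot x}\Big)\,\mu(dx) \ \ge\ c\,\mu(x : |c\,a\cdot x| \ge 2).
\end{aligned}
\]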
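As a numerical sanity check of the truncation inequality above, the following sketch evaluates both sides for a finite discrete probability measure on R (so d = 1 and a = 1). The atoms, the value of c and all function names are illustrative assumptions and are not part of the notes.

```python
import cmath
import math

def char_fn(atoms, t):
    """Characteristic function of a finite discrete measure: sum of w * exp(i t x)."""
    return sum(w * cmath.exp(1j * t * x) for x, w in atoms)

def lhs_numeric(atoms, c, steps=20000):
    """Midpoint-rule approximation of the integral over [-c, c] of mu(R) - mu_hat(t)."""
    total = sum(w for _, w in atoms)
    h = 2.0 * c / steps
    acc = 0.0
    for k in range(steps):
        t = -c + (k + 0.5) * h
        # Only the real part is kept; the imaginary part integrates to zero on [-c, c].
        acc += (total - char_fn(atoms, t)).real * h
    return acc

def lhs_closed_form(atoms, c):
    """2c * integral of (1 - sin(c x)/(c x)) mu(dx), with the convention sin(0)/0 = 1."""
    def sinc(u):
        return 1.0 if u == 0.0 else math.sin(u) / u
    return 2.0 * c * sum(w * (1.0 - sinc(c * x)) for x, w in atoms)

def rhs(atoms, c):
    """c * mu{x : |c x| >= 2}."""
    return c * sum(w for x, w in atoms if abs(c * x) >= 2.0)

# Hypothetical probability measure given as (point, mass) pairs.
ATOMS = [(-3.0, 0.2), (0.1, 0.5), (2.5, 0.3)]

if __name__ == "__main__":
    print(lhs_numeric(ATOMS, 1.0), lhs_closed_form(ATOMS, 1.0), rhs(ATOMS, 1.0))
```

The numerical integral and the closed form should agree up to quadrature error, and both should dominate the right-hand side.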
Lemma 18.4.2. Suppose {µn : n ∈ N} ⊂ M+(R^d), and let µ̂n be the characteristic
function of µn. If {µ̂n : n ∈ N} converges pointwise to some limit p and {µn : n ∈ N} is
tight, then µn ⇒ µ for some measure µ ∈ M+(R^d) with µ̂ = p.
Proof. Pointwise convergence implies that µn(R^d) = µ̂n(0) → p(0); thus, {µn} is uniformly
bounded. Tightness, Helly’s theorem and Lemma 18.2.2 imply that any subsequence of
{µn} has a weakly convergent subsequence µn′. Suppose µn′ ⇒ µ; then, since f_t(x) =
e^{it·x} ∈ C_b(R^d), we have that p(t) = lim_{n′} µ̂n′(t) = µ̂(t). By uniqueness of characteristic
functions, every subsequential limit is this same µ; therefore, µn ⇒ µ.
Proof. Clearly µn is uniformly bounded. By (18.5) and dominated convergence, for any
a ∈ R^d and r > 0 fixed we have
\[ \limsup_n \mu_n(x : |a\cdot x| > r) \le \frac{r}{2}\int_{-2/r}^{2/r} \big(p(0) - p(ta)\big)\,dt \tag{18.6} \]
The next result gives necessary and sufficient conditions for weak convergence of a
sequence of probability measures in Euclidean space in terms of the corresponding sequence
of characteristic functions.
Theorem 18.4.4. Let {µ, µn : n ∈ N} ⊂ M+(R^d, B(R^d)). Then, µn ⇒ µ if and only if
µ̂n → µ̂ uniformly on compact sets.
Proof. Necessity follows by a direct application of Rao’s theorem 17.2.8 with (S, d) =
(R^d, ‖·‖_2), and Γ := {f_t(x) = e^{it·x} : t ∈ K} where K ⊂ R^d is compact.
We show now that ϕ̂ ≥ 0. Indeed, if ϕ̂(t0) < 0 for some point t0 ∈ R^d, choose a real valued
function ρ ∈ C^∞_00(R^d) that equals one at t0 and that vanishes outside a neighborhood of t0
in which ϕ̂ is negative. Let g(x) = F^{−1}ρ(x) = ∫ e^{2πi t·x} ρ(t) dt so that, by Theorem 15.5.5,
ĝ = ρ. Thus, 0 ≤ F(0) = ∫ ρ²(t) ϕ̂(t) dt < 0, which is a contradiction.
Let g_1(x) = (√(2π))^{−d} e^{−|x|²}, g_n(x) = n^d g_1(nx) and define F_n = g_n ∗ g_n ∗ ϕ. Since ‖g_n‖_1 = 1
and ĝ_n(t) = e^{−2π²|t|²/n²} ր 1, we obtain by the Monotone Convergence Theorem and (18.8) that
\[ \int \hat\varphi(t)\,dt = \lim_n \int \big(\hat g_n(t)\big)^2\,\hat\varphi(t)\,dt \le \varphi(0) < \infty. \]
Therefore, ϕ̂ ∈ L^+_1(R^d).
Proof. We only prove sufficiency. If ϕ ∈ L1, then Lemma 18.5.1 implies that ϕ = µ̂ for
some finite measure µ ≪ λd. The general case will be obtained from the integrable case
through Levy–Bochner’s continuity theorem. Suppose γ is a positive integrable function
such that ‖γ‖_1 = 1 and define σ(t) := ∫ e^{i t·x} γ(x) dx = γ̂(−t/(2π)). Then, t ↦ ϕ(t)σ(t) also
satisfies the conditions of the Theorem, for
\[
\begin{aligned}
\sum_{j,k=1}^{n} c_j\,\varphi(t_j - t_k)\,\sigma(t_j - t_k)\,\bar c_k
&= \sum_{j,k=1}^{n} c_j\,\varphi(t_j - t_k)\Big(\int e^{i(t_j - t_k)\cdot y}\,\gamma(y)\,dy\Big)\bar c_k \\
&= \int \sum_{j,k=1}^{n} \big(c_j e^{i t_j\cdot y}\big)\,\varphi(t_j - t_k)\,\overline{\big(c_k e^{i t_k\cdot y}\big)}\,\gamma(y)\,dy \ \ge\ 0.
\end{aligned}
\]
Consider γ(x) = (1/√(2π)) e^{−|x|²/2} and define γ_n(x) = n^d γ(nx) for each n ∈ N. Since σ_n(t) =
e^{−|t|²/n²}, ϕσ_n ∈ L1(R^d) for each n ∈ N; consequently, ϕσ_n = µ̂_n for some finite measure
µn ≪ λd, and µn(R^d) = ϕ(0). Since lim_n ϕσ_n = ϕ, we conclude from Levy’s continuity
theorem that there is a finite measure µ such that µn ⇒ µ and µ̂ = ϕ.
Lemma 18.6.1. Let {zj , wj : 1 ≤ j ≤ n} ⊂ C be such that max_j {|zj|, |wj|} ≤ θ. Then
\[ \Big|\prod_{j=1}^{n} z_j - \prod_{j=1}^{n} w_j\Big| \le \theta^{\,n-1}\sum_{j=1}^{n} |z_j - w_j| \tag{18.9} \]
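Proof. (18.9) holds trivially if n = 1. By induction, suppose that (18.9) holds for n − 1.
Then
\[
\begin{aligned}
\Big|\prod_{j=1}^{n} z_j - \prod_{j=1}^{n} w_j\Big|
&\le \Big|z_1\prod_{j=2}^{n} z_j - z_1\prod_{j=2}^{n} w_j\Big| + \Big|z_1\prod_{j=2}^{n} w_j - w_1\prod_{j=2}^{n} w_j\Big| \\
&\le \theta\,\Big|\prod_{j=2}^{n} z_j - \prod_{j=2}^{n} w_j\Big| + \theta^{\,n-1}|z_1 - w_1| \ \le\ \theta^{\,n-1}\sum_{j=1}^{n} |z_j - w_j|.
\end{aligned}
\]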
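The product inequality (18.9) is easy to probe numerically. The sketch below, with hypothetical helper names and randomly generated test data, compares both sides of (18.9) taking θ to be the largest modulus among the z_j and w_j.

```python
import cmath
import math
import random

def product(values):
    """Product of a list of complex numbers."""
    result = 1 + 0j
    for v in values:
        result *= v
    return result

def both_sides(zs, ws):
    """Return (|prod z_j - prod w_j|, theta^(n-1) * sum |z_j - w_j|), theta = max modulus."""
    n = len(zs)
    theta = max(abs(v) for v in zs + ws)
    lhs = abs(product(zs) - product(ws))
    rhs = theta ** (n - 1) * sum(abs(z - w) for z, w in zip(zs, ws))
    return lhs, rhs

def random_disk_points(rng, n, radius):
    """n points drawn from the closed disk of the given radius (illustrative test data)."""
    return [cmath.rect(radius * math.sqrt(rng.random()), 2 * math.pi * rng.random())
            for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    zs = random_disk_points(rng, 5, 1.5)
    ws = random_disk_points(rng, 5, 1.5)
    print(both_sides(zs, ws))
```

A small relative tolerance is advisable when comparing the two sides in floating point, since for n = 1 they coincide.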
Proof. Observe that 2^{n−1} ≤ n! for n ≥ 2. Since e^z − 1 − z = Σ_{n≥2} z^n/n! and |z| ≤ 1, it
follows that
\[ |e^z - 1 - z| \le \frac{|z|^2}{2}\sum_{n\ge 2} 2^{-(n-2)} = |z|^2 \]
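A quick grid check of the bound |e^z − 1 − z| ≤ |z|² on the punctured unit disk can be sketched as follows; the grid sizes and function names are arbitrary illustrative choices.

```python
import cmath
import math

def remainder(z):
    """|e^z - 1 - z|, the quantity bounded in the proof."""
    return abs(cmath.exp(z) - 1 - z)

def worst_ratio(radial_steps=50, angular_steps=360):
    """Largest value of |e^z - 1 - z| / |z|^2 over a polar grid in 0 < |z| <= 1."""
    worst = 0.0
    for i in range(1, radial_steps + 1):
        r = i / radial_steps
        for k in range(angular_steps):
            z = cmath.rect(r, 2 * math.pi * k / angular_steps)
            worst = max(worst, remainder(z) / r ** 2)
    return worst

if __name__ == "__main__":
    print(worst_ratio())
```

On this grid the ratio is largest at z = 1, where it equals e − 2, comfortably below the bound 1.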
Proof. Let γ > |c| and n0 large enough so that |c_n| < γ and γ/n ≤ 1 whenever n ≥ n0. If
z_j = 1 + c_n/n and w_j = e^{c_n/n}, 1 ≤ j ≤ n, then we have that
\[ \max_{1\le j\le n}\{|z_j|, |w_j|\} \le e^{\gamma/n}, \qquad \Big|e^{c_n/n} - 1 - \frac{c_n}{n}\Big| \le \frac{\gamma^2}{n^2}. \]
Therefore
\[ \Big|\Big(1 + \frac{c_n}{n}\Big)^{n} - e^{c_n}\Big| \le e^{\gamma(n-1)/n}\,n\,\frac{\gamma^2}{n^2} \to 0. \]
The conclusion follows from the continuity of the exponential function.
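The estimate in this proof can be illustrated numerically. The sketch below uses a hypothetical sequence c_n = c + 1/n and a hypothetical constant γ = 2 (chosen so that |c_n| < γ and γ/n ≤ 1 for n ≥ 2), and compares the gap |(1 + c_n/n)^n − e^{c_n}| with the bound e^{γ(n−1)/n} n γ²/n².

```python
import cmath
import math

C = 0.5 + 1.0j   # hypothetical limit c
GAMMA = 2.0      # hypothetical gamma, larger than every |c_n| used below

def c_seq(n):
    """Hypothetical sequence c_n = c + 1/n, which converges to c."""
    return C + 1.0 / n

def gap_and_bound(c_n, n, gamma):
    """Return (|(1 + c_n/n)^n - e^{c_n}|, e^{gamma (n-1)/n} * n * gamma^2 / n^2)."""
    gap = abs((1 + c_n / n) ** n - cmath.exp(c_n))
    bound = math.exp(gamma * (n - 1) / n) * n * gamma ** 2 / n ** 2
    return gap, bound

if __name__ == "__main__":
    for n in (2, 10, 100, 1000):
        print(n, gap_and_bound(c_seq(n), n, GAMMA))
```

Both columns should decrease roughly like 1/n, with the gap staying below the bound.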
Theorem 18.6.4. Let {c_{n,m} : 1 ≤ m ≤ m_n} ⊂ C. Suppose that
(i) lim_{n→∞} sup_{1≤m≤m_n} |c_{n,m}| = 0,
(ii) lim_{n→∞} Σ_{m=1}^{m_n} c_{n,m} = c ∈ C,
(iii) and M := sup_n Σ_{m=1}^{m_n} |c_{n,m}| < ∞.
Then,
\[ \lim_{n\to\infty}\prod_{m=1}^{m_n} (1 + c_{n,m}) = e^{c} \]
Proof. If log is the principal logarithm on C \ ((−∞, 0] × {0}), then log(1 + z) is well defined
for |z| < 1 and lim_{z→0} log(1 + z)/z = 1.
Given ε > 0, there is δ > 0 such that 0 < |z| < δ implies
| log(1 + z) − z| < ε|z|.
By (i), we can assume without loss of generality that sup_m |c_{n,m}| < δ ∧ 1 for all n. Then,
\[ \Big|\sum_{m=1}^{m_n} \log(1 + c_{n,m}) - \sum_{m=1}^{m_n} c_{n,m}\Big| \le \sum_{m=1}^{m_n} |\log(1 + c_{n,m}) - c_{n,m}| \le \varepsilon\sum_{m=1}^{m_n} |c_{n,m}| \le M\varepsilon \]
By letting n → ∞ and then ε → 0 we obtain
\[ \lim_{n\to\infty}\sum_{m=1}^{m_n} \log(1 + c_{n,m}) = \lim_{n\to\infty}\sum_{m=1}^{m_n} c_{n,m} = c \]
The conclusion follows from the continuity of the exponential function.
Theorem 18.6.5. (Classical CLT) Let {Xn} ⊂ L2(P) be a sequence of i.i.d. random vectors
with covariance matrix Σ = E[XX∗]. If Sn = Σ_{k=1}^{n} Xk, then
\[ \frac{S_n - nE[X_1]}{\sqrt{n}} \Longrightarrow N(0, \Sigma) \]
where N(0, Σ) is a multivariate normal distribution with mean 0 and covariance Σ.
Proof. By setting Xn′ = Xn − E[X1] we can assume that {Xn} is a mean zero sequence.
Equation 15.4 in Theorem 15.1.9 shows that
\[ \varphi_{X_1}(t) = 1 - \frac{t^*\Sigma t}{2} + o(|t|^2) \]
Therefore, for fixed t
\[ E\Big[\exp\Big(\frac{i\,t\cdot S_n}{\sqrt{n}}\Big)\Big] = \Big(1 - \frac{t^*\Sigma t}{2n} + o\Big(\frac{1}{n}\Big)\Big)^{n} \to \exp\Big(-\frac{1}{2}\,t^*\Sigma t\Big) \]
by Lemma 18.6.3. The conclusion follows from Levy–Bochner’s theorem.
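For a concrete one-dimensional illustration of this argument, take i.i.d. signs X_k = ±1 with probability 1/2 each (an assumed example, not taken from the notes). Then ϕ_{X_1}(t) = cos t and Σ = 1, so the characteristic function of S_n/√n is cos(t/√n)^n, which can be compared directly with e^{−t²/2}.

```python
import math

def cf_normalized_sum(t, n):
    """Characteristic function of S_n / sqrt(n) for i.i.d. signs X_k = +1 or -1.

    Here phi_{X_1}(t) = cos(t), and independence gives cos(t / sqrt(n)) ** n.
    """
    return math.cos(t / math.sqrt(n)) ** n

def cf_standard_normal(t):
    """Characteristic function of N(0, 1)."""
    return math.exp(-0.5 * t * t)

def error(t, n):
    """Absolute distance between the two characteristic functions at t."""
    return abs(cf_normalized_sum(t, n) - cf_standard_normal(t))

if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, error(1.5, n))
```

The printed errors should shrink as n grows, in line with the pointwise convergence used in the proof.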
18.6.2. Lindeberg–Feller CLT. In this section we obtain a slightly more general CLT
for independent random variables.
Theorem 18.6.6. For each n, let X_{n,m}, 1 ≤ m ≤ m_n, be independent random vectors with
E[X_{n,m}] = 0. Suppose that
(1) Σ_{m=1}^{m_n} E[X_{n,m} X∗_{n,m}] → Σ as n → ∞, where Σ is a positive definite matrix.
(2) For any ε > 0, Σ_{m=1}^{m_n} E[|X_{n,m}|²; |X_{n,m}| > ε] → 0 as n → ∞.
Then Sn = Σ_{m=1}^{m_n} X_{n,m} =⇒ N(0, Σ).
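Proof. Let ϕ_{n,m}(t) = E[e^{it·X_{n,m}}] and Σ_{n,m} = E[X_{n,m} X∗_{n,m}]. By Levy–Bochner’s theorem,
it is enough to show that
\[ \prod_{m=1}^{m_n} \varphi_{n,m}(t) \to \exp\Big(-\frac{1}{2}\,t^*\Sigma t\Big) \tag{18.10} \]
Let z_{n,m} = ϕ_{n,m}(t) and w_{n,m} = 1 − ½ t∗Σ_{n,m}t. For 0 < ε ≤ 1, Corollary 15.4 shows that
\[
\begin{aligned}
|z_{n,m} - w_{n,m}| &\le E\Big[\frac{|t\cdot X_{n,m}|^3}{6} \wedge |t\cdot X_{n,m}|^2\Big] \\
&\le \frac{|t|^3}{6}\,E\big[|X_{n,m}|^3;\ |X_{n,m}| \le \varepsilon\big] + |t|^2\,E\big[|X_{n,m}|^2;\ |X_{n,m}| > \varepsilon\big] \\
&\le \frac{\varepsilon|t|^3}{6}\,E\big[|X_{n,m}|^2;\ |X_{n,m}| \le \varepsilon\big] + |t|^2\,E\big[|X_{n,m}|^2;\ |X_{n,m}| > \varepsilon\big]
\end{aligned}
\]
Adding over m = 1, . . . , mn and passing first to the limit n → ∞ and then ε → 0, we obtain
from assumptions (1) and (2)
\[ \lim_{n\to\infty}\sum_{m=1}^{m_n} |z_{n,m} - w_{n,m}| = 0. \tag{18.11} \]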
18.6. Classical Central Limit Theorem 561
shows that lim_{n→∞} sup_m ‖Σ_{n,m}‖ = 0. Hence, for all n large enough |w_{n,m}| ≤ 1. Since
|ϕ_{n,m}(t)| ≤ 1, Lemma 18.6.1 with θ = 1 and (18.11) imply that
\[ \lim_{n\to\infty}\Big|\prod_{m=1}^{m_n} \varphi_{n,m}(t) - \prod_{m=1}^{m_n}\Big(1 - \frac{1}{2}\,t^*\Sigma_{n,m}t\Big)\Big| = 0 \]
Since Σ_{m=1}^{m_n} |t∗Σ_{n,m}t| = Σ_{m=1}^{m_n} t∗Σ_{n,m}t → t∗Σt as n → ∞, the conditions in Theorem 18.6.4 with
c_{n,m} = −½ t∗Σ_{n,m}t are satisfied; hence
\[ \lim_{n\to\infty}\prod_{m=1}^{m_n} \varphi_{n,m}(t) = \lim_{n\to\infty}\prod_{m=1}^{m_n}\Big(1 - \frac{1}{2}\,t^*\Sigma_{n,m}t\Big) = \exp\Big(-\frac{1}{2}\,t^*\Sigma t\Big). \]
Example 18.6.8. Let (Xn : n ∈ N) be an i.i.d. sequence of random variables in L4(P),
assume that E[X1] = 0 and let σ² = E[X1²].
\[
\begin{aligned}
(n-1)S_n^2 :&= \sum_{i=1}^{n} (X_i - \bar X_n)^2 = \sum_{j=1}^{n} X_j^2 - n(\bar X_n)^2 \\
&= \sum_{i=1}^{n} X_i^2 - \frac{1}{n}\Big(\sum_{j=1}^{n} X_j^2 + 2\sum_{1\le j<k\le n} X_j X_k\Big)
\end{aligned}
\]
Proof. Let Y_{n,m} = 1_{{X_{n,m} = 1}}; then P[Y_{n,m} = 1] = P[X_{n,m} = 1] and P[Y_{n,m} ≠ X_{n,m}] =
P[X_{n,m} ≥ 2]. Hence, {Y_{n,m} : 1 ≤ m ≤ m_n} satisfy (i) and (ii); furthermore, if Tn =
Σ_{m=1}^{m_n} Y_{n,m}, then (iii) implies that |Tn − Sn| → 0 in measure. Therefore, by Slutsky’s
theorem, it suffices to prove that Tn =⇒ Pλ. Denote p_{n,m} = P[Y_{n,m} = 1]; then
\[ \varphi_{T_n}(t) = E[e^{itT_n}] = \prod_{m=1}^{m_n} \big(1 + p_{n,m}(e^{it} - 1)\big). \]
Let z_{n,m} = 1 + p_{n,m}(e^{it} − 1) and w_{n,m} = exp(p_{n,m}(e^{it} − 1)). Notice that |z_{n,m}| ≤ 1 and
|w_{n,m}| = e^{p_{n,m}(cos t − 1)} ≤ 1. By (ii), there is n0 such that ∆n := max_{1≤m≤m_n} p_{n,m} < 1/2
whenever n ≥ n0. Hence, by Lemma 18.6.1 and Theorem 18.6.2 we have that
\[
\begin{aligned}
\Big|\prod_{m=1}^{m_n} z_{n,m} - \prod_{m=1}^{m_n} w_{n,m}\Big|
&\le \sum_{m=1}^{m_n} \Big|\exp\big(p_{n,m}(e^{it} - 1)\big) - 1 - p_{n,m}(e^{it} - 1)\Big| \\
&\le \sum_{m=1}^{m_n} p_{n,m}^2\,|e^{it} - 1|^2 \ \le\ 4\Delta_n \sum_{m=1}^{m_n} p_{n,m} \to 0
\end{aligned}
\]
Since lim_n ∏_{m=1}^{m_n} w_{n,m} = exp(λ(e^{it} − 1)) by (i), we conclude that
ϕ_{Tn}(t) → exp(λ(e^{it} − 1)).
Therefore, by Levy–Bochner’s theorem, Tn =⇒ Pλ.
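The comparison between the two products in this proof can be reproduced numerically. The following sketch assumes a hypothetical triangular array p_{n,m} = λ/n with λ = 2 (any array with max p < 1/2 would do) and checks the gap between the two characteristic functions against the bound 4∆n Σ p_{n,m}.

```python
import cmath

def cf_bernoulli_sum(ps, t):
    """Characteristic function of a sum of independent Bernoulli(p) variables."""
    result = 1 + 0j
    for p in ps:
        result *= 1 + p * (cmath.exp(1j * t) - 1)
    return result

def cf_poisson(lam, t):
    """Characteristic function of the Poisson law with mean lam."""
    return cmath.exp(lam * (cmath.exp(1j * t) - 1))

def proof_bound(ps):
    """The bound 4 * max(p) * sum(p) obtained in the proof."""
    return 4 * max(ps) * sum(ps)

def gap(ps, t):
    """Distance between the Bernoulli-sum and Poisson characteristic functions at t."""
    return abs(cf_bernoulli_sum(ps, t) - cf_poisson(sum(ps), t))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        ps = [2.0 / n] * n   # hypothetical array with lambda = 2
        print(n, gap(ps, 1.0), proof_bound(ps))
```

Both the gap and the bound should decay like 1/n for this array.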
18.8. Exercises
Exercise 18.8.1. Let λ be Lebesgue measure on ([0, 1], B([0, 1])). For each n, let µn(dt) =
fn(t)λ(dt) where
\[ f_n(t) = n^2\sum_{k=0}^{n-1} 1_{\left(\frac{k}{n},\ \frac{k}{n} + \frac{1}{n^3}\right)}(t). \]
Let Fn(x) be the distribution function of µn. Show that |Fn(x) − x| ≤ 1/n, fn → 0 λ–a.s.
and that ‖µn − λ‖_TV = (n² − 1)/n² + 1 − 1/n². Conclude that µn ⇒ λ but µn fails to converge in
total variation.
Exercise 18.8.2. Consider µn = δ_n and νn = (1/3)(δ_{−n} + δ_0 + δ_n) on (R, B(R)). Show that
$\mu_n \xrightarrow{v} 0$ and $\nu_n \xrightarrow{v} \frac{1}{3}\delta_0$. In the process of passing to the limit, {µn} lets mass escape to
+∞ whereas {νn} lets equal mass escape to −∞ and ∞.
Exercise 18.8.3. Let µ be a complex Borel measure on R^n. Suppose T is the operator
on L1(R^n, λn) into itself given by f ↦ f ∗ µ. Show that ‖T‖ = ‖µ‖_TV. (Hint: Let h be
a measurable function such that |h| = 1 and dµ = h d|µ|. Choose g ∈ C00(R^n) such that
|g| ≤ 1 and ‖g − h‖_{L1(R^n, λn)} < ε. With φ(x) = e^{−π|x|²}, choose δ > 0 small enough so that
|∫ g(x) φ_δ ∗ µ(x) dx − ∫ g dµ| < ε.)
Exercise 18.8.4. Let Zλ be a compound Poisson random walk with parameter λ and
P–distributed steps. Show that
\[ \frac{Z_\lambda - E[Z_\lambda]}{\sqrt{\operatorname{var} Z_\lambda}} \Longrightarrow N(0, 1) \]
provided that ∫ x² P(dx) < ∞.
Chapter 19
Conditioning and
disintegration
Throughout this section we will consider probability spaces only, that is, a measure space
(Ω, F, P) with P(Ω) = 1. For any integrable function X, we will denote E[X] =
∫ X(ω) P(dω).
Proof. Let A ∈ A and denote by µ and ν the laws of (1A , X) and Y respectively. The
independence of A and Y means that the joint law of ((1A , X), Y ) is the product measure
µ ⊗ ν. Thus, by Fubini’s theorem
\[
\begin{aligned}
\int_A f(X, Y)\,dP &= \iint s\,f(x, y)\,\mu(ds, dx)\otimes\nu(dy) = \int s\Big(\int f(x, y)\,\nu(dy)\Big)\mu(ds, dx) \\
&= \int s\,E[f(x, Y)]\,\mu(ds, dx) = \int_A h(X)\,dP
\end{aligned}
\]
Observe that h(X) ∈ A .
Theorem 19.1.4. (Conditional Jensen’s inequality) Let X : Ω → (a, b), where −∞ ≤ a <
b ≤ ∞, be an integrable function. If ϕ : (a, b) → R is a convex function and ϕ ◦ X ∈ L1 ,
19.2. Conditional Independence 567
then
(19.2) ϕ(E[X|A ]) ≤ E[ϕ ◦ X|A ]
for any sub–σ–algebra A .
Proof. Let S = {(p, q) ∈ R² : px + q ≤ ϕ(x), a < x < b}. The convexity of ϕ implies S ≠ ∅
and that ϕ(x) = sup{px + q : (p, q) ∈ S}. If S′ is a countable dense subset of S, then we also
have ϕ(x) = sup{px + q : (p, q) ∈ S′}. Hence, for all (p, q) ∈ S′, E[ϕ ◦ X|A ] ≥ p E[X|A ] + q
almost surely. Taking the supremum over all (p, q) ∈ S′ gives (19.2).
Elementary results from the theory of Hilbert spaces also lead to the notion of
conditional expectation without reference to the Radon–Nikodym theorem. Indeed, L2(Ω) is a
Hilbert space with inner product ⟨f|g⟩ = ∫_Ω f ḡ dP. Given a σ–algebra A, the space H of
A–measurable square integrable functions is a closed subspace of L2. Thus, for any f ∈ L2
the orthogonal projection g = P_H f of f onto H satisfies ⟨f − g|h⟩ = 0 for all h ∈ H. So, g
satisfies (19.1), that is, g = E[f|A ]. Since A = [g < 0] ∈ A, we have that
E[|g|] = E[1_{A^c} g] − E[1_A g] = E[1_{A^c} f] − E[1_A f] ≤ E[|f|]
Thus the map f ↦ E[f|A ] defined on L2 is L1–continuous. Since P(Ω) = 1, L2 is a dense
subspace of L1; hence there is a unique continuous extension of the conditional expectation
map to L1.
Theorem 19.1.5. Suppose that G is a collection of σ–algebras contained in F and let
f ∈ L1 (P). The family {E[f |A ] : A ∈ G} is uniformly integrable.
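Proof. Denote f_A = E[f|A ]. As {|f_A| > a} ⊂ {E[|f| | A ] > a},
\[ \int_{\{|f_{\mathcal A}| > a\}} |f_{\mathcal A}|\,dP \le \int_{\{E[|f|\,|\,\mathcal A] > a\}} E[|f|\,|\,\mathcal A]\,dP = \int_{\{E[|f|\,|\,\mathcal A] > a\}} |f|\,dP. \]
Since P[E[|f| | A ] > a] ≤ E[|f|]/a → 0 as a → ∞, we conclude that
\[ \inf_{a>0}\ \sup_{\mathcal A\in G}\ \int_{\{|f_{\mathcal A}| > a\}} |f_{\mathcal A}|\,dP = 0. \]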
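Proof. Suppose that A and B are conditionally independent given C. For any A ∈ A, B ∈ B
and C ∈ C we have
\[
\begin{aligned}
P[A\cap C\cap B] &= P\big[1_C\,P[A\cap B|\mathcal C]\big] = P\big[1_C\,P[A|\mathcal C]\,P[B|\mathcal C]\big] \\
&= P\big[P[A|\mathcal C]\,P[B\cap C|\mathcal C]\big] = P\Big[P\big[P[A|\mathcal C]\,1_{B\cap C}\,\big|\,\mathcal C\big]\Big] \\
&= P\big[P[A|\mathcal C]\,1_{B\cap C}\big]
\end{aligned}
\]
Since σ(B, C ) = σ({B ∩ C : B ∈ B, C ∈ C }), a monotone class argument shows that
P[A ∩ H] = P[P[A|C ] 1_H]
for all H ∈ σ(B, C ). This means that
P[A|σ(B, C )] = P[A|C ]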
Proof. (i) implies (ii): As σ(B1, . . . , B_{m+1}) ⊂ σ(Bn : n ∈ N) for all m ∈ Z+, A ⊥⊥_{B0}
σ(B1, . . . , B_{m+1}) for all m ∈ Z+. The result then follows from Lemma 19.2.4.
We introduce some technical concepts that will be useful in the construction of regular
conditional probabilities.
Definition 19.3.2. A collection K ⊂ P(Ω) is a compact class if it is closed under
finite unions and it has the finite intersection property: for any {Kn : n ∈ N} ⊂ K,
⋂_n Kn = ∅ implies that ⋂_{n≤n0} Kn = ∅ for some n0 ∈ N.
Since D is countable, there is a µ–null set N ∈ G such that conditions (i), (ii) and (iii) hold
on Ω \ N. Hence, for each ω ∈ Ω \ N, P(ω, ·) is a finitely additive quasi–probability measure
on D.
We claim that P(ω, ·) is countably additive in B for each ω ∈ Ω \ N. Fix ω ∈ Ω \ N. It is
enough to show that if B ∋ B_j ց ∅, then lim_j P(ω, B_j) = 0. Given ε > 0, for each B_j there
is K^{B_j}_{m_j} ∈ K such that K^{B_j}_{m_j} ⊂ B_j and P(ω, B_j) < P(ω, K^{B_j}_{m_j}) + ε/2^j. Since ⋂_j K^{B_j}_{m_j} = ∅,
there is j0 ∈ N such that ⋂_{j≤j0} K^{B_j}_{m_j} = ∅. For all j ≥ j0,
\[ B_j \subset B_{j_0} = \bigcap_{\ell\le j_0} B_\ell = \bigcap_{\ell\le j_0} B_\ell \setminus \bigcap_{\ell\le j_0} K^{B_\ell}_{m_\ell} \subset \bigcup_{\ell\le j_0} \big(B_\ell \setminus K^{B_\ell}_{m_\ell}\big) \]
Consequently,
\[ P(\omega, B_j) \le P(\omega, B_{j_0}) \le \sum_{\ell\le j_0} P\big(\omega, B_\ell \setminus K^{B_\ell}_{m_\ell}\big) < \varepsilon \]
for all A ∈ B(S), the stochastic kernel P(ω, X^{−1}(A)) := ν(ω, A) is the µ–G regular
conditional probability that represents µ[X ∈ ·|G ].
Then ν̃ is a stochastic kernel from (T, T ) to (S, B(S)), and P(ω, X^{−1}(·)) = ν̃(Y(ω), ·) is
the unique regular conditional probability that represents µ[X ∈ ·|σ(Y)].
19.4. Disintegration
The goal in this section is the extension of Fubini’s theorem through the use of regular
conditional expectations.
Theorem 19.4.1. (Disintegration) Let (Ω, F , P) be a probability space, and (S, S ) and
(T, T ) measurable spaces. Let G ⊂ F be a sub–σ–algebra. Let X : (Ω, F ) → (S, S ) be such
that P[X ∈ ·|G ] has a regular version ν. If Y : (Ω, G ) → (T, T ) and f : (S × T, S ⊗ T ) → C
are functions such that E[|f(X, Y)|] < ∞ then,
\[ E[f(X, Y)|\mathcal G](\cdot) = \int_S f(x, Y(\cdot))\,\nu(\cdot, dx) \quad P\text{–a.s.} \tag{19.4} \]
\[ E[f(X, Y)] = \int_\Omega\int_S f(x, Y(\omega))\,\nu(\omega, dx)\,P(d\omega) \tag{19.5} \]
If G = σ(Y) and P[X ∈ dx|σ(Y)] = µ(Y(ω), dx) for some stochastic kernel µ from (T, T ) to
(S, S ) then,
\[ E[f(X, Y)|\sigma(Y)](\cdot) = \int_S f(x, Y(\cdot))\,\mu(Y(\cdot), dx) \quad P\text{–a.s.} \]
\[ E[f(X, Y)] = \int_\Omega\int_S f(x, Y(\omega))\,\mu(Y(\omega), dx)\,P(d\omega) \]
If X and Y are independent then, P[X ∈ dx|σ(Y)](·) = P[X ∈ dx] P–a.s.
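On a finite sample space the disintegration formula reduces to a finite double sum, which makes it easy to check by hand or by machine. The sketch below uses a hypothetical joint law of (X, Y) on {0, 1} × {0, 1} and exact rational arithmetic; the data and function names are assumptions made only for illustration.

```python
from fractions import Fraction

# Hypothetical joint law of (X, Y), stored as {(x, y): probability}.
JOINT = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

def law_of_y(joint):
    """Marginal law of Y."""
    law = {}
    for (_, y), p in joint.items():
        law[y] = law.get(y, Fraction(0)) + p
    return law

def kernel(joint):
    """Stochastic kernel mu(y, {x}) = P[X = x | Y = y], keyed by (y, x)."""
    py = law_of_y(joint)
    return {(y, x): p / py[y] for (x, y), p in joint.items()}

def expectation_direct(f, joint):
    """E[f(X, Y)] computed from the joint law."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

def expectation_disintegrated(f, joint):
    """E[f(X, Y)] as the integral over the law of Y of the inner integral against mu(Y, dx)."""
    py = law_of_y(joint)
    mu = kernel(joint)
    total = Fraction(0)
    for y, p_y in py.items():
        inner = sum(f(x, y) * q for (yy, x), q in mu.items() if yy == y)
        total += inner * p_y
    return total

if __name__ == "__main__":
    f = lambda x, y: (x + 1) * (2 * y + 3)
    print(expectation_direct(f, JOINT), expectation_disintegrated(f, JOINT))
```

Both computations return the same rational number, as the theorem predicts.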
This means that
\[ E[g(Y)|\sigma(X)] = \int_{\mathcal Y} g(y)\,\frac{f_{X,Y}(x, y)}{f_X(x)}\,\nu(dy); \]
moreover, P[Y ∈ dy|σ(X)] admits a regular version P_{Y|X=x} ≪ ν with
\[ f_{Y|X}(y|x) := \frac{dP_{Y|X=x}}{d\nu} = \frac{f_{X,Y}(x, y)}{f_X(x)} \]
Proof. We will consider only the case where T is uncountable, in which case, by the measurable
isomorphism theorem 3.9.15, there is a bijection φ : (R, B(R)) −→ (T, T ) such that φ and
φ^{−1} are measurable. Then ν(s, B) := µ(s, φ(B)) is a stochastic kernel from S to R. Let
g : S × (0, 1) → R be defined as the quantile transformation
g(s, t) = inf{x ∈ (0, 1) : ν(s, (−∞, x]) ≥ t}
Since g(s, t) ≤ x iff ν(s, (−∞, x]) ≥ t, the measurability of the map s ↦ ν(s, (−∞, x])
implies that g is S ⊗ B((0, 1)) measurable. If θ ∼ U[0, 1], then
P[g(s, θ) ≤ x] = P[θ ≤ ν(s, (−∞, x])] = ν(s, (−∞, x])
This shows that g(s, θ) ∼ ν(s, dx). Therefore, for f := φ ◦ g, f(s, θ) ∼ µ(s, dt).
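The quantile transformation used in this proof is the usual inverse-CDF construction. As a minimal sketch, assume a hypothetical kernel ν(s, ·) with finitely many atoms on R for each parameter value s; the code computes the generalized inverse exactly and checks both the equivalence "quantile ≤ x iff CDF(x) ≥ t" and the resulting law of the quantile evaluated at a uniform variable.

```python
from fractions import Fraction

# Hypothetical kernel nu(s, .) on R: for each s, a list of (atom, mass) pairs with total mass 1.
KERNEL = {
    "s1": [(-1, Fraction(1, 4)), (0, Fraction(1, 2)), (2, Fraction(1, 4))],
    "s2": [(1, Fraction(1, 10)), (3, Fraction(9, 10))],
}

def cdf(s, x):
    """nu(s, (-inf, x])."""
    return sum(mass for atom, mass in KERNEL[s] if atom <= x)

def quantile(s, t):
    """Generalized inverse: inf{x : nu(s, (-inf, x]) >= t} for 0 < t <= 1."""
    acc = Fraction(0)
    for atom, mass in sorted(KERNEL[s]):
        acc += mass
        if acc >= t:
            return atom
    raise ValueError("t must lie in (0, 1]")

def pushforward_cdf(s, x, grid=1000):
    """Fraction of grid points t = k/grid, k = 1..grid, with quantile(s, t) <= x.

    This approximates P[quantile(s, theta) <= x] for theta uniform on (0, 1).
    """
    hits = sum(1 for k in range(1, grid + 1) if quantile(s, Fraction(k, grid)) <= x)
    return Fraction(hits, grid)

if __name__ == "__main__":
    for s in KERNEL:
        print(s, [(x, cdf(s, x), pushforward_cdf(s, x)) for x in (-1, 0, 1, 2, 3)])
```

Because the masses here are multiples of 1/20, grids whose size is a multiple of 20 reproduce the CDF exactly; for general kernels the agreement would only be approximate.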
Theorem 19.5.2. (Transfer) Suppose (S, S ) is a measurable space, (T, T ) is a Borel space,
and (ξ, η) is a random variable in S × T. If {ξ̃, θ} are independent random variables in S
and [0, 1] respectively so that $\tilde\xi \overset{d}{=} \xi$ and $\theta \overset{d}{=} U[0, 1]$, then there exists a measurable function
f : S × [0, 1] −→ T such that $(\tilde\xi, f(\tilde\xi, \theta)) \overset{d}{=} (\xi, \eta)$.
Proof. Example 19.3.10 shows that there is a stochastic kernel µ from S to T such that
P[η ∈ ·|ξ] = µ(ξ, ·). Lemma 19.5.1 implies that there is a function f : S × [0, 1] → T such
that the law of f(s, θ) is µ(s, ·) for each s ∈ S. Hence, for any bounded measurable function
g on S × T we have
\[
\begin{aligned}
E[g(\tilde\xi, f(\tilde\xi, \theta))] &= E\Big[\int_0^1 g(\tilde\xi, f(\tilde\xi, u))\,du\Big] = E\Big[\int_0^1 g(\xi, f(\xi, u))\,du\Big] \\
&= E\Big[\int_T g(\xi, t)\,\mu(\xi, dt)\Big] = E\big[E[g(\xi, \eta)|\xi]\big] = E[g(\xi, \eta)]
\end{aligned}
\]
19.5. Kolmogorov’s extension theorem 575
Setting η̃ = f(ξ̃, θ), we obtain that $(\xi, \eta) \overset{d}{=} (\tilde\xi, \tilde\eta)$.
Theorem 19.5.3. (Daniell) Let {(S_n, S_n)} be a sequence of Borel spaces. Suppose that the
sequence (∏_{k=1}^{n} S_k, ⊗_{k=1}^{n} S_k, µn) is projective; that is,
\[ \mu_{n+1}(\,\cdot \times S_{n+1}) = \mu_n(\cdot), \qquad n \in \mathbb{N}. \tag{19.6} \]
Then, there exists a unique probability measure µ on (∏_n S_n, ⊗_n S_n) such that
\[ \mu\circ(p_1, \dots, p_k)^{-1} = \mu_k, \qquad k \in \mathbb{N}, \]
where (p_1, . . . , p_k) : ∏_n S_n −→ ∏_{j=1}^{k} S_j is the projection s ↦ (s_1, . . . , s_k).
Proof. Let θ := (θn) be an i.i.d. sequence of U[0, 1]–random variables defined on the
space ((0, 1), B((0, 1)), λ). Since S1 is a Borel space, there is a function f1 : (0, 1) → S1
such that $f_1(\theta_1) \overset{d}{=} \mu_1$. For n ≥ 1, suppose that we have constructed random variables
ξ1 = f1(θ1), . . . , ξn = fn(θ1, . . . , θn) on S1, . . . , Sn respectively, so that $(\xi_1, \dots, \xi_n) \overset{d}{=} \mu_n$.
Let (η1, . . . , η_{n+1}) be an arbitrary random variable in ∏_{k=1}^{n+1} S_k with $(\eta_1, \dots, \eta_{n+1}) \overset{d}{=} \mu_{n+1}$.
By (19.6), we have that $(\xi_1, \dots, \xi_n) \overset{d}{=} (\eta_1, \dots, \eta_n)$. Theorem 19.5.2 implies that there is a
measurable function f̃_{n+1} on ∏_{j=1}^{n} S_j × [0, 1] such that if ξ_{n+1} := f̃_{n+1}(ξ1, . . . , ξn, θ_{n+1}) then
$(\xi_1, \dots, \xi_{n+1}) \overset{d}{=} (\eta_1, \dots, \eta_{n+1}) \overset{d}{=} \mu_{n+1}$. The law of the sequence (ξn) satisfies the required
conditions. Uniqueness follows from Sierpinski’s monotone class theorem.
Suppose (S, S ) is a Borel space and let T be a non-empty index set, and consider
the product space (S T , S ⊗T ). Kolmogorov’s extension theorem shows that for any pro-
jective family {µI : I ⊂ T , I finite} of probability measures on S ⊗I there exists a unique
probability measure µ on S ⊗T .
The identity map in S^T is called the S–valued canonical stochastic process in T
with distribution µ, and the coordinate evaluation maps X_t(s) = s(t) are the values of the
process at t.
For any probability space (Ω, F , P), a measurable map X̃ : (Ω, F ) −→ (S^T, S^{⊗T}) is
an S–valued stochastic process in T with distribution P ◦ X̃^{−1}, and X̃_t(ω) = (X̃(ω))(t)
is the value of the process at t.
19.5.1. Weakly stationary processes. A mean zero complex valued process {Xt : t ∈
R} ⊂ L2(Ω, P) is weakly stationary if E[X_t X̄_s] = E[X_{t−s} X̄_0] for all t, s ∈ R. Denote
by ρ(h) = E[X_h X̄_0] the covariance function of X. If X is weakly stationary, then ρ is a
positive definite function, that is,
\[ \sum_{k,j=1}^{n} z_k\,\rho(t_k - t_j)\,\bar z_j \ge 0 \]
For general κ, let α and β be its real and imaginary part respectively. Since κ is positive
definite, then the function Q in R^{2n} given by
\[
\begin{aligned}
Q(u, v) &= \sum_{k,j=1}^{n} (u_k - i v_k)\,\overline{(u_j - i v_j)}\,\kappa(t_k - t_j) \\
&= \sum_{k,j=1}^{n} \Big(\alpha(t_k - t_j)(u_k u_j + v_k v_j) - \beta(t_k - t_j)(u_k v_j - u_j v_k)\Big)
\end{aligned}
\]
is a positive quadratic form. Therefore exp(−½ Q(u, v)) is the characteristic function of a
2n–dimensional Gaussian random vector (Y_t, Z_t) with
\[ E[Y_{t_k} Y_{t_j}] = E[Z_{t_k} Z_{t_j}] = \alpha(t_k - t_j) \tag{19.7} \]
\[ E[Y_{t_k} Z_{t_j}] = -E[Y_{t_j} Z_{t_k}] = -\beta(t_k - t_j). \tag{19.8} \]
If X_t = (1/√2)(Y_t + iZ_t), then E[X_{t_k} X̄_{t_j}] = κ(t_k − t_j). The laws µ_t = L(Y_t, Z_t) are projective,
and so the existence of a weakly stationary process with covariance function κ follows from
Kolmogorov’s extension theorem.
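To see the two expressions for the quadratic form Q side by side, the sketch below evaluates the Hermitian form and its real expansion in terms of α = Re κ and β = Im κ. The function κ is a hypothetical choice (the characteristic function of a two-point probability law, which is positive definite), and the conjugate on the second factor is assumed, as in the usual definition of a Hermitian form.

```python
import cmath

def kappa(h):
    """Hypothetical positive definite function: characteristic function of a two-point law."""
    return 0.3 * cmath.exp(1j * 1.0 * h) + 0.7 * cmath.exp(-1j * 2.5 * h)

def q_complex(u, v, ts):
    """Hermitian form: sum over k, j of (u_k - i v_k) * conj(u_j - i v_j) * kappa(t_k - t_j)."""
    total = 0j
    for uk, vk, tk in zip(u, v, ts):
        for uj, vj, tj in zip(u, v, ts):
            total += (uk - 1j * vk) * (uj - 1j * vj).conjugate() * kappa(tk - tj)
    return total

def q_real(u, v, ts):
    """The same form written with alpha = Re kappa and beta = Im kappa."""
    total = 0.0
    for uk, vk, tk in zip(u, v, ts):
        for uj, vj, tj in zip(u, v, ts):
            k = kappa(tk - tj)
            total += k.real * (uk * uj + vk * vj) - k.imag * (uk * vj - uj * vk)
    return total

if __name__ == "__main__":
    u, v, ts = [1.0, -2.0, 0.5], [0.3, 0.0, -1.2], [0.0, 0.7, 2.0]
    print(q_complex(u, v, ts), q_real(u, v, ts))
```

The complex form should have vanishing imaginary part, agree with the real expansion, and be nonnegative.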
Remark 19.6.2. When each P[·|A ], P ∈ P, admits a regular conditional probability (for
instance, if (Ω, F ) is a Borel space), there is a slightly stronger notion of sufficiency. A
σ–algebra A ⊂ F is said to be strongly sufficient if there is a stochastic kernel µ from
(Ω, A ) to (Ω, F ) such that
P[F|A ](ω) = µ(ω, F), F ∈ F, P ∈ P.
If A = σ(T), then µ(ω, F) = η(T(ω), F) for some stochastic kernel η from (E, E ) to (Ω, F )
(with T(Ω) ⊂ E).
A minimal sufficient σ–algebra for some population P has the minimal information
needed to conditionally reduce the entire population P.
Remark 19.6.7. If T is a minimal sufficient statistic and S is a sufficient statistic with
S = ψ(T ) for some measurable function ψ, then S is also minimal sufficient.
Lemma 19.6.8. Suppose N_{P0} = N_P for some P0 ⊂ P. If A is sufficient for P and
minimal sufficient for P0, then A is minimal sufficient for P.
Proof. If B is sufficient for P, then B is also sufficient for the smaller population P0; thus,
A ⊂ B̃_{P0} = B̃_P.
Theorem 19.6.9. (Bahadur) Let P be a population on (Ω, F ). If there is a σ–finite
measure ν on (Ω, F ) such that P ≪ ν for all P ∈ P, then there exists a minimal sufficient
statistic for P.
Proof. The Halmos–Savage theorem shows that there is a sequence {Pn} ⊂ P such that
P ≪ Σ_n c_n P_n := Q, where the c_n are positive constants with Σ_n c_n = 1. For any P ∈ P let
f_P = dP/dν and f = dQ/dν = Σ_n c_n dP_n/dν. The map
\[ T : x \mapsto \Big(\frac{f_P(x)}{f(x)}\,1_{\{f>0\}} : P \in \mathcal P\Big) \]
is measurable as a function from (Ω, F ) to the product space (R^P, B^{⊗P}). We will show that
T is a minimal sufficient statistic for P. Indeed, for each P ∈ P, let g_P be the projection
in R^P onto its P–th component. As f_P = (g_P ◦ T) f, T is sufficient. If B is another
sufficient σ–algebra then, by the factorization theorem, there are B–measurable functions
(g̃_P : P ∈ P) and a measurable function h such that f_P = g̃_P h ν–a.s. for each P ∈ P. The
uniqueness of the Radon–Nikodym derivative implies that
\[ g_P\circ T = \frac{\tilde g_P\,h}{f} = \frac{\tilde g_P}{\sum_n c_n\,\tilde g_{P_n}} \quad \nu\text{–a.s.} \]
Therefore T is B̃_P–measurable.
Proof. Lemma 4.2.2 shows that for any f ∈ A+, there are sequences {an} ⊂ R+ and
{An} ⊂ A such that f = Σ_n a_n 1_{A_n}. By assumption, there exists a sequence {Bn} ⊂ B
with {An △ Bn} ⊂ N_P so that if f̃ = Σ_n a_n 1_{B_n}, then sup_{P∈P} P[f ≠ f̃] = 0.
Consider the case where there exists a σ–finite measure ν such that P ≪ ν for all P ∈ P.
By the factorization theorem, there is h ∈ F+ such that for any P ∈ P
\[ \frac{dP}{d\nu}(x) = g_P(x)\,h(x) \]
for some g_P ∈ A+. Let G̃_P ∈ B+ be such that P[g_P ≠ G̃_P] = 0; then,
\[ \frac{dP}{d\nu}(x) = \tilde G_P(x)\,h(x). \]
By the factorization theorem, we conclude that B is sufficient.
In the general case, suppose that B is not sufficient. Then there are F ∈ F and
P1 , P2 ∈ P such that
Proof. Let P0 = {P_{θ0}, . . . , P_{θk}}. Since ν and P_θ are equivalent measures, N_{P0} =
N_P. Let η_i = η(θ_i) − η(θ_0) and ξ_i = ξ(θ_i) − ξ(θ_0). As in the proof of Theorem 19.6.9,
S(x) = [exp(η_1 · T(x) − ξ_1) . . . exp(η_p · T(x) − ξ_p)]⊺ is minimal sufficient for P0. The linear
independence of the vectors η_j implies that the p × p matrix L whose i–th row is η_i⊺
is invertible. Let g(w) = [e^{w_1} . . . e^{w_p}]⊺; then the function
G(t) = g(Lt) diag(e^{−ξ_i})
is a homeomorphism from R^p onto (0, ∞)^p and G(T(X)) = S(X). Consequently T(X) =
G^{−1}(S(X)) is minimal sufficient for P0 and, by Lemma 19.6.10, T(X) is minimal sufficient
for P.
As we pointed out above, it is desirable to have sufficient statistics that make a reduc-
tion with minimal information. The following result addresses the problem of existence of
minimal sufficient statistics under a mild condition.
Definition 19.6.12. Let P be a population on (Ω, F ). A σ–algebra A ⊂ F is said to be
complete (resp. bounded complete) for P if whenever f ∈ L1(Ω, A , P) (resp. f bounded
and A –measurable)
\[ \sup_{P\in\mathcal P}\Big|\int f\,dP\Big| = 0 \quad\text{implies}\quad \sup_{P\in\mathcal P} P(f \ne 0) = 0. \]
A statistic T is complete (resp. bounded complete) if σ(T) is complete (resp. bounded
complete).
Example 19.6.13. Consider a population of exponential type
\[ \frac{dP_\eta}{d\nu}(x) = e^{\eta\cdot T(x) - \xi(\eta)}\,\psi(x) \]
where η ∈ ∆ ⊂ R^k, and ∆ has nonempty interior. It follows from Example 19.6.11 that T
is a sufficient statistic. We prove here that T is also complete.
where τ = (ψ · ν) ◦ T^{−1} and |h| < δ for some δ > 0. The function g can be analytically
extended to B(0; δ) + iR^k. This implies that the Borel measures F_+(t)e^{η0·t} τ(dt) and
F_−(t)e^{η0·t} τ(dt) on R^k have the same characteristic function. From the uniqueness of the
characteristic function we obtain F_+ = F_− τ–a.s. Therefore F(T) = F_+(T) − F_−(T) = 0
ν–a.s.
Lemma 19.6.14. Suppose A ⊂ B̃_P. If B is complete for P, then so is A. In particular, if
T is a complete statistic and S = ψ(T) for some measurable function ψ, then S is complete.
Proof. For any f ∈ L1(A , P), there is f̃ ∈ L1(B, P) such that ‖f − f̃‖_{L1(P)} = 0.
So, if sup_{P∈P} |E_P[f]| = 0, then sup_{P∈P} |E_P[f̃]| = 0; consequently, sup_{P∈P} P[f ≠ 0] =
sup_{P∈P} P[f̃ ≠ 0] = 0.
Theorem 19.6.15. (Lehmann–Scheffé–Bahadur) Suppose there exists a minimal sufficient
σ–algebra A for P. A σ–algebra B is sufficient and complete for P iff B is minimal
sufficient for P and A is complete.
Proof. Suppose B is sufficient and complete. The minimal sufficiency of A implies that
A ⊂ B̃_P; thus, by Lemma 19.6.14, A is complete. The sufficiency of A implies that for
any B ∈ B, there is a function g_B ∈ A+ with sup_{P∈P} P[P[B|A ] ≠ g_B] = 0. Hence,
there is g̃_B ∈ B+ such that sup_{P∈P} P[g_B ≠ g̃_B] = 0. Since G = (1_B − g̃_B) ∈ L1(B, P)
and sup_{P∈P} |E_P[G]| = 0, the completeness of B implies that sup_{P∈P} P(G ≠ 0) = 0. Thus
sup_{P∈P} E_P[|1_B − g_B|] = 0, which in turn means that B ∈ Ã_P. Consequently, B ⊂ Ã_P
and thus, B is minimal sufficient.
Definition 19.6.16. Given a population P on (Ω, F ), a statistic V : (X , B) → (V, D) of
X is said to be ancillary if there is a probability measure µ on (V, D) such that
P[V (X) ∈ D] = µ(D), P ∈ P, D ∈ D
Example 19.6.17. Given a fixed σ² > 0, consider the family P of normal distributions Pµ
with mean µ ∈ R and variance σ². If X ∼ N(µ, σ²), then V(X) = X − µ ∼ N(0, σ²)
and so, V is ancillary for P.
Theorem 19.6.18. (Basu) Let T and V be two statistics of X from the population P. If
T is boundedly complete and sufficient and V is ancillary, then T and V are independent.
Proof. Let µ be a probability measure on (V, D) such that P(V(X) ∈ D) = µ(D) for all P ∈ P and D ∈ D. It is enough to show that $E_P[\mathbb{1}(V(X) \in D)\,|\,T] - \mu(D) = 0$ P–a.s. for each P ∈ P and all D ∈ D.
Sufficiency of T and ancillarity of V imply that $\psi_D(T) = E[\mathbb{1}(V \in D)\,|\,T] - P(V \in D)$ is bounded P–a.s. and does not depend on P. Taking expectations we have
\[ E_P[\psi_D(T)] = 0, \qquad P \in \mathcal{P}. \]
Hence, by bounded completeness, $\psi_D(T) = 0$ P–a.s., that is, $P[V(X) \in D\,|\,T] = P[V(X) \in D]$ for all P ∈ P and D ∈ D.
Example 19.6.19. Let σ² > 0 be fixed. Let {Xj : 1 ≤ j ≤ n} be an i.i.d. sample from a normal population {Pµ : µ ∈ R} with mean (unknown) µ and variance (known) σ². Then, $\bar X := \frac{1}{n}\sum_{j=1}^n X_j$ is sufficient and complete. The statistic $S^2(X) := \frac{1}{n-1}\sum_{j=1}^n (X_j - \bar X)^2$ is
19.7. Bayes model and conjugate priors 583
ancillary since, as in Example 19.6.17, its distribution does not depend on the value of µ. Therefore, $\bar X$ and $S^2$ are independent. Notice that
\[ n\Big(\frac{\bar X - \mu}{\sigma}\Big)^2 + \frac{(n-1)S^2}{\sigma^2} = \sum_{j=1}^n \Big(\frac{X_j - \mu}{\sigma}\Big)^2. \tag{19.11} \]
The first term on the left–hand side of (19.11) is the square of a normal N(0, 1) random variable and so, it has $\chi^2_1$ distribution. The term on the right–hand side of (19.11) is the sum of the squares of n i.i.d. normal N(0, 1) random variables, and so it has distribution $\chi^2_n$. The independence of $\bar X$ and $S^2$ implies that $\frac{(n-1)S^2}{\sigma^2}$ has characteristic function $(1 - 2it)^{-(n-1)/2}$, which means that it has $\chi^2_{n-1}$ distribution.
Disintegration implies that the conditional distribution of X given Θ exists and is given by
P[X ∈ dx|Θ = θ] = Pθ (dx).
Consider a parametric model {Pθ : θ ∈ ∆} such that Pθ ≪ µ for all θ ∈ ∆, where µ is a σ–finite measure on F. Suppose there is a function f : (X × Θ, F ⊗ B) → (R+, B(R+)) with
\[ \frac{dP_\theta}{d\mu}(x) = f_\theta(x) := f(x; \theta). \]
Then, the function L(θ; x) = fθ(x) is called the likelihood function and ℓ(θ; x) := log(fθ(x)) is called the log–likelihood function. Maximum likelihood estimators are solutions to the problem
\[ \hat\theta(x) := \arg\max_{\theta\in\Delta} L(\theta; x) = \arg\max_{\theta\in\Delta} \ell(\theta; x). \]
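As a numerical illustration of the definition above, the following minimal sketch maximizes the log–likelihood on a grid and compares the result with a closed form. The exponential model $f_\theta(x) = \theta e^{-\theta x}$, the sample, and the helper names `log_likelihood` and `grid_mle` are assumptions made only for this illustration; they do not appear in these notes.

```python
import math

def log_likelihood(theta, xs):
    """l(theta; x) = sum_j log f_theta(x_j) for f_theta(x) = theta * exp(-theta * x)."""
    return sum(math.log(theta) - theta * x for x in xs)

def grid_mle(xs, grid):
    """Return the grid point maximizing the log-likelihood (a crude arg max)."""
    return max(grid, key=lambda th: log_likelihood(th, xs))

# Hypothetical sample; any positive numbers would do.
sample = [0.5, 1.5, 1.0, 2.0, 1.0]
grid = [k / 1000 for k in range(1, 5001)]   # candidate values of theta in (0, 5]
theta_hat = grid_mle(sample, grid)
closed_form = len(sample) / sum(sample)     # for this model the MLE is 1 / (sample mean)
```

For this model the log–likelihood is concave in θ, so the grid maximizer sits within one grid step of the closed form.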
Proof. From Theorem 10.6.5 we know that ∆ is convex and that Λ is finite and convex on ∆. Clearly, if θ ∉ ∆ then Λ(θ) = +∞. Hence Λ is proper and convex.
Proof. As Λ is proper, lower semicontinuous and convex on Rⁿ, the conclusion follows from Fenchel–Legendre's duality theorem B.2.14[(iv)].
Definition 19.7.3. A family of exponential type {Pθ : θ ∈ ∆} relative to a Borel measure µ on Rⁿ is said to be full if
(A) $\Delta^o \neq \emptyset$.
(B) For any v ∈ Rⁿ \ {0} and r ∈ R, 0 ≤ µ({x : v · x = r}) < 1.
(C) C = co(supp(µ)) has nonempty interior in Rⁿ.
Assumptions (A), (B) and (C) guarantee that the model {Pθ : θ ∈ ∆} is truly n–dimensional and that if Pθ = Pη, then θ = η.
Proof. Without loss of generality, we may assume that µ is in fact a probability measure. Indeed, fix θ0 ∈ ∆. As $P_{\theta_0}$ and µ are equivalent measures, they have the same support S. Then, by shifting ∆ to ∆′ := ∆ − θ0, we can consider the exponential model $P'_{\theta'}(dx) = e^{\theta'\cdot x - \Lambda'(\theta')} P_{\theta_0}(dx)$, where θ′ ∈ ∆′ and $\Lambda'(\theta') = \Lambda(\theta' + \theta_0) - \Lambda(\theta_0)$.
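If t ∉ C then, by Theorem 12.10.15[(iii)], there exist v ∈ Rⁿ and real constants α < β such that
\[ v \cdot x \le \alpha < \beta < v \cdot t, \qquad x \in C. \]
Hence, for λ > 0,
\[ e^{-\lambda v\cdot t} \int e^{\lambda v\cdot x}\, \mu(dx) \le e^{-\lambda(\beta - \alpha)} \to 0 \]
as λ → ∞.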
Suppose that $t \in C^o$. Then, there is a finite set $\{s_j : j = 1, \dots, m\} \subset \mathrm{supp}(\mu)$ such that $t \in \mathrm{co}(s_j : j = 1, \dots, m)^o$. For each $u \in S^{n-1}$ let $H_u(t)$ be the hyperplane through t with normal u. The function $\rho : u \mapsto \max\{d(s_j, H_u) : j = 1, \dots, m\}$ is clearly continuous, and so attains its minimum at some point $u_0 \in S^{n-1}$. Since $H_u$ is an affine space of dimension (n − 1), $\rho_0 = \rho(u_0) > 0$. This means that for any $u \in S^{n-1}$, the half–space $H_u^+(t) := \{x : u \cdot x \ge u \cdot t\}$ contains at least one of the balls $B(s_j; \rho_0)$, each of which has positive measure under µ. Hence
\[ \xi(t) := \inf_{u\in S^{n-1}} \mu(H_u^+(t)) \ge \min_{1\le j\le m} \mu\big(B(s_j; \rho_0)\big) > 0. \]
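Example 19.7.6. Consider the normal distribution N(µ, σ²) where both µ and σ² are unknown. Then
\[ \varphi_{\mu,\sigma^2}(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} = \exp\Big( \frac{\mu}{\sigma^2}\, x - \frac{1}{2\sigma^2}\, x^2 - \Big( \frac{1}{2}\log(2\pi\sigma^2) + \frac{\mu^2}{2\sigma^2} \Big) \Big). \]
Let $T_1(x) = x$, $T_2(x) = x^2$, $\theta_1 = \frac{\mu}{\sigma^2}$, and $\theta_2 = -\frac{1}{2\sigma^2}$. Then, the normal distribution has a (natural) exponential representation
\[ f_{\theta_1,\theta_2}(t_1, t_2) = \exp\big(\theta\cdot t - K(\theta)\big)\, \nu(dt) \]
where θ ∈ R × (−∞, 0), $K(\theta) = \frac{1}{2}\log\big(\frac{\pi}{-\theta_2}\big) - \frac{\theta_1^2}{4\theta_2}$, and ν is a measure on R² supported on the parabola $t_2 = t_1^2$. The conjugate measure is of the form
\[ p_{a,b}(\theta) = D(a, b)\, e^{\theta\cdot b - aK(\theta)}. \]
By Theorem 19.7.5, the domain of conjugacy E contains the set $\{(a, b) : a > 0,\ ab_2 > b_1^2\}$. This can be seen directly from
\[ \begin{aligned}
\int_{\mathbb{R}\times(-\infty,0)} e^{b_1\theta_1 + b_2\theta_2 - aK(\theta_1,\theta_2)}\, d\theta_1\, d\theta_2
&= \frac{1}{\pi^{a/2}} \int_0^\infty e^{-b_2 s} s^{a/2} \int_{-\infty}^\infty e^{b_1\theta_1 - \frac{a\theta_1^2}{4s}}\, d\theta_1\, ds \\
&= \frac{2}{\sqrt{\pi^{a-1} a}} \int_0^\infty s^{\frac{a+3}{2} - 1} \exp\Big( -s\Big( b_2 - \frac{b_1^2}{a} \Big) \Big)\, ds \\
&= \frac{2}{\sqrt{\pi^{a-1} a}}\, \frac{\Gamma\big(\frac{a+3}{2}\big)}{\big( b_2 - \frac{b_1^2}{a} \big)^{\frac{a+3}{2}}} = \frac{1}{D(a, b)},
\end{aligned} \]
which implies a > 0 and $ab_2 > b_1^2$. To obtain the conjugate measure in terms of the original parameters (µ, σ²), we apply the change of variables formula for integration. Consider the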
19.8. Information inequality 587
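change of variable $(\theta_1, \theta_2) = G(\mu, \sigma^2) = \big( \frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2} \big)$ on R × (0, ∞). Then,
The expression $q_{a,b}(\mu, \sigma^2)$ can be interpreted as follows: given σ², the distribution of µ is normal $N\big( \frac{b_1}{a}, \frac{\sigma^2}{a} \big)$; while marginally, σ² has inverse–gamma distribution $\mathrm{Ig}\big( \frac{a+3}{2}, \frac{1}{2}\big( b_2 - \frac{b_1^2}{a} \big) \big)$.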
The function s : (x, θ) 7→ ∂θ log(fθ (x)) is called the score function. From (19.13) and
(19.14) we have that ∂θ g(θ) = cov(T, s⊤θ ) = Eθ [(T − g(θ))(sθ − Eθ (sθ ))].
Theorem 19.8.1. For any real valued functions ψ1, . . . , ψk on X × ∆ such that ψi(·, θ) ∈ L2(fθ dµ) for each θ ∈ ∆, define
C(θ) = varθ (ψ) = Eθ [(ψ − Eθ [ψ])(ψ − Eθ [ψ])⊤ ] = covθ (ψi , ψj )
γ(θ) = covθ (ψ, T ) = Eθ [(ψ − Eθ [ψ])(T − Eθ [T ])⊤ ] = covθ (ψi , Tj )
Proof. Choosing an arbitrary v ∈ Rp and considering v⊤ g(θ) instead of g(θ) shows that it
suffices to consider the case p = 1.
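For any $a \in \mathbb{R}^k$, the Cauchy–Schwarz inequality shows that
\[ \mathrm{var}_\theta(T) \ge \frac{\mathrm{cov}_\theta(T, a^\top\psi)^2}{\mathrm{var}_\theta(a^\top\psi)}. \tag{19.16} \]
Since $\mathrm{cov}_\theta(T, a^\top\psi) = a^\top\gamma(\theta)$ and $\mathrm{var}_\theta(a^\top\psi) = a^\top C(\theta) a$, we conclude that
\[ \mathrm{var}_\theta(T) \ge \sup_{a\neq 0} \frac{a^\top\gamma(\theta)\gamma^\top(\theta) a}{a^\top C(\theta) a} = \rho(\theta). \tag{19.17} \]
From the theory of symmetric matrices, we know that ρ equals the largest eigenvalue of the matrix $C^{-1}\gamma\gamma^\top$, which has the same eigenvalues as $C^{-1/2}\gamma\gamma^\top C^{-1/2} = \big(C^{-1/2}\gamma\big)\big(C^{-1/2}\gamma\big)^\top$.
Therefore, $\rho = \gamma^\top C^{-1}\gamma$ and (19.15) follows.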
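The identity $\sup_{a\neq 0} (a^\top\gamma)^2 / (a^\top C a) = \gamma^\top C^{-1}\gamma$ used in the last step can be checked numerically. The sketch below does so in dimension k = 2, scanning unit vectors a on a fine grid of angles; the matrix `C`, the vector `gamma` and the helper names are hypothetical choices made for illustration only.

```python
import math

def quad_ratio(a, gamma, C):
    """(a^T gamma)^2 / (a^T C a) for 2-vectors a, gamma and a 2x2 matrix C."""
    num = (a[0] * gamma[0] + a[1] * gamma[1]) ** 2
    den = (C[0][0] * a[0] ** 2 + (C[0][1] + C[1][0]) * a[0] * a[1]
           + C[1][1] * a[1] ** 2)
    return num / den

def rho_closed_form(gamma, C):
    """gamma^T C^{-1} gamma, using the explicit inverse of a 2x2 matrix."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    inv = [[C[1][1] / det, -C[0][1] / det],
           [-C[1][0] / det, C[0][0] / det]]
    v = [inv[0][0] * gamma[0] + inv[0][1] * gamma[1],
         inv[1][0] * gamma[0] + inv[1][1] * gamma[1]]
    return gamma[0] * v[0] + gamma[1] * v[1]

# Hypothetical positive definite C and vector gamma.
C = [[2.0, 0.5], [0.5, 1.0]]
gamma = [1.0, -1.0]
# The ratio is invariant under scaling of a, so angles in [0, pi) suffice.
angles = [math.pi * k / 20000 for k in range(20000)]
rho_sup = max(quad_ratio((math.cos(t), math.sin(t)), gamma, C) for t in angles)
rho = rho_closed_form(gamma, C)
```

The grid supremum never exceeds the closed form, and approaches it as the grid is refined.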
where $I(\theta) = E_\theta[\partial_\theta^\top \log(f_\theta)\, \partial_\theta \log(f_\theta)]$. In Statistics, I(θ) is referred to as Fisher's information matrix, and (19.18) as the Cramér–Rao information inequality.
19.9. Exercises
Exercise 19.9.1. Suppose that $\mathscr{A} = \sigma(\{A_1, \dots, A_n\})$ where the sets $\{A_j\}$ form a pairwise disjoint measurable partition of Ω. Show that for any $f \in L_1(\mathbb{P})$, $E[f|\mathscr{A}] = \sum_{j=1}^n a_j \mathbb{1}_{A_j}$, where $a_j = \frac{E[f \mathbb{1}_{A_j}]}{\mathbb{P}[A_j]}$ if $\mathbb{P}[A_j] > 0$, or $a_j = 0$ otherwise.
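To make the formula of Exercise 19.9.1 concrete, here is a minimal sketch on a finite sample space, using exact rational arithmetic. The function name `cond_exp_on_partition` and the die example are illustrative assumptions, not part of the exercise.

```python
from fractions import Fraction

def cond_exp_on_partition(f, prob, partition):
    """Return E[f | A] as a dict omega -> value, where A is generated by `partition`.

    f, prob   : dicts mapping each outcome omega to f(omega) and P[{omega}].
    partition : list of disjoint sets of outcomes covering the sample space.
    """
    result = {}
    for block in partition:
        p_block = sum(prob[w] for w in block)
        # a_j = E[f 1_{A_j}] / P[A_j] if P[A_j] > 0, and 0 otherwise
        a_j = sum(f[w] * prob[w] for w in block) / p_block if p_block > 0 else Fraction(0)
        for w in block:
            result[w] = a_j
    return result

# Hypothetical example: a fair die, f(omega) = omega, partition into odd and even faces.
prob = {w: Fraction(1, 6) for w in range(1, 7)}
f = {w: Fraction(w) for w in range(1, 7)}
g = cond_exp_on_partition(f, prob, [{1, 3, 5}, {2, 4, 6}])
```

In this example the conditional expectation is constant on each block (3 on odd faces, 4 on even faces) and has the same mean as f.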
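Exercise 19.9.2. For any pair of measurable sets A and B, $\mathbb{P}[A|B] := \frac{\mathbb{P}[A\cap B]}{\mathbb{P}[B]}$ if $\mathbb{P}[B] > 0$, or $\mathbb{P}[A|B] = 0$ otherwise. Let $\mathscr{A} \subset \mathscr{F}$ be a sub–σ–algebra and $A \in \mathscr{A}$.
(i) (Bayes's formula) Suppose that $\mathbb{P}[B] > 0$. Show that
\[ \mathbb{P}[A|B] = \frac{E\big[\mathbb{1}_A E[\mathbb{1}_B|\mathscr{A}]\big]}{E\big[E[\mathbb{1}_B|\mathscr{A}]\big]}. \]
(ii) If $\mathscr{A}$ is generated by a partition $\{A_1, \dots, A_n\}$, show that
\[ \mathbb{P}[A_k|B] = \frac{\mathbb{P}[B|A_k]\,\mathbb{P}[A_k]}{\sum_{j=1}^n \mathbb{P}[B|A_j]\,\mathbb{P}[A_j]}. \]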
Exercise 19.9.5. Let (T, T ) and (U, U ) be Borel spaces and let µ and ν be probability
measures on (T × U, T ⊗ U). Assume that µ ≪ ν and that
\[ \frac{d\mu}{d\nu}(t, u) = a(t)\, b(u). \]
Let $\mu_T$ and $\nu_T$ be the marginals on (T, T) of µ and ν respectively. Similarly, let $\mu_{U|T}$ and $\nu_{U|T}$ be the regular conditional probabilities of U given T with respect to µ and ν respectively. Show that
(i) $\mu_T \ll \nu_T$ and
\[ \frac{d\mu_T}{d\nu_T}(t) = a(t) \int_U b(u)\, \nu_{U|T}(du|t). \]
(ii) $\mu_{U|T} \ll \nu_{U|T}$ and
\[ \frac{d\mu_{U|T}}{d\nu_{U|T}}(u|t) = \frac{b(u)}{\int_U b(u)\, \nu_{U|T}(du|t)}. \]
(Hint: Consider $g(t, u) = \mathbb{1}_A(t)\mathbb{1}_B(u)$, compute $E_\mu[g]$ and apply disintegration.)
(c) There is a σ–finite kernel µ from $\mathbb{R}^m$ to $\mathbb{R}^k$ such that for $\theta \in \pi_k(\Delta)$, the conditional distribution $P^{T|U}_{\theta,\psi}$ of T given U is of exponential type relative to µ(U, ·) and
\[ dP^{T|U}_{\theta,\psi}(dt|U) = \bar c_\theta(y)\, e^{\theta\cdot t}\, \mu(U, dt). \]
Conclude that for θ ∈ πk (∆) fixed, U is a sufficient statistic for {Pθ,ψ : ψ ∈ ∆θ }.
Exercise 19.9.8. Assume Pθ ≪ µ for all θ ∈ ∆, where µ is a σ–finite measure on F, and let $f_\theta = \frac{dP_\theta}{d\mu}$.
(a) Show that the marginal $P_X$ of X is absolutely continuous with respect to µ and has density
\[ m(x) := \frac{dP_X}{d\mu} = \int_\Delta f_\theta(x)\, \Pi(d\theta). \]
(b) Show that the conditional distribution P[Θ ∈ dθ|X], called the posterior distribution, exists, is absolutely continuous with respect to Π, and
\[ \frac{P[\Theta \in d\theta|X]}{d\Pi} = \frac{f_\theta(X)}{m(X)}. \]
(c) If Π ≪ τ for some σ–finite measure τ on (∆, B) and $\pi = \frac{d\Pi}{d\tau}$, show that the posterior distribution P[Θ ∈ dθ|X] ≪ τ and
\[ \frac{P[\Theta \in d\theta|X]}{d\tau} = \frac{f_\theta(X)\,\pi(\theta)}{m(X)}. \]
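The posterior formula in part (c) of Exercise 19.9.8 can be sketched in code when ∆ is a finite set and τ is counting measure, so that π(θ) is just the prior weight of θ. The Bernoulli likelihood, the three candidate parameters and the function names below are assumptions chosen for illustration.

```python
from fractions import Fraction

def posterior(prior, likelihood, x):
    """Posterior weights f_theta(x) * pi(theta) / m(x) on a finite parameter set."""
    m_x = sum(likelihood(theta, x) * w for theta, w in prior.items())  # marginal m(x)
    return {theta: likelihood(theta, x) * w / m_x for theta, w in prior.items()}

def bernoulli_likelihood(theta, x):
    """f_theta(x) for a sequence x of 0/1 outcomes with success probability theta."""
    k = sum(x)
    return theta ** k * (1 - theta) ** (len(x) - k)

# Hypothetical setup: a uniform prior on three candidate success probabilities.
thetas = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {t: Fraction(1, 3) for t in thetas}
post = posterior(prior, bernoulli_likelihood, [1, 1, 0])
```

Dividing by the marginal m(x) is what makes the posterior weights sum to one.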
Exercise 19.9.9. Consider the normal distribution N(µ, σ²) where µ is a fixed known number and σ² is unknown. Show that this distribution admits a natural exponential representation of the form
\[ f_\theta(x) = e^{\theta T(x) - \frac{1}{2}\log\left(\frac{\pi}{-\theta}\right)} \]
where $T(x) = (x - \mu)^2$, and $\theta = -\frac{1}{2\sigma^2}$. Show that the conjugate prior has density w.r.t. Lebesgue measure on (0, ∞) given by
\[ g_{a,b}(\sigma^2) = \frac{\big(\frac{b}{2}\big)^{\frac{a+3}{2}}}{\Gamma\big(\frac{a+3}{2}\big)} \Big( \frac{1}{\sigma^2} \Big)^{\frac{a+3}{2} + 1} \exp\Big( -\frac{b}{2}\, \frac{1}{\sigma^2} \Big) \]
and that its domain of conjugacy E = {(a, b) : a > 0, b > 0}. This means that the conjugate prior is distributed as an inverse-gamma $\mathrm{Ig}\big( \frac{a+3}{2}, \frac{b}{2} \big)$.
Chapter 20
Martingales
The filtration {FtX : t ∈ T}, called the natural history of X, is the smallest filtration
with respect to which X is adapted.
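Definition 20.1.3. When $\mathbb{T} = [0, \infty)$, $\hat B = \big(\Omega \times [0, \infty), \mathscr{F} \otimes \mathscr{B}([0, \infty))\big)$ is referred to as the base space. Suppose $(\Omega, \mathscr{F}_t)_{t\ge 0}$ is a filtered space. A process X : Ω × [0, ∞) → S taking values on a measurable space $(S, \mathscr{S})$ is progressively measurable if for each t ≥ 0 the process defined by $X^t : (\omega, s) \mapsto X_{t\wedge s}(\omega)$ is $\mathscr{F}_t \otimes \mathscr{B}([0, \infty))$–$\mathscr{S}$–measurable.
Remark 20.1.4. A set Γ ⊂ Ω × [0, ∞) is progressively measurable if the process $\mathbb{1}_\Gamma$ is progressively measurable. Thus, Γ is progressively measurable iff $\Gamma \cap (\Omega \times [0, t]) \in \mathscr{F}_t \otimes \mathscr{B}([0, t])$ for all t ≥ 0.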
Loosely speaking, progressive measurability means that the information on the evolution
of the process X up to time t is contained in Ft .
Theorem 20.1.5. Let X be a stochastic process in a filtered space (Ω, {Ft : t ∈ T}).
(i) If X is progressively measurable, then it is adapted.
Suppose X takes values on a metric space.
(ii) If X is left–continuous and adapted, then X is progressively measurable.
(iii) If X is right–continuous and adapted, then X is progressively measurable.
(iv) If {Xn : n ∈ N} is a sequence of progressively measurable processes converging pointwise to X, then X is progressively measurable.
Proof. (i) Suppose $A \in \mathscr{S}$. For any t ≥ 0 fixed, $(X^t)^{-1}(A) \in \mathscr{F}_t \otimes \mathscr{B}([0, \infty))$. As the t–cross section $\big((X^t)^{-1}(A)\big)_t = (X_t)^{-1}(A)$, the conclusion follows from Lemma 9.4.1.
(iv) For any t ≥ 0, $\{(X_n)^t : n \in \mathbb{N}\}$ converges to $X^t$ pointwise. The conclusion follows from Theorem 3.6.8.
20.1. Measurability concepts for stochastic processes 593
FtP contains the completion of Ft and if P is the collection of all probability measures
on (Ω, F ) then FtP contains the universal completion of Ft .
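Lemma 20.1.8. $\{\mathscr{F}_t^P : t \in \mathbb{T}\}$ is a filtration and the restriction of P to $\mathscr{F}_t^P$ is complete for each $t \in \mathbb{T}$. Moreover, $\mathscr{F}_t \subset \mathscr{F}_t^P = \sigma(\mathscr{F}_t \cup \mathscr{N}_P)$ and $\mathscr{F}_\infty^P := \sigma\big( \bigcup_{t\in\mathbb{T}} \mathscr{F}_t^P \big) = \sigma(\mathscr{F}_\infty \cup \mathscr{N}_P)$.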
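Proof. We first show that $\mathscr{F}_t^P$ is a σ–algebra. Since $A \triangle B = A^c \triangle B^c$, it follows that $\mathscr{F}_t^P$ is closed under taking complements. For $\{A_n\} \subset \mathscr{F}_t^P$, let $\{A_n^P\} \subset \mathscr{F}_t$ be such that $A_n \triangle A_n^P \subset N_n \in \mathscr{A}_{\infty\sigma}$ and $P[N_n] = 0$. Then, $\big( \bigcup_n A_n \big) \triangle \big( \bigcup_n A_n^P \big) \subset \bigcup_n (A_n \triangle A_n^P) \subset \bigcup_n N_n \in \mathscr{A}_{\infty\sigma}$. Therefore, $\bigcup_n A_n \in \mathscr{F}_t^P$.
To show that $\mathscr{F}_t^P$ is complete, it is enough to show that if $E \in \mathscr{F}_t^P$ and P[E] = 0 then $E \in \mathscr{N}_P$. In such case, there is $E^P \in \mathscr{F}_t$ and $N \in \mathscr{A}_{\infty\sigma}$ with P[N] = 0 such that $E^P \triangle E \subset N$, which is equivalent to $E^P \setminus N \subset E \subset E^P \cup N \in \mathscr{A}_{\infty\sigma}$. As $P[E^P] = P[E] = 0$, it follows that $E \in \mathscr{N}_P$. The last statement is evident.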
Lemma 20.1.9. The right–continuous augmentation of the P–regularization of {Ft : t ≥ 0}
is the same as the P–regularization of the right–continuous augmentation {Ft+ : t ≥ 0}.
Proof. First notice that ∪t Ft+ = ∪t ∩u>t Fu = ∪t Ft . Therefore the filtrations Ft and
Ft+ have the same nearly empty sets.
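Denote by $\mathscr{G}_t = \mathscr{F}_{t+}$ and by $\mathscr{H}_t = \mathscr{F}_t^P$. We want to show that $\mathscr{G}_t^P = \mathscr{H}_{t+}$. If $A \in \mathscr{H}_{t+}$, then $A \in \mathscr{F}_u^P$ for all u > t. Hence, for each n there is $A_n \in \mathscr{F}_{t+1/n}$ such that $A \triangle A_n \in \mathscr{N}_P$. Let $\tilde A = \bigcap_n \bigcup_{m\ge n} A_m$; then $\tilde A \in \mathscr{G}_t$ and $A \triangle \tilde A \subset \bigcup_n A \triangle A_n \in \mathscr{N}_P$. Thus, $\mathscr{H}_{t+} \subset \mathscr{G}_t^P$.
Conversely, if $A \in \mathscr{G}_t^P$, then there is $A' \in \mathscr{G}_t$ such that $A \triangle A' \in \mathscr{N}_P$. Since $A' \in \mathscr{F}_u$ for all u > t, then $A \in \mathscr{F}_u^P = \mathscr{H}_u$ for all u > t. It follows that $A \in \mathscr{H}_{t+}$, that is, $\mathscr{G}_t^P \subset \mathscr{H}_{t+}$.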
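Definition 20.1.10. Suppose $(\Omega, \mathscr{F}, P)$ is a probability space. The natural augmentation of a filtration $\{\mathscr{F}_t : t \in \mathbb{T}\}$ is defined as $\{\mathscr{F}_t^P : t \in \mathbb{T}\}$ if $\mathbb{T} \subset \mathbb{Z}_+$ or $\{(\mathscr{F}^P)_{t+} : t \in \mathbb{T}\}$ (equivalently $\{(\mathscr{F}_{t+})^P : t \in \mathbb{T}\}$) if $\mathbb{T} = [0, \infty)$. A filtration satisfies the natural conditions if it is equal to its natural augmentation.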
Suppose T and S are stopping and optional times with respect to {Ft : t ∈ T} respectively.
The collections of sets
FT := {A ∈ F : A ∩ {T ≤ t} ∈ Ft , ∀t ∈ T}
20.2. Stopping times 595
and
FS+ := {A ∈ F : A ∩ {S < t} ∈ Ft , ∀t ∈ T}
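are sub σ–algebras of $\mathscr{F}$. Indeed, clearly $\Omega \in \mathscr{F}_T$. Since $\{T \le t\} \cap \bigcup_n A_n = \bigcup_n \{T \le t\} \cap A_n$, it follows that $\mathscr{F}_T$ is closed under countable unions. If $A \in \mathscr{F}_T$ then, as $A^c \cap \{T \le t\} = \{T \le t\} \setminus \big( A \cap \{T \le t\} \big)$, $A^c \in \mathscr{F}_T$. Similar arguments show that $\mathscr{F}_{S+}$ is a sub σ–algebra of $\mathscr{F}$.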
Remark 20.2.2. Since {T ≤ t} ∩ {T ≤ s} = {T ≤ t ∧ s} ∈ Ft∧s , T is FT –measurable. If
T ≡ t, show that FT = Ft . A similar argument shows that S is FS+ measurable.
Clearly, any stopping time is an optional time; however, the converse statement depends
upon the continuity properties of the filtration Ft .
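Lemma 20.2.3. Suppose $\mathbb{T} = [0, \infty)$. T is an $\mathscr{F}$–optional time iff T is an $\mathscr{F}_+$–stopping time. In such case,
\[ \bigcap_{h>0} \mathscr{F}_{T+h} = \mathscr{F}_{T+} = \big\{ A \in \mathscr{F}_\infty : A \cap \{T \le t\} \in \mathscr{F}_{t+} \text{ for all } t \ge 0 \big\}. \]
The first statement follows by letting A = Ω. To prove the last statement, observe that $A \in \bigcap_{h>0} \mathscr{F}_{T+h}$ iff $A \cap \{T + h \le t\} \in \mathscr{F}_t$ for all t ≥ 0 and h > 0. This is equivalent to $A \cap \{T \le t\} \in \mathscr{F}_{t+h}$ for all t ≥ 0 and h > 0. Hence $A \in \bigcap_{h>0} \mathscr{F}_{T+h}$ iff $A \cap \{T \le t\} \in \mathscr{F}_{t+}$ for all t ≥ 0.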
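Proof. (i) For any $t \in \mathbb{T}$, the map $\Phi_{T,t} : (\omega, s) \mapsto (\omega, T(\omega) \wedge t \wedge s)$ is $\mathscr{F}_t \otimes \mathscr{B}([0, \infty))$–$\mathscr{F}_t \otimes \mathscr{B}([0, \infty))$ measurable. Indeed, for any u ≥ 0
\[ \{T \wedge t \le u\} = \begin{cases} \Omega & \text{if } t \le u \\ \{T \le u\} & \text{if } u < t \end{cases} \]
which means that $\{T \wedge t \le u\} \in \mathscr{F}_t$ for all u ≥ 0. Hence, for any $A \in \mathscr{F}_t$
\[ \Phi_{T,t}^{-1}(A \times [0, u]) = \big( A \times [0, u] \big) \cup \big( (A \cap \{T \wedge t \le u\}) \times (u, \infty) \big) \in \mathscr{F}_t \otimes \mathscr{B}([0, \infty)) \]
for all t ≥ 0.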
Theorem 20.2.6. Suppose T, S and {Tn} are stopping times. Then,
(i) S + T , S ∧ T , S ∨ T and supn Tn are stopping times.
(ii) If S ≤ T , then FS ⊂ FT , and S is FT –measurable.
(iii) A ∩ {S ≤ T } ∈ FT for all A ∈ FS .
(iv) In addition, if Ft is right–continuous then inf n Tn is a stopping time.
Proof. We consider only the case $\mathbb{T} = [0, \infty)$. The case $\mathbb{T} = \mathbb{Z}_+$ is left as an exercise.
(i) The conclusion $\{T + S > t\} \in \mathscr{F}_t$ for each t ≥ 0 follows directly from the identity
\[ \{S + T > t\} = \{S > t\} \cup \{T > t\} \cup \bigcup_{q\in\mathbb{Q}\cap(0,t]} \big( \{q < T \le t\} \cap \{t - q < S \le t\} \big). \]
The last statement follows from $\{\sup_n T_n \le t\} = \bigcap_n \{T_n \le t\}$.
(iv) If {Tn : n ∈ N} is a sequence of optional times, then $\{\inf_n T_n < t\} = \bigcup_n \{T_n < t\} \in \mathscr{F}_t$ for all t ≥ 0. Hence $T = \inf_n T_n$ is an optional time. By Lemma 20.2.3, if $\{\mathscr{F}_t : t \ge 0\}$ is right continuous, then T is a stopping time.
Corollary 20.2.7. Suppose T and S are stopping times. Then,
(i) FS∧T = FS ∩ FT .
(ii) Each of the events {T < S}, {S < T }, {T ≤ S}, {S ≤ T }, and {T = S} belong to
FS∧T .
(iii) {S = T } ∩ FS = {T = S} ∩ FT .
The next result shows that a stopping time can be approximated from above by stopping
times that take countably many values.
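Lemma 20.2.9. Suppose $\mathbb{T} = [0, \infty)$. Let T be a stopping time w.r.t. some filtration $\mathscr{F}_t$. Let
\[ T_n = 2^{-n}\big( [2^n T] + 1 \big) \mathbb{1}_{\{T<\infty\}} + \infty\, \mathbb{1}_{\{T=\infty\}}. \]
Then {Tn} is a sequence of stopping times and $T_n \searrow T$. Moreover, if $\mathscr{F}_t$ is right–continuous, then $\mathscr{F}_T = \bigcap_n \mathscr{F}_{T_n}$.
Proof. If $k = [2^n t]$, then $\{T_n \le t\} = \{T < k/2^n\} \in \mathscr{F}_{k/2^n} \subset \mathscr{F}_t$. As dyadic numbers are dense in R, $T_n \searrow T$.
Clearly $\mathscr{F}_T \subset \bigcap_n \mathscr{F}_{T_n}$. Suppose $A \in \bigcap_n \mathscr{F}_{T_n}$. From
\[ A \cap \{T < t\} = \bigcup_n A \cap \{T_n < t\} \in \mathscr{F}_t \]
we obtain $A \in \mathscr{F}_{T+}$. Therefore, if $\mathscr{F}_t$ is right–continuous, we conclude that $\mathscr{F}_T = \bigcap_n \mathscr{F}_{T_n}$.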
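The dyadic approximation of Lemma 20.2.9 is easy to compute for a fixed value T(ω). The following minimal sketch evaluates $T_n(\omega)$ for a hypothetical value T(ω) = 0.3 and shows the sequence decreasing to it from above; the function name `dyadic_upper` is an assumption for illustration.

```python
import math

def dyadic_upper(t, n):
    """T_n = 2^{-n} ([2^n T] + 1) for finite T, and +infinity when T is infinite."""
    if math.isinf(t):
        return math.inf
    return (math.floor(2 ** n * t) + 1) / 2 ** n

t = 0.3   # a hypothetical value T(omega)
approximations = [dyadic_upper(t, n) for n in range(0, 11)]
```

Each $T_n$ takes values in the dyadic grid $2^{-n}\mathbb{Z}_+$, lies strictly above T, and satisfies $T_n - T \le 2^{-n}$.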
Optional times are often constructed recursively in terms of shifts on the underlying
path space. Recall that Xt ◦ θs (ω) = Xt+s (ω) := ω(t + s) for all s, t ≥ 0. For any pair of
optional times S and T on the canonical space consider the random time U = S + T ◦ θS
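defined as
\[ U(\omega) = \begin{cases} +\infty & \text{if } S(\omega) = \infty \\ s + T(\theta_s\omega) & \text{if } S(\omega) = s < \infty \end{cases} \]
Then
\[ X_T \circ \theta_S(\omega) := X_{S(\omega) + T(\theta_{S(\omega)}\omega)}(\omega) = \omega\big( S(\omega) + T(\theta_{S(\omega)}\omega) \big) \]
if S(ω) < ∞.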
Theorem 20.2.10. (compound optional times) For any metric space (S, d), let S and T be optional times on the canonical space $S^{\mathbb{Z}_+}$, C([0, ∞), S) or D([0, ∞), S) endowed with the right–continuous filtration $\mathscr{F}_{t+} = \sigma(X_s : s \le t)_+$. Then $U = S + T \circ \theta_S$ is an optional time.
Proof. For $S^{\mathbb{Z}_+}$ the proof is simple and is left as an exercise. For canonical spaces C([0, ∞), S) or D([0, ∞), S), the process X is progressively measurable. As $S \wedge n + T \circ \theta_{S\wedge n} \nearrow U$, we may assume without loss of generality that S is bounded. By Theorem 20.2.4(b), $X_{S+t} \in \mathscr{F}_{(S+t)+}$ for all t ≥ 0. Therefore, for any set $A = \{X_s \in B\}$ with $B \in \mathscr{B}(S)$ and 0 ≤ s ≤ t we have $\theta_S^{-1}(A) \in \mathscr{F}_{(S+t)+}$. The set of all such sets A generates the history σ–algebra $\mathscr{F}_t := \sigma(X_s : s \le t)$ and thus,
\[ \theta_S^{-1}(\mathscr{F}_t) \subset \mathscr{F}_{(S+t)+} \]
For t ≥ 0 fixed,
\[ \{U < t\} = \bigcup_{q\in\mathbb{Q}\cap(0,t)} \{S < q\} \cap \{T \circ \theta_S < t - q\} \]
20.3. Martingales and Stopping times 599
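If 0 < q < t then $\{T < t - q\} \in \mathscr{F}_{t-q}$ and thus, $\theta_S^{-1}(\{T < t - q\}) \in \mathscr{F}_{(S+t-q)+}$. By Lemma 20.2.3
\[ \{S < q,\ T \circ \theta_S < t - q\} = \{S + t - q < t\} \cap \theta_S^{-1}(\{T < t - q\}) \in \mathscr{F}_t. \]
Therefore, $\{U < t\} \in \mathscr{F}_t$ and so U is an $\mathscr{F}_t$–optional time.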
Proof. By the assumption on $\{\mathscr{F}_t : t \ge 0\}$, it is enough to show that $\{D_\Gamma < t\} \in \mathscr{F}_t$ for all t ≥ 0. Since Γ is progressively measurable, then $\Gamma_t := \Gamma \cap (\Omega \times [0, t)) \in \mathscr{F}_t \otimes \mathscr{B}([0, t])$ for all t ≥ 0. Notice that if $p_\Omega$ is the projection (ω, s) ↦ ω, then
\[ \{D_\Gamma < t\} = p_\Omega(\Gamma_t) \]
The measurable projection theorem 3.8.6 shows that $p_\Omega(\Gamma_t)$ is universally measurable with respect to $\mathscr{F}_t$. Since $\mathscr{F}_t$ is complete, then $p_\Omega(\Gamma_t) \in \mathscr{F}_t$.
Proof.
\[ \begin{aligned}
E[f(S_{n+1})|\mathscr{F}_n] &= E[f(S_n + \xi_{n+1})|\mathscr{F}_n] = \frac{1}{|B(0;1)|} \int_{B(0;1)} f(S_n + y)\, dy \\
&= \frac{1}{|B(0;1)|} \int_{B(S_n;1)} f(y)\, dy = f(S_n)
\end{aligned} \]
Observe that for any u ∈ T, the function H(t, ω)1[0,u] (t) is also elementary. Hence, we can
define a process (H · X)u by letting
(H · X)u = (H1[0,u] ) · X
It is straightforward to check that {(H · X)u , Fu : u ∈ T} is an adapted process.
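Lemma 20.3.4. Let $(\Omega, \{\mathscr{F}_u : u \in \mathbb{T}\}, \mathscr{F}, P)$ be a filtered probability space where $\mathbb{T} \subset \mathbb{R}_+$ and $\mathscr{F}_u \subset \mathscr{F}$ for all $u \in \mathbb{T}$.
(i) If $\{X_u, \mathscr{F}_u : u \in \mathbb{T}\}$ is a martingale, then so is $\{(H \cdot X)_u, \mathscr{F}_u : u \in \mathbb{T}\}$.
(ii) If $\{X_u, \mathscr{F}_u : u \in \mathbb{T}\}$ is a supermartingale and H ≥ 0, then so is $\{(H \cdot X)_u, \mathscr{F}_u : u \in \mathbb{T}\}$.
(iii) If $\{X_u, \mathscr{F}_u : u \in \mathbb{T}\}$ is a submartingale and H ≥ 0, then $\{(H \cdot X)_u, \mathscr{F}_u : u \in \mathbb{T}\}$ is a submartingale.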
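Part (i) of Lemma 20.3.4 can be illustrated in discrete time, where the transform reads $(H \cdot X)_n = \sum_{k=1}^n H_k (X_k - X_{k-1})$. The sketch below takes X to be the simple symmetric random walk and computes $E[(H \cdot X)_n]$ exactly by enumerating all paths. The betting rule `stake` and the function names are hypothetical and serve only as an example of a bounded predictable process.

```python
from fractions import Fraction
from itertools import product

def transform(path_increments, strategy):
    """(H . X)_n = sum_k H_k (X_k - X_{k-1}), where H_k is a function of X_{k-1} only."""
    position, gain = 0, 0
    for step in path_increments:
        h = strategy(position)      # predictable: decided before the step is revealed
        gain += h * step
        position += step
    return gain

def expected_gain(n, strategy):
    """E[(H . X)_n] for the simple symmetric random walk, by enumerating all 2^n paths."""
    paths = list(product((-1, 1), repeat=n))
    return sum(Fraction(transform(p, strategy)) for p in paths) / len(paths)

# Hypothetical strategy: stake 2 units while the walk is at or below 0, else 1 unit.
stake = lambda position: 2 if position <= 0 else 1
```

Since the transform of a martingale starts at 0 and is again a martingale, `expected_gain(n, stake)` is exactly 0 for every n, whatever bounded predictable strategy is used.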
dense set in $\mathbb{T}$ and let $U^{[a,b]}_{[s,u]\cap D}$ be the number of up-crossings of X from a to b over the time interval [s, u] ∩ D.
Theorem 20.4.1. (Doob's up-crossing theorem) If $\{X_t, \mathscr{F}_t : t \in \mathbb{T}\}$ is a submartingale, then
\[ E\big[ U^{[a,b]}_{[s,u]\cap D} \big] \le \frac{1}{b-a}\big( E[X_u^+] + |a| \big) \le \frac{1}{b-a}\big( E[|X_u|] + |a| \big). \tag{20.8} \]
b−a b−a
Proof. By translating the origin to s, we may assume without loss of generality that 0 = s < u. Given a finite sequence $S_N = \{s_1 < \dots < s_N\} \subset [0, u] \cap D$, let
\[ \begin{aligned}
T_0 &= \min\{t \in S_N : X_t \le a\} \wedge u \\
T_{2k-1} &= \min\{S_N \ni t > T_{2k-2} : X_t \ge b\} \wedge u \\
T_{2k} &= \min\{S_N \ni t > T_{2k-1} : X_t \le a\} \wedge u
\end{aligned} \]
for 1 ≤ k ≤ [N/2] + 1. It is easy to check that each $T_j$ is a stopping time. Let $U_N^{[a,b]}$ denote the number of up-crossings from a to b of X in $S_N$. It is clear by definition that $U_N^{[a,b]} \in \mathscr{F}_u$. Observe that $Z_t = a + (X_t - a)^+$ is a submartingale that has the same up-crossings in $S_N$ as $X_t$. Consider the elementary process
\[ H = \sum_{k=0}^{[N/2]+1} \mathbb{1}_{(T_{2k}, T_{2k+1}]}. \]
If $U_N^{[a,b]} = m$ then m ≤ [N/2] + 1, each of the first m terms of H · Z contributes at least (b − a), the (m + 1)–th term contributes at most $(X_u - a)^+$, and the remaining terms are all zero; therefore,
\[ (b - a)\, U_N^{[a,b]} \le (H \cdot Z)_u. \]
After taking expectations we obtain
\[ E\big[ U_N^{[a,b]} \big] \le \frac{1}{b-a}\, E[(H \cdot Z)_u]. \]
If $K = \mathbb{1}_{(0,u]} - H$, then
\[ Z_u - Z_0 = (\mathbb{1}_{(0,u]} \cdot Z)_u = (H \cdot Z)_u + (K \cdot Z)_u. \]
As K is a nonnegative elementary process, Lemma 20.3.4(b) implies that
\[ \begin{aligned}
E[Z_u - Z_0] &= E[(H \cdot Z)_u] + E[(K \cdot Z)_u] \\
&\ge E[(H \cdot Z)_u] + E[(K \cdot Z)_0] = E[(H \cdot Z)_u].
\end{aligned} \]
Therefore
\[ E\big[ U_N^{[a,b]} \big] \le \frac{1}{b-a}\big( E[(X_u - a)^+] - E[(X_0 - a)^+] \big) \le \frac{1}{b-a}\, E[(X_u - a)^+]. \]
The estimate (20.8) follows from monotone convergence after taking supremum over all finite sets S ⊂ D ∩ [0, u].
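The quantity $U_N^{[a,b]}$ counted in the proof above can be computed for a finite path by a single pass that mirrors the alternating stopping times $T_0, T_1, T_2, \dots$. The sketch below is a minimal illustration; the function name `count_upcrossings` and the sample path are assumptions.

```python
def count_upcrossings(xs, a, b):
    """Number of completed up-crossings of [a, b] by the finite sequence xs."""
    count = 0
    below = False            # True once the path has visited (-inf, a] since the last crossing
    for x in xs:
        if not below and x <= a:
            below = True     # plays the role of an even-indexed time T_{2k}
        elif below and x >= b:
            count += 1       # plays the role of an odd-indexed time T_{2k+1}
            below = False
    return count

path = [1, 0, 2, 3, -1, 0.5, 4, 0, 1]   # hypothetical sample path
```

For this path and the interval [0, 2], the path goes from 0 up to 2 and later from −1 up to 4, so two up-crossings are completed; the final visit to 0 starts a third one that is never finished.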
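Remark 20.4.2. If $\mathbb{T}$ is an interval of the form [u, v], Doob's up-crossing Theorem implies that
\[ \mathrm{Osc}_{[u,v]\cap D} := \bigcup_{a,b\in\mathbb{Q},\, a<b} \Big[ U^{[a,b]}_{[u,v]\cap D} = \infty \Big] \tag{20.9} \]
is nearly empty. Therefore, the limits
\[ X_{u+} = \lim_{D\ni q\searrow u} X_q, \qquad X_{v-} = \lim_{D\ni q\nearrow v} X_q \]
exist a.s. in R.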
Proof. The process Yu = −Xu is a submartingale and sup E[Yu+ ] = 0. The conclusion
follows from the martingale convergence theorem 20.4.3. For the last statement assume
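$A \in \mathscr{F}_t$, t < ∞. By Fatou's lemma, for any sequence $\{t_n\} \subset D$ with $t_n \nearrow \infty$
\[ \int_A X_\infty\, dP \le \sup_{t_n} \inf_{t_m > t_n} \int_A X_{t_m}\, dP \le \int_A X_t\, dP \]
since $\{X_s : s \in \mathbb{T}\}$ is a supermartingale. Therefore $E[X_\infty|\mathscr{F}_t] \le X_t$.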
20.4. Martingale convergence theorem 605
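$X_{-\infty}$ as $t \searrow -\infty$. It is readily seen that $X_{-\infty} \in \mathscr{F}_{-\infty}$. If $A \in \mathscr{F}_{-\infty}$, then as $\mathscr{F}_{-\infty} \subset \mathscr{F}_q$ for all q ≤ 0,
\[ \int_A X_0\, dP = \int_A X_q\, dP \xrightarrow{\ q\to-\infty\ } \int_A X_{-\infty}\, dP. \]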
Theorem 20.4.9. (Law of large numbers) Let $\{X_n\} \subset L_1(P)$ be an i.i.d. sequence of random variables. Let $S_n = \sum_{k=1}^n X_k$. Then,
\[ \frac{1}{n} S_n \to E[X_1] \tag{20.10} \]
P–a.s. and in $L_1(P)$.
Proof. For each n ≥ 1, define $\mathscr{F}_{-n} := \sigma(S_k : k \ge n)$ and let $M_{-n} = \frac{1}{n} S_n$. Since {Xk} is an i.i.d. sequence, it follows that
\[ E[X_k|\mathscr{F}_{-n}] = \frac{1}{n} S_n \quad \text{for all } 1 \le k \le n. \]
Consequently, $E[M_{-n}|\mathscr{F}_{-n-1}] = M_{-n-1}$. Hence, $\{M_{-n} : n \in \mathbb{N}\}$ is a backwards martingale and by Corollary 20.4.8, $M_{-n}$ converges P–a.s. and in $L_1(P)$ to $M = E[M_{-1}|\mathscr{F}_{-\infty}]$.
For each k, $M = \lim_n \frac{1}{n}\sum_{j=k+1}^n X_j$ is measurable with respect to $\mathscr{T}_k := \sigma(X_p : p > k)$, which means that M is measurable with respect to the tail σ–algebra $\mathscr{T} = \bigcap_k \mathscr{T}_k$. By the Kolmogorov 0–1 law, we have that M = E[M] P–a.s. The $L_1$ convergence of $M_{-n}$ shows that $E[M] = E[X_1]$.
Proof. As X is a martingale, $U^{[a,b]}_{\mathbb{Q}\cap[0,n]}$ is integrable for each rational pair a < b and $n \in \mathbb{Z}_+$. Thus Osc is nearly empty. As $\{X_u : u \in [t, t+1] \cap \mathbb{Q}\}$ is a uniformly integrable martingale, $X_q$ converges a.s. and in $L_1$ to $X'_t$ as $\mathbb{Q} \ni q \searrow t$. For all $A \in \mathscr{F}_t$
\[ \int_A X_t\, dP = \int_A X_q\, dP \xrightarrow{\ \mathbb{Q}\ni q\searrow t\ } \int_A X'_t\, dP. \]
The set on the right-hand side is nearly empty since $X'_q = X_q = Y_q$ a.s. for all $q \in D_+$.
Theorem 20.5.2. (optional stopping time: discrete time closable processes) Suppose {Xn :
n ∈ Z+ ∪ {∞}} is a closed martingale (resp. submartingale, supermartingale). Then,
for any stopping time T , XT ∈ L1 . If S is another stopping time and S ≤ T , then
E[XT |FS ] = (resp. ≥, ≤)XS .
The next example shows that the closable condition in Theorem 20.5.2 is necessary.
Example 20.5.3. Consider an i.i.d. sequence {ξn} of random variables with P[ξ1 = −1] = P[ξ1 = 1] = 1/2. Let X0 = 0 and $X_n = \sum_{k=1}^n \xi_k$ for n ≥ 1 and Fn = σ(Xk : k ≤ n). If T = inf{n ≥ 0 : Xn = 1}, then P[T < ∞] = 1 and thus, E[XT] = 1; however E[X0] = E[Xn] = 0 for all n ∈ Z+.
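The phenomenon in Example 20.5.3 can be seen by exact enumeration: for each fixed n the stopped walk $X_{T\wedge n}$ still has mean 0, even though P[T ≤ n] increases towards 1 and $X_T = 1$. The following sketch computes both quantities with rational arithmetic; the function name `stopped_walk_stats` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

def stopped_walk_stats(n):
    """Return (E[X_{T ^ n}], P[T <= n]) for T = inf{k : X_k = 1}, by enumerating paths."""
    total, hits = Fraction(0), 0
    paths = list(product((-1, 1), repeat=n))
    for steps in paths:
        x = 0
        for step in steps:
            if x == 1:       # the walk has already been stopped at level 1
                break
            x += step
        total += x
        hits += (x == 1)
    return total / len(paths), Fraction(hits, len(paths))
```

For instance, `stopped_walk_stats(3)` returns mean 0 and probability 5/8. The mean stays 0 because the rare paths that have not yet reached level 1 sit far below it, which is exactly the lack of uniform integrability that prevents passing to the limit.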
Corollary 20.5.4. (Wald inequality) Suppose that $\{X_n, \mathscr{F}_n : n \in \mathbb{Z}_+\}$ is a martingale (resp. submartingale, supermartingale) such that
\[ E\big[ |X_{n+1} - X_n| \,\big|\, \mathscr{F}_n \big] \le B \]
for some constant B > 0. If T is a stopping time and $T \in L_1(P)$ then, the stopped process $X^T$ is a u.i. martingale (resp. submartingale, supermartingale) and
\[ E[X_T] = (\text{resp. } \ge, \le)\ E[X_0] \]
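Proof. (i) Suppose first that T takes values on a countable set $\{t_k\} \subset \mathbb{T}$. Theorem 20.5.1 shows that $X_T \in L_1$ and that $E[X_\infty|\mathscr{F}_T] = X_T$.
For the general case, let $T_n$ be a sequence of stopping times taking values on a countable set and decreasing to T as in Lemma 20.2.9. By the backwards martingale theorem and the right–continuity of X
\[ E[X_\infty|\mathscr{F}_T] = \lim_n E[X_\infty|\mathscr{F}_{T_n}] = \lim_n X_{T_n} = X_T \]
almost surely and in $L_1$. The last statement in (i) follows from Theorem 20.2.8 since
\[ X_{T\wedge s} = E[X_\infty|\mathscr{F}_{T\wedge s}] = E\big[ E[X_\infty|\mathscr{F}_T] \,\big|\, \mathscr{F}_s \big] = E[X_T|\mathscr{F}_s]. \]
(ii) Suppose $s, t \in \mathbb{T}$, with s < t, and let $A \in \mathscr{F}_s$. Notice that for any stopping time T, $\{T > s\} \in \mathscr{F}_s \subset \mathscr{F}_{T\vee s}$. Since the stopped process $X^t$ is a u.i. martingale, by part (i)
\[ \begin{aligned}
E[\mathbb{1}_{A\cap\{T>s\}} X_t^T] &= E[\mathbb{1}_{A\cap\{T>s\}} (X^t)_{T\vee s}] = E\big[ \mathbb{1}_{A\cap\{T>s\}} E[X_\infty^t|\mathscr{F}_{T\vee s}] \big] \\
&= E[\mathbb{1}_{A\cap\{T>s\}} X_t] = E[\mathbb{1}_{A\cap\{T>s\}} X_s] = E[\mathbb{1}_{A\cap\{T>s\}} X_{T\wedge s}]
\end{aligned} \]
Hence
\[ \begin{aligned}
E[\mathbb{1}_A X_t^T] &= E[\mathbb{1}_{A\cap\{T\le s\}} X_t^T] + E[\mathbb{1}_{A\cap\{T>s\}} X_t^T] \\
&= E[\mathbb{1}_{A\cap\{T\le s\}} X_{T\wedge s}] + E[\mathbb{1}_{A\cap\{T>s\}} X_{T\wedge s}] = E[\mathbb{1}_A X_s^T]
\end{aligned} \]
Therefore, $E[X_t^T|\mathscr{F}_s] = X_s^T$. If T is bounded, let τ = sup T. Then $X^T = (X^\tau)^T$ and by part (i) it follows that $X^T$ is u.i. with $X_s^T = E[X_T|\mathscr{F}_s]$.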
Lemma 20.5.7. (Chung) Suppose {X−n , F−n : n ∈ Z+ } is a reversed submartingale. If
{E[X−n ] : n ∈ Z+ } is bounded below, then {X−n } is uniformly integrable.
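Proof. Let $\ell = \inf_n E[X_{-n}] = \lim_n E[X_{-n}] > -\infty$. We will show that $\{X_{-n}^+\}$ and $\{X_{-n}^-\}$ are u.i. sequences. Notice that $\{X_{-n}^+, \mathscr{F}_{-n}\}$ is also a reversed submartingale.
\[ \lambda P[|X_{-n}| > \lambda] \le E|X_{-n}| = 2E[X_{-n}^+] - E[X_{-n}] \le 2E[X_0^+] - \ell < \infty \]
Therefore, $\lim_{\lambda\to\infty} \sup_n P[|X_{-n}| > \lambda] = 0$. From the submartingale property, we have that
\[ E\big[ X_{-n}^+ \mathbb{1}_{\{X_{-n}^+ > \lambda\}} \big] \le E\big[ X_0^+ \mathbb{1}_{\{X_{-n}^+ > \lambda\}} \big] \le E\big[ X_0^+ \mathbb{1}_{\{|X_{-n}| > \lambda\}} \big]; \]
whence we conclude that $\{X_{-n}^+\}$ is uniformly integrable. It remains to show that $\{X_{-n}^-\}$ is uniformly integrable. Given ε > 0, there is $n_0$ such that $|E[X_{-m}] - E[X_{-n}]| < \varepsilon/2$ whenever $n \ge m \ge n_0$. Then,
\[ \begin{aligned}
E\big[ X_{-n}^- \mathbb{1}_{\{X_{-n}^- > \lambda\}} \big] &= -E\big[ X_{-n} \mathbb{1}_{\{X_{-n} < -\lambda\}} \big] = E\big[ X_{-n} \mathbb{1}_{\{X_{-n} \ge -\lambda\}} \big] - E[X_{-n}] \\
&\le E\big[ X_{-m} \mathbb{1}_{\{X_{-n} \ge -\lambda\}} \big] - E[X_{-n}] = E[X_{-m}] - E\big[ X_{-m} \mathbb{1}_{\{X_{-n} < -\lambda\}} \big] - E[X_{-n}] \\
&\le \frac{\varepsilon}{2} - E\big[ X_{-m} \mathbb{1}_{\{X_{-n} < -\lambda\}} \big] \le \frac{\varepsilon}{2} + E\big[ |X_{-m}| \mathbb{1}_{\{|X_{-n}| > \lambda\}} \big]
\end{aligned} \]
Setting $m = n_0$, letting λ → ∞ and then ε → 0, we conclude that $\{X_{-n}^-\}$ is uniformly integrable. Therefore $\{X_{-n}\}$ is uniformly integrable.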
Theorem 20.5.8. (optional stopping time: closable càdlàg processes) Suppose {Xt , Ft :
t ∈ R+ } is a closed right–continuous with left limits martingale (resp. submartingale, su-
permartingale). If S ≤ T are stopping times, then E[XT |FS ] = (resp. ≥, ≤) XS
Proof. If (20.12) holds, then by taking conditional expectation of Xn − Xn−1 w.r.t. Fn−1
we obtain
E[Xn − Xn−1 |Fn−1 ] = An − An−1 .
Consequently, we obtain an expression for A
\[ A_n = \sum_{k=1}^n E[X_k - X_{k-1}|\mathscr{F}_{k-1}]; \qquad A_0 = 0 \tag{20.13} \]
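Formula (20.13) can be evaluated path by path. The sketch below does this for $X_k = \varphi(S_k)$, where S is the simple symmetric random walk, so that conditioning on $\mathscr{F}_{k-1}$ amounts to averaging over the two equally likely next steps. The random walk setting, the function name `compensator` and the choice $\varphi(s) = s^2$ are assumptions made for illustration only.

```python
def compensator(phi, steps):
    """A_n = sum_k E[X_k - X_{k-1} | F_{k-1}] along one path, with X_k = phi(S_k).

    S is the simple symmetric random walk, so the conditional expectation given
    F_{k-1} is the average over the two equally likely values of the next step.
    """
    s, a = 0, 0.0
    values = [a]                      # A_0 = 0
    for step in steps:
        a += 0.5 * (phi(s + 1) + phi(s - 1)) - phi(s)
        values.append(a)
        s += step
    return values

square = lambda s: s * s
A = compensator(square, [1, -1, -1, 1, 1])
```

For $X_n = S_n^2$ each conditional increment equals 1, so $A_n = n$ on every path, while for the martingale $X_n = S_n$ the same computation gives $A_n = 0$.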
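Proof. Let Q be a countable dense set in $\mathbb{T}$ and let S ⊂ Q ∩ [0, t] be finite. Define $M_S = \max_{s\in S} X_s^+$, and for $\mathbb{T} \ni u > t$ fixed let $U = \min\{s \in S : X_s > \lambda\} \wedge u$. Then U is a stopping time taking values on a finite subset of $\mathbb{T}$ and
\[ \{U < u\} = \{U \le t\} = \{M_S > \lambda\} \subset \{X_t^\natural > \lambda\} \]
\[ \lambda \mathbb{1}_{\{U<u\}} = \lambda \mathbb{1}_{\{M_S>\lambda\}} \le X_U \mathbb{1}_{\{U<u\}} = X_{U\wedge t} \mathbb{1}_{\{U\le t\}} \in \mathscr{F}_t \]
By Theorem 20.3.5, $E[X_t|\mathscr{F}_{U\wedge t}] \ge X_{U\wedge t}$, therefore
\[ \begin{aligned}
P[\{M_S > \lambda\}] &\le \frac{1}{\lambda} \int_{\{U\le t\}} X_{U\wedge t}\, dP \le \frac{1}{\lambda} \int_{\{U\le t\}} E[X_t|\mathscr{F}_{U\wedge t}]\, dP \\
&= \frac{1}{\lambda} \int_{\{U\le t\}} X_t\, dP = \frac{1}{\lambda} \int_{\{M_S>\lambda\}} X_t\, dP \le \frac{1}{\lambda} \int_{\{X_t^\natural>\lambda\}} X_t^+\, dP
\end{aligned} \]
By taking suprema over all finite subsets S of {t} ∪ (Q ∩ [0, t]), Doob's inequality follows from the right–continuity of X.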
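Theorem 20.7.2. (Doob's inequality.) Suppose that X is a right–continuous submartingale w.r.t. $\mathscr{F}_t$ and let 1 ≤ p, q ≤ ∞ with $\frac{1}{p} + \frac{1}{q} = 1$. If $\{X_t^+ : t \in \mathbb{T}\} \subset L_p(P)$ then
\[ \|X_t^\natural\|_p \le q\, \|X_t^+\|_p, \qquad \|X_\infty^\natural\|_p \le q\, \sup_t \|X_t^+\|_p \tag{20.15} \]
If X is actually a martingale and $\{X_t : t \in \mathbb{T}\} \subset L_p$, then
\[ \|X_t^*\|_p \le q\, \|X_t\|_p, \qquad \|X_\infty^*\|_p \le q\, \sup_t \|X_t\|_p \tag{20.16} \]
If X is u.i. then $\|X_\infty^*\|_p \le q\, \|X_\infty\|_p$.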
Therefore $\|M_S\|_p \le q\, \|X_t^+\|_p$. The first inequality follows from the right–continuity of X and by taking suprema over all finite subsets of {t} ∪ (Q ∩ [0, t]). The second follows by monotone convergence once t → ∞.
(ii) If X is actually a martingale, then (20.16) follows from applying (20.15) to $Y_t = |X_t|$. If in addition, X is u.i. then $|X_t|^p = |E[X_\infty|\mathscr{F}_t]|^p \le E[|X_\infty|^p|\mathscr{F}_t]$. Therefore $\|X_\infty^*\|_p \le q\, \sup_t \|X_t\|_p \le q\, \|X_\infty\|_p$.
Corollary 20.7.3. If X is a right–continuous martingale w.r.t. $\{\mathscr{F}_t : t \in \mathbb{T}\}$ and $\sup_t \|X_t\|_p < \infty$, then $\lim_{t\nearrow\sup\mathbb{T}} X_t = X_\infty$ exists a.s. and in $L_p$. Consequently, X is u.i.
Proof. Since $\|X_t\|_1 \le \|X_t\|_p$, it follows from the martingale convergence theorem 20.4.3 that $\lim_{t\nearrow\sup\mathbb{T}} X_t = X_\infty$ exists a.s. To obtain convergence in $L_p$ notice that
\[ |X_t - X_\infty|^p \le (2X_t^*)^p. \]
The conclusion follows from Doob's maximal inequality and dominated convergence.
Theorem 20.7.4. (Azuma–Hoeffding) Suppose $\{X_n, \mathscr{F}_n\}$ is a martingale such that
\[ |X_n - X_{n-1}| \le c_n, \qquad n \ge 1. \]
Then, for all $m \in \mathbb{Z}_+$ and a > 0
\[ P[|X_m - X_0| > a] \le 2\exp\Big( -\frac{a^2}{2\sum_{k=1}^m c_k^2} \Big). \tag{20.17} \]
A similar estimate for $P[X_m - X_0 < -a]$ is obtained by setting $X'_n = -X_n$. Hence,
\[ P[|X_m - X_0| > a] \le 2e^{-ta} \exp\Big( \frac{t^2}{2} \sum_{k=1}^m c_k^2 \Big). \]
The choice $t = a / \big( \sum_{k=1}^m c_k^2 \big)$ implies (20.17).
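To get a feeling for how conservative (20.17) is, one can compare it with an exact tail probability. The sketch below does this for the simple symmetric random walk, for which $c_k = 1$ and the tail is a binomial sum. The choice of the random walk, the values m = 20 and a = 10, and the function names are assumptions for illustration.

```python
import math

def exact_tail(m, a):
    """P[|S_m| > a] for the simple symmetric random walk S_m (so c_k = 1)."""
    favorable = sum(math.comb(m, j) for j in range(m + 1) if abs(2 * j - m) > a)
    return favorable / 2 ** m

def azuma_bound(a, cs):
    """Right-hand side of (20.17): 2 exp(-a^2 / (2 sum_k c_k^2))."""
    return 2 * math.exp(-a ** 2 / (2 * sum(c * c for c in cs)))

m, a = 20, 10
tail = exact_tail(m, a)          # exact probability, 12392 / 2^20
bound = azuma_bound(a, [1] * m)  # 2 exp(-100 / 40)
```

Here the exact tail is about 0.012 while the bound is about 0.16, so the inequality holds with room to spare, as expected from an estimate that uses only the bounds on the increments.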
Example 20.7.5. Let Xn be a sequence of integrable i.i.d. random variables and let S0 = 0, $S_n = \sum_{k=1}^n (X_k - E[X_1])$ for n ≥ 1. Then, Sn is a martingale with respect to Fn = σ(Xk ; k ≤ n) and $S_n - S_{n-1} = X_n - E[X_1]$ for all n ≥ 1. Suppose that |X1| ≤ C with probability one for some C > 0. Then, with $\bar X_n = \frac{1}{n}\sum_{k=1}^n X_k$,
\[ P\big[ |\bar X_n - E[X_1]| > a \big] = P\Big[ \frac{1}{n}|S_n| > a \Big] \le 2\exp\Big( -\frac{a^2 n}{8C^2} \Big) \]
for all a > 0. An $L_1$ rate of convergence can be derived for the strong law of large numbers by integrating over a,
\[ E\big[ |\bar X_n - E[X_1]| \big] \le \int_0^\infty 2\exp\big( -a^2 n / 8C^2 \big)\, da = \frac{2C\sqrt{2\pi}}{\sqrt{n}} \]
20.8. Exercises
Exercise 20.8.1. If $X : (\Omega, \mathscr{F}) \to (S^{\mathbb{T}}, \mathscr{S}^{\otimes\mathbb{T}})$ is measurable and P is a probability measure on $\mathscr{F}$, show that the family of probability measures $\mu_I = P \circ (p_I \circ X)^{-1}$ on $\mathscr{S}^{\otimes I}$, where $I \subset \mathbb{T}$ and I is finite, is projective. This shows that every stochastic process has a canonical representation.
Exercise 20.8.2. Show that P is generated by the sets of the form A×(a, b] where A ∈ Fb
and 0 ≤ a < b < ∞. Show that P ⊂ O.
Exercise 20.8.3. Suppose T is a stopping time with respect to the filtration $\{\mathscr{F}_t : t \in [0, \infty)\}$. Show that $T^{-1}(A) \in \mathscr{F}_t$ for any $A \in \mathscr{B}([0, t])$ and t ≥ 0.
Exercise 20.8.4. If T = Z+ show that T is a stopping time iff {T = n} ∈ Fn for any
n ∈ Z+ .
Exercise 20.8.5. Let T be a stopping time, and let $A \in \mathscr{F}_T$. Define $T^A = \mathbb{1}_A T + \infty \mathbb{1}_{A^c}$ (under the convention that ∞ · 0 = 0). Show that $T^A$ is a stopping time.
Exercise 20.8.6. (a) Suppose X is a martingale w.r.t. $\{\mathscr{F}_t\}$, and let ϕ be a convex function such that $Y = \varphi \circ X \in L_1$. Show that Y is a submartingale.
(b) Suppose that X is a real–valued submartingale instead, and that ϕ is a convex, nondecreasing function such that $Y = \varphi \circ X \in L_1$. Show that Y is a submartingale.
Exercise 20.8.7. Suppose that T and S are stopping times taking values on a discrete
countable set and that max T := u < ∞. If X is a submartingale, show that XT ∈ L1 , that
E[Xu |FT ] ≥ XT , and that E[XT |FS ] ≤ E[Xu |FT ∧S ].
Chapter 21
Applications of
Martingale theory
21.1. Differentiation
We will present versions of the Radon–Nikodym theorem for stochastic kernels using a
technique based on martingales.
Theorem 21.1.1. Let $(\Omega, \{\mathscr{B}_n : n \in \mathbb{N}\})$ be a filtered space and let $\mathscr{B} = \sigma\big( \bigcup_n \mathscr{B}_n \big)$. Suppose µ and ν are a probability measure and a finite measure on $(\Omega, \mathscr{B})$ respectively. Denote by $\mu_n$ and $\nu_n$ the restrictions of µ and ν to $\mathscr{B}_n$ respectively. Suppose that $\nu_n \ll \mu_n$ for all n and let $X_n = \frac{d\nu_n}{d\mu_n}$. Then
(i) $\{X_n, \mathscr{B}_n\}$ is a martingale.
(ii) $\limsup_n X_n = X \in L_1(\mu)$,
\[ \nu(A) = \int_A X\, d\mu + \nu(A \cap \{X = \infty\}) = \nu_a(A) + \nu_s(A), \]
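Set $Y_n = \frac{d\mu_n}{d\rho_n}$ and $Z_n = \frac{d\nu_n}{d\rho_n}$. Then, $Y_n, Z_n \ge 0$, $Y_n + Z_n = 2$, and so by part (i), $\{Y_n, \mathscr{B}_n\}$ and $\{Z_n, \mathscr{B}_n\}$ are nonnegative bounded martingales with respect to ρ. The martingale convergence theorem 20.4.3 and dominated convergence imply that $\lim_n Y_n = Y$ and $\lim_n Z_n = Z$ exist ρ–a.s. and in $L_1(\rho)$. For any $A \in \bigcup_n \mathscr{B}_n$,
\[ \begin{aligned}
\mu(A) &= \lim_n \mu_n(A) = \lim_n \int_A Y_n\, d\rho = \int_A Y\, d\rho \\
\nu(A) &= \lim_n \nu_n(A) = \lim_n \int_A Z_n\, d\rho = \int_A Z\, d\rho
\end{aligned} \]
Consequently, as $\bigcup_n \mathscr{B}_n$ is a π–system, by Dynkin's monotone class theorem we have that
\[ d\mu = Y\, d\rho, \qquad d\nu = Z\, d\rho. \tag{21.2} \]
As $X_n = \frac{d\nu_n}{d\mu_n}$, we have that $X_n = \frac{Z_n}{Y_n}$ ρ–a.s., and hence µ–a.s. Since Y + Z = 2 ρ–a.s., it follows that ρ(Y = 0 = Z) = 0, and so $X = \frac{Z}{Y}$ ρ–a.s., and hence µ–a.s. By (21.1) and (21.2) we have that µ({Y = 0}) = 0, and ρ({Y = 0}△{X = ∞}) = 0. Thus, $1 = Y \frac{1}{Y} \mathbb{1}_{\{Y>0\}} + \mathbb{1}_{\{Y=0\}}$, and so
\[ \begin{aligned}
\nu(A) &= \int_A Z\, d\rho = \int_A \frac{Z}{Y} \mathbb{1}_{\{Y>0\}}\, Y\, d\rho + \int_A Z \mathbb{1}_{\{Y=0\}}\, d\rho \\
&= \int_A X\, d\mu + \nu(A \cap \{X = \infty\}).
\end{aligned} \]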
Theorem 21.1.2. Suppose (Ω, B, µ) is a countably generated probability space and let Bn
be as above. Given a finite measure ν on B, define
\[ X_n(\omega) = \sum_{B\in\mathscr{B}_n} \frac{\nu(B)}{\mu(B)}\, \mathbb{1}_B(\omega)\, \mathbb{1}_{(0,\infty)}\big( \mu(B) \vee \nu(B) \big) \tag{21.3} \]
There is $X^s_\infty \in L_1^+(\mu)$ to which $X^s_n$ converges µ–a.s. By Fatou’s lemma and (21.5), for any $B \in \bigcup_n \sigma(\mathcal{B}_n)$,
\[ \int_B X^s_\infty\,d\mu \le \liminf_n \int_B X^s_n\,d\mu = \nu_s(B). \tag{21.6} \]
By monotone convergence, the class of sets in B that satisfy (21.6) is a monotone class containing the algebra ⋃n σ(Bn ). Hence, by the monotone class theorem 3.5.2, (21.6) holds for all sets in B. As νs ⊥ µ, there is B ∈ B such that νs (B) = 0 = µ(Ω \ B). This means that
\[ \int X^s_\infty\,d\mu = \int_B X^s_\infty\,d\mu + \int_{\Omega\setminus B} X^s_\infty\,d\mu = 0. \]
It follows that $X^s_\infty = 0$ µ–a.s., and so $X_\infty = X^a_\infty$ µ–a.s.
To conclude, suppose that Y = dνa /dµ. For any B ∈ Bn ,
\[ \nu_a(B) = \int_B Y\,d\mu = \int_B X^a_n\,d\mu, \]
that is, $E[Y \mid \sigma(\mathcal{B}_n)] = X^a_n$. This means that $\{X^a_n, \sigma(\mathcal{B}_n)\}$ is a uniformly integrable martingale; hence, by Theorem 20.4.5, $Y = \lim_n X^a_n$ µ–a.s. and in L1 (µ). Therefore Y = X∞ µ–a.s.
Remark 21.1.3. The martingale approach developed above can be used to extend the
notion of symmetric derivative to general measures. For example, Theorem 21.1.2 should
be compared with Corollary 11.1.9 by considering a d–dimensional interval I with integer
vertices, µ as the normalized Lebesgue measure on I, and ν any finite Borel measure on I.
As partitions Bn , we may consider dyadic boxes contained in I.
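The dyadic construction in Remark 21.1.3 can be illustrated numerically. The following is a minimal Python sketch for d = 1 and I = [0, 1), with µ the Lebesgue measure. The measure ν with ν([0, x)) = x², the function name `dyadic_ratio`, and the sample point 0.3 are illustrative assumptions, not part of the notes.

```python
def dyadic_ratio(nu_cdf, omega, n):
    """Ratio nu(B)/mu(B) on the level-n dyadic interval B = [k/2^n, (k+1)/2^n) containing omega.

    mu is Lebesgue measure on [0, 1); nu is described by x -> nu([0, x)).
    """
    k = int(omega * 2**n)                 # index of the dyadic interval containing omega
    a, b = k / 2**n, (k + 1) / 2**n
    return (nu_cdf(b) - nu_cdf(a)) / (b - a)


# Assumed example: nu([0, x)) = x^2, so the density of nu with respect to mu is 2x.
nu_cdf = lambda x: x * x
for level in (2, 6, 12, 20):
    print(level, dyadic_ratio(nu_cdf, 0.3, level))
```

Under these assumptions the printed ratios approach the density value 2 · 0.3 as the level grows, which is the behavior the martingale argument predicts for an absolutely continuous ν.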
We conclude this section with a result that establishes the measurability of Radon–
Nikodym’s decomposition between σ–finite kernels.
Theorem 21.1.4. (de Possel, Doob) Let µ and ν be σ–finite kernels from (S, S ) to (T, T ).
If T is countably generated then, there is a measurable function X : (S×T, S ⊗T ) → [0, ∞]
such that for all B ∈ T
\[ \nu(s, B) = \int_B X(s,t)\,\mathbf{1}_{\{X<\infty\}}\,\mu(s,dt) + \nu\big(s, B \cap \{X = \infty\}\big) \tag{21.7} \]
Proof. Suppose µ and ν are σ–finite. Then ρ := µ + ν is also σ–finite; hence, there exists a function f : (S × T, S ⊗ T ) → (0, ∞) such that ∫ f (s, t)ρ(s, dt) = 1{ρ(s,T )≠0} . We may
assume without loss of generality that ρ(s, T ) > 0 for all s ∈ S. It follows that ρ′ = f · ρ is
a stochastic kernel, and µ′ = f · µ, and ν ′ = f · ν are finite kernels.
By L1 (ρ′s ) convergence, ν′(s, dt) = X′(s, t) · ρ′(s, dt) in σ(Bn ) for each n ∈ N, which extends to B = σ(⋃n σ(Bn )) by the monotone class theorem. Any other function Y such that Y · ρ = X′ · ρ satisfies Y (s, ·) = X′(s, ·) ρs –a.s. for each s ∈ S. From ρ′ = µ′ + ν′, we obtain (1 − X′) · f · ν = (X′ · f ) · µ. Hence
\[ f\cdot\nu = \big(\mathbf{1}_{\{X'<1\}}\,f + \mathbf{1}_{\{X'=1\}}\,f\big)\cdot\nu = \mathbf{1}_{\{X'<1\}}\frac{X'}{1-X'}\,f\cdot\mu + \mathbf{1}_{\{X'=1\}}\,f\cdot\nu = \mathbf{1}_{\{X<\infty\}}\,X\,f\cdot\mu + \mathbf{1}_{\{X=\infty\}}\,f\cdot\nu \]
where X = X′/(1 − X′) with 1/0 := ∞. Consequently ν = 1{X<∞} X · µ + 1{X=∞} · ν.
21.2. Disintegration of Stochastic kernels 619
Clearly $\nu^a := \mathbf{1}_{\{X<\infty\}}\,X\cdot\mu \ll \mu$, and for any B ∈ T the map $s \mapsto \nu^a(s,B) = \int_B X(s,t)\,\mathbf{1}_{\{X(s,\cdot)<\infty\}}(t)\,\mu(s,dt)$ is S –measurable. Since {X = ∞} = {X′ = 1} and
\[ \mu'\big(s, \{X'(s,\cdot) = 1\}\big) = \int_{\{X'(s,\cdot)=1\}} X'(s,t)\,\mu'(s,dt) = \int_{\{X'(s,\cdot)=1\}} \big(1 - X'(s,t)\big)\,\nu'(s,dt) = 0, \]
we conclude that $\mu \perp \nu^s := \mathbf{1}_{\{X=\infty\}}\cdot\nu$. The measurability of $r \mapsto \nu^s(r,B)$ follows from $\nu^s = \nu - \nu^a$. The uniqueness of X follows from the uniqueness of X′ = 1 − 1/(1 + X).
Proof. For each B ∈ U consider the measures νB (s, dt) = ρ(s, dt × B). It is easy to verify
that each νB is a kernel from S to T ; moreover, νB ≤ νU for all B ∈ U , and so νB ≪ νU ,
and νU is a stochastic kernel. We use ν to denote νU . As (T, T ) is countably generated,
by Theorem 21.1.4, there exists a measurable function hB : S × T → (0, ∞) such that
νB = hB · νU for each B ∈ U . For each s ∈ S and B ∈ B, the function hB (s, ·) is uniquely
determined ν(s, dt)–a.s., and
(1) hU (s, ·) = 1 ν(s, dt)–a.s.
(2) For all A, B ∈ U with A ⊂ B, hA (s, ·) ≤ hB (s, ·) ν(s, dt)–a.s.
(3) For any monotone sequence {Bn : n ∈ N} ⊂ U with Bn ↗ B, hBn (s, ·) ↗ hB (s, ·) ν(s, dt)–a.s.
We will consider only the case where U is uncountable in which case, by the measurable
isomorphism theorem 3.9.15, we may assume without loss of generality that (U, U ) =
(R, B(R)). For each s ∈ S and each set D ⊂ S × T , we denote Ds = {t ∈ T : (s, t) ∈ D}.
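The sets defined by
\[ \Omega(p,q) = \{(s,t) : h_{(-\infty,p]}(s,t) \le h_{(-\infty,q]}(s,t)\}, \]
where p < q, and
\[ \Omega(-\infty) = \Big\{(s,t) : \inf_{p\in\mathbb{Q}} h_{(-\infty,p]}(s,t) = 0\Big\}, \qquad \Omega(+\infty) = \Big\{(s,t) : \sup_{p\in\mathbb{Q}} h_{(-\infty,p]}(s,t) = 1\Big\} \]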
and x ↦ H((s, t), x) is nondecreasing and right–continuous. Since the infimum in (21.9) is taken over a countable set, for any x ∈ R the map (s, t) ↦ H((s, t), x) is S ⊗ T –measurable.
As a consequence, for each (s, t) there is a unique measure µ((s, t), du) on (U, U ) whose distribution is given by H((s, t), ·). We claim that µ is in fact a kernel from S × T to U .
Indeed, let C be the collection of sets B ∈ U for which the map (s, t) ↦ µ((s, t), B) is
S ⊗ T –measurable. It is obvious that C is a λ–system that contains all the intervals (a, b],
−∞ ≤ a < b < ∞. Sierpinski’s monotone class theorem implies that C = U .
21.3. Exchangeability
Suppose X = {Xi : i ∈ I} is a collection of measurable functions defined on some probability
space (Ω, F , P) with values in some Polish space (E, d). Denote by SI the collection of all finite permutations ρ on I, that is, ρ : I → I is bijective and ρ(i) = i for all but finitely many i ∈ I.
Definition 21.3.1. The collection X is exchangeable if the joint law of X is the same as the joint law of Xρ = {Xρ(i) : i ∈ I}, for any ρ ∈ SI .
A simple example is an i.i.d sequence {Xn }. In this section we will derive a result that
characterizes exchangeable sequences of E–valued functions. For simplicity, we will consider
the canonical probability space of sequences in E invariant under finite permutations; that is $\Omega = E^{\mathbb{N}}$, $\mathcal{F} = \mathcal{B}(E)^{\otimes\mathbb{N}}$, and P is a probability measure on F such that P = P ◦ ρ−1 for any ρ ∈ SN. We will use Xn (x) = xn to denote the projection onto the n–th component (or copy of E).
Given a permutation ρ ∈ Sn (Sn := S{1,...,n} ), we will consider its extension, also denoted by ρ, to SN by setting ρ(j) = j for all j > n. Similarly, given a measurable function $f : E^n \to \mathbb{R}$, we consider its extension to $E^{\mathbb{N}}$, denoted also by f , as f (x1 , . . . , xn , . . .) = f (x1 , . . . , xn ).
A measurable function $F : E^{\mathbb{N}} \to \mathbb{R}$ is symmetric if F (xρ ) = F (x) for all ρ ∈ SN. A measurable function $f : E^n \to \mathbb{R}$ is n–symmetric if f (xρ ) = f (x) for all ρ ∈ Sn .
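Example 21.3.2. $S(x) = \frac{1}{n}\sum_{k=1}^n x_k$ is n–symmetric, but not symmetric as a function on $\mathbb{R}^{\mathbb{N}}$. The function $a(x) = \limsup_n \frac{1}{n}\sum_{k=1}^n x_k$ is a symmetric $\overline{\mathbb{R}}$–valued function. If B ∈ B(E), $F(x) = \mathbf{1}_{\{X_n \in B\ \text{i.o.}\}}(x)$ is symmetric.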
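The distinction between n–symmetric and symmetric functions can be checked on a small instance. The following Python sketch, with an arbitrarily chosen point and n = 3 (both illustrative assumptions), verifies that the average of the first n coordinates is unchanged by every permutation of those coordinates, but changes under a finite permutation that moves a coordinate beyond position n.

```python
from itertools import permutations


def S(x, n):
    """Average of the first n coordinates of x."""
    return sum(x[:n]) / n


x = (1.0, 4.0, 2.0, 10.0)   # first four coordinates of a point of R^N (assumed example)
n = 3

# Invariance under every permutation of the first n coordinates.
n_symmetric = all(
    abs(S(p + x[n:], n) - S(x, n)) < 1e-12 for p in permutations(x[:n])
)

# Transposition of coordinates 1 and 4: a finite permutation outside S_n.
swapped = (x[3],) + x[1:3] + (x[0],)

print(n_symmetric, S(x, n), S(swapped, n))
```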
This shows that E[g(X)|En ] = E[g(Xρ )|En ]. Since $A_n g(X) = \frac{1}{n!}\sum_{\rho\in S_n} g(X_\rho) = A_n(g)(X_\rho)$, we conclude that E[g(X)|En ] = An g(X).
The term Bn,m on the right–side involves at most (n − 1)! m terms; thus, by the exchangeability of X, $\|B_{n,m}\| \le \frac{m}{n}\,\|g(X_1,\ldots,X_k)\|_1 \to 0$ as n → ∞. Therefore, E[Ag(X)|E ] = limn Gn,m ∈ σ(Xj : j > m).
Theorem 21.3.5. (de Finetti) The sequence (Xn ) is exchangeable iff there exists a σ–
algebra A ⊂ F such that (Xn ) is i.i.d. conditioned to A . In either case, A can be taken
to be the exchange σ–algebra E or any sub σ–algebra of it.
For each n ∈ N there is a natural measurable map ξn from (Ω, F , P) to the space of probability measures on (E, B(E)) given by $x \mapsto \frac{1}{n}\sum_{k=1}^n \delta_{x_k}$. This is the n–empirical measure of P. The following result gives a detailed description of the limiting distribution of the empirical measures ξn as n → ∞.
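As a concrete illustration of the n–empirical measure, the following Python sketch represents ξn as a dictionary of atoms and weights and integrates a function against it. The helper names `empirical_measure` and `integrate`, the two-point space {"a", "b"}, and the fixed random seed are assumptions made only for this example.

```python
import random
from collections import Counter


def empirical_measure(xs):
    """Return (1/n) * sum_k delta_{x_k} as a dict mapping each atom to its weight."""
    n = len(xs)
    return {x: count / n for x, count in Counter(xs).items()}


def integrate(g, xi):
    """Integral of g against a finitely supported measure xi."""
    return sum(g(x) * weight for x, weight in xi.items())


random.seed(0)
sample = [random.choice("ab") for _ in range(10_000)]   # an i.i.d. (hence exchangeable) sample
xi_n = empirical_measure(sample)
print(xi_n, integrate(lambda x: 1.0 if x == "a" else 0.0, xi_n))
```

For an i.i.d. sample the empirical weights settle near the common law of the coordinates; for a general exchangeable sequence the limit is random, which is the content of the result that follows.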
Theorem 21.3.6. (Xn ) is an exchangeable sequence iff there is a σ–algebra A and a
A –measurable random measure ξ on (Ω, F ) such that conditioned to A , (Xn ) is an i.i.d
sequence with
\[ E[g(X_1) \mid \mathcal{A}] = E[g(X_1) \mid \xi] = \int g(x)\,\xi(dx) \]
21.4. Exercises
Exercise 21.4.1. Apply Theorem 21.1.1 to provide a probabilistic proof of the Radon–
Nikodym theorem in the case where B is countably generated. Observe that the conditional
expectations that are needed have elementary definitions in this case and there is no need
of circular reasoning.
Appendix A
Infinite series on
Banach spaces
Recall that for any numerical sequence {an : n ∈ N}, the series $\sum_n a_n$ is convergent if $\lim_{n\to\infty}\sum_{k=1}^n a_k$ exists. The series $\sum_n a_n$ is absolutely convergent when $\sum_n |a_n|$ converges. If $\sum_n a_n$ converges but it is not absolutely convergent, we say that the series $\sum_n a_n$ is conditionally convergent. It is a well known result that when $\sum_n a_n$ is absolutely convergent, then $\sum_n a_n$ is convergent. All this is easily extended to sequences {an : n ∈ N} in a complete (real or complex) normed space X by substituting the modulus | | in R or C by the norm ‖ ‖ in X. Throughout this section, we will assume that (X, | |) is a Banach space.
\[ \sum_{k=1}^{n} |b_k| \le \sum_{k=1}^{N} |a_k| \le \sum_{k=1}^{\infty} |a_k| \]
Thus $\sum_n b_n$ is absolutely convergent. Let $s := \sum_n a_n$. Given ε > 0 there is N1 = N1 (ε) ∈ N such that $\big|\sum_{k=1}^{n} a_k - s\big| < \frac{\varepsilon}{2}$ and $\sum_{k>n} |a_k| < \frac{\varepsilon}{2}$ whenever n ≥ N1 . Setting N2 := max{f −1 (k) :
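\[ |t_k| \le \sum_{j=1}^{k}\sum_{n=1}^{\infty} |b_j(n)| = \sum_{n=1}^{\infty}\sum_{j=1}^{k} |b_j(n)| = \sum_{n=1}^{\infty}\sum_{j=1}^{k} |a_{f(j,n)}| \le \sum_{n=1}^{\infty} |a_n| \]
Thus, $\sum_k s_k$ is absolutely convergent. Let $S = \sum_n a_n$. Given ε > 0, there is N = N (ε) such that $\sum_{k>n} |a_k| < \frac{\varepsilon}{2}$ whenever n > N . Let r be the minimum integer such that $\{1, \ldots, N\} \subset \bigcup_{j=1}^{r} f(j, \mathbb{N})$. For n > N ∨ r,
\[ \Big|\sum_{k=1}^{n} s_k - \sum_{k=1}^{n} a_k\Big| \le \sum_{n>N} |a_n| < \frac{\varepsilon}{2} \]
\[ \Big|\sum_{k=1}^{n} a_k - \sum_{n=1}^{\infty} a_n\Big| \le \sum_{k>n} |a_k| < \frac{\varepsilon}{2} \]
Putting things together, we have that $\big|\sum_{k=1}^{n} s_k - \sum_{n=1}^{\infty} a_n\big| < \varepsilon$ whenever n > N ∨ r. This completes the proof of (ii).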
Lemma A.1.3. For any sequence (an ) ⊂ X,
\[ \liminf \frac{|a_{n+1}|}{|a_n|} \le \liminf \sqrt[n]{|a_n|} \le \limsup \sqrt[n]{|a_n|} \le \limsup \frac{|a_{n+1}|}{|a_n|} \]
Proof. Let $\beta^* = \limsup \frac{|a_{n+1}|}{|a_n|}$, $\alpha^* = \limsup \sqrt[n]{|a_n|}$, $\beta_* = \liminf \frac{|a_{n+1}|}{|a_n|}$ and $\alpha_* = \liminf \sqrt[n]{|a_n|}$.
If β∗ < ∞, then for any b > β∗ fixed, there exists N ∈ N such that
|an+1 | < b|an | for all n ≥ N.
Hence, $|a_{m+N}| \le b^m |a_N|$ for all m > 0; consequently,
\[ \sqrt[n]{|a_n|} \le b^{1-N/n}\,\sqrt[n]{|a_N|} \]
for n > N . Letting n → ∞ and then b ↘ β∗ shows that α∗ ≤ β∗ . A similar argument shows that $\beta_* \le \alpha_*$.
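The chain of inequalities in Lemma A.1.3 can be strict. The following Python sketch uses a hypothetical test sequence, |a_n| = 2^{-n} for even n and 3 · 2^{-n} for odd n, and approximates the four limits by the minimum and maximum over a finite tail (an approximation, not an exact computation of lim inf or lim sup).

```python
def a(n):
    """Hypothetical test sequence: 2^{-n} for even n, 3 * 2^{-n} for odd n."""
    return (3.0 if n % 2 else 1.0) * 2.0 ** (-n)


tail = range(100, 200)
ratios = [a(n + 1) / a(n) for n in tail]      # alternates between 3/2 and 1/6
roots = [a(n) ** (1.0 / n) for n in tail]     # tends to 1/2

lo_ratio, hi_ratio = min(ratios), max(ratios)
lo_root, hi_root = min(roots), max(roots)
print(lo_ratio, lo_root, hi_root, hi_ratio)
```

Here the ratio test is inconclusive (the ratios oscillate between 1/6 and 3/2), while the root test detects absolute convergence because the n–th roots stay close to 1/2.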
A.1. Properties of absolutely convergent series 627
Theorem A.1.4. (i) If $\limsup \sqrt[n]{|a_n|} < 1$, then $\sum a_n$ converges absolutely.
(ii) If $\limsup \sqrt[n]{|a_n|} > 1$, then $\sum a_n$ diverges.
(iii) If $\limsup \sqrt[n]{|a_n|} = 1$, the convergence (or divergence) of $\sum a_n$ is inconclusive.
Proof. Suppose $\alpha^* = \limsup \sqrt[n]{|a_n|} < 1$. Then for α∗ < A < 1, there exists N ∈ N such that $\sqrt[n]{|a_n|} < A$ for all n ≥ N . Thus, $|a_n| < A^n$ for all n ≥ N and since the geometric series $\sum A^n$ converges, it follows that $\sum a_n$ converges absolutely.
On the other hand, if α∗ > 1, then for any fixed 1 < A < α∗ , there are infinitely many an with $|a_n| > A^n$. Therefore, an ↛ 0, and so $\sum a_n$ diverges.
The series $\sum a_n$ and $\sum b_n$ with an = 1/n and bn = 1/n² diverge and converge respectively; in both cases, $\lim \sqrt[n]{a_n} = 1 = \lim \sqrt[n]{b_n}$.
Proof. Suppose ‖Sn ‖ ≤ c for all n ∈ N and some constant c > 0. Summation by parts gives
\[ \sum_{k=1}^{n} a_k x_k = a_{n+1} S_n - \sum_{k=1}^{n} (a_{k+1} - a_k) S_k \]
The first term on the right hand side converges to 0 in X since ‖an+1 Sn ‖ ≤ c|an+1 |. The second term on the right hand side converges absolutely since ‖(ak+1 − ak )Sk ‖ ≤ c|ak+1 − ak |.
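The summation by parts identity used in this proof can be verified numerically. The following Python sketch takes X = R, the scalars a_k = 1/k and the vectors x_k = (−1)^{k+1}; these choices and the function names are illustrative assumptions. Python lists are 0-based, so `a[k]` stores a_{k+1}.

```python
def partial_sums(x):
    """Return [S_1, ..., S_n] with S_k = x_1 + ... + x_k."""
    sums, running = [], 0.0
    for term in x:
        running += term
        sums.append(running)
    return sums


def direct_sum(a, x):
    """Left-hand side: sum_{k=1}^n a_k x_k (the list a holds a_1, ..., a_{n+1})."""
    return sum(a[k] * x[k] for k in range(len(x)))


def by_parts(a, x):
    """Right-hand side: a_{n+1} S_n - sum_{k=1}^n (a_{k+1} - a_k) S_k."""
    n = len(x)
    S = partial_sums(x)
    return a[n] * S[n - 1] - sum((a[k + 1] - a[k]) * S[k] for k in range(n))


n = 50
a = [1.0 / (k + 1) for k in range(n + 1)]   # a_k = 1/k, nonincreasing to 0
x = [(-1.0) ** k for k in range(n)]         # partial sums bounded by 1
print(direct_sum(a, x), by_parts(a, x))
```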
Example A.1.6. The simplest application of Theorem A.1.5 is to determine convergence of alternating series. Suppose an is a nonincreasing sequence converging to 0. Then $\sum_n (-1)^{n+1} a_n$ converges.
We conclude this section by introducing one type of convergence of series that will appear in these notes and which can be extended to complete normed spaces.
Definition A.1.7. Let $s_n = \sum_{k=0}^{n} b_k$ be the n–th partial sum of the series $\sum b_n$. The series is Cesàro summable if $\sigma_n = \frac{1}{n}\sum_{k=0}^{n-1} s_k$ converges, and Abel summable if $A(r) = \sum_{n=0}^{\infty} b_n r^n$ converges as r → 1−.
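Both notions of summability can assign a value to a divergent series. As a hedged illustration, the Python sketch below computes Cesàro means and a truncated Abel sum for the series with b_n = (−1)^n, n ≥ 0, indexing the partial sums from 0 (a convention assumed here). The function names and the truncation length are assumptions made for the example.

```python
def cesaro_mean(b, n):
    """sigma_n = (1/n) * (s_0 + ... + s_{n-1}), where s_k = b(0) + ... + b(k)."""
    partial, total = 0.0, 0.0
    for k in range(n):
        partial += b(k)   # s_k
        total += partial
    return total / n


def abel_sum(b, r, terms=50_000):
    """Truncation of A(r) = sum_{n >= 0} b(n) r^n, for 0 <= r < 1."""
    return sum(b(n) * r**n for n in range(terms))


grandi = lambda n: (-1.0) ** n   # partial sums 1, 0, 1, 0, ...: the series diverges

print(cesaro_mean(grandi, 10_000))   # Cesaro means approach 1/2
print(abel_sum(grandi, 0.999))       # A(r) = 1/(1 + r), close to 1/2 near r = 1
```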
Theorem A.1.8. (Abel’s test) If $\sum b_n$ converges and has sum B then, the series $\sum b_n$ is Cesàro and Abel summable, and $B = \lim_{n\to\infty} \sigma_n = \lim_{r\to 1-} A(r)$.
Proof. Let $B_n := \sum_{k=0}^{n} b_k$. For every integer N , we have by summation by parts that
\[ \sum_{k=0}^{N} b_k r^k = r^{N+1} B_N - \sum_{k=0}^{N} (r^{k+1} - r^k) B_k = r^{N+1} B_N + (1-r)\sum_{k=0}^{N} r^k B_k \]
Since $\sum_n b_n$ converges, {Bn : n ∈ Z+ } is bounded, that is |Bn | ≤ c for some constant c > 0, and $\sum r^n B_n$ converges for 0 ≤ r < 1. Given ε > 0 there is N such that n ≥ N implies that |Bn − B| < ε/2. Breaking the sum in two parts we obtain
\[ \Big|\sum_{n=0}^{\infty} r^n b_n - B\Big| = (1-r)\Big|\sum_{n=0}^{\infty} r^n (B_n - B)\Big| \le (1-r)\sum_{n=0}^{N-1} r^n |B_n - B| + r^N \frac{\varepsilon}{2} \le (1-r)\,2cN + \frac{\varepsilon}{2} \]
Thus, if $|1-r| < \frac{\varepsilon}{4Nc}$ we obtain that $\big|\sum_{n=0}^{\infty} r^n b_n - B\big| < \varepsilon$. Therefore $\lim_{r\to 1-} A(r) = A(1) := B$.
Cesàro convergence is left as an exercise.
Even if the iterated limits $\lim_{p\to\infty}\lim_{q\to\infty} a(p,q)$ and $\lim_{q\to\infty}\lim_{p\to\infty} a(p,q)$ exist and are equal, it may happen that the double sequence a(p, q) diverges.
Example A.2.2. Consider $a(p,q) = \frac{pq}{p^2+q^2}$. The iterated limits are both zero; however, the double sequence a(p, q) diverges.
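A short numerical check makes the divergence in Example A.2.2 visible: along the diagonal the sequence is constantly 1/2, while along the path q = p² it tends to 0. The Python sketch below evaluates both paths; the specific indices are arbitrary choices.

```python
def a(p, q):
    """Double sequence of Example A.2.2."""
    return p * q / (p * p + q * q)


for p in (10, 100, 1000):
    # diagonal value, value along q = p^2, and a value with q much larger than p
    print(p, a(p, p), a(p, p * p), a(p, 10**9))
```

Since two paths to infinity give different limits, the double limit cannot exist, even though each iterated limit is 0.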
Theorem A.2.3. Suppose that $\lim_{p,q\to\infty} a(p,q) = \alpha$ and that for any p, the limit $\lim_{q\to\infty} a(p,q)$ exists. Then the iterated limit $\lim_{p\to\infty}\lim_{q\to\infty} a(p,q) = \alpha$.
Proof. For any ε > 0, there is N1 = N1 (ε) such that if p > N1 and q > N1 , then
|a(p, q) − α| < ε/2.
For each p, let A(p) = limq→∞ a(p, q). Therefore, there is N2 = N2 (p, ε) such that if q > N2 , then |a(p, q) − A(p)| < ε/2.
For each p > N1 , choose q = q(p) > N1 ∨ N2 . It follows that
|A(p) − α| ≤ |A(p) − a(p, q)| + |a(p, q) − α| < ε
This completes the proof.
Definition A.2.4. Given a double sequence a(m, n) consider the sequence of double partial sums
\[ s(p,q) = \sum_{m=1}^{p}\sum_{n=1}^{q} a(m,n) \tag{A.1} \]
The double series $\sum a(p,q)$ converges to a sum S if limp,q→∞ s(p, q) = S. The double series $\sum a(p,q)$ is absolutely convergent if $\sum |a(p,q)|$ converges.
Lemma A.2.5. If $\sum_{p,q} a(p,q)$ converges absolutely, then $\sum_{p,q} a(p,q)$ converges.
A.2. Double series 629
Proof. Let $\varphi_n := \sum_{p=1}^{n}\sum_{q=1}^{n} a(p,q)$. Since $\sum_{p,q} |a(p,q)|$ converges, φn is a Cauchy sequence and so it converges to some point, say S. Given ε > 0, there is N1 such that |φn − S| < ε/2 whenever n ≥ N1 . Let $s = \sum_{p,q} |a(p,q)|$. For ε > 0, there is N2 such that when p, q ≥ N2
\[ s - \sum_{m=1}^{p}\sum_{n=1}^{q} |a(m,n)| < \frac{\varepsilon}{2} \]
For p, q > N := N1 ∨ N2
\[ |s(p,q) - S| \le |s(p,q) - \varphi_N| + |\varphi_N - S| \le s - \sum_{m=1}^{N}\sum_{n=1}^{N} |a(m,n)| + \frac{\varepsilon}{2} < \varepsilon \]
This shows that $\sum_{p,q} a(p,q)$ converges to S.
Remark A.2.6. If a(m, n) is a nonnegative double sequence, and $s(p,q) := \sum_{m=1}^{p}\sum_{n=1}^{q} a(m,n)$ is a bounded double sequence, then the series $\sum a(p,q)$ converges. In particular, if the iterated sum $\sum_{m=1}^{\infty}\sum_{n=1}^{\infty} a(m,n)$ converges and has limit s, then $\sum_{p,q} a(p,q)$ converges to s.
Theorem A.2.7. Suppose $\sum_{n,m} a(n,m)$ is an absolutely convergent double series. Let g : N → N × N be a bijection. Consider the rearrangement G(n) = a(g(n)). Then $\sum_n G(n)$ is absolutely convergent and
\[ \sum_{n=1}^{\infty} G(n) = \lim_{p,q\to\infty} s(p,q) = \lim_{p\to\infty}\lim_{q\to\infty} s(p,q) = \lim_{q\to\infty}\lim_{p\to\infty} s(p,q) \]
That is
\[ \sum_{n} G(n) = \sum_{n,m} a(n,m) = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a(n,m) = \sum_{m=1}^{\infty}\sum_{n=1}^{\infty} a(n,m) \]
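Theorem A.2.7 can be illustrated with an explicit bijection. The Python sketch below enumerates N × N along anti-diagonals (one possible choice of g, assumed here for illustration) and compares the rearranged sum with the iterated sum for the hypothetical double sequence a(m, n) = 2^{-m} 3^{-n}, whose total is 1 · 1/2.

```python
def diagonal_enumeration(limit):
    """Yield the first `limit` pairs (m, n) of N x N along anti-diagonals m + n = 2, 3, ..."""
    count, d = 0, 2
    while True:
        for m in range(1, d):
            if count == limit:
                return
            yield (m, d - m)
            count += 1
        d += 1


def a(m, n):
    """Hypothetical absolutely summable double sequence."""
    return 2.0 ** (-m) * 3.0 ** (-n)


rearranged = sum(a(m, n) for m, n in diagonal_enumeration(5000))
iterated = sum(sum(a(m, n) for n in range(1, 60)) for m in range(1, 60))
print(rearranged, iterated)
```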
Proof. Let $T_k = \sum_{j=1}^{k} |G(j)|$ and $S(p,q) = \sum_{m=1}^{p}\sum_{n=1}^{q} |a(m,n)|$. For any k, there is a pair (p, q) such that Tk ≤ S(p, q). Conversely, for any pair (p, q) there is k such that S(p, q) ≤ Tk . This shows that $\sum_n G(n)$ is absolutely convergent iff $\sum_{p,q} a(p,q)$ is absolutely convergent. Let $s(p,q) = \sum_{m=1}^{p}\sum_{n=1}^{q} a(m,n)$. Lemma A.2.5 shows that $s := \sum a(p,q) = \lim_{p,q\to\infty} s(p,q)$ exists.
Example A.2.8. Suppose {an : n ∈ N} and {bn : n ∈ N} are bounded sequences. Then $\sum_{n,m} a_n b_m (nm)^{-s}$ is convergent for all s ∈ C with Re(s) > 1. Moreover,
\[ \sum_{m,n} \frac{a_m b_n}{(mn)^s} = \Big(\sum_{n=1}^{\infty} \frac{a_n}{n^s}\Big)\Big(\sum_{m=1}^{\infty} \frac{b_m}{m^s}\Big) = \sum_{n=1}^{\infty} \Big(\sum_{d|n} a_d\, b_{n/d}\Big)\frac{1}{n^s} \]
where for each n ∈ N, the sum inside the parentheses runs along all (positive integer) divisors d of n. In particular, for an = 1 = bn , we have that
\[ \Big(\sum_{n=1}^{\infty} \frac{1}{n^s}\Big)^2 = \sum_{n=1}^{\infty} \frac{d(n)}{n^s} \]
where d(n) is the number of divisors of n.
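The last identity can be checked numerically for s = 2, using the classical value of the sum of 1/n², namely π²/6. The Python sketch below computes d(n) with a simple sieve and compares a truncated divisor series with (π²/6)²; the cutoff N = 20000 and the function name `divisor_counts` are assumptions, and the truncated sum slightly undershoots the limit.

```python
import math


def divisor_counts(N):
    """Return a list d with d[n] = number of positive divisors of n, for 1 <= n <= N."""
    d = [0] * (N + 1)
    for k in range(1, N + 1):
        for multiple in range(k, N + 1, k):
            d[multiple] += 1
    return d


N = 20_000
d = divisor_counts(N)
zeta2_squared = (math.pi**2 / 6) ** 2
divisor_series = sum(d[n] / n**2 for n in range(1, N + 1))
print(zeta2_squared, divisor_series)
```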
A.3. Exercises
Exercise A.3.1. Suppose (an : n ∈ N) is a sequence in a Banach space (X, | |). Show that
(i) $\sum_n a_n$ converges iff for any ε > 0 there exists N such that |an + . . . + am | < ε whenever m > n ≥ N .
(ii) If $\sum_n a_n$ converges absolutely, then $\sum_n a_n$ converges.
Lower semicontinuous
and convex functions
Theorem B.1.2. f is lower semicontinuous iff for any x ∈ X and any net xα → x,
\[ f(x) \le \liminf_{\alpha} f(x_\alpha). \tag{B.1} \]
If in addition X is first countable, the statements above hold with sequences in place of nets.
Proof. Suppose f is lower semicontinuous and let {xn : n ∈ D} be a net that converges to x. For any α < f (x) the set V = {f > α} is an open neighborhood of x. Hence there is n0 ∈ D such that n ≥ n0 implies that f (xn ) > α; this implies that α ≤ lim inf n f (xn ). (B.1) follows by letting α ↗ f (x).
Suppose that (B.1) holds for any x ∈ X and any net xn → x. We will show that for each
α ∈ R, the set Fα := {f ≤ α} is closed. Let {xn : n ∈ D} be a net in Fα that converges to
a point x ∈ X. Then f (xn ) ≤ α for all n ∈ D, and so
\[ f(x) \le \liminf_{n} f(x_n) = \sup_{n\in D}\ \inf_{m\in D:\, m\ge n} f(x_m) \le \alpha. \]
Therefore x ∈ Fα .
Lemma B.1.3. The epigraph of a function f : X → R is defined as
epi(f ) = {(x, α) ∈ X × R : f (x) ≤ α}.
Then, f is lower semicontinuous iff epi(f ) is closed.
Proof. For any a ∈ f (X) let Fa = {f ≤ a}. Each Fa is closed and the collection {Fa : a ∈ f (X)} satisfies the finite intersection property. Consequently the set of minimizers $F := \bigcap_{a\in f(X)} F_a \ne \emptyset$.
Theorem B.1.5. Let X be a locally compact Hausdorff topological space. For any lower
semicontinuous function f ≥ 0, f = sup{φ ∈ C00 (X) : φ ≤ f }.
Proof. Let x0 ∈ X. If f (x0 ) = 0 let ψ ≡ 0. If f (x0 ) > 0, then for any 0 < a < f (x0 ),
Ua = {x ∈ X : f (x) > a} is an open neighborhood of x0 . By Urysohn’s lemma, there
is ψa ∈ C00 (X) such that 1{x0 } ≺ ψa ≺ 1Ua . Hence, φa = aψa satisfies a = φa (x0 ) and
0 ≤ φa ≤ f . The conclusion follows immediately.
Theorem B.1.6. Let (S, d) be a metric space and suppose $f \in \overline{\mathbb{R}}^S$ and f (x) ≥ b > −∞ for all x ∈ S. The function f is lower semicontinuous if and only if there is a sequence of bounded Lipschitz continuous functions fk such that inf k,x {fk (x)} ≥ b and fk ↗ f pointwise.
Proof. Sufficiency is clear since continuous functions are lower semicontinuous, and so is
the supremum of lower semicontinuous functions.
It suffices to assume that f ≥ 0. For each t ≥ 0 define
\[ g_t(x) = \inf_{z} \{ f(z) + t\,d(x,z) \} \]
B.2. Convex functions 635
Clearly 0 ≤ gs ≤ gt whenever s < t, and gt (x) ≤ f (x) + td(x, x) = f (x). Notice that for all
x, y ∈ S, f (z) + td(x, z) ≤ f (z) + td(y, z) + td(x, y); consequently, gt (x) ≤ gt (y) + td(x, y).
By symmetry, we obtain |gt (x) − gt (y)| ≤ td(x, y), which means that each gt is Lipschitz continuous. If h = limn gn , then 0 ≤ h ≤ f . We will show that h = f . To that purpose, fix
x ∈ S and let ε > 0. For each n ∈ N, there is zn ∈ S such that
(B.2) gn (x) + ε > f (zn ) + nd(x, zn ) ≥ nd(x, zn )
Since f (x) ≥ gn (x), it follows that f (x) + ε > nd(x, zn ) for all n; hence, zn converges to
x. Since f is lower semicontinuous, there is N such that for n ≥ N , f (x) − ε < f (zn ). For
such n, we obtain from (B.2) that gn (x) > f (x) − 2ε. Letting n → ∞ and then ε ↘ 0 shows that h = f . To conclude, notice that {fn := gn ∧ n : n ∈ Z+ } is an increasing sequence of nonnegative bounded Lipschitz–continuous functions which converges to f .
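The regularization gt used in this proof can be computed exactly when the metric space is a finite set of points. The Python sketch below does this for a grid of points in [−1, 1] with the usual distance and a lower semicontinuous step function; the grid, the step function, and the name `lipschitz_regularization` are illustrative assumptions.

```python
def lipschitz_regularization(f, points, t):
    """g_t(x) = min over z in points of f(z) + t * |x - z|, evaluated at each x in points."""
    return [min(f(z) + t * abs(x - z) for z in points) for x in points]


def step(x):
    """Lower semicontinuous step: 0 on (-inf, 0], 1 on (0, inf)."""
    return 0.0 if x <= 0 else 1.0


points = [i / 100 for i in range(-100, 101)]   # the finite metric space S
g1 = lipschitz_regularization(step, points, 1.0)
g5 = lipschitz_regularization(step, points, 5.0)
print(g1[110], g5[110], step(points[110]))     # values at x = 0.1
```

On this grid one can observe the three facts used in the proof: gt ≤ f, gt is nondecreasing in t, and gt is Lipschitz with constant t.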
Proof. Suppose there is x1 ∈ X with f (x1 ) = −∞. Let xλ := λx0 + (1 − λ)x1 . Then,
since epi(f ) is convex, it follows that f (xλ ) = −∞ for all 0 ≤ λ < 1. Therefore, f (x) ≤
lim inf λ→1 f (xλ ) = −∞ which is a contradiction.
Lemma B.2.4. Suppose f is convex. If x ∈ core(dom(f )) and f (x) ∈ R, then f is proper.
Proof. Let y ∈ dom(f ). There exists ε > 0 such that x + ε(x − y) ∈ dom(f ). As
\[ x = \frac{\varepsilon}{1+\varepsilon}\,y + \frac{1}{1+\varepsilon}\,\big(x + \varepsilon(x-y)\big), \]
for any α > f (y) and β > f (x + ε(x − y)) we have that
\[ f(x) \le \frac{\varepsilon}{1+\varepsilon}\,\alpha + \frac{1}{1+\varepsilon}\,\beta. \]
Since f (x) > −∞, letting α → f (y) we conclude that f (y) > −∞.
Theorem B.2.5. Suppose $f : X \to \overline{\mathbb{R}}$ is convex and that for some x0 ∈ X there is an open neighborhood V of 0 such that $\sup_{x\in V} f(x_0 + x) < \infty$. If f (x0 ) ∈ R, then f is proper and continuous on $\mathrm{dom}(f)^o$.
Without loss of generality suppose V is balanced. Then, for any 0 < λ ≤ 1 and x ∈ V
f (x0 + λx) = f (λ(x0 + x) + (1 − λ)x0 ) ≤ λf (x0 + x) + (1 − λ)f (x0 ).
Thus
f (x0 + λx) − f (x0 ) ≤ λ(f (x0 + x) − f (x0 )) ≤ λ(m − f (x0 ))
On the other hand,
\[ x_0 = \frac{1}{1+\lambda}(x_0 + \lambda x) + \frac{\lambda}{1+\lambda}(x_0 - x). \]
Thus,
\[ f(x_0) \le \frac{1}{1+\lambda} f(x_0 + \lambda x) + \frac{\lambda}{1+\lambda} f(x_0 - x) \]
whence it follows that
f (x0 ) − f (x0 + λx) ≤ λ(f (x0 − x) − f (x0 )) ≤ λ(m − f (x0 ))
Consequently,
|f (y) − f (x0 )| ≤ λ(m − f (x0 )), y ∈ x0 + λV
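and continuity at x0 follows. For any $z \in \mathrm{dom}(f)^o$ there is µ > 1 such that $x_0 + \mu(z - x_0) \in \mathrm{dom}(f)^o$. For any x ∈ V ,
\[ z + \Big(1 - \frac{1}{\mu}\Big)x = \frac{1}{\mu}\big(x_0 + \mu(z - x_0)\big) + \Big(1 - \frac{1}{\mu}\Big)(x_0 + x). \]
Hence
\[ f\Big(z + \Big(1 - \frac{1}{\mu}\Big)x\Big) \le \frac{1}{\mu} f\big(x_0 + \mu(z - x_0)\big) + \Big(1 - \frac{1}{\mu}\Big) f(x_0 + x) \le \frac{1}{\mu} f\big(x_0 + \mu(z - x_0)\big) + \Big(1 - \frac{1}{\mu}\Big) m. \]
This shows that f is bounded above on $z + \big(1 - \frac{1}{\mu}\big)V$; therefore, by the argument developed above, f is continuous at z.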
Lemma B.2.7. Suppose f is a convex function in a real vector space X. For any x ∈ dom(f ) with f (x) > −∞ and d ∈ X, the function
\[ \lambda \mapsto \frac{f(x + \lambda d) - f(x)}{\lambda}, \]
is monotone nondecreasing on R \ {0}.
Proof. Suppose 0 < µ < λ. Then µ = αλ where 0 < α = µ/λ < 1. If (x + λd) ∉ dom(f ) there is nothing to prove. If (x + λd) ∈ dom(f ) then, for any v > f (x + λd),
f (x + µd) = f ((1 − α)x + α(x + λd)) < (1 − α)f (x) + αv.
Rearranging the terms above we obtain
\[ f(x + \mu d) - f(x) < \frac{\mu}{\lambda}\,(v - f(x)). \]
Letting v → f (x + λd) shows the result holds on (0, ∞).
Applying the result to −d we have that $\frac{f(x-\mu d) - f(x)}{\mu} \le \frac{f(x-\lambda d) - f(x)}{\lambda}$ whenever 0 < µ < λ. Hence,
\[ \frac{f(x - \lambda d) - f(x)}{-\lambda} \le \frac{f(x - \mu d) - f(x)}{-\mu}. \]
This shows the result holds on (−∞, 0).
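The monotonicity in Lemma B.2.7 is easy to observe numerically. The Python sketch below evaluates the difference quotient of the convex function exp at x = 0 in the direction d = 1 for increasing values of λ on both sides of 0; the choice of function, base point, and step sizes is an assumption made for illustration.

```python
import math


def quotient(f, x, d, lam):
    """Difference quotient (f(x + lam*d) - f(x)) / lam for lam != 0."""
    return (f(x + lam * d) - f(x)) / lam


lams = [-2.0, -1.0, -0.5, -0.1, 0.1, 0.5, 1.0, 2.0]
qs = [quotient(math.exp, 0.0, 1.0, lam) for lam in lams]
for lam, q in zip(lams, qs):
    print(lam, q)
```

The printed quotients increase with λ, and the values just left and right of 0 bracket the derivative exp′(0) = 1.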
Remark B.2.8. If f is a convex function with dom(f ) ≠ ∅ and ∂f (x) ≠ ∅ for some x ∈ X, then clearly x ∈ dom(f ). Also, f attains a minimum at x iff 0 ∈ ∂f (x).
Suppose f is a proper convex function and let x ∈ dom(f ). The map f+′ (x; ·) given by
\[ d \mapsto \lim_{\lambda \searrow 0} \frac{f(x + \lambda d) - f(x)}{\lambda} = \inf_{\lambda > 0} \frac{f(x + \lambda d) - f(x)}{\lambda} \]
is the right–sided directional derivative of f at x. If f+′ (x; ·) ∈ X ∗ then it is the Gâteaux derivative of f at x.
Theorem B.2.9. Suppose f is a proper convex function in the topological vector space X.
Let x ∈ dom(f ). Then
(i) The function d 7→ f+′ (x; d) is positive homogeneous and convex.
(ii) If f is continuous at x then d 7→ f+′ (x; d) is continuous on X.
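Proof. Positive homogeneity follows from $\frac{f(x + \alpha\lambda d) - f(x)}{\lambda} = \alpha\,\frac{f(x + \alpha\lambda d) - f(x)}{\alpha\lambda}$. Let d, v ∈ X and 0 < α < 1. As f is convex and proper, for any λ > 0,
\[ \frac{f\big(x + \lambda(\alpha d + (1-\alpha)v)\big) - f(x)}{\lambda} \le \alpha\,\frac{f(x + \lambda d) - f(x)}{\lambda} + (1-\alpha)\,\frac{f(x + \lambda v) - f(x)}{\lambda}. \]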
Statement (i) follows by letting λ ց 0.
Proof. By assumption (x, α) ∉ epi(f ). Since X × R is locally convex and epi(f ) is convex and closed in X × R, by Theorem 12.10.15[(iii)] there exist some (v, λ) ∈ X ∗ × R and ε > 0 such that
\[ v(x) + \lambda\alpha + \varepsilon = (v,\lambda)(x,\alpha) + \varepsilon < v(y) + \lambda\beta \tag{B.3} \]
for all (y, β) ∈ epi(f ). By letting β → ∞ we obtain that λ ≥ 0. If λ > 0 then
\[ g(y) = \frac{1}{\lambda}\,v(x - y) + \alpha, \qquad y \in X \]
satisfies α = g(x) and g(y) < f (y) for all y ∈ X.
If x ∈ dom(f ) then (x, f (x)) ∈ epi(f ) and from (B.3) we conclude that λ > 0.
If f (x) = +∞ and λ = 0 then v(x − y) < −ε for all y ∈ dom(f ). Hence, the continuous affine function h(y) := v(x − y) + ε/2 satisfies h(x) = ε/2 > 0 and h(y) < 0 for all y ∈ dom(f ).
Fix y0 ∈ dom(f ). As before, there exists a continuous affine function φ such that φ(y0 ) =
f (y0 ) − 1 and φ(y) < f (y) for all y ∈ X. For any c > 0 define
gc (y) := ch(y) + φ(y), y ∈ X.
For y ∈ dom(f ) we have that gc (y) < φ(y) < f (y); whereas if f (y) = ∞, gc (y) < ∞ = f (y). We choose c large enough so that gc (x) = cε/2 + φ(x) ≥ α. The corresponding function
gc has the desired properties.
Corollary B.2.12. Suppose X is a normed space and that f : X → R ∪ {+∞} is proper
lower semicontinuous and convex. If B ⊂ X is bounded then inf x∈B f (x) > −∞.
For any x ∈ X, part (i) gives f (x) ≥ v(x) − f ∗ (v) for all v ∈ X ∗ . Taking suprema over all
v ∈ X ∗ gives f (x) ≥ f ∗∗ (x).
(iv) It is enough to show that under the additional condition in (iv), f ≤ f ∗∗ . Let x ∈
X and suppose α < f (x). By Theorem B.2.11 there exists a continuous affine function
g(y) = v(y) − c such that α ≤ g(x) and g(y) < f (y) for all y ∈ X. We claim that
(v, c) ∈ E ∗ := epi(f ∗ ). Otherwise, f ∗ (v) > c and by definition of f ∗ there exists x0 ∈ X such that c < v(x0 ) − f (x0 ) which leads to the contradiction f (x0 ) < v(x0 ) − c = g(x0 ). Consequently,
\[ \alpha \le g(x) = v(x) - c \le \sup_{(w,\lambda)\in E^*} \{ w(x) - \lambda \} = \sup_{w\in \mathrm{dom}(f^*)} \{ w(x) - f^*(w) \} = f^{**}(x). \]
Proof. (i) implies (ii): By considering g(x) = f (x + x0 ) − f (x0 ) for some x0 ∈ dom(f ) if necessary, we can assume without loss of generality that f (0) = 0. Suppose $\liminf_{\|x\|\to\infty} \frac{f(x)}{\|x\|} \le 0$. Then, there exists a sequence (xn : n ∈ N) ⊂ X such that ‖xn ‖ > n and $\frac{f(x_n)}{\|x_n\|} < \frac{1}{n}$. Hence
\[ f\Big(\frac{n}{\|x_n\|}\,x_n\Big) = f\Big(\Big(1 - \frac{n}{\|x_n\|}\Big)0 + \frac{n}{\|x_n\|}\,x_n\Big) \le \frac{n}{\|x_n\|}\,f(x_n) < 1. \]
This shows that $\frac{n}{\|x_n\|}\,x_n \in F_1$ and so, F1 is unbounded.
B.3. Asymptotic Cones and Functions in Rn 641
A proper lower semicontinuous convex function that satisfies (iv) in Theorem B.2.17 is
said to be coercive.
Proof. (i) It is clear that C∞ is a cone and that 0 ∈ C∞ . Suppose d is in the closure of C∞ and let {dn : n ∈ N} be a sequence in C∞ with dn → d. There are t1 ≥ 1 and x1 ∈ C such that $\|t_1^{-1}x_1 - d_1\| \le 1$. Once t1 , . . . , tn−1 and x1 , . . . , xn−1 have been constructed, we can find tn > tn−1 and xn such that $\|t_n^{-1}x_n - d_n\| \le \frac{1}{n}$. It follows that $\|t_n^{-1}x_n - d\| \le \frac{1}{n} + \|d_n - d\| \to 0$, whence we conclude that d ∈ C∞ .
(ii) Clearly $C_\infty \subset (\overline{C})_\infty$. Suppose $d \in (\overline{C})_\infty$. Let tn ↗ ∞ and $x_n \in \overline{C}$ such that $t_n^{-1}x_n \to d$. For each n, there is x̂n ∈ C with $\|x_n - \hat{x}_n\| \le \frac{1}{n}$. Hence $\|t_n^{-1}\hat{x}_n - d\| \le t_n^{-1}\|\hat{x}_n - x_n\| + \|t_n^{-1}x_n - d\| \to 0$. Therefore, d ∈ C∞ .
(iii) Suppose C is a cone. If d ∈ C∞ then for some tn ↗ ∞ and xn ∈ C, $d = \lim_n t_n^{-1}x_n$. Since $t_n^{-1}x_n \in C$, $d \in \overline{C}$.
Conversely, suppose d ∈ C. As nd ∈ C and d = (1/n) nd, we have that d ∈ C∞ .
Theorem B.3.3. Suppose C is a nonempty convex subset in Rn . Then C∞ is a closed
convex cone and
(B.5) C∞ = {d ∈ Rn : ∀x ∈ C, ∀λ ≥ 0, x + λd ∈ C}
(B.6) = {d ∈ Rn : ∀λ ≥ 0, x0 + λd ∈ C}
for any x0 ∈ C.
Proof. Let R be the set in the right hand side of (B.5) and let Rx0 be the set in (B.6). Evidently R ⊂ Rx0 for any x0 ∈ C.
Suppose d ∈ Rx0 so that dλ = x0 + λd ∈ C for all λ ≥ 0. As $n^{-1}d_n \to d$, it follows that $d \in (\overline{C})_\infty = C_\infty$; therefore Rx0 ⊂ C∞ .
We now show that C∞ ⊂ R. Let d ∈ C∞ . There are tn ↗ ∞ and xn ∈ C such that $d = \lim_n t_n^{-1}x_n$. Let x ∈ C and set $d_n = t_n^{-1}(x_n - x)$. Then dn → d and xn = x + tn dn . For any λ > 0, there exists n0 such that tn > λ whenever n ≥ n0 . By convexity
\[ \hat{x}_n := x + \lambda d_n = \Big(1 - \frac{\lambda}{t_n}\Big)x + \frac{\lambda}{t_n}\,x_n \in C. \]
Then limn x̂n = x + λd ∈ C, which means that d ∈ R.
It remains to show that C∞ is convex. Suppose d1 , d2 ∈ C∞ . Let 0 < α < 1. Fix x0 ∈ C.
Then for all λ > 0, x0 + λdi ∈ C, i = 1, 2. Since C is convex so is its closure $\overline{C}$, and so
x0 + λ(αd1 + (1 − α)d2 ) = α(x0 + λd1 ) + (1 − α)(x0 + λd2 ) ∈ C.
Therefore, αd1 + (1 − α)d2 ∈ Rx0 = C∞ .
Theorem B.3.4. A nonempty subset C of Rn is bounded iff C∞ = {0}.
Suppose that F is a nonempty closed subset of X ×R with the property that if (x, µ) ∈ F
then (x, µ′ ) ∈ F for all µ′ > µ. Then, the function on X defined as
g(x) := inf{µ : (x, µ) ∈ F }
is the unique lower semicontinuous function with epi(g) = F .
Example B.3.5. Let $f : X \to \overline{\mathbb{R}}$ and set $F = \overline{\mathrm{epi}(f)}$. The function $\overline{f}(x) := \inf\{\mu : (x,\mu) \in F\}$ is called the closure of f . Clearly, it is lower semicontinuous and $\overline{f} \le f$. If f is convex, then so is $\overline{f}$.
(iii) Since f is proper, then (0, 0) ∈ epi(f∞ ), and so f∞ (0) ≤ 0. If f∞ (0) > −∞ then, as
f∞ (0) = f∞ (λ0) = λf∞ (0) for all λ > 0, it follows that f∞ (0) = 0.
Suppose f∞ (0) = 0 and that f∞ (x) = −∞ for some x. Then $0 = f_\infty(0) \le \liminf_n f_\infty(n^{-1}x) = \liminf_n n^{-1}f_\infty(x) = -\infty$, which is a contradiction.
Theorem B.3.8. For any proper function f : Rn → R ∪ {+∞}, the asymptotic function
f∞ associated to f is given by
\[ f_\infty(d) = \liminf_{\substack{z\to d \\ t\to\infty}} \frac{f(tz)}{t} \tag{B.7} \]
for all d ∈ Rn .
Proof. Let g(d) denote the right hand side of (B.7). It is enough to show that epi(f∞ ) = epi(g). Let (d, µ) ∈ epi(f∞ ). Then for some tn ↗ ∞ and (dn , µn ) ∈ epi(f ), $\lim_n t_n^{-1}(d_n, \mu_n) = (d, \mu)$. Hence
\[ \frac{1}{t_n}\,f\big(t_n\,t_n^{-1}d_n\big) \le \frac{\mu_n}{t_n} \]
By passing to the limit we obtain that $g(d) \le \liminf_n \frac{1}{t_n} f\big(t_n\,t_n^{-1}d_n\big) \le \mu$, which means that (d, µ) ∈ epi(g).
Conversely, suppose that (d, µ) ∈ epi(g). By definition of g, there is a sequence tn ր ∞
and a sequence dn → d such that
\[ \lim_n \frac{f(t_n d_n)}{t_n} = g(d) \]
Hence for any ε > 0 there exists n0 ∈ N such that n ≥ n0 implies that f (tn dn ) < tn (µ + ε). This means that tn (dn , µ + ε) ∈ epi(f ) for all n ≥ n0 . Consequently $t_n^{-1}\,t_n(d_n, \mu+\varepsilon) \to (d, \mu+\varepsilon) \in \mathrm{epi}(f_\infty)$. Since epi(f∞ ) is closed, by letting ε → 0, (d, µ) ∈ epi(f∞ ).
Let x ∈ dom(f ). By Theorem B.3.3, (d, µ) ∈ epi(f∞ ) iff for all λ > 0, (x + λd, f (x) + λµ) ∈ epi(f ). Equivalently, (d, µ) ∈ epi(f∞ ) iff $\frac{f(x+\lambda d) - f(x)}{\lambda} \le \mu$ for all λ > 0. Set
\[ g(d) := \sup_{\lambda>0} \frac{f(x + \lambda d) - f(x)}{\lambda}, \qquad d \in \mathbb{R}^n. \]
We have proved that (d, µ) ∈ epi(f∞ ) iff (d, µ) ∈ epi(g), and so g ≡ f∞ .
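For a convex function the difference quotient is nondecreasing in λ (Lemma B.2.7), so the supremum defining g(d) can be approximated by a single large value of λ. The Python sketch below does this for the hypothetical example f(x) = (1 + x²)^{1/2} on R, whose asymptotic function is d ↦ |d|; the function name `asymptotic_value` and the step 10⁸ are assumptions made for illustration.

```python
import math


def asymptotic_value(f, x, d, lam=1e8):
    """Approximate sup_{lam > 0} (f(x + lam*d) - f(x)) / lam by evaluating at one large lam.

    Valid as an approximation only for convex f, where the quotient is nondecreasing in lam.
    """
    return (f(x + lam * d) - f(x)) / lam


f = lambda x: math.sqrt(1.0 + x * x)   # convex, with f_infinity(d) = |d|

for d in (2.0, -3.0, 0.0):
    print(d, asymptotic_value(f, 0.0, d), asymptotic_value(f, 5.0, d))
```

The approximations agree for the two base points x = 0 and x = 5, in line with the fact that the formula does not depend on the choice of x ∈ dom(f ).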
Theorem B.3.10. Suppose f : Rn → R ∪ {+∞} is proper. For any α ∈ R, if {f ≤ α} ≠ ∅ then {f ≤ α}∞ ⊂ {f∞ ≤ 0}.
The next result identifies the sign of the asymptotic function f∞ associated to a proper lower semicontinuous convex function f in terms of the limit behavior of f along rays.
Lemma B.3.12. Let f be a proper lower semicontinuous convex function in Rn . f∞ (d) ≤ 0 iff lim supλ→∞ f (x + λd) < ∞ for all x ∈ dom(f ). Equivalently, f∞ (d) > 0 iff lim inf λ→∞ f (x + λd) = +∞ for all x ∈ dom(f ).
Proof. Suppose that f∞ (d) ≤ 0. Then, for any x ∈ dom(f ) and λ > 0
\[ \frac{f(x + \lambda d) - f(x)}{\lambda} \le 0. \]
Hence lim supλ→∞ f (x + λd) ≤ f (x) < ∞.
Conversely, suppose f∞ (d) > α > 0. For any x ∈ dom(f ) there exists λ0 > 0 such that λ ≥ λ0 implies that
\[ \frac{f(x + \lambda d) - f(x)}{\lambda} > \alpha. \]
Consequently, f (x + λd) ≥ f (x) + λα → +∞.
Theorem B.3.13. Suppose f : Rn → R ∪ {+∞} is a proper lower semicontinuous convex
function. f is coercive iff
\[ \int e^{-f(x)}\,\lambda(dx) < \infty \tag{B.10} \]
This is not possible as (B + λd) ∩ B = ∅ for λ large enough. Therefore, f∞ (d) > 0 for all d ≠ 0. The conclusion follows from Corollary B.3.11.
B.4. Exercises
Exercise B.4.1. If {fα } is a collection of lower semicontinuous functions, then $\bigvee_\alpha f_\alpha$ is lower semicontinuous. If f and g are real and lower semicontinuous, then f + g is also lower semicontinuous. Any f ∈ C(X, R) is lower semicontinuous. For any U ∈ τ , the function g(x) = 1U (x) is lower semicontinuous.
Exercise B.4.2. Let {pn : n ∈ Z+ } and {qn : n ∈ Z+ } be non-increasing sequences of real–valued continuous functions on a compact set X. If pn ↘ u1 and qn ↘ u2 and u1 ≤ u2 , show that for any r ∈ N, there is Nr ∈ N such that n ≥ Nr implies that
\[ p_n(x) < q_r(x) + \frac{1}{r}, \qquad x \in X. \]