A Second Course in Analysis: M. Ram Murty
HBA Lecture Notes in Mathematics
Series Editor
Sanoli Gun, C.I.T. Campus, Institute of Mathematical Sciences, Chennai,
Tamil Nadu, India
Editorial Board
R. Balasubramanian, Institute of Mathematical Sciences, Chennai
Abhay G. Bhatt, Indian Statistical Institute, New Delhi
Yuri F. Bilu, Université Bordeaux I, France
Partha Sarathi Chakraborty, Institute of Mathematical Sciences, Chennai
Carlo Gasbarri, University of Strasbourg, France
Anirban Mukhopadhyay, Institute of Mathematical Sciences, Chennai
V. Kumar Murty, University of Toronto, Toronto
D. S. Nagaraj, Institute of Mathematical Sciences, Chennai
Olivier Ramaré, Centre National de la Recherche Scientifique, France
Purusottam Rath, Chennai Mathematical Institute, Chennai
Parameswaran Sankaran, Institute of Mathematical Sciences, Chennai
Kannan Soundararajan, Stanford University, Stanford
V. S. Sunder, Institute of Mathematical Sciences, Chennai
The IMSc Lecture Notes in Mathematics series is a subseries of the HBA Lecture
Notes in Mathematics series. This subseries publishes high-quality lecture notes
of the Institute of Mathematical Sciences, Chennai, India. Undergraduate and
graduate students of mathematics, research scholars, and teachers would find this
book series useful. The volumes are carefully written as teaching aids and highlight
characteristic features of the theory. The books in this series are co-published with
Hindustan Book Agency, New Delhi, India.
M. Ram Murty
Department of Mathematics and Statistics
Queen’s University
Kingston, ON, Canada
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
This therefore is Mathematics:
She reminds you of the invisible forms of the
soul; She gives life to her own discoveries;
She awakens the mind and purifies the
intellect; She brings light to our intrinsic
ideas.
—Proclus (412–485 CE)
Preface
This book is based on graduate courses given by the author since 2007 at Queen’s
University. The course was meant to bring the typical graduate student (in pure
mathematics) up to speed and orient his or her mind toward the research frontier.
This is a difficult task. On the one hand, one cannot begin again a program of
instruction covered in at least half a dozen undergraduate courses. On the other
hand, if the course is pitched at too high a level, there will be a high dropout rate
and the student is prone to say “I used to like mathematics until I took this course!”
Therefore, a balance must be achieved in the process of instruction.
For the most part, we may assume our student was a teenager when he or she
began college and was drilled in the routine theorems of single variable
and multi-variable calculus, real and complex analysis, probability and measure
theory. But now at the graduate level, not only is a synthesis of these topics
appropriate, but it is essential for the formation of the research mind. Such a mind
has to re-examine many of the subtle notions and to have some understanding of
how mathematics evolved over the centuries. Perhaps, through such a presentation,
the student may find the necessary subliminal suggestions that form the intuitive
basis of understanding of mathematics at a deeper level.
In this context, the student will understand that the process of mathematical
research follows a basic template involving six steps. First, one identifies the need
to quantify or make precise a particular idea. This is a process of pattern recognition
and requires the student to have a wealth of examples already learned from his or
her undergraduate days. Concepts have an evolutionary history, and they often arise
as a response to a particular need to understand phenomena in the physical world.
For instance, calculus arose from the study of physical motion. Multivariable calculus
arose from a study of motion in three dimensions and a deeper study of
electrostatics and fluid dynamics. Mathematical rigor and the need to establish
foundations for mathematics arose to avoid pathologies and logical contradictions.
Thus, the process of identification leads to a precise mathematical definition. This
is the source of power in mathematics. What is often referred to by the multitude at
large in hazy and nebulous terms is made precise and quantified by the mathematician.
Having made the definition, the mathematician should then provide
examples in which the mathematical principle that has been identified and defined
is highlighted and the student can analyze them and see a common thread that
unites the seemingly diverse array of examples. Such an intuitive understanding
leads to the formulation of a theorem or even a collection of theorems. Having the
theorem in hand not only allows one to understand all the disparate examples in one
unifying comprehensive glance, but it allows for further applications and exten-
sions. This is the predictive power of mathematics. The six-step process of higher
mathematical learning at the graduate level can then be summarized by the
appropriate acronym “ideate.”
This text assumes therefore that the reader is a graduate student who has had the
standard regimen of mathematical courses at the undergraduate level. Often, in such
a phase, one gains only information and not intuition that is essential for the
research mathematician. It is hoped that this text can form the basis for a
semester-long or year-long graduate course in analysis so as to prepare the student
for a career as a research mathematician. The author has tested the material over a
span of two decades and can certify that indeed the presentation here takes the
student through the sacred rites needed to enter the sanctum sanctorum of the
temple of mathematics.
This book is based on graduate courses given by the author at Queen’s University
since 2007. I would like to thank Drs. Jung-Jo Lee and Purusottam Rath for helping
to put into LaTeX a large part of these notes. I thank Drs. Akshaa Vatwani, Siddhi
Pathak, Kumar Murty, Steven Spallone, Seoyoung Kim and the referees for reading
sections of the original manuscript and giving me feedback.
Contents
1 Background
1.1 Axioms of Set Theory
1.2 Constructing Numbers from Sets
1.3 Set-Theoretic Construction of the Real Numbers
1.4 Sequences of Real Numbers
1.5 Infinite Series
1.6 Sequences of Functions
1.7 Power Series
1.8 Metric Spaces and Euclidean Spaces
1.9 The Heine–Borel Theorem
1.10 Vector-Valued Functions
1.11 Derivatives of Multivariable Functions
1.12 The Inverse Function Theorem
1.13 The Implicit Function Theorem
1.14 The Lagrange Multiplier Method
1.15 Level Sets and Tangent Spaces
1.16 Changing Variables in Integrals
1.17 Volume and Surface Area of the Hypersphere
1.18 Green’s Theorem
1.19 Theorems of Gauss and Stokes
1.20 Differential Forms
References
2 Measure Theory
2.1 Topological Spaces and Measure Spaces
2.2 The Lebesgue Integral
2.3 Inner Product Spaces
2.4 Orthonormal Sets
2.5 Trigonometric Series
2.6 Banach Spaces
Index
Chapter 1
Background
The concept of number lies at the heart of mathematics. The subject of mathematical
analysis, in the formal sense of the word, began when mathematicians initiated the
exploration of infinity as a precise mathematical idea. This perhaps is the greatest
event of the nineteenth century, on par with the discovery of zero as a mathematical
idea. Both zero and infinity are now essential to the understanding of limits and the
related processes of real and complex analysis.
Because these two ideas were poorly understood, mathematicians in the nineteenth
century introduced the axiomatic method with the theory of sets as the foundation
for all mathematics. In this approach, Georg Cantor (1845–1918) was definitely the
leading figure. Sadly, his work received considerable opposition at first, but is now
universally accepted. It is not an exaggeration to say that research into the foundations
of mathematics has given us a larger understanding of the nature of mathematical
logic, both its strengths and its limitations.
Set theory is the standard foundation of all mathematics. Mathematical proofs use
set-theoretic ideas in one form or another. Below, we give a quick overview of the
axioms of set theory so that the student is introduced to the relevant vocabulary. One
need not become overly preoccupied with these formalities for otherwise a serious
study of these axioms will take us into the realm of mathematical logic which is
not the province of this short survey. Rather, we indicate the main themes that are
relevant to the later chapters, very much similar to a quick sight-seeing tour of an
ancient city. There will undoubtedly be many side streets to explore and since time is
of the essence, we need to be content with a one-hour tour highlighting the significant
monuments. For a gentle introduction to mathematical logic, we refer the reader to
[1].
The basic objects of set theory are classes, some of which are sets. An informal
definition of a set is that it is an unordered collection of objects. The basic relation
between them is membership indicated by the symbol ∈. Thus x ∈ A means that x
is a member of A. Its negation, x is not a member of A, is written x ∉ A. Already,
with the notion of a set and the relation of membership, we can easily run into
logical contradictions if we do not formulate proper axioms for the construction
of sets. The famous “Russell paradox” (named after the British philosopher and
mathematician Bertrand Russell (1872–1970)) shows that the “set of all sets” is not
a set. Indeed, if it were a set, it would be a proper member of itself by definition,
which is a contradiction. The paradox is often posed as the following riddle: a barber
shaves anyone who does not shave himself; does the barber shave himself? The
contradiction arises from self-reference and so it was quickly realized that to avoid
logical contradictions in mathematics, one must formulate precise axioms for the
construction of sets. In a formal system of axiomatic set theory, the notions of “class”
and “∈” are undefined terms. There are several well-developed axiomatic systems
such as the Zermelo–Fraenkel system and the von Neumann–Gödel–Bernays system.
Our treatment below follows the latter.
The underlying idea of any of these systems is that we construct new sets from
old sets in a systematic manner, with the existence of the empty set as a basic axiom.
This is how we avoid the Russell paradox.
The words “and” and “not” are used as propositional connectives. We usually write
these words out and do not use symbols for them, though some logicians do. To say
that “Proposition P implies Proposition Q” we write “P ⟹ Q”. If P ⟹ Q and
Q ⟹ P, we write P ⇐⇒ Q. The usual logical quantifiers are: ∀ which means
“for all” and ∃ which means “there exists”. The first one is sometimes called the
universal quantifier and the second one the existential quantifier. The statement,
“For all x, the proposition F(x) holds” is denoted ∀x (F(x)) with the parentheses
used to designate the proposition. Such a statement is called a universal proposition.
A statement of the form “There exists x such that F(x) holds” is written ∃x (F(x)).
Such a statement is called an existential proposition. Using class variables like
x, y, A, B and so on, together with the basic relation of membership ∈, quantifiers,
connectives and parentheses, we can construct “well-formed” formulas. If variables
are quantified, they are called “bound” and should only appear after the quantifier.
Variables without quantifiers are called “free”. For example, in the formula
∀x∃y((x ∈ y) ⟹ (x ∈ z))
the variables x and y are bound and z is free.
The essential idea of set theory is that sets are built up in a progressive hierarchy
and there are precise rules as to how to form new sets from old sets. We begin with
the axiom of extensionality which allows us to determine when two sets are equal.
Axiom 1: (extensionality) For any classes A and B, A = B if and only if
∀x(x ∈ A ⇐⇒ x ∈ B).
∀x(x ∉ ∅).
u ∈ {x, y} ⇐⇒ (u = x or u = y).
If x = y, we obtain the singleton {x}: u ∈ {x} if and only if u = x. Again, these sets
are unique by extensionality.
As defined earlier, a variable is called free if there are no quantifiers defining it.
Otherwise, the variable is said to be bound. This concept is used in the next axiom
and allows us to form new sets from old using well-formed formulas. In fact, the next
axiom is more appropriately called an axiom schema or a collection of axioms.
Axiom 4: (selection) Given any well-formed formula P(x) containing one free vari-
able x, there is a class A such that for any set x, we have x ∈ A if and only if P(x)
holds.
We often write
A = {x : P(x)}
and read this as “A is the class of all (sets) x such that P(x) holds.”
The distinction between sets and classes in the last axiom is important as noted
earlier. Without it, we fall into “Russell’s paradox”. Indeed, let R = {x : x ∉ x}. If R
were a set, then R ∈ R implies R ∉ R and if R ∉ R, then R ∈ R, a contradiction in
both cases. Technically, R is not a set but a proper class consisting of all sets which
we call “the universe” and the theory contains no logical contradiction. The previous
axioms do allow us to construct new sets from old ones in a systematic manner. Thus,
a set may consist of other sets that have already been well-defined.
Axiom 5: (union) For any set x, there is a set ∪x such that y ∈ ∪x if and only if
y ∈ z for some z ∈ x.
We sometimes write ⋃_{y∈x} y for ∪x in the last axiom. Thus,
x ∪ y = ∪{x, y} = {z : z ∈ x or z ∈ y}.
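As an informal illustration of the union axiom, finite sets whose members are again finite sets can be modelled in Python with `frozenset`. The names `big_union` and `pair` below are ours and serve only as a sketch; they are not part of the axiomatic development.

```python
def big_union(x):
    """Return ∪x: the set of all y with y ∈ z for some z ∈ x."""
    return frozenset(y for z in x for y in z)


def pair(x, y):
    """Return the unordered pair {x, y}; it is the singleton {x} when x == y."""
    return frozenset({x, y})


empty = frozenset()              # ∅
a = frozenset({empty})           # {∅}
b = frozenset({empty, a})        # {∅, {∅}}
c = frozenset({a})               # {{∅}}

# The binary union is recovered as x ∪ y = ∪{x, y}.
assert big_union(pair(a, c)) == a | c
assert big_union(pair(a, c)) == b
```

Because `frozenset` is hashable, sets can be members of other sets, which is what the hierarchy of set theory requires.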
(∀x ∈ A) Q(x)
means
(∀x)(x ∈ A ⟹ Q(x)).
(∀x ∈ A) (x ∈ B)
A ∩ B = {x : x ∈ A and x ∈ B}.
If A is a set, then A ∩ B is a set for any class B by Axiom 6 and the definition of a
set. If x is a non-empty set, its intersection is defined by
∩x = {y : (∀u ∈ x) y ∈ u}.
We can use set theory to define an ordered pair (x, y) as the set of the form
(x, y) := {{x}, {x, y}}.
Indeed, for any ordered pair Q, its first member is the unique x such that {x} ∈ Q
and its second member is the unique y such that Q = {{x}, {x, y}} where x is the
first member.
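The recovery of the two members from the set {{x}, {x, y}} can be checked on small examples. The following is a minimal sketch in Python using `frozenset`; the function names `ordered_pair`, `first` and `second` are illustrative choices of ours.

```python
def ordered_pair(x, y):
    """Return the set {{x}, {x, y}} representing the ordered pair (x, y)."""
    return frozenset({frozenset({x}), frozenset({x, y})})


def first(q):
    """Return the unique x such that {x} ∈ q."""
    (x,) = [next(iter(s)) for s in q if len(s) == 1]
    return x


def second(q):
    """Return the unique y such that q = {{x}, {x, y}}, where x = first(q)."""
    x = first(q)
    members = frozenset().union(*q)   # all elements occurring in members of q
    rest = members - {x}
    if rest:
        (y,) = rest
        return y
    return x                          # the case (x, x) = {{x}}


p = ordered_pair(1, 2)
assert (first(p), second(p)) == (1, 2)
assert ordered_pair(1, 2) != ordered_pair(2, 1)
```

Note that the pair (x, x) collapses to {{x}}, which is why `second` needs a separate branch for that case.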
A relation is any set of ordered pairs. For any relation E, we define the inverse
relation by
E⁻¹ := {(y, x) : (x, y) ∈ E}.
From the definition of an ordered pair, we are led to the definition of a function
as a set f of ordered pairs such that (x, y) ∈ f and (x, z) ∈ f implies y = z. We
write f(x) = y to mean (x, y) ∈ f for any given function f. Thus, a function is a
special kind of relation. In other words, all functions are relations but not all relations
are functions.
If f is a function, f⁻¹ is not necessarily a function. A function f is called one-to-one
if and only if f⁻¹ is a function.
This now allows us to construct the Cartesian product of two sets X and Y ,
denoted X × Y , which consists of ordered pairs (x, y) with x ∈ X and y ∈ Y .
With the axioms listed so far, we cannot prove the existence of an infinite set. The
next axiom allows us to do this along with our earlier axioms.
Axiom 7: (axiom of infinity) There is a set N with ∅ ∈ N such that for all x ∈ N,
x ∪ {x} ∈ N.
This axiom allows us to construct the natural numbers using the empty set because
we can now define inductively the sequence of sets
∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, . . .
and define the non-negative integers using the “zero symbol” 0 to designate the empty
set and define
1 := {0}, 2 := {0, 1}, 3 := {0, 1, 2}, . . .
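The inductive construction can be imitated for small n. The sketch below, in Python, builds the set representing n by applying the map x ↦ x ∪ {x} to the empty set n times; the names `successor` and `von_neumann` are ours.

```python
def successor(x):
    """Return x ∪ {x}."""
    return x | frozenset({x})


def von_neumann(n):
    """Return the set representing n: the successor applied n times to ∅."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x


zero, one, two, three = (von_neumann(n) for n in range(4))

assert one == frozenset({zero})               # 1 := {0}
assert two == frozenset({zero, one})          # 2 := {0, 1}
assert three == frozenset({zero, one, two})   # 3 := {0, 1, 2}
assert two in three and three not in two      # membership mirrors the order 2 < 3
```

In this model the set representing n has exactly n elements, namely the sets representing 0, 1, ..., n − 1.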
We could have also stated Axiom 7 in logical notation as: ∃N such that ∅ ∈ N
and ∀x ∈ N ,
x ∪ {x} ∈ N .
To prevent x ∪ {x} from being equal to x, and in general to keep order among sets
and prevent unwanted closed cycles for the membership relation, we need a further
axiom:
Axiom 8: (axiom of regularity) For any set A ≠ ∅, (∃X ∈ A) X ∩ A = ∅.
This allows us to prove:
Exercises
1. Recall that an ordered pair (a, b) can be defined as the set {{a}, {a, b}}. Show that
(a, b) = (c, d) if and only if a = c and b = d.
2. Define the ordered triple (a, b, c) to be the ordered pair ((a, b), c) where the
ordered pair is defined as usual. Show that
(a, b, c) = (a′, b′, c′)
if and only if a = a′, b = b′ and c = c′.
3. If x ∈ y ∈ z ∈ w, prove that w ∉ x.
Find the domain and range for each relation in the previous question whether or
not it is a function.
Mathematicians and philosophers of the nineteenth century pondered deeply into the
nature of a number. The question of “what is a number?” is not a simple one. But
since mathematicians decided to give foundations of mathematics using the axiomatic
method and sets as the basic building blocks, we are led to define numbers using
sets. We follow Richard Dedekind (1831–1916) and Giuseppe Peano (1858–1932)
in the following construction. It was as late as 1888 and 1889 when this construction
was described in two papers written independently by Dedekind and Peano.
We construct a sequence of sets to represent the natural numbers. As noted earlier,
zero is represented by the empty set. We have already described the construction of
the natural numbers using the empty set. For each natural number n, the successor
of n is denoted n + 1 (and sometimes as n′) and defined as
n ∪ {n}.
{0, 1, 2, . . . , n − 1}.
We designate the set of natural numbers by the symbol N. (It is a matter of personal
convenience whether to include zero as a natural number or not. In this discussion,
zero is a natural number. In other settings, it may not be. There is no universal
convention regarding this and the student is expected to understand depending on
the context. Some authors use the term “whole numbers” to indicate that zero is
included in the discussion.)
The arithmetic operations on N are now defined recursively. Addition is defined
as a function from N × N to N:
+(m, n) := m + n
m′ × n := (m × n) + n.
[(j₁, k₁)] + [(j₂, k₂)] = [(j₁ + j₂, k₁ + k₂)],
[(j₁, k₁)] × [(j₂, k₂)] = [(j₁j₂ + k₁k₂, j₁k₂ + j₂k₁)].
This latter definition is best understood if we recall that the symbol (j, k) represents
j − k so that the left hand side of the above equation is
(j₁ − k₁)(j₂ − k₂) = j₁j₂ + k₁k₂ − (j₁k₂ + j₂k₁).
One needs to check that these definitions are “well-defined” in the sense that they are
independent of the representatives chosen for the equivalence class. We leave that to
the student as an exercise (see exercises below).
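Before attempting a proof, well-definedness can be tested by brute force on small representatives. The sketch below, in Python, assumes the standard equivalence relation for this construction, namely that (j₁, k₁) and (j₂, k₂) are equivalent when j₁ + k₂ = j₂ + k₁ (both pairs then represent the same difference j − k). The function names and the search bound are our own choices, and a finite search is of course no substitute for the proof.

```python
from itertools import product


def equivalent(p, q):
    """(j1, k1) ~ (j2, k2) when j1 + k2 == j2 + k1 (assumed relation)."""
    (j1, k1), (j2, k2) = p, q
    return j1 + k2 == j2 + k1


def add(p, q):
    (j1, k1), (j2, k2) = p, q
    return (j1 + j2, k1 + k2)


def mul(p, q):
    (j1, k1), (j2, k2) = p, q
    return (j1 * j2 + k1 * k2, j1 * k2 + j2 * k1)


def well_defined(op, bound=4):
    """Check on all pairs with entries below `bound` that equivalent
    inputs give equivalent outputs."""
    pairs = list(product(range(bound), repeat=2))
    for p, p2, q, q2 in product(pairs, repeat=4):
        if equivalent(p, p2) and equivalent(q, q2):
            if not equivalent(op(p, q), op(p2, q2)):
                return False
    return True


assert well_defined(add) and well_defined(mul)
# (−1) × (−1) = 1, with −1 represented by (0, 1) and 1 by (1, 0).
assert equivalent(mul((0, 1), (0, 1)), (1, 0))
```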
In this way, we have now extended the notion of addition and multiplication from
the set of natural numbers to the set of integers. Subtraction of integers can be defined
by
[(j₁, k₁)] − [(j₂, k₂)] = [(j₁, k₁)] + (−1)[(j₂, k₂)],
where −1 represents the equivalence class [(0, 1)]. All of these definitions correspond
to our usual notion of addition, subtraction and multiplication. Their virtue lies in
their purely set-theoretic formulation.
We can also order the set of integers in the usual way. Thus,
j₁ + k₂ < k₁ + j₂ ⇐⇒ [(j₁, k₁)] < [(j₂, k₂)]
and
j₁ + k₂ ≤ k₁ + j₂ ⇐⇒ [(j₁, k₁)] ≤ [(j₂, k₂)].
This corresponds to our usual notion of “less than” and “less than or equal to”.
Finally, we can define the absolute value on the set of integers by setting
$$|k| = \begin{cases} k & \text{if } 0 < k, \\ 0 & \text{if } k = 0, \\ -k & \text{if } k < 0. \end{cases}$$
We can now construct the rational numbers Q from the set of integers. We do
this by defining an equivalence relation on the set Z × Z_{>0} by stating that two pairs
(j₁, k₁) and (j₂, k₂) are equivalent if and only if j₁k₂ = j₂k₁. Intuitively, we think of
(j₁, k₁) as representing the “fraction” j₁/k₁ and examine what we would mean by
j₁/k₁ = j₂/k₂ by reducing it to notions already defined. The set of rational numbers
Q is then defined as the set of such equivalence classes.
The expected operations of addition and multiplication are now evident:
[(j₁, k₁)] + [(j₂, k₂)] = [(j₁k₂ + j₂k₁, k₁k₂)]
[(j₁, k₁)][(j₂, k₂)] = [(j₁j₂, k₁k₂)].
Again, these definitions are easily verified to be well-defined. Finally, we can now
define “division”. If [(j₁, k₁)], [(j₂, k₂)] ∈ Q with j₂ ≠ 0, we define
[(j₁, k₁)] / [(j₂, k₂)] := [(j₁k₂, j₂k₁)].
These operations satisfy the familiar laws of associativity, commutativity and dis-
tributivity. Subtraction of rational numbers then can be written as:
[(j₁, k₁)] < [(j₂, k₂)] ⇐⇒ j₁k₂ < j₂k₁.
[(j₁, k₁)] ≤ [(j₂, k₂)] ⇐⇒ j₁k₂ ≤ j₂k₁.
These definitions agree with our usual notions of ordering of the rational numbers.
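The arithmetic and ordering of pairs can be compared against an independent implementation of the rationals. The following Python sketch does this with the standard library's `fractions.Fraction`; the function names (`q_equiv`, `q_add`, `q_mul`, `q_less`, `as_fraction`) are ours, and pairs are assumed to have a positive second entry.

```python
from fractions import Fraction


def q_equiv(p, q):
    """(j1, k1) ~ (j2, k2) if and only if j1*k2 == j2*k1."""
    (j1, k1), (j2, k2) = p, q
    return j1 * k2 == j2 * k1


def q_add(p, q):
    (j1, k1), (j2, k2) = p, q
    return (j1 * k2 + j2 * k1, k1 * k2)


def q_mul(p, q):
    (j1, k1), (j2, k2) = p, q
    return (j1 * j2, k1 * k2)


def q_less(p, q):
    (j1, k1), (j2, k2) = p, q
    return j1 * k2 < j2 * k1


def as_fraction(p):
    """Interpret the pair (j, k) as the fraction j/k."""
    return Fraction(p[0], p[1])


half, third = (1, 2), (1, 3)
assert q_add(half, third) == (5, 6)
assert q_equiv((2, 4), half)
assert q_less(third, half)
assert as_fraction(q_mul(half, third)) == Fraction(1, 2) * Fraction(1, 3)
```

Here `Fraction` plays the role of the “usual notion” of a rational number against which the pair-based definitions are checked.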
Finally, the definition of absolute value can be extended as:
$$|[(j, k)]| = \begin{cases} [(j, k)] & \text{if } [(0, 1)] < [(j, k)], \\ [(0, 1)] & \text{if } [(j, k)] = [(0, 1)], \\ -[(j, k)] & \text{if } [(j, k)] < [(0, 1)]. \end{cases}$$
Again, our familiar properties of the absolute value of rational numbers hold. With
this foundational construction in place, we can conveniently represent the equivalence
class of ( j, k) as simply the fraction j/k and continue to work with these numbers
as we were (hopefully) taught from childhood.
In the next sections, we construct the real numbers from this axiomatic framework.
Exercises
[(j₁, k₁)] + [(j₂, k₂)] = [(j₁ + j₂, k₁ + k₂)]
(j₁ + j₂) · k = j₁k + j₂k.
3. Show that the relations < and ≤ on Z have the following properties:
(a) [(0, j)] < [(0, 0)] for all j ∈ Z_{>0};
(b) [(0, j)] < [(k, 0)] for all j, k ∈ Z_{>0};
(c) [(0, j)] < [(0, k)] for j, k ∈ Z_{>0} if and only if k < j;
(d) [(0, 0)] < [(j, 0)] for all j ∈ Z_{>0};
(e) [(j, 0)] < [(k, 0)] for j, k ∈ Z_{≥0} if and only if j < k;
(f) [(0, j)] ≤ [(0, 0)] for all j ∈ Z_{≥0};
(g) [(0, j)] ≤ [(k, 0)] for all j, k ∈ Z_{≥0};
(h) [(0, j)] ≤ [(0, k)] for j, k ∈ Z_{≥0} if and only if k ≤ j;
(i) [(0, 0)] ≤ [(j, 0)] for all j ∈ Z_{≥0};
(j) [(j, 0)] ≤ [(k, 0)] for j, k ∈ Z_{≥0} if and only if j ≤ k.
The rational number system is insufficient to “measure” all the lengths that arise in the
“real world.” This was discovered by the ancient school of Pythagoras which viewed
the world through a strange mix of mathematics and mysticism. The aphorism “all
is number” seems to have been the underlying mantra of the Pythagoreans. In our
modern digital world, this mantra imposes its universality more than in any earlier
age.
2b² = a²
which implies that a is even since the left hand side of the equation is patently an
even number. Writing a = 2c, we now get
b² = 2c²
then A has no least upper bound. Indeed, if u is a least upper bound, x ≤ u for any
x ∈ A. By definition, u is a rational number and so cannot equal √2. If u < √2,
then we can choose a natural number N such that
1/N < √2 − u,
so that
u + 1/N < √2,
contradicting that u is an upper bound for the set A. If u > √2, then we can similarly
find N such that
u − 1/N > √2,
which again contradicts that u is the least upper bound for A. Thus, A does not have
a least upper bound in the realm of rational numbers.
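The first half of this argument can be made concrete using only rational arithmetic. The Python sketch below assumes that A is the set of rational numbers x with x² < 2 (our reading of the definition, which is not reproduced above); the function name `larger_element` is ours. Given a non-negative rational u in A, it searches for a natural number N with (u + 1/N)² < 2, so that u + 1/N is a larger element of A and u cannot be an upper bound.

```python
from fractions import Fraction


def larger_element(u):
    """Given a rational u >= 0 with u*u < 2, return u + 1/N for the
    smallest natural number N with (u + 1/N)**2 < 2."""
    if u < 0 or u * u >= 2:
        raise ValueError("expected a rational u with u >= 0 and u*u < 2")
    n = 1
    while (u + Fraction(1, n)) ** 2 >= 2:
        n += 1
    return u + Fraction(1, n)


u = Fraction(1)
for _ in range(4):
    v = larger_element(u)
    assert u < v and v * v < 2   # v is again in A and strictly larger
    u = v
```

The loop terminates because u² < 2 holds strictly, which is the computational counterpart of choosing N with 1/N < √2 − u.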
Two ways of constructing the real numbers were proposed independently by
Richard Dedekind (1831–1916) and Augustin Cauchy (1789–1857). Each method
has its virtues. The method of Dedekind, using what are called Dedekind cuts is
Q = A ∪ B, A = {x ∈ Q : x² < 2}, B = Q\A.
Rational numbers are also defined by Dedekind cuts. Indeed, the rational number q
is defined by the cut
For example, the number zero is represented by A ∪ B with A being the set of all
nonzero negative rational numbers and B being the set of all nonzero positive rational
numbers. As the Dedekind cut A ∪ B is uniquely defined by A, with B being the
complement of A in the set of rationals, we may as well associate A to each real
number, or even think of A as the real number. One would then have to define how
one would work with this definition from the perspective of addition, multiplication
and so on. This is not difficult to do. For instance, we have
A1 < A2 ⇐⇒ A1 ⊂ A2 , A1 ≤ A2 ⇐⇒ A1 ⊆ A2 .
A1 + A2 = {a1 + a2 : a1 ∈ A1 , a2 ∈ A2 }.
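$$A_1 \cdot A_2 = \begin{cases} \{a_1 \cdot a_2 : a_1 \in A_1,\ a_2 \in A_2,\ a_1, a_2 \ge 0\} \cup \{q \in \mathbb{Q} : q \le 0\} & \text{if } A_1, A_2 \ge 0; \\ -(A_1 \cdot (-A_2)) & \text{if } A_1 \ge 0,\ A_2 < 0; \\ -((-A_1) \cdot A_2) & \text{if } A_1 < 0,\ A_2 \ge 0; \\ (-A_1) \cdot (-A_2) & \text{if } A_1, A_2 < 0. \end{cases}$$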
One can show that all the usual operations with numbers satisfy the expected prop-
erties of commutativity, associativity and distributivity.
By contrast, the method of Cauchy sequences begins with a definition of convergence.
In some ways, it is motivated by our discussion of √2 earlier. If A is the
Dedekind cut representing √2, then the least upper bound of A “is” √2. Thus, the
number √2 can be thought of as a limit of a sequence of rational numbers. But
there can be many such sequences and so one needs a more formal approach. The
underlying observation is that a limit of a sequence of rational numbers need not be
rational.
This is perhaps best illustrated by Euler’s series for e. Students of calculus are
familiar with
$$e = \sum_{j=0}^{\infty} \frac{1}{j!}$$
and the right hand side can be thought of as a limit of a sequence of rational numbers
$$S_n = \sum_{j=0}^{n} \frac{1}{j!}.$$
$$b!\,e = \sum_{j=0}^{b} \frac{b!}{j!} + \sum_{j=b+1}^{\infty} \frac{b!}{j!} \tag{1.1}$$
and the second term on the right hand side is strictly less than the geometric series
$$\sum_{k=1}^{\infty} \frac{1}{(b+1)^k} = \frac{1}{b}$$
which is less than or equal to 1. But the left hand side of (1.1) is a positive integer,
the first term on the right hand side of (1.1) is a positive integer and the second term
is not an integer, which is a contradiction.
What this example shows is that the “limit” of the sequence of partial sums Sn
(each of which is a rational number) is the irrational number e.
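The partial sums and the two terms on the right hand side of (1.1) can be computed exactly with rational arithmetic. The following Python sketch uses `fractions.Fraction`; the names `partial_sum` and `truncated_tail` and the truncation length are our own, and only a finite piece of the infinite tail is summed.

```python
from fractions import Fraction
from math import factorial


def partial_sum(n):
    """Return S_n = sum of 1/j! for j = 0, ..., n as an exact rational."""
    return sum(Fraction(1, factorial(j)) for j in range(n + 1))


def truncated_tail(b, terms=50):
    """Return the sum of b!/j! for j = b+1, ..., b+terms, a finite piece
    of the second term on the right hand side of (1.1)."""
    return sum(Fraction(factorial(b), factorial(j))
               for j in range(b + 1, b + 1 + terms))


assert partial_sum(4) == Fraction(65, 24)

for b in range(1, 8):
    head = factorial(b) * partial_sum(b)
    assert head.denominator == 1                  # the first term is an integer
    assert 0 < truncated_tail(b) < Fraction(1, b) # below the geometric bound 1/b
```

Every partial sum is rational, while the bound on the tail is what prevents the limit from being rational.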
The reader is familiar with the usual notion of convergence. We say a sequence
of real numbers xₙ converges to L if given ε > 0 there exists N such that
|xₙ − L| < ε ∀ n ≥ N.
One may want to view real numbers as “limits of sequences of rational numbers.” But
in our formal construction of numbers, several difficulties arise with this definition.
The first is that as we have so far only constructed the rational numbers, we need
to take the xn ’s to be all rational numbers. Secondly, the limit L need not be a
rational number as the example with e shows. This motivates the formal definition
of a Cauchy sequence of rational numbers.
A sequence qₙ in Q is called a Cauchy sequence if given ε > 0, there is an N
such that
|qₙ − qₘ| < ε ∀ m, n ≥ N.
Now it is easy to see that any convergent sequence is a Cauchy sequence. For future
reference, we record this below.
Theorem 1.2 Any convergent sequence qn is a Cauchy sequence.
Proof Suppose that qₙ converges to L. Then, given ε > 0 and choosing N such that
|qₙ − L| < ε/2 ∀ n ≥ N,
we have for all m, n ≥ N, by the triangle inequality
|qₙ − qₘ| ≤ |qₙ − L| + |L − qₘ| < ε/2 + ε/2 = ε.
Later, we will show that every Cauchy sequence converges. Thus, a sequence is
Cauchy if and only if it is convergent and the two notions are equivalent. But the
advantage of the notion of a Cauchy sequence is that the limit value L is not mentioned
in the definition.
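To see the Cauchy condition at work without reference to a limit, one can take a sequence of rational numbers whose squares approach 2. The Python sketch below uses the classical iteration q ↦ (q + 2/q)/2 starting from 1; this choice of sequence and the names `babylonian` and `cauchy_gap` are ours, not the text's. Only differences between terms of the sequence are inspected.

```python
from fractions import Fraction


def babylonian(n_terms):
    """Return the first n_terms of q_0 = 1, q_{k+1} = (q_k + 2/q_k)/2."""
    q = Fraction(1)
    seq = []
    for _ in range(n_terms):
        seq.append(q)
        q = (q + 2 / q) / 2
    return seq


def cauchy_gap(seq, start):
    """Return the largest |q_n - q_m| over indices m, n >= start
    among the terms that were computed."""
    tail = seq[start:]
    return max(tail) - min(tail)


seq = babylonian(6)
gaps = [cauchy_gap(seq, start) for start in range(1, 5)]

assert all(g1 > g2 for g1, g2 in zip(gaps, gaps[1:]))  # the gaps shrink
assert cauchy_gap(seq, 3) < Fraction(1, 10**5)
assert all(q * q != 2 for q in seq)                    # no term squares to 2
```

The terms cluster ever more tightly, yet no rational number squares to 2, so the sequence has no limit inside Q.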
The construction of the real numbers is now brought about by defining an equiva-
lence relation on the set of all Cauchy sequences of rational numbers. Since we have
constructed rational numbers, we are now allowed to construct the set of all Cauchy
sequences of rational numbers. On this set, we say two Cauchy sequences {qn } and
{rₙ} are equivalent if for any ε > 0, there exists a natural number N such that
|qₙ − rₙ| < ε ∀ n ≥ N.
This is easily seen to define an equivalence relation on the set of all Cauchy sequences
of rational numbers. The set of real numbers R is then defined as the set of all
equivalence classes. We will denote the equivalence class of the sequence {qn } by
[{qn }].
We leave as an exercise (see Exercise 4 below) to verify that if {qn } and {rn } are
two Cauchy sequences, then so are {qn + rn } and {qn · rn }. Thus, the usual operations
of addition and multiplication are easily defined:
[{qₙ}] + [{rₙ}] := [{qₙ + rₙ}], [{qₙ}] · [{rₙ}] := [{qₙ · rₙ}].
One needs to verify that these are well-defined by checking that the definition does
not depend on the choice of the representative of the equivalence class.
The order relation of the real numbers can also be defined. Thus, viewing real
numbers as equivalence classes of Cauchy sequences of rational numbers, we define
[{qn }] < [{rn }] if there is an N such that qn < rn for all n ≥ N . We also write [{qn }] ≤
[{rn }] if either [{qn }] < [{rn }] or [{qn }] = [{rn }]. Again, these order relations have
the expected properties.
Finally, the absolute value function can also be defined. We set
|[{qₙ}]| := [{|qₙ|}]
and the reader can easily verify that (indeed) the right hand side is a Cauchy sequence
if {qn } is Cauchy.
The absolute value has the usual properties, the most notable being the triangle
inequality:
|[{qn }] + [{rn }]| ≤ |[{qn }]| + |[{rn }]|.
Having given this formal definition of the real numbers using Cauchy sequences, it
is convenient to drop the sequence notation and just represent the real numbers as
single letters. Thus, if a, b ∈ R, the triangle inequality is the familiar
|a + b| ≤ |a| + |b|.
It is now convenient to visualize the set of real numbers in the usual way as points
of the line stretching from −∞ to +∞ and introduce the following notation. If a and
b are real numbers such that a ≤ b, then the interval [a, b] indicates the set of real
numbers {x : a ≤ x ≤ b}. The half-open interval (a, b] represents {x : a < x ≤ b}.
Similarly [a, b) means {x : a ≤ x < b} and the open interval (a, b) is {x : a < x <
b}.
By its very construction, the set of real numbers has the property of complete-
ness in that every Cauchy sequence converges. The reader will also observe that the
construction depended only on the property of the absolute value viewed as a metric
to measure the distance between two rational numbers. This signals a wider appli-
cability of the method of Cauchy sequences valid more generally in metric spaces
which we discuss later.
With the construction of the real numbers, we have restored the least upper bound
property and can now define that a sequence of numbers (rational or real) xₙ converges
to a number L ∈ R if given any ε > 0, there is an N such that
|xₙ − L| < ε, ∀ n ≥ N.
Proof Let s = sup A. Then s is an upper bound for A so that (i) holds. Now if (ii)
were false, there would be an ε > 0 such that A ∩ (s − ε, s] = ∅. But then s − ε
would also be an upper bound for A, a contradiction. Conversely, suppose that (i)
and (ii) hold. Then (i) implies s is an upper bound for A. If s were not the least
upper bound for A, there would be a t < s such that a ≤ t for all a ∈ A. Taking
ε = (s − t)/2 > 0 in (ii) we get that A ∩ (s − ε, s] ≠ ∅. But s − ε = t + ε so that
A ∩ (t + ε, s] ≠ ∅ which is a contradiction since a ≤ t for all a ∈ A.
Exercises
1. If p is a prime number, show that √p is irrational.
18 1 Background
2. Show that
\[ \sum_{n=2}^{\infty} \frac{1}{n!} < 1. \]
is an irrational number.
4. Show that if {qn } and {rn } are two Cauchy sequences, then so are {qn + rn } and
{qn · rn }.
In other words, the sequence converges to L. The case when the sequence is mono-
tonic decreasing is similar.
1.4 Sequences of Real Numbers 19
\[ x_{n_k}, \quad k = 1, 2, \ldots \]
with $n_k$ being a sequence of strictly increasing natural numbers. The reader can easily
check that if xn is a convergent sequence with limit L then any subsequence is also
convergent with the same limit L. (See Exercise 1 below.)
The following theorem is fundamental in the theory. It was discovered indepen-
dently by Bernard Bolzano (1781–1848) and Karl Weierstrass (1815–1897). Bolzano
studied mathematics, physics, philosophy and theology at the University of Prague
and became a Catholic priest in 1804. He then taught at the University of Prague
where he was chair of philosophy of religion and later was the dean of the philoso-
phy department. However, his pacifist views and strong opposition to militarism led
to his alienation from both academics and church leaders. He was dismissed from
the university in 1819 and exiled to the countryside when he refused to retract his
views. Much of his mathematical and philosophical work was recognized only after
his death. Bolzano is now remembered for the introduction of rigor in mathematics,
especially with regard to the ε–δ definition of continuity, the greatest lower bound
property of the real numbers and the intermediate value theorem.
Weierstrass trained to be a high-school teacher of mathematics, obtaining his teaching
certificate in 1841 at the age of twenty-six. For more than a dozen years, he taught
at various high schools until 1854, when his paper on abelian functions, published
in Crelle’s Journal, brought him such instant recognition from mathematicians
at the University of Berlin that he was offered a professorship there when he
was nearly forty years old. Until his retirement in 1890, he developed the theory of
infinite series and infused mathematical rigor into analysis. Weierstrass discovered
the theorem below independently and it was only much later that mathematical his-
torians recognized that Bolzano had foreseen it and proved it much earlier. Today, we
honor both mathematicians by referring to it as the Bolzano–Weierstrass theorem.
This theorem allows us to define the important notions of lim sup and lim inf of
any sequence of real numbers. Indeed, given a sequence of numbers xn , if it is not
bounded above, we write
There are several characterizations of lim sup and lim inf that are convenient in many
applications. We record these below.
Theorem 1.6 Let L be a real number and an a sequence of real numbers. Then,
\[ L = \limsup_{n\to\infty} a_n \]
if and only if
(a) for each ε > 0, there is a positive integer N such that an < L + ε for all n ≥ N ;
(b) for each ε > 0, the inequality L − ε < an holds for infinitely many n.
Proof Suppose first that lim supn→∞ an = L. As L is a real number, the sequence is
bounded above by M (say). From the definition of lim sup, we also see that L is the
limit of a convergent subsequence and that it is the largest of such limits. If (a) did
not hold, then for some ε > 0, we have an ≥ L + ε for infinitely many n so that by
the Bolzano–Weierstrass theorem there would be a subsequence converging to a limit
point in [L + ε, M] contradicting the definition of L. Therefore (a) holds. If (b) did
not hold, then for some ε > 0, the inequality L − ε < an holds for only finitely many n.
Thus, there is an N such that L − ε ≥ an for all n ≥ N . But then lim supn→∞ an ≤
L − ε contradicting our definition of lim sup.
In an analogous fashion, the reader can show (see Exercise 2):
Theorem 1.7 Let L be a real number and an a sequence of real numbers. Then,
\[ L = \liminf_{n\to\infty} a_n \]
if and only if
(a) for each ε > 0, there is a positive integer N such that an > L − ε for all n ≥ N ;
(b) for each ε > 0, the inequality L + ε > an holds for infinitely many n.
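The characterizations in Theorems 1.6 and 1.7 can be explored numerically. The following Python sketch is only an illustration: the sequence an = (−1)^n (1 + 1/n) and the helper name tail_sup_inf are our own choices, and a finite window of terms stands in for the infinite tail {ak : k ≥ n}. For this sequence the tail suprema decrease toward lim sup an = 1 and the tail infima increase toward lim inf an = −1.

```python
def tail_sup_inf(a, n, horizon):
    """Return (sup, inf) of a(k) for n <= k < n + horizon.

    A finite window is only a stand-in for the infinite tail {a_k : k >= n}.
    """
    tail = [a(k) for k in range(n, n + horizon)]
    return max(tail), min(tail)


def a(n):
    # Illustrative sequence (an assumption of this sketch):
    # it oscillates, with lim sup = 1 and lim inf = -1.
    return (-1) ** n * (1 + 1 / n)


for n in (1, 10, 100, 1000):
    sup_n, inf_n = tail_sup_inf(a, n, horizon=2000)
    print(n, sup_n, inf_n)
```

For every ε > 0 the printed suprema eventually fall below 1 + ε, which is condition (a) of Theorem 1.6, while they never drop below 1, in line with condition (b).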
We can now return to a general discussion of Cauchy sequences of real numbers. A
sequence xn of real numbers is called a Cauchy sequence if for every ε > 0, there
is an N such that
|xn − xm | < ε ∀ m, n ≥ N .
Proof The fact that a convergent sequence is a Cauchy sequence was proved earlier
(see Theorem 1.2). Now suppose our sequence is Cauchy. We want to show that it
converges. By taking ε = 1 in the definition of a Cauchy sequence, we see that xn is
bounded. By the Bolzano–Weierstrass theorem, there is a convergent subsequence
$x_{n_k}$. Putting
\[ L = \lim_{k\to\infty} x_{n_k} \]
With this notation, we can reformulate the notion of continuity as follows. A function
f : [a, b] → R is continuous at c if and only if for every ε > 0, the inverse image
under f of the open ball of radius ε centered at f (c) contains an open ball Bδ (c) for some δ > 0.
This reformulation allows us to generalize the notion of continuity in a wider context,
namely to the setting of metric spaces and later to topological spaces.
Exercises
and
\[ \liminf_{n\to\infty} x_n = \lim_{n\to\infty} \inf\{x_k : k \ge n\}. \]
\[ a_n := 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{(-1)^{n-1}}{n} \]
is Cauchy and hence converges. (We will see in the next section that it converges
to log 2.)
6. Let a1 = √2 and define recursively an+1 = √(2 + an).
(a) Show by induction that an < 2 for all n ≥ 1.
7. Prove that lim supn→∞ sin n = 1 and lim inf n→∞ sin n = −1. [Hint: You may
use the fact that π is irrational.]
\[ s_n := \sum_{k=1}^{n} a_k. \]
If the limit
\[ \lim_{n\to\infty} s_n \]
exists and equals L, we say the series converges to L. Otherwise, we say the series
diverges.
The geometric series
\[ \sum_{k=0}^{\infty} r^k \]
plays an important role in mathematics. Its partial sums are easily seen to be (see
Exercise 1)
\[ s_n = 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}, \]
\[ \frac{1}{1 - r}, \]
\[ s_n = \sum_{k=1}^{n} a_k, \]
is a Cauchy sequence and hence converges. Indeed, as $\sum_{k=1}^{\infty} b_k$ converges, the
sequence of its partial sums is Cauchy so that given ε > 0, there is a positive integer
N such that
\[ \sum_{k=m}^{n} b_k < \varepsilon, \quad \forall\, n \ge m \ge N. \]
Hence
\[ |s_n - s_m| \le \sum_{k=m}^{n} b_k < \varepsilon \quad \forall\, n \ge m \ge N. \]
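Theorem 1.10 (Ratio test) Suppose that ak is a sequence of nonzero real numbers
and
\[ \limsup_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| < 1. \]
Then the series $\sum_{k=1}^{\infty} a_k$ converges.
Proof Let L be
\[ \limsup_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right|. \]
Let r be any real number such that L < r < 1. Then, by the definition of limsup (see
Theorem 1.6), there exists an N such that
\[ \left| \frac{a_{k+1}}{a_k} \right| < r, \quad \forall\, k \ge N. \]
It follows by induction that
\[ |a_{N+j}| \le |a_N| r^j, \quad j = 1, 2, \ldots \]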
and the sequence ak is strictly positive for sufficiently large k, the student can verify
that the series diverges.
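As a numerical illustration of the ratio test, the following Python sketch examines the series with terms ak = k²/2^k. This example and the names a, ratios and partial are our own choices and do not come from the text. The ratios |ak+1/ak| tend to 1/2, so the hypothesis of Theorem 1.10 holds, and the partial sums settle near 6, the value obtained from the closed form of Σ k² x^k at x = 1/2.

```python
def a(k):
    # Illustrative terms a_k = k^2 / 2^k (an assumption of this sketch).
    return k**2 / 2**k


# Ratios |a_{k+1} / a_k| for k = 1, ..., 59; they approach 1/2.
ratios = [abs(a(k + 1) / a(k)) for k in range(1, 60)]

# A long partial sum; the tail is dominated by a geometric series.
partial = sum(a(k) for k in range(1, 200))

print(ratios[:5], ratios[-1])
print(partial)
```

From k = 3 onward every ratio is below r = 0.9, so the comparison |aN+j| ≤ |aN| r^j used in the proof applies with N = 3.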
1.5 Infinite Series 25
then the series $\sum_{k=1}^{\infty} a_k$ converges. If L > 1, the series diverges. If L = 1, the test is
inconclusive.
Proof If L < 1, then as before let r be such that L < r < 1. By the property of
limsup (see Theorem 1.6) we see $a_k < r^k$ for all k sufficiently large. Hence, our
series converges by the comparison test. If L > 1, then again by Theorem 1.6, we
have $a_k^{1/k} > 1$ for infinitely many k. Thus, $a_k > 1$ for infinitely many k and our series
diverges. This completes the proof.
converges.
for all k ≥ 1. Summing this inequality from k = 1 to infinity gives the result.
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} \]
Since 1/x is a decreasing function of x, we see that ak > 0. On the other hand,
\[ a_k = \frac{1}{k} - \int_k^{k+1} \frac{dx}{x} = \int_k^{k+1} \frac{x - k}{kx}\,dx. \]
The numerator in the integrand is at most 1 in the interval [k, k + 1] and so the
integrand is at most 1/k². Consequently, ak ≤ 1/k². Therefore, by the comparison
test, the series
\[ \sum_{k=1}^{\infty} \left( \frac{1}{k} - \int_k^{k+1} \frac{dx}{x} \right) \]
converges to a finite limit γ , often called Euler’s constant. In fact, our estimate on
ak shows that
\[ \sum_{k=1}^{n} a_k = \gamma + O(1/n). \]
In other words,
\[ \sum_{k \le n} \frac{1}{k} = \log(n + 1) + \gamma + O(1/n), \]
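The estimate above is easy to watch numerically. The following Python sketch is an illustration only: the function names and the quoted decimal value of γ are our own additions, not part of the text. It sums ak = 1/k − (log(k + 1) − log k) and prints n times the error, which should remain bounded if the error is O(1/n).

```python
import math

# Decimal value of Euler's constant, quoted here only for comparison.
EULER_GAMMA = 0.5772156649015329


def a(k):
    """a_k = 1/k - (log(k+1) - log k), written with log1p for accuracy."""
    return 1 / k - math.log1p(1 / k)


def partial_sum(n):
    """Sum of a_k for k = 1..n, which equals (1 + 1/2 + ... + 1/n) - log(n + 1)."""
    return sum(a(k) for k in range(1, n + 1))


for n in (10, 100, 1000, 10000):
    err = EULER_GAMMA - partial_sum(n)
    # n * err stays bounded, as the O(1/n) estimate predicts.
    print(n, partial_sum(n), err, n * err)
```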
These tests of convergence that we have so far adumbrated are not sufficient to
deal with alternating series such as
\[ \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k}. \]
\[ \sum_{k=1}^{\infty} (-1)^{k-1} a_k \]
converges.
Proof Let us consider the partial sums
\[ s_n = \sum_{k=1}^{n} (-1)^{k-1} a_k. \]
Then
\[ s_{2n} = (a_1 - a_2) + \cdots + (a_{2n-1} - a_{2n}) \]
is easily seen to be positive and greater than $s_{2n-2}$ so that $s_{2n}$ is increasing. Moreover,
so that our sequence is bounded. By Theorem 1.4, this sequence converges to a finite
limit. A similar analysis applies to the partial sums of odd index:
converges. The partial sums of this series were shown to be Cauchy in an exercise in
Sect. 1.4. However, the special nature of the series can be used to explicitly evaluate
it as the following example shows.
Example 1.2 We observe that
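\[ \frac{1}{k} = \int_0^1 x^{k-1}\,dx, \]
so that
\[ \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k} = \sum_{k=1}^{n} (-1)^{k-1} \int_0^1 x^{k-1}\,dx. \]
Interchanging the integral and sum on the right-hand side and noting the geometric
series
\[ \sum_{k=1}^{n} (-1)^{k-1} x^{k-1} = \frac{1 - (-x)^n}{1 + x}, \]
we find
\[ \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k} = \int_0^1 \frac{1 - (-x)^n}{1 + x}\,dx = \int_0^1 \frac{dx}{1 + x} - \int_0^1 \frac{(-x)^n}{1 + x}\,dx. \]
The first integral equals log 2, and the second is bounded in absolute value by
\[ \int_0^1 x^n\,dx = \frac{1}{n + 1}, \]
which tends to zero as n → ∞,
so that
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots = \log 2. \]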
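The evaluation in Example 1.2 can be checked numerically. In the following Python sketch (an illustration of ours; the function name s is an assumption), the partial sums are compared with log 2. Since the remainder integral of (−x)^n/(1 + x) over [0, 1] is at most 1/(n + 1) in absolute value, the gap should never exceed 1/(n + 1).

```python
import math


def s(n):
    """Partial sum 1 - 1/2 + 1/3 - ... + (-1)^(n-1)/n."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))


for n in (10, 100, 1000):
    gap = abs(s(n) - math.log(2))
    # The gap is compared with the bound 1/(n + 1) from the remainder integral.
    print(n, s(n), gap, 1 / (n + 1))
```

The slow decay of the gap, roughly like 1/(2n), shows that this series is of little practical use for computing log 2.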
Exercises
1. If r ≠ 1, show that
\[ 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}. \]
2. If the series $\sum_{k=1}^{\infty} a_k$ converges, show that limk→∞ ak = 0.
3. If $\sum_{k=1}^{\infty} |a_k|$ converges, show that $\sum_{k=1}^{\infty} a_k$ converges.
\[ \sum_{k=2}^{\infty} \frac{1}{k(\log k)^s} \]
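7. Show that¹
\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots. \]
[Hint: Follow the template of Example 1.2 by noting that $1/(2k + 1) = \int_0^1 x^{2k}\,dx$.]
\[ \lim_{n\to\infty} x^n = \begin{cases} 1 & x = 1 \\ 0 & 0 \le x < 1. \end{cases} \]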
1 This seems to have been first discovered by Madhava in fourteenth-century India. He is also
credited with discovering the familiar Taylor series for the sine and cosine functions more than 250
years before the advent of Sir Isaac Newton. In fact, the Kerala school of mathematics led by
Madhava is now credited with discovering much of what we would now describe as precalculus.
See p. 224 of [2]. The formula in this exercise was rediscovered several centuries later by G. Leibniz.
We therefore refer to the series as the Madhava–Leibniz series.
\[ f(x) = \begin{cases} 1 & x = 1 \\ 0 & 0 \le x < 1, \end{cases} \]
then limn→∞ f n (x) = f (x) pointwise. Note that even though each f n (x) is a con-
tinuous function, the limit function is not continuous. In many applications, this
phenomenon is not useful. Thus, pointwise convergence is too weak a notion of
convergence for a sequence of functions. This motivates the definition of uniform
convergence. As before, let I be an interval and f n a sequence of functions defined
on I . We say the sequence converges uniformly to f on I if for every ε > 0, there
is an N such that
The important thing to note here is that N does not depend on x and applies for all
x ∈ I , hence the term uniform is applicable to emphasize this property.
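The failure of uniformity for the functions x^n on [0, 1] can be seen concretely. The following Python sketch is an illustration under our own naming (f_n, f, gap_at_half_point and sup_gap_on_subinterval are hypothetical helpers). For each n it evaluates the gap |f n (x) − f (x)| at the point x = (1/2)^(1/n) of [0, 1), where x^n = 1/2, so no single N can serve all x once ε < 1/2. On the smaller interval [0, 0.9], by contrast, the largest gap is 0.9^n, which does tend to zero.

```python
def f_n(n, x):
    return x ** n


def f(x):
    # Pointwise limit of x^n on [0, 1].
    return 1.0 if x == 1 else 0.0


def gap_at_half_point(n):
    """|f_n(x) - f(x)| at x = (1/2)^(1/n), a point of [0, 1) where x^n = 1/2."""
    x = 0.5 ** (1 / n)
    return abs(f_n(n, x) - f(x))


def sup_gap_on_subinterval(n, r=0.9):
    """Largest gap on [0, r] with r < 1; x^n is increasing, so it equals r^n."""
    return r ** n


for n in (1, 10, 100, 1000):
    print(n, gap_at_half_point(n), sup_gap_on_subinterval(n))
```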
In a similar vein, we say
\[ \sum_{n=1}^{\infty} f_n(x) \]
for all x ∈ I and all n, we deduce by the uniform convergence property, that there is
an N such that
Proof By Theorem 1.14, the limit function f is continuous and so all the integrals
in the statement of the theorem exist. Now, given ε > 0, we want to show that
\[ \left| \int_a^b f_n(x)\,dx - \int_a^b f(x)\,dx \right| < \varepsilon \quad \forall\, n \ge N. \]
\[ f(x) = f(a) + \int_a^x g(t)\,dt = \lim_{n\to\infty} f_n(a) + \int_a^x g(t)\,dt, \]
\[ \lim_{n\to\infty} f_n(a) \]
so that
\[ \lim_{n\to\infty} f_n(a) = \lim_{n\to\infty} \left( f_n(c) - \int_a^c f_n'(t)\,dt \right). \quad (1.3) \]
Therefore, the limit of the right-hand side of (1.3) exists. Thus f is well-defined
by (1.2) and by the fundamental theorem of calculus, f′ = g. To complete the proof,
we need to show that limn→∞ f n = f uniformly on [a, b]. By (b), given ε > 0, there
is an N such that
Hence,
\[ |f_m(x) - f(x)| = \left| f_m(a) + \int_a^x f_m'(t)\,dt - \lim_{n\to\infty} f_n(a) - \int_a^x g(t)\,dt \right| \le \left| f_m(a) - \lim_{n\to\infty} f_n(a) \right| + \left| \int_a^x (f_m'(t) - g(t))\,dt \right|. \]
If m ≥ N , the absolute value of the integrand is less than ε so that the integral is less
than ε(b − a). Since limn→∞ f n (a) exists, there is an N1 such that for m ≥ N1 , we
have | f m (a) − limn→∞ f n (a)| < ε. Therefore if m ≥ max(N , N1 ), we have
Proof Let
\[ S_n(x) := \sum_{k=1}^{n} f_k(x), \]
and
\[ T_n := \sum_{k=1}^{n} M_k \]
be the respective partial sums. Then, as the series $\sum_{n \ge 1} M_n$ converges, the sequence
of partial sums Tn is Cauchy. That is, given ε > 0, there is an N such that
\[ |T_m - T_n| < \varepsilon \quad \forall\, m \ge n \ge N. \]
This means that for each x, the sequence Sn (x) is also Cauchy because
\[ |S_m(x) - S_n(x)| = \left| \sum_{k=n+1}^{m} f_k(x) \right| \le \sum_{k=n+1}^{m} |f_k(x)| \le \sum_{k=n+1}^{m} M_k = |T_m - T_n| < \varepsilon \]
for all x ∈ I . Therefore the sequence Sn (x) converges pointwise to a limit f (x) (say).
We need to show this convergence is uniform. Indeed,
\[ \left| f(x) - \sum_{k=1}^{n} f_k(x) \right| = |f(x) - S_m(x) + S_m(x) - S_n(x)| \le |f(x) - S_m(x)| + |S_m(x) - S_n(x)| \le |f(x) - S_m(x)| + |T_m - T_n| \]
for any m ≥ n. Given ε > 0, we have an N such that |Tm − Tn | < ε for m ≥ n ≥ N
so that for all x ∈ I ,
\[ \left| f(x) - \sum_{k=1}^{n} f_k(x) \right| \le |f(x) - S_m(x)| + \varepsilon \]
whenever m ≥ n ≥ N . Letting m tend to infinity, we see that the first term on the
right-hand side of the inequality tends to zero. This completes the proof.
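The key inequality |Sm (x) − Sn (x)| ≤ |Tm − Tn | in this proof can be observed on a concrete series. The following Python sketch is an illustration of ours, not taken from the text: it assumes f k (x) = sin(kx)/k² with Mk = 1/k², and it samples x on a finite grid of [0, 2π], so it tests the bound only at those sample points.

```python
import math


def S(n, x):
    """Partial sum of the illustrative series sum_k sin(kx) / k^2."""
    return sum(math.sin(k * x) / k**2 for k in range(1, n + 1))


def T(n):
    """Partial sum of the dominating series sum_k M_k with M_k = 1/k^2."""
    return sum(1 / k**2 for k in range(1, n + 1))


# Sample points in [0, 2*pi]; a finite grid stands in for the whole interval.
grid = [i * 2 * math.pi / 200 for i in range(201)]

n, m = 50, 400
worst = max(abs(S(m, x) - S(n, x)) for x in grid)
bound = T(m) - T(n)
print(worst, bound)
```

The bound T(m) − T(n) does not depend on x, which is exactly what makes the convergence uniform.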
|| f || := sup{| f (x)| : x ∈ I }.
Metric spaces are reviewed in Sect. 1.8. The student can verify that given two func-
tions, f, g ∈ B(I ), the distance || f − g|| defines a metric. It is now clear from the
above discussion that a sequence of functions f n in B(I ) converges to f ∈ B(I ) if
and only if limn→∞ f n = f uniformly on I .
Exercises
(a) Compute the pointwise limit limn→∞ f n (x) for x ∈ [0, 1]. Is the convergence
uniform?
(b) Compute
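\[ \lim_{n\to\infty} \int_0^1 f_n(x)\,dx \quad \text{and} \quad \int_0^1 \lim_{n\to\infty} f_n(x)\,dx. \]
\[ |f_n(x)| \le \frac{1}{2n} \quad \forall\, n \ge 1 \text{ and } \forall\, x \in [-1, 1]. \]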
Deduce that limn→∞ f n = 0 uniformly on [−1, 1].
\[ f_n'(x) = \frac{1 - n^2 x^2}{(1 + n^2 x^2)^2}. \]
Deduce that
\[ \lim_{n\to\infty} f_n'(x) = \begin{cases} 1, & x = 0 \\ 0, & 0 < |x| \le 1. \end{cases} \]
With x a real variable, and c a real number, an infinite series of the form
\[ \sum_{n=0}^{\infty} a_n (x - c)^n = a_0 + a_1 (x - c) + a_2 (x - c)^2 + \cdots \]
is called a power series. Such series have a ubiquitous role in mathematics, ranging
from approximation theory to the solution of differential equations. As we shall see
later, they play a vital role in the development of complex analysis.
There is no loss of generality if we consider the series of the form
\[ \sum_{n=0}^{\infty} a_n x^n \quad (1.4) \]
as can be seen by a simple change of variable. Thus, for the sake of elegance and
simplicity, we will formulate our results with c = 0.
By a simple application of the root test, the series converges if |x| < R, where
\[ R = \frac{1}{\limsup_{n\to\infty} |a_n|^{1/n}}, \quad (1.5) \]
where we understand 1/0 to be infinity and 1/∞ to be zero. This motivates the
definition of the radius of convergence as the largest number R such that the series
(1.4) converges for |x| < R and R is given by the formula (1.5).
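Formula (1.5) suggests a crude numerical estimate of R. The following Python sketch is only a heuristic illustration: the function name radius_estimate is our own, the coefficients are supplied through their logarithms to avoid overflow, and a single large index n is used in place of the lim sup, which is justified only when |an |^(1/n) actually converges.

```python
import math


def radius_estimate(log_abs_a, n):
    """Approximate R = 1 / limsup |a_n|^(1/n) by the single term at index n.

    log_abs_a(n) must return log|a_n|; using one index instead of the
    lim sup is an assumption of this sketch.
    """
    return math.exp(-log_abs_a(n) / n)


# a_n = 2^n: the radius of convergence is 1/2.
print(radius_estimate(lambda n: n * math.log(2), 100))

# a_n = n * 3^n: the radius of convergence is 1/3.
print(radius_estimate(lambda n: math.log(n) + n * math.log(3), 5000))

# a_n = 1/n!: the radius is infinite, and the estimate grows without bound in n.
print(radius_estimate(lambda n: -math.lgamma(n + 1), 5000))
```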
An important application of the theory of power series is the expansion of a
function as a Taylor series. One can also view the theory as giving polynomial
approximations to C ∞ functions. The idea is simple enough and is an application of
repeated integration by parts.
Without any loss of generality, suppose that f is a C ∞ function defined on [−1, 1].
By the fundamental theorem of calculus
\[ f(x) - f(0) = \int_0^x f'(t)\,dt. \]
In other words,
\[ f(x) = f(0) + f'(0)x + \int_0^x f''(t)(x - t)\,dt. \]
The key observation here is that the primitive of 1 is t + c for any constant c and we
have chosen the constant to be −x in this calculation. Iterating this procedure, it is
now evident that we have the following theorem.
Theorem 1.18 (Taylor’s theorem) Suppose that f is a function defined on an open
interval I and that c is a point in I . Suppose that f (n+1) exists and is continuous on
I . Then,
\[ f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k + R_n(x), \]
where
1.7 Power Series 37
\[ R_n(x) = \frac{1}{n!} \int_c^x (x - t)^n f^{(n+1)}(t)\,dt. \quad (1.6) \]
\[ R_n(x) = \frac{f^{(n+1)}(\xi)}{(n + 1)!} (x - c)^{n+1}. \quad (1.7) \]
Proof It is clear that our discussion preceding the statement of the theorem (replacing
zero by c) leads to the first assertion. Indeed,
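\[ f(x) - f(c) = \int_c^x (x - t)^0 f'(t)\,dt = (x - c) f'(c) + \int_c^x (x - t) f''(t)\,dt, \]
Then,
\[ \frac{m}{n!} \int_c^x (x - t)^n\,dt \le R_n(x) \le \frac{M}{n!} \int_c^x (x - t)^n\,dt. \]
\[ \frac{(x - c)^{n+1}}{n + 1}. \]
Consequently,
\[ m \le \frac{R_n(x)(n + 1)!}{(x - c)^{n+1}} \le M. \]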
By the intermediate value theorem, we see that there is a ξ ∈ [c, x] such that (1.7)
holds. This completes the proof.
It is worth remarking that in Theorem 1.18, the term
\[ \sum_{k=0}^{n} \frac{f^{(k)}(c)}{k!} (x - c)^k \]
is called the nth-order Taylor polynomial. Equation (1.7) is called the Lagrange form
for the remainder. For n = 0, the theorem reduces to the usual mean value theorem.
Theorem 1.18 allows us to derive polynomial approximations of some familiar
functions such as e^x and the trigonometric functions like sin x and cos x.
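As an illustration of such an approximation, the following Python sketch evaluates the Taylor polynomial of sin x about c = 0 and compares the actual error with the bound coming from the Lagrange form (1.7). The function name taylor_sin and the sample point x = 1.5 are our own choices. Since every derivative of sin is bounded by 1 in absolute value, (1.7) gives |Rn (x)| ≤ |x|^(n+1)/(n + 1)!.

```python
import math


def taylor_sin(x, n):
    """Taylor polynomial of sin about 0, including terms up to degree n."""
    total = 0.0
    for k in range(n + 1):
        # The k-th derivative of sin at 0 cycles through 0, 1, 0, -1.
        deriv_at_zero = (0, 1, 0, -1)[k % 4]
        total += deriv_at_zero * x**k / math.factorial(k)
    return total


x, n = 1.5, 9
error = abs(math.sin(x) - taylor_sin(x, n))
# Lagrange bound, using |sin^(n+1)(xi)| <= 1 for every xi.
bound = abs(x) ** (n + 1) / math.factorial(n + 1)
print(error, bound)
```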
\[ \sin^{(2n+2)}(\xi)\, \frac{x^{2n+2}}{(2n + 2)!}, \]
Exercises
2. If
\[ \sum_{n=0}^{\infty} a_n x^n \]
has radius of convergence R, and P and Q are polynomials such that Q(n) ≠ 0
for all n ≥ 0, show that
\[ \sum_{n=0}^{\infty} \frac{P(n)}{Q(n)} a_n x^n \]
3. Show that the Taylor series of (1 + x)^r when r is a real number is given by
\[ \sum_{k=0}^{\infty} \binom{r}{k} x^k, \]
where
\[ \binom{r}{k} = \frac{r(r - 1) \cdots (r - k + 1)}{k!}. \]
Show further that the power series converges for |x| < 1.
4. Using the previous exercise, show that the Taylor series of (1 − 4x)^(−1/2) about
x = 0 is given by
\[ \sum_{n=0}^{\infty} \binom{2n}{n} x^n. \]
5. Prove that
\[ \sum_{n=0}^{\infty} \binom{2n}{n} \frac{(-1)^n}{4^n} = \frac{1}{\sqrt{2}}. \]
has radius of convergence R = ∞. (Jr (x) is called the Bessel function of the
first kind.)
7. With Jr (x) as in the previous exercise, show that Jr (x) is a solution y(x) to
Bessel’s differential equation
\[ x^2 y'' + x y' + (x^2 - r^2) y = 0. \]
(Bessel functions have many applications in the study of the propagation of elec-
tromagnetic waves through cylindrical waveguides.)
8. (Generalized mean value theorem for integrals) If f and g are two continuous
functions on an interval [a, b] and if f is non-negative there, show that there is a
ξ ∈ [a, b] such that
\[ \int_a^b f(t) g(t)\,dt = g(\xi) \int_a^b f(t)\,dt. \]
9. (Second mean value theorem for integrals) Let f (x) be a bounded, monotonic
decreasing, non-negative, differentiable function on [a, b] and let g(x) be a
bounded integrable function. Show that for some ξ ∈ [a, b], we have
\[ \int_a^b f(x) g(x)\,dx = f(a) \int_a^{\xi} g(x)\,dx. \]
The construction of the real numbers from the set of rational numbers used two
fundamental ideas: the concept of a metric and Cauchy sequences. It is convenient
to define both of these ideas in a more general context.
Given a set X , a pseudo-metric for X is a function d : X × X → R+ , the set of
non-negative real numbers satisfying the following properties:
1. d(x, x) = 0 for all x ∈ X ;
2. d(x, y) = d(y, x) for all x, y ∈ X (symmetry);
3. d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z ∈ X (triangle inequality).
If in addition d(x, y) = 0 implies x = y, then d is called a metric. We sometimes
write (X, d) to indicate either the pseudo-metric space or metric space accordingly.
The usual absolute value |x − y| for x, y ∈ R defines a metric and more generally
for m-dimensional space Rm , we define the distance between x = (x1 , . . . , xm ) and
y = (y1 , . . . , ym ), as
\[ |x - y| := \sqrt{\sum_{j=1}^{m} (x_j - y_j)^2}. \]
Occasionally, we also use the notation ||x|| to denote the length of the vector x. The
fact that this distance function satisfies the triangle inequality is not totally trivial
and is equivalent to the Cauchy–Schwarz inequality. (Note that there is no “t” in the
spelling of Schwarz here.) Essentially, we need to show that
It is convenient to introduce the dot product (sometimes called the inner product):
\[ x \cdot y := \sum_{j=1}^{m} x_j y_j. \quad (1.9) \]
1.8 Metric Spaces and Euclidean Spaces 41
(x + y) · (x + y) ≤ x · x + y · y + 2|x||y|,
x · y ≤ |x||y|,
\[ 2 x_j y_j \le x_j^2 + y_j^2. \]
The homogeneity of the left-hand side shows that for any λ ≠ 0, we can replace $x_j$
by $\lambda x_j$ and $y_j$ by $y_j/\lambda$ to get
\[ 2 x_j y_j \le \lambda^2 x_j^2 + \lambda^{-2} y_j^2. \]
\[ 2 \sum_{j=1}^{m} x_j y_j \le \lambda^2 \sum_{j=1}^{m} x_j^2 + \lambda^{-2} \sum_{j=1}^{m} y_j^2. \quad (1.10) \]
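Both the Cauchy–Schwarz inequality x · y ≤ |x||y| and the triangle inequality for the Euclidean distance can be spot-checked on random vectors. The following Python sketch is a sanity check of ours and not a proof: the helper names, the dimension m = 5, the fixed seed and the small floating-point tolerance are all assumptions of the illustration.

```python
import math
import random


def dot(x, y):
    """Dot product as in (1.9)."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Euclidean length |x| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))


def check_inequalities(trials=1000, m=5, seed=0, tol=1e-12):
    """Test |x . y| <= |x||y| and |x + y| <= |x| + |y| on random vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(m)]
        y = [rng.uniform(-1, 1) for _ in range(m)]
        s = [a + b for a, b in zip(x, y)]
        if abs(dot(x, y)) > norm(x) * norm(y) + tol:
            return False
        if norm(s) > norm(x) + norm(y) + tol:
            return False
    return True


print(check_inequalities())
```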
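[Fig. 1.1: the right-angled triangle ABC with the perpendicular CD to AB, the angle θ, and the segments of AB of lengths c − x and x]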
Father-teacher.”2 Some Indian connection seems evident from all written accounts
of Pythagoras and his school with its belief in reincarnation and its insistence on
vegetarianism and moral practices as a means for attaining higher knowledge. In his
research article highlighting the influence of Indian philosophy on Greek thought,
Marlow writes that the philosophy of Pythagoras and Plato “is as unlike anything in
Greek thought as it is like the Hindu mysticism of the Upanisads.”3
The Indian penchant for abstraction, with its legendary discovery of zero and the
decimal system, demonstrates a striking difference between the Greek geometric (and
visual or sensory) approach to mathematics and the Indian abstract (or transcendental)
approach. This is evidenced by the short and elegant proof of the Pythagorean theorem
by Bhaskaracharya (1114–1185 CE). By contrast, Euclid’s proof constructs squares
on each side and proceeds to subdivide them, comparing them with areas of numerous
triangles on each side. The proof is more geometric and visual.
Here is Bhaskara’s short proof of the Pythagorean theorem. Consider the right-
angled triangle ABC with the right angle at C along with the perpendicular C D to
AB (see Fig. 1.1). The triangles ABC, AC D and C B D are similar.
Thus, comparing the smaller triangles to the big triangle ABC, we get
\[ \cos\theta = \frac{c - x}{b} = \frac{b}{c}, \]
so that a² = cx. Putting these two equations together gives us the familiar
Pythagorean theorem:
\[ c^2 = a^2 + b^2, \]
or alternatively,
\[ c = \sqrt{a^2 + b^2}. \]
2 See p. 14 of The Message of Plato by Edward J. Urwick, Methuen and Company Limited, 36
Essex Street W.C., London, 1920.
3 See p. 39 of A.N. Marlow, Hinduism and Buddhism in Greek Philosophy, Philosophy East and
But the concept of a vector was slow in coming. William Rowan Hamilton (1805–
1865) is said to be the first mathematician who defined the notion of a vector in the
context of developing his theory of quaternions. Throughout his life, he was obsessed
with developing his quaternionic theory and in doing so, was led to give a precise
definition of a vector. Both vectors and quaternions have revolutionized mathematics,
but it was the idea of a vector that has had a more profound impact. Hamilton’s work,
however, was confined to four dimensions. It was Hermann Grassmann (1809–1877)
who gave us the modern version of a vector in higher-dimensional space.
Grassmann, interestingly, never had an academic position in a university. He was
a high-school teacher who also had expertise in languages. In particular, he was a
Sanskrit scholar who compiled the first translation of the Rig Veda into German.
It was his mastery of Sanskrit that inspired him to introduce the term “matrix”
in mathematics. In doing this, he was fully aware of the Latin roots, mater and
the Sanskrit root matr, both signifying “mother” or more accurately, “the womb,”
because he felt that the theory of matrices is the “womb of mathematics” from which
everything emerges.
Grassmann saw mathematics more as a “theory of forms” rather than as a “theory
of measurement.” It would be accurate to say that the modern viewpoint is that math-
ematics is both. But Grassmann’s emphasis on forms led to fundamental abstractions
leading to the concepts of vectors and matrices.
The concept of a vector is inevitable if one wants to study motion in three-
dimensional space. In fact, it would be fair to say that multivariable calculus partly
arose motivated by a need to study three-dimensional motion, just as one-variable
calculus arose to describe linear motion, or motion in the plane. Vectors represent
forces. They have a magnitude and a direction. For the novice, the formal operations
with vectors and matrices seem ad hoc. A familiar case in point is the dot product of
two vectors, which lies at the heart of the definition of matrix multiplication:
a = (a1 , a2 , . . . , an ), b = (b1 , b2 , . . . , bn ), a · b = a1 b1 + a2 b2 + · · · + an bn .
[Fig. 1.2: triangle with vertices at (0, 0) and (a, 0), sides labelled b and c, and angles θ and φ]
\[ c^2 = a^2 + b^2 - 2ab\cos\theta. \]
From Fig. 1.2, the student can easily see that the vector sum a + b has coordinates
Since θ + φ = π , we find either by looking at the graph of the cosine and sine
functions, or by using the addition formulas, that
since we may choose, without any loss of generality, our coordinate x-axis along the
vector a.
On the other hand, if a = (a1 , a2 ) and b = (b1 , b2 ), then b − a = (b1 − a1 , b2 −
a2 ) so that by the Pythagorean theorem, we have
W = F · d.
\[ \frac{1}{2} m (v \cdot v). \]
Energy does not have a direction, but velocity does as it is a vector.
From the geometric meaning of the dot product, we see that two vectors x and y are
orthogonal if and only if their dot product is zero. It is important to keep the geometric
meaning of vectors in mind. It adds another dimension to our understanding.
It is convenient to recall here the cross-product of two vectors in R3 . Formally,
given two vectors
a = (a1 , a2 , a3 ), b = (b1 , b2 , b3 )
a × b := (a2 b3 − a3 b2 , a3 b1 − a1 b3 , a1 b2 − a2 b1 ).
The formal definition lacks any hint of its importance and meaning. In physics, the
concept arises to describe torque. If a represents the displacement of a particle from
a fixed point to a movable point, and b is the force applied at the movable point,
then the cross-product a × b is the torque exerted by the force about the fixed point.
The above opaque definition is better remembered using determinants. One writes
symbolically,
\[ a \times b = \begin{vmatrix} i & j & k \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \]
where i, j, k are the unit vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1) respectively. Expanding
the determinant using the first row leads to the earlier definition so that the “determinant
expression” serves as a useful mnemonic for the cross-product. A straightforward
but tedious computation shows that
so that
\[ |a \times b|^2 = |a|^2 |b|^2 - |a|^2 |b|^2 \cos^2\theta. \quad (1.13) \]
Consequently,
|a × b| = |a||b| sin θ,
where θ is the angle between the two vectors a and b. This formula implies a geo-
metric interpretation of the magnitude of the cross-product vector. It is the area of the
parallelogram spanned by the two vectors a and b. The direction of the cross-product
is given by the familiar “right-hand rule,” where if you align the fingers of your right
hand along the vector a and bend your fingers around in the direction of rotation
from a to b, your thumb will point in the direction of a × b. We also note that the
cross-product of a and b is zero if and only if they are parallel.
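The properties of the cross-product recalled above can be verified on a small example. The following Python sketch is an illustration of ours: the vectors a = (1, 2, 3) and b = (4, 5, 6) are arbitrary choices, and integer arithmetic is used so that every check is exact. It confirms that a × b is orthogonal to both factors, that |a × b|² = |a|²|b|² − (a · b)², which is (1.13) rewritten with a · b = |a||b| cos θ, and that parallel vectors have zero cross-product.

```python
def cross(a, b):
    """Cross-product of two vectors in R^3, following the componentwise definition."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


a = (1, 2, 3)
b = (4, 5, 6)
c = cross(a, b)

print(c)                                   # the vector a x b
print(dot(c, a), dot(c, b))                # both zero: a x b is orthogonal to a and b
print(dot(c, c), dot(a, a) * dot(b, b) - dot(a, b) ** 2)  # two sides of (1.13)
print(cross(a, (2, 4, 6)))                 # parallel vectors give the zero vector
```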
With this interlude on the dot product and cross-product, we return to our general
discussion and consider sequences in Rm . It will be clear that much of this discussion
applies to a general metric space. For the sake of clarity, we will confine our attention
to a discussion of sequences in Rm . They will be sequences of vectors xn and we can
write
\[ x_n = (x_{n1}, \ldots, x_{nm}). \]
\[ |x_n - L| < \varepsilon \quad \forall\, n \ge N. \]
We then write limn→∞ xn = L. The reader can easily verify that convergence of a
sequence of vectors is equivalent to the convergence of the component sequences
(see Exercise 1).
The notion of Cauchy sequences in Rm is analogous (as it is in any metric space).
The student can show again that a sequence in Rm is Cauchy if and only if it converges
(see Exercise 3).
A sequence xn in Rm is said to be bounded if there is a number M such that
|xn | ≤ M for all n. Again, the reader can verify that a sequence in Rm is bounded if
and only if each component sequence is bounded (Exercise 2).
The analogue of the Bolzano–Weierstrass theorem goes through.
Theorem 1.19 Every bounded sequence in Rm has a convergent subsequence.
Exercises
(c) Obtain an estimate for |xn − xm | in terms of K and |x2 − x1 | for any two
positive integers m, n.
(d) Show that xn is a Cauchy sequence in Rd .
(e) Show that the vector
\[ x := \lim_{n\to\infty} x_n \]
is a fixed point of f.
7. For each number a put f a (x) = ax. For which numbers a is f a a contraction on
R? When f a is a contraction, what is the fixed point?
8. For what intervals [0, r ] with r ≤ 1, is the map f : [0, r ] → [0, r ] given by
x ↦ x² a contraction?
Much of our discussion regarding analysis on the real line extends to Rm . For example
in R2 , the closed rectangle [a, b] × [c, d] is
and more generally we can speak of closed rectangles in Rm (even though they are
not rectangles per se, but rather generalizations of them). A similar definition can
be made about open rectangles (a, b) × (c, d) in R2 and analogously in higher
dimensions. A subset U ⊂ Rm is called open if for each point x ∈ U there is an open
rectangle A such that x ∈ A ⊂ U . This is equivalent to the definition that there is
an open ball Br (x) contained in U . A subset C of Rm is closed if its complement is
1.9 The Heine–Borel Theorem 49
Exercises
3. If B is compact and O is an open cover of {x} × B, show that there is an open set
U ⊂ Rn containing x such that U × B is covered by a finite number of sets in O.
5. In Rm , show that the union of any (even infinite) number of open sets is open.
Prove that the intersection of finitely many open sets is open, and find a coun-
terexample to show how this fails for infinitely many open sets.
6. If A is a closed set that contains every rational number r ∈ [0, 1], show that
[0, 1] ⊂ A.
as the reader can easily verify. For example, the circle of radius r is parameterized
by
(x(t), y(t)) = (r cos t, r sin t), 0 ≤ t ≤ 2π.
\[ T(t) = \frac{r'(t)}{|r'(t)|} \]
is called the unit tangent vector to the curve C. It is again a vector-valued function
of a single variable. The magnitude of the rate of change of the tangent vector with
respect to the length of the curve is called the curvature of C. More precisely, the
curvature κ is given by
\[ \kappa := \left| \frac{dT}{ds} \right|, \]
By (1.14),
\[ \frac{ds}{dt} = |r'(t)|. \]
Combining these formulas, we obtain
\[ \kappa = \frac{|T'(t)|}{|r'(t)|}. \]
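The formula κ = |T′(t)|/|r′(t)| lends itself to a numerical check. The following Python sketch is an illustration under our own assumptions: the derivatives are approximated by central differences with a step h chosen by us, and the helper names and the two sample curves are hypothetical. For a circle of radius 3 the computed curvature should be close to 1/3, and for a straight line it should be close to zero.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def derivative(g, t, h=1e-4):
    """Central-difference approximation to g'(t) for a vector-valued g."""
    return tuple((p - q) / (2 * h) for p, q in zip(g(t + h), g(t - h)))


def unit_tangent(r, t):
    """T(t) = r'(t) / |r'(t)|, with r'(t) approximated numerically."""
    v = derivative(r, t)
    length = norm(v)
    return tuple(c / length for c in v)


def curvature(r, t):
    """kappa = |T'(t)| / |r'(t)|, with both derivatives approximated numerically."""
    t_prime = derivative(lambda u: unit_tangent(r, u), t)
    return norm(t_prime) / norm(derivative(r, t))


def circle(t, radius=3.0):
    return (radius * math.cos(t), radius * math.sin(t))


def line(t):
    return (2 * t + 1, -t + 4, 5 * t)


print(curvature(circle, 0.7))  # close to 1/3
print(curvature(line, 0.7))    # close to 0
```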
We can also define this without using limits. We say f is continuous if and only if
f −1 (U ) is open for every open set U in Rm . The following important theorem states
that the continuous image of a compact set is compact.
Theorem 1.21 If f : A → Rm is continuous and A ⊂ Rn is compact, then f (A) is
compact.
Proof Let O be an open cover of f (A). For each open set U of O, there is an open
set VU such that f −1 (U ) = VU ∩ A, by the continuity of f . The collection of VU ’s
is an open cover of A which can be reduced to a finite subcover VU1 , . . . , VUn (say)
since A is compact. Thus, U1 , . . . , Un cover f (A).
Exercises
r(t) = (at + b, ct + d, et + f )
for some constants a, b, c, d, e, f . Show that the curvature of the line is zero.
3. Given a curve r(t), assume that the unit tangent vector to this curve is differen-
tiable. Define the principal unit normal vector N(t) to be
\[ \frac{T'(t)}{|T'(t)|}. \]
Show that
\[ N(t) = \frac{1}{\kappa} \frac{dT}{ds}, \]
where κ denotes the curvature.
4. Under the same conditions as in the previous exercise, show that |r(t)| is constant
if and only if r(t) and r′(t) are orthogonal to each other, for all values of t in the
domain of r(t).
for all x in A.
\[ \lim_{h\to 0} \frac{f(a_1, \ldots, a_i + h, \ldots, a_n) - f(a_1, \ldots, a_n)}{h} \]
1.11 Derivatives of Multivariable Functions 53
is called the ith partial derivative of f at a and denoted Di f (a). We can con-
tinue taking partial derivatives, provided they exist. For example, we can consider
D j (Di f ), sometimes denoted D j,i f . There are various theorems that ensure that
D j,i f = Di, j f . For instance, if both of these are continuous on an open set con-
taining a, then we have Di, j f (a) = D j,i f (a) (see Exercise 7). The equality is true
assuming weaker hypotheses which we do not discuss here. The function Di, j f is
called a second-order partial derivative or sometimes a mixed partial derivative.
In the case of functions of two variables f (x, y), it is convenient to write f x and f y
for ∂f/∂x and ∂f/∂y respectively. Thus $f_{xy}$ would be $(f_x)_y$ and so on. A similar tradition is
sometimes adopted for functions of three or more variables.
The directional derivatives and the partial derivatives are related. Namely, if ei
is an element of the standard basis of Rn with 1 in the ith component and all other
components being zero, then Di f (a) = f ei (a).
For functions whose range is contained in R, we can speak about the maximum
and minimum values. The reader can verify that if A ⊂ Rn and f : A → R is dif-
ferentiable and the maximum value occurs at a, then Di f (a) = 0 for every i.
Historically, multivariable calculus emerged in the study of electromagnetism
and in particular, in the derivation of Maxwell’s equations. This is analogous to the
development of one-variable calculus that arose from Newton’s study of motion and
gravitation. Partly motivated by this context and for other practical reasons, the study
of scalar fields has been given detailed attention. Thus, given a continuous function
f : Rn → R, we define the gradient of f , denoted ∇ f as a function from Rn to Rn
given by
∇ f = (D1 f, . . . , Dn f ).
Thus, ∇ f is a vector field. Points x ∈ Rn where ∇ f (x) = 0 are called critical points.
If f (x, y) is a function f : R2 → R, of two variables and we have a curve z(t) =
(x(t), y(t)) ∈ R2 , we can study f along this curve and it can be viewed as a function
of a single variable. That is, we can consider
We want to calculate F'(t_0). Writing (x(t_0), y(t_0)) = (x_0, y_0), we have from one-
variable calculus,
$$F'(t_0) = \lim_{t\to t_0}\left[f_x(\xi, y)\,\frac{x-x_0}{t-t_0} + f_y(x_0,\eta)\,\frac{y-y_0}{t-t_0}\right]$$
which equals
$$f_x(x_0,y_0)\,x'(t_0) + f_y(x_0,y_0)\,y'(t_0).$$
In other words,
$$\frac{df}{dt} = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$
$$\frac{df}{dt} = \sum_{i=1}^{n}\frac{\partial f}{\partial x_i}\frac{dx_i}{dt}.$$
$$\frac{df}{dt} = \nabla f(x)\cdot x'(t), \tag{1.15}$$
and we see the analogy to the one-variable version of the chain rule. Thus, (1.15)
is often referred to as the chain rule for functions of several variables, though it is
only a special case of the general chain rule for the derivative of the composition of
functions f : Rn → Rm and g : Rm → Rr (see Exercise 3 at the end of this section).
This observation allows us to prove Taylor’s theorem in several variables.
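As a numerical illustration of (1.15), the following Python sketch compares the chain-rule value ∇f(x) · x'(t) with a central-difference approximation of d/dt f(x(t)). The function f, the curve, and all names below are hypothetical choices made only for this illustration.

```python
import math

def f(x, y):
    # Hypothetical scalar field used only for illustration.
    return x * x * y + math.sin(y)

def grad_f(x, y):
    # Partial derivatives of f, computed by hand.
    return (2 * x * y, x * x + math.cos(y))

def curve(t):
    # Hypothetical curve (x(t), y(t)).
    return (math.cos(t), t * t)

def curve_velocity(t):
    # Derivative (x'(t), y'(t)) of the curve above.
    return (-math.sin(t), 2 * t)

def chain_rule_derivative(t):
    # Right-hand side of (1.15): grad f(x(t)) . x'(t).
    x, y = curve(t)
    fx, fy = grad_f(x, y)
    dx, dy = curve_velocity(t)
    return fx * dx + fy * dy

def numerical_derivative(t, h=1e-6):
    # Central difference of t -> f(x(t), y(t)).
    return (f(*curve(t + h)) - f(*curve(t - h))) / (2 * h)

print(chain_rule_derivative(0.7), numerical_derivative(0.7))
```

The two printed values should agree to several decimal places; the small discrepancy comes from the finite step h.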
We say a subset S of Rn is star-shaped with respect to the point c if for every x
in S the line segment joining x to c lies in S.
Lemma 1.2 Let f be a C ∞ function on an open subset U of Rn which is star-shaped
with respect to a point c = (c1 , . . . , cn ) in U . Then there are C ∞ -functions g1 , . . . , gn
on U such that
$$f(x) = f(c) + \sum_{i=1}^{n}(x_i-c_i)\,g_i(x), \qquad g_i(c) = \frac{\partial f}{\partial x_i}(c).$$
Proof Since U is star-shaped with respect to c, we have for any x in U that the line
segment
c + t (x − c), 0 ≤ t ≤ 1
$$\frac{d}{dt}f(c+t(x-c)) = \sum_{i=1}^{n}(x_i-c_i)\,\frac{\partial f}{\partial x_i}(c+t(x-c)).$$
$$f(c+t(x-c))\Big|_{t=0}^{t=1} = \sum_{i=1}^{n}(x_i-c_i)\int_0^1\frac{\partial f}{\partial x_i}(c+t(x-c))\,dt.$$
Let
$$g_i(x) = \int_0^1\frac{\partial f}{\partial x_i}(c+t(x-c))\,dt.$$
$$f(x) = f(c) + \sum_{i=1}^{n}(x_i-c_i)\,g_i(x).$$
Moreover,
$$g_i(c) = \int_0^1\frac{\partial f}{\partial x_i}(c)\,dt = \frac{\partial f}{\partial x_i}(c).$$
$$f(x) = f(c) + \sum_{i=1}^{n}(x_i-c_i)\,\frac{\partial f}{\partial x_i}(c),$$
$$h\cdot\nabla := \sum_{i=1}^{n}h_i\,\frac{\partial}{\partial x_i},$$
$$(h\cdot\nabla)f(c) := \sum_{i=1}^{n}h_i\,\frac{\partial f}{\partial x_i}\bigg|_{x=c}.$$
More generally, we can introduce powers (h · ∇)^r of this operator. For instance,
$$(h\cdot\nabla)^2 := \sum_{i,j=1}^{n}h_ih_j\,\frac{\partial^2}{\partial x_i\,\partial x_j}.$$
$$f(x) = f(c) + \sum_{j=1}^{r}\frac{((x-c)\cdot\nabla)^j}{j!}f(c) + \frac{((x-c)\cdot\nabla)^{r+1}}{(r+1)!}f(c+\xi(x-c)),$$
Proof We apply the one-variable Taylor’s theorem (see Theorem 1.18) to the function
g(t) := f (c + th).
$$g^{(j)}(t) = (h\cdot\nabla)^j f(c+th)$$
$$f(x_0+h, y_0+k) = f(x_0,y_0) + \frac{1}{2}\left(h^2f_{xx} + 2hk\,f_{xy} + k^2f_{yy}\right) + o((h^2+k^2)^{3/2})$$
where the second derivatives on the right are evaluated at (x0 + θ h, y0 + θ k) for
some 0 < θ < 1. The second expression on the right-hand side is a quadratic form
in h, k and by completing the squares, we can write
$$f(x_0+h, y_0+k) - f(x_0,y_0) = \frac{1}{2}f_{xx}\left[\left(h + \frac{f_{xy}}{f_{xx}}k\right)^2 + \frac{f_{xx}f_{yy}-f_{xy}^2}{f_{xx}^2}\,k^2\right].$$
A similar result holds for relative minima (see Exercise 8). Critical points that
correspond to neither a relative maximum nor a relative minimum are called saddle points.
The notion of a saddle point is best illustrated by considering the function
f (x, y) = x 2 − y 2 . Clearly, (0, 0) is a critical point. However, neither a local maxi-
mum nor a local minimum occurs at (0, 0). This can also be seen visually in Fig. 1.4.
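The role of the quantity f_xx f_yy − f_xy² in the quadratic form above can be illustrated numerically. The sketch below estimates the second-order partial derivatives by finite differences and applies the standard second derivative test at a critical point; the helper names `hessian_2d` and `classify` and the step size are assumptions made for this illustration, not notation from the text.

```python
def hessian_2d(f, x, y, h=1e-4):
    # Finite-difference estimates of f_xx, f_xy, f_yy at (x, y).
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy

def classify(f, x, y):
    # Second derivative test at a critical point (x, y) of f.
    fxx, fxy, fyy = hessian_2d(f, x, y)
    disc = fxx * fyy - fxy**2
    if disc > 0:
        return "minimum" if fxx > 0 else "maximum"
    if disc < 0:
        return "saddle"
    return "inconclusive"

print(classify(lambda x, y: x * x - y * y, 0.0, 0.0))
```

For f(x, y) = x² − y² the estimated discriminant is negative at the origin, which is the saddle behaviour described above.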
The appearance of the quadratic form signals a generalization in the case of more
than two variables. Accordingly, we introduce the Hessian matrix. Given a function
f : A ⊂ Rn → R which has derivatives at least to order two, we define the Hessian
to be the n × n matrix
$$H = \left(\frac{\partial^2 f}{\partial x_i\,\partial x_j}\right).$$
Sometimes, we write H(f) for this matrix. It is then easy to see from Taylor's theorem
(Theorem 1.22) that, writing h = x − c, we have in a neighborhood of c
$$f(x) = f(c) + (h\cdot\nabla f)(c) + \frac{1}{2}\,h^{T}[H(f)(c)]\,h + o(|h|^3),$$
as h → 0. Thus, the linear map h ↦ f'(a)h can be viewed as a good linear approximation
to the difference f(a + h) − f(a). Now, given f : R^n → R^m, we say it is
differentiable at a ∈ Rn if there exists a linear transformation λ : Rn → Rm such
that
f (a + h) = f (a) + λ(h) + o(|h|)
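then
$$Df(a) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1}(a) & \cdots & \dfrac{\partial f_1}{\partial x_n}(a) \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(a) & \cdots & \dfrac{\partial f_m}{\partial x_n}(a) \end{pmatrix},$$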
The Jacobian determinant is the determinant of the Jacobian matrix. We use the
notation
$$\frac{\partial(f_1,\dots,f_n)}{\partial(x_1,\dots,x_n)}$$
In other words, as h → 0,
By definition of differentiability,
$$f(x(t+h), y(t+h)) = f(x(t), y(t)) + \frac{\partial f}{\partial x}\,\Delta x + \frac{\partial f}{\partial y}\,\Delta y + o(\Delta x + \Delta y),$$
$$g'(t) = \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt}.$$
There are certain notions in multivariable calculus that only pertain to dimensions
2 or 3, given that these concepts arose in the context of physics. For example, in R3 ,
if f is the vector field ( f 1 , f 2 , f 3 ), we define the vector field curl by
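$$\operatorname{curl} f = \left(\frac{\partial f_3}{\partial y}-\frac{\partial f_2}{\partial z},\ \frac{\partial f_1}{\partial z}-\frac{\partial f_3}{\partial x},\ \frac{\partial f_2}{\partial x}-\frac{\partial f_1}{\partial y}\right).$$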
An easy way to remember this is to make symbolic use of the cross-product notation:
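$$\nabla\times f = \begin{vmatrix} i & j & k \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ f_1 & f_2 & f_3 \end{vmatrix} = \left(\frac{\partial f_3}{\partial y}-\frac{\partial f_2}{\partial z},\ \frac{\partial f_1}{\partial z}-\frac{\partial f_3}{\partial x},\ \frac{\partial f_2}{\partial x}-\frac{\partial f_1}{\partial y}\right).$$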
As noted earlier, the notion has its roots in physics and was introduced to measure
the rotation of a vector field in R3 . Thus, if curl f = 0, we say the vector field f is
irrotational at that point or has no rotation.
A vector field F is called conservative if F = ∇ f for some scalar valued function
f , which is referred to in the literature as a potential function. If F has continuous
first-order partial derivatives, then it is a consequence of Stokes theorem (see below)
that F is conservative if and only if curl F = 0.
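It is not hard to see that if F is conservative, then curl F = 0 if F has continuous
second-order partial derivatives. Indeed, if F = ∇f = (∂f/∂x, ∂f/∂y, ∂f/∂z), we have
$$\operatorname{curl} F = \begin{vmatrix} i & j & k \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ f_x & f_y & f_z \end{vmatrix} = (f_{zy}-f_{yz},\ f_{xz}-f_{zx},\ f_{yx}-f_{xy}) = 0.$$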
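The vanishing of the curl of a gradient field can be checked numerically. In the sketch below, the potential f(x, y, z) = x²y + y sin z is a hypothetical example, its gradient is entered by hand, and the curl is estimated with central differences; all names are assumptions made for illustration.

```python
import math

def grad_field(p):
    # Gradient of the hypothetical potential f(x, y, z) = x^2 y + y sin z.
    x, y, z = p
    return (2 * x * y, x * x + math.sin(z), y * math.cos(z))

def rotation_field(p):
    # A field that is not a gradient: rigid rotation about the z-axis.
    x, y, z = p
    return (-y, x, 0.0)

def partial(g, p, i, h=1e-5):
    # Central-difference partial derivative of the scalar function g along axis i.
    plus, minus = list(p), list(p)
    plus[i] += h
    minus[i] -= h
    return (g(plus) - g(minus)) / (2 * h)

def curl(F, p):
    # Components follow the formula for curl f given in the text.
    f1 = lambda q: F(q)[0]
    f2 = lambda q: F(q)[1]
    f3 = lambda q: F(q)[2]
    return (partial(f3, p, 1) - partial(f2, p, 2),
            partial(f1, p, 2) - partial(f3, p, 0),
            partial(f2, p, 0) - partial(f1, p, 1))

print(curl(grad_field, (0.3, -1.2, 0.8)))
print(curl(rotation_field, (0.3, -1.2, 0.8)))
```

The first result is zero up to rounding, while the rotation field has curl close to (0, 0, 2).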
f u (x) = ∇ f (x) · u.
Exercises
10. Let f be a C^∞ function of two variables. Suppose that f(0, 0) = 0 and that
f(ta, tb) = t^2 f(a, b) for all real numbers t and all vectors (a, b). Show that
$$f(x) = \frac{1}{2}(x\cdot\nabla)^2 f(0,0).$$
11. A ray of light travels at a constant speed in a uniform medium. In different media
(such as air and water), light travels at different speeds. When light passes from
one medium into another, light is refracted as shown in the figure below. If the
[Figure: a ray of light refracted at the point P, with angles θ1 and θ2]
$$\frac{\sin\theta_1}{\sin\theta_2} = \frac{v_1}{v_2}.$$
From single-variable calculus, the student is familiar with the following theorem: if
f : R → R is continuously differentiable in an open set containing a and f'(a) ≠ 0,
then there is an open set V with a ∈ V and an open set W with f(a) ∈ W such
that f : V → W has a continuous inverse f^{-1} : W → V which is differentiable.
Moreover, for y ∈ W, we have
$$(f^{-1})'(y) = [f'(f^{-1}(y))]^{-1}.$$
The reason for this is clear. Indeed, if f'(a) > 0, there is an open interval V containing
a such that f'(x) > 0 for x ∈ V. Thus, f is increasing on V and is therefore one-to-
one with an inverse function f^{-1} defined on some open interval W containing f(a).
It is not difficult to show that f^{-1} is differentiable and that
$$(f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))}.$$
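The one-variable formula for the derivative of the inverse can be tested numerically. The sketch below uses the hypothetical increasing function f(x) = x³ + x, inverts it by bisection, and compares 1/f'(f⁻¹(y)) with a central-difference derivative of the numerical inverse. The function, the bracketing interval, and the names are assumptions for illustration only.

```python
def f(x):
    # Hypothetical example with f'(x) = 3x^2 + 1 > 0 everywhere.
    return x**3 + x

def f_prime(x):
    return 3 * x * x + 1

def f_inverse(y, lo=-10.0, hi=10.0):
    # Bisection; valid because f is increasing and f(lo) < y < f(hi).
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def inverse_derivative_formula(y):
    # The formula (f^{-1})'(y) = 1 / f'(f^{-1}(y)).
    return 1.0 / f_prime(f_inverse(y))

def inverse_derivative_numeric(y, h=1e-6):
    # Central difference applied directly to the numerical inverse.
    return (f_inverse(y + h) - f_inverse(y - h)) / (2 * h)

print(inverse_derivative_formula(2.0), inverse_derivative_numeric(2.0))
```

Since f(1) = 2 and f'(1) = 4, both values should be close to 1/4.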
1.12 The Inverse Function Theorem 63
A similar statement can be made if f'(a) < 0. There is an analogous theorem for
higher dimensions, and this is the content of the general inverse function theorem.
To this end, we begin with the following lemma.
Lemma 1.3 Let A ⊂ R^n be a rectangle and let f : A → R^n be continuously differentiable.
If there is a number M such that |D_j f_i(x)| ≤ M for all x ∈ A^o, then
$$|f(x) - f(y)| \le n^{3/2}M\,|x-y|$$
for all x, y ∈ A.
Applying the mean value theorem to each of the bracketed terms on the right-hand
side, we obtain
by Exercise 1. Now,
$$|f(y)-f(x)| \le \sum_{i=1}^{2}|f_i(y)-f_i(x)| \le 2^{3/2}M\,|y-x|.$$
$$f_i(y)-f_i(x) = \sum_{j=1}^{n}\left[f_i(y_1,\dots,y_j,x_{j+1},\dots,x_n) - f_i(y_1,\dots,y_{j-1},x_j,\dots,x_n)\right].$$
$$|f_i(y)-f_i(x)| \le \sum_{j=1}^{n}M\,|y_j-x_j| \le \sqrt{n}\,M\,|y-x|.$$
$$|f(y)-f(x)| \le \sum_{i=1}^{n}|f_i(y)-f_i(x)| \le n^{3/2}M\,|y-x|.$$
f : Rn → Rn
and
$$|D_jf_i(x) - D_jf_i(a)| \le \frac{1}{2n^{3/2}} \quad \forall\, i, j \text{ and } x\in U.$$
This last condition combined with Lemma 1.3 applied to g(x) = f (x) − x gives for
x1 , x2 ∈ U that
$$|f(x_1)-x_1-(f(x_2)-x_2)| \le \frac{1}{2}|x_1-x_2|.$$
Since (by the triangle inequality)
$$|x_1-x_2| - |f(x_1)-f(x_2)| \le |f(x_1)-x_1-(f(x_2)-x_2)| \le \frac{1}{2}|x_1-x_2|$$
we deduce
|x1 − x2 | ≤ 2| f (x1 ) − f (x2 )| for x1 , x2 ∈ U. (1.18)
Now f (∂U ) is a compact set which by (1.16), does not contain f (a). Therefore,
there is a number d > 0 such that | f (a) − f (x)| ≥ d for x ∈ ∂U . Let
If y ∈ W and x ∈ ∂U , then
We will show that for any y in W there is a unique x in the interior of U such that
f (x) = y. To prove this, consider the function g : U → R given by
$$g(x) = |y-f(x)|^2 = \sum_{i=1}^{n}(y_i-f_i(x))^2.$$
$$\sum_{i=1}^{n}2\,(y_i-f_i(x))\,D_jf_i(x) = 0, \quad \text{for all } j.$$
By (1.17), the matrix (D j f i (x)) has nonzero determinant for all x ∈ U . Therefore,
yi = f i (x) for all i. In other words, y = f (x). This proves the existence of x. Unique-
ness follows from (1.18). If V = U o ∩ f −1 (W ), we have shown that the function
f : V → W has an inverse f −1 : W → V . We can rewrite (1.18) as
This implies that f −1 is continuous. To complete the proof, we need to show that
f −1 is differentiable. Let u = D f (x). We will show that f −1 is differentiable at
y = f (x) with derivative u −1 . By the definition of the derivative,
where
$$\lim_{x_1\to x}\frac{|r(x_1-x)|}{|x_1-x|} = 0. \tag{1.21}$$
Therefore
u −1 ( f (x1 ) − f (x)) = x1 − x + u −1 (r (x1 − x)).
Since every y1 ∈ W is of the form f (x1 ) for some x1 ∈ V , we can rewrite this as
so to complete the proof, it suffices to show that the last term is o(|y1 − y|) as y1 → y.
As u^{-1} is a linear transformation (see Exercise 5 below), it suffices to show that
we see by (1.20), the second factor is bounded by 2 and the first factor approaches
zero by (1.21) because f −1 (y1 ) → f −1 (y) by the continuity of f −1 .
Exercises
$$\sum_{j=1}^{n}|x_j| \le \sqrt{n}\,\|x\|.$$
4. Determine if the following functions are locally C 1 -invertible at the given point.
(a) f (x, y) = (x 2 − y 2 , 2x y) at (x, y) = (0, 0);
6. With f as in the previous exercise, show that f has no global inverse. That is,
there is no function g : R2 → R2 such that g( f (x)) = x.
to show that the continuity of the derivative cannot be eliminated from the hypoth-
esis of Theorem 1.26.
The implicit function theorem and the inverse function theorem have a long and
luminous history that the student can find described in [4]. They seem to have roots
in the works of Isaac Newton (1642–1727) and Gottfried Leibniz (1646–1716).
Though Joseph Louis Lagrange (1736–1813) found a version of the theorem which
is essentially the inverse function theorem discussed below, it was Augustin Louis
Cauchy (1789–1857) who studied the theorem with sufficient mathematical rigor
and so is acknowledged today as its discoverer.
We begin with a motivating example. Suppose we are given the function f(x, y) =
x^2 + y^2 − 1. If we choose a, b such that f(a, b) = 0 and a ≠ ±1, then there are open
intervals A containing a and B containing b such that if x ∈ A, there is a unique y ∈ B
with f(x, y) = 0. We can therefore define a function g : A → R by the condition
that g(x) ∈ B and f(x, g(x)) = 0. If b > 0, then g(x) = √(1 − x^2) (see Fig. 1.6). For
our function, there is another number b1 such that f(a, b1) = 0. There will also be
another interval B1 containing b1 such that when x ∈ A, we have f(x, h(x)) = 0 for
a unique h(x) ∈ B1. In this case, h(x) = −√(1 − x^2). Both g and h are differentiable,
and these functions are said to be defined implicitly by the equation f(x, y) = 0.
For a = ±1, it is impossible to find any such function g defined on an open interval
containing a.
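The motivating example can be explored numerically. The sketch below checks that both branches g and h satisfy f(x, g(x)) = 0 and compares a finite-difference slope of each branch with the quotient −D1f/D2f, which is the standard formula for the derivative of an implicitly defined function. The sample point and the helper names are assumptions made for illustration.

```python
import math

def f(x, y):
    return x * x + y * y - 1

def g(x):
    # Upper branch, valid for -1 < x < 1.
    return math.sqrt(1 - x * x)

def h(x):
    # Lower branch, valid for -1 < x < 1.
    return -math.sqrt(1 - x * x)

def implicit_slope(x, y):
    # -D1 f / D2 f for f(x, y) = x^2 + y^2 - 1; requires y != 0.
    return -(2 * x) / (2 * y)

def numeric_slope(branch, x, step=1e-6):
    # Central-difference derivative of a branch.
    return (branch(x + step) - branch(x - step)) / (2 * step)

x = 0.3
print(f(x, g(x)), f(x, h(x)))
print(implicit_slope(x, g(x)), numeric_slope(g, x))
print(implicit_slope(x, h(x)), numeric_slope(h, x))
```

At x = ±1 the denominator D2f vanishes on the circle, which matches the failure described in the text.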
We would like a simple criterion for deciding when such a function can be found
for any general continuously differentiable function f of several variables. This
is supplied by the implicit function theorem. More generally, we ask the follow-
ing. Given f : Rn × R → R and f (a1 , . . . , an , b) = 0, when can we find for each
[Fig. 1.6: the unit circle, showing the intervals A and B around a and b, and the point b1]
f i : Rn × Rm → R, 1≤i ≤m
that satisfy
f i (a1 , . . . , an , b1 , . . . , bm ) = 0, 1 ≤ i ≤ m,
when can we find, for each (x1 , . . . , xn ) near (a1 , . . . , an ) a unique (y1 , . . . , ym ) near
(b1 , . . . , bm ) which satisfies
f i (x1 , . . . , xn , y1 , . . . , ym ) = 0, 1 ≤ i ≤ m.
This is the content of the following implicit function theorem which uses the inverse
function theorem in an essential way.
Theorem 1.27 (The implicit function theorem) Suppose f : Rn × Rm → Rm is
continuously differentiable in an open set containing (a, b) with a ∈ Rn and b ∈ Rm
respectively. Suppose further that f (a, b) = 0. Let M be the m × m matrix
since F ◦ h is the identity map. Therefore, f (x, k(x, 0)) = 0. In other words, we can
define g(x) = k(x, 0). This completes the proof.
Exercises
1. The equations relating Cartesian coordinates and polar coordinates are given by
the familiar
x = r cos θ, y = r sin θ.
Using the inverse function theorem, show that we can solve (locally) for r and
θ uniquely in terms of x and y as long as we are away from the origin.
x = r cos θ, y = r sin θ, z = z.
Determine the points of R3 for which we can solve (locally) for r , θ and z in
terms of Cartesian coordinates.
Determine the points in R3 for which we can solve (locally) for r, φ, θ in terms
of x, y and z.
$$g'(x) = -\frac{D_1f(x,g(x))}{D_2f(x,g(x))},$$
$$\frac{\partial z}{\partial x} = -\frac{D_1F}{D_3F}, \qquad \frac{\partial z}{\partial y} = -\frac{D_2F}{D_3F},$$
provided D_3 F ≠ 0.
The Lagrange multiplier method gives conditions for finding the maxima or minima
of a scalar field subject to a side condition. Suppose we want to find the maximum
or minimum of a function f : Rn → R subject to the side condition g(x) = 0 for
some differentiable g : R^n → R. Let C be any curve given by r : [0, 1] → R^n lying
on the hypersurface defined by g(x) = 0. Thus, if r(t) = (x1 (t), . . . , xn (t)), then
g(r (t)) = 0. Now if f has an extremum at x0 (say), and C passes through x0 , then
setting h(t) := f (r(t)), we see that h also has an extremum at t0 where t0 is such
that r(t0) = x0. Thus, by the chain rule, we deduce that
$$0 = h'(t_0) = \nabla f(x_0)\cdot r'(t_0).$$
1.14 The Lagrange Multiplier Method 71
In other words, ∇f(x0) is orthogonal to the tangent vector r'(t0) for every curve C
lying on g = 0 passing through x0 . But if g(x) = 0, we see that for any u,
0 = Du g = ∇g · u,
∇ f (x0 ) = λ∇g(x0 ).
This is the essential idea in the Lagrange multiplier method. It was discovered by
Joseph Louis Lagrange (1736–1813) who made fundamental contributions not only
to analysis but also to number theory and group theory.
We summarize our discussion formally in the following theorem and leave the
proof to the reader.
Theorem 1.28 (Lagrange multiplier method) Let U be an open set of Rn and suppose
that g : U → R is a continuously differentiable function on U . Let S be the set of
points x in U such that g(x) = 0 and ∇g(x) ≠ 0. Let f : U → R be continuously
differentiable on U and assume that x0 is a point of S such that x0 is an extreme point
for f on S. That is, x0 is an extremum for f subject to the constraint g. Then, there
is a number λ such that
∇ f (x0 ) = λ∇ g(x0 ).
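A small numerical experiment illustrates the conclusion of Theorem 1.28. The sketch below takes the hypothetical example f(x, y) = x + y with constraint g(x, y) = x² + y² − 1 = 0, locates the constrained maximum by sampling the circle, and checks that ∇f and ∇g are parallel there. The sampling approach and all names are assumptions chosen for illustration; this is not the method of proof.

```python
import math

def f(x, y):
    return x + y

def grad_f(x, y):
    return (1.0, 1.0)

def grad_g(x, y):
    # Gradient of g(x, y) = x^2 + y^2 - 1.
    return (2 * x, 2 * y)

def constrained_max(samples=100000):
    # Brute-force search for the maximum of f over the unit circle.
    def point(k):
        t = 2 * math.pi * k / samples
        return (math.cos(t), math.sin(t))
    best = max(range(samples), key=lambda k: f(*point(k)))
    return point(best)

def cross_2d(u, v):
    # Vanishes exactly when u and v are parallel.
    return u[0] * v[1] - u[1] * v[0]

x0, y0 = constrained_max()
lam = grad_f(x0, y0)[0] / grad_g(x0, y0)[0]
print((x0, y0), cross_2d(grad_f(x0, y0), grad_g(x0, y0)), lam)
```

The maximum is found near (1/√2, 1/√2), where ∇f = λ∇g with λ close to 1/√2.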
We can consider the situation where there are more constraints. Suppose we are
interested in finding the extrema of a function f : Rn → R subject to the constraints
gi (x) = 0, 1 ≤ i ≤ k,
Exercises
1. Find a formula for the surface area of an open box with length x, width y and
height z. If the volume V is fixed, determine the minimum surface area.
Find the pair of points x = (x1 , . . . , xn ) and y = (y1 , . . . , yn ) on the unit sphere
that maximize and minimize the function
$$f(x_1,\dots,x_n,y_1,\dots,y_n) = \sum_{j=1}^{n}x_jy_j.$$
lie on the unit hypersphere, use the previous exercise to derive the Cauchy–
Schwarz inequality:
|x · y| ≤ |x| |y|.
4. Use the Lagrange multiplier method to show that the distance from the point
(x0 , y0 ) to the line ax + by = d is
$$\frac{|ax_0+by_0-d|}{\sqrt{a^2+b^2}}.$$
5. Show that the distance from the point (x0 , y0 , z 0 ) ∈ R3 to the plane
ax + by + cz = d
is
$$\frac{|ax_0+by_0+cz_0-d|}{\sqrt{a^2+b^2+c^2}}.$$
f (x, y, z) = x 2 y 2 z 2
$$(x^2y^2z^2)^{1/3} \le \frac{x^2+y^2+z^2}{3}.$$
8. Prove the arithmetic mean - geometric mean inequality: for any positive num-
bers x1 , x2 , . . . , xn , we have
$$(x_1x_2\cdots x_n)^{1/n} \le \frac{x_1+x_2+\cdots+x_n}{n}.$$
9. Heron's formula for the area of a triangle whose sides have length x, y, z is given
by
$$\sqrt{s(s-x)(s-y)(s-z)},$$
10. Let A be the n × n symmetric matrix with real entries ai j so that ai j = a ji . Show
that the global maximum value of the quadratic form
$$\sum_{i,j=1}^{n}a_{ij}x_ix_j$$
subject to the constraint x_1^2 + · · · + x_n^2 = 1
is equal to the largest eigenvalue of A. Show that the global minimum value of the
quadratic form subject to the same constraint is equal to the smallest eigenvalue
of A. (Recall that the eigenvalues of a real symmetric matrix are real.)
(where f'(a) is the Jacobian matrix D(f) evaluated at a) is the closest linear approximation
to f(x) near a. The level set h^{-1}(f(a)) also contains a and we call this the
tangent space at a. One can visualize this tangent space as the vector space of all
tangent vectors to all possible curves on the surface through the point a. Clearly, the
tangent space at a consists of all vectors v such that
$$f'(a)\cdot v = 0.$$
In effect, we are integrating along the curve and this can be seen as a generalization
of (1.14). One can integrate with respect to the coordinate parameters as well and
consider generally,
$$\int_C f(x_1,\dots,x_n)\,dx_i, \quad 1\le i\le n.$$
Exercises
Ax 2 + Bx y + C x z + Dy 2 + E yz + F z 2 + Gx + H y + I z + J = 0
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$
Show that the volume of this
ellipsoid is 41 π 2 abc.
x 2 yz + 3y 2 − 2x z 2 − 8z = 0
1.15 Level Sets and Tangent Spaces 75
x 2 + (4a − 2)y 2 − az 2 + 1 = 0
The first application of the Jacobian matrix is the generalization to the multivariable
setting of the change of variable method of one-variable integral calculus. The reader
will recall that if g : [a, b] → R is continuously differentiable, and f : R → R is
continuous, then
$$\int_{g(a)}^{g(b)}f(x)\,dx = \int_a^b f(g(x))\,g'(x)\,dx.$$
The generalization of this formula to higher dimensions involves the Jacobian matrix.
The situation is summarized by the following theorem whose proof can be found in
many books such as [5].
Theorem 1.29 Let A ⊂ R^n be an open set and g : A → R^n a one-to-one, continuously
differentiable function such that det g'(x) ≠ 0 for all x ∈ A. If f : g(A) → R
is integrable, then
$$\int_{g(A)}f = \int_A (f\circ g)\,|\det g'|.$$
Alternately, this theorem can also be written as follows (which may be more familiar
to the student). For ease of notation, we specialize to the case n = 2. Let R be a region
in the xy-plane which is mapped in a one-to-one and onto fashion to the region R'
in the uv-plane via the transformation x = F(u, v), y = G(u, v).
Then,
$$\iint_R f(x,y)\,dx\,dy = \iint_{R'}f(F(u,v),G(u,v))\left|\frac{\partial(F,G)}{\partial(u,v)}\right|du\,dv. \tag{1.22}$$
The essential idea of the proof is understood by noting how an infinitesimal volume
element changes under a linear transformation. We illustrate this in the case of R2 .
The student will recall that if we have two vectors
$$v = \begin{pmatrix} a \\ c \end{pmatrix}, \qquad w = \begin{pmatrix} b \\ d \end{pmatrix},$$
the area of the parallelogram spanned by them is given by the absolute value of the
determinant ad − bc. Indeed, the area is (see (1.13) and Fig. 1.8 below)
$$\sqrt{|v|^2|w|^2 - (v\cdot w)^2}.$$
Computing this directly using vector coordinates leads to the square root of
The following is suggested by the chain rule. The infinitesimal changes Δx and Δy
in x and y are given by:
$$\Delta x = F_u\,\Delta u + F_v\,\Delta v, \qquad \Delta y = G_u\,\Delta u + G_v\,\Delta v.$$
In the xy coordinate system, the area of the rectangle ΔxΔy is transformed in the
uv-coordinate system to the area of the parallelogram spanned by the vectors
$$\begin{pmatrix} F_u \\ F_v \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} G_u \\ G_v \end{pmatrix}.$$
$$\left|\frac{\partial(F,G)}{\partial(u,v)}\right|.$$
Thus, if the region R is transformed into R', it is now intuitively clear that (1.22)
holds. A more rigorous proof can be found in [5]. In the case n = 2, we will show
how this is done in the section on Green’s theorem.
A classical application of the above theorem is to the evaluation of the probability
integral
$$I = \int_{-\infty}^{\infty}e^{-x^2}\,dx.$$
Indeed, we have
$$I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-x^2-y^2}\,dx\,dy.$$
The integration is over R2 which we can also parameterize using polar coordinates:
$$\begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix},$$
The inner integral is evidently 1/2 so that I^2 = π. Hence I = √π.
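The value of the probability integral can be confirmed numerically. The following sketch applies the midpoint rule to e^{−x²} on a truncated interval and to the radial integral of e^{−r²} r; the truncation limit, the number of steps, and the function names are assumptions chosen for illustration.

```python
import math

def midpoint_rule(func, a, b, steps=100000):
    # Composite midpoint rule for the integral of func over [a, b].
    width = (b - a) / steps
    return sum(func(a + (k + 0.5) * width) for k in range(steps)) * width

def gaussian_integral(limit=8.0):
    # The tails beyond |x| = 8 are smaller than e^{-64} and are ignored.
    return midpoint_rule(lambda x: math.exp(-x * x), -limit, limit)

def radial_integral(limit=8.0):
    # Approximates the integral of e^{-r^2} r over [0, infinity).
    return midpoint_rule(lambda r: math.exp(-r * r) * r, 0.0, limit)

print(gaussian_integral(), math.sqrt(math.pi))
print(radial_integral())
```

The first line should show two values agreeing to many digits, and the radial integral should be close to 1/2.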
Even though the area of a circle can be calculated using elementary calculus, it
is instructive to observe that the above integral I allows us to find a formula for the
$$e^{-x^2-y^2}$$
over expanding circles of radius r with r ranging from zero to infinity. In this expan-
sion, the infinitesimal change in area as the circle expands is given by 2πr dr . In
other words, we are integrating over the infinitesimal change in the circumference
as r ranges from zero to infinity. Thus,
$$I^2 = 2\pi\int_0^{\infty}e^{-r^2}\,r\,dr = \pi.$$
We change x = t y in the inner integral and interchange the order, which we can do
because of absolute convergence:
$$I^2 = 4\int_0^{\infty}\int_0^{\infty}e^{-y^2(1+t^2)}\,y\,dy\,dt.$$
This idea will be used later when we discuss the beta function and its functional
relation to the gamma function.
Exercises
a · (b × c).
a · (b × c) = (a × b) · c.
4. Show that
$$\iint_R\sqrt{x^2+y^2}\,dx\,dy = \frac{2}{3}\pi a^3,$$
$$\iint e^{-(x^2+y^2)}\,dx\,dy = \pi(1-e^{-a^2}).$$
$$\iint_T e^{y/(x+y)}\,dx\,dy = \frac{e-1}{2}$$
$$\iint_R\cos\left(\frac{x-y}{x+y}\right)dx\,dy = \frac{\sin 1}{2}.$$
9. Show that
$$\iiint_R\frac{dx\,dy\,dz}{(x^2+y^2+z^2)^{3/2}} = 4\pi\log(a/b),$$
$$\iint_R e^{-(x^2+xy+y^2)}\,dx\,dy = \frac{2\pi(e-1)}{\sqrt{3}},$$
$$\iint_S(1-x^2y^2)^{-1}\,dx\,dy = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots.$$
$$x = \frac{\sin u}{\cos v}, \qquad y = \frac{\sin v}{\cos u}, \qquad u, v > 0,\ u+v\le\pi/2,$$
show that the integral equals π 2 /8. (This exercise is due to E. Calabi.)
We can apply this perspective to calculate the volume and surface areas of the n-
dimensional hypersphere of radius R. The reader will recall that this hypersphere is
the locus of points (x1 , . . . , xn ) in Rn such that
Denoting its volume by Vn (R), we see that it is given by the n-dimensional integral
$$V_n(R) = \int\cdots\int_{x_1^2+\cdots+x_n^2\le R^2}dx_1\cdots dx_n.$$
$$S_{n-1}(R) = \frac{dV_n(R)}{dR} = nC_nR^{n-1}.$$
Thus, the surface area of S n−1 is nCn . We can also perform this integration by
integrating the infinitesimal change in the surface area dωn−1 as r ranges from zero
to infinity so that
$$nC_n = \int_{S^{n-1}}d\omega_{n-1}.$$
This idea was alluded to in the previous section. To determine Cn , we use the fol-
lowing trick:
$$\pi^{n/2} = \left(\int_{-\infty}^{\infty}e^{-x^2}\,dx\right)^{n} = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}e^{-(x_1^2+\cdots+x_n^2)}\,dx_1\cdots dx_n.$$
The n-fold integral on the right can be viewed as an integration over the surface area
element dωn−1 (say) of the hypersphere
x12 + · · · + xn2 = r 2
In other words,
$$nC_n\int_0^{\infty}e^{-r^2}r^{n-1}\,dr = \pi^{n/2}.$$
The integral on the left-hand side is really a special value of the Γ-function (which
will be studied in more detail in Chap. 4). Defining
$$\Gamma(s) := \int_0^{\infty}e^{-t}t^{s-1}\,dt, \quad s > 0,$$
$$\int_0^{\infty}e^{-r^2}r^{n-1}\,dr = \frac{1}{2}\Gamma(n/2).$$
Therefore,
$$C_n = \frac{2\pi^{n/2}}{n\,\Gamma(n/2)},$$
so that
$$V_n(R) = C_nR^n = \frac{2\pi^{n/2}R^n}{n\,\Gamma(n/2)}.$$
We thus have formulas for the volume and surface area of the n-dimensional sphere
of radius R.
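The formulas for C_n, V_n(R) and S_{n−1}(R) are easy to evaluate with the gamma function from the Python standard library. The sketch below does this and compares the results with the familiar values in dimensions 1, 2 and 3; the function names are assumptions introduced for illustration.

```python
import math

def unit_ball_constant(n):
    # C_n = 2 pi^{n/2} / (n Gamma(n/2)).
    return 2 * math.pi ** (n / 2) / (n * math.gamma(n / 2))

def ball_volume(n, R):
    # V_n(R) = C_n R^n.
    return unit_ball_constant(n) * R ** n

def sphere_area(n, R):
    # S_{n-1}(R) = dV_n/dR = n C_n R^{n-1}.
    return n * unit_ball_constant(n) * R ** (n - 1)

R = 1.5
print(ball_volume(2, R), math.pi * R ** 2)
print(ball_volume(3, R), 4 / 3 * math.pi * R ** 3)
print(sphere_area(3, R), 4 * math.pi * R ** 2)
```

Each printed pair should agree up to rounding, which is a quick sanity check on the factor n in the denominator of C_n.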
[Figure: the region R bounded below by C1 : y = g1(x) and above by C2 : y = g2(x), for a ≤ x ≤ b]
To state and prove Green’s theorem, we need first the notion of a simple closed
curve. A curve x : [a, b] → Rn is called closed if x(a) = x(b). It is called a simple
closed curve if it does not intersect itself except at the endpoints.
Theorem 1.30 (Green’s theorem) Let C be a piecewise smooth, simple closed curve
in the plane with positive orientation and let R be the region enclosed by C together
with C. Suppose that M(x, y) and N (x, y) are continuous and have continuous
first-order partial derivatives in some open region D that contains R. Then,
$$\int_C M(x,y)\,dx + N(x,y)\,dy = \iint_R\left(\frac{\partial N}{\partial x} - \frac{\partial M}{\partial y}\right)dA.$$
Proof (Sketch) We will assume that there are two differentiable functions g1 (x) and
g2 (x) such that the region R can be described as
[Figure: the same region R described by C1 : x = h1(y) and C2 : x = h2(y), for c ≤ y ≤ d]
$$\iint_R\frac{\partial M}{\partial y}\,dA = \int_a^b\int_{g_1(x)}^{g_2(x)}\frac{\partial M}{\partial y}\,dy\,dx$$
$$\iint_R\frac{\partial M}{\partial y}\,dA = \int_a^b M(x,y)\Big|_{y=g_1(x)}^{y=g_2(x)}dx = \int_a^b[M(x,g_2(x)) - M(x,g_1(x))]\,dx.$$
Thus,
$$\int_C M(x,y)\,dx = -\iint_R\frac{\partial M}{\partial y}\,dA.$$
$$\int_C N(x,y)\,dy = \iint_R\frac{\partial N}{\partial x}\,dA.$$
Putting everything together gives Green's theorem. To convey the essential idea of
the proof, we have assumed that our contour was of the form indicated in the figures.
However, it is not difficult to modify our proof to consider the more general case.
We discuss the necessary modifications below.
One important corollary of Green’s theorem is the following formula for comput-
ing the area of R:
Corollary 1.2 Suppose that C is a piecewise smooth, simple closed curve enclosing
the region R with area A. Then
$$2A = \int_C x\,dy - y\,dx.$$
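Corollary 1.2 lends itself to a direct numerical check: given a parametrization (x(t), y(t)) of a closed curve together with its derivatives, one can approximate half of the line integral of x dy − y dx. The sketch below does this with the midpoint rule; the function `enclosed_area`, the test curves, and the step count are assumptions made for illustration.

```python
import math

def enclosed_area(x, y, dx, dy, t0, t1, steps=100000):
    # Approximates (1/2) * integral over [t0, t1] of (x y' - y x') dt
    # by the composite midpoint rule.
    width = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        t = t0 + (k + 0.5) * width
        total += x(t) * dy(t) - y(t) * dx(t)
    return 0.5 * total * width

# A circle of radius 2, traversed counterclockwise.
circle_area = enclosed_area(
    lambda t: 2 * math.cos(t), lambda t: 2 * math.sin(t),
    lambda t: -2 * math.sin(t), lambda t: 2 * math.cos(t),
    0.0, 2 * math.pi)
print(circle_area, 4 * math.pi)
```

Reversing the orientation of the curve changes the sign of the result, which is why the positive orientation matters in Green's theorem.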
1.18 Green’s Theorem 85
To illustrate the use of the corollary, we show that the area of an ellipse defined
by
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$
is πab. The ellipse is a simple closed curve which can be defined parametrically by
Letting S denote the region g(R) and C its boundary, we have by Corollary 1.2,
$$2\iint_S dx\,dy = \int_C x\,dy - y\,dx.$$
We change variables to evaluate the integral on the right-hand side: x = F(u, v), y =
G(u, v). Then,
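$$dy = \frac{\partial G}{\partial u}\,du + \frac{\partial G}{\partial v}\,dv, \qquad dx = \frac{\partial F}{\partial u}\,du + \frac{\partial F}{\partial v}\,dv.$$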
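Denoting by C' the image of C under this transformation, we have that our integral
is
$$\int_{C'}\left(F\frac{\partial G}{\partial u} - G\frac{\partial F}{\partial u}\right)du + \left(F\frac{\partial G}{\partial v} - G\frac{\partial F}{\partial v}\right)dv.$$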
Exercises
1. Evaluate
$$\int_C y^2\,dx + x^2\,dy,$$
where C is the closed path formed by the square with vertices (0, 0), (1, 0), (1, 1)
and (0, 1) oriented counterclockwise.
2. Given an n-sided polygon in the plane whose vertices are
arranged counterclockwise around the polygon, show that the area inside the
polygon is given by
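$$\frac{1}{2}\left(\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} + \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} + \cdots + \begin{vmatrix} a_{n-1} & b_{n-1} \\ a_n & b_n \end{vmatrix} + \begin{vmatrix} a_n & b_n \\ a_1 & b_1 \end{vmatrix}\right).$$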
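The polygon area formula in this exercise is a sum of 2 × 2 determinants of consecutive vertices, and it translates directly into a few lines of code. The sketch below is only an illustration of the formula, not a proof; the function name `polygon_area` and the sample polygons are assumptions.

```python
def polygon_area(vertices):
    # Half the sum of the determinants |a_k b_k; a_{k+1} b_{k+1}|,
    # with the last vertex paired with the first.
    n = len(vertices)
    total = 0.0
    for k in range(n):
        a1, b1 = vertices[k]
        a2, b2 = vertices[(k + 1) % n]
        total += a1 * b2 - a2 * b1
    return total / 2

square = [(0, 0), (1, 0), (1, 1), (0, 1)]    # counterclockwise
triangle = [(0, 0), (4, 0), (0, 3)]          # counterclockwise
print(polygon_area(square), polygon_area(triangle))
```

Listing the vertices clockwise instead gives the negative of the area, in line with the orientation convention in Green's theorem.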
$$\int_C x\,dy = -\int_C y\,dx.$$
There are several standard higher-dimensional versions of Green’s theorem that the
student has undoubtedly seen in a second course in calculus. We recall these theorems
(without proofs) below.
The first is the divergence theorem. Recall that a smooth surface S is said to
be orientable if, to each point on S, we can assign a unit normal vector in such
a way that the resulting vector field of normals is continuous everywhere. Such a
surface then is said to be oriented. Spheres and ellipsoids are common examples of
orientable surfaces. The Möbius strip, obtained by taking a strip of paper, twisting it
around, and then joining the ends, is not orientable. The following theorem is usually
attributed to Gauss.
Theorem 1.31 (The divergence theorem) Let f = ( f 1 , f 2 , f 3 ) be a vector field
defined and continuously differentiable in a domain D in R3 . Let R be a region
in D that is bounded by a piecewise smooth, closed orientable surface S. Then,
$$\iint_S f\cdot n\,d\sigma = \iiint_R\nabla\cdot f\,dV,$$
where S is oriented so that at each point on S, n is the outwardly directed unit normal.
Given a continuously differentiable vector field f = ( f 1 , f 2 , f 3 ) in R3 , the function
$$\frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}$$
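$$\iint_S f_1\,dy\,dz + f_2\,dz\,dx + f_3\,dx\,dy = \iiint_R\left(\frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}\right)dV.$$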
For simplicity, we first assume that any line parallel to the coordinate axes cuts S in
at most two points. The reader may find it convenient to visualize our surface as that
of an oblate football. We can then refer to the “lower” portion and “upper” portion
of the surface when we view it from either the x y-plane, or yz-plane or the x z-plane.
Let us first look at the xy-plane. We can write z = g2(x, y) and z = g1(x, y) for
the upper and lower portions of S denoted S2 and S1 respectively. If we denote the
projection of S onto the x y-plane as R1 , then noting that d V = d xd ydz,
$$\iiint_R\frac{\partial f_3}{\partial z}\,dV = \iint_{R_1}\left(\int_{g_1(x,y)}^{g_2(x,y)}\frac{\partial f_3}{\partial z}\,dz\right)dy\,dx,$$
$$\iint_{R_1}f_3(x,y,g_1)\,dy\,dx = -\iint_{S_1}f_3\,k\cdot n_1\,dS_1.$$
Therefore
$$\iiint_R\frac{\partial f_3}{\partial z}\,dV = \iint_S f_3\,k\cdot n\,dS.$$
A similar analysis projecting the surface onto the other two coordinate planes gives
$$\iiint_R\frac{\partial f_1}{\partial x}\,dV = \iint_S f_1\,i\cdot n\,dS.$$
1.19 Theorems of Gauss and Stokes 89
$$\iiint_R\frac{\partial f_2}{\partial y}\,dV = \iint_S f_2\,j\cdot n\,dS.$$
Adding up both sides of our equations now gives the divergence theorem.
Suitable modifications can be made now to consider the more difficult case when
lines parallel to the coordinate axes meet the surface in more than two points. As in
the case of our proof of Green’s theorem, we subdivide the region into subregions
that satisfy our conditions above and then add up the final result, keeping in mind
that we have oriented surfaces and that cancellations take place in the appropriate
way.
Perhaps an important theme in both Green’s theorem and Gauss’s divergence
theorem is how the notion of oriented surface and boundary impose themselves in
the course of the proof. It is this essential ingredient that is formalized in the theory
of differential forms.
One interesting application of the divergence theorem is a simple formula for the
volume of a region R bounded by a piecewise smooth closed oriented surface S. If n is the
unit normal at each point of S, then
$$\text{Volume of } R = \frac{1}{3}\iint_S r\cdot n\,d\sigma$$
where r is the position vector (x, y, z). This formula is analogous to the one we
obtained earlier using Green’s theorem for the area enclosed by a simple closed
curve. Since r = (x, y, z), we see that ∇ · r = 3 and the result is immediate from
the divergence theorem.
As highlighted before, much of the development of multivariable calculus was
motivated by questions arising in physics. For instance, in fluid mechanics, the diver-
gence of a vector field F at a point (x, y, z) corresponds to the net flow of fluid out
of a small box centered at (x, y, z). The sign of the divergence has the following meaning. If
div F(x, y, z) = ∇ · F(x, y, z) > 0, there is more fluid going out of the box than into
the box, in which case we call (x, y, z) a source. If div F(x, y, z) = ∇ · F(x, y, z) < 0,
there is more fluid entering into the box than that which goes out of the box, in which
case we call (x, y, z) a sink. If ∇ · F(x, y, z) = 0 in a region D, we say that the
vector field is incompressible or source-free in D.
The divergence of a vector field is then a measure of the “flow” of the vector field.
For example, for the vector field F(x, y) = (x, y), the divergence is 2 indicating that
the flow is outward. For the vector field F(x, y) = (y, −x), the divergence is zero
so that the net flow is zero.
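The two examples above can be reproduced numerically. The sketch below estimates the divergence of a planar vector field with central differences; the function name `divergence`, the sample point, and the step size are assumptions introduced for illustration.

```python
def divergence(F, x, y, h=1e-5):
    # Central-difference estimate of dF1/dx + dF2/dy at (x, y).
    dF1_dx = (F(x + h, y)[0] - F(x - h, y)[0]) / (2 * h)
    dF2_dy = (F(x, y + h)[1] - F(x, y - h)[1]) / (2 * h)
    return dF1_dx + dF2_dy

def radial(x, y):
    # F(x, y) = (x, y): flow directed away from the origin.
    return (x, y)

def rotational(x, y):
    # F(x, y) = (y, -x): flow circulating around the origin.
    return (y, -x)

print(divergence(radial, 0.4, -0.7))
print(divergence(rotational, 0.4, -0.7))
```

The first value is close to 2 and the second is close to 0, at any sample point.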
The third theorem in the “holy trinity” of theorems from vector calculus is Stokes
theorem.
Theorem 1.32 (Stokes theorem) Let S be an oriented surface parameterized by
a one-to-one parameterization r : D ⊆ R2 → R3 , where D is a region to which
Green’s theorem applies. Let ∂ S be the positively oriented piecewise smooth bound-
ary of S. Then
$$\int_{\partial S}F\cdot ds = \iint_S\operatorname{curl}F\cdot dS,$$
(4) ∇ × F = 0.
Exercises
1. Let C(t) = (x(t), y(t)) for a ≤ t ≤ b be a curve with x(t), y(t) being continu-
ously differentiable. The right normal vector at t is defined as
$$N(t) = \left(\frac{dy}{dt},\ -\frac{dx}{dt}\right).$$
$$\int_C(-q\,dx + p\,dy) = \iint_A\left(\frac{\partial p}{\partial x} + \frac{\partial q}{\partial y}\right)dy\,dx.$$
3. Let F(x, y) = (y, −x). Let C be the circle of radius 1 oriented counter clockwise.
Show that
$$\int_C F\cdot n\,ds = 0.$$
4. Recall that a region is called simply connected if every closed path can be continuously
shrunk to a point. Let P(x, y) and Q(x, y) be continuous and have continuous
first-order partial derivatives at each point of a simply connected region R. Show
that a necessary and sufficient condition that
$$\int_C P\,dx + Q\,dy = 0$$
$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$$
at every point of R.
dφ = Pd x + Qdy.
Prove that a necessary and sufficient condition that Pd x + Qdy is an exact dif-
ferential is that
$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}.$$
f (x, y)d xd y,
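so that
$$dx\,dy = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial x}{\partial v} \\ \dfrac{\partial y}{\partial u} & \dfrac{\partial y}{\partial v} \end{vmatrix}\,du\,dv.$$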
If we formally set x = y, we see that the determinant is zero and so this suggests
dx dx = 0. Interchanging x and y changes the sign of the determinant so that
dx dy = −dy dx. These observations suggest a search for a suitable “algebra” of
differential forms that satisfies these two properties. Such an algebra leads to an elegant
synthesis of Green’s theorem and Stokes theorem and, in a single formula, to a
grand generalization to higher dimensions, which can be seen as the ultimate
fundamental theorem of calculus. Below, we give a lightning overview of the theory of
differential forms. The student can find a detailed treatment in [5].
Let V be a vector space over R and denote by $V^k$ the k-fold product
V × · · · × V.
Note that the order of the factors S and T matters here so the operation ⊗ is not
commutative. The reader can easily verify the following properties:
(S1 + S2 ) ⊗ T = S1 ⊗ T + S2 ⊗ T,
S ⊗ (T1 + T2 ) = S ⊗ T1 + S ⊗ T2 ,
(S ⊗ T ) ⊗ U = S ⊗ (T ⊗ U ).
\[ \phi_{i_1} \otimes \cdots \otimes \phi_{i_k}, \qquad 1 \le i_1, \ldots, i_k \le n, \]
\[ T(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_k) = -T(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_k) \qquad \forall\, v_1, \ldots, v_k \in V, \]
where we have interchanged $v_i$ and $v_j$ on the right-hand side and left all other v’s
fixed in their position. In other words, the sign of the tensor changes if we apply
a transposition to the order of the arguments, a property shared by the determinant
function. The set of alternating tensors is denoted $\Lambda^k(V)$. The reader will recall
that for the permutation group $S_k$ on k letters, the sgn function assigns +1 if the
permutation is even and −1 if it is odd. This function allows us to construct an
alternating tensor from any tensor as follows. Given $T \in T^k(V)$, we define
\[ \operatorname{Alt}(T)(v_1, \ldots, v_k) = \frac{1}{k!} \sum_{\sigma \in S_k} (\operatorname{sgn} \sigma)\, T(v_{\sigma(1)}, \ldots, v_{\sigma(k)}). \]
We leave the verification that this is indeed an alternating tensor to the student who
we assume is familiar with basic linear algebra and has already demonstrated this
property for the determinant function. The proof here is similar.
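The averaging construction Alt can be tried out on a small example. The sketch below is only illustrative: it assumes a k-tensor is represented as an ordinary Python function of k vectors, and the names `sign`, `alt` and the sample tensor `T` are hypothetical choices, not notation from the text.

```python
from itertools import permutations
from math import factorial


def sign(perm):
    """Sign of a permutation given as a tuple of 0-based indices."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def alt(T, k):
    """Return Alt(T) for a k-tensor T given as a function of k vectors."""

    def alt_T(*vs):
        # Sum sgn(sigma) * T(v_sigma(1), ..., v_sigma(k)) over all of S_k.
        total = sum(
            sign(p) * T(*(vs[i] for i in p)) for p in permutations(range(k))
        )
        return total / factorial(k)

    return alt_T


def T(u, v):
    """A 2-tensor on R^2 that is not alternating: T(u, v) = u_1 * v_2."""
    return u[0] * v[1]


A = alt(T, 2)
u, v = (1.0, 2.0), (3.0, 5.0)
print(A(u, v), A(v, u), A(u, u))
```

For this choice of T, Alt(T)(u, v) is half of the 2 × 2 determinant with rows u and v, so swapping the arguments flips the sign and repeating an argument gives zero.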
\[ \omega \wedge \eta = \frac{(r+s)!}{r!\,s!} \operatorname{Alt}(\omega \otimes \eta). \]
The reason for the coefficient will be apparent later. The wedge product has the
usual properties of distributivity and associativity. It has the important property that
$\omega \wedge \eta = (-1)^{rs}\, \eta \wedge \omega$. In particular, if r is odd, ω ∧ ω = 0. If r or s is even, then
ω ∧ η = η ∧ ω. If r and s are both odd, then ω ∧ η = −η ∧ ω.
We can construct a basis for $\Lambda^k(V)$ using the wedge product. In fact, the set of
all
\[ \phi_{i_1} \wedge \cdots \wedge \phi_{i_k}, \qquad 1 \le i_1 < i_2 < \cdots < i_k \le n \]
\[ w_i = \sum_{j=1}^{n} a_{i,j} v_j, \]
\[ \delta_{ij} = T(w_i, w_j) = \sum_{k,\ell=1}^{n} a_{ik} a_{j\ell}\, T(v_k, v_\ell) = \sum_{k=1}^{n} a_{ik} a_{jk}. \]
( p, v) + ( p, w) = ( p, v + w), a · ( p, v) = ( p, av)
\[ (e_1)_p, \ldots, (e_n)_p \]
\[ F(p) = F_1(p)(e_1)_p + \cdots + F_n(p)(e_n)_p. \]
\[ \sum_{i=1}^{n} D_i F_i, \]
\[ \nabla = \sum_{i=1}^{n} D_i \cdot e_i, \]
is a 1-form. Thus, the map f → d f changes 0-forms into 1-forms. This observation
allows us to construct an explicit basis for the vector space of k-forms, namely,
and we say ω is $C^r$ if the functions $\omega_{i_1, \ldots, i_k}$ are all $C^r$. If ω is a differentiable k-form,
then we define dω as
\[ d\omega := \sum_{i_1 < \cdots < i_k} d\omega_{i_1, \ldots, i_k} \wedge dx_{i_1} \wedge \cdots \wedge dx_{i_k} \]
which is a (k + 1)-form. This is the algebra of differential forms that we have been
looking for. The d-operator is called the exterior derivative operator.
On the geometric side, we define a singular n-cube in $A \subset \mathbb{R}^n$ to be a
continuous function $c : [0, 1]^n \to A$. The standard n-cube is $I^n : [0, 1]^n \to \mathbb{R}^n$ given by
$I^n(x) = x$ for $x \in [0, 1]^n$. We can consider the Z-module generated by the singular
n-cubes. These are simply finite formal sums of the form
\[ \sum_{j} n_j c_j, \]
with $n_j \in \mathbb{Z}$ and the $c_j$’s singular n-cubes. Such sums are called n-chains. In
particular, any singular n-cube is also an n-chain.
For each singular n-chain c in A, we will define an (n − 1)-chain in A called the
boundary of c and denoted ∂c. For example, the boundary of $I^2$ may be defined as
the sum of four singular 1-cubes arranged counterclockwise around the boundary of
$[0, 1]^2$ as indicated in the figure below.
However, it will be convenient to define it as the sum of four singular 1-cubes
with the indicated coefficients:
With this as motivation, we can define generally the boundary of $I^n$ as follows.
For each i with 1 ≤ i ≤ n, we define two singular (n − 1)-cubes $I^n_{(i,0)}$ and $I^n_{(i,1)}$ as
follows. If $x \in [0, 1]^{n-1}$, then
\[ I^n_{(i,0)}(x) = (x_1, \ldots, x_{i-1}, 0, x_i, \ldots, x_{n-1}) \]
and
\[ I^n_{(i,1)}(x) = (x_1, \ldots, x_{i-1}, 1, x_i, \ldots, x_{n-1}). \]
We call $I^n_{(i,0)}$ the (i, 0)-face of $I^n$ and $I^n_{(i,1)}$ the (i, 1)-face. Thus, in the case n = 1,
$I^1_{(1,0)}$ = (0, 0) and $I^1_{(1,1)}$ = (0, 1) are simply two points. In the case n = 2 illustrated
in Figs. 1.12 and 1.13, they are given as indicated in the figure below (Fig. 1.14).
Thus, the boundary of $I^2$ is seen to be the formal sum
[Fig. 1.14: the four faces of $I^2$ with their coefficients: $I^2_{(1,0)}$ (−1), $I^2_{(1,1)}$ (+1), $I^2_{(2,0)}$ (+1), $I^2_{(2,1)}$ (−1).]
\[ \partial I^n := \sum_{i=1}^{n} \sum_{a=0,1} (-1)^{i+a} I^n_{(i,a)}. \]
\[ c_{(i,a)} = c \circ (I^n_{(i,a)}) \]
\[ \partial c = \sum_{i=1}^{n} \sum_{a=0,1} (-1)^{i+a} c_{(i,a)}. \]
We then define the boundary of an n-chain $\sum_i n_i c_i$ as
\[ \partial \left( \sum_{i} n_i c_i \right) = \sum_{i} n_i\, \partial c_i. \]
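The sign rule $(-1)^{i+a}$ and the face maps can be tabulated directly. What follows is a small sketch under the assumption that a face is represented as a Python function on tuples; the names `boundary_of_standard_cube` and `face` are hypothetical and only serve the illustration.

```python
def boundary_of_standard_cube(n):
    """Coefficients (-1)**(i + a) of the (i, a)-faces in the boundary of I^n."""
    return {(i, a): (-1) ** (i + a) for i in range(1, n + 1) for a in (0, 1)}


def face(n, i, a):
    """The (i, a)-face of I^n as a map from [0,1]^(n-1) to [0,1]^n.

    It inserts the constant a in position i:
    (x_1, ..., x_{n-1}) -> (x_1, ..., x_{i-1}, a, x_i, ..., x_{n-1}).
    """

    def f(x):
        return tuple(x[: i - 1]) + (a,) + tuple(x[i - 1:])

    return f


coefficients = boundary_of_standard_cube(2)
for (i, a), sign in sorted(coefficients.items()):
    # Show where each face sends the midpoint of [0, 1].
    print((i, a), sign, face(2, i, a)((0.5,)))
```

For n = 2 the four coefficients are −1, +1, +1, −1 for the faces (1, 0), (1, 1), (2, 0), (2, 1), which is the counterclockwise orientation of the boundary of the square.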
Exercises
Alt (S ⊗ T ) = Alt (T ⊗ S) = 0.
\[ (\omega \wedge \eta) \wedge \theta = \omega \wedge (\eta \wedge \theta) = \frac{(r+s+t)!}{r!\,s!\,t!} \operatorname{Alt}(\omega \otimes \eta \otimes \theta). \]
3. If $v_1, v_2, \ldots, v_{n-1} \in \mathbb{R}^n$, define $\phi : \mathbb{R}^n \to \mathbb{R}$ by
\[ \phi(w) = \det \begin{pmatrix} v_1 \\ \vdots \\ v_{n-1} \\ w \end{pmatrix}. \]
\[ d(M\,dx + N\,dy) = \left( \frac{\partial N}{\partial x} - \frac{\partial M}{\partial y} \right) dx\,dy. \]
ω = F1 dy ∧ dz + F2 dz ∧ d x + F3 d x ∧ dy.
Show that dω = (div F)d x d y dz. Deduce the divergence theorem from Theo-
rem 1.34.
η = F1 d x + F2 dy + F3 dz.
Show that
This observation can be used to give a proof of Theorem 1.32 using Theorem 1.34.
\[ \omega = \frac{-y}{x^2 + y^2}\,dx + \frac{x}{x^2 + y^2}\,dy. \]
Show that ω is closed but not exact. [Thus, the converse of the previous exercise
does not hold. This is because the domain is not simply connected. In a simply
connected domain, every closed 1-form is exact.]
References
1. M. Ram Murty, B. Fodden, Hilbert’s Tenth Problem, An introduction to logic, number theory and
computing, Student mathematical library, vol. 88 (American Mathematical Society, Providence,
Rhode Island, 2019)
By the end of the nineteenth century, it was clear that Riemann integration (about
which one learns in a first course in calculus) had to be replaced by a more versatile
method of integration. For instance, the characteristic function of the irrational num-
bers is not Riemann integrable, and yet intuition suggests that over any finite interval
it should integrate to the length of that interval. It was Henri Lebesgue (1875–1941)
who came up with the appropriate theory in his doctoral thesis written in 1902. There
was an urgency in developing such a theory because there were other disciplines in
mathematics that were struggling to find an appropriate framework, most notably,
probability theory.
Metric spaces are examples of a larger universe of objects called topological
spaces. A topological space is a pair (X, O) consisting of a set X and a collection
O of subsets of X (called open sets) such that
(a) Ø and X ∈ O;
(b) U, V ∈ O implies U ∩ V ∈ O;
(c) for any collection of open sets $U_a$, we have $\bigcup_a U_a \in O$.
The complements of open sets are called closed sets. Given a topological space
(X, O), a base U for O is any collection of open sets such that every open set is a
union of a subcollection of U. A neighborhood of a point x ∈ X is any set N (not
necessarily open) such that x ∈ U ⊂ N for some open U . We say that a collection
N of neighborhoods of x is a neighborhood base at x iff for every neighborhood V
of x, we have x ∈ N ⊂ V for some N ∈ N .
Every metric space gives rise to a topology. Indeed, if (X, d) is a metric space,
we define the open ball of radius r and centered at x as
{y ∈ X : d(x, y) < r}.
That is, μ is countably additive. We then speak of the triple (X, A , μ) (or simply
(X, μ)) as a measure space. One can also consider more generally real measures
and complex measures where μ takes values in R or C, respectively. But we will
only be dealing with positive measures here.
If (X, μ) is a measure space, members of A are called measurable sets. If Y
is a topological space and f : X → Y, then f is called measurable if $f^{-1}(U)$ is
measurable for every open U of Y. In Lebesgue’s original theory, Y was the set
of extended non-negative real numbers [0, ∞]. If f : X → R is measurable, we say f is a
real measurable function. If f : X → C is measurable, we say f is a complex
measurable function. Such a function can clearly be decomposed as f = u + iv
with u and v being real measurable functions. One can also show that | f | is a real
measurable function in such an instance.
Theorem 2.1 Let X be a measure space. If f n : X → [−∞, ∞] is a sequence of
measurable functions, and
Thus g is measurable. It is clear that the same result holds with sup replaced by inf.
Thus, as
\[ h = \inf_{k \ge 1} \sup_{i \ge k} f_i \]
Exercises
1. Let X be a measure space and E a measurable set of X . Show that the charac-
teristic function χ E of E defined by χ E (x) = 1 if x ∈ E and zero otherwise is
a measurable function.
2. If f and g are measurable functions with range in [−∞, ∞], show that the func-
tions
max{ f, g}, and min{ f, g}
A1 ⊂ A2 ⊂ A3 ⊂ · · ·
A1 ⊃ A2 ⊃ · · ·
An = {n, n + 1, n + 2, ...}.
If
\[ A = \bigcap_{n=1}^{\infty} A_n, \]
then A is empty and thus has measure zero, whereas μ(An ) = ∞. Does this
contradict the previous exercise?
7. If f is a real-valued function on a measurable space X such that
{x : f (x) ≥ r }
where α1 , ..., αn are the distinct values of s and Ai = {x : s(x) = αi }. It is clear that
s is measurable if and only if each Ai is measurable.
For a simple function s and a measurable set E of A , we define
\[ \int_{E} s \, d\mu := \sum_{i=1}^{n} \alpha_i\, \mu(A_i \cap E). \]
It is not hard to show that for any measurable function f : X → [0, ∞], there are
simple measurable functions s1 , s2 , ... such that
0 ≤ s1 ≤ s2 ≤ · · · ≤ f
and
sn (x) → f (x) as n → ∞
for every x ∈ X .
This allows us to define the Lebesgue integral. If f : X → [0, ∞] is measurable,
and E ∈ A , we define the Lebesgue integral of f over E as
\[ \int_{E} f \, d\mu := \sup \int_{E} s \, d\mu, \]
where the supremum is taken over all simple measurable functions s such that 0 ≤
s ≤ f . If f is simple, the two definitions of the Lebesgue integral agree.
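The integral of a simple function is a finite sum, so it can be computed directly on a toy example. The sketch below assumes a finite set X in which every subset is measurable and the measure is specified by point masses; the function name `integrate_simple` and the sample data are hypothetical.

```python
def integrate_simple(s, mu, E):
    """Integral over E of a simple function s on a finite set.

    s  : dict mapping each point x to the value s(x) >= 0
    mu : dict mapping each point x to its mass mu({x}) >= 0
    E  : set of points, the region of integration
    """
    total = 0.0
    for alpha in set(s.values()):
        # A_alpha = {x : s(x) = alpha}; compute mu(A_alpha intersected with E).
        mass = sum(mu[x] for x in s if s[x] == alpha and x in E)
        if alpha != 0:  # convention: 0 * mu(...) = 0, whatever the mass is
            total += alpha * mass
    return total


# Illustrative data: s takes the value 2 on {a, b} and 5 on {c}.
s = {"a": 2.0, "b": 2.0, "c": 5.0}
mu = {"a": 0.5, "b": 1.5, "c": 1.0}
print(integrate_simple(s, mu, {"a", "b", "c"}))
print(integrate_simple(s, mu, {"a", "c"}))
```

Grouping the points by the value of s, rather than summing point by point, mirrors the definition in terms of the sets $A_i$ and the distinct values $\alpha_i$.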
We now come to the interesting part of the theory. The following theorems encap-
sulate the versatility of Lebesgue’s theory to handle limit operations. Throughout,
(X, μ) is a measure space.
Theorem 2.2 (Lebesgue’s monotone convergence theorem) Let { f n } be a sequence
of measurable functions and suppose that
(a) 0 ≤ f 1 (x) ≤ f 2 (x) ≤ · · · ≤ ∞ for every x ∈ X ;
(b) f n (x) → f (x) as n → ∞ for every x ∈ X .
Then f is measurable and
\[ \int_{X} f_n \, d\mu \to \int_{X} f \, d\mu \]
as n → ∞.
Proof Since the sequence of numbers
\[ \int_{X} f_n \, d\mu \]
\[ \int_{X} f_n \, d\mu \to \alpha \]
\[ \int_{X} f_n \, d\mu \le \int_{X} f \, d\mu \]
To see this last equality, let x ∈ X . If f (x) = 0, then x ∈ E 1 ; if f (x) > 0, then
cs(x) < f (x) because c < 1, and so x ∈ E n for some n. Clearly,
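\[ \int_{X} f_n \, d\mu \ge \int_{E_n} f_n \, d\mu \ge c \int_{E_n} s \, d\mu, \qquad n \in \mathbb{N} \]
\[ \alpha \ge c \int_{X} s \, d\mu. \]
\[ \alpha \ge \int_{X} s \, d\mu, \]
\[ \alpha \ge \int_{X} f \, d\mu. \]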
then
\[ \int_{X} f \, d\mu = \sum_{n=1}^{\infty} \int_{X} f_n \, d\mu. \]
Proof Put
gn = f 1 + f 2 + · · · + f n .
Proof Put
\[ g_n = \inf_{i \ge n} f_i(x), \qquad n = 1, 2, \ldots;\ x \in X. \]
Clearly $g_n \le f_n$, so that
\[ \int_{X} g_n \, d\mu \le \int_{X} f_n \, d\mu, \qquad n = 1, 2, 3, \ldots \]
For (X, μ) a positive measure space, we define L 1 (μ) to be the set of all complex
measurable functions f on X for which
\[ \int_{X} |f| \, d\mu < \infty. \]
The reader will recall that f measurable implies that | f | is measurable and so the
above integral is well-defined. Elements of L 1 (μ) are then called Lebesgue inte-
grable functions. We now come to what may be termed as the most important theorem
in Lebesgue’s theory.
| f n − f | ≤ 2g
and we deduce
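\[ \int_{X} 2g \, d\mu \le \int_{X} 2g \, d\mu + \liminf_{n \to \infty} \left( -\int_{X} |f_n - f| \, d\mu \right) = \int_{X} 2g \, d\mu - \limsup_{n \to \infty} \int_{X} |f_n - f| \, d\mu. \]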
Since
\[ \int_{X} 2g \, d\mu \]
is finite, we can subtract it from both sides of the inequality and deduce
Sets of measure zero play a special role in Lebesgue’s theory. If a certain property
holds apart from a set of measure zero, we say the property holds almost everywhere
and abbreviate it as a.e. More precisely, a property P holds almost everywhere on
a measurable set E of the σ-algebra A if there exists N ∈ A with N ⊂ E such that μ(N) = 0
and P holds on E\N . In fact, we can define an equivalence relation on measurable
functions and say two functions f and g are equivalent if f − g = 0 a.e. In such a
case, we have for every measurable set E of X that
\[ \int_{E} f \, d\mu = \int_{E} g \, d\mu. \]
For 0 < p < ∞, and a measure space X , we define L p (μ) to be the set of mea-
surable functions f on X for which
\[ \|f\|_p := \left( \int_{X} |f|^p \, d\mu \right)^{1/p} < \infty. \]
equivalence classes of functions where two functions f and g are equivalent if they
are equal a.e. However, in common parlance, we drop this subtle distinction and still
continue to refer to elements of L p (μ) as functions with this tacit understanding in
the background.
The space of continuous functions with compact support on X , denoted Cc (X ),
is dense in L p (μ). This fact is often useful in proving many fundamental theorems
concerning L p (μ).
We also introduce the space L ∞ (X ) consisting of (real or complex-valued) func-
tions f for which
\[ \|f\|_\infty := \sup_{x \in X} |f(x)| < \infty. \]
Exercises
\[ \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} a_{ij} = \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} a_{ij}. \]
\[ \int_{X} g \, d\phi = \int_{X} g f \, d\mu \]
for every measurable g on X with range in [0, ∞]. (The converse of this result is
called the Radon–Nikodym theorem: any positive measure φ on X is of the form
(2.1) for some f ∈ L 1 (μ). One refers to f as the Radon–Nikodym derivative
of φ.)
4. Let f : X → [0, ∞] be measurable and suppose that for some measurable set E,
we have
\[ \int_{E} f \, d\mu = 0. \]
Show that f = 0 a.e. on E. [Hint: for each n, show that the set
A vector space H over C is called an inner product space if for every pair of
elements x, y ∈ H , we associate (x, y) ∈ C (called the inner product of x and y)
satisfying the following:
(a) $(x, y) = \overline{(y, x)}$; (the bar denotes complex conjugation)
(b) (x + y, z) = (x, z) + (y, z) for all x, y, z ∈ H ;
(c) (αx, y) = α(x, y) for all x, y ∈ H and α ∈ C;
(d) (x, x) ≥ 0 for all x ∈ H with equality if and only if x = 0.
Property (c) implies that (0, y) = 0 for all y ∈ H. Properties (b) and (c) may be
combined to say that the map x → (x, y) is a linear functional on H. Properties
(a) and (c) show that $(x, \alpha y) = \bar{\alpha}(x, y)$. Finally (a) and (b) imply that (z, x + y) =
(z, x) + (z, y) and (d) allows us to define the norm of x, denoted ||x||, to be the
non-negative square root of (x, x).
Theorem 2.6 (The parallelogram law) In any inner product space H , we have
Proof We have
and
(x − y, x − y) = (x, x) − (x, y) − (y, x) + (y, y).
The above theorem has the following geometric interpretation in Euclidean geom-
etry. In any parallelogram, the sum of the squares of the diagonals is equal to the
sum of the squares of its sides.
Proof Without any loss of generality, we may suppose that (x, y) is real since we
can multiply x by $e^{i\theta}$ for some real θ without altering ||x|| or |(x, y)|. Thus, for r
real,
\[ 0 \le (x - ry, x - ry) = \|x\|^2 - 2r(x, y) + r^2 \|y\|^2. \]
as desired.
seems to have been first stated by Buniakowsky in 1859 and later independently by
Schwarz in 1885. The interested student may find an exposition of the influence of
this inequality in mathematics, statistics and physics in [1].
Thus,
||x − z|| ≤ ||x − y|| + ||y − z||,
\[ (x, y) = \sum_{i=1}^{n} x_i \overline{y_i}. \]
\[ (f, g) = \int_{X} f \bar{g} \, d\mu. \]
How do we know the integral is well-defined? This follows from first applying
the Cauchy–Schwarz inequality to simple functions and then taking limits. Since
L p (μ) is a complete metric space for every 1 ≤ p ≤ ∞, we see that L 2 (μ) is a
Hilbert space.
(c) The vector space of all continuous complex functions on [0, 1] is an inner product
space if
\[ (f, g) = \int_{0}^{1} f(t) \overline{g(t)} \, dt, \]
which shows that the map x → (x, y) is uniformly continuous. The same argument
works for x → (y, x). Finally, the triangle inequality shows that
\[ M^{\perp} = \bigcap_{x \in M} x^{\perp}. \]
for all x ∈ E.
Proof Let δ be the infimum of ||x|| for x ∈ E. For any x, y ∈ E, we apply the
parallelogram law to x/2 and y/2 to get
\[ \frac{1}{4} \|x - y\|^2 = \frac{1}{2} \|x\|^2 + \frac{1}{2} \|y\|^2 - \left\| \frac{x + y}{2} \right\|^2. \]
Since E is convex, (x + y)/2 ∈ E. Hence,
From this we see that if x0 exists, then it must be unique for ||x|| = ||y|| = δ implies
||x − y||2 = 0. The definition of δ implies that there is a sequence of elements yn ∈ E
such that ||yn || → δ. Replacing x, y in (2.2) by yn , ym , we see that the sequence yn is
Cauchy. Since H is complete, there is an x0 ∈ H such that yn → x0 . As E is closed,
we must have x0 ∈ E.
x + M = {x − y : y ∈ M}.
This set is closed and convex. Define z to be the element of smallest norm in x + M,
which exists by the previous theorem. We need to show that z ∈ M ⊥ . Let w ∈ M
and consider (w, z). Without loss of generality, we may suppose that ||w|| = 1. The
minimizing property of z means that
for every α ∈ C. Choosing α = (z, w) gives 0 ≤ −|(z, w)|2 from which we deduce
that z ∈ M ⊥ . Thus, x = y + z with y ∈ M and z ∈ M ⊥ . To see that y and z are the
nearest points to x in M and M ⊥ , respectively, we have by the theorem of Pythagoras,
for u ∈ M,
||x − u||2 = ||y + z − u||2 = ||z||2 + ||y − u||2 ,
which is clearly minimized for y = u. A similar argument shows that z is the nearest
point to x in M ⊥ . The linearity of the maps is clear, and the last assertion follows
from Pythagoras.
Proof Take x ∈ H, x ∉ M. Take z as in the theorem.
We have already seen that the map x → (x, y) is a linear functional for all y ∈ H .
It is a very important result of Riesz that all continuous linear functionals on H are
of this type.
u = L(x)z − L(z)x.
In other words, L(x) = (x, y) with $y = \overline{L(z)}\, z$. The uniqueness is obvious since
(x, y) = (x, y′) for all x means that y − y′ is orthogonal to every x ∈ H. In particular,
it is orthogonal to itself, which means that ||y − y′|| = 0 from which we deduce that
y = y′.
Exercises
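\[ \frac{(a + b)^2}{x + y} \le \frac{a^2}{x} + \frac{b^2}{y}. \]
\[ \frac{(a + b + c)^2}{x + y + z} \le \frac{a^2}{x} + \frac{b^2}{y} + \frac{c^2}{z}. \]
\[ \frac{(a_1 + a_2 + \cdots + a_n)^2}{x_1 + x_2 + \cdots + x_n} \le \frac{a_1^2}{x_1} + \frac{a_2^2}{x_2} + \cdots + \frac{a_n^2}{x_n}. \]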
3. For any set of strictly positive real numbers a1 , ..., an , show that
\[ (a_1 + a_2 + \cdots + a_n) \left( \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} \right) \ge n^2. \]
4. Show that the space C[0, 1] of real-valued continuous functions on [0, 1] with
the usual inner product
\[ (f, g) = \int_{0}^{1} f(t) g(t) \, dt, \]
It may be instructive to begin with an example. Consider the space C[0, 1] and
the function f (x) = x − 1/2. The functions φn (x) = e2πinx are orthonormal with
respect to the usual inner product. We can try to expand f in terms of these functions:
\[ c_n = \int_{0}^{1} f(x) e^{-2\pi i n x} \, dx = -\frac{1}{2\pi i n}, \]
\[ \sum_{n \ne 0} \frac{1}{4\pi^2 n^2} = \int_{0}^{1} (x - 1/2)^2 \, dx = \frac{1}{12}. \]
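The numerical content of this example is easy to test: the squared moduli of the coefficients $c_n = -1/(2\pi i n)$, summed over n ≠ 0, should approach 1/12. The sketch below is a plain partial-sum check; the function name `coefficient_energy` is a hypothetical choice.

```python
import math


def coefficient_energy(N):
    """Sum of |c_n|^2 = 1 / (4 pi^2 n^2) over all n with 0 < |n| <= N."""
    # The terms for n and -n are equal, hence the factor 2.
    return 2 * sum(1.0 / (4 * math.pi ** 2 * n ** 2) for n in range(1, N + 1))


for N in (10, 1000, 100000):
    print(N, coefficient_energy(N), 1 / 12)
```

The partial sums increase towards 1/12 from below, with a gap of roughly $1/(2\pi^2 N)$ after N terms.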
x̂(α) := (x, u α )
and we refer to these as the Fourier coefficients of x relative to the set u α . We expect
in some sense,
\[ x = \sum_{\alpha \in A} \hat{x}(\alpha) u_\alpha. \]
Given x in a Hilbert space H , and F any finite subset of A, we can consider the sum
\[ s_F(x) = \sum_{\alpha \in F} (x, u_\alpha) u_\alpha. \]
which is to say that $s_F$ is the vector nearest to x in the space spanned by F. Indeed,
it is easy to see that $s_F$ is in the space spanned by F and that $x - s_F$ is orthogonal to
every $u_\alpha$ with α ∈ F by direct computation. Thus, $x - s_F$ is orthogonal to $s - s_F$.
Now, $x - s = (x - s_F) + (s_F - s)$ and by Pythagoras, we get
If x ∈ H and
\[ s_F(x) = \sum_{\alpha \in F} \hat{x}(\alpha) u_\alpha, \]
then
\[ \|x - s_F(x)\| < \|x - s\|, \]
Proof Only the first part needs proving. But this is clear from the orthogonality
relations. The last inequality is called Bessel’s inequality.
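The best-approximation property and Bessel’s inequality can be seen concretely in a finite-dimensional case. This is a minimal sketch assuming the standard inner product on $\mathbb{C}^3$ and an orthonormal family of two vectors; the names `inner`, `norm` and `partial_sum`, as well as the sample vectors, are illustrative only.

```python
import math


def inner(x, y):
    """Inner product on C^n, linear in the first slot."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm(x):
    return math.sqrt(inner(x, x).real)


def partial_sum(x, family):
    """Return the Fourier coefficients (x, u) and the finite sum s_F(x)."""
    coeffs = [inner(x, u) for u in family]
    s = [sum(c * u[i] for c, u in zip(coeffs, family)) for i in range(len(x))]
    return coeffs, s


x = [3, 4, 12]
family = [[1, 0, 0], [0, 1, 0]]  # an orthonormal set in C^3
coeffs, s = partial_sum(x, family)

residual = norm([a - b for a, b in zip(x, s)])
# Distance from x to some other vector of the span, here (1, 1, 0).
other = norm([a - b for a, b in zip(x, [1, 1, 0])])
bessel_lhs = sum(abs(c) ** 2 for c in coeffs)

print(residual, other)
print(bessel_lhs, norm(x) ** 2)
```

The residual for $s_F(x)$ is smaller than the distance to the other vector in the span, and the sum of the squared coefficients stays below $\|x\|^2$.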
We want to extend this discussion to infinite F. We want to give meaning to
\[ \sum_{\alpha \in A} \phi(\alpha), \]
where φ(α) are non-negative numbers. To this end, we simply define the sum to be
the supremum of the set of all sums over finite subsets of A. We consider the space
$\ell^2(A)$, the set of functions φ with domain A and satisfying
\[ \sum_{\alpha \in A} |\phi(\alpha)|^2 < \infty. \]
If $\phi \in \ell^2(A)$, then it is easy to see that φ(α) ≠ 0 for at most countably many elements
of A. Indeed, the set $A_n$ of α for which |φ(α)| > 1/n satisfies
\[ \sum_{\alpha \in A_n} |n\phi(\alpha)|^2 \le n^2 \sum_{\alpha \in A} |\phi(\alpha)|^2 < \infty, \]
\[ d_2(f(x), f(x')) = d_1(x, x'), \qquad \forall\, x, x' \in X. \]
holds for every x ∈ H. The map x → x̂ is a continuous linear map from H onto
$\ell^2(A)$ whose restriction to the closure $\bar{P}$ of P is an isometry of $\bar{P}$ onto $\ell^2(A)$.
Proof The inequality is immediate from Theorem 2.13. Define f on H by f(x) = x̂.
Then, the inequality shows that f maps H into $\ell^2(A)$. The linearity of this map is
clear. Applying (2.3) to x − y gives
from which we see the continuity of f. Theorem 2.13 shows that f is an isometry
of P onto the dense subspace of $\ell^2(A)$ consisting of those functions whose support
is a finite set F of A. The theorem now follows from the previous lemma applied
to $X = \bar{P}$, $X_0 = P$, $Y = \ell^2(A)$. Note that $\bar{P}$ being a closed subspace of a complete
metric space is itself complete.
The fact that the map x → x̂ carries H onto $\ell^2(A)$ is known as the Riesz–Fischer
theorem.
We now prove the important theorem concerning orthonormal bases.
Proof We will show that (a) ⇒ (b) ⇒ (c) ⇒ (d) ⇒ (a). If P is not dense, then its
closure is not all of H and we can write
\[ H = \bar{P} \oplus \bar{P}^{\perp}, \]
and a non-zero vector in the complement can be added to our orthonormal set,
contrary to its maximal property. (b) implies (c) follows from Theorem 2.13. To see
that (c) implies (d), we use the “polarization identity”:
which is easily verified (see exercises below) to be valid in any Hilbert space. We
apply this with x, y replaced by x̂, ŷ since $\ell^2(A)$ is also a Hilbert space. The sum in
(c) is $\|\hat{x}\|^2$ and the sum in (d) is (x̂, ŷ). Finally, to see that (d) implies (a), suppose that
(a) is false and our set is not a maximal orthonormal set. Then, we can find a vector
u ≠ 0 which is orthogonal to every $u_\alpha$ and we can add u to this set. If x = y = u,
we have $(x, y) = \|u\|^2 > 0$. On the other hand, all the Fourier coefficients of u are
zero so that (d) does not hold for this vector, a contradiction. This completes the
proof.
Exercises
\[ v_n = x_n - \sum_{j=1}^{n-1} (x_n, u_j) u_j, \]
for n ≥ 2. This is the extension of the Gram–Schmidt process that the student
has undoubtedly seen in a first course in linear algebra.
2. If M = {x : L(x) = 0} where L is a continuous linear functional on H , which is
not identically zero, prove that M ⊥ is a vector space of dimension 1.
3. Let H be a Hilbert space and {u 1 , ..., u N } an orthonormal system of vectors. Let
M be the span of these orthonormal vectors. Given x ∈ H , show that the vector
\[ y = \sum_{i=1}^{N} (x, u_i) u_i \in M \]
4. In the space L 2 [−1, 1], let M be the subspace spanned by the functions 1, x, x 2 .
Find an orthonormal basis for M.
5. Compute
\[ \min_{a, b, c} \int_{-1}^{1} |x^3 - a - bx - cx^2|^2 \, dx. \]
\[ \frac{1}{N} \sum_{n=1}^{N} \omega^{nk} = 1 \]
\[ (x, y) = \frac{1}{N} \sum_{n=1}^{N} \|x + \omega^n y\|^2 \omega^n, \]
that hold in every inner product space for N ≥ 3. Show also that
\[ (x, y) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \|x + e^{i\theta} y\|^2 e^{i\theta} \, d\theta. \]
Let T be the unit circle. We would like to consider the Hilbert space L 2 (T) and show
that functions in this space can be approximated by trigonometric polynomials. A
trigonometric polynomial is a finite sum of the form
\[ a_0 + \sum_{n=1}^{N} (a_n \cos nt + b_n \sin nt). \]
This is the form we will consider below. Observe that the set of functions
\[ u_n(t) = e^{int}, \]
forms an orthonormal family with respect to the usual inner product on L 2 (T) given
by
\[ (f, g) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \overline{g(t)} \, dt. \]
To show that this is a maximal orthonormal family, it suffices to show that the
trigonometric polynomials are dense in L 2 (T) by the theorem of the previous section.
Since C(T) is dense in L 2 (T), it suffices to show that every continuous function can
be approximated by a trigonometric polynomial. In fact, we will prove the following
stronger theorem.
Theorem 2.16 Fix ε > 0 and f ∈ C(T). There exists a trigonometric polynomial
P such that
\[ |f(t) - P(t)| \le \varepsilon, \]
Lemma 2.2 There exist trigonometric polynomials Q 1 , Q 2 , Q 3 , ... with the follow-
ing properties:
(a) Q k (t) ≥ 0, for all t ∈ R;
(b)
\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} Q_k(t) \, dt = 1. \]
Proof Put
\[ Q_k(t) = c_k \left( \frac{1 + \cos t}{2} \right)^{k}, \]
where ck is chosen so that (b) holds. This is the family of trigonometric polynomials
we will use. Clearly (a) also holds. We need to show (c). To this end, we first note
that since Q k (t) is even,
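\[ 1 = \frac{c_k}{\pi} \int_{0}^{\pi} \left( \frac{1 + \cos t}{2} \right)^{k} dt > \frac{c_k}{\pi} \int_{0}^{\pi} \left( \frac{1 + \cos t}{2} \right)^{k} \sin t \, dt = \frac{2 c_k}{\pi (k + 1)}. \]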
for 0 < δ ≤ |t| ≤ π. Since 1 + cos δ < 2, the result is now immediate.
Thus,
\[ P_k(t) - f(t) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \{ f(t - s) - f(t) \} Q_k(s) \, ds. \]
where A1 is the integral over the interval [−δ, δ], and A2 is over the complementary
set. In A2 , we use the estimate on ηk (δ) to see that
Does there exist an $f \in L^2(\mathbb{T})$ whose sequence of Fourier coefficients coincides with
the $c_n$? The answer is provided by the Riesz–Fischer theorem. Recall that the map
x → x̂ is an isometry from H to $\ell^2(A)$ in the context of Theorem 2.15. Since this
map is onto, there is an $f \in L^2(\mathbb{T})$ such that $\hat{f}(n) = c_n$ for all n.
Let us look at the following example. Consider the function f(t) = t for −π ≤
t < π and extended by periodicity on the real line. Its Fourier coefficients are easily
computed:
\[ \int_{-\pi}^{\pi} x e^{-inx} \, dx = \frac{2\pi i (-1)^n}{n}, \]
for n ≠ 0 and for n = 0, the Fourier coefficient is zero. Thus, by Parseval’s formula,
we immediately deduce that
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \]
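The coefficients in this example can be approximated by numerical integration and compared with the closed form $\hat{f}(n) = i(-1)^n/n$, which is the integral above divided by 2π. The sketch below assumes a simple midpoint rule; the function names and the number of sample points are arbitrary illustrative choices.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=20000):
    """Midpoint-rule estimate of (1/2pi) * integral_{-pi}^{pi} f(t) e^{-int} dt."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        t = -math.pi + (k + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total * h / (2 * math.pi)


def basel_partial_sum(N):
    """Partial sum 1 + 1/4 + ... + 1/N^2 of the series in Parseval's formula."""
    return sum(1.0 / n ** 2 for n in range(1, N + 1))


for n in (1, 2, 3):
    print(n, fourier_coefficient(lambda t: t, n), 1j * (-1) ** n / n)

print(basel_partial_sum(100000), math.pi ** 2 / 6)
```

Since $|\hat{f}(n)|^2 = 1/n^2$ and the mean square of f over one period is $\pi^2/3$, Parseval’s formula gives $2\sum 1/n^2 = \pi^2/3$, which is what the last line tests numerically.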
Exercises
\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(n\alpha) = \int_{0}^{1} f(t) \, dt \]
for every irrational real number α. [Hint: do it first for $f(t) = e^{2\pi i k t}$, with k =
0, ±1, ±2, ...]
2. Is the series
\[ \sum_{n=1}^{\infty} \frac{\cos nx}{\sqrt{n}}, \qquad x \in \mathbb{T} \]
\[ \sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90}. \]
for p ≥ 1. We have to check that this satisfies the axioms for our norm. The first
two are clear. The third follows from Minkowski’s inequality. For p = 1, this is
clear. So let us assume 1 < p < ∞ and proceed to show that Minkowski’s inequality
follows from the Cauchy–Schwarz inequality.
To prove this, we begin with an elementary remark. Suppose that φ(t) is a twice
differentiable function on [0, 1] such that φ″(t) ≥ 0. Thus, φ′(t) is increasing, and
φ(t) is concave up in this interval. Thus,
\[ \frac{\phi(0) + \phi(1)}{2} \ge \phi(1/2). \]
We will make use of this observation below. For any set $a_1, \ldots, a_n, b_1, \ldots, b_n$ of
non-negative real numbers, we have
\[ \left( \sum_{j=1}^{n} (a_j + b_j)^p \right)^{1/p} \le \left( \sum_{j=1}^{n} a_j^p \right)^{1/p} + \left( \sum_{j=1}^{n} b_j^p \right)^{1/p}, \]
from which the inequality for $L^p$-spaces follows by looking at finite sums and taking
limits. To prove the inequality, let $A_j = t a_j + (1 - t) b_j$ and set
\[ \phi(t) = \left( \sum_{j=1}^{n} A_j^p \right)^{1/p}. \]
\[ (p - 1) \left( \sum_{j=1}^{n} A_j^p \right)^{1/p - 2} \left\{ \left( \sum_{j=1}^{n} A_j^p \right) \left( \sum_{j=1}^{n} A_j^{p-2} (a_j - b_j)^2 \right) - \left( \sum_{j=1}^{n} A_j^{p-1} (a_j - b_j) \right)^{2} \right\}. \]
for non-negative a j , b j .
By the usual limiting process for integrals, we deduce
\[ \left( \int_{X} |f + g|^p \, d\mu \right)^{1/p} \le \left( \int_{X} (|f| + |g|)^p \, d\mu \right)^{1/p} \le \|f\|_p + \|g\|_p, \]
showing that L p (μ) is a normed linear space. Technically speaking, as stated ear-
lier, L p (μ) does not consist of functions but rather equivalence classes of functions
equal almost everywhere. One can show that this space is complete (Riesz–Fischer
theorem). Thus, L p (μ) is a Banach space.
In this context, it is useful to derive another fundamental inequality called
Hölder’s inequality. We begin with an elementary observation. Let p, q be
positive real numbers such that $\frac{1}{p} + \frac{1}{q} = 1$. Then, for a, b ≥ 0
\[ ab \le \frac{a^p}{p} + \frac{b^q}{q}. \tag{2.4} \]
To see this, let us first note that the inequality is trivially true if a or b equals zero. So,
let us assume a, b > 0. Replacing a and b by x 1/ p and y 1/q , respectively, we need to
prove that
\[ x^{1/p} y^{1/q} \le \frac{x}{p} + \frac{y}{q}. \]
But this is now immediate from the concavity of the logarithmic function. We can
now prove:
\[ a = \frac{|a_k|}{\left( \sum_{k=1}^{n} |a_k|^p \right)^{1/p}}, \qquad b = \frac{|b_k|}{\left( \sum_{k=1}^{n} |b_k|^q \right)^{1/q}} \]
provided all of the integrals exist. An important consequence of this is the following.
Theorem 2.18 Let (X, μ) be a measure space and let 1 ≤ p ≤ ∞ and q be such
that
\[ \frac{1}{p} + \frac{1}{q} = 1. \]
|| f g||1 ≤ || f || p ||g||q .
almost everywhere. Integrating this over X gives the desired inequality in this case
also.
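Hölder’s inequality can be spot-checked on finite sequences, which corresponds to the counting measure on a finite set. The following is only a numerical sanity check under that assumption, not a proof; the helper names `p_norm` and `holder_gap` and the random test data are illustrative.

```python
import random


def p_norm(xs, p):
    """The p-norm of a finite sequence."""
    return sum(abs(x) ** p for x in xs) ** (1.0 / p)


def holder_gap(a, b, p):
    """Return ||a||_p * ||b||_q - sum |a_k b_k| for conjugate exponents p, q."""
    q = p / (p - 1.0)  # so that 1/p + 1/q = 1
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    return p_norm(a, p) * p_norm(b, q) - lhs


random.seed(0)
a = [random.uniform(-1, 1) for _ in range(50)]
b = [random.uniform(-1, 1) for _ in range(50)]
gaps = [holder_gap(a, b, p) for p in (1.5, 2.0, 3.0, 10.0)]
print(gaps)

# For p = q = 2 and a = b the inequality becomes an equality.
print(holder_gap([1, 2, 3], [1, 2, 3], 2.0))
```

Each gap should be non-negative, and the case p = 2 with proportional sequences shows that the inequality cannot be improved in general.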
for all x ∈ X . (Note that the norm on the left hand side of the inequality is in Y and
the norm on the right hand side is in X . We will continue this convention since it will
be clear from the context, which norm is being used.) It is easy to see that the set of
all bounded linear transformations is a vector space. We define a norm on this space
by setting
||T || = sup{||T (x)|| : x ∈ X, ||x|| ≤ 1}.
It turns out that the space of all bounded linear transformations from a normed vector
space X to a Banach space Y is itself a Banach space.
In the special case that Y is the field of complex numbers, we call such linear
transformations, functionals. If X is a Banach space, the space X ∗ consisting of
all bounded linear functionals is again a Banach space, called the dual space of X .
Since X ∗ is again a Banach space, one can take its dual X ∗∗ , and one can show that
X can be identified with an isometric subspace of X ∗∗ . It is not in general true that
X is isomorphic to X∗∗. However, for some important spaces, like $L^p$-spaces with 1 < p < ∞, it is
true that X ≅ X∗∗. We refer the reader to [2] for further details.
The space L 1 (μ) is of some interest in the theory of Fourier series. Recall that
we proved that any f ∈ L 2 (T) has a Fourier series which converges to f in the L 2 -
norm. A natural question to ask is if for any f ∈ C(T), the Fourier series converges
to f . This question has stimulated extensive research in the subject. In 1876, du
Bois Reymond showed that there exists a continuous function whose Fourier series
diverges at a point. He asked if the Fourier series converges almost everywhere. This
was finally settled by Carleson in 1966, who showed that for any L 2 -function, its
Fourier series converges almost everywhere. We will use the theory of Banach spaces
to show that there exist continuous functions whose Fourier series diverge on a dense
subset of T.
In this context, let us consider the problem of convergence of Fourier series. The
question is if the partial sums,
\[ s_n(f, x) = \sum_{j=-n}^{n} \hat{f}(j) e^{ijx} \]
\[ D_n(t) = \sum_{j=-n}^{n} e^{ijt}. \]
Noting that
\[ e^{it/2} D_n(t) - e^{-it/2} D_n(t) = 2i \sin(n + 1/2)t, \]
we find
\[ D_n(t) = \frac{\sin(n + 1/2)t}{\sin (t/2)}. \]
Let us show that the L 1 -norm of Dn tends to infinity as n tends to infinity. Indeed,
using the inequality | sin t| ≤ |t|,
\[ \|D_n\|_1 > \frac{2}{\pi} \int_{0}^{\pi} |\sin(n + 1/2)t| \, \frac{dt}{t} = \frac{2}{\pi} \int_{0}^{(n + 1/2)\pi} |\sin t| \, \frac{dt}{t}. \]
\[ \frac{2}{\pi} \sum_{k=1}^{n} \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin t| \, dt = \frac{4}{\pi^2} \sum_{k=1}^{n} \frac{1}{k} \to \infty. \]
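The growth of $\|D_n\|_1$ can be observed numerically and compared with the harmonic-sum lower bound $\frac{4}{\pi^2}\sum_{k \le n} 1/k$. The sketch below assumes the normalization $\|D_n\|_1 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |D_n(t)|\,dt$ and uses a midpoint rule; the function names and sample count are hypothetical choices.

```python
import math


def dirichlet(n, t):
    """D_n(t) = sin((n + 1/2) t) / sin(t / 2), with the value 2n + 1 at t = 0."""
    s = math.sin(t / 2)
    if abs(s) < 1e-12:
        return 2 * n + 1
    return math.sin((n + 0.5) * t) / s


def l1_norm(n, samples=20000):
    """Midpoint-rule estimate of (1/2pi) * integral_{-pi}^{pi} |D_n(t)| dt."""
    h = 2 * math.pi / samples
    total = sum(
        abs(dirichlet(n, -math.pi + (k + 0.5) * h)) for k in range(samples)
    )
    return total * h / (2 * math.pi)


def lower_bound(n):
    """The bound (4 / pi^2) * (1 + 1/2 + ... + 1/n)."""
    return 4 / math.pi ** 2 * sum(1.0 / k for k in range(1, n + 1))


for n in (1, 5, 20, 80):
    print(n, l1_norm(n), lower_bound(n))
```

Both columns grow like a constant times log n, which is the slow but unbounded growth that the argument relies on.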
The fact that L ∞ (μ) is a normed linear space follows from the usual triangle inequal-
ity.
We will prove in the next section:
Remark 2.1 One can show that this dense subset is of type G δ , that is, a countable
intersection of open sets.
Before we prove this theorem, we want to consider its application to the theory of
Fourier series alluded to above. Consider the linear functional $T_n : C(\mathbb{T}) \to \mathbb{C}$ given by $T_n(f) = s_n(f, 0)$.
Clearly, each Tn is bounded since
Now fix n. Put g(t) = 1 if Dn (t) ≥ 0 and −1 if Dn (t) < 0. There exist continuous
functions f j ∈ C(T) such that −1 ≤ f j ≤ 1 and f j (t) → g(t) for every t, as j →
∞. By the dominated convergence theorem,
\[ \lim_{j \to \infty} T_n(f_j) = \lim_{j \to \infty} \frac{1}{2\pi} \int_{-\pi}^{\pi} f_j(-t) D_n(t) \, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(-t) D_n(t) \, dt = \|D_n\|_1. \]
Thus, the norms $\|T_n\|$ are unbounded as n → ∞, and so by the Banach–Steinhaus theorem, there
exist functions in C(T) whose Fourier series diverges on a dense subset of T.
Exercises
3. Let $\ell^\infty$ be the space of all bounded sequences. Show that for $x = \{x_n\}$,
\[ \|x\|_\infty = \sup_{n} |x_n| \]
defines a norm on $\ell^\infty$.
4. Let c be the subspace of $\ell^\infty$ consisting of convergent sequences and $c_0$ the subspace
of null sequences. Show that c and $c_0$ are closed Banach subspaces of $\ell^\infty$.
The proof of the Banach–Steinhaus theorem will rely on the following theorem due
to Baire.
Theorem 2.20 (Baire) If X is a complete metric space, then the intersection of a
countable collection of dense open subsets is dense in X .
Proof Let V1 , V2 , ... be dense and open in X . Let W be any open set of X . We have
to show that $\bigcap V_n$ has a point in W for W ≠ Ø. Let B(x, r) denote the open ball of
radius r centered at x ∈ X and let $\bar{B}(x, r)$ be its closure. Since $V_1$ is dense, $W \cap V_1$
is non-empty and so we can find x1 , r1 such that
We now proceed inductively. If n ≥ 2, we can choose xn−1 and rn−1 such that Vn ∩
B(xn−1 , rn−1 ) is non-empty and so we can find xn and rn so that
Proof of Theorem 2.19 Put φ(x) = supα∈A ||Tα (x)|| and let Vn be the set of x ∈ X
for which φ(x) > n. Since each Tα is continuous and the norm is a continuous map,
each function x → ||Tα (x)|| is continuous on X . It is easy to see that each Vn is
open. If all of these Vn ’s are dense, then by Baire’s theorem, ∩Vn is dense in X , and
the second half of the theorem is established. Therefore, let us suppose that VN fails
to be dense. Then, there is an x0 and r > 0 such that B(x0 , r ) ∩ VN = ∅. In other
words,
φ(x) = sup ||Tα (x)|| ≤ N
α∈A
for all x ∈ X satisfying ||x − x0 || ≤ r (the fact that we can take the closed disk
follows by continuity). Putting y = x − x0 , we can rephrase this as φ(x0 + y) ≤ N
for all ||y|| < r . In particular, with y = 0, we have φ(x0 ) ≤ N . Now,
Tα (y) = Tα (y + x0 ) − Tα (x0 ),
Thus, if ||y|| < r , we have ||Tα (y)|| ≤ 2N . Changing y to r y shows that for ||y|| ≤ 1,
we have ||Tα (y)|| ≤ 2N /r for all α. This completes the proof.
Corollary 2.7.1 Let X be a Banach space and Y a normed linear space. Suppose
that we have a sequence of operators Tn : X → Y such that for each x, the sequence
Tn (x) is bounded. Then, it follows that all these sequences are uniformly bounded.
Although we do not develop the theory here, it is clear that many of the notions
of calculus generalize to the setting of normed linear spaces. For instance, the notion
of derivative can be defined for any map f : X → Y of normed linear spaces. Fix
a ∈ X . Then, we say f is differentiable at a if there is a linear transformation
D f,a : X → Y such that
One can show that this linear transformation, if it exists, is unique, and thus, we can
speak of D f,a as the Fréchet derivative of f at a. It is then possible to derive a
general theory of calculus in a Banach space. We refer the interested reader to [2].
Exercises
1. Show that the norm of f in a Banach space can be evaluated on the boundary of
the unit ball.
2. Show that the derivative in a Banach space, when it exists, is unique.
3. Let X be a complete metric space. A subset E of X is called nowhere dense if its
closure contains no non-empty open subset of X . Any countable union of nowhere
dense sets is called a set of the first category. All other subsets are said to be of
second category. Show that no complete metric space is of the first category.
and
||F|| = sup{|F(x)| : x ∈ X, ||x|| ≤ 1}.
The third comment concerns the field of scalars. So far, we have been concerned
about spaces over C but all of the theory is valid over R. In fact, the Hahn–Banach
theorem was originally proved for real normed spaces.
A complex function φ on a complex vector space V is a complex linear functional
if
φ(x + y) = φ(x) + φ(y), φ(αx) = αφ(x), (2.5)
from which (2.6) follows. To prove the second part, we have to check that if f is
defined as in (2.6), then it is complex linear. But this follows easily by noting that
Finally, since |u(x)| ≤ | f (x)| we have ||u|| ≤ || f ||. On the other hand, to every
x ∈ V , there is a complex number α, |α| = 1 so that α f (x) = | f (x)|. Then,
Replacing x by λx and dividing both sides of the above by |λ|, the requirement is
There exists such an α if all the intervals [A x , Bx ] have a common point. That is, if
and only if
Ax ≤ By
Proof Let M = {λx0 } and define f (λx0 ) = λ||x0 ||. Then, f is a linear functional
of norm 1 on M. By the Hahn–Banach theorem, f can be extended to X with the
norm preserved.
Indeed, the right hand side is certainly a lower bound for ||x||. However, by the
Hahn–Banach theorem, there is a continuous linear functional f of norm 1 with
f (x) = ||x|| for a given x. Thus, the right hand side is also an upper bound for ||x||.
Hence, for fixed x ∈ X , the map f → f (x) is a bounded linear functional on X ∗ .
This gives us an injection
X → X ∗∗ .
The study of the interplay between X and X ∗ forms a large part of what is called
functional analysis.
Proof Since X ∗ is already a normed linear space, we have to show that it is complete.
Let xn∗ be a Cauchy sequence in X ∗ . Then, ||xn∗ − xm∗ || tends to zero as n, m both
tend to infinity. Thus, for any fixed x ∈ X , the sequence xn∗ (x) is a Cauchy sequence
of scalars since
|xn∗ (x) − xm∗ (x)| ≤ ||xn∗ − xm∗ || · ||x||.
Thus, for each x ∈ X , there is a scalar x ∗ (x) such that xn∗ (x) → x ∗ (x). The functional
x ∗ defined on all of X in this way is clearly linear since
x ∗ (ax + by) = lim xn∗ (ax + by) = lim axn∗ (x) + bxn∗ (y)
Now since $x_n^*$ is a Cauchy sequence, given $\varepsilon > 0$, there is an $M$ such that
\[
|x^*(x)| = |x^*(x) - x_m^*(x) + x_m^*(x)| \le |x^*(x) - x_m^*(x)| + |x_m^*(x)| \le (\varepsilon + \|x_m^*\|)\,\|x\|,
\]
Exercises
Let us first consider the dual space of $\mathbb{R}^n$. Recall that the norm of the vector $x = (x_1, ..., x_n)$ is given by
\[ \|x\| = \Big(\sum_i x_i^2\Big)^{1/2}. \]
Let $e_1, ..., e_n$ be the standard basis vectors. Then, any linear map $f$ is determined by
the values $f(e_i)$. Thus, letting $y_i = f(e_i)$, we have
\[ f(x) = \sum_i x_i y_i, \]
since $x = \sum_i x_i e_i$. By Cauchy–Schwarz, we have
\[ |f(x)| = \Big|\sum_i x_i y_i\Big| \le \Big(\sum_i y_i^2\Big)^{1/2}\|x\|. \]
However, choosing $x = (y_1, ..., y_n)$, we see that this bound is achieved. Hence, every
bounded linear functional corresponds to a vector $(y_1, ..., y_n)$ and
\[ f(x) = \sum_i x_i y_i \]
is the usual inner product. Thus, the dual space of $\mathbb{R}^n$ is again $\mathbb{R}^n$ as Banach spaces.
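The claim that the norm of $f(x) = \sum_i x_i y_i$ equals $\|y\|$ can be spot-checked numerically. The following is only a sketch: the vector `y`, the helper names and the random sampling are illustrative assumptions, not part of the text. It confirms that the ratio $|f(x)|/\|x\|$ never exceeds $\|y\|$ on sampled vectors and equals $\|y\|$ at $x = y$.

```python
import math
import random

def functional(y, x):
    """Evaluate f(x) = sum_i x_i * y_i for the vector y representing f."""
    return sum(xi * yi for xi, yi in zip(x, y))

def euclid_norm(v):
    """Euclidean norm (sum_i v_i^2)^(1/2)."""
    return math.sqrt(sum(vi * vi for vi in v))

def sampled_ratio_sup(y, trials=2000, seed=0):
    """Largest |f(x)| / ||x|| over random nonzero x; a lower bound for ||f||."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in y]
        nx = euclid_norm(x)
        if nx > 0.0:
            best = max(best, abs(functional(y, x)) / nx)
    return best

y = [3.0, -4.0, 12.0]  # hypothetical example vector with ||y|| = 13
ratio_at_y = abs(functional(y, y)) / euclid_norm(y)  # the choice x = y from the text
```

Random sampling only gives a lower bound for the supremum; it is the choice $x = y$ that shows the Cauchy–Schwarz bound is attained.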
This example should not surprise us in view of our earlier characterization of
linear functionals on a Hilbert space H . They are all of the form x → (x, y). What
is the norm of this functional? By Cauchy–Schwarz, we have
\[ |(x, y)| \le \|x\|\,\|y\|, \]
so that the norm is bounded by $\|y\|$. However, choosing $x = y$, this bound is attained,
and hence the norm of this functional is ||y||.
We will now determine the dual of the space $\ell^p$, consisting of all sequences $(x_n)$
such that
\[ \sum_n |x_n|^p < \infty. \]
for any p, q > 1 satisfying 1/ p + 1/q = 1. The special case p = q = 2 is the famil-
iar Cauchy–Schwarz inequality.
Every bounded linear functional on $\ell^p$ is of the form
\[ f(x) = \sum_n x_n y_n, \]
We want to show that the sequence $(y_i)$ lives in $\ell^q$. To this end, define for each natural
number $N$, the sequence $x_N$ having for its $i$-th component $|y_i|^{q/p}\,\mathrm{sgn}\, y_i$ for $i \le N$
and zero otherwise. Then,
\[ \|x_N\| = \Big(\sum_{i=1}^{N} |y_i|^q\Big)^{1/p}, \]
and
\[ f(x_N) = \sum_{i=1}^{N} |y_i|^{q/p+1} = \sum_{i=1}^{N} |y_i|^q. \]
\[ |y_N| = f(u_N) \le \|f\|\,\|u_N\| \le \|f\|. \]
Thus, the sequence of $y$'s is bounded. Conversely, given such an element $y$, we can
clearly define an element of $(\ell^1)^*$ by setting
\[ f(x) = \sum_i x_i y_i. \]
We caution the reader that the dual space of $\ell^\infty$ is NOT $\ell^1$ (see Exercise 4 in the
previous section and the exercises below).
The above discussion can be extended to determine the dual space of L p (X ) for
any measure space X . It turns out that it is L q (X ) for 1 ≤ p < ∞. The proof is
similar to what we have said above and makes use of the Radon-Nikodym theorem.
The dual space of a Hilbert space is itself as we have already seen.
The notion of orthogonality can be introduced in normed spaces through the dual
space. The vectors $x \in X$ and $x^* \in X^*$ are said to be orthogonal if $\langle x, x^* \rangle = 0$. If
S is a subset of a normed linear space X , the orthogonal complement of S, denoted
S ⊥ , consists of all elements x ∗ ∈ X ∗ orthogonal to every vector of S. Recall that if
in a Hilbert space H , M is a closed subspace, we have the decomposition
H = M ⊕ M ⊥.
x = y + z,
\[ d = \inf_{y \in M} \|x - y\| = \max_{z \in M^{\perp}} \frac{\langle x, z \rangle}{\|z\|}, \]
so that the right hand side of the equation in the theorem is bounded by $d + \varepsilon$. Since
$\varepsilon$ was arbitrary, the right hand side is less than or equal to $d$. We now have to find a $z$
so that $\langle x, z \rangle = d\|z\|$. Let $N$ be the subspace spanned by $x$ and $M$. Elements of $N$
can be written uniquely as u = ax + m with a ∈ R. Define a linear functional on N
by setting f (u) = ad. The norm of this functional is given by
\[
\sup\{|f(u)|/\|u\| : u \in N\} = \sup \frac{|a|d}{|a|\,\|x + m/a\|} = \frac{d}{\inf \|x + m/a\|} = 1.
\]
Exercises
References
1. M. Ram Murty, The Cauchy-Schwarz inequality in mathematics, physics, and statistics. Math.
Student 88, 17–25 (2019)
2. W. Rudin, Functional Analysis (McGraw Hill, New York, 1973)
Chapter 3
Fourier Transforms
Jean Baptiste Joseph Fourier (1768–1830), the son of a tailor, was orphaned at the
age of eight and brought up and educated by the clergy. In 1790, at the age of 22, he
was appointed as a professor at the École Polytechnique and in 1798, Napoleon took
him on his campaign to Egypt. His research on the heat equation began in 1800, and
by 1811, he had developed the theory now called the theory of Fourier series as the
essential tool for solving it. Fourier is best known today for his 1822 book, “Théorie
Analytique de la Chaleur” which was described by Kelvin as a “great mathematical
poem.” The theory of Fourier series and Fourier transforms that are in use today in
mathematics, physics and engineering can be traced back to this remarkable work.
In the subsequent years, Fourier’s theory has been extended to the context of the
Lebesgue integral. The essential theorems are easily proved using Fubini’s theorem
discovered by Guido Fubini (1879–1943). We begin with a discussion of this theorem
in the context of measure theory.
Let (X, μ) and (Y, ν) be two measure spaces. We can define a measure on X × Y
as follows. Let A ⊆ X be measurable and B ⊆ Y be measurable. Then we call A × B
a measurable rectangle and define a measure λ on X × Y by setting
λ(A × B) = μ(A)ν(B).
One can check that λ extends uniquely to a complete measure on the σ-algebra
generated by these rectangles. Recall that (X, μ) is said to be a complete measure
space if its σ-algebra contains all the subsets of sets of measure zero. We denote λ as μ × ν.
Now if f is an integrable function on X × Y , then the functions f (x, ·) on Y and
f (·, y) on X are both integrable for almost all x and y. In this situation, we have the
following theorem of Fubini.
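(i) $0 \le f \le \infty$ or
(ii) $\int_X \int_Y |f|\,d\nu\,d\mu < \infty$
then
\[
\int_{X \times Y} f\,d\lambda = \int_X \Big(\int_Y f\,d\nu\Big)\,d\mu = \int_Y \Big(\int_X f\,d\mu\Big)\,d\nu.
\]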
In the case (ii) holds, we have that f ∈ L 1 (λ) and that f (x, ·) ∈ L 1 (ν) for almost
every x ∈ X and f (·, y) ∈ L 1 (μ) for almost every y ∈ Y .
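That some hypothesis such as (i) or (ii) is needed can be seen on the standard example $f(x,y) = (x^2 - y^2)/(x^2 + y^2)^2$ on the unit square, where the two iterated integrals differ. The sketch below is an illustration under stated assumptions: the inner integral over $y$ is taken from the antiderivative $y/(x^2+y^2)$, which gives $1/(1+x^2)$, and the midpoint rule with the chosen step counts is an arbitrary choice.

```python
import math

def f(x, y):
    """The integrand (x^2 - y^2) / (x^2 + y^2)^2 on the open unit square."""
    return (x * x - y * y) / (x * x + y * y) ** 2

def midpoint(g, a, b, n):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def inner_dy_closed(x):
    """Integral of f(x, y) over y in (0, 1), from the antiderivative y / (x^2 + y^2)."""
    return 1.0 / (1.0 + x * x)

# Sanity check of the closed form at one point (x0 is an arbitrary choice).
x0 = 0.5
inner_numeric = midpoint(lambda y: f(x0, y), 0.0, 1.0, 20000)

# Integrating in y first, then x.
dy_then_dx = midpoint(inner_dy_closed, 0.0, 1.0, 20000)
# Since f(x, y) = -f(y, x), integrating in x first gives -1/(1 + y^2).
dx_then_dy = midpoint(lambda y: -inner_dy_closed(y), 0.0, 1.0, 20000)
```

The two values are approximately $\pi/4$ and $-\pi/4$, so $|f|$ cannot be integrable on the square: condition (ii) fails, and (i) fails because $f$ changes sign.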
Throughout (unless stated otherwise)
\[ d\mu = \frac{dx}{\sqrt{2\pi}} \]
It is easy to see that the translation invariance of the Lebesgue measure as well as
Fubini’s theorem now implies that the convolution is well-defined. More precisely,
we have
\[ \|f * g\|_1 \le \|f\|_1\,\|g\|_1. \]
and as the Lebesgue measure is translation invariant, the inner integral is $\|f\|_1$ which
is a constant and as
\[
\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |F(x, y)|\,dx\,dy = \|f\|_1\,\|g\|_1,
\]
Exercises
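1. Let
\[
f(x, y) = \begin{cases} \dfrac{x^2 - y^2}{(x^2 + y^2)^2} & (x, y) \in (0, 1) \times (0, 1) \\ 0 & \text{otherwise} \end{cases}
\]
Show that
\[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dx\,dy = -\frac{\pi}{4}, \]
and
\[ \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\,dy\,dx = \frac{\pi}{4}. \]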
2. Prove that
\[ \int_0^{\infty} \Big(\frac{2y}{1 + y^2} - \frac{2y}{x^2 + y^2}\Big)\,dy = 2\log x. \]
Deduce that
\[ -\frac{\log x}{1 - x^2} = \int_0^{\infty} \frac{y\,dy}{(1 + y^2)(x^2 + y^2)}. \]
3. Show that
\[ I := \int_0^1 \frac{\log x}{1 - x^2}\,dx = \int_1^{\infty} \frac{\log x}{1 - x^2}\,dx. \]
Deduce that
\[ -2I = \int_0^{\infty}\int_0^{\infty} \frac{y\,dx\,dy}{(1 + y^2)(x^2 + y^2)}. \]
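4. Applying Fubini's theorem in the last exercise, use the previous exercises to
deduce that
\[ \sum_{n=0}^{\infty} \frac{1}{(2n + 1)^2} = \frac{\pi^2}{8}. \]
\[ \sum_{\substack{m=1 \\ m \ne n}}^{\infty} \frac{1}{m^2 - n^2} = \frac{3}{4n^2}. \]
Show that
\[ \Big(k + \frac{1}{2}\Big)\zeta(2k) = \sum_{j=1}^{k-1} \zeta(2j)\zeta(2k - 2j). \]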
This shows that ζ(2k) can be recursively determined from ζ(2).1 [Hint: write
the summand as a double sum, interchange summations, and use the previous
exercise.]
where $i = \sqrt{-1}$. Here are some basic properties of the Fourier transform.
Theorem 3.3 Suppose f ∈ L 1 (R) and α, λ ∈ R. Then
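(1) If $g(x) = f(x)e^{i\alpha x}$, then $\hat g(t) = \hat f(t - \alpha)$.
(2) If $g(x) = f(x - \alpha)$, then $\hat g(t) = \hat f(t)e^{-i\alpha t}$.
(3) If $g \in L^1(\mathbb{R})$ and $h = f * g$, then $\hat h = \hat f \hat g$.
(4) If $g(x) = \overline{f(-x)}$, then $\hat g(t) = \overline{\hat f(t)}$.
(5) If $g(x) = f(x/\lambda)$ where $\lambda > 0$, then $\hat g(t) = \lambda \hat f(\lambda t)$.
(6) If $g(x) = -ix f(x)$ and $g \in L^1(\mathbb{R})$, then $\hat f$ is differentiable with derivative
$\hat f'(t) = \hat g(t)$.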
Proof Property (1) is easy to see:
\[
\hat g(t) = \int_{-\infty}^{\infty} f(x)e^{i\alpha x}e^{-ixt}\,d\mu(x) = \int_{-\infty}^{\infty} f(x)e^{-i(t - \alpha)x}\,d\mu(x) = \hat f(t - \alpha).
\]
1 These exercises are taken from the author’s short note [1] in Math. Student.
Putting u = x − α, we have
\[ \hat g(t) = \int_{-\infty}^{\infty} f(u)e^{-i(\alpha + u)t}\,d\mu(u), \]
= fˆ(t)ĝ(t)
= λ fˆ(λt).
where
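\[ \phi(x, u) = \frac{e^{-ixu} - 1}{u}. \]
Now
\[
\phi(x, u)e^{ixu/2} = \frac{e^{ixu/2}(e^{-ixu} - 1)}{u} = \frac{e^{-ixu/2} - e^{ixu/2}}{u} = -\frac{2i\sin(xu/2)}{u}
\]
so that for $u \ne 0$, we have
\[
|\phi(x, u)| = \frac{|2\sin(xu/2)|}{|u|} = \Big|\frac{\sin(xu/2)}{xu/2}\Big| \cdot |x| \le |x|
\]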
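\[
\begin{aligned}
\hat f'(t) &= \lim_{s \to t} \frac{\hat f(s) - \hat f(t)}{s - t} \\
&= \int_{-\infty}^{\infty} f(x)e^{-ixt} \lim_{s \to t} \phi(x, s - t)\,d\mu(x) \\
&= \int_{-\infty}^{\infty} (-ix)e^{-ixt} f(x)\,d\mu(x) \\
&= \int_{-\infty}^{\infty} g(x)e^{-ixt}\,d\mu(x) = \hat g(t).
\end{aligned}
\]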
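Properties (1), (2) and (5) of Theorem 3.3 can be spot-checked numerically. The sketch below assumes the normalization $d\mu = dx/\sqrt{2\pi}$ used throughout; the test function `f`, the parameters `alpha`, `lam`, `t`, and the trapezoid rule on a truncated interval are all illustrative choices rather than anything prescribed by the text.

```python
import cmath
import math

def fourier(f, t, L=12.0, n=4000):
    """Trapezoid approximation on [-L, L] of the integral of f(x) e^{-ixt} dx / sqrt(2 pi)."""
    h = 2.0 * L / n
    total = 0j
    for k in range(n + 1):
        x = -L + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * f(x) * cmath.exp(-1j * x * t)
    return total * h / math.sqrt(2.0 * math.pi)

# Hypothetical rapidly decaying test function.
f = lambda x: math.exp(-x * x / 2.0) * (1.0 + x)
alpha, lam, t = 0.7, 2.0, 0.4

g1 = lambda x: f(x) * cmath.exp(1j * alpha * x)   # property (1): modulation
g2 = lambda x: f(x - alpha)                        # property (2): translation
g5 = lambda x: f(x / lam)                          # property (5): dilation

err1 = abs(fourier(g1, t) - fourier(f, t - alpha))
err2 = abs(fourier(g2, t) - fourier(f, t) * cmath.exp(-1j * alpha * t))
# The dilated function is wider, so a larger window is used for it.
err5 = abs(fourier(g5, t, L=30.0, n=10000) - lam * fourier(f, lam * t))
```

All three discrepancies are at the level of rounding error, since the trapezoid rule is very accurate for smooth, rapidly decaying integrands.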
Observe that
\[ I^2 = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-(x^2 + y^2)/2}\,\frac{dx\,dy}{2\pi}. \]
= 0.
The interchange of integration and differentiation can be justified (see next section)
and so we deduce I (u) is constant. But since I (0) = 1, we have I (u) = 1. That is
\[ e^{u^2/2}\int_{-\infty}^{\infty} e^{-x^2/2}\,e^{-iux}\,d\mu(x) = 1 \]
which implies
\[ \hat f(u) = \int_{-\infty}^{\infty} e^{-x^2/2}\,e^{-iux}\,d\mu(x) = e^{-u^2/2} = f(u). \]
That is, f is its own Fourier transform. In other words, when we view the Fourier
transform as a linear transformation, it is an eigenvector with eigenvalue 1.
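The identity $\hat f = f$ for $f(x) = e^{-x^2/2}$ is easy to confirm numerically. This is only a sketch: the truncation window, the number of nodes and the sample points `u` are arbitrary assumptions, and the normalization $d\mu = dx/\sqrt{2\pi}$ is the one fixed earlier.

```python
import cmath
import math

def gaussian_transform(u, L=12.0, n=4000):
    """Trapezoid approximation of the integral of e^{-x^2/2} e^{-iux} dx / sqrt(2 pi) over [-L, L]."""
    h = 2.0 * L / n
    total = 0j
    for k in range(n + 1):
        x = -L + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(-x * x / 2.0) * cmath.exp(-1j * u * x)
    return total * h / math.sqrt(2.0 * math.pi)

sample_points = (0.0, 0.5, 1.0, 2.5)
errors = [abs(gaussian_transform(u) - math.exp(-u * u / 2.0)) for u in sample_points]
```

The computed transform is real up to rounding, as it must be for an even real function.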
Exercises
1. Prove that
\[ \Big|\frac{\sin\theta}{\theta}\Big| \le 1, \]
where $(x, t)$ denotes the usual inner product, and $dx$ is the Lebesgue measure
on $\mathbb{R}^n$. If $f(x) = e^{-|x|^2/2}$, show that $\hat f = f$. (Here $x = (x_1, ..., x_n)$ and $|x|^2 = x_1^2 + \cdots + x_n^2$.)
6. Prove that
\[ \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-t^2/2}\,dt \sim \frac{1}{\sqrt{2\pi}}\,\frac{e^{-x^2/2}}{x}, \]
as $x$ tends to infinity.
Theorem 3.4 Let [a, b] be an interval in R (which may be infinite) and suppose that
f is a continuous function on the rectangle [a, b] × [c, d]. Set
\[ g(y) = \int_a^b f(x, y)\,dx. \]
Suppose that $\partial f/\partial y$ exists and is continuous. Further suppose if $[c, d]$ is infinite then
$\partial f/\partial y$ is bounded. Then $g$ is differentiable and
\[ g'(y) = \int_a^b \frac{\partial f}{\partial y}(x, y)\,dx. \]
Proof We compute
\[ \lim_{n \to \infty} \frac{g(y + 1/n) - g(y)}{1/n} \]
Indeed,
\[ n\,[g(y + 1/n) - g(y)] = \int_a^b n\,[f(x, y + 1/n) - f(x, y)]\,dx. \]
By the assumption that $\partial f/\partial y$ exists and the mean value theorem, we have that
\[ f(x, y + 1/n) - f(x, y) = \frac{1}{n}\,\frac{\partial f}{\partial y}(x, \xi) \]
which is now absolutely convergent for t > 0. It is now easy to check that the function
\[ f(x, t) = e^{-tx}\,\frac{\sin x}{x} \]
satisfies the hypotheses of the theorem for $[a, b] = [0, \infty)$, $[c, d] = [\varepsilon, \infty)$ with
$\varepsilon > 0$. Differentiating under the integral sign with respect to $t$, we obtain
\[ g'(t) = -\int_0^{\infty} e^{-tx}\sin x\,dx. \]
We integrate by parts:
\[
\int_0^{\infty} e^{-tx}\sin x\,dx = \Big[-\frac{e^{-tx}}{t}\sin x\Big]_{x=0}^{x=\infty} + \int_0^{\infty} \frac{e^{-tx}}{t}\cos x\,dx.
\]
The first term on the right hand side is $1/t^2$ so that we deduce
In other words,
\[ g'(t) = -\frac{1}{1 + t^2}. \]
Integrating, we obtain
g(t) = − arctan t + C
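The formula $g(t) = -\arctan t + C$ can be tested numerically. The sketch below takes $C = \pi/2$, which is an inference from $g(t) \to 0$ as $t \to \infty$ rather than something quoted from the surrounding text, and approximates $g(t) = \int_0^\infty e^{-tx}\frac{\sin x}{x}\,dx$ by Simpson's rule on a truncated interval. The cutoff `upper` and the step count are arbitrary choices.

```python
import math

def integrand(x, t):
    """e^{-tx} sin(x)/x, with the removable singularity at x = 0 filled in by 1."""
    return math.exp(-t * x) * (math.sin(x) / x if x != 0.0 else 1.0)

def g(t, upper=60.0, n=60000):
    """Composite Simpson rule on [0, upper]; the neglected tail is at most e^{-t*upper}/t."""
    h = upper / n  # n must be even for Simpson's rule
    total = integrand(0.0, t) + integrand(upper, t)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h, t)
    return total * h / 3.0

def g_closed(t):
    """Candidate closed form pi/2 - arctan(t), with C = pi/2 assumed as explained above."""
    return math.pi / 2.0 - math.atan(t)
```

For instance, `g(1.0)` agrees with $\pi/4$ to many digits. The truncation only works for $t$ bounded away from $0$, which mirrors the restriction $[c, d] = [\varepsilon, \infty)$ in the argument.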
In his semi-autobiography, “Surely You're Joking, Mr. Feynman!”, Richard Feynman
narrates that he learned this method while in high school. It was the one trick
he would use again and again, and this is how he earned a reputation for evaluating
integrals.
Exercises
1. Suppose $f \in L^1(\mathbb{R})$ and $f(x) > 0$ for all real $x$. Show that $|\hat f(y)| < \hat f(0)$ for
every $y \ne 0$.
2. Find the Fourier transform of
\[ f(x) = \frac{x}{(1 + x^2)^2}. \]
3. Prove that
\[ \int_0^1 \frac{x - 1}{\log x}\,dx = \log 2. \]
4. Let
\[ \phi(\alpha) = \int_{u_1}^{u_2} f(x, \alpha)\,dx, \qquad a \le \alpha \le b, \]
This is sometimes called Leibniz's rule for differentiating under the integral sign.
5. Let
\[ \phi(a) = \int_a^{a^2} \frac{\sin ax}{x}\,dx, \qquad a \ne 0. \]
Show that
\[ \phi'(a) = \frac{3\sin a^3 - 2\sin a^2}{a}. \]
6. Prove that
\[ \int_{-\infty}^{\infty} e^{-tx^2/2}\,dx = \frac{\sqrt{2\pi}}{\sqrt{t}}. \]
\[ \hat f(t) = e^{-|t|^2/2}. \]
This is
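\[
\int_{-\infty}^{\infty} e^{-|\lambda||x|}e^{-ixt}\,d\mu(x) = \int_{-\infty}^{0} e^{-|\lambda||x|}e^{-ixt}\,d\mu(x) + \int_{0}^{\infty} e^{-|\lambda||x|}e^{-ixt}\,d\mu(x).
\]
Also,
\[
\int_{-\infty}^{0} e^{-|\lambda||x|}e^{-ixt}\,d\mu(x) = -\int_{\infty}^{0} e^{-|\lambda||x| + ixt}\,d\mu(x)
\]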
In other words,
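\[ h_\lambda(t) = \sqrt{\frac{2}{\pi}}\,\frac{|\lambda|}{\lambda^2 + t^2}. \]
Changing $x$ to $\lambda x$ gives
\[
\int_{-\infty}^{\infty} h_\lambda(x)\,d\mu(x) = 2\sqrt{\frac{2}{\pi}}\int_0^{\infty} \frac{dx}{(1 + x^2)\sqrt{2\pi}} = \frac{2}{\pi}\int_0^{\infty} \frac{dx}{1 + x^2} = \frac{2}{\pi}\arctan x\,\Big|_0^{\infty} = 1.
\]
\[ 0 < H(x) \le 1 \]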
Exercises
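1. Suppose $f \in L^1(\mathbb{R})$ has a Fourier transform. For any $t_1, ..., t_n \in \mathbb{R}$ and $z_1, z_2, ..., z_n \in \mathbb{C}$, prove that
\[ \sum_{j=1}^{n}\sum_{k=1}^{n} f(t_j - t_k)\,z_j\,\overline{z_k} \ge 0. \]
3. Prove that
\[ \int_0^{\infty} e^{-x^2/2}\cos(tx)\,dx = \sqrt{\frac{\pi}{2}}\,e^{-t^2/2}. \]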
Proof We have
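\[
\begin{aligned}
(f * h_\lambda)(x) &= \int_{-\infty}^{\infty} f(x - y)h_\lambda(y)\,d\mu(y) \\
&= \int_{-\infty}^{\infty} f(x - y)\Big(\int_{-\infty}^{\infty} H(\lambda t)e^{-ity}\,d\mu(t)\Big)\,d\mu(y) \\
&= \int_{-\infty}^{\infty} H(\lambda t)\Big(\int_{-\infty}^{\infty} f(x - y)e^{-ity}\,d\mu(y)\Big)\,d\mu(t)
\end{aligned}
\]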
Exercises
1. Solve for $f$:
\[ \int_{-\infty}^{\infty} f(x - t)e^{-|t|}\,dt = \frac{4}{3}e^{-|x|} - \frac{2}{3}e^{-2|x|}. \]
Show that
F(t) + log(t − 1) = O(1),
as t → 1+.
3. For each natural number n, let gn be the characteristic function of [−n, n] and
let h be the characteristic function of [−1, 1]. Compute the convolution gn ∗ h
explicitly. (The graph is piecewise linear.)
4. With notation as in the previous exercise, show that gn ∗ h is the Fourier transform
of a function f n ∈ L 1 (R), where (apart from a multiplicative constant),
The integrand on the right hand side of the equation in Proposition 3.5.1 is bounded by
$|\hat f(t)|$ since $|H(x)| \le 1$. Moreover, as $\lambda \to 0^+$, $H(\lambda t) \to 1$ and so by the Lebesgue
dominated convergence theorem,
\[
\lim_{\lambda \to 0^+} \int_{-\infty}^{\infty} H(\lambda t)\hat f(t)e^{ixt}\,d\mu(t) = \int_{-\infty}^{\infty} \hat f(t)e^{ixt}\,d\mu(t)
\]
provided fˆ ∈ L 1 (R). What we would like to prove now is that the right hand side of
the above limit is in fact equal to f (x) almost everywhere. To this end, we need a
few technical results and we collate them below.
We give two proofs of the inversion theorem. The first proof follows closely Rudin
[2]. The second proof is shorter and makes essential use of the Lebesgue dominated
convergence theorem, Fubini's theorem and the fact that $e^{-x^2/2}$ is its own Fourier
transform. Though the first proof is longer, it has the merit of invoking related results
of independent interest. The second proof of course is crisp and succinct.
Proposition 3.6.1 For any f on R and any y ∈ R, let
f y (x) := f (x − y).
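\[ \|f - g\|_p < \varepsilon. \]
\[
\|g_s - g_t\|_p^p \le (3A)^{-1}\varepsilon^p\mu(I) \le \frac{\varepsilon^p(2A + |s - t|)}{3A} \le \frac{\varepsilon^p(2A + \delta)}{3A} \le \varepsilon^p.
\]
Therefore, when $|s - t| < \delta$, $\|g_s - g_t\|_p \le \varepsilon$.
Finally,
\[
\begin{aligned}
\|f_s - f_t\|_p &\le \|f_s - g_s\|_p + \|g_s - g_t\|_p + \|g_t - f_t\|_p \\
&\le \|f - g\|_p + \|g_s - g_t\|_p + \|f - g\|_p \\
&\le 2\varepsilon + \varepsilon = 3\varepsilon
\end{aligned}
\]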
we have
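\[
\begin{aligned}
(g * h_\lambda)(x) - g(x) &= \int_{-\infty}^{\infty} [g(x - y) - g(x)]\,\lambda^{-1}h_1(y/\lambda)\,d\mu(y) \\
&= \int_{-\infty}^{\infty} [g(x - \lambda s) - g(x)]\,h_1(s)\,d\mu(s)
\end{aligned}
\]
\[ \lim_{\lambda \to 0^+} \|f * h_\lambda - f\|_p = 0. \]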
Now,
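\[
\begin{aligned}
(f * h_\lambda)(x) - f(x) &= \int_{-\infty}^{\infty} f(x - y)h_\lambda(y)\,d\mu(y) - \int_{-\infty}^{\infty} f(x)h_\lambda(y)\,d\mu(y) \\
&= \int_{-\infty}^{\infty} (f(x - y) - f(x))\,h_\lambda(y)\,d\mu(y) \\
&= \int_{-\infty}^{\infty} (f(x - y) - f(x))\,h_\lambda^{1/p}(y)\,h_\lambda^{1/q}(y)\,d\mu(y).
\end{aligned}
\]
\[
\begin{aligned}
|(f * h_\lambda)(x) - f(x)| &\le \Big(\int_{-\infty}^{\infty} |f(x - y) - f(x)|^p\,h_\lambda(y)\,d\mu(y)\Big)^{1/p}\Big(\int_{-\infty}^{\infty} h_\lambda(y)\,d\mu(y)\Big)^{1/q} \\
&= \Big(\int_{-\infty}^{\infty} |f(x - y) - f(x)|^p\,h_\lambda(y)\,d\mu(y)\Big)^{1/p}
\end{aligned}
\]
because
\[ \int_{-\infty}^{\infty} h_\lambda(y)\,d\mu(y) = 1. \]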
Letting $g(y) = \|f_y - f\|_p^p$, we have by Fubini's theorem that the double integral
is equal to
\[ \int_{-\infty}^{\infty} g(y)h_\lambda(y)\,d\mu(y). \]
\[ |g(y)|^{1/p} \le \|f_y\|_p + \|f\|_p = 2\|f\|_p \]
Proof Exercise.
We are now ready to prove the inversion theorem.
because fˆ ∈ L 1 (R).
But Proposition 3.6.3 says
\[ \lim_{\lambda \to 0^+} \|f * h_\lambda - f\|_1 = 0. \]
But the left hand side is g(x). Thus g(x) = f (x) a.e. This completes the
proof.
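Here is the promised short proof of the inversion theorem. Let $g_\varepsilon(x) = e^{-\varepsilon^2 x^2/2}$.
Consider the absolutely convergent double integral:
\[
I_\varepsilon(x) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(u)e^{iw(x - u)}e^{-\varepsilon^2 w^2/2}\,d\mu(u)\,d\mu(w). \tag{3.2}
\]
By Fubini's theorem, we can evaluate the inner integral and deduce that
\[
I_\varepsilon(x) = \int_{-\infty}^{\infty} \hat f(w)e^{iwx}e^{-\varepsilon^2 w^2/2}\,d\mu(w).
\]
Since $\hat f \in L^1(\mathbb{R})$, we can apply the dominated convergence theorem to deduce that
\[
\lim_{\varepsilon \to 0} I_\varepsilon(x) = \int_{-\infty}^{\infty} \hat f(w)e^{iwx}\,d\mu(w).
\]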
\[ \hat g_\varepsilon(x) = \frac{1}{\varepsilon}\,e^{-x^2/(2\varepsilon^2)}. \]
Letting $\varepsilon \to 0$ and applying again the dominated convergence theorem gives the limit
is $f(x)$ since the probability integral equals 1. This completes the proof.
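The Gaussian damping argument can be watched at work on a concrete function. The sketch below takes $f(u) = e^{-|u|}$, whose transform in the normalization $d\mu = dx/\sqrt{2\pi}$ is $\sqrt{2/\pi}/(1 + w^2)$ (the case $\lambda = 1$ of the formula for $h_\lambda$ above), and evaluates the damped integral $I_\varepsilon(x)$ by the trapezoid rule for decreasing $\varepsilon$. The choice of $f$, the evaluation point, the cutoff and the step size are illustrative assumptions.

```python
import math

def damped_inverse(x, eps, h=0.01):
    """Trapezoid approximation of I_eps(x) for f(u) = e^{-|u|}.

    Only the cosine part of e^{iwx} is kept, since the transform is even in w
    and the sine part integrates to zero.
    """
    W = 8.0 / eps                      # beyond |w| = W the Gaussian factor is below e^{-32}
    n = int(round(2.0 * W / h))
    step = 2.0 * W / n
    total = 0.0
    for k in range(n + 1):
        w = -W + k * step
        weight = 0.5 if k in (0, n) else 1.0
        fhat = math.sqrt(2.0 / math.pi) / (1.0 + w * w)
        total += weight * fhat * math.cos(w * x) * math.exp(-(eps * w) ** 2 / 2.0)
    return total * step / math.sqrt(2.0 * math.pi)

x = 1.0
target = math.exp(-abs(x))                                  # f(x)
values = [damped_inverse(x, eps) for eps in (0.4, 0.2, 0.1, 0.05)]
gaps = [abs(v - target) for v in values]
```

The gaps shrink as $\varepsilon$ decreases, in line with $I_\varepsilon(x) \to f(x)$. At $x = 1$ the function is smooth and the gap behaves roughly like a multiple of $\varepsilon^2$.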
Exercises
Deduce that
\[ \int_0^1 \frac{\log(1 + x)}{1 + x^2}\,dx = \frac{\pi \log 2}{8}. \]
\[ \lim_{|t| \to \infty} \hat f(t) = 0. \]
so that
\[ |\hat f(t_n) - \hat f(t)| \le \int_{-\infty}^{\infty} |f(x)|\,|e^{-it_n x} - e^{-itx}|\,d\mu(x) \]
\[ \hat f(t_n) \to \hat f(t), \]
so that
\[ |2\hat f(t)| \le \|f - f_{\pi/t}\|_1 \to 0 \]
Exercises
as t → ±∞.
2. For any smooth function f , show that
\[ \lim_{|t| \to \infty} \int_{-\infty}^{\infty} f(x)\,\frac{\sin tx}{x}\,dx = \pi f(0). \]
The main weakness of the inversion theorem is that f ∈ L 1 (R) does not guarantee
fˆ ∈ L 1 (R). It turns out that by considering functions f ∈ L 1 (R) ∩ L 2 (R), we can
guarantee fˆ ∈ L 2 (R). We will show that L 1 (R) ∩ L 2 (R) is dense in L 2 (R). Using
this fact, we can extend the concept of the Fourier transform to all functions f ∈
$L^2(\mathbb{R})$. Moreover, we will see that for $f \in L^2(\mathbb{R})$, we have $\|f\|_2 = \|\hat f\|_2$ so that the
Fourier transform is an isometry on $L^2(\mathbb{R})$.
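\[ \|\phi_A - \hat f\|_2 \to 0 \quad \text{and} \quad \|\psi_A - f\|_2 \to 0 \quad \text{as } A \to \infty. \]
\[ \tilde f(x) := \overline{f(-x)} \]
and
\[ g = f * \tilde f \in L^1(\mathbb{R}). \]
Thus,
\[ g(x) = \int_{-\infty}^{\infty} f(x - y)\,\overline{f(-y)}\,d\mu(y) = (f_{-x}, f). \]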
From the fact that the map x → f x is a uniformly continuous map from R to
L p (R) (for any 1 p < ∞), and the fact that the inner product is a continuous map,
we see that g is continuous. Moreover, by the Cauchy–Schwarz inequality,
\[ |g(x)| \le \|f\|_2^2. \]
\[ \hat g = \hat f \cdot \hat{\tilde f} = |\hat f|^2 \ge 0 \]
Thus, fˆ ∈ L 2 (R).
Next, we need to show that
Y = { fˆ : f ∈ L 1 (R) ∩ L 2 (R)}
L 2 (R) = Y ⊕ Y ⊥ .
x → eiαx H (λx)
defines a map into L 1 (R) ∩ L 2 (R), for all α ∈ R and λ > 0. Thus the Fourier trans-
form of these functions, namely
\[ t \mapsto h_\lambda(\alpha - t) = \int_{-\infty}^{\infty} e^{i(\alpha - t)x}H(\lambda x)\,d\mu(x), \]
Φ of L 2 onto L 2 . This suffices to prove (1) and (2). The Hilbert space isomorphism
follows from Parseval’s formula:
( f, g) = ( fˆ, ĝ).
Exercises
1. Prove that
\[ \int_{-\infty}^{\infty} \frac{e^{itx}\,dt}{1 + t^2} = \pi e^{-|x|}. \]
2. Suppose f, g ∈ L 1 (R) ∩ L 2 (R) are such that fˆ(x) = ĝ(x) a.e. Show that f (x) =
g(x) a.e.
3. Extend Plancherel’s theorem for functions f ∈ L 2 (Rk ).
That is,
\[ \lambda = \frac{(v, w)}{\|w\|^2}. \]
In other words,
\[ |(v, w)| \le \|v\|\,\|w\|. \]
Definition 3.2 We define the Schwartz space$^{2}$ $\mathcal{S}$ to be the space of infinitely differentiable
functions $F : \mathbb{R} \to \mathbb{C}$ such that
\[ |x^k F^{(\ell)}(x)| \to 0 \]
2 Notice the “t” in the spelling of Schwartz’s name. The space is named after Laurent Schwartz
(1915–2002), who is different from H.A. Schwarz (1848–1921) of the Cauchy–Schwarz inequality.
with equality if and only if $\psi(x) = Ae^{-Bx^2}$ for some $B > 0$ and
\[ |A|^2 = \sqrt{\frac{2B}{\pi}}. \]
Proof Writing
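\[ 1 = \int_{-\infty}^{\infty} |\psi(x)|^2\,dx = \int_{-\infty}^{\infty} \psi(x)\overline{\psi(x)}\,dx. \]
so that
\[ 1 \le 2\int_{-\infty}^{\infty} |x|\,|\psi(x)|\,|\psi'(x)|\,dx. \]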
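so that
\[ \psi'(x) = \int_{-\infty}^{\infty} (2\pi i t)\hat\psi(t)e^{2\pi i t x}\,dt \]
In other words, $\psi'$ is the (inverse) Fourier transform of $(2\pi i t)\hat\psi(t)$.
By Parseval's formula, the square of the $L^2$-norm of $\psi'$ is equal to
\[ \int_{-\infty}^{\infty} 4\pi^2 t^2 |\hat\psi(t)|^2\,dt. \]
Thus
\[
1 \le 2(2\pi)\Big(\int_{-\infty}^{\infty} x^2|\psi(x)|^2\,dx\Big)^{1/2}\Big(\int_{-\infty}^{\infty} x^2|\hat\psi(x)|^2\,dx\Big)^{1/2}
\]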
for some scalar λ. This is an ordinary differential equation which is easily solved:
\[ \psi(x) = Ae^{-Bx^2}. \]
Our best guess for the particle’s position is then given by the expectation
\[ \mu = \int_{-\infty}^{\infty} x|\psi(x)|^2\,dx \]
and the error (or uncertainty) involved in this guess is given by the variance
\[ \int_{-\infty}^{\infty} (x - \mu)^2|\psi(x)|^2\,dx. \]
A similar analysis holds for the momentum of the particle. Indeed, the probability
that the momentum of a particle lies in (a, b) is given by
\[ \int_a^b |\hat\psi(x)|^2\,dx. \]
The expectation and variance are defined as before. Without loss of generality,
we can normalize our functions so that the expectations of both the
position and the momentum are zero. With this normalization, we see that the Heisenberg
uncertainty principle establishes a lower bound for the product of these variances.
In other words, any attempt to lower the error in our observation of position
increases the error in estimating momentum and vice versa. Though the theorem
originally arose in the context of quantum mechanics, we see that it is really a
theorem about Fourier transforms.
In the terminology of physics, the uncertainty principle is written as
\[ (\text{uncertainty of position}) \times (\text{uncertainty of momentum}) \ge \frac{\hbar}{16\pi^2}, \]
where $\hbar$ denotes Planck's constant.
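In the mathematical normalization used in the proof (transform with kernel $e^{-2\pi i x t}$ and $\|\psi\|_2 = 1$), the last inequality of the proof says that the product of the two variances is at least $1/(16\pi^2)$. The sketch below checks this numerically on two test functions with mean zero. The functions, the grid and the trapezoid rule are illustrative assumptions, and the normalization $\|\psi\|_2 = 1$ is imposed by dividing by the computed norm.

```python
import cmath
import math

def trapz(vals, h):
    """Trapezoid rule for equally spaced samples."""
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def uncertainty_product(psi, L=8.0, n=400):
    """Product of the variances of |psi|^2 and |psi^|^2 after normalizing ||psi||_2 = 1."""
    h = 2.0 * L / n
    grid = [-L + k * h for k in range(n + 1)]
    p = [psi(x) for x in grid]
    norm2 = trapz([abs(v) ** 2 for v in p], h)

    def hat(t):
        # psi^(t) = integral of psi(x) e^{-2 pi i x t} dx
        return trapz([v * cmath.exp(-2j * math.pi * x * t) for v, x in zip(p, grid)], h)

    ph = [hat(t) for t in grid]
    pos = trapz([x * x * abs(v) ** 2 for v, x in zip(p, grid)], h) / norm2
    mom = trapz([t * t * abs(v) ** 2 for v, t in zip(ph, grid)], h) / norm2
    return pos * mom

lower_bound = 1.0 / (16.0 * math.pi ** 2)
gaussian = uncertainty_product(lambda x: math.exp(-math.pi * x * x))
odd_bump = uncertainty_product(lambda x: x * math.exp(-math.pi * x * x))
```

The Gaussian attains the bound, consistent with the equality case $\psi(x) = Ae^{-Bx^2}$, while the second function gives a product nine times larger.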
Exercises
1. The uncertainty principle can also be formulated in terms of the Hermite operator
H defined as
\[ H(f) := -\frac{d^2 f}{dx^2} + x^2 f. \]
Show that (H ( f ), f ) ≥ ( f, f ) for all f in the Schwartz space with the usual
inner product.
2. Suppose f is a continuous function in L 1 (R). Show that if f and fˆ both have
compact support, then f is identically zero.
as well as the properties of convolutions that were nascent in our earlier treatment.
The following approximation theorem is due to Weierstrass.
Theorem 3.10 Let f ∈ C(R/Z), that is, f is a continuous function on R/Z. Let
ε > 0 be given. Then there is a trigonometric polynomial p such that
\[ \|f - p\|_\infty \le \varepsilon. \]
To prove this, we introduce the following useful concept. Let ε > 0 and 0 < δ <
1/2. A function f ∈ C(R/Z) is said to be a periodic (ε, δ) approximation to the
identity if
(a) $f(x) \ge 0$ $\forall x \in \mathbb{R}$ and $\int_0^1 f(x)\,dx = 1$;
(b) $f(x) < \varepsilon$ $\forall\, \delta \le |x| < 1 - \delta$.
In other words, the “bulk” of the continuation of f (x) occurs in [0, δ) and (1 −
δ, 1].
Lemma 3.1 For every ε > 0 and 0 < δ < 1/2, there exists a trigonometric polyno-
mial P which is an (ε, δ) approximation to the identity.
Proof We use the Fejér kernel:
\[ F_N(x) = \sum_{n=-N}^{N} \Big(1 - \frac{|n|}{N}\Big)e^{2\pi i n x}. \]
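\[ \sum_{n=0}^{N-1} e^{2\pi i n x} \]
\[ e^{\pi i (N-1)x} \cdot \frac{\sin N\pi x}{\sin \pi x} \]
so that
\[ F_N(x) = \Big(\frac{\sin N\pi x}{\sin \pi x}\Big)^2 \frac{1}{N}, \]
for $x \notin \mathbb{Z}$. If $x \in \mathbb{Z}$, then $F_N(x) = N$. In any case, $F_N(x) \ge 0$ for any $x$. Now
\[ \int_0^1 F_N(x)\,dx = \sum_{n=-N}^{N} \Big(1 - \frac{|n|}{N}\Big)\int_0^1 e^{2\pi i n x}\,dx = 1. \]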
\[ |F_N(x)| \le \frac{1}{N|\sin \pi x|^2} \le \frac{1}{N(\sin \pi\delta)^2} \]
for $\delta < |x| < 1 - \delta$ because the sine function is increasing on $[0, \pi/2]$ and decreasing
on $[\pi/2, \pi]$.
Thus, by choosing $N$ sufficiently large, we can make
\[ |F_N(x)| \le \varepsilon \]
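Both the closed form of the Fejér kernel and the bound away from the integers can be checked directly. The following sketch compares the defining sum with $\frac{1}{N}\big(\frac{\sin N\pi x}{\sin \pi x}\big)^2$ on a grid in $[\delta, 1 - \delta]$ and verifies the estimate $1/(N(\sin\pi\delta)^2)$ there. The values of `N`, `delta` and the grid are arbitrary illustrative choices.

```python
import cmath
import math

def fejer_sum(N, x):
    """F_N(x) from the definition: sum over |n| <= N of (1 - |n|/N) e^{2 pi i n x}."""
    total = sum((1.0 - abs(n) / N) * cmath.exp(2j * math.pi * n * x) for n in range(-N, N + 1))
    return total.real  # the imaginary parts cancel in pairs n, -n

def fejer_closed(N, x):
    """Closed form (sin(N pi x) / sin(pi x))^2 / N, valid for non-integer x."""
    return (math.sin(N * math.pi * x) / math.sin(math.pi * x)) ** 2 / N

N, delta = 200, 0.1
xs = [delta + k * (1.0 - 2.0 * delta) / 100 for k in range(101)]

max_gap = max(abs(fejer_sum(N, x) - fejer_closed(N, x)) for x in xs)
max_on_middle = max(fejer_closed(N, x) for x in xs)
bound = 1.0 / (N * math.sin(math.pi * delta) ** 2)
```

Doubling `N` halves `bound`, which is the mechanism behind making $F_N$ smaller than any prescribed $\varepsilon$ on the middle range.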
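\[
\begin{aligned}
&= \sum_{n=-N}^{N} a_n \int_0^1 f(t)e^{2\pi i n (x - t)}\,dt \\
&= \sum_{n=-N}^{N} a_n \Big(\int_0^1 f(t)e^{-2\pi i n t}\,dt\Big)e^{2\pi i n x}
\end{aligned}
\]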
because
\[ \int_0^1 P(y)\,dy = 1. \]
Thus, as $P$ is non-negative,
\[ |f(x) - (f * P)(x)| \le \int_0^1 |f(x) - f(x - y)|\,P(y)\,dy. \]
\[ \int_\delta^{1-\delta} |f(x) - f(x - y)|\,P(y)\,dy + \int_{1-\delta}^{1} |f(x) - f(x - y)|\,P(y)\,dy. \]
By the uniform continuity of $f$, the first and the last integrals are
\[ \le \varepsilon\int_0^{\delta} P(y)\,dy + \varepsilon\int_{1-\delta}^{1} P(y)\,dy \le 2\varepsilon \]
because $\int_0^1 P(y)\,dy = 1$.
The middle integral is bounded by
\[ M\int_\delta^{1-\delta} P(y)\,dy \le M\varepsilon \]
The virtue of this proof is that everything can be made explicit. Indeed, given a
continuous function f , the trigonometric polynomial approximations are provided
by $f * F_N$, where $F_N$ is the Fejér kernel.
Exercises
1. Let
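\[ F_N(x) = \sum_{|n| \le N} \Big(1 - \frac{|n|}{N}\Big)e^{2\pi i n x}. \]
Show that
\[ F_N(x) = \frac{1}{N}\Big|\sum_{n=0}^{N-1} e^{2\pi i n x}\Big|^2 = \frac{1}{N}\Big(\frac{\sin \pi N x}{\sin \pi x}\Big)^2. \]
\[ A \le \frac{L^2}{4\pi}, \]
with equality if and only if $C$ is a circle.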
Proof Our proof is due to Hurwitz. The reader will recall that Green’s theorem gives
us the formula for the area $A$:
\[ A = \frac{1}{2}\int_C (x\,dy - y\,dx), \]
where we have parametrized the curve $C$ by $\gamma : [0, 2\pi] \to \mathbb{R}^2$ with $\gamma(t) = (x(t), y(t))$. We first observe that without any loss of generality, we may suppose that
$L = 2\pi$ because any dilation $(x, y) \to (\lambda x, \lambda y)$ multiplies the area $A$ by $\lambda^2$ and the
length $L$ by $\lambda$. Thus, taking $\lambda = 2\pi/L$, we see that
it suffices to show that when $L = 2\pi$, then $A \le \pi$ with equality if and only if $C$ is a
circle. Utilizing our formula for the length of the curve, we have
\[ \frac{1}{2\pi}\int_0^{2\pi} \big((x'(t))^2 + (y'(t))^2\big)\,dt = 1. \]
As our curve is closed, the functions x(t), y(t) are periodic with period 2π. We may
write down their Fourier series:
Then,
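\[ x'(t) = \sum_n a_n\, i n\, e^{int}, \qquad y'(t) = \sum_n b_n\, i n\, e^{int}. \]
Now $x(t)$ and $y(t)$ are real valued, so that $a_{-n} = \overline{a_n}$ and $b_{-n} = \overline{b_n}$. Our formula
for the area via Green's theorem combined with another application of Parseval's
formula now gives
\[
A = \frac{1}{2}\int_0^{2\pi} \big(x(t)y'(t) - y(t)x'(t)\big)\,dt = \pi\sum_{n=-\infty}^{\infty} n\big(a_n\overline{b_n} - b_n\overline{a_n}\big).
\]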
Observing that
\[ |a_n\overline{b_n} - b_n\overline{a_n}| \le 2|a_n||b_n| \le |a_n|^2 + |b_n|^2, \tag{3.5} \]
\[ 2(|a_1|^2 + |b_1|^2) = 1. \]
Since we have equality at every stage in (3.5), we deduce $|a_1| = |b_1| = 1/2$. Thus
writing
\[ a_1 = \frac{1}{2}e^{i\alpha}, \quad \text{and} \quad b_1 = \frac{1}{2}e^{i\beta}, \]
the equality from (3.5) gives
\[ 1 = 2|a_1\overline{b_1} - \overline{a_1}b_1| \]
where the sign in y(t) depends on the parity of (k − 1)/2. These functions parametrize
the circle of radius 1 centered at (a0 , b0 ). This completes the proof.
Exercises
show that
\[ \int_0^{2\pi} |f(t)|^2\,dt \le \int_0^{2\pi} |f'(t)|^2\,dt. \]
2. Show that equality can hold in the above inequality if and only if f (t) = A sin t +
B cos t, for some constants A and B.
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} \chi_I(x_n) = \int_0^1 \chi_I(x)\,d\mu(x). \]
Since step functions are not continuous functions, we would like to replace this
criterion with continuous functions.
To this end, let ε > 0. Then we can find continuous functions
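such that
\[ \int_0^1 f_\varepsilon^{\pm}(x)\,dx = b - a \pm \varepsilon. \]
\[ \frac{1}{N}\sum_{n=1}^{N} f(x_n) \to \int_0^1 f(x)\,d\mu(x) \]
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} e^{2\pi i m x_n} = 0 \quad \forall m \ne 0,\ m \in \mathbb{Z}. \]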
Proof Recall that from our previous remark, we know that $\{x_n\}_{n=1}^{\infty}$ is equidistributed
if and only if
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} f(x_n) = \int_0^1 f(x)\,dx \]
for all continuous functions $f$. In particular, we can apply this to the continuous
function
\[ f_m(x) = e^{2\pi i m x} \]
and so if $\{x_n\}_{n=1}^{\infty}$ is equidistributed, then
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} f_m(x_n) = \int_0^1 e^{2\pi i m x}\,dx = 0. \]
To prove the converse, we recall from the previous section that trigonometric
polynomials are dense in C[0, 1]. Given a continuous function f , let ε > 0 be fixed.
We can find a trigonometric polynomial P(x) such that
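\[ P(x) = \sum_{|m| \le R} a_m e^{2\pi i m x} \]
and
\[ \sup_{x \in [0,1]} |f(x) - P(x)| \le \varepsilon. \]
Thus,
\[ \Big|\frac{1}{N}\sum_{n=1}^{N} f(x_n) - \frac{1}{N}\sum_{n=1}^{N} P(x_n)\Big| \le \varepsilon \]
and
\[
\lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N}\sum_{|m| \le R} a_m e^{2\pi i m x_n} = \sum_{|m| \le R} a_m \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} e^{2\pi i m x_n}.
\]
The inner limits are all zero except for the case $m = 0$. On the other hand,
\[ \Big|\int_0^1 f(x)\,dx - \int_0^1 P(x)\,dx\Big| \le \varepsilon \]
and $\int_0^1 P(x)\,dx = a_0$.
From our construction in Theorem 3.10, we see that
\[ a_0 = \int_0^1 f(x)\,dx. \]
Thus,
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n=1}^{N} f(x_n) = \int_0^1 f(x)\,dx \]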
as desired.
Example 3.2 If $\theta \in \mathbb{Q}$, the sequence of fractional parts $\{n\theta\}_{n=1}^{\infty}$ is not equidistributed. If $\theta \notin \mathbb{Q}$, the sequence of fractional parts $\{n\theta\}_{n=1}^{\infty}$ is equidistributed.
We apply Weyl's criterion: indeed, if $\theta = p/q$ (say), then
\[ \sum_{n=1}^{N} e^{2\pi i q (np/q)} = N \]
so that the Weyl limit is 1 for $m = q$. This proves the first part of the assertion. For
the second part, we see that
\[ \sum_{n=1}^{N} e^{2\pi i m n \theta} = e^{2\pi i m \theta} \cdot \frac{(e^{2\pi i m \theta})^N - 1}{e^{2\pi i m \theta} - 1} \]
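Weyl's criterion is easy to observe numerically. In the sketch below the averages $\frac{1}{N}\big|\sum_{n \le N} e^{2\pi i m n \theta}\big|$ are computed for $\theta = \sqrt{2}$ and for the rational $\theta = 3/7$ with $m = 7$. The specific values of $\theta$, $m$ and $N$ are arbitrary illustrative choices, and floating point arithmetic only approximates $\sqrt{2}$.

```python
import cmath
import math

def weyl_average(theta, m, N):
    """Absolute value of (1/N) * sum_{n=1}^{N} e^{2 pi i m n theta}."""
    total = sum(cmath.exp(2j * math.pi * m * n * theta) for n in range(1, N + 1))
    return abs(total) / N

def geometric_bound(theta, m, N):
    """Bound 2 / (N |e^{2 pi i m theta} - 1|) obtained from the geometric sum."""
    return 2.0 / (N * abs(cmath.exp(2j * math.pi * m * theta) - 1.0))

N = 20000
irrational = [weyl_average(math.sqrt(2.0), m, N) for m in (1, 2, 3)]
bounds = [geometric_bound(math.sqrt(2.0), m, N) for m in (1, 2, 3)]
rational = weyl_average(3.0 / 7.0, 7, N)   # every term equals 1, so the average is 1
```

For the irrational angle the averages decay like $1/N$ because the denominator $e^{2\pi i m \theta} - 1$ never vanishes, while for $\theta = p/q$ and $m = q$ the average stays at 1.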
Exercises
N N N
\[
\sum_{n=1}^{N} f(n) = \int_1^N f(t)\,dt + \frac{1}{2}\big(f(1) + f(N)\big) + \int_1^N \Big(\{t\} - \frac{1}{2}\Big)f'(t)\,dt.
\]
3. Show that the sequence $\{\log n\}$ is not equidistributed mod 1. [Hint: apply the
Euler–Maclaurin summation formula with $f(t) = e^{2\pi i \log t}$.]
4. Let $a$ and $b$ be integers with $a < b$ and suppose that $f$ is twice differentiable on
$[a, b]$. For all $x \in [a, b]$ suppose that either $f''(x) \ge \delta > 0$ or $f''(x) \le -\delta < 0$
holds. Then,
\[
\Big|\sum_{n=a}^{b} e^{2\pi i f(n)}\Big| \le \big(|f'(b) - f'(a)| + 2\big)\Big(\frac{4}{\sqrt{\delta}} + 3\Big).
\]
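In other words,
\[ a_m = \int_0^1 f(x)e^{-2\pi i m x}\,dx. \]
\[ \sum_{n \in \mathbb{Z}} \hat f(n)e^{2\pi i n x} \]
\[ \lim_{|n| \to \infty} \hat f(n) = 0. \]
\[ \|f - P\|_\infty < \varepsilon. \]
\[ D_N(x) = \sum_{|n| \le N} e^{2\pi i n x}. \]
\[ D_N(x) = \frac{\sin(2N + 1)\pi x}{\sin \pi x} = \cos 2N\pi x + \cot \pi x \cdot \sin 2N\pi x \]
\[ f(x) = \sum_{n \in \mathbb{Z}} \hat f(n)e^{2\pi i n x}. \]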
Proof Let
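\[
\begin{aligned}
S_N(x, f) &= \sum_{|n| \le N} \hat f(n)e^{2\pi i n x} \\
&= \sum_{|n| \le N} e^{2\pi i n x}\int_0^1 f(t)e^{-2\pi i n t}\,dt \\
&= \int_0^1 f(t)\sum_{|n| \le N} e^{2\pi i n (x - t)}\,dt \\
&= (f * D_N)(x).
\end{aligned}
\]
Now,
\[
\begin{aligned}
S_N(x, f) - f(x) &= \int_0^1 f(t)D_N(x - t)\,dt - f(x) \\
&= \int_0^1 f(x - u)D_N(u)\,du - \int_0^1 f(x)D_N(u)\,du
\end{aligned}
\]
so that
\[
S_N(x, f) - f(x) = \int_0^1 (f(x - u) - f(x))\{\cos 2N\pi u + \cot \pi u \cdot \sin 2N\pi u\}\,du.
\]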
Let
g1 (u) = f (x − u) − f (x)
and
g2 (u) = g1 (u) cot πu.
Exercises
\[ D_N(x) = \frac{\sin(N + 1/2)x}{\sin(x/2)}, \]
\[ \frac{1}{N}\big(D_0(x) + D_1(x) + \cdots + D_{N-1}(x)\big). \]
\[ \hat f(n) = o(1/|n|^k), \]
We will now study how to make the following heuristic argument rigorous. As earlier,
for $f \in L^1(\mathbb{R})$ we define
\[ \hat f(t) = \int_{-\infty}^{\infty} f(x)e^{-2\pi i x t}\,dx. \]
This is a minor variation of the Fourier transform where i xt has been replaced
by 2πi xt in the exponential. This is motivated by a desire to have a more elegant
formulation of the Poisson formula.
Given f ∈ L 1 (R), we define
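\[ F(x) = \sum_{n \in \mathbb{Z}} f(n + x). \]
But,
\[
\begin{aligned}
\hat F(n) &= \int_0^1 F(x)e^{-2\pi i n x}\,dx \\
&= \int_0^1 \sum_{m \in \mathbb{Z}} f(m + x)e^{-2\pi i n x}\,dx \\
&= \sum_{m \in \mathbb{Z}} \int_0^1 f(m + x)e^{-2\pi i n x}\,dx.
\end{aligned}
\]
Thus,
\[ \sum_{m \in \mathbb{Z}} f(m + x) = \sum_{n \in \mathbb{Z}} \hat f(n)e^{2\pi i n x}. \]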
\[ \sum_{n \in \mathbb{Z}} f(n + x) \]
\[ \sum_{n \in \mathbb{Z}} \hat f(n) \]
to be absolutely convergent.
Now, to ensure $F(x)$ is differentiable, we need to study
Since we are assuming $f(x)$ is differentiable, we have by the mean value theorem
\[ f(n + x + h) - f(n + x) = h f'(\xi_{n,x}) \]
\[ \sum_{n \in \mathbb{Z}} |f'(\xi_{n,x})|. \]
\[ \sum_{n \in \mathbb{Z}} f(n) \quad \text{and} \quad \sum_{n \in \mathbb{Z}} f(n + x) \]
for $|t| \ge 1$ (say). Our estimates on $f(x)$ ensure absolute convergence of the integral.
Therefore,
\[ \sum_{n \in \mathbb{Z}} \hat f(n) \]
is absolutely convergent.
The differentiability of $f$ along with our estimates on $f, f'$ now ensure $F(x)$ is
differentiable so that it admits a Fourier series expansion.
With those remarks in place, the heuristic “proof” now can be made rigorous. We
leave the details to the reader.
Then
\[ \sum_{m \in \mathbb{Z}} \hat f(m)e^{2\pi i m x} = \sum_{n \in \mathbb{Z}} f(n + x). \]
We leave the proof as an exercise to the reader and turn our attention to various
applications.
Recall that if $F(x) = e^{-c|x|}$ with $\mathrm{Re}(c) > 0$, then
\[ \hat{F}(u) = \frac{2c}{c^2 + 4\pi^2 u^2} \]
which follows easily from our earlier calculation of the Fourier transform of the
continuous function $e^{-|x|}$. The conditions of the theorem are satisfied, and we obtain
\[ \sum_{n=-\infty}^{\infty} e^{-c|n|} = \sum_{n=-\infty}^{\infty} \frac{2c}{c^2 + 4\pi^2 n^2}. \]
Therefore,
\[ \frac{e^c + 1}{e^c - 1} = \sum_{n=-\infty}^{\infty} \frac{2c}{c^2 + 4\pi^2 n^2}. \tag{$*$} \]
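As a sanity check on $(*)$, the following minimal Python sketch compares the closed form on the left with a symmetric partial sum of the series on the right, for real $c > 0$. The function names `coth_side` and `series_side` and the truncation level are illustrative choices, not part of the text; the neglected tail is roughly $c/(\pi^2 N)$ for a cutoff $N$.

```python
import math


def coth_side(c: float) -> float:
    """Left-hand side of (*): (e^c + 1) / (e^c - 1)."""
    return (math.exp(c) + 1.0) / (math.exp(c) - 1.0)


def series_side(c: float, terms: int = 200_000) -> float:
    """Partial sum over |n| <= terms of 2c / (c^2 + 4 pi^2 n^2)."""
    total = 2.0 / c  # the n = 0 term, 2c / c^2
    for n in range(1, terms + 1):
        # the terms for n and -n coincide, so each is counted twice
        total += 2.0 * (2.0 * c) / (c * c + 4.0 * math.pi ** 2 * n * n)
    return total


if __name__ == "__main__":
    for c in (0.5, 1.0, 2.0 * math.pi):
        print(c, coth_side(c), series_side(c))
```

The two columns printed should agree to several decimal places; the agreement improves slowly as `terms` grows because the series converges only like $1/N$.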
Putting c = 2π gives
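\[
\pi\,\frac{e^{2\pi} + 1}{e^{2\pi} - 1} = \sum_{n=-\infty}^{\infty} \frac{1}{n^2 + 1} = 1 + 2\sum_{n=1}^{\infty} \frac{1}{n^2 + 1}.
\]
Thus
\[
2\sum_{n=1}^{\infty} \frac{1}{n^2 + 1} = \frac{\pi e^{2\pi} + \pi - (e^{2\pi} - 1)}{e^{2\pi} - 1} = \frac{\pi e^{2\pi} - e^{2\pi} + \pi + 1}{e^{2\pi} - 1}.
\]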
which was finally solved by Euler who showed that it equals π 2 /6. This can be
deduced from our result.
Indeed, from (∗), we have
\[
\frac{1}{2c}\left(1 + \frac{2}{e^c - 1}\right) = \sum_{n=-\infty}^{\infty} \frac{1}{c^2 + 4\pi^2 n^2} = \frac{1}{c^2} + 2\sum_{n=1}^{\infty} \frac{1}{c^2 + 4\pi^2 n^2}.
\]
Now
\[ e^c - 1 = c + \frac{c^2}{2!} + \frac{c^3}{3!} + \cdots \]
so that
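\[
\frac{1}{2c}\left(1 + \frac{2}{c\left(1 + \frac{c}{2} + \frac{c^2}{6} + \cdots\right)}\right) - \frac{1}{c^2}
= \frac{1}{2c}\left(1 + \frac{2}{c}\left(1 - \left(\frac{c}{2} + \frac{c^2}{6} + \cdots\right) + \left(\frac{c}{2} + \frac{c^2}{6} + \cdots\right)^2 + \cdots\right)\right) - \frac{1}{c^2}
\]
which simplifies to
\[
\frac{1}{2c} + \frac{1}{c^2}\left(1 - \frac{c}{2} - \frac{c^2}{6} - \cdots + c^2\left(\frac{1}{2} + \frac{c}{6} + \cdots\right)^2 + \cdots\right) - \frac{1}{c^2},
\]
and, letting $c \to 0$, this tends to
\[ -\frac{1}{6} + \frac{1}{4} = \frac{1}{12}. \]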
Therefore,
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \]
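The limiting argument above can be checked numerically. The sketch below (the names `bracket` and `zeta2_partial` are hypothetical) evaluates $\frac{1}{2c}\bigl(1 + \frac{2}{e^c - 1}\bigr) - \frac{1}{c^2}$ for small $c$, where it should be close to $1/12$, and compares $2\pi^2$ times that value with a partial sum of $\sum 1/n^2$. It uses `math.expm1` to limit cancellation error for small $c$.

```python
import math


def bracket(c: float) -> float:
    """(1/(2c)) * (1 + 2/(e^c - 1)) - 1/c^2, which tends to 1/12 as c -> 0."""
    return (1.0 + 2.0 / math.expm1(c)) / (2.0 * c) - 1.0 / (c * c)


def zeta2_partial(terms: int) -> float:
    """Partial sum of 1/n^2 for n = 1..terms."""
    return sum(1.0 / (n * n) for n in range(1, terms + 1))


if __name__ == "__main__":
    c = 0.01
    print(bracket(c), 1.0 / 12.0)
    # 2 * sum 1/(4 pi^2 n^2) = 1/12 rearranges to sum 1/n^2 = 2 pi^2 / 12
    print(2.0 * math.pi ** 2 * bracket(c), zeta2_partial(100_000), math.pi ** 2 / 6.0)
```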
\[ \sum_{n=1}^{\infty} \frac{1}{n^{2k}} \in \pi^{2k}\mathbb{Q} \]
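\[ f(t) = e^{-\pi(a + t/\sqrt{x})^2} \]
then
\[ \hat{f}(t) = \sqrt{x}\, e^{2\pi i a t \sqrt{x}} \cdot e^{-\pi t^2 x}. \]
Therefore,
\[
\sum_{n=-\infty}^{\infty} e^{-\pi(a + n/\sqrt{x})^2} = \sqrt{x} \sum_{n=-\infty}^{\infty} e^{-\pi n^2 x + 2\pi i a n \sqrt{x}}. \tag{3.6}
\]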
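Identity (3.6) is easy to test numerically because both sides converge extremely fast. The following sketch, with hypothetical helper names `theta_left` and `theta_right` and arbitrarily chosen sample values of $a$ and $x$, truncates both sums symmetrically and compares them.

```python
import cmath
import math


def theta_left(a: float, x: float, n_max: int = 60) -> float:
    """Truncated left-hand side of (3.6): sum of exp(-pi (a + n/sqrt(x))^2)."""
    root = math.sqrt(x)
    return sum(math.exp(-math.pi * (a + n / root) ** 2)
               for n in range(-n_max, n_max + 1))


def theta_right(a: float, x: float, n_max: int = 60) -> complex:
    """Truncated right-hand side: sqrt(x) * sum of exp(-pi n^2 x + 2 pi i a n sqrt(x))."""
    root = math.sqrt(x)
    total = sum(cmath.exp(-math.pi * n * n * x + 2j * math.pi * a * n * root)
                for n in range(-n_max, n_max + 1))
    return root * total


if __name__ == "__main__":
    a, x = 0.3, 0.7  # sample values, chosen only for illustration
    print(theta_left(a, x), theta_right(a, x))
```

The right-hand side is computed as a complex number; its imaginary part should vanish up to rounding, since the terms for $n$ and $-n$ are complex conjugates.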
Riemann used this identity to derive the analytic continuation and functional
equation for his famous zeta function:
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}. \]
\[ \frac{\partial u}{\partial t} = \kappa \frac{\partial^2 u}{\partial x^2} \]
where κ is a constant (called the conductivity of the ring’s material).
Fourier solved this equation by the method of separation of variables, by assuming
that
u(x, t) = A(x)B(t).
so that
\[ \kappa \frac{A''(x)}{A(x)} = \frac{B'(t)}{B(t)} = \lambda = \text{constant} \]
since the left hand side is a function of x and the right hand side is a function of t.
3.14 The Poisson Summation Formula 191
We therefore get two ordinary differential equations, and they can be solved by
classical methods. The general solution for $A(x)$ is of the form
\[ A(x) = e^{2\pi i n x} \]
\[ B(t) = e^{-4\pi^2 n^2 t}. \]
In this way, we see that the general theta function appears as the solution to the
heat equation.
If in addition, we have an initial condition:
u(x, 0) = f (x)
\[ \theta_t(x) = \sum_{n \in \mathbb{Z}} e^{-4\pi^2 n^2 t} e^{2\pi i n x} \]
then we see that $e^{-4\pi^2 n^2 t}$ is the $n$-th Fourier coefficient of $\theta_t(x)$ and $a_n e^{-4\pi^2 n^2 t}$ is the
$n$-th Fourier coefficient of $u(x, t)$. As $a_n = \hat{f}(n)$, and $e^{-4\pi^2 n^2 t} = \hat{\theta}_t(n)$, we see that
\[ u(x, t) = (f * \theta_t)(x). \]
In other words, the θ-function enters in a fundamental way to resolve the general
heat equation.
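To illustrate the convolution formula, here is a small Python sketch under stated assumptions: the initial condition is taken to be $f(x) = \cos 2\pi x$ (an arbitrary test case), the theta function is truncated at $|n| \le 20$, and the convolution over one period is approximated by an equally spaced Riemann sum. For this $f$ the solution should be $e^{-4\pi^2 t}\cos 2\pi x$, since only the coefficients with $n = \pm 1$ are nonzero. All function names are hypothetical.

```python
import math


def theta(t: float, x: float, n_max: int = 20) -> float:
    """Truncated theta_t(x) = sum_n exp(-4 pi^2 n^2 t) exp(2 pi i n x), which is real."""
    return 1.0 + 2.0 * sum(
        math.exp(-4.0 * math.pi ** 2 * n * n * t) * math.cos(2.0 * math.pi * n * x)
        for n in range(1, n_max + 1)
    )


def heat_solution(f, t: float, x: float, samples: int = 64) -> float:
    """Riemann-sum approximation of (f * theta_t)(x) over one period [0, 1)."""
    h = 1.0 / samples
    return h * sum(f(j * h) * theta(t, x - j * h) for j in range(samples))


def initial_condition(x: float) -> float:
    return math.cos(2.0 * math.pi * x)


def exact_solution(t: float, x: float) -> float:
    return math.exp(-4.0 * math.pi ** 2 * t) * math.cos(2.0 * math.pi * x)


if __name__ == "__main__":
    t, x = 0.01, 0.2
    print(heat_solution(initial_condition, t, x), exact_solution(t, x))
```

Because the integrand is a trigonometric polynomial of degree below the number of sample points, the equally spaced sum reproduces the integral up to rounding error in this test case.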
Exercises
1. Let
\[
g(x) = \begin{cases} 1 - |x| & \text{if } |x| \le 1 \\ 0 & \text{otherwise.} \end{cases}
\]
Show that
\[ \hat{g}(x) = \left(\frac{\sin \pi x}{\pi x}\right)^2. \]
2. Apply Poisson’s summation formula for the function g in the previous exercise
to deduce that
\[ \sum_{n=-\infty}^{\infty} \frac{1}{(n + x)^2} = \frac{\pi^2}{(\sin \pi x)^2}, \]
If X and Y are independent and have density functions f X and f Y , then X + Y has
density function f X ∗ f Y (see exercises).
We say X and Y are identically distributed if f X = f Y . Given a random variable
X , its expectation denoted E(X ) is defined as
E(X ) := x f X (x)d x.
X
Sometimes, one denotes E(X ) as μ and calls it the mean of X . The variance of X ,
denoted var(X ), is E((X − μ)2 ).
With this terminology and understanding, we are now ready to state and prove
the central limit theorem.
Theorem 3.16 (The central limit theorem) Let $X_1, X_2, \ldots$ be independent and identically
distributed random variables with $E(X_i) = 0$ and $\mathrm{var}(X_i) = 1$. Let $S_n = X_1 + X_2 + \cdots + X_n$. Then,
\[
\lim_{n \to \infty} P(\alpha\sqrt{n} \le S_n \le \beta\sqrt{n}) = \frac{1}{\sqrt{2\pi}} \int_{\alpha}^{\beta} e^{-t^2/2}\,dt.
\]
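The statement can be illustrated with a case where the left-hand side is computable exactly. The sketch below assumes fair coin flips $X_i = \pm 1$ (mean 0, variance 1); these are discrete and have no density, so this is an illustration of the classical theorem rather than of the density-based proof that follows. Then $S_n = 2K - n$ with $K$ binomial, and the probability is an exact finite sum. The function names are hypothetical.

```python
import math


def exact_probability(n: int, alpha: float, beta: float) -> float:
    """P(alpha*sqrt(n) <= S_n <= beta*sqrt(n)) for S_n a sum of n fair +/-1 steps."""
    root = math.sqrt(n)
    # S_n = 2k - n when exactly k of the steps equal +1
    favourable = sum(math.comb(n, k) for k in range(n + 1)
                     if alpha * root <= 2 * k - n <= beta * root)
    return favourable / 2 ** n


def gaussian_limit(alpha: float, beta: float) -> float:
    """(1/sqrt(2 pi)) * integral of exp(-t^2/2) over [alpha, beta], via the error function."""
    return 0.5 * (math.erf(beta / math.sqrt(2.0)) - math.erf(alpha / math.sqrt(2.0)))


if __name__ == "__main__":
    for n in (50, 500, 2000):
        print(n, exact_probability(n, -1.0, 1.0), gaussian_limit(-1.0, 1.0))
```

The exact values approach the Gaussian limit only at a rate of order $1/\sqrt{n}$, and they fluctuate slightly because $S_n$ lives on a lattice.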
Proof We want to determine the density function of $S_n/\sqrt{n}$. Recall that if $X$ has
density $f_X$ and $Y$ has density $f_Y$, then $X + Y$ has density $f_X * f_Y$, provided $X$ and $Y$
are independent. Since all the $X_i$'s are identically distributed with density function
$f$ (say) and the $X_i$'s are independent, $S_n$ has density function given by
\[ f^{*n} := f * \cdots * f. \]
Now if $X$ has density $f(t)$, $\lambda X$ has density $\lambda^{-1} f(t/\lambda)$ (Exercise). Therefore $S_n/\sqrt{n}$
has density $g^{*n}(t)$ where $g(t) = \sqrt{n} f(\sqrt{n}\,t)$. We want to show
\[ \lim_{n \to \infty} g^{*n}(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}. \]
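\[ \hat{g}(t)^n \to e^{-t^2/2}. \]
But $\hat{g}(t) = \hat{f}(t/\sqrt{n})$. Taking the Taylor expansion, we have
\[
\hat{f}(t/\sqrt{n}) = \hat{f}(0) + \hat{f}'(0)\,t/\sqrt{n} + \hat{f}''(0)\,t^2/2n + O(1/n^{3/2}).
\]
Now
\[ \hat{f}(0) = \int_{-\infty}^{\infty} f(t)\,dt = 1, \]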
Exercises
[Hint: choose the random variables to be Bernoulli trials with Ω = {H, T } and
independent random variables X i such that P(X i = H ) = p and apply the central
limit theorem.]
4. Prove that
\[ \lim_{n \to \infty} e^{-n} \sum_{k=0}^{n} \frac{n^k}{k!} = \frac{1}{2}. \]
[Hint: consider $X_1, \ldots, X_n$ independent random variables with the Poisson distribution
and parameter 1.]
References
1. M. Ram Murty, A simple proof that ζ(2) = π 2 /6. Math. Student 88, 113–115 (2019)
2. W. Rudin, Real and Complex Analysis, 3rd edn. (McGraw-Hill, New York, 1987)
Chapter 4
Complex Analysis
u(x, y) + iv(x, y)
\[
-i \lim_{y \to y_0} \frac{u(x_0, y) + iv(x_0, y) - u(x_0, y_0) - iv(x_0, y_0)}{y - y_0} = -i\left(\frac{\partial u}{\partial y} + i\frac{\partial v}{\partial y}\right)\bigg|_{x = x_0,\, y = y_0}.
\]
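\[ \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}, \]
\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}. \]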
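The Cauchy–Riemann equations lend themselves to a quick numerical experiment. The sketch below approximates the four partial derivatives by central differences and measures how far a function is from satisfying both equations at a point; the helper names and the step size are illustrative assumptions. It is tried on $z^3$, which should satisfy the equations, and on complex conjugation, which should not.

```python
def partials(f, x: float, y: float, h: float = 1e-5):
    """Central-difference estimates of (u_x, v_x, u_y, v_y) for f = u + iv at x + iy."""
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    return fx.real, fx.imag, fy.real, fy.imag


def cauchy_riemann_defect(f, x: float, y: float) -> float:
    """|u_x - v_y| + |v_x + u_y|; close to zero when both equations hold."""
    ux, vx, uy, vy = partials(f, x, y)
    return abs(ux - vy) + abs(vx + uy)


def cube(z: complex) -> complex:
    return z ** 3


def conjugate(z: complex) -> complex:
    return z.conjugate()


if __name__ == "__main__":
    print(cauchy_riemann_defect(cube, 0.7, -0.4))       # tiny
    print(cauchy_riemann_defect(conjugate, 0.7, -0.4))  # about 2
```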
Functions of a real variable that satisfy Laplace’s equation are called harmonic.
Thus, the real and imaginary parts of a differentiable function are harmonic via the
Cauchy–Riemann equations. Later, we will see that if f is analytic, u and v are, in
addition, infinitely differentiable.
We can easily see that h ∈ H (Ω) and that the chain rule holds:
\[ h' = (g' \circ f)\, f'. \]
B(z 0 , r ) = {z : |z − z 0 | < r },
the former using the ratio test and the latter from the root test for convergence of
series.
A function f : Ω → C is representable by a power series in Ω if for every open
ball B(z 0 , r ) ⊆ Ω,
\[ f(z) = \sum_{n=0}^{\infty} c_n (z - z_0)^n \]
Let
\[ g(z) = \sum_{n=1}^{\infty} n c_n z^{n-1}. \]
By the root test for convergence, we see that g also converges for all z ∈ Ω.
Now consider
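\[
\frac{f(z) - f(w)}{z - w} - g(w) = \sum_{n=1}^{\infty} c_n \left[\frac{z^n - w^n}{z - w} - n w^{n-1}\right]
\]
\[
\frac{z^n - w^n}{z - w} - n w^{n-1} = (z - w) \sum_{k=1}^{n-1} k\, w^{k-1} z^{n-k-1}
\]
\[
\frac{z^n - w^n}{z - w} - n w^{n-1} = (z - w) G(z, w),
\]
But
\[
\frac{z^n - w^n}{z - w} - n w^{n-1} = \sum_{k=0}^{n-1} z^k w^{n-1-k} - n w^{n-1}.
\]
\[
(z - w) G(z, w) = \sum_{k=0}^{n-2} \alpha_k z^{k+1} w^{n-2-k} - \sum_{k=0}^{n-2} \alpha_k z^k w^{n-1-k}.
\]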
so that
\[
\left|\frac{f(z) - f(w)}{z - w} - g(w)\right| \le |z - w| \sum_{n=2}^{\infty} n^2 |c_n| \rho^{n-2}.
\]
By the ratio test, the series on the right-hand side converges. Therefore, we may
take limits as z → w to get
\[ f'(w) = g(w) \]
as desired.
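\[ |z - a| < r \le |\phi(\zeta) - a| \]
so that
\[ \left|\frac{z - a}{\phi(\zeta) - a}\right| < 1. \]
converges to
\[
\frac{1}{\phi(\zeta) - a} \cdot \frac{1}{1 - \frac{z - a}{\phi(\zeta) - a}} = \frac{1}{\phi(\zeta) - z}.
\]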
So by Fubini’s theorem,
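\[
f(z) := \int_X \sum_{n=0}^{\infty} \frac{1}{\phi(\zeta) - a} \cdot \left(\frac{z - a}{\phi(\zeta) - a}\right)^n d\mu(\zeta) = \sum_{n=0}^{\infty} c_n (z - a)^n
\]
where
\[ c_n = \int_X \frac{d\mu(\zeta)}{(\phi(\zeta) - a)^{n+1}}. \]
\[ |c_n| \le \frac{|\mu(X)|}{r^{n+1}} \]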
so that our series converges absolutely.
Exercises
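1. Let $f(z) = u(x, y) + iv(x, y)$ with $z = x + iy$. If $\frac{\partial u}{\partial x}, \frac{\partial v}{\partial x}, \frac{\partial u}{\partial y}, \frac{\partial v}{\partial y}$ exist, are
continuous on Ω and satisfy the Cauchy–Riemann equations, then show that $f$
is differentiable on Ω.
2. Show that the function $f(z) = \bar{z}$ is not analytic.
3. Verify the Cauchy–Riemann equations for $f(z) = z^3$.
4. For $z = x + iy$, define the differential operator:
\[ \frac{\partial}{\partial \bar{z}} := \frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right). \]
\[ \frac{\partial f}{\partial \bar{z}} = 0. \]
\[ f(z) = \frac{z - w}{1 - \bar{w} z}. \]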
\[
S_N := \sum_{n=1}^{N} a_n b_n = a_N B_N - \sum_{n=1}^{N-1} B_n (a_{n+1} - a_n),
\]
where
\[ B_n = \sum_{j=1}^{n} b_j. \]
∞
zn
n=1
n
converges for all |z| ≤ 1. What is its radius of convergence? Does the series
∞
zn
n=1
\[ \binom{\alpha}{n} = \frac{\alpha(\alpha - 1) \cdots (\alpha - n + 1)}{n!}. \]
Show that the series
\[ \sum_{n=0}^{\infty} \binom{\alpha}{n} z^n \]
\[ \frac{d}{dw} f^{-1}(w) = \frac{1}{f'(z)}, \quad \text{where } w = f(z). \]
(This is the complex analytic version of the inverse function theorem of real
variables.) [Hint: Apply the inverse function theorem from real variables along
with the Cauchy–Riemann equations.]
We have shown that any function representable by a power series in some open subset
of C is holomorphic there. Our next goal is to show that any holomorphic function
has a power series representation.
Definition 4.1 A curve in a topological space X is a continuous map γ : [α, β] →
X . We let γ ∗ be the image of γ. If X = C and the curve γ is piecewise C 1 , then we
call γ a path. A closed path is a path γ : [α, β] → X such that γ(α) = γ(β).
Definition 4.2 If γ : [α, β] → C is a path and f : C → C is continuous on γ ∗ , we
define the path integral of f as
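\[ \int_\gamma f = \int_\gamma f(z)\,dz = \int_\alpha^\beta f(\gamma(t))\gamma'(t)\,dt. \]
Note that
\[ \left|\int_\gamma f\right| \le \max_{z \in \gamma^*} |f(z)| \int_\alpha^\beta |\gamma'(t)|\,dt, \]
\[ \int_{\partial\Delta} f = \int_{[a,b]} f + \int_{[b,c]} f + \int_{[c,a]} f. \]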
Note that this integral is invariant under cyclic permutations of the vertices
(a, b, c).
Throughout, Ω is an open set in C.
Definition 4.3 We define the index of z with respect to a closed path γ by
\[ \operatorname{Ind}_\gamma(z) := \frac{1}{2\pi i} \int_\gamma \frac{dw}{w - z} \]
for $z \in \mathbb{C} \setminus \gamma^* = \Omega$.
We will show that this is always an integer. Intuitively, the index Indγ (z) mea-
sures how many times γ “winds around z.” Thus, sometimes, it is also called the
4.2 Integration over Paths 205
winding number. Note that γ ∗ is compact and hence lies in a bounded disk D whose
complement D c is connected and hence lies in precisely one unbounded connected
component.
Theorem 4.3 Let γ be a closed path, and let Ω be the complement of γ ∗ . Then,
Indγ (z) ∈ Z ∀z ∈ Ω and is constant on any connected component of Ω. Moreover,
the index is zero on the unbounded component of Ω.
Proof By our previous theorem, Indγ (z) is holomorphic (as it is representable as a
power series) and hence continuous. Thus, constancy on any connected component
will follow once we show it is integer-valued.
Let γ : [α, β] → γ ∗ and fix z ∈ Ω. Then,
\[ \operatorname{Ind}_\gamma(z) = \frac{1}{2\pi i} \int_\alpha^\beta \frac{\gamma'(t)\,dt}{\gamma(t) - z}. \]
Observe that Indγ (z) → 0 as |z| → ∞ so that once we show that Indγ (z) is
integer-valued, it will follow that it is zero on the unbounded component. To show
that Indγ (z) is integer-valued, recall that
Thus,
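\[ \frac{\phi'(s)}{\phi(s)} = \frac{\gamma'(s)}{\gamma(s) - z} \]
which is valid except possibly on a finite set $S$ where γ is not differentiable. Everywhere
else, $\phi'(s)$ is continuous. Thus,
\[ \frac{\phi'(s)}{\gamma'(s)} = \frac{\phi(s)}{\gamma(s) - z} \tag{$*$} \]
is continuous on $[\alpha, \beta] \setminus S$.
Now,
\[
\frac{d}{ds}\,\frac{\phi(s)}{\gamma(s) - z} = \frac{(\gamma(s) - z)\phi'(s) - \phi(s)\gamma'(s)}{(\gamma(s) - z)^2} = 0
\]
by virtue of $(*)$.
Thus, $\frac{\phi(s)}{\gamma(s) - z}$ is constant on $[\alpha, \beta] \setminus S$.
But by continuity, this function is constant on all of $[\alpha, \beta]$ because $z \notin \gamma^*$.
Thus,
\[ \frac{\phi(s)}{\gamma(s) - z} = \frac{\phi(\alpha)}{\gamma(\alpha) - z} = \frac{1}{\gamma(\alpha) - z} \]
as $\phi(\alpha) = 1$. Therefore,
\[ \phi(s) = \frac{\gamma(s) - z}{\gamma(\alpha) - z}. \]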
But γ is closed, so γ(α) = γ(β). Thus, φ(α) = φ(β) = 1. Therefore, Indγ (z) is
integer-valued.
Corollary 4.2.1 If γ is a positively oriented circle with center a and radius r , then
\[
\operatorname{Ind}_\gamma(z) = \begin{cases} 1, & \text{if } |z - a| < r; \\ 0, & \text{if } |z - a| > r. \end{cases}
\]
Proof We need to only show Indγ (z) = 1 for |z − a| < r . Parametrize γ : [0, 2π] →
C by γ(θ) = a + r eiθ . Then as Indγ (z) is constant for |z − a| < r ,
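\[
\operatorname{Ind}_\gamma(z) = \operatorname{Ind}_\gamma(a) = \frac{1}{2\pi i} \int_0^{2\pi} \frac{i r e^{i\theta}\,d\theta}{a + r e^{i\theta} - a} = \frac{1}{2\pi i} \int_0^{2\pi} i\,d\theta = 1.
\]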
The corollary makes complete sense from our intuitive understanding of the index
as a winding number.
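The corollary can be observed numerically. The sketch below approximates $\frac{1}{2\pi i}\int_\gamma \frac{dw}{w - z}$ by replacing the path with a fine polygon and applying a midpoint rule on each chord; the function name `winding_number`, the number of steps, and the sample circle are illustrative assumptions rather than anything prescribed by the text.

```python
import cmath
import math


def winding_number(path, z: complex, t0: float, t1: float, steps: int = 20_000) -> complex:
    """Approximate (1/(2 pi i)) * integral over the path of dw / (w - z).

    `path` maps a parameter t in [t0, t1] to a complex number and should be closed.
    """
    total = 0j
    h = (t1 - t0) / steps
    previous = path(t0)
    for k in range(1, steps + 1):
        current = path(t0 + k * h)
        midpoint = 0.5 * (previous + current)
        total += (current - previous) / (midpoint - z)  # chord times integrand at its midpoint
        previous = current
    return total / (2j * math.pi)


def sample_circle(t: float) -> complex:
    """Positively oriented circle with center 1 + i and radius 2."""
    return (1 + 1j) + 2.0 * cmath.exp(1j * t)


if __name__ == "__main__":
    print(winding_number(sample_circle, 1.5 + 0.5j, 0.0, 2.0 * math.pi))  # close to 1
    print(winding_number(sample_circle, 4.0 + 4.0j, 0.0, 2.0 * math.pi))  # close to 0
```

Running the parameter over $[0, 4\pi]$ instead traverses the circle twice, and the computed index of an interior point is then close to 2, in line with the winding-number interpretation.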
Exercises
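\[ \alpha \mapsto \int_\gamma \frac{dz}{z - \alpha} \]
\[ \frac{1}{2\pi i} \int_\gamma \frac{w\,dw}{w - z} = z\,\operatorname{Ind}_\gamma(z). \]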
3. Consider the curve γ given by γ(t) = (cos t, 3 sin t), 0 ≤ t ≤ π. Show that
Iγ (0) = 2.
4. If γ is the unit circle oriented counterclockwise, show that
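\[ \int_\gamma \frac{\cos z}{z}\,dz = \int_\gamma \frac{\sin z}{z^2}\,dz = 2\pi i. \]
by considering
\[ \int_\gamma \frac{e^z}{z}\,dz \]
\[ \int_\gamma F'(z)\,dz = \int_\alpha^\beta F'(\gamma(t))\gamma'(t)\,dt = F(\gamma(\beta)) - F(\gamma(\alpha)). \]
\[ \int_\gamma F'(z)\,dz = 0 \]
\[ \int_\gamma z^n\,dz = 0 \]
for all $n \ge 0$; this also holds for $n \le -2$ if $0 \notin \gamma^*$.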
We are now ready to prove what is sometimes called the Cauchy–Goursat theorem:
Theorem 4.5 (Cauchy's theorem for a triangle). Let $\Delta$ be a triangle in the open set
Ω ⊆ C, and let $p \in \Omega$. Suppose $f$ is continuous on Ω and $f \in H(\Omega \setminus \{p\})$. Then,
\[ \int_{\partial\Delta} f(z)\,dz = 0. \]
Proof First assume $p \notin \Delta$. Let $(a, b, c)$ be the oriented vertices of $\Delta$ (see Fig. 4.1),
and let $a', b', c'$ be the corresponding opposite midpoints (see Fig. 4.2).
Now consider the four oriented triangles $\Delta_1, \Delta_2, \Delta_3$ and $\Delta_4$ formed by joining
these midpoints $(a, c', b')$, $(b, a', c')$, $(c, b', a')$ and $(a', b', c')$, respectively. Then,
\[ J := \int_{\partial\Delta} f(z)\,dz = \sum_{j=1}^{4} \int_{\partial\Delta_j} f(z)\,dz. \]
The orientation is important since the “internal” line integrals are canceled.
For one of the triangles, call it $\Delta_1$, we have
\[ \frac{|J|}{4} \le \left|\int_{\partial\Delta_1} f(z)\,dz\right|. \]
4.3 The Local Cauchy Theorem 209
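Let $z_0 \in \bigcap_{n=1}^{\infty} \Delta_n$. We see that $z_0 \in \Delta$, $z_0 \ne p$. Thus, for any ε > 0, there is an
r > 0 such that
\[
\int_{\partial\Delta_n} f(z)\,dz = \int_{\partial\Delta_n} \bigl[f(z) - f(z_0) - f'(z_0)(z - z_0)\bigr]\,dz
\]
so that
\[
\left|\int_{\partial\Delta_n} f(z)\,dz\right| \le \varepsilon \int_{\partial\Delta_n} |z - z_0|\,|dz| \le \frac{L\varepsilon}{2^n} \cdot \frac{L}{2^n}
\]
for $z \in \Delta_n$. Thus,
\[ |J| \le 4^n \cdot \frac{L^2 \varepsilon}{4^n} = \varepsilon L^2. \]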
Since ε was arbitrary, J = 0.
In the case that $p$ is a vertex of $\Delta$, choose two points of $\partial\Delta$ very close to $p$ and
join them with each other and with $p$, to form a triangle (see Fig. 4.3).
This splits $\Delta$ into a very small triangle containing the vertex $p$ and a larger
non-triangular region. This larger non-triangular region can be cut into two triangular
regions, neither of which contains $p$ (see Fig. 4.4).
As before, the integral over $\partial\Delta$ is the oriented sum of the integrals over these three
triangles. By our initial case, the integrals over the two triangles not containing $p$ are zero. So, the
integral over $\partial\Delta$ is simply the integral over the remaining small triangle that contains
the vertex $p$. Since the triangle can be made arbitrarily small and $f$ is bounded near $p$ (being continuous on
Ω), this integral is zero.
Finally, in the general case that $p$ lies in the interior of the triangle, join lines from
the vertices of $\Delta$ to $p$, so as to split $\Delta$ into three triangles, each having $p$ as a vertex
(see Fig. 4.5).
By the previous calculation, the integral is again zero.
Theorem 4.6 (Cauchy's theorem for a convex set). Let Ω be a convex open set. Let
$p \in \Omega$ and suppose $f$ is continuous on Ω and $f \in H(\Omega \setminus \{p\})$. Then, $f = F'$ for
some $F \in H(\Omega)$. Therefore,
\[ \int_\gamma f(z)\,dz = 0 \]
Proof Let $[a, z]$ denote the oriented line segment joining $a$ to $z$. Fixing $a \in \Omega$, the
convexity of Ω allows us to define
\[ F(z) = \int_{[a,z]} f(\xi)\,d\xi \]
for all $z \in \Omega$. Now for $z, z_0 \in \Omega$, let $\Delta$ be the triangle in Ω with vertices $(a, z_0, z)$.
Then, by Theorem 4.5,
\[ F(z) - F(z_0) = \int_{[z_0,z]} f(w)\,dw \]
so that for $z \ne z_0$
\[
\frac{F(z) - F(z_0)}{z - z_0} - f(z_0) = \frac{1}{z - z_0} \int_{[z_0,z]} [f(w) - f(z_0)]\,dw.
\]
Hence,
\[
\left|\frac{F(z) - F(z_0)}{z - z_0} - f(z_0)\right| \le \varepsilon \quad \text{for } |z - z_0| < \delta.
\]
This proves that $f = F'$. That is, $F \in H(\Omega)$. By Theorem 4.4, the result follows.
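Theorem 4.7 (Cauchy's formula for a convex set) Suppose γ is a closed path in a
convex open set Ω and $f \in H(\Omega)$. If $z \in \Omega$ and $z \notin \gamma^*$, then
\[ f(z)\operatorname{Ind}_\gamma(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)\,dw}{w - z}. \]
Proof Defining
\[
g(w) = \begin{cases} \dfrac{f(w) - f(z)}{w - z}, & \text{for } w \in \Omega,\ w \ne z \\[1ex] f'(z), & \text{if } w = z, \end{cases}
\]
\[ f(z) = \frac{1}{2\pi i} \int_\gamma \frac{f(w)\,dw}{w - z} \]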
for z ∈ B(a, r ). By our earlier theorem, this can be written as a power series.
\[ \int_{\partial\Delta} f(z)\,dz = 0 \]
Proof Let $V$ be a convex open set in Ω. As in the proof of Cauchy's theorem for a
convex set, we can construct
\[ F(z) = \int_{[z_0,z]} f(w)\,dw \]
Exercises
1. Evaluate
\[ \int_\gamma \frac{z^2}{z - 1}\,dz \]
Suppose that not all $c_n$ are zero. Then, there is a smallest $m$ such that $c_m \ne 0$. As
$f(a) = 0$, we have $m > 0$. Define
\[
g(z) = \begin{cases} (z - a)^{-m} f(z), & \forall z \in \Omega \setminus \{a\} \\ c_m, & \text{if } z = a \end{cases}
\]
so that $f(z) = (z - a)^m g(z)$. Also, $g \in H(\Omega \setminus \{a\})$. But the power series
representation of $f$ gives
\[ g(z) = \sum_{k=0}^{\infty} (z - a)^k c_{m+k} \]
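Proof Define
\[
h(z) = \begin{cases} (z - a)^2 f(z), & \forall z \in \Omega \setminus \{a\} \\ 0, & \text{if } z = a. \end{cases}
\]
\[ h(z) = \sum_{n=2}^{\infty} c_n (z - a)^n \]
\[ f(z) - \sum_{k=1}^{m} \frac{c_k}{(z - a)^k} \]
\[ Q_a(z) := \sum_{k=1}^{m} \frac{c_k}{(z - a)^k} \]
for $f$ about $a$.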
Proof Suppose (3) does not hold. Then, there is an r > 0, δ > 0 and w ∈ C such
that | f (z) − w| > δ for all z ∈ B o (a, r ).
Then defining
\[ g(z) = \frac{1}{f(z) - w} \quad \text{for } z \in B^o(a, r), \]
\[ |g(z)| = \frac{1}{|f(z) - w|} < \frac{1}{\delta}. \]
\[ |f(z) - w| \le \frac{1}{c} \quad \forall z \in B(a, r_1). \]
Hence, f is bounded, and again by our previous theorem, f has a removable
singularity at a. So (1) holds.
Case 2. g(a) = 0. Then, we can write
for some $g_1 \in H(B(a, r))$ with $g_1(a) \ne 0$. Thus, $h(z) = 1/g_1(z)$ is holomorphic in
$B(a, r_2)$ for some $0 < r_2 < r$ and $h$ has no zeros in $B(a, r_2)$. Arguing as before, we
get that $g_1 \in H(B(a, r_2))$. Therefore,
\[
f(z) - w = \frac{1}{g(z)} = (z - a)^{-m} \frac{1}{g_1(z)} = (z - a)^{-m} h(z)
\]
Example 4.1 The function f (z) = e1/z is holomorphic in C \ {0} and has an essen-
tial singularity at z = 0.
We saw above how two holomorphic functions that agree on a set with limit points
must be equal. A similar statement can be made about functions with poles. Suppose
f and g are holomorphic in Ω except for a pole at a of order n. Then, (z − a)n f (z) and
(z − a)n g(z) have removable singularities at a so there are functions f 1 , g1 ∈ H (Ω)
such that (z − a)n f (z) = f 1 (z) and (z − a)n g(z) = g1 (z) ∀z ∈ Ω \ {a}. Both of
these functions agree for all z in Ω \ {a} and so f (z) = g(z) for all z ∈ Ω \ {a}.
Exercises
\[ \frac{z + 1}{z} \]
about $z = 0$ and $z = 1$.
2. Find the Laurent series expansion of
\[ \frac{z}{z^2 + 1} \]
about $z = i$.
3. Suppose that $a \ne 0$. Write down the Laurent expansion of
\[ \frac{1}{z - a} \]
\[ f(z) = \frac{1}{z(z^2 + 1)} \]
8. Suppose that f (z) has an essential singularity at z 0 . Let w ∈ C. Show that there
exists a sequence
z1, z2 , . . .
Sometimes called the maximum principle, the maximum modulus principle is one of
the most important theorems in the theory of complex functions. We derive it below.
The following theorem representing a nice fusion of Fourier series and complex
analysis will serve as a catalyst for a collection of results leading to the maximum
principle.
Theorem 4.13 If
\[ f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n \]
Proof We have
\[ f(a + re^{i\theta}) = \sum_{n=0}^{\infty} c_n r^n e^{in\theta}, \]
and the series is an absolutely convergent Fourier series. Thus, by Parseval's formula
\[
\sum_{n=0}^{\infty} |c_n|^2 r^{2n} = \frac{1}{2\pi} \int_0^{2\pi} |f(a + re^{i\theta})|^2\,d\theta.
\]
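Parseval's formula in this form can be tested on a function whose coefficients are known. The sketch below takes $f(z) = e^z$ and $a = 0$ as an assumed test case, so that $c_n = 1/n!$, and compares $\sum |c_n|^2 r^{2n}$ with the average of $|f|^2$ over equally spaced points of the circle $|z| = r$. The function names and truncation levels are illustrative.

```python
import cmath
import math


def coefficient_side(r: float, terms: int = 60) -> float:
    """sum over n of |c_n|^2 r^(2n) for f(z) = e^z, where c_n = 1/n!."""
    return sum(r ** (2 * n) / math.factorial(n) ** 2 for n in range(terms))


def mean_square_on_circle(f, a: complex, r: float, samples: int = 512) -> float:
    """Equally weighted average of |f(a + r e^{i theta})|^2 over `samples` angles."""
    total = 0.0
    for k in range(samples):
        point = a + r * cmath.exp(2j * math.pi * k / samples)
        total += abs(f(point)) ** 2
    return total / samples


if __name__ == "__main__":
    r = 1.5
    print(coefficient_side(r), mean_square_on_circle(cmath.exp, 0j, r))
```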
The following theorem of Liouville is one of the central theorems of complex
analysis.
4.5 The Maximum Modulus Principle 219
for the power series expansion about z = 0, we see that the series converges for all
z ∈ C since f is entire. By the theorem,
\[
\sum_{n=0}^{\infty} |c_n|^2 r^{2n} = \frac{1}{2\pi} \int_0^{2\pi} |f(a + re^{i\theta})|^2\,d\theta < M^2.
\]
Proof Suppose that the maximum is attained at an interior point a ∈ Ω. Then, there
is an r > 0 such that B(a, r ) ⊆ Ω and
we see that
\[
\sum_{n=0}^{\infty} |c_n|^2 r^{2n} = \frac{1}{2\pi} \int_0^{2\pi} |f(a + re^{i\theta})|^2\,d\theta \le |f(a)|^2 = |c_0|^2,
\]
Proof Since $f(z) \ne 0$ for all $z \in \Omega$, we have $1/f \in H(\Omega)$. Applying the maximum
modulus principle to $1/f$ yields the result.
Theorem 4.16 (The fundamental theorem of algebra). If $f(z) \in \mathbb{C}[z]$ has degree
$n \ge 1$, then there exist unique $\alpha_1, \ldots, \alpha_n \in \mathbb{C}$ such that
\[ f(z) = A(z - \alpha_1) \cdots (z - \alpha_n). \]
Let $M = \max(|a_0|, \ldots, |a_{n-1}|)$. Then, for $|z| \ge \max(1, 2nM)$, we have
\[
|f(z) - z^n| \le |a_{n-1}||z|^{n-1} + \cdots + |a_1||z| + |a_0| \le nM|z|^{n-1} \le \frac{1}{2}|z|^n
\]
since $|z| \ge 1$ and $|z| \ge 2nM$. Thus, for such $z$,
\[ \frac{1}{2}|z|^n \ge |z|^n - |f(z)| \]
so that $|f(z)| \ge \frac{1}{2}|z|^n \ge \frac{1}{2}$, because $|z| \ge 1$. Now, if $f$ has no zeros, then $1/f$ is
entire and $\left|\frac{1}{f(z)}\right| \le 2$ for $|z|$ sufficiently large, say $|z| \ge R$. Thus, $\frac{1}{f(z)}$ is bounded
and entire. By Liouville's theorem, $f$ is constant, contradicting $n \ge 1$. Thus, $f$ has a zero. Now by the
division algorithm,
\[ f(z) = (z - \alpha_1) f_1(z) \]
The following theorem gives an estimate for coefficients of the power series
representation of an analytic function.
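\[ \frac{|f^{(n)}(a)|}{n!} \le \frac{M}{R^n}. \]
Proof We have
\[ f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n \]
for $z \in B(a, R)$ where $c_n = \frac{f^{(n)}(a)}{n!}$. But for $0 < r < R$,
\[
\sum_{n=0}^{\infty} |c_n|^2 r^{2n} = \frac{1}{2\pi} \int_0^{2\pi} |f(a + re^{i\theta})|^2\,d\theta \le M^2
\]
so that $|c_n| r^n \le M$; letting $r \to R$ gives
\[ |c_n| \le \frac{M}{R^n}. \]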
The Schwarz lemma is an important tool for many results in geometric complex
analysis.
Theorem 4.18 (Schwarz's lemma). Let $f(z)$ be analytic in $|z| \le 1$. Suppose that
$f(0) = 0$, and $|f(z)| \le M$ for all $|z| \le 1$. Then,
\[ |f(z)| \le M|z| \]
Proof Let
\[
g(z) = \begin{cases} f(z)/z, & \text{for } z \ne 0; \\ f'(0), & \text{for } z = 0. \end{cases}
\]
\[
\max_{|z| \le 1} |g(z)| = \max_{|z| = 1} |g(z)| = \max_{|z| = 1} \frac{|f(z)|}{|z|} = \max_{|z| = 1} |f(z)| \le M.
\]
The following theorem illustrates how one can use complex analysis to study
functions which are not analytic.
Proof We cannot apply the maximum modulus theorem directly to φ(z) because
it is not an analytic function. So, we proceed as follows. Suppose the maximum is
attained at an interior point z 0 of D. Write
| f j (z 0 )| = c j f j (z 0 ) with |c j | = 1
which is analytic in D.
By the maximum modulus principle, there exists z 1 ∈ ∂D such that
|F(z 1 )| | f 1 (z 1 )| + · · · + | f n (z 1 )| = φ(z 1 )
We can sharpen the result of the previous theorem as follows. The maximum is
attained only on the boundary unless all the f j ’s are constant. This is easily seen
from the formula:
2π
1
c j f j (z 0 ) = | f j (z 0 )| | f (z 0 + r eiθ )|2 dθ max | f j (z)|,
2π 0 z∈∂D
\[ M_f(r) = \max_{|z| = r} |f(z)|. \]
Theorem 4.20 (Hadamard's three circle theorem). Let $f$ be analytic in the annulus
$0 < r_1 \le |z| \le r_3$. Then,
\[ \log M_f(r) \]
\[ r_1 \le r_2 \le r_3, \]
then writing $r_2 = r_1^{\alpha} r_3^{\beta}$ with $\alpha + \beta = 1$, we have
Proof The function g(z) = z m f (z)n is analytic in the annulus for any m ∈ Z and
n ∈ N. By the maximum modulus principle,
for such m, n. By homogeneity, both sides of the expansion can be raised to the power
λ with λ > 0. Thus, the inequality holds for any m ∈ Q and n ∈ Q+ . By continuity,
this extends to m ∈ R and n ∈ R+ . For
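we see that
\[ \left(\frac{r_3}{r_1}\right)^{\lambda} = \frac{M_f(r_1)}{M_f(r_3)} \]
so that
\[ r_1^{\lambda} M_f(r_1) = r_3^{\lambda} M_f(r_3). \]
Let us raise both sides to the power $n$, with $n \in \mathbb{R}^+$. Then setting $m = \lambda n$, we
have
\[ r_1^m M_f(r_1)^n = r_3^m M_f(r_3)^n. \]
Now for $\alpha + \beta = 1$,
Since $r_2 = r_1^{\alpha} r_3^{\beta}$, we get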
Hadamard noticed that the growth rate of the function yields new information
about the function itself. Here is an example.
\[ M_f(r) \le c r^n \]
\[ \frac{|f^{(m)}(0)|}{m!} \le \frac{M_f(r)}{r^m} \le c r^{n-m} \]
which goes to zero as $r \to \infty$ if $m > n$. Thus, $f^{(m)}(0) = 0$ for all $m > n$ so $f$ is a polynomial
of degree at most $n$.
Exercises
1. Let f be analytic and nonzero in a region A. Show that | f | has no strict local
minima in A.
2. Show that the maximum of | sin(x + i y)| for 0 ≤ x ≤ 2π and 0 ≤ y ≤ 2π is
cosh 2π.
3. The functions u(x, y) and v(x, y) defined on R × R are said to be harmonic
conjugates if f (z) = u(x, y) + iv(x, y) with z = x + i y is analytic. Find the
harmonic conjugate of u(x, y) = x 2 − y 2 .
4. Let g be analytic in the region |z| < 1 and assume that |g(z)| = |z| for all |z| < 1.
Show that g(z) = eiθ z for some θ ∈ [0, 2π]. [Hint: Use Schwarz’s lemma.]
5. Consider the function $f(z) = e^{e^z}$ in the region $-\pi/2 \le y \le \pi/2$ where $y = \mathrm{Im}(z)$.
Show that the function is bounded on the edges of this region, but is
unbounded in the interior. (This exercise shows that the maximum modulus principle
does not necessarily hold for unbounded regions. However, with suitable
growth conditions, the Phragmén–Lindelöf theorem extends the principle to certain
unbounded regions. See Sect. 4.12.)
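\[ \Gamma^* = \gamma_1^* \cup \cdots \cup \gamma_n^*. \]
\[ \operatorname{Ind}_\Gamma(\alpha) := \frac{1}{2\pi i} \int_\Gamma \frac{dz}{z - \alpha} \]
and
\[ \tilde{\Gamma}(f) = \int_\Gamma f(z)\,dz. \]
\[ f(z)\operatorname{Ind}_\Gamma(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(w)\,dw}{w - z} \]
then
\[ \int_{\Gamma_1} f(z)\,dz = \int_{\Gamma_2} f(z)\,dz. \]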
Remark 4.3 We had proved this theorem earlier for a convex set Ω. This is now
more general.
Proof Define g : Ω × Ω → C by
\[
g(z, w) = \begin{cases} \dfrac{f(w) - f(z)}{w - z}, & w \ne z \\[1ex] f'(z), & w = z. \end{cases}
\]
\[ h(z) = \frac{1}{2\pi i} \int_\Gamma g(z, w)\,dw. \]
To prove our theorem, we need to show that h(z) = 0 for all z ∈ Ω \ Γ ∗ . To this
end, we first show h ∈ H (Ω). Since g is uniformly continuous over any compact
subset of Ω × Ω, we see that if z n → z ∈ Ω then g(z n , w) → g(z, w) uniformly in
w∈ / Γ ∗ . Thus, h(z n ) → h(z) so that h is continuous in Ω.
Now let Δ be a closed triangle in Ω. Then,
\[
\int_{\partial\Delta} h(z)\,dz = \frac{1}{2\pi i} \int_\Gamma \left(\int_{\partial\Delta} g(z, w)\,dz\right) dw
\]
by Fubini’s theorem.
But for each w ∈ Ω, the function
z → g(z, w)
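\[ \int_{\partial\Delta} g(z, w)\,dz = 0 \]
\[ h_1(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(w)\,dw}{w - z} \]
for $z \in \Omega_1$.
Clearly, if $z \in \Omega \cap \Omega_1$, then $h_1(z) = h(z)$. Thus, there is a function $\phi \in H(\Omega \cup \Omega_1)$
whose restriction to Ω is $h$ and to $\Omega_1$ is $h_1$. By our hypothesis on Γ, $\operatorname{Ind}_\Gamma(\alpha) = 0$
for all $\alpha \notin \Omega$. That is, $\operatorname{Ind}_\Gamma(\alpha) = 0$ for all $\alpha \in \Omega^c$. Thus, $\Omega_1 \supseteq \Omega^c$. Therefore, φ is
entire. Moreover, $\Omega_1$ contains the unbounded component of $(\Gamma^*)^c$. Hence,
\[
\frac{1}{2\pi i} \int_\Gamma f(z)\,dz = \frac{1}{2\pi i} \int_\Gamma \frac{F(z)\,dz}{z - a} = F(a)\operatorname{Ind}_\Gamma(a) = 0.
\]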
Exercises
1. Evaluate
\[ \int_C \frac{2z^2 - 15z + 30}{z^3 - 10z^2 + 32z - 32}\,dz, \]
for some absolute constant C and all n sufficiently large. Show that
|z j | ≤ B for all j = 1, 2, . . . , k.
[Hint: Consider
\[
\sum_{n=0}^{\infty} \left(\sum_{j=1}^{k} z_j^n\right) z^n = \sum_{j=1}^{k} \frac{1}{1 - z_j z}.
\]
By the given estimate, the left-hand side is a power series that converges for
|z| < 1/B. Hence, the right side must be analytic there.]
3. Let f (z, w) be a continuous function of two complex variables z, w, with z in a
region A and w on a curve γ. For each w on γ, assume that f is analytic in z. Let
\[ F'(z) = \int_\gamma \frac{\partial f}{\partial z}(z, w)\,dw. \]
Remark 4.4 The condition that A has no limit points implies that there is no compact
subset of Ω containing infinitely many points of A. Thus, A is at most countable.
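\[ Q_a(z) = \sum_{k=1}^{m} c_k (z - a)^{-k} \]
If Γ is a cycle and $a \notin \Gamma^*$, then
\[
\frac{1}{2\pi i} \int_\Gamma Q_a(z)\,dz = c_1 \operatorname{Ind}_\Gamma(a) = \operatorname{Res}_{z=a} f(z)\,\operatorname{Ind}_\Gamma(a),
\]
since
\[
\frac{1}{2\pi i} \int_C \frac{dz}{z^n} = \begin{cases} 0, & n \ne 1 \\ 1, & n = 1. \end{cases}
\]
\[
\frac{1}{2\pi i} \int_\Gamma f(z)\,dz = \sum_{a \in A} \operatorname{Res}(f; a)\operatorname{Ind}_\Gamma(a).
\]
Proof Let
\[ B = \{a \in A : \operatorname{Ind}_\Gamma(a) \ne 0\} \]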
In order to use the residue theorem, we need to know how to compute the residues
of the functions at their poles.
Let f be meromorphic in Ω with a pole of order n at a ∈ Ω, and let
\[ Q_a(z) = \sum_{k=1}^{n} c_{-k} (z - a)^{-k} \]
Thus,
\[ f(z) = g(z) + Q_a(z) = \sum_{k=-n}^{\infty} c_k (z - a)^k \]
\[ (z - a)^{n-1} \]
\[ \frac{h^{(n-1)}(a)}{(n - 1)!}. \]
Thus,
\[ \operatorname{Res}_{z=a} f(z) = \frac{h^{(n-1)}(a)}{(n - 1)!}. \]
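\[ \frac{1}{2\pi i} \int_{C_R} \frac{dz}{z^4 + 1} = \mathrm{Res}_1 + \mathrm{Res}_3 \]
where $\mathrm{Res}_k = \mathrm{Res}(f; e^{\pi i k/4})$. Let $z_k = e^{\pi i k/4}$. Then, as these are all simple poles,
\[
\mathrm{Res}_k = \lim_{z \to z_k} \frac{z - z_k}{z^4 + 1} = \lim_{z \to z_k} \frac{z - z_k}{g(z)}
\]
where $g(z) = z^4 + 1$. Hence, $\mathrm{Res}_k = \frac{1}{g'(z_k)} = \frac{1}{4z_k^3}$.
Therefore,
\[
\frac{1}{2\pi i} \int_{C_R} \frac{dz}{z^4 + 1} = \frac{1}{4}\left(e^{-3\pi i/4} + e^{-9\pi i/4}\right) = \frac{1}{4}\left(e^{-3\pi i/4} + e^{-\pi i/4}\right) = -\frac{i}{2\sqrt{2}}.
\]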
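\[
\frac{1}{2\pi i} \int_{C_R} f(z)\,dz = \frac{1}{2\pi i} \int_{-R}^{R} f(z)\,dz + \frac{1}{2\pi i} \int_{C_R^*} f(z)\,dz.
\]
Hence,
\[ \int_{-\infty}^{\infty} \frac{dx}{x^4 + 1} = \frac{\pi}{\sqrt{2}}. \]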
This example also illustrates how one can use complex analysis to calculate real
definite integrals.
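Both halves of this computation can be cross-checked numerically. The sketch below sums the residues $1/(4z_k^3)$ at the two poles in the upper half-plane and, separately, integrates $1/(x^4 + 1)$ over a long real interval with a composite Simpson rule. The cutoff $L = 100$, the number of subintervals, and the function names are illustrative assumptions; the neglected tails contribute roughly $2/(3L^3)$.

```python
import cmath
import math


def residue_sum() -> complex:
    """Sum of 1/(4 z^3) over the poles e^{i pi/4} and e^{3 i pi/4} of 1/(z^4 + 1)."""
    poles = [cmath.exp(1j * math.pi * k / 4.0) for k in (1, 3)]
    return sum(1.0 / (4.0 * p ** 3) for p in poles)


def simpson(f, a: float, b: float, n: int = 100_000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def integrand(x: float) -> float:
    return 1.0 / (x ** 4 + 1.0)


if __name__ == "__main__":
    print(2j * math.pi * residue_sum())        # approximately pi / sqrt(2), with zero imaginary part
    print(simpson(integrand, -100.0, 100.0))   # truncated real integral
    print(math.pi / math.sqrt(2.0))
```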
Exercises
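\[ \sum_{j=1}^{K} \frac{z^{a_j}}{1 - z^{n_j}} = \frac{1}{1 - z} \tag{4.1} \]
\[ f(z) = \frac{\sin z}{z}\, e^{itz} \]
\[ \lim_{z \to 0} \frac{\sin z}{z} = 1. \]
Thus,
\[
\frac{1}{2\pi i} \int_{-A}^{A} \frac{\sin x}{x}\, e^{itx}\,dx = \frac{1}{2\pi i} \int_{\Gamma_A} f(z)\,dz. \tag{4.2}
\]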
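Since $\sin z = \frac{e^{iz} - e^{-iz}}{2i}$, the integral on the right-hand side can be rewritten as
\[
\frac{1}{2\pi i} \int_{\Gamma_A} \frac{e^{iz} - e^{-iz}}{2iz}\, e^{itz}\,dz = \frac{1}{2i}\left\{\frac{1}{2\pi i} \int_{\Gamma_A} \frac{e^{i(t+1)z}}{z}\,dz - \frac{1}{2\pi i} \int_{\Gamma_A} \frac{e^{i(t-1)z}}{z}\,dz\right\}.
\]
Defining
\[ \phi_A(s) = -\frac{1}{4\pi} \int_{\Gamma_A} \frac{e^{isz}}{z}\,dz \quad (s \in \mathbb{R}), \]
\[ \phi_A(t + 1) - \phi_A(t - 1). \]
\[ \frac{1}{2\pi i} \int_{\Gamma_A + I} \frac{e^{isz}}{z}\,dz = 0. \]
\[ \frac{1}{2\pi i} \int_{\Gamma_A + II} \frac{e^{isz}}{z}\,dz = 1. \]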
We parametrize the circular arc in each of these integrals. In our first integral,
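\[
0 = \frac{1}{2\pi i} \int_{\Gamma_A} \frac{e^{isz}}{z}\,dz + \frac{1}{2\pi i} \int_0^{-\pi} \frac{e^{isAe^{i\theta}}}{Ae^{i\theta}}\,(iAe^{i\theta})\,d\theta
\]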
−ε
as A → ∞. Thus,
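\[
\left|\int_0^{-\pi} e^{isAe^{i\theta}}\,d\theta\right| \le \int_{-\varepsilon}^{0} |e^{isAe^{i\theta}}|\,d\theta + \int_{-\pi+\varepsilon}^{-\varepsilon} |e^{isAe^{i\theta}}|\,d\theta + \int_{-\pi}^{-\pi+\varepsilon} |e^{isAe^{i\theta}}|\,d\theta.
\]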
Since the integrand is bounded by 1, the first and last integrals are bounded by 2ε.
Letting ε → 0, we see that
\[ \lim_{A \to \infty} \frac{1}{2\pi i} \int_{\Gamma_A} \frac{e^{isz}}{z}\,dz = 0 \tag{4.3} \]
if s < 0. Thus, lim A→∞ φ A (s) = 0 if s < 0. This fact could have been deduced alter-
nately by a simple application of the dominated convergence theorem (see Exercise
1 below).
We now consider the case if s > 0. Proceeding as before and parametrizing the
arc, we have
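\[
1 = \frac{1}{2\pi i} \int_{\Gamma_A} \frac{e^{isz}}{z}\,dz + \frac{1}{2\pi i} \int_0^{\pi} i e^{isAe^{i\theta}}\,d\theta
\]
Therefore,
\[ \lim_{A \to \infty} \frac{1}{2\pi i} \int_{\Gamma_A} \frac{e^{isz}}{z}\,dz = 1 \]
if $s > 0$.
Therefore,
\[
\lim_{A \to \infty} \phi_A(s) = \lim_{A \to \infty} -\frac{1}{4\pi} \int_{\Gamma_A} \frac{e^{isz}}{z}\,dz
= \frac{1}{2i} \lim_{A \to \infty} \frac{1}{2\pi i} \int_{\Gamma_A} \frac{e^{isz}}{z}\,dz
= \begin{cases} \frac{1}{2i}, & s > 0 \\ 0, & s < 0. \end{cases}
\]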
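That is,
\[
\lim_{A \to \infty} \frac{1}{2\pi i} \int_{-A}^{A} \frac{\sin x}{x}\, e^{itx}\,dx = \lim_{A \to \infty} \{\phi_A(t + 1) - \phi_A(t - 1)\}.
\]
In other words,
\[
\lim_{A \to \infty} \int_{-A}^{A} \frac{\sin x}{x}\, e^{itx}\,dx = 2\pi i \lim_{A \to \infty} \{\phi_A(t + 1) - \phi_A(t - 1)\}
= \begin{cases} \pi, & t + 1 > 0 \text{ and } t - 1 < 0; \\ -\pi, & t + 1 < 0 \text{ and } t - 1 > 0; \\ 0, & t + 1 \text{ and } t - 1 \text{ have the same sign.} \end{cases}
\]
The first case means $|t| < 1$. The second case is impossible. The third case means
$|t| > 1$. Therefore,
\[
\lim_{A \to \infty} \int_{-A}^{A} \frac{\sin x}{x}\, e^{itx}\,dx = \begin{cases} \pi, & \text{if } |t| < 1; \\ 0, & \text{if } |t| > 1. \end{cases}
\]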
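\[ -\frac{1}{4\pi} \int_{\Gamma_A} \frac{dz}{z}. \]
\[ \frac{1}{2\pi i} \int_{\Gamma_A + II} \frac{dz}{z} = 1 \]
and
\[
\frac{1}{2\pi i} \int_{II} \frac{dz}{z} = \frac{1}{2\pi i} \int_0^{\pi} \frac{iAe^{i\theta}}{Ae^{i\theta}}\,d\theta = \frac{1}{2}.
\]
Thus,
\[ \int_{\Gamma_A} \frac{dz}{z} = \pi i \]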
4.8 Further Examples 237
Let us consider a related example. This was considered in the previous chapter
and solved using the theorem related to differentiation under the integral sign. It is
instructive to consider it now using complex analysis.
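Let us compute $\int_0^{\infty} \frac{\sin x}{x}\,dx$.
\[
\begin{aligned}
\int_0^{\infty} \frac{\sin x}{x}\,dx &= \frac{1}{2} \int_{-\infty}^{\infty} \frac{\sin x}{x}\,dx \\
&= \frac{1}{2} \int_{-\infty}^{\infty} \frac{e^{ix} - e^{-ix}}{2ix}\,dx \\
&= \frac{1}{2} \lim_{\substack{R \to \infty \\ \varepsilon \to 0}} \left\{\int_{-R}^{-\varepsilon} \frac{e^{ix}}{2ix}\,dx + \int_{\varepsilon}^{R} \frac{e^{ix}}{2ix}\,dx - \int_{-R}^{-\varepsilon} \frac{e^{-ix}}{2ix}\,dx - \int_{\varepsilon}^{R} \frac{e^{-ix}}{2ix}\,dx\right\}.
\end{aligned}
\]
\[
\frac{1}{2\pi i} \int_0^{\pi} \frac{e^{iRe^{i\theta}}}{Re^{i\theta}}\, Rie^{i\theta}\,d\theta = \frac{1}{2\pi} \int_0^{\pi} e^{iR(\cos\theta + i\sin\theta)}\,d\theta \to 0 \quad \text{as } R \to \infty
\]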
by the dominated convergence theorem via an argument similar to the one we used
in our previous example.
It remains to consider the integral along the smaller circle C(ε) (Fig. 4.10):
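[Fig. 4.10: contour with semicircles $C(R)$ and $C(\varepsilon)$ over the points $-R$, $-\varepsilon$, $\varepsilon$, $R$.]
\[ \frac{1}{2\pi i} \int_{C(\varepsilon)} \frac{e^{iz}}{z}\,dz. \]
\[ \lim_{\varepsilon \to 0} \int_{C(\varepsilon)} \frac{g(z)}{z}\,dz = \pi i g(0). \]
\[ \int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \]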
Exercises
[Hint: Consider the path from 0 to R, and then from R to Re2πi/n and then back
to 0.]
\[ \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\,dz = N - P. \]
In other words, the difference between the number of zeros and poles of f can be
counted by integrating the logarithmic derivative of f .
Proof If α ∈ Ω1 is a zero of f of multiplicity m, then we can write
so $\operatorname{Res}_{z=b}(f'/f)(z) = -r$.
The result now follows from the residue theorem.
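The counting formula can be tried out on a polynomial, which has no poles, so the integral of the logarithmic derivative over a circle should return the number of zeros inside. The sketch below uses an equally spaced sum over the circle; the function `count_zeros` and the sample cubic $(z - \tfrac12)(z - 2)(z + 3) = z^3 + \tfrac12 z^2 - \tfrac{13}{2} z + 3$ are hypothetical choices made only for illustration.

```python
import cmath
import math


def count_zeros(f, df, radius: float, center: complex = 0j, samples: int = 4000) -> complex:
    """Approximate (1/(2 pi i)) * integral of f'/f over the circle |z - center| = radius.

    With z = center + radius * e^{i theta} we have dz = i * radius * e^{i theta} d theta,
    so the integral reduces to the average of (f'/f)(z) * radius * e^{i theta}.
    """
    total = 0j
    for k in range(samples):
        w = cmath.exp(2j * math.pi * k / samples)
        z = center + radius * w
        total += df(z) / f(z) * (radius * w)
    return total / samples


def sample_poly(z: complex) -> complex:
    return z ** 3 + 0.5 * z ** 2 - 6.5 * z + 3.0


def sample_poly_derivative(z: complex) -> complex:
    return 3.0 * z ** 2 + z - 6.5


if __name__ == "__main__":
    for radius in (1.0, 2.5, 4.0):
        print(radius, count_zeros(sample_poly, sample_poly_derivative, radius))
```

The radii are chosen so that no zero lies on the circle; the printed values should be close to 1, 2 and 3 respectively.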
Proof By the inequality in the hypothesis, neither f nor g vanishes on γ ∗ . Thus, the
inequality can be rewritten as
\[ \left|1 - \frac{g(z)}{f(z)}\right| < 1 \quad \text{for all } z \in \gamma^*. \]
That is, g(z)/ f (z) lies in the open unit disk centered at 1 for z ∈ γ ∗ .
So if γ is defined on [a, b] and F(z) = g(z)/ f (z), then
Therefore,
Since
\[
\frac{F'(z)}{F(z)} = \frac{g'(z) f(z) - f'(z) g(z)}{f(z)^2} \cdot \frac{f(z)}{g(z)}.
\]
4.9 Rouché’s Theorem 241
Thus,
\[
0 = \frac{1}{2\pi i} \int_\gamma \left(\frac{f'(z)}{f(z)} - \frac{g'(z)}{g(z)}\right) dz
\]
for z ∈ γ ∗ . By Rouché’s theorem, p(z) has three zeros in the unit disk.
We can also give an alternative proof of the fundamental theorem of algebra, using
Rouché’s theorem. Let
for |z| = R. Thus, f and g have the same number of zeros in |z| R. Thus, g has
n zeros.
The proof also shows that all zeros lie in the disk
so that f n (z) and f (z) have the same number of zeros in γ. The result now follows
from this conclusion.
Rouché's theorem allows us to study one-to-one analytic functions and deduce the
open mapping theorem, which states that any non-constant analytic function $f$ is an
open map. That is, $f(U)$ is open for every open $U$. We relegate this to the exercises.
A key tool needed is the following theorem which is of independent interest.
Theorem 4.27 Let $A$ be an open subset of C, and suppose that $f : A \to \mathbb{C}$ is analytic
and one to one. Then, $f'(z_0) \ne 0$ for all $z_0 \in A$.
Proof Suppose that for some $z_0 \in A$, we have $f'(z_0) = 0$. Then, the function $f(z) - f(z_0)$
has a zero of order $k \ge 2$ at $z_0$. As $f$ is one to one, it is
not constant, so the zeros of $f(z) - f(z_0)$ and of $f'(z)$ are isolated. So, there is a δ > 0 such
that $f(z) - f(z_0)$ has no zero and $f'(z) \ne 0$ for $0 < |z - z_0| \le \delta$. In particular,
there is an $m > 0$ such that $|f(z) - f(z_0)| \ge m > 0$ on the circle $|z - z_0| = \delta$.
For $0 < \eta < m$, we apply Rouché's theorem to the
functions $f(z) - f(z_0)$ and $f(z) - f(z_0) - \eta$ to conclude that $f(z) - f(z_0) - \eta$
has $k$ zeros inside the disk $|z - z_0| < \delta$. None of these zeros is $z_0$, and since $f'(z) \ne 0$ in the region
$0 < |z - z_0| \le \delta$, none of them is a multiple zero. So $f$ takes the value $f(z_0) + \eta$ at $k \ge 2$
distinct points. Therefore, $f$ is not one to one, a contradiction.
Exercises
\[ z^{87} + 36z^{57} + 71z^4 + z^3 - z + 1 \]
\[ 2z^5 - 6z^2 + z + 1 \]
with $a_0 = 1$, $a_1 = 0$ and $f$ not constant. Prove that $f$ is one to one in the unit disk
$|z| < 1$ if
\[ \sum_{n=2}^{\infty} n|a_n| \le 1. \]
[Hint: Fix $z_0$ in the unit disk. Let $g(z) = z - z_0$ and $h(z) = f(z) - f(z_0)$ and
apply Rouché's theorem.]
5. By considering the example $f_n(z) = e^{z/n}$, show that the assumption that $f$ be not
identically zero is essential in Hurwitz's Theorem 4.26.
6. Suppose f is analytic and not constant on a region A. Show that f is an open
map.
To what extent does the knowledge of the zeros and poles of a meromorphic function
determine that function? Any polynomial is completely determined by the knowledge
of its zeros and multiplicities. Any polynomial can be written as
\[ A z^r \prod_{i=1}^{n} \left(1 - \frac{z}{a_i}\right) \]
where $a_i$ are the nonzero roots of $f$. Does such a factorization exist for other functions?
To this end, we begin by studying convergence of infinite products.
Let $\{u_n\}_{n=1}^{\infty}$ be a sequence of complex numbers. Let
\[ P_N = \prod_{n=1}^{N} (1 + u_n). \]
\[ P_N = \prod_{n=1}^{N} (1 + u_n) \]
\[ P_N^* = \prod_{n=1}^{N} (1 + |u_n|). \]
Proof Recall that $1 + x \le e^x$ for $x \ge 0$. This immediately implies the bound for
$P_N^*$. To derive the bound for $P_N$, we proceed by induction.
For $N = 1$, this is trivially true. Assume the inequality holds for all subscripts
$\le N$. Then,
\[
P_{N+1} - 1 = P_N(1 + u_{N+1}) - 1 = (P_N - 1)(1 + u_{N+1}) + u_{N+1}
\]
so that
We will say the product (4.4) converges absolutely if the product with u n replaced
by |u n | converges. It is tempting to reduce the study of infinite products to the study of
infinite sums “by taking logarithms.” This can, in fact, be done but one must proceed
with some care to ensure the logarithm is well defined (see exercises below).
\[ f_N(z) = \sum_{n=1}^{N} |u_n(z)|, \]
the uniform convergence implies that for any ε > 0, ∃N0 such that
4.10 Infinite Products and Weierstrass Factorization 245
for $M, N \ge N_0$. In particular,
where $|u_n| \le K$ for $n \le N_0$.
By the previous lemma, the functions
\[ P_N(z) = \prod_{n=1}^{N} (1 + u_n(z)) \]
are uniformly bounded in Ω, say by K.
Since $\sum_{n=1}^{\infty} |u_n(z)|$ converges uniformly in Ω, given ε > 0, ∃$N_0$ such that
\[ \sum_{n=N_0}^{\infty} |u_n(z)| < \varepsilon. \]
Thus, for $M \ge N \ge N_0$,
\[ |P_M(z) - P_N(z)| = |P_N(z)| \left| \prod_{n=N+1}^{M} (1 + u_n(z)) - 1 \right| \le |P_N(z)| \cdot (e^{\varepsilon} - 1) \]
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}} = \prod_{p} \left(1 - \frac{1}{p^{s}}\right)^{-1} \]
converges absolutely for Re(s) > 1, and the infinite product representation holds by virtue of unique factorization. We can apply our theorem to show ζ(s) ≠ 0 for Re(s) > 1 since
\[ \left(1 - \frac{1}{p^{s}}\right)^{-1} = 0 \iff \frac{p^{s}}{p^{s} - 1} = 0 \iff p^{s} = 0 \]
is called a Dirichlet series. Such series play an important role in number theory. In
particular, the study of the distribution of prime numbers hinges on a study of such
series (see Sects. 4.19 and 4.20).
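The equality between the Dirichlet series for ζ(s) and its product over primes can be watched numerically. The following is a minimal sketch at the sample point s = 2; the helper names and the cut-offs are illustrative choices, not part of the text.

```python
import math


def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n (n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def zeta_partial_sum(s, n_max):
    """Partial sum of sum_{n >= 1} 1/n^s."""
    return sum(1.0 / n**s for n in range(1, n_max + 1))


def euler_partial_product(s, p_max):
    """Partial product of prod_p (1 - p^{-s})^{-1} over primes p <= p_max."""
    prod = 1.0
    for p in primes_up_to(p_max):
        prod *= 1.0 / (1.0 - p ** (-s))
    return prod


# Both truncations should be close to pi^2 / 6 for s = 2.
print(zeta_partial_sum(2, 10_000), euler_partial_product(2, 10_000), math.pi**2 / 6)
```

Every factor of the product is nonzero, which is the numerical shadow of the non-vanishing argument given above.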
Let us now set
\[ P_n(z) = \sum_{k=1}^{n} \frac{z^{k}}{k}. \]
\[ P_n'(z) = \sum_{k=1}^{n} z^{k-1} = \frac{z^{n} - 1}{z - 1}. \]
\[ \varphi(z) = \frac{1 - E_n(z)}{z^{n+1}} \]
\[ \sum_{n=1}^{\infty} \left(\frac{r}{|z_n|}\right)^{1+p_n} < \infty \tag{4.5} \]
Remark 4.5 If f has a zero of order k at z = 0, we can apply the above theorem to $f(z)/z^{k}$. The above product expansion is not necessarily unique.
Proof Let
\[ P(z) = \prod_{n=1}^{\infty} E_{p_n}(z/z_n). \]
Then,
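Note that
\[ \left(1 - \frac{z}{n\pi}\right)\left(1 + \frac{z}{n\pi}\right) = 1 - \frac{z^{2}}{n^{2}\pi^{2}}. \]
Thus,
\[ \sin z = z e^{g(z)} \prod_{n=1}^{\infty} \left(1 - \frac{z^{2}}{n^{2}\pi^{2}}\right). \]
By logarithmic differentiation,
\[ \frac{\cos z}{\sin z} = \frac{1}{z} + g'(z) + \sum_{n=1}^{\infty} \frac{2z}{z^{2} - n^{2}\pi^{2}}. \]
In particular,
\begin{align*}
\frac{\cos(ic/2)}{\sin(ic/2)} &= \frac{2}{ic} + g'(ic/2) - \sum_{n=1}^{\infty} \frac{4ic}{c^{2} + 4n^{2}\pi^{2}} \\
&= -\frac{2i}{c} + g'(ic/2) - \sum_{n \neq 0} \frac{2ic}{c^{2} + 4n^{2}\pi^{2}} \\
&= g'(ic/2) - \sum_{n=-\infty}^{\infty} \frac{2ic}{c^{2} + 4n^{2}\pi^{2}}.
\end{align*}
so that
\[ \frac{\cos(ic/2)}{\sin(ic/2)} = g'(ic/2) - i \cdot \frac{e^{c} + 1}{e^{c} - 1}. \]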
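\[ \sin z = Az \prod_{n=1}^{\infty} \left(1 - \frac{z^{2}}{n^{2}\pi^{2}}\right). \]
Now,
\[ \lim_{z \to 0} \frac{\sin z}{z} = 1 \]
Theorem 4.31
\[ \sin \pi z = \pi z \prod_{n=1}^{\infty} \left(1 - \frac{z^{2}}{n^{2}}\right). \]
\[ \sum_{n=1}^{\infty} \frac{1}{n^{2}} = \frac{\pi^{2}}{6}. \tag{4.6} \]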
We leave the deduction of this from the theorem as an exercise to the reader.
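Before attempting the deduction, it may help to see both statements numerically. The sketch below evaluates a truncation of the product in Theorem 4.31 at a sample real point and a partial sum of (4.6); the function names and truncation lengths are illustrative assumptions.

```python
import math


def sine_product(z, n_terms):
    """Partial product  pi*z * prod_{n <= N} (1 - z^2/n^2)."""
    prod = math.pi * z
    for n in range(1, n_terms + 1):
        prod *= 1.0 - (z * z) / (n * n)
    return prod


def basel_partial_sum(n_terms):
    """Partial sum of sum_{n >= 1} 1/n^2."""
    return sum(1.0 / (n * n) for n in range(1, n_terms + 1))


z = 0.3
print(sine_product(z, 10_000), math.sin(math.pi * z))
print(basel_partial_sum(10_000), math.pi**2 / 6)
```

Both truncations converge slowly, with an error of order 1/N, which is consistent with the tails of the series $\sum 1/n^{2}$.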
Exercises
γ : [0, 1] → X.
such that H (s, 0) = γ0 (s) and H (s, 1) = γ1 (s), for all s ∈ [0, 1].
In this case, H is said to be a homotopy from γ0 to γ1 .
One should think of this as giving mathematical expression to the idea that we
can deform γ0 to γ1 continuously.
Recall that a space X is called path-connected if for any a, b ∈ X , there is a
continuous map γ : [0, 1] → X such that γ(0) = a and γ(1) = b.
Now suppose X is path-connected. We say X is simply connected if every closed
path in X is homotopic to a point. That is, it is homotopic to the trivial closed curve
γ(s) = x0 for all s ∈ [0, 1] and some x0 ∈ X .
For example, the annulus is not simply connected.
One can show that if $\gamma_0$ and $\gamma_1$ are homotopic closed paths in a domain Ω ⊂ C, then $\mathrm{Ind}_{\gamma_0}(\alpha) = \mathrm{Ind}_{\gamma_1}(\alpha)$ for all α ∉ Ω.
The idea is that a homotopy H from $\gamma_0$ to $\gamma_1$ induces a family of paths $\gamma_t = H(\cdot, t)$ that depend continuously on t. Thus, the index $\mathrm{Ind}_{\gamma_t}(\alpha)$ depends continuously on t. But as the index is integer-valued, it is independent of t. In particular, if Ω is simply connected, then $\mathrm{Ind}_{\gamma}(\alpha) = 0$ for any closed path γ in Ω and any α ∉ Ω.
Suppose f ∈ H(Ω) where Ω is a simply connected region. Fix $z_0$ ∈ Ω and define
\[ g(z) = \int_{z_0}^{z} f(w)\, dw \]
for z ∈ Ω, where $\int_{z_0}^{z}$ denotes the contour integral over a path in Ω from $z_0$ to z. By our earlier comments on simply connected regions, Cauchy’s theorem (Theorem 4.5) tells us that the integral is path-independent. Thus, g is well defined and is holomorphic. Also, $g' = f$.
As a special case of this, let Ω be simply connected with 0 ∉ Ω. Let f(z) = 1/z. Then, f ∈ H(Ω). Fixing $z_0$ ∈ Ω, take $w_0$ such that $e^{w_0} = z_0$. This we can do in
\[ z_0 = r e^{i\theta}, \quad r > 0 \]
for z ∈ C \ {x ∈ R : x ≤ 0}. This function is called the principal branch of the logarithm. Writing z ∈ Ω, $z = r e^{i\theta}$, −π < θ < π, we see log z = log r + iθ. In particular,
\[ \log i = \frac{i\pi}{2} \quad \text{and} \quad \log(-i) = -\frac{i\pi}{2}. \]
Example 4.7 If instead we take
\[ \Omega = \mathbb{C} \setminus \{x \in \mathbb{R} : x \ge 0\} \]
We can use the logarithm to deal with more complicated integrals such as
\[ \int_{0}^{\infty} f(x) x^{a-1}\, dx, \]
Theorem 4.32 Let f(z) be analytic in C except for a finite number of poles $\alpha_i$ none of which lie on the positive real axis. Let a ∉ Z and suppose
(1) ∃ b > a such that $|f(z)| \le \dfrac{K_1}{|z|^{b}}$ for some constant $K_1$ as |z| → ∞.
(2) ∃ b with 0 ≤ b < a such that $|f(z)| \le \dfrac{K_2}{|z|^{b}}$ as |z| → 0.
Then,
\[ \int_{0}^{\infty} f(x) x^{a-1}\, dx = -\frac{\pi e^{-\pi i a}}{\sin \pi a} \sum_{i} \mathrm{Res}_{z=\alpha_i}\left( f(z) z^{a-1} \right), \]
4.11 The Logarithm 253
[Figure: the contour C, with labels $C_R$, $L_+$, $L_-$, the angle ϕ, and the points −R and R.]
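Proof Write $z^{a-1} = e^{(a-1)\log z}$ where log z is the principal branch of the logarithm defined on the simply connected region
\[ \mathbb{C} \setminus \{x \in \mathbb{R} : x \ge 0\}. \]
\[ \frac{1}{2\pi i} \int_{C} f(z) z^{a-1}\, dz = \sum_{\alpha_i \text{ in } C} \mathrm{Res}_{z=\alpha_i}\left( f(z) z^{a-1} \right). \]
\[ z = r e^{i\varphi}, \quad \varepsilon \le r \le R \]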
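so that
\[ \log z = \log r + i\varphi. \]
Then,
\[ \int_{L_+} f(z) z^{a-1}\, dz = \int_{\varepsilon}^{R} f(r e^{i\varphi}) e^{(a-1)(\log r + i\varphi)} \cdot e^{i\varphi}\, dr = e^{ia\varphi} \int_{\varepsilon}^{R} f(r e^{i\varphi}) r^{a-1}\, dr. \]
On $L_-$, $z = r e^{i(2\pi - \varphi)}$ so that
\[ \log z = \log r + i(2\pi - \varphi) \]
and
\[ \int_{L_-} f(z) z^{a-1}\, dz = \int_{R}^{\varepsilon} f(r e^{i(2\pi - \varphi)}) e^{(a-1)(\log r + i(2\pi - \varphi))} \cdot e^{i(2\pi - \varphi)}\, dr. \]
Now, $e^{i(2\pi - \varphi)} = e^{-i\varphi}$, so our integral is
\[ \int_{L_-} f(z) z^{a-1}\, dz = -e^{2\pi i a} e^{-\varphi i a} \int_{\varepsilon}^{R} f(r e^{-i\varphi}) r^{a-1}\, dr \]
Therefore,
\[ \int_{L_+} f(z) z^{a-1}\, dz + \int_{L_-} f(z) z^{a-1}\, dz = (1 - e^{2\pi i a}) \int_{\varepsilon}^{R} f(r) r^{a-1}\, dr. \]
As ε → 0, this tends to
\[ (1 - e^{2\pi i a}) \int_{0}^{\infty} f(r) r^{a-1}\, dr. \]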
Exercises
1. Let f be a meromorphic function and A its set of poles. Suppose it has a finite number of poles all of which lie in the upper half plane (i.e., have positive imaginary part). Suppose there is a constant K > 0 such that for some fixed δ > 1, we have
\[ |f(z)| \le K/|z|^{\delta}, \]
2. Show that
\[ \int_{-\infty}^{\infty} \frac{\cos x}{x^{2} + 1}\, dx = \frac{\pi}{e}. \]
3. Let f(z) have a simple pole at z = 0. Let C(ε) be the semicircular arc from −ε to ε of radius ε > 0. Show that
5. Let C be the circle of radius 2 centered at the origin and oriented counterclockwise. Evaluate
\[ \frac{1}{2\pi i} \int_{C} \frac{dz}{z^{2}(z - 1)^{3}}. \]
Proof Let B be a bound for f on the two vertical strips. Writing z = x + iy with x, y real, we see that as f has finite order in the region, there are constants K and c such that for all y, we have
\[ |f(x + iy)| \le K e^{|y|^{c}}, \quad a \le x \le b. \]
Let m be an integer ≡ 2 (mod 4) with m > c. For z in the vertical strip, arg z → π/2 as |y| → ∞. Thus, there is a $T_1$ such that for |y| > $T_1$,
\[ \left| \arg z - \frac{\pi}{2} \right| < \frac{\pi}{4m}. \]
Let ε > 0 and consider the function
\[ g_{\varepsilon}(z) = e^{\varepsilon z^{m}} f(z). \]
With our choice of m, it is now easily checked that (see Exercise 1) for |y| ≥ $T_1$,
\[ |g_{\varepsilon}(z)| \le K e^{|y|^{c}} e^{-\varepsilon |z|^{m}/\sqrt{2}} \tag{4.8} \]
so that $g_{\varepsilon}(z) \to 0$ as |z| → ∞ since m > c. Let $T_2$ be chosen so that $|g_{\varepsilon}(z)| \le B$ for |Im(z)| ≥ $T_2$. Thus,
for |z| ≥ $T_2$. Applying the maximum modulus principle to the bounded region
4.12 The Phragmén–Lindelöf Theorem and Jensen’s Theorem 257
\[ a \le \mathrm{Re}(z) \le b, \quad 0 \le |\mathrm{Im}(z)| \le T_2, \]
we find $|f(z)| \le B e^{\varepsilon |z|^{m}}$. Since this is valid for all ε > 0, we can take ε → 0 to deduce
Writing
\[ h(z) = \sum_{n=0}^{\infty} (a_n + i b_n) z^{n} \]
Hence,
\[ |a_n| R^{n} \ll \int_{0}^{2\pi} \left( |\mathrm{Re}(h(Re^{i\theta}))| + \mathrm{Re}\, h(Re^{i\theta}) \right) d\theta \ll R^{\beta + \varepsilon}. \]
Notice that in this theorem the same result holds if the estimate
\[ |f(z)| \ll e^{R_i^{\beta + \varepsilon}} \]
holds for |z| = $R_i$ and $R_i$ is a sequence tending to infinity. This observation will be used later in our proof of the Hadamard factorization theorem.
Theorem 4.35 (Jensen’s theorem). Let f(z) be an entire function of order β such that f(0) ≠ 0. If $z_1, z_2, \ldots, z_n$ are the zeros of f(z) in |z| < R, counted with multiplicity, then
\[ \frac{1}{2\pi} \int_{0}^{2\pi} \log |f(Re^{i\theta})|\, d\theta = \log |f(0)| + \log \frac{R^{n}}{|z_1| \cdots |z_n|}. \]
Proof We may assume, without loss of generality, that f (0) = 1. Also, it is clear
that if the theorem is true for functions g and h, then it is also true for the product gh.
Thus, it suffices to prove it for functions with either no zero or one zero in |z| < R.
Indeed, if f has no zeros in |z| < R, the right-hand side is zero. The left-hand side is
\[ \frac{1}{2\pi i} \int_{|z|=R} (\log f(z))\, \frac{dz}{z}, \]
which by Cauchy’s theorem is zero. Taking real parts gives the desired result.
If f has one zero $z = z_1$ in |z| < R, we consider the contour |z| = R taken in the counterclockwise direction and cut it from $z_1$ to the boundary. We deform the contour so that we go around $z_1$ in a clockwise direction along a circle of radius ε (say). Then, by Cauchy’s theorem with g(z) = log f(z),
\[ 0 = \frac{1}{2\pi i} \int_{C} g(z)\, \frac{dz}{z} \]
An alternative proof of Jensen’s theorem can be given that avoids cutting the plane. One considers
\[ f(z) = \frac{R(z - z_1)}{R^{2} - \bar{z}_1 z}. \]
Then, f(z) is regular for |z| ≤ R. Moreover, |f(z)| = 1 on |z| = R, and |f(0)| = $|z_1|/R$, as a simple calculation shows. Jensen’s theorem is easily verified for this choice of f. But any holomorphic function on |z| ≤ R can be written as the product of a function with no zeros in |z| ≤ R and functions of the form
\[ \frac{R(z - z_i)}{R^{2} - \bar{z}_i z}. \]
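Jensen’s formula lends itself to a direct numerical check. The sketch below does this for a polynomial with prescribed zeros, approximating the circle average by an equally spaced sum; the function name, the sample zeros and the number of sample points are illustrative assumptions, and the code presumes that no zero lies at the origin or on the circle |z| = R.

```python
import cmath
import math


def jensen_sides(zeros, radius, samples=4096):
    """Return (lhs, rhs) of Jensen's formula for f(z) = prod (z - z_k)."""

    def f(z):
        value = complex(1.0)
        for zk in zeros:
            value *= z - zk
        return value

    # Left side: average of log|f| over |z| = radius (equally spaced samples).
    lhs = sum(
        math.log(abs(f(radius * cmath.exp(2j * math.pi * k / samples))))
        for k in range(samples)
    ) / samples

    # Right side: log|f(0)| plus log(R/|z_k|) over the zeros inside the disk.
    rhs = math.log(abs(f(0)))
    rhs += sum(math.log(radius / abs(zk)) for zk in zeros if abs(zk) < radius)
    return lhs, rhs


# Two zeros inside the unit disk and one outside.
print(jensen_sides([0.5, -0.25j, 2.0], 1.0))
```

Zeros outside the disk contribute only through log |f(0)|, which is why the zero at z = 2 does not appear in the sum on the right.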
Exercises
converges for any ε > 0 (here, we have indexed the zeros $z_i$ so that $|z_1| \le |z_2| \le \cdots$).
We will now derive a factorization theorem for entire functions of order 1. A similar
result holds for entire functions of higher order.
\[ (1 - z)e^{z} = 1 - \frac{z^{2}}{2} + \cdots \]
and by Exercise 5 in the previous section. Thus, P(z) represents an entire function. If we write
\[ f(z) = P(z)F(z), \]
then F(z) is an entire function without zeros. If F were of finite order, we could conclude by Theorem 4.34 that $F(z) = e^{g(z)}$, where g(z) is a polynomial.
By the remark after Theorem 4.34, it suffices to show that
\[ |F(z)| \ll e^{R_i^{1+\varepsilon}} \]
to deduce that $F(z) = e^{g(z)}$ where g(z) is of the form A + Bz for certain constants A and B.
To this end, we will choose $R_i$ satisfying
\[ \left| R_i - |z_n| \right| > |z_n|^{-2} \]
for all n. This can be done, since the total measure of the intervals $(|z_n| - |z_n|^{-2}, |z_n| + |z_n|^{-2})$ is bounded by
\[ 2 \sum_{n=1}^{\infty} |z_n|^{-2} < \infty, \]
We write
\[ P(z) = P_1(z) P_2(z) P_3(z), \]
Since
\[ \sum_{|z_n| < \frac{1}{2}R} |z_n|^{-1} \le \left(\tfrac{1}{2}R\right)^{\varepsilon} \sum_{n=1}^{\infty} |z_n|^{-1-\varepsilon}, \]
we get
\[ |P_1(z)| > \exp(-R_i^{1+\varepsilon}). \]
For $P_2(z)$,
\[ \left| \left(1 - \frac{z}{z_n}\right) e^{z/z_n} \right| \ge e^{-2} |z - z_n| / 2R_i \gg R_i^{-3} \]
and
\[ \sum_{|z_n| > 2R} |z_n|^{-2} < (2R)^{-1+\varepsilon} \sum_{n=1}^{\infty} |z_n|^{-1-\varepsilon}. \]
so that
\[ |F(z)| < \exp(R^{1+4\varepsilon}). \]
Proof Without loss of any generality, we may suppose that none of the $a_n$ are equal to zero, since we can always add the principal part at zero to our function, if necessary. As the function
\[ P_n\left(\frac{1}{z - a_n}\right) \]
is analytic at z = 0, we can write down its power series expansion. Let $Q_n(z)$ be the sum of the terms of degree less than $d_n$ (with $d_n$ being a natural number to be chosen) so that
\[ \left| P_n\left(\frac{1}{z - a_n}\right) - Q_n(z) \right| \le C_n \left| \frac{z}{a_n} \right|^{d_n}, \]
valid for $|z| \le |a_n|/2$ (say), where $C_n$ is a suitable positive constant greater than 1. We choose $d_n$ inductively so that $d_1 < d_2 < \cdots$ and
Thus,
\[ \frac{|C_n|^{1/d_n}}{|a_n|} \to 0, \]
converges absolutely and uniformly for z in any compact set not containing the $a_n$’s. Indeed, for any given R, let N be such that $|a_N| \ge R$ and split the series as
4.13 Entire Functions of Order 1 263
\[ \sum_{n=1}^{N} \left[ P_n\left(\frac{1}{z - a_n}\right) - Q_n(z) \right] + \sum_{n=N+1}^{\infty} \left[ P_n\left(\frac{1}{z - a_n}\right) - Q_n(z) \right]. \]
The first part is a finite sum. If |z| ≤ R/2, the second part is bounded by
\[ \sum_{n=N+1}^{\infty} \left( C_n / |a_n|^{d_n} \right) z^{d_n}, \]
which has a radius of convergence equal to infinity, by the root test. The finite sum has the desired polar part for |z| ≤ R. As this is true for every R, this completes the proof of the theorem.
Exercises
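2. Prove that
\[ \cos \pi z = \prod_{n=1}^{\infty} \left(1 - \frac{4z^{2}}{(2n - 1)^{2}}\right). \]
\[ f(z) = f(0) + \sum_{n=1}^{\infty} \left( \frac{b_n}{z - a_n} + \frac{b_n}{a_n} \right). \]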
It is easily verified that the integral represents an analytic function in the half plane
Re(z) > 0. Integration by parts shows
Γ (z + 1) = zΓ (z). (4.9)
\[ \Gamma(z) = \frac{\Gamma(z + 1)}{z}, \]
and the right-hand side is analytic if Re(z) > −1. Thus, by this inductive procedure,
it is easy to see that Γ (z) extends to a meromorphic function in the entire complex
plane with simple poles at z = 0, −1, −2, . . . .
We will prove that 1/Γ (z) is an entire function of order 1 and derive its Hadamard
factorization. To this end, we first derive a second functional equation satisfied by
the Γ function. The following result has the appearance of being a consequence of
Theorem 4.32 but cannot be deduced from that theorem.
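Theorem 4.39 We have
\[ \int_{0}^{\infty} \frac{v^{x-1}\, dv}{1 + v} = \frac{\pi}{\sin \pi x} \]
where C is the contour taken along the real axis from ε to R, then in the positive direction along the circle $c_1$ of radius R centered at the origin and then back along the real axis to z = ε and finally around the circle $c_2$ of radius ε centered at the origin, taken in the negative direction.
The function
\[ \frac{z^{x-1}}{1 + z} \]
\[ e^{\pi i (x-1)}. \]
We will take ε < 1 < R so that integrating the function along the contour indicated above shows by Cauchy’s theorem
\[ \int_{\varepsilon}^{R} \frac{u^{x-1}\, du}{1 + u} + \int_{c_1} \frac{z^{x-1}\, dz}{1 + z} + \int_{R}^{\varepsilon} \frac{(u e^{2\pi i})^{x-1}\, du}{1 + u} + \int_{c_2} \frac{z^{x-1}\, dz}{1 + z} = (2\pi i) e^{\pi i (x-1)}. \]
so that
\[ \left| \int_{c_1} \frac{z^{x-1}\, dz}{1 + z} \right| \le 2\pi R \cdot \frac{R^{x-1}}{R - 1} = \frac{2\pi R^{x}}{R - 1}, \]
which gives
\[ \int_{0}^{\infty} \frac{u^{x-1}}{1 + u}\, du = \frac{\pi}{\sin \pi x}. \]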
for x, y > 0.
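Proof For x, y > 0, we have
\[ \Gamma(x)\Gamma(y) = \left( \int_{0}^{\infty} e^{-t} t^{x-1}\, dt \right) \left( \int_{0}^{\infty} e^{-u} u^{y-1}\, du \right). \]
The interchanging of integrals is easily justified by Fubini’s theorem. This last integral is
\[ 2 \int_{0}^{\pi/2} (\cos \theta)^{2x-1} (\sin \theta)^{2y-1}\, d\theta, \]
\[ \int_{0}^{1} \lambda^{x-1} (1 - \lambda)^{y-1}\, d\lambda, \]
Putting
\[ v = \frac{\lambda}{1 - \lambda} \]
\[ \Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin \pi x} \]
for 0 < x < 1.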
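The reflection formula is easy to test numerically on the interval 0 < x < 1. The following minimal sketch uses the standard library's `math.gamma`; the function name `reflection_gap` and the sample points are illustrative.

```python
import math


def reflection_gap(x):
    """Return Gamma(x) * Gamma(1 - x) - pi / sin(pi * x) for 0 < x < 1."""
    return math.gamma(x) * math.gamma(1.0 - x) - math.pi / math.sin(math.pi * x)


for x in (0.1, 0.25, 0.5, 0.9):
    print(x, reflection_gap(x))  # each gap should be at rounding-error level
```

At x = 1/2 the identity reduces to $\Gamma(1/2)^{2} = \pi$.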
Theorem 4.42 1/Γ (z) is an entire function with simple zeros at z = 0, −1, −2, . . . .
we see that Γ (z)Γ (1 − z) is regular except when z is an integer, in which case it has
a simple pole.
We also see from this functional equation that since Γ (z) is regular in Re(z) > 0,
Γ (1 − z) has simple poles at z = 1, 2, 3, . . . . Therefore,
Exercises
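1. Prove that
\[ \Gamma\left(\frac{1}{2}\right) = \sqrt{\pi}. \]
\[ \Gamma(2x)\, \Gamma\left(\frac{1}{2}\right) = 2^{2x-1}\, \Gamma(x)\, \Gamma\left(x + \frac{1}{2}\right) \]
for x > 0.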
\[ \Gamma(x + c) \sim x^{c}\, \Gamma(x). \]
The student may be familiar with the classical Stirling’s formula that
\[ n! \sim \left(\frac{n}{e}\right)^{n} \sqrt{2\pi n}, \]
as n tends to infinity. As the Γ function interpolates the factorials, it is natural to ask whether the asymptotic formula extends to the complex domain. This is the goal of this section. The classical Stirling’s formula will be a corollary of our discussion.
Theorem 4.43 (Stirling’s formula) We have
\[ \Gamma(x) \sim e^{-x} x^{x - 1/2} \sqrt{2\pi} \]
as x → ∞.
Proof By partial summation, we know that for a natural number n,
\[ \log \Gamma(n) = \log (n - 1)! = \left(n - \frac{1}{2}\right) \log n - n + c_1 + o(1) \]
as n → ∞ (and with $c_1$ an absolute constant). If x is not an integer, let us write x = n + c for some 0 < c < 1. By Exercise 3 of the previous section, we have
\[ \Gamma(n + c) \sim n^{c}\, \Gamma(n), \]
so that
\[ \log \Gamma(x) = \left(x - \frac{1}{2}\right) \log x - x + c_1 + o(1). \]
We can use the duplication formula to evaluate $c_1$. Indeed, on the one hand we have from above
\[ \log \Gamma(2x) = \left(2x - \frac{1}{2}\right) \log 2x - 2x + c_1 + o(1). \]
On the other hand, by the duplication formula (Exercise 2 of the last section) we have
\[ \log \Gamma(2x) = (2x - 1) \log 2 + \log \Gamma(x) + \log \Gamma\left(x + \frac{1}{2}\right) - \frac{1}{2} \log \pi, \]
which is equal to
\[ \left(2x - \frac{1}{2}\right) \log 2x - 2x - \frac{1}{2} \log 2 + 2c_1 - \frac{1}{2} \log \pi + o(1), \]
so that
\[ c_1 = 2c_1 - \frac{1}{2} \log \pi - \frac{\log 2}{2}. \]
Thus, as required
\[ c_1 = \log \sqrt{2\pi}. \]
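The asymptotic in Theorem 4.43 can be observed numerically by comparing logarithms, which avoids overflow for large x. This is a minimal sketch using the standard library's `math.lgamma`; the function name `stirling_log_ratio` and the sample values of x are illustrative.

```python
import math


def stirling_log_ratio(x):
    """log Gamma(x) minus the logarithm of e^{-x} x^{x - 1/2} sqrt(2*pi)."""
    approx = -x + (x - 0.5) * math.log(x) + 0.5 * math.log(2.0 * math.pi)
    return math.lgamma(x) - approx


for x in (10.0, 100.0, 1000.0):
    # The difference tends to 0; numerically it is close to 1/(12 x).
    print(x, stirling_log_ratio(x), 1.0 / (12.0 * x))
```

The comparison with 1/(12x) reflects the next term of the classical Stirling series, which is not needed for the theorem itself.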
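\[ \frac{\Gamma'(z)}{\Gamma(z)} = \int_{0}^{1} \left( 1 - (1 - t)^{z-1} \right) \frac{dt}{t} - K. \]
\[ \frac{\Gamma(z - h)\Gamma(h)}{\Gamma(z)} = \int_{0}^{1} t^{h-1} (1 - t)^{z-h-1}\, dt = \frac{1}{h} + \int_{0}^{1} \left( (1 - t)^{z-h-1} - 1 \right) t^{h-1}\, dt. \]
\[ \frac{1}{\Gamma(z)} \left( \Gamma(z) - \Gamma'(z) h + \cdots \right) \left( \frac{1}{h} + K + \cdots \right) = \frac{1}{h} - \frac{\Gamma'(z)}{\Gamma(z)} + K + O(h). \]
\[ = \frac{1}{h} + \int_{0}^{1} \left( (1 - t)^{z-1} - 1 \right) \frac{dt}{t} + O(h), \]
\[ \frac{dt}{t} = \sum_{n=0}^{\infty} (1 - t)^{n}\, dt \]
in the integrand and integrate term by term to obtain the result. The step is valid for z > 1 and by analytic continuation for all z unequal to a negative integer. This completes the proof.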
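\[ \frac{1}{\Gamma(z)} = e^{\gamma z} z \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right) e^{-z/n}, \]
\[ \frac{1}{\Gamma(z)} = e^{Bz} z \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right) e^{-z/n} \]
\begin{align*}
0 &= B + \sum_{n=1}^{\infty} \left( \log\left(1 + \frac{1}{n}\right) - \frac{1}{n} \right) \\
&= B + \lim_{N \to \infty} \sum_{n=1}^{N} \left( \log\left(1 + \frac{1}{n}\right) - \frac{1}{n} \right) \\
&= B - \gamma.
\end{align*}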
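The identification B = γ can be illustrated numerically. The sketch below sums $1/n - \log(1 + 1/n)$ to approximate Euler's constant and then evaluates a truncation of the product for 1/Γ(z) at a sample point; the function names, the truncation lengths and the hard-coded decimal value of γ are assumptions made for the illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # decimal value of Euler's constant


def euler_gamma_estimate(n_terms):
    """Partial sum of sum_{n <= N} (1/n - log(1 + 1/n))."""
    return sum(1.0 / n - math.log1p(1.0 / n) for n in range(1, n_terms + 1))


def reciprocal_gamma_product(z, n_terms, b=EULER_GAMMA):
    """Partial product  e^{bz} z prod_{n <= N} (1 + z/n) e^{-z/n}."""
    prod = math.exp(b * z) * z
    for n in range(1, n_terms + 1):
        prod *= (1.0 + z / n) * math.exp(-z / n)
    return prod


print(euler_gamma_estimate(10_000), EULER_GAMMA)
print(reciprocal_gamma_product(1.5, 10_000), 1.0 / math.gamma(1.5))
```

Both truncations converge at rate 1/N, so the agreement improves only slowly as more terms are taken.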
Exercises
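1. Show that
\[ \log \Gamma(z) = \left(z - \frac{1}{2}\right) \log z - z + \frac{1}{2} \log 2\pi + \int_{0}^{\infty} \frac{[u] - u + \frac{1}{2}}{u + z}\, du. \]
\[ \log \Gamma(z) = \left(z - \frac{1}{2}\right) \log z - z + \frac{1}{2} \log 2\pi + O\left(\frac{1}{|z|}\right) \]
5. Show that
\[ \frac{\Gamma'(z)}{\Gamma(z)} = \log z + O\left(\frac{1}{|z|}\right) \]
for |z| → ∞ in the sector −π + δ < arg z < π − δ for any fixed δ > 0.
6. Prove that Γ(s) has poles only at s = 0, −1, −2, . . . , and that these are simple, with
\[ \mathrm{Res}_{s=-k}\, \Gamma(s) = (-1)^{k}/k!. \]
7. Show that
\[ e^{-1/x} = \frac{1}{2\pi i} \int_{(\sigma)} x^{s}\, \Gamma(s)\, ds, \]
8. Let $f(s) = \sum_{n=1}^{\infty} a_n/n^{s}$ be an absolutely convergent Dirichlet series in the half
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}}, \quad \mathrm{Re}(s) > 1, \]
where
\[ w(x) = \sum_{n=1}^{\infty} e^{-n^{2}\pi x}. \]
Show that the integral converges absolutely for all s ∈ C. This gives the analytic continuation and functional equation
\[ \Xi(s) = \Xi(1 - s), \]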
1. At the boundary of the region of convergence, i.e., at |x| = 1, the series may
converge or diverge. In Chap. 1, we have already seen examples of this phenomenon.
Abel’s theorem states that if the series converges at a boundary point, then it is
reasonably well behaved in the sense that it is continuous at that point. More precisely,
if
\[ \sum_{n=0}^{\infty} a_n = A, \tag{4.10} \]
then
\[ \lim_{x \to 1^{-}} \sum_{n=0}^{\infty} a_n x^{n} = A. \tag{4.11} \]
[6]. However, our presentation is simplified and our theorem is more general. We
derive as a consequence an assortment of prime number theorems following the
arrangement of Serre [7].
Exercises
1. Let Λ(n) = log p if $n = p^{a}$ for some prime p and zero otherwise. Show that the prime number theorem is equivalent to
\[ \psi(x) := \sum_{n \le x} \Lambda(n) \sim x, \]
as x → ∞.
2. Prove that
\[ \sum_{p \le x} \log p = \psi(x) + O(x^{1/2} \log x), \]
The following analytic theorem of Newman [5] is the key result that will be used
to prove the Tauberian theorem. The proof is an application of Cauchy’s residue
theorem. Newman’s novel idea was the insertion of a new kernel into the relevant
integral, playing a role similar to that of the Fejér kernel in standard proofs of the
Tauberian theorem.
Theorem 4.47 For t ≥ 0, let f(t) be a bounded and locally integrable function and let $g(s) := \int_{0}^{\infty} f(t) e^{-st}\, dt$ for Re(s) > 0. If g(s) has an analytic continuation to Re(s) ≥ 0, then $\int_{0}^{\infty} f(t)\, dt$ exists and equals g(0).
Proof For T > 0, let $g_T(s) = \int_{0}^{T} f(t) e^{-st}\, dt$. This integral converges for all values of s, and it is easy to see that $g_T(s)$ is an entire function. We need to show that
We will denote Re(s) by σ. Fix R > 0 and consider the positively oriented contour
C shown in Fig. 4.12. Here, δ > 0 (depending on R) is chosen small enough so that
g(s) is analytic on C . Indeed, as g(s) is analytic on the line σ = 0, one can cover
the vertical strip from (0, R) to (0, −R) with open balls, on each of which g(s) is
analytic. Compactness of this strip allows one to obtain a finite subcover, which then
gives the desired δ.
4.17 The Analytic Theorem 275
[Fig. 4.12: the contour C, with its parts $C_-$ and $C_+$.]
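Using $s = Re^{i\theta}$ and $R \cos \theta = \sigma$ on $C_+$, we obtain the following estimate for the kernel
\[ \left| e^{sT} \left(1 + \frac{s^{2}}{R^{2}}\right) \frac{1}{s} \right| = e^{\sigma T} \left| \frac{1}{Re^{i\theta}} + \frac{e^{i\theta}}{R} \right| = e^{\sigma T}\, \frac{2|\cos \theta|}{R} = e^{\sigma T}\, \frac{2|\sigma|}{R^{2}}. \tag{4.13} \]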
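The kernel identity on the circle |s| = R is an exact computation, and it can be confirmed numerically in a few lines. The sketch below compares the modulus of the kernel with the closed form $e^{\sigma T} \cdot 2|\sigma|/R^{2}$; the function names and the sample values of θ, R and T are illustrative.

```python
import cmath
import math


def kernel_modulus(theta, radius, big_t):
    """|e^{sT} (1 + s^2/R^2) / s| evaluated at s = R e^{i theta}."""
    s = radius * cmath.exp(1j * theta)
    return abs(cmath.exp(s * big_t) * (1 + s * s / radius**2) / s)


def kernel_closed_form(theta, radius, big_t):
    """e^{sigma T} * 2|sigma| / R^2 with sigma = R cos(theta)."""
    sigma = radius * math.cos(theta)
    return math.exp(sigma * big_t) * 2 * abs(sigma) / radius**2


for theta in (0.3, 1.2, 2.0, -2.5):
    print(kernel_modulus(theta, 3.0, 2.0), kernel_closed_form(theta, 3.0, 2.0))
```

The factor $1 + s^{2}/R^{2}$ vanishes at $s = \pm iR$, which is what tames the integrand near the imaginary axis.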
As $g_T(s)$ is entire and the rest of the integrand is analytic to the left of σ = 0, we have by Cauchy’s theorem
\[ I_1 = \frac{1}{2\pi i} \int_{C_-'} g_T(s)\, e^{sT} \left(1 + \frac{s^{2}}{R^{2}}\right) \frac{ds}{s}. \]
That is, we can integrate over the semicircle $C_-'$ instead of $C_-$. Then, noting that σ < 0 in this case, we have
\[ |g_T(s)| = \left| \int_{0}^{T} f(t) e^{-st}\, dt \right| \le M \int_{0}^{T} e^{-\sigma t}\, dt \le M\, \frac{e^{-\sigma T}}{|\sigma|}, \]
and the estimate (4.13) holds on $C_-'$ exactly as it did on $C_+$. We obtain $|I_1| \ll 1/R$ in the same way as done for $|I_{C_+}|$ above. This leaves us with the integral
\[ I_2 := \frac{1}{2\pi i} \int_{C_-} g(s)\, e^{sT} \left(1 + \frac{s^{2}}{R^{2}}\right) \frac{ds}{s}. \]
with the implicit constant depending on R. Recalling that σ < 0 in this region, the above quantity can be compared to the real-valued function $x e^{-x}$, which attains a global maximum of $e^{-1}$ (as can be checked by standard derivative tests). Thus,
giving a bound of $O_R(1/T)$ for the integrand over the arcs of $C_-$. As the length of the arcs is again a function of R which gets absorbed into the implied constant, we see that the contribution to $|I_2|$ from the arcs of $C_-$ is $O_R(1/T)$ as T → ∞. On the vertical strip of $C_-$, as σ = −δ, we have
\[ |e^{sT}| = e^{-\delta T}. \]
The rest of the integrand of $I_2$ is analytic in this region and hence absolutely bounded in terms of R. The contribution to $|I_2|$ from this strip is thus $O_R(e^{-\delta T})$. Putting everything together, we have obtained, as T → ∞,
As R is arbitrary, the right-hand side can be made as small as needed. This completes
the proof.
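\[ \sum_{n \le x} b_n \le \sum_{n=1}^{\infty} b_n \left(\frac{x}{n}\right)^{1+\varepsilon}. \]
The right-hand side is $x^{1+\varepsilon} G(1 + \varepsilon)$ which is of the order of $x^{1+\varepsilon}/\varepsilon$ since $G(1 + \varepsilon) \ll 1/\varepsilon$. Choosing $\varepsilon = (\log x)^{-1}$ gives $B(x) \ll x \log x$. Note that this estimate does not use any information about the behavior of G(s) on Re(s) = 1, except at s = 1.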
Normally, (c) is not needed in the general Wiener–Ikehara Tauberian theorem. One
can deduce it from the other assumptions, as indicated in the concluding remarks.
However, in practically all applications, this condition is found to be readily available
and we retain it for the sake of a shorter proof.
A natural starting point for this and indeed most proofs of the Tauberian theorem
is what is known as Abel’s trick: For Re(s) > 1, we have
\[ G(s) = s \int_{1}^{\infty} \frac{B(x)}{x^{s+1}}\, dx. \tag{4.14} \]
This can be derived using partial summation, as is done in Exercise 2.1.5 of [8]. We
proceed to prove the above Theorem 4.48.
After the change of variable x to $e^{u}$ and then s to s + 1, we have for Re(s) > 0,
\[ \frac{G(s + 1)}{s + 1} - \frac{1}{s} = \int_{0}^{\infty} \frac{B(e^{u}) - e^{u}}{e^{u}}\, e^{-su}\, du, \]
is bounded on account of (c) and the left-hand side has an analytic continuation to Re(s) ≥ 0 by (b). Hence, by Theorem 4.47, the integral
\[ \int_{0}^{\infty} \frac{B(e^{u}) - e^{u}}{e^{u}}\, du = \int_{1}^{\infty} \frac{B(t) - t}{t^{2}}\, dt \tag{4.16} \]
converges. We will show that B(x) ∼ x as x → ∞. Suppose not. Then, $\lim_{x \to \infty} B(x)/x$ either does not exist or does not equal 1 if it exists. In either case, we see that $\limsup_{x \to \infty} B(x)/x > 1$ or $\liminf_{x \to \infty} B(x)/x < 1$. Suppose the former inequality holds (the latter case can be treated similarly). Then, there exists some λ > 1 such that B(x) ≥ λx for infinitely many x. For each such x, since B is an increasing function, we have B(t) ≥ B(x) ≥ λx for t ≥ x, and hence
\[ \int_{x}^{\lambda x} \frac{B(t) - t}{t^{2}}\, dt \ge \int_{x}^{\lambda x} \frac{\lambda x - t}{t^{2}}\, dt = \int_{1}^{\lambda} \frac{\lambda x - vx}{(vx)^{2}}\, x\, dv = \int_{1}^{\lambda} \frac{\lambda - v}{v^{2}}\, dv, \]
For fixed λ, as x → ∞, the above integrals are tails of the convergent integral (4.16)
and can be made arbitrarily small, thereby giving a contradiction. This completes the
proof.
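The contradiction rests on the last integral being a positive number that depends only on λ. A direct computation gives $\int_{1}^{\lambda} (\lambda - v)/v^{2}\, dv = \lambda - 1 - \log \lambda$, which is positive for λ > 1. The sketch below compares this closed form with a midpoint-rule approximation; the function names and the step count are illustrative.

```python
import math


def tail_lower_bound(lam):
    """Closed form of the integral of (lam - v)/v^2 over 1 <= v <= lam."""
    return lam - 1.0 - math.log(lam)


def tail_lower_bound_numeric(lam, steps=100_000):
    """Midpoint-rule approximation of the same integral."""
    h = (lam - 1.0) / steps
    total = 0.0
    for k in range(steps):
        v = 1.0 + (k + 0.5) * h
        total += (lam - v) / (v * v)
    return total * h


for lam in (1.01, 1.5, 3.0):
    print(lam, tail_lower_bound(lam), tail_lower_bound_numeric(lam))
```

Even for λ close to 1 the value is strictly positive (it is roughly $(\lambda - 1)^{2}/2$), so the tails of a convergent integral cannot stay above it.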
The result can be extended to Dirichlet series with complex coefficients as follows.
be a Dirichlet series with complex coefficients. Let A(x) denote the partial sum of the coefficients:
\[ A(x) = \sum_{n \le x} a_n. \]
Suppose there exists a Dirichlet series $G(s) = \sum_{n=1}^{\infty} b_n/n^{s}$ with non-negative coefficients, such that
(a) $|a_n| \le b_n$ for all n.
(b) G(s) is absolutely convergent for Re(s) > 1.
(c) The function G(s) (resp. F(s)) extends meromorphically to the region Re(s) ≥ 1, having no poles except for a simple pole at s = 1 with residue R (resp. r).
(d) $B(x) := \sum_{n \le x} b_n = O(x)$.
Then, as x → ∞,
\[ A(x) = r x + o(x). \]
Proof If the $a_n$’s are real, we consider the series G(s) − F(s), which has non-negative coefficients and satisfies the conditions of Theorem 4.48, giving
\[ \sum_{n \le x} (b_n - a_n) = (R - r)x + o(x), \]
as x → ∞. As B(x) = Rx + o(x), this proves the result in the case of real coefficients. If the coefficients $a_n$ are not real, we define
\[ F^{*}(s) = \sum_{n=1}^{\infty} \bar{a}_n / n^{s} \]
so that
\[ F = \frac{F + F^{*}}{2} + i\, \frac{F - F^{*}}{2i}. \]
and apply the result for real coefficients separately to the real and imaginary part above after checking that the necessary conditions are satisfied.
As remarked earlier, the added condition (c) in Theorem 4.48 is not restrictive for
most practical purposes. However, it is possible to eliminate this condition altogether.
We give a brief sketch of the argument. The key idea is to notice that the known bound
$B(x) \ll x \log x$ implies that for any ε > 0, the function
\[ f_{\varepsilon}(t) := \frac{f(t)}{e^{\varepsilon t}} = \frac{B(e^{t})}{e^{t(1+\varepsilon)}} - \frac{1}{e^{\varepsilon t}} \]
is bounded and satisfies the conditions of Theorem 4.47. Applying this theorem to $f_{\varepsilon}(t)$ and following an elementary argument that exploits the increasing behavior of the function $B(e^{t})/e^{t(1+\varepsilon)}$, one obtains a uniform bound on $\sup_{t \ge 0} |f_{\varepsilon}(t)|$. Letting ε → 0, we see that f(t) must be bounded. A more detailed proof can be found in
[6].
The Riemann zeta function admits an analytic continuation to the entire complex
plane, apart from a simple pole at s = 1. It satisfies a remarkable functional equation
which we derived in an earlier chapter using the Poisson summation formula, essen-
tially following the method of Riemann. However, for the purpose of deriving the
prime number theorem, it is not necessary to have analytic continuation to the entire
complex plane. It suffices to have continuation to Re(s) ≥ 1. We elaborate below.
Firstly, the series
\[ \sum_{n=1}^{\infty} \frac{1}{n^{s}} \]
is analytic in the half plane Re(s) > 1. Also, by an observation due to Euler, the series can be written as an infinite product over prime numbers:
\[ \sum_{n=1}^{\infty} \frac{1}{n^{s}} = \prod_{p} \left(1 - \frac{1}{p^{s}}\right)^{-1}, \]
again valid in the region Re(s) > 1. In Example 4.4, we saw that this Euler product shows that ζ(s) ≠ 0 for Re(s) > 1. This Euler product provides the fundamental link between ζ(s) and prime numbers. Indeed, if we define the von Mangoldt function by
\[ \Lambda(n) := \begin{cases} \log p & \text{if } n = p^{r},\ p \text{ prime} \\ 0 & \text{otherwise,} \end{cases} \]
Since ζ(s) admits an analytic continuation to the entire complex plane, apart from a simple pole at s = 1, we see that the left-hand side is a meromorphic function. If ζ(s) ≠ 0 for Re(s) = 1, then the left-hand side provides an analytic continuation of the series on the right-hand side for Re(s) ≥ 1, except for a simple pole at s = 1 with residue 1. The Wiener–Ikehara Tauberian theorem then implies that
\[ \sum_{n \le x} \Lambda(n) \sim x, \]
and with σ > 1, we have from the Euler product (see Exercise 4)
\[ |\zeta(\sigma)^{3} \zeta(\sigma + it)^{4} \zeta(\sigma + 2it)| = \exp\left( \sum_{p,\, r \ge 1} \frac{1}{r p^{r\sigma}} \left( 3 + 4\cos(tr \log p) + \cos(2tr \log p) \right) \right) \tag{4.18} \]
By (4.17), the term on the right-hand side of (4.18) is ≥ 1. This being valid for every σ > 1, we may take the limit of the left-hand side as $\sigma \to 1^{+}$. If ζ(1 + it) = 0, the term on the left vanishes because $\zeta(\sigma)^{3}$ has a pole of order 3 and $\zeta(\sigma + it)^{4}$ has a zero of order at least 4 at σ = 1. This proves:
Theorem 4.49
\[ \zeta(1 + it) \neq 0, \quad \forall t \in \mathbb{R},\ t \neq 0. \]
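The non-negativity used in the argument is presumably the classical inequality $3 + 4\cos\theta + \cos 2\theta \ge 0$; the statement of (4.17) is not reproduced in this passage, so this is an assumption. It follows from the identity $3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^{2}$, which the sketch below checks on a grid. The function name is illustrative.

```python
import math


def three_four_one(theta):
    """The combination 3 + 4 cos(theta) + cos(2 theta)."""
    return 3.0 + 4.0 * math.cos(theta) + math.cos(2.0 * theta)


grid = [0.01 * k for k in range(700)]  # covers a full period of the cosine
worst_gap = max(abs(three_four_one(t) - 2.0 * (1.0 + math.cos(t)) ** 2) for t in grid)
smallest = min(three_four_one(t) for t in grid)
print(worst_gap, smallest)  # gap at rounding level, minimum essentially 0
```

The minimum is attained at θ = π, where the combination vanishes.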
The Tauberian theorem thus provides the simplest derivation of the prime number
theorem once we understand the minimal requirements needed from the zeta function.
That the zeta function admits an analytic continuation to Re(s) > 0 is easily seen by noting that the alternating series
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^{s}} = (1 - 2^{1-s})\, \zeta(s), \tag{4.19} \]
Exercises
Deduce that
\[ \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^{s}} \]
6. Prove that
\[ \sum_{n \le x} \mu(n) = o(x), \]
as x tends to infinity.
4.20 Further Applications 283
Theorem 4.50 Suppose L(s, ρ) is absolutely convergent for Re(s) > 1 and extends to a meromorphic function on Re(s) ≥ 1 with no zeros or poles except for a pole of order $c_{\chi}$ at s = 1. Then,
\[ \sum_{Nv \le n} \chi(x_v) = (1 + o(1))\, c_{\chi}\, \frac{n}{\log n}. \]
The proof of the above theorem follows by applying the Tauberian theorem to $L'/L$. We refer the reader to the appendix of Chap. 1 of [7] for the details. If Theorem 4.50 holds for all irreducible representations ρ ≠ 1 with $c_{\chi} = 0$, then the Peter–Weyl theorem allows us to deduce that the $x_v$’s are equidistributed with respect to
the normalized Haar measure of G. Special cases of this theorem lead to important
results, among them being the prime number theorem for arithmetic progressions
(see exercises below), Chebotarev density theorem and the Sato–Tate theorem. An
excellent reference for the interested reader wishing to delve deeper into these topics
is [9].
Exercises
\[ -\frac{1}{x^{s}} + \zeta(s, x) - \zeta(s) = \sum_{r=1}^{\infty} \binom{-s}{r} \zeta(s + r)\, x^{r}. \]
\[ \frac{1}{(n + x)^{s}} - \frac{1}{n^{s}} = \sum_{r=1}^{\infty} \binom{-s}{r} \frac{x^{r}}{n^{s+r}}, \]
Deduce that $(2^{s} - 2)\zeta(s)$ extends analytically to Re(s) > 0. [Hint: Put x = 1/2 in the identity of the previous exercise.]
3. For any natural number q, show that
\[ (q^{s} - q)\, \zeta(s) = \sum_{a=1}^{q-1} \left( \zeta\left(s, \frac{a}{q}\right) - \zeta(s) \right). \]
4. Deduce that $(3^{s} - 3)\zeta(s)$ extends analytically to Re(s) > 0. Combining this with Exercise 2, conclude that ζ(s) extends to Re(s) > 0 apart from a possible pole at s = 1.
5. Show that $\lim_{s \to 1^{+}} (2^{s} - 2)\zeta(s) = 2 \log 2$. [Hint: Use question 2.] Deduce from the previous exercise that ζ(s) extends to an analytic function for Re(s) > 0 except for a simple pole at s = 1 with residue 1.
6. For each character $\chi : (\mathbb{Z}/q\mathbb{Z})^{*} \to \mathbb{C}$, show that L(s, χ) can be written as
\[ L(s, \chi) = q^{-s} \sum_{a=1}^{q} \chi(a)\, \zeta(s, a/q). \]
Deduce that L(s, χ) extends analytically to Re(s) > 0 for every non-trivial character χ.
7. With notation as in the previous exercise, show that if L(s, χ) does not vanish on Re(s) = 1, then for a and q coprime,
\[ \sum_{n \le x,\ n \equiv a \ (\mathrm{mod}\ q)} \Lambda(n) \sim \frac{x}{\varphi(q)}, \]
where φ(q) is Euler’s function, equal to the order of the group (Z/qZ)∗ . Deduce
that there are infinitely many primes in arithmetic progressions. (Dirichlet’s the-
orem)
The theme of fusing Fourier transforms and analytic functions is the underlying goal
of the Paley–Wiener theorems. It is sometimes the case that a Fourier transform of
a function f on R can be extended to an analytic function in some region of the
complex plane. For instance, we have already seen that if $f(t) = e^{-|t|}$, then (up to a constant factor) the Fourier transform $\hat{f}(x)$ is
\[ \frac{1}{1 + x^{2}}, \]
which is a rational function and thus defines a meromorphic function in the complex plane. Thus, it would be interesting to investigate under what conditions on f it is the case that $\hat{f}$ is analytic in certain regions of the complex plane. This is the aim of the Paley–Wiener theorems.
For instance, let $F \in L^{2}(0, \infty)$ and define
\[ f(z) = \int_{0}^{\infty} F(t) e^{itz}\, dt, \quad z \in \mathbb{H}, \tag{4.20} \]
where H denotes the upper half plane (i.e., z ∈ C with Im(z) > 0). If Im(z) > δ > 0, and $z_n$ is a sequence of complex numbers with Im($z_n$) > δ tending to z, then a simple application of the dominated convergence theorem shows
\[ \lim_{n \to \infty} \int_{0}^{\infty} |e^{itz_n} - e^{itz}|^{2}\, dt = 0, \]
because the integrand is bounded by the $L^{1}$-function $4e^{-2\delta t}$ and tends to zero for every t > 0. The Cauchy–Schwarz inequality implies then that f is continuous in H. Furthermore, a direct application of Fubini’s theorem and Cauchy’s theorem shows that
\[ \int_{\gamma} f(z)\, dz = 0, \]
\[ f(x + iy) = \int_{0}^{\infty} F(t) e^{-ty} e^{itx}\, dt. \]
for every y > 0 (keeping in mind that our Lebesgue measure was normalized by $1/\sqrt{2\pi}$ in the earlier chapter). This proves:
Theorem 4.51 If f is of the form (4.20), then f is holomorphic in H and its restrictions to horizontal lines in H form a bounded set in $L^{2}(\mathbb{R})$.
where 0 < A < ∞ and F ∈ L 2 (X ) where X = (−A, A). By the method of proof
above, f is entire and it satisfies the growth condition
\[ |f(z)| \le \int_{-A}^{A} |F(t)| e^{-ty}\, dt \le e^{A|y|} \int_{-A}^{A} |F(t)|\, dt. \]
Entire functions that satisfy (4.22) are said to be of exponential type. Our discussion
shows that
Theorem 4.52 Every f of the form (4.21) is an entire function that satisfies (4.22)
and whose restriction to the real axis lies in L 2 (by the Plancherel theorem).
It is remarkable that the converses of the two theorems above are true. This is the
content of the Paley–Wiener theorems.
and
\[ \int_{0}^{\infty} |F(t)|^{2}\, dt = C. \]
Proof To gain some intuition on what F should look like, we can apply the inversion theorem (without worrying whether the conditions are met or not). Thus, our desired F should be of the form
\[ F(t) = e^{ty} \int_{-\infty}^{\infty} f(x + iy) e^{-itx}\, dx = \frac{1}{2\pi} \int_{\mathrm{Im}(z) = y} f(z) e^{-itz}\, dz, \]
and if this argument is correct, the last integral should not depend on which y is chosen, suggesting that perhaps Cauchy’s theorem should be applied. Motivated by this idea, let y be fixed with 0 < y < ∞. For each a > 0, let $\gamma_a$ be the rectangular path with vertices ±a + i and ±a + iy. Since f is holomorphic in the upper half plane, we have by Cauchy’s theorem
\[ \int_{\gamma_a} f(z) e^{-itz}\, dz = 0. \]
We consider only real values of t. Let φ(b) be the integral of $f(z) e^{-itz}$ over the straight-line interval from b + i to b + iy (b ∈ R). Set I = [y, 1] if y < 1 and I = [1, y] if y > 1. Then, by the Cauchy–Schwarz inequality
\[ |\varphi(b)|^{2} = \left| \int_{I} f(b + iu) e^{-it(b + iu)}\, du \right|^{2} \le \left( \int_{I} |f(b + iu)|^{2}\, du \right) \left( \int_{I} e^{2tu}\, du \right). \tag{4.24} \]
Put
\[ L(b) = \int_{I} |f(b + iu)|^{2}\, du, \]
\[ L(a_j) + L(-a_j) \to 0, \quad j \to \infty. \]
\[ \varphi(a_j) \to 0, \quad \varphi(-a_j) \to 0, \quad \text{as } j \to \infty. \]
This holds for every t ∈ R, and the sequence $a_j$ does not depend on t.
Writing $f_y(x)$ for f(x + iy), we see that $f_y \in L^{2}(-\infty, \infty)$, by hypothesis. The Plancherel theorem shows that
\[ \lim_{j \to \infty} \int_{-\infty}^{\infty} |\hat{f}_y(t) - g_j(y, t)|^{2}\, dt = 0, \tag{4.26} \]
\[ F(t) = e^{t} \hat{f}_1(t), \tag{4.27} \]
\[ F(t) = e^{ty} \hat{f}_y(t). \tag{4.28} \]
Notice that (4.27) does not involve y and (4.28) holds for every y > 0. Plancherel’s theorem can be applied to (4.28):
\[ \int_{-\infty}^{\infty} e^{-2ty} |F(t)|^{2}\, dt = \int_{-\infty}^{\infty} |\hat{f}_y(t)|^{2}\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |f_y(x)|^{2}\, dx \le C. \]
By taking y arbitrarily large in the above inequality, we deduce that F(t) = 0 for almost all t < 0. On the other hand, letting y → 0 in the penultimate inequality shows that
\[ \int_{0}^{\infty} |F(t)|^{2}\, dt \le C. \]
In other words,
\[ f(z) = \int_{0}^{\infty} F(t) e^{-yt} e^{itx}\, dt = \int_{0}^{\infty} F(t) e^{itz}\, dt \quad (z \in \mathbb{H}). \]
Finally,
\[ \int_{0}^{\infty} |F(t)|^{2}\, dt = C \]
follows from (4.23) and an application of Plancherel’s theorem. This completes the proof.
Theorem 4.54 Suppose A and C are positive constants and f is an entire function
such that
| f (z)| ≤ Ce A|z| ∀ z ∈ C,
and ∞
| f (x)|2 d x < ∞.
−∞
for all z ∈ C.
Proof Put $f_\epsilon(x) = f(x)e^{-\epsilon|x|}$ for ε > 0 and x ∈ R. We will show that
\[ \lim_{\epsilon\to 0} \int_{-\infty}^{\infty} f_\epsilon(x)e^{-itx}\,dx = 0, \quad t \in \mathbb{R},\ |t| > A. \tag{4.29} \]
Since
\[ \|f_\epsilon - f\|_2 \to 0, \quad \text{as } \epsilon \to 0, \]
the Plancherel theorem would then imply that the Fourier transforms of $f_\epsilon$ converge
in $L^2$ to the Fourier transform F of f. Then, (4.29) will imply that F vanishes outside
of [−A, A] so that the Fourier inversion theorem then implies the result. To prove
(4.29), let, for each real a, $\gamma_a$ be the path defined by
\[ \gamma_a(s) = se^{ia}, \quad 0 \le s < \infty. \]
In other words, $\gamma_a$ is the ray emanating from the origin to infinity with radial angle
a. Put
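\[ P_a = \{w : \operatorname{Re}(we^{ia}) > A\}, \]
and if $w \in P_a$, define
\[ \Phi_a(w) = \int_{\gamma_a} f(z)e^{-wz}\,dz = e^{ia}\int_0^{\infty} f(se^{ia})\exp(-wse^{ia})\,ds. \]
\[ C\exp(-[\operatorname{Re}(we^{ia}) - A]s) \]
\[ \Phi_\pi(w) = -\int_{-\infty}^{0} f(x)e^{-wx}\,dx, \quad \operatorname{Re}(w) < 0. \]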
−∞
Since $f \in L^2(\mathbb{R})$, we see that both $\Phi_0$ and $\Phi_\pi$ are holomorphic in the indicated half
planes. Now it is easily checked that for t ∈ R,
\[ \int_{-\infty}^{\infty} f_\epsilon(x)e^{-itx}\,dx = \int_{-\infty}^{\infty} f(x)e^{-\epsilon|x|}e^{-itx}\,dx = \Phi_0(\epsilon + it) - \Phi_\pi(-\epsilon + it), \]
so to prove our assertion, we need to show that the right-hand side of the above
equation tends to zero as ε → 0 for |t| > A. We will do this by showing that any two
of our functions $\Phi_a$ agree in the intersections of their domains of definition. That is,
they are analytic continuations of each other. Once we have this, then we can replace
$\Phi_0$ and $\Phi_\pi$ by $\Phi_{\pi/2}$ when t < −A and by $\Phi_{-\pi/2}$ for t > A and it is then obvious
that the difference tends to zero as ε → 0. To this end, suppose 0 < b − a < π and
put
\[ c = \frac{a+b}{2}, \quad d = \cos\frac{b-a}{2} > 0. \]
where γ is the circular arc of radius r given by $\gamma(t) = re^{it}$, for a ≤ t ≤ b. Since
If |w| > A/d, it follows that the integral in (4.30) tends to zero as r tends to infinity.
We now apply Cauchy’s theorem:
Since the middle integral tends to zero as r tends to infinity, we conclude that
$\Phi_a(w) = \Phi_b(w)$ for $w = |w|e^{-ic}$ and |w| > A/d. As these two functions agree
on an uncountable set, we conclude that $\Phi_a$ and $\Phi_b$ agree in the intersection of the
half planes in which they were originally defined. This completes the proof.
Exercises
References
1. M. Ram Murty, M. Dewar, H. Graves, Problems in the theory of modular forms, IMSC Lecture
Notes, Hindustan Book Agency, Delhi (2015)
2. A. Tauber, Ein Satz aus der Theorie der unendlichen Reihen. Monatshefte f. Math. 8, 273–277
(1897)
3. S. Ikehara, An extension of Landau’s theorem in the analytic theory of numbers. J. Math.
Phys. Mass. Inst. Technol. 10, 1–12 (1931)
4. E. Landau, Über die Bedeutung einiger neuerer Grenzwertsätze der Herren Hardy und Axer.
Prace mat.-Fiz. 21, 97–177 (1910)
5. D.J. Newman, Simple analytic proof of the prime number theorem. Am. Math. Mon. 87,
693–696 (1980)
6. J. Korevaar, The Wiener-Ikehara theorem by complex analysis. Proc. Am. Math. Soc. 134(4),
1107–1116 (2005)
7. J.P. Serre, Abelian l-Adic Representations and Elliptic curves, Lectures at McGill University
(W. A. Benjamin Inc., New York-Amsterdam, 1968)
8. M. Ram Murty, Problems in Analytic Number Theory, 2nd edn., Graduate Texts in Mathematics
(Springer, New York, 2008)
9. M. Ram Murty, V. Kumar Murty, Non-Vanishing of L-Functions and Applications, Modern
Birkhäuser Classics (Birkhäuser/Springer Basel AG, Basel, 1997)
Chapter 5
Introduction to Algebraic Topology
The subject of algebraic topology was born in 1895, when Henri Poincaré introduced
the notion of the fundamental group π1 (X ) attached to a topological space X . His idea
was to study the topology of a space using group theory and more generally, algebra.
The modern viewpoint is somewhat analogous (in some perspectives, identical) to
the study of normal extensions of fields and their associated Galois groups.
Since the computations of fundamental groups of topological spaces often proved
difficult, Poincaré also introduced homology groups to measure connectivity proper-
ties of topological spaces. These being abelian groups, one could invoke the machin-
ery of Z-modules (or more generally, module theory) to study the topological spaces.
By this route, algebra entered the study of topology, and we call this branch of math-
ematics algebraic topology.
The subject developed rapidly in the early part of the twentieth century. For
example, in 1910, L.E.J. Brouwer proved his famous fixed point theorem stating that
any continuous map
f : D → D,
where D = {z ∈ C : |z| ≤ 1}, has a fixed point. Later, in 1926, S. Lefschetz extended
this to prove his celebrated fixed point theorem. In 1967, M. Atiyah and R. Bott
gave a far-reaching generalization of Lefschetz’s formulas. Their work led to the
development of the Atiyah–Singer index theorem the following year.
A fundamental concept in algebraic topology is the notion of homotopy. If X
and Y are two topological spaces, maps f : X → Y and g : X → Y are said to be
homotopically equivalent if there is a continuous map
h : X × [0, 1] → Y
such that h(x, 0) = f (x) and h(x, 1) = g(x). Two spaces X and Y are said to be
homotopically equivalent if there are maps F : X → Y and G : Y → X such that
G ◦ F is homotopic to the identity map of X and F ◦ G is homotopic to the identity map of Y.
Recall that a topological space is a pair (X, O) where X is a set and O is a collection
of subsets of X satisfying the following conditions:
Elements of O are called open sets. The complement of an open set is called a closed
set.
If X and Y are topological spaces and we speak of maps from X to Y , we always
mean continuous maps. Given two topological spaces X and Y , a map f : X → Y
is called a homeomorphism if it is one to one, onto and f (U ) is open if and only
if U is open. If such a map exists, then X and Y are said to be homeomorphic. The
basic problem of topology is to classify topological spaces up to homeomorphism.
The real line R is an example of a topological space, with the open sets being unions
of open intervals. More generally, a metric space X with metric d is a topological
space with open sets characterized as those subsets O of X with the property that
for every x ∈ O, there is an ε > 0 such that the ball B(x, ε) ⊂ O. Thus, $\mathbb{R}^n$, $\mathbb{C}^n$ with
the usual metric are examples of topological spaces. More generally, n-manifolds
are topological spaces which are worthy of independent study. (Recall that X is said
to be an n-manifold if X is a Hausdorff, second countable topological space and
every element x ∈ X belongs to an open set U which is homeomorphic to an open
set of $\mathbb{R}^n$.)
Any subset S of a topological space X inherits a topology from X , called the
relative topology. Thus, the open sets of S are simply of the form S ∩ O for some
open set O of X . For instance, the n-sphere denoted S n and defined by
Theorem 5.1 (Brouwer fixed point theorem, 1910) Any continuous map $f : B^n \to B^n$
has a fixed point.
F(0, t) = a, F(1, t) = b, 0 ≤ s ≤ 1, 0 ≤ t ≤ 1.
Theorem 5.2 Let X be a convex subset of Rn , a, b ∈ X . Then any two paths from a
to b are homotopic.
Proof Let α and β be two paths from a to b. Define F(s, t) = (1 − t)α(s) + tβ(s).
F is easily seen to be a homotopy between α and β.
Thus, for any convex subset, there is only one equivalence class of homotopic paths
from a to b.
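The straight-line homotopy used in the proof of Theorem 5.2 can be checked numerically. The following Python sketch is only an illustration: the two sample paths `alpha` and `beta` in the convex set R^2, and the helper name `straight_line_homotopy`, are assumptions introduced here and do not come from the text.

```python
def alpha(s):
    """Sample path: the straight segment from a = (0, 0) to b = (1, 0)."""
    return (s, 0.0)

def beta(s):
    """Sample path: a parabolic arc with the same endpoints a and b."""
    return (s, s * (1.0 - s))

def straight_line_homotopy(p, q):
    """Return F(s, t) = (1 - t) p(s) + t q(s), computed coordinate by coordinate."""
    def F(s, t):
        return tuple((1.0 - t) * x + t * y for x, y in zip(p(s), q(s)))
    return F

F = straight_line_homotopy(alpha, beta)

# F(s, 0) is alpha, F(s, 1) is beta, and the endpoints stay fixed for every t.
for k in range(5):
    t = k / 4
    print(t, F(0.0, t), F(0.5, t), F(1.0, t))
```

Convexity is what guarantees that every point (1 − t)α(s) + tβ(s) stays inside X; in a non-convex set the same formula may leave the set.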
Proof Define F(s, t) = α((1 − t)s + tρ(s)). Then, F(s, 0) = α(s) and F(s, 1) =
α ◦ ρ(s). As F is the composition of continuous functions, F is a homotopy between
α and α ◦ ρ.
If α is a path from a to b and β is a path from b to c, then we can define the “product”
of α and β as a path from a to c by setting
\[ (\alpha\beta)(s) = \begin{cases} \alpha(2s) & 0 \le s \le 1/2 \\ \beta(2s-1) & 1/2 \le s \le 1. \end{cases} \]
Proof Clear.
This lemma allows us to define products of homotopy classes. Let [α] be the
homotopy class of α. If α is a path from a to b, β a path from b to c, we define
[α][β] = [αβ],
and
\[ (\alpha(\beta\gamma))(s) = \begin{cases} \alpha(2s) & 0 \le s \le 1/2 \\ \beta(4s-2) & 1/2 \le s \le 3/4 \\ \gamma(4s-3) & 3/4 \le s \le 1 \end{cases} \]
so we see that it is easy to give examples of α, β, γ such that (αβ)γ ≠ α(βγ) (see
Exercise 6). Thus the product of paths (when defined) is not necessarily associative.
However, we can prove that the two products are homotopically equivalent.
Lemma 5.3
(αβ)γ ∼ α(βγ).
\[ (\alpha(\beta\gamma))(s) = ((\alpha\beta)\gamma)(\rho(s)) \]
where
\[ \rho(s) = \begin{cases} s/2 & 0 \le s \le 1/2 \\ s - 1/4 & 1/2 \le s \le 3/4 \\ 2s - 1 & 3/4 \le s \le 1. \end{cases} \]
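The reparametrization identity above can be verified on sample points. In the Python sketch below, the three paths on the real line (from 0 to 1, from 1 to 3, and from 3 to 2) and the helper `concat` are hypothetical choices made for illustration only; the function `rho` is the piecewise linear map ρ from the display.

```python
def concat(p, q):
    """Product path: run p on [0, 1/2] at double speed, then q on [1/2, 1]."""
    def pq(s):
        return p(2 * s) if s <= 0.5 else q(2 * s - 1)
    return pq

def rho(s):
    """The piecewise linear reparametrization of [0, 1] from the text."""
    if s <= 0.5:
        return s / 2
    if s <= 0.75:
        return s - 0.25
    return 2 * s - 1

# Sample composable paths in R (assumed for illustration).
alpha = lambda s: s            # from 0 to 1
beta = lambda s: 1 + 2 * s     # from 1 to 3
gamma = lambda s: 3 - s        # from 3 to 2

left = concat(concat(alpha, beta), gamma)    # (alpha beta) gamma
right = concat(alpha, concat(beta, gamma))   # alpha (beta gamma)

for k in range(17):
    s = k / 16
    print(s, right(s), left(rho(s)))
```

The two printed columns agree at every sample point, while `left(0.25)` and `right(0.25)` differ, so the two products are different functions that are related by ρ.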
aα ∼ α ∼ αb.
If we set
\[ \rho(s) = \begin{cases} 0 & 0 \le s \le 1/2 \\ 2s - 1 & 1/2 \le s \le 1 \end{cases} \]
We insert a word of caution to the reader that α−1 is not to be confused with the
inverse function of α, which may not be defined.
5.2 Homotopic Paths 299
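Lemma 5.5 If $\alpha_0$ and $\alpha_1$ are paths from a to b with $\alpha_0 \sim \alpha_1$, then $\alpha_0^{-1} \sim \alpha_1^{-1}$.
\[ [\alpha]^{-1} := [\alpha^{-1}]. \]
Lemma 5.6 If α is a path from a to b, then $\alpha\alpha^{-1}$ is homotopic to the constant path
at a. That is, $[\alpha\alpha^{-1}] = [a]$.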
Proof Define
\[ F(s,t) = \begin{cases} \alpha(2s) & 0 \le s \le t/2 \\ \alpha(t) & t/2 \le s \le 1 - t/2 \\ \alpha(2-2s) & 1 - t/2 \le s \le 1. \end{cases} \]
Then F(s, 0) = α(0) = a and $F(s, 1) = (\alpha\alpha^{-1})(s)$ so that F is the required homotopy.
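The boundary behaviour of the homotopy F in the proof of Lemma 5.6 can be checked on a concrete path. The sketch below assumes, purely for illustration, that α is the upper unit semicircle in C from a = 1 to b = −1; the helper names `alpha_inverse` and `product` are introduced here and are not from the text.

```python
import cmath

def alpha(s):
    """Sample path in C from a = 1 to b = -1 along the upper unit semicircle."""
    return cmath.exp(1j * cmath.pi * s)

def alpha_inverse(s):
    """The reversed path: alpha traversed from b back to a."""
    return alpha(1 - s)

def product(p, q):
    """Product path: p on [0, 1/2], then q on [1/2, 1]."""
    return lambda s: p(2 * s) if s <= 0.5 else q(2 * s - 1)

def F(s, t):
    """Go out along alpha up to alpha(t), wait there, then come back."""
    if s <= t / 2:
        return alpha(2 * s)
    if s <= 1 - t / 2:
        return alpha(t)
    return alpha(2 - 2 * s)

loop = product(alpha, alpha_inverse)
for k in range(9):
    s = k / 8
    print(s, F(s, 0.0), F(s, 1.0), loop(s))
```

At t = 0 the loop never leaves a, and at t = 1 it agrees with the product of α and its reverse; for intermediate t it travels only as far as α(t).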
Exercises
1. Recall that a basis B for a topological space X consists of open sets such that
every open set of X is a union of elements of B. Prove that the collection of open
intervals in R is a basis for R with the usual Euclidean topology.
2. Let B be a non-empty collection of subsets of a set X . If B is closed under finite
intersections and every element of X belongs to some element of B, show that B
is a basis for some topology on X .
3. Consider Z with the arithmetic progression topology where a basis of open
sets is given by {an + b : n ∈ Z} where a, b range over all integers satisfying
(a, b) = 1. Prove that for each prime p, the set pZ is closed. Deduce that there
are infinitely many primes. (This topology was introduced by H. Furstenberg.)
[Hint: if there are only finitely many primes p, the union ∪ p ( pZ) is closed. The
complement consists of {−1, 1} and is not open.]
4. Prove that the continuous image of a connected space is connected.
5. If X is path-connected, show that it is connected.
6. Give an example to show that the product of paths, when defined, is not necessarily
associative.
7. Show that R and Rn for n ≥ 2 are not homeomorphic.
Consequently,
Exercises
1. Show that the punctured complex plane C \ {0} is path-connected but not simply
connected.
\[ f_*([\alpha]) = [f \circ \alpha]. \]
By Lemma 5.7, this definition does not depend on the choice of the representative
in the homotopy class [α]. If the product path αβ is defined, then f ◦ (αβ) = (f ◦ α)(f ◦ β),
so that
\[ f_*([\alpha][\beta]) = f_*([\alpha])f_*([\beta]). \]
\[ f_*([\alpha]^{-1}) = f_*([\alpha])^{-1}. \]
\[ f_*([b]) = [f(b)]. \]
\[ m : G \times G \to G \]
given by
\[ m(g, h) = gh^{-1} \]
is continuous. For instance, R with the usual topology, (or more generally, Rn ) is
an example of a topological group. An important example is given by S 1 , where we
identify it with the group of complex numbers of absolute value 1.
As indicated in example 3 above, the fundamental group of the product space
X × Y consists of pairs of loops (α, β) with α ∈ π(X ) and β ∈ π(Y ). In the case
of a topological group, we will consider loops based at the identity. Since m : G ×
G → G, the induced map
is given by
[(α, β)] → [m(α, β)] = [α · β]
where the product on the right is multiplication in the group. That is α · β indicates
the path α(s)β(s) for 0 ≤ s ≤ 1. On the one hand, m ∗ is a homomorphism, so that
We deduce:
Theorem 5.6 If G is a topological group, and π(G) consists of loops based at the
identity, then π(G) is abelian.
The computation of fundamental groups is often not easy. The usual technique is
to look for subspaces about which we know the structure of their fundamental group
and then try “to paste” together this information. Here is the simplest case of this
occurrence.
Theorem 5.7 Suppose X is the union of two open sets U and V which are simply
connected. Suppose further that U ∩ V is non-empty and path-connected. Then X
is simply connected.
The idea of the proof is to write any loop of X as a product of loops contained in U
or V . The reader will be easily convinced by drawing a picture to see what is going
on. The formal proof will involve the following lemma.
Lemma 5.8 (Lebesgue’s lemma) Let X be a compact metric space and F an open
cover. Then, there is a δ > 0 such that any subset of X of diameter < δ is contained
in some member of F.
s → f ((s + j − 1)/N ), 0 ≤ s ≤ 1,
r∗ : π(X, a) → π(A, a)
F(x, 0) = x, ∀x ∈ X
F(x, 1) = r (x)
F(a, t) = a ∀ a ∈ A, t ∈ I.
Proof r∗ ◦ i ∗ is the identity map of π(A, a). Since i ◦ r is homotopic to the identity
map, i ∗ ◦ r∗ is the identity map.
5.4 Examples of Some Fundamental Groups 305
We shall use this theorem in two ways. First, we use it to determine when two spaces
have isomorphic fundamental groups. Second, we use it to show that a subspace is
not a deformation retract by proving that certain retracts are not deformation retracts.
A space X is called contractible to a point if there is an x0 ∈ X such that x0 is a
deformation retract of X .
Any convex subset X of $\mathbb{R}^n$ is contractible to a point. To see this, let $x_0 \in X$ and
define f : X × I → X by
\[ f(x, t) = (1-t)x + tx_0. \]
Exercises
π(X × Y, (x0 , y0 ))
such that each of the (necessarily pointed) maps m(x0 , ·) and m(·, x0 ) on (X, x0 )
are homotopic to the identity loop based at x0 . Show that π(X, x0 ) is abelian.
6. Let Δn denote the n-simplex defined by
{(x0 , x1 , . . . , xn ) : x0 + x1 + · · · + xn = 1, xi ≥ 0}.
This is a subset of Rn+1 with the induced topology. For instance, a 0-simplex is
a point, a 1-simplex is homeomorphic to the interval [0, 1] and so on. Show that
π(Δn ) is trivial.
Y X
f
Such a map g is called a lift of f . We begin by showing that if Y is a connected
space, such a lift, if it exists, is essentially unique.
Proof Let
\[ S = \{y \in Y : g(y) = h(y)\}, \]
\[ T = \{y \in Y : g(y) \ne h(y)\}. \]
5.5 Covering Spaces 307
Clearly Y = S ∪ T . If we can show that S is both open and closed, then by the
connectedness of Y , we get S = Y because, by hypothesis, S is non-empty.
Let y ∈ Y and U an open neighborhood of f(y) that is evenly covered by p. Let
V and W be sheets of $p^{-1}(U)$ such that g(y) ∈ V, h(y) ∈ W. If g(y) = h(y), then
V = W since p maps each sheet homeomorphically onto U. If g(y) ≠ h(y), then V and W are
disjoint for the same reason.
Since g and h are continuous at y, there is an open neighborhood N of y such that
g(N) ⊂ V, h(N) ⊂ W. For y ∈ T, V ∩ W = ∅, so that g(z) ≠ h(z) for z ∈ N and
hence N ⊂ T . Thus, T is open. On the other hand, if y ∈ S, then V = W . Moreover,
for each z ∈ N , g(z) must be the unique point v ∈ V such that p(v) = f (z) and
h(z) must be the unique point w ∈ W such that p(w) = f (z). Thus, g = h on N and
N ⊂ S. Hence S is open. This completes the proof.
We now address the problem of lifting of paths. For this purpose, we will use
Lebesgue’s lemma (Lemma 5.8).
[s j−1 , s j ] ⊂ γ −1 (U j ), for 1 ≤ j ≤ m.
Then, α(0) = e0 and p ◦ α = γ on [0, s1 ]. Now perform the same procedure with
e0 = α(0) replaced by α(s1 ) and U1 replaced by U2 , to extend α to the interval
[s1 , s2 ]. After m steps, we have lifted the entire path α. The uniqueness follows from
Lemma 5.9.
We can actually lift homotopies. That is, if we have a family of paths depending con-
tinuously on a parameter, the lifted paths also depend continuously on the parameter.
Proof The uniqueness follows from Lemma 5.9 since I × I is connected. According
to the path lifting theorem, there is a unique path t → et , 0 ≤ t ≤ 1 in E such that
p(et ) = F(0, t), 0 ≤ t ≤ 1. By the same theorem, there exists for each t, a unique
path s → G(s, t), 0 ≤ s ≤ 1, such that G(0, t) = et and ( p ◦ G)(s, t) = F(s, t),
0 ≤ s ≤ 1. This defines the lift G of F. We must show it is continuous.
Let γ : [0, 1] → X be the path defined by
As in the proof of Theorem 5.10, we can find open sets $U_1, \dots, U_m$ and $0 = s_0 <
s_1 < \cdots < s_m = 1$ and ε > 0 such that
Since $e_0 \in V_1$, we can assume also that the initial points $e_t$ belong to $V_1$, for 0 ≤ t ≤ ε,
since they depend continuously on t. As before, we can define G in m steps. The
first step is to define
Now let (E, e) and (X, b) be pointed spaces and let p : (E, e) → (X, b) be a covering
map. That is, p : E → X is a covering map with p(e) = b. Let γ : I → X be a loop
based at b. By the path lifting theorem, there is a unique lift α : I → E such that
α(0) = e. The lift need not be a loop. However, the terminal point α(1) satisfies
p(α(1)) = γ(1) = b, so that α(1) lies in the fiber p −1 (b) over b.
Suppose now that γ1 is another loop in X based at b such that γ1 ∼ γ. Let {γt :
0 ≤ t ≤ 1} be the homotopy so that γ0 = γ. By the homotopy lifting theorem applied
to F(s, t) = γt (s), we obtain a mapping G : I × I → E such that G(0, 0) = e and
Φ : π(X, b) → p −1 (b)
F:I×I →E
\[ p(t) = e^{2\pi it}, \quad t \in \mathbb{R}. \]
Since $p^{-1}(1)$ coincides with Z, we deduce from Theorem 5.12 that the elements of
$\pi(S^1, 1)$ are in one-to-one correspondence with the integers. (In the next section, we
shall see that $\pi(S^1, 1) \cong \mathbb{Z}$.)
Exercises
5.6 Applications
We will apply the theory of covering spaces to compute π(S 1 , 1). We view S 1 as
the subset of C whose absolute value is 1. Recall that the map p : R → S 1 given
by p(t) = e2πit is a covering map. Given any loop γ of S 1 based at 1, we can lift it
uniquely to a path α : I → R with α(0) = 0, by the path lifting theorem. This allows
us to define the degree of the loop γ based at 1 by setting
deg γ = α(1),
\[ \gamma_n(x) = e^{2\pi inx} \]
has degree n. Moreover, no two of these loops are homotopically equivalent, for otherwise
(by the monodromy theorem) their lifts would have the same endpoint, which
is not the case. In addition, if γ is any loop of $S^1$ based at 1, let n = deg γ and consider $\gamma\gamma_n^{-1}$,
whose lift is homotopic to the constant loop at zero. Thus, $\gamma \sim \gamma_n$. This proves:
Theorem 5.13
\[ \pi(S^1, 1) \cong \mathbb{Z}. \]
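The degree of a loop can be approximated by lifting it step by step through the covering map p(t) = e^{2πit}. The Python sketch below is a numerical illustration under an assumption that is not part of the text: the loop is sampled finely enough that consecutive sample points are less than half a turn apart. The names `degree` and `gamma_n` are introduced here.

```python
import cmath
import math

def degree(loop, samples=1000):
    """Approximate deg(loop) by accumulating the lift through p(t) = exp(2*pi*i*t).

    The lift starts at 0; each step adds the small signed angle between
    consecutive sample points, measured in full turns.
    """
    lift = 0.0
    prev = loop(0.0)
    for k in range(1, samples + 1):
        cur = loop(k / samples)
        lift += cmath.phase(cur / prev) / (2 * math.pi)
        prev = cur
    return round(lift)

def gamma_n(n):
    """The loop x -> exp(2*pi*i*n*x) on the unit circle."""
    return lambda x: cmath.exp(2j * math.pi * n * x)

for n in (-2, 0, 1, 3):
    print(n, degree(gamma_n(n)))
```

Multiplying two loops pointwise in the group S^1 adds their degrees in this computation, which is consistent with the degree being a homomorphism to Z.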
This gives us the first example of a topological space with a non-trivial fundamental
group. For any map f : S 1 → S 1 , we have the induced map
given by
F(x, t) = f (t x)/| f (t x)|.
Thus, f˜ is homotopic to the constant map. Since degree is preserved under homotopy,
f˜ has degree zero. On the other hand, the homogenized polynomial
Since z n has degree n we deduce that f˜ has degree n, which is a contradiction. Thus,
for some z ∈ C, f (z) = 0.
\[ \pi(S^1, 1) \xrightarrow{\ i_*\ } \pi(B^2, 1) \xrightarrow{\ r_*\ } \pi(S^1, 1) \]
equal to the identity. But π(B 2 , 1) is trivial, so that r∗ is trivial and the composition
cannot be the identity.
Theorem 5.15 (Brouwer fixed point theorem for n = 2) Any continuous map f :
B 2 → B 2 has a fixed point.
Proof Suppose that f(x) ≠ x for all $x \in B^2$. Define $r(x) \in S^1$ to be the intersection
with $S^1$ of the ray that starts at f(x) and passes through x. Certainly r(x) = x for
$x \in S^1$. By writing an equation for r in terms of f (see Exercise 1), we see that r is
continuous. But this contradicts Lemma 5.10.
One can prove the higher dimensional version of Brouwer’s theorem using higher
homotopy groups. Given a pointed topological space (X, b) we may consider the
loop space (Ω X, b̃) of all loops based at b. By abuse of notation, we will denote
this space by (Ω X, b). One can endow this space with the compact-open topology:
namely the sets
\[ (K; O) = \{f \in \Omega X : f(K) \subset O\}, \quad K \subset I \]
with K a compact subset of I , and O an open set in X , are used as a subbase for a
topology on Ω X . If X were a metric space with metric d, this topology agrees with
the topology induced by the metric d ∗ on Ω X given by
We can now define the higher homotopy groups recursively as follows: $\pi_1(X, b) :=
\pi(X, b)$, $\pi_2(X, b) := \pi_1(\Omega X, b)$ and generally
\[ \pi_n(X, b) := \pi_{n-1}(\Omega X, b), \quad n \ge 2. \]
\[ r : B^n \to S^{n-1}. \]
\[ S^{n-1} \xrightarrow{\ i\ } B^n \xrightarrow{\ r\ } S^{n-1}. \]
\[ \mathbb{Z} \xrightarrow{\ i_*\ } 0 \xrightarrow{\ r_*\ } \mathbb{Z}. \]
We give several applications of the Brouwer fixed point theorem. The first is to
prove:
if and only if m = n.
Proof If $\mathbb{R}^n \cong \mathbb{R}^m$, then
\[ \mathbb{R}^n \setminus \{0\} \cong \mathbb{R}^m \setminus \{0\}. \]
This theorem (called the theorem of the invariance of domain), though not a direct
application of the fixed point theorem, has the same essential idea in its proof as the
fixed point theorem. Its significance is the corollary that the notion of dimension is
well-defined. That is, an n-manifold cannot at the same time be an m-manifold
unless m = n.
Our next application is to linear algebra which plays a fundamental role in the
Google PageRank algorithm.
Proof For any vector $x \in \mathbb{R}^n$ define σ(x) to be the sum of its coordinates. Let $\Delta^{n-1}$
denote the (n − 1)-simplex. Now consider the map $f : \Delta^{n-1} \to \Delta^{n-1}$ given by
\[ f(x) = \frac{Ax}{\sigma(Ax)}. \]
It is not difficult (see Exercise 6) to see that this is indeed a map into $\Delta^{n-1}$. As $\Delta^{n-1}$
is homeomorphic to $B^{n-1}$, we can apply the fixed point theorem to deduce the result.
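A fixed point of the map f(x) = Ax/σ(Ax) can be located numerically for a concrete matrix. The Python sketch below is an illustration under stated assumptions: the 2 × 2 matrix `A` with positive entries is a made-up example, and repeated application of f (the power method) is a technique swapped in here to find the fixed point. The Brouwer theorem used in the proof only guarantees that a fixed point exists; it does not say that iteration converges.

```python
def sigma(x):
    """Sum of the coordinates of x."""
    return sum(x)

def apply(A, x):
    """Matrix-vector product Ax for A given as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def f(A, x):
    """The map x -> Ax / sigma(Ax) on the simplex."""
    y = apply(A, x)
    s = sigma(y)
    return [yi / s for yi in y]

def fixed_point(A, iterations=200):
    """Iterate f starting from the barycenter of the simplex (assumed to converge)."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iterations):
        x = f(A, x)
    return x

A = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical example with positive entries
x = fixed_point(A)
lam = sigma(apply(A, x))        # at a fixed point, Ax = lam * x
print(x, lam)
```

At a fixed point, Ax = σ(Ax)x, so x is an eigenvector with non-negative coordinates summing to 1 and σ(Ax) is the corresponding eigenvalue.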
A theorem of Seifert and van Kampen allows us to write down more fundamental
groups from the fact that $\pi(S^1) \cong \mathbb{Z}$. This theorem is a vast generalization of Theorem
5.7, though essentially based on the same ideas. To state it precisely, we recall the
definition of free products of groups. Given any collection of groups G i for i ∈ I ,
an index set, and a fixed group G with homomorphisms φi : G i → G, we say that a
group G is the free product of the groups G i if and only if the following condition
holds: if H is any other group and we have homomorphisms ψi : G i → H , there
exists a unique homomorphism f : G → H making the following diagram commute.
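(Commutative diagram: $\phi_i : G_i \to G$, $\psi_i : G_i \to H$ and $f : G \to H$, with $f \circ \phi_i = \psi_i$.)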
One can show that given any collection of groups, their free product exists. Intu-
itively, the free product is to be thought of as a group consisting of “words” formed
using the “alphabet” of elements of the G i for i ∈ I . More precisely, elements of
the free product consist of finite sequences (x1 , x2 , . . . , xn ) where each xk belongs
to some G i , any two successive terms belong to different groups and no term is the
identity element of any G i . One can define multiplication of “words” in the obvious
way. The essential point is to ensure that all of this is well-defined and that G exists.
If G 1 and G 2 are two groups, we denote their free product as G 1 ∗ G 2 . Now
suppose that X is a topological space with subspaces U and V . We have the induced
homomorphisms
π(U ) → π(X ), π(V ) → π(X )
as well as
\[ \pi(U \cap V) \xrightarrow{\ i_*\ } \pi(U), \quad \pi(U \cap V) \xrightarrow{\ j_*\ } \pi(V), \]
Theorem 5.18 (Seifert and van Kampen, 1933) Suppose that X is a topological
space with path-connected subspaces U and V . If U ∩ V is path-connected and
a ∈ U ∩ V , then
\[ \pi(X, a) \cong (\pi(U, a) * \pi(V, a))/H \]
where H is the normal subgroup generated by the words $i_*(g)j_*(g)^{-1}$ where g ∈
π(U ∩ V, a).
This theorem allows us to compute, for instance, the fundamental group of the “figure
eight” as the free product Z ∗ Z since π(S 1 ) is Z and the intersection consists of a
point in this case. More generally, the same logic leads us to conclude that the
fundamental group of a flower with n petals is the n-fold free product of the integers.
In particular, if X is the union of two path-connected subspaces U and V with
U ∩ V simply connected, we deduce from Theorem 5.18 that π(X ) is isomorphic to
the free product of π(U ) and π(V ).
Exercises
p : X → X/G
is a covering map.
Observe that in both of the examples above, we have an even action. Thus, in both
cases,
p : X → X/G
This defines an action (see Exercise 1). This, however, is not an even action because
even actions are fixed point free. That is, for an even action, g · x = x
for any element x implies g = 1 (see Exercise 2). In this example, $z = i = \sqrt{-1}$ is
fixed by
5.7 Group Actions and Orbit Spaces 317
\[ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \ne \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
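The fixed point claimed above is easy to confirm by direct computation. In the Python sketch below, the helper names `act` and `matmul`, the second matrix `T` and the test point `z` are assumptions introduced for illustration; the matrix `S` is the one displayed in the text, acting by z ↦ (az + b)/(cz + d).

```python
def act(m, z):
    """Action of a 2x2 matrix ((a, b), (c, d)) on z by (a z + b) / (c z + d)."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

S = ((0, -1), (1, 0))   # the matrix from the text
T = ((1, 1), (0, 1))    # translation z -> z + 1, chosen here for illustration

print(act(S, 1j))                      # the point i is sent to itself
z = 0.5 + 2j
print(act(matmul(S, T), z), act(S, act(T, z)))
```

The second line of output shows the compatibility (ST) · z = S · (T · z) at one sample point, which is the property asked for in Exercise 1; a single numerical check is of course not a proof.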
p −1 (b) = {g · e : g ∈ G}.
given by γ → gγ .
Proof Consider two loops γ, η in X/G, based at b. Let $\widetilde{\gamma\eta}$ be the unique lift of γη
that begins at e. Then, $\widetilde{\gamma\eta} = \tilde{\gamma}(\eta_a)$ where $a = \tilde{\gamma}(1)$ and $\eta_a$ is the unique lift of η that
begins at a. This is because $\tilde{\gamma}(\eta_a)$ is also a lift of γη that begins at e. Let $\tilde{\eta}$ be the
unique lift of η that begins at e. Since $g_\gamma \cdot \tilde{\eta}$ is also a lift of η that begins at $g_\gamma \cdot e$ and
$a = \tilde{\gamma}(1) = g_\gamma \cdot e$, it follows that $\eta_a = g_\gamma \cdot \tilde{\eta}$. Hence,
Proof The kernel consists of elements γ ∈ π(X/G, b) with φ(γ) = 1. These are the
elements for which γ̃(1) = e which are loops based at e.
\[ \pi(X/G, b) \cong G. \]
This result allows us to deduce (yet again) the fact that $\pi(S^1) \cong \mathbb{Z}$ from the even
action of Z on R given by translation. Similarly, we can deduce the fundamental
group of the torus.
We give one more example involving the Möbius strip. Let
X = {z ∈ C : z = x + iy, x ∈ R, 0 ≤ y ≤ 1}.
\[ T(z) = \bar{z} + 1 + i \]
and let G be the cyclic group generated by T . Then, X/G is homeomorphic to the
Möbius strip. Moreover, the action is even and
X → X/G
is a covering map. We deduce immediately that the fundamental group of the Möbius
strip is Z.
We will compute the fundamental group of another important topological space,
namely the Klein bottle. This will turn out to be a non-abelian group.
Define the following transformations of the plane:
\[ A \cdot (x, y) = (x + 1, y), \qquad B \cdot (x, y) = (-x, y + 1). \]
Notice that
(AB) · (x, y) = A · (−x, y + 1) = (−x + 1, y + 1)
and
(B A) · (x, y) = B · (x + 1, y) = (−x − 1, y + 1).
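The two computations above, and the relation ABA = B mentioned in the exercises, can be replayed mechanically. The Python sketch below assumes the formulas A · (x, y) = (x + 1, y) and B · (x, y) = (−x, y + 1), which are read off from the displayed computations; the helper `compose` is introduced here for illustration.

```python
def A(p):
    """Horizontal translation: (x, y) -> (x + 1, y)."""
    x, y = p
    return (x + 1, y)

def B(p):
    """Glide reflection: (x, y) -> (-x, y + 1)."""
    x, y = p
    return (-x, y + 1)

def compose(*maps):
    """compose(f, g, h)(p) = f(g(h(p))): the rightmost map is applied first."""
    def composed(p):
        for m in reversed(maps):
            p = m(p)
        return p
    return composed

p = (2, 5)
print(compose(A, B)(p))       # (-x + 1, y + 1)
print(compose(B, A)(p))       # (-x - 1, y + 1)
print(compose(A, B, A)(p), B(p))
```

Since AB and BA already disagree at a single point, the group generated by A and B is not abelian, while ABA and B agree at every point tested.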
{(x, y, z) : x 2 + y 2 ≤ 1, z ≥ 0}.
It is easy to see that this is homeomorphic to the unit disk with antipodal points on
the boundary identified. Thus,
\[ P_2(\mathbb{R}) \cong B^2/R \]
(b)
\[ S^2 \# \underbrace{P_2(\mathbb{R}) \# \cdots \# P_2(\mathbb{R})}_{n \text{ times}}, \quad n \ge 1; \]
In case (a), we say the surface is orientable and in the second, non-orientable. g
(and sometimes n) is called the genus of the surface.
Theorem 5.21 was initiated and carried through in the orientable case by A.F.
Möbius (1790–1868) in a paper he submitted for the Grand Prix de Mathématiques
of the Paris Academy of Sciences. He was 71 at the time. The jury did not consider
any of the submitted papers worthy of the prize and so the work of Möbius appeared
as just another mathematical paper in their proceedings. It is not clear who finally
proved the theorem in its full generality. Some ascribe it to H.R. Brahana, whose
paper appeared in the Annals of Mathematics in 1922.
Exercises
1. Show that the action of S L 2 (Z) on the upper half plane H given by
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot z = \frac{az + b}{cz + d} \]
is indeed an action.
2. If G acts evenly on a topological space, and g · x = x for some x ∈ X and g ∈ G,
show that g = 1.
3. Notice that S 3 is homeomorphic to
show that AB A = B. Deduce that the fundamental group of the Klein bottle is the
group generated by the two elements A and B with only the relation AB A = B.
It is clear that Aut (E/ X, p) acts on E simply by φ · x = φ(x). We would like to make
this into an even action (or equivalently, a properly discontinuous action) as defined
in the previous section. To ensure this, we need to assume that E is locally path-
connected. That is, each point of E has a neighborhood which is path connected. It
is not hard to see that the continuous image of a locally path-connected space is also
locally path-connected (Exercise 1).
5.8 Automorphisms of Covering Spaces 321
f :Y →X
(Y, c) (X, b)
f
The diagram above induces the following diagram of fundamental groups:
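(Commutative diagram: $\tilde{f}_* : \pi(Y, c) \to \pi(E, e)$, $p_* : \pi(E, e) \to \pi(X, b)$ and $f_* : \pi(Y, c) \to \pi(X, b)$, with $p_* \circ \tilde{f}_* = f_*$.)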
Therefore, this is a necessary condition for the lift f˜ to exist. It turns out that this is
also a sufficient condition provided Y is locally path-connected. Though the proof
of this is not difficult, it is rather long and we omit it. For future reference, we state
it as:
Theorem 5.23 (Lifting criterion) Let p : (E, e) → (X, b) be a covering map. Sup-
pose Y is connected, locally path-connected and
f : (Y, c) → (X, b)
\[ \tilde{f} : (Y, c) \to (E, e) \]
We use this lifting criterion to determine when we can have “isomorphic” cover-
ings of a given topological space X .
p1 ∗ π(E 1 , e1 ) = p2 ∗ π(E 2 , e2 )
(E2 , e2 ) (X, b)
p2
Since E 2 is connected and locally path-connected, p2 has a lift p̃2 by Theorem
5.23. By the uniqueness lemma (Lemma 5.9), this is unique and p1 ◦ p̃2 = p2 . Sim-
ilarly, we have a unique lift p̃1 : (E 1 , e1 ) → (E 2 , e2 ) so that p2 ◦ p̃1 = p1 . We thus
have
p̃1 p̃2
(E 1 , e1 ) → (E 2 , e2 ) → (E 1 , e1 )
ψ := ( p̃2 ◦ p̃1 ) : (E 1 , e1 ) → (E 1 , e1 ).
Note that
This lemma has a converse, the proof of which we leave as an exercise (Exercise 5).
satisfying p2 ◦ φ = p1 , then
p1 ∗ π(E 1 , e1 ) = p2 ∗ π(E 2 , e2 ).
φ : E1 → E2
p∗ π(E, e1 ) = p∗ π(E, e2 )
for any e1 ∈ p −1 (b). By Lemma 5.13, there is an element φ ∈ Aut (E/ X, p) with
φ(e2 ) = e1 . Thus, if p(e1 ) = p(e2 ), there is an element φ ∈ Aut (E/ X, p) such that
φ(e1 ) = e2 . Conversely, it is clear that if φ(e1 ) = e2 for some φ ∈ Aut (E/ X, p),
then p(e1 ) = p(e2 ). Therefore, Aut (E/ X, p) identifies points in E the same way p
does. This gives a one to one correspondence say ψ between X and E/G. These two
spaces are also homeomorphic as topological spaces because U in X is open if and
only if p −1 (U ) is open in E if and only if ψ(U ) is open in E/G. This completes the
proof.
Theorem 5.26 Let p : (E, e) → (X, b) be a covering map with E connected and
locally path-connected. If p∗ π(E, e) π(X, b) then
This theorem allows us to compute the fundamental group of any topological space
X by first finding a covering E which is simply connected and locally path-connected
and then computing its automorphism group. The question of when a given space
possesses such a covering will be discussed in the next chapter.
For now, let us observe the similarity of Theorem 5.27 with the main theorem
in Galois theory. The fundamental group has been identified as the group of auto-
morphisms of the simply connected covering E. This analogy has other features that
parallel Galois theory, and these we take up in the next chapter.
Exercises
1. Show that the continuous image of a locally path connected space is locally path-
connected.
2. Show that a connected space which is locally path connected is path-connected.
3. Give an example of a space which is connected but not locally path-connected.
4. If p : (E, e) → (X, b) is a covering map, show that p∗ : π(E, e) → π(X, b) is
injective.
5. Prove Lemma 5.14.
5.9 The Universal Covering Space 325
X = ∪n≥1 Cn
where Cn is the circle in R2 of radius 1/n and center (1/n, 0) is not semilocally
simply connected at (0, 0) (see Exercise 1). Thus, this space fails to have a universal
covering.
so that
p −1 (U ) = ∪[α]∈ p−1 (U ) [U, α]
We claim that the [V, α] are either equal or disjoint. Indeed, if [γ] ∈ [V, α] ∩ [V, β],
by Exercise 2,
[V, γ] = [V, β], [V, γ] = [V, α]
so that [V, α] = [V, β]. Thus, the [V, α]’s are disjoint. The map pα = p|[V, α] :
[V, α] → V is continuous and surjective. To prove it is injective, suppose pα (αβ) =
$p_\alpha(\alpha\gamma)$. Then β and γ have the same endpoints. The path $\beta\gamma^{-1}$ is a closed path in
V and thus equivalent to the constant path by our choice of V. Thus, β ∼ γ and
[αβ] = [αγ]. To complete the proof, we need to check that $p_\alpha^{-1}$ is continuous. We
leave this to the reader (Exercise 4).
Finally, one needs to check that E is simply connected. First we show that E is
path-connected. Let ẽ denote the class of the constant path at b. Let [α] ∈ E. Define
α̃ : I → E by α̃(s) = [αs ] for s ∈ I and αs (t) = α(st), for t ∈ I . Then, α̃ is a path
in E from ẽ to [α]. This establishes path-connectedness.
If β is a closed path in E based at ẽ, by uniqueness of liftings
β = p
◦β
so that
[ p ◦ β] = [ p ◦ ( pβ)] = [ pβ(1)] = ẽ
which is the constant path. Hence E is simply connected. This completes the proof.
p H : (X H , b H ) → (X, b)
H = p H ∗ π(X H , b H ).
\[ G \cong \pi(X, b) \]
by Theorem 5.27, we may take X H = E/H and p H the map induced by p. This
space has the required property.
We can apply this to study complex manifolds of dimension one which are called
Riemann surfaces. Roughly speaking, it is a topological space X for which every
point has a neighborhood homeomorphic to the unit disk in C.
The simplest example is of course C itself. Another example is the projective line
over C, P1 (C) called the Riemann sphere. This is the one point compactification
Exercises
1. Prove that
\[ X = \bigcup_{n \ge 1} \{(x, y) : (x - 1/n)^2 + y^2 = 1/n^2\} \]
T.W. Gamelin and R.E. Greene, Introduction to Topology, Saunders Series, 1983.
5.10 Suggestions for Further Reading 329
This is a beautiful and brief introduction to the subject. One of the nice features is that
it treats the Borsuk–Ulam theorem without introducing homology or cohomology.
It is highly recommended.

This has the features of the previous book and more. A full semester course can
be based on the book since the chapters are short. The emphasis is on fundamental
groups and covering spaces though there is a brief chapter at the end on singular
homology. This is highly recommended for the serious student of algebraic topology.

This is a very informal introduction to the subject with more emphasis on the intuitive
aspects. Its drawback is that in some cases there are no proofs and the treatment is
sloppy. But its colloquial style makes up for it. It is recommended reading on the bus
or for bedtime.

This is a standard text for the subject. But it is lengthy and it is probably impossible
to cover all of it in a semester. Homology and cohomology are not discussed at all,
nor is higher homotopy theory.

This book was later reprinted as Algebraic Topology, A first course by M. Greenberg
and J. Harper. The first part, consisting of about 30 pages on elementary homotopy
theory, is very accessible and can be covered within a month.

Rotman has a strong algebraic flavor, and this appeals to many. Surprisingly, he does
the theory of covering spaces near the end of the book. Chapters 1–3, Chaps. 10 and
11 are relevant to our chapter. The intuitive style in some places of the book makes
for pleasant reading.

Despite its title, this is not an easy book to read. Its merit is a painless introduction
to fiber bundles and spectral sequences.
© Hindustan Book Agency 2022
M. R. Murty, A Second Course in Analysis, IMSc Lecture Notes in Mathematics,
https://doi.org/10.1007/978-981-16-7246-0