
MTH 303: Real Analysis I

Semester 1, 2024-2025

Prahlad Vaidyanathan
Contents
I. The Real Numbers
   1. Introduction: Irrationality of √2
   2. Ordered Fields
   3. The Completeness Axiom
   4. Properties of R

II. Sequences of Real Numbers
   1. Limits of Sequences
   2. Limits and Order
   3. The Bolzano-Weierstrass Theorem
   4. Cauchy Sequences

III. Topology of R
   1. Open Sets
   2. Closed Sets
   3. The Closure of a Set
   4. Compact Sets

IV. Continuous Functions
   1. Limits of Functions
   2. Continuous Functions
   3. Continuous Functions on Compact Sets
   4. The Intermediate Value Theorem
   5. Uniformly Continuous Functions

V. Differentiation
   1. Definition and Basic Properties
   2. The Mean Value Theorem
   3. Extrema and Curve Sketching
   4. Taylor’s Theorem

VI. Integration
   1. Lower and Upper Integrals
   2. Integrable Functions
   3. Properties of the Integral
   4. The Fundamental Theorem of Calculus
   5. The Logarithm and Exponential Functions

VII. Sequences of Functions
   1. Pointwise and Uniform Convergence
   2. Distances between Functions
   3. Weierstrass’ Approximation Theorem

VIII. Series of Functions
   1. Series of Real Numbers
   2. Tests of Convergence
   3. Power Series
   4. Taylor’s Series
   5. Review

IX. Instructor Notes

I. The Real Numbers

1. Introduction: Irrationality of √2
Notation:

• N = {1, 2, 3, . . .}

• N0 = {0, 1, 2, . . .}

• Z = {0, ±1, ±2, . . .}

• Q = {p/q : p, q ∈ Z, q ≠ 0}

• R =? The set of real numbers does not admit such an easy definition.

Theorem 1.1. There does not exist x ∈ Q such that x2 = 2.

Proof. Suppose such an x exists, and write x = p/q for p, q ∈ Z with q ≠ 0. Moreover,
assume that p and q are relatively prime. Then,

p² = 2q²     (I.1)

So 2 | p², and so 2 | p (this is Euclid’s Lemma - you will see a proof in Lemma 2.6 of
[MTH301]). So we write p = 2m for some m ∈ Z. Then, Equation I.1 reduces to

q² = 2m²

So 2 | q², and once again 2 | q. But this contradicts the assumption that p and q are
relatively prime.

Remark 1.2. Notice that √2 shows up naturally when measuring lengths (the length
of the diagonal of a unit square). But what kind of number is √2 then?

Before we answer this, we first isolate the important properties of Q that matter to us
(i.e. to measure length, time, etc. and do basic arithmetic).

(i) Q contains Z.

(ii) We may add, subtract and multiply elements of Q.

(iii) We may divide by a (non-zero) rational number. More precisely, for every x ∈
Q \ {0}, there exists y ∈ Q such that xy = 1.

(iv) Given x, y ∈ Q, either x ≤ y or y ≤ x.
What we want from R is that

• R should contain Q, but also contain elements like √2, i.e. R should fill the ‘holes’
in Q.

• R should satisfy the same kinds of axioms as those above.

2. Ordered Fields
Definition 2.1. A field is a set F with two binary operations + : F × F → F and
· : F × F → F satisfying the following axioms:
(i) (F, +) is an abelian group. The identity element is denoted by 0, and the additive
inverse of x is denoted (−x).

(ii) Multiplication axioms:


• Associativity: (xy)z = x(yz)
• Identity: ∃e ∈ F such that e ≠ 0 and xe = ex = x for all x. We denote this
identity by 1 (it is unique)
• Inverse (for non-zero elements): If x ∈ F \ {0}, there exists y ∈ F such that
xy = yx = 1. We denote this element by (1/x) (it is also unique).
• Commutativity: xy = yx

(iii) Distributive Law: x(y + z) = xy + xz


Example 2.2.
(i) Q is a field.

(ii) N is not a field (it is not a group under +)

(iii) Z is not a field (2 does not have a multiplicative inverse in Z)

(iv) Z/2Z = {[0], [1]} is a field under addition modulo 2.


When F is a field and x, y ∈ F , we write x2 , x/y, 2x, etc to denote the obvious elements.
Remark 2.3. If F is a field, then there are a number of facts one can prove such as:
(i) −(−x) = x

(ii) If x ≠ 0 then 1/(1/x) = x.

(iii) If x + y = x then y = 0 (cancellation for addition)

(iv) If x ≠ 0 and xy = xz then y = z (cancellation for multiplication)
We will not prove these properties as you will prove them in MTH301 (please see [Rudin,
Proposition 1.14-1.16]).

Definition 2.4.

(i) An order on a set S is a relation < on S such that


(a) For any x, y ∈ S we must have one and exactly one of the following: x <
y, x = y, or y < x.
(b) If x, y, z ∈ S such that x < y and y < z, then x < z.

(ii) If (S, <) is an ordered set, we write x ≤ y to mean x < y or x = y. We write x > y
for the negation of x ≤ y.

Definition 2.5. An ordered field is a field F which is an ordered set such that

(i) If y < z, then x + y < x + z.

(ii) If x > 0 and y > 0, then xy > 0.

In other words, the field operations (+ and ·) respect the order structure.

Example 2.6.

(i) Q is an ordered field.

(ii) Z/2Z = {[0], [1]} can be ordered naturally by declaring [0] < [1]. However, it is
not an ordered field: if it were, then [0] < [1] would give [1] = [1] + [0] < [1] + [1] = [0],
a contradiction.

(End of Day 1)

Now, if F is an ordered field, we say x is positive if x > 0 and negative if x < 0.

Remark 2.7. If F is an ordered field, one can prove a number of properties such as

(i) If x ≠ 0, then x² > 0. In particular, 1 > 0.

(ii) If 0 < x < y, then 1/y < 1/x.

As before, we will omit the proofs here. See [Rudin, Proposition 1.18].

R, which we will soon describe, is going to be an ordered field. We now need to explain
what we mean by saying that R will fill the ‘holes’ in Q (i.e. it will include numbers like
√2).
3. The Completeness Axiom
Definition 3.1. Let (S, <) be an ordered set.

(i) A subset E ⊂ S is said to be bounded above if there exists z ∈ S such that

x≤z

for all x ∈ E. Such an element z is called an upper bound for E.

(ii) Suppose E is bounded above. An element α ∈ S is called the least upper bound
or supremum of E if
(a) α is an upper bound for E.
(b) If β is an upper bound for E, then α ≤ β
(b’) Equivalently, if γ < α, then γ is not an upper bound for E, so there exists
x ∈ E such that γ < x.

Note that if α and α′ are two elements of S that satisfy both properties, then it
follows that α = α′. Therefore, the supremum is unique, and we write α = sup(E).

(iii) A subset E ⊂ S is said to be bounded below if there exists y ∈ S such that

y≤x

for all x ∈ E. Such an element y is called a lower bound for E.

(iv) Suppose E is bounded below. Then an element α ∈ S is called the greatest lower bound
or infimum of E if
(a) α is a lower bound for E.
(b) If β is a lower bound for E, then β ≤ α.
(b’) Equivalently, if α < γ, then γ is not a lower bound for E, so there exists
x ∈ E such that x < γ.
Again, we write inf(E) for the infimum of E.

Example 3.2.

(i) If E is a finite set, then sup(E) = max(E) and inf(E) = min(E).

(ii) If
E = {1/n : n ∈ N} ⊂ Q
then E is bounded below (by 0). Indeed, 0 = inf(E) (we will prove this later), but
0 ∉ E.
(iii) Let
E := {r ∈ Q : r2 < 2}.
Then, E is bounded above (by 2, say). We claim that E does not have a least
upper bound in Q.
Proof. Suppose p = sup(E) is a rational number. We know that p > 0 since 1 ∈ E.
Define
q := p − (p² − 2)/(p + 2) = (2p + 2)/(p + 2).
Then q is rational and
q² − 2 = 2(p² − 2)/(p + 2)².
By Theorem 1.1, we have two possible cases:
(a) Suppose p² < 2: Then, q > p and q² < 2. Hence, q ∈ E and so p is not an
upper bound for E.
(b) Suppose p² > 2: Then, 0 < q < p and q² > 2. So if r ∈ E, then
r² < 2 < q² ⇒ r < q.
Hence, q is an upper bound for E, which contradicts the assumption that
p = sup(E).
In either case, we have a contradiction, so sup(E) does not exist in Q.
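Note: The map p ↦ (2p + 2)/(p + 2) used in this proof pushes any rational with p² < 2 up towards √2 while keeping its square below 2. The short Python snippet below (added purely as an illustration; it is not part of the course material) iterates it and prints the squares approaching 2 from below.

    # Illustration only: iterate q = (2p + 2)/(p + 2), starting from 1 ∈ E.
    from fractions import Fraction

    p = Fraction(1)                 # 1 ∈ E since 1² < 2
    for _ in range(6):
        p = (2 * p + 2) / (p + 2)
        print(p, float(p * p))      # the squares increase towards 2
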
Definition 3.3. We say that an ordered set S has the least upper bound property if
every set that is non-empty and bounded above in S has a supremum in S.
Note that by Part (iii) of Example 3.2, Q does not have the least upper bound property.

Note: We will often omit the phrase ‘non-empty’ in statements like the next one
and it will be assumed implicitly.

Lemma 3.4. Suppose S is an ordered set that has the least upper bound property, then
whenever E ⊂ S is non-empty and bounded below, then E has an infimum in S.
Proof. Suppose E is bounded below, define A := {x ∈ S : x is a lower bound for E}.
Then, A ≠ ∅. Moreover, if y ∈ E, then y is an upper bound for A. By hypothesis,
α := sup(A)
exists in S. We claim that α = inf(E) by verifying the two conditions of Part (ii) of
Definition 3.1.
(i) If γ < α, then γ is not an upper bound for A. Hence, γ ∉ E, since every element
of E is an upper bound for A. Therefore,

α ≤ x

for all x ∈ E. Hence, α is a lower bound for E.
(ii) Suppose α < γ, then γ ∉ A since α is an upper bound for A. Hence, γ is not a
lower bound for E.
Therefore, α = inf(E) holds.
Theorem 3.5 (Cantor, Dedekind (1872)). There exists an ordered field R which has the
least upper bound property and contains Q as a subfield.
Remark 3.6.
(i) To say that R contains Q as a subfield means that Q ⊂ R, and
• Addition/multiplication in R, when applied to elements of Q, is the usual
operation in Q.
• Positive rational numbers are positive in R.

(ii) The ‘Completeness Axiom’ is simply the fact that R satisfies the Least Upper
Bound property, which states that R is ‘order complete’. There are other notions
of completeness you will encounter later in the course. Don’t confuse them.

(End of Day 2)

4. Properties of R
Definition 4.1. For a, b ∈ R with a < b, define

(a, b) := {x ∈ R : a < x < b} (Open Interval)


(a, b] := {x ∈ R : a < x ≤ b} (Half-Open Interval)
[a, b) := {x ∈ R : a ≤ x < b} (Half-Open Interval)
[a, b] := {x ∈ R : a ≤ x ≤ b} (Closed Interval).

Proposition 4.2 (Nested Interval Property). Suppose {In := [an, bn] : n ∈ N} is a sequence
of closed intervals in R such that

I1 ⊃ I2 ⊃ . . . .

Then,

⋂n≥1 In ≠ ∅.

Proof. Set A := {an : n ∈ N}. Since

a1 ≤ a2 ≤ . . . ≤ an ≤ . . . ≤ bn ≤ . . . ≤ b2 ≤ b1

we see that A is bounded above by b1 . By the Completeness Axiom, α := sup(A) exists


in R. Then, it follows that
an ≤ α

for all n ∈ N. Moreover, if m ∈ N, then

an ≤ bm

for all n ∈ N. So bm is an upper bound for A, so

α ≤ bm.

Hence, α ∈ Im, and this is true for all m ∈ N, so

α ∈ ⋂n≥1 In.

Theorem 4.3 (Archimedean Property of R).

(i) If x, y ∈ R with x > 0, there exists n ∈ N such that nx > y.

(ii) Given y ∈ R with y > 0, there exists n ∈ N such that 1/n < y.

Proof.

(i) Let A = {nx : n ∈ N}, then we WTS that y is not an upper bound for A. Suppose
y is an upper bound for A, then by the Completeness Axiom,

α := sup(A)

exists in R. Since x > 0, α − x < α, so (α − x) is not an upper bound for A.


Hence, there exists m ∈ N such that

α − x < mx.

But this implies that


α < (m + 1)x
which contradicts the assumption that α is an upper bound for A. So A cannot
be bounded above.

(ii) Since y > 0, Part (i) implies that there exists n ∈ N such that ny > 1, so that
y > 1/n.

Remark 4.4. The Well-Ordering Principle states that every non-empty subset of posi-
tive integers contains a smallest member. This is an axiom, and we will use it below.

Theorem 4.5 (Density of Q in R). Suppose x, y ∈ R with x < y. Then there exists p ∈ Q
such that x < p < y.

Proof. We wish to find m, n ∈ Z such that

x < m/n < y.
(i) To find the denominator n: We choose n so that, if we take steps of length 1/n,
then the steps are too small to ‘step over’ the interval [x, y]. To do this, we note
that (y − x) > 0 so by Theorem 4.3, there exists n ∈ N so that

y − x > 1/n ⇒ ny > nx + 1.

(ii) To find the numerator: Note that nx ∈ R so by Theorem 4.3, there exists k0 ∈ N
such that
k0 > nx.
Consider the set
S := {k ∈ N : k > nx}
Then, S is non-empty since k0 ∈ S. By the well-ordering principle, S has a smallest
member, say m. Then,

m > nx ≥ (m − 1).

Now observe that

x < m/n.

Also, by Step (i),

m ≤ nx + 1 < ny ⇒ m/n < y.

Therefore, p := m/n works.
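Note: The proof above is constructive and can be run as a computation. The sketch below (an added illustration, not part of the notes; it uses floating-point arithmetic, so the strict inequalities are only approximate) carries out the two steps: pick n with 1/n < y − x, then take m to be the least integer exceeding nx.

    import math

    def rational_between(x, y):
        # Step (i): choose n so that 1/n < y - x
        n = math.floor(1 / (y - x)) + 1
        # Step (ii): m is the smallest integer with m > n*x
        m = math.floor(n * x) + 1
        return m, n                                  # then x < m/n < y

    print(rational_between(math.sqrt(2), 1.5))       # e.g. (17, 12), and 17/12 ≈ 1.4167
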

Proposition 4.6 (Existence of Roots). If n ∈ N and a ∈ R is positive, then there is a


unique positive y ∈ R such that y n = a.

Note: This element y is denoted by a1/n .


Proof.

(i) Uniqueness: Suppose y1 and y2 are two distinct positive roots of a, then either
y1 < y2 or y2 < y1 must hold. Assume WLOG that 0 < y1 < y2 . Then, it follows
(from Definition 2.5 - See Homework 1) that

y1^n < y2^n.

Therefore, the nth root, if it exists, must be unique.


(End of Day 3)

(ii) Existence: Consider

E := {x ∈ R : x^n ≤ a}

Then, E ≠ ∅ because 0 ∈ E. Moreover, writing C(n, k) for the binomial coefficients,

(a + 1)^n = a^n + n a^(n−1) + C(n, 2) a^(n−2) + . . . + na + 1 ≥ na + 1 > a.

So E is bounded above by (a + 1). Hence,

α := sup(E)

exists in R.
(a) Suppose α^n < a, then we WTS that there exists M ∈ N such that

(α + 1/M)^n < a.

By the Binomial theorem, for any M ∈ N,

(α + 1/M)^n = α^n + Σ_{k=1}^{n} C(n, k) α^(n−k) (1/M^k)
            ≤ α^n + (1/M) [Σ_{k=1}^{n} C(n, k) α^(n−k)].

Therefore, if B := Σ_{k=1}^{n} C(n, k) α^(n−k), we wish to find M ∈ N so that

α^n + B/M < a ⇔ M > B/(a − α^n).

However, a − α^n > 0 by hypothesis, so by the Archimedean property, such an
M ∈ N exists. So we conclude that

α + 1/M ∈ E

which contradicts the assumption that α is an upper bound for E.
(b) Suppose α^n > a, then we WTS that there exists M ∈ N such that

(α − 1/M)^n > a.

By the Binomial theorem,

(α − 1/M)^n = α^n + Σ_{k=1}^{n} (−1)^k C(n, k) α^(n−k) (1/M^k)
            ≥ α^n − (1/M) [Σ_{k=1}^{n} C(n, k) α^(n−k)].

Again, set B := Σ_{k=1}^{n} C(n, k) α^(n−k) and choose M ∈ N so that

M > B/(α^n − a).

Then, α^n − B/M > a, so for any x ∈ E,

x^n ≤ a < (α − 1/M)^n ⇒ x < α − 1/M.

Therefore, α − 1/M is an upper bound for E, contradicting the assumption that
α = sup(E).

Therefore, both α^n < a and α^n > a are untenable, so α^n = a must hold.
II. Sequences of Real Numbers
1. Limits of Sequences
Definition 1.1. The absolute value of a real number x ∈ R is defined as

|x| = x if x > 0, and |x| = −x if x ≤ 0.

Remark 1.2.

(i) Note that for all x, y ∈ R, we have


(a) |xy| = |x||y|
(b) |x + y| ≤ |x| + |y|

(ii) For any two x, y ∈ R, define the distance between them to be

d(x, y) := |x − y|

(iii) Note that for any x, y ∈ R, x = y if and only if |x − y| < ε for all ε > 0.
Proof. If x = y and ε > 0, then |x − y| = 0 < ε.

Conversely, suppose that |x − y| < ε for all ε > 0. WTS: x = y. Suppose not, then

r := |x − y| > 0.

If ε = r/2, then clearly |x − y| < ε does not hold. This violates our assumption,
so x = y must hold.

Definition 1.3.

(i) A sequence of real numbers is an ordered list of the form (x1 , x2 , x3 , . . .), where
each xi ∈ R.

(ii) We say that a sequence (xn ) converges to a point x ∈ R if for each ε > 0, there
exists N ∈ N such that |xn − x| < ε for all n ≥ N . If this happens, we write

lim xn = x.
n→∞

(iii) If no such x exists, then we say that (xn ) diverges (or we say that limn→∞ xn does
not exist).
Example 1.4.

(i) limn→∞ 1/n = 0.

Proof. For ε > 0, there exists N ∈ N such that 1/N < ε (by Theorem I.4.3). So for
n ≥ N , we have

|xn − 0| = 1/n ≤ 1/N < ε.

This is true for any ε > 0 so limn→∞ xn = 0.

(ii) limn→∞ n² does not exist.

Proof. Suppose x := limn→∞ n² existed. Since x ∈ R, there exists M ∈ N such
that M > (1 + x) (by Theorem I.4.3). Hence, for any n ≥ M , we have

n² ≥ M² > (1 + x) ⇒ |n² − x| > 1.

So for ε = 1/2, there is no N ∈ N such that |n² − x| < ε for all n ≥ N . This
contradicts the assumption that limn→∞ n² exists.

(iii) The sequence (1, −1, 1, −1, . . .) diverges.


Proof. Homework.

(End of Day 4)

Also have a look at [Abbott, Examples 2.2.5, 2.2.6 and 2.2.7] for a good explanation of
convergence and divergence.
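Note: The following small snippet (an added illustration, not part of the notes) computes the witness N used in the proof of Example 1.4(i) for a few values of ε, and checks the defining inequality over a finite range of indices.

    import math

    def witness_N(eps):
        # Any N with 1/N < eps works, by the Archimedean property (Theorem I.4.3).
        return math.floor(1 / eps) + 1

    for eps in (0.5, 0.1, 0.01):
        N = witness_N(eps)
        assert all(abs(1 / n - 0) < eps for n in range(N, N + 1000))
        print(eps, N)
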
Lemma 1.5. If a sequence is convergent, then its limit is unique. i.e. If limn→∞ xn = x1
and limn→∞ xn = x2 , then x1 = x2 .
Proof. Suppose x1 ≠ x2 , then r := |x1 − x2 | > 0. So if ε = r/2, then there exists N1 ∈ N
such that
|xn − x1 | < ε
for all n ≥ N1 . Similarly, there exists N2 ∈ N such that

|xn − x2 | < ε

for all n ≥ N2 . Then if N := max{N1 , N2 }, we have both inequalities whenever n ≥ N .
Hence,

r = |x1 − x2 | ≤ |x1 − xN | + |xN − x2 | < ε + ε = r.

This is a contradiction.
Definition 1.6. A set S ⊂ R is said to be bounded if there exists M > 0 such that
|x| ≤ M for all x ∈ S.

Note: A set S ⊂ R is bounded if and only if it is bounded above and bounded be-
low. (Why?)

Lemma 1.7. Every convergent sequence is bounded.

Proof. Suppose limn→∞ xn = x. For ε = 1, there exists N ∈ N such that

|xn − x| < 1

for all n ≥ N . Thus,


|xn | ≤ 1 + |x|
for all n ≥ N . Now set

M := max{|x1 |, |x2 |, . . . , |xN |, 1 + |x|}.

Then, |xn | ≤ M for all n ∈ N.

Theorem 1.8 (Algebra of Limits). Suppose limn→∞ xn = x and limn→∞ yn = y. Then,

(i) For any c ∈ R, limn→∞ (cxn ) = cx.

(ii) limn→∞ (xn + yn ) = x + y.

(iii) limn→∞ (xn yn ) = xy.

(iv) If y ≠ 0, then there exists N ∈ N such that yn ≠ 0 for all n ≥ N . Moreover,

limn→∞ xn /yn = x/y.

Proof.

(i) Homework.

(ii) For ε > 0, we need to find N ∈ N so that

|xn + yn − (x + y)| < ε

for all n ≥ N . Now, there exist N1 , N2 ∈ N such that

|xn − x| < ε/2 for all n ≥ N1 , and

|yn − y| < ε/2 for all n ≥ N2 .

So if N := max{N1 , N2 }, then for all n ≥ N , we have both inequalities, so that

|xn + yn − (x + y)| ≤ |xn − x| + |yn − y| < ε/2 + ε/2 = ε.

This proves part (ii).

(iii) Since (xn ) is convergent, it is bounded by Lemma 1.7. So there exists M > 0 so
that
|xn | ≤ M
for all n ∈ N. Fix ε > 0 and consider two cases:
(a) Suppose first that y = 0, then there exists N ∈ N so that |yn | < ε/M for all
n ≥ N . Then,
|xn yn | ≤ M |yn | < ε
for all n ≥ N . Thus, limn→∞ (xn yn ) = 0.
(b) Suppose that y ≠ 0, then there exist N1 , N2 ∈ N so that
|xn − x| < ε/(2|y|) for all n ≥ N1 , and
|yn − y| < ε/(2M) for all n ≥ N2 .
If N := max{N1 , N2 }, then both inequalities hold. Therefore, for all n ≥ N ,
|xn yn − xy| ≤ |xn yn − xn y| + |xn y − xy|
≤ M |yn − y| + |y||xn − x|
< ε/2 + ε/2 = ε.
Thus, limn→∞ (xn yn ) = xy.
(iv) Suppose y ≠ 0, so that r := |y| > 0. Then, there exists N ∈ N so that
|yn − y| < r/2
for all n ≥ N . Therefore,
|yn | > |y| − r/2 = r/2 > 0
for all n ≥ N . In particular, the ratio 1/yn makes sense for all n ≥ N . We assume
WLOG that yn ≠ 0 for all n ∈ N.
By part (iii), it now suffices to show that
limn→∞ 1/yn = 1/y.
Note that for n ≥ N ,
|1/yn − 1/y| = |yn − y|/|yn y| ≤ |yn − y|/(r²/2).
So if ε > 0, there exists N0 ∈ N so that |yn − y| < ε(r²/2) for all n ≥ N0 . Hence, for all
n ≥ max{N, N0 },
|1/yn − 1/y| < ε,
so limn→∞ 1/yn = 1/y.

(End of Day 5)

2. Limits and Order
Proposition 2.1. Suppose limn→∞ xn = x and limn→∞ yn = y.

(i) If xn ≥ 0 for all n ∈ N, then x ≥ 0.

(ii) If xn ≤ yn for all n ∈ N, then x ≤ y.

(iii) If c ∈ R is such that xn ≤ c for all n ∈ N, then x ≤ c.

(iv) If c ∈ R is such that yn ≥ c for all n ∈ N, then y ≥ c.

Proof.

(i) Suppose x < 0, then there exists ε > 0 such that x + ε < 0. Now, there exists
N ∈ N so that |xn − x| < ε for all n ≥ N . Then, xN ∈ (x − ε, x + ε) so

xN < 0.

This contradicts the assumption that xn ≥ 0 for all n ∈ N.

(ii) Let zn := yn − xn , then (zn ) is convergent and

lim zn = y − x
n→∞

by Theorem 1.8. Moreover, by hypothesis, zn ≥ 0 for all n ∈ N, so by part (i),

y − x ≥ 0 ⇒ y ≥ x.

(iii) Again, take yn = c for all n ∈ N. Then, limn→∞ yn = c, so we may apply part (ii).

(iv) Again, take xn = c for all n ∈ N and apply part (ii).

Remark 2.2.

(i) Suppose limn→∞ xn = x and suppose that there exists N ∈ N so that xn ≥ 0 for
all n ≥ N (some initial terms may be negative). Then, the proof of part (i) of
Proposition 2.1 shows that x ≥ 0.

(ii) Indeed, if a property (P) holds for all members of a sequence (xn ) except possibly
for the first finitely many terms, then we say that (xn ) eventually has property (P).
Therefore, part (i) Proposition 2.1 may be restated to say that if (xn ) is eventually
non-negative, then its limit is non-negative.

(iii) The same is true for the assumptions in part (ii), (iii) and (iv). Similar results will
hold in future theorems as well.

Theorem 2.3 (Squeeze Theorem). If limn→∞ xn = x, limn→∞ yn = x and xn ≤ zn ≤ yn
for all n ∈ N, then (zn ) is convergent and limn→∞ zn = x as well.
Proof. Homework.
Definition 2.4.
(i) A sequence (xn ) is said to be increasing if xn ≤ xn+1 for all n ∈ N.

(ii) A sequence (xn ) is said to be decreasing if xn+1 ≤ xn for all n ∈ N.

(iii) A sequence is said to be monotone if it is either decreasing or increasing.

Note: Technically, “increasing” should be “non-decreasing”, but we will avoid such


terminology.

Theorem 2.5 (Monotone Sequence Theorem). If a sequence is monotone and bounded,


then it is convergent.
Proof. Let (xn ) be an increasing sequence (the proof for a decreasing sequence is similar),
and choose M ∈ R such that |xn | ≤ M for all n ∈ N. Then, the set S := {xn : n ∈ N}
is bounded above (by M ), so
α := sup(S)
exists in R. We claim that limn→∞ xn = α. So fix ε > 0, then (α − ε) is not an upper
bound for the set S. Hence, there exists N ∈ N such that

α − ε < xN ≤ α.

Now, for any n ≥ N , we have

xN ≤ xn ≤ α.

Therefore, |xn − α| < ε for all n ≥ N. Hence the claim.
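Note: As an added illustration (not part of the notes), the recursion xₙ₊₁ = (xₙ + 2/xₙ)/2 with x₁ = 2 produces a decreasing sequence that is bounded below by √2, so Theorem 2.5 guarantees that it converges (its limit is in fact √2):

    x = 2.0
    for n in range(6):
        x = (x + 2 / x) / 2      # decreasing and bounded below by sqrt(2)
        print(n + 1, x)
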

3. The Bolzano-Weierstrass Theorem


Definition 3.1. A subsequence of a sequence (xn ) is a new sequence of the form
(xn1 , xn2 , xn3 , . . .) where n1 < n2 < n3 < . . . are a strictly increasing sequence of natural
numbers.
Lemma 3.2. If (xn ) is a convergent sequence with limn→∞ xn = x, then any subsequence
of (xn ) also converges to x.
Proof. Let yj = xnj denote the subsequence. For ε > 0, there exists N ∈ N so that
|xn − x| < ε for all n ≥ N . Since (nj ) is strictly increasing, there exists j0 ∈ N so that

nj0 > N ⇒ nj ≥ N for all j ≥ j0 .

It then follows that |yj − x| < ε for all j ≥ j0 , so that limj→∞ yj = x as well.
Remark 3.3.
(i) We know that limn→∞ 1/n = 0. The sequence (1/2^n ) is clearly a subsequence of the
original sequence. Therefore,

limn→∞ 1/2^n = 0.

(ii) Consider the sequence (1, −1, 1, −1, . . .). It has a subsequence (1, 1, 1, . . .) and
another subsequence (−1, −1, −1, . . .). The first subsequence converges to 1, while
the second converges to −1. This does not violate Lemma 3.2 because the original
sequence is not convergent!

(iii) More generally, suppose (xn ) is a sequence such that the set {xn : n ∈ N} is
finite (i.e. There are only finitely many numbers that keep repeating). Then, one
number must repeat infinitely often, and so (xn ) has a convergent subsequence.

Note: For an interval I = [a, b], we write ℓ(I) = (b − a) for its length.

Theorem 3.4 (Bolzano-Weierstrass Theorem). Every bounded sequence has a conver-


gent subsequence.
Proof. Suppose (xn ) is bounded. If S := {xn : n ∈ N} is finite, then there is nothing to
prove, so we may assume that it is infinite. Now, there exists M > 0 such that

−M ≤ xn ≤ M

for all n ∈ N.
(i) Divide the interval [−M, M ] into two halves,

J1 = [−M, 0], and J2 = [0, M ]

If both J1 and J2 contain only finitely many members of S, then S would have
to be finite. Since this is not the case, either J1 or J2 contains infinitely many
elements of S. Let I1 denote the interval which does.

(ii) Now divide the interval I1 into two equal halves, which we denote by K1 and K2 .
Once again, either K1 ∩ S or K2 ∩ S is infinite. Let I2 denote the interval which
contains infinitely many members of S.

(iii) Note that I1 and I2 are both closed and I2 ⊂ I1 .

(iv) We now proceed inductively. Suppose we have chosen closed intervals {I1 , I2 , . . . , Ik }
such that
(a) Each Ij , 1 ≤ j ≤ k is closed.
(b) Ij ⊂ Ij−1 for all 2 ≤ j ≤ k.
(c) For each 2 ≤ j ≤ k, ℓ(Ij ) = ℓ(Ij−1 )/2.

(d) Each Ij contains infinitely many members of S.
To construct Ik+1 , we do exactly what we did in Step (ii).

(v) Thus proceeding, we construct a sequence (Ij ) of nested closed intervals, each of
which contains infinitely many members of S. By Proposition I.4.2,

⋂n≥1 In ≠ ∅.

Let x be any point in this intersection.

(vi) We now construct the subsequence:


(a) I1 contains infinitely many points of S, so fix one such point, denoted by xn1 .
(b) Now, I2 contains infinitely many points of S, so it must contain a point of
the form xn2 where n2 > n1 (since the set {x1 , x2 , . . . , xn1 } is finite!).
(c) Again, proceeding by induction, we obtain a subsequence (xnj ) such that
xnj ∈ Ij for all j ∈ N.

(vii) We claim that limj→∞ xnj = x. To see this, fix ε > 0 and observe that

ℓ(I1 ) = 2M
ℓ(I2 ) = M
ℓ(I3 ) = M/2
. . .
ℓ(Ij ) = 2M/2^(j−1) = M/2^(j−2) .

Since limj→∞ M/2^(j−2) = 0, there exists j0 ∈ N such that

M/2^(j−2) < ε

for all j ≥ j0 . Now, if j ≥ j0 , then

xnj ∈ Ij and x ∈ Ij ⇒ |xnj − x| ≤ ℓ(Ij ) < ε.

Hence, limj→∞ xnj = x.
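Note: The bisection construction in this proof can be simulated. The sketch below (an added illustration, not part of the notes) applies it to the bounded, divergent sequence xₙ = (−1)ⁿ(1 + 1/n). Since a computer cannot test "contains infinitely many terms", it counts terms among the first 999 as a stand-in, which is adequate for the ten stages shown.

    # x_n = (-1)^n (1 + 1/n) is bounded (M = 2) but not convergent.
    def x(n):
        return (-1) ** n * (1 + 1 / n)

    lo, hi = -2.0, 2.0
    last_index = 0
    for stage in range(10):
        mid = (lo + hi) / 2
        # Keep the half that contains "infinitely many" terms, approximated here
        # by counting among the first 999 terms (enough for 10 stages).
        in_left = sum(1 for n in range(1, 1000) if lo <= x(n) <= mid)
        in_right = sum(1 for n in range(1, 1000) if mid <= x(n) <= hi)
        if in_left >= in_right:
            hi = mid
        else:
            lo = mid
        # Pick a term of the sequence inside [lo, hi] with a larger index than before.
        last_index = next(n for n in range(last_index + 1, 10**6) if lo <= x(n) <= hi)
        print(stage, last_index, x(last_index))   # the picked subsequence approaches -1
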

(End of Day 6)

4. Cauchy Sequences
Definition 4.1. A sequence (xn ) ⊂ R is said to be Cauchy if for each ε > 0, there exists
N ∈ N such that |xn − xm | < ε whenever n, m ≥ N .
Proposition 4.2. Every convergent sequence is Cauchy.
Proof. If (xn ) converges to x, then for any ε > 0, there exists N ∈ N such that |xn − x| <
ε/2 whenever n ≥ N . So by the triangle inequality
|xn − xm | ≤ |xn − x| + |x − xm | < ε
for all n, m ≥ N .
We wish to prove the converse, which gives us a criterion to determine when a sequence
converges. For that, we need two lemmas, the first of which is analogous to Lemma 1.7.
Lemma 4.3. Every Cauchy sequence is bounded.
Proof. If (xn ) is Cauchy, then for ε = 1, there exists N ∈ N such that |xn − xm | < 1 for
all n, m ≥ N . In particular
|xn − xN | < 1
whenever n ≥ N , so that
|xn | < 1 + |xN |
for all n ≥ N . So if
M := max{|x1 |, |x2 |, . . . , |xN |, |xN | + 1},
then |xj | ≤ M for all j ∈ N.
Lemma 4.4. If a Cauchy sequence has a convergent subsequence, then the whole se-
quence converges.
Proof. Suppose (xn ) is Cauchy and (xnk ) is a convergent subsequence that converges to
x ∈ R. Then, for each ε > 0, there exist N1 , N2 ∈ N such that
|xn − xm | < ε/2 for all n, m ≥ N1 , and
|xnk − x| < ε/2 for all k ≥ N2 .
Now, (nk ) is an increasing sequence of indices, so n1 < n2 < . . .. Hence, there exists K0 ∈ N such
that
nK0 ≥ N1 .
Moreover, we may choose K0 so that K0 ≥ N2 as well (Why?). Therefore,
|xn − xnK0 | < ε/2 for all n ≥ N1 , and
|xnK0 − x| < ε/2.
So if n ≥ N1 , we have
|xn − x| ≤ |xn − xnK0 | + |xnK0 − x| < ε.
Hence (xn ) converges to x.

Theorem 4.5 (Cauchy Criterion). A sequence in R is convergent if and only if it is
Cauchy.

Proof. If (xn ) is convergent, then it is Cauchy by Proposition 4.2. Conversely, if (xn )


is Cauchy, then it is bounded by Lemma 4.3. By the Bolzano-Weierstrass Theorem
(Theorem 3.4), (xn ) has a convergent subsequence. By Lemma 4.4, (xn ) must be con-
vergent.

Remark 4.6. The Cauchy Criterion of Theorem 4.5 is also referred to as ‘completeness’
of R. Note that this is different from the ‘order completeness’ of Section 3. Again, don’t
confuse them (see Remark I.3.6).

III. Topology of R
1. Open Sets
Lemma 1.1. Consider an open interval U := (a, b) ⊂ R. If x ∈ U , then there exists
δ > 0 such that (x − δ, x + δ) ⊂ U .

Proof. Since a < x < b, we may choose

δ := (1/2) min{|b − x|, |x − a|}.

Definition 1.2.

(i) If x ∈ R and δ > 0, the set

B(x, δ) = (x − δ, x + δ) = {y ∈ R : |x − y| < δ}

is called a neighbourhood of x, or the ball of radius δ around x.

(ii) A set U ⊂ R is said to be open if for each x ∈ U , there exists δ > 0 such that
B(x, δ) ⊂ U . (Note: The value of δ depends on x, just like in Lemma 1.1).

Example 1.3.

(i) Every open interval is an open set by Lemma 1.1.

(ii) U := (0, 1) ∪ (2, 3) is an open set.

(iii) If I = [0, 1], then I is not an open set because if x = 1, then for no δ > 0 is it true
that B(x, δ) ⊂ I.

(iv) Similarly, {0} is not an open set.

Note: You should think of an open set as one where every point has some room around
it.

Theorem 1.4.

(i) An arbitrary union of open sets is open.

(ii) The intersection of finitely many open sets is open.

Proof.
(i) Suppose {Uj : j ∈ J} is an arbitrary collection of open sets and U = ⋃j∈J Uj .
WTS: U is open, so pick x ∈ U . Then, there exists j ∈ J such that x ∈ Uj . Uj is
open so there exists δ > 0 such that B(x, δ) ⊂ Uj . Clearly, B(x, δ) ⊂ U . This is
true for every x ∈ U so U is open.
(End of Day 7)
(ii) Suppose {U1 , U2 , . . . , Uk } is a collection of finitely many open sets and U :=
U1 ∩ U2 ∩ . . . ∩ Uk . WTS: U is open, so pick x ∈ U . Then, for any fixed 1 ≤ i ≤ k,
x ∈ Ui , so there exists δi > 0 such that B(x, δi ) ⊂ Ui . Let
δ := min{δi : 1 ≤ i ≤ k} > 0
Then, we claim that B(x, δ) ⊂ U . To see this, fix 1 ≤ j ≤ k. Then,
B(x, δ) ⊂ B(x, δj ) ⊂ Uj .
This is true for each 1 ≤ j ≤ k, so B(x, δ) ⊂ U . This is true for each x ∈ U , so U
is open.

Remark 1.5.
(i) Note that in Part (ii) of Theorem 1.4, we necessarily needed finitely many open
sets so that the value of δ chosen in the proof is positive. Therefore, this proof
would not work if there were infinitely many sets.
(ii) Moreover, the statement would be false for infinitely many sets: Set Un := (−1/n, 1/n),
then each Un is open but
⋂n≥1 Un = {0}
which is not open.

2. Closed Sets
Definition 2.1. Let A ⊂ R be a set. A point x ∈ R is called a limit point of A if for
each δ > 0, there is a point y ∈ A such that
y ≠ x and |y − x| < δ.

In other words, (B(x, δ) ∩ A) \ {x} ≠ ∅ for each δ > 0.
Lemma 2.2. x ∈ R is a limit point of A if and only if there is a sequence (xn ) ⊂ A
such that xn ≠ x for all n ∈ N and
lim xn = x.
n→∞

Proof. If x is a limit point of A, then for each n ∈ N, with δ = 1/n, there is xn ∈ A
such that xn ≠ x and |xn − x| < 1/n. Now if ε > 0, there is N ∈ N so that 1/N < ε, so
if n ≥ N , we have
|xn − x| < 1/n ≤ 1/N < ε
for all n ≥ N . Therefore, limn→∞ xn = x.

Conversely, suppose there is a sequence (xn ) as above, then for any δ > 0, there is N ∈ N
so that |xN − x| < δ. Therefore,

xN ∈ B(x, δ) ∩ A

satisfies the requirement, so x is a limit point of A.

Remark 2.3.

(i) A limit point is also sometimes called a cluster point of the set.

(ii) The set of all limit points of A is called the derived set of A, and is denoted by A′ .

(iii) We require y ≠ x in Definition 2.1 to avoid the following problem: Suppose A =
{1}, then for each δ > 0,
B(1, δ) ∩ A ≠ ∅.
However, 1 is not a genuine limit of a sequence in A, other than the constant
sequence.

(iv) In contrast, a point x ∈ R is called an isolated point of A if it is in A, but not a
limit point of A.
(a) In the above example, the point 1 is an isolated point of {1}.
(b) However, 1 is not an isolated point of [0, 1], it is a limit point!

Definition 2.4. A set F ⊂ R is said to be closed if it contains all its limit points.

Example 2.5.

(i) If A = {1}, then A is closed.


Proof. Every sequence of elements of A converges to 1, so A′ = ∅.

(ii) Every closed interval is closed.


Proof. Write A = [a, b] and suppose x ∈ A′ , then there is a sequence (xn ) ⊂ A
such that limn→∞ xn = x. Now,

a ≤ xn ≤ b

for all n ∈ N. By Proposition II.2.1, it follows that a ≤ x ≤ b, so x ∈ A. Hence,
A′ ⊂ A.

(iii) If A = {1/n : n ∈ N}, then A is not closed.
Proof. By Example II.1.4, limn→∞ 1/n = 0. So 0 ∈ A′ . Since 0 ∉ A, A is not
closed.

Theorem 2.6. A set F ⊂ R is closed if and only if F c = R \ F is open.

Proof.

(i) Suppose F is closed, we WTS that U := F c is open. So fix x ∈ U , then we WTS


that there is some δ > 0 so that B(x, δ) ⊂ U . Suppose not, then for each δ > 0,

B(x, δ) ∩ F ≠ ∅.

Since x ∉ F , it follows that x is a limit point of F . Since F is closed, x ∈ F . But
this contradicts the assumption that x ∈ U . Hence, U must be open.

(ii) Suppose U is open, we WTS that F is closed. The proof reverses the above
argument: Suppose x ∈ F ′ , then we WTS that x ∈ F . Suppose not, then x ∈ U .
Since U is open, there is δ > 0 so that B(x, δ) ⊂ U . In that case,

B(x, δ) ∩ F = ∅.

This contradicts the assumption that x ∈ F ′ . Hence, x must be in F , so F is


closed.

Corollary 2.7.

(i) An arbitrary intersection of closed sets is closed.

(ii) A union of finitely many closed sets is closed.

Proof. Use Theorem 1.4 and De Morgan’s Laws.

3. The Closure of a Set


Definition 3.1. For a subset A ⊂ R, the closure of A is the set

Ā := A ∪ A′ .

Lemma 3.2. Let A ⊂ R and x ∈ R be fixed. Suppose δ > 0 is such that B(x, δ) ∩ Ā ≠ ∅,
then
B(x, δ) ∩ A ≠ ∅.

Proof. Suppose B(x, δ) ∩ Ā ≠ ∅. Since Ā = A ∪ A′ , there are two options: either
B(x, δ) ∩ A ≠ ∅ or B(x, δ) ∩ A′ ≠ ∅. In the first case, there is nothing to prove, so
assume that
B(x, δ) ∩ A′ ≠ ∅.
In that case, there is a y ∈ A′ such that |y − x| < δ. Now r := δ − |y − x| > 0, so by
definition
B(y, r) ∩ A ≠ ∅.
Choose z ∈ B(y, r) ∩ A, then

|z − x| ≤ |z − y| + |y − x| < δ − |y − x| + |y − x| = δ.

Therefore, z ∈ B(x, δ) ∩ A so B(x, δ) ∩ A ≠ ∅.

(End of Day 8)

Proposition 3.3. Let A ⊂ R and x ∈ R. Then, x ∈ Ā if and only if

B(x, δ) ∩ A ≠ ∅

for all δ > 0.

Proof.

(i) Suppose x ∈ Ā and δ > 0. Then there are two cases: either x ∈ A or x ∈ A′ . In
the first case, it is clear that

x ∈ B(x, δ) ∩ A.

In the second case, there is a point

y ≠ x such that y ∈ B(x, δ) ∩ A.

In either case, B(x, δ) ∩ A ≠ ∅.

(ii) Conversely, suppose that the given condition holds. We WTS that x ∈ Ā. Suppose
not, then x ∉ A and x ∉ A′ . Since x ∉ A′ , there must be a δ > 0 such that

(B(x, δ) ∩ A) \ {x} = ∅.

Since x ∉ A, it follows that
B(x, δ) ∩ A = ∅.
This contradicts the hypothesis.

Theorem 3.4. For a subset A ⊂ R,

(i) Ā is a closed set.

(ii) If F is a closed set containing A, then Ā ⊂ F .

In other words, Ā is the smallest closed set containing A.
Proof.

(i) To show that Ā is closed, it suffices to show that U := (Ā)c is open. So fix x ∈ U .
Then, x ∉ Ā so by Proposition 3.3, there exists δ > 0 such that
B(x, δ) ∩ A = ∅.
We claim that B(x, δ) ⊂ U . To see this, choose y ∈ B(x, δ), then |y − x| < δ. Set
r := δ − |y − x|, then we claim that
B(y, r) ⊂ B(x, δ).
Indeed, if z ∈ B(y, r), then
|z − x| ≤ |z − y| + |y − x| < δ − |y − x| + |y − x| = δ ⇒ z ∈ B(x, δ).
Therefore,
B(y, r) ∩ A = ∅.
By Proposition 3.3, y ∉ Ā, so that y ∈ U . This is true for each point y ∈ B(x, δ),
so B(x, δ) ⊂ U . Such a δ > 0 exists for each x ∈ U , so U is an open set.
(ii) Suppose F is a closed set that contains A, then any limit point of A must also
either be in F , or be a limit point of F . In either case, we see that A′ ⊂ F , so that
Ā = A ∪ A′ ⊂ F.

Example 3.5.

(i) If F is a closed set, then F is its own closure. Hence,
(a) the closure of {1} is {1}.
(b) the closure of [0, 1] is [0, 1].

(ii) If A = (0, 1), then Ā = [0, 1].

Proof. Since [0, 1] is closed, we know from Theorem 3.4 that Ā ⊂ [0, 1]. Therefore,
it suffices to show that [0, 1] ⊂ Ā. Since A ⊂ Ā, it suffices to show that
{0, 1} ⊂ Ā.
(a) If x = 0: Then (1/n) is a sequence in A that converges to x, so 0 ∈ A′ .
(b) If x = 1: Then (1 − 1/n) is a sequence in A that converges to x, so 1 ∈ A′ .

(iii) The closure of Q is R.
Proof. Homework.

4. Compact Sets
Definition 4.1. A set K ⊂ R is said to be compact if every sequence in K has a
subsequence that converges to a point in K.
Lemma 4.2. A compact set must be both closed and bounded.
Proof.
(i) If K is compact and not closed, then K is not equal to its closure, so K ′ is not
contained in K. So there must be a sequence (xn ) in K that converges to a point
x ∉ K. Then, every subsequence of (xn ) converges to x (by Lemma II.3.2), so no
subsequence of (xn ) can converge to a point in K.

(ii) Suppose K is compact and not bounded. Then, for each n ∈ N, there exists
xn ∈ K such that |xn | > n. The sequence (xn ) is now unbounded. Indeed,
every subsequence of (xn ) is also unbounded, so no subsequence can converge (by
Lemma II.1.7).

Example 4.3.
(i) Every closed and bounded interval is compact.
Proof. If K = [a, b], then any sequence in K is bounded, so it has a convergent
subsequence by the Bolzano-Weierstrass theorem. Since K is closed, that limit
point also belongs to K.

(ii) (0, 1) is not compact, as it is not closed.

(iii) If K1 and K2 are two compact sets, then K1 ∪ K2 is compact.


Proof. Suppose K = K1 ∪ K2 and (xn ) is a sequence in K. Then, either K1 or K2
must contain infinitely many elements of the sequence, so we may assume that a
subsequence (xnj ) is entirely contained in K1 . Since K1 is compact, this now has
a subsequence that converges to a point in K1 . Hence, the original sequence has a
subsequence that converges to a point in K.
(End of Day 9)

(iv) More generally, the union of finitely many compact sets is compact.

(v) The union of infinitely many compact sets may not be compact: Take Kn =
[0, 1 − 1/n], then each Kn is compact, but

K = ⋃n≥1 Kn = [0, 1)

is not closed, and therefore not compact.

Theorem 4.4 (Heine-Borel Theorem). A subset K ⊂ R is compact if and only if it is
closed and bounded.

Proof. If K is compact, then it is closed and bounded by Lemma 4.2.

Conversely, suppose K is both closed and bounded. If (xn ) is a sequence in K, then by


the Bolzano-Weierstrass theorem, (xn ) has a convergent subsequence. The limit of this
subsequence must be in K because K is closed. Hence, K is compact.

Proposition 4.5. If K is a compact set, then sup(K) and inf(K) both exist, and are
elements of K.

Proof. Since K is bounded, both α := sup(K) and β := inf(K) exist. We prove that
α ∈ K (the proof for β is analogous). For each ε > 0, α − ε is not an upper bound for
K, so there is a point x ∈ K such that

α − ε < x ≤ α.

Taking ε = 1/n, we obtain a sequence (xn ) in K such that

α − 1/n < xn ≤ α for all n ∈ N.

Since K is compact, (xn ) has a convergent subsequence (xnj ) so that x := limj→∞ xnj ∈
K. Now,
α − 1/nj < xnj ≤ α
holds for all j ∈ N, so by the Squeeze theorem, x = α, so α ∈ K.

IV. Continuous Functions
1. Limits of Functions
Definition 1.1. Let A ⊂ R and f : A → R be a function, and c ∈ A′ be a limit point
of A. For a real number L ∈ R, we write

“ lim f (x) = L”
x→c

if, for any sequence (xn ) in A which satisfies

(i) limn→∞ xn = c and

(ii) xn ≠ c for all n ∈ N,

we have limn→∞ f (xn ) = L. If this happens, we say that L is the limit of f at c.

Example 1.2.

(i) Suppose f : [0, 1] → R is the function f (x) = x2 . Then, we claim that

limx→1/2 f (x) = 1/4.

This is because, if (xn ) is a sequence with limn→∞ xn = 1/2, then limn→∞ xn² = 1/4
by the Algebra of Limits (Theorem II.1.8).

(ii) More generally, if f : [0, 1] → R is given by f (x) = x^n , then

limx→1/2 f (x) = 1/2^n .

Again, this follows from Theorem II.1.8 applied inductively.

(iii) Suppose f : [0, 1] → R is the constant function f (x) = 3 for all x ∈ [0, 1], then for
any c ∈ [0, 1],
lim f (x) = 3 = f (c).
x→c

(iv) Therefore, if f : [0, 1] → R is any polynomial and c ∈ [0, 1] is fixed, then

lim f (x) = f (c).


x→c

(v) Suppose f : [0, 2] → R is given by
f (x) = x if 0 ≤ x ≤ 1, and f (x) = x² otherwise.

Then, we claim that limx→1 f (x) = 1.


Proof. Fix a sequence (xn ) with limn→∞ xn = 1. Fix ε > 0, then there exists
N1 ∈ N such that
|xn − 1| < ε for all n ≥ N1 .
Moreover, since limn→∞ xn² = 1, there exists N2 ∈ N such that

|xn² − 1| < ε for all n ≥ N2 .

Therefore, if N = max{N1 , N2 }, we have

|f (xn ) − 1| < ε for all n ≥ N.

Hence, limn→∞ f (xn ) = 1.

(vi) Suppose f : [0, 2] → R is given by



f (x) = x if 0 ≤ x < 1, f (x) = x² if 1 < x ≤ 2, and f (1) = 25.

Then, limx→1 f (x) = 1.

(vii) Suppose f : [0, 2] → R is given by


f (x) = x if 0 ≤ x ≤ 1, and f (x) = x² + 1 otherwise.

Then, we claim that limx→1 f (x) does not exist.


Proof. Suppose limx→1 f (x) = L did exist, then we may choose a sequence xn :=
1 − 1/n, and yn = 1 + 1/n. We know that limn→∞ xn = 1 = limn→∞ yn , so it must
happen that
lim f (xn ) = L = lim f (yn ).
n→∞ n→∞

However, we know from Theorem II.1.8 that

lim f (xn ) = 1 while lim f (yn ) = 2.


n→∞ n→∞

Therefore, limx→1 f (x) does not exist.

Remark 1.3. In Definition 1.1, we required that c ∈ A′ . This is to ensure that there are
(non-constant) sequences in A converging to c. We do not necessarily need c to belong
to A (the domain of f ).

Theorem 1.4 (Algebra of Limits of Functions). Suppose f, g : A → R and c ∈ A′ are


such that limx→c f (x) = L and limx→c g(x) = M , then

(i) limx→c (f (x) + g(x)) = L + M .

(ii) For any k ∈ R, limx→c (kf (x)) = kL.

(iii) limx→c (f (x)g(x)) = LM .


(iv) If M ≠ 0, then limx→c f (x)/g(x) = L/M .

Proof. Follows from the definition and the Algebra of Limits (Theorem II.1.8).

(End of Day 10)

Theorem 1.5. Given f : A → R and c ∈ A′ , limx→c f (x) = L if and only if for each
ε > 0, there is a δ > 0 such that

x ∈ A and 0 < |x − c| < δ ⇒ |f (x) − L| < ε     (IV.1)

Proof.

(i) Suppose limx→c f (x) = L, and ε > 0 is given. We WTS that there is a δ > 0
satisfying this condition. Suppose not, then δ = 1/n does not satisfy this condition.
So there is a point xn ∈ A with 0 < |xn − c| < 1/n but |f (xn ) − L| ≥ ε. Now, the
sequence (xn ) converges to c, xn ≠ c for all n ∈ N, but (f (xn )) does not converge
to L. This contradicts the definition.

(ii) Conversely, suppose Equation IV.1 holds. We WTS: limx→c f (x) = L. So
choose a sequence (xn ) ⊂ A such that limn→∞ xn = c and xn ≠ c for all n ∈ N. We
WTS: limn→∞ f (xn ) = L. So fix ε > 0. By hypothesis, there is a δ > 0 satisfying
Equation IV.1. For this δ > 0, there is N ∈ N such that

0 < |xn − c| < δ for all n ≥ N.

By hypothesis, this implies that

|f (xn ) − L| < ε for all n ≥ N.

Therefore, limn→∞ f (xn ) = L.

2. Continuous Functions
Definition 2.1. Given a function f : A → R and a point c ∈ A, we say that f is
continuous at c if
lim f (x) = f (c).
x→c

We say that f is continuous on A if it is continuous at each point c ∈ A.

Theorem 2.2. Given f : A → R and c ∈ A, the following are equivalent (TFAE):

(i) f is continuous at c.

(ii) Whenever (xn ) is a sequence in A with limn→∞ xn = c, we have limn→∞ f (xn ) =


f (c).

(iii) For each ε > 0, there exists δ > 0 such that if x ∈ A and |x − c| < δ, then
|f (x) − f (c)| < ε.

Proof.

(i) ⇔ (ii): Definition 1.1.

(ii) ⇔ (iii): Theorem 1.5.

Definition 2.3. Suppose f, g : A → R are two functions, we define new functions on A


by

(f + g)(x) := f (x) + g(x)


(kf )(x) := kf (x)
(f g)(x) := f (x)g(x)
 
(f /g)(x) := f (x)/g(x) (defined provided g(x) ≠ 0).

Theorem 2.4 (Algebra of Continuous Functions). Suppose f, g : A → R are two func-


tions and c ∈ A is fixed. Suppose that f and g are both continuous at c. Then,

(i) f + g is continuous at c.

(ii) If k ∈ R, then kf is continuous at c.

(iii) f g is continuous at c.

(iv) If g(c) 6= 0, then (f /g) is continuous at c.

Proof. This is a combination of Definition 1.1, Theorem II.1.8 and Definition 2.1. For
instance, we prove part (i). Suppose (xn ) is a sequence in A with limn→∞ xn = c. Then,
by Definition 2.1,

lim f (xn ) = f (c), and


n→∞
lim g(xn ) = g(c).
n→∞

By Theorem II.1.8,

limn→∞ (f + g)(xn ) = limn→∞ (f (xn ) + g(xn )) = f (c) + g(c).

Hence, (f + g) is continuous at c.

(End of Day 11)

Remark 2.5. The sequential definition (Definition 1.1) is useful to determine when a
function is not continuous at a point. There are two possible reasons:

(i) limx→c f (x) does not exist: If there are two sequences (xn ) and (yn ) in A with
limn→∞ xn = c = limn→∞ yn , but

limn→∞ f (xn ) ≠ limn→∞ f (yn )

then, limx→c f (x) does not exist, so f cannot be continuous.

(ii) limx→c f (x) exists, but it is not equal to f (c).

Example 2.6.

(i) Let A ⊂ R be any set, and f : A → R be a constant function. Then, f is


continuous.

(ii) If f : A → R is given by f (x) = x, then f is continuous.

(iii) If p : A → R is a polynomial of the form

p(x) = a0 + a1 x + . . . + an xn

then p is continuous by Theorem 2.4.

(iv) If A = {1, 2, 3} and f : R \ A → R is given by

f (x) = (x² + 3x + 7) / ((x − 1)(x − 2)(x − 3)²)

Then, f is continuous. Such a function is called a rational function.

(v) Let f : [0, 2] → R be given by
f (x) = 1 if 0 ≤ x < 1, and f (x) = 2 otherwise,

then f is discontinuous at c = 1 because limx→1 f (x) does not exist. Indeed,

lim f (1 − 1/n) = 1 but lim f (1 + 1/n) = 2.


n→∞ n→∞

(vi) Let f : [0, 2] → R be given by


f (x) = 1 if x = 1, and f (x) = 2 otherwise.

Then, f is discontinuous at c = 1 because

lim f (x) = 2 but f (1) = 1.


x→1

(vii) Let f : R → R be given by


f (x) = sin(1/x) if x ≠ 0, and f (0) = 0.

Then, f is discontinuous at c = 0. If xn = 1/(2nπ) and yn = 1/(2nπ + π/2), then


limn→∞ xn = 0 = limn→∞ yn . However,

lim f (xn ) = 0 and lim f (yn ) = 1.


n→∞ n→∞

Here, the limit does not exist because the function oscillates too much.

(viii) Let f : R → R be given by


f (x) = x sin(1/x) if x ≠ 0, and f (0) = 0.

Then, f is continuous at c = 0.
Proof. Note that | sin(t)| ≤ 1 for all t ∈ R, so

|f (x)| ≤ |x|

for all x ∈ R. So if (xn ) is a sequence with limn→∞ xn = 0, then limn→∞ f (xn ) = 0


as well. Since f (0) = 0, it is continuous.

(ix) Let A := [0, ∞) and fix m ∈ N. Define f : A → R by

f (x) = x^(1/m) .

Then f is continuous at each point c ∈ A.


Proof. Fix c ∈ A and ε > 0. We WTS: There exists δ > 0 such that

|x − c| < δ ⇒ |f (x) − f (c)| < ε.

To do this, we know that

(a) If x, y ∈ A with x < y, then x^m < y^m (by a repeated application of Defini-
tion I.2.5).
(b) Conversely, if a, b ∈ A are such that a < b, then a^(1/m) < b^(1/m) must hold (by
taking x = a^(1/m) and y = b^(1/m) and seeing that x ≥ y cannot hold). Therefore,
we see that
x < y ⇔ x^(1/m) < y^(1/m) ⇔ x^m < y^m .

Now, we note that

|f (x) − f (c)| < ε ⇔ c^(1/m) − ε < x^(1/m) < c^(1/m) + ε
⇔ (c^(1/m) − ε)^m < x < (c^(1/m) + ε)^m .

If a := (c^(1/m) − ε)^m , then

a^(1/m) = c^(1/m) − ε < c^(1/m) ⇒ a < c.

Similarly, if b := (c^(1/m) + ε)^m , then b > c, so if

δ := min{c − a, b − c}

then whenever |x − c| < δ, we have

c − δ < x < c + δ ⇒ c − (c − a) < x < c + (b − c)
⇒ a < x < b
⇒ |f (x) − f (c)| < ε.

So, by Theorem 1.5, f is continuous at c.
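Note: The proof of (ix) gives an explicit recipe for δ. The snippet below (an added illustration, not part of the notes; the clamp at 0 is only needed because the domain is [0, ∞)) computes δ = min{c − a, b − c} for a given c, ε and m and spot-checks it.

    def delta_for_root(c, eps, m):
        # Recipe from Example 2.6(ix): a = (c^(1/m) - eps)^m, b = (c^(1/m) + eps)^m.
        r = c ** (1 / m)
        a = max(r - eps, 0.0) ** m
        b = (r + eps) ** m
        return min(c - a, b - c)

    c, eps, m = 2.0, 0.01, 3
    d = delta_for_root(c, eps, m)
    print(d, all(abs(x ** (1 / m) - c ** (1 / m)) < eps
                 for x in (c - 0.99 * d, c + 0.99 * d)))
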

Definition 2.7. Let A, B ⊂ R and f : A → R and g : B → R be two functions.


Moreover, assume that f (A) ⊂ B, so we may define h : A → R by

h(x) := g(f (x)).

Then, h is called the composition of g and f , denoted g ◦ f .

Proposition 2.8 (Composition of Continuous Functions). Let f : A → R and g : B →
R as above and let c ∈ A. If f is continuous at c and g is continuous at f (c), then g ◦ f
is continuous at c.

Proof. Suppose (xn ) is a sequence in A with limn→∞ xn = c. Then, by hypothesis,

lim f (xn ) = f (c).


n→∞

If yn := f (xn ), then once again

lim g(yn ) = g(f (c)).


n→∞

Hence,
lim (g ◦ f )(xn ) = (g ◦ f )(c).
n→∞

So g ◦ f is continuous at c.

Example 2.9. Define f : R → R by

f (x) := √(3x² + 1).

Then, f is continuous at each point c ∈ R because it is the composition of two functions
h(x) := 3x² + 1 and g(t) := √t, both of which are continuous by Example 2.6.

3. Continuous Functions on Compact Sets


Definition 3.1. Let f : A → R be a function.

(i) The set


f (A) := {f (x) : x ∈ A}
is called the range of f .

(ii) We say that f is bounded if f (A) is a bounded set. In that case, we may define

inf(f ) := inf f (A) and sup(f ) := sup f (A).

(End of Day 12)

Remark 3.2.

(i) If A = R and f (x) := x2 + 1, then f is not bounded. However, if we restrict the


domain to, say, A := [0, 1], then f is bounded on A.

(ii) If A = (0, 1) and f : A → R is f (x) := 1/x, then f is not bounded.

Theorem 3.3. If K is compact and f : K → R is continuous, then f (K) is compact.

Proof. Let (yn ) be a sequence in f (K), then we may write yn = f (xn ) for some sequence
(xn ) ⊂ K. Since K is compact, there is a subsequence (xnj ) of (xn ) and a point x ∈ K
such that limj→∞ xnj = x. Since f is continuous, limj→∞ f (xnj ) = f (x) ∈ f (K). So
(f (xnj )) is a subsequence of (yn ) that converges to a point in f (K). Hence, f (K) is
compact.
Corollary 3.4 (Extreme Value Theorem). If K is compact and f : K → R is continu-
ous, then f is bounded. Moreover, there exist points x0 , x1 ∈ K such that

f (x0 ) = inf f (K) and f (x1 ) = sup f (K).

Proof. Note that f (K) is compact by Theorem 3.3, so f (K) is both closed and bounded
by Lemma III.4.2. Hence, f (K) is bounded.

Moreover, by Proposition III.4.5, there are points y0 , y1 ∈ f (K) such that

y0 = inf f (K) and y1 = sup f (K).

Since y0 ∈ f (K), there exist x0 ∈ K such that f (x0 ) = y0 and similarly, there is x1 ∈ K
such that f (x1 ) = y1 . This proves the result.
Remark 3.5. You will have used Corollary 3.4 in Calculus before. Given a function
f : [0, 1] → R, you are often asked to find the maxima/minima of f . This is done in two
steps:
(i) First determine if f (0) or f (1) is a maximum.

(ii) For points in (0, 1), you can use the derivative to test.
This entire process relies on the fact that f has a maximum/minimum on the set,
otherwise there is no guarantee that this process would work.

4. The Intermediate Value Theorem


Proposition 4.1. Suppose In = [an , bn ] is a sequence of nested intervals, i.e.

I1 ⊃ I2 ⊃ I3 ⊃ . . .

and further assume that limn→∞ |bn − an | = 0. Then, there is a unique point x0 ∈ R
such that

⋂n≥1 In = {x0 }.

Proof. We know from Proposition I.4.2 that

A := ⋂n≥1 In ≠ ∅.

Suppose A has two distinct points, say y0 < y1 so that r := |y1 − y0 | > 0. Then, for each
n ∈ N,
an ≤ y0 < y1 ≤ bn .
In other words, for each n ∈ N,
|bn − an | ≥ |y1 − y0 | = r > 0.
This contradicts the assumption that limn→∞ |bn − an | = 0. Therefore, the intersection
must contain exactly one point.
Theorem 4.2 (Intermediate Value Theorem). Suppose [a, b] ⊂ R is a closed interval
and f : [a, b] → R is continuous. Suppose that f (a) < f (b), and c ∈ R is any value such
that
f (a) ≤ c ≤ f (b).
Then, there is a point x ∈ [a, b] such that f (x) = c.
Proof. Assume without loss of generality that f (a) < c < f (b).
(i) Set a1 = a, b1 = b and consider x1 := (a1 + b1 )/2. If f (x1 ) = c, then we are done.
If not, then either
f (x1 ) > c or f (x1 ) < c.

(a) If f (x1 ) > c, let a2 = a1 and b2 = x1 .


(b) If f (x1 ) < c, let a2 = x1 and b2 = b1 .
In either case, we get a new interval I2 = [a2 , b2 ] with the property that f (a2 ) <
c < f (b2 ). Moreover,
ℓ(I2 ) = |b2 − a2 | = |b − a|/2.
(ii) Proceeding inductively, we obtain a sequence of intervals In = [an , bn ] such that
(a) In+1 ⊂ In ⊂ [a, b] for all n ∈ N.
(b) ℓ(In ) = |b − a|/2^(n−1) .
(c) f (an ) ≤ c ≤ f (bn ) for all n ∈ N.
By Proposition 4.1, there is a unique point x ∈ R such that

\
In = {x}
n=1

Note that
lim an = x = lim bn
n→∞ n→∞
Since f is continuous, we have
f (x) = lim f (an ) = lim f (bn ).
n→∞ n→∞

Since f (an ) ≤ c ≤ f (bn ) for all n ∈ N, it follows from Theorem II.2.3 that
f (x) = c.
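Note: The interval-halving in this proof is precisely the classical bisection method. Below is a minimal numerical sketch (added as an illustration, not part of the notes; it assumes f is continuous on [a, b] with f(a) < c < f(b)).

    def bisect(f, a, b, c, steps=50):
        # Halve the interval, always keeping f(a_n) < c < f(b_n), as in Theorem 4.2.
        for _ in range(steps):
            x = (a + b) / 2
            if f(x) == c:
                return x
            if f(x) > c:
                b = x
            else:
                a = x
        return (a + b) / 2

    # Example: solve x^3 + x = 1 on [0, 1]; the root is approximately 0.6823.
    print(bisect(lambda x: x ** 3 + x, 0.0, 1.0, 1.0))
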

Note that in the hypothesis of Theorem 4.2, we assumed that f (a) < f (b). But the
same is true if f (b) < f (a) as well.
Remark 4.3.
(i) Consider the function f (x) = 1/x defined on R \ {0}. Given ε > 0, there is N ∈ N
so that if |x| > N , one has
|f (x)| < ε
(Indeed, we may use the Archimedean property (see Theorem I.4.3) to find N ∈ N
such that 1/N < ε).
(ii) Similarly, if f (x) = 1/x² and ε > 0, there is N ∈ N such that if |x| > N , one has
|f (x)| < ε.

(iii) In general, if k ≥ 1 and a ∈ R are fixed, and ε > 0 is given, there exists N ∈ N
such that if |x| > N ,
|a/x^k| < ε.
We use this fact below.
(End of Day 13)
Corollary 4.4. Suppose f : R → R is a polynomial of odd degree, then f has at least
one real root. i.e. There exists x ∈ R such that f (x) = 0.
Proof. Write

f (x) = a0 + a1 x + . . . + am x^m

where am ≠ 0 and m is odd. We may assume WLOG that am = 1 (dividing f by am
does not change its roots). Then,

f (x)/x^m = 1 + Σ_{k=0}^{m−1} ak x^(k−m).

Fix ε := 1/(2m) > 0 (you will see why later). By Remark 4.3, for each 0 ≤ k ≤ m − 1,
there is Nk ∈ N such that if |x| > Nk , then

|ak x^(k−m)| < ε.

If N = max{N0 , N1 , . . . , Nm−1 }, then for any x ∈ R with |x| > N , one has

|f (x)/x^m − 1| < mε = 1/2.

Hence, if |x| > N , one has

f (x)/x^m > 1 − mε = 1/2.

Now consider two cases:

(i) If x > N , then
f (x) > x^m/2 > 0.

(ii) If x < −N , then (since m is odd, x^m < 0)
f (x) < x^m/2 < 0.

Hence, in particular, f (−N − 1) < 0 and f (N + 1) > 0. So by applying the
Intermediate Value Theorem (Theorem 4.2) to f on the interval [−N − 1, N + 1],
we see that f must have a root.

Remark 4.5. Note that Corollary 4.4 does not hold for polynomials of even degree
(for instance, x2 + 1 does not have a real root). Try to find where exactly we used the
assumption of odd degree in the proof.

5. Uniformly Continuous Functions


Remark 5.1.
(i) Consider the function f : (0, 1) → R given by f (x) = 1/x.
(a) If c = 1/2 and ε = 1/4, we wish to find δ > 0 so that |x − c| < δ implies that
|f (x) − f (c)| < ε. Note that

|f (x) − f (c)| < ε ⇔ f (c) − ε < f (x) < f (c) + ε
⇔ 2 − 1/4 < 1/x < 2 + 1/4
⇔ 7/4 < 1/x < 9/4
⇔ 4/9 < x < 4/7
⇔ 4/9 − 1/2 < x − 1/2 < 4/7 − 1/2
⇔ −1/18 < x − 1/2 < 1/14.

So δ := 1/18 works.
(b) If c = 1/3 and ε = 1/4, we now wish to find δ > 0 as above. Again,

|f (x) − f (c)| < ε ⇔ 3 − 1/4 < 1/x < 3 + 1/4
⇔ 11/4 < 1/x < 13/4
⇔ 4/13 < x < 4/11
⇔ 4/13 − 1/3 < x − 1/3 < 4/11 − 1/3
⇔ −1/39 < x − 1/3 < 1/33.

So δ = 1/39 works.
Therefore, the value of δ > 0 depends not just on ε but also on the point c.

(ii) Suppose f : (0, 1) → R is given by f (x) = 3x + 8. Fix ε = 1/4.

(a) If c = 1/2, then we wish to find δ > 0 so that |f (x) − f (c)| < ε. Note that

|f (x) − f (c)| < ε ⇔ |3x + 8 − (3c + 8)| < ε
⇔ 3|x − c| < ε
⇔ |x − c| < ε/3.

Hence, δ = ε/3 worked.
(b) If c = 1/3, then the same proof shows that δ = ε/3 works again.
In other words, for this function f (x) = 3x + 8, the value of δ > 0 needed depends
only on ε, but not on the point c.
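Note: The contrast in this remark can be seen numerically. The sketch below (an added illustration, not part of the notes) computes, for f(x) = 1/x with ε = 1/4, the largest δ that works at a given c, and compares it with the c-independent δ = ε/3 that works for g(x) = 3x + 8.

    eps = 0.25

    def largest_delta_reciprocal(c, eps):
        # For |1/x - 1/c| < eps we need 1/(1/c + eps) < x < 1/(1/c - eps);
        # the binding side is the left one, so δ = c - 1/(1/c + eps).
        return c - 1 / (1 / c + eps)

    for c in (1/2, 1/3, 1/10):
        print(c, largest_delta_reciprocal(c, eps))   # 1/18, 1/39, then even smaller

    # For g(x) = 3x + 8, the single value δ = eps/3 works at every c.
    print(eps / 3)
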

Definition 5.2. A function f : A → R is said to be uniformly continuous on A if, for


each ε > 0, there exists δ > 0 such that for any x, y ∈ A,

|x − y| < δ ⇒ |f (x) − f (y)| < ε.

Remark 5.3.

(i) Note that the definition of continuity (see Definition 2.1) is local. i.e. It describes
the behaviour of a function around a point. However, the definition of uniform
continuity is global ; it describes the behaviour of the function on the set.

(ii) Also, there is no sequential definition of uniform continuity that is comparable to


part (ii) of Theorem 2.2.

Example 5.4.

(i) If A ⊂ R is any set and f : A → R is given by f (x) = 3x + 8, then f is uniformly
continuous: For each ε > 0, δ := ε/3 works.

(ii) If f : (0, 1) → R is given by f (x) = 1/x, then f is not uniformly continuous.


Proof. Fix ε = 1/2, and we claim that there is no δ > 0 such that |x − y| < δ
implies that |f (x) − f (y)| < ε. To see this, observe that

1/n − 1/(n + 1) = 1/(n(n + 1))

but

|f (1/n) − f (1/(n + 1))| = 1.

So if δ > 0 is chosen, there exists n ∈ N so that 1/(n(n + 1)) < δ, and in that case,
x = 1/n and y = 1/(n + 1) satisfy |x − y| < δ but

|f (x) − f (y)| = 1 > ε.

Definition 5.5. A function f : A → R is said to be Lipschitz continuous if there exists
L > 0 such that
        |f (x) − f (y)| ≤ L|x − y|
for all x, y ∈ A. Any number L that satisfies this condition is called a Lipschitz constant
associated to f .
Proposition 5.6. If A ⊂ R is any set and f : A → R is Lipschitz continuous, then it
is uniformly continuous.
Proof. For ε > 0, take δ := ε/L, where L is any Lipschitz constant for f .

(End of Day 14)

Theorem 5.7. If K is compact and f : K → R is continuous on K, then f is uniformly


continuous on K.
Proof. Suppose f is not uniformly continuous. Then there exists ε > 0 for which no δ > 0
works. In other words, for each n ∈ N, δ = 1/n does not work, so there exist two points
xn , yn ∈ K such that
        |xn − yn | < 1/n   but   |f (xn ) − f (yn )| ≥ ε.        (IV.2)
Now the sequence (xn ) has a subsequence (x_{n_j}) which converges to a point x ∈ K.
Consider the corresponding subsequence (y_{n_j}) of (yn ). Again, this has a convergent
subsequence (y_{n_{j_k}}) that converges to a point y ∈ K. Now consider the two sequences
z_k := x_{n_{j_k}} and w_k := y_{n_{j_k}} . Then,
        lim_{k→∞} z_k = x and lim_{k→∞} w_k = y.
Moreover, |z_k − w_k | < 1/n_{j_k} , so x = y must hold.

Now we use the fact that f is continuous: We know that
        lim_{k→∞} f (z_k ) = f (x) = f (y) = lim_{k→∞} f (w_k ).
But we know from Equation IV.2 that
        |f (z_k ) − f (w_k )| ≥ ε.
This cannot happen, so our assumption is false.

Example 5.8. Let f : [0, 1] → R be a continuous, non-negative function (note that
[0, 1] is closed and bounded, hence compact). By Theorem 5.7, f is uniformly continuous,
so for a given ε > 0, there is a δ > 0 such that
        |x − y| < δ ⇒ |f (x) − f (y)| < ε.
Choose n ∈ N so that 1/n < δ. Define a partition of [0, 1] by
        P := 0 < 1/n < 2/n < . . . < (n − 1)/n < 1.
For each 0 ≤ j ≤ n − 1, the subinterval Ij := [j/n, (j + 1)/n] of that partition is such
that for any x, y ∈ Ij ,
        |f (x) − f (y)| < ε.
In particular, if
        mj := inf f (Ij ) and Mj := sup f (Ij ),
then 0 ≤ Mj − mj < ε holds. We may now define
        A := Σ_{j=0}^{n−1} Mj ℓ(Ij ) = (1/n) Σ_{j=0}^{n−1} Mj , and
        B := (1/n) Σ_{j=0}^{n−1} mj .
Then, we have
        0 ≤ A − B < ε.
We will see later that
        B ≤ ∫_0^1 f (t)dt ≤ A
as well. For now, try to visualize these numbers A and B in terms of the area under the
curve f .
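Note: (Aside, not part of these notes.) The following Python sketch, with an arbitrary choice of f, computes the numbers A and B of Example 5.8 for increasing n; they trap the area under the curve and squeeze together.

import numpy as np

def upper_lower(f, n):
    # A and B from Example 5.8: sup and inf of f on each subinterval
    # [j/n, (j+1)/n], approximated by sampling, then weighted by 1/n.
    A = B = 0.0
    for j in range(n):
        xs = np.linspace(j / n, (j + 1) / n, 200)
        vals = f(xs)
        A += vals.max() / n
        B += vals.min() / n
    return A, B

f = lambda x: x**2          # continuous and non-negative on [0, 1]
for n in [4, 16, 64, 256]:
    A, B = upper_lower(f, n)
    print(n, B, A)           # both approach 1/3, and A - B shrinks like 1/n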

V. Differentiation
1. Definition and Basic Properties
Definition 1.1. Let A = (a, b) ⊂ R be an open interval and f : A → R be a function,
and c ∈ A be fixed.

(i) We say that f is differentiable at c if
        lim_{x→c} (f (x) − f (c))/(x − c)
exists. In that case, we denote this number by f '(c).

(ii) If f is differentiable at each point of A, then we say that f is differentiable on A.
In that case, we may define the derivative of f as the function f ' : A → R given by
        f '(t) := lim_{x→t} (f (x) − f (t))/(x − t).
Example 1.2.

(i) Fix n ∈ N. Let f : R → R be f (x) = x^n . Then f is differentiable on R and
        f '(t) = n t^(n−1) .

Proof. Fix c ∈ R. Note that x^n − c^n = (x − c)(x^(n−1) + c x^(n−2) + . . . + c^(n−1) ), so
        lim_{x→c} (f (x) − f (c))/(x − c) = lim_{x→c} (x^(n−1) + c x^(n−2) + . . . + c^(n−1) ) = n c^(n−1)
because polynomials are continuous functions (see Part (iii) of Example IV.2.6).

(ii) Let g : R → R be g(x) := |x|, then g is not differentiable at c = 0.

Proof. Taking xn := 1/n and yn := −1/n, we see that
        lim_{x→0} |x|/x
does not exist.

(iii) Let f : R → R be
        f (x) = x sin(1/x) if x ≠ 0, and f (0) = 0.
Then, f is not differentiable at c = 0.

Proof. Taking xn := 1/(2nπ) and yn := 1/(2nπ + π/2), we see that
        lim_{x→0} f (x)/x = lim_{x→0} sin(1/x)
does not exist.

(iv) Let f : R → R be
        f (x) = x^2 sin(1/x) if x ≠ 0, and f (0) = 0.
Then, f is differentiable at c = 0.

Proof. Again,
        lim_{x→0} (f (x) − f (0))/(x − 0) = lim_{x→0} x sin(1/x) = 0
by Part (vii) of Example IV.2.6.


Proposition 1.3. If f : (a, b) → R is differentiable at a point c ∈ (a, b), then f is
continuous at c.
Proof. We know that
        f '(c) = lim_{x→c} (f (x) − f (c))/(x − c)
exists. So, if (xn ) ⊂ (a, b) is such that lim_{n→∞} xn = c, then
        lim_{n→∞} (f (xn ) − f (c)) = lim_{n→∞} [(f (xn ) − f (c))/(xn − c)] · (xn − c) = 0 × f '(c) = 0
by Theorem II.1.8. Hence, lim_{x→c} f (x) = f (c).
Lemma 1.4. Suppose A ⊂ R is any set and g : A → R is continuous.
(i) If c ∈ A is such that g(c) > 0, then there exists δ > 0 such that g(y) > 0 for all
y ∈ B(c, δ) ∩ A.

(ii) If c ∈ A is such that g(c) < 0, then there exists δ > 0 such that g(y) < 0 for all
y ∈ B(c, δ) ∩ A.
Proof. Exercise.
Theorem 1.5 (Algebra of Differentiable Functions). Suppose f, g : (a, b) → R are two
functions that are differentiable at c ∈ (a, b), and let k ∈ R. Then

(i) (f + g) is differentiable at c and (f + g)0 (c) = f 0 (c) + g 0 (c).
(ii) (kf ) is differentiable at c and (kf )0 (c) = kf 0 (c).
(iii) [Product Rule] (f g) is differentiable at c and (f g)0 (c) = f 0 (c)g(c) + f (c)g 0 (c).
(iv) [Quotient Rule] If g(c) ≠ 0, then (f /g) is differentiable at c and
        (f /g)'(c) = [f '(c)g(c) − f (c)g'(c)] / g(c)^2 .
Proof.
(i) Exercise
(ii) Exercise
(iii) Suppose (xn ) ⊂ (a, b) is such that lim_{n→∞} xn = c. Then, write
        [(f g)(xn ) − (f g)(c)] / (xn − c)
            = [f (xn )g(xn ) − f (c)g(xn ) + f (c)g(xn ) − f (c)g(c)] / (xn − c)
            = [(f (xn ) − f (c))/(xn − c)] g(xn ) + f (c) [(g(xn ) − g(c))/(xn − c)].
Since g is continuous at c by Proposition 1.3,
        lim_{n→∞} g(xn ) = g(c).
Hence, by the algebra of limits Theorem II.1.8,
        lim_{n→∞} [(f g)(xn ) − (f g)(c)] / (xn − c) = g(c)f '(c) + f (c)g'(c).

(iv) By part (iii), it suffices to show that (1/g) is differentiable at c and that
        (1/g)'(c) = −g'(c)/g(c)^2 .
Note that g(c) ≠ 0 and g is continuous at c (by Proposition 1.3), so there is δ > 0
such that g(y) ≠ 0 for all y ∈ B(c, δ) (by Lemma 1.4). Therefore, (1/g) is defined
near c, and we may ask whether it is differentiable at c.

Now, consider a sequence (xn ) ⊂ (a, b) with lim_{n→∞} xn = c. Then,
        lim_{n→∞} [1/g(xn ) − 1/g(c)] / (xn − c) = lim_{n→∞} [g(c) − g(xn )] / [(xn − c) g(xn ) g(c)]
            = (1/g(c)) · lim_{n→∞} [−(g(xn ) − g(c))/(xn − c)] · (1/g(xn ))
            = (1/g(c)) · (−g'(c)) · (1/g(c)),
where the last step follows from the algebra of limits Theorem II.1.8.

(End of Day 15)

Theorem 1.6 (Chain Rule). Let A = (a, b) and f : A → R and g : B → R be two


functions such that f (A) ⊂ B. If f is differentiable at c ∈ A and g is differentiable at
f (c) ∈ B, then (g ◦ f ) is differentiable at c and

(g ◦ f )0 (c) = f 0 (c)g 0 (f (c)).

Proof. Let (xn ) be a sequence in A such that lim_{n→∞} xn = c. Then, f is continuous by
Proposition 1.3, so
        lim_{n→∞} f (xn ) = f (c).
Since g is differentiable at f (c), the limit
        lim_{n→∞} [g(f (xn )) − g(f (c))] / [f (xn ) − f (c)] = g'(f (c))
exists. Now consider
        [(g ◦ f )(xn ) − (g ◦ f )(c)] / (xn − c) = [g(f (xn )) − g(f (c))] / [f (xn ) − f (c)] · [f (xn ) − f (c)] / (xn − c),
so
        lim_{n→∞} [(g ◦ f )(xn ) − (g ◦ f )(c)] / (xn − c) = g'(f (c)) f '(c)
by the algebra of limits Theorem II.1.8.

2. The Mean Value Theorem


Remark 2.1. Let [a, b] be a closed interval and f : [a, b] → R be a function that is con-
tinuous on [a, b] and differentiable at each point in (a, b). We know from Corollary IV.3.4
that f has both maxima and minima on [a, b]. We now try to understand how to find
these points and sketch the curve of f .

Definition 2.2. Let f : A → R be a function.

(i) f is said to have a local maximum at c ∈ A if there exists δ > 0 such that

f (y) ≤ f (c) for all y ∈ B(c, δ) ∩ A.

(ii) The term local minimum is defined analogously.

(iii) A local extremum of f is either a local maximum or a local minimum.

Proposition 2.3. Suppose f : (a, b) → R has a local extremum at a point c ∈ (a, b). If
f is differentiable at c, then f 0 (c) = 0.

Note: A point c ∈ (a, b) such that f 0 (c) = 0 is called a critical point of f .

Proof. Assume without loss of generality that c is a local maximum. Then, there is
δ > 0 such that
        f (y) ≤ f (c) for all y ∈ B(c, δ)
(we may assume that B(c, δ) ⊂ (a, b) since (a, b) is open). Now choose the sequence
xn := c + 1/n, so that lim_{n→∞} xn = c. By ignoring the first few terms, we may
assume that xn ∈ B(c, δ) for all n ∈ N. Hence,
        f '(c) = lim_{n→∞} (f (xn ) − f (c))/(xn − c) ≤ 0
since xn > c for all n ∈ N. Taking yn := c − 1/n, we see that
        f '(c) = lim_{n→∞} (f (yn ) − f (c))/(yn − c) ≥ 0.
Hence, f '(c) = 0.
Theorem 2.4 (Rolle’s Theorem). Let f : [a, b] → R be continuous and differentiable on
(a, b). If f (a) = f (b), then there exists c ∈ (a, b) such that f 0 (c) = 0.
Proof. Assume that f 0 (c) 6= 0 for all c ∈ (a, b). By Proposition 2.3, f has no local
extrema in (a, b). By Corollary IV.3.4, f has absolute extrema
M := sup f ([a, b]) and m := inf f ([a, b])
in [a, b]. So these extrema must be attained at a or b. Since f (a) = f (b), it follows that
m = M.

But in that case, f is a constant function, so f 0 (c) = 0 for any c ∈ (a, b).
Theorem 2.5 (Mean Value Theorem). Suppose f : [a, b] → R is continuous and differ-
entiable on (a, b). Then, there is a point c ∈ (a, b) such that
        (f (b) − f (a))/(b − a) = f '(c).
(Equivalently, f (b) − f (a) = f '(c)(b − a).)
Proof. Let h : [a, b] → R be given by
h(x) := f (x)(b − a) − x(f (b) − f (a)).
Then, h is continuous on [a, b] and differentiable on (a, b). Moreover,
h(a) = f (a)(b − a) − a(f (b) − f (a)) = f (a)b − af (b) and
h(b) = f (b)(b − a) − b(f (b) − f (a)) = −f (b)a + bf (a)
So h(a) = h(b). By Rolle’s Theorem (Theorem 2.4), there exists c ∈ (a, b) such that
h0 (c) = 0. Hence,
f 0 (c)(b − a) − (f (b) − f (a)) = 0
So c is the desired point.

3. Extrema and Curve Sketching
Theorem 3.1. Let f : [a, b] → R be continuous and differentiable on (a, b).

(i) If f 0 (x) > 0 for all x ∈ (a, b), then f is strictly increasing on [a, b].

(ii) If f 0 (x) < 0 for all x ∈ (a, b), then f is strictly decreasing on [a, b].

(iii) If f 0 (x) = 0 for all x ∈ (a, b), then f is constant on [a, b].

Proof.

(i) If a ≤ x < y ≤ b, then there exists c ∈ (x, y) such that

f (y) − f (x) = f 0 (c)(y − x)

Since f 0 (c) > 0, it follows that f (y) > f (x). Hence, f is strictly increasing.

(ii) Similar to part (i).

(iii) For any x with a < x ≤ b, there exists c ∈ (a, x) such that

f (x) − f (a) = f 0 (c)(x − a)

Since f 0 (c) = 0, f (x) = f (a). This is true for all x ∈ (a, b], so f is constant.

Proposition 3.2 (First Derivative Test for Extrema). Let f : [a, b] → R be continuous
and differentiable on (a, b). Fix c ∈ (a, b) and δ > 0 so that (c − δ, c + δ) ⊂ (a, b).

(i) If f 0 (x) > 0 for all x ∈ (c − δ, c) and f 0 (x) < 0 for all x ∈ (c, c + δ), then f has a
local maximum at c.

(ii) If f 0 (x) < 0 for all x ∈ (c − δ, c) and f 0 (x) > 0 for all x ∈ (c, c + δ), then f has a
local minimum at c.

Proof. Part (ii) is similar to (i), so we only prove (i).

By Theorem 3.1, f is strictly increasing on [c − δ, c] and strictly decreasing on [c, c + δ].


Hence, for all x ∈ [c − δ, c + δ] \ {c}, we have

f (x) < f (c)

so c is a local maximum.

Definition 3.3. A function f : (a, b) → R is said to be continuously differentiable on


(a, b) if it is differentiable and f 0 is continuous.

Proposition 3.4 (Second Derivative Test for Extrema). Suppose f : [a, b] → R is
continuous and twice continuously differentiable on (a, b) (i.e. f 0 is differentiable and f 00
is continuous). Suppose c ∈ (a, b) is a critical point of f .
(i) If f 00 (c) < 0, then f has a local maximum at c.
(ii) If f 00 (c) > 0, then f has a local minimum at c.
Proof. We only prove part (ii) because part (i) is similar.

Since f 00 is continuous, it follows from Lemma 1.4 that there is δ > 0 such that
f 00 (x) > 0
for all x ∈ B(c, δ). By Theorem 3.1, it follows that f 0 is strictly increasing on [c−δ, c+δ].
However,
f 0 (c) = 0
So it follows that
(i) f 0 (x) < 0 for all x ∈ (c − δ, c)
(ii) f 0 (x) > 0 for all x ∈ (c, c + δ)
So by the First Derivative Test for Extrema (Proposition 3.2), we see that f has a local
minimum at c.
(End of Day 16)
Example 3.5. We (indicate how to) sketch the graph of f : [−5, 5] → R given by
f (x) = x4 − 4x3 + 10

(i) Note that


f 0 (x) = 4x3 − 12x2 = 4x2 (x − 3)
so c = 0 and c = 3 are the two critical points. Moreover,
(a) f 0 (x) < 0 when x < 0, so f is decreasing on [−5, 0).
(b) f 0 (x) < 0 when 0 < x < 3, so f is decreasing on (0, 3).
(c) f 0 (x) > 0 when x > 3, so f is increasing on (3, 5].
(ii) Note that
f 00 (x) = 12x2 − 24x = 12x(x − 2)
Hence,
(a) f 00 (0) = 0
(b) f 00 (3) > 0 so c = 3 is a local minimum.
This information can now be used to sketch the graph. (You may need more information
about asymptotes, concavity, etc. for more complicated functions. We do not pursue
this here. For more on this, see [Apostol Calculus])
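Note: (Aside, not part of these notes.) A quick numerical check of the analysis in Example 3.5, purely for illustration: evaluating f and f' on a grid confirms the sign pattern of f' and hence the increasing/decreasing behaviour found above.

import numpy as np

f  = lambda x: x**4 - 4*x**3 + 10
df = lambda x: 4*x**3 - 12*x**2        # = 4x^2 (x - 3)

for x in np.linspace(-5, 5, 11):
    print(f"x = {x:5.1f}   f(x) = {f(x):9.1f}   f'(x) = {df(x):8.1f}")
# f' is negative for x < 3 (apart from f'(0) = 0) and positive for x > 3,
# so the only local (in fact absolute) minimum in (-5, 5) is at x = 3.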

4. Taylor’s Theorem
Definition 4.1. Let f : (a, b) → R be a differentiable function, so that f 0 : (a, b) → R
is defined.
(i) If f 0 is differentiable, we say that f is twice differentiable, and we define f (2) :=
f 00 := (f 0 )0 .

(ii) For n ∈ N, we inductively define f to be n-times differentiable if f (n−1) is differ-


entiable, in which case, we write

f (n) := (f (n−1) )0 .

(iii) For n ∈ N, we say that f is n-times continuously differentiable if f is n-times


differentiable and f (n) is continuous.

(iv) We say that f is infinitely differentiable (or smooth) if f is n-times differentiable


for any n ∈ N.
Remark 4.2. Suppose f : (a, b) → R is differentiable.
(i) Given two points x0 < x ∈ (a, b), the mean value theorem states that there is a
point c ∈ (x0 , x) such that

f (x) = f (x0 ) + f 0 (c)(x − x0 )

(ii) Suppose f 0 is also continuous, and suppose that |x − x0 | is small. Then, |c − x0 |


would also be small, and so
f 0 (c) ≈ f 0 (x0 ).
Therefore, the Mean Value Theorem gives us an approximation

f (x) ≈ f (x0 ) + f 0 (x0 )(x − x0 ).

(iii) Notice that the RHS is the equation of the tangent line at the point (x0 , f (x0 )).
So the Mean Value Theorem implies that the graph of the function ‘near’ x0 looks
like the tangent line.

(iv) Taylor’s theorem is a generalization of this idea for functions that have higher
order derivatives.
Definition 4.3. Let n ∈ N, and let f : (a, b) → R be an n-times differentiable function.
Given x0 ∈ (a, b) fixed, we define the nth order Taylor polynomial of f at x0 to be
        Pn (t) := f (x0 ) + f '(x0 )(t − x0 ) + [f ''(x0 )/2!] (t − x0 )^2 + . . . + [f ^(n)(x0 )/n!] (t − x0 )^n .
Remark 4.4.

(i) Note that Pn is a polynomial of degree ≤ n.

(ii) Moreover,

Pn (x0 ) = f (x0 )
Pn0 (x0 ) = f 0 (x0 )
Pn00 (x0 ) = f 00 (x0 )
..
.

Hence,
Pn(i) (x0 ) = f (i) (x0 ) for all 0 ≤ i ≤ n

Example 4.5.

(i) If f : R → R is a polynomial, say
        f (x) = 1 + 2x + 3x^2 + 4x^3 ,
then, at x0 = 0, we have
        P0 (t) = f (0) = 1
        P1 (t) = f (0) + f '(0)(t − 0) = 1 + 2t
        P2 (t) = f (0) + f '(0)t + [f ''(0)/2!] t^2 = 1 + 2t + 3t^2
        P3 (t) = 1 + 2t + 3t^2 + 4t^3 = f (t).

(ii) Indeed, if f : R → R is a polynomial of the form
        f (x) = a_0 + a_1 x + . . . + a_n x^n ,
then it follows that
        a_i = f ^(i)(0)/i!
for all 0 ≤ i ≤ n. Therefore, Pn = f .

(iii) Suppose f : R → R is f (x) = cos(x). Then, at x0 = 0 we have
        f ^(i)(0) = 0 if i is odd, +1 if i is even and i/2 is even, and −1 if i is even and i/2 is odd.
Hence,
        P0 (t) = f (0) = 1
        P1 (t) = f (0) + f '(0)t = 1
        P2 (t) = 1 − t^2/2!
        P3 (t) = P2 (t)
        P4 (t) = 1 − t^2/2! + t^4/4!
        P5 (t) = P4 (t).
The graphs of these functions are shown here: https://www.geogebra.org/m/s9SkCsvC.
(iv) Suppose f : (−1, 1) → R is given by
        f (x) = 1/(1 − x).
Then,
        f '(x) = 1/(1 − x)^2
        f ''(x) = 2/(1 − x)^3
        ...
        f ^(n)(x) = n!/(1 − x)^(n+1) .
So at x0 = 0, we get
        P0 (t) = f (0) = 1
        P1 (t) = f (0) + f '(0)t = 1 + t
        P2 (t) = 1 + t + t^2
        ...
        Pn (t) = 1 + t + t^2 + . . . + t^n .
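Note: (Aside, not part of these notes.) The Python sketch below builds the Taylor polynomials of cos at x0 = 0 directly from Definition 4.3 and prints the worst error on [−1, 1]; the error drops rapidly with n, as Taylor's Theorem below predicts. The helper taylor_cos is our own naming.

import math
import numpy as np

def taylor_cos(n, t):
    # P_n(t) for f = cos at x0 = 0: only even i contribute, with sign (-1)^(i/2).
    total = 0.0
    for i in range(0, n + 1, 2):
        total += (-1) ** (i // 2) * t ** i / math.factorial(i)
    return total

ts = np.linspace(-1.0, 1.0, 1001)
for n in [0, 2, 4, 6, 8]:
    err = max(abs(taylor_cos(n, t) - math.cos(t)) for t in ts)
    print(n, err)   # max error on [-1, 1]; roughly of size 1/(n+2)!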

The next lemma is an analogue of Rolle’s Theorem (Theorem 2.4) for functions with
higher order derivatives.
Lemma 4.6. Suppose g : (a, b) → R is a (n + 1)-times differentiable function and
x0 , x ∈ (a, b) are fixed and distinct. Suppose that
g(x) = g(x0 ) = g 0 (x0 ) = g 00 (x0 ) = . . . = g (n) (x0 ) = 0.
Then, there is a point c between x0 and x such that
g (n+1) (c) = 0

Proof. We use the Mean Value Theorem (Theorem 2.5) repeatedly. Assume WLOG
that x0 < x.
(i) By the Mean Value Theorem (Theorem 2.5), there exists x1 ∈ (x0 , x) such that
g 0 (x1 ) = 0.
(ii) Now, g 0 (x0 ) = 0 so there exists x2 ∈ (x0 , x1 ) such that g 00 (x2 ) = 0.
(iii) Thus proceeding, we obtain points
x0 < xn < xn−1 < . . . < x1 < x
such that
g (i) (xi ) = 0 for all 0 ≤ i ≤ n.
(iv) Finally, there exists c := xn+1 ∈ (x0 , xn ) such that
g (n+1) (c) = 0

Theorem 4.7 (Taylor's Theorem). Suppose f : (a, b) → R is (n+1)-times differentiable,
and x0 , x ∈ (a, b) are fixed and distinct. Then, there exists c between x0 and x such that
        f (x) = Pn (x) + [f ^(n+1)(c)/(n + 1)!] (x − x0 )^(n+1) ,
where Pn is the nth order Taylor polynomial of f at x0 .
Proof. Assume WLOG that x0 < x.
(i) Define
        M := [f (x) − Pn (x)] / (x − x0 )^(n+1) ∈ R.
Now define g : (a, b) → R by
        g(t) := f (t) − Pn (t) − M (t − x0 )^(n+1) .

(ii) We know that
        f (x0 ) = Pn (x0 ), f '(x0 ) = Pn '(x0 ), . . . , f ^(n)(x0 ) = Pn ^(n)(x0 )
by Remark 4.4. Therefore,
        0 = g(x0 ) = g'(x0 ) = . . . = g^(n)(x0 ).
Moreover, by our choice of M ,
        g(x) = 0
as well.

(iii) So by Lemma 4.6, there exists c ∈ (x0 , x) such that
        g^(n+1)(c) = 0.
Now note that
        g^(n+1)(c) = f ^(n+1)(c) − 0 − (n + 1)! M.
Hence,
        M = f ^(n+1)(c)/(n + 1)!
and so
        f (x) = g(x) + Pn (x) + M (x − x0 )^(n+1) = Pn (x) + [f ^(n+1)(c)/(n + 1)!] (x − x0 )^(n+1) .

(End of Day 17)

VI. Integration
1. Lower and Upper Integrals
Throughout this section, we will fix a closed and bounded interval [a, b] ⊂ R and a
bounded function f : [a, b] → R. i.e. There exist m, M ∈ R such that
        m ≤ f (x) ≤ M
for all x ∈ [a, b].

Definition 1.1.

(i) A partition of [a, b] is a finite set P := {x0 , x1 , . . . , xn } ⊂ [a, b] such that
        a = x0 < x1 < . . . < xn−1 < xn = b.
We write ∆_i^P := ∆i := (xi − xi−1 ), for 1 ≤ i ≤ n.

(ii) Given a partition P as above, define
        mi := inf{f (x) : xi−1 ≤ x ≤ xi } and Mi := sup{f (x) : xi−1 ≤ x ≤ xi }
for 1 ≤ i ≤ n. Define the Lower Riemann Sum of f with respect to P by
        L(f, P) := Σ_{i=1}^{n} mi ∆i
and define the Upper Riemann Sum of f with respect to P by
        U(f, P) := Σ_{i=1}^{n} Mi ∆i .

Lemma 1.2. Let f : [a, b] → R be a bounded function and P be a partition of [a, b], and
let m := inf f ([a, b]) and M := sup(f [a, b]) as above. Then,

m(b − a) ≤ L(f, P) ≤ U(f, P) ≤ M (b − a).


Proof. Note that Σ_{i=1}^{n} ∆i = (b − a). Moreover, for each 1 ≤ i ≤ n, we have
        m ≤ mi ≤ Mi ≤ M.
So
        m(b − a) = Σ_{i=1}^{n} m ∆i
                 ≤ Σ_{i=1}^{n} mi ∆i = L(f, P)
                 ≤ Σ_{i=1}^{n} Mi ∆i = U(f, P)
                 ≤ Σ_{i=1}^{n} M ∆i = M (b − a).

Note: To understand what Riemann sums measure, see this link: https://www.
geogebra.org/m/Fv6t696j.
In the above setting, the set

Af := {L(f, P) : P a partition of [a, b]} ⊂ R

is bounded above by M (b − a). Similarly, the set

Bf := {U(f, P) : P a partition of [a, b]} ⊂ R

is bounded below by m(b − a). Therefore, we may define

Definition 1.3. The upper and lower Riemann integrals of f are defined by
        U(f ) := U_a^b (f ) := inf{U(f, P) : P a partition of [a, b]}
        L(f ) := L_a^b (f ) := sup{L(f, P) : P a partition of [a, b]}.

Example 1.4. Define f : [0, 1] → R by
        f (x) = 0 if x ∈ Q ∩ [0, 1], and f (x) = 1 otherwise.
Then, we claim that
        U_0^1 (f ) = 1 while L_0^1 (f ) = 0.

Proof. To see this, fix any partition P = {x0 , x1 , . . . , xn } as above. Each subinterval
[xi−1 , xi ] contains both a rational number and an irrational number, so for each 1 ≤ i ≤ n,
        mi = 0 and Mi = 1.
Therefore, L(f, P) = 0 and U(f, P) = 1.

Definition 1.5. Let P and P̃ be two partitions of the interval [a, b]. We say that P̃ is
a refinement of P (or P̃ is finer than P) if P ⊂ P̃.

In other words, P̃ breaks the interval into at least as many pieces as P, and each
subinterval created by P̃ is contained in a subinterval created by P.

Lemma 1.6. If P̃ is a refinement of P, then
        L(f, P) ≤ L(f, P̃) and U(f, P) ≥ U(f, P̃).

Proof. We prove the first inequality since the second is analogous. Write
        P := a = x0 < x1 < x2 < . . . < xn = b
and write
        L(f, P) := Σ_{i=1}^{n} mi (xi − xi−1 ),
where mi = inf{f (x) : xi−1 ≤ x ≤ xi }.

(i) Assume first that P̃ \ P = {y} and that
        xi−1 < y < xi
for some 1 ≤ i ≤ n. Now write
        α := inf{f (x) : xi−1 ≤ x ≤ y} and β := inf{f (x) : y ≤ x ≤ xi }.
Then, observe that α ≥ mi and β ≥ mi . Therefore,
        L(f, P̃) − L(f, P) = α(y − xi−1 ) + β(xi − y) − mi (xi − xi−1 )
                          ≥ mi (y − xi−1 ) + mi (xi − y) − mi (xi − xi−1 )
                          = 0,
so L(f, P̃) ≥ L(f, P).

(ii) Now suppose P̃ \ P = {y1 , y2 , . . . , yk }, then define
        P1 = P ∪ {y1 }
        P2 = P1 ∪ {y2 }
        ...
        Pk = Pk−1 ∪ {yk } = P̃.
By part (i), we have
        L(f, P) ≤ L(f, P1 ) ≤ L(f, P2 ) ≤ . . . ≤ L(f, Pk ) = L(f, P̃).
Remark 1.7. Given f : [a, b] → R as above, we had two sets
        Af := {L(f, P) : P a partition of [a, b]}, and
        Bf := {U(f, P) : P a partition of [a, b]}.
If P and P̃ are two partitions of [a, b], then
        S := P ∪ P̃
is a partition that is finer than both P and P̃. Therefore, by Lemma 1.2 and Lemma 1.6,
        L(f, P) ≤ L(f, S) ≤ U(f, S) ≤ U(f, P̃).
Hence, for any a ∈ Af and b ∈ Bf , we have a ≤ b.

(End of Day 18)

Lemma 1.8. Suppose A, B ⊂ R are two non-empty sets such that a ≤ b for any a ∈ A
and b ∈ B. Then, sup(A) and inf(B) both exist, and
sup(A) ≤ inf(B)
Proof. Fix b ∈ B, then by hypothesis, b is an upper bound for A. So sup(A) exists and
sup(A) ≤ b.
Now this holds for all b ∈ B, so B is bounded below by sup(A). Hence, inf(B) exists
and
sup(A) ≤ inf(B).

Proposition 1.9. For any bounded function f : [a, b] → R, write m := inf(f [a, b]) and
M := sup(f [a, b]). Then
m(b − a) ≤ L(f ) ≤ U(f ) ≤ M (b − a).
Proof. We know from Lemma 1.2 that for any partition P of [a, b], we have
m(b − a) ≤ L(f, P) ≤ U(f, P) ≤ M (b − a).
Therefore,
L(f ) = sup{L(f, P) : P a partition of [a, b]} ≥ m(b − a)
and similarly, U(f ) ≤ M (b − a) as well.

Moreover, by Remark 1.7,
        L(f, P) ≤ U(f, P̃)
for any two partitions P and P̃. So by Lemma 1.8, we have L(f ) ≤ U(f ).

2. Integrable Functions
Definition 2.1. A bounded function f : [a, b] → R is said to be Riemann integrable if
        L_a^b (f ) = U_a^b (f ).
For an integrable function, we write
        ∫_a^b f := ∫_a^b f (x)dx := L_a^b (f ) = U_a^b (f ).

Lemma 2.2. Let A, B ⊂ R be two non-empty sets such that a ≤ b for all a ∈ A and
b ∈ B. Then, sup(A) = inf(B) iff, for each  > 0, there exist a ∈ A and b ∈ B such that
(b − a) < .

Proof. By Lemma 1.8, sup(A) and inf(B) both exist and sup(A) ≤ inf(B).

⇒: Suppose that sup(A) = inf(B) and fix  > 0. Then, there exists a ∈ A such that

sup(A) − /2 < a ≤ sup(A).

Similarly, there exists b ∈ B such that

inf(B) ≤ b < inf(B) + /2

Subtracting, we see that

(b − a) ≤ inf(B) + /2 − sup(A) + /2 = .

⇐: Suppose that the condition given above holds. We WTS that sup(A) = inf(B).
Suppose that sup(A) < inf(B), so that

 := inf(B) − sup(A) > 0

By hypothesis, there exists a ∈ A and b ∈ B such that (b − a) < . Then, it follows


that
 = inf(B) − sup(A) ≤ (b − a) < 
This is a contradiction, and thus sup(A) = inf(B) must hold.

Theorem 2.3 (Integrability Criterion). For a bounded function f : [a, b] → R, TFAE:

(i) f is Riemann integrable

(ii) For each  > 0, there is a partition P of [a, b] such that

U(f, P) − L(f, P) < .

(iii) There is a sequence of partitions Pn of [a, b] such that
        lim_{n→∞} (U(f, Pn ) − L(f, Pn )) = 0.
In that case,
        ∫_a^b f = lim_{n→∞} U(f, Pn ) = lim_{n→∞} L(f, Pn ).

Proof.
(i) ⇒ (ii) : Suppose f is integrable and fix ε > 0. Then, by Lemma 2.2, there exist two
partitions P and P̃ such that
        U(f, P) − L(f, P̃) < ε.
Take S := P ∪ P̃; then by Lemma 1.6, we have
        U(f, S) − L(f, S) ≤ U(f, P) − L(f, P̃) < ε.
So (ii) must hold.

(ii) ⇒ (iii) : Suppose (ii) holds, then for each n ∈ N, there is a partition Pn such that
        U(f, Pn ) − L(f, Pn ) < 1/n.
So it follows that
        lim_{n→∞} (U(f, Pn ) − L(f, Pn )) = 0.
Moreover, for each n ∈ N, we have
        ∫_a^b f = U(f ) ≤ U(f, Pn ) ≤ L(f, Pn ) + 1/n ≤ L(f ) + 1/n = ∫_a^b f + 1/n.
Therefore,
        |U(f, Pn ) − ∫_a^b f | ≤ 1/n and |L(f, Pn ) − ∫_a^b f | ≤ 1/n.
Hence,
        ∫_a^b f = lim_{n→∞} U(f, Pn ) = lim_{n→∞} L(f, Pn ).

(iii) ⇒ (i) : Suppose (iii) holds. Take
        A := {L(f, P) : P a partition of [a, b]}
        B := {U(f, P) : P a partition of [a, b]}.
Given ε > 0, there is n ∈ N so that
        U(f, Pn ) − L(f, Pn ) < ε.
By Lemma 2.2, we see that L(f ) = U(f ), so f must be integrable.

(End of Day 19)

Definition 2.4. A function f : [a, b] → R is called a step function if there is a partition
P = {x0 , x1 , . . . , xn } of [a, b] and constants {c1 , c2 , . . . , cn } ⊂ R such that
        f (x) = c1 if a ≤ x < x1 ,
        f (x) = c2 if x1 ≤ x < x2 ,
        . . .
        f (x) = cn if xn−1 ≤ x ≤ b.
Proposition 2.5. A step function is Riemann integrable.


Proof. Given a step function f : [a, b] → R, choose a partition P = {x0 , x1 , . . . , xn } as
in Definition 2.4. Then, for each 1 ≤ i ≤ n, we have

mi = inf(f [xi−1 , xi ]) = sup(f [xi−1 , xi ]) = Mi

so that L(f, P) = U(f, P). So the Integrability Criterion of Theorem 2.3 holds.
Definition 2.6. A function f : [a, b] → R is said to be
(i) increasing (or non-decreasing) if f (x) ≤ f (y) whenever x ≤ y.

(ii) decreasing (or non-increasing) if f (x) ≥ f (y) whenever x ≤ y.

(iii) monotone if it is either increasing or decreasing.


Proposition 2.7. If f : [a, b] → R is monotone, then it is Riemann integrable. More-
over,
        ∫_a^b f = lim_{n→∞} ((b − a)/n) Σ_{i=1}^{n} f (a + i(b − a)/n)
               = lim_{n→∞} ((b − a)/n) Σ_{j=0}^{n−1} f (a + j(b − a)/n).

Proof. Assume WLOG that f is increasing (the proof is similar for decreasing func-
tions).

(i) First note that f (a) ≤ f (x) ≤ f (b) for all x ∈ [a, b], so f is a bounded function.

(ii) Now we wish to verify the Integrability Criterion (Part (iii) of Theorem 2.3). Fix
n ∈ N, and choose a partition Pn = {x0 , x1 , x2 , . . . , xn } where
        xi = a + i(b − a)/n,
so that ∆i = (xi − xi−1 ) = (b − a)/n for all 1 ≤ i ≤ n. Then, observe that
        mi = inf(f ([xi−1 , xi ])) = f (xi−1 ) and Mi = sup(f ([xi−1 , xi ])) = f (xi ).
Therefore,
        L(f, Pn ) = Σ_{i=1}^{n} mi ∆i = ((b − a)/n) Σ_{i=1}^{n} f (xi−1 ), and similarly
        U(f, Pn ) = ((b − a)/n) Σ_{i=1}^{n} f (xi ).
Hence,
        U(f, Pn ) − L(f, Pn ) = ((b − a)/n) Σ_{i=1}^{n} [f (xi ) − f (xi−1 )] = (b − a)(f (b) − f (a))/n,
so
        lim_{n→∞} (U(f, Pn ) − L(f, Pn )) = 0.
The result follows from Part (iii) of Theorem 2.3.
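Note: (Aside, not part of these notes.) A small numerical illustration of Proposition 2.7 for the increasing function f(x) = √x on [0, 1], whose integral will turn out to be 2/3; the function riemann_sums below is our own naming.

import math

def riemann_sums(f, a, b, n):
    # Lower and upper Riemann sums of an increasing f for the uniform
    # partition x_i = a + i(b - a)/n (see the proof of Proposition 2.7).
    dx = (b - a) / n
    lower = sum(f(a + i * dx) for i in range(n)) * dx          # uses f(x_{i-1})
    upper = sum(f(a + i * dx) for i in range(1, n + 1)) * dx   # uses f(x_i)
    return lower, upper

f = math.sqrt
for n in [10, 100, 1000, 10000]:
    print(n, riemann_sums(f, 0.0, 1.0, n))
# Both sums approach 2/3, and their difference is (b - a)(f(b) - f(a))/n = 1/n.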

Theorem 2.8. If f : [a, b] → R is continuous, then it is Riemann integrable.

Proof. We wish to verify Part (ii) of Theorem 2.3, so fix ε > 0.

(i) Since f is uniformly continuous by Theorem IV.5.7, there is δ > 0 such that for
all x, y ∈ [a, b],
        |x − y| < δ ⇒ |f (x) − f (y)| < ε/(b − a).

(ii) Choose n ∈ N such that (b − a)/n < δ and consider the partition P = {x0 , x1 , . . . , xn }
given by
        xi = a + i(b − a)/n.
Then, note that ∆i = (xi − xi−1 ) < δ for all i.

(iii) Consider
        mi = inf(f [xi−1 , xi ]) and Mi = sup(f [xi−1 , xi ]).
Since [xi−1 , xi ] is a compact set, there are points yi , zi ∈ [xi−1 , xi ] such that
        mi = f (yi ) and Mi = f (zi ).
This follows from the Extreme Value Theorem Corollary IV.3.4.

(iv) Since |yi − zi | ≤ (xi − xi−1 ) < δ, we have
        Mi − mi < ε/(b − a)
for all 1 ≤ i ≤ n. Therefore,
        U(f, P) − L(f, P) = Σ_{i=1}^{n} Mi ∆i − Σ_{i=1}^{n} mi ∆i
                          = Σ_{i=1}^{n} (Mi − mi )∆i
                          < (ε/(b − a)) Σ_{i=1}^{n} ∆i
                          = ε.
So by the Integrability Criterion (Part (ii) of Theorem 2.3), f is integrable.

Lemma 2.9. For any n, m ∈ N,

(i) (n + 1)^(m+1) ≥ (n + m + 1) n^m .

(ii) n^(m+1) > (n + 1)^m (n − m).

(iii) Σ_{i=0}^{n−1} i^m < n^(m+1)/(m + 1) < Σ_{i=1}^{n} i^m .

Proof. (Not done in class. Here for reference only.)


(i) This is a short calculation.
m+1
X 
m+1 m+1 k
(n + 1) = n
k=0
k
m−1
X m + 1 
= nk + (m + 1)nm + nm+1
k=0
k
≥ (m + 1)nm + nm+1
= (n + m + 1)nm

(ii) We prove this by induction on m. Moreover, we may assume WLOG that n ≥ m.
(a) If m = 1, then this is true because n2 > n2 − 1 = (n + 1)(n − 1).
(b) Assume the result is true for (m − 1). i.e. We assume that
nm > (n + 1)m−1 (n − m + 1)
We wish to prove the result for m. Since n ≥ m, we have
nm+1 = n · nm
> n(n + 1)m−1 (n − m + 1)
= (n + 1)m−1 (n2 − nm + n)
≥ (n + 1)m−1 (n2 − nm + n − m)
= (n + 1)m−1 (n + 1)(n − m)
= (n + 1)m (n − m)

(iii) We prove this by induction on n.


(a) If n = 1, then we clearly have
1
0< <1
m+1
(b) Consider the first inequality. Suppose the result is true for n, and consider
n
X n−1
X
m
i = im + nm
i=1 i=1
m+1
n
< + nm (by induction hypothesis)
m+  1 
m n+m+1
=n
m+1
(n + 1)m+1
≤ (by part (i))
m+1
So the first inequality holds for (n + 1) as well.
(c) For the other inequality, again assume it is true for n and consider
n+1
X n
X
m
i = im + (n + 1)m
i=1 i=1
m+1
n
> + (n + 1)m (by induction hypothesis)
m+1
nm+1 + (n + 1)m (m + 1)
=
m+1
(n + 1) (n − m) + (n + 1)m (m + 1)
m
> (by part (ii))
m+1
(n + 1)m+1
=
m+1

Example 2.10. Fix m ∈ N, b > 0 and let f : [0, b] → R be given by f (x) = x^m . Then,
f is increasing so it is integrable by Proposition 2.7. Now consider the partition
        Pn = 0 < b/n < 2b/n < . . . < (n − 1)b/n < b.
Then
        L(f, Pn ) = (b/n) Σ_{j=0}^{n−1} f (jb/n)
                  = (b^(m+1)/n^(m+1)) Σ_{j=0}^{n−1} j^m
                  < (b^(m+1)/n^(m+1)) · n^(m+1)/(m + 1)
                  = b^(m+1)/(m + 1).
Note that the middle inequality holds by Lemma 2.9. Similarly,
        U(f, Pn ) > b^(m+1)/(m + 1).
Therefore, we have
        L(f, Pn ) < b^(m+1)/(m + 1) < U(f, Pn ).
By the Squeeze theorem (Theorem II.2.3), we see that
        ∫_0^b x^m dx = b^(m+1)/(m + 1).
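Note: (Aside, not part of these notes.) A quick numerical sanity check of Example 2.10, using the same partitions Pn; the helper below is our own naming, and the values m = 3, b = 2 are arbitrary.

def lower_upper_power(m, b, n):
    # L(f, P_n) and U(f, P_n) for f(x) = x^m on [0, b] with the uniform partition.
    dx = b / n
    lower = sum((j * b / n) ** m for j in range(n)) * dx
    upper = sum((i * b / n) ** m for i in range(1, n + 1)) * dx
    return lower, upper

m, b = 3, 2.0
for n in [10, 100, 1000]:
    L, U = lower_upper_power(m, b, n)
    print(n, L, U)            # both approach b^(m+1)/(m+1) = 16/4 = 4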

3. Properties of the Integral


For an interval [a, b] ⊂ R, we write I[a, b] for the set of all Riemann integrable functions
defined on [a, b].

(End of Day 20)

Theorem 3.1 (Monotonicity). Suppose f, g ∈ I[a, b] are such that f (x) ≤ g(x) for all
x ∈ [a, b]. Then
        ∫_a^b f ≤ ∫_a^b g.

Proof. Let P = {x0 , x1 , . . . , xn } be a partition of [a, b]. For each 1 ≤ i ≤ n, set
        Mi = sup(f [xi−1 , xi ]), and Ki = sup(g[xi−1 , xi ]).
By hypothesis, we have Mi ≤ Ki for all 1 ≤ i ≤ n, so
        U(f, P) = Σ_{i=1}^{n} Mi ∆i ≤ Σ_{i=1}^{n} Ki ∆i = U(g, P).
Hence,
        U(f ) ≤ U(g, P).
This is true for every P, so U(f ) is a lower bound for the set {U(g, P) : P a partition of [a, b]}.
Hence,
        U(f ) ≤ inf{U(g, P) : P a partition of [a, b]} = U(g).
Since f and g are integrable, it follows that
        ∫_a^b f = U(f ) ≤ U(g) = ∫_a^b g.

Corollary 3.2. Let f ∈ I[a, b] and suppose M > 0 is such that |f (x)| ≤ M for all
x ∈ [a, b]. Then,
        |∫_a^b f (x)dx| ≤ M (b − a).

Proof. By hypothesis,
        −M ≤ f (x) ≤ M
for all x ∈ [a, b], so by applying Theorem 3.1, we see that
        −M (b − a) ≤ ∫_a^b f (x)dx ≤ M (b − a),
which proves the result.


Theorem 3.3 (Additivity). Let f : [a, c] → R be a bounded function and suppose
a < b < c. Then,
(i) f ∈ I[a, c] if and only if f ∈ I[a, b] ∩ I[b, c].

(ii) Moreover, in that case,
        ∫_a^c f = ∫_a^b f + ∫_b^c f.

Proof.
(i)

⇒: Suppose f ∈ I[a, c], and fix ε > 0 with a view to using Theorem 2.3. Then, there
is a partition P of [a, c] such that
        U(f, P) − L(f, P) < ε.
If P̃ = P ∪ {b}, then by Lemma 1.6,
        U(f, P̃) − L(f, P̃) ≤ U(f, P) − L(f, P) < ε.
Write P̃ = {x0 , x1 , . . . , xn } and let k be the index with xk = b. Then P1 := {x0 , x1 , . . . , xk }
is a partition of [a, b]. If mi = inf(f [xi−1 , xi ]) and Mi = sup(f [xi−1 , xi ]), then Mi ≥ mi
for all 1 ≤ i ≤ n. So,
        U(f, P1 ) − L(f, P1 ) = Σ_{i=1}^{k} Mi (xi − xi−1 ) − Σ_{i=1}^{k} mi (xi − xi−1 )
                             = Σ_{i=1}^{k} (Mi − mi )(xi − xi−1 )
                             ≤ Σ_{j=1}^{n} (Mj − mj )(xj − xj−1 )
                             = U(f, P̃) − L(f, P̃) < ε.
By Theorem 2.3, f ∈ I[a, b]. Similarly, f ∈ I[b, c] as well.


⇐: If f ∈ I[a, b] ∩ I[b, c] and  > 0 is fixed, then there is a partition P1 of [a, b]
and P2 of [b, c] such that
U(f, P1 ) − L(f, P1 ) < /2
U(f, P2 ) − L(f, P2 ) < /2.
If P = P1 ∪ P2 , then P is a partition of [a, c] and
U(f, P) = U(f, P1 ) + U(f, P2 ) and L(f, P) = L(f, P1 ) + L(f, P2 )
Therefore,
U(f, P) − L(f, P) < 
so f ∈ I[a, c] by Theorem 2.3.
(ii) If f ∈ I[a, b] ∩ I[b, c] and ε > 0, there is a partition P1 of [a, b] and P2 of [b, c] such
that
        U(f, P1 ) − L(f, P1 ) < ε/2,
        U(f, P2 ) − L(f, P2 ) < ε/2, and
        L(f, P1 ) ≤ ∫_a^b f ≤ U(f, P1 ),
        L(f, P2 ) ≤ ∫_b^c f ≤ U(f, P2 ).
As above, P := P1 ∪ P2 is a partition of [a, c] and
        ∫_a^c f ≤ U(f, P) = U(f, P1 ) + U(f, P2 ) < L(f, P1 ) + L(f, P2 ) + ε ≤ ∫_a^b f + ∫_b^c f + ε.
This is true for every ε > 0, so
        ∫_a^c f ≤ ∫_a^b f + ∫_b^c f
(Do verify this!). The reverse inequality is proved similarly.

Corollary 3.4. If f ∈ I[a, b] and [c, d] ⊂ [a, b], then f |[c,d] ∈ I[c, d].
Proof. Exercise.
Theorem 3.5. If f : [a, b] → R is a bounded function with finitely many discontinuities,
then f ∈ I[a, b].
Proof. Let F ⊂ [a, b] be the finite set so that f is continuous on [a, b] \ F .
(i) Assume first that F = {x} is a singleton. Fix  > 0 and choose M > 0 so that
|f (x)| ≤ M for all x ∈ [a, b]. Let δ := /12M , then on the subinterval [x − δ, x + δ],
we have
` := inf(f [x − δ, x + δ]) ≥ −M and L := sup(f [x − δ, x + δ]) ≤ M
Therefore,
(L − `)(2δ) ≤ 4M δ ≤ /3.
Now note that f ∈ I[a, x − δ] ∩ I[x + δ, b] because it is continuous on both subin-
tervals (by Theorem 2.8). So there is a partition P1 of [a, x − δ] and P2 of [x + δ, b]
such that
U(f, P1 ) − L(f, P1 ) < /3
U(f, P2 ) − L(f, P2 ) < /3.
Now consider P = P1 ∪ P2 treated as a partition of [a, b]. Then,
U(f, P) = U(f, P1 ) + U(f, P2 ) + L(2δ), and
L(f, P) = L(f, P1 ) + L(f, P2 ) + `(2δ)
⇒ U(f, P) − L(f, P) < /3 + /3 + /3 = 
Hence, f ∈ I[a, b] by Theorem 2.3.

(ii) Now suppose F = {x1 , x2 , . . . , xn } is any finite set. Again, assume x1 < x2 < . . . <
xn and consider the subintervals

J0 = [a, x1 ], J1 := [x1 , x2 ], . . . , Jn = [xn , b]

The restriction of f to each Jk is continuous except possibly at the endpoints of Jk , so
applying the argument of part (i) at each such endpoint shows that f ∈ I(Jk ) for all
0 ≤ k ≤ n. So by Theorem 3.3, f ∈ I[a, b].

Theorem 3.6 (Linearity - I). Let f ∈ I[a, b] and α ∈ R. Then, αf ∈ I[a, b] and
Z b Z b
(αf ) = α f
a a

Proof.

(i) First assume α ≥ 0. Let P = {x0 , x1 , . . . , xn } be a partition of [a, b]. For each
1 ≤ i ≤ n, set

Mi = sup{f (x) : xi−1 ≤ x ≤ xi } and Ki := sup{αf (x) : xi−1 ≤ x ≤ xi }.

Since α ≥ 0, we have
Ki = αMi
by Homework 1.2. Hence,
n
X n
X
U(αf, P) = Ki ∆i = α Mi ∆i = αU(f, P).
i=1 i=1

Hence,

U(αf ) = inf{U(αf, P) : P is a partition of [a, b]}


= inf{αU(f, P) : P is a partition of [a, b]}
= α inf{U(f, P) : P is a partition of [a, b]}
= αU(f )

Similarly,
L(αf ) = αL(f ).
Since f ∈ I[a, b], we have

U(αf ) = αU(f ) = αL(f ) = L(αf ).

So αf ∈ I[a, b] and also


Z b Z b
(αf ) = U(αf ) = αU(f ) = α f.
a a

(ii) To prove it for α < 0, it suffices to prove it for α = −1. In that case, consider a
partition P as before. Then, for each 1 ≤ i ≤ n,
sup{αf (x) : xi−1 ≤ x ≤ xi } = − inf{f (x) : xi−1 ≤ x ≤ xi }
Hence,
U(αf, P) = −L(f, P)
Taking an infimum, we have
U(αf ) = inf{U(f, P) : P is a partition of [a, b]}
= − sup{L(f, P) : P is a partition of [a, b]}
= −L(f ).
Similarly,
L(αf ) = −U(f ).
Therefore, U(αf ) = L(αf ), so αf ∈ I[a, b] and
Z b Z b Z b
(αf ) = U(αf ) = −L(f ) = − f =α f.
a a a

(End of Day 21)

Theorem 3.7 (Linearity - II). Let f, g ∈ I[a, b], then (f + g) ∈ I[a, b] and
Z b Z b Z b
(f + g) = f+ g
a a a

Proof.
(i) Let P = {x0 , x1 , . . . , xn } be a partition of [a, b] and for each 1 ≤ i ≤ n, set
Mi = sup(f [xi−1 , xi ])
Ki = sup(g[xi−1 , xi ]), and
Li = sup((f + g)[xi−1 , xi ]).
Then, for any x ∈ [xi−1 , xi ], we have
(f + g)(x) = f (x) + g(x) ≤ Mi + Ki
⇒ Li ≤ Mi + Ki
Xn
⇒ U(f + g, P) = Li ∆i
i=1
Xn n
X
≤ Mi ∆i + Ki ∆i
i=1 i=1
= U(f, P) + U(g, P)

(ii) Now let P and P
e be two partitions of [a, b], set S := P ∪ P.
e Then, by part (i) and
Lemma 1.6, we have

U(f + g, S) ≤ U(f, S) + U(g, S)


≤ U(f, P) + U(g, P).
e
⇒ U(f + g) ≤ U(f, P) + U(g, P).
e

This is true for any P and P,


e so

U(f + g) ≤ U(f ) + U(g).

(iii) Similarly, one can show that

L(f + g) ≥ L(f ) + L(g)

(iv) Since f, g ∈ I[a, b], we have

U(f + g) ≤ L(f ) + L(g) ≤ L(f + g) ≤ U(f + g)

Hence, f + g ∈ I[a, b]. Moreover, we conclude that


Z b Z b Z b
(f + g) = U(f + g) = U(f ) + U(g) = f+ g.
a a a

4. The Fundamental Theorem of Calculus


Theorem 4.1 (First Fundamental Theorem of Calculus). Let F : [a, b] → R be a
continuous function that is differentiable on (a, b). Suppose f : [a, b] → R is a Riemann
integrable function such that
f (x) = F 0 (x)
for all x ∈ (a, b). Then
Z b
f = F (b) − F (a).
a

Proof. Fix a partition P = {x0 , x1 , . . . , xn } of [a, b] as before. For each 1 ≤ i ≤ n,


we apply the Mean Value Theorem (Theorem V.2.5) to F on [xi−1 , xi ], so there exists
ci ∈ (xi−1 , xi ) such that

F (xi ) − F (xi−1 ) = F 0 (ci )(xi − xi−1 ) = f (ci )∆i

Now,
mi := inf(f [xi−1 , xi ]) ≤ f (ci ) ≤ sup(f [xi−1 , xi ]) =: Mi .

Therefore,
n
X
L(f, P) = mi ∆i
i=1
Xn
≤ f (ci )∆i
i=1
n
X
≤ Mi ∆i
i=1
= U(f, P).
Xn
⇒ L(f, P) ≤ (F (xi ) − F (xi−1 )) ≤ U(f, P)
i=1
⇒ L(f, P) ≤ F (b) − F (a) ≤ U(f, P).

Taking a supremum on the LHS and infimum on the RHS, we conclude that
        L(f ) ≤ F (b) − F (a) ≤ U(f ).
But f ∈ I[a, b], so L(f ) = U(f ) = ∫_a^b f . Hence the result.

(End of Day 22)

Example 4.2. We now give a short proof of Example 2.10. For m ∈ N, consider
f : [a, b] → R given by f (x) = x^m , and let F : [a, b] → R be given by
        F (x) = x^(m+1)/(m + 1).
Then, F '(t) = f (t) for all t ∈ [a, b], so by Theorem 4.1,
        ∫_a^b x^m dx = F (b) − F (a) = (b^(m+1) − a^(m+1))/(m + 1).
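Note: (Aside, not part of these notes.) The First Fundamental Theorem can be sanity-checked numerically: compare a fine Riemann-type sum for f with F(b) − F(a) for an antiderivative F. The midpoint sum used below lies between the lower and upper sums, and the example f = cos, F = sin is our own arbitrary choice.

import math

def riemann_midpoint(f, a, b, n=100000):
    # Midpoint Riemann sum; for continuous f this approximates the integral.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

a, b = 0.0, math.pi / 2
print(riemann_midpoint(math.cos, a, b))   # approximately 1.0
print(math.sin(b) - math.sin(a))          # F(b) - F(a) = 1.0 exactly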

Theorem 4.3 (Second Fundamental Theorem of Calculus). Let f : [a, b] → R be Rie-


mann integrable. Define F : [a, b] → R by
Z x
F (x) := f (t)dt.
a

(i) Then, F is continuous on [a, b].

(ii) If f is continuous at a point c ∈ [a, b], then F is differentiable at c and F 0 (c) = f (c).

Proof.

(i) Since f is bounded, there is M > 0 such that |f (x)| ≤ M for all x ∈ [a, b]. So for
any a ≤ x ≤ y ≤ b it follows from Corollary 3.2 that
Z y
f (t)dt ≤ M |y − x|
x

But by additivity of the integral (Theorem 3.3), we know that


Z y Z x Z y
f= f+ f
a a x

Hence, Z y
f = F (y) − F (x).
x

We conclude that |F (y) − F (x)| ≤ M |y − x| for all x, y ∈ [a, b], so F is Lipschitz


continuous (see Definition IV.5.5).

(ii) Suppose f is continuous at c, and fix  > 0. We wish to prove that there is a δ > 0
such that whenever |x − c| < δ,

F (x) − F (c)
− f (c) < .
x−c

Since f is continuous at c, there is a δ > 0 such that

|x − c| < δ ⇒ |f (x) − f (c)| < .

Now suppose x ∈ [a, b] is such that |x − c| < δ. We consider two cases:


(a) Assume first that x ≥ c. Then, for any t ∈ [c, x], we have

f (c) −  < f (t) < f (c) + 

Hence by Theorem 3.1 and Theorem 3.3, we have


Z x
(f (c) − )(x − c) ≤ f (t)dt ≤ (f (c) + )(x − c)
c
Rx Rc
a
f (t)dt − a f (t)dt
⇒ f (c) −  ≤ ≤ f (c) + 
x−c
F (x) − F (c)
⇒ − ≤ − f (c) ≤ 
x−c
F (x) − F (c)
⇒ − f (c) ≤ .
x−c

(b) If x ≤ c, the argument is similar with minor changes.

Therefore, in either case, we see that when |x − c| < δ,

F (x) − F (c)
− f (c) ≤ 
x−c

Hence, F is differentiable at c and F 0 (c) = f (c).

Theorem 4.4 (Mean Value Theorem for Integrals). Suppose f : [a, b] → R is continuous
on [a, b]. Then, there exists c ∈ [a, b] such that
Z b
1
f (c) = f (t)dt.
(b − a) a

Proof. Homework.

Definition 4.5. Let f : [a, b] → R be a continuous function. A function P : [a, b] → R


is called a primitive (or anti-derivative) of f if P is continuous on [a, b], differentiable on
(a, b) and
P 0 (x) = f (x)
for all x ∈ (a, b).

Corollary 4.6. Let f : [a, b] → R be a continuous function.

(i) If P and Q are two primitives of f , then there is a constant C ∈ R such that

P (x) − Q(x) = C

for all x ∈ [a, b].

(ii) If P : [a, b] → R is any primitive of f , then for any c, d ∈ [a, b],


Z d
P (d) = P (c) + f (t)dt. (VI.1)
c

Proof.

(i) Note that (P − Q) is a function on [a, b] that satisfies

(P − Q)0 (x) = 0

for all x ∈ (a, b). So the result follows from the Mean Value Theorem (specifically,
Theorem V.3.1).

(ii) Let F : [a, b] → R be given by
Z x
F (x) = f (t)dt.
a

Then, by the Second Fundamental Theorem (Theorem 4.3), F is a primitive of f .


So by part (i), there is a constant C such that
F (x) − P (x) = C
for all x ∈ [a, b]. In particular,
Z d
P (d) − P (c) = F (d) − F (c) = f (t)dt
c

by the additivity of the integral (Theorem 3.3). Hence the result.

Remark 4.7.
(i) The Fundamental Theorem of Calculus (specifically Corollary 4.6) is so useful, it
is usually taught in school as the definition of the integral. i.e. The integral of a
function is identified with its anti-derivative and Equation VI.1 is used to compute
integrals.
(ii) Leibniz used the symbol Z
f (t)dt = P (x) + C (VI.2)

to say that P is any primitive of f and C is a fixed constant depending on P and


f . This is a formal expression that does not have any intrinsic meaning. It is
used to algebraically manipulate integrals, and used along with Equation VI.1 to
evaluate definite integrals.
R
(iii) The symbol f (t)dt in Equation VI.2 is also called the indefinite integral of f . It
is simply another term for a primitive of f .
Theorem 4.8 (Substitution Theorem). Let g : [a, b] → R be a continuous function that
is differentiable on (a, b). Let f : [c, d] → R be a continuous function and assume that
g([a, b]) ⊂ [c, d]. Then,
        ∫_{g(a)}^{g(b)} f (u)du = ∫_a^b f (g(t)) g'(t) dt.

Proof. Let P : [c, d] → R and Q : [a, b] → R be given by


Z y
P (y) := f (u)du
g(a)
Z x
Q(x) := f (g(t))g 0 (t)dt
a

Then, by the Fundamental Theorem of Calculus, Q is differentiable on (a, b) and for any
x ∈ (a, b)
Q0 (x) = f (g(x))g 0 (x)
Moreover, h := P ◦ g is also differentiable and for any x ∈ (a, b)

h0 (x) = P 0 (g(x))g 0 (x) = f (g(x))g 0 (x).

By Theorem V.3.1, there is a constant C ∈ R such that

Q(x) − h(x) = C

for all x ∈ [a, b]. At x = a, we see that

C = Q(a) − h(a) = 0 − 0.

Therefore, Q(x) = h(x) for all x ∈ [a, b]. In particular,


Z b Z g(b)
0
f (g(t))g (t)dt = Q(b) = h(b) = f (u)du
a g(a)

5. The Logarithm and Exponential Functions


Definition 5.1. The natural logarithm is the function ln : (0, ∞) → R defined by
        ln(x) := ∫_1^x (1/t) dt.

(End of Day 23)

Proposition 5.2. The natural logarithm has the following properties:


(i) ln(1) = 0.

(ii) ln is differentiable at each point x ∈ (0, ∞) and
        ln'(x) = 1/x.

(iii) [Functional Equation] ln(ab) = ln(a) + ln(b) for all a, b ∈ (0, ∞).
Proof.
(i) Obvious.

(ii) This follows by the Fundamental Theorem of Calculus (Theorem 4.1 and Theo-
rem 4.3) since ln(1) = 0.

(iii) Note that
        ln(ab) = ∫_1^{ab} (1/t) dt = ∫_1^a (1/t) dt + ∫_a^{ab} (1/t) dt
by the Additivity of the integral (Theorem 3.3). Consider g : [1, b] → [a, ab] given
by g(s) := as. Then, by the Substitution Theorem (Theorem 4.8), we see that
        ∫_a^{ab} (1/t) dt = ∫_{g(1)}^{g(b)} (1/t) dt = ∫_1^b (1/(at)) · a dt = ln(b).
Hence, ln(ab) = ln(a) + ln(b).
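Note: (Aside, not part of these notes.) The functional equation can be checked numerically by approximating ln(x) = ∫_1^x dt/t with a Riemann-type sum; the helper ln_via_integral and the sample values are our own.

def ln_via_integral(x, n=200000):
    # Approximate ln(x) = \int_1^x dt/t with a midpoint sum (for x > 0).
    a, b = (1.0, x) if x >= 1 else (x, 1.0)
    dx = (b - a) / n
    s = sum(1.0 / (a + (i + 0.5) * dx) for i in range(n)) * dx
    return s if x >= 1 else -s

a, b = 2.0, 3.5
print(ln_via_integral(a * b))                    # about 1.9459
print(ln_via_integral(a) + ln_via_integral(b))   # same value, as Prop. 5.2 predicts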

From now on, we write R+ := {x ∈ R : x > 0}


Proposition 5.3. For each b ∈ R there is a unique a ∈ R+ such that ln(a) = b.
Proof.
(i) Existence: Fix b ∈ R.
(a) Assume first that b > 0. Note that ln(2) ≠ 0, so choose n ∈ N so that
        n > b/ ln(2).

Then, by the functional equation, ln(2n ) = n ln(2) > b. Since ln(1) = 0, it


follows by the Intermediate Value Theorem (Theorem IV.4.2) that there is
an a ∈ (0, 2n ) so that ln(a) = b.
(b) Now suppose b < 0, choose a ∈ R+ such that ln(a) = −b. Then, by the
functional equation
ln(1/a) + ln(a) = ln(1) = 0.
Hence, ln(1/a) = b.
(c) Now if b = 0 we may take a = 1.

(ii) Uniqueness: Note that ln'(x) = 1/x > 0 for all x ∈ R+ , so ln is strictly increasing on R+
(by Theorem V.3.1). So if a1 < a2 , then ln(a1 ) < ln(a2 ). This implies uniqueness.

Definition 5.4. Define a function exp : R → R+ by declaring that exp(x) = y if and


only if ln(y) = x. This is called the exponential function
Proposition 5.5. The exponential function has the following properties:
(i) For each x ∈ R, ln(exp(x)) = x, and for each y ∈ R+ , exp(ln(y)) = y.

(ii) exp(0) = 1.

(iii) [Functional Equation] exp(a + b) = exp(a) exp(b) for all a, b ∈ R.

(iv) exp is strictly increasing.

(v) exp is continuous on R.

(vi) exp is differentiable at each point x ∈ R and exp0 (x) = exp(x).

Proof.

(i) This follows by the definition of exp.

(ii) Since ln(1) = 0, it follows by definition that exp(0) = 1.

(iii) Fix a, b ∈ R and choose x, y ∈ R+ so that ln(x) = a and ln(y) = b. Then

ln(xy) = ln(x) + ln(y) = a + b

However, exp(ln(xy)) = xy, so

exp(a + b) = exp(ln(xy)) = xy = exp(a) exp(b).

(iv) Suppose a, b ∈ R are such that a < b. Choose c, d ∈ R+ so that ln(c) = a and
ln(d) = b. If c ≥ d, then since ln is strictly increasing, it would follow that

a = ln(c) ≥ ln(d) = b.

This is not the case, so it must happen that c < d, so exp(a) < exp(b).

(v) Fix c ∈ R+ and  > 0. We may assume that  is small enough so that (c−, c+) ⊂
R+ . We wish to find a δ > 0 so that

|x − c| < δ ⇒ | exp(x) − exp(c)| < 

Set d := exp(c) so that ln(d) = c, and set

δ := min{ln(d) − ln(d − ), ln(d + ) − ln(d)} > 0

Note that δ > 0 because ln is strictly increasing. Now suppose x ∈ R is such that
|x − c| < δ. Then,

x < c + δ ≤ c + ln(d + ) − ln(d) = ln(d + )

Since exp is strictly increasing by part (iv), it follows that

exp(x) < exp(ln(d + )) = d +  = exp(c) + .

Similarly, exp(x) > exp(c) −  as well. Hence, | exp(x) − exp(c)| <  as desired.

(vi) Fix c ∈ R and consider the difference quotient

exp(x) − exp(c) exp(c + h) − exp(c) exp(h) − 1


= = exp(c)
x−c h h
where h = x − c and the last equality holds by the functional equation of part (iii).
Now choose a sequence (xn ) in R with limn→∞ xn = 0. So it suffices to prove that

exp(xn ) − 1
lim = 1.
n→∞ xn
Choose yn ∈ R+ such that ln(yn ) = xn , so that exp(xn ) = yn . Since exp is
continuous by part (v), we know that

lim yn = exp(0) = 1.
n→∞

Therefore, this limit reduces to

exp(xn ) − 1 yn − 1
lim = lim
n→∞ xn n→∞ ln(yn )
1
= lim  
n→∞ ln(yn )−ln(1)
yn −1
1
= lim 0 = 1.
n→∞ ln (1)

So it follows that exp is differentiable at c and exp0 (c) = exp(c).

Remark 5.6.

(i) By Proposition 5.3, there is a unique positive real number e ∈ R such that ln(e) =
1, so that exp(1) = e.

(ii) We now consider the function equation exp(a + b) = exp(a) exp(b). Since exp(1) =
e, it follows that for any n ∈ N,

exp(n) = exp(1) · exp(1) · . . . · exp(1) = en


| {z }
n times

(iii) Now observe that for any a ∈ R,

exp(−a) exp(a) = exp(a + (−a)) = exp(0) = 1

so that exp(−a) = 1/ exp(a) = exp(a)−1 .

(iv) Therefore, we may combine parts (ii) and (iii) and see that

exp(n) = en

for any n ∈ Z (Z denotes the set of all integers).

(v) Now fix n ∈ Z non-zero, and note that


 n  
1 1
exp = exp n · = exp(1) = e
n n

Now exp(1/n) ∈ R+ , so by the uniqueness of the nth root (see Proposition I.4.6),
it follows that
exp(1/n) = e1/n .

(vi) If r ∈ Q, we write r = m/n for some m, n ∈ Z with n 6= 0. Then


   m
1 1
exp(r) = exp m · = exp = (e1/n )m = em/n = er
n n

Definition 5.7. For any x ∈ R, we define

(i) ex := exp(x).

(ii) For any a ∈ R+ , we define ax := ex ln(a) .

Remark 5.8.

(i) Note that this definition agrees with the conclusion of Remark 5.6 for rational x,
but is simply the definition for irrational x.

(ii) For any a > 0, we have some basic properties which may be proved using Propo-
sition 5.5: For any a, b > 0 and x, y ∈ R,
(a) ln(ax ) = x ln(a).
(b) (ab)x = ax bx .
(c) ax ay = ax+y .
(d) (ax )y = axy = (ay )x .
ln(y)
(e) If a 6= 1, then y = ax if and only if x = ln(a)
.

(End of Day 24)

VII. Sequences of Functions
1. Pointwise and Uniform Convergence
Definition 1.1. Let A ⊂ R be set and let (fn ) be a sequence of functions on A (i.e.
For each n ∈ N, fn : A → R is a function). We say that (fn ) converges pointwise to a
function f : A → R if, for each x ∈ A,

lim fn (x) = f (x).


n→∞

p
If this happens, we write fn →
− f.
Remark 1.2.
(i) (fn ) converges pointwise to f if and only if, for each x ∈ A and each ε > 0, there
exists N ∈ N such that
        |fn (x) − f (x)| < ε
for all n ≥ N .

(ii) For each n ∈ N, let fn : [0, 1] → R be given by fn (x) = x^n . Then,
        fn (1) = 1
for all n ∈ N. Now if 0 ≤ x < 1, then
        lim_{n→∞} fn (x) = 0
by Question 2 on Quiz 1. Therefore, if f : [0, 1] → R is given by
        f (x) = 0 if 0 ≤ x < 1, and f (1) = 1,
then (fn ) converges pointwise to f .

(iii) Consider the same functions as in part (ii), consider the two points x = 3/4 and
y = 1/4, and let ε = 1/4. Then,
(a) f2 (y) = 1/16 < ε, so that for all n ≥ 2, |fn (y) − f (y)| < ε.
(b) However, f4 (x) ≈ 0.31 and f5 (x) ≈ 0.23, so only for n ≥ 5 do we have |fn (x) − f (x)| < ε.
Therefore, the value of N chosen (as in part (i)) depends on the choice of x and ε.
Definition 1.3. Let A ⊂ R be a set and (fn ) be a sequence of functions on A. We say
that (fn ) converges uniformly to a function f : A → R if for each ε > 0, there exists
N ∈ N such that
        |fn (x) − f (x)| < ε
for all n ≥ N and all x ∈ A. If this happens, we write fn → f uniformly.
Remark 1.4.
(i) Note that in the definition above, given  > 0 the value of N chosen does not depend
on the point x ∈ A. Therefore, Definition 1.3 is stronger than Definition 1.1.
u p
(ii) In particular, if fn →
− f , then fn →
− f must hold. However, the converse does not
hold.
u
Lemma 1.5. Suppose each fn is a continuous on A and fn →
− f , then f is continuous
on A.
Proof. Fix c ∈ A and  > 0. We wish to find a δ > 0 so that
|x − c| < δ ⇒ |f (x) − f (c)| < .
To do this, choose N ∈ N so that
|fn (x) − f (x)| < /3
for all n ≥ N and all x ∈ A. Now, fN is continuous, so there is a δ > 0 such that
|x − c| < δ ⇒ |fN (x) − fN (c)| < /3.
Then,
|f (x) − f (c)| ≤ |f (x) − fN (x)| + |fN (x) − fN (c)| + |fN (c) − f (c)|
  
< + +
3 3 3
= .
So f is continuous at c. This is true for any c ∈ A, so f is continuous on A.
Example 1.6. Consider fn : [0, 1] → R given by fn (x) := x^n . Then, we saw in
Remark 1.2 that (fn ) converges pointwise to the function f given by
        f (x) = 0 if 0 ≤ x < 1, and f (1) = 1.
Each fn is continuous but f is not, so by Lemma 1.5, (fn ) does not converge uniformly to f .
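Note: (Aside, not part of these notes.) For the sequence in Example 1.6, the uniform distance sup_x |fn(x) − f(x)| can be estimated on a finite grid (only an approximation of the true supremum); it stays close to 1 for every n, which is exactly why the convergence is not uniform.

import numpy as np

def sup_dist(n, samples=100001):
    # Approximate sup over [0, 1] of |f_n(x) - f(x)| for f_n(x) = x^n.
    xs = np.linspace(0.0, 1.0, samples)
    fn = xs ** n
    f = np.where(xs < 1.0, 0.0, 1.0)      # the pointwise limit from Remark 1.2
    return np.max(np.abs(fn - f))

for n in [1, 5, 25, 125]:
    print(n, sup_dist(n))
# The supremum stays near 1 for every n: points just below 1 spoil it.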

2. Distances between Functions
Remark 2.1.

(i) Given two points x, y ∈ R, the distance between them is given by

d(x, y) := |x − y|.

(ii) Given a set A ⊂ R and two functions f, g : A → R, we may measure the distance
between them as
        d∞ (f, g) := sup_{x∈A} |f (x) − g(x)|.
However, for this to be well-defined, we need f and g to be bounded functions.

(iii) Alternatively, if A = [0, 1] is an interval and f and g are integrable, we may define
it as the ‘area’ between the two graphs, by
        d1 (f, g) := ∫_0^1 |f (t) − g(t)| dt.

(iv) There are many different definitions like these, but we will focus on one that
depends on one crucial fact (see Theorem IV.3.3) that every continuous function
on a closed interval is bounded.

Definition 2.2. Fix an interval [a, b] ⊂ R.

(i) We define
C[a, b] := {f : [a, b] → R : f is continuous}.
In other words, each element of C[a, b] is a function.

(ii) For two functions f, g ∈ C[a, b], we define

d(f, g) := sup |f (x) − g(x)|


x∈[a,b]

Note that this is a well-defined real number because f and g are both bounded
functions.
u
Proposition 2.3. For a sequence (fn ) ⊂ C[a, b], we have fn →
− f if and only if, for
each  > 0, there exists N ∈ N such that

d(fn , f ) < 

for all n ≥ N .

In other words, uniform convergence of a sequence is captured by this distance d(·, ·).
Proof. This follows from Definition 1.3 and Definition 2.2.

(End of Day 25)
Proposition 2.4. For any three functions f, g, h ∈ C[a, b], we have
(i) d(f, g) ≥ 0.
(ii) d(f, g) = 0 holds if and only if f (x) = g(x) for all x ∈ [a, b].
(iii) d(f, g) = d(g, f ).
(iv) [Triangle Inequality] d(f, g) ≤ d(f, h) + d(g, h).
A function that satisfies all these conditions is called a metric and (C[a, b], d) is called a
metric space.
Proof.
(i) Obvious because d(f, g) is the supremum of a set of non-negative real numbers.
(ii) Obvious because d(f, g) = 0 holds if and only if |f (x) − g(x)| = 0 for all x ∈ [a, b].
(iii) Obvious.
(iv) Note that for any x ∈ [a, b], we have
|f (x) − g(x)| ≤ |f (x) − h(x)| + |h(x) − g(x)| ≤ d(f, h) + d(g, h)
So the RHS is an upper bound for the set {|f (x) − g(x)| : x ∈ [a, b]}, so it follows
that
d(f, g) = sup |f (x) − g(x)| ≤ d(f, h) + d(g, h).
x∈[a,b]

Definition 2.5. Fix an interval [a, b] ⊂ R.


(i) Define B[a, b] to be the set of all bounded, real-valued functions on [a, b].
(ii) For two functions f, g ∈ B[a, b], we define the distance
d(f, g) := sup |f (x) − g(x)|.
x∈[a,b]

Note that this is a well-defined real number and it satisfies all the properties of
Proposition 2.4.
Remark 2.6.
(i) Note that C[a, b] ⊂ B[a, b].
(ii) If (fn ) in C[a, b] converges uniformly to a function f ∈ B[a, b], then f ∈ C[a, b] by
Lemma 1.5.
(iii) You will study metric spaces next semester. In that context, the distance function
d is a metric on C[a, b] and on B[a, b]. Moreover, C[a, b] is a subspace of B[a, b]
and it is closed (it contains all its limit points).

3. Weierstrass’ Approximation Theorem
Remark 3.1. We introduce a little probability theory. If you have not seen any prob-
ability theory before, you may skip this remark and take Equation VII.3 (see below) as
a fact.
(i) Fix a real number p ∈ [0, 1], and consider an experiment with two possible out-
comes: A success occurs with probability p, and a failure occurs with probability
(1 − p). Our experiment is modelled by a sample space (the set of all possible
outcomes),
Ω := {0, 1}
where 0 represents a failure and 1 represents a success, together with a probability
function P : P(Ω) → R given by
P ({0}) = (1 − p)
P ({1}) = p
P (∅) = 0
P (Ω) = 1

(ii) Fix n ∈ N, and suppose this experiment is repeated n times. Then, the sample
space becomes
Ω := {(a1 , a2 , . . . , an ) : ai ∈ {0, 1}}
and the probability function P : P(Ω) → R is given by
P (∅) = 0, and
P ({(a1 , a2 , . . . , an )}) = pk (1 − p)n−k
where k is the number of 1’s occurring in (a1 , a2 , . . . , an ), with the understanding
that for any A ⊂ Ω, X
P (A) = P ({ω}).
ω∈A

The triple (Ω, P(Ω), P ) is an example of a probability space.


(iii) An event is any subset of Ω. In this experiment, for each 0 ≤ k ≤ n, let Ck ⊂ Ω
be the event consisting of all outcomes with exactly k successes. Then,
Ck = {(a1 , a2 , . . . , an ) ∈ Ω : exactly k of the ai are 1}.
Hence,  
n k
P (Ck ) = p (1 − p)n−k
k
The sets {C0 , C1 , . . . , Cn } are mutually disjoint and their unions is Ω, so we see
that n n  
X X n k
1 = P (Ω) = P (Ck ) = p (1 − p)n−k (VII.1)
k=0 k=0
k

(iv) A random variable is any function Z : Ω → R. In our case, we will be interested
in two random variables:
(a) X = Xn,p defined by
n
X
X(a1 , a2 , . . . , an ) = the number of successes in (a1 , a2 , . . . , an ) = ai .
i=1

Note that X(ω) ∈ {0, 1, 2, . . . , n} for all ω ∈ Ω.


(b) Y = Yn,p defined by
Xn,p
Y = Yn,p :=
n
Note that Y (ω) ∈ {0, 1/n, 2/n, . . . , 1} for any ω ∈ Ω.

(v) For a subset A ⊂ R, we will often be interested in the quantity

P (Z ∈ A) = P ({ω ∈ Ω : Z(ω) ∈ A})

For example, we might have


(a) If A = {2}, then
 
n 2
P (X ∈ A) = P (X = 2) = P (C2 ) = p (1 − p)n−2 .
2

(b) If A = {0, 1/n, 2/n, 3/n}, then


3  
X n k
P (Y ∈ A) = P (X ≤ 3) = p (1 − p)n−k
k=0
k

(End of Day 26)

(vi) Given a random variable Z, there are two useful real numbers associated to it, the
mean µZ and the variance σZ2 . We will not define the variance here, but the mean
is given by X
µZ = E(Z) = zP (Z = z) (VII.2)
z∈R

which in our case will be a finite sum.

(vii) In our situation, we will need two facts concerning the random variable Y = Yn,p :
(a) µY = p and σY^2 = p(1 − p)/n.
(b) We also need Chebyshev's inequality: For δ > 0, define
        D := {ω ∈ Ω : |Y (ω) − p| > δ}.
Then,
        P (D) = P (|Y − µY | > δ) < σY^2 /δ^2 = p(1 − p)/(nδ^2 ).
Moreover, the function p ↦ p(1 − p) has its maximum value at p = 1/2 (by
Calculus), so
        P (|Yn,p − p| > δ) < 1/(4nδ^2 ).        (VII.3)
Lemma 3.2. Fix δ > 0, p ∈ [0, 1], and n ∈ N, and let
        S := { k ∈ {0, 1, . . . , n} : |k/n − p| > δ }.
Then,
        Σ_{k∈S} (n choose k) p^k (1 − p)^(n−k) < 1/(4nδ^2 ).

Proof. Let Y = Yn,p be the random variable defined in Remark 3.1, and define the set

D := {ω ∈ Ω : |Y (ω) − p| > δ}

so that
1
P (D) <
4nδ 2
by Chebyshev’s inequality. However, ω = (a1 , a2 , . . . , an ) ∈ D if and only if |Y (ω) − p| >
δ. For each 0 ≤ k ≤ n, let

Ck = {(a1 , a2 , . . . , an ) : exactly k of the ai are 1}

Then, Y (ω) = k/n for each ω ∈ Ck , so


 
k
D ∩ Ck = (a1 , a2 , . . . , an ) ∈ Ck : −p >δ
n
(
Ck : if nk − p > δ
=
∅ : otherwise

Moreover,
D = (D ∩ C0 ) t (D ∩ C2 ) t . . . t (D ∩ Cn ).
Hence,
n
X
P (D) = P (D ∩ Bk )
k=0
X n
= pk (1 − p)n−k
k∈S
k

Hence the result.

Remark 3.3. Suppose f : [0, 1] → R is a function. For n ∈ N and p ∈ [0, 1], let Y = Yn,p
be the random variable defined above.
(i) Since Y (ω) ∈ [0, 1] for all ω ∈ Ω, we get a new random variable f (Y ) : Ω → R
given by f ◦ Y . i.e.
f (Y )(ω) = f (Y (ω))
for all ω ∈ Ω.
(ii) We need one important fact about the mean of f (Y ). It is given by the expression
X
E(f (Y )) = f (y)P (Y = y)
y∈R

In our case, Y takes values in {0, 1/n, 2/n, . . . , 1}, so


n    
X k k
E(f (Y )) = f P Y =
k=0
n n
n  
X k
= f P (X = k)
k=0
n
n   
X k n k
= f p (1 − p)n−k
k=0
n k

Definition 3.4. Let f : [0, 1] → R be a function, and let n ∈ N. The nth Bernstein polynomial
associated to f is defined as
        Bn (x) := Σ_{k=0}^{n} f (k/n) (n choose k) x^k (1 − x)^(n−k) .

Note that this is indeed a polynomial (of degree ≤ n).
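Note: (Aside, not part of these notes.) The Python sketch below computes Bn from Definition 3.4 for a continuous test function of our own choosing and shows the uniform error shrinking, in line with Theorem 3.5 below.

import math
import numpy as np

def bernstein(f, n, x):
    # B_n(x) = sum_{k=0}^n f(k/n) C(n, k) x^k (1 - x)^(n - k)   (Definition 3.4)
    return sum(f(k / n) * math.comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)            # continuous, not differentiable at 1/2
xs = np.linspace(0.0, 1.0, 501)
for n in [5, 20, 80, 320]:
    err = max(abs(bernstein(f, n, x) - f(x)) for x in xs)
    print(n, err)    # the (approximate) uniform error decreases with n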


Theorem 3.5 (Bernstein’s proof of Weierstrass’ Theorem). Let f : [0, 1] → R be a
continuous function. Then, the sequence (Bn ) defined above converges uniformly to
f on [0, 1].
Proof. Fix n ∈ N. We wish to estimate the quantity
n   
X k n k
E := f (x) − f x (1 − x)n−k
k=0
n k
n     
X k n k
= f (x) − f x (1 − x)n−k
k=0
n k

by Equation VII.1. This is made up of two pieces: If


 
k
S := k ∈ {0, 1, . . . , n} : −x >δ
n

then E ≤ E1 + E2 , where
X    
k n k
E1 := f (x) − f x (1 − x)n−k , and
n k
k∈S
/
X    
k n k
E2 := f (x) − f x (1 − x)n−k
k∈S
n k

(δ will be chosen shortly)


(i) Since f is uniformly continuous (by Theorem IV.5.7), there is a δ > 0 such that
for any x, y ∈ [0, 1],

|x − y| ≤ δ ⇒ |f (x) − f (y)| < /2.


k
Note that k ∈
/ S if and only if n
− x ≤ δ. Therefore, for this δ > 0, we have
!
 X n 
E1 ≤ xk (1 − x)n−k ≤ .
2 k 2
k∈S
/

by Equation VII.1.

(ii) We now need to estimate E2 . Note that there exists M > 0 such that

|f (x)| ≤ M

for all x ∈ [0, 1]. So

    E_2 ≤ 2M \sum_{k ∈ S} \binom{n}{k} x^k (1 − x)^{n−k}

Choose N ∈ N so that

    \frac{M}{Nδ^2} < ε.

Then, by Lemma 3.2, for all n ≥ N , we would have

    E_2 ≤ 2M · \frac{1}{4nδ^2} = \frac{M}{2nδ^2} < ε/2.

Observe that the choice of N only depends on ε and δ. Since δ only depends on ε
(and not on x, because of uniform continuity), it follows that

    | f(x) − \sum_{k=0}^{n} f(k/n) \binom{n}{k} x^k (1 − x)^{n−k} | < ε

for all x ∈ [0, 1] and for all n ≥ N .

Corollary 3.6 (Weierstrass’ Approximation Theorem). For any continuous function
f : [a, b] → R, there is a sequence (p_n) of polynomials such that p_n → f uniformly.

Proof. Define ϕ : [0, 1] → [a, b] by

ϕ(t) := a(1 − t) + tb

Then, ϕ is continuous, one-to-one and onto. Moreover, if ψ : [a, b] → [0, 1] is given by

    ψ(s) := \frac{s − a}{b − a}
then ψ ◦ ϕ(t) = t for all t ∈ [0, 1] and ϕ ◦ ψ(s) = s for all s ∈ [a, b].

Given f ∈ C[a, b], define g : [0, 1] → R by g := f ◦ ϕ, then g ∈ C[0, 1], so there is a


sequence (B_n) converging uniformly to g by Theorem 3.5. Now define p_n : [a, b] → R
by

    p_n(s) := B_n(ψ(s)).
Then, each pn is a polynomial and for any s ∈ [a, b],

|pn (s) − f (s)| = |Bn (ψ(s)) − g(ψ(s))| ≤ d(Bn , g)


Therefore, d(p_n, f) ≤ d(B_n, g), so p_n → f uniformly.
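Following the proof, the sketch below (illustrative Python, not from the notes; the names bernstein and approximant are ours) builds p_n = B_n ∘ ψ on a general interval [a, b] and measures its distance to f on a grid.

```python
from math import comb, exp

def bernstein(g, n, t):
    # n-th Bernstein polynomial of g : [0,1] -> R, evaluated at t
    return sum(g(k / n) * comb(n, k) * t**k * (1 - t)**(n - k) for k in range(n + 1))

def approximant(f, a, b, n):
    """p_n(s) = B_n(psi(s)), where g = f o phi, as in the proof of Corollary 3.6."""
    g = lambda t: f(a * (1 - t) + t * b)                   # g = f o phi
    return lambda s: bernstein(g, n, (s - a) / (b - a))    # p_n = B_n o psi

f = exp
p50 = approximant(f, 1.0, 3.0, 50)
grid = [1 + 2 * i / 100 for i in range(101)]
print(max(abs(f(s) - p50(s)) for s in grid))  # small sup-error on [1, 3]
```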

(End of Day 27)

VIII. Series of Functions
1. Series of Real Numbers
Definition 1.1. Let (an ) be a sequence of real numbers.

(i) For each k ∈ N, define the k-th partial sum as

    s_k := \sum_{i=1}^{k} a_i.

(ii) The sequence (s_k) is called the (infinite) series associated to (a_n). The series is
often denoted by

    “ \sum_{i=1}^{∞} a_i ”

Note that this is only a symbol, and really represents the limit

    “ \lim_{k→∞} s_k ”

which may or may not exist.

(iii) If the sequence (s_k) converges to s ∈ R, then we say that the series is convergent,
and we write

    s := \sum_{i=1}^{∞} a_i.

If the sequence (sk ) does not converge, we say that the series is divergent.

Remark 1.2. Suppose (a_n) is a sequence and there is N ∈ N such that

    \sum_{i=N}^{∞} a_i

converges. Then the series \sum_{i=1}^{∞} a_i also converges.

Proposition 1.3 (Geometric Series). If −1 < r < 1 and a_n := r^n, then the series
\sum_{i=1}^{∞} a_i converges.

This is called a geometric series with parameter r.

Proof. Define the partial sums by

    s_k := \sum_{i=1}^{k} a_i = r + r^2 + . . . + r^k = \frac{r}{1 − r} (1 − r^k).

However, \lim_{n→∞} |r|^n = 0 by Question 2 on Quiz 1. So by the Squeeze theorem, it
follows that

    \lim_{n→∞} r^n = 0.

Therefore, (s_k) converges, and we have

    \sum_{i=1}^{∞} a_i = \lim_{k→∞} s_k = \frac{r}{1 − r}.
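A quick numerical illustration of Proposition 1.3 (a Python sketch we add, not part of the notes): the partial sums s_k approach r/(1 − r), and the gap r^{k+1}/(1 − r) shrinks geometrically.

```python
r = 0.75
s, limit = 0.0, r / (1 - r)
for k in range(1, 61):
    s += r**k                     # s is now the partial sum s_k
    if k % 20 == 0:
        # the remaining gap equals the tail r^{k+1} + r^{k+2} + ... = r^{k+1}/(1-r)
        print(k, s, limit - s)
```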

Proposition 1.4. If \sum_{i=1}^{∞} a_i = a and \sum_{i=1}^{∞} b_i = b are two convergent series, then for
any α ∈ R,

    \sum_{i=1}^{∞} (α a_i + b_i) = αa + b

Proof. Exercise. [Hint: Use the Algebra of Limits (Theorem II.1.8)]

Proposition 1.5. Let (a_n) be a sequence such that a := \sum_{i=1}^{∞} a_i is convergent. Then,

(i) For each n ∈ N, the series

    T_n := \sum_{i=n}^{∞} a_i
converges.

(ii) limn→∞ Tn = 0.

The sequence (Tn ) defined above is called the tail of the series.
Proof. Consider the partial sums

    s_k := \sum_{i=1}^{k} a_i

Then, by hypothesis, limk→∞ sk = a.

(i) Fix n ∈ N, and consider the partial sums of the series defining T_n:

    t_k := \sum_{i=n}^{n+k−1} a_i

We WTS that (t_k) is convergent, so observe that

    t_k = \sum_{i=n}^{n+k−1} a_i = s_{n+k−1} − s_{n−1}

Therefore, with n fixed and letting k → ∞, we see that

    \lim_{k→∞} t_k = \lim_{k→∞} (s_{n+k−1} − s_{n−1}) = a − s_{n−1}

since (s_{n+k−1})_{k=1}^{∞} is a subsequence of (s_k). In other words,

Tn = a − sn−1 .

(ii) Now letting n → ∞ in this last equation, we see that

    \lim_{n→∞} T_n = \lim_{n→∞} (a − s_{n−1}) = a − a = 0.

2. Tests of Convergence
Theorem 2.1 (Test of Divergence). If a series \sum_{i=1}^{∞} a_i is convergent, then \lim_{n→∞} a_n = 0.
Proof. Write

    s_k := \sum_{i=1}^{k} a_i

then limk→∞ sk = s exists by hypothesis. Note that

an = sn − sn−1

so by the Algebra of Limits, (an ) is convergent and

    \lim_{n→∞} a_n = \lim_{n→∞} s_n − \lim_{n→∞} s_{n−1} = s − s = 0.

Example 2.2.
(i) If a_n = 1 for all n ∈ N, then \sum_{i=1}^{∞} a_i diverges.

(ii) If a_n := 1/n, then we will soon see that \sum_{i=1}^{∞} a_i is divergent. However, \lim_{n→∞} a_n = 0,
so the converse of Theorem 2.1 does not hold.

(End of Day 28)

Theorem 2.3 (Integral Test). Let f : [1, ∞) → R be a non-negative, decreasing func-
tion. For each k ∈ N, define

    s_k := \sum_{i=1}^{k} f(i)
    t_k := \int_{1}^{k} f(x) dx.

Then, the sequence (sk ) is convergent if and only if the sequence (tk ) is convergent.

Proof. Since f is a non-negative function, note that

sk ≤ sk+1 and tk ≤ tk+1 ,

so both sequences are increasing. By the Monotone Sequence Theorem (Theorem II.2.5),
each sequence is convergent if and only if it is bounded above.

On the interval [1, k], f is Riemann Integrable since it is monotone (by Proposition VI.2.7).
Consider the partition P = {1, 2, . . . , k}. Since f is decreasing,

    U(f, P) = \sum_{i=1}^{k−1} f(i) = s_{k−1}
    L(f, P) = \sum_{i=2}^{k} f(i) = s_k − f(1)

Since f is integrable, we know that

sk − f (1) = L(f, P) ≤ tk ≤ U(f, P) = sk−1

Now,

(i) If (sk ) is convergent, it is bounded above. In that case, (tk ) is also bounded above.
Since (tk ) is monotone increasing, it must be convergent by the Monotone Sequence
Theorem (Theorem II.2.5).

(ii) Similarly, if (tk ) is convergent, it is bounded above. The same argument as above
shows that (sk ) must also be convergent.

Example 2.4.

(i) If a_n = 1/n, then consider the function f : [1, ∞) → R given by

    f(x) = 1/x.

Observe that f is non-negative and decreasing. Moreover,

    t_k = \int_{1}^{k} f(x) dx = ln(k)

so (t_k) is unbounded. By the Integral Test, \sum_{i=1}^{∞} a_i diverges.
(ii) If a_n = 1/n^2, then consider f : [1, ∞) → R given by

    f(x) = 1/x^2.

Again f is non-negative and decreasing. However, by FTOC,

    t_k = \int_{1}^{k} f(x) dx = [−1/x]_{1}^{k} = 1 − 1/k.

Since (t_k) converges, it follows from the Integral Test that \sum_{i=1}^{∞} a_i converges as
well.
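The two cases of Example 2.4 can also be seen numerically. In the Python sketch below (ours, not from the notes), the partial sums of Σ 1/n track ln(k) and grow without bound, while those of Σ 1/n² stay below 2.

```python
from math import log

s_harmonic, s_squares = 0.0, 0.0
for k in range(1, 100001):
    s_harmonic += 1 / k
    s_squares += 1 / k**2
    if k in (10, 1000, 100000):
        # s_k for 1/n differs from ln(k) by at most about 1, so it is unbounded;
        # s_k for 1/n^2 is trapped below 1 + (1 - 1/k) < 2
        print(k, s_harmonic, log(k), s_squares)
```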
Definition 2.5. A series \sum_{i=1}^{∞} a_i is said to be absolutely convergent if the series \sum_{i=1}^{∞} |a_i|
is convergent.

Theorem 2.6. If \sum_{i=1}^{∞} a_i is absolutely convergent, then it is convergent.

Proof. Define two sequences by

    s_k := \sum_{i=1}^{k} a_i
    t_k := \sum_{i=1}^{k} |a_i|.

By hypothesis, \lim_{k→∞} t_k = t exists. We WTS that \lim_{k→∞} s_k exists. By the Cauchy
Criterion (see Theorem II.4.5), it suffices to show that (s_k) is Cauchy. So fix ε > 0. We
WTS that there exists N ∈ N such that

    |s_n − s_m| < ε

for all n, m ≥ N.

To do this, note that (t_k) is Cauchy (because it is convergent, by Proposition II.4.2), so
there exists N ∈ N such that

    |t_n − t_m| < ε

whenever n, m ≥ N. Now, if n > m ≥ N, then

    |s_n − s_m| = | \sum_{i=m+1}^{n} a_i | ≤ \sum_{i=m+1}^{n} |a_i| = t_n − t_m < ε

Therefore, (s_n) is Cauchy and therefore convergent. This proves that the series \sum_{i=1}^{∞} a_i
converges.

Theorem 2.7 (Comparison Test). Suppose (an ) and (bn ) are two sequences with non-
negative entries (i.e. an ≥ 0 and bn ≥ 0 for all n ∈ N). Suppose that

(i) There exists N ∈ N such that an ≤ bn for all n ≥ N , and


(ii) \sum_{i=1}^{∞} b_i converges.

Then, \sum_{i=1}^{∞} a_i converges as well.

Proof. We may assume WLOG that an ≤ bn for all n ∈ N (see Remark 1.2). Consider
the partial sums

    s_k := \sum_{i=1}^{k} a_i
    t_k := \sum_{i=1}^{k} b_i

Since each ai ≥ 0, we have s1 ≤ s2 ≤ . . ., and similarly, (tk ) is also a monotonically


increasing sequence. Moreover, by hypothesis, we know that

sk ≤ tk

for all k ∈ N, and \lim_{k→∞} t_k =: t exists. By the Monotone Sequence Theorem (The-
orem II.2.5), we know that (tk ) is bounded above (indeed, t = sup{tk : k ∈ N}).
Therefore,
sk ≤ t
for all k ∈ N, so (sk ) is bounded above. By the Monotone Sequence theorem, (sk ) is
convergent, which proves the result.

(End of Day 29)

Theorem 2.8 (Ratio Test). Suppose (a_n) is a sequence of positive real numbers
such that

    L := \lim_{n→∞} \frac{a_{n+1}}{a_n}

exists.

(i) If L < 1, then the series \sum_{i=1}^{∞} a_i converges.

(ii) If L > 1, then the series \sum_{i=1}^{∞} a_i diverges.

(iii) If L = 1, the Ratio Test is inconclusive.

Proof.

(i) Suppose L < 1, and fix a number r ∈ (L, 1). Then, there exists N ∈ N such that, for
all n ≥ N , we have

    \frac{a_{n+1}}{a_n} < r  ⇒  \frac{a_{n+1}}{r^{n+1}} < \frac{a_n}{r^n}

So the sequence

    \frac{a_N}{r^N}, \frac{a_{N+1}}{r^{N+1}}, . . .

is a decreasing sequence, so that

    \frac{a_n}{r^n} ≤ \frac{a_N}{r^N}

for all n ≥ N. Hence, if C := a_N/r^N, then

    a_n ≤ C r^n

for all n ≥ N. Since \sum_{i=1}^{∞} C r^i converges, it follows from the Comparison Test (Theorem 2.7) that
\sum_{i=1}^{∞} a_i also converges.

(ii) If L > 1, then there exists N ∈ N such that

    \frac{a_{n+1}}{a_n} > 1

for all n ≥ N. Hence, a_{n+1} > a_n for all n ≥ N. In particular, the sequence (a_n)
cannot converge to 0. By the Test of Divergence (Theorem 2.1), \sum_{i=1}^{∞} a_i does not
converge.

(iii) We give two examples:


(a) If a_n = 1/n, then

    \lim_{n→∞} \frac{a_{n+1}}{a_n} = 1

but the series \sum_{i=1}^{∞} a_i diverges.

(b) If a_n = 1/n^2, then

    \lim_{n→∞} \frac{a_{n+1}}{a_n} = 1

but the series \sum_{i=1}^{∞} a_i converges.
These two examples show that both conclusions are possible when the limit is 1.
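As a rough numerical companion to the Ratio Test (a Python sketch we add, not part of the notes; the helper ratio_estimate and the cutoff n = 50 are ours), the limit L can be estimated from a single pair of consecutive terms.

```python
from math import factorial

def ratio_estimate(a, n=50):
    """Crude numerical estimate of L = lim a(n+1)/a(n), using one large n."""
    return a(n + 1) / a(n)

print(ratio_estimate(lambda n: 1 / factorial(n)))  # close to 0 -> series converges
print(ratio_estimate(lambda n: 2**n / n))          # close to 2 -> series diverges
print(ratio_estimate(lambda n: 1 / n))             # close to 1 -> test inconclusive
```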

3. Power Series
Definition 3.1. Let (fn ) be a sequence of functions on a set A ⊂ R.
(i) For each k ∈ N, define the k-th partial sum as g_k : A → R given by

    g_k(x) = \sum_{i=1}^{k} f_i(x).

(ii) The sequence (g_k) is called the series associated to (f_n). The series is often denoted
by

    “ \sum_{i=1}^{∞} f_i ”

(iii) If the sequence (g_k) converges pointwise to a function f : A → R, we
say that the series converges pointwise and we write

    f(x) = \sum_{i=1}^{∞} f_i(x)

(iv) If the sequence (g_k) converges uniformly to a function f : A → R, we
say that the series converges uniformly and we write

    f = \sum_{i=1}^{∞} f_i

Note that, by Remark VII.1.4, if the series converges uniformly, then it converges
pointwise.
Theorem 3.2 (Weierstrass’ M-Test). Suppose (fn ) is a sequence of functions on a set
A ⊂ R. Suppose that there exist real numbers Mn ≥ 0 such that
(i) 0 ≤ |fn (x)| ≤ Mn for all x ∈ A and all n ∈ N.
(ii) \sum_{i=1}^{∞} M_i converges in R.

Then, the series \sum_{i=1}^{∞} f_i converges uniformly on A.

Proof.
(i) By condition (ii) and the Comparison Test (Theorem 2.7),

    \sum_{i=1}^{∞} |f_i(x)|

converges for each x ∈ A (so the series \sum_{i=1}^{∞} f_i(x) converges by Theorem 2.6), and we may write

    f(x) = \sum_{i=1}^{∞} f_i(x).

(ii) Now consider

    | f(x) − \sum_{i=1}^{n} f_i(x) | = | \sum_{i=n+1}^{∞} f_i(x) |
                                     ≤ \sum_{i=n+1}^{∞} |f_i(x)|
                                     ≤ \sum_{i=n+1}^{∞} M_i

The RHS is now the tail of a convergent series (see Proposition 1.5), so for a given
ε > 0, there exists N ∈ N such that

    \sum_{i=n+1}^{∞} M_i < ε

for all n ≥ N. Hence,

    | f(x) − \sum_{i=1}^{n} f_i(x) | < ε

holds for all n ≥ N and for all x ∈ A. Therefore, the series converges uniformly
to f .
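A typical application of the M-Test is the series Σ sin(nx)/n² with M_n = 1/n². The Python sketch below (ours, not part of the notes) estimates the sup-error of the partial sums over a grid; it is roughly controlled by the tail Σ_{i>n} 1/i², independently of x.

```python
from math import sin

def partial_sum(x, n):
    # g_n(x) = sum_{i=1}^{n} sin(i x) / i^2 ; here M_i = 1/i^2 and sum M_i converges
    return sum(sin(i * x) / i**2 for i in range(1, n + 1))

grid = [-3 + 6 * j / 200 for j in range(201)]
f_approx = [partial_sum(x, 5000) for x in grid]   # stand-in for the limit function f
for n in (10, 100, 1000):
    sup_err = max(abs(partial_sum(x, n) - fx) for x, fx in zip(grid, f_approx))
    print(n, sup_err)   # roughly bounded by the tail sum_{i>n} 1/i^2, uniformly in x
```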

Definition 3.3. Fix an interval I := (−R, R) ⊂ R and a sequence (cn ) of real numbers.
For each n ∈ N, define fn : I → R by

    f_n(x) := c_n x^n.

If the series

    \sum_{i=1}^{∞} c_i x^i

converges pointwise on I, the function

    f(x) = \sum_{i=1}^{∞} c_i x^i

is called a power series.


Example 3.4.
(i) Consider the series

    \sum_{i=0}^{∞} \frac{x^i}{i!}

For any fixed x ∈ R, we apply the Ratio Test: with a_n = x^n/n!,

    \lim_{n→∞} \frac{|a_{n+1}|}{|a_n|} = \lim_{n→∞} \frac{|x|}{n + 1} = 0

so the series converges absolutely pointwise. We define

    e(x) := \sum_{i=0}^{∞} \frac{x^i}{i!}

(ii) Similarly,

    s(x) := \sum_{i=0}^{∞} \frac{(−1)^i x^{2i+1}}{(2i + 1)!},  and
    c(x) := \sum_{i=0}^{∞} \frac{(−1)^i x^{2i}}{(2i)!}
converge pointwise for each x ∈ R.
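The following Python sketch (ours, not part of the notes) evaluates truncations of these three series; comparing against the standard library's exp, sin and cos is only a consistency check.

```python
import math

def e_partial(x, n):   # sum_{i=0}^{n} x^i / i!
    return sum(x**i / math.factorial(i) for i in range(n + 1))

def s_partial(x, n):   # sum_{i=0}^{n} (-1)^i x^(2i+1) / (2i+1)!
    return sum((-1)**i * x**(2*i + 1) / math.factorial(2*i + 1) for i in range(n + 1))

def c_partial(x, n):   # sum_{i=0}^{n} (-1)^i x^(2i) / (2i)!
    return sum((-1)**i * x**(2*i) / math.factorial(2*i) for i in range(n + 1))

x = 1.7
print(e_partial(x, 20) - math.exp(x))   # all three differences are tiny
print(s_partial(x, 20) - math.sin(x))
print(c_partial(x, 20) - math.cos(x))
```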
(End of Day 30)
Theorem 3.5. Suppose a power series

    \sum_{i=1}^{∞} c_i x^i

converges at one real number y ∈ R. Then,


(i) The series converges absolutely pointwise on (−R, R) where R := |y|.
(ii) The series converges uniformly on (−S, S) whenever 0 < S < R.
Proof. We know that

    \sum_{i=1}^{∞} c_i y^i
converges. By the Test of Divergence (Theorem 2.1), limn→∞ cn y n = 0. In particular,
there exists N ∈ N so that
|cn y n | < 1
for all n ≥ N .
(i) Fix a point x ∈ (−R, R) where R = |y|, and let r := |x|. Then, for n ≥ N , we
have

    |c_n x^n| = |c_n y^n| · |x/y|^n < |x/y|^n = t^n

where t = r/R. Since t < 1, \sum_{i=1}^{∞} t^i converges, so the series

    \sum_{i=1}^{∞} |c_i x^i|

converges by the Comparison Test (Theorem 2.7).

(ii) Moreover, if 0 < S < R, let A := (−S, S). Then, if t := S/R, the above
argument shows that the series \sum_{i=1}^{∞} c_i x^i is dominated on A by \sum_{i=1}^{∞} t^i. The Weierstrass
M-Test (Theorem 3.2) now says that the series converges uniformly on A.

Corollary 3.6. Suppose that a power series

    \sum_{i=1}^{∞} c_i x^i

converges at some point y1 ∈ R and diverges at a point y2 ∈ R. Then, there is a number


R > 0 such that

(i) The series converges absolutely on (−R, R).

(ii) The series diverges on (−∞, −R) ∪ (R, ∞).

Note that if x = R or x = −R, we cannot say anything conclusive about the convergence
of the series.

Proof. Define
A := {|x| : the series converges at x}
Then, A ⊂ R≥0 and A is non-empty because |y1 | ∈ A.

We know that the series does not converge at y2 . So if x ∈ R such that |x| > |y2 |, then
the series cannot converge at x by Theorem 3.5. Therefore, A ⊂ [0, |y2 |].

In particular, A is bounded above, so

R := sup(A)

exists.

(i) If z ∈ R is such that |z| < R, then there exists x ∈ R such that the series converges
at x and |z| < |x|. So by Theorem 3.5, the series must converge absolutely at z.

(ii) If z ∈ R is such that |z| > R, then the series cannot converge at z because otherwise
this would force |z| ∈ A, contradicting the definition of R.

Definition 3.7. The number R described in Corollary 3.6 is called the radius of convergence
of the power series. We adopt the following convention:

(i) If the series converges only at 0 and no other point, we write R = 0.

(ii) If the series converges at every point x ∈ R, we write R = ∞.

(End of Day 31)
Corollary 3.8. Consider a power series \sum_{i=1}^{∞} c_i x^i as above, and assume that

    L := \lim_{n→∞} \left| \frac{c_{n+1}}{c_n} \right|

exists. If R is the radius of convergence of the power series, then

    R = 1/L  if L ≠ 0,  and  R = ∞  if L = 0.

Proof.

(i) If L ≠ 0, then let a_n := |c_n x^n|; applying the Ratio Test (Theorem 2.8) to the
series \sum_{i=1}^{∞} a_i, we wish to determine when

    1 > T := \lim_{n→∞} \frac{a_{n+1}}{a_n} = L|x|.

Hence, the series converges whenever L|x| < 1, so it converges on the interval

(−1/L, 1/L)

Similarly, it diverges on the intervals (−∞, −1/L) ∪ (1/L, ∞), so it follows by


definition that R = 1/L.

(ii) If L = 0, then the same argument shows that the series converges for each x ∈ R,
so R = ∞.

Example 3.9.

(i) Consider the power series

    \sum_{i=1}^{∞} \frac{x^i}{i!}

Here, c_i = 1/i!, so

    L := \lim_{n→∞} |c_{n+1}/c_n| = 0

so R = +∞ (i.e. the power series converges at each x ∈ R).

(ii) Consider the power series

    \sum_{i=1}^{∞} x^i

Here, c_i = 1 for all i ∈ N, so

    L = \lim_{n→∞} |c_{n+1}/c_n| = 1

so R = 1. Thus the power series converges on (−1, 1) but diverges on
(−∞, −1) ∪ (1, ∞).
(a) At x = 1, the series clearly diverges.
(b) At x = −1, the series diverges since the sequence of partial sums does not
converge (Verify this!).
Hence the series converges on (−1, 1).
(iii) Consider the power series

    \sum_{i=1}^{∞} \frac{x^i}{i}

Here, c_i = 1/i for all i ∈ N, so

    L = \lim_{n→∞} |c_{n+1}/c_n| = 1

so R = 1 as well. So the power series
(a) converges on (−1, 1),
(b) diverges on (−∞, −1) ∪ (1, ∞),
(c) At x = 1, the series diverges by Example 2.4,
(d) At x = −1, the series converges (we have not proved this, but it is a con-
sequence of the Alternating Series Test - see [Apostol Calculus, Theorem
10.14]).
Hence, the series converges on [−1, 1).
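Corollary 3.8 can also be used numerically: the Python sketch below (ours, not from the notes; the helper radius_estimate and the cutoff n = 60 are ours) estimates L from one ratio of consecutive coefficients and reports 1/L.

```python
from math import factorial, inf

def radius_estimate(c, n=60):
    """Estimate the radius of convergence R = 1/L, with L ~ |c(n+1)/c(n)|."""
    L = abs(c(n + 1) / c(n))
    return inf if L == 0 else 1 / L

print(radius_estimate(lambda i: 1 / factorial(i)))  # about n+1; grows with n, consistent with R = infinity
print(radius_estimate(lambda i: 1.0))               # 1.0, as in Example 3.9 (ii)
print(radius_estimate(lambda i: 1 / i))             # close to 1, as in Example 3.9 (iii)
```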
Remark 3.10. All the results for a power series centered at 0 also hold for power series
centered at other points.
(i) Suppose a power series

    \sum_{i=1}^{∞} c_i (x − a)^i

converges at a point x ∈ R. Then it converges pointwise at each z ∈ (a − |x − a|, a + |x − a|).


(ii) If a power series

    \sum_{i=1}^{∞} c_i (x − a)^i

converges pointwise on I := (a − R, a + R), then for any 0 < S < R, it converges


uniformly on (a − S, a + S).

(iii) Every such power series has a radius of convergence R, which defines an interval
I := (a − R, a + R) where the series converges, and the set

A := (−∞, a − R) ∪ (a + R, ∞)

where the series diverges. We cannot determine how the series behaves at the
points (a − R) or (a + R).

4. Taylor’s Series
Remark 4.1. Let I := (a, b) be a fixed open interval in R. Given a function f : I → R,
when can f be expressed as a power series? To answer this, we recall Taylor’s Theorem
(Theorem V.4.7):

Suppose f : I → R is (n + 1)-times differentiable, and x0 , x ∈ I are fixed and distinct.


Then, there exists c between x0 and x such that

    f(x) = P_n(x) + \frac{f^{(n+1)}(c)}{(n + 1)!} (x − x_0)^{n+1},

where P_n is the nth order Taylor polynomial of f at x_0 given by

    P_n(t) := f(x_0) + f'(x_0)(t − x_0) + \frac{f''(x_0)}{2!} (t − x_0)^2 + . . . + \frac{f^{(n)}(x_0)}{n!} (t − x_0)^n

In order to extend this polynomial into a series, we need f to be infinitely differentiable


(smooth - see Definition V.4.1).
Definition 4.2. Given an infinitely differentiable function f : I → R and a point
x_0 ∈ (a, b), we define the Taylor’s Series of f at x_0 to be

    \sum_{i=0}^{∞} \frac{f^{(i)}(x_0)}{i!} (x − x_0)^i = \lim_{n→∞} P_n(x).

A priori, we do not know whether this series converges at all.


Theorem 4.3. Assume f : I → R is infinitely differentiable and x0 ∈ I is fixed. Suppose
there exists A ≥ 0 such that
|f (n) (x)| ≤ An
for all x ∈ I and for all n ∈ N. Then, the Taylor’s series of f converges to f (x) at each
point x ∈ I.
Proof. Fix ε > 0. We WTS that there exists N ∈ N such that

    |f(x) − P_n(x)| < ε

for all n ≥ N and all x ∈ I. To see this, we use Taylor’s Theorem (Theorem V.4.7) to
write

    f(x) − P_n(x) = \frac{f^{(n+1)}(c)}{(n + 1)!} (x − x_0)^{n+1}

for some point c ∈ I. By hypothesis,

    |f(x) − P_n(x)| ≤ \frac{A^{n+1}}{(n + 1)!} (b − a)^{n+1}                (VIII.1)
The corresponding series

    \sum_{i=1}^{∞} \frac{A^i (b − a)^i}{i!}

converges by the ratio test (see Example 3.4), so by the Test of Divergence (Theorem 2.1),

    \lim_{n→∞} \frac{A^{n+1}}{(n + 1)!} (b − a)^{n+1} = 0.

Hence, there exists N ∈ N such that

    \frac{A^{n+1}}{(n + 1)!} (b − a)^{n+1} < ε
for all n ≥ N. Together with Equation VIII.1, this proves the theorem.
We return to the examples of Example 3.4.
Example 4.4. Fix I := (−1, 1).
(i) Let f : I → R be the function f (x) := ex . Then, by Proposition VI.5.5, we have
    |f(x)| ≤ e^1 = e for all x ∈ I,  and  f^{(n)}(x) = f(x) for all x ∈ I.

Therefore, if A = e, then the conditions of Theorem 4.3 are met. Setting x_0 = 0,
we have

    f(x) = \sum_{i=0}^{∞} \frac{f^{(i)}(x_0)}{i!} (x − x_0)^i = \sum_{i=0}^{∞} \frac{x^i}{i!}

(ii) Let f : I → R be the function


    f(x) = sin(x) = \frac{e^{ix} − e^{−ix}}{2i}
Then,
|f (n) (x)| ≤ 1
for all x ∈ I and all n ∈ N. Therefore, Theorem 4.3 applies with A = 1. At x0 = 0,
we see that
    f(x) = x − \frac{x^3}{3!} + \frac{x^5}{5!} − \frac{x^7}{7!} + . . .

(iii) Similarly, if
    f(x) = cos(x) = \frac{e^{ix} + e^{−ix}}{2}
Then,
    f(x) = 1 − \frac{x^2}{2!} + \frac{x^4}{4!} − \frac{x^6}{6!} + . . .
(iv) Note that in each example above, the series converges uniformly by Theorem 3.5.
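The error estimate (VIII.1) behind Theorem 4.3 can be checked for f = sin on I = (−1, 1), where A = 1 and b − a = 2. The Python sketch below (ours, not part of the notes) compares the actual sup-error of P_n on a grid with the bound A^{n+1}(b − a)^{n+1}/(n + 1)!.

```python
from math import sin, factorial

def taylor_sin(x, n):
    """Taylor polynomial of sin at x0 = 0, keeping terms of degree <= n."""
    return sum((-1)**i * x**(2*i + 1) / factorial(2*i + 1)
               for i in range(n // 2 + 1) if 2*i + 1 <= n)

grid = [-1 + 2 * j / 100 for j in range(101)]
for n in (3, 5, 9):
    actual = max(abs(sin(x) - taylor_sin(x, n)) for x in grid)
    bound = 2**(n + 1) / factorial(n + 1)   # A^{n+1} (b-a)^{n+1} / (n+1)! with A = 1, b - a = 2
    print(n, actual, bound)                 # the actual error always sits below the bound
```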
(End of Day 32)
(v) We now give an example to show that, without the hypothesis of Theorem 4.3, the
Taylor series need not converge to f. Define f : (−1, 1) → R by

    f(x) = e^{−1/x^2}  if x ≠ 0,  and  f(x) = 0  if x = 0.

Then, f is differentiable and

    f'(x) = 2x^{−3} e^{−1/x^2}  if x ≠ 0,  and  f'(0) = 0.

Again, f' is differentiable and

    f''(x) = (4x^{−6} − 6x^{−4}) e^{−1/x^2}  if x ≠ 0,  and  f''(0) = 0.

This pattern continues, so f is infinitely differentiable and


f (n) (0) = 0
for all n ∈ N. So the Taylor’s series of f at 0 is the zero function, so it does not
converge to f on I.
(vi) Let f : (−1, 1) → R be given by

    f(x) = \frac{1}{1 − x}
Then, f is infinitely differentiable and
f (n) (x) = n!(1 − x)−n−1
So there is no constant A ≥ 0 such that |f (n) (x)| ≤ An (Verify this!), so Theo-
rem 4.3 does not apply. However, the Taylor’s series at x_0 = 0 is

    1 + x + x^2 + . . . = \sum_{i=0}^{∞} x^i

and this converges pointwise to f (x) on I. Therefore, the Taylor’s series may
converge even if the hypothesis of Theorem 4.3 is not satisfied.

5. Review
Remark 5.1. Some important results/ideas:

(i) Chapter 1:
(a) There is no x ∈ Q such that x^2 = 2 (Theorem I.1.1).
(b) Least Upper Bound Property (Definition I.3.3).
(c) Archimedean Property of R (Theorem I.4.3).
(d) Density of Q in R (Theorem I.4.5).
(e) Existence and Uniqueness of roots (Proposition I.4.6).

(ii) Chapter 2:
(a) Definition of convergence of a sequence (Definition II.1.3).
(b) Algebra of Limits (Theorem II.1.8).
(c) Limits and order (Proposition II.2.1).
(d) Monotone Sequence Theorem (Theorem II.2.5)
(e) Bolzano-Weierstrass Theorem (Theorem II.3.4)
(f) Cauchy Criterion for convergence of a sequence (Theorem II.4.5).

(iii) Chapter 3:
(a) Definition of Open and Closed sets (Definition III.1.2 and Definition III.2.4).
(b) Closure of a set (Theorem III.3.4).
(c) Definition of Compact set (Definition III.4.1).
(d) Heine-Borel Theorem (Theorem III.4.4).

(iv) Chapter 4:
(a) Epsilon-Delta Definition of Limit (Theorem IV.1.5).
(b) Definition of Continuous Function (Theorem IV.2.2).
(c) Continuous Functions on a Compact set and Extreme Value Theorem (Corol-
lary IV.3.4).
(d) Intermediate Value Theorem (Theorem IV.4.2)
(e) Compactness and Uniform Continuity (Theorem IV.5.7).

(v) Chapter 5:
(a) Rules of Differentiation (Theorem V.1.5 and Theorem V.1.6).
(b) Mean Value Theorem (Theorem V.2.5).
(c) First and Second Derivative Tests for Extrema (Proposition V.3.2 and Propo-
sition V.3.4).

(d) Taylor’s Theorem (Theorem V.4.7).

(vi) Chapter 6:
(a) Definition of Riemann Sums (Definition VI.1.1)
(b) Riemann Integrability Criterion (Theorem VI.2.3).
(c) Continuous Functions are Integrable (Theorem VI.2.8).
(d) Function with finitely many discontinuities is Integrable (Theorem VI.3.5)
(e) Fundamental Theorem of Calculus (Theorem VI.4.1 and Theorem VI.4.3).
(f) Properties of Logarithm and Exponential Function (Proposition VI.5.2 and
Proposition VI.5.5)

(vii) Chapter 7:
(a) Uniform limit of continuous functions (Lemma VII.1.5)
(b) Uniform convergence in terms of supremum metric (Proposition VII.2.3).
(c) Weierstrass’ Approximation Theorem (Corollary VII.3.6).

(viii) Chapter 8:
(a) Definition of convergence of series (Definition 1.1).
(b) Tail of convergent series (Proposition 1.5).
(c) Test of Divergence (Theorem 2.1).
(d) Integral Test (Theorem 2.3)
(e) Comparison Test (Theorem 2.7).
(f) Ratio Test (Theorem 2.8).
(g) Definition of Convergence of a series of functions (Definition 3.1)
(h) Weierstrass’ M-Test (Theorem 3.2).
(i) Convergence of Power Series (Theorem 3.5) and Radius of Convergence.
(j) Taylor’s Series examples (Example 4.4).

(End of Day 33)

IX. Instructor Notes

(i) Instead of following Rudin, I decided to take a more accessible approach by fol-
lowing [Abbot] and [Apostol Calculus] (the latter for the Fundamental Theorem
of Calculus and Series).

(ii) This meant that I had to go slow (especially at the start, in the first three chapters)
which impacted me at the end as I was unable to do the Arzela-Ascoli theorem,
and was unable to really explore the idea of power series. In the future, it may be
prudent to go a little faster through the early material.

(iii) That said, I think the course was well-received and the grades indicate that most
students found the pace to be alright.

(iv) I also did not discuss the more general Riemann-Stieltjes integral because it did
not seem like the added generality was really necessary (plus, I think the main
ideas are already in the usual Riemann integral).

Bibliography
[Abbot] S. Abbott, Understanding Analysis, Springer (2001) (Available here)

[Apostol Calculus] T. Apostol, Calculus Vol 1 (2nd Ed), John Wiley and Sons (1967)

[Rudin] W. Rudin, Principles of Mathematical Analysis (3rd Ed), McGraw-Hill (1976)

[Apostol] T. Apostol, Mathematical Analysis (2nd Ed), Addison-Wesley (1974)

[MTH301] My MTH301 Notes (Available here)

