
Introduction to Functional Analysis

You will know it!

Last Updated: March 19, 2024

Contents
1 Normed Linear Spaces and Banach Spaces 3
1.1 Semi-Norms and Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Operations on Semi-Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Equivalence of Semi-Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4 Finite-Dimensional Normed Linear Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Additional Examples of Norms and Semi-Norms . . . . . . . . . . . . . . . . . . . . . . . . . 24

2 Inner Product Spaces 25


2.1 Basics for Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2 Remarks on Derived Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.3 Orthogonality in Inner Product Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4 Orthogonal Projection Onto Finite-Dimensional Subspaces . . . . . . . . . . . . . . . . . . . . 47
2.5 Orthonormal Bases in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.6 Examples of Inner Products . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

3 Bounded Linear Maps and Continuous Dual Spaces 62


3.1 Bounded Linear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.2 The Operator Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.3 The Uniform Boundedness Principle/Banach–Steinhaus Theorem . . . . . . . . . . . . . . . . 71
3.4 Extending Linear Functionals: The Hahn-Banach Theorem . . . . . . . . . . . . . . . . . . . . 76
3.5 The Riesz Representation Theorem and Reflexivity . . . . . . . . . . . . . . . . . . . . . . . . 81
3.6 The Adjoint of Bounded Linear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.7 The Adjoint of Linear Maps in Hilbert Space . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

4 Compact Linear Maps and Some Spectral Theory 104


4.1 Weak and Weak∗ Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.2 Compact Linear Maps and Their Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.3 The Spectrum of Bounded Linear Maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.4 Spectral Decomposition in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

5 Differential Calculus on Banach Spaces 110

A Additional Topics on Linear Spaces 111


A.1 Complexification of Real Linear Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
A.2 Topological Linear Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

All linear spaces in this note, unless specified otherwise, are over the field F = R or C.

1 Normed Linear Spaces and Banach Spaces
1.1 Semi-Norms and Norms
Definition 1.1. Let V be a linear space. A semi-norm on V is a function ∥ · ∥ : V → R that satisfies the following
conditions:

1. (Homogeneity). ∥cx∥ = |c|∥x∥ for all x ∈ V and c ∈ F.

2. (Triangle Inequality). ∥x + y∥ ≤ ∥x∥ + ∥y∥ for all x, y ∈ V .

The pair (V, ∥ · ∥) is called a semi-normed linear space. In practice, we usually say that V is a semi-normed
linear space under ∥ · ∥. Finally, if furthermore

∀x ∈ V : ∥x∥ = 0 ⇐⇒ x = 0V , (1)

then such semi-norm is called a norm on V , by which V is called a normed linear space.

Remark 1.1. It is easy to see that once ∥ · ∥ is restricted to any subspace of V , the two axioms still hold for the
restricted map. Consequently, given any subspace V ′ of V , the restriction of every semi-norm (resp. norm) on V
to V ′ is also a semi-norm (resp. norm) on V ′ .

Theorem 1.1. Let V be a semi-normed linear space. Then ∥0V ∥ = 0 and for every x ∈ V ,

∥x∥ ≥ 0 and ∥−x∥ = ∥x∥. (2)

Furthermore, for every x, y ∈ V ,


|∥x∥ − ∥y∥| ≤ ∥x − y∥. (3)

In particular, if ∥x − y∥ = 0, then ∥x∥ = ∥y∥.


Proof. Let x ∈ V be arbitrary. First, since 0V = 0x, it follows that

∥0V ∥ = ∥0x∥ = |0|∥x∥ = 0.

Furthermore, by homogeneity,
∥−x∥ = |−1|∥x∥ = ∥x∥.
As a result, by triangle inequality,
0 = ∥0V ∥ = ∥x + (−x)∥ ≤ ∥x∥ + ∥−x∥ = 2∥x∥.
This then shows that ∥x∥ ≥ 0, as desired. Finally, let y ∈ V be arbitrary as well. Again, by the triangle inequality, we have

∥x∥ = ∥y + (x − y)∥ ≤ ∥y∥ + ∥x − y∥,

hence we have ∥x∥ − ∥y∥ ≤ ∥x − y∥. By symmetry, we also have ∥y∥ − ∥x∥ ≤ ∥y − x∥. Since ∥x − y∥ = ∥−(x − y)∥ = ∥y − x∥, it follows that
|∥x∥ − ∥y∥| ≤ ∥x − y∥. As we can see, if ∥x − y∥ = 0, we see that

0 ≤ |∥x∥ − ∥y∥| ≤ ∥x − y∥ = 0,

hence ∥x∥ − ∥y∥ = 0, implying that ∥x∥ = ∥y∥. ■
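The reverse triangle inequality (3) is easy to check numerically. The following sketch (assuming NumPy, which is not part of the note) evaluates it for the Euclidean norm on R^4; it is a sanity check, not a proof:

```python
import numpy as np

def reverse_triangle_gap(x, y):
    # ||x - y|| - | ||x|| - ||y|| |, which inequality (3) asserts is >= 0
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.linalg.norm(x - y) - abs(np.linalg.norm(x) - np.linalg.norm(y))

# spot-check on random vectors (small tolerance guards against round-off)
rng = np.random.default_rng(0)
assert all(reverse_triangle_gap(rng.normal(size=4), rng.normal(size=4)) >= -1e-12
           for _ in range(1000))
```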

Definition 1.2. Let V be a semi-normed linear space. An element x ∈ V is called unit if ∥x∥ = 1.

Lemma 1.2 (Normalization). Let V be a semi-normed linear space. For every x ∈ V , if ∥x∥ > 0, the vector
x̂ := ∥x∥−1 x is unit and collinear with x.
Proof. Let x ∈ V be arbitrary with ∥x∥ > 0. Then clearly, x̂ := ∥x∥−1 x, being a scalar multiple of x, is certainly collinear with x, and

∥x̂∥ = ∥∥x∥−1 x∥ = ∥x∥−1 ∥x∥ = 1.

Therefore, the element x̂ ∈ V is unit as well. ■

Theorem 1.3. Let V be a semi-normed linear space. For every nonzero x, y ∈ V with respective normalizations
x̂ := ∥x∥−1 x and ŷ := ∥y∥−1 y,
∥x̂ − ŷ∥ ≤ 4∥x − y∥/(∥x∥ + ∥y∥). (4)
Proof. Let x, y ∈ V be nonzero with respective normalizations x̂, ŷ. Then

∥x∥∥x̂ − ŷ∥ = ∥x − (∥x∥/∥y∥)y∥ ≤ ∥x − y∥ + ∥y − (∥x∥/∥y∥)y∥
            = ∥x − y∥ + (|∥y∥ − ∥x∥|/∥y∥)∥y∥
            = ∥x − y∥ + |∥y∥ − ∥x∥| ≤ 2∥x − y∥.

By similar arguments, we have ∥y∥∥x̂ − ŷ∥ ≤ 2∥x − y∥ as well. Consequently,

∥x̂ − ŷ∥ = (∥x∥ + ∥y∥)∥x̂ − ŷ∥/(∥x∥ + ∥y∥) ≤ 4∥x − y∥/(∥x∥ + ∥y∥). ■
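As a quick numerical sanity check of inequality (4) (assuming NumPy, which is not part of the note), one can evaluate both sides for random vectors in R^3:

```python
import numpy as np

def dw_slack(x, y):
    # slack of inequality (4): 4||x - y||/(||x|| + ||y||) - ||x^ - y^||
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    unit_gap = np.linalg.norm(x / nx - y / ny)   # ||x^ - y^||
    return 4.0 * np.linalg.norm(x - y) / (nx + ny) - unit_gap

rng = np.random.default_rng(1)
assert all(dw_slack(rng.normal(size=3), rng.normal(size=3)) >= -1e-12
           for _ in range(1000))
```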

Lemma 1.4. Let V be a semi-normed linear space. For every nonzero x, y ∈ V with respective normalizations
x̂ := ∥x∥−1 x and ŷ := ∥y∥−1 y,

∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)(∥x∥ ∨ ∥y∥) ≤ ∥x + y∥ ≤ ∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)(∥x∥ ∧ ∥y∥). (5)

Proof. Let x, y ∈ V be nonzero with respective normalizations x̂, ŷ. Without loss of generality, we may assume that ∥x∥ ≤ ∥y∥. Observe that

∥x + y∥ = ∥∥x∥x̂ + (∥x∥/∥y∥)y + (1 − ∥x∥/∥y∥)y∥
        ≤ ∥x∥∥x̂ + ŷ∥ + (1 − ∥x∥/∥y∥)∥y∥
        = ∥x∥∥x̂ + ŷ∥ + (∥y∥ − ∥x∥)
        = ∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)∥x∥,

and

∥x + y∥ = ∥∥y∥ŷ + (∥y∥/∥x∥)x − (∥y∥/∥x∥ − 1)x∥
        ≥ ∥y∥∥x̂ + ŷ∥ − (∥y∥/∥x∥ − 1)∥x∥
        = ∥y∥∥x̂ + ŷ∥ − (∥y∥ − ∥x∥)
        = ∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)∥y∥.

The desired inequality thus follows. ■

Theorem 1.5. Let V be a semi-normed linear space. For every nonzero x, y ∈ V with respective normalizations
x̂ := ∥x∥−1 x and ŷ := ∥y∥−1 y,

(∥x − y∥ − |∥x∥ − ∥y∥|)/(∥x∥ ∧ ∥y∥) ≤ ∥x̂ − ŷ∥ ≤ (∥x − y∥ + |∥x∥ − ∥y∥|)/(∥x∥ ∨ ∥y∥). (6)

Proof. Let x, y ∈ V be nonzero with respective normalizations x̂, ŷ. By the preceding lemma (applied to x and −y, whose normalization is −ŷ),

∥x − y∥ ≥ ∥x∥ + ∥−y∥ − (2 − ∥x̂ − ŷ∥)(∥x∥ ∨ ∥−y∥)
        = ∥x∥ + ∥y∥ − (2 − ∥x̂ − ŷ∥)(∥x∥ ∨ ∥y∥).

Consequently,

(∥x∥ ∨ ∥y∥)∥x̂ − ŷ∥ ≤ ∥x − y∥ − ∥x∥ − ∥y∥ + 2(∥x∥ ∨ ∥y∥) = ∥x − y∥ + |∥x∥ − ∥y∥|.

Similarly,

∥x − y∥ ≤ ∥x∥ + ∥y∥ − (2 − ∥x̂ − ŷ∥)(∥x∥ ∧ ∥y∥),

so

(∥x∥ ∧ ∥y∥)∥x̂ − ŷ∥ ≥ ∥x − y∥ − ∥x∥ − ∥y∥ + 2(∥x∥ ∧ ∥y∥) = ∥x − y∥ − |∥x∥ − ∥y∥|. ■

Remark 1.2. In addition, the upper bound in the preceding theorem is better than the one in Theorem 1.3: since |∥x∥ − ∥y∥| ≤ ∥x − y∥ and ∥x∥ ∨ ∥y∥ ≥ (∥x∥ + ∥y∥)/2,

(∥x − y∥ + |∥x∥ − ∥y∥|)/(∥x∥ ∨ ∥y∥) ≤ 2∥x − y∥/(∥x∥ ∨ ∥y∥) ≤ 2∥x − y∥/((∥x∥ + ∥y∥)/2) = 4∥x − y∥/(∥x∥ + ∥y∥).

Theorem 1.6 (Metric Space Structure). Let V be a semi-normed linear space. Then the following function

d : V ×V → R : (x, y) 7→ ∥x − y∥ (7)

is a pseudo-metric on V , which becomes a metric if and only if ∥ · ∥ is a norm.


Proof. Let d : V ×V → R be defined as above. We then verify the conditions for pseudo-metrics as follows:
• For every x, y ∈ V , we have d(x, y) = ∥x − y∥ ≥ 0. In particular, d(x, x) = ∥x − x∥ = ∥0V ∥ = 0.
• For every x, y ∈ V ,
d(x, y) = ∥x − y∥ = ∥y − x∥ = d(y, x).

• For every x, y, z ∈ V ,
d(x, z) = ∥x − z∥ = ∥(x − y) + (y − z)∥ ≤ ∥x − y∥ + ∥y − z∥ = d(x, y) + d(y, z).

Finally, we show that d is a metric if and only if ∥ · ∥ is a norm on V :


• Suppose that ∥ · ∥ is a norm on V . Then for every x, y ∈ V ,

d(x, y) = 0 ⇐⇒ ∥x − y∥ = 0 ⇐⇒ x − y = 0V ⇐⇒ x = y.

This shows that d is a metric when ∥ · ∥ is a norm.


• Conversely, suppose that d is itself a metric. Then for every x ∈ V ,

∥x∥ = 0 ⇐⇒ d(x, 0V ) = 0 ⇐⇒ x = 0V .

We then conclude that ∥ · ∥ is a norm on V in this case. ■

Corollary 1.7. Every semi-normed linear space is a topological linear space under the pseudo-metric topology,
by which the semi-norm is uniformly continuous.
Proof. Let V be a semi-normed linear space with semi-norm ∥ · ∥.

• Let x, y ∈ V be arbitrary. Then for every x′ , y′ ∈ V ,

∥(x′ + y′ ) − (x + y)∥ = ∥(x′ − x) + (y′ − y)∥ ≤ ∥x′ − x∥ + ∥y′ − y∥


≤ 2(∥x′ − x∥ ∨ ∥y′ − y∥).

Consequently, if (x′ , y′ ) → (x, y) in V ×V , then we have x′ → x and y′ → y, hence ∥(x′ + y′ ) − (x + y)∥ → 0 as well. In other words,
we also have x′ + y′ → x + y in V , so the addition on V is continuous.
• Let x ∈ V and c ∈ F be arbitrary. Then for every c′ ∈ F and x′ ∈ V ,

∥c′ x′ − cx∥ = ∥c′ x′ − c′ x + c′ x − cx∥ ≤ ∥c′ (x′ − x)∥ + ∥(c′ − c)x∥


= |c′ |∥x′ − x∥ + |c′ − c|∥x∥.

If (c′ , x′ ) → (c, x) in F × V , we shall also have c′ → c and x′ → x, hence |c′ − c| → 0 and ∥x′ − x∥ → 0 as well. Together with the
inequality above, we may conclude that c′ x′ → cx as well, so the scalar multiplication is continuous then.
Now that the addition and the scalar multiplication are both continuous, V is a topological linear space. Finally, let (xn )n∈N be a sequence
in V such that xn → x ∈ V under d. Then for every ε > 0, there exists N ∈ N such that ∥xn − x∥ < ε for n ≥ N. As noted in Theorem 1.1, we
can see that
|∥xn ∥ − ∥x∥| ≤ ∥xn − x∥ < ε, ∀n ≥ N.
Therefore, the semi-norm ∥ · ∥ : V → R is itself uniformly continuous as well. ■

Theorem 1.8 (Closed Span). Let V be a normed linear space with subset S. Then the closure of the subspace
Span(S) spanned by S is the smallest closed subspace of V containing S.
Proof. Let S ⊆ V be an arbitrary subset, and write W for the closure of Span(S). Being a closure, W is automatically closed in V , so we first verify that it is a subspace: Let x, y ∈ W and c ∈ F be arbitrary. Then by the property of closure, we can find sequences (xn )n∈N and (yn )n∈N in Span(S) such that xn → x and yn → y. By the continuity of addition and scalar multiplication, we see that

cx + y = c lim xn + lim yn = lim (cxn + yn ) ∈ W,
          n→∞      n→∞      n→∞

where cxn + yn ∈ Span(S) for every n. Therefore, W is indeed a closed subspace of V .

Finally, let V ′ be another closed subspace of V containing S. First, since V ′ is a subspace, we must have Span(S) ⊆ V ′ . Now because V ′ is closed, we also have W ⊆ V ′ . This proves the minimality of W . ■

Definition 1.3. Let V be a normed linear space. A topological base of V is a linearly independent subset B such
that the subspace Span(B) of V spanned by B is dense in V , namely the closure of Span(B) is V itself.

Remark 1.3. In functional analysis, a base of a normed linear space, namely a linearly independent subset that
spans the whole space, is usually referred to as a Hamel base or an algebraic base. Clearly, every Hamel base is
also a topological base.

Definition 1.4. Let V be a semi-normed linear space. A sequence (xn )n∈N of elements in V is a Cauchy sequence
if for every ε > 0, there exists N ∈ N such that ∥xm − xn ∥ < ε for all m, n ≥ N.

Theorem 1.9 (Cauchy Sequences in Semi-Normed Linear Spaces). Let V be a semi-normed linear space, and (xn )n∈N
be a sequence in V .

1. If (xn )n∈N is Cauchy, then it is also bounded in V .

2. If xn → x in V for some x ∈ V , then the sequence (xn )n∈N is also Cauchy and ∥xn ∥ → ∥x∥ in R.

Proof. 1. Suppose that (xn )n∈N is Cauchy. Then there exists N0 ∈ N such that ∥xm − xn ∥ < 1 for all m, n ≥ N0 . In this case, for n ≥ N0 ,

∥xn ∥ = ∥(xn − xN0 ) + xN0 ∥ ≤ ∥xn − xN0 ∥ + ∥xN0 ∥ < 1 + ∥xN0 ∥.

Consequently, for every n ∈ N,

∥xn ∥ ≤ ∥x1 ∥ ∨ ∥x2 ∥ ∨ · · · ∨ ∥xN0 −1 ∥ ∨ (∥xN0 ∥ + 1).
2. Suppose that xn → x ∈ V as n → ∞. Let ε > 0 be arbitrary.
• Clearly, there exists N ∈ N such that ∥xn − x∥ < ε for all n ≥ N. In this case,

|∥xn ∥ − ∥x∥| ≤ ∥xn − x∥ < ε, ∀n ≥ N.

This shows that ∥xn ∥ → ∥x∥ as n → ∞ in R.


• Meanwhile, there also exists N ′ ∈ N such that ∥xn − x∥ < ε/2 for all n ≥ N ′ . Then for every m, n ≥ N ′ ,

∥xm − xn ∥ = ∥(xm − x) − (xn − x)∥ ≤ ∥xm − x∥ + ∥xn − x∥ < ε/2 + ε/2 = ε.
Therefore, the convergent sequence (xn )n∈N is also Cauchy. ■

Definition 1.5. A normed linear space is called a Banach space if the metric induced by its norm is complete,
namely every Cauchy sequence in such space is convergent.

Theorem 1.10. Let V be a Banach space. Then a subspace W of V is a Banach space if and only if it is closed.
Proof. Let W be a subspace of V . Suppose that W is closed in V . Let (xn )n∈N be a Cauchy sequence in W . Since V is a Banach space,
there exists x ∈ V such that xn → x as n → ∞. Now because W is closed in V , we certainly have x ∈ W as well. In other words, the Cauchy
sequence (xn )n∈N in W is convergent with limit x ∈ W , hence W is a Banach space.
Conversely, suppose that W is a Banach space. Let (xn )n∈N be a sequence in W such that xn → x ∈ V as n → ∞. By Theorem 1.9, the
sequence (xn )n∈N is Cauchy. Since W is Banach, there exists x′ ∈ W ⊆ V such that xn → x′ . However, because every convergent sequence
has a unique limit in any metric space, we must have x = x′ ∈ W . Therefore, the subspace W is closed in V . ■

Definition 1.6. Let V be a normed linear space and (xn )n∈N be a sequence of elements in V .

1. For each n ∈ N, the n-th partial sum of such sequence is defined as sn := x1 + · · · + xn .

2. The series ∑∞n=1 xn is (conditionally) convergent if the sequence (sn )n∈N of partial sums is convergent in
V , in which case its sum is defined as the limit of (sn )n∈N .

3. The series ∑∞n=1 xn is absolutely convergent if the series ∑∞n=1 ∥xn ∥ of non-negative real numbers is
convergent.

Theorem 1.11 (Necessary Condition for Convergent Series). Let V be a normed linear space and (xn )n∈N be a
sequence in V . If the series ∑∞n=1 xn is convergent, then xn → 0V as n → ∞.

Proof. Let (sn )n∈N be the sequence of partial sums. By putting s0 := 0V , it is clear that xn = sn − sn−1 for all n ∈ N. If the series ∑∞n=1 xn is convergent with sum s := limn→∞ sn , then

lim xn = lim (sn − sn−1 ) = lim sn − lim sn−1 = s − s = 0V . ■
n→∞      n→∞              n→∞      n→∞

Theorem 1.12. A normed linear space is a Banach space if and only if every absolutely convergent series in it
is also convergent.

Proof. Let V be a normed linear space. First, suppose that V is a Banach space. Consider an absolutely convergent series ∑∞n=1 xn . For convenience, denote by (sn )n∈N and (Sn )n∈N the sequences of partial sums of ∑∞n=1 xn and ∑∞n=1 ∥xn ∥, respectively.
By assumption, the sequence (Sn )n∈N of non-negative real numbers is convergent, hence is Cauchy as well. That is, for every ε > 0, there exists N ∈ N such that |Sm − Sn | < ε for all m, n ≥ N. As a result, given any m, n ∈ N with m ≥ n ≥ N, we see that

∥sm − sn ∥ = ∥xn+1 + · · · + xm ∥ ≤ ∥xn+1 ∥ + · · · + ∥xm ∥ = Sm − Sn < ε.

Consequently, the sequence (sn )n∈N is also Cauchy in V , hence convergent since V is Banach. Therefore, the series ∑∞n=1 xn itself is convergent, as desired.
Conversely, suppose that every absolutely convergent series in V is also convergent. Let (xn )n∈N be a Cauchy sequence in V . First, there exists N1 ∈ N such that ∥xm − xn ∥ < 1/2 for all m, n ≥ N1 . Whenever we have Nk for some k, we can further find Nk+1 ≥ Nk such that

∥xm − xn ∥ < 2^{−(k+1)} , ∀m, n ≥ Nk+1 .

In this case, consider the series ∑∞k=1 (xNk+1 − xNk ). By the construction above, we have ∥xNk+1 − xNk ∥ < 2^{−k} for all k ∈ N, so by the direct comparison test, such series is absolutely convergent, hence convergent by our hypothesis. Observe that its k-th partial sum is given by

(xN2 − xN1 ) + · · · + (xNk+1 − xNk ) = xNk+1 − xN1 ,

so there exists x ∈ V such that xNk → x as k → ∞. Finally, let ε > 0 be arbitrary.
• On the one hand, let r := ⌈1 − log2 ε⌉. Then for m, n ≥ Nr , we have

∥xm − xn ∥ < 2^{−r} ≤ 2^{−(1−log2 ε)} = ε/2.

• On the other hand, since xNk → x, there exists r′ ∈ N such that ∥xNk − x∥ < ε/2 for all k ≥ r′ .
Then for n ≥ Nk where k := r ∨ r′ ,

∥xn − x∥ ≤ ∥xn − xNk ∥ + ∥xNk − x∥ < ε/2 + ε/2 = ε.
This shows that xn → x as n → ∞, namely the Cauchy sequence (xn )n∈N is convergent. In conclusion, the normed linear space V is a Banach
space. ■
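The key estimate in the first half of the proof, ∥sm − sn ∥ ≤ Sm − Sn for m ≥ n, is easy to observe numerically. A sketch in R^4 (assuming NumPy, not part of the note):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=(200, 4))                 # a sequence (x_n) in R^4
s = np.cumsum(x, axis=0)                      # partial sums s_n of sum x_n
S = np.cumsum(np.linalg.norm(x, axis=1))      # partial sums S_n of sum ||x_n||

def proof_estimate_holds(n, m):
    # the proof's estimate: ||s_m - s_n|| <= S_m - S_n whenever m >= n
    return np.linalg.norm(s[m] - s[n]) <= S[m] - S[n] + 1e-9

assert all(proof_estimate_holds(n, m) for n, m in [(0, 5), (10, 50), (100, 199)])
```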

1.2 Operations on Semi-Norms


Theorem 1.13 (Semi-Norm Induced By Linear Maps). Let V,W be linear spaces and T : V → W be a linear
map. Then for every semi-norm ∥ · ∥ on W , the map

∥ · ∥T : V → R : x 7→ ∥T x∥

is also a semi-norm on V . Furthermore, if ∥ · ∥ itself is a norm on W and T is injective, then ∥ · ∥T becomes a
norm on V and T becomes a linear isometry, namely ∥T x∥ = ∥x∥T for all x ∈ V .
Proof. Suppose that ∥ · ∥ is a semi-norm on W , and let ∥ · ∥T be defined as above. We check that ∥ · ∥T is a semi-norm on V as follows:
• For every x ∈ V and c ∈ F,
∥cx∥T = ∥T (cx)∥ = ∥c(T x)∥ = |c|∥T x∥ = |c|∥x∥T .

• For every x, y ∈ V ,
∥x + y∥T = ∥T (x + y)∥ = ∥T x + Ty∥ ≤ ∥T x∥ + ∥Ty∥ = ∥x∥T + ∥y∥T .

Furthermore, for every x ∈ V , we may observe that

∥x∥T = 0 ⇐⇒ ∥T x∥ = 0 ⇐= T x = 0W ⇐= x = 0V ,

where the first implication can be reversed if ∥ · ∥ is a norm, and the second can be reversed whenever T is injective. Therefore, the semi-norm ∥ · ∥T is a norm if ∥ · ∥ is a norm and the linear map T is injective. In this case, for each x ∈ V , we have ∥T x∥ = ∥x∥T , hence T is an isometry under such two norms. ■

Corollary 1.14 (Semi-Norms and Isomorphism). Let V,W be linear spaces and T : V → W be an isomorphism.
Then for every semi-norm ∥ · ∥ on V , the map

∥y∥′ := ∥T −1 y∥, ∀y ∈ W,

is also a semi-norm on W , which is a norm on W if and only if ∥ · ∥ is a norm on V .


Proof. Let ∥ · ∥ be a semi-norm on V and ∥ · ∥′ : W → R be the map defined above. Since T −1 : W → V is linear, by the preceding theorem,
it is clear that ∥ · ∥′ is a semi-norm on W .
• Clearly, if ∥ · ∥ is a norm on V , since T −1 is already injective, we can see that ∥ · ∥′ is a norm on W as well.
• Conversely, suppose that ∥ · ∥′ is a norm on W . Then for every x ∈ V ,

∥x∥ = ∥T −1 (T x)∥ = ∥T x∥′ .

Since T is also injective, it follows from the preceding theorem that ∥ · ∥ is itself also a norm on V .
Therefore, we may conclude that ∥ · ∥ is a norm on V if and only if ∥ · ∥′ is. ■

Remark 1.4. The above corollary implies that isomorphic linear spaces have “identical” families of semi-norms
defined on them.

Theorem 1.15 (“Kernel” of Semi-Norms). Let V be a semi-normed linear space. Then the subset Z := {x ∈ V |
∥x∥ = 0} is a subspace of V .

1. For every subspace V ′ of V , the semi-norm ∥ · ∥ restrict to a norm on V ′ if and only if V ′ ∩ Z = {0V }. In
particular, the semi-norm ∥ · ∥ is a norm on V if and only if Z = {0V }.

2. On the quotient space V /Z, define

∥x̄∥′ := ∥x∥, ∀x ∈ V, (8)

where x̄ := x + Z denotes the coset of x in V /Z. Then the map ∥ · ∥′ on V /Z is a well-defined norm.


Proof. Let Z := {x ∈ V | ∥x∥ = 0}. First, since ∥0V ∥ = 0, it follows that 0V ∈ Z. In other words, the set Z is nonempty. Furthermore, for every c ∈ F and x, y ∈ Z,
0 ≤ ∥cx + y∥ ≤ ∥cx∥ + ∥y∥ = |c|∥x∥ + ∥y∥ = |c| · 0 + 0 = 0.
Consequently, we have ∥cx + y∥ = 0, whence cx + y ∈ Z. Therefore, Z is indeed a subspace of V .
1. Let V ′ be a subspace of V . Then

∥ · ∥ restricts to a norm on V ′ ⇐⇒ ∀x ∈ V ′ : (∥x∥ = 0 ⇐⇒ x = 0V )


⇐⇒ ∀x ∈ V ′ : (x ∈ Z ⇐⇒ x = 0V ) ⇐⇒ V ′ ∩ Z = {0V }.

As a result, we can see that ∥ · ∥ is itself a norm on V if and only if {0V } = V ∩ Z = Z.


2. First, for every x, y ∈ V ,
x̄ = ȳ ⇐⇒ x − y ∈ Z ⇐⇒ ∥x − y∥ = 0 ⇐⇒ ∥x∥ = ∥y∥,
where the last equivalence follows from Theorem 1.1. This shows that ∥ · ∥′ is well-defined. We then verify the axioms:

• For every x ∈ V ,
∥x̄∥′ = 0 ⇐⇒ ∥x∥ = 0 ⇐⇒ x ∈ Z ⇐⇒ x̄ = 0̄ (the zero coset Z).

• For every x ∈ V and c ∈ F,
∥cx̄∥′ = ∥cx∥ = |c|∥x∥ = |c|∥x̄∥′ .

• For every x, y ∈ V ,
∥x̄ + ȳ∥′ = ∥x + y∥ ≤ ∥x∥ + ∥y∥ = ∥x̄∥′ + ∥ȳ∥′ .

Therefore, we may conclude that ∥ · ∥′ is a norm on the quotient space V /Z. ■
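As a concrete illustration (a sketch in Python, not part of the note): the map p(x1 , x2 ) = |x1 | is a semi-norm on R^2 whose “kernel” Z is the x2 -axis, and the induced norm on the quotient does not depend on the representative of a coset:

```python
def p(v):
    # semi-norm on R^2 with "kernel" Z = {0} x R (the x2-axis): p(x1, x2) = |x1|
    return abs(v[0])

# p vanishes exactly on Z
assert p((0.0, 5.0)) == 0.0 and p((2.0, -1.0)) == 2.0

# the induced norm on V/Z is well-defined: representatives of the same
# coset x + Z (differing by an element of Z) receive the same value
x, x_alt = (3.0, 1.0), (3.0, -7.0)
assert p(x) == p(x_alt) == 3.0
```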

Remark 1.5. Recall that given any pseudo-metric ρ on a set X, there is a unique metric ρ̃ defined on the quotient
set X/∼, where ∼ is an equivalence relation on X defined by

∀x, y ∈ X : x ∼ y ⇐⇒ ρ(x, y) = 0.

In terms of the pseudo-metric d induced by the semi-norm ∥ · ∥ on V , we see that for every x, y ∈ V ,

x ∼ y ⇐⇒ d(x, y) = 0 ⇐⇒ ∥x − y∥ = 0 ⇐⇒ x − y ∈ Z.

Therefore, the equivalence relation ∼ is precisely the congruence relation for the quotient space structure of
V /Z. Furthermore, suppose that d̃ is the metric on V /Z induced by d. Given any x, y ∈ V ,

d̃(x̄, ȳ) = d(x, y) = ∥x − y∥ = ∥x̄ − ȳ∥′ ,

that is, d̃ is also the metric induced by the norm ∥ · ∥′ on V /Z. In conclusion, starting with a semi-normed linear
space, there is no difference if we
• first convert it into a normed linear space by taking quotients and then associate a metric to the new norm,
or

• first construct the associated pseudo-metric and convert that into a metric by equivalence relation.
Both will result in the same metric space.
Theorem 1.16 (Quotient Space). Let V be a normed linear space. For every subspace W of V , the quotient
space V /W is a semi-normed linear space under the following semi-norm

∥x̄∥′ := d(x,W ) = inf_{z∈W} ∥x − z∥ = inf_{z∈W} ∥x + z∥, ∀x ∈ V, (9)

while the quotient map π : V → V /W is Lipschitz continuous with constant 1. Furthermore,

• if W is closed in V , then V /W becomes a normed linear space;

• if furthermore V is a Banach space, then V /W is a Banach space as well.


Proof. Let W be a subspace of V . First, we need to show that the map on V /W is well-defined: Suppose that x̄ = ȳ for some x, y ∈ V . Then we have x − y, y − x ∈ W . Consequently, for every z ∈ W ,

∥x − z∥ = ∥y − ((y − x) + z)∥ ≥ d(y,W ) and ∥y − z∥ = ∥x − ((x − y) + z)∥ ≥ d(x,W ),

since (y − x) + z and (x − y) + z lie in W . Since z ∈ W was selected arbitrarily above, we have d(x,W ) ≥ d(y,W ) and d(y,W ) ≥ d(x,W ), namely d(x,W ) = d(y,W ). This shows that ∥ · ∥′ is well-defined. Furthermore, because W is a subspace, the map z 7→ −z is a bijection on it. It is thus clear that

d(x,W ) = inf_{z∈W} ∥x − z∥ = inf_{z∈W} ∥x + z∥.

Next, we verify the conditions for semi-norms on V /W as follows:

• Let x ∈ V and c ∈ F be arbitrary. If c = 0, then

∥cx̄∥′ = ∥0̄V ∥′ = d(0V ,W ) = 0 = |c|∥x̄∥′ ;

otherwise,

∥cx̄∥′ = inf_{z∈W} ∥cx − z∥ = inf_{z∈W} |c|∥x − c^{−1} z∥ = |c| inf_{z∈W} ∥x − c^{−1} z∥ = |c|∥x̄∥′ ,

where the last equality holds because z 7→ c^{−1} z is a bijection on W .

• Let x, y ∈ V be arbitrary. For every z, z′ ∈ W , we see that

∥x − z∥ + ∥y − z′ ∥ ≥ ∥(x + y) − (z + z′ )∥ ≥ ∥x̄ + ȳ∥′ ,

since z + z′ ∈ W . As z, z′ ∈ W above are arbitrary, by taking infima over all z, z′ , it is clear that ∥x̄∥′ + ∥ȳ∥′ ≥ ∥x̄ + ȳ∥′ .

Clearly, we have ∥x̄∥′ ≤ ∥x − 0V ∥ = ∥x∥ for each x ∈ V . Therefore, for every x, y ∈ V ,

∥x̄ − ȳ∥′ ≤ ∥x − y∥,

implying that the quotient map π : V → V /W is Lipschitz continuous with constant 1.


• Next, suppose that W is closed in V . Let x ∈ V be such that ∥x̄∥′ = 0. By the definition of infimum, there exists a sequence (zn )n∈N in W such that ∥x − zn ∥ → 0 as n → ∞. This then implies that zn → x in V , hence x ∈ W due to the fact that W is closed in V . Therefore, x̄ = 0̄V , implying that ∥ · ∥′ is a norm on V /W .
• Finally, assume also that V is a Banach space. Let ∑∞n=1 ζn be an absolutely convergent series in V /W . Then for each n ∈ N, there exists xn ∈ V such that ζn = x̄n and ∥xn ∥ < ∥ζn ∥′ + 2^{−n} . In this case, the series ∑∞n=1 xn is also absolutely convergent, hence converges as well by Theorem 1.12. By the continuity of the quotient map π : V → V /W , the series ∑∞n=1 ζn = ∑∞n=1 π(xn ) is convergent as well. It then follows from Theorem 1.12 that V /W is a Banach space. ■

Lemma 1.17. Let V be a linear space with semi-norms ∥ · ∥1 , . . . , ∥ · ∥m . Suppose that ∥ · ∥ is a norm on Rm such
that ∥y∥ ≤ ∥y + z∥ for every y, z ∈ Rm with non-negative entries. Then the following map

p : V → R : x 7→ ∥[∥x∥1 , . . . , ∥x∥m ]T ∥ (10)

is also a semi-norm on V , which is a norm whenever at least one of ∥ · ∥1 , . . . , ∥ · ∥m is a norm.


Proof. We simply verify the axioms as follows:
• For every x ∈ V and c ∈ F,

p(cx) = ∥[∥cx∥1 , . . . , ∥cx∥m ]T ∥ = ∥[|c|∥x∥1 , . . . , |c|∥x∥m ]T ∥ = ∥|c|[∥x∥1 , . . . , ∥x∥m ]T ∥
      = |c|∥[∥x∥1 , . . . , ∥x∥m ]T ∥ = |c|p(x).

• For every x, y ∈ V , since ∥x + y∥i ≤ ∥x∥i + ∥y∥i for each i, the monotonicity assumption on ∥ · ∥ yields

p(x + y) = ∥[∥x + y∥1 , . . . , ∥x + y∥m ]T ∥ ≤ ∥[∥x∥1 + ∥y∥1 , . . . , ∥x∥m + ∥y∥m ]T ∥
         = ∥[∥x∥1 , . . . , ∥x∥m ]T + [∥y∥1 , . . . , ∥y∥m ]T ∥
         ≤ ∥[∥x∥1 , . . . , ∥x∥m ]T ∥ + ∥[∥y∥1 , . . . , ∥y∥m ]T ∥ = p(x) + p(y).

This shows that p is a semi-norm on V . Furthermore, for every x ∈ V , since ∥ · ∥ is a norm on Rm , we see that

p(x) = 0 ⇐⇒ ∥[∥x∥1 , . . . , ∥x∥m ]T ∥ = 0 ⇐⇒ ∥x∥1 = · · · = ∥x∥m = 0.

Consequently, if at least one of ∥ · ∥1 , . . . , ∥ · ∥m is a norm, the last condition is equivalent to x = 0V , whence p is also a norm on V . ■
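The construction in Lemma 1.17 is easy to try out concretely. A sketch (assuming NumPy, not part of the note): combining the coordinate semi-norms |x1 | and |x2 | on R^2 with the Euclidean outer norm recovers the Euclidean norm itself:

```python
import numpy as np

def combined(x, seminorms, outer=np.linalg.norm):
    # p(x) = ||(p_1(x), ..., p_m(x))|| as in Lemma 1.17; `outer` must be a
    # norm on R^m that is monotone on non-negative vectors (e.g. any l^p norm)
    return outer(np.array([q(x) for q in seminorms]))

sn = [lambda v: abs(v[0]), lambda v: abs(v[1])]
x, y = np.array([3.0, 4.0]), np.array([-1.0, 2.0])
assert abs(combined(x, sn) - 5.0) < 1e-12
# triangle inequality for the combined map
assert combined(x + y, sn) <= combined(x, sn) + combined(y, sn) + 1e-12
```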

Theorem 1.18 (Operations on Semi-Norms). Let V be a linear space with semi-norms ∥ · ∥, ∥ · ∥′ . Then given
any non-negative a ∈ R, the maps a∥ · ∥, ∥ · ∥ + ∥ · ∥′ , and ∥ · ∥ ∨ ∥ · ∥′ are also semi-norms on V , which are also
norms when ∥ · ∥, ∥ · ∥′ are norms and a > 0.
Proof. Let a ∈ R be a non-negative constant. Then for every x, y ∈ V and c ∈ F,

a∥cx∥ = a|c|∥x∥ = |c|(a∥x∥),


∥cx∥ + ∥cx∥′ = |c|∥x∥ + |c|∥x∥′ = |c|(∥x∥ + ∥x∥′ ),
∥cx∥ ∨ ∥cx∥′ = |c|∥x∥ ∨ |c|∥x∥′ = |c|(∥x∥ ∨ ∥x∥′ ),

and also

a∥x + y∥ ≤ a(∥x∥ + ∥y∥) = a∥x∥ + a∥y∥,


∥x + y∥ + ∥x + y∥′ ≤ (∥x∥ + ∥y∥) + (∥x∥′ + ∥y∥′ ) = (∥x∥ + ∥x∥′ ) + (∥y∥ + ∥y∥′ ),
∥x + y∥ ∨ ∥x + y∥′ ≤ (∥x∥ + ∥y∥) ∨ (∥x∥′ + ∥y∥′ ) ≤ (∥x∥ ∨ ∥x∥′ ) + (∥y∥ ∨ ∥y∥′ ).

This then shows that a∥ · ∥, ∥ · ∥ + ∥ · ∥′ , and ∥ · ∥ ∨ ∥ · ∥′ are also semi-norms on V . When ∥ · ∥, ∥ · ∥′ are both norms and a > 0, for each x ∈ V ,

a∥x∥ = 0 ⇐⇒ ∥x∥ = 0 ⇐⇒ x = 0V ,

and

∥x∥ + ∥x∥′ = 0 ⇐⇒ ∥x∥ = ∥x∥′ = 0 ⇐⇒ ∥x∥ ∨ ∥x∥′ = 0 ⇐⇒ x = 0V .

Therefore, the maps a∥ · ∥, ∥ · ∥ + ∥ · ∥′ , and ∥ · ∥ ∨ ∥ · ∥′ are norms on V in this case.


In addition, the fact that ∥ · ∥ + ∥ · ∥′ and ∥ · ∥ ∨ ∥ · ∥′ are semi-norms (resp. norms) on V can also be seen from the preceding lemma by
considering the l1 -norm and l∞ -norm on R2 . For every y = [y1 , y2 ]T , z = [z1 , z2 ]T ∈ R2 with non-negative entries,

∥y + z∥1 = (y1 + z1 ) + (y2 + z2 ) ≥ y1 + y2 = ∥y∥1

and
∥y + z∥∞ = (y1 + z1 ) ∨ (y2 + z2 ) ≥ y1 ∨ y2 = ∥y∥∞ . ■

1.3 Equivalence of Semi-Norms


Definition 1.7. Let V be a linear space. A semi-norm ∥ · ∥′ on V is said to be stronger than another semi-norm
∥ · ∥, or equivalently, ∥ · ∥ is weaker than ∥ · ∥′ , if there exists some positive constant M ∈ R such that ∥x∥ ≤ M∥x∥′
for all x ∈ V .

Lemma 1.19. Let V be a linear space. Then being stronger is a preorder on the set of all semi-norms on V .
Proof. The reflexivity is trivial with M = 1. Now let ∥ · ∥, ∥ · ∥′ , and ∥ · ∥′′ be semi-norms on V such that ∥ · ∥′ is stronger than ∥ · ∥ and ∥ · ∥′′
is stronger than ∥ · ∥′ . Then there are positive constants M, M ′ ∈ R such that

∥x∥ ≤ M∥x∥′ and ∥x∥′ ≤ M ′ ∥x∥′′

for all x ∈ V . In this case, we see that ∥x∥ ≤ MM ′ ∥x∥′′ for all x and MM ′ > 0, so ∥ · ∥′′ is also stronger than ∥ · ∥. ■

Theorem 1.20. Let V be a linear space with semi-norms ∥ · ∥ and ∥ · ∥′ . The following statements are equivalent:

1. The semi-norm ∥ · ∥′ is stronger than ∥ · ∥.

2. The pseudo-metric topology on V induced by ∥ · ∥′ is finer than the one induced by ∥ · ∥.

3. For every sequence (xn )n∈N in V , if ∥xn ∥′ → 0 in R, then ∥xn ∥ → 0 as well.


Proof. (1 =⇒ 2). Suppose that ∥ · ∥′ is stronger than ∥ · ∥, say ∥ · ∥ ≤ M∥ · ∥′ for some positive M ∈ R. For each x ∈ V and r > 0, let

B := {y ∈ V | ∥y − x∥ < r} and B′ := {y ∈ V | ∥y − x∥′ < r/M}

be the open balls under the respective pseudo-metric topologies with respective radii r and r/M. Then for each y ∈ B′ , we see that

∥y − x∥ ≤ M∥y − x∥′ < M · (r/M) = r,
hence y ∈ B as well. In other words, every open ball under ∥ · ∥ contains an open ball under ∥ · ∥′ , hence the topology induced by ∥ · ∥′ is
finer than the one induced by ∥ · ∥.
(2 =⇒ 3). Suppose that the pseudo-metric topology on V induced by ∥ · ∥′ is finer than the one induced by ∥ · ∥. Let (xn )n∈N be a
sequence in V such that ∥xn ∥′ → 0 in R. Then we have xn → 0V under ∥ · ∥′ , hence xn → 0V under ∥ · ∥ as well. As a result, we may conclude
that ∥xn ∥ = ∥xn − 0V ∥ → 0 as well.
(3 =⇒ 1). Suppose that ∥xn ∥′ → 0 implies ∥xn ∥ → 0 for every sequence (xn )n∈N in V . Consider the unit ball B := {x ∈ V | ∥x∥′ ≤ 1}
centered at 0V . Assume, to the contrary, that B is unbounded in ∥ · ∥. Then for every n ∈ N, there exists xn ∈ B such that ∥xn ∥ > n, so we
may put

yn := xn /n.

As we can see, ∥yn ∥′ = ∥xn ∥′ /n ≤ 1/n and ∥yn ∥ = ∥xn ∥/n > 1. In this case, we see that ∥yn ∥′ → 0 as n → ∞ but ∥yn ∥ ̸→ 0, a contradiction. Therefore, the ball B is bounded in ∥ · ∥, hence there exists a
positive M ∈ R such that B ⊆ {x ∈ V | ∥x∥ ≤ M}.
Now let x ∈ V be arbitrary. If ∥x∥′ = 0, by considering the constant sequence with common term x, it follows from the assumption that
∥x∥ = 0 as well. Thus, we shall assume that ∥x∥′ > 0 then. In this case, for y := (1/∥x∥′ )x, we have

∥y∥′ = (1/∥x∥′ )∥x∥′ = 1,

hence ∥y∥ ≤ M as noted above. Consequently,

∥x∥ = ∥∥x∥′ y∥ = ∥x∥′ ∥y∥ ≤ M∥x∥′ .
This then shows that ∥ · ∥′ is stronger than ∥ · ∥. ■

Definition 1.8. Let V be a linear space. Then two semi-norms ∥ · ∥ and ∥ · ∥′ are equivalent if ∥ · ∥′ is stronger
than ∥ · ∥ and vice-versa.

Corollary 1.21. Let V be a linear space with semi-norms ∥ · ∥ and ∥ · ∥′ . Then the following statements are
equivalent:

1. The semi-norms ∥ · ∥ and ∥ · ∥′ are equivalent.

2. The pseudo-metric topologies on V induced by ∥ · ∥′ and ∥ · ∥ are identical.

3. There exist positive constants m, M ∈ R such that m∥x∥′ ≤ ∥x∥ ≤ M∥x∥′ for all x ∈ V .

Furthermore, being equivalent is an equivalence relation on the set of all semi-norms on V .


Proof. The equivalence (1 ⇐⇒ 2) is clear from Theorem 1.20, since the two topologies are identical exactly when each is finer than the other. We then prove (1 ⇐⇒ 3) as follows:
• Suppose that ∥ · ∥ and ∥ · ∥′ are equivalent. Then ∥ · ∥′ is stronger than ∥ · ∥ and vice-versa. By definition, there exist positive
M1 , M2 ∈ R such that ∥x∥ ≤ M1 ∥x∥′ and ∥x∥′ ≤ M2 ∥x∥ for all x ∈ V . In this case, given any x, we see that

(1/M2 )∥x∥′ ≤ ∥x∥ ≤ M1 ∥x∥′ .

• Conversely, suppose that m∥x∥′ ≤ ∥x∥ ≤ M∥x∥′ for some positive constants m, M. Then we have ∥x∥ ≤ M∥x∥′ and ∥x∥′ ≤ (1/m)∥x∥,
hence ∥ · ∥′ is stronger than ∥ · ∥ and vice-versa. In conclusion, the semi-norms ∥ · ∥ and ∥ · ∥′ are equivalent.
Finally, the reflexivity and symmetry of such relation are trivial, while the transitivity follows from Lemma 1.19. ■

Remark 1.6. Consequently, given two equivalent norms ∥ · ∥ and ∥ · ∥′ on V , the linear space V is a Banach
space under ∥ · ∥ if and only if it is a Banach space under ∥ · ∥′ .

1.4 Finite-Dimensional Normed Linear Spaces


Theorem 1.22 (Young’s Inequality). For every a, b > 0, if p, q > 1 are conjugate indices, i.e., 1/p + 1/q = 1, then

    a^{1/p} b^{1/q} ≤ a/p + b/q    (11)

whose equality is attained if and only if a = b.


Proof. Note that the function f (x) = ln x is strictly concave. Then by Jensen’s inequality,

    ln(a^{1/p} b^{1/q}) = (ln a)/p + (ln b)/q ≤ ln(a/p + b/q),

whose equality is attained if and only if a = b. ■

Remark 1.7. If we replace a^{1/p} and b^{1/q} with a and b, respectively, we obtain another equivalent form of Young’s inequality:

    ab ≤ a^p/p + b^q/q    (12)

whose equality is attained if and only if a^p = b^q .
Proof–2. Consider the power function ϕ : x 7→ x^{p−1} on [0, ∞), whose inverse is given by y 7→ y^{1/(p−1)} = y^{q−1} . Clearly, we have

    ∫_0^a x^{p−1} dx + ∫_0^b y^{q−1} dy = a^p/p + b^q/q.

On the one hand, if b = a^{p−1} , or equivalently, b^q = a^p , the sum of the two integrals above is precisely equal to ab.
[Figure: the curve y = x^{p−1} in the (x, y)-plane; the area under the curve up to x = a plus the area to its left up to y = b covers the rectangle [0, a] × [0, b] of area ab.]

On the other hand, if b > a^{p−1} , then

    a^p/p + b^q/q = ∫_0^a x^{p−1} dx + ∫_0^b y^{q−1} dy
                  = ∫_0^a x^{p−1} dx + ∫_0^{a^{p−1}} y^{q−1} dy + ∫_{a^{p−1}}^b y^{q−1} dy
                  = a · a^{p−1} + ∫_{a^{p−1}}^b y^{q−1} dy
                  > a^p + a(b − a^{p−1}) = ab,

where the strict inequality holds because y^{q−1} > (a^{p−1})^{q−1} = a for every y > a^{p−1} . Similarly, if a > b^{q−1} , we obtain the analogous strict inequality. The proof is thus complete. ■
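The two forms of Young’s inequality are easy to probe numerically; the following Python sketch (not part of the original notes; the sample points are arbitrary choices) checks form (12) on a few inputs and at an equality case a^p = b^q:

```python
def young_gap(a, b, p):
    """(a^p)/p + (b^q)/q - a*b for the conjugate index q = p/(p-1)."""
    q = p / (p - 1)
    return a**p / p + b**q / q - a * b

# The gap is positive away from the equality case a^p = b^q ...
samples = [(0.5, 3.0, 2.0), (1.7, 0.2, 3.0), (4.0, 4.0, 1.5)]
gaps = [young_gap(a, b, p) for a, b, p in samples]

# ... and vanishes at it: here a^p = b^q = 2 with p = 3, q = 3/2.
p = 3.0
q = p / (p - 1)
equality_gap = young_gap(2 ** (1 / p), 2 ** (1 / q), p)
```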

Theorem 1.23 (Hölder’s Inequality). Let ai , bi ≥ 0 for i = 1, . . . , n, where n is a positive integer. If p, q ∈ (1, ∞) are conjugate indices, we have

    ∑_{i=1}^n ai bi ≤ (∑_{i=1}^n ai^p)^{1/p} (∑_{i=1}^n bi^q)^{1/q} ,    (13)

whose equality is attained if and only if the vectors (a1^p , . . . , an^p) and (b1^q , . . . , bn^q) are collinear.
Proof. Denote

    A = (∑_{i=1}^n ai^p)^{1/p}  and  B = (∑_{i=1}^n bi^q)^{1/q} .

The inequality is trivial if A = 0 or B = 0, so we may assume A, B > 0. By Young’s inequality, for each i = 1, . . . , n, we have

    (ai /A) · (bi /B) ≤ (1/p)(ai /A)^p + (1/q)(bi /B)^q .

Summing over i = 1, . . . , n, it gives

    (1/(AB)) ∑_{i=1}^n ai bi ≤ (1/(pA^p)) ∑_{i=1}^n ai^p + (1/(qB^q)) ∑_{i=1}^n bi^q = 1/p + 1/q = 1.

Therefore,

    ∑_{i=1}^n ai bi ≤ AB = (∑_{i=1}^n ai^p)^{1/p} (∑_{i=1}^n bi^q)^{1/q} ,

where the equality holds if and only if ai^p/A^p = bi^q/B^q for all i = 1, . . . , n. ■
Proof–2. Note that for p > 1, the function f (x) = x^p is convex on [0, ∞). Adopting the notations above (with A, B > 0 and every bi > 0, the remaining cases being straightforward), we let αi := (bi /B)^q and xi := ai bi^{1−q} B^q . Then ∑_{i=1}^n αi = 1, while

    ∑_{i=1}^n αi xi = ∑_{i=1}^n ai bi  and  ∑_{i=1}^n αi xi^p = ∑_{i=1}^n (bi /B)^q · ai^p bi^{p(1−q)} B^{pq} = B^p ∑_{i=1}^n ai^p = (AB)^p .

By Jensen’s inequality, (∑_{i=1}^n αi xi )^p ≤ ∑_{i=1}^n αi xi^p , that is,

    (∑_{i=1}^n ai bi )^p ≤ (AB)^p .

Hölder’s inequality thus follows. ■
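Hölder’s inequality, together with its equality case, can likewise be checked numerically (a Python sketch, not from the notes; the sequences and exponent are arbitrary choices):

```python
import random

def holder_sides(a, b, p):
    """Return (lhs, rhs) of Hölder's inequality for nonnegative sequences."""
    q = p / (p - 1)
    lhs = sum(s * t for s, t in zip(a, b))
    rhs = sum(s**p for s in a) ** (1 / p) * sum(t**q for t in b) ** (1 / q)
    return lhs, rhs

random.seed(0)
a = [random.random() for _ in range(10)]
b = [random.random() for _ in range(10)]
lhs, rhs = holder_sides(a, b, p=3.0)

# Equality case: b_i = a_i^(p-1) makes (a_i^p) and (b_i^q) collinear.
b_eq = [s ** 2.0 for s in a]
lhs_eq, rhs_eq = holder_sides(a, b_eq, p=3.0)
```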

Theorem 1.24 (Minkowski’s Inequality). Let n ∈ N and ai , bi ≥ 0 for i = 1, . . . , n. Then for every p ≥ 1,

    (∑_{i=1}^n (ai + bi )^p)^{1/p} ≤ (∑_{i=1}^n ai^p)^{1/p} + (∑_{i=1}^n bi^p)^{1/p} ,    (14)

whose equality, for p > 1, is attained if and only if the vectors (a1 , . . . , an ) and (b1 , . . . , bn ) are collinear.
Proof. The inequality is trivial if p = 1. Assume p > 1 and let q > 1 satisfy 1/p + 1/q = 1. By Hölder’s inequality, we have

    ∑_{i=1}^n (ai + bi )^p = ∑_{i=1}^n (ai + bi )^{p−1} (ai + bi ) = ∑_{i=1}^n (ai + bi )^{p−1} ai + ∑_{i=1}^n (ai + bi )^{p−1} bi
        ≤ (∑_{i=1}^n (ai + bi )^{(p−1)q})^{1/q} (∑_{i=1}^n ai^p)^{1/p} + (∑_{i=1}^n (ai + bi )^{(p−1)q})^{1/q} (∑_{i=1}^n bi^p)^{1/p}
        = (∑_{i=1}^n (ai + bi )^p)^{1/q} [(∑_{i=1}^n ai^p)^{1/p} + (∑_{i=1}^n bi^p)^{1/p}].

Therefore (the case ∑_{i=1}^n (ai + bi )^p = 0 being trivial),

    (∑_{i=1}^n (ai + bi )^p)^{1/p} = (∑_{i=1}^n (ai + bi )^p)^{1−1/q} ≤ (∑_{i=1}^n ai^p)^{1/p} + (∑_{i=1}^n bi^p)^{1/p} .

Finally, from the equality conditions for Hölder’s inequality, the equality is possible in Minkowski’s inequality if and only if the vectors (a1 , . . . , an ) and (b1 , . . . , bn ) are collinear. ■
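Minkowski’s inequality is exactly the triangle inequality for the l_p-norm of the next example; a small Python sketch (not from the notes; the vectors are arbitrary random samples) checks it for several exponents:

```python
import random

def lp_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1 / p)

random.seed(1)
a = [random.random() for _ in range(8)]
b = [random.random() for _ in range(8)]

# Inequality (14) restated: ||a + b||_p <= ||a||_p + ||b||_p.
checks = []
for p in (1.0, 1.5, 2.0, 4.0):
    lhs = lp_norm([s + t for s, t in zip(a, b)], p)
    checks.append(lhs <= lp_norm(a, p) + lp_norm(b, p) + 1e-12)
```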

Example 1.1 (l p -Norm on Fn ). Let p ≥ 1 be a positive constant. The l p -norm on Fn is defined as follows: For
every x = [x1 , . . . , xn ]T ∈ Fn ,
∥x∥ p := (|x1 | p + · · · + |xn | p )1/p . (15)

Here we verify that ∥ · ∥ p is a norm on Fn :


• For every x = [x1 , . . . , xn ]T ∈ Fn and c ∈ F, since cx = [cx1 , . . . , cxn ]T ,

    ∥cx∥ p = (|cx1 |^p + · · · + |cxn |^p)^{1/p} = (|c|^p |x1 |^p + · · · + |c|^p |xn |^p)^{1/p}
           = (|c|^p (|x1 |^p + · · · + |xn |^p))^{1/p}
           = |c|(|x1 |^p + · · · + |xn |^p)^{1/p} = |c|∥x∥ p .

• For every x = [x1 , . . . , xn ]T , y = [y1 , . . . , yn ]T ∈ Fn ,

    ∥x + y∥ p = (|x1 + y1 |^p + · · · + |xn + yn |^p)^{1/p}
             ≤ ((|x1 | + |y1 |)^p + · · · + (|xn | + |yn |)^p)^{1/p}
             ≤ (|x1 |^p + · · · + |xn |^p)^{1/p} + (|y1 |^p + · · · + |yn |^p)^{1/p} = ∥x∥ p + ∥y∥ p .

Here the second ≤ follows from Minkowski’s inequality.

• Let x = [x1 , . . . , xn ]T ∈ Fn . Then

∥x∥ p = 0 ⇐⇒ (|x1 | p + · · · + |xn | p )1/p = 0


⇐⇒ |x1 | p + · · · + |xn | p = 0
⇐⇒ |x1 | = · · · = |xn | = 0 (Because |xi | ≥ 0 for all i = 1, . . . , n.)
⇐⇒ x1 = · · · = xn = 0 ⇐⇒ x = 0.

Furthermore, the l∞ -norm, or max norm, on Fn is defined as follows: For every x = [x1 , . . . , xn ]T ∈ Fn ,

∥x∥∞ := |x1 | ∨ · · · ∨ |xn |. (16)

Clearly,

    lim_{p→∞} ∥x∥ p = lim_{p→∞} (|x1 |^p + · · · + |xn |^p)^{1/p} = |x1 | ∨ · · · ∨ |xn | = ∥x∥∞ .

Here we also show that ∥ · ∥∞ is a norm on Fn :


• For every x = [x1 , . . . , xn ]T ∈ Fn and c ∈ F, since cx = [cx1 , . . . , cxn ]T

∥cx∥∞ = |cx1 | ∨ · · · ∨ |cxn | = |c||x1 | ∨ · · · ∨ |c||xn |


= |c|(|x1 | ∨ · · · ∨ |xn |) = |c|∥x∥∞ .

• For every x = [x1 , . . . , xn ]T , y = [y1 , . . . , yn ]T ∈ Fn ,

∥x + y∥∞ = |x1 + y1 | ∨ · · · ∨ |xn + yn | ≤ (|x1 | + |y1 |) ∨ · · · ∨ (|xn | + |yn |)


≤ (|x1 | ∨ · · · ∨ |xn |) + (|y1 | ∨ · · · ∨ |yn |) = ∥x∥∞ + ∥y∥∞ .

• Let x = [x1 , . . . , xn ]T ∈ Fn . Then

∥x∥∞ = 0 ⇐⇒ |x1 | ∨ · · · ∨ |xn | = 0


⇐⇒ |x1 | = · · · = |xn | = 0 (Because |xi | ≥ 0 for all i = 1, . . . , n.)
⇐⇒ x1 = · · · = xn = 0 ⇐⇒ x = 0.

Finally, we add some additional remarks on l p -norms:

• The l2 -norm, also called the Euclidean norm, is the most frequently used norm on Fn : For every
x = [x1 , . . . , xn ]T ∈ Fn ,
∥x∥2 = (|x1 |2 + · · · + |xn |2 )1/2 . (17)

As we shall see, it is the only norm on Fn induced from an inner product.

• The l1 -norm is called the sum norm, or the Manhattan norm, or taxicab norm, in which for every x =
[x1 , . . . , xn ]T ∈ Fn ,
∥x∥1 = |x1 | + · · · + |xn |. (18)

• In addition, every l p -norm is permutation-invariant, namely for every x = [x1 , . . . , xn ]T and σ ∈ Sn , we
have
∥xσ ∥ p = ∥x∥ p , where xσ := [xσ (1) , . . . , xσ (n) ]T (19)
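The l p - and l∞ -norms above, together with the limit ∥x∥ p → ∥x∥∞ and permutation invariance, can be illustrated with a short Python sketch (not from the notes; the sample vector is an arbitrary choice):

```python
import random

def lp_norm(x, p):
    """l_p-norm for p in [1, inf)."""
    return sum(abs(t) ** p for t in x) ** (1 / p)

def linf_norm(x):
    return max(abs(t) for t in x)

x = [3.0, -4.0, 0.0]
one = lp_norm(x, 1)    # sum norm: 3 + 4 + 0
two = lp_norm(x, 2)    # Euclidean norm: sqrt(25)
sup = linf_norm(x)     # max norm: 4
big = lp_norm(x, 50)   # already very close to the max norm

# Permutation invariance: reordering the entries leaves the norm unchanged.
random.seed(0)
perm = random.sample(x, len(x))
```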

Finally, by Corollary 1.14, we can define analogues of l p -norms on every finite-dimensional linear space: If V is finite-dimensional with base {z1 , . . . , zn }, then for every x ∈ V ,

    ∥x∥ p := (|z∨1 (x)|^p + · · · + |z∨n (x)|^p)^{1/p} ,  p ∈ [1, ∞),
    ∥x∥∞ := |z∨1 (x)| ∨ · · · ∨ |z∨n (x)|.    (20)

Here z∨1 , . . . , z∨n : V → F are the coordinate forms, and we applied Corollary 1.14 via the isomorphism Fn → V : ei 7→ zi . One should beware that this isomorphism is not canonical: its definition relies on the base selected for V .

Corollary 1.25. Let S ∈ Mm,n (F) have full column rank. Then for every norm ∥ · ∥ on Fm , the following map

∥ · ∥S : Fn → R : x 7→ ∥Sx∥ (21)

is also a norm on Fn .
Proof. By identifying Mm,n (F) with L (Fn , Fm ), we can see that S is an injective linear map. Then by Theorem 1.13, for every norm ∥ · ∥
on Fm , the corresponding map ∥ · ∥S is also a norm on Fn . ■
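A concrete instance of Corollary 1.25 (a Python sketch, not from the notes; the matrix S is an arbitrary full-column-rank example): taking the Euclidean norm on R^3, the map x 7→ ∥Sx∥2 satisfies the norm axioms on R^2, with injectivity of S supplying positive definiteness:

```python
# A 3x2 matrix S with full column rank (its two columns are linearly
# independent), viewed as an injective linear map R^2 -> R^3.
S = [[1.0, 0.0],
     [1.0, 1.0],
     [0.0, 2.0]]

def norm_S(x):
    """||Sx||_2, which Corollary 1.25 asserts is a norm on R^2."""
    rows = (sum(s * t for s, t in zip(row, x)) for row in S)
    return sum(r * r for r in rows) ** 0.5

x = [1.0, -1.0]
y = [0.5, 2.0]

homogeneous = abs(norm_S([3 * t for t in x]) - 3 * norm_S(x)) < 1e-12
triangle = norm_S([a + b for a, b in zip(x, y)]) <= norm_S(x) + norm_S(y) + 1e-12
definite = norm_S([0.0, 0.0]) == 0.0 and norm_S(x) > 0.0
```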

Lemma 1.26 (Equivalence of l p -Norms). For every x ∈ Fn and 1 ≤ p < q ≤ ∞,

    ∥x∥q ≤ ∥x∥ p ≤ n^{1/p − 1/q} ∥x∥q ,    (22)

with the convention 1/∞ := 0. In particular, all l p -norms on Fn are equivalent.


Proof. Let x = [x1 , . . . , xn ]T where x1 , . . . , xn ∈ F. By assumption, it is clear that p < ∞ always. First, suppose that q = ∞. Let k ∈ {1, . . . , n} be an index such that |xk | = ∥x∥∞ . Observe that

    ∥x∥ p = (|x1 |^p + · · · + |xn |^p)^{1/p} ≥ (|xk |^p)^{1/p} = |xk | = ∥x∥∞

and

    ∥x∥ p = (|x1 |^p + · · · + |xn |^p)^{1/p} ≤ (n|xk |^p)^{1/p} = n^{1/p} |xk | = n^{1/p} ∥x∥∞ .

Therefore, we have ∥x∥∞ ≤ ∥x∥ p ≤ n^{1/p} ∥x∥∞ .
Next, suppose that q < ∞. There is nothing to prove if x = 0V , so we shall assume that x ̸= 0V . By applying Hölder’s inequality with r := q/p ∈ (1, ∞) and its conjugate r′ ,

    ∥x∥ p^p = ∑_{i=1}^n |xi |^p ≤ (∑_{i=1}^n 1^{r′})^{1/r′} (∑_{i=1}^n |xi |^{pr})^{1/r} = n^{1/r′} (∑_{i=1}^n |xi |^q)^{p/q} = n^{1−p/q} ∥x∥q^p .

Consequently, we also have ∥x∥ p ≤ n^{1/p − 1/q} ∥x∥q . As for the other inequality, we may put y := ∥x∥q^{−1} x = [y1 , . . . , yn ]T . Then it is clear that ∥y∥q = 1. For each i = 1, . . . , n,

    |yi | = ∥x∥q^{−1} |xi | ≤ ∥x∥∞^{−1} |xi | ≤ 1,

hence |yi |^p ≥ |yi |^q holds now. In this case,

    ∥x∥ p ∥x∥q^{−1} = ∥y∥ p = (|y1 |^p + · · · + |yn |^p)^{1/p}
                    ≥ (|y1 |^q + · · · + |yn |^q)^{1/p} = ∥y∥q^{q/p} = 1,

so ∥x∥ p ≥ ∥x∥q holds as well. Together, we certainly have ∥x∥q ≤ ∥x∥ p ≤ n^{1/p − 1/q} ∥x∥q . In particular, the l p -norm and lq -norm are equivalent, as desired. ■
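The two-sided bound (22) can be verified numerically (a Python sketch, not from the notes; the random vector is an arbitrary choice, and the q = ∞ case uses the max norm):

```python
import random

def lp_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1 / p)

random.seed(2)
n = 6
x = [random.uniform(-5, 5) for _ in range(n)]

ok = []
for p, q in [(1.0, 2.0), (2.0, 4.0), (1.5, 3.0)]:
    lo = lp_norm(x, q)
    hi = n ** (1 / p - 1 / q) * lp_norm(x, q)
    ok.append(lo - 1e-12 <= lp_norm(x, p) <= hi + 1e-12)

# q = infinity: ||x||_inf <= ||x||_p <= n^(1/p) ||x||_inf.
sup = max(abs(t) for t in x)
ok.append(sup - 1e-12 <= lp_norm(x, 2.0) <= n ** 0.5 * sup + 1e-12)
```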

Lemma 1.27. Let V be a normed linear space and x1 , . . . , xn ∈ V be linearly independent elements. Then there
exists a positive constant λ ∈ R such that for every c1 , . . . , cn ∈ F,

∥c1 x1 + · · · + cn xn ∥ ≥ λ (|c1 | + · · · + |cn |). (23)

Proof. Let c1 , . . . , cn ∈ F. There is nothing to prove if c1 = · · · = cn = 0, in which case |c1 | + · · · + |cn | = 0, thus we shall assume that at least one of the ci ’s is nonzero. Then for each i = 1, . . . , n, we may put

    di := ci /(|c1 | + · · · + |cn |) ∈ F.

Clearly, at least one of the di ’s is also nonzero, while

    ∑_{i=1}^n |di | = ∑_{i=1}^n |ci |/(|c1 | + · · · + |cn |) = 1.

It thus suffices to study the norm ∥d1 x1 + · · · + dn xn ∥: Consider the map

    f : Fn → R : (d1 , . . . , dn ) 7→ ∥d1 x1 + · · · + dn xn ∥.

Note that V is a topological linear space and the norm ∥ · ∥ on V is continuous (cf. Corollary 1.7), hence f is also continuous. Furthermore, consider the set

    S := {(d1 , . . . , dn ) ∈ Fn | |d1 | + · · · + |dn | = 1}.

We claim that S is closed and bounded with respect to the Euclidean norm ∥ · ∥2 , hence is also compact under ∥ · ∥2 by the Heine-Borel theorem:
• Observe that S is the unit sphere of Fn under its l1 -norm, hence is closed with respect to ∥ · ∥1 . Since ∥ · ∥1 and ∥ · ∥2 are equivalent, the set S is also closed with respect to ∥ · ∥2 .
• Furthermore, for every (d1 , . . . , dn ) ∈ S, it is clear that

(|d1 |2 + · · · + |dn |2 )1/2 ≤ |d1 | + · · · + |dn | = 1.

Consequently, the set S is also bounded with respect to ∥ · ∥2 .


Then by the extreme value theorem, the map f attains a minimum λ on S. In particular, here λ > 0, for otherwise there exists (d1 , . . . , dn ) ∈ S
such that
0 = f (d1 , . . . , dn ) = ∥d1 x1 + · · · + dn xn ∥.
In this case, we have d1 x1 + · · · + dn xn = 0V . Since x1 , . . . , xn are linearly independent, we must have d1 = · · · = dn = 0, a contradiction. The
proof is thus complete. ■

Theorem 1.28. Let V be a finite-dimensional normed linear space with base {e1 , . . . , em }, let e∨1 , . . . , e∨m : V → F
be the coordinate forms, and let (xn )n∈N be a sequence in V .
1. The sequence (xn )n∈N is Cauchy in V if and only if the sequence (e∨i (xn ))n∈N of i-th coordinates is also a
Cauchy sequence in F for all i = 1, . . . , m.

2. For every x ∈ V , we have xn → x in V if and only if e∨i (xn ) → e∨i (x) in F for all i = 1, . . . , m.
In particular, every finite-dimensional normed linear space is a Banach space.
Proof. First, since e1 , . . . , em are linearly independent, by Lemma 1.27, there exists λ > 0 such that for all c1 , . . . , cm ∈ F,

    ∥c1 e1 + · · · + cm em ∥ ≥ λ (|c1 | + · · · + |cm |).

As we can see, for every k, n ∈ N, we have

    ∥xk − xn ∥ = ∥∑_{i=1}^m e∨i (xk )ei − ∑_{i=1}^m e∨i (xn )ei ∥ = ∥∑_{i=1}^m (e∨i (xk ) − e∨i (xn ))ei ∥.

• Suppose that the sequence (xn )n∈N is Cauchy in V . Then for every ε > 0, there exists N ∈ N such that ∥xk − xn ∥ < λ ε for all k, n ≥ N. In this case,

    λ ε > ∥xk − xn ∥ = ∥∑_{i=1}^m (e∨i (xk ) − e∨i (xn ))ei ∥ ≥ λ ∑_{i=1}^m |e∨i (xk ) − e∨i (xn )|.

Then for i = 1, . . . , m and k, n ≥ N,

    |e∨i (xk ) − e∨i (xn )| ≤ ∑_{j=1}^m |e∨j (xk ) − e∨j (xn )| < λ ε/λ = ε.

This shows that each sequence (e∨i (xn ))n∈N is a Cauchy sequence.
• Conversely, suppose that the sequence (e∨i (xn ))n∈N is Cauchy for all i = 1, . . . , m. Then for every ε > 0, there exists Ni ∈ N such that

    |e∨i (xk ) − e∨i (xn )| < ε/(m(∥ei ∥ ∨ 1)),  ∀k, n ≥ Ni .

In this case, when k, n ≥ N1 ∨ · · · ∨ Nm ,

    ∥xk − xn ∥ = ∥∑_{i=1}^m (e∨i (xk ) − e∨i (xn ))ei ∥ ≤ ∑_{i=1}^m |e∨i (xk ) − e∨i (xn )|∥ei ∥ < ∑_{i=1}^m ε/m = ε.

Consequently, the sequence (xn )n∈N is Cauchy in V as well.


The proof of Statement 2 is literally the same as above (simply by dropping the subscript k’s).
Finally, suppose that (xn )n∈N is Cauchy. As noted in Statement 1, the sequence (e∨i (xn ))n∈N is Cauchy for all i = 1, . . . , m. Note that F = R or C is complete, so for each i, there exists ci ∈ F such that e∨i (xn ) → ci as n → ∞. In this case, by Statement 2, we have xn → x := c1 e1 + · · · + cm em in V , namely the sequence (xn )n∈N is convergent. In conclusion, the space V is a Banach space. ■

Remark 1.8. In particular, a sequence (xk = [xk,1 , . . . , xk,n ]T )k∈N in Fn is convergent if and only if the sequence of i-th components (xk,i )k∈N in F is convergent for all i = 1, . . . , n.
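A minimal Python illustration of Remark 1.8 (not from the notes; the particular sequence is an arbitrary choice): componentwise convergence forces convergence in norm:

```python
import math

# A sequence x_k in R^3 whose components converge to (1, 0, 2):
def x(k):
    return [1.0 + 1.0 / k, (-1.0) ** k / k, 2.0 - 1.0 / k ** 2]

limit = [1.0, 0.0, 2.0]

def dist(u, v):
    """Euclidean distance on R^3."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Since convergence in F^n is componentwise, the distance to the
# limit must tend to 0 along the sequence.
dists = [dist(x(k), limit) for k in (10, 100, 1000)]
decreasing = dists[0] > dists[1] > dists[2]
```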
Theorem 1.29 (Direct Sum of Normed Linear Spaces). Let (Vi , ∥ · ∥i )i∈I be a nonempty family of normed linear spaces. Then for every p ∈ [1, ∞), the following map is a norm on the direct sum ⊕i∈I Vi : For every (xi )i∈I ∈ ⊕i∈I Vi ,

    ∥(xi )i∈I ∥ p := (∑i∈I ∥xi ∥i^p)^{1/p} .    (24)

Furthermore, the following map is also a norm on ⊕i∈I Vi : For every (xi )i∈I ∈ ⊕i∈I Vi ,

    ∥(xi )i∈I ∥∞ := sup_{i∈I} ∥xi ∥i .    (25)

Proof. First, let p ∈ [1, ∞) be an arbitrary constant, and let ∥ · ∥ p : ⊕i∈I Vi → R be defined as above. For each (xi )i∈I ∈ ⊕i∈I Vi , its support, namely the set of indices i where xi ̸= 0Vi , is a finite set, so we merely have a finite sum in the definition of ∥ · ∥ p . Next, we verify that ∥ · ∥ p is a norm on ⊕i∈I Vi :
• For every (xi )i∈I ∈ ⊕i∈I Vi and c ∈ F, since c(xi )i∈I = (cxi )i∈I ,

    ∥c(xi )i∈I ∥ p = (∑i∈I ∥cxi ∥i^p)^{1/p} = (∑i∈I (|c|∥xi ∥i )^p)^{1/p} = (|c|^p ∑i∈I ∥xi ∥i^p)^{1/p} = |c|(∑i∈I ∥xi ∥i^p)^{1/p} = |c|∥(xi )i∈I ∥ p .

• For every (xi )i∈I , (yi )i∈I ∈ ⊕i∈I Vi ,

    ∥(xi )i∈I + (yi )i∈I ∥ p = ∥(xi + yi )i∈I ∥ p = (∑i∈I ∥xi + yi ∥i^p)^{1/p} ≤ (∑i∈I (∥xi ∥i + ∥yi ∥i )^p)^{1/p}
                            ≤ (∑i∈I ∥xi ∥i^p)^{1/p} + (∑i∈I ∥yi ∥i^p)^{1/p} = ∥(xi )i∈I ∥ p + ∥(yi )i∈I ∥ p .

Here the second ≤ is obtained by applying Minkowski’s inequality to the union of the supports of (xi )i∈I and (yi )i∈I .
• Let (xi )i∈I ∈ ⊕i∈I Vi . Then

    ∥(xi )i∈I ∥ p = 0 ⇐⇒ ∑i∈I ∥xi ∥i^p = 0 ⇐⇒ ∀i ∈ I : ∥xi ∥i = 0 (because ∥xi ∥i ≥ 0 for all i ∈ I)
                  ⇐⇒ ∀i ∈ I : xi = 0Vi ⇐⇒ (xi )i∈I = 0.

Similarly, for every (xi )i∈I ∈ ⊕i∈I Vi , the number ∥(xi )i∈I ∥∞ is finite because we are indeed taking the maximum among the norms of its finitely many nonzero components. Again, we show that ∥ · ∥∞ is a norm as follows:
• For every (xi )i∈I ∈ ⊕i∈I Vi and c ∈ F, since c(xi )i∈I = (cxi )i∈I ,

    ∥c(xi )i∈I ∥∞ = sup_{i∈I} ∥cxi ∥i = sup_{i∈I} |c|∥xi ∥i = |c| sup_{i∈I} ∥xi ∥i = |c|∥(xi )i∈I ∥∞ .

• For every (xi )i∈I , (yi )i∈I ∈ ⊕i∈I Vi ,

    ∥(xi )i∈I + (yi )i∈I ∥∞ = ∥(xi + yi )i∈I ∥∞ = sup_{i∈I} ∥xi + yi ∥i ≤ sup_{i∈I} (∥xi ∥i + ∥yi ∥i ) ≤ sup_{i∈I} ∥xi ∥i + sup_{i∈I} ∥yi ∥i = ∥(xi )i∈I ∥∞ + ∥(yi )i∈I ∥∞ .

• Let (xi )i∈I ∈ ⊕i∈I Vi . Then

    ∥(xi )i∈I ∥∞ = 0 ⇐⇒ sup_{i∈I} ∥xi ∥i = 0 ⇐⇒ ∀i ∈ I : ∥xi ∥i = 0 ⇐⇒ ∀i ∈ I : xi = 0Vi ⇐⇒ (xi )i∈I = 0. ■

Remark 1.9. Consequently, by putting Vi := F for all i ∈ I, one can see that the direct sum F⊕I always has ℓ p -norms defined as follows: For every (xi )i∈I ∈ F⊕I ,

    ∥(xi )i∈I ∥ p := (∑i∈I |xi |^p)^{1/p} ,  p ∈ [1, ∞),  and  ∥(xi )i∈I ∥∞ := sup_{i∈I} |xi |.    (26)

Furthermore, given any linear space V with base (zi )i∈I , applying Corollary 1.14 via the isomorphism F⊕I → V : ei 7→ zi , the space V also necessarily has l p -norms, in which for every x ∈ V ,

    ∥x∥ p := (∑i∈I |z∨i (x)|^p)^{1/p} ,  p ∈ [1, ∞),  and  ∥x∥∞ := sup_{i∈I} |z∨i (x)|.    (27)

Again, one may note that the definition above is not canonical.

Corollary 1.30 (Direct Sum of Banach Spaces). Let (V1 , ∥ · ∥1 ), . . . , (Vk , ∥ · ∥k ) be a finite family of Banach spaces. Then (V1 ⊕ · · · ⊕Vk , ∥ · ∥ p ) is a Banach space for all p ∈ [1, ∞].
Proof. Let p ∈ [1, ∞] be arbitrary, and let ((xn,1 , . . . , xn,k ))n∈N be a Cauchy sequence in V1 ⊕ · · · ⊕Vk under ∥ · ∥ p . Then for every ε > 0, there
exists N ∈ N such that whenever m, n ≥ N,

ε > ∥(xm,1 , . . . , xm,k ) − (xn,1 , . . . , xn,k )∥ p = ∥(xm,1 − xn,1 , . . . , xm,k − xn,k )∥ p .

Fix an arbitrary i ∈ {1, . . . , k}. Then for every m, n ≥ N,

∥xm,i − xn,i ∥i ≤ ∥(xm,1 − xn,1 , . . . , xm,k − xn,k )∥ p < ε,

so the sequence (xn,i )n∈N is also Cauchy in Vi . Since Vi is a Banach space under ∥ · ∥i , there exists xi ∈ Vi such that xn,i → xi in it.
• Suppose that p < ∞. Then for each i, there exists Ni ∈ N such that ∥xn,i − xi ∥i < ε/k^{1/p} for all n ≥ Ni . Then for n ≥ N1 ∨ · · · ∨ Nk ,

    ∥(xn,1 , . . . , xn,k ) − (x1 , . . . , xk )∥ p = ∥(xn,1 − x1 , . . . , xn,k − xk )∥ p
        = (∥xn,1 − x1 ∥1^p + · · · + ∥xn,k − xk ∥k^p)^{1/p} < (k · (ε/k^{1/p})^p)^{1/p} = ε.

• When p = ∞, for each i, there exists Ni ∈ N such that ∥xn,i − xi ∥i < ε for all n ≥ Ni . Then for n ≥ N1 ∨ · · · ∨ Nk ,

    ∥(xn,1 , . . . , xn,k ) − (x1 , . . . , xk )∥∞ = ∥(xn,1 − x1 , . . . , xn,k − xk )∥∞ = ∥xn,1 − x1 ∥1 ∨ · · · ∨ ∥xn,k − xk ∥k < ε.

Therefore, we have (xn,1 , . . . , xn,k ) → (x1 , . . . , xk ) in V1 ⊕ · · · ⊕Vk under ∥ · ∥ p , hence we may conclude that (V1 ⊕ · · · ⊕Vk , ∥ · ∥ p ) is a Banach
space as well. ■

Theorem 1.31 (Finite Dimension and Equivalent Norms). Let V be a normed linear space. Then V is finite-
dimensional if and only if every two norms on V are equivalent.
Proof. First, suppose that V is finite-dimensional. By Corollary 1.14, it suffices to consider the case when V = Fn . Note that being equivalent
is an equivalence relation on the set of norms on V (cf. Corollary 1.21), so we may simply show that every norm on Fn is equivalent to the
l1 -norm or the Euclidean norm: Let ∥ · ∥ be an arbitrary norm on V and {e1 , . . . , en } be the standard base of Fn .
• For each x = [x1 , . . . , xn ]T ∈ Fn , by the triangle inequality, we have

    ∥x∥ = ∥x1 e1 + · · · + xn en ∥ ≤ |x1 |∥e1 ∥ + · · · + |xn |∥en ∥
        ≤ (∥e1 ∥ ∨ · · · ∨ ∥en ∥)(|x1 | + · · · + |xn |) = (∥e1 ∥ ∨ · · · ∨ ∥en ∥)∥x∥1 .

On the other hand, by Lemma 1.27, there also exists λ > 0 such that

∥x∥ = ∥x1 e1 + · · · + xn en ∥ ≥ λ (|x1 | + · · · + |xn |) = λ ∥x∥1 .

Consequently, the norm ∥ · ∥ is equivalent to the l1 -norm ∥ · ∥1 , as desired.


• For each x = [x1 , . . . , xn ]T ∈ Fn , by the triangle inequality and the Cauchy-Schwarz inequality for real numbers, we have

    ∥x∥ = ∥x1 e1 + · · · + xn en ∥ ≤ |x1 |∥e1 ∥ + · · · + |xn |∥en ∥
        ≤ (|x1 |^2 + · · · + |xn |^2)^{1/2} (∥e1 ∥^2 + · · · + ∥en ∥^2)^{1/2}
        = (∥e1 ∥^2 + · · · + ∥en ∥^2)^{1/2} ∥x∥2 .

On the other hand, consider the unit sphere S := {x ∈ Fn | ∥x∥2 = 1}, which is compact under the Euclidean norm. It follows from the extreme value theorem that ∥ · ∥ has a minimum m on S. Since 0 ∈/ S and ∥ · ∥ is a norm, we must have m > 0. Consequently, for every nonzero x ∈ Fn , since x/∥x∥2 ∈ S, we have

    0 < m ≤ ∥x/∥x∥2 ∥ = ∥x∥/∥x∥2 .

That is, ∥x∥ ≥ m∥x∥2 . The equivalence between ∥ · ∥ and ∥ · ∥2 is now established.
On the other hand, suppose that V is infinite-dimensional. Let (ei )i∈I be a base of V with coordinate forms e∨i : V → F. Here we show that the l1 - and l∞ -norms on V are not equivalent (cf. Remark 1.9): For every x ∈ V ,

    ∥x∥1 = ∑i∈I |e∨i (x)|  and  ∥x∥∞ = sup_{i∈I} |e∨i (x)|.

It is clear that ∥ · ∥∞ ≤ ∥ · ∥1 . Assume, to the contrary, that ∥ · ∥1 ≤ β ∥ · ∥∞ for some constant β > 0. Then for every positive integer k and distinct i1 , . . . , ik ∈ I,

    k = ∥ei1 + · · · + eik ∥1 ≤ β ∥ei1 + · · · + eik ∥∞ = β · 1 = β .

In other words, β ≥ k holds for all positive integers k, a contradiction. ■

Theorem 1.32 (Finite-Dimensional Subspaces). Let V be a normed linear space with subspace W . If W is finite-dimensional, then it is closed in V . Furthermore, for every x ∈ V , there exists y ∈ W closest to x, namely

    ∥x − y∥ = d(x,W ) = inf_{z∈W} ∥x − z∥.    (28)

In addition, the set of all closest points in W to x is convex.


Proof. 1. Suppose that W is finite-dimensional with base {e1 , . . . , ek }, and let x be an element of the closure of W in V . Since V is a metric space, there exists a sequence (xn )n∈N in W such that xn → x as n → ∞.
Assume, to the contrary, that x ∈/ W . Then {e1 , . . . , ek , x} is also linearly independent, hence W ′ := Span(e1 , . . . , ek , x) is also finite-dimensional. By the preceding theorem, the spaces W ′ and W are both Banach spaces, hence by Theorem 1.10, W is closed in W ′ . Since (xn )n∈N lies in W and converges to x in W ′ , we must have x ∈ W , a contradiction. Therefore the closure of W coincides with W , namely W is closed in V .
2. Let x ∈ V be arbitrary. Clearly, if x ∈ W , then we may simply put y := x whence ∥x − y∥ = 0 = d(x,W ) holds for sure. Next, suppose
that x ∈
/ W . Since W is closed in V , we must have d(x,W ) > 0 as x ∈/ W . Next, let (yn )n∈N be a sequence in W such that ∥x − yn ∥ → d(x,W )
as n → ∞. Then there exists M > 0 such that ∥x − yn ∥ < M for all n ∈ N, so for each n,

∥yn ∥ = ∥x + (yn − x)∥ ≤ ∥x∥ + ∥yn − x∥ ≤ ∥x∥ + M.

This shows that (yn )n∈N is also a bounded sequence in W . Since W is finite-dimensional, by the generalized Bolzano-Weierstrass theorem,
we can find a convergent subsequence (ynk )k∈N such that ynk → y ∈ W as k → ∞. By the continuity of norms,

∥x − y∥ = lim ∥x − ynk ∥ = d(x,W ),


k→∞

where the last equality holds because every subsequence of a convergent sequence in R converges to the same limit.
Finally, let y1 , y2 ∈ W be elements closest to x and λ ∈ [0, 1] be arbitrary. Then for yλ := λ y1 + (1 − λ )y2 ∈ W ,

d(x,W ) ≤ ∥x − yλ ∥ = ∥x − (λ y1 + (1 − λ )y2 )∥ = ∥λ (x − y1 ) + (1 − λ )(x − y2 )∥


≤ λ ∥x − y1 ∥ + (1 − λ )∥x − y2 ∥
= λ d(x,W ) + (1 − λ )d(x,W ) = d(x,W ).

Therefore, we also have ∥x − yλ ∥ = d(x,W ), implying that yλ ∈ W is closest to x as well. In conclusion, the set of all closest points in W to
x is convex. ■
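For the Euclidean norm, the closest point promised by Theorem 1.32 can be computed by solving the normal equations; the following Python sketch (an illustrative example, not from the notes; the subspace and target vector are arbitrary choices) finds the closest point of a 2-dimensional subspace W ⊂ R^4 to a given x:

```python
# Vectors w1, w2 span a 2-dimensional subspace W of R^4.
w1 = [1.0, 1.0, 0.0, 2.0]
w2 = [0.0, 1.0, 1.0, -1.0]
x = [1.0, 2.0, 3.0, 4.0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Solve the 2x2 normal equations G c = b, so that y = c1*w1 + c2*w2
# is the l2-closest point of W to x.
g11, g12, g22 = dot(w1, w1), dot(w1, w2), dot(w2, w2)
b1, b2 = dot(x, w1), dot(x, w2)
det = g11 * g22 - g12 * g12
c1 = (b1 * g22 - b2 * g12) / det
c2 = (g11 * b2 - g12 * b1) / det
y = [c1 * a + c2 * b for a, b in zip(w1, w2)]

# The residual x - y is orthogonal to W (special to inner-product norms;
# for a general norm only existence of a closest point is guaranteed).
r = [a - b for a, b in zip(x, y)]
orth = abs(dot(r, w1)) < 1e-9 and abs(dot(r, w2)) < 1e-9
dist = dot(r, r) ** 0.5

# Any other point of W, e.g. w1 + w2, is at least as far from x.
alt = [a - (p + q) for a, p, q in zip(x, w1, w2)]
alt_dist = dot(alt, alt) ** 0.5
```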

Lemma 1.33 (Riesz). Let V be a normed linear space with closed subspace W . If W ̸= V , then for every
δ ∈ (0, 1), there is a unit element x ∈ V , i.e., ∥x∥ = 1, such that d(x,W ) ≥ δ .
Proof. Suppose that W ̸= V . Then the space V and the quotient space V /W are both nonzero. Let δ ∈ (0, 1) and y ∈ V \ W be arbitrary.
Since W is closed in V , we must have
0 < d(y,W ) = inf d(y, z) = inf ∥y − z∥.
z∈W z∈W
Now observe that d(y,W ) < d(y,W )/δ , so there exists zδ ∈ W such that d(y,W ) ≤ ∥y − zδ ∥ ≤ d(y,W )/δ . In this case, we may simply put

x := ∥y − zδ ∥−1 (y − zδ ).

It is clear that ∥x∥ = 1, while for every z ∈ W ,

    ∥x − z∥ = ∥y − zδ ∥^{−1} ∥y − zδ − ∥y − zδ ∥z∥ ≥ ∥y − zδ ∥^{−1} d(y,W ) ≥ δ ,

because zδ + ∥y − zδ ∥z ∈ W and ∥y − zδ ∥ ≤ d(y,W )/δ . ■

Theorem 1.34. Let V be a normed linear space. Then V is finite-dimensional if and only if the unit closed ball
B := {x ∈ V | ∥x∥ ≤ 1} is compact in V .
Proof. First, suppose that V is finite-dimensional. Again, by Corollary 1.14 and Theorem 1.31, we may assume that V = Fn endowed with
the Euclidean norm ∥ · ∥2 . Let (xn )n∈N be a sequence of elements in B. Such sequence is certainly bounded as B is, so by the Bolzano-
Weierstrass theorem, it contains a convergent subsequence (xnr )r∈N with limit x ∈ V . By Theorem 1.9, we see that ∥xnr ∥ → ∥x∥ as r → ∞.
Since ∥xnr ∥ ≤ 1 for all r, it follows that ∥x∥ ≤ 1 as well. This shows that x ∈ B, so the unit closed ball B is compact in V .
Conversely, suppose that the closed unit ball B is compact in V . Since the open balls (B(x, 1/2))x∈B cover B, there exist finitely many x1 , . . . , xn ∈ B such that

    B ⊆ B(x1 , 1/2) ∪ · · · ∪ B(xn , 1/2).

Denote U := Span(x1 , . . . , xn ), which is closed in V by Theorem 1.32. If V is infinite-dimensional, then U is a proper subspace of V . By Riesz’s lemma (applied with any δ ∈ (1/2, 1)), there exists a unit element x∗ ∈ B such that d(x∗ ,U) ≥ δ > 1/2. In particular, for each k = 1, . . . , n, we have ∥x∗ − xk ∥ ≥ d(x∗ ,U) > 1/2, contrary to the choice of the xk ’s. ■

1.5 Additional Examples of Norms and Semi-Norms


Example 1.2 (ℓ p -Spaces). For p ∈ [1, ∞), the space ℓ p consists of all sequences x = (xn )n∈N in F with ∑n∈N |xn |^p < ∞, normed by ∥x∥ p := (∑n∈N |xn |^p)^{1/p} , while ℓ∞ consists of all bounded sequences, normed by ∥x∥∞ := sup_{n∈N} |xn |.

Example 1.3 (L p -Spaces). Let (X, A, µ) be a measure space. For every measurable function f : X → F, we define

    ∥ f ∥ p := (∫_X | f |^p dµ)^{1/p} .    (29)

The collection of all those f ’s with ∥ f ∥ p < ∞ is called the Lebesgue space, denoted by L p (X, A, µ) or simply L p (µ). Here ∥ · ∥ p is a semi-norm on L p (µ). Furthermore,

    ∥ f ∥∞ := inf{M ∈ [0, ∞] | µ{| f | > M} = 0}.

• ℓ p -spaces are special cases on (N, P(N), δ ), where δ is the counting measure on (N, P(N)).

• The l p -norms on Fn are special cases on ({1, . . . , n}, P({1, . . . , n}), δ ), where δ is the counting measure
on ({1, . . . , n}, P({1, . . . , n})).
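As a quick numerical illustration (a Python sketch assuming Lebesgue measure on [0, 1], approximated by a midpoint Riemann sum), for f (t) = t one has ∥ f ∥2 = (∫_0^1 t^2 dt)^{1/2} = 3^{−1/2}:

```python
def lp_seminorm(f, p, a=0.0, b=1.0, n=20000):
    """Midpoint-rule approximation of (integral_a^b |f(t)|^p dt)^(1/p)."""
    h = (b - a) / n
    total = sum(abs(f(a + (k + 0.5) * h)) ** p for k in range(n)) * h
    return total ** (1 / p)

# For f(t) = t on [0, 1], the exact value is (1/(p+1))^(1/p).
approx = lp_seminorm(lambda t: t, p=2.0)
exact = (1.0 / 3.0) ** 0.5
```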

2 Inner Product Spaces
2.1 Basics for Inner Product Spaces
Definition 2.1. Let V be a linear space. A semi-inner product on V is a map ⟨·, ·⟩ : V ×V → F that satisfies the following conditions:

1. (Positive Semi-Definiteness). ⟨x, x⟩ is a non-negative real number for all x ∈ V .

2. (Additivity). ⟨x + x′ , y⟩ = ⟨x, y⟩ + ⟨x′ , y⟩ for every x, x′ , y ∈ V .

3. (Homogeneity). ⟨cx, y⟩ = c⟨x, y⟩ for all x, y ∈ V and c ∈ F.

4. (Hermitian Property). ⟨y, x⟩ = \overline{⟨x, y⟩} for every x, y ∈ V , where the bar denotes complex conjugation.

Under ⟨·, ·⟩, the linear space V is called a semi-inner product space. Finally, if ⟨·, ·⟩ satisfies the following
additional condition:
∀x ∈ V : ⟨x, x⟩ = 0 ⇐⇒ x = 0V , (30)

then the map ⟨·, ·⟩ is called an inner product, under which V is called an inner product space.

Remark 2.1. When F = R, the Hermitian property reduces to symmetry, namely ⟨y, x⟩ = ⟨x, y⟩ for all x, y ∈ V .

Theorem 2.1. Let V be a semi-inner product space and x ∈ V . For every c ∈ F and y, y′ ∈ V ,

    ⟨x, cy⟩ = c̄⟨x, y⟩ = ⟨c̄x, y⟩  and  ⟨x, y + y′ ⟩ = ⟨x, y⟩ + ⟨x, y′ ⟩.    (31)

Furthermore,

    ⟨x, 0V ⟩ = ⟨0V , x⟩ = 0.    (32)

Proof. Let c ∈ F and y, y′ ∈ V be arbitrary. Then

    ⟨x, cy⟩ = \overline{⟨cy, x⟩} = \overline{c⟨y, x⟩} = c̄ \overline{⟨y, x⟩} = c̄⟨x, y⟩ = ⟨c̄x, y⟩

and

    ⟨x, y + y′ ⟩ = \overline{⟨y + y′ , x⟩} = \overline{⟨y, x⟩ + ⟨y′ , x⟩} = \overline{⟨y, x⟩} + \overline{⟨y′ , x⟩} = ⟨x, y⟩ + ⟨x, y′ ⟩.

Consequently,

    ⟨0V , x⟩ = ⟨0V + 0V , x⟩ = ⟨0V , x⟩ + ⟨0V , x⟩,

so we must have ⟨0V , x⟩ = 0 and hence ⟨x, 0V ⟩ = \overline{⟨0V , x⟩} = \overline{0} = 0 as well. ■

Remark 2.2. Together with the above theorem, we may now demonstrate why a semi-inner product is required to be Hermitian rather than symmetric: Under the Hermitian condition, for every x ∈ V and c ∈ F,

    ⟨cx, cx⟩ = cc̄⟨x, x⟩ = |c|^2 ⟨x, x⟩ ≥ 0.

Clearly, if we do not require the map to be Hermitian but merely symmetric, we will be left with c^2 ⟨x, x⟩, which is not necessarily a real number!

Proposition 2.2 (Integral Form of Complex Inner Products). Let V be a complex inner product space. Then for every x, y ∈ V ,

    ⟨x, y⟩ = (1/2π) ∫_{−π}^{π} e^{iθ} ∥x + e^{iθ} y∥^2 dθ .    (33)
Proof. Let x, y ∈ V be arbitrary. Then

    ∥x + e^{iθ} y∥^2 = ∥x∥^2 + ∥y∥^2 + e^{iθ} ⟨y, x⟩ + e^{−iθ} ⟨x, y⟩,

whence

    e^{iθ} ∥x + e^{iθ} y∥^2 = e^{iθ} (∥x∥^2 + ∥y∥^2) + e^{2iθ} ⟨y, x⟩ + ⟨x, y⟩.

Note that

    ∫_{−π}^{π} e^{iθ} dθ = [e^{iθ}/i]_{−π}^{π} = 0  and  ∫_{−π}^{π} e^{2iθ} dθ = [e^{2iθ}/(2i)]_{−π}^{π} = 0,

so

    (1/2π) ∫_{−π}^{π} e^{iθ} ∥x + e^{iθ} y∥^2 dθ = (1/2π) ∫_{−π}^{π} ⟨x, y⟩ dθ = (1/2π) · 2π · ⟨x, y⟩ = ⟨x, y⟩. ■
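Formula (33) can be verified numerically on C^n (a Python sketch, not from the notes; the vectors are arbitrary random samples). An equally spaced Riemann sum integrates the trigonometric terms e^{iθ} and e^{2iθ} to zero exactly, up to rounding:

```python
import cmath
import random

random.seed(0)
def randc():
    return complex(random.gauss(0, 1), random.gauss(0, 1))

n = 4
x = [randc() for _ in range(n)]
y = [randc() for _ in range(n)]

def inner(u, v):
    """Standard inner product on C^n, linear in the first slot."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_sq(u):
    return sum(abs(a) ** 2 for a in u)

# Approximate (1/2pi) * integral of e^{i t} ||x + e^{i t} y||^2 dt
# by averaging over equally spaced angles.
m = 512
total = 0.0
for k in range(m):
    t = 2 * cmath.pi * k / m
    e = cmath.exp(1j * t)
    total += e * norm_sq([a + e * b for a, b in zip(x, y)])
approx = total / m
err = abs(approx - inner(x, y))
```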

Theorem 2.3 (Real Part of Semi-Inner Products). Let V be a complex semi-inner product space. Then by
regarding V as a real linear space, the real part of the semi-inner product

ℜ⟨·, ·⟩ : V ×V → R : (x, y) 7→ ℜ⟨x, y⟩ (34)

is also a semi-inner product on V , which is an inner product on V if and only if ⟨·, ·⟩ is. Furthermore, for every
x, y ∈ V ,
ℜ⟨x, y⟩ = ℜ⟨y, x⟩ and ℜ⟨ix, y⟩ = −ℜ⟨x, iy⟩, (35)

and hence
⟨x, y⟩ = ℜ⟨x, y⟩ + iℜ⟨x, iy⟩ = ℜ⟨x, y⟩ − iℜ⟨ix, y⟩. (36)
Proof. By restricting the scalars, it is clear that V is also a real linear space. First, let x, y ∈ V be arbitrary. Denote by ⟨x, y⟩ = a + bi for
some a, b ∈ R. Then we may observe that

⟨ix, y⟩ = i⟨x, y⟩ = i(a + bi) = −b + ai and ⟨x, iy⟩ = ⟨−ix, y⟩ = −⟨ix, y⟩ = b − ai.

Consequently, we have ℜ⟨ix, y⟩ = −b = −ℜ⟨x, iy⟩ whence

⟨x, y⟩ = a + bi = ℜ⟨x, y⟩ + iℜ⟨x, iy⟩ = ℜ⟨x, y⟩ − iℜ⟨ix, y⟩.

Next, we show that ℜ⟨·, ·⟩ is a semi-inner product on V :


• For every x ∈ V , we have ℜ⟨x, x⟩ = ⟨x, x⟩ ≥ 0.
• For every x, x′ , y ∈ V ,
ℜ⟨x + x′ , y⟩ = ℜ(⟨x, y⟩ + ⟨x′ , y⟩) = ℜ⟨x, y⟩ + ℜ⟨x′ , y⟩.

• For every x, y ∈ V and c ∈ R,


ℜ⟨cx, y⟩ = ℜ(c⟨x, y⟩) = cℜ⟨x, y⟩.

• For every x, y ∈ V ,
    ℜ⟨y, x⟩ = ℜ(\overline{⟨x, y⟩}) = ℜ⟨x, y⟩.

Furthermore, for every x ∈ V , since ℜ⟨x, x⟩ = ⟨x, x⟩, we see that

(ℜ⟨x, x⟩ = 0 ⇐⇒ x = 0V ) ⇐⇒ (⟨x, x⟩ = 0 ⇐⇒ x = 0V ).

Therefore, ℜ⟨·, ·⟩ is an inner product on V if and only if ⟨·, ·⟩ is. ■
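The decomposition (36), recovering a complex semi-inner product from its real part, can be checked for the standard inner product on C^n (a Python sketch, not from the notes; the vectors are arbitrary random samples):

```python
import random

random.seed(1)
def randc():
    return complex(random.gauss(0, 1), random.gauss(0, 1))

x = [randc() for _ in range(5)]
y = [randc() for _ in range(5)]

def inner(u, v):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

lhs = inner(x, y)
# Rebuild the full complex value from real parts, as in (36):
rebuilt = complex(inner(x, y).real, inner(x, [1j * t for t in y]).real)
err = abs(lhs - rebuilt)
```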

Theorem 2.4 (Complex Semi-Inner Products From Reals). Let V be a complex linear space. If ⟨·, ·⟩ : V ×V → R
is a semi-inner product on V as a real linear space such that ⟨ix, y⟩ = −⟨x, iy⟩ for all x, y ∈ V , then the following
map
⟨·, ·⟩′ : V ×V → C : (x, y) 7→ ⟨x, y⟩ + i⟨x, iy⟩ (37)

is also a semi-inner product on V as a complex linear space such that ⟨x, x⟩′ = ⟨x, x⟩ for all x ∈ V . Furthermore,
the map ⟨·, ·⟩′ is an inner product if and only if ⟨·, ·⟩ is.
Proof. Suppose that ⟨·, ·⟩ : V × V → R is a semi-inner product on V as a real linear space, and let ⟨·, ·⟩′ : V × V → C be as defined above.
We then verify that ⟨·, ·⟩′ is a semi-inner product on V as a complex linear space:
• For every x ∈ V , since ⟨ix, x⟩ = −⟨x, ix⟩ = −⟨ix, x⟩, it follows that ⟨x, ix⟩ = ⟨ix, x⟩ = 0. Consequently,

⟨x, x⟩′ = ⟨x, x⟩ + i⟨x, ix⟩ = ⟨x, x⟩ ≥ 0.

• For every x, x′ , y ∈ V ,

⟨x + x′ , y⟩′ = ⟨x + x′ , y⟩ + i⟨x + x′ , iy⟩ = (⟨x, y⟩ + ⟨x′ , y⟩) + i(⟨x, iy⟩ + ⟨x′ , iy⟩)
= (⟨x, y⟩ + i⟨x, iy⟩) + (⟨x′ , y⟩ + i⟨x′ , iy⟩)
= ⟨x, y⟩′ + ⟨x′ , y⟩′ .

• Let x, y ∈ V be arbitrary. Observe that

    i⟨x, y⟩′ = i(⟨x, y⟩ + i⟨x, iy⟩) = −⟨x, iy⟩ + i⟨x, y⟩
             = ⟨ix, y⟩ + i⟨x, y⟩ = ⟨ix, y⟩ + i⟨ix, iy⟩ = ⟨ix, y⟩′ ,

where the fourth equality holds because

    ⟨ix, iy⟩ = −⟨x, i(iy)⟩ = −⟨x, −y⟩ = −(−⟨x, y⟩) = ⟨x, y⟩.
Meanwhile, for every a ∈ R,

⟨ax, y⟩′ = ⟨ax, y⟩ + i⟨ax, iy⟩ = a⟨x, y⟩ + i(a⟨x, iy⟩)


= a(⟨x, y⟩ + i⟨x, iy⟩) = a⟨x, y⟩′ .

Therefore, given any c ∈ C with c = a + bi for a, b ∈ R,

⟨cx, y⟩′ = ⟨(a + bi)x, y⟩′ = ⟨ax + bix, y⟩′


= a⟨x, y⟩′ + b⟨ix, y⟩′ = a⟨x, y⟩′ + bi⟨x, y⟩′
= (a + bi)⟨x, y⟩′ = c⟨x, y⟩′ .

• For every x, y ∈ V ,

    ⟨y, x⟩′ = ⟨y, x⟩ + i⟨y, ix⟩ = ⟨x, y⟩ + i⟨ix, y⟩
            = ⟨x, y⟩ − i⟨x, iy⟩ = \overline{⟨x, y⟩′} .

Finally, since ⟨x, x⟩′ = ⟨x, x⟩ for all x ∈ V , it is clear that ⟨·, ·⟩′ is an inner product if and only if ⟨·, ·⟩ is. ■

Remark 2.3. By the preceding two theorems, we can see that a complex semi-inner product is uniquely deter-
mined by its real part.

Definition 2.2. Let V be a complex linear space. A sesquilinear form on V is a map ϕ : V ×V → C such that

1. ϕ(cx1 + x2 , y) = cϕ(x1 , y) + ϕ(x2 , y) for all x1 , x2 , y ∈ V and c ∈ F; and

2. ϕ(x, cy1 + y2 ) = c̄ϕ(x, y1 ) + ϕ(x, y2 ) for all x, y1 , y2 ∈ V and c ∈ C.

That is, ϕ is linear on the first component but conjugate linear on the second. Furthermore, the map

Φ : V → C : x 7→ ϕ(x, x) (38)

is called the quadratic form associated with ϕ.

Remark 2.4. Clearly, for every c ∈ C and x ∈ V ,

Φ(cx) = ϕ(cx, cx) = ccϕ(x, x) = |c|2 Φ(x). (39)

Example 2.1. Let V be a complex linear space and f , g : V → C be linear functionals on V . Then the following map is sesquilinear:

    ϕ : V ×V → C : (x, y) 7→ f (x)\overline{g(y)}.    (40)

• For every x1 , x2 , y ∈ V and c ∈ C,

    ϕ(cx1 + x2 , y) = f (cx1 + x2 )\overline{g(y)} = (c f (x1 ) + f (x2 ))\overline{g(y)}
                    = c( f (x1 )\overline{g(y)}) + f (x2 )\overline{g(y)} = cϕ(x1 , y) + ϕ(x2 , y).

• For every x, y1 , y2 ∈ V and c ∈ C,

    ϕ(x, cy1 + y2 ) = f (x)\overline{g(cy1 + y2 )} = f (x)\overline{cg(y1 ) + g(y2 )}
                    = f (x)(c̄ \overline{g(y1 )} + \overline{g(y2 )})
                    = c̄( f (x)\overline{g(y1 )}) + f (x)\overline{g(y2 )} = c̄ϕ(x, y1 ) + ϕ(x, y2 ).

Example 2.2. Let V be a complex inner product space and A, B ∈ L (V ) be linear operators on V . Then the
following map is also sesquilinear:

ϕ : V ×V → C : (x, y) 7→ ⟨Ax, By⟩. (41)

• For every x1 , x2 , y ∈ V and c ∈ F,

ϕ(cx1 + x2 , y) = ⟨A(cx1 + x2 ), By⟩ = ⟨c(Ax1 ) + Ax2 , By⟩


= c⟨Ax1 , By⟩ + ⟨Ax2 , By⟩ = cϕ(x1 , y) + ϕ(x2 , y).

• For every x, y1 , y2 ∈ V and c ∈ C,

    ϕ(x, cy1 + y2 ) = ⟨Ax, B(cy1 + y2 )⟩ = ⟨Ax, c(By1 ) + By2 ⟩
                    = c̄⟨Ax, By1 ⟩ + ⟨Ax, By2 ⟩ = c̄ϕ(x, y1 ) + c̄... 

In particular, by considering the identity maps, given any A ∈ L (V ), we also have the following sesquilinear
forms (x, y) 7→ ⟨Ax, y⟩ and (x, y) 7→ ⟨x, Ay⟩.

Lemma 2.5. Let V be a semi-inner product space. For every x, y ∈ V ,

⟨x, x⟩ + ⟨y, y⟩
|⟨x, y⟩| ≤ . (42)
2

Proof. Clearly, there is nothing to prove if ⟨x, y⟩ = 0, so we shall assume that ⟨x, y⟩ ̸= 0. Let t ∈ C be such that |t| = 1. Then

0 ≤ ⟨tx − y, tx − y⟩ = |t|²⟨x, x⟩ − t⟨x, y⟩ − t̄⟨y, x⟩ + ⟨y, y⟩
= ⟨x, x⟩ + ⟨y, y⟩ − 2ℜ(t⟨x, y⟩),

so
ℜ(t⟨x, y⟩) ≤ (⟨x, x⟩ + ⟨y, y⟩)/2.

In particular, for t0 := |⟨x, y⟩|⁻¹⟨y, x⟩, we can see that |t0 | = |⟨x, y⟩|⁻¹|⟨y, x⟩| = 1 and

ℜ(t0 ⟨x, y⟩) = ℜ(|⟨x, y⟩|⁻¹⟨y, x⟩⟨x, y⟩) = ℜ(|⟨x, y⟩|⁻¹|⟨x, y⟩|²) = ℜ(|⟨x, y⟩|) = |⟨x, y⟩|.

Therefore, applying the previous inequality to t0 yields precisely the desired inequality. ■

Theorem 2.6 (Cauchy-Bunyakovski-Schwarz Inequality). Let V be a semi-inner product space. Then for every
x, y ∈ V ,
|⟨x, y⟩|2 ≤ ⟨x, x⟩⟨y, y⟩. (43)

When ⟨·, ·⟩ is an inner product, the equality is attained if and only if x, y are linearly dependent.
Proof. Let x, y ∈ V be arbitrary. Since ⟨x, x⟩ and ⟨y, y⟩ are both non-negative real numbers, there is nothing to prove if ⟨x, y⟩ = 0.
As a result, we shall assume that ⟨x, y⟩ ̸= 0 throughout. For θ ∈ R, consider the following real polynomial:

p(t) := ⟨tx − e^{iθ}y, tx − e^{iθ}y⟩ = t²⟨x, x⟩ − 2tℜ(e^{−iθ}⟨x, y⟩) + ⟨y, y⟩ ≥ 0.

We may select θ ∈ R such that ℜ(e^{−iθ}⟨x, y⟩) = |⟨x, y⟩| (e.g., θ = arg(⟨x, y⟩), and certainly θ = 0 when F = R),
hence p becomes
p(t) = t²⟨x, x⟩ − 2t|⟨x, y⟩| + ⟨y, y⟩.
Here we claim that ⟨x, x⟩ > 0: Suppose otherwise, say ⟨x, x⟩ = 0. Then the polynomial p reduces to p(t) = −2t|⟨x, y⟩| + ⟨y, y⟩. Since ⟨x, y⟩ ̸= 0, we see that |⟨x, y⟩| > 0. Thus p defines a decreasing linear function of t, so for sufficiently large t, we shall have
p(t) < 0, contradicting the non-negativity of p.
Now ⟨x, x⟩ > 0, so p instead defines a quadratic function of t with global minimum at

t0 := −(−2|⟨x, y⟩|)/(2⟨x, x⟩) = |⟨x, y⟩|/⟨x, x⟩.

Furthermore, the global minimum is equal to

p(t0 ) = ⟨y, y⟩ − (−2|⟨x, y⟩|)²/(4⟨x, x⟩) = ⟨y, y⟩ − |⟨x, y⟩|²/⟨x, x⟩ ≥ 0,

where the last inequality follows from the fact that p(t) ≥ 0 for all t ∈ R. The desired inequality |⟨x, y⟩|² ≤ ⟨x, x⟩⟨y, y⟩ thus follows. Finally,
we study the equality condition as follows:
• First, suppose that x, y are linearly dependent, say x = cy for some c ∈ F. Then

⟨x, y⟩ = ⟨cy, y⟩ = c⟨y, y⟩ and ⟨x, x⟩ = ⟨cy, cy⟩ = |c|²⟨y, y⟩,

so
|⟨x, y⟩|² = |c|²⟨y, y⟩² = ⟨x, x⟩⟨y, y⟩.
That is, the equality is attained.
• Now suppose that x, y are linearly independent. Then for every t, θ ∈ R, the element tx − e^{iθ}y is nonzero. When ⟨·, ·⟩ is an
inner product on V , we see that p(t) = ⟨tx − e^{iθ}y, tx − e^{iθ}y⟩ > 0 for all t ∈ R. Together with the discussion above, we see that
|⟨x, y⟩|² < ⟨x, x⟩⟨y, y⟩, namely the equality is never attained in this case. ■

Direct Proof for Inner Product Spaces. Suppose that V is an inner product space with x, y ∈ V . Again, if x = y = 0V , we have ⟨x, y⟩ =
⟨x, x⟩ = ⟨y, y⟩ = 0, so the desired inequality follows as well. Now, without loss of generality, assume that y ̸= 0V . Put v := ⟨y, y⟩x − ⟨x, y⟩y.
As we can see,

0 ≤ ⟨v, v⟩ = ⟨⟨y, y⟩x − ⟨x, y⟩y, ⟨y, y⟩x − ⟨x, y⟩y⟩
= ⟨y, y⟩²⟨x, x⟩ − ⟨y, y⟩\overline{⟨x, y⟩}⟨x, y⟩ − ⟨x, y⟩⟨y, y⟩⟨y, x⟩ + |⟨x, y⟩|²⟨y, y⟩
= ⟨y, y⟩²⟨x, x⟩ − ⟨y, y⟩|⟨x, y⟩|² − |⟨x, y⟩|²⟨y, y⟩ + |⟨x, y⟩|²⟨y, y⟩
= ⟨y, y⟩²⟨x, x⟩ − ⟨y, y⟩|⟨x, y⟩|² = ⟨y, y⟩(⟨y, y⟩⟨x, x⟩ − |⟨x, y⟩|²).

Now since ⟨y, y⟩ > 0, the desired inequality ⟨y, y⟩⟨x, x⟩ ≥ |⟨x, y⟩|2 follows. Clearly, the equality here is attained if and only if v = ⟨y, y⟩x −
⟨x, y⟩y = 0V .
• If v = ⟨y, y⟩x − ⟨x, y⟩y = 0V , we have x = ⟨y, y⟩−1 ⟨x, y⟩y, hence x, y are linearly dependent.
• Conversely, suppose that x = cy for some c ∈ F. Then ⟨x, y⟩ = ⟨cy, y⟩ = c⟨y, y⟩, hence

v = ⟨y, y⟩x − ⟨x, y⟩y = ⟨y, y⟩(cy) − c⟨y, y⟩y = 0V .

The proof is thus complete. ■

Alternative Proof for the Inequality. Again, we consider the case when V is an inner product space. Let λ ∈ C be nonzero. We now put
u := λ x and v := (λ̄)⁻¹y. Observe that

⟨u, v⟩ = ⟨λ x, (λ̄)⁻¹y⟩ = λ · (1/λ )⟨x, y⟩ = ⟨x, y⟩.

By Lemma 2.5, we have

|⟨x, y⟩| = |⟨u, v⟩| ≤ (⟨u, u⟩ + ⟨v, v⟩)/2 = |λ |²⟨x, x⟩/2 + ⟨y, y⟩/(2|λ |²).

Again, there is nothing to prove if x = 0. Suppose otherwise. Then we may put λ := (⟨y, y⟩/⟨x, x⟩)^{1/4}, hence

|⟨x, y⟩| ≤ (⟨y, y⟩/⟨x, x⟩)^{1/2}⟨x, x⟩/2 + ⟨y, y⟩/(2(⟨y, y⟩/⟨x, x⟩)^{1/2}) = ⟨x, x⟩^{1/2}⟨y, y⟩^{1/2}.

The desired inequality follows as well. ■
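The CBS inequality and its equality case can also be sanity-checked numerically. The sketch below is illustrative only; it assumes NumPy is available and encodes these notes' convention ⟨x, y⟩ = y∗x (linear in the first slot) via np.vdot.

```python
import numpy as np

rng = np.random.default_rng(0)

def inner(x, y):
    # Convention of these notes: <x, y> = y* x, linear in the first slot.
    return np.vdot(y, x)

# |<x, y>|^2 <= <x, x><y, y> on random complex vectors
for _ in range(100):
    x = rng.normal(size=4) + 1j * rng.normal(size=4)
    y = rng.normal(size=4) + 1j * rng.normal(size=4)
    assert abs(inner(x, y))**2 <= inner(x, x).real * inner(y, y).real + 1e-9

# Equality when x, y are linearly dependent, e.g. x = (2 - 3i) y
y = rng.normal(size=4) + 1j * rng.normal(size=4)
x = (2 - 3j) * y
lhs = abs(inner(x, y))**2
rhs = inner(x, x).real * inner(y, y).real
assert abs(lhs - rhs) <= 1e-8 * rhs
```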

Theorem 2.7 (Norms Derived from Inner Products). Let V be a semi-inner product space. Then V is also a
semi-normed linear space under the following semi-norm:

∥x∥ := ⟨x, x⟩1/2 , ∀x ∈ V. (44)

1. The real part ℜ⟨·, ·⟩ of the semi-inner product induces the same semi-norm on V .

2. In the triangle inequality ∥x + y∥ ≤ ∥x∥ + ∥y∥ with x, y ∈ V , the equality is attained if and only if y = cx
for some non-negative c ∈ R.

3. The semi-inner product is continuous under the pseudo-metric topology induced by ∥ · ∥.

4. The map ∥ · ∥ is a norm on V if and only if ⟨·, ·⟩ is an inner product.


Proof. Let ∥ · ∥ : V → R be defined as above. Clearly, such a map is homogeneous with respect to scalar multiplication: For every x ∈ V and c ∈ F,

∥cx∥ = ⟨cx, cx⟩^{1/2} = (|c|²⟨x, x⟩)^{1/2} = |c|⟨x, x⟩^{1/2} = |c|∥x∥.

Furthermore, for every x, y ∈ V , by the CBS inequality,

∥x + y∥² = ⟨x + y, x + y⟩ = ⟨x, x⟩ + ⟨x, y⟩ + ⟨y, x⟩ + ⟨y, y⟩
= ∥x∥² + ∥y∥² + 2ℜ⟨x, y⟩
≤ ∥x∥² + ∥y∥² + 2|⟨x, y⟩|
≤ ∥x∥² + ∥y∥² + 2∥x∥∥y∥ = (∥x∥ + ∥y∥)²,

hence we certainly have ∥x + y∥ ≤ ∥x∥ + ∥y∥, proving the triangle inequality. The bottom equality in CBS is attained if and only if x = cy
for some c ∈ F. In this case,
ℜ⟨x, y⟩ = ℜ(⟨cy, y⟩) = ℜ(c⟨y, y⟩) = ℜ(c)⟨y, y⟩
and
|⟨x, y⟩| = |⟨cy, y⟩| = |c⟨y, y⟩| = |c|⟨y, y⟩,
so the top equality is attained if and only if ℜ(c) = |c|, which is equivalent to saying that c is a non-negative real number. Therefore, the
equality in the triangle inequality is attained if and only if one of x, y is a non-negative real multiple of the other.
• By Theorem 2.3, we see that ℜ⟨x, x⟩ = ⟨x, x⟩ for all x ∈ V , so the semi-norm on V induced by ℜ⟨·, ·⟩ is the same as the one induced
by ⟨·, ·⟩.
• Let x, x′ , y, y′ ∈ V . By CBS inequality,

|⟨x′ , y′ ⟩ − ⟨x, y⟩| = |⟨x′ , y′ ⟩ − ⟨x′ , y⟩ + ⟨x′ , y⟩ − ⟨x, y⟩|


≤ |⟨x′ , y′ ⟩ − ⟨x′ , y⟩| + |⟨x′ , y⟩ − ⟨x, y⟩| = |⟨x′ , y′ − y⟩| + |⟨x′ − x, y⟩|
≤ ∥x′ ∥∥y′ − y∥ + ∥x′ − x∥∥y∥ ≤ (∥x′ ∥ + ∥y∥)(∥x′ − x∥ ∨ ∥y′ − y∥).

Consequently, if (x′ , y′ ) → (x, y) in V × V , we shall have x′ → x and y′ → y in V . As a result, we see that ∥x′ − x∥ → 0 and
∥y′ − y∥ → 0, hence |⟨x′ , y′ ⟩ − ⟨x, y⟩| → 0 as well. This proves the continuity of the semi-inner product. (In addition, its continuity
also follows from the polarization identities in the following theorem.)
• Finally, for every x ∈ V ,
∥x∥ = 0 ⇐⇒ ∥x∥2 = 0 ⇐⇒ ⟨x, x⟩ = 0.
Clearly, ∥ · ∥ is a norm if and only if the above is equivalent to that x = 0V , which is precisely the case when ⟨·, ·⟩ is an inner
product. ■

Remark 2.5. Given any semi-inner product space, we shall by default use ∥ · ∥ to denote the semi-norm induced
by the semi-inner product. In this case, the CBS inequality can be rephrased as follows: For every x, y ∈ V ,

|⟨x, y⟩| ≤ ∥x∥∥y∥. (45)

Definition 2.3. An inner product space is called a Hilbert space if it is a Banach space under the derived norm,
or equivalently, the metric induced by the inner product is complete.

Example 2.3 (Standard Inner Product on Fn ). The standard inner product on Fn is defined as follows: For every
x = [x1 , . . . , xn ]T , y = [y1 , . . . , yn ]T ∈ Fn ,

⟨x, y⟩ := y∗ x = x1 ȳ1 + · · · + xn ȳn . (46)

We prove that this is an inner product on Fn as follows:

• For every x = [x1 , . . . , xn ]T ,
⟨x, x⟩ = x1 x̄1 + · · · + xn x̄n = |x1 |² + · · · + |xn |² ≥ 0.
Clearly,
⟨x, x⟩ = 0 ⇐⇒ |x1 | = · · · = |xn | = 0 ⇐⇒ x1 = · · · = xn = 0 ⇐⇒ x = 0.

• For every x = [x1 , . . . , xn ]T , x′ = [x1′ , . . . , xn′ ]T , y = [y1 , . . . , yn ]T ∈ Fn , since x + x′ = [x1 + x1′ , . . . , xn + xn′ ]T ,

⟨x + x′ , y⟩ = ∑_{k=1}^n (xk + xk′ )ȳk = ∑_{k=1}^n (xk ȳk + xk′ ȳk ) = ∑_{k=1}^n xk ȳk + ∑_{k=1}^n xk′ ȳk = ⟨x, y⟩ + ⟨x′ , y⟩.

• For every x = [x1 , . . . , xn ]T , y = [y1 , . . . , yn ]T ∈ Fn and c ∈ F, since cx = [cx1 , . . . , cxn ]T ,

⟨cx, y⟩ = ∑_{k=1}^n (cxk )ȳk = c ∑_{k=1}^n xk ȳk = c⟨x, y⟩.

• For every x = [x1 , . . . , xn ]T , y = [y1 , . . . , yn ]T ∈ Fn ,

⟨y, x⟩ = ∑_{k=1}^n yk x̄k = \overline{∑_{k=1}^n ȳk xk } = \overline{∑_{k=1}^n xk ȳk } = \overline{⟨x, y⟩}.

Furthermore, the norm on Fn induced by the standard inner product is precisely the Euclidean norm: For every
x = [x1 , . . . , xn ]T ,
⟨x, x⟩^{1/2} = (|x1 |² + · · · + |xn |²)^{1/2} = ∥x∥₂ .
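As a minimal numerical sketch of (46) (assuming NumPy is available; np.conj(y) @ x realizes the product y∗x), the following checks Hermitian symmetry and that the derived norm is the Euclidean norm:

```python
import numpy as np

def std_inner(x, y):
    # <x, y> := y* x = x_1 conj(y_1) + ... + x_n conj(y_n)
    return np.conj(y) @ x

rng = np.random.default_rng(1)
x = rng.normal(size=5) + 1j * rng.normal(size=5)
y = rng.normal(size=5) + 1j * rng.normal(size=5)

# Hermitian symmetry: <y, x> = conj(<x, y>)
assert np.isclose(std_inner(y, x), np.conj(std_inner(x, y)))
# The derived norm <x, x>^(1/2) is the Euclidean norm
assert np.isclose(np.sqrt(std_inner(x, x).real), np.linalg.norm(x))
```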

Theorem 2.8. Every finite-dimensional inner product space is a Hilbert space.


Proof. Let V be a finite-dimensional inner product space. Then under the norm induced by its inner product, we see that V is also a finite-
dimensional normed linear space. By Theorem 1.28, we see that V is already a Banach space under the derived norm, hence is a Hilbert
space by definition. ■

Example 2.4 (Inner Product Induced by Matrix). In addition, let A ∈ Mn (F) be a matrix. Then the following
map based on the standard inner product also defines a semi-inner product on Fn :

∀x, y ∈ Fn : ⟨x, y⟩A := ⟨Ax, Ay⟩ = (Ay)∗ Ax = y∗ (A∗ A)x. (47)

• For every x ∈ Fn , it is clear that ⟨x, x⟩A = ⟨Ax, Ax⟩ ≥ 0.

• For every x, x′ , y ∈ Fn and c ∈ F,

⟨x + x′ , y⟩A = ⟨A(x + x′ ), Ay⟩ = ⟨Ax, Ay⟩ + ⟨Ax′ , Ay⟩ = ⟨x, y⟩A + ⟨x′ , y⟩A

and
⟨cx, y⟩A = ⟨A(cx), Ay⟩ = ⟨c(Ax), Ay⟩ = c⟨Ax, Ay⟩ = c⟨x, y⟩A .

• For every x, y ∈ Fn ,
⟨y, x⟩A = ⟨Ay, Ax⟩ = ⟨Ax, Ay⟩ = ⟨x, y⟩A .

Furthermore, for every x ∈ Fn ,

⟨x, x⟩A = 0 ⇐⇒ ⟨Ax, Ax⟩ = 0 ⇐⇒ Ax = 0 ⇐= x = 0.

Clearly, the last converse holds if and only if A has full column rank; because A is square, this is equivalent to A being invertible. In other words, the map ⟨·, ·⟩A is an inner product on Fn if and only if A is
invertible.
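A quick numerical sketch of the map ⟨·, ·⟩A (illustrative only, assuming NumPy is available): for a singular A the form degenerates on the kernel of A, while for an invertible matrix it stays positive definite.

```python
import numpy as np

def inner_A(x, y, A):
    # <x, y>_A := <Ax, Ay> = y* (A* A) x, with <u, v> = v* u
    return np.conj(A @ y) @ (A @ x)

rng = np.random.default_rng(2)
n = 4
x = rng.normal(size=n) + 1j * rng.normal(size=n)
y = rng.normal(size=n) + 1j * rng.normal(size=n)

# Hermitian: <y, x>_A = conj(<x, y>_A)
B = np.eye(n) + 0.1 * (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
assert np.isclose(inner_A(y, x, B), np.conj(inner_A(x, y, B)))
# B is (almost surely) invertible here, so <x, x>_B > 0 for nonzero x
assert inner_A(x, x, B).real > 0

# A singular: <., .>_A vanishes on a nonzero vector, so it is only a semi-inner product
A = np.eye(n); A[-1, -1] = 0.0
e_last = np.zeros(n); e_last[-1] = 1.0
assert np.isclose(inner_A(e_last, e_last, A), 0.0)
```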

Theorem 2.9 (Complexification). Let X be a real inner product space. Then the following map

⟨x1 + iy1 , x2 + iy2 ⟩C := (⟨x1 , x2 ⟩ + ⟨y1 , y2 ⟩) + i(⟨y1 , x2 ⟩ − ⟨x1 , y2 ⟩) (48)

defines an inner product on XC , in which for every x, y ∈ X,

∥x + iy∥2C = ⟨x + iy, x + iy⟩C = ∥x∥2 + ∥y∥2 . (49)

Furthermore, the space XC is complete whenever X is.

Remark 2.6. The motivation for such inner product is as follows: Note that in C, the inner product is defined as
follows: For a1 , a2 , b1 , b2 ∈ R,

⟨a1 + ib1 , a2 + ib2 ⟩ = (a1 + ib1 )\overline{(a2 + ib2 )} = (a1 + ib1 )(a2 − ib2 )
= (a1 a2 + b1 b2 ) + i(a2 b1 − a1 b2 ).

Proof. We first verify that ⟨·, ·⟩C is an inner product on XC :


• For every x, y ∈ X,

⟨x + iy, x + iy⟩C = (⟨x, x⟩ + ⟨y, y⟩) + i(⟨y, x⟩ − ⟨x, y⟩)


= ⟨x, x⟩ + ⟨y, y⟩ ≥ 0.

Furthermore,
⟨x + iy, x + iy⟩C = 0 ⇐⇒ ⟨x, x⟩ = ⟨y, y⟩ = 0 ⇐⇒ x = y = 0 ⇐⇒ x + iy = 0.

• For every x1 , x1′ , x2 , y1 , y′1 , y2 ∈ X,

⟨x1 + iy1 , x2 + iy2 ⟩C + ⟨x1′ + iy′1 , x2 + iy2 ⟩C


= (⟨x1 , x2 ⟩ + ⟨y1 , y2 ⟩) + i(⟨y1 , x2 ⟩ − ⟨x1 , y2 ⟩) + (⟨x1′ , x2 ⟩ + ⟨y′1 , y2 ⟩) + i(⟨y′1 , x2 ⟩ − ⟨x1′ , y2 ⟩)
= (⟨x1 , x2 ⟩ + ⟨y1 , y2 ⟩ + ⟨x1′ , x2 ⟩ + ⟨y′1 , y2 ⟩) + i(⟨y1 , x2 ⟩ − ⟨x1 , y2 ⟩ + ⟨y′1 , x2 ⟩ − ⟨x1′ , y2 ⟩)
= (⟨x1 + x1′ , x2 ⟩ + ⟨y1 + y′1 , y2 ⟩) + i(⟨y1 + y′1 , x2 ⟩ − ⟨x1 + x1′ , y2 ⟩)
= ⟨(x1 + x1′ ) + i(y1 + y′1 ), x2 + iy2 ⟩C .

• For every x1 , x2 , y1 , y2 ∈ X and a, b ∈ R,

(a + ib)⟨x1 + iy1 , x2 + iy2 ⟩C = (a + ib)((⟨x1 , x2 ⟩ + ⟨y1 , y2 ⟩) + i(⟨y1 , x2 ⟩ − ⟨x1 , y2 ⟩))


= (a(⟨x1 , x2 ⟩ + ⟨y1 , y2 ⟩) − b(⟨y1 , x2 ⟩ − ⟨x1 , y2 ⟩)) + i(a(⟨y1 , x2 ⟩ − ⟨x1 , y2 ⟩) + b(⟨x1 , x2 ⟩ + ⟨y1 , y2 ⟩))
= (⟨ax1 − by1 , x2 ⟩ + ⟨ay1 + bx1 , y2 ⟩) + i(⟨ay1 + bx1 , x2 ⟩ − ⟨ax1 − by1 , y2 ⟩)
= ⟨(ax1 − by1 ) + i(ay1 + bx1 ), x2 + iy2 ⟩C = ⟨(a + ib)(x1 + iy1 ), x2 + iy2 ⟩C .

• For every x1 , x2 , y1 , y2 ∈ X,

⟨x2 + iy2 , x1 + iy1 ⟩C = (⟨x2 , x1 ⟩ + ⟨y2 , y1 ⟩) + i(⟨y2 , x1 ⟩ − ⟨x2 , y1 ⟩)
= (⟨x1 , x2 ⟩ + ⟨y1 , y2 ⟩) − i(⟨y1 , x2 ⟩ − ⟨x1 , y2 ⟩)
= \overline{(⟨x1 , x2 ⟩ + ⟨y1 , y2 ⟩) + i(⟨y1 , x2 ⟩ − ⟨x1 , y2 ⟩)} = \overline{⟨x1 + iy1 , x2 + iy2 ⟩C }.

Finally, let (xn + iyn )n∈N be a Cauchy sequence in XC . Then for every m, n ∈ N,

∥(xm + iym ) − (xn + iyn )∥2C = ∥(xm − xn ) + i(ym − yn )∥2C = ∥xm − xn ∥2 + ∥ym − yn ∥2 ,

so (xn )n∈N and (yn )n∈N are also Cauchy sequences in X. Now if X is a Hilbert space, these two sequences must be convergent, say
xn → x ∈ X and yn → y ∈ X. Again, for every n ∈ N,

∥(xn + iyn ) − (x + iy)∥2C = ∥xn − x∥2 + ∥yn − y∥2 → 02 + 02 = 0. (as n → ∞)

Therefore, we also have xn + iyn → x + iy in XC , hence XC is a Hilbert space as well in this case. ■
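The complexified inner product (48) can be checked against the usual complex dot product. The sketch below (illustrative only, assuming NumPy is available) represents an element x + iy of XC as the pair (x, y), with X = Rⁿ carrying the dot product.

```python
import numpy as np

def inner_C(p1, p2):
    # (48): <x1 + i y1, x2 + i y2>_C with X = R^n and the dot product
    x1, y1 = p1; x2, y2 = p2
    return (x1 @ x2 + y1 @ y2) + 1j * (y1 @ x2 - x1 @ y2)

rng = np.random.default_rng(3)
x, y = rng.normal(size=4), rng.normal(size=4)

# (49): ||x + iy||_C^2 = ||x||^2 + ||y||^2 (in particular, a real number)
val = inner_C((x, y), (x, y))
assert np.isclose(val.imag, 0) and np.isclose(val.real, x @ x + y @ y)

# Matches the standard inner product on C^n: <u, v> = v* u
u = x + 1j * y
v = rng.normal(size=4) + 1j * rng.normal(size=4)
p2 = (v.real, v.imag)
assert np.isclose(inner_C((x, y), p2), np.vdot(v, u))
```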

2.2 Remarks on Derived Norms


Theorem 2.10 (Properties of Derived Norms). Let V be a semi-inner product space. Then for every x, y ∈ V ,

∥x ± y∥2 = ∥x∥2 + ∥y∥2 ± 2ℜ⟨x, y⟩ (50)

and hence when F = C,


∥x ± iy∥2 = ∥x∥2 + ∥y∥2 ± 2ℑ⟨x, y⟩. (51)

1. (Pythagorean Theorem). If ⟨x, y⟩ = 0, then ∥x + y∥2 = ∥x∥2 + ∥y∥2 .

2. (Parallelogram Identity). ∥x + y∥2 + ∥x − y∥2 = 2(∥x∥2 + ∥y∥2 ).

3. ∥x + y∥∥x − y∥ ≤ ∥x∥2 + ∥y∥2 with equality if and only if ℜ⟨x, y⟩ = 0.

4. (Polarization Identities).

⟨x, y⟩ = (1/4)(∥x + y∥² − ∥x − y∥²) = (1/4) ∑_{k=0}^{1} (−1)^k ∥x + (−1)^k y∥². (F = R)

⟨x, y⟩ = (1/4)[(∥x + y∥² − ∥x − y∥²) + i(∥x + iy∥² − ∥x − iy∥²)] = (1/4) ∑_{k=0}^{3} i^k ∥x + i^k y∥². (F = C)

Proof. Let x, y ∈ V be arbitrary. Observe that

∥x ± y∥2 = ⟨x ± y, x ± y⟩ = ⟨x, x⟩ + ⟨y, y⟩ ± (⟨y, x⟩ + ⟨x, y⟩)


= ∥x∥2 + ∥y∥2 ± 2ℜ⟨x, y⟩.

When ⟨x, y⟩ = 0, we see that ℜ⟨x, y⟩ = 0 as well. Consequently, the equality ∥x + y∥² = ∥x∥² + ∥y∥² follows immediately. Furthermore, by
the above identities, we see that

∥x ± iy∥2 = ∥x∥2 + ∥iy∥2 ± 2ℜ⟨x, iy⟩


= ∥x∥2 + ∥y∥2 ± 2ℜ(−i⟨x, y⟩) = ∥x∥2 + ∥y∥2 ± 2ℑ⟨x, y⟩,

while
∥x + y∥2 + ∥x − y∥2 = 2(∥x∥2 + ∥y∥2 ) and ∥x + y∥2 − ∥x − y∥2 = 4ℜ⟨x, y⟩.
Similarly, we see that

∥x + y∥2 ∥x − y∥2 = (∥x∥2 + ∥y∥2 + 2ℜ⟨x, y⟩)(∥x∥2 + ∥y∥2 − 2ℜ⟨x, y⟩)


= (∥x∥2 + ∥y∥2 )2 − 4(ℜ⟨x, y⟩)2 ≤ (∥x∥2 + ∥y∥2 )2 ,

whose equality holds if and only if ℜ⟨x, y⟩ = 0. Finally, observe that when F = C,

∥x + iy∥2 − ∥x − iy∥2 = 4ℜ(⟨x, iy⟩) = 4ℜ(−i⟨x, y⟩) = 4ℑ⟨x, y⟩.

The desired statements thus follow immediately. ■

Remark 2.7. By (50), we see that

ℜ⟨x, y⟩ = (1/4)(∥x + y∥² − ∥x − y∥²) = (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²), (52)

which degenerates to ⟨x, y⟩ when F = R.
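The parallelogram identity and the complex polarization identity above can be verified numerically; the following sketch (illustrative only, assuming NumPy is available) uses the standard inner product ⟨x, y⟩ = y∗x on Cⁿ.

```python
import numpy as np

inner = lambda x, y: np.vdot(y, x)    # <x, y> = y* x, linear in the first slot
norm2 = lambda x: np.vdot(x, x).real  # ||x||^2

rng = np.random.default_rng(4)
x = rng.normal(size=5) + 1j * rng.normal(size=5)
y = rng.normal(size=5) + 1j * rng.normal(size=5)

# Parallelogram identity
assert np.isclose(norm2(x + y) + norm2(x - y), 2 * (norm2(x) + norm2(y)))

# Complex polarization: <x, y> = (1/4) sum_{k=0}^{3} i^k ||x + i^k y||^2
pol = sum(1j**k * norm2(x + 1j**k * y) for k in range(4)) / 4
assert np.isclose(pol, inner(x, y))
```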

Corollary 2.11 (Generalized Pythagorean Theorem). Let V be a semi-inner product space over F and x1 , . . . , xm ∈ V ,
where m ≥ 2. If ⟨xi , x j ⟩ = 0 for every distinct i, j = 1, . . . , m, then for every c1 , . . . , cm ∈ F,

∥c1 x1 + · · · + cm xm ∥2 = |c1 |2 ∥x1 ∥2 + · · · + |cm |2 ∥xm ∥2 . (53)

Proof. By direct computations,

∥∑_{i=1}^m ci xi ∥² = ⟨∑_{i=1}^m ci xi , ∑_{j=1}^m c j x j ⟩ = ∑_{i=1}^m ∑_{j=1}^m ci c̄ j ⟨xi , x j ⟩ = ∑_{i=1}^m ci c̄i ⟨xi , xi ⟩ = ∑_{i=1}^m |ci |²∥xi ∥².

Alternatively, one may also prove such identity by induction on m.


• (Base Step). Let x1 , x2 ∈ V be such that ⟨x1 , x2 ⟩ = 0 and c1 , c2 ∈ F be arbitrary. Observe that

⟨c1 x1 , c2 x2 ⟩ = c1 c2 ⟨x1 , x2 ⟩ = c1 c2 (0) = 0.

Then by the Pythagorean theorem,

∥c1 x1 + c2 x2 ∥2 = ∥c1 x1 ∥2 + ∥c2 x2 ∥2 = |c1 |2 ∥x1 ∥2 + |c2 |2 ∥x2 ∥2 .

• Now assume that the identity holds for some m ≥ 2. Let x1 , . . . , xm+1 ∈ V be such that ⟨xi , x j ⟩ = 0 for every distinct i, j = 1, . . . , m + 1,
and c1 , . . . , cm+1 ∈ F be arbitrary. By similar arguments as above and the linearity of the semi-inner product,

⟨c1 x1 + · · · + cm xm , cm+1 xm+1 ⟩ = ∑_{i=1}^m ⟨ci xi , cm+1 xm+1 ⟩ = ∑_{i=1}^m 0 = 0.

Therefore,

∥c1 x1 + · · · + cm xm + cm+1 xm+1 ∥² (∗)= ∥c1 x1 + · · · + cm xm ∥² + ∥cm+1 xm+1 ∥²
(∗∗)= |c1 |²∥x1 ∥² + · · · + |cm |²∥xm ∥² + |cm+1 |²∥xm+1 ∥²,

where (∗) follows from the Pythagorean theorem, and (∗∗) follows from the inductive hypothesis. ■
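A quick numerical instance of (53) (illustrative only, assuming NumPy is available): the columns of the Q factor from a QR factorization are orthonormal, so a linear combination Qc must satisfy ∥Qc∥² = ∑ |ci|².

```python
import numpy as np

rng = np.random.default_rng(5)
# Orthonormal columns via QR of a random complex matrix
Q, _ = np.linalg.qr(rng.normal(size=(6, 3)) + 1j * rng.normal(size=(6, 3)))
c = rng.normal(size=3) + 1j * rng.normal(size=3)

# Generalized Pythagorean theorem for orthonormal x_1, x_2, x_3:
# ||c1 x1 + c2 x2 + c3 x3||^2 = |c1|^2 + |c2|^2 + |c3|^2
v = Q @ c
assert np.isclose(np.vdot(v, v).real, np.sum(np.abs(c)**2))
```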

Theorem 2.12 (Generalized Polarization Identity). Let V be a complex linear space, ϕ : V × V → C be a


sesquilinear form on V , and Φ : V → C : x 7→ ϕ(x, x) be the quadratic form associated with ϕ. Then for every
x, y ∈ V ,
1 3
ϕ(x, y) = ∑ ik Φ(x + ik y)
4 k=0
(54)
1
= [(Φ(x + y) − Φ(x − y)) + i(Φ(x + iy) − Φ(x − iy))].
4
That is, the sesquilinear form ϕ is completely determined by its associated quadratic form Φ.

Proof. Observe that

ϕ(x, y) = (1/2) · (−1/(2i)) ϕ(2x, 2iy) = −(1/(4i)) ϕ((x + iy) + (x − iy), (x + iy) − (x − iy))
= (i/4)(Φ(x + iy) + ϕ(x − iy, x + iy) − ϕ(x + iy, x − iy) − Φ(x − iy))
= (i/4)(Φ(x + iy) − Φ(x − iy)) + (i/4)(ϕ(x − iy, x + iy) − ϕ(x + iy, x − iy)).

What remains is to study the last two terms:

(i/4)(ϕ(x − iy, x + iy) − ϕ(x + iy, x − iy)) = (i/4)((Φ(x) − ϕ(iy, x) + ϕ(x, iy) − Φ(iy)) − (Φ(x) + ϕ(iy, x) − ϕ(x, iy) − Φ(iy)))
= (i/4) · 2(ϕ(x, iy) − ϕ(iy, x)) = (i/4) · 2(−iϕ(x, y) − iϕ(y, x))
= (1/2)(ϕ(x, y) + ϕ(y, x)).

Observe that

Φ(x + y) − Φ(x − y) = (Φ(x) + ϕ(y, x) + ϕ(x, y) + Φ(y)) − (Φ(x) − ϕ(y, x) − ϕ(x, y) + Φ(y))
= 2(ϕ(x, y) + ϕ(y, x)),

so
(i/4)(ϕ(x − iy, x + iy) − ϕ(x + iy, x − iy)) = (1/4)(Φ(x + y) − Φ(x − y)).

The proof is thus complete. ■
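The identity (54) can be checked on a concrete sesquilinear form; the sketch below (illustrative only, assuming NumPy is available) takes ϕ(x, y) = ⟨Ax, y⟩ = y∗Ax for a random complex matrix A and recovers ϕ from its quadratic form.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

phi = lambda x, y: np.vdot(y, A @ x)  # sesquilinear form: <Ax, y> = y* A x
Phi = lambda x: phi(x, x)             # associated quadratic form

x = rng.normal(size=n) + 1j * rng.normal(size=n)
y = rng.normal(size=n) + 1j * rng.normal(size=n)

# (54): phi(x, y) = (1/4) sum_{k=0}^{3} i^k Phi(x + i^k y)
recovered = sum(1j**k * Phi(x + 1j**k * y) for k in range(4)) / 4
assert np.isclose(recovered, phi(x, y))
```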

Corollary 2.13. Let V be a complex linear space and ϕ : V × V → C be a sesquilinear form. Then ϕ is
Hermitian, namely ϕ(x, y) = \overline{ϕ(y, x)} for all x, y ∈ V , if and only if ϕ(x, x) ∈ R for all x ∈ V ; that is, its associated
quadratic form takes real values only.
Proof. For convenience, denote by Φ : V → C : x 7→ ϕ(x, x) the quadratic form associated with ϕ. First, if ϕ is Hermitian, then for every
x ∈ V , we have Φ(x) = ϕ(x, x) = \overline{ϕ(x, x)} = \overline{Φ(x)}, so Φ(x) ∈ R holds for sure. Conversely, suppose that Φ(x) ∈ R for all x ∈ V . Observe from Remark
2.4 that

Φ(y − x) = Φ(−(x − y)) = |−1|²Φ(x − y) = Φ(x − y),
Φ(y ± ix) = Φ(±i(x ∓ iy)) = |±i|²Φ(x ∓ iy) = Φ(x ∓ iy).

By the generalized polarization identity, we have

ϕ(y, x) = (1/4)[(Φ(y + x) − Φ(y − x)) + i(Φ(y + ix) − Φ(y − ix))]
= (1/4)[(Φ(x + y) − Φ(x − y)) + i(Φ(x − iy) − Φ(x + iy))]
= \overline{(1/4)[(Φ(x + y) − Φ(x − y)) + i(Φ(x + iy) − Φ(x − iy))]} = \overline{ϕ(x, y)},

where the last line uses that all values of Φ are real. Therefore, the sesquilinear form ϕ is Hermitian in this case. ■

Remark 2.8. Consequently, a sesquilinear form is a semi-inner product if and only if it is positive semi-definite,
as the Hermitian condition is now implied by the positive semi-definiteness condition. Besides, the quadratic
form associated with a semi-inner product is precisely the square of the derived semi-norm.

Corollary 2.14. Let V be a complex semi-inner product space and T ∈ L (V ) be a linear operator. Then for
every x, y ∈ V ,

⟨T x, y⟩ = (1/4) ∑_{k=0}^{3} i^k ⟨T (x + i^k y), x + i^k y⟩. (55)

In particular, if ⟨T x, x⟩ = 0 for all x ∈ V , we must have T = 0.


Proof. By Example 2.2, the following map is a sesquilinear form on V :

ϕ : V ×V → C : (x, y) 7→ ⟨T x, y⟩.

Then by the generalized polarization identity, we certainly have

⟨T x, y⟩ = ϕ(x, y) = (1/4) ∑_{k=0}^{3} i^k ϕ(x + i^k y, x + i^k y) = (1/4) ∑_{k=0}^{3} i^k ⟨T (x + i^k y), x + i^k y⟩, ∀x, y ∈ V.

As we can see, suppose that ⟨T x, x⟩ = 0 for all x ∈ V . Fix one arbitrary x ∈ V . By the identity above, we now have ⟨T x, y⟩ = 0 for all y ∈ V ,
especially when y = T x itself, hence T x = 0V holds. Since x ∈ V is arbitrary, it follows that T = 0, as desired. ■

Remark 2.9. By putting T := idV , the above generalized polarization identity recovers the classical one.

Theorem 2.15. Let V be a semi-inner product space. For every nonzero x, y ∈ V with respective normalizations
x̂ := ∥x∥⁻¹x and ŷ := ∥y∥⁻¹y,

∥x̂ − ŷ∥ ≤ 2∥x − y∥/(∥x∥ + ∥y∥). (56)
Proof. By Theorem 2.10,

∥x̂ − ŷ∥² = ∥x̂∥² + ∥ŷ∥² − 2ℜ⟨x̂, ŷ⟩ = 2 − (2/(∥x∥∥y∥))ℜ⟨x, y⟩
= 2 − (2/(∥x∥∥y∥)) · (∥x∥² + ∥y∥² − ∥x − y∥²)/2
= (2∥x∥∥y∥ − (∥x∥² + ∥y∥² − ∥x − y∥²))/(∥x∥∥y∥) = (∥x − y∥² − (∥x∥ − ∥y∥)²)/(∥x∥∥y∥).

In this case,

4∥x − y∥² − (∥x∥ + ∥y∥)²∥x̂ − ŷ∥² = 4∥x − y∥² − (∥x∥ + ∥y∥)² · (∥x − y∥² − (∥x∥ − ∥y∥)²)/(∥x∥∥y∥)
= (4∥x∥∥y∥∥x − y∥² − (∥x∥ + ∥y∥)²(∥x − y∥² − (∥x∥ − ∥y∥)²))/(∥x∥∥y∥)
= (−(∥x∥ − ∥y∥)²∥x − y∥² + (∥x∥ + ∥y∥)²(∥x∥ − ∥y∥)²)/(∥x∥∥y∥)
= ((∥x∥ − ∥y∥)²/(∥x∥∥y∥)) · ((∥x∥ + ∥y∥)² − ∥x − y∥²).

Note from the CBS inequality that

(∥x∥ + ∥y∥)² − ∥x − y∥² = (∥x∥² + 2∥x∥∥y∥ + ∥y∥²) − (∥x∥² + ∥y∥² − 2ℜ⟨x, y⟩)
= 2∥x∥∥y∥ + 2ℜ⟨x, y⟩ = 2(∥x∥∥y∥ + ℜ⟨x, y⟩)
≥ 2(|⟨x, y⟩| + ℜ⟨x, y⟩) ≥ 0.

Therefore, we have
4∥x − y∥² − (∥x∥ + ∥y∥)²∥x̂ − ŷ∥² ≥ 0,
hence the desired inequality follows immediately. ■

Remark 2.10. The above inequality is stronger than the one presented in Theorem 1.3.

Theorem 2.16 (Ptolemy Inequality). Let V be an inner product space. Then for every x, y, z ∈ V ,

∥x − y∥∥z∥ ≤ ∥x − z∥∥y∥ + ∥z − y∥∥x∥. (57)

Proof. There is nothing to prove if any one of x, y, z is equal to 0V , so we may assume that x, y, z are all nonzero. Let

x′ := x/∥x∥², y′ := y/∥y∥², and z′ := z/∥z∥².

Then

∥x′ − y′ ∥² = ∥x′ ∥² + ∥y′ ∥² − 2ℜ⟨x′ , y′ ⟩ = 1/∥x∥² + 1/∥y∥² − (2/(∥x∥²∥y∥²))ℜ⟨x, y⟩
= (∥y∥² + ∥x∥² − 2ℜ⟨x, y⟩)/(∥x∥²∥y∥²) = ∥x − y∥²/(∥x∥²∥y∥²).

Consequently, we have
∥x′ − y′ ∥ = ∥x − y∥/(∥x∥∥y∥),
and similar identities hold for x′ , z′ and y′ , z′ as well. Finally, by the triangle inequality,

∥x − y∥/(∥x∥∥y∥) = ∥x′ − y′ ∥ ≤ ∥x′ − z′ ∥ + ∥z′ − y′ ∥ = ∥x − z∥/(∥x∥∥z∥) + ∥z − y∥/(∥z∥∥y∥).

Multiplying both sides by ∥x∥∥y∥∥z∥ yields the desired inequality. ■
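The Ptolemy inequality can be stress-tested on random vectors; the sketch below (illustrative only, assuming NumPy is available) checks (57) in R³.

```python
import numpy as np

rng = np.random.default_rng(7)
norm = np.linalg.norm

for _ in range(200):
    x, y, z = rng.normal(size=(3, 3))
    # (57): ||x - y|| ||z|| <= ||x - z|| ||y|| + ||z - y|| ||x||
    assert norm(x - y) * norm(z) <= norm(x - z) * norm(y) + norm(z - y) * norm(x) + 1e-12
```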

Theorem 2.17 (Apollonius Identity). Let V be an inner product space. For every x, y, z ∈ V ,

∥x − z∥² + ∥y − z∥² = (1/2)∥x − y∥² + 2∥z − (1/2)(x + y)∥². (58)

Proof. Observe that

∥x − y∥² = ⟨x − y, x − y⟩ = ∥x∥² + ∥y∥² − ⟨x, y⟩ − ⟨y, x⟩,

and

∥z − (1/2)(x + y)∥² = ⟨z − (1/2)(x + y), z − (1/2)(x + y)⟩ = ∥z∥² + (1/4)∥x + y∥² − (1/2)(⟨z, x + y⟩ + ⟨x + y, z⟩)
= ∥z∥² + (1/4)(∥x∥² + ∥y∥² + ⟨x, y⟩ + ⟨y, x⟩) − (1/2)(⟨z, x⟩ + ⟨z, y⟩ + ⟨x, z⟩ + ⟨y, z⟩).

Consequently,

(1/2)∥x − y∥² + 2∥z − (1/2)(x + y)∥²
= (1/2)(∥x∥² + ∥y∥² − ⟨x, y⟩ − ⟨y, x⟩) + 2∥z∥² + (1/2)(∥x∥² + ∥y∥² + ⟨x, y⟩ + ⟨y, x⟩) − (⟨z, x⟩ + ⟨z, y⟩ + ⟨x, z⟩ + ⟨y, z⟩)
= ∥x∥² + ∥y∥² + 2∥z∥² − (⟨z, x⟩ + ⟨z, y⟩ + ⟨x, z⟩ + ⟨y, z⟩)
= ⟨x − z, x − z⟩ + ⟨y − z, y − z⟩ = ∥x − z∥² + ∥y − z∥². ■

Remark 2.11. The above identity generalizes Apollonius’ theorem in geometry: Given any triangle △ABC, if D
is the midpoint of BC, then
|AB|² + |AC|² = 2(|AD|² + |BD|²). (59)
Here we may fix an arbitrary origin O, and put x := \vec{OB}, y := \vec{OC}, and z := \vec{OA}.
Theorem 2.18 (Hlawka’s Inequality). Let V be an inner product space. Then for every x, y, z ∈ V ,

∥x + y + z∥2 + ∥x∥2 + ∥y∥2 + ∥z∥2 = ∥x + y∥2 + ∥x + z∥2 + ∥y + z∥2 , (60)

and also
∥x + y∥ + ∥x + z∥ + ∥y + z∥ ≤ ∥x + y + z∥ + ∥x∥ + ∥y∥ + ∥z∥. (61)
Proof. For convenience, denote u := x + y + z. Observe that

∥u∥² = ∥x + y + z∥² = ⟨x + y + z, x + y + z⟩
= ∥x∥² + ∥y∥² + ∥z∥² + 2(ℜ⟨x, y⟩ + ℜ⟨y, z⟩ + ℜ⟨x, z⟩),

so by Theorem 2.10, the identity

∥u∥² + ∥x∥² + ∥y∥² + ∥z∥² = ∥x + y∥² + ∥x + z∥² + ∥y + z∥²

holds for sure. Next, denote

A := ∥u∥ + ∥x∥ + ∥y∥ + ∥z∥ and B := ∥x + y∥ + ∥x + z∥ + ∥y + z∥.

If A = 0, then x = y = z = 0V and the inequality is trivial. Now suppose that A > 0. Then

A² − AB = (∥u∥ + ∥x∥ + ∥y∥ + ∥z∥)² − (∥u∥ + ∥x∥ + ∥y∥ + ∥z∥)(∥x + y∥ + ∥x + z∥ + ∥y + z∥)
(∗)= (∥x∥ + ∥y∥ − ∥x + y∥)(∥u∥ + ∥z∥ − ∥x + y∥) + (∥y∥ + ∥z∥ − ∥y + z∥)(∥u∥ + ∥x∥ − ∥y + z∥)
+ (∥z∥ + ∥x∥ − ∥x + z∥)(∥u∥ + ∥y∥ − ∥x + z∥) ≥ 0.

Here in (∗), we used the identity proven previously to reduce the number of terms. In the last sum, each product is non-negative by the triangle
inequality, especially
∥u∥ + ∥z∥ = ∥x + y + z∥ + ∥−z∥ ≥ ∥(x + y + z) + (−z)∥ = ∥x + y∥
and so on. Since A > 0, the inequality A² − AB ≥ 0 entails A ≥ B, which is the desired inequality. The proof is complete. ■
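Both the identity (60) and Hlawka’s inequality (61) can be spot-checked numerically; the sketch below (illustrative only, assuming NumPy is available) uses random real vectors.

```python
import numpy as np

rng = np.random.default_rng(8)
norm = np.linalg.norm

for _ in range(200):
    x, y, z = rng.normal(size=(3, 4))
    # Identity (60)
    lhs = norm(x + y + z)**2 + norm(x)**2 + norm(y)**2 + norm(z)**2
    rhs = norm(x + y)**2 + norm(x + z)**2 + norm(y + z)**2
    assert np.isclose(lhs, rhs)
    # Hlawka's inequality (61)
    assert norm(x + y) + norm(x + z) + norm(y + z) \
           <= norm(x + y + z) + norm(x) + norm(y) + norm(z) + 1e-12
```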

Definition 2.4. Let V be a linear space. A norm ∥ · ∥ : V → R on V is said to be derived from an inner product
if there exists an inner product ⟨·, ·⟩ : V ×V → F such that ∥x∥ = ⟨x, x⟩1/2 for all x ∈ V .
Proposition 2.19. For n ≥ 2, the l p -norm on Fn is derived from an inner product if and only if p = 2.
Proof. In Example 2.3, we have seen that the Euclidean norm (the l2 -norm) is induced from the standard inner product on Fn . Conversely,
suppose that ∥ · ∥ p is induced from an inner product. In this case, consider the standard basis elements e1 , e2 ∈ Fn . Here ∥e1 ∥ p = ∥e2 ∥ p = 1 for every p ∈ [1, ∞],
so by the parallelogram identity (cf. Theorem 2.10), we must have

∥e1 + e2 ∥_p² + ∥e1 − e2 ∥_p² = 2(∥e1 ∥_p² + ∥e2 ∥_p²) = 2(1² + 1²) = 4.

Note that ∥e1 + e2 ∥∞ = ∥e1 − e2 ∥∞ = 1. Since 1² + 1² = 2 < 4, we must have p < ∞ in this case. Then

∥e1 ± e2 ∥ p = (1^p + |±1|^p + (n − 2) · 0^p )^{1/p} = 2^{1/p} .

As a result,
4 = ∥e1 + e2 ∥_p² + ∥e1 − e2 ∥_p² = 2^{2/p} + 2^{2/p} = 2^{1+(2/p)} ,
which entails that p = 2, as desired. ■
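The failure of the parallelogram identity for p ̸= 2 is easy to see numerically on the test vectors e1 , e2 used above; a minimal sketch (assuming NumPy is available):

```python
import numpy as np

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

def pnorm(v, p):
    # l^p norm on F^2, with the usual max-norm convention for p = infinity
    return np.sum(np.abs(v)**p)**(1 / p) if p != np.inf else np.max(np.abs(v))

for p in (1, 1.5, 2, 3, np.inf):
    # Parallelogram identity requires ||e1+e2||^2 + ||e1-e2||^2 = 2(||e1||^2 + ||e2||^2) = 4
    lhs = pnorm(e1 + e2, p)**2 + pnorm(e1 - e2, p)**2   # = 2^(1 + 2/p) for p < inf
    assert np.isclose(lhs, 4.0) == (p == 2)
```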

Theorem 2.20 (Norms Derived From Inner Products). Let V be a linear space. Then a norm ∥ · ∥ on V is derived
from an inner product if and only if the parallelogram identity holds, namely for all x, y ∈ V ,

∥x + y∥2 + ∥x − y∥2 = 2(∥x∥2 + ∥y∥2 ). (62)

Proof. The necessity of parallelogram identity is clear from Theorem 2.10, so it suffices to prove its sufficiency then: Suppose that the
parallelogram identity hold for a norm ∥ · ∥ on V .
Case 1. When F = R, we define

⟨·, ·⟩ : V ×V → R : (x, y) 7→ (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²). (63)

We then verify that ⟨·, ·⟩ is an inner product on V :
• For every x ∈ V ,

⟨x, x⟩ = (1/2)(∥x + x∥² − ∥x∥² − ∥x∥²) = (1/2)(∥2x∥² − 2∥x∥²) = (1/2)(4∥x∥² − 2∥x∥²) = ∥x∥² ≥ 0.
Since ∥ · ∥ is a norm on V ,
⟨x, x⟩ = 0 ⇐⇒ ∥x∥2 = 0 ⇐⇒ ∥x∥ = 0 ⇐⇒ x = 0V .
In addition,

⟨x, 0V ⟩ = (1/2)(∥x + 0V ∥² − ∥x∥² − ∥0V ∥²) = (1/2)(∥x∥² − ∥x∥² − 0²) = 0.
• For every x, y ∈ V , it is clear that

⟨x, y⟩ − ⟨y, x⟩ = (1/2)((∥x + y∥² − ∥x∥² − ∥y∥²) − (∥y + x∥² − ∥y∥² − ∥x∥²)) = 0,

so ⟨x, y⟩ = ⟨y, x⟩ holds for sure. Together with the identity proven in the last item, we have ⟨0V , x⟩ = ⟨x, 0V ⟩ = 0 for all x ∈ V .
• Let x, x′ , y ∈ V be arbitrary. Then

4(⟨x, y⟩ + ⟨x′ , y⟩) = 2((∥x + y∥² − ∥x∥² − ∥y∥²) + (∥x′ + y∥² − ∥x′ ∥² − ∥y∥²))
= 2(∥x + y∥² + ∥x′ + y∥²) − 2(∥x∥² + ∥x′ ∥²) − 4∥y∥²
(∗)= (∥x + x′ + 2y∥² + ∥x − x′ ∥²) − (∥x + x′ ∥² + ∥x − x′ ∥²) − 4∥y∥²
= ∥x + x′ + 2y∥² − ∥x + x′ ∥² − 4∥y∥²
(∗)= (2(∥x + x′ + y∥² + ∥y∥²) − ∥x + x′ ∥²) − ∥x + x′ ∥² − 4∥y∥²
= 2∥x + x′ + y∥² − 2∥x + x′ ∥² − 2∥y∥² = 4⟨x + x′ , y⟩.

Here the equalities (∗) follow from the parallelogram identity. Consequently, we have ⟨x, y⟩ + ⟨x′ , y⟩ = ⟨x + x′ , y⟩, proving the additivity on the
first component.
The proof for the homogeneity is slightly involved: Let x, y ∈ V be arbitrary.
• First, we claim that ⟨bx, y⟩ = b⟨x, y⟩ for every b ∈ Q. Let us consider the integer exponents temporarily. The case for non-negative
integers can be handled by induction: Observe that

⟨0x, y⟩ = ⟨0V , y⟩ = 0 = 0⟨x, y⟩.

Assume the identity holds for some n ≥ 0. Then

⟨(n + 1)x, y⟩ = ⟨nx + x, y⟩ = ⟨nx, y⟩ + ⟨x, y⟩ = n⟨x, y⟩ + ⟨x, y⟩ = (n + 1)⟨x, y⟩.

Furthermore, since
⟨x, y⟩ + ⟨−x, y⟩ = ⟨x + (−x), y⟩ = ⟨0V , y⟩ = 0,

we indeed have ⟨−x, y⟩ = −⟨x, y⟩. Consequently, for n > 0,

⟨(−n)x, y⟩ = ⟨−(nx), y⟩ = −⟨nx, y⟩ = −(n⟨x, y⟩) = (−n)⟨x, y⟩.

The desired identity thus follows. Furthermore, given any b = m/n ∈ Q, since

n⟨bx, y⟩ = ⟨nbx, y⟩ = ⟨mx, y⟩ = m⟨x, y⟩,

we indeed have ⟨bx, y⟩ = (m/n)⟨x, y⟩ = b⟨x, y⟩, as desired.


• Next, we prove the CBS inequality now, namely |⟨x, y⟩| ≤ ∥x∥∥y∥: There is nothing to prove if x = 0V . When x ̸= 0V , consider the
following real quadratic polynomial
p(t) := t 2 ∥x∥2 + 2t⟨x, y⟩ + ∥y∥2 .
As we can see, when t ∈ Q, we have t⟨x, y⟩ = ⟨tx, y⟩ whence

p(t) = t²∥x∥² + 2⟨tx, y⟩ + ∥y∥²
= ∥tx∥² + 2 · (1/2)(∥tx + y∥² − ∥tx∥² − ∥y∥²) + ∥y∥² = ∥tx + y∥².
Since Q is dense in R, it follows that p(t) = ∥tx + y∥2 for all t ∈ R. Observe that the discriminant of p is given by

∆ = (2⟨x, y⟩)2 − 4∥x∥2 ∥y∥2 = 4(⟨x, y⟩2 − ∥x∥2 ∥y∥2 ).

Since p(t) = ∥tx + y∥2 ≥ 0 and its leading coefficient is ∥x∥2 > 0, we must have ∆ ≤ 0. Thus, it is clear that ⟨x, y⟩2 ≤ ∥x∥2 ∥y∥2 ,
hence the desired inequality follows.
Finally, let a ∈ R be arbitrary. Then for every b ∈ Q, since ⟨bx, y⟩ = b⟨x, y⟩, it follows that

⟨ax, y⟩ − a⟨x, y⟩ = ⟨ax, y⟩ − ⟨bx, y⟩ + b⟨x, y⟩ − a⟨x, y⟩


= ⟨ax − bx, y⟩ − (a − b)⟨x, y⟩ = ⟨(a − b)x, y⟩ − (a − b)⟨x, y⟩.

Consequently, by CBS inequality,

|⟨ax, y⟩ − a⟨x, y⟩| = |⟨(a − b)x, y⟩ − (a − b)⟨x, y⟩| ≤ |⟨(a − b)x, y⟩| + |a − b||⟨x, y⟩|
≤ ∥(a − b)x∥∥y∥ + |a − b|∥x∥∥y∥ = 2|a − b|∥x∥∥y∥.

Then for every ε > 0, because Q is dense in R, we can always let b ∈ Q be such that

|a − b| < ε/(1 ∨ 2∥x∥∥y∥)

and hence
|⟨ax, y⟩ − a⟨x, y⟩| ≤ 2|a − b|∥x∥∥y∥ < ε.
Therefore, we indeed have ⟨ax, y⟩ = a⟨x, y⟩, proving that ⟨·, ·⟩ is an inner product on V .
Case 2. When F = C, we define

⟨·, ·⟩ : V ×V → C : (x, y) 7→ (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²) + (i/2)(∥x + iy∥² − ∥x∥² − ∥y∥²). (64)

First, by regarding V as a real linear space, it follows from Case 1 that the following map

⟨·, ·⟩′ : V ×V → R : (x, y) 7→ (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²)
2
is an inner product on V with ∥x∥2 = ⟨x, x⟩′ . We then verify that ⟨x, iy⟩′ = −⟨ix, y⟩′ for all x, y ∈ V : Observe that ∥ix∥ = |i|∥x∥ = ∥x∥,
∥iy∥ = |i|∥y∥ = ∥y∥, and also
∥ix + y∥ = ∥i(x − iy)∥ = |i|∥x − iy∥ = ∥x − iy∥.

Therefore,

⟨x, iy⟩′ + ⟨ix, y⟩′ = (1/2)(∥x + iy∥² − ∥x∥² − ∥iy∥²) + (1/2)(∥ix + y∥² − ∥ix∥² − ∥y∥²)
= (1/2)(∥x + iy∥² − ∥x∥² − ∥iy∥² + ∥x − iy∥² − ∥x∥² − ∥iy∥²)
= (1/2)(∥x + iy∥² + ∥x − iy∥²) − (∥x∥² + ∥iy∥²) = 0,

where the last equality follows from the parallelogram identity. As a result, by Theorem 2.4, the map ⟨·, ·⟩ : V ×V → C, in which for every
x, y ∈ V

⟨x, y⟩ := ⟨x, y⟩′ + i⟨x, iy⟩′ = (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²) + (i/2)(∥x + iy∥² − ∥x∥² − ∥iy∥²)
= (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²) + (i/2)(∥x + iy∥² − ∥x∥² − ∥y∥²),

is an inner product on V over C such that ⟨x, x⟩ = ⟨x, x⟩′ = ∥x∥² for all x ∈ V . ■
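The recipe (64) can be tried out concretely: starting from the Euclidean norm on Cⁿ (which satisfies the parallelogram identity), it recovers the standard inner product. An illustrative sketch, assuming NumPy is available:

```python
import numpy as np

norm = np.linalg.norm   # the Euclidean norm satisfies the parallelogram identity

def recovered(x, y):
    # (64): <x, y> = (1/2)(||x+y||^2 - ||x||^2 - ||y||^2)
    #              + (i/2)(||x+iy||^2 - ||x||^2 - ||y||^2)
    re = (norm(x + y)**2 - norm(x)**2 - norm(y)**2) / 2
    im = (norm(x + 1j * y)**2 - norm(x)**2 - norm(y)**2) / 2
    return re + 1j * im

rng = np.random.default_rng(9)
x = rng.normal(size=5) + 1j * rng.normal(size=5)
y = rng.normal(size=5) + 1j * rng.normal(size=5)
# Agrees with the standard inner product <x, y> = y* x
assert np.isclose(recovered(x, y), np.vdot(y, x))
```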

Remark 2.12. The definition for the inner product when F = R is inspired from Theorem 2.10: In this case, for
every x, y ∈ V , we have
∥x + y∥2 = ∥x∥2 + ∥y∥2 + 2⟨x, y⟩,

so necessarily ⟨x, y⟩ must be of the form given above.

2.3 Orthogonality in Inner Product Spaces


Definition 2.5. Let V be an inner product space.

1. For every x, y ∈ V , the element x is orthogonal to y if ⟨x, y⟩ = 0, denoted by x ⊥ y.

2. Two nonempty subsets A, B ⊆ V are called orthogonal, denoted by A ⊥ B, if x ⊥ y for all x ∈ A and y ∈ B.
When A = {x} is a singleton, we shall write x ⊥ B in place of {x} ⊥ B.

Theorem 2.21. Let V be an inner product space.

1. For every x, y ∈ V ,
x ⊥ y ⇐⇒ y ⊥ x =⇒ ∀a, b ∈ F : ax ⊥ by. (65)

Furthermore, if x ⊥ y, then ∥x + y∥ = ∥x − y∥. The converse is true when F = R.

2. For every x ∈ V ,
x = 0V ⇐⇒ ∀y ∈ V : x ⊥ y. (66)

3. For every x, y ∈ V , if ⟨x, z⟩ = ⟨y, z⟩ for all z ∈ V , then x = y.

4. For every x ∈ V and family (xi )i∈I in V , if x ⊥ xi for all i ∈ I, then x ⊥ Span(xi | i ∈ I) as well.
Proof. 1. Let x, y ∈ V . Since ⟨y, x⟩ = \overline{⟨x, y⟩} and \overline{0} = 0, we can see that

x ⊥ y ⇐⇒ ⟨x, y⟩ = 0 ⇐⇒ ⟨y, x⟩ = 0 ⇐⇒ y ⊥ x.

When x ⊥ y, for every a, b ∈ F,

⟨ax, by⟩ = ab̄⟨x, y⟩ = ab̄(0) = 0,

hence ax ⊥ by as well. Furthermore, by Theorem 2.10, we have

∥x ± y∥² = ∥x∥² + ∥y∥² ± 2ℜ⟨x, y⟩.

Therefore,
∥x + y∥ = ∥x − y∥ ⇐⇒ ℜ⟨x, y⟩ = 0 ⇐= ⟨x, y⟩ = 0,
where the last converse is true when F = R.
2. Let x ∈ V be arbitrary. By Theorem 2.1, we have ⟨0V , x⟩ = 0 whence 0V ⊥ x. Furthermore, if x ⊥ y for all y ∈ V , by putting y := x,
we see that ⟨x, x⟩ = 0 whence x = 0V in this case.
3. Let x, y ∈ V . Suppose that ⟨x, z⟩ = ⟨y, z⟩ for all z ∈ V . Then for each z ∈ V , we have ⟨x − y, z⟩ = ⟨x, z⟩ − ⟨y, z⟩ = 0. That is, (x − y) ⊥ z
for all z ∈ V , hence by 2, we have x − y = 0V , namely x = y.
4. Let x ∈ V and (xi )i∈I be a family of elements in V . Suppose that x ⊥ xi for all i ∈ I. First, if I = ∅, then Span(∅) = {0V }, which is
certainly orthogonal to x. Therefore, we may assume that I is also nonempty in this case. Then for every y ∈ Span(xi | i ∈ I), suppose that
y = ci1 xi1 + · · · + cik xik for some i1 , . . . , ik ∈ I and ci1 , . . . , cik ∈ F,

⟨x, y⟩ = ⟨x, ci1 xi1 + · · · + cik xik ⟩
= c̄i1 ⟨x, xi1 ⟩ + · · · + c̄ik ⟨x, xik ⟩ = c̄i1 0 + · · · + c̄ik 0 = 0.

Consequently, we also have x ⊥ y, implying that x ⊥ Span(xi | i ∈ I). ■

Remark 2.13. Nevertheless, “being orthogonal to” is not transitive, namely if x ⊥ y and y ⊥ z, we do not neces-
sarily have x ⊥ z:
Consider x = [1, 0, 0]T , y = [0, 0, 1]T , and z = [1, 1, 0]T in R3 . It is clear that ⟨x, y⟩ = ⟨y, z⟩ = 0, but ⟨x, z⟩ = 1 ̸= 0.

Definition 2.6. Let V be an inner product space. A nonempty subset S of nonzero elements in V is called
orthogonal if x ⊥ y for every distinct elements x, y ∈ S. If furthermore ∥x∥ = 1 for all x ∈ S, such a set S is said to
be orthonormal.

Remark 2.14. Here are some remarks on the preceding definition:

• Clearly, given an orthogonal set S, the set S′ := {∥x∥−1 x | x ∈ S} is orthonormal.

• In practice, the condition of a family (xi )i∈I of nonzero elements being orthonormal is rephrased as fol-
lows: For every i, j ∈ I,

⟨xi , x j ⟩ = δi, j := 1 if i = j, and 0 if i ̸= j, (67)

where δi, j is called the Kronecker delta.

• By the generalized Pythagorean theorem (cf. Corollary 2.11), if x1 , . . . , xn ∈ V are orthogonal, for every
c1 , . . . , cn ∈ F,
∥c1 x1 + · · · + cn xn ∥2 = |c1 |2 ∥x1 ∥2 + · · · + |cn |2 ∥xn ∥2 . (68)

Clearly, if x1 , . . . , xn are orthonormal, then given any c1 , . . . , cn ∈ F,

∥c1 x1 + · · · + cn xn ∥2 = |c1 |2 + · · · + |cn |2 . (69)

Theorem 2.22. Every orthogonal subset of an inner product space is linearly independent.

Proof. Let V be an inner product space, S ⊆ V be orthogonal, and x1 , . . . , xn ∈ S be distinct elements. Suppose that 0V = c1 x1 + · · · + cn xn
for some c1 , . . . , cn ∈ F. Then fix an arbitrary k = 1, . . . , n. As we can see,
0 = ⟨0V , xk ⟩ = ⟨∑_{j=1}^n cj xj , xk ⟩ = ∑_{j=1}^n cj ⟨xj , xk ⟩ = ck ∥xk ∥².

Since xk ̸= 0V , we must have ∥xk ∥ > 0 whence ck = 0. Therefore, we indeed have c1 = · · · = cn = 0, implying that x1 , . . . , xn are linearly
independent. Since x1 , . . . , xn are arbitrary, the set S is linearly independent as well. ■

Definition 2.7. Let V be an inner product space and u ∈ V be nonzero. For every x ∈ V , its orthogonal projection
onto u is defined as
⟨x, u⟩
Proju (x) := u = ⟨x, û⟩û, (70)
∥u∥2
where û := ∥u∥−1 u is the normalization of u.

Remark 2.15. Clearly,


Proju (x) = 0V ⇐⇒ ⟨x, u⟩ = 0 ⇐⇒ x ⊥ u. (71)

Theorem 2.23 (Orthogonal Projection). Let V be an inner product space and u ∈ V be nonzero.

1. For every x ∈ V and λ ∈ F,

Projλ u (x) = Proju (x) if λ ̸= 0 and Proju (x − λ u) = Proju (x) − λ u, (72)

In particular, we have Proju (λ u) = λ u for all λ ∈ F.

2. For every x ∈ V , the element x⊥u := x − Proju (x) is orthogonal to u with (x − λ u)⊥u = x⊥u for all λ ∈ F
and
∥x∥ ≥ ∥x⊥u ∥ = inf_{λ ∈F} ∥x − λ u∥. (73)

In particular, the last infimum is attained if and only if λ = ⟨x, u⟩/∥u∥2 , namely the element Proju (x) is
the unique element in the subspace Span(u) closest to x.
Proof. 1. For every λ ∈ F,
Proju (x − λ u) = (⟨x − λ u, u⟩/∥u∥²) u = ((⟨x, u⟩ − λ ∥u∥²)/∥u∥²) u = Proju (x) − λ u.
Consequently,
Proju (λ u) = Proju (λ u − λ u) + λ u = Proju (0V ) + λ u = 0V + λ u = λ u.
Furthermore, when λ is nonzero,
Projλ u (x) = (⟨x, λ u⟩/∥λ u∥²)(λ u) = (λ̄ ⟨x, u⟩λ /(|λ |²∥u∥²)) u = (⟨x, u⟩/∥u∥²) u = Proju (x).
2. Clearly,

⟨x⊥u , u⟩ = ⟨x − Proju (x), u⟩ = ⟨x, u⟩ − ⟨Proju (x), u⟩ = ⟨x, u⟩ − (⟨x, u⟩/∥u∥²)⟨u, u⟩ = ⟨x, u⟩ − ⟨x, u⟩ = 0,

so x⊥u ⊥ u holds for sure. As a result, we also have x⊥u ⊥ Proju (x), so

∥x∥2 = ∥x⊥u + Proju (x)∥2 = ∥x⊥u ∥2 + ∥ Proju (x)∥2 ≥ ∥x⊥u ∥2 .

Finally, for every λ ∈ F, by 1,

(x − λ u)⊥u = (x − λ u) − Proju (x − λ u) = (x − λ u) − (Proju (x) − λ u)


= x − Proju (x) = x⊥u .

As a result,

∥x − λ u∥² = ∥ Proju (x − λ u)∥² + ∥(x − λ u)⊥u ∥² = ∥ Proju (x) − λ u∥² + ∥x⊥u ∥²
= |⟨x, u⟩/∥u∥² − λ |² ∥u∥² + ∥x⊥u ∥² ≥ ∥x⊥u ∥².

Clearly, here the equality is attained if and only if λ = ⟨x, u⟩/∥u∥2 . ■
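To make the theorem concrete, here is a minimal numerical sketch over R³ with the standard dot product; the vectors x and u are arbitrary illustrative choices:

```python
# Sketch of Theorem 2.23: Proj_u(x) = (<x,u>/||u||^2) u is the closest point
# to x in Span(u), and x - Proj_u(x) is orthogonal to u.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj(x, u):
    c = dot(x, u) / dot(u, u)   # the optimal coefficient λ = <x,u>/||u||^2
    return [c * ui for ui in u]

x, u = [3.0, 1.0, 2.0], [1.0, 1.0, 0.0]
p = proj(x, u)                          # [2.0, 2.0, 0.0]
r = [xi - pi for xi, pi in zip(x, p)]   # the component x_{⊥u}
print(p, dot(r, u))                     # residual is orthogonal to u

def dist2(lam):
    # squared distance ||x - λu||^2 as a function of λ
    w = [xi - lam * ui for xi, ui in zip(x, u)]
    return dot(w, w)

print(dist2(2.0), dist2(1.5), dist2(2.5))  # 6.0 6.5 6.5
```

Moving λ away from ⟨x, u⟩/∥u∥² = 2 strictly increases the distance, matching (73).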

Remark 2.16. In particular, we also see that

∥x∥² ≥ ∥ Proju (x)∥² = (|⟨x, u⟩|²/∥u∥⁴)∥u∥² = |⟨x, u⟩|²/∥u∥²,

so we also have the CBS inequality

|⟨x, u⟩|² ≤ ∥x∥²∥u∥².

Furthermore, now we can see that the equality is attained if and only if x = Proju (x), namely x and u are collinear.

Corollary 2.24. Let V be an inner product space and u ∈ V be a unit element, namely ∥u∥ = 1. Then for every
x, y ∈ V ,
⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩ = ⟨x − Proju (x), y − Proju (y)⟩, (74)

so for every λ , µ ∈ F,
|⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩| ≤ ∥x − λ u∥∥y − µu∥. (75)
Proof. Since ∥u∥ = 1 now, we have Proju (x) = ⟨x, u⟩u and Proju (y) = ⟨y, u⟩u. Therefore,

⟨x − Proju (x), y − Proju (y)⟩ = ⟨x − ⟨x, u⟩u, y − ⟨y, u⟩u⟩
= ⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩ − ⟨x, u⟩⟨u, y⟩ + ⟨x, u⟩⟨u, y⟩∥u∥²
= ⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩.

Furthermore,

|⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩| = |⟨x − Proju (x), y − Proju (y)⟩| ≤ ∥x − Proju (x)∥∥y − Proju (y)∥
≤ ∥x − λ u∥∥y − µu∥,

for every λ , µ ∈ F. ■

Theorem 2.25 (Criterion for Orthogonality). Let V be an inner product space. Then for every x, y ∈ V , we have
x ⊥ y if and only if ∥x∥ ≤ ∥x − cy∥ for all c ∈ F.
Proof. Let x, y ∈ V . First, if x ⊥ y, for every c ∈ F, by the Pythagorean theorem,

∥x − cy∥² = ∥x∥² + ∥−cy∥² = ∥x∥² + |c|²∥y∥² ≥ ∥x∥²,

hence ∥x − cy∥ ≥ ∥x∥ holds in this case.

Conversely, suppose that ∥x∥ ≤ ∥x − cy∥ for all c ∈ F. There is nothing to prove if y = 0V , so we may assume that y is nonzero. In this
case, consider
z := x − Projy (x) = x − (⟨x, y⟩/∥y∥²) y.
Here by Theorem 2.23, we have z ⊥ y. Then by Pythagorean theorem and our hypothesis,

∥x∥2 = ∥ Projy (x)∥2 + ∥z∥2 ≥ ∥z∥2 ≥ ∥x∥2 .

Consequently, we must have ∥ Projy (x)∥ = 0 whence Projy (x) = 0V . In this case, we see that x = z, hence x ⊥ y holds for sure. ■

Alternative Proof of Sufficiency. Suppose that ∥x∥ ≤ ∥x − cy∥ for all c ∈ F. Consequently, for every c ∈ F,

∥x∥² ≤ ∥x − cy∥² = ∥x∥² + |c|²∥y∥² − 2ℜ(c̄⟨x, y⟩),

or equivalently,

ℜ(c̄⟨x, y⟩) ≤ |c|²∥y∥²/2.

Assume, to the contrary, that ⟨x, y⟩ ̸= 0. Then for each t > 0, we may put c := t⟨x, y⟩/|⟨x, y⟩|, so that |c| = t and c̄⟨x, y⟩ = t|⟨x, y⟩|²/|⟨x, y⟩| = t|⟨x, y⟩|, whence

ℜ(c̄⟨x, y⟩) = t|⟨x, y⟩|.

Consequently, we can see that

t|⟨x, y⟩| ≤ t²∥y∥²/2,

namely |⟨x, y⟩| ≤ t∥y∥²/2. By letting t → 0⁺, it follows that ⟨x, y⟩ = 0, a contradiction. ■

Definition 2.8. Let V be an inner product space. Then for every A ⊆ V , its orthogonal complement is defined as

A⊥ := {x ∈ V | x ⊥ A} = {x ∈ V | ∀y ∈ A : ⟨x, y⟩ = 0}. (76)

In particular, we adopt the convention that ∅⊥ = V .

Lemma 2.26. Let V be an inner product space. Then

{0V }⊥ = V and V ⊥ = {0V }. (77)

Furthermore, for every A, B ⊆ V ,


A ⊆ B =⇒ B⊥ ⊆ A⊥ . (78)
Proof. 1. By Statement 2 of Theorem 2.21, every element in V is orthogonal to 0V , hence {0V }⊥ = V . Meanwhile, for each x ∈ V , if x ⊥ V ,
also by Statement 2 there, we must have x = 0V . Consequently, V ⊥ = {0V }.
2. Let A ⊆ B ⊆ V be arbitrary. Then every element x ∈ B⊥ is orthogonal to every element in B, especially to every element in A, so
x ∈ A⊥ as well. This shows that B⊥ ⊆ A⊥ . ■

Theorem 2.27. Let V be an inner product space with A ⊆ V . Then its orthogonal complement A⊥ is a closed
subspace of V , while also

A⊥ = Span(A)⊥ = \overline{Span(A)}⊥ and A ⊆ (A⊥ )⊥ . (79)

Whenever A ∩ A⊥ ̸= ∅, we always have A ∩ A⊥ = {0V }.

Proof. First, let x1 , x2 ∈ A⊥ and c ∈ F be arbitrary. Then for every y ∈ A,

⟨cx1 + x2 , y⟩ = c⟨x1 , y⟩ + ⟨x2 , y⟩ = c(0) + 0 = 0,

so cx1 + x2 ∈ A⊥ as well. This shows that A⊥ is a subspace of V . Furthermore, let (xn )n∈N be a sequence in A⊥ converging to some x ∈ V .
Note that the inner product is continuous, so
⟨x, y⟩ = lim_{n→∞} ⟨xn , y⟩ = lim_{n→∞} 0 = 0.

Therefore, we must have x ∈ A⊥ in this case, so A⊥ is also closed in V .


• Since every element in A is orthogonal to every element in A⊥ , it is immediate that A ⊆ (A⊥ )⊥ .
• Suppose that A ∩ A⊥ ̸= ∅. Then for each x ∈ A ∩ A⊥ , since x ⊥ x, we must have x = 0V . Consequently, the identity A ∩ A⊥ = {0V }
holds in this case.
• Next, since A ⊆ Span(A), it is clear that Span(A)⊥ ⊆ A⊥ . Conversely, for every x ∈ A⊥ , by Statement 4 of Theorem 2.21, we have
x ∈ Span(A)⊥ as well. Therefore, the identity A⊥ = Span(A)⊥ is true.

• Finally, because Span(A) ⊆ \overline{Span(A)}, we also have \overline{Span(A)}⊥ ⊆ Span(A)⊥ . Conversely, let x ∈ Span(A)⊥ and y ∈ \overline{Span(A)}. Then
there is a sequence (yn )n∈N in Span(A) such that yn → y. Since x ⊥ yn for all n, by the continuity of inner products,

⟨x, y⟩ = lim_{n→∞} ⟨x, yn ⟩ = lim_{n→∞} 0 = 0.

Therefore, we also have x ∈ \overline{Span(A)}⊥ , completing the proof. ■

Remark 2.17. Consequently, it suffices to discuss the orthogonal complement of a subspace in the future.

Theorem 2.28. Let V be an inner product space with subspace W . Then for every x ∈ V ,

x ∈ W ⊥ ⇐⇒ ∀y ∈ W : ∥x − y∥ ≥ ∥x∥. (80)

That is, x ∈ W ⊥ if and only if 0V is the closest point to x in W .


Proof. Let x ∈ V . If x ∈ W ⊥ , then for every y ∈ W , since x ⊥ y, we have

∥x − y∥2 = ∥x∥2 + ∥−y∥2 = ∥x∥2 + ∥y∥2 ≥ ∥x∥2 .

Conversely, suppose that ∥x − y∥ ≥ ∥x∥ for all y ∈ W . Fix one arbitrary y ∈ W . Then for every λ ∈ F, since λ y ∈ W , we shall have
∥x∥ ≤ ∥x − λ y∥. Thus, by Theorem 2.25, we can see that y ⊥ x. Since y ∈ W is arbitrary, it follows that x ⊥ W whence x ∈ W ⊥ . ■

2.4 Orthogonal Projection Onto Finite-Dimensional Subspaces


Theorem 2.29 (Closest Point and Bessel’s Inequality). Let V be an inner product space and x1 , . . . , xn ∈ V be
orthonormal. Then for each x ∈ V , the closest point of x in the n-dimensional subspace W := Span(x1 , . . . , xn ) is
precisely
ProjW (x) := ⟨x, x1 ⟩x1 + · · · + ⟨x, xn ⟩xn ∈ W, (81)

where
x⊥W := x − ProjW (x) ∈ W ⊥ and ∥ ProjW (x)∥2 = |⟨x, x1 ⟩|2 + · · · + |⟨x, xn ⟩|2 , (82)

with

d(x,W )² = ∥x − ProjW (x)∥² = ∥x∥² − ∥ ProjW (x)∥² = ∥x∥² − ∑_{k=1}^n |⟨x, xk ⟩|².

In particular,
∑_{k=1}^n |⟨x, xk ⟩|² ≤ ∥x∥², (83)

in which the equality is attained if and only if x = ProjW (x), which is also equivalent to that x ∈ W .
Proof. Let y := ⟨x, x1 ⟩x1 + · · · + ⟨x, xn ⟩xn ∈ W . First, because x1 , . . . , xn are orthonormal, it is clear that

∥y∥2 = |⟨x, x1 ⟩|2 + · · · + |⟨x, xn ⟩|2 .

Furthermore, for every z ∈ W , suppose that z = c1 x1 + · · · + cn xn with c1 , . . . , cn ∈ F,

⟨x − y, z⟩ = ⟨x, z⟩ − ⟨y, z⟩ = ⟨x, ∑_{k=1}^n ck xk ⟩ − ⟨∑_{j=1}^n ⟨x, xj ⟩xj , ∑_{k=1}^n ck xk ⟩
= ∑_{k=1}^n c̄k ⟨x, xk ⟩ − ∑_{j=1}^n ∑_{k=1}^n ⟨x, xj ⟩ c̄k ⟨xj , xk ⟩
= ∑_{k=1}^n c̄k ⟨x, xk ⟩ − ∑_{k=1}^n ⟨x, xk ⟩ c̄k = 0.

Therefore, we indeed have x − y ∈ W ⊥ , and in particular, (x − y) ⊥ y. By the Pythagorean theorem, we see that

∥x∥² = ∥(x − y) + y∥² = ∥x − y∥² + ∥y∥² ≥ ∥y∥² = ∑_{k=1}^n |⟨x, xk ⟩|².

Clearly, here the equality is attained if and only if x − y = 0V , namely x = y. We then show that x = y if and only if x ∈ W :
• If x = y, since y ∈ W , it is certain that x ∈ W as well in this case.
• Conversely, suppose that x ∈ W . Then we can write x = α1 x1 + · · · + αn xn for some α1 , . . . , αn ∈ F. For each k = 1, . . . , n,

⟨x, xk ⟩ = ⟨∑_{j=1}^n αj xj , xk ⟩ = ∑_{j=1}^n αj ⟨xj , xk ⟩ = αk (1) = αk .

Therefore, we have x = ⟨x, x1 ⟩x1 + · · · + ⟨x, xn ⟩xn = y.


Finally, we prove that d(x,W ) = ∥x − y∥ and y ∈ W is the unique point attaining the equality: For every z ∈ W , since x − y ∈ W ⊥ and y − z ∈ W ,

∥x − z∥² = ∥(x − y) + (y − z)∥² = ∥x − y∥² + ∥y − z∥² ≥ ∥x − y∥².

As we can see, the equality is attained if and only if ∥y − z∥ = 0, namely y = z. The proof is complete. ■
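The formulas (81)–(83) can be checked numerically; here is a minimal sketch in R³ with a hand-picked orthonormal pair and the standard dot product (the vectors are illustrative choices):

```python
# Project x onto W = Span(x1, x2) for an orthonormal pair and check Bessel.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x1, x2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]   # orthonormal in R^3
x = [2.0, -3.0, 5.0]

c = [dot(x, x1), dot(x, x2)]                        # coefficients <x, x_k>
p = [c[0] * a + c[1] * b for a, b in zip(x1, x2)]   # Proj_W(x) = [2, -3, 0]
r = [xi - pi for xi, pi in zip(x, p)]               # x_{⊥W}, lies in W⊥

print(dot(r, x1), dot(r, x2))               # both 0.0
print(c[0] ** 2 + c[1] ** 2, dot(x, x))     # Bessel: 13.0 <= 38.0
print(dot(x, x) - (c[0] ** 2 + c[1] ** 2))  # d(x, W)^2 = 25.0
```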

Remark 2.18. The above theorem generalizes the orthogonal projection onto a single nonzero element (cf.
Theorem 2.23).

Corollary 2.30 (Bessel’s Inequality, Infinite Version). Let V be an inner product space and let (en )n∈N be an
orthonormal sequence of elements in V . Then for every x ∈ V ,

∑_{n=1}^∞ |⟨x, en ⟩|² ≤ ∥x∥². (84)

Furthermore, the orthonormal sequence (en )n∈N converges weakly to 0V but is not strongly convergent.
Proof. By Bessel’s inequality, we have

∥x∥² ≥ ∑_{k=1}^n |⟨x, ek ⟩|², ∀n ∈ N.

Consequently, the non-negative series ∑_{n=1}^∞ |⟨x, en ⟩|² is convergent, implying that ⟨x, en ⟩ → 0 = ⟨x, 0V ⟩ as n → ∞.
In conclusion, we have en → 0V weakly. Meanwhile, assume, to the contrary, that en → x under the derived norm for some x ∈ V . Then by
Theorem 4.5, we have en → x weakly and 1 = ∥en ∥ → ∥x∥, namely ∥x∥ = 1. As a result, for every y ∈ V , we see that ⟨en , y⟩ → 0 = ⟨x, y⟩. By
Theorem 2.21, we must have x = 0V , contrary to the fact that ∥x∥ = 1. ■

Theorem 2.31 (Gram-Schmidt Process). Let V be an inner product space and (xn )n∈N be a sequence of linearly
independent elements in V . Define recursively

y1 := x1 , e1 := ∥y1 ∥−1 y1 , (85)

yk := xk − ∑_{n=1}^{k−1} ⟨xk , en ⟩en , ek := ∥yk ∥−1 yk , k ≥ 2. (86)

Then the sequence (yn )n∈N is orthogonal and (en )n∈N is orthonormal. Furthermore, for every n ∈ N,

Span(x1 , . . . , xn ) = Span(y1 , . . . , yn ) = Span(e1 , . . . , en ). (87)

Proof. First, because (xn )n∈N is linearly independent, each xn is certainly nonzero. We then prove by induction on n that y1 , . . . , yn are orthogonal and span
the same subspace as x1 , . . . , xn , in which case e1 , . . . , en is necessarily orthonormal:
• Since x1 = y1 , it is obvious that Span(x1 ) = Span(y1 ). Furthermore, since e1 is a scalar multiple of y1 , we have Span(e1 ) = Span(y1 )
as well.
• Now suppose that the assertions hold for some n ≥ 1. By the preceding theorem, we see that

yn+1 = xn+1 − ∑_{k=1}^n ⟨xn+1 , ek ⟩ek ∈ Span(e1 , . . . , en )⊥ = Span(y1 , . . . , yn )⊥ = Span(x1 , . . . , xn )⊥ ,

so yn+1 ⊥ yk for all k = 1, . . . , n. Furthermore, because xn+1 ∉ Span(x1 , . . . , xn ), we see that yn+1 ̸= 0V . Next, observe that

yn+1 ∈ Span(x1 , . . . , xn , xn+1 ) and Span(y1 , . . . , yn ) = Span(x1 , . . . , xn ) ⊆ Span(x1 , . . . , xn , xn+1 ),

so y1 , . . . , yn , yn+1 ∈ Span(x1 , . . . , xn , xn+1 ). Note that dim(Span(x1 , . . . , xn , xn+1 )) = n + 1, while y1 , . . . , yn , yn+1 , being orthogonal,
are also linearly independent. Then y1 , . . . , yn , yn+1 must also be a base of Span(x1 , . . . , xn , xn+1 ), namely

Span(x1 , . . . , xn , xn+1 ) = Span(y1 , . . . , yn , yn+1 ).

Finally, because e1 , . . . , en+1 are the respective normalizations of y1 , . . . , yn+1 , we have Span(e1 , . . . , en+1 ) = Span(y1 , . . . , yn+1 ) as
well. ■
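The recursion (85)–(86) translates directly into code; below is a minimal sketch over Rⁿ with the standard dot product (it assumes the inputs are linearly independent, as in the theorem):

```python
# Gram-Schmidt over R^n, following (85)-(86):
#   y_k = x_k - sum_{j<k} <x_k, e_j> e_j,   e_k = y_k / ||y_k||.
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs):
    es = []
    for x in xs:
        y = list(x)
        for e in es:
            c = dot(x, e)                        # the coefficient <x_k, e_j>
            y = [yi - c * ei for yi, ei in zip(y, e)]
        n = sqrt(dot(y, y))
        # For dependent inputs n would be 0 here (cf. Remark 2.19); the
        # theorem's independence hypothesis guarantees n > 0.
        es.append([yi / n for yi in y])
    return es

es = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
print(dot(es[0], es[1]), dot(es[0], es[0]))  # ≈ 0.0 and ≈ 1.0
```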

Remark 2.19. Here are some special cases for the Gram-Schmidt process:

• If (xn )n∈N is already orthogonal, by easy induction, we certainly have yn = xn for all n ∈ N.

• Instead, if (xn )n∈N is linearly dependent, we shall get yk = 0V at some k: Let k be the smallest index such
that x1 , . . . , xk are linearly independent but x1 , . . . , xk , xk+1 are not. Then again by Theorem 2.29, we must
have yk+1 = 0V in that case.

Definition 2.9. Let V be a finite-dimensional inner product space. An orthonormal base of V is a base in which
the elements are orthonormal.

Remark 2.20. Note that every orthonormal subset is necessarily linearly independent, hence it becomes an
orthonormal base if and only if it is a spanning set of V .

Corollary 2.32. Let V be a nonzero finite-dimensional inner product space. Then every orthonormal subset of
V can be extended to an orthonormal base. In particular, the space V possesses an orthonormal base.
Proof. Let x1 , . . . , xk be orthonormal elements in V .
• If dim(V ) = k, since x1 , . . . , xk are already linearly independent, they certainly constitute an orthonormal base of V .
• Suppose that dim(V ) > k. Then we can find xk+1 , . . . , xn ∈ V such that {x1 , . . . , xk , xk+1 , . . . , xn } is a base of V . Then we apply
Gram-Schmidt process to such set, which yields an orthonormal subset {e1 , . . . , en }. Now

Span(e1 , . . . , en ) = Span(x1 , . . . , xn ) = V,

so {e1 , . . . , en } is a base. Furthermore, since x1 , . . . , xk are orthonormal, we must have e j = x j for all j = 1, . . . , k.
Finally, since V is nonzero, we can fix an arbitrary unit element in it, which is certainly orthonormal. Then by our discussions above, such
unit element extends to an orthonormal base of V , as desired. ■

Theorem 2.33 (Finite-Dimensional Spaces). Let V be a finite-dimensional inner product space with orthonor-
mal base {e1 , . . . , en }.

1. (Fourier Expansion). For every x ∈ V ,

x = ∑_{k=1}^n ⟨x, ek ⟩ek and ∥x∥² = ∑_{k=1}^n |⟨x, ek ⟩|². (88)

2. (Parseval’s Identity). Let β = (e1 , . . . , en ) be the corresponding ordered base of V . Then for every x, y ∈ V ,
[x]β = [⟨x, e1 ⟩, . . . , ⟨x, en ⟩]T and ⟨x, y⟩ = ∑_{k=1}^n ⟨x, ek ⟩⟨ek , y⟩ = ⟨[x]β , [y]β ⟩Fn , (89)

where ⟨·, ·⟩Fn stands for the standard inner product on Fn .

3. (Matrix Representation). Let W be another finite-dimensional inner product space with orthonormal
ordered base γ = (v1 , . . . , vm ). Then for every linear map T : V → W , its matrix representation under β , γ
is given by

γ [T ]β = [⟨Te j , vi ⟩] =

⟨Te1 , v1 ⟩ ⟨Te2 , v1 ⟩ · · · ⟨Ten , v1 ⟩
⟨Te1 , v2 ⟩ ⟨Te2 , v2 ⟩ · · · ⟨Ten , v2 ⟩
.. .. ..
⟨Te1 , vm ⟩ ⟨Te2 , vm ⟩ · · · ⟨Ten , vm ⟩ (90)

Proof. Let x ∈ V be arbitrary. Such expansion is already clear from Theorem 2.29, as the projection of x onto V is itself. We also repeat the
straightforward arguments as follows: Write x = c1 e1 + · · · + cn en . Then for each k = 1, . . . , n,

⟨x, ek ⟩ = ⟨∑_{j=1}^n cj ej , ek ⟩ = ∑_{j=1}^n cj ⟨ej , ek ⟩ = ck (1) = ck .

Therefore, we indeed have


x = ⟨x, e1 ⟩e1 + · · · + ⟨x, en ⟩en .
Since e1 , . . . , en are orthonormal, it follows that
∥x∥2 = |⟨x, e1 ⟩|2 + · · · + |⟨x, en ⟩|2 .

Consider the ordered base β = (e1 , . . . , en ). It is clear that

[x]β = [⟨x, e1 ⟩, . . . , ⟨x, en ⟩]T .

Now let y ∈ V be arbitrary as well. As we can see,

⟨x, y⟩ = ⟨∑_{k=1}^n ⟨x, ek ⟩ek , y⟩ = ∑_{k=1}^n ⟨x, ek ⟩⟨ek , y⟩ = [y]∗β [x]β = ⟨[x]β , [y]β ⟩Fn .

Finally, let W be another finite-dimensional inner product space with orthonormal ordered base γ = (v1 , . . . , vm ), and let T : V → W be a
linear map. Then for each j = 1, . . . , n, as noted above, we have

[Te j ]γ = [⟨Te j , v1 ⟩, ⟨Te j , v2 ⟩, . . . , ⟨Te j , vm ⟩]T .

Therefore,

γ [T ]β = [ [Te1 ]γ [Te2 ]γ · · · [Ten ]γ ] =

⟨Te1 , v1 ⟩ ⟨Te2 , v1 ⟩ · · · ⟨Ten , v1 ⟩
⟨Te1 , v2 ⟩ ⟨Te2 , v2 ⟩ · · · ⟨Ten , v2 ⟩
.. .. ..
⟨Te1 , vm ⟩ ⟨Te2 , vm ⟩ · · · ⟨Ten , vm ⟩ ■
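Statements 1 and 2 can be verified numerically; here is a small sketch in R² with the orthonormal base e1 = (1/√2)(1, 1), e2 = (1/√2)(1, −1), an illustrative choice:

```python
# Fourier expansion and Parseval's identity (Theorem 2.33) in R^2.
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

s = 1 / sqrt(2)
e1, e2 = [s, s], [s, -s]          # an orthonormal base of R^2
x, y = [3.0, 1.0], [2.0, 4.0]

cx = [dot(x, e1), dot(x, e2)]     # coordinate vector [x]_β
cy = [dot(y, e1), dot(y, e2)]

# Fourier expansion recovers x from its coefficients:
rec = [cx[0] * a + cx[1] * b for a, b in zip(e1, e2)]
print([round(t, 12) for t in rec])                    # [3.0, 1.0]
# Parseval: <x, y> equals the inner product of the coordinate vectors.
print(round(dot(x, y), 12), round(dot(cx, cy), 12))   # 10.0 10.0
```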

Theorem 2.34 (Closest Point Property). Let V be an inner product space and W be a finite-dimensional sub-
space. Then for every x ∈ V , there is a unique y ∈ W closest to x, namely

∥x − y∥ = d(x,W ) = inf_{z∈W} ∥x − z∥. (91)

Furthermore, we have
(W ⊥ )⊥ = W and V = W ⊕W ⊥ . (92)
Proof. Let x ∈ V be arbitrary. Since W is finite-dimensional, it contains an orthonormal base {e1 , . . . , em }. Then by Theorem 2.29, the
element
y := ProjW (x) = ⟨x, e1 ⟩e1 + · · · + ⟨x, em ⟩em
is precisely the unique element in W such that ∥x − y∥ = d(x,W ). Furthermore, we also have x − y ∈ W ⊥ , so the decomposition x =
y + (x − y) ∈ W +W ⊥ tells us that V = W +W ⊥ .
• Meanwhile, since W and W ⊥ are both subspaces of V , we have 0V ∈ W ∩W ⊥ , so that W ∩W ⊥ ̸= ∅. Then by Theorem
2.27, we must have W ∩W ⊥ = {0V }. Therefore, we indeed have V = W ⊕W ⊥ , as desired.
• Finally, it is clear from Theorem 2.27 that W ⊆ (W ⊥ )⊥ . Now let x ∈ (W ⊥ )⊥ . Then ProjW (x) ∈ W ⊆ (W ⊥ )⊥ , so x − ProjW (x) ∈
(W ⊥ )⊥ as well. However, since x − ProjW (x) ∈ W ⊥ as well, we then have x − ProjW (x) = 0V , namely x = ProjW (x) ∈ W . The
equality (W ⊥ )⊥ = W thus holds as well. ■

Alternative Proof of the Closest Point. The existence of such point y is also ensured by Theorem 1.32. What remains is to prove for its
uniqueness: If x ∈ W , then we have d(x,W ) = 0. In this case, the only y ∈ W satisfying ∥x − y∥ = 0 = d(x,W ) is x itself. Consequently, we
shall assume that x ∈ / W then. Let y1 , y2 ∈ W be closest points in W to x and λ ∈ (0, 1). As noted in Theorem 1.32, the point λ y1 + (1 − λ )y2
is also closest to x, namely

d(x,W ) = ∥x − (λ y1 + (1 − λ )y2 )∥ = ∥λ (x − y1 ) + (1 − λ )(x − y2 )∥



≤ λ ∥x − y1 ∥ + (1 − λ )∥x − y2 ∥ = d(x,W ).

Note that the equality in the triangle inequality ≤ is attained, so by Theorem 2.7, there exists a non-negative c ∈ R such that

(1 − λ )(x − y2 ) = c(λ (x − y1 )) = (cλ )(x − y1 ).

As we can see, if c = 0, then we have (1 − λ )(x − y2 ) = 0V . Since λ ∈ (0, 1), it follows that x − y2 = 0V , namely x = y2 ∈ W , a contradiction.
Therefore, we should have c > 0 in this case. Note that the above identity is equivalent to

(1 − (c + 1)λ )x = (1 − λ )y2 − cλ y1 .

Here if λ ̸= 1/(c + 1), then

x = ((1 − λ )/(1 − (c + 1)λ )) y2 − (cλ /(1 − (c + 1)λ )) y1 ∈ W,

contrary to our assumption above. Thus, by plugging in λ = 1/(c + 1), we see that

0V = (1 − 1/(c + 1)) y2 − c · (1/(c + 1)) y1 = (c/(c + 1))(y2 − y1 ).
Because c > 0, we also have c/(c + 1) > 0 whence y1 = y2 , as desired. ■

Definition 2.10. Let V be an inner product space and x1 , . . . , xn ∈ V . The Gram matrix of x1 , . . . , xn is defined as
 
G(x1 , . . . , xn ) :=

⟨x1 , x1 ⟩ ⟨x2 , x1 ⟩ · · · ⟨xn , x1 ⟩
⟨x1 , x2 ⟩ ⟨x2 , x2 ⟩ · · · ⟨xn , x2 ⟩
.. .. ..
⟨x1 , xn ⟩ ⟨x2 , xn ⟩ · · · ⟨xn , xn ⟩ (93)

whose determinant is called the Gram determinant of these elements.

Theorem 2.35. Let V be an inner product space. For every x1 , . . . , xn ∈ V , their Gram matrix G(x1 , . . . , xn ) is
positive semi-definite, which is positive definite if and only if x1 , . . . , xn are linearly independent.
Proof. Let x1 , . . . , xn ∈ V be arbitrary with Gram matrix G := G(x1 , . . . , xn ). Clearly, for every i, j = 1, . . . , n, the entries ⟨xi , x j ⟩ and ⟨x j , xi ⟩ are complex conjugates of each other, so
G is certainly Hermitian. Furthermore, for every z = [c1 , . . . , cn ]T ∈ Fn ,

z∗ Gz = ∑_{i, j} c̄i ⟨xj , xi ⟩cj = ∑_{i, j} ⟨cj xj , ci xi ⟩ = ⟨∑_{j=1}^n cj xj , ∑_{i=1}^n ci xi ⟩ = ∥∑_{i=1}^n ci xi ∥² ≥ 0,

so the matrix G is certainly positive semi-definite. In addition, here

z∗ Gz = 0 ⇐⇒ ∥c1 x1 + · · · + cn xn ∥ = 0 ⇐⇒ c1 x1 + · · · + cn xn = 0V .

Consequently,

G is positive definite ⇐⇒ ∀z = [c1 , . . . , cn ]T ∈ Fn : (z∗ Gz = 0 =⇒ z = 0)


⇐⇒ ∀c1 , . . . , cn ∈ F : (c1 x1 + · · · + cn xn = 0V =⇒ c1 = · · · = cn = 0)
⇐⇒ x1 , . . . , xn are linearly independent. ■

Remark 2.21. Consider x, y ∈ V in which x ̸= 0V . Then


" #
∥x∥2 ⟨y, x⟩
G(x, y) = ,
⟨x, y⟩ ∥y∥2

so
0 ≤ det(G(x, y)) = ∥x∥2 ∥y∥2 − ⟨y, x⟩⟨x, y⟩ = ∥x∥2 ∥y∥2 − |⟨x, y⟩|2 .

This also proves the CBS inequality.
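The 2 × 2 case above is easy to check by machine; a quick sketch with the standard dot product on R² (illustrative vectors):

```python
# Gram determinant of two vectors: positive for independent pairs, zero for
# dependent pairs (Theorem 2.35), and non-negativity is the CBS inequality.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_det(u, v):
    return dot(u, u) * dot(v, v) - dot(u, v) * dot(v, u)

print(gram_det([1.0, 2.0], [3.0, 1.0]))   # 25.0 > 0: independent
print(gram_det([1.0, 2.0], [2.0, 4.0]))   # 0.0: dependent (v = 2u)
```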

Theorem 2.36 (Closest Point and Gram Matrix). Let V be an inner product space and x1 , . . . , xn ∈ V be linearly
independent. Then for every x ∈ V , the unique closest point in the n-dimensional subspace W := Span(x1 , . . . , xn )
is given by y := c1 x1 + · · · + cn xn , where
   
G(x1 , . . . , xn ) [c1 , c2 , . . . , cn ]T = [⟨x, x1 ⟩, ⟨x, x2 ⟩, . . . , ⟨x, xn ⟩]T . (94)

Furthermore,

d(x,W )² = ∥x − y∥² = det(G(x1 , . . . , xn , x)) / det(G(x1 , . . . , xn )). (95)
Proof. Let x ∈ V be arbitrary and y ∈ W be the unique closest point to x (cf. Theorem 2.29). Write y = c1 x1 + · · · + cn xn for some
c1 , . . . , cn ∈ F. Note that x − y ∈ W ⊥ , so for each k = 1, . . . , n, we have
0 = ⟨x − y, xk ⟩ = ⟨x, xk ⟩ − ⟨y, xk ⟩ = ⟨x, xk ⟩ − ⟨∑_{i=1}^n ci xi , xk ⟩ = ⟨x, xk ⟩ − ∑_{i=1}^n ci ⟨xi , xk ⟩.

In other words,
⟨x, xk ⟩ = c1 ⟨x1 , xk ⟩ + · · · + cn ⟨xn , xk ⟩.
Therefore, the vector [c1 , . . . , cn ]T ∈ Fn is precisely the solution to the linear system described above. Now because x1 , . . . , xn are linearly
independent, the Gram matrix G(x1 , . . . , xn ) is positive definite, hence non-singular. As a result, the linear system is consistent with a
unique solution. Finally, since y ⊥ (x − y), we see that

d(x,W )2 = ∥x − y∥2 = ⟨x − y, x − y⟩ = ⟨x − y, x⟩ − ⟨x − y, y⟩
= ⟨x − y, x⟩ = ⟨x, x⟩ − ⟨y, x⟩.

Consequently,
⟨x, x⟩ = ⟨y, x⟩ + d(x,W )2 = c1 ⟨x1 , x⟩ + c2 ⟨x2 , x⟩ + · · · + cn ⟨xn , x⟩ + d(x,W )2 .
Together with the equations obtained above, we have the following (n + 1) × (n + 1) consistent linear system:

⟨x1 , x1 ⟩ ⟨x2 , x1 ⟩ · · · ⟨xn , x1 ⟩ 0     c1          ⟨x, x1 ⟩
⟨x1 , x2 ⟩ ⟨x2 , x2 ⟩ · · · ⟨xn , x2 ⟩ 0     c2          ⟨x, x2 ⟩
.. .. .. ..                                  ..    =    ..
⟨x1 , xn ⟩ ⟨x2 , xn ⟩ · · · ⟨xn , xn ⟩ 0     cn          ⟨x, xn ⟩
⟨x1 , x⟩ ⟨x2 , x⟩ · · · ⟨xn , x⟩ 1       d(x,W )²     ⟨x, x⟩

Observe that the determinant of the coefficient matrix is precisely det(G(x1 , . . . , xn )) > 0. Then by Cramer’s rule, we have
 
d(x,W )² = (1/ det(G(x1 , . . . , xn ))) · det

⟨x1 , x1 ⟩ ⟨x2 , x1 ⟩ · · · ⟨xn , x1 ⟩ ⟨x, x1 ⟩
⟨x1 , x2 ⟩ ⟨x2 , x2 ⟩ · · · ⟨xn , x2 ⟩ ⟨x, x2 ⟩
.. .. .. ..
⟨x1 , xn ⟩ ⟨x2 , xn ⟩ · · · ⟨xn , xn ⟩ ⟨x, xn ⟩
⟨x1 , x⟩ ⟨x2 , x⟩ · · · ⟨xn , x⟩ ⟨x, x⟩

= det(G(x1 , . . . , xn , x)) / det(G(x1 , . . . , xn )). ■

Remark 2.22. In particular, the setup of the linear system (94) does not require x1 , . . . , xn to be linearly independent.
In other words, such a consistent linear system always exists provided x1 , . . . , xn span W , though it may
have infinitely many solutions. Nonetheless, all of those solutions only correspond to different coefficients for
x1 , . . . , xn but will result in essentially the same y ∈ W .
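The Gram system (94) can be solved directly for a small example; below is a minimal sketch in R³ with a non-orthogonal spanning pair, solving the 2 × 2 system by Cramer's rule (illustrative vectors, standard dot product):

```python
# Closest point via the Gram system (94): W = Span(x1, x2), x1, x2 not orthogonal.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x1, x2 = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]
x = [0.0, 2.0, 3.0]

# Gram matrix rows G[i][j] = <x_j, x_i> and right-hand side <x, x_i>:
a, b = dot(x1, x1), dot(x2, x1)
c, d = dot(x1, x2), dot(x2, x2)
r1, r2 = dot(x, x1), dot(x, x2)

det = a * d - b * c
c1 = (r1 * d - b * r2) / det      # Cramer's rule on the 2x2 system
c2 = (a * r2 - c * r1) / det

y = [c1 * p + c2 * q for p, q in zip(x1, x2)]
res = [xi - yi for xi, yi in zip(x, y)]
print(y)                            # the closest point in W: [0.0, 2.0, 0.0]
print(dot(res, x1), dot(res, x2))   # both 0.0: x - y ∈ W⊥
```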

2.5 Orthonormal Bases in Hilbert Spaces


Theorem 2.37 (Closest Point Property in Hilbert Spaces). Let V be a Hilbert space with subset C. If C is closed
and convex, then for every x ∈ V , there is a unique point y ∈ C closest to x, namely

∥x − y∥ = d(x,C) = inf_{z∈C} ∥x − z∥. (96)

Proof. Suppose that C is closed and convex, and let x ∈ V . First, if x ∈ C, we may simply put y := x, which certainly satisfies ∥x − y∥ = 0 =
d(x,C). On the other hand, suppose that x ∈/ C. Since C is closed in V , we must have d := d(x,C) > 0. Let (yn )n∈N be a sequence in C such
that ∥x − yn ∥ → d as n → ∞. In particular, we must have ∥x − yn ∥ ≥ d for all n, as d = d(x,C) is the infimum among all of them. We then
claim that (yn )n∈N is also a Cauchy sequence in V : For every m, n ∈ N, by the parallelogram identity,

∥ym − yn ∥² = ∥(ym − x) + (x − yn )∥² = 2(∥ym − x∥² + ∥x − yn ∥²) − ∥(ym − x) − (x − yn )∥²
= 2(∥ym − x∥² + ∥x − yn ∥²) − ∥ym + yn − 2x∥²
= 2(∥ym − x∥² + ∥x − yn ∥²) − 4∥(1/2)(ym + yn ) − x∥².

Since C is convex, we have z := (ym + yn )/2 ∈ C as well. Consequently, we see that ∥z − x∥ ≥ d whence

∥ym − yn ∥2 ≤ 2(∥ym − x∥2 + ∥x − yn ∥2 ) − 4d 2 .

For every ε > 0, since ∥x − yn ∥ → d, there exists N ∈ N such that ∥x − yn ∥² < d² + (ε²/4) for all n ≥ N. Consequently, when m, n ≥ N,

∥ym − yn ∥² ≤ 2 · 2(d² + ε²/4) − 4d² = ε²,

so ∥ym − yn ∥ < ε holds in this case. Therefore, the sequence (yn )n∈N is also a Cauchy sequence in V .
Note that V is a Hilbert space, so there exists y ∈ V such that yn → y as n → ∞. As we can see, because C is closed, we must have y ∈ C
in this case. Furthermore, by the continuity of norm, we have

∥x − y∥ = lim ∥x − yn ∥ = d.
n→∞

The existence of such y is thus clear to us. As for its uniqueness, let y′ ∈ C be another point such that ∥x − y′ ∥ = d(x,C). Again, by putting
z′ := (y + y′ )/2 and similar arguments as above, we have

∥y − y′ ∥² = 2(∥y − x∥² + ∥x − y′ ∥²) − 4∥z′ − x∥² = 4(d² − ∥z′ − x∥²).

Again, by the convexity of C, we also have z′ ∈ C whence

0 ≤ ∥y − y′ ∥2 = 4(d 2 − ∥z′ − x∥2 ) ≤ 4(d 2 − d 2 ) = 0.

Consequently, we have ∥y − y′ ∥ = 0 whence y = y′ . The proof is thus complete. ■

Remark 2.23. In particular, every closed subspace of V is also convex, so the above theorem applies to every
closed subspace of a Hilbert space.
Definition 2.11. Let V be a Hilbert space with closed subspace W .
1. For every x ∈ V , the unique element y ∈ W closest to x is called the orthogonal projection of x onto W .

2. Furthermore, the map PW : V → V , in which PW x is the orthogonal projection of x onto W for all x ∈ V , is
called the orthogonal projection of V onto W .

Remark 2.24. By Theorem 2.34, one can see that the above definition generalizes the orthogonal projection
onto a finite-dimensional subspace, as every finite-dimensional subspace is always closed (cf. Corollary 1.32).
Nevertheless, though the original case works for all inner product spaces, the above definition is restricted to
Hilbert spaces only.

Theorem 2.38. Let V be a Hilbert space with closed subspace W . For every x, p ∈ V , the following statements
are equivalent:

1. The element p is the orthogonal projection of x onto W .

2. Here p ∈ W and x − p ∈ W ⊥ .

3. The element x − p is the orthogonal projection of x onto W ⊥ .

In particular, we have
(W ⊥ )⊥ = W and V = W ⊕W ⊥ . (97)

Consequently, if W is proper, then W ⊥ is nonzero.


Proof. (1 =⇒ 2). Suppose that p is the orthogonal projection of x onto W . Then it is clear that p ∈ W . Furthermore, fix an arbitrary y ∈ W .
For every c ∈ F, since p + cy ∈ W as well, we have

∥x − p∥ ≤ ∥x − (p + cy)∥ = ∥(x − p) − cy∥.

By Theorem 2.25, we indeed have (x − p) ⊥ y in this case. Since y ∈ W is arbitrary, it follows that x − p ∈ W ⊥ as well.
(2 =⇒ 1, 3). Suppose that p ∈ W and x − p ∈ W ⊥ . Let y ∈ W be arbitrary. Now p − y ∈ W , so (x − p) ⊥ (p − y). By Pythagorean
theorem,
∥x − y∥2 = ∥(x − p) + (p − y)∥2 = ∥x − p∥2 + ∥p − y∥2 ≥ ∥x − p∥2 .
Therefore, p is precisely the orthogonal projection of x onto W .
• Now the decomposition x = p + (x − p) shows that V = W +W ⊥ . Note that W ∩W ⊥ is nonempty, hence W ∩W ⊥ = {0V }. Conse-
quently, we certainly have V = W ⊕W ⊥ .
• Furthermore, assume that x ∈ (W ⊥ )⊥ in this case. Since p ∈ W ⊆ (W ⊥ )⊥ , we see that x − p ∈ (W ⊥ )⊥ as well. Note that x − p ∈ W ⊥ ,
hence x − p = 0V holds for sure, namely x = p ∈ W . This shows that (W ⊥ )⊥ = W then. In addition, if W ⊥ = {0V }, then we must
have W = (W ⊥ )⊥ = {0V }⊥ = V . In other words, if W ̸= V , then W ⊥ ̸= {0V }.
In addition, applying (2 =⇒ 1) to W ⊥ , one can see that x − p is also the orthogonal projection of x onto W ⊥ . Of course, one may repeat the
above arguments as above: Let z ∈ W ⊥ be arbitrary. Since (x − p) − z ∈ W ⊥ , we have (x − p − z) ⊥ p whence

∥x − z∥2 = ∥(x − p − z) + p∥2 = ∥(x − p) − z∥2 + ∥p∥2 ≥ ∥p∥2 = ∥x − (x − p)∥2 .

This then shows that x − p is also the orthogonal projection of x onto W ⊥ .


(3 =⇒ 2). Finally, suppose that x − p is the orthogonal projection of x onto W ⊥ . Note that W ⊥ is always a closed subspace of V , so
such projection makes sense. What remains is to show that p ∈ W . Here by (1 =⇒ 2), it is clear that p ∈ (W ⊥ )⊥ .
Let q be the orthogonal projection of p onto W . Again by (1 =⇒ 2), we also have p − q ∈ W ⊥ . Therefore, we see that p ⊥ (p − q) and
q ⊥ (p − q), so p − q is also orthogonal to itself, implying that p − q = 0V . That is, we have p = q ∈ W , as desired. ■

Corollary 2.39. Let V be a Hilbert space. Then for every A ⊆ V ,

(A⊥ )⊥ = Span(A). (98)

Furthermore, for every subspace W of V , if W is dense in it, then W ⊥ = {0V }.


Proof. Let A ⊆ V be arbitrary. Then by Theorem 2.27, we have A⊥ = \overline{Span(A)}⊥ . Since \overline{Span(A)} is closed in V , by the preceding theorem,
we have

\overline{Span(A)} = (\overline{Span(A)}⊥ )⊥ = (A⊥ )⊥ .

Finally, given any dense subspace W of V , we have W ⊥ = \overline{W}⊥ = V ⊥ = {0V }. ■

Lemma 2.40. Let V be a Hilbert space and (ei )i∈I be a family of orthonormal elements. For every x ∈ V , the
set of index i’s where ⟨x, ei ⟩ ̸= 0 is at most countable.
Proof. Fix an arbitrary x ∈ V . Observe that

{i ∈ I | ⟨x, ei ⟩ ̸= 0} = ⋃_{n=1}^∞ Nn (x), where Nn (x) := {i ∈ I | |⟨x, ei ⟩| > 1/n}, n ∈ N.

As we can see, the set Nn (x) must be finite: If not, given any positive integer k, we can fix a (kn²)-element subset Nn,k (x) of Nn (x). By
Bessel’s inequality,

∥x∥² ≥ ∑_{i∈Nn,k (x)} |⟨x, ei ⟩|² > kn² · (1/n²) = k.

Since k is arbitrary, we must have ∥x∥ = ∞, which is absurd. Now that each Nn (x) is finite, the collection of those indices i for which
⟨x, ei ⟩ ̸= 0 is indeed at most countable. ■

Definition 2.12. Let V be a Hilbert space and (en )n∈N be an orthonormal sequence in V . Then for every x ∈ V ,
the generalized Fourier series of x with respect to (en )n∈N is given by

x ∼ ∑_{n=1}^∞ ⟨x, en ⟩en , (99)

where each ⟨x, en ⟩ is called the generalized Fourier coefficient of x under (en )n∈N .

Lemma 2.41. Let V be a Hilbert space and (en )n∈N be an orthonormal sequence in V . Then for every sequence
(cn )n∈N of scalars,
∑_{n=1}^∞ cn en is convergent ⇐⇒ ∑_{n=1}^∞ |cn |² < ∞, (100)

in which case,

∥∑_{n=1}^∞ cn en ∥² = ∑_{n=1}^∞ |cn |². (101)

Proof. For every m, n ∈ N with m > n, by generalized Pythagorean theorem,

∥cn+1 en+1 + · · · + cm em ∥2 = |cn+1 |2 + · · · + |cm |2 .

• If the series ∑_{n=1}^∞ cn en converges, then the partial sums of the series ∑_{n=1}^∞ |cn |² form a Cauchy sequence in R, hence the series ∑_{n=1}^∞ |cn |²
is convergent as well.
• Conversely, if ∑_{n=1}^∞ |cn |² < ∞, then the sequence of partial sums of the series ∑_{n=1}^∞ cn en in V is also Cauchy. Since V is a Hilbert space, such
series must also be convergent.
Finally, suppose that ∑_{n=1}^∞ cn en converges. Then by the continuity of norms and the generalized Pythagorean theorem,

∥∑_{n=1}^∞ cn en ∥² = lim_{m→∞} ∥∑_{n=1}^m cn en ∥² = lim_{m→∞} ∑_{n=1}^m |cn |² = ∑_{n=1}^∞ |cn |². ■

Remark 2.25. For x ∈ V, even though its associated generalized Fourier series ∑_{n=1}^∞ ⟨x, en⟩en may be convergent, its sum is not necessarily equal to x:

Consider V = L²[−π, π] with ϕn(t) = (1/√π) sin(nt) for all n ∈ N. Here (ϕn)n∈N is orthonormal in V, but for f(t) = cos(t), we have

⟨f, ϕn⟩ = ∫_{−π}^{π} cos(t) · (1/√π) sin(nt) dt = 0,

so

∑_{n=1}^∞ ⟨f, ϕn⟩ϕn = ∑_{n=1}^∞ 0·ϕn = 0 ≠ f.
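This remark can be checked numerically. The sketch below (my own illustration, not part of the text) approximates the L²[−π, π] inner product by midpoint-rule quadrature; the grid size and tolerances are arbitrary choices.

```python
import numpy as np

# Midpoint-rule quadrature on [-pi, pi], standing in for the L^2 inner product.
N = 200_000
edges = np.linspace(-np.pi, np.pi, N + 1)
t = 0.5 * (edges[:-1] + edges[1:])
dt = edges[1] - edges[0]

def inner(f, g):
    # <f, g> = integral of f * g over [-pi, pi] (real-valued functions)
    return np.sum(f * g) * dt

f = np.cos(t)
phis = [np.sin(n * t) / np.sqrt(np.pi) for n in range(1, 6)]

# (phi_n) is orthonormal ...
assert abs(inner(phis[0], phis[0]) - 1.0) < 1e-6
assert abs(inner(phis[0], phis[1])) < 1e-6

# ... yet every generalized Fourier coefficient of f = cos vanishes, so the
# generalized Fourier series of f sums to 0 even though ||f||^2 = pi > 0.
coeffs = [inner(f, phi) for phi in phis]
assert all(abs(c) < 1e-6 for c in coeffs)
assert abs(inner(f, f) - np.pi) < 1e-6
```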

Theorem 2.42 (Bessel's Inequality in Hilbert Space). Let V be a Hilbert space and (en)n∈N be an orthonormal sequence of elements in V. Then for every x ∈ V, the closest point to x in the closed subspace W, the closure of Span(en | n ∈ N), is precisely

ProjW(x) := ∑_{n=1}^∞ ⟨x, en⟩en ∈ W, (102)

where

∥ProjW(x)∥² = ∑_{n=1}^∞ |⟨x, en⟩|² ≤ ∥x∥². (103)

Here equality holds in the last inequality if and only if x = ProjW(x), which is also equivalent to x ∈ W.
Proof. Let x ∈ V be arbitrary. As noted in the infinite version of Bessel's inequality, we see that

∑_{n=1}^∞ |⟨x, en⟩|² ≤ ∥x∥² < ∞.

Then by the preceding lemma, the series ∑_{n=1}^∞ ⟨x, en⟩en is convergent in V as well, with

∥∑_{n=1}^∞ ⟨x, en⟩en∥² = ∑_{n=1}^∞ |⟨x, en⟩|².

We then let y be its sum, which, being a limit of elements of Span(en | n ∈ N), is clearly an element of W, the closure of Span(en | n ∈ N). Furthermore, we claim that x − y ∈ W⊥: By Theorem 2.21, we see that W⊥ = {en | n ∈ N}⊥. For each k ∈ N, by the continuity of inner products,

⟨x − y, ek⟩ = ⟨x, ek⟩ − ⟨y, ek⟩ = ⟨x, ek⟩ − ⟨∑_{n=1}^∞ ⟨x, en⟩en, ek⟩ = ⟨x, ek⟩ − ∑_{n=1}^∞ ⟨x, en⟩⟨en, ek⟩ = ⟨x, ek⟩ − ⟨x, ek⟩(1) = 0.

Therefore, we indeed have x − y ∈ W⊥ in this case. By Theorem 2.38, we see that y is precisely the orthogonal projection of x onto W. Finally, since y ⊥ (x − y), by the Pythagorean theorem, we have

∥x∥² = ∥x − y∥² + ∥y∥² = ∥x − y∥² + ∑_{n=1}^∞ |⟨x, en⟩|².

Note that x ∈ W if and only if 0 = d(x, W) = ∥x − y∥, so

x ∈ W ⇐⇒ x = y ⇐⇒ ∥x∥² = ∑_{n=1}^∞ |⟨x, en⟩|². ■
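A finite-dimensional sketch of the theorem (with arbitrary illustrative data): in R⁵, project onto the span of a two-element orthonormal family and check orthogonality of the residual along with Bessel's inequality.

```python
import numpy as np

rng = np.random.default_rng(0)

# A two-element orthonormal family in R^5 (columns of Q), via QR factorization.
Q, _ = np.linalg.qr(rng.standard_normal((5, 2)))
e = [Q[:, 0], Q[:, 1]]

x = rng.standard_normal(5)

# Proj_W(x) = sum_n <x, e_n> e_n
proj = sum(np.dot(x, en) * en for en in e)

# The residual x - Proj_W(x) is orthogonal to W = Span(e_1, e_2) ...
assert all(abs(np.dot(x - proj, en)) < 1e-12 for en in e)

# ... and Bessel: ||Proj_W(x)||^2 = sum |<x, e_n>|^2 <= ||x||^2.
coeff_sq = sum(np.dot(x, en) ** 2 for en in e)
assert abs(np.dot(proj, proj) - coeff_sq) < 1e-12
assert coeff_sq <= np.dot(x, x) + 1e-12
```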

Definition 2.13. Let V be a Hilbert space. An orthonormal sequence (en)n∈N is called complete, or an orthonormal base of V, if for every x ∈ V,

x = ∑_{n=1}^∞ ⟨x, en⟩en. (104)

Theorem 2.43. Let V be a Hilbert space and (en)n∈N be an orthonormal sequence in V. Then the following statements are equivalent:

1. The orthonormal sequence (en)n∈N is an orthonormal base of V.

2. For every x ∈ V, there exists a unique family (cn)n∈N of scalars such that

x = ∑_{n=1}^∞ cn en. (105)

3. The sequence (en)n∈N is a topological base of V, namely V is the closure of Span(en | n ∈ N).

4. The orthogonal complement of the spanned subspace Span(en | n ∈ N) is zero.


Proof. (1 ⇐⇒ 2). Suppose that (en)n∈N is complete. Let x ∈ V. It follows from the completeness of (en)n∈N that x = ∑_{n=1}^∞ ⟨x, en⟩en, so the existence of such a family of scalars is ensured. As for the uniqueness, suppose that x = ∑_{n=1}^∞ cn en, where cn ∈ F is a scalar for each n ∈ N. Then for each k ∈ N, by the continuity of inner products, we have

⟨x, ek⟩ = ⟨∑_{n=1}^∞ cn en, ek⟩ = ∑_{n=1}^∞ cn⟨en, ek⟩ = ∑_{n=1}^∞ cn δn,k = ck(1) = ck.

This proves the uniqueness of such a family of scalars, and it also gives (2 =⇒ 1): any expansion x = ∑_{n=1}^∞ cn en necessarily has cn = ⟨x, en⟩.


(2 =⇒ 3). Suppose that every element x ∈ V admits a (unique) expansion x = ∑_{n=1}^∞ cn en. Then every x ∈ V is a limit of elements of Span(en | n ∈ N), so V is the closure of Span(en | n ∈ N).

(3 =⇒ 1, 4). Suppose that V is the closure of W := Span(en | n ∈ N). Then by Theorem 2.27,

W⊥ = (closure of W)⊥ = V⊥ = {0V}.

Furthermore, by Theorem 2.42, for every x ∈ V,

x = ProjV(x) = ∑_{n=1}^∞ ⟨x, en⟩en.

(4 =⇒ 3). Suppose that the orthogonal complement of Span(en | n ∈ N) is zero, and let W denote the closure of Span(en | n ∈ N). Again by Theorem 2.27,

W⊥ = Span(en | n ∈ N)⊥ = {0V}.

Since W is closed in V and V is a Hilbert space, by Theorem 2.38 we have W = (W⊥)⊥ = {0V}⊥ = V. ■

Corollary 2.44. Let V be a Hilbert space and (en)n∈N be an orthonormal sequence in V. Then the following statements are equivalent:

1. The orthonormal sequence (en)n∈N is complete, i.e., an orthonormal base of V.

2. (Parseval's Identity). For every x, y ∈ V,

⟨x, y⟩ = ∑_{n=1}^∞ ⟨x, en⟩⟨en, y⟩. (106)

3. (Plancherel's Identity). For every x ∈ V,

∥x∥² = ∑_{n=1}^∞ |⟨x, en⟩|². (107)

Proof. (1 =⇒ 2). Suppose that (en)n∈N is complete. For every x, y ∈ V, since x = ∑_{n=1}^∞ ⟨x, en⟩en, by the continuity of inner products we have

⟨x, y⟩ = ⟨∑_{n=1}^∞ ⟨x, en⟩en, y⟩ = ∑_{n=1}^∞ ⟨x, en⟩⟨en, y⟩.

(2 =⇒ 3). Put y := x in the formula.

(3 =⇒ 1). Suppose that ∥x∥² = ∑_{n=1}^∞ |⟨x, en⟩|² for every x ∈ V, and let W denote the closure of Span(en | n ∈ N). For each x ∈ V, the equality case of Theorem 2.42 then gives x ∈ W. This shows that V = W, so by the preceding theorem, the orthonormal sequence (en)n∈N is complete. ■
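Parseval and Plancherel can be checked in a small complex example. The sketch below (my own illustration, not from the text) uses the columns of the unitary DFT matrix as an orthonormal base of C⁴, with ⟨u, v⟩ linear in the first slot and conjugate-linear in the second.

```python
import numpy as np

n = 4
j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
U = np.exp(-2j * np.pi * j * k / n) / np.sqrt(n)  # unitary DFT matrix
e = [U[:, m] for m in range(n)]                   # orthonormal base of C^4

def inner(u, v):
    # <u, v>, linear in u, conjugate-linear in v (np.vdot conjugates arg 1)
    return np.vdot(v, u)

rng = np.random.default_rng(1)
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Parseval: <x, y> = sum_n <x, e_n><e_n, y>
parseval = sum(inner(x, en) * inner(en, y) for en in e)
assert abs(inner(x, y) - parseval) < 1e-10

# Plancherel: ||x||^2 = sum_n |<x, e_n>|^2
plancherel = sum(abs(inner(x, en)) ** 2 for en in e)
assert abs(inner(x, x).real - plancherel) < 1e-10
```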

Theorem 2.45. All orthonormal bases of a nonzero Hilbert space are of the same cardinality.
Proof. Let V be a nonzero Hilbert space. If V has a finite orthonormal base, then it must be of finite dimension. Correspondingly, all
orthonormal bases have the same cardinality as the dimension of V .
Thus, we may assume that V has an infinite orthonormal base (ei )i∈I . Let (e′j ) j∈J be another orthonormal base of V . As discussed
above, the set J is also infinite. For each i ∈ I, we may let

Ni := { j ∈ J | ⟨ei , e′j ⟩ ̸= 0}.

Because (e′j)j∈J is an orthonormal base, the set Ni must be nonempty, and it is at most countable by Lemma 2.40. Meanwhile, for each j ∈ J, since (ei)i∈I is also an orthonormal base, we can find some i ∈ I such that j ∈ Ni. Therefore, J = ⋃_{i∈I} Ni, so

|J| = |⋃_{i∈I} Ni| ≤ ∑_{i∈I} |Ni| ≤ ℵ0 · |I| = |I|,

where the last step uses that each Ni is at most countable and I is infinite. By a symmetric argument, we have |I| ≤ |J| as well. Therefore, the equality |I| = |J| holds. ■

Definition 2.14. Let V be a Banach space. A Schauder base, or a countable base, is a sequence (en)n∈N of elements in V such that for every x ∈ V, there is a unique sequence (cn)n∈N of scalars in F such that

x = ∑_{n=1}^∞ cn en. (108)

Such a series is called the Schauder expansion of x, in which the scalars cn are called the coordinates of x with respect to the Schauder base.

Remark 2.26. Here are some remarks on the preceding definition:

• The ordering of elements in a Schauder base matters, because reordering the terms of a series can break its convergence, even in a Banach space. That is why a Schauder base is defined as a sequence rather than as an arbitrary indexed family.

• Clearly, every orthonormal base of a Hilbert space is a Schauder base.

Theorem 2.46. Every Schauder base in a Banach space is linearly independent and also a topological base.

Proof. Let V be a Banach space with Schauder base (en)n∈N. Consider an almost-null family (cn)n∈N of scalars, namely cn = 0 for all but finitely many n, for which

0V = ∑_{n=1}^∞ cn en.

Then by the uniqueness of Schauder expansions, we must have cn = 0 for all n ∈ N, so the family (en)n∈N is linearly independent. Furthermore, for each x ∈ V, its Schauder expansion shows that x is a limit of elements of Span(en | n ∈ N). Therefore, the spanned subspace Span(en | n ∈ N) is dense in V, implying that (en)n∈N is a topological base of V as well. ■

2.6 Examples of Inner Products
Example 2.5 (Frobenius Inner Product). Let V := Mm,n(F). The Frobenius inner product on Mm,n(F) is defined as follows: For every A = [aij], B = [bij] ∈ Mm,n(F), let

⟨A, B⟩F := tr(B∗A) = ∑_{j=1}^n ∑_{i=1}^m b̄ij aij = ∑_{i,j} aij b̄ij. (109)

As we can see,

tr(A∗A) = ∑_{i,j} aij āij = ∑_{i,j} |aij|² ≥ 0, and ⟨A, B⟩F = tr(B∗A) = tr((A∗B)∗), which is the complex conjugate of tr(A∗B) = ⟨B, A⟩F.

If n = 1, the Frobenius inner product is the standard inner product on Fm.
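A quick numerical check (illustrative only, with arbitrary random matrices) that the trace formula agrees with the entrywise sum, and that Hermitian symmetry and positivity hold:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
B = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))

# <A, B>_F = tr(B* A) = sum_{i,j} a_ij * conj(b_ij)
frob = np.trace(B.conj().T @ A)
entrywise = np.sum(A * np.conj(B))
assert abs(frob - entrywise) < 1e-12

# Hermitian symmetry: <A, B>_F = conj(<B, A>_F)
assert abs(frob - np.conj(np.trace(A.conj().T @ B))) < 1e-12

# Positivity: <A, A>_F = sum |a_ij|^2 >= 0 (and real)
assert np.trace(A.conj().T @ A).real >= 0
assert abs(np.trace(A.conj().T @ A).imag) < 1e-12
```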

Example 2.6. Let (X, A, µ) be a measure space. Then

⟨f, g⟩ := ∫_X f ḡ dµ (110)

is a semi-inner product on L²(µ), and ∥f∥₂ = ⟨f, f⟩^{1/2} for all f ∈ L²(µ).

3 Bounded Linear Maps and Continuous Dual Spaces
3.1 Bounded Linear Maps
Definition 3.1. Let X,Y be normed linear spaces. A linear map T : X → Y is called bounded if there exists a
positive constant M ∈ R such that ∥T x∥ ≤ M∥x∥ for all x ∈ X.

Theorem 3.1 (Characterization of Bounded Linear Maps). Let X,Y be normed linear spaces and T : X → Y be
a linear map. Then the following statements are equivalent:

1. The linear map T is bounded.

2. The linear map T is Lipschitz.

3. The linear map T is uniformly continuous.

4. The linear map T is continuous.

5. The linear map T is continuous at some x0 ∈ X.

6. The linear map T is continuous at 0X .


Proof. (1 =⇒ 2). Suppose that T is bounded, say by some constant M > 0. Then for every x, y ∈ X,

∥T x − Ty∥ = ∥T (x − y)∥ ≤ M∥x − y∥.

Therefore, the map T : X → Y is Lipschitz continuous with constant M.


(2 =⇒ 3 =⇒ 4 =⇒ 5). All of them are trivial by definition.
(5 =⇒ 6). Suppose that T is continuous at some x0 ∈ X. Let (xn )n∈N be a sequence of elements in X converging to 0X . Then we see
that xn + x0 → 0X + x0 = x0 as n → ∞, so by the continuity of T at x0 , we see that T (xn + x0 ) → T (x0 ). Consequently,

T (xn ) = T ((xn + x0 ) − x0 ) = T (xn + x0 ) − T (x0 ) → T (x0 ) − T (x0 ) = 0Y = T (0X ).

This shows that T is continuous at 0X as well.


(6 =⇒ 1). Suppose that T is continuous at 0X. Then there exists δ > 0 such that ∥Tx∥ = ∥Tx − T0X∥ < 1 for all x ∈ X with ∥x∥ < δ. Then for every nonzero y ∈ X, since ∥(δ/(2∥y∥))y∥ = δ/2 < δ, it follows that

1 > ∥T((δ/(2∥y∥))y)∥ = ∥(δ/(2∥y∥))Ty∥ = (δ/(2∥y∥))∥Ty∥,

or equivalently, ∥Ty∥ < (2/δ)∥y∥. Therefore, the linear map T : X → Y is bounded by the constant M := 2/δ > 0. ■

Corollary 3.2. Let X,Y be normed linear spaces. For every bounded linear map T : X → Y , its kernel/null
space is closed in X.
Proof. Let T : X → Y be a bounded linear map, and let (xn )n∈N be a sequence in ker(T ) converging to some x ∈ X. Then by the continuity
of T , we see that T xn → T x as well. Now for each n ∈ N, we have T xn = 0Y because xn ∈ ker(T ). This shows that T x = 0Y as well, hence
x ∈ ker(T ), showing that ker(T ) is closed in X. ■

Theorem 3.3. Let X,Y be normed linear spaces. The set B(X,Y ) of all bounded linear maps from X to Y is a
subspace of L (X,Y ).

Proof. Clearly, the zero map is bounded, as ∥0x∥ = ∥0Y∥ = 0 ≤ M∥x∥ for any M > 0. Furthermore, let T, T′ ∈ L(X, Y) be bounded and c ∈ F. Then there exist M, M′ > 0 such that ∥Tx∥ ≤ M∥x∥ and ∥T′x∥ ≤ M′∥x∥ for all x ∈ X. Put c′ := |c| ∨ 1 > 0. Then given any x ∈ X,

∥(T + T′)x∥ = ∥Tx + T′x∥ ≤ ∥Tx∥ + ∥T′x∥ ≤ M∥x∥ + M′∥x∥ = (M + M′)∥x∥

and

∥(cT)x∥ = ∥c(Tx)∥ = |c|∥Tx∥ ≤ |c|M∥x∥ ≤ c′M∥x∥.

Since M + M′ > 0 and c′M > 0, the linear maps T + T′ and cT are bounded as well. The desired assertion thus follows. ■

Theorem 3.4. Let X,Y be normed linear spaces. If X is finite-dimensional, then every linear map from X to Y
is bounded, namely L (X,Y ) = B(X,Y ).
Proof. Let {e1, . . . , en} be a base of X with coordinate forms e∨1, . . . , e∨n : X → F, and let T : X → Y be a linear map. As we can see, for each x ∈ X,

∥Tx∥ = ∥T(∑_{i=1}^n e∨i(x)ei)∥ = ∥∑_{i=1}^n e∨i(x)T(ei)∥ ≤ ∑_{i=1}^n ∥e∨i(x)T(ei)∥ = ∑_{i=1}^n |e∨i(x)|∥T(ei)∥ ≤ (∥T(e1)∥ ∨ · · · ∨ ∥T(en)∥) ∑_{i=1}^n |e∨i(x)|.

Then we may put M := ∥T(e1)∥ ∨ · · · ∨ ∥T(en)∥ ∨ 1 > 0. By Lemma 1.27, there exists λ > 0 such that

∥x∥ = ∥∑_{i=1}^n e∨i(x)ei∥ ≥ λ ∑_{i=1}^n |e∨i(x)|.

Consequently,

∥Tx∥ ≤ M ∑_{i=1}^n |e∨i(x)| ≤ (M/λ)∥x∥.

Since M/λ > 0, we can conclude that T is bounded. ■
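The finite-dimensionality hypothesis is essential here. A standard infinite-dimensional counterexample (my own illustration, not from the text): on the space of polynomials on [0, 1] with the sup norm, differentiation D is linear but unbounded, since ∥tⁿ∥ = 1 while ∥(tⁿ)′∥ = n. A numerical sketch:

```python
import numpy as np

# Dense sample of [0, 1]; sup norms are approximated by the max over the grid.
t = np.linspace(0.0, 1.0, 10_001)

ratios = []
for n in range(1, 30):
    p = t ** n              # p_n(t) = t^n, so ||p_n||_sup = 1 (attained at t = 1)
    dp = n * t ** (n - 1)   # (D p_n)(t) = n t^(n-1), so ||D p_n||_sup = n
    ratios.append(dp.max() / p.max())

# The ratios ||D p_n|| / ||p_n|| = n grow without bound, so no constant M
# with ||D p|| <= M ||p|| for all p can exist: D is unbounded.
assert all(abs(r - n) < 1e-9 for r, n in zip(ratios, range(1, 30)))
```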

Definition 3.2. Let X,Y be normed linear spaces. An isomorphism from X to Y is a bounded linear isomorphism
T : X → Y , namely a bijective linear map, whose inverse T −1 : Y → X is also bounded.

Remark 3.1. Clearly, an isomorphism T between normed linear spaces is both a linear isomorphism (from the
aspect of linear space structure) and a homeomorphism (from the aspect of metric space structure). Thus,

• being isomorphic is an equivalence relation on any nonempty collection of normed linear spaces;

• the inverse of every isomorphism is also an isomorphism, as the inverse of a linear isomorphism (resp.
homeomorphism) is a linear isomorphism (resp. homeomorphism) as well.

Definition 3.3. Let X,Y be normed linear spaces. A linear map T : X → Y is called a linear isometry, or simply
an isometry, if ∥T x∥ = ∥x∥ for all x ∈ X.

Remark 3.2. In general, a map f : X → Y is called an isometry if ∥f(x) − f(y)∥ = ∥x − y∥ for all x, y ∈ X. Here, for any linear map T : X → Y,

T is an isometry ⇐⇒ ∀x, y ∈ X : ∥Tx − Ty∥ = ∥x − y∥
⇐⇒ ∀x, y ∈ X : ∥T(x − y)∥ = ∥x − y∥
⇐⇒ ∀x ∈ X : ∥Tx∥ = ∥x∥.

This explains why the linear isometries are defined in the above fashion.

Theorem 3.5. Every linear isometry is bounded and injective.


Proof. Let X,Y be normed linear spaces and T : X → Y be a linear isometry. The fact that ∥T x∥ = ∥x∥ immediately tells us that T is
bounded. Furthermore,
T x = Ty ⇐⇒ ∥T x − Ty∥ = 0 ⇐⇒ ∥x − y∥ = 0 ⇐⇒ x = y,
hence the map T is injective. ■

Theorem 3.6. Let X, Y be inner product spaces. Then a linear map T : X → Y is an isometry if and only if ⟨Tx, Ty⟩ = ⟨x, y⟩ for all x, y ∈ X.
Proof. Let T : X → Y be a linear map. First, suppose that ⟨T x, Ty⟩ = ⟨x, y⟩ for all x, y ∈ X. Then for every x ∈ X, we have

∥T x∥2 = ⟨T x, T x⟩ = ⟨x, x⟩ = ∥x∥2 ,

namely ∥T x∥ = ∥x∥. It is thus clear that T is a linear isometry in this case.


Conversely, suppose that T is a linear isometry. For every x, y ∈ X, by Theorem 2.10, we have

∥x + y∥2 = ∥x∥2 + ∥y∥2 + 2ℜ⟨x, y⟩ and ∥x + iy∥2 = ∥x∥2 + ∥y∥2 + 2ℑ⟨x, y⟩.

Since T is a linear isometry, we see that

∥x∥² + 2ℜ⟨x, y⟩ + ∥y∥² = ∥x + y∥² = ∥T(x + y)∥² = ∥Tx + Ty∥² = ∥Tx∥² + 2ℜ⟨Tx, Ty⟩ + ∥Ty∥² = ∥x∥² + 2ℜ⟨Tx, Ty⟩ + ∥y∥²,

so we have ℜ⟨T x, Ty⟩ = ℜ⟨x, y⟩ in this case. Likewise, by considering x + iy, we have ℑ⟨T x, Ty⟩ = ℑ⟨x, y⟩ as well. Therefore, the equality
⟨T x, Ty⟩ = ⟨x, y⟩ holds for sure in this case. ■
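In Euclidean spaces, a matrix with orthonormal columns gives a concrete linear isometry; the check below (an illustration with arbitrary random data) verifies both the norm identity and the inner-product identity of the theorem.

```python
import numpy as np

rng = np.random.default_rng(3)

# Q has orthonormal columns, so x |-> Q x is a linear isometry R^3 -> R^5.
Q, _ = np.linalg.qr(rng.standard_normal((5, 3)))
T = lambda x: Q @ x

x = rng.standard_normal(3)
y = rng.standard_normal(3)

# ||T x|| = ||x|| for all x ...
assert abs(np.linalg.norm(T(x)) - np.linalg.norm(x)) < 1e-12

# ... equivalently, <T x, T y> = <x, y> for all x, y.
assert abs(np.dot(T(x), T(y)) - np.dot(x, y)) < 1e-12
```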

Definition 3.4. Two normed linear spaces X and Y are isometrically isomorphic if there exists an isomorphism
T : X → Y that is also a linear isometry.

Theorem 3.7. Let X, Y be normed linear spaces and T : X → Y be a linear map. If T is a linear isomorphism as well as an isometry, then T is an isometric isomorphism, and its inverse T−1 : Y → X is an isometric isomorphism as well.
Proof. Suppose that T is both a linear isomorphism and an isometry. Then it suffices to show that T −1 : Y → X is also bounded. As we can
see, for every y ∈ Y ,
∥y∥ = ∥T (T −1 y)∥ = ∥T −1 y∥,
so T −1 is also an isometry whence bounded as well. In this case, we see that T is an isometric isomorphism, while T −1 , being a linear
isomorphism and an isometry, is an isometric isomorphism as well. ■

3.2 The Operator Norms
Definition 3.5. Let X,Y be normed linear spaces and T : X → Y be a linear map. The operator norm, or simply,
the norm of T is defined as
∥T ∥ := inf{M > 0 | ∀x ∈ X : ∥T x∥ ≤ M∥x∥}, (111)

adopting the convention that inf(∅) = ∞.

Remark 3.3. Clearly, a linear map is bounded if and only if it has finite norm. That is,

B(X,Y ) = {T ∈ L (X,Y ) | ∥T ∥ < ∞}. (112)

In particular, every linear isometry on a nonzero space is bounded with operator norm 1.

Lemma 3.8. Let X,Y be normed linear spaces and T : X → Y be a linear map. Then for every x ∈ X,

∥T x∥ ≤ ∥T ∥∥x∥. (113)

Proof. Let x ∈ X be arbitrary. There is nothing to prove if ∥T ∥ = ∞. Suppose that ∥T ∥ < ∞ now. Then the following set

M := {M > 0 | ∀y ∈ X : ∥Ty∥ ≤ M∥y∥}

is nonempty. As we can see, we have ∥T x∥ ≤ M∥x∥ for all M ∈ M, hence ∥T x∥ ≤ ∥T ∥∥x∥ holds as well by taking infimum over M. ■

Theorem 3.9. Let X, Y be normed linear spaces and T : X → Y be a linear map. Then

∥T∥ = sup_{∥x∥≤1} ∥Tx∥ = sup_{∥x∥<1} ∥Tx∥. (114)

Furthermore, when X is nonzero,

∥T∥ = sup_{x≠0X} (∥Tx∥/∥x∥) = sup_{∥x∥=1} ∥Tx∥. (115)

Proof. 1. We prove the identity ∥T∥ = sup_{∥x∥≤1} ∥Tx∥ first: If ∥T∥ = ∞, then T is unbounded. In other words, for every M > 0, there exists y ∈ X such that ∥Ty∥ > M∥y∥. Clearly, here y ≠ 0X, for otherwise ∥Ty∥ = ∥0Y∥ = 0 = M∥y∥, a contradiction. In this case, the normalization ŷ := ∥y∥−1y satisfies ∥ŷ∥ = 1 and

∥Tŷ∥ = ∥Ty∥/∥y∥ > M∥y∥/∥y∥ = M.

Consequently, we have sup_{∥x∥≤1} ∥Tx∥ = ∞ as well in this case. Next, suppose that ∥T∥ < ∞ and denote M := sup_{∥x∥≤1} ∥Tx∥.
• For each x ∈ X with ∥x∥ ≤ 1, we have ∥T x∥ ≤ ∥T ∥∥x∥ ≤ ∥T ∥. Consequently, the inequality M ≤ ∥T ∥ < ∞ holds in this case.
• If ∥T ∥ = 0, the above inequality already implies that M = ∥T ∥ = 0; Otherwise, the domain X is nonzero. Then for every nonzero
x ∈ X with normalization x̂ := ∥x∥−1 x,
∥T x∥ = ∥x∥∥T x̂∥ ≤ M∥x∥,
hence by definition, the converse inequality ∥T ∥ ≤ M is true as well.
2. As a result, it is immediate that

∥T∥ = sup_{∥x∥≤1} ∥Tx∥ ≥ sup_{∥x∥<1} ∥Tx∥.

For the converse, we still consider two cases:

• Suppose that ∥T∥ = ∞. Then for every M > 0, there exists y ∈ X with ∥y∥ ≤ 1 such that ∥Ty∥ > 2M. In this case, observe that

∥y/2∥ = ∥y∥/2 < 1 and ∥T(y/2)∥ = ∥Ty∥/2 > 2M/2 = M.

Consequently, we also have sup_{∥x∥<1} ∥Tx∥ = ∞ = ∥T∥ then.


• Suppose that ∥T∥ < ∞. Then T is bounded and hence continuous. Let y ∈ X be nonzero with ∥y∥ ≤ 1, and let (rn)n∈N be a sequence in the interval (0, 1) with rn → 1. Then by the continuity of scalar multiplication, we have rn y → y in X, so by the continuity of T, we have T(rn y) → Ty in Y as well. Observe that ∥rn y∥ = rn∥y∥ ≤ rn < 1, so

∥T(rn y)∥ ≤ sup_{∥x∥<1} ∥Tx∥.

Since T(rn y) → Ty in Y, it follows that ∥Ty∥ ≤ sup_{∥x∥<1} ∥Tx∥. By the arbitrariness of y, it follows that ∥T∥ ≤ sup_{∥x∥<1} ∥Tx∥.

3. Suppose now that X is nonzero. The sets X \ {0X} and {x | ∥x∥ = 1} are both nonempty, hence the corresponding suprema are not equal to −∞. It is also immediate that

∥T∥ = sup_{∥x∥≤1} ∥Tx∥ ≥ sup_{∥x∥=1} ∥Tx∥.

Furthermore, for each nonzero x ∈ X with ∥x∥ ≤ 1, since x̂ := ∥x∥−1x is a unit element, it follows that

∥Tx∥ = ∥x∥∥Tx̂∥ ≤ ∥x∥ sup_{∥x∥=1} ∥Tx∥ ≤ sup_{∥x∥=1} ∥Tx∥.

Consequently, we also have

∥T∥ = sup_{∥x∥≤1} ∥Tx∥ ≤ sup_{∥x∥=1} ∥Tx∥.

4. Finally, denote M′ := sup_{x≠0X} (∥Tx∥/∥x∥). It is clear that M′ is non-negative or ∞.

• Suppose that M′ < ∞. For every nonzero x ∈ X, we have ∥Tx∥/∥x∥ ≤ M′, hence ∥Tx∥ ≤ M′∥x∥. By definition, we have ∥T∥ ≤ M′ < ∞.
• Suppose that ∥T∥ < ∞. Let M > 0 be arbitrary with ∥Tx∥ ≤ M∥x∥ for all x ∈ X, which exists now. Then for every nonzero x ∈ X, we certainly have ∥Tx∥/∥x∥ ≤ M. Consequently, M′ ≤ M holds, which, by the arbitrariness of M, implies that M′ ≤ ∥T∥ < ∞.

Then we consider two cases:

• If ∥T∥ < ∞, then by the second item above, we have M′ ≤ ∥T∥ < ∞, whence ∥T∥ ≤ M′ as well by the first item. The equality ∥T∥ = M′ thus holds.
• If ∥T∥ = ∞ and M′ < ∞, then by the first item, we would have ∥T∥ ≤ M′ < ∞, a contradiction. Therefore, we have M′ = ∞ = ∥T∥ in this case as well. ■
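For T x = A x between Euclidean spaces, ∥T∥ is the largest singular value of A, and the suprema in the theorem can be probed by sampling the unit sphere. A rough numerical sketch (sample sizes and thresholds are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))

# For the Euclidean norms, the operator norm is the largest singular value.
op_norm = np.linalg.norm(A, 2)

# Random unit vectors: sup_{||x||=1} ||A x|| is never exceeded by any sample,
# and it is approached from below as the sample grows.
xs = rng.standard_normal((3, 100_000))
xs /= np.linalg.norm(xs, axis=0)
vals = np.linalg.norm(A @ xs, axis=0)

assert vals.max() <= op_norm + 1e-9
assert vals.max() > 0.99 * op_norm
```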

Corollary 3.10. Let X,Y be normed linear spaces and T : X → Y be a linear map. If X is nonzero, then for
every r > 0,
sup_{∥x∥≤r} ∥Tx∥ = sup_{∥x∥<r} ∥Tx∥ = sup_{∥x∥=r} ∥Tx∥ = r∥T∥.

Proof. Let r > 0 be arbitrary. Here we prove only that sup_{∥x∥≤r} ∥Tx∥ = r∥T∥, as the others follow by similar arguments: With the substitution x′ := x/r,

sup_{∥x∥≤r} ∥Tx∥ = sup_{∥x′∥≤1} ∥T(rx′)∥ = r sup_{∥x′∥≤1} ∥Tx′∥ = r∥T∥. ■

Definition 3.6. Let X,Y be normed linear spaces. A linear map T : X → Y is called a contraction if there exists
α ∈ (0, 1) such that ∥T x∥ ≤ α∥x∥ for all x ∈ X.
Remark 3.4. Again, a general contraction is a function f : X → Y such that ∥ f (x) − f (y)∥ ≤ α∥x − y∥ for some
α ∈ (0, 1) and all x, y ∈ X. If T : X → Y is a linear contraction with constant α ∈ (0, 1), then

∥T x − Ty∥ = ∥T (x − y)∥ ≤ α∥x − y∥, ∀x, y ∈ X.

In other words, every linear contraction is also a contraction in general. In particular, for every y0 ∈ Y , the
following affine map
T̂ : X → Y : x 7→ T x + y0 (116)

is also a contraction in the general sense.

Theorem 3.11 (Operator Norm of Linear Contractions). Let X,Y be normed linear spaces. A linear map
T : X → Y is a contraction if and only if ∥T ∥ < 1. In particular, every linear contraction is bounded.
Proof. Let T : X → Y be a linear map. Then

T is a linear contraction ⇐⇒ ∃α ∈ (0, 1) : ∀x ∈ X : ∥T x∥ ≤ α∥x∥


⇐⇒ ∃α ∈ (0, 1) : ∥T ∥ ≤ α ⇐⇒ ∥T ∥ < 1.

Clearly, if T is a linear contraction, since ∥T ∥ < 1 < ∞, it is certainly bounded as well. ■

Theorem 3.12 (Banach’s Fixed Point Theorem). Let X be a Banach space. Then every contraction (not neces-
sarily linear) T : X → X has a unique fixed point x∗ ∈ X, namely T x∗ = x∗ . Furthermore, for every x0 ∈ X, we
always have
lim T n x0 = x∗ . (117)
n→∞

Proof. Let T : X → X be a contraction and α ∈ (0, 1) be such that ∥T x − Ty∥ ≤ α∥x − y∥ for all x, y ∈ X. Next, we may select an arbitrary
x0 ∈ X and put xn := T n x0 for all n ∈ N. Then for each n, we have

∥xn+1 − xn ∥ = ∥T xn − T xn−1 ∥ = ∥T (xn − xn−1 )∥ ≤ α∥xn − xn−1 ∥.

Consequently, it follows by an easy induction that

∥x_{n+1} − xn∥ ≤ αⁿ∥x1 − x0∥.

Then for every n, p ∈ N, we have

∥x_{n+p} − xn∥ ≤ ∑_{i=0}^{p−1} ∥x_{n+i+1} − x_{n+i}∥ ≤ ∑_{i=0}^{p−1} α^{n+i}∥x1 − x0∥ = αⁿ∥x1 − x0∥ ∑_{i=0}^{p−1} αⁱ ≤ αⁿ∥x1 − x0∥/(1 − α) → 0. (as n → ∞)

Therefore, (xn )n∈N is a Cauchy sequence in X, hence is convergent with limit x∗ ∈ X. Being a Lipschitz function, the map T is continuous
on X, so we have  
T x∗ = T lim xn = lim T xn = lim xn+1 = x∗ .
n→∞ n→∞ n→∞
Thus, the element x∗ is indeed a fixed point of T . Finally, suppose that y ∈ X is another fixed point of T . If y ̸= x∗ , then

0 < ∥x∗ − y∥ = ∥T x∗ − Ty∥ ≤ α∥x∗ − y∥ < ∥x∗ − y∥,

a contradiction. Therefore, the fixed point of T in X is unique, completing the proof. ■
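A concrete sketch of the fixed-point iteration (with arbitrary illustrative data): an affine map T(x) = Ax + b with ∥A∥ < 1 is a contraction in the general sense (cf. Remark 3.4), and its unique fixed point is x* = (I − A)⁻¹b.

```python
import numpy as np

rng = np.random.default_rng(5)

# Scale an orthogonal matrix by 0.4, so ||A|| = 0.4 < 1 exactly.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
A = 0.4 * Q
b = rng.standard_normal(3)

T = lambda x: A @ x + b                     # contraction with constant 0.4
x_star = np.linalg.solve(np.eye(3) - A, b)  # the unique fixed point

# Iterating T from any starting point converges to x* (error ~ 0.4^n).
x = np.zeros(3)
for _ in range(100):
    x = T(x)

assert np.linalg.norm(x - x_star) < 1e-10
assert np.linalg.norm(T(x_star) - x_star) < 1e-12
```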

Theorem 3.13 (Norm of Quotient Map). Let X be a normed linear space with subspace U. If U is closed and
U ̸= X, then ∥π∥ = 1 where π : X → X/U is the quotient map.
Proof. Suppose that U is a proper closed subspace of X, and let π : X → X/U be the quotient map.

• By Theorem 1.16, we have ∥π∥ ≤ 1, as for every x ∈ X with ∥x∥ ≤ 1,

∥π(x)∥ = inf_{z∈U} ∥x − z∥ ≤ ∥x∥ ≤ 1.

• Meanwhile, by Riesz's lemma (cf. Lemma 1.33), for every δ ∈ (0, 1), there exists a unit element x ∈ X such that δ ≤ d(x, U) = ∥π(x)∥ in X/U. Consequently, we have ∥π∥ ≥ δ for all δ ∈ (0, 1), hence ∥π∥ ≥ 1 as well.

Therefore, we must have ∥π∥ = 1 in this case. ■

Theorem 3.14 (Norm of Complexification). Let X,Y be real inner product spaces and T : X → Y be a bounded
linear map. Then the complexification TC : XC → YC of T is also bounded with ∥TC ∥ = ∥T ∥. Furthermore, if T
is an isomorphism, so is TC .
Proof. 1. By Theorem 2.9, for every x, y ∈ X,

∥TC(x + iy)∥²C = ∥Tx + i(Ty)∥²C = ∥Tx∥² + ∥Ty∥² ≤ ∥T∥²∥x∥² + ∥T∥²∥y∥² = ∥T∥²(∥x∥² + ∥y∥²) = ∥T∥²∥x + iy∥²C,

so the inequality ∥TC(x + iy)∥C ≤ ∥T∥∥x + iy∥C holds. Consequently, we can conclude that TC is also bounded with ∥TC∥ ≤ ∥T∥. In particular, for every x ∈ X,

∥Tx∥ = ∥TC(x + i0)∥C ≤ ∥TC∥∥x + i0∥C = ∥TC∥∥x∥,

so the converse inequality ∥T∥ ≤ ∥TC∥ holds as well. Therefore, we certainly have ∥TC∥ = ∥T∥ in this case.
2. Suppose that T is an isomorphism. Then T −1 : Y → X is also a bounded linear map. By Theorem A.3, we see that TC is also a linear
isomorphism with inverse (T −1 )C . As noted above, the map (T −1 )C is also bounded, so TC is an isomorphism as well. ■

Theorem 3.15 (Factorization Theorem). Let X, Y be normed linear spaces and T : X → Y be a bounded linear map. Then for every closed subspace U of X with U ⊆ ker(T), there exists a unique bounded linear map T̃ : X/U → Y such that

∀x ∈ X : T̃(x + U) = Tx, and ∥T̃∥ = ∥T∥. (118)

1. Here im(T̃) = im(T), so T̃ is surjective if and only if T is.

2. Here ker(T̃) = ker(T)/U, so T̃ is injective if and only if ker(T) = U.


Proof. Let U be a closed subspace of X contained in ker(T), and define

T̃ : X/U → Y : x + U ↦ Tx.

As we can see, for every x, y ∈ X,

x + U = y + U ⇐⇒ x − y ∈ U =⇒ x − y ∈ ker(T) =⇒ Tx = Ty.

Therefore, the map T̃ is well-defined.

• For every x, y ∈ X and c ∈ F,

T̃(c(x + U) + (y + U)) = T̃((cx + y) + U) = T(cx + y) = c(Tx) + Ty = cT̃(x + U) + T̃(y + U),

so T̃ is also linear.
• Let x ∈ X be arbitrary. Then for every z ∈ U ⊆ ker(T), we have Tz = 0Y, whence T(x − z) = Tx. In this case,

∥T̃(x + U)∥ = ∥Tx∥ = ∥T(x − z)∥ ≤ ∥T∥∥x − z∥.

By taking the infimum over all z ∈ U, it follows that ∥T̃(x + U)∥ ≤ ∥T∥∥x + U∥. Therefore, the linear map T̃ is also bounded with ∥T̃∥ ≤ ∥T∥. Finally, since ∥x + U∥ ≤ ∥x∥, it is clear that

∥Tx∥ = ∥T̃(x + U)∥ ≤ ∥T̃∥∥x + U∥ ≤ ∥T̃∥∥x∥.

Then we also have ∥T∥ ≤ ∥T̃∥, so the identity ∥T̃∥ = ∥T∥ holds for sure.
Finally, we study the injectivity and surjectivity of T̃ as follows:

• For every y ∈ Y,

y ∈ im(T̃) ⇐⇒ ∃x ∈ X : y = T̃(x + U) = Tx ⇐⇒ y ∈ im(T).

Therefore, we have im(T̃) = im(T). In this case,

T̃ is surjective ⇐⇒ im(T̃) = Y ⇐⇒ im(T) = Y ⇐⇒ T is surjective.

• For every x ∈ X,

x + U ∈ ker(T̃) ⇐⇒ 0Y = T̃(x + U) = Tx ⇐⇒ x ∈ ker(T).

Consequently, we see that ker(T̃) = ker(T)/U, whence

T̃ is injective ⇐⇒ {0_{X/U}} = ker(T̃) = ker(T)/U ⇐⇒ ker(T) = U,

where the direction =⇒ in the last equivalence follows from the fact that U ⊆ ker(T). ■

Theorem 3.16. Let X,Y, Z be normed linear spaces, and let T : X → Y, T ′ : Y → Z be linear maps. Then

∥T ′ ◦ T ∥ ≤ ∥T ∥∥T ′ ∥. (119)

Proof. There is nothing to prove if ∥T∥ = ∞ or ∥T′∥ = ∞. Suppose now that ∥T∥ < ∞ and ∥T′∥ < ∞. Then there exist constants M, M′ > 0
such that ∥T x∥ ≤ M∥x∥ for all x ∈ X and ∥T ′ y∥ ≤ M ′ ∥y∥ for all y ∈ Y . Consequently, for each x ∈ X,

∥(T ′ ◦ T )x∥ = ∥T ′ (T x)∥ ≤ M ′ ∥T x∥ ≤ M ′ M∥x∥.

Since the above inequality holds for all such M, M ′ , it follows that ∥T ′ ◦ T ∥ ≤ ∥T ′ ∥∥T ∥ as well. ■
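A quick random check of this submultiplicativity for matrices under the spectral norm (which is the operator norm for the Euclidean norms); note that the composition T′ ∘ T corresponds to the product of the matrix of T′ with the matrix of T.

```python
import numpy as np

rng = np.random.default_rng(6)

# T : R^5 -> R^3 with matrix B, T' : R^3 -> R^4 with matrix A; then T' o T
# has matrix A @ B, and ||A @ B|| <= ||A|| ||B|| in the spectral norm.
for _ in range(100):
    A = rng.standard_normal((4, 3))
    B = rng.standard_normal((3, 5))
    lhs = np.linalg.norm(A @ B, 2)
    rhs = np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
    assert lhs <= rhs + 1e-9
```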

Theorem 3.17. Let X,Y be normed linear spaces. Then the operator norm ∥ · ∥ is a norm on B(X,Y ).
Proof. We show that ∥ · ∥ is a norm on B(X, Y).
• For every T ∈ B(X,Y ) and c ∈ F,

∥cT ∥ = sup ∥(cT )x∥ = sup ∥c(T x)∥ = sup |c|∥T x∥ = |c| sup ∥T x∥ = |c|∥T ∥.
∥x∥=1 ∥x∥=1 ∥x∥=1 ∥x∥=1

• For every T, T ′ ∈ B(X,Y ),

∥T + T ′ ∥ = sup ∥(T + T ′ )x∥ ≤ sup (∥T x∥ + ∥T ′ x∥)


∥x∥=1 ∥x∥=1

≤ sup ∥T x∥ + sup ∥T ′ x∥ = ∥T ∥ + ∥T ′ ∥.
∥x∥=1 ∥x∥=1

• Let T ∈ B(X, Y) be arbitrary. Then

∥T∥ = 0 ⇐⇒ ∀x ∈ X \ {0X} : ∥Tx∥/∥x∥ = 0
⇐⇒ ∀x ∈ X \ {0X} : ∥Tx∥ = 0
⇐⇒ ∀x ∈ X \ {0X} : Tx = 0Y ⇐⇒ T = 0. ■

Theorem 3.18. Let X,Y be normed linear spaces. If Y is a Banach space, then B(X,Y ) is also a Banach space
under the operator norm.
Proof. So far we have seen that B(X, Y) is a normed linear space under the operator norm. Suppose now that Y is a Banach space. Let (Tn)n∈N be a Cauchy sequence in B(X, Y) under the operator norm, and let x ∈ X be arbitrary. We claim that (Tn x)n∈N is a Cauchy sequence in Y:

• If x = 0X , we have Tn x = Tn 0X = 0Y for all n ∈ N, so the sequence (Tn x)n∈N is constant and certainly Cauchy as well.
• Suppose that x ̸= 0X . Let ε > 0 be arbitrary. Then there exists N ∈ N such that ∥Tm − Tn ∥ < ε/∥x∥ for all m, n ≥ N. In this case,
when m, n ≥ N,
ε
∥Tm x − Tn x∥ = ∥(Tm − Tn )x∥ ≤ ∥Tm − Tn ∥∥x∥ < · ∥x∥ = ε.
∥x∥
This shows that the sequence (Tn x)n∈N is Cauchy as well.
Now since Y is a Banach space, there exists a unique y ∈ Y such that Tn x → y in Y as n → ∞. This allows us to define a map

T : X → Y : x 7→ lim Tn x.
n→∞

Here we show that T is linear: Let x, y ∈ X and c ∈ F be arbitrary. By the continuity of addition and scalar multiplication,

T (cx + y) = lim Tn (cx + y) = lim (c(Tn x) + Tn y) = c lim Tn x + lim Tn y = c(T x) + Ty.


n→∞ n→∞ n→∞ n→∞

Furthermore, the map T is also bounded: Note that (Tn )n∈N is a Cauchy sequence, so it is bounded with respect to the operator norm, say
there is some M > 0 such that ∥Tn ∥ ≤ M for all n ∈ N. Then for every x ∈ X, by the continuity of norms,

∥Tx∥ = ∥ lim_{n→∞} Tn x∥ = lim_{n→∞} ∥Tn x∥ ≤ M∥x∥,

where the last inequality holds because

∥Tn x∥ ≤ ∥Tn∥∥x∥ ≤ M∥x∥, ∀n ∈ N.
Finally, we show that Tn → T in B(X,Y ): For every ε > 0, there exists N ∈ N such that ∥Tm − Tn ∥ < ε for all m, n ≥ N. Then for every
x ∈ X and n ≥ N,

∥(Tn − T)x∥ = ∥Tn x − Tx∥ = ∥Tn x − lim_{m→∞} Tm x∥ = lim_{m→∞} ∥Tn x − Tm x∥ ≤ ε∥x∥,

since ∥Tn x − Tm x∥ ≤ ∥Tn − Tm∥∥x∥ < ε∥x∥ for all m ≥ N. Hence ∥Tn − T∥ ≤ ε. The proof is thus complete. ■

Definition 3.7. Let X, Y be normed linear spaces. A sequence (Tn)n∈N of bounded linear maps from X to Y is said to converge weakly to a linear map T ∈ B(X, Y) if Tn x → Tx in Y for all x ∈ X, as opposed to strong convergence, namely ∥Tn − T∥ → 0 as n → ∞ under the operator norm ∥ · ∥.

Theorem 3.19. Let X,Y be normed linear spaces and (Tn )n∈N be a sequence in B(X,Y ). If Tn → T ∈ B(X,Y )
strongly, then (Tn )n∈N converges to T weakly as well.
Proof. Suppose that Tn → T strongly. Then for every x ∈ X,

∥Tn x − T x∥ = ∥(Tn − T )x∥ ≤ ∥Tn − T ∥∥x∥ → 0. (as n → ∞)

As a result, the sequence (Tn )n∈N converges to T weakly as well. ■

Theorem 3.20 (Extension Theorem). Let X be a normed linear space and U be a dense subspace of X. Then for every Banach space Y and bounded linear map T : U → Y, there is a unique extension T̄ ∈ B(X, Y) of T to X, and it satisfies ∥T̄∥ = ∥T∥.
Proof. Let Y be a Banach space and T ∈ B(U,Y ) be a bounded linear map. Fix an arbitrary x ∈ X. Since U is dense in X, we can find a
sequence (xn )n∈N in U with xn → x. By Theorem 1.9, the sequence (xn )n∈N is Cauchy. Furthermore, since for every m, n ∈ N,

∥T xm − T xn ∥ = ∥T (xm − xn )∥ ≤ ∥T ∥∥xm − xn ∥,

the sequence (Txn)n∈N is Cauchy in Y as well. Note that Y is a Banach space, so the sequence (Txn)n∈N is convergent. Inspired by this, we define

T̄ : X → Y : x ↦ lim_{n→∞} Txn.

70
We then show that T̄ is a well-defined linear map on X extending T:

• First, we verify that T̄ is well-defined: Let (xn)n∈N, (x′n)n∈N be sequences in U converging to x. Then xn − x′n → x − x = 0X as n → ∞, hence by the continuity of T, we see that

Txn − Tx′n = T(xn − x′n) → T(0X) = 0Y. (as n → ∞)

This shows that the limits of (Txn)n∈N and (Tx′n)n∈N coincide, hence T̄ is well-defined. Furthermore, for each x ∈ U, by considering the constant sequence with all terms equal to x, it follows that T̄x = Tx; that is, T̄ extends T.
• Let x, y ∈ X and c ∈ F. Fix arbitrary sequences (xn)n∈N and (yn)n∈N in U such that xn → x and yn → y. Then cxn + yn → cx + y, hence

T̄(cx + y) = lim_{n→∞} T(cxn + yn) = lim_{n→∞} (c(Txn) + Tyn) = c lim_{n→∞} Txn + lim_{n→∞} Tyn = cT̄(x) + T̄(y).

Consequently, the map T̄ is linear.


Next, for every x ∈ X, again by fixing an arbitrary sequence (xn)n∈N in U converging to x, we may observe that

∥T̄x∥ = ∥ lim_{n→∞} Txn∥ = lim_{n→∞} ∥Txn∥ ≤ lim_{n→∞} ∥T∥∥xn∥ = ∥T∥ lim_{n→∞} ∥xn∥ = ∥T∥∥x∥.

Then we can see that T̄ is also bounded with ∥T̄∥ ≤ ∥T∥. Finally,

∥T∥ = sup_{x∈U, ∥x∥≤1} ∥Tx∥ = sup_{x∈U, ∥x∥≤1} ∥T̄x∥ ≤ sup_{x∈X, ∥x∥≤1} ∥T̄x∥ = ∥T̄∥.

Therefore, we may conclude that ∥T̄∥ = ∥T∥. The uniqueness of T̄ is clear: two continuous maps agreeing on the dense subspace U must agree on all of X. ■

3.3 The Uniform Boundedness Principle/Banach–Steinhaus Theorem


Lemma 3.21. Let X,Y be normed linear spaces and T : X → Y be a bounded linear map. Then for every x ∈ X
and r > 0,
sup_{x′∈X, ∥x′−x∥<r} ∥Tx′∥ ≥ r∥T∥. (120)

Proof. Let x ∈ X be arbitrary. Then for every z ∈ X, we have

∥T(x + z)∥ ∨ ∥T(x − z)∥ = ∥T(x + z)∥ ∨ ∥T(z − x)∥ ≥ (∥T(x + z)∥ + ∥T(z − x)∥)/2 ≥ ∥T(x + z) + T(z − x)∥/2 = ∥2Tz∥/2 = ∥Tz∥.

In particular, by taking the supremum over ∥z∥ < r, we see that

sup_{x′∈X, ∥x′−x∥<r} ∥Tx′∥ = sup_{∥z∥<r} (∥T(x + z)∥ ∨ ∥T(x − z)∥) ≥ sup_{∥z∥<r} ∥Tz∥ = r∥T∥,

where the last equality follows from Corollary 3.10. ■

Theorem 3.22 (Uniform Boundedness Theorem; Banach-Steinhaus). Let X be a Banach space and Y be a
normed linear space. Given any family F of bounded linear maps from X to Y , if supT ∈F ∥T x∥ < ∞ for all
x ∈ X, then supT ∈F ∥T ∥ < ∞ as well.
Proof (A. Sokal). Let F be a family of bounded linear maps from X to Y such that supT ∈F ∥T x∥ < ∞ for all x ∈ X. Assume, to the contrary,
that supT ∈F ∥T ∥ = ∞. Then for every n ∈ N, there exists Tn ∈ F such that ∥Tn ∥ ≥ 4n . Meanwhile, we also put x0 := 0X . Then for each

n ∈ N, we may see from the preceding lemma that
sup_{x′∈X, ∥x′−x_{n−1}∥<3^{−n}} ∥Tn x′∥ ≥ (1/3ⁿ)∥Tn∥.

Then we may choose xn ∈ X with ∥xn − x_{n−1}∥ < 3^{−n} such that ∥Tn xn∥ ≥ (2/3^{n+1})∥Tn∥.
Clearly, the sequence (xn )n∈N is Cauchy in X, hence is convergent as X is a Banach space. Let x ∈ X be the limit of (xn )n∈N . As we can
see, for each n ∈ N, by the continuity of norms,
∥x − xn∥ = lim_{m→∞} ∥xm − xn∥ ≤ lim_{m→∞} ∑_{k=n+1}^m ∥xk − x_{k−1}∥ ≤ lim_{m→∞} ∑_{k=n+1}^m 1/3ᵏ = lim_{m→∞} (1/(2·3ⁿ))(1 − 1/3^{m−n}) = 1/(2·3ⁿ).

Consequently,
|∥Tn x∥ − ∥Tn xn∥| ≤ ∥Tn x − Tn xn∥ ≤ ∥Tn∥∥x − xn∥ ≤ (1/(2·3ⁿ))∥Tn∥.

As a result,

∥Tn x∥ ≥ ∥Tn xn∥ − (1/(2·3ⁿ))∥Tn∥ ≥ (2/3^{n+1})∥Tn∥ − (1/(2·3ⁿ))∥Tn∥ = (1/(6·3ⁿ))∥Tn∥ ≥ (1/6)(4/3)ⁿ.

In this case, the sequence (∥Tn x∥)n∈N is unbounded from above, contrary to our assumption. ■

Corollary 3.23. Let X be a Banach space and Y be a normed linear space. For every sequence (Tn )n∈N of
bounded linear maps from X to Y , if the sequence (Tn x)n∈N in Y is convergent for all x ∈ X, then the map

T : X → Y : x ↦ lim_{n→∞} Tn x (121)

also defines a bounded linear map from X to Y with ∥T ∥ ≤ supn∈N ∥Tn ∥.


Proof. Let (Tn )n∈N be a sequence in B(X,Y ) such that (Tn x)n∈N converges in Y for all x ∈ X, and let T : X → Y be as defined above. Then
for every x, y ∈ X and c ∈ F,

T (cx + y) = lim_{n→∞} Tn (cx + y) = lim_{n→∞} (c(Tn x) + Tn y) = c lim_{n→∞} Tn x + lim_{n→∞} Tn y = c(T x) + Ty,

so the map T is linear as well. Meanwhile, for each x ∈ X, since the sequence (Tn x)n∈N is convergent, it is certainly bounded as well. Then by the Banach–Steinhaus theorem, we see that sup_{n∈N} ∥Tn ∥ < ∞ as well. Finally, for each x ∈ X, by the continuity of the norm,

∥T x∥ = lim_{n→∞} ∥Tn x∥ ≤ sup_{n∈N} ∥Tn x∥ ≤ sup_{n∈N} ∥Tn ∥∥x∥ = ∥x∥ sup_{n∈N} ∥Tn ∥ < ∞.

Therefore, the linear map T is also bounded with ∥T ∥ ≤ supn∈N ∥Tn ∥. ■
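In finite dimensions the corollary is easy to probe numerically: if Tn → T pointwise (here even in operator norm), then ∥T ∥ ≤ sup_n ∥Tn ∥. A small sketch with arbitrarily chosen matrices, not taken from the text:

```python
import numpy as np

T = np.array([[0.0, 1.0], [2.0, 3.0]])
S = np.array([[1.0, -1.0], [0.5, 0.0]])
Ts = [T + S / n for n in range(1, 201)]  # Tn := T + S/n, so Tn x -> T x for every x

op = lambda A: np.linalg.norm(A, 2)       # operator norm w.r.t. Euclidean norms
sup_norm = max(op(A) for A in Ts)
ok = op(T) <= sup_norm + 1e-12            # ||T|| <= sup_n ||Tn||, as in Corollary 3.23
print(ok)
```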

Theorem 3.24 (The Open Mapping Theorem). Let X,Y be Banach spaces and T : X → Y be a bounded linear
map. Then the following statements are equivalent:

1. The map T is open, namely maps open sets in X to open sets in Y .

2. The element 0Y ∈ Y is an interior point of T (BX ), where BX := B(0X , 1) = {x ∈ X | ∥x∥ < 1} is the
open unit ball in X.

3. The map T is itself surjective.


Proof. (1 ⇐⇒ 2). Clearly, if T is open, then T (BX ) is also open in Y . Since 0Y = T 0X ∈ T (BX ), there certainly exists r > 0 such
that y ∈ T (BX ) for all y ∈ Y with ∥y∥ < r.

Conversely, suppose that 0Y is an interior point of T (BX ). Then there exists r > 0 such that y ∈ T (BX ) for all y ∈ Y with ∥y∥ < r. Now
let G ⊆ X be an open set and y ∈ T (G) be arbitrary. By definition, we can find some x ∈ G such that y = T x. Since G is open, there is δ > 0
such that x̂ ∈ G whenever ∥x̂ − x∥ < δ . Then for every y′ ∈ Y with ∥y′ − y∥ < rδ , since ∥(y′ − y)/δ ∥ < r, it follows from the assumption
that there exists some x′ ∈ BX such that (y′ − y)/δ = T x′ . In this case,

y′ = y + δ (T x′ ) = T x + T (δ x′ ) = T (x + δ x′ ) and ∥(x + δ x′ ) − x∥ = ∥δ x′ ∥ = δ ∥x′ ∥ < δ .

The latter implies that x + δ x′ ∈ G whence y′ ∈ T (G) as well. Consequently, the set T (G) is open in Y , implying that T is an open map.
(2 ⇐⇒ 3). Again, assume that 0Y lies in the interior of T (BX ). For convenience, we adopt the notation for such constant r > 0 as in the
last paragraph. Let y ∈ Y be arbitrary. There is nothing to prove if y = 0Y ; otherwise, for y′ := (r/(2∥y∥))y, it is clear that ∥y′ ∥ = r/2 < r, so by
assumption, there exists x ∈ X with ∥x∥ < 1 such that y′ = T x. In this case,

y = (2∥y∥/r)y′ = (2∥y∥/r)(T x) = T ((2∥y∥/r)x),

showing that T is a surjective map.
Conversely, suppose that T is surjective. First, we show that 0Y is an interior point of the closure of T (BX ): For each n ∈ N, we
may consider the following map on Y :

∥ · ∥n : Y → R : y ↦ inf{∥u∥ + n∥v∥ | u ∈ X, v ∈ Y, and y = v + Tu}.

We now show that ∥ · ∥n is a norm on Y :


• Let y ∈ Y and nonzero c ∈ F. First, for every u ∈ X and v ∈ Y such that y = v + Tu, since cy = c(v + Tu) = cv + T (cu), it follows that

∥cy∥n ≤ ∥cu∥ + n∥cv∥ = |c|∥u∥ + n(|c|∥v∥) = |c|(∥u∥ + n∥v∥).

By taking the infimum over all such u, v, it is clear that ∥cy∥n ≤ |c|∥y∥n . Note that this inequality is independent of the choices of y
and c, so
∥y∥n = ∥c^{−1} (cy)∥n ≤ |c|^{−1} ∥cy∥n ,
hence ∥cy∥n ≥ |c|∥y∥n as well. The identity ∥cy∥n = |c|∥y∥n thus holds for sure (trivially so when c = 0).
• Let y, y′ ∈ Y be arbitrary. Then for every u, u′ ∈ X and v, v′ ∈ Y , if y = v + Tu and y′ = v′ + Tu′ , then

y + y′ = (v + Tu) + (v′ + Tu′ ) = (v + v′ ) + T (u + u′ ).

Consequently,

∥y + y′ ∥n ≤ ∥u + u′ ∥ + n∥v + v′ ∥ ≤ (∥u∥ + ∥u′ ∥) + n(∥v∥ + ∥v′ ∥)


= (∥u∥ + n∥v∥) + (∥u′ ∥ + n∥v′ ∥).

By taking infimum over all u, u′ , v, v′ , it is clear that ∥y + y′ ∥n ≤ ∥y∥n + ∥y′ ∥n .


• Let y ∈ Y be such that ∥y∥n = 0. Then for every k ∈ N, there exist uk ∈ X and vk ∈ Y such that y = vk + Tuk and ∥uk ∥ + n∥vk ∥ < 1/k.
Clearly, we have ∥uk ∥ < 1/k and ∥vk ∥ < 1/(nk) in this case, so uk → 0X and vk → 0Y as k → ∞. By the continuity of T , we see that

y = lim_{k→∞} y = lim_{k→∞} (vk + Tuk ) = 0Y + T 0X = 0Y + 0Y = 0Y .

In particular, for each y ∈ Y , by considering u := 0X and v := y, it is clear that

∥y∥n ≤ ∥0X ∥ + n∥y∥ = n∥y∥.

Furthermore, let Z := Y ⊕N be the direct sum of countably many copies of Y ; by Theorem 1.29, it can be endowed with the following “l∞ -norm”:

∥(yn )n∈N ∥ := sup_{n∈N} ∥yn ∥n .

In this case, for each n ∈ N, consider the n-th injection map inn : Y → Z.

• Clearly, for each y ∈ Y , since inn (y) has a unique nonzero component y at index n, we must have ∥ inn (y)∥ = ∥y∥n ≤ n∥y∥. Therefore,
the map inn is bounded as well.
• Furthermore, let y ∈ Y be arbitrary. Since T is surjective, we have y = T x for some x ∈ X. Then by putting u := x and v := 0Y , we
see that
∥ inn (y)∥ = ∥y∥n ≤ ∥x∥ + n∥0Y ∥ = ∥x∥ + n(0) = ∥x∥.
Consequently, the sequence (inn (y))n∈N is bounded in Z by ∥x∥, where x ∈ X satisfies T x = y.
Then by the Banach-Steinhaus theorem (cf. Theorem 3.22), there exists a real constant C > 0 such that ∥ inn ∥ ≤ C for all n ∈ N. Next, we
may fix an arbitrary δ ∈ (0, 1/C) and claim that B(0Y , δ ) ⊆ T (BX ): Let y ∈ Y with ∥y∥ < δ . For each n ∈ N, we see that

∥y∥n = ∥ inn (y)∥ ≤ ∥ inn ∥∥y∥ < Cδ < 1.

Then there exist un ∈ X and vn ∈ Y with y = vn + Tun such that ∥un ∥ + n∥vn ∥ < 1. In particular, we see that ∥un ∥ < 1 and ∥vn ∥ < 1/n. The
latter implies that vn → 0Y as n → ∞, hence Tun → y in this case. Now un ∈ BX for all n ∈ N, so y ∈ T (BX ), as desired.
Finally, we then show that T (BX ) contains the open ball B(0Y , δ /2) instead, which will prove that 0Y is an interior point of T (BX ): Let
y ∈ Y be such that ∥y∥ < δ /2.
• By scaling the ball as in the previous part, we see that y lies in the closure of T (B(0X , 1/2)). Then there exists x1 ∈ X with ∥x1 ∥ < 1/2 such that
∥y − T x1 ∥ < δ /4.
• Proceeding inductively, for n ≥ 1, since ∥y − T x1 − · · · − T xn−1 ∥ < δ /2^n , we shall have y − T x1 − · · · − T xn−1 in the closure of T (B(0X , 1/2^n )). In
this case, we can find xn ∈ X with ∥xn ∥ < 1/2^n such that

∥y − T x1 − · · · − T xn−1 − T xn ∥ < δ /2^{n+1} .

Clearly, the series ∑_{n=1}^∞ T xn is convergent with sum y. Meanwhile, since ∑_{n=1}^∞ ∥xn ∥ < ∑_{n=1}^∞ 2^{−n} = 1 < ∞, the series ∑_{n=1}^∞ xn converges
absolutely in X. Since X is a Banach space, by Theorem 1.12, such a series is also convergent in X, say with sum x ∈ X. In this case, by the
continuity of the norms and of T ,

∥x∥ = ∥∑_{n=1}^∞ xn ∥ ≤ ∑_{n=1}^∞ ∥xn ∥ < 1 and T x = T (∑_{n=1}^∞ xn ) = ∑_{n=1}^∞ T xn = y,

so x ∈ BX whence y = T x ∈ T (BX ) as well. The proof is thus complete. ■

Corollary 3.25 (Bounded Inverse Theorem). Every bijective bounded linear map between Banach spaces is
an isomorphism, namely its inverse is bounded as well.
Proof. Let X,Y be Banach spaces, and let T : X → Y be a bijective bounded linear map. Since T is surjective, by the open mapping
theorem, it is also an open map. Then for every open set G ⊆ X, we see that (T −1 )−1 (G) = T (G) is open as well, so the map T −1 is
continuous, hence bounded. Therefore, the map T is indeed an isomorphism in this case. ■

Corollary 3.26 (Isomorphism Theorem). Let X,Y be Banach spaces and T : X → Y be a bounded linear map.
If im(T ) is closed in Y , then X/ ker(T ) is isomorphic to im(T ).
Proof. Suppose that im(T ) is closed in Y . Now by the factorization theorem (cf. Theorem 3.15), we have a bounded linear bijection

T̃ : X/ ker(T ) → im(T ) : x + ker(T ) ↦ T x.

As we can see, the quotient space X/ ker(T ) is Banach as X is (cf. Theorem 1.16), and so is im(T ) as it is closed in the Banach space Y
(cf. Theorem 1.10). Then by Bounded inverse theorem (cf. Corollary 3.25), we can conclude that T̃ is an isomorphism, namely X/ ker(T )
is isomorphic to im(T ). ■

Remark 3.5. In addition, the converse also holds in this case: If X/ ker(T ) is isomorphic to im(T ), then im(T )
is also a Banach space. By Theorem 1.10 again, it is necessarily closed in Y as well.

Theorem 3.27. Let X,Y be nonzero normed linear spaces and T : X → Y be a surjective bounded linear map.
Then T is invertible with bounded inverse T −1 if and only if there exists a constant C > 0 such that ∥T x∥ ≥ C∥x∥
for all x ∈ X, in which case ∥T −1 ∥ ≤ 1/C.
Proof. Suppose that T is invertible with bounded inverse T −1 . Then for every x ∈ X,

∥x∥ = ∥(T −1 ◦ T )x∥ = ∥T −1 (T x)∥ ≤ ∥T −1 ∥∥T x∥.

In particular, since T −1 is also invertible, it is certainly nonzero whence ∥T −1 ∥ > 0. Therefore, we have ∥T x∥ ≥ ∥x∥/∥T −1 ∥, hence the
constant C := 1/∥T −1 ∥ certainly suffices.
Conversely, suppose that there is some constant C > 0 satisfying that ∥T x∥ ≥ C∥x∥ for all x ∈ X. First, we show that the map T is itself
bijective: It is already surjective by hypothesis. If T x = 0Y for some x ∈ X, then we have

0 = ∥0Y ∥ = ∥T x∥ ≥ C∥x∥,

so ∥x∥ = 0 holds for sure, implying that x = 0X . Therefore, the kernel of T is trivial, so T is also injective and hence has a
linear inverse T −1 : Y → X. What remains is to show that T −1 is also bounded: For every y ∈ Y , since

∥y∥ = ∥T (T −1 y)∥ ≥ C∥T −1 y∥,

we have ∥T −1 y∥ ≤ ∥y∥/C. This shows that T −1 is also bounded with ∥T −1 ∥ ≤ 1/C. ■
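For matrices with Euclidean norms, the optimal constant C in ∥T x∥ ≥ C∥x∥ is the smallest singular value of T, and the bound ∥T −1 ∥ ≤ 1/C then holds (in fact with equality). A quick numerical check, with an arbitrarily chosen matrix:

```python
import numpy as np

T = np.array([[2.0, 1.0], [0.0, 3.0]])

C = np.linalg.svd(T, compute_uv=False)[-1]     # smallest singular value: best C
inv_norm = np.linalg.norm(np.linalg.inv(T), 2)  # ||T^{-1}|| in the operator 2-norm

ok = C > 0 and inv_norm <= 1 / C + 1e-12        # bounded below => bounded inverse
print(ok)
```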

Definition 3.8. Let X be a normed linear space. For every linear operator T ∈ B(X), its Neumann series is
defined as the series ∑_{k=0}^∞ T^k .

Lemma 3.28. Let X be a normed linear space with T ∈ B(X). If the Neumann series of T converges under the
operator norm, then idX −T is invertible whose inverse is given by the sum of the Neumann series.
Proof. Suppose that the Neumann series of T converges to some T ′ ∈ B(X). For every k ≥ 0, note that

(idX −T )(idX +T + · · · + T k ) = idX −T k+1 ,

so

(idX −T )T ′ = (idX −T ) ∑_{k=0}^∞ T^k = lim_{k→∞} (idX −T^{k+1} ) = idX − lim_{k→∞} T^{k+1} .

Because the Neumann series is convergent, by Theorem 1.9, we certainly have T^k → 0 under the operator norm. Therefore, we see that
(idX −T )T ′ = idX , hence T ′ is a right inverse of idX −T . By symmetric arguments, it is clear that T ′ is also a left inverse of idX −T , hence
T ′ = (idX −T )−1 , as desired. ■

Theorem 3.29 (Inverse Concerning Contractions). Let X be a Banach space with T ∈ B(X). If ∥T ∥ < 1, then
idX −T is also invertible with

(idX −T )^{−1} = ∑_{k=0}^∞ T^k and ∥(idX −T )^{−1} ∥ ≤ 1/(1 − ∥T ∥). (122)

Proof. Suppose that ∥T ∥ < 1, namely T is a contraction. We first claim that idX −T is surjective: For every y ∈ X, by Banach’s fixed point
theorem, there exists an element x ∈ X such that x = T x + y, namely y = (idX −T )x. Meanwhile, for every x ∈ X, since ∥T x∥ ≤ ∥T ∥∥x∥,

∥(idX −T )x∥ = ∥x − T x∥ ≥ ∥x∥ − ∥T x∥ ≥ ∥x∥ − ∥T ∥∥x∥ = (1 − ∥T ∥)∥x∥.

Then by Theorem 3.27, we may conclude that idX −T is also invertible with ∥(idX −T )^{−1} ∥ ≤ 1/(1 − ∥T ∥). By the preceding lemma, what
remains is to show that the Neumann series of T is convergent in B(X). Now because ∥T ∥ < 1 and ∥T^k ∥ ≤ ∥T ∥^k for all k ≥ 0, the series
∑_{k=0}^∞ ∥T^k ∥ is convergent as well. Thus, the Neumann series ∑_{k=0}^∞ T^k converges absolutely, hence converges as well by Theorem 1.12. ■
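A minimal numerical illustration of the Neumann series (the matrix below is an arbitrary contraction, ∥T ∥ < 1): the partial sums of ∑ T^k converge to (I − T)⁻¹, and the norm bound of (122) holds.

```python
import numpy as np

T = np.array([[0.1, 0.3], [0.2, 0.1]])
norm_T = np.linalg.norm(T, 2)
assert norm_T < 1                  # T is a contraction

S = np.zeros((2, 2))               # partial sums of the Neumann series
P = np.eye(2)                      # current power T^k
for _ in range(200):
    S += P
    P = P @ T

inv = np.linalg.inv(np.eye(2) - T)
ok = np.allclose(S, inv) and np.linalg.norm(inv, 2) <= 1 / (1 - norm_T) + 1e-12
print(ok)
```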

Corollary 3.30. Let X be a Banach space. Then the collection of isomorphisms from X to itself is an open
subset of B(X).
Proof. Clearly, the identity map on X is an isomorphism, so this set is nonempty. Now let T ∈ B(X) be an isomorphism, and consider any
S ∈ B(X) with ∥S − T ∥ < 1/∥T −1 ∥. Observe that

∥ idX −T −1 S∥ = ∥T −1 (T − S)∥ ≤ ∥T −1 ∥∥T − S∥ < ∥T −1 ∥ · (1/∥T −1 ∥) = 1,

so by the preceding theorem, we can see that T −1 S is also an isomorphism. Now since T is itself an isomorphism, we see that S = T (T −1 S)
is an isomorphism as well. This shows that the open ball centered at T of radius 1/∥T −1 ∥ is contained in the set of all isomorphisms on X, so
this set is indeed open in B(X). ■
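Corollary 3.30 can likewise be probed numerically: perturbing an invertible matrix T by any E with ∥E∥ < 1/∥T −1 ∥ cannot destroy invertibility. The matrices below are chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
T = np.array([[1.0, 0.0], [0.0, 2.0]])
Tinv_norm = np.linalg.norm(np.linalg.inv(T), 2)  # here equal to 1

E = rng.standard_normal((2, 2))
E *= 0.5 / (Tinv_norm * np.linalg.norm(E, 2))    # rescale so ||E|| = 0.5/||T^{-1}||

S = T + E                                         # inside the ball of radius 1/||T^{-1}||
ok = abs(np.linalg.det(S)) > 1e-9                 # S is still invertible
print(ok)
```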

Theorem 3.31 (Closed Graph Theorem). Let X,Y be Banach spaces and T : X → Y be a linear map. Then T is
bounded if and only if its graph is closed in X ×Y under the norm

∥(x, y)∥ := ∥x∥ + ∥y∥, ∀x ∈ X and y ∈ Y. (123)

Proof. By Corollary 1.30, it is clear that X ×Y is a Banach space under the norm defined above. This allows us to consider the statement
presented above:
• Suppose that T is bounded. Let (xn , yn )n∈N be a sequence in the graph of T converging to some (x, y) ∈ X × Y . Then clearly, we
must have xn → x and yn → y. Since T is bounded, we shall have yn = T xn → T x as well. By the uniqueness of limits, it follows
that y = T x, namely the pair (x, y) also belongs to the graph of T . Therefore, the graph of T is closed in X ×Y .
• Conversely, suppose that the graph of T is closed in X ×Y . By Theorem 1.10, it is a Banach space. In this case, the projection maps
πX : X ×Y → X and πY : X ×Y → Y are both bounded linear maps. Furthermore, note that the restriction of πX to the graph of T is
bijective, hence it has a bounded inverse T ′ from X to the graph of T by the bounded inverse theorem. Clearly, for each x ∈ X, we
have T ′ x = (x, T x) whence
T x = πY (x, T x) = πY (T ′ x) = (πY ◦ T ′ )x.
This shows that T = πY ◦ T ′ is also bounded, as desired. ■

Remark 3.6. As a result, to show that a linear map T between Banach spaces is bounded, it suffices to show that
xn → x and T xn → y imply that T x = y.

3.4 Extending Linear Functionals: The Hahn-Banach Theorem


Definition 3.9. Let X be a normed linear space. The space X ∗ := B(X, F) is called the continuous dual of X.

Remark 3.7. Here are some remarks on the preceding definition:

• Note that F = R or C is a Banach space, so the continuous dual X ∗ is also a Banach space by Theorem
3.18.

• The continuous dual X ∗ = B(X, F) is a subspace of the algebraic dual X ∨ = L (X, F). By Theorem 3.4,
they coincide when X is finite-dimensional.

• Finally, the operator norm on X ∗ is of the following form: For every continuous functional ϕ : X → F,

∥ϕ∥ = sup_{∥x∥≤1} |ϕ(x)|, (124)

which is called the dual norm. In practice, such a norm is also denoted by ∥ · ∥∗ or ∥ · ∥D .

Lemma 3.32 (Boundedness of Coordinate Forms). Let X be a normed linear space.

1. If X is finite-dimensional with base {e1 , . . . , en }, then each coordinate form e∨i : X → F is bounded with
norm 1. Furthermore, the family {e∨1 , . . . , e∨n } is a base of X ∗ in which

ϕ = ∑_{j=1}^n ϕ(e j )e∨j , ∀ϕ ∈ X ∗ . (125)

2. If X is infinite-dimensional with base (ei )i∈I , then at most finitely many e∨i ’s are bounded.
Proof. First, suppose that X is finite-dimensional. Then the index set I is certainly finite. As noted in the preceding remark, each coordinate
form e∨i is also bounded. Now let ϕ ∈ X ∗ be arbitrary. Then for every x ∈ X,

ϕ(x) = ϕ(∑_{i∈I} e∨i (x)ei ) = ∑_{i∈I} e∨i (x)ϕ(ei ) = ∑_{i∈I} ϕ(ei )e∨i (x),

so the identity ϕ = ∑_{i∈I} ϕ(ei )e∨i holds for sure. Clearly, if ∑_{i∈I} ci e∨i = 0, then applying both sides to e j yields c j = 0 for each j ∈ I, so the family (e∨i )i∈I is linearly
independent as well. Therefore, we may conclude that (e∨i )i∈I is a base of X ∗ in this case.
Now suppose that X is infinite-dimensional, namely the index set I is infinite. ■

Definition 3.10. Let X be a linear space. A map p : X → R is called sublinear if

1. (Non-Negative Homogeneity). p(rx) = rp(x) for all x ∈ X and non-negative r ∈ R;

2. (Subadditivity). p(x + y) ≤ p(x) + p(y) for all x, y ∈ X.

Proposition 3.33. Let X be a linear space. Then a sublinear map p : X → R is a semi-norm if and only if it is
symmetric, namely p(−x) = p(x) for all x ∈ X.
Proof. Let p be a sublinear map on X. If it is already a semi-norm, by Theorem 1.1, we certainly have p(−x) = p(x) for all x ∈ X.
Conversely, suppose that p is symmetric. Then for every x ∈ X and c ∈ F, we see that

p(cx) = p(|c|x) = |c|p(x).

Since the triangle inequality already holds for p, we can conclude that p is a semi-norm on X. ■

Theorem 3.34 (Hahn-Banach, Real Linear Functionals). Let X be a real linear space and p : X → R be a
sublinear map. Then for every subspace U of X and linear functional ϕ0 : U → R, if ϕ0 (x) ≤ p(x) for all x ∈ U,
then there is a linear extension ϕ : X → R of ϕ0 such that ϕ(x) ≤ p(x) for all x ∈ X as well.
Proof. Let U be a subspace of X and ϕ0 : U → R be a linear map such that ϕ0 (x) ≤ p(x) for all x ∈ U. First, we show that whenever U ≠ X,
we can extend ϕ0 to a larger subspace of X satisfying the same condition: When U ≠ X, we may fix one x1 ∈ X \U and consider

U1 := Span(U ∪ {x1 }) = {x + cx1 | x ∈ U, c ∈ R}.

Now let α ∈ R be a constant to be determined later. We may define

ϕ1 : U1 → R : x + cx1 ↦ ϕ0 (x) + cα.

• We first show that ϕ1 is well-defined: Suppose that x + cx1 = x′ + c′ x1 for some x, x′ ∈ U and c, c′ ∈ R. Then we have x − x′ =
(c′ − c)x1 . Since x1 ∉ U while x − x′ ∈ U, this forces c = c′ and hence x = x′ , so

ϕ0 (x) + cα = ϕ0 (x′ ) + c′ α.

In particular, for each x ∈ U,


ϕ1 (x) = ϕ1 (x + 0x1 ) = ϕ0 (x) + 0α = ϕ0 (x),
namely ϕ1 extends ϕ0 to the whole U1 .
• Let x, y ∈ U and c, c1 , c2 ∈ R be arbitrary. Then

ϕ1 ((x + c1 x1 ) + (y + c2 x1 )) = ϕ1 ((x + y) + (c1 + c2 )x1 ) = ϕ0 (x + y) + (c1 + c2 )α


= (ϕ0 (x) + ϕ0 (y)) + (c1 α + c2 α)
= ϕ1 (x + c1 x1 ) + ϕ1 (y + c2 x1 )

and

ϕ1 (c(x + c1 x1 )) = ϕ1 (cx + (cc1 )x1 ) = ϕ0 (cx) + (cc1 )α


= cϕ0 (x) + c(c1 α) = c(ϕ0 (x) + c1 α) = cϕ1 (x + c1 x1 ).

Therefore, the map ϕ1 is also linear.


In addition, we shall show that ϕ1 (x) ≤ p(x) as well for all x ∈ U1 : For every x, y ∈ U,

ϕ0 (x) + ϕ0 (y) = ϕ0 (x + y) ≤ p(x + y) = p((x − x1 ) + (x1 + y)) ≤ p(x − x1 ) + p(x1 + y),

hence
ϕ0 (x) − p(x − x1 ) ≤ p(x1 + y) − ϕ0 (y).
Since x, y ∈ U are arbitrary, it follows that

sup (ϕ0 (x) − p(x − x1 )) ≤ inf (p(x1 + y) − ϕ0 (y)).


x∈U y∈U

Thus, we may assume that α was selected between these two numbers. Now let x ∈ X and c ∈ R be arbitrary.
• If c = 0, then ϕ1 (x) = ϕ0 (x) ≤ p(x).
• Suppose that c > 0. Then

p(x + cx1 ) = cp(x/c + x1 ) ≥ c(ϕ0 (x/c) + α) = ϕ0 (x) + cα = ϕ1 (x + cx1 ).

• Finally, suppose that c < 0. Then because −c > 0,

p(x + cx1 ) = −cp(−x/c − x1 ) ≥ −c(ϕ0 (−x/c) − α) = ϕ0 (x) + cα = ϕ1 (x + cx1 ).

Clearly, if X is finite-dimensional, applying the extension procedure finitely many times, we can certainly obtain the desired linear functional
on X. As for general normed linear spaces, we are supposed to consider the following set:

A := {(Y, ψ) | U ⊆ Y ⊆ X is a subspace and ψ ∈ Y ∨ with ψ|U = ϕ0 , ψ ≤ p}.

Clearly, we have (U, ϕ0 ) ∈ A, so the set A is nonempty. Next, we define a binary relation ⪯ on A by

(Y1 , ψ1 ) ⪯ (Y2 , ψ2 ) ⇐⇒ Y1 ⊆ Y2 and ψ2 |Y1 = ψ1 .

It is immediate that ⪯ is a partial order on A. Furthermore, given any chain (Yi , ψi )i∈Γ in A, the union Y ∗ := ∪_{i∈Γ} Yi is also a subspace of
X. Furthermore, define
ψ ∗ : Y ∗ → R : x ↦ ψi (x), where i ∈ Γ is any index with x ∈ Yi .
Such a ψ ∗ is well-defined: if x ∈ Yi ∩Y j , then one of (Yi , ψi ) and (Y j , ψ j ) extends the other, as they lie in a chain, so ψi (x) = ψ j (x). In
particular, (Yi , ψi ) ⪯ (Y ∗ , ψ ∗ ) for all i ∈ Γ, namely every chain in A is bounded from above.
By Zorn’s lemma, we can find a maximal element (Y, ϕ) ∈ A. Here we must have Y = X, for otherwise we could apply the extension
procedure described at the beginning to extend ϕ to a strictly larger subspace Y ′ of X, contradicting the maximality of (Y, ϕ). The proof is thus
complete. ■
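The one-step extension in the proof can be carried out concretely. In the sketch below, an illustrative choice not taken from the text, X = R² with the sublinear map p = ∥ · ∥∞, U = Span{(1,1)}, ϕ0(t,t) = t, and x1 = (1,0); the admissible interval for α is estimated on a grid of sample points of U.

```python
import numpy as np

p = lambda v: np.linalg.norm(v, np.inf)      # sublinear map dominating phi0

x1 = np.array([1.0, 0.0])                     # direction used to enlarge U
ts = np.linspace(-50.0, 50.0, 10001)          # sample points t*(1,1) of U

# sup_x (phi0(x) - p(x - x1))  <=  alpha  <=  inf_y (p(x1 + y) - phi0(y))
lo = max(t - p(t * np.ones(2) - x1) for t in ts)
hi = min(p(x1 + t * np.ones(2)) - t for t in ts)
assert lo <= hi                               # an admissible alpha exists

alpha = (lo + hi) / 2
phi = lambda v: v[1] + (v[0] - v[1]) * alpha  # phi(x + c*x1) = phi0(x) + c*alpha

ok = all(phi(np.array([a, b])) <= p(np.array([a, b])) + 1e-9
         for a in (-3.0, 0.0, 2.5) for b in (-1.0, 0.0, 4.0))
print(ok)
```

The extended functional remains dominated by p, exactly as the proof promises.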

Lemma 3.35. Let X be a complex linear space.

1. For every complex linear functional ϕ : X → C, its real part ℜ(ϕ) is also a linear functional on X as a
real linear space. Furthermore,

ϕ(x) = ℜ(ϕ(x)) − iℜ(ϕ(ix)), ∀x ∈ X. (126)

2. For every linear functional ψ : X → R as a real linear space, the map

ϕ : X → C : x 7→ ψ(x) − iψ(ix)

is also a complex linear functional on X such that ψ = ℜ(ϕ).


Proof. 1. Let ϕ : X → C be a complex linear functional. Clearly, for every x, y ∈ X and c ∈ R,

ℜ(ϕ(cx + y)) = ℜ(cϕ(x) + ϕ(y)) = ℜ(cϕ(x)) + ℜ(ϕ(y)) = cℜ(ϕ(x)) + ℜ(ϕ(y)),

so ℜ(ϕ) is also a real linear functional on X. Furthermore, for every x ∈ X,

ℜ(ϕ(ix)) + iℑ(ϕ(ix)) = ϕ(ix) = iϕ(x) = i(ℜ(ϕ(x)) + iℑ(ϕ(x)))


= −ℑ(ϕ(x)) + iℜ(ϕ(x)).

By identifying the real parts, we see that ℜ(ϕ(ix)) = −ℑ(ϕ(x)), so

ϕ(x) = ℜ(ϕ(x)) + iℑ(ϕ(x)) = ℜ(ϕ(x)) − iℜ(ϕ(ix)).

2. Conversely, suppose that ψ : X → R is a real linear functional on X, and let ϕ : X → C be defined as above. Since ψ takes value
from real numbers, it is clear that ℜ(ϕ) = ψ. Then for every x, y ∈ X and c = a + ib ∈ C,

ϕ(x + y) = ψ(x + y) − iψ(i(x + y)) = ψ(x + y) − iψ(ix + iy)


= (ψ(x) + ψ(y)) − i(ψ(ix) + ψ(iy))
= (ψ(x) − iψ(ix)) + (ψ(y) − iψ(iy)) = ϕ(x) + ϕ(y),

and

ϕ(cx) = ψ(cx) − iψ(i(cx)) = ψ((a + ib)x) − iψ((−b + ia)x)


= ψ(ax) + ψ(ibx) − i(ψ(−bx) + ψ(iax))
= aψ(x) + bψ(ix) − i(−bψ(x) + aψ(ix))
= (a + ib)(ψ(x) − iψ(ix)) = cϕ(x).

Therefore, the map ϕ is also a linear functional on X. ■
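The correspondence of Lemma 3.35 is easy to verify numerically on C². Here ψ is an arbitrary real-linear functional (the real part of a complex pairing, chosen only for illustration) and ϕ(x) := ψ(x) − iψ(ix):

```python
import numpy as np

rng = np.random.default_rng(2)
w = rng.standard_normal(2) + 1j * rng.standard_normal(2)

psi = lambda x: float(np.real(np.vdot(w, x)))  # a real-linear functional on C^2
phi = lambda x: psi(x) - 1j * psi(1j * x)      # the complex functional of the lemma

x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
y = rng.standard_normal(2) + 1j * rng.standard_normal(2)
c = 1.3 - 0.7j

ok = (np.isclose(phi(c * x + y), c * phi(x) + phi(y))  # complex linearity
      and np.isclose(np.real(phi(x)), psi(x)))         # Re(phi) = psi
print(ok)
```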

Theorem 3.36 (Hahn-Banach, Complex Linear Functionals). Let X be a complex linear space and p : X → R
be a sublinear map. Then for every subspace U of X and linear functional ϕ0 : U → C, if ℜ(ϕ0 (x)) ≤ p(x) for
all x ∈ U,
1. there is a linear extension ϕ : X → C of ϕ0 such that ℜ(ϕ(x)) ≤ p(x) for all x ∈ X as well;

2. if furthermore p is a semi-norm on X, then we even have |ϕ(x)| ≤ p(x) for all x ∈ X in this case.
Proof. Let U be a subspace of X and ϕ0 : U → C be a linear functional such that ℜ(ϕ0 (x)) ≤ p(x) for all x ∈ U. By the preceding lemma,
we see that ℜ(ϕ0 ) is a real linear functional on U, so by the Hahn-Banach theorem for real linear functionals, there exists a real linear
functional ψ : X → R extending ℜ(ϕ0 ) such that ψ(x) ≤ p(x) for all x ∈ X. Furthermore, define

ϕ : X → C : x 7→ ψ(x) − iψ(ix),

which, by the preceding lemma, is also a complex linear functional on X with ℜ(ϕ) = ψ ≤ p. For each x ∈ U, since ix ∈ U as well,

ϕ(x) = ψ(x) − iψ(ix) = ℜ(ϕ0 (x)) − iℜ(ϕ0 (ix)) = ϕ0 (x).

Thus, the map ϕ is an extension of ϕ0 as well. Finally, let x ∈ X be arbitrary. We then put α := 1 if ϕ(x) = 0 and α := |ϕ(x)|/ϕ(x) otherwise.
Clearly, we always have |α| = 1 and
ϕ(αx) = αϕ(x) = |ϕ(x)| ∈ R,
so
|ϕ(x)| = ϕ(αx) = ψ(αx) ≤ p(αx) = |α|p(x) = p(x). ■

Remark 3.8. In particular, if |ϕ0 | ≤ p on U, since ℜ(ϕ0 ) ≤ |ϕ0 |, we shall have ℜ(ϕ0 ) ≤ p on U as well. This
allows the following special case of the preceding theorem:
Let X be a semi-normed complex linear space. Then for every subspace U of X and linear functional
ϕ0 : U → C, if |ϕ0 (x)| ≤ ∥x∥ for all x ∈ U, then there is a linear extension ϕ : X → C of ϕ0 such
that |ϕ(x)| ≤ ∥x∥ for all x ∈ X as well.
Corollary 3.37 (Hahn-Banach, Extension). Let X be a normed linear space with subspace U. Then for every
bounded linear functional ϕ0 ∈ U ∗ , there exists a bounded linear extension ϕ ∈ X ∗ of ϕ0 such that ∥ϕ∥ = ∥ϕ0 ∥.
Proof. Let ϕ0 ∈ U ∗ be an arbitrary bounded linear functional. Then we define

p : X → [0, ∞) : x ↦ ∥ϕ0 ∥∥x∥.

By Theorem 1.18, it is clear that p is a semi-norm on X, which is certainly sublinear. Furthermore, for each x ∈ X,

p(x) = ∥ϕ0 ∥∥x∥ ≥ |ϕ0 (x)|.

By the Hahn-Banach theorem for real/complex linear functionals, we can find a linear functional ϕ : X → F extending ϕ0 such that |ϕ(x)| ≤
p(x) = ∥ϕ0 ∥∥x∥ for all x ∈ X. Clearly, the map ϕ is bounded now with ∥ϕ∥ ≤ ∥ϕ0 ∥. Meanwhile,

∥ϕ0 ∥ = sup_{x∈U\{0X }} |ϕ0 (x)|/∥x∥ = sup_{x∈U\{0X }} |ϕ(x)|/∥x∥ ≤ sup_{x∈X\{0X }} |ϕ(x)|/∥x∥ = ∥ϕ∥.

The identity ∥ϕ0 ∥ = ∥ϕ∥ thus holds for sure, completing the proof. ■

Corollary 3.38 (Norming Functional). Let X be a normed linear space and x ∈ X be nonzero. Then there exists
a bounded linear functional ϑx ∈ X ∗ with ∥ϑx ∥ = 1 and ϑx (x) = ∥x∥. Furthermore,

∥x∥ = max_{ϕ∈X ∗ , ∥ϕ∥≤1} |ϕ(x)|. (127)

In particular, if ϕ(x) = 0 for all ϕ ∈ X ∗ , we must have x = 0X .
Proof. Consider the subspace U := Span(x), which is nonzero as x is. By the universal property of linear spaces, there is a unique linear
map ϑ : U → F such that ϑ (x) = ∥x∥. More precisely, the map ϑ is given by

ϑ : U → F : λ x ↦ λ ∥x∥.

Clearly, for every λ ∈ F,


|ϑ (λ x)| = |λ ∥x∥| = |λ |∥x∥ = ∥λ x∥,
hence ϑ is also bounded with ∥ϑ ∥ = 1. By the preceding Hahn-Banach theorem, we can find an extension ϑx ∈ X ∗ of ϑ with ∥ϑx ∥ = ∥ϑ ∥ = 1
and ϑx (x) = ϑ (x) = ∥x∥ as well. Finally, for each ϕ ∈ X ∗ with ∥ϕ∥ ≤ 1, we have |ϕ(x)| ≤ ∥ϕ∥∥x∥ ≤ ∥x∥, so

∥x∥ ≥ sup_{ϕ∈X ∗ , ∥ϕ∥≤1} |ϕ(x)|.

Now the equality is attained, and the supremum degenerates to a maximum, because of the bounded linear functional ϑx ∈ X ∗ . As we can see,
if ϕ(x) = 0 for all ϕ ∈ X ∗ , we must have ∥x∥ = 0 whence x = 0X . The proof is complete. ■
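In a Euclidean (more generally, any inner product) space the norming functional is explicit, anticipating Remark 3.10: ϑx = ⟨·, x/∥x∥⟩. A quick check with an arbitrary vector:

```python
import numpy as np

x = np.array([3.0, 0.0, 4.0])          # ||x|| = 5 in the Euclidean norm
z = x / np.linalg.norm(x)
theta = lambda v: float(z @ v)          # candidate norming functional of x

rng = np.random.default_rng(3)
ok = (np.isclose(theta(x), np.linalg.norm(x))             # theta(x) = ||x||
      and all(abs(theta(v)) <= np.linalg.norm(v) + 1e-12  # |theta| <= ||.||, so ||theta|| = 1
              for v in rng.standard_normal((50, 3))))
print(ok)
```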

Corollary 3.39. Let X be a normed linear space with proper closed subspace U. For every x0 ∈ X \ U, there
exists ϕ ∈ X ∗ such that ϕ(x) = 0 for all x ∈ U and ϕ(x0 ) ̸= 0.
Proof. Let x0 ∈ X \U be arbitrary. Consider the quotient map π : X → X/U, which is also a bounded linear map by Theorem 3.13. Since U
is closed and x0 ∉ U, note that π(x0 ) = x0 +U ≠ 0X/U , so by the preceding corollary, we can find ϑ ∈ (X/U)∗ with ∥ϑ ∥ = 1 and ϑ (x0 +U) = ∥x0 +U∥ ≠ 0. In this case,
we may put ϕ := ϑ ◦ π, which is also bounded and linear. Clearly,

∀x ∈ U : ϕ(x) = ϑ (x +U) = ϑ (0X/U ) = 0 and ϕ(x0 ) = ϑ (x0 +U) = ∥x0 +U∥ ≠ 0. ■

3.5 The Riesz Representation Theorem and Reflexivity


Definition 3.11. Let X be a normed linear space. The continuous bidual of X is defined as the continuous dual
of the space X ∗ , denoted by X ∗∗ := (X ∗ )∗ .

Theorem 3.40 (Evaluation Map and Reflexivity). Let X be a normed linear space. Then for every x ∈ X, the
following evaluation map
evx : X ∗ → F : ϕ ↦ ϕ(x) (128)

is also a bounded linear functional on X ∗ with ∥ evx ∥ = ∥x∥. Furthermore, we also have a linear isometry

ev : X → X ∗∗ : x ↦ evx . (129)

Proof. Fix an arbitrary x ∈ X, and let evx : X ∗ → F be defined as above. Then for every ϕ, ψ ∈ X ∗ and c ∈ F,

evx (cϕ + ψ) = (cϕ + ψ)(x) = cϕ(x) + ψ(x) = c evx (ϕ) + evx (ψ).

This shows that the map evx is a linear functional on X ∗ . Furthermore, for every ϕ ∈ X ∗ , we see that

| evx (ϕ)| = |ϕ(x)| ≤ ∥ϕ∥∥x∥.

Consequently, the linear functional evx : X ∗ → F is also bounded with ∥ evx ∥ ≤ ∥x∥, hence evx ∈ X ∗∗ . In addition, by considering the norming
functional ϑx ∈ X ∗ described in Corollary 3.38, we see that

| evx (ϑx )| = ϑx (x) = ∥x∥ = ∥ϑx ∥∥x∥.

Here the last equality is attained because ∥ϑx ∥ = 1. Thus, we must have ∥ evx ∥ = ∥x∥ in this case. Finally, let ev : X → X ∗∗ be as defined
above and x, y ∈ X be arbitrary. Then for every c ∈ F and ϕ ∈ X ∗ ,

evcx+y (ϕ) = ϕ(cx + y) = cϕ(x) + ϕ(y) = c evx (ϕ) + evy (ϕ) = (c evx + evy )(ϕ),

so evcx+y = c evx + evy holds for sure. As a result, the map ev is also linear. Again, for each x ∈ X, since ∥ evx ∥ = ∥x∥, we can see that ev is
an isometry, as desired. ■

Definition 3.12. A normed linear space X is called reflexive if the canonical isometry ev : X → X ∗∗ is also
surjective, namely an isometric isomorphism.

Remark 3.9. In fact, every reflexive normed linear space X is necessarily complete, as it is isometrically iso-
morphic to its bidual X ∗∗ = B(X ∗ , F), which is always a Banach space (cf. Theorem 3.18). Therefore, we shall
always refer to reflexive Banach spaces only.

Lemma 3.41 (Bounded Linear Functional By Inner Products). Let X be an inner product space. For every
z ∈ X, the following map
ηz : X → F : x ↦ ⟨x, z⟩ (130)

is a bounded linear functional on X with ∥ηz ∥ = ∥z∥.


Proof. Fix an arbitrary z ∈ X, and let ηz : X → F be defined as above. Then for every x, y ∈ X and c ∈ F,

ηz (cx + y) = ⟨cx + y, z⟩ = c⟨x, z⟩ + ⟨y, z⟩ = cηz (x) + ηz (y).

Consequently, the map ηz is linear. Furthermore, for every x ∈ X, by CBS inequality,

|ηz (x)| = |⟨x, z⟩| ≤ ∥x∥∥z∥.

Therefore, the linear functional ηz is bounded with ∥ηz ∥ ≤ ∥z∥. Note that |ηz (z)| = |⟨z, z⟩| = ∥z∥2 , so the equality ∥ηz ∥ = ∥z∥ is attained
now. ■

Remark 3.10. In particular, when z ≠ 0X , the map ∥z∥−1 ηz ∈ X ∗ is precisely a norming functional of z, as

(∥z∥−1 ηz )(z) = ∥z∥−1 ∥z∥2 = ∥z∥ and ∥∥z∥−1 ηz ∥ = ∥z∥−1 ∥ηz ∥ = ∥z∥−1 ∥z∥ = 1.

Lemma 3.42. Let X be a Hilbert space and ϕ ∈ X ∗ . If ϕ ̸= 0, then the orthogonal complement of ker(ϕ) is of
dimension 1.
Proof. Suppose that ϕ ̸= 0. By Corollary 3.2, its kernel ker(ϕ) is a closed subspace of X. Furthermore, because ϕ ̸= 0, its kernel is also
proper in X. By Theorem 2.38, we see that ker(ϕ)⊥ is nonzero then.
Next, fix nonzero elements x1 , x2 ∈ ker(ϕ)⊥ . Clearly, ϕ(x1 ) and ϕ(x2 ) are both nonzero, for otherwise one of x1 , x2 would lie in
ker(ϕ) ∩ ker(ϕ)⊥ = {0X }, forcing it to be zero. Then in this case, for a := −ϕ(x1 )ϕ(x2 )−1 ,

ϕ(x1 + ax2 ) = ϕ(x1 ) + aϕ(x2 ) = ϕ(x1 ) + (−ϕ(x1 )ϕ(x2 )−1 )ϕ(x2 ) = ϕ(x1 ) − ϕ(x1 ) = 0.

As a result, we have x1 + ax2 ∈ ker(ϕ). However, because ker(ϕ)⊥ is a subspace of X, we also have x1 + ax2 ∈ ker(ϕ)⊥ . Therefore,
we must have x1 + ax2 = 0X . Since a ≠ 0, the two elements x1 , x2 are linearly dependent. As any two nonzero elements of ker(ϕ)⊥ are
linearly dependent while ker(ϕ)⊥ ≠ {0X }, the subspace ker(ϕ)⊥ is of dimension 1. ■

Remark 3.11. When X is finite-dimensional, the above lemma would be trivial: For nonzero ϕ ∈ X ∗ = L (X, F),
we see that its rank, namely the dimension of its range, is precisely equal to 1. By the rank-nullity theorem, the
dimension of ker(ϕ) is equal to dim(X) − 1. Since X = ker(ϕ) ⊕ ker(ϕ)⊥ , we see that ker(ϕ)⊥ necessarily has
dimension 1 as well.

Theorem 3.43 (Riesz-Fréchet Representation Theorem). Let X be a Hilbert space. Then the following map

η : X → X ∗ : z ↦ (ηz : X → F : x ↦ ⟨x, z⟩). (131)

is a conjugate linear bijective isometry. In particular, for every bounded linear functional ϕ ∈ X ∗ , there exists a
unique element z ∈ X such that ϕ(x) = ⟨x, z⟩ for all x ∈ X.
Proof. Let η : X → X ∗ be as defined. First, we show that η is conjugate linear: Let z, z′ ∈ X and c ∈ F be arbitrary. Then for every x ∈ X,

ηcz+z′ (x) = ⟨x, cz + z′ ⟩ = c̄⟨x, z⟩ + ⟨x, z′ ⟩ = c̄ηz (x) + ηz′ (x) = (c̄ηz + ηz′ )(x).

This shows that ηcz+z′ = c̄ηz + ηz′ , hence η is conjugate linear. Furthermore, by Lemma 3.41, we see that ∥ηz ∥ = ∥z∥ for all z ∈ X, hence
η is also an isometry. What remains is to prove that η is bijective:
η is also an isometry. Yhat remains is to prove that η is bijective:
• Let z, z′ ∈ X be such that ηz = ηz′ . Then for every x ∈ X,

0 = ηz (x) − ηz′ (x) = ⟨x, z⟩ − ⟨x, z′ ⟩ = ⟨x, z − z′ ⟩.

Since x ∈ X is arbitrary, it follows that z − z′ = 0X , namely z = z′ . Therefore, the map η is injective.


• Now let ϕ ∈ X ∗ be arbitrary. Ye may simply put z = 0X when ϕ = 0. Suppose that ϕ ̸= 0 in this case. By the preceding lemma, we
see that ker(ϕ)⊥ is of dimension 1, hence every nonzero element in it can serve as a base. For convenience, we may fix an arbitrary
unit element z0 ∈ X. For every x ∈ X, by Theorem 2.42, we see that the orthogonal projection of x onto ker(ϕ)⊥ is precisely ⟨x, z0 ⟩z0 .
Then according to Theorem 2.38, we have

x − ⟨x, z0 ⟩z0 ∈ (ker(ϕ)⊥ )⊥ = ker(ϕ),

where the last equality holds because ker(ϕ) is a closed subspace of X (cf. Corollary 3.2). As a result, we have

ϕ(x) = ϕ((x − ⟨x, z0 ⟩z0 ) + ⟨x, z0 ⟩z0 ) = ϕ(x − ⟨x, z0 ⟩z0 ) + ϕ(⟨x, z0 ⟩z0 )
= ϕ(⟨x, z0 ⟩z0 ) = ⟨x, z0 ⟩ϕ(z0 ) = ⟨x, ϕ(z0 )z0 ⟩.

Consequently, for z := \overline{ϕ(z0 )}z0 , we have

ϕ(x) = ⟨x, \overline{ϕ(z0 )}z0 ⟩ = ⟨x, z⟩ = ηz (x),

namely ϕ = ηz holds for sure. ■

Remark 3.12. Here are some remarks on the Riesz representation theorem:

• When X is finite-dimensional, we can find such z for a given nonzero ϕ ∈ X ∗ as follows: Let {e1 , . . . , en } be
an orthonormal base of X (cf. Corollary 2.32). Then for every x ∈ X,

ϕ(x) = ϕ(∑_{k=1}^n ⟨x, ek ⟩ek ) = ∑_{k=1}^n ⟨x, ek ⟩ϕ(ek ) = ∑_{k=1}^n ⟨x, \overline{ϕ(ek )}ek ⟩ = ⟨x, ∑_{k=1}^n \overline{ϕ(ek )}ek ⟩.

The element
z := ∑_{k=1}^n \overline{ϕ(ek )}ek ∈ X
certainly suffices. In particular, we may also prescribe that {e1 , . . . , en−1 } is an orthonormal base of
ker(ϕ), while by adding en , we obtain an orthonormal base of X. Then z = \overline{ϕ(en )}en is precisely of the
form presented in the previous proof.

• Inspired from this, many functional analysis texts uses inner-product notation to describe functionals in
normed linear spaces as well, namely given any normed linear space X, if x ∈ X and ϕ ∈ X ∗ , they write

⟨x, ϕ⟩ := ϕ(x). (132)
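As a small numerical check of the finite-dimensional recipe above (a sketch assuming NumPy is available; the names `phi`, `inner`, and `z` are ours, not the text's), the following builds z = ∑_k \overline{ϕ(ek )}ek in C^4 and verifies ϕ(x) = ⟨x, z⟩ for an inner product that is linear in the first slot and conjugate-linear in the second, matching the convention used here:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Inner product <x, z> = sum_k x_k * conj(z_k): linear in x, conjugate-linear in z.
def inner(x, z):
    return np.vdot(z, x)  # np.vdot conjugates its FIRST argument

c = rng.standard_normal(n) + 1j * rng.standard_normal(n)
phi = lambda x: c @ x                  # a bounded linear functional on C^4
e = np.eye(n)                          # the standard orthonormal base

# Riesz representative z = sum_k conj(phi(e_k)) e_k; here it is simply conj(c).
z = sum(np.conj(phi(e[k])) * e[k] for k in range(n))

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
assert np.isclose(phi(x), inner(x, z))  # phi(x) = <x, z>
```

Without the conjugation on the coefficients ϕ(ek ), the check fails for complex scalars, which is exactly why η is conjugate linear.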

Corollary 3.44. Every Hilbert space X is reflexive, whose continuous dual X ∗ is also a Hilbert space under the
inner product
∀ϕ, ψ ∈ X ∗ : ⟨ϕ, ψ⟩X ∗ := ⟨zψ , zϕ ⟩X , (133)

where zϕ , zψ ∈ X are the unique representatives ensured by the Riesz representation theorem.
Proof. Let X be a Hilbert space. First, we show that X ∗ is also a Hilbert space under the map defined above: For convenience, let
η : X → X ∗ be the conjugate-linear bijective isometry constructed in the Riesz representation theorem. Then the map ⟨·, ·⟩X ∗ on X ∗ can also
be described as
∀ϕ, ψ ∈ X ∗ : ⟨ϕ, ψ⟩X ∗ := ⟨η −1 (ψ), η −1 (ϕ)⟩X ,
which is certainly an inner product on X ∗ . Furthermore, for every ϕ ∈ X ∗ , denoting by ∥ϕ∥op its operator norm,

∥ϕ∥2op = ∥η −1 (ϕ)∥2 = ⟨η −1 (ϕ), η −1 (ϕ)⟩X = ⟨ϕ, ϕ⟩X ∗ .

Consequently, the operator norm on X ∗ is precisely induced by the new inner product ⟨·, ·⟩X ∗ . Since X ∗ is already a Banach space under the
operator norm, it is thus a Hilbert space under the inner product defined above.
Finally, let f ∈ X ∗∗ be arbitrary. Then by applying the Riesz representation theorem to X ∗∗ , we can find a unique ψ ∈ X ∗ such that
f (ϕ) = ⟨ϕ, ψ⟩X ∗ for all ϕ ∈ X ∗ . In this case,

f (ϕ) = ⟨ϕ, ψ⟩X ∗ = ⟨η −1 (ψ), η −1 (ϕ)⟩X = ϕ(η −1 (ψ)) = evη −1 (ψ) (ϕ),

hence we have f = evη −1 (ψ) , showing that the canonical isometry ev : X → X ∗∗ is surjective as well. Therefore, X is reflexive. ■

Definition 3.13. Let X be a normed linear space.

1. For every nonempty A ⊆ X, its annihilator is defined as

A⊥ := {ϕ ∈ X ∗ | ∀x ∈ A : ϕ(x) = 0}. (134)

2. For every nonempty B ⊆ X ∗ , its annihilator is defined as

B⊥ := {x ∈ X | ∀ϕ ∈ B : ϕ(x) = 0}. (135)

Remark 3.13. The Riesz representation theorem explains why the notation ⊥ is adopted in the above definitions:
Suppose that X is a Hilbert space with A ⊆ X. By the Riesz representation theorem, we have a bijection between
the following two sets:

{ϕ ∈ X ∗ | ∀x ∈ A : ϕ(x) = 0} ↔ {z ∈ X | ∀x ∈ A : ⟨x, z⟩ = 0},

where the latter is precisely the orthogonal complement of A. That is why the former set is denoted by A⊥ even
for normed linear spaces.
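In F^n this identification can be made very concrete. The sketch below (assuming NumPy; the variable names are ours) represents functionals as row vectors c acting by x ↦ c @ x, so the annihilator of a pair of vectors is just the null space of the matrix having them as columns:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2))     # columns a_1, a_2: two vectors in R^5
# Functionals on R^5 as row vectors c; the annihilator is {c : c @ A = 0},
# i.e. the null space of A^T, read off from the trailing rows of Vh in the SVD.
_, _, Vh = np.linalg.svd(A.T)
ann = Vh[2:]                        # a basis of the annihilator: 5 - 2 = 3 rows
assert np.allclose(ann @ A, 0, atol=1e-10)
```

The dimension count 5 − 2 = 3 matches the general fact that the annihilator of a k-dimensional subspace of F^n has dimension n − k.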
Theorem 3.45. Let X be a normed linear space. Then for every nonempty A ⊆ X (resp. B ⊆ X ∗ ), its annihilator
is a closed subspace of X ∗ (resp. of X).
Proof. 1. Let A ⊆ X be nonempty. Clearly, the zero map in X ∗ is certainly contained in A⊥ , so A⊥ is nonempty. Furthermore, let ϕ, ψ ∈ A⊥
and c ∈ F be arbitrary. Then for every x ∈ A,

(cϕ + ψ)(x) = cϕ(x) + ψ(x) = c(0) + 0 = 0.

This shows that cϕ + ψ ∈ A⊥ , implying that A⊥ is a subspace of X ∗ . Finally, let (ϕn )n∈N be a sequence in A⊥ that converges to some ϕ ∈ X ∗ .
Then for every x ∈ A,
|ϕ(x)| = |ϕ(x) − 0| = |ϕ(x) − ϕn (x)| = |(ϕ − ϕn )(x)| ≤ ∥ϕ − ϕn ∥∥x∥ → 0. (as n → ∞)
This shows that ϕ(x) = 0 for all x ∈ A, hence ϕ ∈ A⊥ as well. Consequently, A⊥ is a closed subspace of X ∗ , as desired.
2. Similarly, let B ⊆ X ∗ be nonempty. Clearly, we have 0X ∈ B⊥ , so B⊥ is nonempty. Next, let x, y ∈ B⊥ and c ∈ F. Then for every
ϕ ∈ B,
ϕ(cx + y) = cϕ(x) + ϕ(y) = c(0) + 0 = 0,
hence cx + y ∈ B⊥ as well. This shows that B⊥ is a subspace of X. Finally, let (xn )n∈N be a sequence in B⊥ converging to some x ∈ X. Then
given any ϕ ∈ B, since ϕ is continuous, we see that

ϕ(x) = lim ϕ(xn ) = lim 0 = 0.


n→∞ n→∞

Consequently, we also have x ∈ B⊥ , so B⊥ is closed in X. ■

Theorem 3.46. Let X be a normed linear space with closed subspace U. Then the following map

T : (X/U)∗ → U ⊥ : ϕ 7→ ϕ ◦ π (136)

is an isometric isomorphism, where π : X → X/U is the quotient map.


Proof. By Corollary 3.13, the quotient map π : X → X/U is linear and bounded. Consequently, given any ϕ ∈ (X/U)∗ , we certainly have
ϕ ◦ π ∈ X ∗ . In particular, given any x ∈ U, we have π(x) = 0X/U , so
(ϕ ◦ π)(x) = ϕ(π(x)) = ϕ(0X/U ) = 0.
This shows that ϕ ◦ π ∈ U ⊥ for all ϕ ∈ (X/U)∗ , so the above map T is well-defined.
• Let ϕ, ψ ∈ (X/U)∗ and c ∈ F. Then for every x ∈ X,

(T (cϕ + ψ))(x) = (cϕ + ψ)(x̄) = cϕ(x̄) + ψ(x̄) = c(T ϕ)(x) + (T ψ)(x) = (c(T ϕ) + T ψ)(x),

where x̄ := π(x). Consequently, we have T (cϕ + ψ) = c(T ϕ) + T ψ, implying that T is linear.


• Let ϕ ∈ (X/U)∗ be arbitrary. Clearly,

∥T ϕ∥ = sup_{x∈X, ∥x∥<1} |(T ϕ)(x)| = sup_{x∈X, ∥x∥<1} |ϕ(x̄)| =(∗) sup_{x̄∈X/U, ∥x̄∥<1} |ϕ(x̄)| = ∥ϕ∥.

The only unclear part here is the equality (∗): Since ∥x̄∥ ≤ ∥x∥, the inequality ≤ holds for sure. Conversely, let x ∈ X be such that ∥x̄∥ < 1.
Then by the definition of the quotient norm, there exists z ∈ U such that ∥x − z∥ < 1 as well. In this case, because z ∈ U, we have z̄ = 0X/U , so

ϕ(z̄) = ϕ(0X/U ) = 0.

As a result,
ϕ(\overline{x − z}) = ϕ(x̄ − z̄) = ϕ(x̄) − ϕ(z̄) = ϕ(x̄) − 0 = ϕ(x̄).

Now because ∥x − z∥ < 1, the inequality ≥ is true as well. Consequently, the linear map T is also an isometry, whence bounded as
well.
• Finally, for each ψ ∈ U ⊥ , it is clear that U ⊆ ker(ψ). Then by the factorization theorem (cf. Theorem 3.15), there is a unique
bounded linear functional ϕ : X/U → F such that ϕ(x̄) = ψ(x) for all x ∈ X and ∥ϕ∥ = ∥ψ∥. Clearly, we have ψ = T ϕ in this case,
so T is also surjective.
In conclusion, the map T is an isometric isomorphism, as desired. ■

Theorem 3.47. Let X be a normed linear space.

1. For every subspace U of X,


(U ⊥ )⊥ = Ū. (137)

In particular, the subspace U is dense in X if and only if U ⊥ = {0}.

2. For every subspace Y of X ∗ ,


(Y⊥ )⊥ ⊇ Ȳ . (138)

Proof. 1. Let U be a subspace of X. We prove the identity first:

• For each x ∈ U, since ϕ(x) = 0 for all ϕ ∈ U ⊥ , it is clear that x ∈ (U ⊥ )⊥ . Consequently, we have U ⊆ (U ⊥ )⊥ and hence Ū ⊆ (U ⊥ )⊥
as well, since (U ⊥ )⊥ is closed in X.
• Let x ∈ X \ Ū. Since Ū is closed in X, by Corollary 3.39, there exists ϕ ∈ Ū ⊥ ⊆ U ⊥ such that ϕ(x) ̸= 0. In this case, we also have
x ∈ X \ (U ⊥ )⊥ , so (U ⊥ )⊥ ⊆ Ū as well.
Consequently,
U is dense in X ⇐⇒ Ū = X ⇐⇒ (U ⊥ )⊥ = X ⇐⇒ U ⊥ = {0}.
2. Let Y be a subspace of X ∗ and ϕ ∈ Y be arbitrary. Then for each x ∈ Y⊥ , we see that ϕ(x) = 0, so ϕ ∈ (Y⊥ )⊥ . This shows that
Y ⊆ (Y⊥ )⊥ , hence Ȳ ⊆ (Y⊥ )⊥ as well since (Y⊥ )⊥ is closed in X ∗ . ■

Corollary 3.48. A normed linear space is separable, namely containing a countable dense subset, whenever its
continuous dual is.
Proof. Let X be a normed linear space such that X ∗ is separable. Suppose that (ϕn )n∈N is a family dense in X ∗ . Then for each n ∈ N,
by the definition of operator norm, we can find some xn ∈ X with ∥xn ∥ ≤ 1 such that |ϕn (xn )| ≥ ∥ϕn ∥/2. We now claim that the subspace
U := Span(xn | n ∈ N) is dense in X: By the preceding theorem, it suffices to show that U ⊥ = {0}. Let ϕ ∈ U ⊥ be arbitrary. Then for every
n ∈ N, since ϕ(xn ) = 0 and ∥xn ∥ ≤ 1, we may observe that

∥ϕ − ϕn ∥ ≥ |ϕ(xn ) − ϕn (xn )| = |ϕn (xn )| ≥ (1/2)∥ϕn ∥ = (1/2)∥ϕ − (ϕ − ϕn )∥ ≥ (1/2)(∥ϕ∥ − ∥ϕ − ϕn ∥),

where the last step is the triangle inequality. Rearranging yields ∥ϕ∥ ≤ 3∥ϕ − ϕn ∥ for every n ∈ N. Because (ϕn )n∈N is dense in X ∗ , we
have infn∈N ∥ϕ − ϕn ∥ = 0, hence ∥ϕ∥ = 0 holds for sure. In other words, now we have ϕ = 0, as desired.
Finally, let Q := Q if F = R, or Q := Q(i) = {a + ib | a, b ∈ Q} if F = C, which is countable and dense in F. Meanwhile, since (xn )n∈N
is a spanning family of U, we see that the set D of Q-linear combinations of (xn )n∈N is also dense in U as well as in X. Clearly, the set D is
also countable, so the space X is also separable. ■

Theorem 3.49. Let X be a normed linear space with subspace U. Then the following map

T : X ∗ /U ⊥ → U ∗ : ϕ + U ⊥ 7→ ϕ|U (139)

is an isometric isomorphism.
Proof. First, as shown later in Corollary 3.51, we have a bounded linear map

S : X ∗ → U ∗ : ϕ 7→ ϕ|U .

We claim that S is surjective and U ⊥ = ker(S):


• For every ψ ∈ U ∗ , since U is a subspace of X, by Hahn-Banach theorem (Corollary 3.37), there exists ϕ ∈ X ∗ such that ψ = ϕ|U =
Sϕ. Consequently, the map S is surjective.
• For every ϕ ∈ X ∗ ,
ϕ ∈ ker(S) ⇐⇒ ϕ|U = 0 ⇐⇒ ∀x ∈ U : ϕ(x) = 0 ⇐⇒ ϕ ∈ U ⊥ .
Therefore, the identity U ⊥ = ker(S) holds for sure.
Then by the factorization theorem (cf. Theorem 3.15), the map T defined above is also a bounded linear isomorphism. By Theorem 3.7,
what remains is to show that T is also an isometry: Let ϕ ∈ X ∗ be arbitrary.
• Fix an arbitrary ψ ∈ U ⊥ . Since ψ(x) = 0 for all x ∈ U,

∥T (ϕ + U ⊥ )∥ = ∥ ϕ|U ∥ = sup_{x∈U, ∥x∥≤1} |ϕ(x)| = sup_{x∈U, ∥x∥≤1} |ϕ(x) − ψ(x)| ≤ sup_{x∈X, ∥x∥≤1} |ϕ(x) − ψ(x)| = ∥ϕ − ψ∥.

By taking the infimum over all such ψ, it follows that ∥T (ϕ + U ⊥ )∥ ≤ ∥ϕ + U ⊥ ∥, the quotient norm of ϕ + U ⊥ .

• Meanwhile, again by the Hahn-Banach theorem (Corollary 3.37), there exists ϕ ′ ∈ X ∗ such that ϕ|U = ϕ ′ |U and ∥ ϕ|U ∥ = ∥ϕ ′ ∥. In this
case, we certainly have ϕ − ϕ ′ ∈ U ⊥ , so ϕ + U ⊥ = ϕ ′ + U ⊥ , whence

∥ϕ + U ⊥ ∥ = ∥ϕ ′ + U ⊥ ∥ ≤ ∥ϕ ′ ∥ = ∥ ϕ|U ∥ = ∥T (ϕ + U ⊥ )∥. ■

3.6 The Adjoint of Bounded Linear Maps


Definition 3.14. Let X,Y be normed linear spaces and T : X → Y be a bounded linear map. Then the adjoint of
T is defined as the following map:
T ∗ : Y ∗ → X ∗ : ψ 7→ ψ ◦ T. (140)

Remark 3.14. In terms of the “inner-product” notations, one has

⟨x, T ∗ ψ⟩ = ⟨T x, ψ⟩, ∀x ∈ X and ψ ∈ Y ∗ . (141)

Theorem 3.50. Let X,Y be normed linear spaces and T : X → Y be a bounded linear map. Then the adjoint
T ∗ : Y ∗ → X ∗ is also linear and bounded with ∥T ∗ ∥ = ∥T ∥.
Proof. First, for every ψ, ψ ′ ∈ Y ∗ , c ∈ F and x ∈ X, we see that

(T ∗ (ψ + ψ ′ ))(x) = (ψ + ψ ′ )(T x) = ψ(T x) + ψ ′ (T x)
= (T ∗ ψ)(x) + (T ∗ ψ ′ )(x) = (T ∗ ψ + T ∗ ψ ′ )(x)

and
(T ∗ (cψ))(x) = (cψ)(T x) = cψ(T x) = c(T ∗ ψ)(x) = (c(T ∗ ψ))(x).
Then we have
T ∗ (ψ + ψ ′ ) = T ∗ ψ + T ∗ ψ ′ and T ∗ (cψ) = cT ∗ ψ,
implying that T ∗ : Y ∗ → X ∗ is linear. Furthermore, given any ψ ∈ Y ∗ ,

∥T ∗ ψ∥ = sup_{x∈X, ∥x∥≤1} |(T ∗ ψ)(x)| = sup_{x∈X, ∥x∥≤1} |ψ(T x)| ≤ sup_{x∈X, ∥x∥≤1} ∥ψ∥∥T x∥ = ∥ψ∥ sup_{x∈X, ∥x∥≤1} ∥T x∥ = ∥ψ∥∥T ∥.

Therefore, the linear map T ∗ is also bounded with ∥T ∗ ∥ ≤ ∥T ∥. Finally, let x ∈ X with ∥x∥ ≤ 1. Then we have

∥T ∗ ∥ = sup_{ψ∈Y ∗ , ∥ψ∥≤1} ∥T ∗ ψ∥ ≥ sup_{ψ∈Y ∗ , ∥ψ∥≤1} |(T ∗ ψ)(x)| = sup_{ψ∈Y ∗ , ∥ψ∥≤1} |ψ(T x)| = ∥T x∥,

where the last equality follows from Corollary 3.38. By taking the supremum over all such x, it is clear that ∥T ∗ ∥ ≥ ∥T ∥ as well. The
identity ∥T ∗ ∥ = ∥T ∥ is thus clear. ■

Corollary 3.51. Let X be a normed linear space with subspace U. Then the restriction map

T : X ∗ → U ∗ : ϕ 7→ ϕ|U (142)

is bounded linear with ∥T ∥ ≤ 1, and even injective if U is dense in X.


Proof. Consider the inclusion map ι : U → X. It is clear that T = ι ∗ , hence T is a bounded linear map. In fact, for every ϕ ∈ X ∗ ,

∥T ϕ∥ = ∥ ϕ|U ∥ = sup_{x∈U, ∥x∥≤1} |ϕ(x)| ≤ sup_{x∈X, ∥x∥≤1} |ϕ(x)| = ∥ϕ∥,

so we even have ∥T ∥ ≤ 1 in this case.


Meanwhile, for every two maps ϕ, ϕ ′ ∈ X ∗ , if ϕ = ϕ ′ on U, then by continuity ϕ = ϕ ′ on Ū. When U is dense in X, we thus have ϕ = ϕ ′
on X as well. Thus, the map T is also injective in that case. ■

Theorem 3.52 (Properties of Adjoints). Let X,Y, Z be normed linear spaces.

1. We have id∗X = idX ∗ .

2. For every S, T ∈ B(X,Y ) and c ∈ F,

(S + T )∗ = S∗ + T ∗ and (cT )∗ = cT ∗ . (143)

3. For every T ∈ B(X,Y ) and S ∈ B(Y, Z),

(S ◦ T )∗ = T ∗ ◦ S∗ . (144)

4. For every T ∈ B(X,Y ), if it admits a bounded inverse, so does T ∗ , with (T ∗ )−1 = (T −1 )∗ (proved in Corollary 3.53 below).


Proof. 1. Clearly, for every ϕ ∈ X ∗ , we have
id∗X ϕ = ϕ ◦ idX = ϕ,
so id∗X = idX ∗ holds for sure.

2. Let S, T ∈ B(X,Y ) and c ∈ F. Then for every ψ ∈ Y ∗ ,

(S + T )∗ ψ = ψ ◦ (S + T ) = ψ ◦ S + ψ ◦ T = S∗ ψ + T ∗ ψ = (S∗ + T ∗ )ψ

and
(cT )∗ ψ = ψ ◦ (cT ) = c(ψ ◦ T ) = c(T ∗ ψ) = (cT ∗ )ψ.
Consequently, we certainly have (S + T )∗ = S∗ + T ∗ and (cT )∗ = cT ∗ .
3. Let T ∈ B(X,Y ) and S ∈ B(Y, Z). For every ψ ∈ Z ∗ ,

(S ◦ T )∗ ψ = ψ ◦ (S ◦ T ) = (ψ ◦ S) ◦ T = T ∗ (S∗ ψ) = (T ∗ ◦ S∗ )ψ.

Therefore, we see that (S ◦ T )∗ = T ∗ ◦ S∗ in this case. ■
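In finite dimensions, with functionals on F^n identified with row vectors, the Banach adjoint of a matrix map acts by right multiplication, and property 3 collapses to the familiar reversal rule for matrix products. A minimal sketch, assuming NumPy (the names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
T = rng.standard_normal((4, 3))   # T : R^3 -> R^4
S = rng.standard_normal((5, 4))   # S : R^4 -> R^5
psi = rng.standard_normal(5)      # a functional on R^5, as a row vector

# Banach adjoint of a matrix map M: (M* psi)(x) = psi(M x), i.e. M* psi = psi @ M.
lhs = psi @ (S @ T)               # (S o T)* applied to psi
rhs = (psi @ S) @ T               # T* o S*: apply S* first, then T*
assert np.allclose(lhs, rhs)
```

Here the order reversal (S ◦ T )∗ = T ∗ ◦ S∗ is just associativity of matrix multiplication read from the right.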

Corollary 3.53. Let X,Y be normed linear spaces. If T : X → Y is an isomorphism, so is T ∗ : Y ∗ → X ∗ with

(T ∗ )−1 = (T −1 )∗ =: T −∗ . (145)

In particular, if X and Y are (isometrically) isomorphic, then X ∗ and Y ∗ are (isometrically) isomorphic as well.
Proof. Let T : X → Y be an isomorphism. Note that T −1 ◦ T = idX and T ◦ T −1 = idY , while T −1 is also bounded. By taking adjoints over
both sides, we have
idX ∗ = id∗X = (T −1 ◦ T )∗ = T ∗ ◦ (T −1 )∗ and idY ∗ = idY∗ = (T ◦ T −1 )∗ = (T −1 )∗ ◦ T ∗ .
As we can see, the map (T −1 )∗ is the functional inverse of T ∗ , which is also bounded as T −1 is. This shows that T ∗ is also an isomorphism
such that (T ∗ )−1 = (T −1 )∗ holds.
Next, suppose that T is also an isometry. Then ∥T ∗ ∥ = ∥T ∥ = 1 as well. For every ψ ∈ Y ∗ , it is immediate that ∥T ∗ ψ∥ ≤ ∥T ∗ ∥∥ψ∥ =
∥ψ∥. Meanwhile,
∥ψ∥ = ∥T −∗ (T ∗ ψ)∥ ≤ ∥T −∗ ∥∥T ∗ ψ∥ = (1)∥T ∗ ψ∥ = ∥T ∗ ψ∥,
because T −1 is also an isometric isomorphism (cf. Theorem 3.7). Therefore, we see that ∥T ∗ ψ∥ = ∥ψ∥ for all ψ ∈ Y ∗ , hence T ∗ is an
isometric isomorphism as well in this case. ■

Lemma 3.54. Let X,Y be normed linear spaces and T : X → Y be a bounded linear map. Then

T ∗∗ ◦ evX = evY ◦T, (146)

where evX : X → X ∗∗ and evY : Y → Y ∗∗ are the canonical linear isometries on X and Y , respectively.
Proof. Let x ∈ X be arbitrary. Then for every ψ ∈ Y ∗ ,

((T ∗∗ ◦ evX )(x))(ψ) = (T ∗∗ (evx ))(ψ) = evx (T ∗ ψ)


= (T ∗ ψ)(x) = ψ(T x) = (evT x )(ψ),

so we have (T ∗∗ ◦ evX )(x) = evT x = (evY ◦T )(x). Since x ∈ X is arbitrary, it follows that T ∗∗ ◦ evX = evY ◦T . ■

Theorem 3.55. Let X,Y be normed linear spaces. Then a bounded linear map S : Y ∗ → X ∗ is an adjoint of
some map in B(X,Y ) if and only if im(S∗ ◦ evX ) ⊆ im(evY ), where evX : X → X ∗∗ and evY : Y → Y ∗∗ are the
canonical linear isometries on X and Y , respectively.
Proof. First, suppose that S = T ∗ for some bounded linear map T : X → Y . By the preceding lemma, we see that

S∗ ◦ evX = T ∗∗ ◦ evX = evY ◦T,

so
im(S∗ ◦ evX ) = im(evY ◦T ) ⊆ im(evY ).
Conversely, assume that im(S∗ ◦ evX ) ⊆ im(evY ). Define

T : X → Y : x 7→ evY−1 (S∗ (evX (x))).

Here because evY is injective and im(S∗ ◦ evX ) ⊆ im(evY ), such a definition is possible. What remains is to prove that S = T ∗ : Let ψ ∈ Y ∗ be
arbitrary. Then for every x ∈ X,

(T ∗ ψ)(x) = ψ(T x) = ψ(evY−1 (S∗ (evX (x))))


= (S∗ (evX (x)))(ψ) = (evX (x))(Sψ) = (Sψ)(x).

This tells us that T ∗ ψ = Sψ. Since ψ ∈ Y ∗ is arbitrary, we must have S = T ∗ , as desired. ■

Theorem 3.56. Every normed linear space isomorphic to a reflexive normed linear space is also reflexive.
Proof. Let X be a reflexive normed linear space, Y be a normed linear space isomorphic to X via an isomorphism T : X → Y , and evX : X →
X ∗∗ , evY : Y → Y ∗∗ be the canonical linear isometries. First, by Corollary 3.53, both T ∗ : Y ∗ → X ∗ and T ∗∗ : X ∗∗ → Y ∗∗ are isomorphisms
as well. Then according to Lemma 3.54, we see that evY = T ∗∗ ◦ evX ◦T −1 , which is an isomorphism as well. Therefore, the space Y is also
reflexive. ■

Theorem 3.57. Let X be a Banach space. If X is reflexive, then every closed subspace U of X is also reflexive.
Proof. Suppose that X is reflexive, and let U be a closed subspace of X. By Theorem 1.10, we see that U is a Banach space as well. Again,
we still denote by evX : X → X ∗∗ and evU : U → U ∗∗ the canonical linear isometries. Let ϑ ∈ U ∗∗ be arbitrary. Then define

ϑ̂ : X ∗ → F : ϕ 7→ ϑ ( ϕ|U ).

We claim that ϑ̂ ∈ X ∗∗ in this case:


• First, for every ϕ, ϕ ′ ∈ X ∗ and c ∈ F,

ϑ̂ (cϕ + ϕ ′ ) = ϑ ((cϕ + ϕ ′ )|U ) = ϑ (c ϕ|U + ϕ ′ |U ) = cϑ ( ϕ|U ) + ϑ ( ϕ ′ |U ) = cϑ̂ (ϕ) + ϑ̂ (ϕ ′ ),

so ϑ̂ is itself a linear map.


• Furthermore, for each ϕ ∈ X ∗ , by Corollary 3.51,

|ϑ̂ (ϕ)| = |ϑ ( ϕ|U )| ≤ ∥ϑ ∥∥ ϕ|U ∥ ≤ ∥ϑ ∥∥ϕ∥,

hence the linear map ϑ̂ is also bounded with ∥ϑ̂ ∥ ≤ ∥ϑ ∥.


Now since X is reflexive, there exists a unique x ∈ X such that ϑ̂ = evX (x). We claim that x ∈ U: Suppose otherwise. Then by Corollary
3.39, there exists ϕ ∈ U ⊥ such that ϕ(x) ̸= 0. In this case,

0 = ϑ (0) = ϑ ( ϕ|U ) = ϑ̂ (ϕ) = ϕ(x) ̸= 0,

a contradiction. Finally, we simply show that ϑ = evU (x) as well: Let ψ ∈ U ∗ be arbitrary. By Hahn-Banach theorem (cf. Corollary 3.37),
it can be linearly extended to some ϕ ∈ X ∗ . In this case,

(evU (x))(ψ) = ψ(x) = ϕ(x) = (evX (x))(ϕ) = ϑ̂ (ϕ) = ϑ ( ϕ|U ) = ϑ (ψ).

The identity ϑ = evU (x) thus holds, so evU is also surjective whence U is reflexive as well. ■

Theorem 3.58. Let X be a normed linear space. Then its continuous dual X ∗ is reflexive whenever X is, while
the converse is true when X is a Banach space.

Proof. Let X be a normed linear space, and evX : X → X ∗∗ , evX ∗ : X ∗ → X ∗∗∗ be the canonical linear isometries. First, suppose that X
is reflexive, namely the map evX is an isomorphism. Then we let ϑ ∈ X ∗∗∗ be arbitrary. As we can see, for each ψ ∈ X ∗∗ , writing
x := evX−1 (ψ), we have

ϑ (ψ) = ϑ (evX (x)) = (ev∗X (ϑ ))(x) = (evX (x))(ev∗X (ϑ )) = ψ(ev∗X (ϑ )) = (evX ∗ (ev∗X (ϑ )))(ψ).

Consequently, we have ϑ = evX ∗ (ev∗X (ϑ )), implying that evX ∗ is an isomorphism as well. In other words, the space X ∗ is also reflexive.
Conversely, suppose that X ∗ is reflexive. Then as shown above, we see that the bidual X ∗∗ is reflexive as well. Meanwhile, note that
evX is a linear isometry, so X is isometrically isomorphic to im(evX ). If X is a Banach space, so is im(evX ). Note that X ∗∗ is always a
Banach space, hence according to Theorem 1.10, the subspace im(evX ) is closed in X ∗∗ . By the preceding two theorems, we can conclude
that im(evX ) as well as X are both reflexive in this case. ■

Corollary 3.59. Let X be a Banach space. If X is reflexive, for every closed subspace U of X, the quotient space
X/U is reflexive as well.
Proof. Suppose that X is reflexive, and let U be a closed subspace of X. Now by Theorem 3.46, we see that (X/U)∗ is isometrically
isomorphic to U ⊥ . Here U ⊥ is a closed subspace of X ∗ , while X ∗ , as shown in the preceding theorem, is also reflexive, hence U ⊥ as well as
(X/U)∗ should be reflexive as well. Again, it follows from Theorem 1.16 that X/U is also a Banach space, so it is necessarily reflexive by
the preceding theorem. ■

Theorem 3.60. Let X,Y be normed linear spaces and T : X → Y be a bounded linear map. Then

im(T )⊥ = ker(T ∗ ) and im(T ∗ )⊥ = ker(T ), (147)

and
ker(T )⊥ ⊇ \overline{im(T ∗ )} and ker(T ∗ )⊥ = \overline{im(T )}. (148)
Proof. For every ψ ∈ Y ∗ ,

ψ ∈ im(T )⊥ ⇐⇒ ∀y ∈ im(T ) : 0 = ψ(y)


⇐⇒ ∀x ∈ X : 0 = ψ(T x) = (T ∗ ψ)(x) ⇐⇒ T ∗ ψ = 0 ⇐⇒ ψ ∈ ker(T ∗ ).

Then we have im(T )⊥ = ker(T ∗ ). Similarly, for every x ∈ X,

x ∈ im(T ∗ )⊥ ⇐⇒ ∀ϕ ∈ im(T ∗ ) : 0 = ϕ(x)


⇐⇒ ∀ψ ∈ Y ∗ : 0 = (T ∗ ψ)(x) = ψ(T x) ⇐⇒ T x = 0Y ⇐⇒ x ∈ ker(T ),

so im(T ∗ )⊥ = ker(T ) holds as well. Meanwhile, by Theorem 3.47,

ker(T )⊥ = (im(T ∗ )⊥ )⊥ ⊇ \overline{im(T ∗ )} and \overline{im(T )} = (im(T )⊥ )⊥ = ker(T ∗ )⊥ . ■

Lemma 3.61. Let X,Y be Banach spaces and T : X → Y be a bounded linear map. Then the image of T is
closed in Y if and only if there is some C > 0 such that given any y ∈ im(T ), there exists x ∈ X with y = T x such
that ∥x∥ ≤ C∥y∥.
Proof. Suppose that im(T ) is closed in Y . Then by the isomorphism theorem (cf. Corollary 3.26), we have an isomorphism

T̃ : X/ ker(T ) → im(T ) : x 7→ T x.

We claim that C := 2∥T̃ −1 ∥ > 0 would suffice: Let y ∈ im(T ) be arbitrary. If y = 0Y , we can simply put x = 0X and the statement is trivial
for us. Thus, we shall also assume that y is nonzero then.

In this case, fix an arbitrary representative x0 ∈ X of the coset T̃ −1 y. Because y is nonzero, the coset T̃ −1 y is nonzero as well, so
ε := ∥T̃ −1 y∥ > 0. By the definition of the quotient norm, there exists uε ∈ ker(T ) such that

∥x0 − uε ∥ < ∥T̃ −1 y∥ + ε = ∥T̃ −1 y∥ + ∥T̃ −1 y∥ = 2∥T̃ −1 y∥.

Put x := x0 − uε . Then as noted above, we have

∥x∥ ≤ 2∥T̃ −1 y∥ ≤ 2∥T̃ −1 ∥∥y∥ = C∥y∥,

and
T x = T x0 − Tuε = T x0 = T̃ x̄0 = T̃ (T̃ −1 y) = y.
Conversely, suppose that such a constant C > 0 exists. Still, the map T̃ : X/ ker(T ) → im(T ) defined above, by the factorization theorem
(cf. Theorem 3.15), is a bounded linear bijection. Fix an arbitrary y ∈ im(T ), and let x ∈ X be such that y = T x and ∥x∥ ≤ C∥y∥. Then we
have
∥T̃ −1 y∥ = ∥x̄∥ ≤ ∥x∥ ≤ C∥y∥,
so the inverse of T̃ is bounded as well. Consequently, the map T̃ is an isomorphism. Now X/ ker(T ) is a Banach space as X is (cf. Theorem
1.16), so is im(T ). By Theorem 1.10, we can conclude that im(T ) is closed in Y , as desired. ■
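In finite dimensions every image is automatically closed, and the Moore-Penrose pseudoinverse always selects a least-norm preimage, so its operator norm serves as one admissible constant C. A numerical sketch of the lemma, assuming NumPy (the names `Tp` and `C` are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((4, 6))          # finite rank, so im(T) is closed
Tp = np.linalg.pinv(T)                   # picks the least-norm preimage
C = np.linalg.norm(Tp, 2)                # one admissible constant for the lemma

y = T @ rng.standard_normal(6)           # an arbitrary element of im(T)
x = Tp @ y                               # a preimage with y = T x
assert np.allclose(T @ x, y)
assert np.linalg.norm(x) <= C * np.linalg.norm(y) + 1e-9
```

The pseudoinverse plays the role of T̃ −1 composed with a choice of near-minimal representative in the proof above.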

Corollary 3.62. Let X,Y be Banach spaces and T : X → Y be a bounded linear map. If there exists C > 0 such
that ∥x∥ ≤ C∥T x∥ for all x ∈ X, then the map T is injective and im(T ) is closed in Y .
Proof. Suppose that there is C > 0 such that ∥x∥ ≤ C∥T x∥ for all x ∈ X. Then for x ∈ ker(T ), we see that

∥x∥ ≤ C∥T x∥ = C∥0Y ∥ = C(0) = 0.

This shows that x = 0X whence T is injective. Meanwhile, for each y ∈ im(T ), it has a unique preimage x ∈ X, which satisfies ∥x∥ ≤
C∥T x∥ = C∥y∥. By the preceding lemma, we may conclude that im(T ) is closed in Y as well. ■

Lemma 3.63. Let X,Y be Banach spaces and T : X → Y be a bounded linear map. If there exists c > 0 such
that c∥ϕ∥ ≤ ∥T ∗ ϕ∥ for all ϕ ∈ Y ∗ , then the map T is also surjective.
Proof. Suppose that there exists c > 0 such that c∥ϕ∥ ≤ ∥T ∗ ϕ∥ for all ϕ ∈ Y ∗ . Since X,Y are Banach spaces, by the open mapping theorem
(cf. Theorem 3.24), it suffices to prove that 0Y is an interior point of T (B(0X , 1)), where B(0X , 1) := {x ∈ X | ∥x∥ < 1} is the open unit ball
in X. Furthermore, in light of the proofs there, one only needs to show that 0Y is an interior point of \overline{T (B(0X , 1))}.
Suppose otherwise. (The remaining proof requires the separation version of the Hahn-Banach theorem. To be completed later.) ■

Theorem 3.64 (Closed Range Theorem; Banach). Let X,Y be Banach spaces and T : X → Y be a bounded
linear map. Then
im(T ) is closed in Y ⇐⇒ im(T ) = ker(T ∗ )⊥
(149)
⇐⇒ im(T ∗ ) is closed in X ∗ ⇐⇒ im(T ∗ ) = ker(T )⊥ .
Proof. First, by Theorem 3.60, we have
ker(T )⊥ ⊇ \overline{im(T ∗ )} and ker(T ∗ )⊥ = \overline{im(T )}.
Consequently,
im(T ) = ker(T ∗ )⊥ ⇐⇒ im(T ) = \overline{im(T )} ⇐⇒ im(T ) is closed in Y .
To complete the proof, we consider the following implications:
• First, we show that im(T ∗ ) = ker(T )⊥ holds if im(T ) is closed in Y : Again, it is clear from above that im(T ∗ ) ⊆ \overline{im(T ∗ )} ⊆ ker(T )⊥ .
Conversely, let ϕ ∈ ker(T )⊥ be arbitrary. Then we have ker(T ) ⊆ ker(ϕ). In light of this, we define

ψ0 : im(T ) → F : y 7→ ϕ(x), where x ∈ X with y = T x.

First, let x, x′ ∈ X be such that T x = T x′ . Then T (x − x′ ) = T x − T x′ = 0Y , implying that x − x′ ∈ ker(T ) ⊆ ker(ϕ). Consequently,
we also have 0 = ϕ(x − x′ ) = ϕ(x) − ϕ(x′ ), so ϕ(x) = ϕ(x′ ) as well. This shows that ψ0 is well-defined.
– For every y, y′ ∈ im(T ) and c ∈ F, suppose that y = T x and y′ = T x′ for some x, x′ ∈ X, observe that cy + y′ = c(T x) + T x′ =
T (cx + x′ ), so
ψ0 (cy + y′ ) = ϕ(cx + x′ ) = cϕ(x) + ϕ(x′ ) = cψ0 (y) + ψ0 (y′ ).
Consequently, the map ψ0 is also linear.
– Finally, since im(T ) is closed, by Lemma 3.61, there exists C > 0 such that for any y ∈ im(T ), there exists x ∈ X with y = T x
such that ∥x∥ ≤ C∥y∥. In this case,
|ψ0 (y)| = |ϕ(x)| ≤ ∥ϕ∥∥x∥ ≤ C∥ϕ∥∥y∥.
Thus, the linear functional ψ0 is also bounded with ∥ψ0 ∥ ≤ C∥ϕ∥.
Then by Hahn-Banach theorem (cf. Corollary 3.37), we can extend ψ0 to a bounded linear functional ψ ∈ Y ∗ . In this case, for every
x ∈ X,
ϕ(x) = ψ0 (T x) = ψ(T x) = (T ∗ ψ)(x).
This shows that ϕ = T ∗ ψ ∈ im(T ∗ ), so the reversed inclusion ker(T )⊥ ⊆ im(T ∗ ) holds as well.
• Next, if im(T ∗ ) = ker(T )⊥ , then as noted at the beginning, we have

im(T ∗ ) ⊆ \overline{im(T ∗ )} ⊆ ker(T )⊥ = im(T ∗ ),

implying that \overline{im(T ∗ )} = im(T ∗ ), namely im(T ∗ ) is closed in X ∗ .


• Finally, suppose that im(T ∗ ) is closed in X ∗ . We want to show that im(T ) is also closed in Y . (The remaining proof requires the
preceding lemma. To be completed later.) ■

3.7 The Adjoint of Linear Maps in Hilbert Space


Lemma 3.65. Let X,Y be Hilbert spaces. Then for every bounded linear map T : X → Y , there is a unique
bounded linear map T̃ : Y → X such that

∀x ∈ X and y ∈ Y : ⟨T x, y⟩ = ⟨x, T̃ y⟩. (150)

In addition, we have ∥T̃ ∥ = ∥T ∗ ∥ = ∥T ∥, where T ∗ : Y ∗ → X ∗ is the adjoint of T .


Proof. Let T : X → Y be a bounded linear map. Now because X,Y are Hilbert spaces, by the Riesz representation theorem (cf. Theorem 3.43),
we have canonical conjugate linear bijective isometries ηX : X → X ∗ and ηY : Y → Y ∗ . Then for every x ∈ X and y ∈ Y ,

⟨T x, y⟩ = (ηY (y))(T x) = (T ∗ ηY (y))(x) = ⟨x, ηX−1 (T ∗ ηY (y))⟩.

As a result, we may put


T̃ := ηX−1 ◦ T ∗ ◦ ηY : Y → X.
What remains is to prove that T̃ is bounded and linear; its uniqueness follows from the observation above. It is clear that T̃ is additive,
while for every y ∈ Y and c ∈ F, since ηY and ηX−1 are conjugate linear,

T̃ (cy) = ηX−1 (T ∗ ηY (cy)) = ηX−1 (T ∗ (c̄ηY (y)))

= ηX−1 (c̄(T ∗ ηY (y))) = cηX−1 (T ∗ ηY (y)) = cT̃ (y).

Therefore, the map T̃ : Y → X is indeed linear in this case. Furthermore, for every y ∈ Y ,

∥T̃ y∥ = ∥ηX−1 (T ∗ ηY (y))∥ = ∥T ∗ ηY (y)∥ ≤ ∥T ∗ ∥∥ηY (y)∥ = ∥T ∗ ∥∥y∥.

Consequently, the linear map T̃ is also bounded with ∥T̃ ∥ ≤ ∥T ∗ ∥. In addition, since T ∗ = ηX ◦ T̃ ◦ ηY−1 , we shall have ∥T ∗ ∥ ≤ ∥T̃ ∥ as well
by symmetric arguments. This then shows that ∥T̃ ∥ = ∥T ∗ ∥ = ∥T ∥, where the last equality is from Theorem 3.50, completing the proof. ■

Definition 3.15. Let X,Y be Hilbert spaces and T : X → Y be a bounded linear map. Then the (Hilbert-space)
adjoint of T is defined as the unique bounded linear map from Y to X, also denoted by T ∗ , such that

∀x ∈ X and y ∈ Y : ⟨T x, y⟩ = ⟨x, T ∗ y⟩. (151)

Remark 3.15. Clearly, the Hilbert-space adjoint of T : X → Y can be regarded as a “pullback” of its Banach-
space adjoint; their relationship is displayed in the following commutative diagram:

                  ηY
        Y ──────────→ Y ∗
        │             │
  T ∗ (Hilbert)   T ∗ (Banach)
        │             │
        ↓             ↓
        X ──────────→ X ∗
                  ηX

Theorem 3.66 (Properties of Adjoints). Let X,Y, Z be Hilbert spaces.

1. We have id∗X = idX and for every bounded linear map T : X → Y ,

(T ∗ )∗ = T and ∥T ∗ T ∥ = ∥T T ∗ ∥ = ∥T ∥2 . (152)

2. For every bounded linear maps S, T : X → Y and c ∈ F,

(S + T )∗ = S∗ + T ∗ and (cT )∗ = c̄T ∗ . (153)

3. For every bounded linear maps T : X → Y and S : Y → Z,

(S ◦ T )∗ = T ∗ ◦ S∗ . (154)

4. If T : X → Y is an isomorphism, then T ∗ is also an isomorphism such that

(T ∗ )−1 = (T −1 )∗ =: T −∗ . (155)

Proof. 1. Observe that for every x, y ∈ X,


⟨idX (x), y⟩ = ⟨x, y⟩ = ⟨x, idX (y)⟩,
so by Lemma 3.65, we must have id∗X = idX in this case. Furthermore, let T : X → Y be a bounded linear map. Then for every x ∈ X and
y ∈ Y,
⟨T ∗ y, x⟩ = \overline{⟨x, T ∗ y⟩} = \overline{⟨T x, y⟩} = ⟨y, T x⟩,
hence again by Lemma 3.65, it follows that T = (T ∗ )∗ as well. Next, we show that ∥T ∗ T ∥ = ∥T ∥2 :
• First, we have ∥T ∗ T ∥ ≤ ∥T ∗ ∥∥T ∥ = ∥T ∥2 .
• Meanwhile, for every x ∈ X, by CBS inequality,

∥T x∥2 = ⟨T x, T x⟩ = ⟨x, T ∗ (T x)⟩ ≤ ∥(T ∗ T )x∥∥x∥ ≤ ∥T ∗ T ∥∥x∥2 .

Thus, we also have ∥T ∥2 ≤ ∥T ∗ T ∥ in this case.


Finally, since (T ∗ )∗ = T , we also have
∥T T ∗ ∥ = ∥(T ∗ )∗ ◦ T ∗ ∥ = ∥T ∗ ∥2 = ∥T ∥2 .

2. Let S, T : X → Y be bounded linear maps and c ∈ F be a scalar. Then for every x ∈ X and y ∈ Y ,

⟨(S + T )x, y⟩ = ⟨Sx + T x, y⟩ = ⟨Sx, y⟩ + ⟨T x, y⟩


= ⟨x, S∗ y⟩ + ⟨x, T ∗ y⟩ = ⟨x, S∗ y + T ∗ y⟩ = ⟨x, (S∗ + T ∗ )y⟩

and
⟨(cT )x, y⟩ = ⟨c(T x), y⟩ = c⟨T x, y⟩ = c⟨x, T ∗ y⟩ = ⟨x, c̄(T ∗ y)⟩ = ⟨x, (c̄T ∗ )y⟩.
This shows that S∗ + T ∗ = (S + T )∗ and c̄T ∗ = (cT )∗ .
3. Let T : X → Y and S : Y → Z be bounded linear maps. Then for every x ∈ X and z ∈ Z,

⟨(S ◦ T )x, z⟩ = ⟨S(T x), z⟩ = ⟨T x, S∗ z⟩ = ⟨x, T ∗ (S∗ z)⟩ = ⟨x, (T ∗ ◦ S∗ )z⟩,

so we shall have (S ◦ T )∗ = T ∗ ◦ S∗ in this case.


4. Finally, suppose that T : X → Y is an isomorphism. Note that T −1 ◦ T = idX and T ◦ T −1 = idY , while T −1 is also bounded by
definition. By taking adjoints over both sides, we have

idX = id∗X = (T −1 ◦ T )∗ = T ∗ ◦ (T −1 )∗ and idY = id∗Y = (T ◦ T −1 )∗ = (T −1 )∗ ◦ T ∗ .

As we can see, the map (T −1 )∗ is the functional inverse of T ∗ , which is also bounded as T −1 is. This shows that T ∗ is also an isomorphism
such that (T ∗ )−1 = (T −1 )∗ holds. ■
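For matrices, the Hilbert-space adjoint is just the conjugate transpose, and both the defining pairing ⟨T x, y⟩ = ⟨x, T ∗ y⟩ and the identity ∥T ∗ T ∥ = ∥T ∥² from item 1 can be checked numerically; a sketch assuming NumPy (the helper names are ours):

```python
import numpy as np

rng = np.random.default_rng(4)
T = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))  # C^2 -> C^3
Ts = T.conj().T                     # the Hilbert-space adjoint T*

def inner(u, v):                    # <u, v>, conjugate-linear in v
    return np.vdot(v, u)

x = rng.standard_normal(2) + 1j * rng.standard_normal(2)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
assert np.isclose(inner(T @ x, y), inner(x, Ts @ y))   # <Tx, y> = <x, T*y>

op = lambda M: np.linalg.norm(M, 2)  # operator (spectral) norm
assert np.isclose(op(Ts @ T), op(T) ** 2)              # ||T*T|| = ||T||^2
```

The second assertion reflects that the largest eigenvalue of T ∗ T is the square of the largest singular value of T .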

Theorem 3.67 (Kernel and Image of Adjoints). Let X,Y be Hilbert spaces and T : X → Y be a bounded linear
map. Then
ker(T ∗ ) = im(T )⊥ and \overline{im(T ∗ )} = ker(T )⊥ , (156)

while
ker(T ) = im(T ∗ )⊥ and \overline{im(T )} = ker(T ∗ )⊥ . (157)
Proof. First, for every y ∈ Y ,

y ∈ ker(T ∗ ) ⇐⇒ T ∗ y = 0X ⇐⇒ ∀x ∈ X : ⟨x, T ∗ y⟩ = 0
⇐⇒ ∀x ∈ X : ⟨T x, y⟩ = 0 ⇐⇒ y ∈ im(T )⊥ .

Here the second equivalence follows from Theorem 2.21. This shows that ker(T ∗ ) = im(T )⊥ , and hence by Corollary 2.39, we also have

ker(T ∗ )⊥ = (im(T )⊥ )⊥ = \overline{im(T )}.

Note that (T ∗ )∗ = T , so ker(T ) = im(T ∗ )⊥ and ker(T )⊥ = \overline{im(T ∗ )} hold as well. ■
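The identity ker(T ∗ ) = im(T )⊥ can be seen numerically for a matrix via its SVD, whose trailing left singular vectors span the orthogonal complement of the column space; a sketch assuming NumPy (variable names ours):

```python
import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((5, 3))     # generically injective, im(T) is 3-dimensional
U, s, Vh = np.linalg.svd(T)         # im(T) = span of U[:, :3]

N = U[:, 3:]                        # trailing left singular vectors: ker(T*)
assert np.allclose(T.T @ N, 0, atol=1e-10)   # T* annihilates them...
assert np.allclose(N.T @ T, 0, atol=1e-10)   # ...so they are orthogonal to im(T)
```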

Theorem 3.68 (Left Invertible Linear Maps). Let X be a Hilbert space and T ∈ B(X). Then the following
statements are equivalent:

1. The map T is left invertible, namely there exists S ∈ B(X) such that ST = idX .

2. There exists a positive constant α such that ∥x∥ ≤ α∥T x∥ for all x ∈ X.

3. The map T is injective with closed image.

4. The map T ∗ T is an isomorphism.


Proof. (1 =⇒ 2). Suppose that T is left invertible with left inverse S ∈ B(X). Clearly, the map S is nonzero, so ∥S∥ > 0 as well. Then for
every x ∈ X,
∥x∥ = ∥ idX (x)∥ = ∥(ST )x∥ ≤ ∥S∥∥T x∥,

so the condition holds with constant α = ∥S∥ > 0.
(2 =⇒ 3). Suppose that there is α > 0 such that ∥x∥ ≤ α∥T x∥ for all x ∈ X. First, for every x ∈ ker(T ), we have

0 ≤ ∥x∥ ≤ α∥T x∥ = α∥0X ∥ = α(0) = 0.

This shows that x = 0X , so the map T is injective. Meanwhile, let (yn )n∈N be a sequence of elements in im(T ) converging to some y ∈ X.
Suppose that yn = T xn with xn ∈ X for all n ∈ N. Then for every m, n ∈ N,

∥xm − xn ∥ ≤ α∥T (xm − xn )∥ = α∥T xm − T xn ∥ = α∥ym − yn ∥.

Since (yn )n∈N is convergent, itself is a Cauchy sequence in X. By the inequality above, it is clear that (xn )n∈N is also a Cauchy sequence in
X. Note that X is complete, so the sequence (xn )n∈N converges to some x ∈ X. By the continuity of T ,

T x = lim T xn = lim yn = y,
n→∞ n→∞

hence y ∈ im(T ) as well. In conclusion, the image of T is closed in X, as desired.


(3 =⇒ 1). Suppose that T is injective with closed image. By Theorem 1.10, we see that im(T ) is also a Banach space. Then by the
bounded inverse theorem (cf. Theorem 3.25), we can see that a bounded linear inverse T̂ : im(T ) → X of T exists. Now let P : X → X be
the orthogonal projection onto im(T ) and S := T̂ ◦ P. As we can see, for each x ∈ X,

S(T x) = T̂ (P(T x)) = T̂ (T x) = x,

and
∥Sx∥ = ∥T̂ (Px)∥ ≤ ∥T̂ ∥∥Px∥ ≤ ∥T̂ ∥∥x∥.
This shows that S ∈ B(X) with ∥S∥ ≤ ∥T̂ ∥ and ST = idX , namely T is left invertible.
(2 =⇒ 4). Suppose that there is α > 0 satisfying ∥x∥ ≤ α∥T x∥ for all x ∈ X. Then given any x ∈ X,

∥x∥2 ≤ α 2 ∥T x∥2 = α 2 ⟨T x, T x⟩ = α 2 ⟨T ∗ T x, x⟩ ≤ α 2 ∥T ∗ T x∥∥x∥,

so we also have ∥x∥ ≤ α 2 ∥T ∗ T x∥ in this case. Note that α 2 > 0, so by (1 ⇐⇒ 2), we can conclude that T ∗ T is left invertible. Suppose that
S ∈ B(X) is a left inverse of T ∗ T . Then we see that S(T ∗ T ) = idX , hence

idX = id∗X = (S(T ∗ T ))∗ = (T ∗ (T ∗ )∗ )S∗ = (T ∗ T )S∗ .

Since S∗ ∈ B(X), the map T ∗ T is right invertible as well. Therefore, we may conclude that T ∗ T is an isomorphism.
(4 =⇒ 1). Suppose that T ∗ T is an isomorphism. Denote by S := (T ∗ T )−1 . Then we can see that idX = S(T ∗ T ) = (ST ∗ )T , hence the
map T is certainly left invertible, completing the proof. ■
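For a tall matrix with independent columns, statement 4 yields a concrete left inverse S = (T ∗ T )−1 T ∗ , exactly the formula behind least squares; a sketch assuming NumPy (names ours):

```python
import numpy as np

rng = np.random.default_rng(6)
T = rng.standard_normal((5, 3))            # tall, generically injective
G = T.T @ T                                # T*T: a 3x3 isomorphism by statement 4
S = np.linalg.solve(G, T.T)                # S = (T*T)^(-1) T*, a left inverse
assert np.allclose(S @ T, np.eye(3), atol=1e-8)
```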

Lemma 3.69. Let X be a Hilbert space. Then for every T ∈ B(X), its restriction T̂ to ker(T )⊥ is injective and
bounded, with the same image as T .
Proof. Let T ∈ B(X) be arbitrary and T̂ : ker(T )⊥ → X be its restriction to ker(T )⊥ . Clearly, the map T̂ is bounded, while also

ker(T̂ ) = ker(T ) ∩ ker(T )⊥ = {0X },

so T̂ is certainly injective.
Furthermore, it is clear that im(T̂ ) ⊆ im(T ). As for the reverse inclusion, note that by Corollary 3.2, ker(T ) is a closed subspace of X. Then for every
x ∈ X, by Theorem 2.38, it admits a unique decomposition x = p + z, where p ∈ ker(T ) and z ∈ ker(T )⊥ . In this case,

T x = T (p + z) = T p + T z = 0X + T z = T z = T̂ z.

Therefore, we certainly have im(T̂ ) = im(T ). ■

Theorem 3.70 (Right Invertible Linear Maps). Let X be a Hilbert space and T ∈ B(X). Then the following
statements are equivalent:

1. The map T is right invertible, namely there exists S ∈ B(X) such that T S = idX .

2. The map T is surjective.

3. The map T T ∗ is an isomorphism.


Proof. (1 ⇐⇒ 3). Note from the preceding theorem that

T is right invertible ⇐⇒ ∃S ∈ B(X) : T S = idX ⇐⇒ ∃S ∈ B(X) : S∗ T ∗ = idX

⇐⇒ T ∗ is left invertible ⇐⇒ T T ∗ = (T ∗ )∗ T ∗ is an isomorphism.

(1 ⇐⇒ 2). Suppose that T is right invertible with right inverse S ∈ B(X). Then for every y ∈ X, we can see that y = idX (y) = (T S)y =
T (Sy). This shows that the map T is surjective.
Conversely, suppose that T is surjective. By the preceding lemma, the restriction T̂ of T to ker(T )⊥ is also bounded and injective with
the same image as T . However, because T is surjective, the map T̂ is surjective as well. Therefore, we can see that T̂ is a bounded linear
isomorphism from ker(T )⊥ to X.
Note that ker(T )⊥ is a closed subspace of X, hence is also complete by Theorem 1.10. By the bounded inverse theorem (cf. Corollary
3.25), the inverse T̂ −1 is bounded as well. Besides, for each x ∈ X, we have

(T T̂ −1 )x = T (T̂ −1 x) = x.

Consequently, we also have T T̂ −1 = idX , implying that T is right invertible. ■
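Dually, for a surjective map in finite dimensions, T T ∗ is invertible and S = T ∗ (T T ∗ )−1 is a concrete right inverse, mirroring the proof above. A NumPy sketch with a hypothetical full-row-rank matrix of ours:

```python
import numpy as np

# A wide matrix with full row rank is surjective, so T T* is invertible.
T = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])

S = T.conj().T @ np.linalg.inv(T @ T.conj().T)   # S = T* (T T*)^{-1}

assert np.allclose(T @ S, np.eye(2))             # T S = id, a right inverse
```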

Theorem 3.71 (Isometry Between Hilbert Spaces). Let X,Y be Hilbert spaces and T : X → Y be a linear map.
Then the following statements are equivalent:

1. The map T : X → Y is an isometry.

2. T ∗ T = idX .

3. The family (Tei )i∈I is orthonormal for every orthonormal family (ei )i∈I in X.

4. There is an orthonormal base (ei )i∈I in X such that the family (Tei )i∈I is also orthonormal in Y .
Proof. (1 ⇐⇒ 2). By Theorem 3.6, we know that T is an isometry if and only if ⟨T x, Ty⟩ = ⟨x, y⟩ for all x, y ∈ X. Note that ⟨T x, Ty⟩ =
⟨x, T ∗ Ty⟩, so
∀x, y ∈ X : ⟨T x, Ty⟩ = ⟨x, y⟩ ⇐⇒ ∀x, y ∈ X : ⟨x, T ∗ Ty⟩ = ⟨x, y⟩ ⇐⇒ T ∗ T = idX .
(1 =⇒ 3). Suppose that T is an isometry. Let (ei )i∈I be an orthonormal family in X. Then for every i, j ∈ I, by the preceding theorem,

⟨Tei , Te j ⟩ = ⟨ei , e j ⟩ = δi, j , where δi, j = 1 if i = j and δi, j = 0 if i ̸= j.

This shows that (Tei )i∈I is also an orthonormal family in Y .


(3 =⇒ 4). This is trivial.
(4 =⇒ 2). Suppose that there is an orthonormal base (ei )i∈I of X such that the family (Tei )i∈I is also orthonormal. Then for every
x ∈ X, by Theorem 2.43,

T ∗ T x = T ∗ T (∑i∈I ⟨x, ei ⟩ei ) = ∑i∈I ⟨x, ei ⟩(T ∗ Tei ),

so for each j ∈ I,

⟨T ∗ T x, e j ⟩ = ∑i∈I ⟨x, ei ⟩⟨T ∗ Tei , e j ⟩ = ∑i∈I ⟨x, ei ⟩⟨Tei , Te j ⟩ = ⟨x, e j ⟩,

where the last equality holds because (Tei )i∈I is an orthonormal family. As a result, for every y ∈ X,

⟨T ∗ T x, y⟩ = ∑ j∈I ⟨T ∗ T x, e j ⟩⟨e j , y⟩ = ∑ j∈I ⟨x, e j ⟩⟨e j , y⟩ = ⟨x, y⟩,

implying that T ∗ T = idX , as desired. ■
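Statement 2 is easy to verify numerically in finite dimensions: a matrix whose columns form an orthonormal family is an isometry. A sketch (the matrix is our own example):

```python
import numpy as np

# The columns of T are orthonormal in F^3, so T : F^2 -> F^3 is an isometry.
T = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

assert np.allclose(T.conj().T @ T, np.eye(2))    # T*T = id_X (statement 2)

x = np.array([3.0 + 1.0j, -4.0])
assert np.isclose(np.linalg.norm(T @ x), np.linalg.norm(x))   # ||Tx|| = ||x||
```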

Definition 3.16. Let X be a Hilbert space. A bounded linear operator T ∈ B(X) is called unitary if it is a
surjective isometry.

Theorem 3.72 (Equivalent Definitions of Unitary Operators). Let X be a Hilbert space and T ∈ B(X). Then
the following statements are equivalent:

1. The operator T is unitary.

2. The operator T is an isomorphism with T −1 = T ∗ .

3. T ∗ T = T T ∗ = idX .

4. The operator T and its adjoint T ∗ are both isometries.

5. The adjoint T ∗ of T is unitary.

6. For every orthonormal base (ei )i∈I of X, the family (Tei )i∈I is also an orthonormal base of X.

7. There exists an orthonormal base (ei )i∈I of X such that the family (Tei )i∈I is also an orthonormal base of
X.
Proof. (1 =⇒ 2). Suppose that T is unitary. Being a surjective linear isometry, it is certainly a linear isomorphism, so by Theorem 3.7, the
map T is an isometric isomorphism. In particular, given any x, y ∈ X,

⟨T x, y⟩ = ⟨T x, T (T −1 y)⟩ = ⟨x, T −1 y⟩,

so we certainly have T −1 = T ∗ in this case.


(2 =⇒ 3). If T is an isomorphism with T −1 = T ∗ , then it is clear that T ∗ T = T −1 T = idX = T T −1 = T T ∗ .
(3 ⇐⇒ 4). By the preceding theorem, we see that T is an isometry if and only if idX = T ∗ T , while T ∗ is an isometry if and only if
idX = (T ∗ )∗ T ∗ = T T ∗ . The desired equivalence thus follows.
(4 =⇒ 5). Suppose that T and T ∗ are both isometries. As noted above, we have T T ∗ = T ∗ T = idX . In particular, for each y ∈ X, we
have T ∗ (Ty) = (T ∗ T )y = idX (y) = y, so T ∗ is surjective as well. By definition, the linear operator T ∗ is also unitary.
(5 =⇒ 1). If T ∗ ∈ B(X) is unitary, now because (1 =⇒ 5), we can see that T = (T ∗ )∗ is also unitary.
(1 =⇒ 6). Suppose that T is unitary. Let (ei )i∈I be an orthonormal base of X. By the preceding theorem, one can see that (Tei )i∈I is
also an orthonormal family in X. Meanwhile, for every y ∈ X, since T is surjective, we have y = T x for some x ∈ X. Consequently,
y = T x = T (∑i∈I ⟨x, ei ⟩ei ) = ∑i∈I ⟨x, ei ⟩Tei ∈ Span(Tei | i ∈ I).

This shows that X = Span(Tei | i ∈ I), so by Theorem 2.43, the family (Tei )i∈I is also an orthonormal base of X.
(6 =⇒ 7). This is trivial.
(7 =⇒ 1). Suppose that the family (Tei )i∈I is also an orthonormal base of X for some orthonormal base (ei )i∈I of X. Again, by the
preceding theorem and Theorem 2.43, we see that T is an isometry with

X = Span(Tei | i ∈ I) ⊆ im(T ),

namely im(T ) = X. Now by Theorem 3.7, we see that X is isometrically isomorphic to im(T ), so im(T ) is also complete. Because X is a
Hilbert space, it follows from Theorem 1.10 that im(T ) is also closed in X. Consequently, we see that im(T ) = im(T ) = X, namely T is
surjective. Being a surjective isometry, one can conclude from the definition that T is unitary, as desired. ■
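In finite dimensions, the equivalences can be confirmed with any concrete unitary matrix; a sketch using a plane rotation (our own example):

```python
import numpy as np

t = np.pi / 5
T = np.array([[np.cos(t), -np.sin(t)],        # a rotation, hence unitary
              [np.sin(t),  np.cos(t)]])

# Statement 3: T*T = T T* = id.
assert np.allclose(T.conj().T @ T, np.eye(2))
assert np.allclose(T @ T.conj().T, np.eye(2))
# Statement 2: T is an isomorphism with T^{-1} = T*.
assert np.allclose(np.linalg.inv(T), T.conj().T)
# Statement 6: the image of the standard orthonormal base stays orthonormal.
E = T @ np.eye(2)
assert np.allclose(E.conj().T @ E, np.eye(2))
```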

Example 3.1. In fact, every unitarily invariant norm on Fn , namely ∥Ux∥ = ∥x∥ for every x ∈ Fn and unitary
matrix U ∈ Mn (F), is a positive scalar multiple of the l2 -norm:
• For every x ∈ Fn and unitary matrix U, with respect to the standard inner product on Fn ,

∥Ux∥22 = (Ux)∗ (Ux) = x∗U ∗Ux = x∗ x = ∥x∥22 .

• Suppose that ∥ · ∥ is a unitarily invariant norm on Fn . Given any x ∈ Fn , take a unitary matrix U (e.g. a Householder reflection) with Ux = ∥x∥2 e1 . It is clear that

∥x∥ = ∥Ux∥ = ∥∥x∥2 e1 ∥ = ∥x∥2 ∥e1 ∥.

Consequently, we have ∥ · ∥ = ∥e1 ∥∥ · ∥2 .
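The Householder step can be made concrete. A real-case sketch (the function name householder_to_e1 is ours): the reflection U = I − 2vv⊤/⟨v, v⟩ with v = x − ∥x∥2 e1 is unitary and sends x to ∥x∥2 e1 .

```python
import numpy as np

def householder_to_e1(x):
    """Orthogonal reflection U with U x = ||x||_2 e1 (real case)."""
    v = np.asarray(x, dtype=float).copy()
    v[0] -= np.linalg.norm(x)                # v = x - ||x||_2 e1
    if np.allclose(v, 0.0):                  # x is already ||x||_2 e1
        return np.eye(len(x))
    return np.eye(len(x)) - 2.0 * np.outer(v, v) / (v @ v)

x = np.array([3.0, 4.0, 12.0])               # ||x||_2 = 13
U = householder_to_e1(x)
assert np.allclose(U @ U.T, np.eye(3))       # U is unitary (orthogonal)
assert np.allclose(U @ x, [13.0, 0.0, 0.0])  # U x = ||x||_2 e1
```

Numerical codes usually pick the sign of the shift for stability; that refinement is irrelevant to the identity checked here.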

Definition 3.17. Let X be a Hilbert space. A bounded linear operator T ∈ B(X) is called self-adjoint if T ∗ = T .

Theorem 3.73 (Properties of Self-Adjoint Operators). Let X be a Hilbert space.

1. For all self-adjoint operators S, T ∈ B(X),

• given any c ∈ R, the scalar multiple cT is self-adjoint;


• their sum S + T is always self-adjoint,
• their product S ◦ T is self-adjoint if and only if S, T commute.

In particular, every polynomial of T in real coefficients is also self-adjoint. If T is an isomorphism, its


inverse T −1 is also self-adjoint.

2. For every T ∈ B(X), the operators T ∗ T , T T ∗ , and T + T ∗ are all self-adjoint.

3. For every T ∈ B(X), there exist unique self-adjoint operators A, B ∈ B(X) such that T = A + iB and
T ∗ = A − iB.
Proof. 1. Let S, T ∈ B(X) be self-adjoint operators. First, for every c ∈ R, we have

(cT )∗ = cT ∗ = cT = cT,

so the operator cT is self-adjoint. Besides, note that

(S + T )∗ = S∗ + T ∗ = S + T and (S ◦ T )∗ = T ∗ ◦ S∗ = T ◦ S.

Therefore, S + T is self-adjoint, while S ◦ T is self-adjoint if and only if T ◦ S = S ◦ T , namely S, T commute in this case.
• In particular, by an easy induction, one can see that T n is self-adjoint for every non-negative integer n, so every real linear combination of
non-negative powers of T is also self-adjoint.
• Suppose that T is an isomorphism. Note that (T −1 )∗ = (T ∗ )−1 = T −1 , so its inverse T −1 is also self-adjoint.
2–3. Let T ∈ B(X) be arbitrary. Then

(T ∗ T )∗ = T ∗ (T ∗ )∗ = T ∗ T, (T T ∗ )∗ = (T ∗ )∗ T ∗ = T T ∗ , and (T + T ∗ )∗ = T ∗ + (T ∗ )∗ = T ∗ + T,

so T ∗ T, T T ∗ , T + T ∗ are all self-adjoint. Furthermore, if A, B ∈ L (X) are operators such that T = A + iB and T ∗ = A − iB, we must have
A = (1/2)(T + T ∗ ) and B = (1/2i)(T − T ∗ ).

Now let A, B ∈ L (X) be defined as above. It is clear that A is self-adjoint as T + T ∗ is and 1/2 ∈ R. Meanwhile,

B∗ = (1/(−2i))(T − T ∗ )∗ = −(1/2i)(T ∗ − T ) = (1/2i)(T − T ∗ ) = B,
so B is self-adjoint as well. The desired assertion thus follows. ■
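Part 3 is the Cartesian decomposition T = A + iB; a quick numerical confirmation with a random complex matrix (seed and size are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

A = (T + T.conj().T) / 2          # A = (T + T*)/2, self-adjoint
B = (T - T.conj().T) / 2j         # B = (T - T*)/(2i), self-adjoint

assert np.allclose(A, A.conj().T)
assert np.allclose(B, B.conj().T)
assert np.allclose(T, A + 1j * B)             # T = A + iB
assert np.allclose(T.conj().T, A - 1j * B)    # T* = A - iB
```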

Theorem 3.74. Let X be a complex Hilbert space. An operator T ∈ B(X) is self-adjoint if and only if ⟨T x, x⟩ ∈ R
for all x ∈ X. Furthermore, when X is nonzero and T is self-adjoint,

∥T ∥ = sup∥x∥=1 |⟨T x, x⟩|. (158)

Proof. Let T ∈ B(X) be arbitrary. If T is self-adjoint, then for every x ∈ X,

⟨T x, x⟩ = ⟨x, T x⟩ = ⟨T x, x⟩,

so ⟨T x, x⟩ ∈ R holds for sure. Furthermore, suppose that X is nonzero now.


• For every x ∈ X with ∥x∥ = 1,
|⟨T x, x⟩| ≤ ∥T x∥∥x∥ ≤ ∥T ∥∥x∥2 = ∥T ∥,
so M := sup∥x∥=1 |⟨T x, x⟩| ≤ ∥T ∥ holds as well.
• Conversely, we first claim that |⟨T x, x⟩| ≤ M∥x∥2 for all x ∈ X: There is nothing to prove if x = 0X ; otherwise, denoting by
x̂ := ∥x∥−1 x the normalization of x,

|⟨T x, x⟩| = |⟨∥x∥(T x̂), ∥x∥x̂⟩| = ∥x∥2 |⟨T x̂, x̂⟩| ≤ M∥x∥2 .

The desired inequality thus follows. Now for every x, y ∈ X, we have

⟨T (x + y), x + y⟩ − ⟨T (x − y), x − y⟩ = 2(⟨T x, y⟩ + ⟨Ty, x⟩) = 4ℜ⟨T x, y⟩.

Thus
ℜ⟨T x, y⟩ = (1/4)(⟨T (x + y), x + y⟩ − ⟨T (x − y), x − y⟩)
≤ (M/4)(∥x + y∥2 + ∥x − y∥2 ) = (M/2)(∥x∥2 + ∥y∥2 ),
where the last equality is the parallelogram law.
Now suppose that ∥x∥ = 1 and T x ̸= 0X . Then we may put y := ∥T x∥−1 (T x). Clearly, we have

ℜ⟨T x, y⟩ = ℜ⟨T x, ∥T x∥−1 (T x)⟩ = ∥T x∥−1 ℜ⟨T x, T x⟩ = ∥T x∥−1 ∥T x∥2 = ∥T x∥,

while
ℜ⟨T x, y⟩ ≤ (M/2)(∥x∥2 + ∥y∥2 ) = (M/2)(12 + 12 ) = M.
Therefore, we see that ∥T x∥ ≤ M for all ∥x∥ = 1 (such inequality automatically holds if T x = 0X ), hence ∥T ∥ ≤ M as well.
Conversely, suppose that ⟨T x, x⟩ ∈ R for all x ∈ X. Then for every x ∈ X, we see that ⟨x, T x⟩ = ⟨T x, x⟩ = ⟨T x, x⟩, so

⟨(T − T ∗ )x, x⟩ = ⟨T x − T ∗ x, x⟩ = ⟨T x, x⟩ − ⟨T ∗ x, x⟩ = ⟨T x, x⟩ − ⟨x, T x⟩ = 0.

By Corollary 2.14, we have T − T ∗ = 0. Therefore, we may conclude that T = T ∗ , namely T is self-adjoint. ■

Alternative Proof of Equivalent Condition. By Corollary 2.14, we see that the map

ϕ : X × X → C : (x, y) 7→ ⟨T x, y⟩

is a sesquilinear form on X. Furthermore, its associated quadratic form is precisely given by x 7→ ⟨T x, x⟩. Then according to Corollary 2.13,

∀x ∈ X : ⟨T x, x⟩ ∈ R ⇐⇒ ∀x ∈ X : ϕ(x, x) ∈ R
⇐⇒ ∀x, y ∈ X : ϕ(x, y) = ϕ(y, x)
⇐⇒ ∀x, y ∈ X : ⟨T x, y⟩ = ⟨Ty, x⟩ = ⟨x, Ty⟩ ⇐⇒ T = T ∗ . ■
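For a self-adjoint matrix, the supremum in (158) is attained at an eigenvector, so ∥T ∥ equals the largest |eigenvalue|; a numerical sanity check with a random matrix of our choosing:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
T = (M + M.conj().T) / 2                     # a self-adjoint matrix

op_norm = np.linalg.norm(T, 2)               # operator norm ||T||
eigs = np.linalg.eigvalsh(T)                 # real spectrum of T

# ||T|| = max |lambda| = sup_{||x||=1} |<Tx, x>| for self-adjoint T.
assert np.isclose(op_norm, np.max(np.abs(eigs)))

# Random unit vectors never exceed the bound in (158).
for _ in range(100):
    x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    x /= np.linalg.norm(x)
    assert abs(x.conj() @ T @ x) <= op_norm + 1e-10
```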

Definition 3.18. Let X be a real linear space. A bilinear form ϕ : X × X → R is called symmetric if ϕ(x, y) =
ϕ(y, x) for all x, y ∈ X, in which case the associated quadratic form is defined as

Φ : X → R : x 7→ ϕ(x, x). (159)

Remark 3.16. Similar to sesquilinear forms, it is clear that every semi-inner product on real linear spaces is
automatically a symmetric bilinear form (that is also positive semi-definite).

Theorem 3.75. Let X be a real linear space, ϕ : X × X → R be a symmetric bilinear form on X, and Φ : X → R
be the quadratic form on X associated with ϕ. Then for every x, y ∈ X,

ϕ(x, y) = (1/2)(Φ(x + y) − Φ(x) − Φ(y)) = (1/4)(Φ(x + y) − Φ(x − y)). (160)
In other words, the symmetric bilinear form ϕ is completely determined by its associated quadratic form Φ.
Proof. For every x, y ∈ X,

Φ(x ± y) = ϕ(x ± y, x ± y) = ϕ(x, x) + ϕ(y, y) ± (ϕ(x, y) + ϕ(y, x))


= Φ(x) + Φ(y) ± 2ϕ(x, y).

Therefore, we have Φ(x + y) − Φ(x) − Φ(y) = 2ϕ(x, y) and Φ(x + y) − Φ(x − y) = 4ϕ(x, y). The desired identities thus follow. ■

Remark 3.17. One may note that the above theorem is parallel to the generalized polarization identity for
sesquilinear forms (cf. Theorem 2.12).

Corollary 3.76. Let X be a Hilbert space and T ∈ B(X) be self-adjoint. Then

T = 0 ⇐⇒ ∀x ∈ X : ⟨T x, x⟩ = 0. (161)

Proof. The direction (=⇒) is trivial for us. As for its converse, we consider different cases for the ground field:
• If F = C, the statement follows from Corollary 2.14.
• Suppose that F = R. We then show that the following map is a symmetric bilinear form on X:

ϕ : X × X → R : (x, y) 7→ ⟨T x, y⟩.

– For every x1 , x2 , y ∈ X and c ∈ F,

ϕ(cx1 + x2 , y) = ⟨T (cx1 + x2 ), y⟩ = ⟨c(T x1 ) + T x2 , y⟩


= c⟨T x1 , y⟩ + ⟨T x2 , y⟩ = cϕ(x1 , y) + ϕ(x2 , y).

– For every x, y1 , y2 ∈ X and c ∈ F,

ϕ(x, cy1 + y2 ) = ⟨T x, cy1 + y2 ⟩ = c⟨T x, y1 ⟩ + ⟨T x, y2 ⟩ = cϕ(x, y1 ) + ϕ(x, y2 ).

– For every x, y ∈ X,
ϕ(x, y) = ⟨T x, y⟩ = ⟨x, Ty⟩ = ⟨Ty, x⟩ = ϕ(y, x).

Now let Φ : X → R be the quadratic form associated with ϕ. As we can see, we have Φ(x) = ϕ(x, x) = ⟨T x, x⟩ = 0 for all x ∈ X.
Then by the preceding theorem, we see that
⟨T x, y⟩ = ϕ(x, y) = (1/2)(Φ(x + y) − Φ(x) − Φ(y)) = (1/2)(0 − 0 − 0) = 0
holds for all x, y ∈ X. In particular, we have ∥T x∥2 = ⟨T x, T x⟩ = 0 for all x ∈ X, so T = 0 holds as well in this case. ■

Theorem 3.77 (Self-Adjoint Operators on Finite-Dimensional Spaces). Let X be a finite-dimensional inner


product space, and T be a linear operator on X. Then the following statements are equivalent:

1. The linear operator T on X is self-adjoint.

2. For every ordered orthonormal base β of X, the matrix representation β [T ]β with respect to β is a Her-
mitian matrix.

3. There exists an ordered orthonormal base β of X to which the matrix representation β [T ]β of T is Hermi-
tian.
Proof. (1 =⇒ 2). Let T be an operator on X. First, suppose that T is self-adjoint. Fix an arbitrary ordered orthonormal base β = (e1 , . . . , en )
of X. Then for each i, j = 1, . . . , n, by Theorem 2.33, the (i, j)-entry of β [T ]β is given by ⟨Te j , ei ⟩. Note that

⟨Te j , ei ⟩ = ⟨ei , Te j ⟩ = ⟨Tei , e j ⟩,

so the (i, j)-entry of β [T ]β is the conjugate of its ( j, i)-entry. We may conclude that β [T ]β is a Hermitian matrix in this case.
(2 =⇒ 3). This is trivial.
(3 =⇒ 1). Suppose that β [T ]β is a Hermitian matrix for some ordered orthonormal base β of X. Then for every x, y ∈ X, by Theorem
2.33,

⟨T x, y⟩ = ⟨[T x]β , [y]β ⟩Fn = [y]∗β [T x]β = [y]∗β β [T ]β [x]β


= [y]∗β β [T ]∗β [x]β = (β [T ]β [y]β )∗ [x]β
= [Ty]∗β [x]β = ⟨[x]β , [Ty]β ⟩Fn = ⟨x, Ty⟩.

This shows that T = T ∗ , namely T is self-adjoint. ■

Theorem 3.78 (Adjoint of Complexification). Let X be a real Hilbert space and T ∈ B(X). Then

(TC )∗ = (T ∗ )C =: TC∗ . (162)

In particular, the linear operator T is self-adjoint if and only if TC is.


Proof. For every x1 , x2 , y1 , y2 ∈ X,

⟨x1 + iy1 , (T ∗ )C (x2 + iy2 )⟩C = ⟨x1 + iy1 , T ∗ x2 + i(T ∗ y2 )⟩


= (⟨x1 , T ∗ x2 ⟩ + ⟨y1 , T ∗ y2 ⟩) + i(⟨y1 , T ∗ x2 ⟩ − ⟨x1 , T ∗ y2 ⟩)
= (⟨T x1 , x2 ⟩ + ⟨Ty1 , y2 ⟩) + i(⟨Ty1 , x2 ⟩ − ⟨T x1 , y2 ⟩)
= ⟨T x1 + i(Ty1 ), x2 + iy2 ⟩C = ⟨TC (x1 + iy1 ), x2 + iy2 ⟩C .

This then shows that (TC )∗ = (T ∗ )C in this case.

• Clearly, if T is self-adjoint, we shall have (TC )∗ = (T ∗ )C = TC , implying that TC is also self-adjoint.
• Conversely, suppose that TC is self-adjoint. Then we have TC = (TC )∗ = (T ∗ )C , so for every x ∈ X,

T x = TC (x + i0) = (T ∗ )C (x + i0) = T ∗ x.

In other words, we have T = T ∗ in this case, implying that T is self-adjoint. ■

Definition 3.19. Let X be a Hilbert space. A bounded linear operator T ∈ B(X) is normal if T ∗ T = T T ∗ .

Theorem 3.79. Let X be a Hilbert space. Then a bounded linear operator T ∈ B(X) is normal if and only if
∥T x∥ = ∥T ∗ x∥ for all x ∈ X. In particular, if T is normal, then ker(T ∗ ) = ker(T ).
Proof. Let T ∈ B(X) be arbitrary. If T is normal, then for every x ∈ X,

∥T x∥2 = ⟨T x, T x⟩ = ⟨x, T ∗ T x⟩ = ⟨x, T T ∗ x⟩ = ⟨T ∗ x, T ∗ x⟩ = ∥T ∗ x∥2 ,

so the equality ∥T x∥ = ∥T ∗ x∥ holds for sure. In addition, for every x ∈ X,

x ∈ ker(T ) ⇐⇒ T x = 0X ⇐⇒ ∥T x∥ = 0 ⇐⇒ ∥T ∗ x∥ = 0 ⇐⇒ T ∗ x = 0X ⇐⇒ x ∈ ker(T ∗ ),

hence the equality ker(T ) = ker(T ∗ ) holds as well.


Conversely, suppose that ∥T x∥ = ∥T ∗ x∥ for all x ∈ X. Consider the bounded linear operator T ∗ T − T T ∗ ∈ B(X), which is certainly
self-adjoint. Clearly, for every x ∈ X,

⟨(T ∗ T − T T ∗ )x, x⟩ = ⟨T ∗ T x, x⟩ − ⟨T T ∗ x, x⟩ = ∥T x∥2 − ∥T ∗ x∥2 = 0,

so by Corollary 3.76, we have T ∗ T − T T ∗ = 0. Therefore, we see that T ∗ T = T T ∗ , namely T is normal. ■
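A circulant matrix is normal without being self-adjoint or unitary, which makes it a convenient test case for the equality ∥T x∥ = ∥T ∗ x∥ (the concrete matrix is our own example):

```python
import numpy as np

# A circulant matrix commutes with its adjoint, hence is normal.
T = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 2.0],
              [2.0, 0.0, 1.0]])
Ts = T.conj().T
assert np.allclose(T @ Ts, Ts @ T)           # T T* = T*T

rng = np.random.default_rng(2)
for _ in range(50):                          # ||Tx|| = ||T*x|| on samples
    x = rng.standard_normal(3)
    assert np.isclose(np.linalg.norm(T @ x), np.linalg.norm(Ts @ x))
```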

4 Compact Linear Maps and Some Spectral Theory
4.1 Weak and Weak∗ Convergence
Definition 4.1. Let V be a normed linear space. A sequence (xn )n∈N of elements in V converges weakly to some
x ∈ V , denoted by xn −w→ x, if ϕ(xn ) → ϕ(x) for all ϕ ∈ V ∗ , in which case x is called a weak limit of (xn )n∈N .

Remark 4.1. In contrast to weak convergence, we refer to the convergence of elements in norm as strong convergence.

Theorem 4.1 (Properties of Weak Convergence). Let V be a normed linear spaces and (xn )n∈N be a sequence
of elements in V .
1. (Uniqueness of Weak Limits). If xn −w→ x ∈ V and xn −w→ y ∈ V , then x = y.

2. If (xn )n∈N converges strongly to some x ∈ V , it also converges weakly to x ∈ V .


3. If xn −w→ x ∈ V , then the sequence (xn )n∈N is bounded with

∥x∥ ≤ lim infn→∞ ∥xn ∥. (163)

Furthermore, for every sequence (ϕn )n∈N in V ∗ , if ϕn → ϕ ∈ V ∗ as well, then ϕn (xn ) → ϕ(x).

4. For every sequence (yn )n∈N of elements in V ,

xn −w→ x ∈ V and yn −w→ y ∈ V =⇒ xn + yn −w→ x + y. (164)

5. For every sequence (cn )n∈N of scalars,

xn −w→ x ∈ V and cn → c ∈ F =⇒ cn xn −w→ cx. (165)

6. For every normed linear space W and bounded linear map T : V → W , if xn −w→ x in V , then T xn −w→ T x in W as well.
Proof. 1. Suppose that x, y ∈ V are weak limits of (xn )n∈N . Then for every ϕ ∈ V ∗ , we have

ϕ(x − y) = ϕ(x) − ϕ(y) = limn→∞ ϕ(xn ) − limn→∞ ϕ(xn ) = 0.

By Corollary 3.38, we must have x − y = 0V whence x = y.


2. Suppose that xn → x in V as n → ∞. Then for every ϕ ∈ V ∗ , we see that

|ϕ(xn ) − ϕ(x)| = |ϕ(xn − x)| ≤ ∥ϕ∥∥xn − x∥ → ∥ϕ∥(0) = 0. (as n → ∞)


Consequently, we have xn −w→ x in this case.
3. As for the boundedness of (xn )n∈N , by Theorem 3.40, we have ∥ evxn ∥ = ∥xn ∥ for all n ∈ N, where evxn ∈ V ∗∗ is the evaluation map
at xn . Then for every ϕ ∈ V ∗ , we see that the sequence (evxn (ϕ))n∈N = (ϕ(xn ))n∈N , being convergent to ϕ(x), is also bounded. Note that
V ∗ is a Banach space, so by the Banach–Steinhaus theorem (cf. Theorem 3.22), we can conclude that supn∈N ∥xn ∥ = supn∈N ∥ evxn ∥ < ∞, as
desired.

Suppose that xn −w→ x ∈ V in this case now. Then for every ϕ ∈ V ∗ with ∥ϕ∥ ≤ 1, since |ϕ(xn )| ≤ ∥ϕ∥∥xn ∥ ≤ ∥xn ∥ for all n ∈ N, we have

|ϕ(x)| = limn→∞ |ϕ(xn )| = lim infn→∞ |ϕ(xn )| ≤ lim infn→∞ ∥xn ∥.
By Corollary 3.38, we then see that
∥x∥ = maxϕ∈V ∗ ,∥ϕ∥≤1 |ϕ(x)| ≤ lim infn→∞ ∥xn ∥.

Finally, let (ϕn )n∈N be a sequence in V ∗ converging to ϕ ∈ V ∗. Then for every n ∈ N,

|ϕn (xn ) − ϕ(x)| ≤ |ϕn (xn ) − ϕ(xn )| + |ϕ(xn ) − ϕ(x)|


≤ ∥ϕn − ϕ∥∥xn ∥ + |ϕ(xn ) − ϕ(x)| → 0. (as n → ∞)
Here ∥ϕn − ϕ∥∥xn ∥ → 0 because ϕn → ϕ and (xn )n∈N is bounded, while |ϕ(xn ) − ϕ(x)| → 0 as xn −w→ x. Therefore, we have ϕn (xn ) → ϕ(x)
in this case.
4–6. Let (xn )n∈N , (yn )n∈N be weakly convergent sequences in V with weak limits x, y, respectively, and let (cn )n∈N be a convergent
sequence of scalars with limit c ∈ F. Then for every ϕ ∈ V ∗ ,

|ϕ(xn + yn ) − ϕ(x + y)| = |(ϕ(xn ) + ϕ(yn )) − (ϕ(x) + ϕ(y))|


= |(ϕ(xn ) − ϕ(x)) + (ϕ(yn ) − ϕ(y))|
≤ |ϕ(xn ) − ϕ(x)| + |ϕ(yn ) − ϕ(y)| → 0 + 0 = 0, (as n → ∞)

and

|ϕ(cn xn ) − ϕ(cx)| = |cn ϕ(xn ) − cϕ(x)| = |(cn ϕ(xn ) − cn ϕ(x)) + (cn ϕ(x) − cϕ(x))|
≤ |cn ϕ(xn ) − cn ϕ(x)| + |cn ϕ(x) − cϕ(x)|
= |cn ||ϕ(xn ) − ϕ(x)| + |cn − c||ϕ(x)| → 0 + 0 = 0. (as n → ∞)
Therefore, we indeed have ϕ(xn + yn ) → ϕ(x + y) and ϕ(cn xn ) → ϕ(cx) for all ϕ ∈ V ∗ , so xn + yn −w→ x + y and cn xn −w→ cx. Finally, let W be
a normed linear space and T : V → W be a bounded linear map. Then for every ψ ∈ W ∗ ,

|ψ(T x) − ψ(T xn )| = |(T ∗ ψ)(x) − (T ∗ ψ)(xn )| → 0, (as n → ∞)


as xn −w→ x and T ∗ ψ ∈ V ∗ . Therefore, the sequence (T xn )n∈N converges weakly to T x in W as well. ■

Theorem 4.2. Let V be a finite-dimensional normed linear space. Then a sequence of elements in V is strongly
convergent if and only if it is weakly convergent.
Proof. The implication from strong convergence to weak convergence is clear from the preceding theorem. Now we let (xn )n∈N be a weakly
convergent sequence in V with weak limit x ∈ V . Because V is finite-dimensional, we may consider an arbitrary base {e1 , . . . , en } of V with
coordinate forms e∨1 , . . . , e∨n : V → F.
Now by Lemma 3.32, each e∨j is a bounded linear functional on V . Then according to our hypothesis, we also have e∨j (xn ) → e∨j (x) in
F as n → ∞. As a result, we may conclude from Theorem 1.28 that xn → x in V as well in this case. ■

Definition 4.2. Let V be a normed linear space. A sequence (ϕn )n∈N of bounded linear functionals on V
converges weakly∗ to some ϕ ∈ V ∗ , denoted by ϕn −w∗→ ϕ, if ϕn (x) → ϕ(x) for all x ∈ V , in which case ϕ is called
a weak∗ limit of (ϕn )n∈N .

Theorem 4.3 (Properties of Weak∗ Convergence). Let V be a normed linear space and (ϕn )n∈N be a sequence
of bounded linear functionals on V .
1. (Uniqueness of Weak∗ Limits). If ϕn −w∗→ ϕ ∈ V ∗ and ϕn −w∗→ ψ ∈ V ∗ , then ϕ = ψ.

2. For every ϕ ∈ V ∗ ,
ϕn → ϕ =⇒ ϕn −w→ ϕ =⇒ ϕn −w∗→ ϕ. (166)

The converse of the second implication is true if V is reflexive.


3. If ϕn −w∗→ ϕ ∈ V ∗ , then

∥ϕ∥ ≤ lim infn→∞ ∥ϕn ∥. (167)

Furthermore, the sequence (ϕn )n∈N is bounded if V is complete, in which case for every sequence (xn )n∈N
of elements in V , if xn → x ∈ V , then ϕn (xn ) → ϕ(x).

4. For every sequence (ψn )n∈N of elements in V ∗ ,

ϕn −w∗→ ϕ ∈ V ∗ and ψn −w∗→ ψ ∈ V ∗ =⇒ ϕn + ψn −w∗→ ϕ + ψ. (168)

5. For every sequence (cn )n∈N of scalars,

ϕn −w∗→ ϕ ∈ V ∗ and cn → c ∈ F =⇒ cn ϕn −w∗→ cϕ. (169)

Proof. 1. Suppose that ϕ, ψ ∈ V ∗ are weak∗ limits of (ϕn )n∈N . Then for every x ∈ V ,

ϕ(x) − ψ(x) = limn→∞ ϕn (x) − limn→∞ ϕn (x) = 0,

namely ϕ(x) = ψ(x). This then shows that ϕ = ψ, proving the uniqueness.
2. Let ϕ ∈ V ∗ be arbitrary. We prove the implications as follows:
• First, suppose that ϕn → ϕ. Then for every ϑ ∈ V ∗∗ ,

|ϑ (ϕn ) − ϑ (ϕ)| = |ϑ (ϕn − ϕ)| ≤ ∥ϑ ∥∥ϕn − ϕ∥ → 0. (as n → ∞)


This shows that ϕn −w→ ϕ as well.
• Next, suppose that ϕn −w→ ϕ. Then for every x ∈ V , considering the evaluation map evx ∈ V ∗∗ , we see that

|ϕn (x) − ϕ(x)| = | evx (ϕn ) − evx (ϕ)| → 0. (as n → ∞)


Therefore, we have ϕn −w∗→ ϕ as well.
• Finally, suppose that ϕn −w∗→ ϕ and V is reflexive. Then for each ϑ ∈ V ∗∗ , we have ϑ = evx for some x ∈ V . As a result, we see that

ϑ (ϕn ) = evx (ϕn ) = ϕn (x) → ϕ(x) = evx (ϕ) = ϑ (ϕ).


Consequently, we can conclude that ϕn −w→ ϕ in this case.
3. Suppose that ϕn −w∗→ ϕ ∈ V ∗ . Then for every x ∈ V with ∥x∥ ≤ 1, note that |ϕn (x)| ≤ ∥ϕn ∥∥x∥ ≤ ∥ϕn ∥, so

|ϕ(x)| = limn→∞ |ϕn (x)| = lim infn→∞ |ϕn (x)| ≤ lim infn→∞ ∥ϕn ∥.

Consequently,
∥ϕ∥ = supx∈V,∥x∥≤1 |ϕ(x)| ≤ lim infn→∞ ∥ϕn ∥.

Next, suppose that V is complete. Then for every x ∈ V , since ϕn (x) → ϕ(x), the sequence (ϕn (x))n∈N is bounded. Since V is now a Banach
space, it follows from the Banach-Steinhaus theorem (cf. Theorem 3.22) that (ϕn )n∈N is also bounded in V ∗ . In this case, let (xn )n∈N be a

convergent sequence in V with limit x. Then

|ϕn (xn ) − ϕ(x)| ≤ |ϕn (xn ) − ϕn (x)| + |ϕn (x) − ϕ(x)|


≤ ∥ϕn ∥∥xn − x∥ + |ϕn (x) − ϕ(x)| → 0 + 0 = 0. (as n → ∞)

Here ∥ϕn ∥∥xn − x∥ → 0 as (ϕn )n∈N is bounded and xn → x, while |ϕn (x) − ϕ(x)| → 0 because ϕn −w∗→ ϕ.
4–5. Let (ϕn )n∈N , (ψn )n∈N be weakly∗ convergent sequences in V ∗ with weak∗ limits ϕ, ψ ∈ V ∗ , respectively, and let (cn )n∈N be a
convergent sequence of scalars with limit c ∈ F. Then for every x ∈ V ,

|(ϕn + ψn )(x) − (ϕ + ψ)(x)| = |(ϕn (x) − ϕ(x)) + (ψn (x) − ψ(x))|


≤ |ϕn (x) − ϕ(x)| + |ψn (x) − ψ(x)| → 0 + 0 = 0, (as n → ∞)

and

|(cn ϕn )(x) − (cϕ)(x)| = |cn ϕn (x) − cϕ(x)| = |(cn ϕn (x) − cn ϕ(x)) + (cn ϕ(x) − cϕ(x))|
≤ |cn ||ϕn (x) − ϕ(x)| + |cn − c||ϕ(x)| → 0 + 0 = 0. (as n → ∞)

Therefore, we also have ϕn + ψn −w∗→ ϕ + ψ and cn ϕn −w∗→ cϕ. ■

Definition 4.3. Let V be an inner product space and x ∈ V . A sequence (xn )n∈N of elements in V converges
weakly to x, denoted by xn −w→ x, if ⟨xn , y⟩ → ⟨x, y⟩ as n → ∞ for every y ∈ V , in which case x is called a weak
limit of (xn )n∈N .

Remark 4.2. Let V be a Hilbert space with sequence (xn )n∈N of elements in it. Then we see that

xn −w→ x ∈ V ⇐⇒ ∀ϕ ∈ V ∗ : ϕ(xn ) → ϕ(x) ⇐⇒ ∀y ∈ V : ⟨xn , y⟩ → ⟨x, y⟩,

where the last equivalence follows from the Riesz representation theorem (cf. Theorem 3.43). Here we simply generalize the last characterization to arbitrary inner product spaces, in which the definition of weak convergence is slightly weaker than in normed linear spaces.

Theorem 4.4 (Properties of Weakly Convergent Sequences). Let V be an inner product space and let (xn )n∈N ,
(yn )n∈N be sequences of elements in V .
1. (Uniqueness of Weak Limits). If xn −w→ x ∈ V and xn −w→ y ∈ V , then x = y.

2. If V is complete, then every weakly convergent sequence in V is bounded.


3. If xn −w→ x ∈ V and yn −w→ y ∈ V , then xn + yn −w→ x + y.

4. If xn −w→ x ∈ V , then for every sequence (cn )n∈N of scalars with cn → c ∈ F, we have cn xn −w→ cx.
Proof. 1. Suppose that xn −w→ x ∈ V and xn −w→ y ∈ V . Then for every z ∈ V , we have

⟨x, z⟩ = limn→∞ ⟨xn , z⟩ = ⟨y, z⟩.

Consequently, we have ⟨x − y, z⟩ = ⟨x, z⟩ − ⟨y, z⟩ = 0 for all z ∈ V , hence x − y = 0V , namely x = y.

2. Suppose that V is complete and xn −w→ x ∈ V . For every n ∈ N, by Lemma 3.41, we see that the map

ηn : V → F : z 7→ ⟨z, xn ⟩

is a bounded linear functional on V with ∥ηn ∥ = ∥xn ∥. Furthermore, given any z ∈ V , since ηn (z) = ⟨z, xn ⟩ → ⟨z, x⟩, the sequence (ηn (z))n∈N
is bounded in F. Now because V is complete, we can see from the Banach-Steinhaus theorem (cf. Theorem 3.22) that supn∈N ∥xn ∥ =
supn∈N ∥ηn ∥ < ∞. That is, the sequence (xn )n∈N is bounded in V .
3–4. Suppose that xn −w→ x ∈ V and yn −w→ y ∈ V . Then for every z ∈ V , since ⟨xn , z⟩ → ⟨x, z⟩ and ⟨yn , z⟩ → ⟨y, z⟩, it follows that

⟨xn + yn , z⟩ = ⟨xn , z⟩ + ⟨yn , z⟩ → ⟨x, z⟩ + ⟨y, z⟩ = ⟨x + y, z⟩. (as n → ∞)


This shows that xn + yn −w→ x + y. Next, let (cn )n∈N be a sequence of scalars with cn → c ∈ F. Again, for every z ∈ V , since ⟨xn , z⟩ → ⟨x, z⟩,
it is clear that
⟨cn xn , z⟩ = cn ⟨xn , z⟩ → c⟨x, z⟩. (as n → ∞)
Therefore, we also have cn xn −w→ cx in this case. ■

Theorem 4.5 (Relationship Between Strong and Weak Convergences). Let V be an inner product space with
x ∈ V , and let (xn )n∈N be a sequence of elements in V . Then

xn → x ⇐⇒ xn −w→ x and ∥xn ∥ → ∥x∥. (170)

Proof. Suppose that xn → x ∈ V under the derived norm, namely ∥xn − x∥ → 0 as n → ∞. It is clear from Theorem 1.9 that ∥xn ∥ → ∥x∥.
Furthermore, for every y ∈ V , by CBS inequality,

|⟨xn , y⟩ − ⟨x, y⟩| = |⟨xn − x, y⟩| ≤ ∥xn − x∥∥y∥ → 0. (as n → ∞)


This shows that ⟨xn , y⟩ → ⟨x, y⟩, hence xn −w→ x in this case.
Conversely, suppose that xn −w→ x ∈ V and ∥xn ∥ → ∥x∥ as well. Then by the property of weak convergence, we have ⟨xn , x⟩ → ⟨x, x⟩ =
∥x∥2 as n → ∞. Consequently, by Theorem 2.1,

∥xn − x∥2 = ∥xn ∥2 + ∥x∥2 − 2ℜ⟨xn , x⟩ → ∥x∥2 + ∥x∥2 − 2ℜ∥x∥2 = 0. (as n → ∞)

Therefore, we indeed have xn → x in this case. ■
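The standard example separating the two notions is the orthonormal sequence (en )n∈N in ℓ2 : it converges weakly to 0 (since ⟨en , y⟩ = yn → 0 for every y ∈ ℓ2 ), yet ∥en ∥ = 1 ̸→ 0 = ∥0∥, so by (170) the convergence cannot be strong. A small numerical sketch, with y a fixed square-summable sequence of our choosing:

```python
import numpy as np

n = np.arange(1, 100001)
inner_with_y = 1.0 / (n + 1)        # <e_n, y> = y_n for y_k = 1/(k+1) in l^2

assert inner_with_y[-1] < 1e-4      # <e_n, y> -> 0: weak convergence to 0
norms = np.ones_like(inner_with_y)  # but ||e_n|| = 1 for every n
assert norms[-1] == 1.0             # so ||e_n|| does not tend to ||0|| = 0
```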

Corollary 4.6. Let V be a finite-dimensional inner product space. A sequence of elements in V is convergent
under the derived norm if and only if it is weakly convergent.
Proof. By the preceding theorem, it suffices to show that weak convergence in V implies the convergence of norms. Let (xn )n∈N be a sequence in V such that xn −w→ x. Since V is finite-dimensional, it possesses a finite orthonormal base {e1 , . . . , em }. By weak convergence, ⟨xn , e j ⟩ → ⟨x, e j ⟩ for each j = 1, . . . , m, so xn = ⟨xn , e1 ⟩e1 + · · · + ⟨xn , em ⟩em → ⟨x, e1 ⟩e1 + · · · + ⟨x, em ⟩em = x in norm; in particular, ∥xn ∥ → ∥x∥. ■

Lemma 4.7. Let V be an inner product space, and let S ⊆ V be a subset such that Span(S) is dense in V . Then
for every bounded sequence (xn )n∈N and element x in V , if ⟨xn , y⟩ → ⟨x, y⟩ for all y ∈ S, then xn −w→ x.
Proof. Since (xn )n∈N is bounded, there exists M ′ > 0 such that ∥xn ∥ ≤ M ′ for all n ∈ N. Furthermore, we may put M := M ′ ∨ ∥x∥. Let z ∈ V
and ε > 0 be arbitrary. Since Span(S) is dense in V , there exists y0 ∈ Span(S) such that ∥z − y0 ∥ < ε/(3M).
Clearly, if ⟨xn , y⟩ → ⟨x, y⟩ for all y ∈ S, we have ⟨xn , y⟩ → ⟨x, y⟩ for every y ∈ Span(S) as well. Now since ⟨xn , y0 ⟩ → ⟨x, y0 ⟩, there exists
N ∈ N such that
|⟨xn , y0 ⟩ − ⟨x, y0 ⟩| < ε/3, ∀n ≥ N.
In this case,

|⟨xn , z⟩ − ⟨x, z⟩| ≤ |⟨xn , z⟩ − ⟨xn , y0 ⟩| + |⟨xn , y0 ⟩ − ⟨x, y0 ⟩| + |⟨x, y0 ⟩ − ⟨x, z⟩|

< ∥xn ∥∥z − y0 ∥ + ε/3 + ∥x∥∥y0 − z∥ < M · ε/(3M) + ε/3 + M · ε/(3M) = ε.
This then shows that ⟨xn , z⟩ → ⟨x, z⟩. Since z ∈ V is arbitrary, it follows that xn −w→ x. ■

4.2 Compact Linear Maps and Their Properties


4.3 The Spectrum of Bounded Linear Maps
Definition 4.4. Let V be a Banach space and T ∈ B(V ).

1. The point spectrum of T is defined as

σ p (T ) := {λ ∈ F | λ idV −T is not injective}. (171)

Every element λ ∈ σ p (T ) is called an eigenvalue of T , and a nonzero x ∈ V with T x = λ x is called an eigenvector. The closed subspace ker(λ idV −T ) is called the eigenspace associated with λ .

2. The continuous spectrum of T is defined as

σc (T ) := {λ ∈ F | λ idV −T is injective but not surjective, with dense range}. (172)

3. The residual spectrum of T is defined as

σr (T ) := {λ ∈ F | λ idV −T is injective but not surjective, without dense range}. (173)

The spectrum of T is

σ (T ) := σ p (T ) ∪· σc (T ) ∪· σr (T ) = {λ ∈ F | λ idV −T is not bijective}. (174)
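In finite dimensions, λ idV −T fails to be bijective exactly when it fails to be injective, so σ (T ) = σ p (T ) and the continuous and residual spectra are empty. A NumPy illustration with a hypothetical 2 × 2 matrix of ours:

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [0.0, 3.0]])                   # triangular: eigenvalues 2, 3

eigenvalues = np.linalg.eigvals(T)
assert np.allclose(sorted(eigenvalues.real), [2.0, 3.0])

# lambda = 2 lies in sigma_p(T): 2*id - T is singular (not injective).
assert np.isclose(np.linalg.det(2.0 * np.eye(2) - T), 0.0)
# lambda = 5 lies in the resolvent set: 5*id - T is bijective.
assert not np.isclose(np.linalg.det(5.0 * np.eye(2) - T), 0.0)
```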

4.4 Spectral Decomposition in Hilbert Spaces

5 Differential Calculus on Banach Spaces
Definition 5.1. Let X,Y be normed linear spaces and Ω be an open subset of X. Then a map f : Ω → Y is
(Fréchet) differentiable at a point a ∈ Ω if there exists a bounded linear map A : X → Y such that for every h ∈ X with a + h ∈ Ω,
f (a + h) = f (a) + Ah + ∥h∥δ (h), where limh→0X δ (h) = 0. (175)

A Additional Topics on Linear Spaces
A.1 Complexification of Real Linear Spaces
Definition A.1. Let X be a real linear space. Then the complexification of X, denoted by XC , is defined on the
abelian group X ⊕ X under the following scalar multiplication by C:

∀a, b ∈ R and x, y ∈ X : (a + ib)(x, y) := (ax − by, ay + bx). (176)

Remark A.1. Observe that for every x ∈ X,

i(x, 0) = (0 + i1)(x, 0) = (0(x) − 1(0), 0(0) + 1(x)) = (0, x).

Therefore, for every x, y ∈ X,


(x, y) = (x, 0) + (0, y) = (x, 0) + i(y, 0).

Note that we have a canonical embedding ι : X → X ⊕ X : x 7→ (x, 0) that is linear over R with image X ⊕ {0}.
Then by identifying X with X ⊕ {0}, we may write each pair (x, y) ∈ X ⊕ X as a formal sum x + iy, in which case
the above scalar multiplication is also written by

∀a, b ∈ R and x, y ∈ X : (a + ib)(x + iy) = (ax − by) + i(ay + bx). (177)

Apparently, such scalar multiplication resembles the multiplication of complex numbers.
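For X = R, the rule (176) is nothing but complex multiplication; a two-line sketch (the helper scalar_mult is our own name):

```python
def scalar_mult(a, b, x, y):
    """(a + ib) . (x, y) on the complexification of X = R, as in (176)."""
    return (a * x - b * y, a * y + b * x)

u, v = scalar_mult(2.0, -1.0, 3.0, 5.0)      # (2 - i)(3 + 5i) = 11 + 7i
z = complex(2.0, -1.0) * complex(3.0, 5.0)   # ordinary complex product
assert abs(u - z.real) < 1e-12 and abs(v - z.imag) < 1e-12
```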

Theorem A.1. Let X be a real linear space. Then the complexification XC of X is a complex linear space.
Furthermore, for every real subspace V of X, its complexification VC is a complex subspace of XC .
Proof. It is already clear that the underlying set X ⊕ X is an additive abelian group. What remains is to simply verify the axioms for scalar
multiplications:
• For every x, y ∈ X,
1(x + iy) = (1 + i0)(x + iy) = (1(x) − 0(y)) + i(1(y) + 0(x)) = x + iy.

• For every x, y ∈ X and a, b, c, d ∈ R,

(a + ib)(x + iy) + (c + id)(x + iy) = ((ax − by) + i(ay + bx)) + ((cx − dy) + i(cy + dx))
= (ax − by + cx − dy) + i(ay + bx + cy + dx)
= ((a + c)x − (b + d)y) + i((a + c)y + (b + d)x)
= ((a + c) + i(b + d))(x + iy) = ((a + ib) + (c + id))(x + iy)

and

(a + ib)((c + id)(x + iy)) = (a + ib)((cx − dy) + i(cy + dx))


= (a(cx − dy) − b(cy + dx)) + i(a(cy + dx) + b(cx − dy))
= ((ac − bd)x − (ad + bc)y) + i((ac − bd)y + (ad + bc)x)
= ((ac − bd) + i(ad + bc))(x + iy) = ((a + ib)(c + id))(x + iy).

• For every x, x′ , y, y′ ∈ X and a, b ∈ R,

(a + ib)(x + iy) + (a + ib)(x′ + iy′ ) = ((ax − by) + i(ay + bx)) + ((ax′ − by′ ) + i(ay′ + bx′ ))
= (ax − by + ax′ − by′ ) + i(ay + bx + ay′ + bx′ )
= (a(x + x′ ) − b(y + y′ )) + i(a(y + y′ ) + b(x + x′ ))
= (a + ib)((x + x′ ) + i(y + y′ )) = (a + ib)((x + iy) + (x′ + iy′ )).

Next, let V be a real subspace of X. Then V ⊕V is certainly an additive subgroup of X ⊕ X. Furthermore, for every u, v ∈ V and a, b ∈ R,

(a + ib)(u + iv) = (au − bv) + i(av + bu) ∈ VC , since au − bv ∈ V and av + bu ∈ V .

This shows that VC is a complex subspace of XC , as desired. ■

Theorem A.2 (Base of Complexification). Let X be a real linear space. Then for every base (e j ) j∈J of X over
R, the family (e j + i0) j∈J is also a base of XC over C. In particular,

dimC (XC ) = dimR (X). (178)

Proof. Let (e j ) j∈J be a base of X over R. Then for every x, y ∈ X, there exist almost null families (a j ) j∈J and (b j ) j∈J of real numbers such
that
x = ∑ j∈J a j e j and y = ∑ j∈J b j e j .

Then
x + iy = ∑ j∈J a j e j + i ∑ j∈J b j e j = ∑ j∈J (a j e j + i(b j e j )) = ∑ j∈J (a j + ib j )(e j + i0).

Note that a j + ib j = 0 if and only if a j = b j = 0, so

{ j ∈ J | a_j + ib_j ≠ 0} = { j ∈ J | a_j ≠ 0 or b_j ≠ 0} = { j ∈ J | a_j ≠ 0} ∪ { j ∈ J | b_j ≠ 0}

is also a finite set. This shows that (e j + i0) j∈J is a spanning set for XC .
Furthermore, let (z j ) j∈J be an almost null family of complex numbers such that ∑ j∈J z j (e j + i0) = 0. Suppose that z j = a j + ib j with
a j , b j ∈ R for each j ∈ J. Then

0 = ∑_{j∈J} z_j (e_j + i0) = ∑_{j∈J} (a_j + ib_j )(e_j + i0) = ∑_{j∈J} (a_j e_j + i(b_j e_j )) = ∑_{j∈J} a_j e_j + i ∑_{j∈J} b_j e_j ,

so we must have
∑_{j∈J} a_j e_j = ∑_{j∈J} b_j e_j = 0.

However, the family (e_j )_{j∈J} is linearly independent over R, so the last equalities force a_j = b_j = 0 for all j ∈ J. This then shows that z_j = 0 for all j, so the family
(e_j + i0)_{j∈J} is also linearly independent over C. Consequently, it is a base of XC over C, implying that dimC (XC ) = |J| = dimR (X). ■
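The computation above can be mirrored numerically: taking X = R^2 with a concrete base, the coefficients of x + iy over C are exactly a_j + ib_j, where (a_j) and (b_j) are the real coefficient families of x and y. The following sketch (an illustration under these assumptions, not part of the proof) checks this with NumPy.

```python
import numpy as np

# A real base of X = R^2, written as the columns of E.
E = np.array([[1.0, 0.0],
              [2.0, 1.0]])

# Over C, the same columns still form a base of X_C = C^2:
# the complex determinant equals the (nonzero) real one.
assert abs(np.linalg.det(E.astype(complex))) > 1e-12

# Decompose x + iy over C and compare with the two real decompositions.
rng = np.random.default_rng(1)
x, y = rng.standard_normal(2), rng.standard_normal(2)
z = np.linalg.solve(E.astype(complex), x + 1j * y)   # coefficients a_j + i b_j
a = np.linalg.solve(E, x)                            # real coefficients of x
b = np.linalg.solve(E, y)                            # real coefficients of y
assert np.allclose(z, a + 1j * b)
```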

Theorem A.3 (Complexification of Linear Maps). Let X,Y be real linear spaces and T : X → Y be a real
linear map. Then there is a unique complex linear map TC : XC → YC extending T , namely making the following
diagram commute:
              T
      X ------------> Y
      |               |
      v               v
      XC -----------> YC
              TC

More precisely,
TC : XC → YC : x + iy 7→ T x + i(Ty). (179)

1. We have
(idX )C = idXC , ker(TC ) = ker(T )C and im(TC ) = im(T )C , (180)

so the map TC is injective (resp. surjective) if and only if T is.

2. For every real linear space Z and real linear map S : Y → Z,

(S ◦ T )C = SC ◦ TC . (181)

3. If T is a linear isomorphism, so is TC with

(TC )−1 = (T −1 )C . (182)

Proof. First, let S : XC → YC be a complex linear map extending T . Then for every x, y ∈ X,

S(x + iy) = S((x + i0) + i(y + i0)) = S(x + i0) + iS(y + i0)
= (T x + i0) + i(Ty + i0) = T x + i(Ty).

Therefore, it suffices to consider the following map

TC : XC → YC : x + iy 7→ T x + i(Ty).

What remains is to prove that TC is complex linear, while the uniqueness of such a map follows from the observation above:
• For every x, x′ , y, y′ ∈ X,

TC ((x + iy) + (x′ + iy′ )) = TC ((x + x′ ) + i(y + y′ ))


= T (x + x′ ) + i(T (y + y′ )) = (T x + T x′ ) + i(Ty + Ty′ )
= (T x + i(Ty)) + (T x′ + i(Ty′ )) = TC (x + iy) + TC (x′ + iy′ ).

• Let x, y ∈ X be arbitrary. Then for every a, b ∈ R,

TC ((a + ib)(x + iy)) = TC ((ax − by) + i(ay + bx)) = T (ax − by) + i(T (ay + bx))
= (a(T x) − b(Ty)) + i(a(Ty) + b(T x))
= (a + ib)(T x + i(Ty)) = (a + ib)TC (x + iy).

In particular, for every x, y ∈ X,


(idX )C (x + iy) = idX (x) + i(idX (y)) = x + iy = idXC (x + iy),
so we indeed have (idX )C = idXC . We then study the injectivity and surjectivity of TC :
• For every x, y ∈ X,

x + iy ∈ ker(TC ) ⇐⇒ 0Y + i0Y = TC (x + iy) = T x + i(Ty)


⇐⇒ T x = Ty = 0Y ⇐⇒ x, y ∈ ker(T ).

Consequently, we have ker(TC ) = ker(T ) + i ker(T ) = ker(T )C . Therefore,

TC is injective ⇐⇒ ker(TC ) = {0X + i0X } ⇐⇒ ker(T ) = {0X } ⇐⇒ T is injective.

• For every y1 , y2 ∈ Y ,

y1 + iy2 ∈ im(TC ) ⇐⇒ ∃x1 , x2 ∈ X : y1 + iy2 = TC (x1 + ix2 ) = T x1 + i(T x2 )


⇐⇒ ∃x1 , x2 ∈ X : y1 = T x1 and y2 = T x2 ⇐⇒ y1 , y2 ∈ im(T ).

This also shows that im(TC ) = im(T ) + i im(T ) = im(T )C , hence

TC is surjective ⇐⇒ im(TC ) = YC ⇐⇒ im(T ) = Y ⇐⇒ T is surjective.

Next, let Z be a real linear space and S : Y → Z be a real linear map. For every x, y ∈ X,

(SC ◦ TC )(x + iy) = SC (TC (x + iy)) = SC (T x + i(Ty))
= S(T x) + i(S(Ty)) = (S ◦ T )x + i((S ◦ T )y) = (S ◦ T )C (x + iy).

The desired identity (S ◦ T )C = SC ◦ TC thus follows. Finally, suppose that T is a linear isomorphism, namely bijective. Then as noted above,

idXC = (idX )C = (T −1 ◦ T )C = (T −1 )C ◦ TC ,

and
idYC = (idY )C = (T ◦ T −1 )C = TC ◦ (T −1 )C .
This shows that TC is also a linear isomorphism with inverse (T −1 )C , namely (TC )−1 = (T −1 )C . ■

Corollary A.4. Let X,Y be finite-dimensional real linear spaces with respective ordered bases β = (e1 , . . . , en )
and γ = ( f1 , . . . , fm ). Then for every linear map T : X → Y , we have

γC [TC ]βC = γ [T ]β , (183)

where βC = (e1 + i0X , . . . , en + i0X ) and γC = ( f1 + i0Y , . . . , fm + i0Y ) are the respective ordered bases of XC and
YC .
Proof. Let T : X → Y be a linear map. Then for every j = 1, . . . , n,
TC (e_j + i0X ) = Te_j + i(T 0X ) = Te_j + i0Y = ∑_{i=1}^{m} f_i^∨ (Te_j )( f_i + i0Y ),

where each fi∨ : Y → R is the corresponding coordinate form on Y . Consequently, for every i = 1, . . . , m and j = 1, . . . , n, the (i, j)-entry of
γC [TC ]βC is fi∨ (Te j ), which is also the (i, j)-entry of γ [T ]β . The desired identity follows as well. ■
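In coordinates, Corollary A.4 amounts to the familiar fact that the real matrix of T , acting on a complex coordinate vector, computes T x + i(Ty). A brief numerical illustration (with an arbitrary random matrix standing in for the matrix of T ):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 2))   # matrix of T : R^2 -> R^3 in the standard bases

x, y = rng.standard_normal(2), rng.standard_normal(2)

# T_C(x + iy) = Tx + i(Ty) is computed by the *same* real matrix acting on C^2,
# since matrix multiplication is linear over C when the entries are real.
lhs = A.astype(complex) @ (x + 1j * y)
rhs = A @ x + 1j * (A @ y)
assert np.allclose(lhs, rhs)
```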

Theorem A.5 (Universal Property). Let X be a real linear space and Y be a complex linear space. Then for
every real linear map T : X → Y , there is a unique complex linear map T̃ : XC → Y extending T .

              T
      X ------------> Y
      |             ↗
      v         T̃ /
      XC --------/

More precisely,
T̃ : XC → Y : x + iy 7→ T x + i(Ty). (184)
Proof. Let T : X → Y be a real linear map. Suppose that S : XC → Y is a complex linear map extending T . Then for every x, y ∈ X,

S(x + iy) = S((x + i0) + i(y + i0)) = S(x + i0) + iS(y + i0) = T x + i(Ty).

Consequently, we may define
T̃ : XC → Y : x + iy 7→ T x + i(Ty).
What remains is to show that T̃ is linear over C, while all other conditions have been verified as above:
• For every x, x′ , y, y′ ∈ X,

T̃ ((x + iy) + (x′ + iy′ )) = T̃ ((x + x′ ) + i(y + y′ ))


= T (x + x′ ) + i(T (y + y′ )) = (T x + T x′ ) + i(Ty + Ty′ )
= (T x + i(Ty)) + (T x′ + i(Ty′ )) = T̃ (x + iy) + T̃ (x′ + iy′ ).

• Let x, y ∈ X be arbitrary. Then for every a, b ∈ R,

T̃ ((a + ib)(x + iy)) = T̃ ((ax − by) + i(ay + bx)) = T (ax − by) + i(T (ay + bx))
= (a(T x) − b(Ty)) + i(a(Ty) + b(T x))
= (a + ib)(T x + i(Ty)) = (a + ib)T̃ (x + iy). ■
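To see the complex linearity of T̃ concretely, one can take X = R^2 and Y = C^3 with T given by an arbitrary complex matrix (any such matrix defines a real linear map into Y , since real scalars act entrywise). The check below is only an illustration of the identity T̃ ((a + ib)(x + iy)) = (a + ib)T̃ (x + iy):

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))

def T(v):
    # A real linear map T : R^2 -> Y = C^3 (real-linear since B@(av+bw) = a*B@v + b*B@w
    # for real a, b).
    return B @ v

def T_tilde(x, y):
    """The complex linear extension: T~(x + iy) = Tx + i(Ty)."""
    return T(x) + 1j * T(y)

x, y = rng.standard_normal(2), rng.standard_normal(2)
c = complex(0.8, -1.3)
a, b = c.real, c.imag

# In X_C we have c(x + iy) = (ax - by) + i(ay + bx), so complex linearity reads:
lhs = T_tilde(a * x - b * y, a * y + b * x)
rhs = c * T_tilde(x, y)
assert np.allclose(lhs, rhs)
```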

Remark A.2. In algebra, one can also extend the scalars from a small field to a larger one by tensor products.
That is, given any real linear space X, the tensor product C ⊗R X also has complex linear space structure:

∀c, z ∈ C and x ∈ X : c(z ⊗ x) = cz ⊗ x.

In fact, such structure is isomorphic to XC as constructed before.

Corollary A.6 (Alternative Complexification). Let X be a real linear space. Then XC is isomorphic to C ⊗R X
as complex linear spaces via a unique isomorphism.
Proof. We first construct complex linear maps in both directions:
• First, consider the canonical embedding x 7→ x + i0X from X to XC , which is a real linear map. Then by the universal property of the
extension of scalars, we can find a unique complex linear map f : C ⊗R X → XC such that f (1 ⊗ x) = x + i0X for all x ∈ X.
• Conversely, note that we also have another canonical embedding x 7→ 1 ⊗ x from X to C ⊗R X, which is also a real linear map. By
the preceding theorem, it extends to a complex linear map

g : XC → C ⊗R X : x + iy 7→ 1 ⊗ x + i(1 ⊗ y) = 1 ⊗ x + i ⊗ y.

We then show that f , g are mutual inverses, hence are both isomorphisms:
• For every x ∈ X,
g( f (1 ⊗ x)) = g(x + i0X ) = 1 ⊗ x + i ⊗ 0X = 1 ⊗ x.
Since g ◦ f is complex linear and the tensors 1 ⊗ x span C ⊗R X over C, it follows that g ◦ f is the identity map on C ⊗R X.
• For every x, y ∈ X,

f (g(x + iy)) = f (1 ⊗ x + i ⊗ y) = f (1 ⊗ x) + f (i ⊗ y)
= f (1 ⊗ x) + i f (1 ⊗ y) = (x + i0X ) + i(y + i0X ) = x + iy,

hence f ◦ g is also the identity map on XC . ■

Definition A.2. Let X be a complex linear space with real subspace V . Then X is called a complexification of
V , and V is called an R-form of X, if X = V + iV and V ∩ iV = {0}, where iV = {iv | v ∈ V } ⊆ X.

Theorem A.7. Let X be a complex linear space with real subspace V . Then the following statements are
equivalent:

1. The space X is a complexification of V .

2. Every base of V over R is also a base of X over C.

3. Some base of V over R is also a base of X over C.

4. The complex linear map C ⊗R V → X in which z ⊗ v 7→ zv for every z ∈ C and v ∈ V is an isomorphism of complex linear spaces.
Proof. (1 =⇒ 2). Suppose that X is a complexification of V . Let (e j ) j∈J be a base of V over R. Then for every x ∈ X, we have x = u + iv
for some u, v ∈ V . Consequently,

x = u + iv = ∑_{j∈J} e_j^∨ (u)e_j + i ∑_{j∈J} e_j^∨ (v)e_j = ∑_{j∈J} (e_j^∨ (u) + ie_j^∨ (v))e_j ,

hence (e j ) j∈J is a spanning set of X over C. Furthermore, let (z j ) j∈J be an almost null family of complex numbers such that 0 = ∑ j∈J z j e j .
Suppose that z j = a j + ib j with a j , b j ∈ R for each j. Then

0 = ∑_{j∈J} z_j e_j = ∑_{j∈J} (a_j + ib_j )e_j = ∑_{j∈J} a_j e_j + i ∑_{j∈J} b_j e_j ,

so we have
i ∑_{j∈J} b_j e_j = − ∑_{j∈J} a_j e_j ∈ iV ∩ V = {0}.

Then by the linear independence of (e j ) j∈J , we must have a j = b j = 0 for all j ∈ J, namely z j = 0 for all j. This shows that (e j ) j∈J is also
linearly independent over C, hence a base of X over C.
(2 =⇒ 3). This is trivial.
(3 =⇒ 4). Suppose that some base (e j ) j∈J of V over R is a base of X over C. Consider the complex linear map ϕ : C ⊗R V → X in
which z ⊗ v 7→ zv for every z ∈ C and v ∈ V . Now ϕ(1 ⊗ e j ) = 1e j = e j holds for all j ∈ J, while (1 ⊗ e j ) j∈J is a base of C ⊗R V over C,
hence the map ϕ is indeed an isomorphism.
(4 =⇒ 1). Suppose that such complex linear map C ⊗R V → X is now an isomorphism. Consider the real subspace V̂ := 1 ⊗ V of
C ⊗R V , which is now isomorphic to V as real linear spaces. Note that

iV̂ = {i(1 ⊗ v) | v ∈ V } = {i ⊗ v | v ∈ V },

so it is clear that C ⊗R V = V̂ + iV̂ and V̂ ∩ iV̂ = {0}. Consequently, C ⊗R V is a complexification of V̂ , hence X itself is a complexification
of V as well. ■
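For a concrete instance of Definition A.2, take V = R^2 inside X = C^2: splitting a vector into real and imaginary parts exhibits X = V + iV , and the split is unique, which encodes V ∩ iV = {0}. A small numerical illustration (not part of the proof):

```python
import numpy as np

# V = R^2 inside X = C^2: X is a complexification of V in the sense of Definition A.2.
rng = np.random.default_rng(4)
z = rng.standard_normal(2) + 1j * rng.standard_normal(2)

# X = V + iV: every z splits as z = u + iv with u, v real vectors.
u, v = z.real, z.imag
assert np.allclose(z, u + 1j * v)

# The split is unique (equivalently V ∩ iV = {0}): any decomposition z = u' + iv'
# with u', v' real forces u' = Re(z) and v' = Im(z), recovered here via conjugation.
assert np.allclose(u, (z + z.conj()).real / 2)
assert np.allclose(v, (z - z.conj()).imag / 2)
```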

A.2 Topological Linear Spaces
Definition A.3. A linear space V over field F is called a topological linear space if it is endowed with a topology
under which the addition + : V × V → V and scalar multiplication · : F × V → V are both continuous, where
V ×V and F ×V are endowed with the respective product topologies.

Remark A.3. The continuity of the addition and scalar multiplication tells us that given any sequences (xn )n∈N
and (yn )n∈N in V and any sequence (cn )n∈N in F,

• if xn → x ∈ V and yn → y ∈ V , then xn + yn → x + y;

• if xn → x ∈ V and cn → c ∈ F, then cn xn → cx as well.

Theorem A.8. Let V be a topological linear space over F.

1. For every x ∈ V , the translation map ϑx : V → V : y 7→ x + y is a homeomorphism on V .

2. For every nonzero a ∈ F, the homothety ηa : V → V : x 7→ a · x is also a homeomorphism on V .


Proof. The first statement is clear as every topological linear space is an additive topological group. Now let a ∈ F be nonzero. Clearly,
we have ηa ◦ ηa−1 = idV = ηa−1 ◦ ηa , so ηa is invertible with inverse ηa−1 . What remains is to prove that each such ηa is continuous: Note that the
following injection
ιa : V → F ×V : x 7→ (a, x)
is continuous, hence ηa = µ ◦ ιa , where µ : F ×V → V is the scalar multiplication, is also continuous. The proof is complete. ■

Remark A.4. As in topological groups, the topology of a topological linear space V is ultimately determined
by the neighborhood filter of 0V .

Theorem A.9. Let V be a linear space over F and (Vi )i∈I be a nonempty family of topological linear spaces over F. Then
for every family ( fi : V → Vi )i∈I of linear maps, the linear space V is always a topological linear space under the initial
topology with respect to ( fi )i∈I .
Proof. Let ( fi : V → Vi )i∈I be a family of linear maps and suppose that V is endowed with the initial topology with respect to ( fi )i∈I . Fix an
arbitrary i ∈ I.
1. For each a ∈ F and x ∈ V , since fi is linear, we have fi (ax) = a fi (x). If we denote by η : F × V → V and ηi : F × Vi → Vi the
respective scalar multiplications, the above shows that fi ◦ η = ηi ◦ (idF × fi ). Note that ηi and idF × fi are continuous, so each
composition fi ◦ η is continuous. Since i ∈ I is arbitrary, the universal property of the initial topology shows that η is continuous as well.
2. Likewise, for each x, y ∈ V , we have fi (x + y) = fi (x) + fi (y), hence fi ◦ ϑ = ϑi ◦ ( fi × fi ), where ϑ : V ×V → V and ϑi : Vi ×Vi → Vi
are the respective addition maps. By the same arguments as above, we see that the addition ϑ is also continuous.
Therefore, the linear space V is now a topological linear space under the initial topology, as desired. ■

Theorem A.10. Let V be a topological linear space. For every C ⊆ V , if C is convex, so is its closure cl(C).
Proof. Let C be a convex subset of V and fix α ∈ [0, 1]. The map ϕ : V ×V → V : (x, y) 7→ αx + (1 − α)y is continuous, being built from
the continuous addition and scalar multiplication. Since C is convex, we have ϕ(C ×C) ⊆ C. In the product topology, cl(C) × cl(C) = cl(C ×C),
and continuity of ϕ gives ϕ(cl(C ×C)) ⊆ cl(ϕ(C ×C)). Therefore
ϕ(cl(C) × cl(C)) = ϕ(cl(C ×C)) ⊆ cl(ϕ(C ×C)) ⊆ cl(C),
namely αx + (1 − α)y ∈ cl(C) for all x, y ∈ cl(C). This shows that cl(C) is also a convex subset of V . ■
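As a numerical sanity check (not a proof), take C to be the open unit disk in R^2: boundary points belong to the closure but not to C, and any convex combination of them stays in the closed disk.

```python
import numpy as np

rng = np.random.default_rng(5)

def in_closure(p, tol=1e-9):
    """Membership in the closed unit disk, the closure of the open disk C."""
    return np.linalg.norm(p) <= 1 + tol

# Two random boundary points: in the closure, but not in the open disk C itself.
theta1, theta2 = rng.uniform(0, 2 * np.pi, size=2)
x = np.array([np.cos(theta1), np.sin(theta1)])
y = np.array([np.cos(theta2), np.sin(theta2)])
assert in_closure(x) and in_closure(y)

# Every convex combination alpha*x + (1-alpha)*y remains in the closure.
for alpha in np.linspace(0.0, 1.0, 11):
    assert in_closure(alpha * x + (1 - alpha) * y)
```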
