A Short Note on Functional Analysis
Contents
1 Normed Linear Spaces and Banach Spaces
1.1 Semi-Norms and Norms
1.2 Operations on Semi-Norms
1.3 Equivalence of Semi-Norms
1.4 Finite-Dimensional Normed Linear Spaces
1.5 Additional Examples of Norms and Semi-Norms
5 Differential Calculus on Banach Spaces
All linear spaces in this note, unless specified otherwise, are over the field F = R or C.
1 Normed Linear Spaces and Banach Spaces
1.1 Semi-Norms and Norms
Definition 1.1. Let V be a linear space. A semi-norm on V is a function ∥ · ∥ : V → R that satisfies the following
conditions:
1. (Homogeneity) For every x ∈ V and c ∈ F, ∥cx∥ = |c|∥x∥.
2. (Triangle inequality) For every x, y ∈ V , ∥x + y∥ ≤ ∥x∥ + ∥y∥.
The pair (V, ∥ · ∥) is called a semi-normed linear space. In practice, we usually say that V is a semi-normed
linear space under ∥ · ∥. Finally, if furthermore
∀x ∈ V : ∥x∥ = 0 ⇐⇒ x = 0V , (1)
then such semi-norm is called a norm on V , by which V is called a normed linear space.
Remark 1.1. It is readily seen that once we restrict ∥ · ∥ to any subspace of V , the two axioms still hold for such
restricted map. Consequently, given any subspace V ′ of V , the restriction of every semi-norm (resp. norm) on V
to V ′ is also a semi-norm (resp. norm) on the subspace.
Theorem 1.1. Let V be a semi-normed linear space. Then ∥0V ∥ = 0, and for every x, y ∈ V , we have ∥x∥ ≥ 0 and
|∥x∥ − ∥y∥| ≤ ∥x − y∥.
In particular, if ∥x − y∥ = 0, then ∥x∥ = ∥y∥.
Proof. By homogeneity, ∥0V ∥ = ∥0 · 0V ∥ = |0|∥0V ∥ = 0. Now let x ∈ V be arbitrary. Again by homogeneity,
∥−x∥ = |−1|∥x∥ = ∥x∥.
As a result, by the triangle inequality,
0 = ∥0V ∥ = ∥x + (−x)∥ ≤ ∥x∥ + ∥−x∥ = 2∥x∥.
This then shows that ∥x∥ ≥ 0, as desired. Finally, let y ∈ V be arbitrary as well. Again, by the triangle inequality, we have
∥x∥ = ∥(x − y) + y∥ ≤ ∥x − y∥ + ∥y∥,
hence we have ∥x∥ − ∥y∥ ≤ ∥x − y∥. By symmetry, we also have ∥y∥ − ∥x∥ ≤ ∥y − x∥. Since ∥x − y∥ = ∥−(x − y)∥ = ∥y − x∥, it follows that
|∥x∥ − ∥y∥| ≤ ∥x − y∥. As we can see, if ∥x − y∥ = 0, then
0 ≤ |∥x∥ − ∥y∥| ≤ ∥x − y∥ = 0,
so ∥x∥ = ∥y∥. ■
Definition 1.2. Let V be a semi-normed linear space. An element x ∈ V is called unit if ∥x∥ = 1.
Lemma 1.2 (Normalization). Let V be a semi-normed linear space. For every x ∈ V , if ∥x∥ > 0, the vector
x̂ := ∥x∥−1 x is unit and collinear with x.
Proof. Let x ∈ V be arbitrary with ∥x∥ > 0. Then clearly, x̂ := ∥x∥−1 x, being a scalar multiple of x, is certainly collinear with x, and by homogeneity,
∥x̂∥ = ∥∥x∥−1 x∥ = ∥x∥−1 ∥x∥ = 1. ■
Theorem 1.3. Let V be a semi-normed linear space. For every nonzero x, y ∈ V with respective normalizations
x̂ := ∥x∥−1 x and ŷ := ∥y∥−1 y,
∥x̂ − ŷ∥ ≤ 4∥x − y∥ / (∥x∥ + ∥y∥). (4)
Proof. Let x, y ∈ V be nonzero with respective normalizations x̂, ŷ. Then
∥x∥∥x̂ − ŷ∥ = ∥x − (∥x∥/∥y∥)y∥ ≤ ∥x − y∥ + ∥y − (∥x∥/∥y∥)y∥
= ∥x − y∥ + (|∥y∥ − ∥x∥| / ∥y∥)∥y∥
= ∥x − y∥ + |∥y∥ − ∥x∥| ≤ 2∥x − y∥.
By symmetry, we also have ∥y∥∥x̂ − ŷ∥ ≤ 2∥x − y∥. Adding the two estimates gives (∥x∥ + ∥y∥)∥x̂ − ŷ∥ ≤ 4∥x − y∥, which is exactly (4). ■
Lemma 1.4. Let V be a semi-normed linear space. For every nonzero x, y ∈ V with respective normalizations
x̂ := ∥x∥−1 x and ŷ := ∥y∥−1 y,
∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)(∥x∥ ∨ ∥y∥) ≤ ∥x + y∥ ≤ ∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)(∥x∥ ∧ ∥y∥). (5)
Proof. Let x, y ∈ V be nonzero with respective normalizations x̂, ŷ. Without loss of generality, we may assume that ∥x∥ ≤ ∥y∥, so that ∥x∥ = ∥x∥ ∧ ∥y∥ and ∥y∥ = ∥x∥ ∨ ∥y∥. Observe that
∥x + y∥ = ∥∥x∥x̂ + (∥x∥/∥y∥)y + (1 − ∥x∥/∥y∥)y∥ ≤ ∥x∥∥x̂ + ŷ∥ + (1 − ∥x∥/∥y∥)∥y∥
= ∥x∥∥x̂ + ŷ∥ + (∥y∥ − ∥x∥)
= ∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)∥x∥,
and
∥x + y∥ = ∥∥y∥ŷ + (∥y∥/∥x∥)x − ((∥y∥/∥x∥) − 1)x∥ ≥ |∥y∥∥ŷ + x̂∥ − ((∥y∥/∥x∥) − 1)∥x∥|
= |∥y∥∥ŷ + x̂∥ − (∥y∥ − ∥x∥)|
= |∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)∥y∥| ≥ ∥x∥ + ∥y∥ − (2 − ∥x̂ + ŷ∥)∥y∥.
The two estimates together give the lemma. ■
Theorem 1.5. Let V be a semi-normed linear space. For every nonzero x, y ∈ V with respective normalizations
x̂ := ∥x∥−1 x and ŷ := ∥y∥−1 y,
(∥x − y∥ − |∥x∥ − ∥y∥|) / (∥x∥ ∧ ∥y∥) ≤ ∥x̂ − ŷ∥ ≤ (∥x − y∥ + |∥x∥ − ∥y∥|) / (∥x∥ ∨ ∥y∥). (6)
Proof. Let x, y ∈ V be nonzero with respective normalizations x̂, ŷ. By the preceding lemma applied to x and −y, we have
∥x∥ + ∥y∥ − (2 − ∥x̂ − ŷ∥)(∥x∥ ∨ ∥y∥) ≤ ∥x − y∥.
Consequently, rearranging and noting that 2(∥x∥ ∨ ∥y∥) − ∥x∥ − ∥y∥ = |∥x∥ − ∥y∥|,
∥x̂ − ŷ∥ ≤ (∥x − y∥ + |∥x∥ − ∥y∥|) / (∥x∥ ∨ ∥y∥).
Similarly,
∥x − y∥ ≤ ∥x∥ + ∥y∥ − (2 − ∥x̂ − ŷ∥)(∥x∥ ∧ ∥y∥),
so
∥x̂ − ŷ∥ ≥ (∥x − y∥ − |∥x∥ − ∥y∥|) / (∥x∥ ∧ ∥y∥). ■
Remark 1.2. In addition, the upper bound in the preceding theorem is better than the one in Theorem 1.3: since |∥x∥ − ∥y∥| ≤ ∥x − y∥ and ∥x∥ ∨ ∥y∥ ≥ (∥x∥ + ∥y∥)/2, we have
(∥x − y∥ + |∥x∥ − ∥y∥|) / (∥x∥ ∨ ∥y∥) ≤ 2∥x − y∥ / (∥x∥ ∨ ∥y∥) ≤ 4∥x − y∥ / (∥x∥ + ∥y∥).
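The normalization estimates above are easy to sanity-check numerically. The following Python sketch (our own illustration, not part of the note) tests the bound of Theorem 1.3 and the two-sided bound of Theorem 1.5 on random vectors in R3 under the Euclidean norm; the helper names `norm` and `hat` are ours.

```python
import random

def norm(v):
    # Euclidean norm on R^3
    return sum(t * t for t in v) ** 0.5

def hat(v):
    # normalization v / ||v||, assuming ||v|| > 0
    n = norm(v)
    return [t / n for t in v]

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-5.0, 5.0) for _ in range(3)]
    y = [random.uniform(-5.0, 5.0) for _ in range(3)]
    nx, ny = norm(x), norm(y)
    d = norm([a - b for a, b in zip(x, y)])             # ||x - y||
    dh = norm([a - b for a, b in zip(hat(x), hat(y))])  # ||x^ - y^||
    # Theorem 1.3: ||x^ - y^|| <= 4 ||x - y|| / (||x|| + ||y||)
    assert dh <= 4.0 * d / (nx + ny) + 1e-9
    # Theorem 1.5: two-sided bound via max and min of the norms
    assert dh <= (d + abs(nx - ny)) / max(nx, ny) + 1e-9
    assert dh >= (d - abs(nx - ny)) / min(nx, ny) - 1e-9
```

Since the proofs only use the norm axioms, replacing the Euclidean norm by any other norm on R3 should leave every assertion intact.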
Theorem 1.6 (Metric Space Structure). Let V be a semi-normed linear space. Then the following function
d : V ×V → R : (x, y) 7→ ∥x − y∥ (7)
is a pseudo-metric on V , which is a metric if and only if ∥ · ∥ is a norm.
Proof. Clearly, d is non-negative and symmetric with d(x, x) = 0 for every x ∈ V . For every x, y, z ∈ V ,
d(x, z) = ∥x − z∥ = ∥(x − y) + (y − z)∥ ≤ ∥x − y∥ + ∥y − z∥ = d(x, y) + d(y, z).
Hence d is a pseudo-metric on V . If ∥ · ∥ is a norm, then for every x, y ∈ V ,
d(x, y) = 0 ⇐⇒ ∥x − y∥ = 0 ⇐⇒ x − y = 0V ⇐⇒ x = y,
so d is a metric. Conversely, if d is a metric, then for every x ∈ V ,
∥x∥ = 0 ⇐⇒ d(x, 0V ) = 0 ⇐⇒ x = 0V ,
so ∥ · ∥ is a norm. ■
Corollary 1.7. Every semi-normed linear space is a topological linear space under the pseudo-metric topology,
by which the semi-norm is uniformly continuous.
Proof. Let V be a semi-normed linear space with semi-norm ∥ · ∥.
• Let x, y ∈ V be arbitrary. Then for every x′ , y′ ∈ V ,
∥(x′ + y′ ) − (x + y)∥ ≤ ∥x′ − x∥ + ∥y′ − y∥.
Consequently, if (x′ , y′ ) → (x, y) in V ×V , then we have x′ → x and y′ → y, hence ∥(x′ + y′ ) − (x + y)∥ → 0 as well. In other words,
we also have x′ + y′ → x + y in V , so the addition on V is continuous.
• Let x ∈ V and c ∈ F be arbitrary. Then for every c′ ∈ F and x′ ∈ V ,
∥c′ x′ − cx∥ = ∥(c′ − c)x′ + c(x′ − x)∥ ≤ |c′ − c|∥x′ ∥ + |c|∥x′ − x∥.
If (c′ , x′ ) → (c, x) in F × V , we shall also have c′ → c and x′ → x, hence |c′ − c| → 0 and ∥x′ − x∥ → 0 as well, while (∥x′ ∥) remains bounded. Together with the
inequality above, we may conclude that c′ x′ → cx as well, so the scalar multiplication is continuous then.
Now that the addition and the scalar multiplication are both continuous, V is a topological linear space. Finally, let (xn )n∈N be a sequence
in V such that xn → x ∈ V under d. Then for every ε > 0, there exists N ∈ N such that ∥xn − x∥ < ε for n ≥ N. As noted in Theorem 1.1, we
can see that
|∥xn ∥ − ∥x∥| ≤ ∥xn − x∥ < ε, ∀n ≥ N.
Therefore, the semi-norm ∥ · ∥ : V → R is itself uniformly continuous as well. ■
Theorem 1.8 (Closed Span). Let V be a normed linear space with subset S. The closure Span(S) of the subspace
spanned by S is the smallest closed subspace of V containing S.
Proof. Let S ⊆ V be an arbitrary subset. It is clear that Span(S) is closed in V , so we first verify that it is a subspace: Let x, y ∈ Span(S)
and c ∈ F be arbitrary. Then by the property of closure, we can find sequences (xn )n∈N and (yn )n∈N in Span(S) such that xn → x and yn → y.
By the continuity of addition and scalar multiplication, we see that cxn + yn → cx + y; since cxn + yn ∈ Span(S) for every n ∈ N, it follows that cx + y ∈ Span(S). Therefore, Span(S) is a closed subspace of V containing S. Finally, every closed subspace of V containing S contains Span(S), hence also contains its closure Span(S), so Span(S) is the smallest one. ■
Definition 1.3. Let V be a normed linear space. A topological base of V is a linearly independent subset B such
that the subspace of V spanned by B is dense in it, namely V = Span(B).
Remark 1.3. In functional analysis, a base of a normed linear space, namely a linearly independent subset that
spans the whole space, is usually referred to as a Hamel base or an algebraic base. Clearly, every Hamel base is
also a topological base.
Definition 1.4. Let V be a semi-normed linear space. A sequence (xn )n∈N of elements in V is a Cauchy sequence
if for every ε > 0, there exists N ∈ N such that ∥xm − xn ∥ < ε for all m, n ≥ N.
Theorem 1.9 (Cauchy Sequences in Normed Linear Spaces). Let V be a semi-normed linear space, and (xn )n∈N
be a sequence in V .
1. If the sequence (xn )n∈N is Cauchy, then it is bounded, namely supn∈N ∥xn ∥ < ∞.
2. If xn → x in V for some x ∈ V , then the sequence (xn )n∈N is also Cauchy and ∥xn ∥ → ∥x∥ in R.
Proof. 1. Suppose that (xn )n∈N is Cauchy. Then there exists N0 ∈ N such that ∥xm − xn ∥ < 1 for all m, n ≥ N0 . In this case, for n ≥ N0 ,
∥xn ∥ ≤ ∥xn − xN0 ∥ + ∥xN0 ∥ < 1 + ∥xN0 ∥,
hence supn∈N ∥xn ∥ ≤ ∥x1 ∥ ∨ · · · ∨ ∥xN0 ∥ ∨ (1 + ∥xN0 ∥) < ∞.
2. Suppose that xn → x in V . Then for every ε > 0, there exists N ∈ N such that ∥xn − x∥ < ε/2 for all n ≥ N, hence for every m, n ≥ N,
∥xm − xn ∥ ≤ ∥xm − x∥ + ∥x − xn ∥ < ε.
This shows that (xn )n∈N is Cauchy. Finally, by Theorem 1.1, |∥xn ∥ − ∥x∥| ≤ ∥xn − x∥ → 0, so ∥xn ∥ → ∥x∥ in R. ■
Definition 1.5. A normed linear space is called a Banach space if the metric induced by its norm is complete,
namely every Cauchy sequence in such space is convergent.
Theorem 1.10. Let V be a Banach space. Then a subspace W of V is a Banach space if and only if it is closed.
Proof. Let W be a subspace of V . Suppose that W is closed in V . Let (xn )n∈N be a Cauchy sequence in W . Since V is a Banach space,
there exists x ∈ V such that xn → x as n → ∞. Now because W is closed in V , we certainly have x ∈ W as well. In other words, the Cauchy
sequence (xn )n∈N in W is convergent with limit x ∈ W , hence W is a Banach space.
Conversely, suppose that W is a Banach space. Let (xn )n∈N be a sequence in W such that xn → x ∈ V as n → ∞. By Theorem 1.9, the
sequence (xn )n∈N is Cauchy. Since W is Banach, there exists x′ ∈ W ⊆ V such that xn → x′ . However, because every convergent sequence
has a unique limit in any metric space, we must have x = x′ ∈ W . Therefore, the subspace W is closed in V . ■
Definition 1.6. Let V be a normed linear space and (xn )n∈N be a sequence of elements in V .
1. The sequence of partial sums of the series ∑∞n=1 xn is the sequence (sn )n∈N defined by sn := x1 + · · · + xn .
2. The series ∑∞n=1 xn is (conditionally) convergent if the sequence (sn )n∈N of partial sums is convergent in
V , in which case its sum is defined as the limit of (sn )n∈N .
3. The series ∑∞n=1 xn is absolutely convergent if the series ∑∞n=1 ∥xn ∥ of non-negative real numbers is convergent.
Theorem 1.11 (Necessary Condition for Convergent Series). Let V be a normed linear space and (xn )n∈N be a
sequence in V . If the series ∑∞n=1 xn is convergent, then xn → 0V as n → ∞.
Proof. Let (sn )n∈N be the sequence of partial sums. By putting s0 := 0V , it is clear that xn = sn − sn−1 for all n ∈ N. If the series ∑∞n=1 xn is convergent, then
limn→∞ xn = limn→∞ (sn − sn−1 ) = limn→∞ sn − limn→∞ sn−1 = 0V . ■
Theorem 1.12. A normed linear space is a Banach space if and only if every absolutely convergent series in it
is also convergent.
Proof. Let V be a normed linear space. First, suppose that V is a Banach space. Consider an absolutely convergent series ∑∞n=1 xn . For
convenience, denote by (sn )n∈N and (Sn )n∈N the sequences of partial sums of ∑∞n=1 xn and ∑∞n=1 ∥xn ∥, respectively.
By assumption, the sequence (Sn )n∈N of non-negative real numbers is convergent, hence is Cauchy as well. That is, for every ε > 0,
there exists N ∈ N such that |Sm − Sn | < ε for all m, n ≥ N. As a result, given any m, n ∈ N with m ≥ n ≥ N, we see that
∥sm − sn ∥ = ∥xn+1 + · · · + xm ∥ ≤ ∥xn+1 ∥ + · · · + ∥xm ∥ = Sm − Sn < ε.
Consequently, the sequence (sn )n∈N is also Cauchy in V , hence is also convergent by noting that V is Banach. Therefore, the series ∑∞n=1 xn
itself is convergent, as desired.
Conversely, suppose that every absolutely convergent series in V is also convergent. Let (xn )n∈N be a Cauchy sequence in V . First,
there exists N1 ∈ N such that ∥xm − xn ∥ < 1/2 for all m, n ≥ N1 . Whenever we have Nk for some k, we can further find Nk+1 ≥ Nk such that
∥xm − xn ∥ < 1/2k+1 , ∀m, n ≥ Nk+1 .
In this case, consider the series ∑∞k=1 (xNk+1 − xNk ). By the construction above, we have ∥xNk+1 − xNk ∥ < 2−k for all k ∈ N, so by the direct comparison
test, it is clear that the previous series is absolutely convergent. Now by our hypothesis, such series is convergent as well. Observe that its
k-th partial sum is given by
(xN2 − xN1 ) + · · · + (xNk+1 − xNk ) = xNk+1 − xN1 ,
so there exists x ∈ V such that xNk → x as k → ∞. Finally, let ε > 0 be arbitrary.
• On the one hand, let r := ⌈1 − log2 ε⌉. Then for m, n ≥ Nr , we have
∥xm − xn ∥ < 1/2r ≤ 1/21−log2 ε = ε/2.
• On the other hand, since xNk → x, there exists r′ ∈ N such that ∥xNk − x∥ < ε/2 for all k ≥ r′ .
Then for n ≥ Nk where k := r ∨ r′ ,
∥xn − x∥ ≤ ∥xn − xNk ∥ + ∥xNk − x∥ < ε/2 + ε/2 = ε.
This shows that xn → x as n → ∞, namely the Cauchy sequence (xn )n∈N is convergent. In conclusion, the normed linear space V is a Banach
space. ■
1.2 Operations on Semi-Norms
Theorem 1.13 (Semi-Norms and Linear Maps). Let V,W be linear spaces and T : V → W be a linear map. Then for every semi-norm ∥ · ∥ on W , the map
∥x∥T : V → R : x 7→ ∥T x∥
is a semi-norm on V , which is a norm whenever ∥ · ∥ is a norm and T is injective.
Proof. We verify the two axioms directly:
• For every x ∈ V and c ∈ F,
∥cx∥T = ∥T (cx)∥ = ∥cT x∥ = |c|∥T x∥ = |c|∥x∥T .
• For every x, y ∈ V ,
∥x + y∥T = ∥T (x + y)∥ = ∥T x + Ty∥ ≤ ∥T x∥ + ∥Ty∥ = ∥x∥T + ∥y∥T .
Hence ∥ · ∥T is a semi-norm on V . Furthermore, for every x ∈ V ,
∥x∥T = 0 ⇐⇒ ∥T x∥ = 0 ⇐= T x = 0W ⇐= x = 0V .
Clearly, the converse of the first ⇐= is true if ∥ · ∥ is a norm, and the converse of the second ⇐= holds whenever T is injective. Therefore, the semi-norm ∥ · ∥T
is a norm if ∥ · ∥ is and the linear map T is injective. In this case, for each x ∈ V , we have ∥T x∥ = ∥x∥T , hence T is an isometry under such
two norms. ■
Corollary 1.14 (Semi-Norms and Isomorphism). Let V,W be linear spaces and T : V → W be an isomorphism.
Then for every semi-norm ∥ · ∥ on V , the map
∥y∥′ := ∥T −1 y∥, ∀y ∈ W,
is a semi-norm on W , which is a norm if and only if ∥ · ∥ is.
Proof. Since T −1 : W → V is linear, the preceding theorem shows that ∥ · ∥′ = ∥ · ∥T −1 is a semi-norm on W ; since T −1 is also injective, it is a norm whenever ∥ · ∥ is. Conversely, note that ∥x∥ = ∥T x∥′ for every x ∈ V , namely ∥ · ∥ = (∥ · ∥′ )T .
Since T is also injective, it follows from the preceding theorem that if ∥ · ∥′ is a norm on W , then ∥ · ∥ is itself also a norm on V .
Therefore, we may conclude that ∥ · ∥ is a norm on V if and only if ∥ · ∥′ is. ■
Remark 1.4. The above theorem implies that isomorphic linear spaces have “identical” families of semi-norms
defined on them.
Theorem 1.15 (“Kernel” of Semi-Norms). Let V be a semi-normed linear space. Then the subset Z := {x ∈ V |
∥x∥ = 0} is a subspace of V .
1. For every subspace V ′ of V , the semi-norm ∥ · ∥ restricts to a norm on V ′ if and only if V ′ ∩ Z = {0V }. In
particular, the semi-norm ∥ · ∥ is a norm on V if and only if Z = {0V }.
2. The quotient space V /Z is a normed linear space under the norm ∥x + Z∥′ := ∥x∥.
• For every x ∈ V ,
∥x∥′ = 0 ⇐⇒ ∥x∥ = 0 ⇐⇒ x ∈ Z ⇐⇒ x = 0.
• For every x, y ∈ V ,
∥x + y∥′ = ∥x + y∥ ≤ ∥x∥ + ∥y∥ = ∥x∥′ + ∥y∥′ .
Remark 1.5. Recall that given any pseudo-metric ρ on a set X, there is a unique metric ρ̃ defined on the quotient
set X/∼, where ∼ is an equivalence relation on X defined by
∀x, y ∈ X : x ∼ y ⇐⇒ ρ(x, y) = 0.
In terms of the pseudo-metric d induced by the semi-norm ∥ · ∥, we see that for every x, y ∈ V ,
x ∼ y ⇐⇒ d(x, y) = 0 ⇐⇒ ∥x − y∥ = 0 ⇐⇒ x − y ∈ Z.
Therefore, the equivalence relation ∼ is precisely the congruence relation for the quotient space structure of
V /Z. Furthermore, suppose that d˜ is the metric on V /Z induced by d. Then given any x, y ∈ V ,
d˜(x + Z, y + Z) = d(x, y) = ∥x − y∥ = ∥(x + Z) − (y + Z)∥′ ,
that is, d˜ is also the metric induced by the norm ∥ · ∥′ on V /Z. In conclusion, starting with a semi-normed linear
space, there is no difference if we
• first convert it into a normed linear space by taking quotients and then associate a metric to the new norm,
or
• first construct the associated pseudo-metric and convert that into a metric by equivalence relation.
Both will result in the same metric space.
Theorem 1.16 (Quotient Space). Let V be a normed linear space. For every subspace W of V , the quotient
space V /W is a semi-normed linear space under the following semi-norm: For every x ∈ V ,
∥x +W ∥′ := d(x,W ) = infz∈W ∥x − z∥.
Since z ∈ W was selected arbitrarily above, we have d(x,W ) ≥ d(y,W ), and likewise d(y,W ) ≥ d(x,W ), namely d(x,W ) = d(y,W ). This shows that
∥ · ∥′ is well-defined. Furthermore, because W is a subspace, the map z 7→ c−1 z is a bijection on it for each nonzero c ∈ F. It is thus clear that ∥0x∥′ = ∥0V ∥′ = 0 = |0|∥x∥′ ;
otherwise,
∥cx∥′ = infz∈W ∥cx − z∥ = infz∈W |c|∥x − c−1 z∥ = |c| infz∈W ∥x − c−1 z∥ = |c|∥x∥′ .
Next, for every x, y ∈ V and z, z′ ∈ W ,
∥x + y∥′ ≤ ∥(x + y) − (z + z′ )∥ ≤ ∥x − z∥ + ∥y − z′ ∥.
Since z, z′ ∈ W above are arbitrary, by taking infima over all z, z′ , it is clear that ∥x∥′ + ∥y∥′ ≥ ∥x + y∥′ .
Clearly, we have ∥x∥′ ≤ ∥x − 0V ∥ = ∥x∥ for each x ∈ V , as 0V ∈ W .
Lemma 1.17. Let V be a linear space with semi-norms ∥ · ∥1 , . . . , ∥ · ∥m . Suppose that ∥ · ∥ is a norm on Rm such
that ∥y∥ ≤ ∥y + z∥ for every y, z ∈ Rm with non-negative entries. Then the following map
p : V → R : x 7→ ∥(∥x∥1 , . . . , ∥x∥m )∥
is a semi-norm on V , which is a norm whenever at least one of ∥ · ∥1 , . . . , ∥ · ∥m is a norm.
Proof. • For every x ∈ V and c ∈ F, since ∥cx∥i = |c|∥x∥i for each i = 1, . . . , m,
p(cx) = ∥(|c|∥x∥1 , . . . , |c|∥x∥m )∥ = |c|∥(∥x∥1 , . . . , ∥x∥m )∥ = |c|p(x).
• For every x, y ∈ V , since ∥x + y∥i ≤ ∥x∥i + ∥y∥i for each i, the monotonicity assumption on ∥ · ∥ gives
p(x + y) = ∥(∥x + y∥1 , . . . , ∥x + y∥m )∥ ≤ ∥(∥x∥1 + ∥y∥1 , . . . , ∥x∥m + ∥y∥m )∥ ≤ p(x) + p(y).
This shows that p is a semi-norm on V . Furthermore, for every x ∈ V , since ∥ · ∥ is a norm on Rm , we see that
p(x) = 0 ⇐⇒ (∥x∥1 , . . . , ∥x∥m ) = 0 ⇐⇒ ∥x∥i = 0 for all i = 1, . . . , m.
Consequently, if at least one of ∥ · ∥1 , . . . , ∥ · ∥m is a norm, the last condition is equivalent to x = 0V whence p is also a norm on V . ■
Theorem 1.18 (Operations on Semi-Norms). Let V be a linear space with semi-norms ∥ · ∥, ∥ · ∥′ . Then given
any non-negative a ∈ R, the maps a∥ · ∥, ∥ · ∥ + ∥ · ∥′ , and ∥ · ∥ ∨ ∥ · ∥′ are also semi-norms on V , which are also
norms when ∥ · ∥, ∥ · ∥′ are norms and a > 0.
Proof. Let a ∈ R be a non-negative constant. Then for every x, y ∈ V and c ∈ F,
a∥cx∥ = |c|(a∥x∥), a∥x + y∥ ≤ a∥x∥ + a∥y∥,
∥cx∥ + ∥cx∥′ = |c|(∥x∥ + ∥x∥′ ), ∥x + y∥ + ∥x + y∥′ ≤ (∥x∥ + ∥x∥′ ) + (∥y∥ + ∥y∥′ ),
and also
∥cx∥ ∨ ∥cx∥′ = |c|(∥x∥ ∨ ∥x∥′ ), ∥x + y∥ ∨ ∥x + y∥′ ≤ (∥x∥ + ∥y∥) ∨ (∥x∥′ + ∥y∥′ ) ≤ (∥x∥ ∨ ∥x∥′ ) + (∥y∥ ∨ ∥y∥′ ).
This then shows that a∥ · ∥, ∥ · ∥ + ∥ · ∥′ , and ∥ · ∥ ∨ ∥ · ∥′ are also semi-norms on V . When ∥ · ∥, ∥ · ∥′ are both norms and a > 0, for each x ∈ V ,
a∥x∥ = 0 ⇐⇒ ∥x∥ = 0 ⇐⇒ x = 0V ,
∥x∥ + ∥x∥′ = 0 ⇐⇒ ∥x∥ = ∥x∥′ = 0 ⇐⇒ x = 0V ,
and
∥x∥ ∨ ∥x∥′ = 0 ⇐⇒ ∥x∥ = ∥x∥′ = 0 ⇐⇒ x = 0V .
Alternatively, the map ∥ · ∥ ∨ ∥ · ∥′ may be obtained from Lemma 1.17 applied with the max norm ∥ · ∥∞ on R2 , which satisfies the required monotonicity: for y, z ∈ R2 with non-negative entries,
∥y + z∥∞ = (y1 + z1 ) ∨ (y2 + z2 ) ≥ y1 ∨ y2 = ∥y∥∞ . ■
Definition 1.7. Let V be a linear space with semi-norms ∥ · ∥ and ∥ · ∥′ . We say that ∥ · ∥′ is stronger than ∥ · ∥ if there exists a positive constant M ∈ R such that ∥x∥ ≤ M∥x∥′ for all x ∈ V .
Lemma 1.19. Let V be a linear space. Then being stronger is a preorder on the set of all semi-norms on V .
Proof. The reflexivity is trivial with M = 1. Now let ∥ · ∥, ∥ · ∥′ , and ∥ · ∥′′ be semi-norms on V such that ∥ · ∥′ is stronger than ∥ · ∥ and ∥ · ∥′′
is stronger than ∥ · ∥′ . Then there are positive constants M, M ′ ∈ R such that
∥x∥ ≤ M∥x∥′ and ∥x∥′ ≤ M ′ ∥x∥′′
for all x ∈ V . In this case, we see that ∥x∥ ≤ MM ′ ∥x∥′′ and MM ′ > 0, so ∥ · ∥′′ is also stronger than ∥ · ∥. ■
1.3 Equivalence of Semi-Norms
Theorem 1.20. Let V be a linear space with semi-norms ∥ · ∥ and ∥ · ∥′ . The following statements are equivalent:
1. The semi-norm ∥ · ∥′ is stronger than ∥ · ∥.
2. The pseudo-metric topology on V induced by ∥ · ∥′ is finer than the one induced by ∥ · ∥.
3. For every sequence (xn )n∈N in V , if ∥xn ∥′ → 0, then ∥xn ∥ → 0.
(1 =⇒ 2). Suppose that there exists M > 0 such that ∥x∥ ≤ M∥x∥′ for all x ∈ V . Let x ∈ V and r > 0 be arbitrary, and let B and B′
be the open balls centered at x under the respective pseudo-metric topologies with respective radii r and r/M. Then for each y ∈ B′ , we see that
∥y − x∥ ≤ M∥y − x∥′ < M · (r/M) = r,
hence y ∈ B as well. In other words, every open ball under ∥ · ∥ contains an open ball under ∥ · ∥′ , hence the topology induced by ∥ · ∥′ is
finer than the one induced by ∥ · ∥.
(2 =⇒ 3). Suppose that the pseudo-metric topology on V induced by ∥ · ∥′ is finer than the one induced by ∥ · ∥. Let (xn )n∈N be a
sequence in V such that ∥xn ∥′ → 0 in R. Then we have xn → 0V under ∥ · ∥′ , hence xn → 0V under ∥ · ∥ as well. As a result, we may conclude
that ∥xn ∥ = ∥xn − 0V ∥ → 0 as well.
(3 =⇒ 1). Suppose that ∥xn ∥′ → 0 implies that ∥xn ∥ → 0 for every sequence (xn )n∈N in V . Consider the unit ball B := {x ∈ V | ∥x∥′ ≤ 1}
centered at 0V . Assume, to the contrary, that B is unbounded in ∥ · ∥. Then for every n ∈ N, there exists xn ∈ B such that ∥xn ∥ > n, so we
may put
yn := xn / (n(∥xn ∥′ ∨ 1)).
As we can see,
∥yn ∥′ = ∥xn ∥′ / (n(∥xn ∥′ ∨ 1)) ≤ 1/n
and
∥yn ∥ = ∥xn ∥ / (n(∥xn ∥′ ∨ 1)) > n / (n(∥xn ∥′ ∨ 1)) = 1 / (∥xn ∥′ ∨ 1) ≥ 1.
In this case, we see that ∥yn ∥′ → 0 as n → ∞ but ∥yn ∥ ̸→ 0, a contradiction. Therefore, the ball B is bounded in ∥ · ∥, hence there exists a
positive M ∈ R such that B ⊆ {x ∈ V | ∥x∥ ≤ M}.
Now let x ∈ V be arbitrary. If ∥x∥′ = 0, by considering the constant sequence with common term x, it follows from the assumption that
∥x∥ = 0 as well, in which case ∥x∥ ≤ M∥x∥′ holds trivially. Thus, we shall assume that ∥x∥′ > 0 then. In this case, for y := (1/∥x∥′ )x, we have
∥y∥′ = (1/∥x∥′ )∥x∥′ = 1,
hence y ∈ B and thus ∥y∥ ≤ M. Consequently, ∥x∥ = ∥x∥′ ∥y∥ ≤ M∥x∥′ , so ∥ · ∥′ is stronger than ∥ · ∥. ■
Definition 1.8. Let V be a linear space. Then two semi-norms ∥ · ∥ and ∥ · ∥′ are equivalent if ∥ · ∥′ is stronger
than ∥ · ∥ and vice-versa.
Corollary 1.21. Let V be a linear space with semi-norms ∥ · ∥ and ∥ · ∥′ . Then the following statements are
equivalent:
1. The semi-norms ∥ · ∥ and ∥ · ∥′ are equivalent.
2. The two semi-norms induce the same pseudo-metric topology on V .
3. There exist positive constants m, M ∈ R such that m∥x∥′ ≤ ∥x∥ ≤ M∥x∥′ for all x ∈ V .
Furthermore, being equivalent is an equivalence relation on the set of all semi-norms on V .
• Conversely, suppose that m∥x∥′ ≤ ∥x∥ ≤ M∥x∥′ for some positive constants m, M. Then we have ∥x∥ ≤ M∥x∥′ and ∥x∥′ ≤ (1/m)∥x∥,
hence ∥ · ∥′ is stronger than ∥ · ∥ and vice-versa. In conclusion, the semi-norms ∥ · ∥ and ∥ · ∥′ are equivalent.
Finally, the reflexivity and symmetry of such relation are also trivial for us. Meanwhile, the transitivity follows from Lemma 1.19. ■
Remark 1.6. Consequently, given two equivalent norms ∥ · ∥ and ∥ · ∥′ on V , the linear space V is a Banach
space under ∥ · ∥ if and only if it is a Banach space under ∥ · ∥′ .
1.4 Finite-Dimensional Normed Linear Spaces
Theorem 1.22 (Young’s Inequality). Let p, q ∈ (1, ∞) be conjugate indices, namely 1/p + 1/q = 1. Then for every a, b ≥ 0,
a1/p b1/q ≤ a/p + b/q, (11)
whose equality is attained if and only if a = b.
Remark 1.7. If we replace a1/p and b1/q with a and b, respectively, we shall get another equivalent form of
Young’s inequality:
a p bq
ab ≤ + (12)
p q
whose equality is attained if and only if a p = bq .
Proof–2. Consider the power function ϕ : x 7→ x p−1 on [0, ∞), whose inverse is given by y 7→ y1/(p−1) = yq−1 . Clearly, we have
∫0a x p−1 dx + ∫0b yq−1 dy = a p /p + bq /q.
On the one hand, if b = a p−1 , or equivalently, bq = a p , the sum of the two integrals above is precisely equal to ab. On the other hand, if b > ϕ(a), the two regions measured by the integrals cover the rectangle [0, a] × [0, b] with a strict excess, so their sum is strictly greater than ab.
[Figure: the curve y = x p−1 over [0, a], splitting the rectangle [0, a] × [0, b]; the two integrals above are the areas below and to the left of the curve.]
Similarly, if a > ϕ −1 (b), we shall have analogous strict inequality. The proof is thus complete. ■
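As a quick numerical illustration of the equivalent form (12), here is a small sketch of our own (not part of the note); the helper name `young_gap` is hypothetical.

```python
def young_gap(a, b, p):
    # a^p/p + b^q/q - a*b, which Young's inequality (12) asserts is >= 0,
    # with equality exactly when a^p = b^q
    q = p / (p - 1.0)  # conjugate index, so that 1/p + 1/q = 1
    return a ** p / p + b ** q / q - a * b

for p in (1.5, 2.0, 3.0):
    for a in (0.0, 0.5, 1.0, 2.0):
        for b in (0.0, 0.5, 1.0, 2.0):
            assert young_gap(a, b, p) >= -1e-12

# equality case b = a^(p-1), i.e. b^q = a^p: with a = 2, p = 3 we get b = 4
assert abs(young_gap(2.0, 4.0, 3.0)) < 1e-12
```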
Theorem 1.23 (Hölder’s Inequality). Let ai , bi ≥ 0 for i = 1, . . . , n, where n is a positive integer. If p, q ∈ (1, ∞)
are conjugate indices, we have
∑ni=1 ai bi ≤ (∑ni=1 aip )1/p (∑ni=1 bqi )1/q , (13)
whose equality is attained if and only if the vectors (a1p , . . . , anp ) and (bq1 , . . . , bqn ) are collinear.
Proof. Denote
A = (∑ni=1 aip )1/p and B = (∑ni=1 bqi )1/q .
If A = 0 or B = 0, both sides of (13) vanish and there is nothing to prove, so we may assume that A, B > 0. By Young’s inequality, for each i = 1, . . . , n, we have
(ai /A) · (bi /B) ≤ (1/p)(ai /A) p + (1/q)(bi /B)q .
Summing over i = 1, . . . , n, it gives
(1/AB) ∑ni=1 ai bi ≤ (1/pA p ) ∑ni=1 aip + (1/qBq ) ∑ni=1 bqi = 1/p + 1/q = 1.
Therefore,
∑ni=1 ai bi ≤ AB = (∑ni=1 aip )1/p (∑ni=1 bqi )1/q ,
where the equality holds if and only if aip /A p = bqi /Bq for all i = 1, . . . , n. ■
Proof–2. Note that for p > 1, the function f (x) = x p is convex on [0, ∞). Adopting the notations above, we may assume that every bi > 0 (terms with bi = 0 contribute nothing to the left-hand side), and we let αi := (bi /B)q and xi := ai bi1−q Bq , so that ∑ni=1 αi = 1.
Then we have
∑ni=1 αi xi = ∑ni=1 ai bi and ∑ni=1 αi xip = ∑ni=1 (bqi /Bq ) · aip bip(1−q) B pq = B p ∑ni=1 aip = (AB) p .
By Jensen’s inequality, f (∑ni=1 αi xi ) ≤ ∑ni=1 αi f (xi ), namely
(∑ni=1 ai bi ) p ≤ (AB) p .
Hölder’s inequality thus follows. ■
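Both proofs can be sanity-checked numerically. The following sketch (ours, not part of the note) verifies (13) and its equality case for a few concrete vectors; the helper names `lp` and `holder_gap` are hypothetical.

```python
def lp(v, p):
    # l_p length of a finite sequence
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def holder_gap(a, b, p):
    # right-hand side minus left-hand side of (13); should be >= 0
    q = p / (p - 1.0)
    return lp(a, p) * lp(b, q) - sum(x * y for x, y in zip(a, b))

a = [1.0, 2.0, 3.0]
b = [0.5, 0.25, 4.0]
assert holder_gap(a, b, 2.0) >= 0.0
assert holder_gap(a, b, 3.0) >= 0.0

# equality when (a_i^p) and (b_i^q) are collinear: take b_i = a_i^(p-1),
# so that b_i^q = a_i^p
p = 3.0
b_eq = [x ** (p - 1.0) for x in a]
assert abs(holder_gap(a, b_eq, p)) < 1e-9
```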
Theorem 1.24 (Minkowski’s Inequality). Let n ∈ N and ai , bi ≥ 0 for i = 1, . . . , n. Then for every p ≥ 1,
[∑ni=1 (ai + bi ) p ]1/p ≤ (∑ni=1 aip )1/p + (∑ni=1 bip )1/p , (14)
whose equality for p > 1 is attained if and only if the vectors (a1 , . . . , an ) and (b1 , . . . , bn ) are collinear.
Proof. The inequality is trivial if p = 1. Assume p > 1 and let q > 1 satisfy 1/p + 1/q = 1. By Hölder’s inequality, we have
∑ni=1 (ai + bi ) p = ∑ni=1 (ai + bi ) p−1 (ai + bi ) = ∑ni=1 (ai + bi ) p−1 ai + ∑ni=1 (ai + bi ) p−1 bi
≤ [∑ni=1 (ai + bi )(p−1)q ]1/q (∑ni=1 aip )1/p + [∑ni=1 (ai + bi )(p−1)q ]1/q (∑ni=1 bip )1/p
= [∑ni=1 (ai + bi ) p ]1/q [(∑ni=1 aip )1/p + (∑ni=1 bip )1/p ].
Therefore,
[∑ni=1 (ai + bi ) p ]1/p = [∑ni=1 (ai + bi ) p ]1−1/q ≤ (∑ni=1 aip )1/p + (∑ni=1 bip )1/p .
Finally, from the conditions for Hölder’s inequality, the equality is possible in Minkowski’s inequality if and only if the vectors (a1 , . . . , an )
and (b1 , . . . , bn ) are collinear. ■
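A corresponding numerical check for (14), again our own sketch and not part of the note:

```python
def lp(v, p):
    # l_p length of a finite sequence
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

a = [1.0, 2.0, 3.0]
b = [4.0, 0.5, 2.0]
for p in (1.0, 1.5, 2.0, 4.0):
    s = [x + y for x, y in zip(a, b)]
    # (14): the l_p length of a + b is at most the sum of the l_p lengths
    assert lp(s, p) <= lp(a, p) + lp(b, p) + 1e-12

# equality (for p > 1) when (a_i) and (b_i) are collinear
b_col = [2.0 * x for x in a]
s = [x + y for x, y in zip(a, b_col)]
assert abs(lp(s, 2.0) - (lp(a, 2.0) + lp(b_col, 2.0))) < 1e-9
```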
Example 1.1 (l p -Norm on Fn ). Let p ≥ 1 be a positive constant. The l p -norm on Fn is defined as follows: For
every x = [x1 , . . . , xn ]T ∈ Fn ,
∥x∥ p := (|x1 | p + · · · + |xn | p )1/p . (15)
Furthermore, the l∞ -norm, or max norm, on Fn is defined as follows: For every x = [x1 , . . . , xn ]T ∈ Fn ,
∥x∥∞ := |x1 | ∨ · · · ∨ |xn |. (16)
Clearly,
lim ∥x∥ p = lim (|x1 | p + · · · + |xn | p )1/p = |x1 | ∨ · · · ∨ |xn | = ∥x∥∞ .
p→∞ p→∞
• The l2 -norm, also called the Euclidean norm, is the most frequently-appeared norm on Fn : For every
x = [x1 , . . . , xn ]T ∈ Fn ,
∥x∥2 = (|x1 |2 + · · · + |xn |2 )1/2 . (17)
• The l1 -norm is called the sum norm, or the Manhattan norm, or taxicab norm, in which for every x =
[x1 , . . . , xn ]T ∈ Fn ,
∥x∥1 = |x1 | + · · · + |xn |. (18)
• In addition, every l p -norm is permutationally invariant, namely for every x = [x1 , . . . , xn ]T and σ ∈ Sn , we
have
∥xσ ∥ p = ∥x∥ p , where xσ := [xσ (1) , . . . , xσ (n) ]T (19)
Finally, by Corollary 1.14, we can define analogues of l p -norms on every finite-dimensional linear space: If V is
finite-dimensional with base {z1 , . . . , zn }, then for every x ∈ V ,
∥x∥ p := (|z∨1 (x)| p + · · · + |z∨n (x)| p )1/p . (20)
Here z∨1 , . . . , z∨n : V → F are the coordinate forms, and we applied Corollary 1.14 via the isomorphism Fn → V :
ei 7→ zi . One needs to beware that such isomorphism is not canonical, namely its definition relies on the base
selected for V .
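The l p -norms of Example 1.1 and the limit relation lim p→∞ ∥x∥ p = ∥x∥∞ can be illustrated with a small sketch of our own (not part of the note):

```python
def lp(v, p):
    # the l_p norm (15) on R^3
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

x = [3.0, -4.0, 1.0]
assert lp(x, 1) == 8.0                      # sum norm
assert abs(lp(x, 2) - 26.0 ** 0.5) < 1e-12  # Euclidean norm
x_inf = max(abs(t) for t in x)              # max norm, here 4.0

# the l_p norms decrease in p and approach the max norm for large p
for p, q in ((1, 2), (2, 4), (4, 16)):
    assert lp(x, q) <= lp(x, p) + 1e-12
assert abs(lp(x, 100) - x_inf) < 1e-3

# permutation invariance (19)
assert abs(lp([-4.0, 1.0, 3.0], 3) - lp(x, 3)) < 1e-12
```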
Corollary 1.25. Let S ∈ Mm,n (F) have full column rank. Then for every norm ∥ · ∥ on Fm , the following map
∥ · ∥S : Fn → R : x 7→ ∥Sx∥ (21)
is also a norm on Fn .
Proof. By identifying Mm,n (F) with L (Fn , Fm ), we can see that S is an injective linear map. Then by Theorem 1.13, for every norm ∥ · ∥
on Fm , the corresponding map ∥ · ∥S is also a norm on Fn . ■
Theorem 1.26 (Comparison of l p -Norms). Let 1 ≤ p ≤ q ≤ ∞. Then for every x ∈ Fn ,
∥x∥q ≤ ∥x∥ p ≤ n1/p−1/q ∥x∥q , (22)
where 1/∞ is interpreted as 0.
Proof. First, suppose that q = ∞, and pick an index k with |xk | = ∥x∥∞ . Then
∥x∥∞ = |xk | = (|xk | p )1/p ≤ (|x1 | p + · · · + |xn | p )1/p = ∥x∥ p
and
∥x∥ p = (|x1 | p + · · · + |xn | p )1/p ≤ (n|xk | p )1/p = n1/p |xk | = n1/p ∥x∥∞ .
Therefore, we have ∥x∥∞ ≤ ∥x∥ p ≤ n1/p ∥x∥∞ .
Next, suppose that q < ∞. There is nothing to prove if x = 0, so we shall assume that x ̸= 0. By applying Hölder’s inequality with
r := q/p ∈ [1, ∞) and its conjugate r′ ,
∥x∥ pp = ∑ni=1 |xi | p · 1 ≤ (∑ni=1 |xi | pr )1/r (∑ni=1 1r′ )1/r′ = (∑ni=1 |xi |q ) p/q n1/r′ = n1−p/q ∥x∥qp .
Consequently, we also have ∥x∥ p ≤ n1/p−1/q ∥x∥q . As for the other inequality, we may put ∥x∥−1q x =: y = [y1 , . . . , yn ]T . Then it is clear that
∥y∥q = 1. For each i = 1, . . . , n,
|yi | = ∥x∥−1q |xi | ≤ ∥x∥−1∞ |xi | ≤ 1,
hence |yi | p ≥ |yi |q as p ≤ q. Consequently,
∥y∥ pp = ∑ni=1 |yi | p ≥ ∑ni=1 |yi |q = ∥y∥qq = 1,
so ∥y∥ p ≥ 1 = ∥y∥q , and multiplying both sides by ∥x∥q gives ∥x∥ p ≥ ∥x∥q . ■
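The comparison (22) can likewise be tested numerically; the sketch below is our own and not part of the note.

```python
import random

def lp(v, p):
    # l_p norm of a vector in R^n
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

random.seed(1)
n = 5
for _ in range(200):
    x = [random.uniform(-10.0, 10.0) for _ in range(n)]
    for p, q in ((1.0, 2.0), (2.0, 3.0), (1.5, 4.0)):
        c = n ** (1.0 / p - 1.0 / q)
        # (22): ||x||_q <= ||x||_p <= n^(1/p - 1/q) ||x||_q
        assert lp(x, q) <= lp(x, p) + 1e-9
        assert lp(x, p) <= c * lp(x, q) + 1e-9

# the upper bound is attained at constant vectors, e.g. x = (1, ..., 1)
ones = [1.0] * n
assert abs(lp(ones, 1.0) - n ** 0.5 * lp(ones, 2.0)) < 1e-9
```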
Lemma 1.27. Let V be a normed linear space and x1 , . . . , xn ∈ V be linearly independent elements. Then there
exists a positive constant λ ∈ R such that for every c1 , . . . , cn ∈ F,
∥c1 x1 + · · · + cn xn ∥ ≥ λ (|c1 | + · · · + |cn |). (23)
Proof. Let c1 , . . . , cn ∈ F. There is nothing to prove if c1 = · · · = cn = 0, in which case |c1 | + · · · + |cn | = 0, thus we shall assume that at
least one of ci ’s is nonzero. Then for each i = 1, . . . , n, we may put
di := ci / (|c1 | + · · · + |cn |) ∈ F.
Clearly, at least one of di ’s is also nonzero while
∑ni=1 |di | = ∑ni=1 |ci | / (|c1 | + · · · + |cn |) = 1.
Consider the function
f : Fn → R : (d1 , . . . , dn ) 7→ ∥d1 x1 + · · · + dn xn ∥.
Note that V is a topological linear space and the norm ∥ · ∥ on V is also continuous (cf. Corollary 1.7), hence f is also continuous.
Furthermore, consider the set
S := {(d1 , . . . , dn ) ∈ Fn | |d1 | + · · · + |dn | = 1}.
We claim that S is closed and bounded with respect to the Euclidean norm ∥ · ∥2 , hence is also compact under ∥ · ∥2 by the Heine–Borel
theorem:
• Observe that S is the unit sphere of Fn under its l1 -norm, hence is closed with respect to ∥ · ∥1 . Since ∥ · ∥1 and ∥ · ∥2 are equivalent,
the set S is also closed with respect to ∥ · ∥2 .
• Furthermore, for every (d1 , . . . , dn ) ∈ S, it is clear that ∥(d1 , . . . , dn )∥2 ≤ ∥(d1 , . . . , dn )∥1 = 1, so S is also bounded with respect to ∥ · ∥2 .
By the extreme value theorem, the continuous function f attains a minimum λ on the compact set S. Since x1 , . . . , xn are linearly independent and every point of S has at least one nonzero entry, we have λ > 0. Finally, since (d1 , . . . , dn ) ∈ S for the di ’s above,
∥c1 x1 + · · · + cn xn ∥ = (|c1 | + · · · + |cn |) f (d1 , . . . , dn ) ≥ λ (|c1 | + · · · + |cn |). ■
Theorem 1.28. Let V be a finite-dimensional normed linear space with base {e1 , . . . , em }, let e∨1 , . . . , e∨m : V → F
be the coordinate forms, and let (xn )n∈N be a sequence in V .
1. The sequence (xn )n∈N is Cauchy in V if and only if the sequence (e∨i (xn ))n∈N of i-th coordinates is also a
Cauchy sequence in F for all i = 1, . . . , m.
2. For every x ∈ V , we have xn → x in V if and only if e∨i (xn ) → e∨i (x) in F for all i = 1, . . . , m.
In particular, every finite-dimensional normed linear space is a Banach space.
Proof. First, since e1 , . . . , em are linearly independent, by Lemma 1.27, there exists λ > 0 such that for all c1 , . . . , cm ∈ F,
∥c1 e1 + · · · + cm em ∥ ≥ λ (|c1 | + · · · + |cm |).
• Suppose that the sequence (xn )n∈N is Cauchy in V . Then for every ε > 0, there exists N ∈ N such that ∥xk − xn ∥ < λ ε for all k, n ≥ N.
In this case,
λ ε > ∥xk − xn ∥ = ∥∑mi=1 (e∨i (xk ) − e∨i (xn ))ei ∥ ≥ λ ∑mi=1 |e∨i (xk ) − e∨i (xn )|.
Then for i = 1, . . . , m and k, n ≥ N,
|e∨i (xk ) − e∨i (xn )| ≤ ∑mj=1 |e∨j (xk ) − e∨j (xn )| < λ ε/λ = ε,
so each coordinate sequence (e∨i (xn ))n∈N is Cauchy in F.
• Conversely, suppose that (e∨i (xn ))n∈N is Cauchy in F for every i = 1, . . . , m. Then given any ε > 0, for each i there exists Ni ∈ N such that
|e∨i (xk ) − e∨i (xn )| < ε / (m(∥ei ∥ ∨ 1)), ∀k, n ≥ Ni .
In this case, when n, k ≥ N1 ∨ · · · ∨ Nm ,
∥xk − xn ∥ = ∥∑mi=1 (e∨i (xk ) − e∨i (xn ))ei ∥ ≤ ∑mi=1 |e∨i (xk ) − e∨i (xn )|∥ei ∥ < ∑mi=1 ε/m = ε,
so (xn )n∈N is Cauchy in V . This proves the first claim, and the second claim follows from the same estimates with xk replaced by x throughout.
Finally, let (xn )n∈N be Cauchy in V . By the first claim, each (e∨i (xn ))n∈N is Cauchy in F, hence convergent to some ci ∈ F by the completeness of F. By the second claim, xn → c1 e1 + · · · + cm em in V . Therefore, every finite-dimensional normed linear space is a Banach space. ■
Remark 1.8. In particular, a sequence (xk = [xk,1 , . . . , xk,n ]T )k∈N in Fn is convergent if and only if the sequence
of i-th components (xk,i )k∈N in F is convergent for all i = 1, . . . , n.
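A tiny sketch (ours, not part of the note) of coordinatewise convergence in R2 under the Euclidean norm:

```python
def dist(u, v):
    # Euclidean distance on R^2
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

# x_k = (1/k, 1 - 1/k) has coordinates converging to 0 and 1, hence x_k -> (0, 1)
limit = (0.0, 1.0)
gaps = [dist((1.0 / k, 1.0 - 1.0 / k), limit) for k in (1, 10, 100, 1000)]
assert all(g2 < g1 for g1, g2 in zip(gaps, gaps[1:]))  # gaps shrink
assert gaps[-1] < 1e-2
```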
Theorem 1.29 (Direct Sum of Normed Linear Spaces). Let (Vi , ∥ · ∥i )i∈I be a nonempty family of normed linear
spaces. Then for every p ∈ [1, ∞), the following map is a norm on the direct sum ⊕i∈I Vi : For every (xi )i∈I ∈ ⊕i∈I Vi ,
∥(xi )i∈I ∥ p := (∑i∈I ∥xi ∥ip )1/p . (24)
Similarly, the following map is a norm on ⊕i∈I Vi as well:
∥(xi )i∈I ∥∞ := supi∈I ∥xi ∥i . (25)
Proof. First, let p ∈ [1, ∞) be an arbitrary constant, and ∥ · ∥ p : ⊕i∈I Vi → R be defined as above. For each (xi )i∈I ∈ ⊕i∈I Vi , its support,
namely the set of indices i where xi ̸= 0Vi , is a finite set, so we merely have a finite sum in the definition of ∥ · ∥ p . Next, we verify that
∥ · ∥ p is a norm on ⊕i∈I Vi :
• For every (xi )i∈I ∈ ⊕i∈I Vi and c ∈ F, since c(xi )i∈I = (cxi )i∈I ,
∥c(xi )i∈I ∥ p = ∥(cxi )i∈I ∥ p = (∑i∈I ∥cxi ∥ip )1/p = (∑i∈I (|c|∥xi ∥i ) p )1/p = (|c| p ∑i∈I ∥xi ∥ip )1/p = |c|(∑i∈I ∥xi ∥ip )1/p = |c|∥(xi )i∈I ∥ p .
• For every (xi )i∈I , (yi )i∈I ∈ ⊕i∈I Vi ,
∥(xi )i∈I + (yi )i∈I ∥ p = (∑i∈I ∥xi + yi ∥ip )1/p ≤ (∑i∈I (∥xi ∥i + ∥yi ∥i ) p )1/p ≤∗ (∑i∈I ∥xi ∥ip )1/p + (∑i∈I ∥yi ∥ip )1/p .
Here ≤∗ is obtained by applying Minkowski’s inequality to the union of the supports of (xi )i∈I and (yi )i∈I .
• Let (xi )i∈I ∈ ⊕i∈I Vi . Then
∥(xi )i∈I ∥ p = 0 ⇐⇒ (∑i∈I ∥xi ∥ip )1/p = 0 ⇐⇒ ∥xi ∥i = 0 for all i ∈ I ⇐⇒ (xi )i∈I = 0.
Similarly, for every (xi )i∈I ∈ ⊕i∈I Vi , the number ∥(xi )i∈I ∥∞ is finite because we are indeed finding the maximum among the norms of its
finitely many nonzero entries. For every c ∈ F,
∥c(xi )i∈I ∥∞ = supi∈I ∥cxi ∥i = supi∈I |c|∥xi ∥i = |c| supi∈I ∥xi ∥i = |c|∥(xi )i∈I ∥∞ ,
and for every (yi )i∈I ∈ ⊕i∈I Vi ,
∥(xi )i∈I + (yi )i∈I ∥∞ = supi∈I ∥xi + yi ∥i ≤ supi∈I (∥xi ∥i + ∥yi ∥i ) ≤ supi∈I ∥xi ∥i + supi∈I ∥yi ∥i = ∥(xi )i∈I ∥∞ + ∥(yi )i∈I ∥∞ .
Finally, ∥(xi )i∈I ∥∞ = 0 if and only if ∥xi ∥i = 0 for all i ∈ I, namely (xi )i∈I = 0. ■
Remark 1.9. Consequently, by putting Vi := F for all i ∈ I, one can see that the direct sum F⊕I always has
l p -norms defined as follows: For every (xi )i∈I ∈ F⊕I ,
∥(xi )i∈I ∥ p := (∑i∈I |xi | p )1/p , p ∈ [1, ∞) and ∥(xi )i∈I ∥∞ := supi∈I |xi |. (26)
Furthermore, given any linear space V with base (zi )i∈I , applying Corollary 1.14 via the isomorphism F⊕I → V :
ei 7→ zi , the space V also necessarily has l p -norms, in which for every x ∈ V ,
∥x∥ p := (∑i∈I |z∨i (x)| p )1/p , p ∈ [1, ∞) and ∥x∥∞ := supi∈I |z∨i (x)|. (27)
Again, one may note that the definition above is not canonical.
Corollary 1.30 (Direct Sum of Banach Spaces). Let (V1 , ∥ · ∥1 ), . . . , (Vk , ∥ · ∥k ) be a nonempty family of Banach
spaces. Then (V1 ⊕ · · · ⊕Vk , ∥ · ∥ p ) is a Banach space for all p ∈ [1, ∞].
Proof. Let p ∈ [1, ∞] be arbitrary, and let ((xn,1 , . . . , xn,k ))n∈N be a Cauchy sequence in V1 ⊕ · · · ⊕Vk under ∥ · ∥ p . Then for every ε > 0, there
exists N ∈ N such that whenever m, n ≥ N,
∥(xm,1 , . . . , xm,k ) − (xn,1 , . . . , xn,k )∥ p < ε.
Fix an arbitrary i ∈ {1, . . . , k}. Then for every m, n ≥ N,
∥xm,i − xn,i ∥i ≤ ∥(xm,1 , . . . , xm,k ) − (xn,1 , . . . , xn,k )∥ p < ε,
so the sequence (xn,i )n∈N is also Cauchy in Vi . Since Vi is a Banach space under ∥ · ∥i , there exists xi ∈ Vi such that xn,i → xi in it.
• Suppose that p < ∞. Then for each i, there exists Ni ∈ N such that ∥xn,i − xi ∥i < ε/k1/p for all n ≥ Ni . Then for n ≥ N1 ∨ · · · ∨ Nk ,
∥(xn,1 , . . . , xn,k ) − (x1 , . . . , xk )∥ p = (∑ki=1 ∥xn,i − xi ∥ip )1/p < (k · (ε p /k))1/p = ε.
• When p = ∞, for each i, there exists Ni ∈ N such that ∥xn,i − xi ∥i < ε for all n ≥ Ni . Then for n ≥ N1 ∨ · · · ∨ Nk ,
∥(xn,1 , . . . , xn,k ) − (x1 , . . . , xk )∥∞ = max1≤i≤k ∥xn,i − xi ∥i < ε.
Therefore, we have (xn,1 , . . . , xn,k ) → (x1 , . . . , xk ) in V1 ⊕ · · · ⊕Vk under ∥ · ∥ p , hence we may conclude that (V1 ⊕ · · · ⊕Vk , ∥ · ∥ p ) is a Banach
space as well. ■
Theorem 1.31 (Finite Dimension and Equivalent Norms). Let V be a normed linear space. Then V is finite-
dimensional if and only if every two norms on V are equivalent.
Proof. First, suppose that V is finite-dimensional. By Corollary 1.14, it suffices to consider the case when V = Fn . Note that being equivalent
is an equivalence relation on the set of norms on V (cf. Corollary 1.21), so we may simply show that every norm on Fn is equivalent to the
Euclidean norm: Let ∥ · ∥ be an arbitrary norm on Fn and {e1 , . . . , en } be the standard base of Fn .
• For each x = [x1 , . . . , xn ]T ∈ Fn , by the triangle inequality, we have
∥x∥ ≤ |x1 |∥e1 ∥ + · · · + |xn |∥en ∥ ≤ (∥e1 ∥ ∨ · · · ∨ ∥en ∥)∥x∥1 ≤ (∥e1 ∥ ∨ · · · ∨ ∥en ∥)n1/2 ∥x∥2 ,
where the last step follows from Theorem 1.26. In particular, by Theorem 1.1, the map x 7→ ∥x∥ is continuous with respect to ∥ · ∥2 .
• On the other hand, by Lemma 1.27, there also exists λ > 0 such that
∥x∥ = ∥x1 e1 + · · · + xn en ∥ ≥ λ (|x1 | + · · · + |xn |) = λ ∥x∥1 ≥ λ ∥x∥2 .
Alternatively, consider the unit sphere S := {x ∈ Fn | ∥x∥2 = 1}, which is compact under the Euclidean norm. It follows by the
extreme value theorem that ∥ · ∥ has a minimum m on S. Since 0 ∈/ S and ∥ · ∥ is a norm, we must have m > 0 then. Consequently, for
every nonzero x ∈ Fn , since x/∥x∥2 ∈ S, we have
0 < m ≤ ∥x/∥x∥2 ∥ = ∥x∥/∥x∥2 ,
that is, ∥x∥ ≥ m∥x∥2 . The equivalence between ∥ · ∥ and ∥ · ∥2 is now established.
On the other hand, suppose that V is infinite-dimensional. Let (ei )i∈I be a base of V with coordinate forms e∨i : V → F. Here we show that
the l1 - and l∞ -norms on V are not equivalent (cf. Remark 1.9): For every x ∈ V ,
∥x∥1 = ∑i∈I |e∨i (x)| and ∥x∥∞ = supi∈I |e∨i (x)|.
It is clear that ∥ · ∥∞ ≤ ∥ · ∥1 . Assume, to the contrary, that ∥ · ∥1 ≤ β ∥ · ∥∞ for some constant β > 0. Then for every positive integer k and
distinct i1 , . . . , ik ∈ I,
∥ei1 + · · · + eik ∥1 = 1 · k = k ≤ β ∥ei1 + · · · + eik ∥∞ = β · 1 = β .
In other words, β ≥ k holds for all positive integers k, a contradiction. ■
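The blow-up in the last display can be seen concretely. The sketch below (ours, not part of the note) models x = ei1 + · · · + eik by its k nonzero coordinates and checks that the ratio ∥x∥1 /∥x∥∞ = k is unbounded:

```python
# each list models the nonzero coordinates of e_{i_1} + ... + e_{i_k}
for k in (1, 10, 100, 1000):
    coords = [1.0] * k
    l1 = sum(abs(t) for t in coords)    # ||x||_1
    linf = max(abs(t) for t in coords)  # ||x||_inf
    assert l1 == float(k) and linf == 1.0
    # so ||x||_1 / ||x||_inf = k: no single constant beta can dominate it
    assert l1 / linf == float(k)
```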
Theorem 1.32 (Finite-Dimensional Subspaces). Let V be a normed linear space with subspace W . If W is
finite-dimensional, then it is closed in V . Furthermore, for every x ∈ V , there exists y ∈ W closest to x, namely
∥x − y∥ = d(x,W ), and the set of all such closest points is convex.
Proof. Since W is finite-dimensional, it is a Banach space by Theorem 1.28; as a complete subspace of the metric space V , it is closed in V (cf. the argument in Theorem 1.10). Next, let x ∈ V be arbitrary. By the definition of d(x,W ) as an infimum, there is a sequence (yn )n∈N in W with ∥x − yn ∥ → d(x,W ), and for all sufficiently large n,
∥yn ∥ ≤ ∥yn − x∥ + ∥x∥ ≤ d(x,W ) + 1 + ∥x∥.
This shows that (yn )n∈N is also a bounded sequence in W . Since W is finite-dimensional, by the generalized Bolzano–Weierstrass theorem,
we can find a convergent subsequence (ynk )k∈N such that ynk → y ∈ W as k → ∞. By the continuity of norms,
∥x − y∥ = limk→∞ ∥x − ynk ∥ = d(x,W ),
where the last equality holds because every subsequence of a convergent sequence in R converges to the same limit.
Finally, let y1 , y2 ∈ W be elements closest to x and λ ∈ [0, 1] be arbitrary. Then for yλ := λ y1 + (1 − λ )y2 ∈ W ,
d(x,W ) ≤ ∥x − yλ ∥ = ∥λ (x − y1 ) + (1 − λ )(x − y2 )∥ ≤ λ ∥x − y1 ∥ + (1 − λ )∥x − y2 ∥ = d(x,W ).
Therefore, we also have ∥x − yλ ∥ = d(x,W ), implying that yλ ∈ W is closest to x as well. In conclusion, the set of all closest points in W to
x is convex. ■
Lemma 1.33 (Riesz). Let V be a normed linear space with closed subspace W . If W ̸= V , then for every
δ ∈ (0, 1), there is a unit element x ∈ V , i.e., ∥x∥ = 1, such that d(x,W ) ≥ δ .
Proof. Suppose that W ̸= V . Then the space V and the quotient space V /W are both nonzero. Let δ ∈ (0, 1) and y ∈ V \ W be arbitrary.
Since W is closed in V , we must have
0 < d(y,W ) = inf d(y, z) = inf ∥y − z∥.
z∈W z∈W
Now observe that d(y,W ) < d(y,W )/δ , so there exists zδ ∈ W such that d(y,W ) ≤ ∥y − zδ ∥ ≤ d(y,W )/δ . In this case, we may simply put
x := ∥y − zδ ∥−1 (y − zδ ),
which is unit. Moreover, for every z ∈ W , since zδ + ∥y − zδ ∥z ∈ W ,
∥x − z∥ = ∥y − zδ ∥−1 ∥y − (zδ + ∥y − zδ ∥z)∥ ≥ d(y,W ) / ∥y − zδ ∥ ≥ d(y,W ) / (d(y,W )/δ ) = δ .
Taking the infimum over z ∈ W gives d(x,W ) ≥ δ . ■
Theorem 1.34. Let V be a normed linear space. Then V is finite-dimensional if and only if the unit closed ball
B := {x ∈ V | ∥x∥ ≤ 1} is compact in V .
Proof. First, suppose that V is finite-dimensional. Again, by Corollary 1.14 and Theorem 1.31, we may assume that V = Fn endowed with
the Euclidean norm ∥ · ∥2 . Let (xn )n∈N be a sequence of elements in B. Such sequence is certainly bounded as B is, so by the Bolzano-
Weierstrass theorem, it contains a convergent subsequence (xnr )r∈N with limit x ∈ V . By Theorem 1.9, we see that ∥xnr ∥ → ∥x∥ as r → ∞.
Since ∥xnr ∥ ≤ 1 for all r, it follows that ∥x∥ ≤ 1 as well. This shows that x ∈ B, so the unit closed ball B is compact in V .
Conversely, suppose that the unit closed ball B is compact in V. Since the open balls (B(x, 1/2))_{x∈B} cover B, there exist finitely many
x1 , . . . , xn ∈ B such that
B ⊆ B(x1 , 1/2) ∪ · · · ∪ B(xn , 1/2).
Denote by U := Span(x1, . . . , xn), which is closed in V by Theorem 1.32. If V is infinite-dimensional, then U is a proper subspace of V. By Riesz's lemma (applied with, say, δ := 3/4), there exists a unit element x∗ ∈ B such that d(x∗,U) ≥ 3/4 > 1/2. In particular, given each k = 1, . . . , n, we have ∥x∗ − xk∥ ≥ d(x∗,U) > 1/2, contrary to the choice of the xk's. ■
Example 1.3 (L^p-Spaces). Let (X, A, µ) be a measure space. For every measurable function f : X → F, we define

∥f∥_p := (∫_X |f|^p dµ)^{1/p}. (29)
The collection of all those f ’s with ∥ f ∥ p < ∞ is called the Lebesgue space, denoted by L p (X, A, µ) or simply
L p (µ). Here ∥ · ∥ p is a semi-norm on L p (µ). Furthermore,
• ℓ p -spaces are special cases on (N, P(N), δ ), where δ is the counting measure on (N, P(N)).
• The l p -norms on Fn are special cases on ({1, . . . , n}, P({1, . . . , n}), δ ), where δ is the counting measure
on ({1, . . . , n}, P({1, . . . , n})).
2 Inner Product Spaces
2.1 Basics for Inner Product Spaces
Definition 2.1. Let V be a linear space. A semi-inner product on V is a map ⟨·, ·⟩ : V × V → F that satisfies the following conditions:
1. (Linearity in the first component) ⟨cx + x′, y⟩ = c⟨x, y⟩ + ⟨x′, y⟩ for all x, x′, y ∈ V and c ∈ F;
2. (Hermitian) ⟨y, x⟩ = \overline{⟨x, y⟩} for all x, y ∈ V;
3. (Positive semi-definiteness) ⟨x, x⟩ ≥ 0 for all x ∈ V.
Under ⟨·, ·⟩, the linear space V is called a semi-inner product space. Finally, if ⟨·, ·⟩ satisfies the following additional condition:
∀x ∈ V : ⟨x, x⟩ = 0 ⇐⇒ x = 0V , (30)
then the map ⟨·, ·⟩ is called an inner product, under which V is called an inner product space.
Remark 2.1. When F = R, the Hermitian property degenerates to the symmetry, namely ⟨y, x⟩ = ⟨x, y⟩ for all
x, y ∈ V .
Theorem 2.1. Let V be a semi-inner product space and x ∈ V. For every c ∈ F and y, y′ ∈ V,

⟨x, cy⟩ = \overline{c}⟨x, y⟩ and ⟨x, y + y′⟩ = ⟨x, y⟩ + ⟨x, y′⟩. (31)

Furthermore,

⟨x, 0_V⟩ = ⟨0_V, x⟩ = 0. (32)
Proof. Let c ∈ F and y, y′ ∈ V be arbitrary. Then

⟨x, cy⟩ = \overline{⟨cy, x⟩} = \overline{c⟨y, x⟩} = \overline{c} · \overline{⟨y, x⟩} = \overline{c}⟨x, y⟩

and

⟨x, y + y′⟩ = \overline{⟨y + y′, x⟩} = \overline{⟨y, x⟩ + ⟨y′, x⟩} = \overline{⟨y, x⟩} + \overline{⟨y′, x⟩} = ⟨x, y⟩ + ⟨x, y′⟩.

Consequently,

⟨0_V, x⟩ = ⟨0_V + 0_V, x⟩ = ⟨0_V, x⟩ + ⟨0_V, x⟩,

so we must have ⟨0_V, x⟩ = 0 and hence ⟨x, 0_V⟩ = \overline{⟨0_V, x⟩} = \overline{0} = 0 as well. ■
Remark 2.2. Together with the above theorem, now we may demonstrate why a semi-inner product is required to be Hermitian rather than symmetric: Under the Hermitian condition, for every x ∈ V and c ∈ F,

⟨cx, cx⟩ = c\overline{c}⟨x, x⟩ = |c|²⟨x, x⟩ ∈ R.

Clearly, if we do not require the map to be Hermitian, we will be left with c²⟨x, x⟩, which is not necessarily a real number!
Proposition 2.2 (Integral Form of Complex Inner Products). Let V be a complex inner product space. Then for every x, y ∈ V,

⟨x, y⟩ = (1/2π) ∫_{−π}^{π} e^{iθ}∥x + e^{iθ}y∥² dθ. (33)
Proof. Let x, y ∈ V be arbitrary. Then
∥x + eiθ y∥2 = ∥x∥2 + ∥y∥2 + eiθ ⟨y, x⟩ + e−iθ ⟨x, y⟩,
whence
eiθ ∥x + eiθ y∥2 = eiθ (∥x∥2 + ∥y∥2 ) + e2iθ ⟨y, x⟩ + ⟨x, y⟩.
Note that

∫_{−π}^{π} e^{iθ} dθ = [e^{iθ}/i]_{−π}^{π} = 0 and ∫_{−π}^{π} e^{2iθ} dθ = [e^{2iθ}/(2i)]_{−π}^{π} = 0,

so

(1/2π) ∫_{−π}^{π} e^{iθ}∥x + e^{iθ}y∥² dθ = (1/2π) ∫_{−π}^{π} ⟨x, y⟩ dθ = (⟨x, y⟩/2π) · 2π = ⟨x, y⟩. ■
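Identity (33) can be checked numerically; the following sketch (an illustration, not from the note) approximates the integral by an equally spaced Riemann sum for the standard inner product on C². Since the integrand is a trigonometric polynomial of degree 2 in θ, the sum is exact up to floating-point error.

```python
import cmath

def inner(x, y):
    # standard inner product on C^n: linear in x, conjugate-linear in y
    return sum(a * b.conjugate() for a, b in zip(x, y))

def integral_form(x, y, n=4096):
    # Riemann-sum approximation of (1/2π) ∫_{-π}^{π} e^{iθ} ||x + e^{iθ} y||² dθ
    total = 0j
    for k in range(n):
        theta = -cmath.pi + 2.0 * cmath.pi * k / n
        e = cmath.exp(1j * theta)
        z = [a + e * b for a, b in zip(x, y)]
        total += e * inner(z, z).real
    return total / n            # dθ = 2π/n cancels the 1/2π factor

x = [1 + 2j, -1j]
y = [0.5 - 1j, 3 + 0j]
print(inner(x, y), integral_form(x, y))  # agree to floating-point accuracy
```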
Theorem 2.3 (Real Part of Semi-Inner Products). Let V be a complex semi-inner product space. Then by regarding V as a real linear space, the real part of the semi-inner product

ℜ⟨·, ·⟩ : V × V → R : (x, y) ↦ ℜ⟨x, y⟩ (34)

is also a semi-inner product on V, which is an inner product on V if and only if ⟨·, ·⟩ is. Furthermore, for every x, y ∈ V,
ℜ⟨x, y⟩ = ℜ⟨y, x⟩ and ℜ⟨ix, y⟩ = −ℜ⟨x, iy⟩, (35)
and hence
⟨x, y⟩ = ℜ⟨x, y⟩ + iℜ⟨x, iy⟩ = ℜ⟨x, y⟩ − iℜ⟨ix, y⟩. (36)
Proof. By restricting the scalars, it is clear that V is also a real linear space. First, let x, y ∈ V be arbitrary. Denote by ⟨x, y⟩ = a + bi for
some a, b ∈ R. Then we may observe that
⟨ix, y⟩ = i⟨x, y⟩ = i(a + bi) = −b + ai and ⟨x, iy⟩ = ⟨−ix, y⟩ = −⟨ix, y⟩ = b − ai.
• For every x, y ∈ V,

ℜ⟨y, x⟩ = ℜ\overline{⟨x, y⟩} = ℜ⟨x, y⟩
(ℜ⟨x, x⟩ = 0 ⇐⇒ x = 0V ) ⇐⇒ (⟨x, x⟩ = 0 ⇐⇒ x = 0V ).
26
Theorem 2.4 (Complex Semi-Inner Products From Reals). Let V be a complex linear space. If ⟨·, ·⟩ : V ×V → R
is a semi-inner product on V as a real linear space such that ⟨ix, y⟩ = −⟨x, iy⟩ for all x, y ∈ V , then the following
map
⟨·, ·⟩′ : V ×V → C : (x, y) 7→ ⟨x, y⟩ + i⟨x, iy⟩ (37)
is also a semi-inner product on V as a complex linear space such that ⟨x, x⟩′ = ⟨x, x⟩ for all x ∈ V . Furthermore,
the map ⟨·, ·⟩′ is an inner product if and only if ⟨·, ·⟩ is.
Proof. Suppose that ⟨·, ·⟩ : V × V → R is a semi-inner product on V as a real linear space, and let ⟨·, ·⟩′ : V × V → C be as defined above.
We then verify that ⟨·, ·⟩′ is a semi-inner product on V as a complex linear space:
• For every x ∈ V, since ⟨ix, x⟩ = −⟨x, ix⟩ = −⟨ix, x⟩, it follows that ⟨x, ix⟩ = ⟨ix, x⟩ = 0. Consequently, ⟨x, x⟩′ = ⟨x, x⟩ + i⟨x, ix⟩ = ⟨x, x⟩ ≥ 0.
• For every x, x′ , y ∈ V ,
⟨x + x′ , y⟩′ = ⟨x + x′ , y⟩ + i⟨x + x′ , iy⟩ = (⟨x, y⟩ + ⟨x′ , y⟩) + i(⟨x, iy⟩ + ⟨x′ , iy⟩)
= (⟨x, y⟩ + i⟨x, iy⟩) + (⟨x′ , y⟩ + i⟨x′ , iy⟩)
= ⟨x, y⟩′ + ⟨x′ , y⟩′ .
• For every x, y ∈ V, since ⟨y, ix⟩ = ⟨ix, y⟩ = −⟨x, iy⟩,

⟨y, x⟩′ = ⟨y, x⟩ + i⟨y, ix⟩ = ⟨x, y⟩ − i⟨x, iy⟩ = \overline{⟨x, y⟩ + i⟨x, iy⟩} = \overline{⟨x, y⟩′}.
Finally, since ⟨x, x⟩′ = ⟨x, x⟩ for all x ∈ V , it is clear that ⟨·, ·⟩′ is an inner product if and only if ⟨·, ·⟩ is. ■
Remark 2.3. By the preceding two theorems, we can see that a complex semi-inner product is uniquely deter-
mined by its real part.
Definition 2.2. Let V be a complex linear space. A sesquilinear form on V is a map ϕ : V × V → C such that
1. ϕ(cx1 + x2, y) = cϕ(x1, y) + ϕ(x2, y) for all x1, x2, y ∈ V and c ∈ C;
2. ϕ(x, cy1 + y2) = \overline{c}ϕ(x, y1) + ϕ(x, y2) for all x, y1, y2 ∈ V and c ∈ C.
That is, ϕ is linear in the first component but conjugate-linear in the second. Furthermore, the map

Φ : V → C : x ↦ ϕ(x, x) (38)

is called the quadratic form associated with ϕ.
Example 2.1. Let V be a complex linear space and f, g : V → C be linear functionals on V. Then the following map is sesquilinear:

ϕ : V × V → C : (x, y) ↦ f(x)\overline{g(y)}. (40)

• For every x1, x2, y ∈ V and c ∈ C,

ϕ(cx1 + x2, y) = f(cx1 + x2)\overline{g(y)} = (cf(x1) + f(x2))\overline{g(y)} = cϕ(x1, y) + ϕ(x2, y).
Example 2.2. Let V be a complex inner product space and A, B ∈ L (V) be linear operators on V. Then the following map is also sesquilinear:

ϕ : V × V → C : (x, y) ↦ ⟨Ax, By⟩.

In particular, by considering the identity maps, given any A ∈ L (V), we also have the following sesquilinear forms (x, y) ↦ ⟨Ax, y⟩ and (x, y) ↦ ⟨x, Ay⟩.
Lemma 2.5. Let V be a semi-inner product space. Then for every x, y ∈ V,

|⟨x, y⟩| ≤ (⟨x, x⟩ + ⟨y, y⟩)/2. (42)
Proof. Clearly, there is nothing to prove if ⟨x, y⟩ = 0, so we shall assume that ⟨x, y⟩ ≠ 0 then. Let t ∈ C such that |t| = 1. Then

0 ≤ ⟨tx − y, tx − y⟩ = |t|²⟨x, x⟩ + ⟨y, y⟩ − 2ℜ(t⟨x, y⟩) = ⟨x, x⟩ + ⟨y, y⟩ − 2ℜ(t⟨x, y⟩),

so

ℜ(t⟨x, y⟩) ≤ (⟨x, x⟩ + ⟨y, y⟩)/2.

In particular, for t0 := |⟨x, y⟩|^{−1}⟨y, x⟩, we can see that |t0| = |⟨x, y⟩|^{−1}|⟨y, x⟩| = 1 and

ℜ(t0⟨x, y⟩) = ℜ(|⟨x, y⟩|^{−1}⟨y, x⟩⟨x, y⟩) = ℜ(|⟨x, y⟩|^{−1}|⟨x, y⟩|²) = ℜ(|⟨x, y⟩|) = |⟨x, y⟩|.

Therefore, the previous inequality about t can be applied to t0, which precisely entails the desired inequality. ■
Theorem 2.6 (Cauchy-Bunyakovski-Schwarz Inequality). Let V be a semi-inner product space. Then for every
x, y ∈ V ,
|⟨x, y⟩|2 ≤ ⟨x, x⟩⟨y, y⟩. (43)
When ⟨·, ·⟩ is an inner product, the equality is attained if and only if x, y are linearly dependent.
Proof. Let x, y ∈ V be arbitrary and θ ∈ R. Since ⟨x, x⟩ and ⟨y, y⟩ are both non-negative real numbers, there is nothing to prove if ⟨x, y⟩ = 0.
As a result, we shall assume that ⟨x, y⟩ ≠ 0 throughout. Now consider the following real polynomial: for t, θ ∈ R,

p(t) := ⟨tx − e^{iθ}y, tx − e^{iθ}y⟩ = t²⟨x, x⟩ − 2tℜ(e^{−iθ}⟨x, y⟩) + ⟨y, y⟩.
In this case, we may select θ ∈ R appropriately such that ℜ(e−iθ ⟨x, y⟩) = |⟨x, y⟩| (e.g., θ = arg(⟨x, y⟩) and certainly θ = 0 when F = R),
hence p becomes
p(t) = t 2 ⟨x, x⟩ − 2t|⟨x, y⟩| + ⟨y, y⟩.
Here we claim that ⟨x, x⟩ > 0 in this case: Suppose otherwise, say ⟨x, x⟩ = 0. Then the polynomial p is further reduced to p(t) = −2t|⟨x, y⟩| + ⟨y, y⟩. Since ⟨x, y⟩ ≠ 0, we see that |⟨x, y⟩| > 0. Thus p defines a decreasing linear function of t, so for sufficiently large t, we shall have p(t) < 0, contradicting the non-negativity of p.

Now ⟨x, x⟩ > 0, so p instead defines a quadratic function of t with global minimum at t∗ := |⟨x, y⟩|/⟨x, x⟩, where

p(t∗) = ⟨y, y⟩ − |⟨x, y⟩|²/⟨x, x⟩ ≥ 0,

where the last inequality follows from the fact that p(t) ≥ 0 for all t ∈ R. The desired inequality |⟨x, y⟩|² ≤ ⟨x, x⟩⟨y, y⟩ thus follows. Finally, we study the equality condition as follows:
we study the equality condition as follows:
• First, suppose that x, y are linearly dependent, say x = cy for some c ∈ F. Then
⟨x, y⟩ = ⟨cy, y⟩ = c⟨y, y⟩ and ⟨x, x⟩ = ⟨cy, cy⟩ = |c|2 ⟨y, y⟩,
so
|⟨x, y⟩|2 = |c|2 ⟨y, y⟩2 = ⟨x, x⟩⟨y, y⟩.
That is, the equality is attained now.
• Now suppose that x, y are linearly independent. Then for every t, θ ∈ R, the element tx − eiθ y is always nonzero. When ⟨·, ·⟩ is an
inner product on V , we see that p(t) = ⟨tx − eiθ y,tx − eiθ y⟩ > 0 for all t ∈ R. Then together with the discussions above, we see that
|⟨x, y⟩|2 < ⟨x, x⟩⟨y, y⟩, namely the equality is never attained in this case. ■
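A small numerical sanity check of the CBS inequality on C³ with the standard inner product (an illustration, not part of the note); the second print verifies that equality is attained for linearly dependent vectors.

```python
def inner(x, y):
    # standard inner product on C^n: linear in x, conjugate-linear in y
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 2j, -1j, 0.5 + 0j]
y = [2 - 1j, 3j, 1 + 1j]
lhs = abs(inner(x, y)) ** 2
rhs = inner(x, x).real * inner(y, y).real
print(lhs <= rhs)               # True (strict here: x, y are independent)

xc = [(2 - 3j) * b for b in y]  # xc = cy is linearly dependent on y
print(abs(abs(inner(xc, y)) ** 2 - inner(xc, xc).real * inner(y, y).real) < 1e-6)
```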
Direct Proof for Inner Product Spaces. Suppose that V is an inner product space with x, y ∈ V. Again, if x = y = 0_V, we have ⟨x, y⟩ = ⟨x, x⟩ = ⟨y, y⟩ = 0, so the desired inequality follows as well. Now, without loss of generality, assume that y ≠ 0_V. Put v := ⟨y, y⟩x − ⟨x, y⟩y. As we can see,

0 ≤ ⟨v, v⟩ = ⟨y, y⟩²⟨x, x⟩ − ⟨y, y⟩|⟨x, y⟩|² − ⟨y, y⟩|⟨x, y⟩|² + |⟨x, y⟩|²⟨y, y⟩ = ⟨y, y⟩(⟨y, y⟩⟨x, x⟩ − |⟨x, y⟩|²).

Now since ⟨y, y⟩ > 0, the desired inequality ⟨y, y⟩⟨x, x⟩ ≥ |⟨x, y⟩|² follows. Clearly, the equality here is attained if and only if v = ⟨y, y⟩x − ⟨x, y⟩y = 0_V.
• If v = ⟨y, y⟩x − ⟨x, y⟩y = 0_V, we have x = ⟨y, y⟩^{−1}⟨x, y⟩y, hence x, y are linearly dependent.
• Conversely, suppose that x = cy for some c ∈ F. Then ⟨x, y⟩ = ⟨cy, y⟩ = c⟨y, y⟩, hence v = ⟨y, y⟩(cy) − c⟨y, y⟩y = 0_V. ■
Alternative Proof for the Inequality. Again, we consider the case when V is an inner product space. Let λ ∈ C be nonzero. We now put u := λx and v := (\overline{λ})^{−1}y. Observe that

⟨u, v⟩ = ⟨λx, (\overline{λ})^{−1}y⟩ = λ · (1/λ)⟨x, y⟩ = ⟨x, y⟩.

By Lemma 2.5, we have

|⟨x, y⟩| = |⟨u, v⟩| ≤ (⟨u, u⟩ + ⟨v, v⟩)/2 = |λ|²⟨x, x⟩/2 + ⟨y, y⟩/(2|λ|²).

Again, there is nothing to prove if x = 0_V or y = 0_V. Suppose otherwise. Then we may put λ := (⟨y, y⟩/⟨x, x⟩)^{1/4}, hence

|⟨x, y⟩| ≤ (⟨x, x⟩⟨y, y⟩)^{1/2}/2 + (⟨x, x⟩⟨y, y⟩)^{1/2}/2 = (⟨x, x⟩⟨y, y⟩)^{1/2}. ■
Theorem 2.7 (Norms Derived from Inner Products). Let V be a semi-inner product space. Then V is also a semi-normed linear space under the following semi-norm:

∥x∥ := ⟨x, x⟩^{1/2}, x ∈ V. (44)

Furthermore:
1. The real part ℜ⟨·, ·⟩ of the semi-inner product induces the same semi-norm on V .
2. In the triangle inequality ∥x + y∥ ≤ ∥x∥ + ∥y∥ with x, y ∈ V , the equality is attained if and only if y = cx
for some non-negative c ∈ R.
Proof. First, we verify the triangle inequality: For every x, y ∈ V,

∥x + y∥² = ⟨x + y, x + y⟩ = ∥x∥² + ∥y∥² + 2ℜ⟨x, y⟩
≤ ∥x∥² + ∥y∥² + 2|⟨x, y⟩|
≤ ∥x∥² + ∥y∥² + 2∥x∥∥y∥ = (∥x∥ + ∥y∥)²,
hence we certainly have ∥x + y∥ ≤ ∥x∥ + ∥y∥, proving the triangle inequality. The bottom equality in CBS is attained if and only if x = cy
for some c ∈ F. In this case,
ℜ⟨x, y⟩ = ℜ(⟨cy, y⟩) = ℜ(c⟨y, y⟩) = ℜ(c)⟨y, y⟩
and
|⟨x, y⟩| = |⟨cy, y⟩| = |c⟨y, y⟩| = |c|⟨y, y⟩,
so the top equality is attained if and only if ℜ(c) = |c|, which is equivalent to saying that c is a non-negative real number. Therefore, the equality in the triangle inequality is attained if and only if x = cy for some non-negative c ∈ R.
• By Theorem 2.3, we see that ℜ⟨x, x⟩ = ⟨x, x⟩ for all x ∈ V , so the semi-norm on V induced by ℜ⟨·, ·⟩ is the same as the one induced
by ⟨·, ·⟩.
• Let x, x′, y, y′ ∈ V. By the CBS inequality,

|⟨x′, y′⟩ − ⟨x, y⟩| = |⟨x′ − x, y′⟩ + ⟨x, y′ − y⟩| ≤ ∥x′ − x∥∥y′∥ + ∥x∥∥y′ − y∥.
Consequently, if (x′ , y′ ) → (x, y) in V × V , we shall have x′ → x and y′ → y in V . As a result, we see that ∥x′ − x∥ → 0 and
∥y′ − y∥ → 0, hence |⟨x′ , y′ ⟩ − ⟨x, y⟩| → 0 as well. This proves the continuity of the semi-inner product. (In addition, its continuity
also follows from the polarization identities in the following theorem.)
• Finally, for every x ∈ V ,
∥x∥ = 0 ⇐⇒ ∥x∥2 = 0 ⇐⇒ ⟨x, x⟩ = 0.
Clearly, ∥ · ∥ is a norm if and only if the above is equivalent to that x = 0V , which is precisely the case when ⟨·, ·⟩ is an inner
product. ■
Remark 2.5. Given any semi-inner product space, we shall by default use ∥ · ∥ to denote the semi-norm induced by the semi-inner product. In this case, the CBS inequality can be rephrased as follows: For every x, y ∈ V,

|⟨x, y⟩| ≤ ∥x∥∥y∥.
Definition 2.3. An inner product space is called a Hilbert space if it is a Banach space under the derived norm,
or equivalently, the metric induced by the inner product is complete.
Example 2.3 (Standard Inner Product on Fn). The standard inner product on Fn is defined as follows: For every x = [x1, . . . , xn]^T, y = [y1, . . . , yn]^T ∈ Fn,

⟨x, y⟩ := y∗x = x1\overline{y1} + · · · + xn\overline{yn}. (46)
• For every x = [x1, . . . , xn]^T, x′ = [x′1, . . . , x′n]^T, y = [y1, . . . , yn]^T ∈ Fn, since x + x′ = [x1 + x′1, . . . , xn + x′n]^T,

⟨x + x′, y⟩ = ∑_{k=1}^{n} (xk + x′k)\overline{yk} = ∑_{k=1}^{n} (xk\overline{yk} + x′k\overline{yk}) = ∑_{k=1}^{n} xk\overline{yk} + ∑_{k=1}^{n} x′k\overline{yk} = ⟨x, y⟩ + ⟨x′, y⟩.
Furthermore, the norm on Fn induced by the standard inner product is precisely the Euclidean norm: For every
x = [x1 , . . . , xn ]T ,
⟨x, x⟩1/2 = (|x1 |2 + · · · + |xn |2 )1/2 = ∥x∥2 .
Example 2.4 (Inner Product Induced by Matrix). In addition, let A ∈ Mn(F) be a matrix. Then the following map based on the standard inner product also defines a semi-inner product on Fn:

⟨·, ·⟩_A : Fn × Fn → F : (x, y) ↦ ⟨Ax, Ay⟩. (47)

• For every x, x′, y ∈ Fn and c ∈ F,
⟨x + x′ , y⟩A = ⟨A(x + x′ ), Ay⟩ = ⟨Ax, Ay⟩ + ⟨Ax′ , Ay⟩ = ⟨x, y⟩A + ⟨x′ , y⟩A
and
⟨cx, y⟩A = ⟨A(cx), Ay⟩ = ⟨c(Ax), Ay⟩ = c⟨Ax, Ay⟩ = c⟨x, y⟩A .
• For every x, y ∈ Fn,

⟨y, x⟩_A = ⟨Ay, Ax⟩ = \overline{⟨Ax, Ay⟩} = \overline{⟨x, y⟩_A}.

• For every x ∈ Fn, ⟨x, x⟩_A = ⟨Ax, Ax⟩ = ∥Ax∥_2² ≥ 0, where

⟨x, x⟩_A = 0 ⇐⇒ Ax = 0 ⇐= x = 0_V.

Clearly, the last converse holds if and only if A has full column rank, but because A is square, the latter condition is equivalent to that A is invertible. In other words, the map ⟨·, ·⟩_A is an inner product on Fn if and only if A is invertible.
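A minimal sketch (not from the note) of this construction: for a singular A, the map ⟨·, ·⟩_A is degenerate, i.e., a genuine semi-inner product rather than an inner product.

```python
def inner(x, y):
    # standard inner product, conjugate-linear in the second argument
    return sum(a * b.conjugate() for a, b in zip(x, y))

def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

def inner_A(A, x, y):
    # <x, y>_A := <Ax, Ay> built on the standard inner product
    return inner(matvec(A, x), matvec(A, y))

A = [[1, 1], [0, 0]]     # singular matrix: rank 1
x = [1, -1]              # nonzero element of the null space of A
print(inner_A(A, x, x))  # 0 although x != 0: a semi-inner product only
```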
Theorem 2.9 (Complexification). Let X be a real inner product space, and let X_C := X ⊕ iX = {x + iy | x, y ∈ X} denote its complexification. Then the following map

⟨·, ·⟩_C : X_C × X_C → C : (x1 + iy1, x2 + iy2) ↦ (⟨x1, x2⟩ + ⟨y1, y2⟩) + i(⟨y1, x2⟩ − ⟨x1, y2⟩)

is an inner product on X_C. Furthermore, if X is a Hilbert space, then X_C is a Hilbert space as well.
Remark 2.6. The motivation for such inner product is as follows: Note that in C, the inner product is defined as follows: For a1, a2, b1, b2 ∈ R,

⟨a1 + ib1, a2 + ib2⟩ = (a1 + ib1)\overline{(a2 + ib2)} = (a1 + ib1)(a2 − ib2) = (a1a2 + b1b2) + i(a2b1 − a1b2).
Furthermore,
⟨x + iy, x + iy⟩C = 0 ⇐⇒ ⟨x, x⟩ = ⟨y, y⟩ = 0 ⇐⇒ x = y = 0 ⇐⇒ x + iy = 0.
• For every x1 , x2 , y1 , y2 ∈ X,
Finally, let (xn + iyn )n∈N be a Cauchy sequence in XC . Then for every m, n ∈ N,
∥(xm + iym ) − (xn + iyn )∥2C = ∥(xm − xn ) + i(ym − yn )∥2C = ∥xm − xn ∥2 + ∥ym − yn ∥2 ,
so (xn)n∈N and (yn)n∈N are also Cauchy sequences in X. Now if X is a Hilbert space, such two sequences must be convergent, say xn → x ∈ X and yn → y ∈ X. Again, for every n ∈ N,

∥(xn + iyn) − (x + iy)∥²_C = ∥xn − x∥² + ∥yn − y∥² → 0 as n → ∞.

Therefore, we also have xn + iyn → x + iy in X_C, hence X_C is a Hilbert space as well in this case. ■
4. (Polarization Identities). When F = R,

⟨x, y⟩ = (1/4)(∥x + y∥² − ∥x − y∥²) = (1/4) ∑_{k=0}^{1} (−1)^k ∥x + (−1)^k y∥²,

and when F = C,

⟨x, y⟩ = (1/4)[(∥x + y∥² − ∥x − y∥²) + i(∥x + iy∥² − ∥x − iy∥²)] = (1/4) ∑_{k=0}^{3} i^k ∥x + i^k y∥².
When ⟨x, y⟩ = 0, we see that ℜ⟨x, y⟩ = 0 as well. Consequently, the equality ∥x + y∥² = ∥x∥² + ∥y∥² follows immediately. Furthermore, by the above identities, we see that
while
∥x + y∥2 + ∥x − y∥2 = 2(∥x∥2 + ∥y∥2 ) and ∥x + y∥2 − ∥x − y∥2 = 4ℜ⟨x, y⟩.
Similarly, we see that
whose equality holds if and only if ℜ⟨x, y⟩ = 0. Finally, observe that when F = C,
The desired statements thus follow immediately. ■
In particular, for every x, y ∈ V,

ℜ⟨x, y⟩ = (1/4)(∥x + y∥² − ∥x − y∥²) = (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²), (52)

which degenerates to ⟨x, y⟩ when F = R.
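The complex polarization identity can be verified numerically; a small sketch with the standard inner product on C² (an illustration, not from the note):

```python
def inner(x, y):
    # standard inner product on C^n: linear in x, conjugate-linear in y
    return sum(a * b.conjugate() for a, b in zip(x, y))

def nsq(x):
    # squared norm derived from the inner product
    return inner(x, x).real

x = [1 + 1j, 2 - 1j]
y = [0.5j, -1 + 3j]
polar = 0.25 * sum((1j ** k) * nsq([a + (1j ** k) * b for a, b in zip(x, y)])
                   for k in range(4))
print(polar, inner(x, y))  # agree up to floating-point error
```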
Corollary 2.11 (Generalized Pythagorean Theorem). Let V be a semi-inner product space over F and x1, . . . , xm ∈ V, where m ≥ 2. If ⟨xi, xj⟩ = 0 for every distinct i, j = 1, . . . , m, then for every c1, . . . , cm ∈ F,

∥c1x1 + · · · + cmxm∥² = |c1|²∥x1∥² + · · · + |cm|²∥xm∥².
• Now assume that the identity holds for some m ≥ 2. Let x1, . . . , xm+1 ∈ V be such that ⟨xi, xj⟩ = 0 for every distinct i, j = 1, . . . , m + 1, and c1, . . . , cm+1 ∈ F be arbitrary. By similar arguments as above and the linearity of semi-inner product,

⟨c1x1 + · · · + cmxm, cm+1xm+1⟩ = ∑_{k=1}^{m} ⟨ckxk, cm+1xm+1⟩ = ∑_{k=1}^{m} 0 = 0.

Therefore,

∥c1x1 + · · · + cmxm + cm+1xm+1∥² =(∗) ∥c1x1 + · · · + cmxm∥² + ∥cm+1xm+1∥² =(∗∗) |c1|²∥x1∥² + · · · + |cm|²∥xm∥² + |cm+1|²∥xm+1∥²,

where =(∗) follows from the Pythagorean theorem, and =(∗∗) follows from the inductive hypothesis. ■
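A quick numerical check of the generalized Pythagorean identity on C³ (an illustration, not from the note):

```python
def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

# pairwise-orthogonal vectors in C^3 with arbitrary scalars
xs = ([1, 0, 0], [0, 2j, 0], [0, 0, -1 + 1j])
cs = (2 - 1j, 0.5, 3j)
v = [sum(c * xk[i] for c, xk in zip(cs, xs)) for i in range(3)]
lhs = inner(v, v).real
rhs = sum(abs(c) ** 2 * inner(xk, xk).real for c, xk in zip(cs, xs))
print(lhs, rhs)  # both 24.0
```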
Theorem 2.12 (Polarization Identity for Sesquilinear Forms). Let V be a complex linear space and ϕ : V × V → C be a sesquilinear form with associated quadratic form Φ. Then for every x, y ∈ V,

ϕ(x, y) = (1/4) ∑_{k=0}^{3} i^k Φ(x + i^k y).

Proof. Observe that

ϕ(x, y) = (1/2) · (−1/(2i)) ϕ(2x, 2iy) = −(1/(4i)) ϕ((x + iy) + (x − iy), (x + iy) − (x − iy))
= (i/4)(Φ(x + iy) + ϕ(x − iy, x + iy) − ϕ(x + iy, x − iy) − Φ(x − iy))
= (i/4)(Φ(x + iy) − Φ(x − iy)) + (i/4)(ϕ(x − iy, x + iy) − ϕ(x + iy, x − iy)).
What remains is to study the last two terms:

(i/4)(ϕ(x − iy, x + iy) − ϕ(x + iy, x − iy))
= (i/4)((Φ(x) − ϕ(iy, x) + ϕ(x, iy) − Φ(iy)) − (Φ(x) + ϕ(iy, x) − ϕ(x, iy) − Φ(iy)))
= (i/4) · 2(ϕ(x, iy) − ϕ(iy, x)) = (i/4) · 2(−iϕ(x, y) − iϕ(y, x))
= (1/2)(ϕ(x, y) + ϕ(y, x)).
Observe that

Φ(x + y) − Φ(x − y) = (Φ(x) + ϕ(y, x) + ϕ(x, y) + Φ(y)) − (Φ(x) − ϕ(y, x) − ϕ(x, y) + Φ(y)) = 2(ϕ(x, y) + ϕ(y, x)),

so

(i/4)(ϕ(x − iy, x + iy) − ϕ(x + iy, x − iy)) = (1/4)(Φ(x + y) − Φ(x − y)).
The proof is thus complete. ■
Corollary 2.13. Let V be a complex linear space and ϕ : V × V → C be a sesquilinear form. Then ϕ is
Hermitian, namely ϕ(x, y) = ϕ(y, x) for all x, y ∈ V , if and only if ϕ(x, x) ∈ R for all x ∈ V ; that is, its associated
quadratic form takes real values only.
Proof. For convenience, denote by Φ : V → C : x ↦ ϕ(x, x) the quadratic form associated with ϕ. First, if ϕ is Hermitian, then for every x ∈ V, we have Φ(x) = ϕ(x, x) = \overline{ϕ(x, x)} = \overline{Φ(x)}, so Φ(x) ∈ R holds for sure. Conversely, suppose that Φ(x) ∈ R for all x ∈ V. Observe from Remark 2.4 that

ϕ(y, x) = (1/4)[(Φ(y + x) − Φ(y − x)) + i(Φ(y + ix) − Φ(y − ix))]
= (1/4)[(Φ(x + y) − Φ(x − y)) + i(Φ(x − iy) − Φ(x + iy))]
= \overline{(1/4)[(Φ(x + y) − Φ(x − y)) + i(Φ(x + iy) − Φ(x − iy))]} = \overline{ϕ(x, y)},

where the second equality uses Φ(−z) = Φ(z) and Φ(iz) = Φ(z) for all z ∈ V, and the third holds because all the Φ-values are real.
Therefore, the sesquilinear form ϕ is also Hermitian in this case. ■
Remark 2.8. Consequently, a sesquilinear form is a semi-inner product if and only if it is positive semi-definite, as the Hermitian condition is now implied by the positive semi-definiteness condition. Besides, the quadratic form associated with a semi-inner product is precisely the square of the derived semi-norm.
Corollary 2.14. Let V be a complex semi-inner product space and T ∈ L (V) be a linear operator. Then for every x, y ∈ V,

⟨Tx, y⟩ = (1/4) ∑_{k=0}^{3} i^k ⟨T(x + i^k y), x + i^k y⟩. (55)

In particular, if ⟨Tx, x⟩ = 0 for all x ∈ V, then T = 0.

Proof. Consider the following sesquilinear form (cf. Example 2.2):

ϕ : V × V → C : (x, y) ↦ ⟨Tx, y⟩.

By the polarization identity for sesquilinear forms,

⟨Tx, y⟩ = ϕ(x, y) = (1/4) ∑_{k=0}^{3} i^k ϕ(x + i^k y, x + i^k y) = (1/4) ∑_{k=0}^{3} i^k ⟨T(x + i^k y), x + i^k y⟩, ∀x, y ∈ V.

As we can see, suppose that ⟨Tx, x⟩ = 0 for all x ∈ V. Fix one arbitrary x ∈ V. By the identity above, we now have ⟨Tx, y⟩ = 0 for all y ∈ V, especially when y = Tx itself, hence Tx = 0_V holds. Since x ∈ V is arbitrary, it follows that T = 0, as desired. ■
Remark 2.9. By putting T := idV , the above generalized polarization identity recovers the classical one.
Theorem 2.15. Let V be a semi-inner product space. For every nonzero x, y ∈ V with respective normalizations x̂ := ∥x∥^{−1}x and ŷ := ∥y∥^{−1}y,

∥x̂ − ŷ∥ ≤ 2∥x − y∥/(∥x∥ + ∥y∥). (56)
Proof. By Theorem 2.10,

∥x̂ − ŷ∥² = ∥x̂∥² + ∥ŷ∥² − 2ℜ⟨x̂, ŷ⟩ = 2 − (2/(∥x∥∥y∥))ℜ⟨x, y⟩
= 2 − (∥x∥² + ∥y∥² − ∥x − y∥²)/(∥x∥∥y∥)
= (2∥x∥∥y∥ − (∥x∥² + ∥y∥² − ∥x − y∥²))/(∥x∥∥y∥) = (∥x − y∥² − (∥x∥ − ∥y∥)²)/(∥x∥∥y∥).

In this case,

(∥x∥ + ∥y∥)² − ∥x − y∥² = (∥x∥² + 2∥x∥∥y∥ + ∥y∥²) − (∥x∥² + ∥y∥² − 2ℜ⟨x, y⟩)
= 2∥x∥∥y∥ + 2ℜ⟨x, y⟩ = 2(∥x∥∥y∥ + ℜ⟨x, y⟩)
≥ 2(|⟨x, y⟩| + ℜ⟨x, y⟩) ≥ 0.

Therefore,

∥x∥∥y∥(4∥x − y∥² − (∥x∥ + ∥y∥)²∥x̂ − ŷ∥²) = ∥x − y∥²(4∥x∥∥y∥ − (∥x∥ + ∥y∥)²) + (∥x∥ + ∥y∥)²(∥x∥ − ∥y∥)²
= (∥x∥ − ∥y∥)²((∥x∥ + ∥y∥)² − ∥x − y∥²) ≥ 0,

so 4∥x − y∥² − (∥x∥ + ∥y∥)²∥x̂ − ŷ∥² ≥ 0, hence the desired inequality follows immediately. ■
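A numerical check of inequality (56) with the Euclidean norm on R² (an illustration, not from the note):

```python
def norm(v):
    return sum(abs(c) ** 2 for c in v) ** 0.5

def hat(v):
    n = norm(v)
    return [c / n for c in v]   # the normalization v / ||v||

def diff(a, b):
    return [p - q for p, q in zip(a, b)]

x, y = [3.0, 4.0], [-1.0, 2.0]
lhs = norm(diff(hat(x), hat(y)))
rhs = 2 * norm(diff(x, y)) / (norm(x) + norm(y))
print(lhs <= rhs)               # True
```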
Remark 2.10. The above inequality is stronger than the one presented in Theorem 1.3.
Theorem 2.16 (Ptolemy Inequality). Let V be an inner product space. Then for every x, y, z ∈ V,

∥x − y∥∥z∥ ≤ ∥x − z∥∥y∥ + ∥z − y∥∥x∥.

Proof. There is nothing to prove if any of x, y, z is equal to 0_V, so we may assume that x, y, z are all nonzero then. Let

x′ := x/∥x∥², y′ := y/∥y∥², and z′ := z/∥z∥².
Then

∥x′ − y′∥² = ∥x′∥² + ∥y′∥² − 2ℜ⟨x′, y′⟩ = 1/∥x∥² + 1/∥y∥² − (2/(∥x∥²∥y∥²))ℜ⟨x, y⟩
= (∥y∥² + ∥x∥² − 2ℜ⟨x, y⟩)/(∥x∥²∥y∥²) = ∥x − y∥²/(∥x∥²∥y∥²).
Consequently, we have

∥x′ − y′∥ = ∥x − y∥/(∥x∥∥y∥),

and similar identities hold for x′, z′ and y′, z′ as well. Finally, by the triangle inequality,

∥x − y∥/(∥x∥∥y∥) = ∥x′ − y′∥ ≤ ∥x′ − z′∥ + ∥z′ − y′∥ = ∥x − z∥/(∥x∥∥z∥) + ∥z − y∥/(∥z∥∥y∥).

Multiplying both sides by ∥x∥∥y∥∥z∥ yields the desired inequality. ■
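Assuming the inequality reads ∥x − y∥∥z∥ ≤ ∥x − z∥∥y∥ + ∥z − y∥∥x∥ (the form obtained by clearing denominators in the final display), it can be spot-checked numerically (not from the note):

```python
def norm(v):
    return sum(abs(c) ** 2 for c in v) ** 0.5

def diff(a, b):
    return [p - q for p, q in zip(a, b)]

x, y, z = [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]
lhs = norm(diff(x, y)) * norm(z)
rhs = norm(diff(x, z)) * norm(y) + norm(diff(z, y)) * norm(x)
print(lhs <= rhs)   # True
```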
Theorem 2.17 (Apollonius Identity). Let V be an inner product space. For every x, y, z ∈ V,

∥x − z∥² + ∥y − z∥² = (1/2)∥x − y∥² + 2∥z − (1/2)(x + y)∥². (58)
Remark 2.11. The above identity generalizes Apollonius' theorem in geometry: Given any triangle △ABC, if D is the midpoint of BC, then

|AB|² + |AC|² = 2(|AD|² + |BD|²). (59)

Here we may fix an arbitrary origin O, and put x := \vec{OB}, y := \vec{OC}, and z := \vec{OA}.
Theorem 2.18 (Hlawka's Inequality). Let V be an inner product space. Then for every x, y, z ∈ V,

∥x + y∥ + ∥x + z∥ + ∥y + z∥ ≤ ∥x + y + z∥ + ∥x∥ + ∥y∥ + ∥z∥. (61)
Proof. For convenience, we denote by u := x + y + z. Observe that
∥u∥2 = ∥x + y + z∥2 = ⟨x + y + z, x + y + z⟩
= ∥x∥2 + ∥y∥2 + ∥z∥2 + 2(ℜ⟨x, y⟩ + ℜ⟨y, z⟩ + ℜ⟨x, z⟩),
If A = 0, then x = y = z = 0V and the inequality is trivial. Now suppose that A > 0. Then
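Inequality (61) can be spot-checked numerically with the Euclidean norm on R² (an illustration, not from the note):

```python
def norm(v):
    return sum(abs(c) ** 2 for c in v) ** 0.5

def add(a, b):
    return [p + q for p, q in zip(a, b)]

x, y, z = [1.0, -2.0], [0.5, 3.0], [-4.0, 1.0]
lhs = norm(add(x, y)) + norm(add(x, z)) + norm(add(y, z))
rhs = norm(add(add(x, y), z)) + norm(x) + norm(y) + norm(z)
print(lhs <= rhs)   # True
```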
Definition 2.4. Let V be a linear space. A norm ∥ · ∥ : V → R on V is said to be derived from an inner product
if there exists an inner product ⟨·, ·⟩ : V ×V → F such that ∥x∥ = ⟨x, x⟩1/2 for all x ∈ V .
Proposition 2.19. For n ≥ 2, the l^p-norm on Fn is derived from an inner product if and only if p = 2.

Proof. In Example 2.3, we have seen that the Euclidean norm (the l²-norm) is induced from the standard inner product on Fn. Conversely, suppose that ∥ · ∥_p is derived from an inner product. In this case, consider the standard base elements e1, e2 ∈ Fn. Here ∥e1∥_p = ∥e2∥_p = 1 for every p ∈ [1, ∞], so by the parallelogram identity (cf. Theorem 2.10), we have

∥e1 + e2∥_p² + ∥e1 − e2∥_p² = 2(∥e1∥_p² + ∥e2∥_p²) = 4.

Note that ∥e1 + e2∥_∞ = ∥e1 − e2∥_∞ = 1. Since 1² + 1² < 4, we must have p < ∞ in this case. Then

∥e1 + e2∥_p = ∥e1 − e2∥_p = 2^{1/p}.

As a result,

4 = ∥e1 + e2∥_p² + ∥e1 − e2∥_p² = 2^{2/p} + 2^{2/p} = 2^{1+(2/p)},

which entails that p = 2, as desired. ■
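The parallelogram test in the proof can be carried out numerically (an illustration, not from the note): the quantity ∥e1 + e2∥_p² + ∥e1 − e2∥_p² equals 4 only at p = 2.

```python
def norm_p(v, p):
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

s, d = [1.0, 1.0], [1.0, -1.0]   # e1 + e2 and e1 - e2 in F^2
for p in (1, 1.5, 2, 3):
    val = norm_p(s, p) ** 2 + norm_p(d, p) ** 2
    print(p, val)                # the parallelogram law requires 4
```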
Theorem 2.20 (Norms Derived From Inner Products). Let V be a linear space. Then a norm ∥ · ∥ on V is derived
from an inner product if and only if the parallelogram identity holds, namely for all x, y ∈ V ,
Proof. The necessity of the parallelogram identity is clear from Theorem 2.10, so it suffices to prove its sufficiency: Suppose that the parallelogram identity holds for a norm ∥ · ∥ on V.
Case 1. When F = R, we define

⟨·, ·⟩ : V × V → R : (x, y) ↦ (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²). (63)
We then verify that ⟨·, ·⟩ is an inner product on V :
• For every x ∈ V,

⟨x, x⟩ = (1/2)(∥x + x∥² − ∥x∥² − ∥x∥²) = (1/2)(∥2x∥² − 2∥x∥²) = (1/2)(4∥x∥² − 2∥x∥²) = ∥x∥² ≥ 0.

Since ∥ · ∥ is a norm on V,

⟨x, x⟩ = 0 ⇐⇒ ∥x∥² = 0 ⇐⇒ ∥x∥ = 0 ⇐⇒ x = 0_V.

In addition,

⟨x, 0_V⟩ = (1/2)(∥x + 0_V∥² − ∥x∥² − ∥0_V∥²) = (1/2)(∥x∥² − ∥x∥² − 0) = 0.
• For every x, y ∈ V, it is clear that

⟨x, y⟩ − ⟨y, x⟩ = (1/2)((∥x + y∥² − ∥x∥² − ∥y∥²) − (∥y + x∥² − ∥y∥² − ∥x∥²)) = 0,
so ⟨x, y⟩ = ⟨y, x⟩ holds for sure. Together with identity proven in the last item, we have ⟨0V , x⟩ = ⟨x, 0V ⟩ = 0 for all x ∈ V .
• Let x, x′, y ∈ V be arbitrary. Then

4(⟨x, y⟩ + ⟨x′, y⟩) = 2((∥x + y∥² − ∥x∥² − ∥y∥²) + (∥x′ + y∥² − ∥x′∥² − ∥y∥²))
= 2(∥x + y∥² + ∥x′ + y∥²) − 2(∥x∥² + ∥x′∥²) − 4∥y∥²
=(∗) (∥x + x′ + 2y∥² + ∥x − x′∥²) − (∥x + x′∥² + ∥x − x′∥²) − 4∥y∥²
= ∥x + x′ + 2y∥² − ∥x + x′∥² − 4∥y∥²
=(∗) (2(∥x + x′ + y∥² + ∥y∥²) − ∥x + x′∥²) − ∥x + x′∥² − 4∥y∥²
= 2∥x + x′ + y∥² − 2∥x + x′∥² − 2∥y∥² = 4⟨x + x′, y⟩.

Here the =(∗)'s follow from the parallelogram identity. Consequently, we have ⟨x, y⟩ + ⟨x′, y⟩ = ⟨x + x′, y⟩, proving the additivity on the first component.
The proof for the homogeneity is slightly involved: Let x, y ∈ V be arbitrary.
• First, we claim that ⟨bx, y⟩ = b⟨x, y⟩ for every b ∈ Q. Let us consider integer scalars temporarily. The case for non-negative integers can be handled by induction: Observe that ⟨0 · x, y⟩ = ⟨0_V, y⟩ = 0, and for every non-negative integer n, by additivity,

⟨(n + 1)x, y⟩ = ⟨nx + x, y⟩ = ⟨nx, y⟩ + ⟨x, y⟩.

Furthermore, since

⟨x, y⟩ + ⟨−x, y⟩ = ⟨x + (−x), y⟩ = ⟨0_V, y⟩ = 0,

we indeed have ⟨−x, y⟩ = −⟨x, y⟩. Consequently, for n > 0,

⟨−nx, y⟩ = −⟨nx, y⟩ = −n⟨x, y⟩.

The desired identity thus follows for all integers. Furthermore, given any b = m/n ∈ Q with n > 0, since

n⟨(m/n)x, y⟩ = ⟨mx, y⟩ = m⟨x, y⟩,

we also have ⟨bx, y⟩ = (m/n)⟨x, y⟩ = b⟨x, y⟩.
• Next, we claim that ⟨x, y⟩² ≤ ∥x∥²∥y∥²: There is nothing to prove if x = 0_V, so assume x ≠ 0_V. For every t ∈ Q, by rational homogeneity and the definition of ⟨·, ·⟩,

p(t) := ∥tx + y∥² = ∥tx∥² + ∥y∥² + 2⟨tx, y⟩ = t²∥x∥² + 2t⟨x, y⟩ + ∥y∥².

Since p(t) = ∥tx + y∥² ≥ 0 for all t ∈ Q (and hence, by continuity in t, for all t ∈ R) and its leading coefficient is ∥x∥² > 0, we must have ∆ = 4⟨x, y⟩² − 4∥x∥²∥y∥² ≤ 0. Thus, it is clear that ⟨x, y⟩² ≤ ∥x∥²∥y∥², hence the desired inequality follows.
Finally, let a ∈ R be arbitrary. Then for every b ∈ Q, since ⟨bx, y⟩ = b⟨x, y⟩, it follows that

|⟨ax, y⟩ − a⟨x, y⟩| = |⟨(a − b)x, y⟩ − (a − b)⟨x, y⟩| ≤ |⟨(a − b)x, y⟩| + |a − b||⟨x, y⟩|
≤ ∥(a − b)x∥∥y∥ + |a − b|∥x∥∥y∥ = 2|a − b|∥x∥∥y∥.

Then for every ε > 0, because Q is dense in R, we can always let b ∈ Q be such that

|a − b| < ε/(1 ∨ 2∥x∥∥y∥),

and hence

|⟨ax, y⟩ − a⟨x, y⟩| ≤ 2|a − b|∥x∥∥y∥ < ε.

Since ε > 0 is arbitrary, we indeed have ⟨ax, y⟩ = a⟨x, y⟩, proving that ⟨·, ·⟩ is an inner product on V.
Case 2. When F = C, we define

⟨·, ·⟩ : V × V → C : (x, y) ↦ (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²) + (i/2)(∥x + iy∥² − ∥x∥² − ∥y∥²). (64)
First, by regarding V as a real linear space, it follows from Case 1 that the following map

⟨·, ·⟩′ : V × V → R : (x, y) ↦ (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²)

is an inner product on V with ∥x∥² = ⟨x, x⟩′. We then verify that ⟨x, iy⟩′ = −⟨ix, y⟩′ for all x, y ∈ V: Observe that ∥ix∥ = |i|∥x∥ = ∥x∥, ∥iy∥ = |i|∥y∥ = ∥y∥, and also

∥ix + y∥ = ∥i(x − iy)∥ = |i|∥x − iy∥ = ∥x − iy∥.
Therefore,

⟨x, iy⟩′ + ⟨ix, y⟩′ = (1/2)(∥x + iy∥² − ∥x∥² − ∥iy∥²) + (1/2)(∥ix + y∥² − ∥ix∥² − ∥y∥²)
= (1/2)(∥x + iy∥² − ∥x∥² − ∥iy∥² + ∥x − iy∥² − ∥x∥² − ∥iy∥²)
= (1/2)(∥x + iy∥² + ∥x − iy∥²) − (∥x∥² + ∥iy∥²) = 0,
where the last equality follows from the parallelogram identity. As a result, by Theorem 2.4, the map ⟨·, ·⟩ : V × V → C, in which for every x, y ∈ V,

⟨x, y⟩ := ⟨x, y⟩′ + i⟨x, iy⟩′ = (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²) + (i/2)(∥x + iy∥² − ∥x∥² − ∥iy∥²)
= (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²) + (i/2)(∥x + iy∥² − ∥x∥² − ∥y∥²),

is an inner product on V over C such that ⟨x, x⟩ = ⟨x, x⟩′ = ∥x∥² for all x ∈ V. ■
Remark 2.12. The definition for the inner product when F = R is inspired by Theorem 2.10: In this case, for every x, y ∈ V, we have

∥x + y∥² = ∥x∥² + ∥y∥² + 2⟨x, y⟩,

so that ⟨x, y⟩ = (1/2)(∥x + y∥² − ∥x∥² − ∥y∥²).
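The real-case construction (63) can be illustrated numerically (a sketch, not from the note): applied to the Euclidean norm, which does satisfy the parallelogram identity, the polarization formula recovers the usual dot product.

```python
def norm2(v):
    # Euclidean norm on R^n, which satisfies the parallelogram identity
    return sum(c * c for c in v) ** 0.5

def ip_from_norm(x, y):
    # real polarization: (1/2)(||x + y||^2 - ||x||^2 - ||y||^2)
    s = [a + b for a, b in zip(x, y)]
    return 0.5 * (norm2(s) ** 2 - norm2(x) ** 2 - norm2(y) ** 2)

x, y = [1.0, 2.0, -3.0], [4.0, 0.5, 2.0]
dot = sum(a * b for a, b in zip(x, y))
print(ip_from_norm(x, y), dot)  # agree up to floating-point error
```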
Definition 2.5. Let V be a semi-inner product space.
1. Two elements x, y ∈ V are called orthogonal, denoted by x ⊥ y, if ⟨x, y⟩ = 0.
2. Two nonempty subsets A, B ⊆ V are called orthogonal, denoted by A ⊥ B, if x ⊥ y for all x ∈ A and y ∈ B. When A = {x} is a singleton, we shall write x ⊥ B in place of {x} ⊥ B.
Theorem 2.21. Let V be a semi-inner product space.
1. For every x, y ∈ V,

x ⊥ y ⇐⇒ y ⊥ x =⇒ ∀a, b ∈ F : ax ⊥ by. (65)

2. For every x ∈ V,

x = 0_V ⇐⇒ ∀y ∈ V : x ⊥ y. (66)

3. For every x, y ∈ V, if ⟨x, z⟩ = ⟨y, z⟩ for all z ∈ V, then x = y.
4. For every x ∈ V and family (xi)i∈I in V, if x ⊥ xi for all i ∈ I, then x ⊥ Span(xi | i ∈ I) as well.
Proof. 1. Let x, y ∈ V. Since ⟨y, x⟩ = \overline{⟨x, y⟩} and \overline{0} = 0, we can see that

x ⊥ y ⇐⇒ ⟨x, y⟩ = 0 ⇐⇒ ⟨y, x⟩ = 0 ⇐⇒ y ⊥ x.
Furthermore, if x ⊥ y, then for every a, b ∈ F,

⟨ax, by⟩ = a\overline{b}⟨x, y⟩ = 0,

hence ax ⊥ by as well. Furthermore, by Theorem 2.10, we have

∥x ± y∥² = ∥x∥² + ∥y∥² ± 2ℜ⟨x, y⟩.

Therefore,

∥x + y∥ = ∥x − y∥ ⇐⇒ ℜ⟨x, y⟩ = 0 ⇐= ⟨x, y⟩ = 0,

where the last converse is true when F = R.
2. Let x ∈ V be arbitrary. By Theorem 2.1, we have ⟨0V , x⟩ = 0 whence 0V ⊥ x. Furthermore, if x ⊥ y for all y ∈ V , by putting y := x,
we see that ⟨x, x⟩ = 0 whence x = 0V in this case.
3. Let x, y ∈ V. Suppose that ⟨x, z⟩ = ⟨y, z⟩ for all z ∈ V. Then for each z ∈ V, we have ⟨x − y, z⟩ = ⟨x, z⟩ − ⟨y, z⟩ = 0. That is, (x − y) ⊥ z for all z ∈ V, hence by 2, we have x − y = 0_V, namely x = y.
4. Let x ∈ V and (xi)i∈I be a family of elements in V. Suppose that x ⊥ xi for all i ∈ I. First, if I = ∅, then Span(∅) = {0_V}, which is certainly orthogonal to x. Therefore, we may assume that I is also nonempty in this case. Then for every y ∈ Span(xi | i ∈ I), suppose that

y = c_{i_1}x_{i_1} + · · · + c_{i_k}x_{i_k} for some i1, . . . , ik ∈ I and c_{i_1}, . . . , c_{i_k} ∈ F.

Then ⟨x, y⟩ = ∑_{j=1}^{k} \overline{c_{i_j}}⟨x, x_{i_j}⟩ = 0, hence x ⊥ y. Since y ∈ Span(xi | i ∈ I) is arbitrary, it follows that x ⊥ Span(xi | i ∈ I). ■
Remark 2.13. Nevertheless, “being orthogonal to” is not transitive, namely if x ⊥ y and y ⊥ z, we do not neces-
sarily have x ⊥ z:
Consider x = [1, 0, 0]T , y = [0, 0, 1]T , and z = [1, 1, 0]T in R3 . It is clear that ⟨x, y⟩ = ⟨y, z⟩ = 0, but ⟨x, z⟩ = 1 ̸= 0.
Definition 2.6. Let V be an inner product space. A nonempty subset S of nonzero elements in V is called orthogonal if x ⊥ y for every distinct elements x, y ∈ S. If furthermore ∥x∥ = 1 for all x ∈ S, such set S is said to be orthonormal.
• In practice, the condition of a family (xi)i∈I of nonzero elements being orthonormal is rephrased as follows: For every i, j ∈ I,

⟨xi, xj⟩ = δ_{i,j} := 1 if i = j, and 0 if i ≠ j. (67)
• By the generalized Pythagorean theorem (cf. Corollary 2.11), if x1 , . . . , xn ∈ V are orthogonal, for every
c1 , . . . , cn ∈ F,
∥c1 x1 + · · · + cn xn ∥2 = |c1 |2 ∥x1 ∥2 + · · · + |cn |2 ∥xn ∥2 . (68)
Theorem 2.22. Every orthogonal subset of an inner product space is linearly independent.
43
Proof. Let V be an inner product space, S ⊆ V be orthogonal, and x1, . . . , xn ∈ S be distinct elements. Suppose that 0_V = c1x1 + · · · + cnxn for some c1, . . . , cn ∈ F. Then fix an arbitrary k = 1, . . . , n. As we can see,

0 = ⟨0_V, xk⟩ = ⟨∑_{j=1}^{n} cjxj, xk⟩ = ∑_{j=1}^{n} cj⟨xj, xk⟩ = ck∥xk∥².

Since xk ≠ 0_V, we must have ∥xk∥ > 0 whence ck = 0. Therefore, we indeed have c1 = · · · = cn = 0, implying that x1, . . . , xn are linearly independent. Since x1, . . . , xn are arbitrary, the set S is linearly independent as well. ■
Definition 2.7. Let V be an inner product space and u ∈ V be nonzero. For every x ∈ V, its orthogonal projection onto u is defined as

Proj_u(x) := (⟨x, u⟩/∥u∥²)u = ⟨x, û⟩û, (70)

where û := ∥u∥^{−1}u is the normalization of u.
Theorem 2.23 (Orthogonal Projection). Let V be an inner product space and u ∈ V be nonzero.
1. For every x ∈ V and λ ∈ F, Proj_u(x − λu) = Proj_u(x) − λu and Proj_u(λu) = λu; moreover, Proj_{λu} = Proj_u whenever λ is nonzero.
2. For every x ∈ V, the element x_{⊥u} := x − Proj_u(x) is orthogonal to u with (x − λu)_{⊥u} = x_{⊥u} for all λ ∈ F, and

∥x∥ ≥ ∥x_{⊥u}∥ = inf_{λ∈F} ∥x − λu∥. (73)
In particular, the last infimum is attained if and only if λ = ⟨x, u⟩/∥u∥2 , namely the element Proju (x) is
the unique element in the subspace Span(u) closest to x.
Proof. 1. For every λ ∈ F,

Proj_u(x − λu) = (⟨x − λu, u⟩/∥u∥²)u = ((⟨x, u⟩ − λ∥u∥²)/∥u∥²)u = Proj_u(x) − λu.

Consequently,

Proj_u(λu) = Proj_u(λu − λu) + λu = Proj_u(0_V) + λu = 0_V + λu = λu.

Furthermore, when λ is nonzero,

Proj_{λu}(x) = (⟨x, λu⟩/∥λu∥²)(λu) = (\overline{λ}λ⟨x, u⟩/(|λ|²∥u∥²))u = (⟨x, u⟩/∥u∥²)u = Proj_u(x).
2. Clearly,

⟨x_{⊥u}, u⟩ = ⟨x, u⟩ − (⟨x, u⟩/∥u∥²)⟨u, u⟩ = 0,

so x_{⊥u} ⊥ u holds for sure. As a result, we also have x_{⊥u} ⊥ Proj_u(x), so by the Pythagorean theorem,

∥x∥² = ∥x_{⊥u}∥² + ∥Proj_u(x)∥² ≥ ∥x_{⊥u}∥².

Finally, for every λ ∈ F, by 1, (x − λu)_{⊥u} = x_{⊥u}. As a result,

∥x − λu∥² = ∥x_{⊥u}∥² + ∥Proj_u(x) − λu∥² ≥ ∥x_{⊥u}∥²,

with equality precisely when λu = Proj_u(x), i.e., λ = ⟨x, u⟩/∥u∥². Furthermore, now we can see that the equality ∥x∥ = ∥x_{⊥u}∥ is attained if and only if Proj_u(x) = 0_V, namely x ⊥ u. ■
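The orthogonal projection and its defining property can be illustrated numerically on C² (a sketch, not from the note): the residual x − Proj_u(x) is orthogonal to u.

```python
def inner(x, y):
    # standard inner product on C^n: linear in x, conjugate-linear in y
    return sum(a * b.conjugate() for a, b in zip(x, y))

def proj(x, u):
    # Proj_u(x) = (<x, u>/||u||^2) u
    c = inner(x, u) / inner(u, u)
    return [c * a for a in u]

x, u = [3 + 1j, -2j], [1 + 0j, 1j]
p = proj(x, u)
r = [a - b for a, b in zip(x, p)]   # the residual x_{⊥u}
print(abs(inner(r, u)))             # ~0: residual is orthogonal to u
```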
Corollary 2.24. Let V be an inner product space and u ∈ V be a unit element, namely ∥u∥ = 1. Then for every
x, y ∈ V ,
⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩ = ⟨x − Proju (x), y − Proju (y)⟩, (74)
so for every λ , µ ∈ F,
|⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩| ≤ ∥x − λ u∥∥y − µu∥. (75)
Proof. Since ∥u∥ = 1 now, we have Proj_u(x) = ⟨x, u⟩u and Proj_u(y) = ⟨y, u⟩u. Therefore,

⟨x − Proj_u(x), y − Proj_u(y)⟩ = ⟨x, y⟩ − \overline{⟨y, u⟩}⟨x, u⟩ − ⟨x, u⟩⟨u, y⟩ + ⟨x, u⟩\overline{⟨y, u⟩}∥u∥² = ⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩.

Furthermore,
|⟨x, y⟩ − ⟨x, u⟩⟨u, y⟩| = |⟨x − Proju (x), y − Proju (y)⟩| ≤ ∥x − Proju (x)∥∥y − Proju (y)∥
≤ ∥x − λ u∥∥y − µu∥,
for every λ , µ ∈ F. ■
Theorem 2.25 (Criterion for Orthogonality). Let V be an inner product space. Then for every x, y ∈ V , we have
x ⊥ y if and only if ∥x∥ ≤ ∥x − cy∥ for all c ∈ F.
Proof. Let x, y ∈ V. First, if x ⊥ y, for every c ∈ F, by the Pythagorean theorem,

∥x − cy∥² = ∥x∥² + |c|²∥y∥² ≥ ∥x∥².
Conversely, suppose that ∥x∥ ≤ ∥x − cy∥ for all c ∈ F. There is nothing to prove if y = 0_V, so we may assume that y is nonzero. In this case, consider

z := x − Proj_y(x) = x − (⟨x, y⟩/∥y∥²)y.
Here by Theorem 2.23, we have z ⊥ y. Then by the Pythagorean theorem and our hypothesis (applied with c := ⟨x, y⟩/∥y∥²),

∥z∥² + ∥Proj_y(x)∥² = ∥x∥² ≤ ∥x − cy∥² = ∥z∥².

Consequently, we must have ∥Proj_y(x)∥ = 0 whence Proj_y(x) = 0_V. In this case, we see that x = z, hence x ⊥ y holds for sure. ■
Alternative Proof of Sufficiency. Suppose that ∥x∥ ≤ ∥x − cy∥ for all c ∈ F. Consequently, for every c ∈ F,

∥x∥² ≤ ∥x − cy∥² = ∥x∥² − 2ℜ(\overline{c}⟨x, y⟩) + |c|²∥y∥²,

or equivalently,

ℜ(\overline{c}⟨x, y⟩) ≤ |c|²∥y∥²/2.

Assume, to the contrary, that ⟨x, y⟩ ≠ 0. Then we may put c := t⟨x, y⟩/|⟨x, y⟩| for some t > 0, so we have |c| = t and

ℜ(\overline{c}⟨x, y⟩) = ℜ((t\overline{⟨x, y⟩}/|⟨x, y⟩|)⟨x, y⟩) = ℜ(t|⟨x, y⟩|²/|⟨x, y⟩|) = t|⟨x, y⟩|.

Therefore, t|⟨x, y⟩| ≤ t²∥y∥²/2, i.e., |⟨x, y⟩| ≤ t∥y∥²/2 for every t > 0. Letting t → 0⁺ then forces ⟨x, y⟩ = 0, a contradiction. ■
Definition 2.8. Let V be an inner product space. Then for every A ⊆ V, its orthogonal complement is defined as

A^⊥ := {x ∈ V | ∀y ∈ A : ⟨x, y⟩ = 0}.

Theorem 2.27. Let V be an inner product space with A ⊆ V. Then its orthogonal complement A^⊥ is a closed subspace of V, while also

A^⊥ = Span(A)^⊥ = \overline{Span(A)}^⊥ and A ⊆ (A^⊥)^⊥. (79)
Proof. First, let x1 , x2 ∈ A⊥ and c ∈ F be arbitrary. Then for every y ∈ A,
so cx1 + x2 ∈ A⊥ as well. This shows that A⊥ is a subspace of V . Furthermore, let (xn )n∈N be a sequence in A⊥ converging to some x ∈ V .
Note that the inner product is continuous, so
⟨x, y⟩ = lim_{n→∞} ⟨x_n, y⟩ = lim_{n→∞} 0 = 0.
Therefore, we also have x ∈ \overline{Span(A)}^⊥, completing the proof. ■
Remark 2.17. Consequently, it suffices to discuss the orthogonal complement of a subspace in the future.
Theorem 2.28. Let V be an inner product space with subspace W . Then for every x ∈ V ,
x ∈ W ⊥ ⇐⇒ ∀y ∈ W : ∥x − y∥ ≥ ∥x∥. (80)
Proof. First, suppose that x ∈ W⊥. Then for every y ∈ W, since x ⊥ y, the Pythagorean theorem gives ∥x − y∥² = ∥x∥² + ∥y∥² ≥ ∥x∥².
Conversely, suppose that ∥x − y∥ ≥ ∥x∥ for all y ∈ W . Fix one arbitrary y ∈ W . Then for every λ ∈ F, since λ y ∈ W , we shall have
∥x∥ ≤ ∥x − λ y∥. Thus, by Theorem 2.25, we can see that y ⊥ x. Since y ∈ W is arbitrary, it follows that x ⊥ W whence x ∈ W ⊥ . ■
Theorem 2.29 (Projection onto Finite Orthonormal Families). Let V be an inner product space and x_1, . . . , x_n ∈ V be orthonormal with W := Span(x_1, . . . , x_n). Then for every x ∈ V, the unique closest point to x in W is
Proj_W(x) := ⟨x, x_1⟩x_1 + · · · + ⟨x, x_n⟩x_n, (81)
where
x⊥W := x − Proj_W(x) ∈ W⊥ and ∥Proj_W(x)∥² = |⟨x, x_1⟩|² + · · · + |⟨x, x_n⟩|², (82)
with
d(x, W)² = ∥x − Proj_W(x)∥² = ∥x∥² − ∥Proj_W(x)∥² = ∥x∥² − ∑_{k=1}^{n} |⟨x, x_k⟩|².
In particular,
∑_{k=1}^{n} |⟨x, x_k⟩|² ≤ ∥x∥², (83)
in which the equality is attained if and only if x = Proj_W(x), which is also equivalent to x ∈ W.
Proof. Let y := ⟨x, x1 ⟩x1 + · · · + ⟨x, xn ⟩xn ∈ W . First, because x1 , . . . , xn are orthonormal, it is clear that
Clearly, here the equality is attained if and only if x − y = 0V , namely x = y. We then show that x = y if and only if x ∈ W :
• If x = y, since y ∈ W , it is certain that x ∈ W as well in this case.
• Conversely, suppose that x ∈ W . Then we can write x = α1 x1 + · · · + αn xn for some α1 , . . . , αn ∈ F. For each k = 1, . . . , n,
⟨x, x_k⟩ = ⟨∑_{j=1}^{n} α_j x_j, x_k⟩ = ∑_{j=1}^{n} α_j ⟨x_j, x_k⟩ = α_k · 1 = α_k.
As we can see, the equality is attained if and only if ∥y − z∥ = 0, namely y = z. The proof is complete. ■
Remark 2.18. The above theorem generalizes the orthogonal projection onto a single nonzero element (cf.
Theorem 2.23).
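The identities of Theorem 2.29 can be sanity-checked in ℝ⁴. A minimal sketch (NumPy assumed; the orthonormal pair and the vector x are hypothetical choices):

```python
import numpy as np

# Orthonormal pair x1, x2 in R^4, spanning a concrete W = Span(x1, x2).
x1 = np.array([1.0, 0.0, 0.0, 0.0])
x2 = np.array([0.0, 1.0, 0.0, 0.0])
x = np.array([3.0, -1.0, 2.0, 2.0])

coeffs = [x @ x1, x @ x2]                   # the coefficients <x, x_k>
proj = coeffs[0] * x1 + coeffs[1] * x2      # Proj_W(x)
dist2 = np.linalg.norm(x - proj) ** 2       # d(x, W)^2

bessel = sum(c**2 for c in coeffs)          # sum of |<x, x_k>|^2
print(bessel <= np.linalg.norm(x)**2)       # True: Bessel's inequality (83)
print(np.isclose(bessel + dist2, np.linalg.norm(x)**2))  # True: the distance formula
```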
Corollary 2.30 (Bessel’s Inequality, Infinite Version). Let V be an inner product space and let (en )n∈N be an
orthonormal sequence of elements in V . Then for every x ∈ V ,
∑_{n=1}^{∞} |⟨x, e_n⟩|² ≤ ∥x∥². (84)
Furthermore, the orthonormal sequence (en )n∈N converges weakly to 0V but is not strongly convergent.
Proof. By Bessel's inequality (83), we have
∥x∥² ≥ ∑_{k=1}^{n} |⟨x, e_k⟩|², ∀n ∈ ℕ.
Consequently, the non-negative series ∑_{n=1}^{∞} |⟨x, e_n⟩|² is convergent, implying that ⟨x, e_n⟩ → 0 = ⟨x, 0_V⟩ as n → ∞. In conclusion, we have e_n ⇀ 0_V weakly. Meanwhile, assume, to the contrary, that e_n → x under the derived norm for some x ∈ V. Then by Theorem 4.5, we have e_n ⇀ x and 1 = ∥e_n∥ → ∥x∥, namely ∥x∥ = 1. As a result, for every y ∈ V, we see that ⟨e_n, y⟩ → 0 = ⟨x, y⟩. By Theorem 2.21, we must have x = 0_V, contrary to the fact that ∥x∥ = 1. ■
Theorem 2.31 (Gram-Schmidt Process). Let V be an inner product space and (x_n)_{n∈ℕ} be a sequence of linearly independent elements in V. Define recursively
y_1 := x_1 and y_{n+1} := x_{n+1} − ∑_{k=1}^{n} ⟨x_{n+1}, e_k⟩e_k, where e_n := y_n/∥y_n∥.
Then the sequence (y_n)_{n∈ℕ} is orthogonal and (e_n)_{n∈ℕ} is orthonormal. Furthermore, for every n ∈ ℕ,
Span(e_1, . . . , e_n) = Span(y_1, . . . , y_n) = Span(x_1, . . . , x_n).
Proof. First, because (x_n)_{n∈ℕ} is linearly independent, each x_n is certainly nonzero. We then prove, by induction on n, that y_1, . . . , y_n are orthogonal and span the same subspace as x_1, . . . , x_n, in which case e_1, . . . , e_n are necessarily orthonormal:
• Since x1 = y1 , it is obvious that Span(x1 ) = Span(y1 ). Furthermore, since e1 is a scalar multiple of y1 , we have Span(e1 ) = Span(y1 )
as well.
• Now suppose that the assertions hold for some n ≥ 1. By the preceding theorem, we see that
y_{n+1} = x_{n+1} − ∑_{k=1}^{n} ⟨x_{n+1}, e_k⟩e_k ∈ Span(e_1, . . . , e_n)⊥ = Span(y_1, . . . , y_n)⊥ = Span(x_1, . . . , x_n)⊥,
so y_1, . . . , y_n, y_{n+1} ∈ Span(x_1, . . . , x_n, x_{n+1}). Note that dim(Span(x_1, . . . , x_n, x_{n+1})) = n + 1, while y_1, . . . , y_n, y_{n+1}, being orthogonal, are also linearly independent. Then y_1, . . . , y_n, y_{n+1} must also be a base of Span(x_1, . . . , x_n, x_{n+1}), namely
Span(y_1, . . . , y_{n+1}) = Span(x_1, . . . , x_{n+1}).
Finally, because e1 , . . . , en+1 are the respective normalizations of y1 , . . . , yn+1 , we have Span(e1 , . . . , en+1 ) = Span(y1 , . . . , yn+1 ) as
well. ■
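The recursion of Theorem 2.31 translates directly into code. A sketch for the real case (NumPy assumed; `gram_schmidt` is an illustrative helper, not a library routine):

```python
import numpy as np

def gram_schmidt(xs):
    """Classical Gram-Schmidt over R: returns orthonormal e_1, e_2, ..."""
    es = []
    for x in xs:
        # Subtract the projections onto the previously built e_k's.
        y = x - sum((x @ e) * e for e in es)
        norm = np.linalg.norm(y)
        if norm < 1e-12:  # linear dependence would force y_k = 0 (cf. Remark 2.19)
            raise ValueError("input sequence is linearly dependent")
        es.append(y / norm)
    return es

xs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
es = gram_schmidt(xs)
E = np.stack(es)
print(np.allclose(E @ E.T, np.eye(3)))  # True: e_1, e_2, e_3 are orthonormal
```

For F = ℂ, the inner products `x @ e` would be replaced by conjugate-linear ones.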
Remark 2.19. Here are some special cases for the Gram-Schmidt process:
• If (xn )n∈N is already orthogonal, by easy induction, we certainly have yn = xn for all n ∈ N.
• Instead, if (x_n)_{n∈ℕ} is linearly dependent, we shall get y_k = 0_V at some k: Let k be the smallest index such that x_1, . . . , x_{k−1} are linearly independent but x_1, . . . , x_k are not. Then again by Theorem 2.29, we must have y_k = 0_V in that case.
Definition 2.9. Let V be a finite-dimensional inner product space. An orthonormal base of V is a base in which
the elements are orthonormal.
Remark 2.20. Note that every orthonormal subset is necessarily linearly independent, hence it becomes an
orthonormal base if and only if it is a spanning set of V .
Corollary 2.32. Let V be a nonzero finite-dimensional inner product space. Then every orthonormal subset of
V can be extended to an orthonormal base. In particular, the space V possesses an orthonormal base.
Proof. Let x1 , . . . , xk be orthonormal elements in V .
• If dim(V ) = k, since x1 , . . . , xk are already linearly independent, they certainly constitute an orthonormal base of V .
• Suppose that dim(V ) > k. Then we can find xk+1 , . . . , xn ∈ V such that {x1 , . . . , xk , xk+1 , . . . , xn } is a base of V . Then we apply
Gram-Schmidt process to such set, which yields an orthonormal subset {e1 , . . . , en }. Now
Span(e1 , . . . , en ) = Span(x1 , . . . , xn ) = V,
so {e1 , . . . , en } is a base. Furthermore, since x1 , . . . , xk are orthonormal, we must have e j = x j for all j = 1, . . . , k.
Finally, since V is nonzero, we can fix an arbitrary unit element in it, which is certainly orthonormal. Then by our discussions above, such
unit element extends to an orthonormal base of V , as desired. ■
Theorem 2.33 (Finite-Dimensional Spaces). Let V be a finite-dimensional inner product space with orthonor-
mal base {e1 , . . . , en }.
2. (Parseval’s Identity). Let β = (e1 , . . . , en ) be the corresponding ordered base of V . Then for every x, y ∈ V ,
[x]_β = [⟨x, e_1⟩, . . . , ⟨x, e_n⟩]^T and ⟨x, y⟩ = ∑_{k=1}^{n} ⟨x, e_k⟩⟨e_k, y⟩ = ⟨[x]_β, [y]_β⟩_{F^n}, (89)
3. (Matrix Representation). Let W be another finite-dimensional inner product space with orthonormal ordered base γ = (v_1, . . . , v_m). Then for every linear map T : V → W, its matrix representation under β, γ is given by
γ[T]_β = [⟨Te_j, v_i⟩] =
⎡ ⟨Te_1, v_1⟩  ⟨Te_2, v_1⟩  · · ·  ⟨Te_n, v_1⟩ ⎤
⎢ ⟨Te_1, v_2⟩  ⟨Te_2, v_2⟩  · · ·  ⟨Te_n, v_2⟩ ⎥
⎢      ⋮            ⋮         ⋱         ⋮     ⎥
⎣ ⟨Te_1, v_m⟩  ⟨Te_2, v_m⟩  · · ·  ⟨Te_n, v_m⟩ ⎦ . (90)
Proof. Let x ∈ V be arbitrary. Such an expansion is already clear from Theorem 2.29, as the projection of x onto V is x itself. We also repeat the straightforward argument as follows: Write x = c_1e_1 + · · · + c_ne_n. Then for each k = 1, . . . , n,
⟨x, e_k⟩ = ⟨∑_{j=1}^{n} c_j e_j, e_k⟩ = ∑_{j=1}^{n} c_j ⟨e_j, e_k⟩ = c_k · 1 = c_k.
Consider the ordered base β = (e_1, . . . , e_n). It is then clear that [x]_β = [⟨x, e_1⟩, . . . , ⟨x, e_n⟩]^T. Finally, let W be another finite-dimensional inner product space with orthonormal ordered base γ = (v_1, . . . , v_m), and let T : V → W be a linear map. Then for each j = 1, . . . , n, as noted above, the coordinate vector of Te_j under γ is [⟨Te_j, v_1⟩, . . . , ⟨Te_j, v_m⟩]^T. Therefore, the j-th column of γ[T]_β is exactly [⟨Te_j, v_i⟩]_{i=1}^{m}, which yields the matrix displayed in (90). ■
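Parseval's identity from Theorem 2.33 can be verified with any concrete orthonormal base of ℝ³; here one is obtained from a rotation (a NumPy sketch with arbitrary sample vectors):

```python
import numpy as np

# An orthonormal base of R^3 coming from a rotation by theta about the z-axis.
theta = 0.3
e1 = np.array([np.cos(theta), np.sin(theta), 0.0])
e2 = np.array([-np.sin(theta), np.cos(theta), 0.0])
e3 = np.array([0.0, 0.0, 1.0])

x = np.array([2.0, -1.0, 3.0])
y = np.array([0.5, 4.0, -2.0])

cx = np.array([x @ e1, x @ e2, x @ e3])   # [x]_beta
cy = np.array([y @ e1, y @ e2, y @ e3])   # [y]_beta

# Parseval: <x, y> equals the Euclidean inner product of the coordinate vectors.
print(np.isclose(x @ y, cx @ cy))  # True
```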
Theorem 2.34 (Closest Point Property). Let V be an inner product space and W be a finite-dimensional sub-
space. Then for every x ∈ V, there is a unique y ∈ W closest to x, namely ∥x − y∥ = d(x, W). Furthermore, we have
(W ⊥ )⊥ = W and V = W ⊕W ⊥ . (92)
Proof. Let x ∈ V be arbitrary. Since W is finite-dimensional, it contains an orthonormal base {e_1, . . . , e_m}. Then by Theorem 2.29, the element
y := Proj_W(x) = ⟨x, e_1⟩e_1 + · · · + ⟨x, e_m⟩e_m
is precisely the unique element in W such that ∥x − y∥ = d(x,W ). Furthermore, we also have x − y ∈ W ⊥ , so the decomposition x =
y + (x − y) ∈ W +W ⊥ tells us that V = W ⊕W ⊥ .
• Meanwhile, since W and W⊥ are both subspaces of V, we have 0_V ∈ W ∩ W⊥. On the other hand, every element of W ∩ W⊥ is orthogonal to itself, hence zero, so W ∩ W⊥ = {0_V}. Therefore, we indeed have V = W ⊕ W⊥, as desired.
• Finally, it is clear from Theorem 2.27 that W ⊆ (W ⊥ )⊥ . Now let x ∈ (W ⊥ )⊥ . Then ProjW (x) ∈ W ⊆ (W ⊥ )⊥ , so x − ProjW (x) ∈
(W ⊥ )⊥ as well. However, since x − ProjW (x) ∈ W ⊥ as well, we then have x − ProjW (x) = 0V , namely x = ProjW (x) ∈ W . The
equality (W ⊥ )⊥ = W thus holds as well. ■
Alternative Proof of the Closest Point. The existence of such a point y is also ensured by Theorem 1.32. It remains to prove its uniqueness: If x ∈ W, then we have d(x, W) = 0. In this case, the only y ∈ W satisfying ∥x − y∥ = 0 = d(x, W) is x itself. Consequently, we
shall assume that x ∈ / W then. Let y1 , y2 ∈ W be closest points in W to x and λ ∈ (0, 1). As noted in Theorem 1.32, the point λ y1 + (1 − λ )y2
is also closest to x, namely
As we can see, if c = 0, then we have (1 − λ )(x − y2 ) = 0V . Since λ ∈ (0, 1), it follows that x − y2 = 0V , namely x = y2 ∈ W , a contradiction.
Therefore, we should have c > 0 in this case. Note that the above identity is equivalent to
(1 − (c + 1)λ )x = (1 − λ )y2 − cλ y1 .
Definition 2.10. Let V be an inner product space and x_1, . . . , x_n ∈ V. The Gram matrix of x_1, . . . , x_n is defined as
G(x_1, . . . , x_n) :=
⎡ ⟨x_1, x_1⟩  ⟨x_2, x_1⟩  · · ·  ⟨x_n, x_1⟩ ⎤
⎢ ⟨x_1, x_2⟩  ⟨x_2, x_2⟩  · · ·  ⟨x_n, x_2⟩ ⎥
⎢      ⋮           ⋮         ⋱        ⋮    ⎥
⎣ ⟨x_1, x_n⟩  ⟨x_2, x_n⟩  · · ·  ⟨x_n, x_n⟩ ⎦ . (93)
Theorem 2.35. Let V be an inner product space. For every x1 , . . . , xn ∈ V , their Gram matrix G(x1 , . . . , xn ) is
positive semi-definite, which is positive definite if and only if x1 , . . . , xn are linearly independent.
Proof. Let x_1, . . . , x_n ∈ V be arbitrary with Gram matrix G := G(x_1, . . . , x_n). Clearly, for every i, j = 1, . . . , n, we have ⟨x_i, x_j⟩ = \overline{⟨x_j, x_i⟩}, so G is certainly Hermitian. Furthermore, for every z = [c_1, . . . , c_n]^T ∈ F^n,
z*Gz = ∑_{i,j} \overline{c_i} ⟨x_j, x_i⟩ c_j = ⟨∑_{j=1}^{n} c_j x_j, ∑_{i=1}^{n} c_i x_i⟩ = ∥∑_{i=1}^{n} c_i x_i∥² ≥ 0,
so G is positive semi-definite. Consequently,
z*Gz = 0 ⟺ ∥c_1x_1 + · · · + c_nx_n∥ = 0 ⟺ c_1x_1 + · · · + c_nx_n = 0_V,
so G is positive definite if and only if x_1, . . . , x_n are linearly independent. ■
In particular, for n = 2, the positive semi-definiteness of G(x, y) gives
0 ≤ det(G(x, y)) = ∥x∥²∥y∥² − ⟨y, x⟩⟨x, y⟩ = ∥x∥²∥y∥² − |⟨x, y⟩|²,
which recovers the Cauchy–Schwarz inequality.
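Theorem 2.35 can be illustrated numerically: the Gram matrix of random vectors has non-negative eigenvalues, and the 2 × 2 determinant inequality is Cauchy–Schwarz (a NumPy sketch; the random data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
xs = rng.standard_normal((3, 5))   # three vectors in R^5 (independent almost surely)
G = xs @ xs.T                      # real Gram matrix: G[i, j] = <x_i, x_j>
eigs = np.linalg.eigvalsh(G)
print(np.all(eigs >= -1e-10))      # True: positive semi-definite
print(np.all(eigs > 0))            # True here: the rows are independent

# 2x2 case recovers Cauchy-Schwarz: det G(x, y) >= 0.
x, y = xs[0], xs[1]
print((x @ x) * (y @ y) - (x @ y)**2 >= 0)  # True
```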
Theorem 2.36 (Closest Point and Gram Matrix). Let V be an inner product space and x_1, . . . , x_n ∈ V be linearly independent. Then for every x ∈ V, the unique closest point to x in the n-dimensional subspace W := Span(x_1, . . . , x_n) is given by y := c_1x_1 + · · · + c_nx_n, where
G(x_1, . . . , x_n) [c_1, c_2, . . . , c_n]^T = [⟨x, x_1⟩, ⟨x, x_2⟩, . . . , ⟨x, x_n⟩]^T. (94)
Furthermore,
d(x, W)² = ∥x − y∥² = det(G(x_1, . . . , x_n, x)) / det(G(x_1, . . . , x_n)). (95)
Proof. Let x ∈ V be arbitrary and y ∈ W be the unique closest point to x (cf. Theorem 2.29). Write y = c1 x1 + · · · + cn xn for some
c1 , . . . , cn ∈ F. Note that x − y ∈ W ⊥ , so for each k = 1, . . . , n, we have
0 = ⟨x − y, x_k⟩ = ⟨x, x_k⟩ − ⟨y, x_k⟩ = ⟨x, x_k⟩ − ⟨∑_{i=1}^{n} c_i x_i, x_k⟩ = ⟨x, x_k⟩ − ∑_{i=1}^{n} c_i ⟨x_i, x_k⟩.
In other words,
⟨x, xk ⟩ = c1 ⟨x1 , xk ⟩ + · · · + cn ⟨xn , xk ⟩.
Therefore, the vector [c_1, . . . , c_n]^T ∈ F^n is precisely a solution to the linear system described above. Now because x_1, . . . , x_n are linearly independent, the Gram matrix G(x_1, . . . , x_n) is positive definite, hence non-singular. As a result, the linear system is consistent with a
unique solution. Finally, since y ⊥ (x − y), we see that
d(x,W )2 = ∥x − y∥2 = ⟨x − y, x − y⟩ = ⟨x − y, x⟩ − ⟨x − y, y⟩
= ⟨x − y, x⟩ = ⟨x, x⟩ − ⟨y, x⟩.
Consequently,
⟨x, x⟩ = ⟨y, x⟩ + d(x,W )2 = c1 ⟨x1 , x⟩ + c2 ⟨x2 , x⟩ + · · · + cn ⟨xn , x⟩ + d(x,W )2 .
Together with the equations obtained above, we have the following consistent (n + 1) × (n + 1) linear system in the unknowns c_1, . . . , c_n and d(x, W)²:
⟨x_1, x_1⟩c_1 + · · · + ⟨x_n, x_1⟩c_n + 0 · d(x, W)² = ⟨x, x_1⟩
⋮
⟨x_1, x_n⟩c_1 + · · · + ⟨x_n, x_n⟩c_n + 0 · d(x, W)² = ⟨x, x_n⟩
⟨x_1, x⟩c_1 + · · · + ⟨x_n, x⟩c_n + 1 · d(x, W)² = ⟨x, x⟩.
Expanding along its last column, the determinant of the coefficient matrix is precisely det(G(x_1, . . . , x_n)) > 0. Then by Cramer's rule, solving for the last unknown d(x, W)² amounts to replacing the last column of the coefficient matrix by the right-hand side, and the resulting matrix is exactly G(x_1, . . . , x_n, x). Therefore,
d(x, W)² = det(G(x_1, . . . , x_n, x)) / det(G(x_1, . . . , x_n)). ■
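The linear system (94) and the determinant formula (95) are straightforward to verify numerically (a NumPy sketch with hypothetical vectors x₁, x₂, x):

```python
import numpy as np

x1 = np.array([1.0, 0.0, 1.0, 0.0])
x2 = np.array([0.0, 1.0, 1.0, 0.0])   # independent, but not orthogonal
x = np.array([1.0, 2.0, 3.0, 4.0])

X = np.stack([x1, x2])
G = X @ X.T                           # Gram matrix of x1, x2
b = X @ x                             # right-hand side <x, x_k>
c = np.linalg.solve(G, b)             # coefficients of the closest point (94)
y = c @ X                             # y = c1 x1 + c2 x2

# x - y should be orthogonal to W = Span(x1, x2):
print(np.allclose(X @ (x - y), 0))    # True

# Distance formula (95) via Gram determinants:
Gx = np.vstack([X, x]) @ np.vstack([X, x]).T
d2 = np.linalg.det(Gx) / np.linalg.det(G)
print(np.isclose(d2, np.linalg.norm(x - y)**2))  # True
```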
Remark 2.22. In particular, the setup of the linear system (94) does not require x_1, . . . , x_n to be linearly independent. In other words, such a consistent linear system always exists provided x_1, . . . , x_n span W, though it may then have infinitely many solutions. Nonetheless, all of those solutions only correspond to different coefficient choices for x_1, . . . , x_n and result in essentially the same y ∈ W.
Theorem 2.37 (Closest Point Property for Convex Sets). Let V be a Hilbert space and C ⊆ V be a nonempty closed convex subset. Then for every x ∈ V, there is a unique y ∈ C such that ∥x − y∥ = d(x, C).
Proof. Suppose that C is closed and convex, and let x ∈ V . First, if x ∈ C, we may simply put y := x, which certainly satisfies ∥x − y∥ = 0 =
d(x,C). On the other hand, suppose that x ∈/ C. Since C is closed in V , we must have d := d(x,C) > 0. Let (yn )n∈N be a sequence in C such
that ∥x − yn ∥ → d as n → ∞. In particular, we must have ∥x − yn ∥ ≥ d for all n, as d = d(x,C) is the infimum among all of them. We then
claim that (y_n)_{n∈ℕ} is also a Cauchy sequence in V: For every m, n ∈ ℕ, by the parallelogram identity,
∥y_m − y_n∥² = 2∥x − y_m∥² + 2∥x − y_n∥² − 4∥x − (y_m + y_n)/2∥².
Since C is convex, we have z := (y_m + y_n)/2 ∈ C as well. Consequently, we see that ∥z − x∥ ≥ d, whence
∥y_m − y_n∥² ≤ 2∥x − y_m∥² + 2∥x − y_n∥² − 4d².
For every ε > 0, since ∥x − y_n∥ → d, there exists N ∈ ℕ such that ∥x − y_n∥² < d² + (ε²/4) for all n ≥ N. Consequently, when m, n ≥ N,
∥y_m − y_n∥² < 2(d² + ε²/4) + 2(d² + ε²/4) − 4d² = ε²,
so ∥y_m − y_n∥ < ε holds in this case. Therefore, the sequence (y_n)_{n∈ℕ} is also a Cauchy sequence in V.
Note that V is a Hilbert space, so there exists y ∈ V such that yn → y as n → ∞. As we can see, because C is closed, we must have y ∈ C
in this case. Furthermore, by the continuity of the norm, we have
∥x − y∥ = lim_{n→∞} ∥x − y_n∥ = d.
The existence of such y is thus clear to us. As for its uniqueness, let y′ ∈ C be another point such that ∥x − y′∥ = d(x, C). Again, putting z′ := (y + y′)/2 ∈ C and arguing with the parallelogram identity as above, we have
∥y − y′∥² = 2∥x − y∥² + 2∥x − y′∥² − 4∥x − z′∥² ≤ 2d² + 2d² − 4d² = 0,
so y′ = y, as desired. ■
Remark 2.23. In particular, every closed subspace of V is also convex, so the above theorem applies to every
closed subspace of a Hilbert space.
Definition 2.11. Let V be a Hilbert space with closed subspace W .
1. For every x ∈ V , the unique element y ∈ W closest to x is called the orthogonal projection of x onto W .
2. Furthermore, the map PW : V → V , in which PW x is the orthogonal projection of x onto W for all x ∈ V , is
called the orthogonal projection of V onto W .
Remark 2.24. By Theorem 2.34, one can see that the above definition generalizes the orthogonal projection
onto a finite-dimensional subspace, as every finite-dimensional subspace is always closed (cf. Corollary 1.32).
Nevertheless, though the original case works for all inner product spaces, the above definition is restricted to
Hilbert spaces only.
Theorem 2.38. Let V be a Hilbert space with closed subspace W . For every x, p ∈ V , the following statements
are equivalent:
2. Here p ∈ W and x − p ∈ W ⊥ .
In particular, we have
(W ⊥ )⊥ = W and V = W ⊕W ⊥ . (97)
By Theorem 2.25, we indeed have (x − p) ⊥ y in this case. Since y ∈ W is arbitrary, it follows that x − p ∈ W ⊥ as well.
(2 =⇒ 1, 3). Suppose that p ∈ W and x − p ∈ W ⊥ . Let y ∈ W be arbitrary. Now p − y ∈ W , so (x − p) ⊥ (p − y). By Pythagorean
theorem,
∥x − y∥2 = ∥(x − p) + (p − y)∥2 = ∥x − p∥2 + ∥p − y∥2 ≥ ∥x − p∥2 .
Therefore, p is precisely the orthogonal projection of x onto W .
• Now the decomposition x = p + (x − p) shows that V = W + W⊥. Note that every element of W ∩ W⊥ is orthogonal to itself, hence W ∩ W⊥ = {0_V}. Consequently, we certainly have V = W ⊕ W⊥.
• Furthermore, assume that x ∈ (W ⊥ )⊥ in this case. Since p ∈ W ⊆ (W ⊥ )⊥ , we see that x − p ∈ (W ⊥ )⊥ as well. Note that x − p ∈ W ⊥ ,
hence x − p = 0V holds for sure, namely x = p ∈ W . This shows that (W ⊥ )⊥ = W then. In addition, if W ⊥ = {0V }, then we must
have W = (W ⊥ )⊥ = {0V }⊥ = V . In other words, if W ̸= V , then W ⊥ ̸= {0V }.
In addition, applying (2 =⇒ 1) to W⊥, one can see that x − p is also the orthogonal projection of x onto W⊥. Of course, one may also argue directly: Let z ∈ W⊥ be arbitrary. Since (x − p) − z ∈ W⊥, we have (x − p − z) ⊥ p, whence
Corollary 2.39. Let V be a Hilbert space. Then for every A ⊆ V, we have (A⊥)⊥ = \overline{Span(A)}. In particular, a subspace W of V is dense if and only if W⊥ = {0_V}.
Proof. Let A ⊆ V be arbitrary. Then by Theorem 2.27, we have A⊥ = \overline{Span(A)}^⊥. Since \overline{Span(A)} is closed in V, by the preceding theorem, we have
\overline{Span(A)} = (\overline{Span(A)}^⊥)⊥ = (A⊥)⊥. ■
Finally, given any dense subspace W of V, we have W⊥ = \overline{W}^⊥ = V⊥ = {0_V}.
Lemma 2.40. Let V be a Hilbert space and (ei )i∈I be a family of orthonormal elements. For every x ∈ V , the
set of index i’s where ⟨x, ei ⟩ ̸= 0 is at most countable.
Proof. Fix an arbitrary x ∈ V. Observe that
{i ∈ I | ⟨x, e_i⟩ ≠ 0} = ⋃_{n=1}^{∞} N_n(x), where N_n(x) := {i ∈ I | |⟨x, e_i⟩|² > 1/n}, n ∈ ℕ.
As we can see, each set N_n(x) must be finite: If not, given any positive integer k, we can fix an (nk)-element subset N_{n,k}(x) of N_n(x). By Bessel's inequality,
∥x∥² ≥ ∑_{i ∈ N_{n,k}(x)} |⟨x, e_i⟩|² > nk · (1/n) = k.
Since k is arbitrary, we must have ∥x∥ = ∞, which is absurd. Now that each N_n(x) is finite, the collection of those indices i at which ⟨x, e_i⟩ ≠ 0 is indeed at most countable. ■
Definition 2.12. Let V be a Hilbert space and (en )n∈N be an orthonormal sequence in V . Then for every x ∈ V ,
the generalized Fourier series of x with respect to (en )n∈N is given by
x ∼ ∑_{n=1}^{∞} ⟨x, e_n⟩e_n, (99)
where each ⟨x, e_n⟩ is called the generalized Fourier coefficient of x under (e_n)_{n∈ℕ}.
Lemma 2.41. Let V be a Hilbert space and (en )n∈N be an orthonormal sequence in V . Then for every sequence
(cn )n∈N of scalars,
∑_{n=1}^{∞} c_ne_n is convergent ⟺ ∑_{n=1}^{∞} |c_n|² < ∞, (100)
in which case,
∥∑_{n=1}^{∞} c_ne_n∥² = ∑_{n=1}^{∞} |c_n|². (101)
Proof. Note that for all m > l, by the Pythagorean theorem, ∥∑_{n=l+1}^{m} c_ne_n∥² = ∑_{n=l+1}^{m} |c_n|², so the partial sums of one series form a Cauchy sequence exactly when those of the other do.
• If the series ∑_{n=1}^{∞} c_ne_n converges, then the partial sums of ∑_{n=1}^{∞} |c_n|² form a Cauchy sequence in ℝ, hence the series ∑_{n=1}^{∞} |c_n|² is convergent as well.
• Conversely, if ∑_{n=1}^{∞} |c_n|² < ∞, then the partial sums of the series ∑_{n=1}^{∞} c_ne_n in V also form a Cauchy sequence. Since V is a Hilbert space, such series must be convergent.
Finally, suppose that ∑_{n=1}^{∞} c_ne_n converges. Then by the continuity of norms and the generalized Pythagorean theorem,
∥∑_{n=1}^{∞} c_ne_n∥² = lim_{m→∞} ∥∑_{n=1}^{m} c_ne_n∥² = lim_{m→∞} ∑_{n=1}^{m} |c_n|² = ∑_{n=1}^{∞} |c_n|². ■
Remark 2.25. For x ∈ V, though its associated generalized Fourier series ∑_{n=1}^{∞} ⟨x, e_n⟩e_n could be convergent, such sum is not necessarily equal to x: Consider V = L²[−π, π] with φ_n(t) = sin(nt)/√π for all n ∈ ℕ. Here (φ_n)_{n∈ℕ} is orthonormal in V, but for f(t) = cos(t), we have
⟨f, φ_n⟩ = ∫_{−π}^{π} cos(t) · (sin(nt)/√π) dt = 0,
so
∑_{n=1}^{∞} ⟨f, φ_n⟩φ_n = ∑_{n=1}^{∞} 0 · φ_n = 0 ≠ f.
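The vanishing coefficients in the example of Remark 2.25 can be confirmed by numerical integration (a Riemann-sum sketch assuming NumPy; the grid size is an arbitrary choice):

```python
import numpy as np

# Approximate <cos, phi_n> for phi_n(t) = sin(nt)/sqrt(pi) on [-pi, pi].
t = np.linspace(-np.pi, np.pi, 200001)
dt = t[1] - t[0]
f = np.cos(t)
coeffs = []
for n in (1, 2, 3, 5):
    phi = np.sin(n * t) / np.sqrt(np.pi)
    coeffs.append(np.sum(f * phi) * dt)    # odd integrand on a symmetric grid
print(all(abs(c) < 1e-8 for c in coeffs))  # True: every sine coefficient of cos vanishes
```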
Theorem 2.42 (Bessel's Inequality in Hilbert Space). Let V be a Hilbert space and (e_n)_{n∈ℕ} be an orthonormal sequence of elements in V. Then for every x ∈ V, the closest point to x in the closed subspace W := \overline{Span(e_n | n ∈ ℕ)} is precisely
Proj_W(x) := ∑_{n=1}^{∞} ⟨x, e_n⟩e_n ∈ W, (102)
where
∥Proj_W(x)∥² = ∑_{n=1}^{∞} |⟨x, e_n⟩|² ≤ ∥x∥². (103)
Here the last inequality is an equality if and only if x = Proj_W(x), which is also equivalent to x ∈ W.
Proof. Let x ∈ V be arbitrary. As noted in the infinite version of Bessel's inequality, we see that
∑_{n=1}^{∞} |⟨x, e_n⟩|² ≤ ∥x∥² < ∞,
so by Lemma 2.41, the series ∑_{n=1}^{∞} ⟨x, e_n⟩e_n converges. We then let y be its sum, which, being the limit of elements of Span(e_n | n ∈ ℕ), is clearly an element of W = \overline{Span(e_n | n ∈ ℕ)}. Furthermore, we claim that x − y ∈ W⊥: By Theorem 2.21, we see that W⊥ = {e_n | n ∈ ℕ}⊥. For each k ∈ ℕ, by the continuity of inner products,
⟨x − y, e_k⟩ = ⟨x, e_k⟩ − ⟨y, e_k⟩ = ⟨x, e_k⟩ − ⟨∑_{n=1}^{∞} ⟨x, e_n⟩e_n, e_k⟩ = ⟨x, e_k⟩ − ∑_{n=1}^{∞} ⟨x, e_n⟩⟨e_n, e_k⟩ = ⟨x, e_k⟩ − ⟨x, e_k⟩ = 0.
Therefore, we indeed have x − y ∈ W⊥ in this case. By Theorem 2.38, we see that y is precisely the orthogonal projection of x onto W. Finally, since y ⊥ (x − y), by the Pythagorean theorem, we have
∥x∥² = ∥x − y∥² + ∥y∥² = ∥x − y∥² + ∑_{n=1}^{∞} |⟨x, e_n⟩|².
In particular, the equality in (103) is attained if and only if ∥x − y∥ = 0, namely x = y ∈ W. ■
Definition 2.13. Let V be a Hilbert space. An orthonormal sequence (e_n)_{n∈ℕ} is called complete, or an orthonormal base of V, if for every x ∈ V,
x = ∑_{n=1}^{∞} ⟨x, e_n⟩e_n. (104)
Theorem 2.43. Let V be a Hilbert space and (en )n∈N be an orthonormal sequence in V . Then the following
statements are equivalent:
2. For every x ∈ V, there exists a unique family (c_n)_{n∈ℕ} of scalars such that
x = ∑_{n=1}^{∞} c_ne_n. (105)
(4 =⇒ 3). Suppose that the orthogonal complement of Span(e_n | n ∈ ℕ) is zero. Let W := \overline{Span(e_n | n ∈ ℕ)}. Again, by Theorem 2.27, we have
W⊥ = \overline{Span(e_n | n ∈ ℕ)}^⊥ = Span(e_n | n ∈ ℕ)⊥ = {0_V}.
Since W is closed in V and V is a Hilbert space, by Theorem 2.38, we have W = (W⊥)⊥ = {0_V}⊥ = V. ■
Corollary 2.44. Let V be a Hilbert space and (en )n∈N be an orthonormal sequence in V . Then the following
statements are equivalent:
3. (Plancherel's Identity). For every x ∈ V,
∥x∥² = ∑_{n=1}^{∞} |⟨x, e_n⟩|². (107)
Proof. (1 =⇒ 2). Suppose that (e_n)_{n∈ℕ} is complete. For every x, y ∈ V, now because x = ∑_{n=1}^{∞} ⟨x, e_n⟩e_n, by the continuity of inner products, we have
⟨x, y⟩ = ⟨∑_{n=1}^{∞} ⟨x, e_n⟩e_n, y⟩ = ∑_{n=1}^{∞} ⟨x, e_n⟩⟨e_n, y⟩.
(2 =⇒ 3). Put y := x in the formula.
(3 =⇒ 1). Suppose that ∥x∥² = ∑_{n=1}^{∞} |⟨x, e_n⟩|² for every x ∈ V. Let W := \overline{Span(e_n | n ∈ ℕ)}. For each x ∈ V, by Theorem 2.42, we can see that x ∈ W as well. This shows that V = W = \overline{Span(e_n | n ∈ ℕ)}, so by the preceding theorem, the orthonormal sequence (e_n)_{n∈ℕ} is complete. ■
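In a finite-dimensional model, Plancherel's identity (107) holds for any orthonormal base; the sketch below builds one in ℂ³ from a QR factorization (NumPy assumed; the data are arbitrary):

```python
import numpy as np

# Orthonormal base of C^3: the columns of a unitary Q from a random complex matrix.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(A)                 # columns of Q are orthonormal in C^3
x = np.array([1.0 + 2.0j, -1.0j, 3.0])

coeffs = Q.conj().T @ x                # <x, e_n> for each column e_n
print(np.isclose(np.sum(np.abs(coeffs)**2), np.linalg.norm(x)**2))  # True
```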
Theorem 2.45. All orthonormal bases of a nonzero Hilbert space are of the same cardinality.
Proof. Let V be a nonzero Hilbert space. If V has a finite orthonormal base, then it must be of finite dimension. Correspondingly, all
orthonormal bases have the same cardinality as the dimension of V .
Thus, we may assume that V has an infinite orthonormal base (e_i)_{i∈I}. Let (e′_j)_{j∈J} be another orthonormal base of V. As discussed above, the set J is also infinite. For each i ∈ I, we may let
N_i := {j ∈ J | ⟨e_i, e′_j⟩ ≠ 0}.
Because (e′_j)_{j∈J} is an orthonormal base, the set N_i must be nonempty, and it is at most countable by Lemma 2.40. Meanwhile, for each j ∈ J, since (e_i)_{i∈I} is also an orthonormal base, we can find some i ∈ I such that j ∈ N_i. Therefore, we have J = ⋃_{i∈I} N_i, so
|J| = |⋃_{i∈I} N_i| ≤ ∑_{i∈I} |N_i| ≤ ℵ₀ · |I| = |I|.
By symmetric arguments, we have |I| ≤ |J| as well. Therefore, the equality |I| = |J| holds in this case. ■
Definition 2.14. Let V be a Banach space. A Schauder base, or a countable base, is a sequence (en )n∈N of
elements in V such that for every x ∈ V , there is a unique sequence (cn )n∈N of scalars in F such that
∞
x= ∑ cn en . (108)
n=1
Such series is called the Schauder expansion of x, in which the scalars cn ’s are called the coordinates of x with
respect to the Schauder base.
• The ordering of elements in a Schauder base is important, because reordering the terms of a series may destroy its convergence, even in a Banach space. That is why a Schauder base is defined as a sequence rather than an arbitrary indexed family.
Theorem 2.46. Every Schauder base in a Banach space is linearly independent and also a topological base.
Proof. Let V be a Banach space with Schauder base (e_n)_{n∈ℕ}. Consider an almost-null family (c_n)_{n∈ℕ} of scalars, namely c_n = 0 for all but finitely many n's, by which
0_V = ∑_{n=1}^{∞} c_ne_n.
Then by the uniqueness of the Schauder expansion, we must have c_n = 0 for all n ∈ ℕ, so the family (e_n)_{n∈ℕ} is linearly independent. Furthermore, for each x ∈ V, its Schauder expansion tells us that x is a limit point of Span(e_n | n ∈ ℕ). Therefore, the spanned subspace Span(e_n | n ∈ ℕ) is dense in V, implying that (e_n)_{n∈ℕ} is a topological base of V as well. ■
2.6 Examples of Inner Products
Example 2.5 (Frobenius Inner Product). Let V := M_{m,n}(F). The Frobenius inner product on M_{m,n}(F) is defined as follows: For every A = [a_{ij}], B = [b_{ij}] ∈ M_{m,n}(F), let
⟨A, B⟩_F := tr(B*A) = ∑_{j=1}^{n} ∑_{i=1}^{m} \overline{b_{ij}} a_{ij} = ∑_{i,j} a_{ij}\overline{b_{ij}}.
As we can see,
tr(A*A) = ∑_{i,j} a_{ij}\overline{a_{ij}} = ∑_{i,j} |a_{ij}|² ≥ 0, and ⟨A, B⟩_F = tr(B*A) = tr((A*B)*) = \overline{tr(A*B)} = \overline{⟨B, A⟩_F}.
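The trace and entrywise formulas for the Frobenius inner product, together with its conjugate symmetry, can be checked directly (a NumPy sketch with random complex matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))
B = rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))

frob = np.trace(B.conj().T @ A)           # tr(B* A)
entrywise = np.sum(A * B.conj())          # sum_{i,j} a_ij conj(b_ij)
print(np.isclose(frob, entrywise))        # True
print(np.isclose(frob, np.conj(np.trace(A.conj().T @ B))))  # True: conjugate symmetry
```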
3 Bounded Linear Maps and Continuous Dual Spaces
3.1 Bounded Linear Maps
Definition 3.1. Let X,Y be normed linear spaces. A linear map T : X → Y is called bounded if there exists a
positive constant M ∈ R such that ∥T x∥ ≤ M∥x∥ for all x ∈ X.
Theorem 3.1 (Characterization of Bounded Linear Maps). Let X,Y be normed linear spaces and T : X → Y be
a linear map. Then the following statements are equivalent:
or equivalently, ∥Ty∥ < (2/δ)∥y∥. Therefore, the linear map T : X → Y is bounded, with constant M := 2/δ > 0. ■
Corollary 3.2. Let X,Y be normed linear spaces. For every bounded linear map T : X → Y , its kernel/null
space is closed in X.
Proof. Let T : X → Y be a bounded linear map, and let (xn )n∈N be a sequence in ker(T ) converging to some x ∈ X. Then by the continuity
of T , we see that T xn → T x as well. Now for each n ∈ N, we have T xn = 0Y because xn ∈ ker(T ). This shows that T x = 0Y as well, hence
x ∈ ker(T ), showing that ker(T ) is closed in X. ■
Theorem 3.3. Let X,Y be normed linear spaces. The set B(X,Y ) of all bounded linear maps from X to Y is a
subspace of L (X,Y ).
Proof. Clearly, the zero map is bounded, as ∥0x∥ = ∥0_Y∥ = 0 ≤ M∥x∥ for all M > 0. Furthermore, let T, T′ ∈ L(X, Y) be bounded and c ∈ F. Then there exist M, M′ > 0 such that ∥Tx∥ ≤ M∥x∥ and ∥T′x∥ ≤ M′∥x∥ for all x ∈ X. Put c′ := |c| ∨ 1 > 0. Then given any x ∈ X,
∥(T + T′)x∥ = ∥Tx + T′x∥ ≤ ∥Tx∥ + ∥T′x∥ ≤ (M + M′)∥x∥,
and
∥(cT)x∥ = ∥c(Tx)∥ = |c|∥Tx∥ ≤ |c|M∥x∥ ≤ c′M∥x∥.
Since M + M′ > 0 and c′M > 0, the linear maps T + T′ and cT are bounded as well. The desired assertion thus follows. ■
Theorem 3.4. Let X,Y be normed linear spaces. If X is finite-dimensional, then every linear map from X to Y
is bounded, namely L (X,Y ) = B(X,Y ).
Proof. Let {e_1, . . . , e_n} be a base of X with coordinate forms e∨_1, . . . , e∨_n : X → F, and let T : X → Y be a linear map. As we can see, for each x ∈ X,
∥Tx∥ = ∥T(∑_{i=1}^{n} e∨_i(x)e_i)∥ = ∥∑_{i=1}^{n} e∨_i(x)T(e_i)∥ ≤ ∑_{i=1}^{n} |e∨_i(x)|∥T(e_i)∥ ≤ (∥T(e_1)∥ ∨ · · · ∨ ∥T(e_n)∥) ∑_{i=1}^{n} |e∨_i(x)|.
Then we may put M := ∥T(e_1)∥ ∨ · · · ∨ ∥T(e_n)∥ ∨ 1 > 0. By Lemma 1.27, there exists λ > 0 such that
∥x∥ = ∥∑_{i=1}^{n} e∨_i(x)e_i∥ ≥ λ ∑_{i=1}^{n} |e∨_i(x)|.
Consequently,
∥Tx∥ ≤ M ∑_{i=1}^{n} |e∨_i(x)| ≤ (M/λ)∥x∥.
Since M/λ > 0, we can conclude that T is bounded. ■
Definition 3.2. Let X, Y be normed linear spaces. An isomorphism from X to Y is a bounded linear isomorphism T : X → Y (namely, a bounded bijective linear map) whose inverse T⁻¹ : Y → X is also bounded.
Remark 3.1. Clearly, an isomorphism T between normed linear spaces is both a linear isomorphism (from the
aspect of linear space structure) and a homeomorphism (from the aspect of metric space structure). Thus,
• being isomorphic is an equivalence relation on any nonempty collection of normed linear spaces;
• the inverse of every isomorphism is also an isomorphism, as the inverse of a linear isomorphism (resp.
homeomorphism) is a linear isomorphism (resp. homeomorphism) as well.
Definition 3.3. Let X,Y be normed linear spaces. A linear map T : X → Y is called a linear isometry, or simply
an isometry, if ∥T x∥ = ∥x∥ for all x ∈ X.
Remark 3.2. In general, a map f : X → Y is called an isometry if ∥ f (x) − f (y)∥ = ∥x − y∥ for all x, y ∈ X. Here
for any linear map T : X → Y, we have ∥Tx − Ty∥ = ∥T(x − y)∥, so T is an isometry in this general sense if and only if ∥Tz∥ = ∥z∥ for all z ∈ X. This explains why linear isometries are defined in the above fashion.
Theorem 3.6. Let X, Y be inner product spaces. Then a linear map T : X → Y is an isometry if and only if ⟨Tx, Ty⟩ = ⟨x, y⟩ for all x, y ∈ X.
Proof. Let T : X → Y be a linear map. First, suppose that ⟨Tx, Ty⟩ = ⟨x, y⟩ for all x, y ∈ X. Then for every x ∈ X, we have ∥Tx∥² = ⟨Tx, Tx⟩ = ⟨x, x⟩ = ∥x∥², so T is an isometry. Conversely, suppose that T is an isometry. For every x, y ∈ X,
∥x + y∥² = ∥x∥² + ∥y∥² + 2ℜ⟨x, y⟩ and ∥x + iy∥² = ∥x∥² + ∥y∥² + 2ℑ⟨x, y⟩.
Applying the first identity to Tx, Ty and using ∥T(x + y)∥ = ∥x + y∥, we have ℜ⟨Tx, Ty⟩ = ℜ⟨x, y⟩ in this case. Likewise, by considering x + iy, we have ℑ⟨Tx, Ty⟩ = ℑ⟨x, y⟩ as well. Therefore, the equality ⟨Tx, Ty⟩ = ⟨x, y⟩ holds for sure in this case. ■
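Theorem 3.6 can be illustrated with an orthogonal matrix, the standard linear isometry of ℝⁿ (a NumPy sketch; the data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # orthogonal matrix: an isometry of R^4

x = rng.standard_normal(4)
y = rng.standard_normal(4)
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))  # True: norms preserved
print(np.isclose((Q @ x) @ (Q @ y), x @ y))                  # True: inner products preserved
```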
Definition 3.4. Two normed linear spaces X and Y are isometrically isomorphic if there exists an isomorphism
T : X → Y that is also a linear isometry.
Theorem 3.7. Let X, Y be normed linear spaces and T : X → Y be a linear map. If T is a linear isomorphism as well as an isometry, then both T itself and its inverse T⁻¹ : Y → X are isometric isomorphisms.
Proof. Suppose that T is both a linear isomorphism and an isometry. Then it suffices to show that T −1 : Y → X is also bounded. As we can
see, for every y ∈ Y ,
∥y∥ = ∥T (T −1 y)∥ = ∥T −1 y∥,
so T −1 is also an isometry whence bounded as well. In this case, we see that T is an isometric isomorphism, while T −1 , being a linear
isomorphism and an isometry, is an isometric isomorphism as well. ■
3.2 The Operator Norms
Definition 3.5. Let X,Y be normed linear spaces and T : X → Y be a linear map. The operator norm, or simply,
the norm of T is defined as
∥T ∥ := inf{M > 0 | ∀x ∈ X : ∥T x∥ ≤ M∥x∥}, (111)
Remark 3.3. Clearly, a linear map is bounded if and only if it has finite norm. That is,
Lemma 3.8. Let X,Y be normed linear spaces and T : X → Y be a linear map. Then for every x ∈ X,
∥T x∥ ≤ ∥T ∥∥x∥. (113)
Proof. Let x ∈ X be arbitrary. There is nothing to prove if ∥T∥ = ∞. Suppose that ∥T∥ < ∞ now. Then the set of admissible constants
{M > 0 | ∀x ∈ X : ∥Tx∥ ≤ M∥x∥}
is nonempty. As we can see, we have ∥Tx∥ ≤ M∥x∥ for every admissible M, hence ∥Tx∥ ≤ ∥T∥∥x∥ holds as well by taking the infimum. ■
Theorem 3.9. Let X,Y be normed linear spaces and T : X → Y be a linear map. Then
Proof. 1. We prove the identity ∥T∥ = sup_{∥x∥≤1} ∥Tx∥ first: If ∥T∥ = ∞, then T is unbounded. In other words, for every M > 0, there exists y ∈ X such that ∥Ty∥ > M∥y∥. Clearly, here y ≠ 0_X, for otherwise ∥Ty∥ = ∥0_Y∥ = 0 = M∥y∥, a contradiction. In this case, the normalization ŷ := ∥y∥⁻¹y satisfies ∥ŷ∥ = 1 and
∥Tŷ∥ = ∥Ty∥/∥y∥ > M∥y∥/∥y∥ = M.
Consequently, we have sup∥x∥≤1 ∥T x∥ = ∞ as well in this case. Next, suppose that ∥T ∥ < ∞ and denote by M := sup∥x∥≤1 ∥T x∥.
• For each x ∈ X with ∥x∥ ≤ 1, we have ∥T x∥ ≤ ∥T ∥∥x∥ ≤ ∥T ∥. Consequently, the inequality M ≤ ∥T ∥ < ∞ holds in this case.
• If ∥T ∥ = 0, the above inequality already implies that M = ∥T ∥ = 0; Otherwise, the domain X is nonzero. Then for every nonzero
x ∈ X with normalization x̂ := ∥x∥−1 x,
∥T x∥ = ∥x∥∥T x̂∥ ≤ M∥x∥,
hence by definition, the converse inequality ∥T ∥ ≤ M is true as well.
2. As a result, it is immediate that
∥T∥ = sup_{∥x∥≤1} ∥Tx∥ ≥ sup_{∥x∥<1} ∥Tx∥.
• Suppose that ∥T∥ = ∞. Then for every M > 0, there exists y ∈ X with ∥y∥ ≤ 1 such that ∥Ty∥ > 2M. In this case, observe that
∥y/2∥ = ∥y∥/2 < 1 and ∥T(y/2)∥ = ∥Ty∥/2 > 2M/2 = M,
so sup_{∥x∥<1} ∥Tx∥ = ∞ as well.
• Suppose instead that ∥T∥ < ∞, and let y ∈ X with ∥y∥ ≤ 1 be arbitrary. Take any sequence (r_n)_{n∈ℕ} in (0, 1) with r_n → 1, so that ∥r_ny∥ < 1 for every n. Since T(r_ny) → Ty in Y, it follows that ∥Ty∥ ≤ sup_{∥x∥<1} ∥Tx∥. By the arbitrariness of y, it follows that ∥T∥ ≤ sup_{∥x∥<1} ∥Tx∥.
3. Suppose that X is nonzero. The sets X \ {0_X} and {x | ∥x∥ = 1} are then both nonempty, hence the corresponding suprema are not −∞. It is also immediate that
∥T∥ = sup_{∥x∥≤1} ∥Tx∥ ≥ sup_{∥x∥=1} ∥Tx∥.
Furthermore, for each nonzero x ∈ X with ∥x∥ ≤ 1, since x̂ := ∥x∥⁻¹x is a unit element, it follows that
∥Tx∥ = ∥x∥∥Tx̂∥ ≤ sup_{∥z∥=1} ∥Tz∥,
hence the converse inequality ∥T∥ ≤ sup_{∥x∥=1} ∥Tx∥ holds as well. ■
Corollary 3.10. Let X, Y be normed linear spaces and T : X → Y be a linear map. If X is nonzero, then for every r > 0,
sup_{∥x∥≤r} ∥Tx∥ = sup_{∥x∥<r} ∥Tx∥ = sup_{∥x∥=r} ∥Tx∥ = r∥T∥.
Proof. Let r > 0 be arbitrary. Here we prove that sup_{∥x∥≤r} ∥Tx∥ = r∥T∥ only, as the others follow by similar arguments: Substituting x = rx′,
sup_{∥x∥≤r} ∥Tx∥ = sup_{∥x′∥≤1} ∥T(rx′)∥ = r sup_{∥x′∥≤1} ∥Tx′∥ = r∥T∥. ■
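For matrices with Euclidean norms, the supremum in Theorem 3.9 is the largest singular value; sampling the unit sphere approaches it from below (a NumPy sketch; the sample size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((3, 3))

# Sample the unit sphere and compare sup ||Tx|| with the exact operator norm,
# which for a matrix under Euclidean norms is its largest singular value.
xs = rng.standard_normal((100000, 3))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
sampled = np.max(np.linalg.norm(xs @ T.T, axis=1))
exact = np.linalg.svd(T, compute_uv=False)[0]
print(sampled <= exact + 1e-12)        # True: the supremum is never exceeded
print(exact - sampled < 0.05)          # True with high probability: sampling gets close
```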
Definition 3.6. Let X,Y be normed linear spaces. A linear map T : X → Y is called a contraction if there exists
α ∈ (0, 1) such that ∥T x∥ ≤ α∥x∥ for all x ∈ X.
Remark 3.4. Again, a general contraction is a function f : X → Y such that ∥f(x) − f(y)∥ ≤ α∥x − y∥ for some α ∈ (0, 1) and all x, y ∈ X. If T : X → Y is a linear contraction with constant α ∈ (0, 1), then ∥Tx − Ty∥ = ∥T(x − y)∥ ≤ α∥x − y∥ for all x, y ∈ X.
In other words, every linear contraction is also a contraction in general. In particular, for every y_0 ∈ Y, the affine map
T̂ : X → Y : x ↦ Tx + y_0 (116)
is a contraction in the general sense as well, since T̂x − T̂y = Tx − Ty.
Theorem 3.11 (Operator Norm of Linear Contractions). Let X,Y be normed linear spaces. A linear map
T : X → Y is a contraction if and only if ∥T ∥ < 1. In particular, every linear contraction is bounded.
Proof. Let T : X → Y be a linear map. Then
Theorem 3.12 (Banach’s Fixed Point Theorem). Let X be a Banach space. Then every contraction (not neces-
sarily linear) T : X → X has a unique fixed point x∗ ∈ X, namely T x∗ = x∗ . Furthermore, for every x0 ∈ X, we
always have
lim T n x0 = x∗ . (117)
n→∞
Proof. Let T : X → X be a contraction and α ∈ (0, 1) be such that ∥Tx − Ty∥ ≤ α∥x − y∥ for all x, y ∈ X. Next, we may select an arbitrary x0 ∈ X and put xn := T^n x0 for all n ∈ N. Then for each n, we have
∥x_{n+1} − x_n∥ = ∥Tx_n − Tx_{n−1}∥ ≤ α∥x_n − x_{n−1}∥ ≤ ··· ≤ α^n∥x_1 − x_0∥,
and consequently ∥x_m − x_n∥ ≤ Σ_{k=n}^{m−1} α^k∥x_1 − x_0∥ ≤ (α^n/(1 − α))∥x_1 − x_0∥ for all m > n.
Therefore, (xn )n∈N is a Cauchy sequence in X, hence is convergent with limit x∗ ∈ X. Being a Lipschitz function, the map T is continuous
on X, so we have
Tx∗ = T(lim_{n→∞} xn) = lim_{n→∞} Txn = lim_{n→∞} x_{n+1} = x∗.
Thus, the element x∗ is indeed a fixed point of T. Finally, suppose that y ∈ X is another fixed point of T. If y ≠ x∗, then
∥x∗ − y∥ = ∥Tx∗ − Ty∥ ≤ α∥x∗ − y∥ < ∥x∗ − y∥,
which is absurd. Therefore, the fixed point is unique. ■
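Banach's fixed point theorem is constructive, so the iteration x_{n+1} = T(x_n) can be run directly. Below is a minimal numerical sketch of my own (not from the note), using the contraction T(x) = cos(x)/2 on X = R with constant α = 1/2; the limit is the same from any starting point, as the theorem predicts.

```python
import math

def fixed_point(T, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = T(x_n) until successive terms agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_next = T(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")

# T is a contraction on R with alpha = 1/2, since |T'(x)| = |sin(x)|/2 <= 1/2.
T = lambda x: 0.5 * math.cos(x)

x_star = fixed_point(T, x0=3.0)
assert abs(T(x_star) - x_star) < 1e-10                 # T x* = x*
assert abs(fixed_point(T, x0=-7.0) - x_star) < 1e-10   # limit independent of x0
```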
Theorem 3.13 (Norm of Quotient Map). Let X be a normed linear space with subspace U. If U is closed and
U ̸= X, then ∥π∥ = 1 where π : X → X/U is the quotient map.
Proof. Suppose that U is a proper closed subspace of X, and let π : X → X/U be the quotient map.
• By Theorem 1.16, we have seen that ∥π∥ ≤ 1, as for every x ∈ X with ∥x∥ ≤ 1, we have ∥π(x)∥ ≤ ∥x∥ ≤ 1.
• Meanwhile, by Riesz's lemma (cf. Lemma 1.33), for every δ ∈ (0, 1), there exists a unit element x ∈ X such that δ ≤ d(x, U) = ∥π(x)∥ in X/U. Consequently, we also have ∥π∥ ≥ δ for all δ ∈ (0, 1), hence ∥π∥ ≥ 1 as well.
Therefore, we must have ∥π∥ = 1 in this case. ■
Theorem 3.14 (Norm of Complexification). Let X,Y be real inner product spaces and T : X → Y be a bounded
linear map. Then the complexification TC : XC → YC of T is also bounded with ∥TC ∥ = ∥T ∥. Furthermore, if T
is an isomorphism, so is TC .
Proof. 1. By Theorem 2.9, for every x, y ∈ X,
∥TC(x + iy)∥C² = ∥Tx∥² + ∥Ty∥² ≤ ∥T∥²(∥x∥² + ∥y∥²) = ∥T∥²∥x + iy∥C²,
so the inequality ∥TC(x + iy)∥C ≤ ∥T∥∥x + iy∥C holds. Consequently, we can conclude that TC is also bounded with ∥TC∥ ≤ ∥T∥. In
particular, for every x ∈ X,
∥T x∥ = ∥TC (x + i0)∥ ≤ ∥TC ∥∥x + i0∥C = ∥TC ∥∥x∥,
so the converse inequality ∥T ∥ ≤ ∥TC ∥ holds as well. Therefore, we certainly have ∥TC ∥ = ∥T ∥ in this case.
2. Suppose that T is an isomorphism. Then T −1 : Y → X is also a bounded linear map. By Theorem A.3, we see that TC is also a linear
isomorphism with inverse (T −1 )C . As noted above, the map (T −1 )C is also bounded, so TC is an isomorphism as well. ■
Theorem 3.15 (Factorization Theorem). Let X,Y be normed linear spaces and T : X → Y be a bounded linear
map. Then for every closed subspace U of X, if U ⊆ ker(T ), then there exists a unique bounded linear map
T̃ : X/U → Y such that
∀x ∈ X : T̃ x̄ = Tx, and ∥T̃∥ = ∥T∥. (118)
Proof. Define
T̃ : X/U → Y : x̄ ↦ Tx.
First, the map T̃ is well-defined: for all x, y ∈ X,
x̄ = ȳ ⟺ x − y ∈ U =⇒ T(x − y) = 0Y ⟺ Tx = Ty,
where the implication uses U ⊆ ker(T).
Moreover, for every x ∈ X and z ∈ U, ∥T̃ x̄∥ = ∥T(x + z)∥ ≤ ∥T∥∥x + z∥. By taking infimum over all z ∈ U, it follows that ∥T̃ x̄∥ ≤ ∥T∥∥x̄∥. Therefore, the linear map T̃ is also bounded with ∥T̃∥ ≤ ∥T∥. Finally, since ∥x̄∥ ≤ ∥x∥, it is clear that
∥Tx∥ = ∥T̃ x̄∥ ≤ ∥T̃∥∥x̄∥ ≤ ∥T̃∥∥x∥.
Then we also have ∥T∥ ≤ ∥T̃∥, so the identity ∥T̃∥ = ∥T∥ holds for sure.
Finally, we study the injectivity and surjectivity of T̃ as follows:
• For every y ∈ Y,
y ∈ im(T̃) ⟺ ∃x ∈ X : y = T̃ x̄ = Tx ⟺ y ∈ im(T).
Therefore, we have im(T̃) = im(T). In this case,
T̃ is surjective ⟺ im(T̃) = Y ⟺ im(T) = Y ⟺ T is surjective.
• For every x ∈ X,
x̄ ∈ ker(T̃) ⟺ 0Y = T̃ x̄ = Tx ⟺ x ∈ ker(T).
Consequently, we see that ker(T̃) = ker(T)/U, whence
T̃ is injective ⟺ ker(T̃) = {0_{X/U}} ⟺ ker(T) ⊆ U ⟺ ker(T) = U,
where the direction =⇒ in the last equivalence follows from the fact that U ⊆ ker(T). ■
Theorem 3.16. Let X,Y, Z be normed linear spaces, and let T : X → Y, T ′ : Y → Z be linear maps. Then
∥T ′ ◦ T ∥ ≤ ∥T ∥∥T ′ ∥. (119)
Proof. There is nothing to prove if ∥T∥ = ∞ or ∥T′∥ = ∞. Suppose now that ∥T∥ < ∞ and ∥T′∥ < ∞. Then there exist constants M, M′ > 0 such that ∥Tx∥ ≤ M∥x∥ for all x ∈ X and ∥T′y∥ ≤ M′∥y∥ for all y ∈ Y. Consequently, for each x ∈ X,
∥(T′ ∘ T)x∥ = ∥T′(Tx)∥ ≤ M′∥Tx∥ ≤ M′M∥x∥.
Since the above inequality holds for all such M, M′, it follows that ∥T′ ∘ T∥ ≤ ∥T′∥∥T∥ as well. ■
Theorem 3.17. Let X,Y be normed linear spaces. Then the operator norm ∥ · ∥ is a norm on B(X,Y ).
Proof. We verify the three axioms in turn.
• For every T ∈ B(X,Y) and c ∈ F,
∥cT∥ = sup_{∥x∥=1} ∥(cT)x∥ = sup_{∥x∥=1} ∥c(Tx)∥ = sup_{∥x∥=1} |c|∥Tx∥ = |c| sup_{∥x∥=1} ∥Tx∥ = |c|∥T∥.
• For every T, T′ ∈ B(X,Y),
∥T + T′∥ = sup_{∥x∥=1} ∥Tx + T′x∥ ≤ sup_{∥x∥=1} (∥Tx∥ + ∥T′x∥) ≤ sup_{∥x∥=1} ∥Tx∥ + sup_{∥x∥=1} ∥T′x∥ = ∥T∥ + ∥T′∥.
• Finally,
∥T∥ = 0 ⟺ ∀x ∈ X ∖ {0X} : ∥Tx∥/∥x∥ = 0 ⟺ ∀x ∈ X ∖ {0X} : ∥Tx∥ = 0 ⟺ ∀x ∈ X ∖ {0X} : Tx = 0Y ⟺ T = 0. ■
Theorem 3.18. Let X,Y be normed linear spaces. If Y is a Banach space, then B(X,Y ) is also a Banach space
under the operator norm.
Proof. So far we have seen that B(X,Y) is a normed linear space under the operator norm. Furthermore, suppose that Y is a Banach space. Let (Tn)n∈N be a Cauchy sequence in B(X,Y) under the operator norm, and x ∈ X be arbitrary. We claim that (Tn x)n∈N is a Cauchy sequence in Y:
• If x = 0X , we have Tn x = Tn 0X = 0Y for all n ∈ N, so the sequence (Tn x)n∈N is constant and certainly Cauchy as well.
• Suppose that x ≠ 0X. Let ε > 0 be arbitrary. Then there exists N ∈ N such that ∥Tm − Tn∥ < ε/∥x∥ for all m, n ≥ N. In this case, when m, n ≥ N,
∥Tm x − Tn x∥ = ∥(Tm − Tn)x∥ ≤ ∥Tm − Tn∥∥x∥ < (ε/∥x∥)·∥x∥ = ε.
This shows that the sequence (Tn x)n∈N is Cauchy as well.
Now since Y is a Banach space, there exists a unique y ∈ Y such that Tn x → y in Y as n → ∞. This allows us to define a map
T : X → Y : x 7→ lim Tn x.
n→∞
Here we show that T is linear: Let x, y ∈ X and c ∈ F be arbitrary. By the continuity of addition and scalar multiplication,
T(cx + y) = lim_{n→∞} Tn(cx + y) = c lim_{n→∞} Tn x + lim_{n→∞} Tn y = cTx + Ty.
Furthermore, the map T is also bounded: Note that (Tn)n∈N is a Cauchy sequence, so it is bounded with respect to the operator norm, say there is some M > 0 such that ∥Tn∥ ≤ M for all n ∈ N. Then for every x ∈ X, by the continuity of norms,
∥Tx∥ = lim_{n→∞} ∥Tn x∥ ≤ M∥x∥.
Finally, given ε > 0, pick N ∈ N with ∥Tm − Tn∥ ≤ ε for all m, n ≥ N; letting m → ∞ in ∥Tm x − Tn x∥ ≤ ε∥x∥ yields ∥T − Tn∥ ≤ ε for all n ≥ N. Therefore Tn → T in B(X,Y), as desired. ■
Definition 3.7. Let X, Y be normed linear spaces. A sequence (Tn)n∈N of bounded linear maps from X to Y is said to converge weakly to a linear map T ∈ B(X,Y) if Tn x → Tx in Y for all x ∈ X, as opposed to the strong convergence, namely ∥Tn − T∥ → 0 as n → ∞ under the operator norm ∥·∥.
Theorem 3.19. Let X,Y be normed linear spaces and (Tn )n∈N be a sequence in B(X,Y ). If Tn → T ∈ B(X,Y )
strongly, then (Tn )n∈N converges to T weakly as well.
Proof. Suppose that Tn → T strongly. Then for every x ∈ X,
∥Tn x − Tx∥ = ∥(Tn − T)x∥ ≤ ∥Tn − T∥∥x∥ → 0 as n → ∞,
hence Tn x → Tx in Y. ■
Theorem 3.20 (Extension Theorem). Let X be a normed linear space and U be a dense subspace of X. Then for every Banach space Y and bounded linear map T : U → Y, there is a unique extension T̄ ∈ B(X,Y) of T to X such that ∥T̄∥ = ∥T∥.
Proof. Let Y be a Banach space and T ∈ B(U,Y ) be a bounded linear map. Fix an arbitrary x ∈ X. Since U is dense in X, we can find a
sequence (xn )n∈N in U with xn → x. By Theorem 1.9, the sequence (xn )n∈N is Cauchy. Furthermore, since for every m, n ∈ N,
∥T xm − T xn ∥ = ∥T (xm − xn )∥ ≤ ∥T ∥∥xm − xn ∥,
the sequence (T xn)n∈N is Cauchy in Y as well. Note that Y is a Banach space, so the sequence (T xn)n∈N is convergent as well. Inspired by this, we define
T̄ : X → Y : x ↦ lim_{n→∞} T xn.
We then show that T̄ is a well-defined linear map on X extending T:
• First, we need to verify that T̄ is well-defined: Let (xn)n∈N, (x′n)n∈N be sequences in U converging to x. Then xn − x′n → x − x = 0X as n → ∞, hence by the continuity of T,
lim_{n→∞} T xn − lim_{n→∞} T x′n = lim_{n→∞} T(xn − x′n) = T(0X) = 0Y.
This shows that the limits of (T xn)n∈N and (T x′n)n∈N are identical, hence T̄ is well-defined. Furthermore, for each x ∈ U, by considering the constant sequence in U with all terms equal to x, it follows that T̄ x = T x as well.
• Let x, y ∈ X and c ∈ F. Fix arbitrary sequences (xn)n∈N and (yn)n∈N in U such that xn → x and yn → y. Then cxn + yn → cx + y, hence
T̄(cx + y) = lim_{n→∞} T(cxn + yn) = c lim_{n→∞} T xn + lim_{n→∞} T yn = cT̄ x + T̄ y.
Theorem 3.22 (Uniform Boundedness Theorem; Banach-Steinhaus). Let X be a Banach space and Y be a
normed linear space. Given any family F of bounded linear maps from X to Y , if supT ∈F ∥T x∥ < ∞ for all
x ∈ X, then supT ∈F ∥T ∥ < ∞ as well.
Proof (A. Sokal). Let F be a family of bounded linear maps from X to Y such that sup_{T∈F} ∥Tx∥ < ∞ for all x ∈ X. Assume, to the contrary, that sup_{T∈F} ∥T∥ = ∞. Then for every n ∈ N, there exists Tn ∈ F such that ∥Tn∥ ≥ 4^n. Meanwhile, we also put x0 := 0X. Then for each n ∈ N, we may see from the preceding lemma that
sup_{x′∈X, ∥x′−x_{n−1}∥<3^{−n}} ∥Tn x′∥ ≥ (1/3^n)∥Tn∥.
Then we may select xn ∈ X with ∥xn − x_{n−1}∥ < 3^{−n} such that ∥Tn xn∥ ≥ (2/3^{n+1})∥Tn∥.
Clearly, the sequence (xn )n∈N is Cauchy in X, hence is convergent as X is a Banach space. Let x ∈ X be the limit of (xn )n∈N . As we can
see, for each n ∈ N, by the continuity of norms,
∥x − xn∥ = lim_{m→∞} ∥xm − xn∥ ≤ lim_{m→∞} Σ_{k=n+1}^{m} ∥xk − x_{k−1}∥ ≤ lim_{m→∞} Σ_{k=n+1}^{m} (1/3^k) = lim_{m→∞} (1/(2·3^n))(1 − 1/3^{m−n}) = 1/(2·3^n).
Consequently,
|∥Tn x∥ − ∥Tn xn∥| ≤ ∥Tn x − Tn xn∥ ≤ ∥Tn∥∥x − xn∥ ≤ (1/(2·3^n))∥Tn∥.
As a result,
∥Tn x∥ ≥ ∥Tn xn∥ − (1/(2·3^n))∥Tn∥ ≥ (2/3^{n+1})∥Tn∥ − (1/(2·3^n))∥Tn∥ = (1/(6·3^n))∥Tn∥ ≥ (1/6)(4/3)^n.
In this case, the sequence (∥Tn x∥)n∈N is unbounded from above, contrary to our assumption. ■
Corollary 3.23. Let X be a Banach space and Y be a normed linear space. For every sequence (Tn)n∈N of bounded linear maps from X to Y, if the sequence (Tn x)n∈N in Y is convergent for all x ∈ X, then the map
T : X → Y : x ↦ lim_{n→∞} Tn x (121)
is also a bounded linear map.
Proof. For every x, y ∈ X and c ∈ F, by the continuity of addition and scalar multiplication,
T(cx + y) = lim_{n→∞} Tn(cx + y) = c lim_{n→∞} Tn x + lim_{n→∞} Tn y = cTx + Ty,
so the map T is linear as well. Meanwhile, for each x ∈ X, since the sequence (Tn x)n∈N is convergent, it is certainly bounded as well. Then by the Banach-Steinhaus theorem, we see that sup_{n∈N} ∥Tn∥ < ∞ as well. Finally, for each x ∈ X, by the continuity of the norm,
∥Tx∥ = lim_{n→∞} ∥Tn x∥ ≤ sup_{n∈N} ∥Tn∥∥x∥ = ∥x∥ sup_{n∈N} ∥Tn∥ < ∞. ■
Theorem 3.24 (The Open Mapping Theorem). Let X, Y be Banach spaces and T : X → Y be a bounded linear map. Then the following statements are equivalent:
1. The map T is open, namely T(G) is open in Y for every open set G ⊆ X.
2. The element 0Y ∈ Y is an interior point of T(BX), where BX := B(0X, 1) = {x ∈ X | ∥x∥ < 1} is the open unit ball in X.
3. The map T is surjective.
Proof. (1 ⟺ 2). If T is an open map, then T(BX) is open in Y and contains 0Y = T(0X), so 0Y is an interior point of T(BX).
Conversely, suppose that 0Y is an interior point of T(BX). Then there exists r > 0 such that y ∈ T(BX) for all y ∈ Y with ∥y∥ < r. Now let G ⊆ X be an open set and y ∈ T(G) be arbitrary. By definition, we can find some x ∈ G such that y = Tx. Since G is open, there is δ > 0 such that x̂ ∈ G whenever ∥x̂ − x∥ < δ. Then for every y′ ∈ Y with ∥y′ − y∥ < rδ, since ∥(y′ − y)/δ∥ < r, it follows from the assumption that there is some x′ ∈ BX such that (y′ − y)/δ = Tx′. In this case,
y′ = y + δTx′ = T(x + δx′) with ∥(x + δx′) − x∥ = δ∥x′∥ < δ.
The latter implies that x + δx′ ∈ G, whence y′ ∈ T(G) as well. Consequently, the set T(G) is open in Y, implying that T is an open map.
(2 ⟺ 3). Again, assume that 0Y lies in the interior of T(BX), and adopt the notation for the constant r > 0 as in the last paragraph. Let y ∈ Y be arbitrary. There is nothing to prove if y = 0Y; otherwise, for y′ := (r/(2∥y∥))y, it is clear that ∥y′∥ = r/2 < r, so by assumption, there exists x ∈ X with ∥x∥ < 1 such that y′ = Tx. In this case,
y = (2∥y∥/r)y′ = (2∥y∥/r)(Tx) = T((2∥y∥/r)x),
showing that T is a surjective map.
Conversely, suppose that T is surjective. First, we show that 0Y is an interior point of the closure of T(BX): For each n ∈ N, we may consider the following map on Y:
∥y∥n := inf{∥u∥ + n∥v∥ | u ∈ X, v ∈ Y, y = v + Tu}.
We claim that each ∥·∥n is a semi-norm on Y:
• Let y ∈ Y and c > 0 be arbitrary. For every u ∈ X and v ∈ Y with y = v + Tu, we have cy = cv + T(cu) with ∥cu∥ + n∥cv∥ = c(∥u∥ + n∥v∥). By taking infimum over all such u, v's, it is clear that ∥cy∥n ≤ c∥y∥n. Note that such inequality is independent of the choices of y and c, so
∥y∥n = ∥c^{-1}(cy)∥n ≤ c^{-1}∥cy∥n,
hence ∥cy∥n ≥ c∥y∥n as well. The identity ∥cy∥n = c∥y∥n thus holds for sure.
• Let y, y′ ∈ Y be arbitrary. Then for every u, u′ ∈ X and v, v′ ∈ Y, if y = v + Tu and y′ = v′ + Tu′, then
y + y′ = (v + v′) + T(u + u′) with ∥u + u′∥ + n∥v + v′∥ ≤ (∥u∥ + n∥v∥) + (∥u′∥ + n∥v′∥).
Consequently, by taking infima, ∥y + y′∥n ≤ ∥y∥n + ∥y′∥n.
Furthermore, let Z := Y^{⊕N} be the direct sum of countably many copies of Y, and by Theorem 1.29, it can be endowed with the following "l∞-norm":
∥(yk)k∈N∥ := sup_{k∈N} ∥yk∥k.
In this case, for each n ∈ N, consider the n-th injection map inn : Y → Z.
• Clearly, for each y ∈ Y, since inn(y) has a unique nonzero component y at index n, we must have ∥inn(y)∥ = ∥y∥n ≤ n∥y∥ (taking u := 0X and v := y). Therefore, the map inn is bounded as well.
• Furthermore, let y ∈ Y be arbitrary. Since T is surjective, we have y = Tx for some x ∈ X. Then by putting u := x and v := 0Y, we see that
∥inn(y)∥ = ∥y∥n ≤ ∥x∥ + n∥0Y∥ = ∥x∥ + n·0 = ∥x∥.
Consequently, the sequence (inn(y))n∈N is bounded in Z by ∥x∥, where x ∈ X satisfies Tx = y.
Then by the Banach-Steinhaus theorem (cf. Theorem 3.22), there exists a real constant C > 0 such that ∥inn∥ ≤ C for all n ∈ N. Next, we may fix an arbitrary δ ∈ (0, 1/C) and claim that B(0Y, δ) is contained in the closure of T(BX): Let y ∈ Y with ∥y∥ < δ. For each n ∈ N, we see that
∥y∥n = ∥inn(y)∥ ≤ ∥inn∥∥y∥ ≤ C∥y∥ < Cδ < 1.
Then there exist un ∈ X and vn ∈ Y with y = vn + Tun such that ∥un∥ + n∥vn∥ < 1. In particular, we see that ∥un∥ < 1 and ∥vn∥ < 1/n. The latter implies that vn → 0Y as n → ∞, hence Tun → y in this case. Now un ∈ BX for all n ∈ N, so y lies in the closure of T(BX), as desired.
Finally, we show that T(BX) itself contains the open ball B(0Y, δ/2), which will prove that 0Y is an interior point of T(BX): Let y ∈ Y be such that ∥y∥ < δ/2.
• By scaling the previous claim, we see that y lies in the closure of T(B(0X, 1/2)). Then there exists x1 ∈ X with ∥x1∥ < 1/2 such that ∥y − Tx1∥ < δ/4.
• Proceeding inductively, for n ≥ 2, since ∥y − Tx1 − ··· − Tx_{n−1}∥ < δ/2^n, the element y − Tx1 − ··· − Tx_{n−1} lies in the closure of T(B(0X, 1/2^n)). In this case, we can find xn ∈ X with ∥xn∥ < 1/2^n such that
∥y − Tx1 − ··· − Tx_{n−1} − Txn∥ < δ/2^{n+1}.
Since Σ_{n=1}^∞ ∥xn∥ < Σ_{n=1}^∞ 2^{−n} = 1 and X is a Banach space, the series x := Σ_{n=1}^∞ xn converges with ∥x∥ < 1, and the above estimates give Tx = y. Therefore y ∈ T(BX), as desired. ■
Corollary 3.25 (Bounded Inverse Theorem). Every bounded linear isomorphism between Banach spaces is also
an isomorphism, namely its inverse is bounded as well.
Proof. Let X, Y be Banach spaces, and let T : X → Y be a bounded linear isomorphism. Since T is surjective, by the open mapping theorem, it is also an open map. Then for every open set G ⊆ X, we see that (T^{-1})^{-1}(G) = T(G) is open as well, so the map T^{-1} is continuous, hence bounded. Therefore, the map T is indeed an isomorphism in this case. ■
Corollary 3.26 (Isomorphism Theorem). Let X,Y be Banach spaces and T : X → Y be a bounded linear map.
If im(T ) is closed in Y , then X/ ker(T ) is isomorphic to im(T ).
Proof. Suppose that im(T) is closed in Y. Now by the factorization theorem (cf. Theorem 3.15), we have a bounded linear bijection
T̃ : X/ker(T) → im(T) : x̄ ↦ Tx.
As we can see, the quotient space X/ ker(T ) is Banach as X is (cf. Theorem 1.16), and so is im(T ) as it is closed in the Banach space Y
(cf. Theorem 1.10). Then by the bounded inverse theorem (cf. Corollary 3.25), we can conclude that T̃ is an isomorphism, namely X/ker(T)
is isomorphic to im(T ). ■
Remark 3.5. In addition, the converse also holds in this case: If X/ ker(T ) is isomorphic to im(T ), then im(T )
is also a Banach space. By Theorem 1.10 again, it is necessarily closed in Y as well.
Theorem 3.27. Let X,Y be nonzero normed linear spaces and T : X → Y be a surjective bounded linear map.
Then T is invertible with bounded inverse T −1 if and only if there exists a constant C > 0 such that ∥T x∥ ≥ C∥x∥
for all x ∈ X, in which case ∥T −1 ∥ ≤ 1/C.
Proof. Suppose that T is invertible with bounded inverse T^{-1}. Then for every x ∈ X,
∥x∥ = ∥T^{-1}(Tx)∥ ≤ ∥T^{-1}∥∥Tx∥.
In particular, since T^{-1} is also invertible, it is certainly nonzero, whence ∥T^{-1}∥ > 0. Therefore, we have ∥Tx∥ ≥ ∥x∥/∥T^{-1}∥, hence the constant C := 1/∥T^{-1}∥ certainly suffices.
Conversely, suppose that there is some constant C > 0 satisfying that ∥T x∥ ≥ C∥x∥ for all x ∈ X. First, we show that the map T is itself
bijective: It is already surjective by hypothesis. If T x = 0Y for some x ∈ X, then we have
0 = ∥0Y ∥ = ∥T x∥ ≥ C∥x∥,
so ∥x∥ = 0 holds for sure, implying that x = 0X. Therefore, the kernel of T is trivial, implying that it is also injective. Therefore, it has a linear inverse T^{-1} : Y → X. What remains is to show that T^{-1} is also bounded: For every y ∈ Y, since
∥y∥ = ∥T(T^{-1}y)∥ ≥ C∥T^{-1}y∥,
we have ∥T^{-1}y∥ ≤ (1/C)∥y∥. Therefore, T^{-1} is bounded with ∥T^{-1}∥ ≤ 1/C. ■
Definition 3.8. Let X be a normed linear space. For every linear operator T ∈ B(X), its Neumann series is defined as the series Σ_{k=0}^∞ T^k.
Lemma 3.28. Let X be a normed linear space with T ∈ B(X). If the Neumann series of T converges under the operator norm, then idX − T is invertible, and its inverse is given by the sum of the Neumann series.
Proof. Suppose that the Neumann series of T converges to some T′ ∈ B(X). For every k ≥ 0, note that
(idX − T) Σ_{j=0}^k T^j = idX − T^{k+1},
so
(idX − T)T′ = (idX − T) Σ_{k=0}^∞ T^k = lim_{k→∞} (idX − T^{k+1}) = idX − lim_{k→∞} T^{k+1}.
Because the Neumann series is convergent, by Theorem 1.9, we certainly have T^k → 0 under the operator norm. Therefore, we see that (idX − T)T′ = idX, hence T′ is a right inverse of idX − T. By symmetric arguments, it is clear that T′ is also a left inverse of idX − T, hence T′ = (idX − T)^{-1}, as desired. ■
Theorem 3.29 (Inverse Concerning Contractions). Let X be a Banach space with T ∈ B(X). If ∥T∥ < 1, then idX − T is also invertible with
(idX − T)^{-1} = Σ_{k=0}^∞ T^k and ∥(idX − T)^{-1}∥ ≤ 1/(1 − ∥T∥). (122)
Proof. Suppose that ∥T∥ < 1, namely T is a contraction. We first claim that idX − T is surjective: For every y ∈ X, by Banach's fixed point theorem (applied to the contraction x ↦ Tx + y), there exists an element x ∈ X such that x = Tx + y, namely y = (idX − T)x. Meanwhile, for every x ∈ X, since ∥Tx∥ ≤ ∥T∥∥x∥,
∥(idX − T)x∥ ≥ ∥x∥ − ∥Tx∥ ≥ (1 − ∥T∥)∥x∥.
Then by Theorem 3.27, we may conclude that idX − T is also invertible with ∥(idX − T)^{-1}∥ ≤ 1/(1 − ∥T∥). By the preceding lemma, what remains is to show that the Neumann series of T is convergent in B(X): Because ∥T∥ < 1 and ∥T^k∥ ≤ ∥T∥^k for all k ≥ 0, the geometric series Σ_{k=0}^∞ ∥T∥^k converges, hence so does Σ_{k=0}^∞ ∥T^k∥. Thus, the Neumann series Σ_{k=0}^∞ T^k converges absolutely, hence converges as well by Theorem 1.12. ■
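Theorem 3.29 can be checked numerically in finite dimensions. The sketch below is my own illustration (not from the note): on X = R² with a matrix T of small norm, the partial sums of the Neumann series Σ T^k, multiplied by I − T, give (approximately) the identity.

```python
# Hypothetical example: T as a 2x2 matrix with operator norm < 1.

def matmul(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

I = [[1.0, 0.0], [0.0, 1.0]]
T = [[0.2, 0.1], [0.0, 0.3]]   # entries small enough that ||T|| < 1

# Partial sums S_n = sum_{k=0}^{n} T^k of the Neumann series.
S, P = I, I
for _ in range(200):
    P = matmul(P, T)                                   # P becomes T^{k+1}
    S = [[S[i][j] + P[i][j] for j in range(2)] for i in range(2)]

# (I - T) S should be close to the identity, i.e. S approximates (I - T)^{-1}.
IT = [[I[i][j] - T[i][j] for j in range(2)] for i in range(2)]
R = matmul(IT, S)
assert all(abs(R[i][j] - I[i][j]) < 1e-12 for i in range(2) for j in range(2))
```

The truncation error is governed by ∥T∥^{n+1}, exactly the tail estimate appearing in the convergence argument above.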
Corollary 3.30. Let X be a Banach space. Then the collection of isomorphisms from X to itself is an open
subset of B(X).
Proof. Clearly, the identity map on X is an isomorphism, so such set is nonempty. Now let T ∈ B(X) be an isomorphism, and consider any S ∈ B(X) with ∥S − T∥ < 1/∥T^{-1}∥. Observe that
∥idX − T^{-1}S∥ = ∥T^{-1}(T − S)∥ ≤ ∥T^{-1}∥∥T − S∥ < ∥T^{-1}∥ · (1/∥T^{-1}∥) = 1,
so by the preceding theorem, we can see that T^{-1}S is also an isomorphism. Now since T is itself an isomorphism, we see that S = T(T^{-1}S) is an isomorphism as well. This shows that the open ball centered at T of radius 1/∥T^{-1}∥ is contained in the set of all isomorphisms on X, so such set is indeed open in B(X). ■
Theorem 3.31 (Closed Graph Theorem). Let X, Y be Banach spaces and T : X → Y be a linear map. Then T is bounded if and only if its graph is closed in X × Y under the norm ∥(x, y)∥ := ∥x∥ + ∥y∥.
Proof. By Corollary 1.30, it is clear that X × Y is a Banach space under the norm defined above. This allows us to consider the statement presented above:
• Suppose that T is bounded. Let (xn , yn )n∈N be a sequence in the graph of T converging to some (x, y) ∈ X × Y . Then clearly, we
must have xn → x and yn → y. Since T is bounded, we shall have yn = T xn → T x as well. By the uniqueness of limits, it follows
that y = T x, namely the pair (x, y) also belongs to the graph of T . Therefore, the graph of T is closed in X ×Y .
• Conversely, suppose that the graph of T is closed in X ×Y . By Theorem 1.10, it is a Banach space. In this case, the projection maps
πX : X ×Y → X and πY : X ×Y → Y are both bounded linear maps. Furthermore, note that the restriction of πX to the graph of T is
bijective, hence it has a bounded inverse T ′ from X to the graph of T by the bounded inverse theorem. Clearly, for each x ∈ X, we
have T ′ x = (x, T x) whence
T x = πY (x, T x) = πY (T ′ x) = (πY ◦ T ′ )x.
This shows that T = πY ◦ T ′ is also bounded, as desired. ■
Remark 3.6. As a result, to show that a linear map T between Banach spaces is bounded, it suffices to show that
xn → x and T xn → y imply that T x = y.
• Note that F = R or C is a Banach space, so the continuous dual X ∗ is also a Banach space by Theorem
3.18.
• The continuous dual X ∗ = B(X, F) is a subspace of the algebraic dual X ∨ = L (X, F). By Theorem 3.4,
they coincide when X is finite-dimensional.
• Finally, the operator norm on X∗ is of the following form: For every continuous functional ϕ : X → F,
∥ϕ∥ = sup_{x∈X, ∥x∥≤1} |ϕ(x)|,
which is called the dual norm now. In practice, such norm is denoted by ∥·∥∗ or ∥·∥D.
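For a concrete feel of the dual norm, here is a small numerical sketch of my own (an assumption, not from the note): on X = R² with the Euclidean norm, the dual norm of ϕ(x) = ax₁ + bx₂ is √(a² + b²), by Cauchy-Schwarz with equality at x = (a, b)/√(a² + b²).

```python
import math

a, b = 3.0, -4.0
phi = lambda x: a * x[0] + b * x[1]   # a bounded linear functional on R^2

# Dual norm: sup of |phi(x)| over the Euclidean unit sphere, by sampling.
n = 100_000
best = max(abs(phi((math.cos(2 * math.pi * k / n),
                    math.sin(2 * math.pi * k / n)))) for k in range(n))

# The supremum equals the Euclidean norm of the coefficient vector (a, b).
assert abs(best - math.hypot(a, b)) < 1e-6
```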
Theorem 3.32. Let X be a normed linear space with base (ei)i∈I.
1. If X is finite-dimensional with base {e1, . . . , en}, then each coordinate form e^∨_i : X → F is bounded. Furthermore, the family {e^∨_1, . . . , e^∨_n} is a base of X∗ in which
ϕ = Σ_{j=1}^n ϕ(ej) e^∨_j, ∀ϕ ∈ X∗. (125)
2. If X is infinite-dimensional with base (ei )i∈I , then at most finitely many e∨i ’s are bounded.
Proof. First, suppose that X is finite-dimensional. Then the index set I is certainly finite. As noted in the preceding remark, each coordinate form e^∨_i is also bounded. Now let ϕ ∈ X∗ be arbitrary. Then for every x ∈ X,
ϕ(x) = ϕ(Σ_{i∈I} e^∨_i(x) ei) = Σ_{i∈I} e^∨_i(x) ϕ(ei) = Σ_{i∈I} ϕ(ei) e^∨_i(x),
which proves the expansion (125). Now suppose that X is infinite-dimensional, namely the index set I is also infinite. ■
Proposition 3.33. Let X be a linear space. Then a sublinear map p : X → R is a semi-norm if and only if it is
symmetric, namely p(−x) = p(x) for all x ∈ X.
Proof. Let p be a sublinear map on X. If it is already a semi-norm, by Theorem 1.1, we certainly have p(−x) = p(x) for all x ∈ X. Conversely, suppose that p is symmetric. Then for every x ∈ X and real scalar c, writing c = ±|c|, positive homogeneity and symmetry give
p(cx) = p(|c|(±x)) = |c|p(±x) = |c|p(x).
Since the triangle inequality already holds for p, we can conclude that p is a semi-norm on X. ■
Theorem 3.34 (Hahn-Banach, Real Linear Functionals). Let X be a real linear space and p : X → R be a
sublinear map. Then for every subspace U of X and linear functional ϕ0 : U → R, if ϕ0 (x) ≤ p(x) for all x ∈ U,
then there is a linear extension ϕ : X → R of ϕ0 such that ϕ(x) ≤ p(x) for all x ∈ X as well.
Proof. Let U be a subspace of X and ϕ0 : U → R be a linear map such that ϕ0(x) ≤ p(x) for all x ∈ U. First, we show that whenever U ≠ X, we can extend ϕ0 to a larger subspace of X satisfying the same condition: When U ≠ X, we may fix one x1 ∈ X ∖ U, a constant α ∈ R to be chosen later, and consider
ϕ1 : U + Rx1 → R : x + cx1 ↦ ϕ0(x) + cα.
• We first show that ϕ1 is well-defined: Suppose that x + cx1 = x′ + c′x1 for some x, x′ ∈ U and c, c′ ∈ R. Then we have x − x′ = (c′ − c)x1. Since x1 ∉ U, the previous two expressions can only be 0X. That is, x = x′ and c = c′, hence
ϕ0(x) + cα = ϕ0(x′) + c′α.
• Next, we choose a suitable α: For every x, y ∈ U,
ϕ0(x) + ϕ0(y) = ϕ0(x + y) ≤ p(x + y) ≤ p(x − x1) + p(x1 + y),
hence
ϕ0(x) − p(x − x1) ≤ p(x1 + y) − ϕ0(y).
Since x, y ∈ U are arbitrary, it follows that
sup_{x∈U} (ϕ0(x) − p(x − x1)) ≤ inf_{y∈U} (p(x1 + y) − ϕ0(y)).
Thus, we may assume that α was selected between these two numbers. Now let x ∈ U and c ∈ R be arbitrary.
• If c = 0, then ϕ1 (x) = ϕ0 (x) ≤ p(x).
• Suppose that c > 0. Then
p(x + cx1) = c·p(x/c + x1) ≥ c(α + ϕ0(x/c)) = cα + ϕ0(x) = ϕ1(x + cx1).
• The case c < 0 follows similarly from the other half of the choice of α, namely α ≥ ϕ0(y) − p(y − x1) with y := x/|c|.
Clearly, if X is finite-dimensional, applying the extension procedure finitely many times, we can certainly obtain the desired linear functional on X. As for general normed linear spaces, we are supposed to consider the following set:
A := {(Y, ψ) | Y is a subspace of X with U ⊆ Y, and ψ : Y → R is linear with ψ|U = ϕ0 and ψ ≤ p on Y}.
Clearly, we have (U, ϕ0) ∈ A, so the set A is nonempty. Next, we define a binary relation ⪯ on A by
(Y1, ψ1) ⪯ (Y2, ψ2) :⟺ Y1 ⊆ Y2 and ψ2|_{Y1} = ψ1.
It is immediate that ⪯ is a partial order on A. Furthermore, given any chain (Yi, ψi)i∈Γ in A, the union Y∗ := ⋃_{i∈Γ} Yi is also a subspace of X. Furthermore, define
ψ∗ : Y∗ → R : x ↦ ψi(x), where i ∈ Γ is any index with x ∈ Yi.
The map ψ∗ is well-defined: if x ∈ Yi ∩ Yj, then one of (Yi, ψi), (Yj, ψj) extends the other, so ψi(x) = ψj(x). Clearly ψ∗ is linear with ψ∗ ≤ p on Y∗, so (Y∗, ψ∗) ∈ A with (Yi, ψi) ⪯ (Y∗, ψ∗) for all i ∈ Γ, namely every chain in A is bounded from above.
By Zorn's lemma, we can find a maximal element (Y, ϕ) ∈ A. Here we must have Y = X, for otherwise we can apply the extension procedure described at the beginning to extend ϕ to a strictly larger subspace Y′ of X, contradicting the maximality of (Y, ϕ). The proof is thus complete. ■
Lemma 3.35. Let X be a complex linear space.
1. For every complex linear functional ϕ : X → C, its real part ℜ(ϕ) is also a linear functional on X as a real linear space. Furthermore, ϕ(x) = ℜ(ϕ)(x) − iℜ(ϕ)(ix) for all x ∈ X.
2. Conversely, suppose that ψ : X → R is a real linear functional on X, and let
ϕ : X → C : x ↦ ψ(x) − iψ(ix).
Since ψ takes values in the real numbers, it is clear that ℜ(ϕ) = ψ. Then for every x, y ∈ X, additivity of ϕ is immediate, and for every c = a + ib ∈ C, the computation
ϕ(ix) = ψ(ix) − iψ(−x) = i(ψ(x) − iψ(ix)) = iϕ(x)
together with real homogeneity yields ϕ(cx) = (a + ib)ϕ(x) = cϕ(x), so ϕ is a complex linear functional on X.
Theorem 3.36 (Hahn-Banach, Complex Linear Functionals). Let X be a complex linear space and p : X → R
be a sublinear map. Then for every subspace U of X and linear functional ϕ0 : U → C, if ℜ(ϕ0 (x)) ≤ p(x) for
all x ∈ U,
1. there is a linear extension ϕ : X → C of ϕ0 such that ℜ(ϕ(x)) ≤ p(x) for all x ∈ X as well;
2. if furthermore p is a semi-norm on X, then we even have |ϕ(x)| ≤ p(x) for all x ∈ X in this case.
Proof. Let U be a subspace of X and ϕ0 : U → C be a linear functional such that ℜ(ϕ0 (x)) ≤ p(x) for all x ∈ U. By the preceding lemma,
we see that ℜ(ϕ0 ) is a real linear functional on U, so by the Hahn-Banach theorem for real linear functionals, there exists a real linear
functional ψ : X → R extending ℜ(ϕ0 ) such that ψ(x) ≤ p(x) for all x ∈ X. Furthermore, define
ϕ : X → C : x 7→ ψ(x) − iψ(ix),
which, by the preceding lemma, is also a complex linear functional on X with ℜ(ϕ) = ψ ≤ p. For each x ∈ U, since ix ∈ U as well,
ϕ(x) = ψ(x) − iψ(ix) = ℜ(ϕ0)(x) − iℜ(ϕ0)(ix) = ϕ0(x),
where the last equality uses item 1 of the preceding lemma. Thus, the map ϕ is an extension of ϕ0 as well. Finally, let x ∈ X be arbitrary. We then put α := 1 if ϕ(x) = 0 and α := |ϕ(x)|/ϕ(x) otherwise.
Clearly, we always have |α| = 1 and
ϕ(αx) = αϕ(x) = |ϕ(x)| ∈ R,
so
|ϕ(x)| = ϕ(αx) = ψ(αx) ≤ p(αx) = |α|p(x) = p(x). ■
Remark 3.8. In particular, if |ϕ0| ≤ p on U, since ℜ(ϕ0) ≤ |ϕ0|, we shall have ℜ(ϕ0) ≤ p on U as well. This allows the following special case of the preceding theorem:
Let X be a semi-normed complex linear space. Then for every subspace U of X and linear functional ϕ0 : U → C, if |ϕ0(x)| ≤ ∥x∥ for all x ∈ U, then there is a linear extension ϕ : X → C of ϕ0 such that |ϕ(x)| ≤ ∥x∥ for all x ∈ X as well.
Corollary 3.37 (Hahn-Banach, Extension). Let X be a normed linear space with subspace U. Then for every
bounded linear functional ϕ0 ∈ U ∗ , there exists a bounded linear extension ϕ ∈ X ∗ of ϕ0 such that ∥ϕ∥ = ∥ϕ0 ∥.
Proof. Let ϕ0 ∈ U∗ be an arbitrary bounded linear functional. Then we define
p : X → R : x ↦ ∥ϕ0∥∥x∥.
By Theorem 1.18, it is clear that p is a semi-norm on X, which is certainly sublinear. Furthermore, for each x ∈ U, |ϕ0(x)| ≤ ∥ϕ0∥∥x∥ = p(x). By the Hahn-Banach theorem for real/complex linear functionals, we can find a linear functional ϕ : X → F extending ϕ0 such that |ϕ(x)| ≤ p(x) = ∥ϕ0∥∥x∥ for all x ∈ X. Clearly, the map ϕ is bounded now with ∥ϕ∥ ≤ ∥ϕ0∥. Meanwhile, since ϕ extends ϕ0,
∥ϕ0∥ = sup_{x∈U, ∥x∥≤1} |ϕ0(x)| = sup_{x∈U, ∥x∥≤1} |ϕ(x)| ≤ sup_{x∈X, ∥x∥≤1} |ϕ(x)| = ∥ϕ∥.
The identity ∥ϕ0∥ = ∥ϕ∥ thus holds for sure, completing the proof. ■
Corollary 3.38 (Norming Functional). Let X be a normed linear space and x ∈ X be nonzero. Then there exists a bounded linear functional ϑx ∈ X∗ with ∥ϑx∥ = 1 and ϑx(x) = ∥x∥. Furthermore,
∥x∥ = max_{ϕ∈X∗, ∥ϕ∥≤1} |ϕ(x)|.
In particular, if ϕ(x) = 0 for all ϕ ∈ X∗, we must have x = 0X.
Proof. Consider the subspace U := Span(x), which is nonzero as x is. By the universal property of linear spaces, there is a unique linear
map ϑ : U → F such that ϑ (x) = ∥x∥. More precisely, the map ϑ is given by
ϑ : U → F : λ x 7→ λ ∥x∥.
For every λ ∈ F, |ϑ(λx)| = |λ|∥x∥ = ∥λx∥, so ϑ is bounded on U with norm 1. By the Hahn-Banach extension theorem (cf. Corollary 3.37), there exists ϑx ∈ X∗ extending ϑ with ∥ϑx∥ = 1, in which case ϑx(x) = ϑ(x) = ∥x∥. Now the equality is attained and the supremum degenerates to a maximum because of this bounded linear functional ϑx ∈ X∗. As we can see, if ϕ(x) = 0 for all ϕ ∈ X∗, we must have ∥x∥ = 0, whence x = 0X. The proof is complete. ■
Corollary 3.39. Let X be a normed linear space with proper closed subspace U. For every x0 ∈ X \ U, there
exists ϕ ∈ X ∗ such that ϕ(x) = 0 for all x ∈ U and ϕ(x0 ) ̸= 0.
Proof. Let x0 ∈ X ∖ U be arbitrary. Consider the quotient map π : X → X/U, which is also a bounded linear map by Theorem 3.13. In this case, note that π(x0) = x̄0 ≠ 0_{X/U} as x0 ∉ U, so by the preceding corollary, we can find ϑ ∈ (X/U)∗ with ∥ϑ∥ = 1 and ϑ(x̄0) = ∥x̄0∥ ≠ 0. In this case, we may put ϕ := ϑ ∘ π, which is also bounded and linear. Clearly, ϕ(x) = ϑ(π(x)) = ϑ(0_{X/U}) = 0 for every x ∈ U, while ϕ(x0) = ϑ(x̄0) ≠ 0. ■
Theorem 3.40 (Evaluation Map and Reflexity). Let X be a normed linear space. Then for every x ∈ X, the
following evaluation map
evx : X ∗ → F : ϕ 7→ ϕ(x) (128)
is also a bounded linear functional on X ∗ with ∥ evx ∥ = ∥x∥. Furthermore, we also have a linear isometry
ev : X → X ∗∗ : x 7→ evx . (129)
Proof. Fix an arbitrary x ∈ X, and let evx : X ∗ → F be defined as above. Then for every ϕ, ψ ∈ X ∗ and c ∈ F,
evx (cϕ + ψ) = (cϕ + ψ)(x) = cϕ(x) + ψ(x) = c evx (ϕ) + evx (ψ).
This shows that the map evx is a linear functional on X∗. Furthermore, for every ϕ ∈ X∗, we see that
|evx(ϕ)| = |ϕ(x)| ≤ ∥ϕ∥∥x∥.
Consequently, the linear functional evx : X∗ → F is also bounded with ∥evx∥ ≤ ∥x∥, hence evx ∈ X∗∗. In addition, by considering the norming functional ϑx ∈ X∗ described in Corollary 3.38, we see that
∥evx∥ = sup_{∥ϕ∥≤1} |evx(ϕ)| ≥ |evx(ϑx)| = |ϑx(x)| = ∥x∥,
where ϑx is admissible in the supremum because ∥ϑx∥ = 1. Thus, we must have ∥evx∥ = ∥x∥ in this case. Finally, let ev : X → X∗∗ be as defined
above and x, y ∈ X be arbitrary. Then for every c ∈ F and ϕ ∈ X ∗ ,
evcx+y (ϕ) = ϕ(cx + y) = cϕ(x) + ϕ(y) = c evx (ϕ) + evy (ϕ) = (c evx + evy )(ϕ),
so evcx+y = c evx + evy holds for sure. As a result, the map ev is also linear. Again, for each x ∈ X, since ∥ evx ∥ = ∥x∥, we can see that ev is
an isometry, as desired. ■
Definition 3.12. A normed linear space X is called reflexive if the canonical isometry ev : X → X ∗∗ is also
surjective, namely an isometric isomorphism.
Remark 3.9. In fact, every reflexive normed linear space X is necessarily complete, as it is isometrically iso-
morphic to its bidual X ∗∗ = B(X ∗ , F), which is always a Banach space (cf. Theorem 3.18). Therefore, we shall
always refer to reflexive Banach spaces only.
Lemma 3.41 (Bounded Linear Functional By Inner Products). Let X be an inner product space. For every z ∈ X, the following map
ηz : X → F : x ↦ ⟨x, z⟩ (130)
is a bounded linear functional on X with ∥ηz∥ = ∥z∥.
Proof. Linearity of ηz follows from the linearity of the inner product in its first argument, and by the Cauchy-Schwarz inequality, |ηz(x)| = |⟨x, z⟩| ≤ ∥z∥∥x∥ for all x ∈ X. Therefore, the linear functional ηz is bounded with ∥ηz∥ ≤ ∥z∥. Note that |ηz(z)| = |⟨z, z⟩| = ∥z∥², so the equality ∥ηz∥ = ∥z∥ is attained now. ■
Remark 3.10. In particular, when z ≠ 0X, the map ∥z∥^{-1}ηz ∈ X∗ is precisely a norming functional of z, as (∥z∥^{-1}ηz)(z) = ∥z∥^{-1}⟨z, z⟩ = ∥z∥ and ∥∥z∥^{-1}ηz∥ = 1.
Lemma 3.42. Let X be a Hilbert space and ϕ ∈ X ∗ . If ϕ ̸= 0, then the orthogonal complement of ker(ϕ) is of
dimension 1.
Proof. Suppose that ϕ ̸= 0. By Corollary 3.2, its kernel ker(ϕ) is a closed subspace of X. Furthermore, because ϕ ̸= 0, its kernel is also
proper in X. By Theorem 2.38, we see that ker(ϕ)⊥ is nonzero then.
Next, fix nonzero elements x1, x2 ∈ ker(ϕ)⊥. Clearly, ϕ(x1) and ϕ(x2) are both nonzero, for otherwise x1 or x2 would lie in ker(ϕ) ∩ ker(ϕ)⊥ as well, forcing it to be zero. Then in this case, for a := −ϕ(x1)ϕ(x2)^{-1},
ϕ(x1 + ax2) = ϕ(x1) + aϕ(x2) = ϕ(x1) − ϕ(x1) = 0.
As a result, we have x1 + ax2 ∈ ker(ϕ). However, because ker(ϕ)⊥ is a subspace of X, we should have x1 + ax2 ∈ ker(ϕ)⊥ as well. Therefore, we must have x1 + ax2 = 0X. Since a ≠ 0, the two elements x1, x2 are linearly dependent. As a result, the subspace ker(ϕ)⊥ is exactly of dimension 1. ■
Remark 3.11. When X is finite-dimensional, the above lemma would be trivial: For nonzero ϕ ∈ X∗ = L(X, F), its rank, namely the dimension of its range, is precisely equal to 1. By the rank-nullity theorem, the dimension of ker(ϕ) is equal to dim(X) − 1. Since X = ker(ϕ) ⊕ ker(ϕ)⊥, we see that ker(ϕ)⊥ necessarily has dimension 1 as well.
Theorem 3.43 (Riesz-Fréchet Representation Theorem). Let X be a Hilbert space. Then the following map
η : X → X∗ : z ↦ ηz
is a conjugate linear bijective isometry. In particular, for every bounded linear functional ϕ ∈ X∗, there exists a unique element z ∈ X such that ϕ(x) = ⟨x, z⟩ for all x ∈ X.
Proof. Let η : X → X∗ be as defined. First, we show that η is conjugate linear: Let z, z′ ∈ X and c ∈ F be arbitrary. Then for every x ∈ X,
η_{cz+z′}(x) = ⟨x, cz + z′⟩ = \overline{c}⟨x, z⟩ + ⟨x, z′⟩ = \overline{c}ηz(x) + ηz′(x) = (\overline{c}ηz + ηz′)(x).
This shows that η_{cz+z′} = \overline{c}ηz + ηz′, hence η is conjugate linear. Furthermore, by Lemma 3.41, we see that ∥ηz∥ = ∥z∥ for all z ∈ X, hence η is also an isometry. What remains is to prove that η is bijective:
• Let z, z′ ∈ X be such that ηz = ηz′. Then for every x ∈ X, ⟨x, z − z′⟩ = 0; taking x := z − z′ yields ∥z − z′∥² = 0, hence z = z′ and η is injective.
• For surjectivity, let ϕ ∈ X∗ be arbitrary. If ϕ = 0, then ϕ = η_{0X}. Otherwise, by Lemma 3.42, we may pick a unit element z0 ∈ ker(ϕ)⊥ and decompose
X = ker(ϕ) ⊕ ker(ϕ)⊥,
where the decomposition is valid because ker(ϕ) is a closed subspace of X (cf. Corollary 3.2). As a result, for every x ∈ X, the element x − ⟨x, z0⟩z0 lies in ker(ϕ), so we have
ϕ(x) = ϕ((x − ⟨x, z0⟩z0) + ⟨x, z0⟩z0) = ϕ(x − ⟨x, z0⟩z0) + ϕ(⟨x, z0⟩z0)
= ϕ(⟨x, z0⟩z0) = ⟨x, z0⟩ϕ(z0) = ⟨x, \overline{ϕ(z0)} z0⟩.
Therefore ϕ = ηz with z := \overline{ϕ(z0)} z0, hence η is surjective. ■
Remark 3.12. Here are some remarks on the Riesz representation theorem:
• When X is finite-dimensional, we can find such z for given nonzero ϕ ∈ X∗ as follows: Let {e1, . . . , en} be an orthonormal base of X (cf. Corollary 2.32). Then for every x ∈ X,
ϕ(x) = ϕ(Σ_{k=1}^n ⟨x, ek⟩ek) = Σ_{k=1}^n ⟨x, ek⟩ϕ(ek) = Σ_{k=1}^n ⟨x, \overline{ϕ(ek)} ek⟩ = ⟨x, Σ_{k=1}^n \overline{ϕ(ek)} ek⟩.
The element
z := Σ_{k=1}^n \overline{ϕ(ek)} ek ∈ X
certainly suffices. In particular, we may also prescribe that {e1, . . . , en−1} is an orthonormal base of ker(ϕ), while by adding en, we obtain an orthonormal base of X. Then z = \overline{ϕ(en)} en is precisely of the form presented in the previous proof.
• Inspired by this, many functional analysis texts use inner-product notation to describe functionals on
normed linear spaces as well: given any normed linear space X, if x ∈ X and ϕ ∈ X∗, they write ⟨x, ϕ⟩ := ϕ(x), the dual pairing between X and X∗.
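The finite-dimensional construction above can be checked concretely. The sketch below is an illustration, not part of the note: the functional ϕ, its coefficients, and the sample vector are all made up, and the representative z is obtained by conjugating the coefficients of ϕ in the standard orthonormal base of C².

```python
# Riesz representative on C^2 with the standard inner product
# <x, z> = sum_k x_k * conj(z_k); phi(x) = sum_k c_k * x_k is a
# hypothetical functional with coefficients c_k = phi(e_k).

def inner(x, z):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, z))

def riesz_representative(c):
    """z := sum_k conj(phi(e_k)) e_k, i.e. conjugate the coefficients."""
    return [ck.conjugate() for ck in c]

c = [2 + 0j, 3j]                     # phi(e_1), phi(e_2)
phi = lambda x: sum(ck * xk for ck, xk in zip(c, x))
z = riesz_representative(c)

x = [1 + 1j, 2 - 1j]                 # sample vector
print(phi(x), inner(x, z))           # both equal (5+8j)
```

Without the conjugation, the map c ↦ z would be linear rather than conjugate linear, which is exactly the point of Theorem 3.43.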
Corollary 3.44. Every Hilbert space X is reflexive, and its continuous dual X∗ is also a Hilbert space under the
inner product
∀ϕ, ψ ∈ X∗ : ⟨ϕ, ψ⟩X∗ := ⟨zψ, zϕ⟩X, (133)
where zϕ , zψ ∈ X are the unique representatives ensured by the Riesz representation theorem.
Proof. Let X be a Hilbert space. First, we show that X∗ is also a Hilbert space under the map defined above: For convenience, we let
η : X → X∗ be the conjugate-linear bijective isometry constructed in the Riesz representation theorem. Then the map ⟨·, ·⟩X∗ on X∗ can also
be described as
∀ϕ, ψ ∈ X∗ : ⟨ϕ, ψ⟩X∗ := ⟨η⁻¹(ψ), η⁻¹(ϕ)⟩X,
which is certainly an inner product on X∗. Furthermore, for every ϕ ∈ X∗, denoting by ∥ϕ∥op its operator norm,
∥ϕ∥op = ∥η⁻¹(ϕ)∥X = √⟨η⁻¹(ϕ), η⁻¹(ϕ)⟩X = √⟨ϕ, ϕ⟩X∗.
Consequently, the operator norm on X ∗ is precisely induced by the new inner product ⟨·, ·⟩X ∗ . Since X ∗ is already a Banach space under the
operator norm, it is thus a Hilbert space under the inner product defined above.
Finally, let f ∈ X ∗∗ be arbitrary. Then by applying the Riesz representation theorem to X ∗∗ , we can find a unique ψ ∈ X ∗ such that
f (ϕ) = ⟨ϕ, ψ⟩X ∗ for all ϕ ∈ X ∗ . In this case,
f (ϕ) = ⟨ϕ, ψ⟩X ∗ = ⟨η −1 (ψ), η −1 (ϕ)⟩X = ϕ(η −1 (ψ)) = evη −1 (ψ) (ϕ),
hence we have f = evη −1 (ψ) , showing that the canonical isometry ev : X → X ∗∗ is surjective as well. Therefore, X is reflexive. ■
Remark 3.13. The Riesz representation theorem explains why the notation ⊥ is adopted in the above definitions:
Suppose that X is a Hilbert space with A ⊆ X. By the Riesz representation theorem, we have a bijection between
the following two sets:
{ϕ ∈ X∗ | ∀a ∈ A : ϕ(a) = 0} and {z ∈ X | ∀a ∈ A : ⟨a, z⟩ = 0},
where the latter is precisely the orthogonal complement of A. That is why the former set is denoted by A⊥ even
for normed linear spaces.
Theorem 3.45. Let X be a normed linear space. Then for every nonempty A ⊆ X (resp. B ⊆ X ∗ ), its annihilator
is a closed subspace of X ∗ (resp. of X).
Proof. 1. Let A ⊆ X be nonempty. Clearly, the zero map in X ∗ is certainly contained in A⊥ , so A⊥ is nonempty. Furthermore, let ϕ, ψ ∈ A⊥
and c ∈ F be arbitrary. Then for every x ∈ A,
(cϕ + ψ)(x) = cϕ(x) + ψ(x) = c(0) + 0 = 0.
This shows that cϕ + ψ ∈ A⊥, implying that A⊥ is a subspace of X∗. Finally, let (ϕn)n∈N be a sequence in A⊥ that converges to some ϕ ∈ X∗.
Then for every x ∈ A,
|ϕ(x)| = |ϕ(x) − 0| = |ϕ(x) − ϕn (x)| = |(ϕ − ϕn )(x)| ≤ ∥ϕ − ϕn ∥∥x∥ → 0. (as n → ∞)
This shows that ϕ(x) = 0 for all x ∈ A, hence ϕ ∈ A⊥ as well. Consequently, A⊥ is a closed subspace of X ∗ , as desired.
2. Similarly, let B ⊆ X ∗ be nonempty. Clearly, we have 0X ∈ B⊥ , so B⊥ is nonempty. Next, let x, y ∈ B⊥ and c ∈ F. Then for every
ϕ ∈ B,
ϕ(cx + y) = cϕ(x) + ϕ(y) = c(0) + 0 = 0,
hence cx + y ∈ B⊥ as well. This shows that B⊥ is a subspace of X. Finally, let (xn )n∈N be a sequence in B⊥ converging to some x ∈ X. Then
given any ϕ ∈ B, since ϕ is continuous, we see that
ϕ(x) = ϕ(lim_{n→∞} xn) = lim_{n→∞} ϕ(xn) = lim_{n→∞} 0 = 0.
This shows that x ∈ B⊥ as well, hence B⊥ is a closed subspace of X, as desired. ■
Theorem 3.46. Let X be a normed linear space with closed subspace U. Then the following map
T : (X/U)∗ → U ⊥ : ϕ 7→ ϕ ◦ π (136)
is an isometric isomorphism, where π : X → X/U denotes the quotient map.
The only unclear part here is at the equality marked ∗: Since ∥x̄∥ ≤ ∥x∥ for every x ∈ X, the inequality ≤ holds for sure. Conversely, let x ∈ X be such that ∥x̄∥ < 1.
Then there exists z ∈ U such that ∥x − z∥ < 1 as well. In this case, because z ∈ U,
ϕ(z̄) = ϕ(0_{X/U}) = 0.
As a result,
ϕ(\overline{x − z}) = ϕ(x̄ − z̄) = ϕ(x̄) − ϕ(z̄) = ϕ(x̄) − 0 = ϕ(x̄).
Now because ∥x − z∥ < 1, the inequality ≥ is true as well. Consequently, the linear map T is also an isometry, and hence bounded as
well.
• Finally, for each ψ ∈ U⊥, it is clear that U ⊆ ker(ψ). Then by the factorization theorem (cf. Theorem 3.15), there is a unique
bounded linear functional ϕ : X/U → F such that ϕ(x̄) = ψ(x) for all x ∈ X and ∥ϕ∥ = ∥ψ∥. Clearly, we have ψ = Tϕ in this case,
so T is also surjective.
In conclusion, the map T is an isometric isomorphism, as desired. ■
Corollary 3.48. A normed linear space is separable, namely containing a countable dense subset, whenever its
continuous dual is.
Proof. Let X be a normed linear space such that X∗ is separable. Suppose that (ϕn)n∈N is a dense family in X∗. Then for each n ∈ N,
by the definition of operator norm, we can find some xn ∈ X with ∥xn∥ ≤ 1 such that |ϕn(xn)| ≥ ∥ϕn∥/2. We now claim that the subspace
U := Span(xn | n ∈ N) is dense in X: By the preceding theorem, it suffices to show that U⊥ = {0}. Let ϕ ∈ U⊥ be arbitrary. Then for every
n ∈ N, we may observe that
∥ϕ − ϕn∥ ≥ |(ϕ − ϕn)(xn)| = |ϕ(xn) − ϕn(xn)| = |ϕn(xn)| ≥ (1/2)∥ϕn∥ = (1/2)∥ϕ − (ϕ − ϕn)∥
≥ (1/2)|∥ϕ∥ − ∥ϕ − ϕn∥|.
Because (ϕn )n∈N is dense in X ∗ , we shall have infn∈N ∥ϕ − ϕn ∥ = 0. As a result,
0 = inf_{n∈N} ∥ϕ − ϕn∥ ≥ inf_{n∈N} (1/2)|∥ϕ∥ − ∥ϕ − ϕn∥| = (1/2)∥ϕ∥,
hence ∥ϕ∥ = 0 holds for sure. In other words, now we have ϕ = 0, as desired.
Finally, let Q := Q if F = R, or Q := Q(i) = {a + ib | a, b ∈ Q} if F = C, which is countable and dense in F. Meanwhile, since (xn )n∈N
is a spanning family of U, we see that the set D of Q-linear combinations of (xn )n∈N is also dense in U as well as in X. Clearly, the set D is
also countable, so the space X is also separable. ■
Theorem 3.49. Let X be a normed linear space with subspace U. Then the following map
T : X ∗ /U ⊥ → U ∗ : ϕ 7→ ϕ|U (139)
is an isometric isomorphism.
Proof. First, as shown later in Corollary 3.51, we have a bounded linear map
S : X ∗ → U ∗ : ϕ 7→ ϕ|U .
Theorem 3.50. Let X,Y be normed linear spaces and T : X → Y be a bounded linear map. Then the adjoint
T ∗ : Y ∗ → X ∗ is also linear and bounded with ∥T ∗ ∥ = ∥T ∥.
Proof. First, for every ψ, ψ ′ ∈ Y ∗ , c ∈ F and x ∈ X, we see that
(T∗(ψ + ψ′))(x) = (ψ + ψ′)(Tx) = ψ(Tx) + ψ′(Tx) = (T∗ψ)(x) + (T∗ψ′)(x) = (T∗ψ + T∗ψ′)(x)
and
(T ∗ (cψ))(x) = (cψ)(T x) = cψ(T x) = c(T ∗ ψ)(x) = (c(T ∗ ψ))(x).
Then we have
T ∗ (ψ + ψ ′ ) = T ∗ ψ + T ∗ ψ ′ and T ∗ (cψ) = cT ∗ ψ,
implying that T∗ : Y∗ → X∗ is linear. Furthermore, given any ψ ∈ Y∗,
∥T∗ψ∥ = ∥ψ ◦ T∥ = sup_{∥x∥≤1} |ψ(Tx)| ≤ ∥ψ∥∥T∥.
Therefore, the linear map T∗ is also bounded with ∥T∗∥ ≤ ∥T∥. Finally, let x ∈ X with ∥x∥ ≤ 1. Then we have
∥T∗∥ ≥ sup_{ψ∈Y∗, ∥ψ∥≤1} |(T∗ψ)(x)| = sup_{ψ∈Y∗, ∥ψ∥≤1} |ψ(Tx)| = ∥Tx∥,
where the last equality follows from Corollary 3.38. Then, by taking supremum over all those x's, it is clear that ∥T∗∥ ≥ ∥T∥ as well. The
identity ∥T ∗ ∥ = ∥T ∥ is thus clear for us. ■
Corollary 3.51. Let X be a normed linear space with subspace U. Then the restriction map
T : X ∗ → U ∗ : ϕ 7→ ϕ|U (142)
(S ◦ T )∗ = T ∗ ◦ S∗ . (144)
2. Let S, T ∈ B(X,Y ) and c ∈ F. Then for every ψ ∈ Y ∗ ,
(S + T )∗ ψ = ψ ◦ (S + T ) = ψ ◦ S + ψ ◦ T = S∗ ψ + T ∗ ψ = (S∗ + T ∗ )ψ
and
(cT )∗ ψ = ψ ◦ (cT ) = c(ψ ◦ T ) = c(T ∗ ψ) = (cT ∗ )ψ.
Consequently, we certainly have (S + T )∗ = S∗ + T ∗ and (cT )∗ = cT ∗ .
3. Let T ∈ B(X,Y ) and S ∈ B(Y, Z). For every ψ ∈ Z ∗ ,
(S ◦ T )∗ ψ = ψ ◦ (S ◦ T ) = (ψ ◦ S) ◦ T = T ∗ (S∗ ψ) = (T ∗ ◦ S∗ )ψ.
(T ∗ )−1 = (T −1 )∗ =: T −∗ . (145)
In particular, if X and Y are (isometrically) isomorphic, then X ∗ and Y ∗ are (isometrically) isomorphic as well.
Proof. Let T : X → Y be an isomorphism. Note that T −1 ◦ T = idX and T ◦ T −1 = idY , while T −1 is also bounded. By taking adjoints over
both sides, we have
idX∗ = (idX)∗ = (T⁻¹ ◦ T)∗ = T∗ ◦ (T⁻¹)∗ and idY∗ = (idY)∗ = (T ◦ T⁻¹)∗ = (T⁻¹)∗ ◦ T∗.
As we can see, the map (T −1 )∗ is the functional inverse of T ∗ , which is also bounded as T −1 is. This shows that T ∗ is also an isomorphism
such that (T ∗ )−1 = (T −1 )∗ holds.
Next, suppose that T is also an isometry. Then ∥T ∗ ∥ = ∥T ∥ = 1 as well. For every ψ ∈ Y ∗ , it is immediate that ∥T ∗ ψ∥ ≤ ∥T ∗ ∥∥ψ∥ =
∥ψ∥. Meanwhile,
∥ψ∥ = ∥T −∗ (T ∗ ψ)∥ ≤ ∥T −∗ ∥∥T ∗ ψ∥ = (1)∥T ∗ ψ∥ = ∥T ∗ ψ∥,
because T −1 is also an isometric isomorphism (cf. Theorem 3.7). Therefore, we see that ∥T ∗ ψ∥ = ∥ψ∥ for all ψ ∈ Y ∗ , hence T ∗ is an
isometric isomorphism as well in this case. ■
Lemma 3.54. Let X,Y be normed linear spaces and T : X → Y be a bounded linear map. Then
T∗∗ ◦ evX = evY ◦ T,
where evX : X → X∗∗ and evY : Y → Y∗∗ are the canonical linear isometries on X and Y, respectively.
Proof. Let x ∈ X be arbitrary. Then for every ψ ∈ Y∗,
(T∗∗(evX(x)))(ψ) = (evX(x))(T∗ψ) = (T∗ψ)(x) = ψ(Tx) = (evY(Tx))(ψ),
so we have (T∗∗ ◦ evX)(x) = evY(Tx) = (evY ◦ T)(x). Since x ∈ X is arbitrary, it follows that T∗∗ ◦ evX = evY ◦ T. ■
Theorem 3.55. Let X,Y be normed linear spaces. Then a bounded linear map S : Y ∗ → X ∗ is an adjoint of
some map in B(X,Y ) if and only if im(S∗ ◦ evX ) ⊆ im(evY ), where evX : X → X ∗∗ and evY : Y → Y ∗∗ are the
canonical linear isometries on X and Y , respectively.
Proof. First, suppose that S = T ∗ for some bounded linear map T : X → Y . By the preceding lemma, we see that
S∗ ◦ evX = T∗∗ ◦ evX = evY ◦ T,
so
im(S∗ ◦ evX ) = im(evY ◦T ) ⊆ im(evY ).
Conversely, assume that im(S∗ ◦ evX) ⊆ im(evY). Define
T : X → Y : x ↦ evY⁻¹((S∗ ◦ evX)(x)).
Here because evY is injective and im(S∗ ◦ evX) ⊆ im(evY), such definition is possible. What remains is to prove that S = T∗: Let ψ ∈ Y∗ be
arbitrary. Then for every x ∈ X,
(T∗ψ)(x) = ψ(Tx) = (evY(Tx))(ψ) = ((S∗ ◦ evX)(x))(ψ) = (evX(x))(Sψ) = (Sψ)(x),
hence T∗ψ = Sψ. Since ψ ∈ Y∗ is arbitrary, it follows that T∗ = S, as desired. ■
Theorem 3.56. Every normed linear space isomorphic to a reflexive normed linear space is also reflexive.
Proof. Let X be a reflexive normed linear space, Y be a normed linear space isomorphic to X via an isomorphism T : X → Y , and evX : X →
X∗∗, evY : Y → Y∗∗ be the canonical linear isometries. First, by Corollary 3.53, both T∗ : Y∗ → X∗ and T∗∗ : X∗∗ → Y∗∗ are isomorphisms
as well. Then according to Lemma 3.54, we see that evY = T ∗∗ ◦ evX ◦T −1 , which is an isomorphism as well. Therefore, the space Y is also
reflexive. ■
Theorem 3.57. Let X be a Banach space. If X is reflexive, then every closed subspace U of X is also reflexive.
Proof. Suppose that X is reflexive, and let U be a closed subspace of X. By Theorem 1.10, we see that U is a Banach space as well. Again,
we still denote by evX : X → X ∗∗ and evU : U → U ∗∗ the canonical linear isometries. Let ϑ ∈ U ∗∗ be arbitrary. Then define
ϑ̂ : X∗ → F : ϕ ↦ ϑ(ϕ|U).
Clearly, the map ϑ̂ is linear, while |ϑ̂(ϕ)| = |ϑ(ϕ|U)| ≤ ∥ϑ∥∥ϕ|U∥ ≤ ∥ϑ∥∥ϕ∥ for all ϕ ∈ X∗, so ϑ̂ ∈ X∗∗. Since X is reflexive, there exists
x ∈ X such that evX(x) = ϑ̂. We claim that x ∈ U: Suppose otherwise. Since U is closed, by the Hahn-Banach theorem, there exists ϕ ∈ X∗
such that ϕ|U = 0 while ϕ(x) ≠ 0. In this case,
0 ≠ ϕ(x) = (evX(x))(ϕ) = ϑ̂(ϕ) = ϑ(ϕ|U) = ϑ(0) = 0,
a contradiction. Finally, we simply show that ϑ = evU(x) as well: Let ψ ∈ U∗ be arbitrary. By the Hahn-Banach theorem (cf. Corollary 3.37),
it can be linearly extended to some ϕ ∈ X∗. In this case,
ϑ(ψ) = ϑ(ϕ|U) = ϑ̂(ϕ) = (evX(x))(ϕ) = ϕ(x) = ψ(x) = (evU(x))(ψ),
where ϕ(x) = ψ(x) holds because x ∈ U and ϕ|U = ψ. The identity ϑ = evU(x) thus holds, so evU is also surjective, whence U is reflexive as well. ■
Theorem 3.58. Let X be a normed linear space. Then its continuous dual X ∗ is reflexive whenever X is, while
the converse is true when X is a Banach space.
Proof. Let X be a normed linear space, and evX : X → X ∗∗ , evX ∗ : X ∗ → X ∗∗∗ be the canonical linear isometries. First, suppose that X
is reflexive, namely the map evX is an isomorphism. Then we let ϑ ∈ X∗∗∗ be arbitrary. As we can see, for each ψ ∈ X∗∗, denoting by
x := evX⁻¹(ψ),
ϑ(ψ) = ϑ(evX(x)) = (ev∗X(ϑ))(x) = (evX(x))(ev∗X(ϑ)) = ψ(ev∗X(ϑ)) = (evX∗(ev∗X(ϑ)))(ψ).
Consequently, we have ϑ = evX ∗ (ev∗X (ϑ )), implying that evX ∗ is an isomorphism as well. In other words, the space X ∗ is also reflexive.
Conversely, suppose that X ∗ is reflexive. Then as shown above, we see that the bidual X ∗∗ is reflexive as well. Meanwhile, note that
evX is a linear isometry, so X is isometrically isomorphic to im(evX ). If X is a Banach space, so is im(evX ). Note that X ∗∗ is always a
Banach space, hence according to Theorem 1.10, the subspace im(evX ) is closed in X ∗∗ . By the preceding two theorems, we can conclude
that im(evX ) as well as X are both reflexive in this case. ■
Corollary 3.59. Let X be a Banach space. If X is reflexive, for every closed subspace U of X, the quotient space
X/U is reflexive as well.
Proof. Suppose that X is reflexive, and let U be a closed subspace of X. Now by Theorem 3.46, we see that (X/U)∗ is isometrically
isomorphic to U ⊥ . Here U ⊥ is a closed subspace of X ∗ , while X ∗ , as shown in the preceding theorem, is also reflexive, hence U ⊥ as well as
(X/U)∗ should be reflexive as well. Again, it follows from Theorem 1.16 that X/U is also a Banach space, so it is necessarily reflexive by
the preceding theorem. ■
Theorem 3.60. Let X,Y be normed linear spaces and T : X → Y be a bounded linear map. Then
ker(T∗) = im(T)⊥ and ker(T) = im(T∗)⊥, (147)
and
ker(T)⊥ ⊇ im(T∗) and ker(T∗)⊥ = \overline{im(T)}. (148)
Proof. For every ψ ∈ Y∗,
ψ ∈ ker(T∗) ⇐⇒ T∗ψ = 0 ⇐⇒ ∀x ∈ X : ψ(Tx) = (T∗ψ)(x) = 0 ⇐⇒ ψ ∈ im(T)⊥,
so ker(T∗) = im(T)⊥; the remaining identities are proved similarly.
Lemma 3.61. Let X,Y be Banach spaces and T : X → Y be a bounded linear map. Then the image of T is
closed in Y if and only if there is some C > 0 such that given any y ∈ im(T ), there exists x ∈ X with y = T x such
that ∥x∥ ≤ C∥y∥.
Proof. Suppose that im(T ) is closed in Y . Then by the isomorphism theorem (cf. Corollary 3.26), we have an isomorphism
T̃ : X/ ker(T ) → im(T ) : x 7→ T x.
We claim that C := 2∥T̃⁻¹∥ > 0 would suffice: Let y ∈ im(T) be arbitrary. If y = 0Y, we can simply put x = 0X and the statement is trivial
for us. Thus, we shall also assume that y is nonzero then.
In this case, fix an arbitrary representative x0 ∈ X of the coset T̃⁻¹y. Because y is nonzero, the coset T̃⁻¹y is nonzero as well. Then for
ε := ∥T̃⁻¹y∥ > 0, by the definition of the quotient norm, there exists uε ∈ ker(T) such that
∥x0 − uε∥ < ∥T̃⁻¹y∥ + ε = 2∥T̃⁻¹y∥ ≤ 2∥T̃⁻¹∥∥y∥ = C∥y∥,
and, putting x := x0 − uε,
Tx = Tx0 − Tuε = Tx0 = T̃x̄0 = T̃(T̃⁻¹y) = y.
Conversely, suppose that such constant C > 0 exists. Still, the map T̃ : X/ ker(T ) → im(T ) defined above, by the factorization theorem
(cf. Theorem 3.15), is a bounded linear bijection onto im(T). Fix an arbitrary y ∈ im(T), and let x ∈ X be such that y = Tx and ∥x∥ ≤ C∥y∥. Then we
have
∥T̃⁻¹y∥ = ∥x̄∥ ≤ ∥x∥ ≤ C∥y∥,
so the inverse of T̃ is bounded as well. Consequently, the map T̃ is an isomorphism. Now X/ ker(T ) is a Banach space as X is (cf. Theorem
1.16), so is im(T ). By Theorem 1.10, we can conclude that im(T ) is closed in Y , as desired. ■
Corollary 3.62. Let X,Y be Banach spaces and T : X → Y be a bounded linear map. If there exists C > 0 such
that ∥x∥ ≤ C∥T x∥ for all x ∈ X, then the map T is injective and im(T ) is closed in Y .
Proof. Suppose that there is C > 0 such that ∥x∥ ≤ C∥Tx∥ for all x ∈ X. Then for x ∈ ker(T), we see that
∥x∥ ≤ C∥Tx∥ = C∥0Y∥ = 0.
This shows that x = 0X, whence T is injective. Meanwhile, for each y ∈ im(T), it has a unique preimage x ∈ X, which satisfies ∥x∥ ≤
C∥T x∥ = C∥y∥. By the preceding lemma, we may conclude that im(T ) is closed in Y as well. ■
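Numerically, for an injective matrix operator the constant C in the corollary can be taken as the reciprocal of the smallest singular value. The sketch below is an illustration with a made-up 2×2 matrix, not part of the note:

```python
# For an injective matrix operator T on R^2, the smallest singular value
# sigma_min of T satisfies ||T x|| >= sigma_min ||x||, so C = 1/sigma_min
# works in the corollary.  (Hypothetical example matrix.)
import math

T = [[2.0, 1.0], [0.0, 1.0]]

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

def norm(x):
    return math.hypot(x[0], x[1])

# Gram matrix G = T^T T (symmetric 2x2); its eigenvalues, obtained via
# the quadratic formula, are the squared singular values of T.
G = [[T[0][0]**2 + T[1][0]**2, T[0][0]*T[0][1] + T[1][0]*T[1][1]],
     [T[0][0]*T[0][1] + T[1][0]*T[1][1], T[0][1]**2 + T[1][1]**2]]
tr = G[0][0] + G[1][1]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
lam_min = (tr - math.sqrt(tr * tr - 4 * det)) / 2
C = 1 / math.sqrt(lam_min)

for x in ([1.0, 0.0], [0.3, -2.0], [5.0, 4.0]):
    assert norm(x) <= C * norm(apply(T, x)) + 1e-9
print("||x|| <= C ||T x|| holds with C =", C)
```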
Lemma 3.63. Let X,Y be Banach spaces and T : X → Y be a bounded linear map. If there exists c > 0 such
that c∥ϕ∥ ≤ ∥T ∗ ϕ∥ for all ϕ ∈ Y ∗ , then the map T is also surjective.
Proof. Suppose that there exists c > 0 such that c∥ϕ∥ ≤ ∥T∗ϕ∥ for all ϕ ∈ Y∗. Since X,Y are Banach spaces, by the open mapping theorem
(cf. Theorem 3.24), it suffices to prove that 0Y is an interior point of T(B(0X, 1)), where B(0X, 1) := {x ∈ X | ∥x∥ < 1} is the open unit ball
in X. Furthermore, in light of the proofs there, one only needs to show that 0Y is an interior point of \overline{T(B(0X, 1))}.
Suppose otherwise. (The remaining proofs require the separation version of Hahn-Banach theorem. To be completed later.) ■
Theorem 3.64 (Closed Range Theorem; Banach). Let X,Y be Banach spaces and T : X → Y be a bounded
linear map. Then
im(T ) is closed in Y ⇐⇒ im(T ) = ker(T ∗ )⊥
(149)
⇐⇒ im(T ∗ ) is closed in X ∗ ⇐⇒ im(T ∗ ) = ker(T )⊥ .
Proof. First, by Theorem 3.60, we have
ker(T)⊥ ⊇ im(T∗) and ker(T∗)⊥ = \overline{im(T)}.
Consequently,
im(T) = ker(T∗)⊥ ⇐⇒ im(T) = \overline{im(T)} ⇐⇒ im(T) is closed in Y.
To complete the proof, we consider the following implications:
• First, we show that im(T∗) = ker(T)⊥ holds if im(T) is closed in Y: Again, it is clear from above that im(T∗) ⊆ \overline{im(T∗)} ⊆ ker(T)⊥.
Conversely, let ϕ ∈ ker(T )⊥ be arbitrary. Then we have ker(T ) ⊆ ker(ϕ). In light of this, we define
ψ0 : im(T) → F : Tx ↦ ϕ(x).
First, let x, x′ ∈ X be such that T x = T x′ . Then T (x − x′ ) = T x − T x′ = 0Y , implying that x − x′ ∈ ker(T ) ⊆ ker(ϕ). Consequently,
we also have 0 = ϕ(x − x′ ) = ϕ(x) − ϕ(x′ ), so ϕ(x) = ϕ(x′ ) as well. This shows that ψ0 is well-defined.
– For every y, y′ ∈ im(T ) and c ∈ F, suppose that y = T x and y′ = T x′ for some x, x′ ∈ X, observe that cy + y′ = c(T x) + T x′ =
T (cx + x′ ), so
ψ0 (cy + y′ ) = ϕ(cx + x′ ) = cϕ(x) + ϕ(x′ ) = cψ0 (y) + ψ0 (y′ ).
Consequently, the map ψ0 is also linear.
– Finally, since im(T ) is closed, by Lemma 3.61, there exists C > 0 such that for any y ∈ im(T ), there exists x ∈ X with y = T x
such that ∥x∥ ≤ C∥y∥. In this case,
|ψ0 (y)| = |ϕ(x)| ≤ ∥ϕ∥∥x∥ ≤ C∥ϕ∥∥y∥.
Thus, the linear functional ψ0 is also bounded with ∥ψ0∥ ≤ C∥ϕ∥.
Then by Hahn-Banach theorem (cf. Corollary 3.37), we can extend ψ0 to a bounded linear functional ψ ∈ Y ∗ . In this case, for every
x ∈ X,
ϕ(x) = ψ0 (T x) = ψ(T x) = (T ∗ ψ)(x).
This shows that ϕ = T ∗ ψ ∈ im(T ∗ ), so the reversed inclusion ker(T )⊥ ⊆ im(T ∗ ) holds as well.
• Next, if im(T∗) = ker(T)⊥, then since the annihilator ker(T)⊥ is closed in X∗ (cf. Theorem 3.45), the image im(T∗) is closed in X∗ in this case.
Therefore, the map T̃ : Y → X is indeed linear in this case. Furthermore, for every y ∈ Y ,
Consequently, the linear map T̃ is also bounded with ∥T̃ ∥ ≤ ∥T ∗ ∥. In addition, since T ∗ = ηX ◦ T̃ ◦ ηY , we shall have ∥T ∗ ∥ ≤ ∥T̃ ∥ as well
by symmetric arguments. This then show that ∥T̃ ∥ = ∥T ∗ ∥ = ∥T ∥, where the last equality is from Theorem 3.50, completing the proof. ■
Definition 3.15. Let X,Y be Hilbert spaces and T : X → Y be a bounded linear map. Then the (Hilbert-space)
adjoint of T is defined as the unique bounded linear map from Y to X, also denoted by T∗, such that
∀x ∈ X, ∀y ∈ Y : ⟨Tx, y⟩Y = ⟨x, T∗y⟩X.
Remark 3.15. Clearly, the Hilbert-space adjoint of T : X → Y can be regarded as a “pullback” of its Banach-
space adjoint, whose relationship is displayed in the following commutative diagram:

        ηY
    Y ──────→ Y∗
    │          │
    T∗Hilbert  T∗Banach
    ↓          ↓
    X ──────→ X∗
        ηX

In other words, ηX ◦ (T∗)Hilbert = (T∗)Banach ◦ ηY, where ηX and ηY are the Riesz isometries.
(T ∗ )∗ = T and ∥T ∗ T ∥ = ∥T T ∗ ∥ = ∥T ∥2 . (152)
(S ◦ T )∗ = T ∗ ◦ S∗ . (154)
(T ∗ )−1 = (T −1 )∗ =: T −∗ . (155)
2. Let S, T : X → Y be bounded linear maps and c ∈ F be a scalar. Then for every x ∈ X and y ∈ Y ,
⟨(S + T)x, y⟩ = ⟨Sx, y⟩ + ⟨Tx, y⟩ = ⟨x, S∗y⟩ + ⟨x, T∗y⟩ = ⟨x, (S∗ + T∗)y⟩
and
⟨(cT)x, y⟩ = ⟨c(Tx), y⟩ = c⟨Tx, y⟩ = c⟨x, T∗y⟩ = ⟨x, \overline{c}(T∗y)⟩ = ⟨x, (\overline{c}T∗)y⟩.
This shows that S∗ + T∗ = (S + T)∗ and \overline{c}T∗ = (cT)∗.
3. Let T : X → Y and S : Y → Z be bounded linear maps. Then for every x ∈ X and z ∈ Z,
⟨(S ◦ T)x, z⟩ = ⟨S(Tx), z⟩ = ⟨Tx, S∗z⟩ = ⟨x, T∗(S∗z)⟩ = ⟨x, (T∗ ◦ S∗)z⟩,
hence (S ◦ T)∗ = T∗ ◦ S∗.
4. Suppose that T : X → Y is an isomorphism. Note that T⁻¹ ◦ T = idX and T ◦ T⁻¹ = idY, while T⁻¹ is also bounded. By taking adjoints over
both sides, we have
idX = (idX)∗ = (T⁻¹ ◦ T)∗ = T∗ ◦ (T⁻¹)∗ and idY = (idY)∗ = (T ◦ T⁻¹)∗ = (T⁻¹)∗ ◦ T∗.
As we can see, the map (T⁻¹)∗ is the functional inverse of T∗, which is also bounded as T⁻¹ is. This shows that T∗ is also an isomorphism
such that (T∗)⁻¹ = (T⁻¹)∗ holds. ■
Theorem 3.67 (Kernel and Image of Adjoints). Let X,Y be Hilbert spaces and T : X → Y be a bounded linear
map. Then
ker(T∗) = im(T)⊥ and \overline{im(T∗)} = ker(T)⊥, (156)
while
ker(T) = im(T∗)⊥ and \overline{im(T)} = ker(T∗)⊥. (157)
Proof. First, for every y ∈ Y,
y ∈ ker(T∗) ⇐⇒ T∗y = 0X ⇐⇒∗ ∀x ∈ X : ⟨x, T∗y⟩ = 0 ⇐⇒ ∀x ∈ X : ⟨Tx, y⟩ = 0 ⇐⇒ y ∈ im(T)⊥.
Here the equivalence marked ∗ follows from Theorem 2.21. This shows that ker(T∗) = im(T)⊥, and hence by Corollary 2.39, we also have
Theorem 3.68 (Left Invertible Linear Maps). Let X be a Hilbert space and T ∈ B(X). Then the following
statements are equivalent:
1. The map T is left invertible, namely there exists S ∈ B(X) such that ST = idX .
2. There exists a positive constant α such that ∥x∥ ≤ α∥T x∥ for all x ∈ X.
Proof. (1 =⇒ 2). Suppose that T is left invertible with left inverse S ∈ B(X), namely ST = idX. Then for every x ∈ X,
∥x∥ = ∥S(Tx)∥ ≤ ∥S∥∥Tx∥,
so the condition holds with constant α = ∥S∥ > 0.
(2 =⇒ 3). Suppose that there is α > 0 such that ∥x∥ ≤ α∥Tx∥ for all x ∈ X. First, for every x ∈ ker(T), we have
∥x∥ ≤ α∥Tx∥ = α∥0X∥ = 0.
This shows that x = 0X, so the map T is injective. Meanwhile, let (yn)n∈N be a sequence of elements in im(T) converging to some y ∈ X.
Suppose that yn = Txn with xn ∈ X for all n ∈ N. Then for every m, n ∈ N,
∥xm − xn∥ ≤ α∥Txm − Txn∥ = α∥ym − yn∥.
Since (yn)n∈N is convergent, itself is a Cauchy sequence in X. By the inequality above, it is clear that (xn)n∈N is also a Cauchy sequence in
X. Note that X is complete, so the sequence (xn )n∈N converges to some x ∈ X. By the continuity of T ,
T x = lim T xn = lim yn = y,
n→∞ n→∞
and
∥Sx∥ = ∥T̂ (Px)∥ ≤ ∥T ∥∥Px∥ ≤ ∥T ∥∥x∥.
This shows that S ∈ B(X) with ∥S∥ ≤ ∥T ∥ and ST = idX , namely T is left invertible.
(2 =⇒ 4). Suppose that there is α > 0 satisfying ∥x∥ ≤ α∥Tx∥ for all x ∈ X. Then given any x ∈ X,
∥x∥² ≤ α²∥Tx∥² = α²⟨Tx, Tx⟩ = α²⟨x, T∗Tx⟩ ≤ α²∥x∥∥T∗Tx∥,
so we also have ∥x∥ ≤ α²∥T∗Tx∥ in this case. Note that α² > 0, so by (1 ⇐⇒ 2), we can conclude that T∗T is left invertible. Suppose that
S ∈ B(X) is a left inverse of T∗T. Then we see that S(T∗T) = idX, hence
idX = (idX)∗ = (S(T∗T))∗ = (T∗T)∗S∗ = (T∗T)S∗.
Since S∗ ∈ B(X), the map T ∗ T is right invertible as well. Therefore, we may conclude that T ∗ T is an isomorphism.
(4 =⇒ 1). Suppose that T ∗ T is an isomorphism. Denote by S := (T ∗ T )−1 . Then we can see that idX = S(T ∗ T ) = (ST ∗ )T , hence the
map T is certainly left invertible, completing the proof. ■
Lemma 3.69. Let X be a Hilbert space. Then for every T ∈ B(X), its restriction T̂ to ker(T )⊥ is injective and
bounded with identical image as T .
Proof. Let T ∈ B(X) be arbitrary and T̂ : ker(T)⊥ → X be its restriction to ker(T)⊥. Clearly, the map T̂ is bounded, while also
ker(T̂) = ker(T) ∩ ker(T)⊥ = {0X},
so T̂ is certainly injective.
Furthermore, it is clear that im(T̂) ⊆ im(T). As for the converse, note that by Corollary 3.2, the kernel ker(T) is a closed subspace of X. Then for every
x ∈ X, by Theorem 2.38, it admits a unique decomposition x = p + z, where p ∈ ker(T ) and z ∈ ker(T )⊥ . In this case,
Tx = T(p + z) = Tp + Tz = 0X + Tz = Tz = T̂z.
This shows that im(T) ⊆ im(T̂), hence im(T̂) = im(T), as desired. ■
Theorem 3.70 (Right Invertible Linear Maps). Let X be a Hilbert space and T ∈ B(X). Then the following
statements are equivalent:
1. The map T is right invertible, namely there exists S ∈ B(X) such that T S = idX .
(1 ⇐⇒ 2). Suppose that T is right invertible with right inverse S ∈ B(X). Then for every y ∈ X, we can see that y = idX (y) = (T S)y =
T (Sy). This shows that the map T is surjective.
Conversely, suppose that T is surjective. By the preceding lemma, the restriction T̂ of T to ker(T )⊥ is also bounded and injective with
the same image as T . However, because T is surjective, the map T̂ is surjective as well. Therefore, we can see that T̂ is a bounded linear
isomorphism from ker(T )⊥ to X.
Note that ker(T )⊥ is a closed subspace of X, hence is also complete by Theorem 1.10. By the bounded inverse theorem (cf. Corollary
3.25), its functional inverse is bounded as well. Besides, for each x ∈ X, we have
(TT̂⁻¹)x = T(T̂⁻¹x) = T̂(T̂⁻¹x) = x.
This shows that TT̂⁻¹ = idX, hence T is right invertible with right inverse T̂⁻¹ ∈ B(X), completing the proof. ■
Theorem 3.71 (Isometry Between Hilbert Spaces). Let X,Y be Hilbert spaces and T : X → Y be a linear map.
Then the following statements are equivalent:
1. The map T is an isometry.
2. T∗T = idX.
3. The family (Tei)i∈I is orthonormal in Y for every orthonormal family (ei)i∈I in X.
4. There is an orthonormal base (ei)i∈I of X such that the family (Tei)i∈I is also orthonormal in Y.
Proof. (1 ⇐⇒ 2). By Theorem 3.6, we know that T is an isometry if and only if ⟨Tx, Ty⟩ = ⟨x, y⟩ for all x, y ∈ X. Note that ⟨Tx, Ty⟩ =
⟨x, T ∗ Ty⟩, so
∀x, y ∈ X : ⟨T x, Ty⟩ = ⟨x, y⟩ ⇐⇒ ∀x, y ∈ X : ⟨x, T ∗ Ty⟩ = ⟨x, y⟩ ⇐⇒ T ∗ T = idX .
(1 =⇒ 3). Suppose that T is an isometry. Let (ei )i∈I be an orthonormal family in X. Then for every i, j ∈ I, by the preceding theorem,
⟨Tei, Tej⟩ = ⟨ei, ej⟩ = δi,j (i.e. 1 if i = j, and 0 if i ≠ j),
so (Tei)i∈I is an orthonormal family in Y.
where the last equality holds because (Tei )i∈I is an orthonormal family. As a result, for every y ∈ X,
Definition 3.16. Let X be a Hilbert space. A bounded linear operator T ∈ B(X) is called unitary if it is a
surjective isometry.
Theorem 3.72 (Equivalent Definitions of Unitary Operators). Let X be a Hilbert space and T ∈ B(X). Then
the following statements are equivalent:
3. T ∗ T = T T ∗ = idX .
6. For every orthonormal base (ei )i∈I of X, the family (Tei )i∈I is also an orthonormal base of X.
7. There exists an orthonormal base (ei )i∈I of X such that the family (Tei )i∈I is also an orthonormal base of
X.
Proof. (1 =⇒ 2). Suppose that T is unitary. Being a surjective linear isometry, it is certainly a linear isomorphism, so by Theorem 3.7, the
map T is an isometric isomorphism. In particular, given any x, y ∈ X,
This shows that X = \overline{Span(Tei | i ∈ I)}, so by Theorem 2.43, the family (Tei)i∈I is also an orthonormal base of X.
(6 =⇒ 7). This is trivial.
(7 =⇒ 1). Suppose that the family (Tei )i∈I is also an orthonormal base of X for some orthonormal base (ei )i∈I of X. Again, by the
preceding theorem and Theorem 2.43, we see that T is an isometry with
X = \overline{Span(Tei | i ∈ I)} ⊆ \overline{im(T)},
namely \overline{im(T)} = X. Now by Theorem 3.7, we see that X is isometrically isomorphic to im(T), so im(T) is also complete. Because X is a
Hilbert space, it follows from Theorem 1.10 that im(T) is also closed in X. Consequently, we see that im(T) = \overline{im(T)} = X, namely T is
surjective. Being a surjective isometry, one can conclude from the definition that T is unitary, as desired. ■
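A finite-dimensional illustration of statements 6-7 (the matrix U below is a made-up 2×2 unitary, not from the note): the images of the standard base under U are exactly its columns, and they again form an orthonormal family:

```python
# A unitary 2x2 matrix maps the standard orthonormal base of C^2 to
# another orthonormal family; U below is one hypothetical example.
import math

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

s = 1 / math.sqrt(2)
U = [[s, s], [1j * s, -1j * s]]      # columns are orthonormal in C^2

u1, u2 = apply(U, [1, 0]), apply(U, [0, 1])
print(abs(inner(u1, u1) - 1), abs(inner(u2, u2) - 1), abs(inner(u1, u2)))
# all three printed values are numerically zero
```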
Example 3.1. In fact, every unitarily invariant norm on Fn , namely ∥Ux∥ = ∥x∥ for every x ∈ Fn and unitary
matrix U ∈ Mn (F), is a positive scalar multiple of the l2 -norm:
• For every x ∈ Fn and unitary matrix U, consider the standard inner product on Fn: we have
∥Ux∥₂² = ⟨Ux, Ux⟩ = ⟨x, U∗Ux⟩ = ⟨x, x⟩ = ∥x∥₂²,
so the l2-norm, as well as any positive scalar multiple of it, is indeed unitarily invariant.
• Suppose that ∥ · ∥ is a unitarily invariant norm on Fn . Consider the Householder matrix U onto e1 . It is clear that
Definition 3.17. Let X be a Hilbert space. A bounded linear operator T ∈ B(X) is called self-adjoint if T ∗ = T .
3. For every T ∈ B(X), there exist unique self-adjoint operators A, B ∈ B(X) such that T = A + iB and
T ∗ = A − iB.
Proof. 1. Let S, T ∈ B(X) be self-adjoint operators. First, for every c ∈ R, we have
(cT)∗ = \overline{c}T∗ = cT∗ = cT,
so cT is self-adjoint as well. Meanwhile,
(S + T)∗ = S∗ + T∗ = S + T and (S ◦ T)∗ = T∗ ◦ S∗ = T ◦ S.
Therefore, S + T is self-adjoint, while S ◦ T is self-adjoint if and only if T ◦ S = S ◦ T , namely S, T commute in this case.
• In particular, by an easy induction, one can see that Tⁿ is self-adjoint for every non-negative integer n, so every real linear combination of
non-negative powers of T is also self-adjoint.
• Suppose that T is an isomorphism. Note that (T −1 )∗ = (T ∗ )−1 = T −1 , so its inverse T −1 is also self-adjoint.
2–3. Let T ∈ B(X) be arbitrary. Then
(T ∗ T )∗ = T ∗ (T ∗ )∗ = T ∗ T, (T T ∗ )∗ = (T ∗ )∗ T ∗ = T T ∗ , and (T + T ∗ )∗ = T ∗ + (T ∗ )∗ = T ∗ + T,
so T∗T, TT∗, and T + T∗ are all self-adjoint. Furthermore, if A, B ∈ L(X) are operators such that T = A + iB and T∗ = A − iB, we must have
A = (1/2)(T + T∗) and B = (1/(2i))(T − T∗).
Now let A, B ∈ L(X) be defined as above. It is clear that A is self-adjoint, as T + T∗ is and 1/2 ∈ R. Meanwhile,
B∗ = ((1/(2i))(T − T∗))∗ = (−1/(2i))(T − T∗)∗ = (−1/(2i))(T∗ − T) = (1/(2i))(T − T∗) = B,
so B is self-adjoint as well. The desired assertion thus follows. ■
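The decomposition in parts 2-3 can be checked numerically. The sketch below uses an assumed 2×2 matrix (not from the note), forms A = (T + T∗)/2 and B = (T − T∗)/(2i), and verifies that both are self-adjoint with T = A + iB:

```python
# Cartesian decomposition T = A + iB of a made-up 2x2 complex matrix.

def conj_transpose(M):
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def combine(M, N, c):
    """Entrywise M + c*N."""
    return [[M[i][j] + c * N[i][j] for j in range(2)] for i in range(2)]

def scale(M, c):
    return [[c * M[i][j] for j in range(2)] for i in range(2)]

T = [[1 + 2j, 3], [4j, 5 - 1j]]
Ts = conj_transpose(T)
A = scale(combine(T, Ts, 1), 0.5)        # (T + T*)/2
B = scale(combine(T, Ts, -1), 1 / 2j)    # (T - T*)/(2i)

assert conj_transpose(A) == A            # A is self-adjoint
assert conj_transpose(B) == B            # B is self-adjoint
assert combine(A, B, 1j) == T            # T = A + iB
print("T = A + iB with A, B self-adjoint")
```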
Theorem 3.74. Let X be a complex Hilbert space. An operator T ∈ B(X) is self-adjoint if and only if ⟨Tx, x⟩ ∈ R
for all x ∈ X. Furthermore, when X is nonzero and T is self-adjoint,
∥T∥ = sup_{∥x∥=1} |⟨Tx, x⟩|.
Proof. First, suppose that T is self-adjoint. Then for every x ∈ X,
⟨Tx, x⟩ = ⟨x, T∗x⟩ = ⟨x, Tx⟩ = \overline{⟨Tx, x⟩},
so ⟨Tx, x⟩ ∈ R indeed. Now denote M := sup_{∥x∥=1} |⟨Tx, x⟩|; by the Cauchy-Schwarz inequality, M ≤ ∥T∥.
Thus
ℜ⟨Tx, y⟩ = (1/4)(⟨T(x + y), x + y⟩ − ⟨T(x − y), x − y⟩)
≤ (M/4)(∥x + y∥² + ∥x − y∥²) = (M/2)(∥x∥² + ∥y∥²).
Now suppose that ∥x∥ = 1 and Tx ≠ 0X. Then we may put y := ∥Tx∥⁻¹(Tx). Clearly, we have
ℜ⟨Tx, y⟩ = ∥Tx∥⁻¹ℜ⟨Tx, Tx⟩ = ∥Tx∥⁻¹∥Tx∥² = ∥Tx∥,
while
ℜ⟨Tx, y⟩ ≤ (M/2)(∥x∥² + ∥y∥²) = (M/2)(1² + 1²) = M.
Therefore, we see that ∥T x∥ ≤ M for all ∥x∥ = 1 (such inequality automatically holds if T x = 0X ), hence ∥T ∥ ≤ M as well.
Conversely, suppose that ⟨Tx, x⟩ ∈ R for all x ∈ X. Then for every x ∈ X, we see that ⟨x, Tx⟩ = \overline{⟨Tx, x⟩} = ⟨Tx, x⟩ = ⟨x, T∗x⟩, so
⟨x, (T − T∗)x⟩ = 0 for all x ∈ X. Since X is a complex Hilbert space, the generalized polarization identity (cf. Theorem 2.12) then yields
T − T∗ = 0, namely T is self-adjoint. ■
Alternative Proof of Equivalent Condition. By Corollary 2.14, we see that the map
ϕ : X × X → C : (x, y) 7→ ⟨T x, y⟩
is a sesquilinear form on X. Furthermore, its associated quadratic form is precisely given by x 7→ ⟨T x, x⟩. Then according to Corollary 2.13,
∀x ∈ X : ⟨Tx, x⟩ ∈ R ⇐⇒ ∀x ∈ X : ϕ(x, x) ∈ R
⇐⇒ ∀x, y ∈ X : ϕ(x, y) = \overline{ϕ(y, x)}
⇐⇒ ∀x, y ∈ X : ⟨Tx, y⟩ = \overline{⟨Ty, x⟩} = ⟨x, Ty⟩ ⇐⇒ T = T∗. ■
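Both characterizations can be probed numerically. The sketch below (a made-up 2×2 Hermitian matrix, not from the note) checks that ⟨Tx, x⟩ is real and that the supremum of |⟨Tx, x⟩| over unit vectors is attained at an eigenvector, where it equals the eigenvalue largest in modulus:

```python
# <Tx, x> is real for a Hermitian T, and the Rayleigh quotient at a top
# eigenvector attains max |eigenvalue|.  (Assumed example matrix.)
import math

T = [[2 + 0j, 1 - 1j], [1 + 1j, -1 + 0j]]      # Hermitian 2x2

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

for x in ([1, 2j], [1 - 1j, 3], [0.5j, -2]):   # <Tx, x> is real
    assert abs(inner(apply(T, x), x).imag) < 1e-12

# eigenvalues solve t^2 - tr*t + det = 0 and are real for Hermitian T;
# for this T the larger root is also the largest in modulus
tr = (T[0][0] + T[1][1]).real
det = (T[0][0] * T[1][1] - T[0][1] * T[1][0]).real
lam = (tr + math.sqrt(tr * tr - 4 * det)) / 2
v = [T[0][1], lam - T[0][0]]                   # an eigenvector for lam
nv = math.sqrt(inner(v, v).real)
u = [a / nv for a in v]                        # unit eigenvector
print(inner(apply(T, u), u).real, lam)         # Rayleigh quotient = lam
```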
Definition 3.18. Let X be a real linear space. A bilinear form ϕ : X × X → R is called symmetric if ϕ(x, y) =
ϕ(y, x) for all x, y ∈ X, in which case the associated quadratic form is defined as
Φ : X → R : x ↦ ϕ(x, x).
Remark 3.16. Similar to sesquilinear forms, it is clear that every semi-inner product on real linear spaces is
automatically a symmetric bilinear form (that is also positive semi-definite).
Theorem 3.75. Let X be a real linear space, ϕ : X × X → R be a symmetric bilinear form on X, and Φ : X → R
be the quadratic form on X associated with ϕ. Then for every x, y ∈ X,
ϕ(x, y) = (1/2)(Φ(x + y) − Φ(x) − Φ(y)) = (1/4)(Φ(x + y) − Φ(x − y)). (160)
In other words, the symmetric bilinear form ϕ is completely determined by its associated quadratic form Φ.
Proof. For every x, y ∈ X,
Φ(x ± y) = ϕ(x ± y, x ± y) = ϕ(x, x) ± ϕ(x, y) ± ϕ(y, x) + ϕ(y, y) = Φ(x) ± 2ϕ(x, y) + Φ(y),
by the bilinearity and symmetry of ϕ. Therefore, we have Φ(x + y) − Φ(x) − Φ(y) = 2ϕ(x, y) and Φ(x + y) − Φ(x − y) = 4ϕ(x, y). The desired identities thus follow. ■
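A quick numerical check of identity (160), with a made-up symmetric 2×2 matrix M defining ϕ(x, y) = xᵀMy on R² (illustration only, not part of the note):

```python
# Polarization for a symmetric bilinear form phi(x, y) = x^T M y on R^2.

M = ((2.0, 1.0), (1.0, 3.0))       # symmetric, hence phi is symmetric

def phi(x, y):
    return sum(x[i] * M[i][j] * y[j] for i in range(2) for j in range(2))

def Phi(x):                        # associated quadratic form
    return phi(x, x)

def add(x, y, s=1.0):
    return [a + s * b for a, b in zip(x, y)]

x, y = [1.0, -2.0], [0.5, 4.0]
lhs = phi(x, y)
mid = 0.5 * (Phi(add(x, y)) - Phi(x) - Phi(y))
rhs = 0.25 * (Phi(add(x, y)) - Phi(add(x, y, -1.0)))
print(lhs, mid, rhs)               # all three equal -20.0
```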
Remark 3.17. One may note that the above theorem is parallel to the generalized polarization identity for
sesquilinear forms (cf. Theorem 2.12).
Theorem 3.76. Let X be a Hilbert space and T ∈ B(X), where T is additionally assumed to be self-adjoint in the case F = R. Then
T = 0 ⇐⇒ ∀x ∈ X : ⟨Tx, x⟩ = 0. (161)
Proof. The direction (=⇒) is trivial for us. As for its converse, we consider different cases for the ground field:
• If F = C, the statement follows from Corollary 2.14.
• Suppose that F = R. We then show that the following map is a symmetric bilinear form on X:
ϕ : X × X → R : (x, y) 7→ ⟨T x, y⟩.
– For every x, y ∈ X,
ϕ(x, y) = ⟨Tx, y⟩ = ⟨x, Ty⟩ = ⟨Ty, x⟩ = ϕ(y, x).
Now let Φ : X → R be the quadratic form associated with ϕ. As we can see, we have Φ(x) = ϕ(x, x) = ⟨Tx, x⟩ = 0 for all x ∈ X.
Then by the preceding theorem, we see that
⟨Tx, y⟩ = ϕ(x, y) = (1/2)(Φ(x + y) − Φ(x) − Φ(y)) = (1/2)(0 − 0 − 0) = 0
holds for all x, y ∈ X. In particular, we have ∥T x∥2 = ⟨T x, T x⟩ = 0 for all x ∈ X, so T = 0 holds as well in this case. ■
Theorem 3.77. Let X be a finite-dimensional Hilbert space and T ∈ B(X). Then the following statements are equivalent:
1. The operator T is self-adjoint.
2. For every ordered orthonormal base β of X, the matrix representation β[T]β with respect to β is a Hermitian
matrix.
3. There exists an ordered orthonormal base β of X to which the matrix representation β [T ]β of T is Hermi-
tian.
Proof. (1 =⇒ 2). Let T be an operator on X. First, suppose that T is self-adjoint. Fix an arbitrary ordered orthonormal base β = (e1 , . . . , en )
of X. Then for each i, j = 1, . . . , n, by Theorem 2.33, the (i, j)-entry of β[T]β is given by ⟨Tej, ei⟩. Note that
⟨Tej, ei⟩ = ⟨ej, Tei⟩ = \overline{⟨Tei, ej⟩},
so the (i, j)-entry of β[T]β is the conjugate of its (j, i)-entry. We may conclude that β[T]β is a Hermitian matrix in this case.
(2 =⇒ 3). This is trivial.
(3 =⇒ 1). Suppose that β [T ]β is a Hermitian matrix for some ordered orthonormal base β of X. Then for every x, y ∈ X, by Theorem
2.33,
Theorem 3.78 (Adjoint of Complexification). Let X be a real Hilbert space and T ∈ B(X). Then
(TC)∗ = (T∗)C. In particular, T is self-adjoint if and only if its complexification TC is:
• Clearly, if T is self-adjoint, we shall have (TC )∗ = (T ∗ )C = TC , implying that TC is also self-adjoint.
• Conversely, suppose that TC is self-adjoint. Then we have TC = (TC)∗ = (T∗)C, so for every x ∈ X,
Tx = TC(x + i0) = (T∗)C(x + i0) = T∗x,
namely T = T∗ and T is self-adjoint. ■
Definition 3.19. Let X be a Hilbert space. A bounded linear operator T ∈ B(X) is normal if T ∗ T = T T ∗ .
Theorem 3.79. Let X be a Hilbert space. Then a bounded linear operator T ∈ B(X) is normal if and only if
∥T x∥ = ∥T ∗ x∥ for all x ∈ X. In particular, if T is normal, then ker(T ∗ ) = ker(T ).
Proof. Let T ∈ B(X) be arbitrary. If T is normal, then for every x ∈ X,
∥Tx∥² = ⟨Tx, Tx⟩ = ⟨x, T∗Tx⟩ = ⟨x, TT∗x⟩ = ⟨T∗x, T∗x⟩ = ∥T∗x∥²,
hence ∥Tx∥ = ∥T∗x∥. Conversely, if ∥Tx∥ = ∥T∗x∥ for all x ∈ X, then
⟨(T∗T − TT∗)x, x⟩ = ⟨T∗Tx, x⟩ − ⟨TT∗x, x⟩ = ∥Tx∥² − ∥T∗x∥² = 0
for every x ∈ X, where T∗T − TT∗ is self-adjoint; by (161), it follows that T∗T − TT∗ = 0, namely T is normal. Finally, when T is normal,
x ∈ ker(T) ⇐⇒ Tx = 0X ⇐⇒ ∥Tx∥ = 0 ⇐⇒ ∥T∗x∥ = 0 ⇐⇒ T∗x = 0X ⇐⇒ x ∈ ker(T∗),
namely ker(T∗) = ker(T). ■
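The equivalent condition can be observed on a small example (made up, not from the note): the matrix T below is normal but not self-adjoint, and ∥Tx∥ = ∥T∗x∥ holds on every sample vector:

```python
# T = [[1, -1], [1, 1]] satisfies T*T = T T* = 2I (normal, not
# self-adjoint), so ||T x|| = ||T* x|| for every x.
import math

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(2)) for i in range(2)]

def conj_transpose(M):
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def norm(x):
    return math.sqrt(sum(abs(a) ** 2 for a in x))

T = [[1, -1], [1, 1]]
Ts = conj_transpose(T)

for x in ([1, 2], [3j, 1 - 1j], [-2, 5j]):
    assert abs(norm(apply(T, x)) - norm(apply(Ts, x))) < 1e-12
print("||Tx|| == ||T*x|| on all samples")
```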
4 Compact Linear Maps and Some Spectral Theory
4.1 Weak and Weak∗ Convergence
Definition 4.1. Let V be a normed linear space. A sequence (xn)n∈N of elements in V converges weakly to some
x ∈ V, denoted by xn −w→ x, if ϕ(xn) → ϕ(x) for all ϕ ∈ V∗, in which case x is called a weak limit of (xn)n∈N.
Remark 4.1. As opposed to weak convergence, we shall refer to convergence with respect to the norm as strong
convergence.
Theorem 4.1 (Properties of Weak Convergence). Let V be a normed linear spaces and (xn )n∈N be a sequence
of elements in V .
1. (Uniqueness of Weak Limits). If xn −w→ x ∈ V and xn −w→ y ∈ V, then x = y.
Furthermore, for every sequence (ϕn )n∈N in V ∗ , if ϕn → ϕ ∈ V ∗ as well, then ϕn (xn ) → ϕ(x).
4. xn −w→ x ∈ V and yn −w→ y ∈ V =⇒ xn + yn −w→ x + y. (164)
5. xn −w→ x ∈ V and cn → c ∈ F =⇒ cnxn −w→ cx. (165)
6. For every normed linear space W and bounded linear map T : V → W, if xn −w→ x in V, then Txn −w→ Tx in
W as well.
Proof. 1. Suppose that x, y ∈ V are weak limits of (xn)n∈N. Then for every ϕ ∈ V∗, we have
ϕ(x) = lim_{n→∞} ϕ(xn) = ϕ(y),
hence ϕ(x − y) = 0. Since ϕ ∈ V∗ is arbitrary, it follows from Corollary 3.38 that x − y = 0V, namely x = y.
Suppose that xn −w→ x ∈ V in this case now. Then for every ϕ ∈ V∗ with ∥ϕ∥ ≤ 1, since |ϕ(xn)| ≤ ∥ϕ∥∥xn∥ ≤ ∥xn∥ for all n ∈ N, we
have
|ϕ(x)| = lim_{n→∞} |ϕ(xn)| = lim inf_{n→∞} |ϕ(xn)| ≤ lim inf_{n→∞} ∥xn∥.
By Corollary 3.38, we then see that
∥x∥ = max_{ϕ∈V∗, ∥ϕ∥≤1} |ϕ(x)| ≤ lim inf_{n→∞} ∥xn∥.
and
|ϕ(cn xn ) − ϕ(cx)| = |cn ϕ(xn ) − cϕ(x)| = |(cn ϕ(xn ) − cn ϕ(x)) + (cn ϕ(x) − cϕ(x))|
≤ |cn ϕ(xn ) − cn ϕ(x)| + |cn ϕ(x) − cϕ(x)|
= |cn ||ϕ(xn ) − ϕ(x)| + |cn − c||ϕ(x)| → 0 + 0 = 0. (as n → ∞)
Therefore, we indeed have ϕ(xn + yn) → ϕ(x + y) and ϕ(cnxn) → ϕ(cx) for all ϕ ∈ V∗, so xn + yn −w→ x + y and cnxn −w→ cx. Finally, let W be
a normed linear space and T : V → W be a bounded linear map. Then for every ψ ∈ W ∗ ,
Theorem 4.2. Let V be a finite-dimensional normed linear space. Then a sequence of elements in V is strongly convergent if and only if it is weakly convergent.
Proof. The implication from strong convergence to weak convergence is clear from the preceding theorem. Now we let (xn )n∈N be a weakly convergent sequence in V with weak limit x ∈ V . Because V is finite-dimensional, we may consider an arbitrary base {e1 , . . . , en } of V with coordinate forms e∨1 , . . . , e∨n : V → F.
Now by Lemma 3.32, each e∨j is a bounded linear functional on V . Then according to our hypothesis, we also have e∨j (xn ) → e∨j (x) in F as n → ∞. As a result, we may conclude from Theorem 1.28 that xn → x in V as well. ■
Definition 4.2. Let V be a normed linear space. A sequence (ϕn )n∈N of bounded linear functionals on V converges weakly∗ to some ϕ ∈ V ∗ , denoted by ϕn −w∗→ ϕ, if ϕn (x) → ϕ(x) for all x ∈ V , in which case ϕ is called a weak∗ limit of (ϕn )n∈N .
Theorem 4.3 (Properties of Weak∗ Convergence). Let V be a normed linear space and (ϕn )n∈N be a sequence of bounded linear functionals on V .
1. (Uniqueness of Weak∗ Limits). If ϕn −w∗→ ϕ ∈ V ∗ and ϕn −w∗→ ψ ∈ V ∗ , then ϕ = ψ.
2. For every ϕ ∈ V ∗ ,
ϕn → ϕ =⇒ ϕn −w→ ϕ =⇒ ϕn −w∗→ ϕ. (166)
3. If ϕn −w∗→ ϕ ∈ V ∗ , then ∥ϕ∥ ≤ lim inf n→∞ ∥ϕn ∥. (167)
Furthermore, the sequence (ϕn )n∈N is bounded if V is complete, in which case for every sequence (xn )n∈N of elements in V , if xn → x ∈ V , then ϕn (xn ) → ϕ(x).
4–5. (Linearity of Weak∗ Limits). We have
ϕn −w∗→ ϕ ∈ V ∗ and ψn −w∗→ ψ ∈ V ∗ =⇒ ϕn + ψn −w∗→ ϕ + ψ, (168)
ϕn −w∗→ ϕ ∈ V ∗ and cn → c ∈ F =⇒ cn ϕn −w∗→ cϕ. (169)
Proof. 1. Suppose that ϕ, ψ ∈ V ∗ are weak∗ limits of (ϕn )n∈N . Then for every x ∈ V ,
ϕ(x) = limn→∞ ϕn (x) = ψ(x),
namely ϕ(x) = ψ(x). This then shows that ϕ = ψ, proving the uniqueness.
2. Let ϕ ∈ V ∗ be arbitrary. We prove the implications as follows:
• First, suppose that ϕn → ϕ. Then for every ϑ ∈ V ∗∗ , |ϑ (ϕn ) − ϑ (ϕ)| ≤ ∥ϑ ∥∥ϕn − ϕ∥ → 0, so ϕn −w→ ϕ.
• Next, suppose that ϕn −w→ ϕ. Then for every x ∈ V , the canonical evaluation x̂ ∈ V ∗∗ , given by x̂(ψ) = ψ(x) for ψ ∈ V ∗ , satisfies ϕn (x) = x̂(ϕn ) → x̂(ϕ) = ϕ(x). Hence ϕn −w∗→ ϕ.
3. Suppose that ϕn −w∗→ ϕ. Then for every x ∈ V with ∥x∥ ≤ 1, since |ϕn (x)| ≤ ∥ϕn ∥∥x∥ ≤ ∥ϕn ∥ for all n ∈ N, we have
|ϕ(x)| = limn→∞ |ϕn (x)| = lim inf n→∞ |ϕn (x)| ≤ lim inf n→∞ ∥ϕn ∥.
Consequently,
∥ϕ∥ = sup{|ϕ(x)| : x ∈ V, ∥x∥ ≤ 1} ≤ lim inf n→∞ ∥ϕn ∥.
Next, suppose that V is complete. Then for every x ∈ V , since ϕn (x) → ϕ(x), the sequence (ϕn (x))n∈N is bounded. Since V is now a Banach space, it follows from the Banach-Steinhaus theorem (cf. Theorem 3.22) that (ϕn )n∈N is also bounded in V ∗ . In this case, let (xn )n∈N be a convergent sequence in V with limit x. Then
|ϕn (xn ) − ϕ(x)| ≤ |ϕn (xn ) − ϕn (x)| + |ϕn (x) − ϕ(x)| ≤ ∥ϕn ∥∥xn − x∥ + |ϕn (x) − ϕ(x)| → 0.
Here ∥ϕn ∥∥xn − x∥ → 0 as (ϕn )n∈N is bounded and xn → x, while |ϕn (x) − ϕ(x)| → 0 because ϕn −w∗→ ϕ.
4–5. Let (ϕn )n∈N , (ψn )n∈N be weakly∗ convergent sequences in V ∗ with weak∗ limits ϕ, ψ ∈ V ∗ , respectively, and let (cn )n∈N be a convergent sequence of scalars with limit c ∈ F. Then for every x ∈ V ,
|(ϕn + ψn )(x) − (ϕ + ψ)(x)| ≤ |ϕn (x) − ϕ(x)| + |ψn (x) − ψ(x)| → 0 (as n → ∞)
and
|(cn ϕn )(x) − (cϕ)(x)| = |cn ϕn (x) − cϕ(x)| = |(cn ϕn (x) − cn ϕ(x)) + (cn ϕ(x) − cϕ(x))|
≤ |cn ϕn (x) − cn ϕ(x)| + |cn ϕ(x) − cϕ(x)|
= |cn ||ϕn (x) − ϕ(x)| + |cn − c||ϕ(x)| → 0 + 0 = 0. (as n → ∞)
Therefore, we also have ϕn + ψn −w∗→ ϕ + ψ and cn ϕn −w∗→ cϕ. ■
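To see that the implications in (166) are strict in general, the following standard example may help; it is a sketch assuming the usual identifications c0∗ ≅ ℓ1 and (ℓ1)∗ ≅ ℓ∞, which are not established in this section.

```latex
% Weak* convergence does not imply weak convergence.
% Assumed identifications: $(c_0)^* \cong \ell^1$ and $(\ell^1)^* \cong \ell^\infty$.
Let $V = c_0$ and let $\phi_n = e_n \in \ell^1 \cong V^*$ be the $n$-th standard
unit sequence, so that $\phi_n(x) = x_n$ for every $x = (x_k)_{k \in \mathbb{N}} \in c_0$.
Since $x_n \to 0$ for each fixed $x \in c_0$, we have $\phi_n \xrightarrow{\,w^*\,} 0$.
However, for $\vartheta = (1, 1, 1, \dots) \in \ell^\infty \cong V^{**}$ we get
$\vartheta(\phi_n) = 1 \not\to 0$, so $(\phi_n)_{n \in \mathbb{N}}$ does not converge
weakly to $0$. Moreover $\|\phi_n\|_{\ell^1} = 1 \not\to 0$, so the sequence does not
converge to $0$ in norm either.
```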
Definition 4.3. Let V be an inner product space and x ∈ V . A sequence (xn )n∈N of elements in V converges weakly to x, denoted by xn −w→ x, if ⟨xn , y⟩ → ⟨x, y⟩ as n → ∞, for every y ∈ V , in which case x is called a weak limit of (xn )n∈N .
Remark 4.2. Let V be a Hilbert space with a sequence (xn )n∈N of elements in it. Then we see that
xn −w→ x ∈ V ⇐⇒ ∀ϕ ∈ V ∗ : ϕ(xn ) → ϕ(x) ⇐⇒ ∀y ∈ V : ⟨xn , y⟩ → ⟨x, y⟩,
where the last equivalence follows from the Riesz representation theorem (cf. Theorem 3.43). Here we simply generalize the last characterization to arbitrary inner product spaces, in which the definition of weak convergence is slightly weaker than in normed linear spaces.
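A standard example, sketched here using the characterization above, shows that weak convergence is strictly weaker than strong convergence in infinite dimensions:

```latex
% The orthonormal sequence in $\ell^2$ is weakly null but not strongly null.
Let $V = \ell^2$ and let $(e_n)_{n \in \mathbb{N}}$ be the standard orthonormal
sequence. For every $y \in \ell^2$, Bessel's inequality gives
$\sum_{n \in \mathbb{N}} |\langle e_n, y \rangle|^2 \le \|y\|^2 < \infty$, so in
particular $\langle e_n, y \rangle \to 0 = \langle 0, y \rangle$. Hence
$e_n \xrightarrow{\,w\,} 0$, while $\|e_n - 0\| = 1$ for all $n \in \mathbb{N}$,
so $(e_n)_{n \in \mathbb{N}}$ does not converge strongly to $0$.
```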
Theorem 4.4 (Properties of Weakly Convergent Sequences). Let V be an inner product space and let (xn )n∈N , (yn )n∈N be sequences of elements in V .
1. (Uniqueness of Weak Limits). If xn −w→ x ∈ V and xn −w→ y ∈ V , then x = y.
2. (Boundedness). If V is complete and xn −w→ x ∈ V , then the sequence (xn )n∈N is bounded.
3–4. (Linearity of Weak Limits). If xn −w→ x ∈ V and yn −w→ y ∈ V , then xn + yn −w→ x + y; and if furthermore cn → c ∈ F, then cn xn −w→ cx.
Proof. 1. Suppose that x, y ∈ V are weak limits of (xn )n∈N . Then for every z ∈ V , ⟨x, z⟩ = limn→∞ ⟨xn , z⟩ = ⟨y, z⟩, so ⟨x − y, z⟩ = 0. Taking z = x − y gives ∥x − y∥2 = 0, namely x = y.
2. Suppose that V is complete, and xn −w→ x ∈ V . For every n ∈ N, by Lemma 3.41, we see that the map
ηn : V → F : z 7→ ⟨z, xn ⟩
is a bounded linear map on V with ∥ηn ∥ = ∥xn ∥. Furthermore, given any z ∈ V , since ηn (z) = ⟨z, xn ⟩ → ⟨z, x⟩, the sequence (ηn (z))n∈N is bounded in F. Now because V is complete, we can see from the Banach-Steinhaus theorem (cf. Theorem 3.22) that supn∈N ∥xn ∥ = supn∈N ∥ηn ∥ < ∞. That is, the sequence (xn )n∈N is bounded in V .
3–4. Suppose that xn −w→ x ∈ V and yn −w→ y ∈ V . Then for every z ∈ V , since ⟨xn , z⟩ → ⟨x, z⟩ and ⟨yn , z⟩ → ⟨y, z⟩, it follows that
⟨xn + yn , z⟩ = ⟨xn , z⟩ + ⟨yn , z⟩ → ⟨x, z⟩ + ⟨y, z⟩ = ⟨x + y, z⟩,
so xn + yn −w→ x + y. Likewise, if cn → c ∈ F, then ⟨cn xn , z⟩ = cn ⟨xn , z⟩ → c⟨x, z⟩ = ⟨cx, z⟩, so cn xn −w→ cx. ■
Theorem 4.5 (Relationship Between Strong and Weak Convergences). Let V be an inner product space with x ∈ V , and let (xn )n∈N be a sequence of elements in V . Then
xn → x ⇐⇒ xn −w→ x and ∥xn ∥ → ∥x∥. (170)
Proof. Suppose that xn → x ∈ V under the derived norm, namely ∥xn − x∥ → 0 as n → ∞. It is clear from Theorem 1.9 that ∥xn ∥ → ∥x∥. Furthermore, for every y ∈ V , by the CBS inequality,
|⟨xn , y⟩ − ⟨x, y⟩| = |⟨xn − x, y⟩| ≤ ∥xn − x∥∥y∥ → 0,
so xn −w→ x. Conversely, suppose that xn −w→ x and ∥xn ∥ → ∥x∥. Then since ⟨xn , x⟩ → ⟨x, x⟩ = ∥x∥2 ,
∥xn − x∥2 = ∥xn ∥2 − 2 Re⟨xn , x⟩ + ∥x∥2 → ∥x∥2 − 2∥x∥2 + ∥x∥2 = 0,
namely xn → x. ■
Corollary 4.6. Let V be a finite-dimensional inner product space. A sequence of elements in V is convergent under the derived norm if and only if it is weakly convergent.
Proof. By the preceding theorem, it suffices to show that weak convergence in V implies the convergence of norms: Let (xn )n∈N be a sequence in V such that xn −w→ x. Since V is finite-dimensional, it possesses a finite orthonormal base {e1 , . . . , en }. Then ⟨xn , e j ⟩ → ⟨x, e j ⟩ for each j = 1, . . . , n, so by Parseval's identity,
∥xn ∥2 = ∑nj=1 |⟨xn , e j ⟩|2 → ∑nj=1 |⟨x, e j ⟩|2 = ∥x∥2 ,
namely ∥xn ∥ → ∥x∥. ■
Lemma 4.7. Let V be an inner product space, and let S ⊆ V be a subset such that Span(S) is dense in V . Then for every bounded sequence (xn )n∈N and element x in V , if ⟨xn , y⟩ → ⟨x, y⟩ for all y ∈ S, then xn −w→ x.
Proof. Since (xn )n∈N is bounded, there exists M ′ > 0 such that ∥xn ∥ ≤ M ′ for all n ∈ N. Furthermore, we may put M := M ′ ∨ ∥x∥ > 0. Let z ∈ V and ε > 0 be arbitrary. Since Span(S) is dense in V , there exists y0 ∈ Span(S) such that ∥z − y0 ∥ < ε/(3M).
Clearly, if ⟨xn , y⟩ → ⟨x, y⟩ for all y ∈ S, then by linearity ⟨xn , y⟩ → ⟨x, y⟩ for every y ∈ Span(S) as well. Now since ⟨xn , y0 ⟩ → ⟨x, y0 ⟩, there exists N ∈ N such that
|⟨xn , y0 ⟩ − ⟨x, y0 ⟩| < ε/3, ∀n ≥ N.
In this case, for every n ≥ N,
|⟨xn , z⟩ − ⟨x, z⟩| ≤ |⟨xn , z − y0 ⟩| + |⟨xn , y0 ⟩ − ⟨x, y0 ⟩| + |⟨x, y0 − z⟩|
≤ ∥xn ∥∥z − y0 ∥ + ε/3 + ∥x∥∥y0 − z∥ < M · ε/(3M) + ε/3 + M · ε/(3M) = ε.
This then shows that ⟨xn , z⟩ → ⟨x, z⟩. Since z ∈ V is arbitrary, it follows that xn −w→ x. ■
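As an illustration of Lemma 4.7 (a sketch; the coordinate notation $(x_n)_k$ is ours):

```latex
% Weak convergence of bounded sequences in $\ell^2$ = coordinatewise convergence.
Let $V = \ell^2$ and $S = \{e_k \mid k \in \mathbb{N}\}$, so that $\mathrm{Span}(S)$,
the space of finitely supported sequences, is dense in $\ell^2$. By Lemma 4.7, a
bounded sequence $(x_n)_{n \in \mathbb{N}}$ in $\ell^2$ satisfies
$x_n \xrightarrow{\,w\,} x$ if and only if it converges coordinatewise, i.e.
$(x_n)_k = \langle x_n, e_k \rangle \to \langle x, e_k \rangle = x_k$ for every $k$.
The boundedness hypothesis cannot be dropped: $x_n := n\,e_n$ converges to $0$
coordinatewise, yet for $y = (1/k)_{k \in \mathbb{N}} \in \ell^2$ we have
$\langle x_n, y \rangle = n \cdot (1/n) = 1 \not\to 0$.
```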
5 Differential Calculus on Banach Spaces
Definition 5.1. Let X,Y be normed linear spaces and Ω be an open subset of X. Then a map f : Ω → Y is (Fréchet) differentiable at a point a ∈ Ω if there exists a bounded linear map A : X → Y such that for every h ∈ X with a + h ∈ Ω,
f (a + h) = f (a) + Ah + ∥h∥δ (h), where limh→0X δ (h) = 0Y . (175)
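As a first example of Definition 5.1, here is a standard computation, assuming X is a real Hilbert space so that the norm comes from an inner product:

```latex
% Fréchet derivative of the squared norm on a real Hilbert space.
Let $X$ be a real Hilbert space and $f : X \to \mathbb{R}$, $f(x) = \|x\|^2$.
For every $a, h \in X$,
\[
  f(a + h) = \langle a + h,\, a + h \rangle
           = \|a\|^2 + 2\langle a, h \rangle + \|h\|^2
           = f(a) + Ah + \|h\|\,\delta(h),
\]
where $Ah := 2\langle a, h \rangle$ defines a bounded linear map $A : X \to \mathbb{R}$
(with $\|A\| = 2\|a\|$ by the CBS inequality) and $\delta(h) := \|h\| \to 0$ as
$h \to 0_X$. Hence $f$ is Fréchet differentiable at every $a \in X$ with
$Df(a)h = 2\langle a, h \rangle$.
```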
A Additional Topics on Linear Spaces
A.1 Complexification of Real Linear Spaces
Definition A.1. Let X be a real linear space. Then the complexification of X, denoted by XC , is defined on the abelian group X ⊕ X under the following scalar multiplication by C:
(a + ib)(x, y) := (ax − by, ay + bx), ∀a, b ∈ R, (x, y) ∈ X ⊕ X.
Note that we have a canonical embedding ι : X → X ⊕ X : x 7→ (x, 0) that is linear over R with image X ⊕ {0}. Then by identifying X with X ⊕ {0}, we may write each pair (x, y) ∈ X ⊕ X as a formal sum x + iy, in which case the above scalar multiplication is also written by
(a + ib)(x + iy) = (ax − by) + i(ay + bx).
Theorem A.1. Let X be a real linear space. Then the complexification XC of X is a complex linear space.
Furthermore, for every real subspace V of X, its complexification VC is a complex subspace of XC .
Proof. It is already clear that the underlying set X ⊕ X is an additive abelian group. What remains is to verify the axioms for scalar multiplication:
• For every x, y ∈ X,
1(x + iy) = (1 + i0)(x + iy) = (1(x) − 0(y)) + i(1(y) + 0(x)) = x + iy.
• For every x, y ∈ X and a, b, c, d ∈ R,
(a + ib)(x + iy) + (c + id)(x + iy) = ((ax − by) + i(ay + bx)) + ((cx − dy) + i(cy + dx))
= (ax − by + cx − dy) + i(ay + bx + cy + dx)
= ((a + c)x − (b + d)y) + i((a + c)y + (b + d)x)
= ((a + c) + i(b + d))(x + iy) = ((a + ib) + (c + id))(x + iy)
and
((a + ib)(c + id))(x + iy) = ((ac − bd) + i(ad + bc))(x + iy)
= ((ac − bd)x − (ad + bc)y) + i((ac − bd)y + (ad + bc)x)
= (a(cx − dy) − b(cy + dx)) + i(a(cy + dx) + b(cx − dy))
= (a + ib)((cx − dy) + i(cy + dx)) = (a + ib)((c + id)(x + iy)).
• For every x, x′ , y, y′ ∈ X and a, b ∈ R,
(a + ib)(x + iy) + (a + ib)(x′ + iy′ ) = ((ax − by) + i(ay + bx)) + ((ax′ − by′ ) + i(ay′ + bx′ ))
= (ax − by + ax′ − by′ ) + i(ay + bx + ay′ + bx′ )
= (a(x + x′ ) − b(y + y′ )) + i(a(y + y′ ) + b(x + x′ ))
= (a + ib)((x + x′ ) + i(y + y′ )) = (a + ib)((x + iy) + (x′ + iy′ )).
Next, let V be a real subspace of X. Then V ⊕V is certainly an additive subgroup of X ⊕ X. Furthermore, for every u, v ∈ V and a, b ∈ R, we have (a + ib)(u + iv) = (au − bv) + i(av + bu) ∈ V ⊕V , so VC = V ⊕V is closed under the scalar multiplication and is thus a complex subspace of XC . ■
Theorem A.2 (Base of Complexification). Let X be a real linear space. Then for every base (e j ) j∈J of X over R, the family (e j + i0) j∈J is also a base of XC over C. In particular, dimC (XC ) = dimR (X).
Proof. Let (e j ) j∈J be a base of X over R. Then for every x, y ∈ X, there exist almost null families (a j ) j∈J and (b j ) j∈J of real numbers such
that
x = ∑ j∈J a j e j and y = ∑ j∈J b j e j .
Then
x + iy = ∑ j∈J a j e j + i ∑ j∈J b j e j = ∑ j∈J (a j e j + i(b j e j )) = ∑ j∈J (a j + ib j )(e j + i0).
Moreover, since (a j ) j∈J and (b j ) j∈J are almost null, the set
{ j ∈ J | a j + ib j ̸= 0} = { j ∈ J | a j ̸= 0 or b j ̸= 0} = { j ∈ J | a j ̸= 0} ∪ { j ∈ J | b j ̸= 0}
is also a finite set. This shows that (e j + i0) j∈J is a spanning set for XC .
Furthermore, let (z j ) j∈J be an almost null family of complex numbers such that ∑ j∈J z j (e j + i0) = 0. Suppose that z j = a j + ib j with a j , b j ∈ R for each j ∈ J. Then
0 = ∑ j∈J z j (e j + i0) = ∑ j∈J a j e j + i ∑ j∈J b j e j ,
so we must have
∑ j∈J a j e j = ∑ j∈J b j e j = 0.
However, by the linear independence of (e j ) j∈J over R, the last equalities imply that a j = b j = 0 for all j ∈ J. This then shows that z j = 0 for all j, so the family (e j + i0) j∈J is also linearly independent over C. Consequently, it is a base of XC over C, implying that dimC (XC ) = |J| = dimR (X). ■
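For a concrete instance of Theorem A.2:

```latex
% The complexification of $\mathbb{R}^n$ is $\mathbb{C}^n$.
Let $X = \mathbb{R}^n$ with its standard base $e_1, \dots, e_n$. By Theorem A.2,
$(e_j + i0)_{1 \le j \le n}$ is a base of $X_{\mathbb{C}}$ over $\mathbb{C}$, and the map
\[
  X_{\mathbb{C}} \to \mathbb{C}^n, \qquad x + iy \mapsto (x_1 + iy_1, \dots, x_n + iy_n)
\]
is a complex linear isomorphism, so $(\mathbb{R}^n)_{\mathbb{C}} \cong \mathbb{C}^n$ and
$\dim_{\mathbb{C}}\bigl((\mathbb{R}^n)_{\mathbb{C}}\bigr) = n = \dim_{\mathbb{R}}(\mathbb{R}^n)$.
```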
Theorem A.3 (Complexification of Linear Maps). Let X,Y be real linear spaces and T : X → Y be a real
linear map. Then there is a unique complex linear map TC : XC → YC extending T , namely making the following
diagram commute:
        T
   X -------> Y
   |          |
   ι          ι
   v          v
   XC ------> YC
        TC
More precisely,
TC : XC → YC : x + iy 7→ T x + i(Ty). (179)
1. We have
(idX )C = idXC , ker(TC ) = ker(T )C and im(TC ) = im(T )C . (180)
2. For every real linear space Z and real linear map S : Y → Z,
(S ◦ T )C = SC ◦ TC . (181)
3. If T is a linear isomorphism, then TC is a linear isomorphism with (TC )−1 = (T −1 )C .
Proof. First, let S : XC → YC be a complex linear map extending T . Then for every x, y ∈ X,
S(x + iy) = S((x + i0) + i(y + i0)) = S(x + i0) + iS(y + i0)
= (T x + i0) + i(Ty + i0) = T x + i(Ty).
Consequently, we may define
TC : XC → YC : x + iy 7→ T x + i(Ty).
What remains is to prove that TC is complex linear, while the uniqueness of such map follows from the observation above:
• For every x, x′ , y, y′ ∈ X,
TC ((x + iy) + (x′ + iy′ )) = TC ((x + x′ ) + i(y + y′ )) = T (x + x′ ) + i(T (y + y′ ))
= (T x + i(Ty)) + (T x′ + i(Ty′ )) = TC (x + iy) + TC (x′ + iy′ ).
• For every x, y ∈ X and a, b ∈ R,
TC ((a + ib)(x + iy)) = TC ((ax − by) + i(ay + bx)) = T (ax − by) + i(T (ay + bx))
= (a(T x) − b(Ty)) + i(a(Ty) + b(T x))
= (a + ib)(T x + i(Ty)) = (a + ib)TC (x + iy).
• For every y1 , y2 ∈ Y , we have y1 + iy2 ∈ im(TC ) if and only if y1 + iy2 = TC (x + iy) = T x + i(Ty) for some x, y ∈ X, namely if and only if y1 , y2 ∈ im(T ); hence im(TC ) = im(T )C . Likewise, x + iy ∈ ker(TC ) if and only if T x = Ty = 0Y , so ker(TC ) = ker(T )C , while (idX )C = idXC is clear from the defining formula.
Next, let Z be a real linear space and S : Y → Z be a real linear map. For every x, y ∈ X,
(SC ◦ TC )(x + iy) = SC (T x + i(Ty)) = S(T x) + i(S(Ty)) = (S ◦ T )(x) + i((S ◦ T )(y)) = (S ◦ T )C (x + iy).
The desired identity (S ◦ T )C = SC ◦ TC thus follows. Finally, suppose that T is a linear isomorphism, namely bijective. Then as noted above,
idXC = (idX )C = (T −1 ◦ T )C = (T −1 )C ◦ TC ,
and
idYC = (idY )C = (T ◦ T −1 )C = TC ◦ (T −1 )C .
This shows that TC is also a linear isomorphism with inverse (T −1 )C , namely (TC )−1 = (T −1 )C . ■
Corollary A.4. Let X,Y be finite-dimensional real linear spaces with respective ordered bases β = (e1 , . . . , en ) and γ = ( f1 , . . . , fm ). Then for every linear map T : X → Y , we have
γC [TC ]βC = γ [T ]β ,
where βC = (e1 + i0X , . . . , en + i0X ) and γC = ( f1 + i0Y , . . . , fm + i0Y ) are the respective ordered bases of XC and YC .
Proof. Let T : X → Y be a linear map. Then for every j = 1, . . . , n,
TC (e j + i0X ) = Te j + i(T 0X ) = Te j + i0Y = ∑m
i=1 fi∨ (Te j )( fi + i0Y ),
where each fi∨ : Y → R is the corresponding coordinate form on Y . Consequently, for every i = 1, . . . , m and j = 1, . . . , n, the (i, j)-entry of
γC [TC ]βC is fi∨ (Te j ), which is also the (i, j)-entry of γ [T ]β . The desired identity follows as well. ■
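A small worked example of Corollary A.4, showing why complexification is useful for spectral questions:

```latex
% Complexifying a rotation of $\mathbb{R}^2$ produces eigenvalues.
Let $T : \mathbb{R}^2 \to \mathbb{R}^2$, $T(x, y) = (-y, x)$, whose matrix relative
to the standard ordered base $\beta$ is
\[
  {}_{\beta}[T]_{\beta} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.
\]
By Corollary A.4, $T_{\mathbb{C}}$ has the same matrix relative to $\beta_{\mathbb{C}}$.
Over $\mathbb{R}$ the map $T$ has no eigenvalues, since its characteristic polynomial
$\lambda^2 + 1$ has no real roots; but $T_{\mathbb{C}}$ has eigenvalues $\pm i$, with
corresponding eigenvectors $(1, \mp i)$ in $\mathbb{C}^2 \cong (\mathbb{R}^2)_{\mathbb{C}}$.
```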
Theorem A.5 (Universal Property). Let X be a real linear space and Y be a complex linear space. Then for
every real linear map T : X → Y , there is a unique complex linear map T̃ : XC → Y extending T .
        T
   X -------> Y
    \        ^
   ι \      / T̃
      v    /
       XC
More precisely,
T̃ : XC → Y : x + iy 7→ T x + i(Ty). (184)
Proof. Let T : X → Y be a real linear map. Suppose that S : XC → Y is a complex linear map extending T . Then for every x, y ∈ X,
S(x + iy) = S((x + i0) + i(y + i0)) = S(x + i0) + iS(y + i0) = T x + i(Ty).
Consequently, we may define
T̃ : XC → Y : x + iy 7→ T x + i(Ty).
What remains is to show that T̃ is linear over C, while all other conditions have been verified as above:
• For every x, x′ , y, y′ ∈ X, the additivity T̃ ((x + iy) + (x′ + iy′ )) = T̃ (x + iy) + T̃ (x′ + iy′ ) follows exactly as in the proof of Theorem A.3.
• For every x, y ∈ X and a, b ∈ R,
T̃ ((a + ib)(x + iy)) = T̃ ((ax − by) + i(ay + bx)) = T (ax − by) + i(T (ay + bx))
= (a(T x) − b(Ty)) + i(a(Ty) + b(T x))
= (a + ib)(T x + i(Ty)) = (a + ib)T̃ (x + iy). ■
Remark A.2. In algebra, one can also extend the scalars from a smaller field to a larger one by tensor products. That is, given any real linear space X, the tensor product C ⊗R X also has a complex linear space structure, under the scalar multiplication determined by w · (z ⊗ x) := (wz) ⊗ x for all w, z ∈ C and x ∈ X.
Corollary A.6 (Alternative Complexification). Let X be a real linear space. Then XC is isomorphic to C ⊗R X
as complex linear spaces via a unique isomorphism.
Proof. We first construct complex linear maps from both direction:
• First, consider the canonical embedding x 7→ x + i0X from X to XC , which is a real linear map. Then by the universal property of the extension of scalars, we can find a unique complex linear map f : C ⊗R X → XC such that f (1 ⊗ x) = x + i0X for all x ∈ X.
• Conversely, note that we also have another canonical embedding x 7→ 1 ⊗ x from X to C ⊗R X, which is also a real linear map. By
the preceding theorem, we can even find another complex linear map
g : XC → C ⊗R X : x + iy 7→ 1 ⊗ x + i(1 ⊗ y) = 1 ⊗ x + i ⊗ y.
We then show that f , g are mutual inverses, hence are both isomorphisms:
• For every x ∈ X,
g( f (1 ⊗ x)) = g(x + i0X ) = 1 ⊗ x + i ⊗ 0 = 1 ⊗ x.
Since the elements 1 ⊗ x with x ∈ X generate C ⊗R X over C and g ◦ f is complex linear, g ◦ f is the identity map on C ⊗R X.
• For every x, y ∈ X,
f (g(x + iy)) = f (1 ⊗ x + i ⊗ y) = f (1 ⊗ x) + f (i ⊗ y)
= f (1 ⊗ x) + i f (1 ⊗ y) = (x + i0X ) + i(y + i0X ) = x + iy,
so f ◦ g is the identity map on XC . ■
Definition A.2. Let X be a complex linear space with real subspace V . Then X is called a complexification of
V , and V is called an R-form of X, if X = V + iV and V ∩ iV = {0}, where iV = {iv | v ∈ V } ⊆ X.
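For instance, a quick check of Definition A.2:

```latex
% $\mathbb{R}^n$ is an $\mathbb{R}$-form of $\mathbb{C}^n$; $\mathbb{R}$-forms are not unique.
Let $X = \mathbb{C}^n$ and $V = \mathbb{R}^n \subseteq X$. Every $z \in \mathbb{C}^n$
decomposes uniquely as $z = x + iy$ with $x = \operatorname{Re} z$ and
$y = \operatorname{Im} z$ in $\mathbb{R}^n$, so $X = V + iV$; and a vector that is both
real and purely imaginary must vanish, so $V \cap iV = \{0\}$. Hence $\mathbb{C}^n$
is a complexification of $\mathbb{R}^n$. Note that $\mathbb{R}$-forms are not unique:
for $n = 1$, the real subspace $V' = i\mathbb{R}$ also satisfies
$\mathbb{C} = V' + iV'$ and $V' \cap iV' = \{0\}$.
```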
Theorem A.7. Let X be a complex linear space with real subspace V . Then the following statements are
equivalent:
1. The space X is a complexification of V .
2. Every base of V over R is a base of X over C.
3. Some base of V over R is a base of X over C.
4. The complex linear map C ⊗R V → X determined by z ⊗ v 7→ zv is an isomorphism.
Proof. (1 =⇒ 2). Suppose that X = V + iV with V ∩ iV = {0}, and let (e j ) j∈J be a base of V over R. Then every x ∈ X can be written as x = u + iv with u, v ∈ V , and both u and v are real linear combinations of the e j ,
hence (e j ) j∈J is a spanning set of X over C. Furthermore, let (z j ) j∈J be an almost null family of complex numbers such that 0 = ∑ j∈J z j e j .
Suppose that z j = a j + ib j with a j , b j ∈ R for each j. Then
0 = ∑ j∈J z j e j = ∑ j∈J (a j + ib j )e j = ∑ j∈J a j e j + i ∑ j∈J b j e j ,
so we have
i ∑ j∈J b j e j = − ∑ j∈J a j e j ∈ iV ∩V = {0}.
Then by the linear independence of (e j ) j∈J , we must have a j = b j = 0 for all j ∈ J, namely z j = 0 for all j. This shows that (e j ) j∈J is also
linearly independent over C, hence a base of X over C.
(2 =⇒ 3). This is trivial.
(3 =⇒ 4). Suppose that some base (e j ) j∈J of V over R is a base of X over C. Consider the complex linear map ϕ : C ⊗R V → X in which z ⊗ v 7→ zv for every z ∈ C and v ∈ V . Now ϕ(1 ⊗ e j ) = 1e j = e j holds for all j ∈ J, while (1 ⊗ e j ) j∈J is a base of C ⊗R V over C. Since ϕ maps a base onto a base, it is indeed an isomorphism.
(4 =⇒ 1). Suppose that such a complex linear map C ⊗R V → X is an isomorphism. Consider the real subspace V̂ := 1 ⊗V of C ⊗R V , which is isomorphic to V as real linear spaces. Note that
iV̂ = {i(1 ⊗ v) | v ∈ V } = {i ⊗ v | v ∈ V },
so it is clear that C ⊗R V = V̂ + iV̂ and V̂ ∩ iV̂ = {0}. Consequently, C ⊗R V is a complexification of V̂ ; since the isomorphism maps V̂ onto V , the space X itself is a complexification of V as well. ■
A.2 Topological Linear Spaces
Definition A.3. A linear space V over field F is called a topological linear space if it is endowed with a topology
under which the addition + : V × V → V and scalar multiplication · : F × V → V are both continuous, where
V ×V and F ×V are endowed with the respective product topologies.
Remark A.3. The continuity of the addition and scalar multiplication tells us that given any sequences (xn )n∈N and (yn )n∈N in V and any sequence (cn )n∈N in F,
• if xn → x ∈ V and yn → y ∈ V , then xn + yn → x + y;
• if cn → c ∈ F and xn → x ∈ V , then cn xn → cx.
Remark A.4. As in topological groups, the topology of a topological linear space V is ultimately determined by the neighborhood filter of 0V .
Theorem A.9. Let V be a linear space over F and (Vi )i∈I be a nonempty family of topological linear spaces over F. Then for every family ( fi : V → Vi )i∈I of linear maps, the linear space V is always a topological linear space under the initial topology with respect to ( fi )i∈I .
fi
Proof. Let (V − → Vi )i∈I be a family of linear maps and suppose that V is endowed with the initial topology with respect to ( fi )i∈I . Fix an
arbitrary i ∈ I.
1. For each a ∈ F and x ∈ V , since fi is linear, we have fi (ax) = a fi (x). If we denote by η : F × V → V and ηi : F × Vi → Vi the respective scalar multiplications, the above shows that fi ◦ η = ηi ◦ (idF × fi ). Note that fi , ηi and idF × fi are all continuous, so each fi ◦ η is continuous. Since i ∈ I
is arbitrary, the characteristic property of the initial topology shows that η is continuous as well.
2. Likewise, for each x, y ∈ V , we have fi (x + y) = fi (x) + fi (y), hence fi ◦ ϑ = ϑi ◦ ( fi × fi ), where ϑ : V ×V → V and ϑi : Vi ×Vi → Vi
are the respective addition maps. By the same arguments as above, we see that the addition ϑ is also continuous.
Therefore, the linear space V is now a topological linear space under the initial topology, as desired. ■
Theorem A.10. Let V be a topological linear space. For every C ⊆ V , if C is convex, so is its closure C.
Proof. Let C be a convex subset of V . Fix arbitrary x, y ∈ C and α ∈ [0, 1], and let U be any neighborhood of αx + (1 − α)y. By the continuity of addition and scalar multiplication, there exist neighborhoods U1 of x and U2 of y such that αU1 + (1 − α)U2 ⊆ U. Since x, y ∈ C, we may pick x′ ∈ C ∩U1 and y′ ∈ C ∩U2 ; then αx′ + (1 − α)y′ ∈ C ∩U by the convexity of C. As U is arbitrary, αx + (1 − α)y ∈ C. This shows that C is also a convex subset of V . (When V is metrizable, the same conclusion follows by taking sequences in C converging to x and y.) ■