
Linear Algebra:

Concepts and Theories on Abstract Vector Spaces


Third e-Book Edition

by
Ma Siu Lun
Department of Mathematics
National University of Singapore

Version: July 19, 2018


Preface
These are my lecture notes for the module MA2101 Linear Algebra II. They are the continuation of the book "Linear Algebra: Concepts and Techniques on Euclidean Spaces, Second Edition, McGraw Hill", which is the textbook of the first-year module MA1101R Linear Algebra I.

You will notice that the first chapter is named Chapter 8. For Chapters 1 to 7, I refer to the chapters of the textbook of MA1101R mentioned above. References and results from the textbook of MA1101R will be quoted directly using their reference codes, e.g. Theorem 3.1.6, Definition 7.1.1, etc.

From the 1000-level linear algebra module, you learnt the properties of the Euclidean spaces R^n and their subspaces. In MA2101, you will study the abstract version of vector spaces. Most of the concepts and results will be generalized to this abstract setting. Furthermore, new topics such as direct sums, quotient spaces, isomorphisms, Jordan canonical forms, etc., will be introduced. A key difference from 1000-level linear algebra modules is that there is a greater emphasis on conceptual understanding of theoretical results than on routine computations. Since MA2101 is built upon the background knowledge of MA1101R, you are advised to revise or study the textbook of MA1101R before attending classes of MA2101.
S.L. Ma
Contents

Chapter 8  General Vector Spaces
  Section 8.1  Fields
  Section 8.2  Vector Spaces
  Section 8.3  Subspaces
  Section 8.4  Linear Spans and Linear Independence
  Section 8.5  Bases and Dimensions
  Section 8.6  Direct Sums of Subspaces
  Section 8.7  Cosets and Quotient Spaces
  Exercise 8

Chapter 9  General Linear Transformations
  Section 9.1  Linear Transformations
  Section 9.2  Matrices for Linear Transformations
  Section 9.3  Compositions of Linear Transformations
  Section 9.4  The Vector Space L(V, W)
  Section 9.5  Kernels and Ranges
  Section 9.6  Isomorphisms
  Exercise 9

Chapter 10  Multilinear Forms and Determinants
  Section 10.1  Permutations
  Section 10.2  Multilinear Forms
  Section 10.3  Determinants
  Exercise 10

Chapter 11  Diagonalization and Jordan Canonical Forms
  Section 11.1  Eigenvalues and Diagonalization
  Section 11.2  Triangular Canonical Forms
  Section 11.3  Invariant Subspaces
  Section 11.4  The Cayley-Hamilton Theorem
  Section 11.5  Minimal Polynomials
  Section 11.6  Jordan Canonical Forms
  Exercise 11

Chapter 12  Inner Product Spaces
  Section 12.1  Inner Products
  Section 12.2  Norms and Distances
  Section 12.3  Orthogonal and Orthonormal Bases
  Section 12.4  Orthogonal Complements and Orthogonal Projections
  Section 12.5  Adjoints of Linear Operators
  Section 12.6  Unitary and Orthogonal Diagonalization
  Exercise 12

Index
Chapter 8

General Vector Spaces

Section 8.1 Fields

Discussion 8.1.1 In Chapter 3, we defined the Euclidean n-space R^n with two operations: addition and scalar multiplication. The same operations are also defined in Chapter 2, where we study matrices. It is natural to ask whether the results on n-vectors studied in Chapter 3 can be applied to matrices as well. In order to unify all these similar mathematical objects, we need a more general framework for vector spaces. But before we study this abstract version of vector spaces, we first introduce an abstract version of the real numbers.

Definition 8.1.2 A field consists of the following:

(a) a nonempty set F;

(b) an operation of addition a + b between every pair of elements a, b ∈ F; and

(c) an operation of multiplication ab between every pair of elements a, b ∈ F.

Furthermore, the operations satisfy the following axioms:

(F1) (Closure under Addition) For all a, b ∈ F, a + b ∈ F.

(F2) (Commutative Law for Addition) For all a, b ∈ F, a + b = b + a.

(F3) (Associative Law for Addition) For all a, b, c ∈ F, (a + b) + c = a + (b + c).

(F4) (Existence of the Additive Identity) There exists an element 0 ∈ F such that a + 0 = a for all a ∈ F. We call 0 the zero element of F, and all other elements of F are called nonzero elements of F.
(By Proposition 8.1.5.1, we know that there exists only one such element in F.)

(F5) (Existence of Additive Inverses) For every a ∈ F, there exists b ∈ F such that a + b = 0. We call b the additive inverse of a and denote b by −a.
(By Proposition 8.1.5.2, we know that for each a, the additive inverse of a is unique.)

(F6) (Closure under Multiplication) For all a, b ∈ F, ab ∈ F.

(F7) (Commutative Law for Multiplication) For all a, b ∈ F, ab = ba.

(F8) (Associative Law for Multiplication) For all a, b, c ∈ F, (ab)c = a(bc).

(F9) (Existence of the Multiplicative Identity) There exists a nonzero element 1 ∈ F such that 1a = a for all a ∈ F.
(By Proposition 8.1.5.3, we know that there exists only one such element in F.)

(F10) (Existence of Multiplicative Inverses) For every nonzero element a ∈ F, there exists c ∈ F such that ac = 1. We call c the multiplicative inverse of a and denote c by a^(−1).
(By Proposition 8.1.5.4, we know that for each a, the multiplicative inverse of a is unique.)

(F11) (Distributive Law) For all a, b, c ∈ F, a(b + c) = (ab) + (ac).

Example 8.1.3

1. (Number Systems) When we talk about a number system, we refer to one of the following sets together with the usual addition and multiplication.

(a) N = the set of all natural numbers, i.e. N = {1, 2, 3, ...}.
(b) Z = the set of all integers, i.e. Z = {0, ±1, ±2, ...}.
(c) Q = the set of all rational numbers, i.e. Q = { p/q | p, q ∈ Z and q ≠ 0 }.
(d) R = the set of all real numbers.
(e) C = the set of all complex numbers, i.e. C = {a + b i | a, b ∈ R} where i^2 = −1 (i = √−1).

Note that N ⊆ Z ⊆ Q ⊆ R ⊆ C. The number systems Q, R and C (with the usual addition and multiplication) satisfy all the axioms in Definition 8.1.2, so they are fields. The number system N does not satisfy (F4), (F5) and (F10) and hence is not a field. The number system Z does not satisfy (F10) and hence is not a field.

2. Let F2 = {0, 1}. Define the addition and multiplication on F2 by

    + | 0 1        × | 0 1
    --+-----       --+-----
    0 | 0 1        0 | 0 0
    1 | 1 0        1 | 0 1

It can be checked that F2 is a field.

A field which has only finitely many elements is called a finite field. The field F2 here is an example of a finite field. It is known that a finite field of q elements exists if and only if q = p^s for some prime number p and positive integer s.
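The tables above can also be verified mechanically. The following is a minimal Python sketch (my own illustration, not part of the notes) that brute-force checks the field axioms for F2, using arithmetic mod 2 for the two operations:

```python
from itertools import product

F2 = [0, 1]
add = lambda a, b: (a + b) % 2   # the addition table of F2
mul = lambda a, b: (a * b) % 2   # the multiplication table of F2

# (F2)/(F7) commutativity and (F3)/(F8) associativity, checked exhaustively
assert all(add(a, b) == add(b, a) for a, b in product(F2, repeat=2))
assert all(mul(a, b) == mul(b, a) for a, b in product(F2, repeat=2))
assert all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in product(F2, repeat=3))
assert all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in product(F2, repeat=3))
# (F11) distributivity
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in product(F2, repeat=3))
# (F5)/(F10): every element has an additive inverse,
# and every nonzero element has a multiplicative inverse
assert all(any(add(a, b) == 0 for b in F2) for a in F2)
assert all(any(mul(a, c) == 1 for c in F2) for a in F2 if a != 0)
print("F2 passes all the checked axioms")
```

The same brute-force strategy works for any finite set with candidate operation tables; it is one way to see that, say, {0, 1, 2, 3} with addition and multiplication mod 4 is not a field (2 has no multiplicative inverse there).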

Remark 8.1.4 The approach of defining something by properties (or axioms) is one of the characteristics of modern mathematics. The advantage is that once we prove a theorem based on these properties, the theorem can automatically be applied to all mathematical objects that have these properties.

Proposition 8.1.5 Let F be a field.

1. (Uniqueness of the Additive Identity) If b, c ∈ F satisfy the property that a + b = a + c = a for all a ∈ F, then b = c.

2. (Uniqueness of the Additive Inverse) For any a ∈ F, if there exist b, c ∈ F such that a + b = a + c = 0, then b = c.

3. (Uniqueness of the Multiplicative Identity) If b, c are nonzero elements of F satisfying the property ba = ca = a for all a ∈ F, then b = c.

4. (Uniqueness of the Multiplicative Inverse) For any a ∈ F with a ≠ 0, if there exist b, c ∈ F such that ab = ac = 1, then b = c.

5. For any a ∈ F, a0 = 0 and (−1)a = −a.

6. For any a, b ∈ F, if ab = 0, then a = 0 or b = 0.

Proof We only show the proof of Part 2. Suppose a + b = a + c = 0. Then

    b = b + 0            by (F4)
      = b + (a + c)      by the given assumption
      = (b + a) + c      by (F3)
      = (a + b) + c      by (F2)
      = 0 + c            by the given assumption
      = c + 0            by (F2)
      = c                by (F4).

(Proofs of the other parts are left as exercises. See Question 8.5.)

Definition 8.1.6 Let F be a field. We can define subtraction and division as follows.

1. For any a, b ∈ F, the subtraction of a by b is defined to be a + (−b) and is denoted by a − b.

2. For any a, b ∈ F where b is nonzero, the division of a by b is defined to be ab^(−1). Unlike for the real and complex numbers, we seldom use the notations a ÷ b and a/b when working with an abstract field.

Discussion 8.1.7 The results we established in Chapter 1 and Chapter 2 can be generalized to arbitrary fields. In particular, row-echelon and reduced row-echelon forms, Gaussian and Gauss-Jordan elimination, matrix operations, and inverses and determinants of square matrices will be used in the following chapters over arbitrary fields.

Definition 8.1.8 Let F be a field.

1. A linear system with all coefficients taken from F is called a linear system over F. In particular, a linear system over R is called a real linear system and a linear system over C is called a complex linear system.

2. A matrix with all entries taken from F is called a matrix over F. In particular, a matrix over R is called a real matrix and a matrix over C is called a complex matrix.

Example 8.1.9

1. Solve the following complex linear system:

    x1 + i x2 + 3i x3 = 0
    i x1 + x2 + x3 = 0
    (1 − i)x1 + (1 + i)x3 = 0.

Solution We row-reduce the augmented matrix:

    ( 1    i    3i  | 0 )
    ( i    1    1   | 0 )
    ( 1−i  0    1+i | 0 )

Applying R2 − i R1 and R3 − (1 − i) R1:

    ( 1    i      3i   | 0 )
    ( 0    2      4    | 0 )
    ( 0  −1−i   −2−2i  | 0 )

Applying R3 + (1/2)(1 + i) R2 and then (1/2) R2:

    ( 1  i  3i | 0 )
    ( 0  1  2  | 0 )
    ( 0  0  0  | 0 )

Finally, applying R1 − i R2:

    ( 1  0  i | 0 )
    ( 0  1  2 | 0 )
    ( 0  0  0 | 0 )

The last augmented matrix corresponds to the complex system

    x1 + i x3 = 0
    x2 + 2x3 = 0,

which has general solution x1 = −i t, x2 = −2t, x3 = t for t ∈ C.
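As a cross-check, the computation above can be reproduced with sympy, which handles exact complex arithmetic. This sketch is my own illustration, not part of the notes:

```python
from sympy import Matrix, I, symbols, linsolve

x1, x2, x3 = symbols("x1 x2 x3")
A = Matrix([[1, I, 3*I],
            [I, 1, 1],
            [1 - I, 0, 1 + I]])
print(A.rref()[0])   # Matrix([[1, 0, I], [0, 1, 2], [0, 0, 0]]), as obtained above
# the general solution, with x3 playing the role of the parameter t
print(linsolve((A, Matrix([0, 0, 0])), x1, x2, x3))   # {(-I*x3, -2*x3, x3)}
```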
2. Let

    A = ( 1 1 1 )
        ( 0 1 1 )
        ( 1 0 1 )

be a matrix over F2. Find the inverse of A.

Solution Note that in F2, 1 + 1 = 0. Start with the augmented matrix

    ( 1 1 1 | 1 0 0 )
    ( 0 1 1 | 0 1 0 )
    ( 1 0 1 | 0 0 1 )

Applying R3 + R1 and then R3 + R2:

    ( 1 1 1 | 1 0 0 )
    ( 0 1 1 | 0 1 0 )
    ( 0 0 1 | 1 1 1 )

Applying R1 + R3 and R2 + R3, and then R1 + R2:

    ( 1 0 0 | 1 1 0 )
    ( 0 1 0 | 1 0 1 )
    ( 0 0 1 | 1 1 1 )

So

    A^(−1) = ( 1 1 0 )
             ( 1 0 1 )
             ( 1 1 1 ).
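Since F2 is the field of integers mod 2, the inverse can be checked with sympy's modular inverse routine. Again, this is my own sketch, not part of the notes:

```python
from sympy import Matrix

A = Matrix([[1, 1, 1],
            [0, 1, 1],
            [1, 0, 1]])
A_inv = A.inv_mod(2)     # inverse computed with all arithmetic reduced mod 2
print(A_inv)             # Matrix([[1, 1, 0], [1, 0, 1], [1, 1, 1]])
assert (A * A_inv) % 2 == Matrix.eye(3)   # entrywise reduction mod 2 gives I
```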
3. Find the determinant of the complex matrix

    B = ( i    0     0     0 )
        ( 0   1+i    1     0 )
        ( 0    0     2     i )
        ( 1   −i   1+2i    0 ).

Is B invertible?

Solution Expanding along the first row of B, and then along the first row of the resulting minor:

    det(B) = i · det( 1+i   1     0 )
                    (  0    2     i )   − 0 + 0 − 0
                    ( −i   1+2i   0 )

           = i [ (1+i) · det( 2     i )  −  1 · det(  0  i )  + 0 ]
                            ( 1+2i  0 )            ( −i  0 )

           = i [ (1+i)(2−i) + 1 ] = −1 + 4i.

Since det(B) ≠ 0, B is invertible.
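A one-line sympy check (my own, not part of the notes) confirms the value of the determinant:

```python
from sympy import Matrix, I, simplify

B = Matrix([[I, 0, 0, 0],
            [0, 1 + I, 1, 0],
            [0, 0, 2, I],
            [1, -I, 1 + 2*I, 0]])
print(simplify(B.det()))   # -1 + 4*I, which is nonzero, so B is invertible
```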

Definition 8.1.10 Let F be a field and A = (aij) an n × n matrix over F, i.e.

    A = ( a11  a12  ...  a1n )
        ( a21  a22  ...  a2n )
        ( ...  ...  ...  ... )
        ( an1  an2  ...  ann )

The trace of A, denoted by tr(A), is defined to be the sum of the entries on the diagonal of A, i.e.

    tr(A) = a11 + a22 + ... + ann = Σ_{i=1}^{n} aii.

Note that tr(A) = tr(A^T). (See also Question 2.11.)

Proposition 8.1.11

1. If A and B are n × n matrices over F, then tr(A + B) = tr(A) + tr(B).

2. If c ∈ F and A is an n × n matrix over F, then tr(cA) = c tr(A).

3. If C and D are m × n and n × m matrices, respectively, over F, then tr(CD) = tr(DC).

Proof The proof is left as an exercise. See Question 8.6.

Remark 8.1.12 In general, tr(XYZ) ≠ tr(YXZ) even when the matrices X, Y and Z can be multiplied accordingly. For example, let

    X = ( 1  0 ),   Y = ( 0 1 ),   Z = ( 0 1 ).
        ( 0 −1 )        ( 1 0 )        ( 0 0 )

Then

    XYZ = ( 0  0 )    and    YXZ = ( 0 0 ).
          ( 0 −1 )                 ( 0 1 )

Hence tr(XYZ) = −1 ≠ 1 = tr(YXZ).
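Both the counterexample and Proposition 8.1.11.3 are easy to confirm numerically; the following sympy snippet is my own check, not part of the notes:

```python
from sympy import Matrix

X = Matrix([[1, 0], [0, -1]])
Y = Matrix([[0, 1], [1, 0]])
Z = Matrix([[0, 1], [0, 0]])
print((X * Y * Z).trace(), (Y * X * Z).trace())   # -1 and 1: the traces differ

# tr(CD) = tr(DC) does hold for any compatible pair (Proposition 8.1.11.3),
# e.g. for this arbitrarily chosen 2x3 and 3x2 pair:
C = Matrix([[1, 2, 3], [4, 5, 6]])
D = Matrix([[1, 0], [0, 1], [1, 1]])
assert (C * D).trace() == (D * C).trace()
```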

Section 8.2 Vector Spaces

Discussion 8.2.1 In this section, we give an abstract definition of vector spaces. Under this general framework, the vector spaces studied in Chapter 3 are particular examples.

Definition 8.2.2 A vector space consists of the following:

(a) a field F, where the elements of F are called scalars;

(b) a nonempty set V, where the elements of V are called vectors;

(c) an operation of vector addition u + v between every pair of vectors u, v ∈ V; and

(d) an operation of scalar multiplication cu between every c ∈ F and every vector u ∈ V.

Furthermore, the operations satisfy the following axioms:

(V1) (Closure under Vector Addition) For all u, v ∈ V, u + v ∈ V.

(V2) (Commutative Law for Vector Addition) For all u, v ∈ V, u + v = v + u.

(V3) (Associative Law for Vector Addition) For all u, v, w ∈ V, u + (v + w) = (u + v) + w.

(V4) (Existence of the Zero Vector) There exists a vector 0 ∈ V, called the zero vector, such that u + 0 = u for all u ∈ V.
(By Proposition 8.2.4.1, we know that there exists only one such vector in V.)

(V5) (Existence of Additive Inverses) For every vector u ∈ V, there exists a vector v ∈ V such that u + v = 0. We call v the negative of u and denote v by −u.
(By Proposition 8.2.4.2, we know that for each u, the negative of u is unique.)
By this axiom, for u, v ∈ V, we can define the subtraction of u by v to be u − v = u + (−v).

(V6) (Closure under Scalar Multiplication) For all c ∈ F and u ∈ V, cu ∈ V.

(V7) For all b, c ∈ F and u ∈ V, b(cu) = (bc)u.

(V8) For all u ∈ V, 1u = u.

(V9) For all c ∈ F and u, v ∈ V, c(u + v) = cu + cv.

(V10) For all b, c ∈ F and u ∈ V, (b + c)u = bu + cu.

(The last two axioms, i.e. (V9) and (V10), are also known as Distributive Laws.)

As a convention, we say that V together with the given vector addition and scalar multiplication is a vector space over F. If the vector addition and scalar multiplication are understood, we simply say that V is a vector space over F. In particular, if F = R, V is called a real vector space; and if F = C, V is called a complex vector space.

Example 8.2.3

1. The Euclidean n-space R^n of Chapter 3 is a real vector space using the vector addition and scalar multiplication defined in Definition 3.1.3. The axioms (V1) and (V6) are obviously satisfied; the other axioms follow from Theorem 3.1.6.

2. Let F be a field. The set F^n = {(u1, u2, ..., un) | u1, u2, ..., un ∈ F}, together with the vector addition

    (u1, u2, ..., un) + (v1, v2, ..., vn) = (u1 + v1, u2 + v2, ..., un + vn)

and the scalar multiplication

    c(u1, u2, ..., un) = (cu1, cu2, ..., cun),

where c ∈ F and (u1, u2, ..., un), (v1, v2, ..., vn) ∈ F^n, is a vector space over F. The zero vector is (0, 0, ..., 0).

In particular, Q^n, C^n and F2^n are vector spaces.

3. Let F be a field and let Mm×n(F) be the set of all m × n matrices over F, i.e. A ∈ Mm×n(F) if and only if

    A = (aij)m×n = ( a11  a12  ...  a1n )
                   ( a21  a22  ...  a2n )    where aij ∈ F for all i, j.
                   ( ...  ...  ...  ... )
                   ( am1  am2  ...  amn )

We define matrix addition and scalar multiplication for matrices over F as discussed in Discussion 8.1.7, i.e.

    (aij) + (bij) = (aij + bij)    for (aij), (bij) ∈ Mm×n(F)

and

    c(aij) = (c aij)    for c ∈ F and (aij) ∈ Mm×n(F)

(see Notation 2.1.5 for the notation used above). Then Mm×n(F) is a vector space over F. The zero vector is the zero matrix 0m×n.

In particular, Mm×n(Q), Mm×n(R), Mm×n(C) and Mm×n(F2) are vector spaces.

4. Let F be a field and let F^N be the set of all infinite sequences a = (an)_{n∈N} = (a1, a2, a3, ...) with a1, a2, a3, ... ∈ F, i.e.

    F^N = {(an)_{n∈N} | a1, a2, a3, ... ∈ F}.

For a = (an)_{n∈N}, b = (bn)_{n∈N} ∈ F^N, define the addition of sequences by

    a + b = (an + bn)_{n∈N} = (a1 + b1, a2 + b2, a3 + b3, ...).

For c ∈ F and a = (a1, a2, a3, ...) ∈ F^N, define the scalar multiplication by

    ca = (c an)_{n∈N} = (ca1, ca2, ca3, ...).

Then F^N is a vector space over F. The zero vector is the zero sequence (0, 0, 0, ...).

In particular, Q^N, R^N, C^N and F2^N are vector spaces.

5. Let F be a field. A polynomial p(x) = a0 + a1 x + ... + am x^m, with a0, a1, ..., am ∈ F, is called a polynomial over F. In particular, if F = R, p(x) is called a real polynomial; and if F = C, p(x) is called a complex polynomial.

Let P(F) be the set of all polynomials over F, i.e.

    P(F) = {a0 + a1 x + ... + am x^m | m is a nonnegative integer and a0, a1, ..., am ∈ F}.

For a0 + a1 x + ... + am x^m, b0 + b1 x + ... + bn x^n ∈ P(F) where m ≤ n, define the addition of polynomials by

    (a0 + a1 x + ... + am x^m) + (b0 + b1 x + ... + bn x^n)
        = (b0 + b1 x + ... + bn x^n) + (a0 + a1 x + ... + am x^m)
        = (a0 + b0) + (a1 + b1)x + ... + (am + bm)x^m + b_{m+1} x^{m+1} + ... + bn x^n.

For c ∈ F and a0 + a1 x + ... + am x^m ∈ P(F), define the scalar multiplication by

    c(a0 + a1 x + ... + am x^m) = ca0 + ca1 x + ... + cam x^m.

Then P(F) is a vector space over F. The zero vector is the zero polynomial 0.

In particular, P(Q), P(R), P(C) and P(F2) are vector spaces.
6. Let A be a nonempty set and F a field. Let F(A, F) be the set of all functions f : A → F. For f, g ∈ F(A, F), define the function f + g : A → F by

    (f + g)(a) = f(a) + g(a)    for a ∈ A.

For c ∈ F and f ∈ F(A, F), define the function cf : A → F by

    (cf)(a) = c f(a)    for a ∈ A.

Then F(A, F) is a vector space over F. The zero vector is the zero function 0 : A → F defined by 0(a) = 0 for a ∈ A.
7. Let F be a field and V = {0}. We define

    0 + 0 = 0    and    c0 = 0 for c ∈ F.

Then V is a vector space over F, which is called a zero space.

8. C = {a + b i | a, b ∈ R} is a vector space over R using the usual addition of complex numbers as the vector addition and the usual multiplication of real numbers with complex numbers as the scalar multiplication, i.e.

    (a + b i) + (c + d i) = (a + c) + (b + d)i    for a + b i, c + d i ∈ C with a, b, c, d ∈ R

and

    c(a + b i) = (ca) + (cb)i    for c ∈ R and a + b i ∈ C with a, b ∈ R.
9. Let V be the set of all positive real numbers, i.e. V = {a ∈ R | a > 0}.

(a) V is not a vector space over R using the usual addition of real numbers as the vector addition and the usual multiplication of real numbers as the scalar multiplication: axioms (V5) and (V6) of Definition 8.2.2 are not satisfied.

(b) Define the vector addition ⊕ by

    a ⊕ b = ab    for a, b ∈ V

and define the scalar multiplication ⊙ by

    m ⊙ a = a^m    for m ∈ R and a ∈ V.

Then V is a vector space over R using these two operations. (We leave the verification as an exercise. See Question 8.9.)
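For Part 9(b), a quick numeric spot-check of the two distributive axioms may be reassuring before doing the verification of Question 8.9 by hand. This is my own sketch, not part of the notes; it samples random elements rather than proving anything:

```python
import math
import random

oplus = lambda a, b: a * b    # the "vector addition" on V = positive reals
smul = lambda m, a: a ** m    # the "scalar multiplication" on V

for _ in range(1000):
    a, b = random.uniform(0.1, 10.0), random.uniform(0.1, 10.0)
    m, n = random.uniform(-3.0, 3.0), random.uniform(-3.0, 3.0)
    # (V9): m (.) (a (+) b) = (m (.) a) (+) (m (.) b) in the exotic operations
    assert math.isclose(smul(m, oplus(a, b)), oplus(smul(m, a), smul(m, b)))
    # (V10): (m + n) (.) a = (m (.) a) (+) (n (.) a)
    assert math.isclose(smul(m + n, a), oplus(smul(m, a), smul(n, a)))
print("samples passed; note that the zero vector of this space is the number 1")
```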

Proposition 8.2.4 Let V be a vector space over a field F.

1. (Uniqueness of the Zero Vector) If v, w ∈ V satisfy the property that u + v = u + w = u for all u ∈ V, then v = w.

2. (Uniqueness of the Additive Inverse) For any u ∈ V, if there exist v, w ∈ V such that u + v = u + w = 0, then v = w.

3. For all u ∈ V, 0u = 0 and (−1)u = −u.

4. For all c ∈ F, c0 = 0.

5. If cu = 0 where c ∈ F and u ∈ V, then c = 0 or u = 0.

Proof We only show the proof of 0u = 0 in Part 3. By (F4), 0 + 0 = 0. Thus

    0u = (0 + 0)u = 0u + 0u    by (V10).

By (V5), the vector 0u has the negative −0u. Adding −0u to both sides of the equation above yields

    0u + (−0u) = [0u + 0u] + (−0u)
    ⇒ 0u + (−0u) = 0u + [0u + (−0u)]    by (V3)
    ⇒ 0 = 0u + 0                        by (V5)
    ⇒ 0 = 0u                            by (V4).

(Proofs of the other parts are left as exercises. See Question 8.12.)

Section 8.3 Subspaces

Discussion 8.3.1 Given an arbitrary subset W of a vector space V, under the same vector addition and scalar multiplication as in V, W automatically satisfies axioms (V2), (V3) and (V7)-(V10) of Definition 8.2.2 (whenever they make sense). In case W also satisfies (V1), (V4), (V5) and (V6), it forms a vector space sitting inside the larger vector space V. For example, in R^3, the xy-plane is itself a vector space.

For most applications of linear algebra, we need to work with smaller vector spaces sitting inside a big vector space.

Definition 8.3.2 A subset W of a vector space V is called a subspace of V if W is itself a vector space using the same vector addition and scalar multiplication as in V.

In Definition 3.3.2, subspaces of R^n are defined differently. It can be shown that when we apply the definition of subspaces here to R^n, we get the same kind of subspaces as in Chapter 3 (see Section 8.4).

Example 8.3.3

1. Let V be a vector space and 0 the zero vector.

(a) Since {0} is a subset of V and {0} is a vector space (a zero space), {0} is a subspace of V.
(b) Since V is a subset of V and V is a vector space, V is a subspace of V.

These two subspaces, {0} and V, are called trivial subspaces of V. Other subspaces of V are called proper subspaces of V.

2. Let F be a field. Let V = F^2 and W = {(a, a) | a ∈ F} ⊆ V.

(V1) Take any two vectors (a, a), (b, b) ∈ W. The sum (a, a) + (b, b) = (a + b, a + b) is again a vector in W. Hence W is closed under the vector addition.
(V4) The zero vector (0, 0) of V is also contained in W.
(V5) Take any vector (a, a) ∈ W. The negative of (a, a) is (−a, −a), which is again a vector in W. So the negative of every vector in W is also contained in W.
(V6) Take any vector (a, a) ∈ W and any scalar c ∈ F. The scalar multiple c(a, a) = (ca, ca) is again a vector in W. Hence W is closed under the scalar multiplication.

By Discussion 8.3.1, W is a subspace of V. (Actually, by Theorem 8.3.4 below, we do not need to check (V5).)

Theorem 8.3.4 Let V be a vector space over a field F. A subset W of V is a subspace of V if and only if

(S1) (Containing the Zero Vector) 0 ∈ W;

(S2) (Closure under the Vector Addition) for all u, v ∈ W, u + v ∈ W; and

(S3) (Closure under the Scalar Multiplication) for all c ∈ F and u ∈ W, cu ∈ W.

Proof

(⇒) (S1), (S2) and (S3) follow directly from the definition of vector spaces.

(⇐) Suppose W satisfies (S1), (S2) and (S3). We only need to show that W satisfies (V5). (Why?)

(V5) Take any vector u ∈ W. By Proposition 8.2.4.3, −u = (−1)u and, by (S3), it is contained in W.

Remark 8.3.5 Theorem 8.3.4 can be simplified further: Let W be a nonempty subset of a vector space V over a field F. Then W is a subspace of V if and only if

    for all a, b ∈ F and u, v ∈ W,  au + bv ∈ W.    (8.1)

(The proof is left as an exercise. See Question 8.14.)

Comparing with Theorem 8.3.4, (S1) is replaced by the "nonempty" condition while (S2) and (S3) are combined into (8.1).

Example 8.3.6

1. Let F be a field and let W be the set of all matrices of the form

    ( a   b )
    ( −b  a )    with a, b ∈ F,

a subset of M2×2(F).

(S1) The zero matrix is in W (it is of the given form with a = 0, b = 0 ∈ F).

(S2) For any two matrices in W, say with parameters a1, b1 and a2, b2 ∈ F,

    ( a1   b1 )   ( a2   b2 )   ( a1+a2      b1+b2 )
    ( −b1  a1 ) + ( −b2  a2 ) = ( −(b1+b2)   a1+a2 )  ∈ W

(it is of the given form with a = a1 + a2, b = b1 + b2 ∈ F).

(S3) For any c ∈ F and any matrix in W with parameters a0, b0 ∈ F,

    c ( a0   b0 )   ( ca0   cb0 )
      ( −b0  a0 ) = ( −cb0  ca0 )  ∈ W

(it is of the given form with a = ca0, b = cb0 ∈ F).

As W is a subset of M2×2(F) satisfying (S1), (S2) and (S3), it is a subspace of M2×2(F).
2. (In this example, vectors in F^n are written as column vectors.) Let F be a field and A an m × n matrix over F. Then the solution set W of the homogeneous linear system Ax = 0 is a subspace of F^n. The subspace W is called the solution space of Ax = 0, or the nullspace of A.

Proof Note that W = {u ∈ F^n | Au = 0} ⊆ F^n.

(S1) Since A0 = 0, 0 ∈ W.

(S2) Take any u, v ∈ W, i.e. Au = 0 and Av = 0. Since

    A(u + v) = Au + Av = 0 + 0 = 0,

u + v ∈ W.

(S3) Take any c ∈ F and any u ∈ W, i.e. Au = 0. Since

    A(cu) = cAu = c0 = 0,

cu ∈ W.

As W is a subset of F^n satisfying (S1), (S2) and (S3), it is a subspace of F^n.
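In concrete cases a basis of the nullspace can be computed directly; the sketch below (mine, not part of the notes, with an arbitrarily chosen matrix) also re-checks the closure conditions (S2) and (S3):

```python
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6]])       # rank 1, so the nullspace has dimension 3 - 1 = 2
basis = A.nullspace()         # a list of column vectors spanning {x : Ax = 0}
print(basis)

u, v = basis
zero = Matrix([0, 0])
assert A * (u + v) == zero    # (S2): sums stay in the nullspace
assert A * (5 * u) == zero    # (S3): scalar multiples stay in the nullspace
```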
3. Consider the vector space R^N of infinite sequences over R. Define

    W = { (a1, a2, a3, ...) ∈ R^N | lim_{n→∞} an = 0 } ⊆ R^N.

(S1) Since the limit of the zero sequence (0, 0, 0, ...) is 0, (0, 0, 0, ...) ∈ W.

(S2) Take any a = (an)_{n∈N}, b = (bn)_{n∈N} ∈ W, i.e. lim_{n→∞} an = 0 and lim_{n→∞} bn = 0. Since

    lim_{n→∞} (an + bn) = lim_{n→∞} an + lim_{n→∞} bn = 0 + 0 = 0,

a + b = (an + bn)_{n∈N} ∈ W.

(S3) Take any c ∈ R and any a = (an)_{n∈N} ∈ W, i.e. lim_{n→∞} an = 0. Since

    lim_{n→∞} c an = c lim_{n→∞} an = c · 0 = 0,

ca = (c an)_{n∈N} ∈ W.

As W is a subset of R^N satisfying (S1), (S2) and (S3), it is a subspace of R^N.
4. Let F be a field and n a positive integer. Define Pn(F) to be the set of all polynomials over F with degree at most n. Note that

    Pn(F) = {a0 + a1 x + ... + an x^n | a0, a1, ..., an ∈ F} ⊆ P(F).

Then Pn(F) is a subspace of P(F). If m ≤ n, Pm(F) is a subspace of Pn(F).


18 Chapter 8. General Vector Spaces

5. For a; b 2 R with a < b, let [a; b] = fc 2 R j a  c  bg be a closed interval on the real


line. Consider the set F ( [a; b]; R) of all functions f : [a; b] ! R. Using the addition and
scalar multiplication de ned in Example 8.2.3.6, F ( [a; b]; R) forms a real vector space.
De ne
(a) C ([a; b]) = ff 2 F ( [a; b]; R) j f is continuous on [a; b]g;
(b) for n 2 N, C n ([a; b]) = ff 2 F ( [a; b]; R) j f is n times di erentiable on [a; b]g; and
(c) C 1 ([a; b]) = ff 2 F ( [a; b]; R) j f is in nitely di erentiable on [a; b]g.
Then C ([a; b]), C n ([a; b]) and C 1 ([a; b]) are subspaces of F ( [a; b]; R).
Note that C 1 ([a; b])  C n ([a; b])  C ([a; b])  F ( [a; b]; R) and C n ([a; b])  C m ([a; b])
whenever n  m. Thus C 1 ([a; b]) is a subspace of C ([a; b]) and C n ([a; b]) for all n 2 N;
and for each n 2 N, C n ([a; b]) is a subspace of C ([a; b]) and C m ([a; b]) if m  n.

Remark 8.3.7 Since a real polynomial can be regarded as a real-valued infinitely differentiable function, P(R) and Pn(R), for n ∈ N, can be considered as subspaces of C([a, b]), of C^m([a, b]) for all m ∈ N, and of C^∞([a, b]), for any closed interval [a, b] on the real line.

Theorem 8.3.8 If W1 and W2 are two subspaces of a vector space V, then the intersection of W1 and W2 is also a subspace of V. (Recall that for sets A and B, the intersection of A and B is the set A ∩ B = {x | x ∈ A and x ∈ B}.)

Proof

(S1) Since both W1 and W2 contain the zero vector, the zero vector is contained in W1 ∩ W2.

(S2) Take any u, v ∈ W1 ∩ W2. Since W1 is a subspace and u, v ∈ W1, we have u + v ∈ W1. Similarly, u + v ∈ W2. Thus u + v ∈ W1 ∩ W2.

(S3) Take any scalar c and any u ∈ W1 ∩ W2. Since W1 is a subspace and u ∈ W1, we have cu ∈ W1. Similarly, cu ∈ W2. Thus cu ∈ W1 ∩ W2.

As W1 ∩ W2 is a subset of V satisfying (S1), (S2) and (S3), it is a subspace of V.

Example 8.3.9 Let F be a field and n a positive integer. Define

    W1 = {A ∈ Mn×n(F) | A is an upper triangular matrix}

and

    W2 = {A ∈ Mn×n(F) | A is a lower triangular matrix}.

Both W1 and W2 are subspaces of Mn×n(F) (check it). Note that W1 ∩ W2 is the set of all n × n diagonal matrices over F; it is also a subspace of Mn×n(F).

Remark 8.3.10

1. If W1, W2, ..., Wn are subspaces of a vector space V, then W1 ∩ W2 ∩ ... ∩ Wn is also a subspace of V.

2. The union of two subspaces of a vector space may not be a subspace. (Recall that for sets A and B, the union of A and B is the set A ∪ B = {x | x ∈ A or x ∈ B}.)

For example, let F be a field, W1 = {(x, 0) | x ∈ F} and W2 = {(0, y) | y ∈ F}. It is easy to check that W1 and W2 are subspaces of F^2. Since (1, 0) ∈ W1 and (0, 1) ∈ W2, both (1, 0) and (0, 1) are elements of W1 ∪ W2. However, (1, 0) + (0, 1) = (1, 1) is contained in neither W1 nor W2 and hence is not an element of W1 ∪ W2. This shows that W1 ∪ W2 is not a subspace of F^2.

Definition 8.3.11 Let W1 and W2 be subspaces of a vector space V. The sum of W1 and W2 is defined to be the set W1 + W2 = {w1 + w2 | w1 ∈ W1 and w2 ∈ W2}.

Theorem 8.3.12 If W1 and W2 are subspaces of a vector space V, then W1 + W2 is a subspace of V.

Proof The proof is left as an exercise. See Question 8.15(a).

Example 8.3.13

1. Let W1 and W2 be two nonparallel lines in R^3 such that both lines pass through the origin. Then W1 + W2 is the plane that contains both lines. It is obvious that W1, W2 and W1 + W2 are subspaces of R^3.

[Figure: the plane W1 + W2 containing the two lines W1 and W2, showing a vector w1 + w2 decomposed as w1 ∈ W1 plus w2 ∈ W2.]

In particular, if W1 is the x-axis in R^3 and W2 is the y-axis in R^3, then W1 + W2 is the xy-plane in R^3.

2. Let F be a field. Define

    W1 = { ( a 0 ; 0 a ) | a ∈ F }    and    W2 = { ( 0 a ; −a 0 ) | a ∈ F },

where (p q ; r s) denotes the 2 × 2 matrix with rows (p, q) and (r, s). Then

    W1 + W2 = { ( a b ; −b a ) | a, b ∈ F }.

W1, W2 and W1 + W2 are subspaces of M2×2(F).
Remark 8.3.14 Let W1 and W2 be subspaces of a vector space V. Then W1 + W2 is the smallest subspace of V that contains both W1 and W2. Precisely, if U is a subspace of V such that W1 ⊆ U and W2 ⊆ U, then W1 + W2 ⊆ U. (See Question 8.15(b).)

Section 8.4 Linear Spans and Linear Independence

Discussion 8.4.1 Most of the discussions about R^n in Chapter 3 can be rephrased using abstract vector spaces. In the following two sections, we shall study the concepts of linear spans, linear independence and bases in the framework of abstract vector spaces.

Definition 8.4.2 Let V be a vector space and v1, v2, ..., vm ∈ V. For any scalars c1, c2, ..., cm, the vector

    c1 v1 + c2 v2 + ... + cm vm

is called a linear combination of v1, v2, ..., vm. (Note that we only accept linear combinations using a finite number of vectors.)

Theorem 8.4.3 Let V be a vector space over a field F and let B be a nonempty subset of V. The set of all linear combinations of vectors taken from B,

    W = {u ∈ V | u is a linear combination of some vectors from B}
      = {c1 v1 + ... + cm vm | m ∈ N, c1, ..., cm ∈ F and v1, ..., vm ∈ B},

is a subspace of V.

Proof

(S1) Take any u ∈ B. Since 0 = 0u, 0 ∈ W.

(S2) Take any u, u' ∈ W, i.e. u = a1 v1 + ... + am vm and u' = b1 v1' + ... + bn vn' for a1, ..., am, b1, ..., bn ∈ F and v1, ..., vm, v1', ..., vn' ∈ B. Then

    u + u' = a1 v1 + ... + am vm + b1 v1' + ... + bn vn',

which is also a linear combination of vectors from B. So u + u' ∈ W.

(S3) Take any c ∈ F and any u ∈ W, i.e. u = a1 v1 + ... + am vm for a1, ..., am ∈ F and v1, ..., vm ∈ B. Then

    cu = ca1 v1 + ... + cam vm,

which is also a linear combination of vectors from B. So cu ∈ W.

As W is a subset of V satisfying (S1), (S2) and (S3), it is a subspace of V.

Definition 8.4.4 Let V be a vector space over a field F and let B be a nonempty subset of V. The subspace

    W = {u ∈ V | u is a linear combination of some vectors from B}

in Theorem 8.4.3 is called the subspace of V spanned by B, and we write W = span_F(B), or simply W = span(B) if the field F is understood. Sometimes we also say that W is a linear span of B and that B spans W. Note that B ⊆ W.

In particular, if B = {v1, v2, v3, ...}, then

    W = {c1 v1 + ... + cm vm | m ∈ N and c1, ..., cm ∈ F}

and we write W = span_F{v1, v2, v3, ...} or simply W = span{v1, v2, v3, ...}, and say that W is the subspace of V spanned by the vectors v1, v2, v3, ...; that W is a linear span of the vectors v1, v2, v3, ...; and that the vectors v1, v2, v3, ... span W.

Remark 8.4.5 In Definition 8.4.4, if B is finite, say B = {v1, v2, ..., vk}, then

    span_F(B) = {c1 v1 + c2 v2 + ... + ck vk | c1, c2, ..., ck ∈ F}.

(Compare with Definition 3.2.3.)

Example 8.4.6

1. Let

    A1 = ( 1 1 ),   A2 = ( 0 1 ),   A3 = (  1  1 )   and   B = ( 1 2 )
         ( 1 1 )          ( 1 0 )        ( −1 −1 )             ( 3 4 )

be real matrices. Determine whether B is a linear combination of A1, A2 and A3.

Solution Consider the equation

    c1 A1 + c2 A2 + c3 A3 = B.

Since

    c1 A1 + c2 A2 + c3 A3 = ( c1 + c3        c1 + c2 + c3 ),
                            ( c1 + c2 − c3   c1 − c3      )

we have

    c1      + c3 = 1
    c1 + c2 + c3 = 2
    c1 + c2 − c3 = 3
    c1      − c3 = 4.

Applying Gaussian elimination to the augmented matrix:

    ( 1 0  1 | 1 )        ( 1 0  1 | 1 )
    ( 1 1  1 | 2 )   →    ( 0 1  0 | 1 )
    ( 1 1 −1 | 3 )        ( 0 0 −2 | 1 )
    ( 1 0 −1 | 4 )        ( 0 0  0 | 2 )

Since the system is inconsistent (the last row reads 0 = 2), B is not a linear combination of A1, A2 and A3.
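The same inconsistency can be detected with sympy by flattening each matrix into a vector of R^4 (my own sketch, not part of the notes):

```python
from sympy import Matrix, linsolve, symbols

c1, c2, c3 = symbols("c1 c2 c3")
# columns are A1, A2, A3 written as vectors (row-major flattening)
M = Matrix([[1, 0, 1],
            [1, 1, 1],
            [1, 1, -1],
            [1, 0, -1]])
b = Matrix([1, 2, 3, 4])              # B flattened the same way
print(linsolve((M, b), c1, c2, c3))   # EmptySet: no solution, so B is not
                                      # a linear combination of A1, A2, A3
```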
2. Let W1 = { ( a−b  a+b+2c ; 2(b+c)  3(a+c) ) | a, b, c ∈ R } ⊆ M2×2(R). Since

    ( a−b      a+b+2c )       ( 1 1 )       ( −1 1 )       ( 0 2 )
    ( 2(b+c)   3(a+c) )  =  a ( 0 3 )  +  b (  2 0 )  +  c ( 2 3 ),

we have W1 = span{ (1 1; 0 3), (−1 1; 2 0), (0 2; 2 3) } and, by Theorem 8.4.3, W1 is a subspace of M2×2(R).

3. Let W2 = {p(x) ∈ P3(R) | p(−1) = 0 and p(1) = 0} ⊆ P3(R). For any polynomial p(x) = a + bx + cx^2 + dx^3 ∈ P3(R),

    p(x) ∈ W2  ⇔  p(−1) = 0 and p(1) = 0
               ⇔  a − b + c − d = 0 and a + b + c + d = 0
               ⇔  a = −s, b = −t, c = s, d = t    for s, t ∈ R,

i.e. p(x) ∈ W2 if and only if p(x) = −s − tx + sx^2 + tx^3 = s(−1 + x^2) + t(−x + x^3) for some s, t ∈ R. Thus W2 = span{−1 + x^2, −x + x^3} and, by Theorem 8.4.3, W2 is a subspace of P3(R).
4. Let p1(x) = 1 + x + x^2, p2(x) = x + 2x^2 and p3(x) = 2 − x − x^2 be real polynomials. Prove that P2(R) = span{p1(x), p2(x), p3(x)}.

Solution Since p1(x), p2(x), p3(x) ∈ P2(R), span{p1(x), p2(x), p3(x)} ⊆ P2(R).

To show P2(R) ⊆ span{p1(x), p2(x), p3(x)}, we only need to show that any polynomial q(x) = a1 + a2 x + a3 x^2 ∈ P2(R) is a linear combination of p1(x), p2(x) and p3(x). Consider the equation

    c1 p1(x) + c2 p2(x) + c3 p3(x) = q(x).

Since

    c1 p1(x) + c2 p2(x) + c3 p3(x) = c1(1 + x + x^2) + c2(x + 2x^2) + c3(2 − x − x^2)
                                   = (c1 + 2c3) + (c1 + c2 − c3)x + (c1 + 2c2 − c3)x^2,

we have

    c1       + 2c3 = a1
    c1 + c2  −  c3 = a2
    c1 + 2c2 −  c3 = a3.

Applying Gauss-Jordan elimination to the augmented matrix:

    ( 1 0  2 | a1 )        ( 1 0 0 | (1/3)a1 + (4/3)a2 − (2/3)a3 )
    ( 1 1 −1 | a2 )   →    ( 0 1 0 | −a2 + a3                    )
    ( 1 2 −1 | a3 )        ( 0 0 1 | (1/3)a1 − (2/3)a2 + (1/3)a3 )

The system has solution c1 = (1/3)a1 + (4/3)a2 − (2/3)a3, c2 = −a2 + a3 and c3 = (1/3)a1 − (2/3)a2 + (1/3)a3. It means

    q(x) = ((1/3)a1 + (4/3)a2 − (2/3)a3) p1(x) + (−a2 + a3) p2(x) + ((1/3)a1 − (2/3)a2 + (1/3)a3) p3(x).

As every polynomial in P2(R) is a linear combination of p1(x), p2(x) and p3(x), we have P2(R) ⊆ span{p1(x), p2(x), p3(x)} and hence P2(R) = span{p1(x), p2(x), p3(x)}.
5. Let F be a field. For 1 ≤ i ≤ n, let ei be the vector in F^n whose ith coordinate is 1 and all other coordinates are 0, i.e.

    ei = (0, 0, ..., 0, 1, 0, ..., 0),

with the 1 in the ith coordinate. For any u = (u1, u2, ..., un) ∈ F^n,

    u = u1 e1 + u2 e2 + ... + un en.

So F^n = span_F{e1, e2, ..., en}.

6. Since every complex number in C can be written as a + b i for a, b ∈ R, C = span_R{1, i}. Furthermore, using the vectors e1, e2, ..., en from Part 5,

    C^n = span_C{e1, e2, ..., en}
        = span_R{e1, e2, ..., en, i e1, i e2, ..., i en}.

7. Let F be a field. For 1 ≤ i ≤ m and 1 ≤ j ≤ n, let Eij be the m × n matrix over F whose (i, j)-entry is 1 and all other entries are 0. For any A = (aij) ∈ Mm×n(F),

    A = Σ_{i=1}^{m} Σ_{j=1}^{n} aij Eij.

So Mm×n(F) = span_F{Eij | 1 ≤ i ≤ m and 1 ≤ j ≤ n}.

8. Let F be a field. We have

    P(F) = {a0 + a1 x + ... + am x^m | m ∈ N and a0, a1, ..., am ∈ F} = span_F{1, x, x^2, ...}

and

    Pn(F) = {a0 + a1 x + ... + an x^n | a0, a1, ..., an ∈ F} = span_F{1, x, ..., x^n}.

Remark 8.4.7 In Definition 8.4.2, Theorem 8.4.3 and Definition 8.4.4, we only accept linear combinations using a finite number of vectors. For example, the power series

    Σ_{i=0}^{∞} x^i = 1 + x + x^2 + ...

is not contained in span{1, x, x^2, ...}.
Definition 8.4.8 Let V be a vector space over a field F.

1. Let v1, v2, ..., vk ∈ V.

(a) The vectors v1, v2, ..., vk are called linearly independent if the vector equation

    c1 v1 + c2 v2 + ... + ck vk = 0

has only the trivial solution c1 = 0, c2 = 0, ..., ck = 0.

(b) The vectors v1, v2, ..., vk are called linearly dependent if they are not linearly independent, i.e. there exist a1, a2, ..., ak ∈ F, not all zero, such that a1 v1 + a2 v2 + ... + ak vk = 0.

(Compare this definition with Definition 3.4.2.)

2. Let B be a subset of V.

(a) B is called linearly independent if for every finite subset {v1, v2, ..., vk} of B, the vectors v1, v2, ..., vk are linearly independent.

(b) B is called linearly dependent if there exists a finite subset {v1, v2, ..., vk} of B such that v1, v2, ..., vk are linearly dependent.

(For convenience, whenever we write a set in the form {u1, u2, ..., un}, we always assume that (i) n ≥ 1; and (ii) ui ≠ uj for i ≠ j.)

Remark 8.4.9 As in the discussion in Section 3.4, linear independence is used to determine whether there are redundant vectors in a set. (See Theorem 3.4.4 and Remark 3.4.5.)

1. If a set B is linearly dependent, at least one vector u ∈ B can be expressed as a linear combination of the other vectors in B and hence u is a redundant vector, i.e. span(B \ {u}) = span(B).

2. If a set B is linearly independent, no vector in the set can be expressed as a linear combination of the other vectors in B and hence the set has no redundant vector, i.e. for all u ∈ B, span(B \ {u}) ⊊ span(B).
Example 8.4.10

1. Determine whether the subset {(1, i, 1−i), (i, 1, 0), (3i, 1, 1+i)} of C^3 is linearly independent.

Solution To answer the question, we need to solve the vector equation c1(1, i, 1−i) + c2(i, 1, 0) + c3(3i, 1, 1+i) = (0, 0, 0).

    c1(1, i, 1−i) + c2(i, 1, 0) + c3(3i, 1, 1+i) = (0, 0, 0)
    ⇔ (c1 + i c2 + 3i c3,  i c1 + c2 + c3,  (1−i)c1 + (1+i)c3) = (0, 0, 0)
    ⇔ c1 + i c2 + 3i c3 = 0
      i c1 + c2 + c3 = 0
      (1−i)c1 + (1+i)c3 = 0
    ⇔ c1 = −i t, c2 = −2t, c3 = t    for t ∈ C

(this is the system solved in Example 8.1.9.1). Since we have nontrivial solutions, {(1, i, 1−i), (i, 1, 0), (3i, 1, 1+i)} is linearly dependent.

2. Determine whether the subset {1, x + x^2, 2 − x^2, x + 3x^3} of P3(R) is linearly independent.

Solution We need to solve the polynomial equation c1 + c2(x + x^2) + c3(2 − x^2) + c4(x + 3x^3) = 0, where 0 is the zero polynomial.

    c1 + c2(x + x^2) + c3(2 − x^2) + c4(x + 3x^3) = 0
    ⇔ (c1 + 2c3) + (c2 + c4)x + (c2 − c3)x^2 + 3c4 x^3 = 0 + 0x + 0x^2 + 0x^3
    ⇔ c1 + 2c3 = 0,  c2 + c4 = 0,  c2 − c3 = 0,  3c4 = 0
    ⇔ c1 = 0, c2 = 0, c3 = 0, c4 = 0.

Since we only have the trivial solution, {1, x + x^2, 2 − x^2, x + 3x^3} is linearly independent.

3. Let f1, f2, f3 ∈ C^∞([−1, 1]) where f1(x) = e^x, f2(x) = x e^x and f3(x) = x for x ∈ [−1, 1]. Determine whether f1, f2, f3 are linearly independent.

Solution Consider the function equation c1 f1 + c2 f2 + c3 f3 = 0, where 0 is the zero function. The equation means

    c1 e^x + c2 x e^x + c3 x = c1 f1(x) + c2 f2(x) + c3 f3(x) = 0(x) = 0    for all x ∈ [−1, 1].

In particular, substituting x = −1, x = 0 and x = 1 into the equation above, we have

    e^(−1) c1 − e^(−1) c2 − c3 = 0
    c1 = 0
    e c1 + e c2 + c3 = 0
    ⇔ c1 = 0, c2 = 0, c3 = 0.

Since we only have the trivial solution, f1, f2, f3 are linearly independent.
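For Part 1, the dependence can also be read off from the rank of the matrix whose columns are the three vectors; this sympy sketch is my own, not part of the notes:

```python
from sympy import Matrix, I

M = Matrix([[1, I, 3*I],
            [I, 1, 1],
            [1 - I, 0, 1 + I]])   # columns are the three vectors of Part 1
print(M.nullspace())              # nonempty, so the columns are linearly dependent
print(M.rank())                   # 2 < 3 says the same thing
```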

Section 8.5 Bases and Dimensions

Definition 8.5.1 A subset B of a vector space V is called a basis for V if B is linearly independent and B spans V. (See Section 3.5.)

A vector space V is called finite dimensional if it has a basis consisting of finitely many vectors; otherwise, V is called infinite dimensional.

Remark 8.5.2

1. For convenience, the empty set ∅ is defined to be the basis for a zero space.

2. Every vector space has a basis. The proof of this requires a fundamental result in set theory called Zorn's Lemma.

3. In some infinite dimensional (topological) vector spaces, the bases defined in Definition 8.5.1 are called algebraic bases or Hamel bases in order to distinguish them from other kinds of "bases" where infinite sums are allowed.

Example 8.5.3

1. Let W1 = { ( a−b  a+b+2c ; 2(b+c)  3(a+c) ) | a, b, c ∈ R } ⊆ M2×2(R). By Example 8.4.6.2, we have W1 = span{ (1 1; 0 3), (−1 1; 2 0), (0 2; 2 3) }. Note that

    c1 (1 1; 0 3) + c2 (−1 1; 2 0) + c3 (0 2; 2 3) = (0 0; 0 0)
    ⇔ c1 = t, c2 = t, c3 = −t    for t ∈ R.

So { (1 1; 0 3), (−1 1; 2 0), (0 2; 2 3) } is linearly dependent and hence is not a basis for W1.

2. Let W2 = {p(x) ∈ P3(R) | p(−1) = 0 and p(1) = 0} ⊆ P3(R). By Example 8.4.6.3, we have W2 = span{−1 + x^2, −x + x^3}. Note that

    c1(−1 + x^2) + c2(−x + x^3) = 0  ⇔  c1 = 0, c2 = 0.

So {−1 + x^2, −x + x^3} is linearly independent and hence is a basis for W2.

3. By Example 8.4.6.5, we have F^n = span_F{e1, e2, ..., en}. Since

    c1 e1 + c2 e2 + ... + cn en = 0  ⇔  (c1, c2, ..., cn) = (0, 0, ..., 0)  ⇔  c1 = 0, c2 = 0, ..., cn = 0,

{e1, e2, ..., en} is linearly independent and hence is a basis for F^n. This basis is called the standard basis for F^n.

4. By Example 8.4.6.6, C^n = span{e1, e2, ..., en, i e1, i e2, ..., i en}. Note that the set

    B = {e1, e2, ..., en, i e1, i e2, ..., i en}

is linearly dependent over C but linearly independent over R. So we have the following conclusion:

(a) B is not a basis for C^n if C^n is regarded as a vector space over C.

(b) B is a basis for C^n if C^n is regarded as a vector space over R.

(We always assume that C^n is a vector space over C unless we specify otherwise.)

5. By Example 8.4.6.7, we have Mm×n(F) = span_F{Eij | 1 ≤ i ≤ m and 1 ≤ j ≤ n}. The set {Eij | 1 ≤ i ≤ m and 1 ≤ j ≤ n} is linearly independent and hence is a basis for Mm×n(F). This basis is called the standard basis for Mm×n(F).

6. By Example 8.4.6.8, we have P(F) = span_F{1, x, x^2, ...}. The set {1, x, x^2, ...} is linearly independent and hence is a basis for P(F). This basis is called the standard basis for P(F).

Also by Example 8.4.6.8, we have Pn(F) = span_F{1, x, ..., x^n}. The set {1, x, ..., x^n} is linearly independent and hence is a basis for Pn(F). This basis is called the standard basis for Pn(F).

Remark 8.5.4

1. The vector spaces F^n, Mm×n(F) and Pn(F) are finite dimensional while P(F) is infinite dimensional.

2. The vector space F^N is infinite dimensional. The vector space F(A, F) is finite dimensional if A is a finite set, and infinite dimensional if A is an infinite set.

Lemma 8.5.5 Let V be a finite dimensional vector space and B = {v1, v2, ..., vn} a basis for V. Any vector u ∈ V can be expressed uniquely as a linear combination

    u = c1 v1 + c2 v2 + ... + cn vn

where c1, c2, ..., cn are scalars.

(See Question 8.28 for the infinite dimensional version of the lemma.)

Proof The proof follows the same argument as the proof of Theorem 3.5.7.

Definition 8.5.6 Let V be a finite dimensional vector space over a field F where dim(V) = n ≥ 1.

1. A basis B = {v1, v2, ..., vn} for V is called an ordered basis if the vectors in B have a fixed order such that v1 is the first vector, v2 is the second vector, etc.

2. Let B = {v1, v2, ..., vn} be an ordered basis for V and let u ∈ V. If

    u = c1 v1 + c2 v2 + ... + cn vn    for c1, c2, ..., cn ∈ F,

then the coefficients c1, c2, ..., cn are called the coordinates of u relative to the basis B. The vector

    (u)_B = (c1, c2, ..., cn)    or    [u]_B = ( c1 )
                                              ( c2 )
                                              ( .. )
                                              ( cn )

in F^n is called the coordinate vector of u relative to the basis B.

Lemma 8.5.7 Let V be a finite dimensional vector space over a field F, where dim(V) ≥ 1, and let B be an ordered basis for V.

1. For any u, v ∈ V, u = v if and only if (u)_B = (v)_B.

2. For any v1, v2, ..., vr ∈ V and c1, c2, ..., cr ∈ F,

    (c1 v1 + c2 v2 + ... + cr vr)_B = c1 (v1)_B + c2 (v2)_B + ... + cr (vr)_B.

Proof The proof is left as an exercise. See Question 8.24.

Example 8.5.8

1. Let B1 = {v1, v2, v3} where v1 = (1, 1, 0), v2 = (0, 1, 1) and v3 = (1, 1, 1) are vectors in F2^3. Note that B1 is a basis for F2^3. Using B1 as an ordered basis, find the coordinate vector of u = (a, b, c) ∈ F2^3 relative to B1.

Solution (Recall that in F2, 1 + 1 = 0.)

    c1 v1 + c2 v2 + c3 v3 = u  ⇔  c1 + c3 = a            ⇔  c1 = b + c
                                  c1 + c2 + c3 = b           c2 = a + b
                                  c2 + c3 = c                c3 = a + b + c.

Thus (u)_B1 = (b + c, a + b, a + b + c).

2. Let B2 = {A1, A2, A3, A4} where

    A1 = ( 1 1 ),  A2 = ( 0 1 ),  A3 = (  1  1 ),  A4 = ( 1  0 )
         ( 1 1 )        ( 1 0 )        ( −1 −1 )        ( 0 −1 )

are real matrices. Note that B2 is a basis for M2×2(R). Using B2 as an ordered basis, find the coordinate vector of C = ( 1 2 ; 4 3 ) relative to B2.

Solution

    c1 A1 + c2 A2 + c3 A3 + c4 A4 = C  ⇔  c1 + c3 + c4 = 1       ⇔  c1 = 2
                                          c1 + c2 + c3 = 2           c2 = 1
                                          c1 + c2 − c3 = 4           c3 = −1
                                          c1 − c3 − c4 = 3           c4 = 0.

Thus (C)_B2 = (2, 1, −1, 0).
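Coordinates relative to an ordered basis always come from one linear system; for Part 2 the system can be set up by flattening the matrices, as in this sympy sketch of mine (not part of the notes):

```python
from sympy import Matrix, linsolve, symbols

c1, c2, c3, c4 = symbols("c1:5")
A1 = Matrix([[1, 1], [1, 1]])
A2 = Matrix([[0, 1], [1, 0]])
A3 = Matrix([[1, 1], [-1, -1]])
A4 = Matrix([[1, 0], [0, -1]])
C = Matrix([[1, 2], [4, 3]])

# flatten each 2x2 matrix into a column of R^4 and solve  M c = vec(C)
M = Matrix.hstack(*[X.reshape(4, 1) for X in (A1, A2, A3, A4)])
print(linsolve((M, C.reshape(4, 1)), c1, c2, c3, c4))   # {(2, 1, -1, 0)}
```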

Remark 8.5.9 Let V be a finite dimensional vector space over a field F such that V has a basis B with n vectors. Using the coordinate system relative to B, we can translate all vectors in V to vectors in F^n. Thus problems about V can be solved using the theorems and methods that work for F^n. In this way, most of the results in Chapter 3 and Chapter 4 for Euclidean n-spaces can be applied to other finite dimensional vector spaces.

Theorem 8.5.10 Let V be a vector space which has a basis with n vectors. Then

1. any subset of V with more than n vectors is always linearly dependent; and

2. any subset of V with fewer than n vectors cannot span V.

Hence every basis for V has exactly n vectors.

Proof The proof follows the same argument as the proof of Theorem 3.6.1.

Definition 8.5.11 The dimension of a finite dimensional vector space V over a field F, denoted by dim_F(V) or simply dim(V), is defined to be the number of vectors in a basis for V. In addition, we define the dimension of a zero space to be zero. (See also Section 3.6.)

Example 8.5.12

1. dim_F(F^n) = n.

2. dim_C(C^n) = n and dim_R(C^n) = 2n.

3. dim_F(Mm×n(F)) = mn.

4. dim_F(Pn(F)) = n + 1.

5. In Example 8.3.9, dim(W1) = dim(W2) = (1/2)n(n + 1) and dim(W1 ∩ W2) = n. (Why?)

Theorem 8.5.13 Let V be a finite dimensional vector space and B a subset of V. The following are equivalent:

1. B is a basis for V.

2. B is linearly independent and |B| = dim(V).

3. B spans V and |B| = dim(V).

Proof The proof follows the same argument as the proof of Theorem 3.6.7.

Example 8.5.14 In Example 8.4.6.4, we have P2(R) = span{p1(x), p2(x), p3(x)} where p1(x) = 1 + x + x^2, p2(x) = x + 2x^2 and p3(x) = 2 − x − x^2. Since dim(P2(R)) = 3, by Theorem 8.5.13, {p1(x), p2(x), p3(x)} is a basis for P2(R).

Theorem 8.5.15 Let W be a subspace of a finite dimensional vector space V. Then

1. dim(W) ≤ dim(V); and

2. if dim(W) = dim(V), then W = V.

Proof The proof follows the same argument as the proof of Theorem 3.6.9.

Example 8.5.16 For this example, we modify the algorithms in Example 4.1.14 and use them to find bases for finite dimensional vector spaces:

Find a basis for the subspace W of M2×2(R) spanned by

    A1 = ( 1 2 ),   A2 = ( 3 6 ),   A3 = ( 4 9 ),
         ( 2 1 )         ( 6 3 )         ( 9 5 )

    A4 = ( −2 −1 ),  A5 = ( 5 8 ),   A6 = ( 4 2 ).
         ( −1  1 )        ( 9 4 )         ( 7 3 )

Solution Use the (ordered) standard basis E = {E11, E12, E21, E22} for M2×2(R), where

    E11 = ( 1 0 ),  E12 = ( 0 1 ),  E21 = ( 0 0 ),  E22 = ( 0 0 ).
          ( 0 0 )         ( 0 0 )         ( 1 0 )         ( 0 1 )

Then

    (A1)_E = (1, 2, 2, 1),      (A2)_E = (3, 6, 6, 3),   (A3)_E = (4, 9, 9, 5),
    (A4)_E = (−2, −1, −1, 1),   (A5)_E = (5, 8, 9, 4),   (A6)_E = (4, 2, 7, 3).

Using Method 1 of Example 4.1.14.1, we place the coordinate vectors as rows and apply Gaussian elimination:

    (  1  2  2  1 )        ( 1 2 2 1 )
    (  3  6  6  3 )        ( 0 1 1 1 )
    (  4  9  9  5 )   →    ( 0 0 1 1 )
    ( −2 −1 −1  1 )        ( 0 0 0 0 )
    (  5  8  9  4 )        ( 0 0 0 0 )
    (  4  2  7  3 )        ( 0 0 0 0 )

So {(1, 2, 2, 1), (0, 1, 1, 1), (0, 0, 1, 1)} is a basis for the subspace of R^4 spanned by (A1)_E, (A2)_E, (A3)_E, (A4)_E, (A5)_E, (A6)_E. Let B1, B2, B3 be the 2 × 2 real matrices such that

    (B1)_E = (1, 2, 2, 1)  ⇒  B1 = E11 + 2E12 + 2E21 + E22 = ( 1 2 ; 2 1 ),
    (B2)_E = (0, 1, 1, 1)  ⇒  B2 = 0E11 + E12 + E21 + E22 = ( 0 1 ; 1 1 ),
    (B3)_E = (0, 0, 1, 1)  ⇒  B3 = 0E11 + 0E12 + E21 + E22 = ( 0 0 ; 1 1 ).

Then {B1, B2, B3} = { (1 2; 2 1), (0 1; 1 1), (0 0; 1 1) } is a basis for W.

Using Method 2 of Example 4.1.14.1, we place the coordinate vectors as columns and apply Gaussian elimination:

    ( 1 3 4 −2 5 4 )        ( 1 3 4 −2  5  4 )
    ( 2 6 9 −1 8 2 )   →    ( 0 0 1  3 −2 −6 )
    ( 2 6 9 −1 9 7 )        ( 0 0 0  0  1  5 )
    ( 1 3 5  1 4 3 )        ( 0 0 0  0  0  0 )

Since the 1st, 3rd and 5th columns are the pivot columns of the row-echelon form, {[A1]_E, [A3]_E, [A5]_E} is a basis for the subspace of R^4 spanned by [A1]_E, [A2]_E, [A3]_E, [A4]_E, [A5]_E, [A6]_E. Thus {A1, A3, A5} is a basis for W.
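Method 2 is easy to automate: sympy's rref reports the pivot columns directly. This sketch is mine, not part of the notes:

```python
from sympy import Matrix

cols = [Matrix(4, 1, v) for v in
        [(1, 2, 2, 1), (3, 6, 6, 3), (4, 9, 9, 5),
         (-2, -1, -1, 1), (5, 8, 9, 4), (4, 2, 7, 3)]]
M = Matrix.hstack(*cols)     # the coordinate vectors (Ai)_E as columns
_, pivots = M.rref()
print(pivots)                # (0, 2, 4): the 1st, 3rd and 5th columns,
                             # i.e. {A1, A3, A5} is a basis for W
```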
Theorem 8.5.17 Let V be a finite dimensional vector space. Suppose C is a linearly independent subset of V. Then there exists a basis B for V such that C ⊆ B.

Proof If span(C) = V, then B = C is a basis for V. Suppose span(C) ⊊ V. Then there exists u ∈ V with u ∉ span(C). Let C1 = C ∪ {u}. Note that C1 is linearly independent (check it). If span(C1) = V, then B = C1 is a basis for V. If not, we repeat the process above to find a new vector in V not in span(C1). Since V is finite dimensional, by Theorem 8.5.13, we shall eventually get enough vectors to form a basis B for V.

Example 8.5.18 Let

    C = {1 + 4x − 2x^2 + 5x^3 + x^4, 2 + 9x − x^2 + 8x^3 + 2x^4, 2 + 9x − x^2 + 9x^3 + 3x^4},

which is a linearly independent set in P4(R). Extend C to a basis for P4(R).

Solution Use the standard basis E = {1, x, x^2, x^3, x^4} for P4(R). Then

    (1 + 4x − 2x^2 + 5x^3 + x^4)_E = (1, 4, −2, 5, 1),
    (2 + 9x − x^2 + 8x^3 + 2x^4)_E = (2, 9, −1, 8, 2),
    (2 + 9x − x^2 + 9x^3 + 3x^4)_E = (2, 9, −1, 9, 3).

We use the algorithm of Example 4.1.14.2 and apply Gaussian elimination:

    ( 1 4 −2 5 1 )        ( 1 4 −2  5 1 )
    ( 2 9 −1 8 2 )   →    ( 0 1  3 −2 0 )
    ( 2 9 −1 9 3 )        ( 0 0  0  1 1 )

Since the third and fifth columns are non-pivot columns of the row-echelon form on the right,

    {(1, 4, −2, 5, 1), (2, 9, −1, 8, 2), (2, 9, −1, 9, 3), (0, 0, 1, 0, 0), (0, 0, 0, 0, 1)}

is a basis for R^5. Thus

    B = {1 + 4x − 2x^2 + 5x^3 + x^4, 2 + 9x − x^2 + 8x^3 + 2x^4, 2 + 9x − x^2 + 9x^3 + 3x^4, x^2, x^4}

is a basis for P4(R).
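The extension step can likewise be automated: the non-pivot columns of the row-echelon form tell us which standard basis vectors to append. A sympy sketch of mine (not part of the notes):

```python
from sympy import Matrix

M = Matrix([[1, 4, -2, 5, 1],
            [2, 9, -1, 8, 2],
            [2, 9, -1, 9, 3]])    # rows are the coordinate vectors of C
_, pivots = M.rref()
non_pivots = [j for j in range(M.cols) if j not in pivots]
print(pivots, non_pivots)         # (0, 1, 3) and [2, 4]: append the standard
                                  # basis vectors for columns 3 and 5, i.e. x^2 and x^4
```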

Section 8.6 Direct Sums of Subspaces

Discussion 8.6.1 Let W1 and W2 be subspaces of a vector space V. In Theorem 8.3.12, we learnt that the sum of W1 and W2,

    W1 + W2 = {w1 + w2 | w1 ∈ W1 and w2 ∈ W2},

is a subspace of V. Sometimes, in order to study the behavior of a large vector space, it is more convenient to decompose the space into sums of smaller subspaces. (For example, see Chapter 11.) To do so, we need to make sure each vector in W1 + W2 is expressed uniquely as w1 + w2 with w1 ∈ W1 and w2 ∈ W2.

Example 8.6.2 Let W1 be the xy-plane and W2 the yz-plane in R^3. Then W1 + W2 = R^3. Take (1, 2, 3) ∈ R^3. We have

    (1, 2, 3) = (1, 1, 0) + (0, 1, 3) = (1, 2, 0) + (0, 0, 3)

where (1, 1, 0), (1, 2, 0) ∈ W1 and (0, 1, 3), (0, 0, 3) ∈ W2. So there is more than one way to write (1, 2, 3) as w1 + w2 with w1 ∈ W1 and w2 ∈ W2.

Let W3 be the z-axis. If we replace W2 by W3, we still have W1 + W3 = R^3. Now every vector (a, b, c) ∈ R^3 is written uniquely as

    (a, b, c) = (a, b, 0) + (0, 0, c)

with (a, b, 0) ∈ W1 and (0, 0, c) ∈ W3.


Section 8.6. Direct Sums of Subspaces 33

De nition 8.6.3 Let W1 and W2 be subspaces of a vector space V . We say that the subspace
W1 + W2 is a direct sum of W1 and W2 if every vector u 2 W1 + W2 can be expressed uniquely
as
u = w1 + w2 where w1 2 W1 and w2 2 W2 :
In this case, we denote W1 + W2 by W1  W2 .
Note that as a set, W1  W2 = W1 + W2 = fw1 + w2 j w1 2 W1 and w2 2 W2 g. The \circle"
added to \+" can be regarded as a remark saying that the sum W1 + W2 is a direct sum.

Example 8.6.4 In Example 8.6.2, R^3 = W1 + W2 but the sum is not direct. Replacing W2 by W3, we have R^3 = W1 ⊕ W3, i.e. R^3 is a direct sum of W1 and W3.

Theorem 8.6.5 Let W1 and W2 be subspaces of a vector space V. Then W1 + W2 is a direct sum if and only if W1 ∩ W2 = {0}.

Proof

(⇒) Take any w ∈ W1 ∩ W2. Note that 0 = w + (−w) where w ∈ W1 ∩ W2 ⊆ W1 and −w ∈ W1 ∩ W2 ⊆ W2.

On the other hand, 0 = 0 + 0 where 0 ∈ W1 and 0 ∈ W2. As W1 + W2 is a direct sum, 0 can only be written uniquely as w1 + w2 for w1 ∈ W1 and w2 ∈ W2. So w = 0.

Thus W1 ∩ W2 = {0}.

(⇐) Suppose a vector u ∈ W1 + W2 can be written as

    u = w1 + w2    and    u = w1' + w2'

where w1, w1' ∈ W1 and w2, w2' ∈ W2. Then w1 + w2 = w1' + w2' implies

    w1 − w1' = w2' − w2.

Since W1 is a subspace and w1, w1' ∈ W1, w1 − w1' ∈ W1. Similarly, w2' − w2 ∈ W2. It follows that w1 − w1' = w2' − w2 ∈ W1 ∩ W2. As W1 ∩ W2 = {0}, we conclude that

    w1 − w1' = w2' − w2 = 0

and hence w1 = w1' and w2 = w2'. So the expression is unique.

We have shown that W1 + W2 is a direct sum.

Example 8.6.6

1. In Example 8.6.2, W1 ∩ W2 is the y-axis while W1 ∩ W3 = {(0, 0, 0)}. By Theorem 8.6.5, W1 + W2 is not a direct sum and W1 + W3 is a direct sum.

2. Let

    W1 = {A ∈ Mn×n(R) | A^T = A}    and    W2 = {A ∈ Mn×n(R) | A^T = −A}.

Both W1 and W2 are subspaces of Mn×n(R), so W1 + W2 ⊆ Mn×n(R). For any matrix B ∈ Mn×n(R),

    B = (1/2)(B + B^T) + (1/2)(B − B^T)

where (1/2)(B + B^T) ∈ W1 and (1/2)(B − B^T) ∈ W2. This means B ∈ W1 + W2 for all B ∈ Mn×n(R), so Mn×n(R) ⊆ W1 + W2.

Thus we have shown W1 + W2 = Mn×n(R).

Furthermore, W1 ∩ W2 = {0}. By Theorem 8.6.5, W1 + W2 is a direct sum, i.e. Mn×n(R) = W1 ⊕ W2.

In this example, elements of W1 are symmetric matrices while elements of W2 are called skew-symmetric matrices or anti-symmetric matrices.

3. Let W1 be the subspace of C([0, 2π]) spanned by g ∈ C([0, 2π]) where g(x) = sin(x) for x ∈ [0, 2π]. Define

    W2 = { f ∈ C([0, 2π]) | ∫_0^{2π} f(t) sin(t) dt = 0 }.

Then W2 is a subspace of C([0, 2π]) and C([0, 2π]) = W1 ⊕ W2. (We leave the verification of these results as an exercise. See Question 8.34.)
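For Part 2, the decomposition B = (1/2)(B + B^T) + (1/2)(B − B^T) is easy to check on a concrete matrix; the following sympy sketch is my own illustration, not part of the notes:

```python
from sympy import Matrix, Rational

B = Matrix([[1, 2, 3],
            [4, 5, 6],
            [7, 8, 10]])               # an arbitrarily chosen real matrix
S = Rational(1, 2) * (B + B.T)         # symmetric part, an element of W1
K = Rational(1, 2) * (B - B.T)         # skew-symmetric part, an element of W2
assert S.T == S and K.T == -K and S + K == B
print(S)
print(K)
```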

Theorem 8.6.7 Let W1 and W2 be subspaces of a vector space V. Suppose W1 + W2 is a direct sum, i.e. W1 ∩ W2 = {0}.

1. If B1 and B2 are bases for W1 and W2 respectively, then B1 ∪ B2 is a basis for W1 ⊕ W2.

2. If both W1 and W2 are finite dimensional, then

    dim(W1 ⊕ W2) = dim(W1) + dim(W2).

(See Question 8.35 for a formula for dim(W1 + W2) when W1 + W2 is not a direct sum.)

Proof

1. (i) It is obvious that span(B1 ∪ B2) ⊆ W1 ⊕ W2.

Take any u ∈ W1 ⊕ W2, i.e. u = w1 + w2 where w1 ∈ W1 and w2 ∈ W2. Since B1 and B2 span W1 and W2 respectively, there exist u1, u2, ..., uk ∈ B1 and v1, v2, ..., vm ∈ B2 such that

    w1 = a1 u1 + a2 u2 + ... + ak uk    and    w2 = b1 v1 + b2 v2 + ... + bm vm

for some scalars a1, a2, ..., ak, b1, b2, ..., bm. Then

    u = a1 u1 + a2 u2 + ... + ak uk + b1 v1 + b2 v2 + ... + bm vm

is a linear combination of vectors from B1 ∪ B2. So W1 ⊕ W2 ⊆ span(B1 ∪ B2).

Thus we have shown that span(B1 ∪ B2) = W1 ⊕ W2.

(ii) By Definition 8.4.8, to show that B1 ∪ B2 is linearly independent, we need to show that every finite subset of B1 ∪ B2 is linearly independent.

Take any finite set C = {u1, u2, ..., uk, v1, v2, ..., vm} where u1, u2, ..., uk ∈ B1 and v1, v2, ..., vm ∈ B2. Consider the vector equation

    c1 u1 + c2 u2 + ... + ck uk + d1 v1 + d2 v2 + ... + dm vm = 0.    (8.2)

Set w = c1 u1 + c2 u2 + ... + ck uk. Note that w ∈ span(B1) = W1. By (8.2),

    w = −d1 v1 − d2 v2 − ... − dm vm ∈ span(B2) = W2

and hence w ∈ W1 ∩ W2. Since W1 ∩ W2 = {0}, w = 0. This means

    c1 u1 + c2 u2 + ... + ck uk = 0    and    d1 v1 + d2 v2 + ... + dm vm = 0.

As B1 and B2 are linearly independent, the two equations above have only the trivial solutions c1 = 0, c2 = 0, ..., ck = 0, d1 = 0, d2 = 0, ..., dm = 0. Thus equation (8.2) has only the trivial solution and C is linearly independent.

We have shown that B1 ∪ B2 is linearly independent.

By (i) and (ii), B1 ∪ B2 is a basis for W1 ⊕ W2.

2. Since W1 ∩ W2 = {0}, B1 ∩ B2 = ∅. So by Part 1, dim(W1 ⊕ W2) = |B1 ∪ B2| = |B1| + |B2| = dim(W1) + dim(W2).

Remark 8.6.8 In Theorem 8.6.7.1, suppose W1 + W2 is not a direct sum, i.e. W1 ∩ W2 ≠ {0}.
It is still true that span(B1 ∪ B2) = W1 + W2 but B1 ∪ B2 may not be linearly independent and
hence B1 ∪ B2 may not be a basis for W1 + W2.
For example, let W1 be the xy-plane and W2 the yz-plane (see Example 8.6.2). Take bases
B1 = {(1, 0, 0), (0, 1, 0)} and B2 = {(0, 1, 1), (0, 0, 1)} for W1 and W2 respectively. It is obvious
that span(B1 ∪ B2) = W1 + W2 but B1 ∪ B2 is linearly dependent and hence B1 ∪ B2 is not
a basis for W1 + W2.

Definition 8.6.9 Let V be a vector space and W1, W2, ..., Wk subspaces of V.

1. The sum of W1, W2, ..., Wk is defined to be

    W1 + W2 + ··· + Wk = {w1 + w2 + ··· + wk | wi ∈ Wi for i = 1, 2, ..., k},

which is a subspace of V.

2. The subspace W1 + W2 + ··· + Wk is said to be a direct sum of W1, W2, ..., Wk if every
vector u ∈ W1 + W2 + ··· + Wk can be expressed uniquely as

    u = w1 + w2 + ··· + wk  where wi ∈ Wi for i = 1, 2, ..., k.

In this case, we shall write the sum W1 + W2 + ··· + Wk as W1 ⊕ W2 ⊕ ··· ⊕ Wk.

Remark 8.6.10 We can use Theorem 8.6.5 repeatedly to determine whether the sum W1 +
W2 + ··· + Wk is a direct sum. For example, check the following one by one:

    W1 ∩ W2 = {0},  (W1 + W2) ∩ W3 = {0},  ...,  (W1 + ··· + Wk−1) ∩ Wk = {0}.

Example 8.6.11

1. Let W1, W2 and W3 be the x-axis, the y-axis and the z-axis in R³ respectively. Since
every (a, b, c) ∈ R³ can be expressed uniquely as

    (a, b, c) = (a, 0, 0) + (0, b, 0) + (0, 0, c)

where (a, 0, 0) ∈ W1, (0, b, 0) ∈ W2 and (0, 0, c) ∈ W3, R³ is a direct sum of W1, W2, W3,
i.e. R³ = W1 ⊕ W2 ⊕ W3.
Alternatively, we can use the method discussed in Remark 8.6.10. It is obvious that
W1 ∩ W2 = {(0, 0, 0)}. Since W1 + W2 is the xy-plane, (W1 + W2) ∩ W3 = {(0, 0, 0)}. Thus
W1 + W2 + W3 is a direct sum.

2. Let V be a finite dimensional vector space over a field F and let {v1, v2, ..., vn} be a
basis for V. For each i = 1, 2, ..., n, define Wi = span{vi}. Since each u ∈ V can be
expressed uniquely as u = c1v1 + c2v2 + ··· + cnvn for c1, c2, ..., cn ∈ F,

    V = W1 ⊕ W2 ⊕ ··· ⊕ Wn.

The example in Part 1 is a particular case of this example.
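As an illustrative sketch (using numpy; the rank test below is equivalent to the chain of
intersection conditions in Remark 8.6.10, because dim(U ∩ Wi) = dim(U) + dim(Wi) −
dim(U + Wi) by Question 8.35), we can check the direct sum in Part 1 numerically.

    import numpy as np

    # Basis vectors of the x-, y- and z-axis in R^3, stacked as rows.
    W1 = np.array([[1.0, 0.0, 0.0]])
    W2 = np.array([[0.0, 1.0, 0.0]])
    W3 = np.array([[0.0, 0.0, 1.0]])

    rank = np.linalg.matrix_rank
    # The sum is direct iff the rank grows by dim(Wi) at every step.
    assert rank(W1) == 1
    assert rank(np.vstack([W1, W2])) == 2
    assert rank(np.vstack([W1, W2, W3])) == 3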



Section 8.7 Cosets and Quotient Spaces

Definition 8.7.1 Let W be a subspace of a vector space V. For u ∈ V, the set

    W + u = {w + u | w ∈ W}

is called the coset of W containing u.

Example 8.7.2
1. Let W be the subspace of F₂³ spanned by (1, 0, 1) and (0, 1, 1). Then

    W = {a(1, 0, 1) + b(0, 1, 1) | a, b ∈ F₂} = {(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)}.

The following are all the cosets of W:

    W + (0, 0, 0) = {(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)},
    W + (0, 0, 1) = {(0, 0, 1), (1, 0, 0), (0, 1, 0), (1, 1, 1)},
    W + (0, 1, 0) = {(0, 1, 0), (1, 1, 1), (0, 0, 1), (1, 0, 0)},
    W + (0, 1, 1) = {(0, 1, 1), (1, 1, 0), (0, 0, 0), (1, 0, 1)},
    W + (1, 0, 0) = {(1, 0, 0), (0, 0, 1), (1, 1, 1), (0, 1, 0)},
    W + (1, 0, 1) = {(1, 0, 1), (0, 0, 0), (1, 1, 0), (0, 1, 1)},
    W + (1, 1, 0) = {(1, 1, 0), (0, 1, 1), (1, 0, 1), (0, 0, 0)},
    W + (1, 1, 1) = {(1, 1, 1), (0, 1, 0), (1, 0, 0), (0, 0, 1)}.

Note that W + (0, 0, 0) = W + (0, 1, 1) = W + (1, 0, 1) = W + (1, 1, 0) = W
and W + (0, 0, 1) = W + (0, 1, 0) = W + (1, 0, 0) = W + (1, 1, 1) = F₂³ \ W.
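The enumeration above is easy to reproduce by machine; the following Python sketch
(arithmetic is coordinatewise modulo 2; the function name is ours) recovers the two distinct
cosets.

    from itertools import product

    # W = span{(1,0,1), (0,1,1)} in F_2^3, listed explicitly.
    W = {(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)}

    def coset(u):
        return frozenset(tuple((wi + ui) % 2 for wi, ui in zip(w, u)) for w in W)

    cosets = {coset(u) for u in product((0, 1), repeat=3)}
    print(len(cosets))   # 2: the eight cosets W + u give only two distinct sets
    for c in cosets:
        print(sorted(c))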
2. Let W = {(x, y) ∈ R² | x − 2y = 0}. It is the line in R² represented by the homogeneous
linear equation x − 2y = 0 and is a subspace of R².
Take any (a, b) ∈ R². The coset of W containing (a, b) is

    W + (a, b) = {(x, y) + (a, b) | (x, y) ∈ R² and x − 2y = 0}
               = {(x′, y′) ∈ R² | x′ − 2y′ = a − 2b}

which is the line in R² represented by the linear equation x − 2y = c where c = a − 2b.
Note that the line x − 2y = c is parallel to x − 2y = 0.
[Figure: the line W through the origin and the parallel line W + (a, b) through the point (a, b).]
In general, let W be a line in R² (or R³) that passes through the origin. The cosets of W
are lines in R² (or R³) parallel to W. (See also Discussion 3.2.15.1.)

3. Let W = {(x, y, z) ∈ R³ | z = 0}. It is the xy-plane in R³ represented by the homogeneous
linear equation z = 0 and is a subspace of R³.
Take any (a, b, c) ∈ R³. As in the example in Part 2, the coset of W containing (a, b, c) is

    W + (a, b, c) = {(x′, y′, z′) ∈ R³ | z′ = c}

which is the plane in R³ represented by the linear equation z = c. Note that the plane
z = c is parallel to the xy-plane.
[Figure: the xy-plane W and the parallel plane W + (a, b, c) through the point (a, b, c).]
In general, let W be a plane in R³ that contains the origin. The cosets of W are planes
in R³ parallel to W. (See also Discussion 3.2.15.2.)
4. (In this example, vectors in Fⁿ are written as column vectors.) Let A be an m × n matrix
over a field F and let W be the nullspace of A, i.e. W = {u ∈ Fⁿ | Au = 0}. Take any
v ∈ Fⁿ. Set b = Av. The coset of W containing v is

    W + v = {u + v | u ∈ Fⁿ and Au = 0} = {w ∈ Fⁿ | Aw = b}

which is the solution set of the linear system Ax = b. (See also Theorem 4.3.6.)
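Part 4 is easy to see in coordinates. In the Python sketch below (using numpy; the matrix
A and the particular vector v are our own choices), W is the line x − 2y = 0 from Part 2,
realised as the nullspace of a 1 × 2 matrix.

    import numpy as np

    A = np.array([[1.0, -2.0]])      # nullspace of A is W = {(x, y) | x - 2y = 0}
    v = np.array([3.0, 1.0])         # a particular vector; set b = Av
    b = A @ v

    w = np.array([2.0, 1.0])         # w ∈ W since Aw = 0
    assert np.allclose(A @ w, 0)
    assert np.allclose(A @ (v + w), b)   # every vector of W + v solves Ax = b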

Theorem 8.7.3 Let W be a subspace of a vector space V.

1. For any v, w ∈ V, the following are equivalent:
(a) v ∈ W + w;
(b) w ∈ W + v;
(c) v − w ∈ W;
(d) W + v = W + w.
2. For any v, w ∈ V, either W + v = W + w or (W + v) ∩ (W + w) = ∅.

Proof
1. ((a)⇔(b)) Since W is a subspace, u ∈ W if and only if −u ∈ W. Thus

    v ∈ W + w ⇔ v = u + w for some u ∈ W
              ⇔ w = (−u) + v for some u ∈ W
              ⇔ w ∈ W + v.

((a)⇔(c))

    v ∈ W + w ⇔ v = u + w for some u ∈ W
              ⇔ v − w = u for some u ∈ W
              ⇔ v − w ∈ W.

((a)⇔(d)) (⇒) Suppose v ∈ W + w. Then v − w ∈ W. Hence

    u ∈ W + v ⇔ u − v ∈ W ⇔ (u − v) + (v − w) ∈ W
              ⇔ u − w ∈ W ⇔ u ∈ W + w.

So W + v = W + w.
(⇐) Since v ∈ W + v, if W + v = W + w, then v ∈ W + w.

2. Given v, w ∈ V, assume (W + v) ∩ (W + w) ≠ ∅. We need to show that W + v = W + w.
Take any u ∈ (W + v) ∩ (W + w), i.e. u ∈ W + v and u ∈ W + w. By Part 1, we have
W + v = W + u = W + w.

Example 8.7.4

1. Following Example 8.7.2.2, let

    W = {(x, y) ∈ R² | x − 2y = 0}.

Let v = (2, 3) and w = (0, 2). Since

    v − w = (2, 1) ∈ W,

by Theorem 8.7.3.1, W + v = W + w.
[Figure: the line W and the parallel line W + v = W + w, which passes through both v and w.]

2. Following Example 8.7.2.3, let

    W = {(x, y, z) ∈ R³ | z = 0}.

Let w = (0, 1, 2) and v = (0, 0, 2). Since

    v − w = (0, −1, 0) ∈ W,

by Theorem 8.7.3.1, W + v = W + w.
[Figure: the xy-plane W and the parallel plane W + v = W + w, which contains both v and w.]

Lemma 8.7.5 Let V be a vector space over a field F and let W be a subspace of V.
1. Suppose u1, u2, v1, v2 ∈ V such that W + u1 = W + u2 and W + v1 = W + v2. Then
W + (u1 + v1) = W + (u2 + v2).
2. Suppose u1, u2 ∈ V such that W + u1 = W + u2. Then W + cu1 = W + cu2 for all
c ∈ F.

Proof The proof is left as an exercise. See Question 8.39.

Definition 8.7.6 Let V be a vector space over a field F and let W be a subspace of V. We
define the addition of two cosets by

    (W + u) + (W + v) = W + (u + v)  for u, v ∈ V.   (8.3)

Let A and B be two cosets of W. Since A and B can be represented as W + u and W + v,
respectively, by many different choices of u and v, we need to make sure that our definition of
A + B always gives the same answer regardless of the choices of u and v. Luckily, by
Lemma 8.7.5.1, the addition defined in (8.3) is well defined.
Similarly, we define the scalar multiplication of a coset by

    c(W + u) = W + cu  for c ∈ F and u ∈ V.   (8.4)

By Lemma 8.7.5.2, the scalar multiplication defined in (8.4) is well defined.

Example 8.7.7 Following Example 8.7.2.2, let W = {(x, y) ∈ R² | x − 2y = 0}. Let u = (1, 1)
and v = (−2, 1). Note that

    W + u = W + (1, 1) = {(x, y) | x − 2y = −1}  and  W + v = W + (−2, 1) = {(x, y) | x − 2y = −4}.

Then

    (W + u) + (W + v) = W + (u + v) = W + (−1, 2) = {(x, y) | x − 2y = −5}

and

    3(W + u) = W + 3u = W + (3, 3) = {(x, y) | x − 2y = −3}.

[Figure A: the lines W, W + u, W + v and (W + u) + (W + v) = W + (u + v), with the vectors
u, v and u + v drawn. Figure B: the lines W, W + u and 3(W + u) = W + 3u, with the vectors
u and 3u drawn.]

Let u′ = (0, 1/2) and v′ = (0, 2). Note that

    W + u′ = W + (0, 1/2) = {(x, y) ∈ R² | x − 2y = −1} = W + u

and

    W + v′ = W + (0, 2) = {(x, y) ∈ R² | x − 2y = −4} = W + v.

Also

    (W + u′) + (W + v′) = W + (u′ + v′) = W + (0, 5/2) = {(x, y) | x − 2y = −5} = (W + u) + (W + v)

and

    3(W + u′) = W + 3u′ = W + (0, 3/2) = {(x, y) | x − 2y = −3} = 3(W + u).

(As an exercise, draw the vectors u′, v′, u′ + v′ on Figure A above and also draw the vectors
u′, 3u′ on Figure B.)
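Because a coset of W is exactly a level set of the quantity x − 2y, the computations in this
example can be phrased in terms of that invariant. The Python sketch below (the function
names are ours) checks that the answers do not depend on the representatives chosen.

    # A coset of W = {(x, y) | x - 2y = 0} is determined by the value c = x - 2y.
    def invariant(p):
        x, y = p
        return x - 2 * y

    u, v = (1, 1), (-2, 1)
    u2, v2 = (0, 0.5), (0, 2)        # other representatives of the same cosets
    assert invariant(u) == invariant(u2) == -1
    assert invariant(v) == invariant(v2) == -4

    add = lambda p, q: (p[0] + q[0], p[1] + q[1])
    # (W + u) + (W + v) = W + (u + v): the invariants add up, for either choice.
    assert invariant(add(u, v)) == invariant(add(u2, v2)) == -5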
Theorem 8.7.8 Let V be a vector space over a field F and W a subspace of V. Denote the
set of all cosets of W in V by V/W, i.e.

    V/W = {W + u | u ∈ V}.

Then V/W is a vector space over F using the addition and scalar multiplication defined in (8.3)
and (8.4).
Proof (V1) and (V6) follow from the definitions of the addition and scalar multiplication.
(V2)-(V3) and (V7)-(V10) follow directly from the properties of V.
For example, for all W + u, W + v ∈ V/W,

    (W + u) + (W + v) = W + (u + v)
                      = W + (v + u)        (by (V2) of V)
                      = (W + v) + (W + u)

and hence (V2) is satisfied.
Finally, for (V4), the zero vector is W (= W + 0); and for (V5), the negative of W + u ∈ V/W
is W + (−u) (which we usually write as W − u).

Definition 8.7.9 The vector space V/W in Theorem 8.7.8 is called the quotient space of V
modulo W.

Remark 8.7.10 In abstract algebra, "quotients" are used to define modular arithmetic for
algebraic structures.
For example, let nZ = {0, ±n, ±2n, ...} ⊆ Z. Define

    Z/nZ = {nZ + a | a ∈ Z}  where for a ∈ Z, nZ + a = {a, ±n + a, ±2n + a, ...}.

For a, b ∈ Z, nZ + a = nZ + b if and only if a ≡ b (mod n). The operations of addition and
multiplication are defined by

    (nZ + a) + (nZ + b) = nZ + (a + b)  and  (nZ + a)(nZ + b) = nZ + ab  for nZ + a, nZ + b ∈ Z/nZ.

These operations resemble the arithmetic of integer addition and multiplication modulo n.
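A quick Python sketch of this remark (with n = 5, our own choice; representatives are
reduced modulo n, so the operations do not depend on which representative of a coset is
used):

    n = 5

    def add(a, b):
        return (a + b) % n

    def mul(a, b):
        return (a * b) % n

    # (5Z + 3) + (5Z + 4) = 5Z + 2 and (5Z + 3)(5Z + 4) = 5Z + 12 = 5Z + 2,
    # whichever representatives of the two cosets we start from.
    assert add(3, 4) == add(3 + 5, 4 - 10) == 2
    assert mul(3, 4) == mul(3 + 5, 4 + 5) == 2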

Theorem 8.7.11 Let V be a finite dimensional vector space and W a subspace of V. Let
{w1, w2, ..., wm} be a basis for W.
1. For v1, v2, ..., vk ∈ V, {v1, v2, ..., vk, w1, w2, ..., wm} is a basis for V if and only if
{W + v1, W + v2, ..., W + vk} is a basis for V/W.
2. dim(V/W) = dim(V) − dim(W).

Proof In the following, we only prove that if {v1, v2, ..., vk, w1, w2, ..., wm} is a basis for
V, then {W + v1, W + v2, ..., W + vk} is a basis for V/W:
(i) Take any W + u ∈ V/W. As u ∈ V,

    u = a1v1 + a2v2 + ··· + akvk + b1w1 + b2w2 + ··· + bmwm

for some scalars a1, a2, ..., ak, b1, b2, ..., bm. Since b1w1 + b2w2 + ··· + bmwm ∈ W,
u − (a1v1 + a2v2 + ··· + akvk) ∈ W. By Theorem 8.7.3.1,

    W + u = W + (a1v1 + a2v2 + ··· + akvk)
          = a1(W + v1) + a2(W + v2) + ··· + ak(W + vk)
          ∈ span{W + v1, W + v2, ..., W + vk}.

Hence V/W = span{W + v1, W + v2, ..., W + vk}.
(ii) Consider the equation

    c1(W + v1) + c2(W + v2) + ··· + ck(W + vk) = W.   (8.5)

As c1(W + v1) + c2(W + v2) + ··· + ck(W + vk) = W + (c1v1 + c2v2 + ··· + ckvk), the
equation (8.5) implies c1v1 + c2v2 + ··· + ckvk ∈ W. So

    c1v1 + c2v2 + ··· + ckvk = d1w1 + d2w2 + ··· + dmwm

for some scalars d1, d2, ..., dm, and hence

    c1v1 + c2v2 + ··· + ckvk − d1w1 − d2w2 − ··· − dmwm = 0.   (8.6)

Since v1, v2, ..., vk, w1, w2, ..., wm are linearly independent, all the coefficients in
(8.6) must be zero. In particular, we have c1 = 0, c2 = 0, ..., ck = 0, i.e. (8.5) has only
the trivial solution.
So W + v1, W + v2, ..., W + vk are linearly independent.
By (i) and (ii), {W + v1, W + v2, ..., W + vk} is a basis for V/W.
(Proofs of the other parts are left as exercises. See Question 8.44.)

Example 8.7.12
1. Following Example 8.7.2.2, let W = {(x, y) ∈ R² | x − 2y = 0}. It is easy to check
that W = span{(2, 1)} and hence {(2, 1)} is a basis for W and dim(W) = 1. We extend
{(2, 1)} to a basis {(2, 1), (0, 1)} for R². By Theorem 8.7.11.1, {W + (0, 1)} is a basis for
the quotient space R²/W.
Note that dim(R²/W) = 1 = 2 − 1 = dim(R²) − dim(W).

2. Let W = span{(2, 2, 1, 0, 1), (−2, −2, 4, 6, 2), (0, 0, 1, 1, 1), (1, 1, 2, 0, 1)} be a subspace
of R⁵.

    [ 2  2  1  0  1 ]                        [ 2  2  1  0  1 ]
    [-2 -2  4  6  2 ]  Gaussian elimination  [ 0  0  3  6  3 ]
    [ 0  0  1  1  1 ]  ------------------->  [ 0  0  0  3  0 ]
    [ 1  1  2  0  1 ]                        [ 0  0  0  0  0 ]

So C = {(2, 2, 1, 0, 1), (0, 0, 3, 6, 3), (0, 0, 0, 3, 0)} is a basis for W. Following the algo-
rithm in Example 4.1.14.2 (see also Example 8.5.18), we extend C to a basis for R⁵:

    {(2, 2, 1, 0, 1), (0, 0, 3, 6, 3), (0, 0, 0, 3, 0), (0, 1, 0, 0, 0), (0, 0, 0, 0, 1)}.

By Theorem 8.7.11.1, {W + (0, 1, 0, 0, 0), W + (0, 0, 0, 0, 1)} is a basis for the quotient
space R⁵/W.
Note that dim(R⁵/W) = 2 = 5 − 3 = dim(R⁵) − dim(W).
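As a small computational companion to Part 1 (using numpy; the extension vector (0, 1)
is the one chosen above), we can confirm the dimensions:

    import numpy as np

    w = np.array([[2.0, 1.0]])                   # basis of W, as a row
    extended = np.vstack([w, [[0.0, 1.0]]])      # {(2,1), (0,1)}

    dim_W = np.linalg.matrix_rank(w)             # 1
    assert np.linalg.matrix_rank(extended) == 2  # the extended set is a basis of R^2
    print(2 - dim_W)                             # dim(R^2/W) = 1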

Exercise 8

Question 8.1 to Question 8.6 are exercises for Section 8.1.

1. (a) Let A = { [a b; −b a] | a, b ∈ R } (here [a b; −b a] denotes the 2 × 2 real matrix with
rows (a, b) and (−b, a)).
Prove that A is a field using the usual matrix addition and multiplication.
(Hint: The axioms (F2), (F3), (F8) and (F11) are also properties of matrices. You
only need to check the remaining axioms.)
(b) Let B = { [a b; b a] | a, b ∈ R }.
Is B a field using the usual matrix addition and multiplication?

2. Let K = {a + b√2 | a, b ∈ Q} ⊆ R. Show that K is a field using the usual addition and
multiplication of real numbers.
(Hint: (F2), (F3), (F7), (F8) and (F11) are properties of real numbers. You only need to
check the remaining axioms.)

3. The finite field F4 is defined as follows: Let F4 = {0, 1, x, 1 + x}. The addition and
multiplication are the same as the polynomial addition and multiplication except 1 + 1 = 0,
x + x = 0 and x² = 1 + x.
(a) Write down the addition and multiplication tables for F4 (as in Example 8.1.3.2).
(b) For every element a of F4, find −a.
(c) For every nonzero element b of F4, find b⁻¹.
(d) Find the inverse of the square matrix

    [ 1  1  1 ]
    [ 0  x  1 ]  over F4.
    [ 0  0  1 ]

4. Solve the following linear system over F when (i) F = R and (ii) F = F₂.

    x1 + x2 + x3      + x5 = 0
    x1      + x3 + x4      = 1
         x2      + x4 + x5 = 1

5. Complete the proof of Proposition 8.1.5:
Let F be a field.
(a) If b, c ∈ F satisfy the property that a + b = a + c = a for all a ∈ F, show that b = c.
(b) If b, c are nonzero elements in F satisfying the property ba = ca = a for all a ∈ F,
show that b = c.
(c) For any a ∈ F with a ≠ 0, if there exist b, c ∈ F such that ab = ac = 1, show that
b = c.
(d) For any a ∈ F, show that a0 = 0.
(e) For any a ∈ F, show that (−1)a = −a.
(f) For any a, b ∈ F, prove that if ab = 0, then a = 0 or b = 0.

6. Prove Proposition 8.1.11:
(a) If A and B are n × n matrices over F, show that tr(A + B) = tr(A) + tr(B).
(b) If c ∈ F and A is an n × n matrix over F, show that tr(cA) = c tr(A).
(c) If C and D are m × n and n × m matrices, respectively, over F, show that tr(CD) =
tr(DC).

Question 8.7 to Question 8.12 are exercises for Section 8.2.

7. For each of the following, list all axioms in the definition of vector space, i.e. Definition
8.2.2, which are not satisfied by the given vector addition and scalar multiplication defined
on the set V over R.
(a) V = R², (x, y) + (x′, y′) = (x + x′, x + x′ + y + y′) and c(x, y) = (cx, cy) for
(x, y), (x′, y′) ∈ V and c ∈ R.
(b) V = {(x, y) ∈ R² | y ≠ 0}, (x, y) + (x′, y′) = (x + x′, yy′) and c(x, y) = (cx, y) for
(x, y), (x′, y′) ∈ V and c ∈ R.

8. Prove that R² is a vector space over R using the following vector addition and scalar
multiplication:

    (x, y) + (x′, y′) = (x + x′ + 1, y + y′ − 2)  and  c(x, y) = (cx + c − 1, cy − 2c + 2)

for (x, y), (x′, y′) ∈ R² and c ∈ R.

9. Verify Example 8.2.3.9(b):
Let V be the set of all positive real numbers, i.e. V = {a ∈ R | a > 0}. Define the vector
addition ⊕ by
    a ⊕ b = ab  for a, b ∈ V
and define the scalar multiplication ⊙ by
    m ⊙ a = aᵐ  for m ∈ R and a ∈ V.
Prove that V is a vector space over R using these two operations.

10. Let A = {0, 1}. Let f, g, h ∈ F(A, R) such that f(t) = 1 + t², g(t) = sin(πt/2) + cos(πt/2)
and h(t) = t for t ∈ A. Show that f = g + h.

11. Let U and V be vector spaces over a field F. Let

    U × V = {(u, v) | u ∈ U and v ∈ V}.

Define the vector addition and scalar multiplication as follows:

    (u, v) + (u′, v′) = (u + u′, v + v′)  and  c(u, v) = (cu, cv)

for (u, v), (u′, v′) ∈ U × V and c ∈ F. Prove that U × V is a vector space.

12. Complete the proof of Proposition 8.2.4:
Let V be a vector space over a field F.
(a) If v, w ∈ V satisfy the property that u + v = u + w = u for all u ∈ V, show that
v = w.
(b) For any u ∈ V, if there exist v, w ∈ V such that u + v = u + w = 0, show that
v = w.
(c) For all u ∈ V, show that (−1)u = −u.
(d) For all c ∈ F, show that c0 = 0.
(e) If cu = 0 where c ∈ F and u ∈ V, prove that c = 0 or u = 0.

Question 8.13 to Question 8.16 are exercises for Section 8.3.

13. For each of the following subsets W of the vector space V, determine whether W is a
subspace of V.
(a) V = F₂³ and W = {(0, 0, 0), (1, 1, 0), (0, 0, 1)}.
(b) V = F₂³ and W = {(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)}.
(c) V = Mn×n(C) and W = {A ∈ V | AB = 0} where B is a given n × n real matrix.
(d) V = Mn×n(C) and W = {A ∈ V | A is invertible}.
(e) V = Mn×n(C) and W = {A ∈ V | det(A) = 0}.
(f) V = Rᴺ and W = {(a_n)_{n∈N} ∈ V | a_{n+2} = a_{n+1} + a_n for n = 1, 2, 3, ...}.
(g) V = Rᴺ and W = {(a_n)_{n∈N} ∈ V | a_{n+2} = a_{n+1} a_n for n = 1, 2, 3, ...}.
(h) V = Rᴺ and W is the set of all convergent sequences over R.
(i) V = Rᴺ and W is the set of all divergent sequences over R.
(j) V = P(R) and W = {a + bx + cx² | a, b, c ∈ Z}.
(k) V = P(R) and W = {(a + b) + (a − b)x | a, b ∈ R}.
(l) V = P(R) and W = {p(x) ∈ V | p(1) = 0}.
(m) V = P(R) and W = {p(x) ∈ V | p(1) = 0 and p(2) = 0}.
(n) V = P(R) and W = {p(x) ∈ V | p(1) = 0 or p(2) = 0}.
(o) V = P(R) and W = {p(x) ∈ V | p(2) = 1}.
(p) V = P(R) and W = {p(x) ∈ V | p(1) ≥ 0}.
(q) V = P(F) and W = {p(x) ∈ V | the degree of p(x) is equal to n} where F is a field
and n is a positive integer.
(r) V = C²([a, b]), where [a, b], with a < b, is a closed interval on the real line, and

    W = { f ∈ V | d²f(x)/dx² − 3 df(x)/dx + 2f(x) = 0 for x ∈ [a, b] }.

(s) V = C²([a, b]), where [a, b], with a < b, is a closed interval on the real line, and

    W = { f ∈ V | d²f(x)/dx² − 3 df(x)/dx + 2f(x) = x for x ∈ [a, b] }.

14. Prove Remark 8.3.5:
Let W be a nonempty subset of a vector space V over a field F. Prove that W is a
subspace of V if and only if for all a, b ∈ F and u, v ∈ W, au + bv ∈ W.

15. Let W1 and W2 be subspaces of a vector space V.
(a) Prove Theorem 8.3.12:
Show that W1 + W2 = {u + v | u ∈ W1 and v ∈ W2} is a subspace of V.
(b) Prove Remark 8.3.14:
Suppose U is a subspace of V that contains both W1 and W2, i.e. W1 ⊆ U and
W2 ⊆ U. Prove that W1 + W2 ⊆ U.
(c) Let W1 = {(a, a, 0) | a ∈ F} and W2 = {(0, a, a) | a ∈ F} where F is a field. Write
down the subspace W1 + W2 of V explicitly.

16. Let V be a vector space over a field F.
(a) Let W1 and W2 be proper subspaces of V. Prove that W1 ∪ W2 ≠ V.
(b) Let W1, W2 and W3 be proper subspaces of V.
(i) Prove that if F has at least three elements, W1 ∪ W2 ∪ W3 ≠ V.
(ii) Let F = F₂. Give an example of proper subspaces W1, W2, W3 of a vector space
V over F such that W1 ∪ W2 ∪ W3 = V.
(See Example 8.3.3.1 for the definition of proper subspaces.)

Question 8.17 to Question 8.20 are exercises for Section 8.4.

17. For each of the sets B1 to B6, determine whether the set (i) spans P2(R) and (ii) is
linearly independent.
B1 = {1 + x − x², 2 + 2x + x²}.
B2 = {1 + x − x², 2 − 2x + 2x²}.
B3 = {1 + x − x², 2 + 2x + x², 1 + 5x − 2x²}.
B4 = {1 + x − x², 2 + 2x + x², 4 + 3x²}.
B5 = {1 + x − x², 2 + 2x + x², 1 + 5x − 2x², 8x − 2x²}.
B6 = {1 + x − x², 2 + 2x + x², 4 + 3x², 2 + 6x − 3x²}.

18. Let V be a vector space over a field F.
(a) Let W1 = span{v1, v2, v3} and W2 = span{v1 + v2, v2 + v3, v1 + v3} where
v1, v2, v3 ∈ V.
(i) Prove that W1 = W2 if F = R.
(ii) Suppose F = F₂. Give one example where W1 = W2 and one example where
W1 ≠ W2.
(b) Let A = (aij) ∈ Mn×n(F). Take any v1, v2, ..., vn ∈ V. Define

    W1 = span{v1, v2, ..., vn}  and  W2 = span{w1, w2, ..., wn}

where wj = a1j v1 + a2j v2 + ··· + anj vn for j = 1, 2, ..., n. Prove that if A is
invertible, then W1 = W2.

19. Let f1, f2, ..., fn ∈ Cⁿ⁻¹([a, b]) where [a, b], with a < b, is a closed interval on the real
line. The Wronskian W(f1, f2, ..., fn) : [a, b] → R is the function defined by

                              | f1(x)               f2(x)               ···  fn(x)               |
                              | df1(x)/dx           df2(x)/dx           ···  dfn(x)/dx           |
    W(f1, f2, ..., fn)(x) =   |   ⋮                   ⋮                        ⋮                 |
                              | dⁿ⁻¹f1(x)/dxⁿ⁻¹     dⁿ⁻¹f2(x)/dxⁿ⁻¹     ···  dⁿ⁻¹fn(x)/dxⁿ⁻¹     |

for x ∈ [a, b].
(a) Let f1, f2 ∈ C¹([−1, 1]) such that f1(x) = eˣ and f2(x) = xeˣ for x ∈ [−1, 1].
Compute W(f1, f2).
(b) Let f1, f2, ..., fn ∈ Cⁿ⁻¹([a, b]). Prove that if W(f1, f2, ..., fn)(x0) ≠ 0 for some
x0 ∈ [a, b], then f1, f2, ..., fn are linearly independent.
(c) Let f1, f2, ..., fn ∈ Cⁿ⁻¹([a, b]). If W(f1, f2, ..., fn)(x) = 0 for all x ∈ [a, b], is it
true that f1, f2, ..., fn must be linearly dependent?

20. Let V = Fᴺ be the vector space of infinite sequences over the field F. For i = 1, 2, 3, ...,
define ei ∈ V to be the infinite sequence such that the ith term of the sequence is 1 and
all other terms are 0. Let B = {e1, e2, e3, ...}.
(a) Is B linearly independent?
(b) Is V = span(B)?

Question 8.21 to Question 8.31 are exercises for Section 8.5.

21. Recall that C² forms a complex vector space. If we restrict the scalars to real numbers,
then it is also a real vector space. (See Example 8.4.6.6 and Example 8.5.3.4.) Let
W = {(z, z̄) ∈ C² | z ∈ C}. Here z̄ is the complex conjugate of z (see Notation 12.1.2).
(a) Show that W is not a subspace of the complex vector space C².
(b) Prove that W is a subspace of the real vector space C² and find a basis for W.

22. For each of the following subsets B of V = Pn(R), determine whether B is a basis for V.
(a) B = {1, 1 + x, 1 + x + x², ..., 1 + x + x² + ··· + xⁿ}.
(b) B = {1 + x, x + x², x² + x³, ..., xⁿ⁻¹ + xⁿ, xⁿ + 1}.

23. For each of the following subspaces W of the vector space V, (i) find a basis for W; and
(ii) determine the dimension of W.
(a) V = F₂³ and W = {(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)}.
(b) V = C⁴ and W = {Au | u ∈ C³} where

    A = [ 1  1  i ]
        [ i  0  1 ]
        [ i  0  1 ]
        [ 1  0  i ]

(In here, vectors in C³ and C⁴ are written as column vectors.)
(c) V = M2×2(R) and W = span{ [1 1; 1 1], [1 0; 0 1], [3 1; 1 3], [0 2; 2 0] }.
(d) V = Mn×n(R) and W = {A ∈ V | A is diagonal}.
(e) V = Mn×n(R) and W = {A ∈ V | A is symmetric}.
(f) V = Mn×n(R) and W = {A ∈ V | A is skew symmetric}.
(g) V = Mn×n(R) and W = {A ∈ V | tr(A) = 0}.
(h) V = P4(C) and W = span{1 + x² + x⁴, i − x² − x³, 1 + ix − x³ − ix⁴, 2 + x − ix³ + x⁴,
(1 − i)x + 2x² + ix³}.
(i) V = Pn(C) and W = {p(x) ∈ V | p(z) = 0} where z ∈ C is a constant.
(j) V = Rᴺ and W = {(a_n)_{n∈N} ∈ V | a_{n+3} = 2a_n for n = 1, 2, 3, ...}.
(k) V = C^∞([0, 2π]) and W = span{f1, f2, f3, f4} where f1(x) = sin(x), f2(x) = cos(x),
f3(x) = sin(2x) and f4(x) = cos(2x) for x ∈ [0, 2π].

24. Prove Lemma 8.5.7:
Let V be a finite dimensional vector space over a field F, where dim(V) ≥ 1, and let B
be an ordered basis for V.
(a) For any u, v ∈ V, prove that u = v if and only if (u)B = (v)B.
(b) For any v1, v2, ..., vr ∈ V and c1, c2, ..., cr ∈ F, show that

    (c1v1 + c2v2 + ··· + crvr)B = c1(v1)B + c2(v2)B + ··· + cr(vr)B.

25. Let W be a subspace of V = M2×3(R) with an ordered basis

    B = { [1 1 0; 1 1 0], [1 0 0; 0 1 0], [3 1 1; 1 3 0], [0 2 0; 2 0 1] }.

For each of the following matrices A, (i) determine whether A ∈ W; (ii) if so, compute
the coordinate vector of A relative to B.
(a) A = [0 0 2; 0 0 3].   (b) A = [0 4 1; 4 0 2].   (c) A = [1 0 0; 0 1 0].
(d) A = [3 6 1; 6 3 3].   (e) A = [0 0 i; i 0 0].

26. Let W be a subspace of V = P4(R) with an ordered basis

    B = {1 − x, 2 + x − x⁴, 1 + x + x² + x³ + x⁴}.

For each of the following polynomials p(x), (i) determine whether p(x) ∈ W; (ii) if so,
compute the coordinate vector of p(x) relative to B.
(a) p(x) = 5x + x² + x³.   (b) p(x) = 1 + x − x² − x³ + x⁴.
(c) p(x) = −2 + 2x² + 2x³ + 4x⁴.

27. For each of the sets B in Question 8.25 and Question 8.26, extend it to a basis for V.

28. Let V be a vector space over a field F and let B be a basis for V. (In here, V can be
infinite dimensional.) Prove that every nonzero vector u ∈ V can be expressed uniquely
as a linear combination

    u = c1v1 + c2v2 + ··· + cmvm

for some m ∈ N, c1, c2, ..., cm ∈ F and v1, v2, ..., vm ∈ B such that ci ≠ 0 for all i and
vi ≠ vj whenever i ≠ j.

29. Let F be a field, A ∈ Mn×n(F) and W = {B ∈ Mn×n(F) | AB = BA}. Suppose there
exists a column vector v ∈ Fⁿ such that {v, Av, A²v, ..., Aⁿ⁻¹v} is a basis for Fⁿ.
(a) Prove that W is a subspace of Mn×n(F).
(b) Prove that I, A, A², ..., Aⁿ⁻¹ are linearly independent vectors contained in W.
(c) Prove that {I, A, A², ..., Aⁿ⁻¹} is a basis for W.

30. Let V be a finite dimensional vector space over C such that dim_C(V) = n. By restricting
the scalars to real numbers in the scalar multiplication, V can be regarded as a vector
space over R. (See Example 8.5.3.4.) Prove that dim_R(V) = 2n.

31. Let W be a vector space over R. Define W′ = {(u, v) | u, v ∈ W} with the addition and
scalar multiplication

    (u, v) + (u′, v′) = (u + u′, v + v′)  and  c(u, v) = (au − bv, bu + av)

where (u, v), (u′, v′) ∈ W′ and c = a + b i ∈ C with a, b ∈ R.
(a) Prove that W′ is a vector space over C.
(b) Suppose W is finite dimensional such that dim_R(W) = n. Find dim_C(W′).

Question 8.32 to Question 8.38 are exercises for Section 8.6.

32. For each of the following subspaces W1 and W2 of the vector space V,
(i) find the dimensions of W1, W2, W1 ∩ W2 and W1 + W2;
(ii) determine if W1 + W2 is a direct sum; and
(iii) determine if V = W1 + W2.
(a) V = F⁴, W1 = {(a, a, a, a) | a ∈ F} and W2 = {(a, b, c, d) ∈ F⁴ | a + d = 0} where F
is a field.
(b) V = P3(R), W1 = {a + bx + bx² + ax³ | a, b ∈ R} and W2 = {p(x) ∈ V | p(1) = 0}.
(c) V = P3(R), W1 = span{1 + x², 1 + 2x², 1 + 3x²} and W2 = span{1 + x, 1 + 2x, 1 + 3x}.
(d) V = M2×2(C), W1 = { X ∈ V | [0 1; i 0] X = X [0 1; i 0] } and W2 = span{ [1 0; 0 i] }.
(e) V has a basis {v1, v2, v3, v4}, W1 = span{v1 + v2, v2 + v3, v3 + v4, v1 + v4} and
W2 = span{v2}.

33. For each of the following subspaces W1 and W2 of V = Mn×n(R), determine whether
(i) V = W1 + W2; and (ii) V = W1 ⊕ W2.
(a) W1 = {A ∈ V | A is upper triangular} and W2 = {A ∈ V | A is lower triangular}.
(b) W1 = {A ∈ V | A is upper triangular} and W2 = {A ∈ V | A is skew symmetric}.

34. Verify Example 8.6.6.3:
Let W1 be the subspace of C([0, 2π]) spanned by g ∈ C([0, 2π]) where g(x) = sin(x) for
x ∈ [0, 2π]. Define

    W2 = { f ∈ C([0, 2π]) | ∫₀^{2π} f(t) sin(t) dt = 0 }.

(a) Show that W2 is a subspace of C([0, 2π]).
(b) Prove that C([0, 2π]) = W1 ⊕ W2.

35. Let W1 and W2 be finite dimensional subspaces of a vector space. Prove that

    dim(W1 + W2) = dim(W1) + dim(W2) − dim(W1 ∩ W2).

(Hint: Start with a basis B for W1 ∩ W2 and use Theorem 8.5.17 to extend B to bases B1
and B2 for W1 and W2, respectively. Then show that B1 ∪ B2 is a basis for W1 + W2.)

36. Let W1, W2 and W3 be subspaces of a vector space.
(a) Suppose W1 ⊕ W2 = W1 ⊕ W3. Is it true that W2 = W3?
(b) Suppose W1, W2 and W3 are finite dimensional.
(i) Prove that if dim(W1 + W2 + W3) = dim(W1) + dim(W2) + dim(W3), then
W1 ∩ W2 = W1 ∩ W3 = W2 ∩ W3 = {0}.
(ii) Give an example such that W1 ∩ W2 = W1 ∩ W3 = W2 ∩ W3 = {0} but
dim(W1 + W2 + W3) ≠ dim(W1) + dim(W2) + dim(W3).

37. Let U and V be vector spaces over a field F and let U × V = {(u, v) | u ∈ U and v ∈ V}
be the vector space over F as defined in Question 8.11. Define U′ = {(u, 0_V) | u ∈ U} and
V′ = {(0_U, v) | v ∈ V} where 0_U and 0_V are the zero vectors of U and V respectively.
(a) Show that U′ and V′ are subspaces of U × V.
(b) Prove that U × V = U′ ⊕ V′.
(c) If U and V are finite dimensional, find dim(U′), dim(V′) and dim(U × V) in terms
of dim(U) and dim(V).
(The vector space U × V is called the external direct sum of U and V.)

38. Let U, V and W be subspaces of a vector space.
(a) Prove that (U ∩ V) + (U ∩ W) ⊆ U ∩ (V + W) and (U + V) ∩ (U + W) ⊇ U + (V ∩ W).
(b) Is it true that (U ∩ V) + (U ∩ W) = U ∩ (V + W)? Is it true that (U + V) ∩ (U + W) =
U + (V ∩ W)?
(c) Prove that (U ∩ V) + (U ∩ W) = U ∩ (V + (U ∩ W)) and (U + V) ∩ (U + W) =
U + (V ∩ (U + W)).

Question 8.39 to Question 8.46 are exercises for Section 8.7.

39. Prove Lemma 8.7.5:
Let V be a vector space over a field F and let W be a subspace of V.
(a) Suppose u1, u2, v1, v2 ∈ V such that W + u1 = W + u2 and W + v1 = W + v2.
Prove that W + (u1 + v1) = W + (u2 + v2).
(b) Suppose u1, u2 ∈ V such that W + u1 = W + u2. Prove that W + cu1 = W + cu2
for all c ∈ F.

40. Let V = F₂⁴ and W = span{(1, 1, 0, 1), (1, 0, 1, 1)}.
(a) What is the dimension of W? What is the dimension of V/W?
(b) How many distinct cosets of W are there?

41. Let [a, b], with a < b, be a closed interval on the real line and let

    W = { f ∈ C²([a, b]) | d²f(x)/dx² − 3 df(x)/dx + 2f(x) = 0 for x ∈ [a, b] }

which is a subspace of C²([a, b]). Show that each coset of W is the solution set of a
differential equation

    d²f(x)/dx² − 3 df(x)/dx + 2f(x) = g(x)  for x ∈ [a, b]

for some g ∈ C([a, b]).

42. Let W = {(a_n)_{n∈N} ∈ Rᴺ | a_{n+3} − 2a_n = 0 for n = 1, 2, 3, ...}, which is a subspace
of Rᴺ. Give an interpretation of the cosets of W in Rᴺ similar to that of Question 8.41.

43. For each of parts (a)-(e) of Question 8.32, write down a basis for V/W1 and a basis for
V/W2.

44. Complete the proof of Theorem 8.7.11:
Let V be a finite dimensional vector space and W a subspace of V. Let {w1, w2, ..., wm}
be a basis for W.
(a) For v1, v2, ..., vk ∈ V, prove that if {W + v1, W + v2, ..., W + vk} is a basis for
V/W, then {v1, v2, ..., vk, w1, w2, ..., wm} is a basis for V.
(b) Prove that dim(V/W) = dim(V) − dim(W).

45. Let U and W be subspaces of a vector space V such that V = U ⊕ W.
(a) Suppose U is finite dimensional and {v1, v2, ..., vk} is a basis for U. Prove that
{W + v1, W + v2, ..., W + vk} is a basis for V/W.
(b) If U is infinite dimensional and B is a basis for U, is {W + v | v ∈ B} a basis for
V/W?

46. Let W be a subspace of a vector space V and U a subspace of V/W. Define

    Ū = {u ∈ V | W + u ∈ U}.

(a) Show that Ū is a subspace of V.
(b) Suppose W and U are finite dimensional, say, dim(W) = k and dim(U) = m. Find
dim(Ū).
Chapter 9

General Linear Transformations

Section 9.1 Linear Transformations

Discussion 9.1.1 In Chapter 7, a linear transformation from Rⁿ to Rᵐ is defined to be a
function T : Rⁿ → Rᵐ such that T(u) = Au for u ∈ Rⁿ where A is an m × n real matrix and
vectors in Rⁿ are written as column vectors. In this chapter, we shall generalize the concept of
linear transformations to abstract vector spaces. As a consequence, we can regard this abstract
version of linear transformations as a generalized form of matrices. See Proposition 9.4.3.

Definition 9.1.2 Let V and W be two vector spaces over a field F. A linear transformation
T : V → W is a mapping from V to W that satisfies the following two axioms:
(T1) For all u, v ∈ V, T(u + v) = T(u) + T(v).
(T2) For all c ∈ F and u ∈ V, T(cu) = cT(u).
If W = V, the linear transformation T : V → V is called a linear operator on V.
If W = F, the linear transformation T : V → F is called a linear functional on V.
[Figure: a linear transformation T : V → W sends u to T(u), cu to cT(u), v to T(v),
u + v to T(u) + T(v), and 0 to 0.]

Remark 9.1.3 (T1) and (T2) can be combined together: Let V and W be two vector spaces
over a field F. A mapping T : V → W is a linear transformation if and only if

    T(au + bv) = aT(u) + bT(v)  for all a, b ∈ F and u, v ∈ V.

Example 9.1.4
1. Let A be an m × n matrix over a field F. Define a mapping LA : Fⁿ → Fᵐ by

    LA(u) = Au  for u ∈ Fⁿ

where vectors in Fⁿ are written as column vectors.
(T1) For any u, v ∈ Fⁿ, LA(u + v) = A(u + v) = Au + Av = LA(u) + LA(v).
(T2) For any c ∈ F and u ∈ Fⁿ, LA(cu) = A(cu) = cAu = cLA(u).
So LA is a linear transformation.
(From this example, we see that Definition 9.1.2 can be regarded as a generalization of
Definition 7.1.1.)

2. The identity mapping on a vector space V is defined to be the mapping IV : V → V such
that
    IV(u) = u  for u ∈ V.
It is a linear operator on V and is also called the identity operator on V.

3. The zero mapping OV,W : V → W, where V and W are vector spaces over the same field,
is defined by
    OV,W(u) = 0  for u ∈ V.
It is a linear transformation and is also called the zero transformation from V to W.
If W = V, we use OV to denote OV,V and call it the zero operator on V.

4. (a) Let S : P(R) → P(R) be the mapping defined by

    S(p(x)) = p(x)²  for p(x) ∈ P(R).

Is S a linear operator?
(b) Let T : P(R) → P(R) be the mapping defined by

    T(p(x)) = xp(x)  for p(x) ∈ P(R).

Is T a linear operator?
Solution
(a) S is not a linear operator. For example,

    S(1 + x) = (1 + x)² = 1 + 2x + x² ≠ 1 + x² = S(1) + S(x).

(b) T is a linear operator:
(T1) For any p(x), q(x) ∈ P(R),

    T(p(x) + q(x)) = x(p(x) + q(x)) = xp(x) + xq(x) = T(p(x)) + T(q(x)).

(T2) For any c ∈ R and p(x) ∈ P(R),

    T(cp(x)) = x(cp(x)) = cxp(x) = cT(p(x)).

5. Let V be the set of all convergent sequences over R. We know that V is a subspace of Rᴺ
(see Question 8.13). Define a mapping T : V → R by

    T((a_n)_{n∈N}) = lim_{n→∞} a_n  for (a_n)_{n∈N} ∈ V.

Since for any convergent sequences (a_n)_{n∈N} and (b_n)_{n∈N} and any c ∈ R,

    lim_{n→∞}(a_n + b_n) = lim_{n→∞} a_n + lim_{n→∞} b_n  and  lim_{n→∞} ca_n = c lim_{n→∞} a_n,

T is a linear functional.
6. Let [a, b], with a < b, be a closed interval of the real line. We use the real vector space
C^∞([a, b]) defined in Example 8.3.6.5. Let D : C^∞([a, b]) → C^∞([a, b]) be the differential
operator such that for every f ∈ C^∞([a, b]), D(f) is the function in C^∞([a, b]) defined by

    D(f)(x) = df(x)/dx  for x ∈ [a, b]

and let F : C^∞([a, b]) → C^∞([a, b]) be the integral operator such that for every
f ∈ C^∞([a, b]), F(f) is the function in C^∞([a, b]) defined by

    F(f)(x) = ∫ₐˣ f(t) dt  for x ∈ [a, b].

Both D and F are linear operators.

Proposition 9.1.5 Let V and W be vector spaces over the same field. If T : V → W is a
linear transformation, then T(0) = 0.
Proof Since 0 + 0 = 0,

      T(0 + 0) = T(0)
    ⇒ T(0) + T(0) = T(0)
    ⇒ T(0) + T(0) − T(0) = T(0) − T(0)
    ⇒ T(0) + 0 = 0
    ⇒ T(0) = 0.

Remark 9.1.6 Let V and W be vector spaces over the same field. Suppose V has a basis B.
Let T : V → W be a linear transformation. For every u ∈ V,

    u = a1v1 + a2v2 + ··· + amvm

for some scalars a1, a2, ..., am and some v1, v2, ..., vm ∈ B. By using (T1) and (T2) repeat-
edly, we get

    T(u) = a1T(v1) + a2T(v2) + ··· + amT(vm).

It follows that T is completely determined by the images of vectors from B.
On the other hand, to define a linear transformation S from V to W, we first set the image S(v)
for each v ∈ B. For any u ∈ V, since u = a1v1 + a2v2 + ··· + amvm for some v1, v2, ..., vm ∈ B
and scalars a1, a2, ..., am, define S(u) = a1S(v1) + a2S(v2) + ··· + amS(vm). Then we have a
linear transformation.

Example 9.1.7 Take the standard basis {1, x, x²} for P2(R). Define a linear transformation
S : P2(R) → R³ by

    S(1) = (1, 2, 1),  S(x) = (0, 1, 1)  and  S(x²) = (−1, 1, 0).

Then for any p(x) = a + bx + cx² ∈ P2(R),

    S(p(x)) = aS(1) + bS(x) + cS(x²)
            = a(1, 2, 1) + b(0, 1, 1) + c(−1, 1, 0) = (a − c, 2a + b + c, a + b).
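In coordinates, this recipe amounts to multiplying by the matrix whose columns are the
images of the basis vectors (this anticipates Section 9.2). A small Python sketch (using
numpy; the sample polynomial is our own choice):

    import numpy as np

    # Columns are S(1), S(x), S(x^2) from Example 9.1.7.
    images = np.array([[1, 0, -1],
                       [2, 1, 1],
                       [1, 1, 0]], dtype=float)

    def S(a, b, c):
        # p(x) = a + bx + cx^2 has coordinate vector (a, b, c).
        return images @ np.array([a, b, c], dtype=float)

    print(S(1, 2, 3))   # (a - c, 2a + b + c, a + b) = (-2, 7, 3)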

Section 9.2 Matrices for Linear Transformations

Theorem 9.2.1 Let T : V → W be a linear transformation where V and W are finite dimen-
sional vector spaces over a field F such that n = dim(V) ≥ 1 and m = dim(W) ≥ 1. For any
ordered bases B and C for V and W respectively, there exists an m × n matrix A such that

    [T(u)]C = A [u]B  for all u ∈ V.

(Together with Example 9.1.4.1, the linear transformation defined in Definition 9.1.2 is the
same as that in Definition 7.1.1 when V = Fⁿ, W = Fᵐ and the standard bases B and C,
respectively, are used.)
Proof Let B = {v1, v2, ..., vn}. Then every u ∈ V can be expressed uniquely as

    u = a1v1 + a2v2 + ··· + anvn
for some scalars a1, a2, ..., an. Using the notation of coordinate vectors (see Definition 8.5.6),
we have [u]B = (a1, a2, ..., an)ᵀ. Then by Remark 9.1.6 and Lemma 8.5.7,

    [T(u)]C = a1[T(v1)]C + a2[T(v2)]C + ··· + an[T(vn)]C
            = ( [T(v1)]C  [T(v2)]C  ···  [T(vn)]C ) [u]B.

Let A = ( [T(v1)]C  [T(v2)]C  ···  [T(vn)]C ). Then A is an m × n matrix such that
[T(u)]C = A [u]B for all u ∈ V.

Definition 9.2.2 The matrix A = ( [T(v1)]C  [T(v2)]C  ···  [T(vn)]C ) in the proof of The-
orem 9.2.1 is called the matrix for T relative to the ordered bases B and C. This matrix A is
usually denoted by [T]C,B.
If W = V and C = B, we simply denote [T]B,B by [T]B and the matrix is called the matrix for
T relative to the ordered basis B.

Lemma 9.2.3 Let T1, T2 : V → W be linear transformations where V and W are finite
dimensional vector spaces with dim(V) ≥ 1 and dim(W) ≥ 1. Take any ordered bases B and
C for V and W respectively. Then T1 = T2 if and only if [T1]C,B = [T2]C,B.
Proof By Remark 9.1.6, every linear transformation T : V → W is completely determined
by the images of vectors from B. Thus the matrix in Definition 9.2.2 uniquely determines a
linear transformation.

Example 9.2.4
1. In Example 9.1.7, let B = {1, x, x²} and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. The matrix for
S relative to B and C is

                                                     [ 1  0 -1 ]
    [S]C,B = ( [S(1)]C  [S(x)]C  [S(x²)]C )   =      [ 2  1  1 ]
                                                     [ 1  1  0 ]

We can also recover the formula of S from the matrix [S]C,B:
For p(x) = a + bx + cx² ∈ P2(R), [p(x)]B = (a, b, c)ᵀ. Then

                                        [ 1  0 -1 ] [ a ]   [ a - c      ]
    [S(p(x))]C = [S]C,B [p(x)]B   =     [ 2  1  1 ] [ b ] = [ 2a + b + c ]
                                        [ 1  1  0 ] [ c ]   [ a + b      ]

So

    S(p(x)) = (a − c)(1, 0, 0) + (2a + b + c)(0, 1, 0) + (a + b)(0, 0, 1)
            = (a − c, 2a + b + c, a + b).
2. Let V be a vector space of dimension 3 over C. Define a linear transformation T :
V → C³ such that T(v1) = (1, −1, −1), T(v2) = (i, 2, i) and T(v3) = (0, i, 0) where
B = {v1, v2, v3} is an ordered basis for V. Let C = {(1, 1, 1), (1, 0, −1), (1, 1, 0)}. Find
the matrix for T relative to B and C.

Solution Since [T]C,B = ( [T(v1)]C  [T(v2)]C  [T(v3)]C ), we need to find the coordi-
nate vectors [(1, −1, −1)]C, [(i, 2, i)]C and [(0, i, 0)]C, i.e. to find aj, bj, cj, j = 1, 2, 3, such
that

    a1(1, 1, 1) + b1(1, 0, −1) + c1(1, 1, 0) = (1, −1, −1) ⇔ a1 + b1 + c1 = 1,  a1 + c1 = −1,  a1 − b1 = −1;
    a2(1, 1, 1) + b2(1, 0, −1) + c2(1, 1, 0) = (i, 2, i)    ⇔ a2 + b2 + c2 = i,  a2 + c2 = 2,   a2 − b2 = i;
    a3(1, 1, 1) + b3(1, 0, −1) + c3(1, 1, 0) = (0, i, 0)    ⇔ a3 + b3 + c3 = 0,  a3 + c3 = i,   a3 − b3 = 0.

We solve the three linear systems together (see Example 3.7.4.1):

    [ 1  1  1 |  1  i  0 ]   Gauss-Jordan    [ 1  0  0 |  1  -2+2i  -i ]
    [ 1  0  1 | -1  2  i ]  ------------->   [ 0  1  0 |  2  -2+i   -i ]
    [ 1 -1  0 | -1  i  0 ]   Elimination     [ 0  0  1 | -2  4-2i   2i ]

Then [(1, −1, −1)]C = (1, 2, −2)ᵀ, [(i, 2, i)]C = (−2 + 2i, −2 + i, 4 − 2i)ᵀ and
[(0, i, 0)]C = (−i, −i, 2i)ᵀ. So

                                                    [  1  -2+2i  -i ]
    [T]C,B = ( [T(v1)]C  [T(v2)]C  [T(v3)]C )   =   [  2  -2+i   -i ]
                                                    [ -2  4-2i   2i ]
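The three systems share the same coefficient matrix, so they can be solved in one call. A
Python check (using numpy; the columns of `targets` below are T(v1), T(v2), T(v3)):

    import numpy as np

    M = np.array([[1, 1, 1],
                  [1, 0, 1],
                  [1, -1, 0]], dtype=complex)     # coefficient matrix of the systems

    targets = np.array([[1, 1j, 0],
                        [-1, 2, 1j],
                        [-1, 1j, 0]], dtype=complex)

    T_CB = np.linalg.solve(M, targets)   # columns are the coordinate vectors
    print(np.round(T_CB, 10))
    # approximately [[1, -2+2i, -i], [2, -2+i, -i], [-2, 4-2i, 2i]]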
3. Let A = ( c1  c2  ···  cn ) be an n × n matrix over a field F where ci is the ith column of
A. Let LA : Fⁿ → Fⁿ be the linear operator defined in Example 9.1.4.1, i.e. LA(u) = Au
for u ∈ Fⁿ where vectors in Fⁿ are written as column vectors.
Take the standard basis E = {e1, e2, ..., en} for Fⁿ. For all u = (u1, u2, ..., un)ᵀ ∈ Fⁿ,

    u = u1e1 + u2e2 + ··· + unen  ⇒  [u]E = (u1, u2, ..., un)ᵀ = u.

Thus

    [LA]E = [LA]E,E
          = ( [LA(e1)]E  [LA(e2)]E  ···  [LA(en)]E )
          = ( [Ae1]E  [Ae2]E  ···  [Aen]E )
          = ( [c1]E  [c2]E  ···  [cn]E )
          = ( c1  c2  ···  cn )
          = A.

(See also Discussion 7.1.8.)

Discussion 9.2.5 Use the identity operator IV : V → V in Example 9.1.4.2 where V is a
finite dimensional vector space with dim(V) ≥ 1. Suppose B and C are two ordered bases for
V. Then from Theorem 9.2.1, we have

    [u]C = [IV(u)]C = [IV]C,B [u]B  for all u ∈ V.

So for any u ∈ V, the matrix [IV]C,B is doing the job of converting the coordinate vector of u
relative to B to the coordinate vector of u relative to C. Thus [IV]C,B is called the transition
matrix from B to C.
Suppose B = {v1, v2, ..., vn}. Then

    [IV]C,B = ( [IV(v1)]C  [IV(v2)]C  ···  [IV(vn)]C )
            = ( [v1]C  [v2]C  ···  [vn]C )

and it resembles the transition matrix defined in Definition 3.7.3.

Theorem 9.2.6 The matrix [IV]C,B in Discussion 9.2.5 is invertible and its inverse is the
transition matrix from C to B, i.e. ([IV]C,B)⁻¹ = [IV]B,C. (See also Theorem 3.7.5.)
Proof It is easier to prove the theorem using the concept of compositions of linear transfor-
mations. So we leave it as an exercise for the next section. See Question 9.16.

Example 9.2.7 Let E = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} and B = {(1, i, 1), (0, 1, 1), (i, 1, 1)}.
They are bases for C³. Find the transition matrix from E to B.

Solution Since

                                                        [ 1  0  i ]
    [IC³]E,B = ( [(1, i, 1)]E  [(0, 1, 1)]E  [(i, 1, 1)]E )  =  [ i  1  1 ]
                                                        [ 1  1  1 ]

the transition matrix from E to B is

                                  [ 1  0  i ]⁻¹    [  0   -(1+i)/2   (1+i)/2 ]
    [IC³]B,E = ( [IC³]E,B )⁻¹  =  [ i  1  1 ]   =  [  i       i        1-i   ]
                                  [ 1  1  1 ]      [ -i   (1-i)/2   (-1+i)/2 ]
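A quick numerical confirmation of this inverse (using numpy):

    import numpy as np

    E_from_B = np.array([[1, 0, 1j],
                         [1j, 1, 1],
                         [1, 1, 1]], dtype=complex)   # columns are the vectors of B

    B_from_E = np.linalg.inv(E_from_B)
    assert np.allclose(B_from_E @ E_from_B, np.eye(3))
    print(np.round(B_from_E, 10))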

Section 9.3 Compositions of Linear Transformations

Theorem 9.3.1 Let S : U → V and T : V → W be linear transformations. Then the
composition mapping T ∘ S : U → W, defined by

    (T ∘ S)(u) = T(S(u))  for u ∈ U,

is also a linear transformation.

Proof We use the result of Remark 9.1.3 to show that T ∘ S is a linear transformation:
Take any scalars a, b and vectors u, v ∈ U.

    (T ∘ S)(au + bv) = T(S(au + bv))
                     = T(aS(u) + bS(v))        because S is a linear transformation
                     = aT(S(u)) + bT(S(v))     because T is a linear transformation
                     = a(T ∘ S)(u) + b(T ∘ S)(v).

So T ∘ S is a linear transformation.

Example 9.3.2
1. Let S : C³ → M2×2(C) be the linear transformation defined by

    S((a, b, c)) = [ a + ic      0    ]   for (a, b, c) ∈ C³
                   [   2b     a − ic  ]

and let T : M2×2(C) → P2(C) be the linear transformation defined by

    T([a b; c d]) = a + (ib + c)x − dx²   for [a b; c d] ∈ M2×2(C).

Then for (a, b, c) ∈ C³,

    (T ∘ S)((a, b, c)) = T(S((a, b, c)))
                       = T([a + ic  0; 2b  a − ic])
                       = (a + ic) + 2bx − (a − ic)x²

is a linear transformation from C³ to P2(C).
2. Let [a, b], with a < b, be a closed interval on the real line and let V = C^∞([a, b]). Consider
the differential and integral operators D and F on V defined in Example 9.1.4.6. For every
f ∈ V,

    (D ∘ F)(f)(x) = D(F(f))(x) = d/dx ∫ₐˣ f(t) dt = f(x)  for x ∈ [a, b]
    ⇒ (D ∘ F)(f) = f

and

    (F ∘ D)(f)(x) = F(D(f))(x) = ∫ₐˣ (df(t)/dt) dt = f(x) − f(a)  for x ∈ [a, b]
    ⇒ (F ∘ D)(f) = f if f(a) = 0, and (F ∘ D)(f) ≠ f if f(a) ≠ 0.

Thus D ∘ F = IV but F ∘ D ≠ IV.
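The same contrast can be seen symbolically; a sympy sketch (the test function and the
symbols a, x, t are our own choices):

    import sympy as sp

    x, t, a = sp.symbols('x t a')
    f = sp.sin(t) + t**2          # any smooth test function

    DF = sp.diff(sp.integrate(f, (t, a, x)), x)   # (D ∘ F)(f)(x)
    FD = sp.integrate(sp.diff(f, t), (t, a, x))   # (F ∘ D)(f)(x)

    assert sp.simplify(DF - f.subs(t, x)) == 0                     # D ∘ F = identity
    assert sp.simplify(FD - (f.subs(t, x) - f.subs(t, a))) == 0    # loses f(a)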

Theorem 9.3.3 Let S : U → V and T : V → W be linear transformations. Suppose U, V
and W are finite dimensional with dim(U) ≥ 1, dim(V) ≥ 1 and dim(W) ≥ 1. Let A, B and
C be ordered bases for U, V and W respectively. Then

    [T ∘ S]C,A = [T]C,B [S]B,A.

(See also Theorem 7.1.11.)
Proof Let A = {v1, v2, ..., vn}, where n = dim(U), and let {e1, e2, ..., en} be the standard
basis for Fⁿ, where F is the field of scalars. In here, each ei is written as a column vector. Note
that for i = 1, 2, ..., n,

    vi = 0v1 + ··· + 0vi−1 + 1vi + 0vi+1 + ··· + 0vn  ⇒  [vi]A = ei.

Hence

    the ith column of the matrix [T ∘ S]C,A = [(T ∘ S)(vi)]C
                                            = [T(S(vi))]C
                                            = [T]C,B [S(vi)]B
                                            = [T]C,B [S]B,A [vi]A
                                            = [T]C,B [S]B,A ei
                                            = the ith column of the matrix [T]C,B [S]B,A.

So [T ∘ S]C,A = [T]C,B [S]B,A.

Example 9.3.4 Let S : C³ → M2×2(C) and T : M2×2(C) → P2(C) be the linear transforma-
tions defined in Example 9.3.2.1. Take the standard bases

    A = {(1, 0, 0), (0, 1, 0), (0, 0, 1)},
    B = { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] }  and  C = {1, x, x²}

for C³, M2×2(C) and P2(C) respectively. Then

    [S]B,A = ( [S((1, 0, 0))]B  [S((0, 1, 0))]B  [S((0, 0, 1))]B )
           = ( [[1 0; 0 1]]B  [[0 0; 2 0]]B  [[i 0; 0 -i]]B )

             [ 1  0   i ]
           = [ 0  0   0 ]
             [ 0  2   0 ]
             [ 1  0  -i ]

    [T]C,B = ( [T([1 0; 0 0])]C  [T([0 1; 0 0])]C  [T([0 0; 1 0])]C  [T([0 0; 0 1])]C )
           = ( [1]C  [ix]C  [x]C  [−x²]C )

             [ 1  0  0   0 ]
           = [ 0  i  1   0 ]
             [ 0  0  0  -1 ]

and

    [T ∘ S]C,A = ( [(T ∘ S)((1, 0, 0))]C  [(T ∘ S)((0, 1, 0))]C  [(T ∘ S)((0, 0, 1))]C )
               = ( [1 − x²]C  [2x]C  [i + ix²]C )

                 [  1  0  i ]
               = [  0  2  0 ]
                 [ -1  0  i ]

Note that [T ∘ S]C,A = [T]C,B [S]B,A.
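Checking the product numerically (using numpy, with the three matrices above):

    import numpy as np

    S_BA = np.array([[1, 0, 1j],
                     [0, 0, 0],
                     [0, 2, 0],
                     [1, 0, -1j]], dtype=complex)

    T_CB = np.array([[1, 0, 0, 0],
                     [0, 1j, 1, 0],
                     [0, 0, 0, -1]], dtype=complex)

    TS_CA = np.array([[1, 0, 1j],
                      [0, 2, 0],
                      [-1, 0, 1j]], dtype=complex)

    assert np.allclose(T_CB @ S_BA, TS_CA)   # [T ∘ S] = [T][S]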

Definition 9.3.5 Let T : V → V be a linear operator. For any nonnegative integer m, we
define Tᵐ as follows:

    Tᵐ = IV                           if m = 0,
    Tᵐ = T ∘ T ∘ ··· ∘ T  (m times)   if m ≥ 1.

Corollary 9.3.6 Let T : V → V be a linear operator where V is finite dimensional with
dim(V) ≥ 1. Let B be an ordered basis for V. Then [Tᵐ]B = ([T]B)ᵐ.
Proof Use Theorem 9.3.3 repeatedly.

Example 9.3.7
1. Let A be an n × n matrix over a field F and let LA : Fⁿ → Fⁿ be the linear operator
defined in Example 9.1.4.1. Then LAᵐ(u) = Aᵐu for u ∈ Fⁿ.

2. Let [a, b], with a < b, be a closed interval on the real line and let D : C^∞([a, b]) →
C^∞([a, b]) be the differential operator defined in Example 9.1.4.6. Then for every
f ∈ C^∞([a, b]), Dᵐ(f) is the function in C^∞([a, b]) such that

    Dᵐ(f)(x) = dᵐf(x)/dxᵐ  for x ∈ [a, b].

Lemma 9.3.8 Let T : V → V be a linear operator where V is a finite dimensional vector
space with dim(V) ≥ 1. Let B and C be two ordered bases for V and P the transition matrix
from B to C, i.e. P = [IV]C,B. Then

    [T]B = P⁻¹ [T]C P.

Proof Since T = IV ∘ T ∘ IV,

    [T]B = [T]B,B
         = [IV ∘ T ∘ IV]B,B
         = [IV ∘ T]B,C [IV]C,B
         = [IV]B,C [T]C,C [IV]C,B
         = ( [IV]C,B )⁻¹ [T]C [IV]C,B.

So we have [T]B = P⁻¹ [T]C P.

Definition 9.3.9 Let F be a field and A, B ∈ Mn×n(F). Then B is said to be similar to A
if there exists an invertible matrix P ∈ Mn×n(F) such that B = P⁻¹AP.

Theorem 9.3.10 Let T be a linear operator on a finite dimensional vector space V over a
field F, with dim(V) = n ≥ 1, and let C be an ordered basis for V. Then an n × n matrix D
over F is similar to [T]C if and only if there exists an ordered basis B for V such that D = [T]B.
Proof
(⇐) The result follows from Lemma 9.3.8.
(⇒) Let C = {u1, u2, ..., un}. Suppose D = P⁻¹[T]C P where P = (pij)n×n is an invertible
matrix. Define B = {v1, v2, ..., vn} where

    vj = p1j u1 + p2j u2 + ··· + pnj un

for j = 1, 2, ..., n. Since P is invertible, by Question 8.18, span(B) = span(C) = V
and hence by Theorem 8.5.13, B is a basis for V. Using B as an ordered basis for V,
[IV]C,B = ( [v1]C  [v2]C  ···  [vn]C ) = P and hence

    [T]B = [IV]B,C [T]C [IV]C,B = P⁻¹ [T]C P = D.

Example 9.3.11 Consider the real matrix

        [ 0  1   0 ]
    A = [ 0  0   2 ]
        [ 1  1  -1 ]

Let B = {v1, v2, v3} where

    v1 = (2, −2, 1)ᵀ,  v2 = (1, √2, 1)ᵀ  and  v3 = (1, −√2, 1)ᵀ.

Then LA(v1) = −v1, LA(v2) = √2 v2 and LA(v3) = −√2 v3. (See Example 6.1.12.3.) Hence

    [LA]B = ( [LA(v1)]B  [LA(v2)]B  [LA(v3)]B )
                                                   [ -1   0    0 ]
          = ( [−v1]B  [√2 v2]B  [−√2 v3]B )   =    [  0  √2    0 ]
                                                   [  0   0  -√2 ]

Using the standard basis E = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} for R³, [LA]E = A and

                                             [  2   1    1 ]
    [IV]E,B = ( [v1]E  [v2]E  [v3]E )   =    [ -2  √2  -√2 ]
                                             [  1   1    1 ]

With P = [IV]E,B, we have [LA]B = [IV]B,E [LA]E [IV]E,B = P⁻¹AP. (See also Example
6.2.6.2.)
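The diagonalization can be confirmed numerically (using numpy):

    import numpy as np

    r2 = np.sqrt(2)
    A = np.array([[0, 1, 0],
                  [0, 0, 2],
                  [1, 1, -1]], dtype=float)
    P = np.array([[2, 1, 1],
                  [-2, r2, -r2],
                  [1, 1, 1]], dtype=float)   # columns are v1, v2, v3

    D = np.linalg.inv(P) @ A @ P
    print(np.round(D, 10))   # diag(-1, sqrt(2), -sqrt(2))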

Remark 9.3.12 Using linear operators, we have a new interpretation of the problem of diag-
onalization discussed in Section 6.2. Given a square matrix A 2 Mnn (F), we use the linear
operator LA on Fn de ned in Example 9.1.4.1. By Example 9.2.4.3, [LA ]E = A where E is the
standard basis for Fn . Then to diagonalize A is the same as to nd an ordered basis B for Fn
such that the matrix for LA relative to B is a diagonal matrix. (See also Chapter 11.)

Section 9.4 The Vector Space L(V, W)

Definition 9.4.1 Let V and W be vector spaces over the same field F.
1. Let T1, T2 : V → W be linear transformations. We define a mapping T1 + T2 : V → W by

    (T1 + T2)(u) = T1(u) + T2(u)  for u ∈ V.

2. Let T : V → W be a linear transformation and c ∈ F. We define a mapping cT : V → W
by

    (cT)(u) = cT(u)  for u ∈ V.

The mappings T1 + T2 and cT are linear transformations.

Example 9.4.2 Let T1, T2 : R² → R² be given by

    T1((x, y)) = (x + 2y, 5x − 6y)  and  T2((x, y)) = (3x + 4y, y − x)  for (x, y) ∈ R².

Find the formulae for T1 + T2, 3T1, −5T2 and 3T1 − 5T2.
Solution For (x, y) ∈ R²,

    (T1 + T2)((x, y)) = T1((x, y)) + T2((x, y))
                      = (x + 2y, 5x − 6y) + (3x + 4y, y − x) = (4x + 6y, 4x − 5y),
    (3T1)((x, y)) = 3T1((x, y)) = 3(x + 2y, 5x − 6y) = (3x + 6y, 15x − 18y),
    (−5T2)((x, y)) = −5T2((x, y)) = −5(3x + 4y, y − x) = (−15x − 20y, 5x − 5y),
    (3T1 − 5T2)((x, y)) = (3T1)((x, y)) + (−5T2)((x, y))
                        = (3x + 6y, 15x − 18y) + (−15x − 20y, 5x − 5y)
                        = (−12x − 14y, 20x − 23y).

Proposition 9.4.3 Let V and W be finite dimensional vector spaces over the same field F
with dim(V) ≥ 1 and dim(W) ≥ 1, and let B and C be ordered bases for V and W respectively.
1. If T1, T2 : V → W are linear transformations, then [T1 + T2]C,B = [T1]C,B + [T2]C,B.
2. If T : V → W is a linear transformation and c ∈ F, then [cT]C,B = c[T]C,B.

Proof The proof is left as an exercise. See Question 9.20.

Remark 9.4.4 Matrices and linear transformations have a lot of similarities. The observations
above show their relations in addition and scalar multiplication. By Theorem 9.3.3, we have
also seen that the composition of linear transformations is equivalent to matrix multiplication.
In the later sections, we shall learn that the matrix inverse has a corresponding analogue in linear
transformations (see Theorem 9.6.6). Thus we may sometimes regard linear transformations as
generalized matrices.

Theorem 9.4.5 Let V and W be vector spaces over the same field F, and let L(V, W) be the set of all linear transformations from V to W. Then L(V, W) is a vector space over F with the addition and scalar multiplication defined in Definition 9.4.1.
Furthermore, if V and W are finite dimensional, then

    dim(L(V, W)) = dim(V) dim(W).

Proof It is straightforward to check that L(V, W) is a vector space over F. In particular, the zero vector in L(V, W) is the zero transformation OV,W and the negative of T is the linear transformation −T = (−1)T.
Now, let V and W be finite dimensional. If V or W is a zero space, then L(V, W) = {OV,W} and dim(L(V, W)) = 0 = dim(V) dim(W). Suppose dim(V) ≥ 1 and dim(W) ≥ 1. Let B = {v1, v2, ..., vn} and C = {w1, w2, ..., wm} be bases for V and W respectively. For each pair of i, j, where 1 ≤ i ≤ m and 1 ≤ j ≤ n, define a linear transformation Tij : V → W such that for k = 1, 2, ..., n,

    Tij(vk) = wi  if k = j,
              0   if k ≠ j.

(See Remark 9.1.6.) It can be shown that { Tij | 1 ≤ i ≤ m and 1 ≤ j ≤ n } is a basis for L(V, W). Since this basis has mn elements, the dimension formula follows.
(Note that [Tij]C,B = Eij where Eij is the matrix defined in Example 8.4.6.7.)

Definition 9.4.6 If V is a vector space over F, then the vector space L(V, F) of all linear functionals on V is called the dual space of V and is denoted by V*. By Theorem 9.4.5, if V is finite dimensional, then dim(V) = dim(V*).

Example 9.4.7 Let V = Fⁿ where vectors in V are written as column vectors. For each f ∈ V*, there exist scalars a1, a2, ..., an such that for all (x1, x2, ..., xn)ᵀ ∈ V,

    f( (x1, x2, ..., xn)ᵀ ) = a1x1 + a2x2 + ⋯ + anxn = ( a1 a2 ⋯ an ) (x1, x2, ..., xn)ᵀ.

So each linear functional in V* can be represented by a row vector over F. (See also Example 9.6.14.4.)
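As a quick illustration of this row-vector picture, here is a tiny plain-Python sketch; the coefficient tuple a below is a made-up example:

    a = (2, -1, 3)    # hypothetical coefficients a1, a2, a3 of a functional f
    f = lambda x: sum(ai * xi for ai, xi in zip(a, x))
    print(f((1, 1, 1)))   # 2 - 1 + 3 = 4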

Section 9.5 Kernels and Ranges

Definition 9.5.1 Let T : V → W be a linear transformation.
1. The subset Ker(T) = {u ∈ V | T(u) = 0} of V is called the kernel of T.
(In some textbooks, Ker(T) is called the nullspace of T and is denoted by N(T).)
2. The subset R(T) = {T(u) | u ∈ V} of W is called the range of T.

[Figure: a schematic of T : V → W, with Ker(T) shown as the subspace of V sent to 0, and R(T) shown as the image of V inside W.]

Theorem 9.5.2 Let T : V → W be a linear transformation. Then

1. Ker(T) is a subspace of V; and
2. R(T) is a subspace of W.

Proof The proof is left as an exercise. See Question 9.28.

Example 9.5.3
1. Let T : M2×2(R) → M2×3(R) be the linear transformation defined by

    T( [ w x ; y z ] ) = [ w − x + z   −x + 2y   x − z ]    for [ w x ; y z ] ∈ M2×2(R).
                         [ x − 2y     w + x − z    0   ]

Since

    T( [ w x ; y z ] ) = [ 0 0 0 ; 0 0 0 ]  ⟺  w − x + z = 0,  −x + 2y = 0,  x − z = 0,
                                                x − 2y = 0,  w + x − z = 0
                                            ⟺  w = 0,  x = t,  y = ½t,  z = t,  for t ∈ R,

Ker(T) = { [ 0 t ; ½t t ] | t ∈ R } and { [ 0 1 ; ½ 1 ] } is a basis for Ker(T).
Since

    T( [ w x ; y z ] ) = [ w − x + z  −x + 2y  x − z ; x − 2y  w + x − z  0 ]
        = w [ 1 0 0 ; 0 1 0 ] + x [ −1 −1 1 ; 1 1 0 ] + y [ 0 2 0 ; −2 0 0 ] + z [ 1 0 −1 ; 0 −1 0 ],

R(T) = span{ [ 1 0 0 ; 0 1 0 ], [ −1 −1 1 ; 1 1 0 ], [ 0 2 0 ; −2 0 0 ], [ 1 0 −1 ; 0 −1 0 ] }.
Using the standard basis {E11, E12, E13, E21, E22, E23}, the four matrices are converted to the coordinate vectors (1, 0, 0, 0, 1, 0), (−1, −1, 1, 1, 1, 0), (0, 2, 0, −2, 0, 0), (1, 0, −1, 0, −1, 0).

    [  1  0  0  0  1  0 ]               [ 1  0  0  0  1  0 ]
    [ −1 −1  1  1  1  0 ]   Gaussian    [ 0  1 −1 −1 −2  0 ]
    [  0  2  0 −2  0  0 ]  ——————————→  [ 0  0  2  0  4  0 ]
    [  1  0 −1  0 −1  0 ]  Elimination  [ 0  0  0  0  0  0 ]

Thus { [ 1 0 0 ; 0 1 0 ], [ 0 1 −1 ; −1 −2 0 ], [ 0 0 2 ; 0 4 0 ] } is a basis for R(T). (A computational check of this part appears after this example.)

2. Let IV : V → V be the identity operator defined in Example 9.1.4.2. Then Ker(IV) = {0} and R(IV) = V.

3. Let OV,W : V → W be the zero transformation defined in Example 9.1.4.3. Then Ker(OV,W) = V and R(OV,W) = {0}.

4. Let [a, b], with a < b, be a closed interval on the real line and let D : C∞([a, b]) → C∞([a, b]) be the differential operator defined in Example 9.1.4.6.
For f ∈ C∞([a, b]), D(f) = 0, where 0 is the zero function, if and only if f is a constant function, i.e. f = c1 for some c ∈ R where 1 is the function in C∞([a, b]) defined by 1(x) = 1 for x ∈ [a, b]. So Ker(D) = span{1}.
Take f ∈ C∞([a, b]). Let g ∈ C∞([a, b]) be the function defined by g(x) = ∫ₐˣ f(t) dt for x ∈ [a, b]. Then D(g) = f. So R(D) = C∞([a, b]).

5. Let [a, b], with a < b, be a closed interval on the real line and let F : C∞([a, b]) → C∞([a, b]) be the integral operator defined in Example 9.1.4.6. Then Ker(F) = {0} and R(F) = {h ∈ C∞([a, b]) | h(a) = 0}.
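The kernel and range computations in Part 1 of Example 9.5.3 can be cross-checked by machine. Below is a sketch assuming the SymPy library; M is the matrix [T]C,B relative to the standard bases, with columns as computed above.

    import sympy as sp

    # Columns: coordinate vectors of T(E11), T(E12), T(E21), T(E22)
    # relative to the standard basis {E11, E12, E13, E21, E22, E23}.
    M = sp.Matrix([[1, -1,  0,  1],
                   [0, -1,  2,  0],
                   [0,  1,  0, -1],
                   [0,  1, -2,  0],
                   [1,  1,  0, -1],
                   [0,  0,  0,  0]])

    print(M.nullspace())   # [(0, 1, 1/2, 1)^T], so nullity(T) = 1
    print(M.rank())        # 3 = rank(T)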

Definition 9.5.4 Let T : V → W be a linear transformation.

1. If Ker(T) is finite dimensional, then dim(Ker(T)) is called the nullity of T and is denoted by nullity(T).

2. If R(T) is finite dimensional, then dim(R(T)) is called the rank of T and is denoted by rank(T).

Example 9.5.5 Consider the linear transformations defined in Example 9.5.3.

1. nullity(T) = 1 and rank(T) = 3.

2. nullity(IV) = 0 and if V is finite dimensional, then rank(IV) = dim(V).

3. rank(OV,W) = 0 and if V is finite dimensional, then nullity(OV,W) = dim(V).

4. nullity(D) = 1.

5. nullity(F) = 0.

Lemma 9.5.6 Let T : V → W be a linear transformation where V and W are finite dimensional with dim(V) ≥ 1 and dim(W) ≥ 1. For any ordered bases B and C for V and W respectively,

1. { [u]B | u ∈ Ker(T) } is the nullspace of [T]C,B and nullity(T) = nullity([T]C,B); and
2. { [v]C | v ∈ R(T) } is the column space of [T]C,B and rank(T) = rank([T]C,B).

Proof The proof is left as an exercise. See Question 9.30.

Theorem 9.5.7 (Dimension Theorem for Linear Transformations) Let T : V → W be a linear transformation where V and W are finite dimensional. Then

    rank(T) + nullity(T) = dim(V).

Proof If V = {0} is a zero space, then Ker(T) = V = {0} and R(T) = {0W}, where 0W is the zero vector in W, and hence

    rank(T) + nullity(T) = dim(Ker(T)) + dim(R(T)) = 0 + 0 = 0 = dim(V).

If W = {0} is a zero space, then Ker(T) = V and R(T) = W = {0} and hence

    rank(T) + nullity(T) = dim(Ker(T)) + dim(R(T)) = dim(V) + 0 = dim(V).



Suppose dim(V) ≥ 1 and dim(W) ≥ 1. Let B and C be ordered bases for V and W respectively. By Lemma 9.5.6 and the Dimension Theorem for Matrices (Theorem 4.3.4),

    rank(T) + nullity(T) = dim(Ker(T)) + dim(R(T))
                         = rank([T]C,B) + nullity([T]C,B)
                         = the number of columns in [T]C,B
                         = dim(V).

Example 9.5.8 In Example 9.5.3.1, rank(T) + nullity(T) = 3 + 1 = 4 = dim(M2×2(R)).
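The Dimension Theorem is easy to test on random examples. Here is a sketch assuming SymPy; randMatrix draws a random integer matrix, standing in for the matrix of some T : F⁵ → F³.

    import sympy as sp

    M = sp.randMatrix(3, 5, -5, 5)     # matrix of some T : F^5 -> F^3
    rank = M.rank()
    nullity = len(M.nullspace())
    assert rank + nullity == M.cols    # rank(T) + nullity(T) = dim(V) = 5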

Theorem 9.5.9 Let T : V → W be a linear transformation. Suppose B and C are subsets of V such that B is a basis for Ker(T), {T(v) | v ∈ C} is a basis for R(T) and for any v, v′ ∈ C, if v ≠ v′, then T(v) ≠ T(v′). Then B ∪ C is a basis for V.

Proof The proof is left as an exercise. (See Question 9.31.)

Remark 9.5.10 When V is finite dimensional, Theorem 9.5.9 gives us a direct proof of the Dimension Theorem for Linear Transformations without using matrices.

Definition 9.5.11 Let f : A → B be a mapping.

1. f is called injective or one-to-one if for every z ∈ B, there exists at most one x ∈ A such that f(x) = z.
2. f is called surjective or onto if for every z ∈ B, there exists at least one x ∈ A such that f(x) = z.
3. f is called bijective if f is both injective and surjective, i.e. for every z ∈ B, there exists one and only one x ∈ A such that f(x) = z.

Proposition 9.5.12 Let T : V → W be a linear transformation.

1. T is injective (one-to-one) if and only if Ker(T) = {0} if and only if nullity(T) = 0.
2. T is surjective (onto) if and only if R(T) = W.
If W is finite dimensional, then T is surjective if and only if rank(T) = dim(W).

Proof We only prove that T is injective (one-to-one) if and only if Ker(T) = {0}. The other parts are obvious.
(⇒) By Proposition 9.1.5, T(0) = 0. On the other hand, since T is injective, at most one vector maps to the zero vector in W. Hence Ker(T) = {0}.

(⇐) Suppose Ker(T) = {0}.
Assume for some w ∈ W, there exist u, v ∈ V such that T(u) = w and T(v) = w. Then

    T(u − v) = T(u) − T(v) = w − w = 0.

It means u − v ∈ Ker(T). As Ker(T) = {0}, we have u − v = 0 and hence u = v.
So T is injective.

Example 9.5.13 Consider the linear transformations defined in Example 9.5.3. T and OV,W (when both V and W are not zero spaces) are neither injective nor surjective; IV is both injective and surjective, i.e. IV is bijective; D is surjective but not injective; and F is injective but not surjective.

Section 9.6 Isomorphisms

Definition 9.6.1 Let T : V → W be a linear transformation. Then T is called an isomorphism from V onto W if T is bijective.

Example 9.6.2
1. For any vector space V, the identity operator on V is an isomorphism.
2. Let F be a field and let T : F³ → P2(F) be the linear transformation defined by

    T((a, b, c)) = a + (a + b)x + (a + b + c)x²  for (a, b, c) ∈ F³.

Since

    T((a, b, c)) = 0  ⟺  a = 0, a + b = 0, a + b + c = 0  ⟺  a = 0, b = 0, c = 0,

Ker(T) = {(0, 0, 0)} and hence T is injective.
For any d + ex + fx² ∈ P2(F),

    T((a, b, c)) = d + ex + fx²  ⟺  a = d, a + b = e, a + b + c = f
                                 ⟺  a = d, b = e − d, c = f − e.

Thus T((d, e − d, f − e)) = d + ex + fx² for all d + ex + fx² ∈ P2(F). It means that T is surjective.
As T is a bijective linear transformation, T is an isomorphism.

3. Define P : R² → R³ to be the linear transformation such that P((x, y)) = (x, y, 0) for (x, y) ∈ R². It is obvious that P is injective but not surjective. So P is not an isomorphism.
Let W be the xy-plane in R³. Note that R(P) = W. Define P′ : R² → W such that P′((x, y)) = (x, y, 0) for (x, y) ∈ R². Then P′ is a bijective linear transformation and hence P′ is an isomorphism.

Definition 9.6.3 A mapping T : V → W is bijective if and only if there exists a mapping S : W → V such that S ∘ T = IV and T ∘ S = IW where IV and IW are the identity operators on V and W respectively. The mapping S is known as the inverse of T and is denoted by T⁻¹. Thus a bijective mapping is also called an invertible mapping.

Theorem 9.6.4 If T is an isomorphism, then T⁻¹ is a linear transformation and hence is also an isomorphism.

Proof Suppose T maps V to W where V and W are vector spaces over a field F. Take any w1, w2 ∈ W. Let v1 = T⁻¹(w1) and v2 = T⁻¹(w2). Note that

    T(v1) = T(T⁻¹(w1)) = (T ∘ T⁻¹)(w1) = IW(w1) = w1

and similarly, T(v2) = w2. Then for any a, b ∈ F,

    T⁻¹(aw1 + bw2) = T⁻¹(aT(v1) + bT(v2))
                   = T⁻¹(T(av1 + bv2))   (because T is a linear transformation)
                   = (T⁻¹ ∘ T)(av1 + bv2)
                   = IV(av1 + bv2)
                   = av1 + bv2 = aT⁻¹(w1) + bT⁻¹(w2).

So T⁻¹ is a linear transformation. As the inverse of a bijective mapping is also bijective, T⁻¹ is also an isomorphism.

Example 9.6.5 Let F be a field and let T : F³ → P2(F) be the isomorphism defined in Example 9.6.2.2. Define a mapping S : P2(F) → F³ such that

    S(d + ex + fx²) = (d, e − d, f − e)  for d + ex + fx² ∈ P2(F).

Then for all (a, b, c) ∈ F³,

    (S ∘ T)((a, b, c)) = S(T((a, b, c)))
                       = S(a + (a + b)x + (a + b + c)x²)
                       = (a, (a + b) − a, (a + b + c) − (a + b)) = (a, b, c),

and for all d + ex + fx² ∈ P2(F),

    (T ∘ S)(d + ex + fx²) = T(S(d + ex + fx²))
                          = T((d, e − d, f − e))
                          = d + [d + (e − d)]x + [d + (e − d) + (f − e)]x² = d + ex + fx².

It means S ∘ T = IF³ and T ∘ S = IP2(F). So S = T⁻¹. Note that S is also a linear transformation and hence is an isomorphism.
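Working with coordinate triples, the two compositions in Example 9.6.5 can be verified mechanically. A plain-Python sketch, where a polynomial d + ex + fx² is encoded as the tuple (d, e, f):

    def T(v):                  # T((a,b,c)) = a + (a+b)x + (a+b+c)x^2
        a, b, c = v
        return (a, a + b, a + b + c)

    def S(p):                  # S(d + ex + fx^2) = (d, e-d, f-e)
        d, e, f = p
        return (d, e - d, f - e)

    v = (3, -1, 4)
    assert S(T(v)) == v        # S o T is the identity on F^3
    p = (2, 7, -5)
    assert T(S(p)) == p        # T o S is the identity on P2(F)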

Theorem 9.6.6 Let T : V → W be a linear transformation where V and W are finite dimensional with dim(V) = dim(W) ≥ 1. Let B and C be ordered bases for V and W respectively.
1. T is an isomorphism if and only if [T]C,B is an invertible matrix.
2. If T is an isomorphism, [T⁻¹]B,C = ([T]C,B)⁻¹.

Proof The proof is left as an exercise.

Example 9.6.7 Let T : P3(R) → M2×2(R) be the mapping defined by

    T(p(x)) = [ p(0)   p(1) ]    for p(x) ∈ P3(R).
              [ p(−1)  p(2) ]

T is a linear transformation. (Check it.) Let B = {1, x, x², x³} and C = {A1, A2, A3, A4} where

    A1 = [ 1 1 ; 1 1 ],  A2 = [ 0 1 ; 0 1 ],  A3 = [ 0 0 ; 1 1 ],  A4 = [ 0 0 ; 0 1 ].

Note that [ a b ; c d ] = aA1 + (−a + b)A2 + (−a + c)A3 + (a − b − c + d)A4. Then

    [T]C,B = ( [T(1)]C  [T(x)]C  [T(x²)]C  [T(x³)]C )
           = ( [ [1 1; 1 1] ]C  [ [0 1; −1 2] ]C  [ [0 1; 1 4] ]C  [ [0 1; −1 8] ]C )
           = [ 1  0  0  0 ]
             [ 0  1  1  1 ]
             [ 0 −1  1 −1 ].
             [ 0  2  2  8 ]

Since det([T]C,B) = 12 ≠ 0, [T]C,B is invertible and hence T is an isomorphism.
Furthermore,

    [T⁻¹]B,C = ([T]C,B)⁻¹ = [ 1    0     0     0  ]
                            [ 0   5/6  −1/2  −1/6 ]
                            [ 0   1/2   1/2    0  ].
                            [ 0  −1/3    0    1/6 ]

For [ a b ; c d ] ∈ M2×2(R),

    [ T⁻¹( [ a b ; c d ] ) ]B = [T⁻¹]B,C [ [ a b ; c d ] ]C
        = [T⁻¹]B,C (a, −a + b, −a + c, a − b − c + d)ᵀ
        = ( a, −½a + b − ⅓c − ⅙d, −a + ½b + ½c, ½a − ½b − ⅙c + ⅙d )ᵀ.

Thus

    T⁻¹( [ a b ; c d ] ) = a + (−½a + b − ⅓c − ⅙d)x + (−a + ½b + ½c)x² + (½a − ½b − ⅙c + ⅙d)x³.
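The determinant and inverse computed in Example 9.6.7 can be confirmed with a computer algebra system (a sketch assuming SymPy):

    import sympy as sp

    M = sp.Matrix([[1,  0, 0,  0],
                   [0,  1, 1,  1],
                   [0, -1, 1, -1],
                   [0,  2, 2,  8]])   # [T]C,B
    print(M.det())    # 12, nonzero, so T is an isomorphism
    print(M.inv())    # [T^(-1)]B,C, matching the matrix above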

Theorem 9.6.8 Let S : W → V and T : V → W be linear transformations such that T ∘ S = IW.
1. S is injective and T is surjective.
2. If V and W are finite dimensional and dim(V) = dim(W), then S and T are isomorphisms, S⁻¹ = T and T⁻¹ = S. (See also Theorem 2.4.12.)

Proof The proof is left as an exercise. See Question 9.46.

Remark 9.6.9 If V and W are infinite dimensional, Theorem 9.6.8.2 is in general not true. For example, in Example 9.3.2.2, we have D ∘ F = IV but neither D nor F is an isomorphism.

Definition 9.6.10 Let V and W be vector spaces over a field F. If there exists an isomorphism from V onto W, then V is said to be isomorphic to W and we write V ≅F W or simply V ≅ W.

Example 9.6.11 Let F be a field.

1. Let T1 : Pn(F) → F^{n+1} be the mapping defined by

    T1(a0 + a1x + ⋯ + anxⁿ) = (a0, a1, ..., an)  for a0 + a1x + ⋯ + anxⁿ ∈ Pn(F).

T1 is an isomorphism and hence Pn(F) ≅F F^{n+1}.
2. Let T2 : F(N, F) → F^N be the mapping defined by

    T2(f) = (f(n))_{n∈N} = (f(1), f(2), f(3), ...)  for f : N → F in F(N, F).

T2 is an isomorphism and hence F(N, F) ≅F F^N.

Remark 9.6.12 The term "isomorphic" is used in abstract algebra to indicate that two algebraic objects have the same structure. For example, by Example 9.6.11.1, Pn(F) is isomorphic to F^{n+1} and hence Pn(F) and F^{n+1} are the "same" as vector spaces. However, we do not want to say that Pn(F) is "equal" to F^{n+1} because they are still different in other aspects. In particular, we can multiply two polynomials in Pn(F) but cannot multiply two vectors in F^{n+1}.

Theorem 9.6.13 Let V and W be finite dimensional vector spaces over the same field. Then V is isomorphic to W if and only if dim(V) = dim(W).

Proof
(⇒) Suppose V ≅ W, i.e. there exists an isomorphism T : V → W. Let {v1, v2, ..., vn} be a basis for V. We claim that {T(v1), T(v2), ..., T(vn)} is a basis for W:
(a) Take any w ∈ W. Since T is surjective, there exists u ∈ V such that T(u) = w. As u ∈ span{v1, v2, ..., vn}, w = T(u) ∈ span{T(v1), T(v2), ..., T(vn)}.
So we have shown that W = span{T(v1), T(v2), ..., T(vn)}.
(b) Consider the vector equation

    c1T(v1) + c2T(v2) + ⋯ + cnT(vn) = 0.

Then T(c1v1 + c2v2 + ⋯ + cnvn) = c1T(v1) + c2T(v2) + ⋯ + cnT(vn) = 0 and hence c1v1 + c2v2 + ⋯ + cnvn ∈ Ker(T). Since T is injective, Ker(T) = {0}. Thus c1v1 + c2v2 + ⋯ + cnvn = 0. As v1, v2, ..., vn are linearly independent, c1 = c2 = ⋯ = cn = 0.
So we have shown that T(v1), T(v2), ..., T(vn) are linearly independent.
By (a) and (b), {T(v1), T(v2), ..., T(vn)} is a basis for W.
Hence dim(W) = n = dim(V).
(⇐) Suppose dim(V) = dim(W) = n. Let {v1, v2, ..., vn} and {w1, w2, ..., wn} be bases for V and W respectively.
Define a linear transformation T : V → W such that T(vi) = wi for i = 1, 2, ..., n, i.e. for any u = a1v1 + a2v2 + ⋯ + anvn ∈ V,

    T(u) = a1w1 + a2w2 + ⋯ + anwn.

It can be shown that T is an isomorphism (check it). So V ≅ W.

Example 9.6.14 Let F be a field.

1. Mm×n(F) ≅F F^{mn}.
2. Pn(F) ≅F F^{n+1}.
3. Cⁿ ≅R R^{2n}. (Here, Cⁿ is regarded as a vector space over R.)
4. Suppose V and W are finite dimensional vector spaces over F such that dim(V) = n and dim(W) = m. Then L(V, W) ≅F F^{mn} ≅F Mm×n(F).
In particular, V is isomorphic to its dual space V* defined in Definition 9.4.6. (This result is not true if V is infinite dimensional.)

Theorem 9.6.15 (The First Isomorphism Theorem) Let T : V → W be a linear transformation. Then V/Ker(T) ≅ R(T).

Proof Define a mapping S : V/Ker(T) → R(T) such that

    S(Ker(T) + u) = T(u)  for Ker(T) + u ∈ V/Ker(T).

Note that every coset A of Ker(T) can be represented as Ker(T) + u by many different choices of u. So we need to make sure that our definition of S(A) always gives the same answer regardless of the choice of u: Suppose u, v ∈ V are such that Ker(T) + u = Ker(T) + v. By Theorem 8.7.3.1, u − v ∈ Ker(T) and hence T(u) − T(v) = T(u − v) = 0, i.e. T(u) = T(v). Thus

    S(Ker(T) + u) = T(u) = T(v) = S(Ker(T) + v).

We have shown that the mapping S is well-defined.
It is easy to check that S is an isomorphism. (See Question 9.48.) Thus V/Ker(T) ≅ R(T).

Remark 9.6.16 Let A be an m × n matrix over a field F.
Consider the linear transformation LA : Fⁿ → Fᵐ defined by LA(u) = Au for u ∈ Fⁿ. Then Ker(LA) is the nullspace of A and R(LA) is the column space of A. (See Section 7.2 or Lemma 9.5.6.) So Theorem 9.6.15 gives us an algebraic relation between the nullspace of A and the column space of A.
By Example 8.7.2.4, for each element Ker(LA) + v ∈ Fⁿ/Ker(LA),

    Ker(LA) + v = {u + v | u ∈ Ker(LA)}

is the solution set of the linear system Ax = b where b = Av is an element of the column space of A. (See Theorem 4.3.6.)

Exercise 9

Question 9.1 to Question 9.7 are exercises for Section 9.1.


1. For each of the following functions T , determine whether it is a linear transformation.

(a) T : R² → R² such that T((x, y)) = (x, xy) for (x, y) ∈ R².

(b) T : Rᵐ → Rⁿ such that T(u) = uA for u ∈ Rᵐ, where A is an m × n real matrix and vectors u ∈ Rᵐ are written as row vectors.
(c) T : Mn×n(C) → C such that T(A) = det(A) for A ∈ Mn×n(C).
(d) T : Mn×n(C) → C such that T(A) = tr(A) for A ∈ Mn×n(C).
(e) T : Mn×n(C) → Mn×n(C) such that T(A) = P⁻¹AP for A ∈ Mn×n(C), where P is an n × n invertible real matrix.
(f) T : P1(R) → P1(R) such that T(a + bx) = (a + 1) + (a + b)x for a + bx ∈ P1(R).
(g) T : [0, 1] → R such that T(x) = 2x for x ∈ [0, 1].
(h) T : C∞([0, 1]) → C∞([0, 1]) such that for f ∈ C∞([0, 1]), T(f) is the function defined by T(f)(x) = f(x) + x for x ∈ [0, 1].
(i) T : C∞([a, b]) → C∞([a, b]), where a < b, such that for f ∈ C∞([a, b]), T(f) is the function defined by

    T(f)(x) = d²f(x)/dx² − 3 df(x)/dx + 2f(x)  for x ∈ [a, b].

(j) T : V → V/W such that T(u) = W + u for u ∈ V, where V is a vector space and W is a subspace of V.

2. For each of the following linear transformations T, (i) determine whether the given conditions are sufficient for us to find a formula for T; (ii) if possible, write down a formula for T; and if not, give two different examples of T that satisfy the given conditions.
(a) T : P2(R) → P1(R) such that

    T(1 + x + x²) = 2 + 3x,  T(2 + x + 3x²) = 1,  T(−1 + x + 2x²) = x.

(b) T : M2×2(C) → C such that

    T( [ i i ; 0 0 ] ) = i,  T( [ 0 i ; i 0 ] ) = 0,  T( [ 0 0 ; i i ] ) = i.

3. Let T : V → W be a linear transformation. For X ⊆ V, define T[X] = {T(u) | u ∈ X}, which is called the image of X under T.
(a) Show that if X is a subspace of V, then T[X] is a subspace of W.
(b) If T[X] is a subspace of W, is it true that X must be a subspace of V?
(c) Let V = W = R³, T((x, y, z)) = (x − y, y − z, z − x) for (x, y, z) ∈ R³ and X = {(x, x, z) | x, z ∈ R}. Write down T[X] explicitly and find its dimension.

4. (a) Give an example of a mapping that satisfies (T1) but not (T2).
(b) Give an example of a mapping that satisfies (T2) but not (T1).

5. Let V and W be two vector spaces over a field F. Suppose T : V → W is a mapping that satisfies (T1).

(a) Prove that T(0) = 0 and T(−u) = −T(u) for all u ∈ V.

(b) Suppose F = Q. Prove that T is a linear transformation.

6. Let W1 and W2 be subspaces of a vector space V such that V = W1 ⊕ W2. Let P : V → V be a mapping such that P(u) ∈ W1 and u − P(u) ∈ W2 for all u ∈ V.

(a) For v ∈ W1 and w ∈ W2, show that P(v) = v and P(w) = 0. (Hint: Use the property that every vector in V can be expressed uniquely as a sum of vectors from W1 and W2.)
(b) Prove that the mapping P is a linear operator.

(Parts (a) and (b) imply that for all u = v + w ∈ V with v ∈ W1 and w ∈ W2, P(u) = v. The linear operator P is sometimes called the projection on W1 along W2. You can compare it with the orthogonal projections in Definition 5.2.13 and Definition 12.4.8.)

(c) Let V = M2×2(R), W1 = { [ a a ; b b ] | a, b ∈ R } and W2 = { [ a b ; −a b ] | a, b ∈ R }.
(i) Prove that V = W1 ⊕ W2.
(ii) Write down a formula for P.

7. Let T : V → W be a linear transformation and let v1, v2, ..., vn be vectors in V.

(a) Suppose T(v1), T(v2), ..., T(vn) are linearly independent. Prove that v1, v2, ..., vn are linearly independent.
(b) Suppose v1, v2, ..., vn are linearly independent. Are T(v1), T(v2), ..., T(vn) linearly independent?

Question 9.8 to Question 9.13 are exercises for Section 9.2.

8. For each of the following linear operators T : V → V and bases B, C for V, write down the matrix [T]C,B.

(a) V = R³, T((x, y, z)) = (x, x + y + z, x + 2y + 3z) for (x, y, z) ∈ R³, B = {(1, 0, 0), (1, 1, 0), (1, 1, 1)} and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}.
(b) Use the same V and T as in Part (a) but B = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} and C = {(1, 0, 0), (1, 1, 0), (1, 1, 1)}.
(c) Let V = Pn(R), T(p(x)) = dp(x)/dx for p(x) ∈ Pn(R) and B = C = {1, x, x², ..., xⁿ}.
(d) Let V = {(an)_{n∈N} | a_{n+2} = an + a_{n+1} for n = 1, 2, 3, ...}, T((an)_{n∈N}) = (a_{n+1})_{n∈N} for (an)_{n∈N} ∈ V and B = C = {(bn)_{n∈N}, T((bn)_{n∈N})} where (bn)_{n∈N} is a sequence in V such that (bn)_{n∈N} and T((bn)_{n∈N}) are linearly independent.

9. Let T be a linear operator on P2(R) such that

    [T]C,B = [ 1 1 1 ]
             [ 0 1 1 ]
             [ 0 0 1 ]

where B = {1, 1 + x, 1 + x + x²} and C = {1 − x, x − x², x²}. Find a formula for T.


10. Let A = [ 1 0 2 3 ; 2 1 3 6 ; 1 4 4 0 ] be a real matrix. Suppose C1, C2 and C3 are matrices obtained from A by the following elementary row operations:

    A —(2R3)→ C1,   A —(R1 ↔ R2)→ C2,   A —(R3 + 2R2)→ C3.

(Follow Notation 1.4.8.)
(a) Write down C1, C2 and C3 and find elementary matrices E1, E2 and E3 such that EiA = Ci for i = 1, 2, 3.
(b) Suppose V, W are real vector spaces and T : V → W is a linear transformation such that [T]C,B = A where B is an ordered basis for V and C = {w1, w2, w3} is an ordered basis for W. For each i = 1, 2, 3, find an ordered basis Di for W such that [T]Di,B = Ci.
(This question shows that performing elementary row operations on a matrix is equivalent to changing bases for the corresponding linear transformation.)

11. Let T : P3(R) → R³ be the linear transformation defined by

    T(a + bx + cx² + dx³) = (a − b − c, 2a − b + d, b − 2c)

for all a + bx + cx² + dx³ ∈ P3(R).
(a) Let B = {1, x, x², x³} and C = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. Write down the matrix [T]C,B.
(b) Use Gauss-Jordan Elimination to reduce [T]C,B to its reduced row-echelon form R.
(c) Write down an ordered basis D for R³ such that [T]D,B = R.

12. Let V be a real vector space with an ordered basis B = {v1, v2, v3}. Suppose T is an operator on V such that

    [T]B = [ 1 1 0 ]
           [ 0 1 1 ].
           [ 0 0 1 ]

(a) Prove that C = {v1 − v3, v1, v1 + v2 + v3} is a basis for V.
(b) Using C as an ordered basis, compute [T]C.

13. Let B = {1 + x + x², x + x², x − x²}.
(a) Prove that B is a basis for P2(R).
(b) Find the transition matrix from E to B where E = {1, x, x²} is the standard basis for P2(R).

Question 9.14 to Question 9.19 are exercises for Section 9.3.

14. Let S : M2×2(C) → C³ and T : C³ → M2×2(C) be linear transformations such that

    S( [ a b ; c d ] ) = (a + ib, c + ia, c + id)  for [ a b ; c d ] ∈ M2×2(C)

and

    T((x, y, z)) = [ x − iy  y ; ix  x − iz ]  for (x, y, z) ∈ C³.

(a) Let B = {E11, E12, E21, E22} and C = {e1, e2, e3} be the standard bases for M2×2(C) and C³ respectively. Compute [S]C,B, [T]B,C, [S ∘ T]C and [T ∘ S]B.
(b) Write down a formula for each of S ∘ T and T ∘ S.

15. Let B = {1, x, x²} and C = {(1, 0), (0, 1)}. Suppose T1 : P2(R) → R² and T2 : R² → P2(R) are linear transformations such that

    T1(x) = (−1, 1),  T2((1, 0)) = 1 − x²  and  [T2 ∘ T1]B = [ 2 1 2 ]
                                                             [ 1 1 0 ].
                                                             [ 0 1 2 ]

(a) Find the matrices [T1]C,B and [T2]B,C.
(b) Write down the formulae for T1 and T2.

16. Prove Theorem 9.2.6:

Let V be a finite dimensional vector space and B, C two bases for V. Prove that the matrix [IV]C,B is invertible and ([IV]C,B)⁻¹ = [IV]B,C.

17. Let T be a linear operator on P2(C) such that

    [T]B = [ 1 i 0 ]
           [ 0 1 1 ]
           [ i 0 1 ]

where B = {1, x, x²}.
(a) Let C = {1, 1 + ix, 1 + ix²}. Compute [T]C.
(b) Find a matrix P so that P⁻¹[T]B P = [T]C.

18. (a) Let P be a plane in R³ that contains the origin. Define F : R³ → R³ to be the reflection about P. (See Section 7.3.) Let W1 = {u ∈ R³ | F(u) = u} and W2 = {u ∈ R³ | F(u) = −u}.
(i) What are W1 and W2 geometrically?
(ii) Show that R³ = W1 ⊕ W2.
(b) Let F be a field with 1 + 1 ≠ 0.
(i) Suppose T is a linear operator on a vector space V over F such that T² = IV. Let W1 = {u ∈ V | T(u) = u} and W2 = {u ∈ V | T(u) = −u}. Prove that V = W1 ⊕ W2.
(ii) If A is a square matrix over F such that A² = I, show that there exists an invertible matrix P over F such that P⁻¹AP is a diagonal matrix.

19. Let V be a finite dimensional vector space such that dim(V) = n ≥ 1 and let B = {v1, v2, ..., vn} be an ordered basis for V. Define a linear operator S on V such that

    S(vk) = 0        if k = 1,
            v_{k−1}  if 2 ≤ k ≤ n.

(a) Write down [S]B.
(b) Prove that S^{n−1} ≠ OV and Sⁿ = OV.

Question 9.20 to Question 9.25 are exercises for Section 9.4.


20. Prove Proposition 9.4.3:
Let V and W be finite dimensional vector spaces over the same field F, where dim(V) ≥ 1 and dim(W) ≥ 1, and let B and C be ordered bases for V and W respectively.
(a) If T1, T2 : V → W are linear transformations, prove that [T1 + T2]C,B = [T1]C,B + [T2]C,B.
(b) If T : V → W is a linear transformation and c ∈ F, prove that [cT]C,B = c[T]C,B.

21. Let S and T be linear operators on M2×2(R) such that

    S(X) = AX  and  T(X) = XA  for X ∈ M2×2(R)

where A = [ 2 1 ; 1 2 ].

(a) Let E = {E11, E12, E21, E22} be the standard basis for M2×2(R). Compute [S]E, [T]E, [IV]E, [S + T]E, [T − 2IV]E and [(T − 2IV)²]E.
(b) Write down a formula for each of S + T, T − 2IV and (T − 2IV)².

22. Let V be a finite dimensional vector space such that dim(V) = n ≥ 1 and let T be a linear operator on V. Define Q = T − λIV where λ is a scalar. Suppose Qⁿ = OV and Q^{n−1}(v) ≠ 0 for some v ∈ V.

(a) Show that C = {Q^{n−1}(v), ..., Q(v), v} is a basis for V.

(b) Using C in Part (a) as an ordered basis, compute [Q]C and [T]C. (The matrix [T]C is called a Jordan block. See Section 11.6.)

23. Let V = R² and W = P2(R). Suppose U is the subspace of L(V, W) spanned by T1, T2, T3 and T4 where

    T1((a, b)) = (a + b) + ax²,  T2((a, b)) = (a + b) + (a + b)x − bx²,
    T3((a, b)) = (a + b)(x − x²),  T4((a, b)) = 2(a + b) + (a + b)x + (a − b)x²

for (a, b) ∈ V. Find a basis for U and determine the dimension of U.

24. Let T : V → W be a linear transformation. Define T* : W* → V* such that for f ∈ W*, T*(f) is a functional on V defined by

    T*(f)(u) = f(T(u))  for u ∈ V.

(See Definition 9.4.6 for the definition of the dual spaces V* and W*.)

(a) Show that T* is a linear transformation.

(b) Suppose V and W are finite dimensional with bases B = {v1, ..., vn} and C = {w1, ..., wm} respectively. For i = 1, ..., n, define gi ∈ V* such that

    gi(vj) = 1 if i = j, 0 otherwise;

and for i = 1, ..., m, define hi ∈ W* such that

    hi(wj) = 1 if i = j, 0 otherwise.

(i) Show that B* = {g1, ..., gn} is a basis for V* and C* = {h1, ..., hm} is a basis for W*.
(ii) Prove that [T*]B*,C* = ([T]C,B)ᵀ.

25. Let V and W be vector spaces over the same field. For any subset A of V, define

    A⁰ = {T ∈ L(V, W) | T(u) = 0 for all u ∈ A}.

(a) If A is a subset of V, prove that A⁰ is a subspace of L(V, W).
(b) If A and B are subsets of V such that A ⊆ B, prove that B⁰ ⊆ A⁰.
(c) If U1 and U2 are subspaces of V, prove that (U1 + U2)⁰ = U1⁰ ∩ U2⁰.
(d) If U1 and U2 are finite dimensional subspaces of V, prove that U1⁰ + U2⁰ = (U1 ∩ U2)⁰.

Question 9.26 to Question 9.40 are exercises for Section 9.5.


26. For each of the following linear transformation T , (i) nd Ker(T ) and R(T ); and (ii) if
possible, nd nullity(T ) and rank(T ).
(a) T : F23 ! F23 such that T ((x; y; z )) = (x + y; y + z; x + z ) for (x; y; z ) 2 F23 .
(b) T : P (R) ! R2 such that T (p(x)) = (p(0); p(1)) for p(x) 2 P (R).
(c) T : Mnn (R) ! Mnn (R) such that T (A) = A + AT for A 2 Mnn (R).
(d) T : FN ! FN such that T ((an )n2N ) = (an+1
an )n2N for (an )n2N 2 FN .
(e) T : V ! V=W such that T (u) = W + u for u 2 V , where V is a vector space and
W is a subspace of V .

27. Let T : R³ → P1(R) be a linear transformation such that

    [T]C,B = [ 1 1 1 ]
             [ 0 0 1 ]

where B = {(1, 0, 0), (0, 1, 1), (0, 1, −1)} and C = {1, 1 + x}.
(a) Write down a formula for T.

(b) Find Ker(T) and R(T).

28. Prove Theorem 9.5.2:

Let T : V → W be a linear transformation. Show that
(a) Ker(T) is a subspace of V; and
(b) R(T) is a subspace of W.

29. Let T be a linear operator on a vector space V such that T² = T.

(a) Prove that V = R(T) ⊕ Ker(T).
(b) Prove that T is the projection on R(T) along Ker(T). (See Question 9.6 for the definition of "projection".)

30. Prove Lemma 9.5.6:

Let T : V → W be a linear transformation where V and W are finite dimensional with dim(V) ≥ 1 and dim(W) ≥ 1. Prove that for any ordered bases B and C for V and W respectively,
(a) { [u]B | u ∈ Ker(T) } is the nullspace of [T]C,B and nullity(T) = nullity([T]C,B); and
(b) { [v]C | v ∈ R(T) } is the column space of [T]C,B and rank(T) = rank([T]C,B).

31. Prove Theorem 9.5.9:

Let T : V → W be a linear transformation. Suppose B and C are subsets of V such that B is a basis for Ker(T), {T(v) | v ∈ C} is a basis for R(T) and for any v, v′ ∈ C with v ≠ v′, T(v) ≠ T(v′). Prove that B ∪ C is a basis for V.
32. Let T : V → W be a linear transformation. For w ∈ W, we define the pre-image of w under T to be the set

    T⁻¹[w] = {u ∈ V | T(u) = w}.

(a) Show that T⁻¹[w] = Ker(T) + v for some v ∈ V.
Let U be a subspace of W. Define the pre-image of U under T to be the set

    T⁻¹[U] = {u ∈ V | T(u) ∈ U}.

(b) Show that T⁻¹[U] is a subspace of V.
(c) If both Ker(T) and U are finite dimensional, say dim(Ker(T)) = k and dim(U) = m, find dim(T⁻¹[U]).

33. (a) Let T : V → W be a linear transformation where V and W are finite dimensional vector spaces such that dim(V) = n ≥ 1 and dim(W) = m ≥ 1. Show that there exist ordered bases B and C for V and W, respectively, such that

    [T]C,B = [ Ik            0_{k×(n−k)}     ]
             [ 0_{(m−k)×k}   0_{(m−k)×(n−k)} ]

where k = rank(T). (Hint: The result of Theorem 9.5.9 can be useful.)
(b) Let A ∈ Mm×n(F). Show that there exist invertible matrices P ∈ Mm×m(F) and Q ∈ Mn×n(F) such that

    PAQ = [ Ik            0_{k×(n−k)}     ]
          [ 0_{(m−k)×k}   0_{(m−k)×(n−k)} ]

where k = rank(A).
(c) Let A = [ 1 1 0 ; 0 1 1 ; 1 0 −1 ] be a real matrix. Find two 3 × 3 invertible real matrices P and Q such that PAQ = [ 1 0 0 ; 0 1 0 ; 0 0 0 ].

34. (a) Let S : U → V and T : V → W be linear transformations. Prove that Ker(S) ⊆ Ker(T ∘ S) and R(T ∘ S) ⊆ R(T).
(b) Let S : P2(R) → R³ and T : R³ → M2×2(R) be linear transformations defined by

    S(a + bx + cx²) = (a − b, b − c, c − a)  for a + bx + cx² ∈ P2(R),

    T((x, y, z)) = [ x  y + z ; y + z  x ]  for (x, y, z) ∈ R³.

Find Ker(S), Ker(T ∘ S), R(T), R(T ∘ S) and hence verify the results in Part (a).

35. (a) Let S, T : V → W be linear transformations. Prove that R(S + T) ⊆ R(S) + R(T) and Ker(S + T) ⊇ Ker(S) ∩ Ker(T).
(b) Let S, T : R³ → R⁴ be linear transformations defined by

    S((x, y, z)) = (x, x, y, y)  for (x, y, z) ∈ R³,

    T((x, y, z)) = (x, x + z, y, y + z)  for (x, y, z) ∈ R³.

Find R(S + T), R(S) + R(T), Ker(S + T) and Ker(S) ∩ Ker(T) and hence verify the results in Part (a).

36. Let T1, T2, ..., Tk be linear operators on a vector space V. Suppose

(i) IV = T1 + T2 + ⋯ + Tk;
(ii) Ti ∘ Tj = OV whenever i ≠ j; and
(iii) Ti² = Ti for i = 1, 2, ..., k.
Prove that V = R(T1) ⊕ R(T2) ⊕ ⋯ ⊕ R(Tk).

37. Let V be a finite dimensional vector space.

(a) Prove that if T is a linear operator on V such that Ker(T) ∩ R(T) = {0}, then V = Ker(T) ⊕ R(T).
(b) Give an example of a linear operator T on R² such that V ≠ Ker(T) + R(T).
(c) Prove that for any linear operator T on V, Ker(Tⁱ) ⊆ Ker(Tⁱ⁺¹) for i = 1, 2, 3, ....
(d) Show that for any linear operator T on V, there exists a positive integer m such that V = Ker(Tᵐ) ⊕ R(Tᵐ).

38. Let S : W → V and T : V → W be linear transformations such that T ∘ S = IW. Prove that V = Ker(T) ⊕ R(S).

39. Let V and W be finite dimensional vector spaces such that dim(V) = dim(W) and let T : V → W be a linear transformation. Prove that the following statements are equivalent:
(i) T is injective.
(ii) T is surjective.
(iii) T is bijective.

40. Let V and W be finite dimensional vector spaces and let T : V → W be a linear transformation.
(a) Prove that if dim(V) < dim(W), then T is not surjective.
(b) Prove that if dim(V) > dim(W), then T is not injective.

Question 9.41 to Question 9.49 are exercises for Section 9.6.

41. For each of the following, determine whether the linear transformation T is an isomorphism. If so, find the inverse of T.
(a) T : F₂³ → F₂³ such that T((x, y, z)) = (x + y, y + z, z + x) for (x, y, z) ∈ F₂³.
(b) T : R³ → R³ such that T((x, y, z)) = (x + y, y + z, z + x) for (x, y, z) ∈ R³.

(c) T : Mn×n(C) → Mn×n(C) such that T(A) = P⁻¹AP for A ∈ Mn×n(C), where P is an n × n invertible complex matrix.
(d) T : Pn(R) → Rⁿ such that T(p(x)) = (p(1), p(2), ..., p(n)) for p(x) ∈ Pn(R).
(e) T : P(R) → P(R) such that T(p(x)) = d(xp(x))/dx for p(x) ∈ P(R).
42. Prove Theorem 9.6.6:
Let T : V → W be a linear transformation where V and W are finite dimensional with dim(V) = dim(W) ≥ 1. Let B and C be ordered bases for V and W respectively.
(a) Prove that T is an isomorphism if and only if [T]C,B is an invertible matrix.
(b) If T is an isomorphism, show that [T⁻¹]B,C = ([T]C,B)⁻¹.

43. Let T : M2×2(C) → M2×2(C) be a linear transformation such that T⁻¹ exists and

    T⁻¹( [ a b ; c d ] ) = [ a  a + ib ; a + ic  b + c + id ]  for [ a b ; c d ] ∈ M2×2(C).

(a) Write down a formula for T.
(b) Find Ker(T) and R(T).

44. Let F be a field and c0, c1, ..., cn ∈ F such that ci ≠ cj for i ≠ j. For i = 0, 1, ..., n, define

    qi(x) = di⁻¹ (x − c0) ⋯ (x − c_{i−1})(x − c_{i+1}) ⋯ (x − cn) ∈ Pn(F)

where di = (ci − c0) ⋯ (ci − c_{i−1})(ci − c_{i+1}) ⋯ (ci − cn) ∈ F. (The polynomials qi(x) are called the Lagrange polynomials.)
(a) Show that {q0(x), q1(x), ..., qn(x)} is a basis for Pn(F).
(b) Hence, or otherwise, prove that the linear transformation T : Pn(F) → F^{n+1} defined by T(p(x)) = (p(c0), p(c1), ..., p(cn)), for p(x) ∈ Pn(F), is an isomorphism.

45. (a) Let U = {a0 + a1x² + a2x⁴ + ⋯ + am x^{2m} | m ∈ N and a0, a1, a2, ..., am ∈ R}.

(i) Is U a proper subspace of P(R)?
(ii) Show that U ≅ P(R).
(b) Given a finite dimensional vector space V, can there exist a proper subspace W of V such that W ≅ V?

46. Prove Theorem 9.6.8:

Let S : W → V and T : V → W be linear transformations such that T ∘ S = IW, the identity operator on W.

(a) Prove that S is injective and T is surjective.

(b) If V and W are finite dimensional and dim(V) = dim(W), prove that S and T are isomorphisms and they are inverses of each other. (See also Theorem 2.4.12.)

47. Let U, V and W be vector spaces over a field F. Suppose R : U → V and S, T : V → W are isomorphisms.
(a) Is T ∘ R an isomorphism? If S ≠ T, is S + T an isomorphism? If c ∈ F and c ≠ 0, is cT an isomorphism?
(b) Prove that if S + T is an isomorphism, then S⁻¹ + T⁻¹ is also an isomorphism.

48. Complete the proof of Theorem 9.6.15:

Let T : V → W be a linear transformation and let S : V/Ker(T) → R(T) be the mapping defined in the proof of Theorem 9.6.15, i.e.

    S(Ker(T) + u) = T(u)  for Ker(T) + u ∈ V/Ker(T).

Prove that S is an isomorphism.

49. (a) Let V be a subspace of a vector space U and let W be a subspace of V. Show that V/W is a subspace of U/W.
(b) (The Second Isomorphism Theorem) Let V and W be subspaces of a vector space U. Prove that (V + W)/W ≅ V/(V ∩ W).
(c) (The Third Isomorphism Theorem) Let V be a subspace of a vector space U and let W be a subspace of V. Prove that (U/W)/(V/W) ≅ U/V.
Chapter 10

Multilinear Forms and Determinants

Section 10.1 Permutations

Discussion 10.1.1 In Section 2.5, we defined the determinant of a square matrix by using the cofactor expansion. Such a definition is usually referred to as a "working" definition, which is good for computation but not quite suitable for us to investigate the properties of determinants. In this chapter, we shall study two different ways of defining determinants: one by multilinear forms and another by permutations.

Definition 10.1.2 A permutation σ of {1, 2, ..., n} is a bijective mapping from {1, 2, ..., n} to {1, 2, ..., n}. We usually represent σ by the two-row array

    ( 1     2    ⋯   n   )
    ( σ(1)  σ(2) ⋯  σ(n) ).

The set of all permutations of {1, 2, ..., n} is denoted by Sn. Note that |Sn| = n!.

Example 10.1.3 Writing the two-row arrays of Definition 10.1.2 on one line, with a semicolon separating the rows,

    S1 = { ( 1 ; 1 ) },  S2 = { ( 1 2 ; 1 2 ), ( 1 2 ; 2 1 ) }  and
    S3 = { ( 1 2 3 ; 1 2 3 ), ( 1 2 3 ; 2 3 1 ), ( 1 2 3 ; 3 1 2 ),
           ( 1 2 3 ; 3 2 1 ), ( 1 2 3 ; 1 3 2 ), ( 1 2 3 ; 2 1 3 ) }.

For example, σ = ( 1 2 3 ; 3 1 2 ) is the mapping from {1, 2, 3} to {1, 2, 3} such that σ(1) = 3, σ(2) = 1 and σ(3) = 2.

Notation 10.1.4
1. For σ, τ ∈ Sn, σ ∘ τ is also a permutation and we usually denote σ ∘ τ by στ.

2. In the following, for α, β ∈ {1, 2, ..., n}, we use τ_{α,β} to denote the permutation of {1, 2, ..., n} such that

    τ_{α,β}(k) = k  if k ≠ α, β,
                 β  if k = α,
                 α  if k = β.

This permutation is called the transposition of α and β. Note that τ_{β,α} = τ_{α,β} and τ_{α,β}⁻¹ = τ_{α,β}.

Example 10.1.5
1. Let σ = ( 1 2 3 4 ; 3 1 2 4 ) and τ = ( 1 2 3 4 ; 4 1 3 2 ). Then

    στ = ( 1 2 3 4 ; 4 3 2 1 )  and  τσ = ( 1 2 3 4 ; 3 4 1 2 ).

2. In S4, τ_{1,2} = ( 1 2 3 4 ; 2 1 3 4 ), τ_{2,3} = ( 1 2 3 4 ; 1 3 2 4 ), τ_{3,4} = ( 1 2 3 4 ; 1 2 4 3 ),

    τ_{1,2} τ_{2,3} τ_{1,2} = ( 1 2 3 4 ; 3 2 1 4 ) = τ_{1,3}  and
    τ_{1,2} τ_{2,3} τ_{3,4} τ_{2,3} τ_{1,2} = ( 1 2 3 4 ; 4 2 3 1 ) = τ_{1,4}.

Lemma 10.1.6
1. {σ⁻¹ | σ ∈ Sn} = Sn.
2. For any τ ∈ Sn, {τσ | σ ∈ Sn} = {στ | σ ∈ Sn} = Sn.

Proof The proof is left as an exercise. See Question 10.4(a) and (b).

Lemma 10.1.7 For every σ ∈ Sn, there exist β1, β2, ..., βk ∈ {1, 2, ..., n − 1} such that σ = τ_{β1,β1+1} τ_{β2,β2+1} ⋯ τ_{βk,βk+1}.

Proof Since every transposition τ_{α,β} ∈ Sn, α < β, can be written as

    τ_{α,β} = τ_{α,α+1} τ_{α+1,α+2} ⋯ τ_{β−2,β−1} τ_{β−1,β} τ_{β−2,β−1} ⋯ τ_{α+1,α+2} τ_{α,α+1},

we only need to show that σ can be written as a product of transpositions.
Let σ0 = σ. Suppose we have already defined σ_{m−1}, where 1 ≤ m ≤ n − 1, such that

    σ_{m−1}(k) = k  for k = 1, 2, ..., m − 1.

Define σm = τ_{m,βm} σ_{m−1} where βm = σ_{m−1}(m) ≥ m. Then

    σm(k) = τ_{m,βm}(σ_{m−1}(k)) = k   if k = 1, 2, ..., m − 1,
            τ_{m,βm}(βm) = m           if k = m.

Inductively, we obtain transpositions τ_{1,β1}, τ_{2,β2}, ..., τ_{n−1,β_{n−1}} such that

    τ_{n−1,β_{n−1}} ⋯ τ_{2,β2} τ_{1,β1} σ = ( 1 2 ⋯ n ; 1 2 ⋯ n ),

which is the identity mapping. Thus

    σ = (τ_{n−1,β_{n−1}} ⋯ τ_{2,β2} τ_{1,β1})⁻¹ = τ_{1,β1} τ_{2,β2} ⋯ τ_{n−1,β_{n−1}}

is a product of transpositions.
Example 10.1.8 Let σ = ( 1 2 3 4 ; 3 1 4 2 ). Following the procedure of the proof of Lemma 10.1.7, we have

    τ_{3,4} τ_{2,3} τ_{1,3} σ = ( 1 2 3 4 ; 1 2 3 4 )

and hence σ = τ_{1,3} τ_{2,3} τ_{3,4}. Since τ_{1,3} = τ_{1,2} τ_{2,3} τ_{1,2}, we have σ = τ_{1,2} τ_{2,3} τ_{1,2} τ_{2,3} τ_{3,4}.
The decomposition of a permutation discussed in Lemma 10.1.7 is not unique; for this example, we can also write σ = τ_{2,3} τ_{3,4} τ_{1,2}.

Definition 10.1.9 Let σ ∈ Sn.

An inversion is said to occur in σ if σ(i) > σ(j) for some i < j.
If the total number of inversions in σ is even, σ is called an even permutation; otherwise, σ is called an odd permutation.
The sign (or parity) of σ is defined to be

    sgn(σ) =  1  if σ is even,
             −1  if σ is odd.

Example 10.1.10
1. Let σ = ( 1 2 3 4 ; 3 1 4 2 ) and τ = ( 1 2 3 4 ; 4 1 3 2 ).
In σ, inversions occur when (i, j) = (1, 2), (1, 4) and (3, 4). So σ is an odd permutation and sgn(σ) = −1. In τ, inversions occur when (i, j) = (1, 2), (1, 3), (1, 4) and (3, 4). So τ is an even permutation and sgn(τ) = 1.

2. For τ_{α,β} ∈ Sn, where 1 ≤ α < β ≤ n, inversions occur when (i, j) = (α, α + 1), (α, α + 2), ..., (α, β), (α + 1, β), ..., (β − 1, β). There are (β − α) + (β − α − 1) = 2(β − α) − 1 inversions. Hence τ_{α,β} is an odd permutation and sgn(τ_{α,β}) = −1.
In particular, sgn(τ_{α,α+1}) = −1 for 1 ≤ α ≤ n − 1.
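Counting inversions gives a direct implementation of sgn. A plain-Python sketch (the O(n²) double loop is fine for small n):

    def sgn(sigma):
        n = len(sigma)
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if sigma[i] > sigma[j])
        return 1 if inversions % 2 == 0 else -1

    print(sgn((3, 1, 4, 2)))   # -1: three inversions, so sigma is odd
    print(sgn((4, 1, 3, 2)))   # +1: four inversions, so tau is even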

Theorem 10.1.11 For any σ, τ ∈ Sn, sgn(στ) = sgn(σ) sgn(τ).

Proof First we show that for all σ ∈ Sn and 1 ≤ α ≤ n − 1, sgn(στ_{α,α+1}) = −sgn(σ). As

    σ = ( 1 ⋯ α−1 α α+1 α+2 ⋯ n ; σ(1) ⋯ σ(α−1) σ(α) σ(α+1) σ(α+2) ⋯ σ(n) )

and

    στ_{α,α+1} = ( 1 ⋯ α−1 α α+1 α+2 ⋯ n ; σ(1) ⋯ σ(α−1) σ(α+1) σ(α) σ(α+2) ⋯ σ(n) ),

the number of inversions in στ_{α,α+1} is

    (the number of inversions in σ) + 1  if σ(α) < σ(α+1),
    (the number of inversions in σ) − 1  if σ(α) > σ(α+1).

So

    sgn(στ_{α,α+1}) = −sgn(σ).                                   (10.1)

By Lemma 10.1.7, we can write τ = τ_{β1,β1+1} τ_{β2,β2+1} ⋯ τ_{βk,βk+1} for some β1, β2, ..., βk ∈ {1, 2, ..., n − 1}. By applying (10.1) repeatedly,

    sgn(τ) = sgn(τ_{β1,β1+1} τ_{β2,β2+1} ⋯ τ_{βk−1,βk−1+1} τ_{βk,βk+1})
           = −sgn(τ_{β1,β1+1} τ_{β2,β2+1} ⋯ τ_{βk−1,βk−1+1})
           ⋮
           = (−1)^{k−1} sgn(τ_{β1,β1+1})

and by Example 10.1.10.2, sgn(τ) = (−1)^k. On the other hand, by applying (10.1) repeatedly,

    sgn(στ) = sgn(στ_{β1,β1+1} τ_{β2,β2+1} ⋯ τ_{βk−1,βk−1+1} τ_{βk,βk+1})
            = −sgn(στ_{β1,β1+1} τ_{β2,β2+1} ⋯ τ_{βk−1,βk−1+1})
            ⋮
            = (−1)^k sgn(σ)
            = sgn(τ) sgn(σ).

Corollary 10.1.12
1. If σ ∈ Sn is a product of k transpositions, then sgn(σ) = (−1)^k.
2. A permutation is even (respectively, odd) if it is a product of an even (respectively, odd) number of transpositions.
3. For any σ ∈ Sn, sgn(σ⁻¹) = sgn(σ).

Proof Part 1 is a consequence of Example 10.1.10.2 and Theorem 10.1.11 while Part 2 is a consequence of Part 1. The proof of Part 3 is left as an exercise. See Question 10.4(c).

Section 10.2 Multilinear Forms

Definition 10.2.1 Let V be a vector space over a field F and let Vⁿ = V × ⋯ × V (n times). A mapping T : Vⁿ → F is called a multilinear form on V if for each i, 1 ≤ i ≤ n,

    T(u1, ..., u_{i−1}, av + bw, u_{i+1}, ..., un)
        = aT(u1, ..., u_{i−1}, v, u_{i+1}, ..., un) + bT(u1, ..., u_{i−1}, w, u_{i+1}, ..., un)   (10.2)

for all a, b ∈ F and u1, ..., u_{i−1}, u_{i+1}, ..., un, v, w ∈ V. If n = 2, T is also called a bilinear form on V.
A multilinear form T on V is called alternative if T(u1, u2, ..., un) = 0 whenever uα = uβ for some α ≠ β.
(If 1 + 1 ≠ 0 in F, a multilinear form T on V is alternative if and only if for all transpositions τ = τ_{α,β} ∈ Sn and u1, ..., un ∈ V, T(u1, u2, ..., un) = −T(u_{τ(1)}, u_{τ(2)}, ..., u_{τ(n)}). See Theorem 10.2.3 and Question 10.9.)

Example 10.2.2 In the following examples, vectors in Fⁿ are written as column vectors.
1. Let A ∈ Mn×n(F). Define T : Fⁿ × Fⁿ → F by

    T(u, v) = uᵀAv  for u, v ∈ Fⁿ.

Then T is a bilinear form on Fⁿ. (See Question 10.8.)

In particular, if A = [ 0 1 ; −1 0 ] ∈ M2×2(F), then

    T( (a, c)ᵀ, (b, d)ᵀ ) = ad − bc  for (a, c)ᵀ, (b, d)ᵀ ∈ F²,

which is an alternative bilinear form.

2. Define P : Mn×n(F) → F by

    P(A) = Σ_{σ∈Sn} a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n}  for A = (aij) ∈ Mn×n(F).

The value P(A) is known as the permanent of A.

Write A = ( a1 a2 ⋯ an ) where ai is the ith column of A. Since each ai is a vector in Fⁿ, P can be regarded as a mapping from Fⁿ × ⋯ × Fⁿ to F. Take any b, c ∈ F and a1, ..., a_{i−1}, a_{i+1}, ..., an, x, y ∈ Fⁿ where 1 ≤ i ≤ n. Let aj = (a1j, a2j, ..., anj)ᵀ, x = (x1, x2, ..., xn)ᵀ and y = (y1, y2, ..., yn)ᵀ. Then

    P( a1 ⋯ a_{i−1} bx + cy a_{i+1} ⋯ an )
      = Σ_{σ∈Sn} a_{σ(1),1} ⋯ a_{σ(i−1),i−1} (b x_{σ(i)} + c y_{σ(i)}) a_{σ(i+1),i+1} ⋯ a_{σ(n),n}
      = b Σ_{σ∈Sn} a_{σ(1),1} ⋯ a_{σ(i−1),i−1} x_{σ(i)} a_{σ(i+1),i+1} ⋯ a_{σ(n),n}
        + c Σ_{σ∈Sn} a_{σ(1),1} ⋯ a_{σ(i−1),i−1} y_{σ(i)} a_{σ(i+1),i+1} ⋯ a_{σ(n),n}
      = b P( a1 ⋯ a_{i−1} x a_{i+1} ⋯ an ) + c P( a1 ⋯ a_{i−1} y a_{i+1} ⋯ an ).

So P is a multilinear form on Fⁿ. Note that P is not alternative if 1 + 1 ≠ 0 in F.

3. Define Q : Mn×n(F) → F by

    Q(A) = Σ_{σ∈Sn} sgn(σ) a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n}  for A = (aij) ∈ Mn×n(F).

By Theorem 10.3.2, we shall learn that Q(A) is actually the determinant of A.
Following the same arguments as in Part 2, Q can be regarded as a multilinear form on Fⁿ.
Let A = ( a1 a2 ⋯ an ) where ai = (a1i, a2i, ..., ani)ᵀ. Suppose aα = aβ for some α ≠ β. Let τ = τ_{α,β}. Then a_{i,τ(j)} = a_{ij} for all i, j. For any σ ∈ Sn,

    a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n}
      = a_{σ(1),τ(1)} a_{σ(2),τ(2)} ⋯ a_{σ(n),τ(n)}
      = a_{σ(τ⁻¹(1)),1} a_{σ(τ⁻¹(2)),2} ⋯ a_{σ(τ⁻¹(n)),n}   (by rearranging the order of the terms)
      = a_{στ(1),1} a_{στ(2),2} ⋯ a_{στ(n),n}   (because τ⁻¹ = τ).

By Lemma 10.1.6.2, we have {στ | σ ∈ Sn} = Sn; and by (10.1) in the proof of Theorem 10.1.11, we have sgn(στ) = −sgn(σ) for all σ ∈ Sn. So

    Q(A) = Σ_{σ∈Sn} sgn(σ) a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n}
         = Σ_{σ∈Sn} sgn(σ) a_{στ(1),1} a_{στ(2),2} ⋯ a_{στ(n),n}
         = − Σ_{σ∈Sn} sgn(στ) a_{στ(1),1} a_{στ(2),2} ⋯ a_{στ(n),n}
         = −Q(A).

Suppose 1 + 1 ≠ 0 in F. Then Q(A) = 0. (Actually, it is also true for the case when 1 + 1 = 0 in F; see Question 10.10.)
Hence Q is an alternative multilinear form.
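Both P and Q translate directly into code. A sketch using Python's itertools and math modules, with a permutation stored as a 0-indexed tuple s so that the entry a_{σ(k),k} becomes A[s[k]][k]:

    from itertools import permutations
    from math import prod

    def sign(s):
        inv = sum(s[i] > s[j] for i in range(len(s)) for j in range(i + 1, len(s)))
        return 1 if inv % 2 == 0 else -1

    def permanent(A):
        n = len(A)
        return sum(prod(A[s[k]][k] for k in range(n)) for s in permutations(range(n)))

    def Q(A):
        n = len(A)
        return sum(sign(s) * prod(A[s[k]][k] for k in range(n))
                   for s in permutations(range(n)))

    A = [[1, 2], [3, 4]]
    print(permanent(A))   # 1*4 + 3*2 = 10
    print(Q(A))           # 1*4 - 3*2 = -2, the determinant of A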

Theorem 10.2.3 Let T : Vⁿ → F be an alternative multilinear form on a vector space V. Then for all σ ∈ Sn and u1, u2, ..., un ∈ V,

    T(u1, u2, ..., un) = sgn(σ) T(u_{σ(1)}, u_{σ(2)}, ..., u_{σ(n)}).

Proof By Corollary 10.1.12, we only need to show that for all τ = τ_{α,β}, where 1 ≤ α < β ≤ n, T(u1, u2, ..., un) = −T(u_{τ(1)}, u_{τ(2)}, ..., u_{τ(n)}).
In the following computation, we use T(..., x, ..., y, ...) to denote the value of T when the αth argument is x, the βth argument is y and the ith argument, for i ≠ α, β, is ui.
Since T is alternative,

    0 = T(..., uα + uβ, ..., uα + uβ, ...)
      = T(..., uα, ..., uα + uβ, ...) + T(..., uβ, ..., uα + uβ, ...)
      = T(..., uα, ..., uα, ...) + T(..., uα, ..., uβ, ...)
          + T(..., uβ, ..., uα, ...) + T(..., uβ, ..., uβ, ...)
      = T(..., uα, ..., uβ, ...) + T(..., uβ, ..., uα, ...).

Thus T(u1, u2, ..., un) = −T(u_{τ(1)}, u_{τ(2)}, ..., u_{τ(n)}).

Remark 10.2.4 Let T : Vⁿ → F be a multilinear form on a finite dimensional vector space V over a field F. Fix a basis {v1, v2, ..., vm} for V. Take any u1, u2, ..., un ∈ V and let

    u1 = a11 v1 + a21 v2 + ⋯ + a_{m1} vm,
    u2 = a12 v1 + a22 v2 + ⋯ + a_{m2} vm,
    ⋮
    un = a_{1n} v1 + a_{2n} v2 + ⋯ + a_{mn} vm,

where a11, a12, ..., a_{mn} ∈ F.

1. Let F be the set of all mappings from {1, 2, ..., n} to {1, 2, ..., m}. Then, applying (10.2) repeatedly (see Example 10.2.5), we have

    T(u1, u2, ..., un) = Σ_{f∈F} a_{f(1),1} a_{f(2),2} ⋯ a_{f(n),n} T(v_{f(1)}, v_{f(2)}, ..., v_{f(n)}).   (10.3)

2. Suppose T is an alternative form. Then in (10.3), T(v_{f(1)}, v_{f(2)}, ..., v_{f(n)}) = 0 whenever f is not injective, i.e. there exist α, β ∈ {1, 2, ..., n} such that α ≠ β and f(α) = f(β).
(a) If m < n, then T(u1, u2, ..., un) = 0 for all u1, u2, ..., un ∈ V, i.e. T is the zero mapping.
(b) If m ≥ n, then (10.3) still holds if we replace F by the set of all injective mappings from {1, 2, ..., n} to {1, 2, ..., m}.
In particular, when m = n, we can replace F by Sn. By Theorem 10.2.3, we get

    T(u1, u2, ..., un) = Σ_{σ∈Sn} a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n} T(v_{σ(1)}, v_{σ(2)}, ..., v_{σ(n)})
                       = Σ_{σ∈Sn} sgn(σ) a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n} T(v1, v2, ..., vn).   (10.4)

Example 10.2.5 Let T be a bilinear form on F². Take the standard basis {e1, e2} for F².
Then for all (a11, a21)ᵀ, (a12, a22)ᵀ ∈ F²,

    T( (a11, a21)ᵀ, (a12, a22)ᵀ )
      = T(a11 e1 + a21 e2, a12 e1 + a22 e2)
      = a11 T(e1, a12 e1 + a22 e2) + a21 T(e2, a12 e1 + a22 e2)
      = a11 a12 T(e1, e1) + a11 a22 T(e1, e2) + a21 a12 T(e2, e1) + a21 a22 T(e2, e2).

Furthermore, suppose T is an alternative bilinear form. Then T(e1, e1) = 0, T(e2, e2) = 0 and T(e2, e1) = −T(e1, e2). Hence

    T( (a11, a21)ᵀ, (a12, a22)ᵀ ) = (a11 a22 − a21 a12) T(e1, e2).

Section 10.3 Determinants

Definition 10.3.1 A mapping D : Mn×n(F) → F is called a determinant function on Mn×n(F) if it satisfies the following axioms:
(D1) By regarding the columns of matrices in Mn×n(F) as vectors in Fⁿ (see Example 10.2.2.2), D is a multilinear form on Fⁿ.
(D2) D(A) = 0 if A ∈ Mn×n(F) has two identical columns, i.e. as a multilinear form on Fⁿ, D is alternative.
(D3) D(In) = 1.

Theorem 10.3.2 There exists one and only one determinant function on Mn×n(F), and it is the function det : Mn×n(F) → F defined by

    det(A) = Σ_{σ∈Sn} sgn(σ) a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n}  for A = (aij) ∈ Mn×n(F).   (10.5)

This formula is known as the classical definition of determinants.

Proof Note that the function det is the function Q in Example 10.2.2.3. By Example 10.2.2.3 and Question 10.10, det is an alternative multilinear form on Fⁿ and hence it satisfies (D1) and (D2). Since In = (δij) where δii = 1 and δij = 0 if i ≠ j, for any σ ∈ Sn, δ_{σ(1),1} δ_{σ(2),2} ⋯ δ_{σ(n),n} = 0 whenever σ is not the identity mapping. Then

    det(In) = Σ_{σ∈Sn} sgn(σ) δ_{σ(1),1} δ_{σ(2),2} ⋯ δ_{σ(n),n} = δ11 δ22 ⋯ δnn = 1

and hence det satisfies (D3). So det is a determinant function on Mn×n(F).

On the other hand, suppose D is a determinant function on Mn×n(F). Take any A = (aij) ∈ Mn×n(F). By applying (10.4) to the standard basis {e1, e2, ..., en} for Fⁿ,

    D(A) = Σ_{σ∈Sn} sgn(σ) a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n} D( e1 e2 ⋯ en )
         = det(A) D(In)
         = det(A).

Hence D = det.

Example 10.3.3
1. In S1, there is only one permutation, σ = ( 1 ; 1 ). Let A = (a11) ∈ M1×1(F). Then

    det(A) = sgn(σ) a_{σ(1),1} = a11.

It coincides with Definition 2.5.2.

2. In S2, there are two permutations: σ1 = ( 1 2 ; 1 2 ) and σ2 = ( 1 2 ; 2 1 ).
Let A = [ a11 a12 ; a21 a22 ] ∈ M2×2(F). Then

    det(A) = sgn(σ1) a_{σ1(1),1} a_{σ1(2),2} + sgn(σ2) a_{σ2(1),1} a_{σ2(2),2} = a11a22 − a21a12.

This gives us the same formula as in Example 2.5.4.1.

3. In S3, there are six permutations: σ1 = ( 1 2 3 ; 1 2 3 ), σ2 = ( 1 2 3 ; 3 1 2 ), σ3 = ( 1 2 3 ; 2 3 1 ), σ4 = ( 1 2 3 ; 3 2 1 ), σ5 = ( 1 2 3 ; 1 3 2 ) and σ6 = ( 1 2 3 ; 2 1 3 ).
Let A = (aij) ∈ M3×3(F). Then

    det(A) = Σ_{j=1}^{6} sgn(σj) a_{σj(1),1} a_{σj(2),2} a_{σj(3),3}
           = a11a22a33 + a31a12a23 + a21a32a13 − a31a22a13 − a11a32a23 − a21a12a33.

This gives us the same formula as in Remark 2.5.5.
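Formula (10.5) can be cross-checked against a numerical determinant routine. A sketch assuming the NumPy library; the 3 × 3 matrix below is a made-up example:

    import numpy as np
    from itertools import permutations

    def sign(s):
        inv = sum(s[i] > s[j] for i in range(len(s)) for j in range(i + 1, len(s)))
        return 1 if inv % 2 == 0 else -1

    A = np.array([[2.0, -1.0,  3.0],
                  [0.0,  4.0,  1.0],
                  [5.0,  2.0, -2.0]])

    classical = sum(sign(s) * A[s[0], 0] * A[s[1], 1] * A[s[2], 2]
                    for s in permutations(range(3)))
    print(classical, np.linalg.det(A))   # the two values agree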

Lemma 10.3.4 Let A ∈ Mn×n(F). Then det(A) = det(Aᵀ).

Proof (Note that this lemma is the same as Theorem 2.5.10. However, since we need it in the proof of the cofactor expansions, Theorem 10.3.5, while the proof of Theorem 2.5.10 uses the property of cofactor expansions, we need to reprove the result using only what we have learnt in this chapter so far.)
Let A = (aij). For any σ ∈ Sn, by rearranging the order of the terms,

    a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n} = a_{1,σ⁻¹(1)} a_{2,σ⁻¹(2)} ⋯ a_{n,σ⁻¹(n)}.

By Lemma 10.1.6.1 and Corollary 10.1.12.3, we have {σ⁻¹ | σ ∈ Sn} = Sn and sgn(σ⁻¹) = sgn(σ). Thus

    det(A) = Σ_{σ∈Sn} sgn(σ) a_{σ(1),1} a_{σ(2),2} ⋯ a_{σ(n),n}
           = Σ_{σ∈Sn} sgn(σ) a_{1,σ⁻¹(1)} a_{2,σ⁻¹(2)} ⋯ a_{n,σ⁻¹(n)}
           = Σ_{σ∈Sn} sgn(σ⁻¹) a_{1,σ⁻¹(1)} a_{2,σ⁻¹(2)} ⋯ a_{n,σ⁻¹(n)}
           = Σ_{σ∈Sn} sgn(σ) a_{1,σ(1)} a_{2,σ(2)} ⋯ a_{n,σ(n)}
           = det(Aᵀ).

Theorem 10.3.5 (Cofactor Expansions) Let A = (aij) ∈ Mn×n(F). Define Ãij to be the (n − 1) × (n − 1) matrix obtained from A by deleting the ith row and the jth column. Then for any α = 1, 2, ..., n and β = 1, 2, ..., n,

    det(A) = a_{α1} A_{α1} + a_{α2} A_{α2} + ⋯ + a_{αn} A_{αn}
           = a_{1β} A_{1β} + a_{2β} A_{2β} + ⋯ + a_{nβ} A_{nβ}

where Aij = (−1)^{i+j} det(Ãij). (See Theorem 2.5.6.)

Proof For m = 1, 2, ..., n, define mappings Em : Mn×n(F) → F such that

    Em(A) = Σ_{k=1}^{n} (−1)^{m+k} a_{mk} det(Ãmk)  for A = (aij) ∈ Mn×n(F).

It can be checked that each Em is a determinant function (see Question 10.17). Thus by Theorem 10.3.2,

    det(A) = Eα(A) = Σ_{k=1}^{n} (−1)^{α+k} a_{αk} det(Ãαk) = a_{α1}A_{α1} + a_{α2}A_{α2} + ⋯ + a_{αn}A_{αn}.

Let B = Aᵀ = (bij). Note that bij = aji and B̃ij = (Ãji)ᵀ for all i, j. Applying the result above to B, by Lemma 10.3.4,

    det(A) = det(B) = Eβ(B) = Σ_{k=1}^{n} (−1)^{β+k} b_{βk} det(B̃βk)
           = Σ_{k=1}^{n} (−1)^{k+β} a_{kβ} det(Ãkβ)
           = a_{1β}A_{1β} + a_{2β}A_{2β} + ⋯ + a_{nβ}A_{nβ}.

Remark 10.3.6 By Example 10.3.3.1 and Theorem 10.3.5, the determinant function defined in Theorem 10.3.2 is the same as the determinant defined inductively in Definition 2.5.2.

Exercise 10

Question 10.1 to Question 10.5 are exercises for Section 10.1.

1. How many different permutations are there in S4? List all the permutations in S4 and determine their signs.
2. Let σ = ( 1 2 3 4 5 ; 2 3 5 4 1 ) and τ = ( 1 2 3 4 5 ; 5 4 3 2 1 ).
(a) Write down σ⁻¹, τ⁻¹, στ and τσ.
(b) Find x, y ∈ S5 such that xσ = τ and σy = τ.

(c) Compute sgn(σ), sgn(σ⁻¹), sgn(τ), sgn(τ⁻¹), sgn(στ) and sgn(τσ).
(d) Decompose σ into a product of transpositions of the form τ_{α,α+1} as stated in Lemma 10.1.7.
3. Let σ ∈ Sn. Prove that

    sgn(σ) = ( Π_{1≤i<j≤n} (σ(i) − σ(j)) ) / ( Π_{1≤i<j≤n} (i − j) ).

4. Prove Lemma 10.1.6 and Corollary 10.1.12.3:

(a) Show that {σ⁻¹ | σ ∈ Sn} = Sn.
(b) For any τ ∈ Sn, show that {τσ | σ ∈ Sn} = {στ | σ ∈ Sn} = Sn.
(c) For any σ ∈ Sn, prove that sgn(σ⁻¹) = sgn(σ).

5. Let O1 = {σ ∈ Sn | sgn(σ) = 1} and O2 = {σ ∈ Sn | sgn(σ) = −1} where n ≥ 2.

(a) Prove that for any τ ∈ O2, O2 = {τσ | σ ∈ O1}.
(b) Find |O1| and |O2|.

Question 10.6 to Question 10.14 are exercises for Section 10.2.

6. Determine which of the following are bilinear forms on R2 .


(a) T (u; v ) = 0 for u; v 2 R2 .
(b) T (u; v ) = 2 for u; v 2 R2 .
(c) T (u; v ) = u1 + u2 + v1 + v2 for u = (u1 ; u2 ); v = (v1 ; v2 ) 2 R2 .
(d) T (u; v ) = u1 v2 + u2 v1 for u = (u1 ; u2 ); v = (v1 ; v2 ) 2 R2 .
(e) T (u; v ) = u1 u2 + v1 v2 for u = (u1 ; u2 ); v = (v1 ; v2 ) 2 R2 .

7. Let V be a vector space over a eld F. A bilinear form T on V is called symmetric if


T (u; v) = T (v; u) for all u; v 2 V . Determine which of the following bilinear forms on
R2 are alternative and/or symmetric.
(a) T (u; v ) = 0 for u = (u1 ; u2 ); v = (v1 ; v2 ) 2 R2 .
(b) T (u; v ) = u1 v2 u2 v1 for u = (u1 ; u2 ); v = (v1 ; v2 ) 2 R2 .
(c) T (u; v ) = u1 v2 + u2 v1 for u = (u1 ; u2 ); v = (v1 ; v2 ) 2 R2 .
(d) T (u; v ) = 2u1 v1 + u1 v2 + u2 v1 + 3u2 v2 for u = (u1 ; u2 ); v = (v1 ; v2 ) 2 R2 .
Exercise 10 103

(e) T (u; v ) = 2u1 v1 u1 v2 + u2 v1 + 3u2 v2 for u = (u1 ; u2 ); v = (v1 ; v2 ) 2 R2 .

8. (In this question, vectors in Fn are written as column vectors.)


Let A 2 Mnn (F). De ne T : Fn  Fn ! F by

T (u; v) = uT Av for u; v 2 Fn :
Show that T is a bilinear form on Fn .

9. If 1 + 1 6= 0 in F, prove that a multilinear form T on V is alternative if and only if for all


transposition  =  ; 2 Sn , 1  <  n, and u1 ; : : : ; un 2 V , T (u1 ; u2 ; : : : ; un ) =
T (u (1) ; u (2) ; : : : ; u (n) ).
(Hint: The ()) part has already been shown in Theorem 10.2.3. You only need to prove
the (() part.)

10. Prove that the multilinear form Q in Example 10.2.2.3 is alternative when 1 + 1 = 0 in
F. (Hint: See Question 10.5.)
11. Let V be a vector space over a eld F and T : V n ! F an alternative multilinear form. If
u1; u2; : : : ; un are linearly dependent vectors in V , prove that T (u1; u2; : : : ; un) = 0.
12. Suppose V is nite dimensional. Let B be an ordered basis for V and T a bilinear form
on V .

(a) Show that there exists a square matrix A over F such that T (u; v ) = ([u]B )T A[v ]B
for all u; v 2 V .
(b) Prove that T is symmetric if and only if the matrix A in (a) is symmetric. (See
Question 10.7 for the de nition of symmetric bilinear forms.)
(c) If 1 + 1 6= 0 in F, prove that T is alternative if and only if the matrix A in (a) is
skew symmetric.

13. Let V be a vector space over a eld F. A function Q : V ! F is called a quadratic form
on V if it satis es the following two axioms.

(Q1) For all c 2 F and u 2 V , Q(cu) = c2 Q(u).


(Q2) The mapping H : V 2 ! F de ned by

H (u; v) = Q(u + v) Q(u) Q(v) for u; v 2 V


is a bilinear form.
104 Chapter 10. Multilinear Forms and Determinants

(a) Suppose 1 + 1 6= 0 in F. Prove that a mapping Q : V ! F is a quadratic form on V if


and only if there exists a symmetric bilinear form T on V such that Q(u) = T (u; u)
for all u 2 V .
(b) Give an example of a quadratic form Q on V = F22 such that we cannot nd a
symmetric bilinear form T on V such that Q(u) = T (u; u) for all u 2 V .

14. Suppose V is nite dimensional vector space over R with n = dim(V ). Let Q be
a quadratic form on V . Prove that there exist 1 ; 2 ; : : : ; n 2 R and a basis B =
fw1; w2; : : : ; wng for V such that
Q(u) = 1 a21 + 2 a22 +    + n a2n for u = a1 w1 + a2 w2 +    + an wn 2 V:

(Hint: See Section 6.4.)

Question 10.15 to Question 10.23 are exercises for Section 10.3.


! !
1 2 3 4 5 1 2 3 4 5
15. Let  = ,= and
2 3 5 4 1 5 4 3 2 1
0 1
3 1 9 3 1
B C
B 1 2 1 7 1C
B C
(aij )55 = B
B 0 6 5 1 C:
1C
B C
@ 4 3 2 1 1A
2 3 2 9 2
Compute

(a) sgn( ) a(1);1 a(2);2 a(3);3 a(4);4 a(5);5 and


(b) sgn( ) a (1);1 a (2);2 a (3);3 a (4);4 a (5);5 .

16. Use the formula in Theorem 10.3.2 to write down a formula for det(A) where A = (aij )44 .

17. (This question is part of the proof of cofactor expansions, Theorem 10.3.5. You should
not use the properties of determinants learnt from Section 2.5.)
Let Em : Mnn (F) ! F, 1  m  n, be the mapping de ned in Theorem 10.3.5, i.e.
n
X
Em (A) = ( 1)m+k amk det(A
e mk ) for A = (aij ) 2 Mnn (F):
k=1
Prove that Em is a determinant function.
Exercise 10 105

18. Reprove Theorem 10.3.5 using the classical de nition of determinants:


Let A = (aij ) 2 Mnn (F). De ne A e ij to be the (n 1)  (n 1) matrix obtained from
A by deleting the ith row and the j th column. Using the formula (10.5), prove that for
any = 1; 2; : : : n and = 1; 2; : : : n,

det(A) = a 1 A 1 + a 2 A 2 +    + a n A n
= a1 A1 + a2 A2 +    + an An

where A = ( 1) +
det(A
e ).

(Hint: First, use Theorem 10.2.3 and Lemma 10.3.4 to nd out what happen to the
determinant if we interchanging two columns or two rows of A.)

19. Let D be an (m + n)  (m + n) matrix such that


!
D= A B
0nm C
where A is an m  m matrix, B is an m  n matrix and C is an n  n matrix. Prove
that det(D) = det(A) det(C ).

20. Let W be an (m + n)  (m + n) matrix such that


!
W= X Y
Z 0nm
where X is an m  n matrix, Y is an m  m matrix and Z is an n  n matrix. Prove
that det(W ) = ( 1)mn det(Y ) det(Z ).

21. Let A be an n  n matrix. Explain why p(x) = det(xIn A) is a polynomial of degree n.


Also, show that the coecient of xn in p(x) is equal to 1, the coecient of xn 1 is equal
to tr(A) and the constant term is equal to ( 1)n det(A).
(Note that p(x) is the characteristic polynomial of A, see De nition 6.1.6.)

22. Let a1 ; a2 ; : : : ; an be elements of a eld. Prove that the value of the Vandermonde deter-
minant
1 1  1
  a1 a2  an
det (aij 1 )nn = .. .. ..
. . .
a1n 1
a2n 1
: : : ann 1

Y
is equal to (aj ai ).
i<j n
1
106 Chapter 10. Multilinear Forms and Determinants

23. Let a0 ; a1 ; : : : ; an 1 and b be elements of a eld. Prove that

b a0
1 b 0 a1
1 b a2
... ... .. = a0 a1 b    ak bk
1
1
+ bk :
.
... b ak 2
0
1 b ak 1
Chapter 11

Diagonalization and Jordan


Canonical Forms

Section 11.1 Eigenvalues and Diagonalization

Discussion 11.1.1 In Section 6.2, we have studied the problem of diagonalization of real
square matrices. The procedure to diagonalize square matrices over other elds is exactly the
same. By Remark 9.3.12, we learnt that to diagonalize a square matrix A 2 Mnn (F) is the
same as to nd an ordered basis B for Fn such that the matrix for LA relative to B is a diagonal
matrix. In this section, we shall restate a few important results in Chapter 6 in terms of linear
operators.

De nition 11.1.2 Let V be a vector space and T : V ! V a linear operator. A nonzero


vector u 2 V is called an eigenvector of T if T (u) 2 spanfug, i.e. T (u) = u for some scalar
. The scalar  is called an eigenvalue of T and u is said to be an eigenvector of T associated
with the eigenvalue .

Example 11.1.3

1. Let T : R2 ! R2 be the linear operator de ned by T ((x; y )) = (x + 3y; x y) for


(x; y ) 2 R2 . Since

T ((3; 1)) = (6; 2) = 2(3; 1) and T ((1; 1)) = ( 2; 2) = 2(1; 1);

2 and 2 are eigenvalues of T , (3; 1) is an eigenvector of T associated with 2 and (1; 1)


is an eigenvector associated with 2.
108 Chapter 11. Diagonalization and Jordan Canonical Forms

2. We de ne the shift operator S on the vector space of in nite sequences over a eld F such
that
S ((an )n2N ) = (an+1 )n2N = (a2 ; a3 ; a4 ; : : : ) for (an )n2N = (a1 ; a2 ; a3 ; : : : ) 2 FN :
For any  2 F, let a = (n 1 )n2N = (1; ; 2 ; : : : ) 2 FN . Then
S (a ) = (n )n2N = (; 2 ; 3 ; : : : ) = (1; ; 2 ; : : : ) = (n 1 )n2N = a :
Thus every scalar  is an eigenvalue of S and a is an eigenvector of S associated with .
3. Let [a; b], with a < b, be a closed interval on the real line and let D : C 1 ([a; b]) !
C 1 ([a; b]) be the di erential operator de ned in Example 9.1.4.6. For any  2 R, let
f 2 C 1 ([a; b]) be the function de ned by f (x) = ex for x 2 [a; b]. Then
dex
D(f )(x) = = ex = f (x) for all x 2 [a; b]
dx
) D(f ) = f :
Thus every real number  is an eigenvalue of D and f is an eigenvector of D associated
with .

De nition 11.1.4 Let T be a linear operator on a nite dimensional vector space V where
dim(V )  1. The determinant of T , denoted by det(T ), is de ned to be the determinant of the
matrix [T ]B where B is any ordered basis for V .

Remark 11.1.5 Let T be a linear operator on a nite dimensional vector space V where
dim(V )  1. Suppose B and C are two ordered bases for V . By Theorem 9.3.10, [T ]B and [T ]C
are similar, i.e. [T ]B = P 1 [T ]C P for some invertible matrix P . Then
det([T ]B ) = det(P 1
[T ]C P ) = det(P 1
) det([T ]C ) det(P )
= det(P ) 1
det([T ]C ) det(P ) = det([T ]C ):
So the de nition of det(T ) is independent of the choice of the basis B .

Theorem 11.1.6 Let T be a linear operator on a nite dimensional vector space V where
dim(V )  1. For a scalar , let IV T be the linear operator de ned by (IV T )(u) =
u T (u) for u 2 V .
1.  is an eigenvalue of T if and only if det(IV T ) = 0.
(The equation det(xIV T ) = 0 is called the characteristic equation of T and the poly-
nomial det(xIV T ) is called the characteristic polynomial of T .)
2. u2V is an eigenvector of T associated with  if and only if u is a nonzero vector in
Ker(T IV ) (= Ker(IV T )).
(The subspace Ker(T IV ) of V is called the eigenspace of T associated with .)
Section 11.1. Eigenvalues and Diagonalization 109

Proof The proof of Part 1 follows the same argument as the proof for Remark 6.1.5. For Part
2, let u 2 V ,

T (u) = u , (T IV )(u) = T (u) u = 0 , u 2 Ker(T IV ):


So u is an eigenvector of T associated with  if and only if u is a nonzero vector in Ker(T IV ).

Notation 11.1.7 Let T be a linear operator on a nite dimensional vector space V where
dim(V )  1. We use cT (x) to denote the characteristic polynomial of T , i.e. cT (x) = det(xIV T ).
For an eigenvalue  of T , we use E (T ) to denote the eigenspace of T associated with , i.e.
E (T ) = Ker(T IV ).
Also, for an n  n matrix A, we use cA (x) to denote the characteristic polynomial of A. For
an eigenvalue  of A, we use E (A) to denote the eigenspace of A associated with . (See
De nition 6.1.6 and De nition 6.1.11.)

Remark 11.1.8 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) = n  1. Take any basis B for V . Then

cT (x) = det(xIV T ) = det([xIV T ]B ) = det(x[IV ]B [T ]B ) = det(xIn [T ]B ) = c[T ]B (x):

Thus the characteristic polynomial of T is the same as the characteristic polynomial of the
matrix [T ]B . By the result of Question 10.21, cT (x) is a monic polynomial of degree n.

Example 11.1.9 Let V be a nite dimensional real vector space with a basis B = fv1 ; v2 ; v3 g
Let T : V ! V be linear operator such that

T (v1 ) = v1 ; T (v2 ) = 2v1 v2 2v3 and T (v3 ) = 2v1 + 2v2 + 3v3

Then
0 1
 1 2 2
[T ]B = [T (v1 )]B [T (v2 )]B [T (v3 )]B = @0
B C
1 2A :
0 2 3

Thus
x 1 2 2
cT (x) = c[T ]B (x) = det(xI [T ]B ) = 0 x+1 2 = (x 1)3
0 2 x 3
and hence T has only one eigenvalue 1.
110 Chapter 11. Diagonalization and Jordan Canonical Forms

For a; b; c 2 R,
av1 + bv2 + cv3 2 E1 (T ) , (T IV )(av1 + bv2 + cv3) = 0
, [T IV ]B [av1 + bv2 + cv3]B = [0]B
20 1 0 130 1 0 1
1 2 2 1 0 0 a 0
, 6B
4@0 1 2A
C B C7B C B C
@0 1 0A5@ b A = @0A
0 2 3 0 0 1 c 0
0 10 1 0 1
0 2 2 a 0
, B
@0
CB C B C
2 2A@ b A = @0A
0 2 2 c 0
0 1 0 1 0 1
a 1 0
, @ b A = s@0A + t@1A for s; t 2 R:
B C B C B C

c 0 1
So E1 (T ) = spanfv1 ; v2 + v3 g.

De nition 11.1.10 Let T be a linear operator on a nite dimensional vector space V where
dim(V )  1. Then T is called diagonalizable if there exists an ordered basis B for V such that
[T ]B is a diagonal matrix.

Theorem 11.1.11 Let T be a linear operator on a nite dimensional vector space V where
dim(V )  1. Then T is diagonalizable if and only if V has a basis B such that every vector in
B is an eigenvector of T .
Proof The proof follows the same argument as the proof for Theorem 6.2.3.

Algorithm 11.1.12 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) = n  1. We want to determine whether T is diagonalizable and if it is diagonalizable,
nd an ordered basis so that the matrix of T relative to this basis is a diagonal matrix. (See
Algorithm 6.2.4 and Remark 6.2.5.)
Step 1: Find a basis C for V and compute the matrix A = [T ]C .
Step 2: Factorize the characteristic polynomial cT (x) = cA (x) into linear factors (if possible),
i.e. to express it in the form
cA (x) = (x 1 )r1 (x 2 )r2    (x k )rk
where 1 ; 2 ; : : : ; k are distinct eigenvalues of A and r1 + r2 +    + rk = n. If we cannot
factorize cA (x) into linear factors, T is not diagonalizable.
Step 3: For each eigenvalue i , nd a basis Bi for the eigenspace Ei (T ) = Ker(T i IV ).
If jBi j < ri for some i, T is not diagonalizable. (See Theorem 11.5.10.)
Section 11.1. Eigenvalues and Diagonalization 111

Step 4: If T can pass the tests in Step 2 and Step 3, it is diagonalizable.


Let B = B1 [ B2 [    [ Bk . Then B is a basis for V and D = [T ]B is a diagonal
matrix.
Note that D = P 1
AP where P = [IV ]C;B is the transition matrix from B to C .

Example 11.1.13

1. Let T1 : R2 ! R2 be the linear operator de ned by

T1 ((x; y)) = (y; x) for (x; y) 2 R2 :

Step 1: Take the standard basis C = f(1; 0); (0; 1)g for R2 . Then
!
0 1
A = [T ]C =
1
1 0
:

Step 2: The characteristic polynomial is

x 1
cA (x) = det(xI A) = = 1 + x2
1 x

which cannot be factorized into linear factors over R. So T1 is not diagonalizable.

2. Let T2 : P2 (C) ! P2 (C) be linear operator de ned by

T2 (a + bx + cx2 ) = (4a b + c) + (a + 2b c)x i cx2 for a + bx + cx2 2 P2 (C):

Step 1: Take the standard basis C = f1; x; x2 g for P2 (C). Then


0 1
4 1 1
A = [T2]C = B
@1 2
C
1A :
0 0 i

Step 2: The characteristic polynomial is

x 4 1 1
cA (x) = det(xI A) = 1 x 2 1 = (x 3)2 (x + i):
0 0 x+i

Thus 3 and i are the eigenvalues.


112 Chapter 11. Diagonalization and Jordan Canonical Forms

Step 3: To nd a basis for E3 (T2 ):

a + bx + cx2 2 E3 (T2 ) , (T 2 3IP2 (C) )(a + bx + cx2 ) = 0


, [T 2 3IP2 (C) ]C [a + bx + cx2 ]C = [0]C
20 1 0 130 1 0 1
4 1 1 1 0 0 a 0
, 6B
4@1 2 1A
C B C7B C B C
3 @0 1 0A5@ b A = @0A
0 0 i 0 0 1 c 0
0 10 1 0 1
1 1 1 a 0
, B
@1 1
CB C B C
1 A@ b A @0A
=
0 0 3 i c 0
0 1 0 1
a 1
, @ b A = t@1A for t 2 C:
B C B C

c 0

Thus B3 = f1 + xg is a basis for E3 (T2 ).

As (x 3)2 is a factor of cA (x) and jB3 j = 1 < 2, T2 is not diagonalizable.

3. Let T3 : M22 (R) ! M22 (R) such that


!! ! !
a b 2a + b 2c d a+b c a b
T3 = for 2 M  (R):
2 2
c d 2a b + 2 c + d a+c+d c d
( ! ! ! !)
1 0 0 1 0 0 0 0
Step 1: Take the standard basis C = ; ; ; for M22 (R).
0 0 0 0 1 0 0 1
Then
0 1
2 1 2 1
B C
B 1 1 1 0C
A = [T ]C =
3
B
B 2 1 2 1C
C:
@ A
1 0 1 1

Step 2: The characteristic polynomial is

x+2 1 1 2
1 x 1 0 1
cA (x) = det(xI A) = = x2 (x 1)2 :
2 1 x 2 1
1 0 1 x 1

Thus 0 and 1 are the eigenvalues.


Section 11.1. Eigenvalues and Diagonalization 113

Step 3: To nd a basis for E0 (T3 ):


! !! !
a b a b 0 0
2 E (T ) ,
0 3 (T3 0IM22 (R) ) =
c d c d 0 0
" !# " !#
a b 0 0
, [T ]C 3 =
c d C
0 0 C
0 10 1 0 1
2 1 2 1 a 0
B CB C B C
B 1 1 1 0 CB b C B0C
CB C B
, B
B B C=B C
C
@ 2 1 2 1C A@ c A @0A
1 0 1 1 d 0
0 1 0 1 0 1
a 1 1
B C B C B C
BbC B 0C B 1C
, B C = sB
BcC B
C + tB C for s; t 2 R:
@ A @
C
1A B @ 0C A
d 0 1
( ! !)
1 0 1 1
Thus B0 = ; is a basis for E0 (T3 ).
1 0 0 1
( ! !)
1 1 0 1
Similarly, B1 = ; is a basis for E1 (T3 ).
1 0 0 1
( ! ! ! !)
1 0 1 1 1 1 0 1
Step 4: T3 is diagonalizable. Let B = ; ; ; .
1 0 0 1 1 0 0 1
We have 0 1
0 0 0 0
B C
B0 0 0 0C
D = [T3]B = B
B0 0 1
C:
0C
@ A
0 0 0 1
0 1
1 1 1 0
B C
B 0 1 1 1C
Let P = [IM22 (R) ]C;B = B
B
C. Then D = P 1
AP .
@ 1 0 1 0CA
0 1 0 1

Remark 11.1.14 In Algorithm 11.1.12, we need to factorize the characteristic polynomial into
linear factors. Sometimes, over a certain eld F, say F = R, we may not be able to factorize
some polynomials in this manner. Luckily, we can always nd a bigger eld that contains F
and at the same time, all polynomials over this bigger eld can be factorized into linear factors.
In particular, the eld C contains R and all polynomials over C can be factorized into linear
factors.

Example 11.1.15 In Example 11.1.13.1, the characteristic polynomial cannot be factorized


into linear factors over R. We can extend the linear operator T1 on R2 to a linear operator on
114 Chapter 11. Diagonalization and Jordan Canonical Forms

C , i.e.
2

T1 ((x; y)) = (y; x) for (x; y) 2 C2 :


! standard basis C = f(1; 0); (0; 1)g for C , we get the same matrix
Using the 2
A = [T ]C
1 =
0 1
. But over C,
1 0
cA (x) = 1 + x2 = (x i)(x + i)
and hence i and i are the eigenvalues of A. Following Algorithm 11.1.12, we nd a new basis
!
i 0
B = f(1; i); (1; i)g for C 2
such that D = [T1 ]B = .
0 i
! !
1 1 i 0
Let P = [IC2 ]C;B = . Then P 1
AP = .
i i 0 i

Section 11.2 Triangular Canonical Forms

Discussion 11.2.1 Although not all square matrices are diagonalizable, we can still reduce
them into simpler form provided that the eld used is big enough (see Remark 11.1.14). In this
section, we shall see that if the characteristic polynomial of a square matrix can be factorized
into linear factors, then the matrix is similar to an upper triangular matrix. In particular, every
complex square matrix is similar to an upper triangular matrix.

Lemma 11.2.2 Suppose A is an r  m matrix, B is an r  n matrix, C is an s  m matrix,


D is an s  n matrix, E is an m  t matrix, F is an m  u matrix, G is an n  t matrix, H is
an n  u matrix. Then
! ! !
A B E F =
AE + BG AF + BH :
C D G H CE + DG CF + DH
(To do such a matrix multiplication, you need to make sure that the sub-matrices can be
multiplied with each other.)
Proof The lemma can be shown easily by applying the results in Question 2.23.

Theorem 11.2.3 (Triangular Canonical Forms) Let F be a eld.


1. Let A 2 Mnn (F). If the characteristic polynomial of A can be factorized into linear
factors over F, then there exists an invertible matrix P 2 Mnn (F) such that P 1 AP is
an upper triangular matrix.
2. Let T be a linear operator on a nite dimensional vector space V over F where dim(V )  1.
If the characteristic polynomial of T can be factorized into linear factors over F, then there
exists an ordered basis B for V such that [T ]B is an upper triangular matrix.
Section 11.2. Triangular Canonical Forms 115

Proof We prove Part 1 by induction on n. (In the following, vectors in Fn are written as
column vectors.)
Since 1  1 matrices are upper triangular matrices, the result is true for n = 1.
Assume that if the characteristic polynomial of a matrix B 2 Mkk (F) can be factorized into
linear factors over F, then there exists an invertible matrix Q 2 Mkk (F) such that Q 1 BQ
is an upper triangular matrix.
Now, let A 2 M(k+1)(k+1) (F) such that cA (x) can be factorized into linear factors over F. Let
E be the standard basis for Fk+1 . As cA (x) has at least one linear factor x  for some  2 F,
A has at least one eigenvalue .
Let v be an eigenvector of Aassociated with . Extend fv g to a basis D = fv ; u1 ; : : : ; uk g for
Fk+1. Let R = [IFk+1 ]E;D = v u1    uk . Then
0 1
 b1  k b
B C
R AR = [IFk
1
+1 ]D;E [LA ]E [IFk+1 ]E;D = [LA ]D = B
B.
0
@ .. B
C
C
A
0

for some b1 ; : : : ; bk 2 F and B 2 Mkk (F). As cA (x) = cR 1 AR (x) = (x )cB (x), cB (x)
can also be factorized into linear factors over F. By the inductive assumption, there exists an
invertible matrix Q 2 Mkk (F) such that Q 1 BQ is an upper triangular matrix.
0 1
1 0  0
B C
Let P = R B 0 C
C. Then
B.
@ .. Q A
0
0 11 0 1
1 0  0 1 0  0
B C B C
P AP = B
1
B.
@.
0
Q
C
C
A
R AR B
1
B.
@.
0
Q
C
C
A
. .
0 0
0 10 10 1
1 0  0  b1  k 1
b 0  0
B CB CB C
=B 0 CB 0 CB0 C
B.
@ .. Q 1 CB .
A@ .. B CB .
A@ .. Q C
A
0 0 0
0 1
 (b1    k )Q
b
B C
B0 C
= B.
@ .. Q BQ 1 C
A
0

which is an upper triangular matrix. (Use Lemma 11.2.2 for the block matrix multiplications
preformed above.)
So by the mathematical induction, Part 1 of the theorem is true for all positive integer n.
116 Chapter 11. Diagonalization and Jordan Canonical Forms

For Part 2, we apply the result of Part 1 to A = [T ]C , where C is an ordered basis for V . There
exists an invertible matrix P such that D = P 1 [T ]C P is an upper triangular matrix. Then
by Theorem 9.3.10, there exists an ordered basis B such that [T ]B = D is an upper triangular
matrix.
(See Question 11.23 for an alternative proof of Part 2.)
0 1
8 6 2 2
B C
B 4 3 1 1C
Example 11.2.4 Let A = B
B
C be a real matrix. Note that cA (x) = (x 8)4 .
@ 8 6 10 2 C
A
4 1 3 11
The matrix A has an eigenvector (0; 0; 1; 1)T .
Extend f(0; 0; 1; 1)T g to a basis f(0; 0; 1; 1)T ; (1; 0; 0; 0)T ; (0; 1; 0; 0)T ; (0; 0; 0; 1)T g for R.
4

0 1
0 1 0 0
B C
B 0 0 1 0C
Let S = B
B
C. Then
@ 1 0 0 0C
A
1 0 0 1 0 1
8 8 6 2
B C
B0 8 6 2C
S 1
AS = B
B0 4 3 1C
C:
@ A
0 4 7 13
0 1
8 6 2
The matrix B = @4
B C
3 1 A has an eigenvector (2; 1; 3)T .
4 7 13
0 1
2 0 0
Extend f(2; 1; 3)T g to a basis f(2; 1; 3)T ; (0; 1; 0)T ; (0; 0; 1)T g for R3 . Let R = B
@
C
1 1 0A.
3 0 1
0 1
Then 8 3 1
R 1
BR = @0
B
6 2 A:
C

0 2 10
!
6 2
The matrix C = has an eigenvector (1; 1)T .
2 10
!
1 0
Extend f(1; 1)T g to a basis f(1; 1)T ; (0; 1)T g for R2 . Let Q = . Then
1 1
!
Q 1
CQ = 80 28 :
Now, let us trace what we have done backward following the proof of Theorem 11.2.3.
Section 11.3. Invariant Subspaces 117

0 1 0 10 1 0 1
1 0 0 2 0 0 1 0 0 2 0 0
Let U = R B 0
C
=B CB C B C
1 1 0A@0 1 0A = @ 1 1 0A. Then
@
0
Q A @
3 0 1 0 1 1 3 1 1
0
8 ( 3 Q 1 08 1) 4 1
1

U BU = B
1
@ 0
C B
Q CQ A = @00 1
8
C
2 A:
0 0 8
0 1 0 10 1 0 1
1 0 0 0 0 1 0 0 1 0 0 0 0 2 0 0
B C B CB C B C
B 0 C B 0 0 1 0C
CB0
B 2 0 0C B 0 1 1 0C
Let P = S B C = B C=B C. Then
B
@ 0 U C
A
B
@ 1 0 0 C
0AB
@0 1 1 0A B
C
@ 1 0 0 0C
A
0 1 0 0 1 0 3 1 1 1 3 1 1
0 1 0 1
B
8 (8 6 2) U C B
8 16 8 2
C
B 0 C B0 8 4 1C
P AP =
1 B
B
0 U BU 1
C=B
C B0 0 8 2C
C:
@ A @ A
0 0 0 0 8

Remark 11.2.5 Theorem 11.2.3 is still true if we change \upper triangular matrix" to \lower
triangular matrix".

Section 11.3 Invariant Subspaces

De nition 11.3.1 Let V be a vector space and T : V ! V a linear operator. A subspace W


of V is said to be T -invariant if T (u) is contained in W for all u 2 W , i.e.
T [W ] = fT (u) j u 2 W g  W:
If W is a T -invariant subspace of V , the linear operator T jW : W !W de ned by
T jW (u) = T (u) for u 2 W
is called the restriction of T on W .
......
........
.........................
......
T - ......
.........
.......................
......

V V
..... .....
..... .... ..... ....
.... .... .... ....
...... .... ...... ....
.. . ... ....
. .. . ... ....
....
....
.
..... ..... ..... ..
....
.....
...
...
.
.
.. ................................................... .. .
.
..
............................... ..
.. .... ................ .... .... ..
... ....... . . .... ... ........ . . .
.. ............... ...
.... ..
.
. ..... ..
... ..... .. .. ........... .
.
. ..
.... . . .
.
..
..
. . .................
..
. .
.
.... . . W . . .....
..
.. ............. ..... . . W . . ......
.
..
..
. . .. .............. .
..
. ... . . . . . . . . . .... .. ..
. .. . ................... . . . . . ... ..
............... .. ..
... ... .. ..
.
.
. ...
..... ..
. . ..

jW-
.. .
... .. . . . . . . . . . . ...
.
. .. ... .. . . ............................. . . . ...
.
..
..
.. .. . . . . . . . . . . ... ..
. .. .. . . ....................... . . ...
. ..
. . .. .. . . . .. .
.. ..
... ... .
. .
. T .
.
. .
.
. .
.
. .. .
. ..
.. ... . . . . . . . . . . .... .. .. ... . . .............................. . . ... ..
.
...
..
..
.
... . . . . . . . . . . ....
..
... . . . . . . . . . . ....
. .
.
.
..
..
.
.
...
..
..
..
.
... . . ...
..
..
. T [W ]
... . . ............................. . . ...
.. .
.. . . ..
. .
. ..
..
..
..
.. .. . . .
. . .
.. .. . . . . . . . . . . . . . .. ... . . .................... . . .. . .
.. .. .
. ..
. .. .. ..
.. .. . .
.
.
.. .. .. ..
.. ... . . . . . . . . . ... ..
. .. .. . . ............................... . . . ... ..
.. .. . .. .. .. . . . ..
.. .. . . . . . . . . . ... .. .. .. . . ............................ . . . ... ..
.. .. .. . ...... ..
.. .. .. .. .. .............................. ..
.. ..
.. .... . . . . . . . .... ..
. . . .. ..
.... . .. . . . . . . . . .. ..
.. ....
.... . . . . . . .......
.
.. .......... ... ....
.... . . . . . . ......
.
.. .............. .
..
.. .... . .. ................. .. .... .. ..
.. ..... .... ............... .. ..... .... ..
..
... ....... . ................................... ... ..
.. ...... . . . .........
................... ..
.................... ...
.... ....
. .... .
....
.... .... .... ...
.... ... .... ....
.... .... .... ....
.... .... .... ....
..... ..... ..... ....
.....
....... .... ..... ......
.. .. .......
.
........................... . ......... .
......................
118 Chapter 11. Diagonalization and Jordan Canonical Forms

Example 11.3.2
1. Let T : R3 ! R3 be the linear operator de ned by T ((x; y; z )) = (y; x; z ) for (x; y; z ) 2
R3. Note that T is a rotation about the z-axis. (See Section 7.3.)
(a) Let W1 be the xy -plane in R3 , i.e. W1 = f(x; y; 0) j x; y 2 Rg. Since for all (x; y; 0) 2
W1 ,
T ((x; y; 0)) = (y; x; 0) 2 W1 ;
W1 is T -invariant. The restriction T jW1 of T on W1 is equivalent a rotation de ned
on the xy -plane.
(b) Let W2 be the z -axis in R3 , i.e. W2 = f(0; 0; z ) j z 2 Rg. Since for all (0; 0; z ) 2 W2 ,
T ((0; 0; z )) = (0; 0; z ) 2 W2 ;
W2 is T -invariant. The restriction T jW2 of T on W2 is the identity operator.
(c) Let W3 be the yz -plane in R3 , i.e. W3 = f(0; y; z ) j y; z 2 Rg. It is not T -invariant.
For example, (0; 1; 1) 2 W3 but T ((0; 1; 1)) = (1; 0; 1) 2= W3 .
Let E = fe1 ; e2 ; e3 g be the standard basis for R3 , where e1 = (1; 0; 0), e2 = (0; 1; 0),
e3 = (0; 0; 1), and let C = fe1; e2g and D = fe3g which are bases for W1 and W2
respectively. Then
0 1
  0 1 0
[T ]E = [T (e1 )]E [T (e2 )]E [T (e3 )]E = B
@
C
1 0 0A ;
0 0 1
!
  0 1
[T jW1 ]C = [T (e1 )]C [T (e2 )]C = ;
1 0
 
[T jW2 ]D = [T (e3 )]D = ( 1 )
and
cT (x) = c[T ]E (x) = (x 1)(x2 + 1);
cT jW1 (x) = c[T jW1 ]C (x) = x2 + 1;
cT jW2 (x) = c[T jW2 ]D (x) = x 1:
!
[T jW1 ]C 0
Note that R3
= W1  W2 , [T ]E = and cT (x) = cT jW1(x) cT jW2(x).
0 [T jW2 ]D
(See Discussion 11.3.12.)
2. Let T be a linear operator on a vector space V over a eld F. Suppose T has an eigenvector
v, i.e. v is a nonzero vector in V such that T (v) = v for some  2 F. Let U = spanfvg.
For any u 2 U , u = av , for some a 2 F, and hence
T (u) = T (av) = aT (v) = av 2 U:
Section 11.3. Invariant Subspaces 119

 
So U is T -invariant. Using B = fv g as a basis for U , we have [T jU ]B = [T (v )]B = (  )
and hence cT jU (x) = c[T jU ]B (x) = x .
3. Let T be a linear operator on a vector space V over a eld F. Take any u 2 V . De ne

W = spanfu; T (u); T 2 (u); T 3 (u); : : : g:


For any w 2 W , w = a0 u+a1 T (u)+  +am T m (u) for some m 2 N and a0 ; a1 ; : : : ; am 2 F.
Then

T (w) = T (a0 u + a1 T (u) +    + am T m (u))


= a0 T (u) + a1 T 2 (u) +    + am T m+1 (u) 2 W:

So W is T -invariant. This subspace W of V is called the T -cyclic subspace of V generated


by u.
4. Let T be a linear operator on R4 de ned by

T ((a; b; c; d)) = (4a +2b +2c +2d; a +2b + c; a b; 3a 2b 2c d) for (a; b; c; d) 2 R4 :


Take u = (1; 2; 0; 0). Then

T (u) = T ((1; 2; 0; 0)) = (0; 3; 1; 1);


T 2 (u) = T (T (u)) = T ((0; 3; 1; 1)) = ( 2; 5; 3; 3)
= 2(1; 2; 0; 0) + 3(0; 3; 1; 1)
= 2u + 3T (u) 2 spanfu; T (u)g:

Note that T 3 (u) = T (T 2 (u)) 2 spanfT (u); T 2 (u)g  spanfu; T (u)g. Repeating the
process, we can show that T m (u) 2 spanfu; T (u)g for all m  2. Thus the T -cyclic
subspace of V generated by u is

W = spanfu; T (u); T 2 (u); T 3 (u); : : : g = spanfu; T (u)g:


By the discussion in Part 3, W is a T -invariant subspace. (See also Theorem 11.3.10 and
Example 11.3.11.1.)

Proposition 11.3.3 Let S and T be linear operators on V . Suppose W is a subspace of V


which is both S -invariant and T -invariant. Then
1. W is (S  T )-invariant and (S  T )jW = S jW  T jW ;
2. W is (S + T )-invariant and (S + T )jW = S jW + T jW ; and
3. for any scalar c, W is (cT )-invariant and (cT )jW = c(T jW ).

Proof The proof is left as exercise. See Question 11.14.


120 Chapter 11. Diagonalization and Jordan Canonical Forms

Discussion 11.3.4 Let T be a linear operator on a nite dimensional vector space V over a
eld F. Suppose W is a T -invariant subspace of V with dim(W )  1. Let dim(V ) = n and
dim(W ) = m.
Take an ordered basis C = fv1 ; v2 ; : : : ; vm g for W . For each j = 1; 2; : : : ; m, since T (vj ) 2 W ,

T jW (vj ) = T (vj ) = a1j v1 + a2j v2 +    + amj vm


for some a1j ; a2j ; : : : ; amj 2 F. Note that
0 1
a11 a12  a m
1
  B
B a21 a22  a mCC
[T jW ]C = [T jW (v1 )]B [T jW (v2 )]B    [T jW (vm)]B 2
C:
B
=B . C
.. ..
@ .. . . A
am1 am2    amm
By Theorem 8.5.17, we can extend C to an ordered basis B = fv1 ; v2 ; : : : ; vm ; vm+1 ; : : : ; vn g
for V . For each j = m + 1; m + 2; : : : ; n,

T (vj ) = a1j v1 + a2j v2 +    + amj vm + am+1; j vm+1 +    + anj vn


for some a1j ; a2j ; : : : ; anj 2 F. Then
 
[T ]B = [T (v1 )]B [T (v2 )]B    [T (vm)]B [T (vm+1)]B    [T (vn)]B
0 1
a11 a12  a m
1 a1; m+1  a1n
B
B a21 a22
B
 a m
2 a2; m+1  a2n C
C
B .. .. .. .. .. C C
B . . . . . C
B C
= Bam1
B am2    amm am; m +1  amn C C
B
B 0
B
0    0 am ; m +1 +1  am+1; n C
C
B .. .. .. .. .. C C
@ . . . . . A
0 0  0 an; m+1  ann
!
A1 A2
=
0 A3
where A1 = [T jW ]C , A2 is an m  (n m) matrix and A3 is an (n m)  (n m) matrix.

Example 11.3.5 Consider the linear operator T on R4 in Example 11.3.2.4. Let W =


spanfv1 ; v2 g where v1 = (1; 2; 0; 0) and v2 = (0; 3; 1; 1). From Example 11.3.2.4, we know
that W is a T -invariant subspace of R4 .
It is easy to check that C = fv1 ; v2 g is a basis for W with T jW (v1 ) = T (v1 ) = v2 and
T jW (v2 ) = T (v2 ) = 2v1 + 3v2 . Hence
!
 0  2
[T jW ]C = [T jW (v1 )]C [T jW (v2 )]C = :
1 3
Section 11.3. Invariant Subspaces 121

Extend C to a basis B = fv1 ; v2 ; v3 ; v4 g for R4 where v3 = (0; 0; 1; 0) and v4 = (0; 0; 0; 1).


Then
T (v3 ) = (2; 1; 0; 2) = 2v1 53 v2 + 53 v3 13 v4
and
T (v3 ) = (2; 0; 0; 1) = 2v1 4
3
v2 + v3 + v4:4
3
1
3

So 0 1
0 2 2 2
B C
B
B 1 3 5
3
4
3
C
C
[T ]B = B C:
B
@
0 0 5
3
4
3
C
A
0 0 1
3
1
3

Take the standard basis E = f(1; 0; 0; 0); (0; 1; 0; 0); (0; 0; 1; 0); (0; 0; 0; 1)g for R4 . Let
0 1 0 1
4 2 2 2 1 0 0 0
B C B C
B 1 2 1 0C B 2 3 0 0C
A = [T ]E = B
B 1 1 0 0C
C and P = [IR ]E;B = 4 B
B 0 1 1
C:
0C
@ A @ A
3 2 2 1 0 1 0 1
0 1
0 2 2 2
B C
B1 3 5 4C

Then P AP = [T ]B = BB 3 3C
1
C.
B0 0 5 4 C
@ 3 3 A
0 0 1
3
1
3

!
Lemma 11.3.6 Let D be a quare matrix such that D =
A B where both A and C are
0 C
square matrices. Then det(D) = det(A) det(C ).
Proof See Question 10.19.

Theorem 11.3.7 Let T be a linear operator on a nite dimensional vector space V . Suppose
W is a T -invariant subspace of V with dim(W )  1. Then the characteristic polynomial of T
is divisible by the characteristic polynomial of T jW .
Proof Using the notation in Discussion 11.3.4,
cT (x) = c[T ]B (x) = det(xIn [T ]B )
xI A1 A2
= m
0 xIn m A3
= det(xIm A1 ) det(xIn m A3 ) (see Lemma 11.3.6)
= cA1 (x) cA3 (x):

Since A1 = [T jW ]C , cA1 (x) = cT jW (x). Thus cT (x) is divisible by cT jW (x).


122 Chapter 11. Diagonalization and Jordan Canonical Forms

Example 11.3.8 In Example 11.3.5, cT (x) = (x 1)3 (x 2) and cT jW (x) = (x 1)(x 2).
(Check them.) It is obvious that cT (x) is divisible by cT jW (x).

Discussion 11.3.9 Given a linear operator T on a nite dimensional vector space V , by


Discussion 11.3.4, a T -invariant subspace of V can be helpful in nding a basis B such that
[T ]B is in a simpler form. By Example 11.3.2.3, for any u 2 V , the T -cyclic subspace of V
generated by u is T -invariant. This is a very useful invariant subspace. Actually, the subspace
W used in Example 11.3.5 is a T -cyclic subspace, see Example 11.3.2.4.

Theorem 11.3.10 Let T be a linear operator on a vector space V over a eld F. Take
a nonzero vector u 2 V . Suppose the T -cyclic subspace W = spanfu; T (u); T 2 (u); : : : g
generated by u is nite dimensional.

1. The dimension of W is equal to the smallest positive integer k such that T k (u) is a linear
combination of u; T (u); : : : ; T k 1 (u).

2. Suppose dim(W ) = k.

(a) fu; T (u); : : : ; T k (u)g is a basis for W .


1

(b) If T k (u) = a u + a T (u) +    + ak T k (u) where a ; a ;    ; ak 2 F,


0 1 1
1
0 1 1 then
cT jW (x) = a a x    ak xk + xk .
0 1 1
1

Proof Let k be the smallest positive integer such that T k (u) 2 spanfu; T (u); : : : ; T k 1 (u)g.
Assume T m (u) 2 spanfu; T (u); : : : ; T k 1 (u)g for m  1. Then

T m+1 (u) = T (T m (u)) 2 spanfT (u); T 2 (u); : : : ; T k (u)g  spanfu; T (u); : : : ; T k 1 (u)g:

By mathematical induction, we have shown that T n (u) 2 spanfu; T (u); : : : ; T k 1 (u)g for all
positive integer n. Thus W = spanfu; T (u); : : : ; T k 1 (u)g.
We claim that fu; T (u); : : : ; T k 1 (u)g is linearly independent: Assume the opposite, i.e. there
exists c0 ; c1 ; : : : ; ck 1 2 F such that c0 u + c1 T (u) +    + ck 1 T k 1 (u) = 0 and not all ci 's are
zero. Let j = maxfi j 0  i  k 1 and ci 6= 0g, i.e. c0 u + c1 T (u) +    + cj T j (u) = 0 and
cj 6= 0. Since u is a nonzero vector, j > 0. So we can write

T j (u) = cj 1 c0 u cj 1 c1 T (u)    cj cj T j (u) 2 spanfu; T (u); : : : ; T j (u)g:


1
1
1 1

As j < k, it contradicts our choice of k.


Since fu; T (u); : : : ; T k 1 (u)g is linearly independent and it spans W , it is a basis for W and
dim(W ) = k.
Section 11.3. Invariant Subspaces 123

Finally, use B = fu; T (u); : : : ; T k 1 (u)g as an ordered basis for W . Then


0 1
0 a0
B C
B1 0 0 a1 C
  B C
[T jW ]B = [T (u)]B [T 2 (u)]B    [T k (u)]B
B
=B 1 0 a2 C
B ... ... .. C
C
B
B
. C
@ ... 0 ak 2 C
A
0
1 ak 1

and hence by Question 10.23,

x a0
1 x 0 a1
1 x a2
cT jW (x) = det(xIW T jW ) = ... ... ..
.
... x ak 2
0
1 x ak 1

= a0 a1 x    ak xk
1
1
+ xk :

Example 11.3.11 Consider the linear operator T on R4 in Example 11.3.2.4.


1. Let u = (1; 2; 0; 0) and W = spanfu; T (u); T 2 (u); : : : g. From Example 11.3.2.4, we
have
T (u) 2= spanfug and T 2 (u) = 2u + 3T (u) 2 spanfu; T (u)g:
By Theorem 11.3.10, dim(W ) = 2, fu; T (u)g is a basis for W and

cT jW (x) = ( 2) 3x + x2 = (x 1)(x 2):


(See also Example 11.3.5 and Example 11.3.8.)

2. Let v = (0; 0; 0; 1) and W 0 = spanfv ; T (v ); T 2 (v ); : : : g. Then

T (v) = T ((0; 0; 0; 1)) = (2; 0; 0; 1) 2= spanfvg


T 2 (v) = T ((2; 0; 0; 1)) = (6; 2; 2; 5) 2= spanfv; T (v)g
T 3 (v) = T ((6; 2; 2; 5)) = (14; 8; 8; 13)
= 2(0; 0; 0; 1) 5(2; 0; 0; 1) + 4(6; 2; 2; 5)
= 2v 5T (v ) + 4T 2 (v ) 2 spanfv ; T (v ); T 2 (v )g:

By Theorem 11.3.10, dim(W 0 ) = 3, fv ; T (v ); T 2 (v )g is a basis for W 0 and

cT jW 0 (x) = 2 ( 5)x 4x2 + x3 = (x 1)2 (x 2):


124 Chapter 11. Diagonalization and Jordan Canonical Forms

Discussion 11.3.12 Let T be a linear operator on a nite dimensional vector space V over a
eld F. Suppose
V = W1  W2      Wk
where W1 ; W2 ; : : : ; Wk are T -invariant subspaces of V with dim(Wt ) = nt  1 for t = 1; 2; : : : k.
For each t, let Ct = fv1(t) ; v2(t) ; : : : ; vn(tt) g be an ordered basis for Wt . As Wt is T -invariant, for
j = 1; 2; : : : ; nt , T (vj(t) ) 2 Wt and hence

T jWt (vj(t) ) = T (vj(t) ) = a(1tj) v1(t) + a(2tj) v2(t) +    + a(ntt)j vn(tt)

for some a(1tj) ; a(2tj) ; : : : ; a(ntt)j 2 F. Thus


0 1
a(11t) a(12t)    a tnt
( )
1
  B (t)
B a21 a22
(t)
   a tnt C
( )
C
[T jWt ]Ct = [T jWt (v1(t) )]Ct [T jWt (v2(t) )]Ct    [T jWt (vn(tt))]Ct B
=B . ..
2
..C
C:
@ .. . . A
a(ntt)1 a(ntt)2    anttnt
( )

By Theorem 8.6.7.1, we know that the set

B = C1 [ C2 [    [ Ck
= fv1(1) ; v2(1) ; : : : ; vn(1)
1
; v1(2) ; v2(2) ; : : : ; vn(2)2 ; : : : : : : ; v1(k) ; v2(k) ; : : : ; vn(kk) g
is a basis for V .
Using B as an ordered basis with the order shown above,
 
[T ]B = [T (v1(1) )]B    [T (vn(1)1 )]B [T (v1(2))]B    [T (vn(2)2 )]B    [T (v1(k))]B    [T (vn(kk))]B
0 1
A1 0 0
=
B
B
B
0 A2 0 C
C
C
B ... C
@ A
0 0 Ak
 
where At = a(ijt) = [T jWt ]Ct for t = 1; 2; : : : ; k. Furthermore,
nt nt
cT (x) = cA1 (x) cA2 (x)    cAk (x) = cT jW1(x) cT jW2(x)    cT jWk(x);
i.e. the characteristic polynomial of T is the product of the characteristic polynomials of T jWt
for t = 1; 2; : : : ; k.

Example 11.3.13 Consider the linear operator T on R4 in Example 11.3.2.4.


Let W1 = spanfv1 ; v2 g and W2 = spanfw1 ; w2 g where v1 = (1; 2; 0; 0), v2 = (0; 3; 1; 1),
w1 = (0; 1; 1; 2) and w2 = (0; 0; 1; 1). We have the following observations:
Section 11.4. The Cayley-Hamilton Theorem 125

1. C1 = fv1 ; v2 g and C2 = fw1 ; w2 g are bases for W1 and W2 respectively.


2. B = fv1 ; v2 ; w1 ; w2 g is linearly independent. Hence W1 + W2 is a direct sum. (Why?)
Also, since dim(R4 ) = 4 = dim(W1  W2 ), we have R4 = W1  W2 .
3. From Example 11.3.2.4, W1 is T -invariant, T (v1 ) = v2 and T (v2 ) = 2v1 + 3v2 .
4. It can be shown that W2 is also T -invariant, T (w1 ) = 3w1 4w2 and T (w2 ) = w1 w2 .
Then by Discussion 11.3.12, we have
0 1
! !
0 2 0 0
B C
0 2 3 1 B 1 3 0 0 C
[T jW1 ]C1 = ; [T jW2 ]C2 = and [T ]B = B
B
C:
C
1 3 4 1 @ 0 0 3 1 A
0 0 4 1

Furthermore, cT jW1 (x) = (x 1)(x 2), cT jW2 (x) = (x 1)2 and cT (x) = (x 1)3 (x 2) =
cT jW1 (x) cT jW2 (x).
Take the standard basis E = f(1; 0; 0; 0); (0; 1; 0; 0); (0; 0; 1; 0); (0; 0; 0; 1)g for R4 . Let
0 1 0 1
4 2 2 2 1 0 0 0
B C B C
B 1 2 1 0C B 2 3 1 0C
A = [T ]E = B
B 1 1 0 0C
C and Q = [IR ]E;B =
4 B
B 0 1 1 1C
C:
@ A @ A
3 2 2 1 0 1 2 1
0 1
0 2 0 0
B C
Then Q 1 B1
AQ = [T ]B = B 3 0 0CC.
B0 0 3 1C
@ A
0 0 4 1

Section 11.4 The Cayley-Hamilton Theorem

Notation 11.4.1 Let F be a eld and let p(x) = a0 +a1 x+  +am xm where a0 ; a1 ; : : : ; am 2 F.
1. For a linear operator T on a vector space V over F, we use p(T ) to denote the linear
operator a0 IV + a1 T +    + am T m on V .
2. For an n  n matrix A over F, we use p(A) to denote the n  n matrix a In + a A + 0 1

   + a m Am .
Lemma 11.4.2 Let F be a eld and let p(x) = a0 + a1 x +    + am xm where a0 ; a1 ; : : : ; am 2 F.
Suppose T is a linear operator on a vector space V over F and A is an n  n matrix over F.
126 Chapter 11. Diagonalization and Jordan Canonical Forms

1. Suppose V is nite dimensional where dim(V ) = n  1. For any ordered basis B for V ,

[p(T )]B = p([T ]B ):

2. Consider the linear operator LA on Fn de ned in Example 9.1.4.1, i.e. LA (u) = Au for
u 2 Fn where vectors in Fn are written as column vectors. Then
p(LA ) = Lp(A)

where Lp(A) is the linear operator on Fn de ned by Lp(A) (u) = p(A)u for u 2 Fn .

3. If W is a T -invariant subspace of V , then W is also a p(T )-invariant subspace of V and

p(T )jW = p(T jW ):

4. Suppose r(x) and s(x) are polynomials over F such that p(x) = r(x) s(x). Then

p(T ) = r(T )  s(T ) = s(T )  r(T ) and q(A) = r(A) s(A) = s(A) r(A):

Proof
1. By Corollary 9.3.6 and Proposition 9.4.3,

[p(T )]B = [a0 IV + a1 T + a2 T 2 +    + am T m ]B


= a0 In + a1 [T ]B + a2 ([T ]B )2 +    + am ([T ]B )m (note that [IV ]B = In )
= p([T ]B ):

2. Take the standard basis E = fe1 ; e2 ; : : : ; en g for Fn . By Example 9.2.4.3, [LA ]E = A


and [Lp(A) ]E = p(A). Then applying Part 1 with T = LA ,

[p(LA )]E = p([LA ]E ) = p(A) = [Lp(A) ]E

and hence by Lemma 9.2.3, p(LA ) = Lp(A) .

3. By applying Proposition 11.3.3 repeatedly, W is a p(T )-invariant subspace of V and

p(T )jW = (a0 IV + a1 T + a2 T 2 +    + am T m )jW


= a0 IW + a1 T jW + a2 (T jW )2 +    + am (T jW )m (note that IV jW = IW )
= p(T jW ):

4. The results are obvious because T i  T j = T i+j = T j  T i and Ai Aj = Ai+j = Aj Ai for


all i; j .
Section 11.4. The Cayley-Hamilton Theorem 127

0 1
1 0 1
Example 11.4.3 Let A = @0 1
B C
1 A be a real matrix.
1 1 2
Consider the linear operator LA : R3 ! R3 as de ned in Example 9.1.4.1, i.e. LA (u) = Au for
u 2 R3 where vectors in R3 are written as column vectors.
1. Find the matrix p(A) and the linear operator p(LA ) where p(x) = 4 + 8x 5x2 .

Solution We have
p(A) = 4I3 + 8A 5A2
0 1 0 1 0 12 0 1
1 0 0 1 0 1 1 0 1 4 5 7
B C B C B C B C
= 4@0 1 0A + 8@0 1 1A 5@0 1 1 A =@ 5 6 7A
0 0 1 1 1 2 1 1 2 7 7 8
and by Lemma 11.4.2.2, p(LA ) = Lp(A) , i.e.
00 11 0 1 0 10 1 0 1
x x 4 5 7 x x
p(LA )BB CC
@@y AA = p (A )B C B
@y A @ 5
= 6
CB C
7A@y A for
B C
@y A 2R : 3

z z 7 7 8 z z

2. Note that cLA (x) = cA (x) = (x 2)(x 1)2 . Find the matrix cA (A) and the linear
operator cLA (LA ).

Solution We have
cA (A) = (A 2I3 )(A I3) 2

20 1 0 13 20 1 0 132 0 1
1 0 1 1 0 0 1 0 1 1 0 0 0 0 0
=6B
4@0 1 1A
C
2B C7 6B
@0 1 0A5 4@0 1 1A
C B C7
@0 1 0A5 = B C
@0 0 0A :
1 1 2 0 0 1 1 1 2 0 0 1 0 0 0
By Remark 11.1.8, cLA (x) = cA (x) and by Lemma 11.4.2.2, for any polynomial p(x),
p(LA )(u) = p(A)u for all u 2 R3 . We have
00 11 00 11
x x
cLA (LA )BB CC
@@y AA = c L BB CC
A A @@y AA
( )
z z
0 1 0 10 1 0 1 0 1
x 0 0 0 x 0 x
= cA (A)@y A = @0 0 0A@y A = @0A for
B C B CB C B C B C
@y A 2R : 3

z 0 0 0 z 0 z
Thus cLA (LA ) = OR3 , the zero operator on R3 .
128 Chapter 11. Diagonalization and Jordan Canonical Forms

Theorem 11.4.4 (Cayley-Hamilton Theorem)


1. Let T be a linear operator on a nite dimensional vector space V where dim(V )  1.
Then cT (T ) = OV where OV is the zero operator on V .
2. Let A be a square matrix. Then cA (A) = 0.

Proof Since Part 2 follows by applying Part 1 to the linear operator T = LA , we only need to
prove Part 1. To prove that cT (T ) = OV , we need to show that cT (T )(u) = 0 for all u 2 V .
Take any u 2 V . If u = 0, then it is obvious that cT (T )(u) = 0. Suppose u 6= 0. Let W =
spanfu; T (u); T 2 (u); : : : g. Note that W is a T -invariant subspace of V . Suppose dim(W ) = k.
By Theorem 11.3.10, B = fu; T (u); : : : ; T k 1 (u)g is a basis for W . Since T k (u) 2 W , there
exists a0 ; a1 ; : : : ; ak 1 such that

T k (u) = a0 u + a1 T (u) +    + ak 1 T k 1 (u): (11.1)

By Theorem 11.3.10 again,

cT jW (x) = a0 a1 x +    ak 1 xk 1 + xk : (11.2)

On the other hand, since W is a T -invariant subspace of V , by Theorem 11.3.7, the characteristic
polynomial of T is divisible by the characteristic polynomial of T jW , i.e.

cT (x) = q(x) cT jW (x)


for some polynomial q (x). Thus

cT (T )(u) = (q(T )  cT jW (T ))(u) (by Lemma 11.4.2.4)


= q (T )(cT jW (T )(u))
= q (T )(( a0 IV a1 T    ak T k + T k )(u)) (by (11.2))
1
1

= q (T )( a0 u a1 T (u)    ak T k (u) + T k (u))


1
1

= q (T )(0) (by (11.1))


= 0:

Since cT (T )(u) = 0 for all u 2 V , cT (T ) = OV .

Example 11.4.5 Let T : P2 (R) ! P2 (R) be the linear operator de ned by


T (a + bx + cx2 ) = 2c + (a + 2b + c)x + (a + 3c)x2 for a + bx + cx2 2 P2 (R):
Take the standard basis B = f1; x; x2 g for P2 (R). Let
0 1
0 0 2
A = [T ]B = @1 2
B C
1 A:
1 0 3
Section 11.5. Minimal Polynomials 129

Then
x 0 2
cT (x) = cA (x) = 1 x 2 1 = 4 + 8x 5x2 + x3 :
1 0 x 3
By the Cayley-Hamilton Theorem (Theorem 11.4.4), cT (T ) = OP2 (R) and cA (A) = 0, i.e.

4 IP2 (R) + 8 T 5 T 2 + T 3 = OP2 (R) and 4I + 8 A 5A2 + A3 = 0:

Section 11.5 Minimal Polynomials

Example 11.5.1 Consider the linear operator T on P2 (R) and the square matrix A = [T ]B
de ned in Example 11.4.5. By the Cayley-Hamilton Theorem (Theorem 11.4.4), cT (T ) = OP2 (R)
and cA (A) = 0. Now, let p(x) = 2 3x + x2 . We have

p(A) = 2I3 3A + A2
0 1 0 1 0 12 0 1
1 0 0 0 0 2 0 0 2 0 0 0
= 2@0 1 0C
B
A
B
3@1 2 C B
1 A + @1 2
C B C
1 A = @0 0 0A :
0 0 1 1 0 3 1 0 3 0 0 0
Thus p(T ) = OP2 (R) . Note that p(x) is a factor of cT (x), in fact, cT (x) = (x 2)p(x).

De nition 11.5.2 Let F be a eld.


1. Let p(x) be a polynomial of degree m over F, i.e. p(x) = a0 + a1 x +    + am xm where
a0 ; a1 ; : : : ; am 2 F and am 6= 0. If am = 1, then p(x) is called a monic polynomial.
2. Let T be a linear operator on a nite dimensional vector space V over F where dim(V )  1.
The minimal polynomial of T is the polynomial mT (x) over F such that
(a) mT (x) is monic,
(b) mT (T ) = OV , and
(c) if p(x) is a nonzero polynomial over F such that p(T ) = OV , then the degree of p(x)
must be greater than or equal to the degree of mT (x).
That is, the minimal polynomial mT (T ) of T is the monic polynomial p(x) of smallest
degree such that p(T ) = OV .
Since cT (T ) = OV by the Cayley-Hamilton Theorem, there exists at least one polynomial
p(x) such that p(T ) = OV . So mT (x) exists.
130 Chapter 11. Diagonalization and Jordan Canonical Forms

Remark 11.5.3 Let A be a square matrix. Similar to De nition 11.5.2.2, we can de ne the
minimal polynomial of A accordingly, i.e. the minimal polynomial mA (x) of A is the monic
polynomial p(x) of smallest degree such that p(A) = 0.
Note that if T is a linear operator on a nite dimensional vector space V , where dim(V )  1,
and B is an ordered basis for V , then mT (x) = m[T ]B (x).
In this section, we mainly study the properties of minimal polynomials of linear operators.
However, most results can be restated in terms of minimal polynomials of square matrices. See
Question 11.33.

Example 11.5.4
1. For any nite dimensional vector space V with dim(V )  1, mOV (x) = x; and for the
n  n zero matrix 0, m0 (x) = x. (Why?)
2. In Example 11.4.3, cLA (x) = mLA (x) = (x 2)(x 1)2 .
3. In Example 11.4.5, cT (x) = 4+8x 5x2 +x3 = (x 2)2 (x 1) while mT (x) = 2 3x+x2 =
(x 2)(x 1).
In general, it is not easy to nd the minimal polynomial of a linear operator. In the mean time,
we can only nd them by trial-and-error. However, the next lemma will give us some idea to
guess how the minimal polynomial looks like.

Lemma 11.5.5 Let T be a linear operator on a nite dimensional vector space V over a eld
F where dim(V )  1.
1. Let p(x) be a polynomial over F. Then p(T ) = OV if and only if p(x) is divisible by the
minimal polynomial of T .
2. If W is a T -invariant subspace of V with dim(W )  1, then the minimal polynomial of T
is divisible by the minimal polynomial of T jW .
3. Suppose  is an eigenvalue of T such that (x )r strictly divides cT (x), i.e. cT (x) =
(x )r q (x) where q (x) is a polynomial over F which is not divisible by x . Then

mT (x) = (x )s q1 (x)


where 1  s  r, q1 (x) is a polynomial over F and q1 (x) divides q (x).

Proof
1. ()) Suppose p(x) is a polynomial over F such that p(T ) = OV . By the Division Algo-
rithm, we have polynomials v (x) and w(x) over F such that

p(x) = v(x)mT (x) + w(x) and deg(w(x)) < deg(mT (x)):


Section 11.5. Minimal Polynomials 131

If w(x) = 0, then p(x) = v (x)mT (x) and hence p(x) is divisible by mT (x).
Assume w(x) 6= 0. For any u 2 V ,
0 = p(T )(u) = (v(T )  mT (T ) + w(T ))(u)
= (v (T )  mT (T ))(u) + w(T )(u)
= v (T )(mT (T )(u)) + w(T )(u)
= v (T )(0) + w(T )(u) = 0 + w(T )(u) = w(T )(u):
Thus w(T ) = OV and contracts that mT (x) is chosen to have the least degree.
(() Suppose p(x) = t(x)mT (x) for some polynomial t(x) over F. Then for all u 2 V ,
p(T )(u) = (t(T )  mT (T ))(u) = t(T )(mT (T )(u)) = t(T )(0) = 0:
So p(T ) = OV .
2. Since mT (T ) = OV , by Lemma 11.4.2.3, mT (T jW ) = mT (T )jW = OV jW = OW . By Part
1, mT (x) is divisible by mT jW (x).
3. By the Cayley-Hamilton Theorem (Theorem 11.4.4), cT (T ) = OV and hence by Part 1,
cT (x) is divisible by mT (x). Thus
mT (x) = (x )s q1 (x)
where 0  s  r and q1 (x) divides q (x). We only need to show that s  1.
Recall that  is an eigenvalue of T . Take an eigenvector v associated with . De ne W1 =
spanfv g. By Example 11.3.2.2, W1 is a T -invariant subspace of V and cT jW1 (x) = x .
So
T jW1 IW1 = OW1 :
The only monic polynomial with degree less than x  is the constant polynomial 1 (x) = 1
and 1 (T jW1 ) = IW1 6= OW1 . So x  is the minimal polynomial of T jW1 , i.e. mT jW1 (x) =
x . By Part 2, mT (x) is divisible by mT jW1 (x) = x  and hence s  1.
0 1
2 0 0
Example 11.5.6 Let A = @0 2 1C
B
A be a real matrix. Find the minimal polynomial of A.
0 0 2
Solution The characteristic polynomial of A is cA (x) = (x 2)3 . By the matrix version of
Lemma 11.5.5.3 (see Question 11.33), the minimal polynomial of A is mA (x) = (x 2)s where
s = 1; 2 or 3. Note that s is the smallest positive integer such that (A 2I )s = 0. As
0 1 0 12
0 0 0 0 0 0
A 2I = @0 0 1C
B
A 6= 0 and (A 2I ) = @0 0 1C
2 B
A = 0;
0 0 0 0 0 0
mA (x) = (x 2)2 .
132 Chapter 11. Diagonalization and Jordan Canonical Forms

Theorem 11.5.7 Let T be a linear operator on a vector space V . Suppose W1 and W2 are T -
invariant subspace of V . Then W1 + W2 is T -invariant and if W1 and W2 are nite dimensional
with dim(W1 )  1 and dim(W2 )  1, mT jW1 +W2 (x) is equal to the least common multiple of
mT jW1 (x) and mT jW2 (x).
Proof The proof is left as an exercise. See Question 11.38.

Theorem 11.5.8 Let T be a linear operator on a nite dimensional vector space V where
dim(V )  1. Suppose
cT (x) = (x 1 )r1 (x 2 )r2    (x k )rk
where 1 ; 2 ; : : : ; k are distinct eigenvalues of T . Then

mT (x) = (x 1 )s1 (x 2 )s2    (x k )sk where 1  si  ri for all i: (11.3)

De ne Ki (T ) = Ker((T i IV )si ) for i = 1; 2; : : : ; k. Then

V = K1 (T )  K2 (T )      Kk (T ): (11.4)

Furthermore, for each i = 1; 2; : : : ; k,

1. Ei (T )  Ki (T ),

2. Ki (T ) is a T -invariant subspace of V .

3. mT jK (T ) (x) = (x i )si ,


i

4. cT jK (T ) (x) = (x i )ri ,


i

5. dim(Ki (T )) = ri .

Proof The formula of mT (x) in (11.3) follows easily from Lemma 11.5.5.3. The proof of (11.4)
is left as exercise. (See Question 11.40.)

1. Take any u 2 Ei (T ).

(T i IV )(u) = 0 ) (T iIV )si (u) = 0 ) u 2 Ki (T ):


So Ei (T )  Ki (T ).

2. Take any u 2 Ki (T ). Then (T i IV )si (u) = 0. Hence by Lemma 11.4.2.4,

(T i IV )si (T (u)) = ((T i IV )si  T )(u)


= (T  (T i IV )si )(u) = T ((T i IV )si (u)) = T (0) = 0

which implies T (u) 2 Ki (T ). So Ki (T ) is T -invariant.


Section 11.5. Minimal Polynomials 133

3. Let p(x) = (x i )si . For any u 2 Ki (T ),


p(T jKi(T ) )(u) = p(T )jKi(T ) (u) = p(T )(u) = (T i IV )si (u) = 0:
It means p(T jKi(T ) ) = OKi (T ) and hence by Lemma 11.5.5.1, p(x) is divisible by
mT jK (T ) (x). Write mT jK (T ) (x) = (x i )ti , where 1  ti  si , for each i.
i i
Applying Theorem 11.5.7 to (11.4),
(x 1 )s1 (x 2 )s2    (x k )sk = mT (x) = lcmf(x 1 )t1 ; (x 2 )t2 ; : : : ; (x k )tk g
= (x 1 )t1 (x 2 )t2    (x k )tk :
So ti = si for all i.
4. Since mT jK (T ) (x) = (x i )si , by Lemma 11.5.5.3, cT jK (T ) (x) cannot have other factor
i i
x  for  6= i . So cT jK (T ) (x) = (x i )ui for ui  si .
i
Applying Discussion 11.3.12 to (11.4),
(x 1 )r1 (x 2 )r2    (x k )rk = cT (x) = (x 1 )u1 (x 2 )u2    (x k )uk :
So ui = ri for all i.
5. By Remark 11.1.8, dim(Ki (T )) = deg(cT jK (T ) (x)) = ri
i

Example 11.5.9
1. Consider the linear operator LA in Example 11.3.5 and Example 11.4.3. We know that
mLA (x) = cLA (x) = (x 2)(x 1)2 .
0 1 0 10 1 0 1
x 1 2 0 1 x 0
@y A 2 K2 (LA ) ,
B C B CB C B C
@ 0 1 2 1 A@y A = @0A
z 1 1 2 2 z 0
0 1 0 1
x 1
, @y A = t@ 1 A where t 2 R
B C B C

z 1
and 0 1 0 12 0 1 0 1
x 1 1 01 x 0
@y A 2 K1 (LA ) ,
B C B CB C B C
@ 0 1 1 1 A @y A = @0A
z 1 1 2 1 z 0
0 10 1 0 1
1 1 1 x 0
, B
@ 1 1
CB C B C
1 A@y A = @0A
1 1 1 z 0
0 1 0 1 0 1
x 1 1
, @y A = s@ 0 A + t@ 1 A where s; t 2 R:
B C B C B C

z 1 0
134 Chapter 11. Diagonalization and Jordan Canonical Forms

So K2 (LA ) = spanf( 1; 1; 1)T g and K1 (LA ) = spanf( 1; 0; 1)T ; ( 1; 1; 0)T g. Note that
E2 (LA ) = K2 (LA ) but E1 (LA ) $ K1 (LA ).
0 1
2 0 0
With B = f( 1; 1; 1)T ; ( 1; 0; 1)T ; ( 1; 1; 0)T g, [LA ]B = B
@ 0 1 0
C
A. (See Theorem
0 1 1
11.5.8.)
2. Consider the linear operator T in Example 11.4.5. We know that mT (x) = (x 2)(x 1)
and cT (x) = (x 2)2 (x 1).
0 10 1 0 1
0 2 0 2 a 0
a + bx + cx2 2 K2 (T ) , B
@ 1 2 2
CB C B C
1 A@ b A = @0A
1 0 3 2 c 0
0 1 0 1 0 1
a 1 0
, @ b A = s@ 0 A + t@1A where t; s 2 R
B C B C B C

c 1 0
and
0 10 1 0 1
0 1 0 2 a 0
a + bx + cx2 2 K1 (T ) , B
@ 1 2 1
CB C B C
1 A@ b A = @0A
1 0 3 1 c 0
0 1 0 1
a 2
, @ b A = t@ 1 A where t 2 R:
B C B C

c 1

So K2 (T ) = spanf 1+ x2 ; xg and K1 (T ) = spanf 2+ x + x2 g. Note that E2 (T ) = K2 (T )


and E1 (T ) = K1 (T ).
0 1
2 0 0
With B = f 1 + x2 ; x; 2 + x + x2 g, [T ]B = B
@ 0 2 0 C
A. (See Theorem 11.5.8.)
0 0 1

Theorem 11.5.10 Let T be a linear operator on a nite dimensional vector space V , where
dim(V )  1, such that cT (x) = (x 1 )r1 (x 2 )r2    (x k )rk where 1 ; 2 ; : : : ; k are distinct
eigenvalues of T . (Note that r1 + r2 +    + rk = dim(V ).)
The following are equivalent:
1. T is diagonalizable.
2. mT (x) = (x 1 )(x 2 )    (x k ).
3. dim(Ei (T )) = ri for i = 1; 2; : : : ; k.
Section 11.6. Jordan Canonical Forms 135

4. V = E1 (T )  E2 (T )      Ek (T ).

Proof
(1 ) 2) Suppose T is diagonalizable. By Theorem 11.1.11, V has a basis B consisting of
eigenvectors of T . For every u 2 B , T (u) = i u for some i . Thus (T i IV )(u) = 0
and hence by Lemma 11.4.2.4,
((T 1 IV )  (T 2 IV )      (T k IV ))(u)
= ((T 1 IV )      (T i 1 IV )  (T i+1 IV )      (T k IV )  (T i IV ))(u)
= ((T 1 IV )      (T i 1 IV )  (T i+1 IV )      (T k IV ))((T i IV )(u))
= ((T 1 IV )      (T i 1 IV )  (T i+1 IV )      (T k IV ))(0)
= 0:
This implies (T 1 IV )  (T 2 IV )      (T k IV ) = OV . By Lemma 11.5.5,
mT (x) = (x 1 )(x 2 )    (x k ).
(2 ) 3) Suppose mT (x) = (x 1 )(x 2 )    (x k ). For every i, Ki (T ) = Ei (T ) and
hence by Theorem 11.5.8, dim(Ei (T )) = dim(Ki (T )) = ri .
(3 ) 4) Suppose dim(Ei (T )) = ri for all i. For every i, by Theorem 11.5.8, Ei (T ) = Ki (T )
and hence V = K1 (T )  K2 (T )      Kk (T ) = E1 (T )  E2 (T )      Ek (T ).
(4 ) 1) For each i, let Bi be a basis for Ei (T ) (every vector in Bi is an eigenvector of T
associated with the eigenvalue i ). As V = E1 (T )  E2 (T )      Ek (T ), by Theorem
8.6.7.1, B = B1 [ B2 [    [ Bk is a basis for V and every vector in B is an eigenvector
of T . Hence by Theorem 11.1.11, T is diagonalizable.

Corollary 11.5.11 Let T be a linear operator on a nite dimensional vector space V and let
W be a T -invariant subspace of V with dim(W )  1. If T is diagonalizable, then T jW is also
diagonalizable.
Proof Since T is diagonalizable, by Theorem 11.5.10, mT (x) = (x 1 )(x 2 )    (x k )
where 1 ; 2 ; : : : ; k are distinct eigenvalues of T . By Lemma 11.5.5.2, mT (x) is divisible by
mT jW (x) and hence mT jW (x) = (x i1 )(x i2 )    (x is ) for some distinct eigenvalues
i1 ; i2 ; : : : ; is 2 f1 ; 2 ; : : : ; k g. By Theorem 11.5.10, T jW is diagonalizable.

Section 11.6 Jordan Canonical Forms

Example 11.6.1 Let T : C3 ! C3 be the linear operator de ned by


T ((a; b; c)) = ((1 + i)a + b; i b + c; a b (1 i)c) for (a; b; c) 2 C3 :
136 Chapter 11. Diagonalization and Jordan Canonical Forms

Using the standard basis E = f(1; 0; 0); (0; 1; 0); (0; 0; 1)g,
0 1
1+i 1 0
B C
[T ]E = @ 0 i 1 A:
1 1 (1 i)
Note that
x (1 + i) 1 0
cT (x) = 0 x i 1 = (x i)3
1 1 x + (1 i)
and hence mT (x) = (x i)s where 1  s  3. Let Q = T i IV , i.e.

Q((a; b; c)) = (a + b; c; a b c) for (a; b; c) 2 C3 :


Then cQ (x) = x3 and mQ (x) = xs .
It is easy to check that Q 6= OC3 and Q2 6= OC3 but Q3 = OC3 . So s = 3, i.e. mQ (x) = x3 and
mT (x) = (x i)3 .
Take v = (1; 0; 0). Then Q(v ) = (1; 0; 1), Q2 (v ) = Q((1; 0; 1)) = (1; 1; 0) and Q3 (v ) =
Q((1; 1; 0)) = (0; 0; 0). Using B = fQ2 (v); Q(v); vg as an ordered basis for C3 ,
 
[T ]B = [T (Q2 (v ))]B [T (Q(v ))]B [T (v )]B
 
= [Q3 (v ) + i Q2 (v )]B [Q2 (v ) + i Q(v )]B [Q(v ) + i v ]B (because T = Q + i IV )
 
= [i Q2 (v )]B [Q2 (v ) + i Q(v )]B [Q(v ) + i v ]B
0 1
i 1 0
B
= @0 i 1CA:
0 0 i

De nition 11.6.2 Let  be a scalar. The t  t matrix


0 1
 1 0
B C
B  1 C
Jt() = B
B
B
... ... C
C
@ 0
... 1C
A

is called a Jordan block of order t associated with . For example,
0 1
0  1 0 0 1
 
!  1 0 B C
 1 B 0  1 0C
J1() =  ; J2() = 0  ; J3() = @0  1A ; J4() = B0 0  1C
B C B
C:
@ A
0 0 
0 0 0 
Section 11.6. Jordan Canonical Forms 137

Lemma 11.6.3 Given a Jordan block J = Jt (), cJ (x) = mJ (x) = (x )t .


Proof Since J is a t  t triangular matrix with diagonal entries all equal to , cJ (x) = (x )t .
Then by Theorem 11.5.8, mJ (x) = (x )s for 1  s  t. By direct computations, we get
(J I )r 6= 0, for r = 1; 2; : : : ; t 1, while (J I )t = 0. (Check it.) So mJ (x) = (x )t .

Theorem 11.6.4 Let T be a linear operator on a nite dimensional vector space V over a
eld F where dim(V )  1. Suppose the characteristic polynomial of T can be factorized into
linear factors over F. Then there exists an ordered basis B for V such that [T ]B = J with
0 1
Jt1 ( )1 0
B C
B
B
B
Jt2 ( )
2
C
C
C
J= B
B ... C
C (11.5)
B C
B ... C
@ A
0 Jtm (m)
where 1 ; 2 ; : : : ; m are eigenvalues of T . (Note that 1 ; 2 ; : : : ; m are not necessarily distinct.)
Proof We omit the proof because it is too technical. (See Question 11.53 and Question 11.54.)

Remark 11.6.5 Let A 2 Mnn(F).Suppose the characteristic polynomial of A can be


factorized into linear factors over F. Apply Theorem 11.6.4 to T = LA . We can nd an
invertible matrix P 2 Mnn (F) such that P 1 AP = J .

De nition 11.6.6 For a linear operator T of a nite dimensional vector space V , if there
exists an ordered basis B for V such that [T ]B = J where J is a square matrix of the form
stated in (11.5), we say that T has a Jordan canonical form J and J is a Jordan canonical
form for T .
Similarly, for a square matrix A, if there exists an invertible matrix P such that P 1 AP = J ,
we say that A has a Jordan canonical form J and J is a Jordan canonical form for A.

Example 11.6.7  Consider the matrix A and the linear operator LA in Example 11.3.5 and Example 11.5.9.1. If we choose an ordered basis B = {(−1, 1, 1)ᵀ, (−1, 1, 0)ᵀ, (−1, 0, 1)ᵀ}, then

    [LA]B = [ 2  0  0
              0  1  1
              0  0  1 ]  =  [ J1(2)    0
                                0     J2(1) ]

which is a Jordan canonical form for LA.
Let

    P = [ −1  −1  −1
           1   1   0
           1   0   1 ].

Then P⁻¹AP = [ J1(2)  0 ; 0  J2(1) ] which is a Jordan canonical form for A.

Remark 11.6.8  Jordan canonical forms are not unique. But two Jordan canonical forms for a linear operator (or a matrix) have the same collection of Jordan blocks but in different orders. (Actually, two matrices in Jordan canonical forms are similar if and only if they have the same collection of Jordan blocks.)
In Example 11.6.7, the following are all the possible Jordan canonical forms for A:

    [ J1(2)    0          [ J2(1)    0
        0    J2(1) ]  and     0    J1(2) ].

Usually, we say that the Jordan canonical form is unique up to the ordering of the Jordan blocks.

Theorem 11.6.9  Suppose a linear operator T on a finite dimensional space V (respectively, a square matrix A) has a Jordan canonical form

    J = [ Jt1(λ1)                     0
                  Jt2(λ2)
                          ⋱
          0                  Jtm(λm) ]

where λ1, λ2, ..., λm are eigenvalues of T (or A). (Note that λ1, λ2, ..., λm are not necessarily distinct.)

1. cT(x) (or cA(x)) = (x − λ1)^t1 (x − λ2)^t2 ··· (x − λm)^tm.
2. mT(x) (or mA(x)) is the least common multiple of (x − λ1)^t1, (x − λ2)^t2, ..., (x − λm)^tm, i.e. if λ′1, λ′2, ..., λ′k are all the distinct eigenvalues of T (respectively, A), then mT(x) (or mA(x)) = (x − λ′1)^s1 (x − λ′2)^s2 ··· (x − λ′k)^sk where for each i, si is the order of the largest Jordan block associated with λ′i in the matrix J.
3. For every eigenvalue λ of T (or A), the dimension of the eigenspace Eλ(T) (or Eλ(A)) is equal to the total number of Jordan blocks associated with λ in the matrix J.

Proof  In the following, we prove the linear operator part of the theorem. For the matrix part of the theorem, we only need to apply our arguments to the linear operator LA.
Suppose B = {v1^(1), v2^(1), ..., vt1^(1), v1^(2), v2^(2), ..., vt2^(2), ..., v1^(m), v2^(m), ..., vtm^(m)} is an ordered basis for V such that [T]B = J.

1. Since xI − J is an upper triangular matrix,

    cT(x) = det(xI − J)
          = the product of the diagonal entries of xI − J
          = (x − λ1)^t1 (x − λ2)^t2 ··· (x − λm)^tm.

2. Let Wi = span(Ci) where Ci = {v1^(i), v2^(i), ..., vti^(i)} for i = 1, 2, ..., m. Then

    V = W1 ⊕ W2 ⊕ ··· ⊕ Wm.

For i = 1, 2, ..., m, [T|Wi]Ci = Jti(λi) and by Lemma 11.6.3,

    mT|Wi(x) = mJti(λi)(x) = (x − λi)^ti.

By Theorem 11.5.7, mT(x) is the least common multiple of (x − λ1)^t1, (x − λ2)^t2, ..., (x − λm)^tm.

3. Let u ∈ V with [u]B = (x1ᵀ, x2ᵀ, ..., xmᵀ)ᵀ written in block form, where xi ∈ F^ti is a column vector for i = 1, 2, ..., m. Then

    u ∈ Eλ(T)
    ⟺ ([T]B − λI)[u]B = 0, where [T]B − λI is the block diagonal matrix
       with diagonal blocks Jt1(λ1) − λIt1, Jt2(λ2) − λIt2, ..., Jtm(λm) − λItm
    ⟺ (Jti(λi) − λIti) xi = 0 for i = 1, 2, ..., m
    ⟺ for i = 1, 2, ..., m:  xi = αi (1, 0, ..., 0)ᵀ for some αi ∈ F if λi = λ,
       and xi = (0, 0, ..., 0)ᵀ if λi ≠ λ.

Thus

    Eλ(T) = span{ v1^(i) | i = 1, 2, ..., m and λi = λ }

and dim(Eλ(T)) = the number of Jordan blocks associated with λ in J.

Example 11.6.10
1. Suppose a complex square matrix A has a Jordan canonical form given by

    J = [ J3(i)    0      0      0
            0    J2(4)    0      0
            0      0    J2(4)    0
            0      0      0    J1(4) ]

      = [ i 1 0 0 0 0 0 0
          0 i 1 0 0 0 0 0
          0 0 i 0 0 0 0 0
          0 0 0 4 1 0 0 0
          0 0 0 0 4 0 0 0
          0 0 0 0 0 4 1 0
          0 0 0 0 0 0 4 0
          0 0 0 0 0 0 0 4 ].

Then cA(x) = (x − i)³(x − 4)²(x − 4)²(x − 4) = (x − i)³(x − 4)⁵, mA(x) = (x − i)³(x − 4)², dim(E_i(A)) = 1 and dim(E_4(A)) = 3.

2. Let A be a real square matrix such that

    cA(x) = (x − 1)³(x − 2)²   and   mA(x) = (x − 1)²(x − 2).

Find a Jordan canonical form for A.

Solution  Let J be a Jordan canonical form for A. Since cA(x) = (x − 1)³(x − 2)², along the diagonal of J, there are three 1's and two 2's. As mA(x) = (x − 1)²(x − 2), the largest Jordan block associated with 1 has order 2 and the largest Jordan block associated with 2 has order 1. So J must be similar to

    [ J2(1)    0      0      0
        0    J1(1)    0      0
        0      0    J1(2)    0
        0      0      0    J1(2) ]  =  [ 1 1 0 0 0
                                         0 1 0 0 0
                                         0 0 1 0 0
                                         0 0 0 2 0
                                         0 0 0 0 2 ].
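Theorem 11.6.9 lets us read cA(x), mA(x) and the eigenspace dimensions straight off the list of Jordan blocks. Below is a small sketch of this bookkeeping (assuming SymPy is available; the function name is ours):

    import sympy as sp

    x = sp.symbols('x')

    def invariants(blocks):
        # blocks: list of (eigenvalue, order) pairs of the Jordan blocks in J
        eigs = {lam for lam, _ in blocks}
        c = sp.Mul(*[(x - lam)**t for lam, t in blocks])              # Theorem 11.6.9.1
        m = sp.Mul(*[(x - lam)**max(t for mu, t in blocks if mu == lam)
                     for lam in eigs])                                # Theorem 11.6.9.2
        dims = {lam: sum(1 for mu, _ in blocks if mu == lam)
                for lam in eigs}                                      # Theorem 11.6.9.3
        return c, m, dims

    # Example 11.6.10.1: blocks J3(i), J2(4), J2(4), J1(4)
    c, m, dims = invariants([(sp.I, 3), (4, 2), (4, 2), (4, 1)])
    print(c)      # (x - 4)**5 * (x - I)**3 (up to ordering of factors)
    print(m)      # (x - 4)**2 * (x - I)**3
    print(dims)   # {4: 3, I: 1}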

Remark 11.6.11  The simplest form that a square matrix can be reduced to (or is similar to) is a Jordan canonical form. However, in practice, we seldom use Jordan canonical forms because they are very sensitive to computational errors. For example, a Jordan canonical form for the matrix

    [ 0  1
      ε  0 ]

is

    [ √ε    0
      0   −√ε ]  if ε ≠ 0,   and it is   [ 0  1
                                            0  0 ]  if ε = 0.

So small truncation errors during computation may end up with dramatic differences in Jordan canonical forms. So mathematicians working in numerical analysis prefer some other canonical forms which, though not so simple, are more stable in computation.
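A quick numerical illustration of this sensitivity (a sketch assuming NumPy; the size of ε is our choice):

    import numpy as np

    for eps in [1e-8, 0.0]:
        A = np.array([[0.0, 1.0], [eps, 0.0]])
        print(eps, np.linalg.eigvals(A))
    # eps = 1e-8 gives eigenvalues +/- 1e-4 (two blocks J1(+/- sqrt(eps))),
    # eps = 0 gives the double eigenvalue 0 (one block J2(0)): a perturbation
    # of size 1e-8 changes the whole Jordan block structure.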

Exercise 11

Question 11.1 to Question 11.6 are exercises for Section 11.1.

1. For each of the following linear operators T on V, (i) determine whether T is diagonalizable; and (ii) if T is diagonalizable, find an ordered basis B for V such that [T]B is a diagonal matrix.

(a) V = F2² and T((x, y)) = (x + y, y) for (x, y) ∈ V.
(b) V = P2(R) and T(a + bx + cx²) = (a + b) + (b − a)x + (c − a)x² for a + bx + cx² ∈ V.
(c) V = P2(C) and T(a + bx + cx²) = (a + b) + (b − a)x + (c − a)x² for a + bx + cx² ∈ V.
(d) V = Pn(R) and T(p(x)) = d(p(x))/dx for p(x) ∈ V.
(e) V = Pn(R) and T(p(x)) = d(xp(x))/dx for p(x) ∈ V.
2. Let A = [ 1  2
             2  1 ]  be a real matrix.

(a) Find an invertible 2 × 2 real matrix P so that P⁻¹AP is a diagonal matrix.
(b) Define a linear operator T on M2×2(R) such that T(X) = AX for X ∈ M2×2(R).
    (i) Let C = {E11, E21, E12, E22}. Find [T]C.
    (ii) Find an ordered basis B for M2×2(R) so that [T]B is a diagonal matrix.

3. Let S and T be linear operators on the same finite dimensional vector space. Suppose S is diagonalizable.

(a) If every eigenvalue of S is an eigenvalue of T , must T be diagonalizable?


(b) If every eigenvector of S is an eigenvector of T , must T be diagonalizable?

4. (a) Prove that two similar square matrices have the same characteristic polynomial.
   (b) Let A, B ∈ Mn×n(F) where F is a field.
       (i) Show that the following two 2n × 2n matrices

           X = [ AB   0n×n          Y = [ 0n×n  0n×n
                  B   0n×n ]   and         B     BA  ]

       are similar.
       (ii) Hence, or otherwise, prove that AB and BA have the same characteristic polynomial.
   (c) Restate the result of Part (b)(ii) in terms of linear operators.

5. Let V be a finite dimensional vector space over a field F and T a linear operator on V such that T² = IV.

(a) If 1 + 1 ≠ 0 in F, prove that T is diagonalizable. (Hint: See Question 9.18(b)(i).)
(b) Give an example of T such that T² = IV but T is not diagonalizable.

6. Let T be a linear operator on a vector space V. Suppose λ1, λ2, ..., λk are distinct eigenvalues of T. Show that Eλ1(T) + Eλ2(T) + ··· + Eλk(T) is a direct sum.

Question 11.7 to Question 11.11 are exercises for Section 11.2.


7. Let A = [ 3  3  4
             1  1  1
             0  1  1 ]  be a real matrix.
   Following the procedure of Example 11.2.4, find an invertible matrix P such that P⁻¹AP is an upper triangular matrix.

8. (a) Let B = [ 1  0  1           Q = [ 0  1  0
                 1  1  1    and          1  0  0
                 1  0  3 ]               0  1  1 ]   be real matrices.
       Verify that Q⁻¹BQ is an upper triangular matrix.
   (b) Let A = [ 0 1 1 1           R = [ 1 0 0 0
                 0 1 0 1                 0 1 0 0
                 1 0 2 0    and          1 0 1 0
                 0 1 0 3 ]               0 0 0 1 ]   be real matrices.
       Compute R⁻¹AR.
       Hence, or otherwise, find an invertible 4 × 4 real matrix P such that P⁻¹AP is an upper triangular matrix.
9. Let A = [ 1 0 2 2
             2 1 0 0
             1 2 1 0
             3 2 2 3 ]  be a real matrix and  v1 = (0, 1, 1, 1)ᵀ, v2 = (0, 0, 1, 1)ᵀ  vectors in R⁴.

(a) Verify that v1 and v2 are eigenvectors of A.
(b) Extend {v1, v2} to an ordered basis B = {v1, v2, w1, w2} for R⁴.
(c) Using the vectors w1 and w2 obtained in Part (b), let R = ( v1  v2  w1  w2 ). Compute R⁻¹AR.
(d) Find an invertible real matrix P such that P⁻¹AP is an upper triangular matrix.
10. (a) Let B = (bij)n×n and C = (cij)n×n be upper triangular matrices. Show that BC is an upper triangular matrix. What are the diagonal entries of BC?
    (b) Let A be an n × n complex matrix. Prove that if λ is an eigenvalue of A², then √λ or −√λ is an eigenvalue of A.

11. A linear operator T on a finite dimensional vector space V is called triangularizable if there exists an ordered basis B for V such that [T]B is a triangular matrix.

(a) Let T be a linear operator on a finite dimensional vector space V over a field F. Prove that T is triangularizable if and only if the characteristic polynomial of T can be factorized into linear factors over F.
(b) For each of the following linear operators T on V, determine whether T is triangularizable.
    (i) V = F2² and T((x, y)) = (x + y, y) for all (x, y) ∈ V.
    (ii) V = P2(R) and T(a + bx + cx²) = (a + b) + (b − a)x + (c − a)x² for all a + bx + cx² ∈ V.
    (iii) V = Mn×n(C) and T(X) = AX for X ∈ V, where A is a complex n × n matrix.

Question 11.12 to Question 11.24 are exercises for Section 11.3.

12. For each of the following linear operators T on V and subspaces W of V, determine whether W is T-invariant.

(a) V = C³, W = {(a, b, a + b) | a, b ∈ C} and T((a, b, c)) = (b + ic, c + ia, a + ib) for (a, b, c) ∈ V.
(b) V = C³, W = span{(1, 1, 1)} and T((a, b, c)) = (b + ic, c + ia, a + ib) for (a, b, c) ∈ V.
(c) V = C³, W = {(x, y, z) ∈ C³ | x + y + z = 0} and T((a, b, c)) = (b + ic, c + ia, a + ib) for (a, b, c) ∈ V.
(d) V = P(R), W = P3(R) and T(p(x)) = dp(x)/dx for p(x) ∈ V.
(e) V = P(R), W = P3(R) and T(p(x)) = ∫₀ˣ p(t) dt for p(x) ∈ V.
(f) V = M2×2(R), W = {A ∈ V | Aᵀ = A} and T(X) = [ 0  1 ; 1  0 ] X for X ∈ V.
(g) V = M2×2(R), W = {A ∈ V | Aᵀ = A} and T(X) = Xᵀ for X ∈ V.

13. Consider the linear operators T defined in Parts (a), (b), (c) of Question 11.12.
(i) Compute cT(x).
(ii) For each of Parts (a), (b), (c) of Question 11.12, if W is T-invariant, compute cT|W(x) and verify that cT(x) is divisible by cT|W(x).

14. Prove Proposition 11.3.3:
    Let S and T be linear operators on V. Suppose W is a subspace of V which is both S-invariant and T-invariant. Show that
    (a) W is (S ∘ T)-invariant and (S ∘ T)|W = S|W ∘ T|W;
    (b) W is (S + T)-invariant and (S + T)|W = S|W + T|W; and
    (c) for any scalar c, W is (cT)-invariant and (cT)|W = c(T|W).

15. Let V be a real vector space which has a basis B = {v1, v2, v3, v4}. Suppose T is a linear operator on V such that

    [T]B = [ 1 1 1 1
             2 1 2 1
             1 1 3 1
             1 1 1 1 ].

Let W = span{v4, T(v4), T²(v4), ...}, the T-cyclic subspace generated by v4.

(a) Compute T(v4) and T²(v4).
(b) What is the dimension of W?
(c) Write down the characteristic polynomial of T|W.

16. Let T be a linear operator on a vector space V. Suppose u and v are two eigenvectors of T associated with eigenvalues λ and μ, respectively, where λ ≠ μ. Let W = span{u, v} and B = {w, T(w)} where w = u + v.
(a) Show that W is a T-invariant subspace of V.
(b) Prove that B is a basis for W.
(c) Find cT|W(x) and [T|W]B.

17. Let S be the shift operator on R^N, i.e. S((an)n∈N) = (an+1)n∈N for (an)n∈N ∈ R^N, and let

    W = {(an)n∈N ∈ R^N | an+3 = 2an+2 + an+1 − 2an for n = 1, 2, 3, ...}.

(a) Is W an S-invariant subspace of R^N?
(b) Find dim(W).
(c) Take b = (bn)n∈N ∈ W with b1 = 0, b2 = 0, b3 = 1. Prove that {b, S(b), S²(b)} is a basis for W.
(d) Find the characteristic polynomial of S|W.
(e) Find a basis B for W such that [S|W]B is a diagonal matrix.

18. Use the shift operator S in Question 11.17. Let

    W′ = {(an)n∈N ∈ R^N | an+2 = 2an+1 − an for n = 1, 2, 3, ...}.

Show that S|W′ is not diagonalizable.

19. Let A ∈ Mn×n(F) where F is a field. Define T to be a linear operator on Mn×n(F) such that T(X) = AX for X ∈ Mn×n(F).
Let Eij, i, j = 1, 2, ..., n, be the n × n matrices as defined in Example 8.4.6.7. For q = 1, 2, ..., n, define Wq = span(Bq) where Bq = {E1q, E2q, ..., Enq}.
(a) Is Mn×n(F) = W1 ⊕ W2 ⊕ ··· ⊕ Wn?
(b) Prove that Wq is T-invariant. What is [T|Wq]Bq?
(c) Let B = {E11, E21, ..., En1, E12, E22, ..., En2, ..., E1n, E2n, ..., Enn}. Write down [T]B and hence prove that cT(x) = [cA(x)]ⁿ.
(d) Prove that T is diagonalizable if and only if A is diagonalizable.

20. Let W1 and W2 be subspaces of a vector space V such that V = W1 ⊕ W2 and let P : V → V be the projection on W1 along W2 defined in Question 9.6. For a linear operator T on V, prove that T ∘ P = P ∘ T if and only if both W1 and W2 are T-invariant. (Hint: You need the results proved in Question 9.6.)

21. Let T be a linear operator on a vector space V and W a T-invariant subspace of V.
(a) For u, v ∈ V, show that if W + u = W + v, then W + T(u) = W + T(v).
(b) Define a mapping T/W : V/W → V/W such that

    T/W(W + u) = W + T(u)   for W + u ∈ V/W.

(By Part (a), T/W is well-defined.)
    (i) Show that T/W is a linear operator on V/W.
    (ii) Show that cT(x) = cT|W(x) cT/W(x). (Hint: Follow Discussion 11.3.4.)
    (iii) Let 𝒰 be a T/W-invariant subspace of V/W and U = {u ∈ V | W + u ∈ 𝒰}. Show that U is a T-invariant subspace of V.

22. Let T be a linear operator on a finite dimensional vector space V where dim(V) = n. Prove that T is triangularizable if and only if there exist T-invariant subspaces W1, W2, ..., Wn of V such that W1 ⊆ W2 ⊆ ··· ⊆ Wn = V and dim(Wj) = j for j = 1, 2, ..., n. (See Question 11.11 for the definition of "triangularizable".)

23. Reprove Theorem 11.2.3.2 without using the matrix arguments:
    Let T be a linear operator on a finite dimensional vector space V over F where dim(V) ≥ 1. If the characteristic polynomial of T can be factorized into linear factors over F, by using results of Question 11.21 and Question 11.22, prove that T is triangularizable.

24. (a) Let S and T be linear operators on a vector space V such that S ∘ T = T ∘ S. Prove that Eλ(T) is S-invariant where λ is an eigenvalue of T.
    (b) Let S and T be linear operators on a finite dimensional vector space V, where dim(V) ≥ 1, over a field F such that S ∘ T = T ∘ S. If the characteristic polynomials of S and T can both be factorized into linear factors over F, prove that there exists an ordered basis B for V such that both [S]B and [T]B are upper triangular matrices.
    (c) Restate the result in Part (b) using square matrices.

Question 11.25 to Question 11.32 are exercises for Section 11.4.

25. Let T be a linear operator on a vector space V over a field F and p(x) a polynomial over F. Prove that Ker(p(T)) and R(p(T)) are T-invariant.

26. Let F be a field and p(x), q(x) two polynomials over F. The greatest common divisor (or the highest common factor) of p(x) and q(x), denoted by gcd(p(x), q(x)), is the monic polynomial of highest degree which divides both p(x) and q(x). The following is an algorithm, called the Euclidean Algorithm, which is used to find gcd(p(x), q(x)) (a small code sketch is given after this question):
Assume deg(p(x)) ≥ deg(q(x)).

Step 1: Let r0(x) = p(x) and r1(x) = q(x). Set t = 1.
Step 2: Divide rt−1(x) by rt(x) to get the remainder rt+1(x), i.e. rt+1(x) is the polynomial over F satisfying

    rt−1(x) = qt(x) rt(x) + rt+1(x)   and   deg(rt+1(x)) < deg(rt(x))

for some polynomial qt(x) over F.
Step 3: If rt+1(x) ≠ 0, increase the value of t by 1 and go to Step 2.
Step 4: Now, rt+1(x) = 0. Let c be the coefficient of the term of highest degree in rt(x). Then gcd(p(x), q(x)) = c⁻¹ rt(x).

(a) For each of the following cases, use the Euclidean Algorithm to find gcd(p(x), q(x)).
    (i) F = R, p(x) = x⁵ − 3x⁴ + 2x³ − 2x² − 2x + 1 and q(x) = x⁴ − x³ + 7x² − 2x.
    (ii) F = F2, p(x) = x⁵ + x⁴ + 1 and q(x) = x⁵ + x² + x + 1.
(b) Prove that the polynomial c⁻¹ rt(x) in Step 4 is the greatest common divisor of p(x) and q(x).
(c) Prove that there exist polynomials a(x) and b(x) over F such that

    a(x) p(x) + b(x) q(x) = gcd(p(x), q(x)).
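The following is a minimal sketch of the Euclidean Algorithm above for polynomials with rational coefficients; the list-of-coefficients representation and the helper names are ours, not part of the notes:

    from fractions import Fraction

    def deg(p):
        """Degree of p (list of coefficients, constant term first); deg(0) = -1."""
        return max((i for i, c in enumerate(p) if c != 0), default=-1)

    def poly_divmod(a, b):
        """Return (q, r) with a = q*b + r and deg(r) < deg(b)."""
        r, q = a[:], [Fraction(0)] * max(len(a), 1)
        while deg(r) >= deg(b):
            shift = deg(r) - deg(b)
            coef = r[deg(r)] / b[deg(b)]
            q[shift] += coef
            for i, c in enumerate(b):          # r = r - coef * x^shift * b
                r[i + shift] -= coef * c
        return q, r

    def poly_gcd(p, q):
        """Steps 1-4 of Question 11.26: iterate remainders, then make monic."""
        r_prev, r_cur = p, q
        while deg(r_cur) >= 0:
            _, r_next = poly_divmod(r_prev, r_cur)
            r_prev, r_cur = r_cur, r_next
        lead = r_prev[deg(r_prev)]
        return [c / lead for c in r_prev]      # c^{-1} r_t(x)

    # gcd(x^2 - 1, x^2 - 2x + 1) = x - 1 (our own test case):
    p = [Fraction(-1), Fraction(0), Fraction(1)]
    q = [Fraction(1), Fraction(-2), Fraction(1)]
    print(poly_gcd(p, q))    # coefficients of -1 + x (plus a trailing zero)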



27. Let T be a linear operator on a vector space V over a field F and let p(x), q(x) be polynomials over F such that gcd(p(x), q(x)) = 1.
(a) Prove that Ker(q(T)) ⊆ R(p(T)).
(b) Prove that Ker(p(T)) ∩ Ker(q(T)) = {0}.
(Hint: Use the result of Question 11.26(c).)

28. The following is a famous "wrong proof" of the Cayley-Hamilton Theorem:
    Let A be a square matrix. As cA(x) = det(xI − A), cA(A) = det(AI − A) = det(A − A) = det(0) = 0.
Which of the equalities in the argument above is wrong? Why?

29. Define p(x) = det(xB − C) where B and C are two n × n matrices. Suppose there exists an n × n matrix A such that AB = C. Prove that p(A) = 0.

30. Let A = [ 1 1 1 1
              1 2 1 0
              0 0 1 0
              1 0 0 1 ]  be a real matrix.
(a) Find the characteristic polynomial of A.
(b) Find a real polynomial p(x) such that A⁻¹ = p(A).

31. Let A be a nonzero n × n matrix over a field F. Prove that if A is invertible, then there is a polynomial p(x) over F such that A⁻¹ = p(A).

32. Consider the linear operator T = LA on R⁴ with

    A = [ 0 1 1 1
          0 1 0 1
          1 0 2 0
          0 1 0 3 ].

Define W1 = Ker(p(T)) and W2 = Ker(q(T)) where p(x) = (x − 1)² and q(x) = (x − 2)².

(a) Compute cT(x) and show that the eigenvalues of T are 1 and 2.
(b) Find a basis {w1, w2} for W1 and a basis {w3, w4} for W2.
(c) Is R⁴ = W1 ⊕ W2? (Hint: Use the result of Question 11.27(b).)
(d) Using B = {w1, w2, w3, w4} as an ordered basis for R⁴, compute [T]B.

Question 11.33 to Question 11.40 are exercises for Section 11.5.


33. Restate the results in Lemma 11.5.5 (Parts 1 and 3), Theorem 11.5.8 and Theorem 11.5.10 using square matrices.
(Hint: For Theorem 11.5.8, follow Discussion 11.3.12 to relate the direct sum of T-invariant subspaces with square matrices.)

34. Let A = [ 4  0  1
              2  3  2
              1  0  4 ]  be a real matrix.
(a) Find the minimal polynomial of A.
(b) By the result of (a), determine if A is diagonalizable.

35. Let A be a complex square matrix of order 3 such that

    (A − I)(A + iI) = 0.                                  (11.6)

(a) List all possible answers for mA(x).
(b) List all possible answers for cA(x).
(c) Is A invertible? Justify your answer.
(d) Is A diagonalizable? Justify your answer.
(e) Find all complex 3 × 3 matrices that satisfy the equation (11.6).

36. (a) For n ≥ 2, factorize xⁿ − x into linear factors over C.
    (b) Let A be a complex square matrix such that Aⁿ = A for some n ≥ 2. Prove that A is diagonalizable.

37. (a) Prove that similar square matrices have the same minimal polynomial.
    (b) Suppose two square matrices have the same minimal polynomial; is it true that they are similar?

38. Prove Theorem 11.5.7:
    Let T be a linear operator on a vector space V. Suppose W1 and W2 are T-invariant subspaces of V.
    (a) Show that W1 + W2 is T-invariant.
    (b) If W1 and W2 are finite dimensional with dim(W1) ≥ 1 and dim(W2) ≥ 1, prove that mT|W1+W2(x) is equal to the least common multiple of mT|W1(x) and mT|W2(x), i.e.

        mT|W1+W2(x) = mT|W1(x) mT|W2(x) / gcd( mT|W1(x), mT|W2(x) ).

39. (a) Let S and T be two diagonalizable linear operators on a finite dimensional vector space V. Show that there exists an ordered basis B for V such that [S]B and [T]B are diagonal matrices if and only if S ∘ T = T ∘ S.
    (b) Let A and B be two diagonalizable matrices in Mn×n(F). Show that there exists an invertible matrix P ∈ Mn×n(F) such that P⁻¹AP and P⁻¹BP are diagonal matrices if and only if AB = BA.

40. Let T be a linear operator on a finite dimensional vector space V over a field F where dim(V) ≥ 1.
(a) Suppose mT(x) = p(x) q(x) where p(x) and q(x) are polynomials over F such that deg(p(x)) ≥ 1, deg(q(x)) ≥ 1 and gcd(p(x), q(x)) = 1.
    (i) Prove that R(p(T)) = Ker(q(T)).
    (ii) Prove that V = Ker(p(T)) ⊕ Ker(q(T)).
    (Hint: Use the results of Question 11.27.)
(b) Complete the proof of Theorem 11.5.8:
    Prove that V = Kλ1(T) ⊕ Kλ2(T) ⊕ ··· ⊕ Kλk(T) where λ1, λ2, ..., λk are all the (distinct) eigenvalues of T.

Question 11.41 to Question 11.54 are exercises for Section 11.6.


41. For each of the characteristic polynomials cA(x) of a real matrix A below, (i) list all possible (non-similar) Jordan canonical forms for A; and (ii) for each possible Jordan canonical form, write down the minimal polynomial mA(x) of A.
(a) cA(x) = (x − 1)(x − 2)(x − 3)(x − 4).
(b) cA(x) = (x − 1)²(x − 2)².
(c) cA(x) = (x − 1)⁴.

42. Let V be a real vector space of dimension 5 and let T be a linear operator on V. Suppose that the minimal polynomial of T is

    mT(x) = (x − 1)(x − 2)².

(i) List all possible (non-similar) Jordan canonical forms for T.
(ii) For each possible Jordan canonical form, write down the characteristic polynomial cT(x) of T.

43. Let A be a complex square matrix of order 3 such that

    (A − I)²(A + iI) = 0.                                 (11.7)

(a) List all possible answers for mA(x).
(b) List all possible answers for cA(x).
(c) Find all complex 3 × 3 matrices that satisfy the equation (11.7).

44. Let T be a linear operator on a real vector space such that T has a Jordan canonical form

    J = [ J3(2)    0      0
            0    J2(2)    0
            0      0    J2(3) ].

(a) Write down cT(x) and mT(x).
(b) Write down nullity(T − 2IV) and find nullity((T − 2IV)²).

45. Let T be a linear operator on a finite dimensional real vector space V with an ordered basis C = {v1, v2, ..., v11} such that

    [T]C = [ 2 1 0 0 0 0 0 0 0 0 0
             0 2 1 0 0 0 0 0 0 0 0
             0 0 2 0 0 0 0 0 0 0 0
             0 0 0 2 1 0 0 0 0 0 0
             0 0 0 0 2 0 0 0 0 0 0
             0 0 0 0 0 8 1 0 0 0 0
             0 0 0 0 0 0 8 1 0 0 0
             0 0 0 0 0 0 0 8 0 0 0
             0 0 0 0 0 0 0 0 0 1 0
             0 0 0 0 0 0 0 0 0 0 0
             0 0 0 0 0 0 0 0 0 0 0 ].

(a) Find the characteristic polynomial and the minimal polynomial of T.
(b) For each 1 ≤ i ≤ 11, express T(vi) as a linear combination of the vectors in C.
(c) Find the eigenvalues of T and bases for the corresponding eigenspaces.
(d) For each eigenvalue λ of T, write down a basis for Kλ(T).
46. Let A = [ 0 0 1 1
              2 2 1 1
              0 0 1 1
              0 0 1 1 ]  be a real matrix.

(a) Compute the characteristic polynomial of A and find all eigenvalues.
(b) For each eigenvalue λ obtained in Part (a), determine the dimension of the eigenspace associated with λ.
(c) Find a Jordan canonical form for A.
(d) Write down the minimal polynomial of A.

47. Let Eij be the n × n matrix as defined in Example 8.4.6.7.
(a) Compute Eij².
(b) Find the characteristic polynomial and the minimal polynomial of Eij.
(c) Write down a Jordan canonical form for Eij.

48. (a) Let J = Jt(λ) where λ ≠ 0.
    (i) Compute the inverse of J.
    (ii) Find the minimal polynomial of J⁻¹.
    (iii) Write down a Jordan canonical form for J⁻¹.
(b) Suppose a square matrix A has a Jordan canonical form

    [ Jt1(λ1)                     0
              Jt2(λ2)
                      ⋱
      0                  Jtm(λm) ]

where λ1, λ2, ..., λm are nonzero. Write down a Jordan canonical form for A⁻¹.

49. Let T be a linear operator on a finite dimensional vector space V. Define Q = T − λIV where λ is a scalar. Suppose Q² = OV and Q ≠ OV. Let W = Ker(Q) and let {W + v1, W + v2, ..., W + vm} be a basis for V/W.
(a) Show that Q(v1), Q(v2), ..., Q(vm) are linearly independent vectors in W.
(b) Let {Q(v1), Q(v2), ..., Q(vm), w1, w2, ..., wk} be a basis for W.
    Explain why B = {Q(v1), v1, Q(v2), v2, ..., Q(vm), vm, w1, w2, ..., wk} is a basis for V. Write down [Q]B and [T]B using B as an ordered basis.
(This question is a particular case of Question 11.53.)

50. (a) Let V be a finite dimensional vector space over a field F.
    Suppose T is a linear operator on V such that [T]B = Jn(λ) where n = dim(V), λ ∈ F and B = {v1, v2, ..., vn} is an ordered basis for V.
    Let C = {vn, vn−1, ..., v1}. Write down [T]C.
(b) Prove that every complex square matrix is similar to its transpose.

51. (a) For s ≥ 1, find the nullity of (Jt(λ) − λIt)^s.
(b) Let T be a linear operator on a finite dimensional vector space V such that T has a Jordan form

    J = [ Jt1(λ)                    0
                 Jt2(λ)
                        ⋱
          0                 Jtm(λ) ]

where t1 + t2 + ··· + tm = dim(V).
    (i) Show that for s ≥ 1, the nullity of (J − λIn)^s is equal to Σᵢ₌₁ᵐ min{s, ti}.
    (ii) Suppose nullity((T − λIV)^s) − nullity((T − λIV)^(s−1)) = r where s ≥ 2. What can you say about the Jordan blocks in J based on the value of r? (Compare your answer with the results in Question 11.53.)

52. Let T be a real linear operator on a finite dimensional vector space V such that
(i) cT(x) = (x + 1)⁹(x − 2)⁷;
(ii) mT(x) = (x + 1)⁴(x − 2)³;
(iii) nullity(T + IV) = 4, nullity((T + IV)²) = 7, nullity((T + IV)³) = 8; and
(iv) nullity(T − 2IV) = 3, nullity((T − 2IV)²) = 5.
Find a Jordan canonical form for T. (Hint: You may need the answer to Question 11.51(b)(ii).)

53. (This question is the preliminary of the proof of Theorem 11.6.4 in Question 11.54. You can only use results discussed before Theorem 11.6.4 to do this question.)
Let T be a linear operator on a finite dimensional vector space V. Suppose λ is an eigenvalue of T. Define Q = T − λIV and Nt = Ker(Q^t) for t ≥ 0.
(a) Suppose v ∈ Nt \ Nt−1 where t ≥ 1. Let B = {Q^(t−1)(v), ..., Q(v), v} and W = span(B).
    (i) Prove that W is T-invariant.
    (ii) Prove that B is a basis for W.
    (iii) Using B as an ordered basis, write down [T|W]B.
(b) For 1 ≤ t ≤ s, let Ct ⊆ Nt \ Nt−1 such that {Nt−1 + v | v ∈ Ct} is a basis for Nt/Nt−1. (We also assume that for any u, v ∈ Ct, if u ≠ v, then Nt−1 + u ≠ Nt−1 + v.)
    (i) Prove that for t ≥ 2, if Ct = {v1, v2, ..., vk}, then Nt−2 + Q(v1), Nt−2 + Q(v2), ..., Nt−2 + Q(vk) are linearly independent vectors in Nt−1/Nt−2.
    (ii) Prove that C1 ∪ C2 ∪ ··· ∪ Cs is a basis for Ns.
(c) For s ≥ 1, show that there exists an ordered basis D for Ns such that [T|Ns]D is a Jordan canonical form for T|Ns. (Hint: Construct the sets Ct in Part (b) so that {Q(v) | v ∈ Ct} ⊆ Ct−1 for t ≥ 2. Then rearrange the vectors in C1 ∪ C2 ∪ ··· ∪ Cs to form a suitable ordered basis D for V.)

54. (This question is a proof of Theorem 11.6.4. You can only use results of Question 11.53 and results discussed before Theorem 11.6.4 to do this question.)
Let T be a linear operator on a finite dimensional space V over a field F. Suppose the characteristic polynomial of T can be factorized into linear factors over F. Use the results of Theorem 11.5.8 and Question 11.53 to show that there exists an ordered basis B for V such that [T]B is a Jordan canonical form for T.
Chapter 12

Inner Product Spaces


In this chapter, we only study real and complex vector spaces.

Section 12.1 Inner Products

Discussion 12.1.1  In Chapter 5, we used the dot product to define lengths, distances and angles in Rⁿ. For general vector spaces, we need an abstract version of the "dot product" so that we can generalize the work we have done in Chapter 5. In order to do so, we first require that the field used must have some built-in measurement and ordering. Since not all fields are suitable, we only study vector spaces over R and C in this chapter.

Notation 12.1.2  For a complex number c = a + bi, where a, b ∈ R, we use c̄ to denote the complex conjugate of c, i.e. c̄ = a − bi. In particular, if c is real, then c̄ = c.
Let A = (aij)m×n be a complex matrix. We use Ā to denote the conjugate of A, i.e. Ā = (āij)m×n. Furthermore, the conjugate transpose of A is denoted by

    A* = (Ā)ᵀ = (āji)n×m.

In particular, if A is a real matrix, then Ā = A and A* = Aᵀ.
Let A, B ∈ Mm×n(C), C ∈ Mn×p(C) and c ∈ C. Then

    (A + B)* = (conj(A + B))ᵀ = (Ā + B̄)ᵀ = Āᵀ + B̄ᵀ = A* + B*,
    (AC)*    = (conj(AC))ᵀ    = (Ā C̄)ᵀ   = C̄ᵀ Āᵀ   = C* A*,
    (cA)*    = (conj(cA))ᵀ    = (c̄ Ā)ᵀ   = c̄ Āᵀ    = c̄ A*,

where conj(·) denotes entrywise complex conjugation.

Definition 12.1.3  Let F = R or C and let V be a vector space over F. An inner product on V is a mapping which assigns to each ordered pair of vectors u, v ∈ V a scalar ⟨u, v⟩ ∈ F such that it satisfies the following axioms:
(IP1) For all u, v ∈ V, ⟨u, v⟩ = conj(⟨v, u⟩). (This axiom implies ⟨u, u⟩ ∈ R for all u ∈ V.)
(IP2) For all u, v, w ∈ V, ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩.
(IP3) For all c ∈ F and u, v ∈ V, ⟨cu, v⟩ = c⟨u, v⟩.
(IP4) ⟨0, 0⟩ = 0 and, for all nonzero u ∈ V, ⟨u, u⟩ > 0.
(Compare the axioms with Theorem 5.1.5.)

Remark 12.1.4
1. If F = R, we can rewrite (IP1) as:
   (IP1′) For all u, v ∈ V, ⟨u, v⟩ = ⟨v, u⟩.
   For this case, we say that the inner product is symmetric.
2. By (IP1) and (IP2), for all u, v, w ∈ V,

    ⟨w, u + v⟩ = conj(⟨u + v, w⟩) = conj(⟨u, w⟩ + ⟨v, w⟩) = conj(⟨u, w⟩) + conj(⟨v, w⟩) = ⟨w, u⟩ + ⟨w, v⟩.

3. By (IP1) and (IP3), for all c ∈ F and u, v ∈ V,

    ⟨u, cv⟩ = conj(⟨cv, u⟩) = conj(c⟨v, u⟩) = c̄ conj(⟨v, u⟩) = c̄ ⟨u, v⟩.

   When F = R, we have ⟨u, cv⟩ = c⟨u, v⟩.
4. By (IP2), for all u ∈ V,

    ⟨0, u⟩ = ⟨0 + 0, u⟩ = ⟨0, u⟩ + ⟨0, u⟩  ⟹  ⟨0, u⟩ = 0

   and then by (IP1), ⟨u, 0⟩ = conj(⟨0, u⟩) = conj(0) = 0.

Definition 12.1.5  A vector space V, over F = R or C, equipped with an inner product is called an inner product space.
If F = R, V is called a real inner product space; and if F = C, V is called a complex inner product space.

Example 12.1.6
1. For all u = (u1, u2, ..., un), v = (v1, v2, ..., vn) ∈ Rⁿ, define

    ⟨u, v⟩ = u1v1 + u2v2 + ··· + unvn = uvᵀ.

Note that ⟨ , ⟩ is actually the dot product defined in Chapter 5. By Theorem 5.1.5, ⟨ , ⟩ is an inner product on Rⁿ.
This inner product is also called the usual inner product or the Euclidean inner product on Rⁿ. Furthermore, the term Euclidean n-space usually refers to the vector space Rⁿ equipped with this inner product.

2. For all u = (u1, u2, ..., un), v = (v1, v2, ..., vn) ∈ Cⁿ, define

    ⟨u, v⟩ = u1v̄1 + u2v̄2 + ··· + unv̄n = uv*.

Then ⟨ , ⟩ is an inner product on Cⁿ which is called the usual inner product on Cⁿ.

3. Let F = R or C. For A, B ∈ Mm×n(F), define ⟨A, B⟩ = tr(AB*). (See Definition 8.1.10 and Proposition 8.1.11 for the definition and properties of the trace function tr.)
If A = (aij) and B = (bij), then

    ⟨A, B⟩ = tr(AB*) = tr( ( Σₖ₌₁ⁿ aik b̄jk )m×m ) = Σᵢ₌₁ᵐ Σₖ₌₁ⁿ aik b̄ik.

It is obvious that ⟨ , ⟩ is an inner product on Mm×n(F).
4. Determine which of the following are inner products on R².
(a) ⟨u, v⟩ = u1² + u2² + v1² + v2² for u = (u1, u2), v = (v1, v2) ∈ R².
(b) ⟨u, v⟩ = u1v1 + 2u2v2 for u = (u1, u2), v = (v1, v2) ∈ R².
(c) ⟨u, v⟩ = u1v1 − 2u2v2 for u = (u1, u2), v = (v1, v2) ∈ R².
(d) ⟨u, v⟩ = (3/2)u1v1 − (1/2)u1v2 + (1/2)u2v1 + (3/2)u2v2 for u = (u1, u2), v = (v1, v2) ∈ R².
(e) ⟨u, v⟩ = (3/2)u1v1 − (1/2)u1v2 − (1/2)u2v1 + (3/2)u2v2 for u = (u1, u2), v = (v1, v2) ∈ R².

Solution
(a) It does not satisfy (IP2) and (IP3). Hence it is not an inner product.
(b) It is an inner product. (Check it.)
(c) It does not satisfy (IP4). Hence it is not an inner product.
(d) It does not satisfy (IP1). Hence it is not an inner product.
(e) ⟨u, v⟩ = ((u1 + u2)/√2)((v1 + v2)/√2) + 2((u1 − u2)/√2)((v1 − v2)/√2) for all u = (u1, u2), v = (v1, v2) ∈ R². Same as (b), it is an inner product.
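For (b) and (c), axiom (IP4) can also be probed by a quick random search for a counterexample. A small sketch in Python (the helper name is ours):

    import random

    def find_ip4_witness(inner, trials=1000):
        """Search for a nonzero u with <u, u> <= 0, i.e. a witness that IP4 fails."""
        for _ in range(trials):
            u = (random.uniform(-1, 1), random.uniform(-1, 1))
            if u != (0.0, 0.0) and inner(u, u) <= 0:
                return u
        return None

    inner_b = lambda u, v: u[0]*v[0] + 2*u[1]*v[1]
    inner_c = lambda u, v: u[0]*v[0] - 2*u[1]*v[1]

    print(find_ip4_witness(inner_b))   # None expected: (b) satisfies IP4
    print(find_ip4_witness(inner_c))   # e.g. (0.1, 0.9): <u, u> = u1^2 - 2 u2^2 < 0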
5. Let [a, b], with a < b, be a closed interval on the real line. Consider the vector space C([a, b]) defined in Example 8.3.6.5. Then

    ⟨f, g⟩ = (1/(b − a)) ∫ₐᵇ f(t)g(t) dt   for f, g ∈ C([a, b])

is an inner product on C([a, b]). (We leave the verification as an exercise. See Question 12.5.)

6. Let V be the set of all real infinite sequences (an)n∈N such that Σₙ₌₁^∞ an² converges. Define

    ⟨(an)n∈N, (bn)n∈N⟩ = Σₙ₌₁^∞ anbn   for (an)n∈N, (bn)n∈N ∈ V.

Then V is a real inner product space. (We leave the verification as an exercise. See Question 12.6.)
This inner product space is called the l²-space. It is an example of a class of inner product spaces known as Hilbert spaces.

Section 12.2 Norms and Distances

Discussion 12.2.1  One of the important uses of an inner product is that we can use it to measure the length of a vector and the distance between two vectors.

Definition 12.2.2  Let V be an inner product space.
1. For u ∈ V, the norm (or length) of u is defined to be ||u|| = √⟨u, u⟩.
   In particular, vectors of norm 1 are called unit vectors.
2. For u, v ∈ V, the distance between u and v is d(u, v) = ||u − v||.
Example 12.2.3
1. Let Rⁿ be equipped with the usual inner product. For u = (u1, u2, ..., un) ∈ Rⁿ,

    ||u|| = √⟨u, u⟩ = √(u · u) = √(u1² + u2² + ··· + un²)

and for u = (u1, u2, ..., un), v = (v1, v2, ..., vn) ∈ Rⁿ,

    d(u, v) = ||u − v|| = √⟨u − v, u − v⟩
            = √((u − v) · (u − v)) = √((u1 − v1)² + (u2 − v2)² + ··· + (un − vn)²).

(See Section 5.1.)

2. Let Cⁿ be equipped with the usual inner product. For u = (u1, u2, ..., un) ∈ Cⁿ,

    ||u|| = √⟨u, u⟩ = √(|u1|² + |u2|² + ··· + |un|²)

and for u = (u1, u2, ..., un), v = (v1, v2, ..., vn) ∈ Cⁿ,

    d(u, v) = ||u − v|| = √⟨u − v, u − v⟩ = √(|u1 − v1|² + |u2 − v2|² + ··· + |un − vn|²).

(For c = a + bi ∈ C where a, b ∈ R, |c| = √(c c̄) = √(a² + b²) is called the modulus of c.)

3. We compare the usual inner product on R² with the inner products defined in Parts (b) and (e) of Example 12.1.6.4.

    The usual inner product:  ⟨(1,0), (0,1)⟩ = 0,     ||(1,0)|| = 1,       ||(0,1)|| = 1,       d((1,0), (0,1)) = ||(1,−1)|| = √2.
    Example 12.1.6.4(b):      ⟨(1,0), (0,1)⟩ = 0,     ||(1,0)|| = 1,       ||(0,1)|| = √2,      d((1,0), (0,1)) = ||(1,−1)|| = √3.
    Example 12.1.6.4(e):      ⟨(1,0), (0,1)⟩ = −1/2,  ||(1,0)|| = √(3/2),  ||(0,1)|| = √(3/2),  d((1,0), (0,1)) = ||(1,−1)|| = 2.

Using the usual inner product, for u = (x, y) ∈ R²,

    ||u|| = 1  ⟺  x² + y² = 1.

Thus, the set of all unit vectors forms the circle x² + y² = 1.
Similarly, using the inner product in Example 12.1.6.4(b), the set of all unit vectors forms the ellipse x² + 2y² = 1; and using the inner product in Example 12.1.6.4(e), the set of all unit vectors forms the ellipse 3x² − 2xy + 3y² = 2.

[Figure: three diagrams of the unit vectors in R² under the three inner products: the circle x² + y² = 1 (the usual inner product), the ellipse x² + 2y² = 1 (Example 12.1.6.4(b)) and the ellipse 3x² − 2xy + 3y² = 2 (Example 12.1.6.4(e)).]

4. Let Mm×n(F), where F = R or C, be equipped with the inner product defined in Example 12.1.6.3. For A = (aij), B = (bij) ∈ Mm×n(F),

    ||A|| = √⟨A, A⟩ = √( Σᵢ₌₁ᵐ Σₖ₌₁ⁿ aik āik ) = √( Σᵢ₌₁ᵐ Σₖ₌₁ⁿ |aik|² )

and

    d(A, B) = ||A − B|| = √( Σᵢ₌₁ᵐ Σₖ₌₁ⁿ |aik − bik|² ).

5. Let [a, b], with a < b, be a closed interval on the real line. Suppose the vector space C([a, b]) is equipped with the inner product defined in Example 12.1.6.5. For f, g ∈ C([a, b]),

    ||f|| = √⟨f, f⟩ = √( (1/(b − a)) ∫ₐᵇ [f(t)]² dt )

and

    d(f, g) = ||f − g|| = √( (1/(b − a)) ∫ₐᵇ [f(t) − g(t)]² dt ).

Theorem 12.2.4  Let V be an inner product space over F = R or C.
1. ||0|| = 0 and, for any nonzero u ∈ V, ||u|| > 0.
2. For any c ∈ F and u ∈ V, ||cu|| = |c| ||u||.
3. (Cauchy-Schwarz Inequality) For any u, v ∈ V, |⟨u, v⟩| ≤ ||u|| ||v||.
   The equality holds if and only if u and v are linearly dependent, i.e. u = av for some a ∈ F or v = bu for some b ∈ F.
4. (Triangle Inequality) For any u, v ∈ V, ||u + v|| ≤ ||u|| + ||v||.

Proof  We only prove the Cauchy-Schwarz Inequality:
If v = 0, then |⟨u, v⟩| = 0 = ||u|| ||v||. So the inequality is trivially true for this case.
Now, assume v ≠ 0. Define

    w = u − (⟨u, v⟩/||v||²) v.

Then

    0 ≤ ||w||² = ⟨w, w⟩ = ⟨ u − (⟨u, v⟩/||v||²) v, u − (⟨u, v⟩/||v||²) v ⟩
      = ⟨u, u⟩ − (conj(⟨u, v⟩)/||v||²) ⟨u, v⟩ − (⟨u, v⟩/||v||²) ⟨v, u⟩ + (⟨u, v⟩ conj(⟨u, v⟩)/||v||⁴) ⟨v, v⟩
      = ⟨u, u⟩ − (⟨u, v⟩ conj(⟨u, v⟩))/||v||² − (⟨u, v⟩ conj(⟨u, v⟩))/||v||² + (⟨u, v⟩ conj(⟨u, v⟩))/||v||²
      = ||u||² − |⟨u, v⟩|²/||v||².                                         (12.1)

So |⟨u, v⟩|²/||v||² ≤ ||u||² and hence |⟨u, v⟩|² ≤ ||u||² ||v||².
(Proofs of the other parts are left as exercises. See Question 12.12.)

Example 12.2.5
1. (Cauchy-Schwarz Inequality for Real Numbers) For any real numbers x1, x2, ..., xn, y1, y2, ..., yn, prove that

    (x1y1 + x2y2 + ··· + xnyn)² ≤ (x1² + x2² + ··· + xn²)(y1² + y2² + ··· + yn²).

Solution  Use Rⁿ equipped with the usual inner product. Let x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn). Then ||x||² = x1² + x2² + ··· + xn², ||y||² = y1² + y2² + ··· + yn² and ⟨x, y⟩ = x1y1 + x2y2 + ··· + xnyn. The inequality follows by Theorem 12.2.4.3.

2. (Cauchy-Schwarz Inequality for Continuous Functions) For any f, g ∈ C([a, b]), where [a, b], with a < b, is a closed interval on the real line, prove that

    ( ∫ₐᵇ f(t)g(t) dt )² ≤ ( ∫ₐᵇ f(t)² dt ) ( ∫ₐᵇ g(t)² dt ).

Solution  The inequality follows by applying Theorem 12.2.4.3 to the inner product defined in Example 12.1.6.5.
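The Cauchy-Schwarz Inequality is easy to sanity-check on random vectors. A small sketch assuming NumPy, using the usual inner product on Cⁿ (the random data is ours):

    import numpy as np

    rng = np.random.default_rng(0)

    def inner(u, v):
        # the usual inner product on C^n: <u, v> = sum of u_i * conj(v_i)
        return np.sum(u * np.conj(v))

    for _ in range(5):
        u = rng.normal(size=4) + 1j * rng.normal(size=4)
        v = rng.normal(size=4) + 1j * rng.normal(size=4)
        lhs = abs(inner(u, v))
        rhs = np.sqrt(inner(u, u).real) * np.sqrt(inner(v, v).real)
        print(lhs <= rhs + 1e-12)   # always True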

Section 12.3 Orthogonal and Orthonormal Bases

Discussion 12.3.1  In Section 5.1, we used the dot product to define the angle between two vectors in Rⁿ. Given a real inner product space V, we can define the angle between two vectors in V in the same way, i.e. the angle between u, v ∈ V is

    cos⁻¹( ⟨u, v⟩ / (||u|| ||v||) ).

In particular, u and v are perpendicular to each other if and only if ⟨u, v⟩ = 0. Note that by the Cauchy-Schwarz Inequality (Theorem 12.2.4.3), we have −1 ≤ ⟨u, v⟩/(||u|| ||v||) ≤ 1 and hence the angle is well-defined.
Although this definition of "angles" does not work for complex inner product spaces, the concept of "perpendicular" can still be defined accordingly.
In the following, we restate some important results in Chapter 5 using inner product spaces.

Definition 12.3.2  Let V be an inner product space.
1. Two vectors u, v ∈ V are said to be orthogonal to each other if ⟨u, v⟩ = 0.
2. Let W be a subspace of V. A vector u is said to be orthogonal (or perpendicular) to W if u is orthogonal to all vectors in W.
3. A subset B of V is called orthogonal if the vectors in B are pairwise orthogonal.
   If B is an orthogonal set and it is a basis for V, then B is called an orthogonal basis for V.
4. A subset B of V is called orthonormal if B is orthogonal and all vectors in B are unit vectors.
   If B is an orthonormal set and it is a basis for V, then B is called an orthonormal basis for V.
(See Definition 5.2.1, Definition 5.2.5 and Definition 5.2.10.)

Lemma 12.3.3  Let V be an inner product space over F where F = R or C.
1. Let W = span(B) where B ⊆ V. For u ∈ V, u is orthogonal to W if and only if u is orthogonal to every vector in B.
2. If B is an orthogonal set of nonzero vectors from V, then B is always linearly independent.
3. Suppose V is finite dimensional where dim(V) ≥ 1. Let B be an ordered orthonormal basis for V. Then

    ⟨u, v⟩ = (u)B · conj((v)B) = ([u]B)ᵀ conj([v]B).

Furthermore, if F = R, then ⟨u, v⟩ = (u)B · (v)B.

Proof
1. (⇐) It is obvious.
(⇒) Suppose u ∈ V is orthogonal to every vector in B. Take any w ∈ W. Since W = span(B), we can write w = a1v1 + a2v2 + ··· + akvk for some v1, v2, ..., vk ∈ B and a1, a2, ..., ak ∈ F. Then

    ⟨u, w⟩ = ⟨u, a1v1⟩ + ⟨u, a2v2⟩ + ··· + ⟨u, akvk⟩
           = ā1⟨u, v1⟩ + ā2⟨u, v2⟩ + ··· + āk⟨u, vk⟩
           = ā1 0 + ā2 0 + ··· + āk 0 = 0.

So u is orthogonal to all vectors in W and hence is orthogonal to W.

2. Take any finite subset {v1, v2, ..., vk} of B. Consider the vector equation

    c1v1 + c2v2 + ··· + ckvk = 0.                        (12.2)

Since B is orthogonal, ⟨vi, vj⟩ = 0 whenever i ≠ j. For each j = 1, 2, ..., k, (12.2) implies

    ⟨c1v1 + c2v2 + ··· + ckvk, vj⟩ = ⟨0, vj⟩ = 0
    ⟹ c1⟨v1, vj⟩ + c2⟨v2, vj⟩ + ··· + ck⟨vk, vj⟩ = 0
    ⟹ cj⟨vj, vj⟩ = 0
    ⟹ cj = 0   (because vectors in B are nonzero vectors).

Since (12.2) has only the trivial solution, {v1, v2, ..., vk} is linearly independent.
As all finite subsets of B are linearly independent, B is linearly independent.

3. Let B = {w1, w2, ..., wn}. Note that for i, j ∈ {1, 2, ..., n}, ⟨wi, wj⟩ = 1 if i = j and ⟨wi, wj⟩ = 0 if i ≠ j.
Take any u, v ∈ V, say, u = a1w1 + a2w2 + ··· + anwn and v = b1w1 + b2w2 + ··· + bnwn where a1, a2, ..., an, b1, b2, ..., bn ∈ F. Then

    ⟨u, v⟩ = ⟨a1w1 + a2w2 + ··· + anwn, b1w1 + b2w2 + ··· + bnwn⟩
           = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ ⟨aiwi, bjwj⟩
           = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ ai b̄j ⟨wi, wj⟩
           = a1b̄1 + a2b̄2 + ··· + anb̄n
           = (u)B · conj((v)B)
           = ([u]B)ᵀ conj([v]B).

When F = R, ⟨u, v⟩ = a1b1 + a2b2 + ··· + anbn = (u)B · (v)B.

Remark 12.3.4
1. Suppose V is a finite dimensional inner product space. By Theorem 8.5.13 and Lemma 12.3.3.2, if we know the dimension of V, to determine whether a set B of nonzero vectors from V is an orthogonal (respectively, orthonormal) basis for V, we only need to check:
   (i) B is orthogonal (respectively, orthonormal); and
   (ii) |B| = dim(V).
(See Remark 5.2.6.)
2. By Lemma 12.3.3.3, a finite dimensional real inner product space is essentially the same as the Euclidean space.

Example 12.3.5
1. Suppose C³ is equipped with the usual inner product. Let W = span{(1, 1, 1), (1, i, i)}. For any (x, y, z) ∈ C³, by Lemma 12.3.3.1,

    (x, y, z) is orthogonal to W  ⟺  ⟨(x, y, z), (1, 1, 1)⟩ = 0 and ⟨(x, y, z), (1, i, i)⟩ = 0
                                 ⟺  x + y + z = 0 and x − iy − iz = 0
                                 ⟺  x = 0, y = t, z = −t for t ∈ C.

So (0, t, −t), t ∈ C, are all the vectors orthogonal to W.

2. Suppose Mn×n(R) is equipped with the inner product defined in Example 12.1.6.3. Let W = {A ∈ Mn×n(R) | A is symmetric}.
Take any skew symmetric matrix B ∈ Mn×n(R), i.e. Bᵀ = −B. For any A ∈ W, i.e. Aᵀ = A, by Proposition 8.1.11, we have

    tr(ABᵀ) = tr((ABᵀ)ᵀ) = tr(BAᵀ) = tr(BA) = −tr(BᵀA) = −tr(ABᵀ).

This implies that ⟨A, B⟩ = tr(ABᵀ) = 0. So B is orthogonal to W.

3. Let F = R or C. Using the usual inner product, the standard basis {e1, e2, ..., en} for Fⁿ is an orthonormal basis.

4. Let F = R or C. Using the inner product defined in Example 12.1.6.3, the standard basis {Eij | 1 ≤ i ≤ m and 1 ≤ j ≤ n} for Mm×n(F) is an orthonormal basis.

5. Suppose P2(R) is equipped with an inner product such that

    ⟨p(x), q(x)⟩ = (1/2) ∫₋₁¹ p(t)q(t) dt   for p(x), q(x) ∈ P2(R).

Consider the standard basis {1, x, x²} for P2(R). Then

    ⟨1, x⟩ = (1/2) ∫₋₁¹ t dt = 0,   ⟨x, x²⟩ = (1/2) ∫₋₁¹ t³ dt = 0   and   ⟨1, x²⟩ = (1/2) ∫₋₁¹ t² dt = 1/3,

i.e. x is orthogonal to both 1 and x² but 1 and x² are not orthogonal to each other. The standard basis {1, x, x²} is not an orthogonal basis.

Theorem 12.3.6  Let V be a finite dimensional inner product space. If B = {w1, w2, ..., wn} is an orthonormal basis for V, then for any vector u ∈ V,

    u = ⟨u, w1⟩ w1 + ⟨u, w2⟩ w2 + ··· + ⟨u, wn⟩ wn,

i.e. using B as an ordered basis, (u)B = (⟨u, w1⟩, ⟨u, w2⟩, ..., ⟨u, wn⟩).

Proof  The proof follows the same argument as the proof for Theorem 5.2.8.

Theorem 12.3.7 (Gram-Schmidt Process)  Suppose {u1, u2, ..., un} is a basis for a finite dimensional inner product space V. Let

    v1 = u1,
    v2 = u2 − (⟨u2, v1⟩/⟨v1, v1⟩) v1,
    v3 = u3 − (⟨u3, v1⟩/⟨v1, v1⟩) v1 − (⟨u3, v2⟩/⟨v2, v2⟩) v2,
    ⋮
    vn = un − (⟨un, v1⟩/⟨v1, v1⟩) v1 − (⟨un, v2⟩/⟨v2, v2⟩) v2 − ··· − (⟨un, vn−1⟩/⟨vn−1, vn−1⟩) vn−1.

Then {v1, v2, ..., vn} is an orthogonal basis for V. Furthermore, let

    w1 = (1/||v1||) v1,  w2 = (1/||v2||) v2,  ...,  wn = (1/||vn||) vn.

Then {w1, w2, ..., wn} is an orthonormal basis for V.
(The process of converting an orthogonal set to an orthonormal set by multiplying each vector w by 1/||w|| is called normalizing.)

Proof  For each i, write vi = ui − xi where xi = Σⱼ₌₁^(i−1) (⟨ui, vj⟩/⟨vj, vj⟩) vj ∈ span{u1, ..., ui−1}. Since u1, u2, ..., ui are linearly independent, vi ≠ 0. Thus by Remark 12.3.4.1, we only need to show that {v1, v2, ..., vn} is orthogonal. We prove by using mathematical induction.
It is obvious that {v1} is an orthogonal set.
Assume that {v1, v2, ..., vk−1} is an orthogonal set. For i ∈ {1, 2, ..., k − 1},

    ⟨vk, vi⟩ = ⟨ uk − Σⱼ₌₁^(k−1) (⟨uk, vj⟩/⟨vj, vj⟩) vj, vi ⟩
             = ⟨uk, vi⟩ − Σⱼ₌₁^(k−1) (⟨uk, vj⟩/⟨vj, vj⟩) ⟨vj, vi⟩
             = ⟨uk, vi⟩ − (⟨uk, vi⟩/⟨vi, vi⟩) ⟨vi, vi⟩   (because for 1 ≤ j ≤ k − 1, ⟨vj, vi⟩ = 0 if i ≠ j)
             = 0.

So {v1, v2, ..., vk} is orthogonal.
By mathematical induction, {v1, v2, ..., vn} is orthogonal.

Finally, {w1, w2, ..., wn} is orthonormal because

    ||wi|| = || (1/||vi||) vi || = (1/||vi||) ||vi|| = 1   for all i

and

    ⟨wi, wj⟩ = ⟨ (1/||vi||) vi, (1/||vj||) vj ⟩ = (1/(||vi|| ||vj||)) ⟨vi, vj⟩ = 0   if i ≠ j.

Example 12.3.8  Consider the vector space P2(R) equipped with the inner product defined in Example 12.3.5.5. Start with the standard basis {1, x, x²}. By the Gram-Schmidt Process,

    p1(x) = 1,
    p2(x) = x − (⟨x, p1(x)⟩/⟨p1(x), p1(x)⟩) p1(x) = x,
    p3(x) = x² − (⟨x², p1(x)⟩/⟨p1(x), p1(x)⟩) p1(x) − (⟨x², p2(x)⟩/⟨p2(x), p2(x)⟩) p2(x) = −1/3 + x²

form an orthogonal basis for P2(R). Then

    { (1/||p1(x)||) p1(x), (1/||p2(x)||) p2(x), (1/||p3(x)||) p3(x) } = { 1, √3 x, (√5/2)(−1 + 3x²) }

is an orthonormal basis.
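The Gram-Schmidt Process is mechanical enough to run by machine. The sketch below (assuming NumPy; the helper names are ours) reproduces Example 12.3.8 using numpy.polynomial.Polynomial and the inner product ⟨p, q⟩ = (1/2)∫₋₁¹ p(t)q(t) dt:

    import numpy as np
    from numpy.polynomial import Polynomial

    def inner(p, q):
        # <p, q> = (1/2) * integral of p(t) q(t) dt over [-1, 1]
        F = (p * q).integ()              # an antiderivative of p*q
        return 0.5 * (F(1.0) - F(-1.0))

    def gram_schmidt(us):
        # Theorem 12.3.7, normalizing each vector as it is produced
        ws = []
        for u in us:
            v = u
            for w in ws:
                v = v - inner(u, w) * w          # subtract the projection <u, w> w
            ws.append(v / np.sqrt(inner(v, v)))  # normalize
        return ws

    basis = [Polynomial([1.0]), Polynomial([0.0, 1.0]), Polynomial([0.0, 0.0, 1.0])]
    for w in gram_schmidt(basis):
        print(w.coef.round(6))
    # approximately [1.], [0. 1.732051] and [-1.118034 0. 3.354102]:
    # the polynomials 1, sqrt(3) x and (sqrt(5)/2)(-1 + 3 x^2) of Example 12.3.8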

Section 12.4 Orthogonal Complements and Orthogonal Projections

Definition 12.4.1  Let V be an inner product space and W a subspace of V. The orthogonal complement of W is defined to be the set

    W⊥ = {v ∈ V | v is orthogonal to W}
       = {v ∈ V | ⟨v, u⟩ = 0 for all u ∈ W}  ⊆  V.

Example 12.4.2
1. In Example 12.3.5.1, W⊥ = {(0, t, −t) | t ∈ C} = span{(0, 1, −1)} which is also a subspace of C³.
2. For any inner product space V, V⊥ = {0} and {0}⊥ = V.

Theorem 12.4.3  Let V be an inner product space and W a subspace of V.
1. W⊥ is a subspace of V.
2. W ∩ W⊥ = {0}, i.e. W + W⊥ is a direct sum.
3. If W is finite dimensional, then V = W ⊕ W⊥.
4. If V is finite dimensional, then dim(V) = dim(W) + dim(W⊥).

Proof
1. (S1) Since ⟨0, u⟩ = 0 for all u ∈ W, 0 ∈ W⊥.
(S2) Take any v, w ∈ W⊥, i.e. ⟨v, u⟩ = ⟨w, u⟩ = 0 for all u ∈ W. Then

    ⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩ = 0 + 0 = 0   for all u ∈ W.

So v + w ∈ W⊥.
(S3) Take any v ∈ W⊥, i.e. ⟨v, u⟩ = 0 for all u ∈ W. For any scalar c,

    ⟨cv, u⟩ = c⟨v, u⟩ = c0 = 0   for all u ∈ W.

So cv ∈ W⊥.
Since W⊥ is a subset of V satisfying (S1), (S2) and (S3), W⊥ is a subspace of V.

2. If v ∈ W ∩ W⊥, then ⟨v, v⟩ = 0 and hence by (IP4), v = 0. Thus W ∩ W⊥ = {0}. By Theorem 8.6.5, W + W⊥ is a direct sum.

3. If W = {0}, then W ⊕ W⊥ = W⊥ = V.
Suppose W is not a zero space. Let {w1, w2, ..., wk} be an orthonormal basis for W. For any u ∈ V, define

    v = ⟨u, w1⟩ w1 + ⟨u, w2⟩ w2 + ··· + ⟨u, wk⟩ wk   and   v′ = u − v.

We have u = v + v′ where v ∈ W and, since for i = 1, 2, ..., k,

    ⟨v′, wi⟩ = ⟨ u − Σⱼ₌₁ᵏ ⟨u, wj⟩ wj, wi ⟩
             = ⟨u, wi⟩ − Σⱼ₌₁ᵏ ⟨u, wj⟩ ⟨wj, wi⟩
             = ⟨u, wi⟩ − ⟨u, wi⟩ = 0,

by Lemma 12.3.3.1, v′ ∈ W⊥. So we have shown that V = W + W⊥ and by Part 2, V = W ⊕ W⊥.

4. If V is finite dimensional, by Part 3 and Theorem 8.6.7.2, dim(V) = dim(W) + dim(W⊥).

Example 12.4.4
1. In Example 12.3.5.1 and Example 12.4.2.1, dim(W) = 2 and dim(W⊥) = 1. Hence dim(W) + dim(W⊥) = 3 = dim(C³).
2. Suppose Mn×n(R) is equipped with the inner product defined in Example 12.1.6.3. Let W1 = {A ∈ Mn×n(R) | A is symmetric} and W2 = {A ∈ Mn×n(R) | A is skew symmetric}.
By Example 12.3.5.2, we have W2 ⊆ W1⊥.
By Example 8.6.6.2, we have Mn×n(R) = W1 ⊕ W2 and hence

    dim(W2) = dim(Mn×n(R)) − dim(W1)   (by Theorem 8.6.7.2)
            = dim(W1⊥)                 (by Theorem 12.4.3.4).

So by Theorem 8.5.15, W2 = W1⊥.

Remark 12.4.5  Theorem 12.4.3.3 is not always true when W is infinite dimensional.
Consider the l²-space V defined in Example 12.1.6.6. For i = 1, 2, 3, ..., define ei ∈ V to be the real infinite sequence such that the ith term of the sequence is 1 and all other terms are 0. Let W = span{e1, e2, e3, ...} which is an infinite dimensional subspace of V. Note that W ≠ V (see Question 8.20).
Let a = (an)n∈N ∈ W⊥, i.e. ⟨a, ei⟩ = 0 for all i. For each i, by the definition of the inner product of the l²-space,

    ⟨a, ei⟩ = Σₙ₌₁^∞ an (the nth term of ei) = ai.

This means ai = 0 for all i and hence a is the zero sequence.
So we have shown that W⊥ = {0}. In this case, W ⊕ W⊥ = W ≠ V.
Theorem 12.4.6  Let V be an inner product space and W a subspace of V.
1. W ⊆ (W⊥)⊥.
2. If W is finite dimensional, then W = (W⊥)⊥.

Proof
1. Take any u ∈ W. Then

    u ∈ W  ⟹  ⟨u, v⟩ = 0 for all v ∈ W⊥  ⟹  u ∈ (W⊥)⊥.

Thus W ⊆ (W⊥)⊥.
2. Take any v ∈ (W⊥)⊥. Since W is finite dimensional, by Theorem 12.4.3.3, V = W ⊕ W⊥. Write v = w + w′ where w ∈ W and w′ ∈ W⊥. By Part 1, w ∈ (W⊥)⊥ and hence w′ = v − w ∈ (W⊥)⊥. But this means w′ ∈ W⊥ ∩ (W⊥)⊥. By Theorem 12.4.3.2, W⊥ ∩ (W⊥)⊥ = {0}. Thus v − w = w′ = 0, i.e. v = w ∈ W.
So we have shown that (W⊥)⊥ ⊆ W. Together with Part 1, (W⊥)⊥ = W.

Remark 12.4.7  Theorem 12.4.6.2 is not always true when W is infinite dimensional.
Use W defined in Remark 12.4.5. Since W⊥ = {0}, (W⊥)⊥ = {0}⊥ = V. We still have W ⊆ (W⊥)⊥ but W ≠ (W⊥)⊥.

Definition 12.4.8  Let V be an inner product space and W a subspace of V such that V = W ⊕ W⊥. (By Theorem 12.4.3.3, we know that V = W ⊕ W⊥ is always true if W is finite dimensional.) Every u ∈ V can be uniquely expressed as

    u = w + w′   where w ∈ W and w′ ∈ W⊥.

The vector w is called the orthogonal projection of u onto W and is denoted by ProjW(u).

Proposition 12.4.9  The mapping ProjW : V → V is a linear operator and is called the orthogonal projection of V onto W.

Proof  This is only a particular case of Question 9.6.

Example 12.4.10
1. In Example 12.3.5.1, W = span{(1, 1, 1), (1, i, i)} = {(a, b, b) | a, b ∈ C} and W⊥ = {(0, t, −t) | t ∈ C}.
For any (x, y, z) ∈ C³,

    (x, y, z) = ( x, (y + z)/2, (y + z)/2 ) + ( 0, (y − z)/2, (z − y)/2 )

where ( x, (y + z)/2, (y + z)/2 ) ∈ W and ( 0, (y − z)/2, (z − y)/2 ) ∈ W⊥. So

    ProjW((x, y, z)) = ( x, (y + z)/2, (y + z)/2 ).

2. Suppose Mn×n(R) is equipped with the inner product defined in Example 12.1.6.3. Let W = {A ∈ Mn×n(R) | A is symmetric}.
By Example 12.4.4.2, W⊥ = {A ∈ Mn×n(R) | A is skew symmetric}. For each B ∈ Mn×n(R), by Example 8.6.6.2, we can write

    B = (1/2)(B + Bᵀ) + (1/2)(B − Bᵀ)

where (1/2)(B + Bᵀ) ∈ W and (1/2)(B − Bᵀ) ∈ W⊥. So ProjW(B) = (1/2)(B + Bᵀ).

Theorem 12.4.11  Let V be an inner product space and W a finite dimensional subspace of V. If B = {w1, w2, ..., wk} is an orthonormal basis for W, then for any vector u ∈ V,

    ProjW(u) = ⟨u, w1⟩ w1 + ⟨u, w2⟩ w2 + ··· + ⟨u, wk⟩ wk

and

    ProjW⊥(u) = u − ProjW(u)
              = u − ⟨u, w1⟩ w1 − ⟨u, w2⟩ w2 − ··· − ⟨u, wk⟩ wk.

Proof  The proof follows the same argument as the proof for Theorem 5.2.15.

Example 12.4.12
1. In Example 12.4.10.1, W = {(a, b, b) | a, b ∈ C} = span{(1, 0, 0), (0, 1, 1)} ⊆ C³. To compute ProjW((x, y, z)), for (x, y, z) ∈ C³, by using Theorem 12.4.11, we first need an orthonormal basis for W. In this case, {(1, 0, 0), (1/√2)(0, 1, 1)} is an orthonormal basis for W. Thus

    ProjW((x, y, z)) = ⟨(x, y, z), (1, 0, 0)⟩ (1, 0, 0) + ⟨(x, y, z), (1/√2)(0, 1, 1)⟩ (1/√2)(0, 1, 1)
                     = x(1, 0, 0) + ((y + z)/2)(0, 1, 1) = ( x, (y + z)/2, (y + z)/2 )

which gives us the same formula as in Example 12.4.10.1.

2. Let C([−1, 1]) be equipped with the inner product defined by

    ⟨f, g⟩ = (1/2) ∫₋₁¹ f(t)g(t) dt   for f, g ∈ C([−1, 1]).

Since real polynomials can be regarded as functions in C([−1, 1]), W1 = P1(R) and W2 = P2(R) can be regarded as subspaces of C([−1, 1]).

(a) Using the orthonormal basis {1, √3 x} for W1, for any f ∈ C([−1, 1]),

    ProjW1(f) = ( (1/2) ∫₋₁¹ f(t) dt ) + ( (3/2) ∫₋₁¹ t f(t) dt ) x.

In particular, if f(x) = eˣ for x ∈ [−1, 1], then

    ProjW1(f) = (1/2)( e − 1/e ) + (3/e) x.

(b) Using the orthonormal basis {1, √3 x, (√5/2)(−1 + 3x²)} for W2 (see Example 12.3.8), for any f ∈ C([−1, 1]),

    ProjW2(f) = ( (1/2) ∫₋₁¹ f(t) dt ) + ( (3/2) ∫₋₁¹ t f(t) dt ) x + ( (5/8) ∫₋₁¹ (−1 + 3t²) f(t) dt ) (−1 + 3x²).

In particular, if f(x) = eˣ for x ∈ [−1, 1], then

    ProjW2(f) = (3/4)( −e + 11/e ) + (3/e) x + (15/4)( e − 7/e ) x².
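These projection coefficients can be checked by numerical integration. A small sketch (assuming SciPy is available; the helper names are ours):

    import numpy as np
    from scipy.integrate import quad

    def inner(f, g):
        # <f, g> = (1/2) * integral of f(t) g(t) dt over [-1, 1]
        return 0.5 * quad(lambda t: f(t) * g(t), -1.0, 1.0)[0]

    f = np.exp
    # coefficients of Proj_W1(f) w.r.t. the orthonormal basis {1, sqrt(3) x}
    c0 = inner(f, lambda t: 1.0)
    c1 = inner(f, lambda t: np.sqrt(3.0) * t)
    # Proj_W1(f) = c0 * 1 + c1 * sqrt(3) x = a + b x with:
    a, b = c0, np.sqrt(3.0) * c1
    e = np.e
    print(np.isclose(a, 0.5 * (e - 1 / e)))   # True
    print(np.isclose(b, 3.0 / e))             # True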

Theorem 12.4.13 (Best Approximation)  Let V be an inner product space and W a subspace of V such that V = W ⊕ W⊥. Then for any u ∈ V,

    d(u, ProjW(u)) ≤ d(u, w)   for all w ∈ W,

i.e. ProjW(u) is the best approximation of u in W.

[Figure: a vector u, its orthogonal projection ProjW(u) onto the plane W, and another vector v in W; the segment from u to ProjW(u) is orthogonal to W.]

Proof The proof follows the same argument as the proof for Theorem 5.3.2.

Example 12.4.14  Let C([−1, 1]) be equipped with the inner product defined by

    ⟨f, g⟩ = (1/2) ∫₋₁¹ f(t)g(t) dt   for f, g ∈ C([−1, 1]).

By Example 12.4.12.2(a), the best approximation of the exponential function f(x) = eˣ in P1(R) is p(x) = (1/2)(e − 1/e) + (3/e)x.
In the following diagram, we compare p(x) with the approximation q(x) = 1 + x computed using Taylor's expansion of eˣ about x = 0.
[Figure: the graphs of f(x) = eˣ, p(x) = (1/2)(e − 1/e) + (3/e)x and q(x) = 1 + x for −1 ≤ x ≤ 1.]

Section 12.5 Adjoints of Linear Operators

Definition 12.5.1  Let V be an inner product space and let T be a linear operator on V. A linear operator T* on V is called the adjoint of T if

    ⟨T(u), v⟩ = ⟨u, T*(v)⟩   for all u, v ∈ V.                       (12.3)

In Theorem 12.5.4.1, we shall learn that the adjoint of a linear operator is unique if it exists. Thus we always use T* to denote the adjoint of T.
Please note that the (classical) adjoint of a matrix defined in Definition 2.5.24 is a completely different concept.

Remark 12.5.2 Let $V$ be an inner product space and let $T$ be a linear operator on $V$ such that $T^*$ exists. Then for all $u, v \in V$,
$$\langle u, T(v) \rangle = \overline{\langle T(v), u \rangle} = \overline{\langle v, T^*(u) \rangle} = \langle T^*(u), v \rangle.$$

Example 12.5.3
1. Let $V$ be an inner product space. For all $u, v \in V$,
$$\langle I_V(u), v \rangle = \langle u, v \rangle = \langle u, I_V(v) \rangle \quad \text{and} \quad \langle O_V(u), v \rangle = 0 = \langle u, O_V(v) \rangle.$$
Both $I_V$ and $O_V$ are the adjoints of themselves, i.e. $I_V^* = I_V$ and $O_V^* = O_V$.
2. Let $\mathbb{R}^n$ be equipped with the usual inner product. Given any $n \times n$ real matrix $A$, let $L_A$ be the linear operator on $\mathbb{R}^n$ as defined in Example 9.1.4.1, i.e. $L_A(u) = Au$ for $u \in \mathbb{R}^n$. (In here, all vectors are expressed as column vectors.) For any $u, v \in \mathbb{R}^n$,
$$\langle L_A(u), v \rangle = \langle Au, v \rangle = (Au)^T v = u^T A^T v = \langle u, A^T v \rangle = \langle u, L_{A^T}(v) \rangle.$$
Thus the adjoint of $L_A$ is $L_{A^T}$, i.e. $L_A^* = L_{A^T}$.


3. Let $\mathbb{C}^n$ be equipped with the usual inner product. Following the same arguments as above, $L_A^* = L_{A^*}$ for any $n \times n$ complex matrix $A$.
4. Consider the $l^2$-space $V$ defined in Example 12.1.6.6. Let $S$ be the shift operator on $V$ as defined in Example 11.1.3.2, i.e.
$$S((a_n)_{n \in \mathbb{N}}) = (a_{n+1})_{n \in \mathbb{N}} \quad \text{for } (a_n)_{n \in \mathbb{N}} \in V.$$
Define $T : V \to V$ such that for $(a_n)_{n \in \mathbb{N}} \in V$, $T((a_n)_{n \in \mathbb{N}}) = (a'_n)_{n \in \mathbb{N}}$ where
$$a'_n = \begin{cases} 0 & \text{if } n = 1 \\ a_{n-1} & \text{if } n > 1. \end{cases}$$
Then for all $(a_n)_{n \in \mathbb{N}}, (b_n)_{n \in \mathbb{N}} \in V$,
$$\langle (a_n)_{n \in \mathbb{N}}, T((b_n)_{n \in \mathbb{N}}) \rangle = \sum_{n=1}^{\infty} a_n b'_n = \sum_{n=2}^{\infty} a_n b_{n-1} = \sum_{n=1}^{\infty} a_{n+1} b_n = \langle S((a_n)_{n \in \mathbb{N}}), (b_n)_{n \in \mathbb{N}} \rangle.$$
Thus $T$ is the adjoint of $S$. (Items 2 and 4 are checked numerically in the sketch below.)
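A minimal numerical sketch of items 2 and 4 (assuming NumPy, and truncating the $l^2$ sequences to finitely many terms for item 4):

```python
import numpy as np

rng = np.random.default_rng(0)

# Item 2: <Au, v> = <u, A^T v> for the usual inner product on R^n.
A = rng.standard_normal((4, 4))
u, v = rng.standard_normal(4), rng.standard_normal(4)
assert np.isclose(np.dot(A @ u, v), np.dot(u, A.T @ v))

# Item 4: on truncated sequences, S drops the first term and its
# adjoint T shifts the terms to the right, padding with a zero.
S = lambda a: np.append(a[1:], 0.0)
T = lambda a: np.append(0.0, a[:-1])
a, b = rng.standard_normal(6), rng.standard_normal(6)
assert np.isclose(np.dot(S(a), b), np.dot(a, T(b)))
```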

Theorem 12.5.4 Let $V$ be an inner product space and let $T$ be a linear operator on $V$.
1. The adjoint of $T$ is unique if it exists.
2. Suppose $V$ is finite dimensional where $\dim(V) \ge 1$.
(a) $T^*$ always exists.
(b) If $B$ is an ordered orthonormal basis for $V$, then $[T^*]_B = ([T]_B)^*$.
(c) $\operatorname{rank}(T) = \operatorname{rank}(T^*)$ and $\operatorname{nullity}(T) = \operatorname{nullity}(T^*)$.

Proof
1. Suppose there are two adjoints $T_1$ and $T_2$ of $T$. To show that $T_1 = T_2$, we need to show that $T_1(v) = T_2(v)$ for all $v \in V$. By (12.3), for all $u, v \in V$,
$$\langle u, T_1(v) \rangle = \langle T(u), v \rangle = \langle u, T_2(v) \rangle \;\Rightarrow\; \langle u, T_1(v) - T_2(v) \rangle = 0.$$
Substituting $u$ by $T_1(v) - T_2(v)$, we have
$$\langle T_1(v) - T_2(v),\, T_1(v) - T_2(v) \rangle = 0 \quad \text{for all } v \in V.$$
But by (IP4), $T_1(v) - T_2(v) = \mathbf{0}$ and hence $T_1(v) = T_2(v)$ for all $v \in V$.
2. Suppose $V$ is finite dimensional. Let $B = \{w_1, w_2, \dots, w_n\}$ be an ordered orthonormal basis for $V$. Define a linear operator $T' : V \to V$ such that $[T']_B = ([T]_B)^*$, i.e. for every $a_1 w_1 + a_2 w_2 + \cdots + a_n w_n \in V$,
$$T'(a_1 w_1 + a_2 w_2 + \cdots + a_n w_n) = b_1 w_1 + b_2 w_2 + \cdots + b_n w_n \quad \text{where} \quad \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = ([T]_B)^* \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}.$$
By Lemma 12.3.3.3, for all $u, v \in V$,
$$\begin{aligned}
\langle T(u), v \rangle = ([T(u)]_B)^T\, \overline{[v]_B} &= ([T]_B [u]_B)^T\, \overline{[v]_B} = ([u]_B)^T ([T]_B)^T\, \overline{[v]_B} \\
&= ([u]_B)^T\, \overline{([T]_B)^* [v]_B} = ([u]_B)^T\, \overline{[T'(v)]_B} = \langle u, T'(v) \rangle.
\end{aligned}$$
Thus $T'$ is the adjoint of $T$, i.e. $T^*$ ($= T'$) exists and $[T^*]_B = ([T]_B)^*$.
For any complex (or real) matrix $A$, $\operatorname{rank}(A) = \operatorname{rank}(A^*)$ (see Question 12.1). So
$$\operatorname{rank}(T) = \operatorname{rank}([T]_B) = \operatorname{rank}(([T]_B)^*) = \operatorname{rank}([T^*]_B) = \operatorname{rank}(T^*)$$
(see Question 12.35 for a proof of $\operatorname{rank}(T) = \operatorname{rank}(T^*)$ without using matrices) and
$$\operatorname{nullity}(T) = \dim(V) - \operatorname{rank}(T) = \dim(V) - \operatorname{rank}(T^*) = \operatorname{nullity}(T^*).$$

Example 12.5.5 Let $\mathbb{R}^3$ be equipped with the usual inner product and let $T$ be the linear operator on $\mathbb{R}^3$ defined by
$$T((x, y, z)) = (x,\; x + y,\; x + y + z) \quad \text{for } (x, y, z) \in \mathbb{R}^3.$$
Let $E = \{e_1, e_2, e_3\}$ be the standard basis for $\mathbb{R}^3$, which is an orthonormal basis for $\mathbb{R}^3$. Then
$$[T]_E = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} \quad \text{and} \quad ([T]_E)^* = ([T]_E)^T = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$
As $\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} x + y + z \\ y + z \\ z \end{pmatrix}$, the adjoint $T^*$ of $T$ is the linear operator defined by
$$T^*((x, y, z)) = (x + y + z)e_1 + (y + z)e_2 + z e_3 = (x + y + z,\; y + z,\; z) \quad \text{for } (x, y, z) \in \mathbb{R}^3.$$
Note that for $(x_1, y_1, z_1), (x_2, y_2, z_2) \in \mathbb{R}^3$,
$$\begin{aligned}
\langle T((x_1, y_1, z_1)), (x_2, y_2, z_2) \rangle &= \langle (x_1,\; x_1 + y_1,\; x_1 + y_1 + z_1), (x_2, y_2, z_2) \rangle \\
&= x_1 x_2 + (x_1 + y_1) y_2 + (x_1 + y_1 + z_1) z_2 \\
&= x_1 (x_2 + y_2 + z_2) + y_1 (y_2 + z_2) + z_1 z_2 \\
&= \langle (x_1, y_1, z_1), (x_2 + y_2 + z_2,\; y_2 + z_2,\; z_2) \rangle \\
&= \langle (x_1, y_1, z_1), T^*((x_2, y_2, z_2)) \rangle.
\end{aligned}$$
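A quick check of this example with NumPy (an illustrative sketch, not part of the notes):

```python
import numpy as np

T_E = np.array([[1, 0, 0],
                [1, 1, 0],
                [1, 1, 1]])          # matrix of T relative to E
T_star_E = T_E.T                     # real case: adjoint matrix = transpose

rng = np.random.default_rng(1)
u, v = rng.standard_normal(3), rng.standard_normal(3)
# <T(u), v> should equal <u, T*(v)>
assert np.isclose(np.dot(T_E @ u, v), np.dot(u, T_star_E @ v))
```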

Remark 12.5.6 Theorem 12.5.4.2(a) is not always true when $V$ is infinite dimensional.
Let $V$ be the $l^2$-space defined in Example 12.1.6.6. As in Remark 12.4.5, consider the subspace $W = \operatorname{span}\{e_1, e_2, e_3, \dots\}$ of $V$. Observe that $W$ consists of all real infinite sequences which have only a finite number of nonzero terms. Define a linear operator $T$ on $W$ such that
$$T((a_n)_{n \in \mathbb{N}}) = \left( \sum_{i=n}^{\infty} a_i \right)_{n \in \mathbb{N}} \quad \text{for } (a_n)_{n \in \mathbb{N}} \in W.$$
(The sum $\sum_{i=n}^{\infty} a_i$ converges because there are only a finite number of nonzero $a_i$.)
Note that $T(e_n) = (1, 1, \dots, 1, 0, 0, \dots)$ where the first $n$ entries are $1$ and all other entries are $0$.
Assume $T^*$ exists. Let $T^*(e_1) = (b_n)_{n \in \mathbb{N}}$. By (12.3), for all $n \in \mathbb{N}$,
$$b_n = \langle e_n, T^*(e_1) \rangle = \langle T(e_n), e_1 \rangle = 1.$$
But then $T^*(e_1)$ is not a sequence in $W$, which contradicts that $T^*$ is a linear operator on $W$. (Actually, $T^*(e_1)$ is not even a sequence in $V$.)
Hence the adjoint of $T$ does not exist.

Proposition 12.5.7 Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$ and let $V$ be an inner product space over $\mathbb{F}$. Suppose $S$ and $T$ are linear operators on $V$ such that $S^*$ and $T^*$ exist. Then
1. $(S + T)^*$ exists and $(S + T)^* = S^* + T^*$;
2. for any $c \in \mathbb{F}$, $(cT)^*$ exists and $(cT)^* = \bar{c}\,T^*$;
3. $(S \circ T)^*$ exists and $(S \circ T)^* = T^* \circ S^*$;
4. $(T^*)^*$ exists and $(T^*)^* = T$; and
5. if $W$ is a subspace of $V$ which is both $T$-invariant and $T^*$-invariant, then $(T|_W)^*$ exists and $(T|_W)^* = T^*|_W$.

Proof We only show the proof of Part 3.
(Note that before we have shown that the adjoint of $S \circ T$ exists, we cannot use the term $(S \circ T)^*$.) For all $u, v \in V$,
$$\langle (S \circ T)(u), v \rangle = \langle S(T(u)), v \rangle = \langle T(u), S^*(v) \rangle = \langle u, T^*(S^*(v)) \rangle = \langle u, (T^* \circ S^*)(v) \rangle.$$
So $(S \circ T)^*$ exists and $(S \circ T)^* = T^* \circ S^*$.
(Proofs of the other parts are left as exercises. See Question 12.31.)

Definition 12.5.8 Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$.
1. Let $V$ be an inner product space over $\mathbb{F}$ and $T$ a linear operator on $V$ such that $T^*$ exists. Suppose $T$ is invertible and $T^{-1} = T^*$, i.e. $T \circ T^* = T^* \circ T = I_V$.
(a) If $\mathbb{F} = \mathbb{C}$, then $T$ is called a unitary operator.
(b) If $\mathbb{F} = \mathbb{R}$, then $T$ is called an orthogonal operator.
2. Let $A$ be an invertible matrix over $\mathbb{F}$ such that $A^{-1} = A^*$, i.e. $AA^* = A^*A = I$.
(a) If $\mathbb{F} = \mathbb{C}$, then $A$ is called a unitary matrix.
(b) If $\mathbb{F} = \mathbb{R}$, then $A$ is called an orthogonal matrix. Note that only real square matrices satisfying $A^{-1} = A^T$ are called orthogonal matrices. (See also Section 5.4.)
An orthogonal matrix is also a unitary matrix.

Proposition 12.5.9 Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, $V$ a finite dimensional inner product space over $\mathbb{F}$, where $\dim(V) \ge 1$, and $T$ a linear operator on $V$. Take any ordered orthonormal basis $B$ for $V$.
1. If $\mathbb{F} = \mathbb{C}$, then $T$ is unitary if and only if $[T]_B$ is a unitary matrix.
2. If $\mathbb{F} = \mathbb{R}$, then $T$ is orthogonal if and only if $[T]_B$ is an orthogonal matrix.

Proof Since the proof of Part 2 is the same as Part 1, we only prove Part 1:
Let $A = [T]_B$. By Theorem 12.5.4.2, $[T^*]_B = A^*$. Thus by Theorem 9.3.3, $[T \circ T^*]_B = AA^*$ and $[T^* \circ T]_B = A^*A$. Hence
$$T \text{ is unitary} \iff T \circ T^* = T^* \circ T = I_V \iff AA^* = A^*A = I \ \text{(by Lemma 9.2.3)} \iff A \text{ is unitary}.$$

Example 12.5.10
1. The following are some examples of unitary matrices where the first three matrices are also orthogonal matrices:
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} \frac{1}{\sqrt{3}} & \frac{2}{\sqrt{6}} & 0 \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{6}} & -\frac{1}{\sqrt{2}} \end{pmatrix}, \quad \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix},$$
$$\begin{pmatrix} \frac{1}{\sqrt{5}}(1+2\mathrm{i}) & 0 & 0 \\ 0 & 0 & \mathrm{i} \\ 0 & \mathrm{i} & 0 \end{pmatrix}, \quad \begin{pmatrix} \frac{1}{\sqrt{3}}(1+\mathrm{i}) & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{3}}(-1+\mathrm{i}) \end{pmatrix}, \quad \begin{pmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{\sqrt{2}}\mathrm{i} & 0 \\ \frac{1}{2} & -\frac{1}{2} & 0 & \frac{1}{\sqrt{2}}\mathrm{i} \\ \frac{1}{\sqrt{2}}\mathrm{i} & 0 & \frac{1}{2} & \frac{1}{2} \\ 0 & \frac{1}{\sqrt{2}}\mathrm{i} & \frac{1}{2} & -\frac{1}{2} \end{pmatrix}.$$

2. Let $\mathbb{R}^2$ be equipped with the usual inner product. Consider the linear operator $L_A$ on $\mathbb{R}^2$ where $A = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}$ for some $\theta \in [0, 2\pi]$.
Using the orthonormal basis $E = \{(1, 0), (0, 1)\}$ for $\mathbb{R}^2$, $[L_A]_E = A$. Since $A$ is an orthogonal matrix, by Proposition 12.5.9, $L_A$ is an orthogonal operator.
3. Let $\mathbb{C}^3$ be equipped with the usual inner product. Let $T$ be the linear operator on $\mathbb{C}^3$ defined by
$$T((x, y, z)) = \left( \tfrac{1}{\sqrt{2}}(x + \mathrm{i}\,y),\; \tfrac{1}{\sqrt{2}}(x - \mathrm{i}\,y),\; z \right) \quad \text{for } (x, y, z) \in \mathbb{C}^3.$$
Using the orthonormal basis $E = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ for $\mathbb{C}^3$,
$$[T]_E = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\mathrm{i} & 0 \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\mathrm{i} & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
It can be checked that $[T]_E$ is a unitary matrix and hence by Proposition 12.5.9, $T$ is unitary.
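The unitarity of $[T]_E$ in item 3 can be confirmed numerically (illustrative sketch, assuming NumPy):

```python
import numpy as np

s = 1 / np.sqrt(2)
U = np.array([[s,  s * 1j, 0],
              [s, -s * 1j, 0],
              [0,  0,      1]])
# A matrix is unitary iff U U* = I; conj().T is the conjugate transpose.
assert np.allclose(U @ U.conj().T, np.eye(3))
```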

Theorem 12.5.11 Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, $V$ a finite dimensional inner product space over $\mathbb{F}$, where $\dim(V) \ge 1$, and $T$ a linear operator on $V$. Then the following are equivalent:
1. $T$ is unitary (when $\mathbb{F} = \mathbb{C}$) or orthogonal (when $\mathbb{F} = \mathbb{R}$).
2. For all $u, v \in V$, $\langle T(u), T(v) \rangle = \langle u, v \rangle$.
3. For all $u \in V$, $\lVert T(u) \rVert = \lVert u \rVert$.
4. There exists an orthonormal basis $\{w_1, w_2, \dots, w_n\}$ for $V$, where $n = \dim(V)$, such that $\{T(w_1), T(w_2), \dots, T(w_n)\}$ is also orthonormal.

Proof
(1 $\Rightarrow$ 2) For all $u, v \in V$, $\langle T(u), T(v) \rangle = \langle u, T^*(T(v)) \rangle = \langle u, I_V(v) \rangle = \langle u, v \rangle$.
(2 $\Rightarrow$ 3) For all $u \in V$, $\lVert T(u) \rVert = \sqrt{\langle T(u), T(u) \rangle} = \sqrt{\langle u, u \rangle} = \lVert u \rVert$.
(3 $\Rightarrow$ 4) Take any orthonormal basis $\{w_1, w_2, \dots, w_n\}$ for $V$. Since $\lVert T(w_i) \rVert = \lVert w_i \rVert = 1$ for all $i$, it suffices to show that $\langle T(w_i), T(w_j) \rangle = 0$ for $i \ne j$. Let $\varepsilon = \langle T(w_i), T(w_j) \rangle$. Then $\bar{\varepsilon} = \langle T(w_j), T(w_i) \rangle$.
$$\begin{aligned}
&\lVert T(w_i + \varepsilon w_j) \rVert = \lVert w_i + \varepsilon w_j \rVert \\
\Rightarrow\ & \langle T(w_i) + \varepsilon T(w_j),\, T(w_i) + \varepsilon T(w_j) \rangle = \langle w_i + \varepsilon w_j,\, w_i + \varepsilon w_j \rangle \\
\Rightarrow\ & 1 + \bar{\varepsilon}\langle T(w_i), T(w_j) \rangle + \varepsilon \langle T(w_j), T(w_i) \rangle + \varepsilon\bar{\varepsilon} = 1 + \varepsilon\bar{\varepsilon} \\
\Rightarrow\ & 2\varepsilon\bar{\varepsilon} = 0 \\
\Rightarrow\ & \varepsilon = 0.
\end{aligned}$$
Thus $\langle T(w_i), T(w_j) \rangle = 0$.

(4 $\Rightarrow$ 1) Let $B = \{w_1, w_2, \dots, w_n\}$ be an orthonormal basis for $V$ such that $\{T(w_1), T(w_2), \dots, T(w_n)\}$ is also orthonormal. For $i = 1, 2, \dots, n$, by Theorem 12.3.6,
$$\begin{aligned}
(T^* \circ T)(w_i) &= \langle (T^* \circ T)(w_i), w_1 \rangle\, w_1 + \langle (T^* \circ T)(w_i), w_2 \rangle\, w_2 + \cdots + \langle (T^* \circ T)(w_i), w_n \rangle\, w_n \\
&= \langle T^*(T(w_i)), w_1 \rangle\, w_1 + \langle T^*(T(w_i)), w_2 \rangle\, w_2 + \cdots + \langle T^*(T(w_i)), w_n \rangle\, w_n \\
&= \langle T(w_i), T(w_1) \rangle\, w_1 + \langle T(w_i), T(w_2) \rangle\, w_2 + \cdots + \langle T(w_i), T(w_n) \rangle\, w_n \\
&= w_i.
\end{aligned}$$
This means that $T^* \circ T = I_V$. By Theorem 9.6.8, $T^{-1} = T^*$. So $T$ is unitary if $\mathbb{F} = \mathbb{C}$ and $T$ is orthogonal if $\mathbb{F} = \mathbb{R}$.

Example 12.5.12
1. In Example 12.5.10.2, $L_A$ maps the standard basis $\{(1, 0)^T, (0, 1)^T\}$ to
$$\{ (\cos(\theta), \sin(\theta))^T,\; (-\sin(\theta), \cos(\theta))^T \}$$
which is also an orthonormal basis for $\mathbb{R}^2$ using the usual inner product.
2. In Example 12.5.10.3, $T$ maps the standard basis $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ to
$$\left\{ \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0 \right),\; \left( \tfrac{1}{\sqrt{2}}\mathrm{i}, -\tfrac{1}{\sqrt{2}}\mathrm{i}, 0 \right),\; (0, 0, 1) \right\}$$
which is also an orthonormal basis for $\mathbb{C}^3$ using the usual inner product.

Remark 12.5.13 Theorem 12.5.11 is not always true when $V$ is infinite dimensional.
The linear operator $S^*$ in Example 12.5.3.4 satisfies Part 2 and Part 3 of Theorem 12.5.11. But $S^*$ is not surjective and hence not invertible. So $S^*$ is not orthogonal.

Theorem 12.5.14 Let $A$ be an $n \times n$ complex matrix. Suppose $\mathbb{C}^n$ is equipped with the usual inner product. The following statements are equivalent:
1. $A$ is unitary.
2. The rows of $A$ form an orthonormal basis for $\mathbb{C}^n$.
3. The columns of $A$ form an orthonormal basis for $\mathbb{C}^n$.

Proof The proof follows the same argument as the proof for Theorem 5.4.6. (We can also prove the theorem by applying Theorem 12.5.11 to $L_A$ and $L_{A^*}$.)
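For instance, statement 3 can be checked for the matrix $[T]_E$ of Example 12.5.10.3 by computing the Gram matrix of its columns (illustrative NumPy sketch):

```python
import numpy as np

s = 1 / np.sqrt(2)
U = np.array([[s,  s * 1j, 0],
              [s, -s * 1j, 0],
              [0,  0,      1]])
# Gram matrix of the columns: entry (j, k) is <column j, column k>.
G = U.conj().T @ U
assert np.allclose(G, np.eye(3))   # the columns are orthonormal
```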
Section 12.6. Unitary and Orthogonal Diagonalization 179

Theorem 12.5.15 Let $V$ be a complex finite dimensional inner product space where $\dim(V) \ge 1$. If $B$ and $C$ are ordered orthonormal bases for $V$, then the transition matrix from $B$ to $C$ is a unitary matrix, i.e. $[I_V]_{B,C} = ([I_V]_{C,B})^{-1} = ([I_V]_{C,B})^*$. (See also Theorem 5.4.7.)
Proof Let $B = \{v_1, v_2, \dots, v_n\}$ and $C = \{w_1, w_2, \dots, w_n\}$. Define a linear operator $T$ on $V$ such that $T(w_i) = v_i$ for $i = 1, 2, \dots, n$. Note that
$$[T]_{B,C} = \big(\, [T(w_1)]_B \;\; [T(w_2)]_B \;\; \cdots \;\; [T(w_n)]_B \,\big) = \big(\, [v_1]_B \;\; [v_2]_B \;\; \cdots \;\; [v_n]_B \,\big) = I_n.$$
By Theorem 12.5.11, $T$ is a unitary operator and hence by Proposition 12.5.9, $[T]_C$ is a unitary matrix. But then the transition matrix from $B$ to $C$ is
$$[I_V]_{C,B} = [I_V]_{C,B}\, I_n = [I_V]_{C,B}\, [T]_{B,C} = [I_V \circ T]_{C,C} = [T]_C$$
which is a unitary matrix.
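A numerical illustration of this theorem: generate two orthonormal bases with a QR factorization and check that the change-of-coordinates matrix between them is unitary. (A sketch; the convention assumed here is that the basis vectors are the columns of `B` and `C`.)

```python
import numpy as np

rng = np.random.default_rng(2)
# Columns of B and C are two orthonormal bases of C^3 (QR of random matrices).
B, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
C, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
# P converts B-coordinates to C-coordinates: C @ P = B.
P = np.linalg.solve(C, B)
assert np.allclose(P @ P.conj().T, np.eye(3))   # P is unitary
```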

Section 12.6 Unitary and Orthogonal Diagonalization

Definition 12.6.1
1. Let $V$ be an inner product space and let $T$ be a linear operator on $V$ such that $T^*$ exists.
(a) $T$ is called a self-adjoint operator if $T = T^*$.
(b) $T$ is called a normal operator if $T \circ T^* = T^* \circ T$.
All self-adjoint operators, orthogonal operators and unitary operators are normal.
2. Let $A$ be a complex square matrix.
(a) $A$ is called a Hermitian matrix if $A^* = A$.
Note that a real matrix $A$ satisfying $A^* = A$ is a symmetric matrix.
(b) $A$ is called a normal matrix if $AA^* = A^*A$.
All Hermitian matrices, real symmetric matrices, unitary matrices and orthogonal matrices are normal.

Proposition 12.6.2 Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, $V$ a finite dimensional inner product space over $\mathbb{F}$, where $\dim(V) \ge 1$, and $T$ a linear operator on $V$. Take an ordered orthonormal basis $B$ for $V$ and let $A = [T]_B$.
1. If $\mathbb{F} = \mathbb{C}$, then $T$ is self-adjoint if and only if $A$ is a Hermitian matrix.
If $\mathbb{F} = \mathbb{R}$, then $T$ is self-adjoint if and only if $A$ is a symmetric matrix.
2. $T$ is normal if and only if $A$ is a normal matrix.

Proof The proof follows the same argument as the proof for Proposition 12.5.9.

Example 12.6.3
1. Let $\mathbb{C}^2$ be equipped with the usual inner product. Let $T$ be the linear operator on $\mathbb{C}^2$ defined by
$$T((x, y)) = (2x - \mathrm{i}(x + y),\; 2y - \mathrm{i}(x + y)) \quad \text{for } (x, y) \in \mathbb{C}^2.$$
Using the orthonormal basis $E = \{(1, 0), (0, 1)\}$ for $\mathbb{C}^2$, $[T]_E = \begin{pmatrix} 2-\mathrm{i} & -\mathrm{i} \\ -\mathrm{i} & 2-\mathrm{i} \end{pmatrix}$. Since
$$([T]_E)^* [T]_E = \begin{pmatrix} 6 & 2 \\ 2 & 6 \end{pmatrix} = [T]_E\, ([T]_E)^*,$$
$[T]_E$ is a normal matrix and hence by Proposition 12.6.2.2, $T$ is a normal operator. (A numerical check appears after this example.)
2. Let $\mathbb{R}^n$ be equipped with the usual inner product and $A$ a real $n \times n$ matrix.
If $A$ is symmetric, then the linear operator $L_A$ is self-adjoint (and normal).
If $A$ is nonzero and skew symmetric, then $L_A$ is normal but not self-adjoint.
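A quick NumPy check of item 1 (illustrative):

```python
import numpy as np

M = np.array([[2 - 1j, -1j],
              [-1j, 2 - 1j]])
# Normal: M M* = M* M.
assert np.allclose(M @ M.conj().T, M.conj().T @ M)
# But not Hermitian: M* != M, so the operator is normal, not self-adjoint.
assert not np.allclose(M.conj().T, M)
```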

Lemma 12.6.4 Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, $V$ an inner product space over $\mathbb{F}$ and $T$ a normal operator on $V$.
1. For all $u, v \in V$, $\langle T(u), T(v) \rangle = \langle T^*(u), T^*(v) \rangle$.
2. For any $c \in \mathbb{F}$, the linear operator $T - cI_V$ is normal.
3. If $u$ is an eigenvector of $T$ associated with $\lambda$, then $u$ is an eigenvector of $T^*$ associated with $\bar{\lambda}$.
4. If $u$ and $v$ are eigenvectors of $T$ associated with $\lambda$ and $\mu$, respectively, where $\lambda \ne \mu$, then $u$ and $v$ are orthogonal.
Proof
1. $\langle T(u), T(v) \rangle = \langle u, T^*(T(v)) \rangle = \langle u, T(T^*(v)) \rangle = \langle T^*(u), T^*(v) \rangle$.
2. Note that $I_V^* = I_V$. By Proposition 12.5.7, $(T - cI_V)^*$ exists and $(T - cI_V)^* = T^* - \bar{c}I_V$. Since
$$\begin{aligned}
(T - cI_V) \circ (T - cI_V)^* &= (T - cI_V) \circ (T^* - \bar{c}I_V) \\
&= T \circ T^* - \bar{c}\,T - c\,T^* + c\bar{c}\,I_V \\
&= T^* \circ T - \bar{c}\,T - c\,T^* + c\bar{c}\,I_V \quad \text{(because $T$ is normal)} \\
&= (T^* - \bar{c}I_V) \circ (T - cI_V) \\
&= (T - cI_V)^* \circ (T - cI_V),
\end{aligned}$$
$T - cI_V$ is normal.

3. Since u is an eigenvector of T associated with , T (u) = u. So (T IV )(u) =


T (u) u = 0 and by Part 1 and Part 2,

h(T IV )(u); (T IV )(u)i = h(T IV )(u); (T IV )(u)i = 0:
By (IP4), (T IV ) (u) = 0 and hence

T  (u) u = (T  IV )(u) = (T IV ) (u) = 0;

i.e. T  (u) = u. So u is an eigenvector of T  associated with .

4. Note that T (u) = u and T (v ) = v . By Part 3, T  (v ) = v . Then

hu; vi = hu; vi = hT (u); vi = hu; T  (v)i = hu; vi = hu; vi:

Since  6= , hu; v i = 0.

Remark 12.6.5 The results in Lemma 12.6.4 are still true if we replace $V$ by $\mathbb{C}^n$ (equipped with the usual inner product) and $T$ by an $n \times n$ normal matrix $A$. In particular, if $u$ is an eigenvector of $A$ associated with $\lambda$, then $u$ is also an eigenvector of $A^*$ associated with $\bar{\lambda}$.
Example 12.6.6 Let $A = \begin{pmatrix} 2-\mathrm{i} & -\mathrm{i} \\ -\mathrm{i} & 2-\mathrm{i} \end{pmatrix}$. By Example 12.6.3.1, $A$ is a normal matrix.
Let $u = (1, -1)^T$ and $v = (1, 1)^T$. Since
$$\begin{pmatrix} 2-\mathrm{i} & -\mathrm{i} \\ -\mathrm{i} & 2-\mathrm{i} \end{pmatrix}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = 2\begin{pmatrix} 1 \\ -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 2-\mathrm{i} & -\mathrm{i} \\ -\mathrm{i} & 2-\mathrm{i} \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = (2 - 2\mathrm{i})\begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
$u$ and $v$ are eigenvectors of $A$ associated with the eigenvalues $2$ and $2 - 2\mathrm{i}$ respectively. Thus $u$ and $v$ are also eigenvectors of $A^*$ associated with the eigenvalues $2$ and $2 + 2\mathrm{i}$ respectively. We can also check them directly:
$$\begin{pmatrix} 2+\mathrm{i} & \mathrm{i} \\ \mathrm{i} & 2+\mathrm{i} \end{pmatrix}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = 2\begin{pmatrix} 1 \\ -1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 2+\mathrm{i} & \mathrm{i} \\ \mathrm{i} & 2+\mathrm{i} \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = (2 + 2\mathrm{i})\begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
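The same computations in NumPy (illustrative):

```python
import numpy as np

A = np.array([[2 - 1j, -1j],
              [-1j, 2 - 1j]])
u = np.array([1, -1])
v = np.array([1, 1])
assert np.allclose(A @ u, 2 * u)             # eigenvalue 2
assert np.allclose(A @ v, (2 - 2j) * v)      # eigenvalue 2 - 2i
As = A.conj().T
assert np.allclose(As @ u, 2 * u)            # conjugate eigenvalue 2
assert np.allclose(As @ v, (2 + 2j) * v)     # conjugate eigenvalue 2 + 2i
```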

Definition 12.6.7
1. Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$, $V$ a finite dimensional inner product space over $\mathbb{F}$, where $\dim(V) \ge 1$, and $T$ a linear operator on $V$. Suppose there exists an ordered orthonormal basis $B$ for $V$ such that $[T]_B$ is a diagonal matrix.
If $\mathbb{F} = \mathbb{C}$, then $T$ is called unitarily diagonalizable.
If $\mathbb{F} = \mathbb{R}$, then $T$ is called orthogonally diagonalizable.
2. A complex square matrix $A$ is called unitarily diagonalizable if there exists a unitary matrix $P$ such that $P^*AP$ is a diagonal matrix.
A real square matrix $A$ is called orthogonally diagonalizable if there exists an orthogonal matrix $P$ such that $P^TAP$ is a diagonal matrix. (See Section 6.3.)
Example 12.6.8 Use the complex normal matrix $A$ in Example 12.6.6. Let $P = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}$, which is an orthogonal matrix and hence a unitary matrix. Since
$$P^*AP = \begin{pmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix}\begin{pmatrix} 2-\mathrm{i} & -\mathrm{i} \\ -\mathrm{i} & 2-\mathrm{i} \end{pmatrix}\begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 2-2\mathrm{i} \end{pmatrix},$$
$A$ is unitarily diagonalizable.
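Checking the diagonalization numerically (illustrative sketch):

```python
import numpy as np

A = np.array([[2 - 1j, -1j],
              [-1j, 2 - 1j]])
s = 1 / np.sqrt(2)
P = np.array([[ s, s],
              [-s, s]])
D = P.conj().T @ A @ P
assert np.allclose(D, np.diag([2, 2 - 2j]))
```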
Theorem 12.6.9
1. Let $V$ be a complex finite dimensional inner product space where $\dim(V) \ge 1$. A linear operator $T$ on $V$ is unitarily diagonalizable if and only if $T$ is normal.
2. A complex square matrix $A$ is unitarily diagonalizable if and only if $A$ is normal.

Proof We only prove Part 1 of the theorem.
($\Rightarrow$) Suppose $[T]_B$ is a diagonal matrix for some ordered orthonormal basis $B$ for $V$. Since $([T]_B)^*$ is also a diagonal matrix, $[T]_B\,([T]_B)^* = ([T]_B)^*\,[T]_B$. By Proposition 12.6.2.2, $T$ is normal.
($\Leftarrow$) We use induction on the dimension of $V$:
If $\dim(V) = 1$, all linear operators on $V$ are normal and unitarily diagonalizable.
Assume that if $\dim(V) = n - 1$, then all normal operators on $V$ are unitarily diagonalizable.
Now, suppose $\dim(V) = n$. Since the characteristic polynomial $c_T(x)$ can always be factorized into linear factors over $\mathbb{C}$, $T$ has at least one eigenvalue $\lambda$. Suppose $u$ is an eigenvector of $T$ associated with $\lambda$. Let $W = \operatorname{span}\{u\}$, which is a $T$-invariant subspace of $V$ (see Example 11.3.2.2). By Theorem 12.4.3.3,
$$V = W \oplus W^{\perp}.$$
Take any $w \in W^{\perp}$, i.e. $\langle w, u \rangle = 0$. By Lemma 12.6.4.3, $u$ is an eigenvector of $T^*$ associated with $\bar{\lambda}$. Thus
$$\langle T(w), u \rangle = \langle w, T^*(u) \rangle = \langle w, \bar{\lambda}u \rangle = \lambda \langle w, u \rangle = 0$$
and hence $T(w) \in W^{\perp}$. So $W^{\perp}$ is $T$-invariant. Similarly, $W^{\perp}$ is also $T^*$-invariant (see Question 12.30).
By Proposition 12.5.7.5, $(T|_{W^{\perp}})^* = T^*|_{W^{\perp}}$. Then
$$\begin{aligned}
(T|_{W^{\perp}}) \circ (T|_{W^{\perp}})^* &= (T|_{W^{\perp}}) \circ (T^*|_{W^{\perp}}) \\
&= (T \circ T^*)|_{W^{\perp}} \quad \text{(by Proposition 11.3.3.1)} \\
&= (T^* \circ T)|_{W^{\perp}} \quad \text{(because $T$ is normal)} \\
&= (T^*|_{W^{\perp}}) \circ (T|_{W^{\perp}}) \quad \text{(again by Proposition 11.3.3.1)} \\
&= (T|_{W^{\perp}})^* \circ (T|_{W^{\perp}}).
\end{aligned}$$
So $T|_{W^{\perp}}$ is a normal operator on $W^{\perp}$. Since $\dim(W^{\perp}) = n - 1$, by the inductive assumption, there exists an ordered orthonormal basis $C = \{v_1, v_2, \dots, v_{n-1}\}$ for $W^{\perp}$ such that $[T|_{W^{\perp}}]_C$ is a diagonal matrix.
Let $B = \left\{ \frac{1}{\lVert u \rVert}u,\, v_1,\, v_2,\, \dots,\, v_{n-1} \right\}$. Then $B$ is an orthonormal basis for $V$. By Discussion 11.3.12, using $B$ as an ordered basis,
$$[T]_B = \begin{pmatrix} \lambda & 0 & \cdots & 0 \\ 0 & & & \\ \vdots & & [T|_{W^{\perp}}]_C & \\ 0 & & & \end{pmatrix}$$
which is a diagonal matrix. So $T$ is unitarily diagonalizable.
(The proof of Part 2 is left as an exercise. See Question 12.41.)

Algorithm 12.6.10 Let $T$ be a normal operator on a complex finite dimensional vector space $V$ where $\dim(V) \ge 1$. We want to find an ordered orthonormal basis so that the matrix of $T$ relative to this basis is a diagonal matrix.
Step 1: Find an orthonormal basis $C$ for $V$ and compute the matrix $A = [T]_C$.
Step 2: Factorize the characteristic polynomial $c_A(x)$ into linear factors, i.e. express it in the form
$$c_A(x) = (x - \lambda_1)^{r_1}(x - \lambda_2)^{r_2} \cdots (x - \lambda_k)^{r_k}$$
where $\lambda_1, \lambda_2, \dots, \lambda_k$ are distinct eigenvalues of $A$ and $r_1 + r_2 + \cdots + r_k = \dim(V)$.
Step 3: For each eigenvalue $\lambda_i$, find a basis for the eigenspace $E_{\lambda_i}(T) = \operatorname{Ker}(T - \lambda_i I_V)$ and then use the Gram-Schmidt Process (Theorem 12.3.7) to transform it to an orthonormal basis $B_i$ for $E_{\lambda_i}(T)$.
Step 4: Let $B = B_1 \cup B_2 \cup \cdots \cup B_k$. Then $B$ is an orthonormal basis for $V$. Using $B$ as an ordered basis, $D = [T]_B$ is a diagonal matrix.
Note that $D = P^*AP$ where $P = [I_V]_{C,B}$ is the transition matrix from $B$ to $C$.
(See also Algorithm 6.3.5.)
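In floating-point practice, the effect of this algorithm can be obtained from a complex Schur decomposition: it produces a unitary $Q$ with $Q^*AQ$ upper triangular, and for a normal matrix that triangular factor is automatically diagonal. A sketch (assuming SciPy is available; this is a numerical shortcut, not the Gram-Schmidt route of the algorithm itself):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[2 - 1j, -1j],
              [-1j, 2 - 1j]])          # normal matrix from Example 12.6.6
T, Q = schur(A, output='complex')      # Q unitary, T = Q* A Q upper triangular
assert np.allclose(Q @ Q.conj().T, np.eye(2))
assert np.allclose(T, np.diag(np.diag(T)))   # diagonal, since A is normal
assert np.allclose(Q @ T @ Q.conj().T, A)
```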

Example 12.6.11 Suppose $M_{2\times2}(\mathbb{C})$ is equipped with the inner product defined in Example 12.1.6.3. Let $T : M_{2\times2}(\mathbb{C}) \to M_{2\times2}(\mathbb{C})$ be the linear operator defined by
$$T\!\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right) = \begin{pmatrix} -b - \mathrm{i}c + \mathrm{i}d & -a + \mathrm{i}c - \mathrm{i}d \\ \mathrm{i}a - \mathrm{i}b - d & -\mathrm{i}a + \mathrm{i}b - c \end{pmatrix} \quad \text{for } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_{2\times2}(\mathbb{C}).$$
Step 1: Take the standard basis $C = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \right\}$ for $M_{2\times2}(\mathbb{C})$. Then
$$A = [T]_C = \begin{pmatrix} 0 & -1 & -\mathrm{i} & \mathrm{i} \\ -1 & 0 & \mathrm{i} & -\mathrm{i} \\ \mathrm{i} & -\mathrm{i} & 0 & -1 \\ -\mathrm{i} & \mathrm{i} & -1 & 0 \end{pmatrix}$$
which is a Hermitian matrix and hence is a normal matrix. So $T$ is a normal operator.
Step 2: The characteristic polynomial is
$$c_A(x) = \det(xI - A) = (x+1)^3(x-3).$$
Thus $-1$ and $3$ are the eigenvalues.
Step 3: To find a basis for $E_{-1}(T)$:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in E_{-1}(T) \iff (T + I_{M_{2\times2}(\mathbb{C})})\!\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right) = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \iff (A + I_4)\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$$
$$\iff \begin{pmatrix} 1 & -1 & -\mathrm{i} & \mathrm{i} \\ -1 & 1 & \mathrm{i} & -\mathrm{i} \\ \mathrm{i} & -\mathrm{i} & 1 & -1 \\ -\mathrm{i} & \mathrm{i} & -1 & 1 \end{pmatrix}\begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} \iff \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} = r\begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \end{pmatrix} + s\begin{pmatrix} \mathrm{i} \\ 0 \\ 1 \\ 0 \end{pmatrix} + t\begin{pmatrix} -\mathrm{i} \\ 0 \\ 0 \\ 1 \end{pmatrix} \quad \text{for } r, s, t \in \mathbb{C}.$$
Thus $\left\{ \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \begin{pmatrix} \mathrm{i} & 0 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} -\mathrm{i} & 0 \\ 0 & 1 \end{pmatrix} \right\}$ is a basis for $E_{-1}(T)$ and by the Gram-Schmidt Process, we obtain an orthonormal basis
$$B_{-1} = \left\{ \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix},\; \tfrac{1}{\sqrt{6}}\begin{pmatrix} \mathrm{i} & -\mathrm{i} \\ 2 & 0 \end{pmatrix},\; \tfrac{1}{\sqrt{12}}\begin{pmatrix} -\mathrm{i} & \mathrm{i} \\ 1 & 3 \end{pmatrix} \right\}$$
for $E_{-1}(T)$. Similarly, $B_3 = \left\{ \tfrac{1}{2}\begin{pmatrix} \mathrm{i} & -\mathrm{i} \\ -1 & 1 \end{pmatrix} \right\}$ is an orthonormal basis for $E_3(T)$.
Step 4: Let $B = B_{-1} \cup B_3$ (in the order listed above). We have
$$D = [T]_B = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}.$$
Let
$$P = [I_{M_{2\times2}(\mathbb{C})}]_{C,B} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}}\mathrm{i} & -\frac{1}{\sqrt{12}}\mathrm{i} & \frac{1}{2}\mathrm{i} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}}\mathrm{i} & \frac{1}{\sqrt{12}}\mathrm{i} & -\frac{1}{2}\mathrm{i} \\ 0 & \frac{2}{\sqrt{6}} & \frac{1}{\sqrt{12}} & -\frac{1}{2} \\ 0 & 0 & \frac{3}{\sqrt{12}} & \frac{1}{2} \end{pmatrix}.$$
Then $D = P^*AP$.
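The arithmetic of this example can be verified in a few lines (illustrative NumPy sketch; the columns of `P` are the ordered basis $B$ in coordinates):

```python
import numpy as np

A = np.array([[0, -1, -1j, 1j],
              [-1, 0, 1j, -1j],
              [1j, -1j, 0, -1],
              [-1j, 1j, -1, 0]])
P = np.column_stack([
    np.array([1, 1, 0, 0]) / np.sqrt(2),
    np.array([1j, -1j, 2, 0]) / np.sqrt(6),
    np.array([-1j, 1j, 1, 3]) / np.sqrt(12),
    np.array([1j, -1j, -1, 1]) / 2,
])
assert np.allclose(P.conj().T @ P, np.eye(4))                  # P is unitary
assert np.allclose(P.conj().T @ A @ P, np.diag([-1, -1, -1, 3]))
```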

Theorem 12.6.12
1. Let $V$ be a real finite dimensional inner product space where $\dim(V) \ge 1$. A linear operator $T$ on $V$ is orthogonally diagonalizable if and only if $T$ is self-adjoint.
2. A real square matrix $A$ is orthogonally diagonalizable if and only if $A$ is symmetric. (See Theorem 6.3.4.)

Proof We only prove Part 1:
($\Rightarrow$) Suppose $[T]_B$ is a diagonal matrix for some ordered orthonormal basis $B$ for $V$. Since $([T]_B)^T = [T]_B$, by Proposition 12.6.2.1, $T$ is self-adjoint.
($\Leftarrow$) Suppose $T$ is self-adjoint.
Let $A = [T]_C$ where $C$ is an ordered orthonormal basis for $V$. By Proposition 12.6.2.1, $A$ is an $n \times n$ real symmetric matrix where $n = \dim(V)$. We factorize $c_A(x)$ into linear factors over $\mathbb{C}$, say,
$$c_A(x) = (x - \lambda_1)(x - \lambda_2)\cdots(x - \lambda_n)$$
where $\lambda_1, \lambda_2, \dots, \lambda_n \in \mathbb{C}$. Since each $\lambda_i$ is an eigenvalue of $A$, there exists a nonzero column vector $u \in \mathbb{C}^n$ such that $Au = \lambda_i u$. But then
$$\begin{aligned}
\lambda_i u = Au &= A^T u \quad \text{(because $A$ is symmetric)} \\
&= A^* u \quad \text{(because $A$ is real)} \\
&= \bar{\lambda}_i u \quad \text{(by Remark 12.6.5).}
\end{aligned}$$
As $u$ is nonzero, $\lambda_i = \bar{\lambda}_i$ and hence $\lambda_i \in \mathbb{R}$.
Since $c_T(x) = c_A(x)$, we conclude that $c_T(x)$ can be factorized into linear factors over $\mathbb{R}$. As a self-adjoint operator is normal, by following the same argument as in the proof for Theorem 12.6.9, we can show that $T$ is orthogonally diagonalizable. (In the proof of Theorem 12.6.9, complex numbers are used only because we want to factorize $c_T(x)$ into linear factors.)
(The proof of Part 2 is left as an exercise. See Question 12.41.)
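For real symmetric matrices, this diagonalization is exactly what `numpy.linalg.eigh` computes (illustrative sketch on an arbitrary symmetric matrix):

```python
import numpy as np

A = np.array([[2., -1., -1.],
              [-1., 2., -1.],
              [-1., -1., 2.]])      # a real symmetric matrix
w, Q = np.linalg.eigh(A)           # eigenvalues w, orthogonal matrix Q
assert np.allclose(Q.T @ Q, np.eye(3))
assert np.allclose(Q.T @ A @ Q, np.diag(w))
```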

Exercise 12

Question 12.1 to Question 12.7 are exercises for Section 12.1.

1. Let $A$ be a complex $m \times n$ matrix.
(a) Prove that $\operatorname{rank}(\bar{A}) = \operatorname{rank}(A)$.
(b) Hence, or otherwise, prove that $\operatorname{rank}(A^*) = \operatorname{rank}(A)$. (Hint: You can use the result from Remark 4.2.5.3.)

2. Let $V$ be a vector space over $\mathbb{F}_2$ such that $V$ has at least two nonzero vectors. Suppose there exists a mapping $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{F}_2$ such that
(I) for all $u, v \in V$, $\langle u, v \rangle = \langle v, u \rangle$; and
(II) for all $u, v, w \in V$, $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$.
Show that there exists a nonzero vector $u \in V$ such that $\langle u, u \rangle = 0$.
3. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be a real matrix.
Define $\langle u, v \rangle = uAv^T$ for $u, v \in \mathbb{R}^2$ where $u$ and $v$ are written as row vectors. Find a necessary and sufficient condition on $a, b, c, d$ so that $\langle \cdot, \cdot \rangle$ is an inner product on $\mathbb{R}^2$.

4. Determine which of the following mappings $\langle \cdot, \cdot \rangle$ are inner products on $V$.
(a) $V = \mathbb{C}^2$ and $\langle u, v \rangle = u_1 v_1 + u_2 v_2$ for $u = (u_1, u_2), v = (v_1, v_2) \in V$.
(b) $V = \mathbb{C}^2$ and $\langle u, v \rangle = u_1 \bar{v}_1 + 4 u_2 \bar{v}_2$ for $u = (u_1, u_2), v = (v_1, v_2) \in V$.
(c) $V = P(\mathbb{R})$ and $\langle p(x), q(x) \rangle = \displaystyle\sum_{i=0}^{\min\{m,n\}} a_i b_i$ for $p(x) = \displaystyle\sum_{i=0}^{m} a_i x^i,\ q(x) = \displaystyle\sum_{i=0}^{n} b_i x^i \in V$.
(d) $V = M_{n \times n}(\mathbb{R})$ and $\langle A, B \rangle = \operatorname{tr}(AB)$ for $A, B \in V$.

5. Verify Example 12.1.6.5:
Let $[a, b]$, with $a < b$, be a closed interval on the real line. Consider the vector space $C([a, b])$ defined in Example 8.3.6.5. Define
$$\langle f, g \rangle = \frac{1}{b-a}\int_a^b f(t)g(t)\,dt \quad \text{for } f, g \in C([a, b]).$$
Prove that $\langle \cdot, \cdot \rangle$ is an inner product on $C([a, b])$.

6. Verify Example 12.1.6.6:
Let $V$ be the set of all real infinite sequences $(a_n)_{n \in \mathbb{N}}$ such that $\displaystyle\sum_{n=1}^{\infty} a_n^2$ converges.
(a) For any $(a_n)_{n \in \mathbb{N}}, (b_n)_{n \in \mathbb{N}} \in V$, prove that $\displaystyle\sum_{n=1}^{\infty} a_n b_n$ converges.
(b) Prove that $V$ is a subspace of $\mathbb{R}^{\mathbb{N}}$.
(c) Define $\langle (a_n)_{n \in \mathbb{N}}, (b_n)_{n \in \mathbb{N}} \rangle = \displaystyle\sum_{n=1}^{\infty} a_n b_n$ for $(a_n)_{n \in \mathbb{N}}, (b_n)_{n \in \mathbb{N}} \in V$.
Prove that $\langle \cdot, \cdot \rangle$ is an inner product on $V$.

7. Let $V$ be an inner product space over $\mathbb{C}$ and let
$$A = \begin{pmatrix} \langle u_1, u_1 \rangle & \langle u_1, u_2 \rangle & \cdots & \langle u_1, u_n \rangle \\ \langle u_2, u_1 \rangle & \langle u_2, u_2 \rangle & \cdots & \langle u_2, u_n \rangle \\ \vdots & \vdots & & \vdots \\ \langle u_n, u_1 \rangle & \langle u_n, u_2 \rangle & \cdots & \langle u_n, u_n \rangle \end{pmatrix}$$
where $u_1, u_2, \dots, u_n \in V$.
(a) Prove that all eigenvalues of $A$ are nonnegative real numbers. Hence show that $\det(A)$ is a nonnegative real number.
(b) Prove that $\det(A) = 0$ if and only if $u_1, u_2, \dots, u_n$ are linearly dependent.

Question 12.8 to Question 12.15 are exercises for Section 12.2.


8. Find all unit vectors in $P_1(\mathbb{R})$ if
(a) $P_1(\mathbb{R})$ is equipped with the inner product such that
$$\langle p(x), q(x) \rangle = \frac{1}{2}\int_{-1}^{1} p(t)q(t)\,dt \quad \text{for } p(x), q(x) \in P_1(\mathbb{R});$$
(b) $P_1(\mathbb{R})$ is equipped with the inner product such that
$$\langle p(x), q(x) \rangle = \int_0^1 p(t)q(t)\,dt \quad \text{for } p(x), q(x) \in P_1(\mathbb{R});$$
(c) $P_1(\mathbb{R})$ is equipped with the inner product defined in Question 12.4(c).

9. If $a_1, a_2, \dots, a_n$ are positive real numbers, prove that
$$(a_1 + a_2 + \cdots + a_n)\left( \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} \right) \ge n^2.$$

10. For $u = (x_1, x_2), v = (y_1, y_2) \in \mathbb{R}^2$, let $\langle u, v \rangle = x_1 y_1 - x_2 y_1 - x_1 y_2 + 4 x_2 y_2$.
(a) Prove that $\langle \cdot, \cdot \rangle$ is an inner product on $\mathbb{R}^2$.
(b) Prove that for any real numbers $x_1, x_2, y_1, y_2$,
$$(x_1 y_1 - x_2 y_1 - x_1 y_2 + 4 x_2 y_2)^2 \le \left( (x_1 - x_2)^2 + 3x_2^2 \right)\left( (y_1 - y_2)^2 + 3y_2^2 \right).$$

11. (Parallelogram Law) Let $V$ be an inner product space. Prove that for all $u, v \in V$, $\lVert u + v \rVert^2 + \lVert u - v \rVert^2 = 2\lVert u \rVert^2 + 2\lVert v \rVert^2$.
12. Complete the proof of Theorem 12.2.4:
Let $V$ be an inner product space over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. For any $c \in \mathbb{F}$ and $u, v \in V$, prove that
(a) $\lVert \mathbf{0} \rVert = 0$ and if $u \ne \mathbf{0}$, $\lVert u \rVert > 0$;
(b) $\lVert cu \rVert = |c|\, \lVert u \rVert$;
(c) $|\langle u, v \rangle| = \lVert u \rVert\, \lVert v \rVert$ if and only if $u$ and $v$ are linearly dependent; and
(d) $\lVert u + v \rVert \le \lVert u \rVert + \lVert v \rVert$.

13. Let $V$ and $W$ be inner product spaces over $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. A mapping $T : V \to W$ is said to be continuous at $v \in V$ if for any real number $\varepsilon > 0$, there exists a real number $\delta > 0$ such that for $u \in V$, $\lVert T(v) - T(u) \rVert < \varepsilon$ whenever $\lVert v - u \rVert < \delta$. The mapping $T$ is called continuous if it is continuous at every $v \in V$.
(a) Suppose $V$ is finite dimensional. Prove that every linear transformation $T : V \to W$ is continuous.
(b) Let $\mathbb{F} = \mathbb{R}$. Prove that if a mapping $T : V \to W$ is continuous and satisfies (T1) of Definition 9.1.2, then $T$ is a linear transformation.

14. Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$. A norm on a vector space $V$ over $\mathbb{F}$ is a mapping $\lVert \cdot \rVert : V \to \mathbb{R}$ that satisfies the following axioms:
(N1) $\lVert \mathbf{0} \rVert = 0$ and, for all nonzero $u \in V$, $\lVert u \rVert > 0$.
(N2) For all $c \in \mathbb{F}$ and $u \in V$, $\lVert cu \rVert = |c|\, \lVert u \rVert$.
(N3) For all $u, v \in V$, $\lVert u + v \rVert \le \lVert u \rVert + \lVert v \rVert$.
A vector space equipped with a norm is called a normed vector space.
For each of the following, determine whether $V$ is a normed vector space.
(a) $V$ is an inner product space and $\lVert u \rVert = \sqrt{\langle u, u \rangle}$ for $u \in V$.
(b) $V = \mathbb{F}^n$ and $\lVert (a_1, a_2, \dots, a_n) \rVert = \min_{1 \le i \le n} |a_i|$ for $(a_1, a_2, \dots, a_n) \in \mathbb{F}^n$.
(c) $V = \mathbb{F}^n$ and $\lVert (a_1, a_2, \dots, a_n) \rVert = \max_{1 \le i \le n} |a_i|$ for $(a_1, a_2, \dots, a_n) \in \mathbb{F}^n$.
(d) $V = \mathbb{F}^n$ and $\lVert (a_1, a_2, \dots, a_n) \rVert = |a_1| + |a_2| + \cdots + |a_n|$ for $(a_1, a_2, \dots, a_n) \in \mathbb{F}^n$.
(e) $V = \mathbb{F}^n$ and $\lVert (a_1, a_2, \dots, a_n) \rVert = |a_1|^2 + |a_2|^2 + \cdots + |a_n|^2$ for $(a_1, a_2, \dots, a_n) \in \mathbb{F}^n$.

15. Let $V$ be a normed vector space, as defined in Question 12.14, over $\mathbb{R}$. Define
$$\langle u, v \rangle = \tfrac{1}{4}\left( \lVert u + v \rVert^2 - \lVert u - v \rVert^2 \right) \quad \text{for } u, v \in V.$$
(a) Prove that $\langle \cdot, \cdot \rangle$ satisfies (IP1') and (IP4).
(b) In addition, suppose the norm $\lVert \cdot \rVert$ satisfies the Parallelogram Law (see Question 12.11), i.e. for $u, v \in V$,
$$\lVert u + v \rVert^2 + \lVert u - v \rVert^2 = 2\lVert u \rVert^2 + 2\lVert v \rVert^2.$$
(i) For all $u, v \in V$, show that $\langle u, 2v \rangle = 2\langle u, v \rangle$.
(ii) For all $u, v, w \in V$, show that $\langle u, w \rangle + \langle v, w \rangle = \tfrac{1}{2}\langle u + v, 2w \rangle$.
(iii) Prove that $\langle \cdot, \cdot \rangle$ satisfies (IP2).
(iv) For all $r \in \mathbb{Q}$ and $u, v \in V$, show that $\langle ru, v \rangle = r\langle u, v \rangle$.
(v) For all $u, v \in V$, show that $|\langle u, v \rangle| \le \lVert u \rVert\, \lVert v \rVert$.
(vi) For all $c \in \mathbb{R}$, $r \in \mathbb{Q}$ and $u, v \in V$, show that $|\langle cu, v \rangle - r\langle u, v \rangle| \le |c - r|\, \lVert u \rVert\, \lVert v \rVert$.
(vii) Prove that $\langle \cdot, \cdot \rangle$ satisfies (IP3) and hence $\langle \cdot, \cdot \rangle$ is an inner product on $V$.

Question 12.16 to Question 12.19 are exercises for Section 12.3.

16. For each part of Question 12.4, write down an orthonormal basis for $V$ if $\langle \cdot, \cdot \rangle$ is an inner product.

17. Suppose $C([0, 2\pi])$ is equipped with an inner product such that
$$\langle f, g \rangle = \frac{1}{2\pi}\int_0^{2\pi} f(t)g(t)\,dt \quad \text{for } f, g \in C([0, 2\pi]).$$
Let $\varphi_0, \varphi_1, \dots, \varphi_{2n}$ be functions in $C([0, 2\pi])$ defined by
$$\varphi_0(x) = 1, \quad \varphi_{2m-1}(x) = \cos(mx), \quad \varphi_{2m}(x) = \sin(mx) \quad (m = 1, 2, \dots, n) \text{ for } x \in [0, 2\pi].$$
(a) Prove that $\{\varphi_0, \varphi_1, \dots, \varphi_{2n}\}$ is orthogonal.
(b) Find an orthonormal basis for $W_n = \operatorname{span}\{\varphi_0, \varphi_1, \dots, \varphi_{2n}\}$.

18. Let $M_{2 \times 2}(\mathbb{C})$ be equipped with the inner product $\langle A, B \rangle = \operatorname{tr}(AB^*)$ for $A, B \in M_{2 \times 2}(\mathbb{C})$. Let $W = \operatorname{span}\{A_1, A_2, A_3, A_4\}$ where
$$A_1 = \begin{pmatrix} 1 & 0 \\ 0 & \mathrm{i} \end{pmatrix}, \quad A_2 = \begin{pmatrix} 2 & 0 \\ \mathrm{i} & 0 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 0 & 2\mathrm{i} \\ \mathrm{i} & 0 \end{pmatrix}, \quad A_4 = \begin{pmatrix} 0 & \mathrm{i} \\ \mathrm{i} & -\mathrm{i} \end{pmatrix}.$$
(a) Find a subset $B$ of $\{A_1, A_2, A_3, A_4\}$ such that $B$ is a basis for $W$.
(b) Starting with the basis in (a), use the Gram-Schmidt Process to find an orthonormal basis for $W$.

19. Suppose $P_n(\mathbb{R})$ is equipped with an inner product such that
$$\langle p(x), q(x) \rangle = \int_0^1 p(t)q(t)\,dt \quad \text{for } p(x), q(x) \in P_n(\mathbb{R}).$$
Starting with the standard bases, use the Gram-Schmidt Process to find an orthonormal basis for each of $P_1(\mathbb{R})$ and $P_2(\mathbb{R})$.

Question 12.20 to Question 12.28 are exercises for Section 12.4.

20. Suppose $\mathbb{C}^4$ is equipped with the usual inner product. Let $W$ be a subspace of $\mathbb{C}^4$ with a basis $B = \{(1, \mathrm{i}, 0, 0), (0, 1, \mathrm{i}, 0), (0, 0, 1, \mathrm{i})\}$.
(a) Find an orthonormal basis for $W$.
(b) Find an orthonormal basis for $W^{\perp}$.

21. Let $\mathbb{C}^4$ be equipped with the usual inner product and let
$$W = \{ (a,\; a - \mathrm{i}b,\; a + 2\mathrm{i}b,\; a + 3\mathrm{i}b) \mid a, b \in \mathbb{C} \}.$$
(a) Find an orthonormal basis for $W$.
(b) Find a formula for the orthogonal projection $\operatorname{Proj}_W : \mathbb{C}^4 \to \mathbb{C}^4$.

22. Let $V$ be an inner product space and $W_1$, $W_2$ subspaces of $V$.
(a) If $W_1 \subseteq W_2$, prove that $W_2^{\perp} \subseteq W_1^{\perp}$.
(b) Prove that $(W_1 + W_2)^{\perp} = W_1^{\perp} \cap W_2^{\perp}$.
(c) If $V$ is finite dimensional, prove that $(W_1 \cap W_2)^{\perp} = W_1^{\perp} + W_2^{\perp}$.

23. Let $T$ be a linear operator on an inner product space $V$ such that $T^2 = T$.
(a) Prove that $R(T) = \{ v \in V \mid T(v) = v \}$.
(b) Prove that $V = R(T) \oplus \operatorname{Ker}(T)$.
(c) Suppose $\lVert T(u) \rVert \le \lVert u \rVert$ for all $u \in V$. Prove that $T$ is the orthogonal projection of $V$ onto $R(T)$.

24. Let $V = M_{2 \times 2}(\mathbb{R})$ be equipped with the inner product $\langle A, B \rangle = \operatorname{tr}(AB^T)$ for $A, B \in V$. Let $W = \operatorname{span}\{A_1, A_2, A_3, A_4\}$ where
$$A_1 = \begin{pmatrix} 0 & 0 \\ 1 & 3 \end{pmatrix}, \quad A_2 = \begin{pmatrix} -2 & -1 \\ 2 & 2 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad A_4 = \begin{pmatrix} 0 & 1 \\ 0 & 2 \end{pmatrix}.$$
(a) Find an orthonormal basis for $W^{\perp}$.
(b) Let $C = \begin{pmatrix} 2 & 3 \\ -2 & 1 \end{pmatrix}$. Express $C$ in the form $C = P + Q$ where $P \in W$ and $Q \in W^{\perp}$.
(c) Find the smallest value in the set $\{ \lVert C - X \rVert \mid X \in W \}$.

25. Suppose $C([0, 1])$ is equipped with an inner product such that
$$\langle f, g \rangle = \int_0^1 f(t)g(t)\,dt \quad \text{for } f, g \in C([0, 1]).$$
Let $f$ be a function in $C([0, 1])$ defined by $f(x) = \sqrt{x}$ for $x \in [0, 1]$. Find the best approximation of $f$ in each of $P_1(\mathbb{R})$ and $P_2(\mathbb{R})$, where $P_1(\mathbb{R})$ and $P_2(\mathbb{R})$ are regarded as subspaces of $C([0, 1])$. (Hint: Use the orthonormal bases found in Question 12.19.)

26. Suppose $C([0, 2\pi])$ is equipped with the inner product defined in Question 12.17. Let $f$ be a function in $C([0, 2\pi])$ defined by $f(x) = e^x$ for $x \in [0, 2\pi]$. Follow the notation in Question 12.17. Find the best approximation of $f$ in $W_2 = \operatorname{span}\{\varphi_0, \varphi_1, \varphi_2\}$.

27. Let $V$ be an inner product space and $W$ a subspace of $V$. The distance from a vector $u$ to $W$ is defined to be the value
$$d(u, W) = \min_{w \in W} d(u, w).$$
For each of the following, find the distance from the given vector to $W$.
(a) $V = \mathbb{R}^3$ with the usual inner product, $W = \{ (x, y, z) \in V \mid x + y + z = 0 \}$ and the vector is $u = (1, 1, 2)$.
(b) $V = \mathbb{C}^3$ with the usual inner product, $W = \{ (x, y, z) \in V \mid x + y + z = 0 \}$ and the vector is $u = (1 + \mathrm{i}, 0, 1)$.
(c) $V = C([0, 1])$ with the inner product as in Question 12.25, $W = P_1(\mathbb{R})$ and the vector is the function $f$ defined in Question 12.25.

28. Let $V$ be an inner product space and $W$ a subspace of $V$. The distance from a vector $u$ to a coset $W + v$ is defined to be the value
$$d(u, W + v) = \min_{w \in W} d(u, w + v).$$
For each part of Question 12.27, find the distance from the given vector to the coset of $W$ stated below.
(a) $\{ (x, y, z) \in V \mid x + y + z = 1 \}$.
(b) $\{ (x, y, z) \in V \mid x + y + z = 1 + \mathrm{i} \}$.
(c) $W + g$ where $g \in V$ is the function defined by $g(x) = 1 + x - \sqrt{3x}$ for $x \in [0, 1]$.

Question 12.29 to Question 12.37 are exercises for Section 12.5.

29. For each of the following linear operators $T$ on the inner product space $V$, find the adjoint of $T$.
(a) $V = \mathbb{C}^3$ is equipped with the usual inner product and $T : V \to V$ is defined by $T((x, y, z)) = (x,\; x + y\mathrm{i},\; x + y\mathrm{i} - z)$ for $(x, y, z) \in V$.
(b) $V = \mathbb{R}^2$ is equipped with the inner product
$$\langle (u_1, u_2), (v_1, v_2) \rangle = u_1 v_1 + 2 u_2 v_2 \quad \text{for } (u_1, u_2), (v_1, v_2) \in V$$
and $T : V \to V$ is defined by $T((u_1, u_2)) = (u_2, u_1)$ for $(u_1, u_2) \in V$.
(c) $V = M_{n \times n}(\mathbb{C})$ is equipped with the inner product
$$\langle X, Y \rangle = \operatorname{tr}(XY^*) \quad \text{for } X, Y \in V$$
and $T : V \to V$ is defined by $T(X) = AX$ for $X \in V$, where $A$ is an $n \times n$ complex matrix.

30. (a) Let $T$ be a linear operator on an inner product space $V$ such that $T^*$ exists and let $W$ be a $T$-invariant subspace of $V$. Prove that $W^{\perp}$ is $T^*$-invariant.
(b) Let $\mathbb{C}^3$ be equipped with the usual inner product and let $T : \mathbb{C}^3 \to \mathbb{C}^3$ be the linear operator defined by
$$T((x, y, z)) = (x + \mathrm{i}y,\; z - \mathrm{i}y,\; x + y + (1 - \mathrm{i})z) \quad \text{for } (x, y, z) \in \mathbb{C}^3.$$
Let $W = \operatorname{span}\{(1, -1, -1)\}$.
(i) Write down a formula for $T^*$.
(ii) Find an orthonormal basis for $W$ and an orthonormal basis for $W^{\perp}$.
(iii) Verify that $W$ is $T$-invariant and $W^{\perp}$ is $T^*$-invariant.

31. Complete the proof of Proposition 12.5.7:
Let $\mathbb{F} = \mathbb{R}$ or $\mathbb{C}$ and let $V$ be an inner product space over $\mathbb{F}$. Suppose $S$ and $T$ are linear operators on $V$ such that $S^*$ and $T^*$ exist. Prove that
(a) $(S + T)^*$ exists and $(S + T)^* = S^* + T^*$;
(b) for any $c \in \mathbb{F}$, $(cT)^*$ exists and $(cT)^* = \bar{c}\,T^*$;
(c) $(T^*)^*$ exists and $(T^*)^* = T$; and
(d) if $W$ is a subspace of $V$ which is both $T$-invariant and $T^*$-invariant, then $(T|_W)^*$ exists and $(T|_W)^* = T^*|_W$.

32. Let $T$ be a linear operator on an inner product space such that $T^*$ exists.
(a) If $T$ is surjective, prove that $T^*$ is injective.
(b) If $T$ is injective, must $T^*$ be surjective?

33. Let $T$ be a linear operator on an inner product space $V$ such that $T$ is invertible and $T^*$ exists.
(a) If $V$ is finite dimensional, prove that $T^*$ is invertible.
(b) If $T^*$ is invertible, prove that $(T^{-1})^*$ exists and $(T^{-1})^* = (T^*)^{-1}$.
(c) Let $V$ be the $l^2$-space. Consider the subspace $W = \operatorname{span}\{e_1, e_2, e_3, \dots\}$ of $V$ defined in Remark 12.4.5. Define a linear operator $T$ on $W$ such that
$$T((a_n)_{n \in \mathbb{N}}) = (a_n - a_{n+1})_{n \in \mathbb{N}} \quad \text{for } (a_n)_{n \in \mathbb{N}} \in W.$$
(i) Find $T^{-1}$ and $T^*$.
(ii) Is $T^*$ invertible? Does $(T^{-1})^*$ exist?

34. Let $T$ be a linear operator on an inner product space $V$ such that $T^*$ exists.
(a) Prove that $\operatorname{Ker}(T^* \circ T) = \operatorname{Ker}(T)$.
(b) Is it true that $\operatorname{Ker}(T \circ T^*) = \operatorname{Ker}(T)$? Justify your answer.

35. Let $T$ be a linear operator on an inner product space $V$ such that $T^*$ exists.
(a) For $v_1, v_2, \dots, v_k \in R(T)$, show that if $T(v_1), T(v_2), \dots, T(v_k)$ are linearly independent, then $T^*(T(v_1)), T^*(T(v_2)), \dots, T^*(T(v_k))$ are linearly independent.
(b) Suppose $R(T)$ is finite dimensional. (Note that $V$ may not be finite dimensional.) Prove that $R(T^*)$ is finite dimensional and $\operatorname{rank}(T^*) = \operatorname{rank}(T)$.
(Hint: By substituting $T$ by $T^*$ in (a), if $T^*(v_1), T^*(v_2), \dots, T^*(v_k)$ are linearly independent, then $T(T^*(v_1)), T(T^*(v_2)), \dots, T(T^*(v_k))$ are linearly independent.)

36. Let $T$ be a linear operator on an inner product space $V$ such that $T^*$ exists.
(a) Given $b \in V$, show that $x = u$ is a solution to $(T^* \circ T)(x) = T^*(b)$ if and only if $T(u)$ is the orthogonal projection of $b$ onto $R(T)$.
(b) Given $b \in R(T)$, show that $\{ u \in V \mid T(u) = b \} = \{ u \in V \mid (T^* \circ T)(u) = T^*(b) \}$.

37. Determine which of the following complex square matrices are unitary.
$$\text{(i)}\ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & \mathrm{i} \\ 0 & \mathrm{i} & 0 \end{pmatrix},\quad \text{(ii)}\ \begin{pmatrix} 0 & \mathrm{i} & 0 \\ 0 & 0 & \mathrm{i} \\ \mathrm{i} & 0 & 0 \end{pmatrix},\quad \text{(iii)}\ \begin{pmatrix} 1 & 0 & \mathrm{i} \\ 0 & 1 & 0 \\ \mathrm{i} & 0 & 1 \end{pmatrix},\quad \text{(iv)}\ \begin{pmatrix} 1 & 0 & \mathrm{i} \\ 0 & 1 & 1 \\ \mathrm{i} & 1 & 1 \end{pmatrix},\quad \text{(v)}\ \begin{pmatrix} 1 & 0 & \mathrm{i} \\ 0 & 1 & 0 \\ \mathrm{i} & 0 & 0 \end{pmatrix}.$$

Question 12.38 to Question 12.45 are exercises for Section 12.6.


38. Determine which of the complex square matrices in Question 12.37 are Hermitian and/or
normal.
39. Let $A = \begin{pmatrix} 0 & 1 & \mathrm{i} \\ 1 & 0 & 1 \\ -\mathrm{i} & 1 & 0 \end{pmatrix}$.
(a) Verify that $A$ is a normal matrix.
(b) Find a unitary matrix $P$ such that $P^*AP$ is a diagonal matrix.
(b) Find an unitary matrix P such that P  AP is a diagonal matrix.

40. (a) For each of the following matrices $A$, determine whether $A$ is orthogonally diagonalizable. If so, find an orthogonal matrix $P$ such that $P^TAP$ is a diagonal matrix.
$$\text{(i)}\ A = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix};\quad \text{(ii)}\ A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix};\quad \text{(iii)}\ A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
(b) For each of the matrices $A$ in (a), determine whether $A$ is unitarily diagonalizable. If so, find a unitary matrix $P$ such that $P^*AP$ is a diagonal matrix.

41. Prove Theorem 12.6.9.2 and Theorem 12.6.12.2:
(a) Prove that a complex square matrix $A$ is unitarily diagonalizable if and only if $A$ is normal.
(b) Prove that a real square matrix $A$ is orthogonally diagonalizable if and only if $A$ is symmetric.

42. (a) Let $V$ be a finite dimensional complex inner product space and $T$ a normal operator on $V$. Prove that
(i) $T$ is self-adjoint if and only if all eigenvalues of $T$ are real; and
(ii) $T$ is unitary if and only if all eigenvalues of $T$ have modulus $1$.
(b) Restate the results in Part (a) using square matrices.

43. Let $V$ be a finite dimensional complex inner product space and $T : V \to V$ a self-adjoint linear operator on $V$.
(a) Prove that the linear operator $T - \mathrm{i}I_V$ is invertible.
(b) Prove that the linear operator $S = (T + \mathrm{i}I_V) \circ (T - \mathrm{i}I_V)^{-1}$ is unitary.

44. Let $A$ be an $n \times n$ Hermitian matrix, i.e. $A^* = A$. Define
$$\langle u, v \rangle = uAv^* \quad \text{for } u, v \in \mathbb{C}^n$$
where $u$ and $v$ are written as row vectors.
(a) Prove that $\langle \cdot, \cdot \rangle$ satisfies (IP1), (IP2) and (IP3).
(b) Give a necessary and sufficient condition on the eigenvalues of $A$ so that $\langle \cdot, \cdot \rangle$ is an inner product. (See also Question 12.3.)

45. Let $V$ be a finite dimensional inner product space over $\mathbb{C}$.
(a) Let $T$ be a self-adjoint operator on $V$. Prove that $\langle T(u), u \rangle$ is a real number for all $u \in V$.
(b) A linear operator $P$ on $V$ is called positive semi-definite if $P$ is self-adjoint and $\langle P(u), u \rangle \ge 0$ for all $u \in V$.
(i) Let $S$ be a linear operator on $V$ and let $P = S^* \circ S$. Prove that $P$ is positive semi-definite.
(ii) Suppose $P$ is a positive semi-definite operator on $V$. Show that there exists a linear operator $S$ on $V$ such that $P = S^* \circ S$.