Linear Algebra Concepts and Theories on Abstract Vector Spaces
by
Ma Siu Lun
Department of Mathematics
National University of Singapore
Discussion 8.1.1 In Chapter 3, we defined the Euclidean $n$-space $\mathbb{R}^n$ with two operations: the addition and the scalar multiplication. The same operations are also defined in Chapter 2 when we study matrices. It is natural to ask if the results on $n$-vectors studied in Chapter 3 can be applied to matrices as well. In order to unify all these similar mathematical objects, we need a more general framework for vector spaces. But before we study this abstract version of vector spaces, we first introduce an abstract version of real numbers.

(F4) (Existence of the Additive Identity) There exists an element $0 \in \mathbb{F}$ such that $a + 0 = a$ for all $a \in \mathbb{F}$. We call $0$ the zero element of $\mathbb{F}$ and all other elements in $\mathbb{F}$ are called nonzero elements of $\mathbb{F}$.
(By Proposition 8.1.5.1, we know that there exists only one such element in $\mathbb{F}$.)
(F5) (Existence of Additive Inverses) For every $a \in \mathbb{F}$, there exists $b \in \mathbb{F}$ such that $a + b = 0$. We call $b$ the additive inverse of $a$ and denote $b$ by $-a$.
(By Proposition 8.1.5.2, we know that for each $a$, the additive inverse of $a$ is unique.)
(F6) (Closure under Multiplication) For all $a, b \in \mathbb{F}$, $ab \in \mathbb{F}$.
(F7) (Commutative Law for Multiplication) For all $a, b \in \mathbb{F}$, $ab = ba$.
(F8) (Associative Law for Multiplication) For all $a, b, c \in \mathbb{F}$, $(ab)c = a(bc)$.
(F9) (Existence of the Multiplicative Identity) There exists a nonzero element $1 \in \mathbb{F}$ such that $1a = a$ for all $a \in \mathbb{F}$.
(By Proposition 8.1.5.3, we know that there exists only one such element in $\mathbb{F}$.)
(F10) (Existence of Multiplicative Inverses) For every nonzero element $a \in \mathbb{F}$, there exists $c \in \mathbb{F}$ such that $ac = 1$. We call $c$ the multiplicative inverse of $a$ and denote $c$ by $a^{-1}$.
(By Proposition 8.1.5.4, we know that for each $a$, the multiplicative inverse of $a$ is unique.)
(F11) (Distributive Law) For all $a, b, c \in \mathbb{F}$, $a(b + c) = (ab) + (ac)$.
Example 8.1.3
1. (Number Systems) When we talk about a number system, we refer to one of the following sets together with the usual addition and multiplication.
(a) $\mathbb{N}$ = the set of all natural numbers, i.e. $\mathbb{N} = \{1, 2, 3, \ldots\}$.
(b) $\mathbb{Z}$ = the set of all integers, i.e. $\mathbb{Z} = \{0, \pm 1, \pm 2, \ldots\}$.
(c) $\mathbb{Q}$ = the set of all rational numbers, i.e. $\mathbb{Q} = \left\{ \frac{p}{q} \,\middle|\, p, q \in \mathbb{Z} \text{ and } q \ne 0 \right\}$.
(d) $\mathbb{R}$ = the set of all real numbers.
(e) $\mathbb{C}$ = the set of all complex numbers, i.e. $\mathbb{C} = \{a + b\,\mathrm{i} \mid a, b \in \mathbb{R}\}$ where $\mathrm{i}^2 = -1$ ($\mathrm{i} = \sqrt{-1}$).
Note that $\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}$. The number systems $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ (with the usual addition and multiplication) satisfy all the axioms in Definition 8.1.2. So they are fields.
The number system $\mathbb{N}$ does not satisfy (F4), (F5) and (F10) and hence is not a field. The number system $\mathbb{Z}$ does not satisfy (F10) and hence is not a field.
2. Let $\mathbb{F}_2 = \{0, 1\}$ with the addition and multiplication given by the following tables:
$$\begin{array}{c|cc} + & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 0 \end{array} \qquad\qquad \begin{array}{c|cc} \times & 0 & 1 \\ \hline 0 & 0 & 0 \\ 1 & 0 & 1 \end{array}$$
It can be checked that $\mathbb{F}_2$ is a field.
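The check can also be done by brute force. The following Python sketch (the helper names `add` and `mul` are our own, not from any library) verifies commutativity, associativity, distributivity, identities and inverses for $\mathbb{F}_2$:

```python
from itertools import product

F2 = [0, 1]
add = lambda a, b: (a + b) % 2   # the addition table above
mul = lambda a, b: (a * b) % 2   # the multiplication table above

# commutativity, associativity and distributivity
assert all(add(a, b) == add(b, a) for a, b in product(F2, repeat=2))
assert all(mul(a, b) == mul(b, a) for a, b in product(F2, repeat=2))
assert all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in product(F2, repeat=3))
assert all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in product(F2, repeat=3))
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) for a, b, c in product(F2, repeat=3))

# identities and inverses
assert all(add(a, 0) == a and mul(1, a) == a for a in F2)
assert all(any(add(a, b) == 0 for b in F2) for a in F2)            # additive inverses
assert all(any(mul(a, b) == 1 for b in F2) for a in F2 if a != 0)  # multiplicative inverses
```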
A field which has only finitely many elements is called a finite field. The field $\mathbb{F}_2$ here is an example of a finite field. It is known that a finite field of $q$ elements exists if and only if $q = p^s$ for some prime number $p$ and positive integer $s$.
Remark 8.1.4 The approach of defining something by properties (or axioms) is one of the characteristics of modern mathematics. The advantage is that once we prove a theorem based on these properties, the theorem can automatically be applied to all mathematical objects that have these properties.
(Proofs of the other parts are left as exercises. See Question 8.5.)
Definition 8.1.6 Let $\mathbb{F}$ be a field. We can define the subtraction and division as follows.
1. For any $a, b \in \mathbb{F}$, the subtraction of $a$ by $b$ is defined to be $a + (-b)$ and is denoted by $a - b$.
Discussion 8.1.7 The results we established in Chapter 1 and Chapter 2 can be generalized to any field. In particular, row-echelon and reduced row-echelon forms, Gaussian and Gauss-Jordan Eliminations, matrix operations, inverses and determinants of square matrices will be used in the following chapters over any field.
Example 8.1.9
1. Solve the following complex linear system:
$$\begin{cases} x_1 + \mathrm{i}\,x_2 + 3\mathrm{i}\,x_3 = 0 \\ \mathrm{i}\,x_1 + x_2 + x_3 = 0 \\ (1-\mathrm{i})x_1 + (1+\mathrm{i})x_3 = 0. \end{cases}$$
Solution
$$\begin{pmatrix} 1 & \mathrm{i} & 3\mathrm{i} & 0 \\ \mathrm{i} & 1 & 1 & 0 \\ 1-\mathrm{i} & 0 & 1+\mathrm{i} & 0 \end{pmatrix} \xrightarrow[R_3 - (1-\mathrm{i})R_1]{R_2 - \mathrm{i}\,R_1} \begin{pmatrix} 1 & \mathrm{i} & 3\mathrm{i} & 0 \\ 0 & 2 & 4 & 0 \\ 0 & -1-\mathrm{i} & -2-2\mathrm{i} & 0 \end{pmatrix} \xrightarrow{R_3 + \frac{1}{2}(1+\mathrm{i})R_2} \begin{pmatrix} 1 & \mathrm{i} & 3\mathrm{i} & 0 \\ 0 & 2 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
$$\xrightarrow{\frac{1}{2}R_2} \begin{pmatrix} 1 & \mathrm{i} & 3\mathrm{i} & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \xrightarrow{R_1 - \mathrm{i}\,R_2} \begin{pmatrix} 1 & 0 & \mathrm{i} & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
The last augmented matrix corresponds to the complex system
$$\begin{cases} x_1 + \mathrm{i}\,x_3 = 0 \\ x_2 + 2x_3 = 0 \end{cases}$$
which has a general solution $x_1 = -\mathrm{i}\,t$, $x_2 = -2t$, $x_3 = t$ for $t \in \mathbb{C}$.
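The row reduction can be checked with a computer algebra system; the following sympy sketch (assuming sympy is available; its `I` is the imaginary unit) reproduces the general solution:

```python
from sympy import Matrix, I, symbols, linsolve

x1, x2, x3 = symbols('x1 x2 x3')
A = Matrix([[1, I, 3*I],
            [I, 1, 1],
            [1 - I, 0, 1 + I]])
print(A.rref())   # reduced form has rows (1, 0, I), (0, 1, 2), (0, 0, 0)
print(linsolve((A, Matrix([0, 0, 0])), x1, x2, x3))   # {(-I*x3, -2*x3, x3)}
```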
2. Let $A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix}$ be a matrix over $\mathbb{F}_2$. Find the inverse of $A$.
Solution We apply the Gauss-Jordan Elimination over $\mathbb{F}_2$ (recall that $1 + 1 = 0$, so $-1 = 1$ in $\mathbb{F}_2$):
$$\left(\begin{array}{ccc|ccc} 1&1&1&1&0&0 \\ 0&1&1&0&1&0 \\ 1&0&1&0&0&1 \end{array}\right) \xrightarrow{R_3 + R_1} \left(\begin{array}{ccc|ccc} 1&1&1&1&0&0 \\ 0&1&1&0&1&0 \\ 0&1&0&1&0&1 \end{array}\right) \xrightarrow{R_3 + R_2} \left(\begin{array}{ccc|ccc} 1&1&1&1&0&0 \\ 0&1&1&0&1&0 \\ 0&0&1&1&1&1 \end{array}\right)$$
$$\xrightarrow{R_2 + R_3} \left(\begin{array}{ccc|ccc} 1&1&1&1&0&0 \\ 0&1&0&1&0&1 \\ 0&0&1&1&1&1 \end{array}\right) \xrightarrow{R_1 + R_2 + R_3} \left(\begin{array}{ccc|ccc} 1&0&0&1&1&0 \\ 0&1&0&1&0&1 \\ 0&0&1&1&1&1 \end{array}\right)$$
So $A^{-1} = \begin{pmatrix} 1&1&0 \\ 1&0&1 \\ 1&1&1 \end{pmatrix}$ over $\mathbb{F}_2$.
3. Determinants over any field can be computed by cofactor expansion as in Chapter 2. For the $4 \times 4$ complex matrix $B$ of this example, expanding along a column with many zero entries gives
$$\det(B) = \mathrm{i}\,[(1+\mathrm{i})(2-\mathrm{i}) + 1] = -1 + 4\mathrm{i}.$$
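The inverse over $\mathbb{F}_2$ can be verified by working modulo 2; `inv_mod` is sympy's modular matrix inverse (a minimal sketch, assuming sympy is available):

```python
from sympy import Matrix

A = Matrix([[1, 1, 1],
            [0, 1, 1],
            [1, 0, 1]])
A_inv = A.inv_mod(2)                            # inverse of A modulo 2
print(A_inv)                                    # Matrix([[1, 1, 0], [1, 0, 1], [1, 1, 1]])
print((A * A_inv).applyfunc(lambda e: e % 2))   # the identity matrix over F2
```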
Proposition 8.1.11
1. If $A$ and $B$ are $n \times n$ matrices over $\mathbb{F}$, then $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$.
Remark 8.1.12 In general, $\mathrm{tr}(XYZ) \ne \mathrm{tr}(YXZ)$ even when the matrices $X$, $Y$ and $Z$ can be multiplied accordingly. (The trace of a product is invariant under cyclic permutations of the factors, but $YXZ$ is not a cyclic permutation of $XYZ$.)
For example, let $X = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, $Y = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and $Z = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. Then
$$XYZ = \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix} \quad\text{and}\quad YXZ = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},$$
so $\mathrm{tr}(XYZ) = -1 \ne 1 = \mathrm{tr}(YXZ)$.
Discussion 8.2.1 In this section, we give an abstract definition of vector spaces. Under this general framework, the vector spaces learnt in Chapter 3 become particular examples.
(V3) (Associative Law for Vector Addition) For all $u, v, w \in V$, $u + (v + w) = (u + v) + w$.
(V4) (Existence of the Zero Vector) There exists a vector $0 \in V$, called the zero vector, such that $u + 0 = u$ for all $u \in V$.
(By Proposition 8.2.4.1, we know that there exists only one such vector in $V$.)
(V5) (Existence of Additive Inverses) For every vector $u \in V$, there exists a vector $v \in V$ such that $u + v = 0$. We call $v$ the negative of $u$ and denote $v$ by $-u$.
(By Proposition 8.2.4.2, we know that for each $u$, the negative of $u$ is unique.)
By this axiom, for $u, v \in V$, we can define the subtraction of $u$ by $v$ to be $u - v = u + (-v)$.
(V6) (Closure under Scalar Multiplication) For all $c \in \mathbb{F}$ and $u \in V$, $cu \in V$.
(The last two axioms, i.e. (V9) and (V10), are also known as Distributive Laws.)
As a convention, we say that $V$ together with the given vector addition and scalar multiplication is a vector space over $\mathbb{F}$. If the vector addition and scalar multiplication are known, we simply say that $V$ is a vector space over $\mathbb{F}$. In particular, if $\mathbb{F} = \mathbb{R}$, $V$ is called a real vector space; and if $\mathbb{F} = \mathbb{C}$, $V$ is called a complex vector space.
Example 8.2.3
1. The Euclidean $n$-space $\mathbb{R}^n$ in Chapter 3 is a real vector space using the vector addition and scalar multiplication defined in Definition 3.1.3. The axioms (V1) and (V6) are obviously satisfied. The other axioms follow by Theorem 3.1.6.
3. Let $\mathbb{F}$ be a field and let $M_{m\times n}(\mathbb{F})$ be the set of all $m \times n$ matrices over $\mathbb{F}$, i.e. $A \in M_{m\times n}(\mathbb{F})$ if and only if
$$A = (a_{ij})_{m\times n} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \quad\text{where } a_{ij} \in \mathbb{F} \text{ for all } i, j.$$
We can define the matrix addition and scalar multiplication for matrices over $\mathbb{F}$ as discussed in Discussion 8.1.7, i.e.
$$(a_{ij}) + (b_{ij}) = (a_{ij} + b_{ij}) \quad\text{for } (a_{ij}), (b_{ij}) \in M_{m\times n}(\mathbb{F})$$
and
$$c(a_{ij}) = (ca_{ij}) \quad\text{for } c \in \mathbb{F} \text{ and } (a_{ij}) \in M_{m\times n}(\mathbb{F})$$
(see Notation 2.1.5 for the notation used above). Then $M_{m\times n}(\mathbb{F})$ is a vector space over $\mathbb{F}$. The zero vector is the zero matrix $0_{m\times n}$.
In particular, $M_{m\times n}(\mathbb{Q})$, $M_{m\times n}(\mathbb{R})$, $M_{m\times n}(\mathbb{C})$ and $M_{m\times n}(\mathbb{F}_2)$ are vector spaces.
4. Let $\mathbb{F}$ be a field and let $\mathbb{F}^{\mathbb{N}}$ be the set of all infinite sequences $a = (a_n)_{n\in\mathbb{N}} = (a_1, a_2, a_3, \ldots)$ with $a_1, a_2, a_3, \ldots \in \mathbb{F}$, i.e.
$$\mathbb{F}^{\mathbb{N}} = \{(a_n)_{n\in\mathbb{N}} \mid a_1, a_2, a_3, \ldots \in \mathbb{F}\}.$$
The addition and scalar multiplication are defined componentwise: for $a = (a_n)_{n\in\mathbb{N}}$, $b = (b_n)_{n\in\mathbb{N}} \in \mathbb{F}^{\mathbb{N}}$ and $c \in \mathbb{F}$,
$$a + b = (a_n + b_n)_{n\in\mathbb{N}} = (a_1 + b_1, a_2 + b_2, a_3 + b_3, \ldots) \quad\text{and}\quad ca = (ca_n)_{n\in\mathbb{N}}.$$
Then $\mathbb{F}^{\mathbb{N}}$ is a vector space over $\mathbb{F}$. The zero vector is the zero sequence $(0, 0, 0, \ldots)$.
In particular, $\mathbb{Q}^{\mathbb{N}}$, $\mathbb{R}^{\mathbb{N}}$, $\mathbb{C}^{\mathbb{N}}$ and $\mathbb{F}_2^{\mathbb{N}}$ are vector spaces.
$$(a_0 + a_1x + \cdots + a_mx^m) + (b_0 + b_1x + \cdots + b_nx^n)$$
$$= (b_0 + b_1x + \cdots + b_nx^n) + (a_0 + a_1x + \cdots + a_mx^m)$$
$$= (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_m + b_m)x^m + b_{m+1}x^{m+1} + \cdots + b_nx^n.$$
Then $F(A, \mathbb{F})$ is a vector space over $\mathbb{F}$. The zero vector is the zero function $0 : A \to \mathbb{F}$ defined by $0(a) = 0$ for $a \in A$.
7. Let $\mathbb{F}$ be a field and $V = \{0\}$. We define $0 + 0 = 0$ and $c\,0 = 0$ for $c \in \mathbb{F}$. Then $V$ is a vector space over $\mathbb{F}$, called a zero space.
8. The set $\mathbb{C}$ of complex numbers is a vector space over $\mathbb{R}$ using the usual addition of complex numbers as the vector addition and
$$c(a + b\,\mathrm{i}) = (ca) + (cb)\mathrm{i} \quad\text{for } c \in \mathbb{R} \text{ and } a + b\,\mathrm{i} \in \mathbb{C} \text{ with } a, b \in \mathbb{R}$$
as the scalar multiplication.
9. Let $V$ be the set of all positive real numbers, i.e. $V = \{a \in \mathbb{R} \mid a > 0\}$.
(a) $V$ is not a vector space over $\mathbb{R}$ using the usual addition of real numbers as the vector addition and the usual multiplication of real numbers as the scalar multiplication. It does not satisfy axioms (V5) and (V6) in Definition 8.2.2.
(b) Define the vector addition by
$$a \oplus b = ab \quad\text{for } a, b \in V$$
and define the scalar multiplication by
$$m \odot a = a^m \quad\text{for } m \in \mathbb{R} \text{ and } a \in V.$$
Then $V$ is a vector space over $\mathbb{R}$ using these two operations. (We leave the verification as an exercise. See Question 8.9.)
Since $0 = 0 + 0$ in $\mathbb{F}$, we have $0u = (0 + 0)u = 0u + 0u$. By (V5), the vector $0u$ has the negative $-(0u)$. Adding $-(0u)$ to both sides of the equation above yields
$$\begin{aligned} 0u + (-(0u)) &= [0u + 0u] + (-(0u)) \\ \Rightarrow\quad 0u + (-(0u)) &= 0u + [0u + (-(0u))] &&\text{by (V3)} \\ \Rightarrow\quad 0 &= 0u + 0 &&\text{by (V5)} \\ \Rightarrow\quad 0 &= 0u. &&\text{by (V4)} \end{aligned}$$
(Proofs of the other parts are left as exercises. See Question 8.12.)
Discussion 8.3.1 Given an arbitrary subset $W$ of a vector space $V$, under the same vector addition and scalar multiplication as in $V$, $W$ automatically satisfies axioms (V2), (V3) and (V7)-(V10) in Definition 8.2.2 (whenever they make sense). In case $W$ also satisfies (V1), (V4), (V5) and (V6), it forms a vector space sitting inside the larger vector space $V$. For example, in $\mathbb{R}^3$, the $xy$-plane is itself a vector space.
For most of the applications of linear algebra, we need to work with smaller vector spaces sitting inside a big vector space.
Example 8.3.3
1. Let $V$ be a vector space and $0$ the zero vector.
(a) Since $\{0\}$ is a subset of $V$ and $\{0\}$ is a vector space (a zero space), $\{0\}$ is a subspace of $V$.
(b) Since $V$ is a subset of $V$ and $V$ is a vector space, $V$ is a subspace of $V$.
These two subspaces, $\{0\}$ and $V$, are called trivial subspaces of $V$. Other subspaces of $V$ are called proper subspaces of $V$.
2. Let $\mathbb{F}$ be a field and $W = \{(a, a) \mid a \in \mathbb{F}\}$, a subset of $\mathbb{F}^2$.
(V1) Take any two vectors $(a, a), (b, b) \in W$. The sum $(a, a) + (b, b) = (a + b, a + b)$ is again a vector in $W$. Hence $W$ is closed under the vector addition.
(V4) The zero vector $(0, 0)$ of $V$ is also contained in $W$.
(V5) Take any vector $(a, a) \in W$. The negative of $(a, a)$ is $(-a, -a)$, which is again a vector in $W$. So the negative of every vector in $W$ is also contained in $W$.
(V6) Take any vector $(a, a) \in W$ and any scalar $c \in \mathbb{F}$. The scalar multiple $c(a, a) = (ca, ca)$ is again a vector in $W$. Hence $W$ is closed under the scalar multiplication.
Proof
($\Rightarrow$) (S1), (S2) and (S3) follow directly from the definition of vector spaces.
($\Leftarrow$) Suppose $W$ satisfies (S1), (S2) and (S3). We only need to show that $W$ satisfies (V5). (Why?)
(V5) Take any vector $u \in W$. By Proposition 8.2.4.3, $-u = (-1)u$ and, by (S3), it is contained in $W$.
Remark 8.3.5 Theorem 8.3.4 can be simplified further: Let $W$ be a nonempty subset of a vector space $V$ over a field $\mathbb{F}$. Then $W$ is a subspace of $V$ if and only if $au + bv \in W$ for all $a, b \in \mathbb{F}$ and $u, v \in W$.
Example 8.3.6
1. Let $\mathbb{F}$ be a field and $W = \left\{ \begin{pmatrix} a & b \\ b & a \end{pmatrix} \,\middle|\, a, b \in \mathbb{F} \right\} \subseteq M_{2\times 2}(\mathbb{F})$.
(S1) $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \in W$ (because the matrix is of the form $\begin{pmatrix} a & b \\ b & a \end{pmatrix}$ with $a = 0, b = 0 \in \mathbb{F}$).
(S2) For any $\begin{pmatrix} a_1 & b_1 \\ b_1 & a_1 \end{pmatrix}, \begin{pmatrix} a_2 & b_2 \\ b_2 & a_2 \end{pmatrix} \in W$ (where $a_1, a_2, b_1, b_2 \in \mathbb{F}$),
$$\begin{pmatrix} a_1 & b_1 \\ b_1 & a_1 \end{pmatrix} + \begin{pmatrix} a_2 & b_2 \\ b_2 & a_2 \end{pmatrix} = \begin{pmatrix} a_1 + a_2 & b_1 + b_2 \\ b_1 + b_2 & a_1 + a_2 \end{pmatrix} \in W$$
(because the matrix is of the form $\begin{pmatrix} a & b \\ b & a \end{pmatrix}$ with $a = a_1 + a_2, b = b_1 + b_2 \in \mathbb{F}$).
(S3) For any $c \in \mathbb{F}$ and $\begin{pmatrix} a_0 & b_0 \\ b_0 & a_0 \end{pmatrix} \in W$ (where $a_0, b_0 \in \mathbb{F}$),
$$c\begin{pmatrix} a_0 & b_0 \\ b_0 & a_0 \end{pmatrix} = \begin{pmatrix} ca_0 & cb_0 \\ cb_0 & ca_0 \end{pmatrix} \in W$$
(because the matrix is of the form $\begin{pmatrix} a & b \\ b & a \end{pmatrix}$ with $a = ca_0, b = cb_0 \in \mathbb{F}$).
As $W$ is a subset of $M_{2\times 2}(\mathbb{F})$ satisfying (S1), (S2) and (S3), it is a subspace of $M_{2\times 2}(\mathbb{F})$.
2. (In this example, vectors in $\mathbb{F}^n$ are written as column vectors.) Let $\mathbb{F}$ be a field and $A$ an $m \times n$ matrix over $\mathbb{F}$. Then the solution set $W$ of the homogeneous linear system $Ax = 0$ is a subspace of $\mathbb{F}^n$. The subspace $W$ is called the solution space of $Ax = 0$ or the nullspace of $A$.
(S1) Since $A0 = 0$, the zero vector is contained in $W$.
(S2) Take any $u, v \in W$, i.e. $Au = 0$ and $Av = 0$. Since
$$A(u + v) = Au + Av = 0 + 0 = 0,$$
$u + v \in W$.
(S3) Take any $c \in \mathbb{F}$ and take any $u \in W$, i.e. $Au = 0$. Since
$$A(cu) = cAu = c0 = 0,$$
$cu \in W$.
As $W$ is a subset of $\mathbb{F}^n$ satisfying (S1), (S2) and (S3), it is a subspace of $\mathbb{F}^n$.
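As a concrete illustration (the matrix here is an arbitrary choice of ours, not from the text), sympy computes a spanning set for a nullspace directly:

```python
from sympy import Matrix

# an arbitrary 2x3 real matrix; its nullspace is a subspace of R^3
A = Matrix([[1, 2, 1],
            [2, 4, 0]])
basis = A.nullspace()   # list of column vectors spanning {x : Ax = 0}
print(basis)            # [Matrix([[-2], [1], [0]])]
for v in basis:
    assert A * v == Matrix([0, 0])   # each spanning vector solves Ax = 0
```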
3. Consider the vector space $\mathbb{R}^{\mathbb{N}}$ of infinite sequences over $\mathbb{R}$. Define
$$W = \left\{ (a_1, a_2, a_3, \ldots) \in \mathbb{R}^{\mathbb{N}} \,\middle|\, \lim_{n\to\infty} a_n = 0 \right\} \subseteq \mathbb{R}^{\mathbb{N}}.$$
(S1) Since the limit of the zero sequence $(0, 0, 0, \ldots)$ is $0$, $(0, 0, 0, \ldots) \in W$.
(S2) Take any $a = (a_n)_{n\in\mathbb{N}}, b = (b_n)_{n\in\mathbb{N}} \in W$, i.e. $\lim_{n\to\infty} a_n = 0$ and $\lim_{n\to\infty} b_n = 0$. Since
$$\lim_{n\to\infty}(a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n = 0 + 0 = 0,$$
$a + b = (a_n + b_n)_{n\in\mathbb{N}} \in W$.
(S3) Take any $c \in \mathbb{R}$ and take any $a = (a_n)_{n\in\mathbb{N}} \in W$, i.e. $\lim_{n\to\infty} a_n = 0$. Since
$$\lim_{n\to\infty} ca_n = c\lim_{n\to\infty} a_n = c\,0 = 0,$$
$ca = (ca_n)_{n\in\mathbb{N}} \in W$.
As $W$ is a subset of $\mathbb{R}^{\mathbb{N}}$ satisfying (S1), (S2) and (S3), it is a subspace of $\mathbb{R}^{\mathbb{N}}$.
4. Let $\mathbb{F}$ be a field and $n$ a positive integer. Define $P_n(\mathbb{F})$ to be the set of all polynomials over $\mathbb{F}$ with degree at most $n$. Note that
$$P_n(\mathbb{F}) = \{a_0 + a_1x + \cdots + a_nx^n \mid a_0, a_1, \ldots, a_n \in \mathbb{F}\} \subseteq P(\mathbb{F}).$$
Remark 8.3.7 Since a real polynomial can be regarded as a real-valued infinitely differentiable function, $P(\mathbb{R})$ and $P_n(\mathbb{R})$, for $n \in \mathbb{N}$, can be considered as subspaces of $C([a, b])$, $C^m([a, b])$, for all $m \in \mathbb{N}$, and $C^\infty([a, b])$ for any closed interval $[a, b]$ on the real line.
Theorem 8.3.8 If $W_1$ and $W_2$ are two subspaces of a vector space $V$, then the intersection of $W_1$ and $W_2$ is also a subspace of $V$. (Recall that for sets $A$ and $B$, the intersection of $A$ and $B$ is the set $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$.)
Proof
(S1) Since both $W_1$ and $W_2$ contain the zero vector, the zero vector is contained in $W_1 \cap W_2$.
(S2) Take any $u, v \in W_1 \cap W_2$. Since $W_1$ is a subspace and $u, v \in W_1$, we have $u + v \in W_1$. Similarly, $u + v \in W_2$. Thus $u + v \in W_1 \cap W_2$.
(S3) Take any scalar $c$ and any $u \in W_1 \cap W_2$. Since $W_1$ is a subspace and $u \in W_1$, we have $cu \in W_1$. Similarly, $cu \in W_2$. Thus $cu \in W_1 \cap W_2$.
As $W_1 \cap W_2$ is a subset of $V$ satisfying (S1), (S2) and (S3), it is a subspace of $V$.
Remark 8.3.10
1. If $W_1, W_2, \ldots, W_n$ are subspaces of a vector space $V$, then $W_1 \cap W_2 \cap \cdots \cap W_n$ is also a subspace of $V$.
2. The union of two subspaces of a vector space may not be a vector space. (Recall that for sets $A$ and $B$, the union of $A$ and $B$ is the set $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$.)
For example, let $\mathbb{F}$ be a field, $W_1 = \{(x, 0) \mid x \in \mathbb{F}\}$ and $W_2 = \{(0, y) \mid y \in \mathbb{F}\}$. It is easy to check that $W_1$ and $W_2$ are subspaces of $\mathbb{F}^2$. Since $(1, 0) \in W_1$ and $(0, 1) \in W_2$, both $(1, 0)$ and $(0, 1)$ are elements of $W_1 \cup W_2$. However, $(1, 0) + (0, 1) = (1, 1)$ is contained in neither $W_1$ nor $W_2$ and hence it is not an element of $W_1 \cup W_2$. This shows that $W_1 \cup W_2$ is not a subspace of $\mathbb{F}^2$.
Definition 8.3.11 Let $W_1$ and $W_2$ be subspaces of a vector space $V$. The sum of $W_1$ and $W_2$ is defined to be the set $W_1 + W_2 = \{w_1 + w_2 \mid w_1 \in W_1 \text{ and } w_2 \in W_2\}$.
Example 8.3.13
1. Let $W_1$ and $W_2$ be two nonparallel lines in $\mathbb{R}^3$ such that both lines pass through the origin. Then $W_1 + W_2$ is the plane that contains both lines. It is obvious that $W_1$, $W_2$ and $W_1 + W_2$ are subspaces of $\mathbb{R}^3$.
(Figure: the plane $W_1 + W_2$ containing the lines $W_1$ and $W_2$, with $w_1 \in W_1$, $w_2 \in W_2$ and their sum $w_1 + w_2$.)
Theorem 8.4.3 Let $V$ be a vector space over a field $\mathbb{F}$ and let $B$ be a nonempty subset of $V$. The set of all linear combinations of vectors taken from $B$,
$$W = \{u \in V \mid u \text{ is a linear combination of some vectors from } B\},$$
is a subspace of $V$.
Proof
(S1) Take any $u \in B$. Since $0 = 0u$, $0 \in W$.
(S2) Take any $u, u' \in W$, i.e. $u = a_1v_1 + \cdots + a_mv_m$ and $u' = b_1v_1' + \cdots + b_nv_n'$ for some $a_1, \ldots, a_m, b_1, \ldots, b_n \in \mathbb{F}$ and $v_1, \ldots, v_m, v_1', \ldots, v_n' \in B$. Then
$$u + u' = a_1v_1 + \cdots + a_mv_m + b_1v_1' + \cdots + b_nv_n'$$
is again a linear combination of vectors from $B$, so $u + u' \in W$.
(S3) Take any $c \in \mathbb{F}$ and any $u \in W$, i.e. $u = a_1v_1 + \cdots + a_mv_m$ for some $a_1, \ldots, a_m \in \mathbb{F}$ and $v_1, \ldots, v_m \in B$. Then
$$cu = ca_1v_1 + \cdots + ca_mv_m$$
is a linear combination of vectors from $B$, so $cu \in W$.
Definition 8.4.4 Let $V$ be a vector space over a field $\mathbb{F}$ and let $B$ be a nonempty subset of $V$. The subspace
$$W = \{u \in V \mid u \text{ is a linear combination of some vectors from } B\}$$
in Theorem 8.4.3 is called the subspace of $V$ spanned by $B$ and we write $W = \mathrm{span}_{\mathbb{F}}(B)$ or simply $W = \mathrm{span}(B)$ if the field $\mathbb{F}$ is known. Sometimes, we also say that $W$ is a linear span of $B$ and $B$ spans $W$. Note that $B \subseteq W$.
In particular, if $B = \{v_1, v_2, v_3, \ldots\}$, then
$$W = \{c_1v_1 + \cdots + c_mv_m \mid m \in \mathbb{N} \text{ and } c_1, \ldots, c_m \in \mathbb{F}\}$$
and we write $W = \mathrm{span}_{\mathbb{F}}\{v_1, v_2, v_3, \ldots\}$ or simply $W = \mathrm{span}\{v_1, v_2, v_3, \ldots\}$, and say that $W$ is the subspace of $V$ spanned by the vectors $v_1, v_2, v_3, \ldots$; $W$ is a linear span of the vectors $v_1, v_2, v_3, \ldots$; and the vectors $v_1, v_2, v_3, \ldots$ span $W$.
Remark 8.4.5 In Definition 8.4.4, if $B$ is finite, say $B = \{v_1, v_2, \ldots, v_k\}$, then
$$\mathrm{span}_{\mathbb{F}}(B) = \{c_1v_1 + c_2v_2 + \cdots + c_kv_k \mid c_1, c_2, \ldots, c_k \in \mathbb{F}\}.$$
(Compare with Definition 3.2.3.)
Example 8.4.6
1. Let $A_1 = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$, $A_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $A_3 = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ be real matrices. Determine whether $B$ is a linear combination of $A_1$, $A_2$ and $A_3$.
Solution Consider the equation
$$c_1A_1 + c_2A_2 + c_3A_3 = B.$$
Since
$$c_1A_1 + c_2A_2 + c_3A_3 = \begin{pmatrix} c_1 + c_3 & c_1 + c_2 + c_3 \\ c_1 + c_2 - c_3 & c_1 - c_3 \end{pmatrix},$$
we have
$$\begin{cases} c_1 + c_3 = 1 \\ c_1 + c_2 + c_3 = 2 \\ c_1 + c_2 - c_3 = 3 \\ c_1 - c_3 = 4. \end{cases}$$
$$\begin{pmatrix} 1 & 0 & 1 & 1 \\ 1 & 1 & 1 & 2 \\ 1 & 1 & -1 & 3 \\ 1 & 0 & -1 & 4 \end{pmatrix} \xrightarrow{\text{Gaussian Elimination}} \begin{pmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & -2 & 1 \\ 0 & 0 & 0 & 2 \end{pmatrix}$$
Since the system is inconsistent, $B$ is not a linear combination of $A_1$, $A_2$ and $A_3$.
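The inconsistency can be confirmed in sympy by flattening each matrix into a vector of its entries (a minimal sketch of this bookkeeping):

```python
from sympy import Matrix, linsolve, symbols

c1, c2, c3 = symbols('c1 c2 c3')
# columns hold the entries of A1, A2, A3 (read row by row); b holds the entries of B
M = Matrix([[1, 0, 1],
            [1, 1, 1],
            [1, 1, -1],
            [1, 0, -1]])
b = Matrix([1, 2, 3, 4])
print(linsolve((M, b), c1, c2, c3))   # EmptySet: B is not a linear combination
```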
2. Let $W_1 = \left\{ \begin{pmatrix} a - b & a + b + 2c \\ 2(b + c) & 3(a + c) \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\} \subseteq M_{2\times 2}(\mathbb{R})$. Since
$$\begin{pmatrix} a - b & a + b + 2c \\ 2(b + c) & 3(a + c) \end{pmatrix} = a\begin{pmatrix} 1 & 1 \\ 0 & 3 \end{pmatrix} + b\begin{pmatrix} -1 & 1 \\ 2 & 0 \end{pmatrix} + c\begin{pmatrix} 0 & 2 \\ 2 & 3 \end{pmatrix},$$
$W_1 = \mathrm{span}\left\{ \begin{pmatrix} 1 & 1 \\ 0 & 3 \end{pmatrix}, \begin{pmatrix} -1 & 1 \\ 2 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 2 \\ 2 & 3 \end{pmatrix} \right\}$ and by Theorem 8.4.3, $W_1$ is a subspace of $M_{2\times 2}(\mathbb{R})$.
4. Show that $P_2(\mathbb{R}) = \mathrm{span}\{p_1(x), p_2(x), p_3(x)\}$ where $p_1(x) = 1 + x + x^2$, $p_2(x) = x + 2x^2$ and $p_3(x) = 2 - x - x^2$.
Solution Since $p_1(x), p_2(x), p_3(x) \in P_2(\mathbb{R})$, $\mathrm{span}\{p_1(x), p_2(x), p_3(x)\} \subseteq P_2(\mathbb{R})$.
To show $P_2(\mathbb{R}) \subseteq \mathrm{span}\{p_1(x), p_2(x), p_3(x)\}$, we only need to show that any polynomial $q(x) = a_1 + a_2x + a_3x^2 \in P_2(\mathbb{R})$ is a linear combination of $p_1(x)$, $p_2(x)$ and $p_3(x)$. Consider the equation
$$c_1p_1(x) + c_2p_2(x) + c_3p_3(x) = q(x).$$
Since
$$c_1p_1(x) + c_2p_2(x) + c_3p_3(x) = (c_1 + 2c_3) + (c_1 + c_2 - c_3)x + (c_1 + 2c_2 - c_3)x^2,$$
we have
$$\begin{cases} c_1 + 2c_3 = a_1 \\ c_1 + c_2 - c_3 = a_2 \\ c_1 + 2c_2 - c_3 = a_3. \end{cases}$$
$$\begin{pmatrix} 1 & 0 & 2 & a_1 \\ 1 & 1 & -1 & a_2 \\ 1 & 2 & -1 & a_3 \end{pmatrix} \xrightarrow{\text{Gauss-Jordan Elimination}} \begin{pmatrix} 1 & 0 & 0 & \frac{1}{3}a_1 + \frac{4}{3}a_2 - \frac{2}{3}a_3 \\ 0 & 1 & 0 & -a_2 + a_3 \\ 0 & 0 & 1 & \frac{1}{3}a_1 - \frac{2}{3}a_2 + \frac{1}{3}a_3 \end{pmatrix}$$
The system is consistent for all $a_1, a_2, a_3$, so every $q(x) \in P_2(\mathbb{R})$ is a linear combination of $p_1(x)$, $p_2(x)$ and $p_3(x)$. Hence $P_2(\mathbb{R}) = \mathrm{span}\{p_1(x), p_2(x), p_3(x)\}$.
So $\mathbb{F}^n = \mathrm{span}_{\mathbb{F}}\{e_1, e_2, \ldots, e_n\}$.
6. Since every complex number in $\mathbb{C}$ can be written as $a + b\,\mathrm{i}$ for $a, b \in \mathbb{R}$, $\mathbb{C} = \mathrm{span}_{\mathbb{R}}\{1, \mathrm{i}\}$. Furthermore, using the vectors $e_1, e_2, \ldots, e_n$ in Part 5,
$$\mathbb{C}^n = \mathrm{span}_{\mathbb{C}}\{e_1, e_2, \ldots, e_n\} = \mathrm{span}_{\mathbb{R}}\{e_1, e_2, \ldots, e_n, \mathrm{i}\,e_1, \mathrm{i}\,e_2, \ldots, \mathrm{i}\,e_n\}.$$
7. Let $\mathbb{F}$ be a field. For $1 \le i \le m$ and $1 \le j \le n$, let $E_{ij}$ be the $m \times n$ matrix over $\mathbb{F}$ such that its $(i, j)$-entry is $1$ and all other entries are $0$. For any $A = (a_{ij}) \in M_{m\times n}(\mathbb{F})$,
$$A = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}E_{ij}.$$
So $M_{m\times n}(\mathbb{F}) = \mathrm{span}_{\mathbb{F}}\{E_{ij} \mid 1 \le i \le m \text{ and } 1 \le j \le n\}$.
8. Let $\mathbb{F}$ be a field. We have
$$P(\mathbb{F}) = \{a_0 + a_1x + \cdots + a_mx^m \mid m \in \mathbb{N} \text{ and } a_0, a_1, \ldots, a_m \in \mathbb{F}\} = \mathrm{span}_{\mathbb{F}}\{1, x, x^2, \ldots\}$$
and
$$P_n(\mathbb{F}) = \{a_0 + a_1x + \cdots + a_nx^n \mid a_0, a_1, \ldots, a_n \in \mathbb{F}\} = \mathrm{span}_{\mathbb{F}}\{1, x, \ldots, x^n\}.$$
Remark 8.4.7 In Definition 8.4.2, Theorem 8.4.3 and Definition 8.4.4, we only accept linear combinations using a finite number of vectors. For example, the power series
$$\sum_{i=0}^{\infty} x^i = 1 + x + x^2 + \cdots$$
is not contained in $\mathrm{span}\{1, x, x^2, \ldots\}$.
(a) The vectors $v_1, v_2, \ldots, v_k$ are called linearly independent if the vector equation
$$c_1v_1 + c_2v_2 + \cdots + c_kv_k = 0$$
has only the trivial solution $c_1 = 0, c_2 = 0, \ldots, c_k = 0$; otherwise, $v_1, v_2, \ldots, v_k$ are called linearly dependent.
2. Let $B$ be a subset of $V$.
(a) $B$ is called linearly independent if for every finite subset $\{v_1, v_2, \ldots, v_k\}$ of $B$, the vectors $v_1, v_2, \ldots, v_k$ are linearly independent.
(b) $B$ is called linearly dependent if there exists a finite subset $\{v_1, v_2, \ldots, v_k\}$ of $B$ such that $v_1, v_2, \ldots, v_k$ are linearly dependent.
(For convenience, whenever we write a set in the form $\{u_1, u_2, \ldots, u_n\}$, we always assume that (i) $n \ge 1$; and (ii) $u_i \ne u_j$ for $i \ne j$.)
Remark 8.4.9 As in the discussion in Section 3.4, linear independence is used to determine whether there are redundant vectors in a set. (See Theorem 3.4.4 and Remark 3.4.5.)
Solution To answer the question, we need to solve the vector equation
$$c_1(1, \mathrm{i}, 1 - \mathrm{i}) + c_2(\mathrm{i}, 1, 0) + c_3(3\mathrm{i}, 1, 1 + \mathrm{i}) = (0, 0, 0).$$
Remark 8.5.2
1. For convenience, the empty set $\emptyset$ is defined to be the basis for a zero space.
2. Every vector space has a basis. The proof of this requires a fundamental result in set theory called Zorn's Lemma.
3. In some infinite dimensional (topological) vector spaces, the bases defined in Definition 8.5.1 are called algebraic bases or Hamel bases in order to distinguish them from other kinds of "bases" where infinite sums are allowed.
Example 8.5.3
1. Let $W_1 = \left\{ \begin{pmatrix} a - b & a + b + 2c \\ 2(b + c) & 3(a + c) \end{pmatrix} \,\middle|\, a, b, c \in \mathbb{R} \right\} \subseteq M_{2\times 2}(\mathbb{R})$. By Example 8.4.6.2, we have $W_1 = \mathrm{span}\left\{ \begin{pmatrix} 1 & 1 \\ 0 & 3 \end{pmatrix}, \begin{pmatrix} -1 & 1 \\ 2 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 2 \\ 2 & 3 \end{pmatrix} \right\}$. Note that
$$c_1\begin{pmatrix} 1 & 1 \\ 0 & 3 \end{pmatrix} + c_2\begin{pmatrix} -1 & 1 \\ 2 & 0 \end{pmatrix} + c_3\begin{pmatrix} 0 & 2 \\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \iff \begin{cases} c_1 = t \\ c_2 = t \\ c_3 = -t \end{cases} \text{ for } t \in \mathbb{R}.$$
So $\left\{ \begin{pmatrix} 1 & 1 \\ 0 & 3 \end{pmatrix}, \begin{pmatrix} -1 & 1 \\ 2 & 0 \end{pmatrix}, \begin{pmatrix} 0 & 2 \\ 2 & 3 \end{pmatrix} \right\}$ is linearly dependent and hence is not a basis for $W_1$.
$$\iff c_1 = 0,\ c_2 = 0,\ \ldots,\ c_n = 0,$$
$\{e_1, e_2, \ldots, e_n\}$ is linearly independent and hence is a basis for $\mathbb{F}^n$. This basis is called the standard basis for $\mathbb{F}^n$.
$$B = \{e_1, e_2, \ldots, e_n, \mathrm{i}\,e_1, \mathrm{i}\,e_2, \ldots, \mathrm{i}\,e_n\}$$
is linearly dependent over $\mathbb{C}$ but linearly independent over $\mathbb{R}$. So we have the following conclusion:
(We always assume that $\mathbb{C}^n$ is a vector space over $\mathbb{C}$ unless otherwise specified.)
5. By Example 8.4.6.7, we have $M_{m\times n}(\mathbb{F}) = \mathrm{span}_{\mathbb{F}}\{E_{ij} \mid 1 \le i \le m \text{ and } 1 \le j \le n\}$. The set $\{E_{ij} \mid 1 \le i \le m \text{ and } 1 \le j \le n\}$ is linearly independent and hence is a basis for $M_{m\times n}(\mathbb{F})$. This basis is called the standard basis for $M_{m\times n}(\mathbb{F})$.
6. By Example 8.4.6.8, we have $P(\mathbb{F}) = \mathrm{span}_{\mathbb{F}}\{1, x, x^2, \ldots\}$. The set $\{1, x, x^2, \ldots\}$ is linearly independent and hence is a basis for $P(\mathbb{F})$. This basis is called the standard basis for $P(\mathbb{F})$.
Also by Example 8.4.6.8, we have $P_n(\mathbb{F}) = \mathrm{span}_{\mathbb{F}}\{1, x, \ldots, x^n\}$. The set $\{1, x, \ldots, x^n\}$ is linearly independent and hence is a basis for $P_n(\mathbb{F})$. This basis is called the standard basis for $P_n(\mathbb{F})$.
Remark 8.5.4
1. The vector spaces $\mathbb{F}^n$, $M_{m\times n}(\mathbb{F})$ and $P_n(\mathbb{F})$ are finite dimensional while $P(\mathbb{F})$ is infinite dimensional.
2. The vector space $\mathbb{F}^{\mathbb{N}}$ is infinite dimensional. The vector space $F(A, \mathbb{F})$ is finite dimensional if $A$ is a finite set; and $F(A, \mathbb{F})$ is infinite dimensional if $A$ is an infinite set.
Lemma 8.5.5 Let $V$ be a finite dimensional vector space and $B = \{v_1, v_2, \ldots, v_n\}$ a basis for $V$. Any vector $u \in V$ can be expressed uniquely as a linear combination
$$u = c_1v_1 + c_2v_2 + \cdots + c_nv_n$$
for some scalars $c_1, c_2, \ldots, c_n$.
Definition 8.5.6 Let $V$ be a finite dimensional vector space over a field $\mathbb{F}$ where $\dim(V) = n \ge 1$.
Lemma 8.5.7 Let $V$ be a finite dimensional vector space over a field $\mathbb{F}$, where $\dim(V) \ge 1$, and let $B$ be an ordered basis for $V$.
1. For any $u, v \in V$, $u = v$ if and only if $(u)_B = (v)_B$.
2. For any $v_1, v_2, \ldots, v_r \in V$ and $c_1, c_2, \ldots, c_r \in \mathbb{F}$,
$$(c_1v_1 + c_2v_2 + \cdots + c_rv_r)_B = c_1(v_1)_B + c_2(v_2)_B + \cdots + c_r(v_r)_B.$$
Example 8.5.8
1. Let $B_1 = \{v_1, v_2, v_3\}$ where $v_1 = (1, 1, 0)$, $v_2 = (0, 1, 1)$ and $v_3 = (1, 1, 1)$ are vectors in $\mathbb{F}_2^3$. Note that $B_1$ is a basis for $\mathbb{F}_2^3$. Using $B_1$ as an ordered basis, find the coordinate vector of $u = (a, b, c) \in \mathbb{F}_2^3$ relative to $B_1$.
Solution (Recall that in $\mathbb{F}_2$, $1 + 1 = 0$.)
$$c_1v_1 + c_2v_2 + c_3v_3 = u \iff \begin{cases} c_1 + c_3 = a \\ c_1 + c_2 + c_3 = b \\ c_2 + c_3 = c \end{cases} \iff \begin{cases} c_1 = b + c \\ c_2 = a + b \\ c_3 = a + b + c. \end{cases}$$
Thus $(u)_{B_1} = (b + c, a + b, a + b + c)$.
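Since the matrix whose columns are $v_1, v_2, v_3$ is invertible over $\mathbb{F}_2$, its modular inverse gives the coordinates directly (a minimal sympy sketch):

```python
from sympy import Matrix, symbols

a, b, c = symbols('a b c')
M = Matrix([[1, 0, 1],    # columns are v1, v2, v3
            [1, 1, 1],
            [0, 1, 1]])
Minv = M.inv_mod(2)       # inverse of M over F2
print(Minv * Matrix([a, b, c]))   # (b + c, a + b, a + b + c): the coordinate vector
```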
2. Let $B_2 = \{A_1, A_2, A_3, A_4\}$ where $A_1 = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$, $A_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $A_3 = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}$ and $A_4 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ are real matrices. Note that $B_2$ is a basis for $M_{2\times 2}(\mathbb{R})$. Using $B_2$ as an ordered basis, find the coordinate vector of $C = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix}$ relative to $B_2$.
Solution
$$c_1A_1 + c_2A_2 + c_3A_3 + c_4A_4 = C \iff \begin{cases} c_1 + c_3 + c_4 = 1 \\ c_1 + c_2 + c_3 = 2 \\ c_1 + c_2 - c_3 = 4 \\ c_1 - c_3 - c_4 = 3 \end{cases} \iff \begin{cases} c_1 = 2 \\ c_2 = 1 \\ c_3 = -1 \\ c_4 = 0. \end{cases}$$
Thus $(C)_{B_2} = (2, 1, -1, 0)$.
Remark 8.5.9 Let $V$ be a finite dimensional vector space over a field $\mathbb{F}$ such that $V$ has a basis $B$ with $n$ vectors. Using the coordinate system relative to $B$, we can translate all vectors in $V$ to vectors in $\mathbb{F}^n$. Thus all problems about $V$ can be solved by using theorems and methods that work for $\mathbb{F}^n$.
In this way, most of the results in Chapter 3 and Chapter 4 for Euclidean $n$-spaces can be applied to other finite dimensional vector spaces.
Theorem 8.5.10 Let $V$ be a vector space which has a basis with $n$ vectors. Then
1. any subset of $V$ with more than $n$ vectors is always linearly dependent; and
2. any subset of $V$ with fewer than $n$ vectors cannot span $V$.
Definition 8.5.11 The dimension of a finite dimensional vector space $V$ over a field $\mathbb{F}$, denoted by $\dim_{\mathbb{F}}(V)$ or simply $\dim(V)$, is defined to be the number of vectors in a basis for $V$. In addition, we define the dimension of a zero space to be zero. (See also Section 3.6.)
Example 8.5.12
1. $\dim_{\mathbb{F}}(\mathbb{F}^n) = n$.
Theorem 8.5.13 Let $V$ be a finite dimensional vector space and $B$ a subset of $V$. The following are equivalent:
1. $B$ is a basis for $V$.
Proof The proof follows the same argument as the proof for Theorem 3.6.7.
Example 8.5.14 In Example 8.4.6.4, we have $P_2(\mathbb{R}) = \mathrm{span}\{p_1(x), p_2(x), p_3(x)\}$ where $p_1(x) = 1 + x + x^2$, $p_2(x) = x + 2x^2$ and $p_3(x) = 2 - x - x^2$. Since $\dim(P_2(\mathbb{R})) = 3$, by Theorem 8.5.13, $\{p_1(x), p_2(x), p_3(x)\}$ is a basis for $P_2(\mathbb{R})$.
Theorem 8.5.15 Let $W$ be a subspace of a finite dimensional vector space $V$. Then
1. $\dim(W) \le \dim(V)$; and
2. if $\dim(W) = \dim(V)$, then $W = V$.
Proof The proof follows the same argument as the proof for Theorem 3.6.9.
Example 8.5.16 For this example, we modify the algorithms in Example 4.1.14 and use them to find bases for finite dimensional vector spaces:
Find a basis for the subspace $W$ of $M_{2\times 2}(\mathbb{R})$ spanned by
$$A_1 = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix},\quad A_2 = \begin{pmatrix} 3 & 6 \\ 6 & 3 \end{pmatrix},\quad A_3 = \begin{pmatrix} 4 & 9 \\ 9 & 5 \end{pmatrix},\quad A_4 = \begin{pmatrix} -2 & -1 \\ -1 & 1 \end{pmatrix},\quad A_5 = \begin{pmatrix} 5 & 8 \\ 9 & 4 \end{pmatrix},\quad A_6 = \begin{pmatrix} 4 & 2 \\ 7 & 3 \end{pmatrix}.$$
Solution Use the (ordered) standard basis $E = \{E_{11}, E_{12}, E_{21}, E_{22}\}$ for $M_{2\times 2}(\mathbb{R})$ where
$$E_{11} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix},\quad E_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},\quad E_{21} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix},\quad E_{22} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$
Then
$$(A_1)_E = (1, 2, 2, 1),\quad (A_2)_E = (3, 6, 6, 3),\quad (A_3)_E = (4, 9, 9, 5),$$
$$(A_4)_E = (-2, -1, -1, 1),\quad (A_5)_E = (5, 8, 9, 4),\quad (A_6)_E = (4, 2, 7, 3).$$
Use Method 1 of Example 4.1.14.1: reducing the matrix whose rows are these coordinate vectors to a row-echelon form, the nonzero rows show that $\{(1, 2, 2, 1), (0, 1, 1, 1), (0, 0, 1, 1)\}$ is a basis for the subspace of $\mathbb{R}^4$ spanned by $(A_1)_E$, $(A_2)_E$, $(A_3)_E$, $(A_4)_E$, $(A_5)_E$, $(A_6)_E$. Let $B_1$, $B_2$, $B_3$ be $2 \times 2$ real matrices such that
$$(B_1)_E = (1, 2, 2, 1) \ \Rightarrow\ B_1 = E_{11} + 2E_{12} + 2E_{21} + E_{22} = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix},$$
$$(B_2)_E = (0, 1, 1, 1) \ \Rightarrow\ B_2 = 0E_{11} + E_{12} + E_{21} + E_{22} = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix},$$
$$(B_3)_E = (0, 0, 1, 1) \ \Rightarrow\ B_3 = 0E_{11} + 0E_{12} + E_{21} + E_{22} = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}.$$
Then $\{B_1, B_2, B_3\} = \left\{ \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix} \right\}$ is a basis for $W$.
Use Method 2 of Example 4.1.14.1:
$$\begin{pmatrix} 1 & 3 & 4 & -2 & 5 & 4 \\ 2 & 6 & 9 & -1 & 8 & 2 \\ 2 & 6 & 9 & -1 & 9 & 7 \\ 1 & 3 & 5 & 1 & 4 & 3 \end{pmatrix} \xrightarrow{\text{Gaussian Elimination}} \begin{pmatrix} 1 & 3 & 4 & -2 & 5 & 4 \\ 0 & 0 & 1 & 3 & -2 & -6 \\ 0 & 0 & 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
Since the 1st, 3rd and 5th columns are pivot columns of the row-echelon form, $\{[A_1]_E, [A_3]_E, [A_5]_E\}$ is a basis for the subspace of $\mathbb{R}^4$ spanned by $[A_1]_E$, $[A_2]_E$, $[A_3]_E$, $[A_4]_E$, $[A_5]_E$, $[A_6]_E$. Thus $\{A_1, A_3, A_5\}$ is a basis for $W$.
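Method 2 is easily mechanized: put the coordinate vectors as columns and read off the pivot columns from the reduced row-echelon form (a minimal sympy sketch):

```python
from sympy import Matrix

cols = [Matrix([1, 2, 2, 1]),     # (A1)_E
        Matrix([3, 6, 6, 3]),     # (A2)_E
        Matrix([4, 9, 9, 5]),     # (A3)_E
        Matrix([-2, -1, -1, 1]),  # (A4)_E
        Matrix([5, 8, 9, 4]),     # (A5)_E
        Matrix([4, 2, 7, 3])]     # (A6)_E
M = Matrix.hstack(*cols)
_, pivots = M.rref()
print(pivots)   # (0, 2, 4): the 1st, 3rd and 5th columns, so {A1, A3, A5} is a basis
```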
Theorem 8.5.17 Let $V$ be a finite dimensional vector space. Suppose $C$ is a linearly independent subset of $V$. Then there exists a basis $B$ for $V$ such that $C \subseteq B$.
Proof If $\mathrm{span}(C) = V$, then $B = C$ is a basis for $V$. Suppose $\mathrm{span}(C) \subsetneq V$. Then there exists $u \in V$ with $u \notin \mathrm{span}(C)$. Let $C_1 = C \cup \{u\}$. Note that $C_1$ is linearly independent (check it). If $\mathrm{span}(C_1) = V$, then $B = C_1$ is a basis for $V$. If not, we repeat the process above to find a new vector in $V$ but not in $\mathrm{span}(C_1)$. Since $V$ is finite dimensional, by Theorem 8.5.13, we shall eventually get enough vectors to form a basis $B$ for $V$.
$\{(1, 4, 2, 5, 1), (2, 9, 1, 8, 2), (2, 9, 1, 9, 3), (0, 0, 1, 0, 0), (0, 0, 0, 0, 1)\}$ is a basis for $\mathbb{R}^5$.
Recall that
$$W_1 + W_2 = \{w_1 + w_2 \mid w_1 \in W_1 \text{ and } w_2 \in W_2\}$$
is a subspace of $V$. Sometimes, in order to study the behavior of a large vector space, it is more convenient to decompose the space into sums of smaller subspaces. (For example, see Chapter 11.) To do so, we need to make sure each vector in $W_1 + W_2$ is expressed uniquely as $w_1 + w_2$ with $w_1 \in W_1$ and $w_2 \in W_2$.
Example 8.6.2 Let $W_1$ be the $xy$-plane and $W_2$ the $yz$-plane. Then $W_1 + W_2 = \mathbb{R}^3$. Take $(1, 2, 3) \in \mathbb{R}^3$. We have
$$(1, 2, 3) = (1, 1, 0) + (0, 1, 3) = (1, 2, 0) + (0, 0, 3)$$
where $(1, 1, 0), (1, 2, 0) \in W_1$ and $(0, 1, 3), (0, 0, 3) \in W_2$. So there is more than one way to write $(1, 2, 3)$ as $w_1 + w_2$ with $w_1 \in W_1$ and $w_2 \in W_2$.
Let $W_3$ be the $z$-axis. If we replace $W_2$ by $W_3$, we still have $W_1 + W_3 = \mathbb{R}^3$. Now, every vector $(a, b, c) \in \mathbb{R}^3$ is written uniquely as
$$(a, b, c) = (a, b, 0) + (0, 0, c) \quad\text{where } (a, b, 0) \in W_1 \text{ and } (0, 0, c) \in W_3.$$
Definition 8.6.3 Let $W_1$ and $W_2$ be subspaces of a vector space $V$. We say that the subspace $W_1 + W_2$ is a direct sum of $W_1$ and $W_2$ if every vector $u \in W_1 + W_2$ can be expressed uniquely as
$$u = w_1 + w_2 \quad\text{where } w_1 \in W_1 \text{ and } w_2 \in W_2.$$
In this case, we denote $W_1 + W_2$ by $W_1 \oplus W_2$.
Note that as a set, $W_1 \oplus W_2 = W_1 + W_2 = \{w_1 + w_2 \mid w_1 \in W_1 \text{ and } w_2 \in W_2\}$. The "circle" added to "+" can be regarded as a remark saying that the sum $W_1 + W_2$ is a direct sum.
$$w_1 - w_1' = w_2' - w_2 = 0$$
and hence $w_1 = w_1'$ and $w_2 = w_2'$. So the sum is unique.
We have shown that $W_1 + W_2$ is a direct sum.
Example 8.6.6
1. In Example 8.6.2, $W_1 \cap W_2$ is the $y$-axis while $W_1 \cap W_3 = \{(0, 0, 0)\}$. By Theorem 8.6.5, $W_1 + W_2$ is not a direct sum and $W_1 + W_3$ is a direct sum.
2. Let $W_1 = \{A \in M_{n\times n}(\mathbb{R}) \mid A^T = A\}$ and $W_2 = \{A \in M_{n\times n}(\mathbb{R}) \mid A^T = -A\}$, which are subspaces of $M_{n\times n}(\mathbb{R})$. Every $B \in M_{n\times n}(\mathbb{R})$ can be written as
$$B = \tfrac{1}{2}(B + B^T) + \tfrac{1}{2}(B - B^T)$$
where $\tfrac{1}{2}(B + B^T) \in W_1$ and $\tfrac{1}{2}(B - B^T) \in W_2$. This means $B \in W_1 + W_2$ for all $B \in M_{n\times n}(\mathbb{R})$. So $M_{n\times n}(\mathbb{R}) \subseteq W_1 + W_2$. Thus we have shown $W_1 + W_2 = M_{n\times n}(\mathbb{R})$.
Furthermore, $W_1 \cap W_2 = \{0\}$. By Theorem 8.6.5, $W_1 + W_2$ is a direct sum, i.e. $M_{n\times n}(\mathbb{R}) = W_1 \oplus W_2$.
In this example, elements of $W_1$ are symmetric matrices while elements of $W_2$ are called skew symmetric matrices or anti-symmetric matrices.
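The decomposition is direct to compute; here is a small numpy sketch for a randomly chosen matrix (the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
S = (B + B.T) / 2        # symmetric part, in W1
K = (B - B.T) / 2        # skew-symmetric part, in W2
assert np.allclose(S, S.T)
assert np.allclose(K, -K.T)
assert np.allclose(B, S + K)   # B = S + K, and Theorem 8.6.5 says the pieces are unique
```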
3. Let $W_1$ be the subspace of $C([0, 2\pi])$ spanned by $g \in C([0, 2\pi])$ where $g(x) = \sin(x)$ for $x \in [0, 2\pi]$. Define
$$W_2 = \left\{ f \in C([0, 2\pi]) \,\middle|\, \int_0^{2\pi} f(t)\sin(t)\,dt = 0 \right\}.$$
Then $W_2$ is a subspace of $C([0, 2\pi])$ and $C([0, 2\pi]) = W_1 \oplus W_2$. (We leave the verification of the results as an exercise. See Question 8.34.)
(See Question 8.35 for a formula for $\dim(W_1 + W_2)$ when $W_1 + W_2$ is not a direct sum.)
Proof
1. (i) It is obvious that $\mathrm{span}(B_1 \cup B_2) \subseteq W_1 \oplus W_2$.
Take any $u \in W_1 \oplus W_2$, i.e. $u = w_1 + w_2$ where $w_1 \in W_1$ and $w_2 \in W_2$. Since $B_1$ and $B_2$ span $W_1$ and $W_2$ respectively, there exist $u_1, u_2, \ldots, u_k \in B_1$ and $v_1, v_2, \ldots, v_m \in B_2$ such that
$$w_1 = a_1u_1 + a_2u_2 + \cdots + a_ku_k \quad\text{and}\quad w_2 = b_1v_1 + b_2v_2 + \cdots + b_mv_m$$
for some scalars. Then
$$u = a_1u_1 + a_2u_2 + \cdots + a_ku_k + b_1v_1 + b_2v_2 + \cdots + b_mv_m \in \mathrm{span}(B_1 \cup B_2).$$
(ii) Take any finite subset $C = \{u_1, \ldots, u_k, v_1, \ldots, v_m\}$ of $B_1 \cup B_2$ with $u_1, \ldots, u_k \in B_1$ and $v_1, \ldots, v_m \in B_2$, and consider the equation
$$c_1u_1 + c_2u_2 + \cdots + c_ku_k + d_1v_1 + d_2v_2 + \cdots + d_mv_m = 0. \qquad (8.2)$$
Let $w = c_1u_1 + c_2u_2 + \cdots + c_ku_k \in \mathrm{span}(B_1) = W_1$. By (8.2),
$$w = -d_1v_1 - d_2v_2 - \cdots - d_mv_m \in \mathrm{span}(B_2) = W_2.$$
Since $W_1 \cap W_2 = \{0\}$, $w = 0$, i.e.
$$c_1u_1 + c_2u_2 + \cdots + c_ku_k = 0 \quad\text{and}\quad -d_1v_1 - d_2v_2 - \cdots - d_mv_m = 0.$$
As $B_1$ and $B_2$ are linearly independent, the two equations above have only the trivial solutions $c_1 = 0$, $c_2 = 0$, $\ldots$, $c_k = 0$, $d_1 = 0$, $d_2 = 0$, $\ldots$, $d_m = 0$. Thus the equation (8.2) has only the trivial solution and $C$ is linearly independent.
We have shown that $B_1 \cup B_2$ is linearly independent.
Remark 8.6.8 In Theorem 8.6.7.1, suppose $W_1 + W_2$ is not a direct sum, i.e. $W_1 \cap W_2 \ne \{0\}$. It is still true that $\mathrm{span}(B_1 \cup B_2) = W_1 + W_2$ but $B_1 \cup B_2$ may not be linearly independent and hence $B_1 \cup B_2$ may not be a basis for $W_1 + W_2$.
For example, let $W_1$ be the $xy$-plane and $W_2$ the $yz$-plane (see Example 8.6.2). Take bases $B_1 = \{(1, 0, 0), (0, 1, 0)\}$ and $B_2 = \{(0, 1, 1), (0, 0, 1)\}$ for $W_1$ and $W_2$ respectively. It is obvious that $\mathrm{span}(B_1 \cup B_2) = W_1 + W_2$ but $B_1 \cup B_2$ is linearly dependent and hence $B_1 \cup B_2$ is not a basis for $W_1 + W_2$.
which is a subspace of $V$.
$$u = w_1 + w_2 + \cdots + w_k \quad\text{where } w_i \in W_i \text{ for } i = 1, 2, \ldots, k.$$
Remark 8.6.10 We can use Theorem 8.6.5 repeatedly to determine whether the sum $W_1 + W_2 + \cdots + W_k$ is a direct sum. For example, check the following one by one:
Example 8.6.11
2. Let $V$ be a finite dimensional vector space over a field $\mathbb{F}$ and let $\{v_1, v_2, \ldots, v_n\}$ be a basis for $V$. For each $i = 1, 2, \ldots, n$, define $W_i = \mathrm{span}\{v_i\}$. Since each $u \in V$ can be expressed uniquely as $u = c_1v_1 + c_2v_2 + \cdots + c_nv_n$ for $c_1, c_2, \ldots, c_n \in \mathbb{F}$,
$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_n.$$
Example 8.7.2
1. Let $W$ be the subspace of $\mathbb{F}_2^3$ spanned by $(1, 0, 1)$ and $(0, 1, 1)$. Then
$$W = \{a(1, 0, 1) + b(0, 1, 1) \mid a, b \in \mathbb{F}_2\} = \{(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)\}.$$
The following are all the cosets of $W$:
$$\begin{aligned} W + (0, 0, 0) &= \{(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)\}, \\ W + (0, 0, 1) &= \{(0, 0, 1), (1, 0, 0), (0, 1, 0), (1, 1, 1)\}, \\ W + (0, 1, 0) &= \{(0, 1, 0), (1, 1, 1), (0, 0, 1), (1, 0, 0)\}, \\ W + (0, 1, 1) &= \{(0, 1, 1), (1, 1, 0), (0, 0, 0), (1, 0, 1)\}, \\ W + (1, 0, 0) &= \{(1, 0, 0), (0, 0, 1), (1, 1, 1), (0, 1, 0)\}, \\ W + (1, 0, 1) &= \{(1, 0, 1), (0, 0, 0), (1, 1, 0), (0, 1, 1)\}, \\ W + (1, 1, 0) &= \{(1, 1, 0), (0, 1, 1), (1, 0, 1), (0, 0, 0)\}, \\ W + (1, 1, 1) &= \{(1, 1, 1), (0, 1, 0), (1, 0, 0), (0, 0, 1)\}. \end{aligned}$$
Note that $W + (0, 0, 0) = W + (0, 1, 1) = W + (1, 0, 1) = W + (1, 1, 0) = W$ and $W + (0, 0, 1) = W + (0, 1, 0) = W + (1, 0, 0) = W + (1, 1, 1) = \mathbb{F}_2^3 \setminus W$.
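The eight cosets above collapse to two distinct sets, which can be checked by brute force (a minimal Python sketch; the helper `coset` is our own name):

```python
from itertools import product

W = {(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)}

def coset(v):
    # W + v over F2: add v to every element of W, coordinatewise mod 2
    return frozenset(tuple((wi + vi) % 2 for wi, vi in zip(w, v)) for w in W)

cosets = {coset(v) for v in product([0, 1], repeat=3)}
print(len(cosets))   # 2: the subspace W itself and its complement in F2^3
```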
2. Let $W = \{(x, y) \in \mathbb{R}^2 \mid x - 2y = 0\}$. It is the line in $\mathbb{R}^2$ represented by the homogeneous linear equation $x - 2y = 0$ and is a subspace of $\mathbb{R}^2$.
Take any $(a, b) \in \mathbb{R}^2$. The coset of $W$ containing $(a, b)$ is
$$W + (a, b) = \{(x', y') \in \mathbb{R}^2 \mid x' - 2y' = a - 2b\},$$
which is the line parallel to $W$ passing through $(a, b)$. (Figure: the line $W$ through the origin and the parallel line $W + (a, b)$.)
In general, let $W$ be a plane in $\mathbb{R}^3$ that contains the origin. The cosets of $W$ are planes in $\mathbb{R}^3$ parallel to $W$. (See also Discussion 3.2.15.2.)
4. (In this example, vectors in $\mathbb{F}^n$ are written as column vectors.) Let $A$ be an $m \times n$ matrix over a field $\mathbb{F}$ and let $W$ be the nullspace of $A$, i.e. $W = \{u \in \mathbb{F}^n \mid Au = 0\}$. Take any $v \in \mathbb{F}^n$. Set $b = Av$. The coset of $W$ containing $v$ is
$$W + v = \{u + v \mid u \in \mathbb{F}^n \text{ and } Au = 0\} = \{w \in \mathbb{F}^n \mid Aw = b\},$$
which is the solution set of the linear system $Ax = b$. (See also Theorem 4.3.6.)
Proof
1. ((a)$\Leftrightarrow$(b)) Since $W$ is a subspace, $u \in W$ if and only if $-u \in W$. Thus
$$\begin{aligned} v \in W + w &\iff v = u + w \text{ for some } u \in W \\ &\iff w = (-u) + v \text{ for some } u \in W \\ &\iff w \in W + v. \end{aligned}$$
((a)$\Leftrightarrow$(c))
$$\begin{aligned} v \in W + w &\iff v = u + w \text{ for some } u \in W \\ &\iff v - w = u \text{ for some } u \in W \\ &\iff v - w \in W. \end{aligned}$$
((a)$\Leftrightarrow$(d)) ($\Rightarrow$) Suppose $v \in W + w$. Then $v - w \in W$. Hence
$$u \in W + v \iff u - v \in W \iff (u - v) + (v - w) \in W \iff u - w \in W \iff u \in W + w.$$
So $W + v = W + w$.
($\Leftarrow$) Since $v \in W + v$, if $W + v = W + w$, then $v \in W + w$.
Example 8.7.4
1. Following Example 8.7.2.2, let $W = \{(x, y) \in \mathbb{R}^2 \mid x - 2y = 0\}$. For the vectors $v$ and $w$ shown in the figure, $v - w = (2, 1) \in W$, so by Theorem 8.7.3.1, $W + v = W + w$. (Figure: the line $W$ and the parallel line $W + v = W + w$ containing the endpoints of $v$ and $w$.)
2. Let $W = \{(x, y, z) \in \mathbb{R}^3 \mid z = 0\}$. For the vectors $v$ and $w$ shown in the figure, $v - w = (0, 1, 0) \in W$, so by Theorem 8.7.3.1, $W + v = W + w$. (Figure: the $xy$-plane $W$ and the parallel plane $W + v = W + w$.)
Lemma 8.7.5 Let $V$ be a vector space over a field $\mathbb{F}$ and let $W$ be a subspace of $V$.
1. Suppose $u_1, u_2, v_1, v_2 \in V$ such that $W + u_1 = W + u_2$ and $W + v_1 = W + v_2$. Then $W + (u_1 + v_1) = W + (u_2 + v_2)$.
2. Suppose $u_1, u_2 \in V$ such that $W + u_1 = W + u_2$. Then $W + cu_1 = W + cu_2$ for all $c \in \mathbb{F}$.
Definition 8.7.6 Let $V$ be a vector space over a field $\mathbb{F}$ and let $W$ be a subspace of $V$. We define the addition of two cosets by
$$(W + u) + (W + v) = W + (u + v) \quad\text{for } u, v \in V \qquad (8.3)$$
and the scalar multiplication by
$$c(W + u) = W + cu \quad\text{for } c \in \mathbb{F} \text{ and } u \in V. \qquad (8.4)$$
Example 8.7.7 Following Example 8.7.2.2, let $W = \{(x, y) \in \mathbb{R}^2 \mid x - 2y = 0\}$. Let $u = (1, 1)$ and $v = (-2, 1)$. Note that
$$(W + u) + (W + v) = W + (u + v) = W + (-1, 2) = \{(x, y) \mid x - 2y = -5\}$$
and
$$3(W + u) = W + 3u = W + (3, 3) = \{(x, y) \mid x - 2y = -3\}.$$
(Figure A: the parallel lines $W$, $W + u$, $W + v$ and $(W + u) + (W + v)$ in the $xy$-plane. Figure B: the parallel lines $W$, $W + u$ and $3(W + u)$.)
The cosets do not depend on the chosen representatives: with $u' = (0, \tfrac{1}{2})$ and $v' = (0, 2)$,
$$W + u' = W + (0, \tfrac{1}{2}) = \{(x, y) \in \mathbb{R}^2 \mid x - 2y = -1\} = W + u$$
and
$$W + v' = W + (0, 2) = \{(x, y) \in \mathbb{R}^2 \mid x - 2y = -4\} = W + v.$$
Also
$$(W + u') + (W + v') = W + (u' + v') = W + (0, \tfrac{5}{2}) = \{(x, y) \mid x - 2y = -5\} = (W + u) + (W + v)$$
and
$$3(W + u') = W + 3u' = W + (0, \tfrac{3}{2}) = \{(x, y) \mid x - 2y = -3\} = 3(W + u).$$
(As an exercise, draw the vectors $u'$, $v'$, $u' + v'$ on Figure A above and also draw the vectors $u'$, $3u'$ on Figure B.)
Theorem 8.7.8 Let $V$ be a vector space over a field $\mathbb{F}$ and $W$ a subspace of $V$. Denote the set of all cosets of $W$ in $V$ by $V/W$, i.e.
$$V/W = \{W + u \mid u \in V\}.$$
Then $V/W$ is a vector space over $\mathbb{F}$ using the addition and scalar multiplication defined in (8.3) and (8.4).
Proof (V1) and (V6) follow from the definitions of the addition and scalar multiplication. (V2)-(V3) and (V7)-(V10) follow directly from the properties of $V$.
For example, for all $W + u, W + v \in V/W$,
$$(W + u) + (W + v) = W + (u + v) = W + (v + u) = (W + v) + (W + u) \quad\text{(by (V2) of } V\text{)}$$
and hence (V2) is satisfied.
Finally, for (V4), the zero vector is $W$ ($= W + 0$); and for (V5), the negative of $W + u \in V/W$ is $W + (-u)$ (which we usually write as $W - u$).
Definition 8.7.9 The vector space $V/W$ in Theorem 8.7.8 is called the quotient space of $V$ modulo $W$.
Remark 8.7.10 In abstract algebra, "quotient" is used to define modular arithmetic for algebraic structures.
For example, let $n\mathbb{Z} = \{0, \pm n, \pm 2n, \ldots\} \subseteq \mathbb{Z}$. Define
$$\mathbb{Z}/n\mathbb{Z} = \{n\mathbb{Z} + a \mid a \in \mathbb{Z}\} \quad\text{where for } a \in \mathbb{Z},\ n\mathbb{Z} + a = \{a, \pm n + a, \pm 2n + a, \ldots\}.$$
For $a, b \in \mathbb{Z}$, $n\mathbb{Z} + a = n\mathbb{Z} + b$ if and only if $a \equiv b \pmod{n}$. The operations of addition and multiplication are defined by
$$(n\mathbb{Z} + a) + (n\mathbb{Z} + b) = n\mathbb{Z} + (a + b) \quad\text{and}\quad (n\mathbb{Z} + a)(n\mathbb{Z} + b) = n\mathbb{Z} + ab$$
for $n\mathbb{Z} + a, n\mathbb{Z} + b \in \mathbb{Z}/n\mathbb{Z}$. These operations resemble the arithmetic of integer addition and multiplication modulo $n$.
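For instance, with $n = 5$ the coset arithmetic is ordinary arithmetic modulo 5. A minimal Python sketch, representing each coset $n\mathbb{Z} + a$ by the representative $a \bmod n$ (the helper `rep` is our own name):

```python
n = 5
rep = lambda a: a % n   # canonical representative of nZ + a

# (nZ + 3) + (nZ + 4) = nZ + 7 = nZ + 2, and (nZ + 3)(nZ + 4) = nZ + 12 = nZ + 2
assert rep(3 + 4) == rep(2)
assert rep(3 * 4) == rep(12) == rep(2)
# well-definedness: replacing 3 by another representative 3 + 5k changes nothing
assert all(rep((3 + 5 * k) + 4) == rep(2) for k in range(-3, 4))
```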
Theorem 8.7.11 Let $V$ be a finite dimensional vector space and $W$ a subspace of $V$. Let $\{w_1, w_2, \ldots, w_m\}$ be a basis for $W$.
1. For $v_1, v_2, \ldots, v_k \in V$, $\{v_1, v_2, \ldots, v_k, w_1, w_2, \ldots, w_m\}$ is a basis for $V$ if and only if $\{W + v_1, W + v_2, \ldots, W + v_k\}$ is a basis for $V/W$.
2. $\dim(V/W) = \dim(V) - \dim(W)$.
$$W + u = W + (a_1v_1 + a_2v_2 + \cdots + a_kv_k) = a_1(W + v_1) + a_2(W + v_2) + \cdots + a_k(W + v_k)$$
Example 8.7.12
1. Following Example 8.7.2.2, let $W = \{(x, y) \in \mathbb{R}^2 \mid x - 2y = 0\}$. It is easy to check that $W = \mathrm{span}\{(2, 1)\}$ and hence $\{(2, 1)\}$ is a basis for $W$ and $\dim(W) = 1$. We extend $\{(2, 1)\}$ to a basis $\{(2, 1), (0, 1)\}$ for $\mathbb{R}^2$. By Theorem 8.7.11.1, $\{W + (0, 1)\}$ is a basis for $\mathbb{R}^2/W$.
2. Let $W$ be the subspace of $\mathbb{R}^5$ spanned by $(2, 2, 1, 0, -1)$, $(2, 2, 4, 6, 2)$, $(0, 0, 1, 1, 1)$ and $(1, 1, 2, 0, 1)$. Applying the Gaussian Elimination to the matrix with these vectors as rows,
$$\begin{pmatrix} 2 & 2 & 1 & 0 & -1 \\ 2 & 2 & 4 & 6 & 2 \\ 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 2 & 0 & 1 \end{pmatrix} \xrightarrow{\text{Gaussian Elimination}} \begin{pmatrix} 2 & 2 & 1 & 0 & -1 \\ 0 & 0 & 3 & 6 & 3 \\ 0 & 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$
So $C = \{(2, 2, 1, 0, -1), (0, 0, 3, 6, 3), (0, 0, 0, 3, 0)\}$ is a basis for $W$. Following the algorithm in Example 4.1.14.2 (see also Example 8.5.18), we extend $C$ to a basis for $\mathbb{R}^5$:
$$\{(2, 2, 1, 0, -1), (0, 0, 3, 6, 3), (0, 0, 0, 3, 0), (0, 1, 0, 0, 0), (0, 0, 0, 0, 1)\}.$$
By Theorem 8.7.11.1, $\{W + (0, 1, 0, 0, 0), W + (0, 0, 0, 0, 1)\}$ is a basis for the quotient space $\mathbb{R}^5/W$.
Exercise 8
3. The finite field $\mathbb{F}_4$ is defined as follows: Let $\mathbb{F}_4 = \{0, 1, x, 1 + x\}$. The addition and multiplication are the same as the polynomial addition and multiplication except $1 + 1 = 0$, $x + x = 0$ and $x^2 = 1 + x$.
(a) Write down the addition and multiplication tables for $\mathbb{F}_4$ (as in Example 8.1.3.2).
(b) For every element $a$ of $\mathbb{F}_4$, find $-a$.
(c) For every nonzero element $b$ of $\mathbb{F}_4$, find $b^{-1}$.
(d) Find the inverse of the square matrix $\begin{pmatrix} 1 & 1 & 1 \\ 0 & x & 1 \\ 0 & 0 & 1 \end{pmatrix}$ over $\mathbb{F}_4$.
4. Solve the following linear system over $\mathbb{F}$ when (i) $\mathbb{F} = \mathbb{R}$ and (ii) $\mathbb{F} = \mathbb{F}_2$.
$$\begin{cases} x_1 + x_2 + x_3 + x_5 = 0 \\ x_1 + x_3 + x_4 = 1 \\ x_2 + x_4 + x_5 = 1 \end{cases}$$
(a) If $A$ and $B$ are $n \times n$ matrices over $\mathbb{F}$, show that $\mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B)$.
(b) If $c \in \mathbb{F}$ and $A$ is an $n \times n$ matrix over $\mathbb{F}$, show that $\mathrm{tr}(cA) = c\,\mathrm{tr}(A)$.
(c) If $C$ and $D$ are $m \times n$ and $n \times m$ matrices, respectively, over $\mathbb{F}$, show that $\mathrm{tr}(CD) = \mathrm{tr}(DC)$.
7. For each of the following, list all axioms in the definition of vector space, i.e. Definition 8.2.2, which are not satisfied by the given vector addition and scalar multiplication defined on the given set $V$ over $\mathbb{R}$.
8. Prove that $\mathbb{R}^2$ is a vector space over $\mathbb{R}$ using the following vector addition and scalar multiplication:
9. Let $V = \{a \in \mathbb{R} \mid a > 0\}$. Define
$$a \oplus b = ab \quad\text{for } a, b \in V \quad\text{and}\quad m \odot a = a^m \quad\text{for } m \in \mathbb{R} \text{ and } a \in V.$$
Prove that $V$ is a vector space over $\mathbb{R}$ using these two operations.
10. Let $A = \{0, 1\}$. Let $f, g, h \in F(A, \mathbb{R})$ such that $f(t) = 1 + t^2$, $g(t) = \sin(\tfrac{1}{2}\pi t) + \cos(\tfrac{1}{2}\pi t)$ and $h(t) = t$ for $t \in A$. Show that $f = g + h$.
$$U \times V = \{(u, v) \mid u \in U \text{ and } v \in V\}.$$
Define the vector addition and scalar multiplication as follows:
13. For each of the following subsets $W$ of the vector space $V$, determine whether $W$ is a subspace of $V$.
linearly independent.
$B_1 = \{1 + x - x^2,\ 2 + 2x + x^2\}$.
$B_2 = \{1 + x - x^2,\ 2 - 2x + 2x^2\}$.
$B_3 = \{1 + x - x^2,\ 2 + 2x + x^2,\ 1 + 5x - 2x^2\}$.
$B_4 = \{1 + x - x^2,\ 2 + 2x + x^2,\ 4 + 3x^2\}$.
$B_5 = \{1 + x - x^2,\ 2 + 2x + x^2,\ 1 + 5x - 2x^2,\ 8x - 2x^2\}$.
$B_6 = \{1 + x - x^2,\ 2 + 2x + x^2,\ 4 + 3x^2,\ 2 + 6x - 3x^2\}$.
19. Let $f_1, f_2, \ldots, f_n \in C^{n-1}([a, b])$ where $[a, b]$, with $a < b$, is a closed interval on the real line. The Wronskian $W(f_1, f_2, \ldots, f_n) : [a, b] \to \mathbb{R}$ is the function defined by
$$W(f_1, f_2, \ldots, f_n)(x) = \begin{vmatrix} f_1(x) & f_2(x) & \cdots & f_n(x) \\ \frac{df_1(x)}{dx} & \frac{df_2(x)}{dx} & \cdots & \frac{df_n(x)}{dx} \\ \vdots & \vdots & & \vdots \\ \frac{d^{n-1}f_1(x)}{dx^{n-1}} & \frac{d^{n-1}f_2(x)}{dx^{n-1}} & \cdots & \frac{d^{n-1}f_n(x)}{dx^{n-1}} \end{vmatrix} \quad\text{for } x \in [a, b].$$
(a) Let $f_1, f_2 \in C^1([-1, 1])$ such that $f_1(x) = e^x$ and $f_2(x) = xe^x$ for $x \in [-1, 1]$. Compute $W(f_1, f_2)$.
(b) Let $f_1, f_2, \ldots, f_n \in C^{n-1}([a, b])$. Prove that if $W(f_1, f_2, \ldots, f_n)(x_0) \ne 0$ for some $x_0 \in [a, b]$, then $f_1, f_2, \ldots, f_n$ are linearly independent.
(c) Let $f_1, f_2, \ldots, f_n \in C^{n-1}([a, b])$. If $W(f_1, f_2, \ldots, f_n)(x) = 0$ for all $x \in [a, b]$, is it true that $f_1, f_2, \ldots, f_n$ must be linearly dependent?
20. Let $V = \mathbb{F}^{\mathbb{N}}$ be the vector space of infinite sequences over the field $\mathbb{F}$. For $i = 1, 2, 3, \ldots$, define $e_i \in V$ to be the infinite sequence such that the $i$th term of the sequence is $1$ and all other terms are $0$. Let $B = \{e_1, e_2, e_3, \ldots\}$.
(a) Is $B$ linearly independent?
(b) Is $V = \mathrm{span}(B)$?
(b) Prove that $W$ is a subspace of the real vector space $\mathbb{C}^2$ and find a basis for $W$.
22. For each of the following subsets $B$ of $V = P_n(\mathbb{R})$, determine whether $B$ is a basis for $V$.
(a) $B = \{1, 1 + x, 1 + x + x^2, \ldots, 1 + x + x^2 + \cdots + x^n\}$.
(b) $B = \{1 + x, x + x^2, x^2 + x^3, \ldots, x^{n-1} + x^n, x^n + 1\}$.
23. For each of the following subspaces $W$ of the vector space $V$, (i) find a basis for $W$; and (ii) determine the dimension of $W$.
(a) $V = \mathbb{F}_2^3$ and $W = \{(0, 0, 0), (1, 1, 0), (0, 1, 1), (1, 0, 1)\}$.
(b) $V = \mathbb{C}^4$ and $W = \{Au \mid u \in \mathbb{C}^3\}$ where $A = \begin{pmatrix} 1 & 1 & \mathrm{i} \\ \mathrm{i} & 0 & 1 \\ -\mathrm{i} & 0 & 1 \\ 1 & 0 & \mathrm{i} \end{pmatrix}$.
(In here, vectors in $\mathbb{C}^3$ and $\mathbb{C}^4$ are written as column vectors.)
(c) $V = M_{2\times 2}(\mathbb{R})$ and $W = \mathrm{span}\left\{ \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}, \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix} \right\}$.
$(1 - \mathrm{i})x + 2x^2 + \mathrm{i}\,x^3\}$.
(k) $V = C^\infty([0, 2\pi])$ and $W = \mathrm{span}\{f_1, f_2, f_3, f_4\}$ where $f_1(x) = \sin(x)$, $f_2(x) = \cos(x)$,
$$B = \{1 - x,\ 2 + x - x^4,\ 1 + x + x^2 + x^3 + x^4\}.$$
For each of the following polynomials $p(x)$, (i) determine whether $p(x) \in W$; (ii) if so, compute the coordinate vector of $p(x)$ relative to $B$.
(a) $p(x) = 5x + x^2 + x^3$. (b) $p(x) = 1 + x - x^2 - x^3 + x^4$.
(c) $p(x) = 2 + 2x^2 + 2x^3 + 4x^4$.
27. For each of the sets $B$ in Question 8.25 and Question 8.26, extend it to a basis for $V$.
28. Let $V$ be a vector space over a field $\mathbb{F}$ and let $B$ be a basis for $V$. (In here, $V$ can be infinite dimensional.) Prove that every nonzero vector $u \in V$ can be expressed uniquely as a linear combination
$$u = c_1v_1 + c_2v_2 + \cdots + c_mv_m$$
for some $m \in \mathbb{N}$, $c_1, c_2, \ldots, c_m \in \mathbb{F}$ and $v_1, v_2, \ldots, v_m \in B$ such that $c_i \ne 0$ for all $i$ and $v_i \ne v_j$ whenever $i \ne j$.
29. Let $\mathbb{F}$ be a field, $A \in M_{n\times n}(\mathbb{F})$ and $W = \{B \in M_{n\times n}(\mathbb{F}) \mid AB = BA\}$. Suppose there exists a column vector $v \in \mathbb{F}^n$ such that $\{v, Av, A^2v, \ldots, A^{n-1}v\}$ is a basis for $\mathbb{F}^n$.
(a) Prove that $W$ is a subspace of $M_{n\times n}(\mathbb{F})$.
(b) Prove that $I, A, A^2, \ldots, A^{n-1}$ are linearly independent vectors contained in $W$.
(c) Prove that $\{I, A, A^2, \ldots, A^{n-1}\}$ is a basis for $W$.
30. Let $V$ be a finite dimensional vector space over $\mathbb{C}$ such that $\dim_{\mathbb{C}}(V) = n$. By restricting the scalars to real numbers in the scalar multiplication, $V$ can be regarded as a vector space over $\mathbb{R}$. (See Example 8.5.3.4.) Prove that $\dim_{\mathbb{R}}(V) = 2n$.
31. Let $W$ be a vector space over $\mathbb{R}$. Define $W' = \{(u, v) \mid u, v \in W\}$ with the addition and scalar multiplication
32. For each of the following subspaces $W_1$ and $W_2$ of the vector space $V$,
33. For each of the following subspaces $W_1$ and $W_2$ of $V = M_{n\times n}(\mathbb{R})$, determine whether (i) $V = W_1 + W_2$; and (ii) $V = W_1 \oplus W_2$.
35. Let $W_1$ and $W_2$ be finite dimensional subspaces of a vector space. Prove that
$$\dim(W_1 + W_2) = \dim(W_1) + \dim(W_2) - \dim(W_1 \cap W_2).$$
(Hint: Start with a basis $B$ for $W_1 \cap W_2$ and use Theorem 8.5.17 to extend $B$ to bases $B_1$ and $B_2$ for $W_1$ and $W_2$, respectively. Then show that $B_1 \cup B_2$ is a basis for $W_1 + W_2$.)
37. Let $U$ and $V$ be vector spaces over a field $\mathbb{F}$ and let $U \times V = \{(u, v) \mid u \in U \text{ and } v \in V\}$ be the vector space over $\mathbb{F}$ as defined in Question 8.11. Define $U' = \{(u, 0_V) \mid u \in U\}$ and $V' = \{(0_U, v) \mid v \in V\}$ where $0_U$ and $0_V$ are the zero vectors of $U$ and $V$ respectively.
(a) Show that $U'$ and $V'$ are subspaces of $U \times V$.
(b) Prove that $U \times V = U' \oplus V'$.
(c) If $U$ and $V$ are finite dimensional, find $\dim(U')$, $\dim(V')$ and $\dim(U \times V)$ in terms of $\dim(U)$ and $\dim(V)$.
(The vector space $U \times V$ is called the external direct sum of $U$ and $V$.)
41. Let $[a, b]$, with $a < b$, be a closed interval on the real line and let
$$W = \left\{ f \in C^2([a, b]) \,\middle|\, \frac{d^2f(x)}{dx^2} - 3\frac{df(x)}{dx} + 2f(x) = 0 \text{ for } x \in [a, b] \right\},$$
which is a subspace of $C^2([a, b])$. Show that each coset of $W$ is a solution set of a differential equation
$$\frac{d^2f(x)}{dx^2} - 3\frac{df(x)}{dx} + 2f(x) = g(x) \quad\text{for } x \in [a, b]$$
for some $g \in C([a, b])$.
43. For each of parts (a)-(e) of Question 8.32, write down a basis for $V/W_1$ and a basis for $V/W_2$.
Let $V$ be a vector space over a field $\mathbb{F}$, let $W$ be a subspace of $V$ and let $\mathcal{U}$ be a subspace of the quotient space $V/W$. Define
$$U = \{u \in V \mid W + u \in \mathcal{U}\}.$$
(a) Show that $U$ is a subspace of $V$.
(b) Suppose $W$ and $\mathcal{U}$ are finite dimensional, say, $\dim(W) = k$ and $\dim(\mathcal{U}) = m$. Find $\dim(U)$.
Chapter 9
General Linear Transformations
Definition 9.1.2 Let $V$ and $W$ be two vector spaces over a field $\mathbb{F}$. A linear transformation $T : V \to W$ is a mapping from $V$ to $W$ that satisfies the following two axioms:
(T1) For all $u, v \in V$, $T(u + v) = T(u) + T(v)$.
(T2) For all $c \in \mathbb{F}$ and $u \in V$, $T(cu) = cT(u)$.
If $W = V$, the linear transformation $T : V \to V$ is called a linear operator on $V$.
If $W = \mathbb{F}$, the linear transformation $T : V \to \mathbb{F}$ is called a linear functional on $V$.
(Figure: a linear transformation $T : V \to W$ maps $u$, $v$, $u + v$ and $cu$ in $V$ to $T(u)$, $T(v)$, $T(u + v) = T(u) + T(v)$ and $T(cu) = cT(u)$ in $W$.)
Remark 9.1.3 (T1) and (T2) can be combined together: Let $V$ and $W$ be two vector spaces over a field $\mathbb{F}$. A mapping $T : V \to W$ is a linear transformation if and only if
$$T(au + bv) = aT(u) + bT(v) \quad\text{for all } a, b \in \mathbb{F} \text{ and } u, v \in V.$$
Example 9.1.4
1. Let $A$ be an $m \times n$ matrix over a field $\mathbb{F}$. Define a mapping $L_A : \mathbb{F}^n \to \mathbb{F}^m$ by
$$L_A(u) = Au \quad\text{for } u \in \mathbb{F}^n$$
(where vectors are written as column vectors). For all $u, v \in \mathbb{F}^n$ and $c \in \mathbb{F}$, $L_A(u + v) = A(u + v) = Au + Av = L_A(u) + L_A(v)$ and $L_A(cu) = A(cu) = cAu = cL_A(u)$. So $L_A$ is a linear transformation.
(From this example, we see that Definition 9.1.2 can be regarded as a generalization of Definition 7.1.1.)
3. The zero mapping $O_{V,W} : V \to W$, where $V$ and $W$ are vector spaces over the same field, is defined by
$$O_{V,W}(u) = 0 \quad\text{for } u \in V.$$
It is a linear transformation and is also called the zero transformation from $V$ to $W$. If $W = V$, we use $O_V$ to denote $O_{V,V}$ and call it the zero operator on $V$.
(a) Let $S : P(\mathbb{R}) \to P(\mathbb{R})$ be the mapping defined by $S(p(x)) = (p(x))^2$ for $p(x) \in P(\mathbb{R})$. Is $S$ a linear operator?
(b) Let $T : P(\mathbb{R}) \to P(\mathbb{R})$ be the mapping defined by $T(p(x)) = x\,p(x)$ for $p(x) \in P(\mathbb{R})$. Is $T$ a linear operator?
Solution
(a) $S$ is not a linear operator. For example,
$$S(1 + x) = (1 + x)^2 = 1 + 2x + x^2 \ne 1 + x^2 = S(1) + S(x).$$
(b) $T$ is a linear operator:
(T1) For any $p(x), q(x) \in P(\mathbb{R})$,
$$T(p(x) + q(x)) = x(p(x) + q(x)) = xp(x) + xq(x) = T(p(x)) + T(q(x)).$$
(T2) For any $c \in \mathbb{R}$ and $p(x) \in P(\mathbb{R})$,
$$T(cp(x)) = x(cp(x)) = cxp(x) = cT(p(x)).$$
5. Let $V$ be the set of all convergent sequences over $\mathbb{R}$. We know that $V$ is a subspace of $\mathbb{R}^{\mathbb{N}}$ (see Question 8.13). Define a mapping $T : V \to \mathbb{R}$ by
$$T\big((a_n)_{n\in\mathbb{N}}\big) = \lim_{n\to\infty} a_n \quad\text{for } (a_n)_{n\in\mathbb{N}} \in V.$$
Proposition 9.1.5 Let $V$ and $W$ be vector spaces over the same field. If $T : V \to W$ is a linear transformation, then $T(0) = 0$.
Proof Since $0 + 0 = 0$,
$$\begin{aligned} T(0 + 0) &= T(0) \\ \Rightarrow\quad T(0) + T(0) &= T(0) \\ \Rightarrow\quad T(0) + T(0) - T(0) &= T(0) - T(0) \\ \Rightarrow\quad T(0) + 0 &= 0 \\ \Rightarrow\quad T(0) &= 0. \end{aligned}$$
Remark 9.1.6 Let $V$ and $W$ be vector spaces over the same field. Suppose $V$ has a basis $B$. Let $T : V \to W$ be a linear transformation. For every $u \in V$,
$$u = a_1v_1 + a_2v_2 + \cdots + a_mv_m$$
for some scalars $a_1, a_2, \ldots, a_m$ and some $v_1, v_2, \ldots, v_m \in B$. By using (T1) and (T2) repeatedly, we get
$$T(u) = a_1T(v_1) + a_2T(v_2) + \cdots + a_mT(v_m).$$
It follows that $T$ is completely determined by the images of vectors from $B$.
On the other hand, to define a linear transformation $S$ from $V$ to $W$, we first set the image $S(v)$ for each $v \in B$. For any $u \in V$, since $u = a_1v_1 + a_2v_2 + \cdots + a_mv_m$ for some $v_1, v_2, \ldots, v_m \in B$ and scalars $a_1, a_2, \ldots, a_m$, define $S(u) = a_1S(v_1) + a_2S(v_2) + \cdots + a_mS(v_m)$. Then we have a linear transformation.
Example 9.1.7 Take the standard basis $\{1, x, x^2\}$ for $P_2(\mathbb{R})$. Define a linear transformation $S : P_2(\mathbb{R}) \to \mathbb{R}^3$ by
$$S(1) = (1, 2, 1),\quad S(x) = (0, 1, 1) \quad\text{and}\quad S(x^2) = (-1, 1, 0).$$
Then for any $p(x) = a + bx + cx^2 \in P_2(\mathbb{R})$,
$$S(p(x)) = aS(1) + bS(x) + cS(x^2) = (a - c,\ 2a + b + c,\ a + b).$$
Theorem 9.2.1 Let $T : V \to W$ be a linear transformation where $V$ and $W$ are finite dimensional vector spaces over a field $\mathbb{F}$ such that $n = \dim(V) \ge 1$ and $m = \dim(W) \ge 1$. For any ordered bases $B$ and $C$ for $V$ and $W$ respectively, there exists an $m \times n$ matrix $A$ such that
$$[T(u)]_C = A[u]_B \quad\text{for all } u \in V.$$
(Together with Example 9.1.4.1, the linear transformation defined in Definition 9.1.2 is the same as that in Definition 7.1.1 when $V = \mathbb{F}^n$, $W = \mathbb{F}^m$ and the standard bases $B$ and $C$, respectively, are used.)
Proof Let $B = \{v_1, v_2, \ldots, v_n\}$. Then every $u \in V$ can be expressed uniquely as
$$u = a_1v_1 + a_2v_2 + \cdots + a_nv_n$$
for some scalars $a_1, a_2, \ldots, a_n$. Using the notation of coordinate vectors (see Definition 8.5.6), we have
$$[u]_B = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}.$$
Then by Remark 9.1.6 and Lemma 8.5.7,
Example 9.2.4
1. In Example 9.1.7, let $B = \{1, x, x^2\}$ and $C = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$. The matrix for $S$ relative to $B$ and $C$ is
$$[S]_{C,B} = \Big( [S(1)]_C \;\; [S(x)]_C \;\; [S(x^2)]_C \Big) = \begin{pmatrix} 1 & 0 & -1 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}.$$
We can also recover the formula of $S$ from the matrix $[S]_{C,B}$:
For $p(x) = a + bx + cx^2 \in P_2(\mathbb{R})$, $[p(x)]_B = \begin{pmatrix} a \\ b \\ c \end{pmatrix}$. Then
$$[S(p(x))]_C = [S]_{C,B}[p(x)]_B = \begin{pmatrix} 1 & 0 & -1 \\ 2 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} a - c \\ 2a + b + c \\ a + b \end{pmatrix}.$$
So
$$S(p(x)) = (a - c)(1, 0, 0) + (2a + b + c)(0, 1, 0) + (a + b)(0, 0, 1) = (a - c,\ 2a + b + c,\ a + b).$$
2. Let $V$ be a vector space of dimension $3$ over $\mathbb{C}$. Define a linear transformation $T : V \to \mathbb{C}^3$ such that $T(v_1) = (1, -1, -1)$, $T(v_2) = (\mathrm{i}, -2, \mathrm{i})$ and $T(v_3) = (0, \mathrm{i}, 0)$ where $B = \{v_1, v_2, v_3\}$ is an ordered basis for $V$. Let $C = \{(1, 1, 1), (1, 0, -1), (1, 1, 0)\}$. Find the matrix for $T$ relative to $B$ and $C$.
Solution Since $[T]_{C,B} = \big( [T(v_1)]_C \;\; [T(v_2)]_C \;\; [T(v_3)]_C \big)$, we need to find the coordinate vectors $[(1, -1, -1)]_C$, $[(\mathrm{i}, -2, \mathrm{i})]_C$ and $[(0, \mathrm{i}, 0)]_C$, i.e. to find $a_j, b_j, c_j$, $j = 1, 2, 3$, such that
$$a_1(1, 1, 1) + b_1(1, 0, -1) + c_1(1, 1, 0) = (1, -1, -1) \iff \begin{cases} a_1 + b_1 + c_1 = 1 \\ a_1 + c_1 = -1 \\ a_1 - b_1 = -1, \end{cases}$$
$$a_2(1, 1, 1) + b_2(1, 0, -1) + c_2(1, 1, 0) = (\mathrm{i}, -2, \mathrm{i}) \iff \begin{cases} a_2 + b_2 + c_2 = \mathrm{i} \\ a_2 + c_2 = -2 \\ a_2 - b_2 = \mathrm{i}, \end{cases}$$
$$a_3(1, 1, 1) + b_3(1, 0, -1) + c_3(1, 1, 0) = (0, \mathrm{i}, 0) \iff \begin{cases} a_3 + b_3 + c_3 = 0 \\ a_3 + c_3 = \mathrm{i} \\ a_3 - b_3 = 0. \end{cases}$$
We solve the three linear systems together (see Example 3.7.4.1):
$$\begin{pmatrix} 1 & 1 & 1 & 1 & \mathrm{i} & 0 \\ 1 & 0 & 1 & -1 & -2 & \mathrm{i} \\ 1 & -1 & 0 & -1 & \mathrm{i} & 0 \end{pmatrix} \xrightarrow{\text{Gauss-Jordan Elimination}} \begin{pmatrix} 1 & 0 & 0 & 1 & 2 + 2\mathrm{i} & -\mathrm{i} \\ 0 & 1 & 0 & 2 & 2 + \mathrm{i} & -\mathrm{i} \\ 0 & 0 & 1 & -2 & -4 - 2\mathrm{i} & 2\mathrm{i} \end{pmatrix}.$$
Then $[(1, -1, -1)]_C = \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix}$, $[(\mathrm{i}, -2, \mathrm{i})]_C = \begin{pmatrix} 2 + 2\mathrm{i} \\ 2 + \mathrm{i} \\ -4 - 2\mathrm{i} \end{pmatrix}$ and $[(0, \mathrm{i}, 0)]_C = \begin{pmatrix} -\mathrm{i} \\ -\mathrm{i} \\ 2\mathrm{i} \end{pmatrix}$. So
$$[T]_{C,B} = \big( [T(v_1)]_C \;\; [T(v_2)]_C \;\; [T(v_3)]_C \big) = \begin{pmatrix} 1 & 2 + 2\mathrm{i} & -\mathrm{i} \\ 2 & 2 + \mathrm{i} & -\mathrm{i} \\ -2 & -4 - 2\mathrm{i} & 2\mathrm{i} \end{pmatrix}.$$
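Solving the three systems at once amounts to one matrix inversion; a minimal sympy sketch (assuming sympy is available):

```python
from sympy import Matrix, I

P = Matrix([[1, 1, 1],     # columns are the vectors of C = {(1,1,1), (1,0,-1), (1,1,0)}
            [1, 0, 1],
            [1, -1, 0]])
T_images = Matrix([[1, I, 0],       # columns are T(v1), T(v2), T(v3)
                   [-1, -2, I],
                   [-1, I, 0]])
print(P.inv() * T_images)   # [T]_{C,B}: column j is [T(vj)]_C
```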
3. Let $A = \big( c_1 \;\; c_2 \;\; \cdots \;\; c_n \big)$ be an $n \times n$ matrix over a field $\mathbb{F}$ where $c_i$ is the $i$th column of $A$. Let $L_A : \mathbb{F}^n \to \mathbb{F}^n$ be the linear operator defined in Example 9.1.4.1, i.e. $L_A(u) = Au$ for $u \in \mathbb{F}^n$, where vectors in $\mathbb{F}^n$ are written as column vectors.
Take the standard basis $E = \{e_1, e_2, \ldots, e_n\}$ for $\mathbb{F}^n$. For all $u = (u_1, u_2, \ldots, u_n)^T \in \mathbb{F}^n$,
$$u = u_1e_1 + u_2e_2 + \cdots + u_ne_n \quad\Rightarrow\quad [u]_E = \begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix} = u.$$
Thus
$$[L_A]_E = [L_A]_{E,E} = \big( [L_A(e_1)]_E \;\; \cdots \;\; [L_A(e_n)]_E \big) = \big( [Ae_1]_E \;\; \cdots \;\; [Ae_n]_E \big) = \big( [c_1]_E \;\; \cdots \;\; [c_n]_E \big) = \big( c_1 \;\; c_2 \;\; \cdots \;\; c_n \big) = A.$$
(See also Discussion 7.1.8.)
Theorem 9.2.6 The matrix $[I_V]_{C,B}$ in Discussion 9.2.5 is invertible and its inverse is the transition matrix from $C$ to $B$, i.e. $([I_V]_{C,B})^{-1} = [I_V]_{B,C}$. (See also Theorem 3.7.5.)
Proof It is easier to prove the theorem using the concept of compositions of linear transformations, so we leave it as an exercise for the next section. See Question 9.16.
Example 9.2.7 Let $E = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ and $B = \{(1, \mathrm{i}, -1), (0, 1, 1), (-\mathrm{i}, 1, 1)\}$. They are bases for $\mathbb{C}^3$. Find the transition matrix from $E$ to $B$.
Solution Since $[I_{\mathbb{C}^3}]_{E,B} = \big( [(1, \mathrm{i}, -1)]_E \;\; [(0, 1, 1)]_E \;\; [(-\mathrm{i}, 1, 1)]_E \big) = \begin{pmatrix} 1 & 0 & -\mathrm{i} \\ \mathrm{i} & 1 & 1 \\ -1 & 1 & 1 \end{pmatrix}$, the transition matrix from $E$ to $B$ is
$$[I_{\mathbb{C}^3}]_{B,E} = \big( [I_{\mathbb{C}^3}]_{E,B} \big)^{-1} = \begin{pmatrix} 1 & 0 & -\mathrm{i} \\ \mathrm{i} & 1 & 1 \\ -1 & 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & \frac{1 - \mathrm{i}}{2} & \frac{-1 + \mathrm{i}}{2} \\ -\mathrm{i} & 1 & 0 \\ \mathrm{i} & -\frac{1 + \mathrm{i}}{2} & \frac{1 + \mathrm{i}}{2} \end{pmatrix}.$$
$$\begin{aligned} (T \circ S)(au + bv) &= T(S(au + bv)) \\ &= T(aS(u) + bS(v)) &&\text{because } S \text{ is a linear transformation} \\ &= aT(S(u)) + bT(S(v)) &&\text{because } T \text{ is a linear transformation} \\ &= a(T \circ S)(u) + b(T \circ S)(v). \end{aligned}$$
So $T \circ S$ is a linear transformation.
Example 9.3.2
1. Let $S : \mathbb{C}^3 \to M_{2\times 2}(\mathbb{C})$ be the linear transformation defined by
$$S((a, b, c)) = \begin{pmatrix} a + \mathrm{i}c & 0 \\ 2b & a - \mathrm{i}c \end{pmatrix} \quad\text{for } (a, b, c) \in \mathbb{C}^3$$
and let $T : M_{2\times 2}(\mathbb{C}) \to P_2(\mathbb{C})$ be the linear transformation defined by
$$T\left( \begin{pmatrix} a & b \\ c & d \end{pmatrix} \right) = a + (\mathrm{i}b + c)x - dx^2 \quad\text{for } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in M_{2\times 2}(\mathbb{C}).$$
Example 9.3.4 Let $S : \mathbb{C}^3 \to M_{2\times 2}(\mathbb{C})$ and $T : M_{2\times 2}(\mathbb{C}) \to P_2(\mathbb{C})$ be the linear transformations defined in Example 9.3.2.1. Take the standard bases
Example 9.3.7
1. Let A be an n n matrix over a eld F and let LA : Fn ! Fn be the linear operator
de ned in Example 9.1.4.1. Then LAm (u) = Am u for u 2 Fn .
2. Let [a; b], with a < b, be a closed interval on the real line and let D : C 1 ([a; b]) !
C 1 ([a; b]) be the di erential operator de ned in Example 9.1.4.6. Then for every f 2
C 1 ([a; b]), Dm (f ) is a function on C 1 ([a; b]) such that
d m f (x)
Dm (f )(x) = for x 2 [a; b]:
dxm
[T ]B = P 1
[T ]C P :
Proof Since T = IV T IV ,
[T ]B = [T ]B;B
= [IV T IV ]B;B
= [IV T ]B;C [IV ]C;B
= [IV ]B;C [T ]C;C [IV ]C;B
= ( [IV ]C;B ) 1
[T ]C [IV ]C;B :
So we have [T ]B = P 1
[T ]C P .
De nition 9.3.9 Let F be a eld and A; B 2 Mnn (F). Then B is said to be similar to A
if there exists an invertible matrix P 2 Mnn(F) such that B = P AP . 1
66 Chapter 9. General Linear Transformations
Theorem 9.3.10 Let T be a linear operator on a nite dimensional vector space V over a
eld F, with dim(V ) = n 1, and let C be an ordered basis for V . Then an n n matrix D
over F is similar to [T ]C if and only if there exists an ordered basis B for V such that D = [T ]B .
Proof
(() The result follows from Lemma 9.3.8.
vj = p j u1 + p j u2 + + pnj un
1 2
0 1
0 1 0
Example 9.3.11 Consider the real matrix A = @0
B
0 2A. Let B = fv1 ; v2 ; v3 g where
C
1 1 1
0 1 0 1 0 1
2 1
Bp C p1 C
v1 = B
@
C
2 A; v2 = @ 2A and v3 = B
@ 2A :
1 1 1
p p
Then LA (v1 ) = v1 , LA (v2 ) = 2 v2 and LA (v3 ) = 2 v3 . (See Example 6.1.12.3.) Hence
[LA ]B = [LA (v1 )]B [LA (v2 )]B [LA (v3 )]B
0 1
1
p
p p0 0
= [v1 ]B 2 v2 B 2 v3
B C
B = @0 2
p A:
0
0 0 2
Using the standard basis E = f(1; 0; 0); (0; 1; 0); (0; 0; 1)g for R3 , [LA ]E = A and
0 1
2 1 1
B p p C
[IV ]E;B = [v1 ]E [v2 ]E [v3 ]E = B
@ 2 2 A:
2C
1 1 1
With P = [IV ]E;B , we have [LA ]B = [IV ]B;E [LA ]E [IV ]E;B = P AP .
1
(See also Example
6.2.6.2.)
Section 9.4. The Vector Space L( V; W ) 67
Remark 9.3.12 Using linear operators, we have a new interpretation of the problem of diag-
onalization discussed in Section 6.2. Given a square matrix A 2 Mnn (F), we use the linear
operator LA on Fn de ned in Example 9.1.4.1. By Example 9.2.4.3, [LA ]E = A where E is the
standard basis for Fn . Then to diagonalize A is the same as to nd an ordered basis B for Fn
such that the matrix for LA relative to B is a diagonal matrix. (See also Chapter 11.)
De nition 9.4.1 Let V and W be vector spaces over the same eld F.
1. Let T1 ; T2 : V ! W be linear transformations. We de ne a mapping T1 + T2 : V ! W by
(T1 + T2 )(u) = T1 (u) + T2 (u) for u 2 V:
Proposition 9.4.3 Let V and W be nite dimensional vector space over the same eld F where
dim(V ) 1 and dim(W ) 1 and let B and C be ordered basis for V and W respectively.
1. If T1 ; T2 : V !W are linear transformations, then [T1 + T2 ]C;B = [T1 ]C;B + [T2 ]C;B .
2. If T : V !W be a linear transformation and c 2 F, then [cT ]C;B = c[T ]C;B .
68 Chapter 9. General Linear Transformations
Remark 9.4.4 Matrices and linear transformations have a lot of similarities. The observations
above show their relations in addition and scalar multiplication. By Theorem 9.3.3, we have
also seen that the composition of linear transformations is equivalent to matrix multiplication.
In the later sections, we shall learnt that the matrix inverse has a corresponding analog in linear
transformations (see Theorem 9.6.6). Thus we may sometimes regard linear transformations as
generalized matrices.
Theorem 9.4.5 Let V and W be vector spaces over the same eld F, and let L(V; W ) be the
set of all linear transformations from V to W . Then L(V; W ) is a vector space over F with
addition and scalar multiplication de ned in De nition 9.4.1.
Furthermore, if V and W are nite dimensional, then
dim(L(V; W )) = dim(V ) dim(W ):
Proof It is straight forward to check that L(V; W ) is a vector space over F. In particular,
the zero vector in L(V; W ) is the zero transformation OV;W and the negative of T is the linear
transformation T = ( 1)T .
Now, let V and W be nite dimensional. If V or W is a zero space, then L(V; W ) = fOV;W g
and dim(L(V; W )) = 0 = dim(V ) dim(W ). Suppose dim(V ) 1 and dim(W ) 1. Let
B = fv1 ; v2 ; : : : ; vn g and C = fw1 ; w2 ; : : : ; wm g be bases for V and W respectively. For
each pair of i; j , where 1 i m and 1 j n, de ne a linear transformation Tij : V ! W
such that for k = 1; 2; : : : ; n, (
Tij (vk ) =
wi if k = j
0 if k 6= j:
(See Remark 9.1.6.) It can be shown that f Tij j 1 i m and 1 j ng is a basis for
L(V; W ). Since this basis has mn elements, the dimension formula follows.
(Note that [Tij ]C;B = Eij where Eij is the matrix de ned in Example 8.4.6.7.)
De nition 9.4.6 If V is a vector space over F, then the vector space L(V; F) of all linear
functionals on V is called the dual space of V and is denoted by V . By Theorem 9.4.5, if V is
nite dimensional, then dim(V ) = dim(V ).
Example 9.4.7 Let V = Fn where vectors in V are written as column vectors. For each
f 2 V , there exists scalars a1 ; a2 ; : : : ; an such that for all (x1 ; x2 ; : : : ; xn )T 2 V ,
00 11 0 1
x1 x1
BB CC B x2 C
B
BB x2 CC C
f BB .. CC = a1 x1 + a2 x2 + + an xn = a1 a2 an C:
BB CC B C
B ..
@@ . AA @ . A
xn xn
Section 9.5. Kernels and Ranges 69
So each linear functional in V can be represented by a row vector over F. (See also Example
9.6.14.4.)
......
..
....................................................
T - ......
.........
........... ................
V W
..... .............. .....
..... ..... .............. ..... ....
.... .... .............. .... ....
.... .... .............. .... ....
...... .... ..............
... .. ...
.
................ ....
..
.... ....
. .. .. .. .. .. ....
....
...
. ... .
. .............
.... ... ...
. . ........ ..............
.
...
..
. .. . ..... . . . ........
... .. ... ... ..
..
... ..
. .
.
.
....... . . . . . ........ ..
.. ... .... ..
..
. .................. .. ...
... .
. ..
..
. ......... . ............................. ..
. .
.
. .... . . . . . . . . ...
.
..
..
........... ..
...
...
.
.......
... . . . . . .....
....
.. . ... .
. . .
.
...........
.. ........... .
.
.
.
.
.
.
.
..
. . .
.
. .
R( ). . .
T
. . .
. .
.
..
.
...
..
..
..
..
.. ........... .. ..
..
. .. . . . . . . ....
. .. .............. ..
. . ..
.. ... . . . . . . .... .. ............ .. . . . . . . . . . . ... ..
0
. . .. . . . .
.. . .
............ .
. ..
.. .. .. .. . ........... . . . . . . . . .... ..
. . .. ... . .. . ........... .. ..
.. .. . . ... . .
. .
. ..... . ..
.
..
..
..
.
..
Ker( )
.. .
..
T . ..
.
. ..
.
.
..
.
..
.
..
.
.. . . . . ........
. ....
... . .............. . . . . . . . ...
.. . . . . ...
.
..
..
.
.. .. . . . . . . . .... . ..
.. ...................
. . ... ... . .
.. .. .
. ..
. . ...
.
.
. .
.
.
.. .. ........... .. . . . . . . . . . . .. .
.. .. . . . . . . .... ..
. .....
..... ... . . . . . . . . . . . .. .
.
.. ..
.... . . . . . .... . ......... ..
..
.. ............. .. .. .. ..
.. ....
.... ........... .. ..
.. . . . . . . . . . ...
.. .... ........... ... .. ..
.. ..... . . . ........................... ..
.. ..
... . . . . . . . . .... ..
..
..
........................ ..
..
.. .... ..
. ...
. .. .... . . .
..
..
.
.. .. .... . . . . . . ...... .
..
.. .... .
.. .. .. ..... . . . ........ ..
.. .. .. ......... ......... ..
...
.... ... ...
.... ..........................
...... ...
.... ...... .... ... .....
.
....
....
.. .................... ....
....
.... ............. ...
....
.... .............. .... ....
..... ............. .... ....
..... ..... .............. .....
.....
....... ..... .............. ......
.....
............................................. ........
..........................
Example 9.5.3
1. Let T : M22 (R) ! M23 (R) be the linear transformation de ned by
!! ! !
w x w x+z x + 2y x z w x
T = for 2 M (R):
2 2
y z x 2y w + x z 0 y z
8
>
> w x +z = 0 8
>
> >
> w=0
!! ! >
>
< x + 2y =0 >
>
<
w x 0 0 0 x=t
T = , x z=0 , for t 2 R:
y z 0 0 0 >
> >
> y = 21 t
>
>
> x 2y =0 >
>
:
>
: z=t
w+x z=0
70 Chapter 9. General Linear Transformations
( ! ) ( !)
0 t 0 1
So Ker(T ) = t 2 R and is a basis for Ker(T ).
1
2
t t 1
2
1
Since
!! !
w x w x+z x + 2y x z
T =
y z x 2y w + x z 0
! ! ! !
1 0 0 1 1 1 0 2 0 1 0 1
=w +x +y +z ;
0 1 0 1 1 0 2 0 0 0 1 0
( ! ! ! !)
1 0 0 1 1 1 0 2 0 1 0 1
R(T ) = span ; ; ; .
0 1 0 1 1 0 2 0 0 0 1 0
Using the standard basis fE11 ; E12 ; E13 ; E21 ; E22 ; E23 g, the four matrices are converted
to coordinate vectors (1; 0; 0; 0; 1; 0), ( 1; 1; 1; 1; 1; 0), (0; 2; 0; 2; 0; 0), (1; 0; 1; 0; 1; 0).
0 1 0 1
1 0 0 0 1 0 1 0 0 0 1 0
B C Gaussian B C
B 1 1 1 1 1 0CC
B0
B 1 1 1 2 0C
B
B 0 2 0 2 0 0AC ! B0 0 2 0 4 0C
C
@
Elimination @ A
1 0 1 0 1 0 0 0 0 0 0 0
( ! ! !)
1 0 0 0 1 1 0 0 2
Thus ; ; is a basis for R(T ).
0 1 0 1 2 0 0 4 0
2. Let IV : V ! V be the identity operator de ned in Example 9.1.4.2. Then Ker(IV ) = f0g
and R(T ) = V .
4. Let [a; b], with a < b, be a closed interval on the real line and let D : C 1 ([a; b]) !
C 1 ([a; b]) be the di erential operator de ned in Example 9.1.4.6.
For f 2 C 1 ([a; b]), D(f ) = 0, where 0 is the zero function, if and only if f is a constant
function, i.e. f = c1 for some c 2 R where 1 is the function on C 1 ([a; b]) de ned by
1(x) = 1 for x 2 [a; b]. So Ker(D) = spanf1g.
Z x
Take f 2 C 1 ([a; b]). Let g 2 C 1 ([a; b]) be the function de ned by g (x) = f (t)dt for
1 a
x 2 [a; b]. Then D(g) = f . So R(D) = C ([a; b]).
5. Let [a; b], with a < b, be a closed interval on the real line and let F : C 1 ([a; b]) !
C 1 ([a; b]) be the integral operator de ned in Example 9.1.4.6. Then Ker(F ) = f0g and
R(F ) = fh 2 C 1 ([a; b]) j h(a) = 0g.
Section 9.5. Kernels and Ranges 71
1. If Ker(T ) is nite dimensional, then dim(Ker(T )) is called the nullity of T and is denoted
by nullity(T ).
2. If R(T ) is nite dimensional, then dim(R(T )) is called the rank of T and is denoted by
rank(T ).
4. nullity(D) = 1.
5. nullity(F ) = 0.
Lemma 9.5.6 Let T : V ! W be a linear transformation where V and W are nite dimen-
sional with dim(V ) 1 and dim(W ) 1. For any ordered bases B and C for V and W
respectively,
1. f [u]B j u 2 Ker(T )g is the nullspace of [T ]C;B and nullity(T ) = nullity([T ]C;B ); and
2. f [v]C j v 2 R(T )g is the column space of [T ]C;B and rank(T ) = rank([T ]C;B ).
Proof If V = f0g is a zero space, then Ker(T ) = V = f0g and R(T ) = f0W g, where 0W is
the zero vector in W , and hence
If W = f0g is a zero space, then Ker(V ) = V and R(T ) = W = f0g and hence
Suppose dim(V ) 1 and dim(W ) 1. Let B and C be ordered bases for V and W respectively.
By Lemma 9.5.6 and the Dimension Theorem for Matrices (Theorem 4.3.4),
Remark 9.5.10 When V is nite dimensional, Theorem 9.5.9 gives us a direct proof of the
Dimension Theorem for Linear Transformations without using matrices.
Proof We only prove that T is injective (one-to-one) if and only if Ker(T ) = f0g. The other
parts are obvious.
()) By Proposition 9.1.5, T (0) = 0. On the other hand, since T is injective, there is only one
vector maps to the zero vector in W. Hence Ker(T ) = f0g.
Section 9.6. Isomorphisms 73
T (u v) = T (u) T (v ) = w
w = 0:
It means u v 2 Ker(T ). As Ker(T ) = f0g, we have u v = 0 and hence u = v .
So T is injective.
Example 9.5.13 Consider linear transformations de ned in Example 9.5.3. T and OV;W
(when both V and W are not zero spaces) are neither injective nor surjective; IV is both
injective and surjective, i.e. IV is bijective; D is surjective but not injective; and F is injective
but not surjective.
Example 9.6.2
1. For any vector space V , the identity operator on V is an isomorphism.
2. Let F be a eld and let T : F3 ! P2 (F) be the linear transformation de ned by
is also an isomorphism.
Theorem 9.6.6 Let T : V ! W be a linear transformation where V and W are nite dimen-
sional with dim(V ) = dim(W ) 1. Let B and C be ordered bases for V and W respectively.
1. T is an isomorphism if and only if [T ]C;B is an invertible matrix.
2. If T is an isomorphism, [T 1
]B;C = ( [T ]C;B ) 1 .
Proof The proof is left as exercise.
!
a b
For 2 M (R),
2 2
c d
" !!# " !#
a b a b
T 1
= [T 1
]B;C
c d B
c d C
0 10 1 0 1
1 0 0 0 a a
B C C B C
B0 1 CB B 1 a + b 1 c 1 dC
B
5 1
a+b C
B
6 CB B C
= C=B C:
6 2 2 3 6
B CB C
B0
@
1
2
1
2
C
0 A@ a+c A @
B a + 2b + 2c C
1 1
A
0 1
0 1 a b c+d 1
a b c+ d
1 1 1
3 6 2 2 6 6
Thus
!!
a b
T 1
=a+( 1
a+b 1
c 1
d)x + ( a + 21 b + 12 c)x2 + ( 21 a 1
b 1
c + 16 d)x3 :
c d 2 3 6 2 6
Remark 9.6.9 If V and W are in nite dimensional, Theorem 9.6.8.2 is in general not true.
For example, in Example 9.3.2.2, we have D F = IV but neither D nor F is an isomorphism.
De nition 9.6.10 Let V and W be vector spaces over a eld F. If there exists an isomorphism
from V onto W , then V is said to be isomorphic to W and we write V
=F W or simply V
= W.
Remark 9.6.12 The term \isomorphic" is used in abstract algebra to indicate that two alge-
braic objects have the same structure. For example, by Example 9.6.11.1, Pn (F) is isomorphic
to Fn+1 and hence Pn (F) and Fn+1 are the \same" as vector spaces. However, we do not want to
say that Pn (F) is \equal" to Fn+1 because they are still di erent in other aspects. In particular,
we can multiply two polynomials in Pn (F) but cannot multiply two vectors in Fn+1 .
Theorem 9.6.13 Let V and W be nite dimensional vector spaces over the same eld. Then
V is isomorphic to W if and only if dim(V ) = dim(W ).
Proof
()) Suppose V = W , i.e. there exits an isomorphism T : V ! W . Let fv1 ; v2 ; : : : ; vn g be a
basis for V . We claim that fT (v1 ); T (v2 ); : : : ; T (vn )g is a basis for W :
(a) Take any w 2 W . Since T is surjective, there exists u 2 V such that T (u) = w. As
u 2 spanfv1; v2; : : : ; vng, w = T (u) 2 spanfT (v1); T (v2); : : : ; T (vn)g.
So we have shown that W = spanfT (v1 ); T (v2 ); : : : ; T (vn )g.
(b) Consider the vector equation
T (u) = a1 w1 + a2 w2 + + an wn :
It can be shown that T is an isomorphism (check it). So V
= W.
3. Cn
=R R n . (In here, Cn is regarded as a vector space over R.)
2
4. Suppose V and W are nite dimensional vector spaces over F such that dim(V ) = n and
dim(W ) = m. Then L(V; W ) =F Fmn =F Mmn (F).
In particular, V is isomorphic to its dual space V de ned in De nition 9.4.6. (This result
is not true if V is in nite dimensional.)
Ker(LA ) + v = fu + v j u 2 Ker(LA )g
is the solution set of the linear system Ax = b where b = Av is an element of the column space
of A. (See Theorem 4.3.6.)
Exercise 9
2. For each of the following linear transformation T , (i) determine whether the given condi-
tions is sucient for us to nd a formula for T ; (ii) if possible, write down a formula for
T ; and if not, give two di erent examples of T that satis es the given conditions.
(a) T : P2 (R) ! P1 (R) such that
4. (a) Give an example of a mapping that satis es (T1) but not (T2).
(b) Give an example of a mapping that satis es (T2) but not (T1).
5. Let V and W be two vector spaces over a eld F. Suppose T : V ! W is a mapping that
satis es (T1).
(a) For v 2 W1 and w 2 W2 , show that P (v ) = v and P (w) = 0. (Hint: Use the
property that every vector in V can only be expressed uniquely as a sum of vectors
from W1 and W2 .)
(b) Prove that the mapping P is a linear operator.
(Parts (a) and (b) imply that for all u = v + w 2 V with v 2 W1 and w 2 W2 , P (u) = v .
The linear operator P is sometimes called the projection on W1 along W2 . You can
compare it with the orthogonal projections in De nition 5.2.13 and De nition 12.4.8.)
( ! ) ( ! )
a a a b
(c) Let V = M22 (R), W1 = a; b 2 R and W2 = a; b 2 R .
b b a b
(i) Prove that V = W1 W2 .
(ii) Write down a formula for P .
(a) Suppose T (v1 ); T (v2 ); : : : ; T (vn ) are linearly independent. Prove that v1 ; v2 ; : : : ; vn
are linearly independent.
(b) Suppose v1 ; v2 ; : : : ; vn are linearly independent. Are T (v1 ); T (v2 ); : : : ; T (vn ) lin-
early independent?
8. For each of the following linear operator T : V !V and bases B , C for V , write down
the matrix [T ]C;B .
(b) Use the same V and T as in Part (a) but B = f(1; 0; 0); (0; 1; 0); (0; 0; 1)g and
C = f(1; 0; 0); (1; 1; 0); (1; 1; 1)g.
dp(x)
(c) Let V = Pn (R), T (p(x)) = for p(x) 2 Pn (R) and B = C = f1; x; x2 ; : : : ; xn g.
dx
(d) Let V = f(an )n2N j an+2 = an + an+1 for n = 1; 2; 3; : : : g, T ((an )n2N ) = (an+1 )n2N
for (an )n2N 2 V and B = C = f(bn )n2N ; T ((bn )n2N )g where (bn )n2N is a sequence
in V such that (bn )n2N and T ((bn )n2N ) are linearly independent.
1 4 4 0
from A by the following elementary row operations:
2R3 R1 $ R2 R3 + 2R2
A ! C1 ; A ! C2; A ! C3:
(Follow Notation 1.4.8.)
(a) Write down C1 , C2 and C3 and nd elementary matrices E1 , E2 and E3 such that
EiA = Ci for i = 1; 2; 3.
(b) Suppose V , W are real vector spaces and T : V ! W is a linear transformation such
that [T ]C;B = A where B is an ordered basis for V and C = fw1 ; w2 ; w3 g is an
ordered basis for W . For each i = 1; 2; 3, nd an ordered basis Di for W such that
[T ]Di ;B = Ci .
(This question shows that elementary row operations done to a matrix is equivalent to
changing bases for the corresponding linear transformation.)
(b) Use Gauss-Jordan Elimination to reduce [T ]C;B to its reduce row-echelon form R.
(c) Write down an ordered basis D for R3 such that [T ]D;B = R.
12. Let V be a real vector space with an ordered basis B = fv1; v2; v3g. Suppose T is an
operator on V such that 0 1
1 1 0
B C
[T ]B = @0 1 1A :
0 0 1
(a) Prove that C = fv1 v3; v1; v1 + v2 + v3g is a basis for V .
(b) Using C as an ordered basis, compute [T ]C .
13. Let B = f1 + x + x2 ; x + x2 ; x x2 g.
(a) Prove that B is a basis for P2 (R).
(b) Find the transition matrix from E to B where E = f1; x; x2 g is the standard basis
for P2 (R).
14. Let S : M22 (C) ! C3 and T : C3 ! M22 (C) be linear transformations such that
!! !
a b a b
S = (a + i b; c + i a; c + i d) for 2 M (C)
2 2
c d c d
and !
x iy y
T ((x; y; z )) = for (x; y; z ) 2 C3 :
ix x iz
(a) Let B = fE11 ; E12 ; E21 ; E22 g and C = fe1 ; e2 ; e3 g be standard bases for M22 (C)
and C3 respectively. Compute [S ]C;B , [T ]B;C , [S T ]B and [T S ]C .
(b) Write down a formula for each of S T and T S .
15. Let B = f1; x; x2 g and C = f(1; 0); (0; 1)g. Suppose T1 : P (R) ! R
2
2
and T2 : R !
2
19. Let V be a nite dimensional vector space such that dim(V ) = n 1 and let B =
fv1; v2; : : : ; vng be an ordered basis for V . De ne a linear operator S on V such that
(
0 if k = 1
S (vk ) =
vk 1 if 2 k n:
(a) Write down [S ]B .
(b) Prove that S n 1
6= OV and S n = OV .
(a) If T1 ; T2 : V !W are linear transformations, prove that [T1 + T2 ]C;B = [T1 ]C;B +
[T2 ]C;B .
(b) If T : V !W be a linear transformation and c 2 F, prove that [cT ]C;B = c[T ]C;B .
(a) Let E = fE11 ; E12 ; E21 ; E22 g be the standard bases for M22 (R). Compute [S ]E ,
[T ]E , [IV ]E , [S + T ]E , [T 2IV ]E and [(T 2IV )2 ]E .
(b) Write down a formula for each of S + T , T 2IV and (T 2IV )2 .
22. Let V be a nite dimensional vector space such that dim(V ) = n 1 and let T be an
linear operator on V . De ne Q = T IV where is a scalar. Suppose there exists a
positive integer n such that Qn = OV and Qn 1 (v ) 6= 0 for some v 2 V .
(See De nition 9.4.6 for the de nition of the dual spaces V and W .)
(b) Suppose V and W are nite dimensional with bases B = fv1 ; : : : ; vn g and C =
fw1; : : : ; wmg respectively. For i = 1; : : : ; n, de ne gi 2 V such that
(
1 if i = j
gi (vj ) =
0 otherwise;
and for i = 1; : : : ; m, de ne hi 2 W such that
(
1 if i = j
hi (wj ) =
0 otherwise.
(i) Show that B = fg1 ; : : : ; gn g is a basis for V and C = fh1 ; : : : ; hm g is a basis
for W .
(ii) Prove that [T ]B ;C = ([T ]C;B )T .
25. Let V and W be vector spaces over the same eld. For any subset A of V , de ne
T 1 [U ] = fu 2 V j T (u) 2 U g:
(b) Show that T 1
[U ] is a subspace of V .
(c) If both Ker(T ) and U are nite dimensional, say dim(Ker(T )) = k and dim(U ) = m,
nd dim(T 1 [U ]).
Exercise 9 87
33. (a) Let T : V ! W be a linear transformation where V and W are nite dimensional
vector spaces such that dim(V ) = n 1 and dim(W ) = m 1. Show that there
exist ordered bases B and C for V and W , respectively, such that
!
[T ]C;B =
Ik 0k(n k)
0(m k)k 0(m k)(n k)
where k = rank(T ). (Hint: The result of Theorem 9.5.9 can be useful.)
(b) Let A 2 Mmn(F). Show that there exists invertible matrices P 2 Mmm(F) and
Q 2 Mnn(F) such that
!
PAQ = Ik 0k(n k)
0(m k)k 0(m k)(n k)
where k = rank(A).
0 1
1 1 0
(c) Let A = B
@ 0 1 1A be a real matrix. Find two 3 3 invertible real matrices
C
1 0 1
0 1
1 0 0
P and Q such that PAQ = @0 1 0C
B
A.
0 0 0
35. (a) Let S; T : V ! W be linear transformations. Prove that R(S + T ) R(S ) + R(T )
and Ker(S + T ) Ker(S ) \ Ker(T ).
(b) Let S; T : R3 ! R4 be linear transformations de ned by
39. Let V and W be nite dimensional vector spaces such that dim(V ) = dim(W ) and let
T : V ! W be a linear transformation. Prove that the following statements are equivalent:
(i) T is injective.
(ii) T is surjective.
(iii) T is bijective.
40. Let V and W be nite dimensional vector spaces and let T : V !W be a linear trans-
formation.
(a) Prove that if dim(V ) < dim(W ), then T is not surjective.
(b) Prove that if dim(V ) > dim(W ), then T is not injective.
41. For each of the following, determine whether the the linear transformation T is an iso-
morphism. If so, nd the inverse of T .
(a) T : F23 ! F23 such that T ((x; y; z )) = (x + y; y + z; z + x) for (x; y; z ) 2 F23 .
(b) T : R3 ! R3 such that T ((x; y; z )) = (x + y; y + z; z + x) for (x; y; z ) 2 R3 .
Exercise 9 89
43. Let T : M22 (C) ! M22 (C) be a linear transformation such that T 1
exists and
!! ! !
a b a a + ib a b
T 1
= for 2 M (C):
2 2
c d a + ic b + c + id c d
(a) Write down a formula for T .
(b) Find Ker(T ) and R(T ).
49. (a) Let V be a subspace of a vector space U and let W be a subspace of V . Show that
V=W is a subspace of U=W .
(b) (The Second Isomorphism Theorem) Let V and W be subspaces of a vector
space U . Prove that (V + W )=W
= V=(V \ W ).
(c) (The Third Isomorphism Theorem) Let V be a subspace of a vector space U
and let W be a subspace of V . Prove that (U=W )=(V=W )
= U=V .
Chapter 10
Example 10.1.3
( !) ( ! !)
1 1 2 1 2
S1 = , S2 = ; and
1 1 2 2 1
( ! ! ! ! ! ! )
1 2 3 1 2 3 1 2 3 1 2 3 1 2 3 1 2 3
S3 = ; ; ; ; ; ; .
1 2 3 2 3 1 3 1 2 3 2 1 1 3 2 2 1 3
!
1 2 3
For example, = is the mapping from f1; 2; 3g to f1; 2; 3g such that (1) = 3,
3 1 2
(2) = 1 and (3) = 2.
Notation 10.1.4
1. For ; 2 Sn, is also a permutation and we usually denote by .
92 Chapter 10. Multilinear Forms and Determinants
Example 10.1.5
! !
1 2 3 4 1 2 3 4
1. Let = and = . Then
3 1 2 4 4 1 3 2
! !
1 2 3 4 1 2 3 4
= and = :
4 3 2 1 3 4 1 2
! ! !
1 2 3 4 1 2 3 4 1 2 3 4
2. In S4 , 1;2 = , 2;3 = , 3;4 = ,
2 1 3 4 1 3 2 4 1 2 4 3
! !
1 2 3 4 1 2 3 4
1;2 2;3 1;2 = = 1;3 and 1;2 2;3 3;4 2;3 1;2 = = 1;4 :
3 2 1 4 4 2 3 1
Lemma 10.1.6
1. f 1
j 2 Sng = Sn.
2. For any 2 Sn , f j 2 Sn g = f j 2 Sn g = Sn .
Proof The proof is left as an exercise. See Question 10.4(a) and (b).
Lemma 10.1.7 For every 2 Sn, there exists 1 ; 2; : : : ; k 2 f1; 2; : : : ; n 1g such that
= 1 ; 1 +1 2 ; 2 +1 k ; k +1 .
= (n ; n
1 1 ; ;
2 2 1 1 ) 1
= 1; 1 2; 2 n ; n
1 1
is a product of transpositions.
!
1 2 3 4
Example 10.1.8 Let = . Following the procedure of the proof of Lemma
3 1 4 2
10.1.7, we have !
1 2 3 4
34 23 13 =
1 2 3 4
and hence = 13 23 34 . Since 13 = 12 23 12 , we have = 12 23 12 23 34 .
The decomposition of a permutation discussed in Lemma 10.1.7 is not unique, for this example,
we can also write = 23 34 12 .
Example 10.1.10
! !
1 2 3 4 1 2 3 4
1. Let = and = .
3 1 4 2 4 1 3 2
In , inversions occur when (i; j ) = (1; 2), (1; 4) and (3; 4). So is an odd permutation
and sgn( ) = 1. In , inversions occur when (i; j ) = (1; 2), (1; 3), (1; 4) and (3; 4). So
is an even permutation and sgn( ) = 1.
Corollary 10.1.12
1. If 2 Sn is a product of k transpositions, then sgn( ) = ( 1)k .
2. A permutation is even (respectively, odd) if it is a product of even (respectively, odd)
number of transpositions.
3. For any 2 Sn , sgn( 1 ) = sgn( ).
Proof Part 1 is a consequence of Example 10.1.10.2 and Theorem 10.1.11 while Part 2 is a
consequence of Part 1. The proof of Part 3 is left as exercise. See Question 10.4(c).
Section 10.2. Multilinear Forms 95
De nition 10.2.1 Let V be a vector space over a eld F and let V n = V| {z V}. A
n times
mapping T : V n ! F is called a multilinear form on V if for each i, 1 i n,
Example 10.2.2 In the following examples, vectors in Fn are written as column vectors.
1. Let A 2 Mnn (F). De ne T : Fn Fn ! F by
T (u; v) = uT Av for u; v 2 Fn :
2. De ne P : Mnn (F) ! F by
X
P (A) = a(1);1 a(2);2 a(n);n for A = (aij ) 2 Mnn (F):
2Sn
P a1 ai 1 bx + cy ai+1 an
X
= a(1);1 a(i 1) ;i 1 (bx(i) + cy(i) ) a(i+1);i+1 a n ;n
( )
2Sn
X
=b a(1);1 a(i 1) ;i 1 x(i) a(i+1);i+1 a(n);n
2Sn
X
+c a(1);1 a(i 1);i 1 y(i) a(i+1);i+1 a(n);n
2Sn
= bP a1 ai 1 x ai+1 an
+ cP a1 ai 1 y ai+1 an :
Suppose 1 + 1 6= 0 in F. Then Q(A) = 0. (Actually, it is also true for the case when
1 + 1 = 0 in F, see Question 10.10.)
Hence Q is an alternative multilinear form.
Proof By Corollary 10.1.12, we only need to show that for all = ; , where 1 < n,
T (u1 ; u2 ; : : : ; un ) = T (u (1) ; u (2) ; : : : ; u (n) ).
In the following computation, we use T (: : : ; x; : : : ; y ; : : : ) to denote the function value of T
when the th term is x, the th term is y and the ith term, for i 6= ; , is ui .
Since T is alternative,
0 = T (: : : ; u + u ; : : : ; u + u ; : : : )
= T (: : : ; u ; : : : ; u + u ; : : : ) + T (: : : ; u ; : : : ; u + u ; : : : )
= T (: : : ; u ; : : : ; u ; : : : ) + T (: : : ; u ; : : : ; u ; : : : )
+ T (: : : ; u ; : : : ; u ; : : : ) + T (: : : ; u ; : : : ; u ; : : : )
= T (: : : ; u ; : : : ; u ; : : : ) + T (: : : ; u ; : : : ; u ; : : : ):
u1 = a v1 + a v2 + + am vm;
11 21 1
u2 = a v1 + a v2 + + am vm;
12 22 2
..
.
un = a1nv1 + a2nv2 + + amnvm
where a11 ; a12 ; : : : ; amn 2 F.
1. Let F be the set of all mapping from f1; 2; : : : ; ng to f1; 2; : : : ; mg. Then apply (10.2)
repeatedly (see Example 10.2.5), we have
X
T (u1 ; u2 ; : : : ; un ) = af (1);1 af (2);2 af (n);n T (vf (1) ; vf (2) ; : : : ; vf (n) ): (10.3)
f 2F
2. Suppose T is an alternative form. Then in (10.3), T (vf (1) ; vf (2) ; : : : ; vf (n) ) = 0 when
f is not injective, i.e. there exists ; 2 f1; 2; : : : ; ng such that 6= and f ( ) = f ( ).
98 Chapter 10. Multilinear Forms and Determinants
(a) If m < n, then T (u1 ; u2 ; : : : ; un ) = 0 for all u1; u2; : : : ; un 2 V , i.e. T is a zero
mapping.
(b) If m n, then (10.3) is still holds if we change the set F to the set of all injective
mapping from f1; 2; : : : ; ng to f1; 2; : : : ; mg.
In particular, When m = n, we can replace F by Sn . By Theorem 10.2.3, we get
X
T (u1 ; u2 ; : : : ; un ) = a(1);1 a(2);2 a(n);n T (v(1) ; v(2) ; : : : ; v(n) )
2Sn
X
= sgn( ) a(1);1 a(2);2 a n ;n T (v1; v2; : : : ; vn):
( )
2Sn
(10.4)
De nition 10.3.1 A mapping D : Mnn (F) ! F is called a determinant function on Mnn (F)
if it satis es the following axioms:
(D1) By regarding the columns of matrices in Mnn (F) as vectors in Fn (see Example 10.2.2.2),
D is a multilinear form on Fn .
(D2) D(A) = 0 if A 2 Mnn (F) has two identical columns, i.e. as a multilinear form on Fn ,
D is alternative.
(D3) D(In ) = 1.
Section 10.3. Determinants 99
Theorem 10.3.2 There exists one and only one determinant function on Mnn (F) and it is
the function det : Mnn (F) ! F de ned by
X
det(A) = sgn( ) a(1);1 a(2);2 a n ;n
( ) for A = (aij ) 2 Mnn (F): (10.5)
2Sn
This formula is known as the classical de nition of determinants.
Proof Note that the function det is the function Q in Example 10.2.2.3. By Example
10.2.2.3 and Question 10.10, det is an alternative multilinear form on Fn and hence it sat-
is es (D1) and (D2). Since In = (ij ) where ii = 1 and ij = 0 if i 6= j , for any 2 Sn ,
(1);1 (2);2 (n);n = 0 whenever is not the identity mapping. Then
X
det(In ) = sgn( ) (1);1 (2);2 n ;n = nn = 1
( ) 11 22
2Sn
Hence D = det.
Example 10.3.3
!
1
1. In S1 , there is only one permutation = .
1
Let A = (a11 ) 2 M11 (F). Then
! ! !
1 2 3 1 2 3 1 2 3
3. In S3 , there are six permutations: 1 = , 2 = , 3 = ,
1 2 3 3 1 2 2 3 1
! ! !
1 2 3 1 2 3 1 2 3
4 = , 5 = and 6 = .
3 2 1 1 3 2 2 1 3
0 1
a11 a12 a13
Let A = @a21 a22 a23 C
B
A 2 M33 (F). Then
a31 a32 a33
6
X
det(A) = sgn(j ) aj (1);1 aj (2);2 aj (3);3
j =1
= a11 a22 a33 + a31 a12 a23 + a21 a32 a13 a31 a22 a13 a11 a32 a23 a21 a12 a33 :
This gives us the same formula as in Remark 2.5.5.
2Sn
X
= sgn( ) a1;(1) a2;(2) an; n ( )
2Sn
= det(AT ):
det(A) = a 1 A 1 + a 2 A 2 + + a n A n
= a1 A1 + a2 A2 + + an An
Remark 10.3.6 By Example 10.3.3.1 and Theorem 10.3.5, the determinant function de ned
in Theorem 10.3.2 is the same as the determinant de ned inductively in De nition 2.5.2.
Exercise 10
T (u; v) = uT Av for u; v 2 Fn :
Show that T is a bilinear form on Fn .
10. Prove that the multilinear form Q in Example 10.2.2.3 is alternative when 1 + 1 = 0 in
F. (Hint: See Question 10.5.)
11. Let V be a vector space over a eld F and T : V n ! F an alternative multilinear form. If
u1; u2; : : : ; un are linearly dependent vectors in V , prove that T (u1; u2; : : : ; un) = 0.
12. Suppose V is nite dimensional. Let B be an ordered basis for V and T a bilinear form
on V .
(a) Show that there exists a square matrix A over F such that T (u; v ) = ([u]B )T A[v ]B
for all u; v 2 V .
(b) Prove that T is symmetric if and only if the matrix A in (a) is symmetric. (See
Question 10.7 for the de nition of symmetric bilinear forms.)
(c) If 1 + 1 6= 0 in F, prove that T is alternative if and only if the matrix A in (a) is
skew symmetric.
13. Let V be a vector space over a eld F. A function Q : V ! F is called a quadratic form
on V if it satis es the following two axioms.
14. Suppose V is nite dimensional vector space over R with n = dim(V ). Let Q be
a quadratic form on V . Prove that there exist 1 ; 2 ; : : : ; n 2 R and a basis B =
fw1; w2; : : : ; wng for V such that
Q(u) = 1 a21 + 2 a22 + + n a2n for u = a1 w1 + a2 w2 + + an wn 2 V:
16. Use the formula in Theorem 10.3.2 to write down a formula for det(A) where A = (aij )44 .
17. (This question is part of the proof of cofactor expansions, Theorem 10.3.5. You should
not use the properties of determinants learnt from Section 2.5.)
Let Em : Mnn (F) ! F, 1 m n, be the mapping de ned in Theorem 10.3.5, i.e.
n
X
Em (A) = ( 1)m+k amk det(A
e mk ) for A = (aij ) 2 Mnn (F):
k=1
Prove that Em is a determinant function.
Exercise 10 105
det(A) = a 1 A 1 + a 2 A 2 + + a n A n
= a1 A1 + a2 A2 + + an An
where A = ( 1) +
det(A
e ).
(Hint: First, use Theorem 10.2.3 and Lemma 10.3.4 to nd out what happen to the
determinant if we interchanging two columns or two rows of A.)
22. Let a1 ; a2 ; : : : ; an be elements of a eld. Prove that the value of the Vandermonde deter-
minant
1 1 1
a1 a2 an
det (aij 1 )nn = .. .. ..
. . .
a1n 1
a2n 1
: : : ann 1
Y
is equal to (aj ai ).
i<j n
1
106 Chapter 10. Multilinear Forms and Determinants
b a0
1 b 0 a1
1 b a2
... ... .. = a0 a1 b ak bk
1
1
+ bk :
.
... b ak 2
0
1 b ak 1
Chapter 11
Discussion 11.1.1 In Section 6.2, we have studied the problem of diagonalization of real
square matrices. The procedure to diagonalize square matrices over other elds is exactly the
same. By Remark 9.3.12, we learnt that to diagonalize a square matrix A 2 Mnn (F) is the
same as to nd an ordered basis B for Fn such that the matrix for LA relative to B is a diagonal
matrix. In this section, we shall restate a few important results in Chapter 6 in terms of linear
operators.
Example 11.1.3
2. We de ne the shift operator S on the vector space of in nite sequences over a eld F such
that
S ((an )n2N ) = (an+1 )n2N = (a2 ; a3 ; a4 ; : : : ) for (an )n2N = (a1 ; a2 ; a3 ; : : : ) 2 FN :
For any 2 F, let a = (n 1 )n2N = (1; ; 2 ; : : : ) 2 FN . Then
S (a ) = (n )n2N = (; 2 ; 3 ; : : : ) = (1; ; 2 ; : : : ) = (n 1 )n2N = a :
Thus every scalar is an eigenvalue of S and a is an eigenvector of S associated with .
3. Let [a; b], with a < b, be a closed interval on the real line and let D : C 1 ([a; b]) !
C 1 ([a; b]) be the di erential operator de ned in Example 9.1.4.6. For any 2 R, let
f 2 C 1 ([a; b]) be the function de ned by f (x) = ex for x 2 [a; b]. Then
dex
D(f )(x) = = ex = f (x) for all x 2 [a; b]
dx
) D(f ) = f :
Thus every real number is an eigenvalue of D and f is an eigenvector of D associated
with .
De nition 11.1.4 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) 1. The determinant of T , denoted by det(T ), is de ned to be the determinant of the
matrix [T ]B where B is any ordered basis for V .
Remark 11.1.5 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) 1. Suppose B and C are two ordered bases for V . By Theorem 9.3.10, [T ]B and [T ]C
are similar, i.e. [T ]B = P 1 [T ]C P for some invertible matrix P . Then
det([T ]B ) = det(P 1
[T ]C P ) = det(P 1
) det([T ]C ) det(P )
= det(P ) 1
det([T ]C ) det(P ) = det([T ]C ):
So the de nition of det(T ) is independent of the choice of the basis B .
Theorem 11.1.6 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) 1. For a scalar , let IV T be the linear operator de ned by (IV T )(u) =
u T (u) for u 2 V .
1. is an eigenvalue of T if and only if det(IV T ) = 0.
(The equation det(xIV T ) = 0 is called the characteristic equation of T and the poly-
nomial det(xIV T ) is called the characteristic polynomial of T .)
2. u2V is an eigenvector of T associated with if and only if u is a nonzero vector in
Ker(T IV ) (= Ker(IV T )).
(The subspace Ker(T IV ) of V is called the eigenspace of T associated with .)
Section 11.1. Eigenvalues and Diagonalization 109
Proof The proof of Part 1 follows the same argument as the proof for Remark 6.1.5. For Part
2, let u 2 V ,
Notation 11.1.7 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) 1. We use cT (x) to denote the characteristic polynomial of T , i.e. cT (x) = det(xIV T ).
For an eigenvalue of T , we use E (T ) to denote the eigenspace of T associated with , i.e.
E (T ) = Ker(T IV ).
Also, for an n n matrix A, we use cA (x) to denote the characteristic polynomial of A. For
an eigenvalue of A, we use E (A) to denote the eigenspace of A associated with . (See
De nition 6.1.6 and De nition 6.1.11.)
Remark 11.1.8 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) = n 1. Take any basis B for V . Then
Thus the characteristic polynomial of T is the same as the characteristic polynomial of the
matrix [T ]B . By the result of Question 10.21, cT (x) is a monic polynomial of degree n.
Example 11.1.9 Let V be a nite dimensional real vector space with a basis B = fv1 ; v2 ; v3 g
Let T : V ! V be linear operator such that
Then
0 1
1 2 2
[T ]B = [T (v1 )]B [T (v2 )]B [T (v3 )]B = @0
B C
1 2A :
0 2 3
Thus
x 1 2 2
cT (x) = c[T ]B (x) = det(xI [T ]B ) = 0 x+1 2 = (x 1)3
0 2 x 3
and hence T has only one eigenvalue 1.
110 Chapter 11. Diagonalization and Jordan Canonical Forms
For a; b; c 2 R,
av1 + bv2 + cv3 2 E1 (T ) , (T IV )(av1 + bv2 + cv3) = 0
, [T IV ]B [av1 + bv2 + cv3]B = [0]B
20 1 0 130 1 0 1
1 2 2 1 0 0 a 0
, 6B
4@0 1 2A
C B C7B C B C
@0 1 0A5@ b A = @0A
0 2 3 0 0 1 c 0
0 10 1 0 1
0 2 2 a 0
, B
@0
CB C B C
2 2A@ b A = @0A
0 2 2 c 0
0 1 0 1 0 1
a 1 0
, @ b A = s@0A + t@1A for s; t 2 R:
B C B C B C
c 0 1
So E1 (T ) = spanfv1 ; v2 + v3 g.
De nition 11.1.10 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) 1. Then T is called diagonalizable if there exists an ordered basis B for V such that
[T ]B is a diagonal matrix.
Theorem 11.1.11 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) 1. Then T is diagonalizable if and only if V has a basis B such that every vector in
B is an eigenvector of T .
Proof The proof follows the same argument as the proof for Theorem 6.2.3.
Algorithm 11.1.12 Let T be a linear operator on a nite dimensional vector space V where
dim(V ) = n 1. We want to determine whether T is diagonalizable and if it is diagonalizable,
nd an ordered basis so that the matrix of T relative to this basis is a diagonal matrix. (See
Algorithm 6.2.4 and Remark 6.2.5.)
Step 1: Find a basis C for V and compute the matrix A = [T ]C .
Step 2: Factorize the characteristic polynomial cT (x) = cA (x) into linear factors (if possible),
i.e. to express it in the form
cA (x) = (x 1 )r1 (x 2 )r2 (x k )rk
where 1 ; 2 ; : : : ; k are distinct eigenvalues of A and r1 + r2 + + rk = n. If we cannot
factorize cA (x) into linear factors, T is not diagonalizable.
Step 3: For each eigenvalue i , nd a basis Bi for the eigenspace Ei (T ) = Ker(T i IV ).
If jBi j < ri for some i, T is not diagonalizable. (See Theorem 11.5.10.)
Section 11.1. Eigenvalues and Diagonalization 111
Example 11.1.13
Step 1: Take the standard basis C = f(1; 0); (0; 1)g for R2 . Then
!
0 1
A = [T ]C =
1
1 0
:
x 1
cA (x) = det(xI A) = = 1 + x2
1 x
x 4 1 1
cA (x) = det(xI A) = 1 x 2 1 = (x 3)2 (x + i):
0 0 x+i
c 0
x+2 1 1 2
1 x 1 0 1
cA (x) = det(xI A) = = x2 (x 1)2 :
2 1 x 2 1
1 0 1 x 1
Remark 11.1.14 In Algorithm 11.1.12, we need to factorize the characteristic polynomial into
linear factors. Sometimes, over a certain eld F, say F = R, we may not be able to factorize
some polynomials in this manner. Luckily, we can always nd a bigger eld that contains F
and at the same time, all polynomials over this bigger eld can be factorized into linear factors.
In particular, the eld C contains R and all polynomials over C can be factorized into linear
factors.
C , i.e.
2
Discussion 11.2.1 Although not all square matrices are diagonalizable, we can still reduce
them into simpler form provided that the eld used is big enough (see Remark 11.1.14). In this
section, we shall see that if the characteristic polynomial of a square matrix can be factorized
into linear factors, then the matrix is similar to an upper triangular matrix. In particular, every
complex square matrix is similar to an upper triangular matrix.
Proof We prove Part 1 by induction on n. (In the following, vectors in Fn are written as
column vectors.)
Since 1 1 matrices are upper triangular matrices, the result is true for n = 1.
Assume that if the characteristic polynomial of a matrix B 2 Mkk (F) can be factorized into
linear factors over F, then there exists an invertible matrix Q 2 Mkk (F) such that Q 1 BQ
is an upper triangular matrix.
Now, let A 2 M(k+1)(k+1) (F) such that cA (x) can be factorized into linear factors over F. Let
E be the standard basis for Fk+1 . As cA (x) has at least one linear factor x for some 2 F,
A has at least one eigenvalue .
Let v be an eigenvector of Aassociated with . Extend fv g to a basis D = fv ; u1 ; : : : ; uk g for
Fk+1. Let R = [IFk+1 ]E;D = v u1 uk . Then
0 1
b1 k b
B C
R AR = [IFk
1
+1 ]D;E [LA ]E [IFk+1 ]E;D = [LA ]D = B
B.
0
@ .. B
C
C
A
0
for some b1 ; : : : ; bk 2 F and B 2 Mkk (F). As cA (x) = cR 1 AR (x) = (x )cB (x), cB (x)
can also be factorized into linear factors over F. By the inductive assumption, there exists an
invertible matrix Q 2 Mkk (F) such that Q 1 BQ is an upper triangular matrix.
0 1
1 0 0
B C
Let P = R B 0 C
C. Then
B.
@ .. Q A
0
0 11 0 1
1 0 0 1 0 0
B C B C
P AP = B
1
B.
@.
0
Q
C
C
A
R AR B
1
B.
@.
0
Q
C
C
A
. .
0 0
0 10 10 1
1 0 0 b1 k 1
b 0 0
B CB CB C
=B 0 CB 0 CB0 C
B.
@ .. Q 1 CB .
A@ .. B CB .
A@ .. Q C
A
0 0 0
0 1
(b1 k )Q
b
B C
B0 C
= B.
@ .. Q BQ 1 C
A
0
which is an upper triangular matrix. (Use Lemma 11.2.2 for the block matrix multiplications
preformed above.)
So by the mathematical induction, Part 1 of the theorem is true for all positive integer n.
116 Chapter 11. Diagonalization and Jordan Canonical Forms
For Part 2, we apply the result of Part 1 to A = [T ]C , where C is an ordered basis for V . There
exists an invertible matrix P such that D = P 1 [T ]C P is an upper triangular matrix. Then
by Theorem 9.3.10, there exists an ordered basis B such that [T ]B = D is an upper triangular
matrix.
(See Question 11.23 for an alternative proof of Part 2.)
0 1
8 6 2 2
B C
B 4 3 1 1C
Example 11.2.4 Let A = B
B
C be a real matrix. Note that cA (x) = (x 8)4 .
@ 8 6 10 2 C
A
4 1 3 11
The matrix A has an eigenvector (0; 0; 1; 1)T .
Extend f(0; 0; 1; 1)T g to a basis f(0; 0; 1; 1)T ; (1; 0; 0; 0)T ; (0; 1; 0; 0)T ; (0; 0; 0; 1)T g for R.
4
0 1
0 1 0 0
B C
B 0 0 1 0C
Let S = B
B
C. Then
@ 1 0 0 0C
A
1 0 0 1 0 1
8 8 6 2
B C
B0 8 6 2C
S 1
AS = B
B0 4 3 1C
C:
@ A
0 4 7 13
0 1
8 6 2
The matrix B = @4
B C
3 1 A has an eigenvector (2; 1; 3)T .
4 7 13
0 1
2 0 0
Extend f(2; 1; 3)T g to a basis f(2; 1; 3)T ; (0; 1; 0)T ; (0; 0; 1)T g for R3 . Let R = B
@
C
1 1 0A.
3 0 1
0 1
Then 8 3 1
R 1
BR = @0
B
6 2 A:
C
0 2 10
!
6 2
The matrix C = has an eigenvector (1; 1)T .
2 10
!
1 0
Extend f(1; 1)T g to a basis f(1; 1)T ; (0; 1)T g for R2 . Let Q = . Then
1 1
!
Q 1
CQ = 80 28 :
Now, let us trace what we have done backward following the proof of Theorem 11.2.3.
Section 11.3. Invariant Subspaces 117
0 1 0 10 1 0 1
1 0 0 2 0 0 1 0 0 2 0 0
Let U = R B 0
C
=B CB C B C
1 1 0A@0 1 0A = @ 1 1 0A. Then
@
0
Q A @
3 0 1 0 1 1 3 1 1
0
8 ( 3 Q 1 08 1) 4 1
1
U BU = B
1
@ 0
C B
Q CQ A = @00 1
8
C
2 A:
0 0 8
0 1 0 10 1 0 1
1 0 0 0 0 1 0 0 1 0 0 0 0 2 0 0
B C B CB C B C
B 0 C B 0 0 1 0C
CB0
B 2 0 0C B 0 1 1 0C
Let P = S B C = B C=B C. Then
B
@ 0 U C
A
B
@ 1 0 0 C
0AB
@0 1 1 0A B
C
@ 1 0 0 0C
A
0 1 0 0 1 0 3 1 1 1 3 1 1
0 1 0 1
B
8 (8 6 2) U C B
8 16 8 2
C
B 0 C B0 8 4 1C
P AP =
1 B
B
0 U BU 1
C=B
C B0 0 8 2C
C:
@ A @ A
0 0 0 0 8
Remark 11.2.5 Theorem 11.2.3 is still true if we change \upper triangular matrix" to \lower
triangular matrix".
V V
..... .....
..... .... ..... ....
.... .... .... ....
...... .... ...... ....
.. . ... ....
. .. . ... ....
....
....
.
..... ..... ..... ..
....
.....
...
...
.
.
.. ................................................... .. .
.
..
............................... ..
.. .... ................ .... .... ..
... ....... . . .... ... ........ . . .
.. ............... ...
.... ..
.
. ..... ..
... ..... .. .. ........... .
.
. ..
.... . . .
.
..
..
. . .................
..
. .
.
.... . . W . . .....
..
.. ............. ..... . . W . . ......
.
..
..
. . .. .............. .
..
. ... . . . . . . . . . .... .. ..
. .. . ................... . . . . . ... ..
............... .. ..
... ... .. ..
.
.
. ...
..... ..
. . ..
jW-
.. .
... .. . . . . . . . . . . ...
.
. .. ... .. . . ............................. . . . ...
.
..
..
.. .. . . . . . . . . . . ... ..
. .. .. . . ....................... . . ...
. ..
. . .. .. . . . .. .
.. ..
... ... .
. .
. T .
.
. .
.
. .
.
. .. .
. ..
.. ... . . . . . . . . . . .... .. .. ... . . .............................. . . ... ..
.
...
..
..
.
... . . . . . . . . . . ....
..
... . . . . . . . . . . ....
. .
.
.
..
..
.
.
...
..
..
..
.
... . . ...
..
..
. T [W ]
... . . ............................. . . ...
.. .
.. . . ..
. .
. ..
..
..
..
.. .. . . .
. . .
.. .. . . . . . . . . . . . . . .. ... . . .................... . . .. . .
.. .. .
. ..
. .. .. ..
.. .. . .
.
.
.. .. .. ..
.. ... . . . . . . . . . ... ..
. .. .. . . ............................... . . . ... ..
.. .. . .. .. .. . . . ..
.. .. . . . . . . . . . ... .. .. .. . . ............................ . . . ... ..
.. .. .. . ...... ..
.. .. .. .. .. .............................. ..
.. ..
.. .... . . . . . . . .... ..
. . . .. ..
.... . .. . . . . . . . . .. ..
.. ....
.... . . . . . . .......
.
.. .......... ... ....
.... . . . . . . ......
.
.. .............. .
..
.. .... . .. ................. .. .... .. ..
.. ..... .... ............... .. ..... .... ..
..
... ....... . ................................... ... ..
.. ...... . . . .........
................... ..
.................... ...
.... ....
. .... .
....
.... .... .... ...
.... ... .... ....
.... .... .... ....
.... .... .... ....
..... ..... ..... ....
.....
....... .... ..... ......
.. .. .......
.
........................... . ......... .
......................
118 Chapter 11. Diagonalization and Jordan Canonical Forms
Example 11.3.2
1. Let T : R3 ! R3 be the linear operator de ned by T ((x; y; z )) = (y; x; z ) for (x; y; z ) 2
R3. Note that T is a rotation about the z-axis. (See Section 7.3.)
(a) Let W1 be the xy -plane in R3 , i.e. W1 = f(x; y; 0) j x; y 2 Rg. Since for all (x; y; 0) 2
W1 ,
T ((x; y; 0)) = (y; x; 0) 2 W1 ;
W1 is T -invariant. The restriction T jW1 of T on W1 is equivalent a rotation de ned
on the xy -plane.
(b) Let W2 be the z -axis in R3 , i.e. W2 = f(0; 0; z ) j z 2 Rg. Since for all (0; 0; z ) 2 W2 ,
T ((0; 0; z )) = (0; 0; z ) 2 W2 ;
W2 is T -invariant. The restriction T jW2 of T on W2 is the identity operator.
(c) Let W3 be the yz -plane in R3 , i.e. W3 = f(0; y; z ) j y; z 2 Rg. It is not T -invariant.
For example, (0; 1; 1) 2 W3 but T ((0; 1; 1)) = (1; 0; 1) 2= W3 .
Let E = fe1 ; e2 ; e3 g be the standard basis for R3 , where e1 = (1; 0; 0), e2 = (0; 1; 0),
e3 = (0; 0; 1), and let C = fe1; e2g and D = fe3g which are bases for W1 and W2
respectively. Then
0 1
0 1 0
[T ]E = [T (e1 )]E [T (e2 )]E [T (e3 )]E = B
@
C
1 0 0A ;
0 0 1
!
0 1
[T jW1 ]C = [T (e1 )]C [T (e2 )]C = ;
1 0
[T jW2 ]D = [T (e3 )]D = ( 1 )
and
cT (x) = c[T ]E (x) = (x 1)(x2 + 1);
cT jW1 (x) = c[T jW1 ]C (x) = x2 + 1;
cT jW2 (x) = c[T jW2 ]D (x) = x 1:
!
[T jW1 ]C 0
Note that R3
= W1 W2 , [T ]E = and cT (x) = cT jW1(x) cT jW2(x).
0 [T jW2 ]D
(See Discussion 11.3.12.)
2. Let T be a linear operator on a vector space V over a eld F. Suppose T has an eigenvector
v, i.e. v is a nonzero vector in V such that T (v) = v for some 2 F. Let U = spanfvg.
For any u 2 U , u = av , for some a 2 F, and hence
T (u) = T (av) = aT (v) = av 2 U:
Section 11.3. Invariant Subspaces 119
So U is T -invariant. Using B = fv g as a basis for U , we have [T jU ]B = [T (v )]B = ( )
and hence cT jU (x) = c[T jU ]B (x) = x .
3. Let T be a linear operator on a vector space V over a eld F. Take any u 2 V . De ne
Note that T 3 (u) = T (T 2 (u)) 2 spanfT (u); T 2 (u)g spanfu; T (u)g. Repeating the
process, we can show that T m (u) 2 spanfu; T (u)g for all m 2. Thus the T -cyclic
subspace of V generated by u is
Discussion 11.3.4 Let T be a linear operator on a nite dimensional vector space V over a
eld F. Suppose W is a T -invariant subspace of V with dim(W ) 1. Let dim(V ) = n and
dim(W ) = m.
Take an ordered basis C = fv1 ; v2 ; : : : ; vm g for W . For each j = 1; 2; : : : ; m, since T (vj ) 2 W ,
So 0 1
0 2 2 2
B C
B
B 1 3 5
3
4
3
C
C
[T ]B = B C:
B
@
0 0 5
3
4
3
C
A
0 0 1
3
1
3
Take the standard basis E = f(1; 0; 0; 0); (0; 1; 0; 0); (0; 0; 1; 0); (0; 0; 0; 1)g for R4 . Let
0 1 0 1
4 2 2 2 1 0 0 0
B C B C
B 1 2 1 0C B 2 3 0 0C
A = [T ]E = B
B 1 1 0 0C
C and P = [IR ]E;B = 4 B
B 0 1 1
C:
0C
@ A @ A
3 2 2 1 0 1 0 1
0 1
0 2 2 2
B C
B1 3 5 4C
Then P AP = [T ]B = BB 3 3C
1
C.
B0 0 5 4 C
@ 3 3 A
0 0 1
3
1
3
!
Lemma 11.3.6 Let D be a quare matrix such that D =
A B where both A and C are
0 C
square matrices. Then det(D) = det(A) det(C ).
Proof See Question 10.19.
Theorem 11.3.7 Let T be a linear operator on a nite dimensional vector space V . Suppose
W is a T -invariant subspace of V with dim(W ) 1. Then the characteristic polynomial of T
is divisible by the characteristic polynomial of T jW .
Proof Using the notation in Discussion 11.3.4,
cT (x) = c[T ]B (x) = det(xIn [T ]B )
xI A1 A2
= m
0 xIn m A3
= det(xIm A1 ) det(xIn m A3 ) (see Lemma 11.3.6)
= cA1 (x) cA3 (x):
Example 11.3.8 In Example 11.3.5, cT (x) = (x 1)3 (x 2) and cT jW (x) = (x 1)(x 2).
(Check them.) It is obvious that cT (x) is divisible by cT jW (x).
Theorem 11.3.10 Let T be a linear operator on a vector space V over a eld F. Take
a nonzero vector u 2 V . Suppose the T -cyclic subspace W = spanfu; T (u); T 2 (u); : : : g
generated by u is nite dimensional.
1. The dimension of W is equal to the smallest positive integer k such that T k (u) is a linear
combination of u; T (u); : : : ; T k 1 (u).
2. Suppose dim(W ) = k.
Proof Let k be the smallest positive integer such that T k (u) 2 spanfu; T (u); : : : ; T k 1 (u)g.
Assume T m (u) 2 spanfu; T (u); : : : ; T k 1 (u)g for m 1. Then
T m+1 (u) = T (T m (u)) 2 spanfT (u); T 2 (u); : : : ; T k (u)g spanfu; T (u); : : : ; T k 1 (u)g:
By mathematical induction, we have shown that T n (u) 2 spanfu; T (u); : : : ; T k 1 (u)g for all
positive integer n. Thus W = spanfu; T (u); : : : ; T k 1 (u)g.
We claim that fu; T (u); : : : ; T k 1 (u)g is linearly independent: Assume the opposite, i.e. there
exists c0 ; c1 ; : : : ; ck 1 2 F such that c0 u + c1 T (u) + + ck 1 T k 1 (u) = 0 and not all ci 's are
zero. Let j = maxfi j 0 i k 1 and ci 6= 0g, i.e. c0 u + c1 T (u) + + cj T j (u) = 0 and
cj 6= 0. Since u is a nonzero vector, j > 0. So we can write
x a0
1 x 0 a1
1 x a2
cT jW (x) = det(xIW T jW ) = ... ... ..
.
... x ak 2
0
1 x ak 1
= a0 a1 x ak xk
1
1
+ xk :
Discussion 11.3.12 Let T be a linear operator on a nite dimensional vector space V over a
eld F. Suppose
V = W1 W2 Wk
where W1 ; W2 ; : : : ; Wk are T -invariant subspaces of V with dim(Wt ) = nt 1 for t = 1; 2; : : : k.
For each t, let Ct = fv1(t) ; v2(t) ; : : : ; vn(tt) g be an ordered basis for Wt . As Wt is T -invariant, for
j = 1; 2; : : : ; nt , T (vj(t) ) 2 Wt and hence
B = C1 [ C2 [ [ Ck
= fv1(1) ; v2(1) ; : : : ; vn(1)
1
; v1(2) ; v2(2) ; : : : ; vn(2)2 ; : : : : : : ; v1(k) ; v2(k) ; : : : ; vn(kk) g
is a basis for V .
Using B as an ordered basis with the order shown above,
[T ]B = [T (v1(1) )]B [T (vn(1)1 )]B [T (v1(2))]B [T (vn(2)2 )]B [T (v1(k))]B [T (vn(kk))]B
0 1
A1 0 0
=
B
B
B
0 A2 0 C
C
C
B ... C
@ A
0 0 Ak
where At = a(ijt) = [T jWt ]Ct for t = 1; 2; : : : ; k. Furthermore,
nt nt
cT (x) = cA1 (x) cA2 (x) cAk (x) = cT jW1(x) cT jW2(x) cT jWk(x);
i.e. the characteristic polynomial of T is the product of the characteristic polynomials of T jWt
for t = 1; 2; : : : ; k.
Furthermore, cT jW1 (x) = (x 1)(x 2), cT jW2 (x) = (x 1)2 and cT (x) = (x 1)3 (x 2) =
cT jW1 (x) cT jW2 (x).
Take the standard basis E = f(1; 0; 0; 0); (0; 1; 0; 0); (0; 0; 1; 0); (0; 0; 0; 1)g for R4 . Let
0 1 0 1
4 2 2 2 1 0 0 0
B C B C
B 1 2 1 0C B 2 3 1 0C
A = [T ]E = B
B 1 1 0 0C
C and Q = [IR ]E;B =
4 B
B 0 1 1 1C
C:
@ A @ A
3 2 2 1 0 1 2 1
0 1
0 2 0 0
B C
Then Q 1 B1
AQ = [T ]B = B 3 0 0CC.
B0 0 3 1C
@ A
0 0 4 1
Notation 11.4.1 Let F be a eld and let p(x) = a0 +a1 x+ +am xm where a0 ; a1 ; : : : ; am 2 F.
1. For a linear operator T on a vector space V over F, we use p(T ) to denote the linear
operator a0 IV + a1 T + + am T m on V .
2. For an n n matrix A over F, we use p(A) to denote the n n matrix a In + a A + 0 1
+ a m Am .
Lemma 11.4.2 Let F be a eld and let p(x) = a0 + a1 x + + am xm where a0 ; a1 ; : : : ; am 2 F.
Suppose T is a linear operator on a vector space V over F and A is an n n matrix over F.
126 Chapter 11. Diagonalization and Jordan Canonical Forms
1. Suppose V is nite dimensional where dim(V ) = n 1. For any ordered basis B for V ,
2. Consider the linear operator LA on Fn de ned in Example 9.1.4.1, i.e. LA (u) = Au for
u 2 Fn where vectors in Fn are written as column vectors. Then
p(LA ) = Lp(A)
where Lp(A) is the linear operator on Fn de ned by Lp(A) (u) = p(A)u for u 2 Fn .
4. Suppose r(x) and s(x) are polynomials over F such that p(x) = r(x) s(x). Then
p(T ) = r(T ) s(T ) = s(T ) r(T ) and q(A) = r(A) s(A) = s(A) r(A):
Proof
1. By Corollary 9.3.6 and Proposition 9.4.3,
0 1
1 0 1
Example 11.4.3 Let A = @0 1
B C
1 A be a real matrix.
1 1 2
Consider the linear operator LA : R3 ! R3 as de ned in Example 9.1.4.1, i.e. LA (u) = Au for
u 2 R3 where vectors in R3 are written as column vectors.
1. Find the matrix p(A) and the linear operator p(LA ) where p(x) = 4 + 8x 5x2 .
Solution We have
p(A) = 4I3 + 8A 5A2
0 1 0 1 0 12 0 1
1 0 0 1 0 1 1 0 1 4 5 7
B C B C B C B C
= 4@0 1 0A + 8@0 1 1A 5@0 1 1 A =@ 5 6 7A
0 0 1 1 1 2 1 1 2 7 7 8
and by Lemma 11.4.2.2, p(LA ) = Lp(A) , i.e.
00 11 0 1 0 10 1 0 1
x x 4 5 7 x x
p(LA )BB CC
@@y AA = p (A )B C B
@y A @ 5
= 6
CB C
7A@y A for
B C
@y A 2R : 3
z z 7 7 8 z z
2. Note that cLA (x) = cA (x) = (x 2)(x 1)2 . Find the matrix cA (A) and the linear
operator cLA (LA ).
Solution We have
cA (A) = (A 2I3 )(A I3) 2
20 1 0 13 20 1 0 132 0 1
1 0 1 1 0 0 1 0 1 1 0 0 0 0 0
=6B
4@0 1 1A
C
2B C7 6B
@0 1 0A5 4@0 1 1A
C B C7
@0 1 0A5 = B C
@0 0 0A :
1 1 2 0 0 1 1 1 2 0 0 1 0 0 0
By Remark 11.1.8, cLA (x) = cA (x) and by Lemma 11.4.2.2, for any polynomial p(x),
p(LA )(u) = p(A)u for all u 2 R3 . We have
00 11 00 11
x x
cLA (LA )BB CC
@@y AA = c L BB CC
A A @@y AA
( )
z z
0 1 0 10 1 0 1 0 1
x 0 0 0 x 0 x
= cA (A)@y A = @0 0 0A@y A = @0A for
B C B CB C B C B C
@y A 2R : 3
z 0 0 0 z 0 z
Thus cLA (LA ) = OR3 , the zero operator on R3 .
128 Chapter 11. Diagonalization and Jordan Canonical Forms
Proof Since Part 2 follows by applying Part 1 to the linear operator T = LA , we only need to
prove Part 1. To prove that cT (T ) = OV , we need to show that cT (T )(u) = 0 for all u 2 V .
Take any u 2 V . If u = 0, then it is obvious that cT (T )(u) = 0. Suppose u 6= 0. Let W =
spanfu; T (u); T 2 (u); : : : g. Note that W is a T -invariant subspace of V . Suppose dim(W ) = k.
By Theorem 11.3.10, B = fu; T (u); : : : ; T k 1 (u)g is a basis for W . Since T k (u) 2 W , there
exists a0 ; a1 ; : : : ; ak 1 such that
cT jW (x) = a0 a1 x + ak 1 xk 1 + xk : (11.2)
On the other hand, since W is a T -invariant subspace of V , by Theorem 11.3.7, the characteristic
polynomial of T is divisible by the characteristic polynomial of T jW , i.e.
Then
x 0 2
cT (x) = cA (x) = 1 x 2 1 = 4 + 8x 5x2 + x3 :
1 0 x 3
By the Cayley-Hamilton Theorem (Theorem 11.4.4), cT (T ) = OP2 (R) and cA (A) = 0, i.e.
Example 11.5.1 Consider the linear operator T on P2 (R) and the square matrix A = [T ]B
de ned in Example 11.4.5. By the Cayley-Hamilton Theorem (Theorem 11.4.4), cT (T ) = OP2 (R)
and cA (A) = 0. Now, let p(x) = 2 3x + x2 . We have
p(A) = 2I3 3A + A2
0 1 0 1 0 12 0 1
1 0 0 0 0 2 0 0 2 0 0 0
= 2@0 1 0C
B
A
B
3@1 2 C B
1 A + @1 2
C B C
1 A = @0 0 0A :
0 0 1 1 0 3 1 0 3 0 0 0
Thus p(T ) = OP2 (R) . Note that p(x) is a factor of cT (x), in fact, cT (x) = (x 2)p(x).
Remark 11.5.3 Let A be a square matrix. Similar to De nition 11.5.2.2, we can de ne the
minimal polynomial of A accordingly, i.e. the minimal polynomial mA (x) of A is the monic
polynomial p(x) of smallest degree such that p(A) = 0.
Note that if T is a linear operator on a nite dimensional vector space V , where dim(V ) 1,
and B is an ordered basis for V , then mT (x) = m[T ]B (x).
In this section, we mainly study the properties of minimal polynomials of linear operators.
However, most results can be restated in terms of minimal polynomials of square matrices. See
Question 11.33.
Example 11.5.4
1. For any nite dimensional vector space V with dim(V ) 1, mOV (x) = x; and for the
n n zero matrix 0, m0 (x) = x. (Why?)
2. In Example 11.4.3, cLA (x) = mLA (x) = (x 2)(x 1)2 .
3. In Example 11.4.5, cT (x) = 4+8x 5x2 +x3 = (x 2)2 (x 1) while mT (x) = 2 3x+x2 =
(x 2)(x 1).
In general, it is not easy to nd the minimal polynomial of a linear operator. In the mean time,
we can only nd them by trial-and-error. However, the next lemma will give us some idea to
guess how the minimal polynomial looks like.
Lemma 11.5.5 Let T be a linear operator on a nite dimensional vector space V over a eld
F where dim(V ) 1.
1. Let p(x) be a polynomial over F. Then p(T ) = OV if and only if p(x) is divisible by the
minimal polynomial of T .
2. If W is a T -invariant subspace of V with dim(W ) 1, then the minimal polynomial of T
is divisible by the minimal polynomial of T jW .
3. Suppose is an eigenvalue of T such that (x )r strictly divides cT (x), i.e. cT (x) =
(x )r q (x) where q (x) is a polynomial over F which is not divisible by x . Then
Proof
1. ()) Suppose p(x) is a polynomial over F such that p(T ) = OV . By the Division Algo-
rithm, we have polynomials v (x) and w(x) over F such that
If w(x) = 0, then p(x) = v (x)mT (x) and hence p(x) is divisible by mT (x).
Assume w(x) 6= 0. For any u 2 V ,
0 = p(T )(u) = (v(T ) mT (T ) + w(T ))(u)
= (v (T ) mT (T ))(u) + w(T )(u)
= v (T )(mT (T )(u)) + w(T )(u)
= v (T )(0) + w(T )(u) = 0 + w(T )(u) = w(T )(u):
Thus w(T ) = OV and contracts that mT (x) is chosen to have the least degree.
(() Suppose p(x) = t(x)mT (x) for some polynomial t(x) over F. Then for all u 2 V ,
p(T )(u) = (t(T ) mT (T ))(u) = t(T )(mT (T )(u)) = t(T )(0) = 0:
So p(T ) = OV .
2. Since mT(T) = OV, by Lemma 11.4.2.3, mT(T|W) = mT(T)|W = OV|W = OW. By Part 1, mT(x) is divisible by m_{T|W}(x).
3. By the Cayley-Hamilton Theorem (Theorem 11.4.4), cT(T) = OV and hence by Part 1, cT(x) is divisible by mT(x). Thus

    mT(x) = (x - λ)^s q1(x)

where 0 ≤ s ≤ r and q1(x) divides q(x). We only need to show that s ≥ 1.
Recall that λ is an eigenvalue of T. Take an eigenvector v associated with λ. Define W1 = span{v}. By Example 11.3.2.2, W1 is a T-invariant subspace of V and c_{T|W1}(x) = x - λ. So

    T|W1 - λ I_{W1} = O_{W1}.

The only monic polynomial with degree less than that of x - λ is the constant polynomial 1(x) = 1, and 1(T|W1) = I_{W1} ≠ O_{W1}. So x - λ is the minimal polynomial of T|W1, i.e. m_{T|W1}(x) = x - λ. By Part 2, mT(x) is divisible by m_{T|W1}(x) = x - λ and hence s ≥ 1.
Example 11.5.6 Let

    A = ( 2 0 0 ; 0 2 1 ; 0 0 2 )

be a real matrix. Find the minimal polynomial of A.

Solution The characteristic polynomial of A is cA(x) = (x - 2)^3. By the matrix version of Lemma 11.5.5.3 (see Question 11.33), the minimal polynomial of A is mA(x) = (x - 2)^s where s = 1, 2 or 3. Note that s is the smallest positive integer such that (A - 2I)^s = 0. As

    A - 2I = ( 0 0 0 ; 0 0 1 ; 0 0 0 ) ≠ 0  and  (A - 2I)^2 = ( 0 0 0 ; 0 0 1 ; 0 0 0 )^2 = 0,

mA(x) = (x - 2)^2.
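Since mA(x) must be (x - 2)^s here, the trial-and-error search is mechanical. A minimal sympy sketch of the check above (the loop bound 3 is the algebraic multiplicity of the eigenvalue 2):

    import sympy as sp

    A = sp.Matrix([[2, 0, 0], [0, 2, 1], [0, 0, 2]])
    N = A - 2*sp.eye(3)
    # smallest s with (A - 2I)^s = 0 gives mA(x) = (x - 2)^s
    s = next(k for k in range(1, 4) if N**k == sp.zeros(3, 3))
    print(s)  # 2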
Theorem 11.5.7 Let T be a linear operator on a vector space V. Suppose W1 and W2 are T-invariant subspaces of V. Then W1 + W2 is T-invariant; moreover, if W1 and W2 are finite dimensional with dim(W1) ≥ 1 and dim(W2) ≥ 1, then m_{T|W1+W2}(x) is equal to the least common multiple of m_{T|W1}(x) and m_{T|W2}(x).
Proof The proof is left as an exercise. See Question 11.38.
Theorem 11.5.8 Let T be a linear operator on a finite dimensional vector space V where dim(V) ≥ 1. Suppose

    cT(x) = (x - λ1)^{r1} (x - λ2)^{r2} ··· (x - λk)^{rk}

where λ1, λ2, ..., λk are distinct eigenvalues of T. Then
1. E_{λi}(T) ⊆ K_{λi}(T),
5. dim(K_{λi}(T)) = ri.

Proof The formula of mT(x) in (11.3) follows easily from Lemma 11.5.5.3. The proof of (11.4) is left as an exercise. (See Question 11.40.)
Example 11.5.9
1. Consider the linear operator LA in Example 11.3.5 and Example 11.4.3. We know that m_{LA}(x) = c_{LA}(x) = (x - 2)(x - 1)^2. Then

    (x, y, z)^T ∈ K_2(LA) ⟺ ( 1-2 0 -1 ; 0 1-2 1 ; 1 1 2-2 )(x, y, z)^T = (0, 0, 0)^T
                          ⟺ (x, y, z)^T = t(-1, 1, 1)^T where t ∈ R

and

    (x, y, z)^T ∈ K_1(LA) ⟺ ( 1-1 0 -1 ; 0 1-1 1 ; 1 1 2-1 )^2 (x, y, z)^T = (0, 0, 0)^T
                          ⟺ ( -1 -1 -1 ; 1 1 1 ; 1 1 1 )(x, y, z)^T = (0, 0, 0)^T
                          ⟺ (x, y, z)^T = s(-1, 0, 1)^T + t(-1, 1, 0)^T where s, t ∈ R.

So K_2(LA) = span{(-1, 1, 1)^T} and K_1(LA) = span{(-1, 0, 1)^T, (-1, 1, 0)^T}. Note that E_2(LA) = K_2(LA) but E_1(LA) ⊊ K_1(LA).
With B = {(-1, 1, 1)^T, (-1, 0, 1)^T, (-1, 1, 0)^T},

    [LA]B = ( 2 0 0 ; 0 1 0 ; 0 1 1 ).

(See Theorem 11.5.8.)
2. Consider the linear operator T in Example 11.4.5. We know that mT(x) = (x - 2)(x - 1) and cT(x) = (x - 2)^2 (x - 1). Then

    a + bx + cx^2 ∈ K_2(T) ⟺ ( 0-2 0 -2 ; 1 2-2 1 ; 1 0 3-2 )(a, b, c)^T = (0, 0, 0)^T
                           ⟺ (a, b, c)^T = s(-1, 0, 1)^T + t(0, 1, 0)^T where s, t ∈ R

and

    a + bx + cx^2 ∈ K_1(T) ⟺ ( 0-1 0 -2 ; 1 2-1 1 ; 1 0 3-1 )(a, b, c)^T = (0, 0, 0)^T
                           ⟺ (a, b, c)^T = t(-2, 1, 1)^T where t ∈ R.
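Because mT(x) = (x - 2)(x - 1) has only simple factors, both generalized eigenspaces above are ordinary null spaces, so they can be cross-checked directly (a sympy sketch, using the matrix A of Example 11.4.5 as reconstructed here):

    import sympy as sp

    A = sp.Matrix([[0, 0, -2], [1, 2, 1], [1, 0, 3]])
    print((A - 2*sp.eye(3)).nullspace())  # spans (-1, 0, 1)^T and (0, 1, 0)^T
    print((A - 1*sp.eye(3)).nullspace())  # spans (-2, 1, 1)^T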
Theorem 11.5.10 Let T be a linear operator on a finite dimensional vector space V, where dim(V) ≥ 1, such that cT(x) = (x - λ1)^{r1} (x - λ2)^{r2} ··· (x - λk)^{rk} where λ1, λ2, ..., λk are distinct eigenvalues of T. (Note that r1 + r2 + ··· + rk = dim(V).)
The following are equivalent:
1. T is diagonalizable.
2. mT(x) = (x - λ1)(x - λ2) ··· (x - λk).
3. dim(E_{λi}(T)) = ri for i = 1, 2, ..., k.
4. V = E_{λ1}(T) ⊕ E_{λ2}(T) ⊕ ··· ⊕ E_{λk}(T).
Proof
(1 ⇒ 2) Suppose T is diagonalizable. By Theorem 11.1.11, V has a basis B consisting of eigenvectors of T. For every u ∈ B, T(u) = λi u for some λi. Thus (T - λi IV)(u) = 0 and hence by Lemma 11.4.2.4,

    ((T - λ1 IV) ∘ (T - λ2 IV) ∘ ··· ∘ (T - λk IV))(u)
    = ((T - λ1 IV) ∘ ··· ∘ (T - λ_{i-1} IV) ∘ (T - λ_{i+1} IV) ∘ ··· ∘ (T - λk IV) ∘ (T - λi IV))(u)
    = ((T - λ1 IV) ∘ ··· ∘ (T - λ_{i-1} IV) ∘ (T - λ_{i+1} IV) ∘ ··· ∘ (T - λk IV))((T - λi IV)(u))
    = ((T - λ1 IV) ∘ ··· ∘ (T - λ_{i-1} IV) ∘ (T - λ_{i+1} IV) ∘ ··· ∘ (T - λk IV))(0)
    = 0.

This implies (T - λ1 IV) ∘ (T - λ2 IV) ∘ ··· ∘ (T - λk IV) = OV. By Lemma 11.5.5, mT(x) = (x - λ1)(x - λ2) ··· (x - λk).
(2 ⇒ 3) Suppose mT(x) = (x - λ1)(x - λ2) ··· (x - λk). For every i, K_{λi}(T) = E_{λi}(T) and hence by Theorem 11.5.8, dim(E_{λi}(T)) = dim(K_{λi}(T)) = ri.
(3 ⇒ 4) Suppose dim(E_{λi}(T)) = ri for all i. For every i, by Theorem 11.5.8, E_{λi}(T) = K_{λi}(T) and hence V = K_{λ1}(T) ⊕ K_{λ2}(T) ⊕ ··· ⊕ K_{λk}(T) = E_{λ1}(T) ⊕ E_{λ2}(T) ⊕ ··· ⊕ E_{λk}(T).
(4 ⇒ 1) For each i, let Bi be a basis for E_{λi}(T) (every vector in Bi is an eigenvector of T associated with the eigenvalue λi). As V = E_{λ1}(T) ⊕ E_{λ2}(T) ⊕ ··· ⊕ E_{λk}(T), by Theorem 8.6.7.1, B = B1 ∪ B2 ∪ ··· ∪ Bk is a basis for V and every vector in B is an eigenvector of T. Hence by Theorem 11.1.11, T is diagonalizable.
Corollary 11.5.11 Let T be a linear operator on a finite dimensional vector space V and let W be a T-invariant subspace of V with dim(W) ≥ 1. If T is diagonalizable, then T|W is also diagonalizable.

Proof Since T is diagonalizable, by Theorem 11.5.10, mT(x) = (x - λ1)(x - λ2) ··· (x - λk) where λ1, λ2, ..., λk are distinct eigenvalues of T. By Lemma 11.5.5.2, mT(x) is divisible by m_{T|W}(x) and hence m_{T|W}(x) = (x - λ_{i1})(x - λ_{i2}) ··· (x - λ_{is}) for some distinct eigenvalues λ_{i1}, λ_{i2}, ..., λ_{is} ∈ {λ1, λ2, ..., λk}. By Theorem 11.5.10, T|W is diagonalizable.
Using the standard basis E = {(1, 0, 0), (0, 1, 0), (0, 0, 1)},

    [T]E = ( 1+i 1 0 ; 0 i 1 ; -1 -1 -(1-i) ).

Note that

    cT(x) = det( x-(1+i) -1 0 ; 0 x-i -1 ; 1 1 x+(1-i) ) = (x - i)^3

and hence mT(x) = (x - i)^s where 1 ≤ s ≤ 3. Let Q = T - i IV, i.e.
Theorem 11.6.4 Let T be a linear operator on a finite dimensional vector space V over a field F where dim(V) ≥ 1. Suppose the characteristic polynomial of T can be factorized into linear factors over F. Then there exists an ordered basis B for V such that [T]B = J with

    J = diag( J_{t1}(λ1), J_{t2}(λ2), ..., J_{tm}(λm) )     (11.5)

where λ1, λ2, ..., λm are eigenvalues of T. (Note that λ1, λ2, ..., λm are not necessarily distinct.)

Proof We omit the proof because it is too technical. (See Question 11.53 and Question 11.54.)
Definition 11.6.6 For a linear operator T on a finite dimensional vector space V, if there exists an ordered basis B for V such that [T]B = J where J is a square matrix of the form stated in (11.5), we say that T has a Jordan canonical form J and J is a Jordan canonical form for T.
Similarly, for a square matrix A, if there exists an invertible matrix P such that P^{-1}AP = J, we say that A has a Jordan canonical form J and J is a Jordan canonical form for A.
Example 11.6.7 Consider the matrix A and the linear operator LA in Example 11.3.5 and Example 11.5.9.1. If we choose the ordered basis B = {(-1, 1, 1)^T, (-1, 1, 0)^T, (-1, 0, 1)^T}, then

    [LA]B = ( 2 0 0 ; 0 1 1 ; 0 0 1 ) = ( J_1(2) 0 ; 0 J_2(1) )

which is a Jordan canonical form for LA.
Let P = ( -1 -1 -1 ; 1 1 0 ; 1 0 1 ). Then P^{-1}AP = ( J_1(2) 0 ; 0 J_2(1) ) which is a Jordan canonical form for A.
Remark 11.6.8 Jordan canonical forms are not unique. But two Jordan canonical forms for a linear operator (or a matrix) have the same collection of Jordan blocks, possibly in different orders. (Actually, two matrices in Jordan canonical form are similar if and only if they have the same collection of Jordan blocks.)
In Example 11.6.7, the following are all the possible Jordan canonical forms for A:

    ( J_1(2) 0 ; 0 J_2(1) ),   ( J_2(1) 0 ; 0 J_1(2) ).

Usually, we say that the Jordan canonical form is unique up to the ordering of the Jordan blocks.
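For concrete matrices, a computer algebra system produces one representative of this similarity class; e.g. a sympy sketch for the matrix A of Example 11.6.7 (entries as reconstructed above):

    import sympy as sp

    A = sp.Matrix([[1, 0, -1], [0, 1, 1], [1, 1, 2]])
    P, J = A.jordan_form()                 # P^{-1} A P = J
    print(J)                               # blocks J_1(2) and J_2(1), in some order
    print(sp.simplify(P.inv()*A*P - J))    # zero matrix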
Proof In the following, we prove the linear operator part of the theorem. For the matrix part of the theorem, we only need to apply our arguments to the linear operator LA.
Suppose B = {v1^(1), v2^(1), ..., v_{t1}^(1), v1^(2), v2^(2), ..., v_{t2}^(2), ..., v1^(m), v2^(m), ..., v_{tm}^(m)} is an ordered basis for V such that [T]B = J.
1. Since xI - J is an upper triangular matrix,

    cT(x) = det(xI - J)
          = the product of the diagonal entries of xI - J
          = (x - λ1)^{t1} (x - λ2)^{t2} ··· (x - λm)^{tm}.
Example 11.6.10
1. Suppose a complex square matrix A has a Jordan canonical form given by

    J = diag( J_3(i), J_2(4), J_2(4), J_1(4) )

      = ( i 1 0 0 0 0 0 0
          0 i 1 0 0 0 0 0
          0 0 i 0 0 0 0 0
          0 0 0 4 1 0 0 0
          0 0 0 0 4 0 0 0
          0 0 0 0 0 4 1 0
          0 0 0 0 0 0 4 0
          0 0 0 0 0 0 0 4 ).
Solution Let J be a Jordan canonical form for A. Since cA(x) = (x - 1)^3 (x - 2)^2, along the diagonal of J there are three 1's and two 2's. As mA(x) = (x - 1)^2 (x - 2), the largest Jordan block associated with 1 has order 2 and the largest Jordan block associated with 2 has order 1. So J must be similar to

    diag( J_2(1), J_1(1), J_1(2), J_1(2) ) = ( 1 1 0 0 0
                                               0 1 0 0 0
                                               0 0 1 0 0
                                               0 0 0 2 0
                                               0 0 0 0 2 ).
Remark 11.6.11 The simplest form that a square matrix can be reduced to (i.e. is similar to) is a Jordan canonical form. However, in practice, we seldom use Jordan canonical forms because they are very sensitive to computational errors. For example, a Jordan canonical form for the matrix

    ( 0 1 ; ε 0 )

is ( √ε 0 ; 0 -√ε ) if ε ≠ 0, and it is ( 0 1 ; 0 0 ) if ε = 0. So small truncation errors during computation may end up with dramatic differences in Jordan canonical forms. So mathematicians working in numerical analysis prefer some other canonical forms which are not so simple but are more stable in computation.
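The instability is easy to see numerically: the eigenvalues, and hence the Jordan structure, of the matrix above jump as soon as ε leaves 0. A small numpy illustration:

    import numpy as np

    for eps in [0.0, 1e-16, 1e-8]:
        A = np.array([[0.0, 1.0], [eps, 0.0]])
        print(eps, np.linalg.eigvals(A))
    # eps = 0 gives the nilpotent block J_2(0); any eps > 0 gives two
    # distinct eigenvalues +-sqrt(eps) and a diagonal Jordan form.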
Exercise 11
1. For each of the following linear operators T on V, (i) determine whether T is diagonalizable; and (ii) if T is diagonalizable, find an ordered basis B for V such that [T]B is a diagonal matrix.
(d) V = Pn(R) and T(p(x)) = d(p(x))/dx for p(x) ∈ V.
(e) V = Pn(R) and T(p(x)) = d(xp(x))/dx for p(x) ∈ V.
2. Let A = ( 1 2 ; 2 1 ) be a real matrix.
(a) Find an invertible 2 × 2 real matrix P so that P^{-1}AP is a diagonal matrix.
3. Let S and T be linear operators on the same finite dimensional vector space. Suppose S is diagonalizable.
4. (a) Prove that two similar square matrices have the same characteristic polynomial.
(b) Let A, B ∈ M_{n×n}(F) where F is a field.
(i) Show that the following two 2n × 2n matrices

    X = ( AB 0_{n×n} ; B 0_{n×n} )   and   Y = ( 0_{n×n} 0_{n×n} ; B BA )

are similar.
(ii) Hence, or otherwise, prove that AB and BA have the same characteristic polynomial.
(c) Restate the result of Part (b)(ii) in terms of linear operators.
5. Let V be a finite dimensional vector space over a field F and T a linear operator on V such that T^2 = IV.
(a) Let T be a linear operator on a finite dimensional vector space V over a field F. Prove that T is triangularizable if and only if the characteristic polynomial of T can be factorized into linear factors over F.
(b) For each of the following linear operators T on V, determine whether T is triangularizable.
(i) V = F_2^2 and T((x, y)) = (x + y, y) for all (x, y) ∈ V.
(ii) V = P2(R) and T(a + bx + cx^2) = (a + b) + (b - a)x + (c - a)x^2 for all a + bx + cx^2 ∈ V.
(iii) V = M_{n×n}(C) and T(X) = AX for X ∈ V, where A is a complex n × n matrix.
12. For each of the following linear operators T on V and subspaces W of V, determine whether W is T-invariant.
(a) V = C^3, W = {(a, b, a + b) | a, b ∈ C} and T((a, b, c)) = (b + ic, c + ia, a + ib) for (a, b, c) ∈ V.
(b) V = C^3, W = span{(1, 1, 1)} and T((a, b, c)) = (b + ic, c + ia, a + ib) for (a, b, c) ∈ V.
(c) V = C^3, W = {(x, y, z) ∈ C^3 | x + y + z = 0} and T((a, b, c)) = (b + ic, c + ia, a + ib) for (a, b, c) ∈ V.
(d) V = P(R), W = P3(R) and T(p(x)) = dp(x)/dx for p(x) ∈ V.
(e) V = P(R), W = P3(R) and T(p(x)) = ∫_0^x p(t) dt for p(x) ∈ V.
(f) V = M_{2×2}(R), W = {A ∈ V | A^T = -A} and T(X) = ( 0 1 ; -1 0 ) X for X ∈ V.
(g) V = M_{2×2}(R), W = {A ∈ V | A^T = A} and T(X) = X^T for X ∈ V.
13. Consider the linear operators T defined in Parts (a), (b), (c) of Question 11.12.
(i) Compute cT(x).
(ii) For each of Parts (a), (b), (c) of Question 11.12, if W is T-invariant, compute c_{T|W}(x) and verify that cT(x) is divisible by c_{T|W}(x).
(c) for any scalar c, W is (cT)-invariant and (cT)|W = c(T|W).
15. Let V be a real vector space which has a basis B = {v1, v2, v3, v4}. Suppose T is a linear operator on V such that

    [T]B = ( 1 1 1 1 ; 2 1 2 1 ; 1 1 3 1 ; 1 1 1 1 ).
16. Let T be a linear operator on a vector space V. Suppose u and v are two eigenvectors of T associated with eigenvalues λ and μ, respectively, where λ ≠ μ. Let W = span{u, v} and B = {w, T(w)} where w = u + v.
(a) Show that W is a T-invariant subspace of V.
(b) Prove that B is a basis for W.
(c) Find c_{T|W}(x) and [T|W]B.
17. Let S be the shift operator on R^N, i.e. S((an)_{n∈N}) = (a_{n+1})_{n∈N} for (an)_{n∈N} ∈ R^N, and let
19. Let A ∈ M_{n×n}(F) where F is a field. Define T to be a linear operator on M_{n×n}(F) such that T(X) = AX for X ∈ M_{n×n}(F).
Let Eij, i, j = 1, 2, ..., n, be the n × n matrices as defined in Example 8.4.6.7. For q = 1, 2, ..., n, define Wq = span(Bq) where Bq = {E_{1q}, E_{2q}, ..., E_{nq}}.
(a) Is M_{n×n}(F) = W1 ⊕ W2 ⊕ ··· ⊕ Wn?
(b) Prove that Wq is T-invariant. What is [T|Wq]_{Bq}?
(c) Let B = {E11, E21, ..., E_{n1}, E12, E22, ..., E_{n2}, ..., E_{1n}, E_{2n}, ..., E_{nn}}. Write down [T]B and hence prove that cT(x) = [cA(x)]^n.
(d) Prove that T is diagonalizable if and only if A is diagonalizable.
20. Let W1 and W2 be subspaces of a vector space V such that V = W1 ⊕ W2 and let P : V → V be the projection on W1 along W2 defined in Question 9.6. For a linear operator T on V, prove that T ∘ P = P ∘ T if and only if both W1 and W2 are T-invariant. (Hint: You need the results proved in Question 9.6.)
22. Let T be a linear operator on a finite dimensional vector space V where dim(V) = n. Prove that T is triangularizable if and only if there exist T-invariant subspaces W1, W2, ..., Wn of V such that W1 ⊆ W2 ⊆ ··· ⊆ Wn = V and dim(Wj) = j for j = 1, 2, ..., n. (See Question 11.11 for the definition of "triangularizable".)
24. (a) Let S and T be linear operators on a vector space V such that S ∘ T = T ∘ S. Prove that E_λ(T) is S-invariant where λ is an eigenvalue of T.
(b) Let S and T be linear operators on a finite dimensional vector space V, where dim(V) ≥ 1, over a field F such that S ∘ T = T ∘ S. If the characteristic polynomials of S and T can both be factorized into linear factors over F, prove that there exists an ordered basis B for V such that both [S]B and [T]B are upper triangular matrices.
(c) Restate the result in Part (b) using square matrices.
25. Let T be a linear operator on a vector space V over a field F and p(x) a polynomial over F. Prove that Ker(p(T)) and R(p(T)) are T-invariant.
26. Let F be a field and p(x), q(x) two polynomials over F. The greatest common divisor (or the highest common factor) of p(x) and q(x), denoted by gcd(p(x), q(x)), is the monic polynomial of highest degree which divides both p(x) and q(x). The following algorithm, called the Euclidean Algorithm, is used to find gcd(p(x), q(x)):
Assume deg(p(x)) ≥ deg(q(x)).

    r_{t-1}(x) = q_t(x) r_t(x) + r_{t+1}(x) and deg(r_{t+1}(x)) < deg(r_t(x))

for some polynomial q_t(x) over F.
Step 3: If r_{t+1}(x) ≠ 0, increase the value of t by 1 and go to Step 2.
Step 4: Now, r_{t+1}(x) = 0. Let c be the coefficient of the term of highest degree in r_t(x). Then gcd(p(x), q(x)) = c^{-1} r_t(x).
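The algorithm is a direct loop on polynomial division with remainder. A sketch in Python using sympy (the helper name poly_gcd is ours, not from the text):

    import sympy as sp

    def poly_gcd(p, q, x):
        # Euclidean Algorithm: repeat r_{t-1} = q_t r_t + r_{t+1} until the
        # remainder vanishes, then make the last nonzero remainder monic.
        r_prev, r = sp.Poly(p, x), sp.Poly(q, x)
        while not r.is_zero:
            r_prev, r = r, r_prev.rem(r)
        return r_prev.monic().as_expr()

    x = sp.symbols('x')
    print(poly_gcd(x**3 - x, x**2 - 1, x))  # x**2 - 1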
(a) For each of the following cases, use the Euclidean Algorithm to find gcd(p(x), q(x)).
(i) F = R, p(x) = x^5 - 3x^4 + 2x^3 - 2x^2 - 2x + 1 and q(x) = x^4 - x^3 + 7x^2 - 2x.
(ii) F = F_2, p(x) = x^5 + x^4 + 1 and q(x) = x^5 + x^2 + x + 1.
(b) Prove that the polynomial c^{-1} r_t(x) in Step 4 is the greatest common divisor of p(x) and q(x).
(c) Prove that there exist polynomials a(x) and b(x) over F such that a(x)p(x) + b(x)q(x) = gcd(p(x), q(x)).
27. Let T be a linear operator on a vector space V over a field F and let p(x), q(x) be polynomials over F such that gcd(p(x), q(x)) = 1.
(a) Prove that Ker(q(T)) ⊆ R(p(T)).
(b) Prove that Ker(p(T)) ∩ Ker(q(T)) = {0}.
(Hint: Use the result of Question 11.26(c).)
29. Define p(x) = det(xB - C) where B and C are two n × n matrices. Suppose there exists an n × n matrix A such that AB = C. Prove that p(A) = 0.
30. Let A = ( 1 1 1 1 ; 1 2 1 0 ; 0 0 1 0 ; 1 0 0 1 ) be a real matrix.
(a) Find the characteristic polynomial of A.
(b) Find a real polynomial p(x) such that A^{-1} = p(A).
31. Let A be a nonzero n × n matrix over a field F. Prove that if A is invertible, then there is a polynomial p(x) over F such that A^{-1} = p(A).
37. (a) Prove that similar square matrices have the same minimal polynomial.
(b) Suppose two square matrices have the same minimal polynomial. Is it true that they must be similar?
39. (a) Let S and T be two diagonalizable linear operators on a finite dimensional vector space V. Show that there exists an ordered basis B for V such that [S]B and [T]B are diagonal matrices if and only if S ∘ T = T ∘ S.
(b) Let A and B be two diagonalizable matrices in M_{n×n}(F). Show that there exists an invertible matrix P ∈ M_{n×n}(F) such that P^{-1}AP and P^{-1}BP are diagonal matrices if and only if AB = BA.
40. Let T be a linear operator on a finite dimensional vector space V over a field F where dim(V) ≥ 1.
(a) Suppose mT(x) = p(x) q(x) where p(x) and q(x) are polynomials over F such that deg(p(x)) ≥ 1, deg(q(x)) ≥ 1 and gcd(p(x), q(x)) = 1.
(i) Prove that R(p(T)) = Ker(q(T)).
(ii) Prove that V = Ker(p(T)) ⊕ Ker(q(T)).
(Hint: Use the results of Question 11.27.)
(b) Complete the proof of Theorem 11.5.8:
Prove that V = K_{λ1}(T) ⊕ K_{λ2}(T) ⊕ ··· ⊕ K_{λk}(T) where λ1, λ2, ..., λk are all the (distinct) eigenvalues of T.
42. Let V be a real vector space of dimension 5 and let T be a linear operator on V. Suppose that the minimal polynomial of T is

    mT(x) = (x - 1)(x - 2)^2.

(i) List all possible (non-similar) Jordan canonical forms for T.
(ii) For each possible Jordan canonical form, write down the characteristic polynomial cT(x) of T.
44. Let T be a linear operator on a real vector space such that T has a Jordan canonical form

    J = diag( J_3(2), J_2(2), J_2(3) ).

(a) Write down cT(x) and mT(x).
(b) Write down nullity(T - 2IV) and find nullity((T - 2IV)^2).
45. Let T be a linear operator on a finite dimensional real vector space V with an ordered basis C = {v1, v2, ..., v11} such that

    [T]C = ( 2 1 0 0 0 0 0 0 0 0 0
             0 2 1 0 0 0 0 0 0 0 0
             0 0 2 0 0 0 0 0 0 0 0
             0 0 0 2 1 0 0 0 0 0 0
             0 0 0 0 2 0 0 0 0 0 0
             0 0 0 0 0 8 1 0 0 0 0
             0 0 0 0 0 0 8 1 0 0 0
             0 0 0 0 0 0 0 8 0 0 0
             0 0 0 0 0 0 0 0 0 1 0
             0 0 0 0 0 0 0 0 0 0 0
             0 0 0 0 0 0 0 0 0 0 0 ).

(a) Find the characteristic polynomial and the minimal polynomial of T.
(b) For each 1 ≤ i ≤ 11, express T(vi) as a linear combination of the vectors in C.
(c) Find the eigenvalues of T and bases for the corresponding eigenspaces.
(d) For each eigenvalue λ of T, write down a basis for K_λ(T).
46. Let A = ( 0 0 1 1 ; 2 2 1 1 ; 0 0 1 1 ; 0 0 1 1 ) be a real matrix.
(a) Compute the characteristic polynomial of A and find all eigenvalues.
(b) For each eigenvalue λ obtained in Part (a), determine the dimension of the eigenspace associated with λ.
52. Let T be a real linear operator on a finite dimensional vector space V such that
(i) cT(x) = (x + 1)^9 (x - 2)^7;
(ii) mT(x) = (x + 1)^4 (x - 2)^3;
(iii) nullity(T + IV) = 4, nullity((T + IV)^2) = 7, nullity((T + IV)^3) = 8; and
(iv) nullity(T - 2IV) = 3, nullity((T - 2IV)^2) = 5.
Find a Jordan canonical form for T. (Hint: You may need the answer to Question 11.51(b)(ii).)
53. (This question is the preliminary of the proof of Theorem 11.6.4 in Question 11.54. You can only use results discussed before Theorem 11.6.4 to do this question.)
Let T be a linear operator on a finite dimensional vector space V. Suppose λ is an eigenvalue of T. Define Q = T - λIV and Nt = Ker(Q^t) for t ≥ 0.
(a) Suppose v ∈ Nt \ N_{t-1} where t ≥ 1. Let B = {Q^{t-1}(v), ..., Q(v), v} and W = span(B).
(i) Prove that W is T-invariant.
(ii) Prove that B is a basis for W.
(iii) Using B as an ordered basis, write down [T|W]B.
(b) For 1 ≤ t ≤ s, let Ct ⊆ Nt \ N_{t-1} such that {N_{t-1} + v | v ∈ Ct} is a basis for Nt/N_{t-1}. (We also assume that for any u, v ∈ Ct, if u ≠ v, then N_{t-1} + u ≠ N_{t-1} + v.)
(i) Prove that for t ≥ 2, if Ct = {v1, v2, ..., vk}, then N_{t-2} + Q(v1), N_{t-2} + Q(v2), ..., N_{t-2} + Q(vk) are linearly independent vectors in N_{t-1}/N_{t-2}.
Discussion 12.1.1 In Chapter 5, we use the dot product to define lengths, distances and angles in R^n. For general vector spaces, we need an abstract version of the "dot product" so that we can generalize the work we have done in Chapter 5. In order to do so, we first require that the field used must have some built-in measurement and ordering. Since not all fields are suitable, we only study vector spaces over R and C in this chapter.
A* = (conj A)^T = (conj(a_{ji}))_{n×m}.

    (A + B)* = (conj(A + B))^T = (conj A + conj B)^T = (conj A)^T + (conj B)^T = A* + B*,
    (AC)* = (conj(AC))^T = ((conj A)(conj C))^T = (conj C)^T (conj A)^T = C* A*,
    (cA)* = (conj(cA))^T = (conj(c) conj A)^T = conj(c) (conj A)^T = conj(c) A*.
Definition 12.1.3 Let F = R or C and let V be a vector space over F. An inner product on V is a mapping which assigns to each ordered pair of vectors u, v ∈ V a scalar ⟨u, v⟩ ∈ F such that it satisfies the following axioms:
(IP1) For all u, v ∈ V, ⟨u, v⟩ = conj(⟨v, u⟩). (This axiom implies ⟨u, u⟩ ∈ R for all u ∈ V.)
(IP2) For all u, v, w ∈ V, ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩.
(IP3) For all c ∈ F and u, v ∈ V, ⟨cu, v⟩ = c⟨u, v⟩.
(IP4) ⟨0, 0⟩ = 0 and, for all nonzero u ∈ V, ⟨u, u⟩ > 0.
(Compare the axioms with Theorem 5.1.5.)

Remark 12.1.4
1. If F = R, we can rewrite (IP1) as:
(IP1') For all u, v ∈ V, ⟨u, v⟩ = ⟨v, u⟩.
For this case, we say that the inner product is symmetric.
2. By (IP1) and (IP2), for all u, v, w ∈ V,

    ⟨w, u + v⟩ = conj(⟨u + v, w⟩) = conj(⟨u, w⟩ + ⟨v, w⟩) = conj(⟨u, w⟩) + conj(⟨v, w⟩) = ⟨w, u⟩ + ⟨w, v⟩.

3. By (IP1) and (IP3), for all c ∈ F and u, v ∈ V,

    ⟨u, cv⟩ = conj(⟨cv, u⟩) = conj(c⟨v, u⟩) = conj(c) conj(⟨v, u⟩) = conj(c) ⟨u, v⟩.

When F = R, we have ⟨u, cv⟩ = c⟨u, v⟩.
4. By (IP2), for all u ∈ V,

    ⟨0, u⟩ = ⟨0 + 0, u⟩ = ⟨0, u⟩ + ⟨0, u⟩  ⟹  ⟨0, u⟩ = 0

and then by (IP1), ⟨u, 0⟩ = conj(⟨0, u⟩) = conj(0) = 0.
Example 12.1.6
1. For all u = (u1, u2, ..., un), v = (v1, v2, ..., vn) ∈ R^n, define

    ⟨u, v⟩ = u1 v1 + u2 v2 + ··· + un vn = u v^T.

Note that ⟨ , ⟩ is actually the dot product defined in Chapter 5. By Theorem 5.1.5, ⟨ , ⟩ is an inner product on R^n.
This inner product is also called the usual inner product or the Euclidean inner product on R^n. Furthermore, the Euclidean n-space usually refers to the vector space R^n equipped with this inner product.
2. For all u = (u1, u2, ..., un), v = (v1, v2, ..., vn) ∈ C^n, define

    ⟨u, v⟩ = u1 conj(v1) + u2 conj(v2) + ··· + un conj(vn) = u (conj v)^T.
Solution
(a) It does not satisfy (IP2) and (IP3). Hence it is not an inner product.
(b) It is an inner product. (Check it.)
(c) It does not satisfy (IP4). Hence it is not an inner product.
(d) It does not satisfy (IP1). Hence it is not an inner product.
(e) ⟨u, v⟩ = ((u1 + u2)/√2)((v1 + v2)/√2) + 2((u1 - u2)/√2)((v1 - v2)/√2) for all u = (u1, u2), v = (v1, v2) ∈ R^2. Same as (b), it is an inner product.
5. Let [a, b], with a < b, be a closed interval on the real line. Consider the vector space C([a, b]) defined in Example 8.3.6.5. Then

    ⟨f, g⟩ = (1/(b - a)) ∫_a^b f(t)g(t) dt  for f, g ∈ C([a, b])

is an inner product on C([a, b]). (We leave the verification as an exercise. See Question 12.5.)
6. Let V be the set of all real infinite sequences (an)_{n∈N} such that Σ_{n=1}^∞ an^2 converges. Define

    ⟨(an)_{n∈N}, (bn)_{n∈N}⟩ = Σ_{n=1}^∞ an bn  for (an)_{n∈N}, (bn)_{n∈N} ∈ V.

Then V is a real inner product space. (We leave the verification as an exercise. See Question 12.6.)
This inner product space is called the l^2-space. It is an example of a class of inner product spaces known as Hilbert spaces.
Discussion 12.2.1 One of the important uses of an inner product is that we can use it to measure the length of a vector and the distance between two vectors.
For u = (u1, u2, ..., un), v = (v1, v2, ..., vn) ∈ R^n,

    d(u, v) = ||u - v|| = √⟨u - v, u - v⟩ = √((u1 - v1)^2 + (u2 - v2)^2 + ··· + (un - vn)^2).
3. We compare the usual inner product on R^2 with the inner products defined in Parts (b) and (e) of Example 12.1.6.4.

    usual inner product         inner product of (b)        inner product of (e)
    ⟨(1,0), (0,1)⟩ = 0          ⟨(1,0), (0,1)⟩ = 0          ⟨(1,0), (0,1)⟩ = -1/2
    ||(1,0)|| = 1               ||(1,0)|| = 1               ||(1,0)|| = √(3/2)
    ||(0,1)|| = 1               ||(0,1)|| = √2              ||(0,1)|| = √(3/2)
    d((1,0), (0,1))             d((1,0), (0,1))             d((1,0), (0,1))
      = ||(1,-1)|| = √2           = ||(1,-1)|| = √3           = ||(1,-1)|| = 2
(Figure: the sets of unit vectors in R^2 under the three inner products above — a circle for the usual inner product and ellipses for the inner products of (b) and (e).)
4. Let M_{m×n}(F), where F = R or C, be equipped with the inner product defined in Example 12.1.6.3. For A = (aij), B = (bij) ∈ M_{m×n}(F),

    ||A|| = √⟨A, A⟩ = √( Σ_{i=1}^m Σ_{k=1}^n a_{ik} conj(a_{ik}) ) = √( Σ_{i=1}^m Σ_{k=1}^n |a_{ik}|^2 ).

5. Let [a, b], with a < b, be a closed interval on the real line. Suppose the vector space C([a, b]) is equipped with the inner product defined in Example 12.1.6.5. For f, g ∈ C([a, b]),

    ||f|| = √⟨f, f⟩ = √( (1/(b - a)) ∫_a^b [f(t)]^2 dt )

and

    d(f, g) = ||f - g|| = √( (1/(b - a)) ∫_a^b [f(t) - g(t)]^2 dt ).
Then

    0 ≤ ||w||^2 = ⟨ u - (⟨u, v⟩/||v||^2) v, u - (⟨u, v⟩/||v||^2) v ⟩
               = ⟨u, u⟩ - (conj(⟨u, v⟩)/||v||^2) ⟨u, v⟩ - (⟨u, v⟩/||v||^2) ⟨v, u⟩ + ((⟨u, v⟩ conj(⟨u, v⟩))/||v||^4) ⟨v, v⟩
               = ||u||^2 - |⟨u, v⟩|^2 / ||v||^2.     (12.1)

So

    |⟨u, v⟩|^2 / ||v||^2 ≤ ||u||^2

and hence |⟨u, v⟩|^2 ≤ ||u||^2 ||v||^2.
(Proofs of the other parts are left as exercises. See Question 12.12.)
Example 12.2.5
1. (Cauchy-Schwarz Inequality for Real Numbers) For any real numbers x1, x2, ..., xn, y1, y2, ..., yn, prove that

    (x1 y1 + x2 y2 + ··· + xn yn)^2 ≤ (x1^2 + x2^2 + ··· + xn^2)(y1^2 + y2^2 + ··· + yn^2).

Solution Use R^n equipped with the usual inner product. Let x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn). Then ||x||^2 = x1^2 + x2^2 + ··· + xn^2, ||y||^2 = y1^2 + y2^2 + ··· + yn^2 and ⟨x, y⟩ = x1 y1 + x2 y2 + ··· + xn yn. The inequality follows by Theorem 12.2.4.3.
2. (Cauchy-Schwarz Inequality for Continuous Functions) For any f, g ∈ C([a, b]), where [a, b], with a < b, is a closed interval on the real line, prove that

    ( ∫_a^b f(t)g(t) dt )^2 ≤ ( ∫_a^b f(t)^2 dt )( ∫_a^b g(t)^2 dt ).

Solution The inequality follows by applying Theorem 12.2.4.3 to the inner product defined in Example 12.1.6.5.
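Both instances can be sanity-checked numerically; e.g. for random real vectors (a small numpy sketch):

    import numpy as np

    rng = np.random.default_rng(0)
    x, y = rng.normal(size=5), rng.normal(size=5)
    print(np.dot(x, y)**2 <= np.dot(x, x) * np.dot(y, y))  # True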
Discussion 12.3.1 In Section 5.1, we use the dot product to define the angle between two vectors in R^n. Given a real inner product space V, we can define the angle between two vectors in V in the same way, i.e. the angle between u, v ∈ V is

    cos^{-1}( ⟨u, v⟩ / (||u|| ||v||) ).

In particular, u and v are perpendicular to each other if and only if ⟨u, v⟩ = 0. Note that by the Cauchy-Schwarz Inequality (Theorem 12.2.4.3), we have -1 ≤ ⟨u, v⟩/(||u|| ||v||) ≤ 1 and hence the angle is well-defined.
Although this definition of "angles" does not work for complex inner product spaces, the concept of "perpendicular" can still be defined accordingly.
In the following, we restate some important results in Chapter 5 using inner product spaces.
Proof
1. (⇐) It is obvious.
(⇒) Suppose u ∈ V is orthogonal to every vector in B. Take any w ∈ W. Since W = span(B), we can write w = a1 v1 + a2 v2 + ··· + ak vk for some v1, v2, ..., vk ∈ B and a1, a2, ..., ak ∈ F. Then

    ⟨w, u⟩ = a1⟨v1, u⟩ + a2⟨v2, u⟩ + ··· + ak⟨vk, u⟩ = a1 · 0 + a2 · 0 + ··· + ak · 0 = 0.

2. Take any finite subset {v1, v2, ..., vk} of B and consider

    c1 v1 + c2 v2 + ··· + ck vk = 0.     (12.2)

For each j, taking the inner product of both sides with vj,

    ⟨c1 v1 + c2 v2 + ··· + ck vk, vj⟩ = ⟨0, vj⟩ = 0
    ⟹ cj ⟨vj, vj⟩ = 0
    ⟹ cj = 0  (because vectors in B are nonzero vectors).

Since (12.2) has only the trivial solution, {v1, v2, ..., vk} is linearly independent.
As all finite subsets of B are linearly independent, B is linearly independent.
Remark 12.3.4
1. Suppose V is a finite dimensional inner product space. By Theorem 8.5.13 and Lemma 12.3.3.2, if we know the dimension of V, then to determine whether a set B of nonzero vectors from V is an orthogonal (respectively, orthonormal) basis for V, we only need to check:
2. By Lemma 12.3.3.3, a finite dimensional real inner product space is essentially the same as the Euclidean space.
Example 12.3.5
1. Suppose C^3 is equipped with the usual inner product. Let W = span{(1, 1, 1), (1, i, i)}. For any (x, y, z) ∈ C^3, by Lemma 12.3.3.1,

    (x, y, z) is orthogonal to W ⟺ { ⟨(x, y, z), (1, 1, 1)⟩ = 0 and ⟨(x, y, z), (1, i, i)⟩ = 0 }
                                 ⟺ { x + y + z = 0 and x - iy - iz = 0 }
                                 ⟺ { x = 0, y = -t, z = t }  for t ∈ C.

So (0, -t, t), t ∈ C, are all the vectors orthogonal to W.
2. Suppose M_{n×n}(R) is equipped with the inner product defined in Example 12.1.6.3. Let W = {A ∈ M_{n×n}(R) | A is symmetric}. Take any skew symmetric matrix B ∈ M_{n×n}(R), i.e. B^T = -B. For any A ∈ W, i.e. A^T = A, by Proposition 8.1.11, we have

    tr(AB^T) = tr((AB^T)^T) = tr(BA^T) = tr(-B^T A) = -tr(B^T A) = -tr(AB^T).

This implies that ⟨A, B⟩ = tr(AB^T) = 0. So B is orthogonal to W.
3. Let F = R or C. Using the usual inner product, the standard basis {e1, e2, ..., en} for F^n is an orthonormal basis.
4. Let F = R or C. Using the inner product defined in Example 12.1.6.3, the standard basis {Eij | 1 ≤ i ≤ m and 1 ≤ j ≤ n} for M_{m×n}(F) is an orthonormal basis.
5. Suppose P2(R) is equipped with an inner product such that

    ⟨p(x), q(x)⟩ = (1/2) ∫_{-1}^{1} p(t)q(t) dt  for p(x), q(x) ∈ P2(R).

Then ⟨1, x⟩ = 0 and ⟨x, x^2⟩ = 0 but ⟨1, x^2⟩ = 1/3 ≠ 0, i.e. x is orthogonal to both 1 and x^2 but 1 and x^2 are not orthogonal to each other. The standard basis {1, x, x^2} is not an orthogonal basis.
Theorem 12.3.7 (Gram-Schmidt Process) Suppose {u1, u2, ..., un} is a basis for a finite dimensional inner product space V. Let

    v1 = u1,
    v2 = u2 - (⟨u2, v1⟩/⟨v1, v1⟩) v1,
    v3 = u3 - (⟨u3, v1⟩/⟨v1, v1⟩) v1 - (⟨u3, v2⟩/⟨v2, v2⟩) v2,
    ...
    vn = un - (⟨un, v1⟩/⟨v1, v1⟩) v1 - (⟨un, v2⟩/⟨v2, v2⟩) v2 - ··· - (⟨un, v_{n-1}⟩/⟨v_{n-1}, v_{n-1}⟩) v_{n-1}.

Then {v1, v2, ..., vn} is an orthogonal basis for V.

Proof Since u1, u2, ..., ui are linearly independent, vi ≠ 0. Thus by Remark 12.3.4.1, we only need to show that {v1, v2, ..., vn} is orthogonal. We prove this by mathematical induction.
It is obvious that {v1} is an orthogonal set.
Assume that {v1, v2, ..., v_{k-1}} is an orthogonal set. For i ∈ {1, 2, ..., k - 1},

    ⟨vk, vi⟩ = ⟨ uk - Σ_{j=1}^{k-1} (⟨uk, vj⟩/⟨vj, vj⟩) vj, vi ⟩
             = ⟨uk, vi⟩ - Σ_{j=1}^{k-1} (⟨uk, vj⟩/⟨vj, vj⟩) ⟨vj, vi⟩
             = ⟨uk, vi⟩ - (⟨uk, vi⟩/⟨vi, vi⟩) ⟨vi, vi⟩  (because for 1 ≤ j ≤ k - 1, ⟨vj, vi⟩ = 0 if i ≠ j)
             = 0.

So {v1, v2, ..., vk} is orthogonal.
By mathematical induction, {v1, v2, ..., vn} is orthogonal.
Moreover, if we let wi = (1/||vi||) vi for each i, then

    ||wi|| = || (1/||vi||) vi || = (1/||vi||) ||vi|| = 1  for all i

and

    ⟨wi, wj⟩ = ⟨ (1/||vi||) vi, (1/||vj||) vj ⟩ = (1/(||vi|| ||vj||)) ⟨vi, vj⟩ = 0  if i ≠ j.
Example 12.3.8 Consider the vector space P2(R) equipped with the inner product defined in Example 12.3.5.5. Start with the standard basis {1, x, x^2}. By the Gram-Schmidt Process,

    p1(x) = 1,
    p2(x) = x - (⟨x, p1(x)⟩/⟨p1(x), p1(x)⟩) p1(x) = x,
    p3(x) = x^2 - (⟨x^2, p1(x)⟩/⟨p1(x), p1(x)⟩) p1(x) - (⟨x^2, p2(x)⟩/⟨p2(x), p2(x)⟩) p2(x) = x^2 - 1/3.

Hence

    { (1/||p1(x)||) p1(x), (1/||p2(x)||) p2(x), (1/||p3(x)||) p3(x) } = { 1, √3 x, (√5/2)(-1 + 3x^2) }

is an orthonormal basis.
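That this set is orthonormal with respect to ⟨p, q⟩ = (1/2)∫_{-1}^{1} p(t)q(t) dt can be confirmed symbolically (a sympy sketch):

    import sympy as sp

    t = sp.symbols('t')
    ip = lambda p, q: sp.Rational(1, 2) * sp.integrate(p*q, (t, -1, 1))

    basis = [sp.Integer(1), sp.sqrt(3)*t, sp.sqrt(5)/2*(-1 + 3*t**2)]
    print([[ip(p, q) for q in basis] for p in basis])
    # prints the 3x3 identity Gram matrix, so the basis is orthonormal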
Definition 12.4.1 Let V be an inner product space and W a subspace of V. The orthogonal complement of W is defined to be the set

    W⊥ = {v ∈ V | v is orthogonal to W} = {v ∈ V | ⟨v, u⟩ = 0 for all u ∈ W} ⊆ V.

Example 12.4.2
1. In Example 12.3.5.1, W⊥ = {(0, -t, t) | t ∈ C} = span{(0, -1, 1)} which is also a subspace of C^3.
Proof
1. (S1) Since ⟨0, u⟩ = 0 for all u ∈ W, 0 ∈ W⊥.
(S2) Take any v, w ∈ W⊥, i.e. ⟨v, u⟩ = ⟨w, u⟩ = 0 for all u ∈ W. Then for all u ∈ W,

    ⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩ = 0 + 0 = 0.

So v + w ∈ W⊥.
(S3) Take any v ∈ W⊥, i.e. ⟨v, u⟩ = 0 for all u ∈ W. For any scalar c and all u ∈ W,

    ⟨cv, u⟩ = c⟨v, u⟩ = c · 0 = 0.

So cv ∈ W⊥.
Since W⊥ is a subset of V satisfying (S1), (S2) and (S3), W⊥ is a subspace of V.
2. If v ∈ W ∩ W⊥, then ⟨v, v⟩ = 0 and hence by (IP4), v = 0. Thus W ∩ W⊥ = {0}. By Theorem 8.6.5, W + W⊥ is a direct sum.
3. If W = {0}, then W ⊕ W⊥ = W⊥ = V.
Suppose W is not a zero space. Let {w1, w2, ..., wk} be an orthonormal basis for W. For any u ∈ V, define
Example 12.4.4
1. In Example 12.3.5.1 and Example 12.4.2.1, dim(W) = 2 and dim(W⊥) = 1. Hence dim(W) + dim(W⊥) = 3 = dim(C^3).
2. Suppose M_{n×n}(R) is equipped with the inner product defined in Example 12.1.6.3. Let W1 = {A ∈ M_{n×n}(R) | A is symmetric} and W2 = {A ∈ M_{n×n}(R) | A is skew symmetric}. By Example 12.3.5.2, we have W2 ⊆ W1⊥.
By Example 8.6.6.2, we have M_{n×n}(R) = W1 ⊕ W2 and hence

    dim(W2) = dim(M_{n×n}(R)) - dim(W1)  (by Theorem 8.6.7.2)
            = dim(W1⊥)  (by Theorem 12.4.3.4).

So by Theorem 8.5.15, W2 = W1⊥.
Remark 12.4.5 Theorem 12.4.3.3 is not always true when W is infinite dimensional.
Consider the l^2-space V defined in Example 12.1.6.6. For i = 1, 2, 3, ..., define ei ∈ V to be the real infinite sequence such that the ith term of the sequence is 1 and all other terms are 0. Let W = span{e1, e2, e3, ...} which is an infinite dimensional subspace of V. Note that W ≠ V (see Question 8.20).
Let a = (an)_{n∈N} ∈ W⊥, i.e. ⟨a, ei⟩ = 0 for all i. For each i, by the definition of the inner product of the l^2-space,

    ⟨a, ei⟩ = Σ_{n=1}^∞ an (the nth term of ei) = ai.

This means ai = 0 for all i and hence a is the zero sequence.
So we have shown that W⊥ = {0}. In this case, W ⊕ W⊥ = W ≠ V.
Theorem 12.4.6 Let V be an inner product space and W a subspace of V.
1. W ⊆ (W⊥)⊥.
2. If W is finite dimensional, then W = (W⊥)⊥.

Proof
1. Take any u ∈ W. Then

    u ∈ W ⟹ ⟨u, v⟩ = 0 for all v ∈ W⊥ ⟹ u ∈ (W⊥)⊥.

Thus W ⊆ (W⊥)⊥.

Remark 12.4.7 Theorem 12.4.6.2 is not always true when W is infinite dimensional.
Use W defined in Remark 12.4.5. Since W⊥ = {0}, (W⊥)⊥ = {0}⊥ = V. We still have W ⊆ (W⊥)⊥ but W ≠ (W⊥)⊥.
Definition 12.4.8 Let V be an inner product space and W a subspace of V such that V = W ⊕ W⊥. (By Theorem 12.4.3.3, we know that V = W ⊕ W⊥ is always true if W is finite dimensional.) Every u ∈ V can be uniquely expressed as

    u = w + w′  where w ∈ W and w′ ∈ W⊥.

The vector w is called the orthogonal projection of u onto W and is denoted by ProjW(u).

Proposition 12.4.9 The mapping ProjW : V → V is a linear operator and is called the orthogonal projection of V onto W.

Proof This is only a particular case of Question 9.6.
Example 12.4.10
1. In Example 12.3.5.1, W = span{(1, 1, 1), (1, i, i)} = {(a, b, b) | a, b ∈ C} and W⊥ = {(0, -t, t) | t ∈ C}. For any (x, y, z) ∈ C^3,

    (x, y, z) = (x, (y + z)/2, (y + z)/2) + (0, (y - z)/2, (z - y)/2)

where (x, (y + z)/2, (y + z)/2) ∈ W and (0, (y - z)/2, (z - y)/2) ∈ W⊥. So

    ProjW((x, y, z)) = (x, (y + z)/2, (y + z)/2).

2. Suppose M_{n×n}(R) is equipped with the inner product defined in Example 12.1.6.3. Let W = {A ∈ M_{n×n}(R) | A is symmetric}. By Example 12.4.4.2, W⊥ = {A ∈ M_{n×n}(R) | A is skew symmetric}. For each B ∈ M_{n×n}(R), by Example 8.6.6.2, we can write

    B = (1/2)(B + B^T) + (1/2)(B - B^T).
Theorem 12.4.11 Let V be an inner product space and W a finite dimensional subspace of V. If B = {w1, w2, ..., wk} is an orthonormal basis for W, then for any vector u ∈ V,

    ProjW(u) = ⟨u, w1⟩ w1 + ⟨u, w2⟩ w2 + ··· + ⟨u, wk⟩ wk

and

    ProjW⊥(u) = u - ProjW(u) = u - ⟨u, w1⟩ w1 - ⟨u, w2⟩ w2 - ··· - ⟨u, wk⟩ wk.

Proof The proof follows the same argument as the proof for Theorem 5.2.15.
Example 12.4.12
1. In Example 12.4.10.1, W = {(a, b, b) | a, b ∈ C} = span{(1, 0, 0), (0, 1, 1)} ⊆ C^3. To compute ProjW((x, y, z)), for (x, y, z) ∈ C^3, by using Theorem 12.4.11, we first need an orthonormal basis for W. In this case, {(1, 0, 0), (1/√2)(0, 1, 1)} is an orthonormal basis for W. Thus

    ProjW((x, y, z)) = ⟨(x, y, z), (1, 0, 0)⟩ (1, 0, 0) + ⟨(x, y, z), (1/√2)(0, 1, 1)⟩ (1/√2)(0, 1, 1)
                     = x(1, 0, 0) + ((y + z)/2)(0, 1, 1) = (x, (y + z)/2, (y + z)/2)

which gives us the same formula as in Example 12.4.10.1.
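The same computation can be reproduced symbolically from Theorem 12.4.11 (a sympy sketch; x, y, z are treated as complex scalars):

    import sympy as sp

    x, y, z = sp.symbols('x y z')
    u = sp.Matrix([x, y, z])
    w1 = sp.Matrix([1, 0, 0])
    w2 = sp.Matrix([0, 1, 1]) / sp.sqrt(2)

    ip = lambda a, b: (b.conjugate().T * a)[0]   # usual inner product <a, b> on C^3
    proj = ip(u, w1)*w1 + ip(u, w2)*w2
    print(sp.expand(proj))                       # (x, (y+z)/2, (y+z)/2)^T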
2. Let C([-1, 1]) be equipped with the inner product defined by

    ⟨f, g⟩ = (1/2) ∫_{-1}^{1} f(t)g(t) dt  for f, g ∈ C([-1, 1]).
(Figure: the orthogonal projection ProjW(u) of a vector u onto a subspace W, with u - ProjW(u) perpendicular to W and a second vector v in W.)
Proof The proof follows the same argument as the proof for Theorem 5.3.2.
Example 12.4.14 Let C([-1, 1]) be equipped with the inner product defined by

    ⟨f, g⟩ = (1/2) ∫_{-1}^{1} f(t)g(t) dt  for f, g ∈ C([-1, 1]).
(Figure: graphs of f(x) = e^x, its best approximation p(x) = (1/2)(e - e^{-1}) + 3e^{-1} x in P1(R), and the polynomial q(x) = 1 + x, for -1 ≤ x ≤ 1.)
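The coefficients of p(x) in the figure come from Theorem 12.4.11 with the orthonormal basis {1, √3 x} of P1(R) found in Example 12.3.8; a sympy sketch of that computation:

    import sympy as sp

    t = sp.symbols('t')
    ip = lambda f, g: sp.Rational(1, 2) * sp.integrate(f*g, (t, -1, 1))

    f = sp.exp(t)
    q1, q2 = sp.Integer(1), sp.sqrt(3)*t     # orthonormal basis of P1(R)
    p = ip(f, q1)*q1 + ip(f, q2)*q2          # Proj_{P1(R)}(f)
    print(sp.expand(p))                      # (E - exp(-1))/2 + 3*t*exp(-1)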
Definition 12.5.1 Let V be an inner product space and let T be a linear operator on V. A linear operator T* on V is called the adjoint of T if

    ⟨T(u), v⟩ = ⟨u, T*(v)⟩  for all u, v ∈ V.     (12.3)

In Theorem 12.5.4.1, we shall learn that the adjoint of a linear operator is unique if it exists. Thus we always use T* to denote the adjoint of T.
Please note that the (classical) adjoint of a matrix defined in Definition 2.5.24 is a completely different concept.
Remark 12.5.2 Let V be an inner product space and let T be a linear operator on V such that T* exists. Then for all u, v ∈ V,

    ⟨T*(u), v⟩ = ⟨u, T(v)⟩.
Example 12.5.3
1. Let V be an inner product space. For all u, v ∈ V,

    ⟨IV(u), v⟩ = ⟨u, v⟩ = ⟨u, IV(v)⟩  and  ⟨OV(u), v⟩ = 0 = ⟨u, OV(v)⟩.

Both IV and OV are the adjoints of themselves, i.e. IV* = IV and OV* = OV.
2. Let R^n be equipped with the usual inner product. Given any n × n real matrix A, let LA be the linear operator on R^n as defined in Example 9.1.4.1, i.e. LA(u) = Au for u ∈ R^n. (In here, all vectors are expressed as column vectors.) For any u, v ∈ R^n,
Theorem 12.5.4 Let V be an inner product space and let T be a linear operator on V.
1. The adjoint of T is unique if it exists.
2. Suppose V is finite dimensional where dim(V) ≥ 1.
(a) T* always exists.
(b) If B is an ordered orthonormal basis for V, then [T*]B = ([T]B)*.
(c) rank(T*) = rank(T) and nullity(T*) = nullity(T).

Proof
1. Suppose there are two adjoints T1* and T2* of T. To show that T1* = T2*, we need to show that T1*(v) = T2*(v) for all v ∈ V. By (12.3), for all u, v ∈ V,

    ⟨u, T1*(v)⟩ = ⟨T(u), v⟩ = ⟨u, T2*(v)⟩  ⟹  ⟨u, T1*(v) - T2*(v)⟩ = 0.

    = ([u]B)^T ([T]B)^T conj([v]B)
    = ([u]B)^T conj( ([T]B)* [v]B ) = ([u]B)^T conj( [T′(v)]B ) = ⟨u, T′(v)⟩.
(see Question 12.35 for a proof of rank(T*) = rank(T) without using matrices) and
Remark 12.5.6 Theorem 12.5.4.2(a) is not always true when V is infinite dimensional.
Let V be the l^2-space defined in Example 12.1.6.6. As in Remark 12.4.5, consider the subspace W = span{e1, e2, e3, ...} of V. Observe that W consists of all real infinite sequences which have only finitely many nonzero terms. Define a linear operator T on W such that

    T((an)_{n∈N}) = ( Σ_{i=n}^∞ ai )_{n∈N}  for (an)_{n∈N} ∈ W.

(The sum Σ_{i=n}^∞ ai converges because there are only finitely many nonzero ai.)
Note that T(en) = (1, 1, ..., 1, 0, 0, ...) where the first n entries are 1 and all other entries are 0.
Assume T* exists. Let T*(e1) = (bn)_{n∈N}. By (12.3), for all n ∈ N,
Proposition 12.5.7 Let F = R or C and let V be an inner product space over F. Suppose S and T are linear operators on V such that S* and T* exist. Then
1. (S + T)* exists and (S + T)* = S* + T*;
2. for any c ∈ F, (cT)* exists and (cT)* = conj(c) T*;
3. (S ∘ T)* exists and (S ∘ T)* = T* ∘ S*;
4. (T*)* exists and (T*)* = T; and
5. if W is a subspace of V which is both T-invariant and T*-invariant, then (T|W)* exists and (T|W)* = T*|W.

So (S ∘ T)* exists and (S ∘ T)* = T* ∘ S*.
(Proofs of the other parts are left as exercises. See Question 12.31.)
Proposition 12.5.9 Let F = R or C, V a finite dimensional inner product space over F, where dim(V) ≥ 1, and T a linear operator on V. Take any ordered orthonormal basis B for V.
1. If F = C, then T is unitary if and only if [T]B is a unitary matrix.
2. If F = R, then T is orthogonal if and only if [T]B is an orthogonal matrix.

Proof Since the proof of Part 2 is the same as Part 1, we only prove Part 1:
Let A = [T]B. By Theorem 12.5.4.2, [T*]B = A*. Thus by Theorem 9.3.3, [T ∘ T*]B = AA* and [T* ∘ T]B = A*A. Hence

    T is unitary ⟺ T ∘ T* = T* ∘ T = IV
                 ⟺ AA* = A*A = I  (by Lemma 9.2.3)
                 ⟺ A is unitary.
Example 12.5.10
1. The following are some examples of unitary matrices, where the first three matrices are also orthogonal matrices:

    ( 1 0 0 ; 0 1 0 ; 0 0 1 ),
    ( 1/√3 2/√6 0 ; 1/√3 -1/√6 1/√2 ; 1/√3 -1/√6 -1/√2 ),
    ( cos(θ) -sin(θ) ; sin(θ) cos(θ) ),
    ( (1/√5)(1+2i) 0 0 ; 0 0 i ; 0 i 0 ),
    ( (1/√3)(1+i) 1/√3 ; 1/√3 -(1/√3)(1-i) ),
    ( 1/2 1/2 (1/√2)i 0 ; 1/2 -1/2 0 -(1/√2)i ; (1/√2)i 0 1/2 -1/2 ; 0 (1/√2)i 1/2 1/2 ).
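Unitarity of any of these matrices can be confirmed by checking UU* = I; e.g. for the fourth matrix (a numpy sketch):

    import numpy as np

    U = np.array([[(1 + 2j)/np.sqrt(5), 0, 0],
                  [0, 0, 1j],
                  [0, 1j, 0]])
    print(np.allclose(U @ U.conj().T, np.eye(3)))  # True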
2. Let R^2 be equipped with the usual inner product. Consider the linear operator LA on R^2 where

    A = ( cos(θ) -sin(θ) ; sin(θ) cos(θ) )  for some θ ∈ [0, 2π].

Using the orthonormal basis E = {(1, 0), (0, 1)} for R^2, [LA]E = A. Since A is an orthogonal matrix, by Proposition 12.5.9, LA is an orthogonal operator.
3. Let C^3 be equipped with the usual inner product. Let T be the linear operator on C^3 defined by

    T((x, y, z)) = ( (1/√2)(x + iy), (1/√2)(x - iy), z )  for (x, y, z) ∈ C^3.

Using the orthonormal basis E = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} for C^3,

    [T]E = ( 1/√2 (1/√2)i 0 ; 1/√2 -(1/√2)i 0 ; 0 0 1 ).
Theorem 12.5.11 Let F = R or C, V a finite dimensional inner product space over F, where dim(V) ≥ 1, and T a linear operator on V. Then the following are equivalent:
1. T is unitary (when F = C) or orthogonal (when F = R).
2. ⟨T(u), T(v)⟩ = ⟨u, v⟩ for all u, v ∈ V.
3. ||T(u)|| = ||u|| for all u ∈ V.
4. T maps every orthonormal basis for V to an orthonormal basis for V.

Proof
(1 ⇒ 2) For all u, v ∈ V, ⟨T(u), T(v)⟩ = ⟨u, T*(T(v))⟩ = ⟨u, IV(v)⟩ = ⟨u, v⟩.
(2 ⇒ 3) For all u ∈ V, ||T(u)|| = √⟨T(u), T(u)⟩ = √⟨u, u⟩ = ||u||.
(3 ⇒ 4) Take any orthonormal basis {w1, w2, ..., wn} for V. Since ||T(wi)|| = ||wi|| = 1 for all i, it suffices to show that ⟨T(wi), T(wj)⟩ = 0 for i ≠ j. Let ε = ⟨T(wi), T(wj)⟩. Then conj(ε) = ⟨T(wj), T(wi)⟩.

    (T* ∘ T)(wi) = ⟨(T* ∘ T)(wi), w1⟩ w1 + ⟨(T* ∘ T)(wi), w2⟩ w2 + ··· + ⟨(T* ∘ T)(wi), wn⟩ wn
                 = ⟨T*(T(wi)), w1⟩ w1 + ⟨T*(T(wi)), w2⟩ w2 + ··· + ⟨T*(T(wi)), wn⟩ wn
                 = ⟨T(wi), T(w1)⟩ w1 + ⟨T(wi), T(w2)⟩ w2 + ··· + ⟨T(wi), T(wn)⟩ wn
                 = wi.
Example 12.5.12
1. In Example 12.5.10.2, LA maps the standard basis {(1, 0)^T, (0, 1)^T} to

    { (cos(θ), sin(θ))^T, (-sin(θ), cos(θ))^T }

which is also an orthonormal basis for R^2 using the usual inner product.
2. In Example 12.5.10.3, T maps the standard basis {(1, 0, 0), (0, 1, 0), (0, 0, 1)} to

    { (1/√2, 1/√2, 0), ((1/√2)i, -(1/√2)i, 0), (0, 0, 1) }

which is also an orthonormal basis for C^3 using the usual inner product.
Remark 12.5.13 Theorem 12.5.11 is not always true when V is infinite dimensional.
The linear operator S in Example 12.5.3.4 satisfies Part 2 and Part 3 of Theorem 12.5.11. But S is not surjective and hence not invertible. So S is not orthogonal.

Theorem 12.5.14 Let A be an n × n complex matrix. Suppose C^n is equipped with the usual inner product. The following statements are equivalent:
1. A is unitary.
2. The rows of A form an orthonormal basis for C^n.

Proof The proof follows the same argument as the proof for Theorem 5.4.6. (We can also prove the theorem by applying Theorem 12.5.11 to LA and LA*.)
Theorem 12.5.15 Let V be a complex finite dimensional inner product space where dim(V) ≥ 1. If B and C are ordered orthonormal bases for V, then the transition matrix from B to C is a unitary matrix, i.e. [IV]B,C = ([IV]C,B)^{-1} = ([IV]C,B)*. (See also Theorem 5.4.7.)

Proof Let B = {v1, v2, ..., vn} and C = {w1, w2, ..., wn}. Define a linear operator T on V such that T(wi) = vi for i = 1, 2, ..., n. Note that

    [T]B,C = ( [T(w1)]B [T(w2)]B ··· [T(wn)]B ) = ( [v1]B [v2]B ··· [vn]B ) = In.

By Theorem 12.5.11, T is a unitary operator and hence by Proposition 12.5.9, [T]C is a unitary matrix. But then the transition matrix from B to C is

    [IV]C,B = [IV]C,B In = [IV]C,B [T]B,C = [IV ∘ T]C,C = [T]C

which is a unitary matrix.
Definition 12.6.1
1. Let V be an inner product space and let T be a linear operator on V such that T* exists.
(a) T is called a self-adjoint operator if T* = T.
(b) T is called a normal operator if T ∘ T* = T* ∘ T.
All self-adjoint operators, orthogonal operators and unitary operators are normal.
2. Let A be a complex square matrix.
(a) A is called a Hermitian matrix if A* = A.
Note that a real matrix A satisfying A* = A is a symmetric matrix.
(b) A is called a normal matrix if AA* = A*A.
All Hermitian matrices, real symmetric matrices, unitary matrices and orthogonal matrices are normal.
Proposition 12.6.2 Let F = R or C, V a finite dimensional inner product space over F, where dim(V) ≥ 1, and T a linear operator on V. Take an ordered orthonormal basis B for V and let A = [T]B.
1. If F = C, then T is self-adjoint if and only if A is a Hermitian matrix.
If F = R, then T is self-adjoint if and only if A is a symmetric matrix.
2. T is normal if and only if A is a normal matrix.

Proof The proof follows the same argument as the proof for Proposition 12.5.9.
Example 12.6.3
1. Let C^2 be equipped with the usual inner product. Let T be the linear operator on C^2 defined by

    T((x, y)) = (2x - i(x + y), 2y - i(x + y))  for (x, y) ∈ C^2.

Using the orthonormal basis E = {(1, 0), (0, 1)} for C^2, [T]E = ( 2-i -i ; -i 2-i ). Since

    ([T]E)* [T]E = ( 6 2 ; 2 6 ) = [T]E ([T]E)*,

[T]E is a normal matrix and hence by Proposition 12.6.2.2, T is a normal operator.
2. Let R^n be equipped with the usual inner product and A a real n × n matrix.
If A is symmetric, then the linear operator LA is self-adjoint (and normal).
If A is nonzero and skew symmetric, then LA is normal but not self-adjoint.
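The matrix computation in Part 1 can be checked numerically (a numpy sketch):

    import numpy as np

    A = np.array([[2 - 1j, -1j], [-1j, 2 - 1j]])
    print(A.conj().T @ A)                               # [[6, 2], [2, 6]]
    print(np.allclose(A @ A.conj().T, A.conj().T @ A))  # True: A is normal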
Lemma 12.6.4 Let F = R or C, V an inner product space over F and T a normal operator on V.
1. For all u, v ∈ V, ⟨T(u), T(v)⟩ = ⟨T*(u), T*(v)⟩.
2. For any c ∈ F, the linear operator T - cIV is normal.
3. If u is an eigenvector of T associated with λ, then u is an eigenvector of T* associated with conj(λ).
4. If u and v are eigenvectors of T associated with λ and μ, respectively, where λ ≠ μ, then u and v are orthogonal.

Proof
1. ⟨T(u), T(v)⟩ = ⟨u, T*(T(v))⟩ = ⟨u, T(T*(v))⟩ = ⟨T*(u), T*(v)⟩.
2. Note that IV* = IV. By Proposition 12.5.7, (T - cIV)* exists and (T - cIV)* = T* - conj(c) IV. Since

    (T - cIV) ∘ (T - cIV)* = (T - cIV) ∘ (T* - conj(c) IV)
                           = T ∘ T* - conj(c) T - c T* + c conj(c) IV
                           = T* ∘ T - conj(c) T - c T* + c conj(c) IV  (because T is normal)
                           = (T* - conj(c) IV) ∘ (T - cIV)
                           = (T - cIV)* ∘ (T - cIV),

T - cIV is normal.
    ⟨(T - λIV)*(u), (T - λIV)*(u)⟩ = ⟨(T - λIV)(u), (T - λIV)(u)⟩ = 0.

By (IP4), (T - λIV)*(u) = 0 and hence T*(u) = conj(λ) u.
Since λ ≠ μ, ⟨u, v⟩ = 0.
Remark 12.6.5 The results in Lemma 12.6.4 are still true if we replace V by C^n (equipped with the usual inner product) and T by an n × n normal matrix A. In particular, if u is an eigenvector of A associated with λ, then u is also an eigenvector of A* associated with conj(λ).
Example 12.6.6 Let A = ( 2-i -i ; -i 2-i ). By Example 12.6.3.1, A is a normal matrix. Let u = (1, -1)^T and v = (1, 1)^T. Since

    ( 2-i -i ; -i 2-i )(1, -1)^T = 2(1, -1)^T  and  ( 2-i -i ; -i 2-i )(1, 1)^T = (2 - 2i)(1, 1)^T,

u and v are eigenvectors of A associated with the eigenvalues 2 and 2 - 2i respectively. Thus u and v are also eigenvectors of A* associated with the eigenvalues 2 and 2 + 2i respectively. We can also check them directly:

    ( 2+i i ; i 2+i )(1, -1)^T = 2(1, -1)^T  and  ( 2+i i ; i 2+i )(1, 1)^T = (2 + 2i)(1, 1)^T.
Definition 12.6.7
1. Let F = R or C, V a finite dimensional inner product space over F, where dim(V) ≥ 1, and T a linear operator on V. Suppose there exists an ordered orthonormal basis B for V such that [T]B is a diagonal matrix.
If F = C, then T is called unitarily diagonalizable.
If F = R, then T is called orthogonally diagonalizable.
Let

    P = ( 1/√2 1/√2 ; -1/√2 1/√2 )

which is an orthogonal matrix and hence a unitary matrix. Since

    P*AP = ( 1/√2 -1/√2 ; 1/√2 1/√2 )( 2-i -i ; -i 2-i )( 1/√2 1/√2 ; -1/√2 1/√2 ) = ( 2 0 ; 0 2-2i ),

A is unitarily diagonalizable.
Theorem 12.6.9
1. Let V be a complex finite dimensional inner product space where dim(V) ≥ 1. A linear operator T on V is unitarily diagonalizable if and only if T is normal.
2. A complex square matrix A is unitarily diagonalizable if and only if A is normal.

    (T|W⊥)* ∘ (T|W⊥) = (T*|W⊥) ∘ (T|W⊥)
                     = (T* ∘ T)|W⊥  (by Proposition 11.3.3.1)
                     = (T ∘ T*)|W⊥  (because T is normal)
                     = (T|W⊥) ∘ (T*|W⊥)  (again by Proposition 11.3.3.1)
                     = (T|W⊥) ∘ (T|W⊥)*.
Algorithm 12.6.10 Let T be a normal operator on a complex finite dimensional inner product space V where dim(V) ≥ 1. We want to find an ordered orthonormal basis so that the matrix of T relative to this basis is a diagonal matrix.
Step 1: Find an orthonormal basis C for V and compute the matrix A = [T]C.
Step 2: Factorize the characteristic polynomial cA(x) into linear factors, i.e. express it in the form

    cA(x) = (x - λ1)^{r1} (x - λ2)^{r2} ··· (x - λk)^{rk}

where λ1, λ2, ..., λk are distinct eigenvalues of A and r1 + r2 + ··· + rk = dim(V).
Step 3: For each eigenvalue λi, find a basis for the eigenspace E_{λi}(T) = Ker(T - λi IV) and then use the Gram-Schmidt Process (Theorem 12.3.7) to transform it to an orthonormal basis Bi for E_{λi}(T).
Step 4: Let B = B1 ∪ B2 ∪ ··· ∪ Bk. Then B is an orthonormal basis for V. Using B as an ordered basis, D = [T]B is a diagonal matrix.
Note that D = P*AP where P = [IV]C,B is the transition matrix from B to C.
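A compact sympy sketch of Steps 2-4 (the helper name unitarily_diagonalize is ours; sympy's GramSchmidt is used here only to orthonormalize each eigenspace basis, and for complex eigenspaces of dimension greater than one a conjugate-aware Gram-Schmidt should be substituted):

    import sympy as sp

    def unitarily_diagonalize(A):
        cols = []
        for eigval, mult, vecs in A.eigenvects():           # Steps 2 and 3
            cols += sp.GramSchmidt(vecs, orthonormal=True)  # orthonormal basis B_i
        P = sp.Matrix.hstack(*cols)                         # Step 4
        return P, sp.simplify(P.H * A * P)                  # D = P* A P

    A = sp.Matrix([[2 - sp.I, -sp.I], [-sp.I, 2 - sp.I]])
    P, D = unitarily_diagonalize(A)
    print(D)   # diagonal with entries 2 and 2 - 2*I (in some order)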
Example 12.6.11 Suppose M_{2×2}(C) is equipped with the inner product defined in Example 12.1.6.3. Let T : M_{2×2}(C) → M_{2×2}(C) be the linear operator defined by

    T( ( a b ; c d ) ) = ( -b - ic + id  -a + ic - id ; ia - ib - d  -ia + ib - c )  for ( a b ; c d ) ∈ M_{2×2}(C).

Step 1: Take the standard basis C = { ( 1 0 ; 0 0 ), ( 0 1 ; 0 0 ), ( 0 0 ; 1 0 ), ( 0 0 ; 0 1 ) } for M_{2×2}(C). Then

    A = [T]C = ( 0 -1 -i i ; -1 0 i -i ; i -i 0 -1 ; -i i -1 0 )

which is a Hermitian matrix and hence is a normal matrix. So T is a normal operator.
Thus { ( 1 1 ; 0 0 ), ( i 0 ; 1 0 ), ( -i 0 ; 0 1 ) } is a basis for E_{-1}(T) and by the Gram-Schmidt Process, we obtain an orthonormal basis

    B_{-1} = { (1/√2)( 1 1 ; 0 0 ), (1/√6)( i -i ; 2 0 ), (1/√12)( -i i ; 1 3 ) }

for E_{-1}(T).
Similarly, B_3 = { (1/2)( i -i ; -1 1 ) } is an orthonormal basis for E_3(T).
Step 4: Let B = { (1/√2)( 1 1 ; 0 0 ), (1/√6)( i -i ; 2 0 ), (1/√12)( -i i ; 1 3 ), (1/2)( i -i ; -1 1 ) }. We have

    D = [T]B = ( -1 0 0 0 ; 0 -1 0 0 ; 0 0 -1 0 ; 0 0 0 3 ).

Let

    P = [I_{M_{2×2}(C)}]C,B = ( 1/√2  (1/√6)i   -(1/√12)i  (1/2)i
                                1/√2  -(1/√6)i  (1/√12)i   -(1/2)i
                                0     2/√6      1/√12      -1/2
                                0     0         3/√12      1/2 ).

Then D = P*AP.
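The whole example can be verified numerically (a numpy sketch; the entries of A and P are the reconstructed ones above):

    import numpy as np

    A = np.array([[0, -1, -1j, 1j],
                  [-1, 0, 1j, -1j],
                  [1j, -1j, 0, -1],
                  [-1j, 1j, -1, 0]])
    P = np.column_stack([
        np.array([1, 1, 0, 0]) / np.sqrt(2),
        np.array([1j, -1j, 2, 0]) / np.sqrt(6),
        np.array([-1j, 1j, 1, 3]) / np.sqrt(12),
        np.array([1j, -1j, -1, 1]) / 2,
    ])
    print(np.allclose(P.conj().T @ P, np.eye(4)))   # True: P is unitary
    D = P.conj().T @ A @ P
    print(np.round(D.real, 10))                     # diag(-1, -1, -1, 3)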
Theorem 12.6.12
1. Let V be a real finite dimensional inner product space where dim(V) ≥ 1. A linear operator T on V is orthogonally diagonalizable if and only if T is self-adjoint.

    conj(λ) u = A*u  (by Remark 12.6.5)
              = conj(A) u  (because A is symmetric)
              = Au  (because A is real)
              = λu.

Since u ≠ 0, conj(λ) = λ, i.e. λ is real.
Exercise 12
2. Let V be a vector space over F_2 such that V has at least two nonzero vectors. Suppose there exists a mapping ⟨ , ⟩ : V × V → F_2 such that
(I) for all u, v ∈ V, ⟨u, v⟩ = ⟨v, u⟩; and
(II) for all u, v, w ∈ V, ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩.
Show that there exists a nonzero vector u ∈ V such that ⟨u, u⟩ = 0.
3. Let A = ( a b ; c d ) be a real matrix. Define ⟨u, v⟩ = uAv^T for u, v ∈ R^2 where u and v are written as row vectors. Find a necessary and sufficient condition on a, b, c, d so that ⟨ , ⟩ is an inner product on R^2.
(a) Prove that all eigenvalues of A are nonnegative real numbers. Hence show that det(A) is a nonnegative real number.
(b) Prove that det(A) = 0 if and only if u1, u2, ..., un are linearly dependent.
    ⟨p(x), q(x)⟩ = (1/2) ∫_{-1}^{1} p(t)q(t) dt  for p(x), q(x) ∈ P1(R),

    ⟨p(x), q(x)⟩ = ∫_0^1 p(t)q(t) dt  for p(x), q(x) ∈ P1(R),
11. (Parallelogram Law) Let V be an inner product space. Prove that for all u, v ∈ V, ||u + v||^2 + ||u - v||^2 = 2||u||^2 + 2||v||^2.
12. Complete the proof of Theorem 12.2.4:
Let V be an inner product space over F = R or C. For any c ∈ F and u, v ∈ V, prove that
(a) ||0|| = 0 and if u ≠ 0, ||u|| > 0;
(b) ||cu|| = |c| ||u||;
(c) |⟨u, v⟩| = ||u|| ||v|| if and only if u and v are linearly dependent; and
(d) ||u + v|| ≤ ||u|| + ||v||.

    ||u + v||^2 + ||u - v||^2 = 2||u||^2 + 2||v||^2.
16. For each part of Question 12.4, write down an orthonormal basis for V if ⟨ , ⟩ is an inner product.

    φ_0(x) = 1, φ_{2m-1}(x) = cos(mx), φ_{2m}(x) = sin(mx)  (m = 1, 2, ..., n) for x ∈ [0, 2π].

(a) Prove that {φ_0, φ_1, ..., φ_{2n}} is orthogonal.
(b) Find an orthonormal basis for Wn = span{φ_0, φ_1, ..., φ_{2n}}.
18. Let M_{2×2}(C) be equipped with the inner product ⟨A, B⟩ = tr(AB*) for A, B ∈ M_{2×2}(C). Let W = span{A1, A2, A3, A4} where

    A1 = ( 1 0 ; 0 i ),  A2 = ( 2 i ; 0 0 ),  A3 = ( 0 2i ; i 0 ),  A4 = ( 0 i ; i i ).

(a) Find a subset B of {A1, A2, A3, A4} such that B is a basis for W.
(b) Starting with the basis in (a), use the Gram-Schmidt Process to find an orthonormal basis for W.
Starting with the standard bases, use the Gram-Schmidt Process to find an orthonormal basis for each of P1(R) and P2(R).
20. Suppose C^4 is equipped with the usual inner product. Let W be a subspace of C^4 with a basis B = {(1, i, 0, 0), (0, 1, i, 0), (0, 0, 1, i)}.
21. Let C^4 be equipped with the usual inner product and let

    W = {(a, a - ib, a + 2ib, a + 3ib) | a, b ∈ C}.

24. Let V = M_{2×2}(R) be equipped with the inner product ⟨A, B⟩ = tr(AB^T) for A, B ∈ V. Let W = span{A1, A2, A3, A4} where

    A1 = ( 0 0 ; 1 3 ),  A2 = ( 2 1 ; 2 2 ),  A3 = ( 1 0 ; 0 1 ),  A4 = ( 0 0 ; 1 2 ).
25. Suppose C([0, 1]) is equipped with an inner product such that

    ⟨f, g⟩ = ∫_0^1 f(t)g(t) dt  for f, g ∈ C([0, 1]).

Let f be the function in C([0, 1]) defined by f(x) = √x for x ∈ [0, 1]. Find the best approximation of f in each of P1(R) and P2(R), where P1(R) and P2(R) are regarded as subspaces of C([0, 1]). (Hint: Use the orthonormal bases found in Question 12.19.)
26. Suppose C([0, 2π]) is equipped with the inner product defined in Question 12.17. Let f be the function in C([0, 2π]) defined by f(x) = e^x for x ∈ [0, 2π]. Follow the notation in Question 12.17. Find the best approximation of f in W2 = span{φ_0, φ_1, φ_2}.
27. Let V be an inner product space and W a subspace of V. The distance from a vector u to W is defined to be the value
28. Let V be an inner product space and W a subspace of V. The distance from a vector u to a coset W + v is defined to be the value
29. For each of the following linear operators T on the inner product space V, find the adjoint of T.
(a) V = C^3 is equipped with the usual inner product and T : V → V is defined by T((x, y, z)) = (x, x + yi, x + yi - z) for (x, y, z) ∈ V.
(b) V = R^2 is equipped with the inner product

    ⟨(u1, u2), (v1, v2)⟩ = u1 v1 + 2 u2 v2  for (u1, u2), (v1, v2) ∈ V

    ⟨X, Y⟩ = tr(XY*)  for X, Y ∈ V

and T : V → V is defined by T(X) = AX for X ∈ V, where A is an n × n complex matrix.
30. (a) Let T be a linear operator on an inner product space V such that T* exists and let W be a T-invariant subspace of V. Prove that W⊥ is T*-invariant.
(b) Let C^3 be equipped with the usual inner product and let T : C^3 → C^3 be the linear operator defined by
32. Let T be a linear operator on an inner product space such that T* exists.
33. Let T be a linear operator on an inner product space V such that T is invertible and T* exists.
34. Let T be a linear operator on an inner product space V such that T* exists.
(a) Prove that Ker(T* ∘ T) = Ker(T).
(b) Is it true that Ker(T ∘ T*) = Ker(T)? Justify your answer.
35. Let T be a linear operator on an inner product space V such that T* exists.
(a) For v1, v2, ..., vk ∈ R(T*), show that if T(v1), T(v2), ..., T(vk) are linearly independent, then T*(T(v1)), T*(T(v2)), ..., T*(T(vk)) are linearly independent.
(b) Suppose R(T) is finite dimensional. (Note that V may not be finite dimensional.) Prove that R(T*) is finite dimensional and rank(T*) = rank(T).
(Hint: By substituting T by T* in (a), if T*(v1), T*(v2), ..., T*(vk) are linearly independent, then T(T*(v1)), T(T*(v2)), ..., T(T*(vk)) are linearly independent.)
36. Let T be a linear operator on an inner product space V such that T* exists.
(a) Given b ∈ V, show that x = u is a solution to (T* ∘ T)(x) = T*(b) if and only if T(u) is the orthogonal projection of b onto R(T).
(b) Given b ∈ R(T), show that {u ∈ V | T(u) = b} = {u ∈ V | (T* ∘ T)(u) = T*(b)}.
37. Determine which of the following complex square matrices are unitary.

    (i) ( 1 0 0 ; 0 0 i ; 0 i 0 ),  (ii) ( 0 i 0 ; 0 0 i ; i 0 0 ),  (iii) ( 1 0 i ; 0 1 0 ; i 0 1 ),
    (iv) ( 1 0 i ; 0 1 1 ; i 1 1 ),  (v) ( 1 0 i ; 0 1 0 ; i 0 0 ).
40. (a) For each of the following matrices A, determine whether A is orthogonally diagonalizable. If so, find an orthogonal matrix P such that P^T AP is a diagonal matrix.

    (i) A = ( 2 1 1 ; 1 2 1 ; 1 1 2 ),  (ii) A = ( 1 0 1 ; 0 1 0 ; 1 0 1 ),  (iii) A = ( 1 0 1 ; 0 1 0 ; 0 0 1 ).

(b) For each of the matrices A in (a), determine whether A is unitarily diagonalizable. If so, find a unitary matrix P such that P*AP is a diagonal matrix.
42. (a) Let V be a finite dimensional complex inner product space and T a normal operator on V. Prove that
(i) T is self-adjoint if and only if all eigenvalues of T are real; and
(ii) T is unitary if and only if all eigenvalues of T have modulus 1.
(b) Restate the results in Part (a) using square matrices.
43. Let V be a finite dimensional complex inner product space and T : V → V a self-adjoint linear operator on V.
(a) Prove that the linear operator T - i IV is invertible.
(b) Prove that the linear operator S = (T + i IV) ∘ (T - i IV)^{-1} is unitary.