
Canonical Forms

Linear Algebra Notes


Satya Mandal
October 25, 2005

1 Introduction
Here F will denote a field and V will denote a vector space of
dimension dim(V ) = n. (In this note, unless otherwise stated, n = dim(V ).)
We will study operators T on V. The goal is to investigate whether
we can find a basis e1 , . . . , en such that

the matrix of T = diagonal(λ1 , λ2 , . . . , λn )

is a diagonal matrix. This will mean that

T (ei ) = λi ei .

2 Characteristic Values
2.1 Basic Definitions and Facts
Here we will discuss basic facts.
2.1 (Definition.) Let V be a vector space over a field F and T ∈
L(V, V ) be a linear operator.

1. A scalar λ ∈ F is said to be a characteristic value of T if

   T (e) = λe for some e ∈ V with e ≠ 0.

   A characteristic value is also known as an eigen value.

2. This non-zero element e ∈ V above is called a characteristic
   vector of T associated to λ. A characteristic vector is also
   known as an eigen vector.

3. Write
   N (λ) = {v ∈ V : T (v) = λv}.
   Then N (λ) is a subspace of V and is said to be the characteristic
   space or eigen space of T associated to λ.
2.2 (Lemma.) Let V be a vector space over a field F and T ∈
L(V, V ). Then T is singular if and only if det(T ) = 0.
Proof. (⇒): We have T (e1 ) = 0 for some e1 ∈ V with e1 ≠ 0. We
can extend e1 to a basis e1 , e2 , . . . , en of V. Let A be the matrix of
T with respect to this basis. Since T (e1 ) = 0, the first column of A is
zero. Therefore,
   det(T ) = det(A) = 0.
So, this implication is established.
(⇐): Suppose det(T ) = 0. Let e1 , e2 , . . . , en be a basis of V and
A be the matrix of T with respect to this basis. So
det(T ) = det(A) = 0.
Therefore, A is not invertible. Hence AX = 0 for some non-zero
column vector X = (c1 , c2 , . . . , cn )^T .
Write v = c1 e1 + c2 e2 + · · · + cn en . Since not all ci are zero, v ≠ 0.
Also,

   T (v) = ∑ ci T (ei ) = (T (e1 ), T (e2 ), . . . , T (en ))X = (e1 , e2 , . . . , en )AX = 0.

So, T is singular.

The following are some equivalent conditions.

2.3 (Theorem.) Let V be a vector space over a field F and T ∈
L(V, V ) be a linear operator. Let λ ∈ F be a scalar. Then the following
are equivalent:

1. λ is a characteristic value of T.

2. The operator T − λI is singular (or is not invertible).

3. det(T − λI) = 0.

Proof. ((1) ⇒ (2)): We have (T − λI)(e) = 0 for some e ∈ V with
e ≠ 0. So, T − λI is singular and (2) is established.
((2) ⇒ (1)): Since T − λI is singular, we have (T − λI)(e) = 0
for some e ∈ V with e ≠ 0. Therefore, (1) is established.
((2) ⇔ (3)): Immediate from the above lemma.

2.4 (Definition.) Let A ∈ Mn (F) be an n × n matrix with entries
in a field F.

1. A scalar λ ∈ F is said to be a characteristic value of A if
   det(A − λIn ) = 0. Equivalently, λ ∈ F is said to be a characteristic
   value of A if the matrix (A − λIn ) is not invertible.

2. The monic polynomial det(XI − A) is said to be the characteristic
   polynomial of A. Therefore, characteristic values of A are the
   roots of the characteristic polynomial of A.
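
For concreteness, here is a small computational sketch (an addition made in
editing, not part of the original notes; it assumes Python with the sympy
library) that computes the characteristic polynomial and its roots:

    from sympy import Matrix, symbols, factor

    X = symbols('X')
    A = Matrix([[2, 1], [0, 3]])

    # Characteristic polynomial det(X*I - A), monic in X.
    q = A.charpoly(X).as_expr()
    print(factor(q))        # (X - 2)*(X - 3)

    # Its roots are exactly the characteristic values of A.
    print(A.eigenvals())    # {2: 1, 3: 1}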

2.5 (Lemma.) Let A, B ∈ Mn (F) be two n × n matrices with
entries in a field F. If A and B are similar, then they have the same
characteristic polynomial.

Proof. Suppose A, B are similar matrices. Then A = P BP^{-1} for some
invertible matrix P. The characteristic polynomial of A =
det(XI − A) = det(XI − P BP^{-1}) = det(P (XI − B)P^{-1}) = det(XI − B)
= the characteristic polynomial of B.

2.6 (Definitions and Facts) Let V be a vector space over a field
F and T ∈ L(V, V ) be a linear operator.
1. Let A be the matrix of T with respect to some basis E of
V. We define the characteristic polynomial of T to be the
characteristic polynomial of A. Note that this polynomial is
well defined by the above lemma.
2. We say that T is diagonalizable if there is a basis e1 , . . . , en
   such that each ei is a characteristic vector of T. In this case,
   T (ei ) = λi ei for some λi ∈ F. Hence, with respect to this basis,
   the matrix of T = Diagonal(λ1 , λ2 , . . . , λn ).

3. Also note, if T is diagonalizable as above, the characteristic
   polynomial of T = (X − λ1 )(X − λ2 ) · · · (X − λn ), which is
   completely factorizable.

4. Suppose T is diagonalizable, as above. Depending on how many of
   these eigen values λi are distinct, we can rewrite the matrix of
   T. Suppose c1 , c2 , . . . , cr are the distinct eigen values of T.
   Then the matrix of T with respect to some basis of V looks like:

   [ c1 Id1     0      ···     0    ]
   [   0     c2 Id2    ···     0    ]
   [  ···      ···     ···    ···   ]
   [   0        0      ···  cr Idr  ]

   where Id is the identity matrix of order d. So, d1 + d2 + · · · + dr =
   n = dim(V ).
   In this case, the characteristic polynomial of

      T = (X − c1 )^{d1} (X − c2 )^{d2} · · · (X − cr )^{dr} .

   Further,
      di = dim(N (ci )).
   (See (3) of 2.1.)
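
As an illustrative sketch (editor's addition, assuming sympy; the matrix is an
arbitrary example), diagonalization produces exactly this block form:

    from sympy import Matrix

    # Eigen value c1 = 2 with d1 = 2, and c2 = 5 with d2 = 1.
    A = Matrix([[2, 0, 0],
                [0, 2, 1],
                [0, 0, 5]])

    P, D = A.diagonalize()      # A = P*D*P^{-1}
    assert P * D * P.inv() == A
    print(D)                    # Diagonal(2, 2, 5), the block form above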

2.7 (Read Examples) Read Examples 1 and 2 on page 184.
2.2 Decomposition of V
2.8 (Definition) Suppose V is a vector space over a field F with dim(V ) =
n. Let f (X) = a0 + a1 X + a2 X^2 + · · · + ar X^r ∈ F[X] be a polynomial
and T ∈ L(V, V ) be a linear operator. Then, by definition,
   f (T ) = a0 Id + a1 T + a2 T^2 + · · · + ar T^r ∈ L(V, V )
is an operator. So, L(V, V ) becomes a module over F[X].
2.9 (Remark) Suppose V is a vector space over a field F with dim(V ) =
n. Let T ∈ L(V, V ) be a linear operator. Let f (X) be the characteristic
polynomial of T. We have an understandable interest in how f (T ) behaves.
2.10 (Lemma) Suppose V is a vector space over a field F with dim(V ) =
n. Let T ∈ L(V, V ) be a linear operator. Let f (X) ∈ F[X] be any
polynomial. Suppose
   T (v) = λv
for some v ∈ V and λ ∈ F. Then
   f (T )(v) = f (λ)v.
The proof is obvious. This means if λ is an eigen value of T, then
f (λ) is an eigen value of f (T ).
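
A quick numerical check of this lemma (editor's sketch, assuming sympy; the
matrix, eigen vector and polynomial are arbitrary choices):

    from sympy import Matrix, symbols, eye

    X = symbols('X')
    A = Matrix([[2, 1], [0, 3]])
    v = Matrix([1, 1])              # A*v = 3*v, so λ = 3

    f = X**2 - 4*X + 1              # an arbitrary polynomial
    fA = A**2 - 4*A + eye(2)        # f(T), evaluated term by term

    # f(T)(v) = f(λ)v: here f(3) = -2.
    assert fA * v == f.subs(X, 3) * v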
2.11 (Lemma) Suppose V is a vector space over a field F with dim(V ) =
n. Let T ∈ L(V, V ) be a linear operator. Suppose c1 , . . . , ck are the
distinct eigen values of T. Let
   Wi = N (ci )
be the eigen space of T associated to ci . Write
   W = W1 + W2 + · · · + Wk .
Then
   dim(W ) = dim(W1 ) + dim(W2 ) + · · · + dim(Wk ).
Indeed, if
   Ei = {eij ∈ Wi : j = 1, . . . , di }
is a basis of Wi , then
   E = {eij ∈ Wi : j = 1, . . . , di ; i = 1, . . . , k}
is a basis of W.

Proof. We only need to prove the last part. So, let

   ∑ λij eij = 0

for some scalars λij ∈ F. Write

   ωi = ∑_{j=1}^{di} λij eij .

Then ωi ∈ Wi and

   ω1 + ω2 + · · · + ωk = 0.   (I)
We will first prove that ω1 = 0.
Since
   T (eij ) = ci eij ,
for any polynomial f (X) ∈ F[X] we have
   f (T )(eij ) = f (ci )eij .
Therefore,

   f (T )(ωi ) = ∑_{j=1}^{di} λij f (T )(eij ) = ∑_{j=1}^{di} λij f (ci )eij = f (ci )ωi .   (II)

Now let

   g(X) = ∏_{i=2}^{k} (X − ci ) / ∏_{i=2}^{k} (c1 − ci ).

Note g(X) is a polynomial. Also note this definition/expression
makes sense because c1 , . . . , ck are distinct. And also g(c1 ) = 1 and
g(c2 ) = g(c3 ) = · · · = g(ck ) = 0.
Use (II) and apply g(T ) to (I). We get

   0 = g(T )(∑_{i=1}^{k} ωi ) = ∑_{i=1}^{k} g(T )(ωi ) = ∑_{i=1}^{k} g(ci )ωi = ω1 .

Similarly, ωi = 0 for i = 1, . . . , k. This means

   0 = ωi = ∑_{j=1}^{di} λij eij .
Since Ei is a basis, λij = 0 for all i, j and the proof is complete.

Following is the final theorem in this section.


2.12 (Theorem) Suppose V is a vector space over a field F with dim(V ) =
n. Let T ∈ L(V, V ) be a linear operator. Suppose c1 , . . . , ck are the
distinct eigen values of T. Let
Wi = N (ci )
be the eigen space of T associated to ci . Then the following are
equivalent:
1. T is diagonalizable.
2. The characteristic polynomial for T is
   f = (X − c1 )^{d1} (X − c2 )^{d2} · · · (X − ck )^{dk}
and dim(Wi ) = di for i = 1, . . . , k.
3. dim(W1 ) + dim(W2 ) + · · · + dim(Wk ) = dim(V ).

Proof. ((1) ⇒ (2)): This is in fact obvious. If c1 , . . . , ck are the
distinct eigen values and T is diagonalizable, then the matrix of T
is as in (4) of (2.6). Therefore, we can compute the characteristic
polynomial using this matrix and (2) is established.
((2) ⇒ (3)): We have dim(V ) = degree(f ). Therefore,
dim(V ) = d1 + d2 + · · · + dk = dim(W1 ) + dim(W2 ) + · · · + dim(Wk ).
Hence (3) is established.
((3) ⇒ (1)): Write W = W1 + · · · + Wk . Then, by lemma 2.11,
dim(W ) = dim(W1 ) + dim(W2 ) + · · · + dim(Wk ).
Therefore, by (3), dim(V ) = dim(W ), and so V = W. By lemma 2.11,
combining bases of the Wi gives a basis of V consisting of eigen
vectors of T. Hence (1) is established and the proof is complete.

In fact, I would like to restate the "final theorem" 2.12 in terms
of direct sums of linear subspaces. So, I need to define the direct
sum of vector spaces.
2.13 (Definition) Let V be a vector space over F and V1 , V2 , . . . , Vk
be subspaces of V. We say that V is the direct sum of V1 , V2 , . . . , Vk
if each element x ∈ V can be written uniquely as

x = ω1 + ω2 + · · · + ω k

with ωi ∈ Vi .
Equivalently, if

1. V = V1 + V2 + · · · + Vk , and

2. ω1 + ω2 + · · · + ωk = 0 with ωi ∈ Vi implies that ωi = 0 for
   i = 1, . . . , k.

If V is the direct sum of V1 , V2 , . . . , Vk , then we write

V = V 1 ⊕ V2 ⊕ · · · ⊕ V k .

Following is a proposition on direct sum decomposition.

2.14 (Proposition) Let V be a vector space over F with dim(V ) =
n < ∞. Let V1 , V2 , . . . , Vk be subspaces of V. Then

V = V 1 ⊕ V2 ⊕ · · · ⊕ V k

if and only if V = V1 + V2 + · · · + Vk and

dim(V ) = dim(V1 ) + dim(V2 ) + · · · + dim(Vk ).

Proof. (⇒): Obvious.
(⇐): Let Ei = {eij : j = 1, . . . , di } be a basis of Vi . Let E = {eij :
j = 1, . . . , di ; i = 1, . . . , k}. Since V = V1 + V2 + · · · + Vk , we have
V = Span(E). Since dim(V ) = cardinality(E), E forms a
basis of V. Now it follows that if ω1 + · · · + ωk = 0 with ωi ∈ Vi , then
ωi = 0 ∀i. This completes the proof.

Now we restate the final theorem 2.12 in terms of direct sum.

2.15 (Theorem) Suppose V is a vector space over a field F with dim(V ) =
n. Let T ∈ L(V, V ) be a linear operator. Suppose c1 , . . . , ck are the
distinct eigen values of T. Let

Wi = N (ci )

be the eigen space of T associated to ci . Then the following are
equivalent:

1. T is diagonalizable.

2. The characteristic polynomial for T is

   f = (X − c1 )^{d1} (X − c2 )^{d2} · · · (X − ck )^{dk}

and dim(Wi ) = di for i = 1, . . . , k.

3. dim(W1 ) + dim(W2 ) + · · · + dim(Wk ) = dim(V ).

4. V = W1 ⊕ W2 ⊕ · · · ⊕ Wk .

Proof. We have already proved

   (1) ⇐⇒ (2) ⇐⇒ (3).

We will prove (3) ⇐⇒ (4).
((4) ⇒ (3)): This part is obvious, because we can combine bases of
the Wi to get a basis of V.
((3) ⇒ (4)): Write W = W1 + W2 + · · · + Wk . Because of (3) and by
lemma 2.11, dim(W ) = ∑ dim(Wi ) = dim(V ). Therefore, V = W =
W1 + W2 + · · · + Wk .
Since dim(V ) = ∑ dim(Wi ), by proposition 2.14, V = W1 ⊕ W2 ⊕
· · · ⊕ Wk and the proof is complete.
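
Criterion (3) is easy to test computationally. A sketch (editor's addition,
assuming sympy, whose eigenvects returns each eigen value together with a
basis of its eigen space):

    from sympy import Matrix

    A = Matrix([[5, 4, 2],
                [4, 5, 2],
                [2, 2, 2]])

    # Each entry is (c_i, algebraic multiplicity, basis of W_i = N(c_i)).
    dims = [len(basis) for _, _, basis in A.eigenvects()]
    print(sum(dims) == A.rows)      # True, so this operator is diagonalizable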

3 Annihilating Polynomials

Suppose K is a commutative ring and M is a K−module. For
x ∈ M, we define the annihilator of x as

   ann(x) = {λ ∈ K : λx = 0}.

Note that ann(x) is an ideal of K. (That means

   ann(x) + ann(x) ⊆ ann(x) and K · ann(x) ⊆ ann(x).)

We shall consider the annihilator of a linear operator, as follows.

3.1 Minimal (monic) polynomials

3.1 (Facts) Let V be a vector space over a field F with dim(V ) = n.
Recall, we have seen that M = L(V, V ) is an F[X]−module. For
f (X) ∈ F[X] and T ∈ L(V, V ), scalar multiplication is defined by
f ∗ T = f (T ) ∈ L(V, V ).

1. So, for a linear operator T ∈ L(V, V ), the annihilator of T,

   ann(T ) = {f (X) ∈ F[X] : f (T ) = 0},

   is an ideal of the polynomial ring F[X].

2. Note that ann(T ) is a non-zero proper ideal. It is non-zero
   because dim(L(V, V )) = n^2 and hence

   1, T, T^2 , . . . , T^{n^2}

   is a linearly dependent set.

3. Also recall that any ideal I of F[X] is a principal ideal, which
   means that I = F[X]p, where p is the non-zero monic polynomial
   in I of least degree.

4. Therefore,
ann(T ) = F[X]p(X)
where p(X) is the monic polynomial of least degree such that
p(T ) = 0.
This polynomial p(X) is defined to be the minimal monic
polynomial (MMP) for T.

5. Let us consider similar concepts for square matrices.

   (a) For an n × n matrix A, we define the annihilator ann(A) of A
       and the minimal monic polynomial of A in a similar way.
   (b) Suppose two n × n matrices A, B are similar and A = P BP^{-1}.
       Then for a polynomial f (X) ∈ F[X] we have

       f (A) = P f (B)P^{-1}.

   (c) Therefore ann(A) = ann(B).
   (d) Hence A and B have the same minimal monic polynomial.
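
To see the MMP concretely, here is a sketch (an editorial addition; rather
than rely on any built-in minimal-polynomial routine, a candidate divisor of
the characteristic polynomial is tested by hand):

    from sympy import Matrix, symbols, zeros, expand

    X = symbols('X')
    A = Matrix([[2, 0, 0],
                [0, 2, 0],
                [0, 0, 3]])    # characteristic polynomial (X-2)^2 (X-3)

    def eval_poly(f, M):
        # Evaluate a polynomial in X at the square matrix M.
        out = zeros(M.rows, M.rows)
        for (k,), c in f.as_poly(X).terms():
            out += c * M**k
        return out

    # The MMP divides the characteristic polynomial and has the same
    # roots, so test the smallest candidate (X-2)(X-3) first.
    candidate = expand((X - 2) * (X - 3))
    print(eval_poly(candidate, A) == zeros(3, 3))   # True: MMP = (X-2)(X-3)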

3.2 Comparison of minimal monic and characteristic polynomials

Given a linear operator T we can think of two polynomials - the
minimal monic polynomial and the characteristic polynomial of T.
We will compare them.

3.2 (Theorem) Let V be a vector space over a field F with dim(V ) =
n. Suppose p(X) is the minimal monic polynomial of T and g(X) is
the characteristic polynomial of T. Then p, g have the same roots in
F. (Although multiplicities may differ.)
The same statement holds for matrices.

Proof. We will prove, for c ∈ F,

p(c) = 0 ⇐⇒ g(c) = 0.

Recall g(X) = det(XI − A), where A is the matrix of T with
respect to some basis. Also, by theorem 2.3, g(c) = 0 if and only if
cI − T is singular.
Now suppose p(c) = 0. So, p(X) = (X − c)q(X) for some q(X) ∈
F[X]. Since degree(q) < degree(p), by minimality of p we have q(T ) ≠ 0.
So there is v ∈ V such that e = q(T )(v) ≠ 0. Since p(T ) = 0,
we have (T − cI)q(T ) = 0. Hence 0 = (T − cI)q(T )(v) = (T − cI)(e).
So, (T − cI) is singular and hence g(c) = 0. This establishes the proof
of this part.
Now assume that g(c) = 0. Therefore, T − cI is singular. So,
there is a vector e ∈ V with e ≠ 0 such that T (e) = ce. Applying
lemma 2.10 to p, we have

   p(T )(e) = p(c)e.

Since p(T ) = 0 and e ≠ 0, we have p(c) = 0 and the proof is complete.

The above theorem raises the question whether these two polynomials
are the same. The answer is: not in general. But the MMP divides the
characteristic polynomial, as follows.

3.3 (Cayley-Hamilton Theorem) Let V be a vector space over a
field F with dim(V ) = n. Suppose Q(X) is the characteristic polynomial
of T. Then Q(T ) = 0.
In particular, if p(X) is the minimal monic polynomial of T, then

   p | Q.

Proof. Write

   K = F[T ] = {f (T ) : f ∈ F[X]}.

Observe that
   F ⊆ K ⊆ L(V, V )
are subrings. Note Q(T ) ∈ K and we will prove Q(T ) = 0.
Let e1 , . . . , en be a basis of V and A = (aij ) be the matrix of T.
So, we have

   (T (e1 ), T (e2 ), . . . , T (en )) = (e1 , e2 , . . . , en )A.   (I)

Consider the following matrix, with entries in K:

   B = [ T − a11 I    −a12 I       −a13 I     ···    −a1n I   ]
       [  −a21 I     T − a22 I     −a23 I     ···    −a2n I   ]
       [  −a31 I      −a32 I      T − a33 I   ···    −a3n I   ]
       [    ···         ···          ···      ···      ···    ]
       [  −an1 I      −an2 I       −an3 I     ···  T − ann I   ]

Note that

   Q(X) = det(In X − A).

Therefore (this is the main point to understand in this proof),

   Q(T ) = det(In T − A) = det(B).

The above equation (I) says that

(e1 , e2 , . . . , en )B = (0, 0, . . . , 0).

Multiply this equation by Adj(B), and we get

(e1 , e2 , . . . , en )BAdj(B) = (0, 0, . . . , 0)Adj(B) = (0, 0, . . . , 0).

Therefore,

(e1 , e2 , . . . , en )(det(B))In = (0, 0, . . . , 0).

Therefore,

(e1 , e2 , . . . , en )(Q(T ))In = (0, 0, . . . , 0).

This implies that

Q(T )(ei ) = 0 ∀i = 1, . . . , n.

Hence Q(T ) = 0 and the proof is complete.
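
A quick sanity check of the theorem on a 2 × 2 example (an editorial sketch,
assuming sympy):

    from sympy import Matrix, symbols, zeros, eye

    X = symbols('X')
    A = Matrix([[1, 2], [3, 4]])
    Q = A.charpoly(X)              # Q(X) = X^2 - 5X - 2

    # Evaluate Q at A by Horner's scheme; Cayley-Hamilton says Q(A) = 0.
    QA = zeros(2, 2)
    for c in Q.all_coeffs():
        QA = QA * A + c * eye(2)
    print(QA)                      # Matrix([[0, 0], [0, 0]])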

4 Invariant Subspaces

4.1 (Definition) Let V be a vector space over the field F and
T : V → V be a linear operator. A subspace W of V is said to be
invariant under T if
   T (W ) ⊆ W.
4.2 (Examples) Let V be a vector space over the field F and
   T : V → V
be a linear operator.

1. (Trivial Examples) V and {0} are invariant under T.

2. Suppose e is an eigen vector of T and W = Fe. Then W is
   invariant under T.

3. Suppose λ is an eigen value of T and W = N (λ) is the eigen
   space of λ. Then W is invariant under T.
4.3 (Remark) Let V be a vector space over the field F and
   T : V → V
be a linear operator. Suppose W is an invariant subspace of T. Then
the restriction map
   T|W : W → W
is a well defined linear operator on W. So, the following diagram,
in which the vertical arrows are the inclusion W ⊆ V,

           T|W
      W ---------> W
      |            |
      v     T      v
      V ---------> V

commutes.
4.4 (Remark) Let V be a vector space over the field F with dim(V ) =
n < ∞. Let
   T : V → V
be a linear operator. Suppose W is an invariant subspace of T and
   T|W : W → W
is the restriction of T.

1. Let p be the characteristic polynomial of T and q be the
   characteristic polynomial of T|W . Then q | p.

2. Also let P be the minimal (monic) polynomial of T and Q be
   the minimal (monic) polynomial of T|W . Then Q | P.

Proof. The proof of (2) is easier. Since P (T ) = 0, we also have P (T|W ) =
0. Therefore
   P (X) ∈ ann(T|W ) = F[X]Q(X).
Hence Q | P and the proof of (2) is complete.
To prove (1), let E = {e1 , e2 , . . . , er } be a basis of W. Extend
this basis to a basis E′ = {e1 , e2 , . . . , er , er+1 , . . . , en } of V. Let A be
the matrix of T with respect to E′ and B be the matrix of T|W with
respect to E. So, we have

   (T (e1 ), . . . , T (er )) = (e1 , . . . , er )B

and

   (T (e1 ), . . . , T (er ), T (er+1 ), . . . , T (en )) = (e1 , . . . , er , er+1 , . . . , en )A.

Therefore, A can be written in blocks as follows:

   A = [ B  C ]
       [ 0  D ]

So,

   p(X) = det(In X − A) = det(Ir X − B) det(In−r X − D)

and
   q(X) = det(Ir X − B).
Therefore q | p. The proof is complete.
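
The block factorization above can be checked numerically. A sketch (editor's
addition, assuming sympy; the blocks B, C, D are arbitrary illustrations):

    from sympy import Matrix, symbols, factor, cancel

    X = symbols('X')
    B = Matrix([[1, 1], [0, 2]])              # matrix of T restricted to W
    C = Matrix([[5], [6]])
    D = Matrix([[3]])
    A = Matrix.vstack(Matrix.hstack(B, C),
                      Matrix.hstack(Matrix.zeros(1, 2), D))

    p = A.charpoly(X).as_expr()
    q = B.charpoly(X).as_expr()
    print(factor(p))       # (X - 1)*(X - 2)*(X - 3)
    print(cancel(p / q))   # X - 3, so q | p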

4.5 (Definitions and Remarks) 1. Suppose F is a field. Recall
an n × n matrix A = (aij ) is called an upper triangular
matrix if aij = 0 for all i, j with 1 ≤ j < i ≤ n. Similarly, we
define lower triangular matrices.

2. Now let V be a vector space over F with dim V = n < ∞. A
   linear operator T : V → V is said to be triangulable if there
   is a basis E = {e1 , . . . , en } of V such that the matrix of T is
   an (upper) triangular matrix. (Note that it does not make a
   difference if we say "upper" or "lower" triangular. To avoid
   confusion, we will assume upper triangular.)

3. Now suppose a linear operator T is triangulable. So, for a basis
   E = {e1 , . . . , en } we have (T (e1 ), . . . , T (en )) = (e1 , . . . , en )A
   for some triangular matrix A = (aij ). We assume that A is
   upper triangular. For 1 ≤ r ≤ n, write Wr = Span(e1 , . . . , er ).
   Then Wr is invariant under T.

4. (Factorization.) Suppose T ∈ L(V, V ) is triangulable. So,
   the matrix of T, with respect to a basis e1 , . . . , en , is an
   upper triangular matrix A = (aij ). Note that the characteristic
   polynomial q of T is given by

   q(X) = det(IX − A) = (X − a11 )(X − a22 ) · · · (X − ann ).

   Therefore, q is completely factorizable. So, we have

   q(X) = (X − c1 )^{d1} (X − c2 )^{d2} · · · (X − ck )^{dk} ,

   where d1 + d2 + · · · + dk = dim V and c1 , . . . , ck are the distinct
   eigen values of T.
   Also, since the minimal monic polynomial p of T divides q, it
   follows that p is also completely factorizable. Therefore,

   p(X) = (X − c1 )^{r1} (X − c2 )^{r2} · · · (X − ck )^{rk} ,

   where ri ≤ di for i = 1, . . . , k.

4.6 (Theorem) Let V be a vector space over F with finite
dimension dim V = n and T : V → V be a linear operator on V.
Then T is triangulable if and only if the minimal polynomial p of T
is a product of linear factors.

Proof. (⇒): We have already shown in (4) of Remark 4.5 that if
T is triangulable, then the MMP p factors into linear factors.
(⇐): Now assume that the MMP p factors as

   p(X) = (X − c1 )^{r1} (X − c2 )^{r2} · · · (X − ck )^{rk} .

Let q denote the characteristic polynomial of T. Since p and q have
the same roots, q(c1 ) = q(c2 ) = · · · = q(ck ) = 0. Now we will split
the proof into several steps.
Step-1: Write λ1 = c1 . By (2.3), λ1 is an eigen value of T. So,
there is a non-zero vector e1 ∈ V such that T (e1 ) = λ1 e1 . Write
W1 = Span(e1 ).
Step-2: Extend e1 to a basis e1 , E2 , . . . , En of V. Write V1 = Span(E2 , . . . , En ).
Note that
1. e1 is linearly independent and dim W1 = 1.

2. W1 is invariant under T.

3. dim V1 = n − 1 and V = W1 ⊕ V1 .
Let v ∈ V1 and T (v) = µ1 e1 + µ2 E2 + · · · + µn En for some
µ1 , . . . , µn ∈ F. Define T1 (v) = µ2 E2 + · · · + µn En ∈ V1 . Then

   T1 : V1 → V1

is a well defined linear operator on V1 . Diagrammatically, T1 is given
by the square below, where the left arrow is the inclusion V1 ⊆ V and
pr : V = W1 ⊕ V1 → V1 is the projection map:

            T1
      V1 ---------> V1
      |             ^
      |             | pr
      v      T      |
      V  ---------> V

Let p1 be the MMP of T1 . Now, we proceed to prove that p1 | p.

Claim: ann(T ) ⊆ ann(T1 ).

To prove this claim, let A be the matrix of T with respect to e1 , E2 , . . . , En
and B be the matrix of T1 with respect to E2 , . . . , En . Since W1 is
invariant under T, we have

   A = [ λ1  C ]
       [ 0   B ]

Therefore, the matrix of T^m is given by

   A^m = [ λ1^m   Cm  ]
         [  0    B^m  ]

for some matrix Cm . Therefore, for a polynomial f (X) ∈ F[X], the
matrix f (A) of f (T ) is given by

   f (A) = [ f (λ1 )   C∗   ]
           [   0     f (B)  ]

for some matrix C∗ . So, if f (X) ∈ ann(T ), then f (T ) = 0. Hence
f (A) = 0. This implies f (B) = 0 and hence f (T1 ) = 0. So, ann(T ) ⊆
ann(T1 ) and the claim is established.
Therefore, p1 | p. So, p1 satisfies the hypothesis of the theorem.
So, there is an element e2 ∈ V1 such that T1 (e2 ) = λ2 e2 , where
(X − λ2 ) | p1 | p.
It also follows that T (e2 ) = a12 e1 + λ2 e2 for some a12 ∈ F.
Step-3: Write W2 = Span(e1 , e2 ).
Note that

1. e1 , e2 are linearly independent and dim W2 = 2.

2. W2 is invariant under T.

3. Also
   (T (e1 ), T (e2 )) = (e1 , e2 ) [ λ1  a12 ]
                                  [ 0   λ2  ]

Step-4: If W2 ≠ V (that is, if 2 < n), the process continues. We
extend e1 , e2 to a basis e1 , e2 , E3 , . . . , En of V. (They are different
Ei , not the same as in the previous steps.) Write V2 = Span(E3 , . . . , En ).
Note

1. dim(V2 ) = n − 2.

2. V = W2 ⊕ V2 .

As in the previous steps, define T2 : V2 → V2 as in the diagram
(you should define it explicitly):

            T2
      V2 ---------> V2
      |             ^
      |             | pr
      v      T      |
      V  ---------> V

where pr : V = W2 ⊕ V2 → V2 is the projection map.
Let p2 be the MMP of T2 . Using the same argument, we can prove
p2 | p. Then we can find λ3 ∈ F and e3 ∈ V2 such that T2 (e3 ) = λ3 e3 ,
where (X − λ3 ) | p2 | p. Therefore T (e3 ) = a13 e1 + a23 e2 + λ3 e3 .
So, we have

 
   (T (e1 ), T (e2 ), T (e3 )) = (e1 , e2 , e3 ) [ λ1  a12  a13 ]
                                                [ 0   λ2   a23 ]
                                                [ 0   0    λ3  ]

Final Step: The process continues for n steps and we get a linearly
independent set (basis) e1 , e2 , . . . , en such that

   (T (e1 ), T (e2 ), T (e3 ), . . . , T (en )) =

                              [ λ1  a12  a13  ...  a1n ]
                              [ 0   λ2   a23  ...  a2n ]
   (e1 , e2 , e3 , . . . , en ) [ 0   0    λ3   ...  a3n ] .
                              [ ... ...  ...  ...  ... ]
                              [ 0   0    0    ...  λn  ]

This completes the proof.
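
For computing such a triangular form in practice, one possibility (an
editorial sketch, not part of the original notes) is sympy's Jordan form,
which is upper triangular exactly when the characteristic polynomial
factors into linear factors:

    from sympy import Matrix

    A = Matrix([[2, 1, 0],
                [0, 2, 0],
                [0, 1, 3]])

    # A = P*J*P^{-1} with J upper triangular (the Jordan form is a
    # special upper triangular matrix).
    P, J = A.jordan_form()
    assert P * J * P.inv() == A
    assert all(J[i, j] == 0 for i in range(3) for j in range(i))
    print(J)    # upper triangular, eigen values on the diagonal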

Recall a field F is said to be an algebraically closed field if
every non-constant polynomial f ∈ F[X] has a root in F. It follows
that F is an algebraically closed field if and only if every non-constant
polynomial f ∈ F[X] is a product of linear polynomials.

4.7 (Theorem) Suppose F is an algebraically closed field. Then
every n × n matrix over F is similar to a triangular matrix.

Proof. Let A be an n × n matrix over F. Consider the operator

   T : F^n → F^n

defined by T (X) = AX. Now use the above theorem.

4.8 (Theorem) Let V be a vector space over F with finite
dimension dim V = n and T : V → V be a linear operator on V.
Then T is diagonalizable if and only if the minimal polynomial p of
T is of the form

   p = (X − c1 )(X − c2 ) · · · (X − ck ),

where c1 , c2 , . . . , ck are the distinct eigen values of T.

Proof. (⇒): Suppose T is diagonalizable. Then, there is a basis
e1 , . . . , en of V such that

   (T (e1 ), T (e2 ), . . . , T (en )) =

                      [ c1 Id1    0       0     ...    0     ]
                      [  0      c2 Id2    0     ...    0     ]
   (e1 , . . . , en ) [  0        0     c3 Id3  ...    0     ] .
                      [ ...      ...     ...    ...   ...    ]
                      [  0        0       0     ...  ck Idk  ]

Write g(X) = (X − c1 )(X − c2 ) · · · (X − ck ); we will prove g(T ) = 0.
For i = 1, . . . , d1 we have (T − c1 )(ei ) = 0. Since the factors of g(T )
commute, for such i,

   g(T )(ei ) = (T − c2 ) · · · (T − ck )(T − c1 )(ei ) = 0.

Similarly, g(T )(ei ) = 0 for all i = 1, . . . , n. So, g(T ) = 0. Hence p | g.
Since c1 , . . . , ck are roots of both, we have p = g. Hence this part of
the proof is complete.

(⇐): We assume that p(X) = (X − c1 )(X − c2 ) · · · (X − ck ) and
prove that T is diagonalizable. Let Wi = N (ci ) be the eigen space of ci .
Let W = ∑_{i=1}^{k} Wi be the sum of the eigen spaces. Assume that W ≠ V.
Now we will repeat some portions of the proof of theorem 4.6 and
get a contradiction. Let e1 , . . . , em be a basis of W and e1 , . . . , em , Em+1 , . . . , En
be a basis of V. Write V′ = Span(Em+1 , . . . , En ). Note

1. W is invariant under T.

2. V = W ⊕ V′.

Define T′ : V′ → V′ according to the diagram:

            T′
      V′ ---------> V′
      |             ^
      |             | pr
      v      T      |
      V  ---------> V

where pr : V = W ⊕ V′ → V′ is the projection map.


As in the proof of theorem 4.6, the MMP p′ of T′ divides p.
Therefore, there is an element e ∈ V′ such that T′(e) = λe for some
λ ∈ {c1 , c2 , . . . , ck }. We may assume λ = c1 . Hence

   T (e) = a1 e1 + · · · + am em + c1 e,

where ai ∈ F. We can rewrite this equation as

   T (e) = β + c1 e,

where β = ω1 + ω2 + · · · + ωk ∈ W and ωi ∈ Wi . So,

   (T − c1 )(e) = β.

Since T (W ) ⊆ W, for h(X) ∈ F[X] we have h(T )(β) ∈ W. Write
p = (X − c1 )q and q(X) − q(c1 ) = h(X)(X − c1 ). So,

   (q(T ) − q(c1 ))(e) = h(T )(T − c1 )(e) = h(T )(β)

is in W. Also,

   0 = p(T )(e) = (T − c1 )q(T )(e).

Therefore q(T )(e) ∈ W1 ⊆ W. So, q(c1 )e = q(T )(e) − (q(T ) −
q(c1 ))(e) is in W. Since q(c1 ) ≠ 0, we get e ∈ W. This is a
contradiction and the proof is complete.
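
The theorem can be checked on small examples (an editorial sketch, assuming
sympy's is_diagonalizable):

    from sympy import Matrix

    # MMP of the Jordan block is (X - 2)^2: a repeated linear factor.
    N = Matrix([[2, 1], [0, 2]])
    print(N.is_diagonalizable())    # False

    # MMP here is (X - 2)(X - 3): distinct linear factors.
    D = Matrix([[2, 0], [0, 3]])
    print(D.is_diagonalizable())    # True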

5 Simultaneous Triangulation and Diagonalization

Suppose 𝓕 ⊆ L(V, V ) is a family of linear operators on a vector space
V over a field F. We say that 𝓕 is a commuting family if T U = U T
for all U, T ∈ 𝓕.
In this section we try to find a basis E of V so that, for all T in
a family 𝓕, the matrix of T is diagonal (or triangular) with respect
to E. Following are the main theorems.

5.1 (Theorem) Let V be a finite dimensional vector space with
dim V = n over a field F. Let 𝓕 ⊆ L(V, V ) be a commuting and
triangulable family of operators on V. Then there is a basis E =
{e1 , . . . , en } such that, for every T ∈ 𝓕, the matrix of T with respect
to E is a triangular matrix.

Proof. The proof is fairly similar to that of theorem 4.6. We
will omit the proof. You can work it out when you need it.

Following is the matrix version of the above theorem.

5.2 (Theorem) Let 𝓕 ⊆ Mnn (F) be a commuting and triangulable
family of n × n matrices. Then there is an invertible matrix P
such that, for every A ∈ 𝓕, the matrix P AP^{-1} is an upper
triangular matrix.

5.3 (Theorem) Let V be a finite dimensional vector space with
dim V = n over a field F. Let 𝓕 ⊆ L(V, V ) be a commuting and
diagonalizable family of operators on V. Then there is a basis E =
{e1 , . . . , en } such that, for every T ∈ 𝓕, the matrix of T with respect
to E is a diagonal matrix.

Proof. We will omit the proof. You can work it out when you need it.
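
For a concrete instance of theorem 5.3 (an editorial sketch, assuming sympy;
here the first matrix has distinct eigen values, so its eigenbasis already
diagonalizes anything commuting with it, while in general one refines the
eigen spaces):

    from sympy import Matrix, eye

    A = Matrix([[1, 1], [1, 1]])
    B = A**2 + 3 * eye(2)           # a polynomial in A, so A*B == B*A
    assert A * B == B * A

    # A has distinct eigen values 0 and 2, so its eigenbasis P
    # simultaneously diagonalizes every operator commuting with A.
    P, D = A.diagonalize()
    print(D)                        # diagonal, entries 0 and 2
    print(P.inv() * B * P)          # also diagonal, entries 3 and 7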

6 Direct Sum

We have already touched on part of this section. We gave the definition 2.13
of the direct sum of subspaces. Following is an exercise. Note that we
can make the same definition for any subspaces Wi of V.

6.1 (Exercise) Let V be a finite dimensional vector space over a
field F. Let W1 , . . . , Wk be subspaces of V. Then V = W1 ⊕ W2 ⊕ · · · ⊕
Wk if and only if V = W1 + W2 + · · · + Wk and, for each j = 2, . . . , k,
we have
(W1 + · · · + Wj−1 ) ∩ Wj = {0}.

6.2 (Examples) (1) R^2 = Re1 ⊕ Re2 , where e1 = (1, 0), e2 = (0, 1).
(2) Let V = Mnn (F). Let U be the subspace of all upper triangular
matrices. Let L be the subspace of all strictly lower triangular matrices
(that means the diagonal entries are zero). Then V = U ⊕ L.
(3) Recall from theorem 2.15 that V is the direct sum of the eigen spaces
of a diagonalizable operator T.

We used the word 'projection' before in the context of direct sums.
Here we define projections.

6.3 (Definition) Let V be a finite dimensional vector space over
a field F. A linear operator E : V → V is said to be a projection
if E^2 = E.

6.4 (Observations) Let V be a finite dimensional vector space
over a field F. Let E : V → V be a projection. Let R = range(E)
and N = N_E be the null space of E. Then

1. For x ∈ V, we have x ∈ R ⇔ E(x) = x.

2. V = N ⊕ R.

3. For v ∈ V, we have v = (v − E(v)) + E(v) ∈ N ⊕ R.

4. Let V = W1 ⊕ W2 ⊕ · · · ⊕ Wk be a direct sum of subspaces Wi .
   Define operators Ei : V → V by

   Ei (v) = vi where v = v1 + · · · + vk , vi ∈ Wi .

   Note the Ei are well defined projections with

   range(Ei ) = Wi and N_{Ei} = Wi′,

   where Wi′ = W1 ⊕ · · · ⊕ Wi−1 ⊕ Wi+1 ⊕ · · · ⊕ Wk .

Following is a theorem on projections.

6.5 (Theorem) Let V be a finite dimensional vector space over a
field F. Suppose V = W1 ⊕ W2 ⊕ · · · ⊕ Wk is a direct sum of subspaces
Wi . Then there are k linear operators E1 , . . . , Ek on V such that

1. each Ei is a projection (i.e. Ei^2 = Ei );

2. Ei Ej = 0 ∀ i ≠ j;

3. E1 + E2 + · · · + Ek = I;

4. range(Ei ) = Wi .

Conversely, if E1 , . . . , Ek are k linear operators on V satisfying
conditions (2) and (3) above, then each Ei is a projection (i.e. (1) holds),
and with Wi = Ei (V ) we have V = W1 ⊕ W2 ⊕ · · · ⊕ Wk .

Proof. The proof is easy and left as an exercise. First, try it with
k = 2 operators, if you like.
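
A concrete instance of the theorem (an editorial sketch: the two coordinate
projections of R^2):

    from sympy import Matrix, eye, zeros

    # R^2 = W1 ⊕ W2 with W1 = F(1,0) and W2 = F(0,1).
    E1 = Matrix([[1, 0], [0, 0]])   # projection onto W1 along W2
    E2 = Matrix([[0, 0], [0, 1]])   # projection onto W2 along W1

    assert E1**2 == E1 and E2**2 == E2     # (1) projections
    assert E1 * E2 == zeros(2, 2)          # (2) E_i E_j = 0 for i != j
    assert E1 + E2 == eye(2)               # (3) they sum to I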

Homework: page 213, Exercise 1, 3, 4-7, 9.

7 Invariant Direct Sums

This section deals with some very natural concepts. Suppose
V is a vector space over a field F and V = W1 ⊕ W2 ⊕ · · · ⊕ Wk ,
where the Wi are subspaces. Suppose for each i = 1, . . . , k we are given
a linear operator Ti ∈ L(Wi , Wi ) on Wi . Then we can define a linear
operator T : V → V such that

   T (∑_{i=1}^{k} vi ) = ∑_{i=1}^{k} Ti (vi ) for vi ∈ Wi .

So the restriction T|Wi = Ti . This means that the diagram below,
in which the vertical arrows are the inclusions Wi ⊆ V,

            Ti
      Wi ---------> Wi
      |             |
      v      T      v
      V  ---------> V

commutes.

Conversely, suppose V is a vector space over a field F and V =
W1 ⊕ W2 ⊕ · · · ⊕ Wk , where the Wi are subspaces. Let T ∈ L(V, V ) be
a linear operator. Assume that the Wi are invariant under T. Then we
can define linear operators Ti : Wi → Wi by Ti (v) = T (v) for v ∈ Wi .
Therefore, the above diagram commutes and T can be reconstructed
from T1 , . . . , Tk , in the same way as above.

8 Primary Decomposition

We studied linear operators T on V under the assumption that the
characteristic polynomial q or the MMP p splits completely into
linear factors. In this section we will not have this assumption. Here
we will exploit the fact that q and p have unique factorizations.

8.1 (Primary Decomposition Theorem) Let V be a vector space
over F with finite dimension dim V = n and T : V → V be a linear
operator on V. Let p be the minimal monic polynomial (MMP) of T
and

   p = p1^{r1} p2^{r2} · · · pk^{rk} ,

where ri > 0 and the pi are distinct irreducible monic polynomials in
F[X]. Let

   Wi = {v ∈ V : pi (T )^{ri} (v) = 0}

be the null space of pi (T )^{ri} . Then

1. V = W1 ⊕ · · · ⊕ Wk ;

2. each Wi is invariant under T ;

3. if Ti = T|Wi : Wi → Wi is the operator on Wi induced by T,
   then the minimal monic polynomial of Ti is pi^{ri} .

Proof. Write

   fi = p / pi^{ri} = ∏_{j≠i} pj^{rj} .

Note that f1 , f2 , . . . , fk have no common factor. So

   GCD(f1 , f2 , . . . , fk ) = 1.

Therefore

   f1 g1 + f2 g2 + · · · + fk gk = 1

for some gi ∈ F[X].
For i = 1, . . . , k, let hi = fi gi and Ei = hi (T ) ∈ L(V, V ). Then

   E1 + E2 + · · · + Ek = ∑ hi (T ) = Id.   (I)
Also, for i ≠ j, note that p | hi hj . Since p(T ) = 0, we have

   Ei Ej = hi (T )hj (T ) = 0.   (II)

Write Wi′ = Ei (V ), the range of Ei . By the converse part of theorem 6.5,
it follows that V = W1′ ⊕ · · · ⊕ Wk′.
By (I), we have T = T E1 + T E2 + · · · + T Ek . Since each Ei = hi (T )
is a polynomial in T, it commutes with T. So

   T (Wi′) = T Ei (V ) = Ei T (V ) ⊆ Ei (V ) = Wi′.
Therefore, Wi′ is invariant under T. We will show that Wi′ = Wi ,
the null space of pi (T )^{ri} .
We have

   pi (T )^{ri} (Wi′) = pi (T )^{ri} fi (T )gi (T )(V ) = p(T )gi (T )(V ) = 0.

So, Wi′ ⊆ Wi , the null space of pi (T )^{ri} .
Now suppose w ∈ Wi . So, pi (T )^{ri} (w) = 0. For j ≠ i, we have
pi^{ri} | fj gj = hj and hence Ej (w) = hj (T )(w) = 0. Therefore w =
∑_{j=1}^{k} Ej (w) = Ei (w) is in Wi′. So, Wi ⊆ Wi′. Therefore Wi = Wi′
and (1) and (2) are established.
Now Ti : Wi → Wi is the restriction of T to Wi . It remains to
show that the MMP of Ti is pi^{ri} . It is enough to show this for i = 1,
that is, that the MMP of T1 is p1^{r1} .
We have p1 (T1 )^{r1} = 0, because W1 is the null space of p1 (T )^{r1} .
Therefore p1^{r1} ∈ ann(T1 ).
Now suppose g ∈ ann(T1 ). So, g(T1 ) = 0. Then

   g(T )f1 (T ) = g(T ) ∏_{j=2}^{k} pj (T )^{rj} .

Since g(T )|W1 = g(T1 ) = 0, g(T ) vanishes on W1 , and also for
j = 2, . . . , k, pj (T )^{rj} vanishes on Wj . Therefore, g(T )f1 (T ) = 0.
Hence p | gf1 . Hence p1^{r1} = p/f1 | g. Therefore p1^{r1} is the MMP of T1
and the proof is complete.
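
As a small illustration of the theorem (an editorial sketch, assuming sympy;
the matrix and its minimal polynomial are chosen for the example), the Wi
can be computed as null spaces of pi (A)^{ri}:

    from sympy import Matrix, eye

    # MMP of A is (X - 2)^2 (X - 3): p1 = X - 2, r1 = 2; p2 = X - 3, r2 = 1.
    A = Matrix([[2, 1, 0],
                [0, 2, 0],
                [0, 0, 3]])

    W1 = ((A - 2*eye(3))**2).nullspace()   # null space of p1(T)^{r1}
    W2 = (A - 3*eye(3)).nullspace()        # null space of p2(T)^{r2}

    # dim(W1) + dim(W2) = 2 + 1 = 3 = dim(V), so V = W1 ⊕ W2.
    print(len(W1), len(W2))                # 2 1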

Remarks. (1) Note that the projections Ei = hi (T ) in the above
theorem are polynomials in T.
(2) Also think about what it means if some (or all) of the irreducible
factors pi = (X − λi ) are linear.