chapters7-8
In this chapter we will introduce some special matrices such as unitary, normal and
Hermitian matrices along with their interesting and special properties. For simplicity, we
shall use $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$ to denote the usual inner product and the Euclidean norm,
respectively, on $F^n$.
For a matrix $A \in \mathbb{C}^{n\times n}$, its conjugate transpose (or Hermitian transpose) is defined by
$$A^H = (\overline{A})^T \qquad (7.1)$$
For example,
$$\begin{pmatrix} 2+i & 3-i \\ 4+i & 7-2i \end{pmatrix}^H = \begin{pmatrix} 2-i & 4-i \\ 3+i & 7+2i \end{pmatrix}$$
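Numerically, (7.1) is just a transpose followed by entrywise conjugation; below is a minimal NumPy sketch (assuming NumPy is available) checking the example above:

```python
import numpy as np

# The matrix from the example above.
A = np.array([[2 + 1j, 3 - 1j],
              [4 + 1j, 7 - 2j]])

# Conjugate transpose: transpose, then conjugate each entry (order is immaterial).
AH = A.conj().T

expected = np.array([[2 - 1j, 4 - 1j],
                     [3 + 1j, 7 + 2j]])
assert np.allclose(AH, expected)
```

Property (1) below, $(A^H)^H = A$, is then `AH.conj().T == A` entrywise.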
The conjugate transpose (or Hermitian transpose) of a matrix has the following
properties:
(1) $(A^H)^H = A$
(2) $(\alpha A + \beta B)^H = \overline{\alpha}A^H + \overline{\beta}B^H$, $\alpha, \beta \in \mathbb{C}$
(3) $(AC)^H = C^H A^H$
Definition 7.1 A matrix $U \in \mathbb{C}^{n\times n}$ is said to be unitary if it satisfies
$$U^H U = U U^H = I_n \qquad (7.2)$$
Equivalently, $U \in \mathbb{C}^{n\times n}$ is unitary iff $U^{-1} = U^H$. In addition, if U is real then it is also
called an orthogonal matrix.
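Definition 7.1 can be verified numerically; below is a sketch with a hypothetical $2\times 2$ unitary matrix (a plane rotation scaled by a unit-modulus phase):

```python
import numpy as np

# A hypothetical 2x2 unitary matrix: a rotation times a phase of modulus 1.
theta = 0.3
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]]) * np.exp(0.5j)

# Check (7.2): U^H U = U U^H = I, and the equivalent form U^{-1} = U^H.
I2 = np.eye(2)
assert np.allclose(U.conj().T @ U, I2)
assert np.allclose(U @ U.conj().T, I2)
assert np.allclose(np.linalg.inv(U), U.conj().T)
```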
Lemma 7.1 If U is unitary (or orthogonal, respectively), then so are $U^H$, $U^T$, and $\overline{U}$.
Proof:
U is unitary $\Rightarrow U^H U = UU^H = I \Rightarrow (U^H)^H U^H = UU^H = I \Rightarrow U^H$ is unitary.
Write $U = A + iB$ with A, B real; then
$$\overline{U} = A - iB, \quad U^T = A^T + iB^T, \quad U^H = A^T - iB^T$$
U is unitary $\Rightarrow U^H U = (A^T - iB^T)(A + iB) = (A^TA + B^TB) + i(A^TB - B^TA) = I$
$$\Rightarrow A^TA + B^TB = I, \quad A^TB - B^TA = 0$$
Hence
$$\overline{U}^H\,\overline{U} = U^T\overline{U} = (A^T + iB^T)(A - iB) = (A^TA + B^TB) - i(A^TB - B^TA) = I - i0 = I$$
$\Rightarrow \overline{U}$ is unitary. Finally,
$$U^T(U^T)^H = U^T\overline{U} = I \Rightarrow U^T \text{ is unitary.}$$
Theorem 7.2 Let $U \in F^{n\times n}$. Then the following are equivalent:
(i) U is unitary (U is orthogonal if $F = \mathbb{R}$).
(ii) The columns of U form an orthonormal basis of $F^n$.
(iii) The rows of U form an orthonormal basis of $F^n$.
(iv) $\langle Ux, Uy\rangle = \langle x, y\rangle$ for all $x, y \in F^n$.
(v) $\|Ux\| = \|x\|$ for all $x \in F^n$.
Proof:
(i) ⇔ (ii) Let $(u_1, \dots, u_n)$ denote the columns of U. The $(i,j)$ entry of $U^HU$ is $u_i^H u_j$, so $U^HU = I$ iff
$$u_i^H u_j = \delta_{ij}$$
where $\delta_{ij}$ is the Kronecker delta. Hence U is unitary iff the columns of U are orthonormal, i.e., form an orthonormal basis of $F^n$.
(i) ⇔ (iii) By Lemma 7.1 and the result just proved,
U is unitary ⇔ $U^T$ is unitary
⇔ columns of $U^T$ form an orthonormal basis for $F^n$
⇔ rows of U form an orthonormal basis for $F^n$.
Suppose next that $\|Uz\| = \|z\|$ for all $z \in F^n$ (statement (v)). Then
$$\|U(x+y)\|^2 = \|x+y\|^2$$
$$\Rightarrow \langle Ux + Uy,\ Ux + Uy\rangle = \langle x + y,\ x + y\rangle$$
$$\Rightarrow \|Ux\|^2 + \|Uy\|^2 + \langle Ux, Uy\rangle + \langle Uy, Ux\rangle = \|x\|^2 + \|y\|^2 + \langle x, y\rangle + \langle y, x\rangle$$
$$\Rightarrow \operatorname{Re}\langle Ux, Uy\rangle = \operatorname{Re}\langle x, y\rangle \qquad (7.3)$$
where Re(z) denotes the real part of the scalar z. If $F = \mathbb{R}$, then $\langle x, y\rangle$ is always real and
(7.3) implies that (iv) holds. If $F = \mathbb{C}$, then we can deduce from the identity
$$\left\|U\left(x + \sqrt{-1}\,y\right)\right\|^2 = \left\|x + \sqrt{-1}\,y\right\|^2$$
that
$$\operatorname{Im}\langle Ux, Uy\rangle = \operatorname{Im}\langle x, y\rangle$$
for all x, y, where Im(z) denotes the imaginary part of z. This and (7.3) together give
(iv).
(iv) ⇒ (i) Suppose U is NOT unitary. Then $U^HU - I \neq 0$, and there exists $x \in F^n$ such
that $y := (U^HU - I)x \neq 0$. As a result,
$$0 < y^Hy = y^H(U^HU - I)x = \langle Ux, Uy\rangle - \langle x, y\rangle,$$
contradicting (iv). Hence (iv) implies (i).
Statements (iv) and (v) of the above theorem say that a square matrix is unitary iff (1) it
preserves the usual inner product of vectors, or (2) it preserves the Euclidean norm of
vectors.
Let $1 \leq m < n$, and let $u_1, \dots, u_m$ be an orthonormal set of vectors in $F^n$. As $u_1, \dots, u_m$ are
linearly independent, one can always append vectors to the collection to form a
basis $B = \{u_1, \dots, u_m, v_{m+1}, \dots, v_n\}$ of $F^n$. Then one may apply the Gram-Schmidt process to B to obtain an orthonormal basis of $F^n$ whose first m vectors are still $u_1, \dots, u_m$.
The reflection hyperplane can be defined by a unit vector u which is orthogonal to the
hyperplane. The reflection of a point x about this hyperplane is
$$x' = x - 2\langle u, x\rangle u = x - 2u(u^Hx) = (I - 2uu^H)x \qquad (7.4)$$
i.e., $x' = Qx$ with the Householder matrix
$$Q = I - 2uu^H \qquad (7.5)$$
The Householder matrix has the following properties:
(1) Q is Hermitian: Q = Q H .
(2) Q is unitary: Q −1 = Q H .
Figure 7.1 gives the geometric interpretation of the Householder transformation:
given x, find u such that
$$Hx = (I - 2uu^T)x = \alpha e_1$$
Since H is unitary, $|\alpha| = \|\alpha e_1\| = \|Hx\|_2 = \|x\|$, and one may take
$$u = \frac{x - \|x\|e_1}{\left\|x - \|x\|e_1\right\|} \qquad (7.6)$$
Fig. 7.1 Householder transformation
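Formula (7.6) can be turned into a small routine; the sketch below (real case, assuming x is not already a positive multiple of $e_1$) builds $Q = I - 2uu^T$ and checks that it maps x to $\|x\|e_1$:

```python
import numpy as np

def householder(x):
    """Return Q = I - 2 u u^T with Q @ x = ||x|| e1, u as in (7.6).

    Assumes x is real and not already a positive multiple of e1."""
    x = np.asarray(x, dtype=float)
    e1 = np.zeros_like(x); e1[0] = 1.0
    v = x - np.linalg.norm(x) * e1      # u is v normalized
    u = v / np.linalg.norm(v)
    return np.eye(len(x)) - 2.0 * np.outer(u, u)

x = np.array([3.0, 4.0, 0.0])           # ||x|| = 5
Q = householder(x)
assert np.allclose(Q, Q.T)               # property (1): Hermitian (here symmetric)
assert np.allclose(Q @ Q.T, np.eye(3))   # property (2): unitary (orthogonal)
assert np.allclose(Q @ x, [5.0, 0.0, 0.0])
```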
Theorem (Schur Triangularization) (i) Every $A \in \mathbb{C}^{n\times n}$ can be written as $A = UTU^H$ for some unitary U and upper triangular T whose diagonal entries are the eigenvalues of A. (ii) If A and all its eigenvalues are real, then U can be chosen real orthogonal and T real.
Proof: We shall prove (i) by mathematical induction on n. The proof of (ii) is similar.
Let $(\lambda_1, w_1)$ be an eigenpair of A with $\|w_1\| = 1$. Choose $w_2, \dots, w_n$ to be such that
$U_1 = (w_1, \dots, w_n)$ is unitary. Then
$$U_1^H A U_1 = \begin{pmatrix} w_1^H \\ \vdots \\ w_n^H \end{pmatrix}\left(Aw_1 \ \cdots \ Aw_n\right) = \begin{pmatrix} \lambda_1 & *_1 \\ 0_{(n-1)\times 1} & *_2 \end{pmatrix}$$
Choose $V_2 \in \mathbb{C}^{(n-1)\times(n-1)}$ to be unitary such that
$$V_2^H *_2 V_2 = \begin{pmatrix} \lambda_2 & *_3 \\ 0_{(n-2)\times 1} & *_4 \end{pmatrix}$$
and set
$$U_2 = \begin{pmatrix} 1 & 0_{1\times(n-1)} \\ 0_{(n-1)\times 1} & V_2 \end{pmatrix} \Rightarrow U_2^H\left(U_1^H A U_1\right)U_2 = \begin{pmatrix} \lambda_1 & *_1 V_2 \\ 0_{(n-1)\times 1} & V_2^H *_2 V_2 \end{pmatrix} = \begin{pmatrix} \lambda_1 & * & * \\ 0 & \lambda_2 & * \\ 0_{(n-2)\times 2} & & *_4 \end{pmatrix}$$
Continuing this process, we prove the theorem.
Remark 7.1
(i) Recall that a square matrix may not be diagonalizable, i.e., it may not be similar to
any diagonal matrix. The above theorem tells us that all matrices $A \in \mathbb{C}^{n\times n}$ are
similar to some upper triangular matrix; moreover, the triangularizing matrix can
be taken to be unitary.
(ii) In general, if A, B are square matrices such that B = U H AU for some unitary
matrix (or orthogonal matrix, respectively) then we say B is unitarily similar (or
orthogonally similar, respectively) to A. Hence the Schur triangularization theorem
says that every complex square matrix is unitarily similar to an upper triangular
matrix. Also, every real square matrix that has no nonreal eigenvalues is
orthogonally similar to a real upper triangular matrix.
(iii) The relation of being unitarily similar (or orthogonally similar, respectively) is an
equivalence relation.
(iv) A real square matrix having some nonreal eigenvalues cannot be triangularized by
a real orthogonal matrix (why?). But it can always be triangularized (to become a
nonreal triangular matrix) by some nonreal unitary matrix.
(v) In the above theorem we triangularized A to become an upper triangular matrix. By
applying the theorem on AT instead, we can show easily that the results of the
theorem also hold if we change “upper triangular” to “lower triangular.”
We will give an example to demonstrate the Schur triangularization process. Take
$$A = \begin{pmatrix} -1 & -1 & -2 \\ 8 & -11 & -8 \\ -10 & 11 & 7 \end{pmatrix}$$
which has eigenvalues 1, −3, −3, and start with $\lambda_1 = 1$. Then
$$A - I = \begin{pmatrix} -2 & -1 & -2 \\ 8 & -12 & -8 \\ -10 & 11 & 6 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & \tfrac12 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}$$
and $x = \frac13(-1, -2, 2)^T$ is an associated unit eigenvector. Let
$$u = \frac{x - e_1}{\|x - e_1\|} = \frac{1}{\sqrt6}\begin{pmatrix} -2 \\ -1 \\ 1 \end{pmatrix}$$
and let Q be the associated Householder matrix, i.e.,
$$Q = I - 2uu^T = \frac13\begin{pmatrix} -1 & -2 & 2 \\ -2 & 2 & 1 \\ 2 & 1 & 2 \end{pmatrix} = [\,x \ \ V\,]$$
where
$$V = \frac13\begin{pmatrix} -2 & 2 \\ 2 & 1 \\ 1 & 2 \end{pmatrix}$$
Then
$$QAQ = \frac13\begin{pmatrix} 3 & 64 & 13 \\ 0 & -13 & -1 \\ 0 & 16 & -5 \end{pmatrix} \quad\text{and}\quad V^TAV = \frac13\begin{pmatrix} -13 & -1 \\ 16 & -5 \end{pmatrix}$$
Now triangularize the $2\times 2$ matrix $V^TAV$, which has the single eigenvalue −3. The
associated unit eigenvector is $x = \frac{1}{\sqrt{17}}(1, -4)^T$. Let $u = \dfrac{x - e_1}{\|x - e_1\|} \approx \begin{pmatrix} -0.6154 \\ -0.7882 \end{pmatrix}$ and let P be
the Householder matrix associated with u, i.e.,
$$P = \begin{pmatrix} 0.24254 & -0.97014 \\ -0.97014 & -0.24254 \end{pmatrix}$$
Then
$$PV^TAVP = \begin{pmatrix} -3 & \tfrac{17}{3} \\ 0 & -3 \end{pmatrix}$$
Finally let
$$U = Q\begin{pmatrix} 1 & 0 \\ 0 & P \end{pmatrix}$$
Then
$$U^TAU = \begin{pmatrix} 1 & 0.97025 & -21.747 \\ 0 & -3.000 & 5.667 \\ 0 & 0 & -3.000 \end{pmatrix}$$
is a (numerically approximate) Schur triangularization of A.
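The two Householder steps of this example can be replayed numerically; below is a sketch (assuming NumPy, and using the unit eigenvectors found above):

```python
import numpy as np

# The matrix triangularized step by step above.
A = np.array([[ -1.,  -1.,  -2.],
              [  8., -11.,  -8.],
              [-10.,  11.,   7.]])

def reflector(x):
    # Householder matrix sending the unit vector x to e1 (assumes x != e1).
    e1 = np.zeros_like(x); e1[0] = 1.0
    u = (x - e1) / np.linalg.norm(x - e1)
    return np.eye(len(x)) - 2.0 * np.outer(u, u)

# Step 1: unit eigenvector for lambda_1 = 1.
Q = reflector(np.array([-1., -2., 2.]) / 3.0)
B = Q @ A @ Q                       # first column becomes (1, 0, 0)^T

# Step 2: the trailing 2x2 block has the single eigenvalue -3 with
# unit eigenvector (1, -4)/sqrt(17), as computed above.
P = reflector(np.array([1., -4.]) / np.sqrt(17.0))
E = np.eye(3); E[1:, 1:] = P
U = Q @ E
T = U.T @ A @ U

assert np.allclose(T[np.tril_indices(3, -1)], 0.0)    # upper triangular
assert np.allclose(np.sort(np.diag(T)), [-3., -3., 1.])
```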
In summary, every square matrix is triangularizable but only non-defective matrices are
diagonalizable.
The Schur triangularization theorem is fundamental, as many nice and important results
follow from it. We list some of these results here.
Let $A \in \mathbb{C}^{n\times n}$ have eigenvalues $\lambda_1, \dots, \lambda_n$ (counting multiplicities). Then
$$\operatorname{tr}(A) = \sum_{i=1}^n \lambda_i \quad\text{and}\quad \det(A) = \lambda_1\lambda_2\cdots\lambda_n$$
(exercise!).
If $A = UTU^H$ is a Schur triangularization, then $A^m = \left(UTU^H\right)^m = UT^mU^H$ is similar to $T^m$, so $A^m$ has eigenvalues $\lambda_1^m, \dots, \lambda_n^m$. If A is nonsingular, then
$$T^{-1} = \begin{pmatrix} \lambda_1^{-1} & \cdots & * \\ & \ddots & \vdots \\ 0 & & \lambda_n^{-1} \end{pmatrix}$$
so $A^{-m}$ has eigenvalues $\lambda_1^{-m}, \dots, \lambda_n^{-m}$. More generally, for any polynomial q, the
eigenvalues of $q(A)$ are $q(\lambda_1), \dots, q(\lambda_n)$.
Theorem 7.7 (Cayley-Hamilton Theorem) Let $A \in F^{n\times n}$ and let
$$\det(\lambda I - A) = \lambda^n + \gamma_1\lambda^{n-1} + \cdots + \gamma_n$$
be its characteristic polynomial. Then
$$A^n + \gamma_1 A^{n-1} + \cdots + \gamma_n I = 0 \qquad (7.8)$$
Proof: Write $A = UTU^H$ where T is in the upper triangular form given in (7.7), with the eigenvalues $\lambda_1, \dots, \lambda_n$ of A on its diagonal. Since $p(A) = U\,p(T)\,U^H$ for the characteristic polynomial p, it suffices to prove
$$(T - \lambda_1 I)\cdots(T - \lambda_n I) = p(T) = 0 \qquad (7.9)$$
We proceed by induction on n; (7.9) clearly holds when $n = 1$. Assume (7.9) holds
for any upper triangular T given in (7.7) with $n = n_0$. Now suppose T is in the form
given in (7.7), with $n = n_0 + 1$. Write T in block matrix form
$$T = \begin{pmatrix} T' & * \\ 0 & \lambda_{n_0+1} \end{pmatrix}$$
where $T'$ is upper triangular with diagonal entries $\lambda_1, \dots, \lambda_{n_0}$. By the induction
assumption, $(T' - \lambda_1 I)\cdots(T' - \lambda_{n_0} I) = 0$. As a result,
$$p(T) = (T - \lambda_1 I)\cdots(T - \lambda_{n_0} I)(T - \lambda_{n_0+1} I)$$
$$= \left(\begin{pmatrix} T' - \lambda_1 I & * \\ 0 & * \end{pmatrix}\cdots\begin{pmatrix} T' - \lambda_{n_0} I & * \\ 0 & * \end{pmatrix}\right)\begin{pmatrix} * & * \\ 0 & \lambda_{n_0+1} - \lambda_{n_0+1} \end{pmatrix}$$
$$= \begin{pmatrix} (T' - \lambda_1 I)\cdots(T' - \lambda_{n_0} I) & * \\ 0 & * \end{pmatrix}\begin{pmatrix} * & * \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & * \\ 0 & * \end{pmatrix}\begin{pmatrix} * & * \\ 0 & 0 \end{pmatrix} = 0$$
The proof is now complete.
Corollary 7.8 Let $A \in F^{n\times n}$. If A is nonsingular, then $A^{-1}$ can be written as a linear
combination of $I, A, A^2, \dots, A^{n-1}$.
Proof: By the Cayley-Hamilton theorem,
$$\left(A^{n-1} + \gamma_1 A^{n-2} + \cdots + \gamma_{n-1} I\right)A = -\gamma_n I$$
Since A is nonsingular, $\gamma_n = (-1)^n\det(A) \neq 0$, so $A^{-1} = -\frac{1}{\gamma_n}\left(A^{n-1} + \gamma_1 A^{n-2} + \cdots + \gamma_{n-1} I\right)$.
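Both the theorem and the corollary are easy to check numerically; a sketch on a hypothetical 2x2 matrix follows (np.poly returns the coefficients of the characteristic polynomial):

```python
import numpy as np

# A hypothetical matrix with eigenvalues 2 and 3.
A = np.array([[2., 1.],
              [0., 3.]])

# Characteristic polynomial det(lambda I - A) = lambda^2 + g1 lambda + g2.
g = np.poly(A)                       # [1, g1, g2] = [1, -5, 6]
assert np.allclose(g, [1., -5., 6.])

# Cayley-Hamilton (7.8): A^2 + g1 A + g2 I = 0.
CH = A @ A + g[1] * A + g[2] * np.eye(2)
assert np.allclose(CH, 0.0)

# Corollary 7.8: A^{-1} = -(A + g1 I)/g2, a linear combination of I and A.
A_inv = -(A + g[1] * np.eye(2)) / g[2]
assert np.allclose(A_inv @ A, np.eye(2))
```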
Lemma 7.9 An upper triangular matrix $T \in \mathbb{C}^{n\times n}$ is normal iff T is diagonal.
Proof: If $T = \operatorname{diag}(t_{11}, \dots, t_{nn})$, then
$$T^HT = TT^H = \operatorname{diag}\left(|t_{11}|^2, \dots, |t_{nn}|^2\right)$$
and hence T is normal.
Conversely, suppose T is not diagonal. Then there exists a smallest i, say $i_0$, such that
$t_{i_0,j} \neq 0$ for some j with $i_0 < j \leq n$. Observe that the $(i_0, i_0)$ entry of $T^HT$ is $|t_{i_0,i_0}|^2$,
but the $(i_0, i_0)$ entry of $TT^H$ is $\sum_{j=i_0}^n |t_{i_0,j}|^2 > |t_{i_0,i_0}|^2$. Since $T^HT \neq TT^H$, T is not normal.
Theorem 7.10 (i) $A \in \mathbb{C}^{n\times n}$ is normal iff $U^HAU$ is diagonal for some unitary U. (ii) $A \in \mathbb{R}^{n\times n}$ is normal with all eigenvalues real iff $P^TAP$ is diagonal for some real orthogonal P.
Proof: We shall prove (i) only; the proof of (ii) is similar. Let $A \in \mathbb{C}^{n\times n}$. By the Schur
triangularization theorem, there always exists unitary U such that $T := U^HAU$ is upper
triangular and has the eigenvalues of A, in any given order, as diagonal entries. Note that
$AA^H = A^HA$ iff $TT^H = T^HT$. Hence, by Lemma 7.9, if A is normal then T is diagonal.
Conversely, suppose $U^HAU = D$ is diagonal for some unitary U. Then A is similar to D
and hence its eigenvalues are exactly those of D, which are also the diagonal entries of D.
Also it is straightforward to check that $AA^H = A^HA$.
The above theorem says that a complex matrix is normal iff it is unitarily similar to a
diagonal matrix. Or, equivalently, a matrix is normal iff it is diagonalizable by a unitary
matrix. As a result, a real normal matrix whose eigenvalues are not all real is unitarily (but
not orthogonally) similar to a complex (but not real) diagonal matrix. The proof of the
following result is left as an exercise.
Corollary 7.11
(i) Let $A \in \mathbb{C}^{n\times n}$. Then A is normal iff it has an orthonormal set of eigenvectors which
spans $\mathbb{C}^n$.
(ii) Let $A \in \mathbb{R}^{n\times n}$. Then A is normal and all its eigenvalues are real iff it has an
orthonormal set of eigenvectors which spans $\mathbb{R}^n$.
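Corollary 7.11(i) can be illustrated numerically; the sketch below uses a hypothetical real matrix that is normal but not symmetric:

```python
import numpy as np

# A real normal matrix that is NOT symmetric (a rotation-scaling block).
A = np.array([[1., -2.],
              [2.,  1.]])
assert np.allclose(A @ A.T, A.T @ A)            # A is normal

# A normal matrix has an orthonormal eigenbasis (Corollary 7.11(i)); here the
# eigenvalues 1 +- 2i are distinct, so eig returns orthonormal eigenvectors.
w, V = np.linalg.eig(A)
assert np.allclose(V.conj().T @ V, np.eye(2))   # orthonormal eigenbasis
assert np.allclose(V @ np.diag(w) @ V.conj().T, A)
```

Note the diagonalizing matrix V is unitary but complex, matching the remark above about real normal matrices with nonreal eigenvalues.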
Theorem 7.12 (Spectral decomposition of unitary matrices) Let A ∈ F n×n . Then the
following are equivalent:
(i) A is unitary.
(ii) A is normal and all eigenvalues of A have modulus one.
(iii) $U^HAU = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ for some unitary U, with $|\lambda_i| = 1$ for all i.
Proof:
(i) ⇒ (ii) Suppose A is unitary; then $AA^H = A^HA = I$ and hence A is normal. Moreover,
if $\lambda$ is any eigenvalue of A and x is a corresponding eigenvector, then, by Theorem 7.2,
$$\|x\| = \|Ax\| = \|\lambda x\| = |\lambda|\,\|x\| \Rightarrow |\lambda| = 1$$
(ii) ⇒ (iii) follows from Theorem 7.10.
(iii) ⇒ (i) If (iii) holds then one easily verifies that $A^HA = I$. Therefore $A^{-1} = A^H$, and A
is unitary.
(ii) If A and B are Hermitian (or skew-Hermitian, real symmetric, real skew-symmetric,
respectively) matrices of the same order, then so is α A + β B , where α , β are
real scalars.
(iii) Every $A \in \mathbb{R}^{n\times n}$ can be written uniquely as the sum of a real symmetric matrix and
a real skew-symmetric matrix: $A = \dfrac{A + A^T}{2} + \dfrac{A - A^T}{2}$.
(iv) Every $A \in \mathbb{C}^{n\times n}$ can be written uniquely as the sum of a Hermitian matrix and a
skew-Hermitian matrix: $A = \dfrac{A + A^H}{2} + \dfrac{A - A^H}{2}$.
(v) Hermitian, skew-Hermitian, and unitary matrices are all normal.
Theorem 7.14 (Spectral decomposition of Hermitian matrices) Let $A \in F^{n\times n}$. Then the
following are equivalent:
(i) A is Hermitian (or A is real symmetric if $F = \mathbb{R}$).
(ii) A is normal, all eigenvalues of A are real, and eigenvectors belonging to
distinct eigenvalues are orthogonal.
(iii) $U^HAU = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$ for some unitary (or orthogonal if $F = \mathbb{R}$) matrix U, with all $\lambda_i$ real.
Proof:
(i) ⇒ (ii) Suppose A is Hermitian; then $AA^H = A^HA = A^2$ and hence A is normal.
Moreover, if $\lambda$ is any eigenvalue of A and x is a corresponding eigenvector, then, since
$A = A^H$, we have
$$\overline{\lambda}\,\|x\|^2 = \left(x^HAx\right)^H = x^HA^Hx = x^HAx = \lambda\,\|x\|^2 \Rightarrow \overline{\lambda} = \lambda$$
so $\lambda$ is real.
(iii) ⇒ (i) If (iii) holds then one easily verifies that $A^H = A$ if $F = \mathbb{C}$, and that $A^T = A$
if $F = \mathbb{R}$.
Theorem 7.15 Let $A \in F^{n\times n}$. If $A = -A^H$, then all eigenvalues of A are purely imaginary:
$$\lambda(A) \subseteq i\mathbb{R}$$
Proof: Let $(\lambda, x)$ be an eigenpair of A with $x \neq 0$. Then
$$\overline{\lambda}\,x^Hx = (\lambda x)^Hx = (Ax)^Hx = x^HA^Hx$$
Since $A^H = -A$,
$$\overline{\lambda}\,x^Hx = -x^HAx = -\lambda\,x^Hx \Rightarrow \overline{\lambda} = -\lambda$$
$\Rightarrow \lambda$ is purely imaginary.
Example 7.2 Find an orthogonal matrix U that diagonalizes $A = \begin{pmatrix} 0 & 2 & -1 \\ 2 & 3 & -2 \\ -1 & -2 & 0 \end{pmatrix}$.
Solution:
(1) Calculate the eigenvalues of A:
$$p(\lambda) = \det(\lambda I - A) = (\lambda + 1)^2(\lambda - 5) \Rightarrow \lambda = -1, -1, 5$$
(2) Calculate the respective eigenspaces:
$$N(5I - A) = \operatorname{Span}\left\{\left(\tfrac{-1}{\sqrt6},\ \tfrac{-2}{\sqrt6},\ \tfrac{1}{\sqrt6}\right)^T\right\}$$
(3) Apply the Gram-Schmidt process to turn a basis of each eigenspace into an orthonormal one. For $N(-I - A)$, start with $v_1 = (1, 0, 1)^T$ and $v_2 = (-2, 1, 0)^T$:
$$u_1 = \frac{v_1}{\|v_1\|} = \frac{1}{\sqrt2}(1, 0, 1)^T = \left(\tfrac{1}{\sqrt2},\ 0,\ \tfrac{1}{\sqrt2}\right)^T$$
$$h_1 = \langle v_2, u_1\rangle u_1 = \left([-2\ \ 1\ \ 0]\begin{pmatrix} \tfrac{1}{\sqrt2} \\ 0 \\ \tfrac{1}{\sqrt2} \end{pmatrix}\right)u_1 = -\sqrt2\,u_1 = (-1, 0, -1)^T$$
$$v_2 - h_1 = (-2, 1, 0)^T - (-1, 0, -1)^T = (-1, 1, 1)^T$$
$$u_2 = \frac{v_2 - h_1}{\|v_2 - h_1\|} = \left(-\tfrac{1}{\sqrt3},\ \tfrac{1}{\sqrt3},\ \tfrac{1}{\sqrt3}\right)^T$$
Hence
$$N(-I - A) = \operatorname{Span}\left\{\left(\tfrac{1}{\sqrt2},\ 0,\ \tfrac{1}{\sqrt2}\right)^T,\ \left(\tfrac{-1}{\sqrt3},\ \tfrac{1}{\sqrt3},\ \tfrac{1}{\sqrt3}\right)^T\right\}$$
(4) The columns of U form an orthonormal eigenbasis (WHY?):
$$U = \begin{pmatrix} \tfrac{-1}{\sqrt6} & \tfrac{1}{\sqrt2} & \tfrac{-1}{\sqrt3} \\ \tfrac{-2}{\sqrt6} & 0 & \tfrac{1}{\sqrt3} \\ \tfrac{1}{\sqrt6} & \tfrac{1}{\sqrt2} & \tfrac{1}{\sqrt3} \end{pmatrix} \Rightarrow U^TAU = \operatorname{diag}(5, -1, -1)$$
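The diagonalization of Example 7.2 can be confirmed with numpy.linalg.eigh, which returns ascending eigenvalues and orthonormal eigenvectors for a symmetric matrix:

```python
import numpy as np

A = np.array([[ 0.,  2., -1.],
              [ 2.,  3., -2.],
              [-1., -2.,  0.]])

# eigh: eigenvalues in ascending order, columns of U orthonormal eigenvectors.
w, U = np.linalg.eigh(A)
assert np.allclose(w, [-1., -1., 5.])
assert np.allclose(U.T @ U, np.eye(3))        # U is orthogonal
assert np.allclose(U.T @ A @ U, np.diag(w))   # U^T A U is diagonal
```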
If A is a real square matrix then obviously x H Ax = x T Ax is real for any real vector x . If
A is Hermitian, then it is easy to see that x H Ax is also real for any complex vector x .
The following is a classification of Hermitian and real symmetric matrices A according to
the sign of the real scalars $x^HAx$, where x runs over all nonzero vectors in $F^n$:
(i) If $x^HAx > 0$ for all $x \in F^n$ with $x \neq 0$, then we call A a positive definite matrix, denoted $A > 0$.
(ii) If $x^HAx \geq 0$ for all $x \in F^n$, then A is positive semidefinite, denoted $A \geq 0$.
(iii) If $x^HAx < 0$ for all $x \neq 0$, then A is negative definite, denoted $A < 0$.
(iv) If $x^HAx \leq 0$ for all x, then A is negative semidefinite, denoted $A \leq 0$.
If $x^HAx$ takes both positive and negative values, A is called indefinite.
For example,
$$A = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix} > 0, \quad A = \begin{pmatrix} -2 & 0 \\ 0 & -4 \end{pmatrix} < 0, \quad A = \begin{pmatrix} 2 & 0 \\ 0 & -4 \end{pmatrix} \text{ is indefinite}$$
Question: Given a real symmetric matrix, how to determine its definiteness efficiently?
We have the following theorem.
Property I: Let $A = A^H$ have eigenvalues $\lambda_1, \dots, \lambda_n$. Then
(i) $A > 0 \Leftrightarrow \lambda_i > 0$ for all i.
(ii) $A < 0 \Leftrightarrow \lambda_i < 0$ for all i.
(iii) $A \geq 0 \Leftrightarrow \lambda_i \geq 0$.
(iv) $A \leq 0 \Leftrightarrow \lambda_i \leq 0$.
Proof: We shall prove (i) and leave the others for the readers as exercises.
"⇒" Let $A > 0$ and $(\lambda, x)$ be an eigenpair. Then
$$x^HAx = x^H\lambda x = \lambda\|x\|^2 \Rightarrow \lambda = \frac{x^HAx}{\|x\|^2} > 0$$
"⇐" Suppose $\lambda_i > 0$ for all i. Since A is Hermitian, $F^n$ has an orthonormal basis of eigenvectors $x_1, \dots, x_n$ of A (why may we assume this?). For any $x \in F^n$, $x = \sum_{i=1}^n \alpha_i x_i$ for some $\alpha_1, \dots, \alpha_n$, and
$$x^HAx = \Big(\sum_{i=1}^n \alpha_i x_i\Big)^H\Big(\sum_{i=1}^n \alpha_i\lambda_i x_i\Big) = \sum_{i=1}^n \lambda_i|\alpha_i|^2 > 0 \ (\text{why?})$$
for $x \neq 0$, hence $A > 0$.
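Property I gives an effective test; below is a sketch (the function definiteness is our own helper, not a library routine), applied to the diagonal examples given earlier:

```python
import numpy as np

def definiteness(A, tol=1e-12):
    """Classify a real symmetric (or Hermitian) matrix by its eigenvalue signs
    (Property I). tol guards against floating-point zeros."""
    w = np.linalg.eigvalsh(A)
    if np.all(w > tol):    return "positive definite"
    if np.all(w < -tol):   return "negative definite"
    if np.all(w >= -tol):  return "positive semidefinite"
    if np.all(w <= tol):   return "negative semidefinite"
    return "indefinite"

assert definiteness(np.diag([ 2.,  4.])) == "positive definite"
assert definiteness(np.diag([-2., -4.])) == "negative definite"
assert definiteness(np.diag([ 2., -4.])) == "indefinite"
```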
Property II: If $A = A^T \in \mathbb{R}^{n\times n}$ and $A > 0$, then $a_{ii} > 0$ for all i, and all the leading principal submatrices of A are positive definite.
Proof: Pick
$$x_i = \big(0, \dots, 0, \underbrace{1}_{i\text{-th}}, 0, \dots, 0\big)^T, \quad \forall i$$
$$\Rightarrow x_i^TAx_i = a_{ii} > 0$$
For a leading principal submatrix $A_{11}$, partition
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$
and pick $x = \left(x_1^T, 0\right)^T$ with $x_1 \neq 0$. Then
$$x^TAx = x_1^TA_{11}x_1 > 0 \Rightarrow A_{11} \text{ is positive definite.}$$
Property III: If $A = A^T \in \mathbb{R}^{n\times n}$ and $A > 0$, then A can be reduced to upper triangular form using
only row operations of type III (adding a multiple of one row to another), and the pivot elements will all be positive.
Sketch of the proof: $a_{11} > 0$ by Property II. Eliminating below the first pivot,
$$A_2 = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \to \begin{pmatrix} a_{11} & a_{12} \\ 0 & a_{22}^{(1)} \end{pmatrix}$$
Since $A_2 > 0$ and the determinant is invariant under row operations of type III, $a_{11}a_{22}^{(1)} = \det A_2 > 0$, so $a_{22}^{(1)} > 0$. Continue this process.
Property IV: Let $A = A^T \in \mathbb{R}^{n\times n}$ and $A > 0$. Then
(i) A can be decomposed as $A = LU$ where L is lower triangular and U is upper
triangular.
(ii) A can be decomposed as $A = LDU$ where L is lower triangular and U is upper
triangular, both with all diagonal elements equal to 1, and D is a diagonal matrix.
Proof: By Gaussian elimination with type III row operations only, and the fact that the
product of two lower (upper) triangular matrices is lower (upper) triangular.
Example 7.3
$$A = \begin{pmatrix} 4 & 2 & -2 \\ 2 & 10 & 2 \\ -2 & 2 & 5 \end{pmatrix} \xrightarrow{\ R_{12}(-\frac12),\ R_{13}(\frac12)\ } \begin{pmatrix} 4 & 2 & -2 \\ 0 & 9 & 3 \\ 0 & 3 & 4 \end{pmatrix} \xrightarrow{\ R_{23}(-\frac13)\ } \begin{pmatrix} 4 & 2 & -2 \\ 0 & 9 & 3 \\ 0 & 0 & 3 \end{pmatrix} = U$$
Thus $A = LU$
where
$$L = R_{12}\left(\tfrac12\right)R_{13}\left(-\tfrac12\right)R_{23}\left(\tfrac13\right) = \begin{pmatrix} 1 & 0 & 0 \\ \tfrac12 & 1 & 0 \\ -\tfrac12 & \tfrac13 & 1 \end{pmatrix}$$
Also $A = LDU$ with
$$L = \begin{pmatrix} 1 & 0 & 0 \\ \tfrac12 & 1 & 0 \\ -\tfrac12 & \tfrac13 & 1 \end{pmatrix}, \quad D = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 9 & 0 \\ 0 & 0 & 3 \end{pmatrix}, \quad U = \begin{pmatrix} 1 & \tfrac12 & -\tfrac12 \\ 0 & 1 & \tfrac13 \\ 0 & 0 & 1 \end{pmatrix}$$
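The factorizations of Example 7.3 can be checked numerically:

```python
import numpy as np

A = np.array([[ 4.,  2., -2.],
              [ 2., 10.,  2.],
              [-2.,  2.,  5.]])

L = np.array([[ 1. ,  0.  , 0.],
              [ 0.5,  1.  , 0.],
              [-0.5,  1/3., 1.]])
U = np.array([[4., 2., -2.],
              [0., 9.,  3.],
              [0., 0.,  3.]])
assert np.allclose(L @ U, A)             # A = LU

# LDU: factor the pivots out of U.
D = np.diag(np.diag(U))
U1 = np.linalg.inv(D) @ U                # unit-diagonal upper factor
assert np.allclose(L @ D @ U1, A)        # A = LDU
assert np.allclose(U1, L.T)              # symmetric A > 0: U = L^T (Property VI)
```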
Property V: Let $A = A^T \in \mathbb{R}^{n\times n}$ and $A > 0$. If $A = L_1D_1U_1 = L_2D_2U_2$, then
$$L_1 = L_2, \quad D_1 = D_2, \quad U_1 = U_2$$
Proof: From $L_1D_1U_1 = L_2D_2U_2$,
$$D_2^{-1}L_2^{-1}L_1D_1 = U_2U_1^{-1}$$
The left-hand side is lower triangular and the right-hand side is upper triangular with unit diagonal, so
$$U_2U_1^{-1} = I \Rightarrow U_2 = U_1$$
$$\Rightarrow D_2^{-1}L_2^{-1}L_1D_1 = I \Rightarrow L_2^{-1}L_1 = D_2D_1^{-1} \Rightarrow L_2^{-1}L_1 = I \ (\text{why?})$$
$$\Rightarrow L_1 = L_2 \Rightarrow D_2D_1^{-1} = I \Rightarrow D_1 = D_2$$
Property VI: Let $A = A^T \in \mathbb{R}^{n\times n}$ and $A > 0$. Then A can be factorized into
$$A = LDL^T$$
where D is a diagonal matrix and L is lower triangular with 1 along the diagonal.
Proof: Let $A = LDU$. Since $A = A^T$, we get $LDU = U^TDL^T$; by the uniqueness in Property V, $U = L^T$.
Property VII: (Cholesky Decomposition) Let $A = A^T \in \mathbb{R}^{n\times n}$ and $A > 0$. Then A can be
factorized into
$$A = LL^T$$
where L is lower triangular with positive diagonal.
Hint: $A = LDL^T = \left(LD^{\frac12}\right)\left(LD^{\frac12}\right)^T$.
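numpy.linalg.cholesky computes exactly this factor (the lower triangular L with positive diagonal); checking it on the matrix of Example 7.3:

```python
import numpy as np

A = np.array([[ 4.,  2., -2.],
              [ 2., 10.,  2.],
              [-2.,  2.,  5.]])

C = np.linalg.cholesky(A)            # lower triangular Cholesky factor
assert np.allclose(C @ C.T, A)       # A = L L^T
assert np.all(np.diag(C) > 0)        # positive diagonal
assert np.allclose(C, np.tril(C))    # lower triangular
```

Consistent with the hint, this C equals $LD^{1/2}$ for the L and D of Example 7.3.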
Exercises
1. Let A ∈ F n×n and B ∈ F m×m have no eigenvalue in common and the characteristic
nonsingular.
3. Compute the LU factorization of $A = \begin{pmatrix} 1 & 3 & -4 \\ -1 & 5 & -3 \\ 4 & -8 & 23 \end{pmatrix}$.
4. Find the LU decomposition of the matrix $A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.
5. Compute the LU factorization of $A = \begin{pmatrix} 2 & 1 & 3 \\ 2 & 5 & 7 \\ 4 & 2 & 7 \end{pmatrix}$.
6. Let $A \in \mathbb{C}^{n\times n}$.
(a) If A is normal, show that $A^2$ is also normal. Find a matrix $A \in \mathbb{C}^{2\times 2}$ which is not
normal, but $A^2$ is normal.
(b) If A is unitary, show that $A^2$ is also unitary. Find a matrix $A \in \mathbb{C}^{2\times 2}$ which is not
unitary, but $A^2$ is unitary.
(c) If A is Hermitian, show that $A^2$ is also Hermitian. Find a matrix $A \in \mathbb{C}^{2\times 2}$ which
is not Hermitian, but $A^2$ is Hermitian.
7. If $A = (a_{ij}), B = (b_{ij}) \in F^{n\times n}$ are unitarily similar, prove that $\displaystyle\sum_{i,j=1}^n |a_{ij}|^2 = \sum_{i,j=1}^n |b_{ij}|^2$.
8. Let $A = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$, where $a, b \in \mathbb{R}$ with $b \neq 0$.
(a) Show that A is normal.
(b) Find the eigenvalues of A and find two orthonormal eigenvectors.
(c) Find a unitary matrix U such that U H AU is diagonal.
9. Let A = − AH be skew-Hermitian
(a) Show that the eigenvalues of A are pure imaginary.
(b) Show that I − A and I + A are both nonsingular.
10. Let $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ -2 & -2 & 1 \end{pmatrix}$; use the Cayley-Hamilton Theorem to compute $A^{-1}$ and $A^5$.
11. Let $A_1 = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 0 & -1 \end{pmatrix}$ and $A_2 = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}$. Apply the Gram-Schmidt process to find
matrices $B_1, B_2 \in \mathbb{R}^{2\times 3}$ such that $\operatorname{span}(A_1) = \operatorname{span}(B_1)$ and $\operatorname{span}(A_1, A_2) = \operatorname{span}(B_1, B_2)$.
12. Show that the transition matrix from one orthonormal basis of a finite dimensional
inner product space to another is unitary.
13. Find an orthonormal basis for $\mathbb{R}^3$ consisting of eigenvectors of $\begin{pmatrix} 2 & 0 & 1 \\ 0 & 2 & -1 \\ 1 & -1 & 1 \end{pmatrix}$.
14. Show that the eigenvalues of a unitary matrix all have absolute value (or complex
modulus) 1.
15. Find the Cholesky decomposition of the following matrices:
(a) $\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$  (b) $\begin{pmatrix} 4 & 2 & -6 \\ 2 & 10 & 9 \\ -6 & 9 & 26 \end{pmatrix}$  (c) $\begin{pmatrix} 4 & 1 & -1 \\ 1 & 2 & 1 \\ -1 & 1 & 2 \end{pmatrix}$.
16. Following the proof of Schur's Triangularization Theorem, find an orthogonal matrix
P such that $P^TAP$ is upper triangular:
(a) $\begin{pmatrix} 1 & -1 \\ 1 & 3 \end{pmatrix}$  (b) $\begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$  (c) $\begin{pmatrix} 13 & -9 \\ 16 & -11 \end{pmatrix}$.
Chapter 8 Singular Value Decomposition
Recall that if $A \in \mathbb{C}^{n\times n}$ is a normal matrix (i.e., if A satisfies $AA^H = A^HA$) then A can be
decomposed as $A = UDU^H$ where U is unitary, and D is diagonal and has the
decomposed as A = UDU H where U is unitary, and D is diagonal and has the
eigenvalues of A as diagonal entries. If A is a square matrix but not normal, then A does
not have such a nice decomposition. But still, according to the Schur triangularization
theorem, we have A = UTU H for some unitary U, and upper (or lower) triangular T with
the eigenvalues of A as diagonal entries. We may re-formulate these results as:
Proposition 8.1 Any square matrix $A \in \mathbb{C}^{n\times n}$ can be decomposed as $A = UXV^H$, where
U = V is unitary, and X is upper (or lower) triangular; moreover, if A is normal then X is
diagonal.
Note that if U = V then A = UXV H must be square. Hence the result of Proposition 8.1
does not hold for nonsquare matrices A. However, if we do not require $U = V$, then any
matrix A, square or not, can be decomposed as $A = U\Sigma V^H$ for some unitary
U, V (here U and V may not be equal) and a "diagonal" matrix $\Sigma$ (see Theorem 8.2).
This is the important singular value decomposition, and is the main subject of this chapter.
Definition 8.1 Let A ∈ F m×n , then the n nonnegative square roots of the eigenvalues of
AH A are called the singular values of A. We shall order them as
σ 1 ( A) ≥ σ 2 ( A) ≥ ≥ σ n ( A) ≥ 0 .
By definition, $\sigma_1^2 \geq \sigma_2^2 \geq \cdots \geq \sigma_n^2$ are the eigenvalues of $A^HA$. Writing the spectral decomposition
$$A^HA = VDV^H, \quad D = \operatorname{diag}\left(\sigma_1^2, \dots, \sigma_n^2\right)$$
and noting that $\operatorname{rank}(D) = \operatorname{rank}(A^HA) = \operatorname{rank}(A) = r$, we see that exactly r of the singular values of A are nonzero.
Theorem 8.2 (Singular Value Decomposition (SVD)) Let $A \in \mathbb{C}^{m\times n}$ with $\operatorname{rank}(A) = r$ and
nonzero singular values $\sigma_1 \geq \cdots \geq \sigma_r > 0$. Then there exist unitary $U \in \mathbb{C}^{m\times m}$ and $V \in \mathbb{C}^{n\times n}$ such that
$$A = U\Sigma V^H = [U_1 \ U_2]\begin{pmatrix} \Sigma_+^{\,r\times r} & 0 \\ 0 & 0 \end{pmatrix}_{m\times n}[V_1 \ V_2]^H = \sum_{i=1}^r \sigma_i u_i v_i^H \qquad (8.1)$$
where
$$U_1 \equiv [u_1 \ \cdots \ u_r], \quad U_2 \equiv [u_{r+1} \ \cdots \ u_m], \quad V_1 \equiv [v_1 \ \cdots \ v_r], \quad V_2 \equiv [v_{r+1} \ \cdots \ v_n]$$
Moreover, if $A \in \mathbb{R}^{m\times n}$ then U and V can be chosen to be real orthogonal.
Proof: Note that $A^HA \in \mathbb{C}^{n\times n}$ is Hermitian and positive semidefinite with
$$\operatorname{rank}(A^HA) = \operatorname{rank}(A) = r$$
so its eigenvalues can be listed as $\sigma_1^2 \geq \cdots \geq \sigma_r^2 > 0 = \sigma_{r+1}^2 = \cdots = \sigma_n^2$, where
$$\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$$
Choose an orthonormal eigenbasis $v_1, \dots, v_n$ of $A^HA$, ordered accordingly, and define
$$\Sigma_+ = \operatorname{diag}(\sigma_1, \dots, \sigma_r) \quad \& \quad V = \left[(V_1)_{n\times r} \ \ (V_2)_{n\times(n-r)}\right]$$
$$\Rightarrow A^HA = V_1\left(\Sigma_+\right)^2V_1^H \qquad (8.2)$$
Since $A^HAV_2 = 0$, we get $V_2^HA^HAV_2 = \|AV_2\|^2 = 0$, hence
$$AV_2 = 0 \qquad (8.3)$$
Define
$$U_1 = AV_1\left(\Sigma_+\right)^{-1} \in \mathbb{C}^{m\times r} \qquad (8.4)$$
From (8.2) and (8.3), we have
$$U_1^HU_1 = I_r$$
i.e., the columns of $U_1$ are orthonormal. Choose $U_2$ such that $U = [U_1 \ U_2] \in \mathbb{C}^{m\times m}$ is unitary. Then
$$A = U\Sigma V^H, \quad \Sigma = \begin{pmatrix} \Sigma_+ & 0 \\ 0 & 0 \end{pmatrix}$$
Corollary 8.3 Let $A \in \mathbb{C}^{m\times n}$. Then
(i) AH A and AAH have the same collection of nonzero eigenvalues.
(ii) AH and A have the same collection of nonzero singular values.
Proof:
(i) We need only consider nonzero A. By Theorem 8.2, we may write $A = U\Sigma V^H$, so that
$$A^HA = V\operatorname{diag}\left(\sigma_1^2, \dots, \sigma_r^2, 0, \dots, 0\right)V^H$$
and
$$AA^H = U\operatorname{diag}\left(\sigma_1^2, \dots, \sigma_r^2, 0, \dots, 0\right)U^H$$
Hence both have exactly the nonzero eigenvalues $\sigma_1^2, \dots, \sigma_r^2$.
(ii) By definition, the square roots of the eigenvalues of $AA^H$ are the singular values of
$A^H$. Hence (ii) follows from (i).
Suppose $A = U\Sigma V^H$ is the singular value decomposition of A, where $U, \Sigma, V$ are as in Theorem 8.2. Taking conjugate transposes, $A^H = V\Sigma^TU^H$; since $\Sigma^T$ has the same nonzero diagonal
elements as $\Sigma$, we see that A and $A^H$ have the same collection of
nonzero singular values. This gives another proof of Corollary 8.3(ii).
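Definition 8.1 and Corollary 8.3(ii) can be checked with numpy.linalg.svd; the sketch below uses a hypothetical 3x2 rank-one matrix:

```python
import numpy as np

# A hypothetical 3x2 matrix of rank 1.
A = np.array([[1., 0.],
              [1., 0.],
              [0., 0.]])

U, s, Vh = np.linalg.svd(A)
assert np.allclose(s, [np.sqrt(2.), 0.])            # sigma_1 = sqrt(2), r = 1
assert np.allclose(U[:, :2] @ np.diag(s) @ Vh, A)   # A = U Sigma V^H

# Definition 8.1: sigma_i^2 are the eigenvalues of A^H A.
w = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(np.sqrt(w), s)

# Corollary 8.3(ii): A and A^H share their nonzero singular values.
s2 = np.linalg.svd(A.T, compute_uv=False)
assert np.allclose(s2, s)
```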
Let $A = U\Sigma V^H$ be as in the statement of Theorem 8.2, i.e., U, V are unitary (or real
orthogonal). The columns of U and V give orthonormal left singular vectors $u_1, \dots, u_m$ in $F^m$ and
orthonormal right singular vectors $v_1, \dots, v_n$ in $F^n$, such that
$$Av_j = \begin{cases} \sigma_j v_j\text{-image } \sigma_j u_j & \text{for } j = 1, \dots, r \\ 0 & \text{for } j = r+1, \dots, n \end{cases}, \qquad A^Hu_j = \begin{cases} \sigma_j v_j & \text{for } j = 1, \dots, r \\ 0 & \text{for } j = r+1, \dots, m \end{cases}$$
(1) The singular values $\sigma_1(A) \geq \cdots \geq \sigma_r(A) > 0$ of A are unique, while U and V are not
unique.
(2) The columns of V form an orthonormal eigenbasis for $A^HA \in \mathbb{C}^{n\times n}$.
(3) The columns of U form an orthonormal eigenbasis for $AA^H \in \mathbb{C}^{m\times m}$.
These follow from
$$AV = U\Sigma = U\begin{pmatrix} \Sigma_+ & 0 \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad A^HU = V\Sigma^T$$
(4) $\{u_1, \dots, u_r\}$ is an orthonormal basis for $R(A)$.
(5) $\{u_{r+1}, \dots, u_m\}$ is an orthonormal basis for $R(A)^\perp = N(A^H)$.
(6) $\{v_1, \dots, v_r\}$ is an orthonormal basis for $R(A^H)$.
(7) $\{v_{r+1}, \dots, v_n\}$ is an orthonormal basis for $N(A)$.
(8) rank(A) = number of nonzero singular values, but in general rank(A) ≠ number of nonzero
eigenvalues. For example,
$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \Rightarrow \operatorname{rank}(A) = 1 \text{ but } \lambda(A) = 0, 0$$
Lemma 8.5 Let $A \in \mathbb{R}^{m\times n}$, and let $Q \in \mathbb{R}^{m\times m}$ be orthogonal. Then
$$\|QA\|_F = \|A\|_F$$
Proof: Writing $A = [A_1 \ \cdots \ A_n]$ by columns,
$$\|QA\|_F^2 = \left\|[QA_1 \ \cdots \ QA_n]\right\|_F^2 = \sum_{i=1}^n \|QA_i\|_2^2 = \sum_{i=1}^n \|A_i\|_2^2 = \|A\|_F^2$$
Consequently, if $A = U\Sigma V^H$ then
$$\|A\|_F = \|\Sigma\|_F = \sqrt{\sum_{i=1}^r \sigma_i^2}$$
$$R(A) = \operatorname{span}\left\{\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}\right\} \Rightarrow u_1 = \frac{1}{\sqrt2}\begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix},$$
and
$$A = U\Sigma V^H \quad\text{with}\quad \Sigma = \begin{pmatrix} \sqrt2 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}$$
Given an original image (here 359 × 371 pixels) shown in Fig. 8.1.
Fig. 8.1 Detail from Durer’s Melancolia, Fig. 8.2 Spectrum of singular values of A
dated 1514, 359x371 image
We can write it as a 359 × 371 matrix A which can then be decomposed via the singular
value decomposition as
A = U ΣV T
where U is 359 × 359 , Σ is 359 × 371 and V is 371× 371 .
where each rank 1 matrix ui viT is the size of the original matrix. Each one of these
significant compression of the image is possible if the spectrum of singular values has
only a few very strong entries.
We can therefore reconstruct the image from just a subset of modes. For example, in
MATLAB we can compute just the first mode as
[U,S,V] = svd(A);
B = U(:,1)*S(1,1)*V(:,1)';
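The same mode-by-mode reconstruction can be sketched in NumPy; the random product below is only a stand-in for the pixel matrix (the actual 359x371 image data is not reproduced here):

```python
import numpy as np

# Stand-in "image": a 40x50 matrix of exact rank at most 30.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 30)) @ rng.standard_normal((30, 50))

U, s, Vh = np.linalg.svd(A, full_matrices=False)

def rank_k(k):
    # Sum of the first k rank-1 modes sigma_i u_i v_i^T.
    return (U[:, :k] * s[:k]) @ Vh[:k, :]

# More modes give a smaller reconstruction error; all modes reproduce A.
err5  = np.linalg.norm(A - rank_k(5))
err20 = np.linalg.norm(A - rank_k(20))
assert err20 < err5
assert np.allclose(rank_k(30), A)
```

Storing k modes costs k(m + n + 1) numbers instead of mn, which is the compression mechanism described above.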
Fig. 8.6 EOF reconstruction with 100 modes
Another application of the SVD is the polar decomposition of a matrix. It is well known that
every complex number z can be written in the so-called polar form
$$z = up = pu, \quad |u| = 1, \ p = |z| \geq 0$$
Theorem 8.7 Let $A \in F^{m\times n}$.
(i) Suppose $m \geq n$. Then
$$A = UP \qquad (8.7)$$
for some $U \in F^{m\times n}$ whose columns are orthonormal (i.e., $U^HU = I_n$) and some
positive semidefinite $P \in F^{n\times n}$.
(ii) Suppose $m \leq n$. Then $A = QV^H$
for some $V \in F^{n\times m}$ whose columns are orthonormal and some positive
semidefinite $Q \in F^{m\times m}$.
(iii) Suppose $m = n$. Then $A = UP = QV^H$ with U, V unitary and P, Q positive semidefinite.
Proof: We shall prove only (i); (ii) and (iii) will follow.
Suppose $m \geq n$. By Theorem 8.2, $A = U_0\Sigma V^H$ where $U_0 \in F^{m\times n}$ has orthonormal
columns, $V \in F^{n\times n}$ is unitary, and $\Sigma \in F^{n\times n}$ is diagonal with nonnegative entries. Then
$$A = \left(U_0V^H\right)\left(V\Sigma V^H\right)$$
It is clear that the $n\times n$ matrix $V\Sigma V^H$ is positive semidefinite and has rank equal to
$\operatorname{rank}(A)$. Also, by direct checking, $\left(U_0V^H\right)^H\left(U_0V^H\right) = VU_0^HU_0V^H = VV^H = I_n$, which
shows that the columns of $U_0V^H$ are orthonormal. This proves (i).
Note that in Theorem 8.7(iii), the two unitary matrices U and $V^H$ may not be equal, and
the two positive semidefinite matrices P and Q may not be equal. However, it is not hard
to see that P and Q must be unitarily similar, since their eigenvalues are exactly the
singular values of A (exercise). In fact, it is easy to show that, in Theorem 8.7, P is
$\left(A^HA\right)^{\frac12}$, the unique positive semidefinite square root of $A^HA$, and Q is $\left(AA^H\right)^{\frac12}$
(exercise).
(Figure: a body in the reference configuration $\Omega_0$, with material point X, deformed into the current configuration $\Omega$, with spatial point x.)
To filter out the rigid body rotation of a deformation including in the deformation gradient
tensor F , a right Cauchy-Green strain tensor C is defined as
C = FT F (8.13)
Obviously C is positive definite. The Green-Lagrange strain tensor is defined as
$$E = \frac12(C - I) = \frac12\left(F^TF - I\right) \qquad (8.14)$$
From (8.14), it is clear that under pure rigid body rotation ($F = R$), $E = 0$: there is no strain.
Hence the stretch tensor is related to the right Cauchy-Green strain tensor C as
$$U = C^{\frac12} = \left(F^TF\right)^{\frac12} \qquad (8.14)$$
Using the spectral decomposition of U, we have
U = PΛPT (8.15)
and
Λ = diag ( λ1 , λ2 , λ3 ) , P = [ p1 p2 p3 ] (8.16)
where ( λ1 , λ2 , λ3 ) are the square roots of the eigenvalues of C, and [ p1 p2 p3 ] are the
Another strain definition for large deformation often used in continuum mechanics is the
logarithmic strain tensor:
ε = ln (U ) = Pdiag ( ln λ1 , ln λ2 , ln λ3 ) PT (8.17)
The rigid body rotation matrix R in the polar decomposition of (8.11) is given as
R = FU −1 (8.18)
with the inverse of U easily obtained from (8.15), by using Theorem 7.5, as
$$U^{-1} = P\operatorname{diag}\left(\frac{1}{\lambda_1}, \frac{1}{\lambda_2}, \frac{1}{\lambda_3}\right)P^T \qquad (8.19)$$
Similar results can be obtained for the left polar decomposition (8.12) by defining a left
Cauchy-Green strain tensor B:
B = FF T (8.20)
Example 8.2 Find the polar decomposition of the matrix
$$F = \begin{pmatrix} \sqrt3 & -\dfrac{1}{\sqrt3} & 0 \\ 0 & \sqrt{\dfrac53} & -\sqrt{\dfrac35} \\ 0 & 0 & \sqrt{\dfrac{12}{5}} \end{pmatrix}$$
Solution:
$$C = F^TF = \begin{pmatrix} 3 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 3 \end{pmatrix}$$
For $\lambda_1^2 = 1$:
$$N(C - I) = \operatorname{Span}\left\{\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}\right\} = \operatorname{Span}\left\{\begin{pmatrix} \tfrac{1}{\sqrt6} \\ \tfrac{2}{\sqrt6} \\ \tfrac{1}{\sqrt6} \end{pmatrix}\right\}$$
For $\lambda_2^2 = 3$:
$$N(C - 3I) = \operatorname{Span}\left\{\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}\right\} = \operatorname{Span}\left\{\begin{pmatrix} -\tfrac{1}{\sqrt2} \\ 0 \\ \tfrac{1}{\sqrt2} \end{pmatrix}\right\}$$
For $\lambda_3^2 = 4$:
$$N(C - 4I) = \operatorname{Span}\left\{\begin{pmatrix} -1 \\ 1 \\ -1 \end{pmatrix}\right\} = \operatorname{Span}\left\{\begin{pmatrix} -\tfrac{1}{\sqrt3} \\ \tfrac{1}{\sqrt3} \\ -\tfrac{1}{\sqrt3} \end{pmatrix}\right\}$$
Hence
$$\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \lambda_3) = \operatorname{diag}\left(1, \sqrt3, 2\right)$$
$$P = [p_1 \ p_2 \ p_3] = \begin{pmatrix} \tfrac{1}{\sqrt6} & -\tfrac{1}{\sqrt2} & -\tfrac{1}{\sqrt3} \\ \tfrac{2}{\sqrt6} & 0 & \tfrac{1}{\sqrt3} \\ \tfrac{1}{\sqrt6} & \tfrac{1}{\sqrt2} & -\tfrac{1}{\sqrt3} \end{pmatrix}$$
$$U = P\Lambda P^T = \begin{pmatrix} \frac56 + \frac{\sqrt3}{2} & -\frac13 & \frac56 - \frac{\sqrt3}{2} \\ -\frac13 & \frac43 & -\frac13 \\ \frac56 - \frac{\sqrt3}{2} & -\frac13 & \frac56 + \frac{\sqrt3}{2} \end{pmatrix}$$
$$U^{-1} = P\Lambda^{-1}P^T = \begin{pmatrix} \frac13 + \frac{1}{2\sqrt3} & \frac16 & \frac13 - \frac{1}{2\sqrt3} \\ \frac16 & \frac56 & \frac16 \\ \frac13 - \frac{1}{2\sqrt3} & \frac16 & \frac13 + \frac{1}{2\sqrt3} \end{pmatrix}$$
$$R = FU^{-1} = \begin{pmatrix} \frac12 + \frac{5\sqrt3}{18} & -\frac{\sqrt3}{9} & \frac{5\sqrt3}{18} - \frac12 \\[2pt] \frac{\sqrt{15}}{18} - \frac{\sqrt{15}}{15} + \frac{\sqrt5}{10} & \frac{11\sqrt{15}}{45} & \frac{\sqrt{15}}{18} - \frac{\sqrt{15}}{15} - \frac{\sqrt5}{10} \\[2pt] \frac{2\sqrt{15}}{15} - \frac{\sqrt5}{5} & \frac{\sqrt{15}}{15} & \frac{2\sqrt{15}}{15} + \frac{\sqrt5}{5} \end{pmatrix} \approx \begin{pmatrix} 0.981 & -0.192 & -0.019 \\ 0.181 & 0.947 & -0.267 \\ 0.069 & 0.258 & 0.964 \end{pmatrix}$$
Check: $R^TR = I$.
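A polar decomposition can also be computed directly from the SVD, which is a convenient way to check Example 8.2 numerically; polar below is our own helper, not a library call:

```python
import numpy as np

def polar(F):
    """Right polar decomposition F = R U, with R orthogonal and U symmetric PSD.

    Built from the SVD F = W S Vh: R = W Vh and U = V S V^T (a standard construction)."""
    W, s, Vh = np.linalg.svd(F)
    R = W @ Vh
    U = Vh.T @ np.diag(s) @ Vh
    return R, U

F = np.array([[np.sqrt(3.), -1/np.sqrt(3.), 0.],
              [0., np.sqrt(5/3), -np.sqrt(3/5)],
              [0., 0., np.sqrt(12/5)]])

R, U = polar(F)
assert np.allclose(R @ U, F)
assert np.allclose(R.T @ R, np.eye(3))   # R is orthogonal (the rotation)
assert np.allclose(U, U.T)
assert np.allclose(U @ U, F.T @ F)       # U = C^{1/2} with C = F^T F, as in (8.14)
# Principal stretches match Example 8.2: (1, sqrt(3), 2).
assert np.allclose(np.sort(np.linalg.eigvalsh(U)), [1., np.sqrt(3.), 2.])
```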
8.3 Pseudo Inverse
A pseudo inverse or generalized inverse of a matrix A is a matrix that has some properties
of the inverse matrix of A but not necessarily all of them. The term “the pseudo inverse”
commonly means the Moore–Penrose pseudo inverse.
The purpose of constructing a pseudo inverse is to obtain a matrix that can serve as the
inverse in some sense for a wider class of matrices than invertible ones. Typically, the
pseudo inverse exists for an arbitrary matrix, and when a matrix has an inverse, then its
inverse and the generalized inverse are the same.
Recall the definition of inverse matrix. If A ∈ F n×n is nonsingular, then it has a unique
inverse A−1 of the same order which satisfies
A−1 A = AA−1 = I n (8.21)
Theorem 8.8 For any $A \in F^{m\times n}$, there exists a unique $A^- \in F^{n\times m}$ which satisfies the
following:
$$\text{(a) } A^-AA^- = A^-, \quad \text{(b) } AA^-A = A, \quad \text{(c) } \left(A^-A\right)^H = A^-A, \quad \text{(d) } \left(AA^-\right)^H = AA^- \qquad (8.23)$$
Proof: Write $A = U\Sigma V^H$ as in Theorem 8.2 and define
$$A^- = V\Sigma^-U^H, \quad \Sigma^- = \begin{pmatrix} \Sigma_+^{-1} & 0 \\ 0 & 0 \end{pmatrix}_{n\times m} \qquad (8.24)$$
Then
$$A^-A = V\operatorname{diag}\big(\overbrace{1, \dots, 1}^{r}, \overbrace{0, \dots, 0}^{n-r}\big)V^H, \qquad AA^- = U\operatorname{diag}\big(\overbrace{1, \dots, 1}^{r}, \overbrace{0, \dots, 0}^{m-r}\big)U^H \qquad (8.25)$$
By direct verification, all conditions of (8.23) hold. Hence the existence of $A^-$ is proved.
For uniqueness, let $A^-$ and $A^\#$ both satisfy all conditions of (8.23). By (8.23)(b) and (d), we have
$$AA^\# = \left(AA^\#\right)^H = \left(\left(AA^-\right)\left(AA^\#\right)\right)^H = \left(AA^\#\right)^H\left(AA^-\right)^H = \left(AA^\#\right)\left(AA^-\right) = \left(AA^\#A\right)A^- = AA^-$$
Similarly, one can show that $A^\#A = A^-A$
(exercise). It then follows that
$$A^- = A^-AA^- = A^-AA^\# = \left(A^-A\right)A^\# = \left(A^\#A\right)A^\# = A^\#AA^\# = A^\#$$
where the first and the last equalities follow from (8.23)(a). The proof is now complete.
The matrix A− in the previous Theorem 8.8 is called the pseudo-inverse, or the
Moore-Penrose inverse, or the generalized inverse of A. Its explicit formula, in terms of
singular value decomposition, is given in (8.24). Clearly, if A is a square, nonsingular
matrix then $A^{-1} = A^-$. Also, if $A \in F^{m\times n}$ has full row (or column, respectively) rank, then
$A^-$ is the unique right (or left, respectively) inverse of A that satisfies the conditions of (8.23); in particular, $AA^- = I_m$ (or $A^-A = I_n$, respectively).
Recall that:
(a) if A has full row rank, then $\hat{x} := A^H\left(AA^H\right)^{-1}b$ satisfies $A\hat{x} = b$, and $\|\hat{x}\|_E < \|x\|_E$ for every x with $Ax = b$ and $x \neq \hat{x}$;
(b) (Theorem 6.13) if A has full column rank, then $\hat{x} := \left(A^HA\right)^{-1}A^Hb$ is the unique vector
that satisfies $\|A\hat{x} - b\|_E < \|Ax - b\|_E$ for all $x \in F^n$ such that $x \neq \hat{x}$.
However, these results are somewhat restrictive because they cannot be applied if
rank ( A) < min(m, n) . It turns out that, by using the pseudo-inverse, a stronger result
(Theorem 8.10) can be derived and it covers both results of (a), (b) above. The following
lemma is needed to prove this stronger result.
Lemma 8.9 Let $R(A)$ denote the column space of any matrix A. Then:
$$R\left(A^-\right) = R\left(A^H\right) \qquad (8.26)$$
Proof: Using (8.23)(a) and (c), for any x,
$$A^-x = \left(A^-AA^-\right)x = \left(A^-A\right)^HA^-x = A^H\left(\left(A^-\right)^HA^-x\right) \in R\left(A^H\right) \Rightarrow R\left(A^-\right) \subseteq R\left(A^H\right)$$
Conversely, since $\left(A^-A\right)^H = A^-A$ and $AA^-A = A$, we have $A^-AA^H = \left(A^-A\right)^HA^H = \left(AA^-A\right)^H = A^H$, so
$$A^Hx = A^-\left(AA^Hx\right) \in R\left(A^-\right) \Rightarrow R\left(A^H\right) \subseteq R\left(A^-\right)$$
Lemma 8.9 can also be proved by considering the singular value decompositions of $A^-$
and $A^H$.
Theorem 8.10 Let $A \in F^{m\times n}$ and $b \in F^m$. Then the vector $\hat{x} := A^-b$ is the unique
vector in $F^n$ that satisfies
(i) $\|A\hat{x} - b\|_E \leq \|Ax - b\|_E$ for all $x \in F^n$, and
(ii) $\|\hat{x}\|_E < \|x\|_E$ for all $x \in F^n$ which satisfy $\|Ax - b\|_E = \|A\hat{x} - b\|_E$ and $x \neq \hat{x}$.
(In other words, $\hat{x} := A^-b$ is the unique least squares solution of the system $Ax = b$ that
has the smallest Euclidean norm.)
Proof: Write
$$Ax - b = \left(Ax - A\hat{x}\right) + \left(A\hat{x} - b\right)$$
Notice that
$$A^H\left(A\hat{x} - b\right) = A^HAA^-b - A^Hb = A^H\left(AA^-\right)^Hb - A^Hb = \left(AA^-A\right)^Hb - A^Hb = A^Hb - A^Hb = 0$$
so $A\hat{x} - b \in N\left(A^H\right) = R(A)^\perp$, while
$$Ax - A\hat{x} = A\left(x - \hat{x}\right) \in R(A) = N\left(A^H\right)^\perp$$
We see that
$$\|Ax - b\|_E^2 = \left\|Ax - A\hat{x}\right\|_E^2 + \left\|A\hat{x} - b\right\|_E^2$$
Thus
$$\left\|A\hat{x} - b\right\|_E \leq \|Ax - b\|_E$$
with equality iff $Ax = A\hat{x}$, i.e., iff $x - \hat{x} \in N(A)$. For (ii), note that
$$\hat{x} = A^-b \in R\left(A^-\right) = R\left(A^H\right) = N(A)^\perp$$
and that
$$x = \left(x - \hat{x}\right) + \hat{x}$$
with $x - \hat{x} \in N(A)$. We have
$$\|x\|_E^2 = \left\|x - \hat{x}\right\|_E^2 + \|\hat{x}\|_E^2 > \|\hat{x}\|_E^2$$
for $x \neq \hat{x}$.
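Theorem 8.10 is exactly what numpy.linalg.pinv delivers; below is a sketch on a hypothetical rank-deficient system, where neither recalled result (a) nor (b) applies:

```python
import numpy as np

# A rank-deficient system: rank(A) = 1 < min(m, n).
A = np.array([[1., 1.],
              [1., 1.],
              [0., 0.]])
b = np.array([1., 3., 5.])

x_hat = np.linalg.pinv(A) @ b           # x_hat = A^- b = (1, 1)

# (i) Least squares: the residual is orthogonal to R(A).
assert np.allclose(A.T @ (A @ x_hat - b), 0.0)

# (ii) Minimal norm among all least squares solutions x_hat + t*n, n in N(A).
n = np.array([1., -1.])                 # here N(A) = span{(1, -1)^T}
assert np.allclose(A @ n, 0.0)
for t in (0.5, -2.0):
    x = x_hat + t * n
    assert np.isclose(np.linalg.norm(A @ x - b), np.linalg.norm(A @ x_hat - b))
    assert np.linalg.norm(x) > np.linalg.norm(x_hat)
```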
Example 8.3 According to Fig. 8.2, the position of the uppermost joint of the model
gives rise to the following least squares problem.
Fig. 8.2 Least square problem
Here $A^- = A^T\left(AA^T\right)^{-1}$ is our pseudo-inverse and we obtain
$$\Delta\varphi \approx \begin{pmatrix} -0.707 & 0.707 \\ -0.500 & 0.866 \\ -0.707 & 0.707 \end{pmatrix}\left(\begin{pmatrix} -0.707 & -0.500 & -0.707 \\ 0.707 & 0.866 & 0.707 \end{pmatrix}\begin{pmatrix} -0.707 & 0.707 \\ -0.500 & 0.866 \\ -0.707 & 0.707 \end{pmatrix}\right)^{-1}\begin{pmatrix} 0.02 \\ -0.01 \end{pmatrix}$$
$$= \begin{pmatrix} -0.707 & 0.707 \\ -0.500 & 0.866 \\ -0.707 & 0.707 \end{pmatrix}\begin{pmatrix} 1.250 & -1.433 \\ -1.433 & 1.750 \end{pmatrix}^{-1}\begin{pmatrix} 0.02 \\ -0.01 \end{pmatrix} = \begin{pmatrix} -0.707 & 0.707 \\ -0.500 & 0.866 \\ -0.707 & 0.707 \end{pmatrix}\begin{pmatrix} 13.06 & 10.69 \\ 10.69 & 9.328 \end{pmatrix}\begin{pmatrix} 0.02 \\ -0.01 \end{pmatrix}$$
Suppose
$$A = SBS^{-1} \qquad (8.27)$$
for some invertible S. Then $A^k = SB^kS^{-1}$ for all nonnegative integers k, and hence $p(A) = S\,p(B)\,S^{-1}$ for any polynomial p. If B
has a simple form, so that $B^k$ is easy to compute, then $p(A)$ can be obtained easily.
This is the case when A is diagonalizable, i.e., when the n × n matrix A has n linearly
independent eigenvectors, so that we may choose S to have these linearly independent
eigenvectors as columns, and B to be the diagonal matrix having the corresponding
eigenvalues as diagonal entries. However, not all square matrices are diagonalizable. For
example, consider the following matrix
⎛5 4 2 1⎞
⎜ ⎟
⎜ 0 1 −1 −1⎟
A=
⎜ −1 −1 3 0 ⎟
⎜ ⎟
⎝ 1 1 −1 2 ⎠
Including multiplicity, the eigenvalues of A are λ = 1, 2, 4, 4. The dimension of the kernel of (A − 4I) is 1, so A is not diagonalizable. Still, we would like a decomposition of the form (8.27): we shall find an invertible matrix S such that A = SJS⁻¹, where J is almost diagonal and is the Jordan normal form of A. The major task of this section is to fill in the theoretical and computational details of this example.
Recall that, by the Schur triangularization theorem, any complex, square matrix A is
unitarily similar to an upper triangular matrix, i.e.,
A = UTU H (8.27)
for some unitary U and upper triangular T, and the eigenvalues of A can be placed in any
order on the diagonal of T. In addition, if A and all its eigenvalues are real then U can be
chosen to be real orthogonal and T to be real. Although upper triangular matrices have
some nice structure (e.g., if T has diagonal entries λi, then T^k is also upper triangular and has diagonal entries λi^k), it is still difficult to compute their k-th powers. Moreover, the upper triangular matrix T in (8.27) need not be unique for a given matrix A.
Nevertheless, (8.27) can be considered as a first step towards finding a “simple” B that is
similar to A. In fact, our proof of the Jordan form theorem starts with the result of the
Schur triangularization theorem. We first prove an auxiliary result.
Lemma 8.11 Let A ∈ F^{n×n} and B ∈ F^{m×m} have no eigenvalue in common. If X ∈ F^{m×n} satisfies XA = BX, then X = 0.

Proof: Assume λ1, …, λn are the eigenvalues of A and η1, …, ηm are the eigenvalues of B, so that the characteristic polynomial of A is p(λ) = (λ − λ1)⋯(λ − λn). Then, by the Cayley–Hamilton theorem,

p(A) = (A − λ1I)⋯(A − λnI) = A^n + γ1A^{n−1} + ⋯ + γnI = 0   (8.28)
From XA = BX, we have

Xp(A) = XA^n + γ1XA^{n−1} + ⋯ + γnX = (XA)A^{n−1} + γ1(XA)A^{n−2} + ⋯ + γnX
      = BXA^{n−1} + γ1BXA^{n−2} + ⋯ + γnX = ⋯
      = (B^n + γ1B^{n−1} + ⋯ + γnI)X = p(B)X   (8.29)

Since p(A) = 0, this gives

0 = Xp(A) = p(B)X   (8.30)
On the other hand, by Schur's triangularization theorem, there exists unitary U such that

T = U^H BU = ⎛ η1  …   *  ⎞
             ⎜      ⋱     ⎟   (8.31)
             ⎝ 0        ηm ⎠

and

p(B) = U p(T) U^H   (8.32)

with

p(T) = (T − λ1I)⋯(T − λnI) = ⎛ η1−λ1  …    *    ⎞     ⎛ η1−λn  …    *    ⎞
                             ⎜        ⋱          ⎟ ⋯  ⎜        ⋱          ⎟   (8.33)
                             ⎝ 0         ηm−λ1   ⎠     ⎝ 0         ηm−λn   ⎠

Since A and B have no eigenvalue in common, ηi − λj ≠ 0 for all i, j, so every factor in (8.33) is nonsingular, and hence so are p(T) and p(B). Combining this with (8.30), we arrive at

0 = Xp(A) = p(B)X, p(B) is nonsingular ⇒ X = 0   ∎
We can now block-triangularize A by grouping equal eigenvalues: if the distinct eigenvalues of A ∈ ℂ^{n×n} are λ1, …, λk with multiplicities n1, …, nk, then

A = S ⎡ T1           ⎤ S⁻¹   (8.34)
      ⎢    T2        ⎥
      ⎢       ⋱      ⎥
      ⎣          Tk  ⎦

for some nonsingular S ∈ ℂ^{n×n} and upper triangular Ti ∈ ℂ^{ni×ni}, where all diagonal entries of Ti are λi (i = 1, …, k). Moreover, if A and all its eigenvalues are real, then S and the Ti can be chosen to be real.

Proof (by induction on the number of distinct eigenvalues): The case of a single distinct eigenvalue is immediate from Schur's theorem. Suppose the result holds whenever there are at most k0 distinct eigenvalues, and let A have k0 + 1 distinct eigenvalues. By the Schur triangularization theorem,

A = UTU^H

with T being an upper triangular matrix with diagonal entries

λ1, …, λ1 (n1 times), λ2, …, λ2 (n2 times), …, λk0+1, …, λk0+1 (nk0+1 times)

Partition

T = ⎡ T(1)   Y   ⎤
    ⎣ 0     T(2) ⎦

where T(1), T(2), Y have orders p × p, q × q, p × q, respectively. Note that T(1), T(2) are upper triangular, and all eigenvalues of T(2) (which are equal to λk0+1) are distinct
from those of T(1). We claim that there exists X ∈ ℂ^{p×q} such that

⎡ Ip  X  ⎤ ⎡ T(1)   0   ⎤   ⎡ T(1)   Y   ⎤ ⎡ Ip  X  ⎤
⎢        ⎥ ⎢            ⎥ = ⎢            ⎥ ⎢        ⎥   (8.35)
⎣ 0   Iq ⎦ ⎣ 0     T(2) ⎦   ⎣ 0     T(2) ⎦ ⎣ 0   Iq ⎦

If this is true then

T = Sa ⎡ T(1)   0   ⎤ Sa⁻¹
       ⎣ 0     T(2) ⎦

where Sa is the invertible matrix

⎡ Ip  X  ⎤
⎣ 0   Iq ⎦

As T(1) has only k0 distinct eigenvalues, the induction hypothesis gives

T(1) = Sb (T1 ⊕ ⋯ ⊕ Tk0) Sb⁻¹

where the Ti are ni × ni upper triangular matrices with all diagonal entries equal to λi (i = 1, …, k0). It then follows that (8.34) holds for S = USa(Sb ⊕ Iq) and Tk0+1 = T(2).
To prove the claim, note that (8.35) is equivalent to

XT(2) − T(1)X = Y   (8.36)

Define φ: ℂ^{p×q} → ℂ^{p×q} by φ(X) = XT(2) − T(1)X. It is easy to check that φ is a linear map. If φ(X) = 0, then XT(2) = T(1)X, so by Lemma 8.11 and the assumption that T(1) and T(2) have no eigenvalues in common, X = 0. Hence φ is injective, and therefore also surjective. In particular, there exists X ∈ ℂ^{p×q} such that Y = φ(X) = XT(2) − T(1)X, and thus (8.36) holds for this X. Our claim is proved.
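Equation (8.36) is a Sylvester equation, which can also be solved numerically. A sketch (an aside, assuming SciPy; T1, T2, Y are made-up blocks with the disjoint spectra the claim requires): rewriting XT(2) − T(1)X = Y as (−T(1))X + XT(2) = Y matches the form AX + XB = Q accepted by `scipy.linalg.solve_sylvester`.

```python
import numpy as np
from scipy.linalg import solve_sylvester

T1 = np.array([[1., 2.],
               [0., 1.]])        # upper triangular, spectrum {1}
T2 = np.array([[4., 1., 0.],
               [0., 4., 1.],
               [0., 0., 4.]])    # upper triangular, spectrum {4}
Y = np.array([[1., 0., 2.],
              [3., 1., 0.]])

# Solve (-T1) X + X T2 = Y, i.e. X T2 - T1 X = Y.
X = solve_sylvester(-T1, T2, Y)
print(np.allclose(X @ T2 - T1 @ X, Y))   # True
```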
Finally, if A and all its eigenvalues are real, then the matrices U, Sa, Sb above can be chosen to be real, and hence S and the Ti are real as well. This completes the proof. ∎
Definition 8.3 Let λ ∈ ℂ and n be a positive integer. The Jordan block Jn(λ) is the n × n matrix

Jn(λ) = ⎡ λ  1         ⎤
        ⎢    λ  ⋱      ⎥
        ⎢       ⋱   1  ⎥
        ⎣           λ  ⎦

with λ on the diagonal, 1 on the superdiagonal, and zeros elsewhere. A Jordan matrix is a direct sum of Jordan blocks in the form Jn1(λ1) ⊕ ⋯ ⊕ Jnk(λk) (here the eigenvalues λi need not be distinct).
Notice that J1 ( λ ) is just the 1× 1 matrix ( λ ), and a diagonal matrix is just a direct
sum of 1×1 Jordan blocks. We shall prove later that every complex matrix is similar to
a Jordan matrix (i.e., a direct sum of Jordan blocks). The Jordan matrix, which has a
much simpler structure than general upper triangular matrices, is then the matrix B we
want to find at the beginning of this chapter. We need the following lemmas to prove this
theorem. Recall that a permutation matrix is a square matrix in which every row and every column has exactly one nonzero entry, whose value is 1.
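Direct sums of Jordan blocks are easy to build explicitly. A minimal sketch (an aside, assuming NumPy and SciPy; `jordan_block` is a helper written here, not a library routine):

```python
import numpy as np
from scipy.linalg import block_diag

def jordan_block(lam, n):
    """The n x n Jordan block J_n(lam): lam on the diagonal, 1 on the superdiagonal."""
    return lam * np.eye(n) + np.diag(np.ones(n - 1), 1)

# The Jordan matrix J_2(3) + J_1(3) + J_3(-1) as a direct sum of blocks.
J = block_diag(jordan_block(3.0, 2), jordan_block(3.0, 1), jordan_block(-1.0, 3))
print(J)
```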
Proof: Exercise.
Lemma 8.14 Let k, l be positive integers. Then:
(i) nullity of Jk(0)^l = min(k, l); in particular, Jk(0)^l = 0 if l ≥ k.
(ii) Jk(0)e1 = 0 and Jk(0)ei+1 = ei for i = 1, …, k − 1.
Proof: Exercise.
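Lemma 8.14(i) can be spot-checked numerically for a particular block, say k = 5 (an aside, assuming NumPy):

```python
import numpy as np

k = 5
J = np.diag(np.ones(k - 1), 1)          # J_5(0): ones on the superdiagonal

for l in range(1, 7):
    J_l = np.linalg.matrix_power(J, l)
    nullity = k - np.linalg.matrix_rank(J_l)
    print(l, nullity, min(k, l))        # the last two columns agree
```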
Lemma 8.15 Let A ∈ ℂ^{n×n} have all of its eigenvalues equal to λ. Then there exist a nonsingular matrix S ∈ ℂ^{n×n} and positive integers n1, …, nk such that n1 + ⋯ + nk = n and

A = S ⎡ Jn1(λ)                   ⎤ S⁻¹   (8.38)
      ⎢        Jn2(λ)            ⎥
      ⎢               ⋱          ⎥
      ⎣                  Jnk(λ)  ⎦

Moreover, if A and λ are real, then S can be chosen to be real.

Proof: It suffices to treat the case λ = 0: the matrix Ā = A − λI has all of its eigenvalues equal to 0, and if Ā = S(Jn1(0) ⊕ Jn2(0) ⊕ ⋯ ⊕ Jnk(0))S⁻¹, then

A = Ā + λI = S(Jn1(λ) ⊕ Jn2(λ) ⊕ ⋯ ⊕ Jnk(λ))S⁻¹

So assume all eigenvalues of A are zero, and argue by induction on n; the case n = 1 is trivial. Suppose the lemma holds for the cases n < N, where N > 1 is a fixed integer, and let A ∈ ℂ^{N×N} be such that all its eigenvalues are zero. If A = (0), then

A = J1(0) ⊕ J1(0) ⊕ ⋯ ⊕ J1(0)

and the result follows. Therefore we assume A ≠ (0), so that dim R(A) ≥ 1. Note that 0 being an eigenvalue of A implies dim ker(A) ≥ 1, and hence dim R(A) < N. Thus

1 ≤ dim R(A) < N

Since R(A) is invariant under A, let φ denote the restriction of A to R(A). Observe that any eigenvalue of φ (i.e., scalar λ which satisfies φ(v) = λv for some nonzero v ∈ R(A)) is also an eigenvalue of A, and hence equals 0. By the induction hypothesis, there is a basis B of R(A) with respect to which φ has the matrix representation

Jm1(0) ⊕ ⋯ ⊕ Jml(0)
If we denote the m1-th vector of B by w1 then, from the m1-th column of the above matrix representation, we see that Aw1 is the (m1 − 1)-th vector in B. With similar arguments, we conclude that

B = { A^{m1−1}w1, …, Aw1, w1,  A^{m2−1}w2, …, Aw2, w2,  …,  A^{ml−1}wl, …, Awl, wl }

(the i-th group containing mi terms)
for some w1, …, wl ∈ R(A), and A^{mi}wi = 0 for all i = 1, …, l. Since each wi ∈ R(A), let vi ∈ ℂ^N be such that Avi = wi, and set

E = { A^{m1}v1, …, Av1, v1,  A^{m2}v2, …, Av2, v2,  …,  A^{ml}vl, …, Avl, vl }

so that span(E) ⊇ span(B) = R(A). For any x ∈ ℂ^N we have Ax ∈ R(A) = span(B), say

Ax = Σ_{i=1}^{l} Σ_{j=0}^{mi−1} αi,j A^{j+1}vi = Ay

where

y = Σ_{i=1}^{l} Σ_{j=0}^{mi−1} αi,j A^j vi ∈ span(E)
and hence

A(x − y) = 0, i.e., (x − y) ∈ ker(A)

Therefore x = y + (x − y) ∈ span(E) + ker(A), and since x was arbitrary,

ℂ^N = span(E) + ker(A)

(note that this may not be a direct sum). Since span(E) ⊆ ℂ^N = span(E) + ker(A), it follows that (why?) one can find linearly independent vectors u1, …, uh ∈ ker(A) such that

ℂ^N = span(E) ⊕ span{u1, …, uh}

Letting the vectors of E together with u1, …, uh form the columns of a (necessarily nonsingular) matrix S ∈ ℂ^{N×N}, we have

S⁻¹AS = Jm1+1(0) ⊕ ⋯ ⊕ Jml+1(0) ⊕ J1(0) ⊕ ⋯ ⊕ J1(0)   (h terms of J1(0))

If A and λ are real in the first place, then we can replace every occurrence of ℂ by ℝ in the above, and hence S is real also. This completes the proof. ∎
Theorem 8.16 (Jordan Form) For A ∈ ℂ^{n×n}, there exist a nonsingular matrix S ∈ ℂ^{n×n} and positive integers m1, …, mr such that m1 + ⋯ + mr = n and A = SJS⁻¹, where

J = Jm1(λ1) ⊕ ⋯ ⊕ Jmr(λr)   (8.39)

Here λ1, …, λr (not necessarily distinct) are eigenvalues of A. The Jordan matrix J is unique up to permutations of the diagonal Jordan blocks. Moreover, if A and all its eigenvalues are real, then S can be chosen to be real.
Proof: By (8.34),

A = S0(T1 ⊕ ⋯ ⊕ Tk)S0⁻¹

where the Ti are upper triangular matrices whose diagonal entries are all equal to an eigenvalue λi of A. By Lemma 8.15, for each i there is a nonsingular Si with

Ti = Si(Jni,1(λi) ⊕ ⋯ ⊕ Jni,ri(λi))Si⁻¹

By taking

S = S0(S1 ⊕ ⋯ ⊕ Sk)

we obtain A = SJS⁻¹ with

J = (Jn1,1(λ1) ⊕ ⋯ ⊕ Jn1,r1(λ1)) ⊕ (Jn2,1(λ2) ⊕ ⋯ ⊕ Jn2,r2(λ2)) ⊕ ⋯ ⊕ (Jnk,1(λk) ⊕ ⋯ ⊕ Jnk,rk(λk))

If A and all λ1, …, λr are real, then S0, S1, …, Sk can be chosen to be real, so that S is real. It remains to prove that the matrix J is unique up to permutations of its diagonal Jordan blocks. Let J be the Jordan matrix Jm1(λ1) ⊕ ⋯ ⊕ Jmr(λr), and let λ ∈ ℂ. Then, for any positive integer l,
(J − λI)^l = Jm1(λ1 − λ)^l ⊕ ⋯ ⊕ Jmr(λr − λ)^l

and hence

nullity of (J − λI)^l = Σ_{i=1}^{r} nullity of Jmi(λi − λ)^l
                      = Σ_{i∈{1,…,r}, λi=λ} nullity of Jmi(0)^l
                      = Σ_{i∈{1,…,r}, λi=λ} min(mi, l)

since Jmi(λi − λ) is nonsingular when λi ≠ λ, and by Lemma 8.14(i).
Suppose J̃ = Jk1(μ1) ⊕ ⋯ ⊕ Jks(μs) is another Jordan matrix similar to A. Then J and J̃ are similar to each other, and hence (J − λI)^l and (J̃ − λI)^l are similar to each other for any λ ∈ ℂ and any positive integer l. Consequently, (J − λI)^l and (J̃ − λI)^l have the same nullity, which means

Σ_{i∈{1,…,r}, λi=λ} min(mi, l) = Σ_{j∈{1,…,s}, μj=λ} min(kj, l)

for any λ ∈ ℂ and any positive integer l. From this, it can be deduced easily that the Jordan blocks of J and J̃ coincide up to a permutation. ∎
The matrix J in the above theorem is called the Jordan form of A. Suppose we choose some ordering of the distinct eigenvalues of A, say, λ1, …, λt. The diagonal Jordan blocks of J can be ordered in such a way that the first blocks are those having λ1 as eigenvalue, ordered in decreasing size; the next blocks are those having λ2 as eigenvalue, again ordered in decreasing size; and so on. Then the representation of J is unique, once the ordering of the distinct eigenvalues of A is fixed. In particular, the sum of the sizes of all Jordan blocks corresponding to an eigenvalue λi is the algebraic multiplicity of λi.
Note: The Jordan form of a matrix is a conceptual tool and is never used in numerical computation, because there is no numerically stable method to compute the Jordan form of a matrix. To demonstrate this, consider the matrices

A(ε) = ⎛ 0  ε ⎞ ,   ε ∈ ℂ
       ⎝ 0  0 ⎠

Clearly, A(ε) → A(0) as ε → 0. The Jordan form of A(ε) is J2(0) for every ε ≠ 0 (via the similarity matrix S = diag(ε, 1)), while the Jordan form of A(0) is the zero matrix J1(0) ⊕ J1(0). Hence the Jordan form of A(ε) does not converge to the Jordan form of A(0) as ε → 0, even though A(ε) does: an arbitrarily small perturbation changes the Jordan form discontinuously. Nevertheless, the Jordan form is still a very useful tool for analyzing properties of square matrices.
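The discontinuity described in the note is easy to see in floating point. A sketch (an aside, assuming NumPy), using the similarity matrix S = diag(ε, 1) from the text:

```python
import numpy as np

def jordan_of_A_eps(eps):
    """S^{-1} A(eps) S with S = diag(eps, 1), valid for eps != 0."""
    A = np.array([[0.0, eps], [0.0, 0.0]])
    S = np.array([[eps, 0.0], [0.0, 1.0]])
    return np.linalg.inv(S) @ A @ S

for eps in (1.0, 1e-4, 1e-12):
    # Always ~[[0, 1], [0, 0]], never the zero matrix, no matter how small eps is.
    print(eps, jordan_of_A_eps(eps))
```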
Example 8.4 We return to the example given at the beginning of this section. For the matrix

A = ⎛  5   4   2   1 ⎞
    ⎜  0   1  −1  −1 ⎟
    ⎜ −1  −1   3   0 ⎟
    ⎝  1   1  −1   2 ⎠

find its Jordan form and S.

Solution: The characteristic equation of A is

det(A − λI) = (λ − 1)(λ − 2)(λ − 4)² = 0
For λ1 = 1:   N(A − I) = Span{ (1, −1, 0, 0)^T }

For λ2 = 2:   N(A − 2I) = Span{ (1, −1, 0, 1)^T }

For λ3 = λ4 = 4:   N(A − 4I) = Span{ (1, 0, −1, 1)^T }

Since the nullity of (A − 4I) is 1, there is only one Jordan block for the eigenvalue 4, and the Jordan form of A is

J = ⎛ 1  0  0  0 ⎞
    ⎜ 0  2  0  0 ⎟
    ⎜ 0  0  4  1 ⎟
    ⎝ 0  0  0  4 ⎠
To find S = [v1 v2 v3 v4], we require AS = SJ. Since

AS = [Av1  Av2  Av3  Av4]

and

SJ = [v1  2v2  4v3  v3 + 4v4]

this is equivalent to

Av1 = v1        ⇒ (A − I)v1 = 0
Av2 = 2v2       ⇒ (A − 2I)v2 = 0
Av3 = 4v3       ⇒ (A − 4I)v3 = 0
Av4 = v3 + 4v4  ⇒ (A − 4I)v4 = v3

Applying (A − 4I) to the last equation gives (A − 4I)²v4 = (A − 4I)v3. But (A − 4I)v3 = 0, so

(A − 4I)²v4 = 0

that is, v4 is a generalized eigenvector for the eigenvalue 4. Taking v1, v2, v3 to be the eigenvectors found above, and solving (A − 4I)v4 = v3 (for example, v4 = (1, 0, 0, 0)^T), we obtain

S = [v1 v2 v3 v4] = ⎡  1   1   1  1 ⎤
                    ⎢ −1  −1   0  0 ⎥
                    ⎢  0   0  −1  0 ⎥
                    ⎣  0   1   1  0 ⎦

Check: A = SJS⁻¹
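The result of Example 8.4 can be double-checked symbolically (an aside, assuming SymPy; `Matrix.jordan_form` returns a pair (S, J) with A = SJS⁻¹, though the S it chooses may differ from the one above, since Jordan bases are not unique):

```python
import sympy as sp

A = sp.Matrix([[ 5,  4,  2,  1],
               [ 0,  1, -1, -1],
               [-1, -1,  3,  0],
               [ 1,  1, -1,  2]])

S, J = A.jordan_form()          # A == S * J * S**-1
print(J)                        # eigenvalues 1, 2, 4, 4, with a single 2x2 block for 4
assert sp.simplify(S * J * S.inv() - A) == sp.zeros(4, 4)
```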
Exercises

4. Given A = ⎛ 1  0  0  0  2 ⎞
             ⎜ 0  0  3  0  0 ⎟
             ⎜ 0  0  0  0  0 ⎟
             ⎝ 0  4  0  0  0 ⎠ , find its SVD.

5. Find the SVD of A = ⎛ 0  1 ⎞
                       ⎜ 1  0 ⎟
                       ⎝ 1  1 ⎠ .
6. Prove that the rank of a matrix is equal to the number of nonzero singular values of
the matrix.
7. Given A = ⎛ 1   2  −1 ⎞
             ⎜ 0   2   0 ⎟
             ⎝ 1  −2   3 ⎠ , find its Jordan form and S.
8. Let A ∈ ℂ^{n×n}. Show that ‖A^k‖2 ≤ ‖A‖2^k for all k ≥ 2. If A is invertible, show that ‖A⁻¹‖2 ≥ 1/‖A‖2.

9. Let A ∈ ℂ^{n×n} have singular values σ1, …, σn and eigenvalues λ1, …, λn. Show that ∏_{i=1}^{n} σi = ∏_{i=1}^{n} |λi|.
10. Let A ∈ ℂ^{n×n} be normal and let Q ∈ ℂ^{n×n} be unitary. Show that Q⁻¹AQ is normal.
11. Given A = ⎛  0   1 ⎞
              ⎝ −1  −2 ⎠ , find its Jordan form and S.
12. Let A be a square matrix. If A is diagonalizable, show that all Jordan blocks in its
Jordan form have size 1×1 .
13. Let A1 , , Ak be square matrices. If A1 ⊕ ⊕ Ak is diagonalizable, show that all
A1 , , Ak are diagonalizable.
15. Let A = ⎛  1   1 ⎞
            ⎜  2  −2 ⎟
            ⎝ −2   2 ⎠

(a) Find the singular values of A, and find the matrices U, Σ, V in the SVD of A = UΣV^H.
(b) Find the matrices U, P in the polar form A = UP.
(c) Find the pseudoinverse A⁻ of A.
16. Let Jk(λ) denote the k × k Jordan block with eigenvalue λ. Let m be a positive integer.

(a) If m < k, show that

Jk(0)^m = ⎛ 0  ⋯  1        ⎞
          ⎜       ⋱        ⎟
          ⎜          ⋱  1  ⎟
          ⎜                ⎟
          ⎝ 0           0  ⎠

where the "1" on the first row is located at the (m + 1)-th position, the ones run down the m-th superdiagonal, and all other unspecified entries are 0. If m ≥ k, show that Jk(0)^m = 0.

(b) Show that

Jk(λ)^m = ⎛ λ^m  C(m,1)λ^{m−1}  C(m,2)λ^{m−2}  ⋯  C(m,k−1)λ^{m−k+1} ⎞
          ⎜      λ^m            C(m,1)λ^{m−1}  ⋱         ⋮          ⎟
          ⎜                     ⋱              ⋱   C(m,2)λ^{m−2}    ⎟
          ⎜                                    ⋱   C(m,1)λ^{m−1}    ⎟
          ⎝                                        λ^m              ⎠

where the binomial coefficient C(m, r) = m!/(r!(m − r)!) if 0 ≤ r ≤ m, and C(m, r) = 0 otherwise. Hint: consider Jk(λ)^m = (λI + Jk(0))^m, expand by the binomial theorem (λI and Jk(0) commute), and use part (a).
(b) R ( AA− ) = R ( A) .
18. Given A = ⎛  4  0  1  0 ⎞
              ⎜  2  2  3  0 ⎟
              ⎜ −1  0  2  0 ⎟
              ⎝  4  0  1  2 ⎠ , find its Jordan form and S.
19. Find all possible Jordan forms of a transformation with characteristic polynomial
( x − 1) 2 ( x + 2) 2 .
20. Find all possible Jordan forms of a transformation with characteristic polynomial
( x − 1)3 ( x + 2) .
21. Diagonalize these: (a) ⎛ 1  1 ⎞      (b) ⎛ 0  1 ⎞
                           ⎝ 0  0 ⎠          ⎝ 1  0 ⎠

22. Decide if these two are similar: (a) ⎛ 1  −1 ⎞      (b) ⎛ −1   0 ⎞
                                         ⎝ 4  −3 ⎠          ⎝  1  −1 ⎠
23. If x, y ∈ ℂⁿ, show that [xy^T]⁻ = [x^H x]⁻ [y^H y]⁻ ȳx^H.
24. For A ∈ ℝ^{m×n}, prove that R(A) = R(AA^T) using only definitions and elementary properties of the Moore–Penrose pseudoinverse.

25. For A ∈ ℝ^{m×n}, prove that R(A⁻) = R(A^T).

26. For A ∈ ℝ^{p×n} and B ∈ ℝ^{m×n}, show that N(A) ⊆ N(B) if and only if BA⁻A = B.
27. Find the Jordan form of the matrix ⎛ 0  −1 ⎞
                                       ⎝ 1   0 ⎠

28. Compute the pseudoinverse of ⎛ 1  1 ⎞
                                 ⎝ 2  2 ⎠
29. Let A ∈ ℝ^{n×n}, B ∈ ℝ^{n×m}, D ∈ ℝ^{m×m}, and suppose further that D is nonsingular.

(a) Prove or disprove that

⎡ A  AB ⎤⁻   ⎡ A⁻  −A⁻ABD⁻ ⎤
⎢       ⎥  = ⎢             ⎥
⎣ 0  D  ⎦    ⎣ 0     D⁻    ⎦

(b) Prove or disprove that

⎡ A  B ⎤⁻   ⎡ A⁻  −A⁻BD⁻ ⎤
⎢      ⎥  = ⎢            ⎥
⎣ 0  D ⎦    ⎣ 0     D⁻   ⎦
30. Solve the least squares problem Ax = b, where

A = ⎛ 2  −4  5 ⎞        ⎛  1 ⎞
    ⎜ 6   0  3 ⎟ ,  b = ⎜  3 ⎟
    ⎜ 2  −4  5 ⎟        ⎜ −1 ⎟
    ⎝ 6   0  3 ⎠        ⎝  3 ⎠
(b) The matrix S is 5 × 5 with two eigenvalues. For the eigenvalue 2 the nullities are: S − 2I has nullity two, and (S − 2I)² has nullity four. For the eigenvalue −1 the nullities are: S + I has nullity one. Find the Jordan form of S.
32. Find the Jordan form and a Jordan basis and S for each matrix (the last two are relabeled (f) and (g)).

(a) ⎛ −10   4 ⎞      (b) ⎛ 5  −4 ⎞
    ⎝ −25  10 ⎠          ⎝ 9  −7 ⎠

(c) ⎛ 4  0  0 ⎞   (d) ⎛  5   4   3 ⎞   (e) ⎛  9   7   3 ⎞   (f) ⎛  2   2  −1 ⎞
    ⎜ 2  1  3 ⎟       ⎜ −1   0  −3 ⎟       ⎜ −9  −7  −4 ⎟       ⎜ −1  −1   1 ⎟
    ⎝ 5  0  4 ⎠       ⎝  1  −2   1 ⎠       ⎝  4   4   4 ⎠       ⎝ −1  −2   2 ⎠

(g) ⎛  7  1   2   2 ⎞
    ⎜  1  4  −1  −1 ⎟
    ⎜ −2  1   5  −1 ⎟
    ⎝  1  1   2   8 ⎠