Linear Algebra

Contents

1 Review of Matrix Algebra
  1.1 Determinants, Minors, and Cofactors
  1.2 Rank, Trace, and Inverse
  1.3 Elementary Operations and Matrices
...
6 MATLAB Commands
For example,

P(s) = [ (s+1)/(s² + 3s + 2)   (s+3)/(s² + 5s + 4) ]

is a matrix of rational functions, and

A = [ 1+j  j  0;  1+3j  5  2j ]

is a 2 × 3 complex matrix.
The trace of a square matrix A is the sum of its diagonal elements:

Trace(A) = Σ_{i=1}^n a_ii    (1)
Two matrices A and B are equal, written A = B, if and only if A has the same number of rows and columns as B and a_ij = b_ij for all i, j. Two matrices A and B that have the same numbers of rows and columns may be added or subtracted element by element, i.e.

C = [c_ij] = A ± B  ⟺  c_ij = a_ij ± b_ij,  ∀ i, j    (2)

If A is m × n and B is n × p, the product C = AB is the m × p matrix whose elements are

c_ij = Σ_{k=1}^n a_ik b_kj    (3)
7. (B + C)A = BA + CA
Let A = [a_ij] be an m × n matrix. The transpose of A is denoted by A^T = [a_ji] and is the n × m matrix obtained by interchanging rows and columns. The matrix A is symmetric if A = A^T, skew-symmetric if A = −A^T. Also, it can be seen that (AB)^T = B^T A^T. In the case where A contains complex elements, we let Ā be the conjugate of A, whose elements are the conjugates of those of A. Matrices satisfying A = Ā^T are Hermitian and those satisfying A = −Ā^T are skew-Hermitian.
For example, let

A = [2 4 1; 3 0 2; 2 0 3],   B = [2 0; 3 0; 1 1]

Then

A·B = [17 1; 8 2; 7 3]
Note that rank(A + B) ≠ rank(A) + rank(B) and that Tr(AB) ≠ Tr(A)·Tr(B).

Example 7 Let

A = [2 4 1; 3 0 2; 2 0 3];   B = [1 0 0; 0 0 0; 0 0 0]

and note that r_A = 3, r_B = 1, while r_{A+B} = 3. Also, note that Tr(AB) = 2 while Tr(A)·Tr(B) = 5 × 1 = 5.
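The numbers of Example 7 are easy to check in MATLAB; a quick sketch using only commands described in Section 6:

    A = [2 4 1; 3 0 2; 2 0 3];
    B = [1 0 0; 0 0 0; 0 0 0];
    rank(A), rank(B), rank(A + B)   % 3, 1, 3: rank(A+B) is not rank(A)+rank(B)
    trace(A*B)                      % 2
    trace(A)*trace(B)               % 5: not equal to trace(A*B)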
The inverse of a square matrix A is given by

A^{-1} = C^T / |A|

where C is the matrix of cofactors C_ij of A. Note that for the inverse to exist, |A| must be nonzero, which is equivalent to saying that A is nonsingular. We also write C^T = Adjoint(A).

The following property holds:

(AB)^{-1} = B^{-1} A^{-1}

assuming of course that A and B are compatible and both invertible.
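The cofactor formula can be verified numerically; a minimal sketch on a small nonsingular matrix, recovering Adjoint(A) from inv(A) and det(A):

    A = [2 1 0; 0 2 1; 1 1 1];
    adjA = inv(A)*det(A);     % Adjoint(A) = C^T, since A^{-1} = C^T/|A|
    norm(A*inv(A) - eye(3))   % ~0: A*A^{-1} = I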
For example, the nonsingular matrix

A = [2 1 0; 0 2 1; 1 1 1]

can be reduced by elementary operations to

B = [1 0 0; 0 1 0; 0 0 1]
Consider the polynomial matrix

P(s) = [s 0; 0 s²; 1 s+1]

1. Interchanging rows 1 and 3:

P₁(s) = [0 0 1; 0 1 0; 1 0 0] P(s) = [1 s+1; 0 s²; s 0]

2. Subtracting s times row 1 from row 3:

P₂(s) = [1 0 0; 0 1 0; −s 0 1] P₁(s) = [1 s+1; 0 s²; 0 −s(s+1)]

3. Replacing row 3 by s times row 3 plus (s+1) times row 2:

P₃(s) = [1 0 0; 0 1 0; 0 s+1 s] P₂(s) = [1 s+1; 0 s²; 0 0]
In general, any matrix of rank r can be reduced via row and column operations to one of the following normal forms:

I_r,  [I_r | 0],  [I_r | 0]^T,  or  [I_r 0; 0 0]    (4)

Next, we discuss vector spaces or, as they are sometimes called, linear spaces.
A set of vectors {x₁, ..., xₙ} is linearly independent if

Σ_{i=1}^n a_i x_i = 0  ⟹  a_i = 0,  i = 1, ..., n
Example 12 Let X be the vector space of all vectors x = [x₁ ⋯ xₙ]^T such that all components are equal. Then X is spanned by the vector of all 1's; therefore dim(X) = 1. On the other hand, if X is the vector space of all polynomials of degree n−1 or less, a basis is {1, t, t², ..., t^{n−1}}, which makes dim(X) = n.
Consider the set of linear equations

Σ_{j=1}^n a_ij x_j = y_i,  i = 1, ..., m    (5)

or in matrix notation

Ax = y    (6)

Forming the augmented matrix

W = [A y]    (7)

then rank(W) = rank(A) is a necessary condition for the existence of at least one solution. Now, if z ∈ N(A), and if x is any solution to Ax = y, then x + z is also a solution. Therefore, for a unique solution to exist, we need N(A) = {0}. This requires that the columns of A form a basis of R(A), i.e., that there be n of them and that they be linearly independent. Then the A matrix is invertible and x = A^{-1} y is the unique solution.
An eigenvalue of an n × n matrix A is a scalar λ_i such that

A x_i = λ_i x_i    (8)

for some nonzero vector x_i, called an eigenvector, i.e.

(A − λ_i I) x_i = 0    (9)

Noting that x_i cannot be the zero vector, and recalling the conditions on the existence of solutions of linear equations, we see that we have to require

det(A − λ_i I) = |A − λ_i I| = 0    (10)

We thus obtain an nth-degree polynomial which, when set to zero, gives the characteristic equation

Δ(λ) = |A − λI| = (−λ)^n + c_{n−1} λ^{n−1} + ⋯ + c₁ λ + c₀ = 0    (11)

Its n roots λ_i are the eigenvalues of A, and they satisfy

Tr(A) = Σ_{i=1}^n λ_i = (−1)^{n+1} c_{n−1},    |A| = Π_{i=1}^n λ_i = c₀    (12)
Example 13 Let

A = [0 1 0; 0 0 1; −18 −27 −10]

Its characteristic equation is λ³ + 10λ² + 27λ + 18 = 0, with roots λ₁ = −1, λ₂ = −3, λ₃ = −6, and

Adj(A − λI) = [λ²+10λ+27  λ+10  1;  −18  λ²+10λ  λ;  −18λ  −27λ−18  λ²]

so that for λ₁ = −1 we can see that column 1 is

x₁ = [18 −18 18]^T
Similarly, from column 2 evaluated at λ₂ = −3, x₂ = [7 −21 63]^T, and from column 3 at λ₃ = −6, x₃ = [1 −6 36]^T. There is another method of obtaining the eigenvectors from the definition, by actually solving

A x_i = λ_i x_i

Note that once all eigenvectors are obtained, we can arrange them in an n × n modal matrix M = [x₁ x₂ ⋯ xₙ]. Note that the eigenvectors are not unique, since if x_i is an eigenvector then so is αx_i for any scalar α.
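In MATLAB the eigenvalues and a modal matrix for Example 13 are obtained directly (the columns returned by eig are normalized, hence proportional to the x_i above):

    A = [0 1 0; 0 0 1; -18 -27 -10];
    [M, D] = eig(A);    % columns of M are eigenvectors
    diag(D)             % -1, -3, -6 (in some order)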
2. Some eigenvalues are repeated: In this case, a full set of independent eigenvectors may or may not exist. Suppose λ_i is an eigenvalue with algebraic multiplicity m_i. Then the dimension of the null space of A − λ_i I, which is also the number of linearly independent eigenvectors associated with λ_i, is the geometric multiplicity q_i of the eigenvalue λ_i. We can distinguish 3 cases:

(a) Fully degenerate case, q_i = m_i: In this case there will be q_i independent solutions to (A − λ_i I)x_i = 0 rather than just one.
Example 14 Given the matrix

A = [10/3 −1 1 1/3; 0 4 0 0; 2/3 1 3 −1/3; 2/3 1 −1 11/3]

its characteristic equation is

Δ(λ) = λ⁴ − 14λ³ + 72λ² − 160λ + 128 = 0

There are 4 roots, λ₁ = 2, λ₂ = λ₃ = λ₄ = 4. We can find the eigenvector associated with λ₁ as x₁ = [1 0 −1 −1]^T. Then, we find

(4I − A)x = [2/3 1 −1 −1/3; 0 0 0 0; −2/3 −1 1 1/3; −2/3 −1 1 1/3] x

which has rank 1, so that q₁ = m₁ = 3 independent eigenvectors can be found for λ = 4.
(b) When q_i < m_i, the missing eigenvectors are replaced by generalized eigenvectors: each chain starts from an eigenvector, and every subsequent generalized eigenvector satisfies

(A − λ_i I) x_{m_i} = x_{m_i − 1}    (13)
Example 15 Given the matrix

A = [0 1 0 0; 0 0 1 0; 0 0 0 1; −8 −20 −18 −7]

its characteristic equation is

Δ(λ) = λ⁴ + 7λ³ + 18λ² + 20λ + 8 = 0

Then λ₁ = −1, λ₂ = λ₃ = λ₄ = −2. The eigenvector of λ₁ is easily found to be x₁ = [1 −1 1 −1]^T. On the other hand, one eigenvector of λ₂ is found to be x₂ = [0.125 −0.25 0.5 −1]^T; then x₂ = (A + 2I)x₃ leads to x₃ = [0.1875 −0.25 0.25 0]^T, and x₃ = (A + 2I)x₄ gives x₄ = [0.1875 −0.1875 0.125 0]^T.
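The chain relations of this example can be verified numerically; a minimal sketch:

    A  = [0 1 0 0; 0 0 1 0; 0 0 0 1; -8 -20 -18 -7];
    x2 = [0.125 -0.25 0.5 -1]';
    x3 = [0.1875 -0.25 0.25 0]';
    x4 = [0.1875 -0.1875 0.125 0]';
    (A + 2*eye(4))*x3 - x2   % ~0: x3 is a generalized eigenvector
    (A + 2*eye(4))*x4 - x3   % ~0: x4 continues the chain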
The index k_i of λ_i is the smallest integer such that

rank(A − λ_i I)^{k_i} = n − m_i    (14)

The index indicates the length of the longest chain of eigenvectors and generalized eigenvectors associated with λ_i.
Example 16 Given the matrix

A = [0 0 1 0; 0 0 0 1; 0 0 0 0; 0 0 0 0]

its characteristic equation is

Δ(λ) = λ⁴ = 0

so λ₁ = 0 with m₁ = 4. Next, form A − λ₁I = A, and determine that rank(A − λ₁I) = 2. There are then 2 eigenvectors and 2 generalized eigenvectors. The question is whether we have one eigenvector/generalized-eigenvector chain of length 3, or 2 chains of length 2. To check that, note that n − m₁ = 0 and that rank(A − λ₁I)² = 0; therefore the index is k₁ = 2. This guarantees that the longest chain has length 2, making the length of the other chain 2 as well. First, consider (A − λ₁I)²x = 0. Any vector satisfies this equation, but only 4 vectors are linearly independent. Let us choose x₁ = [1 0 0 0]^T. Is this an eigenvector or a generalized eigenvector? It is an eigenvector, since (A − λ₁I)x₁ = 0. Similarly, x₂ = [0 1 0 0]^T is an eigenvector. On the other hand, x₃ = [0 0 1 0]^T is a generalized eigenvector, since (A − λ₁I)x₃ = x₁ ≠ 0. Similarly, x₄ = [0 0 0 1]^T is a generalized eigenvector associated with x₂.
Example 17 Let

A = [4 1 −2; 0 2 0; 0 1 2]

The eigenvalues are λ₁ = 4, λ₂ = λ₃ = 2, and the corresponding eigenvectors are x₁ = [1 0 0]^T and x₂ = [1 0 1]^T. Therefore we have three eigenvalues but only two linearly independent eigenvectors. We call the vector x₃ = [1 1 1]^T, which satisfies

A x₃ = 2x₃ + x₂,

a generalized eigenvector of A. The vector x₃ is special in the sense that x₂ and x₃ together span a two-dimensional A-invariant subspace.
The spectral radius of A is defined as

ρ(A) := max_{i=1,...,n} |λ_i|    (16)

where |·| is the modulus of the argument; thus ρ(A) is the radius of the smallest circle in the complex plane, centered at the origin, that includes all the eigenvalues.
As described earlier, the non-null vectors x_i such that A x_i = λ_i x_i are the (right) eigenvectors of A. Similarly, y ≠ 0 is a left-eigenvector of A if y*A = λ y*. In general a matrix A has at least one eigenvector. It is easy to see that if x is an eigenvector of A, Span{x} is an A-invariant subspace.

In general, let us suppose that a matrix A has an eigenvalue λ of multiplicity r but with only one corresponding eigenvector. Then we can define r − 1 generalized eigenvectors in the following way:

A x₁ = λ x₁
A x₂ = λ x₂ + x₁
  ⋮
A x_r = λ x_r + x_{r−1}
Each chain of an eigenvector and its generalized eigenvectors gives rise to a Jordan block of the form

J_i = [λ_i 1 0 ⋯ 0;  0 λ_i 1 ⋯ 0;  ⋮ ⋱ ⋱ ⋮;  0 ⋯ 0 λ_i 1;  0 ⋯ 0 0 λ_i]

Finally, the general case will have Jordan blocks each of size k_i, as shown in the examples below.
Example 18 Let

A = [0 1 0; 0 0 1; −18 −27 −10]

From Example 13, the eigenvalues are −1, −3, −6 and the modal matrix is

M = [1 1 1; −1 −3 −6; 1 9 36]

with

M^{-1} = [1.8 0.9 0.1; −1 −1.167 −0.167; 0.2 0.267 0.067]

then Λ = M^{-1} A M = diag(−1, −3, −6).
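The diagonalization can be confirmed in MATLAB:

    A = [0 1 0; 0 0 1; -18 -27 -10];
    M = [1 1 1; -1 -3 -6; 1 9 36];
    M\A*M    % diag(-1, -3, -6), up to rounding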
Example 19 Let A be the matrix of Example 14,

A = [10/3 −1 1 1/3; 0 4 0 0; 2/3 1 3 −1/3; 2/3 1 −1 11/3]

Then its characteristic equation is

Δ(λ) = λ⁴ − 14λ³ + 72λ² − 160λ + 128 = 0

There are 4 roots, λ₁ = 2, λ₂ = λ₃ = λ₄ = 4. We can find the eigenvector associated with λ₁ as x₁ = [1 0 −1 −1]^T. Then we find x₂ = [1 0 0 2]^T, x₃ = [0 1 1 0]^T, and x₄ = [0 1 0 3]^T. Therefore

M = [1 1 0 0; 0 0 1 1; −1 0 1 0; −1 2 0 3]

then

M^{-1} = [1/3 1/2 −1/2 −1/6; 2/3 −1/2 1/2 1/6; 1/3 1/2 1/2 −1/6; −1/3 1/2 −1/2 1/6]

and

J = M^{-1} A M = [2 0 0 0; 0 4 0 0; 0 0 4 0; 0 0 0 4]
Example 20 Let A be the matrix of Example 15,

A = [0 1 0 0; 0 0 1 0; 0 0 0 1; −8 −20 −18 −7]

with characteristic equation

Δ(λ) = λ⁴ + 7λ³ + 18λ² + 20λ + 8 = 0

Then λ₁ = −1, λ₂ = λ₃ = λ₄ = −2. Find the eigenvectors and generalized eigenvectors as before and form

M = [1 0.125 0.1875 0.1875; −1 −0.25 −0.25 −0.1875; 1 0.5 0.25 0.125; −1 −1 0 0]

Then

J = M^{-1} A M = [−1 0 0 0; 0 −2 1 0; 0 0 −2 1; 0 0 0 −2]
Example 21 Let A be the matrix of Example 16,

A = [0 0 1 0; 0 0 0 1; 0 0 0 0; 0 0 0 0]

with Δ(λ) = λ⁴ = 0, so λ₁ = 0 with m₁ = 4. Ordering each eigenvector together with its generalized eigenvector, we form

M = [x₁ x₃ x₂ x₄] = [1 0 0 0; 0 0 1 0; 0 1 0 0; 0 0 0 1]

therefore making

J = M^{-1} A M = [0 1 0 0; 0 0 0 0; 0 0 0 1; 0 0 0 0]

which consists of two Jordan blocks of size 2.
4.2 Norms

A norm is a generalization of the ideas of distance and length. As stability theory is usually concerned with the size of some vectors and matrices, we give here a brief description of some norms that will be used in these notes. We will consider first the norms of vectors defined on a vector space X with the associated scalar field of real numbers R.

Let X be a linear space on a field F. A function ‖·‖ : X → R is called a norm if it satisfies the following properties:

1. ‖x‖ ≥ 0, ∀ x ∈ X
2. ‖x‖ = 0 ⟺ x = 0
3. ‖ax‖ = |a| ‖x‖, ∀ a ∈ F, x ∈ X
4. ‖x + y‖ ≤ ‖x‖ + ‖y‖, ∀ x, y ∈ X

Let now X = Cⁿ. For a vector x = [x₁ ⋯ xₙ]^T in Cⁿ the p-norm is defined as

‖x‖_p := (Σ_{i=1}^n |x_i|^p)^{1/p},  p ≥ 1    (17)
In particular,

‖x‖₁ := Σ_{i=1}^n |x_i|
‖x‖₂ := (Σ_{i=1}^n |x_i|²)^{1/2}
‖x‖_∞ := max_i |x_i|

For example, let

x = [1 2 2]^T

Then ‖x‖₁ = 5, ‖x‖₂ = 3 and ‖x‖_∞ = 2.
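These three values are reproduced by MATLAB's norm command:

    x = [1 2 2]';
    norm(x,1), norm(x,2), norm(x,inf)   % 5, 3, 2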
Let us now consider the norms for a matrix A ∈ C^{m×n}. First let us define the induced p-norms

‖A‖_p := sup_{x≠0} ‖Ax‖_p / ‖x‖_p,  p ≥ 1

These norms are induced by the p-norms on vectors. For p = 1, 2, ∞ there exist explicit formulas for the induced p-norms:

‖A‖₁ := max_{1≤j≤n} Σ_{i=1}^m |a_ij|    (maximum column sum)
‖A‖₂ := (λ_max(A*A))^{1/2}    (spectral norm)
‖A‖_∞ := max_{1≤i≤m} Σ_{j=1}^n |a_ij|    (maximum row sum)
Unless otherwise specified, we shall adopt the convention of denoting the 2-norm without any subscript; therefore by ‖x‖ and ‖A‖ we shall mean respectively ‖x‖₂ and ‖A‖₂.

Another often-used matrix norm is the so-called Frobenius norm

‖A‖_F := (tr(A*A))^{1/2} = (Σ_{i=1}^m Σ_{j=1}^n |a_ij|²)^{1/2}    (18)

It is possible to show that the Frobenius norm cannot be induced by any vector norm.
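The explicit formulas above can be compared against MATLAB's built-in norms; for instance, with the matrix of Example 7:

    A = [2 4 1; 3 0 2; 2 0 3];
    norm(A,1)              % max column sum: 7
    norm(A,inf)            % max row sum: 7
    sqrt(max(eig(A'*A)))   % equals norm(A,2), the spectral norm
    norm(A,'fro')          % equals sqrt(trace(A'*A))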
The following Lemma presents some useful results regarding matrix norms.

Lemma 1 Let A ∈ C^{m×n}. Then

1. ‖AB‖ ≤ ‖A‖ ‖B‖ for any induced norm (submultiplicative property)
2. ‖UAV‖_F = ‖A‖_F for any unitary matrices U and V
3. ρ(A) ≤ ‖A‖, for any induced norm and the Frobenius norm
Given a square nonsingular matrix A, the quantity

κ(A) := ‖A‖_p ‖A^{-1}‖_p    (19)

is called the condition number of A with respect to the induced matrix norm ‖·‖_p. From Lemma 1, we have

κ(A) = ‖A‖_p ‖A^{-1}‖_p ≥ ‖A A^{-1}‖_p = 1

If κ(A) is large, we say that A is ill conditioned; if κ(A) is small (i.e., close to 1), we say that A is well conditioned. It is possible to prove that, given a matrix A, the reciprocal of κ(A) gives a measure of how far A is from a singular matrix.
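A sketch checking the definition (19) against the built-in command, for the 2-norm:

    A = [2 1 0; 0 2 1; 1 1 1];
    cond(A)                  % sigma_max/sigma_min for the 2-norm
    norm(A)*norm(inv(A))     % same value, from definition (19)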
We now present an important property of norms of vectors in Rⁿ, which will be useful in the sequel.

1. For any x ∈ Rⁿ,

‖x‖₂ ≤ ‖x‖₁ ≤ √n ‖x‖₂,    ‖x‖_∞ ≤ ‖x‖₂ ≤ √n ‖x‖_∞

2. Consider again the vector x = [1 2 2]^T of the example above. Then we can check that

‖x‖₁ = 5 ≤ √3 ‖x‖₂ ≈ 5.196,    ‖x‖₂ = 3 ≤ √3 ‖x‖_∞ ≈ 3.464
Note that a norm may be defined independently from an inner product. Also, we can define the generalized angle θ between two vectors in Rⁿ using

x · y = ⟨x, y⟩ = ‖x‖ ‖y‖ cos θ    (20)

where θ is the angle between x and y. Using the inner product, we can define the orthogonality of two vectors by

x · y = ⟨x, y⟩ = 0    (21)

which of course means that θ = (2i + 1)π/2.
The Cayley–Hamilton theorem states that every square matrix satisfies its own characteristic equation. Writing the characteristic polynomial as

Δ(λ) = (−λ)^n + c_{n−1} λ^{n−1} + ⋯ + c₁ λ + c₀ = 0    (22)

we then have

Δ(A) = (−1)^n A^n + c_{n−1} A^{n−1} + ⋯ + c₁ A + c₀ I = 0    (23)

One application is the computation of the inverse: multiplying (23) by A^{-1} and solving for A^{-1} gives

A^{-1} = −(1/c₀) [(−1)^n A^{n−1} + c_{n−1} A^{n−2} + ⋯ + c₁ I]
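This is easy to check numerically. Note that MATLAB's poly(A) returns the coefficients of det(λI − A), which agree with those in (22) up to an overall sign; the sketch below uses the matrix of Example 13:

    A = [0 1 0; 0 0 1; -18 -27 -10];
    p = poly(A);                   % [1 10 27 18]
    polyvalm(p, A)                 % ~0: A satisfies its characteristic equation
    Ainv = -(A^2 + p(2)*A + p(3)*eye(3))/p(4);
    norm(Ainv - inv(A))            % ~0: inverse via Cayley-Hamilton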
25
d
dt exp(At)
= Aexp(At) = exp(At)A
Z t
0
eA(t ) bu( )d
2 1
x =
x
4 2
1
x(0) =
1
26
To find x(t), let us find exp(At) using two methods. First, by evaluating the infinite series, which terminates here because A² = 0:

exp(At) = I + At + A²t²/2 + ⋯
        = [1 0; 0 1] + [2 1; −4 −2] t + 0
        = [1+2t t; −4t 1−2t]    (24)

so that

x(t) = [1+2t t; −4t 1−2t] [1; 1] = [1+3t; 1−6t]

Next, consider the Laplace transform approach, where we find

(sI − A)^{-1} = [(s+2)/s²  1/s²;  −4/s²  (s−2)/s²]    (25)

so that, taking the inverse Laplace transform entry by entry,

exp(At) = [1+2t t; −4t 1−2t]    (26)
Now consider a third method, whereby we transform A to its Jordan normal form. The A matrix has a double eigenvalue at zero with an eigenvector x₁ = [1 −2]^T and a generalized eigenvector x₂ = [0 1]^T. Therefore, the M matrix is given by

M = [1 0; −2 1]    (27)

Using T = M and the change of variables x = Tz, we obtain

J = T^{-1} A T = [0 1; 0 0]

so that

ż = [0 1; 0 0] z;    z(0) = T^{-1} x(0) = [1 3]^T    (28)

Then z(t) = exp(Jt) z(0) = [1 t; 0 1][1; 3] = [3t+1; 3], and

x(t) = T z(t) = [1 0; −2 1] [3t+1; 3] = [3t+1; 1−6t]    (29)
Now let

ẋ = [−1 2; 2 −3] x,    x(0) = [1 1]^T

Then there are 2 distinct eigenvalues, and we can find exp(Jt):

λ₁ = 0.2361;    λ₂ = −4.2361

M = [0.8507 0.5257; 0.5257 −0.8507]

Since M is orthogonal and symmetric, M^{-1} = M, so that

exp(At) = M exp(Jt) M^{-1}
        = [0.7237 e^{0.2361t} + 0.2764 e^{−4.2361t}   0.4472(e^{0.2361t} − e^{−4.2361t});
           0.4472(e^{0.2361t} − e^{−4.2361t})   0.2764 e^{0.2361t} + 0.7237 e^{−4.2361t}]
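Both examples can be cross-checked against MATLAB's expm, which computes the matrix exponential directly (t = 0.7 is an arbitrary test time):

    t = 0.7;
    A1 = [2 1; -4 -2];
    expm(A1*t) - [1+2*t t; -4*t 1-2*t]   % ~0, since A1^2 = 0
    A2 = [-1 2; 2 -3];
    [M, D] = eig(A2);
    norm(M*expm(D*t)/M - expm(A2*t))     % ~0: modal decomposition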
Consider the partitioned matrix

A = [A₁₁ A₁₂; A₂₁ A₂₂]    (30)

with A₁₁ and A₂₂ square matrices. Suppose that A₁₁ is nonsingular. Then the matrix

Δ := A₂₂ − A₂₁ A₁₁^{-1} A₁₂

is called the Schur complement of A₁₁ in A. Similarly, if A₂₂ is nonsingular, the matrix

Δ̄ := A₁₁ − A₁₂ A₂₂^{-1} A₂₁

is the Schur complement of A₂₂ in A. A useful expression for the inverse of A in terms of partitioned blocks is

A^{-1} = [A₁₁^{-1} + A₁₁^{-1}A₁₂Δ^{-1}A₂₁A₁₁^{-1}   −A₁₁^{-1}A₁₂Δ^{-1};   −Δ^{-1}A₂₁A₁₁^{-1}   Δ^{-1}]    (31)

supposing that all the relevant inverses exist. The following well-established identities are also very useful:

(A − BD^{-1}C)^{-1} = A^{-1} + A^{-1}B(D − CA^{-1}B)^{-1}CA^{-1}    (32)
A(I + BA)^{-1} = (I + AB)^{-1}A    (33)
(I + AB)^{-1} = I − A(I + BA)^{-1}B    (34)
Multiplying A on the right by the nonsingular matrix

[I −A₁₁^{-1}A₁₂; 0 I]

it is easy to verify that

det(A) = det(A₁₁) det(Δ)    (35)

Similarly, multiplying on the right by

[I 0; −A₂₂^{-1}A₂₁ I]

we find that

det(A) = det(A₂₂) det(Δ̄)    (36)

Consider A ∈ C^{m×n} and B ∈ C^{n×m}. Using identities (35) and (36), it is easy to prove that

det(I_m − AB) = det(I_n − BA)
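A numerical spot check of (35); the particular blocks below are arbitrary test data:

    A11 = [2 1; 1 2]; A12 = [1 0; 0 1];
    A21 = [0 1; 1 0]; A22 = [3 1; 1 3];
    A = [A11 A12; A21 A22];
    Delta = A22 - A21*(A11\A12);     % Schur complement of A11
    det(A) - det(A11)*det(Delta)     % ~0, identity (35)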
We now show that any square matrix is unitarily similar to an upper triangular matrix. Let λ₁ be an eigenvalue of A with normalized eigenvector u₁, and complete u₁ to a unitary matrix U₁ = [u₁ z₂ ⋯ zₙ]. Then

A [u₁ z₂ ⋯ zₙ] = [u₁ z₂ ⋯ zₙ] [λ₁ *; 0 A₁]

Therefore

A = U₁ [λ₁ *; 0 A₁] U₁*    (37)
Now let us consider A₁ ∈ C^{(n−1)×(n−1)}. From (37) it is easy to see that the (n−1) eigenvalues of A₁ are also eigenvalues of A. Given λ₂ an eigenvalue of A₁ with normalized eigenvector u₂, we can construct a unitary matrix U₂ = [u₂ z₂ ⋯ z_{n−1}] ∈ C^{(n−1)×(n−1)} so that

A₁ U₂ = U₂ [λ₂ *; 0 A₂]

Denoting by V₂ the unitary matrix

V₂ = [1 0; 0 U₂]

we have

V₂* U₁* A U₁ V₂ = [1 0; 0 U₂*] [λ₁ *; 0 A₁] [1 0; 0 U₂] = [λ₁ * *; 0 λ₂ *; 0 0 A₂]
A similar inductive construction yields the singular value decomposition A = UΣV*. For n = 1, A is a column vector; set σ = ‖A‖ and complete A/σ to a unitary matrix, so that

A = [A/σ u₂ ⋯ u_m] [σ; 0; ⋮; 0]

Therefore A = UΣV*, with U = [A/σ u₂ ⋯ u_m], Σ = [σ 0 ⋯ 0]^T and V = 1.
We suppose now that the theorem holds for m = p + k and n = k, and prove it for m = p + k + 1 and n = k + 1. Let A ∈ C^{(p+k+1)×(k+1)}. All the eigenvalues of A*A are real (since A*A is Hermitian), nonnegative, and at least one is greater than zero since A ≠ 0. Denote by σ² the maximum eigenvalue of A*A and by v the correspondent normalized eigenvector:

A*A v = σ² v

Let u = Av/σ, and complete u and v to unitary matrices [u U₀] and [v V₀]. Then

[u U₀]* A [v V₀] = [σ  u*AV₀; 0  U₀*AV₀]

Since u*A = (1/σ) v*A*A = σ v*, it follows that u*AV₀ = σ v*V₀ = 0. Now an easy inductive argument completes the proof, noting that U₀*AV₀ ∈ C^{(p+k)×k}.
The scalars σ_i are called the singular values of A. They are usually ordered nonincreasingly, σ₁ ≥ σ₂ ≥ ⋯ ≥ σ_r ≥ 0, with r = min{m, n}. The largest and smallest singular values are denoted by

σ_max(A) := σ₁,    σ_min(A) := σ_r

The columns v_i of V are called the right singular vectors of A, and the columns u_i of U the left singular vectors. They are related by

A v_i = σ_i u_i,    i = 1, ..., r
The following Lemma shows some of the information we can get from the singular value decomposition of a matrix A.

Lemma 2 Let A ∈ C^{m×n} and consider its singular value decomposition A = UΣV*. Then

1. The rank k of A equals the number of singular values different from zero
2. Range(A) = Span{u₁, ..., u_k}
3. Ker(A) = Span{v_{k+1}, ..., v_n}
4. σ₁ = ‖A‖
5. ‖A‖²_F = Σ_{i=1}^r σ_i²
6. Given a square nonsingular matrix A, σ_min(A) = 1/σ_max(A^{-1})
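The items of Lemma 2 are easy to verify numerically; a short sketch:

    A = [2 4 1; 3 0 2; 2 0 3];
    [U, S, V] = svd(A);
    s = diag(S);
    rank(A) == nnz(s > 1e-10)       % item 1
    norm(A) - s(1)                  % ~0, item 4
    norm(A,'fro')^2 - sum(s.^2)     % ~0, item 5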
Any real square matrix A can be decomposed into its symmetric and antisymmetric parts, A = A_s + A_a, where

A_s = (A + A^T)/2;    A_a = (A − A^T)/2

Similarly, a complex matrix can be decomposed into Hermitian and skew-Hermitian parts, A = A_H + A_{AH}, where

A_H = (A + (Ā)^T)/2;    A_{AH} = (A − (Ā)^T)/2

Then note that ⟨x, Ax⟩ = ⟨x, A_s x⟩ if A is real, and ⟨x, Ax⟩ = ⟨x, A_H x⟩ if A is complex.
5.4.1 Definite Matrices

Let us consider a Hermitian matrix A. A is said to be positive (semi)definite if x*Ax > (≥) 0 for all nonzero x ∈ Cⁿ. We shall indicate a positive (semi)definite matrix by A > (≥) 0. A Hermitian matrix A is said to be negative (semi)definite if −A is positive (semi)definite. The following Lemma gives a characterization of definite matrices.

Lemma 3 Let A be a Hermitian matrix. Then

1. A is positive (negative) definite if and only if all its eigenvalues are positive (negative).
2. A is positive (negative) semidefinite if and only if all its eigenvalues are nonnegative (nonpositive).

Given a real n × n matrix Q, then

1. Q is positive definite if and only if x^T Q x > 0 for all x ≠ 0.
2. Q is positive semidefinite if x^T Q x ≥ 0 for all x.
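Definiteness is typically checked in MATLAB through the eigenvalues, or through a Cholesky factorization; the test matrix below is an arbitrary symmetric example:

    Q = [2 -1 0; -1 2 -1; 0 -1 2];
    eig(Q)              % all positive, so Q > 0 by Lemma 3
    [~, p] = chol(Q);   % p == 0 also certifies positive definiteness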
As a verification of identity (32), set A = A₁, B = A₂, C = A₃, D = A₄, and let Δ = A₄ − A₃A₁^{-1}A₂. Then

(A₁ − A₂A₄^{-1}A₃)(A₁^{-1} + A₁^{-1}A₂Δ^{-1}A₃A₁^{-1})
  = I + A₂Δ^{-1}A₃A₁^{-1} − A₂A₄^{-1}A₃A₁^{-1} − A₂A₄^{-1}A₃A₁^{-1}A₂Δ^{-1}A₃A₁^{-1}
  = I − A₂A₄^{-1}[I − A₄Δ^{-1} + A₃A₁^{-1}A₂Δ^{-1}]A₃A₁^{-1}
  = I − A₂A₄^{-1}[I − (A₄ − A₃A₁^{-1}A₂)Δ^{-1}]A₃A₁^{-1}
  = I    (38)

and, multiplying in the reverse order,

(A₁^{-1} + A₁^{-1}A₂Δ^{-1}A₃A₁^{-1})(A₁ − A₂A₄^{-1}A₃)
  = I − A₁^{-1}A₂[I − Δ^{-1}A₄ + Δ^{-1}A₃A₁^{-1}A₂]A₄^{-1}A₃
  = I − A₁^{-1}A₂[I − Δ^{-1}(A₄ − A₃A₁^{-1}A₂)]A₄^{-1}A₃
  = I
Consider now a Hermitian matrix partitioned as

A = [A₁₁ A₁₂; A₁₂* A₂₂]    (39)

with A₁₁ and A₂₂ square matrices. The following Theorem gives a characterization of a positive definite matrix in terms of its partition (39).

Theorem 3 Let A be a Hermitian matrix partitioned as in (39). Then

1. A is positive definite if and only if A₁₁ and A₂₂ − A₁₂* A₁₁^{-1} A₁₂ are positive definite
2. A is positive definite if and only if A₂₂ and A₁₁ − A₁₂ A₂₂^{-1} A₁₂* are positive definite

Proof 3 It suffices to note that

[A₁₁ A₁₂; A₁₂* A₂₂] = [I 0; A₁₂*A₁₁^{-1} I] [A₁₁ 0; 0 A₂₂ − A₁₂*A₁₁^{-1}A₁₂] [I A₁₁^{-1}A₁₂; 0 I]

and that

[A₁₁ A₁₂; A₁₂* A₂₂] = [I A₁₂A₂₂^{-1}; 0 I] [A₁₁ − A₁₂A₂₂^{-1}A₁₂* 0; 0 A₂₂] [I 0; A₂₂^{-1}A₁₂* I]

so that A is congruent to the corresponding block-diagonal matrix, and congruence preserves definiteness.
6 MATLAB Commands
There are plenty of MATLAB commands dealing with matrices. In this section we shall give a brief overview of those related to the topics covered in this chapter. We recall that the (conjugate) transpose of a matrix A is evaluated by simply typing A', and that a polynomial is represented by the row vector of its coefficients ordered in descending powers.
Q=orth(A) returns a matrix Q whose columns are an orthonormal basis for
the range of a (rectangular) matrix A.
Q=null(A) returns a matrix Q whose columns are an orthonormal basis for
the null space of a (rectangular) matrix A.
det(A) returns the determinant of a square matrix A.
inv(A) returns the inverse (if it exists) of a square matrix A. A warning is
given if the matrix is ill conditioned.
pinv(A) returns the pseudo-inverse of a matrix A.
trace(A) returns the trace of a square matrix A.
rank(A) returns the rank of a matrix A.
cond(A) evaluates the condition number (19) of a square matrix A, using the matrix spectral norm. In this case (19) becomes

κ(A) = σ_max(A) / σ_min(A)

The command condest(A) can be used to get an estimate of the 1-norm condition number.
[V,D]=eig(A) returns a diagonal matrix D whose entries are the eigenvalues of A and a matrix V whose columns are the normalized eigenvectors of A, such that A*V=V*D. If A is not diagonalizable, V is singular. The command eigs(A) returns only some of the eigenvalues, by default the six largest in magnitude.

poly(A) returns a row vector with (n + 1) elements, whose entries are the coefficients of the characteristic polynomial (11) of A.
norm(A) returns the 2-norm for both vectors and matrices, though there are some differences in the two cases:

If A is a vector, the p-norm defined in (17) can be evaluated by typing norm(A,p), where p can be either a real number or the string inf, to evaluate the ∞-norm.

If A is a matrix, the argument p in the command norm(A,p) can only be 1, 2, inf, or 'fro', where the returned norms are respectively the 1-, 2-, ∞-, or Frobenius matrix norms defined in section 4.2.