Linear Print
V. Ravichandran
Department of Mathematics
University of Delhi
Chapter 1
Matrices
Given the square matrix A = [ai j ] of order n, the entries a11 , a22 , . . . , ann
are the diagonal entries of the matrix A.
Definition 1.1.1
A matrix all of whose entries are zero is called a zero matrix, and
a square matrix of order n whose diagonal entries are 1 and whose
other entries are zero is called an identity matrix of order n. It is
denoted by In or simply I.
Example 1.1.2
The matrices
    [0 0 0; 0 0 0]  and  [0 0 0; 0 0 0; 0 0 0]
are both zero matrices, while
    [1 0; 0 1]  and  [1 0 0; 0 1 0; 0 0 1]
are identity matrices of order 2 and 3 respectively.
Example 1.1.3
Let the matrices A, B and C be given by
    A = [1 2 3; 4 5 6],  B = [x 2 3; 4 y z],  C = [1 2; 3 4; 5 6].
The matrices A and B are equal if x = 1, y = 5 and z = 6.
The matrices A and C are not equal as they have different
orders.
Definition 1.1.3
Let A = [aij] and B = [bij] be two matrices of the same order. Then
their sum A + B is defined by A + B = [aij + bij]. Similarly subtraction
and scalar multiplication are defined by
    A − B = [aij − bij]
and
    kA = [kaij],
where k is a scalar. If A1, A2, ..., An are matrices of the same order
and c1, c2, ..., cn are scalars, then the expression
c1 A1 + c2 A2 + · · · + cn An
is called a linear combination of A1 , A2 , . . ., An with coefficients
c1 , c2 , . . ., cn .
Example 1.1.4
Let the matrices A, B and C be given by
    A = [1 −2 3; −1 2 3],  B = [0 −1 2; −3 2 1],  C = [3 2 −1; 2 0 −1].
Then we have
    A + B = [1 −2 3; −1 2 3] + [0 −1 2; −3 2 1]
          = [1+0 −2−1 3+2; −1−3 2+2 3+1]
          = [1 −3 5; −4 4 4].
Similarly
    A − B = [1 −2 3; −1 2 3] − [0 −1 2; −3 2 1]
          = [1−0 −2−(−1) 3−2; −1−(−3) 2−2 3−1]
          = [1 −1 1; 2 0 2].
Also we have
    3C = 3[3 2 −1; 2 0 −1] = [9 6 −3; 6 0 −3].
Also the linear combination of A, B and C with coefficients 2, 1, −1 is given by
    2A + B − C = 2[1 −2 3; −1 2 3] + [0 −1 2; −3 2 1] − [3 2 −1; 2 0 −1]
               = [2 −4 6; −2 4 6] + [0 −1 2; −3 2 1] + [−3 −2 1; −2 0 1]
               = [−1 −7 9; −7 6 8].
Definition 1.1.4
A matrix A is conformable with the matrix B if the number of
columns of A is equal to the number of rows of B. That is, if A
is an m × p matrix and B is a q × n matrix, then the matrix A is
conformable with the matrix B if p = q.
If the matrix A is conformable with the matrix B, then the product
AB is defined as follows.
Definition 1.1.5
The product or multiplication of two matrices A = [aij] and B = [bjk]
of order m × n and n × p respectively is the m × p matrix given by
    AB = [∑_{j=1}^{n} aij bjk].
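The defining formula is easy to compute with. The following Python sketch (our illustration, not part of the text; the helper name matmul is ours) multiplies two matrices stored as lists of rows exactly as in Definition 1.1.5:

    def matmul(A, B):
        # (AB)_ik = sum over j of a_ij * b_jk, for A of order m x n and B of order n x p
        m, n, p = len(A), len(B), len(B[0])
        assert all(len(row) == n for row in A), "A must be conformable with B"
        return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(p)]
                for i in range(m)]

    A = [[1, 2, 3], [4, 5, 6]]          # 2 x 3
    B = [[1, 2], [3, 4], [5, 6]]        # 3 x 2
    print(matmul(A, B))                 # [[22, 28], [49, 64]], a 2 x 2 matrix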
Example 1.1.6
Let
    A = [1 2; −2 1].
Then
    A² = [1 2; −2 1][1 2; −2 1] = [−3 4; −4 −3].
Example 1.1.7
Let
    I3 = [1 0 0; 0 1 0; 0 0 1].
Then we have
    I3² = [1 0 0; 0 1 0; 0 0 1][1 0 0; 0 1 0; 0 0 1] = [1 0 0; 0 1 0; 0 0 1] = I3
and therefore I3³ = I3² I3 = I3 I3 = I3. In general, for any
positive integer m, In^m = In for the identity matrix In.
Definition 1.1.6
The transpose of the matrix A = [aij], denoted by A^T, is the matrix
obtained by interchanging the rows and columns of A:
    (A^T)ji = (A)ij.
Example 1.1.8
Let the matrices A and B be given by
    A = [1 2 3; 4 5 2],  B = [1 −1; 1 −2; 2 −1].
Then the transposes of A and B are given by
    A^T = [1 4; 2 5; 3 2],  B^T = [1 1 2; −1 −2 −1].
Definition 1.1.7
The trace of a square matrix A = [aij] of order n, denoted by tr A, is
the sum of all diagonal entries of A:
    tr A = a11 + a22 + ··· + ann = ∑_{i=1}^{n} aii.
Example 1.1.9
The trace of the matrix
    A = [1 2 3; 4 5 2; 6 9 −2]
is given by
    tr A = 1 + 5 + (−2) = 4.
Many of the basic rules of arithmetic are valid for matrices. There are
some rules that are not valid for matrices.
Theorem 1.2.1
Let A, B and C be matrices and α and β be scalars. Then the fol-
lowing rules of matrix operations are valid:
(1) A + B = B + A (Commutative law for addition)
(2) A + (B +C) = (A + B) +C (Associative law for addition)
(3) A(BC) = (AB)C (Associative law for multiplication)
(4) A(B +C) = AB + AC (Left distributive law)
Therefore
    (AB)C = [∑_{j=1}^{n} aij bjk][ckl]
          = [∑_{k=1}^{p} (∑_{j=1}^{n} aij bjk) ckl]
          = [∑_{k=1}^{p} ∑_{j=1}^{n} aij bjk ckl]
          = [∑_{j=1}^{n} ∑_{k=1}^{p} aij bjk ckl]
          = [∑_{j=1}^{n} aij (∑_{k=1}^{p} bjk ckl)]
          = A(BC).
(4) To prove A(B + C) = AB + AC, let A = [aij], B = [bjk]
and C = [cjk] be matrices of order m × n, n × p and n × p
respectively. Since
    B + C = [bjk] + [cjk] = [bjk + cjk],
    AB = [∑_{j=1}^{n} aij bjk],
    AC = [∑_{j=1}^{n} aij cjk],
we have
    A(B + C) = [aij][bjk + cjk]
             = [∑_{j=1}^{n} aij (bjk + cjk)]
             = [∑_{j=1}^{n} (aij bjk + aij cjk)]
             = [∑_{j=1}^{n} aij bjk + ∑_{j=1}^{n} aij cjk]
             = [∑_{j=1}^{n} aij bjk] + [∑_{j=1}^{n} aij cjk]
             = AB + AC.
Example 1.2.1
Let A, B and C be the matrices given by
    A = [1 2; 2 3],  B = [1 −1; 3 2],  C = [2 3 1; 3 −2 0].
Let us verify the associative law for multiplication. We have
    AB = [1 2; 2 3][1 −1; 3 2] = [7 3; 11 4]
and
    (AB)C = [7 3; 11 4][2 3 1; 3 −2 0] = [23 15 7; 34 25 11].
Similarly we have
    BC = [1 −1; 3 2][2 3 1; 3 −2 0] = [−1 5 1; 12 5 3]
and
    A(BC) = [1 2; 2 3][−1 5 1; 12 5 3] = [23 15 7; 34 25 11].
Thus we see that (AB)C = A(BC).
Example 1.2.2
Let A and B be the matrices given by
    A = [1 2; 3 4],  B = [0 1; 1 0].
Then
    AB = [1 2; 3 4][0 1; 1 0] = [2 1; 4 3]
and
    BA = [0 1; 1 0][1 2; 3 4] = [3 4; 1 2] ≠ AB.
This example shows that matrix multiplication is not commutative.
Example 1.2.3
Let A and B be two matrices given by
    A = [1 0; 0 0],  B = [0 0; 0 1].
Then
    AB = [1 0; 0 0][0 0; 0 1] = [0 0; 0 0] = 0.
This example shows that AB = 0 need not imply A = 0 or B = 0.
If
    C = [0 0; 0 2],
then AC = 0 and therefore AB = AC but B ≠ C. Thus the
cancellation law AB = AC ⇒ B = C does not always hold.
Theorem 1.2.2
If A and B are two square matrices that commute, then
    (AB)² = A²B².
Proof. Since AB = BA, we have
    (AB)² = (AB)(AB) = A((BA)B) = A((AB)B)
          = A(A(BB)) = A(AB²) = (AA)B² = A²B².
Theorem 1.2.3
If A and B are two square matrices that commute, then for
any positive integer n, we have
    AB^n = B^n A.
Theorem 1.2.4
If A and B are two square matrices that commute, then for
any positive integer n, we have
    (AB)^n = A^n B^n.
Theorem 1.2.5
For any p × n matrix A, AIn = A and IpA = A.
Proof. Let
    A = [a11 a12 ... a1n; a21 a22 ... a2n; ... ; ap1 ap2 ... apn].
Then
    AIn = [a11 a12 ... a1n; a21 a22 ... a2n; ... ; ap1 ap2 ... apn][1 0 ··· 0; 0 1 ··· 0; ... ; 0 0 ··· 1]
        = [a11 a12 ... a1n; a21 a22 ... a2n; ... ; ap1 ap2 ... apn]
        = A.
This proves the first part. The second part is similar.
Definition 1.3.1
If A is a square matrix of order n, a matrix B of order n is called the
inverse of A if
    AB = BA = In.
If A has an inverse, then A is called invertible or non-singular;
if A has no inverse, then A is called singular.
Note that the inverse is defined only for square matrices and if B is
the inverse of A, then A is the inverse of B.
Example 1.3.1
Consider the matrices
    A = [1 1; 1 −1]  and  B = [1/2 1/2; 1/2 −1/2].
We see that
    AB = [1 1; 1 −1][1/2 1/2; 1/2 −1/2] = [1 0; 0 1] = I2
and
    BA = [1/2 1/2; 1/2 −1/2][1 1; 1 −1] = [1 0; 0 1] = I2.
This shows that B is the inverse of A.
Example 1.3.2
Consider the matrices
    A = [1 1; 0 0]  and  B = [a b; c d].
We see that
    AB = [1 1; 0 0][a b; c d] = [a+c b+d; 0 0] ≠ I2
for any a, b, c, d. Therefore A has no inverse.
Theorem 1.3.1
The inverse of a square matrix, if it exists, is unique. Equivalently,
if B and C are inverses of a square matrix A, then B = C.
Since the inverse of an invertible matrix A is unique, we denote it by A^{-1}; thus
    AA^{-1} = A^{-1}A = I.
Example 1.3.3
Let ad − bc ̸= 0. Then the inverse of the matrix
    A = [a b; c d]
is given by
    A^{-1} = (1/(ad − bc)) [d −b; −c a].
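A minimal Python sketch of this 2 × 2 formula (the helper name inv2 is ours, for illustration) reproduces the inverse found in Example 1.3.1:

    def inv2(a, b, c, d):
        # inverse of [a b; c d] via the adjoint formula; requires ad - bc != 0
        det = a * d - b * c
        if det == 0:
            raise ValueError("ad - bc = 0: the matrix is singular")
        return [[d / det, -b / det], [-c / det, a / det]]

    print(inv2(1, 1, 1, -1))   # [[0.5, 0.5], [0.5, -0.5]], the matrix B of Example 1.3.1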
Theorem 1.3.2
If A and B are invertible matrices of the same order, then AB is
invertible and (AB)^{-1} = B^{-1}A^{-1}.
Proof. Since
    (AB)(B^{-1}A^{-1}) = A(B(B^{-1}A^{-1})) = A((BB^{-1})A^{-1}) = A(IA^{-1}) = AA^{-1} = I
and
    (B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}(AB)) = B^{-1}((A^{-1}A)B) = B^{-1}(IB) = B^{-1}B = I,
the matrix B^{-1}A^{-1} is the inverse of AB.
Corollary 1.3.1
If A1, A2, ..., An are invertible matrices of order n, then their product
A1A2···An is invertible and
    (A1A2···An)^{-1} = An^{-1} A_{n−1}^{-1} ··· A1^{-1}.
Proof. The proof is by induction on n. For n = k + 1,
    (A1···Ak Ak+1)^{-1} = ((A1···Ak)Ak+1)^{-1} = A_{k+1}^{-1}(A1···Ak)^{-1}
                        = A_{k+1}^{-1} Ak^{-1} ··· A1^{-1}.
The second equality above follows from Theorem 1.3.2 while the
third equality follows from the induction assumption. Thus, the
result is true for n = k + 1.
Theorem 1.3.3
Let A be invertible. Then we have the following:
(1) A^{-1} is invertible and (A^{-1})^{-1} = A.
(2) A^n is invertible and (A^n)^{-1} = (A^{-1})^n for any positive integer n.
(3) kA is invertible and (kA)^{-1} = (1/k)A^{-1} for any scalar k ≠ 0.
Proof.
(1) Since A is invertible, we have AA^{-1} = A^{-1}A = I, or
    A^{-1}A = AA^{-1} = I.
This shows that the inverse of A^{-1} is A.
(2) Since A and A^{-1} commute, we have
    A^n (A^{-1})^n = (AA^{-1})^n = I^n = I
and similarly (A^{-1})^n A^n = I.
(3) Since k ≠ 0, we have
    (kA)((1/k)A^{-1}) = (k · (1/k)) AA^{-1} = I
and similarly ((1/k)A^{-1})(kA) = I.
Theorem 1.3.5
Let A, B and C be matrices and a and b be scalars. Then the follow-
ing properties of transpose are valid:
(1) (A^T)^T = A
(2) (A + B)^T = A^T + B^T
(3) (A − B)^T = A^T − B^T
Theorem 1.3.6
If A is invertible, then AT is also invertible and
    (A^T)^{-1} = (A^{-1})^T.
Definition 1.4.1
A square matrix A = [aij] such that aij = 0 for i ≠ j is a diagonal
matrix. A diagonal matrix whose diagonal entries are d1, d2, ..., dn
is denoted by diag(d1, d2, ..., dn). If aij = 0 for i > j, the matrix is
upper triangular; if aij = 0 for i < j, it is lower triangular. A
diagonal matrix in which all the diagonal entries are equal is called
a scalar matrix. A scalar matrix with all diagonal entries equal to
one is the unit or identity matrix.
Example 1.4.1
The matrices
    [1 0; 0 3],  [1 0 0; 0 3 0; 0 0 4],  [1 0 0; 0 1 0; 0 0 1]
are examples of diagonal matrices. The matrices
    [1 2; 0 3],  [1 2 3; 0 3 −1; 0 0 4],  [1 0 1; 0 2 2; 0 0 4]
are examples of upper triangular matrices.
Example 1.4.2
The product of two diagonal matrices is again a diagonal matrix:
    diag(d1, d2, ..., dn) diag(e1, e2, ..., en) = diag(d1e1, d2e2, ..., dnen).
Also it is clear that the inverse of the diagonal matrix
    D = diag(d1, d2, ..., dn)
is the matrix
    D^{-1} = diag(1/d1, 1/d2, ..., 1/dn),
provided the diagonal entries are all nonzero.
Definition 1.4.2
The matrix A is symmetric if
    A^T = A
and skew-symmetric if
    A^T = −A.
Example 1.4.3
The matrices
    [1 2 3; 2 3 −1; 3 −1 4]  and  [1 2 3; 2 3 4; 3 4 5]
are both examples of symmetric matrices. The matrix
    A = [0 2 3; −2 0 −1; −3 1 0]
satisfies
    A^T = [0 −2 −3; 2 0 1; 3 −1 0] = −A
and therefore it is an example of a skew-symmetric matrix.
Similarly the matrix
    [0 2 −3; −2 0 −4; 3 4 0]
is skew-symmetric. Note that all the diagonal elements
of a skew-symmetric matrix are zero.
Theorem 1.4.1
If A and B are symmetric matrices of the same order and k is a scalar,
then the matrices A + B, A − B, A^T and kA are all symmetric matrices.
Theorem 1.4.2
If A and B are symmetric matrices of order n, then the matrix AB is
a symmetric matrix if and only if AB = BA.
Theorem 1.4.3
If A is an invertible symmetric matrix, then the matrix A^{-1} is also a
symmetric matrix.
Example 1.4.4
For any matrix A, the matrix AA^T is symmetric. For let B = AA^T.
Since the transpose of a product of two matrices is the product of
their transposes in the reverse order and the transpose of A^T is A,
we have
    B^T = (AA^T)^T = (A^T)^T A^T = AA^T = B.
Example 1.4.5
Let A be given by
    A = [1 −1 2; 0 2 −1].
Then we have
    AA^T = [1 −1 2; 0 2 −1][1 0; −1 2; 2 −1] = [6 −4; −4 5],
which is a symmetric matrix.
Theorem 1.4.4
If A is an invertible symmetric matrix, then the matrices AA^T and A^T A
are invertible symmetric matrices.
Theorem 1.4.5
Every square matrix is the sum of a symmetric matrix and a
skew-symmetric matrix.
Proof. Given a square matrix A, let B = (1/2)(A + A^T) and
C = (1/2)(A − A^T). Then
    B^T = (1/2)(A + A^T)^T = (1/2)(A^T + A) = B
and
    C^T = (1/2)(A − A^T)^T = (1/2)(A^T − A) = −(1/2)(A − A^T) = −C.
Thus B and C are symmetric and skew-symmetric respectively.
Also
    B + C = (1/2)(A + A^T) + (1/2)(A − A^T) = A.
This completes the proof.
Example 1.4.6
Let
    A = [1 2 3; 4 5 6; 7 8 9].
Then
    B = (1/2)(A + A^T) = (1/2)([1 2 3; 4 5 6; 7 8 9] + [1 4 7; 2 5 8; 3 6 9])
      = [1 3 5; 3 5 7; 5 7 9]
and
    C = (1/2)(A − A^T) = (1/2)([1 2 3; 4 5 6; 7 8 9] − [1 4 7; 2 5 8; 3 6 9])
      = [0 −1 −2; 1 0 −1; 2 1 0].
It is clear that B is symmetric and C is skew-symmetric. Also A = B + C.
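The decomposition is easy to reproduce numerically; the following numpy sketch (illustrative only) checks Example 1.4.6:

    import numpy as np

    A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
    B = (A + A.T) / 2      # symmetric part
    C = (A - A.T) / 2      # skew-symmetric part
    print(B)               # [[1 3 5], [3 5 7], [5 7 9]]
    print(C)               # [[0 -1 -2], [1 0 -1], [2 1 0]]
    # B is symmetric, C is skew-symmetric, and A = B + C
    print(np.allclose(B, B.T), np.allclose(C, -C.T), np.allclose(A, B + C))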
Theorem 1.4.6
Let A and B be square matrices of order n. Then we have
(1) tr(A + B) = tr A + tr B
(2) tr(A − B) = tr A − tr B
(3) tr(AB) = tr(BA)
(4) tr(kA) = k tr A
(5) tr(A^T) = tr A.
Proof of (3). We have
    tr(AB) = ∑_{i=1}^{n} ∑_{j=1}^{n} aij bji
and
    tr(BA) = ∑_{i=1}^{n} ∑_{j=1}^{n} bij aji = ∑_{i,j} aji bij = ∑_{i,j} aij bji = tr(AB).
Chapter 2
Determinants
Example 2.1.1
Consider the matrix
    A = [a11 a12 a13; a21 a22 a23; a31 a32 a33].
For example, the minors of a11, a12, a13 and a32 are given respectively by
    M11 = |a22 a23; a32 a33| = a22 a33 − a32 a23,
    M12 = |a21 a23; a31 a33| = a21 a33 − a31 a23,
    M13 = |a21 a22; a31 a32| = a21 a32 − a31 a22,
    M32 = |a11 a13; a21 a23| = a11 a23 − a21 a13.
Their cofactors are respectively given by
    C11 = (−1)^{1+1} M11 = +(a22 a33 − a32 a23),
    C12 = (−1)^{1+2} M12 = −(a21 a33 − a31 a23),
    C13 = (−1)^{1+3} M13 = +(a21 a32 − a31 a22),
    C32 = (−1)^{3+2} M32 = −(a11 a23 − a21 a13).
Example 2.1.2
Consider the matrix A given by
    A = [1 2 3; 4 5 6; 7 8 9].
The minors M11, M12 are given by
    M11 = |5 6; 8 9| = 45 − 48 = −3,
    M12 = |4 6; 7 9| = 36 − 42 = −6,
and therefore their cofactors are given by
    C11 = (−1)^{1+1} M11 = M11 = −3,
    C12 = (−1)^{1+2} M12 = −M12 = 6.
Example 2.1.3
The cofactor and minor of an element aij differ only in sign:
    Cij = ±Mij.
For example, the signs for the cofactors of a 3 × 3 and a 4 × 4 matrix
are given by the following patterns:
    [+ − +; − + −; + − +]  and  [+ − + −; − + − +; + − + −; − + − +].
Definition 2.1.1
The determinant of the square matrix A = [ai j ] of order n, denoted
by det(A) or |A|, is defined by the following cofactor expansion:
det(A) := a11C11 + a12C12 + · · · + a1nC1n .
Example 2.1.4
The determinant of the matrix
    A = [1 2 3; 2 0 1; 4 5 2]
is given by
    |A| = 1 |0 1; 5 2| − 2 |2 1; 4 2| + 3 |2 0; 4 5|
        = 1(0 − 5) − 2(4 − 4) + 3(10 − 0)
        = 25.
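The cofactor expansion of Definition 2.1.1 translates directly into a recursive algorithm (inefficient, on the order of n! steps, but faithful to the definition). The sketch below is our own illustration; the function name det is an assumption:

    def det(A):
        # determinant by cofactor expansion along the first row
        n = len(A)
        if n == 1:
            return A[0][0]
        total = 0
        for j in range(n):
            # minor M_{1,j+1}: delete row 1 and column j+1
            minor = [row[:j] + row[j + 1:] for row in A[1:]]
            total += (-1) ** j * A[0][j] * det(minor)
        return total

    print(det([[1, 2, 3], [2, 0, 1], [4, 5, 2]]))   # 25, as in Example 2.1.4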
Example 2.1.5
The determinant of
    A = [a11 a12 a13; a21 a22 a23; a31 a32 a33]
is given by
    det(A) = a11 C11 + a12 C12 + a13 C13
           = a11 |a22 a23; a32 a33| − a12 |a21 a23; a31 a33| + a13 |a21 a22; a31 a32|
           = a11(a22 a33 − a32 a23) − a12(a21 a33 − a31 a23) + a13(a21 a32 − a31 a22).
Also we get
    det(A) = a11 a22 a33 + a12 a31 a23 + a13 a21 a32
             − a11 a32 a23 − a12 a21 a33 − a13 a31 a22
           = a11(a22 a33 − a32 a23) − a21(a12 a33 − a13 a32) + a31(a12 a23 − a13 a22)
           = a11 |a22 a23; a32 a33| − a21 |a12 a13; a32 a33| + a31 |a12 a13; a22 a23|
           = a11 C11 + a21 C21 + a31 C31.
This is the cofactor expansion of the determinant using the first
column. It can be shown that the determinant can be found by using
the cofactor expansion along any row or column.
Theorem 2.1.1
The determinant of a matrix A = [aij] of order n can be computed as
the sum of the products of the entries in a row or column with their
corresponding cofactors:
    det(A) = ai1 Ci1 + ai2 Ci2 + ··· + ain Cin
or
    det(A) = a1j C1j + a2j C2j + ··· + anj Cnj.
(The first one is the cofactor expansion using the ith row and the
second one is the cofactor expansion using the jth column.)
Example 2.1.6
Consider the matrix A given in Example 2.1.4:
    A = [1 2 3; 2 0 1; 4 5 2].
The determinant of A obtained by the cofactor expansion using the
second column is
    |A| = −2 |2 1; 4 2| + 0 |1 3; 4 2| − 5 |1 3; 2 1|
        = −2(4 − 4) + 0 − 5(1 − 6)
        = 25.
If some row or column of a matrix has only one nonzero entry, it is
best to use that row or column for the cofactor expansion. For example,
if the only nonzero entry of the third column of a matrix A of order 3
is the (2, 3) entry 2, then the cofactor expansion using the third column
gives det(A) = (−1)^{2+3} · 2 · M23.
Example 2.1.8
If two rows of a determinant |A| are identical, then |A| = 0.
Example 2.1.9
Consider the determinant
    |A| = |a11 a12 a13; a21 a22 a23; a31 a32 a33|.
Consider
    a11 C21 + a12 C22 + a13 C23.
This is obtained by multiplying the first row entries with the
cofactors of the corresponding entries in the second row. This can
be viewed as the cofactor expansion, along the second row, of the
determinant obtained from |A| by replacing its second row with its
first row; since that determinant has two identical rows, it is zero.
Thus
    a11 C21 + a12 C22 + a13 C23 = 0.
Definition 2.1.2
For a square matrix A = [aij] of order n, the matrix
    [C11 C12 ··· C1n; C21 C22 ··· C2n; ... ; Cn1 Cn2 ··· Cnn]
is called the matrix of cofactors. The transpose of this matrix is
called the adjoint of A and is denoted by Adj(A).
Example 2.1.10
Consider the matrix
    A = [1 2 3; 2 3 1; 3 2 1].
The matrix of cofactors is given by
    [1 1 −5; 4 −8 4; −7 5 −1]
and therefore
    Adj(A) = [1 4 −7; 1 −8 5; −5 4 −1].
Theorem 2.1.2
For any square matrix A of order n,
    A Adj(A) = Adj(A) A = |A| I,
where I is the identity matrix of order n. For a nonsingular matrix
A, we have
    A^{-1} = (1/|A|) Adj(A).
Since
    ai1 Ck1 + ai2 Ck2 + ··· + ain Ckn = ∑_{j=1}^{n} aij Ckj = { det A if i = k; 0 if i ≠ k }
(the cofactor expansion along the kth row when i = k, and an expansion
involving two identical rows when i ≠ k), we see that
    A Adj(A) = [det(A) 0 ··· 0; 0 det(A) ··· 0; ... ; 0 0 ··· det(A)].
This proves that A Adj(A) = det(A)I. Similarly, we can show that
Adj(A)A = det(A)I. Thus, for a nonsingular A, we have
    (1/det(A)) A Adj(A) = (1/det(A)) Adj(A) A = I
and therefore
    A^{-1} = (1/det(A)) Adj(A).
Example 2.1.11
Let A be given by
    A = [1 1 −1; 1 −1 1; −1 1 1].
The determinant of A is given by
    det(A) = 1(−1 − 1) − 1(1 + 1) − 1(1 − 1) = −4.
The matrix of cofactors is given by
    [−2 −2 0; −2 0 −2; 0 −2 −2],
and therefore
    A^{-1} = (1/det(A)) Adj(A) = −(1/4)[−2 −2 0; −2 0 −2; 0 −2 −2]
           = [1/2 1/2 0; 1/2 0 1/2; 0 1/2 1/2].
Theorem 2.1.3
A matrix A is invertible if and only if det(A) ̸= 0.
Theorem 2.1.4
The determinant of a triangular matrix (upper triangular, lower
triangular, or diagonal) is the product of its diagonal entries,
and therefore such a matrix is invertible if and only if all its
diagonal entries are nonzero.
Example 2.1.12
The inverse of
    A = [λ1 0 0; 0 λ2 0; 0 0 λ3]
exists if and only if λ1λ2λ3 ≠ 0. In this case, the inverse is given by
    A^{-1} = [1/λ1 0 0; 0 1/λ2 0; 0 0 1/λ3].
Example 2.1.13
The inverse of the lower triangular matrix A given by
    A = [1 0 0 0; 2 1 0 0; 3 2 1 0; 4 3 2 1]
is given by
    A^{-1} = [1 0 0 0; −2 1 0 0; 1 −2 1 0; 0 1 −2 1].
The inverse is also a lower triangular matrix. This is true in general,
as shown in the next theorem.
Consider the upper triangular matrix A:
    A = [a11 a12 a13 a14 a15; 0 a22 a23 a24 a25; 0 0 a33 a34 a35; 0 0 0 a44 a45; 0 0 0 0 a55].
The minor M24 is the determinant of an upper triangular matrix having
a zero on its main diagonal, and therefore M24 = 0. In this example,
we can verify that Mij = 0 for all i < j.
Theorem 2.1.5
The inverse (if it exists) of an upper (lower) triangular matrix is again
upper (lower) triangular. The inverse (if it exists) of a diagonal matrix
is again diagonal.
Proof. We first prove the result for an upper triangular matrix. Let
A = [aij] be an upper triangular matrix. Let i < j and let Mij be the
minor of aij. As in the example above, Mij = 0, and therefore Cij = 0,
for i < j. Hence Adj(A) = [Cij]^T is upper triangular, and so is
A^{-1} = (1/det(A)) Adj(A). The lower triangular and diagonal cases
are similar.
we have
    x = A^{-1} b = (1/det(A)) Adj(A) b.
Now let A = [aij], Adj(A) = [Cij]^T and b = [b1 b2 ··· bn]^T. Then
    x = (1/det(A)) Adj(A) b
      = (1/det(A)) [C11 C21 ··· Cn1; C12 C22 ··· Cn2; ... ; C1n C2n ··· Cnn][b1; b2; ... ; bn]
      = (1/det(A)) [b1C11 + b2C21 + ··· + bnCn1; b1C12 + b2C22 + ··· + bnCn2; ... ; b1C1n + b2C2n + ··· + bnCnn].
Hence
    xi = (b1C1i + b2C2i + ··· + bnCni) / det(A) = det(Ai) / det(A),
where Ai is the matrix obtained from A by replacing its ith column
with the column vector b; the numerator b1C1i + ··· + bnCni is
precisely the cofactor expansion of det(Ai) along its ith column.
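The formula xi = det(Ai)/det(Ai) with the ith column replaced by b is Cramer's rule. A short numpy sketch (ours, for illustration; it assumes det(A) ≠ 0 and uses the system of Example 2.2.2 as a test):

    import numpy as np

    def cramer(A, b):
        # solve Ax = b by x_i = det(A_i) / det(A), where A_i is A with
        # its ith column replaced by b
        A, b = np.asarray(A, dtype=float), np.asarray(b, dtype=float)
        d = np.linalg.det(A)
        if abs(d) < 1e-12:
            raise ValueError("det(A) = 0: Cramer's rule does not apply")
        x = np.empty(len(b))
        for i in range(len(b)):
            Ai = A.copy()
            Ai[:, i] = b
            x[i] = np.linalg.det(Ai) / d
        return x

    # x + y - z = 3, x - y + z = 1, -x + y + z = 2 (Example 2.2.2)
    print(cramer([[1, 1, -1], [1, -1, 1], [-1, 1, 1]], [3, 1, 2]))   # [2.  2.5 1.5]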
Example 2.2.1
Solve the system of equations
    2x + 3y − 4z = −3,
    7x − 8y + 9z = 10,
    −2x + z = 1.
Example 2.2.2
Solve the system of equations
    x + y − z = 3,
    x − y + z = 1,
    −x + y + z = 2.
Theorem 2.3.1
If a square matrix A has a row (or a column) of zeros, then det(A) =
0.
Example 2.3.1
The determinant of the matrix
    A = [1 2 3; 0 0 0; 4 5 6]
is zero.
Theorem 2.3.2
For any square matrix A,
det(A) = det(AT ).
Theorem 2.3.3
If B is the matrix obtained by multiplying a single row (or a column)
of a square matrix A by a constant λ , then
det(B) = λ det(A).
Theorem 2.3.4
If B is the matrix obtained by adding a constant multiple of a row (or
a column) to another row (or a column) of a square matrix A, then
det(B) = det(A).
Corollary 2.3.1
If a square matrix A has two rows (or two columns) which are pro-
portional, then det(A) = 0.
Proof. Suppose the ith row of A is k times the jth row. By adding
−k times the jth row to the ith row, the ith row becomes a row of
zeros without changing the determinant. Thus the determinant of A
is zero.
Theorem 2.3.5
If B is the matrix obtained by interchanging two rows (or two
columns) of a square matrix A, then
det(B) = − det(A).
Proof. For matrices of order 2, the result follows from
    |a11 a12; a21 a22| = −|a21 a22; a11 a12|.
Assume that the result is true for all matrices of order n − 1, where
n ≥ 3. Let B be obtained by interchanging the ith and jth rows
of A, and let the cofactor matrices of A and B be [Cij] and [C′ij]
respectively. Since n ≥ 3, there is a k with 1 ≤ k ≤ n and k ≠ i, j.
Note that C′kj is obtained from Ckj by interchanging the same rows
used to get B from A. Since the cofactors are determinants of matrices
of order n − 1, the induction hypothesis gives
    C′kj = −Ckj.
Using the cofactor expansion of det(B) along this kth row, we get
    det(B) = ak1 C′k1 + ··· + akn C′kn
           = −[ak1 Ck1 + ··· + akn Ckn]
           = −det(A).
Thus the result is true for matrices of order n. By the principle of
mathematical induction, the result follows.
Example 2.3.2
Consider the determinant of a matrix A:
    det(A) = |1 2 −1 3; 1 1 2 0; −1 4 2 1; −1 2 3 1|.
By taking −1 as a common factor from the second row, we get
    det(A) = −|1 2 −1 3; −1 −1 −2 0; −1 4 2 1; −1 2 3 1|.
Example 2.3.3
Consider the matrix
    A = [1 a b+c; 1 b a+c; 1 c a+b].
The determinant is given by
    det(A) = |1 a b+c; 1 b a+c; 1 c a+b|.
By adding the second column to the third, we get
    det(A) = |1 a a+b+c; 1 b a+b+c; 1 c a+b+c|.
By taking the common factor a + b + c from the last column and
using the fact that the determinant of a matrix having two identical
columns is zero, we get
    det(A) = (a + b + c) |1 a 1; 1 b 1; 1 c 1| = 0.
Example 2.3.5
Consider the determinant of order n
    det(A) = |a b b ··· b; b a b ··· b; b b a ··· b; ... ; b b b ··· a|.
If a = b, then all the rows are equal and therefore the determinant is
zero. Hence assume that a ≠ b. Add all the rows (except the first)
to the first row and take the common factor (n − 1)b + a from the
first row, to get
    det(A) = [(n − 1)b + a] |1 1 1 ··· 1; b a b ··· b; b b a ··· b; ... ; b b b ··· a|.
Subtract the first column from all the remaining columns to get
    det(A) = [(n − 1)b + a] |1 0 0 ··· 0; b a−b 0 ··· 0; b 0 a−b ··· 0; ... ; b 0 0 ··· a−b|
           = [(n − 1)b + a](a − b)^{n−1}.
The last equality follows since the determinant of a triangular matrix
is the product of its diagonal entries.
Theorem 2.3.6
For any square matrix A of order n, we have
    det(λA) = λ^n det(A).
Example 2.3.6
Consider the identity matrix I3. Then det(I3) = 1 and therefore
    det(λI3) = λ³.
Further
    det(I3 + I3) = det(2I3) = 2³ = 8
and
    det(I3) + det(I3) = 2,
and therefore
    det(I3 + I3) ≠ det(I3) + det(I3).
This shows that
    det(A + B) ≠ det(A) + det(B)
in general.
Theorem 2.3.7
Let A, B and C be square matrices which differ only in the ith row,
the ith row of C being the sum of the ith rows of A and B. Then
    det(C) = det(A) + det(B).
Theorem 2.3.9
Let A and B be square matrices. If AB is invertible, then both A and
B are invertible.
Example 2.3.7
Let E be an elementary matrix of order n. If E is obtained from In
by multiplying a row by k, then det(E) = k. For example, if E is
obtained by multiplying the second row by k, then
    E = [1 0 0 ··· 0; 0 k 0 ··· 0; 0 0 1 ··· 0; ... ; 0 0 0 ··· 1].
Since the determinant of a diagonal matrix is the product of its
diagonal entries, it follows that det(E) = k.
If E is obtained by interchanging two rows of In, then
    det(E) = −det(I) = −1.
If E is obtained by adding a multiple of one row of In to another,
then det(E) = 1. For example, if E is obtained by adding k times
the first row to the second row, then E is a triangular matrix with
all diagonal entries equal to 1, and therefore det(E) = 1.
Theorem 2.3.10
Let A be a square matrix of order n and E an elementary matrix of
order n. Then
det(EA) = det(E) det(A).
Theorem 2.3.11
A square matrix A is invertible if and only if det(A) ̸= 0.
Theorem 2.3.12
If A and B are two matrices of order n, then
det(AB) = det(A) det(B).
Example 2.3.8
Consider the matrices
    A = [1 0 2; 2 1 0; 0 1 2]  and  B = [2 1 0; 0 2 1; 1 2 1].
Clearly
    AB = [4 5 2; 4 4 1; 2 6 3].
Also
    det(A) = 1(2 − 0) − 0 + 2(2 − 0) = 6,
    det(B) = 2(2 − 2) − 1(0 − 1) + 0 = 1,
    det(AB) = 4(12 − 6) − 5(12 − 2) + 2(24 − 8) = 6.
Therefore we have det(A) det(B) = 6 = det(AB).
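The same check can be run numerically; this numpy snippet (illustrative only) uses the matrices of Example 2.3.8:

    import numpy as np

    A = np.array([[1, 0, 2], [2, 1, 0], [0, 1, 2]], dtype=float)
    B = np.array([[2, 1, 0], [0, 2, 1], [1, 2, 1]], dtype=float)
    print(round(np.linalg.det(A)), round(np.linalg.det(B)))   # 6 1
    print(round(np.linalg.det(A @ B)))                        # 6 = det(A) * det(B)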
Theorem 2.3.13
If the square matrix A is invertible, then
    det(A^{-1}) = 1/det(A).
Chapter 3
System of Linear Equations
Example 3.1.1
Consider the linear equation in two variables x and y given by
2x − y = 1.
Let x = t . Then y = 2x − 1 = 2t − 1. Thus the general
solution of the equation 2x − y = 1 is given by
x = t, y = 2t − 1.
The variable t is called a parameter.
Example 3.1.2
The solution of the equation
x1 + x2 − 2x3 = 4
is obtained by taking x2 = t, x3 = s and solving for x1:
    x1 = 4 − x2 + 2x3 = 4 − t + 2s.
Example 3.1.3
Find the solution set of the equation 2x + 4y = 11 in integers.
If x and y are integers, then 2x + 4y is an even integer
and it cannot be equal to the odd integer 11. Hence the
equation has no solution in integers.
Then
    Ax = [a11 a12 ... a1n; a21 a22 ... a2n; ... ; am1 am2 ... amn][x1; x2; ... ; xn]
       = [a11x1 + a12x2 + ··· + a1nxn; a21x1 + a22x2 + ··· + a2nxn; ... ; am1x1 + am2x2 + ··· + amnxn]
       = b.
The matrix A is called the matrix of the linear system (3.1.2). The linear
system (3.1.2) can be written as Ax = b.
Definition 3.1.2
The system of linear equations (3.1.2) is called consistent if it has
a solution and it is inconsistent if it has no solution.
Let a1, a2, ..., an be column vectors. The sum x1a1 + x2a2 + ··· + xnan
is called a linear combination of a1, a2, ..., an.
Theorem 3.1.1
The linear system Ax = b is consistent if and only if the vector b is
a linear combination of the column vectors of A.
Proof. We have
    Ax = [a11x1 + a12x2 + ··· + a1nxn; ... ; am1x1 + am2x2 + ··· + amnxn]
       = [a11x1; a21x1; ... ; am1x1] + [a12x2; a22x2; ... ; am2x2] + ··· + [a1nxn; a2nxn; ... ; amnxn]
       = x1 [a11; a21; ... ; am1] + x2 [a12; a22; ... ; am2] + ··· + xn [a1n; a2n; ... ; amn]
       = x1a1 + x2a2 + ··· + xnan.
Therefore, Ax = b is equivalent to x1a1 + x2a2 + ··· + xnan = b
and the result follows from this equation.
Definition 3.1.3
The system of linear equations (3.1.2) of m equations in n variables
is
(1) square if the number of equations is the same as the number
of variables: m = n;
(2) underdetermined if there are fewer equations than the
variables: m < n;
(3) overdetermined if there are more equations than the vari-
ables: m > n.
Example 3.1.4
Consider the system of linear equations of two equations in two
variables x and y given by
x + y = 2, 2x + y = 3.
Then x = 1, y = 1 is a solution of the system of linear
equations and the system of equations is consistent.
Example 3.1.5
Consider the system of linear equations of two equations in two
variables x and y given by
    x + y = 1, 2x + 2y = 3.
Subtracting two times the first equation from the second gives
0 = 1, which is impossible. Hence the system is inconsistent.
Example 3.1.6
If the system of linear equations is given by
x + y = 1, 2x + 2y = 2,
then it is equivalent to the linear system
x + y = 1, x + y = 1.
In this case, it reduces to the single equation x + y = 1, and
x = t, y = 1 − t is a solution of the given linear system for any
real number t. In this case, there are infinitely many solutions.
Consider a linear system of two equations in the variables x and y
given by
l1 : a1 x + b1 y = c1
l2 : a2 x + b2 y = c2 .
The solutions of this system of linear equations are the points of
intersection of the lines l1 and l2. There are three possibilities
(see Figure 3.1.1):
(1) The two lines are parallel but not the same. In this case, there
is no solution to the given linear system.
(2) The two lines intersect at a point. In this case, there is exactly
one solution.
(3) The two lines are same. In this case, all the points on the line
are solutions to the given system and therefore there are infin-
itely many solutions.
Fig. 3.1.1: Each linear equation represents a straight line. Linear systems of 2 equations
in 2 variables with no solution, unique solution, and infinitely many solutions.
Definition 3.1.4
A system of linear equations Ax = b of n equations in n variables
is in strict triangular form if the matrix A is an upper triangular
matrix with non-zero diagonal entries. In other words, a system is
in strict triangular form if, in the kth equation, the coefficients of
the first k − 1 variables are all zero and the coefficient of xk is
nonzero (k = 1, 2, ..., n).
Definition 3.1.5
Two systems of linear equations are called equivalent if they have
the same solution set.
A system of equations obtained by applying one or more of the
following elementary operations is equivalent to the given system:
(1) interchanging two equations, (2) multiplying an equation by a
nonzero constant, and (3) adding a constant multiple of one equation
to another. For example, the system
    x + y = 2,  y + z = 2,  x + z = 2
reduces by such operations to
    x + y = 2,  y + z = 2,  2z = 2.
Example 3.2.1
The matrices
    [1 ∗ ∗; 0 0 1; 0 0 0],  [1 ∗ ∗; 0 1 ∗; 0 0 1],  [0 1 ∗ ∗ ∗; 0 0 1 ∗ ∗; 0 0 0 0 1; 0 0 0 0 0; 0 0 0 0 0]
are in row-echelon form. In these examples, ∗ can be any
number.
Example 3.2.2
The matrices
    [1 ∗ 0 ∗; 0 0 1 ∗; 0 0 0 0],  [1 ∗ ∗ 0 ∗; 0 0 0 1 ∗; 0 0 0 0 0],  [0 1 0 ∗ 0; 0 0 1 ∗ 0; 0 0 0 0 1; 0 0 0 0 0; 0 0 0 0 0]
are in reduced row-echelon form. In these examples, ∗
can be any number.
Example 3.3.1
Solve the system of equations
− 2x3 + 7x5 = 12
2x1 + 4x2 − 10x3 + 6x4 + 12x5 = 28
2x1 + 4x2 − 5x3 + 6x4 − 5x5 = −1.
    [1 2 0 3 0 | 7; 0 0 1 0 0 | 1; 0 0 0 0 1 | 2].
The last matrix is in reduced row-echelon form.
The linear system corresponding to the augmented
matrix above is
x1 + 2x2 + 3x4 =7
x3 =1
x5 = 2.
Hence the solution set is given by
x1 = 7 − 2r − 3s,
x2 = r,
x3 = 1,
x4 = s,
x5 = 2.
This method of obtaining the solution by reducing the
augmented matrix of the given linear system to reduced
row-echelon form is called the Gauss-Jordan elimina-
tion method.
The linear system corresponding to the augmented
matrix found in Step 5 is
    x1 + 2x2 − 5x3 + 3x4 + 6x5 = 14
    x3 − (7/2)x5 = −6
    x5 = 2.
From the last equation we have x5 = 2 and using it in the
middle equation we get
    x3 = −6 + (7/2)x5 = −6 + 7 = 1.
Using these values in the first equation we get
x1 + 2x2 + 3x4 = 14 + 5 − 12 = 7.
In this way also, we see that the solution set is given by
x1 = 7 − 2r − 3s,
x2 = r,
x3 = 1,
x4 = s,
x5 = 2.
This method of obtaining the solution by reducing the
augmented matrix of the given linear system to row-
echelon form is called the Gauss elimination method.
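For a square system with a unique solution, the two phases of the Gauss elimination method, reduction to row-echelon form and back substitution, can be sketched in a few lines of Python. This is a simplified illustration of the steps described above (it swaps rows only to avoid a zero pivot and does not handle free variables):

    def gauss_solve(A, b):
        # Gauss elimination with back substitution for an n x n system Ax = b
        n = len(A)
        M = [row[:] + [bi] for row, bi in zip(A, b)]     # augmented matrix [A | b]
        for k in range(n):
            # bring a nonzero entry to the pivot position
            pivot = next(i for i in range(k, n) if M[i][k] != 0)
            M[k], M[pivot] = M[pivot], M[k]
            M[k] = [x / M[k][k] for x in M[k]]           # introduce a leading 1
            for i in range(k + 1, n):                    # zeros below the leading 1
                M[i] = [xi - M[i][k] * xk for xi, xk in zip(M[i], M[k])]
        x = [0.0] * n
        for k in range(n - 1, -1, -1):                   # back substitution
            x[k] = M[k][n] - sum(M[k][j] * x[j] for j in range(k + 1, n))
        return x

    # x + y - z = 3, x - y + z = 1, -x + y + z = 2 (Example 2.2.2)
    print(gauss_solve([[1, 1, -1], [1, -1, 1], [-1, 1, 1]], [3, 1, 2]))   # [2.0, 2.5, 1.5]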
Example 3.3.2
Solve the system of equations
x3 + x4 + x5 = 3
x1 + x2 + x3 − x4 − x5 = 1
2x1 − x2 − x4 + 2x5 = 2.
The augmented matrix of the given system of linear
equations is given by
    [0 0 1 1 1 | 3; 1 1 1 −1 −1 | 1; 2 −1 0 −1 2 | 2].
Step 1. Locate the leftmost column that does not con-
sists of entirely of zero. The first column has this property
in this case.
Step 2. Interchange the top row with another row,
if necessary, to bring a nonzero entry to the top of the
column found in Step 1.
In this case, exchange the first and second rows in the
preceding matrix.
    [1 1 1 −1 −1 | 1; 0 0 1 1 1 | 3; 2 −1 0 −1 2 | 2].
Step 3. If the entry that is now at the top of the col-
umn found in Step 1 is a, multiply the first row by 1/a in
order to introduce a leading 1.
In this case, the entry at the top of the column is 1.
Step 4. Add a suitable multiples of the top row to the
rows below so that all entries below the leading 1 become
zeros.
    x2 = (1/3)[3r + 6s − 6] = r + 2s − 2.
Using the values of x2 , x3 , x4 and x5 in the first equation,
we get
x1 + (r + 2s − 2) − 2r − 2s = −2
or
x1 − r − 2 = −2 or x1 = r.
In this way also, we see that the solution set is given by
x1 = r,
x2 = r + 2s − 2,
x3 = 3 − r − s,
x4 = r,
x5 = s.
This solution is obtained by Gauss elimination method.
    [1 1 −3 0 | 0; 0 1 −1 1 | 0; 0 0 1 1 | 0].
Thus the matrix above is in row-echelon form.
Step 6. Beginning with the last nonzero row and
working upward, add suitable multiples of each row to
the rows above to introduce zeros above the leading 1’s.
Add the third row to the second row and 3 times the third
row to the first row in order to introduce zeros above the
leading 1 in the last row.
    [1 1 0 3 | 0; 0 1 0 2 | 0; 0 0 1 1 | 0].
Now the entry above the leading 1 in the second row
should be made zero. Add -1 times the second row to the
first row.
    [1 0 0 1 | 0; 0 1 0 2 | 0; 0 0 1 1 | 0].
The last matrix is in reduced row-echelon form.
The linear system corresponding to the augmented
matrix above is
x1 + x4 = 0
x2 + 2x4 = 0
x3 + x4 = 0
By setting x4 = −t , we see that the solution set is given
by
x1 = t, x2 = 2t, x3 = t, x4 = −t.
This solution is obtained by Gauss-Jordan elimina-
tion method.
The linear system corresponding to the augmented
matrix found in Step 5 is
x1 + x2 − 3x3 = 0
x2 − x3 + x4 = 0
x3 + x4 = 0
By taking x4 = −t, we get x3 = t. Also
x2 = x3 − x4 = t + t = 2t
and
x1 = 3x3 − x2 = 3t − 2t = t.
Therefore the solution set is again given by
x1 = t, x2 = 2t, x3 = t, x4 = −t.
This solution is obtained by Gauss elimination method.
Theorem 3.3.1
A homogeneous linear system with more unknowns than equations
has infinitely many solutions.
Theorem 3.3.2
If R is the reduced row-echelon form of an n × n matrix A, then
either R has a row of zeros or R is the identity matrix.
Definition 3.4.1
A matrix of order n is called an elementary matrix if it can be
obtained from the identity matrix of order n by a single elementary
row operation.
Example 3.4.1
The matrices
    [0 1; 1 0],  [1 0 0; 0 1 0; 0 0 1],  [1 1 0; 0 1 0; 0 0 1],  [1 0 0; 0 2 0; 0 0 1]
are all elementary matrices. The first matrix is obtained
from I2 by interchanging its rows. The second matrix,
the identity matrix of order 3, is obtained by multiplying
the first row of I3 by 1. The third matrix is obtained
from I3 by adding the second row to the first row, while the
last matrix is obtained by multiplying the second row of
I3 by 2.
Theorem 3.4.1
If E is an elementary matrix obtained by performing an elementary
row operation on Im, and A is any m × n matrix, then the matrix EA
is obtained from A by performing the same elementary row operation
on A.
Proof. Consider first the row operation that interchanges the first
and second rows. The elementary matrix obtained from Im by this
operation is given by:
    E = [0 1 0 ... 0; 1 0 0 ... 0; 0 0 1 ... 0; ... ; 0 0 0 ... 1].
Then, for any matrix A, we have
    EA = [0 1 0 ... 0; 1 0 0 ... 0; 0 0 1 ... 0; ... ; 0 0 0 ... 1][a11 a12 a13 ... a1n; a21 a22 a23 ... a2n; a31 a32 a33 ... a3n; ... ; am1 am2 am3 ... amn]
       = [a21 a22 a23 ... a2n; a11 a12 a13 ... a1n; a31 a32 a33 ... a3n; ... ; am1 am2 am3 ... amn].
Thus the matrix EA is obtained from A by exchanging the first
and second rows of A.
We now verify the theorem for the third row operation, the addition
of a multiple of one row to another. In this case, we take the
particular row operation R1 = R1 + αR3. The elementary
matrix obtained from Im by the row operation R1 = R1 + αR3
is given by:
    E = [1 0 α ... 0; 0 1 0 ... 0; 0 0 1 ... 0; ... ; 0 0 0 ... 1].
Theorem 3.4.2
If E is the elementary matrix obtained from I by using any one of
the row operation Ri ↔ R j , Ri ← α Ri or Ri ← Ri + α R j , then AE
is the matrix obtained from A by performing the column operations
Ci ↔ C j , Ci ← α Ci or C j ← C j + α Ci respectively.
Example 3.4.2
Let E be the elementary matrix obtained from I3 by adding the
second row to its first row. Let A be given by
    A = [a b c d; e f g h; i j k l],  E = [1 1 0; 0 1 0; 0 0 1].
The product EA is given by
    EA = [1 1 0; 0 1 0; 0 0 1][a b c d; e f g h; i j k l]
       = [a+e b+f c+g d+h; e f g h; i j k l].
Example 3.4.3
Let A be the matrix given by
    A = [1 2 −1; 2 3 4; 3 2 1].
The matrix B below is obtained by adding 2 times the first
row to the third row:
    B = [1 2 −1; 2 3 4; 5 6 −1].
By adding −2 times the first row to the third row in
the matrix B, we get the matrix A back.
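Theorem 3.4.1 and this example can be checked directly: multiplying A on the left by an elementary matrix performs the row operation, and multiplying by the inverse elementary matrix undoes it. An illustrative numpy sketch:

    import numpy as np

    A = np.array([[1, 2, -1], [2, 3, 4], [3, 2, 1]], dtype=float)
    E = np.eye(3)
    E[2, 0] = 2            # elementary matrix: add 2 times row 1 to row 3
    print(E @ A)           # the matrix B of Example 3.4.3
    Einv = np.eye(3)
    Einv[2, 0] = -2        # the inverse elementary row operation
    print(Einv @ (E @ A))  # recovers A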
Theorem 3.4.3
Every elementary matrix is invertible and its inverse is also an
elementary matrix.
Theorem 3.4.4
If A is a square matrix of order n, the following statements are all
equivalent.
(1) A is invertible.
(2) Ax = 0 has only the trivial solution.
(3) The reduced row-echelon form of A is In .
(4) A is a product of elementary matrices.
Example 3.4.5
The inverse of
    [1 1 2; 1 2 1; 2 1 1]
is given by
    (1/4) [−1 −1 3; −1 3 −1; 3 −1 −1].
Theorem 3.4.5
Every linear system Ax = b has no solution, exactly one solution,
or infinitely many solutions.
Proof. It is enough to show that if there is more than one solution,
then there are infinitely many solutions. So let x1 and x2
be two different solutions of Ax = b. Then we have Ax1 = b and
Ax2 = b. Then for any real number λ , we see that
A(x1 + λ (x2 − x1 )) = Ax1 + λ (Ax2 − Ax1 )
= b + λ (b − b)
= b.
Thus the vector x1 + λ (x2 −x1 ) is also a solution of Ax = b. Since
x1 ̸= x2 , it follows that the solutions x1 + λ (x2 − x1 ) are all dis-
tinct. This completes the proof.
Theorem 3.4.6
If A is invertible, then the linear system Ax = b has exactly one
solution.
Example 3.4.7
The condition for three points z, z1, z2 to lie on the same straight
line in the complex plane is
    |z z̄ 1; z1 z̄1 1; z2 z̄2 1| = 0.
The general form of the equation of a straight line is
Āz + Az̄ + c = 0, where A ≠ 0 and c is a real constant. If the
points z, z1, z2 are on the same straight line, then
    Āz + Az̄ + c = 0,  Āz1 + Az̄1 + c = 0,  Āz2 + Az̄2 + c = 0,
and this can be written in matrix form as
    [z z̄ 1; z1 z̄1 1; z2 z̄2 1][Ā; A; c] = [0; 0; 0].
If the matrix were non-singular, this would force Ā = A = 0 and
c = 0, contradicting A ≠ 0. So the matrix must be singular. This
proves the statement.
Theorem 3.5.1
If a square matrix A can be reduced to upper triangular matrix using
Gauss elimination without row exchange, then A = LU where L is
a lower triangular matrix with all diagonal entries equal to 1 and U
is an upper triangular matrix.
Example 3.5.1
Let
    A = [0 1; 1 0].
If
    L = [1 0; c 1],  U = [a b; 0 d],
then
    LU = [a b; ca cb + d].
For LU = A to hold we would need a = 0; but then all the entries
in the first column of LU are zero, and therefore LU cannot equal A.
Thus, the matrix A has no LU decomposition. Since the matrix A is
non-singular, we conclude that not every non-singular matrix has an
LU decomposition.
Example 3.5.2
If a ≠ 0, then
    [a b; c d] = [1 0; c/a 1][a b; 0 (ad − bc)/a].
Example 3.5.3
If A = [aij] is a non-singular square matrix of order n with a11 = 0,
then the matrix A cannot have an LU decomposition. For if L = [lij]
is a lower triangular matrix with lii = 1 and U = [ujk] is an upper
triangular matrix with A = LU, then a11 = l11 u11 = u11, and so
u11 = 0. Since the determinant of a triangular matrix is the product
of its diagonal elements, det(U) = u11 ··· unn = 0 and therefore the
matrix U is singular. Therefore det(A) = det(LU) = det(L) det(U) = 0
and hence A is singular, contradicting the assumption.
If A has an LU decomposition A = LU, then there are elementary
matrices E1, ..., En such that
    L = E1^{-1} ··· En^{-1},  U = En ··· E1 A.
Hence the LU decomposition can be obtained from A = IA by applying
a sequence of elementary row operations (together with the corresponding
inverse column operations) as below:
    A = IA
      = (IE1^{-1})(E1 A)
      = (IE1^{-1}E2^{-1})(E2 E1 A)
      ...
      = (IE1^{-1} ··· En^{-1})(En ··· E1 A)
      = LU.
This method is illustrated in the following examples.
Example 3.5.4
Let
    A = [1 1 0; 1 2 1; 2 1 1].
Write
    A = [1 0 0; 0 1 0; 0 0 1][1 1 0; 1 2 1; 2 1 1].
Reduce the second matrix by using the row operations R2 = R2 − R1
and R3 = R3 − 2R1, and the first matrix by the corresponding inverse
column operations C1 = C1 + C2 and C1 = C1 + 2C3, to get
    A = [1 0 0; 1 1 0; 2 0 1][1 1 0; 0 1 1; 0 −1 1].
Now apply the row operation R3 = R3 + R2 to the second matrix
and the inverse column operation C2 = C2 − C3 to the first matrix to
get
    A = [1 0 0; 1 1 0; 2 −1 1][1 1 0; 0 1 1; 0 0 2] = LU.
Example 3.5.5
Let
    A = [2 1 3; 4 3 7; 6 7 17].
Write
    A = [1 0 0; 0 1 0; 0 0 1][2 1 3; 4 3 7; 6 7 17].
Reduce the second matrix by using the row operations R2 = R2 − 2R1
and R3 = R3 − 3R1, and the first matrix by the corresponding
inverse column operations C1 = C1 + 2C2 and C1 = C1 + 3C3, to get
    A = [1 0 0; 2 1 0; 3 0 1][2 1 3; 0 1 1; 0 4 8].
Now apply the row operation R3 = R3 − 4R2 to the second matrix
and the inverse column operation C2 = C2 + 4C3 to the first matrix
to get
    A = [1 0 0; 2 1 0; 3 4 1][2 1 3; 0 1 1; 0 0 4] = LU.
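The elimination in these examples is a Doolittle-style LU factorization without row exchanges. The sketch below (our illustration, valid only under the no-zero-pivot assumption of Theorem 3.5.1) records each multiplier in L while reducing A to U:

    def lu(A):
        # A = LU with L unit lower triangular and U upper triangular,
        # assuming no zero pivot occurs (no row exchanges)
        n = len(A)
        U = [row[:] for row in A]
        L = [[float(i == j) for j in range(n)] for i in range(n)]
        for k in range(n):
            if U[k][k] == 0:
                raise ValueError("zero pivot: LU without row exchange fails")
            for i in range(k + 1, n):
                m = U[i][k] / U[k][k]
                L[i][k] = m                         # the multiplier goes into L
                U[i] = [uij - m * ukj for uij, ukj in zip(U[i], U[k])]
        return L, U

    L, U = lu([[2, 1, 3], [4, 3, 7], [6, 7, 17]])
    print(L)   # [[1, 0, 0], [2, 1, 0], [3, 4, 1]] (as floats), as in Example 3.5.5
    print(U)   # [[2, 1, 3], [0, 1, 1], [0, 0, 4]] (as floats)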
Example 3.5.6
Let
    A = [1 1 0 0; 1 2 1 1; 2 1 1 0; 1 0 1 1].
Write
    A = [1 0 0 0; 0 1 0 0; 0 0 1 0; 0 0 0 1][1 1 0 0; 1 2 1 1; 2 1 1 0; 1 0 1 1].
Reduce the second matrix by using the row operations R2 = R2 − R1,
R3 = R3 − 2R1, R4 = R4 − R1, and the first matrix by the corresponding
inverse column operations C1 = C1 + C2, C1 = C1 + 2C3, C1 = C1 + C4,
to get
    A = [1 0 0 0; 1 1 0 0; 2 0 1 0; 1 0 0 1][1 1 0 0; 0 1 1 1; 0 −1 1 0; 0 −1 1 1].
Now apply the row operations R3 = R3 + R2, R4 = R4 + R2 to the
second matrix and the inverse column operations C2 = C2 − C3,
C2 = C2 − C4 to the first matrix to get
    A = [1 0 0 0; 1 1 0 0; 2 −1 1 0; 1 −1 0 1][1 1 0 0; 0 1 1 1; 0 0 2 1; 0 0 2 2].
Finally apply the row operation R4 = R4 − R3 to the second matrix
and the inverse column operation C3 = C3 + C4 to the first matrix to
get
    A = [1 0 0 0; 1 1 0 0; 2 −1 1 0; 1 −1 1 1][1 1 0 0; 0 1 1 1; 0 0 2 1; 0 0 0 1] = LU.
Example 3.5.8
Solve: x1 + 2x2 + x3 = 3, 2x1 + x2 + 3x3 = 2, −x1 + x2 − x3 = 0. The
system of linear equations in matrix form is Ax = b where
    A = [1 2 1; 2 1 3; −1 1 −1],  x = [x1; x2; x3],  b = [3; 2; 0].
Write
    A = [1 0 0; 0 1 0; 0 0 1][1 2 1; 2 1 3; −1 1 −1].
Reduce the second matrix by using the row operations R2 = R2 − 2R1
and R3 = R3 + R1, and the first matrix by the corresponding
inverse column operations C1 = C1 + 2C2 and C1 = C1 − C3, to get
    A = [1 0 0; 2 1 0; −1 0 1][1 2 1; 0 −3 1; 0 3 0].
Chapter 4
Euclidean Space
An ordered pair of two objects x and y is the pair written as (x, y) with
the property that (x, y) = (y, x) if and only if x = y. An ordered pair
can be defined in different ways; for example,
    (a, b) := {{a}, {a, b}},
    (a, b) := {a, {a, b}}.
Either of these definitions shows that (x, y) = (y, x) if and only if
x = y. One can also extend the definition of ordered pair to ordered
triples and ordered n-tuples. Let n be a positive integer. A sequence
of n real numbers (a1, a2, a3, ..., an) is called an ordered n-tuple.
The set of all ordered n-tuples, denoted by Rn, is called the Euclidean
n-space. Clearly, ordered pairs are just the 2-tuples, and we call a
3-tuple an ordered triple. We call the elements of Rn points in Rn or
vectors in Rn.
Definition 4.1.1 Equality and coordinatewise operations
Let u = (u1, u2, u3, ..., un), v = (v1, v2, v3, ..., vn) be two vectors in
Rn. The two vectors u and v are equal if
    u1 = v1, u2 = v2, ..., un = vn.
The sum u + v and the difference u − v are defined by
    u + v := (u1 + v1, u2 + v2, ..., un + vn),
u − v := (u1 − v1 , u2 − v2 , . . . , un − vn )
and the scalar multiple λ u (where λ is a real number) is defined by
λ u := (λ u1 , λ u2 , . . . , λ un ).
The addition, subtraction and scalar multiple are respectively called
the coordinatewise addition, subtraction and scalar multiplication.
The zero vector, denoted by 0, is defined by
0 = (0, 0, . . . , 0).
When λ = −1, the scalar multiple −u given by
−u = (−u1 , −u2 , −u3 , . . . , −un )
is called the negative of u or the additive inverse of u. Note that u − v =
u + (−v) and u − u = 0.
Theorem 4.1.1
Let u, v and w be vectors in Rn and λ and µ be scalars. Then we
have the following
(1) u + v = v + u
(2) (u + v) + w = u + (v + w)
(3) u + 0 = 0 + u = u
(4) u + (−u) = (−u) + u = 0
(5) λ (µ u) = (λ µ )u
(6) λ (u + v) = λ u + λ v
(7) (λ + µ )u = λ u + µ u
(8) 1u = u
(1) is the commutative law for addition; (2) is the associative law for
addition; 0 is the identity element for addition; −u is the additive
inverse of u.
Proof. We prove only the commutative law for addition. Let
u = (u1 , u2 , u3 , . . . , un ), v = (v1 , v2 , v3 , . . . , vn ).
Then
u + v = (u1 , u2 , u3 , . . . , un ) + (v1 , v2 , v3 , . . . , vn )
         = (u1 + v1, u2 + v2, ..., un + vn)
         = (v1 + u1, v2 + u2, ..., vn + un)
         = v + u.
Example 4.1.1
Let u = (1, 3, −1), v = (4, −2, 5) be two vectors in R3 . Then
u + v = (1, 3, −1) + (4, −2, 5)
= (1 + 4, 3 + (−2), −1 + 5) = (5, 1, 4),
v + u = (4, −2, 5) + (1, 3, −1) = (5, 1, 4),
3u = 3(1, 3, −1) = (3, 9, −3),
3v = 3(4, −2, 5) = (12, −6, 15),
3u + 3v = (3, 9, −3) + (12, −6, 15) = (15, 3, 12),
3(u + v) = 3(5, 1, 4) = (15, 3, 12).
Clearly u + v = v + u, 3(u + v) = 3u + 3v. Also
u − v = (1, 3, −1) − (4, −2, 5) = (−3, 5, −6).
Example 4.1.2
If x + u = v, then by adding −u both sides of the equation, we get
(x + u) + (−u) = v + (−u).
By using associative law, we get
x + (u + (−u)) = v − u.
Thus we have
x+0 = v−u
or
x = v − u.
Definition 4.2.1
The Euclidean inner product of two vectors u = (u1, u2, u3, ..., un),
v = (v1, v2, v3, ..., vn) in Rn, denoted by u · v, is defined by
    u · v = u1v1 + u2v2 + ··· + unvn = ∑_{i=1}^{n} ui vi.
Example 4.2.1
Using the above theorem, we can compute the Euclidean inner
product in the same way we compute ordinary products. For example,
    (u + v) · (u − v) = u · (u − v) + v · (u − v)
                      = u·u − u·v + v·u − v·v
                      = u·u − v·v.
Definition 4.2.2
The Euclidean norm or Euclidean length of a vector
u = (u1, u2, u3, ..., un) in Rn, denoted by ∥u∥, is defined by
    ∥u∥ := √(u · u) = √(u1² + u2² + ··· + un²).
The distance between two vectors u, v in Rn, denoted by d(u, v), is
defined by
    d(u, v) := ∥u − v∥ = √((u1 − v1)² + (u2 − v2)² + ··· + (un − vn)²).
Example 4.2.2
For two vectors u, v in Rn, we have
    ∥u + v∥² = (u + v) · (u + v)
             = u · (u + v) + v · (u + v)
             = u·u + u·v + v·u + v·v
             = ∥u∥² + 2(u·v) + ∥v∥²
and
    ∥u − v∥² = (u − v) · (u − v)
             = u · (u − v) − v · (u − v)
             = u·u − u·v − v·u + v·v
             = ∥u∥² − 2(u·v) + ∥v∥².
Adding the two equalities, we get
    ∥u + v∥² + ∥u − v∥² = 2[∥u∥² + ∥v∥²].
Also we get
    ∥u + v∥² − ∥u − v∥² = 4u · v,
and therefore the Euclidean inner product can be expressed in terms
of the norms as
    u · v = (1/4)[∥u + v∥² − ∥u − v∥²].
Example 4.2.3
Let u = (1, 2, −1, 3) and v = (2, 0, 3, 1) be two vectors in R4. Then
    ∥u∥ = √(1² + 2² + (−1)² + 3²) = √15,
    ∥v∥ = √(2² + 0² + 3² + 1²) = √14,
    u · v = 1 × 2 + 2 × 0 + (−1) × 3 + 3 × 1 = 2.
Note that
    |u · v| = 2 ≤ √15 √14 = ∥u∥ ∥v∥.
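These computations are easy to reproduce; the following numpy sketch (illustrative only) verifies the values of Example 4.2.3 and the Cauchy-Schwarz inequality |u · v| ≤ ∥u∥ ∥v∥:

    import numpy as np

    u = np.array([1, 2, -1, 3], dtype=float)
    v = np.array([2, 0, 3, 1], dtype=float)
    print(np.dot(u, u), np.dot(v, v))   # 15.0 14.0, so ||u|| = sqrt(15), ||v|| = sqrt(14)
    print(np.dot(u, v))                 # 2.0
    # Cauchy-Schwarz: |u . v| <= ||u|| ||v||
    print(abs(np.dot(u, v)) <= np.linalg.norm(u) * np.linalg.norm(v))   # True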
Theorem 4.2.2
If u is a vector in Rn and λ is a scalar, then
(1) ∥u∥ ≥ 0, and ∥u∥ = 0 if and only if u = 0;
(2) ∥λu∥ = |λ| ∥u∥.
Example 4.2.4
By taking λ = −1 in ∥λ u∥ = |λ | ∥u∥, we obtain the following: For
any u ∈ Rn ,
∥ − u∥ = ∥u∥.
Example 4.2.5
When n = 2, the Cauchy-Schwarz inequality becomes
    |u1v1 + u2v2|² ≤ (u1² + u2²)(v1² + v2²).
By taking u1 = a, u2 = b, v1 = cos θ and v2 = sin θ, we get
    |a cos θ + b sin θ| ≤ √((a² + b²)(sin²θ + cos²θ)) = √(a² + b²),
or equivalently (a cos θ + b sin θ)² ≤ a² + b².
Example 4.2.6
By taking v1 = v2 = ··· = vn = 1 in the Cauchy-Schwarz inequality,
we have
    (u1 + u2 + ··· + un)² ≤ n(u1² + u2² + ··· + un²).
Example 4.2.7
Let u and v be two vectors in Rn. Then
    | ∥u∥ − ∥v∥ | ≤ ∥u − v∥.
Indeed,
    ∥u∥ = ∥(u − v) + v∥ ≤ ∥u − v∥ + ∥v∥
or
    ∥u∥ − ∥v∥ ≤ ∥u − v∥.
Similarly
    ∥v∥ = ∥(v − u) + u∥ ≤ ∥v − u∥ + ∥u∥ = ∥−(u − v)∥ + ∥u∥ = ∥u − v∥ + ∥u∥
or
    ∥v∥ − ∥u∥ ≤ ∥u − v∥, that is, −(∥u∥ − ∥v∥) ≤ ∥u − v∥.
Thus we have | ∥u∥ − ∥v∥ | ≤ ∥u − v∥.
Definition 4.2.3
Two vectors u and v in Rn are orthogonal if u · v = 0.
Example 4.2.8
Consider u = (1, 1, 1, 1, 1, 1) and v = (1, −1, 1, −1, 1, −1) in R6.
Then
    u · v = 1 − 1 + 1 − 1 + 1 − 1 = 0
and therefore u and v are orthogonal in R6. Note that ∥u∥ = √6 = ∥v∥
and
    ∥u + v∥ = ∥(2, 0, 2, 0, 2, 0)∥ = √(4 + 0 + 4 + 0 + 4 + 0) = √12,
and therefore
    ∥u + v∥² = 12 = 6 + 6 = ∥u∥² + ∥v∥².
Theorem 4.2.6
Two vectors u and v in Rn are orthogonal if and only if
    ∥u + v∥ = ∥u − v∥.
Proof. Since
    u · v = (1/4)[∥u + v∥² − ∥u − v∥²],
the vectors u and v in Rn are orthogonal if and only if
∥u + v∥² − ∥u − v∥² = 0.
Suppose m functions of n variables are given:
    w1 = f1(x1, x2, x3, ..., xn)
    w2 = f2(x1, x2, x3, ..., xn)
    ......
    wm = fm(x1, x2, x3, ..., xn).
Together they define a transformation f : Rn → Rm by
    f(x1, x2, x3, ..., xn) = (w1, w2, w3, ..., wm).
Example 4.3.2
Consider the functions
w1 = f1 (x1 , x2 , x3 ) = x1 + x1 x2 + x3
w2 = f2 (x1 , x2 , x3 ) = x2 + x3 .
Then we can define a function f : R3 → R2 by
f (x1 , x2 , x3 ) = (w1 , w2 ) = (x1 + x1 x2 + x3 , x2 + x3 ).
A transformation T : Rn → Rm given by
    T(x1, x2, x3, ..., xn) = (w1, w2, w3, ..., wm),
where each wi is a linear function
    wi = ai1 x1 + ai2 x2 + ··· + ain xn  (1 ≤ i ≤ m)
of x1, ..., xn, is called a linear transformation.
Theorem 4.3.1
A transformation T : Rn → Rm is a linear transformation if and only if
    T(λu + µv) = λT(u) + µT(v)
for all vectors u, v ∈ Rn and for all scalars λ, µ.
Definition 4.3.1
A linear transformation T : Rn → Rm is one-to-one if
x ̸= y ⇒ T (x) ̸= T (y)
for all x, y ∈ Rn or equivalently
T (x) = T (y) ⇒ x = y
for all x, y ∈ Rn .
Example 4.3.4
Let T (x) = Ax be a linear transformation. Then
T (x) = T (y) ⇒ Ax = Ay ⇒ x = y
provided A is invertible. Thus a linear transformation T is one-to-
one if its matrix A is invertible.
Example 4.3.5
Let Ti : R2 → R2 be given by Ti(x) = Ai x where
    A1 = [1 0; 0 1],   A2 = [0 −1; −1 0],  A3 = [−1 0; 0 −1],
    A4 = [0 1; −1 0],  A5 = [−1 0; 0 1],   A6 = [0 −1; −1 0],
    A7 = [1 0; 0 −1],  A8 = [0 1; 1 0],    A9 = [α 0; 0 α],
    A10 = [1 0; 0 0],  A11 = [0 0; 0 1],   A12 = [0 0; 0 0].
Chapter 5
Real Vector Spaces
Definition 5.1.1
A field (F, +, ·) is a nonempty set F together with two binary
operations
    + : F × F → F,  · : F × F → F,
called addition and multiplication respectively, satisfying the
following axioms for all x, y, z ∈ F:
(1) x + (y + z) = (x + y) + z, (associative law for addition)
(2) there is an element 0 ∈ F satisfying x + 0 = 0 + x = x,
(additive identity element)
(3) for every x ∈ F, there is an element −x ∈ F satisfying x +
(−x) = (−x) + x = 0, (existence of additive inverse)
(4) x + y = y + x, (commutative law/Abelian law for addi-
tion)
(5) x · (y · z) = (x · y) · z, (associative law for multiplication)
(6) there is an element 1 ∈ F (different from 0) satisfying x ·
1 = 1 · x = x, (multiplicative identity element)
(7) for every x ∈ F\{0}, there is an element x−1 ∈ F satisfying
x · x−1 = x−1 · x = 1, (existence of multiplicative inverse)
(8) x · y = y · x. (commutative law/Abelian law for multipli-
cation)
(9) x · (y + z) = x · y + x · z (distributive law)
Example 5.1.1
The set Z of all integers is not a field with the usual addition and
multiplication: the non-zero integers other than 1 and −1 have no
multiplicative inverse.
Example 5.1.2
The set R of all real numbers and the set C of all complex numbers
are fields with the usual addition and multiplication. The set Q of
all rational numbers and the set Q[i] of all Gaussian rational numbers
(the complex numbers p + iq where p and q are rational numbers) are
also fields with the usual addition and multiplication.
Example 5.1.3
A positive integer p with exactly two positive divisors, namely 1
and p itself, is called a prime number. Let p be a prime number. The
set Zp := {0, 1, 2, ..., p − 1} with addition and multiplication defined
by
    x ⊕ y = x + y (mod p),  x ⊙ y = xy (mod p)
is a field.
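A small Python sketch (ours, for illustration) shows the field operations of Zp; by Fermat's little theorem, x^{p−1} ≡ 1 (mod p) for x ≢ 0, so the multiplicative inverse of x is x^{p−2} (mod p):

    p = 7   # a prime

    def add(x, y):
        return (x + y) % p      # the operation (+) of Z_p

    def mul(x, y):
        return (x * y) % p      # the operation (.) of Z_p

    def inv(x):
        # multiplicative inverse via Fermat's little theorem
        assert x % p != 0, "0 has no multiplicative inverse"
        return pow(x, p - 2, p)

    print(add(5, 4))   # 2
    print(inv(3))      # 5
    print(mul(3, 5))   # 1, confirming 5 is the inverse of 3 in Z_7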
Definition 5.1.2
A vector space V over a field F is a nonempty set V, whose elements
are called vectors, together with two operations
    + : V × V → V,  · : F × V → V,
called addition and scalar multiplication respectively, satisfying the
following axioms for all u, v, w ∈ V and for all scalars λ, µ ∈ F:
(1) If u, v ∈ V, then u + v ∈ V.
(2) u + v = v + u
(3) (u + v) + w = u + (v + w)
(4) There is an element 0 ∈ V such that u + 0 = 0 + u = u.
(5) For every u ∈ V, there is an element −u ∈ V such that
u + (−u) = (−u) + u = 0.
(6) If u ∈ V and λ ∈ F, then λu ∈ V.
(7) λ(u + v) = λu + λv
(8) (λ + µ)u = λu + µu
(9) λ(µu) = (λµ)u
(10) 1u = u
Example 5.1.5
The set V1 = {(x, 1) : x ∈ R} with the coordinatewise addition and
scalar multiplication is not a vector space. For, if (x, 1) ∈ V1 and
(y, 1) ∈ V1, then (x, 1) + (y, 1) = (x + y, 2) ∉ V1.
However, the set V2 = {(x, 0) : x ∈ R} is a vector space. The
sum of (x, 0), (y, 0) ∈ V2 is (x + y, 0) and it is clearly in V2 . The
scalar product λ (x, 0) = (λ x, 0) is also in V2 . The additive identity
is (0, 0) and the inverse of (x, 0) is (−x, 0). Other properties are
easily verified.
Example 5.1.6
The set V = {(x, 1) : x ∈ R} with addition and scalar multiplication
defined by
    (x, 1) + (y, 1) = (x + y, 1),  λ(x, 1) = (λx, 1)
is a vector space. The vector (0, 1) is the additive identity; the
inverse of (x, 1) is (−x, 1). All the real vector space axioms are
satisfied.
We verify only: λ (u + v) = λ u + λ v where u = (x, 1), v = (y, 1).
Clearly
u + v = (x, 1) + (y, 1) = (x + y, 1)
and therefore
λ (u + v) = λ (x + y, 1)
= (λ (x + y), 1)
= (λ x + λ y, 1)
= (λ x, 1) + (λ y, 1)
= λ (x, 1) + λ (y, 1)
= λ u + λ v.
Example 5.1.7
The set M2 of all square matrices of order 2 with the usual matrix
addition and matrix scalar multiplication is a real vector space. The
vector space axioms are verified below.
Let
    u = [u11 u12; u21 u22],  v = [v11 v12; v21 v22].
Then
(1) The addition of u and v is given by
    u + v = [u11 u12; u21 u22] + [v11 v12; v21 v22] = [u11+v11 u12+v12; u21+v21 u22+v22].
Clearly u + v ∈ M2.
(2) We have earlier proved the commutative law for matrix addition.
We repeat the proof. By the definition of addition, we have
    u + v = [u11+v11 u12+v12; u21+v21 u22+v22] = [v11+u11 v12+u12; v21+u21 v22+u22] = v + u.
(3) The associative law for matrix addition is already proved.
(4) The zero matrix 0 = [0 0; 0 0] satisfies u + 0 = 0 + u = u for
all u ∈ M2.
(5) For u = [u11 u12; u21 u22] ∈ M2, the matrix
−u = [−u11 −u12; −u21 −u22] ∈ M2 satisfies u + (−u) = (−u) + u = 0.
(6) By the definition of matrix scalar multiplication, we have
    λu = λ[u11 u12; u21 u22] = [λu11 λu12; λu21 λu22] ∈ M2.
(7) By the definitions of matrix addition and matrix scalar
multiplication, we have
    λ(u + v) = λ[u11+v11 u12+v12; u21+v21 u22+v22]
             = [λ(u11+v11) λ(u12+v12); λ(u21+v21) λ(u22+v22)]
             = [λu11+λv11 λu12+λv12; λu21+λv21 λu22+λv22]
             = [λu11 λu12; λu21 λu22] + [λv11 λv12; λv21 λv22]
             = λu + λv.
(8) The proof of (λ + µ)u = λu + µu is similar.
(9) It is clear that
    λ(µu) = λ[µu11 µu12; µu21 µu22] = [λµu11 λµu12; λµu21 λµu22] = (λµ)[u11 u12; u21 u22] = (λµ)u.
(10) Clearly
    1u = 1[u11 u12; u21 u22] = [1u11 1u12; 1u21 1u22] = u.
Thus M2 is a vector space.
Example 5.1.8
The set Mm×n of all matrices (with real entries) of order m × n with
usual matrix addition, matrix scalar multiplication is a real vector
space.
Example 5.1.9
The set V of all points in any plane through the origin is a real
vector space with the usual addition and scalar multiplication. Any
plane through the origin has an equation ax + by + cz = 0, where at
least one of the constants a, b, c is non-zero. Suppose c ≠ 0. Then
the equation can be written as z = (−a/c)x + (−b/c)y = Ax + By. Thus
the set V is given by
    V = {(u1, u2, Au1 + Bu2) | u1, u2 ∈ R}.
(1) If u, v ∈ V, then
    u = (u1, u2, Au1 + Bu2),  v = (v1, v2, Av1 + Bv2)
and therefore
    u + v = (u1 + v1, u2 + v2, A(u1 + v1) + B(u2 + v2)) ∈ V.
(2) We omit the proofs of u + v = v + u and (u + v) + w = u + (v + w).
(3) The element 0 = (0, 0, 0) ∈ V satisfies u + 0 = 0 + u = u for any
u ∈ V.
(4) For each u = (u1, u2, Au1 + Bu2) ∈ V, the element
−u = (−u1, −u2, −Au1 − Bu2) ∈ V satisfies u + (−u) = (−u) + u = 0.
(5) If u, v ∈ V and λ, µ are scalars, then it is easy to prove that
λu ∈ V and that λ(u + v) = λu + λv, (λ + µ)u = λu + µu,
λ(µu) = (λµ)u and 1u = u.
Therefore V is a vector space.
Example 5.1.10
Let F[a, b] be the set of all functions f : [a, b] → R. If f, g ∈ F[a, b]
and λ is any scalar in R, let f + g and λf be the functions defined
by
( f + g)(x) = f (x) + g(x), (λ f )(x) = λ f (x)
for all x ∈ [a, b]. Under these operations, F [a, b] is a real vector
space.
Indeed, the function 0 : [a, b] → R defined by 0(x) = 0 for all x ∈
[a, b] acts as the additive identity while the function − f : [a, b] → R
defined by (− f )(x) = − f (x) for all x ∈ [a, b] is the additive inverse
of f ∈ F[a, b]. The other properties follow from the corresponding
properties of real numbers. For example, f + g = g + f follows from
    (f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x).
Example 5.1.11
The set C [a, b] of all continuous functions f : [a, b] → R with addi-
tion and scalar multiplication defined by
( f + g)(x) = f (x) + g(x), (λ f )(x) = λ f (x)
for all x ∈ [a, b] is a real vector space.
Example 5.1.12
Let Pn[x] be the set of all polynomials of degree less than n:
    Pn[x] := { p : p(x) = a0 + a1x + ··· + a_{n−1}x^{n−1}, where a0, ..., a_{n−1} ∈ R }.
Note that the constant polynomials p(x) = a0 are also included in
Pn[x]. If p, q ∈ Pn[x] and λ is any scalar in R, let p + q and λp
be the functions defined by
    (p + q)(x) = p(x) + q(x),  (λp)(x) = λ p(x).
Under these operations, Pn[x] is a real vector space.
Example 5.1.13
Let V be the set with a single element 0. Let addition and scalar
multiplication be defined by
0 + 0 = 0, λ 0 = 0.
Then V is a vector space and it is called the zero vector space.
Theorem 5.1.2
Let V be a vector space. For any vector u ∈ V and scalar λ , we have
(1) 0u = 0;
(2) λ 0 = 0;
(3) (−1)u = −u;
(4) If λ u = 0, then λ = 0 or u = 0.
5.2. SUBSPACES
Let V be a vector space and W a subset of V . For two elements x, y ∈ W ,
we can talk about their sum x + y using the same addition operation in V .
Similarly, we can talk about λ x using the scalar multiplication in V . Our aim
is to investigate whether the two operations make W a vector space on its
own. For example, if V = R2 and W1 = {(x1 , 1) : x1 ∈ R}, then, for two
elements x = (x1 , 1), y = (y1 , 1), their sum x + y = (x1 + y1 , 2) ∉ W1 and
therefore the addition operation is not a binary operation on W1 . Thus,
W1 is not a vector space. However, in certain cases, W may become a
vector space. Consider W2 = {(x1 , 0) : x1 ∈ R}. In this case, for two
elements x = (x1 , 0), y = (y1 , 0), their sum x + y = (x1 + y1 , 0) ∈ W2 and
the scalar product λ x = (λ x1 , 0) ∈ W2 . The other properties for a vector
space are also satisfied and W2 becomes a vector space on its own.
Definition 5.2.1
A subset W of a vector space V is called a subspace if W is itself a
vector space under the addition and scalar multiplication defined on V .
Theorem 5.2.1
Let W be a non-empty subset of a vector space V . Then W is a
subspace of V if and only if
(1) u + v ∈ W for all u, v ∈ W ,
(2) λ u ∈ W for all u ∈ W and for all scalars λ .
Theorem 5.2.2
Let W be a non-empty subset of a vector space V . Then W is a
subspace of V if and only if λ u + µ v ∈ W for all u, v ∈ W and for
all scalars λ , µ .
Example 5.2.1
Let V = R3 be the vector space under usual addition and scalar
multiplication and W = {(x, y, 0) : x, y ∈ R}. Let u = (x1 , y1 , 0) and
v = (x2 , y2 , 0) be two vectors in W and λ , µ be scalars. Then
λ u + µ v = λ (x1 , y1 , 0) + µ (x2 , y2 , 0)
= (λ x1 + µ x2 , λ y1 + µ y2 , 0) ∈ W.
Thus W is a subspace of V .
Example 5.2.2
Let V = R3 be the vector space under usual addition and scalar
multiplication and
W = {(x, y, ax + by) : x, y ∈ R}.
Let u = (x1 , y1 , ax1 + by1 ) and v = (x2 , y2 , ax2 + by2 ) be two vectors
in W and λ , µ be scalars. Then
λ u + µ v = λ (x1 , y1 , ax1 + by1 ) + µ (x2 , y2 , ax2 + by2 )
= (λ x1 + µ x2 , λ y1 + µ y2 , a(λ x1 + µ x2 ) + b(λ y1 + µ y2 )) ∈ W.
Thus W is a subspace of V .
Since the points (x, y, z) ∈ W satisfy
ax + by = z,
we have proved that the set of points on a plane passing through the
origin is a subspace of R3 .
Example 5.2.3
Consider the vector space V = Mm of all square matrices of order m.
Let W be the subset consisting of symmetric matrices of order m.
Since the sum of two symmetric matrices is symmetric and a scalar
multiple of a symmetric matrix is symmetric, W is a subspace of V .
Similarly the subsets consisting of lower triangular matrices,
upper triangular matrices and diagonal matrices are all subspaces
of V .
The subset consisting of invertible matrices of order m is not a
subspace since the sum of two invertible matrices need not be
invertible.
Example 5.2.4
Consider the vector space V of all functions f : [a, b] → R under
the addition and scalar multiplication of functions. Since the sum
of two continuous functions is continuous and a scalar multiple of a
continuous function is also continuous, the subset C[a, b] consisting
of continuous functions is a subspace of V .
Theorem 5.2.3
The set of all solutions of a homogeneous linear system Ax = 0
(with m equations and n unknowns) is a subspace of Rn .
Example 5.2.5
Consider solutions of the homogeneous linear system
x + y + z = 0, y−z = 0
given by
x = −2t, y = t, z = t.
The set of all solutions W = {(−2t,t,t) : t ∈ R} is a subspace of
R3 . It can be verified directly. Let u = (−2t,t,t) and v = (−2s, s, s)
and λ and µ be scalars. Then
λ u + µ v = (−2(λ t + µ s), λ t + µ s, λ t + µ s)
is again in W .
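The closure computation can also be illustrated numerically. A minimal Python/NumPy sketch (illustrative only):

import numpy as np

A = np.array([[1, 1, 1],
              [0, 1, -1]])           # x + y + z = 0, y - z = 0

def solution(t):
    return np.array([-2 * t, t, t])  # the parametric solution (-2t, t, t)

lam, mu, t, s = 2.0, -3.0, 1.5, 0.25
x = lam * solution(t) + mu * solution(s)
print(A @ x)   # the zero vector: a linear combination of solutions is a solution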
Definition 5.2.2
A vector v is a linear combination of the vectors v1 , v2 , . . . , vk if
there are scalars λ1 , λ2 , . . . , λk such that
v = λ1 v1 + λ2 v2 + . . . + λk vk .
Example 5.2.6
Consider the vector space Rn with usual addition and scalar multi-
plication. Consider the following vectors
e1 = (1, 0, 0, · · · , 0)
e2 = (0, 1, 0, · · · , 0)
e3 = (0, 0, 1, · · · , 0)
...
en = (0, 0, 0, · · · , 1).
Any vector v = (v1 , v2 , · · · , vn ) ∈ Rn can be written as
v = v1 e1 + v2 e2 + v3 e3 + · · · + vn en .
Example 5.2.7
Consider the vector space R3 . Let e1 = (1, 0, 0), e2 = (1, 1, 0) and
e3 = (1, 1, 1). Let v = (v1 , v2 , v3 ) ∈ R3 be any vector. Since
λ1 e1 + λ2 e2 + λ3 e3 = (λ1 + λ2 + λ3 , λ2 + λ3 , λ3 ),
v = λ1 e1 + λ2 e2 + λ3 e3
if
λ1 + λ2 + λ3 = v1 , λ2 + λ3 = v2 , λ3 = v3
or equivalently if
λ1 = v1 − v2 , λ2 = v2 − v3 , λ3 = v3 .
Thus
v = (v1 − v2 )e1 + (v2 − v3 )e2 + v3 e3 .
Example 5.2.8
Consider the vector space R3 . Let v1 = (3, −1, 2), v2 = (−1, 2, −1).
Consider the vectors u = (7, −4, 5) and w = (1, 3, 1).
If u is a linear combination of v1 and v2 , then there are scalars
λ1 and λ2 such that
u = λ1 v1 + λ2 v2 .
Since
λ1 v1 + λ2 v2 = (3λ1 − λ2 , −λ1 + 2λ2 , 2λ1 − λ2 )
we must have
3λ1 − λ2 = 7, −λ1 + 2λ2 = −4, 2λ1 − λ2 = 5.
The solution of the linear system is given by λ1 = 2 and λ2 = −1.
Thus
u = 2v1 − v2 .
The linear system resulting from
w = λ1 v1 + λ2 v2
is
3λ1 − λ2 = 1, −λ1 + 2λ2 = 3, 2λ1 − λ2 = 1;
this system is inconsistent and therefore w is not a linear combina-
tion of v1 and v2 .
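Whether a given vector lies in span{v1 , v2 } is exactly the question of whether the corresponding linear system is consistent, so it can be tested numerically. A Python/NumPy sketch (illustrative; the names V, u and w are ours):

import numpy as np

# Columns of V are v1 and v2; b is in span{v1, v2} iff V @ lam = b is consistent.
V = np.array([[3, -1], [-1, 2], [2, -1]], dtype=float)
u = np.array([7, -4, 5], dtype=float)
w = np.array([1, 3, 1], dtype=float)

for b in (u, w):
    lam, *_ = np.linalg.lstsq(V, b, rcond=None)   # least-squares solution
    print(b, "->", lam.round(6), "consistent:", bool(np.allclose(V @ lam, b)))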
Theorem 5.2.4
Let V be a vector space. The set W of all linear combinations of
v1 , v2 , . . . , vk ∈ V is a subspace of V . Any subspace W ′ containing
v1 , v2 , . . . , vk contains the subspace W .
5.3. SPAN
Definition 5.3.1
Let S = {v1 , v2 , . . . , vk } be a subset of a vector space V . Then the
subspace W consisting of all linear combinations of elements in S is
called the space spanned by v1 , v2 , . . ., vk or simply the span of S,
and is denoted by span(S).
Example 5.3.1
Let V be a vector space. Let 0 ̸= v ∈ V and S = {v}. Then
span(S) = {λ v : λ ∈ R}.
Let v1 , v2 ∈ V with v1 ≠ λ v2 for every scalar λ . If S = {v1 , v2 }, then
span S = {λ v1 + µ v2 : λ , µ ∈ R}.
Example 5.3.2
Consider the vector space Rn . Consider the set S consisting of the
following vectors
e1 = (1, 0, 0, · · · , 0)
e2 = (0, 1, 0, · · · , 0)
e3 = (0, 0, 1, · · · , 0)
...
en = (0, 0, 0, · · · , 1).
For any vector v = (v1 , v2 , · · · , vn ) ∈ Rn , we have
v = v1 e1 + v2 e2 + v3 e3 + · · · + vn en .
Thus span(S) = Rn .
Example 5.3.3
Consider v1 = (1, 0, 0), v2 = (0, 1, 1) and v3 = (1, 0, 1) in R3 . Then
λ1 v1 + λ2 v2 + λ3 v3
= (λ1 + λ3 , λ2 , λ2 + λ3 )
= (v1 , v2 , v3 )
where v1 = λ1 + λ3 , v2 = λ2 , v3 = λ2 + λ3 , or λ1 = v1 + v2 − v3 , λ2 =
v2 , λ3 = v3 − v2 . Thus any vector in R3 is a linear combination of
v1 , v2 and v3 . Thus
span{v1 , v2 , v3 } = R3 .
5.4. LINEAR DEPENDENCE AND INDEPENDENCE
Example 5.3.4
Consider v1 = (1, 0, 0), v2 = (0, 1, 0) in R3 . Then
λ1 v1 + λ2 v2
= (λ1 , λ2 , 0).
Therefore
span{v1 , v2 } = {(x, y, 0) : x, y ∈ R};
in this case, span{v1 , v2 } ̸= R3 .
Example 5.3.5
The set {1, x, x2 , . . . , xn } spans the vector space of all polynomials
of degree less than or equal to n.
Definition 5.4.1
Let V be a vector space. Let S = {v1 , v2 , . . . , vk } be a subset of V .
If there are scalars λ1 , λ2 , . . . , λk , not all zero, such that
λ1 v1 + λ2 v2 + . . . + λk vk = 0,
then the set S is linearly dependent. If S is not linearly dependent,
then it is linearly independent. Thus S is linearly independent if
λ1 v1 + λ2 v2 + . . . + λk vk = 0
implies
λ1 = λ2 = · · · = λk = 0.
Example 5.4.1
A set with a single vector v is linearly dependent if and only if there
is a scalar λ ≠ 0 such that λ v = 0. This holds if and only if v = 0.
Thus a single element set {v} is linearly dependent if and only if
v = 0.
The set {v} is linearly independent if and only if v ̸= 0.
Example 5.4.2
Consider a two element subset {v1 , v2 } of a vector space V . The set
is linearly dependent if there are scalars λ1 and λ2 , not both zero,
such that
λ1 v1 + λ2 v2 = 0.
Since λ1 and λ2 are not both zero, at least one of them, say λ1 , is
nonzero. Then
v1 = −(λ2 /λ1 )v2 .
Thus v1 is a scalar multiple of v2 .
If v1 is a scalar multiple of v2 , then v1 = λ v2 or v1 − λ v2 = 0.
Thus {v1 , v2 } is linearly dependent.
Thus, the set {v1 , v2 } is linearly dependent if and only if one
vector is a scalar multiple of the other. The set {v1 , v2 } is linearly
independent if and only if neither vector is a scalar multiple of the
other.
The set {(1, 0, 0), (0, 1, 0)} is linearly independent while
{(1, 1, 2), (2, 2, 4)} is linearly dependent.
Example 5.4.3
Consider the vector space Rn . Consider the set S consisting of the
following vectors
e1 = (1, 0, 0, · · · , 0)
e2 = (0, 1, 0, · · · , 0)
e3 = (0, 0, 1, · · · , 0)
...
en = (0, 0, 0, · · · , 1).
Then
λ1 e1 + λ2 e2 + . . . + λn en = 0
becomes
(λ1 , λ2 , . . . , λn ) = (0, 0, . . . , 0)
or
λ1 = λ2 = · · · = λn = 0.
Thus S is linearly independent.
Example 5.4.4
Consider v1 = (1, 0, 0), v2 = (0, 1, 1) and v3 = (1, 0, 1) in R3 . Since
λ1 v1 + λ2 v2 + λ3 v3
= (λ1 + λ3 , λ2 , λ2 + λ3 ),
the equation
λ1 v1 + λ2 v2 + λ3 v3 = 0
becomes
λ1 + λ3 = 0, λ2 = 0, λ2 + λ3 = 0.
Solving these equations, we get
λ1 = λ2 = λ3 = 0.
Thus S = {v1 , v2 , v3 } is a linearly independent set.
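Linear independence of k vectors in Rn can be tested numerically by checking that the matrix having them as rows has rank k. A Python/NumPy sketch for this example (illustrative):

import numpy as np

S = np.array([[1, 0, 0],
              [0, 1, 1],
              [1, 0, 1]])
# The rows are linearly independent iff the rank equals the number of rows.
print(np.linalg.matrix_rank(S) == len(S))   # True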
Example 5.4.5
The set {1, x, x2 , . . . , xn } is linearly independent in the vector
space of all polynomials of degree less than or equal to n. Let
λ0 , λ1 , λ2 , . . . , λn be scalars such that
λ0 · 1 + λ1 x + λ2 x2 + · · · + λn xn = 0.
Since this is true for all x, we get λ0 = 0 by taking x = 0. Now by
differentiating the above equation, we get
λ1 + 2λ2 x + · · · + nλn xn−1 = 0.
By taking x = 0 in this equation, we get λ1 = 0. By proceeding this
way, we see that
λ0 = λ1 = λ2 = . . . = λn = 0
and therefore {1, x, x2 , . . . , xn } is a linearly independent set.
Example 5.4.6
Consider v1 = (1, 0, 0), v2 = (0, 1, 1) and v3 = (1, 1, 1) in R3 . Since
λ1 v1 + λ2 v2 + λ3 v3
= (λ1 + λ3 , λ2 + λ3 , λ2 + λ3 ),
the equation
λ1 v1 + λ2 v2 + λ3 v3 = 0
becomes
λ1 + λ3 = 0, λ2 + λ3 = 0, λ2 + λ3 = 0.
Solving these equations, we get
λ1 = λ2 = t, λ3 = −t.
In particular, we can take λ1 = λ2 = 1 and λ3 = −1. In this case,
we have
v1 + v2 − v3 = 0.
Thus S = {v1 , v2 , v3 } is a linearly dependent set.
Example 5.4.7
If S is a subset of a vector space V and 0 ∈ S, then S is linearly
dependent. For, let v1 , v2 , . . . , vk−1 , 0 be the vectors in S. By taking
λ1 = λ2 = · · · = λk−1 = 0 and λk = 1, we see that
λ1 v1 + λ2 v2 + . . . + λk−1 vk−1 + λk 0 = 0.
Since not all λi ’s are zero (in this case, λk ̸= 0), we see that S is
linearly dependent.
Theorem 5.4.1
A subset S = {v1 , v2 , . . . , vk } with two or more vectors of a vector
space V is linearly dependent if and only if at least one of the vectors
in S is a linear combination of the other vectors in S.
PROOF. Let S be linearly dependent. Then there are scalars
λ1 , λ2 , . . . , λk , not all zero, say λi ≠ 0, such that
λ1 v1 + λ2 v2 + . . . + λk vk = 0,
or
vi = −(λ1 /λi )v1 − (λ2 /λi )v2 − . . . − (λi−1 /λi )vi−1 − (λi+1 /λi )vi+1 − . . . − (λk /λi )vk .
This shows that vi is a linear combination of the other vectors in S.
Conversely, suppose one of the vectors, say v1 , is a linear combination
of the other vectors in S. Then there are scalars λ2 , λ3 , . . . , λk such
that
v1 = λ2 v2 + . . . + λk vk .
Thus
−v1 + λ2 v2 + . . . + λk vk = 0.
Since not all coefficients in the linear combination are zero (the
coefficient of v1 is −1), it follows that S is linearly dependent.
Example 5.4.8
Consider v1 = (1, 0, 0), v2 = (1, 0, 1) and v3 = (3, 0, 1) in R3 . Since
2v1 + v2 = 2(1, 0, 0) + (1, 0, 1)
= (2, 0, 0) + (1, 0, 1)
= (3, 0, 1)
= v3 ,
the set {v1 , v2 , v3 } is linearly dependent in R3 .
Theorem 5.4.2
Any subset S of Rn having more than n vectors is linearly depen-
dent.
5.5. BASIS AND DIMENSION
Definition 5.5.1
A subset S of a vector space V is called a basis for V if (a) S is
linearly independent and (b) S spans V .
Example 5.5.1
Consider the vector space Rn . Consider the set S consisting of the
following vectors
e1 = (1, 0, 0, · · · , 0)
e2 = (0, 1, 0, · · · , 0)
e3 = (0, 0, 1, · · · , 0)
...
en = (0, 0, 0, · · · , 1).
The set S is linearly independent and also spans Rn . Thus it is a
basis for Rn . It is called the standard basis for Rn .
Example 5.5.2
Consider v1 = (1, 0, 0), v2 = (0, 1, 1) and v3 = (1, 0, 1) in R3 . We
have shown that the set S = {v1 , v2 , v3 } spans R3 and is linearly
independent. Thus S is a basis for R3 .
Example 5.5.3
The set S = {1, x, x2 , . . . , xn } is linearly independent in the vector
space Pn of all polynomials of degree less than or equal to n. Also
it spans Pn and therefore S is a basis for Pn .
Theorem 5.5.1
Let S = {v1 , v2 , . . . , vk } be a basis for a vector space V . Then every
element v ∈ V can be expressed as a unique linear combination of
elements in S:
v = λ1 v1 + λ2 v2 + · · · + λk vk .
Definition 5.5.2
Let S = {v1 , v2 , . . . , vk } be a basis for a vector space V . Then every
element v ∈ V can be expressed as a unique linear combination of
elements in S:
v = λ1 v1 + λ2 v2 + · · · + λk vk .
The scalars λ1 , λ2 , . . . , λk are called the coordinates of the vector v
with respect to the basis S. The vector (v)S := (λ1 , λ2 , . . . , λk ) ∈ Rk
is the coordinate vector of v relative to S.
Example 5.5.4
The coordinate vector of v = (v1 , v2 , v3 ) relative to the standard
basis of R3 is (v)S = (v1 , v2 , v3 ), which is the same as the vector v.
Consider the set S = {v1 , v2 , v3 } where v1 = (1, 0, 0), v2 =
(0, 1, 1) and v3 = (1, 0, 1) are in R3 . By Example 5.5.2, S is a basis for R3 . For
any vector v = (v1 , v2 , v3 ), we have
v = (v1 + v2 − v3 )v1 + v2 v2 + (v3 − v2 )v3 .
Therefore the coordinate vector of v relative to the basis S is given
by
(v)S = (v1 + v2 − v3 , v2 , v3 − v2 ).
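Computing a coordinate vector amounts to solving the linear system Bλ = v, where the columns of B are the basis vectors. A Python/NumPy sketch for this example (illustrative):

import numpy as np

# Basis vectors v1, v2, v3 as the columns of B.
B = np.array([[1, 0, 1],
              [0, 1, 0],
              [0, 1, 1]], dtype=float)
v = np.array([2.0, 5.0, 3.0])
lam = np.linalg.solve(B, v)
print(lam)   # (v1 + v2 - v3, v2, v3 - v2) = (4, 5, -2)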
Example 5.5.5
Consider the vector space M2 of all matrices of order 2. Consider
also the set
S = {E1 , E2 , E3 , E4 }
where
E1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, E2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, E3 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, E4 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.
Then any matrix
v = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
can be written as a linear combination of E1 , E2 , E3 and E4 :
v = aE1 + bE2 + cE3 + dE4 .
Hence span(S) = M2 . Also if there are scalars λ1 , . . . , λ4 such that
λ1 E1 + λ2 E2 + λ3 E3 + λ4 E4 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},
then we have
\begin{pmatrix} λ1 & λ2 \\ λ3 & λ4 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
and therefore
λ1 = . . . = λ4 = 0.
Thus S is linearly independent and hence a basis for M2 .
Definition 5.5.3
A vector space V is finite-dimensional if it has a basis with a finite
number of elements. The vector space {0} is also considered
finite-dimensional (even though there is no basis for this vector
space). Otherwise, V is called infinite-dimensional.
The number of elements in a basis for a finite-dimensional vec-
tor space V is called the dimension of the vector space and is de-
noted by dim(V ).
The above definition of dimension makes sense only if all bases
of a finite-dimensional vector space have the same number of elements.
In fact, this is true and will be proved shortly.
Theorem 5.5.2
Let V be a finite-dimensional vector space. Let S be a basis for V with
n vectors. Any set with more than n elements is linearly dependent.
Theorem 5.5.3
Let V be a finite-dimensional vector space. Let S be a basis for V
with n vectors. Any set with fewer than n elements does not span V .
PROOF. Suppose, to the contrary, that a set S′ = {w1 , w2 , . . . , wm }
with m < n vectors spans V . Then each basis vector vi can be written as
vi = a1i w1 + a2i w2 + · · · + ami wm . Note that
λ1 v1 + λ2 v2 + · · · + λn vn
= (λ1 a11 + λ2 a12 + · · · + λn a1n )w1
+ (λ1 a21 + λ2 a22 + · · · + λn a2n )w2
+ ···
+ (λ1 am1 + λ2 am2 + · · · + λn amn )wm .
Consider the equation
(5.5.2) λ1 v1 + λ2 v2 + · · · + λn vn = 0.
Also consider the system of equations
λ1 a11 + λ2 a12 + · · · + λn a1n = 0
λ1 a21 + λ2 a22 + · · · + λn a2n = 0
···
λ1 am1 + λ2 am2 + · · · + λn amn = 0
Since there are m equations in n unknowns and n > m, there are
nontrivial solutions to the above linear system. Thus there are
scalars λ1 , λ2 , . . . , λn , not all zero, such that (5.5.2) holds. This proves
that S is linearly dependent, a contradiction to the assumption that
S is a basis for V . Thus it follows that S′ does not span V .
Theorem 5.5.4
Any two bases for a finite-dimensional vector space V have the same
number of elements.
Example 5.5.6
Since Rn has a basis with n elements, dim(Rn ) = n. Also the vector
space Pn of all polynomials of degree at most n has a basis with n + 1
elements and therefore dim(Pn ) = n + 1. The vector space M2 of
all square matrices of order 2 has a basis with four elements and
therefore dim(M2 ) = 4. In general, dim(Mm×n ) = mn.
Theorem 5.5.5
Let V be a vector space and S a nonempty subset of V . If S
is a linearly independent subset of V and v ∉ span(S), then the set
S′ = S ∪ {v} is also linearly independent.
Theorem 5.5.6
Let V be a vector space and S be a nonempty subset of V . If v ∈ S
and v ∈ span(S − {v}), then
span(S) = span(S − {v}).
Theorem 5.5.7
Let V be a finite-dimensional vector space and S be a finite subset
of V .
(a) If S spans V but is not a basis for V , then a basis for V can be
obtained from S by removing appropriate vectors.
(b) If S is linearly independent but not a basis for V , then S
can be enlarged to a basis for V by adding appropriate vectors
to S.
Theorem 5.5.8
Let V be an n-dimensional vector space and S be a subset of V with
exactly n vectors. Then S is a basis for V if either S spans V or S is
linearly independent.
Example 5.5.7
Consider the set S with three vectors v1 = (1, 2, 3), v2 = (2, 5, 3)
and v3 = (1, 0, 8). Let λ1 , λ2 and λ3 be scalars such that λ1 v1 +
λ2 v2 + λ3 v3 = 0. Therefore we have
λ1 + 2λ2 + λ3 = 0,
2λ1 + 5λ2 = 0,
3λ1 + 3λ2 + 8λ3 = 0.
Since the matrix
A = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 5 & 0 \\ 3 & 3 & 8 \end{pmatrix}
has nonzero determinant (det(A) = −1), it is invertible and there-
fore there is only a trivial solution to the above equations. Thus the
set S is linearly independent. Therefore S is a basis for R3 .
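The determinant test used above is easy to run numerically. A Python/NumPy sketch (illustrative):

import numpy as np

# The vectors as rows; they form a basis of R^3 iff the determinant is nonzero.
S = np.array([[1, 2, 3],
              [2, 5, 3],
              [1, 0, 8]], dtype=float)
print(np.linalg.det(S))   # -1.0 up to rounding, so S is a basis for R^3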
Example 5.5.8
Consider the set S with four vectors v1 = (1, 0, 0, 0), v2 =
(a, 1, 0, 0), v3 = (b, c, 1, 0) and v4 = (d, e, f , 1). Let λ1 , λ2 , λ3 and
λ4 be scalars such that λ1 v1 + λ2 v2 + λ3 v3 + λ4 v4 = 0. Therefore we
have
have
λ1 + aλ2 + bλ3 + d λ4 = 0,
λ2 + cλ3 + eλ4 = 0,
λ3 + f λ4 = 0,
λ4 = 0.
Since the matrix
A = \begin{pmatrix} 1 & a & b & d \\ 0 & 1 & c & e \\ 0 & 0 & 1 & f \\ 0 & 0 & 0 & 1 \end{pmatrix}
has nonzero determinant (det(A) = 1), it is invertible and therefore
there is only a trivial solution to the above equations. Thus the set
S is linearly independent. Therefore S is a basis for R4 .
Example 5.5.9
Consider the set S with three vectors v1 = (1, 1, 1), v2 = (1, 1, 0)
and v3 = (1, 0, 0). Since
v3 = (1, 0, 0), v2 − v3 = (0, 1, 0), v1 − v2 = (0, 0, 1),
any vector v = (λ1 , λ2 , λ3 ) ∈ R3 can be written as
v = λ1 v3 + λ2 (v2 − v3 ) + λ3 (v1 − v2 )
= λ3 v1 + (λ2 − λ3 )v2 + (λ1 − λ2 )v3 .
Thus S spans R3 and therefore it is a basis for R3 .
5.6. ROW SPACE, COLUMN SPACE AND NULL SPACE
Definition 5.6.1
Let A = [ai j ] be an m × n matrix. The vectors
r1 = (a11 a12 a13 · · · a1n ),
r2 = (a21 a22 a23 · · · a2n ),
...
rm = (am1 am2 am3 · · · amn ),
obtained from the rows of A, are called the row vectors of A. The
vectors
c1 = \begin{pmatrix} a_{11} \\ a_{21} \\ a_{31} \\ \vdots \\ a_{m1} \end{pmatrix}, c2 = \begin{pmatrix} a_{12} \\ a_{22} \\ a_{32} \\ \vdots \\ a_{m2} \end{pmatrix}, · · · , cn = \begin{pmatrix} a_{1n} \\ a_{2n} \\ a_{3n} \\ \vdots \\ a_{mn} \end{pmatrix},
obtained from the columns of A, are called the column vectors of
A. The vector space spanned by the row vectors of A is called the
row space of A and the vector space spanned by the column vectors
is the column space of A. The vector space of all solutions of the
linear system Ax = 0 is the null space of A. Note that the row space
and the null space are subspaces of Rn while the column space is a
subspace of Rm .
Example 5.6.1
Consider the matrix
A = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}.
The vectors r1 = (1 0 1) and r2 = (1 1 1) are the row vectors of
A while c1 = (1 1)T and c2 = (0 1)T and c3 = c1 are the column
vectors of A. The subspace
R = {λ r1 + µ r2 : λ , µ ∈ R}
is the row space of A. The subspace
C = {λ c1 + µ c2 : λ , µ ∈ R}
is the column space of A.
The solutions of the system Ax = 0 are given by the equation
x + z = 0, x+y+z = 0
or equivalently by
x + z = 0, y = 0.
By solving this, we get x = t, y = 0, z = −t. Thus the set of all
solutions of Ax = 0 is N = {(t, 0, −t) : t ∈ R}. This is clearly a
subspace of R3 and is the null space of A.
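The three subspaces of this example can be computed symbolically. A Python sketch using SymPy (assuming SymPy is available; illustrative only):

from sympy import Matrix

A = Matrix([[1, 0, 1],
            [1, 1, 1]])
print(A.nullspace())    # [Matrix([[-1], [0], [1]])]: multiples of (-1, 0, 1)
print(A.rowspace())     # a basis for the row space
print(A.columnspace())  # a basis for the column space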
Theorem 5.6.1
A linear system Ax = b is consistent if and only if b is in the column
space of A.
Theorem 5.6.2
A linear system Ax = 0 has only the trivial solution if and only if the
column vectors of A are linearly independent.
Example 5.6.2
Consider the linear system Ax = b where
A = \begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}, x = \begin{pmatrix} x \\ y \\ z \end{pmatrix}, and b = \begin{pmatrix} 4 \\ −1 \\ −1 \end{pmatrix}.
The augmented matrix for the system is
\begin{pmatrix} 1 & 0 & 1 & 4 \\ 1 & 1 & 0 & −1 \\ 0 & 1 & 1 & −1 \end{pmatrix}.
By subtracting the first row from the second, the new second row
from the third, and then dividing the third row by 2, we get the matrix
\begin{pmatrix} 1 & 0 & 1 & 4 \\ 0 & 1 & −1 & −5 \\ 0 & 0 & 1 & 2 \end{pmatrix}.
Thus the linear system becomes x + z = 4, y − z = −5 and z = 2.
Thus we have
x = 2, y = −3, z = 2.
Note that the vector b is in the column space of A, for
\begin{pmatrix} 4 \\ −1 \\ −1 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} − 3 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + 2 \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}.
Similarly, the vector
b = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}
is in the column space of A and therefore the system Ax = b is
consistent; the solution is given by x = 1, y = 1, z = 0.
Theorem 5.6.3
Let x0 be any solution of consistent linear system Ax = b and
x1 , x2 , . . . , xk be basis vectors of the null space of A. Every solu-
tion x of Ax = b can be written as
(5.6.1) x = x0 + λ1 x1 + · · · + λk xk ,
where λ1 , . . . , λk are scalars. Conversely, the vector x given by (5.6.1)
is a solution of Ax = b for every choice of scalars λ1 , . . . , λk .
Theorem 5.6.4
Let B be a matrix obtained from the matrix A by a sequence of
elementary row operations. Then the null space of A and
the null space of B are the same.
Theorem 5.6.5
Let B be a matrix obtained from the matrix A by a sequence of
elementary row operations. Then the row space of A and
the row space of B are the same.
Theorem 5.6.6
Let B be a matrix obtained from the matrix A by a sequence of
elementary row operations. Then a subset of the column vectors of
A is linearly independent if and only if the corresponding column
vectors of B are linearly independent. The same statement holds with
"linearly independent" replaced by "a basis for the column space".
Theorem 5.6.7
Let R be the reduced row echelon form of a matrix A. Then the row
vectors of R containing leading 1's form a basis for the row space of A,
and the column vectors of A corresponding to the column vectors of R
containing leading 1's form a basis for the column space of A.
Corollary 5.6.1
The dimensions of the row space and the column space of a matrix are equal.
Definition 5.6.2
The common value of the dimensions of the row space and the column
space of a matrix A is the rank of A. The dimension of the null
space of A is the nullity of A.
It follows that the rank of A is the number of leading 1’s in the
reduced row echelon form of the matrix A.
Theorem 5.6.8
The matrix A and its transpose AT have the same rank.
Example 5.6.4
Consider the matrix
A = \begin{pmatrix} 1 & 2 & 0 & 0 & 2 & −1 \\ 2 & 4 & 1 & 0 & 7 & −1 \\ 1 & 2 & 0 & 1 & 4 & 0 \\ 1 & 2 & 1 & 1 & 7 & 1 \end{pmatrix}.
The reduced row echelon form of the matrix is given by
R = \begin{pmatrix} 1 & 2 & 0 & 0 & 2 & −1 \\ 0 & 0 & 1 & 0 & 3 & 1 \\ 0 & 0 & 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}.
The row vectors
r1 = (1 2 0 0 2 −1),
r2 = (0 0 1 0 3 1),
r3 = (0 0 0 1 2 1)
form a basis for the row space of A. Note that the first, third and fourth
columns correspond to leading 1's. Therefore the column vectors
c1 = \begin{pmatrix} 1 \\ 2 \\ 1 \\ 1 \end{pmatrix}, c2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}, c3 = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 1 \end{pmatrix}
form a basis for the column space of A.
Note that the basis vectors for the column space are columns of A,
while the basis vectors for the row space are rows of R rather than
rows of A. We can find a basis for the row space consisting of rows
of A by reducing AT to reduced row echelon form.
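The reduced row echelon form, the pivot columns and the rank in this example can be checked symbolically. A Python/SymPy sketch (illustrative):

from sympy import Matrix

A = Matrix([[1, 2, 0, 0, 2, -1],
            [2, 4, 1, 0, 7, -1],
            [1, 2, 0, 1, 4,  0],
            [1, 2, 1, 1, 7,  1]])
R, pivots = A.rref()
print(R)        # the reduced row echelon form shown above
print(pivots)   # (0, 2, 3): the first, third and fourth columns are pivot columns
print(A.rank()) # 3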
Example 5.6.5
Find λ and µ such that the system of linear equations
x+y+z = 6
x + 2y + 3z = 10
x + 2y + λ z = µ
has no solution, a unique solution, or infinitely many solutions.
The given system is
\begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 2 & λ \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 6 \\ 10 \\ µ \end{pmatrix}.
The augmented matrix is
[A|b] = \begin{pmatrix} 1 & 1 & 1 & 6 \\ 1 & 2 & 3 & 10 \\ 1 & 2 & λ & µ \end{pmatrix}
∼ \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & 1 & 2 & 4 \\ 0 & 1 & λ−1 & µ−6 \end{pmatrix}   (R2 → R2 − R1 , R3 → R3 − R1 )
∼ \begin{pmatrix} 1 & 1 & 1 & 6 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & λ−3 & µ−10 \end{pmatrix}   (R3 → R3 − R2 ).
Note that the rank of A is 2 if λ = 3, and 3 if λ ≠ 3, while the rank of
[A|b] is 2 if λ = 3 and µ = 10, and 3 otherwise.
The system has no solution when the rank of [A|b] is not equal to
the rank of A, i.e., when λ = 3 and µ ≠ 10.
The system has a solution if the rank of A is equal to the rank of
[A|b], i.e., when λ = 3 and µ = 10, or when λ ≠ 3. The solution is
unique if λ ≠ 3, and there are infinitely many solutions if λ = 3 and µ = 10.
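The case analysis can be confirmed by comparing ranks numerically. A Python/NumPy sketch (illustrative; the helper classify is ours):

import numpy as np

def classify(lam, mu):
    A = np.array([[1, 1, 1], [1, 2, 3], [1, 2, lam]], dtype=float)
    Ab = np.hstack([A, np.array([[6], [10], [mu]], dtype=float)])
    rA, rAb = np.linalg.matrix_rank(A), np.linalg.matrix_rank(Ab)
    if rA < rAb:
        return "no solution"
    return "unique solution" if rA == 3 else "infinitely many solutions"

print(classify(3, 5))    # no solution
print(classify(3, 10))   # infinitely many solutions
print(classify(4, 7))    # unique solution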
Chapter 6
Eigenvalues and Eigenvectors
6.1. EIGENVALUES AND EIGENVECTORS
Example 6.1.1
Find the eigenvalues and eigenvectors of
A = \begin{pmatrix} 3 & 1 & 1 \\ 1 & 3 & −1 \\ 1 & −1 & 3 \end{pmatrix}.
The characteristic equation |A − λ I| = 0 becomes
λ^3 − 9λ^2 + 24λ − 16 = 0.
The roots of the above equation (or the eigenvalues of the matrix)
are 1, 4, 4.
The eigenvector corresponding to the eigenvalue λ is given by
the matrix equation
\begin{pmatrix} 3−λ & 1 & 1 \\ 1 & 3−λ & −1 \\ 1 & −1 & 3−λ \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0.
Case (i) (λ = 1) In this case, we have to solve
\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & −1 \\ 1 & −1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0
or
2x1 + x2 + x3 = 0, x1 + 2x2 − x3 = 0, x1 − x2 + 2x3 = 0.
Subtracting the second equation from the first, we get the third
equation. Hence, using the first two equations, we get
x_1/3 = x_2/(−3) = x_3/(−3) = k/3.
Therefore (k, −k, −k), k ≠ 0, is an eigenvector corresponding to λ = 1.
Case (ii) (λ = 4) In this case, we have to solve
\begin{pmatrix} −1 & 1 & 1 \\ 1 & −1 & −1 \\ 1 & −1 & −1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0
or
−x1 + x2 + x3 = 0, x1 − x2 − x3 = 0, x1 − x2 − x3 = 0.
Note that all three equations are the same. Hence we use the first
equation alone. By putting x3 = 0, we have x1 = x2 = k, where
k is a constant. Similarly, by putting x2 = 0, we have x1 = x3 =
k. Hence for nonzero k, (k, k, 0) and (k, 0, k) are eigenvectors
corresponding to λ = 4. Note that for the eigenvalue λ = 4 we
have two linearly independent eigenvectors.
Example 6.1.2
Find the eigenvalues and eigenvectors of
A = \begin{pmatrix} 2 & −2 & 2 \\ 1 & 1 & 1 \\ 1 & 3 & −1 \end{pmatrix}.
The characteristic equation of the given matrix is
\begin{vmatrix} 2−λ & −2 & 2 \\ 1 & 1−λ & 1 \\ 1 & 3 & −1−λ \end{vmatrix} = 0.
This reduces to (2 − λ )(λ^2 − 4) = 0, which gives λ = −2, 2, 2. The
eigenvectors corresponding to λ are given by
\begin{pmatrix} 2−λ & −2 & 2 \\ 1 & 1−λ & 1 \\ 1 & 3 & −1−λ \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = 0.
Case (i) (λ = −2) The corresponding eigenvector is given by
the equations
4x1 − 2x2 + 2x3 = 0, x1 + 3x2 + x3 = 0, x1 + 3x2 + x3 = 0.
Note that the second and third equations are the same. Hence, using the
first and second, we have
x_1/(−4) = x_2/(−1) = x_3/7
and hence the corresponding eigenvector is (−4k, −k, 7k) where
k ≠ 0.
Case (ii) (λ = 2) The corresponding eigenvector is given by the
equations
−2x2 + 2x3 = 0, x1 − x2 + x3 = 0, x1 + 3x2 − 3x3 = 0.
Note that the difference of the second and third equations gives an
equation proportional to the first equation. Hence, using the second and
third, we have
x_1/0 = x_2/4 = x_3/4 = k/4
(a zero denominator meaning that the corresponding numerator is zero),
and hence the corresponding eigenvector is (0, k, k) where k ≠ 0.
Example 6.1.3
Find the eigenvalues and eigenvectors of
A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}.
The characteristic equation of the given matrix is
|A − λ I| = \begin{vmatrix} 1−λ & 0 & 1 \\ 0 & 1−λ & 1 \\ 1 & 1 & −λ \end{vmatrix} = 0
or
λ^3 − 2λ^2 − λ + 2 = 0.
The eigenvalues are λ = −1, 1, 2. The corresponding eigenvectors
are
x_1 = \begin{pmatrix} −1 \\ −1 \\ 2 \end{pmatrix}, x_2 = \begin{pmatrix} −1 \\ 1 \\ 0 \end{pmatrix}, x_3 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
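Eigenvalues and eigenvectors can be computed numerically. A Python/NumPy sketch for this example (illustrative; numerical eigenvectors are normalized, so they are scalar multiples of those above):

import numpy as np

A = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 1, 0]], dtype=float)
w, V = np.linalg.eig(A)
print(np.sort(w).round(6))   # [-1.  1.  2.]
for lam, v in zip(w, V.T):   # the columns of V are unit eigenvectors
    assert np.allclose(A @ v, lam * v)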
Lemma 6.1.1
Let A be a square matrix, x ≠ 0 a vector and λ a number. The vector x is
an eigenvector corresponding to λ if and only if Ax = λ x.
Theorem 6.1.1
The eigenvectors corresponding to distinct eigenvalues are linearly
independent.
Theorem 6.1.2
Let λ1 , λ2 , . . ., λn be eigenvalues of a square matrix A of order n.
Then we have the following:
(i) λ1m , λ2m , . . ., λnm are eigenvalues of Am for any positive
integer m. In general, if p(x) is a polynomial, then p(λi ),
i = 1, 2, . . . , n are the eigenvalues of p(A).
(ii) A is nonsingular if and only if no eigenvalue is zero.
Equivalently A is singular if and only if at least one eigen-
value of A is zero.
(iii) If A is nonsingular, 1/λ1 , 1/λ2 , . . ., 1/λn are the eigen-
values of A−1 . Also |A|/λ1 , |A|/λ2 , . . ., |A|/λn are the
eigenvalues of Adj A.
(iv) A and AT have the same eigenvalues.
(v) If A is a triangular matrix, then the eigenvalues of A are
its diagonal entries.
PROOF. In the proof of the theorem, we use Lemma 6.1.1. (i) Since
Axi = λi xi , we have A2 xi = A(λi xi ) = λi2 xi . Therefore λi2 is an
eigenvalue of A2 . Similarly λim is an eigenvalue of Am . Let
p(x) = a0 xn + a1 xn−1 + · · · + an
be the given polynomial. Then
p(A) = a0 An + a1 An−1 + · · · + an I
and therefore
p(A)xi = a0 An xi + · · · + an xi
= a0 λin xi + a1 λin−1 xi + · · · + an xi
= (a0 λin + a1 λin−1 + · · · + an )xi
= p(λi )xi .
Therefore p(λi ) is an eigenvalue of p(A).
(ii) A is singular ⇐⇒ |A| = 0 ⇐⇒ |A − 0I| = 0 ⇐⇒ 0 is an
eigenvalue of A.
(iii) Since A is nonsingular, all eigenvalues are nonzero. Also
Ax = λ x implies A−1 Ax = A−1 λ x or A−1 x = (1/λ )x.
(iv) Since
|AT − λ I| = |(A − λ I)T | = |A − λ I|,
the result follows.
(v) Let A = [ai j ]. Since A − λ I is triangular and the determinant
of a triangular matrix is the product of its diagonal entries,
|A − λ I| = (a11 − λ )(a22 − λ ) . . . (ann − λ ).
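Part (i) of the theorem is easy to spot-check numerically. A Python/NumPy sketch (illustrative; the polynomial p is our choice):

import numpy as np

A = np.array([[2, -2, 2],
              [1,  1, 1],
              [1,  3, -1]], dtype=float)

def p(M):
    # p(x) = x^2 + 3x + 4, applied to a square matrix
    return M @ M + 3 * M + 4 * np.eye(len(M))

print(np.sort(np.linalg.eigvals(A).real).round(6))     # approx. [-2.  2.  2.]
print(np.sort(np.linalg.eigvals(p(A)).real).round(6))  # approx. [ 2. 14. 14.] = p(-2), p(2), p(2)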
Theorem 6.1.3
(i) The eigenvalues of a real orthogonal matrix are real or occur
in complex conjugate pairs, and have absolute value one. Also,
if λ is an eigenvalue of an orthogonal matrix, so is 1/λ .
(ii) All the eigenvalues of a real symmetric matrix are real.
(iii) All the eigenvalues of a real skew-symmetric matrix are
purely imaginary.
Theorem 6.1.4
An eigenvector cannot correspond to two different eigenvalues.
Example 6.1.4
Find the eigenvalues of A^2 where
A = \begin{pmatrix} 2 & −2 & 2 \\ 1 & 1 & 1 \\ 1 & 3 & −1 \end{pmatrix}.
The eigenvalues of A^2 are the squares of the eigenvalues of A. Since
the eigenvalues of A are −2, 2, 2, the eigenvalues of the matrix A^2
are 4, 4, 4.
Example 6.1.5
Find the sum and product of the eigenvalues of the matrix
A = \begin{pmatrix} 2 & 3 & −2 \\ −2 & 1 & 1 \\ 1 & 0 & 2 \end{pmatrix}.
The sum of the eigenvalues is the trace of the matrix, that is,
2 + 1 + 2 = 5. The product of the eigenvalues is the determinant of the
matrix:
\begin{vmatrix} 2 & 3 & −2 \\ −2 & 1 & 1 \\ 1 & 0 & 2 \end{vmatrix} = 21.
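This relationship is easy to confirm numerically. A Python/NumPy sketch (illustrative):

import numpy as np

A = np.array([[ 2, 3, -2],
              [-2, 1,  1],
              [ 1, 0,  2]], dtype=float)
w = np.linalg.eigvals(A)
print(w.sum().real.round(6), np.trace(A))                   # both 5
print(np.prod(w).real.round(6), np.linalg.det(A).round(6))  # both 21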
6.2. CAYLEY-HAMILTON THEOREM
Example 6.2.2
Verify the Cayley-Hamilton Theorem for the matrix J of order n where
J = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{pmatrix}.
The characteristic equation is
\begin{vmatrix} 1−λ & 1 & \cdots & 1 \\ 1 & 1−λ & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1−λ \end{vmatrix} = 0.
Writing the first column of this determinant as the sum of all the
columns and taking the common factor (n − λ ) out of the first column,
we have
(n − λ ) \begin{vmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1−λ & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1−λ \end{vmatrix} = 0.
6.3. DIAGONALIZATION
A matrix is diagonalizable if it is similar to a diagonal matrix. That
is, the matrix A is diagonalizable if there exists a nonsingular matrix M
such that M −1 AM is a diagonal matrix.
Theorem 6.3.1
A square matrix of order n is diagonalizable if and only if it has n
linearly independent eigenvectors.
Example 6.3.2
Find the eigenvalues and eigenvectors of
A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & −1 & 1 \\ 1 & 1 & −1 \end{pmatrix}.
The characteristic equation of the given matrix is
λ^3 + λ^2 − 4λ − 4 = 0.
The eigenvalues are λ = −2, −1, 2. The corresponding eigenvectors
are
x_1 = \begin{pmatrix} 0 \\ −1 \\ 1 \end{pmatrix}, x_2 = \begin{pmatrix} 1 \\ −1 \\ −1 \end{pmatrix}, x_3 = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix}.
Example 6.3.3
Find a matrix A whose eigenvalues are 1, 2 and 3 and whose
corresponding eigenvectors are
\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ −1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}.
Consider the matrix M of eigenvectors and the diagonal matrix
D of eigenvalues:
M = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & −1 & 0 \end{pmatrix}, D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix}.
The inverse of M is given by
M^{−1} = \begin{pmatrix} 1/2 & 0 & 1/2 \\ 1/2 & 0 & −1/2 \\ −1/2 & 1 & 1/2 \end{pmatrix}
and therefore
A = MDM^{−1} = \begin{pmatrix} 3/2 & 0 & −1/2 \\ −1/2 & 3 & 1/2 \\ −1/2 & 0 & 3/2 \end{pmatrix}
is the required matrix with the given eigenvalues and eigenvectors.
(One can check directly that A(1, 0, 1)^T = (1, 0, 1)^T , A(1, 1, −1)^T =
2(1, 1, −1)^T and A(0, 1, 0)^T = 3(0, 1, 0)^T .)
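The construction A = MDM^{−1} can be reproduced numerically, which also confirms the eigenpairs. A Python/NumPy sketch (illustrative):

import numpy as np

M = np.array([[1,  1, 0],
              [0,  1, 1],
              [1, -1, 0]], dtype=float)   # eigenvectors as columns
D = np.diag([1.0, 2.0, 3.0])              # eigenvalues in the matching order
A = M @ D @ np.linalg.inv(M)
print(A)   # [[ 1.5  0. -0.5], [-0.5  3.  0.5], [-0.5  0.  1.5]]
for lam, v in zip(np.diag(D), M.T):
    assert np.allclose(A @ v, lam * v)    # each column of M is an eigenvector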
Theorem 6.3.2
Let A be a square matrix of order n. Then the following are equiv-
alent.
(1) A is orthogonally diagonalizable.
(2) A has an orthonormal set of n eigenvectors.
(3) A is symmetric.
Example 6.3.4
Orthogonally diagonalize the matrix
A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & −1 & 1 \\ 1 & 1 & −1 \end{pmatrix}.
The characteristic equation of the given matrix is
λ^3 + λ^2 − 4λ − 4 = 0.
The eigenvalues are λ = −2, −1, 2. The corresponding eigenvectors
are
x_1 = \begin{pmatrix} 0 \\ −1 \\ 1 \end{pmatrix}, x_2 = \begin{pmatrix} 1 \\ −1 \\ −1 \end{pmatrix}, x_3 = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix}.
The corresponding orthonormal eigenvectors are given by
v_1 = \begin{pmatrix} 0 \\ −1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}, v_2 = \begin{pmatrix} 1/\sqrt{3} \\ −1/\sqrt{3} \\ −1/\sqrt{3} \end{pmatrix}, v_3 = \begin{pmatrix} 2/\sqrt{6} \\ 1/\sqrt{6} \\ 1/\sqrt{6} \end{pmatrix}.
The matrix of eigenvectors is
P = \begin{pmatrix} 0 & 1/\sqrt{3} & 2/\sqrt{6} \\ −1/\sqrt{2} & −1/\sqrt{3} & 1/\sqrt{6} \\ 1/\sqrt{2} & −1/\sqrt{3} & 1/\sqrt{6} \end{pmatrix}.
Now it is easy to see that
PT P = PPT = I
and
PT AP = \begin{pmatrix} −2 & 0 & 0 \\ 0 & −1 & 0 \\ 0 & 0 & 2 \end{pmatrix}.
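Orthogonal diagonalization of a symmetric matrix is available numerically through np.linalg.eigh. A Python/NumPy sketch for this example (illustrative; eigh returns the eigenvalues in ascending order, and its eigenvector columns may differ from those above by sign):

import numpy as np

A = np.array([[1,  1,  1],
              [1, -1,  1],
              [1,  1, -1]], dtype=float)
w, P = np.linalg.eigh(A)                # for symmetric A, P is orthogonal
print(w.round(6))                       # [-2. -1.  2.]
print(np.allclose(P.T @ P, np.eye(3)))  # True
print((P.T @ A @ P).round(6))           # diag(-2, -1, 2) up to rounding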
6.4. QUADRATIC FORMS
Given a quadratic form q = xT Ax = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j , note that,
interchanging i and j,
q = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ji} x_i x_j
and therefore
q = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{a_{ij} + a_{ji}}{2} x_i x_j = \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} x_i x_j ,
where A_{ij} = (a_{ij} + a_{ji})/2 satisfies A_{ij} = A_{ji} ; that is, the matrix
[A_{ij} ] is symmetric.
For example, the quadratic form q = 3x_1^2 + 4x_1 x_2 + 5x_2^2 can therefore
be written as
q = \begin{pmatrix} x_1 & x_2 \end{pmatrix} \begin{pmatrix} 3 & 2 \\ 2 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.
Theorem 6.5.3
A quadratic form q = xT Ax is
(1) positive definite if Di > 0 for i = 1, 2, . . . , n;
(2) positive semi-definite if Di ≥ 0 for i = 1, 2, . . . , n and at
least one Di = 0;
(3) negative definite if (−1)^i Di > 0 for i = 1, 2, . . . , n;
(4) negative semi-definite if (−1)^i Di ≥ 0 for i = 1, 2, . . . , n
and at least one Di = 0.
Here Di denotes the leading principal minor of order i of the matrix A.
Example 6.5.1
Find the eigenvalues and the determinant of A + kI in terms of the
eigenvalues of A.
Since |A − λ I| = |(A + kI) − (λ + k)I|, λ is an eigenvalue of A
if and only if λ + k is an eigenvalue of A + kI. If λi , i = 1, . . . , n, are
the eigenvalues of A, then λi + k, i = 1, . . . , n, are the eigenvalues of
A + kI. Also |A + kI| = (λ1 + k) · · · (λn + k).
Example 6.5.2
Show that A and A + I cannot be similar.
By the previous example, A and A + I have different sets of
eigenvalues (those of A + I are the eigenvalues of A shifted by 1).
But similar matrices have the same set of eigenvalues. Therefore
A and A + I cannot be similar.
Example 6.5.3
Prove that the sum of the eigenvalues of A + B equals the sum
of all the eigenvalues of A and B.
Note that the sum of the eigenvalues of a matrix equals the
trace of the matrix. The result follows since the trace of A + B is equal
to the sum of the trace of A and the trace of B.
Example 6.5.4
Verify the Cayley-Hamilton Theorem for A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.
Example 6.5.5
Verify the Cayley-Hamilton Theorem for A = \begin{pmatrix} a & b & c \\ 0 & d & e \\ 0 & 0 & f \end{pmatrix}.
The characteristic equation is (a − λ )(d − λ )( f − λ ) = 0 or
λ^3 − (a + d + f )λ^2 + (ad + af + df )λ − adf = 0. Since
A^2 = \begin{pmatrix} a^2 & ab + bd & ac + be + cf \\ 0 & d^2 & de + ef \\ 0 & 0 & f^2 \end{pmatrix}
and
A^3 = \begin{pmatrix} a^3 & a^2 b + abd + bd^2 & a^2 c + abe + acf + bde + bef + cf^2 \\ 0 & d^3 & d^2 e + def + ef^2 \\ 0 & 0 & f^3 \end{pmatrix},
we get
A^3 − (a + d + f )A^2 + (ad + af + df )A − adf I = 0.
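Both of these verifications can be carried out symbolically. A Python/SymPy sketch for the triangular matrix of this example (illustrative, assuming SymPy is available):

from sympy import Matrix, eye, simplify, symbols, zeros

a, b, c, d, e, f = symbols('a b c d e f')
A = Matrix([[a, b, c],
            [0, d, e],
            [0, 0, f]])
# Characteristic polynomial: x^3 - (a+d+f)x^2 + (ad+af+df)x - adf.
lhs = A**3 - (a + d + f) * A**2 + (a*d + a*f + d*f) * A - a*d*f * eye(3)
print(simplify(lhs) == zeros(3, 3))   # True: A satisfies its characteristic equation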