Linear Algebra - Friedberg - 4th Ed - Some Notions
1 Vector spaces
1.1 Introduction
Definitions (p. 1-3): A vector is an entity involving both magnitude and direction; represented by an
arrow. Parallelogram law for vector addition: the sum of two vectors x and y is the resultant
diagonal vector in the parallelogram having x and y as adjacent sides. Scalar multiplication:
multiplying a vector by a real number. Two nonzero vectors x and y are called parallel if y = tx for
some nonzero real number t.
The algebraic descriptions of vector addition and scalar multiplication for vectors in a plane yield
eight properties (p. 3).
Equation of the line through A and B (A the endpoint of vector u, B the endpoint of vector v): x = u + t(v - u). Equation of
the plane containing A, B, and C (where u and v denote the vectors from A to B and from A to C): x = A + su + tv.
1.2 Vector spaces
Definitions (pp. 6-7): A vector space (or linear space) V over a field F consists of a set on which two
operations (called addition and scalar multiplication, respectively) are defined so that for each pair
of elements x, y in V there is a unique element x + y in V, and for each element a in F and each
element x in V there is a unique element ax in V, such that the following conditions hold.
(VS 1) For all x, y in V, x + y = y + x (commutativity of addition).
(VS 2) For all x, y, z in V, (x + y) + z = x + (y + z) (associativity of addition).
(VS 3) There exists an element in V denoted by 0 such that x + 0 = x for each x in V.
(VS 4) For each element x in V there exists an element y in V such that x + y = 0.
(VS 5) For each element x in V, 1x = x.
(VS 6) For each pair of elements a, b in F and each element x in V, (ab)x = a(bx).
(VS 7) For each element a in F and each pair of elements x, y in V, a(x + y) = ax + ay.
(VS 8) For each pair of elements a, b in F and each element x in V, (a + b)x = ax + bx.
The elements of the field F are called scalars and the elements of the vector space V are called
vectors.
Definitions (pp. 7-11): An object of the form (a_1, a_2, ..., a_n), where the entries a_i are elements of a
field F, is called an n-tuple with entries from F. The elements a_i are called the entries or components
of the n-tuple. Two n-tuples (a_1, a_2, ..., a_n) and (b_1, b_2, ..., b_n) with entries from a field F are called
equal if a_i = b_i for i = 1, 2, ..., n. The set of all n-tuples with entries from a field F is denoted by F^n.
Vectors in F^n may be written as column vectors rather than as row vectors. An m × n matrix with
entries from a field F is a rectangular array with n entries per row (a vector in F^n) and m entries per
column (a vector in F^m); the entries A_ij with i = j are called the diagonal entries of the matrix. If the
numbers of rows and columns of a matrix are equal, the matrix is called square. Two m × n matrices A
and B are called equal if all their corresponding entries are equal. The set of all m × n matrices with
entries from a field F is a vector space, denoted by M_m×n(F), under the operations of matrix addition
and scalar multiplication defined in the book. Similarly, the set of all functions from a nonempty set S
to a field F can be a vector space, denoted by F(S, F), under suitable addition and scalar multiplication
operations. A polynomial f(x) with coefficients from a field F is an expression of the form
f(x) = a_n x^n + a_(n-1) x^(n-1) + ... + a_1 x + a_0, where n is a nonnegative integer and the a_i are elements
of F. If f(x) = 0, then f(x) is called the zero polynomial, and its degree is defined to be -1. Otherwise,
the degree of a polynomial is defined to be the largest exponent of x that appears in the
representation with a nonzero coefficient. The set of all polynomials with coefficients from F can be a
vector space, denoted by P(F), under suitable addition and scalar multiplication operations. A
sequence in F is a function σ from the positive integers into F, with σ(n) = a_n for n = 1, 2, ..., denoted
{a_n}. Under suitable addition and scalar multiplication operations, the set of all sequences that have
only a finite number of nonzero terms a_n is a vector space.
Theorem 1.1, Cancellation Law for Vector Addition (p. 11): If x, y, and z are elements of a vector
space V such that x + z = y + z, then x = y.
Corollary 1 (p. 11): The vector 0 described in (VS 3) is unique [and is called the zero vector of V].
Corollary 2 (p. 12): The vector y described in (VS 4) is unique [and is called the additive inverse of x,
denoted by -x].
Theorem 1.2 (p. 12): In any vector space V, the following statements are true:
(a) 0x = 0 for each x ∈ V.
(b) (-a)x = -(ax) = a(-x) for each a ∈ F and each x ∈ V.
(c) a0 = 0 for each a ∈ F.
Definition (p. 15): Let V = {0} consist of a single vector 0 and define 0 + 0 = 0 and c0 = 0 for each
scalar c in F. V is called the zero vector space.
Definition (p. 15): A real-valued function f defined on the real line is called an even function if
f(-t) = f(t) for each real number t.
1.3 Subspaces
Definitions (p. 16): A subset W of a vector space V over a field F is called a subspace of V if W is a
vector space over F with the operations of addition and scalar multiplication defined on V. In any
vector space V, note that V and {0} are subspaces. The latter is called the zero subspace of V.
Theorem 1.3 (p. 17): Let V be a vector space and W a subset of V. Then W is a subspace of V if and
only if the following three conditions hold for the operations defined in V.
(a) 0 ∈ W.
(b) x + y ∈ W whenever x ∈ W and y ∈ W. (W is closed under addition.)
(c) cx ∈ W whenever c ∈ F and x ∈ W. (W is closed under scalar multiplication.)
Definitions (pp. 17-18): The transpose A^t of an m × n matrix A is the n × m matrix obtained from A by
interchanging the rows with the columns; that is, (A^t)_ij = A_ji. A symmetric matrix is a matrix A such
that A^t = A. An n × n matrix M is called a diagonal matrix if M_ij = 0 whenever i ≠ j, that is, if all its
nondiagonal entries are zero. The trace of an n × n matrix M, denoted tr(M), is the sum of the
diagonal entries of M; that is, tr(M) = M_11 + M_22 + ... + M_nn.
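A minimal NumPy sketch (my own illustration, not from the book) checking these definitions on a small matrix:

    import numpy as np

    A = np.array([[1, 2], [2, 3]])       # a 2 x 2 matrix
    print(A.T)                            # transpose: (A^t)_ij = A_ji
    print(np.array_equal(A.T, A))         # True, so A is symmetric
    print(np.trace(A))                    # tr(A) = A_11 + A_22 = 4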
Theorem 1.4 (p. 19): Any intersection of subspaces of a vector space V is a subspace of V.
1.4 Linear combinations and systems of linear equations
Definitions (p. 24): Let V be a vector space and S a nonempty subset of V. A vector v ∈ V is called a
linear combination of vectors of S if there exist a finite number of vectors u_1, u_2, ..., u_n in S and
scalars a_1, a_2, ..., a_n in F such that v = a_1 u_1 + a_2 u_2 + ... + a_n u_n. In this case we also say that v is a
linear combination of u_1, u_2, ..., u_n and call a_1, a_2, ..., a_n the coefficients of the linear combination.
For solving a system of linear equations (pp. 25-27) a general method using matrices is discussed in
Chapter 3. For now, we illustrate how to solve a system of linear equations by showing how to
determine if a given vector can be expressed as a linear combination of other vectors. The procedure
uses three types of operations to simplify the original system:
interchanging the order of any two equations in the system;
multiplying any equation in the system by a nonzero constant;
adding a constant multiple of any equation to another equation in the system.
In Section 3.4, we prove that these operations do not change the set of solutions to the original
system. Note that we employed these operations to obtain a system of equations that has the
following properties:
1. The first nonzero coefficient in each equation is one.
2. If an unknown is the first unknown with a nonzero coefficient in some equation, then that
unknown occurs with a zero coefficient in each of the other equations.
3. The first unknown with a nonzero coefficient in any equation has a larger subscript than the
first unknown with a nonzero coefficient in any preceding equation.
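As a numerical illustration of the procedure above (a sketch of my own, with made-up vectors and assuming NumPy), deciding whether v is a linear combination of u_1 and u_2 amounts to solving a linear system for the coefficients:

    import numpy as np

    u1 = np.array([1.0, 0.0, 1.0])
    u2 = np.array([0.0, 1.0, 1.0])
    v  = np.array([2.0, 3.0, 5.0])

    # Solve a1*u1 + a2*u2 = v in the least-squares sense; v is a linear
    # combination of u1 and u2 exactly when the fit is (numerically) exact.
    coeffs, residual, rank, _ = np.linalg.lstsq(np.column_stack([u1, u2]), v, rcond=None)
    print(coeffs)                                          # approximately [2. 3.]
    print(np.allclose(coeffs[0]*u1 + coeffs[1]*u2, v))     # True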
Definition (p. 30): Let S be a nonempty subset of a vector space V. The span of S, denoted span(S), is
the set of all linear combinations of the vectors in S. For convenience, we define span(∅) = {0}.
Theorem 1.5 (p. 30): The span of any subset S of a vector space V is a subspace of V. Moreover, any
subspace of V that contains S must also contain the span of S.
Definition (p. 30): A subset S of a vector space V generates (or spans) V if span(S) = V. In this case,
we also say that the vectors of S generate (or span) V.
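A small sketch (illustrative, not from the book) of checking whether a finite set generates F^n: the set spans R^3 exactly when the matrix having those vectors as rows has rank 3.

    import numpy as np

    S = np.array([[1.0, 0.0, 0.0],
                  [1.0, 1.0, 0.0],
                  [1.0, 1.0, 1.0]])
    print(np.linalg.matrix_rank(S) == 3)   # True: S spans R^3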
1.5 Linear dependence and linear independence
Definition (p. 36): A subset S of a vector space V is called linearly dependent if there exist a finite
number of distinct vectors u_1, u_2, ..., u_n in S and scalars a_1, a_2, ..., a_n, not all zero, such that
a_1 u_1 + a_2 u_2 + ... + a_n u_n = 0.
In this case we also say that the vectors of S are linearly dependent. For a_1 = a_2 = ... = a_n = 0, we get
the trivial representation of 0 as a linear combination of u_1, u_2, ..., u_n. Furthermore, any subset of a
vector space that contains the zero vector is linearly dependent, because 0 = 1 · 0 is a nontrivial
representation of 0 as a linear combination of vectors in the set.
Definition (p. 37): A subset S of a vector space that is not linearly dependent is called linearly
independent. As before, we also say that the vectors of S are linearly independent.
The following facts about linearly independent sets are true in any vector space.
1. The empty set is linearly independent, for linearly dependent sets must be nonempty.
2. A set consisting of a single nonzero vector is linearly independent. For if {u} is linearly
dependent, then au = 0 for some nonzero scalar a. Thus u = a^(-1)(au) = a^(-1)0 = 0.
3. A set is linearly independent if and only if the only representations of 0 as linear combinations
of its vectors are trivial representations.
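A sketch (my own numbers, assuming NumPy) testing linear independence: vectors placed as columns of a matrix are linearly independent exactly when the only solution of Ax = 0 is x = 0, that is, when the rank equals the number of columns.

    import numpy as np

    A = np.column_stack([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # columns u1, u2, u3
    independent = np.linalg.matrix_rank(A) == A.shape[1]
    print(independent)    # False: u3 = 2*u2 - u1 is a nontrivial dependence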
Theorem 1.6 (p. 39): Let V be a vector space, and let S_1 ⊆ S_2 ⊆ V. If S_1 is linearly dependent, then S_2 is
linearly dependent.
Corollary (p. 39): Let V be a vector space, and let S_1 ⊆ S_2 ⊆ V. If S_2 is linearly independent, then S_1 is
linearly independent.
Theorem 1.7 (p. 39): Let S be a linearly independent subset of a vector space V, and let v be a vector in
V that is not in S. Then S ∪ {v} is linearly dependent if and only if v ∈ span(S).
1.6 Bases and dimension
Definition (p. 43): A basis β for a vector space V is a linearly independent subset of V that generates
V. If β is a basis for V, we also say that the vectors of β form a basis for V.
Recalling that span(∅) = {0} and ∅ is linearly independent, we see that ∅ is a basis for the zero
vector space. In F^n the set {e_1, e_2, ..., e_n}, with e_1 = (1, 0, ..., 0), e_2 = (0, 1, 0, ..., 0), ...,
e_n = (0, 0, ..., 0, 1), is called the standard basis for F^n. In P_n(F) the standard basis is the set
{1, x, x^2, ..., x^n}. In P(F) the standard basis is the set {1, x, x^2, ...}, which is not finite.
Theorem 1.8 (p. 43): Let V be a vector space and β = {u_1, u_2, ..., u_n} be a subset of V. Then β is a basis
for V if and only if each v ∈ V can be uniquely expressed as a linear combination of vectors of β, that
is, can be expressed in the form v = a_1 u_1 + a_2 u_2 + ... + a_n u_n for unique scalars a_1, a_2, ..., a_n.
Theorem 1.9 (p. 44): If a vector space V is generated by a finite set S, then some subset of S is a basis
for V. Hence V has a finite basis.
Theorem 1.10, Replacement Theorem (p. 45): Let V be a vector space that is generated by a set G
containing exactly n vectors, and let L be a linearly independent subset of V containing exactly m
vectors. Then m ≤ n and there exists a subset H of G containing exactly n - m vectors such that L ∪ H
generates V.
Corollary 1 (p. 46): Let V be a vector space having a finite basis. Then every basis for V contains the
same number of vectors.
Definitions (pp. 46-47): A vector space is called finite-dimensional if it has a basis consisting of a
finite number of vectors. The unique number of vectors in each basis of V is called the dimension of
V and is denoted by dim(V). A vector space that is not finite-dimensional is called infinite-
dimensional.
Results for some vector spaces (p. 47):
The vector space {0} has dimension zero.
The vector space F^n has dimension n.
The vector space M_m×n(F) has dimension mn.
The vector space P_n(F) has dimension n + 1.
The vector space P(F) is infinite-dimensional.
The vector space P(F) is infinite-dimensional.
Over the field of complex numbers, the vector space of complex numbers has dimension 1.
Over the field of real numbers, the vector space of complex numbers has dimension 2.
Corollary 2 (pp. 47-48): Let V be a vector space with dimension n.
(a) Any finite generating set for V contains at least n vectors, and a generating set for V that
contains exactly n vectors is a basis for V.
(b) Any linearly independent subset of V that contains exactly n vectors is a basis for V.
(c) Every linearly independent subset of V can be extended to a basis for V.
An overview of dimension and its consequences (p. 49)
Particularly the relationships between linearly independent sets, bases, and generating sets.
The dimension of subspaces (p. 50)
Theorem 1.11 (p. 50): Let W be a subspace of a finite-dimensional vector space V. Then W is finite-
dimensional and dim(W) ≤ dim(V). Moreover, if dim(W) = dim(V), then V = W.
Corollary (p. 51): If W is a subspace of a finite-dimensional vector space V, then any basis for W can
be extended to a basis for V.
The Lagrange interpolating formula (pp. 51-53): Corollary 2 of the replacement theorem can be
applied to obtain a useful formula. Let c_0, c_1, ..., c_n be distinct scalars in an infinite field F. The
polynomials f_0(x), f_1(x), ..., f_n(x) defined by
    f_i(x) = [(x - c_0) ··· (x - c_(i-1))(x - c_(i+1)) ··· (x - c_n)]
             / [(c_i - c_0) ··· (c_i - c_(i-1))(c_i - c_(i+1)) ··· (c_i - c_n)]
           = Π_{k≠i} (x - c_k)/(c_i - c_k), the product taken over k = 0, 1, ..., n with k ≠ i,
are called Lagrange polynomials (associated with c_0, c_1, ..., c_n). Note that each f_i(x) is a polynomial of
degree n and hence is in P_n(F). By regarding f_i(x) as a polynomial function f_i: F → F, we see that
    f_i(c_j) = 0 if i ≠ j, and f_i(c_j) = 1 if i = j.
This property of the Lagrange polynomials can be used to show that β = {f_0, f_1, ..., f_n} is a linearly
independent subset of P_n(F). Since the dimension of P_n(F) is n + 1, it follows from Corollary 2 of the
replacement theorem (Theorem 1.10) that β is a basis for P_n(F).
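A short NumPy sketch (my own, with made-up nodes) that builds the Lagrange polynomials for given c_0, ..., c_n and verifies that f_i(c_j) = 1 if i = j and 0 otherwise:

    import numpy as np

    c = np.array([0.0, 1.0, 2.0, 4.0])      # distinct scalars c_0, ..., c_n

    def lagrange(i, x, c):
        # Evaluate f_i(x) = prod over k != i of (x - c_k) / (c_i - c_k).
        terms = [(x - c[k]) / (c[i] - c[k]) for k in range(len(c)) if k != i]
        return np.prod(terms, axis=0)

    # Verify f_i(c_j) = 1 if i = j and 0 otherwise
    table = np.array([[lagrange(i, c[j], c) for j in range(len(c))] for i in range(len(c))])
    print(np.allclose(table, np.eye(len(c))))   # True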
2 Linear transformations and matrices
2.1 Linear transformations, null spaces, and ranges
Definition (pp. 54-55): Let V and W be vector spaces (over F). We call a function T: V → W a linear
transformation from V to W if for all x, y ∈ V and c ∈ F, we have
(a) T(x + y) = T(x) + T(y) and
(b) T(cx) = cT(x).
If the underlying field F is the field of rational numbers, then (a) implies (b), but, in general, (a) and
(b) are logically independent. We often call T linear. The reader should verify the following
properties of a function T: V → W.
1. If T is linear, then T(0) = 0.
2. T is linear if and only if T(cx + y) = cT(x) + T(y) for all x, y ∈ V and c ∈ F.
3. If T is linear, then T(x - y) = T(x) - T(y) for all x, y ∈ V.
4. T is linear if and only if, for x_1, x_2, ..., x_n ∈ V and a_1, a_2, ..., a_n ∈ F, we have
    T(Σ_{i=1}^{n} a_i x_i) = Σ_{i=1}^{n} a_i T(x_i).
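A quick sketch (illustrative, assuming NumPy) of checking property 2 for specific maps: the matrix map T(x) = Ax on R^2 satisfies T(cx + y) = cT(x) + T(y), while a translation such as S(x) = x + (1, 1) does not (it already fails S(0) = 0).

    import numpy as np

    A = np.array([[2.0, 1.0], [0.0, 3.0]])
    T = lambda x: A @ x                        # linear
    S = lambda x: x + np.array([1.0, 1.0])     # not linear

    x, y, c = np.array([1.0, 2.0]), np.array([-1.0, 4.0]), 5.0
    print(np.allclose(T(c*x + y), c*T(x) + T(y)))   # True
    print(np.allclose(S(c*x + y), c*S(x) + S(y)))   # False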
Definition (p. 66): Define T_θ: R^2 → R^2 as follows: T_θ(a_1, a_2) is the vector obtained by rotating
(a_1, a_2) counterclockwise by θ if (a_1, a_2) ≠ (0, 0), and T_θ(0, 0) = (0, 0). T_θ is called the rotation by θ.
2.2 The matrix representation of a linear transformation
Definitions (pp. 79-80): An ordered basis for a finite-dimensional vector space V is a basis for V
endowed with a specific order. Let β = {u_1, u_2, ..., u_n} be an ordered basis for V. For x ∈ V, let
a_1, a_2, ..., a_n be the unique scalars such that x = a_1 u_1 + a_2 u_2 + ... + a_n u_n. The coordinate vector of x
relative to β, denoted [x]_β, is the column vector
    [x]_β = (a_1, a_2, ..., a_n)^t.
Notice that [u_i]_β = e_i in the preceding definition.
Definition (p. 80): Let us now proceed with the promised matrix representation of a linear
transformation. Suppose that V and W are finite-dimensional vector spaces with ordered bases
β = {v_1, v_2, ..., v_n} and γ = {w_1, w_2, ..., w_m}, respectively. Let T: V → W be linear. Then for each j,
1 ≤ j ≤ n, there exist unique scalars a_ij ∈ F, 1 ≤ i ≤ m, such that
    T(v_j) = Σ_{i=1}^{m} a_ij w_i   for 1 ≤ j ≤ n.
Using the notation above, we call the m × n matrix A defined by A_ij = a_ij the matrix representation of
T in the ordered bases β and γ and write A = [T]_β^γ. If V = W and β = γ, we write A = [T]_β. Notice that
the j-th column of A is simply [T(v_j)]_γ.
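A hypothetical worked example (mine, not the book's): the derivative T(f) = f' from P_3(R) to P_2(R) with ordered bases β = {1, x, x^2, x^3} and γ = {1, x, x^2}. Writing each T(v_j) in terms of γ gives the columns of [T]_β^γ.

    import numpy as np

    # T(1) = 0, T(x) = 1, T(x^2) = 2x, T(x^3) = 3x^2, expressed in gamma = {1, x, x^2}.
    # Column j of the matrix representation is [T(v_j)]_gamma.
    T_matrix = np.array([[0, 1, 0, 0],
                         [0, 0, 2, 0],
                         [0, 0, 0, 3]])

    # Check on f(x) = 4 + 3x + 2x^2 + x^3, i.e. [f]_beta = (4, 3, 2, 1)^t.
    f_beta = np.array([4, 3, 2, 1])
    print(T_matrix @ f_beta)    # [3 4 3], i.e. f'(x) = 3 + 4x + 3x^2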
Definition (p. 82): Let T, U: V → W be arbitrary functions, where V and W are vector spaces over F,
and let a ∈ F. We define T + U: V → W by (T + U)(x) = T(x) + U(x) for all x ∈ V, and aT: V → W by
(aT)(x) = aT(x) for all x ∈ V.
Theorem 2.7 (p. 82): Let V and W be vector spaces over a field F, and let T, U: V → W be linear.
(a) For all a ∈ F, aT + U is linear.
(b) Using the operations of addition and scalar multiplication in the preceding definition, the
collection of all linear transformations from V to W is a vector space over F.
Definitions (p. 82): Let V and W be vector spaces over F. We denote the vector space of all linear
transformations from V into W by L(V, W). In the case that V = W, we write L(V) instead of L(V, W).
Theorem 2.8 (pp. 82-83): Let V and W be finite-dimensional vector spaces with ordered bases β and
γ, respectively, and let T, U: V → W be linear transformations. Then
(a) [T + U]_β^γ = [T]_β^γ + [U]_β^γ and
(b) [aT]_β^γ = a[T]_β^γ for all scalars a.
2.3 Composition of linear transformations and matrix multiplication
Theorem 2.9 (p. 82): Let V, W, and Z be vector spaces over the same field F, and let T: V → W and
U: W → Z be linear. Then UT: V → Z is linear. [UT stands for U ∘ T, the composite transformation.]
Theorem 2.10 (p. 87): Let V be a vector space. Let T, U_1, U_2 ∈ L(V). Then
(a) T(U_1 + U_2) = TU_1 + TU_2 and (U_1 + U_2)T = U_1T + U_2T.
(b) T(U_1U_2) = (TU_1)U_2.
(c) TI = IT = T.
(d) a(U_1U_2) = (aU_1)(U_2) = U_1(aU_2) for all scalars a.
Definition (p. 87): Let A be an m × n matrix and B be an n × p matrix. We define the product of A and
B, denoted AB, to be the m × p matrix such that
    (AB)_ij = Σ_{k=1}^{n} A_ik B_kj   for 1 ≤ i ≤ m, 1 ≤ j ≤ p.
Note that (AB)_ij is the sum of products of corresponding entries from the i-th row of A and the j-th
column of B.
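A minimal sketch (illustrative, assuming NumPy) computing the product entrywise from the definition and comparing it with the built-in product:

    import numpy as np

    A = np.array([[1, 2, 3],
                  [4, 5, 6]])          # 2 x 3
    B = np.array([[7, 8],
                  [9, 10],
                  [11, 12]])           # 3 x 2

    m, n = A.shape
    p = B.shape[1]
    AB = np.zeros((m, p), dtype=A.dtype)
    for i in range(m):
        for j in range(p):
            # (AB)_ij = sum over k of A_ik * B_kj
            AB[i, j] = sum(A[i, k] * B[k, j] for k in range(n))

    print(np.array_equal(AB, A @ B))   # True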
Theorem 2.11 (p. 88): Let V, W, and Z be finite-dimensional vector spaces with ordered bases α, β,
and γ, respectively. Let T: V → W and U: W → Z be linear transformations. Then
    [UT]_α^γ = [U]_β^γ [T]_α^β.
Corollary (p. 89): Let V be a finite-dimensional vector space with an ordered basis β. Let T, U ∈ L(V).
Then [UT]_β = [U]_β [T]_β.
Definitions (p. 89): We define the Kronecker delta δ_ij by δ_ij = 1 if i = j and δ_ij = 0 if i ≠ j. Then the n × n
identity matrix I_n is defined by (I_n)_ij = δ_ij.
Theorem 2.12 (p. 89): Let A be an m × n matrix, B and C be n × p matrices, and D and E be q × m
matrices. Then
(a) A(B + C) = AB + AC and (D + E)A = DA + EA.
(b) a(AB) = (aA)B = A(aB) for any scalar a.
(c) I_m A = A = AI_n.
(d) If V is an n-dimensional vector space with an ordered basis β, then [I_V]_β = I_n.
Theorem 2.13 (pp. 90-91): Let A be an m × n matrix and B be an n × p matrix. For each j (1 ≤ j ≤ p) let
u_j and v_j denote the j-th columns of AB and B, respectively. Then
(a) u_j = Av_j
(b) v_j = Be_j, where e_j is the j-th standard vector of F^p.
Theorem 2.14 (p. 91): Let V and W be finite-dimensional vector spaces having ordered bases β and
γ, respectively, and let T: V → W be linear. Then, for each u ∈ V, we have
    [T(u)]_γ = [T]_β^γ [u]_β.
Definition (p. 92): Let A be an m × n matrix with entries from a field F. We denote by L_A the mapping
L_A: F^n → F^m defined by L_A(x) = Ax (the matrix product of A and x) for each column vector x ∈ F^n. We
call L_A a left-multiplication transformation.
Theorem 2.15 (p. 93): Let A be an m × n matrix with entries from F. Then the left-multiplication
transformation L_A: F^n → F^m is linear. Furthermore, if B is any other m × n matrix (with entries from F)
and β and γ are the standard ordered bases for F^n and F^m, respectively, then we have the following
properties.
(a) [L_A]_β^γ = A.
(b) L_A = L_B if and only if A = B.
(c) L_(A+B) = L_A + L_B and L_(aA) = aL_A for all a ∈ F.
(d) If T: F^n → F^m is linear, then there exists a unique m × n matrix C such that T = L_C. In fact,
    C = [T]_β^γ.
(e) If E is an n × p matrix, then L_(AE) = L_A L_E.
(f) If m = n, then L_(I_n) = I_(F^n).
Theorem 2.16 (p. 93): Let A, B, and C be matrices such that A(BC) is defined. Then (AB)C is also
defined and A(BC) = (AB)C; that is, matrix multiplication is associative.
Definitions (pp. 94-95): An incidence matrix is a square matrix in which all the entries are either zero
or one and, for convenience, all the diagonal entries are zero. A maximal collection of three or more
people with the property that any two can send to each other is called a clique. A relation among a
group of people is called a dominance relation if the associated incidence matrix A has the property
that, for all distinct pairs i and j, A_ij = 1 if and only if A_ji = 0; that is, given any two people, exactly one
of them dominates (or, using the terminology of our first example, can send a message to) the other.
2.4 Invertibility and isomorphisms
Definition (pp. 99-100): Let V and W be vector spaces, and let T: V → W be linear. A function
U: W → V is said to be an inverse of T if TU = I_W and UT = I_V. If T has an inverse, then T is said to be
invertible. As noted in Appendix B, if T is invertible, then the inverse of T is unique and is denoted
by T^(-1).
The following facts hold for invertible functions T and U.
1. (TU)^(-1) = U^(-1)T^(-1).
2. (T^(-1))^(-1) = T; in particular, T^(-1) is invertible.
We often use the fact that a function is invertible if and only if it is both one-to-one and onto. We
can therefore restate Theorem 2.5 as follows.
3. Let T: V → W be a linear transformation, where V and W are finite-dimensional spaces of equal
dimension. Then T is invertible if and only if rank(T) = dim(V).
Theorem 2.17 (p. 100): Let V and W be vector spaces, and let T: V → W be linear and invertible. Then
T^(-1): W → V is linear.
Definition (p. 100): Let A be an n × n matrix. Then A is invertible if there exists an n × n matrix B such
that AB = BA = I. The matrix B is called the inverse of A and is denoted by A^(-1).
Lemma (p. 101): Let T be an invertible linear transformation from V to W. Then V is finite-
dimensional if and only if W is finite-dimensional. In this case, dim(V) = dim(W).
Theorem 2.18 (p. 101): Let V and W be finite-dimensional vector spaces with ordered bases β and γ,
respectively. Let T: V → W be linear. Then T is invertible if and only if [T]_β^γ is invertible. Furthermore,
[T^(-1)]_γ^β = ([T]_β^γ)^(-1).
Corollary 1 (p. 102): Let V be a finite-dimensional vector space with an ordered basis β, and let
T: V → V be linear. Then T is invertible if and only if [T]_β is invertible. Furthermore, [T^(-1)]_β = ([T]_β)^(-1).
Corollary 2 (p. 102): Let A be an n × n matrix. Then A is invertible if and only if L_A is invertible.
Furthermore, (L_A)^(-1) = L_(A^(-1)).
Definitions (p. 102): Let V and W be vector spaces. We say that V is isomorphic to W if there exists a
linear transformation T: V → W that is invertible. Such a linear transformation is called an
isomorphism from V onto W.
Theorem 2.19 (p. 103): Let V and W be finite-dimensional vector spaces (over the same field F). Then
V is isomorphic to W if and only if dim(V) = dim(W).
Corollary (p. 103): Let V be a vector space over F. Then V is isomorphic to F^n if and only if dim(V) = n.
Theorem 2.20 (p. 103): Let V and W be finite-dimensional vector spaces over F of dimensions n and
m, respectively, and let β and γ be ordered bases for V and W, respectively. Then the function
Φ: L(V, W) → M_m×n(F), defined by Φ(T) = [T]_β^γ for T ∈ L(V, W), is an isomorphism.
Definition (p. 104): Let β be an ordered basis for an n-dimensional vector space V over the field F.
The standard representation of V with respect to β is the function φ_β: V → F^n defined by
φ_β(x) = [x]_β for each x ∈ V.
Theorem 2.21 (p. 104): For any finite-dimensional vector space V with ordered basis β, φ_β is an
isomorphism.
Figure 2.2 (p. 105): The figure shows (top) T: V → W, (left) φ_β: V → F^n, (bottom) L_A: F^n → F^m, and
(right) φ_γ: W → F^m, where A = [T]_β^γ. There are two compositions of linear transformations that map
V into F^m:
1. Map V into F^n with φ_β and follow this transformation with L_A.
2. Map V into W with T and follow it by φ_γ.
These two compositions are equal: L_A φ_β = φ_γ T.
2.5 The change of coordinate matrix
Theorem 2.22 (p. 111): Let β and β' be two ordered bases for a finite-dimensional vector space V,
and let Q = [I_V]_β'^β. Then
(a) Q is invertible.
(b) For any v ∈ V, [v]_β = Q[v]_β'.
The matrix Q = [I_V]_β'^β is called a change of coordinate matrix; we say that Q changes β'-coordinates
into β-coordinates.
Corollary (p. 115): Let A ∈ M_n×n(F), and let γ be an ordered basis for F^n. Then [L_A]_γ = Q^(-1)AQ, where
Q is the n × n matrix whose j-th column is the j-th vector of γ.
Definition (p. 115): Let A and B be matrices in M_n×n(F). We say that B is similar to A if there exists an
invertible matrix Q such that B = Q^(-1)AQ.
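A sketch (my own numbers, assuming NumPy) illustrating the corollary: with γ the ordered basis whose vectors are the columns of Q, the matrix Q^(-1)AQ represents L_A in γ-coordinates, so the two representations are similar.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [0.0, 3.0]])
    Q = np.array([[1.0, 1.0],
                  [0.0, 1.0]])          # columns form another ordered basis for R^2

    B = np.linalg.inv(Q) @ A @ Q         # representation of L_A in the new basis
    v = np.array([3.0, 4.0])

    # Change of coordinates: [v]_beta = Q [v]_gamma, so [v]_gamma = Q^(-1) [v]_beta.
    v_new = np.linalg.inv(Q) @ v
    # Applying B in the new coordinates agrees with applying A and then converting.
    print(np.allclose(B @ v_new, np.linalg.inv(Q) @ (A @ v)))   # True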
3 Elementary matrix operations and systems of linear equations
3.1 Elementary matrix operations and elementary matrices
Definitions (p. 148): Let A be an m × n matrix. Any one of the following three operations on the rows
[columns] of A is called an elementary row [column] operation:
(1) interchanging any two rows [columns] of A;
(2) multiplying any row [column] of A by a nonzero scalar;
(3) adding any scalar multiple of a row [column] of A to another row [column].
Any of these three operations is called an elementary operation. Elementary operations are of
type 1, type 2, or type 3 depending on whether they are obtained by (1), (2), or (3).
Definition (p. 149): An n × n elementary matrix is a matrix obtained by performing an elementary
operation on I_n. The elementary matrix is said to be of type 1, 2, or 3 according to whether the
elementary operation performed on I_n is a type 1, 2, or 3 operation, respectively.
Theorem 3.1 (p. 149): Let A ∈ M_m×n(F), and suppose that B is obtained from A by performing an
elementary row [column] operation. Then there exists an m × m [n × n] elementary matrix E such
that B = EA [B = AE]. In fact, E is obtained from I_m [I_n] by performing the same elementary row
[column] operation as that which was performed on A to obtain B. Conversely, if E is an elementary
m × m [n × n] matrix, then EA [AE] is a matrix obtained from A by performing the same elementary
row [column] operation as that which produces E from I_m [I_n].
Theorem 3.2 (p. 150): Elementary matrices are invertible, and the inverse of an elementary matrix is
an elementary matrix of the same type.
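A small sketch (illustrative, assuming NumPy) of Theorem 3.1 for a type 3 row operation: performing the operation on I_3 gives E, and EA equals the matrix obtained by performing the same operation directly on A.

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 10.0]])

    # Type 3 elementary matrix: add -4 times row 1 to row 2 of I_3.
    E = np.eye(3)
    E[1, 0] = -4.0

    B = A.copy()
    B[1, :] = B[1, :] - 4.0 * B[0, :]     # the same row operation applied to A

    print(np.allclose(E @ A, B))          # True
    print(np.linalg.det(E) != 0)          # True: elementary matrices are invertible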
3.2 The rank of a matrix and matrix inverses
Definition (p. 152): If A ∈ M_m×n(F), we define the rank of A, denoted rank(A), to be the rank of the
linear transformation L_A: F^n → F^m.
Theorem 3.3 (p. 152): Let T: V → W be a linear transformation between finite-dimensional vector
spaces, and let β and γ be ordered bases for V and W, respectively. Then rank(T) = rank([T]_β^γ).
Theorem 3.4 (p. 153): Let A be an m × n matrix. If P and Q are invertible m × m and n × n matrices,
respectively, then
(a) rank(AQ) = rank(A),
(b) rank(PA) = rank(A)
and therefore,
(c) rank(PAQ) = rank(A).
Corollary (p. 153): Elementary row and column operations on a matrix are rank preserving.
Theorem 3.5 (p. 153): The rank of any matrix equals the maximum number of its linearly
independent columns; that is, the rank of a matrix is the dimension of the subspace generated by its
columns.
Theorem 3.6 (p. 155): Let A be an m × n matrix of rank r. Then r ≤ m, r ≤ n, and by means of a finite
number of elementary row and column operations, A can be transformed into the matrix
    D = ( I_r  O_1 )
        ( O_2  O_3 ),
where O_1, O_2, and O_3 are zero matrices. Thus D_ii = 1 for i ≤ r and D_ij = 0 otherwise.
Corollary 1 (p. 158): Let A be an m × n matrix of rank r. Then there exist invertible matrices B and C
of sizes m × m and n × n, respectively, such that D = BAC, where
    D = ( I_r  O_1 )
        ( O_2  O_3 )
is the m × n matrix in which O_1, O_2, and O_3 are zero matrices.
Corollary 2 (p. 158): Let A be an m × n matrix. Then
(a) rank(A^t) = rank(A).
(b) The rank of any matrix equals the maximum number of its linearly independent rows; that is,
the rank of a matrix is the dimension of the subspace generated by its rows.
(c) The rows and columns of any matrix generate subspaces of the same dimension, numerically
equal to the rank of the matrix.
Corollary 3 (p. 159): Any invertible matrix is a product of elementary matrices.
Theorem 3.7 (p. 159): Let T: V → W and U: W → Z be linear transformations on finite-dimensional
vector spaces V, W, and Z, and let A and B be matrices such that the product AB is defined. Then
(a) rank(UT) ≤ rank(U).
(b) rank(UT) ≤ rank(T).
(c) rank(AB) ≤ rank(A).
(d) rank(AB) ≤ rank(B).
The inverse of a matrix (pp. 161-162)
Definition (p. 161): Let A and B be m × n and m × p matrices, respectively. By the augmented matrix
(A|B), we mean the m × (n + p) matrix (A B), that is, the matrix whose first n columns are the
columns of A, and whose last p columns are the columns of B.
Computing the inverse of a matrix (pp. 161-162): If A is an invertible n × n matrix, and the matrix
(A|I_n) is transformed into a matrix of the form (I_n|B) by means of a finite number of elementary row
operations, then B = A^(-1).
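A sketch of my own (assuming NumPy) that row-reduces the augmented matrix (A|I_n) with Gauss-Jordan elimination, in the spirit of the method just stated, and reads off A^(-1):

    import numpy as np

    def inverse_via_row_reduction(A):
        # Reduce (A | I) to (I | B) with elementary row operations; then B = A^(-1).
        n = A.shape[0]
        M = np.hstack([A.astype(float), np.eye(n)])
        for col in range(n):
            pivot = col + np.argmax(np.abs(M[col:, col]))   # row interchange (type 1)
            M[[col, pivot]] = M[[pivot, col]]
            M[col] /= M[col, col]                           # scale pivot row to 1 (type 2)
            for row in range(n):
                if row != col:
                    M[row] -= M[row, col] * M[col]          # clear the column (type 3)
        return M[:, n:]

    A = np.array([[2.0, 1.0], [5.0, 3.0]])
    print(inverse_via_row_reduction(A))                               # [[ 3. -1.] [-5.  2.]]
    print(np.allclose(inverse_via_row_reduction(A) @ A, np.eye(2)))   # True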
3.3 Systems of linear equations - theoretical aspects
Definitions (p. 169): The system (S) of m equations in n unknowns is called a system of m
linear equations in n unknowns over the field F. The m × n matrix A is called the coefficient
matrix of the system (S). If we let x be the vector of n variables and b be the vector of the m
equation outcomes, then the system (S) may be rewritten as a single matrix equation Ax = b. A
solution to the system (S) is an n-tuple s ∈ F^n such that As = b. The set of all solutions to the system
(S) is called the solution set of the system. System (S) is called consistent if its solution set is
nonempty; otherwise it is called inconsistent.
Definitions (p. 171): A system Ax = b of m equations in n unknowns is said to be homogeneous if
b = 0. Otherwise the system is said to be nonhomogeneous.
Theorem 3.8 (p. 171): Let Ax = 0 be a homogeneous system of m linear equations in n unknowns
over a field F. Let K denote the set of all solutions to Ax = 0. Then K = N(L_A); hence K is a subspace of
F^n of dimension n - rank(L_A) = n - rank(A).
Corollary (p. 171): If m < n, the system Ax = 0 has a nonzero solution.
Theorem 3.9 (p. 172): Let K be the solution set of a system of linear equations Ax = b, and let K_H be
the solution set of the corresponding homogeneous system Ax = 0. Then for any solution s to Ax = b,
    K = {s} + K_H = {s + k : k ∈ K_H}.
Theorem 3.10 (p. 174): Let Ax = b be a system of n linear equations in n unknowns. If A is invertible,
then the system has exactly one solution, namely A^(-1)b. Conversely, if the system has exactly one
solution, then A is invertible.
Definition (p. 174): The matrix (A|b) is called the augmented matrix of the system Ax = b.
Theorem 3.11 (p. 174): Let Ax = b be a system of linear equations. Then the system is consistent if
and only if rank(A) = rank(A|b).
An application in economics (pp. 176-179)
Definitions (pp. 176-177): A model of an (economic) system in which no commodities either enter or
leave the system is referred to as a closed model. The system can be written as Ap = p. In this
context A is called the input-output (or consumption) matrix, and Ap = p is called the equilibrium
condition. For vectors b and c in R^n, we use the notation b ≥ c [b > c] to mean b_i ≥ c_i [b_i > c_i] for all i.
The vector b is called nonnegative [positive] if b ≥ 0 [b > 0].
Theorem 3.12 (p. 177): Let A be an n × n input-output matrix having the form
    A = ( B  C )
        ( D  E ),
where D is a 1 × (n - 1) positive vector and C is an (n - 1) × 1 positive vector. Then (I - A)x = 0 has a
one-dimensional solution set that is generated by a nonnegative vector.
Definition (p. 178): In an open model of an (economic) system we assume that there is an outside
demand for commodities produced [and for this open model also an equilibrium condition can be
given].
3.4 Systems of linear equations - computational aspects
Definition (p. 182): Two systems of linear equations are called equivalent if they have the same
solution set.
Theorem 3.13 (p. 182): Let Ax = b be a system of m linear equations in n unknowns, and let C be an
invertible m m matrix. Then the system (CA)x = Cb is equivalent to Ax = b.
Corollary (p. 182): Let Ax = b be a system of m linear equations in n unknowns. If (A'|b') is obtained
from (A|b) by a finite number of elementary row operations, then the system A'x = b' is equivalent
to the original system.
Definition (p. 185): A matrix is said to be in reduced row echelon form if the following three
conditions are satisfied.
(a) Any row containing a nonzero entry precedes any row in which all the entries are zero (if any).
(b) The first nonzero entry in each row is the only nonzero entry in its column.
(c) The first nonzero entry in each row is 1 and it occurs in a column to the right of the first
nonzero entry in the preceding row.
Gaussian elimination (p. 186) consists of two separate parts:
1. In the forward pass (steps 1-5), the augmented matrix is transformed into an upper triangular
matrix in which the first nonzero entry of each row is 1, and it occurs in a column to the right
of the first nonzero entry of each preceding row.
2. In the backward pass or back-substitution (steps 6-7), the upper triangular matrix is
transformed into reduced row echelon form by making the first nonzero entry of each row the
only nonzero entry of its column.
Theorem 3.14 (p. 187): Gaussian elimination transforms any matrix into its reduced row echelon
form.
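A compact sketch of my own (assuming NumPy, and making no claim to match the book's exact step numbering) that carries an augmented matrix to reduced row echelon form:

    import numpy as np

    def rref(M, tol=1e-12):
        # Return the reduced row echelon form of M using elementary row operations.
        M = M.astype(float).copy()
        rows, cols = M.shape
        r = 0
        for c in range(cols):
            if r == rows:
                break
            pivot = r + np.argmax(np.abs(M[r:, c]))
            if abs(M[pivot, c]) < tol:
                continue                      # no pivot in this column
            M[[r, pivot]] = M[[pivot, r]]     # interchange rows
            M[r] /= M[r, c]                   # leading entry becomes 1
            for i in range(rows):
                if i != r:
                    M[i] -= M[i, c] * M[r]    # zero out the rest of the column
            r += 1
        return M

    aug = np.array([[1.0, 2.0, -1.0, 3.0],
                    [2.0, 4.0,  1.0, 9.0],
                    [1.0, 2.0,  2.0, 6.0]])   # the augmented matrix (A | b)
    print(rref(aug))                          # [[1. 2. 0. 4.] [0. 0. 1. 1.] [0. 0. 0. 0.]]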
Theorem 3.15 (p. 189): Let Ax = b be a system of r nonzero equations in n unknowns. Suppose that
rank(A) = rank(A|b) and that (A|b) is in reduced row echelon form. Then
(a) rank(A) = r.
(b) If the general solution obtained by the procedure above is of the form
    s = s_0 + t_1 u_1 + t_2 u_2 + ... + t_(n-r) u_(n-r),
then {u_1, u_2, ..., u_(n-r)} is a basis for the solution set of the corresponding homogeneous system
and s_0 is a solution to the original system.
An interpretation of the reduced row echelon form (p. 190)
Theorem 3.16 (p. 191): Let A be an m × n matrix of rank r, where r > 0, and let B be the reduced row
echelon form of A. Then
(a) The number of nonzero rows in B is r.
(b) For each i = 1, 2, ..., r, there is a column b_(j_i) of B such that b_(j_i) = e_i.
(c) The columns of A numbered j_1, j_2, ..., j_r are linearly independent.
(d) For each k = 1, 2, ..., n, if column k of B is d_1 e_1 + d_2 e_2 + ... + d_r e_r, then column k of A is
    d_1 a_(j_1) + d_2 a_(j_2) + ... + d_r a_(j_r).
Corollary (p. 191): The reduced row echelon form of a matrix is unique.
4 Determinants
4.1 Determinants of order 2
Definition (p. 200): If
    A = ( a  b )
        ( c  d )
is a 2 × 2 matrix with entries from a field F, then we define the determinant of A, denoted det(A) or
|A|, to be the scalar ad - bc.
Theorem 4.1 (p. 200): The function det: M_2×2(F) → F is a linear function of each row of a 2 × 2 matrix
when the other row is held fixed. That is, if u, v, and w are in F^2 and k is a scalar, then
    det [u + kv ; w] = det [u ; w] + k det [v ; w]
and
    det [w ; u + kv] = det [w ; u] + k det [w ; v],
where [x ; y] denotes the 2 × 2 matrix with rows x and y.
Theorem 4.2 (p. 201): Let A ∈ M_2×2(F). Then the determinant of A is nonzero if and only if A is
invertible. Moreover, if A is invertible, then
    A^(-1) = (1/det(A)) (  A_22  -A_12 )
                        ( -A_21   A_11 ).
The area of a parallelogram (pp. 202-203)
By the angle between two vectors in R^2, we mean the angle with measure θ (0 ≤ θ ≤ π) that is
formed by the vectors having the same magnitude and direction as the given vectors but
emanating from the origin.
If β = {u, v} is an ordered basis for R^2, we define the orientation of β to be the real number
    O[u ; v] = det [u ; v] / |det [u ; v]|.
A coordinate system {u, v} is called right-handed if u can be rotated in a counterclockwise
direction through an angle θ (0 < θ < π) to coincide with v. Otherwise {u, v} is called a left-
handed system.
An ordered set {u, v} in R^2, when regarded as arrows emanating from the origin of R^2, forms a
parallelogram with u and v as adjacent sides: the parallelogram determined by u and v.
For the area of the parallelogram determined by u and v it can be proved that
    A[u ; v] = O[u ; v] · det [u ; v] = |det [u ; v]|.
4.2 Determinants of order n
Definitions (pp. 209-210): Let A ∈ M_n×n(F). If n = 1, so that A = (A_11), we define det(A) = A_11. For
n ≥ 2, we define det(A) recursively as
    det(A) = Σ_{j=1}^{n} (-1)^(1+j) A_1j det(Ã_1j),
where Ã_ij denotes the (n - 1) × (n - 1) matrix obtained from A by deleting row i and column j.
The scalar det(A) is called the determinant of A and is also denoted by |A|.
The scalar
    (-1)^(i+j) det(Ã_ij)
is called the cofactor of the entry of A in row i, column j.
With this cofactor denoted as c_ij, the above formula for the determinant can be expressed as
    det(A) = A_11 c_11 + A_12 c_12 + ... + A_1n c_1n,
which is called the cofactor expansion along the first row of A.
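A recursive sketch (illustrative, assuming NumPy) of the cofactor expansion along the first row; fine for small matrices, though far slower than elimination-based methods:

    import numpy as np

    def det_cofactor(A):
        # Determinant by cofactor expansion along the first row.
        n = A.shape[0]
        if n == 1:
            return A[0, 0]
        total = 0.0
        for j in range(n):
            minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)   # delete row 1, column j
            total += (-1) ** j * A[0, j] * det_cofactor(minor)      # (-1)^j matches (-1)^(1+j) for 1-based j
        return total

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0],
                  [7.0, 8.0, 10.0]])
    print(det_cofactor(A), np.linalg.det(A))   # both -3.0 (up to rounding)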
Theorem 4.3 (p. 212): The determinant of an n × n matrix is a linear function of each row when the
remaining rows are held fixed. That is, for 1 ≤ r ≤ n, we have
    det [a_1 ; ... ; a_(r-1) ; u + kv ; a_(r+1) ; ... ; a_n]
        = det [a_1 ; ... ; a_(r-1) ; u ; a_(r+1) ; ... ; a_n] + k det [a_1 ; ... ; a_(r-1) ; v ; a_(r+1) ; ... ; a_n],
where [x_1 ; ... ; x_n] denotes the n × n matrix with rows x_1, ..., x_n.
Corollary (p. 213): If A ∈ M_n×n(F) has a row consisting entirely of zeros, then det(A) = 0.
Lemma (p. 213): Let B ∈ M_n×n(F), where n ≥ 2. If row i of B equals e_k for some k (1 ≤ k ≤ n), then
det(B) = (-1)^(i+k) det(B̃_ik).
Theorem 4.4 (p. 215): The determinant of a square matrix can be evaluated by cofactor expansion
along any row. That is, if A ∈ M_n×n(F), then for any integer i (1 ≤ i ≤ n),
    det(A) = Σ_{j=1}^{n} (-1)^(i+j) A_ij det(Ã_ij).
Corollary (p. 215): If A ∈ M_n×n(F) has two identical rows, then det(A) = 0.
Theorem 4.5 (p. 216): If A ∈ M_n×n(F), and B is a matrix obtained from A by interchanging any two
rows of A, then det(B) = -det(A).
Theorem 4.6 (p. 216): Let A ∈ M_n×n(F), and let B be a matrix obtained by adding a multiple of one
row of A to another row of A. Then det(B) = det(A).
Corollary (p. 217): If A ∈ M_n×n(F) has rank less than n, then det(A) = 0.
4.3 Properties of determinants
Theorem 4.7 (p. 223): For any A, B ∈ M_n×n(F), det(AB) = det(A) · det(B).
Corollary (p. 223): A matrix A ∈ M_n×n(F) is invertible if and only if det(A) ≠ 0. Furthermore, if A is
invertible, then det(A^(-1)) = 1/det(A).
Theorem 4.8 (p. 224): For any A ∈ M_n×n(F), det(A^t) = det(A).
Theorem 4.9 (Cramer's Rule) (p. 224): Let Ax = b be the matrix form of a system of n linear equations
in n unknowns, where x = (x_1, x_2, ..., x_n)^t. If det(A) ≠ 0, then this system has a unique solution, and for
each k (k = 1, 2, ..., n),
    x_k = det(M_k) / det(A),
where M_k is the n × n matrix obtained from A by replacing column k of A by b.
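A short sketch (my own numbers, assuming NumPy) applying Cramer's rule and checking the result against a direct solve:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([3.0, 5.0])

    x = np.empty(2)
    for k in range(2):
        M_k = A.copy()
        M_k[:, k] = b                 # replace column k of A by b
        x[k] = np.linalg.det(M_k) / np.linalg.det(A)

    print(x)                                        # [0.8 1.4]
    print(np.allclose(x, np.linalg.solve(A, b)))    # True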
It is possible to interpret the determinant of a matrix A ∈ M_n×n(R) geometrically (p. 226): if the rows
of A are a_1, a_2, ..., a_n, respectively, then |det(A)| is the n-dimensional volume of the parallelepiped
having the vectors a_1, a_2, ..., a_n as adjacent sides.
Definition (p. 231): The classical adjoint of a square matrix A is the transpose of the matrix whose ij-
entry is the ij-cofactor of A.
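A sketch (illustrative, assuming NumPy) building the classical adjoint from cofactors and checking the identity A · adj(A) = det(A) · I, which is how the adjoint relates to A^(-1) when det(A) ≠ 0:

    import numpy as np

    def classical_adjoint(A):
        # Transpose of the matrix whose ij-entry is the ij-cofactor of A.
        n = A.shape[0]
        C = np.empty_like(A, dtype=float)
        for i in range(n):
            for j in range(n):
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)   # ij-cofactor
        return C.T

    A = np.array([[1.0, 2.0, 3.0],
                  [0.0, 1.0, 4.0],
                  [5.0, 6.0, 0.0]])
    print(np.allclose(A @ classical_adjoint(A), np.linalg.det(A) * np.eye(3)))   # True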
4.4 Summary - Important facts about determinants
Properties of the Determinant (p. 234)
1. If B is a matrix obtained by interchanging any two rows or interchanging any two columns of
an n × n matrix A, then det(B) = -det(A).
2. If B is a matrix obtained by multiplying each entry of some row or column of an n × n matrix A
by a scalar k, then det(B) = k · det(A).
3. If B is a matrix obtained from an n × n matrix A by adding a multiple of row i to row j or a
multiple of column i to column j for i ≠ j, then det(B) = det(A).