MA2101 Chapter 1
MA2101 builds on modules you have already taken. That means: many of the topics concern things you have already seen.
But what you may not realise is that Linear Algebra is actually a HUGE subject, which
can be (and is) thought about in many different ways. I want to show you a completely
new way of thinking about things you already know, a way related to geometry. (There are
still other, equally respectable ways; see for example MA2101S.) The ultimate objective is
to try to stop you from thinking that there is only one way to do mathematics!
In this first chapter, I want to give a SURVEY, without too much abstraction, of
the main topics in linear algebra. The objective is to understand what linear algebra is
REALLY about and to begin to answer the question: “Why should we care about linear
algebra?” (This is called MOTIVATION, and it’s both incredibly important and, alas,
incredibly neglected.)
The next chapters will be VERY different — there we will ask the deep questions, emphasising ABSTRACTION. For most of you, maybe all, some of this material will look unfamiliar at first.
BTW this is a pure maths course; I will be a bit sloppy in this chapter devoted to MOTIVATION, but in later chapters we will be rigorous: THEOREM, LEMMA, PROOF......
Then we go back to studying matrices and the special forms you can get them to take. We will discuss a big Theorem (“Jordan Canonical Form”) which will lead to a much better understanding of what a linear transformation does to a vector space! After that we will study BILINEAR FORMS, which are just a generalisation of the familiar scalar product of vectors, very important in geometry. And then we will confront the most mysterious object in Linear Algebra, the dreaded DETERMINANT, and explain what that thing is REALLY about. Expect the unexpected. (Part of the purpose of this course is to train you to do just that.)
So relax and cast aside your preconceptions as to what linear algebra is, and (try to) enjoy the ride.....(NB: because this Chapter is partly revision, I will go fast. Things slow down in later chapters.)
You know what a function is - it’s a RULE which turns NUMBERS INTO OTHER NUMBERS: f(x) = x² means “please turn 3 into 9, 12 into 144 and so on”.
In the same way, a TRANSFORMATION of space is a rule which turns VECTORS INTO OTHER VECTORS. For example, “please rotate all 3-dimensional vectors through an angle of 90° clockwise around the z-axis”. A LINEAR TRANSFORMATION T is one that ALSO satisfies these rules: if c is any scalar, and u⃗ and v⃗ are vectors, then
T(c u⃗) = c T(u⃗)   and   T(u⃗ + v⃗) = T(u⃗) + T(v⃗).
Recall that a straight line is a mapping that sends a real scalar t to the vector u⃗ + t v⃗, where u⃗ and v⃗ are given vectors. When we let a linear transformation T act on this, we get (by the above rules) T u⃗ + t T v⃗. But this is just another straight line. REMEMBER: a linear transformation turns straight lines into other straight lines! So in a sense, linear algebra is a branch of geometry! (Note: the converse is true: any mapping of three-dimensional space that always sends straight lines to straight lines is a linear transformation, possibly combined with a shift.)
EXAMPLE: Let I be the rule I u⃗ = u⃗ for all u⃗. You can check that I is linear! It is called the IDENTITY TRANSFORMATION.
EXAMPLE: Let D be the rule D u⃗ = 2 u⃗ for all u⃗. Then
D(c u⃗) = 2(c u⃗) = c(2 u⃗) = c D u⃗,
D(u⃗ + v⃗) = 2(u⃗ + v⃗) = 2 u⃗ + 2 v⃗ = D u⃗ + D v⃗   → LINEAR!
Note: Usually we write D(u⃗) as just D u⃗.
EXAMPLE: Let u⃗ be a unit vector in any number of dimensions, let α be any real number, and let S^α_u⃗ be the rule
S^α_u⃗ : v⃗ → v⃗ + (α − 1)(u⃗ · v⃗) u⃗,
for any vector v⃗. Here the dot, ·, denotes the dot product, which (using either definition) makes sense in any number of dimensions.
You can readily verify that this is a linear transformation. If you let it act on any vector perpendicular to u⃗, then it has no effect on that vector. If it acts on any vector parallel to u⃗, it just stretches that vector by a factor of α. We will call this transformation a STRETCHING transformation in the direction of u⃗, with stretching factor α. If α is negative, the “stretching” just reverses the direction as well as changing the length; the things we normally call “reflections” are “stretches” of that sort (or products of such stretches in higher dimensions). If α is zero, then everything in the direction of u⃗ is crushed to zero size, and this is also a sort of stretching or extreme compression.
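If you like, you can check all of this on a computer. Here is a minimal sketch in Python with numpy (the helper name stretch is mine, not part of the notes); it uses the fact that the matrix of S^α_u⃗ is I + (α − 1) u⃗ u⃗ᵀ, since (u⃗ u⃗ᵀ) v⃗ = (u⃗ · v⃗) u⃗.

import numpy as np

def stretch(u, alpha):
    # matrix of the stretching transformation v -> v + (alpha - 1)(u . v) u
    u = np.asarray(u, dtype=float)
    return np.eye(len(u)) + (alpha - 1.0) * np.outer(u, u)

S = stretch([1.0, 0.0, 0.0], 3.0)        # stretch by 3 in the x direction
print(S @ np.array([1.0, 0.0, 0.0]))     # parallel to u: stretched to [3. 0. 0.]
print(S @ np.array([0.0, 1.0, 0.0]))     # perpendicular to u: unchanged, [0. 1. 0.]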
EXAMPLE: A ROTATION is a map of space to itself that preserves all angles and
lengths but which isn’t a reflection. Such a map turns straight lines into straight lines so
it is linear.
Notice that so far I have not even mentioned the word “matrix”! Linear transformations are MAPPINGS; they are things that do something to space.
So why do people tend to think that Linear Algebra is all about matrices?
The unit vectors î and ĵ define a unit square. Let’s call this the BASIC BOX in two dimensions. Similarly, î, ĵ, and k̂ define the BASIC
BOX in 3 dimensions.
Now let T be any linear transformation. You know that any 2-dimensional vector can be
written as aî + bĵ, for some numbers a and b. So for any vector, we have
T(aî + bĵ) = a T î + b T ĵ.
This formula tells us something very important: IF I KNOW WHAT T DOES TO î and
ĵ, THEN I KNOW EVERYTHING ABOUT T - because now I can tell you what T does
to ANY vector.
EXAMPLE: Suppose T î = î + ¼ ĵ and T ĵ = ¼ î + ĵ. What is T(2î + 3ĵ)?
Answer: T(2î + 3ĵ) = 2 T î + 3 T ĵ = 2(î + ¼ ĵ) + 3(¼ î + ĵ) = 2î + ½ ĵ + ¾ î + 3ĵ = 11/4 î + 7/2 ĵ.
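(A quick machine check of that arithmetic, as a sketch in Python with numpy; the columns of the array are T î and T ĵ, exactly as described just below.)

import numpy as np

T = np.array([[1.0, 0.25],        # first column is T(i-hat), second is T(j-hat)
              [0.25, 1.0]])
print(T @ np.array([2.0, 3.0]))   # [2.75 3.5], i.e. 11/4 i-hat + 7/2 j-hat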
Since T î and T ĵ tell me everything I need to know, this means that I can tell you everything about T just by giving you the four numbers that make up T î and T ĵ. We record them by putting the components of T î down the first column, and the components of T ĵ down the second column, of a 2 by 2 array: this array is called the MATRIX of T relative to î, ĵ.
EXAMPLE: Let T be the same transformation as above, T(î) = î + ¼ ĵ and T(ĵ) = ¼ î + ĵ.
The basic box has been squashed a bit! Pictures of WHAT T DOES TO THE BASIC BOX are a very useful way of seeing what T does.
EXAMPLE: If D is the transformation D u⃗ = 2 u⃗, then the Basic Box just gets expanded by a factor of 2 in every direction.
EXAMPLE: Let I be the identity transformation. Then
I î = î = [ 1 ]
          [ 0 ],
I ĵ = ĵ = [ 0 ]
          [ 1 ],
so the matrix of the identity transformation relative to î, ĵ is
[ 1  0 ]
[ 0  1 ].
EXAMPLE: Remember the stretching transformation S^α_u⃗. Let’s consider two dimensions and take the special case S^3_î. It maps î to 3î, and it maps ĵ to itself. So the matrix is
[ 3  0 ]
[ 0  1 ].
Similarly, the matrix of S^5_ĵ is
[ 1  0 ]
[ 0  5 ].
EXAMPLE: If D u⃗ = 2 u⃗, then
D î = [ 2 ]
      [ 0 ],
D ĵ = [ 0 ]
      [ 2 ],
so the matrix of D relative to î, ĵ is
[ 2  0 ]
[ 0  2 ].
EXAMPLE: If T î = î + ¼ ĵ and T ĵ = ¼ î + ĵ, then the matrix is
[ 1  ¼ ]
[ ¼  1 ].
EXAMPLE: If T î = ĵ and T ĵ = î, the matrix is
[ 0  1 ]
[ 1  0 ].
The Basic Box is REFLECTED about the diagonal.
EXAMPLE: Suppose in 3 dimensions T î = î + 4ĵ + 7k̂, T ĵ = 2î + 5ĵ + 8k̂, T k̂ = 3î + 6ĵ + 9k̂; then the matrix is
[ 1  2  3 ]
[ 4  5  6 ]
[ 7  8  9 ],
relative to î, ĵ, k̂.
Notice that a linear transformation does not have to map a space to itself. For example, the rule T î = î + ĵ + 2k̂, T ĵ = î − 3k̂ defines a transformation that eats 2-dimensional vectors but PRODUCES 3-dimensional vectors. But it still has a matrix,
[ 1   1 ]
[ 1   0 ]
[ 2  −3 ].
It’s just that this matrix is not a SQUARE MATRIX, that is, it is not 2 by 2 or 3 by 3. Instead it is 3 by 2.
So a linear transformation has a matrix whether it acts on two-dimensional vectors or three-dimensional ones. In this chapter we are mainly interested in these two cases; more general cases will appear in later chapters.
EXAMPLE: Suppose T eats 3-dimensional vectors and produces 2-dimensional vectors according to the rule T î = 2î, T ĵ = î + ĵ, T k̂ = î − ĵ. What is its matrix?
Answer:
[ 2  1   1 ]
[ 0  1  −1 ],
a 2 by 3 matrix.
EXAMPLE: Suppose you take a flat square of rubber and SHEAR it, as shown. This does not change its volume; think of shearing a pack of cards. The base stays fixed but the top moves a distance tan(θ). (The height remains the same, 1 unit.) Clearly the shearing transformation S satisfies S î = î, S ĵ = î tan θ + ĵ, so the matrix of S relative to î, ĵ is
[ 1  tan θ ]
[ 0    1   ].
EXAMPLE: Suppose T î = î + ĵ and T ĵ = î + ĵ. The matrix is
[ 1  1 ]
[ 1  1 ]
and the basic box is SQUASHED FLAT!
EXAMPLE: Rotations in the plane. Suppose you ROTATE the whole plane through an angle θ, anticlockwise. Letting the rotation act on î and ĵ and putting the results in the columns, we get
R(θ) = [ cos θ  −sin θ ]
       [ sin θ   cos θ ].
Application: Suppose an object is moving on a circle at constant angular speed ω. What
is its acceleration?
Answer: Let its position vector at t = 0 be r⃗₀. Because the object is moving on a circle, its position at a later time t is given by rotating r⃗₀ by an angle θ(t). So
r⃗(t) = [ cos θ  −sin θ ] r⃗₀.
        [ sin θ   cos θ ]
Differentiate:
dr⃗/dt = θ̇ [ −sin θ  −cos θ ] r⃗₀
            [  cos θ  −sin θ ]
by the chain rule. Here θ̇ is actually ω, so
dr⃗/dt = ω [ −sin θ  −cos θ ] r⃗₀.
            [  cos θ  −sin θ ]
Differentiate again:
d²r⃗/dt² = ω² [ −cos θ   sin θ ] r⃗₀
              [ −sin θ  −cos θ ]
          = −ω² [ cos θ  −sin θ ] r⃗₀.
                [ sin θ   cos θ ]
Substitute the equation for r⃗(t):
d²r⃗/dt² = −ω² r⃗.
Notice that here we are really talking about an uncountably infinite SET of rotations, parametrised by t. This kind of set is extremely important, and it has a name: it is called a Lie group.
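You can also verify the formula d²r⃗/dt² = −ω² r⃗ numerically. A small sketch in Python with numpy (ω and r⃗₀ are arbitrary choices of mine, and the second derivative is approximated by a central difference):

import numpy as np

def R(theta):                        # rotation matrix through angle theta
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

omega = 2.0
r0 = np.array([1.0, 0.5])
r = lambda t: R(omega * t) @ r0      # position on the circle at time t

t, h = 0.7, 1e-5
accel = (r(t + h) - 2 * r(t) + r(t - h)) / h**2   # numerical second derivative
print(accel)                         # approximately equal to ...
print(-omega**2 * r(t))              # ... -omega^2 r(t)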
1.3. COMPOSITE TRANSFORMATIONS AND MATRIX MULTIPLICA-
TION.
You know what it means to take the COMPOSITE of two functions: if f(u) = sin(u), and g is some other function, then the composite f ∘ g means “do g FIRST, then f”: (f ∘ g)(u) = f(g(u)). Similarly, if A and B are linear transformations, then AB means “do B FIRST, then A”.
NOTE: BE CAREFUL! According to our definition, A and B both eat vectors and both
produce vectors. But then you have to take care that A can eat what B produces!
EXAMPLE: Suppose A eats and produces 2-dimensional vectors, and B eats and produces 3-dimensional vectors. Then AB makes NO sense, because A cannot eat the 3-dimensional vectors that B produces.
EXAMPLE: Suppose B eats 2-d vectors and produces 3-d vectors (so its matrix relative to î, ĵ, k̂ looks like this:
[ b11  b12 ]
[ b21  b22 ]
[ b31  b32 ],
a 3 by 2 matrix) and suppose A eats 3-d vectors and produces 2-d vectors. Then AB DOES make sense, because A can eat what B produces.
IMPORTANT FACT: Suppose a_ij is the matrix of a linear transformation A relative to î, ĵ, k̂, and suppose b_ij is the matrix of the linear transformation B relative to î, ĵ, k̂. Suppose that AB makes sense. Then the matrix of AB relative to î, ĵ or î, ĵ, k̂ is just the matrix PRODUCT of the matrix of A with the matrix of B, taken in that order.
EXAMPLE: What happens to the vector
[ 1 ]
[ 2 ]
if we shear 45° parallel to the x axis and then rotate 90° anticlockwise? What if we do the same in the reverse order?
Answer: A shear has matrix
[ 1  tan θ ]
[ 0    1   ],
so in this case it is
[ 1  1 ]
[ 0  1 ].
A rotation through θ has matrix
[ cos θ  −sin θ ]
[ sin θ   cos θ ],
so here it is
[ 0  −1 ]
[ 1   0 ].
Hence, for “shear, then rotate”,
[ 0  −1 ] [ 1  1 ]   [ 0  −1 ]
[ 1   0 ] [ 0  1 ] = [ 1   1 ],
while for “rotate, then shear”,
[ 1  1 ] [ 0  −1 ]   [ 1  −1 ]
[ 0  1 ] [ 1   0 ] = [ 1   0 ].
Shear, then rotate, applied to our vector:
[ 1 ]     [ 0  −1 ] [ 1 ]   [ −2 ]
[ 2 ]  →  [ 1   1 ] [ 2 ] = [  3 ].
Rotate, then shear:
[ 1 ]     [ 1  −1 ] [ 1 ]   [ −1 ]
[ 2 ]  →  [ 1   0 ] [ 2 ] = [  1 ].
Very different!
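Here is the same calculation done by machine, as a sketch in Python with numpy; it confirms both that the matrices multiply in the stated order and that the order matters.

import numpy as np

shear  = np.array([[1.0, 1.0],     # shear by 45 degrees: tan(45 deg) = 1
                   [0.0, 1.0]])
rotate = np.array([[0.0, -1.0],    # rotation by 90 degrees anticlockwise
                   [1.0,  0.0]])
v = np.array([1.0, 2.0])

print(rotate @ shear)          # matrix of "shear first, then rotate"
print(shear @ rotate)          # matrix of "rotate first, then shear": different!
print(rotate @ shear @ v)      # [-2.  3.]
print(shear @ rotate @ v)      # [-1.  1.]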
EXAMPLE: Let A be the 2 by 3 matrix and B the 3 by 2 matrix shown below. Work out AB and BA.
Answer:
[  0  1  1 ] [  1   0 ]   [ −1   0 ]
[ −1  1  0 ] [  0  −1 ] = [ −1  −1 ] = AB
             [ −1   1 ]
  2 by 3       3 by 2        2 by 2

[  1   0 ]                 [  0   1   1 ]
[  0  −1 ] [  0  1  1 ]  = [  1  −1   0 ] = BA
[ −1   1 ] [ −1  1  0 ]    [ −1   0  −1 ]
  3 by 2      2 by 3           3 by 3
Notice that AB and BA are not even the same size!
EXAMPLE: Suppose you take a piece of rubber in 2 dimensions and shear it parallel to the x axis by θ degrees, and then shear it again by φ degrees. What happens?
[ 1  tan φ ] [ 1  tan θ ]   [ 1  tan θ + tan φ ]
[ 0    1   ] [ 0    1   ] = [ 0        1        ]
The shear angles don’t add up, since tan θ + tan φ ≠ tan(θ + φ).
EXAMPLE: Rotate 90° around the z-axis, then rotate 90° around the x-axis, in 3 dimensions. [Always anti-clockwise unless otherwise stated.] Is it the same if we reverse the order?
Rotating about the z axis → î becomes ĵ, ĵ becomes −î, k̂ stays the same, so the matrix is
[ 0  −1  0 ]
[ 1   0  0 ].
[ 0   0  1 ]
Rotating about the x axis, î stays the same, ĵ becomes k̂, k̂ becomes −ĵ, so the matrix is
[ 1  0   0 ]
[ 0  0  −1 ],
[ 0  1   0 ]
and
[ 1  0   0 ] [ 0  −1  0 ]     [ 0  −1  0 ] [ 1  0   0 ]
[ 0  0  −1 ] [ 1   0  0 ]  ≠  [ 1   0  0 ] [ 0  0  −1 ],
[ 0  1   0 ] [ 0   0  1 ]     [ 0   0  1 ] [ 0  1   0 ]
so the answer is NO!
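A machine check of that last inequality (a sketch in Python with numpy):

import numpy as np

Rz = np.array([[0, -1, 0],   # 90-degree rotation about the z-axis
               [1,  0, 0],
               [0,  0, 1]])
Rx = np.array([[1,  0,  0],  # 90-degree rotation about the x-axis
               [0,  0, -1],
               [0,  1,  0]])

print(Rx @ Rz)                              # z-rotation first, then x-rotation
print(Rz @ Rx)                              # the other order
print(np.array_equal(Rx @ Rz, Rz @ Rx))     # False: the rotations do not commute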
1.4 DETERMINANTS
You probably know that the AREA of the box defined by two vectors is
|u⃗ × v⃗|,
the magnitude of the cross product. If you don’t know it, you can easily check it, since the area of any parallelogram is given by
AREA = HEIGHT × BASE
     = |v⃗| sin θ × |u⃗|
     = |u⃗| |v⃗| sin θ
     = |u⃗ × v⃗|.
Similarly, the VOLUME of a “three-dimensional parallelogram” [called a PARALLELEPIPED!] is given by a similar formula. If you take any 3 vectors in 3 dimensions, say u⃗, v⃗, w⃗, then they define a 3-dimensional parallelogram. The area of the base is |u⃗ × v⃗|, and the height is |w⃗| |sin(π/2 − θ)|, where θ is the angle between u⃗ × v⃗ and w⃗. So the VOLUME of the box defined by u⃗, v⃗, w⃗ is just
|u⃗ × v⃗| |w⃗| |sin(π/2 − θ)|
= |u⃗ × v⃗| |w⃗| |cos θ|
= |u⃗ × v⃗ · w⃗|.
Now let T be any linear transformation in two dimensions. [This means that it acts on
vectors in the xy plane and turns them into other vectors in the xy plane.]
Now T î and T ĵ still lie in the same plane as î and ĵ, so (T î) × (T ĵ) must be perpendicular
to that plane. Hence it must be some multiple of k̂. We define the DETERMINANT of
T to be that multiple, that is, by definition, det(T ) is the number given [STRICTLY IN 2
DIMENSIONS] by
T î × T ĵ = det(T) k̂.
EXAMPLE: For the identity, I î × I ĵ = î × ĵ = k̂ = 1 k̂, so det(I) = 1.
EXAMPLE: D u⃗ = 2 u⃗. Then D î × D ĵ = 2î × 2ĵ = 4 k̂, so det(D) = 4.
EXAMPLE: The reflection T î = ĵ, T ĵ = î. Then T î × T ĵ = ĵ × î = −k̂ → det T = −1.
EXAMPLE: Shear, S î = î, S ĵ = î tan θ + ĵ,
S î × S ĵ = k̂ → det S = 1.
EXAMPLE: T î = î + ĵ = T ĵ, so
T î × T ĵ = 0⃗ → det T = 0.
EXAMPLE: Rotation:
R î × R ĵ = (cos θ î + sin θ ĵ) × (−sin θ î + cos θ ĵ) = (cos²θ + sin²θ) k̂ = k̂ → det R = 1.
The area of the Basic Box is initially |î × ĵ| = 1. After we let T act on it, the area becomes |T î × T ĵ| = |det T|. So |det T| is the factor by which T changes areas. In particular, det T = ±1 means that the area is UNCHANGED (shears, rotations, reflections), while det T = 0 means that the Basic Box is squashed FLAT, zero area.
Take a general 2 by 2 matrix
M = [ a  b ]
    [ c  d ].
We know that this means M î = aî + cĵ, M ĵ = bî + dĵ. Hence M î × M ĵ = (aî + cĵ) × (bî + dĵ) = (ad − bc) k̂, so
det [ a  b ] = ad − bc.
    [ c  d ]
Check:
det [ 2  0 ] = 4,    det [ 1  tan θ ] = 1,
    [ 0  2 ]             [ 0    1   ]
det [ cos θ  −sin θ ] = 1,    det [ 1  1 ] = 0.
    [ sin θ   cos θ ]             [ 1  1 ]
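Those checks can be done by machine too. A sketch in Python with numpy (np.linalg.det works with floating-point numbers, so the answers come out only up to rounding):

import numpy as np

theta = 0.4
print(np.linalg.det(np.array([[2.0, 0.0], [0.0, 2.0]])))            # 4
print(np.linalg.det(np.array([[1.0, np.tan(theta)], [0.0, 1.0]])))  # 1
print(np.linalg.det(np.array([[np.cos(theta), -np.sin(theta)],
                              [np.sin(theta),  np.cos(theta)]])))   # 1
print(np.linalg.det(np.array([[1.0, 1.0], [1.0, 1.0]])))            # 0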
IN THREE dimensions there is a similar gadget. The Basic Box is defined by îĵ k̂, and
we can let any 3-dimensional L.T. act on it, to get a new box defined by T î, T ĵ, T k̂. We
define
det T = T î × T ĵ · T k̂
where the dot is the scalar product, as usual. Since |T î × T ĵ · T k̂| is the volume of the new box, |det T| again measures the factor by which T changes volumes.
REMARK: You don’t really need to use the standard ijk basis to make all of this work.
It will work for any triple of unit vectors u⃗, v⃗, and w⃗ as long as these three vectors (in this
order) form a right-handed basis. This is obvious geometrically: the linear transformation doesn’t care about your choice of basis, and of course the amount by which areas and volumes are changed doesn’t depend on that choice either. In particular, whatever basis you use, the box gets squashed flat exactly when
det T = 0.
EXAMPLE: You can easily show that, for every unit vector u⃗, the stretching transformation satisfies det S^α_u⃗ = α. (This is obvious geometrically. To prove it from the definition, choose a basis consisting of u⃗ together with any two unit vectors v⃗ and w⃗ perpendicular to each other and to u⃗, such that u⃗, v⃗, and w⃗ form a right-handed basis.)
Just as
det [ a  b ] = ad − bc,
    [ c  d ]
there is a formula for the determinant of a 3 by 3 matrix. First, a notational point: we often write determinants with vertical bars,
| a  b |       [ a  b ]
| c  d | = det [ c  d ] = ad − bc.
Then the 3 by 3 determinant is given by expanding along the top row:
| a11  a12  a13 |
| a21  a22  a23 | = a11 | a22  a23 | − a12 | a21  a23 | + a13 | a21  a22 |.
| a31  a32  a33 |       | a32  a33 |       | a31  a33 |       | a31  a32 |
In other words, we can compute a three-dimensional determinant if we know how to work out two-dimensional determinants.
COMMENTS:
[a] We worked along the top row. Actually, a THEOREM says that you can use ANY row, or indeed any column, and you always get the same answer.
[b] How did I know that a12 had to multiply the particular 2-dimensional determinant
| a21  a23 |
| a31  a33 | ?
Easy: I just struck out EVERYTHING IN THE SAME ROW AND COLUMN as a12,
[  ∗   ∗    ∗  ]
[ a21  ∗   a23 ]
[ a31  ∗   a33 ],
and just kept the survivors!
This is the pattern; for example, if you expand along the second row you will get
        ∗   a12  a13           a11   ∗   a13           a11  a12   ∗
− a21   ∗    ∗    ∗    + a22    ∗    ∗    ∗    − a23    ∗    ∗    ∗
        ∗   a32  a33           a31   ∗   a33           a31  a32   ∗
[c] What is the pattern of the + and − signs? It is an (Ang Moh) CHESSBOARD, starting with a + in the top left corner:
+ − +
− + −
+ − +
[d] You can do exactly the same thing in FOUR dimensions, following this pattern, using
+ − + −
− + − +
+ − + −
− + − +,
because now you know how to work out 3-dimensional determinants. And so on!
Example:
| 1  −1   0 |
| 1   1  −1 | = 1 | 1  −1 | − (−1) | 1  −1 | + 0 | 1  1 |
| 2   0   0 |     | 0   0 |        | 2   0 |     | 2  0 |
= 0 + 2 + 0 = 2
(expanding along the top row), or, if you use the second row,
| 1  −1   0 |
| 1   1  −1 | = −1 | −1  0 | + 1 | 1  0 | − (−1) | 1  −1 |
| 2   0   0 |      |  0  0 |     | 2  0 |        | 2   0 |
= 0 + 0 + 2 = 2,
or
| 1  −1   0 |
| 1   1  −1 | = 1 | 1  −1 | − 1 | −1  0 | + 2 | −1   0 |
| 2   0   0 |     | 0   0 |     |  0  0 |     |  1  −1 |
= 0 + 0 + 2 = 2
(expanding down the first column).
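If you want to see the expansion as an algorithm, here is a short recursive sketch in Python with numpy (expansion along the top row, fine for small matrices), checked against numpy’s built-in determinant on the matrix above:

import numpy as np

def det(M):
    # determinant by cofactor expansion along the top row
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    if n == 1:
        return M[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(M, 0, axis=0), j, axis=1)  # strike out row 0 and column j
        total += (-1) ** j * M[0, j] * det(minor)               # chessboard signs + - + -
    return total

A = np.array([[1, -1,  0],
              [1,  1, -1],
              [2,  0,  0]])
print(det(A), np.linalg.det(A))    # both give 2 (the second up to rounding)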
[a] Let S and T be two linear transformations such that det S and det T are defined. Then
det(ST) = det(S) det(T).
From the point of view of pure algebra, this is a totally amazing formula! How on earth do the different bits of the two determinants disentangle themselves like that? (Think of trying to prove it using the above formula for the determinant! No thanks!) But from a geometric point of view it is obvious: doing T and then S multiplies areas or volumes first by |det T| and then by |det S|, so the composite multiplies them by the product. Geometry FTW.
Therefore, det[ST U ] = det[U ST ] = det[T U S] and so on: det doesn’t care about the order.
Remember however that this DOES NOT mean that ST U = U ST etc etc.
[b] det Mᵀ = det M, where Mᵀ denotes the TRANSPOSE of M (flip it across its main diagonal).
[c] If M is an n by n matrix and c is a scalar, then det(cM) = cⁿ det M.
[d] If M preserves all lengths and angles (a rotation or a reflection), then det M = ±1.
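These properties are easy to test numerically. A sketch in Python with numpy, using random matrices:

import numpy as np

rng = np.random.default_rng(0)
S, T = rng.random((3, 3)), rng.random((3, 3))
c, n = 2.5, 3

print(np.isclose(np.linalg.det(S @ T), np.linalg.det(S) * np.linalg.det(T)))   # True
print(np.isclose(np.linalg.det(S.T), np.linalg.det(S)))                         # True
print(np.isclose(np.linalg.det(c * S), c**n * np.linalg.det(S)))                # True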
1.5. INVERSES.
If I give you a 3-dimensional vector u⃗ and a 3-dimensional linear transformation T, then T sends u⃗ to one particular vector; it never sends u⃗ to two DIFFERENT VECTORS! But can the reverse happen: can T send two DIFFERENT vectors u⃗ and v⃗ to the SAME vector, so that T u⃗ = T v⃗? Consider this:
[ 1  0  0 ] [ 0 ]   [ 0 ]   [ 1  0  0 ] [ 0 ]
[ 0  0  0 ] [ 1 ] = [ 0 ] = [ 0  0  0 ] [ 0 ].
[ 0  0  0 ] [ 0 ]   [ 0 ]   [ 0  0  0 ] [ 1 ]
So it can happen! Notice that this transformation destroys ĵ (and also k̂). In fact, if u⃗ ≠ v⃗ and T u⃗ = T v⃗, then T(u⃗ − v⃗) = 0⃗, that is, T w⃗ = 0⃗ where w⃗ IS NOT THE ZERO VECTOR. So if this happens, T destroys everything in the w⃗ direction. That is, T SQUASHES 3-dimensional space down to two or even fewer dimensions. This means that T LOSES INFORMATION − it throws away all of the information stored in the w⃗ direction.
Clearly such a T squashes the basic box down to zero volume, so
det T = 0.
We call such a transformation SINGULAR. If instead det T ≠ 0, we call T NON-SINGULAR: nothing gets squashed flat, and different vectors are always sent to different vectors,
u⃗ → T u⃗,   v⃗ → T v⃗,   different whenever u⃗ ≠ v⃗.
Therefore if I give you T u⃗, THERE IS EXACTLY ONE u⃗. The transformation that takes you from T u⃗ back to u⃗ is called the INVERSE OF T. The idea is that since a NON-SINGULAR linear transformation does NOT destroy information, we can re-construct u⃗ if we are given T u⃗. Clearly T HAS AN INVERSE, CALLED T⁻¹, if and only if det T ≠ 0.
EXAMPLE:
[ 1  1 ] [ 3 ]   [ 7 ]        [ 1  1 ] [ 4 ]   [ 7 ]
[ 1  1 ] [ 4 ] = [ 7 ]  and   [ 1  1 ] [ 3 ] = [ 7 ],
so two different vectors are sent to the same vector. In fact
[ 1  1 ] [  1 ]   [ 0 ]
[ 1  1 ] [ −1 ] = [ 0 ]:
it destroys everything in that direction! Finally,
det [ 1  1 ] = 0,
    [ 1  1 ]
so it is SINGULAR and has NO inverse.
EXAMPLE: Take
[ 0  −1 ]
[ 1   0 ]
and suppose it acts on
[ α ]       [ a ]
[ β ]  and  [ b ]
and sends them to the same vector, so
[ 0  −1 ] [ α ]   [ 0  −1 ] [ a ]
[ 1   0 ] [ β ] = [ 1   0 ] [ b ].
Then
[ −β ]   [ −b ]       α = a        [ α ]   [ a ]
[  α ] = [  a ]   →   β = b    →   [ β ] = [ b ],
so the two vectors are the same − this transformation never maps different vectors to the same vector. No vector is destroyed, no information is lost, nothing gets squashed! And
det [ 0  −1 ] = 1,   NON-SINGULAR.
    [ 1   0 ]
EXAMPLE: Since det S^α_u⃗ = α, stretching transformations are always non-singular except when α = 0.
By definition, T⁻¹ sends T u⃗ back to u⃗, i.e.
T⁻¹(T(u⃗)) = u⃗ = T(T⁻¹(u⃗)).
But u⃗ = I u⃗ (identity), so T⁻¹ satisfies
T⁻¹ T = T T⁻¹ = I.
So to find the inverse of
[ 0  −1 ]
[ 1   0 ]
we just have to find a matrix
[ a  b ]
[ c  d ]
such that
[ a  b ] [ 0  −1 ]   [ 1  0 ]
[ c  d ] [ 1   0 ] = [ 0  1 ]
→ b = 1, a = 0, d = 0, c = −1, so the answer is
[  0  1 ]
[ −1  0 ].
In general,
[ a  b ]⁻¹   =   1/(ad − bc) × [  d  −b ]
[ c  d ]                       [ −c   a ].
EXAMPLE: For the stretching transformation S^α_u⃗ with α ≠ 0, the inverse is obtained by replacing α with 1/α: stretching by α is undone by stretching by 1/α in the same direction.
For bigger square matrices there are many tricks for finding inverses. A general [BUT SLOW] method goes as follows:
[a] Work out the matrix of COFACTORS. [A cofactor is what you get when you work out the smaller determinant obtained by striking out a row and a column; for example, the cofactor of 6 in
[ 1  2  3 ]
[ 4  5  6 ]
[ 7  8  9 ]
is
| 1  2 |
| 7  8 | = −6.
You can do this for each element in a given matrix, to obtain a new matrix of the same size.] For example, the matrix of cofactors of
[ 1  0  1 ]
[ 0  1  0 ]
[ 0  0  1 ]
is
[  1  0  0 ]
[  0  1  0 ].
[ −1  0  1 ]
[b] Keep or reverse the signs of every element according to
+ − +
− + −
+ − +
(you get
[  1  0  0 ]
[  0  1  0 ]
[ −1  0  1 ]
in the example above).
[c] Take the TRANSPOSE:
[ 1  0  −1 ]
[ 0  1   0 ].
[ 0  0   1 ]
[d] Divide by the determinant of the original matrix. THE RESULT IS THE DESIRED INVERSE. (Here the determinant is 1, so the inverse is
[ 1  0  −1 ]
[ 0  1   0 ]
[ 0  0   1 ]
in this example.) Check:
[ 1  0  1 ] [ 1  0  −1 ]   [ 1  0  0 ]
[ 0  1  0 ] [ 0  1   0 ] = [ 0  1  0 ].
[ 0  0  1 ] [ 0  0   1 ]   [ 0  0  1 ]
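Here is the same recipe written out as a small Python/numpy sketch (only to illustrate steps [a] to [d]; for big matrices there are much faster methods, and numpy’s own inverse is used as a check):

import numpy as np

def inverse_by_cofactors(M):
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    cof = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(M, i, axis=0), j, axis=1)   # strike out row i, column j
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)      # steps [a] and [b]
    return cof.T / np.linalg.det(M)                                 # steps [c] and [d]

A = np.array([[1, 0, 1],
              [0, 1, 0],
              [0, 0, 1]])
print(inverse_by_cofactors(A))   # [[1, 0, -1], [0, 1, 0], [0, 0, 1]]
print(np.linalg.inv(A))          # the same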
INVERSE OF A PRODUCT:
(AB)⁻¹ = B⁻¹ A⁻¹
APPLICATION: SOLVING LINEAR SYSTEMS.
Suppose we want to solve the system
[ 1  2  3 ] [ x ]   [ 1 ]
[ 4  5  6 ] [ y ] = [ 2 ].
[ 7  8  9 ] [ z ]   [ 4 ]
You can check that the determinant of the matrix here is zero, so it is SINGULAR: it squashes 3-dimensional space down into a certain 2-dimensional space. [We say that it has RANK 2. If it had squashed everything down to a 1-dimensional space, we would say that it had RANK 1.] Now actually
[ 1 ]
[ 2 ]
[ 4 ]
DOES NOT lie in that two-dimensional space. Since
[ 1  2  3 ]
[ 4  5  6 ]
[ 7  8  9 ]
squashes EVERYTHING into that two-dimensional space, it is IMPOSSIBLE for
[ 1  2  3 ] [ x ]
[ 4  5  6 ] [ y ]
[ 7  8  9 ] [ z ]
to be equal to
[ 1 ]
[ 2 ].
[ 4 ]
Hence the system has NO solutions.
If we change
[ 1 ]      [ 1 ]
[ 2 ]  to  [ 2 ],
[ 4 ]      [ 3 ]
this vector DOES lie in the special 2-dimensional space, and the system
[ 1  2  3 ] [ x ]   [ 1 ]
[ 4  5  6 ] [ y ] = [ 2 ]
[ 7  8  9 ] [ z ]   [ 3 ]
DOES have a solution − in fact it has infinitely many!
SUMMARY: Suppose we have a linear system
M r⃗ = a⃗,
where M is a matrix, r⃗ = the vector of variables, and a⃗ is a given vector. Suppose M is square.
[a] If det M ≠ 0, then there is exactly one solution, namely
r⃗ = M⁻¹ a⃗.
[b] If det M = 0, there is probably no solution. But if there is one, then there will be many.
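A numerical illustration of both situations, as a sketch in Python with numpy. Because M below is singular, np.linalg.solve is not appropriate; the least-squares routine is used instead, and the size of the residual |M r⃗ − a⃗| tells us whether an exact solution exists.

import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]], dtype=float)
print(np.linalg.det(M))     # approximately 0: M is singular

for a in (np.array([1.0, 2.0, 4.0]), np.array([1.0, 2.0, 3.0])):
    r, *_ = np.linalg.lstsq(M, a, rcond=None)   # best approximate solution
    print(a, "residual:", np.linalg.norm(M @ r - a))
    # residual > 0 for a = (1, 2, 4): no solution at all
    # residual ~ 0 for a = (1, 2, 3): solutions exist (infinitely many)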
Generally, a linear transformation will change the direction of a vector. But there may be some special vectors which DON’T have their direction changed!
EXAMPLE: The matrix
[ 1   2 ]
[ 2  −2 ]
clearly DOES change the direction of î and ĵ, since
[ 1 ]
[ 2 ]
is not parallel to î and
[  2 ]
[ −2 ]
is not parallel to ĵ. BUT
[ 1   2 ] [ 2 ]   [ 4 ]     [ 2 ]
[ 2  −2 ] [ 1 ] = [ 2 ] = 2 [ 1 ],
which IS parallel to
[ 2 ]
[ 1 ].
In general, if a transformation T does not change the direction of a vector u⃗, that is,
T u⃗ = λ u⃗
for some λ (a SCALAR), then u⃗ is called an EIGENVECTOR of T. The scalar λ is called the EIGENVALUE of u⃗.
How do we find them? Take
T u⃗ = λ u⃗
and write u⃗ = I u⃗, I = identity. Then
(T − λI) u⃗ = 0⃗.
Let’s suppose u⃗ ≠ 0⃗ [of course, 0⃗ is always an eigenvector; that is boring]. So the equation says that T − λI SQUASHES everything in the u⃗ direction. Hence
det(T − λI) = 0.
EXAMPLE: Find the eigenvalues of
[ 1   2 ]
[ 2  −2 ]:
det ( [ 1   2 ] − λ [ 1  0 ] ) = 0
      [ 2  −2 ]     [ 0  1 ]
→ det [ 1−λ      2  ] = 0
      [  2    −2−λ  ]
→ −(1 − λ)(2 + λ) − 4 = 0
→ λ = 2 OR −3.
So there are TWO answers for a 2 by 2 matrix. Similarly, in general there are three answers
for 3 by 3 matrices, etc. Unfortunately, however, sometimes the “two” or “three” answers
can be repeated; that is, you don’t necessarily get three DIFFERENT answers. This is a
great evil.
IMPORTANT POINT: Let u⃗ be an eigenvector of T. Then 2 u⃗ is also an eigenvector, with the same eigenvalue, since
T(2 u⃗) = 2 T u⃗ = 2λ u⃗ = λ × (2 u⃗).
Similarly 3 u⃗, 13.59 u⃗, etc. are all eigenvectors! SO YOU MUST NOT EXPECT A UNIQUE ANSWER!
OK, with that in mind, let’s find an eigenvector for λ = 2. Let’s call an eigenvector
[ α ]
[ β ].
Then
(T − λI) u⃗ = 0⃗
→ [ 1−λ      2  ] [ α ]
  [  2    −2−λ  ] [ β ] = 0⃗
→ [ −1   2 ] [ α ]
  [  2  −4 ] [ β ] = 0⃗
→ −α + 2β = 0
→ 2α − 4β = 0.
But these equations are actually the SAME, so we really only have ONE equation for 2
unknowns. We aren’t surprised, because we did not expect a unique answer anyway! We
can just CHOOSE α = 1 (or 13.59 or whatever) and then solve for β. Clearly β = ½, so an eigenvector corresponding to λ = 2 is
[ 1 ]
[ ½ ].
But if you said
[ 2 ]      [ 100 ]
[ 1 ]  or  [  50 ],
that is also correct!
Now do the same thing for λ = −3:
→ 4α + 2β = 0
→ 2α + β = 0.
Again we can set α = 1, then β = −2, so an eigenvector corresponding to λ = −3 is
[  1 ]
[ −2 ]
or
[  2 ]      [ −10 ]
[ −4 ]  or  [  20 ],  etc.
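Numpy will do all of this in one line. A sketch (note that numpy normalises its eigenvectors to length 1, so you get some particular multiples of the answers above, and the order of the eigenvalues is not guaranteed):

import numpy as np

T = np.array([[1.0,  2.0],
              [2.0, -2.0]])
evals, evecs = np.linalg.eig(T)
print(evals)                 # 2 and -3, in some order
print(evecs)                 # columns are eigenvectors: multiples of (1, 1/2) and (1, -2)

for lam, u in zip(evals, evecs.T):
    print(np.allclose(T @ u, lam * u))   # True: T u = lambda u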
EXAMPLE: Find the eigenvalues, and corresponding eigenvectors, of
[ 0  −1 ]
[ 1   0 ].
Answer: We have
det [ −λ  −1 ] = 0  →  λ² + 1 = 0  →  λ = ±i,  i = √(−1).
    [  1  −λ ]
This should not surprise you: the matrix represents a ROTATION through 90°, and of course such a transformation leaves NO [real] vector’s direction unchanged.
Remember how we find the matrix of a transformation T relative to î, ĵ: by letting T act on î and ĵ and then putting the results in the columns. So to say that T has matrix
[ a  b ]
[ c  d ]
with respect to î, ĵ means that
T î = aî + cĵ,
T ĵ = bî + dĵ.
What’s so special about the two vectors î and ĵ? Nothing, except that EVERY vector in the plane can be expressed in terms of them. Now actually we only really use î and ĵ for CONVENIENCE. In fact, we can do this with ANY pair of vectors u⃗, v⃗ in two dimensions.
That is, any vector w⃗ can be expressed as
w⃗ = α u⃗ + β v⃗
for some scalars α, β. You can see this from the diagram − by stretching u⃗ to α u⃗ and v⃗ to β v⃗, we can make their sum equal to w⃗.
We call u⃗, v⃗ a BASIS for 2-dimensional vectors. Let
u⃗ = P11 î + P21 ĵ = [ P11 ]
                     [ P21 ],
v⃗ = P12 î + P22 ĵ = [ P12 ]
                     [ P22 ].
Then the transformation that takes î, ĵ to (u⃗, v⃗) has matrix
[ P11  P12 ]
[ P21  P22 ] = P.
In order for u⃗, v⃗ to be a basis, P must not squash the volume of the Basic Box down to zero, since otherwise u⃗ and v⃗ will be parallel. So we must have
det P ≠ 0.
The same idea works in 3 dimensions: ANY set of 3 vectors forms a basis PROVIDED that the corresponding matrix P satisfies det P ≠ 0.
EXAMPLE: The pair of vectors
u⃗ = [ 1 ]        v⃗ = [ 1 ]
    [ 0 ],            [ 1 ]
forms a basis, because
det [ 1  1 ] = 1 ≠ 0.
    [ 0  1 ]
Now of course the COMPONENTS of a vector will change if you choose a different basis.
I will [temporarily] use subscripts to indicate which basis I am using. For example,
[ 1 ]
[ 2 ]_(î,ĵ) = 1î + 2ĵ.
BUT if we express this SAME vector in terms of a different basis we will get different components. In fact,
1î + 2ĵ = −î + 2(î + ĵ) = −u⃗ + 2 v⃗,
so the components of this vector relative to u⃗, v⃗ are
[ −1 ]
[  2 ]_(u⃗,v⃗).
Is there a systematic way of working this out? Suppose
1î + 2ĵ = α u⃗ + β v⃗.
The thing to remember here is that these two expressions refer to the SAME vector. The idea now is that, in order for that to be true, the change from (î, ĵ) to (u⃗, v⃗) has to be “cancelled out”. But surely if going from (î, ĵ) to (u⃗, v⃗) is done by multiplying by P, then we expect that the cancelling has to be done by multiplying the coefficients by P⁻¹:
[ α ]        [ 1 ]   [ 1  −1 ] [ 1 ]   [ −1 ]
[ β ] = P⁻¹ [ 2 ] = [ 0   1 ] [ 2 ] = [  2 ],
that is, the components of this vector relative to u⃗, v⃗ are found as
[ −1 ]              [ 1 ]
[  2 ]_(u⃗,v⃗) = P⁻¹ [ 2 ]_(î,ĵ).
THE COMPONENTS RELATIVE TO u⃗, v⃗ ARE OBTAINED BY MULTIPLYING P⁻¹ INTO THE COMPONENTS RELATIVE TO î, ĵ.
The same sort of thing happens for linear transformations − if a certain linear transformation T has matrix
[ 1   2 ]
[ 0  −1 ]_(î,ĵ)
relative to î, ĵ, it will have a DIFFERENT matrix relative to u⃗, v⃗. What is that matrix?
Well, we certainly have
[ 1   2 ]        [ 1 ]         [ 5 ]
[ 0  −1 ]_(î,ĵ) [ 2 ]_(î,ĵ) = [ −2 ]_(î,ĵ).
Now multiply both sides by P⁻¹, and, in the middle, “multiply and divide” by P − that is, insert P P⁻¹, which of course we are allowed to do since that is just the identity matrix:
P⁻¹ [ 1   2 ]        P P⁻¹ [ 1 ]         = P⁻¹ [ 5 ]
    [ 0  −1 ]_(î,ĵ)        [ 2 ]_(î,ĵ)         [ −2 ]_(î,ĵ),
that is,
( P⁻¹ [ 1   2 ]        P ) ( P⁻¹ [ 1 ]        ) = P⁻¹ [ 5 ]
(     [ 0  −1 ]_(î,ĵ)    ) (     [ 2 ]_(î,ĵ) )       [ −2 ]_(î,ĵ).
But, by what we said above,
P⁻¹ [ 1 ]
    [ 2 ]_(î,ĵ)
gives exactly the components of that vector relative to the (u⃗, v⃗) basis, and similarly
P⁻¹ [ 5 ]
    [ −2 ]_(î,ĵ)
gives the components of
[ 5 ]
[ −2 ]_(î,ĵ)
relative to (u⃗, v⃗). So we conclude that the components of the matrix relative to the new basis must be
P⁻¹ [ 1   2 ]        P.
    [ 0  −1 ]_(î,ĵ)
We can check this out: you can work out the new matrix explicitly,
P⁻¹ [ 1   2 ] P = [ 1  −1 ] [ 1   2 ] [ 1  1 ]
    [ 0  −1 ]     [ 0   1 ] [ 0  −1 ] [ 0  1 ]
                = [ 1   4 ]
                  [ 0  −1 ].
And that is correct: if you multiply this into
[ −1 ]
[  2 ]_(u⃗,v⃗),
you will get exactly
[  7 ]
[ −2 ]_(u⃗,v⃗)!
We conclude that THE MATRIX OF T RELATIVE TO u⃗, v⃗ IS OBTAINED BY “SANDWICHING” THE OLD MATRIX BETWEEN P⁻¹ AND P: new matrix = P⁻¹ (old matrix) P.
So now we know how to work out the matrix of any linear transformation relative to ANY
basis.
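The whole change-of-basis computation, done by machine as a sketch in Python with numpy:

import numpy as np

T_std = np.array([[1.0,  2.0],    # matrix of T relative to i-hat, j-hat
                  [0.0, -1.0]])
P = np.array([[1.0, 1.0],         # columns of P are the new basis vectors u and v
              [0.0, 1.0]])
P_inv = np.linalg.inv(P)

T_new = P_inv @ T_std @ P
print(T_new)                      # [[1. 4.], [0. -1.]]: the matrix relative to u, v

w = np.array([1.0, 2.0])          # the vector 1 i-hat + 2 j-hat
print(P_inv @ w)                  # its components relative to u, v: [-1. 2.]
print(T_new @ (P_inv @ w))        # [ 7. -2.], which matches:
print(P_inv @ (T_std @ w))        # [ 7. -2.]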
Now let T be a linear transformation in 2 dimensions, with eigenvectors e⃗1, e⃗2 and eigenvalues λ1, λ2. Now e⃗1 and e⃗2 may or may not give a basis for 2-dimensional space (because we can’t be sure that they are not parallel to each other). But suppose they do.
QUESTION: What is the matrix of T relative to e⃗1, e⃗2?
ANSWER: As always, we see what T does to e⃗1 and e⃗2, and put the results into the columns!
T e⃗1 = λ1 e⃗1 = λ1 e⃗1 + 0 e⃗2
T e⃗2 = λ2 e⃗2 = 0 e⃗1 + λ2 e⃗2.
So the matrix is
[ λ1   0 ]
[  0  λ2 ]_(e⃗1,e⃗2).
We say that a matrix of the form
[ a  0 ]        [ α  0  0 ]
[ 0  d ]   or   [ 0  β  0 ]
                [ 0  0  γ ]
is DIAGONAL. So we see
that THE MATRIX OF A TRANSFORMATION RELATIVE TO ITS OWN EIGEN-
VECTORS (assuming that these form a basis, which tragically is not always the case) is
DIAGONAL.
EXAMPLE: The shear matrix
[ 1  tan θ ]
[ 0    1   ].
Eigenvalues:
det [ 1−λ  tan θ ] = 0  →  (1 − λ)² = 0  →  λ = 1.
    [  0    1−λ  ]
There is only one eigenvector (up to multiples), namely
[ 1 ]
[ 0 ],
so the eigenvectors DO NOT give us a basis in this case → NOT possible to diagonalize this matrix! This evil is responsible for most of the complications in matrix theory.
EXAMPLE: Let ρ be the reflection of the plane in a straight line that passes through the origin and makes an angle of θ with the
x-axis. Then the vector
[ cos θ ]
[ sin θ ]
lies along this line, so it is left unchanged by the reflection; in other words, it is an eigenvector of ρ with eigenvalue 1. On the other hand, the vector
[ −sin θ ]
[  cos θ ]
is perpendicular to the first vector [check that their scalar product is zero], so it is reflected into its own negative by ρ. That is, it is an eigenvector with eigenvalue −1.
So ρ has a matrix with these eigenvectors and eigenvalues. The P matrix in this case is
P = [ cos θ  −sin θ ]
    [ sin θ   cos θ ],
and clearly
P⁻¹ = [  cos θ  sin θ ]
      [ −sin θ  cos θ ],
and since the eigenvalues are ±1, we have
ρ = [ cos θ  −sin θ ] [ 1   0 ] [  cos θ  sin θ ]
    [ sin θ   cos θ ] [ 0  −1 ] [ −sin θ  cos θ ].
Doing the matrix multiplication and using the trigonometric identities for cos 2θ and sin 2θ, we find
ρ = [ cos 2θ   sin 2θ ]
    [ sin 2θ  −cos 2θ ].
Notice that the determinant is − 1, as is typical for a reflection. Check that this gives
the right answer for reflections around the 45 degree diagonal and around the x-axis.
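A quick numerical confirmation of that formula (a sketch in Python with numpy, for one arbitrary value of θ):

import numpy as np

theta = 0.3
P = np.array([[np.cos(theta), -np.sin(theta)],    # columns are the two eigenvectors
              [np.sin(theta),  np.cos(theta)]])
D = np.diag([1.0, -1.0])                          # eigenvalues +1 and -1
rho = P @ D @ np.linalg.inv(P)

expected = np.array([[np.cos(2*theta),  np.sin(2*theta)],
                     [np.sin(2*theta), -np.cos(2*theta)]])
print(np.allclose(rho, expected))    # True
print(np.linalg.det(rho))            # -1 (up to rounding), as expected for a reflection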
Let’s construct a simple model of weather forecasting. We assume that each day is either RAINY or SUNNY. Suppose that if today is rainy, the probability that tomorrow will also be rainy is 60%, while if today is sunny, the probability that tomorrow will also be sunny is 70%. Since probabilities have to add up to 100%, you can easily see that Rainy → Sunny has
probability 40% and Sunny → Rainy has probability 30%. We can organise these data into
a matrix by putting the probabilities into the columns. That is, all of the probabilities of
the form Rainy → (Rainy or Sunny) go into the first column, and all of the probabilities of the form Sunny → (Rainy or Sunny) go into the second column, and so on if there are more possible states. (So if there were three states, say Rainy, Sunny, Hazy, then we would put the three numbers Rainy → Rainy, Rainy → Sunny, Rainy → Hazy down the first column of a 3 × 3 matrix, and similarly for the second and third columns.) In our case the matrix is
M = [ 0.6  0.3 ]
    [ 0.4  0.7 ].
Question: Suppose today is sunny. What is the probability that it will be rainy 4
days from now? To see how to proceed, we make a “tree” like this: [R = rain, S = sun]
Following the branches of the tree, the probability of Rainy → Rainy over 2 days is 0.6 × 0.6 + 0.4 × 0.3 = 0.48. We have, similarly, that the probability of Rainy → Sunny over 2 days is 0.6 × 0.4 + 0.4 × 0.7 = 0.52. By constructing a tree starting with S, you will find that the probability of rain 2 days after a sunny day is 0.3 × 0.6 + 0.7 × 0.3 = 0.39, and similarly for the remaining case. These numbers are exactly the entries of M². So matrix multiplication actually allows you to compute all of the probabilities in this problem. In particular, over 4 days,
[ RR4  SR4 ]
[ RS4  SS4 ] = M⁴ = M² M² ≈ [ 0.43  0.43 ]
                            [ 0.57  0.57 ]
(to two decimal places). So if it is rainy today, the probability of rain in 4 days is 0.43 = 43%. If you want 20 or 30 days, you need M²⁰ or M³⁰, and so on.
There is an easy way to work this out using eigenvalues.
Suppose I can diagonalize M, that is, I can write
P⁻¹ M P = D = [ λ1   0 ]
              [  0  λ2 ]
for some matrix P. Then
M = P D P⁻¹,
M² = (P D P⁻¹)(P D P⁻¹) = P D P⁻¹ P D P⁻¹ = P D² P⁻¹,
M³ = M M² = P D P⁻¹ P D² P⁻¹ = P D³ P⁻¹, etc.,
M³⁰ = P D³⁰ P⁻¹.
But D³⁰ is very easy to work out − it is just
[ (λ1)³⁰     0    ]
[    0    (λ2)³⁰ ].
Let’s see how this works!
The eigenvectors and eigenvalues of
[ 0.6  0.3 ]
[ 0.4  0.7 ]
are
[  1 ]                           [  1  ]
[ −1 ]  (eigenvalue 0.3)   and   [ 4/3 ]  (eigenvalue 1),
so
P = [  1   1  ],   D = [ 0.3  0 ],   P⁻¹ = [ 4/7  −3/7 ],
    [ −1  4/3 ]        [  0   1 ]          [ 3/7   3/7 ]
and
D³⁰ = [ (0.3)³⁰  0 ] ≈ [ 0  0 ]
      [    0     1 ]   [ 0  1 ]
(since (0.3)³⁰ ≈ 2 × 10⁻¹⁶),
so
M³⁰ = [  1   1  ] [ 2 × 10⁻¹⁶  0 ] [ 4/7  −3/7 ]
      [ −1  4/3 ] [     0      1 ] [ 3/7   3/7 ]
    = (1/7) [ 3 + 8 × 10⁻¹⁶    3 − 6 × 10⁻¹⁶ ]  ≈  [ 3/7  3/7 ]
            [ 4 − 8 × 10⁻¹⁶    4 + 6 × 10⁻¹⁶ ]     [ 4/7  4/7 ].
So if it is rainy today, the probability of rain tomorrow is 60%, but the probability of rain 30 days from now is only 3/7 ≈ 43%. As we go forward in time, the fact that it rained today becomes less and less important. The probability of rain in 31 days is almost exactly the same as the probability of rain in 30 days.
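Here is the whole weather computation as a sketch in Python with numpy, both directly and via diagonalization:

import numpy as np

M = np.array([[0.6, 0.3],     # columns: probabilities starting from Rainy, and from Sunny
              [0.4, 0.7]])

print(np.linalg.matrix_power(M, 4)[:, 0])   # [rain, sun] probabilities 4 days after a rainy day
                                            # approximately [0.43, 0.57]

evals, P = np.linalg.eig(M)                 # the diagonalization shortcut: M^30 = P D^30 P^(-1)
D30 = np.diag(evals ** 30)
print(P @ D30 @ np.linalg.inv(P))           # approximately [[3/7, 3/7], [4/7, 4/7]]
print(np.linalg.matrix_power(M, 30))        # the same, computed directly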
Let M be any square matrix. Then the TRACE of M, denoted Tr M, is defined as the sum of the diagonal entries:
Tr [ 1  0 ] = 2,
   [ 0  1 ]
Tr [ 1  2  3 ]
   [ 4  5  6 ] = 15,
   [ 7  8  9 ]
Tr [  1  5  16 ]
   [  7  2  15 ] = 11,
   [ 11  9   8 ]
etc.
IMPORTANT FACT: for any two square matrices of the same size, Tr MN = Tr NM. It follows that Tr(P⁻¹ M P) = Tr(M P P⁻¹) = Tr M, so THE TRACE OF THE MATRIX OF A LINEAR TRANSFORMATION IS ALWAYS THE SAME NO MATTER WHICH BASIS YOU USE! This is why the trace is such a useful thing to compute.
Now suppose that A can be diagonalized, with eigenvalues λ1 and λ2. Relative to a basis of eigenvectors its matrix is diagonal, so
Tr A = Tr [ λ1   0 ] = λ1 + λ2.
          [  0  λ2 ]
So the trace is equal to the sum of the eigenvalues. This gives a quick check that you
have not made a mistake in working out the eigenvalues: they have to add up to the same
number as the trace of the original matrix. Check this for the examples in this chapter.
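A one-line check of both facts (trace = sum of the eigenvalues, determinant = product of the eigenvalues), as a Python/numpy sketch using the matrix from the eigenvalue example above:

import numpy as np

A = np.array([[1.0,  2.0],
              [2.0, -2.0]])
print(np.trace(A), np.linalg.eigvals(A).sum())            # both -1: trace = 2 + (-3)
print(np.linalg.det(A), np.prod(np.linalg.eigvals(A)))    # both -6: det = 2 x (-3)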
By the same sort of argument you can see that the DETERMINANT of such a matrix
is the PRODUCT of the eigenvalues. Now think about what the transformation is doing
to a small box inside some object. (Let’s think about the case where all eigenvalues and
eigenvectors are real.) Along the directions of the eigenvectors, the box is simply being
stretched or compressed, not rotated or twisted. (In other words, a diagonal matrix is just a combination of stretching transformations along the basis directions.) The amount of stretching in each of those directions is given by the eigenvalues. So the (absolute value of the) product of the eigenvalues tells you how much the volume of that small box is being changed. So now we have a better understanding of why the determinant measures the changes in volumes under a linear transformation. The point is that this argument works in ALL dimensions, not just 2 and 3.
(It is a fact, which we will prove later, that for every square matrix, one can find a
basis such that the original matrix is transformed to a product of stretches with shears,
rotations, and reflections. The last three all have determinants with absolute values equal
to one, so the discussion in the previous paragraph actually works whether or not the
matrix is diagonalizable with real eigenvalues. The point I am making is that you will
understand linear algebra a LOT better if you have clear pictures in your mind, and that
is true even of the much more abstract stuff we are about to study!)