Practical Linear Algebra: A Geometry Toolbox
Gerald Farin
Dianne Hansford
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but
the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to
trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained.
If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical,
or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without
written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright
Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a
variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to
infringe.
Preface
Just about everyone has watched animated movies, such as Toy Story
or Shrek, or is familiar with the latest three-dimensional computer
games. Enjoying 3D entertainment sounds like more fun than study-
ing a linear algebra book. But it is because of linear algebra that
those movies and games can be brought to a TV or computer screen.
When you see a character move on the screen, it’s animated using
some equation straight out of this book. In this sense, linear algebra
is a driving force of our new digital world: it is powering the software
behind modern visual entertainment and communication.
But this is not a book on entertainment. We start with the fundamentals of linear algebra and proceed to various applications. So that it doesn't become too dry, we replaced mathematical proofs with motivations, examples, or graphics. For a beginning student, this will
result in a deeper level of understanding than standard theorem-proof
approaches. The book covers all of undergraduate-level linear alge-
bra in the classical sense—except it is not delivered in a classical way.
Since it relies heavily on examples and pointers to applications, we
chose the title Practical Linear Algebra, or PLA for short.
The subtitle of this book is A Geometry Toolbox; this is meant
to emphasize that we approach linear algebra in a geometric and
algorithmic way. Our goal is to bring the material of this book to
a broader audience, motivated in a large part by our observations
of how little engineers and scientists (non-math majors) retain from
classical linear algebra classes. Thus, we set out to fill a void in the
linear algebra textbook market. We feel that we have achieved this,
presenting the material in an intuitive, geometric manner that will
lend itself to retention of the ideas and methods.
Review of Contents
As stated previously, one clear motivation we had for writing PLA
was to present the material so that the reader would retain the in-
formation. In our experience, approaching the material first in two
and then in three dimensions lends itself to visualizing and then to
understanding. Incorporating many illustrations, Chapters 1–7 in-
troduce the fundamentals of linear algebra in a 2D setting. These
same concepts are revisited in Chapters 8–11 in a 3D setting. The
3D world lends itself to concepts that do not exist in 2D, and these
are explored there too.
Higher dimensions, necessary for many real-life applications and the
development of abstract thought, are visited in Chapters 12–16. The
focus of these chapters includes linear system solvers (Gauss elim-
ination, LU decomposition, the Householder method, and iterative
methods), determinants, inverse matrices, revisiting “eigen things,”
linear spaces, inner products, and the Gram-Schmidt process. Singu-
lar value decomposition, the pseudoinverse, and principal components
analysis are new additions.
Conics, discussed in Chapter 19, are a fundamental geometric en-
tity, and since their development provides a wonderful application
for affine maps, “eigen things,” and symmetric matrices, they really
shouldn’t be missed. Triangles in Chapter 17 and polygons in Chap-
ter 18 are discussed because they are fundamental geometric entities
and are important in generating computer images.
Several of the chapters have an “Application” section, giving a real-
world use of the tools developed thus far. We have made an effort to
choose applications that many readers will enjoy by staying away from
in-depth domain-specific language. Chapter 20 may be viewed as an
application chapter as a whole. Various linear algebra ingredients are
applied to the techniques of curve design and analysis.
The illustrations in the book come in two forms: figures and sketches.
The figures are computer generated and tend to be complex. The
sketches are hand-drawn and illustrate the core of a concept. Both are
great teaching and learning tools! We made all of them available on
the book’s website http://www.farinhansford.com/books/pla/. Many
of the figures were generated using PostScript, an easy-to-use geomet-
ric language, or Mathematica.
At the end of each chapter, we have included a list of topics, What
You Should Know (WYSK), marked by the icon on the left. This list
is intended to encapsulate the main points of each chapter. It is not
uncommon for a topic to appear in more than one chapter. We have
made an effort to revisit some key ideas more than once. Repetition
is useful for retention!
Exercises are listed at the end of each chapter. Solutions to selected
exercises are given in Appendix B. All solutions are available to
instructors and instructions for accessing these may be found on the
book’s website.
Appendix A provides an extensive glossary that can serve as a
review tool. We give brief definitions without equations, offering a
presentation different from that in the text. Also notable
is the robust index, which we hope will be very helpful, particularly
since we revisit topics throughout the text.
Classroom Use
PLA is meant to be used at the undergraduate level. It serves as an
introduction to linear algebra for engineers or computer scientists, as
well as a general introduction to geometry. It is also an ideal prepara-
tion for computer graphics and geometric modeling. We would argue
that it is also a perfect linear algebra entry point for mathematics
majors.
As a one-semester course, we recommend choosing a subset of the
material that meets the needs of the students. In the table below,
LA refers to an introductory linear algebra course and CG refers to
a course tailored to those planning to work in computer graphics or
geometric modeling.
Chapter                                              LA   CG
 1  Descartes' Discovery                              •    •
 2  Here and There: Points and Vectors in 2D          •    •
 3  Lining Up: 2D Lines                                    •
 4  Changing Shapes: Linear Maps in 2D                •    •
 5  2 × 2 Linear Systems                              •    •
 6  Moving Things Around: Affine Maps in 2D           •    •
 7  Eigen Things                                      •
 8  3D Geometry                                       •    •
 9  Linear Maps in 3D                                 •    •
10  Affine Maps in 3D                                 •    •
11  Interactions in 3D                                     •
12  Gauss for Linear Systems                          •    •
13  Alternative System Solvers                        •
14  General Linear Spaces                             •
15  Eigen Things Revisited                            •
16  The Singular Value Decomposition                  •
17  Breaking It Up: Triangles                              •
18  Putting Lines Together: Polylines and Polygons         •
19  Conics                                            •
20  Curves                                                 •
Website
Practical Linear Algebra: A Geometry Toolbox has a website:
http://www.farinhansford.com/books/pla/
On it you will find:
• teaching materials,
• additional material,
• Mathematica code,
• errata,
• and more!
Figure 1.1.
Local and global coordinate systems: the treasure’s local coordinates with respect to
the boat do not change as the boat moves. However, the treasure’s global coordinates,
defined relative to the lake, do change as the boat moves.
whose inhabitants were not known for their intelligence. Here is one
story [12]:
Example 1.1
\[ x_1 = \left(1 - \tfrac{1}{2}\right) \cdot 1 + \tfrac{1}{2} \cdot 3 = 2, \]
\[ x_2 = \left(1 - \tfrac{1}{2}\right) \cdot 3 + \tfrac{1}{2} \cdot 5 = 4. \]
This is the “midpoint” of the target box. You see here how the
geometry in the unit square is replicated in the target box.
Sketch 1.4.
Map local unit square to a target box.
A different way of writing (1.1) and (1.2) is as follows: Define
$\Delta_1 = \min_1$-to-$\max_1$ extent, that is, $\Delta_1 = \max_1 - \min_1$, and $\Delta_2 = \max_2 - \min_2$. Now we have
\[ x_1 = \min{}_1 + u_1 \Delta_1, \qquad (1.3) \]
\[ x_2 = \min{}_2 + u_2 \Delta_2. \qquad (1.4) \]
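To make (1.3) and (1.4) concrete, here is a minimal Python sketch of the unit-square-to-target-box map. The function name and argument layout are our own illustration, not from the text.

```python
def unit_to_box(u1, u2, min1, max1, min2, max2):
    """Map local coordinates (u1, u2) in the unit square to the
    target box [min1, max1] x [min2, max2], per (1.3) and (1.4)."""
    x1 = min1 + u1 * (max1 - min1)   # x1 = min1 + u1 * Delta1
    x2 = min2 + u2 * (max2 - min2)   # x2 = min2 + u2 * Delta2
    return x1, x2

# The midpoint of the unit square maps to the midpoint of the
# target box of Example 1.1:
print(unit_to_box(0.5, 0.5, 1.0, 3.0, 3.0, 5.0))  # (2.0, 4.0)
```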
Figure 1.2.
Target boxes: the letter D is mapped several times. Left: centered in the unit square. Right: not centered.
A note of caution: if the target box is not a square, then the object
from the local system will be distorted. We see this in the following
example, illustrated by Sketch 1.5. The target box is given by
You can see how the local object is stretched in the e1 -direction by
being put into the global system. Check for yourself that the corners
of the unit square (local) still get mapped to the corners of the target
box (global).
In general, if Δ1 > 1, then the object will be stretched in the e1 -
direction, and it will be shrunk if 0 < Δ1 < 1. The case of max1
smaller than min1 is not often encountered: it would result in a re-
versal of the object in the e1 -direction. The same applies, of course,
to the e2 -direction if max2 is smaller than min2 . An example of sev-
eral boxes containing the letter D is shown in Figure 1.2. Just for
fun, we have included one target box with max1 smaller than min1 !
Another characterization of the change of shape of the object may be made by looking at the change in aspect ratio, which is the ratio of the width to the height, $\Delta_1/\Delta_2$, for the target box. This is also written as $\Delta_1 : \Delta_2$. The aspect ratio in the local system is one. Revisiting
Example 1.1, the aspect ratio of the target box is one, therefore there
is no distortion of the letter D, although it is stretched uniformly in
both coordinates. In Sketch 1.5, a target box is given that has aspect
ratio 3, therefore the letter D is distorted.
Aspect ratios are encountered many times in everyday life. Televi-
sions and computer screens have recently changed from nearly square
4 : 3 to 16 : 9. Sketch 1.5 illustrates the kind of distortion that occurs
when an old format program is stretched to fill a new format screen.
(Normally a better solution is to not stretch the image and allow
for vertical black bars on either side of the image.) All international
(ISO A series) paper, regardless of size, has an aspect ratio of $1 : \sqrt{2}$.
Golden rectangles, formed based on the golden ratio $\phi = (1 + \sqrt{5})/2$
with an aspect ratio of 1 : φ, provide a pleasing and functional shape,
and found their way into art and architecture. Credit cards have an
aspect ratio of 8 : 5, but to fit into your wallet and card readers the
size is important as well.
This principle, by the way, acts strictly on a "don't need to know" basis: we do not need to know the relationship between the local and global systems. In many cases (as in the typesetting example), there actually isn't a known correspondence at the time the object in the local system is created. Of course, one must know where the actual object is located in the local unit square. If it is not nicely centered, we might have the situation shown in Figure 1.2 (right).

Sketch 1.5.
A distortion.
You experience this “unit square to target box” mapping whenever
you use a computer. When you open a window, you might want to
view a particular image in it. The image is stored in a local coordinate
system; if it is stored with extents (0, 0) and (1, 1), then it utilizes
normalized coordinates. The target box is now given by the extents
of your window, which are given in terms of screen coordinates and the
image is mapped to it using (1.1) and (1.2). Screen coordinates are
typically given in terms of pixels; a typical computer screen would
have about 1440 × 900 pixels, which has an aspect ratio of 8 : 5
or 1.6.
Example 1.2
Figure 1.3.
Selecting an icon: global to local coordinates.
These days, almost all engineering objects are designed using a Com-
puter-Aided Design (CAD) system. Every object is defined in a coor-
dinate system, and usually many individual objects need to be inte-
grated into one coordinate system. Take designing a large commercial
airplane, for example. It is defined in a three-dimensional (or 3D) co-
ordinate system with its origin at the frontmost part of the plane,
the e1 -axis pointing toward the rear, the e2 -axis pointing to the right
(that is, if you’re sitting in the plane), and the e3 -axis is pointing
upward. See Sketch 1.6.
Sketch 1.6.
Airplane coordinates.

Before the plane is built, it undergoes intense computer simulation in order to find its optimal shape. As an example, consider the
engines: these may vary in size, and their exact locations under the
wings need to be specified. An engine is defined in a local coordinate
system, and it is then moved to its proper location. This process
will have to be repeated for all engines. Another example would
be the seats in the plane: the manufacturer would design just one—
then multiple copies of it are put at the right locations in the plane’s
design.
Since the initial coordinates (u1 , u2 ) were not inside the unit square,
the mapped coordinates (x1 , x2 ) are not inside the target box. The
notion of mapping a square to a target box is a useful concept for mentally visualizing what is happening—but it is not actually a restriction on the coordinates that we can map!
Example 1.3
Without much belaboring, it is clear that the same holds for 3D. An ex-
ample should suffice: the target box is given by
in Section 11.8.
Figure 1.4.
Creating coordinates: a cat is turned into math. (Microscribe-3D from Immersion
Corporation, http://www.immersion.com.)
cat model with the tip of the CMM’s arm, it will associate three
coordinates with that position and record them. You repeat this
for several hundred points, and you have your cat in the box! This
process is called digitizing. In the end, the cat has been “discretized,”
or turned into a finite number of coordinate triples. This set of points
is called a point cloud.
Someone else will now have to build a mathematical model of your
cat. Next, the mathematical model will have to be put into scenes
of the movie—but all that’s needed for that are 3D coordinate trans-
formations! (See Chapters 9 and 10.)
1.6 Exercises
1. Let the coordinates of triangle vertices in the local [d1 , d2 ]-system unit
square be given by
(a) If the [d1 , d2 ]-system unit square is mapped to the target box with
2. Given local coordinates (2, 2) and (−1, −1), find the global coordinates
with respect to the target box with
Make a sketch of the local and global systems. Connect the coordinates
in each system with a line and compare.
3. Let the [d1 , d2 , d3 ]-system unit cube be mapped to the 3D target box
with
4. If the [d1 , d2 , d3 ]-system unit cube is mapped to the target box with
where are the coordinates of the triangle vertices mapped? Hint: See
Exercise 1a.
5. Suppose we are given a global frame defined by (min1 , min2 , min3 ) =
(0, 0, 3) and (max1 , max2 , max3 ) = (4, 4, 4). For the coordinates (1, 1, 3)
and (0, 0, 0) in this frame, what are the corresponding coordinates in
the [d1 , d2 , d3 ]-system?
6. Assume you have an image in a local frame of 20 mm². If you enlarge
the frame and the image such that the new frame covers 40 mm², by
how much does the image size change?
Figure 2.1.
Hurricane Katrina: the hurricane is shown here approaching south Louisiana. (Image
courtesy of NOAA, katrina.noaa.gov.)
Sketch 2.8.
Scaling of points is ambiguous.
Figure 2.2.
Vector field: simulating hurricane air velocity. Lighter gray indicates greater velocity.
vectors the same length and using gray scale or varying shades of gray
to indicate speed, the vector field can be more informative than the
photograph. (Visualization of a vector field requires discretizing it: a
finite number of point and vector pairs are selected from a continuous
field or from sampled measurements.)
Other important applications of vector fields arise in the areas of
automotive and aerospace design: before a car or an airplane is built,
it undergoes extensive aerodynamic simulations. In these simulations,
the vectors that characterize the flow around an object are computed
from complex differential equations. In Figure 2.3 we have another
example of a vector field.
Sketch 2.9.
Addition of points is ambiguous.
Figure 2.3.
Vector field: every sampled point has an associated vector. Lighter gray indicates
greater vector length.
This is also called the Euclidean norm. Notice that if we scale the vector by an amount $k$ then
\[ \|k\mathbf{v}\| = |k| \, \|\mathbf{v}\|. \qquad (2.5) \]
A normalized vector $\mathbf{w}$ has unit length, that is,
\[ \|\mathbf{w}\| = 1. \]
Normalized vectors are also known as unit vectors. To normalize a vector simply means to scale a vector so that it has unit length. If $\mathbf{w}$ is to be our unit length version of $\mathbf{v}$ then
\[ \mathbf{w} = \frac{\mathbf{v}}{\|\mathbf{v}\|}. \]
Each component of $\mathbf{v}$ is divided by the scalar value $\|\mathbf{v}\|$. This scalar value is always nonnegative, which means that its value is zero or greater. It can be zero! You must check the value before dividing to be sure it is greater than your zero divide tolerance. The zero divide tolerance is the absolute value of the smallest number by which you can divide confidently. (When we refer to checking that a value is greater than this number, it means to check the absolute value.)
In Figures 2.2 and 2.3, we display vectors of varying magnitudes.
But instead of plotting them using different lengths, their magnitude
is indicated by gray scales.
Example 2.1
Start with
\[ \mathbf{v} = \begin{bmatrix} 5 \\ 0 \end{bmatrix}. \]
Applying (2.4), $\|\mathbf{v}\| = \sqrt{5^2 + 0^2} = 5$. Then the normalized version of $\mathbf{v}$ is defined as
\[ \mathbf{w} = \begin{bmatrix} 5/5 \\ 0/5 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
Figure 2.4.
Unit vectors: they define a circle.
There are infinitely many unit vectors. Imagine drawing them all,
emanating from the origin. The figure that you will get is a circle of
radius one! See Figure 2.4.
To find the distance between two points we simply form a vector
defined by the two points, e.g., v = q − p, and apply (2.4).
Example 2.2
Let
\[ \mathbf{q} = \begin{bmatrix} -1 \\ 2 \end{bmatrix} \quad \text{and} \quad \mathbf{p} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
Then
\[ \mathbf{q} - \mathbf{p} = \begin{bmatrix} -2 \\ 2 \end{bmatrix} \]
and
\[ \|\mathbf{q} - \mathbf{p}\| = \sqrt{(-2)^2 + 2^2} = \sqrt{8} \approx 2.83. \]
Sketch 2.11 illustrates this example.
Sketch 2.11.
Distance between two points.
\[ \mathbf{r} = \mathbf{p} + t\mathbf{v} \qquad (2.6) \]
\[ \mathbf{r} = \mathbf{p} + t(\mathbf{q} - \mathbf{p}) \]
and then
\[ \mathbf{r} = (1 - t)\mathbf{p} + t\mathbf{q}. \qquad (2.7) \]
Sketch 2.13 gives an example with $t = 1/3$.

Sketch 2.13.
Barycentric combinations: t = 1/3.

The scalar values $(1 - t)$ and $t$ are coefficients. A weighted sum of points where the coefficients sum to one is called a barycentric combination. In this special case, where one point $\mathbf{r}$ is being expressed in terms of two others, $\mathbf{p}$ and $\mathbf{q}$, the coefficients $1 - t$ and $t$ are called the barycentric coordinates of $\mathbf{r}$.
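In code, a barycentric combination of two points is one line per coordinate. The following Python sketch (our own illustration) evaluates (2.7).

```python
def barycentric_point(p, q, t):
    """Return r = (1 - t) p + t q, per (2.7); p and q are (x, y) tuples."""
    return ((1 - t) * p[0] + t * q[0],
            (1 - t) * p[1] + t * q[1])

# t = 1/3 places r one third of the way from p to q (cf. Sketch 2.13):
print(barycentric_point((0.0, 0.0), (3.0, 3.0), 1/3))  # (1.0, 1.0)
```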
Example 2.3
2.6 Independence
Two vectors v and w describe a parallelogram, as shown in Sketch 2.6.
It may happen that this parallelogram has zero area; then the two
vectors are parallel. In this case, we have a relationship of the form
v = cw. If two vectors are parallel, then we call them linearly depen-
dent. Otherwise, we say that they are linearly independent.
Two linearly independent vectors may be used to write any other
vector u as a linear combination:
u = rv + sw.
How to find r and s is described in Chapter 5. Two linearly inde-
pendent vectors in 2D are also called a basis for R2 . If v and w
are linearly dependent, then you cannot write all vectors as a linear
combination of them, as the following example shows.
Example 2.4
Let
\[ \mathbf{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \quad \text{and} \quad \mathbf{w} = \begin{bmatrix} 2 \\ 4 \end{bmatrix}. \]
If we tried to write the vector
\[ \mathbf{u} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
as $\mathbf{u} = r\mathbf{v} + s\mathbf{w}$, then this would lead to
\[ \begin{aligned} 1 &= r + 2s, \\ 0 &= 2r + 4s. \end{aligned} \qquad (2.9), (2.10) \]
and then expanding, bringing all terms to the left-hand side of the equation yields
\[ (v_1^2 - 2v_1w_1 + w_1^2) + (v_2^2 - 2v_2w_2 + w_2^2) - (v_1^2 + v_2^2) - (w_1^2 + w_2^2) = 0, \]
which reduces to
\[ v_1w_1 + v_2w_2 = 0. \qquad (2.12) \]
We find that perpendicular vectors have the property that the sum
of the products of their components is zero. The short-hand vector
notation for (2.12) is
v · w = 0. (2.13)
This result has an immediate application: a vector $\mathbf{w}$ perpendicular to a given vector $\mathbf{v}$ can be formed as
\[ \mathbf{w} = \begin{bmatrix} -v_2 \\ v_1 \end{bmatrix} \]
\[ s = \mathbf{v} \cdot \mathbf{w} = v_1w_1 + v_2w_2 \qquad (2.14) \]
to be the dot product of v and w. Notice that the dot product returns
a scalar s, which is why it is also called a scalar product. (Mathemati-
cians have yet another name for the dot product—an inner product.
See Section 14.3 for more on these.) From (2.14) it is clear that
v · w = w · v.
We have just proved the Law of Cosines, which generalizes the Pythag-
orean theorem by correcting it for triangles with an opposing angle
different from 90◦ .
We can formulate another expression for $\|\mathbf{v} - \mathbf{w}\|^2$ by explicitly writing out
\[ \|\mathbf{v} - \mathbf{w}\|^2 = (\mathbf{v} - \mathbf{w}) \cdot (\mathbf{v} - \mathbf{w}) = \|\mathbf{v}\|^2 - 2\mathbf{v} \cdot \mathbf{w} + \|\mathbf{w}\|^2. \qquad (2.18) \]
Here is another expression for the dot product—it is a very useful one!
Rearranging (2.19), the cosine of the angle between the two vectors can be determined as
\[ \cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \, \|\mathbf{w}\|}. \qquad (2.20) \]
Figure 2.5.
Cosine function: its values at θ = 0°, θ = 90°, and θ = 180° are important to remember.
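Equation (2.20) translates directly into code. A minimal Python sketch follows; the clamp before acos is our own guard against round-off pushing the cosine slightly outside [-1, 1].

```python
import math

def angle_between(v, w):
    """Angle (in radians) between 2D vectors v and w, per (2.20)."""
    c = (v[0] * w[0] + v[1] * w[1]) / (math.hypot(*v) * math.hypot(*w))
    c = max(-1.0, min(1.0, c))  # clamp against round-off
    return math.acos(c)

print(math.degrees(angle_between((1, 0), (0, 1))))  # 90.0
```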
\[ \cos\theta = \frac{k\|\mathbf{w}\|^2}{|k| \, \|\mathbf{w}\| \, \|\mathbf{w}\|} = \pm 1. \]
• right: cos(θ) = 0 → v · w = 0;
Example 2.5
\[ \mathbf{u} = \operatorname{proj}_{V_1}\mathbf{w}, \]
\[ \mathbf{w} = \mathbf{u} + \mathbf{u}^{\perp}. \qquad (2.22) \]
\[ \mathbf{u}^{\perp} = \mathbf{w} - \operatorname{proj}_{V_1}\mathbf{w}, \]
2.9 Inequalities
Here are two important inequalities when dealing with vector lengths.
Let's start with the expression from (2.19), i.e.,
\[ \mathbf{v} \cdot \mathbf{w} = \|\mathbf{v}\| \, \|\mathbf{w}\| \cos\theta. \]
\[ \begin{aligned} \|\mathbf{v} + \mathbf{w}\|^2 &= (\mathbf{v} + \mathbf{w}) \cdot (\mathbf{v} + \mathbf{w}) \\ &= \mathbf{v} \cdot \mathbf{v} + 2\mathbf{v} \cdot \mathbf{w} + \mathbf{w} \cdot \mathbf{w} \\ &\leq \mathbf{v} \cdot \mathbf{v} + 2|\mathbf{v} \cdot \mathbf{w}| + \mathbf{w} \cdot \mathbf{w} \\ &\leq \mathbf{v} \cdot \mathbf{v} + 2\|\mathbf{v}\|\,\|\mathbf{w}\| + \mathbf{w} \cdot \mathbf{w} \\ &= \|\mathbf{v}\|^2 + 2\|\mathbf{v}\|\,\|\mathbf{w}\| + \|\mathbf{w}\|^2 \\ &= (\|\mathbf{v}\| + \|\mathbf{w}\|)^2. \end{aligned} \qquad (2.24) \]
Taking square roots on both sides yields the triangle inequality $\|\mathbf{v} + \mathbf{w}\| \leq \|\mathbf{v}\| + \|\mathbf{w}\|$.
2.10 Exercises
1. Illustrate the parallelogram rule applied to the vectors
\[ \mathbf{v} = \begin{bmatrix} -2 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{w} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}. \]
6. Illustrate a point with barycentric coordinates (1/2, 1/4, 1/4) with re-
spect to three other points.
7. Consider two points. Form the set of all convex combinations of these
points. What is the geometry of this set?
8. Consider three noncollinear points. Form the set of all convex combi-
nations of these points. What is the geometry of this set?
9. What is the length of the vector
\[ \mathbf{v} = \begin{bmatrix} -4 \\ -3 \end{bmatrix}? \]
11. If a vector v is length 10, then what is the length of the vector −2v?
12. Find the distance between the points
\[ \mathbf{p} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} \quad \text{and} \quad \mathbf{q} = \begin{bmatrix} -2 \\ -3 \end{bmatrix}. \]
23. Show that the dot product has the following properties for vectors
u, v, w ∈ R2 .
$\mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u}$ (symmetric)
29. For
\[ \mathbf{v} = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \quad \text{and} \quad \mathbf{w} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} \]
find $\mathbf{u} = \operatorname{proj}_{V_1}\mathbf{w}$, where $V_1$ is the set of all 2D vectors $k\mathbf{v}$, and find $\mathbf{u}^{\perp} = \mathbf{w} - \operatorname{proj}_{V_1}\mathbf{w}$.
Figure 3.1.
Moiré patterns: overlaying two sets of lines at an angle results in an interesting pattern.
for lines, where each is suited for particular applications. Once we can
represent a line, we can perform intersections and determine distances
from a line.
Figure 3.1 shows how interesting playing with lines can be. Two
sets of parallel lines are overlaid and the resulting interference pattern
is called a Moiré pattern. Such patterns are used in optics for checking
the properties of lenses.
• two points;
Sketch 3.1.
Elements to define a line.

The unit vector that is perpendicular (or orthogonal) to a line is referred to as the normal to the line. Figure 3.2 shows two families of lines: one family of lines shares a common point and the other family of lines shares the same normal. Just as there are different ways to specify a line geometrically, there are different mathematical representations: parametric, implicit, and explicit. Each representation will be examined and the advantages of each will be explained.
Additionally, we will explore how to convert from one form to another.
Figure 3.2.
Families of lines: one family shares a common point and the other shares a common
normal.
t = i/9, i = 0, . . . , 9.
Example 3.1
t = i/4, i = 0, . . . , 4.
As you can see, the position of the point p and the direction and
length of the vector v determine which points on the line are gener-
ated as we increment through t ∈ [0, 1]. This particular artifact of
the parametric equation of a line is called the parametrization. The
parametrization is related to the speed at which a point traverses
the line. We may affect this speed by scaling v: the larger the scale
factor, the faster the point’s motion!
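The effect of the parametrization is easy to see in a short Python sketch (our own illustration): scaling v changes the spacing of the generated points, but not the line itself.

```python
def line_points(p, v, n):
    """Evaluate l(t) = p + t v at t = i/n for i = 0, ..., n."""
    return [(p[0] + (i / n) * v[0], p[1] + (i / n) * v[1])
            for i in range(n + 1)]

print(line_points((0, 0), (1, 1), 4))  # points 0.25 apart in each coordinate
print(line_points((0, 0), (2, 2), 4))  # same line, twice the spacing
```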
\[ \mathbf{a} \cdot (\mathbf{x} - \mathbf{p}) = 0. \qquad (3.3) \]
Example 3.2
The implicit form is very useful for deciding if an arbitrary point lies
on the line. To test if a point x is on the line, just plug its coordinates
into (3.4). If the value f of the left-hand side of this equation,
f = ax1 + bx2 + c,
reflects the true distance of x to the line. Now the tolerance has a
physical meaning, which makes it much easier to specify. Sketch 3.3
illustrates the physical relationship of this tolerance to the line.
The sign of d indicates on which side of the line the point lies. This
sign is dependent upon the definition of a. (Remember, there were
two possible orientations.) Positive d corresponds to the point on the
side of the line to which a points.
Example 3.3
\[ -2x_1 + 4x_2 - 4 = 0, \]
The distance is
\[ d = (-2 \times 0 + 4 \times 1 - 4)/\sqrt{20} = 0/\sqrt{20} = 0, \]
\[ \frac{ax_1 + bx_2 + c}{\|\mathbf{a}\|} = 0, \]
which we know as the point normal form. The point normal form of the line from the example above is
\[ -\frac{2}{\sqrt{20}}x_1 + \frac{4}{\sqrt{20}}x_2 - \frac{4}{\sqrt{20}} = 0. \]
Examining (3.4) you might notice that a horizontal line takes the
form
bx2 + c = 0.
This line intersects the e2 -axis at −c/b. A vertical line takes the form
ax1 + c = 0.
This line intersects the e1 -axis at −c/a. Using the implicit form, these
lines are in no need of special handling.
\[ x_2 = \hat{a}x_1 + \hat{b}, \]
\[ x_2 = \tfrac{1}{3}x_1 + 1. \]
The slope measures the steepness of the line as a ratio of the change
in x2 to a change in x1 : “rise/run,” or more precisely tan(θ). The
e2 -intercept indicates that the line passes through (0, b̂).
Sketch 3.5.
A line in explicit form.

Immediately, a drawback of the explicit form is apparent. If the "run" is zero then the (vertical) line has infinite slope. This makes life very difficult when programming! When we study transformations (e.g., changing the orientation of some geometry) in Chapter 6, we will see that infinite slopes actually arise often.
The primary popularity of the explicit form comes from the study
of calculus. Additionally, in computer graphics, this form is popular
when pixel calculation is necessary. Examples are Bresenham’s line
drawing algorithm and scan line polygon fill algorithms (see [10]).
l : l(t) = p + tv.
c = −(a1 p1 + a2 p2 ).
Given: A line in implicit form,
\[ \mathbf{l}: ax_1 + bx_2 + c = 0. \]
Find: The same line in parametric form,
\[ \mathbf{l}: \mathbf{l}(t) = \mathbf{p} + t\mathbf{v}. \]
Solution: Recognize that we need one point on the line and a vector parallel to the line. The vector is easy: simply form a vector perpendicular to $\mathbf{a}$ of the implicit line. For example, we could set
\[ \mathbf{v} = \begin{bmatrix} b \\ -a \end{bmatrix}. \]
Next, find a point on the line. Two candidate points are the intersections with the $\mathbf{e}_1$- or $\mathbf{e}_2$-axis,
\[ \begin{bmatrix} -c/a \\ 0 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 0 \\ -c/b \end{bmatrix}, \]
respectively. For numerical stability, let's choose the intersection closest to the origin. Thus, we choose the former if $|a| > |b|$, and the latter otherwise.
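The conversion recipe just described fits in a few lines of Python. This sketch (our own illustration) returns the point and direction vector of the parametric form.

```python
def implicit_to_parametric(a, b, c):
    """Convert the line a*x1 + b*x2 + c = 0 to a point p and a
    direction v.  v = [b, -a] is perpendicular to the normal [a, b];
    p is the axis intersection closest to the origin."""
    v = (b, -a)
    if abs(a) > abs(b):
        p = (-c / a, 0.0)   # intersection with the e1-axis
    else:
        p = (0.0, -c / b)   # intersection with the e2-axis
    return p, v

print(implicit_to_parametric(-2.0, 4.0, -4.0))  # ((0.0, 1.0), (4.0, 2.0))
```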
Example 3.4
Figure 3.3.
Left: robot path application where clearance around the robot’s path must be mea-
sured. Right: measuring the perpendicular distance of each point to the line.
\[ \mathbf{a} \cdot (\mathbf{x} - \mathbf{p}) = 0, \]
\[ v = \mathbf{a} \cdot (\mathbf{r} - \mathbf{p}). \]
\[ \mathbf{v} \cdot \mathbf{w} = \|\mathbf{v}\| \, \|\mathbf{w}\| \cos\theta. \]
The right triangle in Sketch 3.7 allows for an expression for $\cos(\theta)$ as
\[ \cos(\theta) = \frac{d}{\|\mathbf{w}\|}. \]
Substituting this into (3.10), we have
\[ v = \|\mathbf{a}\| \, d. \]
This indicates that the actual distance of $\mathbf{r}$ to the line is
\[ d = \frac{v}{\|\mathbf{a}\|} = \frac{\mathbf{a} \cdot (\mathbf{r} - \mathbf{p})}{\|\mathbf{a}\|} = \frac{ar_1 + br_2 + c}{\|\mathbf{a}\|}. \qquad (3.11) \]
If many points will be checked against a line, it is advantageous to store the line in point normal form. This means that $\|\mathbf{a}\| = 1$, eliminating the division in (3.11).
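Here is (3.11) as a minimal Python sketch (our own illustration). The returned value is signed, so it also tells on which side of the line the point lies.

```python
import math

def distance_to_line(a, b, c, r):
    """Signed distance of point r = (r1, r2) to the line
    a*x1 + b*x2 + c = 0, per (3.11)."""
    return (a * r[0] + b * r[1] + c) / math.hypot(a, b)

# The point (0, 1) lies on the line -2*x1 + 4*x2 - 4 = 0 of Example 3.3:
print(distance_to_line(-2.0, 4.0, -4.0, (0.0, 1.0)))  # 0.0
```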
Example 3.5
Find: d(r, l), or d for brevity. Again, this is illustrated in Sketch 3.7.
Solution: Form the vector $\mathbf{w} = \mathbf{r} - \mathbf{p}$. Use the relationship
\[ d = \|\mathbf{w}\| \sin(\alpha). \]
Later in Section 8.2, we will see how to express $\sin(\alpha)$ directly in terms of $\mathbf{v}$ and $\mathbf{w}$; for now, we express it in terms of the cosine:
\[ \sin(\alpha) = \sqrt{1 - \cos(\alpha)^2}, \]
and as before
\[ \cos(\alpha) = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \, \|\mathbf{w}\|}. \]
Thus, we have defined the distance $d$.
Example 3.6
We'll use the same line as in the previous example, but now it will be given in parametric form as
\[ \mathbf{l}(t) = \begin{bmatrix} 0 \\ 4 \end{bmatrix} + t\begin{bmatrix} 2 \\ -4 \end{bmatrix}. \]
We'll also use the same point
\[ \mathbf{r} = \begin{bmatrix} 5 \\ 3 \end{bmatrix}. \]
Add any new vectors for this example to the sketch you drew for the previous example.
First create the vector
\[ \mathbf{w} = \begin{bmatrix} 5 \\ 3 \end{bmatrix} - \begin{bmatrix} 0 \\ 4 \end{bmatrix} = \begin{bmatrix} 5 \\ -1 \end{bmatrix}. \]
Next calculate $\|\mathbf{w}\| = \sqrt{26}$ and $\|\mathbf{v}\| = \sqrt{20}$. Compute
\[ \cos(\alpha) = \frac{\begin{bmatrix} 2 \\ -4 \end{bmatrix} \cdot \begin{bmatrix} 5 \\ -1 \end{bmatrix}}{\sqrt{26}\sqrt{20}} \approx 0.614. \]
Find: The point q on the line that is closest to r. (See Sketch 3.8.)
\[ \mathbf{q} = \mathbf{p} + t\mathbf{v}, \qquad (3.12) \]
so our problem is solved once we have found the scalar factor $t$. From Sketch 3.8, we see that
\[ \cos(\theta) = \frac{t\|\mathbf{v}\|}{\|\mathbf{w}\|}, \]
where $\mathbf{w} = \mathbf{r} - \mathbf{p}$. Using
\[ \cos(\theta) = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \, \|\mathbf{w}\|}, \]
Example 3.7
Given: The parametric line $\mathbf{l}$ defined as
\[ \mathbf{l}(t) = \begin{bmatrix} 0 \\ 1 \end{bmatrix} + t\begin{bmatrix} 0 \\ 2 \end{bmatrix}, \]
and point
\[ \mathbf{r} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}. \]
Find: The point q on l that is closest to r. This example is easy
enough to find the answer by simply drawing a sketch, but let’s go
through the steps.
• Do you want a parameter value on one or both lines for the inter-
section point?
The particular question(s) you want to answer along with the line
representation(s) will determine the best method for solving the in-
tersection problem.
Figure 3.4.
Intersecting lines: the top figure may be drawn without knowing where the shown lines
intersect. By finding line/line intersections (bottom), it is possible to color areas—
creating an artistic image!
This is one equation and one unknown! Just solve for $\hat{t}$,
\[ \hat{t} = \frac{-c - ap_1 - bp_2}{av_1 + bv_2}, \qquad (3.13) \]
then $\mathbf{i} = \mathbf{l}(\hat{t})$.
But wait—we must check if the denominator of (3.13) is zero before carrying out this calculation. Besides causing havoc numerically, what else does a zero denominator imply? The denominator
\[ \mathrm{denom} = av_1 + bv_2 \]
can be rewritten as
\[ \mathrm{denom} = \mathbf{a} \cdot \mathbf{v}. \]
We know from Section 2.7 that a zero dot product implies that two vectors are perpendicular. Since $\mathbf{a}$ is perpendicular to the line $\mathbf{l}_2$ in implicit form, the lines are parallel if
\[ \mathbf{a} \cdot \mathbf{v} = 0. \]
Of course, we always check for equality within a tolerance. A physically meaningful tolerance is best. Thus, it is better to check the quantity
\[ \cos(\theta) = \frac{\mathbf{a} \cdot \mathbf{v}}{\|\mathbf{a}\| \, \|\mathbf{v}\|}; \qquad (3.14) \]
the tolerance will be the cosine of an angle. It usually suffices to use a tolerance between cos(0.1°) and cos(0.5°). Angle tolerances are particularly nice to have because they are dimension independent. Note that we do not need to use the actual angle, just the cosine of the angle.
If the test in (3.14) indicates the lines are parallel, then we might
want to determine if the lines are identical. By simply plugging in
the coordinates of p into the equation of l2 , and computing, we get
\[ d = \frac{ap_1 + bp_2 + c}{\|\mathbf{a}\|}. \]
If d is equal to zero (within tolerance), then the lines are identical.
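Putting (3.13) and the tolerance test together, a minimal Python sketch of the parametric-with-implicit intersection might look as follows. Treating directions within 0.1 degrees of perpendicular to a as parallel is our own illustrative choice.

```python
import math

def intersect(p, v, a, b, c, tol_deg=0.1):
    """Intersect l1(t) = p + t v with l2: a*x1 + b*x2 + c = 0 via (3.13).
    Returns None when the lines are (nearly) parallel, i.e., when the
    cosine check of (3.14) says a is almost perpendicular to v."""
    cos_theta = (a * v[0] + b * v[1]) / (math.hypot(a, b) * math.hypot(*v))
    if abs(cos_theta) < math.cos(math.radians(90.0 - tol_deg)):
        return None  # parallel, possibly identical, lines
    t = (-c - a * p[0] - b * p[1]) / (a * v[0] + b * v[1])
    return (p[0] + t * v[0], p[1] + t * v[1])
```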
Example 3.8
Find: The intersection point i. Create your own sketch and try to
predict what the answer should be.
Solution: Find the parameter t̂ for l1 as given in (3.13). First check
the denominator:
\[ \mathbf{p} + \hat{t}\mathbf{v} = \mathbf{q} + \hat{s}\mathbf{w}. \qquad (3.15) \]
We have two equations (one for each coordinate) and two unknowns
t̂ and ŝ. To solve for the unknowns, we could formulate an expres-
sion for t̂ using the first equation, and substitute this expression into
the second equation. This then generates a solution for ŝ. Use this
solution in the expression for t̂, and solve for t̂. (Equations like this
are treated systematically in Chapter 5.) Once we have t̂ and ŝ, the
intersection point is found by inserting one of these values into its
respective parametric line equation.
If the vectors v and w are linearly dependent, as discussed in Sec-
tion 2.6, then it will not be possible to find a unique t̂ and ŝ. The
lines are parallel and possibly identical.
Example 3.9
Find: The intersection point $\mathbf{i}$. This means that we need to find $\hat{t}$ and $\hat{s}$ such that $\mathbf{l}_1(\hat{t}) = \mathbf{l}_2(\hat{s})$. Again, create your own sketch and try to predict the answer.
Solution: Set up the two equations with two unknowns as in (3.15).
\[ \hat{t}\begin{bmatrix} -2 \\ -1 \end{bmatrix} - \hat{s}\begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \end{bmatrix} - \begin{bmatrix} 0 \\ 3 \end{bmatrix}. \]
Example 3.10
3.9 Exercises
1. Give the parametric form of the line l(t) defined by points
\[ \mathbf{p} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{q} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}, \]
7. For the points in Exercise 6, if a point does not lie on the line, calculate
the distance from the line.
8. What is the point normal form of the line in Exercise 5?
9. What is the implicit equation of the horizontal line through x2 = −1/2?
What is the implicit equation of the vertical line through x1 = 1/2?
10. What is the explicit equation of the line 6x1 + 3x2 + 3 = 0?
11. What is the slope and e2 -intercept of the line x2 = 4x1 − 1?
12. What is the explicit equation of the horizontal line through x2 = −1/2?
What is the explicit equation of the vertical line through x1 = 1/2?
13. What is the explicit equation of the line with zero slope with e2 -
intercept 3? What is the explicit equation of the line with slope 2
that passes through the origin?
14. What is the implicit equation of the line
\[ \mathbf{l}(t) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} + t\begin{bmatrix} -2 \\ 3 \end{bmatrix}? \]
find the intersection point using each of the three methods in Sec-
tion 3.8.
24. Find the intersection of the lines
\[ \mathbf{l}_1: \mathbf{l}_1(t) = \begin{bmatrix} 0 \\ -1 \end{bmatrix} + t\begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{l}_2: -x_1 + x_2 + 1 = 0. \]
26. Find the closest point on the line $\mathbf{l}(t) = \mathbf{p} + t\mathbf{v}$, where
\[ \mathbf{l}(t) = \begin{bmatrix} 2 \\ 2 \end{bmatrix} + t\begin{bmatrix} 2 \\ 2 \end{bmatrix} \]
to the point
\[ \mathbf{r} = \begin{bmatrix} 0 \\ 2 \end{bmatrix}. \]
27. The line $\mathbf{l}(t)$ passes through the points
\[ \mathbf{p} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \quad \text{and} \quad \mathbf{q} = \begin{bmatrix} 4 \\ 2 \end{bmatrix}. \]
Figure 4.1.
Linear maps in 2D: an interesting geometric figure constructed by applying 2D linear
maps to a square.
Geometry always has two parts to it: one part is the description of
the objects that can be generated; the other investigates how these
\[ \mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2. \qquad (4.1) \]
\[ \mathbf{v}' = v_1\mathbf{a}_1 + v_2\mathbf{a}_2, \qquad (4.2) \]

Sketch 4.1.
A skew target box defined by a1 and a2.

as illustrated by Sketch 4.1. This simply states that we duplicate the [e1, e2]-geometry in the [a1, a2]-system: the linear map transforms e1, e2, v to a1, a2, v', respectively. The components of v' are given in the context of the [e1, e2]-system. However, the components of v' with respect to the [a1, a2]-system are the components of v. Reviewing a definition from Section 2.6, we recall that (4.2) is called a linear combination.
Example 4.1
Let’s look at an example of the action of the map from the linear
combination in (4.2). Let the origin and
\[ \mathbf{a}_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \quad \mathbf{a}_2 = \begin{bmatrix} -2 \\ 4 \end{bmatrix} \]
Av + Bv = (A + B)v.
\[ [A + B]^T = A^T + B^T. \qquad (4.9) \]
There are no restrictions on the diagonal elements, but all other el-
ements are equal to the element about the diagonal with reversed
indices. For a 2 × 2 matrix, this means that a2,1 = a1,2 .
With matrix notation, we can now continue the discussion of in-
dependence from Section 2.6. The columns of a matrix define an
[a1 , a2 ]-system. If the vectors a1 and a2 are linearly independent
then the matrix is said to have full rank, or for the 2 × 2 case, the
matrix has rank 2. If a1 and a2 are linearly dependent then the ma-
trix has rank 1. These two statements may be summarized as: the
rank of a 2 × 2 matrix equals the number of linearly independent
column (row) vectors. Matrices that do not have full rank are called
rank deficient or singular. We will encounter an important example
of a rank deficient matrix, a projection matrix, in Section 4.8. The
only matrix with rank zero is the zero matrix, a matrix with all zero
entries. The 2 × 2 zero matrix is
\[ \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}. \]
Let’s break this statement down into the two basic elements: scalar
multiplication and addition. For the sake of concreteness, we shall use
the example
\[ A = \begin{bmatrix} -1 & 1/2 \\ 0 & -1/2 \end{bmatrix}, \quad \mathbf{u} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} -1 \\ 4 \end{bmatrix}. \]
Sketch 4.2.
Matrices preserve scalings.

We may multiply all elements of a matrix by one factor; we then say that we have multiplied the matrix by that factor. Using our example, we may multiply the matrix $A$ by a factor, say 2:
\[ 2 \times \begin{bmatrix} -1 & 1/2 \\ 0 & -1/2 \end{bmatrix} = \begin{bmatrix} -2 & 1 \\ 0 & -1 \end{bmatrix}. \]
When we say that matrices preserve multiplication by scalar factors
we mean that if we scale a vector by a factor c, then its image will
also be scaled by c:
A(cu) = cAu.
Example 4.2
\[ \begin{bmatrix} -1 & 1/2 \\ 0 & -1/2 \end{bmatrix}\left(2 \times \begin{bmatrix} 1 \\ 2 \end{bmatrix}\right) = 2 \times \begin{bmatrix} -1 & 1/2 \\ 0 & -1/2 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ -2 \end{bmatrix}. \]
\[ A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v}. \]

Sketch 4.3.
Matrices preserve sums.

This is also called the distributive law. Sketch 4.3 illustrates this property (with a different set of $A$, $\mathbf{u}$, $\mathbf{v}$).
Example 4.3
\[ A(3\mathbf{u} + 2\mathbf{v}) = \begin{bmatrix} -1 & 1/2 \\ 0 & -1/2 \end{bmatrix}\left(3\begin{bmatrix} 1 \\ 2 \end{bmatrix} + 2\begin{bmatrix} -1 \\ 4 \end{bmatrix}\right) = \begin{bmatrix} -1 & 1/2 \\ 0 & -1/2 \end{bmatrix}\begin{bmatrix} 1 \\ 14 \end{bmatrix} = \begin{bmatrix} 6 \\ -7 \end{bmatrix}. \]
\[ 3A\mathbf{u} + 2A\mathbf{v} = 3\begin{bmatrix} -1 & 1/2 \\ 0 & -1/2 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} + 2\begin{bmatrix} -1 & 1/2 \\ 0 & -1/2 \end{bmatrix}\begin{bmatrix} -1 \\ 4 \end{bmatrix} = \begin{bmatrix} 0 \\ -3 \end{bmatrix} + \begin{bmatrix} 6 \\ -4 \end{bmatrix} = \begin{bmatrix} 6 \\ -7 \end{bmatrix}. \]
4.4 Scalings
Consider the linear map given by
\[ \mathbf{v}' = \begin{bmatrix} 1/2 & 0 \\ 0 & 1/2 \end{bmatrix}\mathbf{v} = \begin{bmatrix} v_1/2 \\ v_2/2 \end{bmatrix}. \qquad (4.12) \]
Figure 4.2.
Scaling: a uniform scaling.
been mapped to the vectors $\mathbf{a}_1$ and $\mathbf{a}_2$. These are the column vectors of the matrix in (4.12),
\[ \mathbf{a}_1 = \begin{bmatrix} 1/2 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{a}_2 = \begin{bmatrix} 0 \\ 1/2 \end{bmatrix}. \]
Figure 4.3.
Scaling: a nonuniform scaling.
apply that scaling to the e2 -direction. The total effect is thus a factor
of s1,1 s2,2 .
You can see this from Figure 4.2 by mentally constructing the
square spanned by e1 and e2 and comparing its area to the rect-
angle spanned by the image vectors. It is also interesting to note
that, in Figure 4.3, the scaling factors result in no change of area,
although a distortion did occur.
The distortion of the circular Phoenix that we see in Figure 4.3
is actually well-defined—it is an ellipse! In fact, all 2 × 2 matrices
will map circles to ellipses. (In higher dimensions, we will speak of
ellipsoids.) We will refer to this ellipse that characterizes the action
of the matrix as the action ellipse.1 In Figure 4.2, the action ellipse
is a scaled circle, which is a special case of an ellipse. In Chapter 16,
we will relate the shape of the ellipse to the linear map.
1 We will study ellipses in Chapter 19. An ellipse is symmetric about two axes
that intersect at the center of the ellipse. The longer axis is called the major axis
and the shorter axis is called the minor axis. The semi-major and semi-minor
axes are one-half their respective axes.
Figure 4.4.
Reflections: a reflection about the e1 -axis.
4.5 Reflections
Figure 4.5.
Reflections: a reflection about the line x1 = x2 .
Figure 4.6.
Reflections: a reflection about both axes is also a rotation of 180◦ .
4.6 Rotations
The notion of rotating a vector around the origin is intuitively clear,
but a corresponding matrix takes a few moments to construct. To
keep it easy at the beginning, let us rotate the unit vector
\[ \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
Figure 4.7.
Rotations: a rotation by 45◦ .
But let's verify that we have already found the solution to the general rotation problem.
Let $\mathbf{v}$ be an arbitrary vector. We claim that the matrix $R$ from (4.16) will rotate it by $\alpha$ degrees to a new vector $\mathbf{v}'$. If this is so, then we must have
\[ \mathbf{v} \cdot \mathbf{v}' = \|\mathbf{v}\|^2 \cos\alpha \]
according to the rules of dot products (see Section 2.7). Here, we made use of the fact that a rotation does not change the length of a vector, i.e., $\|\mathbf{v}'\| = \|\mathbf{v}\|$.
Since
\[ \mathbf{v}' = \begin{bmatrix} v_1\cos\alpha - v_2\sin\alpha \\ v_1\sin\alpha + v_2\cos\alpha \end{bmatrix}, \]
the dot product $\mathbf{v} \cdot \mathbf{v}'$ is given by
\[ \begin{aligned} \mathbf{v} \cdot \mathbf{v}' &= v_1^2\cos\alpha - v_1v_2\sin\alpha + v_1v_2\sin\alpha + v_2^2\cos\alpha \\ &= (v_1^2 + v_2^2)\cos\alpha \\ &= \|\mathbf{v}\|^2\cos\alpha, \end{aligned} \]
and all is shown! See Figure 4.7 for an illustration. There, α = 45°.
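As a quick numerical check of the rotation matrix, here is a small Python sketch (our own illustration, not code from the text):

```python
import math

def rotate(v, alpha_deg):
    """Rotate the 2D vector v counterclockwise by alpha degrees,
    using the rotation matrix (4.16)."""
    a = math.radians(alpha_deg)
    return (v[0] * math.cos(a) - v[1] * math.sin(a),
            v[0] * math.sin(a) + v[1] * math.cos(a))

# Rotating e1 by 45 degrees, as in Figure 4.7:
print(rotate((1.0, 0.0), 45.0))  # approximately (0.7071, 0.7071)
```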
4.7 Shears
What map takes a rectangle to a parallelogram? Pictorially, one such
map is shown in Sketch 4.6.
Sketch 4.6.
A special shear.

In this example, we have a map:
\[ \mathbf{v} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \longrightarrow \mathbf{v}' = \begin{bmatrix} d_1 \\ 1 \end{bmatrix}. \]
In matrix form, this is realized by
\[ \begin{bmatrix} d_1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & d_1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix}. \qquad (4.17) \]
Verify! The 2 × 2 matrix in this equation is called a shear matrix. It is the kind of matrix that is used when you generate italic fonts from standard ones.
A shear matrix may be applied to arbitrary vectors. If $\mathbf{v}$ is an input vector, then a shear maps it to $\mathbf{v}'$:
\[ \mathbf{v}' = \begin{bmatrix} 1 & d_1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} v_1 + d_1v_2 \\ v_2 \end{bmatrix}, \]
as illustrated in Figure 4.8. Clearly, the circular Phoenix is mapped
to an elliptical one.
We have so far restricted ourselves to shears along the e1 -axis; we
may also shear along the e2 -axis. Then we would have
\[ \mathbf{v}' = \begin{bmatrix} 1 & 0 \\ d_2 & 1 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} v_1 \\ d_2v_1 + v_2 \end{bmatrix}, \]
as illustrated in Figure 4.9.
Since it will be needed later, we look at the following. What is the
shear that achieves
\[ \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \longrightarrow \mathbf{v}' = \begin{bmatrix} v_1 \\ 0 \end{bmatrix}? \]
Figure 4.8.
Shears: shearing parallel to the e1 -axis.
Figure 4.9.
Shears: shearing parallel to the e2 -axis.
4.8 Projections
Projections—parallel projections, for our purposes—act like sunlight
casting shadows. Parallel projections are characterized by the fact
that all vectors are projected in a parallel direction. In 2D, all vec-
tors are projected onto a line. If the angle of incidence with the line
is ninety degrees then it is an orthogonal projection, otherwise it is
an oblique projection. In linear algebra, orthogonal projections are very important; as we have already seen in Section 2.8, they give us a best approximation in a particular subspace. Oblique projections
are important to applications in fields such as computer graphics and
architecture. On the other hand, in a perspective projection, the pro-
jection direction is not constant. These are not linear maps, however;
they are introduced in Section 10.5.
Let’s look at a simple 2D orthogonal projection. Take any vector
v and “flatten it out” onto the e1 -axis. This simply means: set the
v2 -coordinate of the vector to zero. For example, if we project the
vector
\[ \mathbf{v} = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \]
onto the $\mathbf{e}_1$-axis, it becomes
\[ \mathbf{v}' = \begin{bmatrix} 3 \\ 0 \end{bmatrix}, \]
Figure 4.10.
Projections: all vectors are “flattened out” onto the e1 -axis.
space is mapped into 1D space, namely onto the e1 -axis. Figure 4.10
illustrates this property and that the action ellipse of a projection is
a straight line segment that is covered twice.
To construct a 2D orthogonal projection matrix, first choose a unit
vector u to define a line onto which to project. The matrix is de-
fined by a1 and a2 , or in other words, the projections of e1 and e2 ,
respectively onto u. From (2.21), we have
\[ \mathbf{a}_1 = \frac{\mathbf{u} \cdot \mathbf{e}_1}{\|\mathbf{u}\|^2}\mathbf{u} = u_1\mathbf{u}, \]
\[ \mathbf{a}_2 = \frac{\mathbf{u} \cdot \mathbf{e}_2}{\|\mathbf{u}\|^2}\mathbf{u} = u_2\mathbf{u}, \]
thus
\[ A = \begin{bmatrix} u_1\mathbf{u} & u_2\mathbf{u} \end{bmatrix} \qquad (4.19) \]
\[ \phantom{A} = \mathbf{u}\mathbf{u}^T. \qquad (4.20) \]
Forming a matrix as in (4.20), from the product of a vector and
its transpose, results in a dyadic matrix. Clearly the columns of A
are linearly dependent and thus the matrix has rank one. This map
reduces dimensionality, and as far as areas are concerned, projections
take a lean approach: whatever an area was before application of the
map, it is zero afterward.
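Building the dyadic matrix uuᵀ of (4.20) and applying it is a few lines of Python. The sketch below (our own illustration) also shows the rank-1 "flattening": projecting a second time changes nothing.

```python
def projection_matrix(u):
    """Orthogonal projection matrix A = u u^T onto the line spanned
    by the unit vector u, per (4.20); row-major 2x2 list."""
    return [[u[0] * u[0], u[0] * u[1]],
            [u[1] * u[0], u[1] * u[1]]]

def mat_vec(A, x):
    return (A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1])

A = projection_matrix((0.6, 0.8))          # (0.6, 0.8) has unit length
print(mat_vec(A, (1.0, 0.0)))              # a1 = u1 * u = (0.36, 0.48)
print(mat_vec(A, mat_vec(A, (1.0, 0.0))))  # projecting again: same point
```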
Figure 4.11 shows the effect of (4.20) on the e1 and e2 axes. On
the left side, the vector u = [cos 30◦ sin 30◦ ]T and thin lines show the
Example 4.4
Figure 4.11.
Projections: e1 and e2 vectors orthogonally projected onto u results in a1 (black) and
a2 (dark gray), respectively. Left: vector u = [cos 30◦ sin 30◦ ]T . Right: vectors ui for
i = 1, 36 are at 10◦ increments.
\[ A\mathbf{x} = \mathbf{u}\mathbf{u}^T\mathbf{x} = (\mathbf{u} \cdot \mathbf{x})\mathbf{u}. \]
\[ A\mathbf{x} = \mathbf{u}\mathbf{u}^T\mathbf{y} + \mathbf{u}\mathbf{u}^T\mathbf{y}^{\perp}, \]
\[ A\mathbf{x} = \|\mathbf{y}\|\mathbf{u}. \]
\[ T = a_{1,1}a_{2,2} - T_1 - T_2 - T_3. \]
\[ T = \frac{1}{2}a_{1,1}a_{2,2} - \frac{1}{2}a_{1,2}a_{2,1}. \]
Our aim was not really $T$, but the parallelogram area $P$. Clearly (see Sketch 4.9),
\[ P = 2T, \]
and we have our desired area.
Sketch 4.9.
Parallelogram and triangles.

It is customary to use the term determinant for the (signed) area of the parallelogram spanned by $[\mathbf{a}_1, \mathbf{a}_2]$. Since the two vectors $\mathbf{a}_1$ and $\mathbf{a}_2$ form the columns of the matrix $A$, we also speak of the determinant of the matrix $A$, and denote it by $\det A$ or $|A|$:
\[ |A| = \begin{vmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{vmatrix} = a_{1,1}a_{2,2} - a_{1,2}a_{2,1}. \qquad (4.24) \]
• If |A| < 0, then the linear map changes the orientation of objects.
(We’ll look at this closer after Example 4.5.) Areas may still con-
tract or expand depending on the magnitude of the determinant.
Example 4.5
Sketch 4.10.
Resulting area after scaling one column of A.

There are some rules for working with determinants: If $A = [\mathbf{a}_1, \mathbf{a}_2]$, then
Next, we have
\[ \mathbf{v}'' = \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix}\begin{bmatrix} a_{1,1}v_1 + a_{1,2}v_2 \\ a_{2,1}v_1 + a_{2,2}v_2 \end{bmatrix} = \begin{bmatrix} b_{1,1}(a_{1,1}v_1 + a_{1,2}v_2) + b_{1,2}(a_{2,1}v_1 + a_{2,2}v_2) \\ b_{2,1}(a_{1,1}v_1 + a_{1,2}v_2) + b_{2,2}(a_{2,1}v_1 + a_{2,2}v_2) \end{bmatrix}. \]
The matrix that we have created here, let’s call it C, is called the
product matrix of B and A:
BA = C.
2 The reason for this terminology will become apparent when we revisit these
In more detail,
\[ \begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix}\begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} = \begin{bmatrix} b_{1,1}a_{1,1} + b_{1,2}a_{2,1} & b_{1,1}a_{1,2} + b_{1,2}a_{2,2} \\ b_{2,1}a_{1,1} + b_{2,2}a_{2,1} & b_{2,1}a_{1,2} + b_{2,2}a_{2,2} \end{bmatrix}. \qquad (4.26) \]
This looks messy, but a simple rule puts order into chaos: the element
ci,j is computed as the dot product of B’s ith row and A’s jth column.
We can use this product to describe the composite map:
Example 4.6
Let
\[ \mathbf{v} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}, \quad A = \begin{bmatrix} -1 & 2 \\ 0 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & -2 \\ -3 & 1 \end{bmatrix}. \]
Then
\[ \mathbf{v}' = \begin{bmatrix} -1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 2 \\ -1 \end{bmatrix} = \begin{bmatrix} -4 \\ -3 \end{bmatrix} \]
and
\[ \mathbf{v}'' = \begin{bmatrix} 0 & -2 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} -4 \\ -3 \end{bmatrix} = \begin{bmatrix} 6 \\ 9 \end{bmatrix}. \]
We can also compute $\mathbf{v}''$ using the matrix product $BA$:
\[ C = BA = \begin{bmatrix} 0 & -2 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} -1 & 2 \\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 0 & -6 \\ 3 & -3 \end{bmatrix}. \]
Figure 4.12.
Linear map composition is order dependent. Top: rotate by –120◦ , then reflect about
the (rotated) e1 -axis. Bottom: reflect, then rotate.
You see how c2,1 is at the intersection of column one of the “top”
matrix and row two of the “left” matrix.
The complete multiplication scheme is then arranged like this:
\[ \begin{array}{rr|rr} & & -1 & 2 \\ & & 0 & 3 \\ \hline 0 & -2 & 0 & -6 \\ -3 & 1 & 3 & -3 \end{array} \]
\[ AB \neq BA. \]
Example 4.7
Let us take two very simple matrices and demonstrate that the prod-
uct is not commutative. This example is illustrated in Figure 4.12. A
rotates by −120◦, and B reflects about the e1 -axis:
\[ A = \begin{bmatrix} -0.5 & 0.866 \\ -0.866 & -0.5 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. \]
Check for yourself that the other alternative gives the same result!
By referring to a trigonometry reference, we see that this product
matrix can be written as
cos(α + β) − sin(α + β)
,
sin(α + β) cos(α + β)
which corresponds to a rotation by α+β. As we will see in Section 9.9,
rotations in 3D do not commute.
What about the rank of a composite map?
This says that matrix multiplication does not increase rank. You
should try a few rank 1 and 2 matrices to convince yourself of this
fact.
One more example of composing linear maps, which we have seen in Section 4.8, is that projections are idempotent. If $A$ is a projection matrix, then this means
\[ A\mathbf{v} = AA\mathbf{v} \]
for any vector $\mathbf{v}$. Written out using only matrices, this becomes
\[ A = AA \quad \text{or} \quad A = A^2. \qquad (4.28) \]
Verify this property for the projection matrix
\[ \begin{bmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{bmatrix}. \]
Excluding the identity matrix, only rank deficient matrices are idem-
potent.
\[ \mathbf{u}^T\mathbf{v} = \mathbf{u} \cdot \mathbf{v}. \qquad (4.29) \]
\[ (\mathbf{u}^T\mathbf{v})^T = \mathbf{v}^T\mathbf{u}, \]
Example 4.8
\[ [\mathbf{u}^T\mathbf{v}]^T = \left(\begin{bmatrix} 3 & 4 \end{bmatrix}\begin{bmatrix} -3 \\ 6 \end{bmatrix}\right)^T = [15]^T = 15 \]
\[ \mathbf{v}^T\mathbf{u} = \begin{bmatrix} -3 & 6 \end{bmatrix}\begin{bmatrix} 3 \\ 4 \end{bmatrix} = [15] = 15. \]
The results are the same.
\[ (AB)^T = \left(\begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}\begin{bmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{bmatrix}\right)^T = \begin{bmatrix} c_{1,1} & c_{1,2} \\ c_{2,1} & c_{2,2} \end{bmatrix}, \]
\[ B^TA^T = \begin{bmatrix} b^T_{1,1} & b^T_{1,2} \\ b^T_{2,1} & b^T_{2,2} \end{bmatrix}\begin{bmatrix} a^T_{1,1} & a^T_{1,2} \\ a^T_{2,1} & a^T_{2,2} \end{bmatrix} = \begin{bmatrix} c_{1,1} & c_{1,2} \\ c_{2,1} & c_{2,2} \end{bmatrix}. \]
Since $b_{i,j} = b^T_{j,i}$, we see the identical dot product is calculated to form $c_{1,2}$.
What is the determinant of a product matrix? If C = AB denotes
a matrix product, then
|AB| = |A||B|, (4.31)
which tells us that B scales objects by |B|, A scales objects by |A|, and
the composition of the maps scales by the product of the individual
scales.
Example 4.9
We have |A| = 1/4 and |B| = 16. Thus, A scales down, and B scales
up, but the effect of B’s scaling is greater than that of A’s. The
product
\[ AB = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} \]
\[ A^r = \underbrace{A \cdot \ldots \cdot A}_{r \text{ times}}. \]
\[ A^{r+s} = A^rA^s \]
\[ A^{rs} = (A^r)^s \]
\[ A^0 = I \]
For now, assume r and s are positive integers. See Sections 5.9
and 9.10 for a discussion of A−1 , the inverse matrix.
Distributive Law: $A(B + C) = AB + AC$
Distributive Law: $(B + C)A = BA + CA$
$a(B + C) = aB + aC$
$(a + b)C = aC + bC$
$(ab)C = a(bC)$
$a(BC) = (aB)C = B(aC)$
$(A + B)^T = A^T + B^T$
$(bA)^T = bA^T$
$(AB)^T = B^TA^T$
$(A^T)^T = A$
4.13 Exercises
For the following exercises, let
\[ A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & -1 \\ -1 & 1/2 \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}. \]
1. What linear combination of
\[ \mathbf{c}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \mathbf{c}_2 = \begin{bmatrix} 0 \\ 2 \end{bmatrix} \]
results in
\[ \mathbf{w} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}? \]
Write the result in matrix form.
2. Suppose $\mathbf{w} = w_1\mathbf{c}_1 + w_2\mathbf{c}_2$. Express this in matrix form.
3. Is the vector $\mathbf{v}$ in the column space of $A$?
4. Construct a matrix $C$ such that the vector
\[ \mathbf{w} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]
is not in its column space.
30. Compute B T A.
31. What is A2 ?
32. Let M and N be 2 × 2 matrices and each is rank one. What can you
say about the rank of M + N ?
33. Let two square matrices M and N each have rank one. What can you
say about the rank of M N ?
34. Find matrices C and D, both having rank greater than zero, such that
the product CD has rank zero.
5
2 × 2 Linear Systems
Figure 5.1.
Intersection of lines: two families of lines are shown; the intersections of corresponding
line pairs are marked by black boxes. For each intersection, a 2 × 2 linear system has
to be solved.
Sketch 5.1.
Geometry of a 2 × 2 system.
Example 5.1
What we have here are really two equations in the two unknowns
u1 and u2 , which we see by expanding the vector equations into
\[ \begin{aligned} 2u_1 + 4u_2 &= 4 \\ u_1 + 6u_2 &= 4. \end{aligned} \qquad (5.2) \]
And as we saw in Example 5.1, these two equations in two unknowns
have the solution u1 = 1 and u2 = 1/2, as is seen by inserting these
values for u1 and u2 into the equations.
\[ A\mathbf{u} = \mathbf{b}, \qquad (5.7) \]
where
\[ A = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}, \quad \mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}. \]
Both u and b represent vectors, not points! (See Sketch 5.1 for an
illustration.) The vector u is called the solution of the linear system.
While the savings of this notation is not completely obvious in the
2 × 2 case, it will save a lot of work for more complicated cases with
more equations and unknowns.
The columns of the matrix A correspond to the vectors a1 and
a2 . We could then rewrite our linear system as (5.1). Geometrically,
we are trying to express the given vector b as a linear combination
of the given vectors a1 and a2 ; we need to determine the factors u1
and u2 . If we are able to find at least one solution, then the linear
system is called consistent, otherwise it is called inconsistent. Three
possibilities for our solution space exist.
Example 5.2
Applying Cramer's rule to the linear system in (5.3), we get
\[ u_1 = \frac{\begin{vmatrix} 4 & 4 \\ 4 & 6 \end{vmatrix}}{\begin{vmatrix} 2 & 4 \\ 1 & 6 \end{vmatrix}} = \frac{8}{8}, \qquad u_2 = \frac{\begin{vmatrix} 2 & 4 \\ 1 & 4 \end{vmatrix}}{\begin{vmatrix} 2 & 4 \\ 1 & 6 \end{vmatrix}} = \frac{4}{8}. \]
Examining the determinant in the numerator, notice that $\mathbf{b}$ replaces $\mathbf{a}_1$ in the solution for $u_1$ and then $\mathbf{b}$ replaces $\mathbf{a}_2$ in the solution for $u_2$.
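Cramer's rule is short enough to code directly. A minimal Python sketch (our own illustration), with a guard for a zero determinant:

```python
def cramer(A, b):
    """Solve the 2x2 system A u = b with Cramer's rule; returns None
    when det A is zero, i.e., no unique solution exists."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0.0:
        return None
    u1 = (b[0] * A[1][1] - A[0][1] * b[1]) / det  # b replaces column a1
    u2 = (A[0][0] * b[1] - b[0] * A[1][0]) / det  # b replaces column a2
    return u1, u2

print(cramer([[2.0, 4.0], [1.0, 6.0]], [4.0, 4.0]))  # (1.0, 0.5)
```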
Notice that if the area spanned by a1 and a2 is zero, that is, the
vectors are multiples of each other, then Cramer’s rule will not result
in a solution. (See Sections 5.6–5.8 for more information on this
situation.)
Cramer’s rule is primarily of theoretical importance. For larger
systems, Cramer’s rule is both expensive and numerically unstable.
Hence, we now study a more effective method.
\[ u_2 = b_2/a_{2,2}. \]
With $u_2$ in hand, we can solve the first equation from (5.5) for
\[ u_1 = \frac{1}{a_{1,1}}(b_1 - u_2a_{1,2}). \]
\[ u_2 = 2/4 = 1/2, \]
\[ u_1 = \frac{1}{2}\left(4 - 4 \times \frac{1}{2}\right) = 1. \]
1 Gauss elimination and forward elimination are often used interchangeably.
Example 5.3
We will look at one more example of forward elimination and back
substitution. Let a linear system be given by
\[ \begin{bmatrix} -1 & 4 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 2 \end{bmatrix}. \]
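The whole procedure, including the row exchange discussed in Section 5.5, fits in a short Python sketch for the 2 × 2 case (our own illustration):

```python
def solve2(A, b):
    """Forward elimination with a row exchange (partial pivoting)
    and back substitution for the 2x2 system A u = b."""
    (a11, a12), (a21, a22) = A
    b1, b2 = b
    if abs(a21) > abs(a11):      # pivot: put the larger entry on top
        a11, a12, b1, a21, a22, b2 = a21, a22, b2, a11, a12, b1
    factor = a21 / a11           # eliminate the (2,1) entry
    a22 -= factor * a12
    b2 -= factor * b1
    u2 = b2 / a22                # back substitution
    u1 = (b1 - a12 * u2) / a11
    return u1, u2

# Example 5.3's system:
print(solve2([[-1.0, 4.0], [2.0, 2.0]], [0.0, 2.0]))  # (0.8, 0.2)
```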
5.5 Pivoting
Consider the system
\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \]
illustrated in Sketch 5.6.

Sketch 5.6.
A linear system that needs pivoting.

Our standard approach, shearing $\mathbf{a}_1$ onto the $\mathbf{e}_1$-axis, will not work here; there is no shear that takes
\[ \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]
onto the $\mathbf{e}_1$-axis. However, there is no problem if we simply exchange the two equations! Then we have
\[ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \]
Example 5.4
Let's study an example taken from [11]:
\[ \begin{bmatrix} 0.0001 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}. \]
which we will call the "true" solution. Note the magnitude of changes in $\mathbf{a}_2$ and $\mathbf{b}$ relative to $\mathbf{a}_1$. This is the type of behavior that causes numerical problems. It can often be dealt with by using a larger number of digits.
Suppose we have a machine that stores only three digits, although it calculates with six digits. Due to round-off, the system above would be stored as
\[ \begin{bmatrix} 0.0001 & 1 \\ 0 & -10000 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 1 \\ -10000 \end{bmatrix}, \]
\[ \|\mathbf{u}_t - \mathbf{u}_r\| = 1.0001. \]
Notice that the vectors of the linear systems are all within the same
range. Even with the three-digit machine, this system will allow us to
compute a result that is closer to the true solution because the effects
of round-off have been minimized. Now the error is
\[ \|\mathbf{u}_t - \mathbf{u}_p\| = 0.00014. \]
Sketch 5.7.
An unsolvable linear system.

It is obvious from the sketch that we have a problem here, but let's just blindly apply forward elimination; apply a shear such that $\mathbf{a}_1$ is mapped to the $\mathbf{e}_1$-axis. The resulting system is
\[ \begin{bmatrix} 2 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \]
But the last equation reads 0 = −1, and now we really are in trouble!
This means that our system is inconsistent, and therefore does not
have a solution.
It is possible however, to find an approximate solution. This is done
in the context of least squares methods, see Section 12.7.
Now the last equation reads 0 = 0—true, but a bit trivial! In reality,
our system is just one equation written down twice in slightly different
forms. This is also clear from the sketch: b may be written as a
multiple of either a1 or a2 , thus the system is underdetermined. This
type of system is consistent because at least one solution exists. We
can find a solution by setting u2 = 1, and then back substitution
results in u1 = 1.
Example 5.5
An example, illustrated in Sketch 5.9, should help. Let our homogeneous system be
\[ \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}\mathbf{u} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \]

Sketch 5.9.
Homogeneous system with nontrivial solution.
Clearly, a2 = 2a1 ; the matrix A maps all vectors onto the line de-
fined by a1 and the origin. In this example, any vector u that is
perpendicular to a1 will be projected to the zero vector:
A[cu] = 0.
Example 5.6
Example 5.7
This next linear system might seem like a silly one to pose; however,
systems of this type do arise in Section 7.3 in the context of finding
eigenvectors:
\[ \begin{bmatrix} 0 & 1/2 \\ 0 & 0 \end{bmatrix}\mathbf{u} = \mathbf{0}. \]
are solutions.
\[ \mathbf{u} = B\mathbf{b}? \qquad (5.10) \]
\[ S_1A\mathbf{u} = S_1\mathbf{b}. \]
Let’s use another shear to zero the upper right element. Geometri-
cally, this corresponds to constructing a shear that will map the new
a2 to the e2 -axis. It is given by the matrix
\[ S_2 = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}. \]
After the second shear, our linear system has been changed to
\[ S_2S_1A\mathbf{u} = S_2S_1\mathbf{b}. \]
which corresponds to
\[ S_3S_2S_1A\mathbf{u} = S_3S_2S_1\mathbf{b}. \]
\[ S_3S_2S_1A = I; \qquad (5.12) \]
\[ \mathbf{u} = S_3S_2S_1\mathbf{b}. \qquad (5.13) \]
\[ A^{-1} = S_3S_2S_1. \qquad (5.14) \]
The matrix A−1 undoes the effect of the matrix A: the vector u was
mapped to b by A, and b is mapped back to u by A−1 . Thus, we
can now write (5.13) as
u = A−1 b.
If this transformation result can be achieved, then A is called invert-
ible. At the end of this section and in Sections 5.6 and 5.7, we discuss
cases in which A−1 does not exist.
If we combine (5.12) and (5.14), we immediately get
A−1 A = I. (5.15)
This makes intuitive sense, since the actions of a map and its inverse
should cancel out, i.e., not change anything—that is what I does!
Figures 5.2 and 5.3 illustrate this. Then by the definition of the
inverse,
AA−1 = I.
If $A^{-1}$ exists, then it is unique.
The inverse of the identity is the identity
\[ I^{-1} = I. \]
Figure 5.3 shows the effects of a matrix and its inverse for the shear
\[ \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}. \]
We consider the inverse of a rotation as follows: if $R_\alpha$ rotates by $\alpha$ degrees counterclockwise, then $R_{-\alpha}$ rotates by $\alpha$ degrees clockwise, or
\[ R_{-\alpha} = R_\alpha^{-1} = R_\alpha^T, \]
as we can see from the definition of a rotation matrix (4.16).
Figure 5.2.
Inverse matrices: illustrating scaling and its inverse, and that $AA^{-1} = A^{-1}A = I$. Top: the original Phoenix, the result of applying a scale, then the result of the inverse scale. Bottom: the original Phoenix, the result of applying the inverse scale, then the result of the original scale.
Figure 5.3.
Inverse matrices: illustrating a shear and its inverse, and that $AA^{-1} = A^{-1}A = I$. Top: the original Phoenix, the result of applying a shear, then the result of the inverse shear. Bottom: the original Phoenix, the result of applying the inverse shear, then the result of the original shear.
Figure 5.4.
Inverse matrices: the top illustrates $I$, $A^{-1}$, $(A^{-1})^T$ and the bottom illustrates $I$, $A^T$, $(A^T)^{-1}$.
Example 5.8
It can be the case that a matrix A does not have an inverse. For
example, the matrix
\[ \begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix} \]
is not invertible because the columns are linearly dependent (and
therefore the determinant is zero). A noninvertible matrix is also
referred to as singular. If we try to compute the inverse by setting up
two simultaneous systems,
\[ \begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix} \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \]
we quickly discover that no solution exists.
\[ A = V'V^{-1}. \]
Example 5.9
Let’s find the linear map A that maps the basis V formed by vectors
\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \]
Figure 5.5.
Linear system classification: Three linear systems interpreted as line intersection
problems. Left to right: unique solution, inconsistent, underdetermined.
and the problem statement takes the row view by asking what u1 and
u2 satisfy both line equations. Depending on the problem at hand,
we can choose the view that best suits our given information.
We took a column view in our approach to presenting 2 × 2 linear
systems, but equally valid would be a row view. Let’s look at the
key examples from this chapter as if they were posed as implicit line
intersection problems. Figure 5.5 illustrates linear systems from Ex-
ample 5.1 (unique solution), the example in Section 5.6 (inconsistent
linear system), and the example in Section 5.7 (underdetermined sys-
tem). Importantly, the column and row views of the systems result
in the same classification of the solution sets.
Figure 5.6 illustrates two types of homogeneous systems from ex-
amples in Section 5.8. Since the right-hand side of each line equation
is zero, the lines will pass through the origin. This guarantees the
trivial solution for both intersection problems. The system with non-
trivial solutions is depicted on the right as two identical lines.
Figure 5.6.
Homogeneous linear system classification: Two homogeneous linear systems inter-
preted as line intersection problems. Left to right: trivial solution only, nontrivial
solutions.
5.12 Exercises
1. Using the matrix form, write down the linear system to express
\[ \begin{bmatrix} 6 \\ 3 \end{bmatrix} \]
3. What are the three possibilities for the solution space of a linear system
Au = b?
4. Use Cramer’s rule to solve the system in Exercise 1.
5. Use Cramer’s rule to solve the system
\[ \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 8 \\ 2 \end{bmatrix}. \]
10. Resolve the system in Exercise 1 with Gauss elimination with pivoting.
11. Give an example by means of a sketch of an unsolvable system. Do the
same for an underdetermined system.
12. Under what conditions can a nontrivial solution to a homogeneous sys-
tem be found?
13. Does the following homogeneous system have a nontrivial solution?
\[ \begin{bmatrix} 2 & 2 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \]
19. What type of matrix has the property that A−1 = AT ? Give an example.
20. What is the inverse of
\[ \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}? \]
21. Define the matrix A that maps
\[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \to \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 \\ 1 \end{bmatrix} \to \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \]
Figure 6.1.
Moving things around: affine maps in 2D applied to an old and familiar video game
character.
they rotate, they zoom in or out. As you see this kind of motion,
the game software must carry out quite a few transformations. In
Figure 6.1, they have been applied to a familiar face in gaming. These
computations are implementations of affine maps, the subject of this
chapter.
\[ \mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2, \]
\[ \mathbf{v}' = v_1\mathbf{a}_1 + v_2\mathbf{a}_2. \]
\[ \mathbf{x}' = \mathbf{p} + x_1\mathbf{a}_1 + x_2\mathbf{a}_2 \tag{6.1} \]
\[ \mathbf{x}' = \mathbf{p} + A\mathbf{x}, \tag{6.2} \]
Example 6.1
Let
\[ \mathbf{p} = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \quad \mathbf{a}_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \quad \mathbf{a}_2 = \begin{bmatrix} -2 \\ 4 \end{bmatrix} \]
define a new coordinate system, and let
\[ \mathbf{x} = \begin{bmatrix} 2 \\ 1/2 \end{bmatrix} \]
And now an example of a skew target box. Let’s revisit Example 5.1
from Section 5.1, and add an affine aspect to it by translating our
target box.
Example 6.2
p2 = (1 − t)p1 + tp3
x′ = Ax + p.
We now have
\[ \mathbf{p}_2' = A((1-t)\mathbf{p}_1 + t\mathbf{p}_3) + \mathbf{p} \]
\[ = (1-t)A\mathbf{p}_1 + tA\mathbf{p}_3 + [(1-t) + t]\mathbf{p} \]
\[ = (1-t)[A\mathbf{p}_1 + \mathbf{p}] + t[A\mathbf{p}_3 + \mathbf{p}] \]
\[ = (1-t)\mathbf{p}_1' + t\mathbf{p}_3'. \]
The step from the first to the second equation may seem a bit con-
trived; yet it is the one that makes crucial use of the fact that we are
combining points using barycentric combinations: (1 − t) + t = 1.
The last equation shows that the linear (1 − t), t relationship among three points is not changed by affine maps—meaning that their ratio is invariant, as is illustrated in Sketch 6.4. In particular, the midpoint of two points will be mapped to the midpoint of the image points.

Sketch 6.4.
Ratios are invariant under affine maps.

The other basic property of affine maps is this: they map parallel lines to parallel lines. If two lines do not intersect before they are mapped, then they will not intersect afterward either. Conversely, two lines that intersect before the map will also do so afterward.
Figure 6.2 shows how two families of parallel lines are mapped to two
families of parallel lines. The two families intersect before and after
the affine map. The map uses the matrix
\[ A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}. \]
Figure 6.2.
Affine maps: parallel lines are mapped to parallel lines.
Figure 6.3.
Translations: points on a circle are translated by a fixed amount, and a line connects
corresponding points.
6.3 Translations
If an object is moved without changing its orientation, then it is trans-
lated. See Figure 6.3 in which points on a circle have been translated
by a fixed amount.
How is this action covered by the general affine map in (6.3)? Recall
the identity matrix from Section 5.9, which has no effect whatsoever
on any vector: we always have
Ix = x,
x′ = p + Ix.
First we move r to the origin, subtracting it from both points:
\[ \bar{\mathbf{r}} = \mathbf{r} - \mathbf{r} = \mathbf{0}, \qquad \bar{\mathbf{x}} = \mathbf{x} - \mathbf{r}. \]

Sketch 6.5.
Rotating a point about another point.

Now we rotate the vector x̄ around the origin by α degrees:
\[ \bar{\mathbf{x}}' = A\bar{\mathbf{x}}. \]
Adding r back in gives
\[ \mathbf{x}' = A\bar{\mathbf{x}} + \mathbf{r}, \]
that is,
\[ \mathbf{x}' = A(\mathbf{x} - \mathbf{r}) + \mathbf{r}. \tag{6.5} \]
\[ \mathbf{x}' = 2\mathbf{p} - \mathbf{x}. \tag{6.6} \]
While this does not have the standard affine map form, it is equivalent
to it, yet computationally much less complex.
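Both (6.5) and (6.6) fit in a few lines of NumPy; a minimal sketch of ours (the sample points are made up for illustration):

import numpy as np

def rotate_about(x, r, alpha_deg):
    """Rotate point x about point r by alpha degrees, per (6.5): x' = A(x - r) + r."""
    a = np.radians(alpha_deg)
    A = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])
    return A @ (x - r) + r

def reflect_through(p, x):
    """Reflect point x through point p, per (6.6): x' = 2p - x."""
    return 2*p - x

x = np.array([1.0, 0.0])
r = np.array([1.0, 1.0])
print(rotate_about(x, r, 90))    # [2. 1.]
print(reflect_through(r, x))     # [1. 2.]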
thus we need to find the matrix A. (We have chosen a1 and a1′ arbitrarily as the origins in the two coordinate systems.) We define
\[ \mathbf{v}_2 = \mathbf{a}_2 - \mathbf{a}_1, \quad \mathbf{v}_3 = \mathbf{a}_3 - \mathbf{a}_1, \]
and
\[ \mathbf{v}_2' = \mathbf{a}_2' - \mathbf{a}_1', \quad \mathbf{v}_3' = \mathbf{a}_3' - \mathbf{a}_1'. \]
We know
\[ A\mathbf{v}_2 = \mathbf{v}_2', \]
\[ A\mathbf{v}_3 = \mathbf{v}_3'. \]
These two vector equations may be combined into one matrix equation:
\[ A\begin{bmatrix} \mathbf{v}_2 & \mathbf{v}_3 \end{bmatrix} = \begin{bmatrix} \mathbf{v}_2' & \mathbf{v}_3' \end{bmatrix}, \]
which we abbreviate as
\[ AV = V'. \]
We multiply both sides of this equation by V ’s inverse V −1 and obtain
A as
\[ A = V'V^{-1}. \]
This is the matrix we derived in Section 5.10, “Defining a Map.”
Example 6.4
Note that V ’s inverse V −1 might not exist; this is the case when
v2 and v3 are linearly dependent and thus |V | = 0.
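In code, the construction is a one-liner. A NumPy sketch of ours — the image vectors below are made-up values, not from the text:

import numpy as np

# columns of V are the preimage vectors, columns of Vp their images
V  = np.array([[1.0, -1.0],
               [1.0,  1.0]])     # v2, v3
Vp = np.array([[2.0,  0.0],
               [0.0,  2.0]])     # v2', v3' (illustrative values)

A = Vp @ np.linalg.inv(V)        # A = V' V^{-1}
print(np.allclose(A @ V, Vp))    # True: A maps each v_i to v_i'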
Example 6.5
Figure 6.4.
Composing affine maps: The affine maps from (6.7) and (6.8) are applied iteratively,
resulting in the left and right images, respectively. The starting object is a set of points
on the unit circle centered at the origin. Successive iterations are lighter gray. The
same linear map is used for both images; however, a translation has been added to
create the right image.
Example 6.6
Figure 6.5.
Composing affine maps: The affine maps from (6.9) and (6.10) are applied iteratively,
resulting in the left and right images, respectively. The starting object is a set of points
on the unit circle centered at the origin. Successive iterations are lighter gray. The
same linear map is used for both images; however, a translation has been added to
create the right image.
the translation steps away from the origin, thus the rotation action
moves the geometry in the e2 -direction even though the translation
is strictly in e1 . Same idea as in Example 6.5, but a very different
affine map! One interesting artifact of this map: We expect a circle
to be mapped to an ellipse, but by iteratively applying the shear and
rotation, the ellipse is stretched just right to morph back to a circle!
x′ = S[Rx + p]
We finish this chapter with Figure 6.8 by the Dutch artist M.C. Escher [5], who mixed complex geometric ideas with a highly individual style. The figure plays with reflections, which are affine maps.
Figure 6.8.
M.C. Escher: Magic Mirror (1949).
No translation is applied.
Figure 6.9.
M.C. Escher: Magic Mirror (1949); affine map applied.
6.7 Exercises
For Exercises 1 and 2 let an affine map be defined by
\[ A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \quad\text{and}\quad \mathbf{p} = \begin{bmatrix} 2 \\ 2 \end{bmatrix}. \]
1. Let
\[ \mathbf{r} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad \mathbf{s} = \begin{bmatrix} 1 \\ 3/2 \end{bmatrix}, \]
and q = (1/3)r + (2/3)s. Compute r′, s′, q′; e.g., r′ = Ar + p. Show that q′ = (1/3)r′ + (2/3)s′.
2. Let
\[ \mathbf{t} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{m} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}. \]
Compute t′ and m′. Sketch the lines defined by t, m and t′, m′. Do the same for r′ and s′ from Exercise 1. What does this illustrate?
3. Map the three collinear points
\[ \mathbf{x}_1 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \mathbf{x}_3 = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \]
to points x′i by the affine map Ax + p, where
\[ A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad \mathbf{p} = \begin{bmatrix} -4 \\ 0 \end{bmatrix}. \]
What is the ratio of the xi? What is the ratio of the x′i?
4. Rotate the point
\[ \mathbf{x} = \begin{bmatrix} -2 \\ -2 \end{bmatrix} \]
by 90° around the point
\[ \mathbf{r} = \begin{bmatrix} -2 \\ 2 \end{bmatrix}. \]
Define the matrix and point for this affine map.
by 45◦ around the point p0 . Define the affine map needed here in terms
of a matrix and a point. Hint: Note that the points are evenly spaced so
some economy in calculation is possible.
6. Reflect the point \( \mathbf{x} = \begin{bmatrix} 0 \\ 2 \end{bmatrix} \) about the line \( \mathbf{l}(t) = \mathbf{p} + t\mathbf{v},\ \mathbf{l}(t) = \begin{bmatrix} 0 \\ 0 \end{bmatrix} + t\begin{bmatrix} 1 \\ 2 \end{bmatrix}. \)
7. Reflect the points
\[ \mathbf{p}_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix},\ \mathbf{p}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix},\ \mathbf{p}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\ \mathbf{p}_3 = \begin{bmatrix} -1 \\ 1 \end{bmatrix},\ \mathbf{p}_4 = \begin{bmatrix} -2 \\ 1 \end{bmatrix} \]
Hint: Note that the points are evenly spaced so some economy in calcu-
lation is possible.
8. Given a triangle T with vertices
\[ \mathbf{a}_1 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad \mathbf{a}_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \mathbf{a}_3 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \]
what is the affine map that maps T to T′? What are the coordinates of the point x′ corresponding to x = [1/2 1/2]T?
9. Given a triangle T′ with vertices
\[ \mathbf{a}_1' = \begin{bmatrix} 2 \\ 0 \end{bmatrix}, \quad \mathbf{a}_2' = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad \mathbf{a}_3' = \begin{bmatrix} -2 \\ 0 \end{bmatrix}, \]
suppose that the triangle T has been mapped to T′ via an affine map. What are the coordinates of the point x′ corresponding to
\[ \mathbf{x} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}? \]
10. Construct the affine map that maps any point x with respect to triangle T to x′ with respect to triangle T′ using the vertices in Exercise 9.
11. Let’s revisit the coordinate transformation from Exercise 10 in Chap-
ter 1. Construct the affine map which takes a 2D point x in NDC
coordinates to the 2D point x in a viewport. Recall that the extents of
the NDC system are defined by the lower-left and upper-right points
\[ \mathbf{l}_n = \begin{bmatrix} -1 \\ -1 \end{bmatrix} \quad\text{and}\quad \mathbf{u}_n = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \]
After constructing the affine map, find the points in the viewport asso-
ciated with the NDC points
\[ \mathbf{x}_1 = \begin{bmatrix} -1 \\ -1 \end{bmatrix}, \quad \mathbf{x}_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \mathbf{x}_3 = \begin{bmatrix} -1/2 \\ 1/2 \end{bmatrix}. \]
12. Affine maps transform parallel lines to parallel lines. Do affine maps
transform perpendicular lines to perpendicular lines?
13. Which affine maps are rigid body motions?
14. The solution to the problem of reflecting a point across a line is given
by (6.6). Why is this a valid combination of points?
7
Eigen Things
Figure 7.1.
The Tacoma Narrows Bridge: a view from the approach shortly before collapsing.
A linear map is described by a matrix, but that does not say much
about its geometric properties. When you look at the 2D linear map
figures from Chapter 4, you see that they all map a circle, formed
from the wings of the Phoenix, to some ellipse—called the action
ellipse, thereby stretching and rotating the circle. This stretching
Figure 7.2.
The Tacoma Narrows Bridge: a view from shore shortly before collapsing.
7.2 Eigenvalues
We now develop a way to find the eigenvalues of a 2 × 2 matrix A.
First, we rewrite (7.1) as
Ar = λIr,
[A − λI]r = 0. (7.2)
This means that the matrix [A − λI] maps a nonzero vector r to the
zero vector; [A − λI] must be a rank deficient matrix. Then [A − λI]’s
determinant vanishes:
Figure 7.3.
Action of a matrix: behavior of the matrix from Example 7.1.
Example 7.1
\[ p(\lambda) = \lambda^2 - 4\lambda + 3 = 0. \]
Its zeroes are
\[ \lambda_1 = 3, \quad \lambda_2 = 1. \]

²Recall that the quadratic equation \( a\lambda^2 + b\lambda + c = 0 \) has the solutions \( \lambda_1 = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \) and \( \lambda_2 = \frac{-b - \sqrt{b^2 - 4ac}}{2a} \).
|A| = λ1 · λ2 .
7.3 Eigenvectors
Continuing with Example 7.1, we would still like to know the corre-
sponding eigenvectors. We know that one of them will be mapped to
three times itself, the other one to itself. Let’s call the corresponding
eigenvectors r1 and r2 . The eigenvector r1 satisfies
\[ \begin{bmatrix} 2-3 & 1 \\ 1 & 2-3 \end{bmatrix} \mathbf{r}_1 = \mathbf{0}, \]
or
\[ \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix} \mathbf{r}_1 = \mathbf{0}. \]
This is a homogeneous system. (Section 5.8 introduced these systems
and a technique to solve them with Gauss elimination.) Such sys-
tems have either none or infinitely many solutions. In our case, since
the matrix has rank 1, there are infinitely many solutions. Forward
elimination results in
\[ \begin{bmatrix} -1 & 1 \\ 0 & 0 \end{bmatrix} \mathbf{r}_1 = \mathbf{0}. \]
Assign r2,1 = 1, then back substitution results in r1,1 = 1. Any vector of the form
\[ \mathbf{r}_1 = c\begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
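A NumPy cross-check of this example (our sketch):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, R = np.linalg.eig(A)
print(lam)     # [3. 1.] -- the eigenvalues found above (order may vary)
print(R)       # columns are unit eigenvectors: multiples of [1,1] and [-1,1]

# each column r satisfies A r = lambda r
for i in range(2):
    print(np.allclose(A @ R[:, i], lam[i] * R[:, i]))   # True, True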
Figure 7.4.
Eigenvectors: the action of the matrix from Example 7.1 and its eigenvectors, scaled
by their corresponding eigenvalues.
Let us return to the general 2 × 2 case and review the ideas thus
far. The fixed directions r of a map A that satisfy Ar = λr are key to
understanding the action of the map. The expression det[A − λI] = 0
is a quadratic polynomial in λ, and its zeroes λ1 and λ2 are A’s eigen-
values. To find the corresponding eigenvectors, we set up the linear
systems [A − λ1 I]r1 = 0 and [A − λ2 I]r2 = 0. Both are homoge-
neous linear systems with infinitely many solutions, corresponding to
the eigenvectors r1 and r2 , which are in the null space of the matrix
[A − λI].
Example 7.2
Notice that when column vectors are exchanged, the solution vector
components are exchanged as well. One more forward elimination
step leads to
\[ \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} r_{2,1} \\ r_{1,1} \end{bmatrix} = \mathbf{0}. \]
Figure 7.5.
Quadratic polynomials: from left to right, no zero, one zero, two real zeroes.
or
\[ \lambda^2 + 1 = 0. \]
This has no real solutions, as expected.
A quadratic equation may also have one double root; then there
is only one fixed direction. A shear in the e1 -direction provides an
example—it maps all vectors in the e1 -direction to themselves. An
example is
\[ A = \begin{bmatrix} 1 & 1/2 \\ 0 & 1 \end{bmatrix}. \]
The action of this shear is illustrated in Figure 4.8. You clearly see
that the e1 -axis is not changed.
The characteristic equation for A is
\[ \begin{vmatrix} 1-\lambda & 1/2 \\ 0 & 1-\lambda \end{vmatrix} = 0 \]
or
\[ (1-\lambda)^2 = 0. \]
It has the double root λ1 = λ2 = 1. For the corresponding eigenvec-
tor, we have to solve
\[ \begin{bmatrix} 0 & 1/2 \\ 0 & 0 \end{bmatrix} \mathbf{r} = \mathbf{0}. \]
In order to apply back substitution to this system, column pivoting is necessary, thus the system becomes
\[ \begin{bmatrix} 1/2 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} r_2 \\ r_1 \end{bmatrix} = \mathbf{0}. \]
Example 7.3
λ(λ − 1) = 0,
\[ A^2\mathbf{r} = \lambda A\mathbf{r} = \lambda^2\mathbf{r}, \]
\[ A\mathbf{r}_1 = \lambda_1\mathbf{r}_1, \tag{7.5} \]
\[ A\mathbf{r}_2 = \lambda_2\mathbf{r}_2. \tag{7.6} \]
Transposing (7.5) gives
\[ (A\mathbf{r}_1)^{\mathsf T} = (\lambda_1\mathbf{r}_1)^{\mathsf T} \]
\[ \mathbf{r}_1^{\mathsf T} A^{\mathsf T} = \mathbf{r}_1^{\mathsf T}\lambda_1 \]
\[ \mathbf{r}_1^{\mathsf T} A = \lambda_1\mathbf{r}_1^{\mathsf T}, \]
using the symmetry A = AT. Multiplying by r2 on the right,
\[ \mathbf{r}_1^{\mathsf T} A\mathbf{r}_2 = \lambda_1\mathbf{r}_1^{\mathsf T}\mathbf{r}_2. \tag{7.7} \]
On the other hand, from (7.6),
\[ \mathbf{r}_1^{\mathsf T} A\mathbf{r}_2 = \lambda_2\mathbf{r}_1^{\mathsf T}\mathbf{r}_2. \tag{7.8} \]
Equating (7.7) and (7.8),
\[ \lambda_1\mathbf{r}_1^{\mathsf T}\mathbf{r}_2 = \lambda_2\mathbf{r}_1^{\mathsf T}\mathbf{r}_2 \]
or
\[ (\lambda_1 - \lambda_2)\mathbf{r}_1^{\mathsf T}\mathbf{r}_2 = 0. \]
Example 7.4
\[ R^{-1} = R^{\mathsf T} \]
\[ A = R\Lambda R^{\mathsf T}, \tag{7.12} \]
Figure 7.6.
Eigendecomposition: the action of the symmetric matrix A from Example 7.1. Top: I,
A. Bottom: I, RT (rotate –45◦ ), ΛRT (scale), RΛRT (rotate 45◦ ).
\[ A\mathbf{x} = R\Lambda R^{\mathsf T}\mathbf{x}
= \begin{bmatrix} \mathbf{r}_1 & \mathbf{r}_2 \end{bmatrix} \Lambda \begin{bmatrix} \mathbf{r}_1^{\mathsf T} \\ \mathbf{r}_2^{\mathsf T} \end{bmatrix} \mathbf{x}
= \begin{bmatrix} \mathbf{r}_1 & \mathbf{r}_2 \end{bmatrix} \begin{bmatrix} \lambda_1\mathbf{r}_1^{\mathsf T}\mathbf{x} \\ \lambda_2\mathbf{r}_2^{\mathsf T}\mathbf{x} \end{bmatrix}
= \lambda_1\mathbf{r}_1\mathbf{r}_1^{\mathsf T}\mathbf{x} + \lambda_2\mathbf{r}_2\mathbf{r}_2^{\mathsf T}\mathbf{x}. \tag{7.13} \]
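A quick NumPy verification of (7.13), our sketch using the symmetric matrix of Example 7.1:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam, R = np.linalg.eigh(A)    # eigh: for symmetric A, R is orthogonal
r1, r2 = R[:, 0], R[:, 1]
x = np.array([1.0, 2.0])

# A x as a sum of scaled projections onto the eigenvectors, per (7.13)
Ax = lam[0]*np.outer(r1, r1) @ x + lam[1]*np.outer(r2, r2) @ x
print(np.allclose(Ax, A @ x))   # True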
Example 7.5
Figure 7.7.
Eigendecomposition: the action of the matrix from Example 7.5 interpreted as a linear
combination of projections. The vector x is projected onto each eigenvector, r1 and
r2 , and scaled by the eigenvalues (λ1 = 3, λ2 = 1). The action of A is a sum of scaled
projection vectors.
f (v1 , v2 ) or f (v).
Such functions are called quadratic forms because all terms are quadratic,
as we see by expanding (7.14)
Figure 7.8.
Quadratic forms: ellipsoid, paraboloid, hyperboloid evaluated over the unit circle. The
[e1 , e2 ]-axes are displayed. A contour plot for each quadratic form communicates
additional shape information. Color map extents: min f (v) colored black and max f (v)
colored white.
|C1| = 1:  λ1 = 2, λ2 = 0.5
|C2| = 0:  λ1 = 2, λ2 = 0
|C3| = −1: λ1 = −2, λ2 = 0.5.
for any nonzero vector v ∈ R2 . This means that the quadratic form
is positive everywhere except for v = 0. An example is illustrated in
the left part of Figure 7.8. It is an ellipsoid: the matrix is in (7.15)
and the function is in (7.16). (It is hard to see exactly, but the middle
function, the paraboloid, has a line touching the zero plane.)
Positive definite symmetric matrices are a special class of matrices
that arise in a number of applications, and their well-behaved nature
lends them to numerically stable and efficient algorithms.
\[ \mathbf{v}^{\mathsf T} C\mathbf{v} = 1. \]
This is a contour of f.
Let's look at this contour for C1:
\[ 2v_1^2 + 0.5v_2^2 = 1; \]
it is the equation of an ellipse.
By setting v1 = 0 and solving for v2, we can identify the e2-axis extents of the ellipse: \( \pm 1/\sqrt{0.5} \). Similarly, by setting v2 = 0 we find the e1-axis extents: \( \pm 1/\sqrt{2} \). By definition, the major axis is the longest of the ellipse's two axes, so in this case, it is in the e2-direction.
As noted above, the eigenvalues for C1 are λ1 = 2 and λ2 = 0.5
and the corresponding eigenvectors are r1 = [1 0]T and r2 = [0 1]T .
(The eigenvectors of a symmetric matrix are orthogonal.) Thus we
have that the minor axis corresponds to the dominant eigenvector.
Examining the contour plot in Figure 7.8 (left), we see that the ellipses
do indeed have the major axis corresponding to r2 and minor axis
corresponding to r1 . Interpreting the contour plot as a terrain map,
we see that the minor axis (dominant eigenvector direction) indicates
steeper ascent.
Figure 7.9.
Quadratic form contour: a planar slice of the quadratic form from Example 7.6 defines
an ellipse. The quadratic form has been scaled in e3 to make the contour easier to
see.
Example 7.6
Here is an example for which the major and minor axes are not aligned
with the coordinate axes:
\[ A = \begin{bmatrix} 2 & 0.5 \\ 0 & 1 \end{bmatrix} \quad\text{thus}\quad C_4 = A^{\mathsf T}A = \begin{bmatrix} 4 & 1 \\ 1 & 1.25 \end{bmatrix}, \]
and this quadratic form is illustrated in Figure 7.9.
The eigendecomposition, C4 = RΛRT , is defined by
\[ R = \begin{bmatrix} -0.95 & -0.30 \\ -0.30 & 0.95 \end{bmatrix} \quad\text{and}\quad \Lambda = \begin{bmatrix} 4.3 & 0 \\ 0 & 0.92 \end{bmatrix}, \]
where the eigenvectors are the columns of R.
The ellipse defined by f4 = vTC4v = 1 is
\[ 4v_1^2 + 2v_1v_2 + 1.25v_2^2 = 1; \]
it is illustrated in Figure 7.9. The major and minor axis lengths are easiest to determine by using the eigendecomposition to perform a coordinate transformation. The ellipse can be expressed as
\[ \mathbf{v}^{\mathsf T} R\Lambda R^{\mathsf T}\mathbf{v} = 1 \]
\[ \hat{\mathbf{v}}^{\mathsf T}\Lambda\hat{\mathbf{v}} = 1 \]
\[ \lambda_1\hat{v}_1^2 + \lambda_2\hat{v}_2^2 = 1, \]
aligning the ellipse with the coordinate axes. Thus the minor axis has length \( 1/\sqrt{\lambda_1} = 1/\sqrt{4.3} \approx 0.48 \) on the e1 axis and the major axis has length \( 1/\sqrt{\lambda_2} = 1/\sqrt{0.92} \approx 1.04 \) on the e2 axis.
Figure 7.10.
Repetitions: a symmetric matrix is applied several times. One eigenvalue is greater
than one, causing stretching in one direction. One eigenvalue is less than one, causing
compaction in the opposing direction.
Figure 7.11.
Repetitions: a matrix is applied several times. The eigenvalues are not real, therefore
the Phoenixes do not line-up along fixed directions.
become more and more stretched: they are elongated in the direction
r1 by λ1 = 1.3 and compacted in the direction of r2 by a factor of
λ2 = 0.7, with
\[ \mathbf{r}_1 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}, \quad \mathbf{r}_2 = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}. \]
In general,
\[ A^n\mathbf{r}_1 = \lambda_1^n\mathbf{r}_1. \tag{7.18} \]
The same holds for r2 and λ2 , of course. So you see that once a
matrix has real eigenvectors, they play a more and more prominent
role as the matrix is applied repeatedly.
By contrast, the matrix corresponding to Figure 7.11 is given by
\[ A = \begin{bmatrix} 0.7 & 0.3 \\ -1 & 1 \end{bmatrix}. \]
As you should verify for yourself, this matrix does not have real eigen-
values. In that sense, it is related to a rotation matrix. If you study
Figure 7.11, you will notice a rotational component as we progress—
the figures do not line up along any (real) fixed directions.
In Section 15.2 on the power method, we will apply this idea of
repeating a map in order to find the eigenvectors.
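The idea fits in a few lines of NumPy. A sketch of ours (not the book's Section 15.2 algorithm), using A = RΛRT reassembled via (7.12) from λ1 = 1.3, λ2 = 0.7 and the eigenvectors above, which gives A = [[1, 0.3], [0.3, 1]]:

import numpy as np

A = np.array([[1.0, 0.3],
              [0.3, 1.0]])      # eigenvalues 1.3 and 0.7, eigenvectors [1,1], [-1,1]

v = np.array([1.0, 0.0])        # arbitrary start vector
for _ in range(50):
    v = A @ v                   # repeat the map ...
    v /= np.linalg.norm(v)      # ... renormalizing to avoid over/underflow

print(v)            # approx [0.7071, 0.7071]: the dominant eigenvector r1
print(v @ A @ v)    # Rayleigh quotient, approx 1.3: the dominant eigenvalue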
7.8 Exercises
1. For each of the following matrices, describe the action of the linear map
and find the eigenvalues and eigenvectors:
\[ A = \begin{bmatrix} 1 & s \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} s & 0 \\ 0 & s \end{bmatrix}, \quad C = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}. \]
8
3D Geometry
Figure 8.1.
3D objects: Guggenheim Museum in Bilbao, Spain. Designed by Frank Gehry.
8.1 From 2D to 3D
Moving from 2D to 3D geometry requires a coordinate system with
one more dimension. Sketch 8.1 illustrates the [e1 , e2 , e3 ]-system that
consists of the vectors
\[ \mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad\text{and}\quad \mathbf{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. \]
Sketch 8.2 illustrates a 3D vector v along with its components. Notice the two right triangles. Applying the Pythagorean theorem twice, the length or Euclidean norm of v, denoted as ‖v‖, is
\[ \|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}. \tag{8.3} \]

Sketch 8.2.
Length of a 3D vector.
Example 8.1
We will get some practice working with 3D vectors. The first task is to normalize the vector
\[ \mathbf{v} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}. \]
First calculate the length of v as
\[ \|\mathbf{v}\| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14}, \]
then the normalized vector w is
\[ \mathbf{w} = \frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{1}{\sqrt{14}}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \approx \begin{bmatrix} 0.27 \\ 0.53 \\ 0.80 \end{bmatrix}. \]
Check for yourself that ‖w‖ = 1.
Scale v by k = 2:
\[ 2\mathbf{v} = \begin{bmatrix} 2 \\ 4 \\ 6 \end{bmatrix}. \]
Now calculate
\[ \|2\mathbf{v}\| = \sqrt{2^2 + 4^2 + 6^2} = 2\sqrt{14}. \]
Thus we verified that ‖2v‖ = 2‖v‖.
Example 8.2
\[ P = \|\mathbf{v} \wedge \mathbf{w}\|. \tag{8.7} \]
Example 8.3
• Nonassociative: u ∧ (v ∧ w) ≠ (u ∧ v) ∧ w, in general.
• Distributive: u ∧ (v + w) = u ∧ v + u ∧ w.
• Right-hand rule:
e 1 ∧ e2 = e3 ,
e2 ∧ e3 = e1 ,
e3 ∧ e1 = e2 .
• Orthogonality:
v · (v ∧ w) = 0 : v∧w is orthogonal to v.
w · (v ∧ w) = 0 : v∧w is orthogonal to w.
Example 8.4
Make your own sketches and don’t forget the right-hand rule to guess
the resulting vector direction.
• Parallel vectors:
\[ \mathbf{v} \wedge 3\mathbf{v} = \begin{bmatrix} 0\times 0 - 0\times 0 \\ 0\times 6 - 2\times 0 \\ 2\times 0 - 0\times 6 \end{bmatrix} = \mathbf{0}. \]
• Homogeneous:
\[ 4\mathbf{v} \wedge \mathbf{w} = \begin{bmatrix} 0\times 0 - 0\times 3 \\ 0\times 0 - 8\times 0 \\ 8\times 3 - 0\times 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 24 \end{bmatrix}, \]
and
\[ 4(\mathbf{v} \wedge \mathbf{w}) = 4\begin{bmatrix} 0\times 0 - 0\times 3 \\ 0\times 0 - 2\times 0 \\ 2\times 3 - 0\times 0 \end{bmatrix} = 4\begin{bmatrix} 0 \\ 0 \\ 6 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 24 \end{bmatrix}. \]
• Antisymmetric:
\[ \mathbf{v} \wedge \mathbf{w} = \begin{bmatrix} 0 \\ 0 \\ 6 \end{bmatrix} \quad\text{and}\quad -(\mathbf{w} \wedge \mathbf{v}) = -\begin{bmatrix} 0 \\ 0 \\ -6 \end{bmatrix}. \]
• Nonassociative:
\[ \mathbf{u} \wedge (\mathbf{v} \wedge \mathbf{w}) = \begin{bmatrix} 1\times 6 - 0\times 1 \\ 1\times 0 - 6\times 1 \\ 1\times 0 - 0\times 1 \end{bmatrix} = \begin{bmatrix} 6 \\ -6 \\ 0 \end{bmatrix}, \]
• Distributive:
\[ \mathbf{u} \wedge (\mathbf{v} + \mathbf{w}) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \wedge \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix} = \begin{bmatrix} -3 \\ 2 \\ 1 \end{bmatrix}, \]
which is equal to
\[ (\mathbf{u} \wedge \mathbf{v}) + (\mathbf{u} \wedge \mathbf{w}) = \begin{bmatrix} 0 \\ 2 \\ -2 \end{bmatrix} + \begin{bmatrix} -3 \\ 0 \\ 3 \end{bmatrix} = \begin{bmatrix} -3 \\ 2 \\ 1 \end{bmatrix}. \]
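All of these identities are easy to double-check numerically; a NumPy sketch of ours using the same u, v, and w:

import numpy as np

v = np.array([2.0, 0.0, 0.0])
w = np.array([0.0, 3.0, 0.0])
u = np.array([1.0, 1.0, 1.0])

print(np.cross(v, 3*v))                                   # [0. 0. 0.]: parallel vectors
print(np.allclose(np.cross(4*v, w), 4*np.cross(v, w)))    # True: homogeneous
print(np.allclose(np.cross(v, w), -np.cross(w, v)))       # True: antisymmetric
print(np.allclose(np.cross(u, v + w),
                  np.cross(u, v) + np.cross(u, w)))       # True: distributive
print(np.allclose(np.cross(u, np.cross(v, w)),
                  np.cross(np.cross(u, v), w)))           # False: not associative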
8.3 Lines
Specifying a line with 3D geometry differs a bit from 2D. In terms of
points and vectors, two pieces of information define a line; however,
we are restricted to specifying
• two points or
• a point and a vector parallel to the line.
The 2D geometry item (from Section 3.1), which specifies only
• a point and a vector perpendicular to the line,
no longer works. It isn’t specific enough. (See Sketch 8.6.) In other
words, an entire family of lines satisfies this specification; this family
lies in a plane. (More on planes in Section 8.4.) As a consequence,
the concept of a normal to a 3D line does not exist.
Sketch 8.6.
Point and perpendicular don't define a line.

Let's look at the mathematical representations of a 3D line. Clearly, from the discussion above, there cannot be an implicit form.
The parametric form of a 3D line does not differ from the 2D line
except for the fact that the given information lives in 3D. A line l(t)
has the form
l(t) = p + tv, (8.11)
t̂v − ŝw = q − p.
However, now there are three equations and still only two unknowns.
Thus, the system is overdetermined. No solution exists when the lines
are skew. But we can find a best approximation, the least squares
solution, and that is the topic of Section 12.7. In many applications
it is important to know the closest point on a line to another line.
This problem is solved in Section 11.2.
We still have the concepts of perpendicular and parallel lines in 3D.
8.4 Planes
While exploring the possibility of a 3D implicit line, we encountered a
plane. We’ll essentially repeat that here, however, with a little change
in notation. Suppose we are given a point p and a vector n bound to
p. The locus of all points x that satisfy the equation
n · (x − p) = 0 (8.12)
where
A = n1
B = n2
C = n3
D = −(n1 p1 + n2 p2 + n3 p3 ).
Example 8.5
D = −(1 × 4 + 1 × 0 + 1 × 0) = −4.
x1 + x2 + x3 − 4 = 0.
Example 8.6
\[ d = \tfrac{1}{\sqrt{3}}\times 4 + \tfrac{1}{\sqrt{3}}\times 4 + \tfrac{1}{\sqrt{3}}\times 4 - \tfrac{4}{\sqrt{3}} = \tfrac{8}{\sqrt{3}} \approx 4.6. \]
Notice that d > 0; this is because the point q is on the same side of the plane as the normal direction. The distance of the origin to the plane is \( d = D = -4/\sqrt{3} \), which is negative because it is on the opposite side of the plane to which the normal points. This is analogous to the 2D implicit line.
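A small NumPy helper for this signed distance (our sketch, reproducing the numbers of Examples 8.5 and 8.6):

import numpy as np

def signed_distance(p, n, D):
    """Signed distance of point p to the plane n.x + D = 0."""
    return (np.dot(n, p) + D) / np.linalg.norm(n)

# the plane x1 + x2 + x3 - 4 = 0 and the point q = [4, 4, 4]:
n, D = np.array([1.0, 1.0, 1.0]), -4.0
q = np.array([4.0, 4.0, 4.0])
print(signed_distance(q, n, D))             # 8/sqrt(3), about 4.62
print(signed_distance(np.zeros(3), n, D))   # -4/sqrt(3): origin on opposite side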
Why not just specify one point and a vector in the plane, analogous to the implicit form of a plane? Sketch 8.12 illustrates that this is not enough information to uniquely define a plane. Many planes fit that data.

Sketch 8.12.
Family of planes through a point and vector.

Two vectors bound to a point are the data we'll use to define a plane P in parametric form as
\[ P(s,t) = \mathbf{p} + s\mathbf{v} + t\mathbf{w}. \tag{8.15} \]
\[ P = \|\mathbf{v} \wedge \mathbf{w}\|. \]
\[ V = \|\mathbf{u}\|\,\|\mathbf{v} \wedge \mathbf{w}\|\cos\theta. \]
\[ V = \mathbf{u}\cdot(\mathbf{v} \wedge \mathbf{w}). \tag{8.17} \]
\[ V = \mathbf{u}\cdot(\mathbf{v} \wedge \mathbf{w}) = \mathbf{w}\cdot(\mathbf{u} \wedge \mathbf{v}) = \mathbf{v}\cdot(\mathbf{w} \wedge \mathbf{u}). \tag{8.18} \]
Example 8.7
Figure 8.2.
Hedgehog plot: the normal of each facet is drawn at the centroid.
Figure 8.3.
Flat shading: the normal to each planar facet is used to calculate the illumination of
each facet.
x = up + vq + wr,
Figure 8.4.
Smooth shading: a normal at each vertex is used to calculate the illumination over
each facet. Left: zoomed-in and displayed with triangles. Right: the smooth shaded
bugle.
\[ \mathbf{v} = (\mathbf{e} - \mathbf{c})/\|\mathbf{e} - \mathbf{c}\|. \]
If
n·v <0
then the triangle is back-facing, and we need not render it. This
process is called culling. A great savings in rendering time can be
achieved with culling.
Planar facet normals play an important role in computer graphics,
as demonstrated in this section. For more advanced applications,
consult a graphics text such as [14].
8.7 Exercises
For the following exercises, use the following points and vectors:
\[ \mathbf{p} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix},\ \mathbf{q} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},\ \mathbf{r} = \begin{bmatrix} 4 \\ 2 \\ 4 \end{bmatrix},\ \mathbf{v} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\ \mathbf{w} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},\ \mathbf{u} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. \]
skew?
8. Form the point normal plane equation for a plane through point p and
with normal direction r.
9. Form the point normal plane equation for the plane defined by points
p, q, and r.
10. Form a parametric plane equation for the plane defined by points p, q,
and r.
11. Form an equation of the plane that bisects the points p and q.
12. Given the line l defined by point q and vector v, what is the length of
the projection of vector w bound to q onto l?
13. Given the line l defined by point q and vector v, what is the (perpen-
dicular) distance of the point q + w (where w is a vector) to the line
l?
14. What is w ∧ 6w?
15. For the plane in Exercise 8, what is the distance of this plane to the
origin?
16. For the plane in Exercise 8, what is the distance of the point q to this
plane?
17. Find the volume of the parallelepiped defined by vectors v, w, and u.
18. Decompose w into
w = u1 + u2 ,
where u1 and u2 are perpendicular. Additionally, find u3 to complete
an orthogonal frame. Hint: Orthogonal projections are the topic of
Section 2.8.
19. Given the triangle formed by points p, q, r, and colors
\[ \mathbf{i}_p = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{i}_q = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{i}_r = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \]
respectively.
9
Linear Maps in 3D
Figure 9.1.
Flight simulator: 3D linear maps are necessary to create the twists and turns in a flight
simulator. (Image is from the NASA website http://www.nasa.gov.)
computer imagery. As you take a right turn, the terrain below changes
accordingly; as you dive downwards, it comes closer to you. When
you change the (simulated) position of your plane, the simulation
software must recompute a new view of the terrain, clouds, or other
aircraft. This is done through the application of 3D affine and linear
maps.1 Figure 9.1 shows an image that was generated by an actual
flight simulator. For each frame of the simulated scene, complex 3D
computations are necessary, most of them consisting of the types of
maps discussed in this section.
Sketch 9.1.
A vector in the [e1, e2, e3]-coordinate system.

9.1 Matrices and Linear Maps
The general concept of a linear map in 3D is the same as that for
a 2D map. Let v be a vector in the standard [e1 , e2 , e3 ]-coordinate
system, i.e.,
v = v1 e1 + v2 e2 + v3 e3 .
(See Sketch 9.1 for an illustration.)
Let another coordinate system, the [a1, a2, a3]-coordinate system, be given by the origin o and three vectors a1, a2, a3. What vector v′ in the [a1, a2, a3]-system corresponds to v in the [e1, e2, e3]-system? Simply the vector with the same coordinates relative to the [a1, a2, a3]-system. Thus,
\[ \mathbf{v}' = v_1\mathbf{a}_1 + v_2\mathbf{a}_2 + v_3\mathbf{a}_3. \tag{9.1} \]
Example 9.1
Let
\[ \mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}, \quad \mathbf{a}_1 = \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix}, \quad \mathbf{a}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \mathbf{a}_3 = \begin{bmatrix} 0 \\ 0 \\ 1/2 \end{bmatrix}. \]
Then
\[ \mathbf{v}' = 1\cdot\begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} + 1\cdot\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 2\cdot\begin{bmatrix} 0 \\ 0 \\ 1/2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}. \]

Sketch 9.2.
The matrix A maps v in the [e1, e2, e3]-coordinate system to the vector v′ in the [a1, a2, a3]-coordinate system.

¹Actually, perspective maps are also needed here. They will be discussed in Section 10.5.
You should recall that we had the same configuration earlier for
the 2D case—(9.1) corresponds directly to (4.2) of Section 4.1. In
Section 4.2, we then introduced the matrix form. That is now an
easy project for this chapter—nothing changes except the matrices
will be 3 × 3 instead of 2 × 2. In 3D, a matrix equation looks like this:
\[ \mathbf{v}' = A\mathbf{v}, \tag{9.2} \]
i.e., just the same as for the 2D case. Written out in detail, there is a difference:
\[ \begin{bmatrix} v_1' \\ v_2' \\ v_3' \end{bmatrix} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}. \tag{9.3} \]
All matrix properties from Chapter 4 carry over almost verbatim.
Example 9.2
u = sv + tw (9.4)
u 1 = s 1 v + t1 w and u2 = s2 v + t2 w,
which is again in the same space. We call the set of all vectors of
the form (9.4) a subspace of the linear space of all 3D vectors. The
term subspace is justified since not all 3D vectors are in it. Take for
instance the vector n = v ∧ w, which is perpendicular to both v and
w. There is no way to write this vector as a linear combination of v
and w!
We say our subspace has dimension 2 since it is generated, or
spanned, by two vectors. These vectors have to be noncollinear; oth-
erwise, they just define a line, or a 1D (1-dimensional) subspace. (In
Section 2.8, we needed the concept of a subspace in order to find the
orthogonal projection of w onto v. Thus the projection lived in the
one-dimensional subspace formed by v.)
If two vectors are collinear, then they are also called linearly de-
pendent. If v and w are linearly dependent, then v = sw. Con-
versely, if they are not collinear, they are called linearly independent.
If v1 , v2 , v3 are linearly independent, then we will not have a solution
set s1 , s2 for
v3 = s1 v1 + s2 v2 ,
0 = s1 v1 + s2 v2 + s3 v3
9.3 Scalings
A scaling is a linear map that enlarges or reduces vectors:
\[ \mathbf{v}' = \begin{bmatrix} s_{1,1} & 0 & 0 \\ 0 & s_{2,2} & 0 \\ 0 & 0 & s_{3,3} \end{bmatrix}\mathbf{v}. \tag{9.5} \]
If all scale factors si,i are larger than one, then all vectors are enlarged,
as is done in Figure 9.2. If all si,i are positive yet less than one, all
vectors are shrunk.
Example 9.3
In this example,
\[ \begin{bmatrix} s_{1,1} & 0 & 0 \\ 0 & s_{2,2} & 0 \\ 0 & 0 & s_{3,3} \end{bmatrix} = \begin{bmatrix} 1/3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{bmatrix}, \]
Figure 9.2.
Scalings in 3D: the large torus is scaled by 1/3 in each coordinate to form the small
torus.
Figure 9.3.
Nonuniform scalings in 3D: the "standard" torus is scaled by 1/3, 1, 3 in the e1-, e2-, e3-directions, respectively.
Negative numbers for the si,i will cause a flip in addition to a scale.
So, for instance
\[ \begin{bmatrix} -2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \]
9.4 Reflections
If we reflect a vector about the e2 , e3 -plane, then its first component
should change in sign:
\[ \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \longrightarrow \begin{bmatrix} -v_1 \\ v_2 \\ v_3 \end{bmatrix}, \]
and reflections about the other two coordinate planes work analogously as well.
Sketch 9.4.
Reflection of a vector about the x1 = x3 plane.

In Section 11.5, we develop a more general reflection matrix, called the Householder matrix. Instead of reflecting about a coordinate plane, with this matrix, we can reflect about a given (unit) normal. This matrix is central to the Householder method for solving a linear system in Section 13.1.
By their very nature, reflections do not change volumes—but they
do change their signs. See Section 9.8 for more details.
9.5 Shears
What map takes a cube to the parallelepiped (skew box) of Sketch 9.5?
The answer: a shear. Shears in 3D are more complicated than the 2D
shears from Section 4.7 because there are so many more directions to
shear. Let’s look at some of the shears more commonly used.
Consider the shear that maps e1 and e2 to themselves, and that
also maps e3 to
\[ \mathbf{a}_3 = \begin{bmatrix} a \\ b \\ 1 \end{bmatrix}. \]
The shear matrix S1 that accomplishes the desired task is easily found:
\[ S_1 = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & b \\ 0 & 0 & 1 \end{bmatrix}. \]

Sketch 9.5.
A 3D shear parallel to the e1, e2-plane.

It is illustrated in Sketch 9.5 with a = 1 and b = 1, and in Figure 9.4.
Thus this map shears parallel to the [e1 , e2 ]-plane. Suppose we apply
Figure 9.4.
Shears in 3D: a paraboloid is sheared in the e1 - and e2 -directions. The e3 -direction
runs through the center of the left paraboloid.
9.6 Rotations
Suppose you want to rotate a vector v around the e3 -axis by 90◦ to
a vector v . Sketch 9.6 illustrates such a rotation:
\[ \mathbf{v} = \begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} \ \rightarrow\ \mathbf{v}' = \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}. \]
A rotation around e3 by different angles would result in different vec-
tors, but they all will have one thing in common: their third compo-
nents will not be changed by the rotation. Thus, if we rotate a vector
around e3 , the rotation action will change only its first and second
components. This suggests another look at the 2D rotation matrices
from Section 4.6. Our desired rotation matrix R3 looks much like the one from (4.16):
\[ R_3 = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix}. \tag{9.7} \]

Sketch 9.6.
Rotation example.
Figure 9.5 illustrates the letter L rotated through several angles about
the e3 -axis.
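Equation (9.7) in a few lines of NumPy (our sketch), reproducing the 90° example above:

import numpy as np

def R3(alpha_deg):
    """Rotation about the e3-axis by alpha degrees, per (9.7)."""
    a = np.radians(alpha_deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])

v = np.array([2.0, 0.0, 1.0])
print(R3(90) @ v)    # [0. 2. 1.]: the third component is untouched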
Example 9.4
Figure 9.5.
Rotations in 3D: the letter L rotated about the e3 -axis.
RT R = I
RT = R−1
Figure 9.6.
Rotations in 3D: the letter L is rotated about axes that are not the coordinate axes. On
the right the point on the L that touches the rotation axes does not move.
Example 9.5
Example 9.6
Let α = 90°,
\[ \mathbf{a} = \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix} \quad\text{and}\quad \mathbf{v} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \]
With C = 0 and S = 1 in (9.10), we calculate
\[ R = \begin{bmatrix} \tfrac13 & \tfrac13 - \tfrac{1}{\sqrt{3}} & \tfrac13 + \tfrac{1}{\sqrt{3}} \\ \tfrac13 + \tfrac{1}{\sqrt{3}} & \tfrac13 & \tfrac13 - \tfrac{1}{\sqrt{3}} \\ \tfrac13 - \tfrac{1}{\sqrt{3}} & \tfrac13 + \tfrac{1}{\sqrt{3}} & \tfrac13 \end{bmatrix}. \]
We obtain
\[ \mathbf{v}' = \begin{bmatrix} \tfrac13 \\ \tfrac13 + \tfrac{1}{\sqrt{3}} \\ \tfrac13 - \tfrac{1}{\sqrt{3}} \end{bmatrix}. \]
Convince yourself that ‖v′‖ = ‖v‖.
Continue this example with the vector
\[ \mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}. \]
Surprised by the result?
9.7 Projections
Projections that are linear maps are parallel projections. There are
two categories. If the projection direction is perpendicular to the
projection plane then it is an orthogonal projection, otherwise it is
an oblique projection. Two examples are illustrated in Figure 9.7, in
which one of the key properties of projections is apparent: flattening.
The orthogonal and oblique projection matrices that produced this
figure are
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 0 & 1/\sqrt{2} \\ 0 & 1 & 1/\sqrt{2} \\ 0 & 0 & 0 \end{bmatrix}, \]
respectively.
Projections are essential in computer graphics to view 3D geometry
on a 2D screen. A parallel projection is a linear map, as opposed to a
perspective projection, which is not. A parallel projection preserves
relative dimensions of an object, thus it is used in drafting to produce
accurate views of a design.
Recall from 2D, Section 4.8, that a projection reduces dimension-
ality and it is an idempotent map. It flattens geometry because a
projection matrix P is rank deficient; in 3D this means that a vector
Figure 9.7.
Projections in 3D: on the left is an orthogonal projection, and on the right is an oblique
projection of 45◦ .
\[ P_k = A_k A_k^{\mathsf T}. \]
P1 v = (u · v)u, (9.12)
P2 v = (u1 · v)u1 + (u2 · v)u2 . (9.13)
Example 9.7
The example above is very simple, and we can immediately see that the projection direction is d = [0 0 ±1]T. This vector satisfies the equation
\[ P_2\mathbf{d} = \mathbf{0}, \]
and we see that the projection direction is in the kernel of the map.

Sketch 9.9.
Projection example.
The idempotent property for P2 is easily understood by noticing that A2TA2 is simply the 2 × 2 identity matrix,
\[ P_2^2 = A_2 A_2^{\mathsf T} A_2 A_2^{\mathsf T}
= \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix}\begin{bmatrix} \mathbf{u}_1^{\mathsf T} \\ \mathbf{u}_2^{\mathsf T} \end{bmatrix}\begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix}\begin{bmatrix} \mathbf{u}_1^{\mathsf T} \\ \mathbf{u}_2^{\mathsf T} \end{bmatrix}
= \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 \end{bmatrix} I \begin{bmatrix} \mathbf{u}_1^{\mathsf T} \\ \mathbf{u}_2^{\mathsf T} \end{bmatrix}
= P_2. \]
\[ 0 = (P\mathbf{v})^{\mathsf T}(\mathbf{v} - P\mathbf{v}) = \mathbf{v}^{\mathsf T}(P^{\mathsf T} - P^{\mathsf T}P)\mathbf{v}, \]
The cofactor is also written as (−1)i+j Mi,j where Mi,j is called the
minor of ai,j . As a result, (9.14) is also known as expansion by minors.
We’ll look into this method more in Section 12.6.
If (9.14) is expanded, then an interesting form for writing the de-
terminant arises. The formula is nearly impossible to remember, but
the following trick is not. Copy the first two columns after the last
column. Next, form the product of the three “diagonals” and add
them. Then, form the product of the three “antidiagonals” and sub-
tract them. The three “plus” products may be written as:
\[ a_{1,1}a_{2,2}a_{3,3}, \quad a_{1,2}a_{2,3}a_{3,1}, \quad a_{1,3}a_{2,1}a_{3,2}, \]
and in total,
\[ |A| = a_{1,1}a_{2,2}a_{3,3} + a_{1,2}a_{2,3}a_{3,1} + a_{1,3}a_{2,1}a_{3,2} - a_{3,1}a_{2,2}a_{1,3} - a_{3,2}a_{2,3}a_{1,1} - a_{3,3}a_{2,1}a_{1,2}. \]
Example 9.8
• If A is invertible then
\[ |A^{-1}| = \frac{1}{|A|}. \]
v′′ = BAv.
The schematic arrangement for the product of two matrices puts A in the upper right and B in the lower left; the product C = BA then fills the lower right:
\[ \begin{array}{c|c} & A \\ \hline B & C \end{array} \]
For example,
\[ \begin{array}{c|c}
 & \begin{matrix} 1 & 5 & -4 \\ -1 & -2 & 0 \\ 2 & 3 & -4 \end{matrix} \\ \hline
\begin{matrix} 0 & 0 & -1 \\ 1 & -2 & 0 \\ -2 & 1 & 1 \end{matrix} &
\begin{matrix} -2 & -3 & 4 \\ 3 & 9 & -4 \\ -1 & -9 & 4 \end{matrix}
\end{array} \]
m × n and n × p, (9.16)
Example 9.9
Figure 9.8.
Combining 3D rotations: left and right, the original L is labeled I for identity matrix.
On the left, R1 is applied and then R3 , the result is labeled R3 R1 . On the right, R3 is
applied and then R1 , the result is labeled R1 R3 . This shows that 3D rotations do not
commute.
Scalar laws:
· a(B + C) = aB + aC
· (a + b)C = aC + bC
· (ab)C = a(bC)
· a(BC) = (aB)C = B(aC)
• |A| + |B| = |A + B| does not hold in general
• |cA| = cn|A|
• Ar+s = ArAs
• (AT)T = A
• A0 = I
• [AB]T = BTAT
9.12 Exercises
1. Let v = 3a1 + 2a2 + a3 , where
\[ \mathbf{a}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{a}_2 = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix}, \quad \mathbf{a}_3 = \begin{bmatrix} 0 \\ 3 \\ 0 \end{bmatrix}. \]
4. Given
\[ \mathbf{v} = \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}, \quad \mathbf{w} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}, \quad \mathbf{u} = \begin{bmatrix} 7 \\ 3 \\ 0 \end{bmatrix}, \]
is u in the subspace defined by v and w?
5. Is the vector u = v ∧ w in the subspace defined by v and w?
6. Let V1 be the one-dimensional subspace defined by
\[ \mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}. \]
Map the unit cube with this matrix. What is the volume of the resulting
parallelepiped?
13. What matrix rotates around the e1 -axis by α degrees?
14. What matrix rotates by 45° around the vector
\[ \begin{bmatrix} -1 \\ 0 \\ -1 \end{bmatrix}? \]
15. Construct the orthogonal projection matrix P that projects onto the line spanned by
\[ \mathbf{u} = \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3} \end{bmatrix} \]
and what is the action of the map, v′ = Pv? What is the action of the map on the following two vectors:
\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}? \]
rotation:
\[ \begin{bmatrix} 1/\sqrt{2} & 0 & 1/\sqrt{2} \\ 0 & 1 & 0 \\ -1/\sqrt{2} & 0 & 1/\sqrt{2} \end{bmatrix}, \]
scale:
\[ \begin{bmatrix} 1/2 & 0 & 0 \\ 0 & 1/4 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \]
projection:
\[ \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
28. If
\[ B = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix} \quad\text{and}\quad B^{-1} = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ -1 & 1 & -1 \end{bmatrix}, \]
what is (3B)−1?
29. For matrix A in Exercise 2, what is (AT )T ?
10
Affine Maps in 3D
Figure 10.1.
Affine maps in 3D: fighter jets twisting and turning through 3D space.
Sketch 10.2.
Affine maps leave ratios
invariant. This map is a rigid
body motion.
Figure 10.2.
Affine map property: parallel planes get mapped to parallel planes via an affine map.
10.2 Translations
A translation is simply (10.2) with A = I, the 3 × 3 identity matrix:
\[ I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \]
that is,
\[ \mathbf{x}' = \mathbf{p} + I\mathbf{x}. \]
Thus, the new [a1 , a2 , a3 ]-system has its coordinate axes parallel to
the [e1 , e2 , e3 ]-system. The term Ix = x needs to be interpreted as
a vector in the [e1 , e2 , e3 ]-system for this to make sense. Figure 10.3
shows an example of repeated 3D translations.
Just as in 2D, a translation is a rigid body motion. The volume of
an object is not changed.
Figure 10.3.
Translations in 3D: three translated teapots.
then we know that the image has the same relationship with the p′i:
\[ u_1 + u_2 + u_3 + u_4 = 1, \]
Example 10.1
Let's assume we want to map this tetrahedron to the four points p′i
\[ \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ -1 \end{bmatrix}. \]
This is a pretty straightforward map if you consult Sketch 10.4. Let's see where the point
\[ \mathbf{x} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]
ends up. First, we find that
\[ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = -2\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \]
i.e., the barycentric coordinates of x with respect to the original pi are (−2, 1, 1, 1). Note how they sum to one. Now it is simple to compute the image of x; compute x′ using the same barycentric coordinates with respect to the p′i:
\[ \mathbf{x}' = -2\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ -1 \end{bmatrix} = \begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}. \]

Sketch 10.4.
An example tetrahedron map.
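The whole example fits in a few lines of NumPy (our sketch):

import numpy as np

P  = np.array([[0,0,0], [1,0,0], [0,1,0], [0,0,1]], dtype=float)    # original p_i
Pp = np.array([[0,0,0], [-1,0,0], [0,-1,0], [0,0,-1]], dtype=float) # images p_i'
x  = np.array([1.0, 1.0, 1.0])

# barycentric coordinates u of x: solve [p_i; 1] u = [x; 1]
M = np.vstack([P.T, np.ones(4)])              # 4x4 system enforcing sum(u) = 1
u = np.linalg.solve(M, np.append(x, 1.0))
print(u)                                      # [-2.  1.  1.  1.]

x_prime = Pp.T @ u                            # same coordinates w.r.t. the p_i'
print(x_prime)                                # [-1. -1. -1.]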
Example 10.2
Figure 10.4.
Projections in 3D: a 3D helix is projected into two different 2D planes.
\[ [\mathbf{x} + t\mathbf{v} - \mathbf{q}]\cdot\mathbf{n} = 0, \]
\[ [\mathbf{x} - \mathbf{q}]\cdot\mathbf{n} + t\,\mathbf{v}\cdot\mathbf{n} = 0, \]
\[ t = \frac{[\mathbf{q} - \mathbf{x}]\cdot\mathbf{n}}{\mathbf{v}\cdot\mathbf{n}}. \]
The intersection point x′ is now computed as
\[ \mathbf{x}' = \mathbf{x} + \frac{[\mathbf{q} - \mathbf{x}]\cdot\mathbf{n}}{\mathbf{v}\cdot\mathbf{n}}\,\mathbf{v}. \tag{10.7} \]
that once x has been mapped into the projection plane, to x′, it will remain there. We can also show the map is idempotent algebraically,
\[ A^2 = \left[I - \frac{\mathbf{v}\mathbf{n}^{\mathsf T}}{\mathbf{v}\cdot\mathbf{n}}\right]\left[I - \frac{\mathbf{v}\mathbf{n}^{\mathsf T}}{\mathbf{v}\cdot\mathbf{n}}\right]
= I - 2\,\frac{\mathbf{v}\mathbf{n}^{\mathsf T}}{\mathbf{v}\cdot\mathbf{n}} + \left[\frac{\mathbf{v}\mathbf{n}^{\mathsf T}}{\mathbf{v}\cdot\mathbf{n}}\right]^2, \]
and since \( (\mathbf{v}\mathbf{n}^{\mathsf T})(\mathbf{v}\mathbf{n}^{\mathsf T}) = (\mathbf{n}\cdot\mathbf{v})\,\mathbf{v}\mathbf{n}^{\mathsf T} \), the squared term reduces to \( \mathbf{v}\mathbf{n}^{\mathsf T}/(\mathbf{v}\cdot\mathbf{n}) \); thus A² = A. We can also show that repeating the affine map is idempotent as well:
\[ A(A\mathbf{x} + \mathbf{p}) + \mathbf{p} = A^2\mathbf{x} + A\mathbf{p} + \mathbf{p} = A\mathbf{x} + \mathbf{p}, \]
since A² = A and Ap = 0.
Example 10.3
\[ \mathbf{v}\cdot\mathbf{n} = -1, \]
\[ \mathbf{v}\mathbf{n}^{\mathsf T} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix}, \]
\[ \frac{\mathbf{q}\cdot\mathbf{n}}{\mathbf{v}\cdot\mathbf{n}}\,\mathbf{v} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. \]
Putting all the pieces together:
\[ \mathbf{x}' = \left[I - \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}\right]\begin{bmatrix} 3 \\ 2 \\ 4 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \\ -4 \end{bmatrix}. \]
A quick check shows the image satisfies the plane equation:
\[ 3 + 2 - 4 - 1 = 0, \]
Which of the two possibilities, (10.7) or the affine map (10.8) should
you use? Clearly, (10.7) is more straightforward and less involved. Yet
in some computer graphics or CAD system environments, it may be
desirable to have all maps in a unified format, i.e., Ax + p. We’ll
revisit this unified format idea in Section 10.5.
\[ \mathbf{x}' = M\mathbf{x}. \tag{10.9} \]
and
\[ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ 1 \end{bmatrix}, \qquad \mathbf{x}' = \begin{bmatrix} x_1' \\ x_2' \\ x_3' \\ 1 \end{bmatrix}. \]
The 4D point x is called the homogeneous form of the affine point
x. You should verify for yourself that (10.9) is indeed the same affine
map as before.
The homogeneous representation of a vector v must have the form,
\[ \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ 0 \end{bmatrix}. \]
Then
\[ \mathbf{v}' = M\mathbf{v}. \]
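A NumPy sketch (ours) of the homogeneous form; the linear part A and translation p below are made-up values for illustration:

import numpy as np

A = np.diag([1.0, 2.0, 3.0])         # any 3x3 linear part
p = np.array([1.0, 0.0, -1.0])       # translation

M = np.eye(4)
M[:3, :3] = A
M[:3, 3] = p                         # M packs A and p together

x = np.array([1.0, 1.0, 1.0, 1.0])   # point: fourth component 1
v = np.array([1.0, 1.0, 1.0, 0.0])   # vector: fourth component 0

print(M @ x)   # [2. 2. 2. 1.]: the point is mapped AND translated
print(M @ v)   # [1. 2. 3. 0.]: the vector ignores the translation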
Example 10.4
\[ M:\ \begin{bmatrix} [\mathbf{q}\cdot\mathbf{n}]\,I & \mathbf{o} \\ \mathbf{0}^{\mathsf T} & \mathbf{x}\cdot\mathbf{n} \end{bmatrix}. \]
Example 10.5
Figure 10.7.
Perspective maps: an experiment by A. Dürer.
10.6 Exercises
We’ll use four points
\[ \mathbf{x}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\ \mathbf{x}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\ \mathbf{x}_3 = \begin{bmatrix} 0 \\ 0 \\ -1 \end{bmatrix},\ \mathbf{x}_4 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \]
four points
\[ \mathbf{y}_1 = \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix},\ \mathbf{y}_2 = \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix},\ \mathbf{y}_3 = \begin{bmatrix} 0 \\ 0 \\ -1 \end{bmatrix},\ \mathbf{y}_4 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \]
6. Using a direction
\[ \mathbf{v} = \frac{1}{4}\begin{bmatrix} 2 \\ 0 \\ 2 \end{bmatrix}, \]
what are the images of the xi when projected in this direction onto the
plane defined at the beginning of the exercises?
7. Using the same v as in Exercise 6, what are the images of the yi ?
8. What are the images of the xi when projected onto the plane by a
perspective projection through the origin?
9. What are the images of the yi when projected onto the plane by a
perspective projection through the origin?
10. Compute the centroid c of the xi and then the centroid c of their
perspective images (from Exercise 8). Is c the image of c under the
perspective map?
11. We claimed that (10.8) reduces to (10.10). This necessitates that
\[ \left[I - \frac{\mathbf{v}\mathbf{n}^{\mathsf T}}{\mathbf{n}\cdot\mathbf{v}}\right]\mathbf{x} = \mathbf{0}. \]
−2
What is u ?
11
Interactions in 3D
Figure 11.1.
Ray tracing: 3D intersections are a key element of rendering a ray-traced image.
(Courtesy of Ben Steinberg, Arizona State University.)
The tools of points, lines, and planes are our most basic 3D geometry
building blocks. But in order to build real objects, we must be able
to compute with these building blocks. For example, if we are given
p = q + tn;
n · [p − tn] + c = 0.
Thus,
\[ t = \frac{c + \mathbf{n}\cdot\mathbf{p}}{\mathbf{n}\cdot\mathbf{n}}. \tag{11.1} \]
It is good practice to assure that n is normalized, i.e., n · n = 1, and
then
t = c + n · p. (11.2)
Note that t = 0 is equivalent to n · p + c = 0; in that case, p is on
the plane to begin with!
Example 11.1
p − q = tn.
d = c + n · p.
x1 (s1 ) = p1 + s1 v1 , and
x2 (s2 ) = p2 + s2 v2 ,
\[ [\mathbf{x}_2 - \mathbf{x}_1]\cdot\mathbf{v}_1 = 0, \]
\[ [\mathbf{x}_2 - \mathbf{x}_1]\cdot\mathbf{v}_2 = 0, \]
or
\[ [\mathbf{p}_2 - \mathbf{p}_1]\cdot\mathbf{v}_1 = s_1\,\mathbf{v}_1\cdot\mathbf{v}_1 - s_2\,\mathbf{v}_1\cdot\mathbf{v}_2, \]
\[ [\mathbf{p}_2 - \mathbf{p}_1]\cdot\mathbf{v}_2 = s_1\,\mathbf{v}_1\cdot\mathbf{v}_2 - s_2\,\mathbf{v}_2\cdot\mathbf{v}_2. \]
These are two equations in the two unknowns s1 and s2 , and are thus
readily solved using the methods from Chapter 5.
Example 11.2
Let l1 be given by
\[ \mathbf{x}_1(s_1) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + s_1\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \]
This line is parallel to the e2-axis; both lines are shown in Sketch 11.4. Our linear system becomes
\[ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} s_1 \\ s_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]
with solution s1 = 0 and s2 = −1. These are the two points of closest proximity; the distance between the lines is ‖x1(0) − x2(−1)‖ = 1.

Sketch 11.4.
Distance between two lines.
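The same computation as a reusable NumPy sketch (ours). The definition of l2 falls on the previous page, so the choice p2 = [0 1 1]T, v2 = [0 1 0]T below is our assumption; it reproduces the numbers above:

import numpy as np

def closest_points(p1, v1, p2, v2):
    """Points of closest proximity of two lines x_i(s) = p_i + s v_i."""
    # the 2x2 system in s1, s2 derived above
    M = np.array([[v1 @ v1, -(v1 @ v2)],
                  [v1 @ v2, -(v2 @ v2)]])
    rhs = np.array([(p2 - p1) @ v1, (p2 - p1) @ v2])
    s1, s2 = np.linalg.solve(M, rhs)
    return p1 + s1 * v1, p2 + s2 * v2

x1, x2 = closest_points(np.array([0., 0., 0.]), np.array([1., 0., 0.]),
                        np.array([0., 1., 1.]), np.array([0., 1., 0.]))
print(x1, x2, np.linalg.norm(x1 - x2))   # [0 0 0] [0 0 1] 1.0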
x = p + tv. (11.4)
At this point, we do not know the correct value for t; once we have
it, our problem is solved.
The solution is obtained by substituting the expression for x from
(11.4) into (11.3):
\[ [\mathbf{p} + t\mathbf{v} - \mathbf{q}]\cdot\mathbf{n} = 0. \]

Sketch 11.6.
A line and a plane.

Thus,
\[ [\mathbf{p} - \mathbf{q}]\cdot\mathbf{n} + t\,\mathbf{v}\cdot\mathbf{n} = 0 \]
and
\[ t = \frac{[\mathbf{q} - \mathbf{p}]\cdot\mathbf{n}}{\mathbf{v}\cdot\mathbf{n}}. \tag{11.5} \]
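As a NumPy sketch (ours), with the parallel-line tolerance mentioned in the footnote below:

import numpy as np

def line_plane_intersection(p, v, q, n, eps=0.001):
    """Intersect the line x = p + t v with the plane through q with normal n."""
    denom = v @ n
    if abs(denom) < eps:           # line (nearly) parallel to the plane
        return None
    t = ((q - p) @ n) / denom      # equation (11.5)
    return p + t * v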
Example 11.3
x = q + u1 r1 + u2 r2 .
2 Keep in mind that real numbers are rarely equal to zero. A tolerance needs
to be used; 0.001 should work if both n and v are normalized. However, such
tolerances are driven by applications.
11.4. Intersecting a Triangle and a Line 231
p + tv = q + u1 r1 + u2 r2 .
This equation is short for three individual equations, one for each
coordinate. We thus have three equations in three unknowns u1 , u2 , t,
\[ \begin{bmatrix} \mathbf{r}_1 & \mathbf{r}_2 & -\mathbf{v} \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ t \end{bmatrix} = \mathbf{p} - \mathbf{q}, \tag{11.7} \]
and solve them with Gauss elimination. (The basic idea of this
method was presented in Chapter 5, and 3 × 3 and larger systems
are covered in Chapter 12.)
We thus arrive at
p + tv = p1 + u1 (p2 − p1 ) + u2 (p3 − p1 ),
11.5 Reflections
The next problem is that of line or plane reflection. Given a point x
on a plane P and an “incoming” direction v, what is the reflected or
“outgoing” direction v ? See Sketch 11.9, where we look at the plane
P “edge on.” We assume that v, v , and n are of unit length.
From physics, you might recall that the angle between v and the
plane normal n must equal that of v′ and n, except for a sign change. We conveniently record this fact using a dot product:
\[ -\mathbf{v}\cdot\mathbf{n} = \mathbf{v}'\cdot\mathbf{n}. \tag{11.8} \]
A multiple of the normal accounts for the difference between v′ and v:
\[ c\mathbf{n} = \mathbf{v}' - \mathbf{v} \tag{11.9} \]
for some real number c. This means that some multiple of the normal vector may be written as the sum v′ + (−v).
We now solve (11.9) for v′ and insert into (11.8):
\[ -\mathbf{v}\cdot\mathbf{n} = [c\mathbf{n} + \mathbf{v}]\cdot\mathbf{n}, \]
and solving for c (recall that n is of unit length) gives c = −2v·n, so that
\[ \mathbf{v}' = \mathbf{v} - 2[\mathbf{v}^{\mathsf T}\mathbf{n}]\mathbf{n}, \]
You see this after multiplying out all products involved. Note that
nnT is an orthogonal projection matrix, just as we encountered in
\[ H = I - 2\mathbf{n}\mathbf{n}^{\mathsf T}. \tag{11.13} \]
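The Householder matrix in NumPy (our sketch; n is assumed to be of unit length):

import numpy as np

def householder(n):
    """Reflection matrix H = I - 2 n n^T for a unit normal n, per (11.13)."""
    n = np.asarray(n, dtype=float)
    return np.eye(len(n)) - 2.0 * np.outer(n, n)

n = np.array([0.0, 0.0, 1.0])         # reflect about the [e1, e2]-plane
v = np.array([1.0, 2.0, 3.0])
H = householder(n)
print(H @ v)                          # [ 1.  2. -3.]
print(np.allclose(H @ H, np.eye(3)))  # True: reflecting twice is the identity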
n1 · x + c1 = 0,
n2 · x + c2 = 0,
n3 · x + c3 = 0.
\[ \begin{bmatrix} \mathbf{n}_1^{\mathsf T} \\ \mathbf{n}_2^{\mathsf T} \\ \mathbf{n}_3^{\mathsf T} \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -c_1 \\ -c_2 \\ -c_3 \end{bmatrix}. \tag{11.14} \]
Example 11.4
x1 + x3 = 1, x3 = 1, x2 = 2.
Example 11.5
Since n2 = n1 + n3, they are indeed linearly dependent, and thus the planes defined by them do not intersect in one point (see Sketch 11.13).

Sketch 11.13.
Three planes intersecting in one line.
base and arm, but as a user, you may choose your own frame of refer-
ence that we will call the [b1 , b2 , b3 ]-coordinate frame. For example,
when using this digitized model, it might be convenient to have the
b1 axis generally aligned with the head to tail direction. Placing the
origin of your coordinate frame close to the cat is also advantageous
for numerical stability with respect to round-off error and accuracy.
(The digitizer has its highest accuracy within a small radius from its
base.) An orthonormal frame facilitates data capture and the trans-
formation between coordinate frames. It is highly unlikely that you
can manually create an orthonormal frame, so the digitizer will do it
for you, but it needs some help.
With the digitizing arm, let’s choose the point p to be the origin
of our input coordinate frame, the point q will be along the b1 axis,
and the point r will be close to the b2 axis. From these three points,
we form two vectors,
v1 = q − p and v2 = r − p.
\[ \mathbf{b}_2 = \frac{\mathbf{v}_2 - (\mathbf{v}_2\cdot\mathbf{b}_1)\mathbf{b}_1}{\|\mathbf{v}_2 - (\mathbf{v}_2\cdot\mathbf{b}_1)\mathbf{b}_1\|}. \tag{11.16} \]
As the last step, we create b3 from the component of v3 that is
orthogonal to the subspace V12 , which means we normalize (v3 −
projV12 v3 ). This is done by separating the projection into the sum of
a projection onto V1 and onto V2 :
\[ \mathbf{b}_3 = \frac{\mathbf{v}_3 - (\mathbf{v}_3\cdot\mathbf{b}_1)\mathbf{b}_1 - (\mathbf{v}_3\cdot\mathbf{b}_2)\mathbf{b}_2}{\|\mathbf{v}_3 - (\mathbf{v}_3\cdot\mathbf{b}_1)\mathbf{b}_1 - (\mathbf{v}_3\cdot\mathbf{b}_2)\mathbf{b}_2\|}. \tag{11.17} \]
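Together with b1 = v1/‖v1‖, equations (11.16) and (11.17) are exactly Gram-Schmidt orthonormalization. A NumPy sketch of ours (the sample vectors are made up):

import numpy as np

def orthonormal_frame(v1, v2, v3):
    """Gram-Schmidt: build b1, b2, b3 from three linearly independent vectors."""
    b1 = v1 / np.linalg.norm(v1)
    u2 = v2 - (v2 @ b1) * b1                     # remove the b1 component, (11.16)
    b2 = u2 / np.linalg.norm(u2)
    u3 = v3 - (v3 @ b1) * b1 - (v3 @ b2) * b2    # remove b1 and b2 components, (11.17)
    b3 = u3 / np.linalg.norm(u3)
    return b1, b2, b3

b1, b2, b3 = orthonormal_frame(np.array([1., 0., 0.]),
                               np.array([1., 1., 0.]),
                               np.array([2., -0.5, 2.]))
B = np.array([b1, b2, b3])
print(np.allclose(B @ B.T, np.eye(3)))   # True: the frame is orthonormal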
Example 11.6
\[ \mathbf{w} = \operatorname{proj}_{V_{12}}\mathbf{v}_3 = \operatorname{proj}_{V_1}\mathbf{v}_3 + \operatorname{proj}_{V_2}\mathbf{v}_3
= \left(\begin{bmatrix} 2 \\ -0.5 \\ 2 \end{bmatrix}\cdot\begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix}\right)\begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix} + \left(\begin{bmatrix} 2 \\ -0.5 \\ 2 \end{bmatrix}\cdot\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right)\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
= \begin{bmatrix} 2 \\ -0.5 \\ 0 \end{bmatrix}. \]
11.9 Exercises
For some of the exercises that follow, we will refer to the following two
planes and line. P1 goes through a point p and has normal vector n:
\[ \mathbf{p} = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \quad \mathbf{n} = \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix}. \]
The plane P2 is given by its implicit form
x1 + 2x2 − 2x3 − 1 = 0.
The line l goes through the point q and has direction v,
\[ \mathbf{q} = \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}. \]
to the plane P2 . Is this point on the same side as the normal direction?
2. Given the plane \( (1/\sqrt{3})x_1 + (1/\sqrt{3})x_2 + (1/\sqrt{3})x_3 + (1/\sqrt{3}) = 0 \) and point p = [1 1 4]T, what is the distance of p to the plane? What is the closest point to p in the plane?
3. Revisit Example 11.2, but set the point defining the line l2 to be
\[ \mathbf{p}_2 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. \]
The lines have not changed; how do you obtain the (unchanged) solu-
tions x1 and x2 ?
4. Given two lines, xi (si ) = pi + si vi , i = 1, 2, where
\[ \mathbf{x}_1(s_1) = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + s_1\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{x}_2(s_2) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + s_2\begin{bmatrix} -1 \\ -1 \\ -1 \end{bmatrix}, \]
find the two points of closest proximity between the lines. What are the
parameter values s1 and s2 for these two points?
5. Find the intersection of P1 with the line l.
6. Find the intersection of P2 with the line l.
7. Let a be an arbitrary vector. It may be projected along a direction v
onto the plane P that passes through the origin with normal vector n.
What is its image a ?
8. Does the ray p + tv with
\[ \mathbf{p} = \begin{bmatrix} -1 \\ -1 \\ 0 \end{bmatrix} \quad\text{and}\quad \mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]
x1 + x2 = 1, x1 = 1, x3 = 4.
x1 = 1 and x3 = 4.
x1 − x2 = 0 and x1 + x2 = 2.
12
Gauss for Linear Systems
Figure 12.1.
Linear systems: the triangulation on the right was obtained from the left one by solving
a linear system.
Figure 12.1 illustrates the use of linear systems in the field of data
smoothing. The left triangulation looks somewhat “rough”; after
setting up an appropriate linear system, we compute the “smoother”
triangulation on the right, in which the triangles are closer to being
equilateral.
This chapter explains the basic ideas underlying linear systems.
Readers eager for hands-on experience should get access to software
such as Mathematica or MATLAB. Readers with advanced program-
ming knowledge can download linear system solvers from the web.
The most prominent collection of routines is LAPACK.
\[ \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \ldots & \mathbf{a}_n \end{bmatrix}\mathbf{u} = \mathbf{b}, \]
or even shorter
\[ A\mathbf{u} = \mathbf{b}. \]
The coefficient matrix A has n rows and n columns. For example, the
first row is
a1,1 , a1,2 , . . . , a1,n ,
and the second column is
\[ \begin{bmatrix} a_{1,2} \\ a_{2,2} \\ \vdots \\ a_{n,2} \end{bmatrix}. \]
Equation (12.1) is a compact way of writing n equations for the n unknowns u1, ..., un. In the 2 × 2 case, such systems had nice geometric interpretations; in the general case, that interpretation needs n-dimensional linear spaces, and is not very intuitive. Still, the methods that we developed for the 2 × 2 case can be gainfully employed here!

Sketch 12.1.
A solvable 3 × 3 system.
Some underlying principles with a geometric interpretation are best
explained for the example n = 3. We are given a vector b and we try
to write it as a linear combination of vectors a1 , a2 , a3 ,
\[ \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \mathbf{a}_3 \end{bmatrix}\mathbf{u} = \mathbf{b}. \]
If the ai are truly 3D, i.e., if they form a tetrahedron, then a unique
solution may be found (see Sketch 12.1). But if the three ai all lie in
a plane (i.e., if the volume formed by them is zero), then you cannot
write b as a linear combination of them, unless it is itself in that 2D
plane. In this case, you cannot expect uniqueness for your answer.
Sketch 12.2 covers these cases. In general, a linear system is uniquely
solvable if the ai have a nonzero n-dimensional volume. If they do
not, they span a k-dimensional subspace (with k < n)—nonunique
solutions exist only if b is itself in that subspace. A linear system is
called consistent if at least one solution exists.
For 2D and 3D, we encountered many problems that lent themselves to constructing the linear system in terms of a linear combination of column vectors, the ai. However, in Section 5.11 we looked at how a linear system can be interpreted as a problem using equations built row by row. In n dimensions, this commonly occurs. An example follows.

Sketch 12.2.
Top: no solution; bottom: nonunique solution.
Example 12.1
Suppose that at five time instances, say ti = 0, 0.25, 0.5, 0.75, 1 sec-
onds, we have associated observation data, p(ti ) = 0, 1, 0.5, 0.5, 0. We
would like to fit a polynomial to these data so we can estimate values
in between the observations. This is called polynomial interpolation.
Five points require a degree four polynomial,
\[ p(t) = c_0 + c_1 t + c_2 t^2 + c_3 t^3 + c_4 t^4, \]
or in matrix form,
\[ \begin{bmatrix} 1 & t_0 & t_0^2 & t_0^3 & t_0^4 \\ 1 & t_1 & t_1^2 & t_1^3 & t_1^4 \\ & & \vdots & & \\ 1 & t_4 & t_4^2 & t_4^3 & t_4^4 \end{bmatrix}\begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_4 \end{bmatrix} = \begin{bmatrix} p(t_0) \\ p(t_1) \\ \vdots \\ p(t_4) \end{bmatrix} \]
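Example 12.1 in NumPy (our sketch):

import numpy as np

t = np.array([0.0, 0.25, 0.5, 0.75, 1.0])   # observation times
p = np.array([0.0, 1.0, 0.5, 0.5, 0.0])     # observed values

V = np.vander(t, 5, increasing=True)        # rows [1, t_i, t_i^2, t_i^3, t_i^4]
c = np.linalg.solve(V, p)                   # coefficients c_0 ... c_4

print(np.allclose(V @ c, p))                # True: the data are interpolated
print(np.polyval(c[::-1], 0.6))             # evaluate between the observations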
Figure 12.2.
Polynomial interpolation: a degree four polynomial fit to five data points.
S1 Au = S1 b
Algebraically, what this shear did was to change the rows of the sys-
tem in the following manner:
\[ \text{row}_1 \leftarrow \text{row}_1 \quad\text{and}\quad \text{row}_2 \leftarrow \text{row}_2 - \tfrac12\,\text{row}_1. \]
Each of these constitutes an elementary row operation. Back substi-
tution came next, with
\[ u_2 = \tfrac14 \times 2 = \tfrac12 \]
and then
\[ u_1 = \tfrac12(4 - 4u_2) = 1. \]
The divisions in the back substitution equations are actually scalings,
thus they could be rewritten in terms of a scale matrix:
\[ S_2 = \begin{bmatrix} 1/2 & 0 \\ 0 & 1/4 \end{bmatrix}, \]
Example 12.2
Let’s step through the necessary row exchanges and shears for a 3 × 3
linear system. The goal is to get it in upper triangular form so we
may use back substitution to solve for the unknowns. The system is
\[ \begin{bmatrix} 2 & -2 & 0 \\ 4 & 0 & -2 \\ 4 & 2 & -4 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \begin{bmatrix} 4 \\ -2 \\ 0 \end{bmatrix}. \]
The matrix element a1,1 is not the largest in the first column, so we
choose the 4 in the second row to be the pivot element and we reorder
the rows:
\[ \begin{bmatrix} 4 & 0 & -2 \\ 2 & -2 & 0 \\ 4 & 2 & -4 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \begin{bmatrix} -2 \\ 4 \\ 0 \end{bmatrix}. \]
The permutation matrix that achieves this row exchange is
\[ P_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}. \]
(The subscript 1 indicates that this matrix is designed to achieve the
appropriate exchange for the first column.)
To zero entries in the first column apply:
\[ \text{row}_2 \leftarrow \text{row}_2 - \tfrac12\,\text{row}_1, \]
\[ \text{row}_3 \leftarrow \text{row}_3 - \text{row}_1, \]
and the system becomes
\[ \begin{bmatrix} 4 & 0 & -2 \\ 0 & -2 & 1 \\ 0 & 2 & -2 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \begin{bmatrix} -2 \\ 5 \\ 2 \end{bmatrix}. \]
The shear matrix that achieves this is
\[ G_1 = \begin{bmatrix} 1 & 0 & 0 \\ -1/2 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}. \]
Now the first column consists of only zeroes except for a1,1 , meaning
that it is lined up with the e1 -axis.
Now work on the second column vector. First, check if pivoting
is necessary; this means checking that a2,2 is the largest in absolute
value of all values in the second column that are below the diagonal.
No pivoting is necessary. (We could say that the permutation matrix
P2 = I.) To zero the last element in this vector apply
row3 ← row3 + row2 ,
which produces
\[ \begin{bmatrix} 4 & 0 & -2 \\ 0 & -2 & 1 \\ 0 & 0 & -1 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \begin{bmatrix} -2 \\ 5 \\ 7 \end{bmatrix}. \]
Au = b,
Algorithm:
Pivoting step:
Find the element with the largest absolute value in column j
from aj,j to an,j ; this is element ar,j .
If r > j, exchange equations r and j.
At this point, all elements below the diagonal have been set to
zero. The matrix is now in upper triangular form.
Back substitution:
\[ u_n = b_n / a_{n,n} \]
For j = n − 1, ..., 1:
\[ u_j = \frac{1}{a_{j,j}}\bigl[b_j - a_{j,j+1}u_{j+1} - \ldots - a_{j,n}u_n\bigr]. \]
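The algorithm translates almost line by line into Python; our sketch (not the book's code):

import numpy as np

def gauss_solve(A, b):
    """Gauss elimination with partial (row) pivoting, then back substitution."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for j in range(n - 1):
        # pivoting step: move the largest |a_ij| in column j to the diagonal
        r = j + np.argmax(np.abs(A[j:, j]))
        if r != j:
            A[[j, r]] = A[[r, j]]
            b[[j, r]] = b[[r, j]]
        # elimination step: zero the entries below the pivot
        for i in range(j + 1, n):
            factor = A[i, j] / A[j, j]
            A[i, j:] -= factor * A[j, j:]
            b[i] -= factor * b[j]
    # back substitution
    u = np.zeros(n)
    for j in range(n - 1, -1, -1):
        u[j] = (b[j] - A[j, j+1:] @ u[j+1:]) / A[j, j]
    return u

A = np.array([[2, -2, 0], [4, 0, -2], [4, 2, -4]])
b = np.array([4, -2, 0])
print(gauss_solve(A, b))   # [-4. -6. -7.], the solution of Example 12.2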
Example 12.3
then
row3 ← row3 − row1 .
Step j = 1 is complete and the linear system is now
\[ \begin{bmatrix} 2 & 2 & 0 \\ 0 & 0 & 2 \\ 0 & -1 & 1 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \begin{bmatrix} 6 \\ 6 \\ 1 \end{bmatrix}. \]
Example 12.4
The matrix clearly has rank one. First perform forward elimination,
arriving at
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\mathbf{u} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}. \]
For each zero row of the transformed system, set the corresponding
ui , the free variables, to one: u3 = 1 and u2 = 1. Back substituting
these into the first equation gives u1 = −5 for the pivot variable.
Thus a final solution is
\[ \mathbf{u} = \begin{bmatrix} -5 \\ 1 \\ 1 \end{bmatrix}, \]
All linear combinations of elements of the null space are also in the
null space, for example, u = 1u1 + 1u2 .
Example 12.5
where column 1 was exchanged with column 2 and then column 2 was
exchanged with column 3. Each exchange requires that the associated
unknowns are exchanged as well. Set the free variable: u1 = 1, then
back substitution results in u3 = 0 and u2 = 0. All vectors
\[ c\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \]
AA−1 = I. (12.6)
Example 12.6
Here, the matrices are n × n, and the vectors ai as well as the ei are
vectors with n components. The vector ei has all zero entries except
for its ith component; it equals 1.
We may now interpret (12.7) as n linear systems:
Example 12.7
Example 12.8
12.5 LU Decomposition
Gauss elimination has two major parts: transforming the system to
upper triangular form with forward elimination and back substitution.
The creation of the upper triangular matrix may be written in terms
of matrix multiplications using Gauss matrices Gj . For now, assume
that no pivoting is necessary. If we denote the final upper triangular
matrix by U , then we have
Gn−1 · . . . · G1 · A = U. (12.9)
It follows that
\[
A = G_1^{-1} \cdot \ldots \cdot G_{n-1}^{-1} U.
\]
The product L = G_1^{-1} · . . . · G_{n-1}^{-1} is lower triangular with
ones on the diagonal, so we may write
A = LU, (12.10)
which is known as the LU decomposition of A. It is also called the
triangular factorization of A. Every invertible matrix A has such a
decomposition, although it may be necessary to employ pivoting.
Denote the elements of L by li,j (keeping in mind that li,i = 1) and
those of U by ui,j . A simple 3 × 3 example will help illustrate the
idea.
\[
\begin{bmatrix} 1 & 0 & 0\\ l_{2,1} & 1 & 0\\ l_{3,1} & l_{3,2} & 1 \end{bmatrix}
\begin{bmatrix} u_{1,1} & u_{1,2} & u_{1,3}\\ 0 & u_{2,2} & u_{2,3}\\ 0 & 0 & u_{3,3} \end{bmatrix} =
\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3}\\ a_{2,1} & a_{2,2} & a_{2,3}\\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}.
\]
In this scheme, we are given the ai,j and we want the li,j and ui,j .
This is systematically achieved using the following.
Observe that elements of A below the diagonal may be rewritten as
\[
a_{i,j} = l_{i,1}u_{1,j} + \ldots + l_{i,j-1}u_{j-1,j} + l_{i,j}u_{j,j}; \quad j < i.
\]
For the elements of A that are on or above the diagonal, we get
\[
a_{i,j} = l_{i,1}u_{1,j} + \ldots + l_{i,i-1}u_{i-1,j} + l_{i,i}u_{i,j}; \quad j \ge i.
\]
This leads to
\[
l_{i,j} = \frac{1}{u_{j,j}}\bigl(a_{i,j} - l_{i,1}u_{1,j} - \ldots - l_{i,j-1}u_{j-1,j}\bigr); \quad j < i \quad (12.11)
\]
and
\[
u_{i,j} = a_{i,j} - l_{i,1}u_{1,j} - \ldots - l_{i,i-1}u_{i-1,j}; \quad j \ge i. \quad (12.12)
\]
If A has a decomposition A = LU , then the system can be writ-
ten as
LU u = b. (12.13)
First solve
Ly = b, (12.14)
then solve
U u = y. (12.15)
If U u = y, then LU u = Ly = b. The two systems in (12.14) and
(12.15) are triangular and easy to solve. Forward substitution is ap-
plied to the matrix L. (See Exercise 21 and its solution for an algo-
rithm.) Back substitution is applied to the matrix U . An algorithm
is provided in Section 12.2.
A more direct method for forming L and U is achieved with (12.11)
and (12.12), rather than through Gauss elimination. This then is the
method of LU decomposition.
LU Decomposition

Algorithm:

For k = 1, . . . , n:
Compute u_{k,j} for j = k, . . . , n from (12.12).
Compute l_{i,k} for i = k + 1, . . . , n from (12.11).

The u_{k,k} term in (12.11) must not be zero; we had a similar situation with
Gauss elimination. This situation either requires pivoting or the ma-
trix might be singular.
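As an illustration, the following Python sketch (ours, under the
assumption that no pivoting is needed) builds L and U directly from
(12.11) and (12.12):

import numpy as np

def lu_decompose(A):
    # Direct LU decomposition via (12.11) and (12.12); no pivoting.
    n = A.shape[0]
    L = np.eye(n)
    U = np.zeros((n, n))
    for k in range(n):
        for j in range(k, n):  # (12.12): row k of U
            U[k, j] = A[k, j] - L[k, :k] @ U[:k, j]
        if U[k, k] == 0:
            raise ValueError("zero pivot: pivoting needed or A singular")
        for i in range(k + 1, n):  # (12.11): column k of L
            L[i, k] = (A[i, k] - L[i, :k] @ U[:k, k]) / U[k, k]
    return L, U

Applied to the matrix of Example 12.9 below,
A = [[2, 2, 4], [-1, 2, -3], [1, 2, 2]], it reproduces the factors L and
U shown there.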
Example 12.9
\[
k = 3: \quad u_{3,3} = a_{3,3} - l_{3,1}u_{1,3} - l_{3,2}u_{2,3} = 2 - 2 + \tfrac{1}{3} = \tfrac{1}{3}.
\]
Check that this produces valid entries for L and U:
\[
\begin{bmatrix} 1 & 0 & 0\\ -1/2 & 1 & 0\\ 1/2 & 1/3 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 2 & 4\\ 0 & 3 & -1\\ 0 & 0 & 1/3 \end{bmatrix} =
\begin{bmatrix} 2 & 2 & 4\\ -1 & 2 & -3\\ 1 & 2 & 2 \end{bmatrix}.
\]
12.6 Determinants
With the introduction of the scalar triple product, Section 8.5 pro-
vided a geometric derivation of 3 × 3 determinants; they measure vol-
ume. And then in Section 9.8 we learned more about determinants
from the perspective of linear maps. Let’s revisit that approach for
n × n determinants.
When we apply forward elimination to A, transforming it to upper
triangular form U, we apply a sequence of shears and row exchanges.
Shears do not change volumes. As we learned in Section 9.8, a row
exchange will change the sign of the determinant. Thus
\[
\det A = (-1)^k\, u_{1,1} u_{2,2} \cdot \ldots \cdot u_{n,n}, \quad (12.16)
\]
where k is the number of row exchanges performed.
Example 12.10
Now, apply (12.16) to the upper triangular form U (12.5) from the
example, and notice that we did one row exchange, k = 1:
\[
\det A = (-1)^1\,[2 \times (-1) \times 2] = 4.
\]
and the Mi,j are called the minors; each is the determinant of the
matrix with the ith row and jth column removed. The Mi,j are
(n − 1) ×(n − 1) determinants, and they are computed by yet another
cofactor expansion. This process is repeated until we have 2 × 2
determinants. This technique is also known as expansion by minors.
Example 12.11
Example 12.12
Let’s solve the linear system from Example 12.3 using Cramer’s rule.
Following (12.17), we have
\[
u_1 = \frac{\begin{vmatrix} 6 & 2 & 0\\ 9 & 1 & 2\\ 7 & 1 & 1 \end{vmatrix}}
           {\begin{vmatrix} 2 & 2 & 0\\ 1 & 1 & 2\\ 2 & 1 & 1 \end{vmatrix}}, \quad
u_2 = \frac{\begin{vmatrix} 2 & 6 & 0\\ 1 & 9 & 2\\ 2 & 7 & 1 \end{vmatrix}}
           {\begin{vmatrix} 2 & 2 & 0\\ 1 & 1 & 2\\ 2 & 1 & 1 \end{vmatrix}}, \quad
u_3 = \frac{\begin{vmatrix} 2 & 2 & 6\\ 1 & 1 & 9\\ 2 & 1 & 7 \end{vmatrix}}
           {\begin{vmatrix} 2 & 2 & 0\\ 1 & 1 & 2\\ 2 & 1 & 1 \end{vmatrix}}.
\]
Figure 12.3.
Least squares: fitting a straight line to stock price data for AIG from 2000 to 2013.
temperature = a × time + b,
Figure 12.4.
Least squares: a linear approximation to experimental data of time and temperature
pairs.
Au = b (12.19)
which is equivalent to
\[
A^T b^{\perp} = 0.
\]
Based on (12.20), we can substitute b − b′ for b^⊥:
\[
A^T(b - b') = 0
\]
\[
A^T(b - Au) = 0
\]
\[
A^T b - A^T A u = 0.
\]
Rearranging this last equation, we have the normal equations
\[
A^T A u = A^T b. \quad (12.21)
\]
This is a linear system with a square matrix AT A! Even more, that
matrix is symmetric. The solution to the new system (12.21), when
it has one, is the one that minimizes the error
‖Au − b‖₂,
and for this reason, it is called the least squares solution of the original
system. Recall that b′ is closest to b in V and since we solved (12.19),
we have in effect minimized ‖b − b′‖.
It seems pretty amazing that by simply multiplying both sides by
AT , we have a “best” solution to the original problem!
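For instance, a minimal sketch (ours, assuming the time/temperature
data of Figure 12.4, which reappears in Example 13.3) solves the normal
equations (12.21) directly:

import numpy as np

A = np.array([[0., 1], [10, 1], [20, 1], [30, 1],
              [40, 1], [50, 1], [60, 1]])
b = np.array([30., 25, 40, 40, 30, 5, 25])
u = np.linalg.solve(A.T @ A, A.T @ b)  # normal equations (12.21)
print(u)  # approximately [-0.23, 34.82]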
Example 12.13
Figure 12.5.
Femur transplant: left, a titanium femoral head with shaft. Right, an example of a
sphere fit. Black points are “in front,” gray points are occluded.
r = ρ_1 (12.22)
⋮ (12.23)
r = ρ_L. (12.24)
Summing these L equations gives
Lr = ρ_1 + . . . + ρ_L,
12.9 Exercises
1. Does the linear system
\[
\begin{bmatrix} 1 & 2 & 0\\ 0 & 0 & 0\\ 1 & 2 & 1 \end{bmatrix} u =
\begin{bmatrix} 1\\ 2\\ 3 \end{bmatrix}
\]
have a unique solution? Is it consistent?
2. Does the linear system
\[
\begin{bmatrix} 1 & 1 & 5\\ 1 & -1 & 1\\ 1 & 2 & 7 \end{bmatrix} u =
\begin{bmatrix} 3\\ 3\\ 3 \end{bmatrix}
\]
have a unique solution? Is it consistent?
3. Examine the linear system in Example 12.1. What restriction on the ti
is required to guarantee a unique solution?
4. Solve the linear system Av = b where
\[
A = \begin{bmatrix} 1 & 0 & -1 & 2\\ 0 & 0 & 1 & -2\\ 2 & 0 & 0 & 1\\ 1 & 1 & 1 & 0 \end{bmatrix},
\quad \text{and} \quad
b = \begin{bmatrix} -1\\ 2\\ 1\\ -3 \end{bmatrix}.
\]
Show all the steps from the Gauss elimination algorithm.
21. Write a forward substitution algorithm for solving the lower triangular
system (12.14).
22. Use the LU decomposition of A from Exercise 20 to solve the linear
system Au = b, where
\[
b = \begin{bmatrix} 4\\ 0\\ 4 \end{bmatrix}.
\]
23. Calculate the determinant of
\[
A = \begin{bmatrix} 3 & 0 & 1\\ 1 & 2 & 0\\ 1 & 1 & 1 \end{bmatrix}
\]
using expansion by minors. Show all steps.
28. Set up and solve the linear system that determines the intersection of
the three planes
x_1 + x_3 = 1, x_3 = 1, x_2 = 2.
29. Find the intersection of the plane
\[
x(u_1, u_2) = \begin{bmatrix} 0\\ 0\\ 1 \end{bmatrix}
+ u_1 \begin{bmatrix} 1\\ 0\\ -1 \end{bmatrix}
+ u_2 \begin{bmatrix} 0\\ 1\\ -1 \end{bmatrix}
\]
Figure 13.1.
A sparse matrix: all nonzero entries are marked.
Example 13.1
transformation
\[
H_i \bar{a}_i = \gamma e_i =
\begin{bmatrix} 0\\ \vdots\\ \gamma\\ \vdots\\ 0 \end{bmatrix},
\]
where γ = ±‖ā_i‖. Just as we developed for the 2 × 2 example,
\[
H_i = I - 2 n_i n_i^T, \quad (13.1)
\]
and, setting v = ā_i − γe_i and α = ½v^Tv, one finds
\[
\alpha = \gamma^2 - a_{i,i}\gamma.
\]
When H_i is applied to a column vector c,
\[
H_i c = \Bigl(I - \frac{vv^T}{\alpha}\Bigr) c = c - s v,
\quad \text{where } s = \frac{v^T c}{\alpha}.
\]
In the Householder algorithm that follows, as we work on the jth
column vector, we call
\[
\hat{a}_k = \begin{bmatrix} a_{j,k}\\ \vdots\\ a_{n,k} \end{bmatrix}
\]
to indicate that only elements j, . . . , n of the kth column vector ak
(with k ≥ j) are involved in a calculation. Thus application of Hj
results in changes in the sub-block of A with aj,j at the upper-left
corner. Hence, the vector aj and Hj aj coincide in the first j − 1
components.
For a more detailed discussion, see [2] or [11].
Example 13.2
Now we set â1 , and the reflection H1 results in the linear system
\[
\begin{bmatrix} -\sqrt{2} & 0 & 0\\ 0 & -\sqrt{2} & 0\\ 0 & 0 & 1 \end{bmatrix} u =
\begin{bmatrix} \sqrt{2}/2\\ \sqrt{2}/2\\ 1 \end{bmatrix}.
\]
Example 13.3
Let’s revisit the least squares line-fitting problem from Example 12.13.
See that example for a problem description, and see Figure 12.4 for
an illustration. The overdetermined linear system for this problem is
\[
\begin{bmatrix} 0 & 1\\ 10 & 1\\ 20 & 1\\ 30 & 1\\ 40 & 1\\ 50 & 1\\ 60 & 1 \end{bmatrix}
\begin{bmatrix} a\\ b \end{bmatrix} =
\begin{bmatrix} 30\\ 25\\ 40\\ 40\\ 30\\ 5\\ 25 \end{bmatrix}.
\]
After the first Householder reflection (j = 1), the linear system be-
comes
\[
\begin{bmatrix} -95.39 & -2.20\\ 0 & 0.66\\ 0 & 0.33\\ 0 & -0.0068\\ 0 & -0.34\\ 0 & -0.68\\ 0 & -1.01 \end{bmatrix}
\begin{bmatrix} a\\ b \end{bmatrix} =
\begin{bmatrix} -54.51\\ 16.14\\ 22.28\\ 13.45\\ -5.44\\ -39.29\\ -28.15 \end{bmatrix}.
\]
For the second Householder reflection (j = 2), the linear system be-
comes
\[
\begin{bmatrix} -95.39 & -2.20\\ 0 & -1.47\\ 0 & 0\\ 0 & 0\\ 0 & 0\\ 0 & 0\\ 0 & 0 \end{bmatrix}
\begin{bmatrix} a\\ b \end{bmatrix} =
\begin{bmatrix} -54.51\\ -51.10\\ 11.91\\ 13.64\\ 5.36\\ -17.91\\ 3.81 \end{bmatrix}.
\]
We can now solve the system with back substitution, starting with
the first nonzero row, and the solution is
\[
\begin{bmatrix} a\\ b \end{bmatrix} = \begin{bmatrix} -0.23\\ 34.82 \end{bmatrix}.
\]
or the ∞-norm
\[
\|v\|_\infty = \max_i |v_i|. \quad (13.4)
\]
Figure 13.2.
Vector norms: outline of the unit vectors for the 2-norm is a circle, ∞-norm is a square,
1-norm is a diamond.
Example 13.4
Let
\[
v = \begin{bmatrix} 1\\ 0\\ -2 \end{bmatrix}.
\]
Then
\[
\|v\|_1 = 3, \qquad \|v\|_2 = \sqrt{5} \approx 2.24, \qquad \|v\|_\infty = 2.
\]
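These norms are available directly in NumPy; a quick check of
Example 13.4 (our own sketch):

import numpy as np

v = np.array([1., 0., -2.])
print(np.linalg.norm(v, 1))       # 3.0
print(np.linalg.norm(v, 2))       # 2.236..., i.e., sqrt(5)
print(np.linalg.norm(v, np.inf))  # 2.0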
\[
\|v\| \ge 0, \quad (13.6)
\]
\[
\|v\| = 0 \text{ if and only if } v = 0, \quad (13.7)
\]
\[
\|cv\| = |c|\,\|v\| \text{ for } c \in \mathbb{R}, \quad (13.8)
\]
\[
\|v + w\| \le \|v\| + \|w\|. \quad (13.9)
\]
sort are often repeated many times and against many points. Instead,
the max norm might be more suitable from a computing point of view
and sufficient given the relationship in (13.5). This method of using
an inexpensive measure to exclude many possibilities is called trivial
reject in some disciplines such as computer graphics.
Figure 13.3.
Action of a 2 × 2 matrix: points on the unit circle (black) are mapped to points on an
ellipse (gray). This is the action ellipse. The vector shown is the semi-major axis of the
ellipse, which has length σ₁.
Example 13.5
Let’s examine matrix norms for
\[
A = \begin{bmatrix} 1 & 2 & 3\\ 3 & 4 & 5\\ 5 & 6 & -7 \end{bmatrix}.
\]
Its singular values are 10.5, 7.97, 0.334, resulting in
\[
\|A\|_2 = \max\{10.5,\, 7.97,\, 0.334\} = 10.5,
\]
\[
\|A\|_1 = \max\{9, 12, 15\} = 15,
\]
\[
\|A\|_\infty = \max\{6, 12, 18\} = 18,
\]
\[
\|A\|_F = \sqrt{1^2 + 2^2 + \ldots + (-7)^2}
        = \sqrt{10.5^2 + 7.97^2 + 0.334^2} \approx 13.2.
\]
From the examples above, we see that matrix norms are real-valued
functions of the linear space defined over all n × n matrices.1 Matrix
norms satisfy conditions very similar to the vector norm conditions
(13.6)–(13.9):
\[
\|A\| > 0 \text{ for } A \ne Z, \quad (13.15)
\]
\[
\|A\| = 0 \text{ for } A = Z, \quad (13.16)
\]
\[
\|cA\| = |c|\,\|A\|, \quad c \in \mathbb{R}, \quad (13.17)
\]
\[
\|A + B\| \le \|A\| + \|B\|, \quad (13.18)
\]
\[
\|AB\| \le \|A\|\,\|B\|, \quad (13.19)
\]
Z being the zero matrix.
1 More on this type of linear space in Chapter 14.
Figure 13.4.
Condition number: The action of A with κ(A) = 30.
Example 13.6
Let
\[
A = \begin{bmatrix} \cos\alpha & -\sin\alpha\\ \sin\alpha & \cos\alpha \end{bmatrix},
\]
meaning that A is an α degree rotation. Clearly, AT A = I, where
I is the identity matrix. Thus, σ1 = σ2 = 1. Hence the condition
number of a rotation matrix is 1. Since a rotation does not distort,
this is quite intuitive.
Example 13.7
Now let
\[
A = \begin{bmatrix} 100 & 0\\ 0 & 0.01 \end{bmatrix},
\]
a matrix that scales by 100 in the e1 -direction and by 0.01 in the e2 -
direction. This matrix is severely distorting! We quickly find σ1 = 100
and σ2 = 0.01 and thus the condition number is 100/0.01 = 10,000.
The fact that A distorts is clearly reflected in the magnitude of its
condition number.
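A quick numerical check of this example (our sketch; the condition
number is the ratio of largest to smallest singular value):

import numpy as np

A = np.array([[100., 0.], [0., 0.01]])
sigma = np.linalg.svd(A, compute_uv=False)  # singular values
print(sigma[0] / sigma[-1])                 # 10000.0
print(np.linalg.cond(A, 2))                 # the same value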
Example 13.8
are not all that different. A vector sequence has a limit if each com-
ponent has a limit.
Example 13.9
It does not have a limit: even though the last two components each
have a limit, the first component diverges.
In other words, from some index i on, the distance of any v^(i) from v
is smaller than an arbitrarily small amount ε. See Figure 13.5 for an
example. If the sequence converges with respect to one norm, it will
converge with respect to all norms. Our focus will be on the (usual)
Euclidean or 2-norm, and the subscript will be omitted.
Vector sequences are key to iterative methods, such as the two
methods for solving Au = b in Section 13.6 and the power method
for finding the dominant eigenvalue in Section 15.2. In practical ap-
plications, the limit vector v is not known. For some special prob-
lems, we can say whether a limit exists; however, we will not know it
a priori. So we will modify our theoretical convergence measure
(13.20) and examine the distance between successive iterations,
‖v^(i+1) − v^(i)‖,
Figure 13.5.
Vector sequences: a sequence that converges.
which measures the change from one iteration to the next. In the case
of an iterative solution to a linear system, u(i) , we will test for the
condition that ‖Au^(i) − b‖ < ε, which indicates that this iteration
has provided an acceptable solution.
Example 13.10
An iterative method starts from a guess for the solution and then
refines it until it is the solution. Let’s take
\[
u^{(1)} = \begin{bmatrix} u_1^{(1)}\\ u_2^{(1)}\\ u_3^{(1)} \end{bmatrix} =
\begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix}
\]
for our first guess, and note that it clearly is not the solution to our
system: Au(1) = b.
A better guess ought to be obtained by using the current guess and
solving the first equation for a new u_1^(2), the second for a new u_2^(2),
and so on. This gives us
\[
4u_1^{(2)} + 1 = 1,
\]
\[
2 + 5u_2^{(2)} + 1 = 0,
\]
\[
-1 + 2 + 4u_3^{(2)} = 3,
\]
and thus
\[
u^{(2)} = \begin{bmatrix} 0\\ -0.6\\ 0.5 \end{bmatrix}.
\]
The next iteration becomes
\[
4u_1^{(3)} - 0.6 = 1,
\]
\[
5u_2^{(3)} + 0.5 = 0,
\]
\[
-1.2 + 4u_3^{(3)} = 3,
\]
and thus
\[
u^{(3)} = \begin{bmatrix} 0.4\\ -0.1\\ 1.05 \end{bmatrix}.
\]
2 This example was taken from Johnson and Riess [11].
After a few more iterations, we will be close enough to the true solu-
tion
\[
u = \begin{bmatrix} 0.333\\ -0.333\\ 1.0 \end{bmatrix}.
\]
Try one more iteration for yourself.
Example 13.11
Then
\[
u^{(2)} =
\begin{bmatrix} 0.25 & 0 & 0\\ 0 & 0.2 & 0\\ 0 & 0 & 0.25 \end{bmatrix}
\left( \begin{bmatrix} 1\\ 0\\ 3 \end{bmatrix} -
\begin{bmatrix} 0 & 1 & 0\\ 2 & 0 & 1\\ -1 & 2 & 0 \end{bmatrix}
\begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix} \right) =
\begin{bmatrix} 0\\ -0.6\\ 0.5 \end{bmatrix}
\]
‖Au^(k) − b‖
should become small (i.e., less than some preset tolerance). Thus, we
check the length of the residual vector after each iteration, and stop
once it is smaller than our tolerance.
A modification of the Gauss-Jacobi method is known as Gauss-
Seidel iteration. When we compute u^(k+1) in the Gauss-Jacobi method,
we can observe the following: the second element, u_2^(k+1), is computed
using u_1^(k), u_3^(k), . . . , u_n^(k). We had just computed u_1^(k+1). It stands to
reason that using it instead of u_1^(k) would be advantageous. This
idea gives rise to the Gauss-Seidel method: as soon as a new element
u_i^(k+1) is computed, the estimate vector u^(k+1) is updated.
In summary, Gauss-Jacobi updates the new estimate vector once
all of its elements are computed, Gauss-Seidel updates as soon as a
new element is computed. Typically, Gauss-Seidel iteration converges
faster than Gauss-Jacobi iteration.
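The two iterations differ in essentially one line of code. Here is a
minimal sketch (ours) of both, applied to the system of Example 13.10
as reconstructed from the matrices shown in Example 13.11:

import numpy as np

def gauss_jacobi(A, b, u, iterations):
    # Update the whole estimate vector from the previous one.
    d = np.diag(A)
    R = A - np.diagflat(d)
    for _ in range(iterations):
        u = (b - R @ u) / d
    return u

def gauss_seidel(A, b, u, iterations):
    # Use each new element as soon as it is computed.
    u = u.astype(float).copy()
    n = len(b)
    for _ in range(iterations):
        for i in range(n):
            u[i] = (b[i] - A[i, :i] @ u[:i] - A[i, i+1:] @ u[i+1:]) / A[i, i]
    return u

A = np.array([[4., 1, 0], [2, 5, 1], [-1, 2, 4]])
b = np.array([1., 0, 3])
print(gauss_jacobi(A, b, np.ones(3), 2))  # [0.4, -0.1, 1.05] = u^(3)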
13.7 Exercises
1. Use the Householder method to solve the following linear system
\[
\begin{bmatrix} 1 & 1.1 & 1.1\\ 1 & 0.9 & 0.9\\ 0 & -0.1 & 0.2 \end{bmatrix} u =
\begin{bmatrix} 1\\ 1\\ 0.3 \end{bmatrix}.
\]
4. Examining the 1- and 2-norms defined in Section 13.2, how would you
define a 3-norm?
5. Show that ‖v‖₁ satisfies the properties (13.6)–(13.9) of a vector
norm.
6. Define a new vector norm to be max{2|v1 |, |v2 |, . . . , |vn |}. Show that
this is indeed a vector norm. For vectors in R2 , sketch the outline of all
unit vectors with respect to this norm.
7. Let
\[
A = \begin{bmatrix} 1 & 0 & 1\\ 0 & 1 & 2\\ 0 & 0 & 1 \end{bmatrix}.
\]
What is A’s 2-norm?
8. Let four unit vectors be given by
\[
v_1 = \begin{bmatrix} 1\\ 0 \end{bmatrix}, \quad
v_2 = \begin{bmatrix} 0\\ 1 \end{bmatrix}, \quad
v_3 = \begin{bmatrix} -1\\ 0 \end{bmatrix}, \quad
v_4 = \begin{bmatrix} 0\\ -1 \end{bmatrix}.
\]
18. Let a linear system be given by
\[
A = \begin{bmatrix} 4 & 0 & 1\\ 2 & -8 & 2\\ 1 & 0 & 2 \end{bmatrix}
\quad \text{and} \quad
b = \begin{bmatrix} 0\\ 2\\ 0 \end{bmatrix}.
\]
19. Carry out three Gauss-Jacobi iterations for the linear system
\[
A = \begin{bmatrix} 4 & 1 & 0 & 1\\ 1 & 4 & 1 & 0\\ 0 & 1 & 4 & 1\\ 1 & 0 & 1 & 4 \end{bmatrix}
\quad \text{and} \quad
b = \begin{bmatrix} 0\\ 1\\ 1\\ 0 \end{bmatrix},
\]
20. Carry out three iterations of Gauss-Seidel for Example 13.10. Which
method, Gauss-Jacobi or Gauss-Seidel, is converging to the solution
faster? Why?
14
General Linear Spaces
Figure 14.1.
General linear spaces: all cubic polynomials over the interval [0, 1] form a linear
space. Some elements of this space are shown.
In Sections 4.3 and 9.2, we had a first look at the concept of linear
spaces, also called vector spaces, by examining the properties of the
spaces of all 2D and 3D vectors. In this chapter, we will provide a
framework for linear spaces that are not only of dimension two or
three, but of possibly much higher dimension. These spaces tend to
be somewhat abstract, but they are a powerful concept in dealing
with many real-life problems, such as car crash simulations or weather
prediction.
Example 14.1
Let’s start with a very familiar example of a linear space: R². Suppose
\[
u = \begin{bmatrix} 1\\ 1 \end{bmatrix} \quad \text{and} \quad
v = \begin{bmatrix} -2\\ 3 \end{bmatrix}
\]
are elements of this space; we know that
\[
w = 2u + v = \begin{bmatrix} 0\\ 5 \end{bmatrix}
\]
is also in R².
1 Note that we will not always use the L notation, but rather the standard
Example 14.2
Let the linear space M2×2 be the set of all 2 × 2 matrices. We know
that the linearity property (14.1) holds because of the rules of matrix
arithmetic from Section 4.12.
Example 14.3
which is not in V2 .
\[
v_1 = s_2v_2 + s_3v_3 + \ldots + s_rv_r
\]
Example 14.4
Example 14.5
Example 14.6
Maps that do not have this property are called nonlinear maps and
are much harder to deal with.
Linear maps are conveniently written in terms of matrices. A vector
v in L_n is mapped to v′ in L_m,
\[
v' = Av,
\]
where A has m rows and n columns. The matrix A describes the map
from the [e₁, . . . , eₙ]-system to the [a₁, . . . , aₙ]-system, where the a_i
are in L_m. This means that v′ is the linear combination
\[
v' = v_1a_1 + v_2a_2 + \ldots + v_na_n,
\]
Example 14.7
in R3 by A.
The matrix A has a certain rank k—how can we infer this rank from
the matrix? First of all, a matrix of size m × n can be at most of rank
k = min{m, n}. This is called full rank. In other words, a linear map
can never increase dimension. It is possible to map L_n to a higher-
dimensional space L_m. However, the images of L_n’s n basis vectors
will span a subspace of L_m of dimension at most n. Example 14.7
demonstrates this idea: the matrix A has rank 2, thus the v̂_i live
in a two-dimensional subspace of R³. A matrix with rank less than this
min{m, n} is called rank deficient. We perform forward elimination
(possibly with row exchanges) until the matrix is in upper triangular
form. If after forward elimination there are k nonzero rows, then the
rank of A is k. This is equivalent to our definition in Section 4.2
that the rank is equal to the number of linearly independent column
vectors. Figure 14.2 gives an illustration of some possible scenarios.
Figure 14.2.
The three types of matrices: from left to right, m < n, m = n, m > n. Examples of full
rank matrices are on the top row, and examples of rank deficient matrices are on the
bottom row. In each, gray indicates nonzero entries and white indicates zero entries
after forward elimination was performed.
Example 14.8
There is one row of zeroes, and we conclude that the matrix has
rank 3, which is full rank since min{4, 3} = 3.
Next, let us take the matrix
\[
\begin{bmatrix} 1 & 3 & 4\\ 0 & 1 & 2\\ 1 & 2 & 2\\ 0 & 1 & 2 \end{bmatrix}.
\]
and we conclude that this matrix has rank 2, which is rank deficient.
Let’s review some other features of linear maps that we have en-
countered in earlier chapters. A square, n × n matrix A of rank n is
invertible, which means that there is a matrix that undoes A’s action.
This is the inverse matrix, denoted by A−1 . See Section 12.4 on how
to compute the inverse.
If a matrix is invertible, then it does not reduce dimension and its
determinant is nonzero. The determinant of a square matrix measures
the volume of the n-dimensional parallelepiped, which is defined by
\[
\langle v, w\rangle = v \cdot w = v_1w_1 + v_2w_2 + \ldots + v_nw_n.
\]
From our experience with 2D and 3D, we can easily show that the
dot product satisfies the inner product properties.
Example 14.9
Figure 14.3.
Inner product: comparing the dot product (black) with the test inner product from
Example 14.9 (gray). For each plot, the unit vector r is rotated in the range [0, 2π].
Left: the inner products e₁ · r and ⟨e₁, r⟩. Middle: the lengths √(r · r) and √⟨r, r⟩.
Right: the distances √((e₁ − r) · (e₁ − r)) and √⟨(e₁ − r), (e₁ − r)⟩.
For vectors in Rⁿ and the dot product, we have the Euclidean norm
\[
\|v\| = \sqrt{v_1^2 + v_2^2 + \ldots + v_n^2}
\]
and
\[
\mathrm{dist}(u, v) = \sqrt{(u_1 - v_1)^2 + (u_2 - v_2)^2 + \ldots + (u_n - v_n)^2}.
\]
Example 14.10
Let’s get a feel for the norm and distance concept for the test inner
product in (14.7). We have
\[
\|e_1\| = \sqrt{\langle e_1, e_1\rangle} = \sqrt{4(1)^2 + 2(0)^2} = 2,
\]
\[
\mathrm{dist}(e_1, e_2) = \sqrt{4(1-0)^2 + 2(0-1)^2} = \sqrt{6}.
\]
The dot product produces ‖e₁‖ = 1 and dist(e₁, e₂) = √2.
Figure 14.3 illustrates the difference between the dot product and
the test inner product. Again, r is a unit vector, rotated in the range
[0, 2π]. The middle plot shows that unit length vectors with respect
to the dot product are not unit length with respect to the test inner
product. The right plot shows that the distance between two vectors
differs too.
Two vectors v and w are orthogonal if
\[
\langle v, w\rangle = 0.
\]
If we take the Cauchy-Schwarz inequality and rearrange,
\[
\frac{\langle v, w\rangle^2}{\|v\|^2\|w\|^2} \le 1,
\]
we obtain
\[
-1 \le \frac{\langle v, w\rangle}{\|v\|\,\|w\|} \le 1.
\]
Now we can express the angle θ between v and w by
\[
\cos\theta = \frac{\langle v, w\rangle}{\|v\|\,\|w\|}.
\]
\[
\|v\| \ge 0,
\]
\[
\|v\| = 0 \text{ if and only if } v = 0,
\]
\[
\|\alpha v\| = |\alpha|\,\|v\|,
\]
\[
\|v + w\| \le \|v\| + \|w\|,
\]
formed from two vectors in R2 . See the exercises for the generalized
Pythagorean theorem.
The tools associated with the inner product are key for orthogonal
decomposition and best approximation in a linear space. Recall that
these concepts were introduced in Section 2.8 and have served as the
building blocks for the 3D Gram-Schmidt method for construction
of an orthonormal coordinate frame (Section 11.8) and least squares
approximation (Section 12.7).
We finish with the general definition of a projection. Let u1 , . . . , uk
span a subspace Lk of L. If v is a vector not in that space, then
All terms ⟨b₁, b₂⟩, ⟨b₁, b₃⟩, etc. vanish since the b_i are mutually
orthogonal and ⟨b₁, b₁⟩ = 1. Thus, ⟨u − û, b₁⟩ = 0. In the same
manner, we show that u − û is orthogonal to the remaining b_i. Thus
\[
b_{r+1} = \frac{u - \mathrm{proj}_{S_r} u}{\|u - \mathrm{proj}_{S_r} u\|},
\]
and the set b₁, . . . , b_{r+1} forms an orthonormal basis for the subspace
S_{r+1} of L_n. We may repeat this process until we have found an
orthonormal basis for all of L_n.
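A minimal sketch of this process in Python (ours; the inner product is
passed in as a parameter, so the dot product may be swapped for any
other inner product such as (14.7)):

import numpy as np

def gram_schmidt(vectors, inner=np.dot):
    # Orthonormalize a list of linearly independent vectors.
    basis = []
    for v in vectors:
        u = np.array(v, dtype=float)
        for b in basis:
            u = u - inner(u, b) * b              # remove projection onto b
        basis.append(u / np.sqrt(inner(u, u)))   # normalize
    return basis

vs = [[1, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 0], [0, 0, 1, 0]]
for b in gram_schmidt(vs):
    print(b)

Applied to the four vectors of the following example, this reproduces
the basis b₁, . . . , b₄ shown there.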
Given the vectors
\[
v_1 = \begin{bmatrix} 1\\ 0\\ 0\\ 0 \end{bmatrix}, \quad
v_2 = \begin{bmatrix} 1\\ 1\\ 1\\ 1 \end{bmatrix}, \quad
v_3 = \begin{bmatrix} 0\\ 1\\ 0\\ 0 \end{bmatrix}, \quad
v_4 = \begin{bmatrix} 0\\ 0\\ 1\\ 0 \end{bmatrix},
\]
we wish to form an orthonormal basis, b₁, b₂, b₃, b₄.
The Gram-Schmidt method, as defined in the displayed equations
above, produces
\[
b_1 = \begin{bmatrix} 1\\ 0\\ 0\\ 0 \end{bmatrix}, \quad
b_2 = \begin{bmatrix} 0\\ 1/\sqrt{3}\\ 1/\sqrt{3}\\ 1/\sqrt{3} \end{bmatrix}, \quad
b_3 = \begin{bmatrix} 0\\ 2/\sqrt{6}\\ -1/\sqrt{6}\\ -1/\sqrt{6} \end{bmatrix}.
\]
The final vector, b4 , is defined as
\[
b_4 = v_4 - \langle v_4, b_1\rangle b_1 - \langle v_4, b_2\rangle b_2 - \langle v_4, b_3\rangle b_3 =
\begin{bmatrix} 0\\ 0\\ 1/\sqrt{2}\\ -1/\sqrt{2} \end{bmatrix}
\]
Knowing that the bi are normalized and checking that bi · bj = 0,
we can be confident that this is an orthonormal basis. Another tool
we have is the determinant, which will be one,
\[
\det\begin{bmatrix} b_1 & b_2 & b_3 & b_4 \end{bmatrix} = 1.
\]
Example 14.12
For a second example, a linear space is given by the set of all real-
valued continuous functions over the interval [0, 1]. This space is
typically named C[0, 1]. Clearly the linearity condition is met: if f
and g are elements of C[0, 1], then αf + βg is also in C[0, 1]. Here
we have an example of a linear space that is infinite-dimensional,
meaning that no finite set of functions forms a basis for C[0, 1].
For a third example, consider the set of all 3 × 3 matrices. They
form a linear space; this space consists of matrices. In this space,
linear combinations are formed using standard matrix addition and
multiplication with a scalar as summarized in Section 9.11.
And, finally, a more abstract example. The set of all linear maps
from a linear space Ln into the reals forms a linear space itself and
it is called the dual space L∗n of Ln . As indicated by the notation, its
dimension equals that of Ln . The linear maps in L∗n are known as
linear functionals.
For an example, let a fixed vector v and a variable vector u be
in L_n. The linear functionals defined by Φ_v(u) = ⟨u, v⟩ are in L*_n.
Then, for any basis b₁, . . . , bₙ of L_n, we can define linear functionals
Φ_{b_i}(u) = ⟨u, b_i⟩ for i = 1, . . . , n.
These functionals form a basis for L∗n .
Example 14.13
Example 14.14
14.6 Exercises
1. Given elements of R4 ,
\[
u = \begin{bmatrix} 1\\ 2\\ 0\\ 4 \end{bmatrix}, \quad
v = \begin{bmatrix} 0\\ 0\\ 2\\ 7 \end{bmatrix}, \quad
w = \begin{bmatrix} 3\\ 1\\ 2\\ 0 \end{bmatrix},
\]
is r = 3u + 6v + 2w also in R4 ?
2. Given matrices that are elements of M3×3 ,
\[
A = \begin{bmatrix} 1 & 0 & 2\\ 2 & 0 & 1\\ 1 & 1 & 3 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} 3 & 0 & 0\\ 0 & 3 & 1\\ 4 & 1 & 7 \end{bmatrix},
\]
is C = 4A + B an element of M3×3 ?
3. Does the set of all polynomials with an = 1 form a linear space?
4. Does the set of all 3D vectors with nonnegative components form a
subspace of R3 ?
5. Are
\[
u = \begin{bmatrix} 1\\ 0\\ 1\\ 0\\ 1 \end{bmatrix}, \quad
v = \begin{bmatrix} 1\\ 1\\ 1\\ 1\\ 1 \end{bmatrix}, \quad
w = \begin{bmatrix} 0\\ 1\\ 0\\ 1\\ 0 \end{bmatrix}
\]
linearly independent?
6. Are
\[
u = \begin{bmatrix} 1\\ 2\\ 1 \end{bmatrix}, \quad
v = \begin{bmatrix} 3\\ 6\\ 1 \end{bmatrix}, \quad
w = \begin{bmatrix} 2\\ 2\\ 1 \end{bmatrix}
\]
linearly independent?
7. Is the vector
\[
r = \begin{bmatrix} 2\\ 3\\ 2\\ 3\\ 2 \end{bmatrix}
\]
in the subspace defined by u, v, w defined in Exercise 5?
8. Is the vector
\[
r = \begin{bmatrix} 2\\ 3\\ 2\\ 3\\ 2 \end{bmatrix}
\]
in the subspace defined by u, v, w defined in Exercise 6?
9. What is the dimension of the linear space formed by all n × n matrices?
10. Suppose we are given a linear map A : R4 → R2 , preimage vectors
vi , and corresponding image vectors wi . What are the dimensions
of the matrix A? The following linear relationship exists among the
preimages,
v4 = 3v1 + 6v2 + 9v3 .
What relationship holds for w4 with respect to the wi ?
11. What is the rank of the matrix
\[
\begin{bmatrix} 1 & 2 & 0\\ -1 & -2 & 1\\ 0 & 0 & 1\\ 2 & 4 & -1 \end{bmatrix}?
\]
find ⟨w, r⟩, ‖w‖, ‖r‖, and dist(w, r) with respect to the dot product
and then for the test inner product in (14.7).
14. For v, w in R³, does
\[
\langle v, w\rangle = v_1^2w_1^2 + v_2^2w_2^2 + v_3^2w_3^2
\]
define an inner product? What about
\[
\langle v, w\rangle = 4v_1w_1 + v_2w_2 + 2v_3w_3?
\]
For 3 × 3 matrices A and B, is
\[
\langle A, B\rangle = a_{1,1}b_{1,1} + a_{1,2}b_{1,2} + a_{1,3}b_{1,3} + a_{2,1}b_{2,1} + \ldots + a_{3,3}b_{3,3}
\]
an inner product?
17. Let p(t) = p0 + p1 t + p2 t2 and q(t) = q0 + q1 t + q2 t2 be two quadratic
polynomials. Define
\[
\langle p, q\rangle = p_0q_0 + p_1q_1 + p_2q_2.
\]
21. Find a basis for the linear space formed by all 2 × 2 matrices.
22. Does the set of all monotonically increasing functions over [0, 1] form
a linear space?
23. Let L be a linear space. Is the map Φ(u) = u an element of the dual
space L∗ ?
24. Show the linearity of the derivative map on the linear space of quadratic
polynomials P2 .
15
Eigen Things Revisited
Figure 15.1.
Google matrix: part of the connectivity matrix for Wikipedia pages in 2009, which is
used to find the webpage ranking. (Source: Wikipedia, Google matrix.)
we may ask if it has fixed directions and what are the corresponding
eigenvalues.
In this chapter we go a little further and examine the power method
for finding the eigenvector that corresponds to the dominant eigen-
value. This method is paired with an application section describing
how a search engine might rank webpages based on this special eigen-
vector, given a fun, slang name—the Google eigenvector.
We explore “Eigen Things” of function spaces that are even more
general than those in the gallery in Section 14.5.
“Eigen Things” characterize a map by revealing its action and ge-
ometry. This is key to understanding the behavior of any system.
A great example of this interplay is provided by the collapse of the
Tacoma Narrows Bridge in Figures 7.1 and 7.2. But “Eigen Things”
are important in many other areas: characterizing harmonics of musi-
cal instruments, moderating movement of fuel in a ship, and analysis
of large data sets, such as the Google matrix in Figure 15.1.
Example 15.1
Let
\[
A = \begin{bmatrix} 1 & 1 & 0 & 0\\ 0 & 3 & 1 & 0\\ 0 & 0 & 4 & 1\\ 0 & 0 & 0 & 2 \end{bmatrix}.
\]
Example 15.2
The bad news is that one is not always dealing with upper trian-
gular matrices like the one in Example 15.1. A general n × n matrix
has a degree n characteristic polynomial
and the eigenvalues are the zeroes of this polynomial. Finding the
zeroes of an nth degree polynomial is a nontrivial numerical task.
In fact, for n ≥ 5, it is not certain that we can algebraically find the
factorization in (15.3) because there is no general formula like we have
for n = 2, the quadratic equation. An iterative method for finding
the dominant eigenvalue is described in Section 15.2.
For 2 × 2 matrices, in (7.3) and (7.4), we observed that the charac-
teristic polynomial easily reveals that the determinant is the product
of the eigenvalues. For n × n matrices, we have the same situation.
Consider λ = 0 in (15.3), then p(0) = det A = λ1 λ2 · . . . · λn .
Needless to say, not all eigenvalues of a matrix are real in general.
But the important class of symmetric matrices always does have real
eigenvalues.
Two more properties of eigenvalues:
• The matrices A and AT have the same eigenvalues.
• Suppose A is invertible and has eigenvalues λi , then A−1 has eigen-
values 1/λi .
Having found the λi , we can now solve homogeneous linear systems
[A − λi I]ri = 0
Example 15.3
Now we find the eigenvectors for the matrix in Example 15.1. Starting
with λ1 = 4, the corresponding homogeneous linear system is
\[
\begin{bmatrix} -3 & 1 & 0 & 0\\ 0 & -1 & 1 & 0\\ 0 & 0 & 0 & 1\\ 0 & 0 & 0 & -2 \end{bmatrix} r_1 = 0,
\]
0 0 1 0
For practice working with homogeneous systems, work out the details.
Check that each eigenvector satisfies Ari = λi ri .
Example 15.4
Let
\[
A = \begin{bmatrix} 1 & 2 & 3\\ 0 & 2 & 0\\ 0 & 0 & 2 \end{bmatrix}.
\]
This matrix has eigenvalues λi = 2, 2, 1. Finding the eigenvector
corresponding to λ1 = λ2 = 2, we get two identical homogeneous
systems
\[
\begin{bmatrix} -1 & 2 & 3\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix} r_1 = 0.
\]
We set r3,1 = r2,1 = 1, and back substitution gives r1,1 = 5. The
homogeneous system corresponding to λ₃ = 1 is
\[
\begin{bmatrix} 0 & 2 & 3\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix} r_3 = 0.
\]
Thus the two fixed directions for A are
\[
r_1 = \begin{bmatrix} 5\\ 1\\ 1 \end{bmatrix} \quad \text{and} \quad
r_3 = \begin{bmatrix} 1\\ 0\\ 0 \end{bmatrix}.
\]
Check that each eigenvector satisfies Ari = λi ri .
Example 15.5
Symmetric matrices are special again. Not only do they have real
eigenvalues, but their eigenvectors are orthogonal. This can be shown
in exactly the same way as we did for the 2D case in Section 7.5.
Recall that in this case, A is said to be diagonalizable because it is
possible to transform A to the diagonal matrix Λ = R−1 AR, where
the columns of R are A’s eigenvectors and Λ is a diagonal matrix of
A’s eigenvalues.
Example 15.6
Example 15.7
By design, we know that this matrix is rank one and singular. The
characteristic polynomial is p(λ) = λ2 (1 − λ) from which we conclude
that λ1 = 1 and λ2 = λ3 = 0. The eigenvector corresponding to λ1 is
\[
r_1 = \begin{bmatrix} 1\\ 0\\ 1 \end{bmatrix},
\]
and it spans the column space of P. The zero eigenvalue leads to the
system
\[
\begin{bmatrix} 1/2 & 0 & 1/2\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{bmatrix} r = 0.
\]
To find one eigenvector associated with this eigenvalue, we can sim-
ply assign r3 = r2 = 1, and back substitution results in r1 = −1.
330 15. Eigen Things Revisited
Alternatively, we can find two vectors that span the two-dimensional
null space of P,
\[
r_2 = \begin{bmatrix} -1\\ 0\\ 1 \end{bmatrix}, \quad
r_3 = \begin{bmatrix} 0\\ 1\\ 0 \end{bmatrix}.
\]
They are orthogonal to r₁. All linear combinations of elements of
the null space are also in the null space, and thus r = 1r₂ + 1r₃.
(Normally, the eigenvectors are normalized, but for simplicity they
are not here.)
Example 15.8
Let
\[
A = \begin{bmatrix} 1 & -2\\ 0 & -2 \end{bmatrix},
\]
then simply by observation, we see that λ1 = −2 and λ2 = 1. Let’s
compare this to the eigenvalues from (15.4). The trace is the sum of
the diagonal elements, tr(A) = −1 and det A = −2. Then
\[
\lambda_{1,2} = \frac{-1 \pm 3}{2},
\]
resulting in the correct eigenvalues.
where each v_i² term is paired with diagonal element c_{i,i} and each v_iv_j
term is paired with 2c_{i,j} due to the symmetry in C. Just as before, all
terms are quadratic. Now, the contour f(v) = 1 is an n-dimensional
ellipsoid. The semi-minor axis corresponds to the dominant eigenvector
r₁ and its length is 1/√λ₁, and the semi-major axis corresponds
to rₙ and its length is 1/√λₙ.
A real matrix is positive definite if
\[
f(v) = v^TAv > 0 \quad (15.5)
\]
for any nonzero vector v in Rⁿ. This means that the quadratic form
is positive everywhere except for v = 0. This is the same condition
we encountered in (7.17).
trices, but since those eigenvalues may be complex, we will avoid them here.
Algorithm:

Initialization:
Estimate a dominant eigenvector r^(1) ≠ 0.
Find j where |r_j^(1)| = ‖r^(1)‖∞ and set r^(1) = r^(1)/r_j^(1).
Set λ^(1) = 0.
Set tolerance ε.
Set maximum number of iterations m.

For k = 2, . . . , m:
y = Ar^(k−1).
Find j where |y_j| = ‖y‖∞.
If y_j = 0 then output: “eigenvalue zero; select new r^(1)
and restart”; exit.
λ^(k) = y_j.
r^(k) = y/y_j.
If |λ^(k) − λ^(k−1)| < ε then output: λ^(k) and r^(k); exit.
If k = m output: “maximum iterations exceeded.”
• If |λ| is either “large” or “close” to zero, the r(k) will either become
unbounded or approach zero in length, respectively. This has the
potential to cause numerical problems. It is prudent, therefore, to
scale the r(k) , so in the algorithm above, at each step, the eigenvec-
tor is scaled by its element with the largest absolute value—with
respect to the ∞-norm.
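A direct transcription of this algorithm into Python (a sketch of ours,
scaling with the ∞-norm as above):

import numpy as np

def power_method(A, r, tol=5.0e-4, max_iter=100):
    # Power method with infinity-norm scaling.
    j = np.argmax(np.abs(r))
    r = r / r[j]
    lam_old = 0.0
    for _ in range(2, max_iter + 1):
        y = A @ r
        j = np.argmax(np.abs(y))
        if y[j] == 0:
            raise ValueError("eigenvalue zero; select new start vector")
        lam, r = y[j], y / y[j]
        if abs(lam - lam_old) < tol:
            return lam, r
        lam_old = lam
    raise RuntimeError("maximum iterations exceeded")

lam, r = power_method(np.array([[2., 1], [1, 2]]),
                      np.array([1., -0.066667]))
print(lam, r)  # approaches lambda_1 = 3 with eigenvector [1, 1]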
Example 15.9
Figure 15.2 illustrates three cases, A₁, A₂, A₃, from left to right. The
three matrices and their eigenvalues are as follows:
\[
A_1 = \begin{bmatrix} 2 & 1\\ 1 & 2 \end{bmatrix}, \quad \lambda_1 = 3, \; \lambda_2 = 1,
\]
\[
A_2 = \begin{bmatrix} 2 & 0.1\\ 0.1 & 2 \end{bmatrix}, \quad \lambda_1 = 2.1, \; \lambda_2 = 1.9,
\]
\[
A_3 = \begin{bmatrix} 2 & -0.1\\ 0.1 & 2 \end{bmatrix}, \quad \lambda_1 = 2 + 0.1i, \; \lambda_2 = 2 - 0.1i.
\]
In all three examples, the vectors r^(i) were scaled relative to the ∞-
norm, thus r^(1) is scaled to
\[
r^{(1)} = \begin{bmatrix} 1\\ -0.066667 \end{bmatrix}.
\]
A tolerance of 5.0 × 10⁻⁴ was used for each matrix.
Figure 15.2.
The power method: three examples whose matrices are given in Example 15.9. The
longest black vector is the initial (guess) eigenvector. Successive iterations are in
lighter shades of gray. Each iteration is scaled with respect to the ∞-norm.
Figure 15.3.
Directed graph: represents the webpage connectivity defined by C in (15.9).
In this example, page 1 has one outlink since c3,1 = 1 and three
inlinks since c1,2 = c1,3 = c1,4 = 1. Thus, the ith column describes
the outlinks of page i and the ith row describes the inlinks of page
i. This connectivity structure is illustrated by the directed graph of
Figure 15.3.
The ranking ri of any page i is entirely defined by C. Here are
some rules with increasing sophistication:
1. The ranking ri should grow with the number of page i’s inlinks.
2. The ranking ri should be weighted by the ranking of each of page
i’s inlinks.
3. Let page i have an inlink from page j. Then the more outlinks
page j has, the less it should contribute to ri .
Let’s elaborate on these rules. Rule 1 says that a page that is
pointed to very often deserves high ranking. But rule 2 says that if
all those inlinks to page i prove to be low-ranked, then their sheer
number is mitigated by their low rankings. Conversely, if they are
mostly high-ranked, then they should boost page i’s ranking. Rule 3
implies that if page j has only one outlink and it points to page i, then
page i should be “honored” for such trust from page j. Conversely,
if page j points to a large number of pages, page i among them, this
does not give page i much pedigree.
Although not realistic, assume for now that each page has at least
one outlink and at least one inlink so that the matrix C is structured
nicely. Let o_i represent the total number of outlinks of page i. This
is simply the sum of all elements of the ith column of C. The more
outlinks page i has, the lower its contribution to page j's ranking
will be. Thus we scale every element of column i by 1/o_i. The
resulting matrix D with
\[
d_{j,i} = \frac{c_{j,i}}{o_i}
\]
is called the Google matrix. Note that all columns of D have nonneg-
ative entries and sum to one. Matrices with that property (or with
respect to the rows) are called stochastic.2
In our example above, we have
\[
D = \begin{bmatrix} 0 & 1/2 & 1/3 & 1/2\\ 0 & 0 & 1/3 & 0\\ 1 & 1/2 & 0 & 1/2\\ 0 & 0 & 1/3 & 0 \end{bmatrix}.
\]
r = Dr. (15.10)
²If the rows sum to one, the matrix is right stochastic, and if the rows
and columns sum to one, the matrix is doubly stochastic.
which was calculated with the power method algorithm in Section 15.2.
Notice that r3 = 1 is the largest component, therefore page 3 has the
highest ranking. Even though pages 1 and 3 have the same number
of inlinks, the solitary outlink from page 1 to page 3 gives page 3 the
edge in the ranking.
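A sketch (ours) that reproduces this ranking: build C for the directed
graph of Figure 15.3, scale its columns to obtain D, and run the power
iteration of Section 15.2 on D:

import numpy as np

# c_ji = 1 means page i links to page j (Figure 15.3).
C = np.array([[0, 1, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
D = C / C.sum(axis=0)        # divide column i by its outlink count o_i
r = np.ones(4)
for _ in range(50):          # power iteration; lambda_1 = 1
    r = D @ r
    r = r / np.max(np.abs(r))
print(r)                     # page 3 receives the largest ranking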
In the real world, in 2013, there were approximately 50 billion web-
pages, making the corresponding Google matrix the largest matrix ever
put to practical use. Luckily,
it contains mostly zeroes and thus is extremely sparse. Without tak-
ing advantage of that, Google (and other search engines) could not
function. Figure 15.1 illustrates a portion of a Google matrix for ap-
proximately 3 million pages. We gave three simple rules for building
D, but in the real world, many more rules are needed. For example,
webpages with no inlinks or outlinks must be considered. We would
want to modify D to ensure that the ranking r has only nonnegative
components. In order for the power method to converge, other modi-
fications of D are required as well, but that topic falls into numerical
analysis.
15.4 Eigenfunctions
An eigenvalue λ of a matrix A is typically thought of as a solution of
the matrix equation
Ar = λr.
In Section 14.5, we encountered more general spaces than those
formed by finite-dimensional vectors: those are spaces formed by poly-
nomials. Now, we will even go beyond that: we will explore the space
of all real-valued functions. Do eigenvalues and eigenvectors have
meaning there? Let’s see.
Let f be a function, meaning that y = f (x) assigns the output
value y to an input value x, and we assume both x and y are real
numbers. We also assume that f is smooth, or differentiable. An
example might be f (x) = sin(x).
The set of all such functions f forms a linear space as observed in
Section 14.5.
We can define linear maps L for elements of this function space. For
example, setting Lf = 2f is such a map, albeit a bit trivial. A more
interesting linear map is that of taking derivatives: Df = f′. Thus,
\[
Df = \lambda f,
\]
that is,
\[
f' = \lambda f. \quad (15.11)
\]
For example, f(x) = eˣ is an eigenfunction with eigenvalue λ = 1, since
\[
Df = f' = 1 \times f.
\]
More generally, f(x) = e^{λx} satisfies
\[
f'(x) = \lambda e^{\lambda x},
\]
\[
\frac{d^2\cos(kx)}{dx^2} = -k\,\frac{d\sin(kx)}{dx} = -k^2\cos(kx), \quad (15.12)
\]
and the eigenvalues are −k 2 . Can you find another set of eigenfunc-
tions?
This may seem a bit abstract, but eigenfunctions actually have
many uses, for example in differential equations and mathematical
physics. In engineering mathematics, orthogonal functions are key for
applications such as data fitting and vibration analysis. Some well-
known sets of orthogonal functions arise as the result of the solution
to a Sturm-Liouville equation such as
15.5 Exercises
1. What are the eigenvalues and eigenvectors for
\[
A = \begin{bmatrix} 2 & 1\\ 1 & 2 \end{bmatrix}?
\]
4. If
\[
r = \begin{bmatrix} 1\\ 1\\ 1 \end{bmatrix}
\]
is an eigenvector of
\[
A = \begin{bmatrix} 0 & 1 & 1\\ 1 & 1 & 0\\ 1 & 0 & 1 \end{bmatrix},
\]
what is the corresponding eigenvalue?
5. The matrices A and AT have the same eigenvalues. Why?
6. If A has eigenvalues 4, 2, 0, what is the rank of A? What is the deter-
minant?
7. Suppose a matrix A has a zero eigenvalue. Will forward elimination
change this eigenvalue?
8. Let a rotation matrix be given by
\[
R = \begin{bmatrix} \cos\alpha & 0 & -\sin\alpha\\ 0 & 1 & 0\\ \sin\alpha & 0 & \cos\alpha \end{bmatrix},
\]
carry out three steps of the power method with (15.7), and use r(3) and
r(4) in (15.8) to estimate A’s dominant eigenvalue. If you are able to
program, then try implementing the power method algorithm.
13. Let A be the matrix
\[
A = \begin{bmatrix} -8 & 0 & 8\\ 0 & 1 & -2\\ 8 & -2 & 0 \end{bmatrix}.
\]
Starting with the vector
\[
r^{(1)} = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}^T,
\]
carry out three steps of the power method with (15.7), and use r(3) and
r(4) in (15.8) to estimate A’s dominant eigenvalue. If you are able to
program, then try implementing the power method algorithm.
14. Of the following matrices, which one(s) are stochastic matrices?
\[
A = \begin{bmatrix} 0 & 1 & 0 & 0\\ 2 & 0 & 0 & 1/2\\ -1 & 0 & 0 & 1/2\\ 0 & 0 & 1 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 1/3 & 1/4 & 0\\ 0 & 0 & 1/4 & 0\\ 0 & 1/3 & 1/4 & 1\\ 0 & 1/3 & 1/4 & 0 \end{bmatrix},
\]
\[
C = \begin{bmatrix} 1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 1 \end{bmatrix}, \quad
D = \begin{bmatrix} 1/2 & 0 & 0 & 1/2\\ 0 & 1/2 & 1/2 & 0\\ 1/3 & 1/3 & 1/3 & 0\\ 1/2 & 0 & 0 & 1/2 \end{bmatrix}.
\]
15. The directed graph in Figure 15.4 describes inlinks and outlinks to web-
pages. What is the corresponding adjacency matrix C and stochastic
(Google) matrix D?
Figure 15.4.
Graph showing the connectivity defined by C.
draw the corresponding directed graph that describes these inlinks and
outlinks to webpages. What is the corresponding stochastic matrix D?
17. The Google matrix in Exercise 16 has dominant eigenvalue 1 and cor-
responding eigenvector r = [1/5, 2/5, 14/15, 2/5, 1]T . Which page has
the highest ranking? Based on the criteria for page ranking described
in Section 15.3, explain why this is so.
18. Find the eigenfunctions and eigenvalues for Lf(x) = xf′(x).
19. For the map Lf = f″, a set of eigenfunctions is given in (15.12). Find
another set of eigenfunctions.
16
The Singular Value
Decomposition
Figure 16.1.
Image compression: a method that uses the SVD. Far left: original image; second from
left: highest compression; third from left: moderate compression; far right: method
recovers original image. See Section 16.7 for details.
\[
A = U\Sigma V^T. \quad (16.2)
\]
\[
A^TA = (U\Sigma V^T)^T(U\Sigma V^T)
     = V\Sigma^TU^TU\Sigma V^T
     = V\Sigma^T\Sigma V^T
     = V\Lambda V^T, \quad (16.3)
\]
where
\[
\Lambda = \Sigma^T\Sigma =
\begin{bmatrix} \lambda_1 & 0\\ 0 & \lambda_2 \end{bmatrix} =
\begin{bmatrix} \sigma_1^2 & 0\\ 0 & \sigma_2^2 \end{bmatrix}.
\]
Equation (16.3) states the following: The symmetric positive defi-
nite matrix AT A has eigenvalues that are the diagonal entries of Λ
and eigenvectors as columns of V , which are called the right singular
vectors of A. This is the eigendecomposition of AT A.
\[
AA^T = (U\Sigma V^T)(U\Sigma V^T)^T
     = U\Sigma V^TV\Sigma^TU^T
     = U\Sigma\Sigma^TU^T
     = U\Lambda U^T, \quad (16.4)
\]
Example 16.1
Figure 16.2.
Action of a map: the unit circle is mapped to the action ellipse with semi-major axis
length σ 1 and semi-minor axis length σ 2 . Left: ellipse from matrix in Example 16.1;
middle: circle; right: ellipse from Example 16.2.
Example 16.2
We compute
\[
A^TA = \begin{bmatrix} 1 & 2\\ 2 & 5 \end{bmatrix}, \qquad
AA^T = \begin{bmatrix} 5 & 2\\ 2 & 1 \end{bmatrix},
\]
and we observe that these two matrices are no longer identical, but
they are both symmetric. As they are 2 × 2 matrices, we can easily
calculate the eigenvalues as λ1 = 5.82 and λ2 = 0.17. (Remember:
the nonzero eigenvalues are the same for a matrix and its transpose.)
These eigenvalues result in singular values σ1 = 2.41 and σ2 = 0.41.
The eigenvectors of AᵀA are the orthonormal column vectors of
\[
V = \begin{bmatrix} 0.38 & -0.92\\ 0.92 & 0.38 \end{bmatrix}.
\]
Figure 16.3 will help us break down the action of A in terms of the
SVD. It is now clear that V and U are rotation or reflection matrices
and Σ scales, deforming the circle into an ellipse.
Notice that the eigenvalues of A are λ1 = λ2 = 1, making the point
that, in general, the singular values are not the eigenvalues!
Figure 16.3.
SVD breakdown: shear matrix A from Example 16.2. Clockwise from top left: Initial
point set forming a circle with two reference points; V T x rotates clockwise 67.5◦ ; ΣV T x
stretches in e1 and shrinks in e2 ; U ΣV T x rotates counterclockwise 22.5◦ , illustrating
the action of A.
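NumPy computes the SVD directly. A quick check of this example (our
sketch; the shear matrix A = [1 2; 0 1] is recovered from the AᵀA
given above):

import numpy as np

A = np.array([[1., 2], [0, 1]])   # assumed form of the shear
U, sigma, VT = np.linalg.svd(A)
print(sigma)                      # [2.414..., 0.414...]
print(U @ np.diag(sigma) @ VT)    # reproduces A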
Now we come full circle and look at what we have solved in terms
of our original question that was encapsulated by (16.1): What or-
thonormal vectors vi are mapped to orthogonal vectors σi ui ? The
SVD provides a solution to this question by providing V , U , and Σ.
Furthermore, note that for this nonsingular case, the columns of V
form a basis for the row space of A and the columns of U form a basis
for the column space of A.
It should be clear that the SVD is not limited to invertible 2 × 2
matrices, so let’s look at the SVD more generally.
Figure 16.4.
SVD matrix dimensions: an overview of the SVD of an m × n matrix A. Top: m > n;
middle: m = n; bottom: m < n.
Example 16.3
Let A be given by
\[
A = \begin{bmatrix} 1 & 0\\ 0 & 2\\ 0 & 1 \end{bmatrix}.
\]
The first step is to form AT A and AAT and find their eigenvalues and
(normalized) eigenvectors, which make up the columns of an orthog-
onal matrix.
\[
A^TA = \begin{bmatrix} 1 & 0\\ 0 & 5 \end{bmatrix}, \quad
\begin{matrix} \lambda_1 = 5,\\ \lambda_2 = 1, \end{matrix} \quad
V = \begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix};
\]
\[
AA^T = \begin{bmatrix} 1 & 0 & 0\\ 0 & 4 & 2\\ 0 & 2 & 1 \end{bmatrix}, \quad
\begin{matrix} \lambda_1 = 5,\\ \lambda_2 = 1,\\ \lambda_3 = 0, \end{matrix} \quad
U = \begin{bmatrix} 0 & 1 & 0\\ 0.89 & 0 & -0.44\\ 0.44 & 0 & 0.89 \end{bmatrix}.
\]
Figure 16.5.
SVD of a 3 × 2 matrix A: see Example 16.3. Clockwise from top left: Initial point set
forming a circle with one reference point; V T x reflects; ΣV T x stretches in e1 ; U ΣV T x
rotates counterclockwise 26.5◦ , illustrating the action of A.
The SVD of A = UΣVᵀ:
\[
\begin{bmatrix} 1 & 0\\ 0 & 2\\ 0 & 1 \end{bmatrix} =
\begin{bmatrix} 0 & 1 & 0\\ 0.89 & 0 & -0.44\\ 0.44 & 0 & 0.89 \end{bmatrix}
\begin{bmatrix} 2.23 & 0\\ 0 & 1\\ 0 & 0 \end{bmatrix}
\begin{bmatrix} 0 & 1\\ 1 & 0 \end{bmatrix}.
\]
Figure 16.5 illustrates the elements of the SVD and the action of A.
Because m > n, u3 is in the null space of AT , that is AT u3 = 0.
Example 16.4
Let A be given by
\[
A = \begin{bmatrix} -0.8 & 0 & 0.8\\ 1 & 1.5 & -0.3 \end{bmatrix}.
\]
The first step is to form AᵀA and AAᵀ and find their eigenvalues and
(normalized) eigenvectors, which make up the columns of an orthogonal
matrix:
\[
A^TA = \begin{bmatrix} 1.64 & 1.5 & -0.94\\ 1.5 & 2.25 & -0.45\\ -0.94 & -0.45 & 0.73 \end{bmatrix}, \quad
\begin{matrix} \lambda_1 = 3.77,\\ \lambda_2 = 0.84,\\ \lambda_3 = 0, \end{matrix} \quad
V = \begin{bmatrix} -0.63 & 0.38 & 0.67\\ -0.71 & -0.62 & -0.31\\ 0.30 & -0.68 & 0.67 \end{bmatrix};
\]
\[
AA^T = \begin{bmatrix} 1.28 & -1.04\\ -1.04 & 3.34 \end{bmatrix}, \quad
\begin{matrix} \lambda_1 = 3.77,\\ \lambda_2 = 0.84, \end{matrix} \quad
U = \begin{bmatrix} 0.39 & -0.92\\ -0.92 & -0.39 \end{bmatrix}.
\]
The matrix A is rank 2, thus there are two singular values, and
\[
\Sigma = \begin{bmatrix} 1.94 & 0 & 0\\ 0 & 0.92 & 0 \end{bmatrix}.
\]
The SVD of A = UΣVᵀ:
\[
\begin{bmatrix} -0.8 & 0 & 0.8\\ 1 & 1.5 & -0.3 \end{bmatrix} =
\begin{bmatrix} 0.39 & -0.92\\ -0.92 & -0.39 \end{bmatrix}
\begin{bmatrix} 1.94 & 0 & 0\\ 0 & 0.92 & 0 \end{bmatrix}
\begin{bmatrix} -0.63 & -0.71 & 0.30\\ 0.38 & -0.62 & -0.68\\ 0.67 & -0.31 & 0.67 \end{bmatrix}.
\]
Figure 16.6 illustrates the elements of the SVD and the action of A.
Because m < n, v3 is in the null space of A, that is, Av3 = 0.
Figure 16.6.
The SVD of a 2 × 3 matrix A: see Example 16.4. Clockwise from top left: Initial point
set forming a circle with one reference point; V T x; ΣV T x; U ΣV T x, illustrating the
action of A.
Example 16.5
Evaluating p at λ = 0, we have
det A = λ1 · . . . · λn , (16.6)
A† AA† = A† , (16.8)
AA† A = A. (16.9)
Example 16.6
We find
\[
\Sigma^{\dagger} = \begin{bmatrix} 1/2.23 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix},
\]
Example 16.7
\[
Ax = b,
\]
\[
A^TAx = A^Tb, \quad (16.10)
\]
\[
\|Ax - b\| = \|U\Sigma V^Tx - b\| = \|U\Sigma V^Tx - UU^Tb\| = \|U(\Sigma y - z)\|,
\]
where y = Vᵀx, i.e., x = Vy, and z = Uᵀb. The norm is minimized by
y = Σ†z, hence
\[
x = V(\Sigma^{\dagger} z)
\]
\[
x = V\Sigma^{\dagger}(U^Tb).
\]
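A sketch (ours) of these steps, mirroring the four numbered steps of
Example 16.8 below; singular values that are numerically zero are
skipped when forming Σ†:

import numpy as np

def lsq_svd(A, b, eps=1.0e-12):
    # Least squares solution x = V Sigma^+ U^T b.
    U, sigma, VT = np.linalg.svd(A)
    Sp = np.zeros((A.shape[1], A.shape[0]))  # Sigma^+ is n x m
    for i, s in enumerate(sigma):
        if s > eps:
            Sp[i, i] = 1.0 / s   # invert nonzero singular values
    z = U.T @ b      # step 2
    y = Sp @ z       # step 3
    return VT.T @ y  # step 4: x = V y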
Example 16.8
Let’s revisit the least squares problem that we solved using the normal
equations in Example 12.13 and the Householder method in Example 13.3.
2. Compute
\[
z = U^Tb = \begin{bmatrix} 54.5\\ 51.1\\ 3.2\\ -15.6\\ 9.6\\ 15.2\\ 10.8 \end{bmatrix}.
\]
3. Compute
\[
y = \Sigma^{\dagger} z = \begin{bmatrix} 0.57\\ 34.8 \end{bmatrix}.
\]
4. Compute
\[
x = Vy = \begin{bmatrix} -0.23\\ 34.8 \end{bmatrix},
\]
resulting in the same best fit line, x₂ = −0.23x₁ + 34.8, as we
found via the normal equations and the Householder method.
\[
b' = A(A^TA)^{-1}A^Tb = AA^{\dagger}b = \mathrm{proj}_V b
\]
\[
A_1 = \sigma_1u_1v_1^T
\]
\[
A_2 = \sigma_1u_1v_1^T + \sigma_2u_2v_2^T
\]
Figure 16.7.
Image compression: a method that uses SVD. The input matrix has singular values
σ i = 7.1, 3.8, 1.3, 0.3. Far left: original image; from left to right: recovering the image
by adding projection terms.
Figure 16.8.
Scatter plot: data pairs recorded in Cartesian coordinates.
\[
l(d) = \|Xd\|^2 = (Xd)\cdot(Xd) = d^TX^TXd. \quad (16.15)
\]
Figure 16.9.
PCA: Analysis of a data set. Left: given data with centroid translated to the origin. Thick
lines are coincident with the eigenvectors scaled by their corresponding eigenvalue.
Right: points, eigenvector lines, and quadratic form over the unit circle.
eigenvalue. Recall that these eigenvectors form the major and minor
axis of the action ellipse of C, and the thick lines in Figure 16.9 (left)
are precisely these axes. In the right part of Figure 16.9, the quadratic
form for the data set is shown along with the data and action ellipse
axes. We see that the dominant eigenvector corresponds to highest
variance in the data and this is reflected in the quadratic form as
well. If λ1 = λ2 , then there is no preferred direction in the point
set and the quadratic form is spherical. We are guaranteed that the
eigenvectors will be orthogonal because C is a symmetric matrix.
Looking more closely at C, we see its very simple form,
Figure 16.10.
PCA data transformations: three possible data transformations based on PCA analy-
sis. Top: data points transformed to the principal components coordinate system. This
set appears in all images. Middle: data compression by keeping dominant eigenvector
component. Bottom: data compression by keeping the nondominant eigenvector.
This results in
\[
\hat{x}_i = \begin{bmatrix} x_i \cdot v_1\\ x_i \cdot v_2 \end{bmatrix}.
\]
Figure 16.10 (top) illustrates the result of this transformation. Revisit
Example 7.6: this is precisely the transformation we used to align
the contour ellipse to the coordinate axes. (The transformation is
written a bit differently here to accommodate the point set organized
as transposed vectors.)
In summary: the data coordinates are now in terms of the trend
lines, defined by the eigenvectors of the covariance matrix, and the
coordinates directly measure the distance from each trend line. The
greatest variance corresponds to the first coordinate in this principal
components coordinate system. This leads us to the name of this
method: principal components analysis (PCA).
So far, PCA has worked with all components of the given data.
However, it can also be used for data compression by reducing di-
mensionality. Instead of constructing V to hold all eigenvectors, we
may use only the most significant, so suppose V = v1 . This transfor-
mation produces the middle image shown in Figure 16.10. If instead,
V = v2 , then the result is the bottom image, and for clarity, we chose
to display these points on the e2 -axis, but this is arbitrary. Compar-
ing these results: there is greater spread of the data in the middle
image, which corresponds to a trend line with higher variance.
Here we focused on 2D data, but the real power of PCA comes with
higher dimensional data for which it is very difficult to visualize and
understand relationships between dimensions. PCA makes it possible
to identify insignificant dimensions and eliminate them.
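A minimal PCA sketch in Python (ours; the sample data X and the
normalization of the covariance matrix are illustrative assumptions):

import numpy as np

def pca(X):
    # Rows of X are data points.
    X0 = X - X.mean(axis=0)         # move the centroid to the origin
    C = X0.T @ X0 / len(X0)         # covariance matrix
    lam, V = np.linalg.eigh(C)      # eigenvalues in ascending order
    lam, V = lam[::-1], V[:, ::-1]  # dominant eigenvector first
    return X0 @ V, lam, V           # coordinates in the principal frame

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2)) * [4.0, 1.0]  # more spread along e1
X_hat, lam, V = pca(X)
print(lam)  # variances along the trend lines, largest first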
16.9 Exercises
1. Find the SVD for
\[
A = \begin{bmatrix} 1 & 0\\ 0 & 4 \end{bmatrix}.
\]
2. What is the eigendecomposition of matrix A in Exercise 1?
3. For what type of matrix are the eigenvalues the same as the singular
values?
4. The action of a 2 × 2 linear map can be described by the mapping of
the unit circle to an ellipse. Figure 4.3 illustrates such an ellipse. What
are the lengths of the semi-axes? What are the singular values of the
corresponding matrix,
\[
A = \begin{bmatrix} 1/2 & 0\\ 0 & 2 \end{bmatrix}?
\]
5. Find the SVD for the matrix
\[
A = \begin{bmatrix} 0 & -2\\ 1 & 0\\ 0 & 0 \end{bmatrix}.
\]
6. Let
\[
A = \begin{bmatrix} -1 & 0 & 1\\ 0 & 1 & 0\\ 1 & 1 & -2 \end{bmatrix},
\]
and C = AᵀA. Is one of the eigenvalues of C negative?
7. For the matrix
\[
A = \begin{bmatrix} -2 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{bmatrix},
\]
show that both (16.5) and (16.6) yield the same result for the absolute
value of the determinant of A.
8. For the matrix
\[
A = \begin{bmatrix} 2 & 0\\ 2 & 0.5 \end{bmatrix},
\]
show that both (16.5) and (16.6) yield the same result for the determi-
nant of A.
9. For the matrix
\[
A = \begin{bmatrix} 1 & 0 & 1\\ 0 & 1 & 0\\ 0 & 1 & 0 \end{bmatrix},
\]
show that both (16.5) and (16.6) yield the same result for the absolute
value of the determinant of A.
10. What is the pseudoinverse of the matrix from Exercise 5?
Figure 17.1.
2D finite element method: refinement of a triangulation based on stress and strain
calculations. (Source: J. Shewchuk, http://www.cs.cmu.edu/∼ quake/triangle.html.)
u + v + w = 1. (17.2)
\[
p_1 \cong (1, 0, 0),
\]
\[
p_2 \cong (0, 1, 0),
\]
\[
p_3 \cong (0, 0, 1).
\]
clockwise, then the points inside have all negative barycentric coordinates, and
the outside ones still have mixed signs.
Example 17.1

Let’s work with a simple example that is easy to sketch. (See
Sketch 17.4, barycentric coordinate lines.) Suppose the three triangle
vertices are given by
\[
p_1 = \begin{bmatrix} 0\\ 0 \end{bmatrix}, \quad
p_2 = \begin{bmatrix} 1\\ 0 \end{bmatrix}, \quad
p_3 = \begin{bmatrix} 0\\ 1 \end{bmatrix}.
\]
The points q, r, s with barycentric coordinates
\[
q \cong \Bigl(0, \tfrac{1}{2}, \tfrac{1}{2}\Bigr), \quad
r \cong (-1, 1, 1), \quad
s \cong \Bigl(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\Bigr)
\]
have the following coordinates in the plane:
\[
q = 0 \times p_1 + \tfrac{1}{2} \times p_2 + \tfrac{1}{2} \times p_3 =
\begin{bmatrix} 1/2\\ 1/2 \end{bmatrix},
\]
\[
r = -1 \times p_1 + 1 \times p_2 + 1 \times p_3 =
\begin{bmatrix} 1\\ 1 \end{bmatrix},
\]
\[
s = \tfrac{1}{3} \times p_1 + \tfrac{1}{3} \times p_2 + \tfrac{1}{3} \times p_3 =
\begin{bmatrix} 1/3\\ 1/3 \end{bmatrix}.
\]
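A sketch (ours) that computes barycentric coordinates from signed-area
ratios, checked against the point s above:

def barycentric(p, p1, p2, p3):
    # Barycentric coordinates (u, v, w) of p with respect to p1, p2, p3.
    def area2(a, b, c):  # twice the signed area of triangle a, b, c
        return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    total = area2(p1, p2, p3)
    u = area2(p, p2, p3) / total
    v = area2(p1, p, p3) / total
    w = area2(p1, p2, p) / total
    return u, v, w

print(barycentric((1/3, 1/3), (0, 0), (1, 0), (0, 1)))  # (1/3, 1/3, 1/3)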
Let’s revisit the simple triangle in Example 17.1 and look at the
affine invariance of barycentric coordinates. Suppose we apply a 90◦
rotation,
\[
R = \begin{bmatrix} 0 & -1\\ 1 & 0 \end{bmatrix}
\]
to the triangle vertices, resulting in p̂_i = Rp_i. Apply this rotation to
s from Example 17.1:
\[
\hat{s} = Rs = \begin{bmatrix} -1/3\\ 1/3 \end{bmatrix}.
\]
\[
i_1 = \frac{\mathrm{area}(i, p_2, p_3)}{\mathrm{area}(p_1, p_2, p_3)}.
\]
(See Sketch 17.7, the incenter.) This may be rewritten as
\[
i_1 = \frac{rs_1}{rs_1 + rs_2 + rs_3},
\]
using the “1/2 base times height” rule for triangle areas.
Simplifying, we obtain
\[
i_1 = \frac{s_1}{c}, \quad i_2 = \frac{s_2}{c}, \quad i_3 = \frac{s_3}{c},
\]
where c = s₁ + s₂ + s₃ is the circumference of T. A triangle is not
affinely related to its incenter—affine maps change the barycentric
coordinates of i.
where c = s1 + s2 + s3 is the circumference of T . A triangle is not
affinely related to its incenter—affine maps change the barycentric
coordinates of i.
The circumcenter cc of a triangle is the center of the circle through
its vertices. It is obtained as the intersection of the edge bisectors.(See
Sketch 17.8.) Notice that the circumcenter might not be inside the
Sketch 17.8. triangle. This circle is called the circumcircle and we will refer to its
The circumcenter. radius as R.
Furthermore,
\[
R = \frac{1}{2}\sqrt{\frac{(d_1 + d_2)(d_2 + d_3)(d_3 + d_1)}{D/2}}.
\]
Example 17.3

Yet again, let’s visit the simple triangle in Example 17.1. Be sure to
make a sketch to check the results of this example. Let’s compute
the incenter. The lengths of the edges of the triangle are s₁ = √2,
s₂ = 1, and s₃ = 1. The circumference of the triangle is c = 2 + √2.
The barycentric coordinates of the incenter are then
\[
i \cong \Bigl(\frac{\sqrt{2}}{2 + \sqrt{2}},\; \frac{1}{2 + \sqrt{2}},\; \frac{1}{2 + \sqrt{2}}\Bigr).
\]
17.4 2D Triangulations
The study of one triangle is the realm of classical geometry; in mod-
ern applications, one often encounters millions of triangles. Typically,
they are connected in some well-defined way; the most basic one
being the 2D triangulation. Triangulations have been used in sur-
veying for centuries; more modern applications rely on satellite data,
which are collected in triangulations called TINs (triangular irregular
networks).
Here is the formal definition of a 2D triangulation. (See Sketch 17.9
for examples of illegal triangulations.) A triangulation of a set of 2D
points {p_i}, i = 1, . . . , N, is a connected set of triangles meeting
the following criteria:
1. The vertices of the triangles consist of the given points.
2. The interiors of any two triangles do not intersect.
3. If two triangles are not disjoint, then they share a vertex or have
coinciding edges.
4. The union of all triangles equals the convex hull of the pi .
These rules sound abstract, but some examples will shed light on
them. Figure 17.2 shows a triangulation that satisfies the 2D triangu-
lation definition. Evident from this example: the number of triangles
surrounding a vertex, or valence, varies from vertex to vertex. These
triangles make up the star of a vertex. In contrast, Sketch 17.9 shows
three illegal triangulations, violating the above rules. The top exam-
ple involves overlapping triangles. In the middle example, the bound-
ary of the triangulation is not the convex hull of the point set. (A lot
more on convex hulls may be found in [4]; also see Section 18.3.) The
bottom example violates condition 3.
If we are given a point set, is there a unique triangulation? Certainly
not, as Sketch 17.10 (nonuniqueness of triangulations) shows. Among
the many possible triangulations, there is one that is most commonly
agreed to be the best.
Figure 17.2.
Triangulation: a valid triangulation of the convex hull.
5 (number of points)
0.0 0.0 (point #1)
1.0 0.0
0.0 1.0
0.25 0.3
0.5 0.3
5 (number of triangles)
1 2 5 (first triangle - connects points #1,2,5)
2 3 5
4 5 3
1 5 4
1 4 3
We can improve this structure. We will encounter applications
that require knowledge of the connectivity of the triangulation, as
described in Section 17.6. To facilitate this, it is not uncommon to
also see the neighbor information of the triangulation stored. This
means that for each triangle, the indices of the triangles surrounding it
are stored. For example, in Sketch 17.11, triangle 1 defined by points
1, 2, 5 is surrounded by triangles 2, 4, −1. The neighboring triangles
are listed corresponding to the point across from the shared edge.
Triangle −1 indicates that there is not a neighboring triangle across
this edge. Immediately, we see that this gives us a fast method for
determining the boundary of the triangulation. Listing the neighbor
information after each triangle, the final data structure is as follows.
5 (number of points)
0.0 0.0 (point #1)
1.0 0.0
0.0 1.0
0.25 0.3
0.5 0.3
5 (number of triangles)
1 2 5 2 4 -1 (first triangle and neighbors)
2 3 5 3 1 -1
4 5 3 2 5 4
1 5 4 3 5 1
1 4 3 3 -1 4
This is but one of many possible data structures for a triangula-
tion. Based on the needs of particular applications, researchers have
developed a variety of structures to optimize searches. One such
structure that has proved to be popular is called the winged-edge
data structure [14].
Step 1: Perform the triangle inclusion test (see Section 17.1) for p
and the current triangle. If all barycentric coordinates are positive,
output the current triangle. If the barycentric coordinates are mixed
in sign, then determine the barycentric coordinate of p with respect
to the current triangle that has the most negative value. Set the
current triangle to be the corresponding neighbor and repeat Step 1.
Notes:
• Try improving the speed of this algorithm by not completing the di-
vision for determining the barycentric coordinates in (17.3)–(17.5).
This division does not change the sign. Keep in mind the test for
which triangle to move to changes.
17.7 3D Triangulations
In computer applications, one often encounters millions of triangles,
connected in some well-defined way, describing a geometric object. In
particular, shading algorithms require this type of structure.
The rules for 3D triangulations are the same as for 2D. Additionally,
the data structure is the same, except that now each point has three
instead of two coordinates.
Figure 17.3 shows a 3D surface that is composed of triangles. Another example is provided in Figure 8.4. Shading requires a 3D unit vector, called a normal, to be associated with each triangle or vertex. A normal is perpendicular to an object's surface at a particular point. This normal is used to calculate how light is reflected, and in turn the illumination of the object. (See [14] for details on such illumination calculations.)
Figure 17.3.
3D triangulated surface: a wireframe and shaded renderings superimposed.
17.8 Exercises
Let a triangle T1 be given by the vertices
p1 = [1, 1]^T, p2 = [2, 2]^T, p3 = [−1, 2]^T.
1. Using T1:
(a) What are the barycentric coordinates of p = [0, 1.5]^T?
(b) What are the barycentric coordinates of p = [0, 0]^T?
(c) Find the triangle's incenter.
(d) Find the triangle's circumcenter.
(e) Find the centroid of the triangle.
2. Using T2:
(a) What are the barycentric coordinates of p = [0, 1.5]^T?
(b) What are the barycentric coordinates of p = [0, 0]^T?
(c) Find the triangle's incenter.
What are the areas of the mapped triangles T1′ and T2′? Compare the ratios T1/T2 and T1′/T2′.
Figure 18.1.
Polygon: straight line segments forming a bird shape.
Figure 18.2.
Mixing maps: a pattern is created by composing rotations and translations.
18.1 Polylines
Figure 18.3.
Polylines: the display of a 3D surface. Two different directions for the polyline sets
give different impressions of the surface shape.
18.2 Polygons
When the first and last vertices of a polyline are connected, it is called
a polygon. Normally, a polygon is thought to enclose an area. For
this reason, unless a remark is made, we will consider planar polygons
only. Just as with polylines, polygons constitute an ordered set of
vertices and we will continue to use the term edge vectors. Thus, a
polygon with n edges is given by an ordered set of 2D points
p 1 , p2 , . . . , p n
and has edge vectors vi = pi+1 − pi ; i = 1, . . . , n. Note that the edge
vectors sum to the zero vector.
If you look at the edge vectors carefully, you’ll discover that vn =
pn+1 − pn , but there is no vertex pn+1 ! This apparent problem is
resolved by defining pn+1 = p1 , a convention called cyclic numbering.
We’ll use this convention throughout, and will not mention it every
time. We also add one more topological characterization of polygons:
the number of vertices equals the number of edges.
Since a polygon is closed, it divides the plane into two parts: a
finite part, the polygon’s interior, and an infinite part, the polygon’s
exterior.
As you traverse a polygon, you follow the path determined by the
vertices and edge vectors. Between vertices, you’ll move along straight
lines (the edges), but at the vertices, you’ll have to perform a rotation
before resuming another straight line path. The angle αi by which
you rotate at vertex pi is called the turning angle or exterior angle at pi. The interior angle is then given by π − αi (see Sketch 18.2).
Sketch 18.2.
Interior and exterior angles.
Figure 18.4.
Circle approximation: using an n-gon to represent a circle.
Figure 18.5.
Trimmed surface: an application of polygons with holes. Left: trimmed surface. Right:
rectangular parametric domain with polygonal holes.
Sketch 18.12.
Turning angles.
If the sign of the u3,i value is the same for all angles, then the polygon is convex. A mathematical way to describe this is by using the scalar triple product (see Section 8.5). The turning angle orientation is determined by the scalar
u3,i = e3 · (vi−1 ∧ vi).   (18.3)
Notice that the sign is dependent upon the traversal direction of the
polygon, but only a change of sign is important. The determinant of
the 2D vectors would have worked just as well, but the 3D approach
is more useful for what follows.
If the polygon lies in an arbitrary plane, having a normal n, then the
above convex/concave test is changed only a bit. The cross product
in (18.3) produces a vector ui that has direction ±n. Now we need
the dot product, n · ui to extract a signed scalar value.
If we actually computed the turning angle at each vertex, we could form an accumulated value called the total turning angle. Recall from (18.2) that the total turning angle for a convex polygon is 2π. For a polygon that is not known to be convex, assign a sign to each angle measurement using the scalar triple product as above. The sum E of the signed angles will then be used to compute the winding number of the polygon. The winding number W is
W = E/(2π).
Thus, for a convex polygon, the winding number is one. Sketch 18.13
illustrates a few examples. A non-self-intersecting polygon is essen-
tially one loop. A polygon can have more than one loop, with dif-
ferent orientations: clockwise versus counterclockwise. The winding
number gets decremented for each clockwise loop and incremented
for each counterclockwise loop, or vice versa depending on how you
assign signs to your angles.
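A small Python sketch of this computation (the function name winding_number is ours; the signed turning angle at each vertex is obtained with atan2, whose sign agrees with that of u3,i):

import math

def winding_number(polygon):
    # polygon: list of (x, y) vertices; cyclic numbering is implied.
    n = len(polygon)
    total = 0.0
    for i in range(n):
        x0, y0 = polygon[i - 1]        # p_{i-1}
        x1, y1 = polygon[i]            # p_i
        x2, y2 = polygon[(i + 1) % n]  # p_{i+1}
        vx, vy = x1 - x0, y1 - y0      # incoming edge vector
        wx, wy = x2 - x1, y2 - y1      # outgoing edge vector
        cross = vx * wy - vy * wx      # sign of the turning angle
        dot = vx * wx + vy * wy
        total += math.atan2(cross, dot)  # signed turning angle alpha_i
    return total / (2.0 * math.pi)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(winding_number(square))  # 1.0: one counterclockwise loop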
18.7 Area
A simple method for calculating the area of a 2D polygon is to use
the signed area of a triangle as in Section 4.9. First, triangulate the
polygon. For example, choose one vertex of the polygon and form all
triangles from it and successive pairs of vertices, as is illustrated in
Sketch 18.14. The sum of the signed areas of the triangles results in
the area of the polygon. For this method to work, we must form the
triangles with a consistent orientation. For example, in Sketch 18.14,
triangles (p1 , p2 , p3 ), (p1 , p3 , p4 ), and (p1 , p4 , p5 ) are all counter-
clockwise or right-handed, and therefore have positive area. More
precisely, if we form vi = pi − p1, then the area of the polygon in Sketch 18.14 is
A = 1/2 (det[v2, v3] + det[v3, v4] + det[v4, v5]).
Sketch 18.13.
Winding numbers.
In general, if a polygon has n vertices, then this area calculation becomes
A = 1/2 (det[v2, v3] + . . . + det[vn−1, vn]).   (18.4)
The use of signed area makes this idea work for nonconvex polygons
as in Sketch 18.15. As illustrated in the sketch, the negative areas
cancel duplicate and extraneous areas.
Equation (18.4) takes an interesting form if its terms are expanded.
We observe that the determinants that represent edges of triangles
within the polygon cancel. So this leaves us with
Sketch 18.14.
Area of a convex polygon.
A = 1/2 (det[p1, p2] + . . . + det[pn−1, pn] + det[pn, p1]).   (18.5)
Equation (18.5) seems to have lost all geometric meaning because it
involves the determinant of point pairs, but we can recapture geomet-
ric meaning if we consider each point to be pi − o.
Is (18.4) or (18.5) the preferred form? The amount of computation
for each equation is similar; however, there is one drawback of (18.5).
If the polygon is far from the origin then numerical problems can
occur because the vectors pi and pi+1 will be close to parallel. The
form in (18.4) essentially builds a local frame in which to compute the
area. For debugging and making sense of intermediate computations,
(18.4) is easier to work with. This is a nice example of how reducing
an equation to its “simplest” form is not always “optimal”!
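Both forms are a few lines of code; a sketch (function names ours):

def area_local(polygon):
    # Equation (18.4): build the local frame v_i = p_i - p_1 first.
    p1 = polygon[0]
    v = [(x - p1[0], y - p1[1]) for x, y in polygon]
    return 0.5 * sum(v[i][0] * v[i + 1][1] - v[i][1] * v[i + 1][0]
                     for i in range(1, len(polygon) - 1))

def area_global(polygon):
    # Equation (18.5): determinants of cyclic point pairs.
    n = len(polygon)
    return 0.5 * sum(polygon[i][0] * polygon[(i + 1) % n][1]
                     - polygon[i][1] * polygon[(i + 1) % n][0]
                     for i in range(n))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(area_local(square), area_global(square))  # 1.0 1.0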
An interesting observation is that (18.5) may be written as a generalized determinant. The coordinates of the vertices are
pi = [p1,i, p2,i]^T.
Sketch 18.15.
Area of a nonconvex polygon.
Example 18.1
Let
p1 = [0, 0]^T, p2 = [1, 0]^T, p3 = [1, 1]^T, p4 = [0, 1]^T.
We have
A = 1/2 |0 1 1 0 0; 0 0 1 1 0| = 1/2 [0 + 1 + 1 + 0 − 0 − 0 − 0 − 0] = 1.
A = 1/2 n · (u2 + . . . + un−1),   (18.7)
with the ui defined in (18.6). Notice that (18.7) is a sum of scalar triple products, which were introduced in Section 8.5.
Example 18.2
Compute the area with (18.7), and note that the normal
n = [−1/√2, −1/√2, 0]^T.
First compute
v2 = [2, −2, 0]^T, v3 = [2, −2, 3]^T, v4 = [0, 0, 3]^T.

n = (u2 + u3 + . . . + un−2) / ‖u2 + u3 + . . . + un−2‖.
Example 18.3
Calculate
u2 = [1, −1, 1]^T, u3 = [−1, 1, 1]^T,
and the normal is
n = [0, 0, 1]^T.
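A sketch of the 3D area computation in Python. One assumption is flagged here because (18.6) is not reproduced above: we take ui = vi ∧ vi+1 with vi = pi − p1, and estimate the unit normal n by normalizing the sum of the ui (the text's estimate sums u2 through un−2; for a planar polygon any such sum points along n):

import numpy as np

def polygon_area_3d(points):
    p = np.asarray(points, dtype=float)
    v = p[1:] - p[0]             # v_2, ..., v_n
    u = np.cross(v[:-1], v[1:])  # u_i = v_i x v_{i+1}  (assumed (18.6))
    s = u.sum(axis=0)
    n = s / np.linalg.norm(s)    # estimated unit normal
    return 0.5 * float(n @ s)    # equation (18.7)

# A unit square in the plane x3 = 0 has area 1:
print(polygon_area_3d([(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]))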
18.8 Application: Planarity Test
Criteria for choosing a planarity test include
• numerical stability,
• speed.
Two such tests follow:
• Volume test: Choose the first polygon vertex as a base point. Form vectors to the next three vertices. Use the scalar triple product to calculate the volume spanned by these three vectors. If it is less than a given tolerance, then the four points are coplanar. Continue for all other sets of vertices; a sketch in code follows this list.
• Plane test: Construct the plane through the first three vertices.
Check if all of the other vertices lie in this plane, within a given
tolerance.
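A sketch of the volume test in Python (one reading of "all other sets": keep the first vertex as the base point and slide a window of three further vertices):

import numpy as np

def is_planar(points, tol=1e-9):
    p = np.asarray(points, dtype=float)
    base = p[0]
    for i in range(1, len(p) - 2):
        v1, v2, v3 = p[i] - base, p[i + 1] - base, p[i + 2] - base
        # Scalar triple product = signed volume spanned by v1, v2, v3.
        if abs(np.dot(v1, np.cross(v2, v3))) > tol:
            return False
    return True

print(is_planar([(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]))  # True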
A point p can be tested against a polygon by shooting a ray from it in some direction r,
l(t) = p + tr,
and counting how many times the ray crosses the polygon's edges: an odd count means p is inside.
If you happen to know that you are dealing only with convex poly-
gons, another inside/outside test is available. Check which side of the
edges the point p is on. If it is on the same side for all edges, then p is
inside the polygon. All you have to do is to compute all determinants
of the form
det[pi − p, pi+1 − p].
If they are all of the same sign, p is inside the polygon.
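This same-side test takes only a few lines (a sketch; the function name is ours):

def inside_convex(p, polygon):
    n = len(polygon)
    signs = set()
    for i in range(n):
        ax, ay = polygon[i][0] - p[0], polygon[i][1] - p[1]
        bx, by = polygon[(i + 1) % n][0] - p[0], polygon[(i + 1) % n][1] - p[1]
        det = ax * by - ay * bx  # det[p_i - p, p_{i+1} - p]
        if det != 0.0:
            signs.add(det > 0.0)
    return len(signs) <= 1

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(inside_convex((0.5, 0.5), square), inside_convex((2.0, 0.5), square))
# True False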
18.10 Exercises
1. What is the sum of the interior angles of a six-sided polygon? What is
the sum of the exterior angles?
2. What type of polygon is equiangular and equilateral?
3. Which polygon is equilateral but not equiangular?
4. Develop an algorithm that determines whether or not a polygon is sim-
ple.
5. Calculate the winding number of the polygon with the following vertices:
p1 = [0, 0]^T, p2 = [−2, 0]^T, p3 = [−2, 2]^T,
p4 = [0, 2]^T, p5 = [2, 2]^T, p6 = [2, −2]^T,
p7 = [3, −2]^T, p8 = [3, −1]^T, p9 = [0, −1]^T.
9. The following points are the vertices of a polygon that should lie in a plane:
p1 = [2, 0, −2]^T, p2 = [3, 2, −4]^T, p3 = [2, 4, −4]^T,
p4 = [0, 3, −1]^T, p5 = [0, 2, −1]^T, p6 = [0, 0, 0]^T.
However, one point lies outside this plane. Which point is the outlier?² Which planarity test is best suited to this problem?
² This term is frequently used to refer to noisy, inaccurate data from a laser scanner.
19
Conics
Figure 19.1.
Conic sections: three types of curves formed by the intersection of a plane and a cone.
From left to right: ellipse, parabola, and hyperbola.
Take a flashlight and shine it straight onto a wall. You will see a
circle. Tilt the light, and the circle will turn into an ellipse. Tilt
further, and the ellipse will become more and more elongated, and
will become a parabola eventually. Tilt a little more, and you will have
a hyperbola—actually one branch of it. The beam of your flashlight
is a cone, and the image it generates on the wall is the intersection
of that cone with a plane (i.e., the wall). Thus, we have the name
conic section for curves that are the intersections of cones and planes.
Figure 19.1 illustrates this idea.
The three types, ellipses, parabolas, and hyperbolas, arise in many
situations and are the subject of this chapter. The basic tools for
handling them are nothing but the matrix theory developed earlier.
Before we delve into the theory of conic sections, we list some “real-
life” occurrences.
• If you water your lawn, the water leaving the hose traces a parabolic
arc.
The positive factors λ1 and λ2 denote how much the ellipse deviates from a circle. For example, if λ1 > λ2, the ellipse is more elongated in the x2-direction. See Sketch 19.2 for the example
(1/4)x1² + (1/25)x2² = 1.
Sketch 19.2.
An ellipse with λ1 = 1/4, λ2 = 1/25, and c = 1.
An ellipse in the form of (19.2) is said to be in standard position
because its minor and major axes are coincident with the coordinate
axes and the center is at the origin. The ellipse is symmetric about
the major and minor axes. The semi-major and semi-minor axes are
one-half the respective major and minor axes. In standard position,
the ellipse lives in the rectangle with
x1 extents [−√(c/λ1), √(c/λ1)] and x2 extents [−√(c/λ2), √(c/λ2)].
You will see the wisdom of this in a short while. This equation allows
for significant compaction:
x^T D x − c = 0.   (19.4)
Example 19.1
Let's start with the ellipse 2x1² + 4x2² − 1 = 0. In matrix form, corresponding to (19.3), we have
[x1 x2] [2 0; 0 4] [x1; x2] − 1 = 0.   (19.5)
which becomes
x̂^T R D R^T x̂ − c = 0.   (19.6)
Figure 19.2.
Conic section: an ellipse in three positions. From left to right: standard position as
given in (19.5), with a 45◦ rotation, and with a 45◦ rotation and translation of [2, –1]T .
An abbreviation of
A = R D R^T   (19.7)
shortens (19.6) to
x̂^T A x̂ − c = 0.   (19.8)
There are a couple of things about (19.8) that should look familiar.
Note that A is a symmetric matrix. While studying the geometry of
symmetric matrices in Section 7.5, we discovered that (19.7) is the
eigendecomposition of A. The diagonal matrix D was called Λ there.
One slight difference: convention is that the diagonal elements of Λ
satisfy λ1,1 ≥ λ2,2 . The matrix D does not; that would result in
all ellipses in standard position having major axis on the e2 -axis, as
is the case with the example in Sketch 19.2. The curve defined in
(19.8) is a contour of a quadratic form as described in Section 7.6. In
fact, the figures of conics in this chapter were created as contours of
quadratic forms. See Figure 7.8.
Now suppose we encounter an ellipse as in Figure 19.2 (right), ro-
tated and translated out of standard position. What is the equation
of this ellipse? Points x̂ on this ellipse are mapped from an ellipse in
standard position via a rotation and then a translation, x̂ = Rx + v.
Again using the fact that a rotation matrix is orthogonal, we replace
x by R^T(x̂ − v) in (19.4), and the rotated conic takes the form
[x̂^T − v^T] R D R^T [x̂ − v] − c = 0,
or
[x̂^T − v^T] A [x̂ − v] − c = 0.
Expanding this product, and renaming x̂ to x, we arrive at
x^T A x − 2x^T A v + v^T A v − c = 0.   (19.9)
This is commonly abbreviated as
x^T A x − 2x^T b + d = 0,   (19.10)
with b = Av and d = v^T A v − c.
In fact, many texts simply start out by using (19.12) as the initial
definition of a conic.
Example 19.2
Let's continue with the ellipse from Example 19.1. We have x^T D x − 1 = 0, where
D = [2 0; 0 4].
This ellipse is illustrated in Figure 19.2 (left). Now we rotate by 45°, using the rotation matrix
R = [s −s; s s]
with s = sin 45° = cos 45° = 1/√2. The matrix A = R D R^T becomes
A = [3 −1; −1 3].
Translating by v = [2, −1]^T, (19.10) gives the recipe for adding translation terms, and the ellipse is now
x^T [3 −1; −1 3] x − 2x^T [7, −5]^T + 18 = 0.
This was a lot of work just to find the general form of an ellipse!
However, as we shall see, a lot more has been achieved here; the
form (19.10) does not just represent ellipses, but any conic. To see
that, let’s examine the two remaining conic types: hyperbolas and
parabolas.
Example 19.3
Example 19.4
The parabola
x1² − x2 = 0,
or
x2 = x1²,
is illustrated in Sketch 19.4. In matrix form, it is
[x1 x2] [−1 0; 0 0] [x1; x2] + [x1 x2] [0; 1] = 0.
Sketch 19.4.
A parabola.
19.2 Analyzing Conics
If A is the zero matrix and either c4 or c5 is nonzero, then the conic is degenerate and simply consists of a straight line.
Since A is a symmetric matrix, it has an eigendecomposition, A = R D R^T. The eigenvalues of A, which are the diagonal elements of D, also characterize the conic type.
Example 19.5
Let's check the type for the examples of the last section.
Example 19.2: we encountered the ellipse in this example in two forms, in standard position and rotated,
det[2 0; 0 4] = det[3 −1; −1 3] = 8.
Example 19.4:
det[−1 0; 0 0] = 0,
confirming that we have a parabola. The characteristic equation for this matrix is λ(λ + 1) = 0, thus one eigenvalue is zero.
We have derived the general conic and folded this into a tool to
determine its type. What might not be obvious: we found that affine
maps, M x + v, where M is invertible, take a particular type of (non-
degenerate) conic to another one of the same type. The conic type is
determined by the sign of the determinant of A and it is unchanged
by affine maps.
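In code, the classification is a single determinant; a sketch, with sign conventions as in Example 19.5:

def conic_type(a11, a12, a22):
    # A = [a11 a12; a12 a22] is the symmetric matrix of the conic.
    det = a11 * a22 - a12 * a12
    if det > 0:
        return "ellipse"
    if det < 0:
        return "hyperbola"
    return "parabola"

print(conic_type(3, -1, 3))   # ellipse   (det = 8, Example 19.2)
print(conic_type(1, 4, 2))    # hyperbola (det = -14, Example 19.6)
print(conic_type(-1, 0, 0))   # parabola  (det = 0, Example 19.4)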
This linear system may be solved if A has full rank. This is equivalent to A having two nonzero eigenvalues, and so the given conic is either an ellipse or a hyperbola.
Calculating c = v^T A v − d from (19.10) and removing the translation terms, the ellipse with center at the origin is
x^T [3 −1; −1 3] x − 1 = 0.
Example 19.6
Figure 19.3.
Conic section: a hyperbola in three positions. From left to right: standard position,
with rotation, with rotation and translation.
resulting in
v = [2, 0]^T.
Calculate c = v^T A v − 3 = 1, then the conic without the translation is
x^T [1 4; 4 2] x − 1 = 0.
This hyperbola is illustrated in Figure 19.3 (middle).
The characteristic equation of the matrix is λ² − 3λ − 14 = 0; its roots are λ1 = 5.53 and λ2 = −2.53. The hyperbola in standard position is
x^T [5.53 0; 0 −2.53] x − 1 = 0.
This hyperbola is illustrated in Figure 19.3 (left).
19.4 Exercises
1. What is the matrix form of a circle with radius r in standard position?
2. What is the equation of an ellipse that is centered at the origin and has
e1 -axis extents of [−5, 5] and e2 -axis extents of [−2, 2]?
3. What are the e1- and e2-axis extents of the ellipse
16x1² + 4x2² − 4 = 0?
10x1² + 2x2² − 4 = 0
10x1² − 2x2² − 4 = 0
10x1² − 2x2² − 4 = 0
11. Let x1² − 2x1x2 − 4 = 0 be the equation of a conic section. What type is it?
12. Let x1² + 2x1x2 − 4 = 0 be the equation of a conic section. What type is it?
13. Let 2x1² + x2 − 5 = 0 be the equation of a conic section. What type is it?
14. Let a conic be given by
Write it in matrix form. What type of conic is it? What is the rotation
and translation that took it out of standard position? Write it in matrix
form in standard position.
15. What affine map takes the circle
to the ellipse
2x1² + 4x2² − 1 = 0?
16. How many intersections does a straight line have with a conic? Given
a conic in the form (19.2) and a parametric form of a line l(t), what are
the t-values of the intersection points? Explain any singularities.
17. If the shear
[1 1/2; 0 1]
is applied to the conic of Example 19.1, what is the type of the resulting conic?
20
Curves
Figure 20.1.
Car design: curves are used to design cars such as the Ford Synergy 2010 concept
car. (Source http://www.ford.com.)
Earlier in this book, we mentioned that all letters that you see here
were designed by a font designer, and then put into a font library.
The font designer’s main tool is a cubic curve, also called a cubic
Bézier curve. Such curves are handy for font design, but they were
initially invented for car design. This happened in France in the early
1960s at Renault and Citroën in Paris. These techniques are still in
use today, as illustrated in Figure 20.1. We will briefly outline this
kind of curve, and also apply previous linear algebra and geometric
concepts to the study of curves in general. This type of work is
called geometric modeling or computer-aided geometric design; see an
introductory text such as [7]. Please keep in mind: this chapter just
scratches the surface of the modeling field!
A 2D parametric curve is given by
x(t) = [f(t), g(t)]^T,
where f(t) and g(t) are functions of the parameter t. For the linear interpolant above, f(t) = (1 − t)a1 + tb1 and g(t) = (1 − t)a2 + tb2.
In general, f and g can be any functions, e.g., polynomial, trigono-
metric, or exponential. However, in this chapter we will be looking
at polynomial f and g.
Let us be a bit more ambitious now and study motion along curves,
i.e., paths that do not have to be straight. The simplest example is
that of driving a car along a road. At time t = 0, you start, you follow
the road, and at time t = 1, you have arrived somewhere. It does not
really matter what kind of units we use to measure time; the t = 0
and t = 1 may just be viewed as a normalization of an arbitrary time
interval.
We will now attack the problem of modeling curves, and we will
choose a particularly simple way of doing this, namely cubic Bézier
curves. We start with four points in 2D or 3D, b0 , b1 , b2 , and b3 ,
called Bézier (control) points. Connect them with straight lines as
shown in Sketch 20.1. The resulting polygon is called a Bézier (con-
trol) polygon.1
Sketch 20.1.
A Bézier polygon.
The four control points, b0, b1, b2, b3, define a cubic curve, and some examples are illustrated in Figure 20.2. To create these plots, we evaluate the cubic curve at many t-parameters that range between 0 and 1,
1 Bézier polygons are not assumed to be closed as were the polygons of Section 18.2.
Figure 20.2.
Bézier curves: two examples that differ in the location of one control point, b0 only.
and then these points are connected by straight line segments to make
the curve look smooth. The points are so close together that you
cannot detect the line segments. In other words, we plot a discrete
approximation of the curve.
Here is how you generate one point on a cubic Bézier curve. Pick a parameter value t between 0 and 1. Find the corresponding point on each polygon leg by linear interpolation:
b_0^1 = (1 − t)b0 + tb1,
b_1^1 = (1 − t)b1 + tb2,
b_2^1 = (1 − t)b2 + tb3.
These three points form a polygon themselves. Now repeat the linear interpolation process, and you get two points:
b_0^2 = (1 − t)b_0^1 + tb_1^1,
b_1^2 = (1 − t)b_1^1 + tb_2^1.
Figure 20.3.
Bézier curves: all intermediate Bézier points generated by the de Casteljau algorithm
for 33 evaluations.
The points b_i^j are often called intermediate Bézier points, and the following schematic is helpful in keeping track of how each point is generated.
b0
b1   b_0^1
b2   b_1^1   b_0^2
b3   b_2^1   b_1^2   b_0^3
stage:    1       2       3
Except for the (input) Bézier polygon, each point in the schematic is a function of t.
Example 20.1
Let t = 1/2 and take the Bézier polygon
b0 = [4, 4]^T, b1 = [0, 8]^T, b2 = [8, 8]^T, b3 = [8, 0]^T.
First we compute the stage 1 intermediate points,
b_0^1 = (1/2)b0 + (1/2)b1 = [2, 6]^T,
b_1^1 = (1/2)b1 + (1/2)b2 = [4, 8]^T,
b_2^1 = (1/2)b2 + (1/2)b3 = [8, 4]^T.
Next,
b_0^2 = (1/2)b_0^1 + (1/2)b_1^1 = [3, 7]^T,
b_1^2 = (1/2)b_1^1 + (1/2)b_2^1 = [6, 6]^T,
and finally
b_0^3 = (1/2)b_0^2 + (1/2)b_1^2 = [9/2, 13/2]^T.
This is the point on the curve corresponding to t = 1/2.
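The algorithm is as short in code as in the schematic; a sketch that replays the computation above (the control polygon is the one used in Example 20.1):

import numpy as np

def de_casteljau(polygon, t):
    # Repeated linear interpolation: each stage replaces the current
    # points b_i by (1 - t) b_i + t b_{i+1}.
    pts = [np.asarray(b, dtype=float) for b in polygon]
    while len(pts) > 1:
        pts = [(1 - t) * p + t * q for p, q in zip(pts, pts[1:])]
    return pts[0]

b = [(4, 4), (0, 8), (8, 8), (8, 0)]
print(de_casteljau(b, 0.5))  # [4.5 6.5], i.e., [9/2, 13/2]^T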
Figure 20.4.
Bernstein polynomials: a plot of the four cubic polynomials for t ∈ [0, 1].
The cubic Bernstein polynomials are
B_i^3(t) = C(3, i)(1 − t)^(3−i) t^i, i = 0, 1, 2, 3,
where C(3, i) is the binomial coefficient, and they are illustrated in Figure 20.4. Now b_0^3(t) can be written as
b_0^3(t) = B_0^3(t)b0 + B_1^3(t)b1 + B_2^3(t)b2 + B_3^3(t)b3.   (20.3)
At any fixed t, the de Casteljau algorithm also yields two new control polygons that together describe the same curve as the original polygon b0, b1, b2, b3. This process of generating two Bézier curves from one is called subdivision.
From now on, we will also use the shorter b(t) instead of b_0^3(t).
Figure 20.5.
Affine invariance: as the control polygon rotates, so does the curve.
Figure 20.6.
Convex hull property: a Bézier curve lies in the convex hull of its control polygon. Left:
shaded area fills in the convex hull of the control polygon. Right: shaded area fills in
the minmax box of the control polygon. The convex hull lies inside the minmax box.
b(t) = a0 + a1t + a2t² + a3t³.
A matrix inversion is all that is needed here. Notice that the square
matrix in (20.8) is nonsingular, therefore we can conclude that any
cubic curve can be written in either the Bézier or the monomial
form.
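Going from Bézier to monomial form needs no matrix inversion at all: expanding (20.2) gives the coefficients directly. A sketch (the expansion is the standard one for cubics):

import numpy as np

def bezier_to_monomial(b0, b1, b2, b3):
    # b(t) = a0 + a1 t + a2 t^2 + a3 t^3
    b0, b1, b2, b3 = (np.asarray(b, dtype=float) for b in (b0, b1, b2, b3))
    a0 = b0
    a1 = 3 * (b1 - b0)
    a2 = 3 * (b2 - 2 * b1 + b0)
    a3 = b3 - 3 * b2 + 3 * b1 - b0
    return a0, a1, a2, a3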
20.4 Derivatives
Equation (20.2) consists of two (in 2D) or three (in 3D) cubic equa-
tions in t. We can take the derivative in each of the components:
db(t)/dt = −3(1 − t)²b0 − 6(1 − t)t b1 + 3(1 − t)²b1 − 3t²b2 + 6(1 − t)t b2 + 3t²b3.
Rearranging, and using the abbreviation db(t)/dt = ḃ(t), we have
ḃ(t) = 3(1 − t)²[b1 − b0] + 6(1 − t)t[b2 − b1] + 3t²[b3 − b2].   (20.9)
For example, at the endpoint t = 1,
ḃ(1) = 3[b3 − b2].
3 Note that the derivative curve does not have control points anymore, but control vectors.
Example 20.2
Let us compute the derivative of the curve from Example 20.1 for t = 1/2. First, let's evaluate the direct equation (20.9). We obtain
ḃ(1/2) = 3 · (1/4)([0, 8]^T − [4, 4]^T) + 6 · (1/4)([8, 8]^T − [0, 8]^T) + 3 · (1/4)([8, 0]^T − [8, 8]^T),
which yields
ḃ(1/2) = [9, −3]^T.
If instead, we used (20.10), and thus the intermediate control points calculated in Example 20.1, we get
ḃ(1/2) = 3[b_1^2(1/2) − b_0^2(1/2)]
       = 3([6, 6]^T − [3, 7]^T)
       = [9, −3]^T,
which is the same answer but with less work! See Sketch 20.4 for an illustration.
Sketch 20.4.
A derivative vector.
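Both evaluation routes are easy to program; a sketch of the direct route (20.9), with function names ours:

import numpy as np

def bezier_derivative(polygon, t):
    # Equation (20.9): 3(1-t)^2 [b1-b0] + 6(1-t)t [b2-b1] + 3t^2 [b3-b2].
    b0, b1, b2, b3 = (np.asarray(b, dtype=float) for b in polygon)
    return (3 * (1 - t) ** 2 * (b1 - b0)
            + 6 * (1 - t) * t * (b2 - b1)
            + 3 * t ** 2 * (b3 - b2))

b = [(4, 4), (0, 8), (8, 8), (8, 0)]
print(bezier_derivative(b, 0.5))  # [ 9. -3.], as in Example 20.2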
c1 − c0 = c[b3 − b2]   (20.13)
for some positive real number c, meaning that the three points b2, b3 = c0, c1 are collinear.
Sketch 20.6.
Smoothly joining Bézier curves.
If we use this rule to piece curve segments together, we can design
many 2D and 3D shapes. Figure 20.7 gives an example.
Figure 20.7.
Composite Bézier curves: the letter D as a collection of cubic Bézier curves. Only
one Bézier polygon of many is shown.
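Condition (20.13) is constructive: given one segment's polygon, it tells us where the next segment's first two control points must go. A sketch (function name ours; c > 0 is a free design parameter):

import numpy as np

def smooth_join_start(b2, b3, c):
    # (20.13): c0 = b3 and c1 - c0 = c (b3 - b2), so b2, b3 = c0, c1
    # are collinear and the composite curve has a continuous tangent.
    b2, b3 = np.asarray(b2, float), np.asarray(b3, float)
    c0 = b3
    c1 = b3 + c * (b3 - b2)
    return c0, c1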
speed. If the road does not curve, i.e., it is straight, you will not have
to turn your steering wheel. When the road does curve, you will have
to turn the steering wheel, and more so if the road curves rapidly.
The curviness of the road (our model of a curve) is thus proportional
to the turning of the steering wheel.
Returning to the more abstract concept of a curve, let us sample its
tangents at various points (see Sketch 20.7). Where the curve bends
sharply, i.e., where its curvature is high, successive tangents differ
from each other significantly. In areas where the curve is relatively
flat, or where its curvature is low, successive tangents are almost
identical. Curvature may thus be defined as rate of change of tan-
gents. (In terms of our car example, the rate of change of tangents is
proportional to the turning of the steering wheel.)
Since the tangent is determined by the curve's first derivative, its rate of change should be determined by the second derivative. This is indeed so, but the actual formula for curvature is a bit more complex than can be derived in the context of this book. We denote the curvature of the curve at b(t) by κ; it is given by
κ(t) = ‖ḃ ∧ b̈‖ / ‖ḃ‖³.   (20.14)
Sketch 20.7.
Tangents on a curve.
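Given the derivative vectors, (20.14) is nearly a one-liner; a sketch (for 2D curves, embed the vectors with a zero third component so the cross product is defined):

import numpy as np

def curvature(db, ddb):
    # kappa = |db x ddb| / |db|^3, equation (20.14).
    db, ddb = np.asarray(db, dtype=float), np.asarray(ddb, dtype=float)
    return np.linalg.norm(np.cross(db, ddb)) / np.linalg.norm(db) ** 3

# A circle of radius 2 at t = 0: derivatives (0, 2, 0) and (-2, 0, 0).
print(curvature((0, 2, 0), (-2, 0, 0)))  # 0.5 = 1/radius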
Figure 20.8.
Inflection point: an inflection point, a point where the curvature changes sign, is
marked on the curve.
Figure 20.9.
Inflection point: a cubic with two inflection points.
20.7 Moving along a Curve
Figure 20.10.
Curve motions: a letter is moved along a curve.
Take a look at Figure 20.10. You will see the letter B sliding along
a curve. If the curve is given in Bézier form, how can that effect be
achieved? The answer can be seen in Sketch 20.8. If you want to
position an object, such as the letter B, at a point on a curve, all
you need to know is the point and the curve’s tangent there. If ḃ is
the tangent, then simply define n to be a vector perpendicular to it.4
Using the local coordinate system with origin b(t) and [ḃ, n]-axes,
you can position any object as in Section 4.1.
Sketch 20.8.
Sliding along a curve.
The same story is far trickier in 3D! If you had a point on the curve and its tangent, the exact location of your object would not be fixed;
it could still rotate around the tangent. Yet there is a unique way to
position objects along a 3D curve. At every point on the curve, we
may define a local coordinate system as follows.
Let the point on the curve be b(t); we now want to set up a local
coordinate system defined by three vectors f1 , f2 , f3 . Following the
2D example, we set f1 to be in the tangent direction; f1 = ḃ(t). If
the curve does not have an inflection point at t, then ḃ(t) and b̈(t)
will not be collinear. This means that they span a plane, and that
plane’s normal is given by ḃ(t) ∧ b̈(t). See Sketch 20.9 for some visual
information. We make the plane’s normal one of our local coordinate
axes, namely f3 . The plane, by the way, has a name: it is called the
osculating plane at b(t). Since we have two coordinate axes, namely
f1 and f3 , it is not hard to come up with the remaining axis, we just
set f2 = f1 ∧ f3 . Thus, for every point on the curve (as long as it is not
an inflection point), there exists an orthogonal coordinate system. It
is customary to use coordinate axes of unit length, and then we have
f1 = ḃ(t)/‖ḃ(t)‖,   (20.16)
4 If ḃ = [ḃ1, ḃ2]^T, then n = [−ḃ2, ḃ1]^T.
Figure 20.11.
Curve motions: a robot arm is moved along a curve. (Courtesy of M. Wagner, Arizona
State University.)
f3 = (ḃ(t) ∧ b̈(t)) / ‖ḃ(t) ∧ b̈(t)‖,   (20.17)
f2 = f1 ∧ f3.   (20.18)
x(t, u) = b(t) + u1 f1 + u2 f2 + u3 f3 .
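A sketch of (20.16)–(20.18) and the positioning map x(t, u) in Python (function names ours; the input derivatives must not be collinear, i.e., the point must not be an inflection point):

import numpy as np

def frenet_frame(db, ddb):
    db, ddb = np.asarray(db, dtype=float), np.asarray(ddb, dtype=float)
    f1 = db / np.linalg.norm(db)   # (20.16): unit tangent
    f3 = np.cross(db, ddb)         # (20.17): normal of the
    f3 = f3 / np.linalg.norm(f3)   #          osculating plane
    f2 = np.cross(f1, f3)          # (20.18)
    return f1, f2, f3

def position(b_t, frame, u):
    # x(t, u) = b(t) + u1 f1 + u2 f2 + u3 f3
    return np.asarray(b_t, dtype=float) + sum(c * f for c, f in zip(u, frame))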
20.8 Exercises
Let a cubic Bézier curve d(t) be given by the control polygon
d0 = [0, 0]^T, d1 = [6, 3]^T, d2 = [3, 6]^T, d3 = [2, 4]^T.
10. Attach another curve c(t) at d(t), creating a composite curve with tan-
gent continuity. What are the constraints on c(t)’s polygon?
11. Sketch b(t) manually.
12. Using the de Casteljau algorithm, evaluate b(t) for t = 1/4.
13. Evaluate the first and second derivative of b(t) for t = 1/4. Add these
vectors to the sketch from Exercise 11.
14. For b(t), what is the control polygon for the curve defined from t = 0
to t = 1/4 and the curve defined over t = 1/4 to t = 1?
15. Rewrite b(t) in monomial form.
16. Find b(t)’s minmax box.
17. Find b(t)’s curvature at t = 0 and t = 1/2.
18. Find b(t)’s Frenet frame for t = 1.
19. Using the de Casteljau algorithm, evaluate b(t) for t = 2. This is
extrapolation.
20. Attach another curve c(t) at b0 , creating a composite curve with tangent
continuity. What are the constraints on c(t)’s polygon?
A
Glossary
Action ellipse The image of the unit circle under a linear map.
Affine space A set of points with the property that any barycentric
combination of two points is again in the space.
Domain A linear map maps from one space to another. The “from”
space is the domain.
Dual space Consider all linear maps from a linear space into the
1D linear space of scalars. All these maps form a linear space
themselves, the dual space of the original space.
Line Given two points in affine space, the set of all barycentric
combinations is a line.
Linear space A set of vectors with the property that any linear
combination of any two vectors is also in the set. Also called a
vector space.
Null space The set of vectors mapped to the zero vector by a linear
map. The zero vector is always in this set. Also called the kernel.
Span For a given set of vectors, its span is the set (space) of all vec-
tors that can be obtained as linear combinations of these vectors.
Transpose matrix The matrix whose rows are formed by the columns
of a given matrix.
Upper triangular matrix A matrix with only zero entries below the
diagonal.
Chapter 1
1a. The triangle vertex with coordinates (0.1, 0.1) in the [d1, d2]-system is mapped to
u1 = (2 − 1)/(3 − 1) = 1/2,
u2 = (2 − 2)/(3 − 2) = 0.
x1 = 1 + 0.5 × 1 = 1.5,
x2 = 1 + 0 × 2 = 1,
x3 = 1 + 0.7 × 4 = 3.8.
Chapter 2
2. The vectors v and w form adjacent sides of a parallelogram. The vectors v + w and v − w form the diagonals of this parallelogram, and an example is illustrated in Sketch 2.6.
4. The operations have the following results.
(a) vector
(b) point
(c) point
(d) vector
(e) vector
(f) point
5. The midpoint between p and q is
m = (1/2)p + (1/2)q.
8. A triangle.
9. The length of the vector
v = [−4, −3]^T
is
‖v‖ = √((−4)² + (−3)²) = 5.
13. The distance between p and q is 1.
14. A unit vector has length one.
15. The normalized vector
v/‖v‖ = [−4/5, −3/5]^T.
17. The barycentric coordinates are (1 − t) and t such that r = (1 − t)p + tq. We determine the location of r relative to p and q by calculating l1 = ‖r − p‖ = 2√2 and l2 = ‖q − r‖ = 4√2. The barycentric coordinates must sum to one, so we need the total length l3 = l1 + l2 = 6√2. Then the barycentric coordinates are t = l1/l3 = 1/3 and (1 − t) = 2/3. Check that this is correct:
[3, 3]^T = (2/3)[1, 1]^T + (1/3)[7, 7]^T.
The two sides of Cauchy-Schwarz are then (v · w)² = 9 and ‖v‖²‖w‖² = 1² × 3² = 9.
32. No, the triangle inequality states that ‖v + w‖ ≤ ‖v‖ + ‖w‖.
Chapter 3
1. The line is defined by the equation l(t) = p + t(q − p), thus
l(t) = [0, 1]^T + t[4, 1]^T.
We can check that
l(0) = [0, 1]^T + 0 × [4, 1]^T = p,
l(1) = [0, 1]^T + 1 × [4, 1]^T = q.
3. The parameter value t = 2 is outside of [0, 1], therefore l(2) is not formed
from a convex combination.
5. First form the vector
q − p = [2, −1]^T,
then a is perpendicular to this vector, so let
a = [1, 2]^T.
Next, calculate c = −a1p1 − a2p2 = 2, which makes the equation of the line x1 + 2x2 + 2 = 0.
Alternatively, we could have let
a = [−1, −2]^T.
Then the implicit equation of the line is −x1 − 2x2 − 2 = 0, which is simply equivalent to multiplying the previous equation by −1.
3. Yes, v = Aw where
w = [3, −2]^T.
6. The transposes of the given matrices are as follows:
A^T = [0 1; −1 0], B^T = [1 −1; −1 1/2], v^T = [2 3].
[0 −1 −3; 1 0 2].
[1 −1 −1; −1 1/2 −1/2].
19. This one is a little tricky! It is a rotation; notice that the determinant is one. See the discussion surrounding (4.15).
21. No. Simply check that the matrix multiplied by itself does not result in the matrix again:
A² = [−1 0; 0 −1] ≠ A.
9. The Gauss elimination steps from Section 5.4 need to be modified because a1,1 = 0. Therefore, we add a pivoting step, which means that we exchange rows, resulting in
[2 2; 0 4] [x1; x2] = [6; 8].
Now the matrix is in upper triangular form and we use back substitution to find x2 = 2 and x1 = 1.
13. No, this system has only the trivial solution, x1 = x2 = 0.
14. We find the kernel of the matrix
C = [2 6; 4 12]
to consist of all vectors c[−3, 1]^T; the vector [−3, 1]^T is an example.
21. Simply by inspection, we see that the map is a reflection about the e1-axis, thus the matrix is
A = [1 0; 0 −1].
Chapter 6
1. The point q = [2/3, 4/3]^T. The transformed points:
r′ = [3, 4]^T, s′ = [11/2, 6]^T, q′ = [14/3, 16/3]^T.
q4 = p0 + tv.
then
t = (v · w)/‖v‖² = 1,
and
q4 = [2, 1]^T + 1 · [−2, 2]^T = [0, 3]^T.
This point is the midpoint between p4 and p4′, thus
p4′ = 2q4 − p4 = [2, 5]^T.
x′ = A(x − a1) + a1.
Check that the ai are mapped to the a′i. The point x = [1/2, 1/2]^T, the midpoint between a2 and a3, is mapped to x′ = [0, 1/2]^T, the midpoint between a′2 and a′3.
9. Affine maps take collinear points to collinear points and preserve ratios, thus
x′ = [0, 0]^T.
11. Mapping a point x in NDC coordinates to x′ in the viewport involves the following steps.
1. Translate x by an amount that translates ln to the origin.
2. Scale the resulting x so that the sides of the NDC box are of unit
length.
3. Scale the resulting x so that the unit box is scaled to match the
viewport’s dimensions.
4. Translate the resulting x by lv to align the scaled box with the
viewport.
The sides of the viewport have the lengths Δ1 = 20 and Δ2 = 10. We can then express the affine map as
x′ = [Δ1/2 0; 0 Δ2/2] (x − [−1, −1]^T) + [10, 10]^T.
Applying this affine map to the NDC points in the question yields
x′1 = [10, 10]^T, x′2 = [30, 20]^T, x′3 = [15, 12]^T.
Of course we could easily "eyeball" x′1 and x′2, so they provide a good check for our affine map. Be sure to make a sketch.
12. No, affine maps do not transform perpendicular lines to perpendicular lines. A simple shear is a counterexample.
λ² − 2λ − 3 = 0.
Draw a sketch!
12. The quadratic form for C1 is an ellipsoid and λi = 4, 2. The quadratic form for C2 is a hyperboloid and λi = 6.12, −2.12. The quadratic form for C3 is a paraboloid and λi = 6, 0.
7. Two 3D lines are skew if they do not intersect. They intersect if there exists s and t such that l1(t) = l2(s), or
t[0, 0, 1]^T − s[−1, −1, 1]^T = [1, 1, 1]^T − [0, 0, 1]^T.
n · (x − p) = 0,
or
(2/3)x1 + (1/3)x2 + (2/3)x3 − 2/3 = 0.
10. A parametric form of the plane P through the points p, q, and r is
cos(θ) = d/‖w‖ and cos(θ) = (v · w)/(‖v‖ ‖w‖).
Solve for d,
d = (v · w)/‖v‖ = ([1, 0, 0]^T · [1, 1, 1]^T)/‖[1, 0, 0]^T‖ = 1.
sin(θ) = h/‖w‖ and sin(θ) = ‖v ∧ w‖/(‖v‖ ‖w‖).
Solve for h,
h = ‖v ∧ w‖/‖v‖ = ‖[0, −1, 1]^T‖/‖[1, 0, 0]^T‖ = √2.
14. The cross product of parallel vectors results in the zero vector.
17. The volume V formed by the vectors v, w, u can be computed as the scalar triple product
V = v · (w ∧ u).
This is invariant under cyclic permutations, thus we can also compute V as
V = u · (v ∧ w),
which allows us to reuse the cross product from Exercise 1. Thus,
V = [0, 0, 1]^T · [0, −1, 1]^T = 1.
18. This solution is easy enough to determine without formulas, but let's practice using the equations. First, project w onto v, forming
u1 = ((v · w)/‖v‖²) v = [1, 0, 0]^T.
Draw a sketch to see what you have created. This frame is not or-
thonormal, but that would be easy to do.
14. To rotate about the vector [−1, 0, −1]^T, first form the unit vector
a = [−1/√2, 0, −1/√2]^T.
Then, following (9.10), the rotation matrix is
[ (1/2)(1 + √2/2)   1/2    (1/2)(1 − √2/2) ]
[ −1/2              √2/2   1/2             ]
[ (1/2)(1 − √2/2)   −1/2   (1/2)(1 + √2/2) ]
The matrices for rotating about an arbitrary vector are difficult to verify by inspection. One test is to check the vector about which we rotated. You'll find that
[−1, 0, −1]^T → [−1, 0, −1]^T,
which is precisely correct.
16. The projection is defined as P = AA^T, where A = [u1 u2], which results in
P = [1/2 1/2 0; 1/2 1/2 0; 0 0 1].
The action of P on a vector v is
v′ = [(1/2)(v1 + v2), (1/2)(v1 + v2), v3]^T.
The vectors v1, v2, v3 are mapped to
v′1 = [1, 1, 1]^T, v′2 = [1/2, 1/2, 0]^T, v′3 = [0, 0, 0]^T.
This is an orthogonal projection into the plane x1 − x2 = 0. Notice that the e3 component keeps its value since that component already lives in this plane.
The first two column vectors are identical, thus the matrix rank is 2
and the determinant is zero.
17. To find the projection direction, we solve the homogeneous system P d = 0. This system gives us two equations to satisfy:
d1 + d2 = 0 and d3 = 0,
which have infinitely many nontrivial solutions, d = [c, −c, 0]^T. All vectors d are mapped to the zero vector. The vector v3 in the previous exercise is part of the kernel, thus its image v′3 = 0.
27. Only square matrices have an inverse, therefore this 3 × 2 matrix has no inverse.
29. The matrix (A^T)^T is simply A.
and
X = [x2 − x1  x3 − x1  x4 − x1] = [−1 −1 −1; 1 0 0; 0 −1 1].
We can find this point using the matrix from the previous exercise, p′ = Ap, or with barycentric coordinates. Clearly, the barycentric coordinates for p with respect to the xi are (1, 1, −1, 0), thus
p′ = x′1 + x′2 − x′3.
6. As always, draw a sketch when possible! The plane that is used here is
shown in Figure B.1. Since the plane is parallel to the e2 axis, only a
side view is shown. The plane is shown as a thick line.
Figure B.1.
The plane for these exercises.
13. Let S be the scale matrix and R be the rotation matrix, then the affine map is x′ = RSx + l, or specifically
x′ = [2/√2 0 −2/√2; 0 2 0; 2/√2 0 2/√2] x + [−2, −2, −2]^T.
Chapter 11
1. First we find the point normal form of the implicit plane as
(1/3)x1 + (2/3)x2 − (2/3)x3 − 1/3 = 0.
Substitute p into the plane equation to find the distance d to the plane, d = −5/3 ≈ −1.67. Since the distance is negative, the point is on the opposite side of the plane as the normal direction.
4. The point on each line that corresponds to the point of closest proximity is found by setting up the linear system
a′ = a + tv,
(a′ + tv) · n = 0,
12. With a sketch, you can find the intersection point without calculation, but let's practice calculating the point. The linear system is
[1 1 0; 1 0 0; 0 0 1] x = [1, 1, 4]^T,
and the solution is [1, 0, 4]^T.
14. With a sketch you can find the intersection line without calculation, but let's practice calculating the line. The planes x1 = 1 and x3 = 4 have normal vectors
n1 = [1, 0, 0]^T and n2 = [0, 0, 1]^T,
16. Set
b1 = [1, 0, 0]^T,
This is the upper triangular form of the coefficient matrix. Back substitution results in the solution vector
v = [1, −3, 0, −1]^T.
5. The steps for transforming A to upper triangular are given along with the augmented matrix after each step.
Exchange row1 and row2:
[1 0 0 | 0; 0 0 1 | −1; 1 1 1 | −1].
This is the upper triangular form of the coefficient matrix. Back substitution results in the solution vector
v = [0, 0, −1]^T.
resulting in
G = [0 1 0; 1 −1/2 0; 1 −3/2 1].
Let's check this result by computing
GA = [0 1 0; 1 −1/2 0; 1 −3/2 1] [2 −2 0; 4 0 −2; 4 2 −4] = [4 0 −2; 0 −2 1; 0 0 −1],
and, by Cramer's rule,
|8 0 1; 6 2 0; 6 1 1| = 10,  |3 8 1; 1 6 0; 1 6 1| = 10,  |3 0 8; 1 2 6; 1 1 6| = 10.
Each of these, divided by the determinant of the coefficient matrix (which is 5), gives a component of the solution.
The solution is [2, 2, 2]^T.
28. This intersection problem was introduced as Example 11.4 and it is illustrated in Sketch 11.12. The linear system is
[1 0 1; 0 0 1; 0 1 0] [x1; x2; x3] = [1, 1, 2]^T.
30. Use the explicit form of the line, x2 = ax1 + b. The overdetermined system is
[−2 1; −1 1; 0 1; 1 1; 2 1] [a; b] = [2, 1, 0, 1, 2]^T.
Following (12.21), we form the normal equations and the linear system becomes
[10 0; 0 5] [a; b] = [0; 6].
Thus the least squares line is x2 = 6/5. Sketch the data and the line to convince yourself.
32. Use the explicit form of the line, x2 = ax1 + b. The linear system for two points is not overdetermined, however let's apply the least squares technique to see what happens. The system is
[−4 1; 4 1] [a; b] = [1; 3].
Thus the least squares line is x2 = (1/4)x1 + 2. Let's evaluate the line at the given points:
x2 = (1/4)(−4) + 2 = 1,  x2 = (1/4)(4) + 2 = 3.
This is the interpolating line! This is what we expect since the solution
to the original linear system must be unique. Sketch the data and the
line to convince yourself.
Chapter 13
1. The linear system after running through the algorithm for j = 1 is
[−1.41 −1.41 −1.41; 0 −0.14 −0.14; 0 −0.1 0.2] u = [−1.41, 0, 0.3]^T,
and from this one we can use back substitution to find the solution
u = [1, −1, 1]^T.
The outline of all unit vectors takes the form of a rectangle with lower left vertex l and upper right vertex u:
l = [−1/2, −1]^T, u = [1/2, 1]^T.
λ² − 2.58λ + 1 = 0.
Hence there is a limit, namely [0, 1, 0]^T.
17. We have
D = [4 0 0; 0 8 0; 0 0 2] and R = [0 0 −1; 2 0 2; 1 0 0].
Hence
u(2) = [0.25 0 0; 0 0.125 0; 0 0 0.5] ([2, −2, 0]^T − R[0, 0, 0]^T) = [0.5, −0.25, 0]^T.
Next:
u(3) = [0.25 0 0; 0 0.125 0; 0 0 0.5] ([2, −2, 0]^T − R[0.5, −0.25, 0]^T) = [0.5, −0.375, −0.25]^T.
Similarly, we find
u(4) = [0.438, −0.313, −0.25]^T.
The true solution is
u = [0.444, −0.306, −0.222]^T.
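The iteration above is easy to replay in Python; a sketch of the splitting u(k+1) = D⁻¹(b − R u(k)) with the matrices of this exercise:

import numpy as np

D = np.diag([4.0, 8.0, 2.0])
R = np.array([[0.0, 0.0, -1.0],
              [2.0, 0.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([2.0, -2.0, 0.0])

u = np.zeros(3)       # u^(1) = 0
for _ in range(3):    # produces u^(2), u^(3), u^(4)
    u = np.linalg.solve(D, b - R @ u)
print(u)              # [ 0.4375 -0.3125 -0.25  ]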
⟨αv, w⟩ = (αv1)²w1² + (αv2)²w2² + (αv3)²w3² = α²⟨v, w⟩ ≠ α⟨v, w⟩.
16. Yes, ⟨A, B⟩ satisfies the properties (14.3)–(14.6) of an inner product.
Symmetry: This is easily satisfied since the products ai,j bi,j are commutative.
Positivity:
⟨A, A⟩ = a1,1² + a1,2² + . . . + a3,3² ≥ 0,
and equality is achieved if all ai,j = 0. If A is the zero matrix, all ai,j = 0, then ⟨A, A⟩ = 0.
Homogeneity and Additivity: If C is also in this space, then
⟨αA + βB, C⟩ = (αa1,1 + βb1,1)c1,1 + . . . + (αa3,3 + βb3,3)c3,3
= αa1,1c1,1 + βb1,1c1,1 + . . . + αa3,3c3,3 + βb3,3c3,3
= α⟨A, C⟩ + β⟨B, C⟩.
17. Yes. We have to check the four defining properties (14.3)–(14.6). Each is easily verified.
Symmetry: For instance p0q0 = q0p0, so this property is easily verified.
Positivity: ⟨p, p⟩ = p0² + p1² + p2² ≥ 0 and ⟨p, p⟩ = 0 if p0 = p1 = p2 = 0. If p(t) = 0 for all t, then p0 = p1 = p2 = 0, and clearly then ⟨p, p⟩ = 0.
Homogeneity and Additivity: We can tackle these together,
⟨αr + βp, q⟩ = α⟨r, q⟩ + β⟨p, q⟩,
by first noting that if r is also a quadratic polynomial, then
αr + βp = αr0 + βp0 + (αr1 + βp1)t + (αr2 + βp2)t².
This means that
⟨αr + βp, q⟩ = (αr0 + βp0)q0 + (αr1 + βp1)q1 + (αr2 + βp2)q2
= αr0q0 + βp0q0 + αr1q1 + βp1q1 + αr2q2 + βp2q2
= α⟨r, q⟩ + β⟨p, q⟩.
19. The Gram-Schmidt method produces
b1 = [1/√2, 1/√2, 0]^T, b2 = [−1/√2, 1/√2, 0]^T, b3 = [0, 0, −1]^T.
Knowing that the bi are normalized and checking that bi · bj = 0 for i ≠ j, we can be confident that this is an orthonormal basis. Another tool we have is the determinant, which will be ±1 for an orthonormal basis:
det[b1 b2 b3] = −1.
Chapter 15
1. First we find the eigenvalues by looking for the roots of the characteristic equation
det[A − λI] = |2 − λ  1; 1  2 − λ| = 0,   (B.1)
which is λ² − 4λ + 3 = 0 and when factored becomes (λ − 3)(λ − 1) = 0. This tells us that the eigenvalues are λ1 = 3 and λ2 = 1.
The eigenvectors ri are found by inserting each λi into [A − λiI]r = 0. We find that
r1 = [1, 1]^T and r2 = [−1, 1]^T.
Figure B.2.
Graph showing the connectivity defined by C.
d²sin(kx)/dx² = k d cos(kx)/dx = −k² sin(kx),
and the corresponding eigenvalues are −k².
then
A† = [0 1 0; −0.5 0 0].
Check (16.8) and (16.9) to be sure this solution satisfies the properties of the pseudoinverse.
14. The least squares system is Ax = b. We proceed with the enumerated
steps.
The pseudoinverse of Σ is
Σ† = [0.45 0 0; 0 1 0].
2. Compute
z = U^T b = [0.89, 0, −0.45]^T.
3. Compute
y = Σ†z = [0.4, 0]^T.
4. Compute the solution
x = V y = [0, 0.4]^T.
A^T A v = A^T b,
[1 0; 0 5] v = [0; 2],
and we can see that the solution is the same as that from the pseudoinverse.
17. Form the covariance matrix by first creating
X^T X = [12 8; 8 12],
and then divide this matrix by the number of points, 7, thus the covariance matrix is
C = [1.71 1.14; 1.14 1.71].
Find the (normalized) eigenvectors of C and make them the columns of V. Looking at a sketch of the points and from our experience so far, it is clear that
V = [0.707 −0.707; 0.707 0.707].
Chapter 17
1a. First of all, draw a sketch. Before calculating the barycentric coordinates (u, v, w) of the point
[0, 1.5]^T,
notice that this point is on the edge formed by p1 and p3. Thus, the barycentric coordinate v = 0.
The problem now is simply to find u and w such that
[0, 1.5]^T = u[1, 1]^T + w[−1, 2]^T
and u + w = 1. This is simple enough to see, without computing! The barycentric coordinates are (1/2, 0, 1/2).
1b. Add the point
p = [0, 0]^T
to the sketch from the previous exercise. Notice that p, p1, and p2 are collinear. Thus, we know that w = 0.
The problem now is to find u and v such that
[0, 0]^T = u[1, 1]^T + v[2, 2]^T
and u + v = 1. This is easy to see: u = 2 and v = −1. If this wasn't obvious, you would calculate
u = ‖p2 − p‖/‖p2 − p1‖,
then v = 1 − u.
Thus, the barycentric coordinates of p are (2, −1, 0).
If you were to write a subroutine to calculate the barycentric coordinates, you would not proceed as we did here. Instead, you would calculate the area of the triangle and two of the three sub-triangle areas. The third barycentric coordinate, say w, can be calculated as 1 − u − v.
Plot this point on your sketch, and this looks correct! Recall the incenter is the intersection of the three angle bisectors.
1d. Referring to the circumcenter equations from Section 17.3, first calculate the dot products
d1 = [1, 1]^T · [−2, 1]^T = −1,
d2 = [−1, −1]^T · [−3, 0]^T = 3,
d3 = [2, −1]^T · [3, 0]^T = 6,
then D = 18. The barycentric coordinates (cc1, cc2, cc3) of the circumcenter are
cc1 = d1(d2 + d3)/D = −1/2, cc2 = d2(d3 + d1)/D = 5/6, cc3 = d3(d1 + d2)/D = 2/3.
The circumcenter is
(−1/2)[1, 1]^T + (5/6)[2, 2]^T + (2/3)[−1, 2]^T = [0.5, 2.5]^T.
2. The ellipse is
(1/25)x1² + (1/4)x2² = 1.
4. The equation of the ellipse is
2x1² + 10x2² − 4 = 0.
which is
v = [3, −1]^T.
To write the circle in standard position, we need c = v^T A v − 6 = 4. Therefore, the circle in standard form is
x^T [1 0; 0 1] x − 4 = 0.
Divide this equation by 4 so we can easily see the linear map that takes the circle to the ellipse,
x^T [1/4 0; 0 1/4] x − 1 = 0.
Figure B.3.
The curve for Exercise 1 in Chapter 20.
The polygon for 1/2 ≤ t ≤ 1 is given by the bottom row, reading from right to left:
[3.625, 3.125]^T, [3.5, 4]^T, [2.5, 5]^T, [2, 4]^T.
Try to sketch the algorithm into Figure B.3.
5. The monomial coefficients ai are
a0 = [0, 0]^T, a1 = [18, 9]^T, a2 = [−27, 0]^T, a3 = [11, −5]^T.
For additional insight, compare these vectors to the point and derivatives of the Bézier curve at t = 0.
For additional insight, compare these vectors to the point and deriva-
tives of the Bézier curve at t = 0.
6. The minmax box is given two points, which are the “lower left” and
“upper right” extents determined by the control polygon,
0 6
and .
0 6