Hummel J. a. - Vector Geometry (1965)

This document is a preface and introduction to a textbook on vector geometry, aimed at students preparing for calculus. It emphasizes the importance of a solid mathematical foundation and provides a review of necessary topics such as trigonometry and analytic geometry. The text includes various chapters covering the real number system, vectors, conic sections, and quadratic curves, with a focus on geometric understanding and mathematical reasoning.

Vector Geometry

JAMES A. HUMMEL
University of Maryland

ADDISON-WESLEY PUBLISHING COMPANY, INC.


READING, MASSACHUSETTS • PALO ALTO • LONDON • DALLAS • ATLANTA
This book is in the
ADDISON-WESLEY SERIES IN INTRODUCTORY MATHEMATICS

Richard S. Pieters and Gail S. Young


Consulting Editors

Copyright © 1965
Philippines Copyright 1965
Addison-Wesley Publishing Company, Inc.

Printed in the United States of America

ALL RIGHTS RESERVED. THIS BOOK, OR PARTS THEREOF,
MAY NOT BE REPRODUCED IN ANY FORM
WITHOUT WRITTEN PERMISSION OF THE PUBLISHER.

Library of Congress Catalog Card No. 65-13609


Preface
This text has been written for use in a one-semester precalculus course for
students who have had a good preparation in high school mathematics. Most
high schools now give courses which cover all of the topics needed to begin
calculus, and many high schools even offer an introduction to the calculus.
However, only the very best students obtain from their high school courses the
mathematical sophistication necessary to begin a college calculus course in which
the emphasis is on the concepts as well as the techniques. The great majority
of the students must be started in a calculus course which begins very slowly,
or they must be given a precalculus course which helps develop their
mathematical insight. The total amount of time required to reach the end of the basic
calculus sequence is about the same in either case. However, the second method
offers some extra advantages, since the students can at the same time learn
something new which they may find useful.
There is a basic problem in the design of any first-year college course in
mathematics. This is the great variation in the mathematical training offered
in the different high schools. Some students have had a complete set of modern
courses, such as those represented by the SMSG sample textbooks. Others
have had a few such courses. But many are still graduating from high school
with three or three and a half units of strictly traditional material.
No single course can cover this entire range. Each textbook must make some
assumptions about what the student already knows and what type of training
he has had. In this text, we assume that the student has some knowledge of
trigonometry and elementary analytic geometry. Furthermore, we assume
that the student has been exposed to at least one course with the modern point
of view. To bridge the wide gap that still remains, the first two chapters of this
text have been included, which offer a review of the material which the student
is assumed to have seen already.
The instructor must decide how rapidly the material in these first two chapters
can be covered. It should be noted that this material is presented in too brief
a fashion to be suitable for the student who has never seen these topics before.
To learn this material from the start would require an entire semester for most
students. On the other hand, a very well-prepared student could skim over these
two chapters in two or three weeks. The average amount of time which might
be spent on these two chapters with a normal class would be about four weeks.
It is recommended that this review material be included in the course, even
for rather well-prepared students. There are several reasons for this. First,
these chapters cover all of the topics needed to proceed with the rest of the
book; they contain the definitions of our terminology and tell us what we can
assume to be known in our proofs. The inclusion of this material makes the
entire text (essentially) complete in its mathematical development. Secondly,
the student who has taken mostly traditional courses will learn here some of the
modern mathematical terminology and will be introduced to some aspects of
this point of view. Finally, if a student has had a number of the more modern
courses, the contents of these chapters will offer him a fast review and should
also show him how we can be precise when necessary and relax our precision
when it is unimportant. In many cases, such a student will find that the
development given here differs from what he has seen before. This may help
him understand that there is no single “correct” way of developing
mathematics.
Chapter 3 starts the “new” material in this text. This chapter introduces
the mathematical study of vectors. As mentioned above, the main purpose of
this text is to prepare students for the study of the calculus and other future
courses. Many topics could be chosen which would be useful for this purpose.
We have chosen to study vectors, because they are useful to engineers and
physicists and because they can be used to prepare the student for the study of
vector spaces, linear algebra, functions of several variables, and functional
analysis.
Because of this effort to build for the future, some of the material studied
may seem rather strange. Obvious things are sometimes looked at in what may
appear to be an unnecessarily abstract point of view. When this happens, it is
done with good reason. An attempt is being made to introduce the student to
the proper point of view in order to facilitate his future studies. On the other
hand, there are also places where a more abstract point of view could have been
taken, and yet was not. The emphasis throughout the chapters on vectors is
on the geometric picture. Excessive abstraction cannot be allowed to interfere
with geometric understanding. In fact, we make use of geometric intuition to
motivate the abstract mathematical development. Many students will see here
for the first time an abstract mathematical system developed to fit a specific
intuitive picture.
In Chapter 6, we discuss the conic sections. Many students will already be
familiar with a more standard development of the conic sections. If so, we
hope that they will find this approach new and interesting. The student who
knows nothing about the conic sections will have to work hard here, since the
discussion is rather rigorous and somewhat brief. While the vector methods
developed in earlier chapters are used wherever convenient, we avoid using
these for their own sake when other methods might be more efficient or more
revealing.
The last chapter discusses the rotation of coordinates and ends with two
sections in which the quadric surfaces and polar coordinates are discussed in as
concise a manner as possible. These topics are useful in a calculus course and
are included for this reason. Unfortunately, there would not be time left in a
one-semester course to expand on these topics beyond what is given here.
The last comment may also help to explain why other material was not
included. Each topic added to a one-semester course would require the deletion
of something else. The topics included here were chosen to serve the basic
purposes of the course in what seemed to be the best way. Other choices could
also have been made, but it would require a great deal of revision to make more
than very minor changes.
This text has been used for several semesters in a preliminary edition. The
results of these trials were quite satisfactory. The students found the material
difficult, of course, but they acquired the desired knowledge and point of view.
Both the students and the instructors seemed to find the text interesting. Their
many helpful comments and suggestions were taken into account in preparing
the final version of the text.
I wish to express my appreciation to all of my colleagues for the many
discussions which helped to mold this text into its present form. In particular, I
wish to thank Professor Stanley Jackson for his help in reading the manuscript
in its several versions and for his many thoughtful suggestions.

College Park, Maryland                J.A.H.
December 1964
Contents

Chapter 1 THE REAL NUMBER SYSTEM

1-1 Mathematical reasoning
1-2 The real number field
1-3 The order axioms
1-4 The completeness axiom
1-5 Absolute value
1-6 Determinants

Chapter 2 ANALYTIC GEOMETRY AND TRIGONOMETRY

2-1 The cartesian plane
2-2 Straight lines
2-3 Functions and graphs
2-4 Translations
2-5 Angles
2-6 Trigonometric functions
2-7 Triangle formulas

Chapter 3 VECTORS

3-1 Cartesian coordinates in three-dimensional space
3-2 Direction cosines and direction numbers
3-3 Vectors
3-4 The algebraic operation on vectors
3-5 Projections and the dot product
3-6 The triangle inequality

Chapter 4 PLANES AND LINES

4-1 Planes
4-2 The cross product
4-3 Distance formulas
4-4 The straight line

Chapter 5 VECTORS AS COORDINATE SYSTEMS

5-1 Some vector identities
5-2 Collinear and coplanar vectors
5-3 Coordinate vectors
5-4 Projections and distance formulas
5-5 General methods

Chapter 6 THE CONIC SECTIONS

6-1 The definition of conic sections
6-2 Equivalent definitions
6-3 The ellipse
6-4 The hyperbola
6-5 The parabola
6-6 General quadratic equations without cross product terms

Chapter 7 QUADRATIC CURVES AND SURFACES

7-1 Rotation of axes
7-2 General quadratic equations
7-3 The quadric surfaces
7-4 Polar coordinates

ANSWERS TO SELECTED PROBLEMS
INDEX
1

The Real Number System

1-1 MATHEMATICAL REASONING

In this book an attempt is made to introduce the student to the
“mathematical way of thinking.” This is not to imply that there is some essential
difference between this and any other (logical) way of thinking; the main
point of mathematical reasoning is common to all disciplines: precision.
There are, however, some special features, peculiar to mathematics, which
require study.
The student is expected, and required, to learn how to operate the
mathematical “tools” and (this is of equal importance) he must learn to judge
when these tools may (and when they may not) be applied. For these
purposes, he must learn to distinguish between what is known and what
is assumed, between the “actual world” and the “mathematical model”
of it, and between a plausible argument and a proof. Few students will
be required to be able to provide proofs for mathematical facts after
leaving school, but many will find it essential that they be able to
understand such proofs. In particular, they must know on what assumptions
the particular results are based.
As we study the various topics in this text, we will see how basic
assumptions (axioms) are used to derive results. At this point, however,
let us discuss something else which must be understood: mathematical
definitions.
In a dictionary a definition is supposed to explain the meaning and the
use of a word. Of course, the definitions of one word are given in terms of
other words. For example, we might find that the word cord would have
“a string or small rope” as one of its meanings. This means that wherever
applicable this phrase could be substituted in the place of the word “cord”
without changing the meaning of the sentence in question. Suppose,
however, that one knew no English at all. What good would the English
dictionary be? Imagine that you had a Russian dictionary with both
the words and the definitions in Russian, and that you did not know any
Russian. Every word of the language is there, and each word is defined.
Yet this dictionary would be of no use to you if you wished to read
something written in Russian.

In compiling a dictionary, the editors must assume that the users
already know something about the language. If a definition is given for
every word, then some circular definitions must be present. For example,
in the same dictionary which defined a cord as “a string or small rope”
we find a string defined as “a small cord or slender strip of leather,” and a
rope defined as “a large stout cord made of strands of fiber or wire twisted
or braided together.”
One way of explaining the mathematical point of view is to describe
the way in which a mathematician would write a dictionary. First, there
would be given a list of “primitive words” which would be left undefined.
Next, there would be a list of words whose definitions used only the
primitive words. Finally, there would be the remaining words so arranged that
the definition of each made use only of primitive words or words which
had been already defined.
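
The mathematician's dictionary described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word lists are hypothetical, the circular entries paraphrase the cord, string, and rope example, and the function names are invented here.

```python
# Hypothetical mini-dictionary: each word maps to the words used in its definition.
# These three entries paraphrase the circular example quoted in the text.
circular = {
    "cord": ["string", "rope"],
    "string": ["cord"],
    "rope": ["cord"],
}

# A hypothetical "mathematician's dictionary": two primitive (undefined) words,
# followed by entries listed in the order in which they are defined.
primitives = {"set", "element"}
ordered = [
    ("subset", ["set", "element"]),
    ("empty set", ["set", "element"]),
    ("proper subset", ["subset", "set"]),
]


def is_well_founded(primitive_words, entries):
    """True if every definition uses only primitives or earlier entries."""
    known = set(primitive_words)
    for word, used in entries:
        if not all(u in known for u in used):
            return False
        known.add(word)
    return True


def has_cycle(definitions):
    """True if following definitions ever returns to a word already on the path."""
    def visit(word, path):
        if word in path:
            return True
        return any(visit(u, path | {word}) for u in definitions.get(word, []))

    return any(visit(w, frozenset()) for w in definitions)


print(is_well_founded(primitives, ordered))  # the ordered list passes the check
print(has_cycle(circular))                   # the ordinary dictionary contains a circle
```

The ordered list is accepted because each entry refers only to words that are already available, while the ordinary dictionary entries lead back to "cord".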
A mathematical definition differs from a dictionary definition in another
way. A good dictionary will attempt to explain the meaning and the use
of a word and to distinguish it from other words with a similar, but not
identical, meaning. A definition in mathematics will merely give an
equivalent to the term being defined. Explanations of the meaning and
the usage of the term must be given separately. This is true even for the
terms which are taken as undefined. An essential characteristic of most
works in mathematics is the precise use of terms. The exact meanings of
these terms must be known.
Let us give an example of what we mean by discussing the
mathematical use of the word set. We assume that the student has been introduced
to this concept already, but even if he hasn’t, the basic idea is simple
enough to grasp. We are treating the concept of a set as an undefined
term, but this does not prevent us from explaining carefully what we
mean when we use the term.
A set is a collection of things which are called the elements of the set.
In our use of this term, there are only two things which need to be kept
in mind. First, an element either is in a set or it is not. If a set is specified
by listing its elements, an element which is listed more than once is still
in the set only once.
Secondly, the elements in a set have no order. Listing the elements of
a set puts them in some order, but this order is not part of the structure
of the set. In loose, but picturesque, terms this idea can be expressed by
saying that a set is a “bag full of elements.”
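
As a small illustration, Python's built-in set type happens to follow the same two rules, so it can serve as a rough model of the "bag full of elements" (the variable names below are chosen only for this sketch):

```python
# Two listings of the same elements: repeats and order make no difference.
a = {1, 2, 3}
b = {3, 1, 2, 2, 1}

same_set = (a == b)          # equality compares elements only
number_of_elements = len(b)  # an element listed twice is still counted once

# An element either is in a set or it is not.
two_is_in_a = 2 in a
five_is_in_a = 5 in a

print(same_set, number_of_elements, two_is_in_a, five_is_in_a)
```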
One of the sources of the power of mathematics is in the use of symbols.
Thus elementary algebra has as its basic idea the introduction of letters
to symbolize real numbers. In the same way, it is useful to introduce
symbols, single letters, to represent sets. Capital italic letters are used
for this purpose in this text. Unfortunately, the same letters are used for
other purposes also, but in every case the meaning of the symbol used
should be clear from the context.
If A is a set and x is an element of that set, we write

x ∈ A,

which can be read, “x is in A” or “x is an element of A.”
When a set is specified by listing all of its elements, it is customary to
indicate the set itself by enclosing the list in braces. Thus the set whose
elements are the three integers 1, 2, and 3 is written as

{1, 2, 3}.

When a set is specified by giving some rule which allows us to determine
whether or not a given element is in the set, another notation is used.
For example,

{x | x is an integer, x > 1}

is read: “The set of all x such that x is an integer and x > 1.” The braces
indicate that we are defining a set. The x before the vertical bar is the
symbol indicating the general form of the elements of the set we are
talking about. It is a “dummy variable,” meaning that any other symbol
can be used in its place, just as we can use x, y, or any other letter to
indicate some unknown number in an algebraic problem. The rule for
determining whether or not an element is in the set is given after the
vertical bar. It is to be noted that the vertical bar does not mean “such
that”; it is merely a symbol which divides the “dummy variable” from
the rule and, in this context, can be read as “such that.”
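
The rule notation has a close analogue in Python's set comprehensions. The sketch below is an illustration only; since a program cannot list all integers, it assumes a small hypothetical universe, the integers from −5 to 5, in place of the full set of integers.

```python
# Assumed finite universe standing in for "the integers" (hypothetical choice).
universe = range(-5, 6)

# {x | x is an integer, x > 1}, restricted to the universe above.
A = {x for x in universe if x > 1}

# The dummy variable can be renamed without changing the set.
B = {y for y in universe if y > 1}

# The same set specified by listing its elements.
listed = {2, 3, 4, 5}

print(A == B, A == listed)
```

In the comprehension, the word `for` introduces the dummy variable and the condition after `if` plays the part of the rule after the vertical bar.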
Usually the type of element under consideration is understood and no
explicit statement of its nature need be made inside the symbol for the
set. That is, if it were understood that we were considering only the
integers, the above set could be written {x | x > 1}. However, this will
usually be done only when the elements under consideration are the real
numbers, or doubles, or triples of real numbers. In any case, it will always
be evident which elements are to be considered as possible candidates for
inclusion in the set.
If A and B are two sets such that every element of A is also an element
of B, then A is called a subset of B, and we write A ⊂ B. Thus, for
example, the set {x | 1 < x < 2} (x being allowed to be any real number)
is a subset of the set of all real numbers. It is also a subset of the set of
all positive numbers, {x | x > 0}.
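
The subset definition can be tried out numerically. The sketch below is not a proof: it assumes a finite list of sample rationals (a hypothetical choice) and checks the defining condition only on those samples, with each set represented by its membership rule.

```python
from fractions import Fraction


def in_A(x):
    """Membership rule for A = {x | 1 < x < 2}."""
    return 1 < x < 2


def in_B(x):
    """Membership rule for B = {x | x > 0}."""
    return x > 0


# Assumed sample points: the rationals n/10 for n from -30 to 30.
samples = [Fraction(n, 10) for n in range(-30, 31)]

# A is a subset of B: every sampled element of A is also in B.
a_subset_of_b = all(in_B(x) for x in samples if in_A(x))

# B is not a subset of A: a single element of B outside A settles it.
b_subset_of_a = all(in_A(x) for x in samples if in_B(x))

# For finite sets, Python offers the test directly.
finite_check = {1, 2}.issubset({1, 2, 3})

print(a_subset_of_b, b_subset_of_a, finite_check)
```

A passing sample check only fails to find a counterexample, whereas a failing one, such as x = 1/10 for the second test, does disprove the subset relation.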
There is one peculiar type of set which frequently causes confusion, the
empty set, which is also called the null set or void set. It is defined to be the
set which contains no elements. The symbol commonly used for the empty
set is ∅. The empty set occupies the same position in discussions of sets
as the number zero occupies in the discussion of integers, and is often
objected to by the beginning student for the same reason that the number
zero was objected to when it was first introduced. Since it represents
nothing, why do we need a symbol for it? If the student can answer this
question for the number zero, then he should be able to answer it also for
the empty set.
As the student goes through this text, in addition to learning the
material, he should spend time considering why the definitions are given as
they are, and why the proofs of theorems are given in the form that they
are. The student is not expected to be very familiar with mathematical
proofs when he starts this course. Therefore, when proofs are given in
the text, comments about their logic will be made. It is hoped that by the
end of the course, the student will know a good deal more about how
theorems in mathematics can be proved.

PROBLEMS

1. Each of the following sets is to be a subset of the set of all real numbers.
Some are equal; some are subsets of others. Find all the relations between
the sets. (Note: equality is always used in the sense of identity.)
A = {x | 0 < x < 1}        B = {x | x² < 4}
C = {y | y² < 1}           D = {z | z² < 1 and z > 0}
E = {w | −2 < w < 2}       F = {t | t > 0}
2. Write in set notation:
(a) The set of all real numbers whose squares are less than 2
(b) The set of all pairs, (x, y), of real numbers for which the first member of
the pair is smaller than the second
(c) The set of all even integers
(d) The set of all integer multiples of 5
(e) The set of all positive real numbers whose squares are less than 2
3. Write each of the following sets of real numbers in a simpler form such as
{x | a < x < b}.
(a) {x | x² − x < 0}        (b) {x | x² + x + 1 < 0}
(c) {x | x² + x − 2 < 0}    (d) {x | x² + 3x + 4 > 0}
4. According to the definition of a subset, is a set a subset of itself? Explain.
5. According to the definition of a subset, if B is some set, is ∅ ⊂ B? Explain.
6. Two sets A and B are equal, A = B, if and only if they consist of exactly
the same elements. To show that two sets are equal, you must show that
every element in one is in the other, and vice versa. Identify each of the
following statements as either true or false. Explain your statement. (Note
that in order to disprove a statement, it is only necessary to give a single
example in which the statement does not hold.)
(a) If A ⊂ B and B ⊂ A, then A = B.
(b) If A ⊂ B, C ⊂ D, and A = C, then B = D.
(c) If A ⊂ B and C ⊂ B, then A = C.
(d) If A ⊂ B and B ⊂ C, then A ⊂ C.
7. (a) Let A₁ be a set containing exactly one element. How many subsets does
A₁ have?
(b) Let A₂ be a set containing exactly two elements. How many subsets does
A₂ have?
(c) Let A₃ be a set containing exactly three elements. How many subsets
does A₃ have?
(d) How many subsets are there of a set which has exactly n elements?
8. Using a dictionary of “collegiate” size or larger, find a circle of definitions.
That is, look up some noun. Choose a noun in the definition which is crucial
to at least one part of the definition. Look up this noun and continue in this
way until you arrive at a noun already in the chosen list.
9. Using a good dictionary, try to find the exact difference between the following
pairs of words.
(a) enclose and inclose (b) further and farther
(c) whence and whither (d) should and would
(e) as and like (f) imply and infer
(g) affect and effect (h) that and which

1-2 THE REAL NUMBER FIELD

The student probably feels that he has a good working knowledge of
the real numbers. After all, he has been studying them for years. In this
section, and the next few sections, we wish to review the basic properties
of the real numbers. The student may recognize the particular properties
which we single out as being the rules of elementary algebra, but he is
warned that the attitude adopted toward them here is different. We
discuss these properties in terms of the axioms which define them; and the
particular axioms we choose to discuss will be the important thing.
No attempt will be made to define the real number system here.
Instead, we shall merely list the properties which distinguish the real number
system from other more or less similar mathematical systems (such as the
integers). As will be seen, the particular properties that will be discussed
are fundamental. Some of them will appear again as properties of entirely
different systems. Some of them will prove to be false in other
mathematical systems, but all of them should be known by the student in order
to help him gain an understanding of what underlies the mathematics
he is to learn.
Let us start by making a precise statement of our assumptions about
the real number system:

The real number system is a set of elements on which two binary operations,
called addition and multiplication, and a binary relation, called order, are
defined.

We shall not define the terms binary operation or binary relation in
general at this point, but only explain them in the given context.
The binary operation of addition on the real numbers associates to every
pair of real numbers, a and b, a unique real number c, called the sum of
a and b. We write c = a + b to represent this sum. The binary operation
of multiplication on the real numbers similarly associates to every pair of
real numbers, a and b, a unique real number d, called the product of a and b.
We write d = ab, or d = a · b, to represent this product.
These statements explain what we mean by the binary operations of
addition and multiplication. They do not, however, explain what the
operations are, or how they behave. (The binary relation of order will be
studied in the next section, so we shall not discuss it here.)
In order to try to explain what addition and multiplication are, we list
some of the properties of these operations. These properties are nothing
more than the basic laws of algebraic manipulation and will be well
known to any student who is familiar with elementary algebra. The
particular set of properties we choose to write down may appear to be rather
brief from this point of view, but it so happens that one can prove the
computational properties of the real numbers, assuming nothing more
than these. These properties are therefore the axioms which determine
the elementary properties of the real numbers.
Before listing these axioms, we should say a word about equality of
real numbers, or about equality in general. In this text, the equality sign,
=, is used to mean actual identity of the elements separated by it. That
is, we write a = b, if and only if a and b are actually the same. This can
also be explained by saying that a and b are two names for the same thing.
Actually, what we are doing is using the equality sign in its meaning as
applied to the elements of a set (the set of real numbers in this case).
The axioms we now introduce are called the field axioms, since they
constitute the axioms for a special mathematical structure known as a
field. The student will encounter the concept of fields again if he takes a
course in abstract algebra.

The Field Axioms for the Real Numbers

1. The Commutative Law for Addition. For any real numbers a and b,

a + b = b + a.

2. The Commutative Law for Multiplication. For any real numbers a and b,

ab = ba.

3. The Associative Law for Addition. For any real numbers a, b, and c,

a + (b + c) = (a + b) + c.

4. The Associative Law for Multiplication. For any real numbers a, b, and c,

a(bc) = (ab)c.

5. The Existence of the Identity for Addition. There exists a real number 0
such that if a is any real number,

a + 0 = 0 + a = a.

6. The Existence of the Identity for Multiplication. There exists a real number
1 ≠ 0 such that if a is any real number,

a · 1 = 1 · a = a.

7. The Existence of Inverses for Addition. For any real number a, there exists
a corresponding real number −a such that

a + (−a) = 0.

8. The Existence of Inverses for Multiplication. For any real number a ≠ 0,
there exists a corresponding real number a⁻¹ such that

a · a⁻¹ = 1.

9. The Distributive Law. For any real numbers a, b, and c,

a(b + c) = ab + ac.
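
The nine axioms can be spot-checked on a computer. The sketch below is only a sanity check on a handful of values, not a proof: it assumes exact rational arithmetic from Python's `fractions` module as a stand-in for the real numbers, and the sample list and the function name `check_field_axioms` are hypothetical choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed sample values: a few rationals with small numerators and denominators.
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]


def check_field_axioms(xs, zero, one):
    """Test each of the nine field axioms on every choice of sample values."""
    pairs = list(product(xs, repeat=2))
    triples = list(product(xs, repeat=3))
    return {
        1: all(a + b == b + a for a, b in pairs),
        2: all(a * b == b * a for a, b in pairs),
        3: all(a + (b + c) == (a + b) + c for a, b, c in triples),
        4: all(a * (b * c) == (a * b) * c for a, b, c in triples),
        5: all(a + zero == a and zero + a == a for a in xs),
        6: one != zero and all(a * one == a and one * a == a for a in xs),
        7: all(a + (-a) == zero for a in xs),
        # Axiom 8 applies only to nonzero a; 1 / a is the exact reciprocal.
        8: all(a * (1 / a) == one for a in xs if a != zero),
        9: all(a * (b + c) == a * b + a * c for a, b, c in triples),
    }


results = check_field_axioms(samples, Fraction(0), Fraction(1))
print(results)
```

Exact fractions are used rather than floating-point numbers because rounded arithmetic does not satisfy all of these laws exactly.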

A few comments about these nine axioms are called for at this point.
First, many mathematicians would add to this set two further axioms,
the closure axioms. These state that if a and b are real numbers, then
a + b and ab are also real numbers. The uniqueness of a + b and ab is
usually included in these same axioms. We prefer to think of these
properties as being implied by the assumption that the binary operations of
addition and multiplication are defined on the set of real numbers, and
that the result of these binary operations is in every case a unique real
number. These properties would become important if we wished to discuss
the operations of addition and multiplication on subsets of the real
numbers or if we were to discuss similar operations in other mathematical
contexts. We also need to quote the closure and uniqueness properties as the
reason for some steps in proofs (as we will see below).

Definition 1-1. Let R be a given set and let a binary operation be defined
on R; that is, for every pair of elements a, b in R, there exists a unique
element c in R which we write as c = a ∘ b. Let S be a subset of R.
Then we say that the set S is closed with respect to the operation ∘ if
and only if for every a and b in S, a ∘ b is also in S.
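
For a finite subset, Definition 1-1 can be checked directly by trying every pair. The following sketch is an illustration only; the function name `is_closed` and the example subset {−1, 0, 1} of the integers are choices made here, not taken from the text.

```python
from operator import add, mul


def is_closed(subset, op):
    """True if op(a, b) lies in subset for every a and b in subset."""
    return all(op(a, b) in subset for a in subset for b in subset)


S = {-1, 0, 1}

closed_under_multiplication = is_closed(S, mul)  # every product stays in S
closed_under_addition = is_closed(S, add)        # 1 + 1 = 2 is not in S

print(closed_under_multiplication, closed_under_addition)
```

The same subset is closed with respect to one operation and not the other, which is why closure must always be stated relative to a particular operation.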

This formal definition should help make clear exactly what is meant
by the closure property. We shall not give a formal definition of the
uniqueness property, since we understand this to be an integral part of
the concept of a binary operation.
The student should observe the names attached to the various axioms.
The properties described by these axioms are fundamental and will be
found over and over again in different mathematical contexts. If, therefore,
the student does not already know these properties by name, he is advised
to learn them; these names are an essential part of the language of
mathematics.
Finally, observe closely the wording of these axioms. The order in
which the phrases occur is most important. Here the content is familiar,
and it is easy to slide over the full significance of the various phrases.
Look, for example, at the fifth axiom. It says that there exists a real
number zero and that this number exists once and forever, without any
regard to the real number a that it is being added to. In the seventh axiom,
however, the order of the phrases is reversed. This statement asserts the
existence of the negative of a number, once we are given the number. No
implication is made that there is a single, universal number −a. Look
at the various statements and observe the logic of their construction.
Try to see how the assertions made would be altered if the statements
appeared in a different order.
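
The effect of the order of the phrases can be seen by writing "there exists" as `any` and "for any" as `all`. The sketch below is an illustration on an assumed finite sample of integers, {−2, −1, 0, 1, 2}, chosen so that each number's negative is also in the sample.

```python
xs = [-2, -1, 0, 1, 2]

# Axiom 5 pattern: there exists z such that, for every a, a + z = a.
# One number z (namely 0) must work for all a at once.
identity_exists = any(all(a + z == a for a in xs) for z in xs)

# Axiom 7 pattern: for every a, there exists n such that a + n = 0.
# The number n is allowed to depend on a.
inverses_exist = all(any(a + n == 0 for n in xs) for a in xs)

# Axiom 7 with the phrases reversed: there exists n such that,
# for every a, a + n = 0. No single n can do this.
universal_negative_exists = any(all(a + n == 0 for a in xs) for n in xs)

print(identity_exists, inverses_exist, universal_negative_exists)
```

Swapping the two quantifiers turns a true statement about negatives into a false one, which is exactly the distinction the wording of the axioms is meant to preserve.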
From these nine axioms all of the purely arithmetic properties of the
real number system could be proved. For example, the obvious
extensions (which could be formally proved) of the first four of these axioms
permit us to write sums or products in any order without having to worry
about introducing parentheses to specify which operations should be done
first.
While we do not wish to spend much time on the development of the
algebraic properties of the real number system from these axioms at this
point, two particular results which show how this development could
proceed might be of interest. We will sketch their proofs.

Theorem 1-1. If a and b are two real numbers, then there exists a unique
real number x such that
a + x = b.

Proof: We first note that the real number

(−a) + b,

whose existence is guaranteed by Axiom 7, does indeed satisfy the
requirement for x. This is proved by successive applications of Axioms 3, 7, and 5.
To see that the solution, x, is unique, suppose that a + x = b and
a + y = b. Then from the identity of the numbers involved, a + x =
a + y. We then merely need to add (−a) to this number in its two
representations to conclude (after using Axioms 3, 7, and 5) that x = y.
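
Both halves of Theorem 1-1 can be illustrated numerically. The sketch below is not a substitute for the proof: it assumes exact rationals as stand-ins for real numbers, and the sample lists and the function name `solve` are hypothetical choices. It checks that x = (−a) + b satisfies a + x = b, and that no other candidate in a finite list does.

```python
from fractions import Fraction
from itertools import product


def solve(a, b):
    """The solution exhibited in the proof: x = (-a) + b."""
    return (-a) + b


# Assumed samples: quarters between -2 and 2 for a and b,
# and quarters between -4 and 4 as candidate solutions.
samples = [Fraction(n, 4) for n in range(-8, 9)]
candidates = [Fraction(n, 4) for n in range(-16, 17)]

# Existence: the exhibited number does satisfy a + x = b.
existence_holds = all(
    a + solve(a, b) == b for a, b in product(samples, repeat=2)
)

# Uniqueness (on the candidate list): exactly one candidate satisfies the equation.
uniqueness_holds = all(
    [y for y in candidates if a + y == b] == [solve(a, b)]
    for a, b in product(samples, repeat=2)
)

print(existence_holds, uniqueness_holds)
```

The first check corresponds to "at least one solution" and the second to "at most one", the two separate parts of the argument.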

Theorem 1-2. If a is any real number, then

a · 0 = 0.

Proof: We have that

1 + 0 = 1,

and hence that

a(1 + 0) = a · 1
         = a.

But then, from the distributive law,

a · 1 + a · 0 = a

or

a + a · 0 = a.

However, we know that a + 0 = a, and hence from Theorem 1-1 we
conclude that a · 0 = 0.
The proofs of the above theorems have been written in the typical
informal style used in mathematical works these days. They could also
have been given in a formal style, with each step being given a full
justification.

For example, the formal proof of Theorem 1-2 would look like this:

Statement                          Reason

(1) 1 + 0 = 1                      Existence of the identities for addition and multiplication
(2) a(1 + 0) = a · 1               Uniqueness of multiplication
(3) a · 1 = a                      Identity for multiplication
(4) a(1 + 0) = a                   Equality of numbers in (2) and (3)
(5) a(1 + 0) = a · 1 + a · 0       Distributive law
(6) a(1 + 0) = a + a · 0           Equality of numbers in (3) and (5)
(7) a + a · 0 = a                  Equality of numbers in (4) and (6)
(8) a + 0 = a                      Existence of identity for addition
(9) a · 0 = 0                      Theorem 1-1 applied to (7) and (8)

If the student writes out the complete formal proofs of a few theorems,
he will see how we can say that these results can be derived from the
axioms alone. However, these formal proofs are usually too long and too
detailed for ordinary use. The fragmentation of the steps makes it
difficult to see exactly what the main point of the proof is. Informal
proofs need to give only enough detail to be convincing. Correctly done,
the informal proof should give the reader enough information to enable
him to write out the complete proof in a formal fashion if required.
The proof of Theorem 1-1 contains two distinct parts. In the first
part, it is shown that a solution of the given equation does indeed exist.
This is done directly, by exhibiting a number whose existence is proved
from the axioms and which satisfies the equation. The second part of the
proof shows that this solution is unique. This is proved by showing that
any two numbers which satisfy the equation must in fact be the same.
Note that proving this part alone does not prove the entire theorem. It
might well be possible to prove that any two solutions of a given equation
would be equal when there were in fact no solutions at all. Another way
of saying this is to observe that the first part of the proof of Theorem 1-1
shows that there is at least one solution while the second part shows that
there is at most one.
PROBLEMS

1. Show that the set of all integers, with the usual operations of addition and
multiplication, does not satisfy all of the field axioms.
2. The rational numbers can be represented in the form p/q where p and q are
integers, with q ≠ 0. Two pairs of integers, r/s and p/q, represent the same
rational number if and only if rq = ps. Addition and multiplication of
rational numbers are defined by

r/s + p/q = (rq + ps)/(sq),

(r/s)(p/q) = (rp)/(sq).

The set of all rational numbers satisfies the field axioms. Prove that Axioms
7, 8, and 9 are satisfied, assuming that the first six have already been proved.
3. Show that the set consisting of the two elements 1 and 0 satisfies the field
axioms if we suppose that
0 + 0 = 0,            0 • 0 = 0,
0 + 1 = 1 + 0 = 1,    0 • 1 = 1 • 0 = 0,
1 + 1 = 0,            1 • 1 = 1.

4. Prove from the field axioms that if a • b = 0, then either a or b is zero.
[Hint: suppose one of them, say b, is not zero. Use Axiom 8 to prove that a
must then be zero.]
5. Let V be the set of all possible pairs of real numbers, (a, b). Two elements
of V are equal if and only if both consist of the same pair of numbers in the
same order. Define the sum and product in V by
(a, b) + (c, d) = (a + c, b + d),
(a, b) • (c, d) = (ac, bd).

Which of the field axioms hold in V? Give an example showing the failure
of any of the axioms which are not true. Give an example showing the
failure of the property of Problem 4.
6. With the same set V as Problem 5, define sums and products by

(a, b) + (c, d) = (a + c, b + d),

(a, b) • (c, d) = (ac - bd, ad + bc).

Which of the field axioms hold in this case?


7. Let m be any positive rational number which is not the square of a rational
number. Show that the set of all numbers of the form a + b√m, where a
and b are rational, satisfies the field axioms.
8. Write out a full formal proof of Theorem 1-1.
9. Prove that if a • a = a, then either a = 0 or a = 1.
10. Which of the field axioms are satisfied by the set of positive real numbers?
11. Prove that for any a, -a = (-1)a. [Hint: 1 + (-1) = 0. Multiply both
sides by a.]
12. Prove that for any a, -(-a) = a.

13. Each of the following binary operations is defined on the set of real numbers.
Check each operation as to whether or not it is commutative or associative.
If there is an identity for the operation, state what it is. If there is none,
show why not. If an identity for the operation does exist, do inverses exist?
(a) a ♦ b = |(a + b)
(b) a A b = the larger of a and b
(c) a o b = a + 2b
14. For each of the following subsets of the reals, check whether the subset is
closed under the three operations defined in Problem 13:
(a) the set of all integers
(b) the set of all positive reals
(c) the set of all real numbers whose square is less than or equal to one

1-3 THE ORDER AXIOMS

The field axioms listed in the previous section fail to characterize the
real numbers. The fact that the examples in Problems 2 and 3 of the last
section satisfy the field axioms shows this to be true. If the student is
familiar with complex numbers, he can check to see that the set of complex
numbers also satisfies the field axioms (this was Problem 6 of the
last section).
A property of the real numbers that is not shared by the field of complex
numbers, or one of the fields of the type seen in Problem 3 of the last sec­
tion, is order. The order relation of the real numbers is linked to the
field properties by certain axioms.

The Order Axioms for the Real Numbers

There exists an order relation on the real numbers. For every pair of real
numbers a and b this relation is either true or false. If it is true, we say that
a is less than b and write a < b. If it is not true, we write a ≮ b. This
relation satisfies:
1. The Trichotomy Law. For any pair of real numbers a and b, one and only
one of the following holds:

(a) a < b, (b) a = b, (c) b < a.

2. The Transitive Law. If a, b, and c are real numbers such that a < b
and b < c, then a < c.
3. The Addition Law. If a, b, and c are real numbers and a < b, then
a + c < b + c.
4. The Multiplication Law. If a, b, and c are real numbers such that a < b
and 0 < c, then ac < bc.

From these axioms, all of the familiar properties of the order relation
can be proved. For example:

Theorem 1-3. If a < b and c < d, then a + c < b + d.

Proof: Since a < b, from Axiom 3 we have

a + c < b + c.

Similarly, since c < d, we have

b + c < b + d.

The transitive law applied to these two inequalities then gives the desired
conclusion, a + c < b + d.
Other results can be proved just as easily. We will list a few of these
without proof. The student should study them to be sure he knows and
can apply the results. In the statement of these theorems, we will use a
few symbols and conventions which have not been formally introduced,
but this should cause no difficulty. The student knows, for example,
that a > b means b < a, that a ≤ b means that either a < b or a = b,
that a - b means a + (-b), and so forth.
Similarly, we will assume that all of the computational consequences
of the axioms of the last section have been proved. For example, we may
assume that -a = (-1)a, that (-a)(-b) = ab, and so on. It is not
our purpose to give a detailed development of all of the properties of the
real number system. Rather, we are only interested in showing how this
might be done and pointing out that all of these properties depend on a
very few axioms.

Theorem 1-4. If a > 0, then -a < 0.

Theorem 1-5. If a < b, then b - a > 0.

Theorem 1-6. If b - a > 0, then a < b.

Theorem 1-7. If a < b, then -b < -a.

Theorem 1-8. If b < 0, then -b > 0.

Theorem 1-9. If a < b and c < 0, then ac > bc.

Theorem 1-10. 1 > 0.

Theorem 1-11. If 0 < a < b, then 0 < b⁻¹ < a⁻¹.
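A numerical spot check is no substitute for the proofs requested in the problems, but it can help the student confirm that he has read each statement correctly. The following is a minimal sketch in Python, assuming exact rational arithmetic from the standard fractions module; the sample values and the name check_order_theorems are chosen here purely for illustration.

```python
from fractions import Fraction
from itertools import product

# A small, arbitrary sample of rational numbers (an assumption for illustration).
SAMPLES = [Fraction(n, d) for n in range(-4, 5) for d in (1, 2, 3)]

def check_order_theorems(samples=SAMPLES):
    """Return True if Theorems 1-4 to 1-11 hold for every sampled a, b, c."""
    for a, b, c in product(samples, repeat=3):
        if a > 0 and not (-a < 0):                    # Theorem 1-4
            return False
        if a < b and not (b - a > 0):                 # Theorem 1-5
            return False
        if b - a > 0 and not (a < b):                 # Theorem 1-6
            return False
        if a < b and not (-b < -a):                   # Theorem 1-7
            return False
        if b < 0 and not (-b > 0):                    # Theorem 1-8
            return False
        if a < b and c < 0 and not (a * c > b * c):   # Theorem 1-9
            return False
        if 0 < a < b and not (0 < 1 / b < 1 / a):     # Theorem 1-11
            return False
    return Fraction(1) > 0                            # Theorem 1-10

print(check_order_theorems())
```

A result of True says only that no counterexample was found among the samples; the theorems themselves still rest on the axioms.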


There is an entirely different way of looking at the order properties of
the real numbers, typified by Theorems 1-4, 1-5, and 1-6. This method
involves consideration of those real numbers which are positive, and leads
to an alternative set of axioms to characterize the order properties.

Alternative Order Axioms

There exists a set P of positive elements in the set of all real numbers such
that:
1. The Trichotomy Law. If a is any real number, then one and only one of
the following is true:
(a) a ∈ P,
(b) a = 0,
(c) -a ∈ P.
2. The Addition Law. If a and b are in P, then (a + b) ∈ P.

3. The Multiplication Law. If a and b are in P, then a • b ∈ P.

These two sets of possible axioms are related by the assertion that
a < b if and only if b — a is positive. Indeed, with this as the definition
it is not difficult to prove these last three laws from the first set or con­
versely to prove the first set from these three. As an example, we can
prove

Theorem 1-12. The transitive law is a consequence of the three alternative
order axioms.

Proof: Suppose a < b and b < c. This means that (b - a) and (c - b)
are positive. Then the second axiom tells us that

(b - a) + (c - b) = c - a

is positive, and hence a < c.


From a theoretical point of view, a mathematician who is interested
in abstract mathematics might prefer to use the second set of axioms to
specify the properties of order, but the first set is probably of more practical
use, and is a little easier to work with in the actual application of the order
properties.
The fact that these two sets of axioms are equivalent and that either
set can be used may be surprising to the student. He should realize that
axioms are not intrinsically determined, but are chosen to accomplish a
specific purpose.

PROBLEMS

1. Prove Theorems 1-4 through 1-11. Each may be used in the proof of any
subsequent ones. The proofs should not make use of the alternative order
axioms. The following hints may prove useful.
Theorem 1-4: a + (-a) = 0. Use Axiom 1.
Theorem 1-7: Use the previous theorems.
Theorem 1-9: Use Axiom 4 and Theorem 1-7.
Theorem 1-11: Prove in two parts. First prove that if a > 0, then a⁻¹ > 0.
Try applying Axiom 1 to a⁻¹.
2. Prove the alternative axioms from the first set. The truth of Theorems 1-3
through 1-11 may be assumed.
3. Prove the first set of order axioms from the alternative set. One has already
been done for you (Theorem 1-12). Be careful not to use anything that is not
given or has not yet been proved, starting from the alternative set.
4. If a < b, is a⁻¹ > b⁻¹? What if a < 0? What if b < 0?
5. Prove that the field of Problem 3 in the last section cannot be ordered.
[Hint: 0 and 1 are different numbers. Use the Trichotomy law, assuming that
the field can be ordered. Add 1 to each side of the assumed order relation.]

1-4 THE COMPLETENESS AXIOM

The field axioms and the order axioms of the last two sections still do
not characterize the real numbers completely. The set of all rational
numbers (which can be represented in the form p/q, where p and q ≠ 0
are integers) satisfies the field axioms and the order axioms, but not all
real numbers are rational numbers. This fact was known to the early
Greek mathematicians. Indeed, it is considered probable that the existence
of such irrational numbers was one of the closely guarded secrets of the
Pythagoreans.
To the Greek mathematicians, geometric facts were of primary im­
portance, and in their view, numbers were closely related to the lengths
of line segments. Let us try to see how this point of view operates. Imagine
a straight line (extended indefinitely in both directions) and mark two dis­
tinct points on this line. Label one of these points 0 and the other 1.
The line segment between these two points is taken as our unit length.
By the construction methods of euclidean geometry this unit length can
be laid out successively along the line to give us points we can label 2, 3,
and so on.
If q is any positive integer, other construction methods allowed by
euclidean geometry can be used to subdivide each of these segments of
unit length into q segments of equal length. Doing this, we in effect
construct a ruler having the initially given line segment as its unit distance,
and with marks located at a distance 1/q apart.
In this way we see that by the euclidean “straightedge and compass”
constructions we can find a line segment whose length is any (positive)
rational number.
But we can also construct a line segment of length √2. If a square is
erected with its base as the line segment between the points labeled 0
and 1, then by the Pythagorean theorem, the diagonal of this square is
of length √2. This length can be transferred to the given line (Fig. 1-1)
to yield a point which we can label √2. However, √2 is not a rational
number, as we will now prove.

Suppose, on the contrary, that there exist two integers p and q such
that √2 = p/q. Then 2 = p²/q², or

p² = 2q².

The square of any odd number must be an odd number. This follows
since any odd number is of the form 2k + 1, where k is an integer, and
its square is of the form 4k² + 4k + 1, which leaves a remainder of one
when divided by 2. Each of the numbers p and q contains some factors of
two so that
p = 2^n k,
q = 2^m j,
where n and m are nonnegative integers (either could be zero) and k and
j are odd numbers. Then p² = 2^(2n) k² and hence is an odd number times
an even number of factors of 2. However, p² = 2q² = 2 • 2^(2m) j² = 2^(2m+1) j²
is at the same time an odd number times an odd number of factors of 2,
which is impossible. The same number cannot contain both an even
and an odd number of factors of 2.
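The counting argument above can be watched at work on particular integers. The following Python sketch is an illustration only, with function names invented here; it counts the factors of 2 in p² and in 2q² over a finite range of positive integers and confirms that the first count is always even and the second always odd.

```python
def factors_of_two(n):
    """Count how many times 2 divides the positive integer n."""
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count

def parity_mismatch(p, q):
    """True when p*p has an even, and 2*q*q an odd, number of factors of 2."""
    return factors_of_two(p * p) % 2 == 0 and factors_of_two(2 * q * q) % 2 == 1

# Checked on a finite range only; the proof in the text covers all integers.
all_mismatch = all(parity_mismatch(p, q)
                   for p in range(1, 200) for q in range(1, 200))
print(all_mismatch)
```

Since the two counts never agree, no pair in the range can satisfy p² = 2q², which is exactly what the proof asserts in general.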
We see therefore that there are points on the “measuring line” that we
constructed which are not at a rational distance from the origin, or
equivalently, that there are real numbers which are not rational numbers. Thus
there must be still another property satisfied by the set of all real numbers
which we have not yet listed. The missing property is the completeness
property. This property can be explained intuitively as stating that if we
assign numbers to the points on a line as described above, then every
point on the line corresponds to a real number.
Before making a formal statement of the completeness axiom, we
should remark about the way in which irrational numbers are used. First,
consider √2. If you were asked, “What is the square root of two?”,
what would your answer be? If it is 1.414 or the like, you are wrong. The
only correct answer which can be given to this question is “The positive
real number whose square is two.” Any other must be wrong. The square of
1.414, for example, is 1.999396, which is not 2.
Of course the student who says that the square root of 2 is 1.414 might
argue that he meant 1.414 ..., the three dots indicating additional
decimal places which could be specified if necessary. This would be a
semicorrect answer (if given this completely). It would be a correct
answer if an explanation were also given of how the missing decimal
places could be determined.
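One way of explaining how the missing decimal places could be determined is sketched below in Python. It is only an illustration, and the function name is invented here: for each number of places it finds, by exact integer arithmetic, the largest decimal of that length whose square does not exceed 2. However many places are taken, the square of the result remains strictly less than 2.

```python
from fractions import Fraction
from math import isqrt

def truncated_root_two(places):
    """Largest decimal with `places` digits after the point whose square is <= 2."""
    scale = 10 ** places
    # isqrt returns the integer part of a square root, with no rounding error.
    return Fraction(isqrt(2 * scale * scale), scale)

for places in (3, 6, 9):
    x = truncated_root_two(places)
    print(places, float(x), x * x < 2)
```

With three places this reproduces 1.414, whose square is 1.999396 as computed in the text.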
As another illustration, what is the value of π? The numbers 22/7 or
3.1416 are commonly used in place of π, but they are of course not the
same thing. The number π is also an irrational number and its decimal
expansion goes on indefinitely. However, this is immaterial in actual
practice. Many handbooks list ten or twelve decimal places of the value
of π. How accurate is such a value? The circumference of a circle one
meter in diameter is determined to within 10⁻¹⁰ meter by using 10 decimal
places of π. This is one angstrom unit and is of the order of the distance
between atoms in ordinary matter. It would hardly be of any practical
concern to know the value of π any more accurately than this.
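The accuracy claim can be checked by direct arithmetic. The sketch below assumes a 20-place reference value of π typed in as a constant, and a helper named truncate invented for this illustration; it uses Python's standard decimal module so that no binary rounding enters the comparison.

```python
from decimal import Decimal

# First 20 decimal places of pi, entered as a reference constant (an assumption
# of this sketch; it is itself only an approximation to pi).
PI_20 = Decimal("3.14159265358979323846")

def truncate(value, places):
    """Drop every digit after the given number of decimal places."""
    step = Decimal(1).scaleb(-places)      # the decimal 10**(-places)
    return (value // step) * step

pi_10 = truncate(PI_20, 10)
error = PI_20 - pi_10                      # error in the circumference, in meters,
print(pi_10, error)                        # for a circle one meter in diameter
```

The error is positive and smaller than 10⁻¹⁰, in agreement with the statement above.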
However, no finite number of decimal places suffices to determine such
irrational numbers exactly. Even if the difference is too small to matter
in any practical situation, it is still there. With regard to rational
numbers, the situation is different. Many rational numbers also have
nonterminating decimal expansions (e.g., 1/3 = 0.33333...), but a group of
digits repeats in these expansions, and it is always possible to determine
the actual number being represented.
When we write π = 3.14159..., what do we mean? Isn’t it true that
we mean that the rational number 3.14 = 314/100 is smaller than the real
number π while 3.15 is larger than π? That 3.141 < π < 3.142, and so
on? Thus an infinite decimal expansion is nothing more than a sequence
of rational numbers. The longer the expansion is carried out, the closer is
its value to the desired number in terms of the decimal approximation
from below.
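The sequence of rational numbers described here can be written out explicitly. In the following Python sketch the digit string and the function name bracket are assumptions made for illustration; each call returns the rational number obtained by keeping a given number of decimal places, together with the next larger decimal of the same length.

```python
from fractions import Fraction
import math

PI_DIGITS = "3.14159265358979"   # leading digits of pi, used as a reference value

def bracket(places):
    """Return (lower, upper): the truncation to `places` decimals and the next one up."""
    lower = Fraction(PI_DIGITS[: 2 + places])
    upper = lower + Fraction(1, 10 ** places)
    return lower, upper

for places in (2, 3, 4):
    low, high = bracket(places)
    print(low, "<", math.pi, "<", high)
```

Here bracket(2) gives the pair 3.14 and 3.15 and bracket(3) gives 3.141 and 3.142, as in the text. The lower values never decrease as more places are kept, which is the sense in which the expansion approximates the number from below.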
With this as a background, we now give the final axiom for the real
number system. In order to do so, however, we must first define some of
the terms used in the axiom.

Definition 1-2. A set of real numbers A is said to be bounded above if
there exists a number m such that m ≥ x for every x in the set A.
The number m in this case is called an upper bound of the set A.

Definition 1-3. A number m is called the least upper bound of a set of
real numbers A if and only if
(1) m is an upper bound of the set A,
and
(2) if k is any real number which is less than m, then there is some
x in the set A such that x > k.

THE COMPLETENESS AXIOM: If A is any set of real numbers which is
bounded above, then there exists a real number m which is the least upper
bound of A.

We are not going to make formal use of this property of the real numbers
in this book. We list it here only because it is the last of the axioms for
the real number system. It so happens that the field axioms, the order
axioms, and this single completeness axiom characterize the real number
system completely. Students who go on to take advanced mathematics
courses will probably see a proof of this fact.
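Although the axiom is not used formally in this book, an example may make it more concrete. The sketch below is an illustration under stated assumptions and is not part of the axiom: it takes A to be the set of positive rational numbers whose square is less than 2, and it assumes that a rational number m is an upper bound of A exactly when m > 0 and m² ≥ 2. Repeated halving then traps the least upper bound between a member of A and an upper bound of A.

```python
from fractions import Fraction

def is_upper_bound(m):
    """Assumed test for A = {x rational : x > 0 and x*x < 2}."""
    return m > 0 and m * m >= 2

def squeeze(steps):
    """Halve the interval [low, high] `steps` times, keeping low in A and
    high an upper bound of A."""
    low, high = Fraction(1), Fraction(2)     # 1 is in A; 2 is an upper bound
    for _ in range(steps):
        mid = (low + high) / 2
        if is_upper_bound(mid):
            high = mid
        else:
            low = mid
    return low, high

low, high = squeeze(30)
print(float(low), float(high))
```

Every value of high is a rational upper bound of A, yet a smaller one is always found at a later step. No rational number is the least upper bound of this set; the completeness axiom guarantees that a real number is.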
In this book, we are only interested in the point of view, expressed
above, that this axiom requires a correspondence between all the points
on a line and the real numbers. Before leaving this section we would like
to emphasize this point and formalize the discussion of the coordinates on
a line.
When numbers were associated with the points on a line as described
above, we considered only the points to the “right” of the point labeled
zero. We can extend the labeling to the “left” of zero as well. Clearly,
such points must correspond to the negative real numbers. When this
is done, however, we find that we have a point on the line associated with
every real number.

Definition 1-4. A coordinate line (or coordinate axis) is a line together
with an association between the real numbers and the points on the
line so that each point corresponds to a unique real number, called the
coordinate of that point, and each real number corresponds to a unique
point. Furthermore, this assignment of coordinates to points must be
such that the distance between two points of the line (in terms of a
particular unit of distance) is the difference of the coordinates of these
points. The point associated with the real number zero is called the origin
of the coordinate line.

Note that there is a tacit assumption of the existence of a concept of
distance in the plane. Almost any suitable axiom scheme for euclidean
geometry will make such a distance concept available.
Suppose that instead of two points on the line to start with, we were
given a single point, which is to be the origin, and a unit distance (say
as the length of an entirely separate line segment). Then there are two
distinct ways in which the line can be turned into a coordinate line. There
are two points on the line a unit distance from the origin, and either of
these could be labeled 1 (the other would then be —1), giving us two
distinct coordinate lines with the same origin and the same unit of dis­
tance.

PROBLEMS

1. Give an example of a set of numbers which is not bounded above.


2. (a) Given A, a set of numbers which is bounded above, exactly what would
you mean by saying that the number m is the maximum of the set A?
(b) If m is the maximum of A, how would m be related to the least upper
bound of A ?
(c) Does every set A which is bounded above have a maximum?
3. A method used in many high school courses for approximating the square
root of a number a is as follows: let x be an approximation to the square
root of a. Then
y = (x² + a)/(2x)

is a closer approximation to √a. For the following problems, let a = 2.
(a) Show that if x is a rational number, then y is a rational number.
(b) Compute y² - 2 in terms of x.
(c) Show that if x > √2, then y > √2.
(d) Show that if x > √2, then
y² - 2 = (x² - 2)K,
where 0 < K < 1/4. What does this say about the closeness of the
approximation of y to √2? Can you improve this result if x is quite close to √2?
4. Among all numbers of the form p/q, where p and q are integers and 0 < q < 10,
which is the closest to √2?
5. Let a and b be two real numbers with a < b. Prove that if c = ½(a + b),
then a < c < b.
6. Using the results of Problem 5, prove that there is no largest negative number
(that is, that the set of negative numbers does not have a maximum). [Hint:
Assume that there is a maximum and arrive at a contradiction.]

1-5 ABSOLUTE VALUE

Let us open this section with a simple question:

Let a be a real number. Is -a positive or negative?
The student should answer this question for himself before reading on.
Many students will answer it incorrectly unless they pause to think about
it a moment.
Did you say, or was your first thought, that -a is negative? This is
the usual incorrect answer. What makes it negative? Where was it said
that a is positive? If a is a negative number, then -a must be positive.
Actually of course this is an improper question. As worded, the correct
answer would have to be that -a may be positive, negative, or neither
(the last if a = 0).
The point here is that after years of experience in seeing numbers of the
type -2, -3, and recognizing them as negative, many students gain a
deep-rooted feeling that a negative sign indicates a negative number. Yet
this is true only if the quantity behind the negative sign is positive.
A negative sign does not indicate that a number is negative. Some of the
difficulty comes from the common habit of reading -a as “negative a.”
This is wrong. It should be read as “the negative of a.” This small
distinction is quite important.
Now, if the student has the above comments firmly in mind, we can
proceed to give a definition of the absolute value.

Definition 1-5. Let a be a real number. Then the absolute value of a
is the real number |a| defined by
|a| = a if a ≥ 0
    = -a if a < 0.
Various other explanations of this same property are possible. For
example, we could define
|a| = √(a²),

where the square root symbol is understood (as always) to mean the
positive square root. This definition is, however, less elementary and
more difficult to work with.
The absolute value of a real number is frequently described as the
“nonnegative magnitude of the number.” This phrase is descriptive but
too vague for use as a definition. It is sometimes also described as “the
distance of the point having that coordinate from the origin on a coordinate
line.” Again, this is a descriptive phrase that leaves much to be desired
in terms of usability.

Another possible definition would be: the number |a| is the maximum of
the two numbers a and -a. This definition is as good as the one given
above. In fact, it is probably better from a theoretical point of view.
However, as will be seen, the use of the definition given is probably
easier. Proofs of results are not so short as they could be, but are easier
to discover, since Definition 1-5 clearly calls for the breaking down of the
problem into the various possible cases.
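The three candidate definitions discussed above can be compared side by side. The following Python sketch uses function names invented for this illustration; the first function follows Definition 1-5 case by case, the second takes the maximum of a and -a, and the third uses the square root, which in floating-point arithmetic is only approximate in general.

```python
from math import sqrt

def abs_by_cases(a):
    """Definition 1-5: a if a >= 0, otherwise -a."""
    return a if a >= 0 else -a

def abs_by_max(a):
    """The larger of the two numbers a and -a."""
    return max(a, -a)

def abs_by_root(a):
    """The nonnegative square root of a*a (a floating-point value)."""
    return sqrt(a * a)

for a in (-3, 0, 5):
    print(a, abs_by_cases(a), abs_by_max(a), abs_by_root(a))
```

The case-by-case form shows at a glance which branch applies, which is the practical advantage claimed for Definition 1-5.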
Let us list a few simple properties of the absolute value which can be
proved directly from the definition.

Theorem 1-13. For any real number a, |a| ≥ 0. |a| = 0 if and only if
a = 0.

Proof: There are actually three separate results to be proved here. First,
we must prove that if a is any real number, then |a| ≥ 0. This result is
obvious upon examination of the two possible cases in the definition.
Next, the theorem makes an “if and only if” statement. To prove this
we must prove the two separate statements:

(1) |a| = 0 if a = 0.
(2) |a| = 0 only if a = 0.

Here, however, statement (1) is equivalent to the statement:

(1') If a = 0, then |a| = 0,

and this too is obvious from the definition.
Statement (2) is equivalent to:
(2') If |a| = 0, then a = 0.
We can prove this by considering the various possible cases.
Suppose |a| = 0. Then there are only three possibilities: a > 0, a < 0,
or a = 0. If a > 0, then |a| = a > 0, which violates our supposition.
If a < 0, then |a| = -a > 0, which also violates the assumption. Hence
a = 0, since this is the only remaining possibility.

Theorem 1-14. For any real number a,

-|a| ≤ a ≤ |a|.

Proof: If a ≥ 0, then |a| = a; therefore the conclusion of the theorem
is true. If a < 0, then |a| > 0, and hence -|a| = a < 0 < |a| (see
Theorems 1-4 and 1-8).

Theorem 1-15. For any real numbers a and b,

|ab| = |a| • |b|.

Proof: This can be proved by the most direct method possible, that is,
by considering the four possible cases.

Case I. a ≥ 0, b ≥ 0. Then ab ≥ 0 and |a| = a, |b| = b, |ab| = ab;
hence the theorem is true.
Case II. a ≥ 0, b < 0. Then ab ≤ 0 and |a| = a, |b| = -b, |ab| =
-ab. But in this case, |a| • |b| = a(-b) = -ab = |ab|, and hence the
theorem is again true.

The remaining two cases are left as an exercise.

Theorem 1-16. If a² < b², then |a| < |b|.

Proof: Suppose a² < b². Then b² - a² > 0. However, we know
(Problem 1) that b² = |b|² and a² = |a|², and therefore |b|² - |a|² > 0.
The expression on the left-hand side of this inequality can be factored to
give
(|b| - |a|) • (|b| + |a|) > 0.
Now |b| + |a| > 0 (why?), and hence the first factor cannot be negative
or zero. Therefore |b| - |a| > 0, which is equivalent to the conclusion
of the theorem.
These results demonstrate how the definition can be used to give rigorous
proofs for the properties of the absolute value.
There is one more theorem which we wish to give in this section. This
is a major result which will be used throughout the rest of the student’s
study of mathematics. In fact, it is the cornerstone of most proofs in
calculus, and probably deserves to be called the fundamental theorem of
analysis (although it never is). This result is the triangle inequality. The
reason for this name is not obvious here, but will appear when we study
vectors.

Theorem 1-17. (The Triangle Inequality) For any real numbers a and b,
|a + b| ≤ |a| + |b|.
Proof: We note that

|a + b|² = (a + b)²
= a² + 2ab + b²
= |a|² + 2ab + |b|².

Using the fact that ab ≤ |ab| = |a| |b|, we therefore have

|a + b|² ≤ |a|² + 2|a| |b| + |b|²
= (|a| + |b|)².

Theorem 1-16 then shows that this result implies the conclusion of the
theorem. To see this, observe that what we have actually proved is

(|a + b|)² ≤ (|a| + |b|)².

Putting this into Theorem 1-16 as a hypothesis gives us as a conclusion

|(|a + b|)| ≤ |(|a| + |b|)|.

(Theorem 1-16 is stated for strict inequality; in the case of equality the
same factoring shows that the two absolute values are equal.)
However, |(|a + b|)| = |a + b| and |(|a| + |b|)| = |a| + |b| (why?),
which shows that we have completed the proof of the theorem.
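The inequality is easily tested on particular numbers. The Python sketch below is an illustration only; the function name triangle_gap and the sample values are chosen here. It computes the amount by which |a| + |b| exceeds |a + b| and confirms that this amount is never negative on the sample.

```python
from fractions import Fraction
from itertools import product

def triangle_gap(a, b):
    """Return |a| + |b| - |a + b|, which Theorem 1-17 says is never negative."""
    return abs(a) + abs(b) - abs(a + b)

# An arbitrary sample of rational numbers, chosen for illustration.
SAMPLES = [Fraction(n, 3) for n in range(-9, 10)]
gaps = [triangle_gap(a, b) for a, b in product(SAMPLES, repeat=2)]

print(min(gaps) >= 0)
print(triangle_gap(1, 1), triangle_gap(1, -1))
```

The pair a = 1, b = 1 gives a gap of zero, while a = 1, b = -1 gives a gap of 2, so both equality and strict inequality can occur.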
We should remark on the usage of notations exhibited in this proof.
When several lines are shown linked by equalities or inequalities we are
to think of the expressions as being written sequentially on one line. Thus,
for example, the lines
|a + b|² = |a|² + 2ab + |b|²
≤ |a|² + 2|a| |b| + |b|²
= (|a| + |b|)²
would be read

|a + b|² = |a|² + 2ab + |b|² ≤ |a|² + 2|a| |b| + |b|² = (|a| + |b|)²,

from which we conclude that |a + b|² ≤ (|a| + |b|)².


In similar displays, some writers prefer to think of the top left-hand
expression as being understood as the left-hand member of each line. To
do so in the above display, the last equality would have to be replaced
by the previous inequality. This, however, would interfere with the
clarity of the relationship between successive lines.

Theorem 1-18. Let p be some positive number. Then |a| < p if and
only if -p < a < p.

Proof: Here again we have an “if and only if” theorem and hence must
prove two results.
First, suppose that |a| < p. We wish to prove that this implies that
-p < a < p. There are two possible cases. If a ≥ 0, then a = |a| < p,
and since at the same time we have -p < 0 ≤ a, these two facts together
give us -p < a < p. On the other hand, if a < 0, then we have
-a = |a| < p. From this we conclude a > -p. Putting this together
with a < 0 < p gives the desired result -p < a < p.
Next, to prove the “if” part of the theorem, let us suppose that
-p < a < p. We wish to prove from this assumption that |a| < p.
However, if a ≥ 0, then |a| = a < p, while if a < 0, then |a| =
-a < p (since -p < a implies that -a < p). Thus, in either case,
we have the desired result.
An example of the use of this theorem might be of interest. Suppose
that we are given that x satisfies |2x - 1| < 9. From the above theorem
we can conclude that -9 < 2x - 1 < 9. What is more, since the theorem
said “if and only if,” the set of all x which satisfy this last relation is the
same as the set satisfying the first. Since we can add the same amount to
both sides of an inequality, the last relation is equivalent to -8 < 2x < 10,
which in turn is equivalent to -4 < x < 5. Thus we see that

{x | |2x - 1| < 9} = {x | -4 < x < 5}.
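The steps of this example follow a fixed pattern, which can be sketched in Python. The function name solve_abs_less and its parameters are invented for this illustration, and the sketch assumes a positive coefficient m of x and a positive bound p; other cases would need separate treatment.

```python
from fractions import Fraction

def solve_abs_less(m, c, p):
    """Return (u, v) such that |m*x + c| < p exactly when u < x < v.

    Assumes m > 0 and p > 0 (an assumption of this sketch).
    """
    if m <= 0 or p <= 0:
        raise ValueError("this sketch assumes m > 0 and p > 0")
    m, c, p = Fraction(m), Fraction(c), Fraction(p)
    # Theorem 1-18: |m*x + c| < p  if and only if  -p < m*x + c < p.
    # Add -c throughout, then multiply by the positive number 1/m.
    return (-p - c) / m, (p - c) / m

u, v = solve_abs_less(2, -1, 9)
print(u, v)
```

For the example in the text, m = 2, c = -1, and p = 9, and the sketch returns the endpoints -4 and 5.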

PROBLEMS

1. Prove from the definition of the absolute value that |a|² = a² for any real
number a.
2. Complete the proof of Theorem 1-15.
3. Prove the converse of Theorem 1-16. That is, prove that if |a| < |b|, then
a² < b².
4. In the proof of the triangle inequality, the inequality occurs at only one
point. Under what conditions will |a + b| = |a| + |b|?
5. Prove the triangle inequality directly by considering the four cases:
Case I. a > 0 and b > 0;
Case II. a > 0, b < 0, and a + b > 0;
Case III. a > 0, b < 0, and a + b < 0;
Case IV. a < 0 and b < 0.
Why is it sufficient to consider only these cases?
6. Prove that |x| > b if and only if x < -b or x > b.
7. Find numbers u and v for each of the following parts such that the given set
is equal to {x | u < x < v}. Show these sets on a coordinate line.
(a) {x | |x - 2| < 7} (b) {x | |x - 5| < |}
(c) {x | |x + 1| < f} (d) {x | |3x - 5| < 5}
(e) {x | |5x + 3| < 8}
8. Show each of the following sets on a coordinate line.
(a) {x | |x - 3| > 5} (b) {x | |x + 4| > 4} (c) {x | |2x - || > f}

1-6 DETERMINANTS

In this section determinants of the second and third order will be dis­
cussed. Such determinants will be useful at various places in the re­
mainder of the text in writing certain formulas in especially compact and
easily remembered form. No attempt will be made to prove the existence
of determinants of other orders or to give a rigorous discussion of de­
terminants in general. At a later stage, when the required concepts have
become familiar, it will be easy to come back and make a complete study
of determinants. For our present needs, the discussion in this section will
be enough.

Definition 1-6. A determinant is a real-valued function of a square array
of numbers. The value of a determinant of order two is defined as

\[
\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} = a_1 b_2 - a_2 b_1,
\]

and the value of a determinant of order three is defined as

\[
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= a_1 b_2 c_3 + a_2 b_3 c_1 + a_3 b_1 c_2 - a_3 b_2 c_1 - a_2 b_1 c_3 - a_1 b_3 c_2.
\]

The important thing to note about this definition is that in the expan­
sion of the determinant, each term contains exactly one factor from each
row and exactly one from each column. Furthermore, there is exactly one
term for each possible combination. Thus the determinant of order three
has six terms in its expansion, since there are three ways of choosing an ele­
ment from the first row, and when this has been done, there remain two
ways of choosing an element from the second row which is not in the
column already used. After these two elements have been chosen, there
is only one element in the third row which can be used.
A similar situation holds for determinants of higher orders. Thus, for
example, the expansion of a determinant of order four will contain twenty-
four terms (why?). The only difficulty in defining determinants of ar­
bitrary orders is in giving a rule for the determination of the sign to be
attached to each term.
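The definition and the counting argument can both be tried out directly. In the Python sketch below, the function names are invented for illustration and each row of the array is passed as a tuple; det2 and det3 copy the expansions of Definition 1-6 term by term, and count_terms lists every way of choosing one element from each row with no column used twice.

```python
from itertools import permutations

def det2(a, b):
    """Order two: rows a = (a1, a2) and b = (b1, b2)."""
    return a[0] * b[1] - a[1] * b[0]

def det3(a, b, c):
    """Order three: the six-term expansion of Definition 1-6."""
    return (a[0] * b[1] * c[2] + a[1] * b[2] * c[0] + a[2] * b[0] * c[1]
            - a[2] * b[1] * c[0] - a[1] * b[0] * c[2] - a[0] * b[2] * c[1])

def count_terms(n):
    """Count the choices of one column per row, no column repeated."""
    return sum(1 for _ in permutations(range(n)))

print(det2((1, 2), (3, 4)))
print(det3((1, 2, 3), (4, 5, 6), (7, 8, 10)))
print(count_terms(3), count_terms(4))
```

The sketch says nothing about the sign attached to each term for higher orders, which, as noted above, is where the real difficulty lies.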
If we group the terms in the expansion of the determinant of order three
properly and factor out $a_1$, $a_2$, and $a_3$, we find

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1(b_2 c_3 - b_3 c_2) - a_2(b_1 c_3 - b_3 c_1) + a_3(b_1 c_2 - b_2 c_1).$$

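Comparing the terms in parentheses with the definition of a determinant
of order two, we see that we have proved:

Theorem 1-19.

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}.$$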

Note that the determinants of order two on the right-hand side of the
equality in this theorem are obtained from the original determinant by
deleting the top row and one of the columns. In fact, we delete the row
and the column which contain the factor $a_1$, $a_2$, or $a_3$ which we then use
to multiply the resulting smaller determinant. The fact that the middle
term in this expansion takes a negative sign is something which must be
remembered.
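The expansion along the top row can be compared numerically with the six-term definition. The Python sketch below does this for one sample array; the function names and the sample values are illustrative assumptions.

```python
def det2(p, q, r, s):
    """Value of the order-two determinant with rows (p, q) and (r, s)."""
    return p * s - q * r


def det3_by_definition(m):
    """Six-term expansion of an order-three determinant."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * b2 * c3 + a2 * b3 * c1 + a3 * b1 * c2
            - a3 * b2 * c1 - a2 * b1 * c3 - a1 * b3 * c2)


def det3_by_top_row(m):
    """Expansion along the top row; note the negative sign on the middle term."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * det2(b2, b3, c2, c3)
            - a2 * det2(b1, b3, c1, c3)
            + a3 * det2(b1, b2, c1, c2))


sample = [[2, 0, 1], [1, 3, -1], [4, 1, 5]]
print(det3_by_definition(sample), det3_by_top_row(sample))  # 21 21
```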
We may now proceed to prove a number of properties of determinants.
We will state these results as theorems without reference to the order of
the determinant since they are actually true for determinants of any order.
The proofs given here, however, apply only to the cases of order two and
three.

Theorem 1-20. If the rows and columns are interchanged in a determinant,
the value of the determinant is unchanged.

Remarks. For the case of order three, this says, for example, that

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}.$$

This result can be proved by direct expansion of each of the determinants
involved. This proof is trivial for a determinant of order two and not much
more difficult in the case of order three. The student is invited to give the
proofs as one of the problems at the end of this section.
The square array of numbers which results after interchanging the rows
and columns in this way is usually called the transpose of the original array.
Theorem 1-20 could thus be restated in the form: the determinant of an
array is equal to the determinant of the transpose of that array.

Theorem 1-21. If in a determinant two rows (or columns) are interchanged,
the value of the determinant is changed in sign.

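Remarks. In the case of a determinant of order three, an example of
this fact is

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = - \begin{vmatrix} a_1 & a_2 & a_3 \\ c_1 & c_2 & c_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.$$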

It suffices to prove this theorem for either rows or columns. The remain­
der of the result would then follow by using Theorem 1-20. Thus, for ex­
ample, if Theorem 1-21 had been proved for columns, its proof for rows
would proceed as below:
Let A be a square array of numbers and let A' be the transpose of this
array. Let B be the array which results from the interchange of two rows
of A, and let B' be the transpose of B. Then it is clear that we can obtain
B' also by interchanging two columns of A'. Hence, writing |A| for the
determinant of A, we have

$$|B| = |B'| = -|A'| = -|A|.$$

To prove this theorem for the interchange of columns of a determinant
of order two is trivial. In the case of a determinant of order three, we can
use the decomposition given in Theorem 1-19 to help us. The student
should try writing out this decomposition for cases where the first and
second columns have been interchanged and when the first and last
columns have been interchanged to see how the proof may be accomplished.

Theorem 1-22. If two rows (columns) in a determinant are identical,
the value of the determinant is zero.

Proof: Merely apply the previous theorem. If the two identical rows
are interchanged, then on the one hand the value of the determinant is un­
changed and on the other hand is changed in sign. The only number which
is its own negative is zero.
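Theorems 1-20, 1-21, and 1-22 can be illustrated numerically on a single array. The Python sketch below is a check on one example, not a proof; the helper names `det3` and `transpose` are illustrative assumptions.

```python
def det3(m):
    """Order-three determinant by the six-term definition."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * b2 * c3 + a2 * b3 * c1 + a3 * b1 * c2
            - a3 * b2 * c1 - a2 * b1 * c3 - a1 * b3 * c2)


def transpose(m):
    """Interchange the rows and columns of a square array."""
    return [list(column) for column in zip(*m)]


A = [[1, 3, -7], [2, 4, 5], [-3, 1, 15]]
swapped = [A[1], A[0], A[2]]    # first and second rows interchanged
repeated = [A[0], A[0], A[2]]   # two identical rows

print(det3(A), det3(transpose(A)))  # Theorem 1-20: equal values
print(det3(swapped))                # Theorem 1-21: sign changed
print(det3(repeated))               # Theorem 1-22: zero
```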

Theorem 1-23. If all of the entries in a row (column) of a determinant
are multiplied by a constant k, then the value of the determinant is also
multiplied by this constant.

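Proof: This says, for example, that

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ kb_1 & kb_2 & kb_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = k \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix},$$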

allowing us to factor a constant out of a row or a column. The proof is
obviously trivial if the constant k multiplies the elements of the first row
and if we look at the decomposition given by Theorem 1-19. To prove it
for another row, we may use Theorem 1-21 as follows. Suppose we are
given a square array of numbers A. Let B be the square array which results
when the ith row of A is multiplied by k (i ≠ 1). Let A* be the square
array obtained by interchanging the first and the ith rows of A, and let
B* be the array resulting from the interchange of the first and ith rows
of B. Then B* can be obtained from A* by multiplying the top row of
A* by k. Therefore,

$$|B| = -|B^*| = -k|A^*| = k|A|.$$

The proof for columns is obtained by using Theorem 1-20 in a similar
manner.

Theorem 1-24. Let two determinants of the same order be identical
except in one given row (column). Then the sum of the values of the
determinants is the value of the determinant with the common rows
(columns) and the sum of the corresponding elements in the remaining
row (column).

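Proof: An example may help to make this clearer. This theorem asserts
that

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 + d_1 & b_2 + d_2 & b_3 + d_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} + \begin{vmatrix} a_1 & a_2 & a_3 \\ d_1 & d_2 & d_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.$$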

The proof (using Theorem 1-19) is again quite simple if it is the top
row which is different in the two determinants. It may then be done for
an arbitrary row by interchanging rows just as in the proof of Theorem
1-23, and then for columns by using Theorem 1-20.
Actually, this theorem depends only on the fact that in the expansion
of the determinant, each term contains one and only one (linear) factor
from each row and column.
The next property of determinants is of considerable value in practice,
since it yields a method for the simplification of determinants.

Theorem 1-25. In a given determinant, a constant multiple of the
elements in one row (column) may be added to the elements of another
row (column) without changing the value of the determinant.

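Proof: That is, for example,

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 + ka_1 & b_2 + ka_2 & b_3 + ka_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.$$

The proof follows from the previous theorem. The value of the determinant
on the left-hand side of the above is equal to

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} + \begin{vmatrix} a_1 & a_2 & a_3 \\ ka_1 & ka_2 & ka_3 \\ c_1 & c_2 & c_3 \end{vmatrix}.$$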
By Theorem 1-23, the constant k in the second row of the last determinant
can be factored out, and then Theorem 1-22 shows that the value of this
determinant is zero.
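Theorems 1-23, 1-24, and 1-25 can likewise be checked on a sample array. In the Python sketch below the rows `a`, `b`, `c`, the extra row `d`, and the constant `k` are arbitrary illustrative values, and `det3` is an assumed helper name.

```python
def det3(m):
    """Order-three determinant by the six-term definition."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * b2 * c3 + a2 * b3 * c1 + a3 * b1 * c2
            - a3 * b2 * c1 - a2 * b1 * c3 - a1 * b3 * c2)


a, b, c = [1, 3, -7], [2, 4, 5], [-3, 1, 15]
d = [4, 0, -1]
k = 3

scaled = [a, [k * x for x in b], c]                     # second row multiplied by k
summed = [a, [x + y for x, y in zip(b, d)], c]          # second row is b + d
sheared = [a, [x + k * y for x, y in zip(b, a)], c]     # k times row a added to row b

print(det3(scaled) == k * det3([a, b, c]))                     # Theorem 1-23
print(det3(summed) == det3([a, b, c]) + det3([a, d, c]))       # Theorem 1-24
print(det3(sheared) == det3([a, b, c]))                        # Theorem 1-25
```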
The final result of this section is of both theoretical and practical im­
portance. In fact, it could be used to define higher-order determinants.

Definition 1-7. The minor of an element in a determinant is the determinant
of lower order which results from the deletion of the row
and the column containing that element.

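For example, the minor of the element $b_3$ in the determinant

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$$

is the determinant

$$\begin{vmatrix} a_1 & a_2 \\ c_1 & c_2 \end{vmatrix}.$$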

Definition 1-8. The cofactor of an element in a determinant is $(-1)^s$
times the value of the minor of that element, where s = i + j, the
given element being in the ith row (counting from the top) and the
jth column (counting from the left).

Thus, for example, the cofactor of the element $b_3$ in the above example
is

$$(-1)^{2+3} \begin{vmatrix} a_1 & a_2 \\ c_1 & c_2 \end{vmatrix}.$$

The factor $(-1)^s$ is +1 for the element in the upper left-hand corner and
is alternately +1 and -1 in a checkerboard pattern throughout the determinant.
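The two definitions translate into a few lines of code. The Python sketch below handles only arrays of order three, counts rows and columns from 1 as the text does, and uses the assumed names `minor` and `cofactor`.

```python
def minor(m, i, j):
    """Order-two determinant left after deleting row i and column j (both counted from 1)."""
    kept_rows = [row for r, row in enumerate(m, start=1) if r != i]
    sub = [[x for col, x in enumerate(row, start=1) if col != j] for row in kept_rows]
    (p, q), (u, v) = sub
    return p * v - q * u


def cofactor(m, i, j):
    """The minor multiplied by (-1)**(i + j)."""
    return (-1) ** (i + j) * minor(m, i, j)


A = [[1, 3, -7], [2, 4, 5], [-3, 1, 15]]
print(minor(A, 2, 3))     # 10, from the rows (1, 3) and (-3, 1)
print(cofactor(A, 2, 3))  # -10

# The checkerboard of signs for order three.
print([[(-1) ** (i + j) for j in (1, 2, 3)] for i in (1, 2, 3)])
```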

Theorem 1-26. The value of a determinant is equal to the sum of the
products of the elements of a given row (column) and their cofactors.

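Proof: Two examples of this expansion are

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}$$

$$= -a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + b_2 \begin{vmatrix} a_1 & a_3 \\ c_1 & c_3 \end{vmatrix} - c_2 \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}.$$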

The result of this theorem applied to the top row (the first line in the
above example) is exactly the statement of Theorem 1-19. We can prove
it for the second row by interchanging rows one and two. This changes
the sign of the determinant and each cofactor in the expansion will have
its sign changed. To prove it for the third row (having proved it for the
second), interchange rows two and three.
Note how simple this result is if all of the elements except one in a given
row (or column) are zero. All determinants can be reduced to this form
by the application of Theorem 1-25. Thus, for example,

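$$\begin{vmatrix} 1 & 3 & -7 \\ 2 & 4 & 5 \\ -3 & 1 & 15 \end{vmatrix} = \begin{vmatrix} 1 & 3 & -7 \\ 0 & -2 & 19 \\ -3 & 1 & 15 \end{vmatrix} = \begin{vmatrix} 1 & 3 & -7 \\ 0 & -2 & 19 \\ 0 & 10 & -6 \end{vmatrix}$$

$$= \begin{vmatrix} -2 & 19 \\ 10 & -6 \end{vmatrix} = \begin{vmatrix} -2 & 19 \\ 0 & 89 \end{vmatrix} = -178.$$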

Here, in the first step, the top row was multiplied by 2 and subtracted
from the second row. Next, the first row was multiplied by 3 and added
to the third row (in practice these two steps could be done simultaneously
to save writing). Then the determinant was expanded by the cofactors
of the first column, only one term appearing because of the two zeros.
The resulting second-order determinant could be expanded directly, or,
as was done, the top row multiplied by 5 and added to the second row to
give a simpler expansion.
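The reduction just described can be replayed step by step. The Python sketch below follows the same row operations; the helper name `add_multiple` and the zero-based row indices are illustrative assumptions.

```python
def add_multiple(m, src, dst, k):
    """Return a copy of m with k times row src added to row dst (rows counted from 0)."""
    out = [row[:] for row in m]
    out[dst] = [x + k * y for x, y in zip(m[dst], m[src])]
    return out


A = [[1, 3, -7], [2, 4, 5], [-3, 1, 15]]
step1 = add_multiple(A, 0, 1, -2)      # subtract 2 times the top row from the second row
step2 = add_multiple(step1, 0, 2, 3)   # add 3 times the top row to the third row

# Expand by the cofactors of the first column: only the top entry is nonzero.
small = [row[1:] for row in step2[1:]]
small2 = add_multiple(small, 0, 1, 5)  # add 5 times the top row to the second row

value = step2[0][0] * (small2[0][0] * small2[1][1] - small2[0][1] * small2[1][0])
print(step2)
print(small2)
print(value)  # -178
```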

At some future time the student will see a proper treatment of deter­
minants of an arbitrary order. For the present, let us just state that the
theorems listed above are true for determinants of any order. In par­
ticular, the theorem on expansion by cofactors could be used to define
a determinant of nth order in terms of determinants of (n - 1)th order.
With this type of definition, these theorems could all be proved. However,
Theorem 1-20 would be quite difficult (and this is the main reason that
we prefer to defer a complete discussion of determinants of an arbitrary
order).
If a higher-order determinant must be evaluated, use the methods dis­
cussed above to reduce the order. Successive reductions of order will
eventually lead to a second- or third-order determinant which can be
evaluated.
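The idea of defining an nth-order determinant through expansion by cofactors can be sketched as a recursive procedure. The Python function below expands along the top row and stops at order two, where the definition of this section applies; the name `det` and the test arrays are illustrative assumptions, and no claim is made that this is an efficient method.

```python
def det(m):
    """Determinant of a square array of order two or more, by top-row cofactor expansion."""
    n = len(m)
    if n == 2:
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]
    total = 0
    for j in range(n):
        # Delete the top row and column j (counted from 0) to form the minor.
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(sub)
    return total


print(det([[1, 3, -7], [2, 4, 5], [-3, 1, 15]]))                          # -178
print(det([[2, 0, 0, 0], [0, 3, 0, 0], [0, 0, 4, 0], [0, 0, 0, 5]]))      # 120
```

Counting columns from 0 makes the sign of the jth term `(-1) ** j`, which agrees with the checkerboard pattern for the top row.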

PROBLEMS

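1. Evaluate each of the following determinants, first by the definition, and then
by the reduction technique illustrated at the end of this section.
(a) $\begin{vmatrix} 2 & -3 & 5 \\ 1 & 6 & 2 \\ 6 & 2 & -5 \end{vmatrix}$
(b) $\begin{vmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{vmatrix}$
(c) $\begin{vmatrix} 3 & 4 & 5 \\ 4 & 5 & 6 \\ 5 & 6 & 7 \end{vmatrix}$
(d) $\begin{vmatrix} 15 & 30 & -15 \\ 30 & -80 & 70 \\ 28 & 14 & -35 \end{vmatrix}$
2. Evaluate the following determinants:
(a) $\begin{vmatrix} 1 & 5 & -7 \\ 2 & 10 & 1 \\ 3 & 16 & 150 \end{vmatrix}$
(b) $\begin{vmatrix} 3 & -1 & 1 \\ 0 & 5 & -1 \\ 2 & 18 & -3 \end{vmatrix}$
(c) $\begin{vmatrix} 7 & 1 & -2 \\ 5 & 1 & 3 \\ 1 & -1 & -18 \end{vmatrix}$
(d) $\begin{vmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 3 & 4 & 5 \end{vmatrix}$
3. For what values of x do the following determinants have a value of zero?
(a) $\begin{vmatrix} 3 & -1 & 2 \\ x & 5 & 0 \\ -6 & 2 & x \end{vmatrix}$
(b) $\begin{vmatrix} 1 & 5 & -1 \\ 1 & 3 & x \\ 1 & x & 3 \end{vmatrix}$
(c) $\begin{vmatrix} -x & 2 & 1 \\ 4 & 1 - x & 0 \\ 4 & -2 & 3 - x \end{vmatrix}$
(d) $\begin{vmatrix} x & 2x & -x \\ 1 & 0 & 3 \\ 5 & -1 & x \end{vmatrix}$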
4. Prove Theorem 1-20 for determinants of order three.

5. Prove Theorem 1-21 for the interchange of the first and second columns.
6. Prove Theorem 1-21 for the interchange of the first and third columns.
7. Expand the determinants in Problem 1 by the cofactors of the second column.
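8. Define a fourth-order determinant by

$$\begin{vmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \\ d_1 & d_2 & d_3 & d_4 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 & b_4 \\ c_2 & c_3 & c_4 \\ d_2 & d_3 & d_4 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 & b_4 \\ c_1 & c_3 & c_4 \\ d_1 & d_3 & d_4 \end{vmatrix}$$

$$+ a_3 \begin{vmatrix} b_1 & b_2 & b_4 \\ c_1 & c_2 & c_4 \\ d_1 & d_2 & d_4 \end{vmatrix} - a_4 \begin{vmatrix} b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \\ d_1 & d_2 & d_3 \end{vmatrix}.$$

Prove Theorem 1-21 (for columns) for a fourth-order determinant.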
9. Using the reduction method illustrated in the text, evaluate

$$\begin{vmatrix} 3 & 14 & 4 & 10 \\ 1 & -6 & -2 & -5 \\ -1 & 3 & 1 & 0 \\ 0 & 10 & 3 & 5 \end{vmatrix}.$$
2
Analytic Geometry
and Trigonometry
2-1 THE CARTESIAN PLANE

Suppose that we are given a unit of distance in the plane. This unit of
distance can be used on any line in the plane to turn that line into a
coordinate line. Thus we can measure the distance between any two
points of the plane by supposing that a line has been drawn which passes
through these two points and then using the unit of distance to measure
the distance.

Figure 2-1

Suppose also that we are given two straight lines which intersect at right
angles in the plane. We arbitrarily assign a sense of direction to each, and
make each into a coordinate axis by making the point of intersection of
the two lines the origin of the coordinates on each. One line is called the
x-axis and the other the y-axis. While this can be done in a completely
arbitrary fashion, we make a conventional choice of the orientation of
these two axes for the purposes of illustration. This choice is as shown
in Fig. 2-1. The x-axis is horizontal, with the positive direction being to
the right. The y-axis is vertical, the positive direction being upward.
The plane of these two lines is called the cartesian plane. If a point P
is given in the plane, unique lines parallel to the two axes can be drawn
through this point. These lines will intersect the axes at a pair of well-determined
points. Let the point at which the line parallel to the y-axis

cuts the x-axis have coordinate $x_1$ (on the x-axis) and let the corresponding
point on the y-axis have coordinate $y_1$. Then the pair of numbers
$(x_1, y_1)$ are called the coordinates of the point P. The coordinates are
written in the order shown. The first number, $x_1$ in this case, is called the
x-coordinate of the point, and the second is called the y-coordinate of the
point.
Suppose conversely that an ordered pair of real numbers is given. The
first will determine a unique point on the x-axis and the second a unique
point on the y-axis. Through these points, perpendiculars can be drawn
to the corresponding axes. These will intersect at a unique point.
In other words, each point in the cartesian plane determines a unique
ordered pair of real numbers, and each ordered pair of real numbers determines
a unique point. There is thus a one-to-one correspondence between
the points and the pairs of real numbers. Thus, by common usage
we will often speak of the point $(x_1, y_1)$, meaning the point with coordinates
$(x_1, y_1)$.
The particular point (0, 0) at which the coordinate axes intersect is
called the origin of the coordinates, and hence also the origin of the plane.
In the next chapter, careful definitions will be given, and we will be able
to prove many things while knowing exactly the foundation on which we
are building. Here, however, we will assume a knowledge of the geometry
of the plane upon which we superimpose the cartesian plane. One of the
things we assume is the Pythagorean theorem. This allows us to de­
termine the distance between two points of the cartesian plane.

Figure 2-2

Suppose points $P_1$ and $P_2$ are given in the plane with coordinates
$(x_1, y_1)$ and $(x_2, y_2)$ respectively. Let the lines through $P_1$ and $P_2$ parallel
to the x- and y-axes intersect at C, as shown in Fig. 2-2. Then the triangle
whose vertices are $P_1$, $P_2$, and C is a right triangle with its right angle at C.
The length of the side $P_1C$ is $|x_1 - x_2|$ (as measured on the x-axis) and
the side $P_2C$ has length $|y_1 - y_2|$. From the Pythagorean theorem, we
see that the distance between $P_1$ and $P_2$ is

$$[(x_1 - x_2)^2 + (y_1 - y_2)^2]^{1/2}.$$



Definition 2-1. The distance between two points $P_1$ and $P_2$ in the
cartesian plane is denoted by $|P_1P_2|$.

Thus we have shown that if $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$, then

$$|P_1P_2| = [(x_1 - x_2)^2 + (y_1 - y_2)^2]^{1/2}. \tag{2-1}$$

For example, if A and B are the points with coordinates (3, -4) and
(1, 2) respectively, then

$$|AB| = [(3 - 1)^2 + (-4 - 2)^2]^{1/2} = [2^2 + (-6)^2]^{1/2} = [4 + 36]^{1/2} = \sqrt{40}.$$
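Formula (2-1) is easily put into a program. The Python sketch below represents a point as a pair of numbers; the name `distance` is an illustrative assumption.

```python
import math


def distance(p, q):
    """Distance between the points p = (x1, y1) and q = (x2, y2), following formula (2-1)."""
    (x1, y1), (x2, y2) = p, q
    return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)


A = (3, -4)
B = (1, 2)
print(distance(A, B))        # the square root of 40, about 6.32
print(distance(A, B) ** 2)   # close to 40, up to rounding
```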

Now suppose a positive real number R and a point $P_0$ with coordinates
$(x_0, y_0)$ are given. What are the points which lie on the circle with radius
R and center $P_0$? Suppose P is such a point. Then $|PP_0| = R$. Hence if
P has coordinates (x, y), then $[(x - x_0)^2 + (y - y_0)^2]^{1/2} = R$, or
equivalently,

$$(x - x_0)^2 + (y - y_0)^2 = R^2. \tag{2-2}$$

Conversely, suppose that the coordinates of some point P = (x, y)
satisfy the relation (2-2). Comparing the equation with (2-1), we see that
$|PP_0|^2 = R^2$, or equivalently, $|PP_0| = R$. That is, the point P is at a
distance R from the point $P_0$, but this means that the point P is on the
circle of radius R with center $P_0$. Thus we have shown that the set of
points on this circle is exactly the same as the set of points whose co­
ordinates satisfy Eq. (2-2). Putting this into the form of a theorem,
we have

Theorem 2-1. The circle with radius R and center $P_0 = (x_0, y_0)$ in
the cartesian plane is

$$\{(x, y) \mid (x - x_0)^2 + (y - y_0)^2 = R^2\}. \tag{2-3}$$

An important point must be noted here. We see in (2-3) that a circle
is a set of points which satisfy a certain equation. This notation is completely
accurate but rather cumbersome. For this reason, a very common
usage eliminates any mention of the set and we find phrases such as “the
circle $(x - x_0)^2 + (y - y_0)^2 = R^2$.” Such a phrase is clearly incorrect,
since the circle is the set of points which satisfy this equation and not the

equation itself. However, this is not a serious objection since the meaning
of the phrase could not be misunderstood.
In mathematics accuracy of thinking is essential. It is safest to be
careful and always use accurate language. When the student is aware of
the exact meaning needed, he may be allowed to use shorter terminology
and phrases which are not quite precise, provided that the meaning is not
lost or altered.

Definition 2-2. The set of all points (x, y) in the cartesian plane which
satisfy a given equation in the variables x and y is called the locus of
that equation, and the equation is called the equation of that locus.
Two equations are called equivalent if they have the same locus.*

Thus, the locus of Eq. (2-2) is a circle. In using the terminology of this
definition, the word “locus” is sometimes replaced by the name applied
to that point set. In particular, we would say, for example, that (2-2) is
the equation of a circle.
For example, the circle with center (2, -1) and radius 3 would have the
equation
$$(x - 2)^2 + [y - (-1)]^2 = 3^2$$
or
$$(x - 2)^2 + (y + 1)^2 = 9.$$

Let us look again at the equation of a circle, (2-2). This equation is
equivalent to the equation

$$x^2 + y^2 - 2x_0 x - 2y_0 y + (x_0^2 + y_0^2 - R^2) = 0,$$

that is, to an equation of the form

$$x^2 + y^2 - 2ax - 2by + c = 0. \tag{2-4}$$

Thus the equation of the circle with center (2, -1) and radius 3, given
above, becomes
$$x^2 - 4x + 4 + y^2 + 2y + 1 = 9$$
or
$$x^2 + y^2 - 4x + 2y - 4 = 0$$

when transformed to the form (2-4).

* The phrase “truth set” or “solution set” is often used instead of “locus”
in modern texts. The meaning is the same. We use “locus” here because it
is in such common use that the student should be aware of it. Besides, it is shorter.

Is any equation of the form (2-4) the equation of a circle? The answer
is clearly no, since, for example, the equation
$$x^2 + y^2 - 2x - 2y + 4 = 0$$
is equivalent to
$$(x - 1)^2 + (y - 1)^2 = -2,$$

and no points in the plane would have coordinates which could satisfy
this equation. The sum of two squares cannot be negative.
Given an equation of the form (2-4), it is easy enough to tell whether or
not it is the equation of a circle and, if it is, to identify the circle. All that
needs to be done is to complete the square in both x and y. For example,
given the equation
$$x^2 + y^2 - 6x + 14y + 33 = 0,$$

we would proceed by the following steps:

$$\begin{aligned}
x^2 - 6x + y^2 + 14y &= -33, \\
x^2 - 6x + 9 + y^2 + 14y + 49 &= -33 + 9 + 49, \\
(x - 3)^2 + (y + 7)^2 &= 25, \\
(x - 3)^2 + (y + 7)^2 &= 5^2.
\end{aligned}$$

This last equation can immediately be identified from (2-3) as the equation
of the circle of radius 5 with center at the point (3, -7). Note how
the numbers appearing in this form are the negative of the coordinates of the
center.
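Completing the square in the form (2-4) always gives center (a, b) and squared radius a² + b² - c, so the test for a circle can be sketched in code. In the Python sketch below the function name and the convention of returning `None` when there is no circle of positive radius are illustrative assumptions.

```python
import math


def circle_from_form_2_4(a, b, c):
    """Interpret x^2 + y^2 - 2ax - 2by + c = 0.

    Completing the square gives (x - a)^2 + (y - b)^2 = a^2 + b^2 - c.
    Returns (center, radius), or None when the right-hand side is not positive.
    """
    r_squared = a * a + b * b - c
    if r_squared <= 0:
        return None
    return (a, b), math.sqrt(r_squared)


# x^2 + y^2 - 6x + 14y + 33 = 0 has -2a = -6, -2b = 14, c = 33.
print(circle_from_form_2_4(3, -7, 33))   # ((3, -7), 5.0)

# x^2 + y^2 - 2x - 2y + 4 = 0 has a = 1, b = 1, c = 4.
print(circle_from_form_2_4(1, 1, 4))     # None
```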
Suppose we were asked to find the circle which passes through three
given points, say the points (0, —4), (—5, 1), and (4, 4). We know that
the equation of any circle can be brought into the form (2-4), and hence
if we can find a, b, and c in this equation we can find the circle. To do this,
all that needs to be done is to put the values of x and y for the given points
into (2-4) and solve the resulting set of equations for a, b, and c.
When the coordinates of the three points given above are put into
(2-4), we have
$$\begin{aligned}
16 + 8b + c &= 0, \\
26 + 10a - 2b + c &= 0, \\
32 - 8a - 8b + c &= 0,
\end{aligned}$$
or equivalently,
$$\begin{aligned}
8b + c &= -16, \\
10a - 2b + c &= -26, \\
8a + 8b - c &= 32.
\end{aligned} \tag{2-5}$$
This set of equations can then be solved for a, b, and c. The resulting

values can be used to find the center and the radius of the circle in the
manner described above.
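The procedure of substituting three points into (2-4) and solving for a, b, and c can be sketched as follows. The text does not prescribe a method of solution; the Python sketch below uses ordinary elimination with exact fractions, and the names `solve3` and `circle_through` are illustrative assumptions.

```python
from fractions import Fraction


def solve3(rows):
    """Solve three linear equations in three unknowns by elimination.

    Each row is [p, q, r, s], standing for p*a + q*b + r*c = s.
    Assumes the system has exactly one solution.
    """
    m = [[Fraction(v) for v in row] for row in rows]
    for col in range(3):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, 3) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(3):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[r][3] for r in range(3)]


def circle_through(p1, p2, p3):
    """Return the center (a, b) and the squared radius of the circle through three points."""
    # Substituting (x, y) into (2-4) gives -2x*a - 2y*b + c = -(x^2 + y^2).
    rows = [[-2 * x, -2 * y, 1, -(x * x + y * y)] for x, y in (p1, p2, p3)]
    a, b, c = solve3(rows)
    return (a, b), a * a + b * b - c


center, r_squared = circle_through((0, -4), (-5, 1), (4, 4))
print(center, r_squared)
```

If the three points lie on one line, no circle passes through them and the search for a pivot fails; the sketch does not handle that case.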
Similarly, by assuming an equation for the circle with unknown param­
eters, we could find the equations for circles with other given conditions.

PROBLEMS

1. Make a sketch locating the following points in the cartesian plane:


(a) A = (1, 5) (b) B = (5, 1)
(c) C = (-6, 2) (d) D = (3, -5)
(e) E = (-2, -7) (f) F = (2, 7)
(g) G = (-2, 7) (h) H = (2, -7)
2. If a and b are two nonzero real numbers, what is the relationship between the
points (a, b) and (-a, b)? between the points (a, b) and (a, -b)? between
the points (a, b) and (-a, -b)? What happens to these relationships if
a or b is zero?
3. Using the points A through H of Problem 1, find the distances

4. Using the points in Problem 1, find the distances:

5. Write the equations of the circles with the following centers and radii, both
in the form (2-2) and in the form (2-4). Make a sketch showing the circle.
(a) Center (1,2); radius 4
(b) Center (3, 4); radius 5
(c) Center (—5, 3); radius 1
6. Follow the instructions given for Problem 5 for the following circles:
(a) Center (0, 2); radius 2
(b) Center (6, —2); radius 6
(c) Center (—2, —2); radius 8
7. Identify whether or not the following are equations of circles. If they are,
give the centers and the radii.
(a) $x^2 + y^2 + 2x - 4y - 4 = 0$
(b) $x^2 + y^2 - 20y + 84 = 0$
(c) $x^2 + y^2 - 6x - 2y + 14 = 0$
8. Follow the same instructions as in Problem 7.
(a) $x^2 + y^2 + 2x - 3y + 1 = 0$
(b) $x^2 + y^2 + 7x - 8y + 3 = 0$
(c) $3x^2 + 3y^2 + 4x + 18y + 7 = 0$

9. What is the locus of the equation

$$(x - x_0)^2 + (y - y_0)^2 = 0,$$

where $x_0$ and $y_0$ are specified real numbers?
10. Solve equations (2-5) and find the center and the radius of the circle.
11. Find the equations of the circles satisfying the conditions below, and give
the centers and the radii.
(a) The circle passes through the points (0, 0), (3, 1), (7, 0).
(b) The circle passes through the points (9, 1), (8, —4), (1, 13).
(c) The circle passes through the points (0, 4), (0, —2), (4, 2).
12. Find the equations of the circles of radius 10 which pass through the points
(—4, 0) and (12, 0). How many of them are there? Make a sketch.
13. Find the equation of the circle with center at (3, —7) which passes through
the point (6, 2).

2-2 STRAIGHT LINES

In Chapter 4 we will consider the problem of defining exactly what
is meant by a straight line. In this section we will assume that we know
what a straight line is, and concentrate on its properties.
First, let us consider a very special case. Suppose L is a straight line
in the cartesian plane which is parallel to the y-axis. Then by the very
way in which we introduced the coordinates of a point, we see that every
point on this line has the same x-coordinate, namely the coordinate of the
point at which this line cuts the x-axis. Furthermore, every point which
has this value for the x-coordinate is on the line. Thus, for each point
(x, y) on the line, we must have
x = c, (2-6)
where c is the coordinate of the point on the x-axis at which the given line
crosses. We therefore see that we have proved:

Theorem 2-2. L is a straight line in the cartesian plane which is parallel
to the y-axis if and only if there is some real number c such that

$$L = \{(x, y) \mid x = c\}.$$


Note that this statement is the same (using our conventions) as saying
that the equation of L is x = c. Students sometimes find it difficult to
think of x = c as defining a set of points in the plane since y does not appear
in this equation; but if we recall that this is just a short way of stating the
set relation in this theorem, there should be no such difficulty.

Now if L is any line in the cartesian plane which is not parallel to the
y-axis, then it must cut every line parallel to the y-axis at a single point.
That is, given any real number $x_0$, there will be one and only one point
on the line with $x_0$ as its x-coordinate.
Let b be the coordinate of the point at which the line cuts the y-axis,
and draw the line parallel to the x-axis through the point (0, b). In a
similar manner as above, we see that this line could be identified as the
line whose equation is y = b. Next, if we choose any c ≠ 0, and also
draw in the line x = c, we will have formed a right triangle. All such
triangles have a common angle at the point (0, b) and hence are similar.
In fact, if we take any two points $(x_1, y_1)$ and $(x_2, y_2)$ on the line which
are such that $x_1 < x_2$ and draw in the lines $y = y_1$ and $x = x_2$, we will
have formed a right triangle which is also similar to any of the above
triangles (see Fig. 2-3).

Let us fix a particular triangle as the one to refer all of the others to.
We will use the triangle determined by the given line and the lines y = b
and x = 1. The length of the base of this triangle is 1. Let the height
of this triangle be |m|, where the sign of m is so chosen that the point
(1, b + m) is on the line. Thus, m is positive if the line “rises,” as does
the line in Fig. 2-3, and m is negative if the line “falls.” For this triangle,
the ratio of the height to the base is |m|/l = |m|.
On the other hand, for the triangle determined by the points $(x_1, y_1)$
and $(x_2, y_2)$, the base is of length $(x_2 - x_1)$ and the height is $|y_2 - y_1|$
(why?). The fact that this triangle is similar to the triangle fixed above
means that
$$|m| = \frac{|y_2 - y_1|}{x_2 - x_1}.$$
Note, however, that m has been chosen to be positive or negative so
that it has the same sign as $(y_2 - y_1)$ in this case when we are assuming
$x_1 < x_2$.

We can therefore conclude:

Theorem 2-3. If L is a straight line which is not parallel to the y-axis,
then there exists a real number m such that if $(x_1, y_1)$ and $(x_2, y_2)$ are
any two distinct points of the line, then
$$m = \frac{y_2 - y_1}{x_2 - x_1}.$$
As stated, this theorem does not require that $x_1 < x_2$. The student
is asked to verify that this restriction is not necessary in one of the problems
at the end of this section.
Let us apply this theorem to the point (0, b) at which the line cuts the
y-axis and a general point (x, y), x ≠ 0, on the line. The result of this
theorem gives
$$m = \frac{y - b}{x}.$$
This equation is easily seen to be equivalent to the equation
$$y = mx + b \tag{2-7}$$
except when x = 0. We see that the point (0, b) satisfies (2-7), however.
Therefore, every point on the line satisfies (2-7). Hence we have

Theorem 2-4. If L is a straight line in the cartesian plane, not parallel
to the y-axis, then there are real numbers b and m such that
$$L = \{(x, y) \mid y = mx + b\}.$$
Conversely, the locus of an equation of the form (2-7) is a straight line
not parallel to the y-axis.

Strictly speaking, the last half of this theorem has not been proved.
However, exactly the same discussion as above can be used to show that
if a point (x, y) satisfies this equation, then (x, y) must lie on the line
through the points (0, b) and (1, b + m).

Definition 2-3. If the line L has equation y = mx + b, then b is called
the y-intercept of L and m is called the slope of L.

Equation (2-7) is called the slope-intercept form of the equation of a
line. A line which is parallel to the y-axis is often said to have infinite
slope (see Problem 3), but it is more correct to say that it has no slope.
Note that the lines for which m = 0, that is, those with equation y = b,
are parallel to the x-axis.

Equations (2-7) and (2-6) can be combined into a single case. Either
one can be written in the form

ax + by + c = 0, (2-8)

where not both a and b are zero. Equation (2-8) is called the general
form of the equation of a line. We see that if b = 0 in an equation of this
form, we can divide through by a and get the equation of a line of the
type (2-6). If b ≠ 0, then we can divide by b and get the slope-intercept
form of an equation of a line. In this case, the slope is m = —a/b.
The student will find it essential to be able to find the equation of a line
satisfying given conditions. There are two types of conditions which
appear very frequently in practice, and which must therefore be familiar
to the student. These conditions are: a pair of points which are to be on
the line, and a given point and slope. Let us look at this second condition
first.
If the slope m is given, we know that the line will have an equation of
the form
y = mx + b,

and hence all that needs to be determined is the correct value for b. Suppose
that the point $(x_1, y_1)$ is given and is to be on the line. Then the
coordinates must satisfy the equation, giving

$$y_1 = mx_1 + b,$$

and hence $b = y_1 - mx_1$. The required equation is therefore

$$y = mx + (y_1 - mx_1). \tag{2-9}$$

This equation can be put into a slightly different form which is often useful.
Equation (2-9) is equivalent to
$$y - y_1 = m(x - x_1), \tag{2-10}$$

which is called the point-slope form of the equation of a line. Note that
each side of the equation is zero at the point $(x_1, y_1)$.
The point-slope form of the equation could also be derived directly
from Theorem 2-3. Indeed, Eq. (2-10) is equivalent to

$$\frac{y - y_1}{x - x_1} = m \tag{2-11}$$

except for the point $(x_1, y_1)$ which is on the line and satisfies (2-10) but
not (2-11). Equation (2-11) can be derived immediately from Theorem 2-3.
This result is easily remembered, particularly in the form (2-11), and
the student is advised to be sure he learns it, since it is used quite often.

Let us look at an example of the use of this result. What is the equation
of the line with slope 1/2 which passes through the point (-3, 1)? From
(2-10) we have
$$y - 1 = \tfrac{1}{2}(x + 3).$$

We can find an equivalent equation in the form (2-8) from the above
equation. Such an equation is
$$x - 2y + 5 = 0.$$
Next, let us consider the problem of finding the equation of the line
passing through two given points. This can be done in several ways. We
could assume the general form
$$ax + by + c = 0$$
and use the coordinates of the two given points to obtain the pair of
equations
$$ax_1 + by_1 + c = 0, \qquad ax_2 + by_2 + c = 0.$$
These two equations contain three unknowns, but they can be solved
nonetheless. The method is to eliminate one of the unknowns, leaving a
single equation in two unknowns. This can be solved for one of the un­
knowns in terms of an assumed value of the other. A solution set can be
obtained for each such assumed value. But the different solution sets
for different assumed values are multiples of each other. This does not
matter, since the locus of Eq. (2-8) is unchanged when all of the coefficients
are multiplied by the same nonzero constant.
For example, let us find the equation of the line passing through the
points (1, 2) and (4, 4). Assuming the equation ax + by + c = 0, we
find that at (1, 2)
$$a + 2b + c = 0,$$
and that at (4, 4)
$$4a + 4b + c = 0.$$
Subtracting the first of these equations from the second, we find
$$3a + 2b = 0.$$
We now assume any convenient value for a and solve for b. The value
a = 2 is useful here, since it makes b = -3 (an integer). These values
can now be put into one of the equations given above. In particular,
from the first equation,
$$c = -a - 2b = -2 + 6 = 4.$$

The desired equation is therefore

2x — 3y + 4 = 0.
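The elimination above can be carried out once and for all. Subtracting the two equations gives a(x₂ - x₁) + b(y₂ - y₁) = 0, and one convenient solution is a = y₂ - y₁, b = -(x₂ - x₁). The Python sketch below makes that particular choice; it is one of the many proportional solutions, and the name `line_through` is an illustrative assumption.

```python
def line_through(p, q):
    """Return (a, b, c) with ax + by + c = 0 passing through two distinct points p and q."""
    (x1, y1), (x2, y2) = p, q
    if (x1, y1) == (x2, y2):
        raise ValueError("two distinct points are required")
    a = y2 - y1          # one convenient choice satisfying a(x2 - x1) + b(y2 - y1) = 0
    b = -(x2 - x1)
    c = -(a * x1 + b * y1)
    return a, b, c


print(line_through((1, 2), (4, 4)))    # (2, -3, 4), that is, 2x - 3y + 4 = 0
print(line_through((4, 2), (4, 17)))   # (15, 0, -60), a line parallel to the y-axis
```

Any nonzero multiple of the returned triple describes the same line, so the output may differ from a hand computation by a constant factor.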

The method described above is not the only one which can be used to
find an equation of the line through two points. The results of Theorem 2-3
can also be utilized.
If the two given points have the same x-coordinate, but different y-coordinates,
then the line is parallel to the y-axis and must have the equation
$x = x_1$ ($x_1$ being the common x-coordinate). Let us suppose that
for the two given points $x_1 \neq x_2$, and solve for the point-slope form of
the equation. All we need is the slope, since we know a point already.
From Theorem 2-3 we have the slope
$$m = \frac{y_2 - y_1}{x_2 - x_1}, \tag{2-12}$$

and putting this into the point-slope form of the equation as given by
(2-10), using $(x_1, y_1)$ as the point, gives

$$y - y_1 = \frac{y_2 - y_1}{x_2 - x_1}(x - x_1).$$

This result can be written in several equivalent forms, two of which are:

$$\frac{y - y_1}{y_2 - y_1} = \frac{x - x_1}{x_2 - x_1},$$

$$(x_2 - x_1)(y - y_1) = (y_2 - y_1)(x - x_1). \tag{2-13}$$
The student should check to see that the two given points actually
satisfy these equations.
Both are called the point-point forms of the equation of the line. The
first is easier to remember, but the second is of greater generality, since it
remains valid even when x_1 = x_2.
In practice, most students find it easier to determine the slope first,
using (2-12), and then to use the point-slope form (2-10) rather than trying
to memorize formulas (2-13). For example, to find the line through the
points (1, 3) and (5, -5), we first find the slope

m = (-5 - 3)/(5 - 1) = -2,

and then using (2-10) we have the equation

y - 3 = -2(x - 1),
or, equivalently,
2x + y — 5 = 0.
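As a computational check on the point-point form, the following Python sketch collects the second equation of (2-13) into the general form ax + by + c = 0. It is an illustration only and not part of the original text: the function name line_through is hypothetical, and integer coordinates are assumed so that the coefficients can be reduced by their greatest common divisor.

```python
from math import gcd

def line_through(p1, p2):
    """Return integers (a, b, c) with ax + by + c = 0 on the line through p1 and p2."""
    (x1, y1), (x2, y2) = p1, p2
    if (x1, y1) == (x2, y2):
        raise ValueError("the two points must be distinct")
    # Expand (x2 - x1)(y - y1) = (y2 - y1)(x - x1) and collect the terms.
    a = y2 - y1
    b = -(x2 - x1)
    c = (x2 - x1) * y1 - (y2 - y1) * x1
    # Multiplying all coefficients by a nonzero constant leaves the locus
    # unchanged, so divide out the common factor and fix the overall sign.
    g = gcd(gcd(a, b), c)
    a, b, c = a // g, b // g, c // g
    if a < 0 or (a == 0 and b < 0):
        a, b, c = -a, -b, -c
    return a, b, c

print(line_through((1, 3), (5, -5)))   # (2, 1, -5), i.e. 2x + y - 5 = 0
print(line_through((1, 2), (4, 4)))    # (2, -3, 4), i.e. 2x - 3y + 4 = 0
```

Because the sketch starts from the second form of (2-13), it also handles two points with the same x-coordinate, where the slope is undefined.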
2-2 STRAIGHT LINES 45

PROBLEMS

1. Explain why Theorem 2-3 holds even if x_2 < x_1.


2. Let c be a fixed real number. Find the equation of the line with slope m which
cuts the x-axis at the point x = c. Write this equation in the general form
(2-8) with a = 1. What happens to this equation if m becomes increasingly
large?
3. Find the equation of the line which passes through the two points (a, 0) and
(0, b), where neither a nor b is zero. These points are called the intercepts
of the line. Make a sketch showing these points and how the line is
determined. Show that this equation can be brought into the form

x/a + y/b = 1.
4. Find the equation of the line with the given slope, passing through the given
point. Make a sketch showing the line. Give the equation in the general form
(2-8), and in the slope-intercept form (2-7).
(a) Slope 2, point (7, 3) (b) Slope —1, point (1, —1)
(c) Slope 5, point (0, 10) (d) Slope point (0, 10)
(e) Slope —f, point (—4, 5)
5. Follow the same directions as in Problem 4.
(a) Slope — J, point (3, —8) (b) Slope 50, point (1,0)
(c) Slope — point (1, 0) (d) Slope —50, point (1,0)
(e) Slope 1, point (0, —1)
6. Find the equation of the line passing through the given points. Give the
equation in the general form (2-8), and give the slope of the line. Make a
sketch.
(a) (3, 1), (2, -1) (b) (-7,8), (15,20)
(c) (4,2), (4, 17) (d) (-1, 10), (1, -12)
(e) (-3,5), (7, 5)
7. Follow the directions of Problem 6.
(a) (10,5), (7,2) (b) (-2,15), (-2,0)
(c) (4,-8), (8,-4) (d) (-2,-7), (6,-7)
(e) (7,32), (8,-62)
8. Show that the equation of the line through the points (x_1, y_1) and
(x_2, y_2) is given by

| x    y    1 |
| x_1  y_1  1 | = 0.
| x_2  y_2  1 |

Can you turn this into a formula which can be used to determine whether or
not three points are all on the same line?
46 ANALYTIC GEOMETRY AND TRIGONOMETRY 2-3

2-3 FUNCTIONS AND GRAPHS

We have mentioned functions a few times already, expecting that the stu­
dent would have no trouble understanding what was meant, but in this
section we will give the formal definition of a function and try to make
clear how we think about and work with the function concept.
Up until quite recently (within the last century), when functions were
mentioned in the mathematical literature they were usually considered
to be formulas, that is, algebraic expressions which could be written down
more or less explicitly, involving one or more variables. By the turn of
the century it had become obvious that this concept was too narrow and
that a better understanding of a function was needed.
In particular, it was realized that functions could not be restricted to
having arguments and values among just the real and complex numbers.
In fact, it seemed that there should be no such restrictions at all. The
elements of any set had to be allowable. Eventually what was arrived
at was the following formal definition.

Definition 2-4. Let D and R be any two sets. A function with domain
D and range R is a set F of ordered pairs (x, y) with the properties:
(1) For every (x, y) ∈ F, x ∈ D and y ∈ R,
(2) For every x ∈ D there is one and only one y ∈ R such that
(x, y) ∈ F.
The set D is called the domain of the function and the set
{y | (x, y) ∈ F for some x ∈ D}
is called the image of the function.

This formal definition becomes necessary for certain difficult problems.
But in general it is too clumsy for actual use in the ordinary situation. In
fact, it is usually better to think of functions as some “rule” that asso­
ciates to each element of the set D a unique element of the set R.
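For small finite sets, the two conditions of Definition 2-4 can be checked mechanically. The Python sketch below is an illustration under that finiteness assumption; the names is_function and image are hypothetical and do not come from the text.

```python
def is_function(pairs, domain, rng):
    """Check conditions (1) and (2) of Definition 2-4 for a finite set of pairs."""
    pairs = set(pairs)
    # (1) every pair (x, y) must have x in D and y in R
    if any(x not in domain or y not in rng for x, y in pairs):
        return False
    # (2) every x in D must occur as a first element exactly once
    firsts = [x for x, _ in pairs]
    return all(firsts.count(x) == 1 for x in domain)

def image(pairs):
    """The set of second elements actually taken on."""
    return {y for _, y in pairs}

D = {1, 2, 3}
R = set(range(10))                       # a range that merely contains the image
F = {(1, 1), (2, 4), (3, 9)}
print(is_function(F, D, R))              # True
print(image(F) == {1, 4, 9})             # True
print(is_function(F | {(1, 2)}, D, R))   # False: 1 is paired with two values
```

Note that R here is deliberately larger than the image, which matches the remark that the range can be any set containing the image.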
Note that in the above definition, the range can be any set which con­
tains the image. Strictly speaking we should not speak of the range, since
this is not unique. The concept of range is useful when considering a
function in which it is difficult or impossible to determine the exact image.
In such a case, the range is defined to be the smallest set which we are
sure contains the image. For example, we might have a function in which
the domain was the set of all people in the world and which associated
to each person the number of hairs on his head. We certainly cannot de­
termine the image of this function, but we know that the range is a
subset of the nonnegative integers. We could say that the range is the
set of all real numbers, or the smaller set of all integers, but these sets
are clearly too big. Can you put an upper limit on the integers in the
range?
A special notation is used for functions. We will usually denote the
function by a letter and when we are talking about functions in general,
we will usually use the letter f. The way in which we use the functional
notation is explained by the following definition:

Definition 2-5. If the function f is a set F of ordered pairs as defined
above, then by the value of the function at x ∈ D we mean the element
y ∈ R such that (x, y) ∈ F. This value will be denoted by f(x).

We sometimes will speak of “the function f(x)” instead of “the function
f.” This happens especially when the function can be defined by a simple
formula, such as
f(x) = x^2.
This is a function which associates to each real number x the value x^2,
and is therefore the set of ordered pairs {(x, x^2) | x is a real number}. We
will say that this is the function x^2, even though x^2 is really the value of
the function at x and not the function itself.
The logical confusion in letting f(x) represent both the function and
the value of the function never causes difficulty in any normal context.
Essentially, it merely amounts to thinking of x as being a variable point
in D and thinking of f(x) then as representing the set of all pairs (x, f(x))
in the above definition.
Although most of the functions we shall consider will have real values
and will be defined on some subset of the real numbers, it is worthwhile to
give a few examples showing other types of functions.
In a library, every book is assigned a call number. This gives rise to a
function which assigns a call number to each book. We might write this
function as C(x). The domain of this function is the set of all books in
the library. The x in C(x) thus ranges over all of these books. The value
of C(x) is a call number, and the image of C(x) is the set of all call numbers
of the books.
Let P be the set of all people in the United States and let D be the
collection of all subsets of P. Then we can define a function on D by
specifying the value of the function to be the number of people in each
subset. This is a type of function which is of interest to the census bureau,
although they only consider certain special subsets, such as the set of all
men, the set of all residents of a given state, the set of all unemployed,
etc. The image of this function would be the set of all integers from zero
to the total population of the United States.
Every rational number can be written in the form p/q, where p is an
integer and q is a positive integer so that p and q have no factors in
common (we say p/q is in lowest terms). For every rational number x, we
can define a function f(x) to be f(x) = 1/q, where x = p/q in lowest
terms. What is the image of this function?
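This function can be tried out numerically. The sketch below is an illustration only; it relies on the fact that Python's fractions.Fraction always stores a rational number in lowest terms with a positive denominator, which is exactly the representation p/q described above.

```python
from fractions import Fraction

def f(x):
    """Return 1/q, where x = p/q in lowest terms with q > 0."""
    x = Fraction(x)            # Fraction reduces to lowest terms automatically
    return Fraction(1, x.denominator)

print(f(Fraction(6, 4)))       # 1/2, since 6/4 = 3/2 in lowest terms
print(f(Fraction(-7, 3)))      # 1/3
print(f(5))                    # 1, since 5 = 5/1
```

Evaluating f at a variety of rational numbers is one way to form a conjecture about the image before proving it.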
For a final example of a nonstandard type of function, suppose that on
the domain of all real numbers we define

if x is rational,
if x is irrational.

What is the image of this function? Is the function completely defined?


Next, we wish to define the graph of a function.

Definition 2-6. The graph of a function f(x) with domain D is the set
of all ordered pairs (x, f(x)) where x ∈ D.

A comparison of Definition 2-4 with this definition seems to show that
the function and its graph are the same thing. Each is the same set of
ordered pairs. This is true, but we actually distinguish between the two
concepts.
In the first place, we do not normally think of the function as the set of
ordered pairs. We usually think of it as the association (or mapping)
between the elements of the domain and the elements of the range.
However, even when we use the formal definitions, we still distinguish
between the function and the graph. The function is considered to be the
set of ordered pairs F, while the graph is considered to be the same set F,
thought of as a subset of the set of all ordered pairs (x, y) with x in the
domain and y in the range of F.
This becomes easier to see when we have a function whose domain is
the set of real numbers (or a subset of the set of real numbers) and whose
range is the set of real numbers. The ordered pairs of the graph are
ordered pairs of real numbers, and hence can be identified with points
of the cartesian plane. The graph will then be a point set in the cartesian
plane. For each value of x, we can mark the point (x, f(x)) on the
cartesian plane and obtain a picture of the graph of the function.
Not every point set in the cartesian plane is the graph of a function.
The requirements that a point set be the graph of a function can be deduced
from Definition 2-4 and will serve to clarify the meaning we assign to the
term function. For each real number c which is in the domain of the
function f(x), there is a unique real number f(c) such that (c, f(c)) is
on the graph. This means that for each c in the domain, the line x = c
cuts the graph at one and only one point.
Thus, for example, Fig. 2-4 shows the locus of the equation

x = y^2,

but it is not the graph of a function whose domain is a subset of the
x-axis. There is an obvious function (or rather two such functions)
associated with this locus, however. The function defined by

y = √x
has as its domain the set of all nonnegative real numbers and as its image
the set of all nonnegative real numbers. What is its graph?
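The requirement that each line x = c cut the graph at most once can be tested on a finite sample of points. The following Python sketch is only an illustration on sampled points, not a proof about the whole locus; the name is_graph_of_function is hypothetical.

```python
def is_graph_of_function(points):
    """True if no two sampled points share an x-coordinate with different y."""
    seen = {}
    for x, y in points:
        if x in seen and seen[x] != y:
            return False       # the line x = c meets the set in two points
        seen[x] = y
    return True

ys = range(-3, 4)
locus = [(y * y, y) for y in ys]              # sample points on x = y^2
upper = [(y * y, y) for y in ys if y >= 0]    # the part with y >= 0

print(is_graph_of_function(locus))   # False: (1, 1) and (1, -1) both occur
print(is_graph_of_function(upper))   # True
```

The second sample consists of points of the form (x, √x), which is consistent with the function singled out above.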
Note that when we think of the graph of a function in the cartesian
plane, we usually write the function in the form

y = f(x),
since we are thinking of a point set of ordered pairs (x, y) in our standard
notation.
As another example, we show in Fig.
2-5 the graph of the function

What is the domain of this function?


What is the image of this function?
There are two special types of functions
which occur frequently enough to deserve
special comment. These are the polynomial functions and the rational
functions.
A polynomial function is one whose value at each x is given by a linear
combination of powers of x. That is, a polynomial function, p(x), is
defined by some finite number, n + 1, of real numbers a_0, a_1, ..., a_n
such that for every x,
p(x) = a_0 + a_1 x + a_2 x^2 + ··· + a_n x^n.

We will assume that the student is already well acquainted with the basic
properties of polynomials, and turn to rational functions.
A rational function is a function whose value at each x is given by the
quotient of two polynomials. The domain of a rational function is there­
fore the set of all real numbers, less those real numbers for which the
denominator is zero. We will assume that the student is familiar with the
graphs of polynomial functions, but we will make a few remarks about
the problems which arise in sketching the graphs of rational functions.
This may best be done by showing how a particular example can be
analyzed.
Let us sketch the graph of the function

y = f(x) = (x^2 - 1)/(x^2 - x - 6) = [(x - 1)(x + 1)]/[(x - 3)(x + 2)].
The first thing we do is (as shown above) factor the numerator and the
denominator. It is then clear that the domain of this function is the set
of all real numbers less the numbers 3 and —2. The zeros of the numerator
tell us that the function is zero at 1 and —1 (and only at these values).
The function cannot change sign between any of these values for x.
This fact involves questions of continuity which we cannot discuss here,
but the student may accept it without question. To determine the sign
of the function f(x) between these values, we merely need to calculate
values of the function at intermediate points. These values will help us
make the sketch later. In this example, we find
f(-3) = 4/3, f(-3/2) = -5/9, f(0) = 1/6,
f(2) = -3/4, f(4) = 5/2.
Next, we determine the behavior of the function near the zeros of the
denominator. Suppose that x is near but smaller than —2 in this example.

Figure 2-6
Then (x - 1) is near -3, (x + 1) is near -1, and (x - 3) is near -5.
Therefore f(x) must be near

-3/(5(x + 2)).

Since x is smaller than -2, (x + 2) is negative. What happens if x gets
very close to -2? The factor (x + 2) stays negative, but gets close to
zero, and hence |f(x)| must become very large. That is, “f(x) gets close
to +∞.” A common way of writing this down is

f(x) → +∞ as x → -2 (x < -2).

This is a purely symbolic statement which we interpret as saying that the
values of f(x) for x very close to, but less than, -2 are “near +∞”;
that is, they are very large in absolute value and are positive. For this
example we see that

f(x) → +∞ as x → -2 (x < -2),
f(x) → -∞ as x → -2 (x > -2),
f(x) → -∞ as x → 3 (x < 3),
f(x) → +∞ as x → 3 (x > 3).

Note that in order to write the above expressions we do not need to give
all of the analysis which has been carried out. All we need to do is to note
what the sign of f(x) is near the point in question; and we determine this
sign by calculating intermediate values.
Finally, we would like to determine the behavior of f(x) as x becomes
very large in absolute value. We do this by dividing the numerator and
denominator through by the highest power of x in f(x). In this case

f(x) = (x^2 - 1)/(x^2 - x - 6) = (1 - 1/x^2)/(1 - 1/x - 6/x^2).

As |x| becomes increasingly large, 1/x, 1/x^2, and 6/x^2 become very
close to zero, and hence f(x) gets close to 1.
With all of the above information available, we can now make a sketch
of the graph of the function. First, we draw dashed lines x = —2, x = 3,
and y = 1. These lines are called the asymptotes of the function, since
the graph gets very close to these lines at distances far from the origin.
Next, we plot the points that have been determined and sketch in a
smooth graph which makes use of all of the information on the behavior
of the function that we have been able to determine. For this example,
we obtain the graph shown in Fig. 2-6.
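The numerical part of this analysis can be reproduced with exact rational arithmetic. The Python sketch below is an illustration, not the author's procedure: the helper sign_near and the offset 1/1000 are assumptions made here, and the intermediate point -3/2 between -2 and -1 is simply one convenient choice.

```python
from fractions import Fraction

def f(x):
    """The rational function (x^2 - 1)/(x^2 - x - 6), evaluated exactly."""
    x = Fraction(x)
    return (x * x - 1) / (x * x - x - 6)

# Values at points between and beyond the zeros -1, 1 and the poles -2, 3.
for x in (Fraction(-3), Fraction(-3, 2), Fraction(0), Fraction(2), Fraction(4)):
    print(x, f(x))

def sign_near(pole, side):
    """Sign of f just left (side = -1) or just right (side = +1) of a pole."""
    v = f(pole + side * Fraction(1, 1000))
    return 1 if v > 0 else -1

print(sign_near(-2, -1), sign_near(-2, 1))   # 1 -1
print(sign_near(3, -1), sign_near(3, 1))     # -1 1
print(float(f(10**6)))                       # close to 1
```

The signs on either side of each pole agree with the four limiting statements, and the value at a very large x illustrates the horizontal asymptote y = 1.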
PROBLEMS

1. Under what conditions is a straight line in the cartesian plane the graph of a
function? If it is the graph of a function, what formula will give the values
of the function?
2. The set of points on the circle whose equation is
x^2 + y^2 = R^2
for which y > 0 is the graph of a function. What is a formula for f(x)? What
is the domain of this function? What is the image of this function?
3. What is the domain of the function defined by f(x) = √x on the real
numbers? What is its image? Sketch the graph of this function.
4. If f(x) is a function whose domain is some subset of the real numbers and
whose range is in the set of real numbers, when is 1/f(x) a function and what
is its domain?
5. Sketch the graph of each of the following functions, whose domains are the
real numbers or subsets of the real numbers:
(a) f(x) = 1 + x (b) f(x) = x + |x|
(c) f(x) = [16 - x^2]^(1/2) (d) f(x) = 4 - [16 - x^2]^(1/2)
(e) f(x) = 0 if x is rational,
           1 if x is irrational
6. Sketch the graph of each of the following functions:
(a) f(x) = x^2 - 4x + 3 (b) f(x) = (x - 1)(x^2 - 4)
(c) f(x) = (x + 1)(x^2 + 1) (d) = 4?
& — 1)
(e) f(x) =
(x - 2) ™ /<*> -
(x + 3)(x — 1) x(x + l)(x + 2)
(g) Ax) = (x - 2)(x + 2) (h) Ax) =
(3x + 2)(x + 3)(x - 1)
3
(x + l)2
(i) Ax) = (j) Ax) =
xix — l)(x + 2)

2-4 TRANSLATIONS

The euclidean concept of translation is connected with the idea of a
“rigid motion,” that is, the moving of the points of the plane so that the
distance between any two points is the same after the motion as it was
before. The euclidean rigid motions are translation, rotation, and reflec­
tion. The particular feature that distinguishes a translation from one of
the other rigid motions is that every point is moved in the same direction
and through the same distance.
Definition 2-7. A euclidean rigid motion of the cartesian plane is a
function m(P) whose domain and image are the set of all points in the
cartesian plane and which preserves the distance between points; i.e.,
for any two points A and B, if A' = m(A) and B' = m(B), then

|A'B'| = |AB|.

A translation is a rigid motion in which the distance between any point
and its image is always the same.

Figure 2-7

Suppose that a translation carries the origin, O, to the point O' with
coordinates (h, k). We do not prove it here, but it follows from the above
definition that the points on a line are mapped by the translation to the
points of a line parallel to the original one. Then if the point P is mapped
to the point P', we see as in Fig. 2-7 that the broken lines through O'
are the images of the axes and that the dashed lines through P' are the
images of the dashed lines through P. All of these lines are parallel to one
or the other of the axes, and we conclude that P' has been moved a hori­
zontal distance h and a vertical distance k from P. That is:

Theorem 2-5. If the points of the cartesian plane are translated so that
the point which is at the origin goes to the point with coordinates (h, k),
then a point with coordinates (x, y) is translated to the point (x', y'),
where
x' = x + h,
y' = y + k.* (2-14)

Note that under a translation, every point of the plane is moved, but
that the coordinate axes remain fixed. We "slide” the plane along under
the axes. Formula (2-14) gives us the relationship between the coordinates
of the point before and after the translation. It is sometimes useful to

* This theorem could just as well be made the definition of a translation.


write this in the following manner:

(x, y) → (x + h, y + k),
(0, 0) → (h, k),
(a, b) → (a + h, b + k),
(1, -1) → (1 + h, -1 + k).
Suppose now that we are given a locus in the plane. For a concrete
example, consider a circle with radius R and center O, that is, the locus of

x^2 + y^2 = R^2. (2-15)

We know that a translation which carries the point at the origin to the
point with coordinates (h, k) will carry this circle to a circle of the same
radius, but with center at the point (h, k). How could this be determined
directly from Eq. (2-15)? If a point (x, y) is on the original circle, x and y
satisfy (2-15). The point (x', y') = (x + h, y + k) is on the translated
circle. From (2-14) x = x' - h and y = y' - k, and hence the translated
point (x', y') must satisfy the equation

(x' - h)^2 + (y' - k)^2 = R^2, (2-16)

which we recognize as the equation of the translated circle. Looking at
this closely, we see that in general:

Theorem 2-6. Let C be the locus of an equation

f(x, y) = 0,

and let C' be the set obtained from C by a translation which carries
the point at the origin to the point with coordinates (h, k); then C' is
the locus of the equation
f(x' - h, y' - k) = 0.
In the second expression of this theorem we have used x' and y' as the
variables to emphasize the relations (2-14) and the fact that we are
assuming the points of C' to have coordinates (x', y'). In practice, after
making this transformation, we would drop the primes, leaving the equa­
tion of the translated locus in usual form.
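Theorem 2-6 can be checked on sample points. In the Python sketch below, which is an illustration only, a locus is represented by a two-variable function whose zeros are its points; the name translate_locus and the specific circle and parabola are choices made here for the demonstration.

```python
def translate_locus(f, h, k):
    """Given f with locus f(x, y) = 0, return g whose locus is that set
    translated so that the point at the origin goes to (h, k)."""
    return lambda x, y: f(x - h, y - k)

# A circle of radius 5 about the origin, moved so its center is (2, -4).
circle = lambda x, y: x * x + y * y - 25
moved = translate_locus(circle, 2, -4)

print(circle(3, 4))        # 0: (3, 4) is on the original circle
print(moved(3 + 2, 4 - 4)) # 0: its translate (5, 0) is on the new circle
print(moved(3, 4))         # 40: the untranslated point is not

# The parabola y = 8x^2 moved so that the origin goes to (1, -3).
parabola = lambda x, y: y - 8 * x * x
moved_parabola = translate_locus(parabola, 1, -3)
print(moved_parabola(2, 5))  # 0: (1, 8) on the parabola goes to (2, 5)
```

The subtraction inside translate_locus mirrors the substitution of x' - h and y' - k in the theorem: a point lies on the translated locus exactly when the point it came from lies on the original one.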
Note that the identification obtained above can also be read from:
(x, y) → (x', y'),
(x, y) → (x + h, y + k),
(x' - h, y' - k) → (x', y').
In the last line, the coordinates of the point on the right are the same as
the coordinates on the right of the top line. Therefore, the corresponding
coordinates on the left must be the same also. When (x, y) satisfies a given
equation, we therefore must have the pair of numbers (x' - h, y' - k)
satisfying the same equation.
The concept of translation can be used in several different ways. First,
if we are given the equation of a locus, we can ask for the equation of the
translated locus. This was the problem solved by the above theorem.
Suppose, to give a specific example, we have the locus

K = {(x, y) | y = 8x^2},

and we wish to make the translation which sends the point at the origin
to the point with coordinates (1, —3). Then

(0, 0) → (1, -3),
(x, y) → (x', y'),
(x, y) → (x + 1, y - 3),
(x' - 1, y' + 3) → (x', y'),

so that the locus K translates to

K' = {(x', y') | y' + 3 = 8(x' - 1)^2}
   = {(x, y) | y + 3 = 8(x - 1)^2}. (2-17)

Sometimes we would like to find the equation of a locus whose equation
we would know if the locus were properly positioned. For example, what
is the equation of the locus consisting of the point of intersection of the
two lines y = x + 1 and y = -x + 3, together with all points on the
two lines “above” this point? That is, the locus

A = {(x, y) | y = x + 1 or y = -x + 3, and y ≥ 2}.

In the last section, we saw that if this locus is translated so that the point
of intersection is at the origin, then the translated locus would have the
equation y' = |x'|; so we write

(x, y) → (x', y'),
(1, 2) → (0, 0),
(x, y) → (x - 1, y - 2),

and obtain the desired equation

A = {(x,y) | y - 2 = |x - 1|}.
Finally, we can use translations to simplify equations. Having the locus

E = {(x, y) | 4(x - 1)^2 + 3(y + 5)^2 = 12},

for example, we can introduce the translation

(x, y) → (x', y'), (x, y) → (x - 1, y + 5),

so that x' = x - 1, and y' = y + 5. Then, under this translation,

E' = {(x', y') | 4x'^2 + 3y'^2 = 12}.

We still may not know what this new locus is, but at least it has a simpler
equation.

PROBLEMS

1. Why are the two sets in (2-17) the same?
2. Suppose that we have translations T and T' such that T takes (x, y) →
(x + h, y + k) and T' takes (x, y) → (x + h', y + k').
(a) Is the result of translating the plane by translation T and then translating
the resulting points by translation T' itself a translation? What happens
to the coordinates of a point under these circumstances?
(b) Are translations commutative? That is, is the result of T followed by T'
the same as T' followed by T? (Be careful here.)
(c) Given T, does there always exist a T' which is the inverse of T; that is,
such that T followed by T' returns all points to where they started from?
If so, what is it?
3. Sketch the graphs of
(a) y = |x - 3| (b) y = |x + 4| - 2
(c) y + 2 = √(x - 1) (d) y = |4x - 8| - 3
4. Find the equation of the locus resulting from the translation of
D = {(x, y) | |x| + |y| = 1}
in such a way that the point which is at the origin moves to the point whose
coordinates are (2, —4).

2-5 ANGLES

Euclid defined an angle to be the inclination of one line to another. Since
inclination is undefined, this is not a definition at all. A study of Euclid’s
proofs shows, however, that he considered an angle to be the geometric
configuration of two intersecting lines, with two angles being equal if the
two geometric configurations are congruent (i.e., if they can be made to
coincide by means of rigid motions).
While this form of the definition of an angle was sufficient for Euclid’s
work, it has been found necessary to use a more complex definition in
modern mathematics. Given two intersecting lines L and L', we must be
able to distinguish between the angle from L to L' and the angle from L'
to L, and we must be able to assign real number values to angles. In
this section we will try to make these ideas precise.

Definition 2-8. Let P and Q be two distinct points of the plane and let
L be the line through P and Q. Make L a coordinate axis by letting P
be the origin and letting Q have a positive coordinate. Then the ray
from P through Q is the set of all points on L which have nonnegative
coordinates. We call this the ray PQ.

Note that it does not matter what unit of length is used in defining the
coordinates on the line. Only the direction chosen for the positive co­
ordinates matters. For a given line through a point P there are two rays,
one for each way of choosing a sense of direction on L. From a given point
there are infinitely many possible rays, two for every line through P.

Definition 2-9. A geometric angle is a pair of rays originating from the
same point. An oriented geometric angle is an ordered pair of rays
originating from the same point. If PR and PQ, in that order, are the
pair of rays making up an oriented geometric angle, then the ray PR
is called the initial side and the ray PQ is called the terminal side of the
oriented geometric angle RPQ.

In this definition the two rays may coincide or may lie in opposite
directions along the same line. This conflicts with the definition used in
many geometry courses, but we will find it useful not to have to dis­
tinguish these special cases. Note that a given geometric angle determines
two oriented geometric angles, depending upon which of the two rays we
call the initial side of the angle.
Before we continue, we must make some observations about orientation
in the plane. The two rays consisting of the points of the x- and y-axes
with nonnegative coordinates divide the plane into two regions, one of
which is “three times as large” as the other. We can get from the x-axis
to the y-axis by moving along the unit circle (the circle with radius one
and center at the origin) in two distinct ways. One of these ways is shorter
than the other, the shorter path being in the counterclockwise direction.
As far as the mathematics goes, it is unimportant whether the shorter
path from the x-axis to the y-axis is counterclockwise or clockwise, but it
is important that we are able to fix one such sense of rotation and refer
to it as needed. Observe that the assignment of an orientation in this way
depends on our being “on one side” of the plane. If we have an angle formed
by two rays in space, we could find a plane containing these rays, but we
cannot decide whether to call a rotation from one ray to the other clock­
wise or counterclockwise unless we know which side of the plane we
observe it from.
Let us assume that we have fixed an orientation in the cartesian plane
so that we can speak of a “clockwise” or "counterclockwise” rotation in
this plane; the orientation is to be so chosen that the counterclockwise
route is the shorter one in going from the positive x-axis to the positive
y-axis.
The concept of angle which we wish to develop is to be independent of
translation and rotation (but not of reflection), and hence it will suffice to
consider only angles at the origin which are such that the initial side is
the positive portion of the x-axis (Fig. 2-8). Note that in Fig. 2-8 we
show an arrow on an arc between the two rays to indicate which is the
terminal side of the angle.
Since we work with the real numbers, we would like to assign real num­
ber values to angles. This has, of course, been done since the earliest times:
The Babylonians measured angles by dividing a full circle into 360 equal
parts and we still use their system when we measure angles in degrees.
In military usage, angles are measured in mils, which are defined to be
1/6400 of a full circle. This particular system has a computational simplicity
which is useful in the particular applications made of it.
It would appear that we are at liberty to assign almost any desired
unit to a system of angular measurement. However, for mathematical
purposes, one particular method of measuring angles turns out to be the
most valuable. This is the so-called radian measure of angles, which
assigns the value 2π to the full circle.
It is convenient to have angles with all possible real numbers as their
values, but it is clear that any pair of rays could have an angular
measurement only between 0 and 2π if we assign the value 2π to the full circle.
We avoid this and other difficulties by defining the configuration of rays
which corresponds to a given numerical value of an angle rather than by
defining the numerical value of an angle defined by a pair of rays. We
will do this by means of another undefined concept, that of arc length along
the circumference of a circle.
When we say that we want angles to be independent of euclidean motion,
this implies an ability to subdivide angles (dividing the whole circle into
360 degrees, for example). We can imagine a method of subdividing the
circle, using the euclidean notion of congruence, so as to assign a rational
multiple of the entire length of the circle to any arc on the circle. Just
as the rational numbers can be completed to the reals, we could then
complete these arc length measurements so as to be able to measure any
arc. Conversely, we will also assume that given any real number we can
measure off a circular arc of that length. To do so, we must make an
agreement about what to do with negative numbers.
To fix our thoughts, let us use the unit circle; that is, the circle centered
at the origin whose radius is one. Let the point P be the point (1,0) on
this circle (Fig. 2-9). The entire length of circumference of this circle is
2π (by the definition of π). Just as we could measure off coordinates on a
line, we now assume we can measure off coordinates on the circle, starting
at P and proceeding in the counterclockwise direction for positive co­
ordinates. Thus, for example, the point A in Fig. 2-9 would correspond
to the coordinate +1 while the point B would correspond (as shown) to
the coordinate —2.

The point (0, 1) corresponds to the coordinate π/2 but also to the
coordinate -3π/2 (why?). Each real number would give us only one point,
but each point will have many coordinates. The point P in particular has
the coordinate zero, but since the total length of the circumference of the
circle is 2π, it also has the coordinates 2π, -2π, 4π, -4π, etc. Indeed,
it is easily seen that if any point has a coordinate a, then it has coordinates
a + 2πk for k = 0, ±1, ±2, . . . The coordinates of a given point all
differ by integral multiples of 2π.
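The many-to-one correspondence between arc-length coordinates and points of the unit circle can be observed numerically. The Python sketch below is an illustration only: it uses the cosine and sine functions to locate the point with a given coordinate, which anticipates material the text has not introduced at this stage, and the names point_at and same_point are hypothetical.

```python
import math

def point_at(a):
    """Point of the unit circle with arc-length coordinate a, measured
    counterclockwise from P = (1, 0)."""
    return (math.cos(a), math.sin(a))

def same_point(p, q, tol=1e-9):
    """Compare two points up to floating-point rounding."""
    return (math.isclose(p[0], q[0], abs_tol=tol)
            and math.isclose(p[1], q[1], abs_tol=tol))

# (0, 1) has the coordinate pi/2 and also the coordinate -3*pi/2.
print(same_point(point_at(math.pi / 2), (0.0, 1.0)))        # True
print(same_point(point_at(-3 * math.pi / 2), (0.0, 1.0)))   # True

# The coordinates a + 2*pi*k all give the same point.
a = 1.0
print(all(same_point(point_at(a + 2 * math.pi * k), point_at(a))
          for k in range(-3, 4)))                           # True
```

The comparison uses a small tolerance because floating-point arithmetic represents multiples of π only approximately.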

Definition 2-10. Let a be any real number. Let P be the point (1, 0)
on the unit circle, and A the point with coordinate a, measured as arc
length from P on the unit circle, positive coordinates being measured
counterclockwise from P. Then the ray from the origin through A
is said to make an angle with value a with the ray from the origin
through P.
By virtue of this definition, the same ray OA will make an angle with
many different measures with the ray OP. If we are given a real number a,
then the ray OA will be uniquely determined, but the same ray can also
be said to make an angle with measure a + 2πk with the ray OP, where
k can be any integer.
Once we have a clear understanding of this ambiguity, we may use
looser terminology without creating confusion. We often see a phrase
such as “an angle of π/2,” and we need to decide exactly what is meant
by a phrase such as this.
First of all, we will assume that the above definition has been extended
so that we can speak of the ray QR making an angle whose measure is a
with the ray QS, where Q is an arbitrary point in the plane and QS is an
arbitrary ray from that point. This can be done with the help of transla­
tion and rotation, two of the euclidean motions which we will assume
known.

Definition 2-11. Let a be any real number. Then by an angle of a, we
mean any oriented geometric angle which is such that the terminal ray
makes an angle with value a with the initial ray in the sense of the
definition above. The number a will then be called a value or measure
of the angle.

The phrase "an angle of α" introduced in this definition is merely a
short way of saying "an oriented geometric angle in which the terminal
ray makes an angle with measure α with the initial ray." The student will
find that the word angle is in common use to mean either the geometric
angle or the numerical value attached to that angle. Usually, this dual
usage will give no difficulty, and the student can determine the particular
meaning desired by the context.

Figure 2-10

For pictorial purposes it is convenient to show the measurement of the
angle along a spiral rather than the unit circle, as in Fig. 2-10. This figure
illustrates the angle with measure 9π/2. The spiral allows us to count the
amount of rotation needed to obtain this angle.

A given geometric angle determines many angles in the sense of this
definition, but each real number determines only one geometric angle.
The different numerical values for a given geometric angle differ by
integral multiples of 2π. Of all these, exactly one, α, will lie in the interval

0 ≤ α < 2π.

This is most easily seen by thinking of the pair of rays cutting the unit
circle, and noting that we can get from the first to the second by moving
counterclockwise through some arc of less than 2π in length. If the two
rays coincide, the value 0 for the angle can be used. The numerical value
for the angle in this interval can be used as a standard value.
Sometimes, however, it is more convenient to use the interval
−π < α ≤ π
for the standard value. The student should satisfy himself that angles
can be reduced to this range as well as to the range 0 ≤ α < 2π.
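The reduction to either standard interval can be sketched in a few lines of Python. This is only an illustration: the function names `standard_value` and `symmetric_value` are hypothetical, and floating-point rounding may matter for values extremely close to a multiple of 2π.

```python
import math

TWO_PI = 2 * math.pi

def standard_value(alpha):
    """Value in [0, 2*pi) differing from alpha by an integral multiple of 2*pi."""
    return alpha % TWO_PI

def symmetric_value(alpha):
    """Value in (-pi, pi] differing from alpha by an integral multiple of 2*pi."""
    r = alpha % TWO_PI
    # Values above pi are shifted down by one full turn.
    return r - TWO_PI if r > math.pi else r

# The angle 9*pi/2 of Fig. 2-10 has the standard value pi/2.
print(standard_value(9 * math.pi / 2))
print(symmetric_value(3 * math.pi / 2))
```

Both functions return one of the infinitely many values of the same oriented geometric angle, so they change the number but not the angle.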
Angles can be added geometrically or numerically with the same results.
Thus, for example, if the angle from the ray OP to the ray OA has the
value α and the angle from the ray OA to OB has the value β, then the
angle from OP to OB has the value α + β. Note that when we use the
terminology of the definition and say that the angle from OP to OA has
the value α, what we really mean is that α is one of the infinite number of
possible values that can be given as the value of this angle. The number α
determines a unique geometric angle, but not conversely. So long as this
is remembered, no confusion need arise.
In a similar way, we see how angles can be subtracted. In particular,
we can take the negative of an angle. Thus, in Fig. 2-11, β is the value of
the angle from OA to OB while the angle from OB to OA would have the
value −β.

Figure 2-11

We have defined angles in great generality, as signed angles in the
plane. In actual practice, there are times when the full amount of
generality is not needed. For example, while we have defined only the angle
from one ray to another, it is sometimes unnecessary to distinguish
between the rays. Then we need only speak of the angle between the two rays.
The value of such an angle is usually taken to be positive and in the
interval 0 ≤ α ≤ π. This would be the absolute value of θ, where θ is the
standard value for the angle, chosen so that −π < θ ≤ π. When we
consider the angle between two rays in space, we cannot use signed values,
since a sense of direction of rotation cannot be given as it can in the plane.
There are many possible units of measurement of length. We have
assumed a natural unit of measurement in the cartesian plane, and in
terms of this unit of length, a circle whose radius is one will have a
circumference of length 2π. We could, however, introduce other units of
arc length and use them to measure angles. For example, if we assign the
length of 360 to the circumference of the unit circle, then in using Defini­
tion 2-10 we would obtain other numerical values for the same angles.
In this particular case we would obtain a measurement of the angles in
degrees. We indicate this by means of a symbol which shows the unit of
measurement we use. For an angle measured in degrees, we use the
symbol °.
This point needs to be stressed. Whenever a particular unit of measure­
ment is used in describing an angle, that unit must be given or implied.
Thus, we can describe an angle as having the value 45°, but the degree
sign is essential and cannot be omitted. After all, we recognize the state­
ment, “This board is 10 long” as nonsense. On the other hand, we do speak
of a line segment of length 2 in the cartesian plane. Two what? Two
units of length—the unit of length that is given as part of the cartesian
plane. In this case we understand the unit. Similarly we do not indicate
the unit when we measure an angle in radians. This makes it even more
important to give the unit when we measure the angle in any other way.

PROBLEMS

1. Make sketches showing angles of
(a) π/2 (b) π (c) 3π/2 (d) 2π
(e) −π/2 (f) −π (g) −3π/2 (h) −2π
(i) π/4 (j) −π/4 (k) π/3 (l) 5π/6
(m) 15π/4 (n) −9π/2 (o) 127π (p) 253π/4
2. For each of the angles in Problem 1, give a value in the range 0 ≤ α < 2π
for the same oriented geometric angle.
3. The angle of 2π (radians) is the same as the angle of 360°.
(a) If α is the value of an angle given in radians, what is the formula for
the value of the angle in degrees? That is, if an angle has values α and
a°, then a = ?
(b) Given an angle with a value of a degrees, what is its measurement in
terms of radians?
4. Convert each of the angles in Problem 1 to degrees.
5. How many degrees is an angle of 1 radian?
6. Using the approximation π = 3.1416 and the fact that an angle of 6400 mils
is the same as an angle of 2π, find the angle of one mil in radians to three
significant figures. What do you think "mil" stands for?
7. Suppose that the angle from the ray OP to the ray OA has the value α, the
angle from OA to OB has the value β, and the angle from OP to OB has the
value γ. Why is it not necessarily true that α + β = γ? How does this
situation differ from that of the statements made about Fig. 2-11?

2-6 THE TRIGONOMETRIC FUNCTIONS

The student is probably familiar with the standard trigonometric
functions of angles. In this section we will give a definition of these functions
which may appear different from that which the student has seen before.
The functions are the same, however; we only change the definitions so
as to make the particular properties which we wish to emphasize as easy
to see as possible.

Definition 2-12. Let α be any real number, let P be the point (1, 0) on
the unit circle, and let A = (x_α, y_α) be the point on the unit circle
such that the angle from OP to OA has the value α. Then the sine and
cosine functions of α are defined by

cos α = x_α,
sin α = y_α.

The functions sine and cosine as defined here are functions whose domain
is the set of all real numbers. Since the points on the unit circle never
have coordinates outside of the range −1 to +1, the range of these functions
is the interval from −1 to +1 (inclusive of the endpoints). It is easy to
picture the behavior of these functions. Starting at α = 0, as α increases
we can follow the x-coordinate, for example, and observe the behavior
of cos α. In this way, we see that the functions sin α and cos α have
graphs as shown in Fig. 2-12.
In Fig. 2-12, the horizontal axis is the α-axis. A feature of these graphs
which should be noted is the fact that they repeat with a period of 2π.
The functions sin α and cos α are examples of periodic functions since they
satisfy the conditions that for any α,

cos (2π + α) = cos α,
sin (2π + α) = sin α. (2-18)

Figure 2-12

This follows from the fact that increasing the angle by 2π does not change
the geometric configuration of the rays, and it is the oriented geometric
angle that determines the values of the functions.

Definition 2-13. Let f(x) be a function defined for all real x. Then f(x)
is said to be a periodic function with period p if for every x

f(x + p) = f(x).

In terms of this formal definition, we see that Eqs. (2-18) state that the
sine and cosine functions are periodic with period 2π.
If the ray from the origin through A makes the angle α with the positive
x-axis, then the continuation of this ray through to the other side of the
origin makes the angle π + α with the positive x-axis. If A is the point
(x₀, y₀), this opposite ray cuts the circle at the point (−x₀, −y₀) (make
a sketch and verify this, noting the similar triangles formed). This shows
that for any α,
cos (π + α) = −cos α,
sin (π + α) = −sin α. (2-19)

Next, imagine the plane reflected in the x-axis. Any point (x, y) will
be reflected to the point (x, −y) (why?), and the points on the unit circle
will be reflected to the points also on the unit circle (why?). If A is a
point such that the arc from P to A is of signed length α, then A will be
reflected to a point A' such that the arc from P to A' is of signed length
−α. But A = (cos α, sin α) and A' = (cos α, −sin α), so we have that
for any α,
cos (−α) = cos α,
sin (−α) = −sin α. (2-20)
From Eqs. (2-20) we see that if we know the values of cos α and sin α
for every positive α, then we would know the values for every α. However,
use of Eqs. (2-18), repeated as many times as necessary, shows that it
suffices to know the values for α between 0 and 2π. With the help of (2-19)
we see that we need only the values between 0 and π. We can reduce this
still further by the following reasoning.
Observe Fig. 2-13. We show the two rays OA and OA' where OA makes
the angle α with the positive x-axis, and OA' makes the same angle α
with the positive y-axis. Thus OA' makes the angle π/2 + α with the
positive x-axis. The y-coordinate of A is sin α and is indicated by the
vertical arrow in Fig. 2-13. It is clear that the displacement of A' in the
horizontal direction, as indicated by the horizontal arrow, is of the same
numerical value but of opposite sign (since this displacement starts in
the negative direction). Hence we see that cos (π/2 + α) = −sin α.
In exactly the same way it can be seen that sin (π/2 + α) = cos α.
That is, we have the relations

cos (π/2 + α) = −sin α,
sin (π/2 + α) = cos α. (2-21)

Although our diagram indicates this only for α between 0 and π/2, it can
be seen that the argument is valid for any value of α by observing what
happens in Fig. 2-13 as α is allowed to change to any value. By means
of these equations we can reduce the problem of finding the sine or cosine
of any α to the same problem for α between 0 and π/2. Indeed, it suffices
to know the values between 0 and π/4, since we can combine (2-20)
and (2-21) to see that

cos (π/2 − α) = cos [π/2 + (−α)]
= −sin (−α)
= sin α
and
sin (π/2 − α) = sin [π/2 + (−α)]
= cos (−α)
= cos α.

Figure 2-13

If α is between π/4 and π/2, then π/2 − α is between 0 and π/4. Hence
the relations
cos (π/2 − α) = sin α,
sin (π/2 − α) = cos α (2-22)

serve to complete the proof of the observation that the values of sin α
and cos α for α between 0 and π/4 serve to define these functions for all
possible α. This is the reason that trigonometric tables are normally given
only for this range (except that the last relations are usually built into
the tables).
The use of these relations in practice is fairly simple. For example, if
we wish to find sin (−197π/3) we first observe that this is −sin (197π/3)
from (2-20). Now 197π/3 = 65π + 2π/3, and 2π/3 = π/2 + π/6, so

sin (−197π/3) = −sin (197π/3) by (2-20)
= −sin (64π + π + 2π/3)
= −sin (π + 2π/3) by (2-18)
= sin (2π/3) by (2-19)
= sin (π/2 + π/6)
= cos π/6 by (2-21).
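The chain of reductions used in this example can be written out as a short Python sketch. The function name `reduce_sine` and its return convention are illustrative assumptions, not part of the text; each step applies one of the relations (2-18) through (2-22) in the order used above.

```python
import math

def reduce_sine(alpha):
    """Return (sign, name, t) with 0 <= t <= pi/4 and
    sin(alpha) == sign * f(t), where f is sin or cos according to name."""
    sign = 1
    if alpha < 0:                      # (2-20): sin(-a) = -sin a
        alpha, sign = -alpha, -sign
    alpha = alpha % (2 * math.pi)      # (2-18): period 2*pi
    if alpha >= math.pi:               # (2-19): sin(pi + a) = -sin a
        alpha, sign = alpha - math.pi, -sign
    name = "sin"
    if alpha >= math.pi / 2:           # (2-21): sin(pi/2 + a) = cos a
        alpha, name = alpha - math.pi / 2, "cos"
    if alpha > math.pi / 4:            # (2-22): swap sin and cos at pi/2 - a
        alpha = math.pi / 2 - alpha
        name = "cos" if name == "sin" else "sin"
    return sign, name, alpha

sign, name, t = reduce_sine(-197 * math.pi / 3)
print(sign, name, t)   # the text's result is +cos(pi/6)
```

Because the input is a floating-point number, the reduced angle agrees with π/6 only up to rounding error.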

The student might find it difficult to memorize all of these relations.
Luckily, there is a simple pair of formulas from which the above set of
formulas can all be obtained. These "addition formulas" will be derived
in the next section. However, some of these relations are so important
that they should be learned in their own right. The equations (2-18) must
be known without question. The equations (2-20) are also so useful that
they should be known. It helps in learning these relations to recall the
geometric picture.
The formulas in (2-22) are of frequent utility, and since they are fairly
easy to learn, it is recommended that the student learn these also.
There is still another relationship between these functions which is
immediately available from the definition and which is of fundamental
importance. For any α, the values of cos α and sin α are the coordinates
of a point on the unit circle. This point is at a distance 1 from the origin,
and hence
sin² α + cos² α = 1.

This relation is one of the fundamental properties of the trigonometric
functions and is important enough to be restated as a theorem.

Theorem 2-7. For any real number α, the functions sin α and cos α
satisfy the relation
sin² α + cos² α = 1. (2-23)

Note that the sine and cosine functions have been defined as functions
on the real numbers, but they are also, in a natural way, defined as func­
tions on oriented geometric angles. Relation (2-18) is what is important
in this regard. The convention we introduced for writing the measure of
an angle in radians as a pure real number (without units) permits us to
think of the trigonometric functions as functions of either angles or real
numbers.
We also allow the use of notation such as sin 45°. Here we write the
function as a function of the angle, indicating this by the use of a degree
symbol to show the units. With such a convention, note that we can write

sin 60° = sin (π/3)

and other similar relations.
An important question which comes up frequently is the extent to which
an angle is determined by the trigonometric functions. The answers to
this question can be summarized as follows:

Theorem 2-8. If two numbers a and b are given such that a² + b² = 1,
then there is a unique α in the interval 0 ≤ α < 2π such that a =
cos α and b = sin α.

Theorem 2-9. If a number a with |a| ≤ 1 and a sign, +1 or −1, are
given, then there is a unique α in the interval 0 ≤ α < 2π such that
a = cos α and sin α has the given sign (or is zero). Likewise there is a
unique α' with 0 ≤ α' < 2π such that a = sin α' and cos α' has the
given sign (or is zero).

Theorem 2-10. If a number a with |a| ≤ 1 is given, then there is a
unique α in the interval 0 ≤ α ≤ π such that cos α = a.

The first of these results is evident when we note that the given con­
dition on a and b is exactly what is required to have the point (a, b) be
a point on the unit circle.
The second result follows from the observation that the line x = a cuts
the unit circle at exactly two points (unless a = +1 or −1, in which case
there is only one point). The y-coordinates of these two points have
opposite signs, and only one of them has the given sign and thus determines α.
The second half of this result follows in the same way by considering the
line y = a.
The third result follows from the second when we note that sin α > 0
for 0 < α < π and sin α < 0 for π < α < 2π. Hence, although the
line x = a cuts the unit circle at two points in general, only one of these
points will correspond to an angle in the required range. This particular
result can also be interpreted as saying that the cosine alone serves to
determine the geometric angle (but not, of course, the oriented geometric
angle). Note that the converse of this result also holds. A geometric
angle has a single cosine. This follows from (2-20), since the two choices
of an oriented geometric angle would have measures which are negatives
of each other.

Figure 2-14

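As a numerical illustration of Theorems 2-8 and 2-10, the sketch below recovers the angle from its cosine and sine, or from its cosine alone. The helper names are hypothetical, and the use of `math.atan2` and `math.acos` is a choice of this sketch rather than the method of the text, which argues from the unit circle.

```python
import math

def angle_from_cos_sin(a, b, tol=1e-9):
    """Theorem 2-8: the alpha in [0, 2*pi) with cos(alpha) = a and sin(alpha) = b."""
    if abs(a * a + b * b - 1.0) > tol:
        raise ValueError("(a, b) does not lie on the unit circle")
    # atan2 returns a value in [-pi, pi]; reduce it to the standard interval.
    return math.atan2(b, a) % (2 * math.pi)

def angle_from_cos(a):
    """Theorem 2-10: the alpha in [0, pi] with cos(alpha) = a (requires |a| <= 1)."""
    return math.acos(a)

print(angle_from_cos_sin(0.0, -1.0))   # the point (0, -1) gives 3*pi/2
print(angle_from_cos(-1.0))            # the cosine -1 gives pi
```

The first function needs both coordinates because a cosine alone leaves the sign of the sine undetermined, which is exactly the content of Theorem 2-9.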
The remaining trigonometric functions can be defined in terms of the
sine and cosine. They are useful in practice and should be known. At this
time it is sufficient to know the definitions of these functions.
These remaining functions are the tangent, cotangent, secant, and co­
secant functions. They are defined as follows:

Definition 2-14. For any α for which the denominator of the given
expression is not zero,

tan α = sin α / cos α, cot α = cos α / sin α,
sec α = 1 / cos α, csc α = 1 / sin α.

The graphs of these functions are sketched in Fig. 2-14.
All of the relations (2-18) through (2-22) clearly hold when the cosine
is replaced by the secant and the sine is replaced by the cosecant (why?).
For the tangent and cotangent, however, somewhat different relations
hold. Putting the relations (2-20) into the definition, we see that

tan (−α) = −tan α,
cot (−α) = −cot α. (2-24)

From relations (2-19) we have

tan (π + α) = tan α,
cot (π + α) = cot α, (2-25)

which shows that the functions tan α and cot α are periodic with period
π (half of that of the remaining trigonometric functions).
Finally from (2-21) and (2-22) we deduce that

tan (π/2 + α) = −cot α (2-26)

and
cot (π/2 − α) = tan α. (2-27)

A few other relations will be given as problems.



PROBLEMS

1. For what values of α is
(a) sin α = 0? (b) cos α = 0?
(c) sin α = 1? (d) cos α = 1?
(e) sin α = −1? (f) cos α = −1?
2. What conclusion can be made from Eqs. (2-22) when α = π/4? Show that
(2-23) then can be used to find the values of sin π/4 and cos π/4. What is
tan π/4?
3. Reduce each of the following to a trigonometric function of an angle between
0 and π/4.
(a) sin (7π/5) (b) cos (−17π/2)
(c) sin (315π/3) (d) sin (-11 br/10)
(e) cos (218π) (f) sin (−15π/2)
(g) tan (35π/3) (h) cot (−12π/5)
4. Rewrite relations (2-18) through (2-22) in terms of angles expressed in de­
grees.
5. Reduce each of the following to a trigonometric function of an angle (ex­
pressed in degrees) between 0° and 45°.
(a) sin (337°) (b) cos (-1000°)
(c) sin (-2345°) (d) cos (112°)
(e) tan (535°) (f) cot (1800°)
(g) sec (215°) (h) csc (-7000°)
6. Prove from (2-23) that

1 + tan² α = sec² α and 1 + cot² α = csc² α

for any α for which the functions are defined.
7. Let OA be the ray which makes an angle of α with the positive x-axis. Prove
that the line through O and A has slope tan α. What is the meaning of the
relation (2-25) in this context?
8. Let OA be the ray which makes an angle of α with the positive x-axis. Prove
that the line through O and A intersects the line x = 1 at the point (1,
tan α). Make a sketch showing this for angles in all four quadrants 0 to
π/2, π/2 to π, π to 3π/2, and 3π/2 to 2π.
9. Let Q be a point on the ray from the origin which makes an angle α with
the positive x-axis. Suppose that Q is a distance c from the origin. Prove
that Q has coordinates
(c cos α, c sin α).
10. Show that if f(x) is periodic with period p, then it is also periodic with period
kp, where k is any nonzero integer.

2-7 TRIANGLE FORMULAS

A right triangle has two sides which meet at a right angle. The other two
angles formed by the sides of the triangle are taken as unoriented angles
to which we can assign values between 0 and π/2. Suppose that the
triangle is placed on the cartesian plane so that one side is on the x-axis
and a vertex (not at the right angle) is at the origin, as in Fig. 2-15. Let α
be the value of the angle between the hypotenuse and the side along the
x-axis. Let a be the length of the side of the triangle opposite this angle,
let c be the length of the hypotenuse, and let b be the length of the
remaining side.

Extend the hypotenuse if necessary, and locate the point Q which is
the intersection of the hypotenuse with the unit circle. Drop a
perpendicular from Q to the x-axis. This forms another right triangle which
is similar to the given triangle. But this new triangle has sides of length
sin α (the vertical side), cos α (the horizontal side), and 1 (the hypotenuse).
From the similarity, we can conclude that

cos α = b/c,
sin α = a/c, (2-28)
tan α = a/b.

The first two of these can be written in the form

a = c sin α,
b = c cos α, (2-29)

which allows us to compute the length of the sides of the right triangle if
we know the length, c, of the hypotenuse and the value of one of the base
angles, α.
If a line segment AB is given in the plane along with a line L, then
we may draw lines AA' and BB' through the respective points A and B,
perpendicular to the line L and meeting L at A' and B' respectively. The
line segment A'B' is called the projection of AB on L. What is its length?
By adding the dashed line in Fig. 2-16, it is easy to see with the help of
(2-29) that
|A'B'| = |AB| cos α, (2-30)
where α is the angle between the line L and the line through A and B.
When two lines intersect, four rays are determined. These determine
six geometric angles, one for each of the 4 · 3/2 = 6 pairs of rays. Two of
these are straight angles, and the remaining four are equal (in value) in
pairs. The two values will be α and π − α. Use the smaller in (2-30). The
other would give the negative of the correct result, since
cos (π − α) = −cos α
from the formulas of the last section. The concept of projection will be
discussed more fully in the next chapter.
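Formula (2-30) can be checked numerically by comparing it with a direct computation of the feet of the perpendiculars. The sketch below assumes, for simplicity, that the line L passes through the origin with inclination `theta`; the function names are hypothetical.

```python
import math

def foot_on_line(point, theta):
    """Foot of the perpendicular from point to the line through the origin
    with inclination theta."""
    ux, uy = math.cos(theta), math.sin(theta)
    t = point[0] * ux + point[1] * uy   # signed position of the foot along L
    return (t * ux, t * uy)

def projection_length_direct(A, B, theta):
    """|A'B'| computed from the coordinates of the two feet."""
    ax, ay = foot_on_line(A, theta)
    bx, by = foot_on_line(B, theta)
    return math.hypot(bx - ax, by - ay)

def projection_length_formula(A, B, theta):
    """|A'B'| = |AB| cos(alpha), using the smaller angle between the lines."""
    length = math.hypot(B[0] - A[0], B[1] - A[1])
    phi = math.atan2(B[1] - A[1], B[0] - A[0])   # inclination of AB
    d = abs(phi - theta) % math.pi
    alpha = min(d, math.pi - d)                  # smaller of alpha and pi - alpha
    return length * math.cos(alpha)

A, B = (1.0, 2.0), (4.0, 6.0)
print(projection_length_direct(A, B, 0.3), projection_length_formula(A, B, 0.3))
```

Taking the smaller of the two angles is what keeps the result nonnegative, as the text notes.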

Figure 2-16

Suppose now we have an arbitrary triangle, not necessarily a right
triangle. Suppose that it has interior angles whose values are α, β, and γ (all
taken as positive angles in the interval from 0 to π), and that the sides
opposite the angles α, β, and γ are of lengths a, b, and c respectively. We
know that the area of a triangle is given by ½ the product of the length of
one of the sides with the altitude perpendicular to that side. Thus, if we
drop a perpendicular from the vertex with angle β to the opposite side, as
illustrated in Fig. 2-17, and if the length of this altitude is h, then the area
of the triangle is
A = ½bh.
However, the altitude drawn is a side of a right triangle. Hence
h = c sin α,
and thus
A = ½bc sin α = (abc/2)(sin α / a).

Figure 2-17

In exactly the same way, using the same altitude, we have
h = a sin γ
and
A = ½ba sin γ = (abc/2)(sin γ / c).
By dropping an altitude from one of the other vertices we can show in
the same way that
A = (abc/2)(sin β / b).

If one of the angles is greater than π/2, as in Fig. 2-18, it is necessary
to note that

sin (π − β) = sin β (2-31)

in proving that

h = c sin β

by the method described.

Figure 2-18

Since this relation follows from the results of
the last section, we can conclude that these three representations of the
area are valid. They are equal, and hence dividing through by abc/2, we
have
sin α / a = sin β / b = sin γ / c. (2-32)
The relations (2-32) are valid for any triangle. This result is known as
the Law of Sines, and can be used for several purposes. In particular, when
two angles of a triangle are known, the third angle is also known since the
sum of the angles in any triangle is π, and then if any side is known, the
relations (2-32) can be used to find the two remaining sides.
For example, if we know that sin a = sin 0 = f, and a = 3, then
we can compute b from the law of sines by

18
5 ’
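The use of the law of sines when two angles and one side are known can be sketched as follows. The function name `solve_two_angles_one_side` and the sample inputs are illustrative assumptions and are not taken from the text.

```python
import math

def solve_two_angles_one_side(alpha, beta, a):
    """Given angles alpha, beta and the side a opposite alpha, return
    (gamma, b, c) using gamma = pi - alpha - beta and the law of sines (2-32)."""
    gamma = math.pi - alpha - beta
    if alpha <= 0 or beta <= 0 or gamma <= 0:
        raise ValueError("the angles do not form a triangle")
    k = a / math.sin(alpha)          # common value of side / sine of opposite angle
    return gamma, k * math.sin(beta), k * math.sin(gamma)

gamma, b, c = solve_two_angles_one_side(math.pi / 6, math.pi / 3, 1.0)
print(gamma, b, c)
```

For the sample inputs the triangle is the familiar one with angles π/6, π/3, and π/2, so the sides are in the ratio 1 : √3 : 2.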
If the law of sines is used to determine an angle of a triangle, ambiguity
results. Since sin (π − α) = sin α, finding the sine of an angle does not
completely determine the angle. There are always two possibilities (except
when sin α = 1), one less than π/2 and one greater than π/2. Thus, for

Figure 2-19

example, if a =

sin

Hence, as we will prove in one of the problems at the end of this section,
γ = π/6, or γ = 5π/6.
This example is illustrated in Fig. 2-19. Both the angle labeled a and
the angle labeled af have a sine whose value is |. However, the dashed line
cannot be the side of a triangle meeting the required conditions, since no
point of this ray is at a distance 2 from the point B. The points C and C'
are both at a distance 2 from B. The two triangles which satisfy the given
conditions are therefore OBC and OBC'.

Turning to another topic, suppose that the sides of a triangle are of
lengths a, b, and c and that α is the angle opposite the side of length a.
Position this triangle in the cartesian plane so that the vertex opposite
the side of length a is at the origin and the angle α is measured from the
x-axis to the side of length c (as shown in Fig. 2-20). The coordinates of
the vertices are then as shown, and hence the distance formula gives us

a² = (c cos α − b)² + c² sin² α
= c² cos² α − 2bc cos α + b² + c² sin² α
= b² + c²(sin² α + cos² α) − 2bc cos α,

or, since sin² α + cos² α = 1,

a² = b² + c² − 2bc cos α. (2-33)


This result is called the Law of Cosines. Like the law of sines it can be
used to determine missing parts of a triangle. When used to determine
an angle (when all three sides are known) there is no ambiguity, since the
angle α must lie between 0 and π and there is only one such angle for a
given value of the cosine.
As it is written, this relation can be used to determine the third side of a
triangle when two sides and the included angle are given. If two sides and
an angle other than the included angle are given, this relation results in a
quadratic equation in the length of the third side. There are in general
two possible solutions.
Note that there are two other versions of the law of cosines in addition
to (2-33). There is one for each angle in the triangle. These relations are
easy to remember, since the equation says that the square of a side is
equal to an expression involving the cosine of the angle opposite the side
and the other two sides. All that is necessary is to remember the form of
the expression.
Let us now give a few examples showing how the law of cosines can be
used to find the missing parts of triangles.
First, suppose we are given that a = 4, b = 4, and c = 1. We wish
to find the angles of the triangle. Actually, it suffices to find the cosines
of the angles, and so we use the law of cosines. To find cos β, for example,
we use
b² = a² + c² − 2ac cos β,
or
cos β = (a² + c² − b²)/(2ac) = (4² + 1 − 4²)/8 = 1/8.

The cosines of the other angles could be found in a similar way.
For our next example, suppose that a = 3, b = 5, and cos γ = ½.
Then we can solve for c by direct use of the law of cosines in the following
way:
c² = a² + b² − 2ab cos γ
= 9 + 25 − 15
= 19.

Therefore, we find that c = √19. The other angles can be found as above.
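The two computations above can be reproduced with a short Python sketch of the law of cosines (2-33). The function names are hypothetical; the numerical inputs are those of the two examples.

```python
import math

def cos_angle_from_sides(opposite, s1, s2):
    """Cosine of the angle opposite the side `opposite`, where s1 and s2
    are the other two sides; this is (2-33) solved for the cosine."""
    return (s1 ** 2 + s2 ** 2 - opposite ** 2) / (2 * s1 * s2)

def third_side(s1, s2, cos_included):
    """Side opposite the angle included between the sides s1 and s2."""
    return math.sqrt(s1 ** 2 + s2 ** 2 - 2 * s1 * s2 * cos_included)

# First example: a = 4, b = 4, c = 1; the angle beta is opposite b.
print(cos_angle_from_sides(4, 4, 1))
# Second example: a = 3, b = 5, cos(gamma) = 1/2.
print(third_side(3, 5, 0.5), math.sqrt(19))
```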
As a last example, suppose we are given that a = 2√13, c = 6, and
cos α = ½. We attempt to find b from the law of cosines, using

a² = b² + c² − 2bc cos α.

Putting in these values, we find

b² − 6b − 16 = 0, or (b − 8)(b + 2) = 0.
Of the two roots, only the positive one has a meaning in this case, so that
we conclude that b = 8. The student should make a sketch to see why
there is only the one solution in this particular case.
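When two sides and an angle other than the included angle are given, the law of cosines becomes a quadratic in the unknown side, and a sketch of its solution is given below. The function name `third_side_candidates` is an assumption of this illustration; it simply solves b² − 2bc cos α + (c² − a²) = 0 and keeps the positive roots.

```python
import math

def third_side_candidates(a, c, cos_alpha):
    """Positive roots b of a^2 = b^2 + c^2 - 2*b*c*cos(alpha),
    where alpha is the angle opposite the side a."""
    p = c * cos_alpha
    disc = p * p - (c * c - a * a)
    if disc < 0:
        return []                      # no triangle satisfies the conditions
    root = math.sqrt(disc)
    # A set removes the duplicate when the two roots coincide.
    return sorted({b for b in (p - root, p + root) if b > 0})

# The example of the text: a = 2*sqrt(13), c = 6, cos(alpha) = 1/2.
print(third_side_candidates(2 * math.sqrt(13), 6, 0.5))
# A case with two admissible triangles: a = 4, c = 6, alpha = pi/6.
print(third_side_candidates(4, 6, math.cos(math.pi / 6)))
```

In the example of the text the second root is negative, which is why only one triangle results there.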
Observe that in this section we are interested in the theoretical rather
than the practical use of these formulas. For this reason we do not include,
or consider the use of, tables of trigonometric functions. We consider an
angle of a triangle as being known when we know its cosine, since there is a
unique correspondence between the cosine and the angle when the angle
is restricted to the range 0 ≤ α ≤ π.
This is not meant to imply that the student should not know how to
use the trigonometric tables. He should. In working actual problems, it is
usually easier to work with the tables than to use the methods we discuss
below. We are interested, however, in the existence of analytic methods
which do not require the use of tables.
There is no difficulty, in theory, in finding the sine of an angle if its
cosine is known since sin² α = 1 − cos² α, and the sine of any angle
between 0 and π is positive. There is, of course, some ambiguity in going
the other way. The real difficulty we find in trying to use the cosines of
the angles is in applying the fact that the sum of the angles of a triangle
is π. If we know cos α and cos β and we wish to find cos γ, then using the
results of the last section, we find
cos γ = cos [π − (α + β)]
= −cos (α + β);
but how do we find cos (α + β) in terms of cos α and cos β?
The answer is in the use of the trigonometric addition formulas. These
are
cos (α + β) = cos α cos β − sin α sin β,
sin (α + β) = sin α cos β + cos α sin β. (2-34)
Let us now prove the first of these.
In Fig. 2-21(a), we show the unit circle cut by rays so that the value
of the angle from OP to OA is α and the value of the angle from OA to
OB is β. Then the value of the angle from OP to OB is α + β. We
assume that the point P has coordinates (1, 0). Then the point B has
coordinates (cos (α + β), sin (α + β)), and therefore,
|PB|² = [cos (α + β) − 1]² + sin² (α + β)
= cos² (α + β) + sin² (α + β) + 1 − 2 cos (α + β)
= 2 − 2 cos (α + β).

Figure 2-21

Consider the same figure, but rotated so that the point A is moved to
A' on the x-axis. In the resulting figure, Fig. 2-21(b), the point P' has
coordinates (cos (−α), sin (−α)) = (cos α, −sin α), and the point B' has
coordinates (cos β, sin β). Therefore, we can compute
|P'B'|² = (cos β − cos α)² + (sin β + sin α)²
= cos² β + sin² β + cos² α + sin² α − 2 cos α cos β + 2 sin α sin β
= 2 − 2[cos α cos β − sin α sin β].

However, |PB|² = |P'B'|², and hence we can conclude that

cos (α + β) = cos α cos β − sin α sin β,

which is what we wished to prove.
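The key step of this proof, the equality of the two squared chord lengths, can be checked numerically. The sketch below is only a sanity check under floating-point arithmetic, with hypothetical function names; it is not a substitute for the argument above.

```python
import math

def chord_sq_original(alpha, beta):
    """|PB|^2 with P = (1, 0) and B = (cos(alpha+beta), sin(alpha+beta))."""
    bx, by = math.cos(alpha + beta), math.sin(alpha + beta)
    return (bx - 1) ** 2 + by ** 2

def chord_sq_rotated(alpha, beta):
    """|P'B'|^2 with P' = (cos alpha, -sin alpha) and B' = (cos beta, sin beta)."""
    return ((math.cos(beta) - math.cos(alpha)) ** 2
            + (math.sin(beta) + math.sin(alpha)) ** 2)

alpha, beta = 0.7, 2.1
print(chord_sq_original(alpha, beta), chord_sq_rotated(alpha, beta))
# Both sides of the first addition formula in (2-34):
print(math.cos(alpha + beta),
      math.cos(alpha) * math.cos(beta) - math.sin(alpha) * math.sin(beta))
```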


It is useful to obtain a similar formula for the cosine of the difference
of two angles. This is easily done as follows:

cos (α − β) = cos [α + (−β)]
= cos α cos (−β) − sin α sin (−β)
= cos α cos β + sin α sin β.

The second formula in (2-34) can now be obtained with the help of
formula (2-22),

sin (α + β) = cos [π/2 − (α + β)]
= cos [(π/2 − α) − β]
= cos (π/2 − α) cos β + sin (π/2 − α) sin β
= sin α cos β + cos α sin β,

which is the second equation of (2-34).

Thus we see that if the values of cos α and cos β are known for two
angles in a triangle, then the cosine of the third angle is given by
cos γ = −cos (α + β)
= −cos α cos β + sin α sin β.
In using this result, it is necessary to recall that the sine of an angle in a
triangle is always taken to be positive.
As an example of the use of these equations, let us find the missing sides
and angles of the triangle in which a = 1, cos β = 11/16, and cos γ = −1/4.
We see that we can use the law of sines to determine the missing sides
once we have determined the third angle. To this end we compute
sin β = [1 − (11/16)²]^(1/2) = [135/256]^(1/2) = (3/16)√15,
sin γ = [1 − 1/16]^(1/2) = (1/4)√15,
cos α = −cos β cos γ + sin β sin γ
= 11/64 + 45/64
= 7/8.
We could compute sin α from this, or from
sin α = sin [π − (β + γ)]
= sin (β + γ)
= sin β cos γ + cos β sin γ
= −(3/64)√15 + (11/64)√15
= (1/8)√15.
With these values, we can then use the law of sines to compute

b = a sin β / sin α = 3/2
and
c = a sin γ / sin α = 2.
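The steps of this kind of computation can be collected into a short Python sketch. The function name is hypothetical, and the sample inputs a = 1, cos β = 11/16, cos γ = −1/4 are an assumption of this illustration, chosen because they lead to the sides 3/2 and 2 obtained above.

```python
import math

def solve_from_two_cosines(a, cos_beta, cos_gamma):
    """Given the side a and the cosines of the other two angles, return
    (cos_alpha, b, c) using the addition formulas and the law of sines."""
    # Sines of angles of a triangle are taken to be positive.
    sin_beta = math.sqrt(1 - cos_beta ** 2)
    sin_gamma = math.sqrt(1 - cos_gamma ** 2)
    cos_alpha = -cos_beta * cos_gamma + sin_beta * sin_gamma   # -cos(beta + gamma)
    sin_alpha = sin_beta * cos_gamma + cos_beta * sin_gamma    # sin(beta + gamma)
    b = a * sin_beta / sin_alpha
    c = a * sin_gamma / sin_alpha
    return cos_alpha, b, c

print(solve_from_two_cosines(1.0, 11 / 16, -1 / 4))
```

The sketch does not check that β + γ < π; inputs violating this give a nonpositive `sin_alpha` and do not describe a triangle.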

PROBLEMS

1. Let a right triangle have sides a, b, and c opposite the angles α, β, and γ
respectively, γ being the right angle. For each of the following write an
expression for the required side in terms of the given quantities.
(a) Given a = 7, α = π/4. Find c. (b) Given a = 3, β = 3π/5. Find b.
(c) Given a = 5 and α. Find b. (d) Given c = 7 and α. Find a.
(e) Given c and β. Find a. (f) Given b and α. Find c.
(g) Given a and b. Find c.
2. In an equilateral triangle, all three sides are of the same length and all three
angles are π/3.
(a) Use (2-33) to find cos π/3.
(b) Find sin π/3.
(c) Use (2-22) to find sin π/6 and cos π/6.
In each of the following problems, a, b, and c are the sides of a triangle opposite
the angles α, β, and γ respectively. Using the values given, find the lengths of
the missing sides and the cosines of the missing angles. If there is more than one
triangle satisfying the given conditions, give the values for each of the possible
triangles. If there are no triangles satisfying the given conditions, explain why.
3. (a) a = 2, b = 4, c = 5 (b) a = 2, b = 11, c = 10
(c) a = 3, b = 4, c = 6 (d) a = 3, b = 8, c = 4
4. (a) a = 5, 6 = 6, cos 7 = H (b) b = 7, c = 6, cos a = 7
T2
(c) a = 2, c = 4, cos ft = —A (d) a = 1, c = 2, cos 3 = i
5. (a) a = 3, _
b = 5, cos a — 3
5 (b) a = 7, c = 6, cos a = 3
3
(c) a = 2, 6 = 3, cos 3 = i (d) b = 5, c = 4, cos T = 3
5
(e) a = 3, _ 13
b = 6, cos a “ F5

6. (a) c = 5, cos a = 13 cos/S = 37

(b) b = 6, cos a = 23
72» cosT = 7
12
(c) a = 4, cos/3 = 15
16» cosT = 1
16
(d) b = 10, cos/3 = 12 cosT = 5
13

7. Construct another proof of the law of sines along the following lines: Let
A, B, and C be the vertices of a triangle with angles $\alpha$, $\beta$, and $\gamma$ at these
vertices respectively. Let O be the center of the circle circumscribed about
the triangle. For some vertex, say A, choose another vertex, say B, and draw
the line from B through the center O of the circle to intersect the circle again
at A'. Draw A'C. Show that the angle BA'C is the same as $\alpha$. Show that
the triangle BA'C is a right triangle. Compute |BC| in terms of $\alpha$ and the
diameter of the circle. Repeat the same process for the other vertices and use
the results to obtain the law of sines.
3
Vectors
3-1 CARTESIAN COORDINATES IN THREE-DIMENSIONAL SPACE

Just as the geometry of the plane is characterized by two real coordinates,
so the geometry of three-dimensional space can be characterized by three
real coordinates. In order to do this, we need to have a fixed set of co­
ordinate axes in space. These may conveniently be chosen to be a set
of three mutually perpendicular lines intersecting in a single point.
Suppose that we have such a set of lines. We will assume that each is
directed and that we have a fixed unit of distance in our three-dimensional
space which can be applied to each of the lines to make it a coordinate line
or axis. On each of the lines, a point is determined by a single real number,
or coordinate. Through that point a unique plane can be constructed
perpendicular to that coordinate axis. And since the three coordinate lines
are mutually perpendicular, three points, one on each axis, determine three
mutually perpendicular planes. Any two of these planes intersect in a
line, and the third cuts this line at a point. In this manner three real
numbers determine three distinct points on the axes, which in turn de­
termine a unique point in the three-dimensional space (see Fig. 3-1).

Conversely, given any point in space, it determines three planes,
respectively perpendicular to the three axes, and these three planes cut the
axes in unique points which then correspond to three coordinates. There­
fore, there is a one-to-one correspondence between the points of space
and sets of three real numbers.

Note that we are making use of the properties of euclidean three-dimensional
space. If we start from the axioms of euclidean geometry, it
can be shown that three mutually perpendicular directed lines do exist
and that the above discussion is valid. This is not done here since our
point of view is going to be the opposite. We will treat the mathematical
system we develop as an independent entity, which can be thought of as
a model of euclidean geometry. Throughout the discussion, however, we
will base this development on the comparison with euclidean space as
we usually understand it. The discussion should help the student to see
the relationship between mathematics and the physical world. While
the student should concentrate on building a geometric picture of the
mathematical system, he should also realize that the validity of this
picture is not being proved. Indeed, it is more accurate to say that we
are postulating the fact that the geometry of space is determined by the
mathematical system which we shall develop.
There is one point, however, on which agreement must be reached
before a physical picture of the mathematical system can be said to be
known. Suppose we have three mutually perpendicular directed lines
passing through a single common point, and we wish to label these as the
axes of our system, say the x-, y-, and z-axes. There are six possible ways
of assigning these labels (why?), but these six fall into only two groups
which need to be distinguished.
Figure 3-2

Note the three ways of labeling the axes shown in (a), (b), and (c) of
Fig. 3-2. Each can be changed into any other by the rigid euclidean
motion of rotation; however, the arrangement shown in (d) cannot be so
transformed without using reflection. For, if we leave the x-axis fixed
and attempt to transform Fig. 3-2(d) into Fig. 3-2(b) by rotating about
the x-axis until the y-axes coincide, then the z-axes of the two figures
will be pointing in opposite directions.
Extend the index finger of your right hand, hold the thumb perpendicular
to the index finger but in the same plane as the rest of the hand,
and turn the middle finger inward so that it is perpendicular to the palm
and both the index finger and thumb. In this position, the hand can be
rotated so as to bring the thumb into the position of the positive x-axis, the
index finger into the position of the positive y-axis, and the middle finger
into the position of the positive z-axis for any of the three arrangements
shown in Fig. 3-2(a), (b), or (c). However, the arrangement of Fig. 3-2(d)
cannot be obtained. This arrangement of axes would require the left hand.
Henceforth, we will agree on the arrangement of axes shown in Figs. 3-2(a),
(b), and (c). This is called a right-handed coordinate system, since the
thumb and the first two fingers of the right hand can be put into the
positions of the x-, y-, and z-axes respectively as described above.
Although this convention is immaterial to the mathematical development,
it would be essential in any consideration of applications to know exactly
which system was being used.
Let us assume now that we have a fixed right-handed coordinate system.
As described above, three coordinates serve to identify uniquely one point
in space. We shall use the convention of writing down the coordinates in
the order of the x-, y-, z-coordinates respectively and enclosing the
resulting triple of numbers in parentheses. Thus a triple of numbers such
as $(1, 2, -1)$ represents the coordinates of some point, but we will actually
go further and identify this triple with the point. That is, when it is
understood that a coordinate system has been fixed, we may speak of
a point $(x_1, y_1, z_1)$ rather than having to say the point with coordinates
$(x_1, y_1, z_1)$. Moreover, we will use a single letter to label the point. This
letter then would represent both the point and the triple of numbers
giving the coordinates of the point.

Definition 3-1. A point of three-dimensional space is an ordered triple
of numbers. Points will be denoted by capital italic letters, e.g.,
$P = (x_1, y_1, z_1)$. The particular point $(0, 0, 0)$ is called the origin.

This definition is the start of the construction of a mathematical model
of three-dimensional space. We are not saying that the points of the
space in which we live are really triples of numbers, but that the set of
all triples of numbers can be used to represent this space. The mathematical
model will give us something that we can work with algebraically. It is
actually three-dimensional cartesian space with a fixed intrinsic coordinate
system.
Suppose we have two points given, say $P_1 = (x_1, y_1, z_1)$ and
$P_2 = (x_2, y_2, z_2)$. We would like to give a formula for the distance between
these two points which would coincide with the euclidean concept of
distance. Observe the set of points shown in Fig. 3-3. It is clear from the
figure how these points are obtained, but they can be completely defined
in terms of their coordinates. Thus $A = (x_1, y_1, 0)$, $B = (x_2, y_2, 0)$,
$C = (x_1, y_2, 0)$, $D = (x_2, y_2, z_1)$, and $E = (x_1, y_2, z_1)$. (Verify all of
these.) The points $P_1$ and $E$ lie on a line parallel to the y-axis since only
their y-coordinates differ. The distance between these two points is
therefore the absolute value of the difference of their y-coordinates, or
$|y_1 - y_2|$. Likewise the distance between the points $D$ and $E$ is $|x_1 - x_2|$.
The triangle $P_1ED$ is a right triangle, and hence from the Pythagorean
theorem, the distance between $P_1$ and $D$ would be
\[
[(x_1 - x_2)^2 + (y_1 - y_2)^2]^{1/2}.
\]
Similarly, the triangle $P_1P_2D$ is a right triangle, the distance between $P_2$
and $D$ is $|z_1 - z_2|$, and the Pythagorean theorem finally gives the distance
between the points $P_1$ and $P_2$. Thus we have:

Definition 3-2. Given two points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$,
the distance between them is defined to be
\[
|P_1P_2| = [(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2]^{1/2}.
\]

Thus, for example, if $A = (7, 3, -9)$ and $B = (11, -5, -8)$, then
\[
\begin{aligned}
|AB| &= [(7 - 11)^2 + (3 + 5)^2 + (-9 + 8)^2]^{1/2} \\
     &= [(-4)^2 + 8^2 + (-1)^2]^{1/2} \\
     &= [16 + 64 + 1]^{1/2} \\
     &= 9.
\end{aligned}
\]
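
The distance formula of Definition 3-2 translates directly into a few lines of Python. The following is a minimal sketch, with the function name distance chosen here purely for illustration; points are represented as ordinary tuples of three numbers.

```python
import math

def distance(p, q):
    """Distance between two points of three-dimensional space,
    each given as a triple of numbers (Definition 3-2)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

A = (7, 3, -9)
B = (11, -5, -8)
print(distance(A, B))  # 9.0
```

Note that the function is symmetric in its two arguments, as a distance should be, since each difference is squared.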

From this distance formula we may obtain the algebraic conditions
satisfied by the points on a sphere. Since by definition a sphere is the set
of all points which are at some fixed distance from a fixed point, we have:

Definition 3-3. A sphere with center $(x_0, y_0, z_0)$ and radius $R > 0$ is
\[
\{(x, y, z) \mid (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2\}.
\]
For short, we say that
\[
(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2 = R^2
\]
is the equation of the sphere.

From this formula, it is easy to write down the equation of any sphere
if we are given its center and the radius. Conversely, if we are given an
equation which can be brought into this form, we can recognize it as the
equation of a sphere and obtain the center and radius.
The equation of the sphere of radius 4 whose center is located at the
point $(1, 3, -2)$ is thus
\[
(x - 1)^2 + (y - 3)^2 + (z + 2)^2 = 16,
\]
or
\[
x^2 + y^2 + z^2 - 2x - 6y + 4z - 2 = 0.
\]

On the other hand, if we are given the equation
\[
x^2 + y^2 + z^2 + 3x - 6z + 11 = 0,
\]
we can identify this as the equation of a sphere by completing the square:
\[
\begin{aligned}
x^2 + 3x + y^2 + z^2 - 6z &= -11, \\
x^2 + 3x + \tfrac{9}{4} + y^2 + z^2 - 6z + 9 &= -11 + \tfrac{9}{4} + 9, \\
(x + \tfrac{3}{2})^2 + y^2 + (z - 3)^2 &= \tfrac{1}{4}.
\end{aligned}
\]
Therefore, we conclude that this is the equation of a sphere of radius $\tfrac{1}{2}$
whose center is located at the point $(-\tfrac{3}{2}, 0, 3)$.
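
Completing the square can be carried out mechanically. The sketch below assumes an equation already written as $x^2 + y^2 + z^2 + Bx + Cy + Dz + E = 0$; the function name sphere_from_equation and the coefficient names are illustrative choices, not notation fixed by the text. Exact fractions are used for the center so that the result can be compared with the hand computation.

```python
from fractions import Fraction
import math

def sphere_from_equation(B, C, D, E):
    """For x^2 + y^2 + z^2 + Bx + Cy + Dz + E = 0, complete the square and
    return (center, radius), or None if the equation is not that of a sphere."""
    B, C, D, E = (Fraction(v) for v in (B, C, D, E))
    center = (-B / 2, -C / 2, -D / 2)
    # Right-hand side after completing the square in x, y, and z
    r_squared = (B * B + C * C + D * D) / 4 - E
    if r_squared <= 0:
        return None  # no points, or a single point: radius must be positive
    return center, math.sqrt(r_squared)

# The example from the text: x^2 + y^2 + z^2 + 3x - 6z + 11 = 0
center, radius = sphere_from_equation(3, 0, -6, 11)
print(center, radius)  # center (-3/2, 0, 3), radius 0.5
```

The test on r_squared reflects the requirement $R > 0$ in Definition 3-3: when the right-hand side is zero or negative, the equation does not describe a sphere.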
The concept of translation will prove to be most important in our
development. Let us investigate how this concept fits into the algebraic
model we are developing. A translation (or parallel translation) of the
points of space is a rigid motion of the points (that is, a motion which
leaves the distance between any two points unchanged) and is such that
every point is moved exactly the same distance and in the same direction.
Suppose that under such a translation the point which is originally at
the origin is moved to the point $(h, j, k)$, and that a point $(x, y, z)$ is moved
to the point $(x', y', z')$. What is the relationship between these sets of
coordinates? See Fig. 3-4. Through the point $(h, j, k)$ we have drawn
dashed lines parallel to the original axes. The point $(x', y', z')$ will be a
signed distance $x$ from the plane through the point $(h, j, k)$ perpendicular
to the x-axis. But this plane cuts the x-axis at the point with coordinate $h$.
Therefore, $x'$, the first coordinate of $(x', y', z')$, must be $x + h$. Similarly,
we see that $y' = y + j$ and $z' = z + k$. This then is the background
for the following definition:

Definition 3-4. A translation of the points of three-dimensional space is
a function whose domain and range are the space, and for which there
are three real numbers $h$, $j$, and $k$ such that if the translate of a point
$(x, y, z)$ is the point $(x', y', z')$, then
\[
\begin{aligned}
x' &= x + h, \\
y' &= y + j, \\
z' &= z + k.
\end{aligned} \tag{3-1}
\]

A translation is therefore a mapping which carries the points in one
copy of three-dimensional space into the points of another copy of the
same space. We might prefer to think of there being only a single
three-dimensional space, and the translation as being a motion of the points of
this space. Every point is moved from its original position to its new
translated position.
An entirely different point of view can also be taken. Our definition
of the points of space as being triples of numbers is merely a useful
mathematical device. The more “physical” point of view would be that
three-dimensional space has an intrinsic existence, and that the coordinates of a
point result from the (arbitrary) introduction of a coordinate system.
In this picture, a translation results from the introduction of a new
coordinate system in a “translated position.” The translation given by
Eqs. (3-1) would be obtained by introducing the new x'-, y'-, z'-coordinate
system with its axes parallel to the original axes, but all intersecting at
the point $(-h, -j, -k)$, where the coordinates of this point are given
in terms of the original coordinate system. Whatever geometric point of
view is taken, the algebraic form of a translation is the same. It is given
by Eqs. (3-1).

Let us note that translation as defined above actually does preserve the
distance between points. If we have two points, $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$,
which are translated to $(x_1', y_1', z_1')$ and $(x_2', y_2', z_2')$, then
\[
x_1' = x_1 + h, \qquad x_2' = x_2 + h,
\]
but then
\[
x_2' - x_1' = (x_2 + h) - (x_1 + h) = x_2 - x_1.
\]
In the same way, we can see that
\[
y_2' - y_1' = y_2 - y_1, \qquad z_2' - z_1' = z_2 - z_1,
\]
and inserting these values into the distance formula of Definition 3-2,
we find that the distance between the translated points is the same as
the distance between the original points.
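
The invariance of distance under translation can also be checked numerically on a particular pair of points. The sketch below is only an illustration of Eqs. (3-1), not a proof; the function names translate and distance, and the sample shift $(h, j, k) = (2, -5, 4)$, are assumptions made for the example.

```python
import math

def distance(p, q):
    """Distance formula of Definition 3-2."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def translate(p, h, j, k):
    """Translate the point p = (x, y, z) according to Eqs. (3-1)."""
    x, y, z = p
    return (x + h, y + j, z + k)

P, Q = (1, 1, 2), (3, 7, 5)
P_moved = translate(P, 2, -5, 4)
Q_moved = translate(Q, 2, -5, 4)
print(distance(P, Q), distance(P_moved, Q_moved))  # 7.0 7.0
```

Because the coordinate differences are unchanged by the shift, the two printed distances agree exactly.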
It is important to note the distinction between a point as a physical
location and a point as an algebraic entity. In our intuitive discussions,
we think of a point as a physical location, that is, as a “geometric point.”
In our formal mathematical development, we will mean the algebraic
entity of Definition 3-1. Our development assumes a complete
correspondence between these two points of view.

PROBLEMS

1. Find the distance between the following pairs of points:
(a) $(1, 1, 2)$ and $(3, 7, 5)$
(b) $(1, 0, 7)$ and $(6, 2, -1)$
(c) $(-1, 4, -5)$ and $(5, -3, -8)$
2. Find the coordinates of a point whose distance from the origin is $\sqrt{6}$, and
whose distances from the points $(0, 0, 1)$ and $(2, 2, 2)$ are $\sqrt{3}$ and $\sqrt{2}$
respectively. Is there more than one such point? If so, how many such points
are there, and what are their coordinates?
3. Write the equations of the spheres:
(a) with radius 5 and center (3, 1, —2)
(b) with radius 2 and center (1, 0, 1)
(c) with radius 10 and center (8, —6, 0)
4. Show that the equation of any sphere can be brought into the form
\[
x^2 + y^2 + z^2 + Bx + Cy + Dz + E = 0,
\]
and that any equation of the form
\[
Ax^2 + Ay^2 + Az^2 + Bx + Cy + Dz + E = 0
\]
with $A \neq 0$ can be brought into the form of the equation of a sphere. State
the conditions under which this would be the equation of a sphere. If it is a
sphere, what is its center and radius?
5. Find the center and radius of each of the spheres with the following equations:
(a) $x^2 + y^2 + z^2 - 6x + 2y + 4z + 5 = 0$
(b) $2x^2 + 2y^2 + 2z^2 + 4x - 20z + 32 = 0$
(c) $3x^2 + 3y^2 + 3z^2 - 5x + 6y - 12z - 8 = 0$
6. Find the equation, center, and radius of the sphere which passes through
the four points (3, 0, 4), (—1, 3, —1), (—2, 0, —1), and (3, —4, 2).
7. Find the equation of the sphere with center (1,5, —2) which passes through
the point (2, 0, 4).
8. What is the result of two successive translations? How could this problem
be handled algebraically?
9. If B is a set of points which satisfy an equation $f(x, y, z) = 0$, what equation
is satisfied by the set of points obtained from B by a translation? Apply
this to the equation of a sphere. What happens to a sphere when it is
translated?

3-2 DIRECTION COSINES AND DIRECTION NUMBERS

By a ray we mean a half-line; that is, the set of all points which are on
one side of a given point on a line. A sense of direction is automatically
assigned to the ray, the positive direction being away from the given
point. This description is not adequate in terms of the algebraic
framework we are erecting, for as yet we do not know (algebraically) what a
straight line is. Again, we proceed by letting our geometric intuition be
our guide. Let us consider the points on a ray originating from the origin.
Let a point $P = (l, m, n)$, not the origin, be assumed to be on the ray
we wish to discuss and let $X = (x, y, z)$ be any other point on the ray.
From these two points drop lines perpendicular to the x-axis. The one
from $P$ will meet the x-axis at the point with coordinate $l$, the other at
the point with coordinate $x$. Suppose now that $l \neq 0$. (See Fig. 3-5.)
The two right triangles formed are similar and hence the ratio of $|x|$ to $|l|$
is the same as the ratio of the distance $|OX|$ to $|OP|$. Let $|OX|/|OP| = t$.
Then $|x|/|l| = t$. However, these two points on the x-axis are both on
the same side of the origin and we can conclude that $x/l = |x|/|l|$, and
so $x = lt$.
In case $l$ happened to be zero, the ray must have been perpendicular
to the x-axis, in which case $x = 0$ and the equation $x = lt$ would still hold.

In the same way, we could drop perpendiculars to the y- and z-axes
and conclude that $y = mt$ and $z = nt$, the ratio $t$ being the same in each
case. The above analysis then is the motivation for the definition:

Definition 3-5. A ray from the origin through the point $(l, m, n)$, not
the origin, is
\[
\{(x, y, z) \mid x = lt,\ y = mt,\ z = nt, \text{ for all } t \geq 0\}.
\]
The coordinates of any point of the ray, other than the origin, are called
a set of direction numbers for the ray.

The reader can try to exercise his critical sense on this definition. Al­
though complete as it stands, it raises an important question. After trying
to discover this question, check with Problem 3 at the end of this section.
Note that the number $t$ which appears in this definition is the ratio
$|OX|/|OP|$. This fact can be used, for example, to find the point two-thirds
of the way from the origin to the point $(6, -9, 5)$. The desired point
would be $(4, -6, \tfrac{10}{3})$. Why?
Suppose now we have a ray, with direction numbers $l$, $m$, $n$. Let this
ray cut the unit sphere (the sphere of radius 1, centered at the origin,
with equation $x^2 + y^2 + z^2 = 1$) at the point $(\lambda, \mu, \nu)$. We might remark
here that the postulates of euclidean geometry are sufficient to show
that this point exists,* but the proof is quite difficult. In our case, this
is easy to show. We merely need to set $t = 1/[l^2 + m^2 + n^2]^{1/2}$ in the
definition above. It is simple to verify that the point $(\lambda, \mu, \nu)$ with
\[
\lambda = \frac{l}{[l^2 + m^2 + n^2]^{1/2}}, \qquad
\mu = \frac{m}{[l^2 + m^2 + n^2]^{1/2}}, \qquad
\nu = \frac{n}{[l^2 + m^2 + n^2]^{1/2}}, \tag{3-2}
\]
is common to the ray and the unit sphere.

* Euclid’s postulates are not sufficient to show this, but later extensions,
such as the postulate system of Hilbert, are sufficient for a rigorous proof of
this fact.



It is clear geometrically that there is only one such point on each ray.
Conversely, we see that there is a unique ray from the origin through each
point of the unit sphere. Thus, the three coordinates of this point on the
unit sphere suffice to determine the ray completely.
A given ray determines three angles, one between it and each of the
positive directions along the axes. Let $\alpha$, $\beta$, and $\gamma$ be the values of these
angles, as shown in Fig. 3-6. Consider $\alpha$, the value of the angle between
the positive direction of the x-axis and the ray. The perpendicular from
the point $(\lambda, \mu, \nu)$ to the x-axis meets the x-axis at the point with
coordinate $\lambda$. But at the same time, since the length of the segment from the
origin to $(\lambda, \mu, \nu)$ is one, this point on the x-axis also has coordinate $\cos\alpha$.
That is, $\lambda = \cos\alpha$. (See Fig. 3-7.) In exactly the same way, if $\beta$ and $\gamma$
are the values of the angles between the ray and the y-axis and z-axis
respectively, then it is seen that $\mu = \cos\beta$ and $\nu = \cos\gamma$. This leads to
the following definition:

Definition 3-6. If $(l, m, n)$ is a set of direction numbers for a ray, then
the set of numbers
\[
\lambda = \frac{l}{[l^2 + m^2 + n^2]^{1/2}}, \qquad
\mu = \frac{m}{[l^2 + m^2 + n^2]^{1/2}}, \qquad
\nu = \frac{n}{[l^2 + m^2 + n^2]^{1/2}}
\]
is called the set of direction cosines of the ray.

Figure 3-7

For example, the ray from the origin through the point $(8, -1, -4)$ has
direction numbers $(8, -1, -4)$, or $(16, -2, -8)$, or $(4, -\tfrac{1}{2}, -2)$, or any
other set of positive multiples. But this ray has only the single set of
direction cosines, $(\tfrac{8}{9}, -\tfrac{1}{9}, -\tfrac{4}{9})$.
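
The passage from direction numbers to direction cosines in Definition 3-6 is a normalization, which the following Python sketch illustrates. The function name direction_cosines is a hypothetical choice made for this example.

```python
import math

def direction_cosines(l, m, n):
    """Direction cosines of the ray from the origin with
    direction numbers (l, m, n), as in Definition 3-6."""
    d = math.sqrt(l * l + m * m + n * n)
    if d == 0:
        raise ValueError("the origin does not determine a ray")
    return (l / d, m / d, n / d)

# Two sets of direction numbers for the same ray
first = direction_cosines(8, -1, -4)
second = direction_cosines(16, -2, -8)
print(first)
print(second)
```

Both calls print approximately (0.889, -0.111, -0.444), that is, $(\tfrac{8}{9}, -\tfrac{1}{9}, -\tfrac{4}{9})$ up to rounding, which illustrates that positive multiples of a set of direction numbers give the same direction cosines.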
Our next step is to introduce the notion of a directed line segment. The
essential idea is to consider a directed line segment as the translate of a
portion of a ray from the origin. For example, suppose $P = (l, m, n)$
is some point other than the origin. Then the set of all points on the ray
from $O$ through $P$ which lie between $O$ and $P$ is
\[
R = \{(x, y, z) \mid x = lt,\ y = mt,\ z = nt,\ 0 \leq t \leq 1\},
\]
and if we make a translation which carries the point at the origin to
$(x_0, y_0, z_0)$, then $R$ is translated to a set $R'$ which can be seen to be
\[
\begin{aligned}
R' &= \{(x', y', z') \mid x' = x + x_0,\ y' = y + y_0,\ z' = z + z_0, \text{ and } (x, y, z) \in R\} \\
   &= \{(x', y', z') \mid x' = x_0 + lt,\ y' = y_0 + mt,\ z' = z_0 + nt,\ 0 \leq t \leq 1\} \\
   &= \{(x, y, z) \mid x = x_0 + lt,\ y = y_0 + mt,\ z = z_0 + nt,\ 0 \leq t \leq 1\}.
\end{aligned}
\]
The last step here follows from the fact that the symbols used in defining
a set are “dummy variables.” The set is the same no matter what letters
are used for variables.
The ray from the origin through $P = (l, m, n)$ has direction numbers
$(l, m, n)$. We assign these same direction numbers to the directed line
segment obtained in this way by translation of the segment from $O$ to $P$.
Observe that these direction numbers are then merely the differences of
the values of the coordinates at the two ends of the segment. Indeed, if
$P_1 = (x_1, y_1, z_1)$ is the point corresponding to $t = 1$, then $x_1 = x_0 + l$
and hence $x_1 - x_0 = l$. Similarly, $y_1 - y_0 = m$ and $z_1 - z_0 = n$.

Definition 3-7. A directed line segment $P_0P_1$ whose initial point is at
$P_0 = (x_0, y_0, z_0)$ and whose terminal point is at $P_1 = (x_1, y_1, z_1)$ is
the set of points
\[
\{(x, y, z) \mid x = x_0(1 - t) + x_1 t,\ y = y_0(1 - t) + y_1 t,\
z = z_0(1 - t) + z_1 t, \text{ for all } t \text{ with } 0 \leq t \leq 1\},
\]
together with the sense of direction determined by increasing $t$. This
directed line segment is said to have direction numbers
\[
(x_1 - x_0,\ y_1 - y_0,\ z_1 - z_0)
\]
and length
\[
d = [(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2]^{1/2}.
\]
Its direction cosines are $\lambda = (x_1 - x_0)/d$, $\mu = (y_1 - y_0)/d$, and
$\nu = (z_1 - z_0)/d$.

For short, we will speak of the directed line segment $P_0P_1$ as being
from $P_0$ to $P_1$ rather than always referring to its initial and terminal points.
While the coordinates of any point on a ray from the origin form a set of
direction numbers for that ray, we say that a directed line segment whose
initial point is at the origin and whose terminal point is at $(l, m, n)$ has
direction numbers $(l, m, n)$ and only these. That is, a directed line
segment will have only a single set of direction numbers. Note also that
the length of a directed line segment is the square root of the sum of the
squares of its direction numbers, which is exactly our formula for the
distance between the initial and terminal points.
The direction numbers of a directed line segment determine completely
its direction in space and its length. Also, if we make any parallel trans­
lation, the direction numbers of a given directed line segment remain un­
changed. This is usually expressed by saying that the direction numbers
of a directed line segment are invariant under translation.
As an example, consider the directed line segment from $A = (7, -3, 2)$
to $B = (1, 0, 4)$. Its direction numbers are $(1 - 7, 0 + 3, 4 - 2)$ or
$(-6, 3, 2)$. The length of this line segment is $|AB| = [36 + 9 + 4]^{1/2} = 7$.
Therefore its direction cosines are $(-\tfrac{6}{7}, \tfrac{3}{7}, \tfrac{2}{7})$.
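
The three quantities attached to a directed line segment in Definition 3-7 can be computed together. The sketch below assumes the two endpoints are distinct; the function name segment_data is an illustrative choice and does not come from the text.

```python
import math

def segment_data(p0, p1):
    """Direction numbers, length, and direction cosines of the directed
    line segment from p0 to p1 (Definition 3-7). Assumes p0 != p1."""
    numbers = tuple(b - a for a, b in zip(p0, p1))
    length = math.sqrt(sum(c * c for c in numbers))
    cosines = tuple(c / length for c in numbers)
    return numbers, length, cosines

A = (7, -3, 2)
B = (1, 0, 4)
numbers, length, cosines = segment_data(A, B)
print(numbers, length)  # (-6, 3, 2) 7.0
print(cosines)          # approximately (-0.857, 0.429, 0.286)
```

Reversing the order of the endpoints changes the sign of every direction number and direction cosine but leaves the length unchanged, in keeping with the sense of direction carried by the segment.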
The parameter $t$ in Definition 3-7, just as in Definition 3-5, represents
a ratio of distances. In fact, making use of the definition of distance we
can prove

Theorem 3-1. Let $P_0 = (x_0, y_0, z_0)$ and $P_1 = (x_1, y_1, z_1)$ be two
distinct points. Let $t$ be any real number between zero and one. Then the
point
\[
X = (x_0(1 - t) + x_1 t,\ y_0(1 - t) + y_1 t,\ z_0(1 - t) + z_1 t) \tag{3-3}
\]
on the directed line segment $P_0P_1$ has the property $|P_0X|/|P_0P_1| = t$.

Proof: To prove this theorem, we merely need to compute
\[
\begin{aligned}
|P_0X|^2 &= (x_1 t - x_0 t)^2 + (y_1 t - y_0 t)^2 + (z_1 t - z_0 t)^2 \\
         &= t^2[(x_1 - x_0)^2 + (y_1 - y_0)^2 + (z_1 - z_0)^2] \\
         &= t^2|P_0P_1|^2,
\end{aligned}
\]
which is equivalent to the desired result.

Observe that the point $X$ given in this theorem can also be characterized
by the fact that it divides the line segment $P_0P_1$ in the ratio $t/(1 - t)$.
Noting this fact makes it easier for some students to remember (3-3) in
the form
\[
X = (x_0 + t(x_1 - x_0),\ y_0 + t(y_1 - y_0),\ z_0 + t(z_1 - z_0)). \tag{3-4}
\]
It is worthwhile learning this formula for the special case when $t = \tfrac{1}{2}$.
The point in question is then the midpoint of the line segment $P_0P_1$, and
from (3-3) we see that the coordinates of this midpoint are the averages
of the corresponding coordinates of the endpoints of the segment.
For example, the midpoint of the segment $AB$, where $A = (7, -3, 2)$
and $B = (1, 0, 4)$, is $C = (4, -\tfrac{3}{2}, 3)$. This result can be written down by
inspection, but other division points usually take more computation.
Thus, the point $D$, one-third of the way from $A$ to $B$, has coordinates
\[
(7 + \tfrac{1}{3}(-6),\ -3 + \tfrac{1}{3}(3),\ 2 + \tfrac{1}{3}(2)) = (5, -2, \tfrac{8}{3})
\]
as calculated from formula (3-4).
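
Formula (3-4) lends itself to a short computational sketch. The function name point_on_segment is hypothetical; exact fractions are used so that the results can be compared directly with the values worked out by hand.

```python
from fractions import Fraction

def point_on_segment(p0, p1, t):
    """Point X on the directed segment from p0 to p1 with
    |P0 X| / |P0 P1| = t, computed from formula (3-4)."""
    t = Fraction(t)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))

A = (7, -3, 2)
B = (1, 0, 4)
C = point_on_segment(A, B, Fraction(1, 2))  # midpoint
D = point_on_segment(A, B, Fraction(1, 3))  # one-third of the way from A to B
print(C)  # (4, -3/2, 3) as exact fractions
print(D)  # (5, -2, 8/3) as exact fractions
```

Taking t equal to 0 or 1 returns the initial or terminal point, which is a convenient sanity check on the formula.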
The above definition can be extended easily to define a straight line.
The formal discussion of straight lines will be postponed until Section 4-4,
but we give the definition here so that we will be able to speak of straight
lines if necessary.

Definition 3-8. Let $P_0 = (x_0, y_0, z_0)$ and $P_1 = (x_1, y_1, z_1)$ be two
distinct points. Then the straight line through $P_0$ and $P_1$ is
\[
\{(x, y, z) \mid x = x_0(1 - t) + x_1 t,\ y = y_0(1 - t) + y_1 t,\
z = z_0(1 - t) + z_1 t,\ t \text{ any real number}\}.
\]

PROBLEMS

1. What are the direction cosines of the directed ray from the origin through the
following points?
(a) (2, 6, 3) (b) (7, 3, -5)
(c) (-1,-1, 5) (d) (2,-2,-1)
2. Find the direction numbers, length, and direction cosines of the directed line
segments:
(a) From (3, 1, 7) to (-2, 5, 3) (b) From (1, 1, 1) to (7, 2, 5)
(c) From (0, 1, 1) to (1, 0, 1) (d) From (1, 1, 1) to (—1, 0, —1)
3. Let $R$ be the ray from the origin through the point $(l, m, n)$. Let $(l', m', n')$
be any point (other than the origin) on this ray. Prove that the ray, $R'$,
from the origin through the point $(l', m', n')$ is identical to $R$.
Remark: Definition 3-5 defines a ray as a certain set of points, depending on
a given point. What you are asked to show is that the resulting set of points
is the same, no matter what point of the ray we start with. Note that in order
to show that two sets R and R' are the same, you must show that any point
in R is also in R' and, conversely, that any point in R' is also in R.
4. In each of the coordinate axes, the set of all points which have nonnegative
coordinates constitutes a ray. What are the direction cosines of these three
rays?

5. Show that the direction numbers of a directed line segment are invariant
under translation.
6. Find the midpoints of the directed line segments of Problem 2. Also find the
points which are J and § of the distance from the initial point to the terminal
points.
7. Let $R$ be a ray from the origin through $(l, m, n)$. Let $\lambda$, $\mu$, and $\nu$ be defined
as in (3-2).
(a) Show that $(\lambda, \mu, \nu)$ is at distance 1 from the origin.
(b) Show that any other point of the ray is at a distance other than 1 from
the origin. [Hint: Use the result of Problem 3 above to express the points
of the ray in terms of $\lambda$, $\mu$, and $\nu$.]
8. Let $A = (a_1, a_2, a_3)$, $B = (b_1, b_2, b_3)$, and $C = (c_1, c_2, c_3)$ be three distinct
noncollinear points in space. Consider these three points as the vertices of a
triangle and let A', B', and C' be the midpoints of the sides opposite the
vertices A, B, and C, respectively. The line segments AA', BB', and CC' are
therefore the medians of the triangle. Let A", B", and C" be the points
two-thirds of the way from A to A', B to B', and C to C', respectively.
(a) Find the coordinates of A', B', and C'.
(b) Find the coordinates of A", B", and C". What can you conclude?
9. What happens to the conclusion of Theorem 3-1 if the point $X$ in (3-3) is
such that $t$ does not lie between 0 and 1? State and prove a theorem about
the location of $X$ in relation to $P_0$ and $P_1$ for $t > 1$ or $t < 0$.

3-3 VECTORS

The concept of a vector is a consequence of physical fact. It has long been
observed that a single number is insufficient to characterize certain physical
phenomena. For example, a moving object may have a known speed,
but until we also know its direction we cannot say that we can describe
its motion. Similarly, force requires both a number (the magnitude of
the force) and a direction (the direction of application of the force) to
characterize it.
Physicists long ago found it useful to introduce a single symbol to repre­
sent a quantity, such as force, which has both a magnitude and a direction.
They called such a quantity a vector. From the observed physical behavior
of such quantities, algebraic operations on vectors were defined.
For example, if F represented a certain force, say a force of 100 dynes
directed straight downward, it was found useful to be able to represent a
force of some different magnitude, say 200 dynes, but still in the same
direction. It seemed logical to write this second force as 2F, the number 2
being thought of as doubling the force without changing its direction. Such
a multiple of a vector was called a scalar multiple of a vector, pure num­
bers being called scalars to distinguish them from vectors.

A second algebraic operation, that of vector addition, was defined in
terms of the result of applying two forces simultaneously. If a force is
applied to an object, that object will move unless a second force of exactly
equal magnitude but pointed in the opposite direction is also applied at
the same time. A simple physical experiment serves to show how simul­
taneous forces must be combined. Suppose two strings are tied together
at a point P, and a weight is suspended by a third string from the same
point P (see Fig. 3-8). The forces on the point P are applied through the
strings, and hence must be in the direction of the strings. The weight exerts
a known force through the vertical string on the point P. Since the point
P is not moving, the resultant of the forces exerted by the two supporting
strings must exactly match the downward force exerted by the weight.
That is, it must be as shown by the dashed arrow in Fig. 3-8, where the
length of the arrow represents the magnitude of the force.


Figure 3-8

If the forces exerted by the two supporting strings are measured by
introducing spring balances into the strings, for example, it is found that
the forces line up as shown in the insert of Fig. 3-8. In other words, if we
draw lines in the direction of the forces, whose lengths are equal (in some
units) to the magnitudes of the forces, then the resultant of these two
forces is represented by the line which forms the diagonal of the parallelo­
gram determined by the given lines.
Such a diagram, called in physics the parallelogram of forces, determines
the way in which two forces combine. The same diagram is then used to
define (physically) the sum of two vectors.
While the above description of vectors in physical terms is quite satis­
factory for the use that is made of them in elementary physics, it leaves
much to be desired mathematically. In particular, it has been found de­
sirable to generalize the idea of a vector to situations in which it is im­
possible to give an accurate definition of what is meant by “direction.”
For this reason we would like to give a mathematical definition of a vector,
which can then be used without reference to physical intuition.
To a mathematician, the only really satisfactory way to introduce
vectors is by the axiomatic method, since only in this manner can he make
the broad generalizations, which have been found so useful. This method
of introduction corresponds exactly to what is frequently called an “op­
erational definition”; that is, vectors would be defined only in terms of
how they behave. A set of postulates would be given and a vector would
be anything that satisfied these postulates.
Here, however, we will begin in a more modest manner and merely try
to define something which corresponds to the physical description of a
vector. At a later stage we may then return to obtain the abstract defini­
tion. The above physical description will be held in mind throughout.
The mathematical development will be motivated by this physical de­
scription.
Let us consider then what is required. The physical description asks
for a quantity with both direction and magnitude. We may take our clue
from the arrows that physicists use to represent vectors and observe that
a directed line segment satisfies our requirements. There is still a diffi­
culty, however. Two directed line segments with the same directions and
lengths may be distinct, yet they would represent the same vector. This
could be taken care of by using only directed line segments with initial
points at the origin, but it is convenient to be able to associate vectors with
arbitrary directed line segments. All we need to do is find a property of a
directed line segment which determines its direction and magnitude but
which is independent of translation. The set of direction numbers of the
directed line segment satisfies this requirement.

Definition 3-9. A vector is a triple of numbers. Vectors will be denoted
by boldface letters. The vector $\mathbf{0} = [0, 0, 0]$ is called the zero vector.
The three numbers $a_1$, $a_2$, and $a_3$ are called the components of $\mathbf{A}$. Given
a vector $\mathbf{A} = [a_1, a_2, a_3]$, the quantity

$$|\mathbf{A}| = [a_1^2 + a_2^2 + a_3^2]^{1/2}$$

is called the magnitude of $\mathbf{A}$.

In this text, we follow the common practice of indicating vectors by
means of boldface type. In handwritten work it is difficult to indicate
boldface letters; so some other convention is usually used. Most students
find it easiest to indicate a vector by placing an arrow or bar over the
letter, but any consistent usage is satisfactory.
Note that by virtue of the above definition, if two vectors differ in any
one of their components, they are different. We therefore write A = B
if and only if the two triples of numbers are identical.
If we compare the definition of a vector with the definition of a point
in space, we see that they are identical. How can this be? Surely they
must be different? As surprising as it might seem, there is in reality no
essential difference. The set of all vectors, as defined here, constitutes a
vector space which we can identify in a natural manner with our three-
dimensional euclidean space. We could make this identification complete
and consider only one space, but because of the possible applications,
it is useful to have both the three-dimensional space and the vector space
as two distinct entities. On the other hand, many of the properties of
three-dimensional space can be expressed most easily in terms of vectors,
so it would be convenient to be able to make this identification when
needed. For these reasons, we will introduce the following conventions.

Convention 3-7. The same triple of numbers may represent either a
point or a vector. When it is meant to represent a point, the triple will
be enclosed in parentheses,

$$A = (a_1, a_2, a_3).$$

When it is meant to represent a vector, it will be enclosed in brackets,

$$\mathbf{A} = [a_1, a_2, a_3].$$

The same letter in boldface or italic type will be used to represent the
same triple of numbers, considered as a vector or a point, respectively.

These conventions identify the vectors as points in space or, perhaps
more accurately, as directed line segments with their initial points at the
origin. This point of view produces what physicists call “bound vectors.”
Physicists also found it useful to have “free vectors,” which are thought
of as directed line segments with arbitrary placement. To accommodate
this idea, we introduce a function which associates a vector to each di­
rected line segment. Note that in this definition, although each directed
line segment is associated with a unique vector, many different line seg­
ments may be associated with a single vector.

Definition 3-10. Let $P_1$ and $P_2$ be two given points and let $P_1P_2$ be
the directed line segment from $P_1$ to $P_2$. Then the vector $\overrightarrow{P_1P_2}$ is the
vector whose components are the direction numbers of the directed line
segment $P_1P_2$.

Thus, for example, the directed line segment AB, where A = (3, 5, 2)
and B = (-1, 6, 5), has associated with it the vector $\overrightarrow{AB} = [-4, 1, 3]$.
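
The example above can be checked with a short Python sketch. It assumes that the direction numbers of a directed line segment are the coordinate differences (end point minus initial point); the helper name `vector_between` is hypothetical and is introduced only for illustration, with points and vectors both modeled as plain tuples.

```python
def vector_between(p1, p2):
    """Components of the vector of the directed segment from p1 to p2.

    Assumption: the direction numbers are the coordinate differences,
    end point minus initial point.
    """
    return tuple(q - p for p, q in zip(p1, p2))


A = (3, 5, 2)
B = (-1, 6, 5)
AB = vector_between(A, B)
print(AB)  # (-4, 1, 3)

# Translating both end points by the same amount leaves the vector unchanged.
shift = (10, -2, 7)
A2 = tuple(a + s for a, s in zip(A, shift))
B2 = tuple(b + s for b, s in zip(B, shift))
print(vector_between(A2, B2) == AB)  # True
```

The second check illustrates the remark that many different directed line segments are associated with a single vector.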

In most of our work with vectors, we will find ourselves treating directed
line segments as though they were vectors. Much of the time the difference
is not really important. But remember, a directed line segment is not a
vector. A vector is really a property of a directed line segment. The vector
determines the direction and length of the directed line segment but not
its placement in space.

PROBLEMS

1. Let the point A have coordinates $(a_1, a_2, a_3)$ and suppose that the vector

$$\overrightarrow{AP} = [b_1, b_2, b_3].$$

What are the coordinates of the point P?
2. If the weight in Fig. 3-8 is 100 grams and the two supporting strings make an
angle of 90° with each other, find the force exerted by each string in terms of
the angle between that string and the vertical.
3. Let A and B be two points of space and suppose that under a translation they
are translated to A' and B', respectively. Prove that

$$\overrightarrow{AB} = \overrightarrow{A'B'}.$$

3-4 THE ALGEBRAIC OPERATIONS ON VECTORS

As mentioned in the previous section, it is found useful to have an operation
which changes the magnitude of a vector without altering its direction.
This operation, called scalar multiplication, is easily found
from the comparison of directed line segments of different lengths which
are on the same ray.

Definition 3-11. Real numbers are called scalars. Given a vector
$\mathbf{A} = [a_1, a_2, a_3]$ and a real number (scalar) $t$, the scalar multiple of
$\mathbf{A}$ by $t$ is

$$t\mathbf{A} = [ta_1, ta_2, ta_3].$$

Geometrically, scalar multiplication can be thought of as multiplying
the length of the vector (directed line segment) by the number t. Note
that if a directed line segment has nonzero length and is represented by
a vector A, then its length is |A| (which is a scalar) and the vector A/|A|
has length one. The components of A/|A| are the direction cosines of the
directed line segment.
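
A minimal Python sketch of these two ideas follows. The helper names `magnitude` and `scale` are assumptions made for illustration, and the sample vector is chosen so that its length is a whole number.

```python
import math


def magnitude(a):
    """Length |A| = (a1^2 + a2^2 + a3^2)^(1/2)."""
    return math.sqrt(sum(x * x for x in a))


def scale(t, a):
    """Scalar multiple tA, formed component by component."""
    return tuple(t * x for x in a)


A = (12, 0, -5)
print(magnitude(A))            # 13.0
print(magnitude(scale(3, A)))  # 39.0, three times the length of A

# Dividing by the length gives a vector of length one whose components
# are the direction cosines of the directed line segment.
unit = scale(1 / magnitude(A), A)
print(unit)
print(abs(magnitude(unit) - 1.0) < 1e-12)  # True
```

For a negative scalar the length is multiplied by the absolute value of t and the direction is reversed, which is why the later definition of parallel vectors asks for a nonnegative multiple.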

From Problem 1 of the last section, we see that vector addition in geo­
metric terms corresponds exactly to the addition of vectors by com­
ponents. Thus we have

Definition 3-12. Given two vectors $\mathbf{A} = [a_1, a_2, a_3]$ and $\mathbf{B} = [b_1, b_2, b_3]$,
the sum of these two vectors is the vector

$$\mathbf{A} + \mathbf{B} = [a_1 + b_1, a_2 + b_2, a_3 + b_3].$$

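Another way of expressing the parallelogram law is to make use of directed
line segments. Letting O represent the origin and letting A and P
be points such that $\overrightarrow{OA} = \mathbf{A}$ and $\overrightarrow{AP} = \mathbf{B}$, we have

$$\mathbf{A} + \mathbf{B} = \overrightarrow{OA} + \overrightarrow{AP} = \overrightarrow{OP}.$$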

Figure 3-9 illustrates these relationships. It should also help the student
to see how componentwise addition gives the sum of the two vectors.

A special case of a scalar multiple of a vector is worth noting. Given
any vector $\mathbf{A} = [a_1, a_2, a_3]$, the vector $(-1)\mathbf{A} = [-a_1, -a_2, -a_3]$ is
such that the sum of it and the vector $\mathbf{A}$ is exactly the zero vector. It is
useful to have a special notation for this vector and other negative forms.

Definition 3-13. Given any vectors $\mathbf{A}$ and $\mathbf{B}$, we shall use the following
notational conventions:

$$-\mathbf{A} = (-1)\mathbf{A},$$
$$\mathbf{A} - \mathbf{B} = \mathbf{A} + (-\mathbf{B}).$$

The expression $\mathbf{A} - \mathbf{B}$ is called the difference between the two vectors
and has a special geometric interpretation. In Fig. 3-10(a) we see the
parallelogram with sides $\mathbf{A}$ and $-\mathbf{B}$ (dashed). The sum $\mathbf{A} + (-\mathbf{B})$ is
then the dashed diagonal which is parallel and equal in length to the solid
diagonal. Therefore, the difference $\mathbf{A} - \mathbf{B}$ is exactly the vector $\overrightarrow{BA}$,
where B and A are the points associated with the vectors $\mathbf{B}$ and $\mathbf{A}$, respectively
(see Fig. 3-10b).
Although this geometric interpretation of the difference of two vectors
seems to be purely intuitive, we can actually prove that it is valid (in
terms of the definitions which have been given).

Figure 3-10

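Theorem 3-2. Let A, B, and C be three points. Then

$$\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC} \tag{3-5}$$

and

$$\overrightarrow{AB} = \overrightarrow{CB} - \overrightarrow{CA}. \tag{3-6}$$

Proof: Let $A = (a_1, a_2, a_3)$, $B = (b_1, b_2, b_3)$, and $C = (c_1, c_2, c_3)$.
Then

$$\overrightarrow{AB} = [b_1 - a_1, b_2 - a_2, b_3 - a_3],$$

$$\overrightarrow{BC} = [c_1 - b_1, c_2 - b_2, c_3 - b_3],$$

and

$$\overrightarrow{AC} = [c_1 - a_1, c_2 - a_2, c_3 - a_3].$$

It is then clear from Definition 3-12 that Eq. (3-5) is true.
To prove Eq. (3-6), we could proceed in the same way, or merely
observe that $\overrightarrow{AC} = -\overrightarrow{CA}$, and hence that

$$\begin{aligned}
\overrightarrow{CB} - \overrightarrow{CA} &= \overrightarrow{CB} + \overrightarrow{AC} \\
&= \overrightarrow{AC} + \overrightarrow{CB} \\
&= \overrightarrow{AB},
\end{aligned}$$

where the last step follows from (3-5), which we have already proved.
In the middle step we made use of the commutativity of vector addition,
which we have not proved, but which follows immediately from Definition 3-12.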
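
The two identities of Theorem 3-2 can be spot-checked numerically. The sketch below is only an illustration for one choice of points, not a substitute for the proof; the helper names `vec`, `add`, and `sub` and the point C are assumptions introduced here.

```python
def vec(p, q):
    """Vector of the directed segment from point p to point q."""
    return tuple(b - a for a, b in zip(p, q))


def add(u, v):
    """Componentwise sum of two vectors."""
    return tuple(x + y for x, y in zip(u, v))


def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(x - y for x, y in zip(u, v))


A, B, C = (3, 5, 2), (-1, 6, 5), (2, 0, 7)

# Eq. (3-5): AB + BC = AC
print(add(vec(A, B), vec(B, C)), vec(A, C))  # (-1, -5, 5) (-1, -5, 5)

# Eq. (3-6): AB = CB - CA
print(sub(vec(C, B), vec(C, A)), vec(A, B))  # (-4, 1, 3) (-4, 1, 3)
```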

An important fact to note is that by virtue of Definition 3-10, we have
the representation of a directed line segment in the following form:

$$\overrightarrow{AB} = \mathbf{B} - \mathbf{A}. \tag{3-7}$$

This result can also be viewed as a special case of (3-6) when we realize
that Convention (3-1) and Definition 3-10 can be thought of as saying
that $\mathbf{A} = \overrightarrow{OA}$ and $\mathbf{B} = \overrightarrow{OB}$, where O is the origin.

In our informal discussions, we introduced scalar multiplication so as
to change the magnitude of a vector while leaving its direction unchanged.
That is, the new vector is parallel to the old. We would now like to make
formal definitions which will allow us to speak of parallel line segments
and vectors.

Definition 3-14. Two directed line segments are called parallel if and
only if they have the same set of direction cosines. Two vectors are
called collinear if and only if one is a scalar multiple of the other, and
are called parallel if and only if one is a nonnegative scalar multiple of
the other.

The definition of parallelism made here differs from that made in most
elementary courses. Two directed line segments which would ordinarily
be called parallel may not be called parallel according to this definition.
For example, if we have A = (1, 1, 1), B = (1, 0, 2), and C = (2, 1, 3),
then OA = BC and the directed line segments OA and BC are parallel,
but the directed line segments OA and CB are not called parallel. In
order to be called parallel, two directed line segments must have the same,
not opposite, directions.
The above definition is consistent. That is, two directed line segments
AB and CD are parallel if and only if the associated vectors AB and CD
are parallel. The notion of collinearity does not fit well with thinking of
vectors as arbitrarily located directed line segments. It is, however,
geometrically evident when we think of vectors as directed line segments
whose initial points are at the origin. Two line segments (not directed)
are parallel if and only if the associated vectors are collinear.
The zero vector, the vector which has zero for all its components, oc­
cupies a peculiar position with regard to this definition. The way we have
stated the definition, the zero vector is parallel (and collinear) to any
vector. We could have avoided this by using nonzero and positive scalar
multiples in the above definition, but it will turn out to be convenient to
have the definitions in the form actually given. Remember that parallel
vectors (disregarding the zero vector) have the same direction, while
collinear vectors have the same or opposite directions.
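
The distinction between parallel and collinear vectors can be made concrete with a small Python sketch. It assumes integer components so that exact rational arithmetic can be used, and the helper names `multiple_factor`, `collinear`, and `parallel` are hypothetical. The sample vectors come from the points A = (1, 1, 1), B = (1, 0, 2), and C = (2, 1, 3) used above.

```python
from fractions import Fraction


def multiple_factor(a, b):
    """Return the scalar t with b == t*a, or None if no such scalar exists."""
    if not any(a):
        # a is the zero vector: only the zero vector is a multiple of it.
        return Fraction(0) if not any(b) else None
    i = next(k for k, x in enumerate(a) if x != 0)
    t = Fraction(b[i], a[i])
    return t if all(t * x == y for x, y in zip(a, b)) else None


def collinear(a, b):
    """True when one vector is a scalar multiple of the other."""
    return multiple_factor(a, b) is not None or multiple_factor(b, a) is not None


def parallel(a, b):
    """True when one vector is a nonnegative scalar multiple of the other."""
    factors = (multiple_factor(a, b), multiple_factor(b, a))
    return any(t is not None and t >= 0 for t in factors)


OA = (1, 1, 1)
BC = (1, 1, 1)     # C - B
CB = (-1, -1, -1)  # B - C
ZERO = (0, 0, 0)

print(parallel(OA, BC))    # True
print(parallel(OA, CB))    # False: opposite directions
print(collinear(OA, CB))   # True
print(parallel(ZERO, OA))  # True: the zero vector is parallel to any vector
```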
Now let us observe the algebraic laws satisfied by vector addition and
scalar multiplication. There are many such laws, but we choose the
following set for emphasis:

Theorem 3-3. Vector addition satisfies the following properties:

P1. For any vectors A and B

A + B = B + A.

P2. For any vectors A, B, and C

A + (B + C) = (A + B) + C.

P3. There exists a vector 0 such that for any vector A

A + 0 = A.

P4. For any vector A there exists a vector -A such that

A + (-A) = 0.

These properties are (in view of the definitions we have given) fairly
self-evident. The proofs are easy and can be left as problems for the
student. The student will recognize these as properties of addition in the
real number system. Since they are the same properties, they are referred
to by the same names. That is, P1 is called the commutative property for
addition of vectors, P2 is called the associative property for addition of vectors,
P3 is the existence of the identity, and P4 is the existence of inverses.
The student who has heard of a group will see that Theorem 3-3 can be
rephrased more simply as: The set of vectors forms a commutative group
with respect to addition. If you have never heard of this concept, you need
not worry about it. Merely realizing that this particular set of properties
occurs often enough to deserve a special name will be enough at this point.
When we introduce scalar multiplication we obtain further properties
of the set of vectors. Note that if we try to verify properties similar to
those held by the real number system, we find immediate differences.
For example, the scalar multiple of a vector is an indicated product of
completely different entities. It is up to us to define what we mean by the
indicated product in either order. We obviously wish to define them to be
the same regardless of order. In other words, the commutative law holds
by definition. We can, however, obtain a close equivalent to the associa­
tive property; and the distributive property also applies. In fact, since
there are two distinct types of addition (addition of scalars and addition
of vectors), there are two distributive properties.

Theorem 3-4. The algebraic operations on vectors have the following
properties:

P5. For any vector A and any scalars s and t,

$$(st)\mathbf{A} = s(t\mathbf{A}).$$

P6. For any vector A and any scalars s and t,

$$(s + t)\mathbf{A} = s\mathbf{A} + t\mathbf{A}.$$

P7. For any vectors A and B and any scalar t,

$$t(\mathbf{A} + \mathbf{B}) = t\mathbf{A} + t\mathbf{B}.$$

P8. For any vector A,

$$1 \cdot \mathbf{A} = \mathbf{A}.$$

The four properties listed here are again very simple to verify. Property
8 may seem too trivial to mention, but it is included for a specific reason.
It so happens that the properties P1 through P8 listed in the above two
theorems are sufficient to characterize an algebraic system of considerable
mathematical importance—a vector space or, as it is sometimes called, a
linear space over the real numbers. This is a set of elements, called vectors,
together with the real numbers and two operations, vector addition and
scalar multiplication, satisfying the above eight properties (which would
then be called postulates). The eighth property becomes quite important,
allowing the complete algebra of vectors to be developed.
There are many different types of vector spaces satisfying the above
eight properties, but the system of vectors that we have developed re­
quires only one additional property to characterize it completely. The
remaining property concerns linear dependence and will be discussed in a
later section.
At this stage, the student need not worry about how additional algebraic
properties of vectors could be proved from those listed above, but should
merely use the properties as needed. Any doubts can usually be resolved
by reference to the definitions.
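
As a sanity check (not a proof), the eight properties can be tested on a handful of sample vectors and scalars. The sketch below uses exact rational arithmetic so that equality tests are reliable; the helper names `add` and `scale` and the sample values are assumptions chosen for illustration.

```python
from fractions import Fraction
from itertools import product


def add(a, b):
    """Componentwise vector sum."""
    return tuple(x + y for x, y in zip(a, b))


def scale(t, a):
    """Scalar multiple tA."""
    return tuple(t * x for x in a)


ZERO = (0, 0, 0)
vectors = [(1, -4, 8), (-7, 6, 6), (0, 0, 0), (2, 0, -3)]
scalars = [Fraction(-3, 2), 0, 1, 5]

for A, B, C in product(vectors, repeat=3):
    assert add(A, B) == add(B, A)                  # P1
    assert add(A, add(B, C)) == add(add(A, B), C)  # P2

for A in vectors:
    assert add(A, ZERO) == A                       # P3
    assert add(A, scale(-1, A)) == ZERO            # P4
    assert scale(1, A) == A                        # P8
    for s, t in product(scalars, repeat=2):
        assert scale(s * t, A) == scale(s, scale(t, A))          # P5
        assert scale(s + t, A) == add(scale(s, A), scale(t, A))  # P6

for A, B in product(vectors, repeat=2):
    for t in scalars:
        assert scale(t, add(A, B)) == add(scale(t, A), scale(t, B))  # P7

print("P1 through P8 hold for all sampled vectors and scalars")
```

A passing run shows only that no counterexample was found among the samples; the general statements follow from the componentwise definitions.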

To illustrate some of the concepts introduced in this section, let us give
a proof, using vector methods, of the theorem:
The diagonals of a quadrilateral bisect each other if and only if the quadrilateral
is a parallelogram.

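Proof: Suppose that A, B, C, and D are the vertices of a quadrilateral
with diagonals AC and BD. Let $M_1$ and $M_2$ be the midpoints of AC and
BD, respectively. Then

$$\begin{aligned}
\mathbf{M}_1 = \overrightarrow{OM_1} &= \overrightarrow{OA} + \overrightarrow{AM_1} \\
&= \mathbf{A} + \tfrac{1}{2}\overrightarrow{AC} \\
&= \mathbf{A} + \tfrac{1}{2}(\mathbf{C} - \mathbf{A}) \\
&= \tfrac{1}{2}(\mathbf{A} + \mathbf{C})
\end{aligned}$$

(see also Problem 2). Similarly,

$$\mathbf{M}_2 = \tfrac{1}{2}(\mathbf{B} + \mathbf{D}).$$

We observe that the diagonals of the quadrilateral bisect each other if and
only if $M_1 = M_2$ (as points), or equivalently $\mathbf{M}_1 = \mathbf{M}_2$ (as vectors).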
The theorem we are trying to prove is an “if and only if” proposition.
This means that there are two separate results to be proved. First, let us
prove that if the quadrilateral is a parallelogram, then the diagonals bisect
each other.
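The condition that ABCD be a parallelogram can be restated in the form
$\overrightarrow{AB} = \overrightarrow{DC}$ (the opposite sides are parallel; make a sketch). Hence
$\mathbf{B} - \mathbf{A} = \mathbf{C} - \mathbf{D}$, or $\mathbf{B} = \mathbf{A} + \mathbf{C} - \mathbf{D}$. Substituting this into the
above formula for $\mathbf{M}_2$, we find

$$\begin{aligned}
\mathbf{M}_2 &= \tfrac{1}{2}(\mathbf{B} + \mathbf{D}) \\
&= \tfrac{1}{2}(\mathbf{A} + \mathbf{C} - \mathbf{D} + \mathbf{D}) \\
&= \tfrac{1}{2}(\mathbf{A} + \mathbf{C}) \\
&= \mathbf{M}_1.
\end{aligned}$$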

Therefore, we conclude that the diagonals bisect one another.


Next, we prove the other half of the theorem: if the diagonals bisect
each other, then the quadrilateral is a parallelogram.
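The assumption that the diagonals bisect each other is equivalent to
$\mathbf{M}_1 = \mathbf{M}_2$, or $\mathbf{A} + \mathbf{C} = \mathbf{B} + \mathbf{D}$. This equation implies that

$$\mathbf{A} - \mathbf{B} = \mathbf{D} - \mathbf{C},$$

or

$$\overrightarrow{BA} = \overrightarrow{CD},$$

which tells us that the quadrilateral is a parallelogram.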
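
Both halves of the argument can be illustrated numerically. In the sketch below the points are arbitrary sample values (an assumption made for illustration), and the helper names `vec` and `midpoint` are hypothetical. The fourth vertex C is constructed so that the vector from A to B equals the vector from D to C.

```python
from fractions import Fraction


def vec(p, q):
    """Vector of the directed segment from p to q."""
    return tuple(b - a for a, b in zip(p, q))


def midpoint(p, q):
    """Midpoint of the segment pq, computed as (P + Q)/2 with exact fractions."""
    return tuple(Fraction(a + b, 2) for a, b in zip(p, q))


A, B, D = (0, 0, 0), (4, 1, 0), (1, 3, 2)

# Parallelogram: choose C = B + D - A, so that AB = DC.
C = tuple(b + d - a for a, b, d in zip(A, B, D))
print(vec(A, B) == vec(D, C))            # True
print(midpoint(A, C) == midpoint(B, D))  # True: the diagonals bisect each other

# Move C away from that position: AB != DC and the midpoints differ.
C2 = (6, 4, 2)
print(vec(A, B) == vec(D, C2))            # False
print(midpoint(A, C2) == midpoint(B, D))  # False
```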


Exactly what has been proved here? This theorem was proved to be
true in the particular model of euclidean space which we have been con­
structing. Its truth depends on the concepts of length and parallelism
which we have defined. Similarly, if we use these methods to prove other
geometric facts, let us remember that the proof is really only valid in the
“vector space” model of euclidean space. It so happens that this model is
an accurate representation of euclidean space, but proof of this will not
be given here.

PROBLEMS

1. Let O (the origin), $P_1$, $P_2$, $P_3$, and $P_4$ be points in three-dimensional space.
What algebraic condition on the vectors $\overrightarrow{OP_1}$, $\overrightarrow{P_1P_2}$, $\overrightarrow{P_2P_3}$, and $\overrightarrow{P_3P_4}$ is
necessary and sufficient for the point $P_4$ to coincide with the origin?
2. Let A and B be any two distinct vectors. For any real t between 0 and 1,
let $\mathbf{P} = (1 - t)\mathbf{A} + t\mathbf{B}$. Where is the point P in relation to the points A
and B? [Hint: See Definition 3-10 and compare with Theorem 3-1.]
3. Verify the properties listed in Theorems 3-3 and 3-4.
4. Define the vectors $\mathbf{e}_1 = [1, 0, 0]$, $\mathbf{e}_2 = [0, 1, 0]$, and $\mathbf{e}_3 = [0, 0, 1]$. Find
scalars u, v, and w such that
$$u\mathbf{e}_1 + v\mathbf{e}_2 + w\mathbf{e}_3 = \mathbf{A},$$
where
(a) A = [7, 3,-4], (b) A = [1, 0, £],
(c) A = [0, 0, 0], (d) A = $[a_1, a_2, a_3]$.
5. Find u, v, and w to satisfy the same conditions as those of Problem 4 if
$\mathbf{e}_1 = [1, 1, 0]$, $\mathbf{e}_2 = [1, -1, 1]$, and $\mathbf{e}_3 = [-1, 1, 2]$.
6. Can you find u, v, and w such that
$$u\mathbf{e}_1 + v\mathbf{e}_2 + w\mathbf{e}_3 = [1, 0, 0],$$
if
(a) $\mathbf{e}_1 = [1, 1, 0]$, $\mathbf{e}_2 = [1, -1, 1]$, $\mathbf{e}_3 = [2, 0, 1]$,
(b) $\mathbf{e}_1 = [0, 0, 1]$, $\mathbf{e}_2 = [0, 1, 1]$, $\mathbf{e}_3 = [0, -1, 2]$?
If not, why not?
7. For each part of Problem 6, find u, v, and w not all zero such that
$$u\mathbf{e}_1 + v\mathbf{e}_2 + w\mathbf{e}_3 = \mathbf{0}.$$
8. For each of the following vectors, find a vector B with the same direction
but magnitude 1.
(a) A = [3,-6, 2] (b) A = [1,1,1]
(c) A = [12, 0, -5] (d) A = [3, 5, 7].
9. Which of the following systems satisfy all eight of the properties P1 through
P8, and hence are vector spaces? If a system does not satisfy all the properties,
which fail to hold (in general)?
(a) The set of all n-tuples of real numbers $\mathbf{A} = [a_1, a_2, \ldots, a_n]$ with vector
addition defined by
$$[a_1, a_2, \ldots, a_n] + [b_1, b_2, \ldots, b_n] = [a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n],$$
and scalar multiplication defined by
$$t[a_1, a_2, \ldots, a_n] = [ta_1, ta_2, \ldots, ta_n].$$
(b) The set of all ordered pairs of real numbers $\mathbf{A} = [a_1, a_2]$ with vector
addition defined by
$$[a_1, a_2] + [b_1, b_2] = [a_1 + b_1, a_2 + b_2],$$
and scalar multiplication defined by
$$t[a_1, a_2] = [ta_1, 0].$$
(c) The set of all real-valued functions f defined for all x, 0 < x < 1, with
vector addition defined by f + g being the function whose value at x is
f(x) + g(x), and scalar multiplication defined by tf being the function
whose value at x is tf(x).
10. Let A and B be two points in space. How are AB and BA related?
11. Let A, B, C, and D be any four distinct points in space.
(a) Show that
AD = AB + BC + CD.

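(b) Let E, F, G, and H be the midpoints of the line segments AB, BC, CD,
and DA respectively. Show that
$$\overrightarrow{EF} = \tfrac{1}{2}\overrightarrow{AB} + \tfrac{1}{2}\overrightarrow{BC},$$
$$\overrightarrow{HG} = \tfrac{1}{2}\overrightarrow{AD} + \tfrac{1}{2}\overrightarrow{DC}.$$

(c) Show that $\overrightarrow{EF} = \overrightarrow{HG}$. Express this as a geometrical theorem.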


12. Let A, B, and C be any three distinct points in space. Let D be the midpoint
of the line segment BC. Then AD is a median of the triangle ABC.
(a) Find AD in terms of the vectors A, B, and C.
(b) Find $\mathbf{A} + \tfrac{2}{3}\overrightarrow{AD}$ in terms of A, B, and C.
(c) What would happen if the roles of A, B, and C were interchanged in the
above? What theorem has been proved? Compare with Problem 8 of
Section 3-2.
Using vector methods, prove each of the following theorems.
13. The line segment joining the midpoints of two sides of a triangle is parallel
to, and one-half the length of, the third side.
14. The line segment joining the midpoints of the nonparallel sides of a trapezoid
is parallel to the bases and equal to half the sum of their lengths.
15. The line segments joining the midpoints of opposite sides of any quadrilateral
bisect each other.
16. The midpoints of two opposite sides of any quadrilateral and the midpoints
of the diagonals are the vertices of a parallelogram.
17. The two lines from a vertex of a parallelogram to the midpoints of the
opposite sides trisect the diagonal that they cross.

3-5 PROJECTIONS AND THE DOT PRODUCT

Let L be a given line and AB a given directed line segment in three-dimensional
space. We wish to make an intuitive definition of the projection
of AB onto L. To do this, erect planes perpendicular to L through
A and B and let these planes cut L at points A' and B' as in Fig. 3-11.
We will call the directed line segment A'B' the projection of AB onto the
line L.
In many texts, particularly those in engineering, it is common to find
the projection defined as the length of the segment A'B' described above
rather than the segment itself. We will find it more useful, however, to
consider the projection as a vector rather than as a scalar.
It is the vector properties of the directed line segment A'B' that we
are really interested in. Note that if we use another line which is parallel
to L in Fig. 3-11, the resulting projection will be a different directed line
segment, but the associated vector will be the same.
Similarly, suppose that we have another directed line segment, CD,
which is parallel to and of the same length as AB, that is, $\overrightarrow{CD} = \overrightarrow{AB}$;
then we see that the projection of CD onto L will be a directed line segment
C'D' such that $\overrightarrow{C'D'} = \overrightarrow{A'B'}$. This can be seen easily by observing
that the two planes through C and D will be the same distance apart as
the planes through A and B. From this, it follows that the vector A'B'
is determined by the vector AB. For this reason, we will say that the
projection of the vector AB is the vector A'B'.
Observe that there is no difficulty here in having two different things
being called a projection. The projection of a directed line segment is a
directed line segment, while the projection of a vector is a vector.
Given a vector and a line, we find the projection of the vector onto the
line by drawing a directed line segment with that vector, finding the pro­
jection of that directed line segment, and converting the resulting directed
line segment into a vector. Since the result depends only on the direction
of the line L, we might as well assume that the desired line L passes through
the origin. Thus given a vector A, the obvious directed line segment to
associate with A is OA. The projection of this onto L will then be OA',
and we denote the projection of A onto L by

$$\operatorname{Proj}(\mathbf{A}) = \overrightarrow{OA'}.$$

When we use this notation, the line L is understood. To be completely
correct, we should indicate the line in our notation, but this would be
rather cumbersome for our purposes.
We now wish to investigate (intuitively) some of the properties of this
projection function in order to learn what the appropriate formal definition
should be. For these intuitive purposes, it is clearly sufficient to think in
terms of directed line segments.

Figure 3-12

Let us see what the projection of the sum of two vectors is. Let C =
A + B and suppose we project C onto a line L.
Observe Fig. 3-12. Here we see the vector C as the sum of A and B,
and three planes drawn through the appropriate points. It is then clear
that
Proj (A + B) = Proj (A) + Proj (B).

We also observe that for a fixed line L, the projection of t times A is t
times the projection of A, as is easily seen by observing the similar triangles
formed when A is drawn as a directed line segment with its initial point
on L (see Fig. 3-13). Thus, having fixed L, we can write for any vectors
A and B, and for any scalars s and t, the following relation:

$$\operatorname{Proj}(s\mathbf{A} + t\mathbf{B}) = s \cdot \operatorname{Proj}(\mathbf{A}) + t \cdot \operatorname{Proj}(\mathbf{B}), \tag{3-8}$$

which is usually called the linearity property of the projection.

Figure 3-13

It is the direction of the line L which is important in determining the
projection. For a given line L, all projections onto this line will be collinear;
that is, they will all be scalar multiples of some one nonzero vector
which determines the direction of L. Suppose that B is a nonzero vector
on the line (i.e., that there are two points P and Q on the line such that
$\overrightarrow{PQ} = \mathbf{B}$). This vector B will then determine a sense of direction on L.
Think of the vectors A and B as directed line segments drawn from the
same point and let $\theta$ be the angle between these two (Fig. 3-13 again).
Then clearly,

$$\operatorname{Proj}(\mathbf{A}) = |\mathbf{A}|(\cos\theta)\,\mathbf{e}, \tag{3-9}$$

where e is a vector of unit length parallel to B. That is, $\mathbf{e} = \mathbf{B}/|\mathbf{B}|$, and so

$$\operatorname{Proj}(\mathbf{A}) = |\mathbf{A}|(\cos\theta)\,\frac{\mathbf{B}}{|\mathbf{B}|}. \tag{3-10}$$

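Observe carefully that this result is independent of which vector B
is taken on the line L. If B' is another nonzero vector, parallel to B, that is,
a positive scalar multiple of B, then

$$\frac{\mathbf{B}'}{|\mathbf{B}'|} = \frac{\mathbf{B}}{|\mathbf{B}|}.$$

If, however, B'' is some negative multiple of B (hence is collinear but not
parallel), then

$$\frac{\mathbf{B}''}{|\mathbf{B}''|} = -\frac{\mathbf{B}}{|\mathbf{B}|},$$

but the angle $\theta''$ between A and B'' will be changed by $\pi$ from $\theta$. Hence
$\cos\theta'' = -\cos\theta$ and the net result is that

$$\operatorname{Proj}(\mathbf{A}) = \frac{|\mathbf{A}|(\cos\theta)\,\mathbf{B}}{|\mathbf{B}|} = \frac{|\mathbf{A}|(\cos\theta'')\,\mathbf{B}''}{|\mathbf{B}''|}.$$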

Let $\mathbf{A} = [a_1, a_2, a_3]$ be a given vector and let $\mathbf{B} = [b_1, b_2, b_3]$ be a
nonzero vector which determines the direction of a line L. We wish to
find the projection of A onto L. We will do this with the help of the
linearity property (3-8), breaking A up into the sum of three vectors,
each in the direction of one of the coordinate axes.

Definition 3-15. The unit coordinate vectors are the vectors

$$\mathbf{e}_1 = [1, 0, 0], \quad \mathbf{e}_2 = [0, 1, 0], \quad \text{and} \quad \mathbf{e}_3 = [0, 0, 1].$$

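Now we need the projections of the three unit vectors. From (3-9), the
projection of $\mathbf{e}_1$ in the direction of B is $|\mathbf{e}_1|(\cos\alpha)\mathbf{e} = (\cos\alpha)\mathbf{e}$, where
$\mathbf{e} = \mathbf{B}/|\mathbf{B}|$ and $\alpha$ is the angle between the x-axis ($\mathbf{e}_1$) and the vector B.
However, $\cos\alpha$ is exactly the first direction cosine of B, or $\cos\alpha = b_1/|\mathbf{B}|$.
In exactly the same way the projections of $\mathbf{e}_2$ and $\mathbf{e}_3$ can be seen to be
$(b_2/|\mathbf{B}|)\mathbf{e}$ and $(b_3/|\mathbf{B}|)\mathbf{e}$, respectively.
Therefore, writing $\mathbf{A} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3$ and using the linearity
property, we find that the projection of A onto L is

$$\operatorname{Proj}(\mathbf{A}) = a_1\operatorname{Proj}(\mathbf{e}_1) + a_2\operatorname{Proj}(\mathbf{e}_2) + a_3\operatorname{Proj}(\mathbf{e}_3) = \frac{a_1b_1 + a_2b_2 + a_3b_3}{|\mathbf{B}|}\,\mathbf{e}.$$

On the other hand, if $\theta$ is the angle between A and B, this same projection
is given, according to (3-9), by

$$\operatorname{Proj}(\mathbf{A}) = |\mathbf{A}|(\cos\theta)\,\mathbf{e} = |\mathbf{A}|(\cos\theta)\,\frac{\mathbf{B}}{|\mathbf{B}|}.$$

Comparing these two results, we find the interesting relation

$$|\mathbf{A}|\,|\mathbf{B}|\cos\theta = a_1b_1 + a_2b_2 + a_3b_3, \tag{3-11}$$

or, using the summation notation, we have

$$|\mathbf{A}|\,|\mathbf{B}|\cos\theta = \sum_{i=1}^{3} a_ib_i.$$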

All of this discussion has been based on our understanding of the geometry
of three-dimensional space. We want our algebraic development to
correspond to the familiar geometry of space. We do this by seeing that
our definitions correspond to the above results. In particular, we would
like to have a simple symbol which would represent the combination of
two vectors which appears on the right-hand side of (3-11). This, then, is
our motivation for the following definitions.

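Definition 3-16. Given two vectors $\mathbf{A} = [a_1, a_2, a_3]$ and $\mathbf{B} = [b_1, b_2, b_3]$,
the dot product of these two vectors is

$$\mathbf{A} \cdot \mathbf{B} = a_1b_1 + a_2b_2 + a_3b_3.$$

The projection of A onto a line L whose direction is determined by a
vector $\mathbf{B} \ne \mathbf{0}$ is

$$\operatorname{Proj}(\mathbf{A}) = \frac{(\mathbf{A} \cdot \mathbf{B})\,\mathbf{B}}{|\mathbf{B}|^2}.$$

The cosine of the angle between the two vectors is defined to be

$$\cos\theta = \frac{\mathbf{A} \cdot \mathbf{B}}{|\mathbf{A}|\,|\mathbf{B}|},$$

provided $\mathbf{A} \ne \mathbf{0}$ and $\mathbf{B} \ne \mathbf{0}$. In particular, the two vectors A and B
are called orthogonal (or perpendicular) if and only if $\mathbf{A} \cdot \mathbf{B} = 0$.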

Several remarks about this definition are in order. First, the quantity
defined to be the cosine of the angle between the vectors has not been proved
to be the actual cosine of a real angle. In order to know this, we would
have to know that the quantity so defined always is between —1 and +1
or, what amounts to the same thing, that for any two vectors A and B

$$|\mathbf{A} \cdot \mathbf{B}| \le |\mathbf{A}| \cdot |\mathbf{B}|.$$


This happens to be true, as will be proved in the next section, so that we
can use the above definition to define the angle between two vectors. The
orthogonality condition then corresponds correctly to our usual ideas of
orthogonality.
A second observation is that according to the above definition, the zero
vector, 0, is orthogonal to every vector. While this may seem unusual, it
turns out to be a useful convention and so we will use it.
Finally, the unit vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ of Definition 3-15 are mutually
orthogonal according to this definition. Since we have been basing our
algebraic development on the geometric picture of mutually orthogonal
coordinate axes, this fits in with our picture.
The use of the name dot product for the type of product defined above
is based only on the way we write the product. Other names for this
concept are inner product and scalar product. The expression “inner
product” is probably the most suitable mathematically, but we will find
that the term “dot product” will serve our purposes. It is essential to give
the full name, however. In a later section we will define a different type of
product, which we will call the cross product, and with two types of product
it is necessary to specify which is being used.

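As examples of the use of these concepts, let us suppose that A =
[1, -4, 8], B = [-7, 6, 6], and C = [-4, 15, 8], and find
(1) $\mathbf{A} \cdot \mathbf{B}$;
(2) The cosines of the angles between A and B and between A and C;
(3) The projections of B and C onto the line whose direction is determined
by A.
We solve these problems as follows:
(1)
$$\mathbf{A} \cdot \mathbf{B} = [1, -4, 8] \cdot [-7, 6, 6] = -7 - 24 + 48 = 17.$$
(2) Let $\theta$ be the angle between A and B. Since $|\mathbf{A}|^2 = 1 + 16 + 64 = 81$
and $|\mathbf{B}|^2 = 49 + 36 + 36 = 121$,
$$\cos\theta = \frac{\mathbf{A} \cdot \mathbf{B}}{|\mathbf{A}|\,|\mathbf{B}|} = \frac{17}{[81]^{1/2}[121]^{1/2}} = \frac{17}{99}.$$

Let $\phi$ be the angle between A and C. Then
$$\cos\phi = \frac{\mathbf{A} \cdot \mathbf{C}}{|\mathbf{A}|\,|\mathbf{C}|} = \frac{-4 - 60 + 64}{9\,[305]^{1/2}} = 0.$$

So A and C are orthogonal.
(3)
$$\operatorname{Proj}(\mathbf{B}) = \frac{(\mathbf{B} \cdot \mathbf{A})\,\mathbf{A}}{|\mathbf{A}|^2} = \frac{17}{81}\,\mathbf{A} = \left[\frac{17}{81}, \frac{-68}{81}, \frac{136}{81}\right],$$

$$\operatorname{Proj}(\mathbf{C}) = \frac{(\mathbf{C} \cdot \mathbf{A})\,\mathbf{A}}{|\mathbf{A}|^2} = \frac{0}{81}\,\mathbf{A} = \mathbf{0}.$$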
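
The computations in this example can be reproduced with a short Python sketch built directly on Definition 3-16. The function names `dot`, `proj`, and `cos_angle` are assumptions introduced for illustration, and exact fractions are used for the projection so that the components can be compared without rounding. Note that here $|\mathbf{A}|^2 = 1 + 16 + 64 = 81$ and $|\mathbf{B}|^2 = 121$.

```python
import math
from fractions import Fraction


def dot(a, b):
    """Dot product a1*b1 + a2*b2 + a3*b3."""
    return sum(x * y for x, y in zip(a, b))


def proj(a, b):
    """Projection of a onto the line whose direction is the nonzero vector b."""
    factor = Fraction(dot(a, b), dot(b, b))
    return tuple(factor * x for x in b)


def cos_angle(a, b):
    """Cosine of the angle between two nonzero vectors."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


A, B, C = (1, -4, 8), (-7, 6, 6), (-4, 15, 8)

print(dot(A, B))             # 17
print(dot(A, A), dot(B, B))  # 81 121
print(cos_angle(A, B))       # approximately 17/99
print(dot(A, C))             # 0, so A and C are orthogonal
print(proj(B, A))            # components 17/81, -68/81, 136/81
print(proj(C, A))            # the zero vector
```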
Let us now state the fundamental algebraic properties of the dot product.

Theorem 3-5. The dot product of vectors is such that for any vectors
A, B, and C, and any scalar s, the following properties are satisfied:

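P1. $\mathbf{A} \cdot \mathbf{B} = \mathbf{B} \cdot \mathbf{A}$.
P2. $\mathbf{A} \cdot (\mathbf{B} + \mathbf{C}) = \mathbf{A} \cdot \mathbf{B} + \mathbf{A} \cdot \mathbf{C}$.
P3. $s(\mathbf{A} \cdot \mathbf{B}) = (s\mathbf{A}) \cdot \mathbf{B} = \mathbf{A} \cdot (s\mathbf{B})$.
P4. $\mathbf{A} \cdot \mathbf{A} = |\mathbf{A}|^2 \ge 0$ for every A, and $\mathbf{A} \cdot \mathbf{A} = 0$ if and only if
$\mathbf{A} = \mathbf{0}$.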

Again, the proofs of these facts are almost trivial and can be left to the
reader. Note that the first three can be summarized by the observation
that as far as vector addition, scalar multiplication, and the dot product
go, the ordinary rules of algebra apply. However, there is one rule which
is definitely missing—the cancellation law. Suppose we have

A • B = A • C,

and A ≠ 0. Then we cannot conclude that B = C. We can say that A • B - A • C = 0, and hence using P2, that

A • (B - C) = 0.*

But from this, all we can conclude is that B - C is orthogonal to A, which need not imply B = C. For example, if A = e1 = [1, 0, 0], B = [1, 2, 1],
and C = [1, 1, -2], the student can easily verify that A • B = A • C,
but clearly B ≠ C.
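The failure of the cancellation law in this example can be verified directly. The short sketch below is an illustration only (the helper `dot` is a name introduced here); it confirms that the two dot products agree while B and C differ, and that B - C is orthogonal to A.

```python
def dot(u, v):
    """Dot product of two vectors given as lists of components."""
    return sum(ui * vi for ui, vi in zip(u, v))

A = [1, 0, 0]      # e1
B = [1, 2, 1]
C = [1, 1, -2]

# B - C, computed componentwise
diff = [b - c for b, c in zip(B, C)]

print("A . B =", dot(A, B))
print("A . C =", dot(A, C))
print("B - C =", diff)
print("A . (B - C) =", dot(A, diff))
```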
The properties listed in this theorem are again standard properties
which have been found valuable in much more general situations. In
fact, they appear as the postulates which must be satisfied by the inner
product in a special mathematical structure of great importance, an inner
product space. At some time in his career, the student will probably hear
of a Hilbert Space. A real Hilbert space is nothing more than a vector
space with an inner product which satisfies the postulates mentioned in
this and the last section.
The first property, P1, which is obviously a commutative law, is called
the symmetric property of the dot product. Property P2, clearly a form of
the distributive law, and property P3, which appears to be some form of an
associative law, are together referred to as the bilinearity property of the
dot product. This is best explained by observing that if we consider the
vector A fixed, then for any vectors B and C and any scalars s and t,

A • (sB + tC) = s(A • B) + t(A • C),

which is exactly the linearity property mentioned in connection with
projections. The word "bilinear" is used because this property also holds
with respect to the first factor in the dot product.
The property P4 is usually described by saying that the inner product
is positive definite. It is called positive because A • A ≥ 0 for every A,
and definite because this inner product is zero only when A = 0.

* Note that this argument makes implicit use of P3, with s = —1, in order
to obtain the distributive law for the difference of two vectors.

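The properties of the dot product given in these theorems can be generalized
in obvious ways. For example, using the properties of vector addition
and the distributive law of P2, we can generalize the distributive law
to the product of the sum of a set of vectors with the sum of another set.
Thus, for example,

$$(\mathbf{A}_1 + \mathbf{A}_2 + \mathbf{A}_3)\cdot(\mathbf{B}_1 + \mathbf{B}_2) = \mathbf{A}_1\cdot\mathbf{B}_1 + \mathbf{A}_1\cdot\mathbf{B}_2 + \mathbf{A}_2\cdot\mathbf{B}_1 + \mathbf{A}_2\cdot\mathbf{B}_2 + \mathbf{A}_3\cdot\mathbf{B}_1 + \mathbf{A}_3\cdot\mathbf{B}_2. \tag{3-12}$$

This same result can be written more concisely by using the summation
notation. Equation (3-12) is equivalent to

$$\left(\sum_{i=1}^{3}\mathbf{A}_i\right)\cdot\left(\sum_{j=1}^{2}\mathbf{B}_j\right) = \sum_{i=1}^{3}\sum_{j=1}^{2}\mathbf{A}_i\cdot\mathbf{B}_j. \tag{3-13}$$

The student may not be familiar with the double summation shown on the
right-hand side of this equation. This may be interpreted as follows:

$$\begin{aligned}
\sum_{i=1}^{3}\sum_{j=1}^{2}\mathbf{A}_i\cdot\mathbf{B}_j
&= \sum_{i=1}^{3}\left(\sum_{j=1}^{2}\mathbf{A}_i\cdot\mathbf{B}_j\right) \\
&= \sum_{i=1}^{3}\left(\mathbf{A}_i\cdot\mathbf{B}_1 + \mathbf{A}_i\cdot\mathbf{B}_2\right) \\
&= \mathbf{A}_1\cdot\mathbf{B}_1 + \mathbf{A}_1\cdot\mathbf{B}_2 + \mathbf{A}_2\cdot\mathbf{B}_1 + \mathbf{A}_2\cdot\mathbf{B}_2 + \mathbf{A}_3\cdot\mathbf{B}_1 + \mathbf{A}_3\cdot\mathbf{B}_2.
\end{aligned}$$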
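The double summation is simply a nested loop, as the following sketch suggests. The particular vectors are hypothetical values chosen for illustration, and `dot` and `vsum` are helper names introduced here; the sketch compares the dot product of the two sums with the sum of all pairwise dot products.

```python
def dot(u, v):
    """Dot product of two vectors given as lists of components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def vsum(vectors):
    """Componentwise sum of a list of vectors of equal length."""
    return [sum(components) for components in zip(*vectors)]

# Hypothetical sample vectors A1, A2, A3 and B1, B2
A_list = [[1, 2, 3], [0, -1, 4], [2, 2, -5]]
B_list = [[3, 0, 1], [-2, 1, 1]]

# Left-hand side: (A1 + A2 + A3) . (B1 + B2)
lhs = dot(vsum(A_list), vsum(B_list))

# Right-hand side: the outer sum runs over i, the inner sum over j
rhs = sum(dot(a, b) for a in A_list for b in B_list)

print(lhs, rhs)
```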

PROBLEMS

1. Let B be a nonzero vector which defines the direction of a line L. Prove that
the projection of A onto L is given by
Proj (A) = (A • e)e,
where e = B/|B|.
2. Find the projection of the vector A = [7, 1, -4] onto the lines determined
by the vectors
(a) B = [2, 6, 3] (b) B = [2, 2, -1]
(c) B = [-2, -2, 1] (d) B = [7, 1, -3]
(e) B = [-7, -1, 3]
3. Find the projections of the vector A = [2, -4, 3] onto the lines determined
by the vectors
(a) B = [1, 0, -3] (b) B = [2, 2, -1]
(c) B = [-3, 3, 6] (d) B = [-8, 16, -12]
(e) B = [2, -4, 0]

4. What are the projections of A = [a1, a2, a3] onto the coordinate axes?
5. Find the cosines of the angles between the vector A = [3, -2, -6] and the
vectors B, where
(a) B = [4, -1, 8] (b) B = [4, 7, -4] (c) B = [12, 4, 3]
6. Find the angle between the vector A = [2, -1, 2] and the vector B if
(a) B = [-1, 2, 2] (b) B = [-6, 3, -6] (c) B = [0, 1, -1].
7. Find some nonzero vector B orthogonal to A = [1, 2, 5].
8. Find a vector C orthogonal to both A = [1, 2, 5] and B = [2, 0, —1].
9. Prove Theorem 3-5. Note that the proof here is to be purely algebraic,
depending only on Definition 3-16.
10. Prove that the four properties of Theorem 3-5 hold for the vector space of
Problem 9(a), Section 4, if the inner product is defined as
A • B = a1b1 + a2b2 + ... + anbn.
11. The law of cosines can be stated as follows: if three sides of a triangle are of
lengths |A|, |B|, and |C|, and if the angle opposite the side of length |C|
is θ, then |C|^2 = |A|^2 + |B|^2 - 2|A| |B| cos θ (see Fig. 3-15).
(a) Assume the law of cosines, let A = [a1, a2, a3],
B = [b1, b2, b3], and suppose that the point O in
Fig. 3-15 is the origin. Find C and use the above
formula to compute |A| |B| cos θ.
(b) Assume the properties of Theorem 3-5 and compute
|C|^2 = |A - B|^2. Compare with the law
of cosines.

[Figure 3-15]

Remark: Parts (a) and (b) of this problem together show that the law of cosines
is equivalent to the formula A • B = |A| |B| cos θ (granting the algebraic
properties of the inner product).

3-6 THE TRIANGLE INEQUALITY

In Problem 11 of the last section, it was shown that the formula A • B =
|A| |B| cos θ is equivalent to the law of cosines. In this section we will
prove the inequality
|A • B| ≤ |A| |B|,

which was needed in the last section to justify the definition of the cosine
of the angle between two vectors, and we shall show that this inequality
is essentially equivalent to the familiar geometric fact that the length of
one side of a triangle is less than the sum of the lengths of the other two
sides.

The above inequality is very important in many different situations. In
general vector spaces it is called the Cauchy-Schwarz inequality.

Theorem 3-6. (The Cauchy-Schwarz Inequality.) Given any two
vectors A and B,
|A • B| ≤ |A| |B|. (3-14)

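Proof: Observe carefully that in the following proof we make use of only
those properties of vectors given in Theorems 3-3, 3-4, and 3-5.
Let t be any scalar. Then we may compute

$$\begin{aligned}
|t\mathbf{A} + \mathbf{B}|^2 &= (t\mathbf{A} + \mathbf{B})\cdot(t\mathbf{A} + \mathbf{B}) \\
&= t^2(\mathbf{A}\cdot\mathbf{A}) + 2t(\mathbf{A}\cdot\mathbf{B}) + (\mathbf{B}\cdot\mathbf{B}) \\
&= t^2|\mathbf{A}|^2 + 2t(\mathbf{A}\cdot\mathbf{B}) + |\mathbf{B}|^2.
\end{aligned} \tag{3-15}$$

The right-hand member of this equation is a quadratic in t except when
the length of A is zero.
Suppose that |A| = 0. Then A = 0 and hence A • B = 0. It then follows
that inequality (3-14) is true (with equality holding in this case).
If |A| ≠ 0, then we have a proper quadratic expression in (3-15). We
may complete the square in this expression. Doing so gives us

$$\begin{aligned}
|t\mathbf{A} + \mathbf{B}|^2 &= t^2|\mathbf{A}|^2 + 2t(\mathbf{A}\cdot\mathbf{B}) + \frac{(\mathbf{A}\cdot\mathbf{B})^2}{|\mathbf{A}|^2} + |\mathbf{B}|^2 - \frac{(\mathbf{A}\cdot\mathbf{B})^2}{|\mathbf{A}|^2} \\
&= \left[t|\mathbf{A}| + \frac{\mathbf{A}\cdot\mathbf{B}}{|\mathbf{A}|}\right]^2 + \frac{1}{|\mathbf{A}|^2}\left[|\mathbf{A}|^2|\mathbf{B}|^2 - (\mathbf{A}\cdot\mathbf{B})^2\right].
\end{aligned}$$

The left-hand side of this equation is nonnegative for any t. Hence
this must also be true of the right-hand side. In particular, the smallest
possible value of the right-hand side must be greater than or equal to zero.
However, the right-hand side is the sum of two terms, one of which is
squared. It will take on its smallest possible value when the squared
term is zero, that is, when t = -(A • B)/|A|^2. We thus conclude that

$$\frac{1}{|\mathbf{A}|^2}\left[|\mathbf{A}|^2|\mathbf{B}|^2 - (\mathbf{A}\cdot\mathbf{B})^2\right] \ge 0,$$

or, since |A|^2 > 0,

(A • B)^2 ≤ |A|^2 |B|^2,

which is equivalent to the Cauchy-Schwarz inequality.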
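The completing-the-square argument can be traced numerically. In the sketch below, which is an illustration only, the vectors are sample values, `quadratic` is a helper name introduced here for the expansion of |tA + B|^2, and exact fractions are used so that the minimum value can be compared with the bracketed remainder term without rounding.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as lists of components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def quadratic(t, a, b):
    """|tA + B|^2 expanded as t^2 |A|^2 + 2t (A . B) + |B|^2."""
    return t * t * dot(a, a) + 2 * t * dot(a, b) + dot(b, b)

# Sample vectors with integer components (A must be nonzero)
A = [1, -4, 8]
B = [-7, 6, 6]

# The value of t that makes the squared term vanish
t_min = Fraction(-dot(A, B), dot(A, A))

# Value of the quadratic there, and the remainder term from the proof
q_min = quadratic(t_min, A, B)
remainder = Fraction(dot(A, A) * dot(B, B) - dot(A, B) ** 2, dot(A, A))

print("t_min =", t_min)
print("minimum of |tA + B|^2 =", q_min)
print("remainder term =", remainder)
print("Cauchy-Schwarz holds:", dot(A, B) ** 2 <= dot(A, A) * dot(B, B))
```

Since the minimum equals the remainder term and a squared length cannot be negative, the inequality follows exactly as in the proof.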


If the student feels that the above proof depends upon a trick, he is
correct. While most proofs that he has seen earlier were probably fairly
obvious, it might not be clear how a proof such as this might be discovered.

The reason for this is that there is no easy method for finding such proofs.
One must discover the special device (or trick) that makes the proof
possible. Once one has seen such a proof, however, he should be able to
make use of the same trick in other proofs.
Now let us consider the sum of two vectors, A + B.
If we draw a picture showing the physical realization
of this vector sum (Fig. 3-16) and use the fact that
the sum of the lengths of two sides of a triangle is
greater than the length of the third side, we have

|A + B| ≤ |A| + |B|. (3-16)

This inequality, for obvious reasons, is called the triangle inequality. The
most interesting point is that we do not have to assume its truth or de­
pend on the geometric discussion given here. We can prove this inequality
using only the algebraic properties of vectors.

Theorem 3-7. (The triangle inequality.) For any two vectors A and B,
|A + B| ≤ |A| + |B|.

Proof: Let us compute the square of the left-hand member of this inequality.
This is
|A + B|^2 = (A + B) • (A + B)
= |A|^2 + 2A • B + |B|^2.

From the Cauchy-Schwarz inequality, A • B ≤ |A| |B|, and hence

|A + B|^2 ≤ |A|^2 + 2|A| |B| + |B|^2
= (|A| + |B|)^2.

This last inequality implies the triangle inequality.


An important thing to note about the work in this section is the use of
the fact that |R|2 = R • R to find the length of a vector which is given
as the sum of two or more vectors. Let us give a few other examples of
the use of this fact.

Theorem 3-8.
|A + B|^2 = |A|^2 + |B|^2

if and only if A and B are orthogonal.

Proof: We compute (exactly as above)

|A + B|^2 = |A|^2 + 2(A • B) + |B|^2.

The conclusion of the theorem then follows immediately upon application
of Definition 3-16.
This same algebraic device can be used to give vector proofs of theorems
involving the lengths of segments (and sometimes also theorems involving
angles). For example, let us prove the following:

The medians to the equal sides of an isosceles triangle are equal in length.

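Proof: Let A, B, and C be the vertices of the triangle and suppose that
|AB| = |AC|. The midpoints of the equal sides are given by ½(A + B)
and ½(A + C), and so the lengths of the medians are |½(A + B) - C|
and |½(A + C) - B|. To prove these two equal we compute

$$\begin{aligned}
\left|\tfrac12(\mathbf{A} + \mathbf{B}) - \mathbf{C}\right|^2 - \left|\tfrac12(\mathbf{A} + \mathbf{C}) - \mathbf{B}\right|^2
&= \left[\tfrac14|\mathbf{A}|^2 + \tfrac14|\mathbf{B}|^2 + |\mathbf{C}|^2 + \tfrac12\mathbf{A}\cdot\mathbf{B} - \mathbf{A}\cdot\mathbf{C} - \mathbf{B}\cdot\mathbf{C}\right] \\
&\quad - \left[\tfrac14|\mathbf{A}|^2 + \tfrac14|\mathbf{C}|^2 + |\mathbf{B}|^2 + \tfrac12\mathbf{A}\cdot\mathbf{C} - \mathbf{A}\cdot\mathbf{B} - \mathbf{B}\cdot\mathbf{C}\right] \\
&= \tfrac34|\mathbf{C}|^2 - \tfrac34|\mathbf{B}|^2 + \tfrac32\mathbf{A}\cdot\mathbf{B} - \tfrac32\mathbf{A}\cdot\mathbf{C} \\
&= \tfrac34\left[|\mathbf{C}|^2 - |\mathbf{B}|^2 + 2\mathbf{A}\cdot\mathbf{B} - 2\mathbf{A}\cdot\mathbf{C}\right].
\end{aligned}$$

However, by the assumption of the equal length of the two sides we have

$$\begin{aligned}
0 &= |\mathbf{C} - \mathbf{A}|^2 - |\mathbf{B} - \mathbf{A}|^2 \\
&= |\mathbf{C}|^2 - 2\mathbf{A}\cdot\mathbf{C} + |\mathbf{A}|^2 - |\mathbf{B}|^2 + 2\mathbf{A}\cdot\mathbf{B} - |\mathbf{A}|^2 \\
&= |\mathbf{C}|^2 - |\mathbf{B}|^2 + 2\mathbf{A}\cdot\mathbf{B} - 2\mathbf{A}\cdot\mathbf{C}.
\end{aligned}$$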

Putting this into the above equation then proves the theorem.
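A numerical instance may make the statement concrete. The sketch below is only an illustration: the triangle is a hypothetical one built so that the sides AB and AC both have length 3, and the helper names `dot`, `sub`, and `midpoint` are introduced here. It computes the squared lengths of the two medians with exact fractions.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors given as lists of components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def sub(u, v):
    """Componentwise difference u - v."""
    return [ui - vi for ui, vi in zip(u, v)]

def midpoint(p, q):
    """Midpoint (1/2)(p + q), kept as exact fractions."""
    return [Fraction(pi + qi, 2) for pi, qi in zip(p, q)]

# Hypothetical isosceles triangle: |AB| = |AC| = 3
A = [1, 2, 3]
B = [3, 3, 5]   # A + [2, 1, 2]
C = [0, 4, 5]   # A + [-1, 2, 2]

# Medians to the equal sides AB and AC, drawn from C and B respectively
median_from_C = sub(midpoint(A, B), C)
median_from_B = sub(midpoint(A, C), B)

print("squared lengths of sides:", dot(sub(B, A), sub(B, A)), dot(sub(C, A), sub(C, A)))
print("squared lengths of medians:", dot(median_from_C, median_from_C), dot(median_from_B, median_from_B))
```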
We remark that this proof could have been simplified by proper choice
of notation, but this proof is offered to show how the result can be obtained
even if no special care is taken. In the proof of the next theorem we show
how one can choose the notation so as to simplify the computations. Let
us prove:
The diagonals of a rhombus intersect at right angles.

Proof: Let A be a vertex of the rhombus and let R and S be the vectors
associated with the two directed line segments forming the sides of the
rhombus meeting at A. Then three of the vertices are given by A, A + R,
and A + S.
Since a rhombus is a parallelogram, the fourth vertex is given by
A + R + S. The sides of a rhombus are all equal. This tells us that |R| = |S|.
The vectors giving the diagonals are

D1 = (A + R + S) - A = R + S
and
D2 = (A + R) - (A + S) = R - S.

To show that these two diagonals are orthogonal, we compute

D1 • D2 = (R + S) • (R - S)
= |R|^2 - |S|^2
= 0,
which proves the theorem.
The reader should also observe how these computations can be simplified
even further by letting the point A be the origin. He should then go back
to the previous theorem and see how the calculations would be simplified
if the vertex A of the triangle were at the origin.
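The rhombus computation can likewise be tried on specific numbers. In this sketch, offered only as an illustration, the vertex A and the side vectors R and S are hypothetical values with |R| = |S| = 3, and `dot`, `add`, and `sub` are helper names introduced here.

```python
def dot(u, v):
    """Dot product of two vectors given as lists of components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def add(u, v):
    """Componentwise sum u + v."""
    return [ui + vi for ui, vi in zip(u, v)]

def sub(u, v):
    """Componentwise difference u - v."""
    return [ui - vi for ui, vi in zip(u, v)]

# Hypothetical rhombus: vertex A and side vectors R, S of equal length
A = [1, 2, 3]
R = [2, 1, 2]
S = [-1, 2, 2]

# The two diagonals, formed from the vertices as in the proof
D1 = sub(add(add(A, R), S), A)      # (A + R + S) - A = R + S
D2 = sub(add(A, R), add(A, S))      # (A + R) - (A + S) = R - S

print("|R|^2 =", dot(R, R), " |S|^2 =", dot(S, S))
print("D1 =", D1, " D2 =", D2)
print("D1 . D2 =", dot(D1, D2))
```

Note that the vertex A cancels out of both diagonals, which is why the text observes that one may as well place A at the origin.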

PROBLEMS

1. Assume that the triangle inequality is true. By comparing the expansion
of |A + B|^2 and (|A| + |B|)^2, prove the Cauchy-Schwarz inequality.
2. Write the Cauchy-Schwarz inequality in terms of the components of the
vectors involved.
3. Verify that the Cauchy-Schwarz inequality holds for the vector space of
Problem 9(a) of Section 3-4. Write this out in terms of components.
4. Show that equality can hold in the Cauchy-Schwarz inequality only if the
two vectors are collinear.
5. Under what circumstances can equality hold in the triangle inequality?
These circumstances can be determined by observing when equality holds
in the proof.
6. Write the triangle inequality in terms of components.
7. Let A = xe1, B = ye1 where e1 = [1, 0, 0]. What are |A|, |B|, and |A + B|?
What does the triangle inequality reduce to?
8. Let O be the center of a circle, let A and B be the endpoints of a diameter
of the circle, and let C be any other point on the circle. Set OA = R and
OC = P. What are OB, AC, and BC in terms of R and P? Prove that
AC and BC are orthogonal.
Using vector methods, prove the following theorems:
9. If the diagonals of a parallelogram are orthogonal, then the parallelogram
is a rhombus.

10. The midpoint of the hypotenuse of a right triangle is equidistant from the
three vertices.
11. If the diagonals of a parallelogram are equal in length, then the parallelogram
is a rectangle.
12. The line segments joining the midpoints of consecutive sides of a rhombus
form a rectangle.
13. The line segments joining the midpoints of consecutive sides of a rectangle
form a rhombus.
14. The sum of the squares of the distances from a point P to two opposite ver­
tices of a rectangle is equal to the sum of the squares of the distances from
P to the other two vertices.
15. State and prove the converse of Problem 14.
16. Use Theorem 3-16 to prove the Pythagorean theorem and its converse.
17. Prove that the sum of the squares of the lengths of the diagonals of a paral­
lelogram is equal to the sums of the squares of the four sides. Is the converse
true?
4
Planes and Lines
4-1 PLANES

In his Elements, Euclid attempted to define planes. By our present-day
standards his definition leaves much to be desired, but then, it is not
easy to give a truly adequate definition. For example, if pressed, the student
might try to define a plane as follows: A plane is a set of points with the
property that if any two points are in it, then the entire straight line through
these two points is in it.
Leaving aside for the moment any questions about the existence of the
line, there is still a great deal wrong with this attempted definition. Once
again, the student should attempt to find the difficulty for himself before
reading further.
There are many ways in which we could proceed to give a definition for
a plane, but we wish to choose one which can be generalized easily and
which leads to useful results in higher dimensions. At the same time, we
wish to have a definition which is both rigorous and easy to work with.
The definition attempted above, for example, is inadequate because the
entire space or a straight line satisfies it. We therefore will not attempt to
patch up this definition, but will start fresh.
The fundamental property we will use is that there exists a unique plane
through a given point, perpendicular to a given line. The student may
notice that this property has been used in the geometric discussions in
earlier sections. It therefore is only right that we make this our defining
property.

Definition 4-1. Let P0 be a given point in space and let A be a given
nonzero vector. Then by the plane through P0 orthogonal to A we mean
the set of points P = (x, y, z),
M = {P | (P0P) • A = 0}.

Note that according to our conventions, P0P = P - P0. Thus if
P0 = (x0, y0, z0) and A = [a, b, c], then P0P = [x - x0, y - y0, z - z0]
and (P0P) • A = a(x - x0) + b(y - y0) + c(z - z0), so the point
(x, y, z) is on the plane if and only if

a(x - x0) + b(y - y0) + c(z - z0) = 0. (4-1)


If this equation is multiplied out, and we set d = -ax0 - by0 - cz0,
then we have the equation

ax + by + cz + d = 0, (4-2)

which must be satisfied by the coordinates of the points of the plane. This
last equation has the form of a general linear equation. It is called the
cartesian form of the equation of the plane.
Suppose, on the other hand, that we are given a linear equation ax +
by + cz + d = 0, not all three of a, b, and c being zero. For example,
suppose a ≠ 0. Set x0 = -d/a, y0 = 0, and z0 = 0. Then this equation
is equivalent to the equation a(x - x0) + b(y - y0) + c(z - z0) = 0,
and hence is equivalent to (P0P) • A = 0 when P0 = (x0, y0, z0), P =
(x, y, z), and A = [a, b, c]. That is, the set of all points (x, y, z) whose
coordinates satisfy a nontrivial linear equation constitutes a plane. (A
nontrivial linear equation is a linear equation in which not all of the
coefficients of the variables are zero.) These facts are important enough to
warrant their collection into a theorem.

Theorem 4-1. The coordinates of all points on a plane satisfy a nontrivial
linear equation. Conversely, the set of all points whose coordinates
satisfy a nontrivial linear equation constitutes a plane.
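The converse half of this theorem is constructive: from the coefficients one can write down a point of the plane. The sketch below is an illustration under stated assumptions, not the text's own procedure. The function names are introduced here, integer coefficients are assumed so that exact fractions can be used, and the choice of point generalizes the case a ≠ 0 treated above by using the first nonzero coefficient.

```python
from fractions import Fraction

def point_on_plane(a, b, c, d):
    """Return one point P0 on the plane ax + by + cz + d = 0.
    Uses the first nonzero coefficient among a, b, c (integers assumed):
    that coordinate is set to -d/coefficient and the other two to 0."""
    for index, coefficient in enumerate((a, b, c)):
        if coefficient != 0:
            point = [Fraction(0), Fraction(0), Fraction(0)]
            point[index] = Fraction(-d, coefficient)
            return tuple(point)
    raise ValueError("trivial equation: a, b, and c are all zero")

def cartesian_value(p, a, b, c, d):
    """Left-hand side ax + by + cz + d evaluated at the point p."""
    return a * p[0] + b * p[1] + c * p[2] + d

def point_normal_value(p, p0, normal):
    """Left-hand side a(x - x0) + b(y - y0) + c(z - z0) evaluated at p."""
    return sum(n * (pi - p0i) for n, pi, p0i in zip(normal, p, p0))

# Hypothetical example: the equation 2x + y - 3z - 5 = 0
a, b, c, d = 2, 1, -3, -5
P0 = point_on_plane(a, b, c, d)
print("P0 =", P0)

# The two forms agree at an arbitrary test point
P = (4, -1, 7)
print(cartesian_value(P, a, b, c, d), point_normal_value(P, P0, (a, b, c)))
```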

Let us try to make clear exactly what we mean by saying that a vector
is orthogonal to a plane. This idea is expressed in Definition 4-1, but
needs to be made precise.

Definition 4-2. A vector R is parallel to a plane M if and only if there
exist points P1 and P2 in M such that P1P2 = R.

Definition 4-3. A vector which is orthogonal to every vector parallel
to a plane M is said to be orthogonal to M.

These definitions should be clear enough. In order for the terminology
of Definition 4-1 to correspond to these definitions, we must have the
following theorem:

Theorem 4-2. Let M be the plane through P0, orthogonal to A, as in
Definition 4-1. Then A is orthogonal to M in the sense of Definition 4-3.

Proof: Let P1 and P2 be any two points of M. From Definition 4-1
we have
A • (P0P1) = A • (P1 - P0) = 0,
A • (P0P2) = A • (P2 - P0) = 0.
However, then

A • (P1P2) = A • (P2 - P1)
= A • [(P2 - P0) - (P1 - P0)]
= A • (P2 - P0) - A • (P1 - P0)
= 0,
which proves the theorem.

There is essentially only one vector orthogonal to a given plane. That


is, any two vectors orthogonal to the same plane must be collinear. We
can obtain this result more easily at a later stage, but since it seems to be
relevant at this point, let us prove it here.

Theorem 4-3. Let M be the plane through P0 orthogonal to A. If B
is orthogonal to M, then A and B are collinear.

Proof: The vector A ≠ 0. If A = [a1, a2, a3], then one of the three
components is not zero. Let us suppose for definiteness that a1 ≠ 0.
The proof would be similar in other cases. Let

t = b1/a1.

Now P = (x, y, z) is a point of the plane if and only if

a1(x - x0) + a2(y - y0) + a3(z - z0) = 0. (4-3)

Multiplying this equation by t, we have, since ta1 = b1,

b1(x - x0) + ta2(y - y0) + ta3(z - z0) = 0. (4-4)

However, if B is orthogonal to the plane, it must be orthogonal to P - P0,
and hence
b1(x - x0) + b2(y - y0) + b3(z - z0) = 0.

Subtracting (4-4) from this, we see that for any point P on the plane,

(b2 - ta2)(y - y0) + (b3 - ta3)(z - z0) = 0. (4-5)



We now make use of this relation by choosing particular points on the
plane. First, set

x = x0 - a2/a1, y = y0 + 1, z = z0. (4-6)

Substituting these values into Eq. (4-3), we see that the point (x, y, z) is
on the plane. When these values are put into (4-5), however, we find
b2 = ta2.
In the same way, if we set

x = x0 - a3/a1, y = y0, z = z0 + 1, (4-7)

we again have a point on the plane, and Eq. (4-5) gives us b3 = ta3. We
therefore have shown

b1 = ta1, b2 = ta2, b3 = ta3, (4-8)

or B = tA, which proves the theorem.

Theorems 4-2 and 4-3 together show that, exclusive of scalar multiples,
there is only one vector orthogonal to a given plane. Suppose, on the
other hand, that two nonzero vectors A and B are collinear and that a
point P0 is given. Are the planes through P0 orthogonal to A and B the
same? The answer is, of course, yes.

Theorem 4-4. If A and B are nonzero collinear vectors and if P0 is a
fixed point, then the plane through P0 orthogonal to A is identical to
the plane through P0 orthogonal to B.

Proof: Let B = kA, where k ≠ 0 (why is this possible?). Let

M1 = {P | A • (P - P0) = 0},
and
M2 = {P | B • (P - P0) = 0}
= {P | kA • (P - P0) = 0}.

However, since k ≠ 0, the expression in the last line can be zero if and
only if A • (P - P0) = 0, and hence we can conclude that M1 = M2.
The equation, in vector form, of the plane through a point P0 orthogonal
to a given vector A is
A • (P - P0) = 0.

By making use of the algebraic properties of the dot product, this can
also be written in the form

A • P - A • P0 = 0, (4-9)

or equivalently, in the form

A • P = A • P0.

This last form is quite easy to remember, since it is obvious that P = P0
satisfies this equation. Form (4-9), however, corresponds exactly to
(4-2), since if A = [a, b, c] and P = (x, y, z), then

A • P = ax + by + cz,

while A • P0 is a constant which can be identified with -d, d being the
constant in (4-2).
Suppose, for example, that we wish to find the cartesian equation of the
plane through P0 = (1, 2, 3) orthogonal to A = [3, -4, 1]. We have

A • P = 3x - 4y + z,
while
A • P0 = 3 - 8 + 3 = -2,

and hence the desired equation is

3x - 4y + z + 2 = 0.

Conversely, if we are given an equation such as

2x + y - 3z - 5 = 0,
we recognize it as the equation of a plane orthogonal to A = [2, 1, -3].
But what point does it pass through? All we need do is to find any P0
whose coordinates satisfy the given equation. In this case, a simple
choice is P0 = (0, 5, 0). This equation is, therefore, the equation of the
plane through (0, 5, 0) orthogonal to [2, 1, -3], and is equivalent to

[2, 1, -3] • ([x, y, z] - [0, 5, 0]) = 0.
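The passage from the vector form to the cartesian form amounts to a single dot product. The following sketch, given only as an illustration, uses a hypothetical helper `plane_equation` that returns the coefficients (a, b, c, d) with d = -(A • P0), and applies it to the two planes discussed above.

```python
def dot(u, v):
    """Dot product of two vectors given as sequences of components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def plane_equation(p0, normal):
    """Coefficients (a, b, c, d) of ax + by + cz + d = 0 for the plane
    through the point p0 orthogonal to the nonzero vector normal."""
    a, b, c = normal
    return (a, b, c, -dot(normal, p0))

def satisfies(point, coefficients):
    """True if the point's coordinates satisfy ax + by + cz + d = 0."""
    a, b, c, d = coefficients
    return a * point[0] + b * point[1] + c * point[2] + d == 0

# Plane through (1, 2, 3) orthogonal to [3, -4, 1]
first = plane_equation((1, 2, 3), (3, -4, 1))
print(first)

# Plane through (0, 5, 0) orthogonal to [2, 1, -3]
second = plane_equation((0, 5, 0), (2, 1, -3))
print(second)
print(satisfies((0, 5, 0), second))
```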

PROBLEMS

1. Let M be the plane, as defined above, through the point P0 orthogonal to
the vector A. Let P1 be another point in M and let M1 be the plane through
P1 orthogonal to A. Prove that M = M1. (Recall that in order to show
that two sets are the same we must show that any point in the first is also
in the second, and any point in the second is also in the first.)
2. Find the equation, in standard form ax + by + cz + d = 0, of the plane
through the point P0 orthogonal to A if
(a) P0 = (1, 3, -2), A = [1, 2, 7]
(b) P0 = (4, 0, -1), A = [2, 2, -1]
(c) P0 = (2, 1, 0), A = [0, 0, 1]
(d) P0 = (1, 1, 1), A = [1, 0, 0]
3. Find the equation, in standard form, for the plane through P0 orthogonal
to A for each of the following. Find the point at which the resulting plane
cuts the z-axis.
(a) P0 = (1, 2, 3), A = [5, 1, 1]
(b) P0 = (0, 1, 0), A = [1, 1, -1]
(c) P0 = (5, -5, 0), A = [1, 1, 2]
(d) P0 = (1, 10, 1), A = [-5, 1, -4]
4. For each of the following equations, give a point P0 and a vector A such that
the equation is that of a plane through P0 orthogonal to A:
(a) 3x - 2y + z + 5 = 0
(b) x + y + 1 = 0
(c) 2x + y + 4z = 0
(d) -x - 2y - 3z + 6 = 0
5. By assuming an equation of the form ax + by + cz + d = 0, and treating
a, b, c, and d as unknowns, find the equation of the plane containing the
three points (1, 0, 2), (2, 1, 1), and (-1, -3, 3). How is it that we can
solve for four unknowns, with only three equations being given by these
three requirements?
6. Let M be a plane through the origin. A vector X = [x1, x2, x3] is in the
plane M if and only if the point X = (x1, x2, x3) is in the plane. Let P0
be any point in space and set N = {P | P0P ∈ M}. What is N? Prove
your answer.
7. Let B = [b1, b2, b3] and C = [c1, c2, c3] be two nonzero vectors which are
not collinear. Suppose the vector A = [a1, a2, a3] is orthogonal to both
B and C (and A ≠ 0). Prove that for any real s and t, sB + tC lies in the
plane through the origin orthogonal to A.
8. Verify the details of the proof of Theorem 4-3 by showing that the points
given by (4-6) and (4-7) are on the plane and that (4-8) follows.
9. At what points does the plane
ax + by + cz + d = 0
cut the three coordinate axes? Use these formulas to make a sketch showing
the location of the planes for each part of Problem 3. Draw the triangular
section of the plane determined by these three points. What happens in (6) ?

10. (a) What is the equation of the sphere with center (1, 0, 12) which passes
through the point (3, 5, -2)?
(b) Find the equation of the plane through the point (3, 5, -2) which is
tangent to the sphere of part (a). [Hint: What line segment through
this point would be orthogonal to the plane?]

4-2 THE CROSS PRODUCT

A number of problems in the preceding sections could be reduced to
finding a vector orthogonal to two given vectors. In particular, Problem 5
of the last section is simplified if we could find such a vector.
Suppose two vectors A = [a1, a2, a3] and B = [b1, b2, b3] are given,
and we wish to find a vector X = [x, y, z] simultaneously orthogonal to
both, that is, to find an X satisfying A • X = 0 and B • X = 0, or

a1x + a2y + a3z = 0,
b1x + b2y + b3z = 0.

We wish to solve this pair of equations for the three unknowns x, y, and z.
The solution obviously cannot be unique (think geometrically; if any
nonzero vector X satisfies the requirement, then so does any vector
collinear with X), but we will be satisfied to obtain any nonzero vector
solution.
In trying to solve this pair of equations we must be cautious. We cannot
divide through by any of the ai or bi, since some may be zero. Therefore,
if we try to eliminate one of the unknowns, say z, we must do so by
multiplying the first equation by b3 and the second by a3 and subtracting. This
gives us
(a1b3 - a3b1)x + (a2b3 - a3b2)y = 0.

This equation can be satisfied (we cannot divide!) if we set

x = (a2b3 - a3b2),
y = -(a1b3 - a3b1).
We can put these two values into our original equations. After some
simplification, this gives
a3a2b1 - a3a1b2 + a3z = 0,
b3a2b1 - b3a1b2 + b3z = 0.

It can be seen that these are both satisfied if

z = a1b2 - a2b1.

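These three values thus yield us a somewhat arbitrary solution to our
problem. We give this solution a special name.

Definition 4-4. Given two vectors A = [a1, a2, a3] and B = [b1, b2, b3],
the cross product of these two vectors is defined to be the vector

A X B = [(a2b3 - a3b2), -(a1b3 - a3b1), (a1b2 - a2b1)]. (4-10)

A mnemonic which is helpful in remembering the form of the
cross product is the representation as a formal determinant:

$$\mathbf{A}\times\mathbf{B} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}.$$

Expansion of this determinant by the cofactors of the top row gives a
representation of A X B.
Thus, for example, to find [-1, 3, 2] X [5, 1, -1] we write

$$\begin{aligned}
\begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ -1 & 3 & 2 \\ 5 & 1 & -1 \end{vmatrix}
&= \mathbf{e}_1\begin{vmatrix} 3 & 2 \\ 1 & -1 \end{vmatrix} - \mathbf{e}_2\begin{vmatrix} -1 & 2 \\ 5 & -1 \end{vmatrix} + \mathbf{e}_3\begin{vmatrix} -1 & 3 \\ 5 & 1 \end{vmatrix} \\
&= -5\mathbf{e}_1 + 9\mathbf{e}_2 - 16\mathbf{e}_3 \\
&= [-5, 9, -16].
\end{aligned}$$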

After a bit of practice the reader should find it possible to write down the
final vector directly after writing the second vector below the first:

[—1,3, 2],
[ 5,1,-1],

and visualizing the appropriate cofactors (don’t forget the negative sign
on the second cofactor). When this is done, the student should always
check that the dot product of the original vectors and this result is zero.
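The recommended check is easy to automate. The sketch below is an illustration only; `cross` is written directly from the component formula of the definition, `dot` is the usual dot product, and both names are introduced here. The example vectors are those used above.

```python
def dot(u, v):
    """Dot product of two vectors given as lists of components."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(a, b):
    """Cross product written out component by component:
    [a2 b3 - a3 b2, -(a1 b3 - a3 b1), a1 b2 - a2 b1]."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1]

A = [-1, 3, 2]
B = [5, 1, -1]

product = cross(A, B)
print("A X B =", product)

# The check suggested in the text: both dot products should vanish
print("A . (A X B) =", dot(A, product))
print("B . (A X B) =", dot(B, product))
```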
Geometrically, it is evident that there are exactly two vectors of a given
magnitude orthogonal to a given pair of noncollinear vectors, A and B.
If A and B are thought of as directed line segments from the same point,
then they determine a plane through that point and there are two vectors
of magnitude 1, say, orthogonal to that plane, one pointing to each side
of the plane.
Suppose A and B, contained in a plane M, are not collinear, and are
represented as directed line segments from a common point. If we measure
the angle from A to B counterclockwise in the plane M, we will get an angle
less than π looking at the plane from one side, and an angle greater than π
looking at it from the other side. The vector A X B points toward the side
of the plane from which this angle is less than π.
Another way of expressing this result is in terms of the right-hand rule.
If the right hand is held as in Fig. 3-2 (e), with the thumb pointing in the
direction of the vector A and the first finger pointing in the direction of B
(or as close to the direction of B as the joints will allow), then the middle
finger will point in the direction of A X B.
The proof of this would be tedious, and will not be given here. The
student may use this fact without question, however.
In (4-10) we have defined a unique vector obtained from two given
vectors, that is, an algebraic product of a new type. Again, we are in­
terested in the algebraic laws that are satisfied. Here we find some sur­
prises. First, let us think of the commutative law. Interchanging A and B
in the above definition amounts to interchanging the small a’s and small b’s.
Note that if this is done, the two terms in each component of the definition
of A X B are interchanged. But these two terms have opposite signs.
Thus the commutative law does not hold in general. In fact, we actually
have an anticommutative law:

A X B = -(B X A).

Note how this result is connected with the physical picture described
above. Interchanging the roles of the vectors reverses the direction of the
cross product.
The next general property to investigate is the associative law. Here
again we find that the law fails in general. For, setting as usual e1 =
[1, 0, 0], e2 = [0, 1, 0], and e3 = [0, 0, 1], we find e1 X e2 = e3, e2 X
e3 = e1, and e2 X e2 = 0. Hence

e2 X (e2 X e3) = e2 X e1 = -e3,
(e2 X e2) X e3 = 0 X e3 = 0.

The failure of these laws to hold means that care must be exercised in
algebraic manipulations involving the cross product.
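Both failures can be observed on the unit vectors themselves. In the following sketch, which is only an illustration, `cross` is a helper written from the component formula for the cross product, and `neg` is a small helper introduced here to negate a vector.

```python
def cross(a, b):
    """Cross product from the component formula
    [a2 b3 - a3 b2, -(a1 b3 - a3 b1), a1 b2 - a2 b1]."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2, -(a1 * b3 - a3 * b1), a1 * b2 - a2 * b1]

def neg(u):
    """The vector -u."""
    return [-ui for ui in u]

e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]

# Anticommutativity: e1 X e2 and e2 X e1 differ in sign
print(cross(e1, e2), cross(e2, e1))

# Failure of associativity: the two groupings give different vectors
left = cross(e2, cross(e2, e3))    # e2 X (e2 X e3)
right = cross(cross(e2, e2), e3)   # (e2 X e2) X e3
print(left, right)
```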
The associative and commutative laws do hold when we mix the scalar
multiple and the cross product. This property, which is usually called
the homogeneity property, is easily verified in the form t(A X B) =
(tA) X B = A X (tB). This follows from the observation that in the
definition of A X B, each term in each component contains exactly one
component of A as a factor, so if each component of A, say, is multiplied
by t, a single factor of t appears in each component of A X B. Similarly,
each term of each component of A X B contains exactly one component
of B as a factor, and hence A X (tB) = t(A X B).
We will postpone discussion of the behavior of combinations of the dot
and cross products and turn to the distributive law. Looking at the
definition of A X B, we see that each component is linear in the components of
B. Hence A X (B + C) = A X B + A X C. Similarly, we could show
that the distributive law holds in the other order, or we can use the
anticommutative law to obtain the same result.
The above analysis can be summarized in the following theorem:

Theorem 4-5. The cross product is a binary operation between vectors
which is neither commutative nor associative, but which satisfies the
following algebraic laws.
(1) Anticommutativity:
A X B = -(B X A).
(2) Bilinearity:

(tA) X B = A X (tB) = t(A X B),
A X (B + C) = A X B + A X C,
(B + C) X A = B X A + C X A.

The bilinearity property of the cross product is essentially a distributive
property. It can also be extended to the cross product of sums of vectors.
For example,

$$\left(\sum_{i=1}^{n}\mathbf{A}_i\right)\times\left(\sum_{j=1}^{m}\mathbf{B}_j\right) = \sum_{i=1}^{n}\sum_{j=1}^{m}\mathbf{A}_i\times\mathbf{B}_j.$$
Similarly, the first part of the bilinearity property could also be combined
with this same result (see Problem 11).

Next we wish to examine the combinations of three vectors involving


the cross and dot products. Given two vectors B and C, B X C is also a
vector. Hence we can consider the dot product of this vector and a third
vector A. This particular combination, A • (B X C), occurs frequently
enough to deserve a special name.

Definition 4-5. Given three vectors A, B, and C, the scalar triple product
of these three vectors, in this order, is A • B X C.

Note that in this definition, the parentheses have been left off B X C.
This can be done since there is no way in which the result could be mis­
understood. Since the cross product is defined only between two vectors,
we could not take the dot product first.

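If we let A = [a1, a2, a3], B = [b1, b2, b3], and C = [c1, c2, c3], then
it is easily verified from the definitions that

A • B X C = a1b2c3 + a2b3c1 + a3b1c2 - a3b2c1 - a1b3c2 - a2b1c3. (4-11)

This is easily seen by noting that

$$\begin{aligned}
\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}
&= \mathbf{A}\cdot\begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} \\
&= \mathbf{A}\cdot\left(\mathbf{e}_1\begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - \mathbf{e}_2\begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + \mathbf{e}_3\begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}\right) \\
&= a_1\begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2\begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3\begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix} \\
&= \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix},
\end{aligned} \tag{4-12}$$

which is itself a useful result.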


We can obtain the expansion of C • A X B by replacing each a by c,
each b by a, and each c by b in (4-11). It is easily seen that if we do so,
this expression is unaltered. The three terms with the positive sign inter­
change cyclically, and so do the three with the negative sign. In other
words,
A • B X C = C • A X B.

This can also be put in the form

A • B X C = A X B • C,

which is easily remembered as the fact that the dot and cross can be
interchanged without altering the scalar triple product.
As a consequence of this and the anticommutative law, B X C =
-C X B, we can evaluate any of the six possible scalar triple products by
using A, B, and C in various orders. Another simple consequence is that
A • B X C is zero if any two of the three vectors are identical. This
follows since the cross product of any vector and itself is zero. If two
vectors are collinear, the same result must follow, since a scalar multiple
can be factored out of the product. We therefore have:

Theorem 4-6. The scalar triple product is unchanged upon interchange
of the dot and cross product,

A X B • C = A • B X C.

The value of the scalar triple product is zero if any two of the three
vectors are collinear.

By the way in which it was constructed, the cross product of two vectors
is orthogonal to both the original vectors. But what if the two vectors are
collinear? There is no single direction orthogonal to both of them. What
happens to the cross product? The answer is contained in

Theorem 4-7. A X B = 0 if and only if A and B are collinear.

Proof: Suppose that A X B = 0. Now, if A = 0, then A and B are
automatically collinear; so suppose that A ≠ 0. Then at least one
component of A is nonzero. Let us suppose that $a_1 \ne 0$ (the proof would be
similar if one of the other components were taken to be nonzero).
Let $t = b_1/a_1$. Then
\[ b_1 = t a_1. \]
The fact that A X B = 0 means that all three components are zero, and
hence
\[ a_2 b_3 - a_3 b_2 = 0, \]
\[ a_1 b_3 - a_3 b_1 = 0, \]
and
\[ a_1 b_2 - a_2 b_1 = 0. \]
From the last of these, we have
\[ b_2 = \frac{b_1}{a_1}\, a_2 = t a_2. \]
From the second of the relations, we have
\[ b_3 = \frac{b_1}{a_1}\, a_3 = t a_3. \]
Thus we have proved
\[ \mathbf{B} = [b_1, b_2, b_3] = [t a_1, t a_2, t a_3] = t\mathbf{A}, \]
which proves the first half of the theorem.


The proof of the other half of the theorem will be left to the student
(see Problem 2 at the end of this section).
Let us illustrate the use of the cross product in the solution of a
problem. We wish to find the equation of the plane passing through the points
A = (1, 0, 2), B = (2, 2, -1), and C = (1, 1, 0). This plane must be
orthogonal to a vector which is in turn orthogonal to both of the vectors
AC = [0, 1, -2] and BC = [-1, -1, 1]. Thus, a vector orthogonal to
the plane is
AC X BC = [-1, 2, 1].

Therefore the equation of the desired plane is

[-1, 2, 1] • [x, y, z] - [-1, 2, 1] • [1, 1, 0] = 0,


or
—x + 2y + z — 1 = 0.
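
The same computation can be sketched in Python. The function name `plane_through` and the convention of returning the plane as a normal vector n and a constant d (so that the plane is n • X + d = 0) are assumptions made for this illustration.

```python
def sub(p, q):
    """Vector from point q to point p, componentwise p - q."""
    return [a - b for a, b in zip(p, q)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def plane_through(a, b, c):
    """Return (n, d) such that n . X + d = 0 is the plane through a, b, c."""
    n = cross(sub(c, a), sub(c, b))      # AC x BC, as in the worked example
    if n == [0, 0, 0]:
        raise ValueError("the three points are collinear; no unique plane")
    return n, -dot(n, c)


n, d = plane_through([1, 0, 2], [2, 2, -1], [1, 1, 0])
print(n, d)   # normal [-1, 2, 1] and constant -1, i.e. -x + 2y + z - 1 = 0
```

By Theorem 4-7 the cross product vanishes exactly when AC and BC are collinear, which is why a zero normal is treated as an error.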

PROBLEMS

1. Find A X B, given that


(a) A = [1,3, 1], B = [7, 2,-1]
(b) A = [0,1,1], B = [1,0, 1]
(c) A = [1,-2, 3], B = [-3, 2,-1]
(d) A = [5, 2, -1], B = [4, —7, 2]
(e) A = [1, 1, 1], B = [-1, -1, 1]
2. Prove that A X B = 0, given that A and B are collinear.
3. By actually computing the vectors involved, prove that t(A X B) =
(tA) X B = A X (tB).
4. By actually computing the vectors involved, prove that A X (B + C) =
A X B + A X C.
5. Prepare a multiplication table for the cross products of the vectors ei, e2,
and e3.
6. There are twelve possible scalar triple products which can be written down
using A, B, and C. Write down each of these and, using the two operations
discussed at the end of this section, obtain its value in terms of A • B X C.
7. Find the equations of the planes containing the three given points:
(a) (1, —2, 5), (0, -5, -1), (-3, 5, 0) (b) (5, 0, 1), (2, 3, 1), (0, -5, 3)
(c) (2, 5, 3), (0, 1, 1), (-1, 3, 0) (d) (1, —4, 1), (0, -2, 0), (-2, 2, -2)
8. Find the equation of the plane containing the points (1, 0, 1) and (3, 1, 2)
and parallel to B = [1, —1, 2].
9. Find a nonzero vector parallel to the plane 3x y — z — 2 = 0 and
orthogonal to B = [1, 0, 2].
10. Find a nonzero vector simultaneously parallel to both of the planes
x + 2y - z + 1 = 0 and 2x - y + z + 9 = 0.
11. Verify that
\[
\Bigl(\sum_{i=1}^{3} a_i \mathbf{e}_i\Bigr) \times \Bigl(\sum_{j=1}^{3} b_j \mathbf{e}_j\Bigr)
  = \sum_{i=1}^{3} \sum_{j=1}^{3} a_i b_j\, \mathbf{e}_i \times \mathbf{e}_j.
\]

4-3 DISTANCE FORMULAS

Let A and Q be given points and let B be a given nonzero vector. We
wish to find the distance from Q to the plane through A orthogonal to B.
Our knowledge of euclidean geometry tells us that this distance would be
the distance between the point Q and the point P on the plane which is
such that the vector QP is collinear with B (and hence orthogonal to the
plane). This vector is the projection of QA on the
line with the direction B. This gives us, from
Definition 3-16,
\[
\overrightarrow{QP} = \frac{\overrightarrow{QA} \cdot \mathbf{B}}{|\mathbf{B}|^2}\, \mathbf{B},
\]
and the required distance is (Fig. 4-1)
\[
|\overrightarrow{QP}| = \frac{|\overrightarrow{QA} \cdot \mathbf{B}|}{|\mathbf{B}|}.
\]

Figure 4-1

We should, however, verify this solution, using the algebraic properties
we have assumed. First, let us see that the point P is indeed on the plane.
From the definition of a plane, this is true if and only if AP • B =
(QP - QA) • B = 0. But using the value given above for QP, we have
\[
\begin{aligned}
(\overrightarrow{QP} - \overrightarrow{QA}) \cdot \mathbf{B}
 &= \frac{\overrightarrow{QA} \cdot \mathbf{B}}{|\mathbf{B}|^2}\, (\mathbf{B} \cdot \mathbf{B})
    - (\overrightarrow{QA} \cdot \mathbf{B}) \\
 &= (\overrightarrow{QA} \cdot \mathbf{B}) - (\overrightarrow{QA} \cdot \mathbf{B}) \\
 &= 0.
\end{aligned}
\]
Next, we verify that the point P is the closest point of the plane to Q.
Suppose X is any other point on the plane. Since this plane is also the
plane through P orthogonal to B, we must have PX orthogonal to B.
But then PX must also be orthogonal to QP, since QP and B are collinear.
We must then have
\[
|\overrightarrow{QX}|^2 = |\overrightarrow{QP} + \overrightarrow{PX}|^2
  = |\overrightarrow{QP}|^2 + |\overrightarrow{PX}|^2
\]

from Theorem 3-8. From this, we can conclude that |QX| is always greater
than or equal to |QP|, with equality holding only when X = P. We
have thus proved

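Theorem 4-8. Given points A and Q and a nonzero vector B, the point
on the plane through A orthogonal to B which is closest to Q is P, where
\[
\mathbf{P} = \frac{\overrightarrow{QA} \cdot \mathbf{B}}{|\mathbf{B}|^2}\, \mathbf{B} + \mathbf{Q},
\tag{4-13}
\]
and the distance from Q to the plane is
\[
|\overrightarrow{QP}| = \frac{|\overrightarrow{QA} \cdot \mathbf{B}|}{|\mathbf{B}|}.
\tag{4-14}
\]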

Rather than trying to memorize these formulas, many students find it
easier to remember only the formula for projections, and redevelop (4-13)
or (4-14) as needed, by thinking of Fig. 4-1. Note how (4-14) is related
to the formula given in Definition 3-16 for the cosine of the angle between
two vectors. If θ is the angle between QA and QP (or equivalently between
QA and B), then from Fig. 4-1 we would have
\[
|\overrightarrow{QP}| = |\overrightarrow{QA}|\, |\cos\theta|.
\]
But from Definition 3-16,
\[
|\cos\theta| = \frac{|\overrightarrow{QA} \cdot \mathbf{B}|}{|\overrightarrow{QA}|\, |\mathbf{B}|}.
\]
An immediate consequence of these two equations is (4-14).
Suppose the given plane has the equation ax + by + cz + d = 0, and we
wish to find the distance from this plane to the point $Q = (x_1, y_1, z_1)$.
To use the above formula, we need a point of the plane. Let $A =
(x_0, y_0, z_0)$ be such a point. Then we must have $ax_0 + by_0 + cz_0 = -d$.
The formula in Theorem 4-8 then gives the distance from Q to the plane as
\[
\frac{|[x_0 - x_1,\; y_0 - y_1,\; z_0 - z_1] \cdot [a, b, c]|}{(a^2 + b^2 + c^2)^{1/2}}
 = \frac{|ax_0 + by_0 + cz_0 - (ax_1 + by_1 + cz_1)|}{(a^2 + b^2 + c^2)^{1/2}}
 = \frac{|ax_1 + by_1 + cz_1 + d|}{(a^2 + b^2 + c^2)^{1/2}}.
\]

Theorem 4-9. The distance from the point $(x_1, y_1, z_1)$ to the plane
having equation ax + by + cz + d = 0 is
\[
\frac{|ax_1 + by_1 + cz_1 + d|}{(a^2 + b^2 + c^2)^{1/2}}.
\tag{4-15}
\]

The reader should note the simplification of this formula when the
normal vector [a, b, c] is taken to be of length one and also what the formula
reduces to when the distance from the origin to the plane is calculated.

Let us see an example of the use of the formulas developed above.
Consider the plane M with equation

5x - 14y + 2z + 9 = 0

and the point Q = (-2, 15, -7). We wish to find the distance from Q
to the plane and the point on the plane which is closest to Q.
It is easy to find the distance from Q to M by the use of formula (4-15).
This formula gives us the distance
\[
\frac{|-10 - 210 - 14 + 9|}{(25 + 196 + 4)^{1/2}} = \frac{225}{15} = 15.
\]

We can use (4-13) to find the point P on M which is closest to Q. For
this formula we need a point A on M, however. We choose one arbitrarily
to satisfy the given equation. For example, we let A = (1, 1, 0). Then
QA = [3, -14, 7], and hence
\[
\begin{aligned}
\mathbf{P} &= \frac{[3, -14, 7] \cdot [5, -14, 2]}{|[5, -14, 2]|^2}\, [5, -14, 2] + [-2, 15, -7] \\
 &= \tfrac{225}{225}\, [5, -14, 2] + [-2, 15, -7] \\
 &= [3, 1, -5].
\end{aligned}
\]
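
The two formulas used in this example, (4-13) and (4-15), can be sketched in Python as follows. The function names `closest_point` and `distance_to_plane` are hypothetical, and exact rational arithmetic via `fractions.Fraction` is a choice made here so that the closest point comes out without rounding; integer inputs are assumed.

```python
from fractions import Fraction
from math import sqrt


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def closest_point(a, q, b):
    """Formula (4-13): the point of the plane through a, orthogonal to b,
    that is closest to q.  Integer components are assumed."""
    qa = [ai - qi for ai, qi in zip(a, q)]
    t = Fraction(dot(qa, b), dot(b, b))          # (QA . B) / |B|^2
    return [t * bi + qi for bi, qi in zip(b, q)]


def distance_to_plane(coeffs, q):
    """Formula (4-15) for the plane ax + by + cz + d = 0, coeffs = (a, b, c, d)."""
    a, b, c, d = coeffs
    return abs(a * q[0] + b * q[1] + c * q[2] + d) / sqrt(a * a + b * b + c * c)


Q = [-2, 15, -7]
print(distance_to_plane((5, -14, 2, 9), Q))        # 15.0
P = closest_point([1, 1, 0], Q, [5, -14, 2])
print([int(x) for x in P])                         # [3, 1, -5]
```

Any other point A of the plane gives the same P, since only the component of QA along B enters the formula.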
A plane which is parallel to the z-axis has an equation which contains no
z-term. This can also be thought of as the equation of a line in the
xy-plane (the line formed by the intersection with the xy-plane).
Consideration of the above formulas in these terms leads to the following conclusion:

Theorem 4-10. The distance from a point $(x_1, y_1)$ of the xy-plane to a
line ax + by + d = 0 is
\[
\frac{|ax_1 + by_1 + d|}{(a^2 + b^2)^{1/2}}.
\]

Further discussion of this result can be found in the problems at the
end of this section.

One final formula which can be included at this time is for the
determination of the angle between two planes. When two planes intersect we may
choose a point on the line of intersection and measure the angle formed
between two lines, one in each plane, from this point. It can easily be seen
that no matter how the planes are situated, the resulting angle can fall
anywhere between 0 and π, inclusive. The two limiting cases result when
both lines are chosen to be along the line of intersection. In order to
specify the angle of intersection between two planes, we must decide which
of these many possible angles to measure. Reference to geometric intuition
tells us that we should choose to measure the angle between the two lines
which are orthogonal to the line of intersection.
It could actually be shown that this pair of lines has a stronger property.
Indeed, if we fix a line in one plane and let the line in the other plane vary,
we will find that we get many angles, but one of them will be a minimum.
If we then let the first line vary, this minimum will change, but for some
angle it will be a maximum. The maximum value of the minimum turns
out to be the angle between the two lines orthogonal to the line of inter­
section, exactly the angle we choose to call the angle between the two
planes.
Since we do not have these lines given to us, but we do know the vectors
orthogonal to the two planes, we choose to define the angle between the
planes as the angle between the orthogonal vectors. This corresponds to
the familiar geometric fact that two angles whose sides are respectively
orthogonal are equal. However, each plane has two distinct orthogonal
vectors of a given length (one the negative of the other). There are
therefore two possible angles (between 0 and π). They are related by the fact
that their cosines are negatives of each other. Hence one of them is
between π/2 and π, and the other is between 0 and π/2. We will choose
the smaller of the two angles to call the angle between the two planes.

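Definition 4-6. Let $M_1$ and $M_2$ be two planes and let $\mathbf{B}_1$ and $\mathbf{B}_2$ be
nonzero vectors orthogonal to $M_1$ and $M_2$ respectively. Then the cosine
of the angle between $M_1$ and $M_2$ is
\[
\frac{|\mathbf{B}_1 \cdot \mathbf{B}_2|}{|\mathbf{B}_1|\, |\mathbf{B}_2|}.
\]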

For example, the cosine of the angle between the planes with equations

3x - 6y + 6z - 5 = 0
and
6x + 9y - 2z + 3 = 0
is
\[
\frac{|18 - 54 - 12|}{9 \cdot 11} = \frac{16}{33}.
\]
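
Definition 4-6 translates directly into a short computation. The sketch below assumes each plane is supplied through a normal vector read off from its equation; the name `cos_angle_between_planes` is illustrative only.

```python
from math import sqrt


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cos_angle_between_planes(n1, n2):
    """|B1 . B2| / (|B1| |B2|) for nonzero normal vectors n1 and n2."""
    return abs(dot(n1, n2)) / (sqrt(dot(n1, n1)) * sqrt(dot(n2, n2)))


# Normals of 3x - 6y + 6z - 5 = 0 and 6x + 9y - 2z + 3 = 0.
c = cos_angle_between_planes([3, -6, 6], [6, 9, -2])
print(c)   # approximately 16/33
```

Because of the absolute value, replacing either normal by its negative leaves the result unchanged, which matches the choice of the smaller of the two possible angles.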

PROBLEMS

1. Find the distance from the given point to the given plane:
(a) (1, 3, 1), 3x + 7y - 5z + 3 = 0
(b) (2, 1, -5), 2x - y + 2z + 1 = 0
(c) (1, 1, 0), 3x + 4z + 2 = 0
(d) (2, 0, 3), 6x + 2y + 3z = 0
2. In the distance formula of Theorem 4-10, the absolute value of the numerator
is taken. What is the meaning of the sign of this quantity? Consider this
question in connection with the direction of the normal vector [a, b, c].
3. Let M be the plane with equation ax + by + cz + d = 0. Let P be the
point on M which is closest to $Q = (x_1, y_1, z_1)$. Show that
\[
\mathbf{P} = [x_1, y_1, z_1] - \frac{ax_1 + by_1 + cz_1 + d}{a^2 + b^2 + c^2}\, [a, b, c].
\]

4. For each part of Problem 1, find the point on the given plane which is closest
to the given point.
5. If two numbers a and b are such that $a^2 + b^2 = 1$, then there exists an
angle α such that a = cos α, b = sin α. Show that the equation of any
straight line in the plane can be brought into the form
x cos α + y sin α + p = 0.
For an equation in this form, what is the geometric meaning of α and p?
[Hint: Use Theorem 4-10.] This is called the normal form of the equation of
a line.
6. Find the cosine of the angle between the following pairs of planes:
(a) 3x - 2y + z - 5 = 0, 2x + 3y - z + 1 = 0
(b) 7x - z + 1 = 0, x + y - 1 = 0
(c) y + z = 0, x - y = 0
(d) x - y + z + 1 = 0, 2x - 2y + 2z - 3 = 0
(e) 3x + 5y - 2z - 5 = 0, 2x + 2y + 8z + 7 = 0
7. What does formula (4-13) become if |B| = 1? For an arbitrary B, set
e = B/|B| and state (4-13) and (4-14) in terms of e.
8. What is the equation of the sphere with center (3, —9, —15) which is tangent
to the plane
4x - 7y - 4z + 27 = 0?
[Hint: What is the radius of the required sphere?]
9. Find the distance from the point (3, 5, 3) to the plane passing through the
three points (2, —5, —1), (—3, 1, 1), and (0, 2, 9) by first finding the
equation of the plane.
10. Let A, B, C, and Q be four noncollinear points in space and let AB = R,
AC = S, and AQ = T. Prove that the distance from Q to the plane passing
through A, B, and C is
\[
\frac{|\mathbf{T} \cdot \mathbf{R} \times \mathbf{S}|}{|\mathbf{R} \times \mathbf{S}|}.
\]

11. Let $A_1$ and $A_2$ be two distinct points and let $\overrightarrow{A_1 A_2} = \mathbf{B}$. Let $M_1$ and $M_2$
be the planes through $A_1$ and $A_2$ respectively, each orthogonal to B. Let
Q be any point in $M_2$. Prove that the distance from Q to $M_1$ is |B|.
Remark: This proves that two distinct planes orthogonal to the same vector
have no points in common (i.e. they are parallel).

4-4 THE STRAIGHT LINE

We now turn to a discussion of straight lines. Definition 3-8 can be
rewritten in the following way:

Definition 4-7. Given a point $P_0$ and a nonzero vector A, the straight
line determined by this point and this vector is
\[
L = \{X \mid \overrightarrow{P_0 X} = t\mathbf{A},\ t \text{ any real number}\}.
\]

If we use the representation of a point X as a vector X, this definition
can be written in the form
\[
\mathbf{X} = \mathbf{P}_0 + t\mathbf{A};
\tag{4-17}
\]
the same definition in terms of the components would give
\[
\begin{aligned}
x &= x_0 + ta, \\
y &= y_0 + tb, \\
z &= z_0 + tc,
\end{aligned}
\tag{4-18}
\]
where $P_0 = (x_0, y_0, z_0)$ and A = [a, b, c]. These three forms are
completely equivalent, and any one of them can be used as the situation
requires.
What is defined here is a set of points constituting the straight line.
However, there is a natural direction associated with this line, namely the
direction on the line induced by the parameter t. The line can actually be
thought of as a coordinate line with t as the coordinate. We will use the
notion of a direction on this line, induced by the parameter t, without
further comment whenever needed.

Any of the three forms given above for the determination of the
coordinates of a point on the line is called the parametric form of the equation
of a line. Any desired information about the line can be found from its
parametric equation. In many textbooks on analytic geometry, the
standard form of a line is given by the symmetric equations. These are of
limited utility, however, since for many lines special conventions must
be introduced to give the symmetric equations a meaning. (A discussion
of the symmetric equations will not be given here, but will be found in the
problems at the end of this section.) For short, we will call (4-17) an
equation of the line, and (4-18) equations of the line.
What are equations of the line through $P_0 = (1, 1, 0)$ with the direction
A = [1, -2, 1]? From (4-17) we have in this case the equation
X = [x, y, z] = [1, 1, 0] + t[1, -2, 1],
or equivalently,
X = [1 + t, 1 - 2t, t].
This last form can also be thought of as
(x, y, z) = (1 + t, 1 - 2t, t),
which is a condensed method for writing the three equations of the form
(4-18).
Is the point (5, -7, 5) on this line? No, since the only possible way
of getting the first coordinate of the point on the line to be equal to 5 is
to put t = 4. This gives us the point (5, -7, 4) rather than the point
(5, -7, 5).
At what point does this line cross the plane y = 3? To have y = 3,
we see that we must have 1 - 2t = 3, or t = -1. This then gives the
point (0, 3, -1).
At what point does the line cross the plane whose equation is
5x + 6y + z + 1 = 0?
Substituting the coordinates of a general point into this equation, we
find that we must have
5(1 + t) + 6(1 - 2t) + t + 1 = 0,
or
12 - 6t = 0.
This is satisfied by t = 2, giving the point (3, -3, 2).
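
The three questions just answered can be sketched in Python. The helper names `point_at`, `on_line`, and `intersect_plane` are assumptions made for illustration, as is the use of `fractions.Fraction` to keep the parameter exact; integer inputs are assumed.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def point_at(p0, a, t):
    """The point X = P0 + tA of equation (4-17)."""
    return [p + t * ai for p, ai in zip(p0, a)]


def on_line(p0, a, x):
    """True if x = P0 + tA for some t.  The direction a must be nonzero."""
    k = next(i for i in range(3) if a[i] != 0)   # a coordinate that fixes t
    t = Fraction(x[k] - p0[k], a[k])
    return point_at(p0, a, t) == list(x)


def intersect_plane(p0, a, coeffs):
    """Parameter and point where the line meets ax + by + cz + d = 0,
    or None when the line is parallel to the plane."""
    n, d = coeffs[:3], coeffs[3]
    denom = dot(n, a)
    if denom == 0:
        return None
    t = Fraction(-(dot(n, p0) + d), denom)
    return t, point_at(p0, a, t)


P0, A = [1, 1, 0], [1, -2, 1]
print(on_line(P0, A, [5, -7, 5]))            # False
print(intersect_plane(P0, A, (0, 1, 0, -3)))  # t = -1, point (0, 3, -1)
print(intersect_plane(P0, A, (5, 6, 1, 1)))   # t = 2, point (3, -3, 2)
```

The plane y = 3 is written here as 0x + 1y + 0z - 3 = 0 so that one routine handles both intersection questions.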

The first question we raise is about the uniqueness of a straight line.
The definition is given in terms of a point on a line and a vector which
defines the direction of the line. We are interested in showing that the
line remains the same if we substitute another point on the line and another
vector collinear with the given vector.

Theorem 4-11. Let a point $P_1$ be on the line L determined by the point
$P_0$ and the vector A. Let B ≠ 0 be collinear with A. Then the line
determined by $P_1$ and B is identical to L.

Proof: We have $L = \{X \mid \mathbf{X} = \mathbf{P}_0 + t\mathbf{A}\}$. Set $M = \{X \mid \mathbf{X} = \mathbf{P}_1 + s\mathbf{B}\}$.
Here we use a different parameter, s, to avoid confusion in
discussing the points in these two sets. We are given that $P_1$ is on the line
L; hence there exists a $t_0$ such that
\[
\mathbf{P}_1 = \mathbf{P}_0 + t_0 \mathbf{A}.
\]
Likewise, since B is collinear with A, there exists a nonzero constant k
such that B = kA. (Why must k ≠ 0?)
Suppose X ∈ M. Then there is a real s such that
\[
\begin{aligned}
\mathbf{X} &= \mathbf{P}_1 + s\mathbf{B} \\
 &= (\mathbf{P}_0 + t_0 \mathbf{A}) + sk\mathbf{A} \\
 &= \mathbf{P}_0 + (t_0 + sk)\mathbf{A} \\
 &= \mathbf{P}_0 + t\mathbf{A},
\end{aligned}
\]
where we set $t = t_0 + sk$. This shows that if X ∈ M, then X ∈ L.
Conversely, if X ∈ L, then we have
\[
\begin{aligned}
\mathbf{X} &= \mathbf{P}_0 + t\mathbf{A} \\
 &= \mathbf{P}_0 + t_0 \mathbf{A} + (t - t_0)\mathbf{A} \\
 &= \mathbf{P}_1 + s\mathbf{B},
\end{aligned}
\]
where $s = (t - t_0)/k$. This then completes the proof and shows that
L = M.

A line is determined by a point and a vector, but there are also many
other conditions which determine a line. This gives rise to problems of
how to obtain the parametric equations of a line when it is determined by
conditions other than the standard ones.
As a first example, we observe that two points determine a line, and that
if we are given two points $P_1 = (x_1, y_1, z_1)$ and $P_2 = (x_2, y_2, z_2)$, we can
easily find the line determined by these two points. Either of the two
points will serve as a point on the line, and so we only need to find a vector
with the required direction. Clearly such a vector is $\overrightarrow{P_1 P_2} = \mathbf{P}_2 - \mathbf{P}_1$.
Therefore, a parametric equation of the line determined by the points
$P_1$ and $P_2$ is
\[
\mathbf{X} = \mathbf{P}_1 + t(\mathbf{P}_2 - \mathbf{P}_1),
\]
or
\[
\mathbf{X} = (1 - t)\mathbf{P}_1 + t\mathbf{P}_2.
\tag{4-19}
\]
This is the form we used to define directed line segments in an earlier
section and could have been used in this section.
As an example of this, let us find an equation of the line through the
points (1, 0, 2) and (3, 1, 5). Here, a vector giving the direction of the line
is [3, 1, 5] - [1, 0, 2] = [2, 1, 3], and so a vector parametric equation
for the line is
X = [1, 0, 2] + t[2, 1, 3].

A second way in which a line can be determined is by a point and the
requirement that it be orthogonal to two given vectors. If the two given
vectors are A and B, then clearly the vector A X B is the desired vector
determining the direction of the line. Thus, the line through a point $P_0$
orthogonal to the vectors A and B is given by the equation
\[
\mathbf{X} = \mathbf{P}_0 + t(\mathbf{A} \times \mathbf{B}).
\tag{4-20}
\]

If we are given two nonparallel planes, they determine a line, their line
of intersection. Finding a vector giving the direction of this line is easy.
It is merely the cross product of the two orthogonal vectors of the planes.
(Why?) The only problem is to find a point on the line. This can be done
by taking the equations of the two planes, eliminating one of the unknowns
and assigning any convenient value to one of the remaining unknowns.
This procedure can be illustrated best by an example. Suppose we are asked
for an equation of the line of intersection of the two planes with equations

3x - y + 2z - 7 = 0,
x + y - 5z + 5 = 0.

The orthogonal vectors are [3, -1, 2] and [1, 1, -5]. A direction vector
for the line is therefore [3, -1, 2] X [1, 1, -5] = [3, 17, 4]. Eliminating
y between the equations gives

4x - 3z - 2 = 0.

Setting z = 2 in this equation gives 4x = 8, or x = 2. Putting these
values back into the second equation yields 2 + y - 10 + 5 = 0, or
y = 3. So a point on the line is (2, 3, 2). Hence a parametric equation of
the desired line is
[x, y, z] = [2, 3, 2] + t[3, 17, 4].
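
A Python sketch of this procedure follows. It is a variation on the hand computation rather than a transcription of it: instead of eliminating y first, it fixes one coordinate at a chosen value and solves the remaining two equations by Cramer's rule. The function name `intersection_line` and the parameters `fix_index` and `value` are hypothetical, and integer coefficients are assumed.

```python
from fractions import Fraction


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def intersection_line(plane1, plane2, fix_index=2, value=0):
    """Return (point, direction) for the line common to two planes.

    Each plane is (a, b, c, d), meaning ax + by + cz + d = 0.  The coordinate
    numbered fix_index is set to `value`; the other two are then solved for.
    """
    n1, d1 = plane1[:3], plane1[3]
    n2, d2 = plane2[:3], plane2[3]
    direction = cross(n1, n2)
    if direction[fix_index] == 0:
        raise ValueError("cannot fix this coordinate (or planes are parallel)")
    i, j = [k for k in range(3) if k != fix_index]
    r1 = -d1 - n1[fix_index] * value          # right-hand sides of the 2x2 system
    r2 = -d2 - n2[fix_index] * value
    det = n1[i] * n2[j] - n1[j] * n2[i]       # equals +/- direction[fix_index]
    point = [None, None, None]
    point[fix_index] = Fraction(value)
    point[i] = Fraction(r1 * n2[j] - n1[j] * r2, det)
    point[j] = Fraction(n1[i] * r2 - r1 * n2[i], det)
    return point, direction


point, direction = intersection_line((3, -1, 2, -7), (1, 1, -5, 5),
                                     fix_index=2, value=2)
print([int(c) for c in point], direction)   # [2, 3, 2] [3, 17, 4]
```

Choosing a different `value` or `fix_index` gives another point of the same line, in keeping with Theorem 4-11.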

Other conditions could be given which would serve to determine a line,
but those listed above are the main ones which appear in practice. Other
sets of conditions appear only rarely.
When two equations represent the same plane, it is easy to recognize
the fact since this can happen only when one equation is a nonzero multiple
of the other. The situation is not so simple in the case of the parametric
equations of lines. For example, two of the lines
L1: X = [1, -5, 3] + t[6, 8, -4],
L2: X = [6, 0, 5] + t[-3, 4, 2],
L3: X = [7, -3, -1] + t[-3, -4, 2],
L4: X = [4, -7, 1] + t[9, 12, -6],
are actually identical. But which two?
In order to be identical, it is first necessary that two lines have collinear
direction vectors. On this basis we can eliminate line L2 from consideration.
It is not parallel to any of the others. The remaining three lines are all
parallel, however. Two will be identical if and only if they have a point
in common. We check this by seeing whether the "initial point" of one
line is on another.
We first check whether (7, -3, -1) is on L1. To have x = 7 in L1,
we must have t = 1. This gives the point (7, 3, -1), and we conclude that
this point is not on the line.
Next we see whether (4, -7, 1) is on L1. To have x = 4 we must have
t = 1/2, giving the point (4, -1, 1) on L1. Therefore L1 and L4 cannot be
identical.
Finally, we verify that the remaining two lines, L3 and L4, are identical
by noting that t = 1 in L3 gives the point (4, -7, 1), which is on the line
L4.
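
The same comparison can be automated. The sketch below does not solve for t as the text does; it uses Theorem 4-7 instead, testing collinearity through a vanishing cross product (for the direction vectors, and for the vector joining the two initial points). The names `collinear` and `same_line` are illustrative.

```python
from itertools import combinations


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def collinear(u, v):
    """Theorem 4-7: u and v are collinear exactly when u x v = 0."""
    return cross(u, v) == [0, 0, 0]


def same_line(p0, a, p1, b):
    """True if X = p0 + t a and X = p1 + s b describe the same line."""
    joining = [q - p for p, q in zip(p0, p1)]   # vector from p0 to p1
    return collinear(a, b) and collinear(joining, a)


lines = {
    "L1": ([1, -5, 3], [6, 8, -4]),
    "L2": ([6, 0, 5], [-3, 4, 2]),
    "L3": ([7, -3, -1], [-3, -4, 2]),
    "L4": ([4, -7, 1], [9, 12, -6]),
}
identical = [(m, n) for m, n in combinations(lines, 2)
             if same_line(*lines[m], *lines[n])]
print(identical)   # [('L3', 'L4')]
```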
PROBLEMS

1. Give parametric equations of the lines joining the pairs of points listed below.
(a) (1, 2, 7) and (-3, 1, 1) (b) (3, 1, 0) and (5, -2, 7)
(c) (11, 12, 13) and (2, 1, —1) (d) (1, 1, 1) and (3, 1, —1)
(e) (0, 1,2) and (0, 1,3) (f) (1,-1, 1) and (0,-1, 0)
2. Find parametric equations of the lines of intersection of the pairs of planes
listed.
(a) 3x - 2y + z - 5 = 0, 2x + 3y - z + 1 = 0
(b) 7x - z + 1 = 0, x + y - 1 = 0
(c) y + z = 0, x - y = 0
(d) x - y + z + 1 = 0, 2x - 2y + z - 3 = 0
(e) 3x + 5y - 2z - 5 = 0, 2x + 2y + 8z + 7 = 0
(f) x + 1 = 0, x + y + z = 0

3. For each (or a selected number) of the lines in Problem 1 (and/or 2), find
the value of the parameter and the coordinates of the points at which the
line cuts the three coordinate planes (that is, the planes with equations
x = 0, y = 0, and z = 0).
4. What is an equation of the line joining the points $(x_1, y_1, 0)$ and $(x_2, 0, z_2)$?
At what point does this line cut the plane x = 0?
5. The line of intersection of the plane through $P_0$ orthogonal to A and the
plane through $P_0$ orthogonal to B is given by $\mathbf{X} = \mathbf{P}_0 + t(\mathbf{A} \times \mathbf{B})$ according
to the result of the text. (Assume A and B are not collinear.) Prove that
every point on this line is common to both planes.
6. For the line $\mathbf{X} = \mathbf{P}_0 + t\mathbf{A}$, show that the parameter t is, in general,
determined by any of the three coordinates of a point on the line in the form:
\[
t = \frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}.
\]
The three expressions on the right-hand side can be set equal in pairs, giving
the equations of three planes. Identify these planes (a sketch will help).
The form
\[
\frac{x - x_0}{a} = \frac{y - y_0}{b} = \frac{z - z_0}{c}
\]
is called the symmetric form of the equations of the line. Under what
conditions can this form be considered valid?
7. For each of the following, find the value of the parameter and the coordinates
of the point at which the given line cuts the given plane.
(a) X = [1, 3, -2] + t[1, -2, 3]; 3x + 2y + z - 1 = 0
(b) X = [0, 1, -1] + t[1, 5, -2]; lx - y + z + 2 = 0
(c) X = [5, 8, 1] + t[1, 0, 8]; 2x - 2y - z - 5 = 0
(d) X = [3, 0, 5] + t[1, 1, -1]; bx + z = 0
8. At what points does the line
X = [1, 3, -12] + t[1, 0, 5]
cut the sphere
\[
(x - 6)^2 + (y + 1)^2 + z^2 = 81?
\]

9. Find equations for the line through (1, —1,2) which is orthogonal to the
plane 3x — 2y + z — 5 = 0.
10. Find equations for the line through (—1, 5, 0), orthogonal to the line
X = [1, 1, 2] + t[-1, 3, 0], and parallel to the plane x + y - 4z + 2 = 0.
11. If the given pair of lines intersect, find the point of intersection.
(a) X = [1, 5, 4] + t[2, 1, -7]; X = [14, -2, 5] + t[3, -3, 5]
(b) X = [1, 2, 0] + t[5, 0, 7]; X = [0, 0, 8] + t[-8, 4, 2]
(c) X = [3, 4, 5] + t[1, -1, -2]; X = [9, -8, 3] + t[-1, 4, -3]
(d) X = [5, 1, -5] + t[2, 1, 5]; X = [1, -1, -14] + t[1, 3, -1]
12. For parts (a) and (b) of Problem 11, find equations of the lines orthogonal
to the given pair of lines and passing through the point of intersection.
13. The angle between a line and a plane is defined to be (π/2) - φ, where φ
is the angle between the line and the vector orthogonal to the plane (choosing
the angle between 0 and π/2). Find the cosines of the angles between the
line and the plane in each of the four parts of Problem 7.
5
Vectors as
Coordinate Systems
5-1 SOME VECTOR IDENTITIES

Let us consider the triple cross product A X (B X C). As was
commented earlier, the parentheses are necessary here since the cross product
is not associative. This combination clearly represents a vector orthogonal
to A and to B X C. A vector orthogonal to B X C must lie in the plane
(through the origin) determined by B and C and, as will be shown in the
next section, must therefore be a linear combination of B and C. If we
write A X (B X C) = uB + vC, then
A • [uB + vC] = u(A • B) + v(A • C)
= A • [A X (B X C)]
= 0.
This last follows since A • [A X (B X C)] is the scalar triple product of
three vectors, two of which are identical. As a consequence of this
calculation, we can conclude that u = k(A • C) and v = -k(A • B) for some
scalar k. An actual example shows that k = 1, but an attempt to prove
that k is a constant independent of A, B, and C would be very difficult.
The required argument is quite deep and involves the concept of
continuity. It would be very difficult to give a rigorous proof along these lines
at this stage.
The actual identity which we wish to prove is stated in the following
theorem.

Theorem 5-1. For any vectors A, B, and C,

A X (B X C) = (A • C)B - (A • B)C. (5-1)

We will use this in obtaining all the other identities of this section and so
would like to have a complete proof of it. A direct proof by calculating
each side in terms of components is possible but extremely tedious.
Instead, we will offer two proofs, which, while still long, are shorter than
direct computation.

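First proof: Let $\mathbf{A} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + a_3\mathbf{e}_3$, $\mathbf{B} = b_1\mathbf{e}_1 + b_2\mathbf{e}_2 + b_3\mathbf{e}_3$, and
$\mathbf{C} = c_1\mathbf{e}_1 + c_2\mathbf{e}_2 + c_3\mathbf{e}_3$. Then
\[
\mathbf{B} \times \mathbf{C} = (b_2 c_3 - b_3 c_2)\mathbf{e}_1 + (b_3 c_1 - b_1 c_3)\mathbf{e}_2 + (b_1 c_2 - b_2 c_1)\mathbf{e}_3,
\]
and
\[
\mathbf{A} \times (\mathbf{B} \times \mathbf{C})
 = a_1\mathbf{e}_1 \times (\mathbf{B} \times \mathbf{C})
 + a_2\mathbf{e}_2 \times (\mathbf{B} \times \mathbf{C})
 + a_3\mathbf{e}_3 \times (\mathbf{B} \times \mathbf{C})
\]
by the linearity of the cross product. Let us investigate the first term of
this expansion. Using $\mathbf{e}_1 \times \mathbf{e}_1 = \mathbf{0}$, $\mathbf{e}_1 \times \mathbf{e}_2 = \mathbf{e}_3$, $\mathbf{e}_1 \times \mathbf{e}_3 = -\mathbf{e}_2$, and
the linearity of the cross product, we have
\[
\begin{aligned}
a_1\mathbf{e}_1 \times (\mathbf{B} \times \mathbf{C})
 &= a_1(b_3 c_1 - b_1 c_3)\mathbf{e}_3 - a_1(b_1 c_2 - b_2 c_1)\mathbf{e}_2 \\
 &= a_1 c_1 b_2 \mathbf{e}_2 + a_1 c_1 b_3 \mathbf{e}_3 - a_1 b_1 c_2 \mathbf{e}_2 - a_1 b_1 c_3 \mathbf{e}_3 \\
 &= a_1 c_1 b_1 \mathbf{e}_1 + a_1 c_1 b_2 \mathbf{e}_2 + a_1 c_1 b_3 \mathbf{e}_3
    - a_1 b_1 c_1 \mathbf{e}_1 - a_1 b_1 c_2 \mathbf{e}_2 - a_1 b_1 c_3 \mathbf{e}_3 \\
 &= a_1 c_1 \mathbf{B} - a_1 b_1 \mathbf{C}.
\end{aligned}
\]
Here, in the next to last step, we added and subtracted the term $a_1 b_1 c_1 \mathbf{e}_1$.
In the same way we could show that
\[
a_2\mathbf{e}_2 \times (\mathbf{B} \times \mathbf{C}) = a_2 c_2 \mathbf{B} - a_2 b_2 \mathbf{C},
\]
\[
a_3\mathbf{e}_3 \times (\mathbf{B} \times \mathbf{C}) = a_3 c_3 \mathbf{B} - a_3 b_3 \mathbf{C}.
\]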

Adding these three results together would give (5-1).
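
Identity (5-1) can also be spot-checked numerically, which is a useful guard against sign errors when carrying out the component calculation by hand. This is only a check on sample vectors, not a proof; the names `lhs_5_1` and `rhs_5_1` are illustrative.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def lhs_5_1(a, b, c):
    """Left-hand side of (5-1): A x (B x C)."""
    return cross(a, cross(b, c))


def rhs_5_1(a, b, c):
    """Right-hand side of (5-1): (A . C)B - (A . B)C."""
    ac, ab = dot(a, c), dot(a, b)
    return [ac * bi - ab * ci for bi, ci in zip(b, c)]


A, B, C = [1, 2, 5], [-1, -5, 2], [1, 3, 2]   # arbitrary sample vectors
print(lhs_5_1(A, B, C))   # [-16, -82, 36]
print(rhs_5_1(A, B, C))   # [-16, -82, 36]

# The cross product is not associative: (A x B) x C generally differs.
print(cross(cross(A, B), C))
```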

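Second proof: Let us first look at the unit vectors. Since the cross product
of two collinear vectors is zero,
\[
\mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k) = \mathbf{0} \quad \text{if } j = k.
\]
On the other hand, if $j \ne k$, then $\mathbf{e}_j \times \mathbf{e}_k$ is collinear with the third unit
vector, and hence
\[
\mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k) = \mathbf{0} \quad \text{if } j \ne k \text{ and } i \ne j \text{ or } k.
\]
With the help of the right-hand rule it is easy to verify that
\[
\mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k) =
\begin{cases}
-\mathbf{e}_k & \text{if } i = j \text{ and } j \ne k, \\
\phantom{-}\mathbf{e}_j & \text{if } i = k \text{ and } j \ne k.
\end{cases}
\]
Looking at these four cases, we verify that for any i, j, and k,
\[
\mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k) = (\mathbf{e}_i \cdot \mathbf{e}_k)\mathbf{e}_j - (\mathbf{e}_i \cdot \mathbf{e}_j)\mathbf{e}_k,
\]
that is, (5-1) holds if A, B, and C are the unit vectors.

To save writing, we will make use of the summation notation and write
\[
\mathbf{A} = \sum_{i=1}^{3} a_i \mathbf{e}_i, \qquad
\mathbf{B} = \sum_{j=1}^{3} b_j \mathbf{e}_j, \qquad
\mathbf{C} = \sum_{k=1}^{3} c_k \mathbf{e}_k.
\]
Then from the linearity of the cross product, we have
\[
\mathbf{B} \times \mathbf{C} = \sum_{j=1}^{3} \sum_{k=1}^{3} b_j c_k\, \mathbf{e}_j \times \mathbf{e}_k
\]
and
\[
\mathbf{A} \times (\mathbf{B} \times \mathbf{C})
 = \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} a_i b_j c_k\, \mathbf{e}_i \times (\mathbf{e}_j \times \mathbf{e}_k).
\]
The multiple summation symbols indicate that we are to take the sums
over all three indices, giving a total of 27 terms. But then
\[
\mathbf{A} \times (\mathbf{B} \times \mathbf{C})
 = \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} a_i b_j c_k (\mathbf{e}_i \cdot \mathbf{e}_k)\mathbf{e}_j
 - \sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{3} a_i b_j c_k (\mathbf{e}_i \cdot \mathbf{e}_j)\mathbf{e}_k.
\]
In the first of these summations $\mathbf{e}_i \cdot \mathbf{e}_k = 0$ except when i = k; hence
for any fixed j, the nine terms with that value of j reduce to three, and are
in fact
\[
(\mathbf{A} \cdot \mathbf{C})\, b_j \mathbf{e}_j.
\]
Summing these over j gives (A • C)B. Similarly, the second summation
is identified as -(A • B)C, and the proof of (5-1) has again been obtained.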
From (5-1) and the anticommutative property of the cross product, we
can obtain the similar identity:

Theorem 5-2. For any vectors A, B, and C,

(A X B) X C = (A • C)B - (B • C)A. (5-2)

Also, from (5-1) and (5-2) we can prove the following theorem.

Theorem 5-3. For any vectors A, B, C, and D,


(A X B) X (C X D) = (A • B X D)C - (A • B X C)D, (5-3)
and
(A X B) X (C X D) = (A • C X D)B - (B • C X D)A. (5-4)

Proof: We prove (5-3) by considering (A X B) on the left-hand side as
a single vector and using (5-1) on this combination. The properties of the
scalar triple product then suffice to complete the proof.
The proof of (5-4) is found similarly by using (5-2), considering (C X D)
as a single vector.
The right-hand sides of (5-3) and (5-4) represent the same vector.
Setting these two expressions equal and doing some rearranging, we find

Theorem 5-4. For any vectors A, B, C, and D,

(B • C X D)A - (A • C X D)B + (A • B X D)C - (A • B X C)D = 0. (5-5)

Note that the scalar triple products which appear in this result contain
the three vectors not being multiplied and that the three vectors appear
in their natural order in every case.
Next, we prove an extremely important pair of relations:

Theorem 5-5. For any vectors A, B, C, and D,
\[
(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D})
 = (\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{D}) - (\mathbf{B} \cdot \mathbf{C})(\mathbf{A} \cdot \mathbf{D}),
\tag{5-6}
\]
and
\[
|\mathbf{A} \times \mathbf{B}|^2 = |\mathbf{A}|^2 |\mathbf{B}|^2 - (\mathbf{A} \cdot \mathbf{B})^2.
\tag{5-7}
\]

Formula (5-7) is known as Lagrange's Identity and (5-6) is usually called
the Extended Lagrange Identity.

Proof: We prove (5-6) by considering the left-hand member as a scalar
triple product and using (5-1) in the following manner:
(A X B) • (C X D) = A • [B X (C X D)]
= A • [(B • D)C - (B • C)D]
= (A • C)(B • D) - (B • C)(A • D).

Formula (5-7) is a special case of (5-6), obtained by setting C = A
and D = B.
Lagrange's identity, Eq. (5-7), has a special geometric significance. If
we let θ be the angle between the vectors A and B, we have A • B =
|A||B| cos θ.

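If we put this into (5-7), we have
\[
\begin{aligned}
|\mathbf{A} \times \mathbf{B}|^2
 &= |\mathbf{A}|^2 |\mathbf{B}|^2 - |\mathbf{A}|^2 |\mathbf{B}|^2 \cos^2\theta \\
 &= |\mathbf{A}|^2 |\mathbf{B}|^2 [1 - \cos^2\theta] \\
 &= |\mathbf{A}|^2 |\mathbf{B}|^2 \sin^2\theta.
\end{aligned}
\]
Taking the square root of this expression shows that
\[
|\mathbf{A} \times \mathbf{B}| = |\mathbf{A}|\, |\mathbf{B}| \sin\theta,
\tag{5-8}
\]
a formula which is similar to the relation found for the dot product. We
of course take sin θ as nonnegative in this formula. This is equivalent to
using only angles between 0 and π.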
The importance of formula (5-8) is indicated by the fact that many
texts on vector analysis have used this equation to define the cross product.
That is, A X B would be defined as a vector orthogonal to both A and B,
with a direction as given by the right-hand rule, and with a magnitude
given by (5-8). Such a definition is, however, very difficult to work with.
Proving such a fundamental fact as the linearity of the cross product would
already cause trouble.

Figure 5-2

Consider now the parallelogram determined by the pair of vectors A and


B (Fig. 5-1). If B is the angle between the vectors A and B, then taking
|B| as the length of the base of the parallelogram, we find that the length
of the altitude is |A| sin B. Hence the area of this parallelogram is |A| |B|
sin B = |A X B|. This can also be interpreted as saying that the vector
A X B has a direction orthogonal to the plane of A and B and a magnitude
equal to the area of the plane parallelogram with sides A and B.
If we now add a third vector C, we see that

(A X B) • C = |A X B||C| cos φ,

where φ is the angle between A X B and C. But |C| cos φ is exactly the
length of the projection of C onto the line with direction A X B, and

hence is the altitude of the parallelepiped with sides A, B, and C (Fig. 5-2).
That is, we see that the scalar triple product A X B • C has a magnitude
equal to the volume of this parallelepiped. The sign of the scalar triple
product is positive or negative as the triple of vectors forms a right-handed
or left-handed set. That is, if the angle between A X B and C is
less than π/2, then A X B • C is positive.
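
The area and volume interpretations above can be checked numerically. The following is only a minimal sketch: the helper names (dot, cross, parallelogram_area, signed_volume) and the sample vectors are our own illustrative choices and do not come from the text.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors given as sequences of numbers."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors (right-handed convention)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def parallelogram_area(a, b):
    """Area of the parallelogram with sides a and b, that is |a X b|."""
    n = cross(a, b)
    return math.sqrt(dot(n, n))

def signed_volume(a, b, c):
    """Scalar triple product (a X b) . c; its absolute value is the volume."""
    return dot(cross(a, b), c)

# Illustrative vectors (an assumption for this sketch, not from the text).
A, B, C = [1, 0, 0], [1, 2, 0], [1, 1, 3]

area = parallelogram_area(A, B)
volume = signed_volume(A, B, C)
# Right-hand side of Lagrange's identity (5-7); it should equal area**2.
lagrange_rhs = dot(A, A) * dot(B, B) - dot(A, B) ** 2

print(area, volume, lagrange_rhs)
```

Interchanging A and B in signed_volume reverses the sign of the result, which corresponds to passing from a right-handed to a left-handed triple.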

PROBLEMS

1. Prove (5-2) from (5-1).


2. Prove (5-3) from (5-1).
3. Prove (5-4) from (5-2).
4. Let A = [1, 2, 5], B = [-1, -5, 2], C = [1, 3, 2], D = [1, 1, -1].
(a) Calculate directly the right-hand and left-hand members of (5-1).
Which side is easier to calculate?
(b) Do the same for (5-3).
(c) Do the same for (5-6).
5. Show that
|A X B|^2 + (A • B)^2 = |A|^2|B|^2,
and prove:
(a) A X B = 0 if and only if |A • B| = |A| • |B|,
(b) A • B = 0 if and only if |A X B| = |A| • |B|.
6. Calculate the area of the parallelogram determined by:
(a) A and B in Problem 4
(b) A and C in Problem 4
(c) B and C in Problem 4
7. Calculate the volume of the parallelepiped determined by A, B, and C of
Problem 4.
8. Show that if A, B, and C are three points in space, then the area of the
triangle with these three points as vertices is
(1/2)|AB X AC|,
where AB denotes the vector from A to B.
9. Show that if A, B, C, and D are four points in space, then the volume of
the tetrahedron with these four points as vertices is
(1/6)|AB • AC X AD|.
10. Let A = (1, 3, 7), B = (2, 5, 1), C = (1, 1, 5), and D = (-2, 3, 2).
(a) Find the area of the triangle ABC.
(b) Find the area of the triangle ABD.
(c) Find the volume of the tetrahedron ABCD.

11. Let (a1, a2) and (b1, b2) be two points of the plane. Show that the area of
the parallelogram which has vertices at the origin and these two points is
the absolute value of
| a1 a2 |
| b1 b2 |
12. Find a formula for the volume of the parallelepiped determined by the four
vertices, O (the origin), A, B, and C, in terms of the coordinates of A, B,
and C. (See formula 4-12.)
13. From the results of Problems 11 and 12, can you give a condition on the
coordinates of the points so that
(a) two points in the plane are on the same line through the origin,
or
(b) three points in space are on the same plane through the origin?
14. Let A and B be two nonzero vectors. Discuss the problem of finding a
vector X such that
A X X = B.
Find all solutions of this equation if any exist. [Hint: What is the direction
of the cross product of two vectors? What if B is not orthogonal to A? If
B is orthogonal to A, X must be orthogonal to B. Try writing X = U X B.
What conditions must U satisfy?]
15. Show that there are five different ways in which parentheses can be introduced
into the product A X B X A X B to make it well defined. Using the
formulas of this section, simplify each product and show that four of them
are always the same.
16. Using the formulas of this section, simplify each of the following expressions.
(a) A X (A X B)
(b) A X (A X (A X B))
(c) A X (A X (A X (A X B)))
(d) A X (A X (A X (A X (A X B))))
(e) A X (A X (A X (A X (A X (A X B)))))
(f) What would the general formula be?
17. Using (5-8), prove the law of sines for a triangle by vector methods.
18. Find a condition on the vectors A, B, C, and D which will guarantee that the
plane through the origin, A, and B will be orthogonal to the plane through
the origin, C, and D.
19. Prove each of the following:
(a) (A X B) • (B X C) X (C X A) = (A • B X C)^2
(b) (A X B) X (A X C) = (A • B X C)A
(c) (((A X B) X A) X (A X B)) • ((A X B) X A) = 0
(d) |A X (A X B)|^2 = |A|^4|B|^2 - |A|^2(A • B)^2
(e) A X (B X C) + B X (C X A) + C X (A X B) = 0
(f) (A X B) • (C X D) + (A X D) • (B X C) = (A X C) • (B X D)

20. Let A1, A2, and A3 be three given vectors. Define B1 = A2 X A3, B2 =
A3 X A1, B3 = A1 X A2. Prove that A_i • B_j = 0 for all i ≠ j. Can you
interpret this geometrically?

5-2 COLLINEAR AND COPLANAR VECTORS

We said that two vectors are collinear if and only if one is a scalar
multiple of the other. We have used several intuitive properties of collinearity
in our discussions of previous sections. In this section we would like
to organize and prove these properties more carefully. The first thing we
are interested in is the connection between the cross and dot products and
collinearity. This connection is given by the following theorem:

Theorem 5-6. Two vectors A and B are collinear if and only if either
A X B = 0 or |A • B| = |A||B|.

Proof: Formula (5-7) of the last section shows that the two conditions
of the theorem are equivalent (see also Problem 5 of that section). Therefore,
we see that this result has already been proved in Theorem 4-7.
We will, however, offer another proof here, which is somewhat simpler
than the proof given in Theorem 4-7.
Half of the theorem is immediately obvious, for if A and B are collinear,
then there is some t such that A = tB, and

A X B = tB X B = t0 = 0.

The other half of the proof is similar to the proof of the Cauchy-Schwarz
inequality. Suppose that |A • B| = |A||B|. Further, let us suppose that
A ≠ 0 (if A = 0, then A and B are trivially collinear). Set t = ±|B|/|A|,
choosing the same sign as (A • B) so that

t(A • B) = |t||A • B| = (|B|/|A|)|A||B| = |B|^2.

Then, just as in the proof of the Cauchy-Schwarz inequality, we find

|tA - B|^2 = t^2|A|^2 - 2t(A • B) + |B|^2
= |B|^2 - 2|B|^2 + |B|^2 = 0.

Hence we can conclude that B = tA, thus proving the theorem.


As a consequence of this result, we can prove the following theorem,
which states a fact that we have already been using in our informal discussion.
It is, in effect, the converse of the statement that A X B is orthogonal
to both A and B.

Theorem 5-7. If C is orthogonal to both A and B, then C is collinear


with A X B.

Proof: If C is orthogonal to both A and B, then C • A = 0 and C • B = 0.


But then, using (5-1), we can compute

C X (A X B) = (C • B)A - (C • A)B
= 0,

and hence we can conclude from Theorem 5-6 that C is collinear with
A X B. This result should be compared with Theorems 4-2 and 4-3.

We now turn to a consideration of coplanar vectors. We shall call a


collection of vectors coplanar if they are all parallel to the same plane.
If we think of a set of vectors as being line segments drawn from the
origin, then they are coplanar only if there exists a plane through the
origin containing all of them. From our definition of a plane, this will
occur only if there is some vector (the orthogonal vector to the plane)
orthogonal to all vectors in the plane. Therefore, we use this as our formal
definition.

Definition 5-1. A collection of vectors is called coplanar if and only if


there exists a nonzero vector N orthogonal to all vectors in the collection.

Directly from this definition we can prove

Theorem 5-8. Two vectors are always coplanar. Three vectors, A, B,


and C, are coplanar if and only if A • B X C = 0.

Proof: Let A and B be two given vectors. Suppose first that they are
not collinear. Then A X B ≠ 0 (from Theorem 5-6), and the vector
N = A X B will be the required common orthogonal to A and B.
On the other hand, if A and B are collinear and if either is nonzero, then
any nonzero vector orthogonal to it will satisfy the requirements. The
existence of such a vector is easy to show. If both A and B are zero, then
any nonzero vector is orthogonal to both.
The remaining part of the theorem requires two proofs, since it is an
"if and only if" statement. For the first proof, let us suppose that A, B,
and C are coplanar—that is, that there exists a nonzero vector N orthogonal
to all three. We must then show that A • B X C = 0. However, since
A • N = 0 and B • N = 0, we find from Theorem 5-7 that N is collinear

with A X B. Since N ≠ 0, this means that there must exist a scalar t
such that A X B = tN. But then we calculate
A • B X C = (A X B) • C
= tN • C
= 0,
because N was also orthogonal to C.
To prove the last part of the theorem, let us suppose that A • B X C =
0. If A X B ≠ 0, we can set N = A X B. Then N • A = N • B = 0,
and N • C = A X B • C = 0, hence A, B, and C are coplanar. On the
other hand, if A X B = 0, then A and B are collinear. If A = B = 0,
then using the first part of the theorem, we see that A, B, and C are coplanar.
If one of these, say A, is not zero, then B = tA, and again from the first
part of the theorem, A and C are coplanar. That is, there exists a nonzero
N such that N • A = N • C = 0. This also implies N • B = t(N • A) = 0,
and hence we have proved the theorem.
The last part of this proof is somewhat involved, because there are
many separate cases which have to be considered. The reader may find it
useful to diagram the proof, seeing how the various cases arise and how
they are disposed of. The basic ideas used in the "main" cases are really
the important ones.
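
Theorems 5-6 and 5-8 translate directly into computational tests. The sketch below is illustrative only: the function names are our own, and the exact comparison with zero assumes integer (or otherwise exact) components. With floating-point data one would compare against a small tolerance instead.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def are_collinear(a, b):
    """Theorem 5-6: a and b are collinear exactly when a X b = 0."""
    return all(x == 0 for x in cross(a, b))

def are_coplanar(a, b, c):
    """Theorem 5-8: a, b, c are coplanar exactly when a . (b X c) = 0."""
    return dot(a, cross(b, c)) == 0

# Illustrative data; the second triple is the one used in the worked
# example that follows formula (5-13) in this section.
print(are_collinear([1, 2, 3], [2, 4, 6]))
print(are_coplanar([1, -3, 2], [1, -1, -1], [-3, -7, 18]))
print(are_coplanar([1, 0, 0], [0, 1, 0], [0, 0, 1]))
```

Note that the zero-vector cases, which required separate treatment in the proof, need no special handling here, since both products vanish automatically.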

Theorem 5-9. Let A and B be two noncollinear vectors. Then a vector C


is coplanar with A and B if and only if there exist scalars s and t such
that
C = sA + tB. (5-9)

Proof: Suppose that C = sA + tB. Then

A • B X C = (A X B) • C
= (A X B) • [sA + tB]
= s(A X B) • A + t(A X B) • B.

But A X B • A = A X B • B = 0, and hence we conclude from Theorem
5-8 that A, B, and C are coplanar.
The other half of the theorem is more difficult. Suppose that A, B, and
C are coplanar. Then A • B X C = 0 (from Theorem 5-8). From the
hypothesis that A and B are not collinear we have A X B ≠ 0. Let
D = A X B. We now make use of formula (5-5) of the last section. This
gives us
(B • C X D)A - (A • C X D)B + (A • B X D)C - (A • B X C)D = 0. (5-10)

As we have seen, the coefficient of D in this expression is 0. The coefficient


of C is
(A • B X D) = (A X B • D)
= (A X B) • (A X B)
= |A X B|^2,
which is nonzero by hypothesis. We can therefore solve (5-10) for C.
Doing so and replacing D by A X B gives

C = - [(B X C) • (A X B) / |A X B|^2] A + [(A X C) • (A X B) / |A X B|^2] B. (5-11)

This is the expression of the form (5-9) needed to prove the theorem.

The last result can be extended to give us the following important


theorem:

Theorem 5-10. If the vectors A, B, and C are not coplanar, then
every vector D can be written as a linear combination of A, B, and
C. That is, for any D there exist scalars s, t, and u such that

D = sA + tB + uC. (5-12)

Figure 5-3

Proof: For any four vectors, formula (5-10) holds. By the hypothesis
that A, B, and C are not coplanar and by Theorem 5-8, A • B X C ≠ 0.
Hence we can solve (5-10) for D, giving

D = [(B • C X D)/(A • B X C)] A - [(A • C X D)/(A • B X C)] B + [(A • B X D)/(A • B X C)] C. (5-13)

The last theorem tells us that if we are given any three noncoplanar
vectors, then every vector can be expressed as a linear combination of
these. The geometric meaning of this statement is illustrated in Fig. 5-3.
Here, all vectors are represented as directed line segments from the
origin. Comparing this sketch with Fig. 3-1, we see that the three vectors
A, B, and C can be thought of as determining an oblique system of coordinates.
A point D in space can be determined by its A, B, C coordinates,
which are (s, t, u) as given by (5-12). We will exploit this point of view
further in the next section.
Let us emphasize that the formulas and conclusions of this section have
been obtained with the help of the cross product. In a later section we will
see what can be done without having to use the cross product.

Formulas (5-11) and (5-13) are easy to use in practice. For example,
suppose that
A = [1, -3, 2], B = [1, -1, -1], and C = [-3, -7, 18].

We find A X B = [5, 3, 2], and so A and B are not collinear. But A • B X


C = 0; so the three vectors are coplanar. Computing the coefficients in
(5-11) gives us
C = 5A - 8B,

as the reader may easily verify.
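
The computation just described can be carried out mechanically. The following sketch evaluates the coefficients of formulas (5-11) and (5-13) in exact rational arithmetic. The function names are hypothetical, integer components are assumed, and the vectors used to illustrate (5-13) are our own choice rather than an example from the text.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def triple(a, b, c):
    """Scalar triple product a . (b X c)."""
    return dot(a, cross(b, c))

def coplanar_coefficients(a, b, c):
    """Return (s, t) with c = s*a + t*b, following formula (5-11).

    Assumes a and b are noncollinear and a, b, c are coplanar.
    """
    n = cross(a, b)
    nn = dot(n, n)  # |a X b|^2, nonzero by assumption
    s = Fraction(-dot(cross(b, c), n), nn)
    t = Fraction(dot(cross(a, c), n), nn)
    return s, t

def basis_coefficients(a, b, c, d):
    """Return (s, t, u) with d = s*a + t*b + u*c, following formula (5-13).

    Assumes a, b, c are noncoplanar.
    """
    vol = triple(a, b, c)  # nonzero by assumption
    return (Fraction(triple(b, c, d), vol),
            Fraction(-triple(a, c, d), vol),
            Fraction(triple(a, b, d), vol))

# The coplanar example from the text: the result should give C = 5A - 8B.
s, t = coplanar_coefficients([1, -3, 2], [1, -1, -1], [-3, -7, 18])
print(s, t)

# An illustrative noncoplanar triple (our own choice of vectors).
print(basis_coefficients([1, 0, 0], [1, 1, 0], [1, 1, 1], [2, 3, 4]))
```

Using Fraction rather than floating-point numbers keeps the coefficients exact, so the result can be compared directly with a hand computation.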


Theorems 5-9 and 5-10 seem to have a great deal in common. Adding
the remark that if A ≠ 0 and A and B are collinear, then there is a scalar
s such that B = sA, we see that the following three statements are related:
(i) A single vector A is nonzero;
(ii) Two vectors A and B are noncollinear;
(iii) Three vectors A, B, and C are noncoplanar.
There is a fundamental property which underlies these three statements.
This property has been found to be most important in any attempt to
extend the concept of vectors. It is called linear independence.

Definition 5-2. A finite collection of vectors A1, A2, ..., An is called
linearly dependent if and only if there exist scalars λ_i (i = 1, ..., n),
not all zero, such that

λ1A1 + λ2A2 + ... + λnAn = 0. (5-14)

A collection of vectors which is not linearly dependent is called linearly
independent.

Note that in order to prove that a set of vectors is linearly independent,
one must show that if (5-14) holds, then λ1 = λ2 = ... = λn = 0.
The connection of linear independence with the concepts already discussed
in this section is contained in the following theorem:

Theorem 5-11.

(1) A single vector A is linearly dependent if and only if A = 0.


(2) Two vectors, A and B, are linearly dependent if and only if they
are collinear.
(3) Three vectors are linearly dependent if and only if they are
coplanar.
(4) Four vectors are always linearly dependent.

Proof: Statement (1) follows obviously from Definition 5-2.


Statement (2) follows easily from the definition of collinearity. For if
there exist λ1 and λ2 not both zero such that

λ1A + λ2B = 0, (5-15)

then we can solve for one of the two vectors as a scalar multiple of the
other, and hence vectors A and B are collinear. On the other hand, if
A and B are collinear, then one is a scalar multiple of the other. Suppose,
for example, that A = tB. This statement can then be rewritten in the
form (5-15),
A + (-t)B = 0.

One of the coefficients is 1, and hence is nonzero.


Statement (3) follows in the same manner with the help of Theorem 5-9.
Similarly, statement (4) follows from Theorem 5-10, provided three of
the four vectors are noncoplanar. If, however, some three of the vectors
are coplanar, then these three are already linearly dependent. We can,
therefore, write (5-14) with not all coefficients zero, using just these three
vectors. To write such an expression involving all four vectors, just add
the fourth vector with a coefficient of zero. This shows that the four
vectors are linearly dependent in any case.
Note that Theorem 5-10 can now be rewritten in the following form:

Theorem 5-12. If A, B, and C are linearly independent, then any vector


D can be written as a linear combination of A, B, and C.

PROBLEMS

1. Let n vectors, A1, A2, ..., An, be given. Let R be some nonzero vector.
Prove that the vectors B1 = R X A1, B2 = R X A2, ..., Bn = R X An
are coplanar.
2. Prove that if A is collinear with B ≠ 0 and B is collinear with C, then A
is collinear with C.
3. Prove that there always exists a nonzero vector orthogonal to a given
vector.
4. Use identity (5-5) on the final expression obtained for C in formula (5-11).
How is this expression simplified if |A| = 1, |B| = 1, and A and B are
orthogonal?
5. Let A, B, and C be three noncoplanar vectors. Show that the representation
of a vector D as given in (5-12) is unique.

6. Show that a plane is defined by


{X | X = P0 + sA + tB, for all real s and t},

where A and B are a pair of noncollinear vectors. What is the equation of


this plane? (This is called the parametric representation of a plane.)
7. Let A = [a1, a2, a3], B = [b1, b2, b3], C = [c1, c2, c3], and D = [d1, d2, d3].
Show that the system of equations
a1x + b1y + c1z = d1,
a2x + b2y + c2z = d2,
a3x + b3y + c3z = d3
is equivalent to the single vector equation
xA + yB + zC = D.
What happens to this equation if we take the dot product of each side with
B X C? Use this to solve for x. Do the same with A X C and A X B.
Under what circumstances will a solution exist?
8. Prove that for any vectors A, B, C, and D, with A • B X C ≠ 0,
D = [(D • B X C)/(A • B X C)] A + [(A • D X C)/(A • B X C)] B + [(A • B X D)/(A • B X C)] C.
9. Suppose that A and B are two noncollinear vectors. Show that if C = A X B,
then A, B, and C are noncoplanar.
10. Let A and B be noncollinear vectors. Rewrite formula (5-13) in terms of
A, B, and D alone when C = A X B. Compare with (5-11).
11. For each of the following, prove that the three vectors are coplanar and
obtain C as a linear combination of A and B.
(a) A = [2, 0, 1], B = [0, 3, 4], C = [8, -3, 0]
(b) A = [1, 7, -2], B = [-1, 5, 1], C = [7, 1, -10]
(c) A = [1, 2, 0], B = [1, 1, 0], C = [1, -4, 0]
(d) A = [1, 1, 1], B = [1, 2, 3], C = [5, 0, -5]
12. For each of the following, prove that A, B, and C are not coplanar and obtain
D as a linear combination of A, B, and C.
(a) A = [1, 0, 0], B = [1,1,0],
C = [1, 1, 1], D = [5, 3, -1]
(b) A = [1, 2, -1], B = [2, 0, 3],
C = [6, -5, -4], D = [3, -12, -24]
(c) A = [-1, 3, -1], B = [4, 2, 1],
C = [3, 5,1], D = [-8, 24, -10]
(d) A = [2, 7, 5], B = [-1, -8, 3],
C = [1,1, -5], D = [6, 3, 37]

5-3 COORDINATE VECTORS

We have defined a vector as a triple of numbers. This gives a natural


representation of an arbitrary vector as a linear combination of the three
special unit vectors e1 = [1, 0, 0], e2 = [0, 1, 0], and e3 = [0, 0, 1]. That
is,
[x, y, z] = xe1 + ye2 + ze3.

We remark that these are not the symbols most often used in vector
analysis. The more common symbols are i, j, and k. We have avoided
using these for two reasons. First, the letters i and j already cause enough
confusion, since they are used as symbols with several different meanings
in mathematics, physics, and electrical engineering. But a much more
important reason is that we are looking forward to a generalization of
three-dimensional vectors to a higher number of dimensions. Most of the
formulas we have derived will have natural generalizations, but the symbols
i, j, and k would have to be replaced by others (normally e1, e2, ..., en).
Theorem 5-10 of the previous section shows that any three noncoplanar
vectors can be used to express arbitrary vectors in space. Suppose we had
three noncoplanar vectors u1, u2, and u3, and

A = a1u1 + a2u2 + a3u3,
B = b1u1 + b2u2 + b3u3.

Then using the linearity of the dot product, we obtain

A • B = Σ_{i=1}^{3} Σ_{j=1}^{3} a_i b_j (u_i • u_j).

This summation contains nine terms in all. It would be much simpler if
the vectors u_i were mutually orthogonal. Then u_i • u_j = 0 if i ≠ j, and
we would have

A • B = Σ_{i=1}^{3} a_i b_i |u_i|^2.

Again, this formula would be simplified if each u_i had magnitude 1. In
this case,

A • B = Σ_{i=1}^{3} a_i b_i,

the same formula we had for the dot product in terms of the expansion by
the natural unit vectors, e1, e2, and e3.

Definition 5-3. A set of three nonzero vectors, u1, u2, and u3 is called an
orthogonal set if and only if u_i • u_j = 0 for all i ≠ j. It is called an
orthonormal set if and only if it is an orthogonal set and in addition
|u_i| = 1 for all i.

Note that according to this definition, the unit coordinate vectors e1,
e2, and e3 are an orthonormal set. The following theorem is a direct
consequence of this definition and shows how similar an orthonormal set
is in its behavior to the unit coordinate vectors.

Theorem 5-13. Let u1, u2, u3 be an orthonormal set of vectors. Then this
set of vectors is linearly independent. Given any vector A,

A = (A • u1)u1 + (A • u2)u2 + (A • u3)u3. (5-16)

If A = a1u1 + a2u2 + a3u3 and B = b1u1 + b2u2 + b3u3, then

A • B = a1b1 + a2b2 + a3b3. (5-17)

Proof: To see that an orthonormal set is linearly independent we merely
need observe that if
λ1u1 + λ2u2 + λ3u3 = 0,
then
0 = 0 • u1
= (λ1u1 + λ2u2 + λ3u3) • u1
= λ1(u1 • u1) + λ2(u2 • u1) + λ3(u3 • u1)
= λ1.
In exactly the same way we can compute λ2 and λ3 to be equal to zero.
Hence the three vectors are linearly independent. But then, as we saw in
Theorem 5-10, any vector A can be expressed as a linear combination of
the three orthonormal vectors,

A = a1u1 + a2u2 + a3u3.

However, we may then compute

A • u1 = (a1u1 + a2u2 + a3u3) • u1 = a1

just as above. Similarly we find A • u2 = a2 and A • u3 = a3, which
proves the second assertion of the theorem. The final part was done above.

The question which this theorem immediately raises is: can we produce
an orthonormal set of vectors from any three linearly independent vectors?

Thinking geometrically it is quite obvious how this must be done. Suppose


we have three vectors A1, A2, and A3 which are linearly independent. It
is easy to produce a vector with magnitude 1 in the same direction as
A1, namely u1 = A1/|A1|. Clearly, it is possible to find a vector orthogonal
to u1 in the plane of A1 and A2. Such a vector is B2 = (A1 X A2) X A1.
We then can set u2 = B2/|B2|. To get a third vector orthogonal to both
u1 and u2, we only need to take u1 X u2. This process can be simplified
somewhat. Since the vector u3 is merely required to be orthogonal to the
plane of A1 and A2, we could just as easily set

u3 = (A1 X A2)/|A1 X A2|,

and then u2 = u3 X u1. The reader should note how the requirement
that A1, A2, and A3 be linearly independent implies that the quantities
|A1|, |B2|, and |A1 X A2| be all nonzero.
Therefore, we see that we have actually proved:

Theorem 5-14. Let A1 and A2 be a given pair of noncollinear vectors.
Let

u1 = A1/|A1|,
u3 = (A1 X A2)/|A1 X A2|, (5-18)
u2 = u3 X u1.

Then u1, u2, and u3 form an orthonormal set with u1 parallel to A1 and
u2 coplanar with A1 and A2.

For example, suppose we have the vectors A1 = [6, -3, 2] and A2 =
[8, 3, 5]. Then |A1| = [36 + 9 + 4]^(1/2) = 7, and so

u1 = [6/7, -3/7, 2/7].

Also
A1 X A2 = [-21, -14, 42]
= 7[-3, -2, 6],

and hence |A1 X A2| = 7 • 7 = 49. Thus

u3 = [-3/7, -2/7, 6/7],

and taking the cross product, we have

u2 = [2/7, 6/7, 3/7].

The reader should verify for himself that these three vectors do indeed
form an orthonormal set.
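
The construction of Theorem 5-14 is easily expressed as a short routine. The sketch below uses floating-point arithmetic and a hypothetical function name, orthonormal_from_pair. It assumes that the two given vectors are noncollinear and makes no attempt to detect the degenerate case.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(a):
    """Vector of magnitude 1 in the direction of a (a must be nonzero)."""
    length = math.sqrt(dot(a, a))
    return [x / length for x in a]

def orthonormal_from_pair(a1, a2):
    """Orthonormal set (u1, u2, u3) built as in (5-18).

    u1 is parallel to a1, u3 is orthogonal to the plane of a1 and a2,
    and u2 = u3 X u1 lies in that plane. Assumes a1, a2 noncollinear.
    """
    u1 = unit(a1)
    u3 = unit(cross(a1, a2))
    u2 = cross(u3, u1)
    return u1, u2, u3

# The example from the text.
u1, u2, u3 = orthonormal_from_pair([6, -3, 2], [8, 3, 5])
print(u1, u2, u3)
```

Up to rounding, the printed vectors should agree with the values of u1, u2, and u3 found by hand in the example above.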

We can now discuss coordinate systems. In our definition of points in


space as triples of numbers, we assumed an intrinsic coordinate system.
However, we feel that the points of our space have an independent existence,
without regard to the particular coordinate system that we happen
to use. Similarly vectors, which are thought of as a property of directed
line segments, independent of translation, cannot really depend on the
coordinate system either, despite the way we defined them.
We discussed the problem of translation earlier and saw that no real
change in the properties of space resulted from translation of the coordinate
system (which is algebraically equivalent to translation of the
entire space). Translation of the coordinate system in space leaves the
representation of vectors as triples of numbers unchanged, of course.
We still have the problem of rotation of coordinate systems to consider.
In this regard, it is best to make the identification between vectors and
points of space and consider both simultaneously. That is, we consider
all vectors to be directed line segments with initial points at the origin,
and identify the terminal point of the segment with the vector.
In this context, our initial choice of a set of mutually orthogonal coordinate
axes corresponds to a choice of an orthonormal set of vectors: the
unit vectors in the positive directions along the coordinate axes. The coordinates
of a point (or the components of a vector) in terms of this
coordinate system are then given by the three coefficients of these unit
vectors in the expansion of the vector, for example,

A = a1u1 + a2u2 + a3u3.


Formula (5-16) of Theorem 5-13 shows how these coefficients can
actually be obtained for a given new coordinate system. For example, the
vectors

(5-19)

J_, _ JL1
5/6 ’ VôJ
form an orthonormal set. If these are considered to define a coordinate
system, say the x'y'z'-coordinate system (in that order), then any point X
which initially has coordinates (x, y, z) will have coordinates (x', y', z')
in this new coordinate system. To find the relationship between these
coordinates, we observe that
X = [x, y, z] = xe1 + ye2 + ze3 = x'u1 + y'u2 + z'u3,
so that, by (5-16), x' = X • u1, y' = X • u2, and z' = X • u3.

Formula (5-17) is of special interest in connection with the point of view


of considering u1, u2, and u3 as defining a new coordinate system. This
formula says that the representation of the dot product of two vectors in
terms of their components is the same no matter what orthonormal set
of vectors is used to define the coordinate system. Despite the fact that
our initial definition of the dot product was in terms of a given coordinate
system, it is really independent of this choice.
What about the cross product? It too was defined in terms of the
intrinsic coordinates. Let us see what happens if we express vectors in
terms of another orthonormal set. Suppose that

A = a1u1 + a2u2 + a3u3,
B = b1u1 + b2u2 + b3u3.

Then if we take the cross product A X B, using the linearity and the
anticommutative property, we find

A X B = (a2b3 - a3b2)u2 X u3 + (a1b3 - a3b1)u1 X u3
+ (a1b2 - a2b1)u1 X u2. (5-20)

Here, we have discarded the terms involving u1 X u1, u2 X u2, and
u3 X u3 since they are zero.
Now, we know that u1 X u2 is orthogonal to both u1 and u2, hence
u1 X u2 = cu3; but from (5-16) we see that c = u1 X u2 • u3, and so
u1 X u2 = (u1 X u2 • u3)u3.
On the other hand, from identity (5-7)
|u1 X u2|^2 = |u1|^2|u2|^2 - (u1 • u2)^2 = 1,
since u1 • u2 = 0, so that u1 X u2 = ±u3.

If u1 X u2 = u3 (and u1 • u2 X u3 = +1), we say that u1, u2, and
u3 constitute a right-handed system of vectors. Let us assume that this is
true for the moment. Then with the help of identity (5-1), we have
u2 X u3 = u2 X (u1 X u2)
= (u2 • u2)u1 - (u2 • u1)u2
= u1,
and similarly,
u1 X u3 = u1 X (u1 X u2)
= (u1 • u2)u1 - (u1 • u1)u2
= -u2.

Substituting these into (5-20), we find that if u1, u2, and u3 constitute a
right-handed orthonormal system, then
A X B = (a2b3 - a3b2)u1 - (a1b3 - a3b1)u2 + (a1b2 - a2b1)u3. (5-21)
Comparing this with formula (4-10), we find that the expansion of the
cross product in terms of any right-handed orthonormal coordinate system
is the same. Thus the cross product does not depend on the choice of a
coordinate system, so long as the chosen system is still right-handed.
The student can easily verify that if the coordinate system is left-handed
(that is, if u1 X u2 = -u3), then instead of (5-21) we find a
formula for A X B which is the negative of that given in (5-21).
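
The independence of the dot and cross products from the choice of a right-handed orthonormal system can be illustrated numerically. The sketch below is not part of the original development. It takes as its basis the orthonormal set obtained in the example following Theorem 5-14, uses exact fractions, and introduces a hypothetical helper, components, which applies (5-16). The vectors A and B are arbitrary illustrative choices.

```python
from fractions import Fraction as F

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product computed from components, as in (5-21)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Right-handed orthonormal set from the example after Theorem 5-14.
U1 = [F(6, 7), F(-3, 7), F(2, 7)]
U2 = [F(2, 7), F(6, 7), F(3, 7)]
U3 = [F(-3, 7), F(-2, 7), F(6, 7)]
BASIS = (U1, U2, U3)

def components(x, basis):
    """Coefficients of x with respect to an orthonormal basis, by (5-16)."""
    return [dot(x, u) for u in basis]

A = [1, 2, 3]   # illustrative vectors, given in the original coordinates
B = [-2, 0, 5]

a = components(A, BASIS)
b = components(B, BASIS)

# (5-17): the dot product has the same form in the new components.
dot_agrees = dot(a, b) == dot(A, B)
# (5-21): the cross product of the new components gives the new
# components of A X B, because the basis is right-handed.
cross_agrees = cross(a, b) == components(cross(A, B), BASIS)

print(dot_agrees, cross_agrees)
```

Replacing U3 by its negative gives a left-handed set, and in that case the cross-product comparison fails by exactly a change of sign, in agreement with the remark above.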
Finally, we remark that if we have any three vectors which are not
coplanar, then from Theorem 5-10 we can express any vector X as a linear
combination of these. We can think of the coefficients of these vectors as
being the coordinates of the vector X (or point X) in terms of an oblique
coordinate system. The rectangular parallelepiped in Fig. 3-1 would be
replaced by a parallelepiped whose edges are parallel to the given vectors
(as in Fig. 5-3). While the oblique coordinates are useful in some cases,
their use makes the formulas for the dot and cross product very complicated.
PROBLEMS

1. Construct an orthonormal set out of the given vectors by the method


described in this section.
(a) A1 = [1, 3, 0], A2 = [-1, 1, 0]
(b) A1 = [-1, 0, 1], A2 = [0, 2, 1]
(c) A1 = [2, 1, -3], A2 = [1, -1, 2]
(d) A1 = [1, 2, 1], A2 = [-2, -5, 2]
2. Express the vector B = [2, 2, 2] as a linear combination of the orthonormal
set u1, u2, and u3 for each part of Problem 1.

3. Prove that the right-hand side of (5-20) reduces to the negative of (5-21)
if u1 X u2 = -u3.
4. Is the set of vectors defined by (5-18) a right-handed system? Does this fact
depend on A1 and A2?
5. Prove that the vectors (5-19) form an orthonormal set. Are they a right-handed
system?
6. For each of the orthonormal sets of vectors obtained in Problem 1, find
formulas for the coordinates (x', y', z') of a point (x, y, z) when the x'-, y'-,
and z'-axes are determined by u1, u2, and u3, respectively.
7. Show that each of the following sets of vectors is an orthonormal set. Give
formulas for the coordinates (x', y', z') of a point (x, y, z) when the x'-, y'-,
and z'-axes are determined by u1, u2, and u3, respectively. Express A =
[5, -2, 1] as a linear combination of u1, u2, and u3.
(a) ui = [0,|,f ], u2 = [jf > AL U3 = Iff> tf > — ff1
(b) ui = [f, f, —f], u2 = [—f, f, ~ u3 = [|, f]
(c) ui = [f, —f, J], u2 = I-—TK> T%L U3 = IA, ~It]

8. Let u1, u2, and u3 be a right-handed, orthonormal set of vectors. Set

N = e1 X u1 + e2 X u2 + e3 X u3,

X = xe1 + ye2 + ze3,

and
Y = xu1 + yu2 + zu3.

(a) Prove that the angle between e_i and N is the same as the angle between
u_i and N for i = 1, 2, and 3.
(b) Show that |X| = |Y|, and prove that the angle between X and N is the
same as the angle between Y and N for any x, y, and z.
(c) Prove that X = Y if X is collinear with N. [Hint: Use (5-16) on both
X and Y.]
Remark: A rotation of space which carries e1 to u1, e2 to u2, and e3 to
u3 can be thought of as a function which maps a point X to a point Y, where
X and Y are given as above. The vector N is then the axis of the rotation.
Property (b) shows that N is the axis, and property (c) shows that any
rotation must have an axis.
9. Find the axis N as defined in Problem 8 for each of the sets of vectors in
Problem 7. Make a sketch showing these vectors for each of the cases.
10. Let u1, u2, and u3 be any three noncoplanar vectors (they need not form an
orthonormal set). Define

v1 = c(u2 X u3), v2 = c(u3 X u1), v3 = c(u1 X u2),

where c = 1/(u1 • u2 X u3). The v_i are called a dual basis to the u_i.

(a) Prove that

v_i • u_j = 0 if i ≠ j,
v_i • u_j = 1 if i = j.
(b) Prove that for any vector A,
A = (A • v1)u1 + (A • v2)u2 + (A • v3)u3,
and also
A = (A • u1)v1 + (A • u2)v2 + (A • u3)v3.

(c) Find the dual basis to A, B, and C in each of the four parts of Problem 12,
Section 5-2. Use the results of part (b) of this problem to obtain D as a
linear combination of A, B, and C.

5-4 PROJECTIONS AND DISTANCE FORMULAS

In Section 3-5 we introduced the projection of vectors in the direction


of a line and we have made use of this idea several times in various applications.
The formula which was introduced as the definition is not particularly
easy to remember by itself. At this point, we would like to
exhibit another way of looking at the projection. From this point of view,
the formula for the projection can be rederived easily as needed.

Let A and B ≠ 0 be given vectors. We wish to find the projection of A


in the direction of the line determined by B. The situation is illustrated
in Fig. 5-4. Here, the required projection is P, and we see that the basic
requirements are satisfied when V = A — P is orthogonal to B. In order
that P be collinear with B, we must have
P = tB (5-22)

for some scalar t. But then, in order to have V orthogonal to B, we must


have
V • B = A • B - P • B
= A • B - tB • B
= 0.

This will be true only if t = (A • B)/(B • B). Using this
value in (5-22) yields exactly the formula given in Definition 3-16.
This particular point of view can be extended to give us the definition
of the projection of a vector onto a plane.

Definition 5-4. Let A be a given vector and M a given plane. Then


the projection of A onto M is a vector P which lies in the plane and is
such that
A = P + V
for some vector V orthogonal to M.

The contents of this definition are illustrated in Fig. 5-5. From this
figure it is easy to see
how we can obtain formulas for the projection of a vector onto a plane.
Suppose N ≠ 0 is orthogonal to the plane. Then it is clear that we want
V to be collinear with N to satisfy the requirements of Definition 5-4.
Likewise, it seems clear from the figure that V must actually be the projection
of A in the direction of N. Using this would give

P = A - V
= A - [(A • N)/(N • N)] N.

This result has been obtained by means of our geometric understanding
rather than by formal computation. However, once we have obtained this
formula, it is easy to verify it. Direct computation shows that if P is
defined in this way, then P • N = 0 and hence that P is parallel to the
given plane. Thus, we have proved:

Theorem 5-15. Let M be a plane orthogonal to a vector N ≠ 0. Then
the projection of a vector A onto M is given by

P = A − ((A • N)/(N • N)) N. (5-23)

For example, let us find the projection of A = [2, 5, −1] onto the plane
with equation 3x − y + 2z − 5 = 0. The normal vector to this plane
is N = [3, −1, 2]. Here, |N|² = 14 and A • N = −1. Hence

P = [2, 5, −1] + (1/14)[3, −1, 2]
  = (1/14)[31, 69, −12].
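
The arithmetic in this example can be checked with exact rational numbers. The sketch below applies formula (5-23) to A = [2, 5, −1] and N = [3, −1, 2]; the function name `project_onto_plane` is an illustrative choice, and `fractions.Fraction` is used only to avoid rounding.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two vectors given as lists of numbers."""
    return sum(x * y for x, y in zip(a, b))

def project_onto_plane(a, n):
    """Projection of a onto a plane with nonzero normal n, as in (5-23)."""
    nn = dot(n, n)
    if nn == 0:
        raise ValueError("N must be a nonzero vector")
    t = Fraction(dot(a, n), nn)            # (A . N)/(N . N), kept exact
    return [x - t * y for x, y in zip(a, n)]

A = [2, 5, -1]
N = [3, -1, 2]
P = project_onto_plane(A, N)
print(P)            # components as exact fractions
print(dot(P, N))    # zero when P is parallel to the plane
```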

In Section 4-3 we derived an expression for the distance from a point
to a plane. We now wish to find expressions for the distance between a
point and a line and for the distance between two lines.
Suppose that we are given a line L which contains the point A and has
the direction B. Thus the line L has parametric equation

X = A + tB.

We wish to find the distance from a point P to the line L. From geometric
considerations it is clear that the required distance is exactly the
length of the vector PC from P to a point C on the line which is such that
PC is orthogonal to B. This can be verified by noting that if D is any
point on the line, then (see Fig. 5-6)

PD = PC + CD.

Since PC is orthogonal to B and CD is collinear with B, PC and CD must
be orthogonal. But then
|PD|² = |PC|² + |CD|², (5-24)
as can be shown by direct calculation or by reference to Theorem 3-8.
(The fact that the square of the magnitude of the sum of two orthogonal
vectors is the sum of the squares of their magnitudes is exactly the
Pythagorean theorem.)
By letting the point D vary along the line L, we see from (5-24) that
the minimum distance |PD| is attained when C = D, and thus |PC| is
the distance from P to L.
To find PC we can proceed in any one of several different ways. Perhaps
the easiest way is to note that
PC = AC − AP.
(We remark that the reader frequently will find it easier to obtain an
equation such as this by noting first that AP + PC = AC.) The vector
AC is the projection of AP onto the direction of B, and hence

AC = ((AP • B)/|B|²) B,
giving
PC = ((AP • B)/|B|²) B − AP.

To find the length of PC, we can calculate

|PC|² = (PC) • (PC)
      = ((AP • B)²/|B|⁴)|B|² − 2(AP • B)²/|B|² + |AP|²
      = |AP|² − (AP • B)²/|B|²
      = |AP × B|²/|B|²,
where the last step follows from Eq. (5-7) of Section 5-1.

This same result can also be obtained by observing that if θ is the angle
between AP and B, then
|PC| = |AP| sin θ,
while
|AP × B| = |AP| · |B| sin θ.

Either way, we have proved

Theorem 5-16. Let L be the line through A with direction B, and let
P be a given point. Then the distance from P to L is d, where

d = |AP × B| / |B|. (5-25)

The point C on L which is closest to P is given by

C = A + ((AP • B)/|B|²) B. (5-26)

It should be noted that the distance d given by (5-25) will be zero if
and only if the point P is on the line.
As an illustration of the use of these formulas, let us find the distance
between the line
X = [2, 3, −5] + t[1, −2, 2]

and the point P = (15, 7, 6). Here, B = [1, −2, 2], and AP = [13, 4, 11].
Starting with (5-26), we see that the projection of AP in the direction of
the line is

AC = ((AP • B)/|B|²) B = (27/9) B = [3, −6, 6].

Hence C = A + AC = [5, −3, 1], and

d = |PC| = |[−10, −10, −5]|
  = 5 · |[2, 2, 1]|
  = 5 · 3 = 15.

Alternatively, we could compute

AP × B = [30, −15, −30],

and, using (5-25), find

d = (1/3)|[30, −15, −30]| = 5 · |[2, −1, −2]| = 15.
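
Both routes to the distance can be reproduced in a few lines of Python. The sketch below implements (5-25) and (5-26) and applies them to the line and point of this example; the helper names (`sub`, `dot`, `cross`, `distance_point_to_line`, `closest_point_on_line`) are illustrative.

```python
import math

def sub(p, q):
    """Vector from q to p."""
    return [x - y for x, y in zip(p, q)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def distance_point_to_line(p, a, b):
    """d = |AP x B| / |B|, formula (5-25)."""
    c = cross(sub(p, a), b)
    return math.sqrt(dot(c, c)) / math.sqrt(dot(b, b))

def closest_point_on_line(p, a, b):
    """C = A + ((AP . B)/|B|^2) B, formula (5-26)."""
    t = dot(sub(p, a), b) / dot(b, b)
    return [ai + t * bi for ai, bi in zip(a, b)]

A = [2, 3, -5]      # point on the line
B = [1, -2, 2]      # direction of the line
P = [15, 7, 6]      # given point
d = distance_point_to_line(P, A, B)
C = closest_point_on_line(P, A, B)
print(d, C)
```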

Now, we wish to turn to the problem of finding a formula for the distance
between two lines. Let L1 be the line through A1 with the direction B1,
and let L2 be the line through A2 with direction B2. Suppose
that there exist points C1 and C2 on L1 and L2, respectively, such that
C1C2 is orthogonal to both L1 and L2. Then |C1C2| is the required distance
between L1 and L2. This is easily seen by considering the planes
orthogonal to C1C2 through A1 and A2, respectively. The lines L1 and
L2 will be contained in these respective planes (why?), and hence the
distance between these planes will be the minimum distance between the
lines. (See Fig. 5-7.)
The vector C1C2 is clearly the projection of the vector A1A2 in the
direction of C1C2. But since C1C2 is assumed to be orthogonal to both B1
and B2, it must be collinear with B1 × B2 (assuming that B1 and B2 are
not collinear). Therefore C1C2 is the projection of A1A2 in the direction
of B1 × B2, and hence

|C1C2| = |A1A2 • (B1 × B2)| / |B1 × B2|. (5-27)


This result was obtained under the assumption that points C1 and C2
with the required properties exist. The existence of these points can be
verified in several different ways, but with the proper point of view it is

Figure 5-7
We suppose that B1 and B2 are not collinear. Then B1 × B2 ≠ 0, and
we can imagine the planes orthogonal to B1 × B2 through A1 and A2.
The line L1 is in one of these planes and the line L2 is in the other. If
we make the orthogonal projection of the line L2 into the plane containing
L1, we see that L1 and the projection of L2 must intersect. The
point of intersection will be the point C1 (Fig. 5-8).
This rather intuitive argument can be made rigorous quite easily. Introduce
an orthonormal set of vectors u1, u2, and u3 such that u1 is parallel
to B1, u2 is in the plane determined by B1 and B2, and u3 is collinear
with B1 × B2, by the method of the last section. Then we have
B1 = b1u1, B2 = c1u1 + c2u2, (5-28)

where b1 = |B1| ≠ 0 and c2 ≠ 0, since otherwise B1 and B2 would
have been collinear. The lines L1 and L2 have equations

X = A1 + tB1, Y = A2 + sB2, (5-29)

respectively. If X is an arbitrary point on L1, having coordinate t, and
Y is an arbitrary point on L2 with coordinate s, then

XY = A2 − A1 + sB2 − tB1
   = (A2 − A1) + (sc1 − tb1)u1 + sc2u2.

Let us suppose that A2 − A1 = a1u1 + a2u2 + a3u3. Then

XY = (a1 + sc1 − tb1)u1 + (a2 + sc2)u2 + a3u3,

and since B1 × B2 is parallel to u3, the problem is whether or not we can
find values for s and t so that a1 + sc1 − tb1 = 0 and a2 + sc2 = 0.
If so, these values when put into (5-29) would give us the required points
C1 and C2.
However, the second equation,

a2 + sc2 = 0,

can be solved for s, since from (5-28) the number c2 is not zero. Having
solved for s, we can solve for t in the first equation,

a1 + sc1 − tb1 = 0,

since b1 ≠ 0. Therefore the required points C1 and C2 exist.


Note that in this proof we are not particularly interested in finding
C1 and C2. All we are really after is the information that these points
exist. With this knowledge, the rest of the calculations are easy.

Theorem 5-17. Let L1 be the line through A1 in the direction B1, and
let L2 be the line through A2 in the direction B2. If B1 and B2 are not
collinear, then the distance between L1 and L2 is

d = |A1A2 • (B1 × B2)| / |B1 × B2|. (5-30)

If B1 and B2 are collinear, then the distance between L1 and L2 is

d = |A1A2 × B1| / |B1|. (5-31)

The proof of the second part of this result will be left as one of the
problems. It should be noted that the distance d given in this theorem will
be zero if and only if the two lines intersect.
As an example, let us find the distance between the lines

X = [1, 0, 5] + t[2, 0, 3]
and
X = [4, −1, 2] + t[1, −1, 0].

Here, B1 = [2, 0, 3] and B2 = [1, −1, 0]. Hence B1 × B2 = [3, 3, −2].
Also, A1A2 = [3, −1, −3]. Therefore, the required distance is
d = (9 − 3 + 6)/√22 = 12/√22.
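
The two cases of Theorem 5-17 can be combined into a single routine. The following Python sketch uses (5-30) when B1 × B2 is nonzero and falls back to (5-31) when the directions are collinear. The function names are illustrative, and the exact test `dot(n, n) == 0` is an assumption that suits integer input; with floating-point data a small tolerance would be more appropriate.

```python
import math

def sub(p, q):
    return [x - y for x, y in zip(p, q)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def norm(a):
    return math.sqrt(dot(a, a))

def distance_between_lines(a1, b1, a2, b2):
    """Distance between the lines X = A1 + tB1 and X = A2 + tB2."""
    a1a2 = sub(a2, a1)
    n = cross(b1, b2)
    if dot(n, n) == 0:
        # Collinear directions: formula (5-31).
        return norm(cross(a1a2, b1)) / norm(b1)
    # Non-collinear directions: formula (5-30).
    return abs(dot(a1a2, n)) / norm(n)

# The example from the text.
d_skew = distance_between_lines([1, 0, 5], [2, 0, 3], [4, -1, 2], [1, -1, 0])
# A pair of parallel lines chosen for illustration.
d_parallel = distance_between_lines([0, 0, 0], [1, 0, 0], [0, 3, 4], [2, 0, 0])
print(d_skew, d_parallel)
```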

PROBLEMS

1. For each of the following sets of vectors, find the projection of A3 onto a
plane parallel to both A1 and A2.
(a) A1 = [1, 3, 0], A2 = [−1, 1, 0], A3 = [1, 2, 1]
(b) A1 = [−1, 0, 1], A2 = [0, 2, 1], A3 = [1, 1, 0]
(c) A1 = [2, 1, −3], A2 = [1, −1, 2], A3 = [3, 0, 4]
(d) A1 = [1, 2, 1], A2 = [−2, −5, 2], A3 = [1, 1, 5]
2. For each of the sets of vectors in Problem 1, find the projection of A3 onto a
plane orthogonal to A1.
3. Find s and t such that

XY = A2 − A1 + sB2 − tB1

is orthogonal to both B1 and B2 by direct calculation (without introducing
the vectors u1, u2, and u3). Show that the result can be written in the form

|Bi X B2|2
4. Set
A1 = (3, 2, 4), B1 = [1, 0, 1],
A2 = (1, −3, 1), B2 = [0, −5, 3],
A3 = (−2, 1, 2), B3 = [7, 5, −3],
A4 = (2, 1, −1), B4 = [−3, 1, −1],
A5 = (5, 5, 10), B5 = [1, 0, 1],
P1 = (1, 3, 5), P2 = (0, 5, 0),
P3 = (5, 4, 3), P4 = (−3, 0, 1).

Let L1 be the line through A1 in the direction B1, L2 the line through A2 in
the direction B2, etc.
(a) Find the distance from P_i to L_i for i = 1, 2, 3, and 4.
(b) Find the distance from L1 to each of the other lines.
5. Using Theorem 5-16, prove the second proposition of Theorem 5-17.
6. Let C be a given point and M a given plane. Let L be the line through C
orthogonal to M. Then by the projection of C onto M we will mean the point
D at which the line L cuts the plane. If M is the plane through the point A,
orthogonal to B, prove that the projection of C onto M is the point D, where

D = C − ((AC • B)/|B|²) B.
[Hint: Consider Fig. 5-5.]
7. In what way is Problem 6 related to the problem of finding the point on a
given plane which is closest to a given point?

5-5 GENERAL METHODS*

The results of Sections 5-2 and 5-3 are of great importance in the
general study of vector spaces. These results, however, were obtained with
the aid of the properties of the cross product. The cross product is an
artifact of the three-dimensional vector space which does not exist (in the
same form) in spaces of other dimensions. In this section, it is our purpose
to show how certain results obtained in Sections 5-2 and 5-3 can be
obtained by methods which do not involve the cross product.
The first topic we wish to discuss is linear dependence. Recall that in
Definition 5-2, a collection of vectors, A1, A2, . . . , An, was called linearly
dependent if and only if there exist scalars λi (i = 1, 2, . . . , n), not all
zero, such that
λ1A1 + λ2A2 + · · · + λnAn = 0. (5-32)

A set of vectors which is not linearly dependent is called linearly
independent. Let us now investigate some properties of this concept.

Theorem 5-18. If a collection of vectors contains the zero vector, then
the collection is linearly dependent.

* The material discussed in this section is not essential to this course, but
it does constitute an introduction to some very important topics taken up in
later courses. It would be well worth the student’s while to study this material.
In particular, students interested in applications of mathematics are advised
to study the proof of Theorem 5-24 with care.

Proof: If the collection is A1, A2, . . . , A_{n−1}, 0, then

λ1A1 + λ2A2 + · · · + λ_{n−1}A_{n−1} + λn0 = 0

when we set λ1 = λ2 = · · · = λ_{n−1} = 0 and λn = 1. Here, not all
of the λi are zero, since λn = 1.

Theorem 5-19. If a collection of vectors contains two identical vectors,
then the collection is linearly dependent.

Proof: Let all of the λi in (5-32) be zero except for the coefficients of the
two identical vectors. Let one of these have the coefficient +1 and the
other the coefficient —1. We then have a linear combination of the
vectors which is zero, but with some nonzero coefficients.

Theorem 5-20. If a set of vectors is linearly independent, then any
nonempty subset of this set is also linearly independent. If a collection
of vectors is linearly dependent, then any enlarged collection is also
linearly dependent.

Proof: The two parts of this theorem are logically equivalent, so let us
prove only the second part. Let A1, A2, . . . , An be a linearly dependent
collection of vectors. Then there exist λi (i = 1, 2, . . . , n), not all zero,
such that
λ1A1 + λ2A2 + · · · + λnAn = 0.

If the enlarged collection is A1, A2, . . . , An, B1, B2, . . . , Bk, then letting
the λi be the same as above (and hence not all zero), and all of the μj be
zero, we have

λ1A1 + · · · + λnAn + μ1B1 + · · · + μkBk = 0.

Definition 5-5. A collection of vectors A1, A2, . . . , An is called a set of
generators if and only if every vector can be expressed as a linear
combination of these. That is, if B is any vector, there exist scalars λi such
that
B = λ1A1 + λ2A2 + · · · + λnAn.

Directly from this definition we are able to prove

Theorem 5-21. If A1, A2, . . . , An is a set of generators, and if B is any
vector, then the collection of vectors B, A1, A2, . . . , An is linearly
dependent.

Proof: By the definition, there exist scalars λ1, . . . , λn such that

B = λ1A1 + · · · + λnAn.
But then
B − λ1A1 − · · · − λnAn = 0,
and this collection is linearly dependent (the coefficient of B is 1 ≠ 0).

Theorem 5-22. Let A1, A2, . . . , An be a linearly dependent sequence of
nonzero vectors. Then there is some k with 1 ≤ k < n such that the
sequence of vectors A1, . . . , Ak is linearly independent and A_{k+1} is a
linear combination of these. That is, there exist scalars λ1, λ2, . . . , λk
such that
A_{k+1} = λ1A1 + λ2A2 + · · · + λkAk.

Proof: We look at the sequence of vectors, and let k be the smallest
integer for which the sequence A1, A2, . . . , A_{k+1} is linearly dependent.
Since A1 is nonzero, k must be greater than or equal to one. On the
other hand, since the entire sequence is linearly dependent, k is less than n.
Since k is chosen to be the smallest integer such that the sequence
A1, . . . , A_{k+1} is linearly dependent, there exist scalars α1, α2, . . . , α_{k+1},
not all zero, such that
α1A1 + α2A2 + · · · + α_{k+1}A_{k+1} = 0.

However, α_{k+1} ≠ 0, for if α_{k+1} = 0, then not all the αi being zero would
imply that the sequence A1, . . . , Ak is linearly dependent, contradicting
the way k was chosen. This equation can then be divided through by
α_{k+1} and brought into the form required by the theorem.

Finally we prove

Theorem 5-23. Any four vectors are linearly dependent; and if three
vectors are linearly independent, they are a set of generators.

Proof: If we are given four vectors, and if three of them are linearly
dependent, then the whole collection is linearly dependent. On the other
hand, if three of the vectors are linearly independent and we prove that
they are a set of generators, then Theorem 5-21 shows that all four are
linearly dependent. So suppose that A1, A2, and A3 are linearly
independent. We will prove that they are a set of generators.
The method of proof that we use is called the method of replacement.
We start with a known set of generators, e1, e2, and e3 (why is this a set
of generators?) and add A1 to this set. By Theorem 5-21 the resulting
collection of four vectors A1, e1, e2, and e3 is linearly dependent. From
Theorem 5-22, we find one of these to be a linear combination of the
previous ones.
Suppose, for example, that we can solve for e3, finding

e3 = λ1A1 + λ2e1 + λ3e2.

But now, we can remove e3 from the collection A1, e1, e2, e3 and still have
a set of generators. For if B is an arbitrary vector with

B = μ1e1 + μ2e2 + μ3e3,
then
B = μ1e1 + μ2e2 + μ3(λ1A1 + λ2e1 + λ3e2)
  = μ3λ1A1 + (μ1 + μ3λ2)e1 + (μ2 + μ3λ3)e2
  = μ1′A1 + μ2′e1 + μ3′e2.

Next we add A2 to the collection and look at the sequence A2, A1, e1, e2
(assuming e3 was the one removed). The same reasoning shows that we
can remove another of the vectors. But the one removed will not be
A1 or A2, since these two are linearly independent. We therefore have a
sequence A1, A2, ei (where ei is one of the original three) which is still a
set of generators.
Finally we add A3 to the collection, getting A3, A2, A1, ei. This time,
when we remove a vector, it must be ei, since the collection A3, A2, A1 is
linearly independent. We are therefore able to conclude that the three
vectors are a set of generators, and the theorem is proved.
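
The statements of Theorems 5-18, 5-19, and 5-23 can be tested numerically without the cross product. The sketch below is not the replacement argument of the proof; it swaps in ordinary row reduction (Gaussian elimination) with exact fractions to count how many of the given vectors are independent. The names `rank` and `linearly_dependent` are illustrative.

```python
from fractions import Fraction

def rank(vectors):
    """Number of linearly independent vectors, found by row reduction."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0                                    # pivot rows found so far
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        if r == len(rows):
            break
        # Look for a row at or below r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_dependent(vectors):
    """True when some nontrivial linear combination of the vectors is zero."""
    return rank(vectors) < len(vectors)

e = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(linearly_dependent(e))                          # the standard basis
print(linearly_dependent(e + [[1, 1, 1]]))            # any four vectors
print(linearly_dependent([[1, 2, 3], [0, 0, 0]]))     # contains the zero vector
print(linearly_dependent([[1, 2, 3], [1, 2, 3]]))     # two identical vectors
```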

The reader should study carefully the process used in this proof. We
start with a set of generators. When another vector is added, the resulting
collection must be linearly dependent. Hence one of the original vectors
can be removed and still leave a set of generators. A vector is pushed in
at one end, forcing one out at the other. Observe also that in this proof
we have made use only of the definitions, the properties of vectors as found
in Theorems 3-3 and 3-4, and the fact that there are three special vectors
(e1, e2, and e3) which generate the entire space. Theorem 5-23 should be
compared with Theorem 5-12. The two are essentially equivalent.
Next, we add to the above assumptions the properties of the dot product
as given in Theorem 3-5, and show how we can obtain a close approximation
to Theorem 5-14 without having to use the cross product.

Theorem 5-24. Suppose A1, A2, and A3 are linearly independent. Then
there exists an orthonormal set u1, u2, and u3 such that u1 is a scalar
multiple of A1, u2 is a linear combination of A1 and A2, and u3 is a
linear combination of A1, A2, and A3.

Proof: Since A1, A2, and A3 are linearly independent, none can be the
zero vector. We start by setting

u1 = A1/|A1|
just as in Theorem 5-14.
Next we wish to find u2 coplanar with A1 and A2 and orthogonal to
u1. Refer back to Fig. 5-4 for a picture of what we wish to accomplish. In
this figure, we let B = u1, and A = A2. Then the orthogonal vector V,
which we call V2, must be

V2 = A2 − (A2 • u1)u1. (5-33)

This same result can also be obtained in a purely algebraic manner.
The desired vector V2 is to be a linear combination of A2 and u1. Since
it is to be orthogonal to u1, it cannot be collinear with u1. We can therefore
assume that the desired vector is of the form

V2 = A2 + tu1.

To find t so that V2 is orthogonal to u1, we set V2 • u1 = 0. This gives
A2 • u1 + t = 0, and hence t = −(A2 • u1), resulting in the same vector
V2 as obtained in (5-33).
The vector V2 is orthogonal to u1, but it is not of unit length in general.
Therefore, we set
u2 = V2/|V2|.

We can continue this process in a similar manner to find a vector orthogonal
to both u1 and u2. The desired vector can be determined with the
help of the concept of projection (see Fig. 5-5) or directly in the following
way. The vector we wish to find is to be a linear combination of A3, u1,
and u2. Hence we can assume

V3 = A3 + su1 + tu2. (5-34)

We can then solve for s and t by using the conditions that V3 is orthogonal
to u1 and u2. Setting V3 • u1 = 0 in (5-34) gives

A3 • u1 + s = 0.

Similarly, setting V3 • u2 = 0 gives

A3 • u2 + t = 0.
Hence we find that

V3 = A3 − (A3 • u1)u1 − (A3 • u2)u2. (5-35)

Note that this is merely A3 minus the projection of A3 onto the plane
determined by u1 and u2. The required vector of length 1 is then finally
given by

u3 = V3/|V3|.

The process that we have gone through here is of great theoretical (and
in many cases, practical) importance. The method produces an orthonormal
set of vectors successively from a given set of vectors by subtracting
from each vector its projection on the “plane” determined by the
previously determined vectors. Equations (5-33) and (5-35) serve to show
how the method proceeds. The process in no way depends on the fact
that our vectors are three-dimensional, and can be extended to any vector
space with an inner product. It is known as the Gram-Schmidt
orthogonalization process.
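
The process can be written as a short loop that follows Equations (5-33) and (5-35) directly: from each new vector, subtract its components along the unit vectors already found, then divide by the length of what remains. The Python sketch below is one possible rendering; the function name `gram_schmidt` and the tolerance `tol` used to detect a dependent input are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list u1, u2, ... built from independent vectors."""
    basis = []
    for a in vectors:
        v = [float(x) for x in a]
        for u in basis:
            c = dot(a, u)                       # the coefficient (A . u)
            v = [vi - c * ui for vi, ui in zip(v, u)]
        length = math.sqrt(dot(v, v))
        if length < tol:
            raise ValueError("the given vectors are linearly dependent")
        basis.append([vi / length for vi in v])  # u = V/|V|
    return basis

# Sample input chosen for illustration.
U = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for u in U:
    print(u)
```

Nothing in the loop refers to the number of components, which reflects the remark that the method is not tied to three dimensions.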

6
The Conic Sections

6-1 THE DEFINITION OF CONIC SECTIONS

In their study of geometry, the early Greek mathematicians gave special
consideration to those curves which could be obtained by cutting a right
circular cone with a plane. Such curves occupy an important position
throughout geometry and analysis, and their properties must be well known
to any student planning to do further work in mathematics.
To start our discussion of the conic sections, let us first find the vector
form of the equation of a right circular cone. Three things are required in
order to define a cone. We must specify a point to be the vertex of the
cone (A in Fig. 6-1), a vector N to define the direction of the axis of the
cone, and an angle θ to be the half angle of the cone. A point X = (x, y, z)
is on the cone if and only if the vector AX makes the angle θ with N or
−N. This is the same as

|(X − A) • N| = λ|X − A| · |N|,

where λ = cos θ. The absolute value on
(X − A) • N is needed to give us both sides
of the cone. If the absolute value signs were
left off, what points X would satisfy this
relation?

Definition 6-1. Given a point A, a nonzero vector N, and a real number
λ with 0 < λ < 1, a right circular cone is

{X = (x, y, z) | |(X − A) • N| = λ|X − A| · |N|}.

The point A is called the vertex of the cone; the line X = A + tN is
called the axis of the cone; and the angle θ with 0 < θ < π/2 such
that cos θ = λ is called the half angle of the cone.
The two sets
{X | (X − A) • N = λ|X − A| · |N|},
{X | (X − A) • N = −λ|X − A| · |N|}

are called the nappes of the cone.

If X0 is any point on the cone other than the vertex, then the line
X = A + t(X0 − A) is called a generator of the cone.

Although the set defined here is correctly known as a right circular cone,
we shall just call it a cone for the time being. Since only the direction of N
is needed to determine the axis of the cone, we can set B = N/|N| and use
this vector of unit length to determine the axis. That is, if A = (a1, a2, a3)
and B = [b1, b2, b3], with |B| = 1, the above relation can be written as
|b1(x − a1) + b2(y − a2) + b3(z − a3)| = λ[(x − a1)² + (y − a2)² + (z − a3)²]^(1/2).
To eliminate the absolute value signs, we can square this equation to
obtain
[b1(x − a1) + b2(y − a2) + b3(z − a3)]² = λ²[(x − a1)² + (y − a2)² + (z − a3)²]. (6-1)
This equation is satisfied by the coordinates of each point X on the cone,
and if the coordinates of a point satisfy this equation, then that point is
on the cone. Therefore, if we specify that b1² + b2² + b3² = 1, then (6-1)
is the general form of the cartesian equation of a cone.
In order to study the intersection of a cone with a plane we merely have
to choose a plane and find the points common to this plane and the cone.
The most convenient plane to use is of course the xy-coordinate plane
(z = 0). If we set z = 0 in the equation of the cone, we are left with an
equation containing only x and y. This will then be the cartesian equation,
in the xy-plane, of the conic section.
Setting z = 0 in the equation of the cone given above, expanding and
rearranging terms gives an equation of the form
Ax² + Bxy + Cy² + Dx + Ey + F = 0. (6-2)
We can conclude that every conic section in the xy-plane satisfies a general
quadratic equation.
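
The expansion can be carried out once and for all. Setting z = 0 in (6-1) and writing c = −(b1a1 + b2a2 + b3a3), the left side is (b1x + b2y + c)², and collecting terms gives the six coefficients computed in the Python sketch below. The coefficient formulas are worked out here from (6-1) rather than quoted from the text, and the function names and the sample cone are illustrative.

```python
import math

def conic_coefficients(a, b, lam):
    """Coefficients (A, B, C, D, E, F) of the section of cone (6-1) by z = 0.

    a   -- vertex (a1, a2, a3)
    b   -- unit vector (b1, b2, b3) along the axis
    lam -- cosine of the half angle
    """
    a1, a2, a3 = a
    b1, b2, b3 = b
    c = -(b1 * a1 + b2 * a2 + b3 * a3)   # constant term of the linear form at z = 0
    lam2 = lam * lam
    return (b1 * b1 - lam2,
            2 * b1 * b2,
            b2 * b2 - lam2,
            2 * b1 * c + 2 * lam2 * a1,
            2 * b2 * c + 2 * lam2 * a2,
            c * c - lam2 * (a1 * a1 + a2 * a2 + a3 * a3))

def evaluate(coeffs, x, y):
    """Value of Ax^2 + Bxy + Cy^2 + Dx + Ey + F at the point (x, y)."""
    A, B, C, D, E, F = coeffs
    return A * x * x + B * x * y + C * y * y + D * x + E * y + F

# Sample cone: vertex (0, 0, -1), axis along the z-axis, half angle 45 degrees.
# Its section by z = 0 should be the unit circle.
coeffs = conic_coefficients((0, 0, -1), (0, 0, 1), math.cos(math.pi / 4))
print(coeffs)
print(evaluate(coeffs, 1.0, 0.0), evaluate(coeffs, 0.6, 0.8))
```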
To make a more careful study of the conic sections we need to simplify
the equations somewhat. This can be done by proper adjustment of the
parameters determining the cone. Thus in the position A = (a1, a2, a3)
of the vertex of the cone, we can change a1 and a2 in any way we may
find useful. Such changes merely translate the conic section on the plane.
Likewise the direction of the axis B = [b1, b2, b3] can be altered as desired
by changing b1 and b2 (keeping |B| = 1). The effect on the conic section
will be a rotation in the xy-plane. Since b3 is the cosine of the angle between
B and the z-axis, fixing it determines the angle between B and the xy-plane.
(See Problem 13, at the end of Section 4-4.)

Let us set a2 = 0 and b2 = 0. This corresponds to assuming that the
vertex of the cone is in the xz-plane and that the axis of the cone is in this
plane also. Later, we will adjust a1 as needed to give the greatest
simplification. Note that when b2 = 0, b1 is the cosine of the angle between B
and the xy-plane.
Substituting these values in the cartesian equation of the cone and
setting z = 0, we find, after some algebraic manipulation, the equation of
the conic to be

(λ² − b1²)x² − 2[a1(λ² − b1²) − b1b3a3]x + λ²y²
    = 2b1b3a1a3 − a1²(λ² − b1²) − a3²(λ² − b3²). (6-3)

We will consider various cases of this equation as b1 is varied while
holding fixed λ, the cosine of the half angle of the cone. Changing b1
corresponds to “tipping” the cone. To fix our picture we will assume that
b1 ≥ 0, b3 > 0, and a3 < 0.
The first case we consider is that when b1 = 0.
Then, since b1² + b3² = 1 and b3 > 0, b3 = 1.
For this case, we set a1 = 0. Then Eq. (6-3)
becomes
λ²x² + λ²y² = a3²(1 − λ²),
or
x² + y² = a3²(1 − λ²)/λ². (6-4)
This is recognized as the equation of a circle
(note that λ² < 1), which is as it should be,
since with b1 = 0, the cone has its axis orthogonal
to the xy-plane (Fig. 6-2).
Let φ be the angle between the axis of the cone and the xy-plane. Then
cos φ = b1. Since the axis of the cone is in the xz-plane, φ is also the
angle between the axis of the cone and the x-axis. In Fig. 6-3 we show the
cross section of the cone in the xz-plane for several possible cases. Letting
θ be the half angle of the cone, it is clear that the “bottom” generator of
the cone makes an angle of φ − θ with the x-axis.
Therefore, if θ < φ or equivalently, if 0 < b1 < λ (since the cosine
decreases as the angle increases), all generators of the cone are pointing
“upward” and the xy-plane cuts through a single nappe of the cone. The
resulting curve is called an ellipse (see Fig. 6-3a).
In this case, we have 0 < λ² − b1² < λ². Set λ² − b1² = μ². To simplify
Eq. (6-3) as much as possible we set a1 = b1b3a3/μ². That is, we
move the vertex of the cone so as to obtain the greatest simplification.

The result is shown in Fig. 6-4. Equation (6-3) then reduces to

μ²x² + λ²y² = k². (6-5)

Here, the right-hand side of (6-3) reduces to a constant which we have
written as a positive constant k². To see that this is possible we merely
need to observe that the cone must cut the xy-plane somewhere, hence
there exist points (x, y) which satisfy (6-3). But the left-hand side of
(6-3) reduces to the left-hand side of (6-5), which is positive for any
(x, y). Hence the right-hand side must be a positive constant.
The next case to be considered is that for which b1 = λ. Here, the half
angle of the cone is the same as the angle between its axis and the x-axis.
Hence, exactly one of the generators of the cone will be parallel to the
x-axis (Fig. 6-3b). The intersection will be a single curve which never
closes. This conic section is called a parabola. Inserting this value into
Eq. (6-3), we find that

2b1b3a3x + λ²y² = 2b1b3a1a3 − a3²(λ² − b3²).

Figure 6-4 Figure 6-5

In this equation b1b3a3 ≠ 0 except for certain limiting cases. We may
adjust a1 so that the right-hand side of this equation is zero. Then, dividing
through by λ², we find that the equation of the parabola has the form

y² = kx, (6-6)

where k is some nonzero constant (Fig. 6-5).


The final case to be considered is that when 0 < λ < b1. In Fig. 6-3(c)
we observe that under this condition the plane z = 0 will cut both nappes
of the cone. The conic section thus consists of two parts. The resulting
curve is called a hyperbola.
The coefficient of x² in (6-3) is negative, so we set λ² − b1² = −μ².
Then we can set a1 = −b1b3a3/μ² to eliminate the coefficient of the x
term. The left-hand side of (6-3) then becomes −μ²x² + λ²y². From
the geometric position of the cone, it is easily seen that there must be two
distinct points on the x-axis which are on the hyperbola. Since at least
one of these is not zero, we see that the constant on the right-hand side of
(6-3) must be negative. That is, the hyperbola must satisfy an equation
of the form
−μ²x² + λ²y² = −k². (6-7)

This case is illustrated in Fig. 6-6.

Figure 6-6


In the above discussion b1 is the cosine of the angle between the axis
of the cone and the x-axis and at the same time the cosine of the angle
between the axis of the cone and the xy-plane. Therefore, each of the conic
sections considered above satisfies the following formal definition.

Definition 6-2. A nondegenerate conic section is the intersection of a
right circular cone with a plane which does not pass through the vertex
of the cone. Given such a cone and plane, let θ be the half angle of the
cone and let φ be the angle between the axis of the cone and the plane.
Then the resulting conic section is called:
(1) a circle if and only if φ = π/2,
(2) an ellipse if and only if π/2 > φ > θ,
(3) a parabola if and only if φ = θ,
(4) a hyperbola if and only if θ > φ.
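
The four cases of this definition amount to a comparison of the two angles, which the following Python sketch makes explicit. The function name `classify_conic` and the tolerance `tol` (needed because equality of floating-point angles is rarely exact) are illustrative assumptions; angles are in radians.

```python
import math

def classify_conic(theta, phi, tol=1e-12):
    """Type of nondegenerate conic section.

    theta -- half angle of the cone, 0 < theta < pi/2
    phi   -- angle between the axis of the cone and the plane, 0 <= phi <= pi/2
    """
    if not 0 < theta < math.pi / 2:
        raise ValueError("theta must lie strictly between 0 and pi/2")
    if not 0 <= phi <= math.pi / 2:
        raise ValueError("phi must lie between 0 and pi/2")
    if abs(phi - math.pi / 2) <= tol:
        return "circle"
    if abs(phi - theta) <= tol:
        return "parabola"
    return "ellipse" if phi > theta else "hyperbola"

print(classify_conic(math.pi / 4, math.pi / 2))
print(classify_conic(math.pi / 6, math.pi / 3))
print(classify_conic(math.pi / 4, math.pi / 4))
print(classify_conic(math.pi / 3, math.pi / 6))
```

In terms of the cosines used earlier in the section, the same comparison reads b1 = 0, 0 < b1 < λ, b1 = λ, and b1 > λ, since the cosine decreases as the angle increases.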

The calculations of this section can then be summarized in the following
theorem:

Theorem 6-1. The intersection of a right circular cone with the xy-plane
is the locus of an equation of the form

Ax² + Bxy + Cy² + Dx + Ey + F = 0.

Let λ be the cosine of the half angle of the cone and let b1 be the cosine
of the angle between the axis of the cone and the plane. Set μ² =
|λ² − b1²|. Then for a suitable location of the cone with respect to the
x- and y-axes, an ellipse, a parabola, or a hyperbola will be the locus
of the equation
μ²x² + λ²y² = k²,
y² = kx,
or
−μ²x² + λ²y² = −k²,

respectively, where k is a nonzero constant whose value depends on the
particular conic section in question.

We remark that the size and shape of the conic section is determined
completely by three quantities: the half angle of the cone, the angle between
the axis of the cone and the plane, and the distance between the vertex
of the cone and the plane. The phrase, “suitable location of the cone,”
found in this theorem should be interpreted as allowing the cone to be
moved in any way which does not change any of these three quantities.

PROBLEMS

1. Write the cartesian equations of each of the following cones:


(a) The cone with vertex at (0, 1, 1), axis parallel to the x-axis, and half
angle 60°
(b) The cone with vertex at (0, 0, 2), axis parallel to B = [2, 0, 1], and half
angle 45°
(c) The cone with vertex at (0, 0, 1), axis parallel to B = [1, 0, 1], and half
angle 45°
2. Find and simplify as much as possible the cartesian equation of the conic
sections obtained as the intersections of the cones of Problem 1 with the
xy-plane. Identify each conic section as to type.
3. Prove algebraically that the right-hand side of (6-3) is positive when
0 < λ² − b1² < λ² and when we put a1 = b1b3a3/(λ² − b1²).
4. Let A be the vertex of a right circular cone and let X0 be any point other than
A on the cone. Prove that every point of the line X = A + t(X0 − A) is
on the cone.
5. The intersection of a cone and a plane which passes through the vertex of the
cone is a degenerate conic section.
(a) What is the general form of the equation for the intersection of a right
circular cone whose vertex is at (0, 0, 0) with the plane z = 0?
(b) Give an example of a cone in (a) in which the conic section consists of the
single point (0, 0, 0).
(c) Give an example of a cone for which the conic section is the single line
x = 0.
(d) Give an example of a cone for which the conic section is the pair of lines
x = 0 and y = 0.

6-2 EQUIVALENT DEFINITIONS

There are several other possible ways to define the conic sections. In
this section we wish to investigate these alternatives.
It has been known for a long time that if we are given an ellipse, then
there are two points in the plane such that the sum of the distances from
these points to every point of the ellipse is a
constant. This condition can be written in the
form:
|X − F1| + |X − F2| = 2a,

where F1 and F2 are the fixed points, a is a
constant (2a is used here because it leads to
certain simplifications later), and X is any point
of the ellipse. In this form, the two fixed points
are called the foci of the ellipse.
A simple proof of this property was discovered in the early nineteenth
century by the Belgian mathematician Dandelin. The Dandelin proof is
illustrated in Fig. 6-7. We imagine a cone and an intersecting plane
forming an ellipse. Every point on the central axis of the cone is
equidistant from the generators of the cone (see Problem 1 at the
end of this section), and hence each point of the central axis is the center
of a sphere tangent to the sides of the cone. Exactly two of these spheres
would also be tangent to the plane which cuts off the ellipse as shown in
Fig. 6-7. The points of tangency, $F_1$ and $F_2$, are the foci of the ellipse.
Let P be any point of the ellipse. There is a unique generator of the cone
through the point P. Let this generator be tangent to the two spheres at
$C_1$ and $C_2$. Then the distance between $C_1$ and $C_2$ is a fixed constant, say 2a,
independent of which point P has been chosen. Now consider the line
segments $F_1P$ and $F_2P$. The two segments $F_1P$ and $C_1P$ are both tangent
to the lower sphere from the point P and hence have the same length,
$|F_1P| = |C_1P|$. Similarly, $|F_2P| = |C_2P|$, and hence

$$|F_1P| + |F_2P| = |C_1P| + |C_2P| = |C_1C_2| = 2a.$$

We have therefore proved

Theorem 6-2. For any ellipse, there exist a constant a and two points,
$F_1$ and $F_2$, which are in the plane of the ellipse, such that if X is any
point on the ellipse, then

$$|XF_1| + |XF_2| = 2a. \tag{6-8}$$

The points $F_1$ and $F_2$ are called the foci of the ellipse.

In an exactly similar manner a focal relation for the hyperbola can be
obtained. Two points $F_1$ and $F_2$, called the foci of the hyperbola, can be
found such that for every point X on the hyperbola, the absolute value of
the difference of the distances from X to $F_1$ and from X to $F_2$ will be a
constant. Verification of this will be left to the reader, with the observation
that in the case of the hyperbola the two spheres are in different nappes
of the cone.

Theorem 6-3. For any hyperbola, there exist a constant a and two
points, $F_1$ and $F_2$, which are in the plane of the hyperbola, such that if
X is any point on the hyperbola, then

$$\bigl|\,|XF_1| - |XF_2|\,\bigr| = 2a. \tag{6-9}$$

The points $F_1$ and $F_2$ are called the foci of the hyperbola.

Let us look at another characterization of the conic sections which can
be obtained from the introduction of the sphere of the Dandelin proof.
This sphere, which is tangent to both the cone and the intersecting plane,
is called the Dandelin sphere. The circle consisting of those points at which
the Dandelin sphere is tangent to the cone lies in a plane which is orthogonal
to the axis of the cone. (See Problem 1 at the end of this section.)
In Fig. 6-8 we have indicated this plane together with the cutting plane
which determines the conic section. Note that these two planes can be
determined for any of the configurations discussed in section 6-1. That is,
we are not restricting ourselves to a particular one of the conics (except
that the circle does not fall under our discussion here). In Fig. 6-9, where
we have redrawn the important features of the geometric configuration
which we wish to consider, F is the focus of the conic, that is, the point at
which the sphere is tangent to the cutting plane. Just as in the discussion
above, we see that if P is an arbitrary point on the conic, then |FP| = |CP|,
where the line CP is along a generator of the cone and C is the point
at which this generator is tangent to the sphere.

Figure 6-8 Figure 6-9



Let DP be the line segment parallel to the axis of the cone such that D
is on the plane through the circle of tangency. Thus DP is orthogonal to
this plane. Then if $\theta$ is the half angle of the cone, the angle between CP
and DP is $\theta$, and hence

$$|DP| = |CP| \cos\theta = |FP| \cos\theta. \tag{6-10}$$

Let L be the line of intersection of these two planes. Then L is orthogonal
to the axis of the cone. Let the plane through P orthogonal to L cut this
line at E. Then DP is in this plane (since DP is parallel to the axis of the
cone and hence is orthogonal to L) and the triangle DEP is a right triangle.
Let $\alpha$ be the angle between the two planes and let $\phi = \pi/2 - \alpha$. This
angle $\phi$ is then exactly the angle between the axis of the cone and the
plane that was discussed in the last section. Observe that the angle $\phi$
is the vertex angle of the triangle DPE at P, and therefore

$$|DP| = |PE| \cos\phi. \tag{6-11}$$

Combining this equation with (6-10), we find

$$|FP| = |PE| \, \frac{\cos\phi}{\cos\theta},$$

or, setting

$$e = \frac{\cos\phi}{\cos\theta}, \tag{6-12}$$

we have

$$|FP| = |PE|\,e.$$

In this relation, the quantity e is known as the eccentricity of the conic
and the line of intersection of the planes is called the directrix of the conic.
When the cutting plane is orthogonal to the axis of the cone and thus
yields a circle, this formula is not strictly applicable since the directrix
does not exist. However, in this case we define e = 0. If the cutting plane
is allowed to tip, e continuously increases while an ellipse is produced.
When the plane becomes parallel to a generator of the cone, $\cos\phi = \cos\theta$,
e = 1, and we have a parabola. For a hyperbola e > 1. The maximum
value of e for a given cone is $1/\cos\theta$, but this can be made as large as
desired by widening the cone and letting $\theta$ get close to $\pi/2$ (it can
never become $\pi/2$).
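
The behavior of e described above can be illustrated numerically. The following Python sketch is not part of the original text; the function names `eccentricity` and `classify_conic` and the tolerance `tol` are assumptions chosen for illustration. It evaluates relation (6-12) for a given half angle $\theta$ of the cone and a given angle $\phi$ between the axis and the cutting plane, and names the resulting conic.

```python
import math


def eccentricity(theta: float, phi: float) -> float:
    """Return e = cos(phi) / cos(theta), relation (6-12).

    theta: half angle of the cone, 0 < theta < pi/2.
    phi:   angle between the axis of the cone and the cutting plane,
           0 <= phi <= pi/2 (phi = pi/2 is the plane orthogonal to the axis).
    """
    if not 0.0 < theta < math.pi / 2:
        raise ValueError("theta must lie strictly between 0 and pi/2")
    if not 0.0 <= phi <= math.pi / 2:
        raise ValueError("phi must lie between 0 and pi/2")
    return math.cos(phi) / math.cos(theta)


def classify_conic(theta: float, phi: float, tol: float = 1e-12) -> str:
    """Name the conic from its eccentricity (tol absorbs rounding error)."""
    e = eccentricity(theta, phi)
    if abs(e) <= tol:
        return "circle"
    if abs(e - 1.0) <= tol:
        return "parabola"
    return "ellipse" if e < 1.0 else "hyperbola"
```

For a cone with half angle $\pi/4$, for instance, tipping the plane from $\phi = \pi/2$ down through $\phi = \pi/4$ to smaller values of $\phi$ moves the result from a circle through ellipses and a parabola to hyperbolas.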
In the case of the ellipse or hyperbola, there are two Dandelin spheres
and hence two foci. The above argument can be used at either focus. Thus
the ellipse and hyperbola have two directrices, one associated with each
focus. The parabola, on the other hand, will have a single directrix and
focus.

Collecting these results, we have the following theorem.

Theorem 6-4. Let F be a focus of a nondegenerate conic section which
is not a circle. Then there exists a line L, called a directrix of the conic,
and a constant e, called the eccentricity of the conic, such that for every
point X on the conic

$$|XF| = |XE|\,e, \tag{6-13}$$

where |XE| is the distance from X to the line L. The eccentricity e is
less than 1 for an ellipse, equal to 1 for a parabola, and greater than 1
for a hyperbola.

PROBLEMS

1. Let the vertex of a cone be at the origin. Let B be the unit vector defining
the axis of the cone, $\theta$ be the half angle of the cone and $\lambda = \cos\theta$. Let R =
rB be a point on the axis of the cone, and suppose that u is a unit vector along
a generator of the cone.
(a) Let S be the projection of R in the direction of u. Prove that
$S = \lambda r u$.
(b) Show that the distance from R to the generator is $|RS| = [1 - \lambda^2]^{1/2} r$,
and hence that R is the same distance from all generators.
(c) Let T be the projection of S in the direction of B. Show that
$T = \lambda^2 r B$.
(d) Prove that TS is orthogonal to B and hence that the points of tangency
of the sphere with center R and radius |RS| all lie on a plane orthogonal
to B.
2. Using the same cone of Problem 1, let M be the plane through the point Q
orthogonal to the unit vector v. Suppose that Q = kB where $k \neq 0$. Suppose
further that $v \cdot B = [1 - \lambda^2]^{1/2}$.
(a) Prove that the conic section which results is a parabola.
(b) Show that the distance from the point R of Problem 1 to the plane M
is $|k - r|[1 - \lambda^2]^{1/2}$. Prove that the center of the Dandelin sphere for
this parabola is at (k/2) B.
(c) At what point is the focus?
(d) Show that $|B \times v| = |B \times (B \times v)| = \lambda$.
(e) Prove that the directrix of the parabola is the line
$X = E_0 + t(B \times v)$,
where
B X (B X v)

3. Draw a diagram and give the proof of Theorem 6-3 similar to that given for
Theorem 6-2.
4. Let a parabola have a focus F and let L' be a line which cuts the parabola
and is parallel to the directrix. Let P be any point of the parabola which is
on the same side of L' as the directrix. Let PB be the line segment from
P to L' which is orthogonal to L'. Prove that |FP| + |PB| is a constant,
independent of P. Make a sketch.
5. Let a solid be made up, as shown in Fig. 6-10, from a cone cut by a plane
through its axis and a second plane parallel to a generator so that the line L'
of intersection of these two planes is orthogonal to the axis. Let A be the
vertex; let B be the point at which a generator meets the parabola; and let
BC be a line segment from B to L' orthogonal to L'. Prove that |AB| + |BC|
is a constant, independent of the generator chosen.

Figure 6-10

Remark: This shows that the geodesics (paths of the shortest length) from
the tip of the cone to the line L' are all of the same length. If this figure were
covered by a sheet of explosive, then by starting the explosion at the point A,
a linear explosive front is produced at L'. The reader can easily verify that
these paths are indeed geodesics.
6. The usual method used to draw an ellipse is to stick two tacks into a sheet
of paper and place a loop of string around these. A curve can then be drawn
by placing a pencil point into the loop and moving it about, always holding
the loop tight. (See Fig. 6-11.) Prove that the resulting curve satisfies the
condition given in Theorem 6-2.

7. Let a string be fastened at a point F on a drawing board and at the end Q
of a T-square (Fig. 6-12). Move the square along the board, holding a pencil
point on the edge of the square at a point P so that the string is tight from
F to P and from P to Q. As the square is moved, the point P traces out a
parabola. Prove that the resulting curve satisfies the condition of Theorem
6-4 with e = 1.

6-3 THE ELLIPSE

In the previous sections we have seen several properties of the ellipse.
In section 6-1 we saw that a properly located ellipse would satisfy an
equation of the form

$$\mu^2 x^2 + \lambda^2 y^2 = k^2. \tag{6-14}$$

Let us compare this fact with the result of Theorem 6-2, which tells us
that each point X of the ellipse satisfies the relation

$$|XF_1| + |XF_2| = 2a. \tag{6-15}$$

Suppose that the foci are on the x-axis, at equal distances on either side of
the origin. (Two given points can always be so located by a suitable
choice of the coordinate system.) Let $F_1$ be the point (-c, 0) and $F_2$ the
point (c, 0), where c > 0. Then relation (6-15) becomes

$$[(x + c)^2 + y^2]^{1/2} + [(x - c)^2 + y^2]^{1/2} = 2a.$$

Any point (x, y) which satisfies this equation must also satisfy

$$[(x + c)^2 + y^2]^{1/2} = 2a - [(x - c)^2 + y^2]^{1/2},$$

and, squaring, must also satisfy

$$x^2 + 2cx + c^2 + y^2 = 4a^2 - 4a[(x - c)^2 + y^2]^{1/2} + x^2 - 2cx + c^2 + y^2,$$

or

$$4a[(x - c)^2 + y^2]^{1/2} = 4a^2 - 4cx.$$

Squaring again after dividing by 4, we eliminate the last radical, giving

$$a^2[x^2 - 2cx + c^2 + y^2] = a^4 - 2a^2cx + c^2x^2,$$

or

$$(a^2 - c^2)x^2 + a^2y^2 = a^4 - a^2c^2 = a^2(a^2 - c^2).$$

Dividing through by the right-hand member, we obtain the equation

$$\frac{x^2}{a^2} + \frac{y^2}{a^2 - c^2} = 1. \tag{6-16}$$
Note that since the distance between the foci is 2c, 2a must be greater
than 2c or the ellipse will not exist. Hence $a^2 - c^2 > 0$. Let us set

$$a^2 - c^2 = b^2; \tag{6-17}$$

then we finally have the equation

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1. \tag{6-18}$$

We have thus shown that any point which satisfies Equation (6-15)
satisfies Equation (6-18).
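
The implication from (6-15) to (6-18) can be spot-checked numerically. The Python sketch below is an illustration only and is not taken from the text; the function names and the parametrization are assumptions. It constructs points X at distance r from $F_1 = (-c, 0)$ in the direction making angle s with the x-axis, with r chosen so that $|XF_1| + |XF_2| = 2a$ (solving for r gives $r = (a^2 - c^2)/(a - c\cos s)$), and then evaluates the left-hand side of (6-18) with $b^2 = a^2 - c^2$.

```python
import math


def point_with_focal_sum(a: float, c: float, s: float) -> tuple[float, float]:
    """Point X = F1 + r(cos s, sin s) with |XF1| + |XF2| = 2a.

    F1 = (-c, 0), F2 = (c, 0), and 0 < c < a is assumed.
    """
    if not 0.0 < c < a:
        raise ValueError("need 0 < c < a")
    r = (a * a - c * c) / (a - c * math.cos(s))
    return (-c + r * math.cos(s), r * math.sin(s))


def focal_sum(x: float, y: float, c: float) -> float:
    """Sum of the distances from (x, y) to the foci (-c, 0) and (c, 0)."""
    return math.hypot(x + c, y) + math.hypot(x - c, y)


def equation_6_18_lhs(x: float, y: float, a: float, c: float) -> float:
    """Left-hand side of (6-18), using b^2 = a^2 - c^2 from (6-17)."""
    b_squared = a * a - c * c
    return x * x / (a * a) + y * y / b_squared
```

With a = 5 and c = 3, the direction $s = \pi/2$ gives the point (-3, 3.2), whose distances to the foci are 3.2 and 6.8, and for which the left-hand side of (6-18) is 9/25 + 10.24/16 = 1.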
Before continuing with the discussion of (6-18), let us look at the second
representation of the ellipse as given in Theorem 6-4. This says that there
are a point F, a line L, and a number e, with 0 < e < 1, such that every
point X on the ellipse satisfies the equation

$$|XF| = |XE|\,e, \tag{6-19}$$

where |XE| is the distance from X to the line L.


Let L' be the line through F orthogonal to L, and suppose that X is a
point on the line L'. Then, for some scalar t, $X = F + t\,\overrightarrow{FE}$, where E is
the point on L closest to F (and also to X). That is, E is the point of
intersection of L and L'. Hence

$$|XF| = |t| \cdot |FE|,$$

and

$$|XE| = |E - X| = |E - F - t\,\overrightarrow{FE}| = |\overrightarrow{FE} - t\,\overrightarrow{FE}| = |1 - t| \cdot |FE|.$$

Therefore,

$$\frac{|XF|}{|XE|} = \frac{|t|}{|1 - t|}.$$

To satisfy requirement (6-19) we must have $t/(1 - t) = \pm e$. Since
0 < e < 1, there are two possible values of t which will satisfy this, namely
t = e/(1 + e) and t = -e/(1 - e). Therefore, there are two points, A
and A', on the line L' which satisfy the requirement (6-19).
One of these points, which we denote by A, is between F and L (t is
between 0 and 1). The other, A', is so located that F is between A' and
L. The reader will find it instructive to observe how the ratio |XF|/|XE|
behaves as the point X moves along the line L'.
We are free to locate the focus and directrix as we wish so as to simplify
our computations (but, of course, maintaining the same distance between
them). Let us place them so that the line L' coincides with the x-axis.
That is, so that F is on the x-axis and the directrix L is orthogonal to the
x-axis. The two points A and A' found above will also be on the x-axis,
and we will suppose that these points are so located that they are at equal
distances on either side of the origin. Set

A = (a, 0), A' = (-a, 0), a > 0,
F = (c, 0),
L = {(x, y) | x = d}.

The points A and A' satisfy relation (6-19). That is, |AF| = |AE|e
and |A'F| = |A'E|e. Since 0 < a < d, these are equivalent to the
equations

$$a - c = (d - a)e,$$
$$a + c = (d + a)e.$$

Adding these equations gives

$$2a = 2de,$$

and subtracting them gives

$$2c = 2ae.$$

Therefore, we may assume that F = (ae, 0) and that the directrix L is
the line x = a/e, where a is some positive constant.

Figure 6-13

Let X = (x, y) be an arbitrary point which satisfies the desired relation
(6-19) (see Fig. 6-13). Then, we must have

$$[(x - ae)^2 + y^2]^{1/2} = e\left|\frac{a}{e} - x\right|.$$

Squaring this gives

$$x^2 - 2aex + a^2e^2 + y^2 = a^2 - 2aex + e^2x^2,$$

or

$$(1 - e^2)x^2 + y^2 = a^2(1 - e^2).$$

This equation may be divided through by the right-hand member to give

$$\frac{x^2}{a^2} + \frac{y^2}{a^2(1 - e^2)} = 1. \tag{6-20}$$

This equation is identical to Eq. (6-18) if we set

$$b^2 = a^2(1 - e^2). \tag{6-21}$$
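
The passage from the focus-directrix relation (6-19) to Eq. (6-20) can also be checked on sample points. The following Python sketch is illustrative and rests on stated assumptions: the function names are hypothetical, and the points are generated by a polar parametrization about the focus F = (ae, 0), with the distance $r = a(1 - e^2)/(1 + e\cos s)$ obtained by solving $r = e(a/e - ae - r\cos s)$ for r.

```python
import math


def point_on_focus_directrix_locus(a: float, e: float, s: float) -> tuple[float, float]:
    """Point X = F + r(cos s, sin s) with |XF| = e * (distance from X to L).

    F = (ae, 0) is the focus and L is the directrix x = a/e, with 0 < e < 1.
    """
    if a <= 0.0 or not 0.0 < e < 1.0:
        raise ValueError("need a > 0 and 0 < e < 1")
    r = a * (1.0 - e * e) / (1.0 + e * math.cos(s))
    return (a * e + r * math.cos(s), r * math.sin(s))


def focus_directrix_residual(x: float, y: float, a: float, e: float) -> float:
    """|XF| - e|XE|; this is zero when (6-19) holds."""
    return math.hypot(x - a * e, y) - e * abs(a / e - x)


def equation_6_20_lhs(x: float, y: float, a: float, e: float) -> float:
    """Left-hand side of (6-20)."""
    return x * x / (a * a) + y * y / (a * a * (1.0 - e * e))
```

For a = 5 and e = 0.6 the focus is (3, 0) and $b^2 = a^2(1 - e^2) = 16$, so these points should also satisfy (6-18) with b = 4.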

Let us now investigate the rather delicate question of exactly what
implications have been proved. An ellipse is defined as the point set common
to a cone and a plane which cuts through a single nappe of the cone.
In Theorem 6-1 we showed that if the cone and plane were suitably located,
that is, if a coordinate system is suitably chosen, then a point is on the
ellipse if and only if it satisfies an equation of the form (6-14). In Section
6-2, we showed that if a point is on the ellipse, it satisfies an equation of
the form (6-15) and one of the form (6-19). Here we have shown that if a
point satisfies an equation of the form (6-15), then it satisfies one of the
form (6-18) and that if a point satisfies (6-19), then it satisfies an equation
of the form (6-20). These facts may be summarized in the following diagram.
Ellipse <==> (6-14)
Ellipse ==> (6-15) ==> (6-18)
Ellipse ==> (6-19) ==> (6-20)
(6-18) <- - -> (6-20)
(6-20) ==> Ellipse

The solid arrows in this diagram indicate the implications we have proved.
The dashed double arrow joining (6-18) and (6-20) is meant to indicate
that a point which satisfies an equation of either form also satisfies the
other. This is clearly true. We merely use (6-21) to go from the one
form to the other.
The final implication indicated in this diagram, the arrow from (6-20)
to the word “ellipse” is meant to indicate that if a point satisfies an
equation of the form (6-20), then it lies on an ellipse with the given
eccentricity e. To show this we merely have to set

A =

$$B = \left[\frac{e}{[1 + e^2]^{1/2}},\ 0,\ \frac{1}{[1 + e^2]^{1/2}}\right], \tag{6-22}$$

$$\lambda = \frac{1}{[1 + e^2]^{1/2}}$$

in (6-3) to obtain the equation of a conic section in the xy-plane which is
identical to (6-20). The student is asked to verify this as an exercise at the
end of this section.

With this result, we see that in the above diagram all the implications
hold, and hence that starting at any point in the diagram and following
the arrows we can arrive at any other point. In other words, all of these
characterizations of the ellipse are equivalent. In particular, we have proved
the next theorem.

Theorem 6-5. The locus of any equation of the form

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$

where a > b > 0, is an ellipse.*

Let us now see what can be determined about the points of an ellipse
from the equation given in this theorem. We see that if |x| > a, this
equation cannot be satisfied, so that all points of the ellipse must have
x-coordinates between -a and +a. Similarly the y-coordinates of all points
of the ellipse must be between -b and +b.
When x = -a, only y = 0 can satisfy the equation. When -a < x < a,
there will be exactly two values of y satisfying the equation, and hence
exactly two points of the ellipse with this value for their x-coordinate.
In particular, when x = 0, y = +b and y = -b are the two values for y.
When x = a, there is again only the single value y = 0 which can
satisfy the equation.
The ellipse also has a number of symmetry properties. To discuss these
properties we first need a definition.

Definition 6-3. Two points $P_1$ and $P_2$ are symmetric with respect to a
line L if and only if the line L is orthogonal to, and bisects, the line
segment $P_1P_2$. A set S of points is symmetric with respect to a line L
if and only if for each point P in S, there is also a point P' in S such
that P and P' are symmetric with respect to the line L.

This definition corresponds to our usual notion of symmetry. It can
easily be converted to an analytic condition in special cases. For example,

Theorem 6-6. If S is the set of all points (x, y) in the plane which
satisfy a functional relationship f(x, y) = 0, and if for every x and y,
f(x, -y) = f(x, y), then S is symmetric with respect to the x-axis.
Similarly, if for every x and y, f(-x, y) = f(x, y), then S is symmetric
with respect to the y-axis.

* Compare this with Theorem 6-4.



It should be noted that this is not an "if and only if" theorem. The
condition given here gives us what is called a sufficient condition. If it is
satisfied then the set is symmetric. It is not, however, a necessary condition.
The locus of f(x, y) = 0 can be symmetric with respect to one of the axes
without the corresponding condition in this theorem being satisfied. The
proof of this theorem is left as an exercise.
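
A concrete instance may help to show why the condition is sufficient but not necessary. The Python sketch below uses a hypothetical function, chosen here for illustration and not taken from the text: $f(x, y) = (x^2 + y^2 - 1)e^y$. Since $e^y$ is never zero, the locus of f(x, y) = 0 is the unit circle, which is symmetric with respect to the x-axis, and yet f(x, -y) and f(x, y) differ at points off the circle.

```python
import math


def f(x: float, y: float) -> float:
    """Hypothetical example: zero exactly on the unit circle x^2 + y^2 = 1."""
    return (x * x + y * y - 1.0) * math.exp(y)


def on_locus(x: float, y: float, tol: float = 1e-9) -> bool:
    """True when (x, y) satisfies f(x, y) = 0 up to rounding error."""
    return abs(f(x, y)) <= tol


def reflect_in_x_axis(point: tuple[float, float]) -> tuple[float, float]:
    """The point symmetric to `point` with respect to the x-axis."""
    x, y = point
    return (x, -y)
```

Every point (cos s, sin s) of the locus has its reflection (cos s, -sin s) on the locus as well, while, for example, f(0.5, 2) and f(0.5, -2) are unequal, so the hypothesis of Theorem 6-6 fails for this f.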
Now, the equation of the ellipse satisfies the requirements of the above
theorem [letting $f(x, y) = x^2/a^2 + y^2/b^2 - 1$]; hence we can conclude
that the ellipse is symmetric with respect to both the x- and y-axes. This
is a most remarkable result. It is "obvious" to most people that if a
plane intersects a cone at an angle, the figure which results must be
egg-shaped, fatter at the end which is at the wider part of the cone. Indeed,
when Albrecht Dürer, one of the great inventors of descriptive geometry,
discovered an accurate method for the construction of an ellipse in the
early sixteenth century, he allowed his "knowledge" to affect the
construction so as to obtain an egg-shaped figure. An illustration from one
of Dürer's books showing this error can be found on page 614 of Volume 1
of The World of Mathematics, edited by James R. Newman.
Let us now collect the information we have found about the ellipse.
In Fig. 6-14 we show a diagram of the ellipse together with the other
relationships which have been developed. The dashed rectangle is centered
at the origin and is made up of the lines $x = \pm a$ and $y = \pm b$. The
ellipse itself lies within this rectangle, touching the four sides of the
rectangle at the points where the axes cross the sides. The origin is called
the center of the ellipse (later we will discuss the case of an ellipse whose
center is located at a point other than the origin).

Note that a > b (since $b^2 = a^2 - c^2$), hence the rectangle is longer
in the x-direction than it is in the y-direction. The x-axis (which is the
line through the two foci) is called the principal axis of the ellipse. The
y-axis, which is the line through the center orthogonal to the principal
axis, we call the conjugate axis of the ellipse. In traditional usage, these
axes are called the major and minor axes respectively. These terms will
not be used here, but the reader should be aware of them. Strictly speaking,
both of these axes should be called principal axes to conform with modern
usage in physics and applied mathematics, but our present terminology
will suffice.
The two points at which the principal axis cuts the ellipse [the points
(-a, 0) and (a, 0) in Fig. 6-14] are called the principal vertices (or just the
vertices) of the ellipse. The points where the conjugate axis cuts the ellipse
[the points (0, -b) and (0, b) in this case] are called the conjugate vertices of
the ellipse. When we refer to a vertex, without specifying the type, we
mean a principal vertex.
The quantities a and b are respectively called the principal and conjugate
dimensions of the ellipse. In traditional usage, these are called
the semimajor axis and semiminor axis.
Let us collect these terms into a formal definition.
Let us collect these terms into a formal definition.

Definition 6-4. Let $F_1$ and $F_2$ be the foci of an ellipse E. Then we define
the following quantities in the ellipse.
(1) The center, C, is the midpoint of the line segment $F_1F_2$.
(2) The principal axis is the line L through $F_1$ and $F_2$.
(3) The conjugate axis is the line L' through C orthogonal to L.
(4) The principal vertices are the points $A_1$ and $A_2$ of intersection of E
and the principal axis L.
(5) The conjugate vertices are the points $B_1$ and $B_2$ of intersection of E
with the conjugate axis L'.
(6) The principal dimension is $a = |CA_1| = |CA_2|$.
(7) The conjugate dimension is $b = |CB_1| = |CB_2|$.
(8) The focal dimension is $c = |CF_1| = |CF_2|$.

The relationships between the quantities listed in this definition and the
other properties of the ellipse are given by the following theorem.

Theorem 6-7. Let a, b, and c be the principal, conjugate, and focal
dimensions of an ellipse with eccentricity e. Let d be the distance from
the center of the ellipse to either directrix. Then the five quantities
a, b, c, d, and e are related by the three equations

$$a^2 = b^2 + c^2, \qquad c = ae, \qquad d = a/e. \tag{6-23}$$

With one exception, if any two of these five quantities are specified,
the remaining three are uniquely determined by these relationships.
These three equations can easily be remembered by keeping Fig. 6-14
in mind. In this figure, the center, the focus, and the point (0, b) form the
vertices of a right triangle whose sides are of length b and c. The hypotenuse
of this triangle is, by symmetry, exactly half of the sum of the distances
from the foci to the point (0, b), and hence must be a. Thus the Pythagorean
relation for this triangle gives the first of the three equations.
To remember the other two relations, it is only necessary to remember
that c and d are ae and a/e, and that e < 1. This fact together with a
recollection of the figure will tell you which is which.
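
As an illustration of how relations (6-23) determine the remaining quantities, here is a short Python sketch. It is not from the text, and the function names are assumptions. The first function starts from a and e. The second starts from b and d, the pair which Problem 2 of this section suggests is the exceptional one: eliminating a from (6-23) gives $b = d\,e\,[1 - e^2]^{1/2}$, so that $e^2$ satisfies the quadratic $e^4 - e^2 + (b/d)^2 = 0$, which may have two admissible roots.

```python
import math


def ellipse_from_a_e(a: float, e: float) -> dict:
    """Solve (6-23) for b, c, d when a > 0 and 0 < e < 1 are given."""
    if a <= 0.0 or not 0.0 < e < 1.0:
        raise ValueError("need a > 0 and 0 < e < 1")
    c = a * e                      # c = ae
    d = a / e                      # d = a/e
    b = math.sqrt(a * a - c * c)   # a^2 = b^2 + c^2
    return {"a": a, "b": b, "c": c, "d": d, "e": e}


def eccentricities_from_b_d(b: float, d: float) -> list[float]:
    """All e in (0, 1) compatible with given b and d (zero, one, or two values).

    Uses e^2 = (1 +/- sqrt(1 - 4(b/d)^2)) / 2, the roots of the quadratic
    in e^2 stated in the lead-in.
    """
    if b <= 0.0 or d <= 0.0:
        raise ValueError("need b > 0 and d > 0")
    disc = 1.0 - 4.0 * (b / d) ** 2
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    return sorted({math.sqrt((1.0 - root) / 2.0), math.sqrt((1.0 + root) / 2.0)})
```

For example, a = 5 and e = 0.6 give c = 3, b = 4, and d = 25/3, whereas b = 2.4 and d = 5 are compatible with both e = 0.6 (a = 3) and e = 0.8 (a = 4).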
The above discussion could be repeated step for step interchanging the
roles of x and y. The result would be an ellipse with equation

$$\frac{x^2}{b^2} + \frac{y^2}{a^2} = 1,$$

where as before a > b. The resulting ellipse would have its principal axis
coinciding with the y-axis, and would appear as in Fig. 6-15.
The same relations between the quantities a, b, c, d, and e hold for this
case as well.
Students often seem to have difficulty in recalling which dimension is a
and which is b when faced with an actual equation. The following procedure
is therefore recommended. When an equation of the form

$$\frac{x^2}{p^2} + \frac{y^2}{q^2} = 1$$

is given, set x = 0 and note that $y = \pm q$. The two points (0, q) and
(0, -q) may then be marked on a coordinate plane. Similarly, setting
y = 0 we find the points (p, 0) and (-p, 0) on the ellipse. Mark these
points on the plane, and draw the rectangle determined by these four
points (as in the figures). The ellipse may then be sketched within the
rectangle and the various other quantities determined with a being the
larger of p and q.
For example, if we are given the equation

$$\frac{x^2}{4} + \frac{y^2}{9} = 1,$$

we immediately recognize it as the equation of an ellipse of the type shown
in Fig. 6-15. In this ellipse, the principal dimension is a = 3 and the
conjugate dimension is b = 2. The focal dimension is then
$c = \sqrt{a^2 - b^2} = \sqrt{5}$. The eccentricity is $e = c/a = \sqrt{5}/3$, and hence
$d = a/e = 9/\sqrt{5}$. We therefore have for this ellipse the following properties.
center: (0, 0)
eccentricity: $e = \sqrt{5}/3$
principal axis: x = 0
conjugate axis: y = 0
foci: $(0, \sqrt{5})$, $(0, -\sqrt{5})$
principal vertices: (0, 3), (0, -3)
conjugate vertices: (2, 0), (-2, 0)
directrices: $y = 9\sqrt{5}/5$, $y = -9\sqrt{5}/5$

Figure 6-15
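
The recommended procedure can be written out as a short computation. The Python sketch below is only an illustration under stated assumptions: the function name `ellipse_properties` and the layout of the returned dictionary are hypothetical. Given p and q in $x^2/p^2 + y^2/q^2 = 1$, it takes a to be the larger and b the smaller of the two, and places the foci and vertices on the corresponding axes.

```python
import math


def ellipse_properties(p: float, q: float) -> dict:
    """Quantities of the ellipse x^2/p^2 + y^2/q^2 = 1 (p, q > 0, p != q)."""
    if p <= 0 or q <= 0 or p == q:
        raise ValueError("need p > 0, q > 0, and p != q")
    a, b = max(p, q), min(p, q)    # a is the larger of p and q
    c = math.sqrt(a * a - b * b)   # a^2 = b^2 + c^2
    e = c / a                      # c = ae
    d = a / e                      # d = a/e
    on_y_axis = q > p              # principal axis is the y-axis when q > p

    def along(t):
        """Point at signed distance t from the center on the principal axis."""
        return (0.0, t) if on_y_axis else (t, 0.0)

    def across(t):
        """Point at signed distance t from the center on the conjugate axis."""
        return (t, 0.0) if on_y_axis else (0.0, t)

    return {
        "a": a, "b": b, "c": c, "d": d, "e": e,
        "principal_axis": "y-axis" if on_y_axis else "x-axis",
        "foci": [along(c), along(-c)],
        "principal_vertices": [along(a), along(-a)],
        "conjugate_vertices": [across(b), across(-b)],
    }
```

Calling `ellipse_properties(2, 3)` corresponds to the worked example: the principal axis is the y-axis, and the directrices are the lines y = d and y = -d with $d = 9/\sqrt{5}$.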
Going in the other direction, we may wish to find the equation of an
ellipse satisfying certain given conditions. For example, what is the
equation of the ellipse with foci at (-3, 0), (3, 0), and having a vertex at
(5, 0)? Note that this must be a principal vertex, since it lies on the line
through the foci.
Here, a = 5, c = 3, and hence $b^2 = a^2 - c^2 = 25 - 9 = 16$.
Therefore, b = 4 and the equation is

$$\frac{x^2}{25} + \frac{y^2}{16} = 1.$$

(How do we know that 25 divides the $x^2$ rather than the $y^2$?)

PROBLEMS

1. Using the given relations between a, b, c, d, and e for the ellipse, find formulas
for the following:
(a) b, c, and d in terms of a and e only
(b) c, d, and e in terms of a and b only
(c) b, d, and e in terms of a and c only
(d) b, c, and e in terms of a and d only
(e) a, d, and e in terms of b and c only
(f) a, c, and d in terms of b and e only
(g) a, b, and d in terms of c and e only
(h) a, b, and e in terms of c and d only
(i) a, b, and c in terms of d and e only
2. (a) From the relations given in Theorem 6-7, find an equation which involves
b, d, and e only.
(b) Using the result of part (a), show that in any ellipse $b \le d/2$.
(c) Prove that when b < d/2, there are two different ellipses with the same
values for b and d.
(d) If b = d/2, what is the eccentricity of the ellipse? What are a and c?

3. Show that exactly the same equation for an ellipse results if the
focus-directrix form is assumed with the focus at (-ae, 0) and the line x = -a/e
as the directrix.
4. For each of the following equations, make a sketch of the ellipse. Give the
coordinates of the foci, and the principal and conjugate vertices. Give the
equations of both directrices. Show these on the sketch. Give the eccentricity.

(b)á + É = 1

(o) ii+16 ■ 1
2 2
(f) 3? + 4«/2 = 1
2
(g) 6z2 + 15//2 = 60 (h) y + 20j/2 = 5.

5. Give the equation of the ellipse with center at the origin, x-axis as the
principal axis, and satisfying the conditions given:
(a) One (principal) vertex is at (3, 0) and one focus is at (1, 0)
(b) One vertex is at (3, 0) and a directrix is x = 4
(c) One vertex is at (3, 0) and the eccentricity is
(d) One focus is at (4, 0) and the eccentricity is f
(e) One focus is at (4, 0) and a directrix is x = 10
(f) One focus is at (4, 0) and the conjugate dimension is 3
(g) A directrix is x = 5 and the eccentricity is f
6. Sketch and give the quantities asked for in Problem 4 for each of the ellipses
found in Problem 5.
7. Prove that the cone defined by the quantities A, B, and $\lambda$ given by (6-22)
intersects the plane z = 0 in the ellipse whose equation, as given by (6-3),
reduces to (6-20).
8. Prove Theorem 6-6.

6-4 THE HYPERBOLA

We will now derive the standard form for the equation of the hyperbola
from the focal property given in Theorem 6-3:

$$\bigl|\,|XF_1| - |XF_2|\,\bigr| = 2a. \tag{6-24}$$

In this representation, we will let $F_1 = (c, 0)$, $F_2 = (-c, 0)$, and
X = (x, y). Then this equation can be written as

$$\bigl|[(x - c)^2 + y^2]^{1/2} - [(x + c)^2 + y^2]^{1/2}\bigr| = 2a. \tag{6-25}$$


Any pair (x, y) which satisfies this equation will satisfy one of the pair
of equations

$$[(x - c)^2 + y^2]^{1/2} - [(x + c)^2 + y^2]^{1/2} = \pm 2a,$$

or equivalently,

$$[(x - c)^2 + y^2]^{1/2} = [(x + c)^2 + y^2]^{1/2} \pm 2a.$$

Squaring gives

$$x^2 - 2cx + c^2 + y^2 = x^2 + 2cx + c^2 + y^2 + 4a^2 \pm 4a[(x + c)^2 + y^2]^{1/2},$$

or equivalently,

$$\mp 4a[(x + c)^2 + y^2]^{1/2} = 4cx + 4a^2.$$

Dividing by 4 and squaring again gives

$$a^2x^2 + 2a^2cx + a^2c^2 + a^2y^2 = c^2x^2 + 2a^2cx + a^4.$$

This can be simplified to give

$$(c^2 - a^2)x^2 - a^2y^2 = a^2c^2 - a^4 = a^2(c^2 - a^2).$$

Dividing through by the right-hand member finally gives

$$\frac{x^2}{a^2} - \frac{y^2}{c^2 - a^2} = 1. \tag{6-26}$$

If we set

$$c^2 - a^2 = b^2, \tag{6-27}$$

this can then be written as

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1. \tag{6-28}$$
Every point (x, y) which satisfies (6-24) must therefore satisfy (6-28).
However, we should check whether or not the definition of $b^2$ in (6-27)
is permissible. The distance between the two foci is 2c. If X is an arbitrary
point on the hyperbola, let $l$ be the smaller of the two distances $|XF_1|$ and
$|XF_2|$, and m the larger. Then from the triangle inequality (Fig. 6-16), we
have $m < l + 2c$. Therefore,

$$2a = m - l < 2c,$$

or a < c, which shows that the left-hand side of (6-27) is a positive quantity.
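
The derivation of (6-28) from the focal property can be spot-checked in the same spirit as for the ellipse. The following Python sketch is an illustration under stated assumptions and is not from the text; the function names are hypothetical. It generates points on the branch nearer $F_1 = (c, 0)$ as $X = F_1 + r(\cos s, \sin s)$, with $r = (c^2 - a^2)/(a - c\cos s)$ obtained by solving $|XF_2| = r + 2a$ for r; this requires $\cos s < a/c$.

```python
import math


def point_with_focal_difference(a: float, c: float, s: float) -> tuple[float, float]:
    """Point X = F1 + r(cos s, sin s) with |XF2| - |XF1| = 2a.

    F1 = (c, 0), F2 = (-c, 0), with 0 < a < c and cos s < a/c assumed.
    """
    if not 0.0 < a < c:
        raise ValueError("need 0 < a < c")
    denom = a - c * math.cos(s)
    if denom <= 0.0:
        raise ValueError("need cos s < a/c for a point on this branch")
    r = (c * c - a * a) / denom
    return (c + r * math.cos(s), r * math.sin(s))


def focal_difference(x: float, y: float, c: float) -> float:
    """Absolute difference of the distances from (x, y) to (c, 0) and (-c, 0)."""
    return abs(math.hypot(x - c, y) - math.hypot(x + c, y))


def equation_6_28_lhs(x: float, y: float, a: float, c: float) -> float:
    """Left-hand side of (6-28), using b^2 = c^2 - a^2 from (6-27)."""
    b_squared = c * c - a * a
    return x * x / (a * a) - y * y / b_squared
```

With a = 3 and c = 5, the direction $s = \pi/2$ gives the point (5, 16/3), whose distances to the foci are 16/3 and 34/3, and for which the left-hand side of (6-28) is 25/9 - 16/9 = 1.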

Next we will derive the equation of the hyperbola from the focus-directrix
property of Theorem 6-4. We assume that we are given a point F,
a line L, and a real number e > 1.
Just as in the last section, let L' be the line through F, orthogonal to L,
and let E be the intersection of L and L'. Then, as before, we find that if
X is any point of L', then $X = F + t\,\overrightarrow{FE}$, and

$$\frac{|XF|}{|XE|} = \frac{|t|}{|1 - t|}.$$

To satisfy the property of Theorem 6-4, we must have

$$|XF| = |XE|\,e. \tag{6-29}$$

Again, (6-29) will hold when t = e/(1 + e) or t = e/(e - 1). These
two values of t will give us two points, $A_1$ and $A_2$, on L' which satisfy
(6-29). These points are on opposite sides of the directrix and one is
between the directrix and the focus.
We may now assume that the focus and directrix have been so located
on the plane that L' coincides with the x-axis, and $A_1$ and $A_2$ are at equal
distances on either side of the origin. Let $A_1 = (a, 0)$, $A_2 = (-a, 0)$,
F = (c, 0), where a > 0 and c > 0, and suppose the directrix has the
equation x = d. From the locations of $A_1$ and $A_2$ in relation to the
focus and the directrix, we have

$$-a < d < a < c.$$

Since the points $A_1$ and $A_2$ must satisfy (6-29), we find the two equations

$$c - a = (a - d)e,$$
$$c + a = (a + d)e.$$

Adding and subtracting these equations give

$$2c = 2ae$$

and

$$2a = 2de.$$

Therefore c = ae and d = a/e, just as in the case of an ellipse (except
now e > 1).
Let X = (x, y) be an arbitrary point on the hyperbola. Then (6-29) must
be satisfied. In this case (see Fig. 6-17), (6-29) is equivalent to

$$[(x - ae)^2 + y^2]^{1/2} = e\left|x - \frac{a}{e}\right| = |ex - a|.$$

Squaring this relation gives the equivalent equation

$$(x - ae)^2 + y^2 = (ex - a)^2,$$

or

$$x^2 - 2aex + a^2e^2 + y^2 = e^2x^2 - 2aex + a^2.$$

This is again equivalent to

$$(e^2 - 1)x^2 - y^2 = a^2(e^2 - 1),$$

or

$$\frac{x^2}{a^2} - \frac{y^2}{a^2(e^2 - 1)} = 1, \tag{6-30}$$

which is in the same form as (6-28) if we set

$$b^2 = a^2(e^2 - 1) \tag{6-31}$$

(noting that e > 1 and hence $e^2 - 1 > 0$).
We thus have found the equation

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$$

for the hyperbola with focus at (c, 0) and directrix x = d, where c = ae
and d = a/e. Connecting these quantities are the three relations

$$a^2 + b^2 = c^2, \qquad c = ae, \qquad d = a/e. \tag{6-32}$$
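
Relations (6-32) can be exercised in a few lines. The Python sketch below is illustrative only, with hypothetical function names. It computes c, e, and d from a and b, and then measures how far a point $X = (a\cosh u,\ b\sinh u)$ of the right-hand branch of $x^2/a^2 - y^2/b^2 = 1$ is from satisfying the focus-directrix relation (6-29); this standard parametrization is an assumption introduced here and does not appear in the text.

```python
import math


def hyperbola_from_a_b(a: float, b: float) -> dict:
    """Solve relations (6-32) for c, e, and d, given a > 0 and b > 0."""
    if a <= 0.0 or b <= 0.0:
        raise ValueError("need a > 0 and b > 0")
    c = math.hypot(a, b)   # a^2 + b^2 = c^2
    e = c / a              # c = ae
    d = a / e              # d = a/e
    return {"a": a, "b": b, "c": c, "d": d, "e": e}


def focus_directrix_gap(a: float, b: float, u: float) -> float:
    """|XF| - e|XE| for X = (a cosh u, b sinh u); zero when (6-29) holds.

    F = (c, 0) is the focus and the directrix is the line x = d.
    """
    h = hyperbola_from_a_b(a, b)
    x, y = a * math.cosh(u), b * math.sinh(u)
    return math.hypot(x - h["c"], y) - h["e"] * abs(x - h["d"])
```

For a = 3 and b = 4 this gives c = 5, e = 5/3, and d = 9/5, in agreement with the ordering -a < d < a < c.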

The problem of showing that the forms (6-26) and (6-30) are equivalent
to the definition will be left as an exercise.
We therefore have

Theorem 6-8. Any hyperbola in the xy-plane can be so located that it
is the locus of an equation of the form

$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$

and the locus of any such equation is a hyperbola. For a hyperbola with
this equation, relations (6-32) connect the quantities a, b, c, d, and e,
where e is the eccentricity and c and d are the distances from the origin
to the focus and directrix respectively.

From Eq. (6-28) we see, just as in the case of the ellipse, that the
hyperbola is symmetric with respect to both the x-axis and to the 2/-axis.
The origin, the point at which these two axes of symmetry intersect, is
called the center of the hyperbola.
In Fig. 6-18 we sketch the hyperbola determined by an equation of
the form (6-28). The four quantities a, b, c, and d are indicated in this
sketch. The dashed lines outline a rectangle centered at the origin with
sides of length 2a and 2b just as in the case of an ellipse. The hyperbola,
however, lies outside of this rectangle.
The line passing through the two foci, in this case the x-axis, we call the
principal axis of the hyperbola. The two points at which the hyperbola
crosses this line [the points (a, 0), (−a, 0) in the figure] are called the
vertices of the hyperbola. The line through the center, orthogonal to the
principal axis (in this case the y-axis), is called the conjugate axis of the
hyperbola. Note that there are no conjugate vertices for the hyperbola.
The quantities a and b are called the principal and conjugate dimensions
of the hyperbola respectively.
The two lines shown in Fig. 6-18 forming the diagonals of the dashed
rectangle have a special relation to the hyperbola. In this figure each
branch of the hyperbola is shown to lie completely within one of the
angles formed by these lines and the points of the hyperbola are shown to
be close to the points of these lines as |x| and |y| become large. The proof
of Theorem 6-9 will show that this is actually true. These lines are called
the asymptotes of the hyperbola.

Definition 6-5. Let $F_1$ and $F_2$ be the foci of a hyperbola H. Then we define the following terms.
(1) The center C is the midpoint of the line segment $F_1F_2$.
(2) The principal axis is the line L through $F_1$ and $F_2$.
(3) The conjugate axis is the line L' through C, orthogonal to L.
(4) The vertices are the points $A_1$ and $A_2$ of intersection of H and L.
(5) The principal dimension is

\[ a = |CA_1| = |CA_2|. \]

(6) The focal dimension is

\[ c = |CF_1| = |CF_2|. \]

(7) The conjugate dimension is b, where

\[ b^2 = c^2 - a^2. \]

(8) The asymptotes are the two lines through the center which are the
diagonals of the rectangle whose sides are parallel to L and L',
whose center is C, and whose dimensions are 2a and 2b, the sides of
length 2b passing through the vertices.

Figure 6-18

Theorem 6-9. Each branch of the hyperbola lies completely within one
of the angles formed by the asymptotes. Points of the hyperbola which
are far from the center are arbitrarily close to the asymptotes.

Proof: The asymptotes of the hyperbola (6-28) have equations

\[ \frac{x}{a} \pm \frac{y}{b} = 0 \]

(why?). To prove our assertions, it suffices to show them to be true for
the one line x/a − y/b = 0 and for the points of the hyperbola both of
whose coordinates x and y are greater than zero. By symmetry, the
results will then hold true for all points of the hyperbola.
What must be shown is that if $(x_1, y_1)$ is any point of the hyperbola with
both $x_1 > 0$ and $y_1 > 0$, and if $(x_1, y_2)$ is a point on the line x/a − y/b = 0,
then $y_2 > y_1$, and also that as $x_1$ becomes large, the distance between
the point $(x_1, y_1)$ and the line becomes small.
Both of these remarks can be proved from the fact that if $(x_1, y_1)$ is on
the hyperbola, then

\[ 1 = \frac{x_1^2}{a^2} - \frac{y_1^2}{b^2} = \left( \frac{x_1}{a} - \frac{y_1}{b} \right) \left( \frac{x_1}{a} + \frac{y_1}{b} \right). \tag{6-34} \]

If $(x_1, y_2)$ is on the line x/a − y/b = 0, then

\[ \frac{x_1}{a} - \frac{y_2}{b} = 0. \]

Hence if $y_1 \ge y_2$, then we would have
\[ \frac{x_1}{a} - \frac{y_1}{b} \le \frac{x_1}{a} - \frac{y_2}{b} = 0, \]
and the product on the right-hand side of (6-34) would be nonpositive (we
are supposing $x_1 > 0$ and $y_1 > 0$, remember). This is a contradiction.
Therefore, $y_1 < y_2$ as we wished to show.

To see the second assertion, we recall that the distance from a point
$(x_1, y_1)$ to the line x/a − y/b = 0 is given by

\[ \delta = \frac{ab}{[a^2 + b^2]^{1/2}} \left( \frac{x_1}{a} - \frac{y_1}{b} \right) \]

(why?). But from (6-34) we have

\[ \delta = \frac{ab}{[a^2 + b^2]^{1/2}} \left( \frac{x_1}{a} - \frac{y_1}{b} \right) = \frac{ab}{[a^2 + b^2]^{1/2}} \cdot \frac{1}{\dfrac{x_1}{a} + \dfrac{y_1}{b}} < \frac{ab}{[a^2 + b^2]^{1/2}} \cdot \frac{1}{x_1/a} = \frac{a^2 b}{x_1 [a^2 + b^2]^{1/2}}. \]

Here we have used the assumption that $y_1 > 0$. It is clear that as we
allow $x_1$ to increase, the quantity $\delta$ becomes very small. In fact, we can
make it as small as we desire by choosing $x_1$ large enough.
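
The estimate in the proof can be illustrated numerically. The following is only a sketch: the function name and the sample values a = 3, b = 4 are our own choices for illustration, not part of the proof. It computes the distance from a first-quadrant point of the hyperbola to the asymptote x/a − y/b = 0 and compares it with the bound $a^2 b / (x_1 [a^2 + b^2]^{1/2})$.

```python
import math

def distance_to_asymptote(a, b, x1):
    """Return (delta, bound) for the first-quadrant hyperbola point with abscissa x1.

    delta is the distance from (x1, y1) to the line x/a - y/b = 0,
    bound is the quantity a^2 b / (x1 [a^2 + b^2]^(1/2)) from the proof.
    """
    if x1 <= a:
        raise ValueError("x1 must exceed a so that y1 > 0")
    y1 = b * math.sqrt(x1 * x1 / (a * a) - 1.0)  # (x1, y1) lies on x^2/a^2 - y^2/b^2 = 1
    norm = math.hypot(a, b)                      # [a^2 + b^2]^(1/2)
    delta = a * b / norm * (x1 / a - y1 / b)     # distance formula used in the proof
    bound = a * a * b / (x1 * norm)
    return delta, bound

# Hypothetical sample values: a = 3, b = 4, and three abscissas.
for x1 in (5.0, 50.0, 500.0):
    delta, bound = distance_to_asymptote(3.0, 4.0, x1)
    print(f"x1 = {x1:6.1f}   delta = {delta:.6f}   bound = {bound:.6f}")
```

In each row the distance stays below the bound, and both shrink as $x_1$ grows, which is what the theorem asserts.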

If the above development is repeated, interchanging the roles of x and
y, we would obtain an equation of the form

\[ \frac{y^2}{a^2} - \frac{x^2}{b^2} = 1. \tag{6-35} \]

The hyperbola determined by this equation is of the type shown in
Fig. 6-19. The eccentricity and distances from the center to the focus and
directrix are determined by the same formulas, (6-32), for this hyperbola.
The principal axis in this case is the y-axis and the conjugate axis is the
x-axis.

Figure 6-19

Students sometimes find it difficult to remember which type of
hyperbola goes with which of the two types of equations. Rather than
attempting to memorize this information, it is probably easier to proceed
as follows when occasion arises. Suppose an equation is given, say
$x^2/3^2 - y^2/4^2 = 1$. We know that the hyperbola crosses its principal
axis at the two vertices, and has no points in common with its conjugate
axis. Setting x = 0 in the given equation, we see that there are no values
of y which can satisfy the equation, and hence x = 0 must be the conjugate
axis for this hyperbola. The vertices can be found by setting y = 0
and solving for x. The vertices and the foci are on the principal axis, which
is the x-axis (y = 0) in this case. The hyperbola is thus of the type shown
in Fig. 6-18.
For this hyperbola then, a = 3 and b = 4. Therefore, $c = [a^2 + b^2]^{1/2} = 5$,
e = c/a = 5/3, and d = a/e = 9/5. We can thus list for this
hyperbola the following properties.
center: (0, 0)
eccentricity: e = 5/3
principal axis: y = 0
conjugate axis: x = 0
vertices: (3, 0), (−3, 0)
foci: (5, 0), (−5, 0)
directrices: x = 9/5, x = −9/5
asymptotes: x/3 − y/4 = 0, x/3 + y/4 = 0
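
The list above follows mechanically from relations (6-32), so it can be reproduced by a short computation. This is a minimal sketch under our own assumptions: the function name and the dictionary layout are hypothetical, and only the standard form $x^2/a^2 - y^2/b^2 = 1$ is handled.

```python
import math

def hyperbola_properties(a, b):
    """Quantities for x^2/a^2 - y^2/b^2 = 1, computed from relations (6-32)."""
    c = math.hypot(a, b)   # a^2 + b^2 = c^2
    e = c / a              # c = a e
    d = a / e              # d = a / e
    return {
        "eccentricity": e,
        "vertices": [(a, 0.0), (-a, 0.0)],
        "foci": [(c, 0.0), (-c, 0.0)],
        "directrices": [d, -d],               # the lines x = d and x = -d
        "asymptote_slopes": [b / a, -b / a],  # the lines y = (b/a)x and y = -(b/a)x
    }

# The worked example: a = 3, b = 4.
props = hyperbola_properties(3.0, 4.0)
for name, value in props.items():
    print(name, value)
```

For the form (6-35) the same numbers apply with the roles of x and y interchanged.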

As another illustrative example, let us find the equation of the hyperbola
whose center is at the origin, which has one focus at (0, 10), and which
has the line 2x − y = 0 as one of its asymptotes. Here the hyperbola
must be one whose equation is of the form (6-35). An asymptote to the
hyperbola of this form is x/b − y/a = 0. This has the slope a/b, while
the given asymptote has slope 2. Hence we must have

\[ a^2 + b^2 = 100, \qquad a/b = 2. \]

From these equations we find a = 2b and hence

\[ 5b^2 = 100, \qquad b^2 = 20, \qquad a^2 = 80. \]

Thus the desired equation is
\[ \frac{y^2}{80} - \frac{x^2}{20} = 1. \]
Note well that unlike the case of the ellipse, a need not be larger than
b. Any pair of positive values are possible for the principal and conjugate
dimensions of a hyperbola.

PROBLEMS

1. Show that the same equation for the hyperbola results if the focus is assumed to
be (−ae, 0) and the directrix x = −a/e.
2. For the hyperbola defined by the given equation, give the principal and
conjugate dimensions, the coordinates of the vertices and the foci, the equations
of the directrices and asymptotes, and the eccentricity. Make a sketch
for each, showing all relevant items.

(f) $x^2 - 4y^2 = -1$

3. Find the equation of the hyperbola with center at the origin satisfying the
given conditions.
(a) One vertex at (5, 0) and eccentricity 2
(b) One vertex at (0, 8) and one focus at (0, 18)
(c) One focus at (4, 0) and eccentricity 5/2
(d) One focus at (6, 0) and one directrix being x = 2
(e) One vertex at (0, 8) and one asymptote being 2x — y = 0
(f) One directrix being x = 4 and one asymptote being x + 4y = 0
4. Suppose H is the hyperbola determined by the equation

What is the equation of the hyperbola H' whose principal axis and principal
dimension are the conjugate axis and conjugate dimension of H and whose
conjugate axis and conjugate dimension are the principal axis and principal
dimension of H? What is the relationship between the asymptotes of H
and H'? These two are called conjugate hyperbolas.
5. During the second world war sound ranging units were used to locate enemy
artillery. These units operated as follows. Three sound detectors were
placed on a straight line which was nearly orthogonal to the direction of the
gun to be located. When the sound of the gun was picked up, the time
difference between the arrival of the sound at the right-hand and center
detectors and the time difference between the arrival of the sound at the
left-hand and center detectors were determined. From a table, each time
difference was used to determine an angle. On a map lines were drawn with
the determined angles through the points half way between the corresponding
detectors (see Fig. 6-20). The enemy gun was then located as being at the
point at which these lines intersected.

Figure 6-20

In practice, the sound ranging units observed that when the enemy gun
was relatively close, the position located in this manner was usually slightly
beyond the actual position.
Explain why the gun is located on the intersection of two hyperbolas.
For a given pair of detectors and a given time difference at these detectors,
how could you determine the actual hyperbola? The lines read from the
tables were the asymptotes of these hyperbolas. Why were these lines used?
Explain the observed discrepancy.*
6. Prove that a point satisfies an equation of the form (6-26) if and only if
it satisfies one of the form (6-30). Given a and e > 1, show that the equation
of the intersection of the xy-plane with the cone having vertex
$A = (0, 0, -a\sqrt{e^2 - 1})$, axis in the direction B = [1, 0, 0], and $\lambda = 1/e$ is
(6-30). Draw a diagram similar to that in the last section to show that the
focal properties are equivalent to the definition of the hyperbola.
7. Prove that the following geometric construction locates the focus and the
directrix of a hyperbola. Let O be the origin, C a corner of the rectangle whose
sides are 2a and 2b and which is centered at the origin as shown in Fig. 6-21,
and A the point at which one side of the rectangle crosses the x-axis. (A
is thus a vertex of the hyperbola.) With center at O draw a circle with
radius |OC|. This circle cuts the x-axis at F, a focus of the hyperbola. Also
with center at O, draw a circle with radius |OA|. This circle cuts the asymptote
(the line through O and C) at a point D which is on the directrix.
8. Find a geometric construction similar to that in Problem 7 to locate the
directrix of an ellipse.
9. To what extent do any two of the five quantities a, b, c, d, and e for a given
hyperbola determine the other three? Prepare a table for all possible
determinations.
10. Let $\theta$ be the angle between the principal axis and one of the asymptotes of
a hyperbola whose eccentricity is e. Prove that $e = \sec\theta$.

* A friend of the author was in such a unit during the second world war,
and, despite the fact that he became a mathematician, it was not until many
years later that he realized that his activities had anything to do with the focal
properties of the hyperbola.

6-5 THE PARABOLA

While we had two different conditions which could be used to determine
an ellipse or a hyperbola, Theorem 6-4 gives us only one condition which
can be used to derive the general equation of a parabola. This condition
says that every point of the parabola is equally distant from a given
point, the focus, and a given line, the directrix.
To derive the equation of a parabola, let us start by assuming that the
focus F is at the point (p, 0), and that the directrix is the line x = −p.
Then if P is a point of the parabola with coordinates (x, y), we must have
\[ |PE| = |PF|, \tag{6-36} \]
where |PE| is the distance from P to the line, and hence
\[ |x + p| = [(x - p)^2 + y^2]^{1/2}. \]
This equation can be squared to give the equivalent equation
\[ x^2 + 2px + p^2 = x^2 - 2px + p^2 + y^2, \]
which simplifies to
\[ y^2 = 4px. \tag{6-37} \]
This is then the equation of the desired parabola. The set of points which
satisfy this equation has the general appearance shown in Fig. 6-22. This
figure is based on the case p > 0. The x-axis, which is the line passing
through the focus orthogonal to the directrix, is the axis of the parabola.
The parabola has no second axis so that we do not have to speak of this
as the principal axis, although that is actually what it is.
The point at which the axis cuts the parabola, the origin in this case, is
called the vertex of the parabola. The eccentricity is, of course, 1 as is
necessary to fit in with the focus-directrix form of the definition of the
ellipse and hyperbola.

Figure 6-22 Figure 6-23



Definition 6-6. Let F be the focus and L' the directrix of a parabola P.
Then
(1) the axis of the parabola is the line L through F orthogonal to L';
(2) the vertex of the parabola is the point of intersection of L with P.

A fairly good sketch of the parabola can be made, as shown in Fig. 6-23,
by sketching in two rectangles. The smaller rectangle is made up of the
lines x = 0, x = p, y = 2p, and y = — 2p. The larger rectangle, which
is actually two squares, is made up of the lines x = 0, x = 4p, y = 4p,
and y = −4p. The parabola then can be sketched in as shown. Essentially,
we use here the fact that if p is the distance from the vertex to the
focus, the parabola is of "height" 2p at the focus and of "height" 4p at the
distance 4p from the vertex.
We observe from Eq. (6-37) that the parabola is symmetric with
respect to the x-axis, that is with respect to its axis. This is the only axis
of symmetry for the parabola.
When p < 0 in Eq. (6-37), the parabola will "open out” to the left
as in Fig. 6-24 (b). The axis is still the x-axis, however.

If we interchange the roles of x and y, we get an equation of the form

x2 = 4py. (6-38)

The appearance of the parabola in this case is as shown in Fig. 6-24 (c)
and (d).
For any given equation it is easy to determine which type of figure is
involved. If x, for example, is the squared variable, as in (6-38), and
if the point (x, y) is on the parabola, then so is (−x, y). This means
that the y-axis must be the axis of symmetry. Since $x^2$ is never negative,
4py cannot be negative either. Hence every point (x, y) on the parabola
other than the vertex must be such that y is of the same sign as p. In this way, any given equation
of this type can be analyzed. Conversely, the type of equation required
for a given parabola is easily determined.

Figure 6-24. (a) $y^2 = 4px$, p > 0; (b) $y^2 = 4px$, p < 0; (c) $x^2 = 4py$, p > 0; (d) $x^2 = 4py$, p < 0.

For example,
\[ x^2 + 8y = 0 \]
is the equation of a parabola. Rewriting this as

\[ x^2 = 4(-2)y, \]

we recognize it as a parabola of the type shown in Fig. 6-24(d). Its
focus is at (0, −2) and its directrix is the line y = 2.
Similarly, the equation of the parabola with vertex at (0, 0) and focus
(−3, 0) must be
\[ y^2 = 4(-3)x = -12x. \]
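
The first example can be checked against the defining condition (6-36). The sketch below uses two helper functions whose names are our own invention; it reads the focus and directrix from the form $x^2 = 4py$ and tests that sample points of $x^2 + 8y = 0$ are equally distant from both.

```python
import math

def parabola_x2_equals_4py(p):
    """Focus and directrix of x^2 = 4py; the directrix is the line y = -p."""
    return (0.0, p), -p

def is_equidistant(point, focus, directrix_y, tol=1e-9):
    """Check condition (6-36): distance to the directrix equals distance to the focus."""
    x, y = point
    dist_to_focus = math.hypot(x - focus[0], y - focus[1])
    dist_to_line = abs(y - directrix_y)
    return abs(dist_to_focus - dist_to_line) <= tol

# x^2 + 8y = 0 is x^2 = 4(-2)y, so p = -2.
focus, directrix_y = parabola_x2_equals_4py(-2.0)
print("focus:", focus, " directrix: y =", directrix_y)
for x in (-4.0, 0.0, 1.0, 6.0):
    y = -x * x / 8.0  # a point on the parabola
    print((x, y), is_equidistant((x, y), focus, directrix_y))
```

A point off the curve, such as (1, 1), fails the same test, which gives a quick way to catch a sign error in p.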

PROBLEMS

1. For each of the following parabolas, give the coordinates of the focus and the
equations of the axis and the directrix. Sketch the parabola.
(a) $x^2 = 16y$ (b) $y^2 = -6x$
(c) $x^2 + 2y = 0$ (d) $x - 12y^2 = 0$
(e) $4x^2 + 5y = 0$ (f) $2x + 15y^2 = 0$
(g) $4x^2 = y$ (h) $3x^2 - 8y = 0$
2. Find the equation of each of the parabolas with vertex at (0, 0) and satisfying
the following conditions.

(a) focus at (2, 0) (b) focus at (—3, 0)


(c) focus at (I, 0) (d) focus at (0, —2)
(e) focus at (0, —J) (f) focus at (0, 5)
(g) directrix x = —| (h) directrix x = 7
(i) directrix y = f (j) directrix y — —f
3. Show that the cone with

\[ \lambda = 1/\sqrt{2}, \qquad B = [1/\sqrt{2}, 0, 1/\sqrt{2}], \qquad \text{and} \qquad A = (0, 0, -2p) \]

intersects the plane z = 0 in the parabola with equation

\[ y^2 = 4px. \]

4. (a) Using directly the definition of the parabola as given by (6-36), find the
equation of the parabola with focus at (4, 3) and directrix 4x + 3y +
25 = 0.
(b) Find the equation of the parabola with vertex at (0, 6) and focus at
(1, -D.
(c) Find the equation of the parabola with directrix x y = 0 and vertex
at (0, 2).

6-6 GENERAL QUADRATIC EQUATIONS WITHOUT CROSS PRODUCT TERMS

If we make a translation of all of the points of the plane through a
distance h in the x-direction and a distance k in the y-direction, then the
origin is translated to the point (h, k) and a point which has coordinates
(x', y') is translated to the point with coordinates (x' + h, y' + k).
Suppose that (x, y) is an arbitrary point of the translated plane, and that
(x', y') is the point which is translated into (x, y). Then x = x' + h and
y = y' + k, or
\[ x' = x - h, \qquad y' = y - k. \tag{6-39} \]

Now suppose we have a point set which we wish to translate. As a
specific example, suppose we are considering the points of an ellipse
(see Fig. 6-25). That is, we have a set of points (x', y') which satisfy the
equation
\[ \frac{x'^2}{a^2} + \frac{y'^2}{b^2} = 1. \]

After the translation the coordinates (x, y) of the points of the translated
ellipse must satisfy the equation

\[ \frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1. \tag{6-40} \]

In other words, (6-40) is the equation of the ellipse after translation.

Figure 6-25

It is sometimes useful to think of a translation in terms of the change
of position of the coordinate axes. Thus we may first think of a translation
as being a function which maps the points of the x'y'-plane to the points
of the xy-plane, which we think of as an entirely separate plane. The
resulting loci in their respective planes are identical except for their
location with respect to the coordinate axes. In Fig. 6-26 we have illustrated
this point of view. Note, however, that we can reconstruct the
left-hand side of this figure if we view the dashed lines on the right-hand
side as the x'- and y'-axes. This fact may be clarified, perhaps, by looking
at some specific examples.

Figure 6-26

The first type of problem we investigate is the problem of finding the
equation of a locus in a nonstandard position. For example, what is the
equation of the parabola with focus (1, 6) and vertex (−3, 6)? Plotting
these points, as in Fig. 6-27, we see that the axis of the parabola must be
the line y = 6, which is shown as a broken line. Adding the line x = −3,
and thinking of these two lines as the x'- and y'-axes, we see that the
desired parabola would have an equation of the form
\[ y'^2 = 4 \cdot 4x' \]
in the x'y'-plane. However, the x-, y- and x'-, y'-axes are related by the
equations
\[ x' = x + 3, \qquad y' = y - 6. \]
Recall that these equations are most easily determined by observing that
the point x' = 0, y' = 0 must be the same as x = −3, y = 6. Inserting
these values into the equation of the parabola, we find the desired equation
to be
\[ (y - 6)^2 = 16(x + 3). \]
Another type of problem which often occurs is that of identifying the
locus of a given equation. Again it is often easiest to simplify the equation
by a change of the variables such as given by Eqs. (6-39), and view the
new equation as the equation of the same locus in terms of a new coordinate
system. Thus, for example, if we are given an equation such as
(6-40), we can introduce new coordinates x' and y', satisfying (6-39) so
as to simplify the equation. From (6-39) we see that the new axes, x' = 0
and y' = 0, are the lines x = h and y = k. The quantities, points, and
lines associated with the conic in the new coordinate system can then be
identified and located in terms of the original coordinate system.
The ellipse whose equation is (6-40) has a focal distance $c = [a^2 - b^2]^{1/2}$,
an eccentricity e = c/a, and a directrix distance d = a/e, but its center is
at the point (h, k). Its principal axis is the x'-axis, or, in other words,
y = k. Its conjugate axis is the line x = h. Note that the center is the
point at which the left-hand side of (6-40) vanishes, and the axes are the
lines parallel to the coordinate axes through this center. The quantity c
is the distance between the center and the foci, and hence the foci of the
ellipse determined by (6-40) are the points (h + c, k) and (h — c, k).
Similarly, we see that the vertices are the points (h + a, k) and (h — a, k).
The directrices will be the lines x = h + d and x = h — d.

Figure 6-27

Exactly similar reasoning will apply to the translation of any of the
conic sections discussed in the previous sections. An equation such as

\[ \frac{(x - 3)^2}{4} - \frac{(y + 2)^2}{9} = 1 \]
is easily identified as the equation of a hyperbola with center at (3, −2).
Its asymptotes are the lines
\[ \frac{x - 3}{2} \pm \frac{y + 2}{3} = 0, \]
as may be discovered directly from the equation, thinking of the new
coordinate system as defined by Eqs. (6-39), or equivalently, found
with the help of a sketch as in Fig. 6-28. It is seen that one asymptote is
the line passing through (3, −2) and (5, 1) while the other asymptote
is the line through (3, −2) and (5, −5). The other quantities connected
with this hyperbola can be determined in a manner similar to that discussed
above. The methods are best learned by doing a few examples,
such as the problems at the end of this section.

Now, we wish to consider the general quadratic equation

\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0, \tag{6-41} \]

and to discuss the possible point sets which can satisfy such an equation.
In this section, we will restrict ourselves to a discussion of equations of the
type (6-41) in which B = 0, that is, to general quadratic equations without
cross product terms.
Suppose that in such an equation,

\[ Ax^2 + Cy^2 + Dx + Ey + F = 0, \tag{6-42} \]

both A and C are zero (strictly speaking, we do not then have a quadratic
equation). The equation is then seen to be the equation of a straight line.
A straight line is actually a degenerate case of a conic section (see Problem
2 at the end of this section), but we are more interested in the cases in
which (6-42) is a true quadratic equation, where A and C are not both
zero.
First, let us consider the case in which $A \neq 0$, and C = E = 0. Equation
(6-42) is then of the form

\[ Ax^2 + Dx + F = 0, \]

a simple quadratic equation in x. If this quadratic equation has no real roots,
then there are no points satisfying the equation. If it has a single root r,
then the line x = r is the set of all points satisfying the given equation.
If there are two distinct roots, then two parallel lines constitute the set of
points satisfying Eq. (6-42). This last locus is an especially degenerate
case since it cannot be obtained by the intersection of a cone and plane
in any way, but requires the cone to degenerate into a cylinder.
The discussion of the case when (6-42) reduces to a quadratic equation
in y alone is similar.
Now suppose $A \neq 0$, C = 0, and $E \neq 0$. Then, by completing the
square, the Dx term can be combined with the $Ax^2$, and the Ey term can
be combined with the resulting constant term to give an equation of the
form
\[ A(x - h)^2 + E(y - k) = 0. \]

This is easily recognized as the equation of a translated parabola.


A similar discussion shows that when A = 0, $C \neq 0$, and $D \neq 0$,
Eq. (6-42) again determines a parabola.
We are left then with only the cases in which $A \neq 0$ and $C \neq 0$. By
completing the square, Eq. (6-42) then can always be reduced to an
equation of the form

\[ A(x - h)^2 + C(y - k)^2 = m, \tag{6-43} \]

where m is some constant. Again, we may consider various cases.

First, if A = C in (6-43), then this is the equation of a circle when
$m \neq 0$ and is of the same sign as A and C. If m = 0, only the single
point (h, k) satisfies the equation, and if $m \neq 0$ but is of opposite sign to
A and C, then there are no points which satisfy (6-43).

Secondly, if $A \neq C$, but both are of the same sign, then (6-43) determines
a translated ellipse, a single point, or has no locus at all depending
on whether m is of the same sign as A and C, zero, or of opposite sign.

Finally, if A and C are of opposite signs, then (6-43) determines a
translated hyperbola when $m \neq 0$. If m = 0, however, the set of points
satisfying (6-43) is a pair of intersecting straight lines, which are the
asymptotes of the hyperbolas determined when $m \neq 0$.

Theorem 6-10. The locus of Eq. (6-42) is one of the following:

(1) the empty set


(2) a single point
(3) a single line, parallel to one of the coordinate axes
(4) two parallel lines, parallel to one of the coordinate axes
(5) two intersecting lines whose slopes are the negatives of one another
(6) a circle
(7) a nondegenerate conic section whose principal axis is parallel to
one of the coordinate axes.

This theorem has been proved, except for a few details, in the discussion
given above. These remaining details are left as exercises.
The methods used in the above discussion can be applied in practice to
actually determine the set of points satisfied by any equation of the form
(6-42). The trick is merely to complete the square on each variable whose
squared term appears.
For example, what is the locus of

\[ x^2 + 4y^2 + 2x - 24y + 33 = 0? \]

To answer this question, we proceed as follows:

\[ \begin{aligned}
x^2 + 2x + 4(y^2 - 6y) &= -33, \\
x^2 + 2x + 1 + 4(y^2 - 6y + 9) &= -33 + 1 + 36, \\
(x + 1)^2 + 4(y - 3)^2 &= 4, \\
\frac{(x + 1)^2}{4} + \frac{(y - 3)^2}{1} &= 1.
\end{aligned} \]
This we recognize as an ellipse with center at (−1, 3) and principal axis
parallel to the x-axis. Here, a = 2, b = 1, $c = \sqrt{3}$, $e = \sqrt{3}/2$, and
$d = 4/\sqrt{3} = 4\sqrt{3}/3$. Thus for this ellipse, we have the following:
center: (−1, 3)
eccentricity: $e = \sqrt{3}/2$
principal axis: y = 3
conjugate axis: x = −1
foci: $(-1 + \sqrt{3}, 3)$, $(-1 - \sqrt{3}, 3)$
principal vertices: (1, 3), (−3, 3)
conjugate vertices: (−1, 4), (−1, 2)
directrices: $x = -1 + 4\sqrt{3}/3$, $x = -1 - 4\sqrt{3}/3$
We remark that it is usually helpful to make a sketch as an aid in determining
these quantities.
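
Completing the square is routine enough to automate. The following sketch covers only the case $A \neq 0$, $C \neq 0$ of Eq. (6-42); the function name and the wording of the returned labels are our own, and the exact comparison with zero is adequate for integer or simple fractional coefficients but would need a tolerance for general floating-point input.

```python
def complete_the_square(A, C, D, E, F):
    """Reduce A x^2 + C y^2 + D x + E y + F = 0 (A, C nonzero) to
    A (x - h)^2 + C (y - k)^2 = m, and name the locus as in the discussion of (6-43)."""
    if A == 0 or C == 0:
        raise ValueError("this sketch only handles A != 0 and C != 0")
    h = -D / (2 * A)
    k = -E / (2 * C)
    m = A * h * h + C * k * k - F
    if A * C > 0:                      # A and C of the same sign
        if m == 0:
            kind = "single point"
        elif m * A < 0:                # m of opposite sign to A and C
            kind = "empty set"
        else:
            kind = "circle" if A == C else "ellipse"
    else:                              # A and C of opposite signs
        kind = "two intersecting lines" if m == 0 else "hyperbola"
    return h, k, m, kind

# The worked example: x^2 + 4y^2 + 2x - 24y + 33 = 0.
print(complete_the_square(1, 4, 2, -24, 33))
```

For the worked example this gives center (−1, 3) and m = 4, in agreement with the reduction carried out by hand.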

PROBLEMS

1. Sketch each of the following conic sections. Identify each; give the coordinates
of the center, foci, and vertices, and the equations of the axes, directrices, and
asymptotes (if any). What is the eccentricity of each?
(x - l)2 (y - 3)2
(a)
9 16
(b) $(x - 3)^2 = 16(y + 2)$
(x+ l)2 (y - 7)2
(c) 9 "T 25
(d) $x^2 - 2x + 8y + 41 = 0$
(e) $16x^2 + 25y^2 - 96x + 50y - 231 = 0$
(f) $x^2 - 4y^2 - 14x + 45 = 0$
(g) y2 — bx — 4y + 7 = 0
(h) $x^2 - 4y^2 + 6x - 4y + 4 = 0$
(i) $9x^2 + 9y^2 - 54x + 6y + 66 = 0$
(j) $4x^2 - 25y^2 + 40x + 50y + 75 = 0$
2. Describe how the intersection of a cone and a plane can be: (a) a single
straight line, (b) two intersecting lines, (c) a single point.

3. Find the equation of the conic section determined in each of the following
cases:
(a) The ellipse with foci at (3, —2) and (3, 6), and with principal dimension 5
(b) The hyperbola of eccentricity 2 with a focus at (4, 1) and directrix
(associated with this focus) x = 1
(c) The parabola with focus at (7, 6) and directrix y = 4
(d) The ellipse with center at (3, 5), a focus at (3, 8) and a vertex at (3, 9)
(e) The hyperbola with asymptotes

4x + y − 11 = 0,
4x − y − 13 = 0,
and a vertex at (3, 1)
4. Find the equation of the conic section having eccentricity e, a focus at (1, 0)
and an associated vertex at (0, 0). What happens to this equation and
its locus as e approaches 0? as e approaches 1 from below? from above?
5. What are the possible loci of (6-42) if
(a) AC > 0?
(b) AC = 0?
(c) AC < 0?
6. Show that the lines of parts (3) and (4) of Theorem 6-10 must be parallel to
one of the coordinate axes.
7. Prove that when case (5) occurs in Theorem 6-10, the two lines have slopes
which are negatives of one another.
8. Give a specific example illustrating each of the first five cases of Theorem 6-10.
7
Quadratic Curves and Surfaces

7-1 ROTATION OF AXES

In the last section we considered the effect of translations upon loci in
the plane. We now wish to consider the effect of another euclidean motion,
rotation.
We will consider only the case of rotations about the origin. Rotations
about any other center can be studied by translating the center of rotation
to the origin, performing the rotation, and translating the center back
to the original position.
As before, we can think of this euclidean motion as a function which
maps the points of the xy-plane onto the points of a (separate) x'y'-plane.
However, the relative positions of points are unchanged, and this mapping
can be illustrated by drawing two sets of coordinate axes in the same
plane. A given point will have different coordinates with respect to the
different axes, as in Fig. 7-1.
Theorem 5-13 (on page 160) tells us how to determine the coordinates
of the point in the new coordinate system. Let $e_1'$ and $e_2'$ be the unit
vectors in the directions of the new x'- and y'-axes. Then from Theorem
5-13, we have for any point X

\[ X = (X \cdot e_1')e_1' + (X \cdot e_2')e_2' = x'e_1' + y'e_2'. \]
Hence the point has coordinates (x', y') with respect to the new coordinate
system, where
\[ x' = X \cdot e_1', \qquad y' = X \cdot e_2'. \tag{7-1} \]

To find the new coordinates all we have to do is compute these dot
products. But we must first know $e_1'$ and $e_2'$. Let us suppose that the new
coordinate system is located so that the x'-axis makes a signed angle of $\theta$
with the original x-axis, as in Fig. 7-1. Then the signed angle from the
y-axis to the x'-axis is $\theta - \pi/2$ (why?), and the signed angles from the
x- and y-axes to the y'-axis are $\theta + \pi/2$ and $\theta$ respectively. Therefore,

\[ \begin{aligned}
e_1' \cdot e_1 &= \cos\theta, \\
e_1' \cdot e_2 &= \cos(\theta - \pi/2) = \sin\theta, \\
e_2' \cdot e_1 &= \cos(\theta + \pi/2) = -\sin\theta, \\
e_2' \cdot e_2 &= \cos\theta.
\end{aligned} \]
Using these relations and Theorem 5-13 again [$e_1' = (e_1' \cdot e_1)e_1 + (e_1' \cdot e_2)e_2$, etc.], we have
\[ \begin{aligned}
e_1' &= \cos\theta \, e_1 + \sin\theta \, e_2, \\
e_2' &= -\sin\theta \, e_1 + \cos\theta \, e_2.
\end{aligned} \tag{7-2} \]
For a given point X = (x, y) in the original coordinate system, we can
use relations (7-2) to obtain the change of coordinates in (7-1). Doing
so, we find
\[ X \cdot e_1' = (xe_1 + ye_2) \cdot (\cos\theta \, e_1 + \sin\theta \, e_2) = x\cos\theta + y\sin\theta, \]
and
\[ X \cdot e_2' = (xe_1 + ye_2) \cdot (-\sin\theta \, e_1 + \cos\theta \, e_2) = -x\sin\theta + y\cos\theta. \]
We have thus proved the theorem below.

Theorem 7-1. If a point (x, y) has coordinates (x', y') in a new coordinate
system which has been rotated through the signed angle $\theta$,
then
\[ \begin{aligned}
x' &= x\cos\theta + y\sin\theta, \\
y' &= -x\sin\theta + y\cos\theta.
\end{aligned} \tag{7-3} \]

Let us illustrate the use of these formulas by finding the equation of the
hyperbola whose principal axis is the line y = x, whose asymptotes are
the lines x = 0 and y = 0, and which has a vertex at the point (1, 1)
(Fig. 7-2). Introduce a new coordinate system whose x'-axis is the principal
axis of this hyperbola. Then, from relations (7-3) we have

\[ x' = \frac{1}{\sqrt{2}}(x + y), \qquad y' = \frac{1}{\sqrt{2}}(-x + y). \tag{7-4} \]

Figure 7-1 Figure 7-2

The line x = 0 is the line x' − y' = 0 in these new coordinates (why?),
and the other asymptote is the line x' + y' = 0. These can be obtained
by solving the relations (7-4) for x and y in terms of x' and y'. The vertex
is at the point $(\sqrt{2}, 0)$ in terms of these coordinates.
The equation of the required hyperbola in the (x', y')-coordinate system is therefore
\[ \frac{x'^2}{2} - \frac{y'^2}{2} = 1. \]
From (7-4) this is
\[ \tfrac{1}{4}(x + y)^2 - \tfrac{1}{4}(-x + y)^2 = 1, \]
or after some simplification,
\[ xy = 1. \tag{7-5} \]

Figure 7-3
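
The change of coordinates in Theorem 7-1 is easy to try numerically. In the sketch below the function name is our own; it applies the rotation formulas with $\theta = \pi/4$ to several points of xy = 1 and evaluates $x'^2/2 - y'^2/2$ at each, which should come out close to 1.

```python
import math

def rotate_coordinates(x, y, theta):
    """Coordinates (x', y') of the point (x, y) in axes rotated through theta (Theorem 7-1)."""
    xp = x * math.cos(theta) + y * math.sin(theta)
    yp = -x * math.sin(theta) + y * math.cos(theta)
    return xp, yp

theta = math.pi / 4  # the x'-axis is the line y = x
for x in (0.5, 1.0, 2.0, -3.0):
    y = 1.0 / x      # a point on xy = 1
    xp, yp = rotate_coordinates(x, y, theta)
    print((x, y), xp * xp / 2 - yp * yp / 2)
```

The point (1, 1) is sent to approximately $(\sqrt{2}, 0)$, the vertex found in the text.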
In the next section we will be looking at the opposite problem: how to
find a rotation which will simplify a given equation. For this, we will
find it useful to have some relations which are consequences of the trigonometric
sum formulas. These were derived in Section 2-7, but we can
repeat the derivation here. Indeed, these formulas can be derived by using
the methods of this section. Let us suppose that the vector $e_1'$ results
from a rotation of $e_1$ through an angle $\theta$, and that $e_1''$ results from a rotation
of $e_1$ through an angle $\theta + \phi$, or equivalently, from the rotation of $e_1'$
through an angle of $\phi$ (see Fig. 7-3). Then
\[ e_1'' = \cos(\theta + \phi)e_1 + \sin(\theta + \phi)e_2 = \cos\phi \, e_1' + \sin\phi \, e_2'. \]
However, $e_1'$ and $e_2'$ are given by Eqs. (7-2), and substituting these values
into the second equation above, we have
\[ \begin{aligned}
\cos(\theta + \phi)e_1 + \sin(\theta + \phi)e_2 &= \cos\phi(\cos\theta \, e_1 + \sin\theta \, e_2) + \sin\phi(-\sin\theta \, e_1 + \cos\theta \, e_2) \\
&= (\cos\theta\cos\phi - \sin\theta\sin\phi)e_1 + (\sin\theta\cos\phi + \cos\theta\sin\phi)e_2.
\end{aligned} \]
Since the two sides of this equation are identical, we have proved
\[ \begin{aligned}
\cos(\theta + \phi) &= \cos\theta\cos\phi - \sin\theta\sin\phi, \\
\sin(\theta + \phi) &= \sin\theta\cos\phi + \cos\theta\sin\phi.
\end{aligned} \tag{7-6} \]

PROBLEMS

1. Find formulas similar to (7-2) for $e_1$ and $e_2$ in terms of $e_1'$ and $e_2'$.
2. Use the results of Problem 1 to obtain formulas for x and y in terms of x' and y'.
3. Find x and y in terms of x' and y' by solving Eqs. (7-3). How does this
result compare with the answer to Problem 2?
4. Find the equation of the hyperbola whose principal and conjugate dimensions
are a and b, whose center is at the origin, having the y-axis as an asymptote,
and such that a vertex is in the first quadrant. Can you solve the resulting
equation for y as a function of x?
5. Find the equation of the ellipse whose foci are at (3, 4) and (—3, —4) and
whose principal dimension is 13. Use the methods of this section.
6. Find the equation of the ellipse of Problem 5 directly from the focal property,
as was done in Section 6-3. Compare with the result of Problem 5. Which
method is easier?
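7. Starting from Equations (7-6) prove the following:
$$\begin{aligned}
\sin 2\theta &= 2\sin\theta\cos\theta, & \cos 2\theta &= \cos^2\theta - \sin^2\theta,\\
\cos 2\theta &= 2\cos^2\theta - 1, & \cos 2\theta &= 1 - 2\sin^2\theta,\\
\cos^2\theta &= \tfrac12[1 + \cos 2\theta], & \sin^2\theta &= \tfrac12[1 - \cos 2\theta],\\
\tan^2\theta &= \frac{1 - \cos 2\theta}{1 + \cos 2\theta}, & \tan\theta &= \frac{1 - \cos 2\theta}{\sin 2\theta},\\
\tan\theta &= \frac{\sin 2\theta}{1 + \cos 2\theta}.
\end{aligned}$$
8. Prove:
$$\tan(\theta+\phi) = \frac{\tan\theta + \tan\phi}{1 - \tan\theta\tan\phi}, \qquad
\tan(\theta-\phi) = \frac{\tan\theta - \tan\phi}{1 + \tan\theta\tan\phi}.$$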

7-2 GENERAL QUADRATIC EQUATIONS

In the first section of the last chapter we found that the intersection of a
cone with the $xy$-plane is the set of points which satisfy a quadratic equation,
that is, it is the locus of an equation of the form
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0. \tag{7-7}$$
In this section we wish to investigate the converse property, i.e., to determine
what point sets will satisfy an arbitrary equation of the form (7-7).
If $B = 0$ in Eq. (7-7), then we know that the equation represents a
conic section, a degenerate conic section, or the empty set, as was discussed
in Section 6-6. We will see that the same result holds for Eq. (7-7) in general
by showing that in a suitably rotated coordinate system the cross product
term $xy$ will not occur, and hence in this coordinate system the results of
Section 6-6 will apply.
Let us suppose that the $(x', y')$-coordinate system is rotated through an
angle $\theta$ with respect to the $x$-, $y$-axes. Then from (7-3) we have
$$x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta.$$
If these relations are solved for $x$ and $y$ in terms of $x'$ and $y'$ (see Problems
2 and 3 in the last section) we find
$$\begin{aligned}
x &= x'\cos\theta - y'\sin\theta,\\
y &= x'\sin\theta + y'\cos\theta,
\end{aligned} \tag{7-8}$$
and hence we can compute
$$\begin{aligned}
x^2 &= x'^2\cos^2\theta - 2x'y'\cos\theta\sin\theta + y'^2\sin^2\theta,\\
y^2 &= x'^2\sin^2\theta + 2x'y'\cos\theta\sin\theta + y'^2\cos^2\theta,\\
xy &= x'^2\cos\theta\sin\theta + x'y'(\cos^2\theta - \sin^2\theta) - y'^2\cos\theta\sin\theta.
\end{aligned}$$

Therefore, in terms of the rotated coordinate system, the point set satisfying
Eq. (7-7) will satisfy the equation
$$A'x'^2 + B'x'y' + C'y'^2 + D'x' + E'y' + F' = 0,$$
where
$$\begin{aligned}
A' &= A\cos^2\theta + B\cos\theta\sin\theta + C\sin^2\theta,\\
B' &= 2(C - A)\cos\theta\sin\theta + B(\cos^2\theta - \sin^2\theta),\\
C' &= A\sin^2\theta - B\cos\theta\sin\theta + C\cos^2\theta,\\
D' &= D\cos\theta + E\sin\theta,\\
E' &= -D\sin\theta + E\cos\theta,\\
F' &= F.
\end{aligned} \tag{7-9}$$
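
Relations (7-9) translate directly into a short computation. The Python sketch below is an illustration only; the function names and the sample coefficients are assumptions made for the check. It computes the primed coefficients and confirms that the rotated equation takes the same value at a point as the original equation does at the corresponding unrotated point.

```python
import math

def rotate_coefficients(A, B, C, D, E, F, theta):
    """Primed coefficients of the quadratic after rotating the axes by theta, per (7-9)."""
    c, s = math.cos(theta), math.sin(theta)
    return (
        A * c * c + B * c * s + C * s * s,            # A'
        2 * (C - A) * c * s + B * (c * c - s * s),    # B'
        A * s * s - B * c * s + C * c * c,            # C'
        D * c + E * s,                                # D'
        -D * s + E * c,                               # E'
        F,                                            # F'
    )

def quadratic(coeffs, x, y):
    """Value of A x^2 + B x y + C y^2 + D x + E y + F at (x, y)."""
    A, B, C, D, E, F = coeffs
    return A * x * x + B * x * y + C * y * y + D * x + E * y + F

old = (2.0, -3.0, 5.0, 1.0, 4.0, -7.0)   # sample coefficients (assumed values)
theta = 0.4                               # sample rotation angle (assumed value)
new = rotate_coefficients(*old, theta)

# A point given in rotated coordinates, and the same point in the original
# coordinates, obtained by expressing x and y through x' and y'.
xp, yp = 1.5, -0.8
x = xp * math.cos(theta) - yp * math.sin(theta)
y = xp * math.sin(theta) + yp * math.cos(theta)
difference = quadratic(old, x, y) - quadratic(new, xp, yp)
```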

Using the relations of Problem 7 of the last section, we see that
$$B' = (C - A)\sin 2\theta + B\cos 2\theta.$$
By proper choice of $\theta$, this expression can be made zero. Indeed, if we
set
$$\sin 2\theta = \frac{B}{\Delta}, \qquad \cos 2\theta = -\frac{C - A}{\Delta}, \tag{7-10}$$
where
$$\Delta = [B^2 + (C - A)^2]^{1/2}, \tag{7-11}$$
then $B' = 0$.
The values on the right-hand side of Eqs. (7-10) are the sine and cosine
of some angle $2\theta$ since the sum of their squares is one. The only possible
way in which this could fail is if $\Delta = 0$. However, in order for $\Delta$ to be
zero, we would have to have $B = 0$ and $C - A = 0$. Since we were
trying to eliminate $B$ from (7-7) by this process, this would mean that
Eq. (7-7) was already in the desired form to begin with. It is interesting
to note that in this case, Eq. (7-7) represents a circle (if its locus is
nondegenerate).
Using the identities of Problem 7 of the last section, we find from
(7-10) that
$$\begin{aligned}
\cos^2\theta &= \frac12\left(1 - \frac{C - A}{\Delta}\right) = \frac{1}{2\Delta}[\Delta - (C - A)],\\
\sin^2\theta &= \frac12\left(1 + \frac{C - A}{\Delta}\right) = \frac{1}{2\Delta}[\Delta + (C - A)],\\
\cos\theta\sin\theta &= \frac{B}{2\Delta}.
\end{aligned} \tag{7-12}$$


Putting these values into (7-9), we find that when the angle $\theta$ is determined
by (7-10) we have
$$\begin{aligned}
A' &= \tfrac12[(C + A) + \Delta],\\
B' &= 0,\\
C' &= \tfrac12[(C + A) - \Delta].
\end{aligned} \tag{7-13}$$
The values for $D'$ and $E'$ are similarly determined by using the values
determined by (7-12). There is here, however, a matter of choice. Equations
(7-12) determine $\cos^2\theta$ and $\sin^2\theta$. There are in general four possible
angles $\theta$ between 0 and $2\pi$ which will satisfy these requirements, but only
two of these will also satisfy $\cos\theta\sin\theta = B/2\Delta$. Generally, it probably is
most convenient to choose the angle in the first or fourth quadrant, which
corresponds to taking the positive root for $\cos\theta$, and the appropriate root
for $\sin\theta$ so that $\sin\theta\cos\theta = B/2\Delta$.
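
The choice of angle described above can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the text: the function name `eliminate_cross_term` is hypothetical, and the angle is recovered with `math.atan2` so that it falls in the first or fourth quadrant.

```python
import math

def eliminate_cross_term(A, B, C):
    """Return (theta, A', C') for the rotation that makes B' = 0.

    Follows (7-10), (7-11), and (7-13); theta is taken with cos(theta) >= 0.
    """
    delta = math.hypot(B, C - A)              # (7-11)
    if delta == 0:
        return 0.0, A, C                      # B = 0 and A = C: nothing to eliminate
    # (7-10): sin 2θ = B/Δ and cos 2θ = -(C - A)/Δ
    two_theta = math.atan2(B / delta, -(C - A) / delta)   # in (-pi, pi]
    theta = two_theta / 2                                  # in (-pi/2, pi/2]
    return theta, ((C + A) + delta) / 2, ((C + A) - delta) / 2   # (7-13)

theta, A1, C1 = eliminate_cross_term(9, -6, 17)
```

For the coefficients 9, -6, 17 this choice gives A' = 18 and C' = 8. Taking the other admissible direction (an angle differing by a quarter turn) interchanges the two values.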

In practical problems, it probably is unwise to try to memorize these
formulas. If Eqs. (7-13) cannot be referred to, it is better to start with
Eqs. (7-8) and work the relations out in full. An example will show how
this may be done. Let us attempt to find the point set which satisfies
the equation
$$9x^2 - 6xy + 17y^2 - 288 = 0.$$
We set $x = x'\cos\theta - y'\sin\theta$ and $y = x'\sin\theta + y'\cos\theta$, getting
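$$\begin{aligned}
&9[x'^2\cos^2\theta - 2x'y'\cos\theta\sin\theta + y'^2\sin^2\theta]\\
&\quad - 6[x'^2\cos\theta\sin\theta + x'y'(\cos^2\theta - \sin^2\theta) - y'^2\cos\theta\sin\theta]\\
&\quad + 17[x'^2\sin^2\theta + 2x'y'\cos\theta\sin\theta + y'^2\cos^2\theta] - 288 = 0,
\end{aligned}$$
or
$$\begin{aligned}
&[9\cos^2\theta - 6\cos\theta\sin\theta + 17\sin^2\theta]\,x'^2 + [16\cos\theta\sin\theta - 6(\cos^2\theta - \sin^2\theta)]\,x'y'\\
&\quad + [9\sin^2\theta + 6\cos\theta\sin\theta + 17\cos^2\theta]\,y'^2 - 288 = 0.
\end{aligned}$$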
The expression which we wish to eliminate is
$$16\cos\theta\sin\theta - 6(\cos^2\theta - \sin^2\theta) = 8\sin 2\theta - 6\cos 2\theta = 2[4\sin 2\theta - 3\cos 2\theta].$$
This will be zero if we set
$$\sin 2\theta = \tfrac35, \qquad \cos 2\theta = \tfrac45.$$
In this case,
$$\cos\theta\sin\theta = \tfrac12\sin 2\theta = \tfrac{3}{10}, \qquad
\cos^2\theta = \tfrac12(1 + \cos 2\theta) = \tfrac{9}{10}, \qquad
\sin^2\theta = \tfrac12(1 - \cos 2\theta) = \tfrac{1}{10}.$$
Substituting in these values, we have the equation
$$8x'^2 + 18y'^2 - 288 = 0,$$
or
$$\frac{x'^2}{36} + \frac{y'^2}{16} = 1.$$
This we immediately recognize as an ellipse with the $x'$-axis as its principal
axis and with principal and conjugate dimensions 6 and 4 respectively.
Since $\sin\theta\cos\theta$ is positive, we can choose the positive roots for $\sin\theta$
and $\cos\theta$, giving $\cos\theta = 3/\sqrt{10}$ and $\sin\theta = 1/\sqrt{10}$. The $x'$-axis can
thus be sketched in by drawing the line through the origin and (3, 1). It
is then easy to make a sketch of the ellipse as in Fig. 7-4. How have the
coordinates given in Fig. 7-4 been determined? What are the coordinates
of the foci?
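
The result of this example can be verified numerically. The Python sketch below is an illustration only; the parametrisation of the ellipse by an angle `t` and the helper names are assumptions introduced for the check. It takes points of the ellipse with semi-axes 6 and 4 in the rotated system, carries them back to the original coordinates, and evaluates the original equation there.

```python
import math

cos_t, sin_t = 3 / math.sqrt(10), 1 / math.sqrt(10)   # values found in the example

def original(x, y):
    """Left-hand side of 9x^2 - 6xy + 17y^2 - 288 = 0."""
    return 9 * x * x - 6 * x * y + 17 * y * y - 288

def to_unrotated(xp, yp):
    """Express (x, y) through the rotated coordinates (x', y')."""
    return (xp * cos_t - yp * sin_t, xp * sin_t + yp * cos_t)

# Points (6 cos t, 4 sin t) lie on the ellipse x'^2/36 + y'^2/16 = 1.
residuals = [
    original(*to_unrotated(6 * math.cos(t), 4 * math.sin(t)))
    for t in (0.0, 0.5, 1.0, 2.0, 4.0)
]
```

Every entry of `residuals` should vanish up to rounding error.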

In order to see whether a given equation of the form (7-7) represents
an ellipse, hyperbola, or parabola, it is not necessary to do all of this
computation. We will show that the expression $B^2 - 4AC$ remains unchanged
under rotation of coordinates, hence this expression will be the
same as $-4A'C'$ when the coordinates are so chosen that $B' = 0$. But the
equation
$$A'x'^2 + C'y'^2 + D'x' + E'y' + F' = 0$$
will represent an ellipse if $A'C' > 0$, a parabola if $A'C' = 0$, and a hyperbola
if $A'C' < 0$ (without considering degenerate cases, or the possibility
of no locus). First, however, we must prove the invariance of this expression.

Theorem 7-2. If the equation
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$$
becomes
$$A'x'^2 + B'x'y' + C'y'^2 + D'x' + E'y' + F' = 0$$
under a rotation of the coordinate system, then
$$\begin{aligned}
C' + A' &= C + A,\\
B'^2 + (C' - A')^2 &= B^2 + (C - A)^2,
\end{aligned}$$
and
$$B'^2 - 4A'C' = B^2 - 4AC.$$

Proof: First we observe from the relations (7-9) that
$$C' + A' = A(\cos^2\theta + \sin^2\theta) + C(\cos^2\theta + \sin^2\theta) = C + A,$$
so that $C + A$ is invariant under the rotation.
Next we compute
$$\begin{aligned}
C' - A' &= (C - A)(\cos^2\theta - \sin^2\theta) - 2B\cos\theta\sin\theta\\
 &= (C - A)\cos 2\theta - B\sin 2\theta.
\end{aligned}$$
Likewise,
$$B' = (C - A)\sin 2\theta + B\cos 2\theta.$$
Therefore,
$$\begin{aligned}
B'^2 + (C' - A')^2 &= (C - A)^2\sin^2 2\theta + 2B(C - A)\sin 2\theta\cos 2\theta + B^2\cos^2 2\theta\\
 &\quad + (C - A)^2\cos^2 2\theta - 2B(C - A)\sin 2\theta\cos 2\theta + B^2\sin^2 2\theta\\
 &= B^2 + (C - A)^2.
\end{aligned}$$
Hence the expression $B^2 + (C - A)^2$, which is $\Delta^2$, is also invariant under
the rotation.
Finally, we merely need to observe that
$$\begin{aligned}
B'^2 - 4A'C' &= B'^2 + (C' - A')^2 - (C' + A')^2\\
 &= B^2 + (C - A)^2 - (C + A)^2\\
 &= B^2 - 4AC.
\end{aligned}$$
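
A numerical spot check of Theorem 7-2 may make the invariance concrete. The Python sketch below is an illustration only; the function names and the sample rotation angles are assumptions. It applies relations (7-9) for several angles and compares the three expressions of the theorem before and after rotation.

```python
import math

def rotated_abc(A, B, C, theta):
    """A', B', C' from relations (7-9)."""
    c, s = math.cos(theta), math.sin(theta)
    A1 = A * c * c + B * c * s + C * s * s
    B1 = 2 * (C - A) * c * s + B * (c * c - s * s)
    C1 = A * s * s - B * c * s + C * c * c
    return A1, B1, C1

def invariants(A, B, C):
    """The three quantities of Theorem 7-2."""
    return (C + A, B * B + (C - A) ** 2, B * B - 4 * A * C)

before = invariants(9, -6, 17)
after = [invariants(*rotated_abc(9, -6, 17, t)) for t in (0.3, 1.0, 2.5)]
```

Each triple in `after` should agree with `before` up to rounding error.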
We immediately obtain the following theorem as a corollary.

Theorem 7-3. The set of points satisfying the equation
$$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$$
is an ellipse, a point, or an empty set if $B^2 - 4AC < 0$. It is a parabola,
two parallel lines, or a line if $B^2 - 4AC = 0$. It is a hyperbola or a
pair of intersecting lines if $B^2 - 4AC > 0$.

For example, the equation that we considered above,
$$9x^2 - 6xy + 17y^2 - 288 = 0,$$
was found to be the equation of an ellipse. For this equation,
$$B^2 - 4AC = 36 - 4\cdot 9\cdot 17 < 0,$$

and hence this information could have been obtained from an application
of Theorem 7-3.
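
Theorem 7-3 amounts to a three-way test on the sign of $B^2 - 4AC$. A minimal Python sketch follows; the function name `classify` and the returned strings are assumptions for illustration, and the sketch presumes that at least one of A, B, C is nonzero.

```python
def classify(A, B, C):
    """Possible loci of A x^2 + B x y + C y^2 + D x + E y + F = 0, by Theorem 7-3.

    Assumes A, B, C are not all zero. Only the sign of B^2 - 4AC is used,
    so the degenerate possibilities are returned together with the conic.
    """
    disc = B * B - 4 * A * C
    if disc < 0:
        return "ellipse, point, or empty set"
    if disc == 0:
        return "parabola, two parallel lines, or line"
    return "hyperbola or pair of intersecting lines"

example = classify(9, -6, 17)
```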
It is relatively easy to remember that $B^2 - 4AC$ remains unchanged
when the coordinate system is rotated. If the student can also remember
that $C + A$ is invariant under the rotation, then he knows all he needs
in order to obtain the coefficients of the transformed equation. We wish
to rotate the coordinate system so that $B' = 0$. If this is done, we will
then have the relations
$$-4A'C' = B^2 - 4AC, \qquad C' + A' = C + A.$$
Knowing $A$, $B$, and $C$, we can solve these equations for $A'$ and $C'$.
Again using the above example,
$$9x^2 - 6xy + 17y^2 - 288 = 0,$$
we have
$$-4A'C' = -576, \qquad C' + A' = 26.$$
From the first equation we have $A' = 144/C'$. Substituting this into the
second equation gives $C' + 144/C' = 26$, or
$$C'^2 - 26C' + 144 = 0.$$
This gives $C' = 18$ or 8, and hence $A' = 8$ or 18, respectively. If we use
the first pair, we find the transformed equation
$$8x'^2 + 18y'^2 - 288 = 0.$$
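
The two invariants determine the new coefficients as the roots of a quadratic, which can be sketched in Python as follows. The function name `transformed_pair` is an assumption for illustration; it returns the larger root first, so either ordering of the pair may be assigned to A' and C' as in the text.

```python
import math

def transformed_pair(A, B, C):
    """Roots of t^2 - (C + A) t + A'C' = 0, where -4 A'C' = B^2 - 4AC.

    These are the two possible values for A' and C' once B' = 0.
    """
    total = C + A                       # A' + C'
    product = -(B * B - 4 * A * C) / 4  # A'C'
    # total^2 - 4*product = B^2 + (C - A)^2, which is never negative.
    root = math.sqrt(total * total - 4 * product)
    return ((total + root) / 2, (total - root) / 2)

pair = transformed_pair(9, -6, 17)
```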

This process gives us the transformed equation, but it does not tell us
the angle through which the coordinate system has been rotated. The
latter can be obtained in another way, however. The last equation tells us
that we have an ellipse whose principal dimension is 6 and whose conjugate
dimension is 4. The center is at the origin, which is unchanged by the
rotation. Hence, if we could find the coordinates of a principal vertex
in the $(x, y)$-coordinate system, then we could locate the ellipse as in Fig. 7-4.
The coordinates of a principal vertex can, however, be found easily
by finding a point on the locus whose distance from the origin is 6 (the
same process will work equally well in the case of a hyperbola). That is,
we wish to solve the pair of equations
$$9x^2 - 6xy + 17y^2 - 288 = 0, \qquad x^2 + y^2 = 36,$$
simultaneously. Multiplying the second equation by 9 and subtracting
gives
$$-6xy + 8y^2 = -36,$$
and hence,
$$x = \frac{4y^2 + 18}{3y} = \frac{4y}{3} + \frac{6}{y}.$$
Combining this with $x^2 + y^2 = 36$ gives
$$\tfrac{16}{9}y^2 + 16 + \frac{36}{y^2} + y^2 = 36,$$
or
$$25y^4 - 180y^2 + 324 = 0.$$
A positive root of this equation is $y^2 = \tfrac{18}{5}$. From this we have
$x^2 = 36 - y^2 = \tfrac{162}{5}$. If we insert these values into the original equation
of the locus, we find
$$9\cdot\tfrac{162}{5} - 6xy + 17\cdot\tfrac{18}{5} - 288 = 0, \quad\text{that is,}\quad xy = \tfrac{54}{5}.$$
This is satisfied when we take the roots for $x$ and $y$ to have the same
sign. Thus, we can choose the positive roots, and find the point $(18/\sqrt{10},\ 6/\sqrt{10})$
to be one of the principal vertices.
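
The computation of the vertex can be retraced numerically. The Python sketch below is an illustration only; the variable names are assumptions. It solves the quartic as a quadratic in $y^2$, recovers $x$ from $x = 4y/3 + 6/y$, and leaves the result for comparison with the vertex found above.

```python
import math

# 25 y^4 - 180 y^2 + 324 = 0, treated as a quadratic in u = y^2.
a, b, c = 25.0, -180.0, 324.0
disc = b * b - 4 * a * c                             # vanishes: a double root
u = (-b + math.sqrt(max(disc, 0.0))) / (2 * a)       # u = y^2

y = math.sqrt(u)               # positive root for y
x = 4 * y / 3 + 6 / y          # from -6xy + 8y^2 = -36

on_locus = 9 * x * x - 6 * x * y + 17 * y * y - 288  # should vanish
distance_squared = x * x + y * y                     # should equal 36
```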

PROBLEMS

1. Prove that the relations of (7-13) follow from (7-9) and (7-12).
2. Show that there are in general four possible angles $\theta$ between 0 and $2\pi$ such
that
$$(C - A)\sin 2\theta + B\cos 2\theta = 0.$$
How are these four angles related to each other?
3. For each of the following equations, find and sketch the locus, first by using
relations (7-13) directly, and then by working out the relations in full,
starting from Eqs. (7-8).
(a) $9x^2 + 24xy + 16y^2 + 100x - 40y + 100 = 0$
(b) 5x2 + %xy_ + 5y2 — 9 = 0
(c) $x^2 + 2\sqrt{3}\,xy - y^2 + 1 = 0$
(d) $3x^2 - 6xy + y^2 - 4 = 0$
4. If the expression $F(x, y) = Ax^2 + Bxy + Cy^2$ is evaluated at points of the
unit circle, that is, at points where $x = \cos\phi$ and $y = \sin\phi$, we have
$$V(\phi) = F(\cos\phi, \sin\phi) = A\cos^2\phi + B\cos\phi\sin\phi + C\sin^2\phi.$$
Show that
$$V(\phi) = \tfrac12[(C + A) + B\sin 2\phi - (C - A)\cos 2\phi].$$
Show also that
$$V(\phi) = \tfrac12[(C + A) + \Delta\cos(2\phi - 2\theta)],$$
where $\Delta$ is defined as in (7-11) and $\theta$ is the angle determined as in (7-10).
For what value of $\phi$ is this expression a maximum? a minimum?
5. Equations (7-13) give $A'$ and $C'$ for the coordinate system in which $B' = 0$.
What is $A'C'$ in terms of $A$, $B$, and $C$ according to these equations? Does
this offer another proof of Theorem 7-3?

7-3 THE QUADRIC SURFACES

The general quadratic equation in the three variables $x$, $y$, and $z$ is
$$Ax^2 + By^2 + Cz^2 + Hxy + Jyz + Kxz + Dx + Ey + Fz + G = 0. \tag{7-14}$$
In this section we wish to discuss the point sets in the three-dimensional
space which can satisfy such an equation. These are called the quadric
surfaces.
Just as in the case of quadratic equations in two variables, there always
exists a rotation of the coordinate system such that the cross product terms
in (7-14) can be eliminated. The proof of this would take too much time
to give here and is more difficult than might appear at first glance. For,
suppose we consider the z-axis as fixed and rotate the x- and y-axes. As
was shown in the last section, this can be done so as to eliminate the xy
term. It might seem that we could continue by holding the new y-axis
fixed and rotating the (new) x-axis and z-axis to eliminate the xz term.
However, if we try to do this we find that the yz term generates a new
xy term (why?)
Actually, several proofs are available. One in particular results from an
extension of the observations made in Problem 4 of the last section, but it
too requires techniques we do not want to develop at this point. Instead,
we will just assume that this has been accomplished and study only equations
of the form
$$Ax^2 + By^2 + Cz^2 + Dx + Ey + Fz + G = 0. \tag{7-15}$$

The character of the locus of this equation changes drastically as the
various coefficients in it take on positive, negative, or zero values. Since
there are seven coefficients with three possibilities for each coefficient,
there are a total of $3^7 = 2187$ possible cases to consider. We can, however,
cut the number of these cases down to a reasonable size by making a few
observations.
First, if any one of the three coefficients $A$, $B$, or $C$ is not zero, then
we can make the corresponding coefficient $D$, $E$, or $F$ respectively zero.
This can be done by means of a translation (completing the square in that
variable).
For example, we could write
$$\begin{aligned}
x^2 + 2y^2 - z^2 + 4x - 4y + 6z - 10 &= (x + 2)^2 + 2(y - 1)^2 - (z - 3)^2 - 7\\
 &= x'^2 + 2y'^2 - z'^2 - 7.
\end{aligned}$$

Secondly, if any of the coefficients $D$, $E$, or $F$ is nonzero while the
corresponding quadratic coefficient is zero, then we can make $G = 0$,
again by means of a translation. An example of this would be
$$x^2 + 2y^2 - 6z + 8 = x^2 + 2y^2 - 6(z - \tfrac43) = x^2 + 2y^2 - 6z'.$$

Thirdly, if one of the three coefficients $A$, $B$, or $C$ is nonzero, we can
assume it to be $A$, interchanging the roles of the coordinates if necessary
(this is a particular case of rotation of coordinates, but one which is
quite simple). Furthermore, we can assume $A$ to be positive in this case,
since we can multiply (7-15) through by $-1$ if necessary.
Fourthly, if two of the coefficients $A$, $B$, or $C$ are nonzero, then again by
interchanging coordinates if necessary, we can assume them to be $A$ and $B$.
Finally, if $A$, $B$, and $C$ are all nonzero, then at least two are of the same
sign. We may assume these two to be $A$ and $B$, and furthermore that both
are positive (why?).
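
The first reduction, completing the square in each variable whose quadratic coefficient is nonzero, can be sketched in Python. The helper `complete_square` is an assumed name for illustration; it is applied here to the terms in $x$, $y$, and $z$ of $x^2 + 2y^2 - z^2 + 4x - 4y + 6z - 10$.

```python
def complete_square(a, d):
    """For a != 0, write a*t**2 + d*t as a*(t - h)**2 + k and return (h, k)."""
    h = -d / (2 * a)
    k = -d * d / (4 * a)
    return h, k

# Quadratic and linear coefficients in x, y, z, and the constant term.
hx, kx = complete_square(1, 4)     # x^2 + 4x
hy, ky = complete_square(2, -4)    # 2y^2 - 4y
hz, kz = complete_square(-1, 6)    # -z^2 + 6z
constant = -10 + kx + ky + kz      # constant left after the translation
```

The translation is then $x' = x - h_x$, $y' = y - h_y$, $z' = z - h_z$, and `constant` is the remaining constant term.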
The table below lists all of the pertinent cases remaining
after the above reductions have been made. In this table, a zero
means that the coefficient is zero; a + or - indicates that the coefficient
is nonzero and positive or negative respectively; an x indicates that the
coefficient is nonzero (the sign being immaterial in that case); and a zero
in parentheses indicates that the coefficient can be assumed to be zero
because of one of the above comments. A blank space means that the
coefficient in that case does not matter. There are 18 cases listed in this
table. The only cases which have been left out are those in which $A$, $B$, $C$,
$D$, $E$, and $F$ are all zero (why has this been left out?). Of these eighteen
cases, nine are fairly trivial in one way or another, and the remaining
nine are of considerable interest. These are indicated by an asterisk in the
table.
We turn now to an analysis of these nine interesting cases. Each time,
we attempt to build up a picture of the locus satisfying the equation by
considering the curves which result when a plane parallel to one of the
coordinate planes is allowed to cut the surface.

Case       A    B    C    D    E    F    G    Locus

I    1     0    0    0    x                   Plane

II   1     +    0    0   (0)   0    0    +    Empty set
     2     +    0    0   (0)   0    0    0    Plane
     3     +    0    0   (0)   0    0    -    Two parallel planes
     4*    +    0    0   (0)   x        (0)   Parabolic cylinder

III  1     +    +    0   (0)  (0)   0    +    Empty set
     2     +    +    0   (0)  (0)   0    0    Line
     3*    +    +    0   (0)  (0)   0    -    Elliptic cylinder
     4*    +    +    0   (0)  (0)   x   (0)   Elliptic paraboloid

IV   1     +    -    0   (0)  (0)   0    0    Two intersecting planes
     2*    +    -    0   (0)  (0)   0    x    Hyperbolic cylinder
     3*    +    -    0   (0)  (0)   x   (0)   Hyperbolic paraboloid

V    1     +    +    +   (0)  (0)  (0)   +    Empty set
     2     +    +    +   (0)  (0)  (0)   0    Single point
     3*    +    +    +   (0)  (0)  (0)   -    Ellipsoid

VI   1*    +    +    -   (0)  (0)  (0)   +    Hyperboloid of two sheets
     2*    +    +    -   (0)  (0)  (0)   0    Elliptic cone
     3*    +    +    -   (0)  (0)  (0)   -    Hyperboloid of one sheet
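
As an illustration of how the table is read, the rows of cases V and VI can be turned into a small lookup. The Python sketch below is an assumption-laden illustration, not part of the text: the function name `classify_central` is hypothetical, and it covers only equations already reduced so that $A > 0$, $B > 0$, $C \neq 0$, and $D = E = F = 0$.

```python
def classify_central(A, B, C, G):
    """Locus of A x^2 + B y^2 + C z^2 + G = 0 for cases V and VI of the table.

    Assumes the reductions of the text have been carried out:
    A > 0, B > 0, C != 0, and no linear terms.
    """
    if not (A > 0 and B > 0 and C != 0):
        raise ValueError("equation is not in the reduced form of cases V and VI")
    if C > 0:   # case V
        if G > 0:
            return "empty set"
        return "single point" if G == 0 else "ellipsoid"
    # case VI: C < 0
    if G > 0:
        return "hyperboloid of two sheets"
    return "elliptic cone" if G == 0 else "hyperboloid of one sheet"
```

For instance, `classify_central(1, 1, -1, 0)` corresponds to case VI (2).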

Case II (4). The Parabolic Cylinder

In this case the equation can be written as
$$x^2 + Ey + Fz = 0$$
after dividing through by $A$. We assume $E$ to be nonzero. Any plane
orthogonal to the $x$-axis cuts the surface in a straight line, the line with
equation
$$Ey + Fz + a^2 = 0,$$
where $x = a$ is the cutting plane. All such lines are parallel and the
resulting surface is called a cylinder (Fig. 7-5), since it can be generated
by a set of parallel lines. Any plane orthogonal to the $z$-axis cuts the
cylinder in a parabola and all such parabolas are translates of one another
in the $y$-, $z$-directions. The vertices lie on a straight line in the $yz$-plane,
and the axes of the parabolas are all parallel to the $y$-axis.
Note that we can make a rotation in the $yz$-plane so that $Ey + Fz = E'y'$.
Then the surface becomes particularly simple.

Case III (3). The Elliptic Cylinder

By dividing through by $-G$, we can assume the equation to be in the
form
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1,$$
where we have written the coefficients of $x^2$ and $y^2$ in this form to indicate
that they are positive. This is the equation of an ellipse in the $xy$-plane;
and since $z$ does not appear in the equation, if any point is on the locus,
then the entire line through that point parallel to the $z$-axis will also be
on the locus. The surface is therefore again a cylinder, but this time an
elliptic cylinder (Fig. 7-6).

Case III (4). The Elliptic Paraboloid

The equation can be divided through by $|F|$ and written in the form
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = \pm z.$$
The two cases here are similar, and we will discuss only the one with the
positive sign on $z$. The negative sign would merely invert the locus. Assuming
the positive sign on $z$, there are no points with negative $z$ which
satisfy the equation. Setting $z = c^2$, we see that the intersection of the
surface with a plane orthogonal to the $z$-axis is an ellipse. All such ellipses
are similar (have the same eccentricity). A plane orthogonal to the $x$-
(or $y$-) axis cuts the surface in a parabola. In particular, the planes $x = 0$
and $y = 0$ cut the surface in parabolas which pass through the vertices
of the above ellipses. The surface is illustrated in Fig. 7-7.

Case IV (2). The Hyperbolic Cylinder

Dividing the equation through by $|G|$ we obtain an equation of the form
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \pm 1.$$
This is the equation of a hyperbola in the $xy$-plane. The principal axis is the
$x$-axis or the $y$-axis depending on the sign of the right-hand side. Again,
the surface is a cylinder with generators parallel to the $z$-axis (Fig. 7-8).

Case IV (3). The Hyperbolic Paraboloid

By dividing through by $|F|$ we can obtain the equation
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = \pm z.$$
We will consider only the case with $-z$ on the right-hand side. The other
case is similar. The cross sections of this surface in planes orthogonal to
the $x$-axis are parabolas, opening upward. The cross sections in planes
orthogonal to the $y$-axis are parabolas which open downward. Finally,
the cross sections in planes orthogonal to the $z$-axis are hyperbolas (except
when $z = 0$). When $z > 0$, these have their principal axis in the $yz$-plane.
When $z < 0$, the principal axis is in the $xz$-plane. For $z > 0$, the vertices
of these hyperbolas lie on the parabola $y^2 = b^2 z$. For $z < 0$, the vertices
are on the parabola $x^2 = -a^2 z$. When $z = 0$ we find
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 0.$$
The locus of this equation is a pair of intersecting lines. A surprising property
of this surface is the fact that through every point of the surface
there exist two distinct straight lines which lie on the surface. This is
shown in Problem 8 at the end of this section.
This surface, as illustrated in Fig. 7-9, is probably the most interesting
of the quadric surfaces.

Case V (3). The Ellipsoid

This equation can be divided through by $-G$ to give an equation of the
form
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1.$$
Every cross section of this surface in a plane orthogonal to one of the
axes is an ellipse (or a point, or nothing). The ellipses orthogonal to a
given axis are all similar. The surface is sketched in Fig. 7-10.

Case VI (1). The Hyperboloid of Two Sheets

The equation in this case can be brought to the form
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = -1.$$
Every cross section in a plane orthogonal to the $z$-axis is an ellipse (or a
point, or nothing) and all of these ellipses are similar. The cross sections
in planes orthogonal to the $x$-axis or $y$-axis are hyperbolas, with principal
axis parallel to the $z$-axis in the $xz$- or $yz$-plane. There are no points on the
locus for $-c < z < c$ and the surface is in two parts (as shown in Fig.
7-11).

Case VI (2). The Elliptic Cone

When the coefficient $G$ is zero, the equation can be written as
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 0.$$
The cross sections in planes orthogonal to the $z$-axis are ellipses (except
when $z = 0$). The cross sections in planes orthogonal to the $x$- and $y$-axes
are, in general, hyperbolas. However, when $x = 0$ or $y = 0$ these cross
sections are a pair of intersecting lines. In fact, it is easy to verify that if
any point is on the locus, then the entire straight line through this point
and the origin is on the locus. This is the characteristic property of a
cone. This surface is shown in Fig. 7-12.

Case VI (3). The Hyperboloid of One Sheet

The equation in this case can be brought to the form
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1.$$
The analysis is similar to that of the hyperboloid of two sheets. The difference
here is that every plane orthogonal to the $z$-axis intersects the
locus in an ellipse. The locus therefore does not fall into two parts. The
planes $x = 0$ and $y = 0$ intersect the locus in hyperbolas whose principal
axes are orthogonal to the $z$-axis. The resulting surface is as shown in Fig.
7-13. Notice that hyperbolas in the $xz$- and $yz$-planes pass through the
vertices of the ellipses found as intersections of the surface with planes
orthogonal to the $z$-axis.
This surface also has the property of being made up of straight lines.
This is proved in Problem 7 at the end of this section.

There is a special case which can occur in each of six of the above cases.
Whenever two of the coefficients $A$, $B$, and $C$ are equal (and of the same
sign) we will have a surface of revolution. When $A = B$ we would have
surfaces of revolution about the $z$-axis. This means that if $(x_0, y_0, z_0)$ is
on the surface, then all points of the circle in the plane $z = z_0$ with center
at $(0, 0, z_0)$ are also on the locus. This follows from the fact that a point
$(x_1, y_1, z_0)$ is on this circle if and only if
$$x_1^2 + y_1^2 = x_0^2 + y_0^2.$$
Thus if the quadratic equation is of the form
$$Ax^2 + Ay^2 + \cdots = 0,$$
this condition is satisfied.
Surfaces of revolution can be visualized easily. If, for example, it is a
surface of revolution about the $z$-axis, we can find the curve of intersection
of the surface with the $xz$-plane (setting $y = 0$) and imagine this curve
being rotated about the $z$-axis to generate the surface.
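
The defining property of a surface of revolution can be illustrated numerically. In the Python sketch below the sample coefficients, the starting point, and the helper names are all assumptions for illustration. The left-hand side of an equation with equal coefficients on $x^2$ and $y^2$ (and no linear terms in $x$ or $y$) is evaluated at a point and at several rotations of that point about the $z$-axis.

```python
import math

def quadric(x, y, z, A=2.0, C=-3.0, F=1.0, G=-5.0):
    """Sample expression A x^2 + A y^2 + C z^2 + F z + G (equal x^2 and y^2 coefficients)."""
    return A * x * x + A * y * y + C * z * z + F * z + G

def rotate_about_z(p, angle):
    """Rotate the point p = (x, y, z) about the z-axis through `angle` radians."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

p0 = (1.2, -0.7, 0.9)   # arbitrary starting point (assumed value)
values = [quadric(*rotate_about_z(p0, t)) for t in (0.0, 0.5, 2.0, 4.0)]
```

All entries of `values` coincide, so if the starting point satisfies the equation, every point of its circle about the $z$-axis does as well.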
Surfaces of revolution are of such importance that most of them are
given special names.
In case III (3), if $A = B$ the elliptic cylinder becomes a right circular
cylinder. Similarly, in case VI (2) the elliptic cone becomes a right circular
cone.
If $A = B$ in the elliptic paraboloid of case III (4), we obtain a surface
of revolution which is called just a paraboloid. In cases VI (1) and (3) we
have hyperboloids of revolution, of two and one sheets respectively.
Finally, if two of the coefficients are equal in the equation of the ellipsoid,
we obtain a surface known as a spheroid. Assuming that $A = B$, a
spheroid has an equation of the type
$$\frac{x^2}{a^2} + \frac{y^2}{a^2} + \frac{z^2}{c^2} = 1.$$
If $c > a$, then the spheroid looks something like a football and is called a
prolate spheroid. If $c < a$, then the surface is called an oblate spheroid. The
oblate spheroid looks like a curling stone (or like a volleyball that is
being sat upon). What happens when $c = a$?
The reader is expected to learn the names (together with the forms)
of the various quadric surfaces, but he should not attempt to memorize
which forms of the equation go with which of the surfaces. Rather, when
faced with the problem of identifying a quadric surface, say,
$$4x^2 - y^2 + z^2 - 8y + 4z - 21 = 0, \tag{7-16}$$
he should proceed by working up a picture of the surface by considering
the cross sections in the planes parallel to the coordinate planes. For
example, Eq. (7-16) would be rewritten
$$4x^2 - (y + 4)^2 + (z + 2)^2 = 9,$$
or
$$\frac{x^2}{9/4} - \frac{(y + 4)^2}{9} + \frac{(z + 2)^2}{9} = 1. \tag{7-17}$$
In choosing the cross sections to consider first, we prefer to find planes
in which the cross sections are ellipses (or circles). If such planes exist,
it is usually easiest to build up the picture by starting with them. In
(7-17) we see that any plane $y = c$ will intersect the surface in an ellipse.
The resulting ellipse has its center at $x = 0$, $z = -2$. Its principal axis
will be the line $x = 0$ and its conjugate axis the line $z = -2$ (all of these
being in the plane $y = c$). (See Fig. 7-14a.)
All of these ellipses are similar, and they can easily be determined by
the location of their vertices. The principal vertices are the points of
intersection of the surface with the plane $x = 0$, that is, the points on
$$-\frac{(y + 4)^2}{9} + \frac{(z + 2)^2}{9} = 1.$$
This is a hyperbola in the $yz$-plane with center at $y = -4$, $z = -2$ (see
Fig. 7-14b).
The conjugate vertices of the ellipses are the points of intersection of
the surface with the plane $z = -2$. These are the points in this plane
which satisfy the equation
$$\frac{x^2}{9/4} - \frac{(y + 4)^2}{9} = 1,$$
and hence lie on a hyperbola with center $x = 0$, $y = -4$ (Fig. 7-14c).
Collecting the information, we see that the surface is a hyperboloid of
one sheet whose axis is the line $x = 0$, $z = -2$. The surface is as shown in
Fig. 7-15.
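
The cross sections described for this example can be checked numerically. The Python sketch below is an illustration only; the function names and the parametrisation by `t` are assumptions. It uses the completed-square form $4x^2 - (y + 4)^2 + (z + 2)^2 = 9$ and generates points of the ellipse cut out by a plane $y = c$.

```python
import math

def surface(x, y, z):
    """Left-hand side minus right-hand side of 4x^2 - (y + 4)^2 + (z + 2)^2 = 9."""
    return 4 * x * x - (y + 4) ** 2 + (z + 2) ** 2 - 9

def cross_section_point(c, t):
    """Point of the ellipse in the plane y = c, parametrised by the angle t.

    The ellipse is centered at x = 0, z = -2 with semi-axes 1.5*k along x
    and 3*k along z, where k**2 = 1 + (c + 4)**2 / 9.
    """
    k = math.sqrt(1 + (c + 4) ** 2 / 9)
    return (1.5 * k * math.cos(t), c, -2 + 3 * k * math.sin(t))

residuals = [
    surface(*cross_section_point(c, t))
    for c in (-4.0, 0.0, 3.0)
    for t in (0.0, 1.0, 2.5)
]
```

The longer semi-axis lies along the $z$-direction, in agreement with the principal axis being the line $x = 0$ in each plane $y = c$.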

PROBLEMS

1. Show that the nine cases of the above table which were not discussed in the text
have loci as listed in the table.
2. What relationship exists between the two intersecting planes of case IV (1)
and the hyperbolic cylinder of IV (2) when $A$, $B$, and $C$ are the same in the
two equations? What relations hold between the planes of IV (1) and the
hyperbolic paraboloid IV (3)?
3. What relationship exists between the elliptic cone of case VI (2) and the
surfaces of cases VI (1) and (3) when $A$, $B$, and $C$ are the same in the two
equations?
4. Identify the surfaces defined by each of the following equations. Discuss the
intersections of planes parallel to the coordinate planes. Sketch the surface.
(a) $x^2 + 4z^2 = 0$ (b) $y^2 + 9z^2 - 36z = 0$
(c) $4x^2 + 9y^2 + 36z^2 - 36 = 0$ (d) $4x^2 - y^2 + 4z^2 + 12 = 0$
(e) $x^2 - 16z^2 + 48y = 0$ (f) $y^2 - 4z^2 + 16 = 0$
(g) &y2 — 4x + z = 0 (h) 6x2 + 4i/2 + 4z2 - 12 = 0
5. Follow the same directions as in problem 4.
(a) $4x^2 - y^2 + 12z^2 - 36 = 0$ (b) $y^2 + 4z^2 - 16 = 0$
(c) $y^2 - 9z^2 = 0$ (d) $9x^2 + 9z^2 - 4y = 0$
(e) $x^2 - 8y^2 - 8z^2 = 0$ (f) $8x^2 + y^2 - 8z^2 - 32 = 0$
(g) $9x^2 + 4y^2 + 9z^2 - 36 = 0$ (h) $z^2 - x + 2y = 0$
6. Identify the surface defined by each of the following equations. Discuss the
intersections of planes parallel to the coordinate planes. Sketch the surface.
(a) $x^2 - y^2 + 4z^2 + 6x - 8z + 14 = 0$
(b) $4x^2 + 9z^2 - 12y + 6 = 0$
(c) $6x^2 + 2y^2 + z^2 - 24x + 8y - 4z = 0$
(d) $9y^2 - 4z^2 + 10x = 0$
(e) $4x^2 - y^2 + 4z^2 + 16 = 0$
(f) $x^2 + 4y^2 - 3z^2 - 2x - 12z - 11 = 0$
(g) $x^2 - 4z^2 + 6x = 0$
7. Let $X_0 = (x_0, y_0, 0)$ be any point on the intersection of the hyperboloid of
one sheet
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1$$
with the plane $z = 0$. Show that every point of the lines
$$X = X_0 + t\mathbf{B}$$
is on the hyperboloid for
$$\mathbf{B} = [a^2 y_0,\ -b^2 x_0,\ abc],$$
or
$$\mathbf{B} = [-a^2 y_0,\ b^2 x_0,\ abc].$$
8. Let $X_0 = (x_0, 0, z_0)$ be any point on the intersection of the hyperbolic
paraboloid
$$\frac{x^2}{a^2} - \frac{y^2}{b^2} + z = 0$$
with the plane $y = 0$. Show that every point of the lines $X = X_0 + t\mathbf{B}$ is
on the hyperbolic paraboloid for
$$\mathbf{B} = [a^2,\ ab,\ -2x_0],$$
or
$$\mathbf{B} = [a^2,\ -ab,\ -2x_0].$$
9. A quadric surface is called central if and only if there is a point $P$, called the
center, such that whenever the point $X$ is on the surface then so is the point $X'$,
where
$$X' = P - \overrightarrow{PX}$$
(i.e., $\overrightarrow{PX'} = -\overrightarrow{PX}$).
Which of the quadric surfaces defined in this section are central? What are
their centers?

7-4 POLAR COORDINATES

We have made an identification between points of the cartesian plane
and vectors. Thus, if $X$ is the point $(x, y)$, we identify it with the vector
$$\mathbf{X} = \overrightarrow{OX} = x\mathbf{e}_1 + y\mathbf{e}_2$$
($O$ being the origin). The vector can then be thought of as determining the
point. However, we can specify a vector by giving its length and direction
instead of its cartesian coordinates.
If $\mathbf{X}$ is an arbitrary nonzero vector in the cartesian plane, then $|\mathbf{X}|$ is
its magnitude and $\mathbf{e}_r = \mathbf{X}/|\mathbf{X}|$ is a unit vector in the same direction. Here,
we use the subscript $r$ to indicate that the unit vector is in the radial
direction. A subscript $\theta$ would be more to the point, since this vector
depends on $\theta$, but standard usage calls for $\mathbf{e}_r$ with the dependence on $\theta$
being understood.
Since $\mathbf{e}_r \cdot \mathbf{e}_1 = \cos\theta$ and $\mathbf{e}_r \cdot \mathbf{e}_2 = \sin\theta$ (why?), we have
$\mathbf{e}_r = \cos\theta\,\mathbf{e}_1 + \sin\theta\,\mathbf{e}_2$. We thus have the motivation for the definition of
polar coordinates.

Definition 7-1. The polar coordinates of a point $X$ in the cartesian plane
are a pair of real numbers $(r, \theta)$ such that
$$\mathbf{X} = r\mathbf{e}_r, \tag{7-18}$$
where
$$\mathbf{e}_r = \cos\theta\,\mathbf{e}_1 + \sin\theta\,\mathbf{e}_2. \tag{7-19}$$

Several comments must be made about this definition. First of all, it is
clear that any given pair of real numbers $(r, \theta)$ determines a unique point
from these two conditions. Normally we think of the number $r$ as being
$|\mathbf{X}|$, the distance of the point in question from the origin, but this is not
necessary in this definition. The number $r$ could just as well be $-|\mathbf{X}|$.
This, however, is the only other possibility (why?).
On the other hand, a given point has many different sets of polar coordinates,
an infinite number in fact. If a point is fixed and we set $r = |\mathbf{X}|$,
then there is exactly one $\theta_0$ with $0 \le \theta_0 < 2\pi$ satisfying the requirements
of the definition. Any angle $\theta = \theta_0 + 2\pi k$, $k = 0, \pm 1, \pm 2, \ldots$ will also
satisfy the definition. Choosing the other possibility $r = -|\mathbf{X}|$ requires
the angle $\theta_1 = \theta_0 + \pi$. Here again we can add any multiple of $2\pi$ and
still satisfy the conditions.
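
The many-to-one nature of polar coordinates is easy to see in a short computation. The Python sketch below is an illustration only; the function names are assumptions, and `cartesian_to_polar` returns just one of the infinitely many admissible pairs, namely the one with nonnegative radius and angle between 0 and $2\pi$.

```python
import math

def polar_to_cartesian(r, theta):
    """The point r * e_r with e_r = (cos theta, sin theta), as in Definition 7-1."""
    return (r * math.cos(theta), r * math.sin(theta))

def cartesian_to_polar(x, y):
    """One choice of polar coordinates: r = |X| and 0 <= theta < 2*pi."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)
    return (r, theta)

r, theta0 = cartesian_to_polar(-1.0, 1.0)

# Other polar coordinates of the same point.
same_point = [
    polar_to_cartesian(r, theta0),
    polar_to_cartesian(r, theta0 + 2 * math.pi),
    polar_to_cartesian(r, theta0 - 4 * math.pi),
    polar_to_cartesian(-r, theta0 + math.pi),
    polar_to_cartesian(-r, theta0 + 3 * math.pi),
]
```

Every entry of `same_point` is the point $(-1, 1)$ up to rounding error.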

Polar coordinates are useful in many applications. In particular, they
may be used to specify a curve in the plane, just as with cartesian coordinates.
In general, this is done by giving an equation of the form
$$r = f(\theta) \tag{7-20}$$
to specify the locus. We sometimes find $\theta$ as a function of $r$, or find an
equation in both $r$ and $\theta$, but equations of the form (7-20) are the type
which occur most often.
There are two types of questions which arise. First, given an equation
of the form (7-20), what is its locus? Second, given a curve in the plane,
what is the polar coordinate form of its equation? Both of these questions
will be considered with the help of a few examples. We note first that the
conversion from polar coordinates to cartesian coordinates or vice versa is
aided by the relations
$$\begin{aligned}
x &= r\cos\theta,\\
y &= r\sin\theta,\\
r^2 &= x^2 + y^2,
\end{aligned} \tag{7-21}$$
which are easily proved from the definition. Some caution is required in
the use of these relations, however. When they are used to transform from
one kind of coordinate to the other, we must always check that no extraneous
points have been introduced into the locus, and that no points
have been lost.
Let us now look at a few loci defined by their equations in polar coordinates.
The general method of analysis is to determine a number of points
on the locus, and to discuss the behavior of $r$ as $\theta$ varies to see how to
connect these points. A sketch may then be made.

Example I. The Circle


r = a.
Here, r is a constant function of 0. The locus is the set of all points at
the distance |a| from the origin. (Why the absolute value?) It is therefore
a circle with radius |a|, centered at the origin.

Example II. The Circle

$$r = 2a\cos\theta. \tag{7-22}$$

When $\theta = 0$, the point $(2a, 0)$ is determined. As $\theta$ increases from $0$ to
$\pi/2$, $r$ decreases (if $a > 0$) from $2a$ to zero. When $\theta$ goes from $\pi/2$ to
$\pi$, $r$ is negative and we get points in the fourth quadrant. When $\theta = \pi$,
$r = -2a$ and we discover that we again have the point $(2a, 0)$. As $\theta$
increases from $\pi$ to $2\pi$, the points previously obtained are obtained again,
since $\cos(\theta + \pi) = -\cos\theta$. (See Fig. 7-16.)
Any point satisfying this equation must also satisfy
$$r^2 = 2ar\cos\theta,$$
and hence from (7-21) must also satisfy
$$x^2 + y^2 = 2ax.$$
This, however, is the equation of a circle with center at $(a, 0)$ and radius
$|a|$. This discussion shows that the locus of $r(r - 2a\cos\theta) = 0$ is exactly
this circle. This equation was obtained from (7-22) by multiplying by
$r$. No points of the locus of (7-22) will have been lost by this process.
However, an extraneous point may have been introduced into the new
locus. When $r = 0$, the new equation is satisfied. Hence this point is on
the locus we have obtained, but may not be on the locus of (7-22). We
see, however, that when $\theta = \pi/2$, the point $r = 0$ results in (7-22).
Therefore, the locus of (7-22) is exactly the circle described here.
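The claim that this locus is the circle with center $(a, 0)$ and radius $|a|$ can be spot-checked numerically. The sketch below samples $r = 2a\cos\theta$ and measures how far each sampled point is from satisfying $(x - a)^2 + y^2 = a^2$. The helper names and the sample count are illustrative assumptions, and a numerical check of this kind is no substitute for the argument in the text.

```python
import math

def circle_points(a, n=360):
    """Sample r = 2a cos(theta) for theta in [0, pi) and return cartesian points."""
    points = []
    for k in range(n):
        theta = math.pi * k / n
        r = 2 * a * math.cos(theta)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

def max_circle_residual(a, n=360):
    """Largest value of |(x - a)^2 + y^2 - a^2| over the sampled points."""
    return max(abs((x - a) ** 2 + y ** 2 - a ** 2) for x, y in circle_points(a, n))
```

With an even sample count the angle $\theta = \pi/2$ is included, so the origin appears among the sampled points (up to rounding), in agreement with the remark that $r = 0$ is not extraneous.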

Example III. The 4-Leafed Rose

$$r = a\cos 2\theta \quad (a > 0). \tag{7-23}$$

A discussion such as given above shows that the locus of this equation
is as shown in Fig. 7-17. The dashed lines, at $\theta = \pm\pi/4, \pm 3\pi/4$, are
the rays at which $r = 0$. Drawing these in helps to sketch the locus.
Note the order in which the leaves are traced out as $\theta$ goes from $0$ to $2\pi$:
first, the upper side of the right-hand leaf, then the left side of the bottom
leaf, and so on.

A technique that is often helpful in making sketches of the loci of polar
coordinate equations is to make a preliminary sketch of the locus of the
equation in a rectangular $(r, \theta)$-coordinate system. The sketch of (7-23) in
such a coordinate system would look like Fig. 7-18.

From such a sketch, it is easy to pick out the critical angles which
should be marked on the polar-coordinate plane to help with the sketch.
These are the values of $\theta$ for which $r = 0$ ($\pi/4$, $3\pi/4$, $5\pi/4$, $7\pi/4$ in this
case) and values of $\theta$ at which $r$ takes on local maximum and minimum
values ($0$, $\pi/2$, $\pi$, $3\pi/2$ for the locus discussed here).
The reader is advised to make such a sketch for each of the examples of
this section and to see how the rectangular coordinate sketch is related to,
and helps to obtain, the polar coordinate sketch.
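For the roses $r = a\cos n\theta$ the critical angles can be listed directly, since the zeros and extrema of the cosine are known. The following sketch does this in Python; the function name `critical_angles` is a hypothetical helper introduced only for illustration.

```python
import math

def critical_angles(n):
    """Critical angles in [0, 2*pi) for the rose r = a cos(n*theta), a > 0.

    zeros:   cos(n*theta) = 0    gives theta = (2k + 1) * pi / (2n)
    extrema: |cos(n*theta)| = 1  gives theta = k * pi / n
    """
    zeros = [(2 * k + 1) * math.pi / (2 * n) for k in range(2 * n)]
    extrema = [k * math.pi / n for k in range(2 * n)]
    return zeros, extrema
```

For $n = 2$ this reproduces the angles quoted above: zeros at $\pi/4$, $3\pi/4$, $5\pi/4$, $7\pi/4$ and extrema at $0$, $\pi/2$, $\pi$, $3\pi/2$.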

Example IV. The 3-Leafed Rose

$$r = a\cos 3\theta \quad (a > 0). \tag{7-24}$$

This equation has a locus as shown in Fig. 7-19. An analysis similar to
that above shows that the curve is traced out twice as $\theta$ goes from $0$ to $2\pi$.
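One way to see the double tracing is to compare the point obtained at $\theta$ with the point obtained at $\theta + \pi$. The sketch below does this numerically for $r = a\cos n\theta$; the helper names, the tolerance, and the sample count are assumptions made for illustration.

```python
import math

def rose_point(a, n, theta):
    """Cartesian point of the rose r = a cos(n*theta) at the angle theta."""
    r = a * math.cos(n * theta)
    return r * math.cos(theta), r * math.sin(theta)

def retraced_after_pi(a, n, samples=200):
    """True if the point at theta + pi coincides with the point at theta
    for every sampled theta in [0, pi)."""
    for k in range(samples):
        theta = math.pi * k / samples
        x1, y1 = rose_point(a, n, theta)
        x2, y2 = rose_point(a, n, theta + math.pi)
        if math.hypot(x1 - x2, y1 - y2) > 1e-9:
            return False
    return True
```

For $n = 3$ the check succeeds, because $\cos(3\theta + 3\pi) = -\cos 3\theta$ and the pair $(-r, \theta + \pi)$ names the same point as $(r, \theta)$. For $n = 2$ it fails, which is consistent with the four distinct leaves of Example III.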

Example V. The Lemniscate

$$r^2 = a^2\cos 2\theta. \tag{7-25}$$

The analysis here is similar to that of Example III, except that now the
upper and lower leaves cannot appear since $\cos 2\theta$ is negative for
$\pi/4 < \theta < 3\pi/4$ and $5\pi/4 < \theta < 7\pi/4$. (See Fig. 7-20.)
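The missing leaves can be seen by asking, for each $\theta$, whether a real $r$ exists at all. The following is a small sketch under that reading; `lemniscate_r` is an illustrative name, and it returns only the nonnegative root.

```python
import math

def lemniscate_r(a, theta):
    """Nonnegative r with r^2 = a^2 cos(2*theta), or None where no real r exists."""
    c = math.cos(2 * theta)
    if c < 0:
        return None  # cos(2*theta) < 0: the equation has no real solution for r
    return abs(a) * math.sqrt(c)
```

Sampling $\theta$ around a full turn and discarding the `None` values leaves only the angles near the $x$-axis, which is where the two loops of the lemniscate lie.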

Figure 7-21

Example VI. The Cardioid

$$r = a(1 - \cos\theta). \tag{7-26}$$

The resulting curve is illustrated in Fig. 7-21. The name, cardioid,
comes from the heart shape of the figure. The cardioid finds many uses
as examples in calculus courses.

Next we turn to the opposite problem: Given a locus, how do we find its
equation? Let us start with a simple one.

Example VII. The Straight Line

$$x = a.$$
Using relations (7-21), we see that every point of this line satisfies
$$r\cos\theta = a,$$
or
$$r = a\sec\theta. \tag{7-27}$$
An analysis such as done above can be made to show that (7-27) is indeed
the equation of this single straight line.

As our final example (which we do not label, since it will result in a
theorem), let us find the polar form for the equations of the conic sections.
To do this, we use the focus-directrix property of the conic sections. Let
the distance between the focus and the directrix be $p$, and let the
eccentricity be $e$. We can assume that the focus is at the origin and let $x = -p$
be the directrix. The equation of the conic section is then
$$|X| = e\,|XE|, \tag{7-28}$$
where $|XE|$ is the distance between the point $X$ and the directrix.
There are four possible cases which should be considered: where the
point $X$ is to the right or left of the directrix, and where $r = |X|$ or $-|X|$.
Let us consider only two of these here. The others will be left as exercises.
First, suppose $X$ is to the right of $x = -p$, and $r = |X|$. In this case, the
signed distance from $X$ to the $y$-axis is $X \cdot \mathbf{e}_1 = |X|\cos\theta = r\cos\theta$, and
hence $|XE| = p + r\cos\theta$. Equation (7-28) then becomes
$$r = e(p + r\cos\theta),$$
which can be solved for $r$ to give
$$r = \frac{pe}{1 - e\cos\theta}. \tag{7-29}$$
Next, suppose that $X$ is to the left of $x = -p$, and that $r = -|X|$.
The signed distance of $X$ from the $y$-axis in this case is
$|X|\cos(\theta + \pi) = -|X|\cos\theta = r\cos\theta$. This number is negative. (This case is illustrated
in Fig. 7-22 with the primed coordinates.) Thus
$$|XE| = -r\cos\theta - p,$$
and Eq. (7-28) becomes
$$-r = e(-r\cos\theta - p).$$
Solving for $r$ again results in (7-29).
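The focus-directrix property can be checked numerically for points produced by (7-29). The sketch below computes $|X| - e\,|XE|$ for a given $p$, $e$, and $\theta$, with the focus at the origin and the directrix $x = -p$ as in the text. The function names are illustrative, and the angle must avoid values where $1 - e\cos\theta = 0$.

```python
import math

def conic_r(p, e, theta):
    """Equation (7-29): r = p e / (1 - e cos(theta))."""
    return p * e / (1 - e * math.cos(theta))

def focus_directrix_residual(p, e, theta):
    """Return |X| - e * (distance from X to the line x = -p) for the point of (7-29)."""
    r = conic_r(p, e, theta)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return math.hypot(x, y) - e * abs(x + p)
```

The residual should vanish up to rounding whether $r$ comes out positive or negative, which matches the observation that both cases treated above lead to the same equation.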

Figure 7-22

Without bothering with the other cases, let us turn to Eq. (7-29) and
see what the actual locus of it will be. The above discussion shows that
every point on the locus of (7-29) will be on the conic, but the question is,
will the entire conic be covered?
In the case of an ellipse $0 < e < 1$, and it is easily seen that all the
points on the ellipse are obtained. For each $\theta$, (7-29) gives a positive value
for $r$, which is always less than $p$ when $\pi/2 < \theta < 3\pi/2$.
For a parabola, $e = 1$ and $r$ is undefined in (7-29) if $\theta = 0$. All other
values of $\theta$ between $0$ and $2\pi$ give positive values of $r$. Again when
$\pi/2 < \theta < 3\pi/2$, $r$ is less than $p$.
In the case of a hyperbola, $e > 1$, and $r$ is undefined when $\cos\theta = 1/e$.
Suppose $\cos\alpha = 1/e$ where $\alpha$ is between $0$ and $\pi/2$. Then for
$-\alpha < \theta < \alpha$, we see from (7-29) that $r$ is negative. Indeed we can verify that
the resulting point is to the left of the directrix. As $\theta$ varies within this
range, one entire branch of the hyperbola is swept out. When
$\alpha < \theta < 2\pi - \alpha$, we see that $r$ is positive in (7-29) and the remaining branch of the
hyperbola is produced.
A sketch of the hyperbola that results is shown in Fig. 7-23. Note that
the lines determined by $\theta = \pm\alpha$ and $\theta = \pm\alpha + \pi$ are not the asymptotes
of the hyperbola. They are parallel to the asymptotes, however, and are
shown dashed whereas the asymptotes are represented by solid lines in
Fig. 7-23.
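The way the sign of $r$ separates the two branches of the hyperbola can be observed numerically. The following sketch reports on which side of the directrix $x = -p$ the point given by (7-29) falls; the helper names are hypothetical and $\theta$ must avoid $\cos\theta = 1/e$.

```python
import math

def conic_point(p, e, theta):
    """Return (r, x, y) for the point given by r = p e / (1 - e cos(theta))."""
    r = p * e / (1 - e * math.cos(theta))
    return r, r * math.cos(theta), r * math.sin(theta)

def branch_side(p, e, theta):
    """Return 'left' or 'right' of the directrix x = -p for the point of (7-29)."""
    _, x, _ = conic_point(p, e, theta)
    return "left" if x < -p else "right"
```

With $e = 2$ (so that $\alpha = \pi/3$) and $p = 1$, angles such as $\theta = 0$ or $\theta = 0.5$ give negative $r$ and a point to the left of the directrix, while $\theta = \pi/2$ or $\theta = \pi$ give positive $r$ and a point to the right.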

Putting together the above observations, we find that we have proved
the following theorem.

Theorem 7-4. The equation in polar coordinates
$$r = \frac{pe}{1 - e\cos\theta}$$
has as its locus a conic section with eccentricity $e$, a focus at the origin,
and associated directrix $x = -p$.

PROBLEMS

1. If a locus is defined by the equation $r = f(\theta)$ and if $f(\theta)$ is periodic with
period $2\pi$, show that the locus has at most two points other than the origin
in common with any line through the origin.
2. What are the polar coordinates of the origin?
3. If relations (7-21) are used to transform an equation in cartesian coordinates
to one in polar coordinates, will any points of the locus be lost?
4. Sketch the Spiral of Archimedes,
$$r = a\theta, \quad a > 0.$$
What happens if $a$ is negative? Sketch this case also.
5. Find the equation in polar coordinates of the circle
$$(x - a)^2 + (y - b)^2 = R^2.$$
Can you solve for $r$ as a function of $\theta$ in the resulting equation? What
happens if $R^2 = a^2 + b^2$?
6. If $S$ is the locus of $r = f(\theta)$ and if the plane is rotated about the origin
through an angle $\alpha$, leaving the coordinate system fixed, so that the point
set $S$ becomes the point set $S'$, what is the equation of $S'$?
7. Sketch the loci of the following:
(a) $r = 2a\sin\theta$, $a > 0$ (b) $r = a\sin 2\theta$, $a > 0$
(c) $r = a(1 + \cos\theta)$, $a > 0$ (d) $r = a(1 - \sin\theta)$, $a > 0$
(e) $r = a\csc\theta$, $a > 0$.
8. Discuss the locus of
$$r = a\cos n\theta$$
(a) if $n$ is an even integer; (b) if $n$ is an odd integer.



9. What is the equation in polar coordinates of the line
$$x\cos\alpha + y\sin\alpha = p?$$
10. A locus is said to be symmetric with respect to the origin if whenever $X$
is on the locus then $Y$ is also on the locus where $Y = -X$. Prove that if
$f(\theta)$ is periodic with period $\pi$, then the locus of $r = f(\theta)$ is symmetric with
respect to the origin.
11. Prove that if $f(\pi - \theta) = f(\theta)$ for all $\theta$, then the locus of $r = f(\theta)$ is
symmetric with respect to the $y$-axis.
12. Prove that if $r$ is given as a function of $\theta$ which involves only $\sin\theta$, then the
resulting locus is symmetric with respect to the $y$-axis.
13. Prove that if $f(-\theta) = f(\theta)$ for all $\theta$, then the locus of $r = f(\theta)$ is symmetric
with respect to the $x$-axis.
14. Prove that if $r$ is given as a function of $\theta$ which involves only $\cos\theta$, then the
resulting locus is symmetric with respect to the $x$-axis.
15. If $f(-\theta) = -f(\theta)$ for all $\theta$, what are the symmetry properties of the locus
of $r = f(\theta)$? Note that $f(0) = 0$ because of this condition.
16. Let a function $f(\theta)$ be given and suppose that another function $g(\theta)$ is such
that $g(\theta) = -f(\theta + \pi)$. Prove that the polar-coordinate equations
$$r = f(\theta)$$
and
$$r = g(\theta)$$
have the same loci.
17. If an equation of a locus in polar coordinates is
$$r = f(\cos\theta),$$
what is the locus of the equation
$$r = -f(-\cos\theta)?$$
18. Use Eq. (7-28) to obtain $r$ as a function of $\theta$ for the two cases not discussed
in the text. How is the locus of the resulting equation related to the locus
of (7-29)?
19. Let $e > 1$, $\cos\alpha = 1/e$, with $0 < \alpha < \pi/2$. Prove that if $-\alpha < \theta < \alpha$,
then the point with polar coordinates $(r, \theta)$ defined by (7-29) is to the left of
the line $x = -p$.
20. What are the loci of the following for $e < 1$? $e = 1$? and $e > 1$?
(a) $r = \dfrac{pe}{1 + e\cos\theta}$ (b) $r = \dfrac{pe}{1 - e\sin\theta}$
(c) $r = \dfrac{pe}{1 + e\sin\theta}$

21. Identify and sketch the loci of
(a) $r = \dfrac{6}{3 + 2\cos\theta}$

22. Sketch the limaçon
$$r = a + b\cos\theta$$
(a) if $0 < a < b$;
(b) if $0 < b < a$.
23. Prove that the lemniscate (7-25) is the locus of all points with the property
that the product of their distances from the points $(a/\sqrt{2}, 0)$ and $(-a/\sqrt{2}, 0)$
is $a^2/2$. [Hint: Use the $x$-, $y$-coordinates of the point which are given by the
polar coordinates of the point.]
Answers to Selected Problems

Section 1-1
1. A = D QC C B = E; A = DC F
2. (a) {x | x2 < 2} (c) {x | x = 2k, k an integer}
(e) {x | x > 0 and x2 < 2}
3. (a) {x | 0 < x < 1} (b) 0 4. Yes

Section 1-2
1. Axiom 8 fails 6. All hold.
13. (a) Commutative and associative (c) The number 0 is an identity (on
the right). There are right and left hand inverses, a ° (—a/2) = 0,
(—2a) ° a = 0.
14. (b) closed under *, A, and °.

Section 1-4
2. (a) m is the maximum of A if and only if (i) m is the least upper bound
of A, and (ii) m is in A.
3. (b) y2 - 2 = (x2 - 2)2/4x2

Section 1-5
4. If and only if a and b are both nonnegative or both nonpositive. That
is, if and only if ab > 0.
7. (a) -5, 9 (c) —2, 3 (e) -1, V
8. (a) x > 8 and x < 2 (c) x > 1 and x < —J

Section 1-6
1. (a) -289 (c) 0 2. (a) -15 (c) 0
3. (a) —4, —15 (c) 4, —3, 3

7. (a) o 1 5
2
(c)

9. 5

Section 2-1
2. Symmetrically located with respect to: (i) the y-axis, (ii) the x-axis,
(iii) the origin.
3. (a) 4V2 (c) 2a/53 (e) 14
5. (a) (x — l)2 + (y - 2)2 = 16, x2 + y2 - 2x - 4y - 9 =0
(c) $(x + 5)^2 + (y - 3)^2 = 1$, $x^2 + y^2 + 10x - 6y + 33 = 0$
6. (a) $x^2 + (y - 2)^2 = 4$, $x^2 + y^2 - 4y = 0$
(c) $(x + 2)^2 + (y + 2)^2 = 64$, $x^2 + y^2 + 4x + 4y - 56 = 0$
7. (a) (—1, 2), 3 (c) No locus
8. (a) (-1, f), f (c) (-f, -3), f
9. The single point (xo, yo)
11. (a) x2 + y2 - 7x + lly = 0, (I, -V), V17Õ/2
13. (x - 3)2 + (y + 7)2 = 90
Section 2-2

2. $x - \dfrac{y}{m} - c = 0$. As $m$ becomes large, the equation tends toward
$x - c = 0$.
4. (a) y = 2x — 11, 2x — y - 11 = 0
(c) y = 5x + 10, 5x — y + 10 = 0
(e) y = -§x + ¥> 3x + 5y - 13 = 0
5. (a) y = —ix — x + 2y + 13 = 0
(c) y = —37>x + yfr * + 50 y — 1 = 0
6. (a) m = 2, 2x — y — 5 = 0 (c) no slope, x — 4 = 0
(e) m = 0, y — 5 = 0
7. (c) m = 1, x — y — 12 = 0 (e) m = —94, 94x + y — 690 = 0
Section 2-3
1. The line cannot be parallel to the $y$-axis. If $y = mx + b$ is the equation
of the line, then $f(x) = mx + b$.
2. $f(x) = [R^2 - x^2]^{1/2}$, domain is $\{x \mid -R \le x \le R\}$, image is
$\{y \mid 0 \le y \le R\}$.
Section 2-4
2. (a) Yes, $(x, y) \to (x + h + h', y + k + k')$ (b) Yes
(c) Yes, if $T$ is given by $(x, y) \to (x + h, y + k)$, then $T'$ is given by
$(x, y) \to (x - h, y - k)$.
4. {(x, y)\ |x — 2| + \y + 4| = 1}
Section 2-5
2. (a) t/2 (c) 3tt/2 (e) 3tt/2 (g) »/2
(i) r/4 (k) r/3 (m) 7tf/6 (O) 7T
3. (a) 360 a/2ir
4. (a) 90° (c) 270° (e) -90° (g) -270°
(i) 45° (k) 60° (m) 675° (o) 22860°
7. a + 0 = T + some integral multiple of 27t

Section 2-6
1. (a) $\alpha = k\pi$, $k$ any integer (c) $\alpha = 2k\pi + \pi/2$ (e) $\alpha = 2k\pi - \pi/2$
3. (a) —sin (2tt/5) (c) sin 0 (e) cos 0 (g) —cot (tt/6)
5. (a) —sin 23° (c) sin 5° (e) —tan 5° (g) —sec 35°

Section 2-7
1. (a) c = 7>/2 (c) b = 5 cot a (e) a = c cos /3
a b c cos a cos/? COS 7
37 13
3. (a) TÕ
43
27)
29
~TS
(c)
1
4. (a) 9 f ~15
13
(c) 5 fu 2$
5. (a) Imp ossible
(c) 3 j
5
(ei) 5 —T5
11
(«2) ¥ T5 T5
6. (a) 4 2
(c) Impossible

Section 3-1
1. (a) 7 (c) VÕ4 2- The only P°int is (1, 1, 2)
3. (a) x2 + y2 + z2 — Qx — 2y + 4? — 11 = 0
4. It is a sphere if $B^2 + C^2 + D^2 > 4AE$. The center is at $(-B/2A,
-C/2A, -D/2A)$. The radius is $[(B^2 + C^2 + D^2 - 4AE)/4A^2]^{1/2}$.
5. (a) (3, -1, -2), 3 (c) (f, -1, 2), x/301/6
6. x2 + y2 + z2 — 6x + 2z — 15 = 0, center (3, 0, —1), radius 5

Section 3-2
1. (a) (I, f, ?), (c) (-<3/9, -x/3/9, 5V3/9)
2. (a) (-5, 4, -4), <57, (-5/<57, 4/<57, -4/<57)
(c) (1, -1, 0), <2, (l/<2, —1/<2, 0)
6. (a) (g, 3, 5), (g, Jg-), (—g,
(c) (g, g, 1), (g, g, 1), (g, g, 1)

Section 3-3
1. $P = (a_1 + b_1, a_2 + b_2, a_3 + b_3)$
2. 100 cos 0, where 13 is the angle between the string and the vertical.

Section 3-4
4. (a) 7, 3, -4 (c) 0, 0, 0
5. 5, 0, —2 (c) 0, 0, 0 7. (a) 1, 1, -1
8. (a) If, -f, fl (c) [{j, 0, -Al 9- (b) All but P8

Section 3-5
2. (a) [|f, ff, ff] (d) ff, —
3. (a) [-H ff, -f|] (c) [0, 0, 0] (e) [ff, -ff, f£]
4. aiei, O2©2, 0363 5. (a) —(c) ST
7. [-2, 1, 0]

Section 3-6

7. This reduces to the triangle inequality for scalars.

Section 4-1
1. (a) x + 2y + 1z + 7 = 0 (c) z = 0
3. (a) 5x + y + z — 10 = 0, (0, 0, 10)
(c) x + y+2z = 0, (0, 0, 0)
4. (a) (0, 0, -5), [3, -1, 1] (c) (0, 0, 0), [2, 1, 4]
5. $2x - y + z - 4 = 0$. One of the three quantities $a$, $b$, or $c$ must be
nonzero. If $a \ne 0$, say, then there are really only three unknowns:
$b/a$, $c/a$, and $d/a$.

Section 4-2
1. (a) [-5, 8, -19] (c) [-4, -8, -4]

  ×     e1     e2     e3

  e1    0      e3    -e2
  e2   -e3     0      e1
  e3    e2    -e1     0

7. (a) 3x + y — z + 4 = 0 (c) x — z + 1 = 0
9. [2, —7, -1]

Section 4-3
1. (a) 22/x/83 (c) 1
2. If the sign is positive, the vector from the plane to the point is parallel
to the given normal vector. If the sign is negative, this same vector is
collinear with, but not parallel to, the given normal.

4. (a) [1 --^,3 1 + ^1 (c) [-2, 1, -4],


L V83 V83 V83j
6. (a) —3^, (c) — I (e) 0
9. Plane: 2x + 2y — z + 5 = 0, distance 6

Section 4—4
1. (a) X = [1, 2, 7] + t[4, 1, 6]
(c) X = [11, 12, 13] + t[9, 11, 14]
(e) X = [0, 1, 2] + t[0, 0, 1]
2. (a) X = [1, -1, 0] + t[-1, 5, 13]
(c) X = 41,1, -i]
(e) X = [-A, 1, -Ü1 + 444, -28, -4]
3. The value of t depends on the particular equation, but the point is:
(la) (0, J, (—7, 0, —5), (—V', t,
(lc) (0, —V, —V), (ff, 0, —ff), (fj, ff, 0), (le) —, —, (0, 1, 0)
(2a) (0, 4, 13), (f, 0, V). d> -1> 0), (2c) (0, 0, 0)
(2e) (0, fff, —fff), (ft, 0, —f|), (—ffi> "¥>
7. (a) t = -3, (-2, 9, -11) (c) t = -2, (3, 8, -15)
9. X = [1, -1, 2] + t[3, -2, 1]
11. (a) (5, 7, -10) (c) (7, 0, -3)
12. (a) X = [5, 7, -10] + t[16, 31, 9]
13. (a) V143/7 (c) V61/V65

Section 5-1

4. (a) [--16, -82, 36] (c) V276


7. 2
12. $V = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}$

14. $X = U \times B$ where $U$ is any vector such that $A \cdot U = -1$, hence the
complete solution is $X = B \times A/|A|^2 + cA$, for any $c$.
16. (a) (A • B)A - (A • A)B (d) (A • A)2(A X B)
(e) (A • A)2(A • B)A - (A • A)3B

Section 5-2
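7. $x = \dfrac{D \cdot B \times C}{A \cdot B \times C}$, $y = \dfrac{A \cdot D \times C}{A \cdot B \times C}$, $z = \dfrac{A \cdot B \times D}{A \cdot B \times C}$.
A solution exists if $D = 0$, or if $D \ne 0$ and $A$, $B$, and $C$ are not coplanar
(are linearly independent).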
11. (a) C = 4A - B (c) C = -5A + 6B
12. (a) D = 2A + 4B — C (c) D = 10A + 2B - 2C

Section 5-3

1. (a) [1, 3, 0], ~ [-3, 1, 0], [0, 0, 1]


V1Õ V1Õ
(c) -$= [2, 1, -3], [24, -9, 13], [-1, —7, -3]
V14 y/826 V59

2. B = aiui + Ü2U2 + «3^3, where «i, 02, and 03 are:


, , 4x/ÍÕ 2VIÕ „ 22
(a5- ’ -5- ’ 2 (c) 0,
a/59

z» , x / 1 <3 . —3,1 ,
6. (a) x' = —— x + —— y,y = —— x + —— y,z' = z
x/10 x/10 Víõ Vlõ
1 1 3 . 24
(c) I- -7= y —*, / = —= x
x/14 x/14 V826
, -17 3
z = —=x----- —y----- —z
V59 V59 \/59
7. (a) x' = fy + fz, y' = t%x — &y + ^z,

A = — fui + ffU2 — AU3*


9- (a) [—ff, —ff, —tfl (c) tA> —A’ Al
10. (a) [1, —1, 0], [0, 1, —1], [0, 0, 1] (c) [A> —1], [3^, —A> —1],
Í-A> A
* i]
Section 5-4
1. (a) [1, 2, 0] (c) ^[162, -105, 211]
2. (a) t%[3, -1, 10] (c) ^[54, 6, 38]
4. (a) Pi: x/íl/2, P3: x/1782/83 (b) L2: 20/x/õ9, L4: 2/\/6
Section 6-1
1. (a) 3x2 — (y — l)2 — (z — l)2 = 0 (c) y2 — x(z — 1) =0
2. (a) 3x2 — y2 + 2y — 2 = 0, hyperbola (c) x + y2 = 0, parabola
_ + 2bl_b2xy + (bl - \2)y2 = 0
5. (a) (&? - X2)x2
(c) B = [0, V2/2, \/2/2], X = x/2/2, for example, give x2 = 0.

Section 6-3
1. (a) b2 = a2(l — e2), c = ae, d = a/e
(c) b2 = a2 — c2, d = a2/c, e = c/a
(e) a2 = b2 + c2, d = (b2 + c2)/c, e = c/[b2 + c2]1/2
(g) a = c/e, b2 = c2(l — e2)/e2, d = c/e2
(i) a = de, b2 = d2e2(l — e2), c — de2
O fa\ — A
4. (a) F = (0, 3), (0, -3), PF = (0, 5), (0, -5), CV = (4, 0),
(—4, 0), e = f, dir: y = y = — V-
(c) F = (x/65, 0), (-x/65, 0), PV = (9, 0), (-9, 0), CV = (0, 4),
(0, —4), e = x/65/9, dir: x_= 81/VÕ5, x = -81/V65.
(e) F = (±1, 0), PV = (±V5, 0), CV = (0, ±2), e - l/<5,
dir: x = ±5
(g) F = (±x/Õ, 0), PF = (±x/10, 0), CV = (0, ±2), e = x/Iõ/õ,
dir:x = ±5V6/3

2 2
5. (a) +y - 1 (c) 24? + 25? = 216

(e) 24s2 + 40? - 960 (g) 9z2 + 25? = 144


6. (a) F = (±1, 0), PV = (±3, 0), CV = (0, ±2\/2), e = $, dir:
x = ±9
(c) F = (±f, 0), PV = (±3, 0), CV = (0, ±6?6/5), e = dir:
x = ±15
(e) F = (±4, 0), PV = (±V4Õ, 0), CV = (0, ±v^4), e = V1Õ/5,
dir:x= ±10
(g) F = (±^, 0), PV = (±4, 0), CV = (0, ±-^), e = f, dir:
x = ±5

Section 6-4
2. (a) pd = 8, cd = 6, V = (±8, 0), F = (±10, 0), e = f, dir: x =
±^, asym: y = ±3x/4
(c) pd = 3, cd = 2, V = (0, ±3), F = (0, ±V13), e = \/13/3,
dir: y = ±9/v/13, asym.: y = ±3x/2
(e) pd = 3, cd = 4, V = (±3, 0), F = (±5, 0), e = J, dir: x = ±f,
asym:?/ = ±f
(g) pd = 6, cd — 2, V = (±6, 0), F = (±2\/10, 0), c = V^ÍÕ/3,
dir: x = ±18/x/10, y = ±2x/6
2
X 25x2 _ 25y2 =
3. (a) = 1, 64 336 “ ’
4. The equation is
= -1.
The two hyperbolas have the same asymptotes.
9. Any pair determine a unique hyperbola:
(a, b) c2 = a2 + ò2, d2 = a4/(a2 + b2), e2 = (a2 + b2)/a2
(a, d) b2 = a2(a2 — d2)/d2, c = a2/d, e = a/d
(b, e) a2 = b2/(e2 — 1), c2 = b2e2/(c2 — 1), d2 = b2/c2(e2 — 1)
If b and d are given, then e must satisfy the equation
e4 — e2 — b2/d2 = 0.
This equation is quadratic in e2. One of the two possible solutions is
negative. Hence there is only one e which can satisfy this equation.

Section 6-5
axis directrix focus
(a) x = 0, V = —4, (0, 4)
(c) x = 0 y = i, (0, -4)
(e) x = 0, y = ts< (o, ~ A)
(g) x = 0 V = —À. (o,A)

2. (a) y2 = Sx (c) y2 = 2x (e) x2 = —4^/3 (g) y2 = 2x


(i) x2 = —12y/5
4. (a) 9x2 - 400x + 16i/2 - 3001/ - 24xt/ = 0
(c) x2 — 4x + y2 — 121/ — 2xy + 20 = 0

Section 6-6
1. (a) hyperbola, C = (1,= 3), F = (1, 8), (1, -2), V = (1, 7), (1, -1),
pr axis: x = 1, conj axis: y = 3, dir: y = y = —asym: 4x —
31/ + 5 = 0, 4x + 3y - 13 = 0, e = f
(c) ellipse, C = (-1, 7), F = (-1, 11), (-1, 3), PV = (-1, 12), (-1,
2), CV = (2, 7), (—4, 7), pr axis: x = —1, conj axis: y = 7,
dir: i/ = -^, i/ = f, e = f
(e) ellipse, C = (3, -1), F = (6, -1), (-2, -1), PV = (8, -1),
(—2, —1), CV = (3, 3), (3, —5), pr axis: y = —1, conj axis: x = 3,
dir: x = x = —e =
(g> parabola, V = (j, 2), F = (2, 2), axis: y = 2, dir: x = —1
(i) circle, center (3, —J), radius J
3. (a) (* ~ 3)2 + ~2)2 = 1 (c) (x - 7)2 = 4(2/ - 5),

/e) - 3>2 _ (y+1>2 = _i


(’ 1/4 4
4. $y^2 = -(1 - e^2)x^2 + 2(1 + e)x$; as $e \to 0$, the equation becomes
$(x - 1)^2 + y^2 = 1$, which is the equation of a circle of radius 1 with center (1, 0);
as $e \to 1$, the equation becomes $y^2 = 4x$, which is the equation of a parabola.

Section 7-1
2. $x = x'\cos\theta - y'\sin\theta$, $y = x'\sin\theta + y'\cos\theta$
4. $y = \dfrac{a^2b^2 - (b^2 - a^2)x^2}{2abx}$. 5. $160x^2 - 24xy + 153y^2 = 24{,}336$.
Section 7-2
3. (a) parabola, sin 0 = f, cos 0 = f,
25s'2 + 28x' — 1041/ + 100 = 0
(c) hyperbola, 0 = ?r/6,
2x'2 — 2y'2 = —1
4. Maximum when 0 = 0, minimum when 0 = 0 + 7r/2

Section 7-3
2. The planes are made up of the asymptotes of the hyperbolas.
4. (a) elliptic cone (c) ellipsoid (e) hyperbolic paraboloid
(g) parabolic cylinder

5. (a) hyperboloid of one sheet (c) two planes (e) circular cone
(g) prolate spheroid
6. (a) A hyperboloid of two sheets centered at (—3, 0, 1). Vertices at
(-3, 1, 1) and (-3, -1, 1).
(c) An ellipsoid with center at (2, —2, 2).
(e) Hyperboloid of two sheets with center at the origin and vertices at
(0, 4, 0) and (0, —4, 0).
(g) A hyperbolic cylinder with axis x = —3, z = 0.

Section 7-4
2. $r = 0$ and any $\theta$ 6. $r = f(\theta - \alpha)$.
9. $r = p\sec(\theta - \alpha)$ 17. It is the same
21. (a) Ellipse (b) parabola (c) hyperbola (d) ellipse
Index
absolute value, 20 conjugate hyperbolas, 208 (Prob. 4)
addition, 6 conjugate vertices, 197
addition formulas, 76 coordinate, 18, 34
addition law, 12, 14 coordinate axis, 18, 33
alternative order axioms, 14 coordinate line, 18, 33
angle, 56-62, 136 coordinate system, 162
anticommutative, 128, 129 coordinate vectors, 108, 159-164
arc length, 58 coplanar, 153
associative law, 7, 128, 101 cosecant, 69
asymptote, 51, 204, 205 cosine, 63, 110
axiom, 2, 6 cotangent, 69
axis, 179, 196, 197, 204, 210, 211 cross product, 126-132
axis of rotation, 165 (Prob. 8) cylinder, 233, 238

bilinearity, 112, 129 Dandelin, G. P., 186


binary operation, 6 Dandelin sphere, 187
binary relation, 6 decimal expansion, 17
bounded, 18 degree, 58, 62, 63 (Prob. 5)
bound vector, 96 determinant, 25-31
difference, 98
cancellation law, 112 directed line segment, 90, 96
cardioid, 245 direction cosines, 89, 90
cartesian coordinates, 80 direction numbers, 88, 90
cartesian plane, 33 directrix, 188, 192, 202, 210
Cauchy-Schwarz inequality, 115 distance, 19, 35, 83
center, 84, 196, 197, 204, 241 distance formulas, 133-136, 169-172
(Prob. 9) distributive law, 7, 129
central quadric surface, 241 (Prob. 9) domain, 46
circle, 35, 181, 184, 243 dot product, 110
closure, 7, 8 dual basis, 165 (Prob. 10)
cofactor, 29 Dürer, Albrecht, 196
collinear, 100, 131, 152
commutative law, 7, 101 eccentricity, 188
completeness, 16 element, 2
completeness axiom, 18 ellipse, 181, 184, 186, 191-199, 228
complex numbers, 12 ellipsoid, 236
component, 95 elliptic cone, 236
cone, 179, 237, 238 elliptic cylinder, 234
conic section, 179-185, 248 elliptic paraboloid, 234
conjugate axis, 196, 197, 204 empty set, 4
conjugate dimension, 197, 204, 205 equality, 6

equivalent, 36 line, 39, 92, 138, 246


euclidean space, 104 linear dependence and independence,
extended Lagrange Identity, 148 156, 173
linearity property, 107
field, 6-11 linear space, 102
field axioms, 7 locus, 36
focal dimension, 197, 204
focus, 186, 192, 202, 210 magnitude, 95
formal proof, 10 major axis, 196
formula, 46 mapping, 48
free vector, 96 mathematical model, 1, 82
function, 46 maximum, 19 (Prob. 2)
measure, 59, 60
generator, 174, 180 measuring line, 16
geometric angle, 57 method of replacement, 175
Gram-Schmidt orthogonalization, 178 midpoint, 91
graph, 48 mil, 58, 63 (Prob. 6)
group, 101 minor, 29
minor axis, 196
half angle, 179 multiplication, 6,
Hilbert, D., 88 multiplication law, 12, 14
Hilbert space, 112
homogeneity property, 128 nappe, 179
hyperbola, 183, 184, 187, 200-207, 228 nondegenerate conic section, 184
hyperbolic cylinder, 235 normal form, 137 (Prob. 5)
hyperbolic paraboloid, 235 null set, 4
hyperboloid of one sheet, 237 number line, 15
hyperboloid of two sheets, 236
oblate spheroid, 238
identities, 145-150 order, 6, 12
identity, 7, 101 order axioms, 12, 14
image, 46 ordered pair, 46
inclination, 56 ordered triple, 82
initial side, 57 order relation, 12
inner product, 110 orientation, 57
inner product space, 112 oriented geometric angle, 57
intercept, 41 origin, 18, 34, 82
inverse, 7, 101 orthogonal, 110, 121
irrational numbers, 15, 17 orthogonalization, 178
orthogonal set, 160
Lagrange’s identity, 148 orthonormal set, 160
law of cosines, 75, 114 (Prob. 11)
law of sines, 73, 151 (Prob. 17) parabola, 183, 184, 210-212, 228
least upper bound, 18 parabolic cylinder, 233
lemniscate, 248, 250 (Prob. 23) paraboloid, 238
length, 90 parallel, 100, 121
limaçon, 250 (Prob. 22) parallelogram of forces, 94

parallel translation, 84 secant, 69


parametric equations, 139 set, 2-4
periodic functions, 63, 64, 249 sine, 63
(Prob. 10) slope, 41
perpendicular, 110 solution set, 36
plane, 120-124 sphere, 84
point, 82, 86 spheroid, 238
polar coordinates, 241-248 spiral of Archimedes, 248 (Prob. 4)
polynomial, 49 straight line, 39, 92, 138, 246
positive definite, 112 subset, 3
principal axis, 196, 197, 204 sum, 6, 94, 98
principal dimension, 197, 204 summation symbol, 109
principal vertices, 197 surface of revolution, 237
product, 6 symmetric equations, 139, 143
projection, 72, 106-113, 166-172, (Prob. 6)
173 (Prob. 6) symmetric property, 112
prolate spheroid, 238 symmetry, 195, 249 (Prob. 10)
Pythagoreans, 15
Pythagorean theorem, 34, 83, 168 tangent, 69
terminal side, 57
quadratic equations, 213-218, three-dimensional space, 80, 82
224-230, 231 transitive law, 12
quadric surfaces, 231-240 translation, 52, 53, 84, 85, 213-215
transpose, 26
radian, 58, 63 (Prob. 5) triangle, 71
radius, 35, 84 triangle inequality, 22, 114-118
range, 46 trichotomy law, 12, 14
rational function, 49 trigonometric addition formulas, 76,
rational numbers, 10 (Prob. 2), 15, 17 222
ray, 57, 88 trigonometric functions, 63-69
real numbers, 5 triple cross product, 145
reflection, 52 triple of numbers, 96
right circular cone, 179 truth set, 36
right-handed coordinate system, 82
right-handed system of vectors, 164 uniqueness, 7
right-hand rule, 128 upper bound, 18
right triangle, 71
rigid motion, 52 value, 47, 59, 60
rose, 244, 245 vector, 93, 95
rotation, 52, 82, 162, 220-223 vector addition, 94, 98
rule, 46 vector identities, 145-150
vector space, 96, 102, 104
scalar, 93, 97 vertex, 179, 197, 204, 210, 211
scalar multiple, 93, 97 void set, 4
scalar product, 110
scalar triple product, 129, 150 zero vector, 95