Anderson_M2B_Lesson_5
3 Inner Products
Notice that the dot product is a function with domain $\mathbb{R}^n \times \mathbb{R}^n$ and codomain
$\mathbb{R}$. The output of the dot product between any two vectors is a real number.
EXAMPLE 2.3.1
Recall our vector model for storing your performance on all graded assessments in
Math 2B from Example 2.1.2. In this model, we stored our individual performance
data in the following $4 \times 1$ vector
\[
\mathbf{g} = \begin{bmatrix} \dfrac{q}{300} \\[2ex] \dfrac{e_1}{100} \\[2ex] \dfrac{e_2}{100} \\[2ex] \dfrac{f}{100} \end{bmatrix}
\]
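where each entry is the number of points earned in one grade category divided by
the number of points available in that category.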
In order to calculate our final grade using vector g we need to know the grade-
category weights assigned to each grade category. In Math 2B, these weights were
given in the course syllabus as follows:
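\[
\mathbf{c} = \begin{bmatrix} 0.10 \\ 0.25 \\ 0.25 \\ 0.40 \end{bmatrix}
\]
The final percent score is then the dot product
\[
p_f = \mathbf{g} \cdot \mathbf{c} = 0.10 \, \frac{q}{300} + 0.25 \, \frac{e_1}{100} + 0.25 \, \frac{e_2}{100} + 0.40 \, \frac{f}{100}
\]
where $p_f$ is the final percent score you earn in Math 2B and is used to calculate
your final grade based on the grade scale included in the course syllabus.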
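The grade calculation above can be sketched in a few lines of Python. This is only an illustration: the raw scores below are hypothetical values chosen for the example, and the helper name `dot` is an assumption, not something defined in these notes.

```python
def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical raw scores (assumed for illustration only).
q, e1, e2, f = 270, 85, 90, 80

# Performance vector g: points earned divided by points available.
g = [q / 300, e1 / 100, e2 / 100, f / 100]

# Grade-category weights c, taken from the weighted sum in the example.
c = [0.10, 0.25, 0.25, 0.40]

# Final percent score as the dot product g . c (a number between 0 and 1).
p_f = dot(g, c)
print(f"final percent score: {100 * p_f:.2f}%")
```

Because the weights sum to 1, a perfect score in every category gives a final percent score of exactly 1.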
EXAMPLE 2.3.2
Let’s suppose that we want to use Riemann integration to find the exact integral
of $f(x) = \cos(x)$ on the interval $[0, 2\pi]$. By our study of integral calculus, we know
we can evaluate this integral analytically using the definition
\[
\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,h
\]
where
\[
h = \frac{b-a}{n}, \quad \text{and} \quad x_i = a + i \cdot h.
\]
Because the cosine function has a closed-form antiderivative (namely sin(x)), we
can find our area exactly.
However, for a very wide class of problems, this theoretic definition is not
entirely helpful. For any function whose antiderivative cannot be written in terms of
elementary functions (e.g. $e^{-x^2}$), integral calculus does not give us a closed-form
solution for the integral.
We can instead attempt to numerically approximate the definite integral using
Riemann sums. For example, we can choose an approximation scheme in which we
discretize our domain into n equally spaced intervals and sample our function
at the discrete endpoints of each interval, just as in Example 2.1.4. If we desire
high accuracy, we can use a very fine discretization (which requires more time and
energy from the technician to compute).
In this example, let's discretize our interval $[0, 2\pi]$ at $n = 20$ points. The
associated Riemann sum approximation to our integral can be visualized as follows:
[Figure: Riemann sum approximation of cos(x) on [0, 2π] using n = 20 rectangles.]
© Jeffrey A. Anderson, vS20190403
In this case, we've used the left-hand rule: the height of each rectangle touches the
graph on the rectangle's left-hand side. As we recall, there are other methods we can use
to approximate definite integrals. Recall from Integral Calculus that a general
approach to numerical approximation proceeds as follows:
\[
\int_a^b f(x)\,dx \approx \sum_{i=1}^{n} f(x_i^*)\,h
\]
where $x_i^* \in [x_{i-1}, x_i]$ is any point in the $i$th subinterval and step size $h = \frac{b-a}{n}$ is
uniform for each interval.
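Notice that our approximation scheme can be written as the dot product
\[
\sum_{i=1}^{n} f(x_i^*)\,h = \mathbf{f} \cdot \mathbf{h}
\]
where
\[
\mathbf{f} = \begin{bmatrix} f(x_1^*) \\ f(x_2^*) \\ \vdots \\ f(x_n^*) \end{bmatrix}, \quad \text{and} \quad \mathbf{h} = h \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.
\]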
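As a rough illustration of the dot product form of a Riemann sum, the sketch below builds the vector of samples and the vector of step sizes and takes their dot product. The function names `dot` and `riemann_dot` and the `rule` parameter are assumptions made for this example; the rule names select where in each subinterval the sample point is taken.

```python
import math

def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))

def riemann_dot(func, a, b, n, rule="left"):
    """Approximate the integral of func on [a, b] as the dot product f . h.

    rule picks the sample point in each subinterval: "left", "mid", or "right".
    """
    offsets = {"left": 0.0, "mid": 0.5, "right": 1.0}
    h = (b - a) / n                      # uniform step size
    shift = offsets[rule]
    # Samples f(x_i^*), one per subinterval.
    f_vec = [func(a + (i + shift) * h) for i in range(n)]
    # Step-size vector: h times the all-ones vector.
    h_vec = [h] * n
    return dot(f_vec, h_vec)

# Left-hand rule for cos(x) on [0, 2*pi] with n = 20, as in the example.
approx = riemann_dot(math.cos, 0.0, 2.0 * math.pi, 20)
exact = math.sin(2.0 * math.pi) - math.sin(0.0)
print(f"approx = {approx:.6f}, exact = {exact:.6f}")
```

Calling `riemann_dot` with a larger `n`, or with `rule="mid"`, is one way to explore how the discretization affects accuracy.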
The choice of the points $x_i^* \in [x_{i-1}, x_i]$ depends on the situation. The three most
rudimentary techniques are the left-endpoint, right-endpoint, and midpoint rules.
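Theorem 8: Algebraic Properties of the Inner Product
Let $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{R}^n$ and $a, b \in \mathbb{R}$. Then the inner product satisfies:
i. Bilinearity: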
a. Linearity in left argument: (ax + by) · z = a(x · z) + b(y · z)
b. Linearity in right argument: x · (ay + bz) = a(x · y) + b(x · z)
ii. Symmetry: x · y = y · x
iii. Positivity: x · x > 0 when x ≠ 0, while 0 · 0 = 0.
Below, we will prove part (i) subpart (a) of Theorem 8. The other proofs are
left to the reader as an exercise.
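Proof. Let $a, b \in \mathbb{R}$ and let
\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}, \quad
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
\mathbf{z} = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}.
\]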
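Now consider:
\[
\begin{aligned}
(a\mathbf{x} + b\mathbf{y}) \cdot \mathbf{z}
&= \begin{bmatrix} ax_1 + by_1 \\ ax_2 + by_2 \\ \vdots \\ ax_n + by_n \end{bmatrix} \cdot \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} \\
&= \sum_{j=1}^{n} (ax_j + by_j) z_j \\
&= a \sum_{j=1}^{n} x_j z_j + b \sum_{j=1}^{n} y_j z_j \\
&= a\, \mathbf{x} \cdot \mathbf{z} + b\, \mathbf{y} \cdot \mathbf{z}.
\end{aligned}
\]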
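A quick numerical spot check of these algebraic properties can help build confidence in the proof, although checking one example is not a proof. The vectors and scalars below are arbitrary integers chosen for illustration, and the helper names `dot` and `lin_comb` are assumptions.

```python
def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))

def lin_comb(a, x, b, y):
    """Entrywise linear combination a*x + b*y."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]

# Arbitrary integer test data (exact arithmetic, no rounding).
x = [1, -2, 3]
y = [4, 0, -1]
z = [2, 5, 7]
a, b = 3, -2

# Linearity in the left argument: (ax + by) . z = a(x . z) + b(y . z)
lhs = dot(lin_comb(a, x, b, y), z)
rhs = a * dot(x, z) + b * dot(y, z)
print("left linearity holds:", lhs == rhs)

# Symmetry: x . y = y . x
print("symmetry holds:", dot(x, y) == dot(y, x))

# Positivity: x . x > 0 for nonzero x, and 0 . 0 = 0
print("positivity holds:", dot(x, x) > 0 and dot([0, 0, 0], [0, 0, 0]) == 0)
```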
The algebraic properties of the inner product come in very handy when we
construct solutions to our four fundamental problems in linear algebra. For example,
bilinearity of the inner product guarantees that we can interpret matrix-matrix
multiplication in many ways. The triangle inequality provides deep intuition for
constructing solutions to the least-squares problem.
Remark: Some texts refer to the two-norm as the Euclidean norm. From this
point forward, let us denote
\[
\|\mathbf{x}\| = \|\mathbf{x}\|_2 .
\]
In general, there are many other vector norms we can consider. However, because
the 2-norm is by far the most powerful from the standpoint of introductory linear
algebra, we will focus our attention here.
Using this definition, we can prove a number of interesting facts about the two-
norm of a vector, as listed below.
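We begin with positivity:
\[
\|\mathbf{x}\| \ge 0.
\]
For any $y \in \mathbb{R}$, recall that if $y \ge 0$, then $\sqrt{y} \ge 0$. This remains true for
$y = x_1^2 + x_2^2 + \cdots + x_n^2$. In other words, if we can prove that
$\|\mathbf{x}\|^2 = x_1^2 + x_2^2 + \cdots + x_n^2 \ge 0$ for
all choices of $\mathbf{x}$, we can conclude that $\|\mathbf{x}\| \ge 0$. However, by the positivity property
of the inner product, we know $\mathbf{x} \cdot \mathbf{x} \ge 0$. Since $\|\mathbf{x}\|^2 = \mathbf{x} \cdot \mathbf{x}$, we see immediately
that $\|\mathbf{x}\|^2 \ge 0$ and we have $\|\mathbf{x}\| \ge 0$. This is what we wanted to show.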
Let’s continue with the homogeneity property. To this end consider the scalar-
vector multiplication
\[
a\mathbf{x} = a \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} ax_1 \\ ax_2 \\ \vdots \\ ax_n \end{bmatrix}
\]
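With this in mind, we have
\[
\begin{aligned}
\|a\mathbf{x}\| &= \sqrt{(ax_1)^2 + (ax_2)^2 + \cdots + (ax_n)^2} \\
&= \sqrt{a^2 \cdot (x_1^2 + x_2^2 + \cdots + x_n^2)} \\
&= \sqrt{a^2} \cdot \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} \\
&= |a| \cdot \|\mathbf{x}\|
\end{aligned}
\]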
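For the triangle inequality, recall that the square root function is increasing: if
$0 \le t_1 \le t_2$, then $\sqrt{t_1} \le \sqrt{t_2}$. If we can show that
$t_1 = \|\mathbf{x} + \mathbf{y}\|^2 \le (\|\mathbf{x}\| + \|\mathbf{y}\|)^2 = t_2$,
then we can conclude $\sqrt{t_1} = \|\mathbf{x} + \mathbf{y}\| \le \|\mathbf{x}\| + \|\mathbf{y}\| = \sqrt{t_2}$, which is the triangle
inequality. Consider: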
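\[
\begin{aligned}
\|\mathbf{x} + \mathbf{y}\|^2 &= (\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} + \mathbf{y}) \\
&= \mathbf{x} \cdot (\mathbf{x} + \mathbf{y}) + \mathbf{y} \cdot (\mathbf{x} + \mathbf{y}) \\
&= \mathbf{x} \cdot \mathbf{x} + \mathbf{x} \cdot \mathbf{y} + \mathbf{y} \cdot \mathbf{x} + \mathbf{y} \cdot \mathbf{y} \\
&= \|\mathbf{x}\|^2 + 2\, \mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2 \\
&= \|\mathbf{x}\|^2 + 2 \sum_{i=1}^{n} x_i y_i + \|\mathbf{y}\|^2 \\
&\le \|\mathbf{x}\|^2 + 2\, \|\mathbf{x}\|\, \|\mathbf{y}\| + \|\mathbf{y}\|^2 \\
&= (\|\mathbf{x}\| + \|\mathbf{y}\|)^2
\end{aligned}
\]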
Notice, the proof above depends heavily on the algebraic properties of the inner
product and the inner product formula for the 2-norm. Also, the second to last
expression requires that we know that $\sum_{i=1}^{n} x_i y_i \le \|\mathbf{x}\|\, \|\mathbf{y}\|$. This is the famous
Cauchy-Schwarz Inequality and follows from the cosine formula for the inner product
that we discuss below.
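The three norm properties discussed above (positivity, homogeneity, and the triangle inequality), along with the Cauchy-Schwarz Inequality, can be spot-checked numerically. The sketch below is only a sanity check on one hypothetical pair of vectors, not a proof; the helper names `dot` and `norm` are assumptions.

```python
import math

def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(x):
    """Two-norm of x, computed from the inner product as sqrt(x . x)."""
    return math.sqrt(dot(x, x))

# Hypothetical test data chosen for illustration.
x = [3.0, -4.0]
y = [1.0, 2.0]
a = -2.5

scaled = [a * xi for xi in x]                 # the vector a*x
total = [xi + yi for xi, yi in zip(x, y)]     # the vector x + y

print("positivity:      ", norm(x) >= 0)
print("homogeneity:     ", math.isclose(norm(scaled), abs(a) * norm(x)))
print("triangle ineq.:  ", norm(total) <= norm(x) + norm(y))
print("Cauchy-Schwarz:  ", abs(dot(x, y)) <= norm(x) * norm(y))
```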
In addition to these algebraic statements, we can also greatly benefit from the
study of a geometric property of the inner product. In our discussion of geometry,
we will need a few background results including the Pythagorean theorem and the
law of cosines.
Proof. Let's begin by visualizing our right triangle and labeling the lengths of the sides
as indicated in the theorem statement. Further, let's introduce variables $\theta$ and $\phi$
to represent the two acute angles of our triangle as detailed below:
[Figure: right triangle with legs a and b, hypotenuse c, and acute angles θ and φ.]
We know the three interior angles of our triangle sum to $180^\circ = 90^\circ + \theta + \phi$.
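Using four copies of this triangle, let's construct a special quadrilateral.
[Figure: four copies of the right triangle arranged inside a square with side length a + b, leaving a center square with side length c.]
The area of the large square is
\[
(a + b)^2 = a^2 + 2ab + b^2 .
\]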
If we take away each of the four triangles and leave only the center square then we
know
\[
\begin{aligned}
c^2 &= (a + b)^2 - 4 \left( \frac{1}{2} ab \right) \\
&= a^2 + 2ab + b^2 - \frac{4}{2} ab \\
&= a^2 + b^2
\end{aligned}
\]
We conclude that $a^2 + b^2 = c^2$ as was to be shown.
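\[
c^2 = a^2 + b^2 - 2ab \cos(\theta)
\]
Case I: The Acute Case $\left( 0 < \theta < \frac{\pi}{2} \right)$
[Figure: triangle with sides a, b, and c, where θ is the angle between sides a and b. The altitude of height h splits side a into segments of length x and a − x.]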
In this case we have $x^2 + h^2 = b^2$. Further we see
\[
\begin{aligned}
c^2 &= (a - x)^2 + h^2 \\
&= a^2 - 2ax + x^2 + h^2 \\
&= a^2 - 2ax + b^2
\end{aligned}
\]
Since $x = b \cos(\theta)$ by the definition of cosine as the ratio of the length of the
adjacent side to the length of the hypotenuse, we see
\[
c^2 = a^2 + b^2 - 2ab \cos(\theta)
\]
which is what we wanted to show.
Case II: The Obtuse Case $\left( \frac{\pi}{2} < \theta < \pi \right)$
Below we draw the relevant image for Case II.
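[Figure: triangle for the obtuse case, with side a extended by a segment of length x to the foot of the altitude of height h. The exterior angle is π − θ.]
In this case we again have $x^2 + h^2 = b^2$. Further we see
\[
\begin{aligned}
c^2 &= (a + x)^2 + h^2 \\
&= a^2 + 2ax + x^2 + h^2 \\
&= a^2 + b^2 + 2ax
\end{aligned}
\]
Here $x = b \cos(\pi - \theta)$. Since $\cos(\pi - \theta) = \cos(\pi)\cos(\theta) + \sin(\pi)\sin(\theta) = -\cos(\theta)$,
we can rewrite this as $x = -b \cos(\theta)$. Then we conclude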
\[
\begin{aligned}
c^2 &= (a + x)^2 + h^2 \\
&= a^2 + 2ax + x^2 + h^2 \\
&= a^2 + b^2 - 2ab \cos(\theta)
\end{aligned}
\]
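As a numerical sanity check on both cases, the sketch below places a triangle in the plane, measures the third side directly from coordinates, and compares its square against the law of cosines. The side lengths, angles, and function names (`third_side`, `law_of_cosines_rhs`) are assumptions chosen for illustration.

```python
import math

def third_side(a, b, theta):
    """Length c of the side opposite the angle theta, measured from coordinates."""
    # Put the vertex with angle theta at the origin and side a along the x-axis.
    px, py = a, 0.0
    qx, qy = b * math.cos(theta), b * math.sin(theta)
    return math.hypot(px - qx, py - qy)

def law_of_cosines_rhs(a, b, theta):
    """Right-hand side of the law of cosines: a^2 + b^2 - 2ab cos(theta)."""
    return a**2 + b**2 - 2.0 * a * b * math.cos(theta)

# One acute angle (Case I) and one obtuse angle (Case II).
for theta in (math.pi / 3, 2 * math.pi / 3):
    c = third_side(3.0, 4.0, theta)
    agrees = math.isclose(c**2, law_of_cosines_rhs(3.0, 4.0, theta))
    print(f"theta = {theta:.4f}: c^2 = {c**2:.6f}, agrees = {agrees}")
```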
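Theorem 11: Cosine Formula for Inner Product
Let $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ be nonzero vectors and let $\theta$ be the angle between them. Then
\[
\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\|\, \|\mathbf{y}\| \cos(\theta).
\]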
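Proof. Case I: Assume x and y are not scalar multiples of each other. Suppose we
begin with two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$. Consider the triangle defined by these vectors.
The length of each side of this triangle can be given by the 2-norm of the vectors:
[Figure: triangle with sides of length ‖x‖, ‖y‖, and ‖x − y‖, where θ is the angle between the sides of length ‖x‖ and ‖y‖.]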
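Recall, using the algebraic properties of the inner product, we can write
\[
\begin{aligned}
\|\mathbf{x} - \mathbf{y}\|^2 &= (\mathbf{x} - \mathbf{y}) \cdot (\mathbf{x} - \mathbf{y}) \\
&= \mathbf{x} \cdot (\mathbf{x} - \mathbf{y}) - \mathbf{y} \cdot (\mathbf{x} - \mathbf{y}) \\
&= \mathbf{x} \cdot \mathbf{x} - \mathbf{x} \cdot \mathbf{y} - \mathbf{y} \cdot \mathbf{x} + \mathbf{y} \cdot \mathbf{y} \\
&= \|\mathbf{x}\|^2 - 2\, \mathbf{x} \cdot \mathbf{y} + \|\mathbf{y}\|^2
\end{aligned}
\]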
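Applying the law of cosines to the same triangle gives
\[
\|\mathbf{x} - \mathbf{y}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 - 2\, \|\mathbf{x}\|\, \|\mathbf{y}\| \cos(\theta).
\]
By setting these two expressions for $\|\mathbf{x} - \mathbf{y}\|^2$ equal and canceling out the appropriate
terms using our knowledge of arithmetic, we see
\[
\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\|\, \|\mathbf{y}\| \cos(\theta).
\]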
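Case II: Assume x and y are scalar multiples of each other (i.e. y = ax). In this
case we know that the angle between our vectors is either $\theta = 0$ or $\theta = \pi$. In either case,
\[
\begin{aligned}
\mathbf{x} \cdot \mathbf{y} &= \mathbf{x} \cdot (a\, \mathbf{x}) \\
&= a\, \mathbf{x} \cdot \mathbf{x} \\
&= a \|\mathbf{x}\|^2 \\
&= a \|\mathbf{x}\|\, \|\mathbf{x}\|
\end{aligned}
\]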
Recall that the sign function f(x) = sgn(x) is a piecewise function defined as follows:
\[
\operatorname{sgn}(x) = \begin{cases} 1 & \text{if } x > 0, \\ 0 & \text{if } x = 0, \\ -1 & \text{if } x < 0. \end{cases}
\]
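Then, for any scalar $a \in \mathbb{R}$, we can write $a = \operatorname{sgn}(a)\, |a|$. Moreover, since $|a| = \sqrt{a^2}$,
we see
\[
\begin{aligned}
\mathbf{x} \cdot \mathbf{y} &= \operatorname{sgn}(a) \sqrt{a^2} \sqrt{\sum_{i=1}^{n} x_i^2}\; \|\mathbf{x}\|_2 \\
&= \operatorname{sgn}(a) \sqrt{\sum_{i=1}^{n} (a x_i)^2}\; \|\mathbf{x}\|_2 \\
&= \operatorname{sgn}(a)\, \|\mathbf{y}\|_2\, \|\mathbf{x}\|_2
\end{aligned}
\]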
The more nearly parallel two vectors are (the closer $\theta$ is to zero), the closer the dot
product is to the product of the norms of the two vectors. In contrast, if
$\theta \approx \frac{\pi}{2}$, the dot product is close to zero.
Using this interpretation, we can think of the inner product between vectors
as giving a measurement of "parallelity." The larger the magnitude of the inner
product between two vectors, the more parallel these vectors are, while the smaller
the magnitude, the less parallel. Of course, the magnitudes of each vector come
into play here, as indicated in the cosine formula.
Orthogonality plays a major role in applied linear algebra and will be the theme
of many techniques we develop to solve least-squares problems and linear systems
problems. The cosine formula for the dot product gives us a powerful tool to enforce
orthogonality between two vectors by guaranteeing that the inner product of two
non-zero vectors is zero if and only if the vectors are orthogonal.
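To make the cosine formula concrete, the sketch below solves it for the angle and uses the dot product as an orthogonality test. The function names `angle_between` and `is_orthogonal` and the tolerance `tol` are assumptions for this illustration. The clamp on the cosine value guards against floating-point rounding pushing it slightly outside [-1, 1].

```python
import math

def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(x):
    """Two-norm of x, computed as sqrt(x . x)."""
    return math.sqrt(dot(x, x))

def angle_between(x, y):
    """Angle theta in [0, pi] satisfying x . y = ||x|| ||y|| cos(theta)."""
    nx, ny = norm(x), norm(y)
    if nx == 0.0 or ny == 0.0:
        raise ValueError("angle is undefined for the zero vector")
    cos_theta = dot(x, y) / (nx * ny)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.acos(cos_theta)

def is_orthogonal(x, y, tol=1e-12):
    """True when x . y is zero up to a relative tolerance (tol is an assumed default)."""
    return abs(dot(x, y)) <= tol * norm(x) * norm(y)

print(angle_between([1.0, 0.0], [0.0, 1.0]))   # perpendicular vectors
print(angle_between([1.0, 2.0], [2.0, 4.0]))   # parallel vectors
print(is_orthogonal([1.0, 2.0], [-2.0, 1.0]))
```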
Now, we can use the law of cosines to make a statement about the relationship
between the lengths of general $n \times 1$ vectors x, y, and x − y. We can also use the
algebraic properties of the dot product to establish the dot product cosine formula
stated above.
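Theorem 13: The Cauchy-Schwarz Inequality
\[
|\mathbf{x} \cdot \mathbf{y}| \le \|\mathbf{x}\|\, \|\mathbf{y}\|
\]
Proof. Let $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$. Then, by the cosine formula for the inner product, we know
$\mathbf{x} \cdot \mathbf{y} = \|\mathbf{x}\|\, \|\mathbf{y}\| \cos(\theta)$. Since $-1 \le \cos(\theta) \le 1$, for nonzero vectors we have
\[
-1 \le \frac{\mathbf{x} \cdot \mathbf{y}}{\|\mathbf{x}\|\, \|\mathbf{y}\|} \le 1 .
\]
Taking the absolute value of the inner expression implies our desired relation.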
Two vectors are orthogonal if and only if the dot product between these
vectors is zero.
EXAMPLE 2.4.1
Using two norms to calculate the similarities in voting records for the US Senate
votes.
EXAMPLE 2.4.2
Using inner products between vectors to project one vector onto another.
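Example 2.4.2 only names its topic, so the sketch below is an assumption about what such a projection looks like rather than the notes' own worked example. It uses the standard formula for projecting x onto the line spanned by w, namely ((w · x) / (w · w)) w. The vectors and the helper name `project` are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length sequences of numbers."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))

def project(x, w):
    """Projection of x onto the line spanned by w: ((w . x) / (w . w)) * w."""
    ww = dot(w, w)
    if ww == 0:
        raise ValueError("cannot project onto the zero vector")
    scale = dot(w, x) / ww
    return [scale * wi for wi in w]

# Hypothetical vectors chosen for illustration.
w = [2.0, 0.0]
x = [3.0, 4.0]

p = project(x, w)                       # component of x along w
r = [xi - pi for xi, pi in zip(x, p)]   # remainder x - p

print("projection:", p)
print("remainder: ", r)
print("remainder . w =", dot(r, w))     # zero when the remainder is orthogonal to w
```

The remainder x − p is orthogonal to w, which ties the projection back to the orthogonality statement above.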
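Lesson 5: Inner Products - Suggested Problems:
1. Let w, x, y, and z be given by
\[
\mathbf{w} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}, \quad
\mathbf{x} = \begin{bmatrix} -1 \\ 2 \end{bmatrix}, \quad
\mathbf{y} = \begin{bmatrix} 4 \\ -1 \\ 2 \end{bmatrix}, \quad
\mathbf{z} = \begin{bmatrix} -2 \\ -4 \\ 3 \end{bmatrix},
\]
E. $\dfrac{\mathbf{w} \cdot \mathbf{x}}{\mathbf{w} \cdot \mathbf{w}}\, \mathbf{w}$ and $\dfrac{\mathbf{y} \cdot \mathbf{z}}{\mathbf{z} \cdot \mathbf{z}}\, \mathbf{z}$
F. $\|\mathbf{w}\|_2$, $\|\mathbf{x}\|_2$, $\|\mathbf{y}\|_2$, and $\|\mathbf{z}\|_2$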
2. Derive the Cosine Formula for the Inner Product: Prove Theorems 10, 11,
and 12 for yourself (using these notes).
3. Prove Theorem 13: The Cauchy-Schwarz Inequality.
4. Use the inner product operation to approximate the area:
\[
\int_{-\pi/2}^{\pi/2} \cos(x)\,dx .
\]
a. In your approximation scheme, use various values of n.
b. Find the exact solution to this problem using Integral Calculus.
c. How fine of a discretization do you need to use to get within 0.1 of the
exact answer?
5. Set up an inner product model for your final grade calculation in each class
you are currently enrolled in. Write this somewhere very special and refer
back to it throughout the quarter.
6. Calculate your GPA using our inner product model. Check to see if Foothill's
calculation matches your calculations.