Quiz 2 Solutions
Let's call the two column vectors $v_1$ and $v_2$. We apply Gram-Schmidt. To start, we normalize the first vector $v_1$ to get
$$ q_1 = \frac{1}{2}\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}. $$
For the next step we first subtract from $v_2$ its projection onto the space spanned by $q_1$:
$$ w = v_2 - (q_1 \cdot v_2)\,q_1 = \begin{pmatrix} a \\ b \\ c \\ d \end{pmatrix} - \frac{a+b+c+d}{4}\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} a-\mu \\ b-\mu \\ c-\mu \\ d-\mu \end{pmatrix}, $$
where $\mu = (a+b+c+d)/4$. To get the second orthonormal element we normalize this:
$$ q_2 = \frac{1}{x}\begin{pmatrix} a-\mu \\ b-\mu \\ c-\mu \\ d-\mu \end{pmatrix}, \qquad \text{where } x = \sqrt{a^2+b^2+c^2+d^2-4\mu^2}. $$
Thus
$$ Q = \begin{pmatrix} \tfrac{1}{2} & \tfrac{a-\mu}{x} \\[2pt] \tfrac{1}{2} & \tfrac{b-\mu}{x} \\[2pt] \tfrac{1}{2} & \tfrac{c-\mu}{x} \\[2pt] \tfrac{1}{2} & \tfrac{d-\mu}{x} \end{pmatrix}. $$
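As a sanity check (not part of the quiz solution), the computation above can be reproduced numerically with NumPy; the values of $a, b, c, d$ below are arbitrary sample numbers:

```python
import numpy as np

# Arbitrary sample values for a, b, c, d (not specified by the problem)
a, b, c, d = 3.0, 1.0, 4.0, 2.0

v1 = np.ones(4)
v2 = np.array([a, b, c, d])

q1 = v1 / np.linalg.norm(v1)        # = (1/2)(1, 1, 1, 1)
mu = (a + b + c + d) / 4            # (q1 . v2) q1 has all entries equal to mu
w = v2 - q1.dot(v2) * q1            # = (a - mu, b - mu, c - mu, d - mu)
x = np.sqrt(a**2 + b**2 + c**2 + d**2 - 4 * mu**2)
q2 = w / x                          # x equals ||w||, so this normalizes w

Q = np.column_stack([q1, q2])
print(Q.T @ Q)                      # identity matrix: columns are orthonormal
```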
2 (20 pts.) An experimenter has data in the form of pairs $(x_i, y_i)$ for $i = 1, \dots, n$, where the $x_i$ are distinct and positive. Given the matrix
$$ A = \begin{pmatrix} \sin(x_1) & e^{x_1} & \sqrt{x_1} \\ \sin(x_2) & e^{x_2} & \sqrt{x_2} \\ \vdots & \vdots & \vdots \\ \sin(x_n) & e^{x_n} & \sqrt{x_n} \end{pmatrix}, $$
suggest a method for computing the best fit function of the form $f(x) = C\sin(x) + De^x + E\sqrt{x}$ through the $n$ points. In what precise sense is your answer a best fit?
Solution: Let $v = \begin{pmatrix} C \\ D \\ E \end{pmatrix}$ and $y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$. The vector $Av$ will then be nothing but
$$ Av = \begin{pmatrix} f(x_1) \\ \vdots \\ f(x_n) \end{pmatrix}, $$
where $f(x) = C\sin(x) + De^x + E\sqrt{x}$. So to find $f$ such that $f(x_i) = y_i$ for all $i$, we need to solve $Av = y$.

If there is no such solution (which is very possible, since $n$ is arbitrary and probably greater than 3), we can use the method of least squares: find $v$ such that $\|Av - y\|$ is as small as possible. Assuming the columns of $A$ are linearly independent, this is done by solving the normal equations $A^TAv = A^Ty$, in other words $v = (A^TA)^{-1}A^Ty$ (indeed, then $Av = A(A^TA)^{-1}A^Ty$ is the projection of $y$ onto the column space of $A$, so for this $v$ the distance $\|Av - y\|$ is minimal). Minimizing $\|Av - y\|$ is the same as minimizing $\|Av - y\|^2 = \sum_{i=1}^n \bigl(C\sin(x_i) + De^{x_i} + E\sqrt{x_i} - y_i\bigr)^2$, so the resulting $f(x) = C\sin(x) + De^x + E\sqrt{x}$ (where $v = (C, D, E)^T = (A^TA)^{-1}A^Ty$) is the best fit in the sense that this sum of squared differences is minimal.
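This least-squares recipe can be sketched in NumPy; the data below are synthetic (invented coefficients and noise level, purely for illustration), and `np.linalg.lstsq` is used in place of forming $(A^TA)^{-1}$ explicitly, since that is numerically safer:

```python
import numpy as np

# Synthetic data for illustration: the x_i are distinct and positive,
# and the "true" coefficients C=2.0, D=0.5, E=-1.0 are invented here.
rng = np.random.default_rng(0)
n = 50
x = np.sort(rng.uniform(0.1, 10.0, n))
y = 2.0 * np.sin(x) + 0.5 * np.exp(x) - 1.0 * np.sqrt(x) + rng.normal(0.0, 0.1, n)

# Build A with columns sin(x_i), e^{x_i}, sqrt(x_i), as in the problem
A = np.column_stack([np.sin(x), np.exp(x), np.sqrt(x)])

# Least squares: lstsq minimizes ||Av - y||, which is equivalent to solving
# the normal equations A^T A v = A^T y when A has independent columns
v, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)
C, D, E = v
print(C, D, E)  # estimates of the three coefficients
```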
3 (20 pts.)
A form of the singular value decomposition of a rank $r$, $m \times n$ matrix $A$ is $U\Sigma V^T$, where $\Sigma$ is square $r \times r$ with positive diagonal entries, $U$ is $m \times r$, and $V$ is $n \times r$. Write down projection matrices for the four fundamental subspaces of $A$, in terms of one of $U$, $\Sigma$, or $V$ in each expression. Be sure to clearly identify which fundamental subspace of $A$ goes with which projection matrix.

The four fundamental subspaces are $C(A)$, $C(A^T)$, $N(A)$, and $N(A^T)$.

In problem set 4 we showed that $C(A) = C(U)$ in this case. Since $U$ has orthonormal columns, the projection matrix onto $C(U)$ is $UU^T$.

Taking the transpose of the decomposition $A = U\Sigma V^T$ gives $A^T = (U\Sigma V^T)^T = V\Sigma^T U^T = V\Sigma U^T$ (since $\Sigma$ is square and diagonal, $\Sigma^T = \Sigma$). By the same reasoning as for $C(A)$, we have $C(A^T) = C(V)$, and the projection matrix onto $C(V)$ is $VV^T$.
If $P$ is the projection matrix onto a subspace $W$, the projection matrix onto $W^\perp$ is $I - P$. By the main facts about the fundamental subspaces, $C(A)^\perp = N(A^T)$ and $C(A^T)^\perp = N(A)$, so the projection matrix onto $N(A^T)$ is $I - UU^T$ and the projection matrix onto $N(A)$ is $I - VV^T$. To summarize:
$$ \begin{array}{ll}
C(A) & UU^T \\
C(A^T) & VV^T \\
N(A^T) & I - UU^T \\
N(A) & I - VV^T
\end{array} $$
Warning: a lot of people tried to solve this by applying the projection formula $A(A^TA)^{-1}A^T$ (for the projection onto $C(A)$) directly. This formula assumes that the columns of $A$ are linearly independent! In particular, the middle factor $A^TA$ is not invertible otherwise. We don't run into this problem when applying the formula to $U$, because $U$ has orthonormal columns, so $U^TU = I$ and the formula reduces to $UU^T$.
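These four projection formulas are easy to verify numerically; the sketch below builds a random rank-2 matrix (an arbitrary example, not from the quiz), forms the truncated SVD, and checks two of the defining properties:

```python
import numpy as np

# Illustrative rank-deficient matrix: m=5, n=4, rank 2 (random example)
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))
m, n = A.shape

# Economy SVD, truncated to the numerical rank r
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-10 * s[0]))
U, V = U[:, :r], Vt[:r, :].T

P_col = U @ U.T                  # projects onto C(A)
P_row = V @ V.T                  # projects onto C(A^T)
P_leftnull = np.eye(m) - P_col   # projects onto N(A^T)
P_null = np.eye(n) - P_row       # projects onto N(A)

# Checks: columns of A are fixed by P_col, and A annihilates N(A)
print(np.allclose(P_col @ A, A), np.allclose(A @ P_null, 0))
```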
4 (35 pts.)
Let d(A) be a scalar function of 3 × 2 matrices A with the following properties:
α) If you interchange the two columns of A, d(A) flips sign.
β) d(A) is linear in each of the columns of A.
γ) d(A) is non-zero for at least one 3 × 2 A.
a. (5 pts.) What is d(2A) in terms of d(A)?
b. (10 pts.) Give an example d(A) that satisfies the three requirements of this question.
a. Since $d$ is linear in each of the two columns, scaling both columns by 2 gives $d(2A) = 2 \cdot 2 \cdot d(A) = 4\,d(A)$.

b. Take $d\begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} = ad - bc$, the determinant of the top $2 \times 2$ block. Check the properties:

α) $d\begin{pmatrix} b & a \\ d & c \\ f & e \end{pmatrix} = bc - ad = -(ad - bc)$.

β) $d\begin{pmatrix} a_1 + \lambda a_2 & b \\ c_1 + \lambda c_2 & d \\ e_1 + \lambda e_2 & f \end{pmatrix} = (a_1 + \lambda a_2)d - b(c_1 + \lambda c_2) = (a_1 d - b c_1) + \lambda (a_2 d - b c_2)$, so $d$ is linear in the first column; linearity in the second column is checked the same way.

γ) For example, $d\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} = 1 \neq 0$.
c. (10 pts.) We recall that the determinant of a square matrix is linear in each column and each row of the matrix. Can property β be extended to the rows as well as the columns of a $3 \times 2$ matrix $A$ to create a $d(A)$ with the three requirements of this question? If yes, give an example; if not, why not?
Linearity in columns gave us that d(2A) = 4d(A) for all matrices A. By similar reasoning,
linearity in rows would give us d(2A) = 8d(A), since there are three rows. But then, for
every matrix A, we would have 8d(A) = 4d(A), which implies that d(A) = 0 for all matrices.
But this contradicts condition γ. Therefore, property β cannot be extended to the rows of
A.
d. (10 pts.) If we discard property γ to allow the "zero" function, the set of all functions $d(A)$ satisfying α and β forms a three-dimensional vector space. Describe this vector space of functions explicitly in terms of the entries of $A$.
Remark: There are a number of thoughts that would guide students to a general understanding of this problem. One thought is that augmenting the matrix with a column of three variables $c_1, c_2, c_3$ gives rise to a $3 \times 3$ determinant which one can expand in cofactors (and this gives the entire three-dimensional space!). Essentially the same thought is behind the cross product in three dimensions. More simply, one can delete any one of the three rows and notice that the determinant of the remaining $2 \times 2$ matrix satisfies all the requirements.

Writing $A = \begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix}$, the three-dimensional vector space of functions $d(A)$ satisfying conditions α and β has the basis
$$ d_1(A) = ad - bc, \qquad d_2(A) = af - be, \qquad d_3(A) = cf - de. $$
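The cofactor idea in the Remark can be made concrete: appending a column $(c_1, c_2, c_3)$ to $A$ and expanding the resulting $3 \times 3$ determinant along that column yields $c_1 d_3(A) - c_2 d_2(A) + c_3 d_1(A)$, a linear combination of the three basis functions. A small NumPy sketch (the particular matrix and column are arbitrary sample values):

```python
import numpy as np

# The three basis functions, for a 3x2 matrix A = [[a,b],[c,d],[e,f]]
def d1(A): return A[0, 0]*A[1, 1] - A[0, 1]*A[1, 0]   # ad - bc
def d2(A): return A[0, 0]*A[2, 1] - A[0, 1]*A[2, 0]   # af - be
def d3(A): return A[1, 0]*A[2, 1] - A[1, 1]*A[2, 0]   # cf - de

A = np.array([[1.0, 2.0],
              [3.0, 5.0],
              [4.0, 7.0]])
c = np.array([2.0, -1.0, 3.0])   # the augmenting column (c1, c2, c3)

# det [A | c], expanded in cofactors along the third column,
# equals c1*d3(A) - c2*d2(A) + c3*d1(A)
lhs = np.linalg.det(np.column_stack([A, c]))
rhs = c[0]*d3(A) - c[1]*d2(A) + c[2]*d1(A)
print(np.isclose(lhs, rhs))
```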
One way to find all such functions $d(A)$ is to use linearity in the columns to split apart $d(A)$ as
$$ d\begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} = d\begin{pmatrix} a & b \\ 0 & d \\ 0 & f \end{pmatrix} + d\begin{pmatrix} 0 & b \\ c & d \\ 0 & f \end{pmatrix} + d\begin{pmatrix} 0 & b \\ 0 & d \\ e & f \end{pmatrix} $$
$$ = d\begin{pmatrix} a & b \\ 0 & 0 \\ 0 & 0 \end{pmatrix} + d\begin{pmatrix} a & 0 \\ 0 & d \\ 0 & 0 \end{pmatrix} + d\begin{pmatrix} a & 0 \\ 0 & 0 \\ 0 & f \end{pmatrix} + \cdots + d\begin{pmatrix} 0 & 0 \\ 0 & 0 \\ e & f \end{pmatrix}, $$
where in the last line we have nine terms, corresponding to the nine possible ways to choose one entry in the first column to be nonzero and one entry in the second column to be nonzero.

For any term whose two nonzero entries lie in the same row, we can conclude that $d$ of that matrix must be zero. For example,
$$ d\begin{pmatrix} a & b \\ 0 & 0 \\ 0 & 0 \end{pmatrix} = ab \cdot d\begin{pmatrix} 1 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} = -ab \cdot d\begin{pmatrix} 1 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}, $$
by property α (swapping the two identical columns changes nothing, yet must flip the sign). But this implies that
$$ d\begin{pmatrix} a & b \\ 0 & 0 \\ 0 & 0 \end{pmatrix} = 0. $$
By the column swap property α, we also have
$$ d\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} = -d\begin{pmatrix} 0 & 1 \\ 1 & 0 \\ 0 & 0 \end{pmatrix}, $$
and similarly for any matrix with two ones in different columns and different rows.

This implies that
$$ d\begin{pmatrix} a & b \\ c & d \\ e & f \end{pmatrix} = (ad - bc)\, d\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} + (af - be)\, d\begin{pmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix} + (cf - de)\, d\begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}. $$
Since the three values of $d$ on the right-hand side can be chosen to be arbitrary real numbers, this tells us exactly that the function $d(A)$ must be a linear combination of our three functions $d_1(A)$, $d_2(A)$, and $d_3(A)$.
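As a numerical spot-check (not part of the quiz), the basis functions $d_1, d_2, d_3$ can be verified against properties α and β, together with the scaling fact $d(2A) = 4\,d(A)$ from part a, using a random matrix (the random data is purely illustrative):

```python
import numpy as np

# The three basis functions from part d, for a 3x2 matrix A = [[a,b],[c,d],[e,f]]
def d1(A): return A[0, 0]*A[1, 1] - A[0, 1]*A[1, 0]   # ad - bc
def d2(A): return A[0, 0]*A[2, 1] - A[0, 1]*A[2, 0]   # af - be
def d3(A): return A[1, 0]*A[2, 1] - A[1, 1]*A[2, 0]   # cf - de

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 2))
B = rng.standard_normal(3)     # a replacement first column
lam = 1.7

for d in (d1, d2, d3):
    # alpha: swapping the two columns flips the sign
    assert np.isclose(d(A[:, ::-1]), -d(A))
    # beta: linearity in the first column
    A2 = A.copy(); A2[:, 0] = B
    A3 = A.copy(); A3[:, 0] = A[:, 0] + lam * B
    assert np.isclose(d(A3), d(A) + lam * d(A2))
    # part a: d(2A) = 4 d(A), since each of the two columns scales linearly
    assert np.isclose(d(2 * A), 4 * d(A))
print("all checks passed")
```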