
EECS 16B Designing Information Devices and Systems II


Spring 2021 UC Berkeley Homework 14
This homework is due on Sunday, May 2, 2021, at 11:00PM. Self-grades and
HW Resubmission are due on Tuesday, May 4, 2021, at 11:00PM.

1. Reading Lecture Notes


Staying up to date with lectures is an important part of the learning process in this course. Here are links to
the notes that you need to read for this week: Note 16, Note 17

(a) Write out an N × N orthonormal matrix U whose columns represent the DFT basis.
Solution:

$$U = \frac{1}{\sqrt{N}}\begin{bmatrix}
e^{j\frac{2\pi(0)(0)}{N}} & e^{j\frac{2\pi(0)(1)}{N}} & e^{j\frac{2\pi(0)(2)}{N}} & \cdots & e^{j\frac{2\pi(0)(N-1)}{N}} \\
e^{j\frac{2\pi(1)(0)}{N}} & e^{j\frac{2\pi(1)(1)}{N}} & e^{j\frac{2\pi(1)(2)}{N}} & \cdots & e^{j\frac{2\pi(1)(N-1)}{N}} \\
e^{j\frac{2\pi(2)(0)}{N}} & e^{j\frac{2\pi(2)(1)}{N}} & e^{j\frac{2\pi(2)(2)}{N}} & \cdots & e^{j\frac{2\pi(2)(N-1)}{N}} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
e^{j\frac{2\pi(N-1)(0)}{N}} & e^{j\frac{2\pi(N-1)(1)}{N}} & e^{j\frac{2\pi(N-1)(2)}{N}} & \cdots & e^{j\frac{2\pi(N-1)(N-1)}{N}}
\end{bmatrix}$$

In other words, the $km$-th entry is $U_{km} = \frac{1}{\sqrt{N}}e^{j\frac{2\pi km}{N}}$.
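As a quick numerical sanity check (a sketch, assuming numpy is available), we can build this $U$ for a small $N$ and verify that its columns are orthonormal, i.e. that $U^*U = I$:

```python
import numpy as np

def dft_basis(N):
    # U[k, m] = exp(j * 2*pi*k*m / N) / sqrt(N), matching the matrix above.
    k = np.arange(N).reshape(-1, 1)   # row index
    m = np.arange(N).reshape(1, -1)   # column index
    return np.exp(2j * np.pi * k * m / N) / np.sqrt(N)

U = dft_basis(8)
# Orthonormal columns: the conjugate transpose acts as the inverse.
assert np.allclose(U.conj().T @ U, np.eye(8))
```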


2. Linearization to help classification: discovering logistic regression and how to solve it


You can, in spirit, reduce the problem of linear multi-class classification to that of binary classification by picking, for each category "X", a vector that separates the examples in "X" from all the other examples, which are lumped into a hybrid synthetic category of "not-X". This gives rise to one vector per category, with the winning category selected by seeing which vector scores highest. However, we will focus here on the binary problem since that is the conceptual heart of this approach.
As was discussed in lecture, the naive straightforward way of picking the decision boundary (by looking
at the mean example of each category and drawing the perpendicular bisector) is not always the best. The
included Jupyter Notebook includes synthetic examples that illustrate the different things that can happen so
that you can better appreciate the pathway that leads us to discover logistic regression as a natural approach
to solve this problem based on the conceptual building blocks that you have already seen.
It is no exaggeration to say that logistic regression is the default starting method for doing classification in
machine learning contexts, the same way that straightforward linear regression is the default starting method
for doing regression. A lot of other approaches can be viewed as being built on top of logistic regression.
Consequently, getting to logistic regression is a nice ending-point for this part of the 16AB story as pertains
to classification.
Let's start by giving things some names. Consider trying to classify a set of measurements $\vec{x}_i$ with given labels $\ell_i$. For the binary case of interest here, we will think of the labels as being "+" and "−". For expository convenience, and because we don't want to have to carry it around separately, we will fold our threshold implicitly into the weights by augmenting our given measurements with the constant "1" in the first position of each $\vec{x}_i$. Now, the classification rule becomes simple. We want to learn a vector of weights $\vec{w}$ so that we can deem any point with $\vec{x}_i^\top\vec{w} > 0$ as being a member of the "+" category and anything with $\vec{x}_i^\top\vec{w} < 0$ as being a member of the "−" category.
The way that we will do this is to do a minimization in the spirit of least squares. Except, instead of necessarily using some sort of squared loss function, we will just consider a generic cost function that can depend on the label and the prediction for the point. For the $i$-th data point in our training data, we will incur a cost $c(\vec{x}_i^\top\vec{w}, \ell_i)$ for a total cost that we want to minimize:

$$\arg\min_{\vec{w}}\, c_{\text{total}}(\vec{w}) = \sum_{i=1}^{m} c(\vec{x}_i^\top\vec{w}, \ell_i) \tag{1}$$

Because this can be a nonlinear function, our goal is to solve this iteratively as a sequence of least-squares
problems that we know how to solve.
Consider the following algorithm:
1: $\vec{w} = \vec{0}$    ▷ Initialize the weights to $\vec{0}$
2: while Not done do    ▷ Iterate towards solution
3:    Compute $\vec{w}^\top\vec{x}_i$    ▷ Generate current estimated labels
4:    Compute $\frac{d}{d\vec{w}}c(\vec{w}^\top\vec{x}_i, \ell_i)$    ▷ Generate derivatives with respect to $\vec{w}$ of the cost for the update step
5:    Compute $\frac{d^2}{d\vec{w}^2}c(\vec{w}^\top\vec{x}_i, \ell_i)$    ▷ Generate second derivatives of the cost for the update step
6:    $\vec{\delta w} = \text{LeastSquares}(\cdot, \cdot)$    ▷ We will derive what to call least squares on
7:    $\vec{w} = \vec{w} + \vec{\delta w}$    ▷ Update parameters
8: end while
9: Return $\vec{w}$
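The loop structure translates directly into Python. Below is a minimal sketch (not the official notebook code); the helpers `cost_grad`, `cost_hess`, and `build_A_b` are hypothetical placeholders whose contents are exactly what the remaining parts of this problem derive.

```python
import numpy as np

def fit(X, labels, cost_grad, cost_hess, build_A_b, num_iters=10):
    # X holds one augmented data point x_i per row.
    w = np.zeros(X.shape[1])            # 1: initialize the weights to 0
    for _ in range(num_iters):          # 2: "while Not done", with a fixed cap
        p = X @ w                       # 3: current estimated labels w^T x_i
        g = cost_grad(p, labels)        # 4: first derivatives c'(x_i^T w, l_i)
        h = cost_hess(p, labels)        # 5: second derivatives c''(x_i^T w, l_i)
        A, b = build_A_b(X, g, h)       # 6: arguments derived in parts (d)-(e)
        dw, *_ = np.linalg.lstsq(A, b, rcond=None)
        w = w + dw                      # 7: update parameters
    return w                            # 9: return w
```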


The key step above is figuring out with what arguments to call LeastSquares while only having the labels `i
and the points ~xi .
When a function $\vec{f}(\vec{x}, \vec{y}): \mathbb{R}^n \times \mathbb{R}^k \to \mathbb{R}^m$ takes in vectors and outputs a vector, the relevant derivatives for linearization are also represented by matrices:

$$D_{\vec{x}}\vec{f} = \begin{bmatrix} \frac{\partial f_1}{\partial x[1]} & \cdots & \frac{\partial f_1}{\partial x[n]} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x[1]} & \cdots & \frac{\partial f_m}{\partial x[n]} \end{bmatrix}
\qquad
D_{\vec{y}}\vec{f} = \begin{bmatrix} \frac{\partial f_1}{\partial y[1]} & \cdots & \frac{\partial f_1}{\partial y[k]} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial y[1]} & \cdots & \frac{\partial f_m}{\partial y[k]} \end{bmatrix}$$

where

$$\vec{x} = \begin{bmatrix} x[1] \\ \vdots \\ x[n] \end{bmatrix} \qquad \vec{y} = \begin{bmatrix} y[1] \\ \vdots \\ y[k] \end{bmatrix}.$$

Then, the linearization (first-order expansion) becomes

$$\vec{f}(\vec{x}, \vec{y}) \approx \vec{f}(\vec{x}_0, \vec{y}_0) + D_{\vec{x}}\vec{f}\cdot(\vec{x} - \vec{x}_0) + D_{\vec{y}}\vec{f}\cdot(\vec{y} - \vec{y}_0). \tag{2}$$

(a) Now, suppose we wanted to approximate the cost for each data point

$$c_i(\vec{w}) = c(\vec{x}_i^\top\vec{w}, \ell_i) \tag{3}$$

where

$$\vec{w} = \begin{bmatrix} w[1] \\ \vdots \\ w[n] \end{bmatrix}$$

in the neighborhood of a weight vector $\vec{w}_*$. Our goal is to write out the first-order expression for approximating the cost function $c_i(\vec{w}_* + \vec{\delta w})$. This should be something in vector/matrix form like you have seen for the approximation of nonlinear systems by linear systems. We don't want to take any second derivatives just yet, only first derivatives. We have outlined a skeleton for the derivation with some parts missing. Follow the guidelines in each sub-section.
i) Comparing to eq. (2), we know that $c_i(\vec{w}_* + \vec{\delta w}) \approx c_i(\vec{w}_*) + \frac{d}{d\vec{w}}c_i(\vec{w}_*)\vec{\delta w}$. Write out the vector form of $\frac{d}{d\vec{w}}c_i(\vec{w}_*)$.
Solution:

$$\frac{d}{d\vec{w}}c_i(\vec{w}_*) = \begin{bmatrix} \frac{\partial c_i(\vec{w}_*)}{\partial w[1]} & \cdots & \frac{\partial c_i(\vec{w}_*)}{\partial w[n]} \end{bmatrix}$$

ii) Write out the partial derivatives of $c_i(\vec{w})$ with respect to $w[g]$, the $g$-th component of $\vec{w}$. (HINT: Use the linearity of derivatives and sums to compute the partial derivatives with respect to each of the $w[g]$ terms. Don't forget the chain rule and the fact that $\vec{x}_i^\top\vec{w} = \sum_{j=1}^{n} x_i[j]w[j] = x_i[g]w[g] + \sum_{j\neq g} x_i[j]w[j]$.)
Solution:


Using the hint, we calculate the partial derivative with respect to each $w[g]$ term. Using the chain rule,

$$\begin{aligned}
\frac{d}{d\vec{w}}c_i(\vec{w})[g] &= \frac{\partial}{\partial w[g]}c_i(\vec{w}) \\
&= \frac{\partial}{\partial w[g]}c(\vec{x}_i^\top\vec{w}, \ell_i) \\
&= \frac{d}{d(\vec{x}_i^\top\vec{w})}c(\vec{x}_i^\top\vec{w}, \ell_i)\,\frac{\partial}{\partial w[g]}(\vec{x}_i^\top\vec{w}) \\
&= c'(\vec{x}_i^\top\vec{w}, \ell_i)\,\frac{\partial}{\partial w[g]}\left(x_i[g]w[g] + \sum_{j\neq g} x_i[j]w[j]\right) \\
&= c'(\vec{x}_i^\top\vec{w}, \ell_i)\,x_i[g]
\end{aligned}$$

Note that $c'(\vec{x}_i^\top\vec{w}, \ell_i) = \frac{d}{d(\vec{x}_i^\top\vec{w})}c(\vec{x}_i^\top\vec{w}, \ell_i)$.
iii) With what you had above, can you fill in the missing part to express the row vector $\frac{d}{d\vec{w}}c_i(\vec{w})$?

$$\frac{d}{d\vec{w}}c_i(\vec{w}) = c'(\vec{x}_i^\top\vec{w}, \ell_i)\,\underline{\hspace{3em}}$$

Solution:

$$\frac{d}{d\vec{w}}c_i(\vec{w}) = c'(\vec{x}_i^\top\vec{w}, \ell_i)\,\vec{x}_i^\top$$
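To build confidence in this formula, here is a small finite-difference check (a sketch, using a hypothetical squared-error cost $c(p, \ell) = (p - \ell)^2$, so that $c'(p, \ell) = 2(p - \ell)$):

```python
import numpy as np

c  = lambda p, l: (p - l) ** 2      # hypothetical example cost
dc = lambda p, l: 2 * (p - l)       # its derivative in the first argument

rng = np.random.default_rng(0)
x, w, l, eps = rng.normal(size=4), rng.normal(size=4), 1.0, 1e-6

grad = dc(x @ w, l) * x             # the formula c'(x^T w, l) x^T
for g in range(4):                  # compare each entry to a finite difference
    w_plus = w.copy()
    w_plus[g] += eps
    fd = (c(x @ w_plus, l) - c(x @ w, l)) / eps
    assert abs(fd - grad[g]) < 1e-4
```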
(b) Now, we want a better approximation that includes second derivatives. For a general function, we would look for

$$f(\vec{x}_0 + \vec{\delta x}) \approx f(\vec{x}_0) + f'(\vec{x}_0)\vec{\delta x} + \frac{1}{2}\vec{\delta x}^\top f''(\vec{x}_0)\vec{\delta x} \tag{4}$$

where $f'(\vec{x}_0)$ is an appropriate row vector and, as you've seen in the note, $f''(\vec{x}_0)$ is called the Hessian, which represents the second derivatives.
i) Comparing to eq. (4), we know that

$$c_i(\vec{w}_* + \vec{\delta w}) \approx c_i(\vec{w}_*) + \frac{d}{d\vec{w}}c_i(\vec{w}_*)\vec{\delta w} + \frac{1}{2}\vec{\delta w}^\top\frac{d^2}{d\vec{w}^2}c_i(\vec{w}_*)\vec{\delta w}$$

Write out the matrix form of $\frac{d^2}{d\vec{w}^2}c_i(\vec{w}_*)$.
Solution:

$$\frac{d^2}{d\vec{w}^2}c_i(\vec{w}_*) = \begin{bmatrix}
\frac{\partial^2 c_i(\vec{w}_*)}{\partial w[1]\partial w[1]} & \cdots & \frac{\partial^2 c_i(\vec{w}_*)}{\partial w[1]\partial w[n]} \\
\vdots & \ddots & \vdots \\
\frac{\partial^2 c_i(\vec{w}_*)}{\partial w[n]\partial w[1]} & \cdots & \frac{\partial^2 c_i(\vec{w}_*)}{\partial w[n]\partial w[n]}
\end{bmatrix}$$

ii) Take the second derivatives of the cost $c_i(\vec{w})$, i.e. solve for $\frac{\partial^2 c_i(\vec{w})}{\partial w[g]\partial w[h]}$.
(HINT: You should use the answer to part (a) and just take another derivative. Once again, use the linearity of derivatives and sums to compute the partial derivatives with respect to each of the $w[h]$ terms. This will give you $\frac{\partial^2}{\partial w[g]\partial w[h]}$. Don't forget the chain rule and again use the fact that $\vec{x}_i^\top\vec{w} = \sum_{j=1}^{n} x_i[j]w[j] = x_i[h]w[h] + \sum_{j\neq h} x_i[j]w[j]$.)


Solution: Proceeding in a similar manner as above, let us find $\frac{\partial^2 c_i(\vec{w})}{\partial w[g]\partial w[h]}$.

$$\begin{aligned}
\frac{d^2}{d\vec{w}^2}c_i(\vec{w})[g, h] &= \frac{\partial^2}{\partial w[g]\partial w[h]}c_i(\vec{w}) = \frac{\partial}{\partial w[h]}\left(\frac{d}{d\vec{w}}c_i(\vec{w})[g]\right) \\
&= \frac{\partial}{\partial w[h]}\left(c'(\vec{x}_i^\top\vec{w}, \ell_i)\,x_i[g]\right) \\
&= c''(\vec{x}_i^\top\vec{w}, \ell_i)\,\frac{\partial}{\partial w[h]}(\vec{x}_i^\top\vec{w})\,x_i[g] \\
&= c''(\vec{x}_i^\top\vec{w}, \ell_i)\,x_i[g]\,x_i[h]
\end{aligned}$$

Note that $c''(\vec{x}_i^\top\vec{w}, \ell_i) = \frac{d^2}{d(\vec{x}_i^\top\vec{w})^2}c(\vec{x}_i^\top\vec{w}, \ell_i)$.
iii) The expression in part (ii) is for the $[g, h]$-th component of the second derivative. $\frac{1}{2}$ times this times $\vec{\delta w}[g]$ times $\vec{\delta w}[h]$ would give us that component's contribution to the second-derivative term in the approximation, and we have to sum this up over all $g$ and $h$ to get the total contribution of the second-derivative term in the approximation. Now, we want to group terms to restructure this into matrix-vector form by utilizing the outer-product form of matrix multiplication. What should the space in the following expression be filled with?

$$\frac{d^2}{d\vec{w}^2}c_i(\vec{w}) = c''(\vec{x}_i^\top\vec{w}, \ell_i)\,\underline{\hspace{3em}}$$

Solution:

$$\frac{d^2}{d\vec{w}^2}c_i(\vec{w}) = c''(\vec{x}_i^\top\vec{w}, \ell_i)\,\vec{x}_i\vec{x}_i^\top$$
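The same kind of numerical check works for the outer-product form of the Hessian (again a sketch with the hypothetical squared-error cost, whose second derivative is the constant 2):

```python
import numpy as np

dc  = lambda p, l: 2 * (p - l)      # c'(p, l) for c(p, l) = (p - l)^2
d2c = lambda p, l: 2.0              # c''(p, l) is constant for this cost

rng = np.random.default_rng(1)
x, w, l, eps = rng.normal(size=4), rng.normal(size=4), -1.0, 1e-6

hess = d2c(x @ w, l) * np.outer(x, x)    # the formula c''(x^T w, l) x x^T
for h in range(4):                       # finite differences of the gradient
    w_plus = w.copy()
    w_plus[h] += eps
    fd_col = (dc(x @ w_plus, l) - dc(x @ w, l)) / eps * x
    assert np.allclose(fd_col, hess[:, h], atol=1e-4)
```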

(c) Now we have successfully expressed the second order approximation of $c_i(\vec{w}_* + \vec{\delta w})$. Since we eventually want to minimize the total cost $c_{\text{total}}(\vec{w}) = \sum_{i=1}^{m} c_i(\vec{w})$, can you write out the second order approximation of $c_{\text{total}}(\vec{w}_* + \vec{\delta w})$ using results from (a) and (b)?
Solution: From previous parts, we get

$$c_i(\vec{w}_* + \vec{\delta w}) \approx c_i(\vec{w}_*) + \frac{d}{d\vec{w}}c_i(\vec{w}_*)\vec{\delta w} + \frac{1}{2}\vec{\delta w}^\top\frac{d^2}{d\vec{w}^2}c_i(\vec{w}_*)\vec{\delta w} = c_i(\vec{w}_*) + c'(\vec{x}_i^\top\vec{w}_*, \ell_i)\vec{x}_i^\top\vec{\delta w} + \frac{1}{2}\vec{\delta w}^\top c''(\vec{x}_i^\top\vec{w}_*, \ell_i)\vec{x}_i\vec{x}_i^\top\vec{\delta w}$$

Based on the linearity of derivatives, to get the second order approximation of $c_{\text{total}}(\vec{w}_* + \vec{\delta w})$, we just sum up the second order approximation for each $c_i(\vec{w}_* + \vec{\delta w})$:

$$c_{\text{total}}(\vec{w}_* + \vec{\delta w}) = \sum_{i=1}^{m} c_i(\vec{w}_* + \vec{\delta w}) \approx \sum_{i=1}^{m}\left(c_i(\vec{w}_*) + c'(\vec{x}_i^\top\vec{w}_*, \ell_i)\vec{x}_i^\top\vec{\delta w} + \frac{1}{2}\vec{\delta w}^\top c''(\vec{x}_i^\top\vec{w}_*, \ell_i)\vec{x}_i\vec{x}_i^\top\vec{\delta w}\right)$$

(d) Now in this part, we want to re-write $c_{\text{total}}(\vec{w}_* + \vec{\delta w})$ in the form of $C + \sum_{i=1}^{m}\left(\vec{q}_i^\top\vec{\delta w} - b_i\right)^2$.
i) Let's first rewrite a general second order polynomial $f(x) = ax^2 + bx + c$ in the form $f(x) = r + (px + q)^2$. Find $p, q, r$ in terms of $a, b, c$. This procedure is called "completing the square". Then, use this to argue that

$$\arg\min_x\, ax^2 + bx + c = \arg\min_x\,(px + q)^2$$


Solution: $ax^2 + bx + c = c - \frac{b^2}{4a} + \left(\sqrt{a}\,x + \frac{b}{2\sqrt{a}}\right)^2$, assuming $a > 0$ (which holds here, since our second derivatives are positive). Therefore $p = \sqrt{a}$, $q = \frac{b}{2\sqrt{a}}$, and $r = c - \frac{b^2}{4a}$. Since $r$ is a constant, $\arg\min_x\left(r + (px + q)^2\right) = \arg\min_x\,(px + q)^2$, hence we have

$$\arg\min_x\, ax^2 + bx + c = \arg\min_x\,(px + q)^2$$

ii) Now rewrite $c_{\text{total}}(\vec{w}_* + \vec{\delta w})$ in the form of $C + \sum_{i=1}^{m}\left(\vec{q}_i^\top\vec{\delta w} - b_i\right)^2$. What are $C$, $\vec{q}_i$, and $b_i$?
Solution:

$$c_{\text{total}}(\vec{w}_* + \vec{\delta w}) \approx \sum_{i=1}^{m}\left(c_i(\vec{w}_*) + c'(\vec{x}_i^\top\vec{w}_*, \ell_i)\vec{x}_i^\top\vec{\delta w} + \frac{1}{2}\vec{\delta w}^\top c''(\vec{x}_i^\top\vec{w}_*, \ell_i)\vec{x}_i\vec{x}_i^\top\vec{\delta w}\right)$$

Completing the square term by term in $\vec{x}_i^\top\vec{\delta w}$,

$$= \sum_{i=1}^{m}\left[c_i(\vec{w}_*) - \frac{\left(c'(\vec{x}_i^\top\vec{w}_*, \ell_i)\right)^2}{2c''(\vec{x}_i^\top\vec{w}_*, \ell_i)} + \left(\sqrt{\frac{1}{2}c''(\vec{x}_i^\top\vec{w}_*, \ell_i)}\,\vec{x}_i^\top\vec{\delta w} - \frac{-c'(\vec{x}_i^\top\vec{w}_*, \ell_i)}{\sqrt{2c''(\vec{x}_i^\top\vec{w}_*, \ell_i)}}\right)^2\right]$$

Comparing to the expected format, we can see that

$$C = \sum_{i=1}^{m}\left(c_i(\vec{w}_*) - \frac{\left(c'(\vec{x}_i^\top\vec{w}_*, \ell_i)\right)^2}{2c''(\vec{x}_i^\top\vec{w}_*, \ell_i)}\right), \qquad \vec{q}_i = \sqrt{\frac{1}{2}c''(\vec{x}_i^\top\vec{w}_*, \ell_i)}\,\vec{x}_i, \qquad b_i = \frac{-c'(\vec{x}_i^\top\vec{w}_*, \ell_i)}{\sqrt{2c''(\vec{x}_i^\top\vec{w}_*, \ell_i)}}$$

(e) Consider a least squares problem:

$$\vec{x}^\star = \arg\min_{\vec{x}}\left\|A\vec{x} - \vec{b}\right\|^2$$

Show that:

$$\left\|A\vec{x} - \vec{b}\right\|^2 = \sum_{i=1}^{m}\left(\vec{a}_i^\top\vec{x} - b_i\right)^2$$

where

$$A = \begin{bmatrix} -\ \vec{a}_1^\top\ - \\ -\ \vec{a}_2^\top\ - \\ \vdots \\ -\ \vec{a}_m^\top\ - \end{bmatrix}.$$

Use this to interpret your expression from Part (d) as a standard least squares problem. What are the rows of $A$?
Solution: $\left\|A\vec{x} - \vec{b}\right\|^2$ is by definition equal to the sum of the squared entries of the vector $A\vec{x} - \vec{b}$. Therefore $\left\|A\vec{x} - \vec{b}\right\|^2 = \sum_{i=1}^{m}\left(\vec{a}_i^\top\vec{x} - b_i\right)^2$. Matching terms with our expression of $c_{\text{total}}(\vec{w}_* + \vec{\delta w})$ in


Part (d), we get

$$A = \begin{bmatrix}
\sqrt{\frac{1}{2}c''(\vec{x}_1^\top\vec{w}_*, \ell_1)}\,\vec{x}_1^\top \\
\sqrt{\frac{1}{2}c''(\vec{x}_2^\top\vec{w}_*, \ell_2)}\,\vec{x}_2^\top \\
\vdots \\
\sqrt{\frac{1}{2}c''(\vec{x}_m^\top\vec{w}_*, \ell_m)}\,\vec{x}_m^\top
\end{bmatrix}.$$
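Putting parts (c) through (e) together, one update step of the algorithm can be sketched in numpy as follows. This is a hedged sketch, not the official notebook code: it assumes labels are encoded as $\pm 1$ and uses the logistic cost, whose derivatives are computed in part (f) below ($c'(p, \ell) = -\ell\,\sigma(-\ell p)$ and $c''(p, \ell) = \sigma(p)\sigma(-p)$, where $\sigma$ is the sigmoid).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def newton_ls_step(X, labels, w):
    """One iteration: build A and b from parts (d)-(e), then least squares.
    X has one augmented point x_i per row; labels[i] is +1 or -1."""
    p = X @ w
    c1 = -labels * sigmoid(-labels * p)    # c'(x_i^T w, l_i) for logistic cost
    c2 = sigmoid(p) * sigmoid(-p)          # c''(x_i^T w, l_i) for logistic cost
    A = np.sqrt(c2 / 2.0)[:, None] * X     # rows are sqrt(c''/2) x_i^T
    b = -c1 / np.sqrt(2.0 * c2)            # b_i = -c' / sqrt(2 c'')
    dw, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w + dw
```

Starting from `w = np.zeros(X.shape[1])` and repeating `w = newton_ls_step(X, labels, w)` a handful of times reproduces the loop from the start of the problem.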
(f) Consider the following cost functions:
squared error: $c^+_{\mathrm{sq}}(p) = (p-1)^2$, $c^-_{\mathrm{sq}}(p) = (p+1)^2$;
exponential: $c^+_{\exp}(p) = e^{-p}$, $c^-_{\exp}(p) = e^{p}$;
and logistic: $c^+_{\mathrm{logistic}}(p) = \ln\left(1 + e^{-p}\right)$, $c^-_{\mathrm{logistic}}(p) = \ln\left(1 + e^{p}\right)$.
Compute the first and second derivatives of the above expressions with respect to p.
Solution: First Derivatives:

$$\frac{d}{dp}c^+_{\mathrm{sq}}(p) = 2(p-1) \qquad \frac{d}{dp}c^-_{\mathrm{sq}}(p) = 2(p+1)$$

$$\frac{d}{dp}c^+_{\exp}(p) = -e^{-p} \qquad \frac{d}{dp}c^-_{\exp}(p) = e^{p}$$

$$\frac{d}{dp}c^+_{\mathrm{logistic}}(p) = -\frac{e^{-p}}{1+e^{-p}} \qquad \frac{d}{dp}c^-_{\mathrm{logistic}}(p) = \frac{e^{p}}{1+e^{p}}$$

Notice that these are pretty cheap to compute, given that we have to compute the original loss functions
in the first place.
Second Derivatives:

$$\frac{d^2}{dp^2}c^+_{\mathrm{sq}}(p) = 2 \qquad \frac{d^2}{dp^2}c^-_{\mathrm{sq}}(p) = 2$$

$$\frac{d^2}{dp^2}c^+_{\exp}(p) = e^{-p} \qquad \frac{d^2}{dp^2}c^-_{\exp}(p) = e^{p}$$

$$\frac{d^2}{dp^2}c^+_{\mathrm{logistic}}(p) = \frac{e^{-p}}{\left(1+e^{-p}\right)^2} \qquad \frac{d^2}{dp^2}c^-_{\mathrm{logistic}}(p) = \frac{e^{p}}{\left(1+e^{p}\right)^2}$$


Notice that all of these second derivatives are positive. Moreover, calculating them takes essentially no more work than getting the first derivatives. In particular, it is useful to note that

$$\frac{d^2}{dp^2}c^+_{\mathrm{logistic}}(p) = \frac{e^{-p}}{\left(1+e^{-p}\right)^2} = \left|\frac{d}{dp}c^+_{\mathrm{logistic}}(p)\right|\left(1 - \left|\frac{d}{dp}c^+_{\mathrm{logistic}}(p)\right|\right)$$

$$\frac{d^2}{dp^2}c^-_{\mathrm{logistic}}(p) = \frac{e^{p}}{\left(1+e^{p}\right)^2} = \left|\frac{d}{dp}c^-_{\mathrm{logistic}}(p)\right|\left(1 - \left|\frac{d}{dp}c^-_{\mathrm{logistic}}(p)\right|\right)$$

so the basic nature of the logistic loss' second derivative becomes more clear. If the first derivative has magnitude $\frac{1}{2}$, this second derivative is maximized; that happens when the prediction $p$ is 0, i.e. maximally uncertain. The second derivative for logistic loss shrinks away from there.
When you take 70, this particular form p(1 − p) will start to ring a bell. That sound is the gateway to
understanding a different reason for why logistic regression is popular in practice — but for that, you
need to understand Probability. It is a glorious coincidence that something so natural from an optimiza-
tion point of view also turns out to have useful interpretations involving the language of probability.
You will understand this after 126.
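A two-line numerical check of this identity (a sketch, assuming numpy):

```python
import numpy as np

p = np.linspace(-5, 5, 101)
d1 = -np.exp(-p) / (1 + np.exp(-p))       # first derivative of c+_logistic
d2 = np.exp(-p) / (1 + np.exp(-p)) ** 2   # second derivative of c+_logistic
assert np.allclose(d2, np.abs(d1) * (1 - np.abs(d1)))
```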
(g) Run the Jupyter Notebook and answer the following questions.
i) In Example 2, why does mean classification fail?
Solution: The mean classifier misclassifies some points because there is some information in the
distribution that can’t be accurately captured by the mean of the data points of each category.
ii) In Example 3, for what data distributions does ordinary least squares fail?
Solution: When there are extreme outliers in the dataset, the decision boundary in this case will be
"pulled" away from the desired location, thus least squares would fail.
iii) Run the code cells in Example 4. By performing updates to w ~ according to what you derived
in previous parts of the question, how many iterations does it take for exponential and logistic
regression to converge?
Solution: You should see that the decision boundaries are almost fixed after 3-4 iterations. These
iterated least-squares approaches are very fast to converge in general. This is why in practice, logistic
or exponential regression costs almost the same to run as ordinary least squares.
Congratulations! You now know the basic optimization-theoretic perspective on logistic regression.
After you understand the probability material in 70 and 126, you can understand the probabilistic
perspective on it as well. After you understand the optimization material in 127, you will understand
even more about the optimization-theoretic perspective on the problem including why this approach
actually converges.


3. Extending Orthonormality to Complex Vectors


So far in the course, we have only dealt with real vectors. However, it is often useful to also think about com-
plex vectors, as you’ll soon see with the DFT. In this problem, we will extend several important properties
of orthonormal matrices to the complex case.
The main difference is that the normal Euclidean inner product is no longer a valid metric, and we must define a new complex inner product as

$$\langle\vec{u}, \vec{v}\rangle = \overline{\vec{v}}^{\,T}\vec{u} = \vec{v}^*\vec{u} = \sum_{i=1}^{k} u_i\overline{v_i},$$

where the $*$ operation complex conjugates and transposes its argument (the order doesn't matter) and is aptly called the conjugate transpose. Note that for real vectors, the complex inner product simplifies to the real inner product. In all the theorems you've seen in this class, you can replace every inner product with the complex inner product to show an analogous result for complex vectors: least squares becomes $\hat{x} = (A^*A)^{-1}A^*\vec{b}$, upper triangularization becomes $A = UTU^*$, the Spectral Theorem becomes $A = U\Lambda U^*$, and the SVD becomes $A = U\Sigma V^*$.
*" # " #+
1+j −3 − j
(a) To get some practice computing complex inner products, what is , and
2 2+j
*" # " #+
−3 − j 1+j
, ? Does the order of the vectors in the complex inner product matter i.e. is it
2+j 2
commutative?
Solution:

$$\left\langle\begin{bmatrix}1+j\\2\end{bmatrix}, \begin{bmatrix}-3-j\\2+j\end{bmatrix}\right\rangle = (1+j)\overline{(-3-j)} + (2)\overline{(2+j)} = (1+j)(-3+j) + 2(2-j) = -3 + j - 3j - 1 + 4 - 2j = -4j$$

$$\left\langle\begin{bmatrix}-3-j\\2+j\end{bmatrix}, \begin{bmatrix}1+j\\2\end{bmatrix}\right\rangle = (-3-j)\overline{(1+j)} + (2+j)\overline{(2)} = (-3-j)(1-j) + (2+j)(2) = -3 + 3j - j - 1 + 4 + 2j = 4j$$

The two inner products are different so clearly the complex inner product is not commutative and the
order of the vectors matters. In fact, when you swap the arguments for the inner product, you will get
the complex conjugate result, which you see an example of above.
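In numpy, this inner product convention matches `np.vdot`, which conjugates its first argument, so $\langle\vec{u}, \vec{v}\rangle = \vec{v}^*\vec{u}$ is `np.vdot(v, u)`. A quick check of the two computations above (a sketch):

```python
import numpy as np

u = np.array([1 + 1j, 2])
v = np.array([-3 - 1j, 2 + 1j])
print(np.vdot(v, u))   # <u, v> = v* u : gives -4j
print(np.vdot(u, v))   # <v, u> = u* v : gives +4j, the complex conjugate
```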
(b) Let $U = \begin{bmatrix}\vec{u}_1 & \cdots & \vec{u}_n\end{bmatrix}$ be an $n \times n$ complex matrix, where its columns $\vec{u}_1, \vec{u}_2, \ldots, \vec{u}_n$ form an orthonormal basis for $\mathbb{C}^n$, i.e.

$$\vec{u}_i^*\vec{u}_j = \begin{cases}1 & \text{if } i = j\\0 & \text{if } i \neq j\end{cases}$$

Such a complex matrix is called unitary in the math literature, to distinguish it from real orthonormal matrices. Show that $U^{-1} = U^*$, where $U^*$ is the conjugate transpose of $U$.


Solution: By definition, $U^{-1}U = I$. We want to show that $U^*$ satisfies this, so let's write down $U^*$ first:

$$U^* = \begin{bmatrix}\vec{u}_1^* \\ \vdots \\ \vec{u}_n^*\end{bmatrix}, \tag{5}$$

where $\vec{u}_1, \ldots, \vec{u}_n$ are the column vectors of $U$. Then, the entry at the $i$-th row and $j$-th column of $U^*U$ should be $\vec{u}_i^*\vec{u}_j$. If we write down the general form for each element of $U^*U$:

$$(U^*U)_{ij} = \vec{u}_i^*\vec{u}_j = \begin{cases}1 & \text{if } i = j\\0 & \text{if } i \neq j\end{cases},$$

which is the identity matrix, since $\vec{u}_1, \ldots, \vec{u}_n$ is an orthonormal basis.
Now for any square matrices $A, B$ such that $AB = I$, right multiplying by $B^{-1}$ gives $ABB^{-1} = IB^{-1}$, so $A = B^{-1}$. $B^{-1}$ must exist since $\det(A)\det(B) = \det(I) \neq 0$, so $\det(B) \neq 0$. Thus, since we showed $U^*U = I$, we have $U^* = U^{-1}$.
(c) Show that $U$ preserves complex inner products, i.e. if $\vec{v}, \vec{w}$ are vectors of length $n$, then

$$\langle\vec{v}, \vec{w}\rangle = \langle U\vec{v}, U\vec{w}\rangle.$$

HINT: Note that $(AB)^* = B^*A^*$. This is since $(AB)^* = \overline{(AB)^T} = \overline{B^T A^T} = \overline{B^T}\,\overline{A^T} = B^*A^*$.
Solution: For this question, we want to show that:

$$\langle\vec{v}, \vec{w}\rangle = \vec{w}^*\vec{v} = \langle U\vec{v}, U\vec{w}\rangle$$

Using the definition of complex inner products we can write:

$$\langle U\vec{v}, U\vec{w}\rangle = (U\vec{w})^*U\vec{v}$$

Using the form for the conjugate transpose of a matrix-vector product as stated in the hint:

$$(U\vec{w})^*U\vec{v} = \vec{w}^*U^*U\vec{v}$$

From the previous part we know that $U^*U = I$. Therefore:

$$\langle U\vec{v}, U\vec{w}\rangle = \vec{w}^*\vec{v} = \langle\vec{v}, \vec{w}\rangle.$$

(d) Show that if $\vec{u}_1, \ldots, \vec{u}_n$ are the columns of a unitary matrix $U$, they must be linearly independent.
(Hint: Suppose $\vec{w} = \sum_{i=1}^{n}\alpha_i\vec{u}_i$, then first show that $\alpha_i = \langle\vec{w}, \vec{u}_i\rangle$. From here ask yourself whether a nonzero linear combination of the $\{\vec{u}_i\}$ could ever be identically zero.)
This basic fact shows how orthogonality is a very nice special case of linear independence.
Solution: Suppose they are not linearly independent. Then there exist $\alpha_1, \ldots, \alpha_n \in \mathbb{C}$ such that $\vec{w} = \sum_{i=1}^{n}\alpha_i\vec{u}_i = \vec{0}$ while at least one of the $\alpha_i$ is non-zero. We can then take the inner product of both sides with $\vec{u}_j$, for all $j$:

$$\langle\vec{w}, \vec{u}_j\rangle = \left\langle\sum_{i=1}^{n}\alpha_i\vec{u}_i, \vec{u}_j\right\rangle = \sum_{i=1}^{n}\alpha_i\langle\vec{u}_i, \vec{u}_j\rangle = \alpha_j.$$


Since ~u1 , . . . , ~un form an orthonormal basis, we know that h~ui , ~uj i will be 1 when i = j and 0
otherwise, which is why only αj survives in the above summation. Since w ~ = ~0, then αj should be 0
for all inner products h~uj , wi.
~ However, this is a contradiction to our assumption that at least one of
the αi is non-zero. Therefore, ~u1 , . . . , ~un are linearly independent.
This confirms what we know — that orthonormality is a particularly robust guarantee of linear inde-
pendence.
(e) Now let $V$ be another $n \times n$ matrix, where the columns of the matrix form an orthonormal basis for $\mathbb{C}^n$, i.e. $V$ is unitary. Show that the columns of the product $UV$ also form an orthonormal basis for $\mathbb{C}^n$.
Solution: Since $V$ is a unitary matrix, we have $V^*V = I$. To show that the columns of $UV$ also form an orthonormal basis, we can write down its conjugate transpose, $(UV)^*$, and apply it to $UV$:

$$(UV)^*(UV) = V^*U^*UV = V^*V = I,$$

which means the columns of $UV$ form an orthonormal basis for $\mathbb{C}^n$.


(f) We can also extend the idea of symmetric matrices to complex vectors, though we again will need to replace the transpose with the conjugate transpose. If $M = M^*$, then $M$ is called a Hermitian matrix, and the Spectral Theorem says it can be diagonalized by a unitary $U$, i.e. $M = U\Lambda U^*$, where $\Lambda$ is a diagonal matrix with the eigenvalues along the diagonal. Show that $M$ has real eigenvalues.
HINT: Use the fact that $M^* = M$.
Solution: We first calculate $M^*$:

$$M^* = (U\Lambda U^*)^* = U\Lambda^*U^*.$$

Since $M^* = M = U\Lambda U^*$, this means that $\Lambda^* = \Lambda$. The only case where this is true is when all the elements of $\Lambda$ are real, which means the eigenvalues of $M$ are always real.
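A quick numerical illustration (a sketch, assuming numpy): build a random Hermitian matrix and confirm its eigenvalues have no imaginary part.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
M = B + B.conj().T                    # M equals its conjugate transpose: Hermitian
eigvals = np.linalg.eigvals(M)        # generic eigensolver, returns complex values
assert np.allclose(eigvals.imag, 0)   # the imaginary parts are numerically zero
```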


4. Roots of Unity
An $N$-th root of unity is a complex number $\omega$ satisfying the equation $\omega^N = 1$ (or equivalently $\omega^N - 1 = 0$). In this problem we explore some properties of the roots of unity, as they end up being essential for the DFT.

(a) Show that the polynomial $z^N - 1$ factors as

$$z^N - 1 = (z - 1)\left(\sum_{k=0}^{N-1} z^k\right).$$

Solution:

$$(z - 1)\left(\sum_{k=0}^{N-1} z^k\right) = \sum_{k=1}^{N} z^k - \sum_{k=0}^{N-1} z^k = z^N - z^0 = z^N - 1$$


(b) Show that any complex number of the form $\omega_N^k = e^{j\frac{2\pi}{N}k}$ for $k \in \mathbb{Z}$ is an $N$-th root of unity. From here on, let $\omega_N = e^{j\frac{2\pi}{N}}$.
Solution:

$$\left(\omega_N^k\right)^N = \left(e^{j\frac{2\pi}{N}k}\right)^N = e^{j2\pi k} = 1$$

This means that the $N$ numbers $\omega_N^0, \omega_N^1, \omega_N^2, \ldots, \omega_N^{N-1}$ are the solutions to the equation $z^N = 1$ and hence roots of the polynomial $z^N - 1$.
(c) For a given integer $N \geq 2$, using the previous parts, give the complex roots of the polynomial $1 + z + z^2 + \ldots + z^{N-1}$.
Solution: $(z - 1)(1 + z + z^2 + \ldots + z^{N-1}) = z^N - 1$, and we just showed that the roots of $z^N - 1$ are the $\omega_N^k = e^{j\frac{2\pi k}{N}}$ for $k = 0, \ldots, N-1$; therefore the roots of $z \mapsto 1 + z + z^2 + \ldots + z^{N-1}$ are the $\omega_N^k = e^{j\frac{2\pi k}{N}}$ for $k = 1, \ldots, N-1$.
We can see this by factoring and matching factors together. A polynomial with a leading coefficient of 1 can be factored into its roots. So $z^N - 1 = \left(z - \omega_N^0\right)\left(z - \omega_N^1\right)\left(z - \omega_N^2\right)\cdots\left(z - \omega_N^{N-1}\right) = \prod_{k=0}^{N-1}\left(z - \omega_N^k\right) = (z - 1)\prod_{k=1}^{N-1}\left(z - \omega_N^k\right)$. Dividing both sides by $z - 1$ gives us what we want.
(d) What are the fourth roots of unity? Draw the fourth roots of unity in the complex plane. Where do they lie in relation to the unit circle?
Solution: Using the formula for the roots of unity from part (b), $\omega_4^0 = e^0 = 1$, $\omega_4^1 = e^{j\frac{\pi}{2}} = j$, $\omega_4^2 = e^{j\pi} = -1$, $\omega_4^3 = e^{j\frac{3\pi}{2}} = -j$. All roots of unity must be on the unit circle since they have magnitude 1. From the definition of the roots of unity, we know that each root of unity is $2\pi/N = \pi/2$ radians apart, and this can be seen graphically in the plot below.
[Figure: the four fourth roots of unity $1, j, -1, -j$, evenly spaced on the unit circle.]


(e) What are the fifth roots of unity? Draw the fifth roots of unity in the complex plane.
Solution: Using the formula for the roots of unity from part (b), $\omega_5^0 = e^0 = 1$, $\omega_5^1 = e^{j\frac{2\pi}{5}}$, $\omega_5^2 = e^{j\frac{4\pi}{5}}$, $\omega_5^3 = e^{j\frac{6\pi}{5}}$, $\omega_5^4 = e^{j\frac{8\pi}{5}}$. Again, we know that each root of unity should be $2\pi/5$ radians or $72°$ apart by the definition, and it can be seen graphically below.
[Figure: the five fifth roots of unity, evenly spaced $72°$ apart on the unit circle starting from 1.]


(f) For $N = 5$, $\omega_5 = e^{j\frac{2\pi}{5}}$, simplify $\omega_5^{42}$ such that the exponent is less than 5 and greater than 0.
Solution:

$$\omega_5^{42} = \omega_5^{8\cdot 5}\,\omega_5^2 = \left(\omega_5^5\right)^8\omega_5^2 = 1^8\,\omega_5^2 = \omega_5^2$$

Every 5 powers of the 5th root of unity multiply to 1, so $\omega_5^{42}$ simplifies to $\omega_5$ raised to the remainder of 42 divided by 5.
(g) Let's generalize what you saw in the previous part. Prove that $\omega_N^{k+N} = \omega_N^k$ for all integers $k$, both positive and negative. This shows that the roots of unity have a periodic structure.


Solution:

$$\omega_N^{k+N} = \omega_N^k\,\omega_N^N = \omega_N^k \cdot 1 = \omega_N^k$$

(h) What is the complex conjugate of $\omega_5$ in terms of the 5th roots of unity? What is the complex conjugate of $\omega_5^{42}$ in terms of the 5th roots of unity? What is the complex conjugate of $\omega_5^4$ in terms of the 5th roots of unity?
Solution: Using part (g),

$$\overline{\omega_5} = \omega_5^{-1} = \omega_5^4$$

$$\overline{\omega_5^{42}} = \omega_5^{-42} = \omega_5^3$$

$$\overline{\omega_5^4} = \omega_5^{-4} = \omega_5$$

Notice here that we can think about going around the circle of roots of unity, since they are periodic
and wrap around to where they started.
This is something called modulo arithmetic and is naturally connected to cycles like this. It is tradition-
ally viewed as taking the remainder (in this case after dividing by N = 5), however some people get
confused when asked to take the remainder of a negative number divided by a positive one. Here, we
just remember that we can turn a negative number between −(N − 1) and −1 into a positive number
by adding N to it.
You will learn a lot more about modulo arithmetic in CS 70 — here in 16B, you just get a teaser
because it emerges naturally when thinking about the roots of unity and their powers.
(i) Compute $\sum_{m=0}^{N-1}\omega_N^{km}$ where $\omega_N$ is an $N$-th root of unity. Does the answer make sense in terms of the plot you drew?
Solution: If $\omega_N^k = 1$, then this is easy: we have

$$\sum_{m=0}^{N-1}\omega_N^{km} = \sum_{m=0}^{N-1} 1 = N.$$

This happens whenever $k$ is a multiple of $N$. For all other $k$, we know $\omega_N^k \neq 1$. Consequently, we can use the formula we found in part (a), which is a geometric series in $z = \omega_N^k$, to write

$$\sum_{m=0}^{N-1}\omega_N^{km} = \frac{\omega_N^{kN} - 1}{\omega_N^k - 1} = 0$$

since $\omega_N$ is a root of unity, so $\omega_N^N = 1$ and hence $\omega_N^{kN} = 1$ as well. This makes intuitive sense because all the roots of unity are spaced evenly around the circle. Therefore summing them up, we have to get zero by symmetry.
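A numerical check of both cases (a sketch, assuming numpy):

```python
import numpy as np

N = 7
m = np.arange(N)
for k in range(-3, 10):
    total = np.exp(2j * np.pi * k * m / N).sum()
    expected = N if k % N == 0 else 0     # N when k is a multiple of N, else 0
    assert np.isclose(total, expected)
```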


5. Discrete Fourier Transform (DFT)


In order to get practice with calculating the Discrete Fourier Transform (DFT), this problem will have you
calculate the DFT for a few variations on a cosine signal.
Consider a sampled signal that is a function of discrete time $x[t]$. We can represent it as a vector of discrete samples over time $\vec{x}$, of length $N$:

$$\vec{x} = \begin{bmatrix} x[0] & \ldots & x[N-1] \end{bmatrix}^T \tag{6}$$

We can represent the DFT basis with the matrix $U$, and it is given by

$$U = \frac{1}{\sqrt{N}}\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & e^{j\frac{2\pi}{N}} & e^{j\frac{2\pi(2)}{N}} & \cdots & e^{j\frac{2\pi(N-1)}{N}} \\
1 & e^{j\frac{2\pi(2)}{N}} & e^{j\frac{2\pi(4)}{N}} & \cdots & e^{j\frac{2\pi 2(N-1)}{N}} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & e^{j\frac{2\pi(N-1)}{N}} & e^{j\frac{2\pi 2(N-1)}{N}} & \cdots & e^{j\frac{2\pi(N-1)(N-1)}{N}}
\end{bmatrix}.$$

In other words, the $ij$-th entry of $U$ is $U_{ij} = \frac{1}{\sqrt{N}}e^{j\frac{2\pi ij}{N}}$. From this, we can see that the DFT basis matrix is symmetric, so $U = U^T$. Another very important property of the DFT basis is that $U$ is orthonormal, so $U^*U = I$. We want to find the coordinates of $\vec{x}$ in the DFT basis, and we know these coordinates are given by

$$\vec{X} = U^{-1}\vec{x}.$$

We call the components of $\vec{X}$ the DFT coefficients of the time-domain signal $\vec{x}$. We can think of the components of $\vec{X}$ as weights that represent $\vec{x}$ in the DFT basis. As we will explore in the problem, each coefficient can be thought of as a measurement for which frequency is present in the signal.
You can use Numpy or other calculation tools to evaluate cosines or do matrix multiplication, but you will
not get credit if you directly calculate the DFT using a function in Numpy. You must show your work to get
credit.

(a) What is $U^{-1}$?
Solution: Since $U$ is orthonormal, $U^{-1} = U^*$, which means we transpose and complex conjugate $U$. However, note that $U$ is symmetric, so we just need to complex conjugate the matrix, meaning every exponent of the exponential becomes negative. Thus,

$$U^{-1} = U^* = \frac{1}{\sqrt{N}}\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & e^{-j\frac{2\pi}{N}} & e^{-j\frac{2\pi(2)}{N}} & \cdots & e^{-j\frac{2\pi(N-1)}{N}} \\
1 & e^{-j\frac{2\pi(2)}{N}} & e^{-j\frac{2\pi(2)(2)}{N}} & \cdots & e^{-j\frac{2\pi 2(N-1)}{N}} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & e^{-j\frac{2\pi(N-1)}{N}} & e^{-j\frac{2\pi 2(N-1)}{N}} & \cdots & e^{-j\frac{2\pi(N-1)(N-1)}{N}}
\end{bmatrix}$$

(b) Let the columns of $U$ be $\begin{bmatrix}\vec{u}_0 & \vec{u}_1 & \ldots & \vec{u}_{N-1}\end{bmatrix}$. Prove that $\overline{\vec{u}_k} = \vec{u}_{N-k}$ for $k = 1, 2, \ldots, N-1$.
Solution:

$$\overline{\vec{u}_k[m]} = \overline{\tfrac{1}{\sqrt{N}}e^{j\frac{2\pi mk}{N}}} = \tfrac{1}{\sqrt{N}}e^{-j\frac{2\pi mk}{N}} \tag{7}$$


Using the fact that any multiple of $2\pi$ won't affect the exponent,

$$\tfrac{1}{\sqrt{N}}e^{-j\frac{2\pi mk}{N}} = \tfrac{1}{\sqrt{N}}e^{j\left(2\pi m - \frac{2\pi mk}{N}\right)} \tag{8}$$

$$= \tfrac{1}{\sqrt{N}}e^{j\frac{2\pi m(N-k)}{N}} = \vec{u}_{N-k}[m] \tag{9}$$

Since this holds for all $m$, we have $\overline{\vec{u}_k} = \vec{u}_{N-k}$ for $k = 1, \ldots, N-1$. It doesn't hold for other $k$, since then $N - k$ would not be a valid column index of $U$.
(c) Decompose $\cos\left(\frac{2\pi}{7}n\right)$ into a sum of complex exponentials.
Solution: From the inverse Euler formula,

$$\cos\left(\frac{2\pi}{7}n\right) = \frac{1}{2}e^{j\frac{2\pi n}{7}} + \frac{1}{2}e^{-j\frac{2\pi n}{7}}$$

(d) If $x_1[n] = \cos\left(\frac{2\pi n}{7}\right)$, write $x_1[n]$ as a linear combination of $y_+[n] = e^{j\frac{2\pi n}{7}}$ and $y_-[n] = e^{-j\frac{2\pi n}{7}}$.
Solution: Note that we can replace both complex exponentials with the given $y$ functions, so

$$x_1[n] = \frac{1}{2}y_+[n] + \frac{1}{2}y_-[n]$$

(e) Now think of $\vec{x}_1$ as a length $N = 7$ vector which is $x_1[n]$ sampled at $n = 0, 1, \ldots, 6$. We do the same to similarly get $\vec{y}_+, \vec{y}_-$. Write $\vec{y}_+$ and $\vec{y}_-$ in terms of the columns of $U = \begin{bmatrix}\vec{u}_0 & \vec{u}_1 & \ldots & \vec{u}_6\end{bmatrix}$.
HINT: Use part (b).
Solution: We can notice that $\vec{u}_1$ is exactly $\frac{1}{\sqrt{N}}e^{j\frac{2\pi n}{7}}$ evaluated at the 7 sample points. This is exactly $\vec{y}_+$ up to scale, so $\vec{y}_+ = \sqrt{N}\vec{u}_1 = \sqrt{7}\vec{u}_1$. From part (b), we know that $\vec{u}_6 = \overline{\vec{u}_1}$, which is $\frac{1}{\sqrt{N}}e^{-j\frac{2\pi n}{7}}$ evaluated at the sample points. This is exactly a scaling of $\vec{y}_-$, so $\vec{y}_- = \sqrt{N}\vec{u}_6 = \sqrt{7}\vec{u}_6$.
(f) Using the last 3 parts, compute the DFT coefficients $\vec{X}_1$ for signal $\vec{x}_1$.
Solution: Combining the last 3 parts, we know $\vec{x}_1 = \frac{\sqrt{7}}{2}\vec{u}_1 + \frac{\sqrt{7}}{2}\vec{u}_6$. We also know that since the columns of $U$ are $\vec{u}_0, \ldots, \vec{u}_6$, the rows of $U^*$ are their conjugate transposes:

$$U^* = \begin{bmatrix}\vec{u}_0^* \\ \vec{u}_1^* \\ \vdots \\ \vec{u}_{N-1}^*\end{bmatrix} \tag{10}$$


Then to find our DFT coefficients,

$$\vec{X}_1 = U^*\vec{x}_1 \tag{11}$$

$$= \begin{bmatrix}\vec{u}_0^* \\ \vec{u}_1^* \\ \vdots \\ \vec{u}_6^*\end{bmatrix}\left(\frac{\sqrt{7}}{2}\vec{u}_1 + \frac{\sqrt{7}}{2}\vec{u}_6\right) \tag{12}$$

$$= \frac{\sqrt{7}}{2}\begin{bmatrix}0\\1\\0\\0\\0\\0\\0\end{bmatrix} + \frac{\sqrt{7}}{2}\begin{bmatrix}0\\0\\0\\0\\0\\0\\1\end{bmatrix} = \begin{bmatrix}0\\\frac{\sqrt{7}}{2}\\0\\0\\0\\0\\\frac{\sqrt{7}}{2}\end{bmatrix} \tag{13}$$

In the last equation, we use the orthonormality property of the DFT basis vectors, i.e. that

$$\vec{u}_i^*\vec{u}_j = \begin{cases}1 & i = j\\0 & i \neq j.\end{cases} \tag{14}$$

Equivalently, we can write the result as

$$X_1[k] = \begin{cases}\frac{\sqrt{7}}{2} & k = 1, 6\\0 & k \neq 1, 6.\end{cases} \tag{15}$$
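We can confirm this against a direct matrix computation of $U^*\vec{x}_1$ (a sketch, assuming numpy; this is a check of the derivation rather than a substitute for it):

```python
import numpy as np

N = 7
n = np.arange(N)
x1 = np.cos(2 * np.pi * n / N)
U = np.exp(2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)   # DFT basis
X1 = U.conj().T @ x1                                       # X = U^{-1} x = U* x
print(np.round(X1, 6))   # sqrt(7)/2 at k = 1 and k = 6, zero elsewhere
```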

(g) Plot the time domain representation of $x_1[n]$. Plot the magnitude, $|X_1[k]|$, and plot the phase, $\angle X_1[k]$, for the DFT representation $\vec{X}_1$. You should notice that the DFT coefficients correspond to which frequencies were present in the original signal, which is why the DFT coefficients are called the frequency domain representation.
Solution:
[Figure: time-domain plot of $x_1[n]$ for $n = 0, \ldots, 6$; frequency-domain magnitude $|X_1[k]|$, which is $\frac{\sqrt{7}}{2}$ at $k = 1, 6$ and zero elsewhere; frequency-domain phase $\angle X_1[k]$.]
Because the coefficients $\vec{X}_1$ in this case are all real-valued, the phase is zero-valued.
 
(h) We define $x_2[n] = \cos\left(\frac{4\pi}{7}n\right)$ for $N = 7$ samples $n \in \{0, 1, \ldots, 6\}$. Compute the DFT coefficients $\vec{X}_2$ for signal $\vec{x}_2$.
Solution: Following from the above, we can write $\vec{x}_2$, the vector of samples of the function, in terms of the columns of our DFT basis matrix $U$:

$$x_2[n] = \cos\left(\frac{2\pi(2)}{7}n\right) = \frac{1}{2}e^{j\frac{2\pi(2)}{7}n} + \frac{1}{2}e^{-j\frac{2\pi(2)}{7}n} \tag{16}$$

$$\vec{x}_2 = \frac{1}{2}\left(\sqrt{N}\vec{u}_2 + \sqrt{N}\overline{\vec{u}_2}\right) = \frac{\sqrt{7}}{2}(\vec{u}_2 + \vec{u}_5) \tag{17}$$

From this, we can see that our DFT coefficients vector $\vec{X}_2$ will only have non-zero values for rows $k = 2$ and $k = 5$. The elements of $\vec{X}_2$ can be written as

$$X_2[k] = \begin{cases}\frac{\sqrt{7}}{2} & k = 2, 5\\0 & k \neq 2, 5.\end{cases} \tag{18}$$


(i) Plot the time domain representation of $x_2[n]$. Plot the magnitude, $|X_2[k]|$, and plot the phase, $\angle X_2[k]$, for the DFT representation $\vec{X}_2$.
Solution:
[Figure: time-domain plot of $x_2[n]$ for $n = 0, \ldots, 6$; frequency-domain magnitude $|X_2[k]|$, which is $\frac{\sqrt{7}}{2}$ at $k = 2, 5$ and zero elsewhere; frequency-domain phase $\angle X_2[k]$, again zero-valued.]

(j) To generalize this result, say we have some $p \in \{1, 2, 3\}$ which scales the frequency of our signal $\vec{x}_p$,


 
which we define as $x_p[n] = \cos\left(\frac{2\pi}{7}pn\right)$ for $N = 7$ samples $n \in \{0, 1, \ldots, 6\}$. Compute the DFT coefficients $\vec{X}_p$ for signal $\vec{x}_p$ in terms of this scalar $p$.
Solution: As in the last two problems, we can represent $\vec{x}_p$ in terms of the columns of the DFT basis matrix:

$$x_p[n] = \cos\left(\frac{2\pi(p)}{7}n\right) = \frac{1}{2}e^{j\frac{2\pi(p)}{7}n} + \frac{1}{2}e^{-j\frac{2\pi(p)}{7}n} \tag{19}$$

$$\vec{x}_p = \frac{1}{2}\left(\sqrt{7}\vec{u}_p + \sqrt{7}\overline{\vec{u}_p}\right) = \frac{\sqrt{7}}{2}(\vec{u}_p + \vec{u}_{7-p}) \tag{20}$$

The only nonzero entries in $\vec{X}_p$ will occur at $k = p$ and $k = 7 - p$:

$$X_p[k] = \begin{cases}\frac{\sqrt{7}}{2} & k = p, 7-p\\0 & k \neq p, 7-p.\end{cases} \tag{21}$$

(k) Let's see what happens when we have an even number of samples. We define $\vec{s} = \begin{bmatrix}1 & 0 & 1 & 0 & 1 & 0\end{bmatrix}^T$, which has $N = 6$ samples. Compute the DFT coefficients $\vec{S}$ for signal $\vec{s}$.
Hint: Write $\vec{s}$ as $a + b\cos\left(\frac{2\pi}{6}pn\right)$ for some constants $a, b, p$, and use the fact that this signal has period 2.
Solution: We have $N = 6$ samples, which we will denote $n \in \{0, 1, \ldots, 5\}$. The signal repeats after a period $T = 2$, so the cosine function should also have period 2. This will occur when $p = 3$. Then to account for the scaling and shifting of the cosine from $[-1, 1]$ to $[0, 1]$, we need $a = \frac{1}{2}$, $b = \frac{1}{2}$. This gives

$$s[n] = \frac{1}{2} + \frac{1}{2}\cos\left(\frac{2\pi}{6}(3)n\right) \tag{22}$$

We can decompose $s[n]$ as

$$s[n] = \frac{1}{2}e^{(0)n} + \frac{1}{4}e^{-j\frac{2\pi(3)}{6}n} + \frac{1}{4}e^{j\frac{2\pi(3)}{6}n} \tag{23}$$

$$\vec{s} = \frac{1}{2}\sqrt{N}\vec{u}_0 + \frac{1}{4}\left(\sqrt{N}\vec{u}_3 + \sqrt{N}\overline{\vec{u}_3}\right) \tag{24}$$

$$= \frac{\sqrt{6}}{2}\vec{u}_0 + \frac{\sqrt{6}}{4}(\vec{u}_3 + \vec{u}_{6-3}) \tag{25}$$

$$= \frac{\sqrt{6}}{2}\vec{u}_0 + \frac{\sqrt{6}}{2}\vec{u}_3. \tag{26}$$

From part (b), $\overline{\vec{u}_3} = \vec{u}_{6-3} = \vec{u}_3$. Therefore both of the $\vec{u}_3$ and $\overline{\vec{u}_3}$ terms are added together and contribute to the $k = \frac{N}{2} = 3$rd term of $\vec{S}$. Additionally, we have a term from $k = 0$ because of the constant offset of the signal with 0 frequency. Thus,

$$S[k] = \begin{cases}\frac{\sqrt{6}}{2} & k = 0, 3\\0 & k \neq 0, 3.\end{cases} \tag{27}$$
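The same numerical check works for this even-length example (a sketch, assuming numpy):

```python
import numpy as np

N = 6
n = np.arange(N)
s = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
U = np.exp(2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)   # DFT basis
S = U.conj().T @ s
print(np.round(S, 6))   # sqrt(6)/2 at k = 0 and k = 3, zero elsewhere
```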


Having $\overline{\vec{u}_3} = \vec{u}_3$ shows us an interesting result of using an even number of samples. When $N$ is even, the $\vec{u}_{N/2}$ column will be entirely real because

$$\vec{u}_{N/2}[l] = \tfrac{1}{\sqrt{N}}e^{j\frac{2\pi}{N}\cdot\frac{N}{2}\cdot l} = \tfrac{1}{\sqrt{N}}e^{j\pi l} = \tfrac{(-1)^l}{\sqrt{N}},$$

and so the entries will alternate between $-1$ and $1$ (up to the common scale factor $\frac{1}{\sqrt{N}}$).


6. Survey
Please fill out the survey here. As long as you submit it, you will get credit. Thank you!

7. Homework Process and Study Group


Citing sources and collaborators are an important part of life, including being a student!
We also want to understand what resources you find helpful and how much time homework is taking, so we
can change things in the future if possible.

(a) What sources (if any) did you use as you worked through the homework?
(b) If you worked with someone on this homework, who did you work with?
List names and student ID’s. (In case of homework party, you can also just describe the group.)

Contributors:

• Kourosh Hakhamaneshi.

• Kuan-Yun Lee.

• Nathan Lambert.

• Sidney Buchbinder.

• Gaoyue Zhou.

• Anant Sahai.

• Ashwin Vangipuram.

• John Maidens.

• Geoffrey Négiar.

• Yen-Sheng Ho.

• Harrison Wang.

• Regina Eckert.
