homework1
Homework submissions are expected to be in PDF format produced by LaTeX. For handwritten homework,
points will be deducted if the handwriting is not legible; this will be at the discretion of the instructors.
Review https://www.stat.berkeley.edu/~songmei/Teaching/STAT154_Fall2021/Materials/cs229-linalg.pdf
and https://www.stat.berkeley.edu/~songmei/Teaching/STAT154_Fall2021/Materials/cs229-prob.pdf
for concepts in this homework. If you can handle less than 60% of this problem set, you should consider
whether you are ready to take this course.
For coding exercises, you should use Python. For submission of coding exercises: report the results and
figures produced by the simulations, and also paste the source code.
Q2
Express $e_1 = [1, 0, 0]^T$ as a linear combination of $a_1 = [1, 3, 2]^T$, $a_2 = [2, 1, 3]^T$, and $a_3 = [3, 2, 1]^T$.
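A hand-derived answer to this question can be checked numerically. The sketch below stacks the three vectors as columns of a matrix (called `M` here, a name not used in the problem) and solves the resulting linear system with NumPy. It is only a sanity check, not a substitute for the derivation.

```python
import numpy as np

# Columns of M are the vectors a1, a2, a3 from the problem statement.
# (The name M is an assumption made for this sketch.)
M = np.column_stack(([1, 3, 2], [2, 1, 3], [3, 2, 1])).astype(float)
e1 = np.array([1.0, 0.0, 0.0])

# Solve M @ c = e1 for the coefficient vector c = (c1, c2, c3).
c = np.linalg.solve(M, e1)

# Check that c1*a1 + c2*a2 + c3*a3 reproduces e1 up to rounding error.
residual = np.linalg.norm(M @ c - e1)
print("coefficients:", c)
print("residual:", residual)
```

A residual close to machine precision indicates that the coefficients reproduce $e_1$.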
Q3
Let $A \in \mathbb{R}^{3 \times 3}$ be the matrix
$$A = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \\ 3 & 1 & 4 \end{bmatrix}.$$
What are the rank, null space, and column space (i.e., image) of $A$? What are the projection matrices onto the
null space and the column space of $A$?
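One possible way to cross-check a hand computation for this question is through the singular value decomposition. The sketch below assumes orthogonal projections are intended and uses a relative tolerance of `1e-10` to decide which singular values count as zero. Both choices are assumptions of this sketch, not part of the problem.

```python
import numpy as np

A = np.array([[1, 1, 2],
              [2, 1, 3],
              [3, 1, 4]], dtype=float)

# Full SVD: A = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(A)

# Numerical rank: count singular values above a relative tolerance
# (the value 1e-10 is an assumption of this sketch).
r = int(np.sum(s > 1e-10 * s.max()))

# Orthonormal bases: the first r left singular vectors span the column space,
# and the last (3 - r) right singular vectors span the null space.
col_basis = U[:, :r]
null_basis = Vt[r:, :].T

# Orthogonal projection matrices built from the orthonormal bases.
P_col = col_basis @ col_basis.T
P_null = null_basis @ null_basis.T

print("rank:", r)
print("projection onto column space:\n", P_col)
print("projection onto null space:\n", P_null)
```

Useful checks are that `P_col @ A` equals `A`, that `A @ P_null` is zero, and that both projection matrices are symmetric and idempotent.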
Q2
Let $Z \sim \mathcal{N}(0, I_d)$ and let $a \in \mathbb{R}^d$ be a fixed vector. Calculate
$E[\exp\{a^T Z\}] = E[\exp\{\sum_{i=1}^d a_i Z_i\}]$ (Hint: note that $Z_i$ and $Z_j$ are independent for $i \neq j$).
Now let $X = AZ + \mu$ for $\mu \in \mathbb{R}^d$ and $A \in \mathbb{R}^{d \times d}$. Calculate
$M(\mu, a) \equiv E[\exp\{a^T X\}]$ (Hint: use the result you just found for $E[\exp\{a^T Z\}]$).
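A closed-form answer for $M(\mu, a)$ can be compared against a Monte Carlo estimate. The sketch below is one way to produce such an estimate. The function name `mc_mgf`, the sample size, and the example values of `a`, `A`, and `mu` are all hypothetical choices made for illustration.

```python
import numpy as np

def mc_mgf(a, A, mu, num_samples=200_000, seed=0):
    """Monte Carlo estimate of E[exp(a^T X)] with X = A Z + mu, Z ~ N(0, I_d)."""
    rng = np.random.default_rng(seed)
    d = a.shape[0]
    Z = rng.standard_normal((num_samples, d))  # each row is one draw of Z
    X = Z @ A.T + mu                           # each row is A z + mu
    return np.exp(X @ a).mean()

# Hypothetical example values (not taken from the problem).
a = np.array([0.3, -0.2])
A = np.array([[0.5, 0.25],
              [0.0, 0.5]])
mu = np.array([0.1, -0.2])

estimate = mc_mgf(a, A, mu)
print("Monte Carlo estimate of M(mu, a):", estimate)
```

Setting `A` to the identity and `mu` to zero gives an estimate of $E[\exp\{a^T Z\}]$. Keeping the entries of `a` small keeps the variance of the estimator manageable.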
Q3
Calculate $\nabla_a M(\mu, a) = [\partial_{a_1} M, \ldots, \partial_{a_d} M]^T$ and
$\nabla_\mu M(\mu, a) = [\partial_{\mu_1} M, \ldots, \partial_{\mu_d} M]^T$ (represent them in a compact form using
matrix operations).
Q4
• Let $f(u) = a^T u = \sum_{i=1}^d a_i u_i$ for $a, u \in \mathbb{R}^d$. Calculate $\nabla_u f(u)$.
• Let $f(u) = \|u\|_2^2 = \sum_{i=1}^d u_i^2$ for $u \in \mathbb{R}^d$. Calculate $\nabla_u f(u)$.
• Let $f(u) = u^T A u$, where $u \in \mathbb{R}^d$ and $A \in \mathbb{R}^{d \times d}$ is a symmetric matrix. Calculate $\nabla_u f(u)$.
Note that $a, u \in \mathbb{R}^d$ are column vectors. The results should be expressed as column vectors.
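Gradients derived by hand for the three functions above can be verified with central finite differences. The helper `numerical_gradient`, the step size `h`, and the random test values below are assumptions introduced for this sketch.

```python
import numpy as np

def numerical_gradient(f, u, h=1e-6):
    """Approximate the gradient of a scalar function f at u by central differences."""
    u = np.asarray(u, dtype=float)
    grad = np.zeros_like(u)
    for k in range(u.size):
        step = np.zeros_like(u)
        step[k] = h
        grad[k] = (f(u + step) - f(u - step)) / (2.0 * h)
    return grad

# Hypothetical test values (not taken from the problem).
rng = np.random.default_rng(0)
d = 4
a = rng.standard_normal(d)
u = rng.standard_normal(d)
B = rng.standard_normal((d, d))
A = 0.5 * (B + B.T)  # symmetrize so that A matches the assumption in the problem

grad_linear = numerical_gradient(lambda v: a @ v, u)
grad_sqnorm = numerical_gradient(lambda v: v @ v, u)
grad_quadratic = numerical_gradient(lambda v: v @ A @ v, u)

print("numerical gradient of a^T u:   ", grad_linear)
print("numerical gradient of ||u||^2: ", grad_sqnorm)
print("numerical gradient of u^T A u: ", grad_quadratic)
```

Each printed vector can be compared entrywise with the corresponding closed-form gradient evaluated at the same `u`.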
Q5
Let $X \in \mathbb{R}^{n \times d}$, $y \in \mathbb{R}^n$, and $\lambda > 0$. Define $f : \mathbb{R}^d \to \mathbb{R}$ as
Q2
Let $(X_i)_{i \in [n]} \sim_{iid} \mathrm{Ber}(p)$ (Bernoulli distribution: $X_i = 1$ with probability $p$ and $X_i = 0$ with probability
$1 - p$) for some $p \in (0, 1)$. Let $q \in (0, 1)$.
• Calculate $\mathrm{Var}(n^{-1} \sum_{i=1}^n X_i)$.
• Calculate the probability $f_n(p, q) = P(\sum_{i=1}^n X_i = nq)$ (the formula can contain factorials; assume
$n, q$ are such that $nq$ is an integer).
• Bonus question: calculate $g(p, q) = \lim_{n \to \infty} \frac{1}{n} \log[f_n(p, q)]$ (hint: you can use the Stirling formula).
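The first two quantities above can be estimated by simulation and compared with the formulas obtained by hand. In the sketch below, the function name `simulate_bernoulli`, the number of trials, and the example values of `n`, `p`, and `q` are hypothetical.

```python
import numpy as np

def simulate_bernoulli(n, p, q, num_trials=100_000, seed=0):
    """Estimate Var(mean of n Ber(p) draws) and P(sum of the draws = n*q) by simulation."""
    k = round(n * q)
    if abs(k - n * q) > 1e-9:
        raise ValueError("n * q must be an integer")
    rng = np.random.default_rng(seed)
    draws = rng.random((num_trials, n)) < p  # each row holds n i.i.d. Ber(p) draws
    sums = draws.sum(axis=1)
    var_of_mean = np.var(sums / n)           # empirical variance of the sample mean
    prob_hit = np.mean(sums == k)            # empirical frequency of {sum = n*q}
    return var_of_mean, prob_hit

# Hypothetical example values (not taken from the problem).
var_hat, prob_hat = simulate_bernoulli(n=20, p=0.3, q=0.25)
print("estimated Var of sample mean:", var_hat)
print("estimated P(sum = n*q):", prob_hat)
```

Both estimates carry Monte Carlo error, so agreement with the exact formulas should only be expected up to roughly two or three significant digits at this number of trials.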
Q3
Let $(x_i)_{i \in [n]}$ be i.i.d. samples from the exponential distribution $\mathrm{Exp}(\lambda)$ (with density $p_\lambda(x) = \lambda e^{-\lambda x}$).
• Write down the log-likelihood function given these samples, i.e. $\ell(\lambda) = \log \prod_{i=1}^n p_\lambda(x_i)$.
• By computing the derivative of $\ell(\lambda)$ with respect to $\lambda$, setting it equal to zero and solving for $\lambda$, derive
the maximum likelihood estimator $\hat{\lambda}$ as an estimate of $\lambda$.
• Calculate $E[\hat{\lambda}^{-1}]$ and $\mathrm{Var}(\hat{\lambda}^{-1})$.
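A derived estimator can be checked against a brute-force maximization of the log-likelihood on a grid. The sketch below evaluates $\sum_i \log p_\lambda(x_i)$ directly from the density. The true rate `2.0`, the sample size, and the grid bounds are assumptions chosen for illustration.

```python
import numpy as np

def log_likelihood(lam, x):
    """Sum over samples of log p_lam(x_i), with p_lam(x) = lam * exp(-lam * x)."""
    return np.sum(np.log(lam) - lam * x)

# Hypothetical data: 500 draws from Exp(lambda = 2.0). NumPy parametrizes the
# exponential distribution by its scale, which equals 1 / lambda.
rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0 / 2.0, size=500)

# Evaluate the log-likelihood on a grid of candidate rates (grid bounds are assumptions).
lam_grid = np.linspace(0.01, 10.0, 2000)
ll_values = np.array([log_likelihood(lam, x) for lam in lam_grid])

lam_hat_grid = lam_grid[np.argmax(ll_values)]
print("grid maximizer of the log-likelihood:", lam_hat_grid)
```

The grid maximizer is accurate only up to the grid spacing, so a closed-form estimator evaluated on the same `x` should match it to about two decimal places here.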
Q4
Let $(x_i)_{i \in [n]}$ be i.i.d. samples from the Gaussian distribution $\mathcal{N}(\mu, 1)$ (with mean $\mu \in \mathbb{R}$ and variance 1). Derive the
likelihood ratio test for testing $H_0 : \mu = 0$ versus $H_1 : \mu = 1$ at level $\alpha$. Construct a symmetric confidence
interval for $\mu$ at level $\alpha$.
Q1
Perform the following simulation: for each $n \in \{2, 4, 6, 8, 10, 50, 100\}$, draw $b = 100$ random vectors
$x_1, \ldots, x_b \sim \mathrm{Unif}\{[-1, 1]^n\}$. At each iteration, compute and store the values of each of the inner
products $\alpha_{ij}^{(n)} = x_i^T x_j$ for $i \neq j$. Note that for each value of $n$, you should end up with an array of
$\binom{b}{2} = 4950$ inner products. (Hint: one way to do this efficiently would be to, for each $n$, generate a random matrix
$X \in \mathbb{R}^{b \times n}$, with each of the entries drawn uniformly from $[-1, 1]$, and compute the matrix $A = XX^T$.
Then $\alpha_{ij}^{(n)} = A_{ij}$ for $i = 1, \ldots, b$, $j < i$.)
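The procedure in the hint can be sketched as follows. The function name `inner_products`, the fixed seed, and the dictionary `alphas` used to store the results are choices made for this sketch and are not prescribed by the problem.

```python
import numpy as np

def inner_products(n, b=100, rng=None):
    """Return the b*(b-1)/2 inner products x_i^T x_j (j < i) for b uniform vectors in [-1, 1]^n."""
    rng = np.random.default_rng() if rng is None else rng
    X = rng.uniform(-1.0, 1.0, size=(b, n))  # row i is the vector x_i
    A = X @ X.T                              # A[i, j] = x_i^T x_j
    i, j = np.tril_indices(b, k=-1)          # index pairs strictly below the diagonal (j < i)
    return A[i, j]

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
dims = [2, 4, 6, 8, 10, 50, 100]
alphas = {n: inner_products(n, rng=rng) for n in dims}

for n in dims:
    print(f"n={n:3d}: {alphas[n].size} inner products")
```

Taking only the strictly lower triangle of $A$ avoids both the diagonal entries $x_i^T x_i$ and the double counting that comes from the symmetry of $A$.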
Q2
For each value of $n$, plot a histogram of the values $(3/\sqrt{n} \cdot \alpha_{ij}^{(n)})_{1 \le i < j \le b}$ (for each $n$, there are 4950 such
values). Also plot the PDF of the standard normal distribution: $p(z) = 1/\sqrt{2\pi} \cdot \exp\{-z^2/2\}$. How
does this PDF compare to your histograms as $n$ grows? Explain in words why we multiply the scalars $\alpha_{ij}^{(n)}$
by the value $3/\sqrt{n}$, and why this changes your answer.
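Before plotting, the comparison between the histogram and the normal PDF can be done numerically. The sketch below computes a density-normalized histogram of the scaled inner products and evaluates the standard normal PDF at the bin centers. The bin range $[-4, 4]$, the bin count, the seed, and the helper names are assumptions. The plotting step is deliberately left out, and the arrays `centers` and `density` could be passed to any plotting library.

```python
import numpy as np

def scaled_inner_products(n, b=100, seed=0):
    """Inner products x_i^T x_j (i < j) of b uniform vectors in [-1, 1]^n, scaled by 3/sqrt(n)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(b, n))
    A = X @ X.T
    i, j = np.triu_indices(b, k=1)  # index pairs with i < j
    return 3.0 / np.sqrt(n) * A[i, j]

def standard_normal_pdf(z):
    return np.exp(-z ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

bins = np.linspace(-4.0, 4.0, 41)        # 40 equal-width bins (an assumption)
centers = 0.5 * (bins[:-1] + bins[1:])

for n in [2, 4, 6, 8, 10, 50, 100]:
    values = scaled_inner_products(n)
    # density=True normalizes the histogram so that it integrates to 1 over the bins.
    density, _ = np.histogram(values, bins=bins, density=True)
    gap = np.max(np.abs(density - standard_normal_pdf(centers)))
    print(f"n={n:3d}  sample std={values.std():.3f}  max |hist - pdf|={gap:.3f}")
```

The printed gap is a rough summary and depends on the bin width and the seed, so it complements rather than replaces the visual comparison the question asks for.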