homework1

Uploaded by

hangyuju
Copyright
© © All Rights Reserved

STAT154 Modern Statistical Prediction and Machine Learning

Linear algebra, matrix calculus, probability, and programming


Lecturer: Song Mei. GSI: Ruiqi Zhang. Assignment 1 - Due on 09/08/2024

Homework submissions are expected to be in PDF format produced by LaTeX. For hand-written homeworks,
points will be deducted if the handwriting is not recognizable: this will be at the discretion of the instructors.
Review https://www.stat.berkeley.edu/~songmei/Teaching/STAT154_Fall2021/Materials/cs229-linalg.pdf
and https://www.stat.berkeley.edu/~songmei/Teaching/STAT154_Fall2021/Materials/cs229-prob.pdf
for concepts in this homework. If you can handle less than 60% of this problem set, you should consider
whether you are ready to take this course.
For coding exercises, you should use Python. For submission of coding exercises: report the results and
figures produced by the simulations, and also paste the source code.

Exercise: Linear algebra


Q1
Let $A \in \mathbb{R}^{3 \times 3}$ be the matrix
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 2 & 3 & 1 \end{pmatrix}.$$
Calculate the {eigenvalues, eigenvectors, determinant, trace, inverse, Frobenius norm, operator norm} of A,
and describe the steps used to calculate each of them.
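
A hand calculation of these quantities can be sanity-checked numerically. The sketch below assumes NumPy is available; the variable names are illustrative, and the output is only a check on the written derivation, not a replacement for the steps the question asks for.

```python
import numpy as np

# The matrix A from Q1.
A = np.array([[1.0, 2.0, 3.0],
              [3.0, 1.0, 2.0],
              [2.0, 3.0, 1.0]])

# Columns of eigvecs are eigenvectors; entries may be complex for a non-symmetric A.
eigvals, eigvecs = np.linalg.eig(A)
det_A = np.linalg.det(A)
trace_A = np.trace(A)
inv_A = np.linalg.inv(A)
fro_norm = np.linalg.norm(A, "fro")  # square root of the sum of squared entries
op_norm = np.linalg.norm(A, 2)       # largest singular value of A

print("eigenvalues:", eigvals)
print("determinant:", det_A)
print("trace:", trace_A)
print("inverse:\n", inv_A)
print("Frobenius norm:", fro_norm)
print("operator norm:", op_norm)
```

Two quick consistency checks on the printed values: the eigenvalues should sum to the trace, and their product should equal the determinant.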

Q2
Express $e_1 = [1, 0, 0]^T$ as a linear combination of $a_1 = [1, 3, 2]^T$, $a_2 = [2, 1, 3]^T$ and $a_3 = [3, 2, 1]^T$.
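
Finding the coefficients amounts to solving a 3-by-3 linear system whose columns are a1, a2, a3. A minimal sketch, assuming NumPy; the names `M`, `c`, and `residual` are illustrative choices, not part of the assignment.

```python
import numpy as np

# Stack a1, a2, a3 as the columns of M, so that M @ c = c1*a1 + c2*a2 + c3*a3.
M = np.column_stack(([1, 3, 2], [2, 1, 3], [3, 2, 1])).astype(float)
e1 = np.array([1.0, 0.0, 0.0])

c = np.linalg.solve(M, e1)   # coefficients of the linear combination
residual = M @ c - e1        # should be numerically zero

print("coefficients:", c)
print("max residual:", np.max(np.abs(residual)))
```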

Q3
Let $A \in \mathbb{R}^{3 \times 3}$ be the matrix
$$A = \begin{pmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \\ 3 & 1 & 4 \end{pmatrix}.$$
What are the {rank, null space, column space (i.e., image)} of A? What are the projection matrices onto the
{null space, column space} of A?
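
One way to check the answers numerically is through the singular value decomposition. The sketch below assumes NumPy and assumes that orthogonal projections are intended; the tolerance rule and variable names are illustrative.

```python
import numpy as np

A = np.array([[1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0],
              [3.0, 1.0, 4.0]])

U, s, Vt = np.linalg.svd(A)

# Treat singular values below a small relative tolerance as zero.
tol = max(A.shape) * np.finfo(float).eps * s[0]
rank = int(np.sum(s > tol))

col_basis = U[:, :rank]     # orthonormal basis of the column space
null_basis = Vt[rank:].T    # orthonormal basis of the null space

# Orthogonal projection onto span(B) for orthonormal columns B is B @ B.T.
P_col = col_basis @ col_basis.T
P_null = null_basis @ null_basis.T

print("rank:", rank)
print("null space basis:\n", null_basis)
print("projection onto column space:\n", P_col)
print("projection onto null space:\n", P_null)
```

A projection matrix P should satisfy P @ P = P, the column-space projection should leave A unchanged (P_col @ A = A), and A @ P_null should vanish.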

Exercise: Matrix Calculus


Q1
Let $a, \mu \in \mathbb{R}$. Calculate $\int_{-\infty}^{\infty} \exp\{-a x^2/2 + \mu x\}\,dx$. (Hint: recall that the PDF for a normal random variable
$Z \sim N(\mu, \sigma^2)$ is $p(z) = (2\pi\sigma^2)^{-1/2} \exp\{-(z - \mu)^2/(2\sigma^2)\}$ and that $\int_{-\infty}^{\infty} p(z)\,dz = 1$.)
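
The integral is finite only when a > 0, which the sketch below assumes. Completing the square in the exponent and applying the hint suggests the candidate value sqrt(2π/a) · exp(µ²/(2a)); the code compares that candidate with a plain Riemann sum. The function names and grid settings are illustrative assumptions.

```python
import numpy as np

def gaussian_integral_numeric(a, mu, num_sds=12.0, num=20_001):
    """Riemann-sum approximation of the integral of exp(-a x^2/2 + mu x), for a > 0."""
    center = mu / a                      # the integrand peaks at x = mu / a
    half_width = num_sds / np.sqrt(a)    # the integrand behaves like a Gaussian with sd 1/sqrt(a)
    x = np.linspace(center - half_width, center + half_width, num)
    dx = x[1] - x[0]
    return np.sum(np.exp(-a * x ** 2 / 2 + mu * x)) * dx

def gaussian_integral_candidate(a, mu):
    """Candidate closed form obtained by completing the square (to be derived by hand)."""
    return np.sqrt(2 * np.pi / a) * np.exp(mu ** 2 / (2 * a))

for a, mu in [(1.0, 0.0), (2.0, 0.7), (0.5, -1.3)]:
    print(a, mu, gaussian_integral_numeric(a, mu), gaussian_integral_candidate(a, mu))
```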

Q2
Let $Z \sim N(0, I_d)$ and let $a \in \mathbb{R}^d$ be a fixed vector. Calculate $E[\exp\{a^T Z\}] = E[\exp\{\sum_{i=1}^d a_i Z_i\}]$ (Hint:
note that $Z_i$ and $Z_j$ are independent for $i \neq j$). Now let $X = AZ + \mu$ for $\mu \in \mathbb{R}^d$ and $A \in \mathbb{R}^{d \times d}$. Calculate
$M(\mu, a) \equiv E[\exp\{a^T X\}]$ (Hint: use the result you just found for $E[\exp\{a^T Z\}]$).
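
A derived formula for M(µ, a) can be checked by Monte Carlo. The sketch below assumes NumPy and uses the candidate closed form exp(aᵀµ + ‖Aᵀa‖²/2), which should be confirmed by hand; the dimension, scales, seed, and sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
A = 0.3 * rng.normal(size=(d, d))   # arbitrary test matrix (small scale keeps the variance low)
mu = rng.normal(size=d)
a = 0.5 * rng.normal(size=d)

# Draw samples X = A Z + mu with Z ~ N(0, I_d); each row of X is one sample.
Z = rng.standard_normal(size=(200_000, d))
X = Z @ A.T + mu

mc_estimate = np.mean(np.exp(X @ a))
candidate = np.exp(a @ mu + 0.5 * np.sum((A.T @ a) ** 2))

print("Monte Carlo estimate:", mc_estimate)
print("candidate closed form:", candidate)
```

The two numbers should agree up to Monte Carlo error, which shrinks like one over the square root of the number of samples.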

Q3
Calculate $\nabla_a M(\mu, a) = [\partial_{a_1} M, \ldots, \partial_{a_d} M]^T$ and $\nabla_\mu M(\mu, a) = [\partial_{\mu_1} M, \ldots, \partial_{\mu_d} M]^T$ (represent each in a
compact form using matrix operations).

Q4
• Let $f(u) = a^T u = \sum_{i=1}^d a_i u_i$ for $a, u \in \mathbb{R}^d$. Calculate $\nabla_u f(u)$.

• Let $f(u) = \|u\|_2^2 = \sum_{i=1}^d u_i^2$ for $u \in \mathbb{R}^d$. Calculate $\nabla_u f(u)$.

• Let $f(u) = u^T A u$, where $u \in \mathbb{R}^d$ and $A \in \mathbb{R}^{d \times d}$ is a symmetric matrix. Calculate $\nabla_u f(u)$.

Note that $a, u \in \mathbb{R}^d$ are column vectors. The results should be expressed as column vectors.
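
Gradient formulas like these can be verified with central finite differences. The sketch below assumes NumPy; `numerical_gradient` is a hypothetical helper, and the candidate gradients listed in `checks` (a, 2u, and 2Au) are the standard answers, to be derived by hand.

```python
import numpy as np

def numerical_gradient(f, u, eps=1e-6):
    """Central-difference approximation of the gradient of f at u."""
    grad = np.zeros_like(u, dtype=float)
    for i in range(u.size):
        step = np.zeros_like(u, dtype=float)
        step[i] = eps
        grad[i] = (f(u + step) - f(u - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
d = 4
a = rng.normal(size=d)
u = rng.normal(size=d)
B = rng.normal(size=(d, d))
A = (B + B.T) / 2            # symmetrize so that A matches the assumption in the question

# Each entry pairs a function with its candidate gradient at u.
checks = {
    "a^T u": (lambda v: a @ v, a),
    "||u||_2^2": (lambda v: v @ v, 2 * u),
    "u^T A u": (lambda v: v @ A @ v, 2 * A @ u),
}

errors = {}
for name, (f, candidate) in checks.items():
    errors[name] = np.max(np.abs(numerical_gradient(f, u) - candidate))
    print(f"{name}: max abs error = {errors[name]:.2e}")
```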

Q5
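Let $X \in \mathbb{R}^{n \times d}$, $y \in \mathbb{R}^n$, and $\lambda > 0$. Define $f : \mathbb{R}^d \to \mathbb{R}$ as

$$f(u) = \|Xu - y\|_2^2 + \lambda \|u\|_2^2.$$

What is the unique minimizer of $f$? (Express it as an explicit function of $X$, $y$, and $\lambda$.)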
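
A candidate minimizer can be tested numerically: its gradient should vanish, and random perturbations should not decrease f. The sketch below assumes NumPy, random test data, and the candidate obtained by setting the gradient to zero, i.e. the solution of (XᵀX + λI)u = Xᵀy; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 50, 5, 0.7
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

def f(u):
    """Objective from Q5: squared residual norm plus a squared-norm penalty."""
    return np.sum((X @ u - y) ** 2) + lam * np.sum(u ** 2)

# Candidate minimizer: solve the linear system instead of forming an explicit inverse.
u_star = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Gradient of f at the candidate; should be numerically zero.
grad_at_u_star = 2 * X.T @ (X @ u_star - y) + 2 * lam * u_star

# Objective values at random perturbations of the candidate.
perturbed = [f(u_star + 0.1 * rng.normal(size=d)) for _ in range(100)]

print("max |gradient|:", np.max(np.abs(grad_at_u_star)))
print("f(u_star):", f(u_star), " smallest perturbed value:", min(perturbed))
```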

Exercise: Probability and statistics


Q1
Let $X \sim N(\mu, \Sigma)$ be a Gaussian random variable with mean $\mu \in \mathbb{R}^d$ and covariance matrix $\Sigma \in \mathbb{R}^{d \times d}$.
Assume that $\Sigma$ is invertible. Let $a \in \mathbb{R}^d$ be a fixed vector, and let $Y = a^T X \in \mathbb{R}$ be a random variable.
Calculate $E[Y]$, $\mathrm{Var}(Y)$, $E[X \mid Y = y]$, and $\mathrm{Cov}(X \mid Y = y) = E[XX^T \mid Y = y] - E[X \mid Y = y]\,E[X^T \mid Y = y]$.

Q2
Let $(X_i)_{i \in [n]} \sim_{iid} \mathrm{Ber}(p)$ (Bernoulli distribution: $X_i = 1$ with probability $p$ and $X_i = 0$ with probability
$1 - p$) for some $p \in (0, 1)$. Let $q \in (0, 1)$.
• Calculate $\mathrm{Var}(n^{-1} \sum_{i=1}^n X_i)$.
• Calculate the probability $f_n(p, q) = P(\sum_{i=1}^n X_i = nq)$ (the formula can contain factorials; assume
$n$ and $q$ are such that $nq$ is an integer).
• Bonus question: calculate $g(p, q) = \lim_{n \to \infty} \frac{1}{n} \log[f_n(p, q)]$ (hint: you can use the Stirling formula).
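
The bonus limit can be explored numerically by evaluating (1/n) log f_n(p, q) for growing n. The sketch below uses only the standard library and works on the log scale to avoid overflow in the factorials. It assumes the binomial form of f_n and compares against the candidate limit −[q log(q/p) + (1 − q) log((1 − q)/(1 − p))], which should be confirmed with Stirling's formula; the function names are hypothetical.

```python
import math

def log_fn(n, p, q):
    """log P(sum of n Ber(p) draws equals n*q), computed via log-gamma to avoid overflow."""
    k = round(n * q)  # n*q is assumed to be an integer
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_binom + k * math.log(p) + (n - k) * math.log(1 - p)

def candidate_limit(p, q):
    """Candidate value of g(p, q), to be derived by hand with Stirling's formula."""
    return -(q * math.log(q / p) + (1 - q) * math.log((1 - q) / (1 - p)))

p, q = 0.3, 0.5
for n in [10, 100, 10_000, 1_000_000]:
    print(n, log_fn(n, p, q) / n, candidate_limit(p, q))
```

The gap between the two columns shrinks roughly like log(n)/n, which is the correction term Stirling's formula leaves behind.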

Q3
Let $(x_i)_{i \in [n]}$ be i.i.d. samples from the exponential distribution $\mathrm{Exp}(\lambda)$ (with density $p_\lambda(x) = \lambda e^{-\lambda x}$).
• Write down the log-likelihood function given these samples, i.e. $\ell(\lambda) = \log \prod_{i=1}^n p_\lambda(x_i)$.
• By computing the derivative of $\ell(\lambda)$ with respect to $\lambda$, setting it equal to zero and solving for $\lambda$, derive
the maximum likelihood estimator $\hat{\lambda}$ as an estimate of $\lambda$.
• Calculate $E[\hat{\lambda}^{-1}]$ and $\mathrm{Var}(\hat{\lambda}^{-1})$.
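
A simulation gives a quick check on the last bullet. The sketch below assumes NumPy and assumes that the estimator works out to the reciprocal of the sample mean, so that its inverse is the sample mean itself; the candidate moments 1/λ and 1/(nλ²) are to be confirmed by hand. The seed and sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, trials = 2.0, 10, 20_000

# Each row is one dataset of n i.i.d. Exp(lam) samples (NumPy parameterizes by scale = 1/lam).
samples = rng.exponential(scale=1.0 / lam, size=(trials, n))

lam_hat = 1.0 / samples.mean(axis=1)   # candidate MLE for each simulated dataset
inv_lam_hat = 1.0 / lam_hat            # equals the sample mean of each dataset

print("empirical mean of 1/lam_hat:", inv_lam_hat.mean(), " candidate:", 1.0 / lam)
print("empirical var of 1/lam_hat: ", inv_lam_hat.var(), " candidate:", 1.0 / (n * lam ** 2))
```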

Q4
Let $(x_i)_{i \in [n]}$ be i.i.d. samples from the Gaussian distribution $N(\mu, 1)$ (with mean $\mu \in \mathbb{R}$ and variance 1). Derive the
likelihood ratio test for testing $H_0 : \mu = 0$ versus $H_1 : \mu = 1$ at level $\alpha$. Construct a symmetric confidence
interval for $\mu$ at level $\alpha$.
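
The size of a derived test and the coverage of a derived interval can both be checked by simulation. The sketch below rests on two assumptions: that the likelihood ratio is increasing in the sample mean, so the test rejects when the mean exceeds z_{1−α}/√n, and that "level α" for the interval means coverage 1 − α. The function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def lrt_reject(x, alpha):
    """Reject H0: mu = 0 in favour of H1: mu = 1 when the sample mean is large."""
    threshold = NormalDist().inv_cdf(1 - alpha) / np.sqrt(len(x))
    return np.mean(x) > threshold

def symmetric_ci(x, alpha):
    """Interval centred at the sample mean with half-width z_{1 - alpha/2} / sqrt(n)."""
    half_width = NormalDist().inv_cdf(1 - alpha / 2) / np.sqrt(len(x))
    center = np.mean(x)
    return center - half_width, center + half_width

rng = np.random.default_rng(0)
n, alpha, trials = 25, 0.05, 20_000

# Simulate under H0 (mu = 0) to estimate the rejection rate and the coverage of the interval.
data = rng.normal(loc=0.0, scale=1.0, size=(trials, n))
rejection_rate = np.mean([lrt_reject(row, alpha) for row in data])
coverage = np.mean([lo <= 0.0 <= hi for lo, hi in (symmetric_ci(row, alpha) for row in data)])

print("rejection rate under H0:", rejection_rate)
print("coverage of the interval:", coverage)
```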

Computational Exercise: Inner products and the central limit theorem


In this problem we will numerically explore the central limit theorem and how it can arise in the context of
inner products.

Q1
Perform the following simulation: for each $n \in \{2, 4, 6, 8, 10, 50, 100\}$, draw $b = 100$ random vectors
$x_1, \ldots, x_b \sim \mathrm{Unif}\{[-1, 1]^n\}$. At each iteration, compute and store the values of each of the inner
products $\alpha_{ij}^{(n)} = x_i^T x_j$ for $i \neq j$. Note that for each value of $n$, you should end up with an array of $\binom{b}{2} = 4950$
inner products. (Hint: one way to do this efficiently would be to, for each $n$, generate a random matrix
$X \in \mathbb{R}^{b \times n}$, with each of the entries drawn uniformly from $[-1, 1]$, and compute the matrix $A = XX^T$.
Then $\alpha_{ij}^{(n)} = A_{ij}$ for $i = 1, \ldots, b$, $j < i$.)
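
The hint can be sketched as follows, assuming NumPy. The function name `inner_products`, the dictionary `alphas`, and the seed are illustrative choices.

```python
import numpy as np

def inner_products(n, b=100, rng=None):
    """Return the b*(b-1)/2 inner products x_i^T x_j (j < i) of b uniform vectors in [-1, 1]^n."""
    rng = np.random.default_rng() if rng is None else rng
    X = rng.uniform(-1.0, 1.0, size=(b, n))   # each row is one random vector
    A = X @ X.T                               # A[i, j] = x_i^T x_j
    i, j = np.tril_indices(b, k=-1)           # indices strictly below the diagonal (j < i)
    return A[i, j]

rng = np.random.default_rng(154)
dims = [2, 4, 6, 8, 10, 50, 100]
alphas = {n: inner_products(n, rng=rng) for n in dims}

for n in dims:
    print(n, alphas[n].shape, alphas[n].mean(), alphas[n].var())
```

Taking only the strict lower triangle drops the diagonal entries (squared norms) and avoids counting each symmetric pair twice.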

Q2
For each value of $n$, plot a histogram of the values $(3/\sqrt{n} \cdot \alpha_{ij}^{(n)})_{1 \le i < j \le b}$ (for each $n$, there are 4950 such
values). Also plot the PDF of the standard normal distribution: $p(z) = 1/\sqrt{2\pi} \cdot \exp\{-z^2/2\}$. How
does this PDF compare to your histograms as $n$ grows? Explain in words why we multiply the scalars $\alpha_{ij}^{(n)}$
by the value $3/\sqrt{n}$, and why this changes your answer.
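
Before plotting, the comparison can be made numerically. The sketch below assumes NumPy only; it computes histogram heights on the density scale and compares them with the standard normal PDF at the bin centres. The bin grid, seed, and function names are illustrative, and the actual figures could be drawn from the same values with a plotting library such as matplotlib.

```python
import numpy as np

def scaled_inner_products(n, b=100, rng=None):
    """Inner products x_i^T x_j for i < j, multiplied by 3 / sqrt(n)."""
    rng = np.random.default_rng() if rng is None else rng
    X = rng.uniform(-1.0, 1.0, size=(b, n))
    A = X @ X.T
    i, j = np.triu_indices(b, k=1)            # pairs with i < j
    return 3.0 / np.sqrt(n) * A[i, j]

def normal_pdf(z):
    """PDF of the standard normal distribution."""
    return np.exp(-z ** 2 / 2) / np.sqrt(2 * np.pi)

rng = np.random.default_rng(154)
edges = np.linspace(-4.0, 4.0, 41)            # 40 bins of width 0.2
centers = (edges[:-1] + edges[1:]) / 2
width = edges[1] - edges[0]

variances, discrepancy = {}, {}
for n in [2, 4, 6, 8, 10, 50, 100]:
    vals = scaled_inner_products(n, rng=rng)
    counts, _ = np.histogram(vals, bins=edges)
    density = counts / (vals.size * width)    # histogram heights on the density scale
    variances[n] = vals.var()
    discrepancy[n] = np.max(np.abs(density - normal_pdf(centers)))
    print(f"n={n:3d}  variance={variances[n]:.3f}  max |hist - pdf|={discrepancy[n]:.3f}")
```

Each coordinate of a Unif[−1, 1] vector has variance 1/3, so a product of two independent coordinates has variance 1/9 and the sum of n such products has variance n/9. Multiplying by 3/√n therefore brings the variance to 1, which is why the scaled values are compared with the standard normal.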
