ST3189 2022 paper
Candidates should answer all FOUR questions. All questions carry equal
marks.
A table of common distributions is provided after the final question of this paper.
You should complete this paper using pen and paper. Please use BLACK INK
only.
You may use any calculator for any appropriate calculations, but you may not use
any computer software to obtain solutions. Credit will only be given if all workings
are shown.
You have until 12:00 (BST) on Thursday 26 May 2022 to submit your answers.
However, you are advised not to leave your submission to the last minute in order to
allow sufficient time to submit your work.
If you think there is any information missing or any error in any question, then
you should indicate this but proceed to answer the question stating any
assumptions you have made.
By accessing this question paper, you agree not to commit any assessment offence.
Assessment offences include (but are not limited to) committing plagiarism and the
use or access of any paid-for or any other services offering live assistance
during an examination. You must not confer with anyone else during a live
examination; and we take conferring to include any exchange of information or
discussion about the assessment with others in any way that could potentially give
you or another student an advantage in the examination. As such, any exchanging
with others of exam questions; or any accessing of websites, blogs, forums or any
other form of oral or written communication with others which involves any
discussion of live examination questions or potential answers/solutions to
exam questions will be considered an assessment offence.
The University of London will conduct checks to ensure the academic integrity of
your work. Many students that break the University of London’s assessment
regulations did not intend to cheat but did not properly understand the University of
London’s regulations on referencing and plagiarism. The University of London
considers all forms of plagiarism, whether deliberate or otherwise, a very
serious matter and can apply severe penalties that might impact on your
award.
1. (a) The lasso and best subset selection can be used for variable selection. Discuss
the main advantage and disadvantage of the lasso compared with best subset
selection. [4 marks]
(b) Consider the k-nearest neighbours classification using the Euclidean distance
on the dataset shown in Figure 1.
[Figure 1: scatter plot of training observations labelled + and −; both axes run from 0 to 8.]
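As a rough illustration of k-nearest neighbours classification with the Euclidean distance, the sketch below classifies a query point by majority vote among its k closest training points. The coordinates in `train` are hypothetical and are not the points plotted in Figure 1; the function name `knn_predict` is likewise an assumption made for this example.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Order the training points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training points (NOT the coordinates from Figure 1).
train = [
    ((1.0, 1.0), "+"), ((2.0, 1.5), "+"), ((1.5, 2.5), "+"),
    ((6.0, 6.0), "-"), ((7.0, 5.5), "-"), ((6.5, 7.0), "-"),
]

print(knn_predict(train, (2.0, 2.0), k=3))
print(knn_predict(train, (6.0, 6.5), k=3))
```

With two classes, an odd k avoids tied votes; for even k this sketch simply returns whichever tied label was encountered first.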
where the error terms $\epsilon_i$ are independent and distributed according to the Normal
distribution with mean 0 and known variance $\sigma^2$. Equivalently, we can write that
given x each $y_i$ is independent and distributed according to the Normal distribution
with mean $\beta\sqrt{x_i}$ and known variance $\sigma^2$.
(a) Derive the likelihood function for the unknown parameter β. [3 marks]
(b) Derive the Jeffreys prior for β. Use it to obtain the corresponding posterior
distribution. [6 marks]
(c) Consider the Normal distribution prior for β with zero mean and variance $\omega^2$.
Use it to obtain the corresponding posterior distribution. [6 marks]
(d) Consider the least squares criterion
$$\sum_{i=1}^{n} \left(y_i - \beta\sqrt{x_i}\right)^2, \qquad (1)$$
and show that the estimator of β that minimises equation (1), also maximises
the likelihood function derived in part (a). Derive this estimator and, in
addition, consider the following penalised least squares criterion
$$\sum_{i=1}^{n} \left(y_i - \beta\sqrt{x_i}\right)^2 + \lambda\beta^2, \qquad (2)$$
given a λ > 0. Derive the estimator of β that minimises equation (2) and
compare it with the one that minimises equation (1). [5 marks]
(e) Provide a Bayes estimator for each of the posteriors in parts (b) and (c) and
compare them with the estimators of part (d). [5 marks]
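The estimators in parts (c) to (e) can be checked numerically. The sketch below is not a model answer: it assumes the closed forms obtained by setting the derivative of each criterion to zero, namely $\sum_i y_i\sqrt{x_i} / \sum_i x_i$ for equation (1) and $\sum_i y_i\sqrt{x_i} / (\sum_i x_i + \lambda)$ for equation (2), and the usual conjugate Normal update for the prior in part (c). The data values, the function names and the choices of $\sigma^2$ and $\omega^2$ are hypothetical.

```python
import math

def cross_term(x, y):
    # sum_i y_i * sqrt(x_i); requires every x_i >= 0.
    return sum(yi * math.sqrt(xi) for xi, yi in zip(x, y))

def ls_estimate(x, y):
    # Assumed minimiser of criterion (1).
    return cross_term(x, y) / sum(x)

def ridge_estimate(x, y, lam):
    # Assumed minimiser of criterion (2) for a given lambda > 0.
    return cross_term(x, y) / (sum(x) + lam)

def normal_posterior(x, y, sigma2, omega2):
    # Posterior of beta under a N(0, omega2) prior and known variance sigma2.
    precision = sum(x) / sigma2 + 1.0 / omega2
    mean = (cross_term(x, y) / sigma2) / precision
    return mean, 1.0 / precision

# Hypothetical data, chosen only for illustration.
x = [1.0, 4.0, 9.0]
y = [2.1, 3.9, 6.2]
sigma2, omega2 = 1.0, 0.5

beta_ls = ls_estimate(x, y)
beta_ridge = ridge_estimate(x, y, lam=sigma2 / omega2)
post_mean, post_var = normal_posterior(x, y, sigma2, omega2)

print(beta_ls, beta_ridge, post_mean, post_var)
```

Under these assumptions the posterior mean coincides with the penalised estimator when $\lambda = \sigma^2/\omega^2$, and both are shrunk towards zero relative to the least squares estimator.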
3. (a) Consider the regression task of predicting the variable Y based on the variable
X given the following training sample:
Y X
7 8
6 9
8 7
3 1
4 0
Apply the recursive binary splitting algorithm to produce a regression tree.
The objective is to minimise the residual sum of squares (RSS)
$$\mathrm{RSS} = \sum_{m} \sum_{i:\, i \in R_m} (Y_i - c_m)^2,$$
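A minimal sketch of the first step of recursive binary splitting on the training sample above is given here. It assumes that candidate thresholds are the midpoints between consecutive distinct X values and that each region is predicted by the mean of its Y values; the function names `rss` and `best_split` are made up for this example, and no stopping rule is applied.

```python
def rss(values):
    """Residual sum of squares of `values` around their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (threshold, total RSS) of the best single split of the form X < s."""
    xs = sorted(set(x))
    best = None
    # Candidate thresholds: midpoints between consecutive distinct X values.
    for lo, hi in zip(xs, xs[1:]):
        s = (lo + hi) / 2
        left = [yi for xi, yi in zip(x, y) if xi < s]
        right = [yi for xi, yi in zip(x, y) if xi >= s]
        total = rss(left) + rss(right)
        if best is None or total < best[1]:
            best = (s, total)
    return best

# Training sample from the question.
X = [8, 9, 7, 1, 0]
Y = [7, 6, 8, 3, 4]

print(best_split(X, Y))
```

On this sample the first split falls at X < 4, with a total RSS of 2.5 (0.5 in the left region and 2 in the right). Applying `best_split` again within each region would continue the recursion.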
Binomial(n, θ): number of successes in n independent Bernoulli trials with probability of success θ.
• $f(x|\theta) = P(x|\theta) = \frac{n!}{x!(n-x)!}\,\theta^{x}(1-\theta)^{n-x}$ for $x = 0, 1, \ldots, n$.
NegBin(r, θ): number of successes before the rth failure in repeated independent Bernoulli trials.
• $f(x|\theta) = P(x|\theta) = \binom{x+r-1}{x}\,\theta^{x}(1-\theta)^{r}$ for $x = 0, 1, \ldots$.
• $E(X) = \frac{r(1-\theta)}{\theta}$, $\mathrm{Var}(X) = \frac{r(1-\theta)}{\theta^{2}}$.
Poisson(λ): often used for the number of events which occur in an interval of time.
• $f(x|\lambda) = P(x|\lambda) = \frac{\lambda^{x} e^{-\lambda}}{x!}$ for $x = 0, 1, \ldots$.
• E(X) = λ, Var(X) = λ.
• $E(X) = \mu$, $\mathrm{Var}(X) = \sigma^{2}$.
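• $f(x) = \frac{1}{B(\alpha,\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}$ for $0 \le x \le 1$, $B(\alpha,\beta) = \int_0^1 y^{\alpha-1}(1-y)^{\beta-1}\,dy = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$.
• $E(X) = \frac{\alpha}{\alpha+\beta}$, $\mathrm{Var}(X) = \frac{\alpha\beta}{(\alpha+\beta+1)(\alpha+\beta)^{2}}$.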
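• $f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha-1}\exp(-\beta x)$ for $0 \le x < \infty$, $\Gamma(t) = \int_0^{\infty} y^{t-1} e^{-y}\,dy$.
• $E(X) = \frac{\alpha}{\beta}$, $\mathrm{Var}(X) = \frac{\alpha}{\beta^{2}}$.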
IGamma(α, β): characterized by parameters α > 0 and β. If X ∼ Gamma(α, β), then 1/X ∼
IGamma(α, β).
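• $f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,x^{-\alpha-1}\exp\left(-\frac{\beta}{x}\right)$ for $0 \le x < \infty$.
• $E(X) = \frac{\beta}{\alpha-1}$, $\mathrm{Var}(X) = \frac{\beta^{2}}{(\alpha-1)^{2}(\alpha-2)}$.
for positive integer n.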
END OF PAPER