Maximum-Likelihood & Bayesian Parameter Estimation: Srihari: CSE 555
Parameter Estimation
[Figure: the likelihood as a function of the mean (peaks at the mean; the peak would be sharp with many samples), and the log-likelihood function (also peaks at the mean).]
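The behaviour described in the figure can be checked numerically. The following is a minimal sketch, assuming a 1-D Gaussian with known standard deviation; the data values, the grid, and the function name `log_likelihood` are hypothetical and chosen only for illustration.

```python
import math

def log_likelihood(mean, samples, sigma=1.0):
    """Log-likelihood of a candidate mean for i.i.d. 1-D Gaussian samples
    with known standard deviation sigma (an assumption of this sketch)."""
    n = len(samples)
    squared_error = sum((x - mean) ** 2 for x in samples)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - squared_error / (2 * sigma ** 2))

# Hypothetical data, used only to illustrate the shape of the curve.
samples = [1.2, 0.7, 2.1, 1.6, 0.9]
sample_mean = sum(samples) / len(samples)

# Evaluate the log-likelihood on a grid of candidate means from 0.00 to 3.00.
grid = [i / 100 for i in range(301)]
best = max(grid, key=lambda m: log_likelihood(m, samples))

print("sample mean:", sample_mean)
print("grid point with the highest log-likelihood:", best)
```

Because the logarithm is monotonic, the likelihood and the log-likelihood peak at the same candidate mean; the grid search only locates that peak to the resolution of the grid.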
Maximizing the log-likelihood function
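• Let θ = (θ1, θ2, …, θp)^t and let ∇θ be the gradient operator
\[
\nabla_{\theta} = \left[ \frac{\partial}{\partial\theta_1}, \frac{\partial}{\partial\theta_2}, \ldots, \frac{\partial}{\partial\theta_p} \right]^{t}
\]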
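\[
\ln p(x_k \mid \mu) = -\frac{1}{2} \ln\left[ (2\pi)^{d} |\Sigma| \right] - \frac{1}{2} (x_k - \mu)^{t} \Sigma^{-1} (x_k - \mu)
\]
\[
\text{and} \quad \nabla_{\mu} \ln p(x_k \mid \mu) = \Sigma^{-1} (x_k - \mu)
\]
θ = µ, therefore:
\[
\sum_{k=1}^{n} \Sigma^{-1} (x_k - \hat{\mu}) = 0
\]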
• Multiplying by Σ and rearranging:
\[
\hat{\mu} = \frac{1}{n} \sum_{k=1}^{n} x_k
\]
Just the sample mean
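As a small illustration of this result, the sketch below computes the component-wise sample mean of a few vectors and checks that the residuals x_k − µ̂ sum to zero, which is the stationarity condition once the invertible factor Σ⁻¹ is removed. The data and the function name `mle_mean` are hypothetical.

```python
def mle_mean(samples):
    """ML estimate of the mean of a multivariate Gaussian: the sample mean."""
    n = len(samples)
    d = len(samples[0])
    return [sum(x[j] for x in samples) / n for j in range(d)]

# Hypothetical 2-D samples, for illustration only.
samples = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0], [2.0, 2.0]]
mu_hat = mle_mean(samples)

# Sum over k of (x_k - mu_hat), one entry per dimension; zero at the estimate.
residual = [sum(x[j] - mu_hat[j] for x in samples) for j in range(len(mu_hat))]

print("mu_hat:", mu_hat)
print("sum of residuals:", residual)
```

Note that Σ never has to be known to compute µ̂; it cancels out of the condition.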
MLE: Gaussian Case, unknown µ and σ²
θ = (θ1, θ2) = (µ, σ²)
\[
\ln p(x_k \mid \theta) = -\frac{1}{2} \ln 2\pi\theta_2 - \frac{1}{2\theta_2} (x_k - \theta_1)^2
\]
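\[
\nabla_{\theta} l =
\begin{pmatrix}
\dfrac{\partial}{\partial\theta_1} \left( \ln p(x_k \mid \theta) \right) \\[2ex]
\dfrac{\partial}{\partial\theta_2} \left( \ln p(x_k \mid \theta) \right)
\end{pmatrix}
= 0
\]
\[
\begin{cases}
\dfrac{1}{\theta_2} (x_k - \theta_1) = 0 \\[2ex]
-\dfrac{1}{2\theta_2} + \dfrac{(x_k - \theta_1)^2}{2\theta_2^{2}} = 0
\end{cases}
\]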
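Summation over the n samples:
\[
\begin{cases}
\displaystyle \sum_{k=1}^{n} \frac{1}{\hat{\theta}_2} (x_k - \hat{\theta}_1) = 0 & (1) \\[2ex]
\displaystyle -\sum_{k=1}^{n} \frac{1}{\hat{\theta}_2} + \sum_{k=1}^{n} \frac{(x_k - \hat{\theta}_1)^2}{\hat{\theta}_2^{2}} = 0 & (2)
\end{cases}
\]
Solving (1) and (2):
\[
\hat{\mu} = \frac{1}{n} \sum_{k=1}^{n} x_k \; ; \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{k=1}^{n} (x_k - \hat{\mu})^2
\]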
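The closed-form estimates can be sketched in a few lines and substituted back into conditions (1) and (2) as a check. The data values and the function name `gaussian_mle` are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
def gaussian_mle(samples):
    """ML estimates (mu_hat, var_hat) for a 1-D Gaussian.
    Note the 1/n normalisation of the variance, not 1/(n-1)."""
    n = len(samples)
    mu_hat = sum(samples) / n
    var_hat = sum((x - mu_hat) ** 2 for x in samples) / n
    return mu_hat, var_hat

# Hypothetical data, for illustration only.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, var_hat = gaussian_mle(samples)

# Left-hand sides of conditions (1) and (2); both vanish at the ML estimates.
cond1 = sum((x - mu_hat) / var_hat for x in samples)
cond2 = (-sum(1.0 / var_hat for _ in samples)
         + sum((x - mu_hat) ** 2 / var_hat ** 2 for x in samples))

print("mu_hat:", mu_hat, "var_hat:", var_hat)
print("condition (1):", cond1, "condition (2):", cond2)
```

For this data the eight values sum to 40 and the squared deviations from the mean sum to 32, so the estimates are µ̂ = 5 and σ̂² = 4.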
MLE Bias
• ML estimate for σ² is biased
\[
E\left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right] = \frac{n-1}{n} \, \sigma^2 \neq \sigma^2
\]
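The bias can be observed empirically with a small Monte Carlo sketch. The true variance, sample size, trial count, and seed below are arbitrary assumptions made for illustration, and `variance_estimates` is a hypothetical helper name.

```python
import random

def variance_estimates(samples):
    """Return (ML estimate with 1/n, corrected estimate with 1/(n-1))."""
    n = len(samples)
    mean = sum(samples) / n
    sum_sq = sum((x - mean) ** 2 for x in samples)
    return sum_sq / n, sum_sq / (n - 1)

random.seed(0)                       # fixed seed so the run is repeatable
true_var, n, trials = 4.0, 5, 20000  # assumed settings for this sketch

total_ml = 0.0
total_unbiased = 0.0
for _ in range(trials):
    draw = [random.gauss(0.0, true_var ** 0.5) for _ in range(n)]
    ml, unbiased = variance_estimates(draw)
    total_ml += ml
    total_unbiased += unbiased

avg_ml = total_ml / trials
avg_unbiased = total_unbiased / trials
print("average ML estimate:       ", avg_ml)
print("average corrected estimate:", avg_unbiased)
print("predicted (n-1)/n * sigma^2:", (n - 1) / n * true_var)
```

With n = 5 the formula predicts an average ML estimate near 0.8 σ², while the 1/(n−1) version should average close to σ² itself. The gap shrinks as n grows, since (n−1)/n tends to 1.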
An unbiased estimator for Σ is the sample covariance matrix:
\[
C = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \hat{\mu})(x_k - \hat{\mu})^{t}
\]
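A minimal sketch of the sample covariance matrix follows, using plain Python lists rather than a matrix library. The data and the function name `sample_covariance` are hypothetical.

```python
def sample_covariance(samples):
    """Sample covariance matrix C with the 1/(n-1) normalisation."""
    n = len(samples)
    d = len(samples[0])
    mu_hat = [sum(x[j] for x in samples) / n for j in range(d)]
    # Entry (i, j) sums the products of deviations in dimensions i and j.
    return [
        [
            sum((x[i] - mu_hat[i]) * (x[j] - mu_hat[j]) for x in samples) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]

# Hypothetical 2-D samples, for illustration only.
samples = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0], [2.0, 2.0]]
C = sample_covariance(samples)

for row in C:
    print(row)
```

The matrix is symmetric by construction, and its diagonal entries are the corrected 1/(n−1) variance estimates of the individual dimensions.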