
Notes on Maximum Likelihood Estimator

Usual inference setup: I have a variable Y on a population of interest, with
distribution Y ∼ f(y; θ). The parameter θ (scalar or vector) is unknown.
In an effort to estimate it:

- I collect a simple random sample from Y, which is Y = (Y_1, Y_2, ..., Y_N).

- I consider the joint density of the whole sample, which is f(y_1, y_2, ..., y_N; θ).

Once I have my observations, I can change perspective and focus on the
unknown parameter:

f(\theta; y_1, y_2, \dots, y_N)

This is called the likelihood function; it depends on the value of θ, given
that I have observed my sample. I look for the value of θ that maximizes the
probability of observing my sample, hence the term maximum likelihood.
The function can be written omitting the sample from the notation, focusing
on the unknown parameter: L(θ).
Furthermore, given that I have a random sample in which all the components
are independent and identically distributed (i.i.d.) as Y, the joint density
can be factored as

f(y_1, y_2, \dots, y_N; \theta) = \prod_{i=1}^{N} f(y_i; \theta)

so that

L(\theta) = \prod_{i=1}^{N} f(y_i; \theta)

To find the maximum of the function, I compute the first derivative of L(θ)
with respect to the unknown θ, set it to zero, and solve for θ. It is
equivalent, and often algebraically simpler, to maximize the log-likelihood
l(θ) = log L(θ).
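When no closed form is available, the same recipe can be carried out numerically. Below is a minimal sketch (my addition, not part of the notes), assuming numpy and scipy are available; the Poisson model and the simulated sample are purely illustrative:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative model (an assumption for this sketch): Y ~ Poisson(theta),
# so l(theta) = sum_i [ y_i * log(theta) - theta - log(y_i!) ].
rng = np.random.default_rng(0)
y = rng.poisson(lam=3.5, size=1000)  # simulated sample, true theta = 3.5

def neg_log_likelihood(theta):
    # Minimize -l(theta); the log(y_i!) term is constant in theta and dropped.
    return -(np.sum(y * np.log(theta)) - len(y) * theta)

res = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 50.0), method="bounded")
print(res.x, y.mean())  # the numerical maximizer matches the sample mean
```

For the Poisson, setting the derivative of l(θ) to zero gives θ̂ = ȳ, which is exactly what the numerical optimizer recovers.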

Example
Y ∼ N(µ, σ²), both parameters unknown.

f(y_i; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{y_i - \mu}{\sigma}\right)^2}
The likelihood is

L(\mu, \sigma^2) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{y_i - \mu}{\sigma}\right)^2}
The log-likelihood is

l(\mu, \sigma^2) = \sum_{i=1}^{N} \log\left( \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-\frac{1}{2}\left(\frac{y_i - \mu}{\sigma}\right)^2} \right)
= \sum_{i=1}^{N} \log 1 - \sum_{i=1}^{N} \log\sqrt{2\pi} - \frac{1}{2} \sum_{i=1}^{N} \log\sigma^2 + \sum_{i=1}^{N} \log e^{-\frac{1}{2}\left(\frac{y_i - \mu}{\sigma}\right)^2}
= -N \log\sqrt{2\pi} - \frac{N}{2} \log\sigma^2 - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - \mu)^2
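As a quick sanity check of the simplification (a sketch of mine, assuming numpy and scipy), the closed-form expression can be compared against a direct sum of log-densities:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
y = rng.normal(loc=1.0, scale=2.0, size=500)  # illustrative simulated sample
mu, sigma2 = 1.0, 4.0                          # any trial parameter values
N = len(y)

# Direct evaluation: sum of Gaussian log-densities.
direct = norm.logpdf(y, loc=mu, scale=np.sqrt(sigma2)).sum()

# The simplified log-likelihood derived above.
simplified = (-N * np.log(np.sqrt(2 * np.pi))
              - N / 2 * np.log(sigma2)
              - np.sum((y - mu) ** 2) / (2 * sigma2))

print(np.allclose(direct, simplified))  # True: the two expressions agree
```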

Computing the derivative with respect to µ, simplifying and solving for µ:

\frac{\partial l(\mu, \sigma^2)}{\partial \mu} = -\frac{1}{\sigma^2} \sum_{i=1}^{N} (y_i - \mu)(-1) = 0

\Rightarrow \sum_{i=1}^{N} y_i - N\mu = 0

\Rightarrow \hat{\mu} = \frac{1}{N} \sum_{i=1}^{N} y_i

Computing the derivative with respect to σ², simplifying and solving for σ²:

\frac{\partial l(\mu, \sigma^2)}{\partial \sigma^2} = -\frac{N}{2\sigma^2} - \frac{1}{2} \sum_{i=1}^{N} (y_i - \mu)^2 \, \frac{1}{\sigma^4} \, (-1) = 0

\Rightarrow -\sigma^2 + \frac{1}{N} \sum_{i=1}^{N} (y_i - \mu)^2 = 0

\Rightarrow \hat{\sigma}^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{\mu})^2

where the last step plugs in µ̂ for µ, since the two first-order conditions
must hold jointly.

It can be proved that (µ̂, σ̂²) is a global maximum of the likelihood.
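These closed forms are easy to check numerically. The sketch below (my addition, with an illustrative simulated sample) minimizes the negative log-likelihood over (µ, log σ²) and compares the result with the sample mean and the biased sample variance:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
y = rng.normal(loc=-0.5, scale=1.5, size=2000)  # illustrative sample
N = len(y)

def neg_log_likelihood(params):
    # Optimize over log(sigma^2) so that sigma^2 stays positive; the constant
    # N*log(sqrt(2*pi)) is dropped since it does not affect the maximizer.
    mu, log_sigma2 = params
    sigma2 = np.exp(log_sigma2)
    return N / 2 * np.log(sigma2) + np.sum((y - mu) ** 2) / (2 * sigma2)

res = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]))
mu_hat, sigma2_hat = res.x[0], np.exp(res.x[1])

print(mu_hat, y.mean())           # both close to the sample mean
print(sigma2_hat, y.var(ddof=0))  # both close to the biased (1/N) variance
```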

Remarks
- The maximum likelihood estimator is generally not unbiased (note that the
  estimator for σ² obtained above is the biased one!).

- It can be proved that the maximum likelihood estimator is generally
  consistent.

- It can be proved that the maximum likelihood estimator is asymptotically
  Gaussian.

- Invariance principle: for any one-to-one function g(·), the maximum
  likelihood estimator of g(θ) is g(θ̂); see the sketch after this list.
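A small illustration of the invariance principle (my addition, assuming numpy), using the normal example above with g(σ²) = √σ² = σ:

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(loc=0.0, scale=3.0, size=5000)  # illustrative sample, sigma = 3

sigma2_hat = y.var(ddof=0)       # MLE of sigma^2: the biased sample variance
sigma_hat = np.sqrt(sigma2_hat)  # by invariance, this is the MLE of sigma
print(sigma_hat)                 # close to the true value 3.0
```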

Exercise. Given the density function (Exponential random variable)

f(y; \theta) = \theta e^{-\theta y}, \quad y > 0, \; \theta > 0,

find the maximum likelihood estimator for θ and for 1/θ.
