Chapter 7
Maximum Likelihood Estimation
This is the "turn the crank" procedure. It is used in almost all practical systems. The MLE is approximately the MVU estimator (efficient) for large data records. In general, it is non-linear.
Example: DC Level in WGN - Modified
$x[n] = A + w[n]$, where $A$ is unknown ($A > 0$) and $w[n]$ is WGN with variance $A$.
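Try CRLB Theorem
$$p(\mathbf{x};A) = \frac{1}{(2\pi A)^{N/2}} \exp\left[-\frac{1}{2A}\sum_{n=0}^{N-1}(x[n]-A)^2\right]$$
$$\frac{\partial \ln p}{\partial A} = -\frac{N}{2A} + \frac{1}{A}\sum_{n=0}^{N-1}(x[n]-A) + \frac{1}{2A^2}\sum_{n=0}^{N-1}(x[n]-A)^2 \overset{?}{=} I(A)(\hat{A}-A)$$
Look for $\hat{A}$ so that $\hat{A}$ is asymptotically optimal, or as $N \to \infty$:
1) $E(\hat{A}) \to A$
2) $\mathrm{Var}(\hat{A}) \to \mathrm{CRLB}$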
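Consider
$$\hat{A} = -\frac{1}{2} + \sqrt{\frac{1}{N}\sum_{n=0}^{N-1}x^2[n] + \frac{1}{4}}$$
This is a function of the SS $\sum x^2[n]$. Non-linear!
Biased since
$$E(\hat{A}) = E\left(-\frac{1}{2} + \sqrt{\frac{1}{N}\sum_{n=0}^{N-1}x^2[n] + \frac{1}{4}}\right) \neq -\frac{1}{2} + \sqrt{E\left(\frac{1}{N}\sum_{n=0}^{N-1}x^2[n]\right) + \frac{1}{4}} = -\frac{1}{2} + \sqrt{A + A^2 + \frac{1}{4}} = A$$
However, $\hat{A}$ is asymptotically efficient and asymptotically MVU.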
Asymptotic Optimality
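$\hat{A}$ is said to be consistent (at least for large data records, the estimator yields the true value).
Reasonable in that as $N \to \infty$
$$\frac{1}{N}\sum_{n=0}^{N-1}x^2[n] \to E(x^2[n]) = A + A^2$$
$$\hat{A} \to -\frac{1}{2} + \sqrt{A + A^2 + \frac{1}{4}} = -\frac{1}{2} + \sqrt{\left(A + \frac{1}{2}\right)^2} = A$$
Write the estimator as $\hat{A} = g(u)$, with
$$u = \frac{1}{N}\sum_{n=0}^{N-1}x^2[n], \qquad g(u) = -\frac{1}{2} + \sqrt{u + \frac{1}{4}}$$
[Figure: $g(u)$ versus $u$ with the PDF $p(u;A)$ centered at $u_0 = A + A^2$. Two panels: small $N$ and large $N$. The PDF concentrates about $u_0$ as $N \to \infty$.]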
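A minimal sketch of the consistency claim, assuming an illustrative true value $A = 1$ and a fixed random seed (neither comes from the slides). The helper names `estimate_A` and `simulate` are hypothetical; the estimator is the $-\frac{1}{2} + \sqrt{\cdot}$ form discussed in this example.

```python
import math
import random

def estimate_A(x):
    """A_hat = -1/2 + sqrt(mean(x^2) + 1/4)."""
    u = sum(v * v for v in x) / len(x)
    return -0.5 + math.sqrt(u + 0.25)

def simulate(A, N, rng):
    """Draw x[n] = A + w[n], with w[n] zero-mean Gaussian of variance A."""
    std = math.sqrt(A)
    return [A + rng.gauss(0.0, std) for _ in range(N)]

rng = random.Random(0)   # fixed seed (arbitrary choice)
A_true = 1.0             # illustrative value, not from the slides
for N in (10, 1000, 100000):
    # The estimate should settle near A_true as N grows.
    print(N, estimate_A(simulate(A_true, N, rng)))
```

The printed estimates should drift toward `A_true` as `N` increases, which is all that consistency promises; nothing here says how fast.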
To find the mean and variance approximately, we can linearize the estimator.
For large $N$, the PDF of $u = \frac{1}{N}\sum x^2[n]$ will be concentrated about its mean value, which is $u_0 = E(x^2[n]) = A + A^2$.
Linearizing $g(u) = -\frac{1}{2} + \sqrt{u + \frac{1}{4}}$ about $u_0$:
$$g(u) \approx g(u_0) + \left.\frac{dg}{du}\right|_{u=u_0}(u - u_0)$$
$$\hat{A} \approx A + \frac{1/2}{A + 1/2}\left[\frac{1}{N}\sum_{n=0}^{N-1}x^2[n] - (A + A^2)\right]$$
Valid as $N \to \infty$. Hence $E(\hat{A}) \to A$: $\hat{A}$ is asymptotically unbiased.
$$\mathrm{Var}(\hat{A}) = \left(\frac{1/2}{A + 1/2}\right)^2 \mathrm{Var}\left(\frac{1}{N}\sum_{n=0}^{N-1}x^2[n]\right) = \frac{1}{4N\left(A + \frac{1}{2}\right)^2}\,\mathrm{Var}(x^2[n])$$
But,
$$\mathrm{Var}(x^2[n]) = E(x^4[n]) - \left(E(x^2[n])\right)^2 = E\left((A + w[n])^4\right) - (A + A^2)^2$$
$$E\left((A + w[n])^4\right) = \sum_{k=0}^{4}\binom{4}{k}A^{4-k}E(w^k[n]) = A^4 + 6A^2E(w^2[n]) + E(w^4[n]) \quad (\text{odd-order moments} = 0)$$
$$= A^4 + 6A^3 + 3A^2$$
$$\mathrm{Var}(x^2[n]) = A^4 + 6A^3 + 3A^2 - (A^4 + 2A^3 + A^2) = 4A^3 + 2A^2 = 4A^2\left(A + \frac{1}{2}\right)$$
$$\mathrm{Var}(\hat{A}) = \frac{1}{4N\left(A + \frac{1}{2}\right)^2}\,4A^2\left(A + \frac{1}{2}\right) = \frac{A^2}{N\left(A + \frac{1}{2}\right)} = \mathrm{CRLB}!$$
$\hat{A}$ is asymptotically efficient.
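The algebra above can be spot-checked numerically. This is a small sketch with hypothetical helper names (`var_x_squared`, `asymptotic_var`, `crlb`); it only re-evaluates the Gaussian moment formulas for $x[n] = A + w[n]$ with noise variance $A$ and compares the linearized variance with $A^2 / (N(A + \frac{1}{2}))$.

```python
def var_x_squared(A):
    """Var(x^2[n]) for x[n] = A + w[n], w[n] Gaussian with mean 0 and variance A."""
    e_x2 = A**2 + A                          # E(x^2) = mean^2 + variance
    e_x4 = A**4 + 6 * A**2 * A + 3 * A**2    # E(x^4) = m^4 + 6 m^2 v + 3 v^2 with v = A
    return e_x4 - e_x2**2

def asymptotic_var(A, N):
    """Variance of the linearized estimator: (dg/du at u0)^2 * Var(x^2) / N."""
    slope = 0.5 / (A + 0.5)
    return slope**2 * var_x_squared(A) / N

def crlb(A, N):
    """CRLB for this problem: A^2 / (N (A + 1/2))."""
    return A**2 / (N * (A + 0.5))

for A in (0.5, 1.0, 2.0):
    # The two columns should agree for every A and N.
    print(A, asymptotic_var(A, 10), crlb(A, 10))
```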
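Note: for the estimator $\hat{A} = \bar{x}$, the variance is equal to
$$\frac{A}{N} > \frac{A^2}{N\left(A + \frac{1}{2}\right)}$$
Fact: If $\xi \sim G(\mu, \sigma^2)$, show that
$$E[\xi^2] = \mu^2 + \sigma^2$$
$$E[\xi^4] = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4$$
Here $w[n] \sim G(0, A)$.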
For finite data records we cannot say much. In practice, it works well.
Also, as $N \to \infty$,
$$\hat{A} \approx A + \frac{1/2}{A + 1/2}\left[\frac{1}{N}\sum_{n=0}^{N-1}x^2[n] - (A + A^2)\right]$$
is a linear function of $\frac{1}{N}\sum x^2[n]$, so $\hat{A}$ is GAUSSIAN BY CLT.
$\hat{A}$ is the Maximum Likelihood Estimator (MLE).
MLE Properties
1) Consistent
2) Asymptotically Efficient
3) Asymptotically Gaussian (assuming that
the number of estimated parameters < N)
4) Asymptotically MVU
5) Asymptotically Optimal
Finding MLE
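Definition: The MLE of $\theta$ is the value that maximizes $p(\mathbf{x};\theta)$ over the allowable range of $\theta$ (applies to scalar and vector $\theta$).
Rationale: If $\mathbf{x} = \mathbf{x}_0$ is observed, $\hat{\theta}$ is the value of $\theta$ that makes $p(\mathbf{x} = \mathbf{x}_0;\theta)$ largest, i.e. the value under which the observed data are most likely.
Given $\theta$: $p(\mathbf{x};\theta)$ is the PDF of $\mathbf{x}$.
Given $\mathbf{x}$: $p(\mathbf{x};\theta)$ is the Likelihood Function.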
Example
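Same setup as previous example:
$$p(\mathbf{x};A) = \frac{1}{(2\pi A)^{N/2}} \exp\left[-\frac{1}{2A}\sum_{n=0}^{N-1}(x[n]-A)^2\right]$$
Maximize over $A$:
$$\frac{\partial \ln p}{\partial A} = -\frac{N}{2A} + \frac{1}{A}\sum_{n=0}^{N-1}(x[n]-A) + \frac{1}{2A^2}\sum_{n=0}^{N-1}(x[n]-A)^2 = 0$$
Multiplying through by $2A^2$:
$$-NA + 2A\sum x[n] - 2NA^2 + \sum x^2[n] - 2A\sum x[n] + NA^2 = 0$$
$$-NA - NA^2 + \sum x^2[n] = 0$$
$$A^2 + A - \frac{1}{N}\sum_{n=0}^{N-1}x^2[n] = 0$$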
Example (Contd)
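$$\hat{A} = -\frac{1}{2} \pm \sqrt{\frac{1}{N}\sum_{n=0}^{N-1}x^2[n] + \frac{1}{4}}$$
Since $A > 0$ (it is also a variance), choose the $+$ sign:
$$\hat{A} = -\frac{1}{2} + \sqrt{\frac{1}{N}\sum_{n=0}^{N-1}x^2[n] + \frac{1}{4}}$$
This is a function of the SS.
Sometimes we are lucky and the MLE yields an efficient estimator even for finite data records.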
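As a sanity check on the closed form, the log likelihood can be maximized by brute force and compared with the positive root of the quadratic. This is only a sketch: the grid limits, step count, seed, and the values $A = 1$, $N = 50$ are illustrative assumptions, and `mle_grid` is a hypothetical helper.

```python
import math
import random

def log_likelihood(x, A):
    """ln p(x; A) for x[n] = A + w[n], w[n] Gaussian with variance A (A > 0)."""
    N = len(x)
    sq = sum((v - A) ** 2 for v in x)
    return -0.5 * N * math.log(2 * math.pi * A) - sq / (2 * A)

def mle_closed_form(x):
    """Positive root of A^2 + A - mean(x^2) = 0."""
    u = sum(v * v for v in x) / len(x)
    return -0.5 + math.sqrt(u + 0.25)

def mle_grid(x, lo=0.01, hi=5.0, steps=5000):
    """Brute-force search over candidate A values (hypothetical helper)."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda A: log_likelihood(x, A))

rng = random.Random(0)
x = [1.0 + rng.gauss(0.0, 1.0) for _ in range(50)]   # A = 1, so noise variance = 1
print(mle_closed_form(x), mle_grid(x))
```

The two numbers should agree to within about one grid step, since the likelihood has a single stationary point for $A > 0$.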
Example
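DC Level in WGN
$$p(\mathbf{x};A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right]$$
$$\frac{\partial \ln p}{\partial A} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n]-A) = 0 \;\Rightarrow\; \hat{A} = \frac{1}{N}\sum_{n=0}^{N-1}x[n]$$
Known to be efficient for finite $N$.
True in general: if an efficient estimator exists, the MLE will be it.
$$\left.\frac{\partial \ln p}{\partial \theta}\right|_{\theta = \hat{\theta}_{MLE}} = I(\hat{\theta}_{MLE})\,(\hat{\theta}_{eff} - \hat{\theta}_{MLE}) = 0 \;\Rightarrow\; \hat{\theta}_{MLE} = \hat{\theta}_{eff}$$
So in that case the MLE is the efficient (MVU) estimator.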
MLE Properties
PROPERTIES:
1) Consistent
2) Asymptotically Efficient
3) Asymptotically Gaussian
As $N \to \infty$,
$$\hat{\theta} \overset{a}{\sim} \mathcal{N}\left(\theta, I^{-1}(\theta)\right)$$
that is, GAUSSIAN, UNBIASED (mean $\theta$), and attaining the CRLB (variance $I^{-1}(\theta)$).
Or: Asymptotically MVU, Asymptotically Optimal.
Proof in book.
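Numerical Example
DC Level in WGN - Modified
How large must $N$ be for the asymptotic results to apply?
Use Monte Carlo simulation: for a given $N$, compute the estimates
$$\hat{E}(\hat{A}) = \frac{1}{M}\sum_{i=1}^{M}\hat{A}_i$$
$$\widehat{\mathrm{Var}}(\hat{A}) = \frac{1}{M}\sum_{i=1}^{M}\left(\hat{A}_i - \hat{E}(\hat{A})\right)^2$$
with $M = 1000$ realizations.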
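The Monte Carlo recipe above might be coded along the following lines. This is a sketch, not the simulation behind the slides: the seed, the helper names, and the use of Python's `random.gauss` are assumptions, so the numbers will not match any published table digit for digit.

```python
import math
import random

def estimate_A(x):
    """A_hat = -1/2 + sqrt(mean(x^2) + 1/4)."""
    u = sum(v * v for v in x) / len(x)
    return -0.5 + math.sqrt(u + 0.25)

def monte_carlo(A, N, M, seed=0):
    """Return (sample mean of A_hat, N * sample variance of A_hat) over M realizations."""
    rng = random.Random(seed)
    std = math.sqrt(A)                       # noise variance equals A
    est = []
    for _ in range(M):
        x = [A + rng.gauss(0.0, std) for _ in range(N)]
        est.append(estimate_A(x))
    mean = sum(est) / M
    var = sum((e - mean) ** 2 for e in est) / M   # 1/M normalization, as in the formulas above
    return mean, N * var

for N in (5, 10, 15, 20, 25):
    print(N, monte_carlo(A=1.0, N=N, M=1000))
```

Scaling the variance by `N` makes the rows comparable with the asymptotic value $A^2 / (A + 0.5)$.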
Example (Contd)
For $A = 1$ the results are:

N             $E(\hat{A})$       $N\,\mathrm{Var}(\hat{A})$
5             0.954              0.624 < CRLB!
10            0.976              0.648 < CRLB!
15            0.991              0.696
20            0.996 (0.987)*     0.707 (0.669)*
25            0.994              0.656 < CRLB!
Theoretical   1                  0.667 = $A^2/(A + 0.5)$

* $M = 5000$
Example (Contd)
To assess the PDF, recall
$$\hat{A} \overset{a}{\sim} \mathcal{N}\left(A, I^{-1}(A)\right) = \mathcal{N}\left(1, \frac{2/3}{N}\right)$$
[Figure 2: Theoretical PDF (theoretical asymptotic limit) and histogram of $\hat{A}$. a) $N = 5$, b) $N = 20$.]
Summary
$$\hat{\theta} \overset{a}{\sim} \mathcal{N}\left(\theta, I^{-1}(\theta)\right)$$
Asymptotically unbiased
Asymptotically attains CRLB
Asymptotically efficient
Optimal for large data records
Usually $N$ does not have to be very large!
Example
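Phase Estimation
$x[n] = A\cos(2\pi f_0 n + \phi) + w[n]$; $A$, $f_0$ known, estimate $\phi$.
To find MLE:
$$p(\mathbf{x};\phi) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\left(x[n] - A\cos(2\pi f_0 n + \phi)\right)^2\right]$$
Must minimize
$$J(\phi) = \sum_{n=0}^{N-1}\left(x[n] - A\cos(2\pi f_0 n + \phi)\right)^2$$
$$\frac{\partial J}{\partial \phi} = 2\sum_{n=0}^{N-1}\left(x[n] - A\cos(2\pi f_0 n + \phi)\right)A\sin(2\pi f_0 n + \phi) = 0$$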
Example (Contd)
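$$\sum_n x[n]\sin(2\pi f_0 n + \hat{\phi}) = A\sum_n \sin(2\pi f_0 n + \hat{\phi})\cos(2\pi f_0 n + \hat{\phi}) = \frac{A}{2}\sum_n \sin(4\pi f_0 n + 2\hat{\phi}) \approx 0$$
The right-hand side can be ignored for $f_0$ not near 0 or 1/2 (HW problem from Chapter 3). Expanding the LHS:
$$\sum_n x[n]\sin(2\pi f_0 n)\cos\hat{\phi} + \sum_n x[n]\cos(2\pi f_0 n)\sin\hat{\phi} = 0$$
OR
$$\hat{\phi} = -\arctan\frac{\sum_n x[n]\sin 2\pi f_0 n}{\sum_n x[n]\cos 2\pi f_0 n}$$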
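A minimal sketch of the arctangent phase estimator just derived. The function name `phase_mle` is hypothetical, and `math.atan2` is used here instead of a plain arctangent so that the quadrant is resolved; it coincides with $-\arctan(\cdot)$ whenever the cosine sum is positive. The test signal is noiseless and uses illustrative values.

```python
import math

def phase_mle(x, f0):
    """phi_hat from the sine and cosine correlations of x[n] at the known frequency f0."""
    s = sum(v * math.sin(2 * math.pi * f0 * n) for n, v in enumerate(x))
    c = sum(v * math.cos(2 * math.pi * f0 * n) for n, v in enumerate(x))
    # Equals -arctan(s / c) when c > 0; atan2 also handles c <= 0.
    return math.atan2(-s, c)

# Noiseless check with illustrative values (N * f0 * 2 is an integer here,
# so the ignored double-frequency sums vanish and the estimate is essentially exact).
N, A, f0, phi = 100, 1.0, 0.08, math.pi / 4
x_clean = [A * math.cos(2 * math.pi * f0 * n + phi) for n in range(N)]
print(phase_mle(x_clean, f0), phi)
```

For other record lengths the noiseless estimate is only approximately $\phi$, because the double-frequency term that was dropped in the derivation no longer sums to zero.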
Example (Contd)
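To assess performance of MLE:
$$\hat{\phi} \overset{a}{\sim} \mathcal{N}\left(\phi, I^{-1}(\phi)\right)$$
Recall that $I(\phi) = NA^2 / (2\sigma^2)$ (Example 3.4 in SK). Asymptotically,
$$\mathrm{Var}(\hat{\phi}) = \frac{1}{NA^2/(2\sigma^2)} = \frac{1}{N \cdot \mathrm{SNR}}, \qquad \mathrm{SNR} = \frac{A^2}{2\sigma^2}$$
How large does $N$ have to be?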
How Large Should N be?
For $A = 1$, $f_0 = 0.08$, $\phi = \pi/4$, $\sigma^2 = 0.05$:

N             $E(\hat{\phi})$    $N\,\mathrm{Var}(\hat{\phi})$
20            0.732              0.0978
40            0.746              0.108
60            0.774              0.110
80            0.789              0.099
Theoretical   $\phi = 0.785$     0.1
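An experiment of this kind can be sketched as follows, using the parameter values quoted above ($A = 1$, $f_0 = 0.08$, $\phi = \pi/4$, $\sigma^2 = 0.05$). The number of realizations, the seed, and the helper names are assumptions, so the output will differ somewhat from the table.

```python
import math
import random

def phase_mle(x, f0):
    """Arctangent phase estimator at known frequency f0 (atan2 resolves the quadrant)."""
    s = sum(v * math.sin(2 * math.pi * f0 * n) for n, v in enumerate(x))
    c = sum(v * math.cos(2 * math.pi * f0 * n) for n, v in enumerate(x))
    return math.atan2(-s, c)

def phase_monte_carlo(N, M, A=1.0, f0=0.08, phi=math.pi / 4, sigma2=0.05, seed=0):
    """Return (sample mean of phi_hat, N * sample variance of phi_hat)."""
    rng = random.Random(seed)
    std = math.sqrt(sigma2)
    est = []
    for _ in range(M):
        x = [A * math.cos(2 * math.pi * f0 * n + phi) + rng.gauss(0.0, std)
             for n in range(N)]
        est.append(phase_mle(x, f0))
    mean = sum(est) / M
    var = sum((e - mean) ** 2 for e in est) / M
    return mean, N * var

for N in (20, 40, 60, 80):
    print(N, phase_monte_carlo(N, M=1000))
```

The second column should hover around $2\sigma^2 / A^2 = 0.1$, the value of $N$ times the asymptotic variance.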
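Performance vs SNR
SNR in dB:
$$10\log_{10}\mathrm{Var}(\hat{\phi}) = 10\log_{10}\frac{1}{N \cdot \mathrm{SNR}} = -10\log_{10}N - 10\log_{10}\mathrm{SNR}$$
The CRLB becomes linear in SNR (on a log-log scale).
[Figure: $10\log_{10}$ Variance versus SNR (dB), SNR axis from -20 to 10 dB, showing the actual variance and the CRLB.]
For SNR $\geq$ -10 dB, the MLE attains the CRLB (the asymptotic variance).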
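The straight-line behaviour of the bound in dB can be written out directly. A small sketch with hypothetical function names; it assumes the asymptotic variance $1 / (N \cdot \mathrm{SNR})$ with $\mathrm{SNR} = A^2 / (2\sigma^2)$ and says nothing about where the actual estimator departs from the bound.

```python
import math

def crlb_linear(N, A, sigma2):
    """Asymptotic variance 1 / (N * SNR) with SNR = A^2 / (2 sigma^2)."""
    snr = A**2 / (2.0 * sigma2)
    return 1.0 / (N * snr)

def crlb_db(N, snr_db):
    """10 log10 of the same bound, as a function of SNR expressed in dB."""
    return -10.0 * math.log10(N) - snr_db

# Each 10 dB drop in SNR raises the bound by 10 dB: a line of slope -1.
for snr_db in (-20, -10, 0, 10):
    print(snr_db, crlb_db(80, snr_db))
```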
MLE Performance
At lower SNRs, a threshold effect occurs
High SNR: the log likelihood has a relatively stable peak (near $\phi = \pi/4$) from one realization to the other.
Low SNR: the log likelihood has many peaks. Outliers (due to high noise) give rise to increased variance (more than predicted by CRLB).
Threshold Effect
[Figure: $10\log_{10}$ Variance versus SNR (dB), with the threshold SNR marked.]
This behavior is characteristic of all non-linear estimators
For signal in noise problems, the CRLB is attained even for short
data records if SNR is high enough. Why?
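$$\hat{\phi} = -\arctan\frac{\sum_n \left(A\cos(2\pi f_0 n + \phi) + w[n]\right)\sin 2\pi f_0 n}{\sum_n \left(A\cos(2\pi f_0 n + \phi) + w[n]\right)\cos 2\pi f_0 n}$$
$$\approx -\arctan\frac{-\frac{NA}{2}\sin\phi + \sum_n w[n]\sin 2\pi f_0 n}{\frac{NA}{2}\cos\phi + \sum_n w[n]\cos 2\pi f_0 n} = \arctan\frac{\sin\phi - E_s}{\cos\phi + E_c}$$
Where
$$E_s = \frac{2}{NA}\sum_n w[n]\sin 2\pi f_0 n, \qquad E_c = \frac{2}{NA}\sum_n w[n]\cos 2\pi f_0 n$$
If $E_s = E_c = 0$ (no noise), $\hat{\phi} = \phi$.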
MLE For Transformed Parameters
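How do we find the MLE for a function of $\theta$, for example, $A^2$ instead of $A$?
Example: $x[n] = A + w[n]$, $w[n]$ WGN.
Want the MLE of $\alpha = e^A$, which is a 1-1 transformation of $A$.
$$p(\mathbf{x};A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-A)^2\right], \qquad -\infty < A < \infty$$
This PDF is parameterized by $A$. Since $\alpha$ is a 1-1 transformation of $A$, we can equivalently use the transformed PDF
$$p_T(\mathbf{x};\alpha) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n]-\ln\alpha)^2\right], \qquad \alpha > 0$$
The PDFs are equivalent. The MLE of $\alpha$ is found by maximizing $p_T$ over $\alpha$:
$$\sum_{n=0}^{N-1}(x[n] - \ln\alpha)\frac{1}{\alpha} = 0 \;\Rightarrow\; \ln\hat{\alpha} = \frac{1}{N}\sum_{n=0}^{N-1}x[n] = \bar{x} \;\Rightarrow\; \hat{\alpha} = e^{\bar{x}}$$
But $\bar{x}$ is the MLE of $A$, so $\hat{\alpha} = e^{\hat{A}}$: plug the MLE into the transformation.
Invariance Property
Example: For the same data let $\alpha = A^2$. Not a 1-1 transformation.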
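For the 1-1 case $\alpha = e^A$, the invariance result can be checked by maximizing the transformed likelihood directly. This is a sketch under stated assumptions: $\sigma^2 = 1$, a true $A$ of 0.5, the grid over $\alpha$, and the seed are all illustrative choices, and `log_pT` is a hypothetical helper that drops terms not depending on $\alpha$.

```python
import math
import random

def log_pT(x, alpha, sigma2=1.0):
    """Log of p_T(x; alpha) up to an additive constant, using A = ln(alpha)."""
    a = math.log(alpha)
    return -sum((v - a) ** 2 for v in x) / (2.0 * sigma2)

rng = random.Random(0)
x = [0.5 + rng.gauss(0.0, 1.0) for _ in range(50)]    # A = 0.5, sigma^2 = 1 (illustrative)
x_bar = sum(x) / len(x)

grid = [0.01 + 0.002 * k for k in range(5000)]         # alpha from 0.01 to about 10
alpha_grid = max(grid, key=lambda al: log_pT(x, al))   # direct maximization over alpha
alpha_invariance = math.exp(x_bar)                     # plug the MLE of A into e^A
print(alpha_grid, alpha_invariance)
```

Both routes should land on the same value up to the grid resolution.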
Example (Contd)
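Cannot parameterize the PDF by $\alpha$ alone, since $A = \pm\sqrt{\alpha}$. Need two sets of PDFs as follows:
$$p_{T_1}(\mathbf{x};\alpha) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\left(x[n]-\sqrt{\alpha}\right)^2\right], \qquad \alpha \geq 0 \quad (A \geq 0)$$
$$p_{T_2}(\mathbf{x};\alpha) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\left(x[n]+\sqrt{\alpha}\right)^2\right], \qquad \alpha > 0 \quad (A < 0)$$
The MLE of $\alpha$ is found by finding the $\alpha$ that maximizes $p_{T_1}$ and $p_{T_2}$ as follows:
$$\hat{\alpha} = \arg\max_{\alpha}\left\{p_{T_1}(\mathbf{x};\alpha),\; p_{T_2}(\mathbf{x};\alpha)\right\}$$
or, equivalently, the $\alpha$ that maximizes $\max\left\{p_{T_1}(\mathbf{x};\alpha),\; p_{T_2}(\mathbf{x};\alpha)\right\}$.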
Example (Contd)
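For our example:
$$\hat{\alpha} = \arg\max_{\alpha \geq 0}\left\{p(\mathbf{x};\sqrt{\alpha}),\; p(\mathbf{x};-\sqrt{\alpha})\right\} = \left[\arg\max_{-\infty < A < \infty} p(\mathbf{x};A)\right]^2 = \hat{A}^2 = \bar{x}^2$$
Summary: the MLE of $A$ is plugged into the transformation, $\hat{\alpha} = \hat{A}^2$, even though the transformation is not 1-1.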
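The two-branch maximization for $\alpha = A^2$ can be sketched numerically as well. The negative true value of $A$, $\sigma^2 = 1$, the grid, and the seed are illustrative assumptions; `modified_loglik` is a hypothetical name for the larger of the two branch log likelihoods.

```python
import math
import random

def log_p(x, A, sigma2=1.0):
    """ln p(x; A) up to an additive constant."""
    return -sum((v - A) ** 2 for v in x) / (2.0 * sigma2)

def modified_loglik(x, alpha):
    """Larger of the two branches A = +sqrt(alpha) and A = -sqrt(alpha)."""
    r = math.sqrt(alpha)
    return max(log_p(x, r), log_p(x, -r))

rng = random.Random(0)
x = [-1.5 + rng.gauss(0.0, 1.0) for _ in range(50)]   # negative A chosen on purpose
x_bar = sum(x) / len(x)

grid = [0.002 * k for k in range(5001)]               # alpha from 0 to 10
alpha_hat = max(grid, key=lambda al: modified_loglik(x, al))
print(alpha_hat, x_bar ** 2)
```

With a negative sample mean the maximum comes from the $-\sqrt{\alpha}$ branch, yet the answer is still $\bar{x}^2$.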
Vector MLE Example : DC Level in WGN
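$x[n] = A + w[n]$; $\boldsymbol{\theta} = [A \;\; \sigma^2]^T$
$$\frac{\partial \ln p}{\partial A} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1}(x[n] - A) = 0 \;\Rightarrow\; \hat{A}_{MLE} = \bar{x}$$
and
$$\frac{\partial \ln p}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{n=0}^{N-1}(x[n] - A)^2 = 0$$
With $A = \hat{A}_{MLE} = \bar{x}$:
$$\hat{\sigma}^2_{MLE} = \frac{1}{N}\sum_{n=0}^{N-1}(x[n] - \bar{x})^2$$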
Example (Contd)
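$$\hat{\boldsymbol{\theta}}_{MLE} = \begin{bmatrix} \hat{A}_{MLE} \\ \hat{\sigma}^2_{MLE} \end{bmatrix} = \begin{bmatrix} \bar{x} \\ \frac{1}{N}\sum_{n=0}^{N-1}(x[n] - \bar{x})^2 \end{bmatrix}$$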
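In code, the vector MLE for the DC level and noise variance is just the sample mean and the variance normalized by $N$. A minimal sketch; `vector_mle` is a hypothetical name and the input values are made up for illustration.

```python
def vector_mle(x):
    """Return (A_hat, sigma2_hat): sample mean and variance with 1/N normalization."""
    N = len(x)
    x_bar = sum(x) / N
    sigma2_hat = sum((v - x_bar) ** 2 for v in x) / N   # divides by N, not N - 1
    return x_bar, sigma2_hat

print(vector_mle([1.0, 2.0, 3.0, 4.0]))
```

Note the divisor $N$: the MLE of the variance is the biased sample variance, not the usual $N - 1$ version.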
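$x[n] = A\cos(2\pi f_0 n + \phi) + w[n]$; $\boldsymbol{\theta} = [A \;\; f_0 \;\; \phi]^T$
$$p(\mathbf{x};\boldsymbol{\theta}) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{n}\left(x[n] - A\cos(2\pi f_0 n + \phi)\right)^2\right)$$
Minimize:
$$J(A, f_0, \phi) = \sum_{n}\left(x[n] - A\cos(2\pi f_0 n + \phi)\right)^2 = \sum_{n}\left(x[n] - A\cos\phi\cos(2\pi f_0 n) + A\sin\phi\sin(2\pi f_0 n)\right)^2$$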
Example (Contd)
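Let, $\alpha_1 = A\cos\phi$, $\alpha_2 = -A\sin\phi$.
$$J'(\alpha_1, \alpha_2, f_0) = (\mathbf{x} - \alpha_1\mathbf{c} - \alpha_2\mathbf{s})^T(\mathbf{x} - \alpha_1\mathbf{c} - \alpha_2\mathbf{s}) = (\mathbf{x} - H\boldsymbol{\alpha})^T(\mathbf{x} - H\boldsymbol{\alpha})$$
where,
$$\mathbf{c} = [1 \;\; \cos 2\pi f_0 \;\; \cdots \;\; \cos 2\pi f_0 (N-1)]^T$$
$$\mathbf{s} = [0 \;\; \sin 2\pi f_0 \;\; \cdots \;\; \sin 2\pi f_0 (N-1)]^T$$
$$H = [\mathbf{c} \;\; \mathbf{s}], \qquad \boldsymbol{\alpha} = [\alpha_1 \;\; \alpha_2]^T$$
and
$$A = \sqrt{\alpha_1^2 + \alpha_2^2}, \qquad \phi = \tan^{-1}\left(\frac{-\alpha_2}{\alpha_1}\right)$$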
Solution
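$$\hat{\boldsymbol{\alpha}} = (H^T H)^{-1} H^T \mathbf{x}$$
$$J'(\hat{\alpha}_1, \hat{\alpha}_2, f_0) = (\mathbf{x} - H\hat{\boldsymbol{\alpha}})^T(\mathbf{x} - H\hat{\boldsymbol{\alpha}}) = \mathbf{x}^T\left(I - H(H^T H)^{-1}H^T\right)\mathbf{x}$$
Hence, we want to maximize over $f_0$:
$$\mathbf{x}^T H (H^T H)^{-1} H^T \mathbf{x} = \begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix}^T \begin{bmatrix} \mathbf{c}^T\mathbf{c} & \mathbf{c}^T\mathbf{s} \\ \mathbf{s}^T\mathbf{c} & \mathbf{s}^T\mathbf{s} \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix}$$
If $f_0$ is not near 0 or 1/2:
$$\frac{1}{N}\mathbf{c}^T\mathbf{s} \approx 0, \qquad \frac{1}{N}\mathbf{c}^T\mathbf{c} \approx \frac{1}{2} \quad \& \quad \frac{1}{N}\mathbf{s}^T\mathbf{s} \approx \frac{1}{2}$$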
Example (Contd)
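$$\max_{f_0}\; \begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix}^T \begin{bmatrix} \frac{N}{2} & 0 \\ 0 & \frac{N}{2} \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix} = \max_{f_0}\; \frac{2}{N}\left[(\mathbf{c}^T\mathbf{x})^2 + (\mathbf{s}^T\mathbf{x})^2\right]$$
$$= \max_{f_0}\; \frac{2}{N}\left[\left(\sum_{n} x[n]\cos 2\pi f_0 n\right)^2 + \left(\sum_{n} x[n]\sin 2\pi f_0 n\right)^2\right]$$
$$= \max_{f_0}\; \frac{2}{N}\left|\sum_{n=0}^{N-1} x[n]\,e^{-j2\pi f_0 n}\right|^2 \quad (\text{periodogram})$$
The maximizing frequency is $\hat{f}_0$: find $\hat{f}_0$ from the peak of the periodogram.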
Example (Contd)
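Then,
$$\hat{\boldsymbol{\alpha}} = (\hat{H}^T\hat{H})^{-1}\hat{H}^T\mathbf{x} \approx \frac{2}{N}\begin{bmatrix} \hat{\mathbf{c}}^T\mathbf{x} \\ \hat{\mathbf{s}}^T\mathbf{x} \end{bmatrix} = \frac{2}{N}\begin{bmatrix} \sum_n x[n]\cos(2\pi\hat{f}_0 n) \\ \sum_n x[n]\sin(2\pi\hat{f}_0 n) \end{bmatrix}$$
where $\hat{\mathbf{c}}$, $\hat{\mathbf{s}}$ and $\hat{H}$ are $\mathbf{c}$, $\mathbf{s}$ and $H$ evaluated at $\hat{f}_0$.
Finally,
$$\hat{A} = \sqrt{\hat{\alpha}_1^2 + \hat{\alpha}_2^2} = \frac{2}{N}\left|\sum_n x[n]\,e^{-j2\pi\hat{f}_0 n}\right|$$
and
$$\hat{\phi} = \tan^{-1}\left(\frac{-\sum_n x[n]\sin(2\pi\hat{f}_0 n)}{\sum_n x[n]\cos(2\pi\hat{f}_0 n)}\right)$$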
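The three-step recipe (periodogram peak for $\hat{f}_0$, then $\hat{A}$ and $\hat{\phi}$ from the same complex sum) might look like this in code. It is a sketch, not an optimized implementation: the frequency grid search stands in for a proper peak finder, `sinusoid_mle` and `n_freq` are hypothetical names, and the test signal values and seed are illustrative.

```python
import cmath
import math
import random

def sinusoid_mle(x, n_freq=2000):
    """Approximate MLE of (A, f0, phi) via a periodogram peak search on a frequency grid."""
    N = len(x)
    best_f, best_X = None, None
    for k in range(1, n_freq):
        f = 0.5 * k / n_freq                     # 0 < f < 1/2, endpoints excluded
        X = sum(v * cmath.exp(-2j * math.pi * f * n) for n, v in enumerate(x))
        if best_X is None or abs(X) > abs(best_X):
            best_f, best_X = f, X                # keep the largest |X(f)|
    A_hat = 2.0 * abs(best_X) / N
    # X = sum x cos - j sum x sin, so arg(X) = atan2(-sum x sin, sum x cos).
    phi_hat = cmath.phase(best_X)
    return A_hat, best_f, phi_hat

rng = random.Random(0)
N, A, f0, phi = 100, 2.0, 0.1, 0.3               # illustrative values
x = [A * math.cos(2 * math.pi * f0 * n + phi) + rng.gauss(0.0, 0.1) for n in range(N)]
print(sinusoid_mle(x))
```

Because the grid only samples the periodogram, the frequency estimate is limited to the grid spacing; a finer grid or a local refinement around the peak would be the natural next step.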