
CH 7

This document discusses maximum likelihood estimation (MLE). MLE finds the parameter values that maximize the likelihood function given the observed data. MLE is consistent, asymptotically efficient, asymptotically Gaussian, and asymptotically optimal. Examples are provided to illustrate MLE for estimating the DC level in additive white Gaussian noise and estimating an unknown phase parameter. Simulation results show the MLE approaches the theoretical asymptotic properties as the number of observations increases.

Uploaded by

probability2
Copyright
© Attribution Non-Commercial (BY-NC)

Estimation Theory

Chapter 7
Maximum Likelihood Estimation
This is the "turn the crank" procedure, used in almost all practical systems. The MLE is approximately the MVU estimator (efficient) for large data records. In general, it is non-linear.
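Example: DC Level in WGN - Modified
\[ X[n] = A + W[n], \qquad n = 0, 1, \dots, N-1 \]
A is unknown (A > 0) and W[n] is WGN with variance A.
\[ p(\mathbf{X}; A) = \frac{1}{(2\pi A)^{N/2}} \exp\left[ -\frac{1}{2A} \sum_{n=0}^{N-1} (X[n]-A)^2 \right] \]
Try the CRLB theorem:
\[ \frac{\partial \ln p}{\partial A} = -\frac{N}{2A} + \frac{1}{A}\sum_{n=0}^{N-1} (X[n]-A) + \frac{1}{2A^2}\sum_{n=0}^{N-1} (X[n]-A)^2 \overset{?}{=} I(A)(\hat{A} - A) \]
It does not appear that the CRLB is attained. The bound itself is
\[ \mathrm{Var}(\hat{A}) \ge \frac{A^2}{N\left(A+\frac{1}{2}\right)} = \mathrm{CRLB} \]
(prove the CRLB bound!)
We propose the following approach: find $\hat{A}$ so that, as $N \to \infty$,
(1) $E(\hat{A}) \to A$
(2) $\mathrm{Var}(\hat{A}) \to \mathrm{CRLB}$
that is, so that $\hat{A}$ is asymptotically optimal.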
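Example (Contd)
Consider
\[ \hat{A} = -\frac{1}{2} + \sqrt{\frac{1}{N}\sum_{n=0}^{N-1} X^2[n] + \frac{1}{4}} \]
This estimator is biased, since
\[ E(\hat{A}) = E\left(-\frac{1}{2} + \sqrt{\frac{1}{N}\sum_{n=0}^{N-1} X^2[n] + \frac{1}{4}}\right) \ne -\frac{1}{2} + \sqrt{E\left(\frac{1}{N}\sum_{n=0}^{N-1} X^2[n]\right) + \frac{1}{4}} = -\frac{1}{2} + \sqrt{A + A^2 + \frac{1}{4}} = A \]
It is, however, asymptotically efficient (asymptotically MVU).
$\hat{A}$ is a function of the SS (sufficient statistic) $\sum X^2[n]$.
Non-linear!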
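The estimator above is easy to try numerically. The following is a minimal Python sketch, assuming NumPy is available; the function names `estimate_a` and `simulate`, the seed, and the sample sizes are illustrative choices and do not come from the slides.

```python
import numpy as np

def estimate_a(x):
    """Return -1/2 + sqrt(mean(x^2) + 1/4) for a 1-D array of samples."""
    x = np.asarray(x, dtype=float)
    return -0.5 + np.sqrt(np.mean(x ** 2) + 0.25)

def simulate(a_true, n, rng):
    """Draw n samples of X[n] = A + W[n] with W[n] ~ N(0, A)."""
    return a_true + np.sqrt(a_true) * rng.standard_normal(n)

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
for n in (10, 100, 10_000):
    # The estimate should settle near the true value A = 1 as n grows.
    print(n, estimate_a(simulate(1.0, n, rng)))
```

Because the square root is applied to the sample second moment, the estimate is a non-linear function of the data, which is why its mean and variance are analysed only asymptotically.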
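Asymptotic Optimality
$\hat{A}$ is said to be consistent (at least for large data records, the estimator yields the true value).
This is reasonable in that, as $N \to \infty$,
\[ \frac{1}{N}\sum_{n=0}^{N-1} X^2[n] \to E(X^2[n]) = A + A^2 \quad \text{(by the Law of Large Numbers)} \]
\[ \hat{A} \to -\frac{1}{2} + \sqrt{A + A^2 + \frac{1}{4}} = -\frac{1}{2} + \sqrt{\left(A + \frac{1}{2}\right)^2} = A \]
Write the estimator as a function of the sample second moment:
\[ \hat{A} = g(u), \qquad u = \frac{1}{N}\sum_{n=0}^{N-1} X^2[n] \]
\[ g(u) = -\frac{1}{2} + \sqrt{u + \frac{1}{4}}, \qquad \frac{dg}{du} = \frac{1/2}{\sqrt{u + \frac{1}{4}}} \]
[Figure: g(u) versus u together with the PDF p(u; A), which is centred on u_0 = A + A^2; the PDF is broad for small N and narrow for large N.]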
To find the mean and variance approximately, we can linearize the estimator (valid as $N \to \infty$).
For large N, the pdf of u will be concentrated about its mean value, which is $A + A^2$.
Linearizing g(u) about $u_0 = E(X^2[n]) = A + A^2$:
\[ g(u) \approx g(u_0) + \left.\frac{dg}{du}\right|_{u=u_0} (u - u_0) \]
\[ \hat{A} \approx A + \frac{1/2}{A + \frac{1}{2}} \left[ \frac{1}{N}\sum_{n=0}^{N-1} X^2[n] - (A + A^2) \right] \]
Valid as $N \to \infty$. Hence $E(\hat{A}) \to A$: $\hat{A}$ is asymptotically unbiased.

\[ \mathrm{Var}(\hat{A}) = \left( \frac{1/2}{A + \frac{1}{2}} \right)^2 \mathrm{Var}\left( \frac{1}{N}\sum_{n=0}^{N-1} X^2[n] \right) = \frac{1/4}{N\left(A + \frac{1}{2}\right)^2}\, \mathrm{Var}(X^2[n]) \]
But,
\[ \mathrm{Var}(X^2[n]) = E(X^4[n]) - \left(E(X^2[n])\right)^2 = E\left((A + W[n])^4\right) - (A^2 + A)^2 \]
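\[ E\left((A + W[n])^4\right) = \sum_{k=0}^{4} \binom{4}{k} A^k E\left(W^{4-k}[n]\right) = A^4 + 6A^2 E(W^2[n]) + E(W^4[n]) \quad \text{(odd-order moments} = 0) \]
\[ = A^4 + 6A^3 + 3A^2 \]
\[ \mathrm{Var}(X^2[n]) = A^4 + 6A^3 + 3A^2 - (A^4 + 2A^3 + A^2) = 4A^3 + 2A^2 = 4A^2\left(A + \frac{1}{2}\right) \]
\[ \mathrm{Var}(\hat{A}) = \frac{1/4}{N\left(A + \frac{1}{2}\right)^2}\, 4A^2\left(A + \frac{1}{2}\right) = \frac{A^2}{N\left(A + \frac{1}{2}\right)} = \mathrm{CRLB}! \]
$\hat{A}$ is asymptotically efficient.
Note: for the estimator $\hat{A} = \bar{X}$, the variance is equal to
\[ \frac{A}{N} > \frac{A^2}{N\left(A + \frac{1}{2}\right)} \]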
Fact
If $\xi \sim G(\mu, \sigma^2)$, show that
\[ E[\xi^2] = \mu^2 + \sigma^2 \]
\[ E[\xi^4] = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4 \]
Here $W[n] \sim G(0, A)$.
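The two Gaussian moment identities can be checked numerically without random sampling. The sketch below uses Gauss-Hermite quadrature from NumPy (`numpy.polynomial.hermite_e.hermegauss`), which integrates polynomials against the standard normal weight exactly up to degree 2*deg - 1. The helper name `gaussian_moment` and the test values of the mean and variance are assumptions made for illustration; this is a numerical check, not the proof the slide asks for.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gaussian_moment(order, mu, var, deg=8):
    """E[xi**order] for xi ~ N(mu, var), via Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(deg)        # weight function exp(-z^2 / 2)
    samples = mu + np.sqrt(var) * nodes     # map standard-normal nodes to N(mu, var)
    return np.sum(weights * samples ** order) / np.sqrt(2.0 * np.pi)

mu, var = 1.5, 0.7                          # arbitrary illustrative values
second = gaussian_moment(2, mu, var)
fourth = gaussian_moment(4, mu, var)
print(second, mu ** 2 + var)
print(fourth, mu ** 4 + 6 * mu ** 2 * var + 3 * var ** 2)

# Case used in the example: X[n] ~ N(A, A), so Var(X^2) should equal 4A^3 + 2A^2.
A = 2.0
var_x2 = gaussian_moment(4, A, A) - gaussian_moment(2, A, A) ** 2
print(var_x2, 4 * A ** 3 + 2 * A ** 2)
```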
For finite data records we cannot say much. In practice, it works well.
Also, as $N \to \infty$,
\[ \hat{A} \approx A + \frac{1/2}{A + \frac{1}{2}} \left[ \frac{1}{N}\sum_{n=0}^{N-1} X^2[n] - (A + A^2) \right] \]
is a linear function of $\frac{1}{N}\sum_n X^2[n]$, which is Gaussian by the CLT.
$\hat{A}$ is asymptotically Gaussian.
$\hat{A}$ is the Maximum Likelihood Estimator (MLE).
MLE Properties
1) Consistent
2) Asymptotically Efficient
3) Asymptotically Gaussian (assuming that
the number of estimated parameters < N)
4) Asymptotically MVU
5) Asymptotically Optimal
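Finding MLE
Definition: the MLE of $\theta$ is the value $\hat{\theta}$ that maximizes $p(\mathbf{X}; \theta)$ over the allowable range of $\theta$ (applies to scalar and vector $\theta$).
Rationale: suppose $\mathbf{X} = \mathbf{X}_0$ is observed.
[Figure: $p(\mathbf{X} = \mathbf{X}_0; \theta)$ plotted against $\theta$, with two candidate values $\theta_1$ and $\theta_2$ marked.]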
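Finding MLE
$p(\mathbf{X}; \theta)$: for a given $\theta$, this is the PDF of $\mathbf{X}$; for a given $\mathbf{X}$, it is the likelihood function.
If $\theta = \theta_1$, the probability of observing $\mathbf{X} = \mathbf{X}_0$ is small. But we did observe $\mathbf{X} = \mathbf{X}_0$, so the probability must have been large; $\theta = \theta_2$ is more reasonable. Our estimate should be
\[ \hat{\theta} = \arg\max_{\theta}\; p(\mathbf{X} = \mathbf{X}_0; \theta) \]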

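Example
Same setup as previous example:
\[ p(\mathbf{X}; A) = \frac{1}{(2\pi A)^{N/2}} \exp\left[ -\frac{1}{2A} \sum_{n=0}^{N-1} (X[n]-A)^2 \right] \]
Maximize over A:
\[ \frac{\partial \ln(p)}{\partial A} = -\frac{N}{2A} + \frac{1}{A}\sum_{n} (X[n]-A) + \frac{1}{2A^2}\sum_{n} (X[n]-A)^2 = 0 \]
Multiplying through by $2A^2$:
\[ -NA + 2A\sum_{n} X[n] - 2NA^2 + \sum_{n} X^2[n] - 2A\sum_{n} X[n] + NA^2 = 0 \]
\[ -NA - NA^2 + \sum_{n} X^2[n] = 0 \]
\[ A^2 + A - \frac{1}{N}\sum_{n} X^2[n] = 0 \]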
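Example (Contd)
\[ \hat{A} = -\frac{1}{2} \pm \sqrt{\frac{1}{N}\sum_{n} X^2[n] + \frac{1}{4}} \]
Since A > 0 (it is also a variance), choose the + sign:
\[ \hat{A} = -\frac{1}{2} + \sqrt{\frac{1}{N}\sum_{n} X^2[n] + \frac{1}{4}} \]
which is a function of the SS.
Sometimes we are lucky and the MLE yields an efficient estimator even for finite data records.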
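As a sanity check on the closed-form root, the log-likelihood can also be maximized by brute force on a grid. The following is a small sketch assuming NumPy; the grid limits, grid density, seed, and the names `log_likelihood` and `mle_closed_form` are illustrative assumptions and not part of the slides.

```python
import numpy as np

def log_likelihood(a, x):
    """ln p(X; A) for X[n] = A + W[n], W[n] ~ N(0, A), A > 0."""
    n = x.size
    return -0.5 * n * np.log(2.0 * np.pi * a) - np.sum((x - a) ** 2) / (2.0 * a)

def mle_closed_form(x):
    """Positive root of A^2 + A - mean(x^2) = 0."""
    return -0.5 + np.sqrt(np.mean(x ** 2) + 0.25)

rng = np.random.default_rng(0)              # arbitrary seed
a_true, n = 1.0, 50
x = a_true + np.sqrt(a_true) * rng.standard_normal(n)

# Evaluate the log-likelihood on a dense grid of candidate A values.
grid = np.linspace(0.01, 5.0, 50_000)
a_grid = grid[np.argmax([log_likelihood(a, x) for a in grid])]
print(a_grid, mle_closed_form(x))           # the two should agree to grid resolution
```

A grid search like this is rarely needed when a closed form exists, but it is a convenient way to confirm that the chosen root really is the maximizer.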
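Example
DC Level in WGN
\[ p(\mathbf{X}; A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[ -\frac{1}{2\sigma^2} \sum_{n} (X[n]-A)^2 \right] \]
\[ \frac{\partial \ln p}{\partial A} = \frac{1}{\sigma^2} \sum_{n} (X[n]-A) = 0 \;\Rightarrow\; \hat{A} = \frac{1}{N}\sum_{n} X[n] = \bar{X} \]
Known to be efficient for finite N.
True in general: if an efficient estimator exists, the MLE will be it.
\[ \left.\frac{\partial \ln(p)}{\partial \theta}\right|_{\theta = \hat{\theta}_{MLE}} = I(\hat{\theta}_{MLE})\,(\hat{\theta}_{eff} - \hat{\theta}_{MLE}) = 0 \;\Rightarrow\; \hat{\theta}_{MLE} = \hat{\theta}_{eff} \]
In that case the MLE is efficient, and hence MVU.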
MLE Properties

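PROPERTIES:
1) Consistent
2) Asymptotically Efficient
3) Asymptotically Gaussian
As $N \to \infty$,
\[ \hat{\theta} \overset{a}{\sim} \mathcal{N}\left(\theta, I^{-1}(\theta)\right) \]
that is, $\hat{\theta}$ is asymptotically Gaussian, unbiased, and attains the CRLB.
Or: $\hat{\theta}$ is asymptotically $\mathcal{N}(\theta, I^{-1}(\theta))$, hence asymptotically MVU and asymptotically optimal.
Proof in book.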
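Numerical Example
DC Level in WGN - Modified
How large must N be for asymptotic results to apply?
Use Monte Carlo simulation: for a given N, compute the estimates $\hat{A}_i$ over M = 1000 realizations, then
\[ \hat{E}(\hat{A}) = \frac{1}{M}\sum_{i=1}^{M} \hat{A}_i \]
\[ \widehat{\mathrm{Var}}(\hat{A}) = \frac{1}{M}\sum_{i=1}^{M} \left( \hat{A}_i - \hat{E}(\hat{A}) \right)^2 \]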
Example (Contd)
For A = 1, the results are:

N             E(A_hat)          N Var(A_hat)
5             0.954             0.624 (< CRLB!)
10            0.976             0.648 (< CRLB!)
15            0.991             0.696
20            0.996 (0.987)*    0.707 (0.669)*
25            0.994             0.656 (< CRLB!)
Theoretical   1                 0.667 = A^2/(A + 0.5)

* M = 5000
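An experiment of this kind can be sketched in a few lines of Python. The code below assumes NumPy; the function name `monte_carlo` and the seed are illustrative, and because the random draws differ from those behind the slide, the printed numbers will only resemble the tabulated ones, not reproduce them.

```python
import numpy as np

def monte_carlo(a_true, n, m, rng):
    """Return (mean of A_hat, N * variance of A_hat) over m realizations."""
    # Each row is one realization of X[n] = A + W[n], W[n] ~ N(0, A).
    x = a_true + np.sqrt(a_true) * rng.standard_normal((m, n))
    a_hat = -0.5 + np.sqrt(np.mean(x ** 2, axis=1) + 0.25)
    return a_hat.mean(), n * a_hat.var()    # var() uses the 1/M normalization

rng = np.random.default_rng(0)              # arbitrary seed
a_true, m = 1.0, 1000
theory = a_true ** 2 / (a_true + 0.5)       # N * CRLB
for n in (5, 10, 15, 20, 25):
    mean_hat, nvar_hat = monte_carlo(a_true, n, m, rng)
    print(f"N={n:2d}  E={mean_hat:.3f}  N*Var={nvar_hat:.3f}  (theory {theory:.3f})")
```

With only M = 1000 realizations the sample variance itself fluctuates by a few percent, which is one plausible reason an estimated N Var can land slightly below the bound.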
Example (Contd)
To assess the PDF, recall
\[ \hat{A} \overset{a}{\sim} \mathcal{N}\left(A, I^{-1}(A)\right) = \mathcal{N}\left(1, \frac{2/3}{N}\right) \]
[Figures 2a and 2b: theoretical PDF (asymptotic limit) and histogram of $\hat{A}$ for a) N = 5 and b) N = 20.]
Summary
\[ \hat{\theta} \overset{a}{\sim} \mathcal{N}\left(\theta, I^{-1}(\theta)\right) \]
Asymptotically unbiased
Asymptotically attains CRLB
Asymptotically efficient
Optimal for large data records
Usually N does not have to be very large!
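Example
Phase Estimation
\[ X[n] = A\cos(2\pi f_0 n + \phi) + W[n], \qquad n = 0, 1, \dots, N-1 \]
A, $f_0$ known; estimate $\phi$.
To find MLE:
\[ p(\mathbf{X}; \phi) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[ -\frac{1}{2\sigma^2} \sum_{n=0}^{N-1} \left( X[n] - A\cos(2\pi f_0 n + \phi) \right)^2 \right] \]
Must minimize
\[ J(\phi) = \sum_{n=0}^{N-1} \left( X[n] - A\cos(2\pi f_0 n + \phi) \right)^2 \]
\[ \frac{\partial J}{\partial \phi} = 2\sum_{n=0}^{N-1} \left( X[n] - A\cos(2\pi f_0 n + \phi) \right) A\sin(2\pi f_0 n + \phi) = 0 \]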
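Example (Contd)
\[ \sum_{n} X[n]\sin(2\pi f_0 n + \hat{\phi}) = A\sum_{n} \sin(2\pi f_0 n + \hat{\phi})\cos(2\pi f_0 n + \hat{\phi}) \]
The product term can be ignored, because
\[ A\sum_{n} \sin(2\pi f_0 n + \hat{\phi})\cos(2\pi f_0 n + \hat{\phi}) = \frac{A}{2}\sum_{n} \sin(4\pi f_0 n + 2\hat{\phi}) \approx 0 \]
for $f_0$ not near 0 or 1/2 (HW problem from Chapter 3). Hence
\[ \sum_{n} X[n]\sin(2\pi f_0 n + \hat{\phi}) = 0 \]
\[ \sum_{n} X[n]\sin(2\pi f_0 n)\cos\hat{\phi} = -\sum_{n} X[n]\cos(2\pi f_0 n)\sin\hat{\phi} \]
OR
\[ \hat{\phi} = -\arctan \frac{\sum_{n} X[n]\sin 2\pi f_0 n}{\sum_{n} X[n]\cos 2\pi f_0 n} \]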
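Example (Contd)
To assess performance of MLE:
\[ \hat{\phi} \overset{a}{\sim} \mathcal{N}\left(\phi, I^{-1}(\phi)\right) \]
Recall that $I(\phi) = NA^2 / (2\sigma^2)$ (Example 3.4 in SK).
Asymptotically,
\[ \mathrm{Var}(\hat{\phi}) = \frac{1}{N A^2 / (2\sigma^2)} = \frac{1}{N \cdot \mathrm{SNR}}, \qquad \mathrm{SNR} = \frac{A^2}{2\sigma^2} \]
How large does N have to be?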
How Large Should N be?
For A = 1, f_0 = 0.08, phi = pi/4, sigma^2 = 0.05:

N             E(phi_hat)        N Var(phi_hat)
20            0.732             0.0978
40            0.746             0.108
60            0.774             0.110
80            0.789             0.099
Theoretical   phi = 0.785       N Var(phi_hat) = 0.1

Need about N = 80.
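The phase estimator and an experiment with the parameter values quoted above can be sketched as follows. This assumes NumPy; the names `phase_mle` and `phase_trial`, the seed, and the number of realizations are illustrative assumptions. The code uses the principal-value arctangent exactly as in the formula, which is adequate here because the denominator is positive for phi = pi/4.

```python
import numpy as np

def phase_mle(x, f0):
    """phi_hat = -arctan( sum x[n] sin(2 pi f0 n) / sum x[n] cos(2 pi f0 n) )."""
    n = np.arange(x.shape[-1])
    num = np.sum(x * np.sin(2 * np.pi * f0 * n), axis=-1)
    den = np.sum(x * np.cos(2 * np.pi * f0 * n), axis=-1)
    return -np.arctan(num / den)

def phase_trial(n_samples, m, rng, a=1.0, f0=0.08, phi=np.pi / 4, var=0.05):
    """Return (mean of phi_hat, N * variance of phi_hat) over m realizations."""
    n = np.arange(n_samples)
    signal = a * np.cos(2 * np.pi * f0 * n + phi)
    x = signal + np.sqrt(var) * rng.standard_normal((m, n_samples))
    phi_hat = phase_mle(x, f0)              # one estimate per row
    return phi_hat.mean(), n_samples * phi_hat.var()

rng = np.random.default_rng(0)              # arbitrary seed
for n_samples in (20, 40, 60, 80):
    mean_hat, nvar_hat = phase_trial(n_samples, 1000, rng)
    print(f"N={n_samples}  E={mean_hat:.3f}  N*Var={nvar_hat:.3f}")
```

For other phase values a four-quadrant arctangent (`np.arctan2`) would be the safer choice, at the cost of departing slightly from the formula as written.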


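Performance vs SNR
Plot $10\log_{10}\mathrm{var}(\hat{\phi})$ vs SNR in dB, as well as the CRLB:
\[ 10\log_{10}\mathrm{Var}(\hat{\phi})_{CRLB} = 10\log_{10}\frac{1}{N \cdot \mathrm{SNR}} = -10\log_{10} N - 10\log_{10}\mathrm{SNR} \]
The CRLB becomes linear in SNR (on a log-log scale).
[Figure: $10\log_{10}$ variance versus SNR (dB), showing the actual variance and the CRLB.]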
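The straight-line form of the bound is simple to tabulate. A minimal sketch, assuming NumPy; the function name `crlb_phase_db`, the choice N = 80, and the SNR values are illustrative and are not taken from the plot.

```python
import numpy as np

def crlb_phase_db(n, snr_db):
    """10*log10 of the phase CRLB 1/(N*SNR), with the SNR supplied in dB."""
    # 10*log10(1/(N*SNR)) = -10*log10(N) - SNR_dB
    return -10.0 * np.log10(n) - np.asarray(snr_db, dtype=float)

snr_db = np.array([-20.0, -10.0, 0.0, 10.0])
print(crlb_phase_db(80, snr_db))            # drops by 10 dB per 10 dB of SNR
```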
Performance vs SNR
For SNR above about -10 dB, the MLE attains the CRLB (the asymptotic variance).
At lower SNRs, a threshold effect occurs.
[Figure: MLE variance versus SNR, with the SNR axis running from -20 dB to 10 dB.]
MLE Performance
[Figure: log likelihood versus phase for two cases. High SNR: relatively stable peak (near pi/4) from one realization to the other. Low SNR: many peaks, with an outlier peak away from pi/4.]
Outliers (due to high noise) give rise to increased variance (more than predicted by CRLB).
Threshold Effect
[Figure: $10\log_{10}$ variance versus SNR (dB), with the threshold SNR marked.]
This behavior is characteristic of all non-linear estimators
For signal in noise problems, the CRLB is attained even for short
data records if SNR is high enough. Why?

MLE Performance @ High SNR
\[ \hat{\phi} = -\arctan \frac{\sum_{n} \left( A\cos(2\pi f_0 n + \phi) + W[n] \right)\sin 2\pi f_0 n}{\sum_{n} \left( A\cos(2\pi f_0 n + \phi) + W[n] \right)\cos 2\pi f_0 n} \]
\[ \approx -\arctan \frac{-\frac{NA}{2}\sin\phi + \sum_{n} W[n]\sin 2\pi f_0 n}{\frac{NA}{2}\cos\phi + \sum_{n} W[n]\cos 2\pi f_0 n} \]
(Show this. Hint: apply trigonometric identities for sin(a)cos(b) and cos(a)cos(b).)
\[ = \arctan \frac{\sin\phi + E_s}{\cos\phi + E_c} \]
where
\[ E_s = -\frac{2}{NA}\sum_{n} W[n]\sin 2\pi f_0 n, \qquad E_c = \frac{2}{NA}\sum_{n} W[n]\cos 2\pi f_0 n \]
If $E_s = E_c = 0$ (no noise), $\hat{\phi} = \phi$ for all N.
The asymptotic PDF will be valid when $E_s$, $E_c$ are small. Then the arctan transformation is approximately linear.
In general, the asymptotic PDF holds when the error is small: either $N \to \infty$ or $A \to \infty$ (high SNR), or both.
MLE For Transformed Parameters
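How do we find the MLE for a function of $\theta$, for example, $A^2$ instead of A?
Example: X[n] = A + W[n], where W[n] is WGN.
Want the MLE of $\alpha = e^{A}$, which is a 1-1 transformation of A.
\[ p(\mathbf{X}; A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[ -\frac{1}{2\sigma^2}\sum_{n} (X[n]-A)^2 \right], \qquad -\infty < A < \infty \]
This PDF is parameterized by A. Since $\alpha$ is a 1-1 transformation of A, we can equivalently use a PDF parameterized by $\alpha$.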
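MLE For Transformed Parameters
\[ p_T(\mathbf{X}; \alpha) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[ -\frac{1}{2\sigma^2}\sum_{n} (X[n] - \ln\alpha)^2 \right], \qquad \alpha > 0 \]
The PDFs are equivalent. The MLE of $\alpha$ is found by maximizing $p_T$ over $\alpha$:
\[ \frac{\partial \ln p_T}{\partial \alpha} = \frac{1}{\sigma^2}\sum_{n} (X[n] - \ln\hat{\alpha})\frac{1}{\hat{\alpha}} = 0 \]
\[ \sum_{n} (X[n] - \ln\hat{\alpha}) = 0 \;\Rightarrow\; \ln\hat{\alpha} = \frac{1}{N}\sum_{n} X[n] = \bar{X} \;\Rightarrow\; \hat{\alpha} = e^{\bar{X}} \]
But $\hat{A} = \bar{X}$ is the MLE of A, so $\hat{\alpha} = e^{\hat{A}}$.
Invariance Property: the MLE of $\alpha$ is found by substituting $\hat{A}$ into the transformation.
Example: for the same data, let $\alpha = A^2$. This is not a 1-1 transformation.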
Example (Contd)
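Cannot parameterize the pdf by $\alpha$ alone, since $A = \pm\sqrt{\alpha}$. Need two sets as follows:
\[ p_{T_1}(\mathbf{X}; \alpha) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[ -\frac{1}{2\sigma^2}\sum_{n} (X[n] - \sqrt{\alpha})^2 \right], \qquad \alpha \ge 0 \quad (A \ge 0) \]
\[ p_{T_2}(\mathbf{X}; \alpha) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[ -\frac{1}{2\sigma^2}\sum_{n} (X[n] + \sqrt{\alpha})^2 \right], \qquad \alpha > 0 \quad (A < 0) \]
to cover all possibilities.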
Example (Contd)
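The MLE of $\alpha$ is found by finding the $\alpha$ that maximizes $p_{T_1}$ and $p_{T_2}$ as follows:
\[ \hat{\alpha} = \arg\max_{\alpha} \left\{ p_{T_1}(\mathbf{X}; \alpha),\; p_{T_2}(\mathbf{X}; \alpha) \right\} \]
or
1) For a given $\alpha$, determine whether $p_{T_1}$ or $p_{T_2}$ is larger; call the larger one $\bar{p}_T(\mathbf{X}; \alpha)$. Repeat for all $\alpha > 0$ (at $\alpha = 0$, $p_{T_1}(\mathbf{X}; 0) = p_{T_2}(\mathbf{X}; 0)$).
2) The MLE is given by the $\alpha$ that maximizes $\bar{p}_T$ over $\alpha \ge 0$.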
Example (Contd)
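For our example:
\[ \hat{\alpha} = \arg\max_{\alpha \ge 0} \left\{ p_{T_1}(\mathbf{X}; \alpha),\; p_{T_2}(\mathbf{X}; \alpha) \right\} \]
\[ = \left[ \arg\max_{\sqrt{\alpha} \ge 0} \left\{ p(\mathbf{X}; \sqrt{\alpha}),\; p(\mathbf{X}; -\sqrt{\alpha}) \right\} \right]^2 \]
\[ = \left[ \arg\max_{-\infty < A < \infty} p(\mathbf{X}; A) \right]^2 = \hat{A}^2 = \bar{X}^2 \]
Summary
The invariance property still holds, but $\hat{\alpha}$ maximizes the modified likelihood function $\bar{p}_T(\mathbf{X}; \alpha)$, the maximum of all $p(\mathbf{X}; \theta)$ where $\alpha = g(\theta)$; the maximum is taken over all $\theta$ yielding the same $\alpha$.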
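The modified likelihood for the transformation from A to its square can be explored numerically. The sketch below, assuming NumPy, evaluates the larger of the two branch likelihoods on a grid of candidate values and compares the maximizer with the squared sample mean. The names `log_p` and `modified_log_likelihood`, the noise variance, the seed, and the grid are illustrative assumptions.

```python
import numpy as np

def log_p(a, x, var):
    """ln p(X; A) for X[n] = A + W[n] with known noise variance `var`."""
    return -0.5 * x.size * np.log(2 * np.pi * var) - np.sum((x - a) ** 2) / (2 * var)

def modified_log_likelihood(alpha, x, var):
    """Larger of the two branches A = +sqrt(alpha) and A = -sqrt(alpha)."""
    root = np.sqrt(alpha)
    return max(log_p(root, x, var), log_p(-root, x, var))

rng = np.random.default_rng(0)              # arbitrary seed
var = 1.0
x = -0.8 + np.sqrt(var) * rng.standard_normal(25)   # true A is negative on purpose

alphas = np.linspace(0.0, 4.0, 40_001)      # candidate values of alpha = A^2
values = [modified_log_likelihood(al, x, var) for al in alphas]
alpha_grid = alphas[int(np.argmax(values))]
print(alpha_grid, np.mean(x) ** 2)          # should agree to grid resolution
```

Choosing a negative true A makes the point that the second branch matters: using only the positive-root PDF would place the maximum at the wrong value whenever the sample mean is negative.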
Vector MLE Example : DC Level in WGN
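\[ x[n] = A + w[n], \qquad \boldsymbol{\theta} = [A \;\; \sigma^2]^T \]
\[ \frac{\partial \ln p}{\partial A} = \frac{1}{\sigma^2}\sum_{n=0}^{N-1} (x[n] - A) = 0 \;\Rightarrow\; \hat{A}_{MLE} = \bar{x} \]
and
\[ \frac{\partial \ln p}{\partial \sigma^2} = -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{n=0}^{N-1} (x[n] - A)^2 = 0 \]
Substituting $\hat{A}_{MLE} = \bar{x}$:
\[ -\frac{N}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{n=0}^{N-1} (x[n] - \bar{x})^2 = 0 \]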
Example (Contd)
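\[ \hat{\sigma}^2_{MLE} = \frac{1}{N}\sum_{n=0}^{N-1} (x[n] - \bar{x})^2 \]
\[ \hat{\boldsymbol{\theta}}_{MLE} = \begin{bmatrix} \bar{x} \\ \frac{1}{N}\sum_{n=0}^{N-1} (x[n] - \bar{x})^2 \end{bmatrix} \]
Verify that it is asymptotically unbiased and efficient by computing the mean and covariance of this MLE!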
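In code, the vector MLE is just the sample mean and the 1/N sample variance. A minimal sketch assuming NumPy; the function name `dc_level_vector_mle` and the simulated values A = 3 and variance 4 are illustrative assumptions.

```python
import numpy as np

def dc_level_vector_mle(x):
    """Return (A_hat, var_hat) for x[n] = A + w[n] with unknown A and noise variance."""
    x = np.asarray(x, dtype=float)
    a_hat = x.mean()
    var_hat = np.mean((x - a_hat) ** 2)     # 1/N normalization, not 1/(N-1)
    return a_hat, var_hat

rng = np.random.default_rng(0)              # arbitrary seed
x = 3.0 + 2.0 * rng.standard_normal(1000)   # illustrative data: A = 3, variance = 4
a_hat, var_hat = dc_level_vector_mle(x)
print(a_hat, var_hat, np.var(x))            # np.var defaults to ddof=0, the same 1/N form
```

The 1/N normalization makes the variance estimate biased for finite N (its mean is (N-1)/N times the true variance), which is consistent with the estimator being only asymptotically unbiased.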
Another Vector MLE Example
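\[ x[n] = A\cos(2\pi f_0 n + \phi) + w[n], \qquad \boldsymbol{\theta} = [A \;\; f_0 \;\; \phi]^T \]
\[ p(\mathbf{x}; \boldsymbol{\theta}) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left( -\frac{1}{2\sigma^2}\sum_{n} \left( x[n] - A\cos(2\pi f_0 n + \phi) \right)^2 \right) \]
Minimize:
\[ J(A, f_0, \phi) = \sum_{n} \left( x[n] - A\cos(2\pi f_0 n + \phi) \right)^2 \]
\[ = \sum_{n} \left( x[n] - A\cos\phi\cos(2\pi f_0 n) + A\sin\phi\sin(2\pi f_0 n) \right)^2 \]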
Example (Contd)
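Let $\alpha_1 = A\cos\phi$, $\alpha_2 = -A\sin\phi$, and
\[ \mathbf{c} = [1 \;\; \cos 2\pi f_0 \;\; \cdots \;\; \cos 2\pi f_0 (N-1)]^T \]
\[ \mathbf{s} = [0 \;\; \sin 2\pi f_0 \;\; \cdots \;\; \sin 2\pi f_0 (N-1)]^T \]
Then
\[ J'(\alpha_1, \alpha_2, f_0) = (\mathbf{x} - \alpha_1\mathbf{c} - \alpha_2\mathbf{s})^T (\mathbf{x} - \alpha_1\mathbf{c} - \alpha_2\mathbf{s}) = (\mathbf{x} - \mathbf{H}\boldsymbol{\alpha})^T (\mathbf{x} - \mathbf{H}\boldsymbol{\alpha}) \]
where $\boldsymbol{\alpha} = [\alpha_1 \;\; \alpha_2]^T$ and $\mathbf{H} = [\mathbf{c} \;\; \mathbf{s}]$ is $N \times 2$. The original parameters follow from
\[ A = \sqrt{\alpha_1^2 + \alpha_2^2}, \qquad \phi = \tan^{-1}\left( \frac{-\alpha_2}{\alpha_1} \right) \]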
Solution
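\[ \hat{\boldsymbol{\alpha}} = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x} \]
\[ J'(\hat{\alpha}_1, \hat{\alpha}_2, f_0) = (\mathbf{x} - \mathbf{H}\hat{\boldsymbol{\alpha}})^T (\mathbf{x} - \mathbf{H}\hat{\boldsymbol{\alpha}}) = \mathbf{x}^T \left( \mathbf{I} - \mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T \right) \mathbf{x} \]
Hence, we want to maximize $\mathbf{x}^T\mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x}$ over $f_0$:
\[ \text{maximize} \quad \begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix}^T \begin{bmatrix} \mathbf{c}^T\mathbf{c} & \mathbf{c}^T\mathbf{s} \\ \mathbf{s}^T\mathbf{c} & \mathbf{s}^T\mathbf{s} \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix} \]
If $f_0$ is not near 0 or 1/2:
\[ \frac{1}{N}\mathbf{c}^T\mathbf{s} \approx 0, \qquad \frac{1}{N}\mathbf{c}^T\mathbf{c} \approx \frac{1}{2} \quad \& \quad \frac{1}{N}\mathbf{s}^T\mathbf{s} \approx \frac{1}{2} \]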
Example (Contd)
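\[ \max_{f_0}\; \mathbf{x}^T\mathbf{H}(\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x} \approx \max_{f_0}\; \begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix}^T \begin{bmatrix} \frac{2}{N} & 0 \\ 0 & \frac{2}{N} \end{bmatrix} \begin{bmatrix} \mathbf{c}^T\mathbf{x} \\ \mathbf{s}^T\mathbf{x} \end{bmatrix} \]
\[ = \max_{f_0}\; \frac{2}{N}\left[ (\mathbf{c}^T\mathbf{x})^2 + (\mathbf{s}^T\mathbf{x})^2 \right] \]
\[ = \max_{f_0}\; \frac{2}{N}\left[ \left( \sum_{n} x[n]\cos 2\pi f_0 n \right)^2 + \left( \sum_{n} x[n]\sin 2\pi f_0 n \right)^2 \right] \]
\[ = \max_{f_0}\; \frac{2}{N}\left| \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi f_0 n} \right|^2 \;\Rightarrow\; \text{find } \hat{f}_0 \]
The quantity being maximized is (up to scale) the periodogram.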
Example (Contd)

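Then,
\[ \hat{\boldsymbol{\alpha}} = (\mathbf{H}^T\mathbf{H})^{-1}\mathbf{H}^T\mathbf{x} \approx \frac{2}{N}\begin{bmatrix} \hat{\mathbf{c}}^T\mathbf{x} \\ \hat{\mathbf{s}}^T\mathbf{x} \end{bmatrix} = \frac{2}{N}\begin{bmatrix} \sum_{n} x[n]\cos(2\pi\hat{f}_0 n) \\ \sum_{n} x[n]\sin(2\pi\hat{f}_0 n) \end{bmatrix} \]
Finally,
\[ \hat{A} = \sqrt{\hat{\alpha}_1^2 + \hat{\alpha}_2^2} = \frac{2}{N}\left| \sum_{n} x[n]\, e^{-j 2\pi\hat{f}_0 n} \right| \]
and
\[ \hat{\phi} = \tan^{-1}\left( \frac{-\hat{\alpha}_2}{\hat{\alpha}_1} \right) = \tan^{-1}\left( \frac{-\sum_{n} x[n]\sin(2\pi\hat{f}_0 n)}{\sum_{n} x[n]\cos(2\pi\hat{f}_0 n)} \right) \]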