ET Lecture02
October 5, 2019
Introduction
Definition
An estimator is said to be unbiased if

E(\hat{\theta}) = \theta \quad \text{for } a < \theta < b,

where (a, b) is the range of possible values of θ.
Example 2.1
Consider estimating a DC level A from data x[n], n = 0, …, N − 1, with E(x[n]) = A. The sample-mean estimator

\hat{A} = \frac{1}{N}\sum_{n=0}^{N-1} x[n]

has expected value

E(\hat{A}) = \frac{1}{N}\sum_{n=0}^{N-1} E(x[n]) = A,

so \hat{A} is unbiased.
Unbiased estimators usually have a PDF that is symmetric and centered about the true
value of θ, although this is not necessary.
The restriction that E(θ̂) = θ for all θ is an important one. Letting θ̂ = g(x), where x = [x[0] x[1] ⋯ x[N − 1]]^T, it asserts that

E(\hat{\theta}) = \int g(\mathbf{x})\, p(\mathbf{x};\theta)\, d\mathbf{x} = \theta \quad \text{for all } \theta.
Biased Estimator
Example 2.2
\check{A} = \frac{1}{2N}\sum_{n=0}^{N-1} x[n]

has expected value E(Ǎ) = A/2, so it is biased whenever A ≠ 0.
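A short numerical sketch (not from the slides; the model x[n] = A + w[n] with white Gaussian w[n] and the parameter values below are assumptions) comparing the two estimators:

```python
import numpy as np

rng = np.random.default_rng(0)
A, sigma, N, trials = 3.0, 1.0, 50, 100_000

# One Monte Carlo trial per row: x[n] = A + w[n], w[n] ~ N(0, sigma^2)
x = A + sigma * rng.standard_normal((trials, N))

A_hat = x.mean(axis=1)             # unbiased estimator (1/N) * sum x[n]
A_check = x.sum(axis=1) / (2 * N)  # biased estimator (1/(2N)) * sum x[n]

print("mean of A_hat  :", A_hat.mean())    # close to A
print("mean of A_check:", A_check.mean())  # close to A/2
```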
Unbiasedness
Definition
The bias of an estimator is defined as
b(θ) = E(θ̂) − θ.
Optimal estimators
In searching for optimal estimators, we need to adopt some optimality
criterion. A natural one is the mean square error (MSE), defined as

\mathrm{mse}(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right].

This measures the average squared deviation of the estimator from the true value.
We may rewrite the MSE as
\mathrm{mse}(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right]
= E\left[\bigl((\hat{\theta} - E(\hat{\theta})) + (E(\hat{\theta}) - \theta)\bigr)^2\right]
= \mathrm{var}(\hat{\theta}) + b^2(\theta),

since the cross term E[(\hat{\theta} - E(\hat{\theta}))(E(\hat{\theta}) - \theta)] = (E(\hat{\theta}) - \theta)\,E[\hat{\theta} - E(\hat{\theta})] = 0.
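A quick numerical check of this decomposition (a sketch under the same assumed DC-level model as above), applied to the biased estimator of Example 2.2:

```python
import numpy as np

rng = np.random.default_rng(1)
A, sigma, N, trials = 3.0, 1.0, 50, 200_000

x = A + sigma * rng.standard_normal((trials, N))
A_check = x.sum(axis=1) / (2 * N)      # biased estimator of A

mse  = np.mean((A_check - A) ** 2)     # E[(A_check - A)^2]
var  = A_check.var()                   # var(A_check)
bias = A_check.mean() - A              # b(A) = E(A_check) - A

print(mse, var + bias ** 2)            # the two values agree
```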
Unrealizable estimators
Consider a modified DC estimator
\check{A} = a\,\frac{1}{N}\sum_{n=0}^{N-1} x[n],

whose mean square error is

\mathrm{mse}(\check{A}) = \frac{a^2\sigma^2}{N} + (a - 1)^2 A^2.

Differentiating the MSE with respect to a and setting the result to zero,

\frac{d}{da}\,\mathrm{mse}(\check{A}) = \frac{2a\sigma^2}{N} + 2(a - 1)A^2 = 0
\quad\Longrightarrow\quad
a_{\mathrm{opt}} = \frac{A^2}{A^2 + \sigma^2/N}.
The estimator is not realizable because the optimal value of a depends on
the unknown parameter A.
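A tiny sketch (assumed values of σ and N) that evaluates a_opt for several true amplitudes, making the dependence on the unknown A explicit:

```python
sigma, N = 1.0, 50

def a_opt(A):
    """Optimal gain a_opt = A^2 / (A^2 + sigma^2 / N)."""
    return A**2 / (A**2 + sigma**2 / N)

# The optimal gain changes with the true (unknown) amplitude A,
# so no single realizable choice of a is optimal for every A.
for A in (0.1, 1.0, 10.0):
    print(f"A = {A:5.1f}  ->  a_opt = {a_opt(A):.4f}")
```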
Existence of the MVU Estimator
The MVU estimator may not exist: to qualify, an estimator's variance must be the minimum
among all unbiased estimators over the entire range of the parameter, and no single
unbiased estimator may achieve this for every value of θ.
Estimator Accuracy
Alternative interpretation
p_i(x[0]; A) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[-\frac{1}{2\sigma_i^2}(x[0] - A)^2\right]
Definition
When the PDF is viewed as a function of the unknown parameter, with x
fixed, it is termed the likelihood function. The log-likelihood function is the
natural logarithm of the likelihood function.
Theorem
Cramer-Rao Lower Bound: the variance of any unbiased estimator θ̂ must satisfy
\mathrm{var}(\hat{\theta}) \ge \frac{1}{-E\left[\dfrac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial\theta^2}\right]}
An unbiased estimator may be found that attains the CRLB for all θ if and only if

\frac{\partial \ln p(\mathbf{x};\theta)}{\partial\theta} = I(\theta)\,\bigl(g(\mathbf{x}) - \theta\bigr)

for some functions g and I. The estimator, which is the MVU estimator, is θ̂ = g(x), and
its minimum variance is 1/I(θ).
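As an illustration of this condition (a worked sketch assuming N independent observations x[n] = A + w[n] with w[n] white Gaussian noise of variance σ², the DC-level model used elsewhere in these slides), the score factors exactly as required:

\frac{\partial \ln p(\mathbf{x}; A)}{\partial A}
= \frac{1}{\sigma^2}\sum_{n=0}^{N-1}\bigl(x[n] - A\bigr)
= \frac{N}{\sigma^2}\left(\frac{1}{N}\sum_{n=0}^{N-1} x[n] - A\right),

so g(x) is the sample mean, I(A) = N/σ², and the sample mean is the MVU estimator with minimum variance σ²/N.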
Then, for a single observation with the Gaussian PDF above and variance σ²,

\frac{\partial \ln p(x[0]; A)}{\partial A} = \frac{1}{\sigma^2}\,(x[0] - A)

\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2} = -\frac{1}{\sigma^2}
\qquad\Longrightarrow\qquad
-E\left[\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2}\right] = \frac{1}{\sigma^2}.

Finally,

\mathrm{var}(\hat{A}) \ge \sigma^2.

Because the estimator was unbiased and had var(Â) = σ², it attained
the CRLB and was the MVU estimator.
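A minimal simulation sketch (assumed values for A, σ and N) checking that the sample mean attains the bound σ²/N:

```python
import numpy as np

rng = np.random.default_rng(2)
A, sigma, N, trials = 1.0, 2.0, 25, 200_000

# x[n] = A + w[n] with white Gaussian noise of variance sigma^2
x = A + sigma * rng.standard_normal((trials, N))
A_hat = x.mean(axis=1)                       # sample-mean estimator

print("empirical variance:", A_hat.var())
print("CRLB sigma^2 / N  :", sigma**2 / N)   # the two values agree
```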
Efficient estimators
Definition
An unbiased estimator that attains the CRLB is said to be efficient.
An MVU estimator may or may not be efficient: being MVU requires only that its variance
be smaller than that of every other unbiased estimator, not that it attain the CRLB.
Introduction
Consider the cases where
◮ we do not know the PDF of the data, so the CRLB or sufficient-statistic
approaches cannot be applied; or
◮ we know the PDF, but the MVU estimator cannot be found.
We may resort to suboptimal estimators.
Alternative approach
Restrict the estimator to be linear in the data and find the estimator that
is unbiased and has minimum variance.
Advantages
◮ We need knowledge of only the first and second moments of the PDF.
◮ We do not need knowledge of the complete PDF.
◮ If the performance of the estimator meets our system requirements,
its use may be adequate for the problem.
Definition of the BLUE
When E(x[n]) is not linear in the unknown parameter, the BLUE cannot be applied to x[n]
directly. However, we can use the BLUE on the transformed data y[n] = x²[n].
Unbiased Constraint
In order to satisfy the unbiasedness constraint, E(x[n]) must be linear in θ, i.e.,

E(x[n]) = s[n]\theta,

where the s[n]'s are known. This is because a linear combination of the
E(x[n])'s must yield θ. Additionally, writing x[n] as

x[n] = E(x[n]) + \bigl[x[n] - E(x[n])\bigr] = s[n]\theta + w[n],

where w[n] = x[n] − E(x[n]) is zero-mean noise, shows that the data follow a linear model in θ.
Application of BLUE
The BLUE is thus suited to amplitude estimation of known signals in noise. To generalize
beyond this case, we will need nonlinear transformations of the data.
Finding the BLUE
Scalar case
The unbiasedness constraint results in
\sum_{n=0}^{N-1} a_n E(x[n]) = \sum_{n=0}^{N-1} a_n s[n]\,\theta = \theta
\quad\Longrightarrow\quad
\sum_{n=0}^{N-1} a_n s[n] = 1, \qquad \text{i.e.}\qquad \mathbf{a}^T\mathbf{s} = 1.
Now, we use a Lagrange multiplier to minimize the variance subject to the
constraint:

J(\mathbf{a}) = \mathbf{a}^T\mathbf{C}\mathbf{a} + \lambda(\mathbf{a}^T\mathbf{s} - 1)

\frac{\partial J}{\partial \mathbf{a}} = 2\mathbf{C}\mathbf{a} + \lambda\mathbf{s} = 0
\quad\Longrightarrow\quad
\mathbf{a} = -\frac{\lambda}{2}\,\mathbf{C}^{-1}\mathbf{s}.

Substituting into the equality constraint,

-\frac{\lambda}{2}\,\mathbf{s}^T\mathbf{C}^{-1}\mathbf{s} = 1
\quad\Longrightarrow\quad
-\frac{\lambda}{2} = \frac{1}{\mathbf{s}^T\mathbf{C}^{-1}\mathbf{s}}
\quad\Longrightarrow\quad
\hat{\theta} = \frac{\mathbf{s}^T\mathbf{C}^{-1}\mathbf{x}}{\mathbf{s}^T\mathbf{C}^{-1}\mathbf{s}}.
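A minimal sketch (illustrative values; s, C, and the noise model are assumptions, with C taken as known) that computes the scalar BLUE and compares it with the plain sample mean for a DC level observed in uncorrelated noise of unequal variances:

```python
import numpy as np

rng = np.random.default_rng(3)
N, theta = 8, 2.0
s = np.ones(N)                         # E(x[n]) = s[n]*theta with s[n] = 1 (DC level)
var = np.linspace(0.5, 4.0, N)         # unequal, uncorrelated noise variances
C = np.diag(var)                       # known noise covariance

Cinv = np.linalg.inv(C)
a = Cinv @ s / (s @ Cinv @ s)          # BLUE weights a = C^{-1} s / (s^T C^{-1} s)

x = theta * s + rng.standard_normal(N) * np.sqrt(var)
theta_blue = a @ x                     # BLUE estimate: s^T C^{-1} x / (s^T C^{-1} s)
theta_mean = x.mean()                  # ordinary sample mean, for comparison

print(theta_blue, theta_mean)
print("BLUE variance:", 1.0 / (s @ Cinv @ s))   # a^T C a, the minimum among linear unbiased estimators
```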
Extension to a Vector Parameter
Derivation of BLUE I
For a vector parameter θ, the estimator is θ̂ = Ax, where the i-th row of A is a_i^T.
The unbiasedness condition becomes AH = I (equivalently, a_i^T h_j = δ_ij, with h_j the
columns of H). Minimizing each variance a_i^T C a_i subject to these constraints with
Lagrange multipliers λ_ij gives the gradient

\frac{\partial J_i}{\partial \mathbf{a}_i}
= 2\mathbf{C}\mathbf{a}_i + \sum_{j=1}^{p}\lambda_{ij}\mathbf{h}_j
= 2\mathbf{C}\mathbf{a}_i + \mathbf{H}\boldsymbol{\lambda}_i
Derivation of BLUE II
Setting the gradient to zero yields
\mathbf{a}_i = -\frac{1}{2}\mathbf{C}^{-1}\mathbf{H}\boldsymbol{\lambda}_i
\quad\Longrightarrow\quad
\mathbf{A}^T = -\frac{1}{2}\mathbf{C}^{-1}\mathbf{H}\boldsymbol{\Lambda}
\quad\Longrightarrow\quad
\mathbf{A} = -\frac{1}{2}\boldsymbol{\Lambda}^T\mathbf{H}^T\mathbf{C}^{-1}.

Substituting in the unbiasedness condition AH = I,

-\frac{1}{2}\boldsymbol{\Lambda}^T\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H} = \mathbf{I}
\quad\Longrightarrow\quad
-\frac{1}{2}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\boldsymbol{\Lambda} = \mathbf{I}
\quad\Longrightarrow\quad
\boldsymbol{\Lambda} = -2\left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}.

The optimal matrix A is

\mathbf{A} = \left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{C}^{-1}.
The covariance of θ̂ is
\mathbf{C}_{\hat{\theta}} = E\left[\bigl(\hat{\theta} - E(\hat{\theta})\bigr)\bigl(\hat{\theta} - E(\hat{\theta})\bigr)^T\right].

Writing the data as

\mathbf{x} = E(\mathbf{x}) + [\mathbf{x} - E(\mathbf{x})] = \mathbf{H}\theta + \mathbf{w},

then

\hat{\theta} - E(\hat{\theta})
= \left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{C}^{-1}(\mathbf{H}\theta + \mathbf{w}) - \theta
= \left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{w}.

Derivation of BLUE IV
and

\mathbf{C}_{\hat{\theta}}
= E\left[\left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{w}\mathbf{w}^T\mathbf{C}^{-1}\mathbf{H}\left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}\right]
= \left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{C}\mathbf{C}^{-1}\mathbf{H}\left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}
= \left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}.
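A sketch (hypothetical straight-line model and noise covariance, both assumptions) forming the vector BLUE and checking the covariance expression by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(4)
N, trials = 20, 50_000
n = np.arange(N)

H = np.column_stack([np.ones(N), n])        # assumed linear model: x = H @ [A, B] + w
theta = np.array([1.0, 0.2])

C = np.diag(0.5 + 0.1 * n)                  # assumed known noise covariance (uncorrelated)
Cinv = np.linalg.inv(C)

A_blue = np.linalg.inv(H.T @ Cinv @ H) @ H.T @ Cinv   # A = (H^T C^-1 H)^-1 H^T C^-1

w = rng.multivariate_normal(np.zeros(N), C, size=trials)
x = theta @ H.T + w                                    # trials x N data matrix
est = x @ A_blue.T                                     # BLUE estimates, one row per trial

print("empirical covariance:\n", np.cov(est.T))
print("(H^T C^-1 H)^-1    :\n", np.linalg.inv(H.T @ Cinv @ H))
```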
Gauss-Markov Theorem
Theorem
If the data are of the general linear model form

\mathbf{x} = \mathbf{H}\boldsymbol{\theta} + \mathbf{w},

where H is a known N × p observation matrix and w is a noise vector with zero mean and
covariance C (its PDF otherwise arbitrary), then the BLUE of θ is

\hat{\boldsymbol{\theta}} = \left(\mathbf{H}^T\mathbf{C}^{-1}\mathbf{H}\right)^{-1}\mathbf{H}^T\mathbf{C}^{-1}\mathbf{x},

and its covariance matrix is C_θ̂ = (H^T C^{-1} H)^{-1}, as derived above.
Homework Assignments I
Differentiating ∂ ln p(x;θ)/∂θ = (∂p(x;θ)/∂θ)/p(x;θ) once more and taking the expectation gives

E\left[\frac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial\theta^2}\right]
= E\left[\frac{\dfrac{\partial^2 p(\mathbf{x};\theta)}{\partial\theta^2}}{p(\mathbf{x};\theta)}\right]
- E\left[\left(\frac{\partial \ln p(\mathbf{x};\theta)}{\partial\theta}\right)^2\right].

But

E\left[\frac{\dfrac{\partial^2 p(\mathbf{x};\theta)}{\partial\theta^2}}{p(\mathbf{x};\theta)}\right]
= \int \frac{\partial^2 p(\mathbf{x};\theta)}{\partial\theta^2}\,d\mathbf{x}
= \frac{\partial^2}{\partial\theta^2}\int p(\mathbf{x};\theta)\,d\mathbf{x}
= \frac{\partial^2}{\partial\theta^2}(1) = 0.

Therefore

E\left[\frac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial\theta^2}\right]
= -E\left[\left(\frac{\partial \ln p(\mathbf{x};\theta)}{\partial\theta}\right)^2\right].
This expression may be useful for theoretical work.
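As a quick check with the single-sample Gaussian PDF used earlier (mean A, variance σ²), both forms give the same value:

E\left[\left(\frac{\partial \ln p(x[0];A)}{\partial A}\right)^2\right]
= E\left[\frac{(x[0]-A)^2}{\sigma^4}\right]
= \frac{1}{\sigma^2}
= -E\left[\frac{\partial^2 \ln p(x[0];A)}{\partial A^2}\right].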
Fisher Information I
Definition
The Fisher Information of the data x is defined as
I(\theta) = -E\left[\frac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial\theta^2}\right].

For N independent and identically distributed observations, ln p(x;θ) = Σ_n ln p(x[n];θ), and this results in

-E\left[\frac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial\theta^2}\right]
= -\sum_{n=0}^{N-1} E\left[\frac{\partial^2 \ln p(x[n];\theta)}{\partial\theta^2}\right]
\quad\Longrightarrow\quad
I(\theta) = N\,i(\theta),

where

i(\theta) = -E\left[\frac{\partial^2 \ln p(x[n];\theta)}{\partial\theta^2}\right]
is the Fisher information for one sample.
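For the DC level in white Gaussian noise considered above, each sample contributes i(A) = 1/σ², so that

I(A) = \frac{N}{\sigma^2}
\qquad\Longrightarrow\qquad
\mathrm{var}(\hat{A}) \ge \frac{\sigma^2}{N},

which is exactly the variance of the sample mean.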
Regularity conditions for the CRLB
The CRLB assumes the PDF satisfies the regularity condition E[∂ ln p(x;θ)/∂θ] = 0 for all θ.
Defining

I(\theta) = E\left[\left(\frac{\partial \ln p(\mathbf{x};\theta)}{\partial\theta}\right)^2\right]
= -E\left[\frac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial\theta^2}\right].
General CRLB for Signals in White Noise
For data x[n] = s[n; θ] + w[n], n = 0, …, N − 1, where s[n; θ] is a known signal depending on
the unknown parameter θ and w[n] is white Gaussian noise of variance σ², differentiating the
log-likelihood produces

\frac{\partial \ln p(\mathbf{x};\theta)}{\partial\theta}
= \frac{1}{\sigma^2}\sum_{n=0}^{N-1}\bigl(x[n] - s[n;\theta]\bigr)\frac{\partial s[n;\theta]}{\partial\theta}

\frac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial\theta^2}
= \frac{1}{\sigma^2}\sum_{n=0}^{N-1}\left[\bigl(x[n] - s[n;\theta]\bigr)\frac{\partial^2 s[n;\theta]}{\partial\theta^2}
- \left(\frac{\partial s[n;\theta]}{\partial\theta}\right)^2\right]
Since E[x[n] − s[n;θ]] = 0, taking the negative expectation of the second derivative leaves
only (1/σ²) Σ (∂s[n;θ]/∂θ)², so that finally

\mathrm{var}(\hat{\theta}) \ge \frac{\sigma^2}{\displaystyle\sum_{n=0}^{N-1}\left(\frac{\partial s[n;\theta]}{\partial\theta}\right)^2}.
That is, signals that change rapidly with the unknown parameter result in
accurate estimators.
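A small sketch (hypothetical sinusoidal signal model with assumed σ, N and true frequency f0) that evaluates this bound numerically; the rapidly varying sensitivity ∂s/∂f0 makes the bound small:

```python
import numpy as np

sigma, N, f0 = 1.0, 100, 0.12           # assumed noise std, record length, true frequency
n = np.arange(N)

# Signal model s[n; f0] = cos(2*pi*f0*n); sensitivity ds/df0 = -2*pi*n*sin(2*pi*f0*n)
ds_df0 = -2 * np.pi * n * np.sin(2 * np.pi * f0 * n)

# CRLB: var(f0_hat) >= sigma^2 / sum_n (ds/df0)^2
crlb = sigma**2 / np.sum(ds_df0 ** 2)
print("CRLB for f0:", crlb)
```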