OPTIMAL LINEAR FILTER: WIENER FILTER
5.1 Estimation of signal in presence of white Gaussian noise (WGN)
Consider the signal model
$$X[n] = S[n] + V[n]$$
where X[n] is the observed signal, S[n] is zero-mean Gaussian with variance 1, and V[n] is a white Gaussian sequence with mean 0 and variance 1. The problem is to find the best guess for S[n] given the observations X[i], i = 1, 2, ..., n.
Maximum likelihood estimation for S[n] determines that value of S[n] for which the sequence X[i], i = 1, 2, ..., n is most likely. Let us represent the random samples X[i], i = 1, 2, ..., n by $\mathbf{X}[n] = [X[n]\; X[n-1]\; \cdots\; X[1]]'$ and the particular values x[1], x[2], ..., x[n] by $\mathbf{x}[n] = [x[n]\; x[n-1]\; \cdots\; x[1]]'$.
The likelihood function $f_{\mathbf{X}[n]/S[n]}(\mathbf{x}[n]/s[n])$ will be Gaussian with mean s[n]:
$$f_{\mathbf{X}[n]/S[n]}(\mathbf{x}[n]/s[n]) = \frac{1}{(\sqrt{2\pi})^{n}}\, e^{-\frac{1}{2}\sum_{i=1}^{n}(x[i]-s[n])^{2}}$$
Maximum likelihood will be given by
$$\left.\frac{\partial}{\partial s[n]} f_{X[1],\ldots,X[n]/S[n]}\big(x[1],\ldots,x[n]/s[n]\big)\right|_{s[n]=\hat{s}_{MLE}[n]} = 0$$
$$\Rightarrow\; \hat{S}_{MLE}[n] = \frac{1}{n}\sum_{i=1}^{n} x[i]$$
Similarly, to find Ŝ_MAP[n] and Ŝ_MMSE[n] we have to find the a posteriori probability density function (pdf).
(Figure: the observation model — the signal S and the noise V are added to give the observation X.)
$$f_{S[n]/\mathbf{X}[n]}(s[n]/\mathbf{x}[n]) = \frac{f_{S[n]}(s[n])\, f_{\mathbf{X}[n]/S[n]}(\mathbf{x}[n]/s[n])}{f_{\mathbf{X}[n]}(\mathbf{x}[n])}
= \frac{1}{f_{\mathbf{X}[n]}(\mathbf{x}[n])}\,\frac{1}{(\sqrt{2\pi})^{n+1}}\, e^{-\frac{s^{2}[n]}{2}\, -\, \frac{1}{2}\sum_{i=1}^{n}(x[i]-s[n])^{2}}$$
Taking the logarithm on both sides, we get
$$\log_e f_{S[n]/\mathbf{X}[n]}(s[n]) = -\frac{s^{2}[n]}{2} - \frac{1}{2}\sum_{i=1}^{n}\big(x[i]-s[n]\big)^{2} - \log_e f_{\mathbf{X}[n]}(\mathbf{x}[n]) + \text{constant}$$
$\log_e f_{S[n]/\mathbf{X}[n]}(s[n])$ is maximum at $\hat{s}_{MAP}[n]$. Therefore, taking the partial derivative of $\log_e f_{S[n]/\mathbf{X}[n]}(s[n])$ with respect to s[n] and equating it to 0, we get
$$\left[-s[n] + \sum_{i=1}^{n}\big(x[i]-s[n]\big)\right]_{s[n]=\hat{s}_{MAP}[n]} = 0$$
$$\Rightarrow\; \hat{s}_{MAP}[n] = \frac{\sum_{i=1}^{n} x[i]}{n+1}$$
Similarly, the minimum mean-square error estimator is given by
$$\hat{s}_{MMSE}[n] = E\big(S[n]/\mathbf{X}[n]\big) = \frac{\sum_{i=1}^{n} x[i]}{n+1}$$
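As a quick numerical illustration of these two estimators (a sketch, not part of the original notes; the sample size, signal value and noise realisation below are made-up), the following Python fragment compares the sample-mean MLE with the MAP/MMSE estimate Σx[i]/(n+1):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 50
    s = rng.standard_normal()              # S ~ N(0, 1), the value to be estimated
    x = s + rng.standard_normal(n)         # x[i] = s + v[i], v[i] ~ N(0, 1)

    s_mle = x.sum() / n                    # maximum-likelihood estimate (sample mean)
    s_map = x.sum() / (n + 1)              # MAP = MMSE estimate (shrunk towards the prior mean 0)
    print(s, s_mle, s_map)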
• For MMSE we have to know the joint probability structure of the channel and the source, and hence the a posteriori pdf.
• Finding the pdf is computationally very expensive and involves nonlinear operations on the samples.
• Normally we may have only estimates of the first-order and second-order statistics of the data.
We therefore look for a simpler estimator. The answer is optimal filtering, or Wiener filtering.
We have seen that we can estimate an unknown (desired) signal S[n] from an observed signal X[n] on the basis of the known joint distribution of S[n] and X[n]. We could have used criteria like MMSE or MAP that we applied for parameter estimation. But such estimators are generally nonlinear, require the computation of a posteriori probabilities, and involve considerable computational complexity.
The approach taken by Wiener is to specify a form for the estimator that depends on a number of parameters. The minimization of the error then results in the determination of an optimal set of estimator parameters. A mathematically simple and computationally easier estimator is obtained by assuming a linear structure for the estimator.
5.2 Linear Minimum Mean Square Error Estimator
The linear minimum mean-square error (LMMSE) criterion is illustrated in the figure below. The estimation problem can be stated as follows:
Given observations represented as random variables X[n-M+1], X[n-M+2], ..., X[n], ..., X[n+N], determine a set of parameters h[-N], h[-N+1], ..., h[0], h[1], ..., h[M-1] such that
$$\hat{S}[n] = \sum_{i=-N}^{M-1} h[i]\, X[n-i]$$
and the mean-square error $E\big(S[n]-\hat{S}[n]\big)^{2}$ is a minimum with respect to each of h[-N], h[-N+1], ..., h[0], h[1], ..., h[M-1].
This minimization problem results in an elegant solution if we assume joint wide-sense stationarity of the signals S[n] and X[n]. The estimator parameters can then be obtained from the second-order statistics of the processes {S[n]} and {X[n]}.
The problem of determining the estimator parameters by the LMMSE criterion is also called the Wiener filtering problem. Three subclasses of the problem are identified:
1. The optimal smoothing problem (N > 0)
2. The optimal filtering problem (N = 0)
3. The optimal prediction problem (N < 0)
(Figure: the signal s[n] plus noise gives the observation x[n]; a filter operating on the window x[n-M+1], ..., x[n], ..., x[n+N] produces the estimate ŝ[n].)
In the smoothing problem, an estimate of the signal is made at a location inside the observation window. The filtering problem estimates the current value of the signal on the basis of the present and past observations. The prediction problem addresses the issue of optimally predicting a future value of the signal on the basis of the present and past observations.
5.3 Wiener-Hopf Equations
The mean-square error of estimation is given by
$$Ee^{2}[n] = E\big(S[n]-\hat{S}[n]\big)^{2} = E\Big(S[n]-\sum_{i=-N}^{M-1} h[i]X[n-i]\Big)^{2}$$
We have to minimize Ee²[n] with respect to each h[i] to get the optimal estimator. The corresponding minimization is given by
$$\frac{\partial E\big(e^{2}[n]\big)}{\partial h[j]} = 0, \qquad j = -N, \ldots, 0, \ldots, M-1$$
(E being a linear operator, E and ∂/∂h[j] can be interchanged)
$$E\, e[n]\, X[n-j] = 0, \qquad j = -N, \ldots, 0, 1, \ldots, M-1 \qquad (1)$$
or
$$E\Big(\underbrace{S[n]-\sum_{i=-N}^{M-1} h[i]X[n-i]}_{e[n]}\Big)X[n-j] = 0, \qquad j = -N, \ldots, 0, 1, \ldots, M-1 \qquad (2)$$
$$R_{SX}(j) = \sum_{i=-N}^{M-1} h[i]\, R_{X}(j-i), \qquad j = -N, \ldots, 0, 1, \ldots, M-1 \qquad (3)$$
This set of N+M equations in (3) is called the Wiener-Hopf equations or normal equations.
• The result in (1) is the orthogonality principle, which implies that the estimation error is orthogonal to the observed data.
• Ŝ[n] is the projection of S[n] onto the subspace spanned by the observations X[n-M+1], ..., X[n], ..., X[n+N].
• The estimation uses only second-order statistics, i.e. the autocorrelation and cross-correlation functions.
• If {S[n]} and {X[n]} are jointly Gaussian, then the MMSE and the LMMSE estimators are equivalent. Otherwise we get a sub-optimum result.
• Also observe that S[n] = Ŝ[n] + e[n], where Ŝ[n] and e[n] are the parts of S[n] respectively correlated and uncorrelated with X[n]. Thus the LMMSE estimator separates out that part of S[n] which is correlated with X[n]; hence the Wiener filter can also be interpreted as a correlation canceller.
5.4 FIR Wiener Filter
$$\hat{S}[n] = \sum_{i=0}^{M-1} h[i]\, X[n-i]$$
The model parameters are given by the orthogonality principle:
$$E\Big(\underbrace{S[n]-\sum_{i=0}^{M-1} h[i]X[n-i]}_{e[n]}\Big)X[n-j] = 0, \qquad j = 0, 1, \ldots, M-1$$
$$\Rightarrow\; \sum_{i=0}^{M-1} h[i]\, R_{X}(j-i) = R_{SX}(j), \qquad j = 0, 1, \ldots, M-1$$
In matrix form, we have
$$\mathbf{R}_X\, \mathbf{h} = \mathbf{r}_{SX}$$
where
$$\mathbf{R}_X = \begin{bmatrix} R_X[0] & R_X[-1] & \cdots & R_X[1-M] \\ R_X[1] & R_X[0] & \cdots & R_X[2-M] \\ \vdots & \vdots & \ddots & \vdots \\ R_X[M-1] & R_X[M-2] & \cdots & R_X[0] \end{bmatrix},\qquad
\mathbf{r}_{SX} = \begin{bmatrix} R_{SX}[0] \\ R_{SX}[1] \\ \vdots \\ R_{SX}[M-1] \end{bmatrix},\qquad
\mathbf{h} = \begin{bmatrix} h[0] \\ h[1] \\ \vdots \\ h[M-1] \end{bmatrix}$$
Therefore,
$$\mathbf{h} = \mathbf{R}_X^{-1}\, \mathbf{r}_{SX}$$
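As a sketch of how these normal equations are solved numerically (this code is not from the notes; the autocorrelation values are illustrative), one can build the Toeplitz matrix R_X and solve for h:

    import numpy as np

    # Illustrative second-order statistics for a 3-tap filter (made-up values).
    r_x = np.array([1.5, 0.7, 0.2])       # R_X[0], R_X[1], R_X[2]
    r_sx = np.array([1.0, 0.6, 0.1])      # R_SX[0], R_SX[1], R_SX[2]
    M = len(r_x)

    # Symmetric Toeplitz autocorrelation matrix with entries R_X[j - i]
    R_X = np.array([[r_x[abs(j - i)] for i in range(M)] for j in range(M)])
    h = np.linalg.solve(R_X, r_sx)        # normal equations R_X h = r_SX
    print(h)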
5.5 Minimum Mean Square Error - FIR Wiener Filter
$$E e^{2}[n] = E\, e[n]\Big(S[n]-\sum_{i=0}^{M-1} h[i]X[n-i]\Big)$$
$$= E\, e[n]\, S[n] \qquad (\text{the error is orthogonal to the data})$$
$$= E\Big(S[n]-\sum_{i=0}^{M-1} h[i]X[n-i]\Big)S[n]$$
$$= R_S[0] - \sum_{i=0}^{M-1} h[i]\, R_{SX}[i]$$
(Figure: the FIR Wiener filter as a tapped delay line — x[n] passes through unit delays z⁻¹, the taps are weighted by h[0], h[1], ..., h[M-1] and summed to give ŝ[n]; the weights are obtained from R_X h = r_SX.)
Example 1: Noise Filtering
Consider the case of a carrier signal in the presence of white Gaussian noise:
$$S[n] = A\cos(w_0 n + \phi), \qquad w_0 = \frac{\pi}{4}$$
$$X[n] = S[n] + V[n]$$
where φ is uniformly distributed in (0, 2π), and V[n] is a white Gaussian noise sequence of variance 1, independent of S[n]. Find the parameters of the FIR Wiener filter with M = 3.
$$R_S[m] = \frac{A^{2}}{2}\cos(w_0 m)$$
$$R_X[m] = E\,X[n]X[n-m] = E\big(S[n]+V[n]\big)\big(S[n-m]+V[n-m]\big) = R_S[m] + R_V[m] \quad (\text{cross terms are } 0)$$
$$= \frac{A^{2}}{2}\cos(w_0 m) + \delta[m]$$
$$R_{SX}[m] = E\,S[n]X[n-m] = E\,S[n]\big(S[n-m]+V[n-m]\big) = R_S[m]$$
Hence the Wiener-Hopf equations are
$$\begin{bmatrix} R_X[0] & R_X[1] & R_X[2] \\ R_X[1] & R_X[0] & R_X[1] \\ R_X[2] & R_X[1] & R_X[0] \end{bmatrix}
\begin{bmatrix} h[0] \\ h[1] \\ h[2] \end{bmatrix} =
\begin{bmatrix} R_{SX}[0] \\ R_{SX}[1] \\ R_{SX}[2] \end{bmatrix}$$
$$\begin{bmatrix} \frac{A^{2}}{2}+1 & \frac{A^{2}}{2}\cos\frac{\pi}{4} & \frac{A^{2}}{2}\cos\frac{\pi}{2} \\ \frac{A^{2}}{2}\cos\frac{\pi}{4} & \frac{A^{2}}{2}+1 & \frac{A^{2}}{2}\cos\frac{\pi}{4} \\ \frac{A^{2}}{2}\cos\frac{\pi}{2} & \frac{A^{2}}{2}\cos\frac{\pi}{4} & \frac{A^{2}}{2}+1 \end{bmatrix}
\begin{bmatrix} h[0] \\ h[1] \\ h[2] \end{bmatrix} =
\begin{bmatrix} \frac{A^{2}}{2} \\ \frac{A^{2}}{2}\cos\frac{\pi}{4} \\ \frac{A^{2}}{2}\cos\frac{\pi}{2} \end{bmatrix}$$
Suppose A = 5 V; then A²/2 = 12.5 and
$$\begin{bmatrix} 13.5 & \frac{12.5}{\sqrt{2}} & 0 \\ \frac{12.5}{\sqrt{2}} & 13.5 & \frac{12.5}{\sqrt{2}} \\ 0 & \frac{12.5}{\sqrt{2}} & 13.5 \end{bmatrix}
\begin{bmatrix} h[0] \\ h[1] \\ h[2] \end{bmatrix} =
\begin{bmatrix} 12.5 \\ \frac{12.5}{\sqrt{2}} \\ 0 \end{bmatrix}$$
$$\Rightarrow\;
\begin{bmatrix} h[0] \\ h[1] \\ h[2] \end{bmatrix} =
\begin{bmatrix} 13.5 & \frac{12.5}{\sqrt{2}} & 0 \\ \frac{12.5}{\sqrt{2}} & 13.5 & \frac{12.5}{\sqrt{2}} \\ 0 & \frac{12.5}{\sqrt{2}} & 13.5 \end{bmatrix}^{-1}
\begin{bmatrix} 12.5 \\ \frac{12.5}{\sqrt{2}} \\ 0 \end{bmatrix}$$
$$h[0] \approx 0.703, \qquad h[1] \approx 0.340, \qquad h[2] \approx -0.223$$
Plot the filter performance for the above values of h[0], h[1] and h[2]. The following figure shows the performance of the 20-tap FIR Wiener filter for noise filtering.
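A quick numerical check of this example (a sketch, not part of the original notes) solves the same 3×3 system with NumPy:

    import numpy as np

    A = 5.0
    w0 = np.pi / 4
    P = A**2 / 2                                                  # signal power A^2/2 = 12.5

    r_x = np.array([P + 1, P * np.cos(w0), P * np.cos(2 * w0)])   # R_X[0..2]
    r_sx = np.array([P, P * np.cos(w0), P * np.cos(2 * w0)])      # R_SX[0..2]

    R_X = np.array([[r_x[abs(j - i)] for i in range(3)] for j in range(3)])
    h = np.linalg.solve(R_X, r_sx)
    print(h)    # approximately [0.703, 0.340, -0.223]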
Example 2: Active Noise Control
Suppose the observation signal X[n] is given by
$$X[n] = 0.5\cos(w_0 n + \phi) + V_1[n]$$
where φ is uniformly distributed in (0, 2π) and V_1[n] = 0.6 V[n-1] + V[n] is an MA(1) noise. We want to cancel V_1[n] with the help of another correlated noise V_2[n] given by
$$V_2[n] = 0.8\,V[n-1] + V[n]$$
The filtering scheme is as follows: the primary input s[n] + v_1[n] and the reference noise v_2[n] are applied to a 2-tap FIR filter whose output is subtracted from the primary input to give ŝ[n].
The Wiener-Hopf equations are given by
$$\mathbf{R}_{V_2}\,\mathbf{h} = \mathbf{r}_{V_1 V_2}$$
where h = [h[0]  h[1]]ᵀ and
$$\mathbf{R}_{V_2} = \begin{bmatrix} 1.64 & 0.8 \\ 0.8 & 1.64 \end{bmatrix}, \qquad \mathbf{r}_{V_1 V_2} = \begin{bmatrix} 1.48 \\ 0.6 \end{bmatrix}$$
$$\Rightarrow\; \begin{bmatrix} h[0] \\ h[1] \end{bmatrix} = \begin{bmatrix} 0.9500 \\ -0.0976 \end{bmatrix}$$
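The same numbers can be reproduced directly (a sketch, not part of the original notes); the correlation values follow from the two MA(1) models driven by the unit-variance white noise V[n]:

    import numpy as np

    # R_V2[0] = 0.8^2 + 1 = 1.64, R_V2[1] = 0.8
    R_v2 = np.array([[1.64, 0.8],
                     [0.8, 1.64]])
    # Cross-correlations of V1[n] = 0.6 V[n-1] + V[n] with V2[n] = 0.8 V[n-1] + V[n]
    r_v1v2 = np.array([0.6 * 0.8 + 1.0, 0.6])   # [1.48, 0.6]

    h = np.linalg.solve(R_v2, r_v1v2)
    print(h)    # approximately [0.95, -0.0976]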
Example 3:
(Continuous-time prediction) Suppose we want to predict the continuous-time process S(t) at time (t + λ) by
$$\hat{S}(t+\lambda) = a\, S(t)$$
Then, by the orthogonality principle,
$$E\big(S(t+\lambda) - aS(t)\big)S(t) = 0 \;\Rightarrow\; a = \frac{R_{SS}(\lambda)}{R_{SS}(0)}$$
As a particular case, consider the first-order Markov process given by
$$\frac{d}{dt}S(t) = -A\,S(t) + v(t)$$
In this case,
$$R_S(\lambda) = R_S(0)\,e^{-A\lambda}, \quad \lambda \ge 0 \;\Rightarrow\; a = \frac{R_S(\lambda)}{R_S(0)} = e^{-A\lambda}$$
Observe that for such a process
$$E\big(S(t+\lambda) - aS(t)\big)S(t-\lambda_1) = R_S(\lambda+\lambda_1) - a\,R_S(\lambda_1) = R_S(0)e^{-A(\lambda+\lambda_1)} - e^{-A\lambda}\,e^{-A\lambda_1}R_S(0) = 0$$
Therefore, the linear prediction of such a process based on any past value is the same as the linear prediction based on the current value.
5.6 IIR Wiener Filter (Causal)
Consider the IIR filter to estimate the signal S[n], shown in the figure below. The estimator Ŝ[n] is given by
$$\hat{S}[n] = \sum_{i=0}^{\infty} h[i]\, X[n-i]$$
The mean-square error of estimation is given by
$$Ee^{2}[n] = E\big(S[n]-\hat{S}[n]\big)^{2} = E\Big(S[n]-\sum_{i=0}^{\infty} h[i]X[n-i]\Big)^{2}$$
We have to minimize Ee²[n] with respect to each h[i] to get the optimal estimator. Applying the orthogonality principle, we get the Wiener-Hopf equation.
(Figure: X[n] is filtered by h[n] to produce Ŝ[n].)
$$E\Big(S[n]-\sum_{i=0}^{\infty} h[i]X[n-i]\Big)X[n-j] = 0, \qquad j = 0, 1, \ldots$$
From this we get
$$\sum_{i=0}^{\infty} h[i]\, R_X(j-i) = R_{SX}(j), \qquad j = 0, 1, \ldots$$
We have to find h[i], i = 0, 1, ..., ∞ by solving the above infinite set of equations.
• This problem is better solved in the z-transform domain, though we cannot directly apply the convolution theorem of the z-transform. Here comes Wiener's contribution.
The analysis is based on the spectral factorization theorem for a regular random process:
$$S_X(z) = \sigma_v^{2}\, H_c(z)\, H_c(z^{-1})$$
The filter is realized as a cascade of a whitening filter followed by a Wiener filter operating on the innovation sequence.
Now h₂[n] is the impulse response of the Wiener filter that estimates S[n] from the innovation sequence V[n]:
$$\hat{S}[n] = \sum_{i=0}^{\infty} h_2[i]\, V[n-i]$$
Applying the orthogonality principle results in the Wiener-Hopf equation
$$E\Big(S[n]-\sum_{i=0}^{\infty} h_2[i]V[n-i]\Big)V[n-j] = 0$$
$$\Rightarrow\; \sum_{i=0}^{\infty} h_2[i]\, R_V(j-i) = R_{SV}(j), \qquad j = 0, 1, \ldots$$
Note that
$$R_V[m] = \sigma_v^{2}\,\delta[m]
\;\Rightarrow\; \sigma_v^{2}\sum_{i=0}^{\infty} h_2[i]\,\delta(j-i) = R_{SV}(j), \qquad j = 0, 1, \ldots$$
So that
$$H_1(z) = \frac{1}{H_c(z)}$$
(Figure: X[n] is passed through the whitening filter H₁(z) to produce the innovation V[n]; V[n] is then passed through H₂(z) to produce Ŝ[n].)
$$h_2[j] = \frac{R_{SV}[j]}{\sigma_v^{2}}, \quad j \ge 0
\qquad\Rightarrow\qquad
H_2(z) = \frac{\big[S_{SV}(z)\big]_{+}}{\sigma_v^{2}}$$
where [S_SV(z)]₊ is the causal part (i.e., the terms containing non-positive powers of z) of the power-series expansion of S_SV(z). To determine S_SV(z), consider
$$V[n] = \sum_{i=0}^{\infty} h_1[i]\, X[n-i]$$
$$R_{SV}[j] = E\,S[n]V[n-j] = \sum_{i=0}^{\infty} h_1[i]\, E\,S[n]X[n-j-i] = \sum_{i=0}^{\infty} h_1[i]\, R_{SX}[j+i]$$
$$\Rightarrow\; S_{SV}(z) = H_1(z^{-1})\, S_{SX}(z) = \frac{S_{SX}(z)}{H_c(z^{-1})}$$
Therefore,
$$H(z) = H_1(z)\,H_2(z) = \frac{1}{\sigma_v^{2}\,H_c(z)}\left[\frac{S_{SX}(z)}{H_c(z^{-1})}\right]_{+}$$
We have to
• find the power spectrum of the data and the cross power spectrum of the desired signal and the data from the available model, or estimate them from the data;
• factorize the power spectrum of the data using the spectral factorization theorem.
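For rational spectra, the factorization step can be carried out numerically by splitting the roots of the spectrum's numerator (and denominator) into those inside and outside the unit circle. The following sketch (not from the notes) recovers the minimum-phase factor of a spectrum whose numerator is 2.32 − 0.8z − 0.8z⁻¹, the same numerator that appears in Example 4 below:

    import numpy as np

    # Numerator N(z) = 2.32 - 0.8 z - 0.8 z^{-1}.
    # z * N(z) = -0.8 z^2 + 2.32 z - 0.8 has the same nonzero roots.
    coeffs = np.array([-0.8, 2.32, -0.8])
    roots = np.roots(coeffs)                     # roots at 0.4 and 2.5 (a reciprocal pair)
    beta = roots[np.abs(roots) < 1][0].real      # keep the root inside the unit circle
    sigma_v2 = 0.8 / beta                        # from c*beta = 0.8 in N(z) = c(1 - beta z)(1 - beta z^{-1})
    print(beta, sigma_v2)                        # 0.4, 2.0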
5.7 Mean Square Estimation Error – IIR Filter (Causal)
$$E e^{2}[n] = E\, e[n]\Big(S[n]-\sum_{i=0}^{\infty} h[i]X[n-i]\Big)$$
$$= E\, e[n]\, S[n] \qquad (\text{the error is orthogonal to the data})$$
$$= E\Big(S[n]-\sum_{i=0}^{\infty} h[i]X[n-i]\Big)S[n]$$
$$= R_S[0] - \sum_{i=0}^{\infty} h[i]\, R_{SX}[i]$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi} S_S(\omega)\,d\omega - \frac{1}{2\pi}\int_{-\pi}^{\pi} H(\omega)\, S_{SX}^{*}(\omega)\,d\omega$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi}\big(S_S(\omega) - H(\omega)\,S_{SX}^{*}(\omega)\big)\,d\omega$$
$$= \frac{1}{2\pi j}\oint_{C}\big(S_S(z) - H(z)\,S_{SX}(z^{-1})\big)\,z^{-1}\,dz$$
Example 4:
X[n] = S[n] + V₁[n] is the observation model, with
$$S[n] = 0.8\,S[n-1] + W[n]$$
where V₁[n] is an additive zero-mean Gaussian white noise with variance 1 and W[n] is zero-mean white noise with variance 0.68. Signal and noise are uncorrelated.
Find the optimal causal Wiener filter to estimate S[n].
Solution:
$$S_S(z) = \frac{0.68}{(1-0.8z^{-1})(1-0.8z)}$$
(The signal model: W[n] passed through 1/(1-0.8z⁻¹) gives S[n].)
$$R_X[m] = E\,X[n]X[n-m] = E\big(S[n]+V_1[n]\big)\big(S[n-m]+V_1[n-m]\big) = R_S[m] + R_{V_1}[m] \quad (\text{cross terms are } 0)$$
$$\Rightarrow\; S_X(z) = S_S(z) + 1$$
Factorize:
$$S_X(z) = \frac{0.68}{(1-0.8z^{-1})(1-0.8z)} + 1 = \frac{2(1-0.4z^{-1})(1-0.4z)}{(1-0.8z^{-1})(1-0.8z)}$$
Hence
$$H_c(z) = \frac{1-0.4z^{-1}}{1-0.8z^{-1}} \qquad \text{and} \qquad \sigma_v^{2} = 2$$
Also
$$R_{SX}[m] = E\,S[n]X[n-m] = E\,S[n]\big(S[n-m]+V_1[n-m]\big) = R_S[m]$$
$$\Rightarrow\; S_{SX}(z) = S_S(z) = \frac{0.68}{(1-0.8z^{-1})(1-0.8z)}$$
$$H(z) = \frac{1}{\sigma_v^{2}\,H_c(z)}\left[\frac{S_{SX}(z)}{H_c(z^{-1})}\right]_{+}
= \frac{1}{2}\cdot\frac{1-0.8z^{-1}}{1-0.4z^{-1}}\left[\frac{0.68}{(1-0.8z^{-1})(1-0.4z)}\right]_{+}$$
Using the partial-fraction expansion
$$\frac{0.68}{(1-0.8z^{-1})(1-0.4z)} = \frac{1}{1-0.8z^{-1}} + \frac{0.4z}{1-0.4z}$$
only the first term is causal, so
$$H(z) = \frac{1}{2}\cdot\frac{1-0.8z^{-1}}{1-0.4z^{-1}}\cdot\frac{1}{1-0.8z^{-1}} = \frac{0.5}{1-0.4z^{-1}}
\quad\Rightarrow\quad h[n] = 0.5\,(0.4)^{n},\; n \ge 0$$
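A numerical cross-check (a sketch, not part of the notes) approximates the causal IIR solution by a long FIR Wiener filter built from the model's correlations R_S[m] = (0.68/0.36)(0.8)^|m| and R_X[m] = R_S[m] + δ[m]; the resulting taps decay like 0.5 (0.4)^n:

    import numpy as np

    M = 60
    r_s = (0.68 / 0.36) * 0.8 ** np.arange(M)    # R_S[m] for the AR(1) signal
    r_x = r_s.copy()
    r_x[0] += 1.0                                # R_X[m] = R_S[m] + R_V1[m]

    R_X = np.array([[r_x[abs(j - i)] for i in range(M)] for j in range(M)])
    h = np.linalg.solve(R_X, r_s)                # FIR Wiener-Hopf equations
    print(h[:4])                                 # close to 0.5 * 0.4**n = [0.5, 0.2, 0.08, 0.032]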
5.8 IIR Wiener filter (Noncausal)
The estimator Ŝ[n] is given by
$$\hat{S}[n] = \sum_{i=-\infty}^{\infty} h[i]\, X[n-i]$$
For the LMMSE estimator, the error is orthogonal to the data:
$$E\Big(S[n]-\sum_{i=-\infty}^{\infty} h[i]X[n-i]\Big)X[n-j] = 0 \quad \forall\, j$$
(Figure: X[n] is filtered by the noncausal filter H(z) to produce Ŝ[n].)
$$\Rightarrow\; \sum_{i=-\infty}^{\infty} h[i]\, R_X(j-i) = R_{SX}(j), \qquad j = \ldots, -1, 0, 1, \ldots$$
• This form of the Wiener-Hopf equation is simple to analyse.
• It is easily solved in the frequency domain, so taking the z-transform we get the result below.
• It is not realizable in real time.
$$H(z)\, S_X(z) = S_{SX}(z)$$
so that
$$H(z) = \frac{S_{SX}(z)}{S_X(z)} \qquad \text{or} \qquad H(\omega) = \frac{S_{SX}(\omega)}{S_X(\omega)}$$
5.9 Mean Square Estimation Error – IIR Filter (Noncausal)
The mean-square error of estimation is given by
$$E e^{2}[n] = E\, e[n]\Big(S[n]-\sum_{i=-\infty}^{\infty} h[i]X[n-i]\Big) = E\, e[n]\, S[n] \qquad (\text{the error is orthogonal to the data})$$
$$= E\Big(S[n]-\sum_{i=-\infty}^{\infty} h[i]X[n-i]\Big)S[n] = R_S[0] - \sum_{i=-\infty}^{\infty} h[i]\, R_{SX}[i]$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi} S_S(\omega)\,d\omega - \frac{1}{2\pi}\int_{-\pi}^{\pi} H(\omega)\,S_{SX}^{*}(\omega)\,d\omega$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi}\big(S_S(\omega) - H(\omega)\,S_{SX}^{*}(\omega)\big)\,d\omega = \frac{1}{2\pi j}\oint_{C}\big(S_S(z) - H(z)\,S_{SX}(z^{-1})\big)\,z^{-1}\,dz$$
Example 5: Noise filtering by noncausal IIR Wiener Filter
Consider the case of a carrier signal in the presence of white Gaussian noise:
$$X[n] = S[n] + V[n]$$
where V[n] is an additive zero-mean Gaussian white noise with variance σ_V². Signal and noise are uncorrelated.
$$S_X(\omega) = S_S(\omega) + S_V(\omega) \qquad \text{and} \qquad S_{SX}(\omega) = S_S(\omega)$$
$$H(\omega) = \frac{S_S(\omega)}{S_S(\omega) + S_V(\omega)} = \frac{1}{1 + \dfrac{S_V(\omega)}{S_S(\omega)}}$$
Suppose the SNR is very high. Then H(ω) ≈ 1, i.e. the signal will be passed unattenuated.
When the SNR is low,
$$H(\omega) \approx \frac{S_S(\omega)}{S_V(\omega)}$$
i.e. if the noise is high, the corresponding signal component will be attenuated in proportion to the estimated SNR.
(Figure: sketch of H(ω) versus SNR — the gain approaches 1 where the signal dominates and falls towards 0 where the noise dominates.)
Example 6: Image filtering by IIR Wiener filter
S_XX(ω) — power spectrum of the corrupted image.
S_VV(ω) — power spectrum of the noise, estimated from the noise model or from a constant-intensity region (like the background) of the image.
$$H(\omega) = \frac{S_{SS}(\omega)}{S_{SS}(\omega) + S_{VV}(\omega)} = \frac{S_{XX}(\omega) - S_{VV}(\omega)}{S_{XX}(\omega)}$$
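A minimal sketch of this frequency-domain rule applied to a 1-D signal (the 2-D image case is identical with fft2; the test signal and noise level here are made-up, and a small clip at zero is added because the periodogram can dip below the noise level):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 1024
    s = np.sin(2 * np.pi * 0.01 * np.arange(n))           # clean "image row"
    sigma_v = 0.5
    x = s + sigma_v * rng.standard_normal(n)               # corrupted observation

    S_xx = np.abs(np.fft.fft(x)) ** 2 / n                  # periodogram of the corrupted data
    S_vv = sigma_v ** 2                                    # flat noise spectrum (assumed known)
    H = np.maximum(S_xx - S_vv, 0.0) / S_xx                # H = (S_XX - S_VV) / S_XX, clipped at 0
    s_hat = np.real(np.fft.ifft(H * np.fft.fft(x)))        # Wiener-filtered estimate
    print(np.mean((x - s) ** 2), np.mean((s_hat - s) ** 2))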
Example 7:
Consider the signal in the presence of white noise, observed as x[n] = s[n] + v[n], with
$$s[n] = 0.8\,s[n-1] + w[n]$$
where v[n] is an additive zero-mean Gaussian white noise with variance 1 and w[n] is zero-mean white noise with variance 0.68. Signal and noise are uncorrelated.
Find the optimal noncausal Wiener filter to estimate s[n].
$$H(z) = \frac{S_{SX}(z)}{S_{XX}(z)}
= \frac{\dfrac{0.68}{(1-0.8z^{-1})(1-0.8z)}}{\dfrac{2(1-0.4z^{-1})(1-0.4z)}{(1-0.8z^{-1})(1-0.8z)}}
= \frac{0.34}{(1-0.4z^{-1})(1-0.4z)}$$
One pole is inside and one pole is outside the unit circle, so
$$H(z) = \frac{0.4048}{1-0.4z^{-1}} + \frac{0.4048\,(0.4z)}{1-0.4z}$$
$$\Rightarrow\; h[n] = 0.4048\,(0.4)^{n}u(n) + 0.4048\,(0.4)^{-n}u(-n-1) = 0.4048\,(0.4)^{|n|}$$
(Figure: plot of the two-sided impulse response h[n] versus n.)
KALMAN FILTER- P.K. Bora
1. Introduction
To estimate a signal s[n] in the presence of noise:
• The FIR Wiener filter is optimum when the data length and the filter length are equal.
• The IIR Wiener filter is based on the assumption that an infinite length of data is available.
Neither of the above filters represents the physical situation. We need a filter that adds a tap with each addition of data.
The basic mechanism of the Kalman filter (R.E. Kalman, 1960) is to estimate the signal recursively by a relation of the form
$$\hat{S}[n] = A_n\,\hat{S}[n-1] + K_n\,X[n]$$
The Kalman filter is also based on the innovation representation of the signal. We used this model to develop the causal IIR Wiener filter.
2. Signal Model
The simplest Kalman filter uses the first-order AR signal model
$$S[n] = a\,S[n-1] + W[n]$$
where W[n] is a white noise sequence.
The observed data are given by
$$X[n] = S[n] + V[n]$$
where V[n] is another white noise sequence, independent of the signal.
A general stationary signal is modeled by a difference equation representing the ARMA(p,q) model. Such a signal can be modeled by the state-space model
$$\mathbf{S}[n] = \mathbf{A}\,\mathbf{S}[n-1] + \mathbf{B}\,W[n] \qquad (1)$$
The observations can also be represented as a linear combination of the states and the observation noise:
$$X[n] = \mathbf{c}'\,\mathbf{S}[n] + V[n] \qquad (2)$$
(Figure: S[n] plus noise gives X[n]; a linear filter produces Ŝ[n].)
Equations (1) and (2) have a direct relation to the state-space model used in control systems, where you have to estimate the 'unobservable' states of the system through an observer that performs well against noise.
Example 1:
Consider the AR(p) model
$$S[n] = a_1 S[n-1] + a_2 S[n-2] + \cdots + a_p S[n-p] + W[n]$$
Then the state-variable model for {S[n]} is given by
$$\mathbf{S}[n] = \mathbf{A}\,\mathbf{S}[n-1] + \mathbf{b}\,W[n]$$
where
$$\mathbf{S}[n] = \begin{bmatrix} S_1[n] \\ S_2[n] \\ \vdots \\ S_p[n] \end{bmatrix},\qquad
S_1[n] = S[n],\; S_2[n] = S[n-1],\; \ldots,\; S_p[n] = S[n-p+1],$$
$$\mathbf{A} = \begin{bmatrix} a_1 & a_2 & \cdots & a_{p-1} & a_p \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}
\qquad\text{and}\qquad
\mathbf{b} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
Our analysis will cover only the simple (scalar) Kalman filter.
3. Innovation Representation
The Kalman filter also uses the innovation representation of the stationary signal, as the IIR Wiener filter does. The innovation representation is shown in the following diagram.
(Figure: X[n] is mapped to the innovation X̃[n].)
Let Ŝ[n|n] denote the LMMSE estimate of S[n] based on the data X[0], X[1], ..., X[n]. Thus,
$$\hat{S}[n|n] = \hat{E}\big(S[n]\,|\,X[0], X[1], \ldots, X[n]\big)$$
where Ê is the linear minimum mean-square estimation operator.
Innovation Generation
Let X̂[n|n-1] be the linear prediction of X[n] based on X[0], X[1], ..., X[n-1]. Define
$$\tilde{X}[n] = X[n] - \hat{X}[n|n-1]$$
$$\hat{X}[n|n-1] = \hat{E}\big(X[n]\,|\,X[0], \ldots, X[n-1]\big)
= \hat{E}\big(S[n]+V[n]\,|\,X[0], \ldots, X[n-1]\big)
= \hat{E}\big(aS[n-1]+W[n]+V[n]\,|\,X[0], \ldots, X[n-1]\big)
= a\,\hat{S}[n-1|n-1]$$
Hence
$$\tilde{X}[n] = X[n] - a\hat{S}[n-1|n-1] = S[n] + V[n] - a\hat{S}[n-1|n-1]
= aS[n-1] + W[n] + V[n] - a\hat{S}[n-1|n-1]
= a\,e[n-1] + W[n] + V[n]$$
Similarly,
$$\tilde{X}[n-j] = X[n-j] - \hat{X}[n-j|n-j-1] = a\,e[n-j-1] + W[n-j] + V[n-j]$$
It is easy to show that
$$E\,\tilde{X}[n]\tilde{X}[n-j] = 0, \qquad j = 1, 2, \ldots, n$$
and $\sigma_n^{2} = E\tilde{X}^{2}[n]$, which varies with n.
In the above representation, X̃[n] is the innovation of X[n] and contains the same information as the original sequence.
Exercise: Show that $E\,\tilde{X}[n]\tilde{X}[n-j] = 0,\; j = 1, 2, \ldots, n.$
4. LMMSE estimation based on the Innovation Sequence
The LMMSE estimate of S[n] based on X[0], X[1], ..., X[n] is the same as the estimate based on the innovation sequence X̃[0], X̃[1], ..., X̃[n-1], X̃[n]. Therefore,
$$\hat{S}[n|n] = \sum_{i=0}^{n} k_i^{(n)}\,\tilde{X}[i]$$
where the coefficients $k_i^{(n)}$ can be obtained by using the orthogonality relation.
Note that
$$e[n] = S[n] - \sum_{i=0}^{n} k_i^{(n)}\,\tilde{X}[i]$$
Then,
$$E\Big(S[n] - \sum_{i=0}^{n} k_i^{(n)}\tilde{X}[i]\Big)\tilde{X}[j] = 0, \qquad j = 0, 1, \ldots, n$$
so that
$$k_j^{(n)} = \frac{E\,S[n]\tilde{X}[j]}{\sigma_j^{2}}, \qquad j = 0, 1, \ldots, n$$
where $\sigma_j^{2} = E\tilde{X}^{2}[j]$.
Similarly,
$$\hat{S}[n-1|n-1] = \sum_{i=0}^{n-1} k_i^{(n-1)}\,\tilde{X}[i]$$
$$k_j^{(n-1)} = \frac{E\,S[n-1]\tilde{X}[j]}{\sigma_j^{2}}
= \frac{E\big(S[n]-W[n]\big)\tilde{X}[j]}{a\,\sigma_j^{2}}
= \frac{E\,S[n]\tilde{X}[j]}{a\,\sigma_j^{2}}, \qquad j = 0, 1, \ldots, n-1$$
$$\Rightarrow\; k_j^{(n)} = a\,k_j^{(n-1)}, \qquad j = 0, 1, \ldots, n-1$$
$$\hat{S}[n|n] = \sum_{i=0}^{n} k_i^{(n)}\tilde{X}[i]
= \sum_{i=0}^{n-1} k_i^{(n)}\tilde{X}[i] + k_n^{(n)}\tilde{X}[n]
= a\sum_{i=0}^{n-1} k_i^{(n-1)}\tilde{X}[i] + k_n^{(n)}\tilde{X}[n]$$
$$= a\,\hat{S}[n-1|n-1] + k_n^{(n)}\big(X[n] - \hat{E}(X[n]\,|\,X[0], X[1], \ldots, X[n-1])\big)
= a\,\hat{S}[n-1|n-1] + k_n^{(n)}\big(X[n] - a\hat{S}[n-1|n-1]\big)$$
where $a\hat{S}[n-1|n-1] = \hat{E}(X[n]\,|\,X[0], X[1], \ldots, X[n-1])$ is the linear prediction of X[n] based on the observations X[0], X[1], ..., X[n-1].
Thus the recursive estimator Ŝ[n|n] is given by
$$\hat{S}[n|n] = a\,\hat{S}[n-1|n-1] + k_n^{(n)}\big(X[n] - a\hat{S}[n-1|n-1]\big)$$
or
$$\hat{S}[n|n] = A_n\,\hat{S}[n-1|n-1] + k_n^{(n)}\,X[n], \qquad A_n = \big(1 - k_n^{(n)}\big)a$$
The filter can be represented in the following diagram.
(Figure: the recursive filter — X[n] is scaled by the gain k_n and added to a·Ŝ[n-1], fed back through a unit delay z⁻¹, to produce Ŝ[n].)
5. Estimation of the filter parameters
Consider the estimator
$$\hat{S}[n|n] = A_n\,\hat{S}[n-1|n-1] + k_n\,X[n]$$
The estimation error is given by
$$e[n] = S[n] - \hat{S}[n|n]$$
Therefore e[n] must be orthogonal to the past and present observed data:
$$E\,e[n]\,X[n-m] = 0, \qquad m \ge 0$$
We want to find A_n and k_n using the above condition.
The error e[n] is orthogonal to the current and past data. First consider the condition that e[n] is orthogonal to the current data.
$$E\,e[n]\,X[n] = 0
\;\Rightarrow\; E\,e[n]\big(S[n]+V[n]\big) = 0
\;\Rightarrow\; E\,e[n]S[n] + E\,e[n]V[n] = 0$$
$$E\,e[n]S[n] = E\,e[n]\big(\hat{S}[n|n] + e[n]\big) = E\,e^{2}[n] = P[n] \qquad (\text{denoting } E\,e^{2}[n] \text{ by } P[n])$$
$$E\,e[n]V[n] = E\big(S[n] - A_n\hat{S}[n-1|n-1] - k_n X[n]\big)V[n] = -k_n\,\sigma_V^{2}$$
$$\Rightarrow\; P[n] - k_n\,\sigma_V^{2} = 0
\;\Rightarrow\; k_n = \frac{P[n]}{\sigma_V^{2}}$$
We have to estimate P[n] at every value of n.
Estimation of P[n]
We have
$$P[n] = E\,S[n]e[n] = E\,S[n]\big(S[n] - (1-k_n)\,a\hat{S}[n-1|n-1] - k_n X[n]\big)$$
$$= (1-k_n)\big(E\,S^{2}[n] - a\,E\,S[n]\hat{S}[n-1|n-1]\big) \qquad (\text{since } E\,S[n]X[n] = \sigma_S^{2})$$
$$= (1-k_n)\big(\sigma_S^{2} - a\,E\,(aS[n-1]+W[n])\,\hat{S}[n-1|n-1]\big)
= (1-k_n)\big(\sigma_S^{2} - a^{2}E\,S[n-1]\hat{S}[n-1|n-1]\big)$$
$$= (1-k_n)\big(\sigma_S^{2} - a^{2}E\,S[n-1](S[n-1]-e[n-1])\big)
= (1-k_n)\big(a^{2}P[n-1] + (1-a^{2})\sigma_S^{2}\big)$$
Hence, substituting $k_n = P[n]/\sigma_V^{2}$ and solving for P[n],
$$P[n] = \frac{\sigma_V^{2}\big(a^{2}P[n-1] + \sigma_W^{2}\big)}{a^{2}P[n-1] + \sigma_W^{2} + \sigma_V^{2}}$$
where we have substituted $\sigma_W^{2} = (1-a^{2})\sigma_S^{2}$ for an AR(1) process.
We still have to find P[0]. For this, assume Ŝ[-1] = S[-1] = 0. Hence, from the relation
$$P[n] = (1-k_n)\,\sigma_S^{2} - (1-k_n)\,a^{2}E\,S[n-1]\hat{S}[n-1|n-1]$$
we get
$$P[0] = (1-k_0)\,\sigma_S^{2}$$
Substituting $k_0 = P[0]/\sigma_V^{2}$ in the expression for P[0] above, we get
$$P[0] = \frac{\sigma_S^{2}\,\sigma_V^{2}}{\sigma_S^{2} + \sigma_V^{2}}$$
6. Scalar Kalman filter algorithm
Given: the signal model parameters a and σ_W², and the observation noise variance σ_V².
Initialisation: Ŝ[-1] = 0.
Step 1: n = 0. Calculate
$$P[0] = \frac{\sigma_S^{2}\,\sigma_V^{2}}{\sigma_S^{2} + \sigma_V^{2}}$$
Step 2: Calculate the Kalman gain
$$k_n = \frac{P[n]}{\sigma_V^{2}}$$
Step 3: Input X[n]. Estimate Ŝ[n] by
Predict: $\hat{S}[n|n-1] = a\,\hat{S}[n-1]$
Correct: $\hat{S}[n] = \hat{S}[n|n-1] + k_n\big(X[n] - \hat{S}[n|n-1]\big)$
Step 4: n = n + 1. Calculate
$$P[n] = \frac{\sigma_V^{2}\big(a^{2}P[n-1] + \sigma_W^{2}\big)}{a^{2}P[n-1] + \sigma_W^{2} + \sigma_V^{2}}$$
Step 5: Go to Step 2.
• We have to initialize P[0].
• Irrespective of this initialization, k_n and P[n] converge to final values.
• Considering a to be time varying, the filter can be used to estimate a nonstationary signal.
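A direct Python transcription of these steps (a sketch, not part of the notes; the parameter values below are those of the example that follows) is:

    import numpy as np

    def scalar_kalman(x, a, var_w, var_v, var_s):
        """Scalar Kalman filter for S[n] = a S[n-1] + W[n], X[n] = S[n] + V[n]."""
        s_hat = 0.0                                  # initialisation S_hat[-1] = 0
        P = var_s * var_v / (var_s + var_v)          # P[0]
        s_est, gains = [], []
        for xn in x:
            k = P / var_v                            # Step 2: gain k_n = P[n] / var_v
            s_pred = a * s_hat                       # Step 3: predict
            s_hat = s_pred + k * (xn - s_pred)       #         correct
            s_est.append(s_hat)
            gains.append(k)
            P = var_v * (a**2 * P + var_w) / (a**2 * P + var_w + var_v)   # Step 4
        return np.array(s_est), np.array(gains)

    # Model of the example below: a = 0.6, var_w = 0.25, var_v = 0.5, var_s = var_w / (1 - a^2)
    rng = np.random.default_rng(0)
    a, var_w, var_v = 0.6, 0.25, 0.5
    var_s = var_w / (1 - a**2)
    n = 200
    s = np.zeros(n)
    for i in range(1, n):
        s[i] = a * s[i-1] + np.sqrt(var_w) * rng.standard_normal()
    x = s + np.sqrt(var_v) * rng.standard_normal(n)
    s_est, gains = scalar_kalman(x, a, var_w, var_v, var_s)
    print(gains[-1])                                 # the gain settles near its steady-state value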
Example: Given
$$S[n] = 0.6\,S[n-1] + W[n], \quad n \ge 0$$
$$X[n] = S[n] + V[n], \quad n \ge 0$$
$$\sigma_W^{2} = 0.25, \qquad \sigma_V^{2} = 0.5$$
Find the expression for the Kalman filter equations at convergence and the corresponding mean-square error.
Using
$$P[n] = \frac{\sigma_V^{2}\big(a^{2}P[n-1] + \sigma_W^{2}\big)}{a^{2}P[n-1] + \sigma_W^{2} + \sigma_V^{2}}$$
at convergence P[n] = P[n-1] = P, so
$$P = \frac{0.5\,(0.25 + 0.36P)}{0.25 + 0.5 + 0.36P}$$
Solving the resulting quadratic and taking the positive root,
$$P \approx 0.195$$
(the corresponding one-step prediction error variance is $a^{2}P + \sigma_W^{2} \approx 0.320$), and
$$\lim_{n\to\infty} k_n = \frac{P}{\sigma_V^{2}} \approx \frac{0.195}{0.5} = 0.39$$
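Iterating the variance recursion numerically (a sketch, not from the notes) confirms these steady-state values:

    a, var_w, var_v = 0.6, 0.25, 0.5
    var_s = var_w / (1 - a**2)
    P = var_s * var_v / (var_s + var_v)      # P[0]
    for _ in range(100):
        P = var_v * (a**2 * P + var_w) / (a**2 * P + var_w + var_v)
    print(P, P / var_v)                      # a posteriori MSE ~ 0.195, steady-state gain ~ 0.39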
7. Vector Kalman Filter
Consider the time-varying state-space model representing a nonstationary signal.
The state equation is
$$\mathbf{S}[n] = \mathbf{A}[n]\,\mathbf{S}[n-1] + \mathbf{W}[n]$$
where $\mathbf{S}[n] = [S_1[n]\; S_2[n]\; \cdots\; S_p[n]]'$ is the state vector, A[n] is a p×p system matrix, and W[n] is the zero-mean Gaussian process-noise vector with p×p covariance matrix Q_W.
The observed data $\mathbf{X}[n] = [X_1[n]\; X_2[n]\; \cdots\; X_q[n]]'$ are related to the states by
$$\mathbf{X}[n] = \mathbf{C}\,\mathbf{S}[n] + \mathbf{V}[n]$$
where C is a q×p output matrix and V[n] is the zero-mean Gaussian observation-noise vector with q×q covariance matrix Q_V.
Denote
$$\mathbf{S}[n|n] = \text{LMMSE estimate of } \mathbf{S}[n] \text{ given } \mathbf{X}[0], \mathbf{X}[1], \ldots, \mathbf{X}[n]$$
$$\mathbf{S}[n|n-1] = \text{LMMSE estimate of } \mathbf{S}[n] \text{ given } \mathbf{X}[0], \mathbf{X}[1], \ldots, \mathbf{X}[n-1]$$
The corresponding state-estimation errors are
$$\mathbf{e}[n|n] = \mathbf{S}[n] - \mathbf{S}[n|n], \qquad \mathbf{e}[n|n-1] = \mathbf{S}[n] - \mathbf{S}[n|n-1]$$
Generalising the scalar recursion, we have
$$\mathbf{S}[n|n] = \mathbf{S}[n|n-1] + \mathbf{k}_n\big(\mathbf{X}[n] - \hat{\mathbf{X}}[n|n-1]\big)
= \mathbf{A}[n]\mathbf{S}[n-1|n-1] + \mathbf{k}_n\big(\mathbf{X}[n] - \mathbf{C}[n]\mathbf{S}[n|n-1]\big)$$
where $\mathbf{k}_n$ is a p×q gain matrix.
The mean-square estimation error of the scalar Kalman filter is now replaced by the error covariance matrix, denoted P.
The a priori estimate of the error covariance matrix is given by
$$\mathbf{P}[n|n-1] = E\,\mathbf{e}[n|n-1]\,\mathbf{e}'[n|n-1]$$
and the a posteriori error covariance estimate is
$$\mathbf{P}[n|n] = E\,\mathbf{e}[n|n]\,\mathbf{e}'[n|n]$$
With these definitions and notations, the vector Kalman filter algorithm is as follows.
State equation: $\mathbf{S}[n] = \mathbf{A}[n]\mathbf{S}[n-1] + \mathbf{W}[n]$
Observation equation: $\mathbf{X}[n] = \mathbf{C}[n]\mathbf{S}[n] + \mathbf{V}[n]$
Given: (a) the state matrices A[n], n = 0, 1, 2, ... and the process-noise covariance matrix Q_W; (b) the observation matrices C[n], n = 0, 1, 2, ... and the observation-noise covariance matrix Q_V; (c) the observed data X[n], n = 0, 1, 2, ...
Initialization:
$$\mathbf{S}[-1|-1] = E\,\mathbf{S}[-1] = \mathbf{0}, \qquad \mathbf{P}[-1|-1] = E\,\mathbf{S}[-1]\mathbf{S}'[-1]$$
Estimation: for n = 0, 1, 2, ... do
Prediction (a priori):
$$\mathbf{S}[n|n-1] = \mathbf{A}[n]\,\mathbf{S}[n-1|n-1]$$
$$\mathbf{P}[n|n-1] = \mathbf{A}[n]\,\mathbf{P}[n-1|n-1]\,\mathbf{A}'[n] + \mathbf{Q}_W$$
Compute the Kalman gain:
$$\mathbf{k}_n = \mathbf{P}[n|n-1]\,\mathbf{C}'[n]\big(\mathbf{C}[n]\,\mathbf{P}[n|n-1]\,\mathbf{C}'[n] + \mathbf{Q}_V\big)^{-1}$$
Update the state and the error covariance matrix (a posteriori):
$$\mathbf{S}[n|n] = \mathbf{S}[n|n-1] + \mathbf{k}_n\big(\mathbf{X}[n] - \mathbf{C}[n]\,\mathbf{S}[n|n-1]\big)$$
$$\mathbf{P}[n|n] = \big(\mathbf{I} - \mathbf{k}_n\mathbf{C}[n]\big)\,\mathbf{P}[n|n-1]$$
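A compact NumPy version of this loop (a sketch, not part of the notes; the two-state position-velocity model used for the demonstration is made-up) is:

    import numpy as np

    def kalman_filter(X, A, C, Q_w, Q_v, s0, P0):
        """Vector Kalman filter: S[n] = A S[n-1] + W[n], X[n] = C S[n] + V[n]."""
        s_hat, P = s0, P0
        estimates = []
        for x in X:
            # Prediction (a priori)
            s_pred = A @ s_hat
            P_pred = A @ P @ A.T + Q_w
            # Kalman gain
            K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + Q_v)
            # Update (a posteriori)
            s_hat = s_pred + K @ (x - C @ s_pred)
            P = (np.eye(len(s_hat)) - K @ C) @ P_pred
            estimates.append(s_hat)
        return np.array(estimates)

    # Made-up 2-state example: position-velocity model observed in position only.
    A = np.array([[1.0, 1.0], [0.0, 1.0]])
    C = np.array([[1.0, 0.0]])
    Q_w = 0.01 * np.eye(2)
    Q_v = np.array([[0.25]])
    rng = np.random.default_rng(0)
    s = np.zeros(2)
    X = []
    for _ in range(100):
        s = A @ s + rng.multivariate_normal(np.zeros(2), Q_w)
        X.append(C @ s + rng.multivariate_normal(np.zeros(1), Q_v))
    est = kalman_filter(X, A, C, Q_w, Q_v, np.zeros(2), np.eye(2))
    print(est[-1])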
Least-squares Estimator
In least-squares (LS) estimation, the observed data are assumed to be a known function of some unknown parameters plus an error which is assumed to be random. Unlike the MVUE, MLE or Bayesian estimators, we do not have an explicit probabilistic model for the data; the randomness is only due to the measurement error.
Consider the simplest example,
$$X_i = \theta + e_i$$
where θ is the unknown parameter and e_i is the observation or measurement noise. According to the LS principle, we have to minimize the sum-square error
$$J(\theta) = \sum_{i=1}^{n}(X_i - \theta)^{2}$$
with respect to θ. Thus
$$\hat{\theta}_{LS} = \arg\min_{\theta} J(\theta)$$
$\hat{\theta}_{LS}$ is given by
$$\left.\frac{\partial J(\theta)}{\partial \theta}\right|_{\hat{\theta}_{LS}} = 0
\;\Rightarrow\; \sum_{i=1}^{n} X_i - n\,\hat{\theta}_{LS} = 0
\;\Rightarrow\; \hat{\theta}_{LS} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
which is the same as the MLE.
General Linear Model
The general linear model for the observed samples X_i, i = 1, 2, ..., n is given by
$$\mathbf{X} = \mathbf{H}\boldsymbol{\theta} + \mathbf{e}$$
where we have assumed $\mathbf{X} = [X_1\; X_2\; \cdots\; X_n]^{T}$ and $\boldsymbol{\theta} = [\theta_1\; \theta_2\; \cdots\; \theta_M]^{T}$. In general, both the data and the parameters can be multidimensional or one-dimensional. We consider the simple case of the signal model
$$\mathbf{X} = \mathbf{H}\boldsymbol{\theta} + \mathbf{e}$$
so that the sum-square error is given by
$$J(\boldsymbol{\theta}) = \sum_{i=1}^{n} e_i^{2}
= \sum_{i=1}^{n}\Big(X_i - \sum_{j} h_{ij}\theta_j\Big)^{2}
= (\mathbf{X}-\mathbf{H}\boldsymbol{\theta})^{T}(\mathbf{X}-\mathbf{H}\boldsymbol{\theta})
= \mathbf{X}^{T}\mathbf{X} - \mathbf{X}^{T}\mathbf{H}\boldsymbol{\theta} - \boldsymbol{\theta}^{T}\mathbf{H}^{T}\mathbf{X} + \boldsymbol{\theta}^{T}\mathbf{H}^{T}\mathbf{H}\boldsymbol{\theta}$$
Thus $\hat{\boldsymbol{\theta}}_{LS}$ is given by
$$\left.\frac{\partial J(\boldsymbol{\theta})}{\partial \boldsymbol{\theta}}\right|_{\hat{\boldsymbol{\theta}}_{LS}} = 0
\;\Rightarrow\; -2\mathbf{H}^{T}\mathbf{X} + 2\mathbf{H}^{T}\mathbf{H}\boldsymbol{\theta} = 0$$
$$\mathbf{H}^{T}\mathbf{H}\,\hat{\boldsymbol{\theta}}_{LS} = \mathbf{H}^{T}\mathbf{X}$$
The above equation is known as the normal equation. The optimal solution is given by
$$\hat{\boldsymbol{\theta}}_{LS} = (\mathbf{H}^{T}\mathbf{H})^{-1}\mathbf{H}^{T}\mathbf{X}$$
The matrix $(\mathbf{H}^{T}\mathbf{H})^{-1}\mathbf{H}^{T}$ is known as the pseudo-inverse.
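In practice the normal equations are rarely formed explicitly; NumPy's least-squares routine (or the pseudo-inverse) gives the same θ̂_LS. A small sketch (not from the notes; the matrix H and the data X are made-up):

    import numpy as np

    rng = np.random.default_rng(0)
    H = rng.standard_normal((20, 3))           # 20 observations, 3 parameters
    theta_true = np.array([1.0, -2.0, 0.5])
    X = H @ theta_true + 0.1 * rng.standard_normal(20)

    theta_normal = np.linalg.solve(H.T @ H, H.T @ X)       # normal equations
    theta_pinv = np.linalg.pinv(H) @ X                     # pseudo-inverse (H^T H)^-1 H^T
    theta_lstsq, *_ = np.linalg.lstsq(H, X, rcond=None)    # numerically preferred route
    print(theta_normal, theta_pinv, theta_lstsq)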
We can also write
$$\mathbf{X} = \sum_{j=1}^{m}\theta_j\,\mathbf{h}_j + \mathbf{e}$$
so that
$$\frac{\partial}{\partial\theta_j}\Big\|\mathbf{X}-\sum_{j=1}^{m}\theta_j\mathbf{h}_j\Big\|^{2} = 0
\;\Rightarrow\; \Big(\mathbf{X}-\sum_{j=1}^{m}\hat{\theta}_j\mathbf{h}_j\Big)^{T}\mathbf{h}_j = 0, \qquad j = 1, 2, \ldots, m$$
This is the orthogonality principle: the error is orthogonal to each column vector of the H matrix.
Geometrical Interpretation
From the orthogonality principle, a geometrical interpretation of the LS estimation can be given. The observed data X is orthogonally projected onto the space spanned by h_j, j = 1, 2, ..., m, and the optimal solution is found in that space. This is illustrated in the figure below.
(Figure: X projected onto the plane spanned by h₁ and h₂.)
Example:
$$x_i = \theta_0 + \theta_1 i + e_i, \qquad i = 1, 2, \ldots, n$$
$$\mathbf{X} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix},\qquad
\mathbf{H} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ \vdots & \vdots \\ 1 & n \end{bmatrix},\qquad
\boldsymbol{\theta} = \begin{bmatrix} \theta_0 \\ \theta_1 \end{bmatrix}$$
Normal equation: $\mathbf{H}^{T}\mathbf{H}\,\hat{\boldsymbol{\theta}}_{LS} = \mathbf{H}^{T}\mathbf{X}$, with
$$\mathbf{H}^{T}\mathbf{H} = \begin{bmatrix} n & \sum_{i=1}^{n} i \\ \sum_{i=1}^{n} i & \sum_{i=1}^{n} i^{2} \end{bmatrix}
= \begin{bmatrix} n & \dfrac{n(n+1)}{2} \\ \dfrac{n(n+1)}{2} & \dfrac{n(n+1)(2n+1)}{6} \end{bmatrix}$$
$$\mathbf{H}^{T}\mathbf{X} = \begin{bmatrix} \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} i\,x_i \end{bmatrix}$$
Solving the above matrix equation, we get $\hat{\theta}_{0,LS}$ and $\hat{\theta}_{1,LS}$.
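For instance (a sketch with made-up data, not from the notes), a straight-line fit with this H matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 30
    i = np.arange(1, n + 1)
    x = 2.0 + 0.5 * i + rng.standard_normal(n)    # x_i = theta0 + theta1 * i + e_i

    H = np.column_stack([np.ones(n), i])
    theta_hat, *_ = np.linalg.lstsq(H, x, rcond=None)
    print(theta_hat)                              # close to [2.0, 0.5]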
Statistical properties of $\hat{\boldsymbol{\theta}}_{LS}$
$\hat{\boldsymbol{\theta}}_{LS}$ is unbiased. We have
$$\hat{\boldsymbol{\theta}}_{LS} = (\mathbf{H}^{T}\mathbf{H})^{-1}\mathbf{H}^{T}\mathbf{X}
= (\mathbf{H}^{T}\mathbf{H})^{-1}\mathbf{H}^{T}(\mathbf{H}\boldsymbol{\theta}+\mathbf{e})
= \boldsymbol{\theta} + (\mathbf{H}^{T}\mathbf{H})^{-1}\mathbf{H}^{T}\mathbf{e}$$
$$\Rightarrow\; E\,\hat{\boldsymbol{\theta}}_{LS} = \boldsymbol{\theta} + (\mathbf{H}^{T}\mathbf{H})^{-1}\mathbf{H}^{T}E\,\mathbf{e} = \boldsymbol{\theta}$$
More Related Content

PPT
Wiener filters
PDF
Wiener Filter
PPTX
Digital Image Processing
PDF
Wiener Filter Hardware Realization
PDF
Image restoration1
PDF
Lecture 11
PDF
04 1 - frequency domain filtering fundamentals
PDF
Digital Image Processing - Image Restoration
Wiener filters
Wiener Filter
Digital Image Processing
Wiener Filter Hardware Realization
Image restoration1
Lecture 11
04 1 - frequency domain filtering fundamentals
Digital Image Processing - Image Restoration

What's hot (20)

PPTX
DISTINGUISH BETWEEN WALSH TRANSFORM AND HAAR TRANSFORMDip transforms
PPTX
Wavelet transform in two dimensions
PPTX
Unit3 dip
PPT
Chapter 3 Image Processing: Basic Transformation
PPTX
PPT
Chapter 5
PPTX
Image denoising
PPTX
Image Restoration (Frequency Domain Filters):Basics
PPTX
Digital image processing
PPT
6.frequency domain image_processing
PDF
13 fourierfiltrationen
PDF
Frequency Domain Filtering of Digital Images
PPT
Image trnsformations
PPTX
Frequency domain methods
PPTX
Digital signal processing on arm new
PPTX
Module 31
PDF
Lecture 10
PPTX
Image denoising algorithms
DISTINGUISH BETWEEN WALSH TRANSFORM AND HAAR TRANSFORMDip transforms
Wavelet transform in two dimensions
Unit3 dip
Chapter 3 Image Processing: Basic Transformation
Chapter 5
Image denoising
Image Restoration (Frequency Domain Filters):Basics
Digital image processing
6.frequency domain image_processing
13 fourierfiltrationen
Frequency Domain Filtering of Digital Images
Image trnsformations
Frequency domain methods
Digital signal processing on arm new
Module 31
Lecture 10
Image denoising algorithms
Ad

Similar to Before quiz 2 (20)

PPTX
Av 738- Adaptive Filtering - Wiener Filters[wk 3]
PDF
Section8 control
PDF
Section8 stochastic
PDF
Estimationtheory2
PDF
Fundamentals Of Statistical Signal Processing--Estimation Theory-Kay.pdf
PDF
Section7 stochastic
PDF
Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer ...
PDF
Lecture Notes on Adaptive Signal Processing-1.pdf
PDF
SAS Homework Help
PDF
PPT
Adaptive Filtering3for DSP applications.ppt
PPT
Adaptive Filtering3for dsp applications.ppt
PPTX
Detection & Estimation in digital communication.pptx
PPT
SOME PROPERTIES OF ESTIMATORS - 552.ppt
PPTX
Adaptive filter
PDF
Tutorial APS 2023: Phase transition for statistical estimation: algorithms an...
PDF
Sensor Fusion Study - Ch3. Least Square Estimation [강소라, Stella, Hayden]
PDF
Minimum mean square error estimation and approximation of the Bayesian update
PDF
Module iv sp
PPT
Explaination of power spectrum ectr18.ppt
Av 738- Adaptive Filtering - Wiener Filters[wk 3]
Section8 control
Section8 stochastic
Estimationtheory2
Fundamentals Of Statistical Signal Processing--Estimation Theory-Kay.pdf
Section7 stochastic
Computational Motor Control: Optimal Estimation in Noisy World (JAIST summer ...
Lecture Notes on Adaptive Signal Processing-1.pdf
SAS Homework Help
Adaptive Filtering3for DSP applications.ppt
Adaptive Filtering3for dsp applications.ppt
Detection & Estimation in digital communication.pptx
SOME PROPERTIES OF ESTIMATORS - 552.ppt
Adaptive filter
Tutorial APS 2023: Phase transition for statistical estimation: algorithms an...
Sensor Fusion Study - Ch3. Least Square Estimation [강소라, Stella, Hayden]
Minimum mean square error estimation and approximation of the Bayesian update
Module iv sp
Explaination of power spectrum ectr18.ppt
Ad

More from Gopi Saiteja (20)

PPT
Trees gt(1)
PPT
Topic11 sortingandsearching
PPT
Heapsort
PPT
Hashing gt1
PDF
Ee693 sept2014quizgt2
PDF
Ee693 sept2014quizgt1
PDF
Ee693 sept2014quiz1
PDF
Ee693 sept2014midsem
PDF
Ee693 questionshomework
PPT
Dynamic programming
PPT
Cs105 l15-bucket radix
PPT
Chapter11 sorting algorithmsefficiency
PDF
Answers withexplanations
PDF
Sorting
PDF
Solution(1)
PDF
Pthread
PDF
Open mp
PDF
Introduction
PDF
PDF
Vector space interpretation_of_random_variables
Trees gt(1)
Topic11 sortingandsearching
Heapsort
Hashing gt1
Ee693 sept2014quizgt2
Ee693 sept2014quizgt1
Ee693 sept2014quiz1
Ee693 sept2014midsem
Ee693 questionshomework
Dynamic programming
Cs105 l15-bucket radix
Chapter11 sorting algorithmsefficiency
Answers withexplanations
Sorting
Solution(1)
Pthread
Open mp
Introduction
Vector space interpretation_of_random_variables

Recently uploaded (20)

PPTX
CL11_CH20_-LOCOMOTION-AND-MOVEMENT-Autosaved.pptx
PDF
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
PPTX
Economic Sector Performance Recovery.pptx
PPTX
artificial intelligence deeplearning-200712115616.pptx
PPTX
Understanding Prototyping in Design and Development
PPTX
batch data Retailer Data management Project.pptx
PPTX
Extract Transformation Load (3) (1).pptx
PPTX
Trading Procedures (1).pptxcffcdddxxddsss
PPTX
Business Acumen Training GuidePresentation.pptx
PDF
Mastering Query Optimization Techniques for Modern Data Engineers
PDF
Launch Your Data Science Career in Kochi – 2025
PPTX
Moving the Public Sector (Government) to a Digital Adoption
PPTX
LESSON-1-NATURE-OF-MATHEMATICS.pptx patterns
PPTX
Bharatiya Antariksh Hackathon 2025 Idea Submission PPT.pptx
PPTX
Azure Data management Engineer project.pptx
PPT
Performance Implementation Review powerpoint
PDF
Data Analyst Certificate Programs for Beginners | IABAC
PPTX
Lecture 1 Intro in Inferential Statistics.pptx
PDF
Research about a FoodFolio app for personalized dietary tracking and health o...
PDF
TIC ACTIVIDAD 1geeeeeeeeeeeeeeeeeeeeeeeeeeeeeer3.pdf
CL11_CH20_-LOCOMOTION-AND-MOVEMENT-Autosaved.pptx
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
Economic Sector Performance Recovery.pptx
artificial intelligence deeplearning-200712115616.pptx
Understanding Prototyping in Design and Development
batch data Retailer Data management Project.pptx
Extract Transformation Load (3) (1).pptx
Trading Procedures (1).pptxcffcdddxxddsss
Business Acumen Training GuidePresentation.pptx
Mastering Query Optimization Techniques for Modern Data Engineers
Launch Your Data Science Career in Kochi – 2025
Moving the Public Sector (Government) to a Digital Adoption
LESSON-1-NATURE-OF-MATHEMATICS.pptx patterns
Bharatiya Antariksh Hackathon 2025 Idea Submission PPT.pptx
Azure Data management Engineer project.pptx
Performance Implementation Review powerpoint
Data Analyst Certificate Programs for Beginners | IABAC
Lecture 1 Intro in Inferential Statistics.pptx
Research about a FoodFolio app for personalized dietary tracking and health o...
TIC ACTIVIDAD 1geeeeeeeeeeeeeeeeeeeeeeeeeeeeeer3.pdf

Before quiz 2

  • 1. OPTIMAL LINEAR FILTER: WIENER FILTER 5.1 Estimation of signal in presence of white Gaussian noise (WGN) Consider the signal model is [ ] [ ] [ ]X n S n V n  where  X n is the observed signal,  S n is 0-mean Gaussian with variance 1 and ][nV is a white Gaussian sequence mean 0 and variance 1. The problem is to find the best guess for  S n given the observation [ ], 1,2,...,X i i n Maximum likelihood estimation for  S n determines that value of  S n for which the sequence [ ], 1,2,...,X i i n is the most likely. Let us represent the random samples [ ], 1,2,...,X i i n by [ ] [ [ ] [ 1] .. [1]]'X X n X n X n and the particular values [1], [2],..., [ ]x x x n by [ ] [ [ ] [ 1]... [1]]x x n x n x  n . The likelihood function / [ ]( [ ]/ [ ])X n S nf x n s n[ ] will be Gaussian with mean ][nx   2 1 ( [ ] [ ]) 2 / [ ] 1 ( [ ]/ [ ]) 2 n i x i s n X n S n n f x n s n e     [ ] Maximum likelihood will be given by [1], [2],..., [ ]/ [ ] ˆ [ ] ( ( [1], [2],..., [ ]) / [ ]) 0 [ ] MLE X X X n S n s n f x x x n s n s n    1 1ˆ [ ] [ ] n MLE i S n x i n     Similarly, to find ˆ [ ]MAPS n and ˆ [ ]MMSES n we have to find a posteriori probability density function (pdf) S y [n] x [n] + X V
  • 2. 2 2 1 [ ] [ ]/ [ ] [ ]/ [ ] [ ] 1 ( [ ] [ ]) [ ] 2 2 [ ] ( [ ]) ( [ ]/ [ ]) ( [ ]/ [ ]) ( [ ]) 1 = ( [ ]) n i S n X n S n S n X n X n x i s n s n X n f s n f x n s n f s n x f x n e f x n       n Taking the logarithm on both sides, we get 2 2 [ ]/ [ ] [ ] 1 1 ( [ ] [ ]) log ( [ ]) [ ] log ( [ ]) 2 2 n e S n x n e X n i x i s n f s n s n f x n       [ ]/ [ ]log ( [ ])e S n X nf s n is maximum at ˆ [ ].MAPs n Therefore, taking partial derivative of [ ]/ [ ]log ( [ ])e S n X nf s n with respect to [ ]x n and equating it to 0, we get 1 ˆ [ ] 1 MAP [ ] ( [ ] [ ]) 0 [ ] ˆs [ ] 1 MAP n i s n n i s n x i s n x i n n          Similarly the minimum mean-square error estimator is given by 1 MMSE [ ] ˆs [ ] ( [ ]/ [ ]) 1 n i x i n E S n X n n       For MMSE we have to know the joint probability structure of the channel and the source and hence the a posteriori pdf.  Finding pdf is computationally very exhaustive and involves nonlinear operations on the samples.  Normally we may be having the estimated values first-order and second-order statistics of the data We look for a simpler estimator. The answer is Optimal filtering or Wiener filtering We have seen that we can estimate an unknown signal (desired signal) [ ]x n from an observed signal [ ]y n on the basis of the known joint distributions of [ ]Y n and [ ].X n We could have used the criteria like MMSE or MAP that we have applied for parameter es timations. But such estimations are generally non-linear, require the computation of a posteriori probabilities and involves computational complexities. The approach taken by Wiener is to specify a form for the estimator that depends on a number of parameters. The minimization of errors then results in determination of an
  • 3. optimal set of estimator parameters. A mathematically sample and computationally easier estimator is obtained by assuming a linear structure for the estimator.
  • 4. 5.2 Linear Minimum Mean Square Error Estimator The linear minimum mean square error (LLMSE) criterion is illustrated in the above figure. The estimation problem can be slated as follows: Given observations represented as random variables [ 1], [ 2]..., [ ],... [ ],X n M X n M X n X n N     determine a set of parameters [ ], [ 1]...., [ ], [1],..., [ 1].   h N h N h o h h M such that 1 ˆ[ ] [ ] [ ] M i N S n h i X n i     and the mean square error   2 ˆ[ ] [ ]E S n S n is a minimum with respect to each of [ ], [ 1]...., [ ], [1],..., [ 1].   h N h N h o h h M This minimization problem results in an elegant solution if we assume jointly wide-sense stationarity of the signals [ ] and [ ]S n X n . The estimator parameters can be obtained from the second order statistics of the processes { [ ]} and { [ ]}.S n X n The problem of deterring the estimator parameters by the LMMSE criterion is also called the Wiener filtering problem. Three subclasses of the problem are identified 1. The optimal smoothing problem 0N  2. The optimal filtering problem 0N  3. The optimal prediction problem 0N  [ 1]x n M  …….. [ ]x n … .. [ ]x n N y[nb] 1n M  n n N Noise Filter [ ]s n [ ]x n ˆ[ ]s n Syste m +
  • 5. In the smoothing problem, an estimate of the signal is made at a location inside the observation window. The filtering problem estimates the curent value of the signal on the basis of the present and past observations. The prediction problem addresses the issues of optimal prediction of the future value of the signal on the basis of present and past observations. 5.3 Wiener-Hopf Equations The mean-square error of estimation is given by 2 2 1 2 ˆ[ ] ( [ ] [ ]) ( [ ] [ ] [ ]) M i N Ee n E S n S n E S n h i X n i        We have to minimize ][2 nEe with respect to each ][ih to get the optimal estimation. Corresponding minimization is given by  2 [ ] 0, for ..0.. 1 [ ] E e n j N M h j       ( E being a linear operator, and [ ] E h j   can be interchanged) [ ] [ - ] 0, ...0,1,..., 1Ee n X n j j N M    (1) or [ ] 1 [ ] - [ ] [ ] [ - ] 0, ,...0,1,..., 1 a e n M i N E S n h i X n i X n j j N M              (2) 1 ( ) [ ] [ ], ,...0,1,..., 1 a M SX X i N R j h i R j i j N M        (3) This set of 1N M  equations in (3) are called Wiener Hopf equations or Normal equations.  The result in (1) is the orthogonality principle which implies that the estimation error is orthogonal to observed data.  ˆ[ ]S n is the projection of  S n onto the subspace spanned by observations [ ], [ 1],..., [ ],..., [ ].X n M X n M X n X n N     The estimation uses second order-statistics i.e. autocorrelation and cross- correlation functions.
  • 6.  If { [ ]}S n and { [ ]}X n are jointly Gaussian, then the MMSE and the LMMSE are equivalent. Otherwise we get a sub-optimum result.  Also observe that ˆ[ ] [ ] [ ]S n S n e n  where ˆ[ ]S n and [ ]e n are the parts of  S n respectively correlated and uncorrelated with [ ]X n Thus LMMSE separates out that part of  S n which is correlated with [ ]X n Hence the Wiener filter can be also interpreted as a correlation canceller. 5.4 FIR Wiener Filter 1 0 ˆ[ ] [ ] [ ] M i S n h i X n i     The model parameters are given by the orthogonality principlee [ ] 1 0 ˆ[ ] [ ] [ ] [ - ] 0, 0,1,... 1 e n M i E S n h i X n i S n j j M              1 0 [ ] [ ] ( ), 0,1,..., 1 M X SX i h i R j i R j j M       In matrix form, we have X SXR h r where [0] [ 1] .... [1 ] [1] [0] .... [2 ] ... [ 1] [ 2] .... [0] X X X X X X X X X X R R R M R R R M R M R N R              R and [0] [1] ... [ 1] SX SX SX SX R R R M             r and [0] [1] ... [ 1] h h h M             h Therefore, [1]h
  • 7. 1 X SX  h R r 5.5 Minimum Mean Square Error - FIR Wiener Filter 1 2 0 1 0 1 0 ( [ ] [ ] [ ] [ ] [ ] = [ ] X[ ] error isorthogonal to data = [ ] [ ] [ ] [ ] = [0] [ ] [ ] M i M i M S SX i E e n Ee n S n h i X n i Ee n n E S n h i X n i s n R h i R i                            Z-1 [0]h [1]h Z-1 Wiener Estimation [1]h ˆ[ ]s n X SXR r [ 1]h M  [ ]x n
  • 8. Example1: Noise Filtering Consider the case of a carrier signal in presence of white Gaussian noise 0 0[ ] [ ], 4 [ ] [ ] [ ] S n Acos w n w X n S n V n       here  is uniformly distributed in (1,2 ). [ ]V n is white Gaussian noise sequence of variance 1 and is independent of [ ].S n Find the parameters for the FIR Wiener filter with M=3.      2 0 2 0 [ ] cos 2 [ ] X[ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] 0 0 ( ) [ ] 2 [ ] S[ ] [ ] S[ [ ] [ ] [ ] S X S V S S A R m m R m E n X n m E S n V n S n m V n m R m R m A cos w m m R m E n X n m E nS x n m V n m R m                        Hence the Wiener Hopf equations are 2 2 2 2 2 2 2 2 2 [0] [1] [2] [0] [0] [1] [0] [1] [1] [1] [2] 1] [0] [2] [2] [0]1 cos cos 2 2 4 2 2 cos 1 cos 2 4 2 2 4 cos cos 1 2 2 2 4 2 X X X SS X X X SS X X X SS R R R h R R R R h R R R R h R A A A h A A A h A A A                                                      2 2 2 2 cos[1] 2 4 cos[2] 2 2 A A A h                                     suppose A = 5v then 12.5 13.5 0 12.52 [0] 12.5 12.5 12.5 13.5 [1] 2 2 2 [2] 12.5 0 0 13.5 2 h h h                                       
  • 9. 1 12.5 13.5 0 12.5 [0] 2 12.5 12.5 12.5 [1] 13.5 2 2 2 [2] 12.5 0 13.5 0 2 [0] 0.707 [1 0.34 [2] 0.226 h h h h h h                                  Plot the filter performance for the above values of [0], [1] and [2].h h h The following figure shows the performance of the 20-tap FIR wiener filter for noise filtering.
  • 10. Example 2 : Active Noise Control Suppose we have the observation signal ][ny is given by 0 1[ ] 0.5cos( ) [ ]X n w n V n   where  is uniformly distrubuted in   10,2 and V [ ] 0.6 [ 1] [ ]n V n V n    is an MA(1) noise. We want to control 1[ ]V n with the help of another correlated noise 2[ ]V n given by 2[ ] 0.8 [ 1] [ ]V n V n V n   The filtering scheme is as follows: The Wiener Hopf Equations are given by 2 1 2V VVR h r where [ [0] h[1]]h h and 2 1 2 1.64 0.8 and 0.8 1.64 1.48 0.6 V V V               R r [0] 0.9500 [1] -0.0976 h h              Example 3: (Continuous time prediction) Suppose we want to predict the continuous-time process ( ) at time ( ) by ˆ( ) ( ) S t t S t aS t      Then by orthogonality principle - 2-tap FIR filter 1[ ] [ ]s n v n  ˆs n 2[ ]v n
  • 11. ( ( ) ( )) ( ) 0 ( ) (0) SS SS E S t aS t S t R a R        As a particular case consider the first-order Markov process given by ( ) ( ) ( ) d S t AS t v t dt   In this case, ( ) (0) A S SR R e     ( ) (0) AS S R a e R      Observe that for such a process 1 1 1 1 1 ( ) ( )( ) ( ( ) ( )) ( ) ( ) ( ) (0) (0) 0 S S A AA S S E S t aS t S t R aR R e e R e                    Therefore, the linear prediction of such a process based on any past value is same as the linear prediction based on current value. 5.6 IIR Wiener Filter (Causal) Consider the IIR filter to estimate the signal [ ]x n shown in the figure below. The estimator  ˆS n is given by 0 ˆ( ) ( ) ( ) i S n h i X n i     The mean-square error of estimation is given by 2 2 2 0 ˆ[ ] ( [ ] [ ]) ( [ ] [ ] [ ]) i Ee n E S n S n E S n h i X n i        We have to minimize ][2 nEe with respect to each ][ih to get the optimal estimation. Applying the orthogonality principle, we get the WH equation. [ ]h n [ ]X n ˆ[ ]S n
  • 12. 0 0 ( [ ] ( ) ]) [ ] 0, 0, 1, ..... From which we get [ ] [ ] [ ], 0, 1, ..... i X SX i E S n h i yXn i X n j j h i R j i R j j               We have to find [ ], 0,1,...h i i   by solving the above infinite set of equations.  This problem is better solved in the z-transform domain, though we cannot directly apply the convolution theorem of z-transform. Here comes Wiener’s contribution. The analysis is based on the spectral Factorization theorem for a regular random process: 2 1 ( ) ( ) ( )X v c cS z H z H z   Whitening filter Wiener filter Now 2[ ]h n is the coefficient of the Wiener filter to estimate [ ]S n I from the innovation sequence [ ].V n Applying the orthogonality principle results in the Wiener Hopf equation 2 0 ˆ( ) ( ) ( ) i S n h i V n i     2 0 2 0 [ ] [ ] [ ] [ ]) 0 [ ] [ ] [ ], 0,1,... i V SV i E S n h i V n i V n j h i R j i R j j                    Note that 2 2 2 0 [ ] [ ] ( ) [ ] ( ), 0,1,... V V V SV i R m m h i j i R j j            So that 1 1 ( ) ( )c H z H z  [ ]X n [ ]V n [ ]V n 1( )H Z [ ]X n 2 ( )H z  ˆS n
  • 13.   2 2 2 2 [ ] [ ] j 0 ( ) ( ) SV V SV V R j h j S z H z       where  ( )SVS z  is the positive part (i.e., containing non-positive powers of z ) in power series expansion of ( ).SVS z To determine ( ).SVS z , consider 1 0 0 1 0 1 1 1 2 2 1 [ ] [ ] [ ] [ ] [ ] [ ] [ ] S[ ] X[ - - ] [ ] [j i] 1 ( ) ( ) ( ) ( ) ( ) ( )1 ( ) ( ) i SV i i SX i SV SX SX c SX V c V n h i X n i R j ES n V n j h i E n n j i h i R S z H z S z S z H z S z H z H z                               Therefore, 1 2 2 1 ( )1 ( ) ( ) ( ) ( ) ( ) SX V c c S z H z H z H z H z H z           We have to  find the power spectrum of data and the cross power spectrum of the of the desired signal and data from the available model or estimate them from the data  factorize the power spectrum of the data using the spectral factorization theorem
  • 14. 5.7 Mean Square Estimation Error – IIR Filter (Causal) 2 0 0 0 ( [ ] [ ] [ ] [ ] [ ] = [ ] S[ ] error isorthogonal to data = [ ] [ ] [ ] [ ] = [0] [ ] [ ] i i S SX i E e n Ee n S n h i X n i Ee n n E S n h i X n i S n R h i R i                          * * 1 1 1 1 ( ) ( ) ( ) 2 2 1 ( ( ) ( ) ( )) 2 1 ( ( ) ( ) ( )) 2 S SX S SX S SX C S d H S d S w H w S w dw S z H z S z z dz                               Example 4: 1[ ] S[ ] V[ ]X n n n  observation model with [ ] 0.8 S [ -1] [ ]S n n W n  where 1[ ]V n is and additive zero-mean Gaussian white noise with variance 1 and [ ]W n is zero-mean white noise with variance 0.68. Signal and noise are uncorrelated. Find the optimal Causal Wiener filter to estimate [ ].X n Solution:   1 0.68 ( ) 1 0.8 1 0.8 SSS z z z       [ ] X[ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] 0 0 ( ) ( ) 1 X S V X SS R m E n X n m E X n V n S n m V n m R m R m S z S z              1 1 1 0.8z  [ ]W n [ ]S n
  • 15. Factorize         1 1 1 1 1 2 0.68 ( ) 1 1 0.8 1 0.8 2(1 0.4 )(1 0.4 ) = 1 0.8 1 0.8 (1 0.4 ) ( ) 1 0.8 2 X c V S z z z z z z z z H z z and                    Also      1 [ ] S[ ] [ ] [ ] [ ] [ ] [ ] ( ) ( ) 0.68 = 1 0.8 1 0.8 SX S SX S R m E n X n m E S n S n m V n m R m S z S z z z               2 1 1 11 1 ( )1 ( ) ( ) ( ) 1 0.68(1 0.8 ) = 2 (1 0.8 )(1 0.8 )1 0.4 0.944 = 1 0.4 [ ] 0.944(0.4) 0 SX V c c n S z H z H z H z z z zz z h n n                           5.8 IIR Wiener filter (Noncausal) The estimator ][ˆ nx is given by ˆ[ ] [ ] [ ] i S n h i X n i     For LMMSE, the error is orthogonal to data. [ ] [ ] X[ ] [ ] 0 j i E S n h i n i X n j               [ ]X n y [n] ( )H z ˆ[ ]S n
  • 16. [ ] [ ] [ ], ,...0, 1, ...X SX i h i R j i R j j         This form Wiener Hopf Equation is simple to analyse.  Easily solved in frequency domain. So taking Z transform we get  Not realizable in real time ( ) ( ) ( )X SXH z S z S z so that ( ) ( ) ( ) or ( ) ( ) ( ) SX X SX X S z H z S z S H S      5.9 Mean Square Estimation Error – IIR Filter (Noncausal) The mean square error of estimation is given by 2 ( [ ] [ ] [ ] [ ] [ ] = [ ] S[ ] error isorthogonal to data = [ ] [ ] [ ] [ ] = [0] [ ] [ ] i i S SX i E e n Ee n S n h i X n i Ee n n E S n h i X n i S n R h i R i                          *1 1 ( ) ( ) ( ) 2 2 S SXS d H S d                * 1 1 1 ( ( ) ( ) ( )) 2 1 ( ( ) ( ) ( )) 2 S SX S SX C S H S d S z H z S z z dz                  Example 5: Noise filtering by noncausal IIR Wiener Filter Consider the case of a carrier signal in presence of white Gaussian noise [ ] [ ] [ ]X n S n V n  where [ ]V n is and additive zero-mean Gaussian white noise with variance 2 .V Signal and noise are uncorrelated
  • 17. ( ) ( ) ( ) and ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) = ( ) 1 ( ) Y S V SX S S S V SX V S V S S S S S S H S S S S S S                     Suppose SNR is very high ( ) 1 H (i.e. the signal will be passed un-attenuated). When SNR is low ( ) ( ) ( ) SS V S H S     (i.e. If noise is high the corresponding signal component will be attenuated in proportion of the estimated SNR.  SNR ( )H  Signal Noise  time
  • 18. Example 6: Image filtering by IIR Wiener filter ( ) power spectrum of the corrupted image ( ) power spectrum of the noise, estimated from the noise model or from the constant intensity ( like back-ground) of the image ( ) ( ) XX VV SS S w S w S w H w S    ( ) ( ) ( ) ( ) = ( ) SS VV XX VV XX w S w S w S w S w   Example 7: Consider the signal in presence of white noise given by [ ] 0.8 s [ -1] [ ]s n n w n  where [ ]v n is and additive zero-mean Gaussian white noise with variance 1 and [ ]w n is zero-mean white noise with variance 0.68. Signal and noise are uncorrelated. Find the optimal noncausal Wiener filter to estimate s[ ].n             -1 SX -1 XX -1 -1 -1 0.68 1-0.8z 1 0.8zS ( ) ( ) S ( ) 2 1-0.4z 1 0.4z 1-0.6z 1 0.6z 0.34 One pole outside the unit circle 1-0.4z 1 0.4z 0.4048 0.4048 1-0.4z 1-0.4z [ ] 0.4048(0.4) ( ) 0.4048(0.4) (n n z H z z h n u n u              1)n  [ ]h n n
  • 19. KALMAN FILTER- P.K. Bora 1. Introduction To estimate a signal [ ]s n in the presence of noise,  FIR Wiener Filter is optimum when the data length and the filter length are equal.  IIR Wiener Filter is based on the assumption that infinite length of data sequence is available. Neither of the above filters represents the physical situation. We need a filter that adds a tap with each addition of data. The basic mechanism in Kalman filter ( R.E. Kalman, 1960) is to estimate the signal recursively by the following relation   ˆ ˆ[ ] [ 1] [ ]n nS n A S n K X n The whole of Kalman filter is also based on the innovation representation of the signal. We used this model to develop causal IIR Wiener filter. 2. Signal Model The simplest Kalman filter uses the first-order AR signal model S[n] [n 1] [ ]aS W n   where [ ]W n is a white noise sequence. The observed data is given by X[n] [ ] [ ]S n V n  where [ ]V n is another white noise sequence independent of the signal. The general stationary signal is modeled by a difference equation representing the ARMA (p,q) model. Such a signal can be modeled by the state-space model and is given by S[n] [n 1] [ ]S n  A BW (1) The observations also can be represented as a linear combination of the states and the observation noise. X[n] [n] [ ]S V n c (2) Noise Linear filter + [ ]X n[ ]S n ˆ[ ]S n
  • 20. Equations (1) and (2) have direct relation with the state space model in the control system where you have to estimate the ‘unobservable’ states of the system through an observer that performs well against noise. Example 1: Consider the ( )AR p model 1 2S[n] S[n 1] S[n 2]+....+ S[n ]+ [ ]pa a a p W n     Then the state variable model for { [ ]}X n is given by S[n] [n 1] [ ]S w n  A B where 1 2 1 2 1 2 [ ] [ ] [ ] , S [ ] [ ], S [ ] [ 1].... and S [ ] [ 1], [ ] .. .. 1 0 .. .. 0 0 1 .. .. 0 0 0 .. .. 1 1 0 and .. 0 p p p S n S n n n S n n S n n S n p S n a a a                                             A b S Our analysis will include only the simple (scalar) Kalman filter 3. Innovation Representation The Kalman filter also uses the innovation representation of the stationary signal as the IIR Wiener filterdoes. The innovation representation is shown in the following diagram. [ ]X n ˆ [ ]X n Let ˆ[ | ]S n n denote the LMMSE of [ ]S n based on the data [0], [1],..., [ ].X X X n Thus, ˆ ˆ[ | ] ( [ ]| [0], [1],..., [ 1])S n n E S n X X X n  where ˆE is the linear minimum mean-square operator. Innovation Generation
  • 21. Let ˆ[ | 1]X n n  be the linear prediction of [ ]X n based on [0], [1],..., [ 1].X X X n Define ˆX[ ] [ ] [ | 1] ˆ[ ] ( [ ]| [0], [1],..., [ 1]) ˆ[ ] ( [ ] [ ]| [0], [1],..., [ 1]) ˆ[ ] ( [ 1] [ ] [ ]| [0], [1],..., [ 1]) ˆ[ ] ( [ 1]| [0 n X n X n n X n E X n X X X n X n E S n V n X X X n X n E aS n W n V n X X X n X n E aS n X                    ], [1],..., [ 1]) ˆX[ ] [ ] [ 1| 1] ˆ[ ] [ ] [ 1| 1] ˆ[ 1] [ ] [ ] [ 1| 1] [ 1] [ ] [ ] X X n n X n aS n n S n V n aS n n aS n W n V n aS n n ae n W n V n                       Similarly, ˆX[ ] [ ] [ | 1] [ 1] [ ] [ ] n j X n X n n j ae n j W n j V n j             It is easy to show that E X[ ] X[ ] 0 1,2,...,n n j j n   and 2 2 [n]n EX  , which varies with n . In the above representation [ ]X n is the innovation of [ ]X n and contains the same information as the original sequence. Exercise: Show that E X[ ] X[ ] 0 1,2,...,n n j j n   4. LMMSE estimation based on the Innovation Sequence The LMMSE estimation of [ ]S n based on [0], [1],..., [ ],X X X n is same as the estimation based on the innovation sequence [0], [1],..., [ 1], [ ]X X X n X n . Therefore,   0 ˆ[ | ] [ ] n i i S n n k X i where ski can be obtained by using the orthogonality relation. Note that ( ) 0 ( ) 0 ( ) 2 [ ] [ ] [ ] Then, ( [ ] [ ]) [ ] 0 0,1,..., so that [ ] [ ]/ 0,1,..., nn i i nn i i n j j e n S n k X i E S n k X i X j j n k ES n X j j n          
where σ_j² = E X̃²[j]. Similarly,

Ŝ[n−1|n−1] = Σ_{i=0}^{n−1} k_i^(n−1) X̃[i]

with

k_j^(n−1) = E S[n−1] X̃[j] / σ_j²,  j = 0, 1, ..., n−1

Since W[n] is uncorrelated with X̃[j] for j ≤ n−1,

k_j^(n) = E S[n] X̃[j] / σ_j² = E(a S[n−1] + W[n]) X̃[j] / σ_j² = a E S[n−1] X̃[j] / σ_j² = a k_j^(n−1),  j = 0, 1, ..., n−1

Hence

Ŝ[n|n] = Σ_{i=0}^{n} k_i^(n) X̃[i]
       = Σ_{i=0}^{n−1} k_i^(n) X̃[i] + k_n^(n) X̃[n]
       = a Σ_{i=0}^{n−1} k_i^(n−1) X̃[i] + k_n^(n) X̃[n]
       = a Ŝ[n−1|n−1] + k_n^(n) X̃[n]
       = a Ŝ[n−1|n−1] + k_n^(n) (X[n] − Ê(X[n] | X[0], X[1], ..., X[n−1]))
       = a Ŝ[n−1|n−1] + k_n^(n) (X[n] − a Ŝ[n−1|n−1])
       = (1 − k_n^(n)) a Ŝ[n−1|n−1] + k_n^(n) X[n]

where X̂[n|n−1] = a Ŝ[n−1|n−1] = Ê(X[n] | X[0], X[1], ..., X[n−1]) is the linear prediction of X[n] based on the observations X[0], X[1], ..., X[n−1].

Thus the recursive estimator Ŝ[n|n] is given by

Ŝ[n|n] = A_n Ŝ[n−1|n−1] + k_n^(n) X[n],  with A_n = (1 − k_n^(n)) a

or, equivalently,

Ŝ[n|n] = a Ŝ[n−1|n−1] + k_n^(n) (X[n] − a Ŝ[n−1|n−1])

The filter can be represented by the following diagram.

[Figure: block diagram of the recursive filter; the innovation X[n] − a Ŝ[n−1|n−1] is scaled by the gain k_n^(n), and a unit delay z⁻¹ with feedback gain a closes the loop to produce Ŝ[n].]
5. Estimation of the Filter Parameters

Consider the estimator

Ŝ[n|n] = A_n Ŝ[n−1|n−1] + k_n X[n]

The estimation error is given by

e[n] = S[n] − Ŝ[n|n]

By the orthogonality principle, e[n] must be orthogonal to the present and past observed data:

E e[n] X[n−m] = 0,  m ≥ 0

We want to find A_n and k_n using this condition. First consider the requirement that e[n] be orthogonal to the current observation:

E e[n] X[n] = 0
⇒ E e[n] (S[n] + V[n]) = 0
⇒ E e[n] S[n] + E e[n] V[n] = 0
⇒ E e[n] (Ŝ[n|n] + e[n]) + E e[n] V[n] = 0
⇒ E e²[n] + E e[n] V[n] = 0        (since e[n] is orthogonal to Ŝ[n|n])

Denoting E e²[n] by P[n] and noting that

E e[n] V[n] = E(S[n] − A_n Ŝ[n−1|n−1] − k_n X[n]) V[n] = −k_n E X[n] V[n] = −k_n σ_V²

we get

P[n] − k_n σ_V² = 0  ⇒  k_n^(n) = P[n] / σ_V²

We have to evaluate P[n] at every value of n.

Estimation of P[n]

We have
P[n] = E e²[n] = E e[n](S[n] − Ŝ[n|n]) = E S[n] e[n]        (since e[n] ⊥ Ŝ[n|n])
     = E S[n] (S[n] − (1 − k_n^(n)) a Ŝ[n−1|n−1] − k_n^(n) X[n])
     = σ_S² − (1 − k_n^(n)) a E S[n] Ŝ[n−1|n−1] − k_n^(n) E S[n] X[n]
     = σ_S² − (1 − k_n^(n)) a E S[n] Ŝ[n−1|n−1] − k_n^(n) σ_S²        (E S[n] X[n] = σ_S²)
     = (1 − k_n^(n)) (σ_S² − a E S[n] Ŝ[n−1|n−1])
     = (1 − k_n^(n)) (σ_S² − a E(a S[n−1] + W[n]) Ŝ[n−1|n−1])
     = (1 − k_n^(n)) (σ_S² − a² E S[n−1] Ŝ[n−1|n−1])
     = (1 − k_n^(n)) (σ_S² − a² E S[n−1](S[n−1] − e[n−1]))
     = (1 − k_n^(n)) (σ_S² − a² σ_S² + a² P[n−1])

Hence, using k_n^(n) = P[n]/σ_V² and substituting σ_W² = (1 − a²) σ_S² for the AR(1) process,

P[n] = σ_V² (a² P[n−1] + σ_W²) / (σ_V² + a² P[n−1] + σ_W²)

We still have to find P[0]. For this, assume Ŝ[−1|−1] = E S[−1] = 0. Then from the relation

P[n] = (1 − k_n^(n)) (σ_S² − a² E S[n−1] Ŝ[n−1|n−1])

we get

P[0] = (1 − k_0^(0)) σ_S²

Substituting k_0^(0) = P[0]/σ_V² in this expression, we get

P[0] = σ_S² σ_V² / (σ_S² + σ_V²)

6. Scalar Kalman Filter Algorithm

Given: the signal model parameters a and σ_W², and the observation noise variance σ_V².

Initialisation:  Ŝ[−1] = 0

Step 1: n = 0. Calculate
P[0] = σ_S² σ_V² / (σ_S² + σ_V²)        (with σ_S² = σ_W² / (1 − a²))

Step 2: Calculate
k_n^(n) = P[n] / σ_V²

Step 3: Input X[n] and estimate Ŝ[n]:
Predict:  Ŝ[n|n−1] = a Ŝ[n−1]
Correct:  Ŝ[n] = Ŝ[n|n−1] + k_n^(n) (X[n] − Ŝ[n|n−1])

Step 4: n ← n + 1. Calculate
P[n] = σ_V² (a² P[n−1] + σ_W²) / (σ_V² + a² P[n−1] + σ_W²)

Step 5: Go to Step 2.

 - We have to initialise P[0].
 - Irrespective of this initialisation, k_n^(n) and P[n] converge to steady-state values.
 - If a is allowed to be time-varying, the filter can be used to estimate a nonstationary signal.

Example: Given

S[n] = 0.6 S[n−1] + W[n],  σ_W² = 0.25
X[n] = S[n] + V[n],        σ_V² = 0.5,  n ≥ 0

find the expression for the Kalman filter equations at convergence and the corresponding mean-square error.

At convergence P[n] = P[n−1] = P, so the recursion

P[n] = σ_V² (a² P[n−1] + σ_W²) / (σ_V² + a² P[n−1] + σ_W²)

becomes

P = 0.5 (0.25 + 0.36 P) / (0.75 + 0.36 P)

i.e. 0.36 P² + 0.57 P − 0.125 = 0. Solving and taking the positive root,

P ≈ 0.195

so that

lim_{n→∞} k_n^(n) = P / σ_V² ≈ 0.195 / 0.5 ≈ 0.39

At convergence the filter is therefore Ŝ[n] = 0.6 Ŝ[n−1] + 0.39 (X[n] − 0.6 Ŝ[n−1]), with mean-square error P ≈ 0.195.
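The following sketch is one possible implementation of Steps 1-5 above; the simulated data, the random seed and the data length are assumptions made for the illustration. Run with the example parameters a = 0.6, σ_W² = 0.25, σ_V² = 0.5, the gain settles near the steady-state value found above.

```python
import numpy as np

def scalar_kalman(x, a, sigma_w2, sigma_v2):
    """Recursive LMMSE estimate of S[n] from X[n] = S[n] + V[n] (Steps 1-5)."""
    sigma_s2 = sigma_w2 / (1.0 - a**2)                  # AR(1) stationary variance
    P = sigma_s2 * sigma_v2 / (sigma_s2 + sigma_v2)     # Step 1: P[0]
    s_prev = 0.0                                        # S_hat[-1] = 0
    s_hat, gains = [], []
    for xn in x:
        k = P / sigma_v2                                # Step 2: gain k_n
        s_pred = a * s_prev                             # Step 3: predict
        s_new = s_pred + k * (xn - s_pred)              #         correct
        s_hat.append(s_new)
        gains.append(k)
        s_prev = s_new
        # Step 4: error-variance recursion
        P = sigma_v2 * (a**2 * P + sigma_w2) / (sigma_v2 + a**2 * P + sigma_w2)
    return np.array(s_hat), np.array(gains)

# Example parameters: a = 0.6, sigma_W^2 = 0.25, sigma_V^2 = 0.5
rng = np.random.default_rng(0)
a, sw2, sv2, N = 0.6, 0.25, 0.5, 2000
s = np.zeros(N)
for n in range(1, N):
    s[n] = a * s[n - 1] + rng.normal(scale=np.sqrt(sw2))
x = s + rng.normal(scale=np.sqrt(sv2), size=N)

s_hat, gains = scalar_kalman(x, a, sw2, sv2)
print("steady-state gain:", gains[-1])                  # settles near 0.39
print("empirical MSE    :", np.mean((s_hat - s)**2))    # close to the steady-state P
```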
7. Vector Kalman Filter

Consider the time-varying state-space model representing a nonstationary signal. The state equation is

S[n] = A[n] S[n−1] + W[n]

where S[n] = [S₁[n] S₂[n] ... S_p[n]]ᵀ is the state vector, A[n] is a p×p system matrix, and W[n] is the zero-mean Gaussian process-noise vector with p×p covariance matrix Q_W.

The observed data X[n] = [X₁[n] X₂[n] ... X_q[n]]ᵀ are related to the states by

X[n] = C[n] S[n] + V[n]

where C[n] is a q×p output matrix and V[n] is the zero-mean Gaussian observation-noise vector with q×q covariance matrix Q_V.

Denote

Ŝ[n|n]   = LMMSE estimate of S[n] given X[0], X[1], ..., X[n]
Ŝ[n|n−1] = LMMSE estimate of S[n] given X[0], X[1], ..., X[n−1]

The corresponding state-estimation errors are

e[n|n] = S[n] − Ŝ[n|n]  and  e[n|n−1] = S[n] − Ŝ[n|n−1]

Generalising Equation (4), we have

Ŝ[n|n] = Ŝ[n|n−1] + k_n (X[n] − X̂[n|n−1])
       = A[n] Ŝ[n−1|n−1] + k_n (X[n] − C[n] Ŝ[n|n−1])

where k_n is a p×q gain matrix. The mean-square estimation error of the scalar Kalman filter is now replaced by the error covariance matrix P. The a priori error covariance matrix is given by

P[n|n−1] = E e[n|n−1] eᵀ[n|n−1]

and the a posteriori error covariance matrix is

P[n|n] = E e[n|n] eᵀ[n|n]

With these definitions and notation, the vector Kalman filter algorithm is as follows:
8. Vector Kalman Filter Algorithm

State equation:        S[n] = A[n] S[n−1] + W[n]
Observation equation:  X[n] = C[n] S[n] + V[n]

Given:
(a) the state matrices A[n], n = 0, 1, 2, ..., and the process-noise covariance matrix Q_W,
(b) the observation matrices C[n], n = 0, 1, 2, ..., and the observation-noise covariance matrix Q_V,
(c) the observed data X[n], n = 0, 1, 2, ...

Initialisation:
Ŝ[−1|−1] = E S[−1] = 0
P[−1|−1] = E S[−1] Sᵀ[−1]

Estimation: for n = 0, 1, 2, ... do

Prediction
  Predict the state:                       Ŝ[n|n−1] = A[n] Ŝ[n−1|n−1]
  Estimate the a priori error covariance:  P[n|n−1] = A[n] P[n−1|n−1] Aᵀ[n] + Q_W
  Compute the Kalman gain:                 k_n = P[n|n−1] Cᵀ[n] (C[n] P[n|n−1] Cᵀ[n] + Q_V)⁻¹

Update
  Update the state (a posteriori):         Ŝ[n|n] = Ŝ[n|n−1] + k_n (X[n] − C[n] Ŝ[n|n−1])
  Update the error covariance:             P[n|n] = (I − k_n C[n]) P[n|n−1]
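Below is a minimal sketch of these recursions, assuming a time-invariant AR(2) signal written in the companion (state-variable) form of Example 1. The coefficients a1, a2, the noise covariances and the data length are illustrative choices, not values taken from the notes.

```python
import numpy as np

def vector_kalman(X, A, C, Qw, Qv, s0, P0):
    """X: (N, q) observations; returns the (N, p) a posteriori state estimates."""
    s_post, P_post = s0, P0
    p = len(s0)
    out = []
    for x in X:
        # Prediction
        s_prior = A @ s_post                                 # S_hat[n|n-1]
        P_prior = A @ P_post @ A.T + Qw                      # P[n|n-1]
        # Kalman gain
        K = P_prior @ C.T @ np.linalg.inv(C @ P_prior @ C.T + Qv)
        # Update
        s_post = s_prior + K @ (x - C @ s_prior)             # S_hat[n|n]
        P_post = (np.eye(p) - K @ C) @ P_prior               # P[n|n]
        out.append(s_post)
    return np.array(out)

# AR(2) signal S[n] = a1 S[n-1] + a2 S[n-2] + W[n] in companion form
a1, a2 = 0.5, -0.3
A = np.array([[a1, a2],
              [1.0, 0.0]])
C = np.array([[1.0, 0.0]])        # only the first state component is observed
Qw = np.diag([0.25, 0.0])         # process noise drives the first state only
Qv = np.array([[0.5]])

rng = np.random.default_rng(1)
N = 500
s = np.zeros((N, 2))
for n in range(1, N):
    s[n] = A @ s[n - 1] + np.array([rng.normal(scale=0.5), 0.0])
X = s @ C.T + rng.normal(scale=np.sqrt(0.5), size=(N, 1))

est = vector_kalman(X, A, C, Qw, Qv, s0=np.zeros(2), P0=np.eye(2))
print("MSE of the filtered S[n]:", np.mean((est[:, 0] - s[:, 0])**2))
```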
Least-Squares Estimator

In least-squares (LS) estimation, the observed data are assumed to be a known function of some unknown parameters plus an error that is modelled as random. Unlike the MVUE, the MLE or the Bayesian estimators, we do not assume an explicit probabilistic model for the data: the randomness is due only to the measurement error.

Consider the simplest example

X_i = θ + e_i

where θ is the unknown parameter and e_i is the observation (measurement) noise. According to the LS principle, we have to minimise the sum-square error

J(θ) = Σ_{i=1}^{n} (X_i − θ)²

with respect to θ. Thus

θ̂_LS = arg min_θ J(θ)

Setting the derivative to zero,

∂J(θ)/∂θ |_{θ = θ̂_LS} = 0
⇒ Σ_{i=1}^{n} (X_i − θ̂_LS) = 0
⇒ θ̂_LS = (1/n) Σ_{i=1}^{n} X_i

which is the same as the MLE.

General Linear Model

The general linear model for the observed samples X_i, i = 1, 2, ..., n, is given by

X = Hθ + e

where X = [X₁ X₂ ... X_n]ᵀ and θ = [θ₁ θ₂ ... θ_M]ᵀ. In general, both the data and the parameters can be multidimensional or one-dimensional. We consider the simple case of the signal model

X = Hθ + e

so that the sum-square error is given by
J(θ) = Σ_{i=1}^{n} e_i²
     = Σ_{i=1}^{n} (X_i − Σ_{j=1}^{M} h_{ij} θ_j)²
     = (X − Hθ)ᵀ (X − Hθ)
     = XᵀX − θᵀHᵀX − XᵀHθ + θᵀHᵀHθ

Thus θ̂_LS is given by

∂J(θ)/∂θ |_{θ = θ̂_LS} = 0
⇒ −2 HᵀX + 2 HᵀH θ̂_LS = 0
⇒ HᵀH θ̂_LS = HᵀX

The above equation is known as the normal equation. The optimal solution is given by

θ̂_LS = (HᵀH)⁻¹ HᵀX

The matrix (HᵀH)⁻¹Hᵀ is known as the pseudo-inverse of H. We can also write

X = Σ_{j=1}^{M} h_j θ_j + e

where h_j is the j-th column of H. Then

∂J(θ)/∂θ_j |_{θ = θ̂_LS} = 0
⇒ h_jᵀ (X − H θ̂_LS) = 0,  j = 1, 2, ..., M        (orthogonality principle)

The error is orthogonal to each column vector of the H matrix.
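The sketch below solves the normal equation numerically and checks the orthogonality principle. The design matrix H, the true parameter vector and the noise level are arbitrary assumptions chosen only for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 50, 3
H = rng.normal(size=(n, M))                     # assumed design matrix
theta_true = np.array([1.0, -2.0, 0.5])         # assumed true parameters
X = H @ theta_true + rng.normal(scale=0.1, size=n)

# Normal equation: H'H theta = H'X
theta_ls = np.linalg.solve(H.T @ H, H.T @ X)
residual = X - H @ theta_ls

print("theta_LS         :", theta_ls)
print("H'(X - H theta)  :", H.T @ residual)     # ~0: error orthogonal to each column of H
```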
Geometrical Interpretation

From the orthogonality principle, a geometrical interpretation of LS estimation can be given: the observed data vector X is orthogonally projected onto the space spanned by the columns h_j, j = 1, 2, ..., M, of H, and the optimal solution H θ̂_LS is the point found in that space. This is illustrated in the figure below.

[Figure: X orthogonally projected onto the plane spanned by h₁ and h₂; the projection is H θ̂_LS.]

Example: Consider fitting a straight line to the data,

x_i = θ₀ + θ₁ i + e_i,  i = 1, 2, ..., n

so that

X = [x₁ x₂ ... x_n]ᵀ,  θ = [θ₀ θ₁]ᵀ,  H = [ 1 1; 1 2; ... ; 1 n ]

The normal equation HᵀH θ̂_LS = HᵀX becomes

[ n           n(n+1)/2        ] [ θ̂₀,LS ]   [ Σ_{i=1}^{n} x_i   ]
[ n(n+1)/2    n(n+1)(2n+1)/6  ] [ θ̂₁,LS ] = [ Σ_{i=1}^{n} i x_i ]

using Σ_{i=1}^{n} i = n(n+1)/2 and Σ_{i=1}^{n} i² = n(n+1)(2n+1)/6.
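For this straight-line example, the following sketch compares the closed-form 2×2 normal-equation solution with NumPy's general least-squares routine; the true intercept and slope (2.0 and 0.3) and the noise level are assumed values for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
i = np.arange(1, n + 1, dtype=float)
x = 2.0 + 0.3 * i + rng.normal(scale=1.0, size=n)     # x_i = theta0 + theta1*i + e_i

# Closed-form normal equations: [[n, sum i], [sum i, sum i^2]] theta = [sum x_i, sum i*x_i]
A = np.array([[n,        i.sum()],
              [i.sum(),  (i**2).sum()]])
b = np.array([x.sum(), (i * x).sum()])
theta_closed = np.linalg.solve(A, b)

# The same answer from the general least-squares machinery
H = np.column_stack([np.ones(n), i])
theta_lstsq, *_ = np.linalg.lstsq(H, x, rcond=None)

print(theta_closed, theta_lstsq)                      # both close to (2.0, 0.3)
```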
Solving the above matrix equation, we obtain θ̂₀,LS and θ̂₁,LS.

Statistical Properties of θ̂_LS

θ̂_LS is unbiased. We have

θ̂_LS = (HᵀH)⁻¹ HᵀX
      = (HᵀH)⁻¹ Hᵀ(Hθ + e)
      = θ + (HᵀH)⁻¹ Hᵀ e

so that

E θ̂_LS = θ + (HᵀH)⁻¹ Hᵀ E e = θ

since E e = 0.
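A quick Monte Carlo check of this unbiasedness property is sketched below; the design matrix, the true parameter vector and the noise level are assumptions made only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 30, 5000
H = np.column_stack([np.ones(n), np.arange(1, n + 1, dtype=float)])
theta_true = np.array([1.0, 0.2])                 # assumed true parameters

estimates = np.empty((trials, 2))
for t in range(trials):
    e = rng.normal(scale=0.5, size=n)             # zero-mean measurement error
    X = H @ theta_true + e
    estimates[t] = np.linalg.solve(H.T @ H, H.T @ X)

# The average of the LS estimates is close to theta, consistent with E[theta_LS] = theta
print(estimates.mean(axis=0))
```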