Recursive Least Squares Estimation∗
Yan-Bin Jia
Dec 8, 2015
1 Estimation of a Constant
We start with estimation of a constant based on several noisy measurements. Suppose we have a
resistor but do not know its resistance. So we measure it several times using a cheap (and noisy)
multimeter. How do we come up with a good estimate of the resistance based on these noisy
measurements?
More formally, suppose x = (x1 , x2 , . . . , xn )T is a constant but unknown vector, and y =
(y1 , y2 , . . . , yl )T is an l-element noisy measurement vector. Our task is to find the “best” estimate
x̃ of x. Here we look at perhaps the simplest case where each yi is a linear combination of xj ,
1 ≤ j ≤ n, with addition of some measurement noise νi . Thus, we are working with the following
linear system,
y = Hx + ν,
where ν = (ν1 , ν2 , . . . , νl )T , and H is an l × n matrix; or with all terms listed,
\[
\begin{pmatrix} y_1 \\ \vdots \\ y_l \end{pmatrix}
=
\begin{pmatrix} H_{11} & \cdots & H_{1n} \\ \vdots & \ddots & \vdots \\ H_{l1} & \cdots & H_{ln} \end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
+
\begin{pmatrix} \nu_1 \\ \vdots \\ \nu_l \end{pmatrix}.
\]
Given an estimate x̃, we consider the difference between the noisy measurements and the pro-
jected values H x̃:
ǫ = y − H x̃.
Under the least squares principle, we will try to find the value of x̃ that minimizes the cost function
\[
\begin{aligned}
J(\tilde{x}) &= \epsilon^T \epsilon \\
&= (y - H\tilde{x})^T (y - H\tilde{x}) \\
&= y^T y - \tilde{x}^T H^T y - y^T H \tilde{x} + \tilde{x}^T H^T H \tilde{x}.
\end{aligned}
\]
The necessary condition for the minimum is the vanishing of the partial derivative of J with
respect to x̃, that is,
\[
\frac{\partial J}{\partial \tilde{x}} = -2 y^T H + 2 \tilde{x}^T H^T H = 0.
\]
∗The material is adapted from Sections 3.1–3.3 in Dan Simon’s book Optimal State Estimation [1].
We solve the equation, obtaining
\[
\tilde{x} = (H^T H)^{-1} H^T y. \qquad (1)
\]
The inverse (H^T H)^{-1} exists if l ≥ n and H has full column rank. In other words, the number
of measurements must be no fewer than the number of variables, and these measurements must be linearly
independent.
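As a quick illustration (a minimal sketch; the matrix H, the true x, and the noise level below are made-up example values), the estimate (1) can be computed with a few lines of Python/NumPy. In practice a least squares solver is preferable to forming the explicit inverse.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, l = 2, 20                                     # unknowns and measurements (example sizes)
x_true = np.array([10.0, 5.0])                   # hypothetical true parameter vector
H = rng.standard_normal((l, n))                  # full column rank with probability 1
y = H @ x_true + 0.1 * rng.standard_normal(l)    # noisy measurements y = Hx + nu

x_tilde = np.linalg.inv(H.T @ H) @ H.T @ y       # equation (1)
x_lstsq, *_ = np.linalg.lstsq(H, y, rcond=None)  # numerically preferable equivalent
print(x_tilde, x_lstsq)
\end{verbatim}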
Example 1. Suppose we are trying to estimate the resistance x of an unmarked resistor based on l noisy
measurements using a multimeter. In this case,
y = Hx + ν, (2)
where
\[
H = (1, \cdots, 1)^T. \qquad (3)
\]
Substitution of the above into equation (1) gives us the optimal estimate of x as
\[
\begin{aligned}
\tilde{x} &= (H^T H)^{-1} H^T y \\
&= \frac{1}{l} H^T y \\
&= \frac{y_1 + \cdots + y_l}{l}.
\end{aligned}
\]
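A two-line numerical check of this conclusion (the readings below are made up):

\begin{verbatim}
import numpy as np

y = np.array([99.8, 100.4, 100.1, 99.7, 100.0])   # hypothetical resistance readings
H = np.ones((y.size, 1))                          # H = (1, ..., 1)^T
x_tilde = np.linalg.inv(H.T @ H) @ H.T @ y        # equation (1)
assert np.isclose(x_tilde.item(), y.mean())       # equals the sample average
\end{verbatim}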
2 Weighted Least Squares Estimation

Assume that the noise for each measurement has zero mean and is independent, but that the
measurements may have different variances,
\[
E(\nu_i^2) = \sigma_i^2, \qquad 1 \le i \le l.
\]
The covariance matrix for all measurement noise is
\[
R = E(\nu \nu^T) =
\begin{pmatrix}
\sigma_1^2 & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \sigma_l^2
\end{pmatrix}.
\]
If a measurement yi is noisy, we care less about the discrepancy between it and the ith element
of H x̃ because we do not have much confidence in this measurement. The cost function J is therefore
generalized to weight each squared residual by the inverse of the corresponding noise variance:
\[
J(\tilde{x}) = \epsilon^T R^{-1} \epsilon = \frac{\epsilon_1^2}{\sigma_1^2} + \cdots + \frac{\epsilon_l^2}{\sigma_l^2}.
\]
Minimizing this weighted cost function in the same way as before yields the weighted least squares estimate
\[
\tilde{x} = (H^T R^{-1} H)^{-1} H^T R^{-1} y. \qquad (4)
\]
Note that the measurement noise matrix R must be non-singular for a solution to exist. In other
words, each measurement yi must be corrupted by some noise for the estimation method to work.
Example 2. We return to the problem in Example 1 of resistance estimation, for which the equations are
given in (2) and (3). Suppose each of the l noisy measurements has variance
\[
E(\nu_i^2) = \sigma_i^2,
\]
so the noise covariance matrix is R = diag(σ_1^2, …, σ_l^2). Substituting H and R into (4), the weighted
estimate becomes
\[
\tilde{x} = \left( \sum_{i=1}^{l} \frac{1}{\sigma_i^2} \right)^{-1} \sum_{i=1}^{l} \frac{y_i}{\sigma_i^2},
\]
a weighted average in which the more reliable measurements (those with smaller σ_i^2) receive larger weights.
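A minimal sketch of this weighted estimate in Python/NumPy (the readings and variances are made up for illustration):

\begin{verbatim}
import numpy as np

y = np.array([99.8, 100.4, 100.1, 99.7, 100.0])   # hypothetical readings
sigma2 = np.array([0.1, 0.5, 0.2, 0.1, 1.0])      # assumed noise variances
H = np.ones((y.size, 1))
R_inv = np.diag(1.0 / sigma2)

# Weighted least squares estimate (4): low-variance measurements count more.
x_tilde = np.linalg.solve(H.T @ R_inv @ H, H.T @ R_inv @ y)

w = 1.0 / sigma2                                  # for H = (1,...,1)^T this is the
assert np.isclose(x_tilde.item(), np.sum(w * y) / np.sum(w))  # weighted average above
\end{verbatim}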
3 Recursive Least Squares Estimation
A linear recursive estimator can be written in the following form. At time k, the measurement is
\[
y_k = H_k x + \nu_k,
\]
and the estimate is updated as
\[
\tilde{x}_k = \tilde{x}_{k-1} + K_k (y_k - H_k \tilde{x}_{k-1}), \qquad (5)
\]
where K_k is a gain matrix to be determined.
The estimation error after the kth update is
\[
\begin{aligned}
\epsilon_k &= x - \tilde{x}_k \\
&= x - \tilde{x}_{k-1} - K_k (y_k - H_k \tilde{x}_{k-1}) \\
&= \epsilon_{k-1} - K_k (H_k x + \nu_k - H_k \tilde{x}_{k-1}) \\
&= \epsilon_{k-1} - K_k H_k (x - \tilde{x}_{k-1}) - K_k \nu_k \\
&= (I - K_k H_k)\epsilon_{k-1} - K_k \nu_k. \qquad (6)
\end{aligned}
\]
If E(ν_k) = 0 and E(ǫ_{k-1}) = 0, then E(ǫ_k) = 0. So if the measurement noise ν_k has zero mean for
all k, and the initial estimate of x is set equal to its expected value (x̃_0 = E(x)), then E(x̃_k) = E(x) for
all k. With this property, the estimator (5) is called unbiased. The property holds regardless of the value of
the gain vector K_k. It says that on average the estimate x̃_k will be equal to the true value x.
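The unbiasedness property is easy to check by simulation. The sketch below (scalar case with H_k = 1; the prior, gains, and noise level are arbitrary made-up values) averages the recursive estimate (5) over many trials and compares it with E(x).

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
trials = 20000
mean_x, var_x = 3.0, 4.0                 # assumed prior mean and variance of x
gains = [0.9, 0.5, 0.3, 0.2, 0.1]        # arbitrary gains; unbiasedness holds for any K_k

x = mean_x + np.sqrt(var_x) * rng.standard_normal(trials)  # one random constant per trial
x_hat = np.full(trials, mean_x)                            # initial estimate = E(x)
for K in gains:
    y = x + rng.standard_normal(trials)                    # zero-mean measurement noise
    x_hat = x_hat + K * (y - x_hat)                        # update (5) with H_k = 1

print(x_hat.mean(), mean_x)              # agree up to Monte Carlo error
\end{verbatim}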
The key is to determine the optimal value of the gain vector K_k. The optimality criterion we use
is to minimize the aggregate variance of the estimation errors at time k:
\[
\begin{aligned}
J_k &= E(\| x - \tilde{x}_k \|^2) \\
&= E(\epsilon_k^T \epsilon_k) \\
&= E\bigl( \mathrm{Tr}(\epsilon_k \epsilon_k^T) \bigr) \\
&= \mathrm{Tr}(P_k), \qquad (7)
\end{aligned}
\]
where Tr is the trace operator and the n × n matrix P_k = E(ǫ_k ǫ_k^T) is the estimation-error
covariance. Next, we obtain P_k with a substitution of (6):
\[
P_k = E\Bigl[ \bigl( (I - K_k H_k)\epsilon_{k-1} - K_k \nu_k \bigr)
              \bigl( (I - K_k H_k)\epsilon_{k-1} - K_k \nu_k \bigr)^T \Bigr].
\]
The estimation error ǫ_{k-1} at time k − 1 is independent of the measurement noise ν_k at time k,
which implies that
\[
E(\nu_k \epsilon_{k-1}^T) = E(\nu_k)\, E(\epsilon_{k-1}^T) = 0,
\]
so the cross terms in the expectation above vanish.
Given the definition of the m × m matrix R_k = E(ν_k ν_k^T) as the covariance of ν_k, the expression for P_k
becomes
\[
P_k = (I - K_k H_k) P_{k-1} (I - K_k H_k)^T + K_k R_k K_k^T. \qquad (8)
\]
Equation (8) is the recurrence for the covariance of the least squares estimation error. It is
consistent with the intuition that as the measurement noise (Rk ) increases, the uncertainty (Pk )
increases. Note that Pk as a covariance matrix is positive definite.
What remains is to find the value of the gain vector K_k that minimizes the cost function given
by (7). The mean of the estimation error is already zero independent of the value of K_k. Thus
the minimizing value of K_k will make the estimation error consistently close to zero, that is, of
small variance. We need to differentiate J_k with respect to K_k.
The derivative of a function f with respect to a matrix A = (a_{ij}) is the matrix
\[
\frac{\partial f}{\partial A} = \left( \frac{\partial f}{\partial a_{ij}} \right).
\]
Theorem 1 Let X be an r × s matrix, and let C be a matrix that does not depend on X, of
dimension r × s in (9) and s × s in (10). Then the following hold:
\[
\frac{\partial\, \mathrm{Tr}(CX^T)}{\partial X} = C, \qquad (9)
\]
\[
\frac{\partial\, \mathrm{Tr}(XCX^T)}{\partial X} = XC + XC^T. \qquad (10)
\]
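These identities can be checked numerically with central differences (random C and X of compatible sizes, chosen arbitrarily):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
r, s, h = 3, 4, 1e-6
X = rng.standard_normal((r, s))
C9 = rng.standard_normal((r, s))      # C for identity (9): same shape as X
C10 = rng.standard_normal((s, s))     # C for identity (10): square, s x s

def num_grad(f, X):
    """Entrywise central-difference approximation of the matrix derivative df/dX."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E) - f(X - E)) / (2 * h)
    return G

assert np.allclose(num_grad(lambda Z: np.trace(C9 @ Z.T), X), C9, atol=1e-5)
assert np.allclose(num_grad(lambda Z: np.trace(Z @ C10 @ Z.T), X),
                   X @ C10 + X @ C10.T, atol=1e-5)
\end{verbatim}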
A proof of the theorem is given in Appendix A. In the case that C is symmetric, ∂Tr(XCX^T)/∂X =
2XC. With these facts in mind, we first substitute (8) into (7) and differentiate the resulting
expression with respect to Kk :
\[
\begin{aligned}
\frac{\partial J_k}{\partial K_k}
&= \frac{\partial}{\partial K_k} \mathrm{Tr}\bigl( P_{k-1} - K_k H_k P_{k-1} - P_{k-1} H_k^T K_k^T
   + K_k (H_k P_{k-1} H_k^T) K_k^T \bigr) + \frac{\partial}{\partial K_k} \mathrm{Tr}(K_k R_k K_k^T) \\
&= -2 \frac{\partial}{\partial K_k} \mathrm{Tr}(P_{k-1} H_k^T K_k^T)
   + 2 K_k (H_k P_{k-1} H_k^T) + 2 K_k R_k \qquad \text{(by (10))} \\
&= -2 P_{k-1} H_k^T + 2 K_k H_k P_{k-1} H_k^T + 2 K_k R_k \qquad \text{(by (9))} \\
&= -2 P_{k-1} H_k^T + 2 K_k (H_k P_{k-1} H_k^T + R_k).
\end{aligned}
\]
In the second equation above, we also used that Pk−1 is independent of Kk and that Kk Hk Pk−1 and
Pk−1 HkT KkT are transposes of each other (since Pk−1 is symmetric) so they have the same trace.
Setting the partial derivative to zero, we solve for K_k:
\[
\begin{aligned}
K_k &= P_{k-1} H_k^T (H_k P_{k-1} H_k^T + R_k)^{-1} \qquad (11) \\
    &= P_{k-1} H_k^T S_k^{-1}, \qquad (12)
\end{aligned}
\]
where S_k = H_k P_{k-1} H_k^T + R_k. Substituting this expression for K_k into equation (8) for P_k and
expanding leads to a few steps of manipulation as follows:
\[
\begin{aligned}
P_k &= (I - P_{k-1} H_k^T S_k^{-1} H_k) P_{k-1} (I - P_{k-1} H_k^T S_k^{-1} H_k)^T
       + P_{k-1} H_k^T S_k^{-1} R_k S_k^{-1} H_k P_{k-1} \\
&= P_{k-1} - P_{k-1} H_k^T S_k^{-1} H_k P_{k-1} - P_{k-1} H_k^T S_k^{-1} H_k P_{k-1} \\
&\qquad + P_{k-1} H_k^T S_k^{-1} H_k P_{k-1} H_k^T S_k^{-1} H_k P_{k-1}
        + P_{k-1} H_k^T S_k^{-1} R_k S_k^{-1} H_k P_{k-1} \\
&= P_{k-1} - P_{k-1} H_k^T S_k^{-1} H_k P_{k-1} - P_{k-1} H_k^T S_k^{-1} H_k P_{k-1}
        + P_{k-1} H_k^T S_k^{-1} S_k S_k^{-1} H_k P_{k-1} \\
&\qquad \text{(merging $H_k P_{k-1} H_k^T + R_k$ in the last two terms into $S_k$)} \\
&= P_{k-1} - 2 P_{k-1} H_k^T S_k^{-1} H_k P_{k-1} + P_{k-1} H_k^T S_k^{-1} H_k P_{k-1} \\
&= P_{k-1} - P_{k-1} H_k^T S_k^{-1} H_k P_{k-1} \qquad (13) \\
&= P_{k-1} - K_k H_k P_{k-1} \qquad \text{(by (12))} \\
&= (I - K_k H_k) P_{k-1}. \qquad (14)
\end{aligned}
\]
Note that in the above Pk is symmetric as a covariance matrix, and so is Sk .
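As a sanity check, for randomly generated P_{k−1}, H_k, and R_k (made-up data), the general form (8) evaluated at the optimal gain agrees with the short form (14):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
n, m = 3, 2
A = rng.standard_normal((n, n)); P_prev = A @ A.T + n * np.eye(n)  # SPD covariance
B = rng.standard_normal((m, m)); R = B @ B.T + m * np.eye(m)       # SPD noise covariance
H = rng.standard_normal((m, n))

S = H @ P_prev @ H.T + R
K = P_prev @ H.T @ np.linalg.inv(S)                              # optimal gain, (11)-(12)

I = np.eye(n)
P_joseph = (I - K @ H) @ P_prev @ (I - K @ H).T + K @ R @ K.T    # equation (8)
P_short = (I - K @ H) @ P_prev                                   # equation (14)
assert np.allclose(P_joseph, P_short)
\end{verbatim}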
We take the inverse of both sides of equation (13), plug in the expression for S_k, and expand and
merge terms to obtain
\[
P_k^{-1} = P_{k-1}^{-1} + H_k^T R_k^{-1} H_k, \qquad (15)
\]
from which we obtain an alternative expression for the covariance matrix:
\[
P_k = \bigl( P_{k-1}^{-1} + H_k^T R_k^{-1} H_k \bigr)^{-1}. \qquad (16)
\]
This expression is more complicated than (14) since it requires three matrix inversions. Neverthe-
less, it has computational advantages in certain situations in practice [1, pp.156–158].
We can also derive an alternative form for the gain K_k as follows. Start by multiplying the right-hand
side of (11) on the left with P_k P_k^{-1}. Then, substitute (15) for P_k^{-1} in the resulting expression.
Multiply the P_{k-1} H_k^T factor into the parenthesized factor on its left, and extract H_k^T R_k^{-1} out of
the parentheses. The last two parenthesized factors then cancel each other, yielding
\[
K_k = P_k H_k^T R_k^{-1}. \qquad (17)
\]
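Likewise, the two gain expressions (11) and (17), and the covariance forms (14) and (16), can be confirmed to coincide on random data (made-up values again):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
n, m = 3, 2
A = rng.standard_normal((n, n)); P_prev = A @ A.T + n * np.eye(n)
B = rng.standard_normal((m, m)); R = B @ B.T + m * np.eye(m)
H = rng.standard_normal((m, n))

K_a = P_prev @ H.T @ np.linalg.inv(H @ P_prev @ H.T + R)                 # equation (11)
P_k = np.linalg.inv(np.linalg.inv(P_prev) + H.T @ np.linalg.inv(R) @ H)  # equation (16)
K_b = P_k @ H.T @ np.linalg.inv(R)                                       # equation (17)

assert np.allclose(K_a, K_b)
assert np.allclose(P_k, (np.eye(n) - K_a @ H) @ P_prev)                  # matches (14)
\end{verbatim}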
The recursive least squares estimator is summarized as follows.

1. Initialize the estimator with the prior knowledge of x:
\[
\tilde{x}_0 = E(x), \qquad P_0 = E\bigl( (x - \tilde{x}_0)(x - \tilde{x}_0)^T \bigr).
\]
In the case of no prior knowledge about x, simply let P_0 = ∞I. In the case of perfect prior
knowledge, let P_0 = 0.

2. Iterate the following two steps.
(a) Obtain a new measurement y k , assuming that it is given by the equation
y k = Hk x + ν k ,
where the noise ν_k has zero mean and covariance R_k. The measurement noise at different
time steps is independent, so
\[
E(\nu_i \nu_j^T) =
\begin{cases}
0, & \text{if } i \ne j, \\
R_i, & \text{if } i = j.
\end{cases}
\]
Essentially, we assume white measurement noise.
(b) Update the estimate x̃ and the covariance of the estimation error sequentially according
to (11), (5), and (14), which are re-listed below (with (17) and (16) as alternative forms for
the gain and the covariance):
\[
\begin{aligned}
K_k &= P_{k-1} H_k^T (H_k P_{k-1} H_k^T + R_k)^{-1} = P_k H_k^T R_k^{-1}, \qquad (18) \\
\tilde{x}_k &= \tilde{x}_{k-1} + K_k (y_k - H_k \tilde{x}_{k-1}), \qquad (19) \\
P_k &= (I - K_k H_k) P_{k-1} = \bigl( P_{k-1}^{-1} + H_k^T R_k^{-1} H_k \bigr)^{-1}. \qquad (20)
\end{aligned}
\]
Note that (20) and (19) can switch their order in one round of update.
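The whole procedure fits in a few lines of Python/NumPy. The sketch below is one possible implementation (the function name and the demo values are made up):

\begin{verbatim}
import numpy as np

def rls_update(x_hat, P, y, H, R):
    """One round of the update step (b), equations (18)-(20)."""
    S = H @ P @ H.T + R                       # innovation covariance H_k P_{k-1} H_k^T + R_k
    K = P @ H.T @ np.linalg.inv(S)            # gain, (18)
    x_hat = x_hat + K @ (y - H @ x_hat)       # estimate update, (19)
    P = (np.eye(P.shape[0]) - K @ H) @ P      # covariance update, (20)
    return x_hat, P

# Hypothetical demo: repeated noisy readings of a constant resistance.
rng = np.random.default_rng(5)
x_true = np.array([100.0])
x_hat, P = np.array([90.0]), np.array([[25.0]])   # made-up prior estimate and covariance
H, R = np.array([[1.0]]), np.array([[4.0]])       # scalar measurement model
for _ in range(30):
    y = H @ x_true + np.sqrt(R[0, 0]) * rng.standard_normal(1)
    x_hat, P = rls_update(x_hat, P, y, H, R)
print(x_hat, P)     # estimate near 100, covariance shrunk well below the prior
\end{verbatim}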
Example 3. We revisit the resistance estimation problem presented in Examples 1 and 2. Now, we want
to iteratively improve our estimate of the resistance x. At the kth sampling, our measurement is
\[
y_k = H_k x + \nu_k = x + \nu_k,
\]
with noise variance R_k = E(\nu_k^2). Here, the measurement matrix H_k reduces to the scalar 1. Furthermore,
we suppose that each measurement has the same variance, so R_k is a constant, written R.
Before the first measurement, we have some idea about the resistance x. This becomes our initial
estimate. Also, we have some uncertainty about this initial estimate, which becomes our initial covariance.
Together we have
x̃0 = E(x),
P0 = E((x − x̃0 )2 ).
If we have no idea about the resistance, set P0 = ∞. If we are certain about the resistance value, set P0 = 0.
(Of course, then there would be no need to take measurements.)
After the first measurement (k = 1), we update the estimate and the error covariance according to equa-
tions (18)–(20) as follows:
\[
\begin{aligned}
K_1 &= \frac{P_0}{P_0 + R}, \\
\tilde{x}_1 &= \tilde{x}_0 + \frac{P_0}{P_0 + R}(y_1 - \tilde{x}_0), \\
P_1 &= \left( 1 - \frac{P_0}{P_0 + R} \right) P_0 = \frac{P_0 R}{P_0 + R}.
\end{aligned}
\]
After the second measurement, the estimates become
\[
\begin{aligned}
K_2 &= \frac{P_1}{P_1 + R} = \frac{P_0}{2P_0 + R}, \\
\tilde{x}_2 &= \tilde{x}_1 + \frac{P_1}{P_1 + R}(y_2 - \tilde{x}_1) \\
&= \frac{P_0 + R}{2P_0 + R}\,\tilde{x}_1 + \frac{P_0}{2P_0 + R}\,y_2, \\
P_2 &= \frac{P_1 R}{P_1 + R} = \frac{P_0 R}{2P_0 + R}.
\end{aligned}
\]
By induction, we can show that
\[
\begin{aligned}
K_k &= \frac{P_0}{kP_0 + R}, \\
\tilde{x}_k &= \frac{(k-1)P_0 + R}{kP_0 + R}\,\tilde{x}_{k-1} + \frac{P_0}{kP_0 + R}\,y_k, \\
P_k &= \frac{P_0 R}{kP_0 + R}.
\end{aligned}
\]
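These closed forms are easily verified against the scalar recursion (arbitrary values of P_0 and R):

\begin{verbatim}
import numpy as np

P0, R = 2.0, 0.5          # arbitrary prior variance and measurement noise variance
P = P0
for k in range(1, 11):
    K = P / (P + R)       # scalar gain K_k
    P = (1 - K) * P       # covariance update P_k = (1 - K_k) P_{k-1}
    assert np.isclose(K, P0 / (k * P0 + R))
    assert np.isclose(P, P0 * R / (k * P0 + R))
\end{verbatim}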
Note that if x is known perfectly a priori, then P0 = 0, which implies that Kk = 0 and x̃k = x̃0 , for all
k. The optimal estimate of x is independent of any measurements that are obtained. At the opposite end of
the spectrum, if x is completely unknown a priori, then P0 = ∞. The above equation for x̃k becomes,
\[
\begin{aligned}
\tilde{x}_k &= \lim_{P_0 \to \infty} \left[ \frac{(k-1)P_0 + R}{kP_0 + R}\,\tilde{x}_{k-1}
             + \frac{P_0}{kP_0 + R}\,y_k \right] \\
&= \frac{k-1}{k}\,\tilde{x}_{k-1} + \frac{1}{k}\,y_k \\
&= \frac{1}{k}\bigl( (k-1)\tilde{x}_{k-1} + y_k \bigr).
\end{aligned}
\]
The right-hand side of the last equation above is just the running average \bar{y}_k = \frac{1}{k}\sum_{j=1}^{k} y_j of the measure-
ments. To see this, we first have
\[
\begin{aligned}
\sum_{j=1}^{k} y_j &= \sum_{j=1}^{k-1} y_j + y_k \\
&= (k-1) \cdot \frac{1}{k-1}\sum_{j=1}^{k-1} y_j + y_k \\
&= (k-1)\bar{y}_{k-1} + y_k.
\end{aligned}
\]
Dividing both sides by k gives \bar{y}_k = \frac{1}{k}\bigl( (k-1)\bar{y}_{k-1} + y_k \bigr).
Since x̃1 = ȳ1 , the recurrences for x̃k and ȳk are the same. Hence x̃k = ȳk for all k.
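Numerically, choosing P_0 very large (here 10^9 as a stand-in for ∞; all other values are made up) indeed makes the recursive estimate track the running average of the measurements:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(6)
x_true, R = 10.0, 0.01
P, x_hat = 1e9, 0.0        # huge P_0 approximates "no prior knowledge"
ys = []
for k in range(1, 101):
    y = x_true + np.sqrt(R) * rng.standard_normal()
    ys.append(y)
    K = P / (P + R)
    x_hat = x_hat + K * (y - x_hat)
    P = (1 - K) * P
    assert np.isclose(x_hat, np.mean(ys), atol=1e-4)
\end{verbatim}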
Example 4. As a second example, suppose we estimate a two-dimensional constant vector x = (x_1, x_2)^T from
the scalar measurements
\[
y_k = x_1 + 0.99^{k-1} x_2 + \nu_k,
\]
where H_k = (1, 0.99^{k-1}) and ν_k is a random variable with zero mean and variance R = 0.01.
Let the true values be x = (x_1, x_2)^T = (10, 5)^T. Suppose the initial estimate is x̃_0 = (8, 7)^T, with
P_0 equal to the identity matrix. We apply the recursive least squares algorithm. The next figure² shows the
evolution of the two components of the estimate x̃, along with that of the variances of the estimation errors. It can be
seen that after a couple dozen measurements, the estimates are getting very close to the true values 10 and 5.
The variances of the estimation errors asymptotically approach zero. This means that we have increasingly
more confidence in the estimates with more measurements obtained.
²Figure 3.1, p. 92 of [1].
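The behavior described above can be reproduced with a short script (a sketch; the random seed and number of iterations are arbitrary):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(7)
x_true = np.array([10.0, 5.0])
x_hat = np.array([8.0, 7.0])       # initial estimates of x1 and x2
P = np.eye(2)                      # P_0 = identity
R = 0.01

for k in range(1, 101):
    H = np.array([[1.0, 0.99 ** (k - 1)]])
    y = H @ x_true + np.sqrt(R) * rng.standard_normal(1)
    S = (H @ P @ H.T + R).item()                 # scalar innovation covariance
    K = (P @ H.T) / S                            # 2x1 gain
    x_hat = x_hat + (K * (y - H @ x_hat)).ravel()
    P = (np.eye(2) - K @ H) @ P

print(x_hat)       # close to the true values (10, 5)
print(np.diag(P))  # error variances decrease as measurements accumulate
\end{verbatim}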
A Proof of Theorem 1
Proof
Denote C = (c_{ij}), X = (x_{ij}), and CX^T = (d_{ij}). The trace of CX^T is
\[
\begin{aligned}
\mathrm{Tr}(CX^T) &= \sum_{t=1}^{r} d_{tt} \\
&= \sum_{t=1}^{r} \sum_{k=1}^{s} c_{tk} x_{tk}.
\end{aligned}
\]
From the above, we easily obtain its partial derivatives with respect to the entries of X:
\[
\frac{\partial}{\partial x_{ij}} \mathrm{Tr}(CX^T) = c_{ij}.
\]
This proves (9). To show (10), we regard the two occurrences of X in Tr(XCX^T) as independent
and apply the product rule:
\[
\begin{aligned}
\frac{\partial}{\partial X} \mathrm{Tr}(XCX^T)
&= \left. \frac{\partial}{\partial X} \mathrm{Tr}(XCY^T) \right|_{Y=X}
 + \left. \frac{\partial}{\partial X} \mathrm{Tr}(YCX^T) \right|_{Y=X} \\
&= \left. \frac{\partial}{\partial X} \mathrm{Tr}(YC^TX^T) \right|_{Y=X}
 + \left. YC \right|_{Y=X} \qquad \text{(by (9))} \\
&= \left. YC^T \right|_{Y=X} + XC \\
&= XC^T + XC,
\end{aligned}
\]
which proves (10).
References
[1] D. Simon. Optimal State Estimation. John Wiley & Sons, Inc., Hoboken, New Jersey, 2006.