Gaussian Filters for Nonlinear Filtering Problems
Abstract—In this paper we develop and analyze real-time and accurate filters for nonlinear filtering problems based on Gaussian distributions. We present the systematic formulation of Gaussian filters and develop efficient and accurate numerical integration of the optimal filter. We also discuss the mixed Gaussian filters, in which the conditional probability density is approximated by a sum of Gaussian distributions. A new update rule of weights for Gaussian sum filters is proposed. Our numerical tests demonstrate that the new filters significantly improve on the extended Kalman filter with no additional cost, and that the new Gaussian sum filter has nearly optimal performance.
Manuscript received November 7, 1997; revised November 30, 1998 and June 16, 1999. Recommended by Associate Editor, T. E. Duncan. This work was supported in part by the Office of Naval Research under Grant N00014-96-1-0265 and MURI-AFOSR under Grant F49620-95-1-0447.
The authors are with the Center for Research in Scientific Computation, North Carolina State University, Raleigh, NC 27695-8205 USA.
Publisher Item Identifier S 0018-9286(00)04155-6.
0018–9286/00$10.00 © 2000 IEEE
I. INTRODUCTION
THE NONLINEAR filtering problem consists of estimating the state of a nonlinear stochastic system from noisy observation data. The problem has been the subject of considerable research interest during the past several decades because it has many significant applications in science and engineering such as navigational and guidance systems, radar tracking, sonar ranging, and satellite and airplane orbit determination [11], [14], [15]. As is well known, the most widely used filter for nonlinear filtering problems is the extended Kalman filter. It is derived from the Kalman filter based on the successive linearization of the signal process and the observation map. The extended Kalman filter has been successfully applied to numerous nonlinear filtering problems. If nonlinearities are significant, however, its performance can be substantially improved. Such efforts have also been reported in [1], [5], [7], [12], [13], and [20].
In this paper our objective is to develop and analyze real-time and accurate filters for nonlinear filtering algorithms based on Gaussian distributions. We present the systematic formulation of Gaussian filters and mixed Gaussian filters.
We first develop the Gaussian filter. The proposed filter is based on two steps: 1) we assume the conditional probability density to be a Gaussian distribution (i.e., assumed density) and 2) we obtain the Gaussian filter by equating the Bayesian formula with respect to the first moment (mean) and the second moment (covariance). Our approach is based on the efficient numerical integration of the Bayesian formula for optimal recursive filtering. The direct evaluation of the Jacobian matrix associated with the extended Kalman filter is avoided, which is similar to the approach recently reported in [12]. Secondly, we use Gaussian sum filters for the development of nearly optimal filters. The Gaussian sum filter has been studied in [1] and [20]. However, we adapt our Gaussian filter for the update of Gaussian distributions. We also suggest some new update rules of weights for Gaussian sum filters. Through our experimental study we found that the filters developed in the paper perform better than or as well as the filter of Julier–Uhlmann [12]. They have a significant improvement over the extended Kalman filter with no additional cost.
An outline of the paper is as follows. In Section II we develop the Gaussian filter based on a single Gaussian distribution. In Section III we discuss the efficient numerical integration of the Gaussian filter based on quadrature rules, and introduce the Gauss–Hermite filter (GHF) and the central difference filter (CDF). In Section IV we formulate the Gauss–Hermite filter and the central difference filter as the filter algorithms. In Section V we introduce the mixed Gaussian filter and the new update rules of weights. In Section VI we discuss the relation of the Gaussian filter for the discrete-time system to the continuous-time optimal filter governed by the Zakai equation. In Section VII we analyze the stability and performance bound of the Gaussian filters and the mixed Gaussian filters developed in the paper. In Section VIII we report our numerical findings and comparison studies. Also, we demonstrate a nearly optimal performance of the mixed Gaussian filter. Finally, we conclude our results in Section IX.
II. GAUSSIAN FILTERS
We discuss the nonlinear filtering problem for the discrete-time signal system for -valued process
(2.1)
and the observation process is given by
(2.2)
where and are white noises with covariances and respectively. We assume that the initial condition and are independent random variables. The optimal nonlinear filtering problem is to find the conditional expectation of the process given the observation data The probability density function of the conditional expectation is given by Bayes’ formula
(2.3)
(2.4)
ITO AND XIONG: GAUSSIAN FILTERS FOR NONLINEAR FILTERING PROBLEMS 911
which is precisely the same as the above, though differently expressed.
Thus, if is a Gaussian with mean and covariance then the Gaussian approximation of has mean and covariance defined by
(2.5)
(2.7)
and
(2.9)
and
(2.10)
where the Kalman-filter gain is defined by
(2.11)
(2.12)
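The moment-matching idea behind (2.5)–(2.12) can be illustrated with a short Python sketch of one predictor and corrector step. This is a minimal sketch under stated assumptions, not a transcription of the equations above: the names `F`, `H`, `Q`, `R`, `m`, `P`, `gauss_expect`, and `gaussian_filter_step` are hypothetical, the signal and observation maps are assumed to enter with additive noise, and the Gaussian integrals are approximated here with a product Gauss–Hermite rule from NumPy.

```python
import numpy as np
from itertools import product
from numpy.polynomial.hermite_e import hermegauss


def gauss_expect(g, mean, cov, order=3):
    """Approximate E[g(x)] for x ~ N(mean, cov) with a product Gauss-Hermite rule."""
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    n = mean.size
    nodes, weights = hermegauss(order)          # weight function exp(-t^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)    # normalize so the weights sum to 1
    S = np.linalg.cholesky(cov)                 # cov = S S^T
    total = 0.0
    for idx in product(range(order), repeat=n):
        t = nodes[list(idx)]
        w = np.prod(weights[list(idx)])
        total = total + w * np.asarray(g(mean + S @ t), dtype=float)
    return total


def gaussian_filter_step(m, P, y, F, H, Q, R, order=3):
    """One predictor/corrector step (assumed additive-noise model, hypothetical names)."""
    # Predictor: match the mean and covariance of F(x) + w.
    m_pred = gauss_expect(F, m, P, order)
    P_pred = gauss_expect(
        lambda x: np.outer(F(x) - m_pred, F(x) - m_pred), m, P, order) + Q
    # Corrector: predicted observation, its covariance, and the cross covariance.
    z = gauss_expect(H, m_pred, P_pred, order)
    Pzz = gauss_expect(
        lambda x: np.outer(H(x) - z, H(x) - z), m_pred, P_pred, order) + R
    Pxz = gauss_expect(
        lambda x: np.outer(x - m_pred, H(x) - z), m_pred, P_pred, order)
    L = Pxz @ np.linalg.inv(Pzz)                # Kalman-type gain
    return m_pred + L @ (y - z), P_pred - L @ Pzz @ L.T
```

For linear `F` and `H` the three-point rule integrates every term exactly, so the step should reduce to the ordinary Kalman filter; this gives a convenient sanity check.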
912 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 45, NO. 5, MAY 2000
In order to implement the Gaussian filter we must develop approximation methods to evaluate the integrals (2.5)–(2.12). In Section III we discuss the approximation methods for integrals of this form.
III. QUADRATURE RULES
In this section we discuss the approximation methods for the integral of the form
(3.1)
If we assume and change the coordinate of integration by then
(3.2)
(3.3)
and
(3.4)
The JU-rule requires $(2n+1)$-point function evaluation and is exact for all quadratic polynomials. If we set and then the JU-rule coincides with the Gauss–Hermite rule.
Finally we consider the polynomial interpolation methods. We approximate by the quadratic function that satisfies
and
Thus, we approximate (3.1) by
(3.6)
(3.7)
where is the quadratic approximation of for
Next, we consider the approximation of defined by
(3.8)
(4.1)
Let and be the starting values for the mean and covariance of the random variable and set and In the Gauss–Hermite filter we apply the following predictor and corrector steps recursively.
Corrector Step: Compute the factorization and set Update by
where
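The two properties of the JU-rule stated in Section III (a $(2n+1)$-point rule that is exact for quadratic polynomials, and agreement with a Gauss–Hermite rule for a particular parameter choice) can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, the symbol `kappa` and the sigma-point weights follow the commonly published form of the Julier–Uhlmann rule, and the choice n = 1, kappa = 2 used in the check is an assumption that reproduces the three-point Gauss–Hermite rule.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def ju_points(mean, cov, kappa):
    """Sigma points and weights of the JU-rule (2n + 1 points, assumed standard form)."""
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    cov = np.atleast_2d(np.asarray(cov, dtype=float))
    n = mean.size
    S = np.linalg.cholesky((n + kappa) * cov)   # columns give the point offsets
    pts = ([mean]
           + [mean + S[:, i] for i in range(n)]
           + [mean - S[:, i] for i in range(n)])
    wts = [kappa / (n + kappa)] + [0.5 / (n + kappa)] * (2 * n)
    return np.array(pts), np.array(wts)


def gh_points_1d(mean, var, order=3):
    """Nodes and normalized weights of a scalar Gauss-Hermite rule for N(mean, var)."""
    nodes, weights = hermegauss(order)          # weight function exp(-t^2 / 2)
    return mean + np.sqrt(var) * nodes, weights / np.sqrt(2.0 * np.pi)
```

In one dimension with `kappa = 2.0`, `ju_points` places its points at the mean and at the mean plus or minus the square root of three standard deviations, with weights 2/3, 1/6, and 1/6, which are the nodes and weights returned by `gh_points_1d` with `order=3`.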
based filter is that we are not required to have the derivative of and
Secondly, we present the filter algorithm based on the central difference approximation with second-order diagonal correction. We summarize the filter algorithm of the central difference approximation with the second-order diagonal correction as follows.
where
to obtain the update
V. MIXED GAUSSIAN FILTER
In this section we discuss the mixed Gaussian filter. We approximate the conditional probability density by the linear combination of multiple Gaussian distributions, i.e.,
where is chosen so that the singularity of the matrix is avoided and the matrices are defined by
(5.3)
Finally we discuss the simultaneous update of the weights. We determine the weights by the $L^2$-projection, i.e., minimize
(5.5)
over satisfying A positive constant is chosen so that the likelihood of each Gaussian distribution is nonzero [e.g., ]. Problem (5.5) is formulated as the quadratic programming
subject to
(5.6)
(5.8)
and
respectively.
Roughly speaking, the estimate (5.7) shows that any probability density function can be
approximated by a sum of Gaussian distributions each of whose components is given by
(5.9)
with order for
VI. RELATION TO CONTINUOUS-TIME FILTER
In this section let us discuss the nonlinear filtering problem for the continuous-time signal process in generated by
(6.1)
where is the standard Brownian motion, i.e., is the diffusion process [3], [11], [19]. So, (6.1) holds in the sense of Itô and satisfies
(6.2)
where is the standard Brownian process that is independent of and Or, we consider the discrete observation process
(6.3)
where is the stepsize and is white noise with covariance We assume that the initial condition and are independent.
Then the conditional probability density satisfies the Zakai equation
(6.4)
where is the Fokker–Planck operator [3], [19].
As shown in [8], [10], and [15], the discrete-time filter (2.3), (2.4) applied to the time-discretized signal system of (6.1), (6.2)
(6.5)
provides an approximation method for the continuous-time optimal filter to (6.1), (6.2). Note that denotes the finite difference approximation (6.3). We approximate the process by the discrete-time process by
(6.6)
with
Here the interval is subdivided into the subintervals
where and So, denotes the approximation of the process at and denotes the discretization of the ordinary differential equation
The predictor step (2.5), (2.6) is applied successively times to obtain i.e.,
(6.7)
and
(6.8)
with
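The idea of applying the predictor step successively over the subintervals between two observations can be sketched for a scalar state as follows. This is a hedged illustration rather than the scheme (6.6)–(6.8) itself: the function and argument names are hypothetical, the discretization is a plain Euler step of an assumed drift function `drift`, the process noise added per substep is assumed to have variance `q * h`, and the Gaussian integrals use a three-point Gauss–Hermite rule.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def predict_with_substeps(mean, var, drift, q, dt, m_steps, order=3):
    """Propagate a scalar Gaussian (mean, var) through m_steps Euler substeps.

    Each substep maps x to x + h * drift(x) + w with w ~ N(0, q * h) and
    h = dt / m_steps (an assumed noise scaling), then refits a Gaussian by
    matching the mean and variance. `drift` must accept NumPy arrays.
    """
    nodes, weights = hermegauss(order)          # weight function exp(-t^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)    # normalize so the weights sum to 1
    h = dt / m_steps
    for _ in range(m_steps):
        x = mean + np.sqrt(var) * nodes         # quadrature points for N(mean, var)
        fx = x + h * drift(x)                   # one Euler step of the drift
        new_mean = np.sum(weights * fx)
        var = np.sum(weights * (fx - new_mean) ** 2) + q * h
        mean = new_mean
    return mean, var
```

A call such as `predict_with_substeps(2.0, 1.5, lambda x: -0.5 * x, 0.2, 1.0, 32)` propagates the pair (mean, variance) across one observation interval using 32 substeps; only the final pair is then passed to the corrector step.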
where is the white noise with covariance
We assume that
for (7.1)
Then we have the following theorem. The theorem shows stability of the proposed Gaussian filter independent of the fineness of on a fixed bounded time interval.
Theorem 7.1: The solution to (2.5)–(2.12) has the estimate
Note that for any the covariance is nonnegative and satisfies the following property:
Thus by the Cauchy–Schwarz inequality we have
Hence it follows from (7.4) that
Thus by (7.5) we obtain the estimate (7.2).
Remark 7.1: Using arguments similar to those that lead to (7.2), we can prove that the same result holds for the approximation method discussed in Sections III and IV. That is, the estimate (7.2) is still valid for the Gauss–Hermite filter, provided that
(7.6)
(7.7)
where is the probability density function of random variable and
where
is the mixed Gaussian approximation of Let
Then we have the following result.
Theorem 7.2: Assume that for any probability density function and
(7.8)
Then the error estimate
(7.9)
and the error is given by
(7.10)
Proof: Note that
filters against that of the extended Kalman filter (EKF) and the filter of Julier–Uhlmann (JUF) [12].
We use the average root mean square error for our comparison of the methods. The average root mean square error is defined by
(8.1)
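An error measure of this kind can be computed as in the following sketch. The exact form of (8.1) is not reproduced here, so the code takes one common reading as an assumption: at each time step, the squared estimation error is averaged over the independent runs and the square root is taken. The function name and the array layout `(runs, steps)` are hypothetical.

```python
import numpy as np


def average_rmse(estimates, truths):
    """Root mean square error at each time step, averaged over independent runs.

    Both inputs are arrays of shape (runs, steps) holding one scalar state
    component; the result has shape (steps,).
    """
    estimates = np.asarray(estimates, dtype=float)
    truths = np.asarray(truths, dtype=float)
    return np.sqrt(np.mean((estimates - truths) ** 2, axis=0))
```

For a simulation of 50 runs, `estimates` and `truths` would each have 50 rows, and the returned curve can be plotted against time for each filter being compared.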
parameters and We consider the time from 0 to 4. is chosen as the best value, i.e., (see [12]).
Fig. 1 shows the average root mean square errors committed by each filter across a simulation consisting of 50 runs. From Fig. 1 we see that the five-point Gauss–Hermite filter performs better than the others. The central difference filter performs similarly to the filter of Julier–Uhlmann [12] in this example. All three filters are superior to the extended Kalman filter.
Example 8.2: In this example, we consider the Lorenz system
where is the -valued process. The -valued processes and are white noises with the same covariance Here is a constant vector and is a constant scalar. Also
(8.6)
The observation function is chosen as the shifted distance from the origin given by
It is motivated from the Lorenz stochastic differential system of the form (8.4) as described in Section VI where
The three-dimensional deterministic equations
(8.7)
are the Lorenz equations in which are positive parameters. The three parameters have a great deal of impact on the system. Here, we choose and which is a mathematically very interesting case, because in the case there are three unstable equilibria and a strange attractor for the equations.
We use the following problem data: the system parameters are chosen as and the initial condition is i.e., and We consider the time from 0 to 4. The initial estimate is with covariance
In Figs. 2–4 we show the average of root mean square errors for each component of system state committed by the algorithms of the Gauss–Hermite quadrature rule, the central difference approximation, and the JU-rule, respectively, across a simulation consisting of 50 runs. As shown in Figs. 2–4, the GHF and the CDF have smaller errors than the JUF and the EKF. From these figures we conclude that in this example the GHF has a substantially improved performance, and the GHF performs a little better than the CDF.
Example 8.3: We further discuss the three-dimensional continuous signal process
(8.8)
and the discrete observation process
(8.9)
where and are zero mean and uncorrelated noises with covariances given by and respectively. The function is given by
The above system states and represent altitude (in feet), velocity (in feet per second), and constant ballistic coefficient (in per second), respectively. The detailed physical meaning of the system and its parameters can be found in [2]. We chose this example as a benchmark because it contains significant nonlinearities in the signal and observation processes and has been discussed widely in the literature.
In the previous literature, a fourth-order Runge–Kutta scheme with 64 steps between each observation is employed for numerical integration of (8.8) in order to deal with the significant nonlinearity of signal process [2], [12]. That is, in the predictor steps of EKF and JUF, the means are calculated using the numerical scheme. Then their covariances are propagated from the th to th step using
where was evaluated at and Thus,
In the comparison, we implement the proposed approximation schemes (6.7)–(6.9) to solve the filtering problem of (8.8), with (8.9). In the predictor step of GHF, we directly use the Euler approximation of signal process (8.8) based on 32 or 64 steps (i.e., 32 or 64) between each observation (denoted for short by GHF32 and GHF64, respectively).
However, in NGHF the signal process system (8.8) is rewritten as the discrete form
where is the approximation of the process at and function with
We use the following data: the system parameters are chosen as and the initial condition is and We consider the time from 0 to 30. The initial estimate is with covariance
We also choose the optimization number in JUF for the three-dimensional system.
In Figs. 5–7 we show the absolute value of average error for each component committed by the algorithms of the Gauss–Hermite quadrature rule, the JU-rule, and the EKF, respectively, across a simulation consisting of 50 runs. These figures show that in this example the GHF performs better than the JUF and the EKF.
Fig. 8. The comparison of signal; estimates of two mixed Gaussian filter and optimal filters.
A. Mixed Gaussian Example
In this section we first demonstrate a nearly optimal performance of the mixed Gaussian method described in Section V. We consider the same example as in Example 8.1 with We employ the two Gaussian distributions starting from
and
respectively, and the initial weights are In order to demonstrate the effectiveness of the proposed update (5.5), (5.6) we jumped the process at [which corresponds to in the continuous time process (8.4)] to It may represent an impulse force at We compare the performance of the mixed Gaussian filter against the one of the optimal Zakai filter (6.4) with in Figs. 8–11. Here GHF1 (GHF2) represents the three-point Gauss–Hermite filter which is the first (second) component of the two mixed Gaussian filter with Weight 1 (Weight 2). We approximate the Zakai equation by the operator-splitting method as described in [10]. We observed that the mixed Gaussian filter performs nearly optimally up to We also note that the Zakai filter is no longer optimal after because of the jump. As seen in Figs. 8 and 9, the update formula (5.5), (5.6) quickly captures the phase change from the one steady point to the other The observations are also supported by the comparison in Figs. 10 and 11.
Fig. 10. The comparison of the probability density functions of mixed Gaussian filter and optimal filter at t = 1.
Fig. 11. The comparison of the probability density functions of mixed Gaussian filter and optimal filter at t = 2.75.
Example 8.3 (Revisited): Next, we also apply our mixed Gaussian method to Example 8.3. We employ the two Gaussian distributions starting from
and
with the same covariance
problems. Our numerical results indicate that both the Gauss–Hermite filter and the central difference filter have superior performance to the filter of Julier–Uhlmann and the extended Kalman filter. We also proposed the new update rules for the Gaussian sum filters and showed that they can perform near optimally.
ACKNOWLEDGMENT
The authors thank the referees for their careful reviews and helpful suggestions.
REFERENCES
[1] D. L. Alspach and H. W. Sorenson, “Nonlinear Bayesian estimation using Gaussian sum approximation,” IEEE Trans. Automat. Contr., vol. 17, pp. 439–448, 1972.
[2] M. Athans, R. P. Wishner, and A. Bertolini, “Suboptimal state estimation for continuous-time nonlinear systems from discrete noisy measurements,” IEEE Trans. Automat. Contr., vol. 13, pp. 504–514, 1968.
[3] A. Bensoussan, “Nonlinear filtering theory,” in Progress in Automation and Information Systems, Recent Advances in Stochastic Calculus, J. S. Baras and V. Mirelli, Eds. New York: Springer-Verlag, 1990.
[4] A. Bensoussan, R. Glowinski, and A. Rascanu, “Approximation of the Zakai equation by the splitting up method,” SIAM J. Contr. Optim., vol. 28, pp. 1420–1431, 1990.
[5] Y. Bar-Shalom and X. R. Li, Estimation and Tracking: Principles, Techniques, and Software. New York: Artech, 1993.
[6] J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[7] C. P. Fang, “New algorithms of Gaussian assumed density filter and a convergence result,” IEEE Trans. Automat. Contr.
[8] P. Florchinger and F. Le Gland, “Time-discretization of the Zakai equation for diffusion process observed in correlated noise,” in Proc. 9th Conf. Analysis and Optimization of Systems, A. Bensoussan and J. L. Lions, Eds. New York: Springer-Verlag, 1990, vol. 144. (Lecture Notes in Control and Inform. Sci.).
[9] G. H. Golub, “Some modified matrix eigenvalue problems,” SIAM Rev., vol. 15, pp. 318–334, 1973.
[10] K. Ito, “Approximation of the Zakai equation for nonlinear filtering,” SIAM J. Contr. Optim., vol. 34, pp. 620–634, 1996.
[11] A. H. Jazwinski, Stochastic Processes and Filtering Theory. New York: Academic, 1970.
[12] S. J. Julier and J. K. Uhlmann. (1994) A general method for approximating nonlinear transformations of probability distributions. [Online]. Available: http://phoebe.robots.ox.ac.uk/default.htm
[13] S. J. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte, “A new approach for filtering nonlinear systems,” in Proc. Amer. Contr. Conf., Seattle, WA, June 1995, pp. 1628–1632.
[14] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Trans. ASME, J. Basic Eng., vol. 82D, pp. 35–45, Mar. 1960.
[15] H. J. Kushner, “Approximation to optimal nonlinear filters,” IEEE Trans. Automat. Contr., vol. 12, pp. 546–556, 1967.
[16] H. J. Kushner, private communication.
[17] V. Mazya and G. Schmidt, “On approximate approximations using Gaussian kernels,” IMA J. Numer. Anal., vol. 16, pp. 13–29, 1996.
[18] M. Piccioni, “Convergence of implicit discretization schemes for linear differential equations with application to filtering,” in Stochastic Partial Differential Equations and Applications, G. Da Prato and L. Tubaro, Eds. Trento, 1985, vol. 1236. (Lecture Notes in Mathematics).
[19] B. L. Rozovskii, Stochastic Evolution Systems, Linear Theory and Application to Nonlinear Filtering. Norwell, MA: Kluwer, 1991.
[20] H. W. Sorenson and D. L. Alspach, “Recursive Bayesian estimation using Gaussian sums,” Automatica, vol. 7, pp. 465–479, 1967.
Kazufumi Ito received the Ph.D. degree in system science and mathematics from Washington University in 1981. Subsequently, he held positions at ICASE, Brown University, and the University of Southern California before joining the Department of Mathematics, North Carolina State University, where he is currently a Professor. His research interests include nonlinear filtering, stochastic processes, and control and analysis of partial differential equations. He co-organized an AMS-IMS-SIAM Joint Summer Research Conference on Identification and Control in Systems Governed by PDE’s, 1992, and the workshop on Stochastic Control and Nonlinear Filtering at NC State University, 1996.
Kaiqi Xiong received the M.S. and Ph.D. degrees in applied mathematics from the Claremont Graduate School in 1995 and 1996, respectively. In 1995–1996, he was also a Researcher of Electrical Engineering at the University of California, Riverside. From 1996 to 1999, he was a Visiting Assistant Professor in the Center for Research in Scientific Computation at North Carolina State University. As a Researcher, he is visiting the Department of Mechanical and Aerospace Engineering at the University of California, Irvine. His research interests include nonlinear systems, mathematical control theory, filtering algorithms, parameter estimation, flight control, neural networks and applications.