Open navigation menu
Close suggestions
Search
Search
en
Change Language
Upload
Sign in
Sign in
Download free for days
0%
(1)
0% found this document useful (1 vote)
409 views
TEXTBOOK-Introduction To Random Signals and Applied Kalman Filtering With Matlab
giao trinh mon toan chuyen nganh ky thuat dien tu
Uploaded by
Cường Sống Thiện
AI-enhanced title
Copyright
© © All Rights Reserved
Available Formats
Download as PDF or read online on Scribd
Download now
Download
Save TEXTBOOK-Introduction to Random Signals and Applie... For Later
Download
Save
Save TEXTBOOK-Introduction to Random Signals and Applie... For Later
0%
0% found this document useful, undefined
100%
, undefined
Embed
Share
Print
Report
0%
(1)
0% found this document useful (1 vote)
409 views
TEXTBOOK-Introduction To Random Signals and Applied Kalman Filtering With Matlab
giao trinh mon toan chuyen nganh ky thuat dien tu
Uploaded by
Cường Sống Thiện
AI-enhanced title
Copyright
© © All Rights Reserved
Available Formats
Download as PDF or read online on Scribd
Download now
Download
Save TEXTBOOK-Introduction to Random Signals and Applie... For Later
Carousel Previous
Carousel Next
Save
Save TEXTBOOK-Introduction to Random Signals and Applie... For Later
0%
0% found this document useful, undefined
100%
, undefined
Embed
Share
Print
Report
Download now
Download
You are on page 1
/ 248
Search
Fullscreen
(Gre who pBea for mars, fo on Fe the Kalman filter represents the most widely applied and' demonstrably ‘seh result to emerge ffom the state asible approach of modem contol theors”™ THIRD EDITION Introduction to Random Signals and Applied Kalman Filtering WITH MATLAB EXERCISES AND SOLUTIONS Robert Grover Brown Electrical Engineering Department lowa State University Patrick Y. C. Hwang Rockwell International Corporation ) ornareACQUISITIONS EDITOR charg Robey MARKETING MANAGER Jay Kuch PRODUCTION EDITOR Ken Sanor MANUFACTURING MANAGER Mark Cito, ILLUSTRATION COORDINATOR Gene Alo ‘COVER DESIGNER Cart Grote “This bok was stn 10/12 ‘Ties Roms by Pro-image Corportion, and pind ané Youn by amiton Pring Comp. The cover as pte by Lehigh Bes, le Recosinng the importance of preerviag what hs been writers & policy of John Wily Son, ne. to ave books of enduring ae published (Rie Unied Sates pred om ace pape, and we exer ou Dest efforts to ten “Te pet on this ook wat manufac by a mill whose forest manasemeat PPowransIcde sued old arvesiog of inberiands,Soxtined yield Fevestng pines ence tat the aber of es cit each year Ges ot eed the amount of ew BT ‘Conyit © 1957, by Joka Wiley & Sons, te Al rights reserved, blake simultaneously in Cand the work beyond tat peed by Sections 1 and 108 of the 1976 Und States Copyright ‘Act wth th permission ofthe copyright ‘oer umawl. Requests fr permission tr fare informatio shou be nese to ‘he Permissions Deparment, Joba Wiley & Sons, In Library of Congress Cataloging in Publication Data Brown, Roben Grove “usdocton to random sige and applied Kalan sing: wih MATLAB execs and slaions / Rober Grover Brown, Pack ¥.C. Hang, — Sed pom Inches nd. ISBN -471-12830-2 (elo ah. pope) TSize! processing--Data proceing. 2. Random oot to: 2. Kalman tenng-Data processing. 4 MATLAB 1. Hoag Pauiee YC. Il Tile ‘TKSI029.575 1999 37 9.25800 e320 sta itd inthe United Stats of America w9s76s4321 Preface to the Third Edition “This text is a third edition of Introduction to Random Signals and Applied Kal- ‘man Filtering. At the time we prepared the second edition, there was no com ble to engineering. students at an affordable price. All this changed with the publication of The Student Edition of MATLAB® (Prentice-Hall, 1992).* This was followed in 1995 with an up- graded version under the same tile (but referred to as Version 4). Ths has made it feasible for an instructor to expect every student in class to have available, as ‘2 minimum, the student edition of MATLAB as an aid in solving engineering problems. This is especially true in upper-level courses where the student would very likely have gained some familiarity with MATLAB from earlier courses, We are not providing the MATLAB software package with this text. Its as sumed that the student already has MATLAB available (or perhaps some other prehensive PC mathematics software avai suitable mathematics software) and knows how to use it. ‘The problems at the end of each chapter that are flagged with a small computer icon ate “computer” exercises. They cannot be worked out by paper-and-pencil methods witha reasonable amount of effort. They were written With MATLAB in mind, but they can, of course, be worked out with other suitable software. We encourage all those who are interested in learning about Kalman filtering to complete a generous number ofthese exercises. The discrete Kalman filter is a numerical procedure, so considerable insight into the filter's behavior is gained in doing so. 
Just looking at the equations is not enough for ‘most of us. Programming the equations and analyzing the results of specific examples is the best way to get the insight that is so essential in engineering work. "MATLAB is an especially good software package for working out Kalman filtering problems because of its convenience in handling matrix operations. A diskette containing MATLAB M-fles giving solutions for most of the computer problems and examples is included inside the back cover of the text. The solution * MATLAR isa rpiered ier of The Math Work, Ie, 24 Prine Pak Way, Nach, (0760130, Pe (505) 87-700w PREFACE TO THE THIRD EDTION ‘Mefles were writen for torial purposes, and many comment statements are interspersed withthe executable ones. They were written to be understandable, not for efficiency. (See Appendix C for more on the software.) ‘The previous edition of this book included a Kalman filtering software package called "Kalm 'N Smooth.” This menu-oriented software is not as ver satle as MATLAB, but itis still good effective software for solving certain types of Kalman fitering problems. Thus, itis also included in the diskete for those ‘who wish to use it ‘The main thrust of this text is applied Kalman filtering. Our intent, just as in the earlier edi.ons, has been to present the subject matter at an introductory level. We have found from teaching both universty-credit and coatinuing- education courses that the main impediment to learning about Kelman filtering ‘is not the mathematics. Rather, its the background material in random process theory and linear systems analysis that usually causes the difficulty. Chapters 1 through 3 are intended to provide a minimal background in random process theory and the response of the linear systems to random inputs. Knowledge of| this material is essential to the subject matter in the remaining chapters. The necessary prerequisite material on linear systems analysis can be found in most any junior- or senior-level engineering text on linear systems analysis or linear control systems, Chapter 4 is on Wiener filtering. Those who are primarily in- terested in Kalman filtering may wish to skip this chapter, except for Section 4.7 on the discrete Wiener filter. This has relevance tothe discrete Kalman filter. Chapters 5 through 11 deal wit various facets of Kalman filtering with emphasis ‘on applications throughout "The authors wish to express special thanks to Dr. Larry Levy for his many helpful comments during the course of preparation of this third edition. We also wish to thank Lova Brown (Mrs. Robert Grover Brown) for her patience and help in preparing the final manusecipt. Robert Grover Brown, Patrick Y. C. Hwang Contents Probability and Random Variables: A Review 1 1.1 Random Signals 1 1.2 Intuitive Notion of Probabilty 2 1.3 Axiomatic Probebiity 5 1.4 Joint and Conditional Probability 11 1.5 Independence 15 116 Random Variables 16 1.7 Probabilty Distibution and Density Functions 19) 1.8 Expectation, Averages, and Characteristic Function 21 1.9 Nowmal or Gaussian Random Variables 25 1.10 Impulsive Probabiity Density Functions 29 1.11 Multiple Random Variables 30 1.12 Gorreation, Covariance, and Orthogonality 36 1.13 Sum of Independent Random Variables and Tendency Teward Normal Distribution 38 41.14 Transfoxmation of Random Variables 42 41.15. 
Mutivariate Normal Density Function 49 1.18 Linear Transformation and General Properties ‘of Normal Random Variables 53 1.47 Limits, Convergence, and Unbiased Estimators $7 Mathematical Description of Random Signals 72 2.1 Concept of a Random Process 72 22 Probabilistic Description of a Random Process 75vil contents 2.9 Gaussian Random Process 78 2.4 Stationary, Ergodcty, and Casstfication of Processes 78 25. Autocorelation Function 80 28 Crosscorrlation Function 94 2.7 Power Spectral Density Function 86 2.8 Cross Spectral Density Function 91 29 White Noise 92 2.10 Gauss-Markov Process 94 2.11 Random Telegraph Wave 96 2.12 Narrowband Gaussian Process. 98 2.13 Wener or Brownian-Motion Process 100, 2.14 Peeudorandom Signals 108 2.15 Determination of Autocorrelation and Spectral Density Funtions from Experimental Data 105 2.18 Sampling Theorem 111 2.17 Discrete Fourier Transform and Fast Fourier Transform 119 3 Response of Linear Systems to Random Inputs 128 81 Introdvetion: The Analysis Problem 128 8.2 Stationary (Steady-State) Analysis 129 3.2 Integral Tales for Computing Mean-Square Value 192 8.4 Pure White Noise and Bandiimited Systems 194 35 Noise Equivalent Bandwidth 135 38 Shaping Fiter 197 3,7 Nonetaonary (Transient) Analysis—hital Cndtion Response 138 3.8 Nonstationary (Transient) Analysis—Forced Response 140 9.9. Discrete-Time Process Models and Analysis 144 810 Summay 147 4 Wiener Filtering 159 4.1 The Wiener For Problem 159 42. Optimization with Respect to @ Parameter 161 4.3. The Stationary Optimization Problem—Weighting Function Approach 163 444 The Nonstationary Problem 172 48. Onthogonalty 17 48 Complementary Fiter 178 47 The Discrete Wiener Fiter 181 48 Perspective 183 CONTENTS x 5 The Discrete Kalman Filter, State-Space Modeling, and Simulation 190 5.1 ASimple Recursive Example 190 5.2 Vector Description of a Gontinuous-Time Random Process 192 53. Discrete-Time Model 198 5.4 Morte Carlo Simulation of Disrete-Time Systems 210 55 The Disorte Kalman Fiter 214 5.8 Scalar Kalman Fiter Examples 220 5.7 Augmenting the State Vector and Mulple-Input/Mutiple-Output Example 225, 58 The Condtional Density Viewpoint 228 6 Prediction, Applications, and More Basics on Discrete Kalman Filtering 242 6.1 Prediction 242 62. Altematve Form of the Diserate Kalman Fier 246 83 Processing the Measurement Vector One Component at atime 250, 64 Power System Relaying Application 252 65 Power Systems Harmonics Determination 256 66 Divergence Problems 260 6.7 Off-Line System Error Analysis 264 68 Relationship to Deterministic Least Squares and Note on Estimating a Constant 270 69 Disorete Kalman Fier Stablity 275, 6.10 Deterministic Inputs 27 6.11 Real-Time Implementation Issues 278 6.12 Perspective 281 7 The Continuous Kalman Filter 289 7.4 Transtion trom the Discrate to Continuous Fier Equations 200 7.2. Solution ofthe Matix Ficeati Equation 283 TS. Comelated Measurement and Process Noize 296 74 Colored Measurement Nose 290 7.8 Suboptimal Enor Analysis 308 7.8 Fier Stabiity in Steady-State Condition 305 1.7 Relationship Between Wiener and Kalman Fiters 306x CONTENTS. 8 10 oat Smoothing 312 8.1 Classification of Smoothing Problems 312 82 Discrete Fixed-Interal Smoothing 313. 83. Discrote Fixed-Point Smoothing 917 84° Fixed-Lag Smoothing 920 85. Forward-Backward Fiter Approach to Smoothing 922 Linearization and Additional Intermediate-Level Topics on Applied Kalman Filtering 335 941 Unearization 385 912 Correlated Process and Measurement Noise forthe Discrete Fier. 
Delayed-State Example 348 93 Adaptive Kalman Fiter (Multiple Model Adaptive Estimator) 959 94 Schmict-Kalman Fiter, Reducing the Order of the State Vector 361 95 UD Factorization 367 96 Decentralized Kalman Fiter 971 8.7 ‘Stochastic Linear Regulator Problem and the Separation Theorem 37 More on Modeling: integration of Noninertial Measurements into INS 392 10.1 Complementary Fiter Methodology 82 10.2 INS Error Models 996 40.3 Damping the Schulor Oscilation with External Velocity Reference Information 402 104 Baro-Aided INS Vertical Channel Model 407 40.5 Integrating Postion Measurements 410 10.6 Other Integration Considerations 413 ‘The Global Positioning System: A Case Study 419 114 Description of GPS 419 112 The Observables 423 113 GPS Error Models 426 41.4 GPS Dynamic Error Models Using Inenialy-Derived Reference Trajectory 492 “APPEND APPEND APPEND 11.8 Stand-Alone GPS Models 437 11.8 Effects of Satelite Geometry 443 11.7 Differential and Kinematic Positioning 445 11.8 Other Applications 449 A Laplace and Fourier Transforms Ad The One-Sided Laplace Transform 461 ‘2 The Fourier Transform 464 A Two-Sided Laplace Transform 466 B___Wrical Navigation Satellite Geometry C Kalman Filter Software Index 481 474, 478 481Probability and Random Variables: A Review 14 RANDOM SIGNALS Nearly everyone has some notion of random or noisclike signals. One has only to tune an ordinary AM radio away from station, wm up the volume, and the result is static, or noise. If one were to look ata strip-chart recording of such a signal, it would appear to wander on aimlessly with no apparent order in its amplitude patter, as shown in Fig. 1.1, Signals of this type cannot be deseribed ‘with explicit mathematical fanctions such as sine waves, step functions, and the like. Their description must be put in probabilistic terms. Early investigators recognized that random signals could be described loosely in terms of their spectral content, but a rigorous mathematical description of such signals was not formulated until the 1940s, most notably with the work of Wiener and Rice (1, 2). Noise is usually unwanted. The additive noise in the radio signal disturbs ‘our enjayment of the music or interferes with our understanding of the spoken ‘word; noise in an electronic navigation system induces positon erors that can be disastrous in critical situations; noise in a digital data wansmission system ‘can cause bit errors with obvious undesirable consequences; and on and on. Any noise that corupts the desired signal is bad; itis just a question of how bad! Even after designers have done their best 10 eliminate all the obvious noise- producing mechanisms, there always seems to be some noise left over that must be suppressed with more subtle means, such as filtering. To do so effectively, fone must understand noise in quantitative terms. Probability plays a Key role in the description of noiselike signals. Our tweatment of this subject must necessarily be brief and directed toward the spe- cific needs of subsequent chapters. The scope is thus limited in this regard. We make no apology for this, because many fine books have been written on prob- ability in the broader sense. Our main objective here is the study of random signals and optimal filtering, and we wish to move on to this area as quickly as possible. First, though, we must at least review the bare essentials of probability ‘with special emphasis on random variables,2. CHAPTER PROBABILITY AND RANDOM VARIABLES: A EVN 12 Figure 14. 
Typical nals waver, INTUITIVE NOTION OF PROBABILITY Most engineering and science students have had some acquaintance with the intuitive concepts of probability. Typically, withthe intuitive approach we first Consider all possible outcomes of a chance experiment as being equally likely, land then the probability of a particular event, say, event A, is defined as Possible outcomes favoring event A cay “Total possible outcomes PA ere we ead PA) a8 “probability of even A This concep then expanded Uoinrive the rele fequency-oFoccarence or Matta viewpoint of prob Shi Wht he eave fequensy cone, we agin age mabe of tls ‘some cance experiment td hen dene prbabiity ashe reat fequeney Of oruence oft event in question. Cnsdeations such 35 whats meant by “lage and the exbtece of limits se somal voide a clementary Wea tment. Ths is for good tcason "The idea of limi in a probable sense i be Skmough the ole initve notions of probabil have imitans, hey ss playa important rae a pobaity tory. Te sat of posble-evens ong isa wit peoblemsoving tool in many stances. The relative froqntcy concep is epeially hepa in ising the statsical Sigfcace Of he itso pba caenaons. That pores the necessary te Secween the theory andthe piysieal station, Two examples that iustate che Uefnest of tee ntuive notions of probability should now prove wef EXAMPLE 14 oe In sight poker cach player is del 5 cards fe down fom a deck of 52 playing card, We pose two guestos (a) What isthe probably of being del for of kind thats, Four aes, four king, and 30 for? (o) Wit he robbity of being deat aight sh that i oni ous sequence of five cards ay sul Solution to Question (a) Tis problem i elavely complicated if you thik Items of Be sequonce of chance events that ean take ice when th cards Ie deal on aa ue Yet he problem seal easy when Viewed in ems af he ado of favorable to ol umber of outcomes, These ae ex coud th ths case Thece are only 48 posse hands containing # aces, anther 48 1.2 INTUTE NOTION OF PROBABILITY 3 containing 4 kings; etc. Thus, there are 13 » 48 possible four-of-a-kind hands. ‘The total number of possible poker hands of any kind is obtained from the ‘combination formula for “52 things taken 5 at a time” (3). This i given by the binomial coefficient 52) Sat___ $2-51-50-49-48 . (3) -seian SSS = 900 aan Therefore, the probability of being dealt four of a kind is 13-48 _ 64 2,598,960 ~ 7,598,960 Solution to Question (b) Again, the direct itemization of favorable events is the simplest approach. The possible sequences in each of four suits are: AKQUIO, KQIO9, .. ., 5432A, (Note: We allow the ace to be counted either high or low.) Thus, there are 10 possible straight lushes in each suit (including the royal flush of the suit) giving a total of 40 possible staight flushes. The probability (ofa straight flush is, then, PAfour of a kind) = ~ 00024 (1.23) 40 PuSuaight ust) = Sas ~ 000015 24 ‘We note in passing that in poker a straight flush wins over four of a kind; and, rightly so, since its the rarer of the two hands. a EXAMPLE 1.20 Craps is a popular gambling game played in easinos throughout the world (4). ‘The player rolls two dice and plays against the house (ie, the casino). I the first roll is 7 or 11, the player wins immediately; if it is 2, 3, or 12, the player loses immediately. If the first roll results in 4, 5, 6, 8, 9, or 10, the player continues to rll until either the same number appears, which constitutes & win, ‘or &7 appears, which results in the player losing. 
What is te player's probability of winning when throwing the dice? ‘This example was chosen to ilustrate the shortcoming of the direct count the-outcomes approach. In ths ease, one cannot enumerate all the possible out ‘comes. For example, if the player’s first roll is @ 4, the play continues until ‘another outcome is reached. Presumably, the rolling could continue on ad ini- itum without a 4 or 7 appearing, which is what is required to termingte the ‘game. Thus, the direct enumeration approach fails inthis situation. On the other hhand, the relative-frequency-occurence approach works quite well. Table 1.1 shows the relative frequency of occurrence of the various numbers on the first roll. The numbers in the column labeled “probability” were obtained by enu- ‘merating the 36 possible outcomes and allotting xy for each outcome that yields 4 sum corresponding to the number in the first column. For example, a 4 may bbe obtained with the combinations (1, 3), 2, 2), or (I, 3) For the cases where the game continues after the first throw, the subsequent probabilities were ob-4 CHAPTER 1 PROBABILITY AND RANDOM VARIABLES: A REVIEW ‘able 1.4 Probabits n Craps ia om Result repens rr “t Saneanent eM rt Fist Probate in Varo Fist prataity Throw and Routes Throws yk tee ° 3 : = ° before) = $n om ‘ ‘ Coin BH ‘PC before 8) = 3 (lose) PS talons 7) = Find 5 fe Cominne +t PCT before 5) = 4 (ose) (PO before) = 6 de Continue set CT before 6) = fy (ose) 7 Win & Petre 1) = Hid 8 Come ket PCT before 8) = 4 (lose) {9 before 7) = 3 (win) 7 ° Se Cominue aa “RC before 9) = 3 (lose) fore 7) = 5 (wi PB = 08) . 10 3 Continue et PCT before 10) = § (lose) u zk We é n Lowe ° Foal pobebiy wining = 3 = tained simply by observing the relative frequency of occurence ofthe numbers aan oe Tce aly a4 Tn he ela Feaweney ling a Delos shoul be vice that of "4 befor 7,” and the respective probabitie we and }.The toa probity of wining with a4 onthe fst throw was reasoned a follows 4 sppeas onthe fist roll ony ofthe ie: thd of th Wacom, oly of he tine wil hres in an inate win, Ths, thevlative frequency of wining via this ute isthe product of 4. Admit 419 /OOMATIC PROBABILITY. 5 tedly, this line of reasoning is quite intuitive, but that isthe very nature of the relative-trequeney-of occurrence approach to probability. For the benefit of those who like to gamble, it should be noted that craps is a very close game. The edge in favor of the house is only about 1} perces (Also see Problem 1.7.) AXIOMATIC PROBABILITY It should be apparent that the intuitive concepts of probability have thei ti tations, The ratio-of-outcomes approach requires the equal-lkelinood assumption for all outcomes. This may ft many situations, but often we wish to consider “unfair” chance situations as well as “fair” ones. Also, as demonstrated in Example 1.2, there are many problems for which all possible outcomes simply cannot be enumerated, The relative-frequency approach is intuitive by its very nature. Intuition should never be ignored: but, on the other hand, it can lead one astray in complex situations. For these reasons, the axiomatic formulation of probability theory is now almost universally favored among both applied and theoretical scholars in this area. As we would expect, axiomatic probability is ‘compatible withthe older, more heuristic probability theory. "Axiomatic probability begins withthe concept of a sample space. We fist imagine a conceptual chance experiment. The sample space is the set of all, possible outcomes of this experiment. 
The individual outcomes are called ele ‘ments or points in the sample space. We denote the sample space as S and its set of elements 85 {5 Sy $5, -- -}- The number of points in the sample space ‘may be finite, countably infinite, or simply infinite, depending on the experiment under consideration, A few examples of sample spaces should be helpful at chs point. EXAMPLE 1.3, oe The Experiment Make a single draw from a deck of 52 playing cards, Since there are 52 possible outcomes, the sample space contains 52 discrete points. If wwe wished, we could enumerate them as Ace of Clubs, King of Clubs, Queen of Clubs, and so forth, Note that the points of the sample space in this case are “things,” not numbers a EXAMPLE 1.4 _ ‘The Experiment Two fair dice are thrown and the number of dots on the top of each is observed, There are 36 discrete outcomes that can be enumerated as 1,1), 2), (0, 3) ---» (6.5) 6, 6. The first number in parentheses identifies the number of dots on die 1 and the second is the number on die 2. Thus, 36 distinct 2-tuples describe the possible outcomes, and our sample space contains 36 points or elements. Note tha the points in this sample space retain the identity of each individual die and the number of dats shown on its top face. a6 CHAPTER PROBABILITY AND RANDOM VARIABLES: A REVIEW EXAMPLE 1.5 ‘The Experiment ‘Two far dice are thrown and the sum of the number of dots is observed. In this experiment, we do not wish to retain the identity of the ‘numbers on each die; only the sum is of interest. Therefore, it would be perfectly proper to say the possible outcomes of the experiment are {2, 3,4, 5, 6,7, 8 5, 10, 11, 12}. Thus, the sample space would contain 11 discrete elements. From this end the preceding example, ican be seen that we have some discretion in how we define the sample space corresponding to a certain experiment. It de- pends to some extent on what we wish to observe, If certain details of the Experiment are not of interest, they often may be suppressed with some resultant ‘Simplification. However, once we agree on what items are to be grouped together land called outcomes, the sample space must include all the defined outcomes; ‘and, similarly, the result of an experiment must always yield one of the defined ‘outcomes, a Ge) ‘The Experiment A dart is thrown at a target and the location of the hit is ‘observed. In this experiment we imagine that the random mechanisms affecting the throw are such that we get a continuous spread of data centered around the ball's-eye when the experiment is repeated over and over. In this ease, even if we bound the hit locations within a certain region determined by reasonableness, ‘we stl cannot enumerate all possible hit locations. Thus, we have an infinite ‘number of points in our sample space in this example. Even though we cannot enumerate the points one by one, they are, of course, identifiable in terms of either rectangular or polar coordinates a It should be noted that elements ofa sample space must always be mutually exclusive or disjoint, On a given tial, the occurrence of one excludes the oc turence of another. There is ao averlap of points in a sample space. In axiometic probability, the term event has special meaning and should not ‘be used interchangeably with outcome. An events a special subset ofthe sample space 5. We usually wish to consider various events defined on a sample space, and they will be denoted with uppercase leters such as A, B, C, ... 
or pethaps ‘Ay, An = +6, Also, we will have occasion to consider the set of operations of union, intersection, and complement of our defined events. Thus, we must be tareful in our definition of events to make the set sufficiently complete such that these set operations also yield properly defined events. In discrete problems, this ‘ean always be done by defining the set of events under consideration to be all possible subsets of the sample space S, We will tacitly assume thatthe null set fsa subset of every set, and that every set is a subset of itself. ‘One other comment about events isin order before proceeding to the basic axioms of probability, The eveat A is said to occur if any point in A occur. "The three axioms of probability may now be stated. Let $ be the sample space and A be any event defined on the sample space. The first two axioms are Axiom 1: P(A) = 0 asa Axiom 2: P(S) 132) 418 MooMATCPROBAALITY 7 Now, let Ay, Az, As, «be mutually exclusive (disjoint) events defined on S, ‘The sequence may be finite or countably infinite. The third axiom is then Axiom 3: PUA, UA, U ASU.) = PA) + PAL) + PAS) + 33) ‘Axiom 1 simply says that the probability of an event cannot be negative ‘This certainly conforms to the relative-fequency-of-oceurrence concept of prob ability. Axiom 2 says that the event S, which includes all possible outcomes, ‘must have a probability of unity. It is sometimes called the certain event, The first two axioms are obviously necessary if axiomatic probability isto be com- patible with the older relaive-frequency probability theory. The third axiom is ‘not quite so obvious, perhaps, and it simply must be assumed. In words, it says that When we have nonoverlapping (disjoint) events, the probability of the union of these events is the sum of the probabilities of the individual events. If this ‘were not s0, one could easily think of counterexamples that would not be com patible withthe relative-fequency concept. This would be most undesirable. ‘We now recepitulate. There are three essential ingredients in the formal approach to probability. Fist, a sample space must be defined that includes all possible outcomes af our conceptual experiment. We have some diseretion in ‘hat we call outcomes, but caution is in order here, The outcomes must be isjoint and all-inclusive such that P(S) = 1. Second, we must carefully define a set of events on the sample space, and the set must be closed such thatthe ‘operations of union, intersection, and complement also yield events in the set Finally, we must assign probabilities to all events in accordance with the basic axioms of probability. In physical problems, this assignment is chosen to be compatible with what we feel to be reasonable in terms of relative frequency of ‘occurrence of the events. If the sample space § contains a finite number of clement, the probability assignment is usually made directly on the elements of S. They are, of course, elementary events themselves. Ths, along with Axiom 3, then indirectly assigns a probability o all other events defined On the sample space, However, if the sample space consists of an infinite “smear” of points, the probability assignment must be made on events and not on points in the sample space. This will be illustrated later in Example 1.8. 
‘Once we have specified the sample space, the Set of events, and the prob- abilities associated with the events, we have what is known a% a probability space, This provides the theoretical structure for the formal solution of a wide variety of probability problems EXAMPLE 1.7 __ ~ Consider a single throw of two dice, and let us say we are only interested in the sum of the dots that appear on the top faces. This chance situation fits many sores tat re played with dice. In hs cnt, we will define our sample epee © S=(2,3.4,5,6,7,8,9, 10, 11, 12)8 CHAPTER 1 PROBABITY AND RANDOM VARIABLES: A REVIEW and itis seen to contain 11 discrete points. Next, we define the set of possible vents to be all subsets of $, including the null set and $ itself. Note that the tlements of S are elementary events, and they are disjoint, as they should be. ‘Also, P(S) = 1. Finally, we need to assign probabilities to the events. This could ‘be done arbitrarily (within the constraints imposed by the axioms of arobability), but in this case we want the results of our formal analysis to coincide with the relative-frequency approach, Therefore, we will assign probabilities to the ele- ‘ments in accordance with Table 1.2, which, in tum, indirectly specifies proba- bilities for all other events defined on $. We now have a propzrly defined probability space, and we can pose a variety of questions relative to the single throw of two die. ‘Suppose we ask: What isthe probability of throwing either a7 or an 11? From Axiom 3, and noting that “7 or 11” is the equivalent of saying “7 U 11," swe have oo PO of II) = 35 34) "Next, suppose we ask: What isthe probability of not throwing 2,3, or 12? ‘This calls for the complement of event "2 or 3 or 12." which isthe set (4, 5. 6,7, 8,9, 10, 11], Recall that we say the event occurs if any element in the set ‘occurs, Therefore, again using Axiom 3, we have B44bS e645 HSH 3H PONot throwing 2, 3, o 12) ass) = 5 ‘Table 1.2. Probables fr Two Dee Example ‘Sum of Two Assigned Dice Probability 2 * 3 + 4 * 5 * 6 & 7 * 8 * 9 * 0 * " rs 2 * 1.8 /moMAnc PROBABILITY Figure 1.2 Vern agar fro avons A and & Suppose we now pose the further question: What is the probability that to “4s are thrown? In our definition of the sample space, we suppressed the identity of the individual dice, so this simply is not a proper question for the probability space, as defined. This exemple will be continued, bt Brot we digress for a moment to consider intersection of evens. @ In addition to the set operations of union and complementation, the oper- ation of intersection is also useful in probability theory. The intersection of two ‘events A and B is the event containing points that are common 10 both A and BB. This i illostrated in Fig, 1.2 with what is sometimes called a Venn diagram, ‘The points lying within the heavy contour comprise the union of A and B, denoted as A UB or “A or B." The points within the shaded region are the ‘event “A intersection B,” which is denoted AB, or sometimes just “A and B."* The following relationship should be apparent just from the geometry of the Venn diagram: PIAU B) = PLA) + P(B) ~ PIA B) 136) ‘The subtractive term in Eq, (1.3.6) is present because the probabilities in the overlapping region have been counted twice in the summation of P(A) and P(B) ‘The probability P(A AB) is known as the join probability of A and B and will be discussed further in Section 1.4, We digress for the moment, though, and look at two examples : EXAMPLE 1.7 (continued) a ‘We return to the two-dice example. 
Suppose we define event A as throwing a 4, 5,6, or 7. Event B will be defined as throwing a 7, 8, 9, 10, or 11, We now jose two questions: * tn many references, te naaen for “A union Bis “A+ 8" and “A ereston Bi shorened to st "AB" We wl be proceeding tothe say of random vals shorty and then he chance ‘cdeencer wl be rete to eal nanbors, ot "Hinge Thu, med to oi aso, We Wl Stay with be more cmberoae aon of U al "fr bt operas nd veseve XY and XY {0'tean he sul aimee operations on val vals.10 (CHAPTER | PROBABILITY AND RANDOM VARIABLES: A REVIEW (@) What isthe probability of event “A and B” (ic, A.B)? (b) What is the probability of event “A or B” (ic. AU BY? “The answer t (a) is found by looking for the elements of the sample space that fre common to both events A and B, Since the number 7 is the only common clement, PAB) ‘The answer to (b) can be found either by itemizing the elements in AU B or from Eq, (1.3.6). Using Bq, (1.3.6) leads to PAU B) = Play + P(B) ~ PAB) m6 ae a 8 3 “This is easly verted using the direct itemization method e EXAMPLE 1.8 Le us reconsider the dare-trovingexpeviment. We mph conser a simple te where We draw senle of a Ron the val It the ployer dat ands ‘erwin he eel, te player win i the dr lands side, the player Tou We pe the simple question, What th probabil tht the player wins tthe single tw oft dat? in hs example the ample space i alte pons om the wal (We assume thar the ployer em it eat i te wall) This here ae a ae number of point in ou sample apace S. Tah game, we define ony wo evens on the sample space—etet player win: vent layer loses Tis itinat eto events beats The br optatos of union, ineseton, and complement yield dened events tie et “rhe assignment of probable in his case most be made ctl on the ovo creer tan onthe sample sacs bse he pay of ig ‘Sera paral ott oo the wall seo an ths ls us nothing aboot Sens Te a'e not of rensm, we might speculate Ut we ae devising a Simpl gambling game for the howe, one ht the pao il ny and that Sil ase be petle tote howe, Our observations inate tht he typical player vw about TO percent of his or her das within a ele of mas Testi, the hs are moe ores unfonnly spaced within the cic 1 would then be arora to asin probabilities fo the two evens 35 PGHit lies on oF within Ry) = 1 (Hit lies outside Ri) = 9 14 JOINT AND CONDMONAL PROBABILITY. 14 If the establishment were to offer 5 to 1 odds, this game might well please the players and atthe same time produce revenue for the house. Before we leave this example, it should be mentioned that the radius R may be treated as a parameter, and hence we have the structure for looking at a whole family of problems, not just a single one. To add variety tothe game, the pro- prictor might occasionally wish to decrease the diameter of the circle and in- crease the odds. Clearly, if the hits within the circle are nearly uniformly distributed, reducing the area by a factor of 2 will reduce the relative frequency fof occurrence by a similar factor. Yet the corresponding reduction in radius is ‘only a factor of V2, which should make the game “look” attractive. The ap- ‘ropriate probability assignments to this ease would be .05 for “win” and .95 for “lose." Presumably, the house could offer 10 to 1 odds aad still net the same ‘average return as received on the Sto-I game. a JOINT AND CONDITIONAL PROBABILITY In complex situations itis often desirable to arrange the elements ofthe sample space in arrays. 
This provides an orderly way of grouping related points in the space and is especially useful when considering successive trials of similar ex- periments, Consider the following example. EXAMPLE 1.9 The Experiment Draw a card from a deck of 52 playing cards. Then replace the card, reshuffle the deck, and draw a card a second time. This is known as sampling with replacement. The sample space for ths experiment, when viewed as a whole, consists of all possible pairs of cards forthe two draws. This amounts to (52)? of 2704 elements—clearly an unwieldy number of elements to keep ‘tack of without some systematic way of ordering things. Yet in spite ofits size, the sample space is quite manageable when viewed as an array as shown ia ‘Table 1.3. This is a two-dimensional array, but it should be obvious that the ‘Table 13_Joit Probabiies for Two Draws wth Replacement ‘Second Draw First Aceof King of Deuce of Dravwe Spades Spades ++ ‘ce of spades oe she King of spedes sbi sae (62 elemens) (Teal of 2704 entries) Deuce of clubs abe he oe12 CHAPTER PROBABILITY AND RANDOM VARLABLES: A REVIEW concept is easily extended to higher-order situations —conceptually at least. Had We specified three draws rather than to, we would have a thre>-dimensional array, and so forth. Thus, the idea of an n-dimensional array associated with m teals of an experiment is an important concept. Note that summing out the numbers along any particular row yields the probability of drawing that partic- ular card on the frst drayy irrespective of the result of the second draw. This leads to the idea of marginal probability, which will now be ccnsidered in a more general setting a Let A and B loosely refer to “first” and “second” chance experiments Time is not of the essence, so the experiments may be either successive oF simultaneous as the siuation dictates. Let A, Ay...» Aq denote disjoint events associated with experiment A and, similarly, By, By. ”- , By denote disjoint events for experiment B. This leads to the joint-probability array siown in Table 14, Note first that m does not have to equal a, and therefore the array is not necessarily square, Also, the so-called marginal probabilities ae svown in Table 14 tothe right and bottom of the joint-probabilty array. These ar the result of summing out rows or columns, as the ease may be, and the term marginal arose because these probabilities are often written in the margins outside the m % array. Clearly, summing out horizontally yields the probability of particular event in the experiment A, iespective of results of experiment B. Similarly, summing columns yields P(B,), PCP, and so forth. Also, we tacity assume that events Ay, As,---, Aq and By, B,.-, By are all-inclusive as wall as disjoint, so thatthe sum of the marginal probabilities, either vertically or Forizontaly, is unity. ‘Table 1.4 also contains information about the relative frequency of occur- rence of various events in one set, given a particular event in the other set. For example, look at column 2, which lists P(A, 1B), P(A: © Ba). «y Play B,). Since no other entries in the table involve Ba, this list of numbers gives the relative distribution of events Ay, As,» Aq given B, has occurred. However, the set of numbers appearing in column 2 is nota legitimate probability distri bution because the sum is P(B,), not unity. So, imagine 'renormalizing” all the entries in the column by dividing by P(B,). The new set of numbers is then P(A, 0 B)/PCB.), Pla, MBP), --. 
+ Pty B;)/PCB.), the sum is unity, ‘and the relative distribution corresponds to the felative frequency of occurrence Of Ay, Ay, -- 5 Aye given B,. This heuristic reasoning leads vs to the formal definition of conditional probability. ‘Table 144_Acay of Joint and Marginal Probabtis a | Event Event Event Marginal A 3, By 7 a Probeblites Even, | PAB) ANB) <= PAB) | PAD Event, | PA;NB) —PALNB) => PUB) | PAD Event Ae | PAB) PAW BD PAB) | PUA Margit PH) Pa) BD Sem = 1 Probabilities 14. JOINT AND CONDITIONAL PROBABILITY 13 ‘The conditional probability of A, given B, is defined as PUA, 8) PAB) = ay aan Similarly, the conditional probebilty of B, given A i nein Ay PBI) = Ray 142) kis tacily assumed in the above equations that P(B) and PCA) are not 2210 Otherwise, conditional probability is not defined. I should also be emphasized that conditional probability isa defined concept and isnot derived from eter concepts. The discussion leading up t Eqs. (1.1) and (142) was presented 10 five an intuitive rationale for the definition and was not intended a8 a poof A useful relationship is obtained when Eqs. (141) and (142) are com- bined Esch equation may be solved for the probability of 4 ntersestion Band the results equated This leads to Bayes rule (or Bayes theorem): PAB) =" (143) ‘This relationship is useful in reversing the conditioning of evens. Note thatthe Joint probability array P(A, 1B) contains all the necessary information for computing ail marginal and conditional probabilities. Conversely, if you know AB, and PCB) [or POBJA) and PCA], there is sufficient information to find Pa, 8) EXAMPLE 1.10 oo For variety, we now consider an urn problem. The urn contains two red and two black balls. Two balls are drawn sequentially from the urn without replacement, (a) What is the array of joint probabilities for the frst and second draws? (b) What is the conditional probability that the second draw is red, given the first draw is red? ‘To obtain the joint probability table, we first define a sample space con- sisting of all possible outcomes, including the identity of the individual balls The four balls will be referred to as Red 1, Red 2, Black 1, and Black 2, The Joint probability aray forthe first and second draws is given in Table 1.5. Note the effect of the “without replacement” statement. It gives rise to zeros along the major diagonal, because drawing Red 1 on the first draw precludes Red I being drawn again on the second draw, and so forth. In effect, there are really only 12 nontrivial outcomes for this experiment, and we assume they are all equally likely In the original problem, there was no mention of retaining the individuality of the two red and two black balls—only “red” and “black” were specified. ‘Therefore, we can consolidate outcomes in accordance with the partitioning44 CHAPTER PROBABILITY AND RANDOM VARIABLES: A REVIEW “Table 4.5 _Joirt Probability Table for Four Sal Un Example ‘Second Draw First Draw Red 2 Bick 1 Black 2 Red t Red 2 lack 1 Black 2 shown by dashed lines in Table 1.5. This leads to the two-by-two array shown in Table 1.6. 
This, then, is the answer to pat (a) For the conditional probability we will use the basic definition given by Eq, (1442), Writing this out explicitly for the conditional situation posed in {question (b), we have (Second draw redFist draw red) __ P(First draw red and second draw red) = ‘POFiest draw red) 4a) ‘The numerator of Bq, (1.4.4) is the upper-left entry in Table 1.6, The denomi nator is the marginal probability obtained by suraming the elements of the first row in Table 1.6, This yields (Second draw sedFirst draw red) 4s) ‘This is the solution to part (b). Note this checks with the result one would ob- tain by considering the three balls that remain in the um after a red one is withdrawn, a ‘Table 1.6 Joint Probebity Table Reduced to T-bycTwo ay Second Draw First Draw Red Black Red oy Back iy 15 NOEPENDENCE 45: 15 INDEPENDENCE In qualitative terms, two events are sai to be independent if the occurrence of ‘one does not affect the likelihood of the other. If we toss two coins simultane- ously, we would not expect the outcome of one of the coins to affect the other Similarly, if we draw a card from a deck of 52 playing cards, then replace i, reshuffle, and draw a second time, we would not expect the second outcome to be affected by the first. However, if we draw the second card without replacing the first, itis @ much different matter. For example, the probability of drawing fn ace on the second draw with replacement is 3. However, if we draw an ace fon the first draw and do not replace it the probability of getting an ace on the second draw is only 3. Jn the “without replacement” experiment, the outcome fof the first draw certainly affects the chances on the second drav, so the two events are not independent Formally, events A and B are said to be independent if PLA B) = PAYPCB) asa Als, it should be evident from Eq, (1.5.1) and the defining equations for con- tional probability, Eqs. (14.1) and (1.4.2), that if A and B are independent Pala) ra For A and B POA) %B)J independent only ‘We might also note that the defining equation for independence, usually provides the simplest test for independence. This is illustrated with two ‘examples, EXAMPLE 1.11 ee ‘The joint probability array for the simultaneous toss of two coins is given in ‘Table 1.7. The marginal probabilities are also shown in the “margins” wid cheir significance stated in wards in parentheses. Note that each of the four joint probabilities ofthe array may be written asthe product oftheir espective mar: ‘ginal probabilities, Thus, all events are independent in this case. a EXAMPLE 1.12 a Let us reconsider the urn experiment described in Bxample 1.10, Section 1.4 Recall thatthe two balls were withdrawn sequentially without replacement, and that this led to the joint probability array shown in Table 1.8. Te marginal probabilities ae also included, just asin the previous example, However, inthis ‘ease none ofthe joint probabilities can be wetten as the product of the respective ‘marginal probabilities. Thus, all event pairs are dependent. 5 ‘To recapitulate, we say events A and B are independent if their joint prob- ability can be written as the product of the individual total probabilities, P(A) and P(B). Otherwise, they are said to be dependent.416 CHAPTER PROBABILITY AND RANDOM VARIABLES: A REVIEW “Table 1.7_Joint and Margin Probables for Tess of Two Cains Second Coin Fist Coin Heads ‘Heads + £ Prob fst coin heads) “Tlls + + $ Prob. 
fist ‘coin i als) ¥ (Prob second feats) RANDOM VARIABLES In the study of muiselike siguals, we are nearly always dealing with phyzical {quantities such as voltage, torque, distance, and s0 forth, which can be measured in physical units. In these eases, the chance occurrences are related to real num ‘ers, not just “things” like heads or tails, black balls or red balls, and the like. ‘This brings us to the notion of a random variable. Let us say we have @ con- ‘ceptual experiment for which we have defined a suitable sample space, an ap- propriate set of events, and a probability assignment for the set of events. A Fandom variable is simply a function that maps every point in the same space (ings) on to the real line (numbers). A simple example of tis mapping is sown in Fig, 1.3, Note that each face of the die is embossed with a patter of Gots, not a number, The assignment of numbers is our own doing and could be ‘almost anything. In Fig. 1.3 we just happened to choose the most common rnumerical assignment, namely, the sum of the number of dots, tat this was not necessary. Presumebly, in our probability space, probabilities have been assigned to the events in the sample space in accordance with the basic axioms of proba bility. Associated with each event in the original sample space (things) there will be a cortesponding event in the random-variable space (nimbers). These ‘Table 1.8. Joint and Marga Probabiies for Un Banoo ‘Second Draw First Draw Red Black Red + yo¢ Black $ toot + + 18 RANDOM VARIABLES 17 Samp ace ‘gure 13 Mapping fre tow of one will be called equivalent events, and itis only natural that we should assign probabilities tothe random-variable events in the same manner as forthe original sample-space events. Stated formally, we have ‘equivalent event on the real line) = P(corresponding event in the original sample space) (1.6.1) ‘Two more examples will illustrate the concept of a random variable further, ANOLE (12 “The mapping that defines a random variable must fit the chance situation at hand if it is to be useful. Sometimes this leads to unusual but perfectly legitimate functional relaionships. For example, in the game of pitch, a portion of the scoring is done by summing the card values of the cards each player takes in ‘wicks during the course of play. The card values, by arbitrary rules of the game, are a8 follows: Card of Any Suit Card Value 2 through 9 0 10 10 Tack 1 Queen 2 King 3 ‘Ace 4 ‘Thus, in exploring your chances relative to this aspestof the game, it would be appropriate to map the 52 points in the sample space (for a single card) into18 (CHAPTER 1 PROBABILITY AND RANDOM VARIABLES: A REVIEW nea note ace Teh man ot Gs ol Baron EEEeae] Figure 14 Mapping fr ptch example ‘eal numbers in accordance with the above table. This is also shown in Fig. 1.4. [Note that multiple points in the sample space map into the same number ofthe real line, This is perfectly legitimate. The mapping must not be ambiguous in going from the sample space to thé real line; however, it may be ambiguous going the other way. That is, the mapping need not be one-to-one. 5 EXAMPLE 1.14 In many games, the player spins a pointer that is mounted on a circular card of | some sort and is free «0 spin about its center. Ths is depicted in Fig. 1.5 and the circular card is intentionally shown without any markings along its edge. ‘Suppose we define the outcome of an experiment asthe location on the periphery ff the card at which the pointer stops. 
The sample space then consists of an infinite number of points along a circle. For analysis purposes, we might wish to identify each point in the sample space in terms of an angular coordinate ‘measured in radians. The functional mapping that maps all points on a circle to naar Figure 15, Mopping fr sine palrter oral 47 [LT PROBABILITY OSTRIGUTION AND DENSITY FUNCTIONS. 19 comesponding points on the real line between 0 and 2 would then define an appropriate random variable. a PROBABILITY DISTRIBUTION AND DENSITY FUNCTIONS ‘When the sample space consists of a finite number of elements, the probability assignment can be made directly on the sample-space elements in accordance ‘with what we feel to be reasonable in terms of likelihood of occurrence. This then defines probabilities for all events defined on the sample space. These probabilities, in tum, transfer directly to equivalent events in the random-variable space. The allowable realizations (i., real numbers) in the random-variable space are elementary equivalent events themselves, so the reult isa probability associated with each allowable realization in the random-variable space. The sum of these probabilities must be unity, just as in the original sample space, but the distribution need not be the same. A continuation of Example 1.13, Section 1.6, will illustrate this POMPE 0 The mapping from the sample space tothe real Hine for the pitch card game was shown in Fig. 14. Let us assign equal probabilities forall elements in the orig- inal sample space. The probabilities for the allowable realizations inthe random- variable space would then be: Random Variable Realization Probability ° z 1 # 2 4 3 & 4 # 0 - Note thatthe probabilities are not distributed uniformly in the random-variable space. The end result of the mapping is a set of real numbers representing the possible realizations of the rancom variable and a corresponding set of proba- bilities that sum to unity. Once this correspondence has been established, the ‘original sample space is usually ignored, 5 ‘The random variable of Example 1.15 is an example of a diserete random ‘variable in tat is allowable realizations are discrete (.e., countable) rather than continuous. The associated discrete set of probabilities is sometimes referred to as the probability mass distribution ar simply probability distribution, We also have occasion to work with continuous random variables. AS a ‘matter of fact, the usual electronic noise that is encountered in a wide variety20 CHAPTER PROBABILITY AND RANDOM VARIABLES: A REVIEW ‘of applications is of this type, that is, the voltage (or current) may assume a Continuous range of values. The corresponding sample space then also contains fan infinite number of points, so we cannot assign probabilities directly on the points of the sample space; this must be done on the defined events. We will Continue the spin-the-pointer example of Section 1.6 (Example 1.14) to illustrate hhow this is done. Let X denote a continuous random variable corresponding to the angular position ofthe pointer after it stops. Presumably, this could be any angle between 0 and 2 2 radians; therefore, the probability of any particular postion is infin- itesimal. Thus, we essign a probability to the event thatthe pointer stops within ‘certain continuous range of values, say, between O and @ radians If ll positions fare equally likely, i i reasonable to assign probabilities as follows. 
(within the admissible range of 0): 0s 0520 any [Note thatthe probebility assignment is a function of the parameter 0 and the function is sketched in Fig. 1.6. The linear portion of the function between O and 2m is due to the “equally likely” assumption, “The function sketched in Fig. 1.6 is known as a cumulaive distribution Junction (or just probability distribution function), and it simply describes the ‘probability assignment as it reflets onto equivalent events in the -andom variable Space. Specifically, the probability distribution function associated with the ran- ddom variable X is defined as (8) = PX 0) 72) 1.8 EXPECTATION, AVERAGES, AND CHARACTERISTIC FUNCTION 21 2FYO1, as Ove. 3. F,(0) is a nondecreasing function of 6. ‘The information contained in the distribution function (eg. Fig. 1.6) may also be presented in derivative form. Specifically, lt f4(@) be defined as a HO =F ‘ 73) “The function (0) is known as the probability density function associated with the random variable X. The density function for the pointer example is shown in Fig. 1.7. From properties 1, 2, and 3 just cited for the distribution function, it should be apparent that the density function has the following properties: 1. f(0) is @ nonnegative function. 2. FAO do = 1 1 should also be apparent from elementary calculus that the shaded area shown in Fig. 1.7 tepresente the probably that X Hes between @, and @ IF 0, and O ave separated by an infinitesimal amount, 8, the area is approximately (0) ‘80, and thus we have the tem probability density. “The probability density and distribution functions are alternative ways of desctibing the relative distribution of the random variable. Both functions are Useful in random-variable analysis, and you should always keep in mind the derivative integral relationship between the (wo. As a matter of notation, we il tnomally use an uppercae symbol for the disuibution function and the core Sponding lowercase symbol forthe density function, The subscript in exch ease indicates the random variable being considered. The argument of the function is adummy variable and may be almost anything. where 0 isa parameter representing a realization of X 18 Juste det cs ose hi ton etal he EXPECTATION, AVERAGES, AND dom-variable space, the original sample space is usually ignored. It should Clear from the definition that a probability distribution function always has the oer rere ne eure on following properties: ‘The idea of averaging is so commonplace that it may not seem worthy of elab- 1 FY) 0, a5 0 ~ oration. Yet there are subtleties, especially as averaging relates to probability ‘Thus we need to formalize the notion of average, exer fi © Eg dole) Figure 17 Probably densty hncton er parer Figure 1.6. Prcbecity tition unten fr ptr example sane.22 CHAPTER PROBABILITY AND RANDOM VARIABLES: A REVIEW Perhaps the firs thing to note is that we always average over numbers and not “things.” There is no such thing as the average of apples and oranges. When ‘we compute a students average grades, we do not average over A, B, C, and so fon; instead, we average over numerical equivalents that have been arbitrarily assigned to each grade. Also, the quantities being averaged may or may not be {governed by chance. In ether case, random or deterministic, the average is just the sum of the numbers divided by the number of quantities being averaged. 
In the random case, the sample average or sample mean of a random variable X is defined as gp ekit hte x 7 asp where the bar over X indicates average, and X,, Xs...» are sample realizations ‘obtained from repeated trials of the chance situation under consideration. We ‘ill use the terms average and mean interchangeably, and the adjective sample serves a a reminder that we are averaging over a finite number of tials as in Eq. 0.8). in the study of random variables we also like to consider the conceptual average that would occur for an infinite number of trials. This idea is basic to the relative-frequency concept of probability. This hypothetical average is called expected value and is aptly named; it simply refers to what one would “expect” in the typical statistical siuation. Beginning with discrete probability, imagine ‘random variable whose n possible realizations are x), »X. The corre- sponding probabilities are py, Pas.» Pa If we have NV trials, where Nis lege, ‘We would expect approximately p,N 2,5, p's, ete. Thus, the sample average ‘would be x eM Oly + ON co ‘This suggests the following definition for expected value for the discrete prob- ability case: —o re) ee, ce ee en peesrneetr= m0 [ixiod 8s It should be mentioned that Eqs. (1.8.3) and (1.8.4) are definitions, and arguments leading up to these definitions were presented to give a sensible‘re tionale for the definitions, and not as a proof. We can use these same arguments for defining the expectation ofa function of X, as well as for X. Thus, we have the following: 1.8. EXPECTATION, AVERAGES, AND CHARACTERISTIC FUNCTION 23, Discrete case: 0@00) = Spt) ass) Continuous case: B90) = [_s00pte) ae aso) AAS an example ofthe use of Ea. (1.8.6), let the fonction g() be X2. Bguation (1.8.6) [or its discrete counterpart Eq, (18.5) then provides an expression for the kth moment of X, that is, B08) = [xine ae (87 ‘es manent os of pel mee, aise by au = [ioe ass ‘The first moment is, of course, just the expectation of X, which is also knowa as the mean or average value of X. Note that when the term sample is omitted, wwe tacitly assume that we are referring to the hypothetical infnite-sample average. We also have occasion to look at the second moment of X “about the This quantity is called the variance of X and is defi Variance of X = E[(X — E00?) (1.8.9) In a qualitative sense, the variance of X is a measure of the dispersion of X about its mean. Of cours, if the mean is zero, the variance is identical to the second moment ‘The expression for variance given by Eq: (1.8.9) can be reduced to a more ‘convenient computational form by expanding the quantity within the brackets ‘and then noting thatthe expectation of the sum is the sum of the expectations. ‘This leads to Var X = E[X* — 2X - EX) + (E00) = 60%) ~ (E007 (810) ‘The square root of the variance is also of interest, and it has been given the mame standard deviation, that is, Standard deviation of X = Variance of X asin‘24 CHAPTER PROBABILITY AND RANOOM VARIABLES: A REVIEW EXAMPLE 1.16 Let X be uniformly distributed in the interval (0, 22). This leads to the proba Dility density function (see Example 1.14), 1 oss<2e t, oxr<2 0, elsewhere fle) = Find the mean, variance, and standard deviation of X. “The mean is just the expectation of X and is given by Eq. ‘1.84). Mean of age (1.8.12) [Now that we have computed the mean, we are in a posiion to find the variance from Eq, (1.8.10). 
var-[oda- a.8.13) “The standard deviation is now just the square root of the variarce: Standard deviation of X = Var (el hat = toe 1.8.14) fate Se (isa) ‘The characteristic function associated with the random variable X is de- ee cuss 1W-can be seen that y(a) is just the Fourier transform of the probability density 1 funetion with a reversal of sign on a. Thus, the theorems (and tables) of Fourier transform theory can be used to advantage in evaluating characteristic functions and their inverse. 19 NORMAL OR GAUSSIAN RANDOM VARIABLES 25 ‘The characteristic function is especially useful in evaluating the moments of X. This can be demonstrated as follows. The moments of X may be written £09 = [409 de 816) ma [2100 « ec. asin Now consider the derivatives of y(w) evaluated at w [#2] .- [Panne] = [anor sie [ss).- [fp _cnrnte a] ao [Peto a ee. asa9 Tecan be sen that aby aun ~ [2e 829 alee bee) ~A%8 = os as2y ‘Thus, with the help of a table of Fourier transforms, you can often evaluate the ‘moments without performing the integrations indicated in their definitions. [See Pres 216) and 20 er ppiston cf the careers neon ane ope 19 NORMAL OR GAUSSIAN RANDOM VARIABLES. ‘The random variable X is called normal or Gaussian if its probability density function is26 CHAPTER 1 PROBABILITY AND RANDOM VARIABLES: A REVIEW file) aeoel ast m'| as.) [Note that this density function contains two parameters my and a, These are the random variable’s mean and variance, Tat is, forthe fy specified by Eq, (9), [xi ae = mg 192) and Pee = mori ae = o 9a) Note that the normal density function is completely specified by assigning au merical values to the mean and variance. Thus, a shorthand notation has come into common usage to designate @ normal random variable. When we write X~ Nitty, 9) asa) ‘we mean X is normal with its mean given by the first argument in parentheses ind its vatiance by the second argument. Also, as a matter of terminology. the terms normal and Gaussian are used! interchangeably in deseribing normal ran- tdom variables, and we will make no distinction between the two. The normal density and distribution functions are sketched in Figs. 18a and 1.8b, Note that the density function is symmetric and peaks at its mean, ‘Qualitaively, then, the mean is seen to be the most likely value, with values on tithe side ofthe mean gradually becoming less and less likely as the distance from the mean becomes larger. Since many natural random phenomena seem to cxhbit tis central-tendency property, atleast approximately, the normal disti- bution is encountered frequently in applied probability. Recall thatthe variance ise measure of dispersion about the mean. Thus, small «corresponds toa sharp peaked density curve, whereas lage will yield a curve with a flat peak. “The normal distribution function is, of course, the integral of the density function re [poods= [pe enn{ shu mor] ae 099) ‘Unfortunately, this integral cannot be represented in closed form in terms of ‘lementary functions. Thus, its value must be obtained from tables or by mu teri! integration. A bref tabulation for zero mean and unity variance is given in Table 19. A quick glance atthe table will show thatthe distribution is very close to unity for values of the argument greater than 4.0 (i.e. 40). In our table, ‘which wes taken from Feller (5), the tabulation quits at about this point. In some ‘pplications, though, the difference between Fy(x) and unity [i.e the area under the “tail” of G2] i very much of interest, even though iti quite small. 
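Although no closed form exists, the normal distribution function and its tail areas are easy to evaluate numerically, which is the route taken in the discussion that follows and in Problem 1.40. The sketch below is only illustrative and is not from the text: it uses MATLAB's quad with an arbitrarily chosen finite upper limit standing in for infinity (older releases require the integrand as an inline object or an m-file rather than an anonymous function, and newer ones offer integral as a replacement for quad).

```matlab
% Standard normal density (zero mean, unity variance)
phi = @(x) exp(-x.^2/2)/sqrt(2*pi);

% Tail probability 1 - F(x) for a few values of x.
% The infinite upper limit is replaced by a finite one (10 here, chosen
% arbitrarily); some experimentation with this limit is advisable.
upper = 10;
for x = [1.0 2.5 3.0 4.0]
    tailArea = quad(phi, x, upper);
    fprintf('x = %4.1f    1 - F(x) = %.6e\n', x, tailArea);
end

% Cross-check: 1 - F(x) = 0.5*erfc(x/sqrt(2)), using MATLAB's built-in erfc
```

The results can be compared against Table 1.9, or against the essentially exact values returned by the complementary error function noted in the last comment.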
Tables 4.9 NORMAL OF GAUSSIAN RANDOM VARIABLES 27 Figure 1.8.) Namal denety net. 2) Narn auton eto, such the one give ra ot of much se in sch aes, bease tbe a Of si lined od he resalion poor. Funai wavy ex 1 igre the ocal nny con mealy ising sft sec ULB ga dn ring tl fh ety Te, tert inet a) fom +t sme ge ine pp lie Gepreseting tian ovate 116) ically. Sone tena enperinaon ie tly necessary in detsiing a uae upert t l Jldthe dad Scluacy and te Game ine wl allow vesoably ft convergence of the Si egatn Ts mans an ewe penn mua to pin demining an sppopise pe in inte egaion (see Problem 1.40*). ropa vee ee The merc! intgation off, alo recommende for ene the probly that X ies betwen any fin iis swell sin eal the Gerbudew” pobin. The acury an eohon biel fom mumena fegraton at ea alvaye bet than thee abune fom whles Fr xa ple sy we van the probly that 23 < X < 3.0. From Table 19 we ge PI25
2 or z <2, there is no overlap in the pulses 80 their product is zero. When —2 = z = 0, there is @ nontrivial overlap that {increases linearly beginning at z = ~2 and reaching a maximum at z = 0. The ‘convolution integral then increases accordingly as shown in Fig. 1.17, A similar argument may be used to show that (2) decreases linearly in the interval where 0 = z= 2 This leads to the triangular density function of Fig. 1.176. ‘We can go one step further now and look at the density corresponding to the sum of three random variables. Let W be defined as WexeyeV 38) ‘where X, ¥, and V are mutually independent and have identical rectangular den- Sites as'shown in Fig, 1.174, We have already worked out the density for X + Y, so the density of W is the convolution of the two functions shown in Figs. L'r7a and 1.176, We will leave the details of this as an exercise, and the result is shown in Fig, 118, Each of the segments labeled 1, 2, and 3 is an arc of @ parabola. Notice the smooth central tendency. With a lite imagination one can fee a similarity between this and a zero-mean normal density curve. If we were to go another step and convolve a rectangular density with thet of Fig, 1.18, we ‘would get the density for the sum of four independent random variables. The ‘resulting function would consist of connected segments of cabic functions ex: tending from —4 to 4, ts appearance, though ot shown, would resemble the rhormal curves even more thaa that of Fig. 1.18. And on and or—each additional ‘convolution results in & curve that resembles the normal curve more closely than the preceding one. 449. SUM OF INDEPENDENT RANDOM VARIBLES 41 vied mn | ! 3 so | 1 genet Fgura 1.10. Probably deny the sin of tee dependent anon (arabes wn ntl elu deny eters ‘This simple example is intended to demonstrate (not prove) that a super position of independent random variables always tends toward normality, re- ‘ardless of the distribution of the individual random variables contributing to the sum. This is known as the central limit theorem of statistics. I is a most remarkable theorem, and its valiity is subject to only modest restrictions (3). Tn engineering applications the noise we must deal with is frequently due to a superposition of many small contributions. When this is so, we have good reason to make the assumption of normality. The central limit theorem says to do just that. Thus we have here one of the reasons for our seemingly exaggerated in- terest in normal random variables—they are a common occurrence in nature, a EXAMPLE 1.21 ___ ————— Let X and Y be independent normal random variables with zero means and variances a} and or}. We wish to find the probability density function for the sum of X and Y, which will again be denoted as Z. For variety, we illustrate the Foutier transform approach. The explicit expressions for fx and f, are 1 eng Si = ae 139) 1 ene WO = Fare (1.13.10) [Note that we have used # as the dummy argument of the functions. Its of no consequence because itis integrated ovt in the transformation to the domain. Using Fourier transform table, we find the transforms of jy and f, to be slp) = eee 13.10) Lf] = en (1.13.12) Forming their product yields SLAdeL fl = eteitenene (1.13.13) ‘Then the inverse gives the desired fg42 CHMPTEN | PROBABILITY AND RANDOM VARIABLES: A REVIEW “U4 TRANSFORMATION OF RANDOM VARIABLES 43 [etetteneny [Ose a, for dy postive i oo! te) du = | (aay cal Vinita ee ! 
=f a for dy negative Note thatthe density function far Z is also normal in form, and its variance is a ity (One of the subtleties of this problem should now be apparent from Eq, (1.14.4) If postive de yields negative dy (ie., a negative derivative), the integral of f, ‘must be taken from y + dy to y in order to yield a positive probability. This is the aunt of interchanging he ints and reversing the sign as shown in Ba 14.4), ‘The differential equivalent of Eq. (1.14.4) is ab okt ot (sas) ‘The summation of any numberof random varisbis can always be thought of a1 soqunce of summing operations on two vals, therefor, sould te car tat sonming any nomber ef independent nomal random variables | Tels 04 nonmalranom vara. This ae emale result can be gener 0) de = fy aus | alized further to include the case of dependent normal random variables, which ‘vith a simple single-input, single-output situation where the input-output rela- tionship is governed by the algebraic equation y=) aay Here we are interested in random inputs, so think of x as a realization of the input random variable X, andy as the corresponding realization of the output ¥. ‘Assume we know the probability density function for X, and would like o find the corresponding density for ¥. I is tempting to simply replace x in fy(x) with its equivalent in terms of y and pass it off at that. However, it is not quite that Simple, 28 will be seen presently Fit, let us assume that the transformation g(x) is one-to-one for all per- rissibie x By this we mean that the functional relationship given by Eq. (1.14.1) can be reversed, and x can be written uniquely as a function of y. Let the “reverse” relationship be x= hy) 1142) “The probabilities that X and ¥ lie within corresponding differential regions must be equal. That i, POX is between x and x + de) = PCY is between y and y + dy) (1.143) we will discuss later. a Where we have tacitly assumed dx to be positive. Also, x is constrained 10 be My). Thus, we have 1.14 |as| ‘TRANSFORMATION OF fo) = f {ncmts>) a6 RANDOM VARIABLES or, equivalently, 'A mathematical transformation that takes one set of variables (say, inputs) into nother set (Say, outputs) isa comion situation in systems analysis. Let us begi PO oon are Where #'()) indicetes the derivative of h with respect to y. Two exemples will now be presented EXAMPLE 1.22 FFind the appropriate output density functions for the ease where the input X is ‘MO, o) and the transformation is (®) y= Ke (Risa given constant) 148) yar (149) Over (1.14.10) ‘We begin with the scale-factor transformation indicated by Eq (1.14.8). We fist solve for x in terms of y and then forma the derivative. Thus, aaa fa a We can now obtain the equation for f, fom Eq, (1.14.6). The result is 14.12){44 CHAPTER 1 PROBABILITY AND RANOOM VARIABLES: A REVIEW lo vee | ar 114.13 GO) * ig Vimo "| - Bot (4.14.13) (rewriting Eq, (1.14.13) in standard norma form yields a 1.14.14) LO) = Tea it a (1.14.14 It can now be seen that transforming a zero-mean normal random variable with ‘a simple scale factor yields another normal random variable witha corresponding Seale change in lis standad deviation, Ie important to note chat normality is preserved in a linear transformation. ‘Next, consider part (b). This transformation is also one-to-one, so solving for x yields xe 4.as) In Bq, (1.14.15) we take the positive real root for y > 0 and the negative real root for y < 0. 
The derivative of x is then (1.14.16) ce a “The quantity y*” can be written as (9, 0 97” is always postive, provided ‘yl i real. This is consistent with the geometric interpretation of y = 2°, which Always has a nonnegative slope. The density function for ¥ is then. 1 se ole 414.17) SOD" S8 Vaa, a4) Since this does not reduce 10 normal form, we have here an example of a ‘nonlinear transformation that converts a normal random variable to non-Gaussian orm, 1.14 TRANSFORMATION OF RANDOM VARUSLES 45, ‘The transformation y = 2 for part (c) is sketched in Fig, 1.19. Its obvious from the sketch that two values of x yield the same y. Thus, the Function is not ‘one-to-one, which violates one ofthe assumptions in deriving Eq. (1.14.6). The problem is solvable, though; we simply must go back to fundamentals and derive ‘2 new fy()) relationship. Note that we can consider y = 2° in terms of two branches. These will be defined as te 20° (Branch 1) Vj, <0 Branch 2) ele) Now think of perturbing y a positive differential amount dy as shown in Fig. 1.19. This results in two corresponding differential regions on the x axis. Thus, wwe have the following relationship for probabilities: PU S15 Jo + dh) = Ply SX =H + dn) + Pl-m+dyeXs-x) (114.19) Note that de, (Branch 1) is positive and dx, (Branch 2) is negative. Next, Bq. (1.14.19) can be rewritten in tems of integrals as ‘Sle a [nae [2 sere assem Wenow aote that asap Substitaing Eq (114-21) into Bg. (1.1420) and ting the range of integration be incremental yield Figure 1.49. Steno y = 2848 CHAPTER “PROBABILITY AND RANDOM VARIABLES: A REVIEW 1 HY) Seon) dy 1 stad, 20 (1.1422) HKD Fear w= O Now, since yp can be any y greater than zero, Eg. (1.14.22) reduces t0 = LHD 1.1423) LD GAD. 0 (1.1423) From the equation y = 22, we see that no real values of x map into negative y. ‘Therefore, (0) = 0 for nogative y. The final fy is then eAvD. 20 6, fo) 29, y
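As a side check (not part of the text), the transformation result for y = x² is easy to verify by Monte Carlo. With X taken as N(0, σ²), the density just derived becomes $f_Y(y) = e^{-y/2\sigma^2}/(\sigma\sqrt{2\pi y})$ for y ≥ 0, and a normalized histogram of squared samples should match it; the value of σ, the sample size, and the bin count below are arbitrary choices.

```matlab
% Monte Carlo check of the y = x^2 transformation
sigma = 2;                           % arbitrary choice for this check
N = 100000;
x = sigma*randn(N,1);                % samples of X ~ N(0, sigma^2)
y = x.^2;                            % corresponding samples of Y = X^2

% Normalized histogram of Y (so that it approximates a density)
[counts, centers] = hist(y, 60);
binw   = centers(2) - centers(1);
pdfEst = counts/(N*binw);

% Density derived above, specialized to X ~ N(0, sigma^2)
yy = linspace(binw/2, max(centers), 200);
fy = exp(-yy/(2*sigma^2))./(sigma*sqrt(2*pi*yy));

bar(centers, pdfEst, 1), hold on
plot(yy, fy, 'LineWidth', 2), hold off
xlabel('y'), ylabel('f_Y(y)')
```

Agreement is good except in the first bin, where the density is unbounded as y approaches zero and a finite-width histogram necessarily undershoots.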
4.5) using both MATLAB’s quad and quad8 integra- tion functions, and compare the results with those obtained from ‘Table 1.9. Note that in the integration you will have to experiment with a ‘numerical upper limit that will give @ good approximation to the infinite limit, Feller (5) gives an approximate formula for the area under the “tail” of the normal density function: ee, This can be used to find a suitable stating point in the tia (b) Feller (5) states thatthe approximation made in Eg, (P1.40) improves, rapidly as x inereases, To verify ths statement (or otherwise), calculate 1 — Py(a) from Eq, (P1.40) for x = 4.5, and then again for x = 6.5. Compare the resulls with the more accurate values obtained using MATLAB, PROBLEMS 69 [Note: The calculated probability is reduced by roughly five orders of magnitude in going from 4.5 to 6.5, but the improvement in the accuracy of Eq. (P140) is not nearly this dramatie-) LAL A probability formula for calculating the result of m tials of a binary satstcal experiment is given in Problem 1.10. This formula is deceptively sim- ple. If and & are small, the indicated probability can be computed in a matter fof seconds with just a hand-held calculator. However, the calculation ean be both laborious and tricky when & and m are large. Numerical problems due t0 overflow/underflow can easily be encountered under these conditions, because keand (n ~ B) appear in the equation as exponents. Thus, the formula given by Eq, (P1.10) should be programmed with care. The following example is intended to illustrate a case where m is relatively large. ‘One of the test procedures for GPS navigation equipment involves simu- lating a satellite range ervor and then testing the receiver equipment to see if it cean reliably detect the range error with its built-in measurement consistency- checking algorithm. (See Chapter 11 for a bref discussion of the GPS navigation system.) The specifications for detecting the satelite range eror are quite severe, fd they sequire that the probability of & missed detection (inthe presence of noise) be no greater than 001 (14), ‘Consider a simulation test where there are to be 5000 wials. There will be only two possible results from each tral: (1) equipment fails to detect the sat- ellite range error (can be interpreted as “error” in Eq. P1.10), and (2) equipment successfully detects the range error (interpreted as “no entor” in Eq, PI. 10). Suppose that the testing procedure is 10 be designed to give the recciver- equipment manufacturer a generous chance of passing the 5000-tral test. How ‘many failures (ic. “errors” in the binary probability formula) should the equip- ‘ment manufacturer be allowed in order that the probability of passing the test bbe approximately 987 [init Write and execute @ MATLAB program to calculate the probabilities of ‘occuerence of exactly 0, 1, 2,..., m failures, and then sum the results to get the curmulative probability of obtaining no more than m failures in the 5000-tial test, The m parameter is variable and will have to be determined by tial-and- ‘error, 50 write your program such that this parameter can be changed easily in the editor. Note that the expected number of failures is 5 (for p = 001 and ‘n= 5000), so this is a good Starting point for your tri-and-error iteration. 
Also, 8 a precaution against overflow/underflow problems, itis best to caleulate the log of the various probabilities first, and then obtain the desired probability as exp(log prob) atthe end of the calculation] 142 Consider a random variable that is defined to be the sum of the squares ‘of m independent normal random variables, all of which are N(O, 1). The param- eter n is any positive integer. Such a sum-of-the-squares random variable is called a chi-square random variable with m degrees of freedom. The probability density function associated with a chi-square random variable X is faves Ble) =) aver(2 6, xs070 CHAPTER 1 PROBABILITY AND RANDOM VARIABLES: A REVIEW ‘where T indicates the gamma function (3, 10). It is not difficult to show that the ‘mean and variance of X are given by Bx Var X= 2n [This is casily derived by noting that the defining integral expressions for the fst and second moments of X are in the exact form of single-sided Laplace transform with s = 4, See Appendix A fora table of Laplace transforms and note that at = Fin +1) (@) Make rough sketches of the ebi-squre probability density for 2, and 4. (You wil ind MATLAB use here.) (b) Note that he chi-square random variable for'n > 1 is the sum of independent random variables that ee radically non-Gaussian (e. Took atthe sketch of fe for m =I). According fo the central iit theorem, there should bea tendency foward normality as we sum more and tore such random variables. Using MATLAB, ceate an m-file for the chi-square density funtion forn = 16. Plt ths function along with the normal density fencton for 8 N(I6, 32) random varicle same ‘ean and sigma as the chi-square random verabe), This 8 intended to demonstrat the tendeney toward normality, even when the im con- tans only 16 terms REFERENCES CITED IN CHAPTER 1 1. N. Wiener, Exogpotation,Inerplation, and Smoothing of Stationary Time Series, Cambridge, MA: MIT Press and New York: Wiley, 1989 2. S.O. Rice, “Mathematical Analysis of Noise,” Bell Sytem Tech. J, 23, 282-332 (1994; 24 46-256 (1945). 3. AM. Mood, FA Graybill, and D.C. Boes, /atradution to the Theory of Statistics, Sed ed, New York: MeGra-Hil, 1974 4. CHL Goren, Go Wick the Od, New York: Mocmillan, 1969. 5, W. Feller, Ax Introduction to Probability Theory and les Applications, Vol. 1, 2nd cea, New York: Wiley, 1957 6.P. Beckmann, Probability in Communication Engineering, New York: Harcourt, Brace, and Worl, 1967 7. A. Papoulis, Probability, Random Variables, and Stochastic Processes, 2d ei, New ‘Yorks McGraw-Hill, 1984 8. G.H. Golub and C. F. Van Loan, Matric Compustions, 2nd ed, Baltimore, MD: ‘The Johns Hopkins Univesity Pres, 1989. 9. W.B. Davenport, Jt and W. L. Root, An Introduction to the Theory of Random Signals and Noize, New York: MeGraw-Hill, 1958, 10, K'S. Shanmugan and A. M. Breipoh, Random Signals: Detection, Estimation, and Data Analysis, New York: Wiley, 1988. 11. D. Gerhardt and T. Kerfnan, Video Poker, Playing 10 Win, Las Vegas, NV: Gaming ‘Books Intemational, Ine. 1987 12. LF. Blake An Invoduction to Applied Probability, New York: Wile, 1979. 13, HJ. Larson and B. O. Shubert, Probabilistic Models in Engineering Sclnces, Vol 1, New York: Wiley, 1979, 4, Minimum Operational Performance Standards for Airbore Supplemental Navigation Equipment Using Global Posiuoning System (GPS), Document No. KTCA/DO-208, RICA, Washington, DC, July 1991 REFERENCES 1 ‘Adsitional References on Probability 15. P.Z Peebles, Je, Probabiliny Random Variables, and Random Signal Principles, 38 ced. 
New York: McGraw-Hill, 1993, 16, G.'R. Cooper and C. D. McGillem, Probabilistic Methods of Signal and Sytem ‘Anolysis, nd ed, New York: Holt, Rinehart, and Winston, 1986. 17. A.M. Breipo, Probabilisc Systems Analsis, New York: Wiley, 1970. 18, J.-L, Melsa and A. P. Sage, An Invaduction to Probabiliy and Stochastic Processes, Englewood Cis, NI: Premio Hall, 1973,Mathematical Description of Random Signals ‘The concept of frequency spectrum is familiar from elementary physics, and so ‘might seem appropriate to begin our discussion of noiselike signals with their spectral description. This approach, while intuitively appealing, leads to all sors of difficulties. The only realy careful way to describe noise is to begin with a probabilistic description and then proceed to derive the associated spectral char- acteristics from the probabilistic model. We now proceed toward this end, 24 CONCEPT OF A RANDOM PROCESS ‘We should begin by distinguishing between deterministic ard random signals. Usually, the signals being considered here will represent some physical quantity such as voltage, curent, distance, temperature, and so forth. Thus they are real variables. Also, time will usually be the independent variable, although this does ‘ot necessarily need to be the case. A signal is said to be deterministic if it is exactly predictable forthe time span of interest. Examples would be x0 (sine wave) ©) x0 (unit step) conn = {!— TES. exponential respon) [Notice that there is nothing “chancy" about any of these signals. They are ‘described by functions in the usual mathematical sense; thet is, specify a nu- merical value of rand the corresponding value of x is determined. We are usually 21 CONCEPT OF A RANDOM PROCESS 73 Sear au) Figure 24 ‘Tpla aio ne sna, able to write the functional relationship between x and t explicitly. However, tis is not really necessary. All that is needed is to know conceptually that @func- tional relationship exis, In contrast with a deterministic signal, a random signal always bas some element of chance associated with it. Thus, itis not predictable in a deterministic sense. Examples of random signals are: (@ XW) = 10 sin@ar + O, where 6 is a random variable uniformly dis- ‘wibuted between O and 27. (©) X@ = Asin(2at + 0), where @ and A are independent random variables ‘with known distributions. (®) X(@ = A noiseike signal with no particular deterministic strcture— tne that just wanders on aimlessly ad infinitum, Since all of these signals have some element of chance associated with them, they are random signals. Signals such as (2), (e), and (0) are formally known as random or stochastic processes, and we will use the terms random and stochastic interchangeably throughout the remainder of the book.” Let us now consider the description of signal (f) in more detail. It might be the common audible radio noise that was mentioned in Chapter 1. If we looked at an oscllographic recording of the radio speaker current, it might ap pear as shown in Fig. 2.1. We might expect such a signal to have some kind of spectral description, because the signal is audible to the human eat. Yet the precise mathematicel description of such a signal is remarkably elusive, and it cluded investigators prior t the 1940s (3, 4). ‘Imagine sampling the noise shown in Fig. 2.1 ata particular point in time, say, The numerical value obtained would be governed largely by chance, whieh suggests it might be considered to be a random variable. However, wi “We nsf esognne » notational poem here. 
Dentin he random proces Xm hat ‘tee rfc eltonship tween and Tf enue, othe eae bea 10) fovea Wy chance. Forti eso, some ats (1) prefer ew aude ata, [rater then (to dente and ie signal, hen “look ke a random vale wih ine {8 parameter, whic is presely what Ts tan, however, ht without ts ox robles Sofe ito sn mot enginerting erature, re anor poses te denote iy "prentens ‘xo'des not mean fon Ie eu material sense when) ia xno process.TA GIUPTE2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS random variables we must be sbe to visualize a conceptual statstical experiment in which samples ofthe random variable are obtained under identical chance cicamstanes It would not be proper inthis case to sample X by faking sic cesive time samples ofthe sume signal, because, if they were taken very close together, there would be a close satsical connection among nearby sample. ‘Therefore, the conceptual experiment inthis case must consist of many “iden scl” rais, al playing simltaneovsy, ll being tuned way fom regular st ‘ions in ferent portions ofthe broadeast band, and all having thir volumes tured up tothe same sound level. Tas then leads to the notion ofan ensemble of similar noiseike signals a shown in Fig. 22. Tecan be seen then tht random process a set of random variables that, nol with time in acordance with some conceptual chance experiment. Each of he noselie time signals so generated is called sample realization ofthe process. Samples ofthe individual signals at a partcular time f, woud then be Sample ealztins of the eandom variable X(;). Four of these ae illustrated in Fg. 22 38 Ka Ky, Ne), and Ky). we were to sample a a iret time 5a, We Would cian samples ofa different random variable Xz), and so forth, Ths, in this example, an iafnite set of random variables is generted by the random process X() ‘The radio experiment just described is an example of & connous-tme random process in tht time evolves in continuous manne. In this example the probahty density function describing the amplitude variation also happens tobeconinvows. However, random processes may als be discret in ether time or emplitude, a wil be seen inthe following two examples EXAMPLE 2.1 — Consider a card player with a deck of stendard playing cards numbered from 1 (Qe) through 13 (King). The deck is shuffled and the player picks a card at = Ae) xe x0 Figure 22. Ensure sample eseaton of 8 random process. 22 PROBABILISTIC DESCRIPTION OF A RANDOM PROCESS 75: random and observes its number. The card is then replaced, the deck reshulled, ‘and another card observed, This process is then repeated at nit intervals of time ‘and continued on ad infinitum. The random process so generated would be discrete in both time and “amplitude,” provided we say thatthe observed nua ber applies only atthe precise instant of time itis observed, ‘The preceding description would, of course, generate only one sample re alization of the process. In order to obtain an ensemble of saimple signals, we need to imagine an ensemble of cart players, each having similar decks of cards and each generating a different (but statistically similar) sample realization of the process, 5 EXAMPLE 2.2 Imagine a sack containing a large quantity of sample numbers taken from a ‘zero-mean, unity-variance normal distibution. An observer reaches into the sack at nit intervals of time and observes a number with each tial, In order to avoid ‘exact repetition, he does not replace the numbers during the experiment. 
This process would be diserete in time, as before, but continuous in amplitude. Also, the conceptual experiment leading t0 an ensemble of sample realizations of the process would involve many observers, each with a separate sack of random numbers, 5 ‘We will concentrate mostly on continuous-time processes in this chapter. However, a brief discussion of discrete-time Markov and Wiener processes is given in Problems 2.33, 2.34, and 2.35 at the end ofthe chapter. Scalar discrete time processes are then considered further in Chapter 3 with the discussion of the ARMA model in Section 39. Finally, vector discrete-time models are intro duced in Chapter 5 and then used in all of the subsequent chapters. baba fog f 22 : PROBABILISTIC DESCRIPTION OF A RANDOM PROCESS ‘As mentioned previously, one can usually write out the functional form for & deterministic signal explicly; for example, s) = 10 sin 2a, or s(#) = F, and so on. No such deterministic description is possible for random signals because the numerical value ofthe signal at any particular time is governed by chance ‘Thus, we should expect our description of noiselike signals to be somewhat ‘vaguer than that for deterministic signals. One vay to specify a random process Js to describe in detail the conceptual chance experiment giving rise to the pro- cess. Examples 2.1 and 2.2 illustrate this way of describing a random process ‘The following two examples will illustrate this futher. EXAMPLE 2.9 Consider a time signal (e-., a voltae) that is generated according to the fol- Towing rules: (a) The waveform is generated with a sample-and-hold arrange- ment where the “hold” interval is 1 sec; (b) the successive amplitudes are76 CHAPTER 2. MATHEMATICAL OESCRIPTION OF RANOOM SIGNALS. Figure 23, Sarpl sa for Exaryio 23. independent samples taken from a set of random numbers with uniform distri- bution from —1 to +1; and (¢) the fist switching time after ¢ = 0 is a random variable with uniform distribution from 0 to 1. (This is equivatent to saying the time origin is chosen at random.) A typical sample realization of this process is shown in Fig, 23. Note thatthe process mean is zero and is mean-square value ‘works out 10 be one-third. [This is obtained from item (b) of the description} ° 4 le 2 a EXAMPLE 2.4 Consider another time function generated with a sample-and-1old arrangement with these properties: (a) The “hold” interval is 0.2 sec, (b) the successive amplitudes are independent samples obtained from a zero-mean normal disti- bution with a variance of one-thied, and (c) the switching points occur at mul- siples of .2 units of time; chat is, the dime origin is not chosen at random in this ‘ase. A sketch of a typical waveform for this process is showa in Fig. 24. Now, from Examples 2.3 and 2.4 it should be apparent that if we simply say, “Noiselike waveform with zero mean and mean-square value of one-third,” Wwe really are not being very definite. Both processes of Examples 2.3 and 2.4 ‘would satisfy these criteria, but yet they are quite different. Obviously, more information than just mean and variance is needed to completely describe a random process. We will now explore the “description” problem in more detail, 'A more typical “noiselike” signal is shown in Fig. 2.5. The times indicated, tus fy «+» fy have beon arranged in ascending order, and the corresponding simple values X,, X,... » X; are, of course, random variables. 
Note that we i le lowe 24 ‘pica wavtoe or Example 24, 122 PROGABILISTIC DESCRIPTION OF A RANDOM PROCESS 77 Figure 25 Sarple sort of atypical nda pooass have abbreviated the notation and have let X(i) = Xy XC) = Xp =» and s0 cn, Obviously, the frstorder probability density functions f(2). fat), Jq(s) axe important in describing the process because they tell us something, about the process amplitude distibution. In Example 2.3, fu(2) f(s» -- » ‘2, ate ll identical density functions and are given by fusing f(s) 86 an example] Sud ® a Wet ‘The density functions are not always identical for X,, Xs... , Xg they just happened to be in this simple example. In Example 2.4, the density functions describing the amplitude distribution of the X,, Xs, .-. » X; random variables ‘ate again all the same, but in this case they are normal in form witha variance ‘of one-third, Note that the first-order densities tellus something about the rel ‘ative distibution of the process amplitde as well as is mean and mean-square valve Tt should be clear that the joint densities relating any pair of random va ables, for example, f(t. 1). Svat) and 30 fort, ace alo important in ‘our process desripion Ie is these density functions that tellus something about how rapidly the signal changes with time, and these will eventually tell ws some~ thing about the signals specteal content. Continuing on, the third, fourth, and subsequent higher-ocder density functions provide even more detailed informa- tion about the process in probabilistic tems. However, his leads oa formidable description ofthe proces, to say the last, because k-variate density function 4s required where kcan be any positive integer. Obviously, we will nt usually be able o specity this keh onder density function explicitly. Rather, this usually rust be done more subly by providing, with a word description or otherwise, enough information about the process to enable one 10 write out any desired bigher-ocder density function; but the actual "writing it ou is usually not done. ‘Recall from probability thery that two random variables X and Y are said to be statistically independent if thei joint density function can be written in product frm Sarl) = FeO) @2 Similarly, random processes X() and ¥() are statistically independent ifthe joint density for any combination of random variables of the two processes can be ‘written in product form, that is, X() and Y(0) are independent if8 CHAPTER? MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS Frrcriten = fea Sui 222) In Eq, (22.2) we are using the shortened notation X, = XC). Xs = Xt and ¥, = YC), Ya = ¥,(G), -., where the sample times do not have to be the same for the two procestes In summary, the test for completeness of the process description is this: Is ‘enough information given to enable one, conceptually atleast, to write out the Ih order probability density function for any £? If s0, the description is as complete as can be expected; i not, itis incomplete to some extent, and radically diferent processes may ft the same incomplete description GAUSSIAN RANDOM PROCESS ‘There is one special situation where an explicit probability density description of the random process is both feasible and appropriate. Ths case isthe Gaussian ‘or normal random process. Ii defined as aue in which all the density fictions describing the process are normal in form. Note that it is not sufficient that just ‘the “amplitude” of the process be normally distibuted; all higher-order density functions must also be normal! 
As an example, the process defined in Example 2.4 fas a normal first-order density fonction, but closer scrutiny will reveal that its second-order density function i not normal in form. Thus, the process is not Gaussian process, ‘ ‘The multivariate normal density function was discussed in Section 1.15. It was pointed out there that muatix notation makes it possible to write out all ‘variate density functions in the same compact mateix form, regardless of the size off, All we have to dais specify the vector random-variable mean and co- variance matrix, and the density function is specified. In the case of 2 Gaussian random process the “variates” ae the random variables X(@,), X()--« «XU. where the points in time may be chosen arbitrarily, Thus, enough information must be supplied to specify the mean and covariance matrix regardless of the ‘choice of t,t,» «» fe Examples showing how to do this will be deferred for the moment, because it is expedient frst to introduce the basic ideas of station- arity and correlation functions. 24. STATIONARITY, ERGODICITY, AND CLASSIFICATION OF PROCESSES 79) all the higher-order density functions. The adjective strict is also used occasion- ally with this type of stationarity to distinguish it from wide-sense stationarity, which is a less restrictive form of stationarity. This will be discussed later in Section 2.5 on cortelation functions. ‘A random process is said to be ergodic if time averaging is equivalent to ‘ensemble averaging. In a qualitative sense this implies that a single sample time signal of the process contains all possible statistical variations of the process. ‘Ths, no additional information is to be gained by observing an ensemble of sample signals over the information obtained from a one-sample signal, for ex- ample, one long strip-chart recording. An example will illustrate this concept. EXAMPLE 2.5, oe Consider a somewhat trivial process defined 10 be a constant with time, the constant being a random variable with zero-mean normal distribution, An en semble of sample realizations for this process is shown in Fig, 2.6. A common physical situation where this kind of process model would be appropriate is Fandom instrument bia. In many applications, some small residual random bias will in spite ofall atempts to eliminate it, and the bias willbe different xa eto 24 STATIONARITY, ERGODICITY, AND CLASSIFICATION OF PROCESSES ot ‘A random process is said tobe time stationary or simply stationary ifthe density functions describing the process are invariant under a translation of time. That is, if we considera set of random variables X, = X(i), Xs = Xt)... Xiq, and also a translated set Xj = Xi + .X5 = Xl # Dee Ke = Ma + 2), the density functions fry frrnrfi.x describing the first set would be identical inform to thse describing the tansated set. Note that this applies to Figure 26 nse of aren constants8 CHAPTER? MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS {for each instrument in the batch. In Fig. 2.6 we see that time samples collected from a single sample signal, say the first one, will all have tke same value a ‘The average of these is, of course, just da, On the other hand, if we were 10 collect samples in an ensemble sense, the values dp. dy, dy, -- @yy Would be fbtained. ‘These would have @ normal distribution with zero mean.” Obviously, time and ensemble sampling do not lead to the same result i this case, so the process is not ergodic. It is, however, a stationary process besause the “stais- tics” of the process do not change with time. 
a In the case of physical noise processes, one can rarely justify strict station- arty or ergodicity in a formal sense, Thus, we often lean on heuristic knowledge ‘of the processes involved and simply make assumptions accordingly. ‘Random processes are sometimes classified according to two categories, deterministic and nondeterministc. As might be expected, & deterministic ran- dom process resembles a deterministic nonrandom signal in that it has some special deterministic structure. Specifically, if the process description is such that Knowledge of a sample signal’s past enables exact prediction of its future, itis classified as a deterministic random process. Examples ae: 1. X(@) = @; ais normal, Nm, 0°) 2. X@) = A sin ae; Ais Rayleigh distribute. 3. X() = A sin(ot + 0); A and 6 are independent, and Rayleigh and ‘uniformly distributed, respectively. In each case, if one were to specify a particular sample signal prior to some time, say, f, the sample realizations for that particular signal vould be indirectly specified, andthe signal's future values would be exactly predictable. Random processes that are not deterministic are classified as nondetermin- istic. These processes have no special functional stracture tat enables their exact prediction by specification of certain key parameters or their past history. Typical poise” is a good example of a nondeterministic random process. It wanders on aimlessly, as determined by chance, and has no particular deterministic structure. AUTOCORRELATION FUNCTION ‘The autocorrelation function for a random process X() is deined as* BLXE XCD] @say Rett) +n describing tne conlaton popes of nam process, some authors prefer 19 wok wth th salon once aie aoronenn neon dete Eq. 1). Te aa ‘Senet deed a Aucovariance fonction = £1) ~ mr) = me “Te wo functions are crows eae, In ae ete te meas ile he proc (tocar ‘elton, ang inthe be the ea soba at atocovarnc), That ste exten recs, ‘Toco fata ae course, ene! for 2e-mean process. The asocorelation fncon rbssy the mae common of the bo egieeing Mere 30 wll be med thoughowt is 25 AUTOCORRELATION FUNCTION 81 where ¢, and f, are arbitrary sampling times. Clearly, i tes how well the process is conelated with itself at two different times. If the process is stationary, its probability density functions are invariant with time, and the autocorrelation function depends only on the time difference f — f,. Thus, Ry reduces to a function of just the time difference variable +, that is, Ry(2) = ELX(@XG + 2] (Stationary case) 252) where 1, is now denoted as just and ¢ is (¢ + 2). Stationarity assures us that the expectation is not dependent on t [Note thatthe autocorrelation function is the ensemble average (i.e, expec tation) of the product of X(¢,) and X¢,); therefore, it can formally be writen as Rw AKA fP aaenndds dy 253) where we ae using the shortened notation X, = X(,) and X, = X(_). However, Eq, (2.5.3) is often not the simplest way of determining Ry because the joint density function fy Ct 1) must be known explicitly inorder to evaluate the integral. Ifthe ergodic hypothesis apples, it soften easier to compute R38 & time average rather than an ensemble average. An example wil lustete this, EXAMPLE 2.6 Z Consider the same process defined in Example 2.3. A typical sample signal for this process is shown in Fig. 27 along withthe same signal shifted in time an amount +. Now, the process under consideration in this case is ergodic, so we should be able to interchange time and ensemble averages. 
Thus, the autocor- relation function can be writlen a8 Ry(1) = time average of X,(#) - X,(t + 7) i win xaone 9 ase an tt = th ia 23) jo be men ne value of X,(0), which is 4 in this ease, On the other hand, when 7 is unity or larger, there is no overlap of the correlated portions of X,(0) and X,(¢ + 2) and pan Figure 27. Pancor wave for Expo 26182 CHAPTER?” MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS thus the average of the product is zero, Now, as the shift +is reduced from I to 0, the overlap of correlated portions increases linearly until the maximum over- lap occurs at 7 = 0. This then leads to the autocorrelation function shown in Fig. 28. Note that for stationary exgodie processes, the direction of time shift = is immaterial, and hence the autocorrelation function is symmetric about the ‘origin. Also, note that we arrived at Ry(1) without formally finding the joint density function fy (ty, 1) a ‘Sometimes, the random process under consideration is not ergodic, and it is necessary to distingvish between the usual autocorrelation function (ensemble lverage) and the time-average version. Thus, we define the fime autocorrelation Junction as imi ['xiames oa es 9) where X,() denotes a sample realization of the X(0) process. There isthe tact ‘assumption that the limit indicated in Eq. 2.54a) exists. Also note that script wt rather then italic R is used as a reminder that this is @ time average rather than an ensemble average. EXAMPLE 2.7 —___— ‘To illustrate the difference between the usual autocorrelation function and the time autocorrelation function, consider the deterministic random process XQ) = A sin ow e255) where A is a normal random variable with zero mean and variance 3, and © is a known constant. Suppose we obtain a single sample of A and its numerical value is A, The corresponding sample of X(0) would then be X,0 = Ay sin or 256) ‘According to Eq, (2.5.42), its time autocorrelation function would then be ‘gure 28. Atocoelson ncion fr Example 26 125. AUTOCORRELATON FUNCTION 63 Ke(7) = Lim 7 tim 4 [4 sin oA, sin te + 2) 4 cos or esr (On the other hand, the usual autocorrelation function is calculated as an ensemble average, that is, from Eq, (25.1. In this case, itis Rutty 6) = ELKEKE BAA sin at, sin ox) o sin at sin af 058) [Note that this expression is quite different from that obtained for (a) Clearly, time averaging, does not yield the same result as ensemble averaging, so the process is not ergodic. Furthermore the autocorrelation function given by Eg, (2.5.8) does not reduce to simply a function of t ~ f,. Therefore, the process isnot stationary. ie General Properties of Autocorrelation Functions ‘There are some general properties that are common to all autocorrelation func~ tions for stationary processes. These will now be enumerated with a brief com- ‘ment about each: 1. Ry(O) is the mean-square value of the process X(). This is self-evident from Eq, (2.5.2), 2. Re(=) is an even function of x, This results from the stationarity as sumption. [In the nonstationary case there is symmetry with respect to the two arguments ¢, and #,. in Eq. (25.1) it certainly makes no dif ference in which order we multiply X(t) and X(,). Thus, Ry(t. 2) = Rut t)] 3. [Ry(| = Re(O) for all r. We have assumed X() is stationary and thus ‘the mean-square values of X() and X(1 + 7) must be the same, Also the ‘magnitude of the correlation coefficient relating two random vatibles is never greater than unity. Thus, R(x) can never be greater in magnitude than Ry (0). 4. 
IEX(O contains a periodic component, Ry(t) will also contain a periodic ‘component with the same period. This can be verified by writing X() as the sum of the nonperiodic and periodic components and then apply- ing the definition given by Eq. (2.5.2). It is of interest to note that ifthe process is ergodic as well as stationary and if the periodic component is sinusoidal, hen Ry(z) will contain no information about the phase of the sinusoidal component. The harmonic component always appears in84 CHAPTER 2” MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS the autocorrelation function as a cosine function, imespective of its phase. 5. If X(0) does not contain any periodic components, Ryfz) tends to 2er0 fas 7+ © This is just a mathematical way of saying that X(t + 1) ‘becomes completely uncorrelated with X() for large + if there are no hidden periodicities in the process. Note that a constant isa special case of a periodic function, Thus, Ry(=) = 0 implies 2e:0 mean for the process. 6. The Fourier transform of Ris real, symmetric, and ronnegative, The real, symmetric property follows directly from the even property of Ry (3), The nonnegative property is not obvious at this point. It will be justified later in Section 2.7, which deals with the spectral density une- tion for the process. 1t-vas mentioned previously that src stationarity is a sere requirement, because it requires that all the higher-order probability density functions be invariant undet atime translation. This is often diffielt to verify. Thus, a less demanding form of stationarity is often used, or assumed. A rendom process is Seid to De covariance stationary of wide-sense stationary i€ E{X()] is inde pendent off, and ELXC)X(2)] is dependent only on the time difference f= ¢- Obviously, i the second-order density fx.x(t, %) is independent of the time igin, the process is covariance stationary. Turther examples of autocorelation functions willbe given as this chapter progresses. We wil see that the autocorrelation function isan important descip- for of a random process and one that is relatively easy to obtain because it depends on only the second-order probability density for the process (CROSSCORRELATION FUNCTION “The crosscoetelation function between the processes X(#) and Y(¢ is defined as Rely 1) = ELKO )MCED) 6.) ‘Again, if the processes are stationary, only the ime difference between sample points is relevant, so the erosscorrelation function reduces 10 Reda) = ELXCOMC + 2)](tationary case} (2.62) Just as the autocorrelation function tells us something abou: how a process is ‘comelated with itself, the crosscorrelation function provides information about the mutual correlation between the two processes. ‘Notice that itis important to order the subscripts properly in writing Ryy (2. A skew-symmetric relation exists for stationary processes a follows. By efinition, 26 CROSSCORRELATON FUNCTION 85 Rolo) = EXONE + 9) 263) Rye) = ELMOX(E + ] 2.6.4) ‘The expectation in Eq. (2.64) is invariant under a translation of 7. Thus, Rry (is also given by Ral?) Now, comparing Eqs. (2.6.3) and (2.65), we see that FLve~ ax@] 265) Rak) = Rak?) 266) ‘Thus, interchanging the order of the subscripts ofthe erosscorrelation function has the effect of changing the sign of the argument EXAMPLE 2.8 Lat X() be the sme random process of Example 2.6 and illustated in Fig 2.7 Lat Mo) be the same signal aX), but delayed oneal uni of Une he eroascorcation Ry) woud then be sown in ig. 29. 
Noe that Ry (>) is not seven ft fo oe fe maxima erat + = 0, Ths he cos Correlation function iacks the symmety possesied by the aulocerelation function. Pe ” : a ‘We frequently need to consider additive combinations of random processes For example, let the process Zi0) be the sum of stationary processes X() and Yo: 2) = XO + HO 26) ‘The autocorrelation function of the summed process is then Reto) (1X0 + HOIIXE +) + e+ a) = EXON + 9] + ELYEOXE + 9) + BLXONe + A] + ELYONE + 9) Rela) + Ral) + Re) + Bylo) 268) Aer Figure 29. Crotcooeliten uncon er Example 2.8(86 CHAPTER? MATHEMATICAL OESCRETION OF RANDOM SIGNALS Now, if X and Y are zero-mean uncorrelated processes, the middle terms of Eg. (2.68) ate zero, and we have Rae) = Rela) + Rela) (Por zero crosscorelation) (2.69) ‘This can obviously be extended to the sum of more than two processes, Equation (2.69) is a much-used relationship, and it should always be remembered that it applies only when the processes being summed have zero erosscorrelation, 27 POWER SPECTRAL DENSITY FUNCTION acedd) 5 It-was mentioned in Section 246 thatthe autocorrelation function is an important escriptor of a random process. Qualitatively, if the autocorrelation function decreases rapidly with 7, the process changes rapidly with time; conversely, a slowly changing process will have an autocorrelation function that decreases slowly with +. Thus, we would suspect that this important descriptor contains information about the frequency content of the process; and this is in fact the case. For stationary processes, there is an important relation known as the Wiener-Khinchine relation: Atoke «hay fe tae olg Sede) = ERC] = [Rade dr en) jrvevset, where Sf] indicates Fourier transform and «has the usual meaning of (22) w gee} (frequency in hertz). S, is called the power spectral density function or simply sorte spectral density function of the process d= The adjectives power and spectral come from the relationship of S_( ja) to the usual spectrum concept for a deterministic signal. However, some care is required in making tis connection. If the process X() is time stationary, it wanders on ad infinitum and is not absolutely integrable. Thus, the defining integral forthe Fourier transform does not converge. When considering the Fou- Fer transform of the process, we are forced to consider a truneated version of it, say, Xp(9), which is truncated to zero outside a span of time T. The Fourier transform ofa sample realization ofthe truncated process will then exist. Let #(X;} denote the Fourier transform of X,(f), where it is understood that for any given ensemble of samples of X0) there willbe comtesponding ensemble of 5(X(1)}. That is, #(X;(0)} has stochastic attributes just as does X;(t). Now look athe following expectation # [Fier] For any particular sample realization of X,(0), the qusatity inside the brackets Js known as the periadogram for that particular signal. It will now be shown that averaging over an ensemble of periodograms for large T yields the power spectral density function 27, POWER SPECTRAL DENSITY FUNCTION 87 ‘The expectation of the periodogram of a signal spanning the time interval {0, 7] can be manipulated as follows: E iF rexsone] <8 [ff 00a [x00 a] HEL [ x@xole ards 7) Ne at we nr eo ope nasi Ton X Decne of te ead tage of integration If we nase Haat, EUS Noses R(t ~ 8) and Eq, (2.7.2) becomes menace ; 7 eltpowon] Lf [me semrraea 023 ‘The appearance of ¢ in two places in Eq, (273) suggests a change of variables. 
Let eds = ena Equation (2.7.3) then becomes ‘The new region of integration in the 7¢ plane is shown in Fig. 210, [Next we interchange the order of integration and integrate over the two ‘wiangular regions seperately, Ths leads to gue 210 ajc gen ib88 CHAPTER 2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS ef brown LE mmemad etl [imacrasr 1) ‘We now integrate with respect to ¢ with the result 1 iscxorr e [pee | 2 r) her Ll op aR ne" - Lf, (7+ DR Ae de + af (ARC 77 Finally, a, (27.7) may be writen in more compact form as E (5 wexioie | “f( Riders 78) ‘The factor 1 ~ |r\/T that multiples Ry(z) may be thought of as a triangular ‘weighting factor that approaches unity as T becomes large; a: lest this is true if Ry(a) approaches zero as r becomes large, which it will do if X( contains no periodic components, Thus, as T becomes large, we have the following relationship: ee ree el} sixaon? |= [mare ar 079) Or, in other words, “Average periodogram for large T= power spectral density (2.7.10) Note especially the “for large 7” qualification in Bq, (27.101. (This is pursued further in Section 2.15 and Problem 2.34.) Equation (2.7.9) is a mast important relationship, beceus: i is this that tes the spectral function Sy(je) to “spectrum” as thought of in tie usual determin- istic sense. Remember thatthe spectrdl density function, as formally defined by Eq, (2.71), is @ probabilistic concept. On the other hand, the periodogram is @ spectral concept in the usual sense of being related to the Fourier transform of 2 time signal. The relationship given by Eq, (2.7.9) then provides the tie between the probabilistic and spectral descriptions of the process, an i is this equation that suggests the name for S,(jw), power spectral density function. More will be said of tis in Section 2.14, which deals with the determinetion of the spectral function from experimental data, "Because of the spectral attributes of the autocorrelation function Ry(), is Fourier transform Sy(ja) always works out to be a real, nonnegative, symmetric function of «x This should be apparest from the left side of Eq. (2.7.9), and will be illustrated in Example 29. t 2:7 PONER SPECTRAL DENSITY FUNCTION 69 EXAMPLE 2.9 — ‘Consider a random process X() whose autocorrelation function is given by, Rela) = oteret Tha Ne ar) where 0 and fi are known constants, The spectral density function for the X() process is oe? __20%p — jot Ba sR 2.2) jot B Both Ry and Sy are sketched in Fig, 2.11 a Occasionally, its convenient to write the spectral density function in terms of the complex frequency variable s rather than a This is done by simply replacing jo with s; of, equivalently, replacing a? with ~s?. For Example 2.9, the spectral density function in tems of sis then 20°68 | 208 Ft Alu eR It should be clear now why we chose to include the “with w in Sy(Jo), even though S,(ju) always works out to be areal function of a By writing the argument of Se(/o) a8 jay rather than just aa, we can use the same symbol for spectral function in eter the complex or real frequency domein, That i, Sus (2.7.13) Sy) = Jo) ery is comrect notation in the usual mathematical sense. From Fourier transform theory, we know that the inverse transform of the spectral function should yield the autocorrelation function, that is, F1S,Ca) ; Lf sumereaenie — @n15 If we let 7 = 0 in Ba, (27.15), we get gto sth : 0 » o Figure 211. Autocoralten and epocra ont Knctons fr Exar 29.) ‘ocsrlain anton) Special hncon.(90 CHAPTER? 
MATHEMATICAL DESCRIPTION OF RANDOK SIGNALS (0) = BLO] = 5 [Sli de 0.716) Equation (2.7.16) provides a convenient means of computing the mean square value of a stationary process, given its spectral function. As mentioned before, it is sometimes convenient to use the complex frequency variable s rather thas Jus IF this is done, Eq, (2.7.16) becomes et] = 5 fs ae en) [Equation (2.7.16) suggests that we can consider the signal power as being distributed in frequency in accordance with Sy(jo), thus, the terms power and density in power spectral density function. Using this concept, we can obtain the power ina finite band by integrating over the appropriate range of frequen cies, that is, ~Falin Power" in] ap [ong ar )=a [2 sccm dog [ss dw 7 ‘An example will now be given to illustrate the use of Eqs. (2.7.16) and ann, EXAMPLE 2.10 __ Consider the spectral function of Example 2.9: = 228. 500) = Boe e719) Applicaton of Eq. (27.16) should yield the mean square value a. This ean be \etifed using conventional integral tables, Sly eet ee el 0-3 Lite [gees] = 0729 00 =f 2B e129 ‘More will be said about evaluating integrals of the type in Eq. (27.21) later, in (Chapter 3. a In summary, we see that the autocorrelation function and spectral density function are Fourier ransform pairs. Thus, both contain the same basic infor- mation about the process but in different forms. Since we ean easily transform 28 CROSS SPECTRAL DENSITY FUNCTION 91 back and forth between the time and frequency domains, the manner in which the information is presented is purely a matter of convenience for the problem at hand, CROSS SPECTRAL DENSITY FUNCTION Cross spectral density functions for stationary processes X(t) and Y() are defined Seljo) = [Rot] = [Revie dr ean) Sri) = ER eA] = [Reem ar 0.82) ‘The crosscorrelation functions Ry(x) and R(x) are not necessarily even func- tions of +, and thus the corresponding cross spectral densities are usually not real functions of ai It was noted in Section 2.6 that Rey(7) = Ryx(~7). Thus, Sand Syy are complex conjugates of each other: Sarde) = Silja) 283) and the sum of Si and Sy is ral. ‘Another function that is closely related to the cross spectral density is the coherence function. Itis defined as [Scrbio) Sxie)S/jo) Ca Yu ‘The coherence function can be seen to be normalized, and it is sort of a “cor relation coefficient in the frequency domain.” ‘To sce the normalization, let 2X(0) = Y(@ (maximum correlation) and then 2, - Salio Ye” SGe)SCi0) 285) On the other extreme, if X() and ¥(9 have zero erosscorrelation, Syy( jo) and fy = 0. Both the cross spectral density and coherence functions are useful in analysis of experimental data, because modern computer technology has made it possible to transform time data to the frequency domain with ease. (See Sec- tion 2.15 and references 5 and 6 for more on the analysis of experimental data Also, soe Problems 3.23 and 3.25.) If Z( isthe sum of zero-mean processes X() and Y(0), the spectral density of 2) is given by(92 CHAPTER? MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS S,6Ja) = HR] 286) Refertng to Ba. (2.638), we then have Sy(Ja) = S<6j0) + Syl ji) + SeyJe) + So) esr Jost as in the case of the autocorrelation function, the two middle terms in Eg. (28.17) are zero if the X and Y processes have zero crosscorrelation. So, for this special ease, Sexrlio) = S,(je) + S,ja) (for zero crosscorclation) (2.88) 29 WHITE NOISE White noise is defined to be a stationary random provess having a constant spectral density fonction. 
The term "white" is a carryover from optics, where white light is light containing all visible frequencies. Denoting the white-noise spectral amplitude as A, we then have

S_X(jω) = A    (2.9.1)

The corresponding autocorrelation function for white noise is then

R_X(τ) = A δ(τ)    (2.9.2)

These functions are sketched in Fig. 2.12.

Figure 2.12. White noise. (a) Autocorrelation function. (b) Spectral density function.

In analysis, one frequently makes simplifying assumptions in order to make the problem mathematically tractable. White noise is a good example of this. However, by assuming the spectral amplitude of white noise to be constant for all frequencies (for the sake of mathematical simplicity), we find ourselves in the awkward situation of having defined a process with infinite variance. Qualitatively, white noise is sometimes characterized as noise that is jumping around infinitely far, infinitely fast! This is obviously physical nonsense, but it is a useful abstraction. The saving feature is that all physical systems are bandlimited to some extent, and a bandlimited system driven by white noise yields a process that has finite variance; that is, the end result makes sense. We will elaborate on this further in Chapter 3.

Bandlimited white noise is a random process whose spectral amplitude is constant over a finite range of frequencies and zero outside that range. If the bandwidth includes the origin (sometimes called baseband), we then have

S_X(jω) = A,  |ω| ≤ 2πW
S_X(jω) = 0,  |ω| > 2πW    (2.9.3)

where W is the physical bandwidth in hertz. The corresponding autocorrelation function is

R_X(τ) = 2WA [sin(2πWτ) / (2πWτ)]    (2.9.4)

Both the autocorrelation and spectral density functions for baseband bandlimited white noise are sketched in Fig. 2.13. It is of interest to note that the autocorrelation function for baseband bandlimited white noise is zero for τ = 1/2W, 2/2W, 3/2W, etc. From this we see that if the process is sampled at a rate of 2W samples/second (sometimes called the Nyquist rate), the resulting set of random variables are uncorrelated. Since this usually simplifies the analysis, the white bandlimited assumption is frequently made in bandlimited situations.

Figure 2.13. Baseband bandlimited white noise. (a) Autocorrelation function. (b) Spectral density function.

The frequency band for bandlimited white noise is sometimes offset from the origin and centered about some center frequency W_c. It is easily verified that the autocorrelation/spectral-function pair for this situation is

S_X(jω) = A,  2πW_1 ≤ |ω| ≤ 2πW_2
S_X(jω) = 0,  elsewhere    (2.9.5)

R_X(τ) = 2W_2 A [sin(2πW_2 τ) / (2πW_2 τ)] − 2W_1 A [sin(2πW_1 τ) / (2πW_1 τ)]
       = 2A ΔW [sin(πΔWτ) / (πΔWτ)] cos 2πW_c τ    (2.9.6)

where ΔW = W_2 − W_1 Hz and W_c is the center frequency in hertz. These functions are sketched in Fig. 2.14.

It is worth noting that bandlimited white noise has a finite mean-square value, and thus it is physically plausible, whereas pure white noise is not. However, the mathematical forms for the autocorrelation and spectral functions in the bandlimited case are more complicated than for pure white noise.

Before leaving the subject of white noise, it is worth mentioning that the analogous discrete-time process is referred to as a white sequence. A white sequence is defined simply as a sequence of zero-mean, uncorrelated random variables. That is, all members of the sequence have zero means and are mutually uncorrelated with all other members of the sequence. If the random variables are also normal, then the sequence is a Gaussian white sequence.
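As a brief computational aside (not part of the original development), a Gaussian white sequence is easy to generate and check in MATLAB, the setting used for the exercises in this text. The sketch below is illustrative only; the record length and the unit variance are arbitrary choices, and the crude lagged-product average is simply meant to show that samples at different times are essentially uncorrelated, as they would be if the sequence came from sampling baseband bandlimited white noise at the Nyquist rate.

% Illustrative sketch: a zero-mean, unit-variance Gaussian white sequence
N = 10000;                         % number of samples (arbitrary)
w = randn(1, N);                   % Gaussian white sequence
V = zeros(1, 6);
for m = 0:5                        % crude sample autocorrelation for lags 0 through 5
    V(m + 1) = sum(w(1:N - m) .* w(1 + m:N)) / (N - m);
end
disp(V)                            % lag 0 is near 1; all other lags are near 0

The displayed values make the point numerically: only the zero-lag term is significant, which is exactly the white-sequence property just defined.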
Figure 2.14. Bandlimited white noise with center frequency W_c. (a) Autocorrelation function. (b) Spectral density.

2.10 GAUSS-MARKOV PROCESS

A stationary Gaussian process X(t) that has an exponential autocorrelation is called a Gauss-Markov process. The autocorrelation and spectral functions for this process are then of the form

R_X(τ) = σ² e^{−β|τ|}    (2.10.1)

S_X(jω) = 2σ²β / (ω² + β²)    (2.10.2)

These functions are sketched in Fig. 2.15. The mean-square value and time constant for the process are given by the σ² and 1/β parameters, respectively. The process is nondeterministic, so a typical sample time function would show no deterministic structure and would look like typical "noise." The exponential autocorrelation function indicates that sample values of the process gradually become less and less correlated as the time separation between samples increases. The autocorrelation function approaches zero as τ → ∞, and thus the mean value of the process must be zero. The reference to Markov in the name of this process is not obvious at this point, but it will be after the discussion on optimal prediction in Chapter 4.

Figure 2.15. Autocorrelation and spectral density functions for the Gauss-Markov process. (a) Autocorrelation function. (b) Spectral density.

The Gauss-Markov process is an important process in applied work because (1) it seems to fit a large number of physical processes with reasonable accuracy, and (2) it has a relatively simple mathematical description. As in the case of all stationary Gaussian processes, specification of the process autocorrelation function completely defines the process. This means that any desired higher-order probability density function for the process may be written out explicitly, given the autocorrelation function. An example will illustrate this.

EXAMPLE 2.11
Let us say that a Gauss-Markov process X(t) has the autocorrelation function

R_X(τ) = 100 e^{−2|τ|}    (2.10.3)

We wish to write out the third-order probability density function

f_{X1 X2 X3}(x1, x2, x3)

where X1 = X(0), X2 = X(.5), and X3 = X(1). First we note that the process mean is zero. The covariance matrix in this case is a 3 × 3 matrix and is obtained as follows:

C_X = [ E(X1X1)  E(X1X2)  E(X1X3)
        E(X2X1)  E(X2X2)  E(X2X3)
        E(X3X1)  E(X3X2)  E(X3X3) ]
    = [ R_X(0)   R_X(.5)  R_X(1)
        R_X(.5)  R_X(0)   R_X(.5)
        R_X(1)   R_X(.5)  R_X(0) ]
    = [ 100        100e^{−1}  100e^{−2}
        100e^{−1}  100        100e^{−1}
        100e^{−2}  100e^{−1}  100 ]    (2.10.4)

Now that C_X has been written out explicitly, we can use the general normal form given by Eq. (1.15.5). The desired density function is then

f_{X1 X2 X3}(x1, x2, x3) = [1 / ((2π)^{3/2} |C_X|^{1/2})] exp(−½ xᵀ C_X⁻¹ x)    (2.10.5)

where x = [x1 x2 x3]ᵀ and C_X is given by Eq. (2.10.4).

The simple scalar Gauss-Markov process whose autocorrelation function is exponential is sometimes referred to as a first-order Gauss-Markov process. This is because the discrete-time version of the process is described by a first-order difference equation of the form

X(t_{k+1}) = e^{−βΔt} X(t_k) + W(t_k)    (2.10.6)

where W(t_k) is an uncorrelated zero-mean Gaussian sequence. Discrete-time Gaussian processes that satisfy higher-order difference equations are also often referred to as Gauss-Markov processes of the appropriate order. Such processes are best described in vector form, and this is discussed in detail in Sections 5.2 and 5.3. Also, an example of a second-order Gauss-Markov process that is of some importance in satellite navigation systems is discussed in Problem 5.4 and Example 6.1.

2.11 RANDOM TELEGRAPH WAVE

Consider a binary voltage waveform that is generated according to the following rules:
1. The voltage is either +1 or −1 V.
2. The state at t = 0 may be either +1 or −1 V with equal likelihood.
3.
As time progresses, the voltage switches from one state to the other at random. Specifically, the probability of k switches in a time interval T is governed by the Poisson distribution

P(k) = (aT)^k e^{−aT} / k!    (2.11.1)

where a is the average number of switches per unit time.

Figure 2.16. Random telegraph wave.

This random process is called the random telegraph wave, and a sample waveform is shown in Fig. 2.16. It is worth noting that the likelihood of switching at any point in time does not depend on the particular state or the length of time the system has been in that state.

A rigorous derivation of the autocorrelation function for the random telegraph wave can be found in a number of references (7, 8). For our purposes here, we can use the following heuristic argument. Consider the product of successive time samples X(t1) and X(t2); the result must be either of two possibilities, +1 or −1. If the time interval t2 − t1 is small, X(t1) and X(t2) are highly correlated, so that the product X(t1)X(t2) is nearly always +1. Then, as the spacing between samples is increased, their correlation gradually decreases and approaches zero as the spacing between samples goes to infinity. This leads to an exponential autocorrelation function of the form

R_X(τ) = e^{−2a|τ|}    (2.11.2)

where a is the average number of switches per unit time.

It should be apparent that the random telegraph wave is not a Gaussian process; far from it! Yet the Gauss-Markov process described in Section 2.10 has an autocorrelation function identical in form to that of the random telegraph wave. For purposes of comparison, a typical Gauss-Markov signal with about the same time constant as the random telegraph wave of Fig. 2.16 is shown in Fig. 2.17. The difference is striking. This is a vivid example of two processes that have radically different random structures but still have the same autocorrelation functions and, of course, identical spectral characteristics.

Figure 2.17. Gauss-Markov signal with about the same time constant as the random telegraph wave of Figure 2.16.

The moral is this: The autocorrelation function and/or spectral characteristics do not tell the whole story; all the probability density functions must be specified in order for the process to be completely described. In the Gauss-Markov case, this was done by specifying "Gaussian process" in addition to the autocorrelation function. In the random telegraph wave case, the higher-order densities were indirectly specified by describing a conceptual chance experiment creating the waveform. Obviously, the probability density functions are radically different for the two processes in this case.

2.12 NARROWBAND GAUSSIAN PROCESS

In both control and communication systems, we frequently encounter situations where a very narrowband system is excited by wideband Gaussian noise. A high-Q tuned circuit and/or a lightly damped mass-spring arrangement are examples of narrowband systems. The resulting output is a noise process with essentially all its spectral content concentrated in a narrow frequency range. If one were to observe the output of such a system, the time function would appear to be nearly sinusoidal, especially if just a few cycles of the output signal were observed. However, if one were to carefully examine a long record of the signal, it would be seen that the quasi-sinusoid is slowly varying in both amplitude and phase.
Such a signa is called narrowband noise and, if tis the result of passing wide- ‘band Gaussian noise through a linear narrowband system, then itis also Gaus- sian. We are assured of this because any linear operation on a set of normal variates results in another set of normal variates. The quasi-simusoidal character depends only on the narrowband property, and the exact spectral shape within the band is immaterial "The mathematical description of narrowband Gaussian noise follows. We fist write the narrowband signal as $0) =X 008 wf ~ YO) sin wt e120 where X(0) and Y(?) are independent Gaussian random processes with similar narrowband spectral functions that are centered about zet0 frequency. The fre quency «, is usually called the carrer frequency, and the effect of multiplying XQ) and ¥() by cos «as and sin ou is to translate the baseband spectrum up to 8 similar spectrum centered about w. (see Problem 2.32). The independent X(0) and Y@) processes are frequently called the in-phase and quadrature components of S(). Now, thnk of time 1 as a particular time, and think of X(2) and Y(0) as the comresponding random variables. Then make the usual rectangular to polar ‘uansformation via the equations 212, NARROWEAND GAUSSIAN PROCESS 88 xnRom® (2.12.2) Y= Rsin@ ox, equivalently, ReVEaF enur't ens) By substituting Eq, (2.12.2) into Eq, (2.12.1), we ean now write S() in the form SE) = RG £05 (A cos of ~ RG) sin OC) sin aye = RO coslws + OC] e124) Ic is fom Eq, (2.12.4) that we get the physical interpretation of “slowly varying ‘envelope (amplitude) and phase.” ‘Before we proceed, a word or two about the probability densities for X,Y, 2, and @ is in onder. IF X and ¥ are independent normal random variables with the same variance e, their individual and joint densities are 1 ene LO = Fee e125) So) 2.126) and e127 ‘The corresponding densities for R and @ are Rayleigh and uniform (see Example 1.23), The mathematical forms are e128) 1 $60 fr (oxo) 2.129) otherwise Also, the joint density function for R and © is4100 CHAPTER? MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS ere, p20 and 0S 0<2n (2:12.10) Its of interest to note here that if we consider simultaneous time samples of teavelope and phase, the resulting random variables are statistically independent. However the processes RG) and @(?) are not statistically independent (7), This 2.12 WENER OR BRONNAN-MOTION PROCESS 101 e— — f —— t : t i i i i se estat th jin probity density assciaed with adjacent # Samples cannot be writen i product form, that i, ; t x FrmoredtrFa Ov 0) * Frndt MFoyatOn 6) 2-12.11) t I ‘ ‘We have assumed that S(¢) is a Gaussian process, and from Eq, (2.12.1) we ‘ see that i Var $ = § (Var X) + (Var Y) = en | el aa Thus, ' fp ais) | 0) $09 = Faas ey | s “The higher-order density functions for will ofcourse, depend on the specific} tue atthe spctal density forte process. | 1 Sa 243 { q : WIENER OR | BROWNIAN-MOTION PROCESS e Suppose we start atthe origin and take » steps forward or backward at random, ‘with equal likelihood of stepping in either direction, We pose two questions ‘After taking n steps, (I) what isthe average distance traveled, and (2) what is the variance of the distance? This is the classical random-walk problem of sta- tistics, The averages considered here must be taken in an ensemble sense; for example, think of running simultaneous experiments and ‘hen averaging the {sults fora given number of steps. 
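Before working out these answers analytically, it may help to see the ensemble experiment mimicked numerically. The MATLAB sketch below is illustrative only; the ensemble size, the number of steps, and the use of unit steps are arbitrary choices and are not part of the original text.

% Illustrative sketch: an ensemble of random walks with unit steps
M = 5000;                          % number of simultaneous experiments (arbitrary)
n = 100;                           % number of steps in each experiment
steps = 2*(rand(M, n) > 0.5) - 1;  % each step is +1 or -1 with equal likelihood
D = cumsum(steps, 2);              % distance traveled after 1, 2, ..., n steps
disp(mean(D(:, n)))                % ensemble average distance: near 0
disp(var(D(:, n)))                 % ensemble variance after n steps: near n

The two displayed quantities anticipate the answers developed next: the average distance is essentially zero, and the variance grows in proportion to the number of steps.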
It should be apparent that the average dis tance traveled is zero, provided we say “forward” is postive and “backward” is negative. However, the square ofthe distance is always positive (or 2er0), 50 its average fora large numberof trials will not be zero. Its siown in elementary Statistics that the variance after m unit steps is just n, or the standard deviation is Vn (Gee Problem 2.21), Note that this inereases without bound as m increases, and thus the process is nonstationary. "The continuous analog of random-walk is the output of an integrator driven ‘with white noise. This is shown in block-diagram form in Fig. 2.184. Here we ‘consider the inpot switch as closing at ¢ = O and the initial integrator output as Foure 218 Contr arog of nom va) Bock gam by Enel of aera being zero, An ensemble of typical output time signals is shown in Fi ut time signals is shown in Fig. 2.18. ‘The system response X(0) is given by n= [mara eno Cleary, the average of the output is PDK] = B if rw a] = [[arwlde=0 2132) Also, the mean-square-value (variance) is102 CHAPTER 2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS FLO) Efe au fry ao] = ff eteuorent au a 21339 But E{F()F(0)} is just the autocoreation function Ry(u ~ v), which in this case isa Dirac del function. Thus, [foo and = fa So, E[X%(] increases linearly with time and the sms value increases in accord lance with Vi (for unity white noise input). (Problem 2.35 provides a demon- stration of this) Now, ad the further requirement thatthe input be Gaussian white noise. ‘The output will then be a Gaussian process because integration is linear op- ‘tation on the input, The resulting continuous random-walk process is known as the Wrener or Brownian-motion process. The process is nonstationary, itis Gaussian, and its mean, mean-square value, and autocorelation function are sven by RC) ' @34) FLX] = 0 e135) EDO] 2136) lly 4) = BTRUQRUL = [ [row f'n «| =f fatrorojdeae = f° au ode Evaluation ofthe double integral yields Rall) = fs nee @.3.7) Since the proces i nonstationary, the autocorrelation function is a general fune- tion ofthe two arguments f and. With a hile imagination, Eq, (2.13.7) can be seen to descibe two faces ofa pyramid with the sloping ridge ofthe pyramid running, along the line f= t twas mentioned before that there are difcules in defining directly whi is meant by Gaussian white noise. This is because of the "infinite waranc 1obiem, The Wiener process is well behaved, bough. Thus, we can reverse the argument given here and begin by arbitrarily defining it as a Gaussian process with an autoconeation function given by Ea, (2.13.7). Tis completely specifies te process. We can now deseribe Gaussian white noise in terms of its integral ‘That is, Gaussian white noise is that hypothetical process Which, when inte arted, Yields & Wiener process 214 2.14 PSEUDORANDOM SIGNALS 103, Figure 219. Spscium ef nase same oopd hack en st. PSEUDORANDOM SIGNALS ‘As the name implies, pseudorandom signals have the appearance of being ran- ‘dom, but are not truly random. In order for signal to be truly random, there ‘must be some uncertainty about it that is governed by chance. Pseudorandom signals do not have this “chance” property. Two examples of pseudorandom signals will now be presented. EXAMPLE 2.12 a ‘Consider a sample realization of finite length T of a Gauss~Markov process. Let the time length T of the sample be large relative to the time constant of the process. 
After the sample is taken, of course, the time function in all its intimate etal is known tothe observer. After the fact, nothing remains to chance insof as the observer in concemed. Next, imagine folding this record back on itself into single loop (it might be on magnetic tape), and then imagine playing the Joop continuously. I should be clear that the resulting signal would be periodic and completely known (determined), at least to the original observer. Yet to 8 second observer casually ooking at a small portion ofthe loop, te signal would ‘appear to be just random noise It should be mentioned that this is not a com pletely hypothetical situation; experimental spectral analysis was frequently im- plemented in just this way in times prior to the modem on-line digital methods, ‘The “looped” signal that goes on ad infinitum is periodic, so it would have line type rather than continuous spectral characteristics. See Fig, 2.19, Hl Line-type spectra are characteristic of all pseudorandom signals. The line spacing may be extremely small, as is the case for very large 7, but itis there, nevertheless. Note that the envelope of the lines would approximate the square 100t of the spectral density of the process from which the sample was taken, ‘provided the record length T is lage." In this ease, the usual laboratory analog + suc speaking, the average eavelpe would approximate the gene rot ofthe petal density (cee Seon 215)104 CHAPTER? MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS spectrum analyzer would not be able to resolve individual fines, and it would indicate 2 smooth spectrum proportional to the average spectrum. This would be just fine and what was desired ifthe original experimental problem was to Sdeiermine the spectral charaterstcs of random process from z single long sam- ple of the process, The point of allthis is that the typical analog spectrum analyzer could not distinguish between pseudorandom noise and true random noise if the line spacing for the pseudorandom noise is very small. EXAMPLE 2.13 a Binary sequences generated by shift registers with feedback have periodic prop- erties and have found extensive application in ranging and communication sys- tems (9, 10, 11, 12), We will use the simple 5-bit shift register shown in Fig. 2.20 to demonstrate how a pseudorandom binary signal can be generated. tn this system the bits are shifted to the right with each clock pulse, and the inpat on the left is determined by the feedback arrangement. For the initial condition shown, it ean be verified that the output sequence is 1111190011011 101010900100101100 1111100. 31 bits same 31 bits etc Note that the sequence repeats itself after 31 bits. This periodic property is characteristic of a shift register with feedback. The maximum length of the sequence (before repetition) is given by (2" ~ 1), where n isthe register length (). The S-bit example used here is then a maximumlength sequence. These Sequences are especially interesting because oftheir pseudorandom appearance. NNote that there are nearly the same number of zeros and ones in the sequence (16 ones and 15 zeros), and that they appear to occur more or less at random, If we consider a longer shift register, say, one with 10 bits, its maximum-length sequence would be 1023 bits, It would have 512 ones and 511 zeros; again, these would appear to be distributed at random. A casual look at any interior string of bits would not reveal anything systematic. 
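A feedback shift register of this kind takes only a few lines of MATLAB to simulate. The sketch below is illustrative and assumes a particular maximal-length feedback arrangement (the new bit is the exclusive-or of stages 5 and 3, corresponding to the primitive polynomial x^5 + x^3 + 1); the taps actually shown in Fig. 2.20 may differ, in which case the particular 31-bit pattern would differ as well, but its period and pseudorandom appearance would be the same.

% Illustrative sketch: 5-stage shift register with feedback (assumed taps: stages 5 and 3)
reg = [1 1 1 1 1];                 % initial condition: all ones
nbits = 62;                        % generate two periods to expose the repetition
out = zeros(1, nbits);
for k = 1:nbits
    out(k) = reg(5);               % output is taken from the last stage
    fb = xor(reg(5), reg(3));      % feedback bit (assumed arrangement)
    reg = [fb reg(1:4)];           % shift right; the feedback bit enters on the left
end
disp(isequal(out(1:31), out(32:62)))   % displays 1 (true): the sequence repeats after 31 bits

Running the sketch reproduces the kind of periodic, noise-like bit stream just described.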
Yet the string of bits so ‘generated is entirely deterministic, Once the feedback arrangement and initial condition are specified, the output is determined forever after. Thus, the sequence is pseudorandom, not random, a When converted to voltage waveforms, maximum-length sequences also hhave special autocorrelation properties. Returning to the 31-bit example, let bi- gure 220, inary irate with foettac | | i ' | 2.15 235 DETERMINATION OF AUTOCORRELATION AND SPECTRAL DENSITY FUNCTIONS 105, Figure 221. Pseudo binary waveform, nary one be 1 V and binary zero be ~1 V, and let the voltage level be held constant during the clock-pulse interval. The resulting time waveform is shown in Fig. 221, and its time autocorrelation function is shown in Fig. 2.22. Note thatthe unique distribution of zeros and ones for this sequence is such that the autocorrelation Function isa small constant value after a shift of one unit of Gime (Le, one bid). This is typical of all maximum-length sequences. When the se- uence length is long, the correlation after shift of one unit i quite mall, ‘This has obvious advantages in correlation detection schemes, and such schemes have been used extensively in electronic ranging applications (10, 11). ‘The spectral density function for the waveform of Fig. 2.21 is shown in Fig. 2.23. As with all pseudorandom signals, the spectrum is line-type rather than continuous (9, 12). DETERMINATION OF AUTOCORRELATION AND SPECTRAL DENSITY FUNCTIONS FROM EXPERIMENTAL DATA ‘The determination of spectral characteristics of a random process from experi- mental data is a common engineering problem. All of the optimization tech- niques presented in the following chapters depend on prior knowledge of the spectral density of the processes involved. Thus, the designer needs this infor- , Conan os gure 2.22 Tine acon Sct fr wave of ee am106 CHAPTER|2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS i Hl Figure 229, Spectral dont for poauorandem Bary wate, mation and it usually must come from experimental evidence. Spectral deter- mination is a relatively complicated problem with many pitfalls, and one should approach it with a good deal of caution. I is closely related to the larger problem of digital data processing, because the amount of data needed is usually large, and processing it either manually or in analog form is often not feasible. We first consider the span of observation time of the experimental data, which is a fundamental limitation, irespestive of the means of processing the data. ‘The time span of the data to be analyzed must, of course, be finite; and, as, 4 practical matter, we prefer not to analyze any more data than is necessary t0 achieve reasonable results, Remember that since this is a matter of statistical inference, there wil always remain some statistical uncertainty inthe result. One way to specify the accuracy of the experimentally determined spectrum or auto- correlation function is to say that its variance must be less than a specified value. General accuracy hounds applicable to all processes are not available but there is one special case, dhe Gaussian-process, that is amenable to analysis. 
We will not give the proof here, but itis shown in @ number of references (5, 13) that the Variance of an experimentally determined autocorrelation function saisies the inequality var vate = $f Re ar us) ‘where it is assumed that a single sample realization of the process is being analyzed, and T= time length of the experimental record ‘Rg() = autocorrelation function of the Gaussian process under consideration Vq(1) = time average of X,(0X(¢ + 2) where Xp(0) is the finite-length sample of X(0) [Le., Vel) is the experimentally determined ‘autocorrelation function based on a finite record length] It should be mentioned that in determining the time average of X(0Xy(t + 7) ‘we cannot use the whole span of time T, because X,(0) must be shifted an amount cof 7 with respect to itself before multiplication. The true extension of X7(9 ‘beyond the experimental data span is unknown; therefore, we simply omit the rnonoverlapped portion in the integration: | | | | \ | | 215 DETERMINATION OF AUTOCORRELATION AND SPECTRAL DENSITY FUNCTIONS 107 Vela) time avg, of X(OK (E+ 2] = ES “x HOXle +d 2.152) 1t wil be assumed from this point on that the range of 7 being considered is ‘much less than the total data span 7, tha is, 7. We first note that Vq(2) i the result of analyzing a single time signal therefore, Va(=) is itself sta sample function from un ensemble of functions. Its hoped that V(x) as determined by Eq, 2-152) will yield a good estimate of Re(7) and, in order to do so it shouldbe an unbiased estimator This can be vetfed by computing its expectation Fol mons a «| Pal eons ae ELVg(] = 5 FEL mame aus ‘Thus, V2) is an unbiased estimator of f(A). Also, it can be seen from the equation for Var Va(), Ea, 215.1), that if the integral of Re converges (eg Fy decreases exponentially with 2), then the variance of Vq(*) approaches 2210 8 T becomes large. Thus, Vx(2) would appear to be a well-behaved estimator of Ry(7), that is, Ve(0) converges in the mean to Ry(e). We wall now pursue the estimation accuracy problem further. Equation (215.1 sof litle value ifthe proces autocorelation function is not knovin. So, at this point, we assume that X() is a Gauss-Mazkov process ‘with an suiecoreation function Rea) = esa) ‘The o and f parameters may be difficult to determine in a real-life problem, but we can get at least a rough estimate of the amount of experimental: data needed for a given required accuracy. Substituting the assumed Markov auto- correlation function into Eq. (2.15.1) then yields varivsco) = 39 155) ‘We now look at an example illustrating the use of Eq, (2.15.5). EXAMPLE 2.14 Let us say thatthe process being investigated is thought to be @ Gauss-Markov process with an estimated time constant (1/) of 1 sec. Let us also say that we108 CHAPTER 2 MATHEMATICAL DESCHIPTION OF RANDOM SIGNALS ‘wish to determine its autoconclation function within an accuricy of 10 percent, land se want to know the length of experimental data needed. By accuracy’ ‘we mean that the experimentally determined V(2) should have a standard devi ation less than «| of the a of the process, at least for a reasonably small range Of + near zer0, Therefore, the ratio of Var{V(=)] to (a?) must be less than 01 Using Eg. 2.15.5), we can write varlVoo) <2 ar Setting the quantity on te left side equal to 01 and using the equality condition yield 1 TOF 100 see 2.156) vin ‘A skefen naiating a typleal sample experimental autoconaton function f= Shown in Fig. 
224, Note that 10 percent accuracy is really not an expecially Sermanding erent, bt ye the data eure 20 mest Hime Constant the process, To put this in more graphic terms, if the prozess unde invest. favon were andor yo dit witha estinaled ine const of 10 ous, 2000 Fours of continuous data would be needed to achieve 10 perent accuracy. This Could very well be in the same range as the mean time to fille forthe gyro. Were we to be more demanding and ask for 1 percent aceuricy, about 23 years of data would be required! I'can be seen that accurate determination of the futocorelation fonetion isnot a rival problem in some applications. (This ex: ample is pursued further in Problem 2.33) 5 ‘The main point to be leamed from this example is that reliable determi nation ofthe autocorrelation function takes considerably mare experimental data than one might expect intuitively. The spectral density function is just the Fourier transform of the autocorrelation function, so we might expecta similar accuracy problem in its experimental determination. Ee fomest] Figure 224 Croan and ius autocrine fo or Bena? 215 DETERMINATION OF AUTOCOARELATION AND SPECTRAL DENSITY FUNCTIONS 109 ‘As just mentioned, the spectral density function for a given sample signal ‘may be estimated by taking the Fourier transform of the experimentally deter- ‘mined autocorrelation function, This, of course, involves @ numerical procedure ‘of some sort because the data describing V,(=) will be in numerical form. The spectral function may also be estimated ditectly from the periodogram of the sample signal. Recall from Section 2.7 thatthe average periodogram (the square of the magnitude of the Fourier transform of X,) is proportional to the spectral density function (For large T). Unfortunately, since we do not usually have the luxury of having a large ensemble of periodograms to average, there ae pitfalls in this approach, just as there are in going the autocorrelation route. Neverthe less, modern digital processing methods using fast Fourier transform (FFT) tech niques have popularized the periodogram approach. Thus, it is important to understand is limitations (5, 6). First, there isthe truncation problem. When the time record being analyzed is finite in length, we usually assume thatthe signal will “jump” abruptly to ‘zero outside the valid data interval. This causes frequency spreading and gives rise to high-frequency components that are not truly representative of the process lunder consideration, which is assumed to ramble on indefinitely in a continuous ‘manner. An extreme case of this would occur if we were to chop up one long record into many very short records and then average the periodograms of the short records, The individual periodograms, with their predominance of high- frequency components due to the truncation, would not be at all representative of the spectral content of the original signal; nor would their average! Thus, the first rule is that we must have & long time record relative to the typical time variations inthe signal. This is true regardless of the method used in analyzing the data. There is, however, a statistical convergence problem that arises as the record length becomes large, and this will now be examined In Section 2.7 it was shown that the expectation of the periodogram ap. roaches the spectral density of the process for large T. This is certainly desir. able, because we want the periodogram to be an unbiased estimate of the spectral density. 
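Forming a raw periodogram from a sampled record takes only a few lines in MATLAB. The sketch below is illustrative and not part of the original text: it simulates a Gauss-Markov record with the first-order recursion used later in Problem 2.33 (the values σ² = 1, β = 1, Δt = 0.1 sec, and N = 1024 are arbitrary choices), and the discrete scaling Δt|FFT|²/N shown is one common way of approximating the periodogram, that is, the squared magnitude of the Fourier transform of the finite record divided by its length T.

% Illustrative sketch: raw periodogram of a simulated Gauss-Markov record
sigma = 1; beta = 1; dt = 0.1; N = 1024;       % parameters chosen for demonstration
phi = exp(-beta*dt);
w = sqrt(sigma^2*(1 - phi^2)) * randn(1, N);   % driving white sequence (cf. Problem 2.33)
x = zeros(1, N);
x(1) = sigma*randn;                            % start the record in steady state
for k = 1:N-1
    x(k+1) = phi*x(k) + w(k);                  % first-order Gauss-Markov recursion
end
P = dt * abs(fft(x)).^2 / N;                   % discrete approximation to the periodogram
f = (0:N-1) / (N*dt);                          % frequency points in hertz
plot(f(1:N/2), P(1:N/2))                       % raw (unaveraged) periodogram

The resulting plot is strikingly ragged, and increasing N does not smooth it out; this is precisely the convergence issue examined next.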
It is also of interest to look at the behavior of the variance of the periodogram as T becomes large. Let us denote the periodogram of X,(x) a8 ‘May T), that is, Mow. T) =F ilX EO) @isn [Note thatthe periodogram isa function of the record length T as well as an The variance of Ma, 7) is Var M = E(u) ~ (Eu) (2.15.8) Since we have already found E(M) as given by Fags. (2.7.8) and (2.7.9), we now need to find E(M?, Squaring Eg, (2.15.7) leads to rar) = KELL LLL xoxoxo vat sd (2.15.9)410 CHAPTER|2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS. It can be shown that if X() is a Gaussian process,* FLX@XIKWOKW)] = Ryle — Rew = v) + Ryle — WRyls ~ ») + R(t ORS 1) RAS.10 ‘Thus, moving the expectation operator inside the integration in Eq, (2.15.9) and using Eq, (2.15.10) lead to sry = EEF [te ome ~9 nen Raia = lc ei one acnreaas [fnew dat 1 [enemas [FE mle = ween a rhe eA me ems ara esa ex sing Hy 272) me 218.1 fas 6 ue -atruoys | [Innere aa cs Var M = E(M?) — [E(M)P teuny +2 (ff ff me-werereara|? e153 ‘The second term of Eq, (2.15.13) is nonnegative, soit should be clear that Var M = [EWP 215.14) ‘But E(M) approaches the spectral function as T+ =, Thus, the variance of the ‘at those exceptional periodogram does not go to zero as T+» (except possibly at those except points where the spectra function is zer0).In other words, the periodogram does + Soe Problem 230. | | 236 SAMPLING THEOREM. 111 ‘not converge in the mean as 7'—~ :t This is most disturbing, especially in view ‘of the populatity of the periodogram method of spectral determination. ‘The dilemma is summarized in Fig. 225. Increasing Twill not help educe the ripples in the individual periodogram. It simply makes M “jump around” fester with oo ‘This docs help, though, with the subsequent averaging that must accompany the spectral analysis. Recall hat itis the average periodogram that is the measure of the special density function. Averaging may not be essential in the analysis of deterministic signals, but itis for random signals. Averaging in both frequency and time is easily accomplished in analog spectrum analyzers by appropriate adjustment of the width ofthe scanning window and the sweep speed. In digital ‘analyzers, similar averaging over a band of diserete frequencies can be imple- mented in software. Also, further averaging in time may be accomplished by averaging successive periodograms before displaying the spectrum graphically. In either event, analog or digital, some form of averaging is essential when analyzing noise. (Averaging over a window of frequencies is illustrated in Prob. em 2.34) (Our treetment of the general problem of autocorrelation and spectral deter: ‘mination from experimental data must be brief. However, the message here should be clear. Treat this problem with respect. Its fraught with subtleties and pitfalls. Engineering literature abounds with reports of shoddy spectral analysis ‘methods and the attendant questionable results. Know your digital signal pro- cessing methods and recognize the limitations of the results We will pursue the subject of digital spectral analysis further in Section 2.17. But first we digress to present Skennon’s sampling theorems, which play fn important role in digital signal processing. 2.16 ‘SAMPLING THEOREM Consider a time function g(—) that is bandlimited, that is, [Nontriviat, [of = 20 MNto] = ote) = {Moreno s 2.16.) Under the conditions of Eq, (2.16.1) the time function can be written in the form Figure 228 ypc pavodogram trang reco oath.412. CHAPTEF|2. 
MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS kart % yr Figure 2.28, Sample of andinod srl of (2.162) “This remarabl theorem is due to CB, Shanon (14, 15), nit has seca “once hon ening wi band oe The hoe ye ao see a Meeity an inane sequence of sample values wy Bi of > wetonnay gced 1/217 se apart a shown in Fig. 226, then tere woud be ahead ony one banlimited funtion that would go through all the sample cscs ater wont spesfying the signal same valves a reqing (0 TeShalitd nteay specify the sigal in ween the sample points 38 mete sampling rate Of 21 Hz known as the Nygust ate. This represents ‘rama anpting re needed to peserve all he infeation content in the continous sigaal If we sample (at Tess than the Nygust rae, some ihfonaton ni elt, andthe origi signal eannot be eval reconstructed ait bans ofthe sequence of samples. Sampling at arate higher than the Sepa ate Sno necessary, but Goes n0 harm because his simply eatends ae alowabl ange of signal equencis beyond W Ha, Cerny signal lying st the bandh eso within a bandwith great than W. Tu dscabing stationary random process tat bani ican be seen that we noe Yo onsier on te stastical properties of samples taken at the Rogie of 20 Ha This Simplifies the proces descrpion considerably. 1 SG te fier requirement thatthe proces Gausson and white within the bandwidth Wen the jin probability density forthe samples may be temas simple prodct of sinlevaite normal density fonctions. Ths sim Jiteation frequently used in tose analysis inorder to make ee problem Mathematica See thre symmetry inte ect and ivere Fuser wansforms, we would expect here we coresponding sampling theorem inthe fequency oman ay be sated a follows. Consider the te function g( t0 be tine Tel hati mnuvi over span of te T and zero ose his teva then is Fourier aston Glo) maybe writen 2 seeing ite ng ty re sas ats str saat Sat oe igi toga tia te tS Ct ay fares 217 DISCRETE FOURIER TRANSFORM AND FAST FOURIER TRANSFORM 113 2.163) Al of the previous comments relative to time domain sampling have their cor- responding frequency-domain counterparts Frequently, itis useful to consider time functions that are limited i both time and frequency. Swictly speaking, this is not possible, but itis @ useful approximation. This being the case, the time function can be uniquely repre: sented by 2WT samples. These may be specified either in the time or frequency domain, Sampling theorems have also been worked out for the nonbaseband case (18, 19). These are somewhat more involved than the baseband theorems and ‘will not be given here, DISCRETE FOURIER TRANSFORM AND. FAST FOURIER TRANSFORM ‘The subject of digital signal processing has received considerable attention in the past few decades, and itis only natural that this would occur concurrently ‘ith the advancement of computer technology. Whole books have been devoted to the subject (20, 21), so that we cannot expect to do the matter justice in one short section. We can, however, presenta brief overview in order to place digital spectral analysis in proper perspective, We will then proceed on to the main subject ofthis book, namely, filter analysis. Modern computer technology has made it possible to perform an efficient discrete version of the Fourier transform. Thus, nearly all spectral analysis is now done using the direct periodogram approach rather than the more round- about approach via the autocorrelation function, The sampling. 
theorems pre- sented in Section 2.16 dictate some constraints in the choice of sampling rates and the total amount of data analyzed in any one batch. Since these constraints play an important role in digital signal processing, we will examine their con- sequences in some detail In spectral analysis, we usually have at least a rough idea as to the band- ‘width of the signal to be analyzed; therefore, let us say this is approximately 0 to W Hz. The sampling theorem, Eq, (2.16.2), says thatthe sampling rate should be 2W samples/sec or, equivalently, the sample spacing should be 1/2W sec. Let us further say that we wish to analyze NV samples in one batch where 1 is yet to be determined, The total time span of the samples would then be x otal timespan of data = 7 = ern414 MPTER2 MATHEMATICAL CESGALPTION OF FANDOM SIGNALS, “The frequency-domain sampling theorem, Bq, (2.16.3), states that our truncated time signal eould be represented in the spectral domain with discrete samples spaced 1/T or 2W/N apart. That is, we have N samples uniformly spaced in the time domain and /2 corresponding samples spaced uniformly from 0 to W fz in the frequency domain. Since each spectral sample is a complex number, the hurnber of degrees of freedom is N in either the time or frequency domain. This ‘one-to-one correspondence of scalar elements in the two domains suggests a teansform-pair relationship, and this will now be formalized. ‘As a matter of notation, let the truncated time signal be g(?) and its Fourier transform be tie = [eve a 72) Itis tacitly assumed hece that g() is real. Consider next a discrete approximation for G( jw) as follows: CUrdm Af) = D gerereney AT (2.17.3) a enc fie sage wet 1 ar = ama ne in ap 2H = ame ign nny min Ge star ana 1s B73 an in in eo cuorana=}Sne(-2@) ens or se ea ee i a Bk of he ie of Mer a ar ke ae CH tinal eget ach i oe EE gen(i), [Note that we do not claim th 5's to be exact samples of Gia), bu, i is hoped, they will be reasonable approximations thereof. .N=1 Q176) 217 DISCRETE FOURIER TRANSFORM AND FAST FOURIER TRANSFORM 145 re i pee, ts a ee ee ee 7 7 7 anes Figue 227 Dauete Fouter tandem magic andthe sites, ‘The 8, sequence as defined by Eq, (2.17.6) exhibits certain symmetry that is worth noting. Firs, if we extend the index m beyond N’~ 1, we simply get a periodic extension of the sequence: % et. eu Also, these is symmetry about the midpoint of the sequence in that Set yaa = ete 178) In other words, half ofthe defined 8s are complex conjugates of the other half and are thus redundant (see Problem 2.31). This is as expected; there can only bbe IV degrees of freedom in the frequency domain, just asin the time domain Figure 2.27 summarizes the symmetry properties of the 6 sequence. Once the 8, sequence is defined as per Eq. (2.17.6) it can be shown that ‘an exact inverse relationship exists as follows (20, 21)116 CHAPTER 2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS “The 8, and g, sequences then form a transform pair in the usual sense; that is, given one, we can find the other, and vice versa." As might be expected, the ‘4, sequence is called the discrete Fourier transform (DFT) of g, and, of course, iris the discrete inverse transform of 8, It should be emphsized that there is ‘BO approximation involved in the discrete transform pair relationship. This te lationship is exact, irespective of the sampling rate or frequescy content of 2) ‘The approximation comes inthe interpretation of 8, as samples ofthe continuous signal spectrum. 
This is a relatively complicated matter and tne references cited (6.20, 21) may be consulted for further details. It suffices bere to say that this bas been studied extensively and, with appropriate cautions and proper weighting cof the time data, the diserete Fourier transform can provide meaningful results ‘We might also note that all the preceding remarks about digital signal pro- cessing apply to deterministic as well as random signals In analyzing determin- fstie signals, we usually compute the magnitude ofthe discrete Fourier transform land cal this the signal spectrum. In the Fandom signal case, usually the square ff the magnitudes of the #, terms are formed, and this sequence (Le, (BP. [5,P. ..-) becomes an approximation of the periodogram of the signal being analyzed. The periodogram is, in turn, statistically elated to the power spectral Sensity function, which is usually the desired end result of the analysis. (See Problem 2.29 for comments about the crossperiodogram.) Digital implementation of the diserete Fourier transform is not a trivial matter. High resolution and celiable results in the frequency domain are obtained by making NV large. However, if one programs the transform literally as given by Eq. (2.176), the number of multiplications required is ofthe order N7. This cean easily get out of hand, especially in “on-line” applications. Fortunately, fst, tficient algorithms have been developed for which the number of required mul- tiplications is ofthe order N log, N rather than N* (20, 21). ‘The computational ‘saving is spectacular for large NV. For example, let N be 2! = 1024, which is a modest numberof time samples for many applications. Then N? would be about 10", whereas NV fog, N is only about 10%. This represents a saving of about factor of 100 and reflects directly into the time required for the transformation. "All ofthe fast discrete Fourier transform algorithms reqaie thatthe number ‘of samples be an integer power 2 (Which usualy presents no particular problem), land they all go under the generic name of fast Fourier transform (FFT), The FFT cannot perform any wondrous, magical tricks on the basic data (es some seem to believe; itis simply an efficient means of implemeating the DFT. Thes, fll ofthe cautions that apply to the DFT also apply to the FFT. Because of its efficiency, though, the FFT ig used almost universally in on-Hne speetral analysis ‘applications. Occasionally, though, in offline applications where speed is of Iie concer, the straightforward programming of the DFT, as given by Eq. (2.176), is advantageous. For example, if only 2 limited amount of data is available and itis desirable to achieve as fine a resolution as possible in the frequency domain, the straightforward DFT might as well be prefered, because 1 is not restricted to integer powers of 2 as itis with the FFT, + sone authors peer associ th 18 fst of. 2.17.6) wt he ives elaoship aber ‘nate diet toa, Sie ie 2 mpl popoenaiy factor. this peel pone | | | | | | | PROBLEMS 117 Before we leave the subject of the diserete Fourier transform, itis worth ‘mentioning that the DFT is easily generalized to the case where the time-domain samples are complex, rather than real as assumed here. The direct and inverse transform Eqs. (2.17.6) and (2.17.9) still apply in the complex case. Ione begins ‘with WV complex samples in the time domain, then there will bea corresponding set of N complex samples in the frequency domain. 
Also, if no special symmetry exists among the time-domain samples, then there will be no special symmetry ‘or “folding over" in the frequency domain. The periodic extension stated in Eq (2.17.7) sill applies in the compiex case, though. PROBLEMS 21 Noise measurements at the output of a certain amplifier (with its input shorted) indicate that the rms output voltage due to internal noise is 100 pV. If ‘we assume thatthe frequency spectrum of the noise is fat from 0 to 10 MHz and zero above 10 MHZ, find! (a) The spectral density function forthe noise (©) ‘The autocorrelation tuncton for the noise. Give proper units for both the spectral density and autocorrelation functions. 22 A sketch of a sample realization of a random process X() is shown in the fre, The pls amps a ae independent samples of norma random variable with zero mean and variance a. The time origin is completely random, Pole cise eies accor Me pees 1 F a 23. The waveform shown is an example of a digital-coded waveform. The signal is equally likely to be zero or one in the intervals (Fy + 1s lo + 2, t+ 3), et, and itis always zero in the “in between” intervals (fp + 1 fy + 2) ly + 3, fy + 4), ete. The switching time fy is random and uniformly distib- luted hetwicen zero and two, The presence of absence of a pulse in the “pulse pose intra ihe code fora binary digi, Teri no statis coneaton among any of the bits of the message. Find the autocorrelation and spectral dasityfuacons for his proces. oe a Problem 23,118 CHAPTER 2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS 24 Find the autocorrelation function corresponding to the spectral density fanetion SK) = So) + 480 ~ oy) + $800 + oy) + 25 A stationary Gaussian random process X(0) has an autocorrelation function of the form Reo) ‘What fraction ofthe time will [9 exceed four units? 2.6 A random provess X(0) is generated according tothe following res: (@) The waveform is generated with asample-and-old arrangement where the “hold” interval is 1 see (b) The sucessve amplitudes for cach I-sec interval are independent sam- ples taken from a zero-mean normal distribution with yariance of (©) The fist switching time is random variable with uniform distbution from O10 | (ie, the time odgin is random). Let X, denote X(r) and X, denote XC + 9. (@) Find the joint probability density function fy, (©) Is this process Gausian process? 2.7. Find the autocoreltion fonction forthe process described in Problem 2.6 ‘sing the expectation formula Ry) = BLK = [sted de de 28 It is suggested that a certain real process has an autocorrelation function ‘8 shown in the figure, Is this possible? Justify your answer (Hint: Calculate the spectral density function and see if itis plausible.) 29° Consider the random process X() = 2 sin w® where w is @ random variable ‘with uniform distribution between a = 2 and « = 6, Is the process (a) stationary, (©) ergodic, and (¢) deterministic or nondeterministic? 2.40 The input to an ideal rectifier (unity forward gain, zero reverse gain) is a Stationary Gaussian proces. (a) Is the output stationary? (b) Is the output a Gaussian process? Give a brief justification for both answers 2.11 A random process X(0) has sample realizations of the form PROBLEMS 119 XW ar+¥ where a is a known constant and ¥ is a random variable whose distibution is MO, o. Is the process (2) stationary and (b) ergodic? Tustfy your answers. 2.12 What is the autocorrelation function for X(0) of Problem 2.11? 2.13 A sample realization of a random process X() is shown in the igure. 
The time f when the transition from the ~1 state to the +1 state takes place is a random variable that is uniformly distributed between O and 2 units (@) Is the process stationary? (b) Is the process deterministic or nondeterministc? (€) Find the autocorrelation function and spectral density function forthe process, xe Problem 2:19, 2:14 A common autocorelation function encountered in physical problems i (=) = 06°91 c08 aye (2) Find dhe conesponding specesl density function (©) R(x) will be recognized asa damped cosine function. Sketch both the autocorrelation and spectral density funtion forthe Highly damped 2418 Show that a Gauss-Markov process described by the autocorrelation function RG) becomes Gaussian white noise if we let @ +o and o? <> in such a way thet the azea under the autocorrelation-function curve remains constant inthe limiting process 2.16 A stationary random process X(¢) has a spectral density function of the form Go? +12 Cree sD ‘What is the mean-square value of X(0? (Hint: S() may be resolved into a sum of two terms of the form: [Alta + 4)] + [B/(o? + 1)). Bach term may then be integrated using standard integral tables.) Sela) =4120. CHAPTER 2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS, 247 The stationary process X() has an autocorrelation function ofthe form Ryle) = oe PA ‘Another process Yt) i related to X() by the deterministic equation He = axt +6 where a and b are known constants. (@) What isthe autocorrelation function for ¥()? () What is the erosscorrelation function Ryy(2)? 248 The erosscorrelation function Re, (=) for Example 2.8 is sketched below. ‘What isthe corresponding cross spectral density function? Ray Problem 218 219 The random telegraph wave is described in Section 2.11. Let this process be the input to an ideal rectifier (unity forward gain, zero reverse gain) {a) What is the autocorrelation function of the output? (6) What isthe erosseortelation function Ryy() (X is input, ¥ is output)? 220 Two deterministic random processes are defined by XG) =A sin(wr + 8) Y) = BsinQor + 8) where @ isa random variable with uniform distribution between 0 and 2x, and to is known constant. The A and B coefficients are both normal random vari- ables N(O, o2) and are correlated with a correlation coefficent p. What is the ‘rosscortelation function Ryy(7)? (Assume A and B are independent of @) 221. The discrete random walk process is discussed in Section 2.13. Assume teach step is of length J and thatthe steps are independent and equally likely 0 be positive or negative, Show thatthe variance of the total distance D traveled in N steps is given by Var D PN (Hint: Fist write D as the sum fy + ly +++ fy and note that flay» Jy ane independent random variables. Then form E(D) and E(D") and compute Var D as ED) ~ [EWD)F) 222 Let the process Z(t) be the product of two independent stationary pro- cesses X() and H(). Show that the spectral density function for Z() is given by {Gn the s domain) ProeLems 121 i. 2 fst. ae (Hint: First show that R,(7) = Ry(DRy(2).] 225, The scl ey con esa poe 526) 1 S00) = aR Find the autocorrelation function for X(. 224 A stationary process X() is Gaussian and hes an autocorrelation function of the form Belo) = ae Let the random variable X, denote X(t) and X, denote X(t, + 1). Write the expression forthe joint probability density function fx, 0t 225 A stationary Gaussian process X() has a power spectral density function Find £09 and £0"), 2.26 A typical sample function of « stationary Gauss-Markov process is shown in the skeich. 
The process has a mean-square value of 9 units, and the random variables X1 and X2 indicated on the waveform have a correlation coefficient of 0.5. Write the expression for the autocorrelation function of X(t).

Problem 2.26

2.27 We wish to determine the autocorrelation function of a random signal empirically from a single time record. Let us say we have good reason to believe the process is ergodic and at least approximately Gaussian and, furthermore, that the autocorrelation function of the process decays exponentially with a time constant no greater than 10 sec. Estimate the record length needed to achieve 5 percent accuracy in the determination of the autocorrelation function. (By 5 percent accuracy, assume we mean that for any τ, the standard deviation of the experimentally determined autocorrelation function will not be more than 5 percent of the maximum value of the true autocorrelation function.)
2.28 In Problem 2.27 the signal is not known to be truly bandlimited, but it is reasonable to assume that essentially all the signal energy will lie between 0
Let Ww), ‘@z,-, ,) be the multidimensional Fourier transform of fyygr(fiy in === + +x) (but With the signs reversed on a}, a. . «» o,)- Then if can be readily Verified that PRooLes 123 180, done -LE ofstees afar tay =F) My Bay BUX Xap «4X (P2302) ‘The characteristic function fora zero-mean, vector Gaussian random variable X Uo) = epee 2.303) ‘where Cy is the covariance matrix for X. This, along with Bq. (F2.30.2), may now be used to justify the original statement given by Eq. (P2.30.1). 231 It was mentioned in Section 2.17 that half of the discrete Fousiertrans- form elements are complex conjugates of the other half. This statement deserves closer serutiny because IV may be either odd or even insofar asthe basic discrete twansform is concemed (and not its efficient FFT implementation). () For the case where N is even, show that fy and jy. ze both real, and thus the total number of nonredundant scalar elements in the frequency domain is N, just as in the time domain. (©) For the case where NV is odd, show that only i is constrained to be real, and thus the count of nonredundant scalar elements is N, just as in the previous case 232 The accompanying figure shows & means of generating narrowband noise from two independent baseband noise sources. (See Section 2.12.) The band- ‘width ofthe resulting narrowband noise is controlled by the eutolfIrequency of the low-pass filters, which are assumed to have identical characteristics. Assume that F\(@) and F.(9 are independent white Gaussian noise processes with similar spectral amplitudes. The resulting noise processes aftr low-pass filtering will then have identical autocorrelation functions that will be denoted R,(). (@) Show that the narrowband noise S() is @ stationary Gaussian random process whose autocorrelation function is Rg) = Ryla) cos or (©) Also show that both the in-phase and quadrature channels are needed to produce stationary narrowband noise, (That is, if either of the sin a,/ oF €05 w,¢ multiplying operations is omitted, the resultant output will not be stetly stationary.)424 CHAPTER 2 MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS Tampon |X x Naratand eo —l wie Fo) Compe 2, Aw tai L_ Filer Problem 2.2 233 A sequence of discrete samples of a Gauss-Markov process can be gen erated using the following difference equation (see Section 54): Keay = OM FW 01,2, W, = white sequence, N[0, o%(1 ~ 9] 7 = variance of the Markov process B at reciprocal time constant of the process time interval between samples Ifthe inital value of the process Xj is chosen from @ 2oputation that is ‘NO, 07), then the sequence so generated will be ® sample realization of a stationary Gauss-Markov process. Such a sample realization i easily generated with MATLAB's normal random number generator with appropriate scaling of initial X, and the W, sequence. {(a) Generate 1024 samples of 2 Gauss~Markov process with ? = 1, 6 = 1, and Ar = ,05 sec. As matter of convenience, et the samples be & 1024-clement row vector with a suitable variable name. (b) Calculate the experimental autocorrelation function for the X, sequence fof part (@). That is, find Vq(2) for = 0, 05, 10, .. . 30 (i.e. 60 “lags"). You will find it convenient here to write a general MATLAB. program for computing the autocorrelation function for 2 sequence of length + and foc m lags. (This program can then be 1sed in subsequent problems.) 
Compare your experimentally determined V_X(τ) with the true autocorrelation function R_X(τ) by plotting both V_X(τ) and R_X(τ) on the same graph. Note that for the relatively short 1024-point time sequence being used here, you should not expect to see a close match between V_X(τ) and R_X(τ) (see Example 2.14).
(c) The theory given in Section 2.15 states that the expectation of V_X(τ) is equal to R_X(τ) regardless of the length of the sequence. It is also shown there that V_X(τ) converges in the mean for a Gauss–Markov process as T becomes large. One would also expect to see the same sort of convergence when we look at the average of an ensemble of V_X(τ)'s that are generated from "statistically identical," but different, 1024-point sequences. This can be demonstrated (not proved) using a different seed in developing each V_X(τ). Say we use seeds 1, 2, ..., 8. First plot the V_X(τ) obtained using seed 1. Next, average the two V_X(τ)'s obtained from seeds 1 and 2, and plot the result. Then average the four V_X(τ)'s for seeds 1, 2, 3, and 4, and plot the result. Finally, average all eight V_X(τ)'s, and plot the result. You should see a general trend toward the true R_X(τ) as the number of V_X(τ)'s used in the average increases.

2.34 A recursion equation for generating a Gauss–Markov process was given in Problem 2.33. We wish to use the same process here, but the sampling rate will be changed to make it more suitable for a demonstration of experimental determination of the process power spectral density function. Therefore, the parameters will be the same as in Problem 2.33, except that we will let Δt = 1.0 sec in this problem.
(a) Generate a 256-point (i.e., samples) realization of the process just described and store the samples as a row vector.
(b) Using the first 64 samples of the process, calculate the discrete periodogram of the process (see Eq. 2.15.7). To do this, you will need to do a discrete Fourier transform (DFT) of the 64-point time series. This can be done either by writing your own m-file implementing the DFT according to Eq. (2.17.6) or by using the built-in MATLAB function fft. The results should be the same (within a scale factor). Plot the resulting periodogram.
(c) Repeat part (b) using the first 128 samples of the process, and then repeat it again using all 256 samples of the process. Note that the "noisiness" of the successive periodograms does not diminish as more time data are included. This is consistent with Eq. (2.15.14). As we include more time data, the frequency samples crowd closer together, but the noisiness of the samples remains the same. (Note that this is in contrast to the autocorrelation function determination, where increasing the time span smooths out the experimental estimate of the true autocorrelation function.)
(d) One way of smoothing the noisiness of the discrete periodogram is to average the data in the frequency domain. To demonstrate this, reconsider the 256-point periodogram of part (c) and form an "averaged periodogram" by averaging the data in successive 8-point blocks in the frequency domain. This yields a smoother plot, but with coarser resolution, of course. Now each point represents the power in a bandwidth of 8Δf rather than Δf as before.
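A brief MATLAB sketch of the averaged periodogram of part (d) might look like the following. (Variable names are ours; the scale factor applied to the raw periodogram should be checked against Eq. (2.15.7) before absolute levels are compared with the true spectral density.)

N = 256;  dt = 1;  phi = exp(-dt);           % beta = 1, sigma^2 = 1, dt = 1 sec
x = zeros(1,N);  x(1) = randn;               % X0 ~ N(0, sigma^2)
for k = 1:N-1
   x(k+1) = phi*x(k) + sqrt(1 - phi^2)*randn;
end
X = fft(x);                                   % discrete Fourier transform of the record
P = (abs(X).^2)*dt/N;                         % raw 256-point periodogram
Pavg = mean(reshape(P, 8, N/8));              % average successive 8-point blocks
plot(Pavg)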
Note thatthe average periodogram is be- sinning to approximate the tue spectra density that is 207B/(w* + By This approximation can be improved further by holding the sampling interval constant and increasing the number of samples in the time domain, 235 Discrete samples of a Wiener process are easily generated using MATLAB's normal random number generator and implementing the recursion equation: Xe Xt We 20,12. (P2.35) ‘where the subscript kis the ime index and the initial condition is set to X%=0426 CHAPTER? MATHEMATICAL DESCRIPTION OF RANDOM SIGNALS Consider a Wiener process where the white noise being integrated has a power spectral density of unity (see Section 2.13), and the sampling interval is 1 sec. ‘The increment to be added with each step (Le, W,) is a NO, 1) random variable, and all the W's are independent. [That this will generate samples of a process ‘whose variance ist (in sec) is easily verified by working out the variance of X, for a few steps beginning at k= 0.) (a) Using Eq, (P2.35), generate an ensemble of 50 sample realizations of the Wiener process described above for k= 0, 1, 2... . 10. For ‘convenience, arrange the resulting realizations into a 50 > 11 matrix, ‘where each row represents a sample realization beginnit 0. () Plot any 8 sample realizations (i.e, rows) from part (a), and note the ‘obvious nonstationary character of the process. () Form the average squares of the 50 process realizations from part (a), fand plot the result vs. time (ie., 4), (The resulting plot should be ap- proximately linear with a slope of unity.) REFERENCES CITED IN CHAPTER 2 1. E, Wong, Stochastic Processes in Information and Dynamical Systems, New Yor: MoGrawHil, 1971 2. AH. Jazwinski, Scchasrc Processes and Filtering Theory, New York: Academic Press 1970, 3. N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series, ‘Cambridge, MA: MIT Press and New York: Wiley, 1989. 4.5.0. Rice, "Mathematical Analysis of Noise," Bell System Tech, J, 23, 282-332 (94ey 24, 46-256 1945). 5. 1S. Bendet and A. G,Pietsol, Random Date: Analysis and Measurement Procedures, [New York: Wiley-interscienee, 1971 6. KS. Shanmogan and A. M. Breipol, Rendom Signals: Detection, Estimation, and Data Analysis, New York: Wily, 1988 7. W.B. Davenport, Je. and W. L. Root, An Iniroduction to the Theory of Rendom Signals ond Noise, New York: McGraw-Hill, 1958. 8, A. Papoulis, Probably, Rardom Variables, and Stochastic Processes, 2nd ed, New “Yorks MeGraw-Hil, 1984, 9. RC. Dixon, Spread Specirum Systems, New York: Wiley, 1976 10, RL. Denar, “"Navstare The All-Purpose Satelite,” IEEE Spectrum, 18:5, 35-40 (May 198). 11, B. W. Parkinson and $. W. Giben, °NAVSTAR: Global Positioning System—Ten ‘Years Late” Proc. IEEE, 71:10, 1177-1186 (Oct, 1983. 12, RE. Ziemer and RL. Peterson, Digital Communications and Spread Spectrum ‘Shates, New York: Macmillan, 1985 13, 1H. Laning, Je and R. H. Bata, Random Processes in Automatic Control, New York: McGraw-Hill, 1956 14. CUE. Shannon, "The Mathematical Theory of Communication” Bell System Tech 5F Ouly and Oct. 1948), (Later reprinted in book form by the Univesity of Hinois Pres, 1949.) 15, CE” Shannon, “Communication in the Presence of Noise;” Proc. Inst. Radio Engr, 371, 10-21 Gan. 1949). 16, H. Nygqist, “Certain Topics in Telegraph Transmission Theory." Trans. Am, Inst lect. Engr, 47, 617-644 (Apel 1928). REFERENCES 127 17. H.S, Black, Modulation Theory, New York: Van Nostrand Co., 1950. 1. 
S, Goldman, information Theory, Eaglewood Clits, NJ: Preatice-Hall, 1953. 19. K, 8. Shanmugam, Digtal and Analog Communication Systems, New York: Wiley, 1978, 20. A. V, Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, Englewood Clif, NI: Prentice Hal, 1989 21, D, Childers and A. Dusing, Digital Filtering and Signal Processing, St.Paul, MN: West Publishing Co,, 1975. Additional References on Probability and Random Signals, 22, PZ. Peebles, Je, Probability Random Variables, and Random Signal Principles, 3rd ed, New York: McGraw-Hill, 1993. 23, HJ. Latson and B. ©. Shuber, Probabilitc Models in Engineering Sciences (Vols | and 2), New Yorke Wile, 1999. 24, GR. Cooper and C.D, MeGillem, Probabilistic Methods of Signal and System ‘Analysis, 2 ed, New York: Holt, Rinehart, and Winston, 1986, 25, JL, Melsa and A, Sage, An Inraduction to Probability and Stochastic Processes, Englewood Clits, Ni: Prentice-Hall, 1973. 26, A.M, Breipohl, Probabilictic Systems Analysis, New York: Wiley, 1970, 27. 1. V, Candy, Signal Processing: The Model-Based Approach, New York: MeCraw- Hil, 1986.Response of Linear Systems to Random Inputs 34 “The central problem of linear systems analysis is: Given the input, what is the ‘output? In the deterministic case, we usually seek an explicit expression for the response or output, In the random-input problem no such expicit expression is possible, except for the special case where the input is a so-called deterministic random process (and not always in this case). Usually, in random-input prob- Jems, we must settle for a considerably less complete descripion of the output than we get for corresponding deterministic problems. In the case of random processes the most convenient descriptors to work with are autocorrelation func tion, spectral deasity function, and mean-square value. We now examine the Jnpat-oorput relationships of linear systems in these terms. INTRODUCTION: THE ANALYSIS PROBLEM. 128 In any system satisfying a set of linear differential equations, the solution may be writen as a superposition of an initial-condition part and another part due to the driving or forcing functions. Both the initial conditions and forcing functions ‘may be random; and, if so, the resultant response isa random process. We direct ‘our attention here to such situations, and it will be assumed thatthe reader has at least an elementary acquaintance with deterministic methods of linear system analysis (1, 2, 3) With reference to Fig. 3.1, the analysis problem may be simply stated: Given the initial conditions and the input and the system's dynamical charac- teristics [ie., G(e) in Fig. 3.1], what is the output? Of course, in the stochastic problem, the input and ouiput will have to be described in probabilistic terms. ‘We need to digress here for a moment and discuss a notational matter. In (Chapters 1 and 2 we were careful to use uppercase symbols 1o denote random 32 STATIONARY (STEADY-STATE) ANALYSIS 129 Boson foe 60 Fe Ga rte etin sx Figure 14. lock diagram or mentary ania Vatiables and lowercase symbols forthe corresponding arguments oftheir prob- ability density functions. This is te custom in most current books on probability There is, however, a long tradition in engineering books on automatic control and linear systems analysis of using lowercase for time functions and uppercase {or the corresponding Laplace or Fourier transforms. Hence, we are confronted with notational conflict. 
We will resolve this in favor of the traditional linear analysis notation, and from this point on we will use lowercase symbols for time signals—either random or deterministic—and uppercase for their transforms. This seems to be the lesser of the two evils. The reader will simply have to interpret the meaning of symbols such as x(t), f(t), and the like, within the context of the subject matter under discussion. This usually presents no problem. For example, with reference to Fig. 3.1, g(t) would mean the inverse transform of G(s), and it clearly is a deterministic time function. On the other hand, the input and output, f(t) and x(t), will usually be random processes in the subsequent material.

Generally, analysis problems can be divided into two categories:
1. Stationary (steady-state) analysis. Here the input is assumed to be time stationary, and the system is assumed to have fixed parameters with a stable transfer function. This leads to a stationary output, provided the input has been present for a long period of time relative to the system time constants.
2. Nonstationary (transient) analysis. Here we usually consider the driving function as being applied at t = 0, and the system may be initially at rest or have nontrivial initial conditions. The response in this case is usually nonstationary. We note that analysis of unstable systems falls into this category, because no steady-state (stationary) condition will exist.

The similarity between these two categories and the corresponding ones in
deterministic analysis should be apparent. However, if we are only interested in the stationary result, obtaining it as the limiting case of a transient analysis is getting at the solution the "hard way." Much simpler methods are available for the stationary solution, and these will now be considered.

3.2 STATIONARY (STEADY-STATE) ANALYSIS

We assume in Fig. 3.1 that G(s) represents a stable, fixed-parameter system and that the input is covariance (wide-sense) stationary with a known spectral function. From deterministic analysis, we know that if the input is Fourier transformable, the input spectrum is simply modified by G(jω) in going through the filter. In the random process case, the appropriate interpretation of the spectral function is that it is proportional to the square of the magnitude of the Fourier transform. Thus the equation relating the input and output spectral functions is

S_x(s) = G(s)G(−s)S_f(s)    (3.2.1)

(See Problem 3.1 for a formal justification of Eq. 3.2.1.) Note that Eq. (3.2.1) is written in the s domain, where the imaginary axis has the meaning of real angular frequency ω. If you prefer to write Eq. (3.2.1) in terms of ω, just replace s with jω. Equation (3.2.1) then becomes

S_x(jω) = G(jω)G(−jω)S_f(jω) = |G(jω)|² S_f(jω)    (3.2.2)

Because of the special properties of spectral functions, both sides of Eq. (3.2.2) work out to be real functions of ω. Also note that the autocorrelation function of the output can be obtained as the inverse Fourier transform of S_x(jω). Two examples will now illustrate the use of Eq. (3.2.1).

EXAMPLE 3.1
Consider a first-order low-pass filter with unity white noise as the input. With reference to Fig. 3.1, then

S_f(s) = 1
G(s) = 1/(1 + Ts)

where T is the time constant of the filter. The output spectral function is then

S_x(s) = [1/(1 + Ts)]·[1/(1 − Ts)]·1

or, in terms of real frequency ω,

S_x(jω) = 1/(1 + T²ω²)

This is sketched as a function of ω in Fig. 3.2. As would be expected, most of the spectral content is concentrated at low frequencies, and then it gradually diminishes as ω → ∞.

Figure 3.2 Spectral function of the output of a first-order low-pass filter with white-noise input.

It is also of interest to compute the mean-square value of the output. It is obtained by integrating the output spectral function (see Chapter 2):

E(x²) = (1/2πj) ∫_{−j∞}^{j∞} [1/(1 + Ts)]·[1/(1 − Ts)] ds    (3.2.3)

The integral of Eq. (3.2.3) is easily evaluated in this case by substituting jω for s and using a standard table of integrals. This leads to

E(x²) = 1/2T

The "standard" table-of-integrals approach is of limited value, though, as will be seen in the next example.

EXAMPLE 3.2
Consider the input process to have an exponential autocorrelation function and the filter to be the same as in the previous example. Then

R_f(τ) = σ² e^{−β|τ|}
G(s) = 1/(1 + Ts)

First, we transform R_f(τ) to obtain the input spectral function:

S_f(s) = 2σ²β/(β² − s²) = 2σ²β/[(β + s)(β − s)]

The output spectral function is then

S_x(s) = [1/(1 + Ts)]·[1/(1 − Ts)]·2σ²β/[(β + s)(β − s)]    (3.2.4)

Now, if we wish to find E(x²) in this case, it will involve integrating a function that is fourth-order in the denominator, and most tables of integrals will be of no help. We note, though, that the input spectral function can be factored and the terms of Eq. (3.2.4) can be rearranged as follows:

S_x(s) = [√(2σ²β)/((1 + Ts)(β + s))]·[√(2σ²β)/((1 − Ts)(β − s))]    (3.2.5)

The first term has all its poles and zeros in the left half-plane, and the second term has mirror-image poles and zeros in the right half-plane. This regrouping of terms is known as spectral factorization and can always be done if the spectral function is rational in form (i.e., if it can be written as a ratio of polynomials in even powers of s).

Since special tables of integrals have been worked out for integrating complex functions of the type given by Eq. (3.2.5), we defer evaluating E(x²) until these have been presented in the next section.
We note, however, thatthe concept fof power spectral density presented in Section 2.7 1s perfectly general, and its integral represents a mean-square value irespective of wheter or not the inte- gral can be evaluated in closed form, There are many physical examples where ‘One must resort to numerial integration to determine the “power” content of the signal (e.g., see Problem 3.28). a 33 INTEGRAL TABLES FOR COMPUTING MEAN-SQUARE VALUE {In linear anslysis problems, the spectral function can often be written as a ratio ‘of polynomials in s. If tis is the case, spectral factorization can be used to ‘write the function in the form 6) <2) rire 3) 5,09 where (s)/d{s) has al its poles and zeros in the left half-plane and e(-s)/d(~3) hes mirror-image poles and zeros in the right half-plane. No roots of dis) are permitted on the imaginary axis, The mean-square value of xcan now be written eb)e(—9) 9, @32) B= Fj nov) RS. Phillips (4) was the fist to prepare a table of integrals for definite integrals of the type given by Eq, (3.3.2). His table has since been repeated in ‘many texts with a variety of minor modifications (5, 6, 7). An abbreviated table in terms of the complex s domain follows. An example will now illustrate the tse of Table 3.1, 129 INTEGRAL TARLES FOR COMPUTING MEAN-SQUARE VALUE 193, dd + (6 ~ Desa, + Gite, Baghdad ~ dey lcd, + dad) + (G ~ Deed + (Ct — Deedee + ends + dade) Bald id, iy) EXAMPLE 3.3 = ‘The solution in Example 3.2 was brought to the point where the spectral function hhad been written in the form A [eee V2e7B 17 O-[Semrimllcrmssm)] 6 Clearly, 8, has been factored properly with its poles separated into left and right half-plane pars. The mean-square value of x is given by 2), sie eas Comping he frm ofS) n Eq, ..4) withthe standard orm given in (3.3.3), we see that 2 -194 CHAPTER'S RESPONSE OF LINEAR SYSTEMS TO RANDOM INPUTS. and B(2) is then a 20%pIT? @ Me) Fad, TpITNB + T+ Bre PURE WHITE NOISE AND BANDLIMITED SYSTEMS ‘We are now in a postion to demonstrate the validity of using the pure white- noise model in certain problems, even though white noise has infinite variance. ‘This will be done by posing two hypothetical mean-square analysis problems: 1, Consider a simple first-order low-pass filter with bandlimited white noise asthe input. Specifically, with reference to Fig. 3.1, let ean G42) 2. Consider the same low-piss filter 25 in problem 1, but with pure white noise as the input: 5a) =A, forall w G43) Ge) = G44 ee Certainly, problem 1 is physically plausible because bandlimited white noise has finite variance. Conversely, problem 2 is not because the input has infinite variance. The preceding theory enables us to evaluate the ‘mean-square value ofthe outpat for both problems. As a matter of con- venience, we do this in the real frequency domain rather than the complex s domain, Problem 1: 1 or se sufi ay 45) fale, eh ft A aw Arte: Bea) ape OD AB) 45 NOISE EQUVALENT BANDWOTH 735 Problem 2: 54d) = Tor all w Gan canes 1¥ Co) Fd G48) Now, we see by comparing the results given by Eqs. (3.4.6) and (3.4.8) thatthe difference is just that between tan” "(w,T) and tan~"(@2) The bandwidth Of the input is w, and the filter bandwidth is 1/7. Thus, if their rato is large, tan“"(w.T) ~ tan"). For a ratio of 100:1, the error is less than 1 percent. ‘Thus, ifthe input spectrum is fiat considerably out beyond the point where the system response is decreasing at 20 db/decade (or faster), there is relatively litde error introduced by assuming that the input is flat out to infinity. 
The resulting simplification inthe analysis is significant NOISE EQUIVALENT BANDWIDTH In Blter theory, it is sometimes convenient to think of an idealized filter whose frequency response is unity over @ prescribed bandwidth B (in hertz) and zero ‘outside this band. This response is depicted in Fig, 33a, If this ideal filter is driven by white noise with amplitude A its mean-square response is, a Figure 3.3. cal aed actual er eens, al A196 CHAPTER . RESPONSE OF LINEAR SYSTEMS TO RANOOM INPUTS Be) Gidea) = 2 fi dw = 24 651) [Nest, consider an actual filter G(s) whose gain has been normalized to yield ‘peak response of unity. An example is shown in Fig. 3.35. The mean-square response of the actual filter to white noise of amplitude A is given by Lf sanaa 2 Bie) (aca) = Gf” aGinat—n ae 652) Now, if we wish to find the idealized filter that will yicld this same response, ‘we simply equate E(2) (deal) and (2) (actual) and solve ‘or the bandwidth that gives equality, The resultant bandwidth B is known as th: noise equivalent bandwidth. It may, of course, be written explicitly as 1a » corn! [Lf evens] asa puwriess— = i ie a) = ashy Since the peak response of G(s) occurs at zero frequency and is unity, the gain scale factor is set properly. We must next evaluate the integral in brackets in Eg. {5.3}. Clearly, G(s) is second-order in the denominator, and therefore we use 4 ofthe integral tables given in Section 3.3. The coefficients in this ease are a0 deat ett d= ant and thus J, is ‘The filter's noise equivalent noise bandwidth is then 1 Banh ‘This says, in effect, that an idealized filter with a bandwidth of 1/87 Hz would pass the same amount of noise as the actual second-order fier. 5 28 SHAPINGFLIER 137 St —e[ ea }-—~0 Figure 9.4 Shaping ter 36 SHAPING FILTER With reference to Fig. 3.4, we have seen that the output spectral function can be written as S49) = 1+ G)GI-9) Gen) 1f G(s) is minieoum phase and rational in form,* Eg. (3.6.1) immediately pro- vides a factored form for S,(s) with poles and zeros automatically separated Into Jeft and right halé-plane pars Clearly, we can reverse the analysis problem and pose the question: What minimem-phase transfer function will shape unity white noise into a given spec- tral function 5,(9)? The answer should be apparent. If we can use special fax torization on Si), the past with poles and zeros in the left half-plane provides the appropriate shaping filter. Ths is a useful concept, both as a mathematical aifice and also as a physical means of obtaining a noise source with desired spectral characteristics from wideband source. EXAMPLE 3.5 ‘Suppose we wish to find the shaping filter that will shape unity whi noise with a spectral function noise into ott Sia) = FAS 8.62) First, we waite S, in the s domain as S09) (663) [Next we find the poles and zer0s of S, Zeros = #1 Poles “24, 2272 Finally, we group together left and right half-plane parts. $5) can then be ‘written as, “hi coniton mies tv fie mtr foe a o,f wich mB in ‘the left half-plane, m180 OFTEN RESPONSE OF LNEAA SYSTEMS TO RANDOM UTS sei ast SO- Tears Fae ae ‘The desire shaping iter is then ce) 665) 5 a7 NONSTATIONARY (TRANSIENT) ANALYSIS— INITIAL CONDITION RESPONSE. [As mentioned previously, the response of a linear system may always be con- sidered as a superposition of an initisl-condition part and a driven part. 
The response due to the inital conditions is often ignored in tutorial discussions, but itshould not be because there are many applications where the initial conditions fre properly modeled as random variables. If this isthe case, one simply solves the problem using standard deterministic methods leaving the intial conditions in the solution in general terms, An example wil illustrate the procedure, Aue ‘Suppote the circuit shown in Fig 3.5 has been in operation for a long time with the switch open, and then the switch is closed st a random time, which we ‘denote as 1 = 0, We are asked to describe the voltage across the capacitor after ‘We first look at the steady-state condition just prior to closing the switch, ‘The transfer function relating the capacitor voltage to the white-noise source is given by oo-its e2) Figure 38. Grout fr Example 36 i | | i 37 NONSTATONARY (TRANSIENT ANALYSIS—sNTIML CONOTTON RESPONSE 185 ‘Therefore, by using the methods of Sections 3.2 and 3.3, the mean-square value ‘of the capacitor voltage v, is found to be ee ele BOD = aaj lem T+ 2s T= 258-4 We have now established that the with zero mean and a variance of 4 After the switch is closed, the system differential equation is tal condition is a normal Gi, +4e=0 73) R Taking the Laplace transform of both sides yields Lev.) - 94) + VA 2,0) + RO) va) a4 ‘The explicit expression for the time-domain waveform is now obtained by taking the inverse transform of Eq, (3.7.4). Kis 2) = vere 75) ‘where v{0) is # random variable characterized by (0, 4). Note that the solution ‘ofthe initial-condition problem leads to a deterministic random process; that is, ‘the process has deterministic structure and any particular realization of the pro- ‘ess is exactly predictable once the initial condition is known. ‘The mean-square value of a random process is usually of prime interest. In this ease, itis easily computed as Blot) = EleOveP mae - Blv30)) meme 6.75) 10s, ofcourse, a funtion of time 5 ‘The extension of the procedure of Example 3.6 to more complicted situ- ations is fairly obvious, so this will not be pursued further.40 ciMETER'S RESPONSE OF LINEAR SYSTEMS TO RANDOM INPUTS 38 NONSTATIONARY (TRANSIENT) ANALYSIS—FORCED RESPONSE “The block diagram of Fig. 3.1 is repeated as Fig. 3.6 with the addition of a tc in the inpot. Imagine the system to be initially at rest, and then close the ‘witch at ¢= 0. A transient response takes place in the stochastic problem just ts in the corresponding deterministic problem. Ifthe input fie) is a nondeter- rinistic random process, we would expect the response also to be nondetermin- Hate, and its autocorrelation function may be computed in terms of the input ‘utocorrelation function. This is done as follows, “The system response can be written as a convolution integral 36) = ff aeofd- 0 du 68» wwhete gu) isthe inverse Laplace transform of G{s) and 1s usually referred 10 tS the system weighting function, To find the autocorrelation function, we simply evaluate E[x(t,)x(t,)} Rally 4) = Elles) = ef eens, nde set 9 a] [" f seoseretee, - 04 - ol ded 082) Now, if f() is stationary, Pq. (3.8.2) can be written as aioe ff rotoru-v+4-vam 029 sev nov ves xen fo hut arte con ms ee ate a ara lye Te Gate eros toa so epee masta cat Postpa p—~—[ea}— ee Fee 6 Beck dager renatnay eras sen 38 NONSTATIONARY (TRANSIENT) ANALYSIS-FORGED RESPONSE 141, ate = ff etmecrrsu ~ 0) du do a4) ‘Three examples will now illustrate the use of Eqs. 
(3.8.3) and (3.8.4).

EXAMPLE 3.7
Let G(s) be a first-order low-pass filter, and let f(t) be white noise with amplitude A. Then

G(s) = 1/(1 + Ts),    S_f(jω) = A

Taking inverse transforms gives

g(u) = (1/T) e^{−u/T},    R_f(τ) = A δ(τ)

Next, substituting in Eq. (3.8.4) yields

E[x²(t)] = ∫_0^t ∫_0^t (1/T)e^{−u/T} (1/T)e^{−v/T} A δ(u − v) du dv
         = (A/T²) ∫_0^t e^{−2v/T} dv
         = (A/2T)[1 − e^{−2t/T}]    (3.8.5)

Note that as t → ∞, the mean-square value approaches A/2T, which is the same result obtained in Section 3.2 using spectral analysis methods.

EXAMPLE 3.8
Let G(s) be an integrator with zero initial conditions, and let f(t) be a Gauss–Markov process with variance σ² and time constant 1/β. We desire the mean-square value of the output x. The transfer function and input autocorrelation function are

G(s) = 1/s,    g(u) = 1 (unit step)

and

R_f(τ) = σ² e^{−β|τ|}

Next, we use Eq. (3.8.4) to obtain E[x²(t)]:

E[x²(t)] = ∫_0^t ∫_0^t σ² e^{−β|u−v|} du dv    (3.8.6)

Some care is required in evaluating Eq. (3.8.6) because one functional expression for e^{−β|u−v|} applies for u > v, and a different one applies for u < v. This is shown in Fig. 3.7. Recognizing that the region of integration must be split into two parts, we have

E[x²(t)] = ∫_0^t ∫_0^u σ² e^{−β(u−v)} dv du + ∫_0^t ∫_u^t σ² e^{−β(v−u)} dv du    (3.8.7)

Figure 3.7 Region of integration for Example 3.8.

Since there is symmetry in the two integrals of Eq. (3.8.7), we can simply evaluate the first one and multiply by 2. The mean-square value of x is then

E[x²(t)] = 2 ∫_0^t ∫_0^u σ² e^{−β(u−v)} dv du
         = 2σ² ∫_0^t e^{−βu} [∫_0^u e^{βv} dv] du
         = (2σ²/β) ∫_0^t (1 − e^{−βu}) du
         = (2σ²/β²)(βt + e^{−βt} − 1)    (3.8.8)

Note that E[x²(t)] increases without bound as t → ∞. This might be expected because an integrator is an unstable system.

EXAMPLE 3.9
As our final example, we find the autocorrelation function of the output of a simple integrator driven by unity-amplitude Gaussian white noise. The transfer function and input autocorrelation function are

G(s) = 1/s,    R_f(τ) = δ(τ)

We obtain R_x(t_1, t_2) from Eq. (3.8.3), but some care must be taken with the Dirac delta function δ(u − v + t_2 − t_1), because the point at which its argument is zero may or may not lie inside the region of integration. This is shown in Fig. 3.8. Taking first the case where t_2 ≥ t_1, the delta function is intercepted for every u in the interval (0, t_1), so that

R_x(t_1, t_2) = ∫_0^{t_1} ∫_0^{t_2} δ(u − v + t_2 − t_1) dv du = t_1,    t_2 ≥ t_1    (3.8.9)

Figure 3.8 Region of integration for Example 3.9.

Similarly, when t_2 < t_1,

R_x(t_1, t_2) = t_2

The final result is then

R_x(t_1, t_2) = t_1 for t_1 ≤ t_2, and t_2 for t_2 ≤ t_1    (3.8.10)

Note that this is the same result obtained for the Wiener process in Chapter 2.

In concluding this section, we might comment that if the transient response includes both forced and initial-condition components, the total response is just the superposition of the two. The mean-square value must be evaluated with care, though, because the total mean-square value is the sum of the two only when the crosscorrelation is zero. If the crosscorrelation between the two responses is not zero, it must be properly accounted for in computing the mean-square value.

3.9 DISCRETE-TIME PROCESS MODELS AND ANALYSIS

Our emphasis thus far has been on continuous-time random signals and the associated response of linear systems to such signals. There is a discrete-time counterpart to all of this, and we shall now take a brief look at discrete-time random processes. In order to keep the discussion as brief as possible, we will confine our attention to single-input, single-output constant-parameter systems. Then later, in Chapter 5, we will consider multiple input-output systems.

It was mentioned in Section 3.6 that a continuous-time process with a rational spectral density can be thought of as the result of passing white noise through a linear filter. The filter, in turn, specifies an input-output differential equation relationship between the response and the input white noise.
This equation has the general form

(D^n + a_{n−1}D^{n−1} + a_{n−2}D^{n−2} + ··· + a_0) x(t) = (b_m D^m + b_{m−1}D^{m−1} + ··· + b_0) u(t),   m < n    (3.9.1)

where D is the derivative operator and u(t) is unity white noise. (The scale factor on the input is absorbed in the b coefficients.) The system transfer function G(s) is then

G(s) = (b_m s^m + b_{m−1}s^{m−1} + ··· + b_0) / (s^n + a_{n−1}s^{n−1} + ··· + a_0)    (3.9.2)

It should be apparent that the continuous-time process x(t) generated by Eq. (3.9.1) can be either stationary or nonstationary, depending on the stability characteristics of G(s) and the initial conditions. [We assume here that the u(t) process is initiated at t = 0.]

In the discrete-time world, the input-output relationship corresponding to Eq. (3.9.1) is a difference equation, and it has the general form

y(k + n) + a_{n−1}y(k + n − 1) + a_{n−2}y(k + n − 2) + ··· + a_0 y(k)
    = b_m w(k + m) + b_{m−1}w(k + m − 1) + ··· + b_0 w(k),   m ≤ n − 1    (3.9.3)

The corresponding quantities in the continuous and discrete models can be summarized as follows:

Continuous-time                                  Discrete-time
Continuous time t                                Integer index k, k = 0, 1, 2, ...
Response x(t)                                    Response y(k)
nth derivative                                   n units of advance
Input: continuous unity white noise u(t)         Input: discrete white sequence w(k) with unit variance

Note that we have intentionally used different symbols in the continuous and discrete models in order to emphasize that there need be no direct connection between the two models. The difference equation, Eq. (3.9.3), is called the ARMA model for the y(k) process; the left side of the equation is the AR part of the model (for autoregressive), and the right side is the MA part (for moving average).* Just as in the continuous problem, we can think of generating a sample response as the result of inputting a particular w(k) sequence (chosen by chance, of course) and then solving for the resulting y(k) sequence. We cannot write out an explicit solution, but we can conceptually think of the ARMA model as shaping the input white sequence into a corresponding colored sequence. Also, just as in the continuous case, we can generate either stationary or nonstationary processes, depending on the stability characteristics of the ARMA difference equation. z-transforms play the same role in difference equations that Laplace transforms do in differential equations, so they can be used in examining the stability of discrete-time systems. Two examples will illustrate this.

* We are being somewhat restrictive in our ARMA model in that k is allowed to increment only in the positive sense beginning at zero, and the order of the MA part of the model is less than the order of the AR part; the resulting transfer function for the model is then strictly proper. All of this is in anticipation of our particular use of the ARMA model in Chapter 5. See References 8 and 9 for more on the ARMA model.

EXAMPLE 3.10
Consider the simple first-order ARMA model

y(k + 1) = y(k) + w(k),   k = 0, 1, 2, ...    (3.9.4)

Note that the MA part of the model is trivial in this case because there is only the w(k) term. We inquire as to the stability of the y(k) process. All we need to do is look at the transfer function of the system in the z-domain. Toward this end we take the z-transform of both sides of Eq. (3.9.4) and form the output-to-input ratio:

zY(z) − Y(z) = W(z)    (3.9.5)

Y(z)/W(z) = 1/(z − 1)    (3.9.6)

Immediately, we see that we have a pole on the unit circle at z = 1, so the system is unstable. In this situation we would expect the output y(k) to be nonstationary.
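As a quick numerical illustration of this unstable (nonstationary) behavior, a few lines of MATLAB along the following lines generate an ensemble of sample realizations of Eq. (3.9.4); the number of realizations and steps shown are arbitrary choices of ours.

M = 50;  N = 500;                          % 50 realizations, 500 steps each
w = randn(M, N);                           % unit-variance white Gaussian sequences
y = [zeros(M,1), cumsum(w, 2)];            % y(k+1) = y(k) + w(k), with y(0) = 0; one row per realization
plot(0:N, y(1:4,:)')                       % a few sample realizations drift without bound
msq = mean(y.^2);                          % ensemble average square at each step
plot(0:N, msq)                             % grows roughly linearly with k

The ensemble mean-square value is seen to grow with the step index k rather than settling to a steady value, which is the nonstationarity predicted by the pole at z = 1.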
This example, of course, isthe sampled form of a Wiener process, provided the initial (0) is zero and W(k) is white Gaussian sequence, We know that its variance increases linearly with the number of steps from the origin. Ed ACL oes ‘We will now look at slightly more complicated example where stability isnot ‘obvious at first glance. Let the ARMA model be Mk +2) ~ y+) + SVK) = Swe +1) + 25, k= 0,12, 697 ‘Again, we inquire about the stability of y(); and, just as inthe previous example, we can form the system transfer function in the z-domain by taking the 7 transform of both sides of Eq. (3.9.7) BY) ~ Me) + SH) = See) + 252) or 98) Solving for the roots of the denominator tells us that we have a pair of complex poles at ~.5 + j5. They are within the unit circle (and the zero at —.5 does not affect the stability), so we would expect the output to reach a stationary condition after the transient dies out. Furthermore, the pole locations also pro- Vide information about how fast this should occur. We will not pursue this further haere (see Example 5.6), a ‘We could go on and on exploiting the anslogy between continuous- and discrete-time models, but much of ths is better done in a state-space setting with vector models. Thus, further discussion of this will be deferred until Chap- ter 5, One further comment is in order here before closing, though. We have 830 SUMMARY 147 referred to our discrete models heré as discrete-time models, but time may not actually be important in many physical applications. For example, if one looks ata sequence of rolls of a pair of dice, its only the sum ofthe dots that matters ‘in many games, and the result of such an experiment is simply a sequence of discrete events. There is nothing “in between,” s0 to speak, and time is not af the essence! On the other hand, there are many situations where the discrete- time sequence arises as a result of sampling a continuous-time process, and the sampling interval and other time-telated matters are quite important. Both of these physical situations fit into the mathematical framework of what we have called discrete-time systems, and we append the word time just as a reminder that the soquences being considered are diserete in their argument and not in the random variable space. ‘SUMMARY ‘The mean-square value and spectral density function (or autocorrelation func- tion) ae the output process descriptors of prime interest. This is partially @ ‘matter of mathematical convenience, because they are usually the only descrip- tors that can be readily computed. If only the stationary solution is desired, the ‘output spectral density is readily computed from Eq. (3.2.1) or (3.2.2). The mean-square value may then be obtained from the integral ofthe spectral func- tion. Integral tables are available to assist inthis task, provided the transfer and ‘input spectral functions are both in rational form, In transient problems, the portion of the system response due to random initial conditions is computed using standard deterministic methods. The result- ing response is always a deterministic random process. The autocorrelation fune- tion of the driven response may be computed from Eq. (38.3). In many cases, the computation is quite involved; therefore, only the mean-square value is cor puted. This computation is relatively simple and is given by Eq. (3.84). The ‘otal response in the transient problem is, of course, the superpanition of both the initial-condition and driven parts. 
Linear multiple-input, multiple-output problems were not discussed in this chapter. Complicated problems of this type are best handled by state-variable ‘methods, and discussion of such systems will be deferred to Chapter 5. Simple ‘multiple-input, multiple-output problems are sometimes manageable using scalar ‘methods, and in these cases one simply uses superposition in computing the various responses. One must, of course, be careful in computing spectral fanc- tions and mean-square values to account properly for any nontrivial crosscor- relations that may exis. (One further comment abovt process models for physical systems isin order. ‘The models, be they continuous or discrete, must come from somewhere, of course. Most often they are derived fom experimental data, but occasionally they come from purely theoretical considerations (perhaps backed up by exper- ‘imental evidence). Analysis of experimental data is a separate subject in its own right, and much has been written about it (8, 9, 10). Usually, the end result of148, CHAPTER “RESPONSE OF LINEAR SYSTEMS TO RANDOM INPUTS the model formulation is either a spectral description of the arocess (or equiv- alently, its correlation structure) or a discrete time-series model, that is, an ‘ARMA model. Often the experimentally derived models are themselves uncer- tain because of limited data or other experimental vagaries. Be that as it may, reasonable process models must be assumed before the engincer/analyst can go forward and proceed with the analysis and system design. We are primarily cancemed here with the second half of the problem, that is the analysis and Gesign part. Thus, we will assume in subsequent chapters that somehow or other reasonable process models have been provided, and we simply pick up the prob- Jem at that point and proceed to look at analysis and desigr methods that are ‘based on minimizing the mean-square error. PROBLEMS 3.1 In section 3.1 the equation relating the input and output spectral densities {Sci0) = [GGe}PS(jo)] vas jostiied with heuristic arguments. This can be formalized by procesding tough the following steps: a) Write the oor 2() convolution integral using Forier rather than Laplace transfor (©) Do iikewise forthe sifted output x + 2). {© Mutiny the expressions for x() and x(¢ + 2) and (symbolically) form the expectation ofthe product. (4) Now note thatthe Foutier transform of the autoconelation function is the spectral density; transform both sides, interchange the order of in- tegration and the desire result i apparent. Proceed throug the steps just described and formally juiy the S,(ja) = |G(j0)?5;(jo) fort 32 Find the steady-state mean-square value of the output for the following fiters. The input is white noise with a spectral density amplitude A © 60) = OER ae © 60 = saya set © OO - eae 33 A white-noise proces having a spectral density amplitade of A is appli io the cituit shown, The cxcuit has been in operation fra long ime. Find the ‘mean-square valve of the output voltage (tem ouput Problem 3.3 Promems 149 34 The input to the feedback system shown is a stationary Markov process ‘with an autocortelation function RO ‘The system is in stationary condition. (@) What is the spectral density function of the output? (b) What is the mean-square value of the output? haw —"G {7} om orent Probl 3 35. Consider the nonminimum phase filter i=1s Ts driven witha stationary Gauss-Markov process with an autocorrelation function RG) = oe", Find (@) The spectral density function of the output. 
(@) The mean-square value of the oupot. 346. Find the steady-state mean-square valu ofthe output fora fist-order low pas filer fie, G(s) = T/C + T9)] i the input has en autocorelation function of the form a) or ~ Bind. RG) = [ine: The input spectral function is irational so the integrals given in Table 3.1 are of no help here. One approach is to write the integral expression for F(X?) in toms of real w rather than s and then use conventional integral tables. ‘Also, those familiar with residue theory will find thatthe integral can be eval- uated by the method of residues.) 37-Thermal noise in 2 metallic resistor is sometimes modeled as a white-noise voltage source in series with the resistance R of the resistor (11, 12). This is shown in the figure on the next page, along with the parameters describing the spectral amplitude of the noise source. At room temperature the flat spectrum approximation is reasonably accurate from zero frequency out to the infrared range. Clearly inthe idealized model of part (a), the voltage from a to b would be infinity, which is physically impossible. The model is sil useful, though, because there is always some shunt capacitance associated with the load con- nected from a to b. I nothing els, the parasitic capacitance ofthe resistor leads is suficent to cause the spectral function to “roll off” at 20 db/decade and thus180 GUPTER2 RESPONSE OF LINEAR SYSTEMS TO RANDOM INPUTS. ‘cause the output to be bounded. This is shown in part (b) ofthe figure, Now, to ‘demonstrate the validity of the white-noise model, consider an example where R= 1% 10° Mand C= 1 x 10°" F (@ plausible value for parasitic capacitance). ‘Also, assume the temperature is room temperature, about 290 K. (@) Find the rms voltage across the capacitor C. () Find the hall;power frequency in hertz. (Defined as the frequency at which the spectral function is half its value at zero frequency.) {€) Based on the result of part (b), would you think it reasonable to consider this noise source as being fat (i.., pure white) in most electronic-circuit applications? Explain briefly. 1am conan 128107 Glee) 1 emparr eesKle) Problem 87 38. Consider a finear filter whose weighting function is shown in the figure. (This filter is sometimes refered to a5 a finite-time integrator.) The input to the filter is white noise with a spectral density amplitude A, andthe filter has been in operation a long time. What is the mean-square value of the output? 0 Problem 38. fie wah tncton 39 Find the shaping filter hat will shape unity white noise into noise with a ‘spectral fonction wed P48 FTE 340. A series resonant circuit is shown in the figure. Let the resistance R be ‘mall such thatthe circuit is sharply tuned (ie., high O or very low damping tato), Find the noise equivalent bandwidth for this circuit and express it in terms of the damping ratio {and the natural undamped resonant frequency w, (Le. wo, = 1/VLO), Note that the “ideal” response in this case is a unity-gain rec- {angular pass band ceatered about «Also find the usual half-power bandwidth and compare this with the noise equivalent bandwidth. 
(Half-power bandwidth Ste) = Prostews 181 is dofined 10 be the frequency difference between the two points onthe response curve that are “down by 8 factor of 1/2 from the peak vale eis usta f appronimat the resonance curve es being symmetie aout the peak fr this part ‘tthe problem) oe 341 The ansfer functions and corresponding bandpass characteristics for first, second-, and third-order Butterworth filters are shown in the figure below. ‘These filters are said to be “maximally flat" at zeo frequency with each suc: cessive higher-order filter more nearly approgching the ideal curve than the pre vious one. All thee filers have been normalized such tha all responses intersect the ~3-db point at 1 rad/sec (or 1/277 Ha. (@) Find the noise equivalent bandwidth for each of the filters () Insofar as noise suppression is concerned, is there much fo be gained by using anything higher-order than a third-order Butterworth filter? 3.12. Find the mean-square value ofthe output (averaged in an ensemble sense) {or the following transfer functions. In both eases, the intial conditions are zer0 and the input 4() is applied at ¢ = 0 1 @ Ge) Ry) = Aa) Fre Bin Aan ©) Ge = second onde) ” eos Coat ora Problem 211) ocponcos of ree Sutewort as, 1) Tans ncn of Batra hes,152 CHAPTER RESPONSE OF UNEAR SYSTEMS TO RANDOM INPUTS, 33. A certain linear system is known to satisfy the following differential ‘equation: 20) = x0) = 0 where 2(0) is the response and f() is the input that is applied at ¢ = 0. If £0 is white noise with spectral density amplitude A, what is the mean-square value of the response x(0)? 3414 Consider a simple firstorder low-pass filter whose transfer function is 1 reg ‘The input to the filter is initiated at ¢ = 0, and the filter's inital condition is zero, The input is given by J = Ault + ne) where (d= unit step function ‘A = random variable with uniform distribution fom 0 to 1 n(f) = unity Gaussian white noise Find: (@) The mean, mean square, and variance of the output evaluated at see (©) Repeat (a) forthe steady-state condition (ie, for = 9. (Hint: Since the system is linear, superposition may be used in computing the ‘output. Note thatthe deterministic component of the input is written explicitly in functional form. Therefore, deterministic methods may be used to compute the portion of the output duc fo this component. Also remember thatthe mean- quite value and the variance are not the same if the mean is nonzero.) SAAS A signal is known to have the following form: 3) = ay + nto) where ay is an unknown constant and n() isa stationary noise process with a known autocorrelation function Rye oer It is suggested that ay can be estimated by simply averaging s(t) over a finite interval of ime 7. What would be the rms error in the determination of ay by this method? (Wore: The root mean square rather than mean square value is requested in this problem.) 316 In the figure shown, /(Q is a time stationary random process whose ‘autocorrelation funetion is Ra) = ater PROBLEMS 153 ‘The input f( is first multiplied by e°™ and then integrated, beginning at ¢ = 0. ‘The initial value of the integrator is zero. Find the mean-square value of x(). Probl 216 3.17. Consider an integrator whose initial outpat at ¢ = 0 is a Gaussian random. variable with zero mean and variance 0. A Gaussian white-noise input with spectral amplitude A is applied at 1 = 0. What is the mean-square value of the ‘ulput as a funetion of time? 348. 
Unity Gaussian white noise J(0 is applied to the cascaded combination of integrators shown in the figure. The switch is closed at 1 = 0. The initial condition for the first integrator is zero, andthe second integrator has two units as its initial value, (2) What is the mean-square value of the output at r= 2 sec? () Sketch the probability density function forthe output evaluated at ¢— 2 sec steree) 3.9 Consider the random process defined by the transfer function shown in the figure. The input f(2) is Gaussian white noise with unity spectral amplitude, and the process is stated at r = 0 with zero inital conditions. The autocorte- lation function ofthe output x() is defined as Rut) = ELA, and 1, > 0 Find Ratt, 0) wo Problem 219 3.20 Consider again the filter with a rectangular weighting function discussed in Problem 3.8. Consider the filter to be driven with unity Gaussian white noise, Which is initiated at ¢ = O with zero initial condition. (a) Find the mean-square response in the interval from 0 to-7, () Find the mean-square response for 1 = T and compare the result with that obtained in Problem 3.8. (©) From the result of (b), would you say the filter's “memory” is finite or infinite? 321 The block diagram on the next page describes the error propagation in fone channel of an inertial navigation system with external-velocity-teference damping (14). The inputs shown as f,() and f,(t) are random driving functions154. CHMPTER 3 RESPONSE OF LINEAR SYSTEMS TO RANDOM INPUTS, die tothe accelerometer and external-velocty-reference instrument errors. These will be assumed to be independent white-noise processes with spectral ampli tudes A, and Ap, respectively. The outputs are labeled x, x, and x and these physically represent the inertial system's platform tlt, veloety error, and position feor, Find the steady-state mean-square value of each of the outputs. (Hine Since the system is liner, use superposition and note that the driving functions are independent.) n0 Epmitlnd cait 5 ne Problem 3.24 3.22 Random processes x() and y(f) are generated by passing unity Gaussian ‘white noise through parallel ters as shown below. (@) Find the autocorrelation functions for each of x(2) and x). You may ‘assume that a staionary condition exists. (@) Find the crosseorrelation function R,,(7). (Hint: Waite x and y as convolution integrals. Then form the appropriate product land take the expectation, Finally, let ¢— cto get the stationary condition.) lt aaa sin . Problem 9.22 323 The block diagram in the accompanying figure shows a means of deter- ‘mining the weighting function of a linear system (and thus, indirectly, the sys- tem’s transfer function). The basic idea is to superimpose a small ammount of ‘white noise on the regular input and then crosscorrlate the resulting output with the intentionally added white noise. If the amplitude of the additive noise is relatively small, it is scarcely noticeable, if at all, and this provides a means of Continuously monitoring the transfer characteristics of the system without dis tubing its normal operation, Show that che output of the crosscorrlator R=) is, in fact, proportional to the system weighting function (7). [If you need help ‘on this one, see Truxal (15), p. 437, This idea has “been around’ for a long PROBLEMS 185 time but has seen only limited application, The reason, of course, lies in the computational effort in determining the crosscorrelation (1) or its counterpart in the frequency domtin, the cross-spectral deasity function. 
Either way, con siderable computational effort is involved because the amount of data to be Processed needs to be fairly large owing to the presence ofa large signal com: PPonent as well as the random noise. See Problem 2.29 for more on determining the cross-spectral density function,)} ea wip ote Problem 3.23 ee 3.24, Part (a) of the accompanying figure shows a functional block diagram of ‘an elementary phase-lock loop. Such circuits are used in a wide variety of com- ‘munication applications (12, 16). Briefly, the phase comparator provides a signal proportional to the difference between the phase of the incoming rf signal and that of the local oscillator. The phase difference is modified by gain factor K, and is then fed back as a voltage tothe local voltage-controlled oscillator (VCO). This voltage causes the frequency of the VCO to shift up oF down as necessary to lock onto the phase of the incoming rf signal. Part (b) of the figure shows the linearized block diagram for a phase-lock Joop with the addition of @ phase noise input labeled n(@). Ifthe incoming :f signal is frequency-modulated (FM), itis the derivative of (0) that is propor tional to the baseband signal, and thus (1) is the signal to be recovered (de- tected) in this case, 3} be Sloe ” » Problem 3.24 (@) Show that the transfer function relating #(0) to the output y(t) is the equivalent of a differentiator in cascade with a first-order, low-pass ‘ite. () Consider the phase noise n(?) to be flat with a spectral amplitude Ny ‘Also assume that the gain parameters ofthe phase-lock loop are ‘such thatthe baseband signal is passed with no distortion, that Toop respoase is lat over the signal frequency range, say, from 0 to W Hz. Find the spectral function of the output noise in the 0 10 W Ha4155. civeTE. RESPONSE OF UNEAR SYSTEMS TO RANDOM INPUTS range, and find its average power (Le, mean-square value) inthis fre- (quency range. (It may help here to think of the noist as being limited to the O to W Hiz range by a sharp-cutoft postdetector filter.) {325 For two stationary random processes x(0) and y(, the coherence function is defined to be see Chapter 2) Is,GaP . 5G) U0) a Consider special case where 30 isthe inp to liner system, and (i the uiput as shown in the accompanying figure. (@) First show that (0) = Sy(de) = GaSe) (3.25.2) () Then show thatthe coherence fusction is unity for all (This shows that x and y do aot have to be equal in order to have unity coherence. ‘They only need to be intimately related, a, of cours, they are inthis situation) cn er Problem 2.25 B) 326 The recursion equstion for generating a Markov process from problem * 253 is repeated here for convenience: wen = MG Wye = 0,12, ww, = white sequence ~ M0, (1 variance of the Markov process {= reciprocal time constant of the Markov process {t= time interval between samples (4) This willbe recognized asa simple firstorder ARMS model forthe process. Using the same parameter values as in Froblem 2.33 (ie. 7 = i, B= 1, At = 05 see), verify analytically that this model ‘encraics a stable process, This is easily done by locating the system characteristic pole in the z plane, What is the system time constant? [Express this in terms of both seconds and number of discrete steps. (b) In onder to demonstrate the stability of the process of part (a), generate four sample relizations ofthe process for k = 0, 1,2, ...» 500. 
The ‘tansient phenomenon that takes place and the evolution into a staion- ary condition wil be more obvious if you initialize each realization 10 be zero at k = 0. View the plots of the four sample realizations. (The overall trend toward stationary behavior is more obvious when the four plots are superimposed on the same graph.) (© Ifwe let e™ = 1 in the above recursion equation, the ARMA model becomes ti] ig PROBLEMS 187 ‘This model was discussed in Example 3.10, and it was noted to be unstable. To demonstrate this, generate four sample realizations ofthis process for k = 0, 1, 2, .. $00, and let the initial value of x, be zero just as in part (6). Also, in ‘order to make the inital rate of growth of x, similar (statistically) to that of pare (), let the variance of w, be .095163. View the plots of the four realizations and note the instability 3.27 When it is difficult to integrate the power spectral density function in ‘losed form, one should not overlook numerical integration. This is often the ‘quickest way to get an answer to a specific numerical problem. In Problem 3.5, let? 1, and repeat the problem using MATLAB's quad or quad ‘numerical integration programs. Compare your result with the exact value of (Wote: Beware of trying to integrate numerically through the origin where there is the indeterminate form 0/0. This can be avoided by staying. an incremental distance away from the origin in the integration, Then approximate the omitted pert as e narrow rectangular strip.) 328 One of the pseudorandom noise (PN) signals transmitted by the GPS satellites has a power spectral density that is approximated by the function “@y @) ‘where ois the mean-square value of the signal, and Tis the chip width of the spread spectrum signal. (See Chapter 11 for a brief discussion of the GPS sat- ellite navigation system.) (@) Find the fraction of the total power that is contained in the primary lobe of the spectrum (ie, for ~2n/T < w < 2m), () Find the fraction ofthe total power that i contained in both the primary and first side lobes of the signal (ie, for ~4n/T < w < 4/7) Sw) = 0° 3.28) (Hint: You wll find MATLAB's numerical integration programs quad or quad8 useful for this problem. Also, the same precaution about integrating an indeter ‘minate form that was mentioned in Problem 3.27 is applicable here.) 3.29 In the nonstationary mean-square-respoase problem, it is worth noting thatthe double integral in Eq, (3.8.4) reduces to a single integral when the input is white noise. That is, if R,(2) = A&K®), then EO] ‘where g(v) is the inverse Laplace transform ofthe transfer function Gis), and A is the power spectral density (PSD) of the white-noise input. Evaluation of integrals analytically can be laborious (if not impossible), so ‘one should not overlook the possibility of using numerical integration when g2(0) sfewa ]4158 CHAPTERS. RESPONSE OF LINEAR SYSTEMS TO RANDOM INPUTS is either rather complicated in form, or when itis only available numerically ‘To demonstrate the effectiveness of the aumerical approach, consider the follow ing system driven by white noise whose PSD = 10 units: Gs) = 08 % ere Let us say that we want to find EL2(0)] for 0 = ¢ = 1 with a sample spacing ‘of O01 sec (ie. 101 samples including end points) Using MATLAB's numerical integration function quad8 or quad (or other suitable software), find the desired ‘mean-square response numerically. Then plot the result along with the exact theoretical response for comparison [see part (0) of Problem 3.12]. 
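To make the numerical approach of Problem 3.29 concrete, a brief MATLAB sketch along the following lines may help. The transfer function used here, G(s) = 1/(1 + s), so that g(v) = e^{−v}, is only a stand-in of our own choosing and is not the G(s) specified in the problem; A = 10 as stated. quad is shown, but quad8 is called in the same way.

A  = 10;                                   % power spectral density of the white-noise input
g2 = @(v) exp(-2*v);                       % square of the impulse response for the stand-in G(s) = 1/(1+s)
t  = 0:0.01:1;
msq = zeros(size(t));
for i = 2:length(t)
   msq(i) = A*quad(g2, 0, t(i));           % E[x^2(t)] = A * integral of g^2(v) dv from 0 to t
end
plot(t, msq)                               % compare with the exact (A/2)*(1 - exp(-2*t)) for this stand-in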
REFERENCES CITED IN CHAPTER 3 1. J.J. D'Aeao and C,H. Houpis, Linear Contot System Analysis and Design, 4th ef, [New York: MeGraw-Hil, 1895. 2. RC. Dorf and R. H. Bishop, Modem Control Systems, Th ei, Reading, MA: ‘adison- Wesley, 1995, 3. Rel. Smith and R. Dorf Cireuts Devices and Systems, Sth ed, New York: Wiley, 1992. 4, HM. James, N. B, Nichols, and R.S. Philips, Theory of Servomechanisms, Radi- stion Laboratory Series (Wol 25), New York: MeGraw-Hil, 1947, 5. GC. Newton, L. A. Gould and J.P. Ksser, Analytical Design of Linear Feedback Contos, New York: Wiley, 1957 6.GR Cooper and C, D. McGillem, Probabilitic Methods of Signal and System ‘Analysis, od ed, New York: Hol, Rinehart, and Winston, 1986, 7. K_S. Shanmugan and A. M, Breipohl, Random Signals: Detection, Estimation, and Data Analysis, New Yori: Wiley, 1988 8, MB, Presley, Spectral Analysis and Time Series, New York: Academic Press, 1981, 9, J. M. Mendel, OptinalSeiomie Deconvolution, New York: Academic Press, 1983 10, 5.8. Bendat and A.C, Piersa, Random Data: Analysis and Measurement Procedure, [New Yorke Wiley-intrscenee, 1971 11, A.B. Cation, Communication Systems, 2nd ed, New York: MeGraw Hil, 1975. 12. KS. Shanmogan, Digital and Analog Communication Systems, New York: Wiley, 179. 13. D. Childers and A, Dusting, Digital Fitering and Signal Processing, St. Paul, MN West Publishing, 1975, 14, G.R, Pitman (e4), Inertial Guidance, New York: Wiley, 1982 15. J. G, Traxal, Automatic Feedback Control Systm Syrthesis, New York: McGraw: il, 1955. 16, FM, Gardner, Phaselock Techniques, 2nd ed, New York; Wiley, 1979. Additional References for General Reading 17, P.2, Peebles, Ir, Probability, Randont Variables, and Random Signal Principles, xd fad, New Yor: McGraw-Hill, 1993 18, HJ, Larson and B, 0. Shubert, Probabilistic Models in Engineering Sciences (Vols {Vand 2}, New York: Wiley, 1979. Wiener Filtering In this and subsequent chapters we will consider a particular branch of filter theory that is sometimes referred to as least-squares filtering. Actually this is an oversimplification because it isthe average squared error that is minimized ‘and not just the squared error. “Linear minimum mean-square ecror filtering” is ‘a more descriptive name for this type of filtering. This isa bit wordy, though, 50 the name is often shortened to MMSE filtering (for minimum mean-square error. Simply stated, the linear MMSE filter problem is this: Given the spectral characteristics of an additive combination of signal and noise, what linear op- eration on this input combination will yield the best separation of the signal from the noise? “Best” in this ease means minimum mean-square erro. This branch of filtering began with N. Wiener’s work in the 1940s (1). RE, Kalman then made an imporant contribution in the early 1960s by providing an alter- native approach to the same problem using state-space methods (2, 3). Kalman’s contribution has been especially significant in applied work, because his solu- tion is readily implemented in time-varisble, multiple-input/multiple-output applications. ‘We will consider the Wiener and Kalman theories in their historical order. Ik should be mentioned that neither is prerequisite material forthe other; there- foe they may bested in ether order, or oe 10 the excasion ofthe oe, 44 THE WIENER FILTER PROBLEM ‘The purpose of any filter is to separate one thing from another. 
In the electric filter case, this usually refers to passing signals in a specified frequency range and rejecting those outside that range; and, historically, filter theory began with the problem of designing a circuit to yield the desired frequency response. This is still an important problem. In many applications in communication and control, one knows intuitively what the ideal frequency response should be. For example, if we want to receive the signal from a particular AM radio station (and do it faithfully), we know that the appropriate filter is one that passes all frequencies within a few kilohertz on either side of the assigned station frequency and rejects all others. Certainly, no elaborate theory is needed to determine the desired frequency response in this case. The problem is simply one of circuit design. We will see, though, that this is not always the case. During World War II, Norbert Wiener considered a different sort of filter problem (1). Suppose the signal, as well as the noise, is noiselike in character, and suppose further that there is a significant overlap in the spectra of both the signal and noise. For example, say the signal is a Gauss-Markov process and the corrupting noise is white noise. Their spectral densities are shown in Fig. 4.1. Now, in this case, it should be apparent that no filter is going to yield perfect separation, and the filter that gives the best compromise of passing the signal and, at the same time, suppressing most of the noise is not at all obvious. Neither is it obvious how one should define "best compromise" in order to make the problem mathematically tractable. This is the problem Wiener examined in the 1940s. We note that he was not concerned with filter design in the sense of choosing appropriate resistors, capacitors, and so forth. Instead, his problem was more fundamental, namely, what should the filter's frequency response be in order to give the best possible separation of signal from noise?

The theory that is now loosely referred to as Wiener filter theory is characterized by:

1. The assumption that both signal and noise are random processes with known spectral characteristics or, equivalently, known auto- and crosscorrelation functions.
2. The criterion for best performance is minimum mean-square error. This is partially to make the problem mathematically tractable, but it is also a good physical criterion in many applications.
3. A solution based on scalar methods that leads to the optimal filter weighting function (or transfer function in the stationary case).

We now proceed to the filter optimization problem.

Figure 4.1 Spectral densities of signal and noise.

4.2 OPTIMIZATION WITH RESPECT TO A PARAMETER

We begin with the simplest optimization situation: the functional form of the filter is fixed, and it contains a single parameter that is free to vary. Referring to Fig. 4.2, the filter input is an additive combination of signal s(t) and noise n(t), and the Laplace transform of the filter output is

X(s) = G(s)[S(s) + N(s)]   (4.2.1)

Figure 4.2 Filter optimization problem.

We define the filter error as the difference between the actual output and what we would like it to be ideally, that is, the signal. Therefore let*

e(t) = s(t) - x(t)   (4.2.2)

and

E(s) = S(s) - X(s)   (4.2.3)

Substituting Eq. (4.2.1) into Eq. (4.2.3) yields

E(s) = S(s) - G(s)[S(s) + N(s)] = [1 - G(s)]S(s) - G(s)N(s)   (4.2.4)

It can be seen that the error can be thought of as a superposition of two components, one due to the signal modified by the transfer function [1 - G(s)] and another due to the noise modified by -G(s).
* Where there might be confusion between the signal variable s and the complex variable s, we will show the time dependence explicitly, that is, s(t). Similarly, S(s) denotes the Laplace transform of s(t); the intended meaning should be clear from the context.
** Some authors prefer to define the error with the opposite sign. In the subsequent optimization we will always be concerned with minimizing the mean-square error, so the sign of the error is of no consequence; the resulting optimal filter and mean-square error are the same either way.

If the signal and noise have zero crosscorrelation, the mean-square error is obtained as simply the sum of two terms, that is,

E(e^2) = \frac{1}{2\pi j}\int_{-j\infty}^{j\infty} [1 - G(s)][1 - G(-s)]\, S_s(s)\, ds + \frac{1}{2\pi j}\int_{-j\infty}^{j\infty} G(s)G(-s)\, S_n(s)\, ds   (4.2.5)

We now have an explicit expression for the mean-square error in terms of the spectral functions of s(t) and n(t) (presumably known) and the filter transfer function. If G(s) contains a parameter free to vary, we can now use ordinary differential calculus to minimize E(e^2) with respect to the parameter. We now proceed with an example.

EXAMPLE 4.1

Consider the Gauss-Markov signal and white-noise situation shown in Fig. 4.1. It is apparent that some sort of low-pass filter is needed to separate signal from noise. Let us try a simple first-order filter of the form

G(s) = \frac{1}{1 + Ts}

We have now specified the functional form for G(s), and hence we are ready to use Eq. (4.2.5). The needed quantities are

G(-s) = \frac{1}{1 - Ts}, \qquad S_s(s) = \frac{\sqrt{2\sigma^2\beta}}{s + \beta}\cdot\frac{\sqrt{2\sigma^2\beta}}{-s + \beta}, \qquad S_n(s) = A = \sqrt{A}\cdot\sqrt{A}

Substituting the above quantities into Eq. (4.2.5) and evaluating E(e^2) using the integral tables of Section 3.4 yield

E(e^2) = \frac{\sigma^2\beta T}{1 + \beta T} + \frac{A}{2T}   (4.2.6)

This can now be minimized with respect to T using differential calculus. The result is that E(e^2) is a minimum for

T = \frac{\sqrt{A}}{\sigma\sqrt{2\beta} - \beta\sqrt{A}}

It is interesting to note that this will yield a positive value of T only for certain values of the parameters. A negative solution for T simply means that no relative minimum exists within the interval from zero to infinity. It should be remembered that the minimum obtained is not the absolute minimum possible (unless by coincidence), because the form of the filter transfer function was chosen intuitively. Other functional forms might have done better.

4.3 THE STATIONARY OPTIMIZATION PROBLEM—WEIGHTING FUNCTION APPROACH*

Figure 4.3 Wiener filter problem.

We now consider the filter optimization problem that Wiener first solved in the 1940s (1). Referring to Fig. 4.3, we assume the following:

1. The filter input is an additive combination of signal and noise, both of which are covariance stationary with known auto- and crosscorrelation functions (or corresponding spectral functions).
2. The filter is linear and not time-varying. No further assumption is made as to its form.
3. The output is covariance stationary. (A long time has elapsed since any switching operation.)
4. The performance criterion is minimum mean-square error, where the error is defined as e(t) = s(t + α) - x(t).

In addition to the generalization relative to the form of the filter transfer function, we are also generalizing by saying the ideal filter output is to be s(t + α) rather than just s(t). The following terminology has evolved relative to the choice of the α parameter:

1. α positive: This is called the prediction problem. (The filter is trying to predict the signal value α units ahead of the present time t.)
2. α = 0: This is called the filter problem. (The usual problem we have considered before.)
‘Two-sided Laplace transform theory is wed extensively in hs sston See Append & For brit164 CHAPTER «WIENER FLTERING 5, a negative: This is called the smoothing problem. (The filter is trying to estimate the signal value cr units inthe past.) ‘This is an important generalization and there are numerous physical applications ‘corresponding to all three cases. The a parameter is chosen to fit the particular application at hand, and it is fixed in the optimization proces. ‘We begin by defining the filter error as el) = s(¢ + a) ~ 20) aap ‘The squared error is then eo A + a) ~ De + att) + 0. 432) We nes wit aa cnoon a [seater =a + n= ae 33 ‘is an be soba nto Hy, (4:2) ad both ses rege ye [So rae, =o a = 2 fe ARaale +o du + RO) 434 R, autocorrelation function of 5(?) autocorrelation function of s() + (0) crosscorrelation between s() + n() and s(t) [Note that if signal and noise have zero crosscortelation, R, R+R, Rane ‘We wish to find the funetion g(u) in Eq. (4.3.4) that minimizes E(e). This will be recognized as a problem in calculus of variations (5). Following the usual procedure, we replace g(u) with a perturbed weighting function gu) + en(u) + We hive chosen here owt the mean square erin tes of he ie weeing fon and nyu stocaretaton fran. Inthe sonar probe on ean as wate Ce) in ems of the ‘hes ante func and input spel fncine, and then proce wih be option on at Basis (67). Weave chosen te time-domain approach bacaie ical generalized 0 the 08- toany problem hts cosdeed in Seton 4 The frqueny dora spoach 1 ot ey eerie, 443. THE STATIONASY OPTMIZATION PROBLEM WEIGHTING FUNCTION APPROACH 165 where 2) = optimum weighting function [Note: From this point on in the solution, g(x) will denote the optimal weighting function] fw) = an arbitrary perturbing function ‘e = small perturbation factor such that the perturbed function ap- proaches the optimum one as ¢ goes to zero “The optimum and perturbed weighting function are sketched in Fig. 4.4, Re- placing (a) with gu) + en() ia Eq (38) then lead to Be) = ff tate) + enteige) + enterlR,.u ~ 0) du do ~ 2 tate) + enedTRonde + w) du + 8,0) 436) Note that £(e) is now a fonction of e, and itis to be @ minimum when ¢ = 0. Now, using differenti ealeulus methods, we differentiate E(e!) with respect 10 ‘and set the result equal to zero fore = 0. After interchanging dummy variables of integration freely, the result is io [rR tat 4a sR, = 2) au] ar ‘A subilety in the solution arises at this point; therefore, i is convenient to look atthe causal and noncausal cases separately Noncausal Solution If we put no constraint on the filter weighting function, we will very likely ‘obtain 2 g(u) that is nontrivial for negative as well as positive w. This weighting function is noncausal because it requires the filter to “look ahead” of real time and use data that are not yet available. This is, of course, not possible if the filter is operating on-line. However, in off-line applications, such as postflight analysis of recorded data, the noneausal solution is possible and very much of interest. Thus, it should not be ignored. gins Figure 44 Optima and partbed weighing functors,166 CHAPTER 4 WIENER FLTERING If there ate no restrictions on g(u), then, similarly, there are no constrains ‘on the perturbation function (7). It is arbitrary forall values ofits argument. ‘Thus if the integral with respect to + in Eq. (4.3.7 isto be zero, the bracketed term must be zero for all +. 
This leads to [rion dae tales merem a8 ‘This is an integral equation of the fist kind, and in this ease it ean be solved readily using Fourier transform methods, Since R,,, is symmetric, the term on the left side of Eq (43.8) has the exact form of a convolution integral. Therefore, transforming both sides yields GMS AB) * Sson ASD 439) = Sane” Gg) = Sei (43.10) Remember thatthe transforms indicated in Eq. (43.10) are nvo-sided transforms rather than the usual single-sided transforms. Of course, if we wish to find the ‘weighting function g(u), we simply take the inverse transform of the expression given by Eq. (43.10) ‘The filter mean-square error is given by Eq. (4:34). If fu) is the optimal weighting function satisfying Eq, (4.38), the mean-square error equation may be simplified as follows, First write the second term of Eq. (43.4) as the sum of two equal terms and combine one of these with the double integral term, After we rearrange terms, this leads t0 oe fis [mateo {cone} ae asin ‘The bracketed quantity in Bq, (43.11) is zero for optimal g() forall w, There- fore, the mean-square error is eee ce EXAMPLE 4.2 ee Consider the same Markov signal and white-noise combination used in Example 4.1, We wish to find the optimal noncausal filter (i., a = 0) In order to simplify the arithmetic, let ¢? = @ = A = 1. Since the signal and noise have zero exosscorrelaton, |43. THE STATONARY OPTIMIZATION PROBLEM-—WEIGHTING FUNCTION APPROACH 167 (43.13) (43.14) 43.5) From Bg, (4.3.10) we have [Expanding this with a partial fraction expansion yields Uva ua 4316) revi Tava ‘The posiive- and negative-time parts of g(u) are given by the fist and second terms of Eg. (4.3.16). Thus, au) is es 10 =f 3 ae u<0 no, ‘This is the optimal noncausal weighting function, and i is sketched in Fig. 4.5 along with the intuitive weighting function of Example 4.1 (evaluated for o* = a and 7'= 1 + V2), Note that since the noncausal filter weights both past and future input data, it can afford to have a smaller time constant than the Intuitive filter, which is allowed to weight only past input data Ls also of interest to compare the mean-square errors for the noncausal optimal filter and the parameter-optimnized filter of Example 4.1, ‘These may be computed from Eqs. (42.6) and (4.3.12) with the result 20 one tint) Flgure 45. ite weighing anne pts rence pararotaoptmaad causa Mar168 CHAPTER 4 WIENER FLTERING He?) (parameter-optimized) ~ 0.914 F(@) (noncausal optimal) ~ 0.577 Notice thatthe noneausal filter has significantly less error than the causal one. CCetainly, in offline applications it would be worthwhile implementing the non- causal filler in preference to the intuitive causal one a Causal Solution ‘The calculus of variations procedure led to Eq. (4.3.7), which is repeated here for convenient reference. fixe [Be so) +f AOR 9 a] Pa a7 Recall that (7 isan arbitrary perturbing function. If we wish to constrain the filter weighting function to be causal, we must place a simila- constraint on (2) inthe variation, Otherwise, we get the unconstrained (noncavsal) solution, Thus, for the causal case we require 7(2) to be zero for negative x and allo it to be abitrary for positive x. The bracketed quantity in Eq. (4.3.2) then needs to be ‘ero only for positive x. The zero criterion is satisfied for megative + by virtue fof our constraint on (7), that is, n(2) = 0 for 7 < 0. 
Therefore, the resulting integral equation is Fh sear stu = 91 du Rafe +) 720 (43:18) Equation (43.18) is known as the Wiener-Hopf equation, and the fact that it is valid only for + = 0 complicates the solution considerabiy. (One solution of Eq. (4.3.18) that is based on spectral factorization proceeds 4s follows, Fist, replace the right side with an unknown negative-time function ‘(7 fe, a(2) is unknown for negative time, but is known to be zero for positive time.} Equation (4.3.18) can be writen as [i stoR.tu- du Ret =a, ex r< = 43.19) anfoming bh sides of Eg 4.3.19) then yields G40 9) ~ Sil = A) 4329 Nast se spectral factorization on Sand group terms a ‘ollows [a(s)s?,.(s)} 8) ~ SyanSde"* = AG) 449. THE STATIONARY OPTMIZATION PROBLEN—WEIGHTING FUNCTION APPROACH 169 AG), SrrnlS G83.) = Bs + ES (4321) In Eq, (43.21), the "super +" indicates the factored pat ofthe spectra function its poles and zeros in the lft half-plane. Similarly, the poles and 1 af mirror images of those of We note here tha g(0) i a ‘table postve-time function; therefore G(s) will have its poles in the left hall= plane. Thos, G()8,() wil ave all s poles in the lft half-plane, an it wil be the transform of postvestime funtion. Similarly, A) i the transform of 2 nepative-time fietion, soi poles will bein the right half-plane. Also, both the zeros and poles of Ss) ar in the right half-plane. Hence, the three terms of Eq, (43.21) translate into words as [rs time | = [aces] . [et tive. “| negative-time function function function Equating positive-time parts on both sides of Eq, (43.21) then Teads to in) = ovine a of Ss” ‘The bracketed term of Eq, (4.3.22) can be interpreted as follows. Furst find the inverse transform of S,,94(3)/S;.(8). This will normally be nontrivial for both positive and nogative time. Next, translate the time function an amount a. (This ceounts for e*.) Finally, take the ordinary single-sided Laplace transform of the shifted time function, and this willbe the bracketed quantity in Eq, (4.3.22), “Two examples should be helpful at this point EXAMPLE 4.3 Consider the same Markov signal and white-noise combination used in Examples 4.1 and 42, Again we let o* = B to simplify the arithmetic and, in this example, we are looking for the optimal causal solution. Since the signal and noise are assumed to have zero crosscorrelation, the needed spectral func= tions are (43.23) (43.24) ‘Also, since the prediction time a is assumed to be zero,170 CHAPTER 4 WIENER FLTERING eel 4325) First, we factor S,, See" Sin = [ (4326) Next, we form the S,.,fS;q funeton Sesne a Sue aE Cat VIGFD coy eT ‘This, in tr, can be expanded in terms of a pari feaction expansion: Vi-1 vi- a “S (43.28) Sie ST ee VS Clearly, the frst term of Eq, (4.3.28) is the positive-time part. Therefore, G(s) as given by Eq, (43.22) is a 1 Bee eee ee ree OF rts ‘ Fran (rin terms ofthe flter weighing function, -{05-dev%, tz0 2 { er 4330) Astefr, the mean-square eor ofthe fle canbe computed using Ea, 43.12). ‘Thee i Be) = 732 4339 S We have now examined three different optimization approsches for the same signal-plus-noise situation. A comparison of the results is shown in Table 4.1, with the most restrictive fter being listed first and the leat restrictive one listed last As should be expected, the mean-square error decreases with cach succesive relaxation of the constraints on the choice of transfer function. The linear constraint is, of course, present in all three solutions. 
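For those who wish to check these numbers with MATLAB, the mean-square errors just quoted (and collected in Table 4.1 on the next page) can be verified by evaluating Eq. (4.2.5) in its frequency-domain form, E(e^2) = (1/2π) ∫ |1 - G(jω)|² S_s(ω) dω + (1/2π) ∫ |G(jω)|² S_n(ω) dω, for the σ² = β = A = 1 case. The short sketch below is ours and is not part of the text's examples; the function handles and variable names are chosen only for illustration.

```matlab
% Sketch: numerical check of the mean-square errors of Examples 4.1-4.3
% for the sigma^2 = beta = A = 1 case.
Ss  = @(w) 2./(w.^2 + 1);             % Gauss-Markov signal spectrum
Sn  = @(w) ones(size(w));             % unity white measurement noise
mse = @(G) (1/(2*pi)) * ( integral(@(w) abs(1 - G(w)).^2 .* Ss(w), -Inf, Inf) + ...
                          integral(@(w) abs(G(w)).^2 .* Sn(w), -Inf, Inf) );

T  = 1 + sqrt(2);                           % optimal time constant from Example 4.1
G1 = @(w) 1./(1 + 1i*w*T);                  % single-parameter filter
G2 = @(w) (sqrt(3) - 1)./(1i*w + sqrt(3));  % causal Wiener filter (Example 4.3)
G3 = @(w) 2./(w.^2 + 3);                    % noncausal Wiener filter (Example 4.2)

fprintf('parameter-optimized filter: %.3f  (0.914 expected)\n', mse(G1))
fprintf('causal Wiener filter:       %.3f  (0.732 expected)\n', mse(G2))
fprintf('noncausal Wiener filter:    %.3f  (0.577 expected)\n', mse(G3))
```

All three designs are, of course, constrained to be linear, and the numerical results simply confirm the ordering discussed above.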
We will sce in later chapters that this is not as serious as one might think at first glance. The expla- ‘ation of this, though, will be deferred until Chapter 5. 43. Tie STATIONARY OPTIMIZATION PROBLEM—IVEIGHTING FUNCTION APPROACH 474 ‘Table 44 Compaison of Reals of Examples 41, 4.2, and 4:3 ‘Type of Filter Transfer Square Solution Funetion ‘Weighting Function Error Single parameter “, cyan faa ‘ 0 reo caval Wiener fiter V5 po DA red “Ta ® #<0 Nogeusl ae : ews reo fer 00 R O i sm ae EXAMPLE 4.4 ‘There is classical problem in random process theory known as the pure pre= diction problem. Here we assume the additive noise corrupting the signal is zero, and we pose the problem of looking ahead and finding the best estimate of the signal a units ahead of the present time t. To demonstrate the applicability of Wiener filter theory to this problem, let the signal be Markoy with a known autocorrelation function: 20%8 Rid = orem or 540) = EE 43.32) Also, We first factor 5,46) VIB VIC Soon eS 43.33) stp -s+6 (4338) In this problem «is not ero, and hence we must frst multiply Ba. (43.34) by ‘e* and then find the positive-time part ofthe result. This i readily accomplished by appropriate shifting in the time domain as shown in Fig. 4.6. Finally subst- ‘uting the proper positive-time part into Eq, (43.22) yieldsOMAPTER A WIENER FITERING th] ac Figure 48. The shied en tuncons fr Example 4.8 Gy = hee VP (4335) ae V207B st+B THB Or the comesponding weighting funtion is 10 = 7009) 4336 “The Wiener solution says that the best one can hope ‘0 do (in the Teast- ‘squares sense) is to multiply the present value of the inpu: by an attenuation factor ¢-*, and this will yield the best estimate ofthe process a units ahead of the present time. Observe thatthe predictive estimate is dependent only on the present value of the input signal and not on its past history. I the Markov signal linder consideration ig also Gaussian, then the predictive estinate given by Wie ner theory is also identical tothe conditional mean ofthe process atthe predicted time, conditioned on all the prior input data. The use of the term Markov for a process with an exponential autocorrelation function should now be apparent. This will be expanded on later, in Chapter 5. a ‘THE NONSTATIONARY PROBLEM Inthe preveding discussion it was assumed thatthe filter was “turned on” at {= =, which made the entire past history of s(?) + n() available for weighting. ‘Tis led to the steady-state or stationary solution. In the trrsient or nonstation- ary problem, we consider the signal and noise to be covariance stationary pro- fesses with knowin spectral characteristics as before, but we consider the input fo be applied at r= O rather than —, This is indicated in Fig. 4.7 asa switching, oe ab tee] on m Figure 47 lock dogram forthe eansatoray pete, 444 THENONSTATIONARY PROBLEM 173 ‘operation atthe input, We assume 2er9 initial conditions; hence the filter output fan be Written as x= [stolste- 9+ ne- ad, 0 4) [As before, the ideal output would be s(¢ + a). Thus, the filter error is a= st + a) ~ 9) (442) Substituting Bg, (44.1) into Bq, (44.2), squaring, and then taking the expecta- tion of both sides yield Be) = ff ewetrr, tu ~ 0 edo nr fletomse 9 de £0) (43 “The problem is to choose g(u) such as to minimize Ete). The variational pro- cedure to be followed is essentially the same as for the stationary case, so it ‘will not be repeated. 
The only difference is in the limits on the integration, ‘which, in turn, traces back to the expression for x@) given by Eq, (4.4.1). This is an important difference, though, We need not worry about placing a causality constraint on (2) oF its perturbation, because the range of integration is Timited to be from 0 to t, We ean arbitrarily truncate g(z) to zero outside this range ‘As before, the variational procedure leads to an integral equation in g(7). For the nonstationary problem, itis feo. Wdu=Ry lot, Osrsr 44) “This equation is similar to the Wiener-Hopf equation except for the limits on the integral and the range of r for which the equation is valid, These seemingly small differences complicate the solution even further, though. The spectral fac torization method cannot be applied to Eq. (4.4.4) because the range of vis finite rather than semifiite 2s in the Wiener-Hopf equation. Equation (4.44) is ame- rable to solution, though, by another method that involves transforming the integral equation to a differential equation and then solving the differential equa- tion for g(z) (8), This technique will work only for the case where the spectral function for s(?) + n(@) is rational, that is, a ratio of polynomials in s*. However, Wwe note that this same restriction is necessary for spectral factorization, and therefore only limited types of spectal situations can be handled for either the stationary or nonstationary case. “The solution of Eq. (44.4) proceeds as follows, We first write R,,, a5 a Fourier integral174 crapren 4 WIENER FLTEAING if Fire [35 fF sate at] te Byala) 445) Next, we waite ,,,(5) as a ratio of polynomials in s* No) Sul) = Hy 446) Now we note that operating on e*~ with the differential operator D(a*/d) will generate D(sFe"*.* Thus, the denominator of S,,, in Bq. (44.5) can be canceled by operating on both sides of the equation with D(d?/dr’) Similarly, algebraic moliplication within the integral by (=) is equivalent to operating in front of the frst integral with N(d?/de", Inserting these equivalent operations in Eg, (4.45) yields N i) fis iz We now note that the Fourier integral in brackets in Eq. (4.4.7) is just the Dirac dela function (+ ~ 1). Inserting this and using the shifting property of the impulse function lead to [Leva]a-0(8) mwioe9 ean e @ : »(Z) «9 =0(S) enter a Oster (448) ‘We now havea diferent equation ng() ater than an integral equation. Furthermore, ics a linear diferental equation with constant coefficients, and the solution ofthis type of equation is well known. Before proceeding, note that the inerval on 7 in Eq, (4.48) i the open interval, 2) ater than the closed interval [,#} associated with the integral equation, Eq. (44.4) This is inten- one and arises because ofthe problem of continuity and differentiation at dhe endpoints. In other words, we may safely assume only tht the diferntal equation i valid inthe interior region of the interval “The solution of Eq. (448) will, of course, conain arbitrary constants of integration. Futhemmore, because of the end-point problem, impulse functions with undetermined amplitudes must be added at + = O+ and 7 = 1. (The + and ~ indicate thatthe impulses are placed atthe inside edges ofthe interval.) ‘There for adding the impulses is a follows: I the oder of Dts) is 1. The same as NG), add no impulses. 2. itis owo greater than N(#), add simple impulses. * enemies ht refers tote polynomial in the denominao of Sy.) ad nt he “D-apeaoe” ‘sod nae egution hry. 
Thats, Dla) Dts) wth pan by 1dr Seay, ‘bide is the ames polyoma NG) wih replace with 42Fae 44 THE NONSTATIONARY PROBLEM 175 3. If it is four greater than N(#), add simple impulses plus doublet impulses 4, Bre ‘The unknown coefficients in the general solution are evaluated by substituting the assumed solution into the original integral equation and demanding equality ‘on both sides of the equation. This is much the same as using the initial con- ditions to evaluate the constants of integration in the usual inital-condition problem, Remember that the procedure just described is highly specialized and ap- plies only to the case where S,,,(s) can be written as a ratio of polynomials in . The justification of the procedure in any particular case lies in the final substitution of the solution into the integral equation. We can ask no more of the solution than to satisfy the original integral equation EXAMELEG eee ‘We again look atthe situation where the signal is Markov and the noise is white ‘The additive combination forms the input that is applied to the filter at « = 0. Le R= of 5) a9) RO) os) 44.10) 1 we assume the signal and noise have 2ro crosseoreaton, 243 _ Me) ste 4a Reyes + 2) 720 (44.12) [Note that the absolute magnitude signs around r may be dropped because + is always positive. If we use the polynomials Ns*) and Dis!) as given by Eq. (44.11), the differential equation (4.4.8) becomes @ ae (2) and this reduces 10 10 de + 3g(7) = 0 (4.4.14) ‘The general solution of this equation is recognized to be176 CHAPTER 4 WIENER FILTERING a) = 6-5" be (4.4.15) ‘We do not need to add impulses in this case because N(=*) and D(s?) are both second order. Thus, we know the filter weighting function is of the form given by Bg, (44.15) without additional impulses. The a and b coefficients may be evalusied by substituting the known form of solution (e., Bg. 44.15) into the ‘original integral equation, Eq, (4.4.4), and then choosing a and b such that the resulting functions on left and right sides of the equation are identical functions of =, This is straightforward, but considerable algebra is involved, which will be committed. The end result is : V3 4 Dev 0 = aa ee = VE ca -2(-3V3 + Ne O- ER OVI De (447) Note thatthe “constants” are functions ofthe running time variable ¢. The final solution for weighting Function is then a(t) = ape VF + be (44.18) where a(t) and B(e) are given by, Eqs. (3.4.16) and (3.4.17). The semicolon in (7 is used to emphasize the fact that ris the usual age variable inthe weight- ing function and ris just a parameter. The resulting filter is, of course, a time- variable filter ‘tis readily verified that as ¢ approaches ~, the solution for g(=) becomes 8) = (V3 - Dev 44.19) [As should be expected, this is the same steady-state solution that was obtained previously using spectral factorization methods. It is of interest to note that the differenial-equation approach provides an alternative method of solving the sta- tionary problem. a In Example 4.5 it is worth noting thatthe running time variable came into the weighting-function solution naturally (ie, without any conscious effort) be- cause we chose to write the superposition integral in the form a= [ atrnte- 9 dr (4420) ‘The other form, which is equally valid, and sometimes preferred in books on linear systems theory (9), is x00 [Lae 260 aay 45 onMocoNATY 177 In Eq. (44.21), i, 2) has the physical meaning of the system response to a unit impulse applied at time +. 
The relationship between the impulsive response and ‘weighting function is obtained by making the appropriate change of variable in either Eq. (44.20) or Eq. (44.21) and then comparing the two integrals. The result is Ma) = at 7 (4422) 45 ORTHOGONALITY ‘It was shown in Section 4.4 that the filter weighting function that minimizes the mean-square error must satisfy the integral equation fo Also the filter enor is given by et) = at + «) ~ 19 = s+ 0) ~ ff aenlste— + m= 9] du (452) ‘We wish to examine the expectation of the product of the filer error at time ¢ and input at some time % where 0 < 1 = &. Let the input be denoted as 20). Then a) = 5) +) asa) ad Ataepto] = E[tae) + me xfare or [stalse a9 + nro] aif] 549 ‘Moving s(t) + n(t,) inside the integration and carrying out the expectation ‘operation yield Platte) = Rand = +0) ~ [OR le 45 — wae 45) However, g(u) must satisfy the integral equation, Eg. (4.5.1). Thus, the above rust be 2ero for 0's (( ~ f) S & Since the time f, was assumed to lie between O and 1, this is equivalent to saying Flue] =0, 05454 456)178 46 (QUPTER A. WENER FILTERING av-—e eer =H 9o40-—al Ge Faee 48. Gowan Winer rola. If the expectation ofthe product of two random variables is zero, the variables are suid to be orthogonal. Equation (45.6) states thatthe filter error at the current time ¢is not only orthogonal to the input at the same time f, but it is also ‘orthogonal to the input evaluated at any previous time during the past history ofthe filter operation. This is a consequence of minimizing the mean-square eror. Furthermore, it should be apparent from the derivation thatthe argument can be reversed. That is, if we begin by assuming that e(?) is orthogonal to 2(,), ‘we can then conclude that the integral equation, Eq, (4.5.1), is satisfied. It is important to recognize this equivalence because some authors prefer to begin their optimality arguments with the orthogonality relationship rather than the minimization of the mean-square exor (2, 3,8) COMPLEMENTARY FILTER Applications of Wiener filter theory are not as commonplace as one might ex- pect. Perhaps one reason for tis is that Wiener theory demands thatthe signal, 1 well asthe noise, be noiselike in character. Inthe usual communication prob- lem, this is not the case. The signal usually has at least some deterministic structure, and itis often not reasonable to assume it to be completely random. ‘Thus, the ypical filtering problem encountered in communication engineering simply does not fit the Wiener mold. There is an instrumentation application, though, where Wiener methods have been used extensively. In this application, redundant measurements of the same signal are available, and the problem is to combine all the information in such @ way as to minimize the instrumentation ‘eros, In order to keep the discussion as simple as possible, we will concentrate ‘on the twovinput case and simply mention that the technique is easily extended ‘0 more than two inputs Consider the general problem of combining two independent noisy mes surements of the same signal as depicted in Fig. 4.8. In the context of instru- ‘entation, the measurements might come from two completely different types of instruments, each with its own particular error characteristic. We wish o blend the two measurements together in such 2 way as to eliminate as much of the cor as possible. Ifthe signals is noselike, Wiener methods may, in principle, "The tem conpementy ler per to have ont in psp pls i 1953 by W. 
adeno nd Be Fiz (10) 48 COMPLEMENTARY FLTEA 178 10 1a 0-——of main a=40) so+nio—ef ao Figure 49. Corgleraniay tte, bbe used to determine the transfer functions Gis) and Gy(s) that minimize the ‘mean-square error. However, more often than not the signal may not be properly ‘modeled as a random process with known spectral eharacteristis, For example, if s() eepresents the postion ofan airplane in Might along a preseribed air route, certainly the signal is not random. Furthermore, inthis case we would not want to delay or distort the signal in any way in the process of filtering the measure- ‘ment errors. Thus, we look for a way to filter the signal without paying the price of unwanted delay and distoroa. ‘A method of filtering the noise without distorting the signal is shown in Fig. 4.9. From the block diagram, it should be apparent that the Laplace trans- oem of the output may be written as Xi) = Sis) + MILL - GO] + NIG) 46. Clearly, the signal term S(s) isnot affected by our choice of G(s) in any way On the other hand, the evo noise inputs are modied by the complementary teanser functions [1 ~ G(0)] and GC). If the two noises have complementary special characteristics, G(s) may be chosen to mitigate the nose in both chan- nels For example, ifm, is predominantly low-frequency noise and m, high- frequency, ten choosing Gs) be a low-pass filter wil automaticaly sliendate rn, a8 well am, ‘We now note that the noise term in Eg. (4.6.1) has the same form as seen before (Eg. 42.4) except forthe sign on the N,) term, Thus, we would expect to be able to use Wiener methods ia minimising this tem. This is perhaps even more evideat from Fig. 4.10. It can be easly verified that the input-output relationships ae identical forthe systems of Figs. 4.9 and 4.10, and thus they ‘re equivalent. From Fig. 4.10 we see thet the purpose ofthe filler Gla) is to sent a 40) noone oo Figure 4.10 Dien and ectorard contusion for Seonpomenay ter soon 7100 cHvPTER § WeNeR ALTERING weft oa ew rs gue 11 Cares comementay Maro eomtnleg ‘abn nd econremeter seas give the best possible estimate of (0), and this, in tur, 's subtracted from (+ mi) in order to give an improved estimate of 50). The input to G(s) is nn{d)~ mend hence the filter must separate one noiselie sgnal from another. Clealy, if we let n,(0 play the role of signal and ~n,() the role ofthe noise, this problem fits the single-input Wiener theory perfectly. An example will I+ lustate an engineering application of the complementary-fiter method of com- bining redundant measurement data EXAMPLE 4.0 ————————— Let us say that in a particular closed-loop position servo, itis desirable to add rate feedback tothe system to improve the system stability. The rate signal is to come ftom a permanent-magnet de tachometer on the cutput shaft of the servo. However, i i noticed that the tachometer output is noisy due to the Combined commutator/brush action. Low-pass filtering the tachometer signal is ‘ posibiity, but this introduces unwanted delay into the rate signal. Someone Suggests that if we were to add an angular accelerometer (as well a5 the ta Chomelet) onthe output shaft, then we could use @ complementary filter imple~ mentation to obtain a clean rate signal without the usual delay “The suggested scheme is shown in Fig. 4.11. This is 2 conceptual block iagram, becuse clearly, we would not want to implement an integration in the acceleration path and then follow it direcly with a differentiation. The overall transfer funtion for this path is just T/(1 + Ts). 
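Before going further, it is easy to verify numerically that the two measurement paths really are complementary with respect to the rate signal. The short MATLAB sketch below is our own illustration (the time constant value is arbitrary): the tachometer channel sees G(s) = 1/(1 + Ts), while the accelerometer channel, after the integration, sees the complement, so from rate to output its overall transfer function is s times T/(1 + Ts).

```matlab
% Sketch: check that the two paths of the complementary filter pass the
% rate signal with unity gain at every frequency (no delay, no distortion).
T = 0.05;                       % low-pass time constant (arbitrary illustrative value)
w = logspace(-1, 3, 400);       % frequency points, rad/sec
s = 1i*w;
H_tach  = 1./(1 + T*s);         % rate -> output through the tachometer channel
H_accel = s .* T./(1 + T*s);    % rate -> acceleration (s) -> integrated, filtered channel
H_total = H_tach + H_accel;     % should equal 1 for all w
fprintf('max |H_total - 1| over the band = %.2e\n', max(abs(H_total - 1)))
```

The deviation from unity is at the machine-precision level, which is just the dynamically exact property of the T/(1 + Ts) accelerometer path working together with the low-pass tachometer path.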
This, incur, ean be combined vith the upper path to yield the’ simplified block diagram shown in Fig. 4.12. ‘We hhave not attempted <0 optimize the low-pass filter in this example. Rather, we chose the simplest possible form for G(s) that wl give the desired Jow-pas fiterng, For simplicity, we have assumed that both m, and n, are high= fiequeney noises. This being the case, n; will contain primarily low-frequency components because of the integration. The time constant T'can be adjusted to awe 412 Sncted complerontnry Str ter combing tachometer and accor rat srl 4A THEDISORETE WIENER FILTER 181 ‘minimize the effects of the noise sources m, and na, subject tothe constraint on the form that we chose for G(s). Itis left as an exercise to find the optimum 7, ‘ven the power spectral densities of n, and n, (see Problem 4.14) a ‘The principal of complementary filtering is easily extended to the case of ‘more than two signals. All one has to do i let one ofthe transfer characteristics be the complement of the sum of the others. For example, for the three-input case, let G,(«) = transfer function for noisy measurement 1 G,{s) = transfer function for noisy measurement 2 1 ~ G(s) ~ G(s) = wansfer function for noisy measurement 3 ‘Then the signal component passes through the system undistorted, and one chooses G(s) and G,{s) such a to give the best suppression of the noise. The problem of determining the optimal G,(s) and Gs) is a evo-input Wiener problem. Tt is important to note thatthe choice of complementary filter transfer func- tion does not depend on any prior assumptions about the signal structure. It can be either noiselike or deterministic, and the complementary feature assures that the signal will not be distorted in any way by the filtering action. For this reason, the complementary filter is also referred to as a dynamically exact mechaniza- tion, Philosophically, i is a “safe” design and is particularly applicable in sit baton where the designer wants fer twill ope rskonshly well with Statistically unusual situations without giving disastrously large errors. A strict Wiener design with no complementary constraint is, of course, chosen to min- imize the average squared error. So in the unusual situation, the error may be 4uite lage. This can be disastrous in some applications. a7 ‘THE DISCRETE WIENER FILTER ‘The Wiener approach to least-squares filtering is basically a weighting function approsch. When viewed this way, the basic problem always reduces to: How should the past history of the input be weighted in order to yield the present best estimate ofthe variable of interest? Its instructive to see how this approach ‘extends to the discrete-measurement situation. We could begin by discretizing ‘the continuous weighting function g(u) in Eq. (4-44), and then approximate the integral in the equation with a finite sum. This approximation is not necessary, though, because the minimum-mean-square-error problem can be re-posed in discrete-time terms and solved exactly in its own right. This will be our approach here (Consider the filter input to be a sequence of discrete noisy measurements fu Zn +++ % 8 shown in Fig. 4.13, These are additive combinations of signal182 CHWPTER« WENER FLTERING gue 439. Ostet measrenant stator, and noise; hence, 2, = #) + my, 2 * S, + my and $0 on. 
As before, we denote the filter output as x and therefore the corresponding samples of the output are ipod We now waite the output at time f, as 2 general linear combination ofthe past measurements kay + kag to + kay aay The filter err may then be writen as 72) +k) “The mean-square error is then cc Bs, ~ (hay + hate toe + oP Bist) + [REE + BCE) +--+ + REC + Dig bBlay,) + Deke Cee) + °°] ~ [2b.Btes,) + WsEles,) ++ + ACS] — 47.3) We now wish fo find ky fay hy such a8 to minimize B(e). The usuat methods of differential calculus lead 9 the following set ofKinear equations: Bid) Bae) = TPA] PE) Fe) z| [Bes . : ara Fez) * we} LA) Leeo. Just asin the continuous problem, we assume thatthe auto~ and crosscor- relation functions ofthe signal and noise are known, so that all dhe expectations indicated in Eq. (4.74) are available and the equations may be solved for the Weight factors fy hy.» « hy» Note that the problem grows in size with each new measurement as m increments in time. Also, note that in our notation the ‘ordering ofthe weight factors is opposite to that in the corresponding continuous problem. That is, the “end” weight factor kis the weight given to the current Imeasurement at time f,, whereas the corresponding weighting in the continuous ‘case isthe “beginning” value of g(u), that is, g(0) (see Eq. 44.1) It should be 48 PERSPECTIVE 183 clear that the size of the problem can easily get out of hand numerically as n Thecomes large. There isa special case that is quite manageable, though, and we ‘will look at it in some detail ‘Suppose we consider the stationary problem, and suppose further thatthe auto- and crosscorrelatons for z ands become small asthe “lag” between them ‘becomes large. It is then reasonable to assume thatthe string of weight factors ‘can be truncated at some suitably large value fm. This isto say (in our notation) that if we go backward and look at ky fy,» iy the weight factor k, (and nearby weights) will be negligible. Now imagine a stationary real-time situation Where we only store the 1 most recent measurements. As time evolves, every time we get a new measurement, we add it to our suite of measurements, and we discard the oldest one. Presumably, if stationary condition exists, the trun cated ky oy. ky Sequence is a constant vector that can be precomputed off- line and stored, The numerical problem of summing m weighted measurements is then quite manageable, even if a few hundred terms are involved; in many applications, this provides sufficient accuracy for the problem at hand, ‘One nice feature of the truncated discrete-time approach that was just de- scribed is that no special mathematical form is required for the auto- and cross- corzeations that describe the signal and noise. This works fine for the stationary, single-input, single-output problem. However, the weight factor approach be- ‘comes completely unwieldy in more complex time-variable, multipl-input, ‘multiple-output applications. Such problems are much better solved using dis- crete Kalman filtering methods, and this is the subject of Chapter 5 and subse- ‘quent chapters PERSPECTIVE ‘A Wiener filter minimizes the mean-square estimation error subject to certain ‘constraints and assumptions. It is important to remember that this optimization is only intended to apply to the problem of separating one notslike signal from another, which isa very restricted class of filtering problems. Also the assump tion of linear filtering was built into the derivation from the start. 
We will see later, in Chapter 5, that this is no a serious restriction if all the random processes involved are Gaussian. In this one case, the linear filter is optimum by almost ‘any reasonable criterion of performance (11). However, the non-Gaussian case is another matter. A nonlinear filter may be better, and the Wiener filter is op- timal only within the restricted class of linear filters. ‘Sometimes, minimization of the mean-square error can lead to seemingly strange physical results; therefore, the results of Wiener filtering should be viewed with a degree of caution, An example will ilustrate this. Tt was men tioned in Section 2.11 that both the random telegraph wave and the Gauss-Markov process have exponential autocorrelation functions. Since the Wiener filter design depends only on the auto- and erosscorrelation functions, the solution for the pure prediction problem witl be the same for both random telegraph and Gauss-Markov signals with the same autocorrelation functions. It486 CHAPTERS wENER FILTERING ‘was found in Example 4.4 thatthe solution in this case is a simple attenustor and, for large prediction time, the predictor output is approximately zero. This makes good sense for the Gauss-Markoy signal, because itis noselike with a central tendency toward zero; in the absence of relevant (ie., “recent") mea- surement information, one should just pick the process mean asthe best estimate. ‘This way the estimate is only rarely in error by @ gross amount, and itis often close to the signal value, On the other hand, to estimate the random telegraph signal to be 2er0 is pure nonsense, We kaow a priori thatthe signal is either +1 fof —1, and itis never zero, That is, the Wiener predictor never predicts the correct answer, nor is it even close’to the correct answe:! We might better arbitrarily choose +1 as our estimate. Then, at least we would be correct half the time! Think of the many game (and more serious) siuations where you ‘would be better off to be exactly correct half the time (and grossly in error the other half than to be significantly (but perhaps not grossly) in eror all the time. ‘The reason forthe strange result inthe random-telegraph-wave predictor is thatthe optimization procedure did not account for the high=r-order “statistics” of the process. The mean-square-error criterion is simplistic and only calls for knowledge of the correlation functions of the processes ievolved. This%is, of course, convenient and it also happens to fit the Gaussian process case quite well (for reasons that may not be apparent yet). However, as has just been demonstrated, the mean-square-error criterion of performance can lead to strange results when dealing with non-Gaussian processes PROBLEMS. 4.1 A closed-loop position control system has the form shown in the block diagram. The spectra density function ofthe derivative ofthe signal s() i given Problem 41 and the noise is unity-amplitude Gaussian white noise. Fad the value of the fin constant K that will minimize the mean-square error, and find the damping Into corresponding to this gain (Ging: The error term tat involves the signal spectrum iso the form S(=[1 — G9] (ee Eq, 4.2.3). In this problem [I~ G(e)} contains an factor in the rumerator that may be linked with Si). Since this has the interpretation of derivative of s() i the time domain, the mean-square errr term de to signal may be writen in terms of the spectrum ofthe derivative of the signal, which is the spectral function given in the problem. 
This problem is worked out as an example in Trunl (12). 42 The figure fr Problem 3.21 is repeated here for convenience. Let f,(0) and 4,0 be independent white-noise inputs with spectral amplitudes A, and A, just PRoaLeMs 185 as in Problem 3.21. Consider x (the inertial system position ero) as the output and find the value of K that minimizes the mean-square value of x CJ {Eitan ctr 92° te? fost evan nab it he dss dag te rotiem 42 43 One facet of biomedical electronic instrumentation work i that of implant- ing miniature, telemetering ansducers in live animals. This is done in order that various body functions may be observed under normal conditions of activity and environment. When considering such an implant, one is immediately con- fronted with the problem of supplying enetpy tothe transducer. Bateries are an obvious possibilty, but they are Bulky and have finite life. Another possibilty is to take advantage of the random motion of the animal, and a device for accomplishing this is shown in the simplified diagram of the accompanying figure. The energy conversion takes place in the damping mechanism. This is shown as simple viscous damping, bu, in fact, would have to be some sort of| clectromechanical conversion device. Assuming that all the power indicated 35, being dissipated in damping can be converted to electrical form, what is the ‘optimum value for the spring constant K and how much power is converted for this optimum condition? The spring-mass arrangement is to be critically damped, the mass is 1 kg, and the autocorrelation function for the velocity # is, estimated to be $— Epa 1a Problem 4:3 oeeeee¥r 186 cwerens wenn FLTERNG Ro) = te se) [or more details, see Long (13). i 44 Find the optimal causal transfer function for the case where the autoor- telaion fonctions ofthe signal and noise are Ryo) = 22% ! RO = ‘Also find the mean-square error. The signal s(?) and the oise n(o) may be assumed to be independent of each other and both are time stationary. Also let the prediction time be zero. In bref, this isthe classical Wiener filter problem ‘with zero prediction time. 45 Find the optimal noncausal G(s forthe signal and noise situation given in Problem 4.4, Also, find the corresponding weighting function and the resultant ‘mean-square error. i 446 Consider a stationary Gaussian signal whose spectral density function is eet i SW) = a gat #16 Find the optimal predictor for the stationary case and a = 1 (vote: The causal solution is desired, and the predictor may be specified in terms of either @ transfer function or & weighting function.) 447 Consider an additive combination of signal and noise where the spectral densities are given by 80=AG i ‘Both the signal and noise are stationary Gaussian processes, and they are sttis- i tically independent. (@) Without regard to causality, what is the optimal linear filter for the stationary case? The answer may be specified in terms of either a trans- fer function or 2 weighting Function, () Does the noncausal result of (a) have any physical significance? That js, is there a corresponding estimation problem where the computed theoretical mean-square error would have the significance of actual ‘mean-square estimation error? Explain brill. 48 Consider the following autocorrelation functions of the signal and noise in a stationary Wiener prediction problem Proaes 187 RO ‘The signal and noise are independent and the prediction time is .25 sec. Find the optimal causal weighting function. 4.9 Show that the Wiener-Hopf equation (Eq. 
4.3.18) is @ sufficient as well as necessary condition for minimizing the mean-square ero [xlin: Differentiate Eq, (43.6) twice with respect to © and then examine @E(@*\fde. Recall from elementary calculus that the extremum is a relative imu fs quant spose at the exemam point, which nis cases e= 0] 4.10 The orthogonality principle states that the filter err at the current time ‘is orthogonal to the filter input evaluated at any previous time since initiation of the input. In Section 4.5 this principle was derived from the nonstationary version of the Wiener-Hopf equation (Eq. 4.4.4), Reverse the arguments and show thatthe integral equation (ie., Eq. 4.44) may be derived from the ortho: gonality principle. [If you need help on this one, see Davenport and Root (8), p. 240.) 4.11 Consider a noisy measurement of the form = ay + mt) ‘where ay is an unknown random constant with zero-mean normal distribution and a variance of 6%. The additive noise may be assumed to be white with spectral amplitude A, Find the optimal time-varible filter for estimating ay. The filter is umed on at r = 0, (int: A random constant may be thought of as a limiting case of » Markov process with a very large time constant.) 4.12 Verify that the a(#) and 6(0 coefficients given in Example 45 are correct. 4.13 The figure on the next page was taken from Kayton and Fried (14), p. 318. It describes a means of blending together barometricaly derived and in cently derived altitude signals in such a Way as to take advantage of the best properties of both signals. The barometric signal, when taken by itsel, has large inherent time lag that is undesirable. It is, however, lable and reasonably ac- ‘curate in the steady-state. On the other hand, integrated vertical acceleration has ‘an unbounded steady-state error because of accelerometer bias error. In effect, the high-frequency response of the accelerometer is good, bu its low-frequency response is poor; just the reverse is true forthe barometric instrument. Ths this is an ideal seting for a complementary filter application Let G(s) and Gs) in the figure be constant gains G, and G, and show that: (@) The system fits the form of a complementary filter as diseussed in Section 4.6. (&) The system error has second-order characteristics with the natural fre- ‘quency and damping ratio being controled by the designer's choice of G, and G,188 CHAPTER ¢ voenen FLTERING ie ) Problem 443 [vote: In order to show the complementiry-filtr property, you must conceptu- ally think of directly integrating the accelerometer signal twice ahead of the ‘simmer to obtain an inertally derived altitude signal contaminated with error However, Uiret integiation of the accelerometer owtpat is not required in the final implementation because of cancellation of ss in. numerator and denomi- nator of the transfer function. This will be apparent after cxrying through the
| { a. the sca roses Xltyes) = eats AKC) + f Hu Gute) dr (53.5) ce ton ome Foe 85 ined Gaus Mar poses 0 nour abbreviated notation, a Let us say the sampling interval is Ar and we wish to find the conesponding discrete model. This amounts tothe determination of 4, Qu, and By. The tran sition matrix is easly determined as* Cleaaly, dis the state transition matrix forthe step from 1 t0 fy» and My is the driven response at ,, due to the presence of the white-noise input during the (fa) imteval Note thatthe white noise inpt requirement inthe ontin= 4.5 [ett = 1° ‘ous mde atomaially assures that w, will be a white sequence in the diseret= ode iat ‘Analytical methods for finding the state transition matrix are well known, ai [i “1 | els GB and these may be used in systems With low dimensionality. However, evaluation Os+p Aan ‘tthe Q, watuix that describes w, may not be so obvious. Formally, we can ae ‘write Q, in integral form as i ane Q = Alwwi) ee (53.9) 0 em ALE” etn 06c0u oe] |" S¢.. 6emutn aa] | , Next rather than using Eg. (3.6) diely to determine Qy we use the wansfer ci fanetion approch. From the biok diagram of Fig. 3.3, we observe the fallen 2 PP een a6 coetuouT ener one 9 a dy transfer functions: cfc ahera elec 536 Viet Guo) = 6 = 6310 ‘The matrix E{u(u"%(m)] is a matrix of Dirac delta functions that, presumably, is inom fom the coinvous mode. Thus in principle, Q may be evaluated VFB from by (62.8) This ota tv task, though, even for ow order systems. : (ay Iie couingows sytem given ise othe discrete station has conta para oe inc andi the various whitenote inputs have zero crscorelaton, some Simpleton i possible andthe taster function methods of Copter 3 may ‘The comesponding weighting functions are Seapled, Ths best stated with an example rahe thn in general tes COMES 20- PEa-en 6312) Tae imegated Gauss-Markov process shown in Fig. 55 is requenty encoun teed in engineeingapplleaion, The condnuous model in hs cae s af) = Veer oo ale (af o J-2 aE] [es] an + en tg tay hier rts mine a nee ap ee Stew tg dears ca i nls feb ates wrt oe ines chatc te ' hnaec e aahae k e y=0 [3] 639 ‘pata matenatss ql MATLAB eons bas Tae ieee CE TSE creme apres meet202 GUPTENS THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION. We can now use the methods of Chapter 3 to find the needed mea responses: square etal = ff e@euenetatoucn) 46 én =f PE a eon ~ ease 9 detn 1 28 Zee] eat Flond =f [F ex@esenetucouen] a6 dn = [C [rena - emma m dean 2203 [La— ey - -e| 63.15) 2 [ie ) at - & ) Hhaad = [° [ sx@sionetecoue) as an = [FE cetpe mera — m dean =o 6316 Ths the Q, mati i Flas) Ele) -E (5.3.14) Bq. (5.3.15) - Lees) As) ~ [Bq 63:15) Fa. 6319) S31? ‘The B, matrix is the same for both the continuous and discrete models and is B= Oo} 63.18) “The diserete model is now complete with the specification of), Qu. and By 25 aiven by Eqs. (5.3.9), (5.3.17), and (5.3.18). Note that te k subscript could have ‘been dropped in this example because the sampling interval is constant. Numerical Evaluation of ¢, and Q, ‘Analytical methods for finding &, and Q, work quite well for constant parameter systems with jst a few elements in th state vector. However, the dimensionality oes not have to be very large before it becomes virtually impossible to work ‘out explicit expressions for ¢, and Q,. This is especially tue if the system F 53 DISCRETE-TIME MODEL 203, ‘matrix contains time-varying terms. 
Thus, we offen need to resort to numerical methods, ‘The state transition matrix tells us how the dynamical system naturally ‘elaxes from one state to a subsequent state in the absence of a forcing function. Stated in mathematical terms, X= HAD, =H 63.19) I is easly verified that if ¢ satisfies the matrix differential equation 46,40) then x(7) as given by Eq. ($3.19) will satisfy Eq, (5.2.1) with w = 0, which describes the natural dynamics of the system. Clearly, once we have established the differential equation that must satisfy (and the inital condition), then the entire theory of numerical solutions of ordinary differential equations can be brought to bear on the problem (5). Runga-Kutta methods are especially attrac- tive for solving the intial-condition problem, and they carry over nicely from scalar problems to matrix differential equations (6). If F() varies appreciably cover the AF interval of interest, there appears to be no way to avoid solving Eq, (6.320) by some means or other. Some simplification does accur, though, for the special case where F is constant, and that will be considered next ‘Assume that F is constant over the (fs fs) interval of interest. Then the state transition matrix is simply the matrix exponential of F\, that is, Ob. lt, 4) = 1 (6320) ay = eats par + SE . (53.21) where At is the step size. This is especially easy to evaluate using MATLAB's boilt-in expm function. An example will illustrate ths, EXAMPLE 5.4 Consider the harmonic motion process that was discussed in Section 5.2 (see Bq, 5.2.20). For this numerical example we will let oy = I and Ar = 1. Fav is then ou Far = iE : 4] Say that we give Far the variable name fdt. Then executing MATLAB's expm function yields (with rounding) ora -[ 252 22) (0998 9950204 CHAPTERS. THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AUD SIMULATION “The numerical evaluation of Q, is not as easy as evaluating é,. One method is to use the defining equation, Eq, (5.3.6), and evaluate Q, for a small interval using a firstorder approximation in Ar. This can then be propagated through & sequence of small steps to get the Q, for the whole interval. This is a workable ‘way of evaluating Q if done carefully with very small steps (1). However, itis ‘not the most efficient or convenient method. We will look first at the general time-variable case and then consider the fixed-parameter case late. I will be shown later in Chapter 7 that Q, must satisfy the matrix difer- ‘ential equation FQ 6) + Qle, IRM + GOWGO, — QU 4) = 0 63.22) Que 1) where W is the power spectral density matrix associated with the forcing function w in Eq. (5.34) and F() in the same equation describes the system ‘dynamics. The notation in Eg, (53.22) is similar to that used in the transition ‘matrix problem, that i, is the “running” time variable, is fixed, and ¢ = 6 Presumably, F, G, su W ave how, s0, i principle, Bg. (53.22) can be solved numerically for any step size, Ths is nota trivial mater, though, and it involves the use of Runga-Kutta (or other) numerical methods (6). ‘The Q evaluation problem simplifies considerably forthe fxed-parameter cease. One method due to van Loan (8) is especially convenient when using ‘MATLAB. It proceeds as follows 1, Fist, form a 2n X 2 matrix that we will call A (vis the dimension of ». -F |Gwer ae (5323) 2. Using MATLAB (or other software), form e* and calli B. [e'Q, B = expm(a) = | ——| —— 6329 ole (The upper-left partition of B is of no concern here.) 3. 
Transpose the lower-right patton of B to get 6 (= transpose of lower-ight parition of B (5.3.25) 4, Finally, Q, is obtained from the upper-right partition of B as follows: Q = & * (upper-right part of B) 5326, “The method will now be illustrated with an example, 53 DISCRETE-TIME MODE. 205 EXAMPLE 5.5 ‘Consider the nonstationary harmonic-moton process described by the difxen ial equation Ft y= 2a) 632 where u(t) is unity white noise (See Section 5.2). The continuous state model for this process is then El IE}-B B sla 529 ee wheres ad ae the us phase variables. In this ase, W is 0 0 w-[3 3] 62.29 (Ph sate fctor is accounted for in WG is then ewor=[2 3] 6330) "Now form the partitioned A’ matrix. Let Ar = .1 just as in Example 5.4 o 0 a» [-Far Gworar ae oer (5331) o - 10 ‘The next step is to compute B = e*, The result is (with numerical rounding) 9950-0998 | —0007 | ~.0200" 0998 9950 | 0200 | 3987 B = expm(a) = | ——— —— 4 —_ ~ 6332) 0 0 | 9950 | ~0998 0 0 | 0998 | ‘9880, Fi iy, we get both & and Q, from 4 © transpose of lower-ight pation of B ae lee se' | i 208 COPTER THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION 0, = 6 + (upper-siaht partition of B) _ f.0013 0199 [283 er] as State Model from the ARMA Model 1W vas mentioned previously that sometimes processes ate intrinsically discrete and have nothing to do with a continuous evolution of time. Such processes are often desribed in terms of an ARMA model, which was touched on briefly in Section 39 (, 10, 11). We will now continue that discussion and show how the 'ARMA model can be converted to state-space form. We begin by repeating the ‘ARMA model equation given in Section 3.9; HEE A) + atk tn — D+ a ay n= 2) toe + at) + Paw), Baw(k + m) + By-yw(k +m ~ 1) 1 and & ” 11,2... 63.35) [Not thatthe model is single-sided in k and the order of the MA. partis a least fone fess than the AR part. The reasons for these restrictions will be apparent shot ‘Conversion of the ARMA model to the controllable canonical form is ef- fected by following a procedure analogous to that used in the continuous dit ferential equation case. First, we define an intermediate variable r(K), which is the solution of| rk +n) + ak += 1) tb aatktn—2) +a) = wi) 63.36) [Next we define state variables that are analogous to the phase variables in the continuous case 2) = kn 1) 6337 Using state variables as defined by Eg, (5.3.37) andthe difference equation, Eq. (6.336) leads to the matrix equation 59 DISCRETE-TIME wove. 207 see) ft oo “x ] [or né+D] | 9 0 = Pl 1 0 4119] way HED] [may -a ao 0] Lato. 63.38) We now need to reconstruct the original y(X) variable as a linear combi- nation of the elements of the state vector. This can be done by considering. @ superposition of equations of the type given by Eq, (5.3.36) with the index k advanced appropriately, However, itis abit easier to do this with a block diagram by using z transforms as shown in Fig. 5.6. The analogy between the block diagrams of Figs. 5.2 and 5.6 should be apparent. It should be clear now that the output equation that takes us from the state space back to y() is 38 = 1B Bo BAL: co 2, ‘We will illustrate the procedure with an example. EXAMPLE 5.6 We will use the same ARMA model here that was introduced in Example 3.11 jn Chapter 3. The difference equation is E+ 2) — Kk #1) + SHE) = Sulk + 1) + 2500), 01,2. (53.40) ‘The block diagram that leads to the controllable canonical form is shown in Fig, 5.7, The state model is then . 
ra vin Figure 5.6 Slok dag fr consncing tte model frm ARIA mode.208 CHAPTERS THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AUD SIMULATION eae ee) Sai J * Fare 7 Slck diagram for slate made + D] [0 1] fa] , fo [er B)-[ 2s HER) +[i]o ean w= [25 5] [23] 63.42) Ege (5341) and (5.3.42) now compose the model We can use this same example to demonstrate why we need to put certain restrictions on our ARMA model. First, we restrict to integer increments be ining at k = 0, that is, k = 0, 1, 2,.. .. Ths is done simply because in the fer theory that follows in Section 5.5 we wish to consider measurement se~ quences tat begin at some finite realtime, and itis a matter of convenience to Tet the point of initiation be k = 0 (equivalent to # = 0 in the continuous case. ‘We have no problem with leting k (or 1) proceed on indefnitely in a positive sense, but time prior to starting the process will be of no consequence in the ‘material tat follows. The second restition put on our ARMA model has to do ‘wth the order of the right side of the ARMA equation. Suppose we were to let the right side be of the same order asthe lft side, This would lead to a transfer function in the z-domain that bas the same order polynomial in the numerator 8 inthe denominator. For example, we might then have something like 1 eae aly Transfer function = 272 It can be seen by long division that there will be a direct feedthrough of the input w(8) into the output. We do not wish to allow this in the Kalman filter ‘Model that follows in Section 5.5 (atleast in its most elementary form), so we will quire the order of the MA. part to be at least one less than the AR part. ‘There is also a problem of causality when the order of the right side is equal or greater than the left side, and this will be elaborated oa further in Section 55. ‘To summarize, we put the restrictions on our ARMA model because we only wish to consider processes that begin at a finite ime, and ones that are ‘causal with no dizect feedthrough ofthe input into the output ‘To demonstrate further that the model specified by Eqs. (5.3.41) and (6.32) is, in fact, causal and has no direct feedthrough, we will go through thre steps ofa simple simulation. Suppose thatthe initial conditions on the two state variables are 3 DISORETE-TIME MODEL 209 ¥4(0) = 2,0) 0 ise cessed iy, ese sy ie edad Now ‘opps tthe ite alts tip seucce ee w= 1) 5, and w(2) = .2 (by chance, of course), We begin at k = 0. : Atk Ee 0) ] = [Betis is sve int conn as si[S]-0 Note that th sate and ouput ak = 0 do nt depend on m0), a) [0 ao 2) ] [5 1] Lo wo= tas si[{]=s (0) Atk = Note thatthe state and output at k = 1 depend on v0) but not on wi). Eel-L IE) les ye tas si[}] = Atk Astor, not tha these and out at = 2 do not depend onthe “euent Value ofthe np (This gly s mater of nota, bts imperet inthe materia that follows in Secon 5. Conceal, we ean nk of he ingt ite sequence anolding wih time oe step tack of the st and outa sequenes. That is, when (1) fll, oly w(0) 1s Known; when 2) ols only) and wl) ae kaon; wen 3) unfls, nly nO), w(t), and C2) tre koe ands fh Thinking interns of vet tint atte ong io Known bout note than the fa hat chs aro mean ind a now sree does nt unflé and vee ise unl spk 1 Before we leave thier itis of nterest occk to se if he original ARMA motel, Eq (5.340) ts ats. We have worked though enogh eps toeck this aA 0. Subst the computed ves fr 00), 0). 
2) end the given values of (0) and nl) yes210 CHAPTERS THE DISCRETE KALMAN FILTER, STATE-SPAGE MODELING, AND SBALLATION S\-5) + (25)(1.0) ‘which checks. It might also be noted thatthe “initial conditions” on the scalar Aifference equation in y, that is, (0) and y(1) are not the same as those of the sate variables x,(0) and x,(0), nor should they be. They are algebraically related, bot there is not a one-to-one correspondence, because the state variables are intermediate phase variables and not just (#) and y(k + 1). a In closing it should be mentioned that the state-space model for a given system is aot unique. This should be obvious from the fact that we cen make any nonsingular transformation on any set of state variables and get another equally legitimate set, The choice of state variables is largely a matter of con- ‘venience and may vary from one application to another. The controllable ca- ronal form that was used here is especially convenient in making the transition fom a differential (or difference) equation model to a vector model, because the coefficients in the scalar equation transfer directly without any algebraic ‘manipulation. It might also be mentioned that one can also go the other direction ‘nmodeling, that is, go from the state model to the scalar mod The conversion is relatively easy and the details are discussed in Problem 5.18 MONTE CARLO SIMULATION OF DISCRETE-TIME PROCESSES “Monte Carlo simulation refers to system simulation using random sequences as inputs. Such methods are often helpful in understanding the behavior of sto- thastic systems that are not amenable to analysis by usual direct mathematical ‘methods, This is especially tue of nonlinear filtering problems (considered later in Chapter 9), but there are also maay other applications where Monte Carlo ‘methods are useful. Briefly, these methods involve seting up a statistical exper- iment that matches the physical problem of interest, then repeating the experi- ‘ment over and over with typical sequences of random numbers, and finally, ‘analyzing the results ofthe experiment statistically. We are concerned here pri- marily with experiments where the random processes are Gaussian and sampled in time. We vall begin with processes that originate conceptually as continuous processes, and then conclude with a brief discussion of simulating ARMA pro- ‘cesses where there need not be any corresponding continuous-time process in the background. Simulation of Sampled Continuous-Time Random Processes ‘The usual description of a stationary continuous-time random process is its power spectral density (PSD) function or the corresponding autocorrelation fune- ‘54 MONTE CARLO SIMULATION OF DISCRETE-TIME PROCESSES 211 tion, It was mentioned in Chapter 2 that the autocorrelation fenction provides a complete statistical description of the process when it is Gaussian Ths is imm- ‘portant in Monte Carlo simulation (even though somewhat restrictive), because Gaussian processes have a firm theoretical foundation and this adds credibility to the resulting analysis. In Section 5.3 a general method was given for obtaining, a discrete state-space model for a random process, given its power spestal den sity. Thus, we will begin with a state model of the form given by Eqs. (5.3.1) and (5.3.2) (repeated here for convenience) Ber = bat 643) yo Bay (542) Presumably, , and B, are known, and w, is a Gaussian white sequence with known covariance Q,. 
The problem is to generate an ensemble of random tials of x, and y, (i.e, sample realizations of the processes) for k= 0, 1,2. m. ‘Equations (544.1) and (5.4.2) are explicit. Thus, oace methods are esleb- lished for generating w, for k= 0, 1, 2,-.. (m~ 1) and setting the initial Condition for x at & = 0, then programming the few lines of code needed to {implement Eqs. (54.1) and (5.4.2) is routine. MATLAB is especially useful here because of its “user friendliness” in performing matrix calculations. If, and B, are constants, they are simply assigned variable names and given numecical values in the MATLAB workspace. If and B, are time-varable, itis relatively ‘easy to reevaluate the parameters with each step as the simulation proceeds in time. Generating the w, sequence is a bit more difficult, hough, because Q, is ‘usually not diagonal. Proceeding on this bass, we begin with a vector u, whose ‘components are independent samples from an N(O, 1) population (whichis read. ily obtained in MATLAB), and then operate on this vector with a linear trans- formation C, that is chosen so as to yield a w, vector withthe desired covatiance structure. The desired C, is not unique, but a simple way of forming a suitable jis to let it be lower triangular and then solve for the unknown elements Stated mathematically, we have (temporarily omitting the k subscripts) ween (643) and we demand that e{(Cuy(Cuy] = Efww"] =O 64.4) Now, £{au’] is the unitary matrix because of the way we obtain the elements ‘of w as independent (0, 1) samples. Therefore, cor=@ G45) We will now proceed to show that the algebra for Solving forthe elements ‘of C is simple, provided that the steps are done in the proper order. This will212 CHMOTERS THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION be demonstrated for a 2 x 2 Q matrix. Recall that Q is symmetric and postive definite, For the 2 % 2 case, we have (with the usual matrix subscript notation) eb al-e [3 étal-[e 2] ea We start first with the L1 term, ens Von oan Next, we solve for the 21 term, 648) Finally, cz is obtained as ee Vine 6.49) ‘The preceding 2 X 2 example is a special case of what is known as Cho- lesky factorization, and itis easily generalized to higher-order cases (12). This procedure factors @ symmetric, positive definite matix into upper- and lower- triangular parts, and MATLAB has a built-in function chol to perform this op- ‘ration. The user defines a matrix variable, say, QUE, with tie numerical values Of Q, and then chol(QUE) retums the transpose of the desired C in the notation used here. This is a very nice feature of MATLAB, and it sa valuable timesaver ‘when dealing with higher-order systems. should also be clear that if the transformation C takes a vector of un correlated, unit-variance random variables into a correspond:ng set of correlated random variables, then C-' will do just the opposite. If we start with a set of random variables with covariance Q, then C™'QC"¥ will te the covariance of the transformed set. This covariance is, of course, just the entity matrix. ‘Specifying an appropriate initial condition on x in the simulation can also bbe troublesome, and each case has to be considered on its own merits. If the process being simulated is nonstationary, there is no “‘typ.cal” starting point. ‘This will depend on the definition ofthe process. For examp.e, a Wiener process js defined to have a zero initial condition. All Sample realizations must be ini- tinlized at zero in this case. 
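Pulling these pieces together, a minimal MATLAB sketch of one sample realization of Eqs. (5.4.1) and (5.4.2) might look as follows. The numerical values of phi, B, and Q are placeholders chosen only for illustration, and a zero initial condition is used, as would be appropriate for a Wiener-type process. Note that chol returns an upper-triangular factor, so its transpose plays the role of C here.

```matlab
% One Monte Carlo realization of Eqs. (5.4.1)-(5.4.2).
% phi, B, and Q below are placeholder values, not taken from the text.
phi = [1 0.1; 0 0.9];           % state transition matrix
B   = [1 0];                    % output connection
Q   = [0.02 0.01; 0.01 0.04];   % covariance of the white sequence w_k
nsteps = 200;

C = chol(Q)';                   % chol returns R with R'*R = Q, so C = R'
n = size(phi, 1);
x = zeros(n, nsteps);           % zero initial condition (Wiener-type start)
y = zeros(1, nsteps);
y(1) = B*x(:,1);

for k = 1:nsteps-1
    u = randn(n, 1);            % independent N(0,1) samples
    w = C*u;                    % white sequence with covariance Q
    x(:,k+1) = phi*x(:,k) + w;
    y(k+1)   = B*x(:,k+1);
end

plot(0:nsteps-1, y), xlabel('k'), ylabel('y_k')
```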
On the other hand, a simple one-state random-walk process ean be defined to start with any specified x,, be it deterministic or random. '54 MONTE CARLO SIMULATION OF OISCRETE-TIME PROCESSES 213 If the process being considered is stationary, one usually wants to generate aan ensemble of realizations that are stationary throughout the time span of the runs, The initial condition on x must be chosen carefully to assure this. There is one special case where specification ofthe initial components of x is relatively easy, If the process is stationary and the state variables are chosen to be phase variables, (as isthe case inthe model development shown in Fig. 5.2), it works ‘out that the covariance matrix of the state variables is diagonal in the steady- state condition (See Problem 5.9). Thus, one simply chooses the components of X a independent samples from an N(O, 1) population appropriately scaled in accordance with the rms values of the process “position,” “velocity,” “accel- eration,” and so forth. Ifthe state variables are not phase variables, however, then they will be correlated (in general), and this complicates matters consi erably. Sometimes, the most expeditious way of circumventing the problem is {o start the simulation with zero initial conditions, then let the process run unt the steady-state condition is reached (or nearly so}, and finally use just the later portion of the realization for “serious” analysis. This may not be an elegant solution to the inital-condition problem, but itis effective. < Simulating 2 Gaussian random process that is described by an ARMA model is simpler than the corresponding diseretized continuous-time problem, This is be- ‘cause we have the luxury of starting with a model that is already in discrete form. When doing the simolation with MATLAB, its convenient to fist convert the scalar ARMA equation into vector form, This ean be done by inspection without any laborious calculations and was discussed in detail in Section 5.3 ‘The steps involved in simulating a process defined by a scalar ARMA equation will now be illustrated using the second-order example given in Section 5.3 (Example 5.6). Both the scalar and vector models are repeated here for n of ARMA Models Scalar ARMA Equation: the + 2) — ye + 1) + Sy(h) = Sw(k + 1) + 250K), 01,2, 4.10) Corresponding Vecior ARMA Model: su+o] fo 1] fe] , fo [823}-[ JE) E]-o oan tas_s1[2] (64.12) aa 0) ‘The steps in a typical simulation can be summarized as follows?214 GWETERS THE DICRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION Convert the scalar equation, Eq. (4.10), t0 vector form using the “hase variable” method piven in Section 5.3, The result is given by Es G41) and 6.4.12) 2. A white Gaussian sequence with unity variance is needed for the sim ‘aon. Tis is easily obtained using the MATLAB random number ecto. ‘An intial condition on x(2) must be specified. In this example its 2 tuple. The choice of initial condition will gover the init transient in smuch te same way as inthe continuoussime problem. 4. Prgramming the recursive equation for x2) i routine, once the initial vector x() is specified. Program Fa. 6.4.11), Finally the original y() process i reconstructed from x(t) a each step wing the ouput equation, Eq (54.12). 
Jn sume, simulation isan impeitan tol in analyzing systems with read inputs With modem compute echncogy, simulation is relatively easy 10 Go, and tis ofen the best Way to get answers to otherwise intractable pedis ‘THE DISCRETE KALMAN FILTER RE, Kalan’s paper describing a recursive solution of the discrete data linear fering problem was published in 1960 (1). About this same time, advances in gil computer technology made it possible to consider implementing his re- cue solution in a number of real-time applications. This was a fortitous circumstance, and Kalman fiktering caught hold almost immediately. We will consider some examples shortly, but we must first develop the Kalman fer ‘cursive equations, which are, in effect, the filter.” ‘We begin by assuming the random process to be estimated can be modeled inthe form Xo = Oa Hy 65.) ‘The observation (measurement) of the process is assumed to occur at discrete Doits in ine in accordance with the linear eelationship Hx, + 552) Sone elaboration on notation and the various terms of Eqs. (5.5.1) and (5.5.2) isin order: 55 THEDSCRETEKAAN FLIER 215 x, = (7X 1) process state vector at time f, c= (Xm) matrix relating x, t0 x,,, in the absence of a forcing function (it 4,15 a sample of continuous process, is the usual state transition matrix) = ( X 1) vector—assumed to be a white sequence with known covariance = (m 1) vector measurement at time f Hi, = (rm Xn) matrix giving the ideal (noiseless) connection between the mea- surement and the state vector at time 1, v= (m X 1) measurement exror—assumed to be a white sequence with known Covariance stricture and having zero exosscorrelation with the w, sequence ‘The covariance matices forthe w, and ¥, vectors are given by aiwwn= {S15 653) sowa= {8 15h 654 Elw,v1} = 0, for all k and é 655) We assume at this point that we have an initial estimate of the process at ‘some point in time sand that this estimate is based on all our knowledge about the process prior to f. This prior (or a priori) estimate will be denoted as X ‘where the “hat” denotes estimate, and the “super minus” isa reminder that this is our best estimate prior to assimilating the measurement at (Note that super ‘minus as used here is not related in any way to the super minus notation used in spectral factorization.) We also assume that we know the error covariance ‘matrix associated with ¥;. That is, we define the estimation error to be ey (5.5.6) land the associated error covariance matrix is* Be = lezen”) = Elia — 8% — HY] 65.2 ‘We tay assume Nee Ha the estimation enor has ro ean, abd hu ts prope 0 rer to ‘laces | st acovuance mate is alo, of oar, 8 moment mts tat ts omay no etre torch216 CHAPTERS. THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION In many cases, we begin the estimation problem with no prior measurements. ‘Ths, in this case, if the process mean is zero, te initial esimate is zer0, and the associated error covariance matrix is just the covariance matrix of x itself. ‘With the assumption of a prior estimate i, we now seek to use the mea- surement 2, to improve the prior estimate, We choose a linear blending of the noisy measurement and the prior estimate in accordance with the equation a, = + Ka, - Ha) (5.5.8) where ‘% = updated estimate lending factor (yet to be determined) ‘The justification of the special form of Eq, (5.5.8) will be deferred until Section 5.8. The problem now is to find the particular blending facter K, that yields an ‘updated estimate that is optimal in some sense. 
Just as in the Wiener solution, ‘We use minimum mean-square error as the performance criterion, Toward this fend, we first form the expression for the error covariance mattix associated with the updated (a posterior) estimate Py = Fleer] = Flex, ~ %)0% ~ %)'1 (55.9) [Next, we substitute Eq. (5.52) into Ea. (55.8) and then subattute the resulting ‘expression for & into Bq, (5:59). The result is P, = Bll — 2) — KL + ¥ — HRD)] [ox - 8) - Kt, + ¥. - BDI") 6.5.10) Now, performing the indicated expectation and noting the ‘x, ~ Sz) is the a prior estimation error that i uncortelated with che measurement eFTOr ¥, We have rR ~ KBP; ~ KH + KR os. Notice here that Eq, (5.5.11) is @ perfectly general expression for the updated error covariance matrix, and it applies for any gain K,, subortimal or otherwise. Returning to the optimization problem, we wish to find the particuler Ky that minimizes the individual terms along the major diagonal of P,, because these terms represent the estimation error variances for the elements of the state vector being estimated. The optimization can be done in a number of ways. We will do this using a straightforward differential calculus approach, and to do so ‘we need two matrix differentiation formulas. They are 56 THEDGCRETE KALMAN FILTER 217 d{ace(AB)] _ py omceAD] gr (Am manibesmia) 851) Aee(ACAT) o4¢ (C must be symmetie) 65.13) where the derivative of a scalar with respect to a matrix is defined as ae tay, day, = a : 55.14), Proof of these two differentiation formulas will be left as an exercise (see Prob- lem 5.16). We will now expand the general form for P, Eg, (5.5.11), and rewrite it in the form: ~ KNP; — PPHIKE + KCH.PSHT + RYKP 65.15) [Notice thatthe second and third terms are linear in K, and that the fourth terra is quadratic in K. The two matrix differentiation formulas may now be applied t0 Eq, (5.5.15). We wish to minimize the trace of P because itis the sum of the mean-square errors in the estimates of all the clements of the state vector, We ‘can use the argument here that the individual mean-square errors are also min- imized when the total is minimized, provided that we have enough degrees of freedom in the variation of K,, which we do in this ease. We proceed now to differentiate the trace of Py with respoct to Ky, and we note that the trace of | PCBIKE is equal to the trace of its wanspose K,H,P;. The result is eee) omer aK +R) 6510) We no ste derive elt 0 and sole fr he inal un The wanes PrHTOP; HY + Ry (65.17) ‘This particular K,, namely, the one that minimizes the mean-square estimation cexzr, is called the Kalman gain. "The covariance matrix associated with the optimal estimate may now be computed, Referring to Eg. (5.5.11), we have218 QUoTERS THE OSCRETE KALMAN FLTER, STATE-SPACE MODEUNG, ANO SMULATION P, = — KH)Pr( ~ KH)! + K.RKT (65.18) Py KALB) ~ PUKE + KGLPPHZ + ROKE 65.19) Routine substitution of the optimal gain expression, Bq, (5.5.17), into Ea, (55.19) leads to P= Pp — PrRAPBY + RYH (5.5.20) Py = KGLPCHY + RDKE 6521) P= ~ KEP; (55.22) Note that we have four expressions for computing the updated P, from the prioe Pz. Three of these, Eqs. (5.5.20), (5.5.21), and (55.22), are only valid for the optimal gti condition. However Ea, (5.5.18) is valid for any gain, optimal o suboptimal, All four equations yield identical results for optimal gain with perfect arithmetic. 
We note though, that in the eal engineering world Kalman fitering is a numerical procedure, and some of the P-update equations may perform better numerically then others under unusual conditions. More will be said of this later in Chapter 6. For now, we will ist the simplest update equation, that is, Ea. (5.5.22), as the usual way to update the error covariance. One should remember, though, that there are altemative equations for implementing the error covariance update We now have a means of assimilating the measurement at, by the use of Eq (5.5.8) with K, set equal tothe Kalman gain as given by Eq. (5.5.17). Note that we need X; and Py to accomplish this, and we can anticipate a similar need atthe next step in order to make optimal use of the measurement 2y1.. The updated estimated &, is easily projected ahead via the transition matrix. We are jstiied in ignoring the contribution of w, in Eq, (5.5.1) because it has 2210 ‘mean and isnot correlated with any ofthe previous w's.* Thus, we have (5523) + Rec at a ur aoaton mi the process nie tht acomults ding he step ahead from {or Tiss prety a mate fsocation (bt an gett oe), and some books is desta {5h te ha, (6,9). Conseny a notaon te iparaat hing he, Conceptual, we fre lang of ig'sitine feng contr fo smog. wich we wail thine fing ‘see (ee Chops). Terfo. ive begin with 2 dasee ARMA ‘model he whe eng Segensew) mst conform tots sae sepahead stata, aid Wel nao vss he ode of {hea prt fhe melo he estore le than he AR prc Seton 8.3) Ts hen assures ‘Fa nt coat ely te sever aor ine nd tS y abe ype Ho age 55 THEDSCRETE KAMAN FILTER 219 ‘The ercor covariance matrix associated with Sj, is obtained by frst forming the expression for the a priori error = Xr — Kine = a, + WO) - O84 Soe te 55.24) ‘We now note that w, and e, have zero crosscorrelation, because wis the process ‘noise for the step ahead of f. Thus, we can write the expression for Py, as Pins = Elenite = Elle, + we + 907 = PAT + & 6525) ‘We now have the needed quantities at time f and the measurement 2, can be assimilated just as in the previous step. Equations (5.5.8), (5.5.17), (55.22), (55:23), and (5.5.25) comprise the ‘Kalman filter recursive equations. It should be clear that once the loop is entered, itcan be continued ad infzitum. The pertinent equations and the sequence of ‘competational steps are shown pictrially in Fig. 5.8. This summarizes what is now known as the Kalman filter. Before we proceed to some examples, itis interesting to reflect on the Kalman filter in perspective. If you were to stumble onto the recursive process of Fig. 5.8 without benefit of previous history, you might logically ask, “Why in the world did somebody call that a fiter? It looks more like a computer algorithm.” You would, of course, be quite right in your observation. The Kal- man filter is just a computer algorithm for processing discrete measurements (the input) into optimal estimates (the output). Its roots, though, go back to the p= ra Orns i a a. a, OaraaE Ration emo Figure 88. Kanan her loop220 CHAPTER S ‘THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION days when filters were made of electrical elements wired together in such a way 135 to yield the desired frequency response. The design was cften heuristic. Wie- ner then came on the scene in the 1940s and added a more sophisticated type of filter problem. The end result of his solution was a filter weighting function or a corresponding transfer function in the complex domain. 
Implementation in terms of electrical elements was left as a further exercise for the designer. The discrete-time version of the Wiener problem remained unsclved (in a practical sense, at least) until Kalman’s paper of 1960. Even though his presentation appeared to be quite abstract at first glance, engineers socn realized that this ‘work provided a practical solution to a number of unsolveé filtering problems, especially in the field of navigation. More than 35 years have elapsed since ‘Kalman’s original paper, and there are still numerous current papers dealing with ‘new applications and variations on the basic Kalman fits. x has withstood the test of time! SCALAR KALMAN FILTER EXAMPLES ‘The basic recursive equations for the Kelman filter were gresented in Section 5.5. Two simple scalar examples will illustrate the use of these equations. EXAMPLE 5.7 — Wiener (Brownian Motion) Process Consider the Wiener process described in Fig, 5.9, Recall that this process has Gaussian statistics and is assumed to be zero at f = 0, Assume that we have a sequence of independent noisy measure- ‘meats taken at unit intervals as shown in Fig. 5.9 and let the standard deviation ‘of the measurement error be one-half. We wish to determine te optimal estimate fof the process at the sample times 0, 1, 2,3, -., et "The model parameters for the Kalman filter may be computed as follows: “The process differential equation for this ease is feu 66.) ‘Therefore, the transition matrix is Figure 89 Blck dasa of Wener process anya! saree nc. {an oh proes. 56 SCALARKALMAN ALTER 221 wal 662) ‘The Q, matrix is computed as =e(f! = ff ff ctweuen at dn = Hl [[flxe-meran=1 053 Since the measurement has a direct one-to-one correspondence to the process 2x(0), the H, matrix is Hat 664) ‘The variance of the measurement error is R, = (Standard deviation)? = (4)? = 665) ‘Our inital estimate at r = 0 is and is zero because the Wiener process is defined to begin at zero, Furthermore, because of tis prior knowledge about the ‘process, the error associated with our initial estimate is zero that is, the a priori estimate, i5 = 0, is perfect by definition ofthe process. Thos, o 666) ‘We now have all the parameters needed forthe Kalman filter and are ready to begin the recursive process. Referring to Fig, 5.8, we enter the loop at ¢ = 0 ‘and process the first measurement. Step 1: t= 0 (Subscripts will be dropped for constant parameters.) Compute gain: Ky = PyHP GH + RY 0 -$-0 667) Update estimate + Koley ~ His) 668) Update P:222 CETERS THE OSCRETE KALMAN FLTER, STATE SPACE MODELING, ANO SIMULATION Po=U~ KelDP5 =1-R5=0 669) Project ahead: A, big 1-00 P= bP +O 66.10) S1-O-141=1 G61 [Note thatthe measurement at ¢ = 0 is given zero weight relative to the prior estimate. 
This makes good sense because the initial esti- ‘mate is known to be perfect, whereas the measurement is known t0 be noisy, Step 2: t= 1 ‘Compute gain: K, = rH HT + RY deletes 66.12) Update estimate ~ Hi) Update P: KP; nist 66.4) Project ahead: A= bi = 1% Py = Pit + 0 66.15) elepreie$ 6616) ‘This could now be carried on ad infinitum, Note that with step 2 the filter begins to give the measurement some weight in determin- ing the optimal estimate, As times goes on, the filter depends more and more on the measurements and less on the init ‘assumptions (5.6.13) 86 SCALARKALMAN FLIER 229 EXAMPLE 5.8 ‘Scalar Gauss-Markov Process Consider a stationary Gauss~Markov process ‘whose autocorrelation Function is Raa = leet 6.6.47) (Clearly, the correlation ime and variance for this process are both unity. ‘Therefore, the spectral function for the process is va v3 6.18) and the shaping filter that shapes white noise into the process is shown in Fig, 5.10. Thus, the state equation for this process is ba oxt VOu(t) (56.19) Suppose we have a sequence of noisy measurements of this process taken .02 sec apart beginning at = 0, The measurement error will be assumed to have a variance of unity. We wish to process these via a Kalman filter and obtain an ‘optimal estimate of x(). Frst, we need to determine the filter parameters dy, He ,, and ‘The state transition matrix (seaar in this case) is . 9802 (5.6.20) ‘Te metsuremeat relationship ox s mat 6a ‘The inpot noise sequence is = Hui) = E {f Vie- tus de Vien) és = [Peter dy 1 om = 901 6620) ‘The measurement error variance is Figure 510. Stupng Se for Gos ro peo224, CHAPTERS: THE OGCRETE KALMAN FILER, STATE-SPACE MODELING, ANE SIMULATION Ret (5623) We also need the initial conditions and P; to enter tke recursive loop. In this case, we have assumed that the process is Gauss-Markov and stationary with a known autocorrelation function. We have no measurements prior to ¢ = 0, but the assumed knowledge of the process autocorrelation function tells us the process has zero mean and a variance of unity. This is important information and enables us to stat the recursive process at ¢ = 0 with initial conditions %=0 (5.624) 7 1 (6625) [Now thatthe four filler parameters and initial conditions are determined, itis @ routine matter to cycle through the Kalman filter loop show ir Fig. 5.8 as many: times as desired i order to add a note of realism to his example, the Markov process and noisy measurement situation just described were simulated using the methods described in Section 5.4, The results for the figs 51 steps of te simulation are shown in Fig, 5.11, The discrete measurements z, are shown és triangles, and it should be obvious that we have postulated a very noisy measurement situation for this example. In spite of this, hough, the filter does a reasonably good job of tacking x0. After about the first 20 steps, the filter settles down to a steady. state condition where the Kalman gain is sbout .165 and the standard deviation of the filter error is about 4, That is, VP, in the steady state is about 4 units. Tie) Figure S11 Smuston for Gauge Maver eae {87 AVQMENTING THE STATE VEGTOR AND MUCTPLE-NPUT/MULTPLE-OUTPUT EXAMPLE 228 ‘This is consistent, qualitatively atleast, with what we see in the results shown in Fig. 5.11. It is comforting to know that our single sample realization of the ‘process is typical rather than atypical. (See Problem 5.19 for a continuation of this example.) 5 5.7 AUGMENTING THE STATE VECTOR AND MULTIPLE-INPUT/MULTIPLE- OUTPUT EXAMPLE. 
It was mentioned earlier that one of the advantages of the Kalman filter over Wiener methods lies in the convenience of handling, multiple-input/multiple- ‘output applications. We will now look at an example of this, inthe proces, it vill be seen that itis sometimes necessary to expand the original state model in order to achieve a credible model that will At the formar required by Eqs. (65.1) through (5.5.5). The example that we will consider is based on a paper by B. E, Bona and R. I. Smay that was published in 1966 (14). This paper is ‘of some historical importance, because it was one of the very early applications of real-time Kalman filtering in a terestial navigation setting. For tutorial pur- poses, we will consider a simplified version of the state model used in the Bona-Smay paper, and we will then continue the discussion with Problem 5.20. ‘Tn marine epplications, and especially inthe case of submarines, the mission time is usually long, and the ship's inertial navigation system (INS) must operate for long periods without the benefit of positon fixes. The major source of po- sition etvor during such periods is gyro drift, Ths, in tum, is due to unwanted biases on the axes that control the orientation of the platform (Le., the inertial instrument cluster), These “biases” may change slowly over long time periods, 0 they need to be recalibrated occasionally. This is dificult to do at sea, because the biases are hidden from direct one-to-one measurement. One must be content to observe them indirectly through their effect on the inertial system’s outputs. ‘Thus, the main function of the Kalman filter in this application is to estimate the three gyro biases and platform azimuth error, so they can be reset to zero. (dn this application, the platform tilts are kept close to zero by the gravity vector and by damping the Schuler oscillation with extemal velocity information from the ships log) In out simplified model, the measurements will be the inertial system’s two horizontal postion errors, that is, laiude error (N-S direction) and longitude cezor (E-W direction). These are to be obtained by comparing the INS output ‘with position as determined independently from other sources such as a satellite navigation system, Loran, or perhaps from a known (approximately) positon at dockside. The mean-square errors associated with extemal reference are assumed to be known, and they determine the numerical values assigned to the R, matrix ‘of the Kelman filter.228 CHAPTER'S. THE DISCRETE KALMAN FLTER, STATE-SPACE MODEUNG, AND SIMULATION ‘The applicable error propagation equations for a damped inertial navigation system in a slow-moving vehicle are* 4% 24, 7) +99, — Oe 672) H+ OM, 673) where x9, and z denote the platform coordinate axes in the north, west, and up directions, and % inertial system's west positon error (in terms of great circle arc distance in radians) inetal system's south position error (in terms of great circle are distance in radians) ‘= [platform azimuth error] ~ {west postion eror] «(tan (Iatitude)] % Also, 2, = x component of earth rate fie, 0, = 0 cos(lat)] 0, = z component of earth ate M fice, , = O sin(at)] and ‘5yr0 drift rates forthe x, y, and z axis gyros ‘We assume that the ship’s latitude is known approximately; therefore, 0, and 0, are known and may be assumed to be constant over the observation interval Nonwhite Forcing Functions ‘The three differential equations, Eqs. 
(5.7.1) through (5.7.3), representa third order linear system withthe gyto drift rates, e,, ¢),€ a8 the forcing functions. ‘These will be assumed to be random processes. However, they certainly are not White noises in this application. Quite to the contrary, they are processes that vary very slowly with ime—just the opposite of white noise. Thus, if we were + Egutons (3.7.1 1 (73) ae ceainly not obvious, an a considerable amount of background in inertial navigation teary send 0 understand te ssampton nd pronation ding is Spl set of eqatons (15,15) We do aot ate to dee te euaons hee. For purposes of Underage Kaman fen smply aeume tht these equations do, fc ccorly dscibe the err propapton in tis plication and proceed on othe deta he Kalman Ate S57 AUGNENTING THE STATE VECTOR ANO MULTIPLE:NPUT/MULTIPLE-OUTPUT EXAMPLE 227 to discretize this third-order system of equations using relatively small sampling intervals, we would find that the resuling sequence of w,’s would be highly correlated. This would violate the white sequence assurnption that was used in ‘deriving the filter recursive equations (see Eq. 5.5.3). The solution for this is to ‘expand the size of the model and include the forcing functions as part of the state vector. They can be thought of as the result of passing fictitious white noises through a system with linear dynamics. This yields an additional system ‘of equations that can be appended to the original set inthe expanded or aug- ‘mented set of equations, the new forcing functions will be white noises. We are assured then that when we discretize the augmented system of equations, the resulting w, sequence will be white. In the interest of simplicity, we will model e,, e, and s, as Gaussian ran- dom-walk processes. This allows the “biases” to change slowiy with time. Each ofthe gyro biases can then be thought of as the output of an integrator as shown in Fig. 9.12, The thre differential equations to be added to the original set are then a 674) f, 675) enh 616) where ff and fae independent white-noise processes with power spectral densities equal to W. ‘We now have a six-state system of linear equations that can be put into the usual state-space form. a] fo a 0 | a oe a} fo - | 5 al |e S| a | % as 0 io = ts | 6, ‘The process dynamics model is now in the proper form for a Kalman filter. It is routine to convert the continuous model to discrete form for a given Av step size, The key parameters in the discrete model are ¢, and Q, and methods for calculating these are given in Section 5.3 ‘we? —[E} + aon Figure 8.12. Random wak modelo go228 CHVOTENS THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION [As mentioned previously, we will assume that there are only two measure- iments available to the Kelman filter at time ¢. They are west position error v, and south position error ‘The matrix measurement equation is then nu] oft 0000 ol}s] , fu feh-[0288 8 s)]2] Lk) ze a, an “The measurement model is now complete except for specifying R, that describes the mean-square erors associated with the external position fixes. The numerical ‘ues will, of course, depend on the particular reference system being used for the ies. ‘Measurement Noise ‘We have just seen an example where it was necessary fo expand the state model because the random foreing functions were not white. A similar situation can also cccur when the measurement noise is not white. This would also violate ‘ne of the assumptions used in the derivation of the Kalman filter equations (see Eq, 5.54). 
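Returning for a moment to the gyro-calibration model, the discretization described above is easily carried out numerically with the van Loan method of Section 5.3. The sketch below is schematic only: the 3 x 3 dynamics block and the coupling of the bias states must be filled in from Eqs. (5.7.1) through (5.7.3) and (5.7.7), and the earth-rate, W, and dt values shown are arbitrary illustrative choices, not taken from the text.

```matlab
% Discretizing the augmented six-state INS error model (van Loan method).
% Fdyn and the bias coupling are SCHEMATIC PLACEHOLDERS; substitute the
% actual coefficients of Eqs. (5.7.1)-(5.7.3) and (5.7.7).
lat     = 45*pi/180;               % illustrative latitude
Omega   = 7.292e-5;                % earth rate, rad/sec
Omega_y = Omega*cos(lat);
Omega_z = Omega*sin(lat);
Wpsd    = 1e-16;                   % PSD of each gyro-bias random walk (illustrative)
dt      = 600;                     % time between position fixes, sec

Fdyn = [  0       -Omega_z   Omega_y;   % <-- placeholder wiring only
          Omega_z  0         0      ;
         -Omega_y  0         0      ];

F = [Fdyn       eye(3);            % biases assumed to drive the error
     zeros(3,6)       ];           % equations directly; bias states are
                                   % random walks (Eqs. 5.7.4-5.7.6)
G = [zeros(3,3); eye(3)];
W = Wpsd*eye(3);

% van Loan method (Eqs. 5.3.23-5.3.26)
A    = [-F  G*W*G'; zeros(6,6)  F']*dt;
B    = expm(A);
phik = B(7:12, 7:12)';             % transpose of lower-right partition
Qk   = phik*B(1:6, 7:12);          % phi_k times upper-right partition

H = [1 0 0 0 0 0;                  % the two measurements are the west and
     0 1 0 0 0 0];                 % south position errors, as in Eq. (5.7.8)
```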
The correlated measurement-error problem can also be remedied by sugmenting the state vector, ust as was done in the preceding gyro-calibration txample. The correlated part of the measurement noise is simply moved from into the state vector, and His changed accordingly. It should be noted, ‘hough, that if the white noise part of v, was zero in the original model, then in the new model, after augmentation, the measurement noise willbe zero. In effect, the model is saying that there exists a perfect measurement of certain lin combinations of state variables. The R, matrix wil then be singular. Technically, this is permissible in the discrete Kalman filter, provided tha the Py matrix that has been projected ahead from the previous step is positive definite, and the ‘measurement situation isnot trivial. (See Problem 5.8 for ar example where Ry (0) The key requirement for permiting a singular Ry is tat (H,PSHE + R) be invertible in the gain-computation step. Even so, there 's some risk of mi- merical problems when working with “‘perfect” measurements. More will be Said of this in the section on divergence in Chapter 6, 58 ‘THE CONDITIONAL DENSITY VIEWPOINT In our discussion thus far, we have used minimum mean-square error as the performance criterion, and we have assumed a linear form for the filter. This ‘was parly a matter of convenience, but not entirely so, because we will see 58 THE CONDMONAL DENSITY VEWPOWT. 228 presently thatthe resulting linear filter has more far-reaching consequences than fare apparent at fist glance. This is especially so in the Gaussian case. We now elaborate on this, ‘We fist show that if we choose as our estimate the mean of x, conditioned fn the available measurement stream, then this estimate will minimize the mean- square error. Ths is a somewhat restrictive form of what is sometimes called the fundamental theorem of estimation theory (10, 17). The same notation and ‘model assumptions that were used in Section 5.5 will be used here, and our derivation follows closely that given by Mendel (10). Also, to save writing, we will temporarily drop the k subscripts, and we will denote the complete mea- surement stream 2, 2...» % simply as 2*, We first write the mean-square estimation errr of x, conditioned on 7*, as Blix ~ Stu — Sa") = Elie ~ XR ~ RK + MH] 6a) = Eixtxle*) ~ Ela" ~ REalaY) +R Factoring & away from the expectation operator in Eq. (58.1) is justified, be- ‘cause is a function of 2*, which isthe conditioning on the random variable x. ‘We now compete the square of te last tree tes in Eq (3.8.1) and obtain Bltx — He - He") Boca) + [& — Boxe VR ~ Beale*)] - Bee (682) Clearly, the first and last terms on the sight sie of Eq (5.82) do not depend ‘on our choice ofthe estimate &. Therefore, in our search among the admissible estimators (both linear and nonlinear, it should be clear that the best we can {dois to force the middle term to be zero. We do this by leting Bex let) 683) ‘where we have now reinserted the & subscripts. We have tacitly assumed here that we are dealing with the filter problem, but a similar line of reasoning would also apply to the predictive and smoothed estimates of the x process. (See Sec- tion 4:4 for the definition and a brief discussion of prediction and smoothing.) Equation (5.8.3) now provides us with a general formula for finding the estimator that minimizes the mean-square error, and itis especially useful in the Gaussian case because it enables us to write out an explicit expression for the ‘optimal estimate in recursive form. 
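As a concrete point of reference (a standard result, not taken from the text), the scalar jointly Gaussian case already exhibits the structure that will emerge. If x is N(mu, sigma_x^2) and the measurement is z = x + v with v distributed N(0, sigma_v^2) and independent of x, then

$$
E[x \mid z] \;=\; \mu + \frac{\sigma_x^2}{\sigma_x^2+\sigma_v^2}\,(z-\mu),
\qquad
\mathrm{var}(x \mid z) \;=\; \left(\frac{1}{\sigma_x^2}+\frac{1}{\sigma_v^2}\right)^{-1}
$$

which is precisely the scalar form of the update and covariance expressions, Eqs. (5.8.10) and (5.8.11), obtained below.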
Toward this end, we will now assume sian statistics throughout, We will further assume that we have, by some means, an optimal prior estimate X; and its associated error covariance P;. Now, at his point we will stretch our notation somewhat and let x, denote the x random variable at % conditioned on the measurement stream zf.,. We know that the form of the probability density of x, is then290 CHAPTER §. THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION fa ~ NPE 684) Now, from our measurement model we know that x, is related to 2, by y= Hat 685) Therefore, we can immesitely write the density function for % as fu ~ MOR, PSH +) 686) ‘Agein, remember that conditioning on 2f., is implied.) Also, from Eq. (5.8.5) ‘we ean vite out the form for the conditional density of x, given x). Its Sad ~ NE RD 687) Finally, we can now use Bayes formula and write ate fart 688) where the terms on the right side of the equation are given by Eqs. (5.8.4), G86), and (5.8.7), But recall that x, itself was conditioned on 2%, 2, 2. Thus, the density function on the left side of Bq. (5.8.8) is actually the density ofthe usual random variable x, conditioned on the whole measurement stream up through x, So, we will change the notation slightly and rewrite Eq. (588) as INGx. RINGS, PEL foe (NCES, HPCE + RO] eee! were itis implied tat we substitute the indicated normal functional expressions into the ight side ofthe equation (sc Section 115 forthe vector normal fom). Tis a routine mater now to male te appropriate subsituions fn Ea. (589) td detrmine the mean and covariance By inspection ofthe exponential tem. ‘The algebra Is route, but a bit laborious, so we wil not pursue i Tre hee (Gee Problem 5.17). The resulng mean and covariance for nt ae Mean = + PPH{GEP HE + RJM, - HS) 68.10) ' 8.11) Covariance = [(Pp)t + HER; Hi) NNote that the expression for the mean is identical to the optimal estimate pre- viously derived by other methods and given by Eqs. (55.8) and (5.5.17). The expression for the covariance given by Eq, (5.8.11) may not look familiar, bat in Chapter 6 it will be shown to be identically equal to the usual P, = (1 — K,H)P; expression, provided that Ky is the Kalman gain, 58 THE CONDMONAL DENSITY VIEWPONT 234 We also note by comparing Eq, (5.8.10) with Eq. (5.5.8), which was used inthe minimum-mean-square-etror approach, thatthe form chosen forthe update in Bq. (55.8) was correct (For the Gaussian case, atleast). Note also that Eq, (5.58) ean be waitten in the form => KAD + Kay (58.12) ‘When the equation is written this way, we see thatthe updated estimate is formed sa weighted linear combination of two independent measures of xj the first is the prior estimate tha i the cumulative result of all the past measurements and the prior knowledge of the process statistics, and the second is the new infor- mation about x, as viewed in the measurement space, Thus, the effective weight factor placed on the new information is KH, From Eq, (5.8.12) we see that the weight factor placed on the old information about x; is (I — KH), and thus the sum of the weight factors is T (or just unity inthe scalar case). This implies that &, will be an unbiased estimator, provided, of course, thatthe two estimates being combined are themselves unbiased estimates. [An estimate is said to be unbiased if £(8) = x] Note, though, in the Gaussian case we did not start out demanding that the estimator be unbiased. 
This fell out naturally by simply requiting the estimate to be the mean of the probability density function of %, fen the measurement stream and the statistics of the process. (This idea of blending together two unbiased estimates to obtain the optimal estimate is ex: ploited further in the chapter on smoothing. See Section 8.5.) In summary, we see that in the Gaussian case the conditional density view- point leads to the same identical result that was obtained in Section 5.5, where we assumed a special linear form for our estimator. There are some far-reaching conclusions that can be drawn from the conditional density viewpoint: 1. Note tha nthe conditional density function apposch, we dd not need to astm « linear relationship between the etme and the measre tents Insea, tis ame out naturally a 4 consequence ofthe Caussan sssumption and our choice of the conditional mean a8 our esate Thus, in he Gaussian case, we know tht we need not search among olivate fora beter one; cannot exist. Ths, our eae neat assumption in the derivation of both the Wiener an Kalman ers turns Gut tbe fortuitous one. That is, in the Gaussian case, the Wiener-Kalman filter isnot just best within a clas of nea test is best within a lass ofall ters, iar or nonlinear 2. For the Gausian case, the conditional mean fs also the “most ke Nalue in that the maximum of the density funtion ocr athe mes. iso, itcan be shown that the conditional mean minimizes the expe: tation of elmost any reasonable nondecreasing fenction of the magnitude OF the cror (as wel as the squared eror. [See Mediteh (17) fo a more comps dscsson of this] Thus, in the Gouslan case, th Kabnan fers best by almost any reasonable orion 3, In physical problems, we often bepin with incomplete knowledge ofthe procets under cousideraon. Perhaps only the covariance suture of232 CHAPTER § THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATON the process is known. In this case, we can always imazine @ correspond ing Gaussian process with the same covariance structure. This process is then completely defined and conclusions forthe equivalent Gaussian process can be drawn. It is, of course, a bt risky to extend these con- clusions tothe original process, not knowing it tobe Gaussian. However, ceven risky conclusions are better than none if viewed with proper PROBLEMS 54. The syitem shown is driven by two independent Gaussian white sources (0 and u,(9. Theie special functions are given by Suu = 4 (sect? rad (see) Sia = 16 (sendy (see) Lat state variables be chosen as shown on the diagram, and assume that noisy measurements of x ate obtained at unit intervals of time. A discrete Kalman filter mode is desired. Find Q, for this model. 5.2 In modern navigation equipment, the Kalman filter is often configured to ‘timate vehicle acceleration as well ab position and velocity. A generic process ‘model forth situation is simply three integrators in cascade as shown in the sccompanying figure (for one dimension only). I is usually quite laborious to ‘work out the exact expressions for dy and Qa, but inthis cae all the terms in the respective matrices can be found in general form with a modest amount of | effort. Fora step size of Ar, show that & and Q, are given by Waa Wan ae Bae Bae w x 2 Hor Fae Fae Wap War war ale le (Hint: Use the same method as in Example 5:3. Also see Problem 5.3 for 4 variation on this model.) $1 Problem 5:2 ee eee PROBLEMS 233 53. A variation on the dynamic position-velocity-acceleration (PVA) mode! 
given in Problem 5.2 is obtained by modeling acceleration as a Markov process rather then random walk. The model is shown in block-diagram form in the accompanying figure. The linear differential equation for this model is of the form ka Fee or (@) Write ovt the F, G, and GWG matrices for this model, showing each term in the respective matrices explicitly. () Show that the first-order mean-square response (Le, in Af) for accel- tration is the same here as in the model given in Problem 5.2. (©) The exact expressions for the terms of d, and Q, are considerably more dificult to work out in this model than they were in Problem 5:2, However, their numerical values for any reasonable values of W and can be found readily using the method referred to as the van Loan ‘method discussed in Section 5.3. Using MATLAB (or other suitable software), find a, and Q, for A= 0.2 see? W = 10 (an/sec*?/(rad/ sec), and Ar = I se. (Note: These numerical values yield a relatively high-dynamics model Where the sigma of the Markov process is about 5g with atime con- stant of 5 sec, Also note that the numerical values obtained in this ‘model correspond to those obiained in Problem 5.2 within a first-order 5 ev — bh ‘SA _A small intentional random range error is introduced into each of the GPS navigation satellite signals. This is done to degrade the solution accuracy for civil users to 100 m 2 drs (see Chapter 11 fora brief discussion of GPS). One ‘model that has been suggested for this random dithering of the range signal is to model it as a stationary second-order Gauss-Markoy process with a power spectral density (PSD) given by (18) 002585 ae Sa) = sm /(rad/sec) toy = 012 rad/see ‘This PSD corresponds to the autocorrelation function 237" Ai(eos far + sin fe) m? vere B= axlV2 radsee24 CMOTERS THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION (4) Fis, develop a state model for this process. (©) Using MATLAB or other suitable software, generate four sample re- alizations of this process. Let the Ar interval be 10 see and the total simulated time span be 4000 sec (401 samples for each realization, including the start and end samples.) Give each realization a separate variable name and arrange the sample values as a 1 x 401 row vector ‘A rough check on the reasonableness of your simulation results can be ‘obtained by plotting all four sample realizations superimposed on one plot. The results should appear to be stationary with an rms value of about 23 m. (© Find the time autocorrelation functions associated with each sample realization of part (b); 40 “lags” willbe sufficient. Plot all four sample tutocorrelstion functions on one plot. Are your results consistent with the theory given in Section 2.15, Chapter 2? (@ Finally, plot the average of the Tour sample autocorrelation functions land compare this with the theoretical autocorrelation function given ‘atlir in the problem. Note that one would expect the average exper- imentally derived autocorrelation function to approximate the theoret- ical function better than any of the individual functions. S$ _ A sutionary Gaussian random process is known to have a spectral density {unetion ofthe form etl Gs BaF +6 Sie) Assume that discrete noisy measurements of y are available at f= 0, 1, 2, 3... and that the measurement ervors are uncorrelated and have a variance of ‘wo unis. Choose phase variables as state variables and develop a Kalman filter model for this process. That is, find ¢,, Qy. Hye Ri and the initial conditions Aj and P5. 
Note that numerical answers are requested, so MATLAB and the tlgoritns for determining @, and Q, given in Section 5.3 will be helpful. You may assume that in the slationary condition, the x, and x, state variables are uncoated. Thus, Py will be diagonal. (See Problem 5.9 for more on this.) 56 Using the same dy, Qy, Ry Hy and Py parameters as in Example 5.8, ‘compute the sequence of optimal estimation-eror variances for 26 steps using MATLAB. Do this interactively line-by-line following the steps given in Fig, 58 (Wow: The & update and & projection operations are omitted when doing covar- jance analysis. Also, a pause statement in the “for” loop will enable you to view the ero variance with each recursive step.) The end result i to be a 1 x 26 row vector containing the a posteriori variances for the 26 steps. Plot the result using the MATLAB plot statement. Note that ths isa discrete sequence, soitis best o plo it as a sequence of discrete symbols rater than a continuous cure. Also save the result for future reference. In this example, the estimation ror variance reaches a steady-state value of -1653 in about 20 steps, (This _poblet is continued as Problem 5.7.) eS @ PROBLEMS 235° 5.7 Having completed Problem 5.6, now write a more general MATLAB co variance analysis program that will accommodate an m X 1 vector process and fan m % 1 vector measurement sequence. Make an Mcfile for your program and give it an appropriate name (eg, llcovarsn). The number of steps in the Kalman filter isto be 5, withthe frst measurement occurring at ¢ = 0 (ie, step 1 is at (0). Assume thatthe step size Av is fixed and thatthe system parameters are constant, The desired result isthe sequence of optimal a posteriori error covar- ance matrices for « steps. Arrange your program such tha the error covariances are stacked side-by-side into a single large m % ns matrix that contains all of the error covariance information forthe entire run, (a) Fist, test your generalized program on the scalar situation described in Probletn 5.6. (©) Consider next the position-veloity model shown in the accompanying. figure. The step size is 1 sec, and position is the measurement. The ‘measurement erzor variance is 225 m?, and the intial uncertainties, the 4 and x, estimates, are zero (Le., Py = 0). Do a covariance analysis run for this model for 51 steps. (Last measurement is at ¢ = 50 sec.) (On the basis of your results, would you say that this system is observ- able? Give a qualitative justification for your answer. ve ee 58 A classical problem in Wiener-fitor theory is one of separating signal from ‘noise when both the signal and noise have exponential autocorrelation functions. Let the noisy measurement be A) = 5) + m() and let signal and noise be stationary independet processes with autocorrelation functions Ria) = a3e"8t Ry) = oe (@) Assume that we have discrete samples of 2() spaced Ar apart and wish to form the optimal estimate of s(?) atthe sample points. Let s() and rn) be the fist and second elements of the process state vector, and thea find the parameters of the Kalman filter for this situation. That is, find , Hy, Q,, R, and the initial conditions % and Pj. 
Assume that the measurement sequence begins at 1 = 0, and write Your results in general terms Of 6 Oy» By Bye Ad At () To demonstrate thatthe discrete Kalman filter can be run with Ry = 0 (for a limited number of steps, at least), use the following numerical values in the model developed in part (a:| i 296 CIUPTERS THE O'SCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION B= 1 sect a1, B= Tse? ar ‘Then tun the enor covariance part of the Kalman filter for 51 steps beginning at ¢ = 0. (You do not need to simulate the 2, sequence for this problem, Simple covariance analysis will do.) ‘59 Show thatthe state variables as defined in the model development of Fig. 52 are uncorrelated in the steady-state condition, It may be assumed that the poles ofthe transfer function relating u(?) and r(?) are all in she lft-half plane. (Wnt: Use the methods given in’ Section 3.8 to evaluate the erosscorrelations, and note that the impulse response beging at zero and apprcaches zero ast — ‘ when the order of the denominator polynomial is 2 greatr than that of the ‘umertor of the corresponding transfer function.) ‘540 In Example 5.7, Section 5.7, the random walk process was assumed to have an inital value of zero, that i, it was a Wiener process. Consider a similar process except thatthe intial value is @ random vartable described as N(O, 1) How will his change in inital condition affect the discrete Kalman filter model? S.11 Suppose we have a scalar process y(@) that is an additive combination of a random bias and a Wiener process (see Section 5:2). The y(@) process can be modeled either asthe sum of two separate state variables or a8 a single variable ‘vith a random intial condition as Was done in Problem 5.10, Assume that the measurements of y() occur at unit intervals beginning at ; = O and that the measurement error has a variance R. Also assume that the random bias com- ponent is described as N(O, 02), and thatthe Wiener process has a variance that increases in accordance with Af, Both o? and A are known parameters, (a) Demonstrate that both the single-state and costae models yield the same optimal estimate of y by cycling through two recursive steps for each model () Assume that only the sum of the random constant and Wiener process is of interest. Which of the two models would you prefer? Why? 52 Consider the same Gauss-Markoy process and measurement situation dis assed in Example 5.8. (a) Carty out two recursive steps of the Kalman filter ard write the estimate att = 02 see explicitly in terms of z and zy () Using the weight factor approach of Section 4.7, write the optimal estimate after two measurements (Le. at f = .02 set) in terms of zp and 2. [Note thatthe answer should be the same as tht obtained in part @] 5.413 Two different state models were given in Section 5:2 for @ pure harmonic process. It was also mentioned that a linear transformatior will exist that will take one model into the other. Bogin with the phase-variable model (Eqs. 5.2.20 and 52.21) and (@) First, write out the explicit matrix transformation T that takes x' into X, that is, write x = TX: proses 287 (&) Replace x in the general state equations x + Gu y with Tx’ and rearrange the x’ equations into the standard state-space form. The final result ofthe algebraic manipulations of part (b) should be the same as Eqs. 
(5.2.23) and (5.2.24), ‘5.14 Suppose that we make the following linear transformation on the process en “This transformation is nonsingular, and hence we should be able to consider x" as the state vector to be estimated and waite out the Kelman filter equations accordingly. Specify the Kalman filter parameters for the ransfonmed problem. (Note: The specified transformation yields a simplification of the measurement ‘matrix, but this is at the expense of complicating the model elsewhere.) 5.15 Its almost self-evident tha if the estimation ertors ae minimized in one set of state variables, this also will minimize the error in any linear combination of those state variables, This can be shown formally by considering a new state vector x’ to be related to the original state vector x via a general nonsingular transformation x’ = AX, Proceeding in this manner, show that the Kalman es- timate obiained in the transformed domain is the same as would be obtained by performing the update (i.e, Eq, 5.5.8) in the original x domsin and then trans- forming this estimate via the A matrix 5.16 The matrix differentiation formulas used in Section 5.5 are repeated here for convenience. eA Aeree(A] yr (AB must be sua) lanes LimNCA) - 240 (€ mabe ye Verify that these fonmuls ae corect. (A. straightforward way of doing this is to nie out afew terms ofthe india matrices, perform the product, and thon differentiate term-by term a indisted in Eg (3.5.18) 5.17 Show thatthe mean and covariance associated with conditional density Tunetion fan are as given by Eqs. (68.10) and 8.1) (Hs When the expression for fay, i written out explicitly in terms of Xy ble foond tha the quanti inthe exponent i quadratic in x. Next, you vill need fo do the matrix equivalent of completing the square in oer to recognize the mean and covariance of the density fetion. Alo, you will id the alter native gun and rrr-covriance expressions derived in Chapter 6 to be useful hee. They are, with subscripts omited,238 CHAPTER. THE DISCRETE KALMAN ALTER, STATE-SPAGE MODELING, AND SIMULATION K=PHR* Pe (ey! +R Hy ‘Assume these to be valid and use them wherever they may be helpful inthis problem.) 5.18 Occasionally, we need to transform the process state model back to a ‘corresponding scalar differential equation (continuous case) or an ARMA model (discrete case). In either case, we are looking for the direct relationship between the scalar input and the sealar output. In the continuous case the state equation has the following Form: = Fx + Gu (wis unit white noise) y= Bx (vis salar) If we transform to the s-domain and solve for Y(), we get Yo) = [Bel = Fy'G]UE) Clearly, the denominator of [B(sE ~ F)-'G] provides the characteristic poly- nomial of the desired differential equation, and the numerator gives the coefii- cients of the foreing function pat. Derive the corresponding input-output trans- fer function relationship in the-2-domain for the discrete model and relate the ‘denominator and the numerator to the respective AR and MA parts ofthe ARMA ‘model. In other words, develop an ARMA model from the diserete state model Demonstrate that this procedure does yield the comrect ARMA equation by ap- plying it to Eqs. (5.3.40) and (5.3.41) of Example 5.6. 5.419 In Example 5.8, the Monte Carlo realizations of the true process and ‘Kalman fiter estimate were plotted together for purposes of comparison. I is also informative to plot the estimation erro for such simulations and plot it with the square root of the error variance for comparison. 
One simulation run may not be very conelusive, but when the results of a number of Monte Carlo runs sre superimposed, one ean get a good indication as to whether or not the filter (Ge, estimator) is performing as expected ‘Use the same parameters as given in Example 58, except here tet the correlation time of the process be 4 sec, rather than | S60 (i let = 25 Sec"), Then compute fovr Kalman filter simolations and save the estimation for sequence foreach run a8 1 % 51 coW vector The plo, superimposed on one plo.) the four simulated error tajectores, and (b) both the positive fnd negative values of VP. (twill help here to use continous plots for the four estimation eor curves and discrete symbols for VP.) You wil find that the two-vlued ‘VP plot forms srt of an envelope forthe estimation eor plots. With Gaussian statistics, one would expect the estimation errors to sty within the envelope about 68 percent ofthe time. Hf your results donot agree wih th check your rouram carefully for possible mors (Experience has shown th PROSLEMS 239 the MATLAB random number generator does a very good job of generating ‘Gaussian statistics when the “normal” option is used.) 5.20. The following numerical exercise is a continuation of the gyro-bias cal ibration example that was discussed in Section 5.7. For our analysis here, we will approximate the earth as being spherical, andthe ship will be moving slowly ‘ta latitude of 45 deg. We will use the following numerical values forthe earth's rotation rate and its radius at 45 deg latitude: 1 = 2625161 rad/hr (rotation rate) 6367253 m (radius) (a) Firs, look atthe natural modes of oscillation ofthe original set of thee iferential equations, that is, Eqs. (5.71 through 5.7.3). This is easily ddone by setting the forcing function equal to zero and finding the ei- ‘genvalues of F [roots of (S1-) for the third-order system). Note that the natural period of oscillation works out 10 be very long—about I solar day! Thus, itis convenient numerically to express time in units ‘of hours, rather than the usual seconds, in this example () Next, consider the augmented six-state model as discussed in Section 5.7. The gyro biases are to be modeled as random-walk processes, and let the white-noise forcing functions for these processes have power spectral densities that are such as to produce an incremental random ‘walk variance of (.001 deg/h? in 1 hr, Discrete measurements will be assumed to be available at intervals of .5 hr. Compute &, and Qy for this model, and check for reasonableness in view of the very long natural period of oscillation fr the system. In the simplified model discussed in Section 5.7, the two horizontal position erors of the inertial system were the measurements. There are problems with system observabiity for this measurement situation. For this reason, Bona and Smay also considered the possibility of a thind extemal measurement, namely, the platform azimuth error (14). Call it ‘, and note that i is @ linear combination of the state variables and ‘Uc ALAS deg latitude, we have © ®, aa (P5201) ‘After we add the azimuth measurement to the two position error, the 1H, matrix becomes 100000 H=|0 10000 (P5.202) rer) ‘We now wish to sce how well the inertial system can be calibrated on the basis of a 3-tuple measurement as described by Eq. (P5.21.2). To 4o this, rn & Kelman filter error-covariance analysis for 11 steps wih240 CHAPTER'S THE DISCRETE KALMAN FILTER, STATE-SPACE MODELING, AND SIMULATION the measurement interval set at .5 hr. 
Use the following variances for the initial Py matrix: a) tion ence (an 43: (200 Azimuth and west position combined (¥): (1 deg)? Gyro drift rates (ll the same and independent: (02 deg/hr}? ‘The Py matrix may be assumed to be diagonl. Fer the measurement errors, assume independence and let the variance be 100 m y 7, Gm), matin see ( ‘Azimuth; (1 ase min? The R, matrix will be a diagonal 3.x 3 matrix. [Now run the covariance analysis for 11 steps ‘beginning at = 0 and ending at ¢ = 5.0 he) and plot the rms estimation errs fOr Yo Uy Yn #y ey and ¢, On the basis of these plots, would you think that the system is completely observable? =} S21 Consider two different measurement situations forthe same random-walk ‘dynamical process Process model: ‘Measurement model I: y= Sut Os Measurement model 2: 1 k = (OS = Te AGRE Using Q = 4, R = 1, and Pj = 100, run error covariance analyses for each measurement model for k= 0, 1,2, ... 200. Plot the estimation eror variance for the scalar state x against the time index k foreach case. Explain the difference seen between the two plots, particulary as the recursive process approaches and passes k = 70 REFERENCES CITED IN CHAPTER 5 1. RLE- Kalman, “A New Approach to Linear Filtering and Prediction Problems,” Trans. ASMEJ. Base Engr, 35-45 (March 1960) REFERENCES 261 2. J.J. D'Azzo and C. H. Houpis, Linear Control System Analysis and Design, th New Yorke McGraw-Hill, 1995, 3. RC. Dorf and RH, Bishop, Modera Control Systems, 7th ed, Reading, MA: Ad- ison- Wesley, 1995. 4.TTA. Stansell, Je, “The Many Face of Transit of Navigation, 25:1, 55-70 (Spring 1978). 5. P,Henrici, Discrete Variable Methods in Ordinary Digerential Equations, New Yat: Wiley, 1962, 6. C.R Wylie, Ir, Advanced Engineering Mathomarics, rd ed, New York: MeCiraw ill, 1966, p. 108-117, 7. R.G. Brown and P. Y. C, Hwang, Introduction to Random Signals and Applied Kalman Fiering, 2nd et, New York: Wily, 1992 8, C.F. van Loan, “Computing Totegrals Involving the Matrix Exponential” ZEEE Trans, Aucomatie Compal, AC-23: 3, 395-404 (lune 1978. 9. K-S. Shanmugan and A. ML Breipohl, Random Signals: Detection, Estimation, and aia Analysis, New York: Wiley, 1988. 10, J. M. Mendel, Optimal Seismic Deconvolution, New York: Academic Press, 1983 1. M.B. Priestly, Spectral Analysis and Time Series, New York: Academic Press, 1981 12. G.H. Golub and C.F van Loan, Maur Computations, 2nd ed, Balimmre, MD: The Tans Hopkins University Press. 1989, pp. 41-142 13, AHL Jazwinski, Stochatle Processes and Filering Theory, New York: Academic Pres, 1970, 14, BLE. Bons and R. J-Smay, “Optimum Reset of Ship's Ineial Navigation System” IEEE Trans. Aerospace Elect: $)st, AES-2: 4, 09-414 (July 1960). 15. G.R, Pitman (ed), metal Guidance, New York: Wiley, 1962. 16. 1.C. Pinzon, "ineial Guidance for Cruse Vehicles," in C. 7. Leondes (ed), Gul tance and Control of Aerospace Vehicles, New York: MeGraw-Hill, 1963. 17, 1S. Medith, Stockastie Optimal Linear Estimation and Control, New York: ‘MeGiraw-Hil, 1969, 18, "Change No. Ite RTCA/DO-208," RTCA paper no, 479-93/PMC-106, RTCA, tne: ‘Washington, DC, Sept 21, 1993 Navigation, Youral ofthe Institute Additional References 19, A. Gelb (ed), Applied Optimal Eximation, Cambridge, MA: MIT Pres, 1974 20, BS. Maybeck, Siochastlc Models. Estimation and Control (Vol. 1), New York Academic Pest, 1979 21 ALP. Sage and J L, Melsa, Eumation Theory with Applications to Communications ‘and Control, New York: McGraw-Hill, 197. 22, SUM. Bozic, Digital and Kelmon Filerng, London: E. 
Arnold, Publisher, 1979.
23. B. D. O. Anderson and J. B. Moore, Optimal Filtering, Englewood Cliffs, NJ: Prentice-Hall, 1979.
24. C. T. Leondes (ed.), Theory and Application of Kalman Filtering, North Atlantic Treaty Organization, AGARD rept. no. 139 (Feb. 1970).
25. H. W. Sorenson, Parameter Estimation, New York: Marcel Dekker, 1980.
26. H. W. Sorenson, "Kalman Filtering Techniques," in Advances in Control Systems (Vol. 3), C. T. Leondes (ed.), New York: Academic Press, 1966, pp. 219-289.
27. M. Aoki, State Space Modeling of Time Series, Berlin: Springer-Verlag, 1987.
28. R. F. Stengel, Stochastic Optimal Control: Theory and Application, New York: Wiley, 1986.
29. M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and Practice, Englewood Cliffs, NJ: Prentice-Hall, 1993.
30. G. Minkler and J. Minkler, Theory and Application of Kalman Filtering, Palm Bay, FL: Magellan Book Co., 1993.

6
Prediction, Applications, and More Basics on Discrete Kalman Filtering

In the period immediately following Kalman's original work, many extensions and variations were developed that enhanced the usefulness of the technique in applied work. This chapter continues the subject of discrete Kalman filtering with some of the more important extensions and related topics. Also, a limited number of applications are presented to illustrate the versatility of Kalman filtering. The treatment of applications here must be brief. More applications are given in Chapters 9, 10, and 11. There also exist a wealth of application papers in various journals and conference proceedings over the past 35 years. In particular, many papers dealing with navigation and trajectory determination will be found in Navigation (Journal of the Institute of Navigation), IEEE Transactions on Aerospace and Electronic Systems, and IEEE Transactions on Automatic Control. Also, Sorenson (1) gives an especially valuable collection of applied papers in his IEEE Press book.

6.1 PREDICTION

The predictive estimate of the process at t_(k+N), given the measurement stream through t_k, and its associated error covariance are obtained by projecting the updated filter quantities ahead N steps:

x̂(k + N|k) = φ(k + N, k) x̂(k|k)    (6.1.1)

P(k + N|k) = φ(k + N, k) P(k|k) φᵀ(k + N, k) + Q(k + N, k)    (6.1.2)

where
x̂(k|k) = updated filter estimate at time t_k
x̂(k + N|k) = predictive estimate of x at time t_(k+N) given all the measurements through t_k
P(k|k) = error covariance associated with the filter estimate x̂(k|k)
P(k + N|k) = error covariance associated with the predictive estimate x̂(k + N|k)
φ(k + N, k) = transition matrix from step k to step k + N
Q(k + N, k) = covariance of the cumulative effect of the white-noise inputs from step k to step k + N

Note that a more explicit notation is required here in order to distinguish between the end of the measurement stream (k) and the point of estimation (k + N). (These were the same in the filter problem, and thus a shortened subscript notation could be used without ambiguity.)
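A minimal MATLAB sketch of Eqs. (6.1.1) and (6.1.2) follows. It assumes a time-invariant model, so that φ(k + N, k) is just the one-step φ applied N times, with Q accumulating accordingly; the function and argument names are illustrative.

```matlab
% N-step prediction appended to the filter loop, per Eqs. (6.1.1)-(6.1.2).
% xhat and P are the updated filter estimate and covariance at step k;
% phi and Q are the (assumed constant) one-step model parameters.
function [xpred, Ppred] = predictN(xhat, P, phi, Q, N)
   xpred = xhat;
   Ppred = P;
   for j = 1:N
      xpred = phi*xpred;             % Eq. (6.1.1), applied one step at a time
      Ppred = phi*Ppred*phi' + Q;    % Eq. (6.1.2)
   end
end
```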
There are two types of prediction problems that we will consider:

Case 1: Case 1 is where N is fixed and k evolves in integer steps in time just as in the filter problem. In this case, the predictor is just an appendage that we add to the usual filter loop. This is shown in Fig. 6.1. In off-line analysis work, the P(k + N|k) matrix is of primary interest. The terms along the major diagonal of P(k + N|k) give a measure of the quality of the predictive state estimate. On the other hand, in on-line prediction it is x̂(k + N|k) that is of primary interest. Note that it is not necessary to compute P(k + N|k) to get x̂(k + N|k). Also, Case 1 will be recognized as the discrete recursive version of the Wiener prediction problem that was discussed in Chapter 4.

Figure 6.1  Prediction appended to the usual filter loop.

Case 2: Case 2 is where we fix k and then compute x̂(k + N|k) and its error covariance for ever-increasing prediction times, that is, N = 1, 2, 3, . . . . The error covariance is of special interest here, because it tells us how the predictive estimate degrades as we reach out further and further into the future. We will now consider an example that illustrates this kind of prediction problem.

EXAMPLE 6.1

It was mentioned previously in Problem 5.4 that the timing on the GPS ranging signals from each of the satellites is dithered randomly in order to limit the civil user's horizontal accuracy to 100 m 2drms. This intentional degradation of the GPS signal is known as selective availability, or just SA for short. In a special form of GPS called differential GPS (15), the user is supplied with range corrections for all satellites as observed by a nearby monitoring station. These corrections include the effects of SA, which usually predominate over all other sources of error. The range (and range-rate) corrections cannot be transmitted continuously, though, because the amount of data to be transmitted is fairly large relative to the bit rate of the message. Therefore, the corrections are only updated in coarse steps, and the user must extrapolate this information in the interim between updates. It is, of course, of much interest in this application to see how the accuracy of the extrapolation degrades in between updates.

We will now consider an idealized situation where only the SA-induced range errors will be included in the analysis. Furthermore, we will say that at the update time (call it t = 0), the user receives perfect range and range-rate corrections for a particular satellite. The user must then extrapolate (predict) the range for a few tens of seconds until a fresh update is received. A precise statistical description of the random process for the SA range dithering has not been made public, but there is some empirical evidence that it can be approximated as a second-order Gauss-Markov process with a power spectral density (PSD) of the form (16)

S_SA(jω) = c² / (ω⁴ + ω₀⁴)  m²/(rad/sec)    (6.1.3)

where
c = constant determined by the level of dithering
ω₀ = .012 rad/sec

As a matter of convenience, the amplitude of the SA dithering is usually expressed in terms of range rather than time (i.e., range = velocity of light × time). Suppose we choose the constant c to correspond to an rms amplitude of 30 m (17). Then, using the methods given in Chapter 3 (Table 3.1), we find that

c² = 2√2 ω₀³ σ²_SA = 2√2 (.012)³ (30 m)² = .0043987 m² (rad/sec)³    (6.1.4)

Next, we can use the modeling methods discussed in Section 5.3 to develop a continuous state model. The result is (using range and range rate as the state variables)

ẋ = [0  1; −ω₀²  −√2 ω₀] x + [0; c] u(t)    (6.1.5)

where u(t) = white noise with unity PSD.

We will consider the observable to be range, so the discrete measurement equation is

z_k = [1  0] x_k + v_k    (6.1.6)

where the variance of v_k is R_k. The model is now complete with the specification of the step size Δt, R_k, and the initial conditions.

To find the mean-square error in prediction, one can initialize P(k|k) at zero and then repeatedly calculate P(k + N|k) as indicated in Fig. 6.1 for N = 1, 2, 3, etc. However, there is an easier way if filter error-covariance software is available.
This being the case, we simply initialize P(O)0) at zero and then run the covariance program an appropriate number of steps with R, set at an abnor rally large value (eg, 1.0220). This is the equivalent of telling the Kalman filter that the measurement information is worthless, This, in tura, makes the P update trivial at each stop, and P, just propagates ahead in accordance with lhe Py, = 42,82 + Q, equation, The result for a step size of 10 sec and 30 steps is shown in Fig. 6.2. Note that the rms error approaches 30 m as the predictive time becomes large. This is as expected. As we reach out further into the future, the correlation between the present and the future becomes smaller tnd smaller, and the optimal estimate approaches the mean of the process (zero in this case). Toe rms estimation error then is just the ms value of the process itself. Also note that ifthe differential corrections are refreshed every 50 sec, the prediction error can be held to about 10 m, at worst, at the end of the prediction intecval = ‘There are a number of interesting variations that can be made on the “type 22” prediction problem illustrated in Example 6.1. One such variation is to carry the problem a bit further and compare the optimal predictor with a particular246 CHAPTER PREDICTION, APPLICATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING <1 Bas] : ce 00 88 00 Figure 82. Preoton ror—SA crore, ‘suboptimal predictor where the prediction is accomplished by simply projecting ‘ahead with & constant rate. This comparison is made in Section 6.7 as an example ‘of suboptimal filter analysis. Another variation is to relax the assumption of beginning the prediction with perfect estimates of range and range rate. An example of this is given in Problem 6.1, ALTERNATIVE FORM OF THE DISCRETE KALMAN FILTER ‘The Kalian filter equations given in Chapter 5 can be algebraically manipulated ino a variety of forms (2, 3, 4). An alternative form that is especially useful will now be presented. We begin with the expression for updating the error covariance, Eq, (5.5.22), and we temporarily omit the subscripts to save writing. Pr a KMDP- 21) Recall thatthe Kalman gin i given by Ea. 5.19 K = Pw GP-H +R)" 622 Substtaing Eg. (62:2) into (62.1) yields Pa P= PHY + RHP 623 ‘We now wish to show that if the inverses of P, P, and R exist, P-' can be written as ey eR 624) 182. ALTERNATIVE FORM OF THE DISCRETE KALMAN FILTER 247 Justification of Eq, (6.2.4) is straightforward. We simply form the product of the right sides of Eqs. (6.2.3) and (6.2.4) and show that this reduces t0 the identity matrix. Proceeding as indicated, we obtain [e ~ PRCHP-H” + RYH }[y! + ERB) = PAP AP + Ry! — RE OP + RP RE 1 = Po [APE + RY" + BP-AR) ~ RY = Pir! - Ro] ‘An alternative expression forthe Kalman gain may sso be derived, Begin- ning with Eq. (6.22), we have K = PWR + Ry! Insertion of PP! and R-'R will not alter the gain, Thus, K can be written as K = PPP RORGP-HY + Ry = PPP-HTR-(HP-HTR™' + Dt We now use Bg. (6.2.4) for P-! and obtain K = PLP) + PROP REP HER + PO + PRONE WR-(AP-EPR™! + 1! PER + BPHR- APPR! + PER 625) ‘The main results have now been derived, and Fags. (62.4) and (6.2.5) may be rewritten with the subscripts reinserted: Reh oy! + BERG, 626) K, = PAATR;" 27) [Note thatthe updated error covariance can be computed without fist finding the ‘gain, Also, the expression for gain now involves Py; therefore, if Eq. (6.2.7) is to be used, K, must be computed after the P, computation. 
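A compact MATLAB sketch of this alternative measurement update is given below; the function and variable names are illustrative. Note that it follows the order just described: P_k is formed first from Eq. (6.2.6), and the gain then follows from Eq. (6.2.7).

```matlab
% Alternative (inverse-covariance) measurement update, Eqs. (6.2.6)-(6.2.7).
function [xhat, P, K] = altUpdate(xminus, Pminus, z, H, R)
   P = inv(inv(Pminus) + H'*inv(R)*H);    % Eq. (6.2.6): update P first
   K = P*H'*inv(R);                       % Eq. (6.2.7): gain from updated P
   xhat = xminus + K*(z - H*xminus);      % usual estimate update
end
```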
Thus, the order in Which the P, and K, computations appear in the recursive algorithm is reverse from that presented in Chapter 5, The alternative Kalman filter algorithm just derived is summarized in Fig. 63. [Note from Fig. 6.3 that two (n Xi) matrix inversions are required foreach recursive loop. Ifthe order of the state vector is large, this leads to obvious ‘computational problems. Nevertheless, the alternative algorithm has some useful applications. One of these will now be presented as an example.248 CHAPTERS PREDICTION, APPLICATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING rae — tise angie ei santo rR ‘pst etme Sioterm inn Figure 63 erate Klan Mar eure oop, EXAMPLE 6.2 ‘Suppose we wish to estimate a random constant based on a sequence of inde- pendent noisy measurements of tke constant. We can think of the constant as being a deterministic random process that satisfies the differential equation #=0 (628) ‘Thus, 6, and Q, ae 1 and O, respectively. Let us also speculate that very Hite is known about the process intially. The constant is equaly likely to be positive ‘or negative, and its magnitade could be quite large. It might be thought of as a random variable witha fla probability density function extending from —® to ‘For more realistically, a normal zero-mean random varisble with a very large variance. This being the ease the a priori estimate end associated error covati- ance should be 629) (62.10) However, this is not permitted in the usual Kalman filter algorithm (Fig. 5.8), because it leads to the indeterminant form =/ in the gain expression. The altemative algorithm will accommodate this situation, though, because (P5)"* rather than Ps appears in the first stp. "Time is of ao consequence in this example because we are estimating a constant. So let us assume we have N independent noisy measurements of x, teach made at 1= O and having an error variance of o. The measurement model is then (62 ALTERNATIVE FORM OF THE DISCRETE KALMAN FILTER 249 tel + |"? (62.11) and R, is the (N x N) diagonal matrix 0 Oe R - oF (62.12) Proceeding with the frst step of the altermative algorithm yields Pt = (Bayt + HERG By Y tet rs Salt 62.1) 1 pe (62.14 [Next the gain is computed as = PR! “Fut Sa “ERR 62.18) Finally, the estimate is given by 4 + Kyla ~ Hof) Kga=% N 6216)i 250 CHAPTERS PREDICTION, APPLICATIONS, ANO MORE BASICS ON DISCRETE KALMAN FLTEFING “The final result is no surprise; itis exactly the result one would expect from elementary statistics, The main point of this example is this: The altemative algorithm provides a means of stating the Kalman filter with “infinite uncer tainty" if the physical situation under Consideration so dictates. a 63 PROCESSING THE MEASUREMENT VECTOR ONE COMPONENT AT A TIME We now have two different Kelman iter algorithms as summarized in Figs. 5.8 and 6.3. They are, of course, algebraically equivalent and produce identical es- timates (with perfect arithmetic). The choice as to which should be used in @ particular application is a matter of computational convenience. Both algorithms Tavolve matrix inverse operations, and these may lead to difficulties. When using the altemative algorithm of Fig. 63, there is no reasonable way to avoid two (@r % n) matix inversions with each recursive eye. If the dimension of the state ‘vector n is large, this i, at best, awkward computationally On the other hand, the matrix inverse that appears in the regular algorithm given in Fig. 5.8 is the ‘same order as the measurement vector. 
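Example 6.2 can also be checked numerically in a few lines. The sketch below starts the alternative algorithm with (P₀⁻)⁻¹ = 0, following Eqs. (6.2.13) through (6.2.16); the number of measurements and the noise level are arbitrary illustrative values. The result should be the sample mean with an error variance of σ²/N.

```matlab
% Numerical check of Example 6.2: N independent measurements of a constant,
% processed with "infinite" initial uncertainty.  Values are illustrative.
N = 10;  sigma = 2;  xtrue = 5;
z = xtrue + sigma*randn(N,1);          % N noisy measurements, Eq. (6.2.11)
H = ones(N,1);  R = sigma^2*eye(N);    % Eq. (6.2.12)

Pinv = 0 + H'*inv(R)*H;                % (P0minus)^-1 = 0, Eq. (6.2.13)
P = inv(Pinv);                         % = sigma^2/N, Eq. (6.2.14)
K = P*H'*inv(R);                       % Eq. (6.2.15)
xhat = 0 + K*(z - H*0);                % = mean(z), Eq. (6.2.16)
```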
Since ths is often less than the order of| the state vector, it is usually the prefered algorithm. Furthermore, if the mea- ‘urement errors at time f, are uncorrelated, the inverse operation can be elimi- pated entirely by processing the scalar measurements one ate time.* This will now be shown. ‘We begin with the expression forthe updated error covariance, Eq, (6.2.6): oo | OTHE ae wort + (Bar|) a LeRbr'| @ HR) an eceearecee ne econ term in B. (631 is itetonly write npartioned form and Bsn to bea est ck ago Pascal. hs ens tht he ecuronen avin ty canbe groupe ogee sch hat he measienent Reon non hee. ben locks ate unrelated This fen he cae hen ‘Sond meses cone Hw ferent nme. We net expand the parvo af Eg ge + mosesing he ect examen oe conte tines someties ef sen sputter hwo ve ean tt pe cnn me mig of ni vfienal un cae for eso, pocosig the auc sequently ee 2 8 {Secs gh pn nie wl be fered ere phys ore-a-tme proven The em wen Sita eee to pete e up op even of he messrenen otsing wi 183 PROCESSING THE MEASUREMENT VEGTOR ONE COMPONENT ATA TIME 251 =P MERON + HPC + (632) afer essimilating "block a measurements 7 after assimilating both block a and & measurements ‘and 60 forth PR [Note that the sum of the first two terms is just the Py! one would obtain after assimilating the “block a” measurement just as if no futher measurements were available. The Kalman gain associated with this block of measurements may now be used to update the state estimate accordingly. Now think of making & trivial projection ahead through zero time. The a posteriori P then becomes the a priori P for te next step. When this is added to the b term of Eq, (6.3.2), we have the updated P;" after assimilating the second block of data. This can now be repeated until all blocks are processed. The final estimate and associated error js then the same as would be obtained if all the measurements at t, had been processed simultaneously. Thus, the designer has some flexibility in the design of the system software. The aveilable measurements at any particular time may be processed either in blocks, one block at a time, or all at once, as best suits the situation at hand. One-at-a-time measurement processing is illustrated in the timing diagram of Fig. 64. Note that once we have established the validity of fonevat-a-time processing, it makes no difference whether we use the “usual ‘update formula given in Chapter 5 (see Fig, 5.8) of the alterative formula given in Section 6.2. The end results ate the same (within the limits of computational arithmetic), ‘The concept of processing the measurements one block ata time leads to an interesting physical interpretation of P inverse. With reference to Eq. (6.3.2), think of (Pj)! 28 a measure of the information content of the a priori estimate, that is, before the new measurement information is assimilated into the estimate For simplicity, begin with (P;)°' = 0. This corresponds to infinite uncertainty, ‘or zero information. Then, as each measurement black is processed, we add an amount H"R™'ET to the previous information, until finaly the total information is the sum indicated by Eq. (6.3.2). The term “ada” is appropriate here because HPR“'H is always positive definite. For the heuristic reasons just noted, P in- verse is often referred t0 as the information matrix. This concept is developed further in Chapter 9 in the discussion of decentralized filters (Section 9.6) One-ata-time processing is also useful from a system organization view- point. 
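The one-at-a-time idea is easy to mechanize. The following sketch assimilates the components of a measurement vector one scalar at a time, assuming the measurement errors are uncorrelated (diagonal R_k); the trivial zero-time projection between components means the a posteriori P simply becomes the a priori P for the next component. The function and variable names are illustrative.

```matlab
% One-at-a-time (sequential) processing of a measurement vector whose
% errors are uncorrelated.  Rdiag contains the diagonal elements of R_k.
function [xhat, P] = seqUpdate(xminus, Pminus, z, H, Rdiag)
   xhat = xminus;  P = Pminus;
   for i = 1:length(z)
      h = H(i,:);                          % i-th scalar measurement row
      K = P*h'/(h*P*h' + Rdiag(i));        % scalar denominator, no inversion
      xhat = xhat + K*(z(i) - h*xhat);
      P = (eye(length(xhat)) - K*h)*P;     % a posteriori becomes next a priori
   end
end
```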
Often the system must have the flexibility to accommodate a variety of ‘measurement combinations at each update point. By block processing, the sys- tem may be programmed to cycle through all possible measurement blocks one at a time, processing those that are available and skipping those that are not. ‘Simultaneous processing requires a somewhat more complicated system orga nization whereby dhe system must be able to form appropriate Hy snd R, ma trices for all possible combinations of measurements, and it must be prepared to do the corresponding matrix operations with various dimensionality.252 CHAPTERS. PREDICTION, APPLICATIONS, NO MOFE BASICS ON DISCRETE KALMAN FILTERING 64 POWER SYSTEM RELAYING APPUCATION 253, able to determine the distance to the fault as soon as possible; and, in order to do this, the steady-state postault currents and voltages must be estimated. Tran- sient components are superimposed on the steady-state signals immediately after Conon caroute the fault, so that the transients become the corrupting noise in the problem, ane Using Normally, these transients are not considered as random noise. However, 10 ‘model them otherwise complicates the problem immensely because of the many variables involved. So the basic problem is to estimate the steady-state compo- nents of the sending-end voltages and current in the presence of the transient compat (noise) components. It is assumed that digital samples of the various phase ee Ea voltages and currents are available for processing at a reasonably fast rate, say, 64 samples per cycle ofthe 60-Hez signal Girgis and Brown (10) made an extensive simulation study of the transients accompanying various types of faults for various lengths of line, and so forth ‘They concluded thatthe transients could be approximated as nonstationary ran- [compute i= dom processes, and they developed models accordingly that would fit the re- ee a Steet er ie sic opel at le ee erected pina tas tenany cme he eal oe pponentially with time. Thus, it seemed reasonable to model the time samples ot the process as a white sequence with an exponentially decaying variance. The current transients, however, showed (on the average) a sizable long-time-constant exponential component in'addition to the high-frequency components. (Power Figure 64 Ting dogram fo cne-at-aire measursent 00 engineers sometimes refer to this as the “de offset") Thus, the model chosen ig n= robe of eis fre measurement wet 2) forthe current noise was an exponential process with random initial amplitude plus a white sequence with exponentially decaying variance. The signal process to be estimated for both current and voltage was a sine wave with random amplitude and phase. This is readily modeled as a two-element vector, where the state variables are the coefficients ofthe sine and cosine components of the ‘wave (see Section 5.2). The final models forthe currents and voltages may be summarized as follows: Provost PP “There are bound to be some applications where the measurement errors are all mutually corelated. The R, mate is then “ull” I this s the case, and one fata-time processing is desirable, linear combinations of the measurements may be formed in such a way a5 t0 form a new set of measurements whose errors ‘ue uncorrelated, One technique for accomplishing this is known as the Gram_Schmidt orthogonalization procedure (4). 
This procedure is straightfor- siad and an exercise i included fo demonstrate i aplication to the problem one ph UPazcoupling the measurement errs (se Problem 62), Noe th the itive foltage Model (same for each phase) grocobae demonstrated in Problem 62 is closely related to Cholesky facto 1, State equation: Enon. This is discussed in detail in Section 5.4 eee lze]-[ He) 640) 64 POWER SYSTEM RELAYING APPLICATION New applications of Kalman fering Keep appearing regulary, and many of these a¥e now outside the orginal application area of rvigation. One such Ape pplication has to do with power system relaying 9, 10), When a fault (Shor) ‘eure on #thre-phse eansmisson lin, i 8 desirable 1 sense the problem rly and take appropriate relaying action to protect the remainder of the item Te hierarchy of decisions that mst be made a to which lays should - top (and where) s relatively complicated, It suffices to say here that iis desi- {% = measured value of x just prior to the fault (644.4) 2. Measurement equation: afeosutar -snapadfh| +n 642)254 CHAPTER 6 PREDICTION, APPUCATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING ‘The o2 parameter was determined by simulation, It is fixed and not ‘determined on-line. On the other hand, the initial estimates of x are assumed to be determined on-line, There is no reason to waste the avail- fle measurement information just prior tothe fault It should be obvious from Eq, (64.1) that Q, = 0 for the voltage model. Also, as mentioned previously, R, is assumed to decay exponentially with k, and the expo- ential parameters are predetermined by simulation or experimental data, Current Model (same for all phases) 1. State equations: fe)-Bt Seb] 2. Measurement equation: a se fone “tn oft) a6 3, 3. Initial conditions: of 0 0 0 a 0 42 0 (648) Just as in the voltage model, a? is determined off-line andthe first two elements of &5 are obtained from measurerments just prior tothe fault. The third element ff fe (ie, the exponential component) is assumed to be zero. The measurement terror variance R, is assumed to decay exponentially just asin the voltage model ‘The Q, parameter is not zero, hough, because the exponential component was ‘observed in the similation studies to have a small residual nose associated with it, Ths is accounted for in the model with w,, and thus the "33" element of Q, is nonzero. In effect, 23 is modeled as @ nonstationary process with a large random initial valve, and then it relaxes to 2 Markov process with a relatively ‘small rms value in the steady-state condition. This is an unusual model, but perfectly legitimate, Ageia, this illustrates the versatility of the Kalman filter to ‘adapt to a wide variety of situations. Figures 6.5 and 6.6 show the voltage and current estimates fora particular simulation of a line-to-ground fault located 90 miles from the sending end. The details of the simulation are not important here, because the results xe all rel ‘ative. Recall that x1 and x2 ae the coefficients ofthe sine and cosine components (64 POWER GYSTEM RELAYING APPLICATION 255 ————— ee Figure 6.8. Kalman ter eaten of he pont vtage of the steady-state values, 2 they are constants, Note tha the Kalman fe estimates converge reasonably welt the correct vals ater about 8 ms (half Cycle a 6 Ha). 
igure 67 shows theresa of wing the volage and curent Cstimaes oft simulatono compute distances tothe fel A similar distance Saleen ws alo made ting curren and vlages as determined by adres Fore tanafom lg, Bath the Fourier transorm and Kalman Alter eu tre shown in Fig 6, andi lear tt the Klan er converges onthe Comet rest fe ha he oer algo This a expected. Te dct otro pc env fins yg ate of he noise, nor does slow for any pin knowledge ofthe partes bs Cstimated| Ofcourse, both algo converge tothe comect eu eventally Howover dines ofthe exenee! Why accep near perfomance when optimal pevtorance is realy svalable? Tmproveneats hive been made on tis basic scam, since i¢ was fist present in 1981 by Gigs and Brown (10). The renement involve adaptive Figure 66 Kiran sor esirton of he posto caret fase1256 CHAPTER PREDICTION, APPLICATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING 65 a | fet eon, Tine rate Figure 6:7_‘The crete tance to the aut wing he Ka fran alguien and the ceo Four ans Kalman filtering, though, so further discussion of this application will be de- ferred until Chapter 9 POWER SYSTEM HARMONICS DETERMINATION ‘A recent application of Kalman filtering in power systems has to do with the {determination ofthe harmonic content of the 60-Hz voltage and current wave- forms (18). Ideally, these waveforms should be pure sinuscids. However, tran sents induced by heavy loads being switched on and of" and electronically Controlled loads cause distortion of the waveforms. The amount of distortion is best described in terms ofthe harmonic content of the waveform (.e, the Fourier series components). The harmonics can change with time, o they need to be ‘monitored continuously, Of cours, ifthe harmonic content changes with time, fay, in a random wey, the waveform will not be toly periodic ina strict sense. ‘We will assume here that the waveforms that we are dealing with are at least (quasiperiodic, and we will be interested in tracking a sort of “average level” of, the vatious harmonics present, averaged over a few cycles of the 60-Hz fundamental. “The harmonic-process modeling for this application will follow closely the incphase and quadrature state model that was discussed in Section 5.2, There is (65. POWER SYSTEM HARMONICS DETERMINATION 257, ‘one essential difference, though. Here we are looking for a filter that will operate in a steady-state condition. The random process model in this case must not be Aeterministc. That is, each of the in-phase and quadrature components must be allowed to random walk, and this is accomplished in the model by including a white-noise forcing function for each state variable. The effect of this isto de- ‘weight older measurement data, and this allows the filter gains to approach a nontrivial steady-state condition, ‘We will now look at a specific example. In the interest of simplicity, let us say that we are primarily interested in estimating just the fondamental, 3rd, and Sth harmonics. ‘These are assumed to be the dominant components, For each Frequency we will have in-phase and quadrature components as state variables as discussed in Section 5.2. 
Our 6-state continuous model is then +[ 51) where 1, = in-phase and quadrature components ofthe fundamental 4% © i-phase and quadrature components of the 3rd harmonic 4% ~ nphase and quadrature components of the Sth harmonic jy ty ++ My = independent Gaussian white-noise foreing functions ‘The corresponding diserete process model is then eee (652) Bh Led, Led ‘The ¢ and Q, parameters of the filter are then 4 a 1 (6 x 6 identity matrix) (653) covlw, Ww, Wy, We Ws welr (diagonal 6 x 6 matrix) (6.5.4) ‘The process model is now complete except for specifying numerical values for tne clements along the diagonal of Qy and the step size A ‘The sampling rate inthe Kalman fer shouldbe at east as igh as wie tne highest frequency component (ie, dhe Nyquist rate in order forthe filter be competitive with FET methods. For this example we will let the sampling258 CHAPTER 6 PREDICTION, APPLICATIONS, ANO MORE BASICS ON DISCRETE KALMAN FILTERING rate be 32 samples per cycle of the 60-Hz fundamental. Therefore, the update interval will be 1132)(1/60) = 1/1920 see 653) “The measurement is scalar in this example, and it is given by the equation ty = 7,608 kat — xy sin Raabe + x, 05 Shanht ~ xy sin 3koshe “+ x ¢08 Skowr ~ Xe sin Skat + oy eo 656) where to = Dar 60 rad/see 1», = measurement noise (white sequence) It can be seen from Eq, (6.5.6) that the noise term v, must include every thing that has not been accounted for in the three harmonic components. Thus, if we think that there may be higher harmonics present, their effect must be lumped into v,. This, in turn, will reflect into the numerical value assigned to Ry This leads to some degree of suboptimality, but it makes the model man- ‘ageable.* The H, matrix is now obvious from Eq, (6.5.6), and itis Hy, = [eos kaake ~ sin kudt cos Skee — sin 3kaade cos Skadt ~ sin Skene] 5.) [Note thatthe sampling rate was chosen to be an integer multiple of the funda- ‘mental frequency. Therefore, the elements of H, will repeat themselves every 132 steps, This means that they can be precomputed and stored; they do not have {0 be computed on-line. "To demonstrate that the Kalman filter will wack changes in the harmonic content of a waveform, a Monte Carlo measurement sequence was created with ‘jump change in the 3ed harmonic inthe middle of the run. Specifically, z, was generated with MATLAB in accordance with the following equation: + Onision of hither order amos in the ste moe! would te @ mating err Gf hey were ‘cil presen) ati woul dograde the iter eum vlave o what woul be oboe thoy had teen Gore inclfed the model The effect of misdeling in ths polation is ‘Sitewha smiar to aliasing, wheat ecar oFFT spect analyst itis ot xa hese ‘Rreranple inthe prof at and there were 3nd barons pset Would ot be alted own torso fegacy compose, beta ve havea ned is component in oa sie ‘tout The tmodele30n¢ temic wad, however, deze the Kalman hie estimates of the Fnanem tad Sn ee comple sot of way. We wl a elaborate fre on he eave Theis of Kai fie ve FET methods are Ths dead sme deta the Cig eeence os. 65 POWER SYSTEM HARMONICS DETERMINATION 258
4-* acer Gor Papt eg [Mm ovpane Fria LSet a | hontina rede wi eu ea Figure 11 Comparison hee rs used in sunt er ara ‘The significance of the conceptual suboptimal filter no. 1 in Fig. 6.11 is ‘considerably more subtle than that for the optimal filter. We can imagine a hypothetical filter that uses the true model parameters, just as in the optimal filter, but instead of generating the gain sequence within the filter, suppose we luse & gain sequence that comes from some source external to the filter. The ‘gains may be suboptimal. Now, referring back to Chapter 5 and considering the derivation leading to Eq, (5.5.11), we find that if (1) Py is wuly representative fof the error in &;, and (2) the model parameters are correct, then P, as given by the general Prupdate equation, P, = (= KHQPr( ~ KB)’ + KRKE (67.1) is sepresentative ofthe error associated with the updated estimate, irespective Of the gain used inthe update equation. This means that if we use suboptimal fins from any source in suboptimal fier no. 1, and if we are careful to use the general Pupdate equation (ie, Eq. 6.7.1), then the calculated P, sequence has its usual meaning in tems of the errs associated with the suboptimal estimates so generated, Note expecially that suboptimal Alter no. 1 i assumed to operate with the ue ¢, Q,. Hy, Ry parameters nally, we come to suboptimal filter no. 2, the one that is actually imple- tented in el life, How good are its estimates? This is realy the central question of suboptimal analysis. We wll not ry to answer this question in general terms Rather, we wil Took in deal at ost the cases where We have implemented the wrong R, oF Qy, oF both Special Case (a): Incorrect Ry In this case, all the model parsmeters in the realife filter are assumed to be correct except for Ry. We allow it to differ from the te value in most any way, except thatthe implemented R must be symmetric and positive definite. Having an incorrect R, inthe implemented filter ‘will, of course, result in suboptimal gains and estimates. Now assume that the 67 OFF-LINE SYSTEM ERROR ANALYSIS 267 ‘suboptimal gain sequence that is generated in the implemented filter is cycled ‘through suboptimal filter no. I (the one with the correct model parameters) This being the case, the two filters will generate identical estimate sequences. This rust be so, because the estimates depend only on the estimate-update and estimate-projection step; these, in turn, depend only on the H, K, and ¢> par- meters that are the same in both the no. 1 and 2 suboptimal filters. ‘The P sequence coming out of suboptimal filter no. 1 will then give a meaningful ‘measure ofthe estimation errors inthe implemented filter (as well as for the no. 1 filter). This is the desired result in analyzing the suboptimality effect of im- plementing the wrong R,. (Note that the P matix generated by the implemented filter is meaningless in terms of mean-square error.) Special Case (b): Incorrect Q, All ofthe arguments used for special case (a) are also applicable to the case where Q, in the implemented filter is not ‘correct. If we use the suboptimal gains from the implemented filter in suboptimal filter no. 1, then the gains, , and HI, will be the same in both filters, and the two filters will generate identical estimate sequences. Thus, the P matrix coming ‘out of the no. 
I suboptimal filter will be meaningful of the estimation errors in the implemented filter, just as in the wrong R, case In summary, if either Ry or Qy (or both) is incorrect in the real-life implemented filter, the suboptimal gains so generated may be cycled through the truth model to obtain a P matrix that accurately describes the estimation ferors in the implemented filter, In doing this kind of suboptimal analysis, though, we must be careful to always use the general P-update formula [i.e P = — KH)P-( ~ KH)" + KRK" in the ruth-model filter Also, there is firm requirement thatthe ¢, and H, matrices inthe implemented filter be the same as inthe truth model. We will now look at two examples, one where this condition is satisfied and one where it isnot EXAMPLE 6.4 ‘We return to Example 6.3 in which @ random walk process was incorrectly ‘modeled as a random constant. One sample run of a random walk process was used to demonstrate the divergence phenomenon. However, this can hardly be called proof of divergence. We could, of course, make many more runs using new sels of random nombers, and then average the results to find the rms error ‘This would be doing the analysis the hard way, though. In this example itis only the Q, parameter that is incorrect in the implemented iter. Therefore, all ‘we need to-do is to consider the random walk model asthe truth model (subop- timal filter no. 1 in Fig, 6.11), and then feed the suboptimal gains generated by the implemented filter into the tuth model. This was done for the ist 15 steps ofthis example, and the resulting rms errr along with suboptimal gains is shown in Fig. 6.12, The optimal rms error is also shown for comparison. It can be seen that divergence does occur in this situation. In this example, it can be easily verified that the suboptimal filter approaches a steady-state condition where the ‘error variance increases by a fixed amount with each step. Thus, the rms error increases asthe square root of the number of steps as indicated in Fig. 6.12.268 CHAPTER 6 PREDICTION, APPLIATIONS. AND MORE BASICS ON OISCRETE KALMAN FILTERING Figure G12. Swope andysis fr endo walk oro, EXAMPLE 6.5 ‘An optimal prediction application to GPS was presented in Example 6.1. The fend result of the example was 2 plot of the rms prediction error for the range correction forthe optimal predictor. It is also of interest to compare the optimal results with corresponding results for a suboptimal predictor that is being, con- sidered for this application (17). The suboptimal predictor sinply takes the range and range-rate corections as provided at the start time, and projects these ahead ‘with a constant rate (much as is done in dead reckoning). This, of course, does not take advantage of any prior knowledge of the spectral characteristics of the SA process, "The continuous dynamic médel for the suboptimal preictor is xan 6712) where x is range end w(t) is white noise. [The PSD of w(t) does not affect the projection of x.) If we choose range and range rate as our state variables, the ‘continuous state model becomes. [] : [ 3 f:] * (i) wm) 6.73) Faw ‘The F matrix for the suboptimal model can now be compared with F for the ‘optimal mode! from Example 6.1. It is. Fas [2 vi «| (optimal mode) (674) The optimal system isthe truth model inthis example, and clearly, Fyy and Fy, 157 OFF-UNE SYSTEM ERROR ANALYSIS 269 are quite different. 
This means then thatthe , matrices forthe two models will toe different, and this precludes the use of the “recycling suboptimal gains” ‘method of analyzing the suboptimal system performance. Al isnot lost, though. In this simple situation we can return to basics and write out an explicit ex pression for the suboptimal prediction error. An explicit equation forthe error ‘covariance as a function of prediction time can then be obtained. ‘The optimal model is the SA model given in Example 6.1. Therefore, the true x at step k + Nis er 675) where N denotes the steps ahead, beginning at step k, and wy isthe white-noise contribution to the state vector that accumulates for N’ steps. The covariance associated with Wy is Qax(k + NB Lw.wi) 6.76) “The parameters ofthe continuous SA model are known, so by, and Qs, can be ‘computed as a function of N. "The state transition matrix forthe suboptimal dynamic model of Ea, (6.7.3) baal) [ a] oz) where 1 = numberof prediction steps AL = step size fr each step (st) "The predictive estimate produced by the suboptimal predictor is then 3k + MB al) 8) 618) But, we assume that we begin the prediction with a perfect estimate of x at RH) = 679) ‘We can now form the difference between the true x and its suboptimal estimate Using qs. (6.7.5) to (6.79) leads to en) san — 8 + (eal) ~ baal], + Wy (67.10) and‘270 CHAPTER PREDICTION, APPLIGATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING Figure 6413. Compson of ms proce eres fr otal fe subeptral poder Pay = FletWie(%)] = [sa ~ aD] BOD) % [osa(N) ~ Gaal + Qay(k +N (67.11) ‘The true process x is stationary, so E(x,x7) is easily evaluated using the methods ‘of Chapter 3. For the model parameters given in Example 6.1, we have Gomz 0 Baxd = [‘ o 136 ey Gee Now, al the remains to be done isto evaluate dy, aay and Qs, for N 0,1,2,-.- , and then use Eq, (67.11) to get the estimation eror covariances for the desired numberof steps. This is easily done with MATLAB. The desired ‘comparative resulls are shown in Fig. 6.13. For the fist few seconds of predic: tion, tere is very ite difference between the optimal and suboptimal predictor. However, as the prediction time inreases, the difference becomes more pro- nounced, Fora prediction time of 50 see (which was used as a reference point, in Example 6.1), the eror comparison is sbout 10 m for the optimal predictor 4S. 12.5 m for the suboptimal one. The difference is signicant, and the ime ovement in. going from suboptimal to optimal is achieved. simply with Software (68 PELATONSHIP 10 DETERMINISTIC LEAST SQUARES 271 an oversimplificaton, because the criterion for optimization is minimum mean- square error and not the squared error in a deterministic sense. There is, however, fa coincidental connection between Kalman/Wiener fitering and deterministic least squares, and this will now be demonstrated. The presentation here follows closely that of Sorenson (4). Consider a set of m linear equations in x specified in matrix form by Mx=b 68.1) In Eq, (6.8.1) we think of M and b as being given, and x is (w X 1), bis (m1), and thus M is (om n). Let us assume that m > n, and that x is over- determined by the system of equations represented by Eq, (6.8.1). Thus, no solution for x will satisfy all equations. This situation arises frequently in phys- {cal experiments where redundant noisy measurements are made of linear com- binations of fixed parameters. 
In such cases itis logical to ask, “What solution will best fit all the equations?” The term best must, of course, be defined and it is frequently defined to be the particular x, say x, that minimizes the sum of the squared residuals. That is, move b to the left side of Eq, (68.1) and substitute x, for x. This yields a residual vector e given by Mx,,—b=e 682) and Xo 8 chosen such that ee is minimized. A perfect fit, of course, would make €7 ‘We can generalize at this point and consider a weighted sum of squared residuals specified by ‘Weighted sum of] _ age — aye . [Retief] tg by WOM.) 6.83) ‘We assume thatthe weighing mauix W is symmetic and positive definite and, hhene, so i its inverse If we wish equal weighting ofthe residuals, we simply Jet W be the identity mates. The problem now is to find the particular x (Le, yp) that minimizes the weighted sum of the residuals. Toward this end, the ‘expression given by Eq. (68.3) may be expanded and differeniated term by term and then set equal o zro.* This eads to ‘ite arvave oe sas» wih wip oa veto Geil Be 68 a RELATIONSHIP TO DETERMINISTIC ae LEAST SQUARES AND NOTE ON “ ESTIMATING A CONSTANT Both Kalman and Wiener filtering are sometimes referred to simply a least- squares filtering (11, 12, 13). It was mentioned in Chapter 4 that this is somewhat (onicued on nest page) i272 CHAPTER 6 PREDICTION, APPLICATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING a [x},(M"WM)x,,, ~ b'WMx., ~ x3,.M7Wh + bb] 20M WM x, ~ WMD" — MTWh = 0 (684) Eqoation (68.4) may now be solved for Xe The result i Xu = [WM MW] 685) ‘and this isthe solution of the deterministic least-squares problem, [Next consider the Kalman filter solution for the same measurement situa- tion. The vector xis assumed to be a random constant, so the differential equa- tion for x is ‘ 686 ‘The conesponding discrete model i then mann to (682) “The measurement equation i ae ean 688) where , and HY, play the same roles as b and M in the deterministic probe. Since time is of no consequence, we asume that all measurements occur si- rmulaneously, Furthermore, we asume that we have no a rir knowledge of Xx, 30 the intial il be zero and its associated eror covariance will be =. ‘Therefore, using the alkemative form ofthe Kalman filter (Section 62), we have = (ey) + HERS, = HERSH, 689) ‘The Kalman gein is then Fanon ioe ace el oe Ee $0 ann Gorse 8) Both ofthese fons can be veri by wing out a few salar ems ofthe mst expesions ‘nd wring oiary ifeenaion mead, (68. FELATONSH TO DETERMIGTICLEAST SQUARES. 278 Ky = (HGR 'H"HGR5! and the Kalman fiter estimate of x at = 0 is Sy = (HERG) "HER ey 68:10) ‘This is the same identical expression obtained for x. in the deterministic least- squares problem with R” playing the role of the weighting matrix W. Let us now recapitulate the conditions under which the Kalman filter esti- ‘mate coincides with the deterministic least-squares estimate. First, the system slate vector was assumed to be a random constant (the dynamics are thus trivial). Second, we assumed the measurement sequence was such a to yield an over determined system of linear equations [otherwise (FISR5'H,)~" will not exist]. ‘And, finally, we assumed that we had no prior knowledge about the constant vector being estimated. This latter assumption is unusual because in many sit uations we have atleast some a priori knowledge of the process being estimated. 
One of the things that distinguishes the Kelman filter from other estimators is the convenient way ia which it accounts for this prior knolivedge via the initial conditions of the fecursive process. (This was used 10 good advantage in the power system relaying application of Section 6.4.). OF course, if there is traly to prior knowledge to use, the Kalman filter sdvantage is lost (in this respect), and it degenerates to a least-squares fit under the conditions just stated “The coincidence in the deterministic least-squares and Kalman filter esti- mates is really rather remarkable. Remember, one solution was obtained by posing a deterministic optimization problem, the other by posing a similar so ‘haste problem. There is no reason offhand to think these two approaches would Tead to identical solutions. Yet they do under certain circumstances. The circum- stances may be generalized somewhat from thase of this example, but not to the complete extent of the general process model used in the Kalman filter. [See Sorenson (12) for more on this point] Thus, this happy coincidence in the two solutions will not always exist ‘There are subiletes that need to be recognized when using Kalman filtering tocstimate a constant. These are related to deterministic least Squares. A simple example will illustrate the significance of the Kalman filter estimate in the ‘unknown-constant estimation problem, O66 ‘Consider an elementary physics experiment that is intended to measure the grav ity constant g. A mass is relessed at ¢ = 0 in a vertical, evacuated column, and ‘muliple-exposure photographs ofthe falling mass are taken at .05-sec intervals beginning at ¢ = .05 sec. A sequence of N such exposures is taken, and then the position of the mass at each time is read from a scale in the photograph, ‘There will be experimental erors for a number of reasons; let us assume that they are random (i.e.,not systematic) and are such thatthe statistical uncertin- ties in all postion readings are the same and thatthe standard deviation of these ‘Consider g to be an unknown constant, and suppose we say that we have no prior knowledge about its value. (We will elaborate on this assumption later274 CHAPTER. PREDICTION, APPLICATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING in the example.) We wish to develop a Kalman fiter for processing the noisy position measurements. We begin our model by leting g be the single state variable x. The discrete state equation is then xe = x tO ay ‘The measurement sequence 2, is related to x, via the equations 2) = Ges, + oy (the first measurement is at, not 4) B= Gn te y= Geldie # Uy (68.12) ‘The filter parameters are then bh A= 0 H=38, Rae (68:13) “The measurement sequence begins at Joop at k= 1 with the initial conditions 105 see, so we enter the recursive a =0 Py! =< (ao prior knowledge of 2) (68.14) ‘The alternative algorithm is weful here to get the recursive process started ‘Therefore, we update P," fst using Ea. (62.6). re sant ean (2) ae or pean (6.8.15) “Te Kalman gun is competed next sng Eg (62.7) y= PaHTR = 68.16) Finally the estimate is updated using the usual update equation: 69 DISCRETE KALMAN FRTER STABRITY 275, a-0+(2)e-9-(2)a (68.179 “The a posterior Pis finite, so we can now project ahead tothe next measurement and proceed with the recursive processing using either the regular algorithm (Chapter 5) or the alternative algorithm (Chapter 6), This is routine, so we will not pursue the algebra further here. 
It is easily demonstrated that in these cit- ‘cumstances the Kalman filter estimate obtained at each step is identical to that ‘obiained from deterministic least squares and batch processing the dala (see Problem 6:7) "This example is similar to Example 6.2 in many respects, but there are subtle differences that warrant further elaboration. In Example 6.2 the process envisioned was an ensemble of random constants (see also Example 2.5). Thus, ‘when we think of this in terms of a statistical experiment, each time function in the ensemble is constant with time, but their amplitudes are different in that they are sample realizations of a zero-mean random variable with a lage variance. “Thus, the setting for Example 6.2 is truly a stochastic seting, and all the con- clusions about minimizing the mean-square error, and so forth, are applicable. [Now contrast this with the setting for the gravity-determination experiment of the present example. Not only is it unrealistic to say that we have no prior ‘knowledge of g, but the unknown g is aot really properly modeled as a random ‘process. Presumably, if we were to repeat our elementary physics experiment ‘over and over again atthe same location, we would have the same g each time the experiment was performed. Thus, the stochastic setting here is different from the one for Example 6.2. The gravity-experiment measurement noise may be random, but the quantity being estimated (even if unknown) is not. Now, there is nothing wrong with applying the Kalman ite algorithm inthis setting, After all, i¢ can be viewed as just one of many arithmetic rules for processing nu- ‘merical data, The faut, if there is fault, comes with the interpretation of the results. In Example 6.2 we have a right to expect the Kalman filter to minimize the mean-square error in a truly stochastic (ensemble averaging) sense. In the present example we cannot draw the same inference. We do know in the present ‘example that the resulting estimate is identical with the ordinary deterministic least-squares estimate (see Problem 6.8), but beyond this we should not draw hasty conclusions about optimality. a DISCRETE KALMAN FILTER STABILITY ‘A Kalman filter is sometimes referred to as a time-domain filter, because the design is done in the time domain rather than the frequency domain; of course, ‘one of the beauties of the Kalman filter is its ability to accommodate time> variable parameters. However, there are some applications where the filter, after ‘many recursive steps, approaches a steady-state condition. When this happens land the sampling rate is fixed, the Kalman filter behaves much the same as any ‘other digital filter (20), the main difference being the vector input/output prop-276 OHAPTERG PREDICTION, APPUCATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING eny of the Kalman filter. The stability of conventional digital fiters is easily fnalyzed with z-transform methods. We shall proveed to do the same for the Kalinan filter, ‘We begin by assuming that the Kalman filter under consideration has reached a constant-gein condition. The basic estimate update equation is repeated here for convenience: 8, = 87 + Ku ~ BS) (691) We first need to rewrite Eq. (69.1) as a first-order vector difference equation, ‘Toward this end, we replace 8; with, ,8,_, in Eq, (69.1). The result is 8 (y-5 ~ KBs 8-0 + Kae (692) ‘We now take the z transform of both sides of Eq. 
(6.9.2) and ote that retarding 4, by one step inthe time domain isthe euivalent of multiplying by z~ in the ‘domain, This yield (i the -domain) KO = (a ~ Kb) KO + KZ 693) Or, after rearranging terms, we have [el = Ge ~ Keb JRO = KZ) 694) [We note that in Eqs. (6.9.3) and (69.4), italic x denotes the usual z-transform variable, whereas boldface Z, refers to the z-transformed measurement vector] ‘We know from linear system theory thatthe bracketed quantity on the left side of Eq. (6.24) describes the natural modes of the system. The determinant of the bracketed n % m matrix gives us the characteristic polynomial for the system, that is, (Characteristic polynomial = lel ~ (by-1~ Keabs.d| (69.5) and the roots of this polynomial provide information about th: filter stability. If all the roos lie inside the unit circle in the z-plane, the filter is table; conversely, if any root lies on or outside the unit crcl, the filter is unstable. [As @ matter ‘of terminology, the roots of the characteristic polynomial are the same as the eigenvalues of (,- ~ KH,d,.,)|- A simple example will illustrate the use- fulness of the stability concept EXAMPLE 6.7 ———— Let us retura to the random walk problem of Example 6.3 and investigate the stability of the filter in the steady-state condition. The discrete model in this example is 610 OETERMINISTCINEUTS. 27 Sen He 69.6) Bete 97 and the discrete filter parameters are Qa Me Q Rad Pe &=0 In this example the gain reaches steady state in just a few steps, and itis easily verified that its steady-state value is Ky 916 ‘We ean now form the characteristic polynomial from Eq. (6.9.5) Characteristic polynomial = z ~ [1 ~ ¢916)1(1)] =~ 084 698) The characteristic root is at 084, which is well within the unit eile in the plane, Thus we see thatthe filter is highly stable in this ease "Note that even though the input inthis case is nonstationary, the flter itself is intrinsically stable, Furthermore, the filter pole location tells us that any small perturbation from the steady-state condition (e.g., due to roundofT esr) will ‘damp out quickly. Any such perturbations will be atenvated by a factor of 084 ‘with each step in this case, so their effect “vaporizes” rapidly. This same kind of reasoning can be extended to the vector case, provided thatthe P matrix is kept symmetric in the recursive process and that it is never allowed to lose its positive definiteness. Thus, we see that we can gain considerable insight into the filter operation just by looking at its characteristic poles in the steady-state ‘condition, provided, of cours, that a steady-state condition exists a 6.10 DETERMINISTIC INPUTS. In many situations the random processes undet consideration are driven by de- terministic as well as random inputs. That is, the process equation may be of ‘the form &= Fx + Gu} Buy (6.10.1) where Bu, is the additional deterministic input. Since the system is neat, we can use superposition and consider the random and deterministic responses sep- arately. Thus, the discrete Kalman filter equations are modified only slightly. ‘The only change required is in the estimate projection equation. In this equation278 CHAPTER PREDICTION, APPLICATIONS, AND MORE ASICS ON DISCRETE KALMAN FILTERING the contribution due to Buy must be properly accounted for. Using the same zero-mean argument as before, relative to the random response, we then have Bi = b+ O+ | Ho DBOUAA de (6.10.2) were the integral term isthe contribution du to Bu, inthe interval (ht. 
The associated equation for P;., 8 (Pl + Qs before, because the tn- certain in the deterministic etm i ro” Also, the estimate update and as30- sintedcovariance expressions (eee Fig. 5.8) are unchanged, provided the deterministic conribugon has been properly accounted for in computing the 8 jor esimate PROT another way of accounting forthe deteministic inp is to treat the problem asa soperpositon of two enitely separate estimation problems, one determin. {ste and the oct random, The dterinisone is tvia, of Cows, and the, random one isnot trivial. This compete separation approach isnot necessary, thouth provided one propery accounts forthe deteministie contribution inthe projection sep REAL-TIME IMPLEMENTATION ISSUES ‘One of the more attractive features of the Kalman filter is its recursive nature ‘and its modest use of memory storage. This aspect of the Kalman filter makes i a very useful data processing tool in real-time applications. In such applica- tions, the computations of the Kalman filter must take less time to execute than the time interval containing the total number of measurements processed by that Kalman filter. I this were not true, one of two things could happen: The Kalman filter processing will only process a reduced number of measurements in order ‘to keep up with the progression of time and ignoce the remaining measurements, or the Kalman filter will insist on processing everything. presented to it and fradvally fall behind in the timeliness of computing its solution. The former is Still @ real-time filter but may be a suboptimal one. The later is certainly no longer considered a real-time filter. In this section, we shall focus on Kalman filter implementation issues in the context of being both real-time and optimal. Data Latency In the real-time world, itis impossible to expect the processing of a solution to be completed instantaneously at where the measurements are made. ‘There is always a finite time delay that must be accommodated. This latency may be associated with the delivery ofthe measurements, the Kalman iter computation time, and also the conveyance of the solution to where itis needed (see Figure 6.14). In general, with transmission and processing delays, the solution latency ‘may constitute a significant fraction of the time interval between measurements. 611 REALTIME MPLEMENTATIN ISSUES 270 Figure 6.14 Tine for ayia rtm pressing wah Echscn atncy ‘To present a timely solution at the point when the solution is ready, one alternative is to project the Kalman filter solution during the processing to a time in the futur; the solution presented at that point in time would be timely. “The penalty one pays for this timeliness is thatthe error covariance associated ‘with the solution may not be at the lowest possible value because the solution propagation constitutes an additional prediction error. The choice of where to ‘project this solution is up o the designer's judgment. However, one convenient choice is to project the solution to the time point where the next measurement is expected to be made (see Figure 6.15) because this computation already exists a part of the normal Kalman filter eycle. The solution can simply be derived from the projection of the state vector without the need to project to an inter mediate time point Processor Loading ‘The determination of how much the processor is being exercised when running, Kalman filter in real ime is commonly known as loading ot throughput anal- ysis. 
Such studies are ad hoc in nature because of dependencies on the type and Speed of the processor used, It is not within our scope here to pursue such an analysis. In general, ifthe entice processor is dedicated solely to Kelman filter EE. rea Figure 6.15 ‘Timeline x “tray projected scion wih no soon ‘So,280 CHAPTERS PREDICTION, APPLICATIONS, AND MORE BASICS ON DISCRETEKALMAN FILTERING ‘computations, he throughput analysis is quite straightforward, The total amount of time needed to execute the entire algorithm fora single cycle of the Kalman fiter can be derived from the number of additions and multiplications needed to implement the matrix equations associated with the Kalman filter. How this Tength of time fits into the interval between the arrival of measurements will determine the viability of the real-time processing. In most real-life situations, however, the processor that runs the Kalman filter is also tasked with running other functions that are sometimes more critical and also more time-consuming than the Kalman filter itself. In such multitasking environments, the answer to the feasibility of running a Kalman filter in realtime is much less obvious. “To lower processor loading, computational efficiency may be improved by taking advantage of matrix sparseness or matrix symmetry waere such charac- teristics appear. Particularly in systems where the dimensioralty is high, the Kalman parameters &, Q, H, and R are generally quite spa-se, containing in them many zero- and singular valued elements. To exploit the sparseness, certain tmattix equations of the standard Kalman filter will have to be waitten out ex- plivily rather than as generalized matrix algorithms. The following example Hlustrates the cnuparisin Between the two approaches for the state projection ‘of a 3-tuple state vector ‘The generalized computation would be Binoy © bu Su t da’ ba t by “Ba Binoy banc * Gan En + Say “Suey Pin at Baer = Du Se + aba t+ Oy “By, Suppose that ‘The specialized computation would then be fa } ae ‘The computational savings inthis simple example are significant tis generally the case that additional software code must be generated to implement a spe- cialized matrix computation. Inthe example above, however, it turns out that the software code needed to implement the specialized computation is actually simpler than the generalized one. ‘Another simplification that may potentially be exploited involves systems whose measurement availability and rate are high. In such systems, the error covariance and the corresponding Kalman gain do not change much from one cycle to the next, If this isthe case, then instead of updatirg these quantities 632 PensPecIVe 281 every cycle, doing so at a lower rate will result in a substantial amount of ‘computational savings. This approximation should not cause any instability as Tong the system remains observable from a steady rate of new measurement information State Estimates and Error Covariance Propagation In systems where the time interval between measurements can vary substantially, there are at least wo ways of mechanizing the propagation ofthe state estimate ‘vector & and the error covariance matrix P,. One way is to define the funds- ‘mental (shortest) time interval between measurements and propagate consistently using values of the state transition matrix ¢ and the process noise covariance Q computed fora fixed &t. 
This approach is well suited to a real-time environ iment where itis usually important to maintain regularity in the computa cycles for the sake of consistency. In other words, even if there are no mea- Strements available for a particular Kalman filter cycle, the state estimate and terror covariance propagation will bo caried out after a trivial update of those very quantities (without need to compute the gain, the update equations are simply &, » $j and P, = Pj). Inthe absence of measurements, the state estimate ‘and error covariance ae simply propagated repeatedly. However, this continuous ‘propagation, if carried out over a large number of cycles, may result in problems ‘of numerical stability. In other words, the drawback of this approach becomes tevident when the time interval between measurements is large compared to the fundamental computational time interval, ‘A second way to handle variable propagation time intervals isto specially compute the appropriate state transition and process noise covariance matrices for the entire time interval needed for the propagation. The difficulty of accom modating the “on demand” computation of ¢, and Q, may vary depending on how each parameter is derived. In any case, for a real-time implementation of this approach, a slight adjustment to the sequence of computation is in order. ‘We have, up to this point, been considering the propagation equations to be made aftr the state estimate and error covariance updates in the processing cycle (Gee Fig. 5.8). At that point of the processing cycle when the propagation equa- tions are to be computed, unless the availability of the next measurement is predetermined, the stretch of time over which the stete estimates and error co- ‘Variance must be propagated is indeterminable, Instead, itis usually best to defer the propagation computations to the stat of the following computational cycle atthe time when the measurements next become available. 6.12 PERSPECTIVE We have now presented the basies of Kalmen filtering and looked at a few ‘examples of how the technique can be applied in physical situations. More is yet to come in the remaining chapters. However, this is a good place to pause282 CHAPTER 6 PREDICTION, APPLICATIONS AND MORE BASICS ON DISCRETE KALMAN FILTERING for a moment and reflect on just what we have (and do not have) with this thing wwe all a Kalman filter 1. The Kalman filter is intended to be used for estimating random pro- cesses. Any application in a norandom setting must be viewed with caution (see Example 6.6). 2. The Kalman fier is model-dependent, This is to say that we assume that we know a priori the model parameters. These, in turn, come from the second-order “statistics” of the various processes involved in the application at hand. Therefore, in its most primitive form, the Kalman filter is not adaptive or sel-leaming. 3, The Kalman filter isa linear estimator. When all the processes involved are Gaussian, the filter is optimal in the rinimum-mean-square-error sense within a class of all estimators, linear and nonlinear. [See Meditch (2) for a good discussion of optimality.) 4, Various Kalman filter recursive algorithms exist, The “usual” algorithm ‘was given in Chapter 5, an alternative one was presented in Chapter 6, and a third one (U-D factorization) will be presented in Chapter 9. All of these yield identical results (assuming perfect arithmetic). 5. 
Under certain special circumstances, the Kalman filter yields the samme result obtained from deterministic least squares (see Section 6.8). 6. Kalman filtering is especially useful as an analysis tool in off-line error analysis studies. The optimal filter error covariance equation can be propagated recursively without actual (or simulated) measurement dat ‘This is also tue for the suboptimal filter with some restrictions (see Section 6.7). With these brief comments in mind, we are now ready to proceed to vat ations on the original discrete Kalman filter. It is worth mentioning that the discrete filter came fist historically (1960). The continuous version and other Variations followed the diserete filter PROBLEMS 61. The process of landing on an aircraft cartier is @ highly complex operation primarily because the carrier deck is constantly in motion with a certain degree of randomness that ig atisbutable to wind and sea conditions. In particular, one ‘motion called heaving changes the vertical displacement of the carrier deck. ‘Accurate prediction of the heave motion even 10 to 15 sec into the future will significantly enhance the success of the landing operation. In a paper published in 1983, Sidar and Doolin (21) suggested using Kal- man fiter methods to predict the motion of the carrier deck. On the basis of ‘empirical data, they developed a power spectral density (PSD) for the heave ‘motion, and then they worked out an optimal predictor based on this spectral model. "The functional form for the PSD to be used here comes from the Sidar-Doolin paper, bt the amplitude factor has been changed for convenience, ‘Also, the measurement noise variance R, and the sempling interval used here tre hypothetical. Thus, there is no claim thatthe results ofthis problem represent an exact real-life situation, PROBLEMS 283 Uoty nite nie 7 eae etn Problem 6 “The random process model forthe heave motion is described in the accom. panying igure. Fst note thatthe undamped natural frequency forthe wan fer function in the figure is V.36 = .6 rad/sec. The damping ratio is then ‘06/@).6) = 05. Ths, the heave motion isa relatively narowband noise pro fess with mast of its spectral content concentated around 6 rad/sec (or about 1H), {@) Fist, develop a suitable continuous state model forthe heave motion. Choose the scale factor c such thatthe steady-state rms heave motion 452m. Note that fortis proces there is some rskin choosing velocity 4 one of the state variables. Theoretically it has infinite variance. (See Section 5.2 on modeling processes with rational PSD functions) (b) Now design a Kalman fiter/predictr fr the model developed in par (@). The sampling interval is to be 1 see, and R, = 1 me The mea Surement sequence can be thought of a6 coming from s soure com pletely independent of the landing operation (eg, GPS). For this Station, ind the ms prediction ertor for steps ahead for N = 0,1, 2). +20, Plo the rst. 
The plot should show a significant reduction of estimation eror for prediction times around 10 sec, as compared withthe 2-m ims eror that would exist without prediction 62 Consider the measurement to be a 2-uple [2,2 and assume that the Measurement eror ae corelated such thatthe Rmatix sof the form a 5] (o For «sew meet pi and «Inet conan of erga pat mane be ctée a ok gor pes sono eee eieeee dence a eae ee ce ——) 0) Fb Oe mack seciacd wi ac ew V aon Oe ee a ee epee peer Se ane cae mashes wien me ad a treo Sees oe eae tecl ome eee ee eines the pal me ate of he pos Gt) Deine mex een fr Pn rms of Py ad the & Hy, Pcs me cet ur cocmneecetl ual pa eli rappin fn expen) 0) Peds te titetce quan fr Py in tems of 5 and he eee284 CHAPTERS. PREDICTION, APPLICATIONS, AND MORE BASICS ON DISCRETE KALMAN FLTERING 65 It was mentioned in the power system harmonics example (Section 6.5) that the gain sequences become periodic in the steady-state condition. Verily this by programming the 6-state Kalman filter model using the same parameter values given in Section 6.5. Let the Kalman filter eycle throug enovgh steps to reach a steady-state condition. Plot the gain sequences for the fundamental, 3rd, and Sth harmonics on three separate graphs with the in-phase and quadrature ‘gains superimposed on each graph Note the quadrature relationship between the ‘to gains for each harmonic. 6.46 Suboptiial filter analysis is often used in sensitivity analysis. Ths is where the analyst wishes to assess the sensitivity of the system enor to changes in certain parameters. In the random walk example of Section 67 (Example 6.4), the tue value of Q was sid to be 1.0, the ue R was 1, and the ms estimation ‘error worked out to be about 302668. (a) Now suppose the designer incoréctly models the filter to be used in this application using Q = 1.1 rather than the true value. The filter will then be suboptimal to some extent, Compute the filter ems error for this suboptimal situation, (©) Suprise the designer errs in the other direction and chooses the filter Q to be 0.9 rather than the true value of 1.0. Find the filter ems error {or this situation (© The percent variation of Q on either side of truth in pars (a) and (b) js about 10 percent. What is the corresponding percent variation in the rims estimation error? Would you call this a high or low sensitivity (ote: Convergence to the steady-state condition is quite rapid inthis example, ‘Thus, the analysis can be carried out easily with a hand-held zaleulator) 6.7 In the power systems harmonics application discussed in Section 6.5, itis envisioned that the numerical values of Q, and Ry would be fixed, that is, the filter is not self-learning. The value assigned to the Ry will éepend on the ac- curacy of the measurement equipment used in the field, anc a ferly reliable ‘value can be assigned Co this parameter, The same is not so fo: Q,. This para ter is associated with the random variations of the harmonic content of the signal, and thus the value assigned to @ is bound to be more “fuzzy” than the value assigned to R,. Therefore, itis of interest to assess the filter's sensitivity to the Q, parameter, ‘Assume thatthe numerical values used in Section 6,5 are the true parameter values. Now suppose that the 95, and qu values (for the 3rd narmonic compo- rents) actually implemented in the on-line filter are greater than the true value by a factor of 4. 
Using the suboptimal methods discussed in Section 6.7, assess the effect of this mismodeling on the ems error in the estimates ofthe in-phase and quadrature components of the 3rd harmonics. On the basis of your inves- tigation, would you say this isa low- or high-sensitivity situation? 6.38 The recursive process in Example 6.6 was carried through only one step. Continue the process through a second recursive step, and obtain an explicit expression for 2 in terms of z, and z,- Next, compute the ordinary least-squares estimate of g using just two measurements'z, and z,. Do this on a batch basis using the equation PROBLEMS 285 g= EH ea " (4) . Compare the least-squares isut with that obtined ater caring the Kalman fterecursive proces though tw sep 65 tn Example 66 the inal positon and velocity forthe falling mass were Scrum (oben that hey were ena za0; the grvityeonsant as renamed to be uw ano be estimated Boe (14) pest an teresting Meson on this problem where te sivaton is reves, sumed t be {own perfectly andthe tia poston and soc are random variables with frown Gatsin bution. Even thovgh the taco obey» own dtr inate equation of motion, he random inal contons 280 sifiientuacer tuny to the mio o make te trajectory a eitmate eadom proces. Assume the he nal positon and veloc ave normal random variables described by Me ae ee, Lat nate eben andy be sion an wee incase downvrtr, nd Tt te measurement ake pac a norm intervals Er beianng t= 0The measurement erorvarante sR. Workout the Hey parasters forthe Kalman fier novel for his station. Tha i, fd dy Fira dental Re and Pe (Note hata deterministic ering nso has tobe accouted fr in this example. The efecto hin fring Fonction, though, wl apenrin projecting Sy but notin te model peat) 610. The scalar Gaus-Marhov modst hasbeen wed extensively ia bth Chap thn 5 and 6. A more general second-order Gauts-Mkov mode ha requies two state variables also used frequen in Kalman fier aptcatons. This proces ha amped fonts stosonlation fone, andor spel version Brn type af roves can e genereted with the shaping filer shown in te tccomparing gre. Note tha thre parameters desrbe this process com- pletely oe variance: ey the undamped natura requeney; andthe dm tng sto Consider a second-order Gansian process yas described inthe figure with the following parameters ‘where 0, = Wunits, ay = O2rad/sec, = 5 ‘This will be referred to asthe “signal” in our estimator. We wish now to Took: at a discrete Kalman filer where we have a direct one-to-one measurement of y comupted by measurement noise, The measurement error inthis case consists {f an additive combination of a scalar Gauss~Markov process and a Gaussian White sequence. The parameters for the measurement noises are as follows: Scalar Markov component: 0, = 10 units, , = 002 rad/see White components ayye = 10 units ‘The sampling interval is 10 see.286 C§PTEN6 PREDICTION, APPLICATIONS, ANID MORE BASICS ON DISCRETE KALMAN FILTERING vot —— a = ered it We wish to determine the steady-state rms estimator error for this situation. ‘Using appropriate covariance analysis software, eycle the Kalman filter through the equited numberof steps to reach a steady-state condition. Do this two ways: (@) Fis, initialize the filter P matrix by assuming that the initial state estimate is set equal to zero and that the system state isin a stationary condition when th filter begins to process measurements. (See Problem 59) (©) Next, arbitrarily initialize Py to be zero (more precisely the mull ma- trix). 
Then run the error covarcnce program to the steady-state con- dion and compare the result with that obtained in part (a). This might be thought of as the lazy way to solve the problem (but effective! 61L The third-order Kalman filter of Problem 6.10 works out to be stable in the steady-state condition. (The gain matrix approaches a constant as the number ‘of steps becomes large.) A method for finding the characteristic roots (cigen- values) fora filter with constant gain was discussed in Section 6.9. Find the characteristic roots forthe system of Problem 6.10 and comment on the number of steps required forthe filter to reach a steady-state condition (say, within about 99 percent of the final value). Are the results of this problem consistent wit the empirical results observed in part (b) of Problem 6.107 OO _— wis 612 The accompanying block diagram shows two cascaded integrators driven by white aoise. The two state variables x, and x, can be thought of as position tnd velociy for convenience, and the forcing function is acceleration, Let us suppose that we have a noisy measurement of velocity, but there is no direct, ‘observation of positon. From linear control theory, we know that this system is ‘not observable on the basis of velocity measurements alone. (This is also obvious fiom the ambiguity in intial position, given only the integral of velocity.) Clearly, there will be no divergence of the estimation error in x», because We havea direct measurement of it. However, divergence of the estimation error of 1, not so obvious. ‘The question of divergence of the erro in estimating x is easily answered empiccally by cycling through the Kalman filter error covariance equations until either (a) a stationary condition for p,, is reached, or (b) divergence becomes, obvious by continued grovih of p,, with each recursive step. Perform the sug ‘gested experiment using appropriate covariance analysis software. You will find the following mumerical values suitable for this exercise: Power spectral density of f(2) = .1 (m/sec?) ¥(rad/sec) Step size Ar = 1 sec Measurement error variance = 01 (m/sec)? REFERENCES 287 sie yo ‘s Problem 6:12 Note that if divergence is found, itis not the “fault” of the filter. It simply reflects an inadequate measurement situation. This should not be confused with ‘computational divergence. REFERENCES CITED IN CHAPTER 6 1. H.W, Sorenson ed), Kalman Flering: Theory and Application, New York: THEE, res, 1985, 2.5.8. Meditch, Stochastic Optimal Linear Estimation and Control, New York ‘McGraw Hil, 1969, 3, A. Gelb (ed), Applied Opral Estimation, Cambridge, MA: MIT Press, 1974, 4 LW. Sorenson, “Kalman Filtering Techniques,” iC. T. Leondes (ed), Advances in Control Systems, Nol 3, New York: Academie Press, 1966 5. RP Dena, "Navsir: The AlLPurpose Satelite," IEEE Spectrum, 18(5):35-40 (May 1981). 6. B. W. Pukiason and S. W. Gilbert, “NAVSTAR: Global Positioning System—Ten ‘Years Late" Proc. IEEE, 71(10):1177=1186 (October 1983). 4. RG Brown, “Inegraied Navigation Systems and Kalman Fitering: A Perspective,” Navigation. Inst. Navigation, 19(4):385~362 (Winter 197273. 8. J.D. Salisbury, "Comments on Integrated Navigation Systems and Kalman Fieing: ‘A Perspective” Navigation, J. Inst. Navigation, 20@):190 (Summer 1973) 9. A.A. Girgs, “Application of Kalman Filtering in Computer Relaying of Power Sys teins." PhD. disseration, Towa Stat University, Ames, 1981 10, A.A. Gigis and R. G. Brown, “Application of Kalman Filteing in Computer Re- laying” IEEE Trans. 
Power Apparatus Sys, PAS-100)3387-3397 (lly 1981). 11, HW Bode and C. E, Shannon, “A Simplified Derivation of Linear Least Squares Smoothing and Prediction Theory;” Proc. LR, 38:417-424 (April 1950) 12, HW Sorenson, “Least Squares Estimation: From Gauss to Kalman” IEEE Spec trary, 7:63-68 (uly 1970). 13, T Kailath, "A View of Three Decades of Linear Filtering Theory." IEEE Trans Information Theory, 12202): 146-181 (March 1914). 1M, $M. Bore, Digital and Kalman Filtering, London: B. Amold, Publishes, 1979. 15, Global Positioning System, Vol. IV, The intiute of Navigation, Alexandria, VA, 1993. 16, “Change No. to RTCA/DO-208;" RICA paper no. 479-98/TMC-106, RTCA, Ine, ‘Washington, DC, Sept. 21, 1993 17, BY.C Hwang, “Recommendation for Bakncement of RTCM-104 Differenil Stan- tard and les Derivatives,” Proceedings of ION-GPS-03, The Institute of Navigation, ‘Sept. 22-24, 1983, pp. 1501-1508. 18. ALA. Gigs, W. B. Chang, and EB. Makram, “A Digital Recursive Measurement Scheme for On-Line Tracking of Power System Harmonics.” IEEE Trans. Power Delivery 63, 1153-1160 Gly 19D. 19, M.S. Grewal and A. P. Andrews, Kalman Filtering Theory and Practice, Englewood Cif, NF: Prentice Hal, 1993 (se Section 4.2 and Chapter 6). 120, A.V. Oppenheim and R. W, Schafer, Discrete-Time Signal Processing, Englewood Cif, NI Prentice-Hall, 1989.268 CHAPTER 6 PREDICTION, APPLICATIONS, AND MORE BASICS ON DISCRETE KALMAN FILTERING 21, M.M. Sidr and B. F. Doolin, “On the Feasibility of Real-Time Prediction of Aireraft (Carrier Motion at Sea," IEEE Trans. Automatic Control, AC-28, pp. 390-355 (March 1983). (Also reprined in H. W. Sorenson, ed, Kalman Filtering: Theory and Appl cation, New York: IEEE Press, 1985.) Additional Reference on Applied Kalman Filtering 22, G. Minkler and J. Minkle, Theory and Application of Kalman Filtering, Palm Bay, FL: Magellan Book Co, 1995 The Continuous Kalman Filter ‘About a year after his paper on diserete-data filtering, R. B. Kalman coauthored a second paper with R. S. Bucy on continuous filtering (1). This paper also proved to be a milestone in the area of optimal filtering. Our approach here will bbe somewhat different from theirs, in that we will derive the continuous filter equations as a limiting case of the discrete equations as the step size becomes small" Philosophically, itis of interest to note that we begin with the discrete equations and then go to the continuous equations. So often in numerical pro- cedures, we begin with the continuous dynamical equations; these are then dis cretized and the discrete equations become approximations of the continuous
H,PyHY lead to K, = PyRICHPGH, + R/AQ~! = PHOR™ ar ‘We can now drop the subscripts and the super minus on the right side and we obtain K, = HR) Ar us)292 CHAPTER 7 THE CONTINUOUS KALMAN FLTER We define the continuous Kalman gain as the coefficient 0° At in Eq. (7.1.15), that is, KS PHR @.116) Next, we look at the error covariance equation. From the projection and update equations (Fig. 5.9), we have Poy = QPOl + Q, = dl ~ KADPS AT + Q, OELOT — OKMP SO! + Q, ay We now approximate , as T+ RAs and note from Eq. (1.1.15) that Ky is of | the order of Az. After we neglect higher-order terms in A, Eq. (7.1.17) becomes Pi = PP + FPS e+ POPTAR— KPE + OQ, (118) ‘We next substitute the expressions for K, and Q,, Eqs. (7.115) and (7.1.9), and form the finite difference expression pPy + PEF PPR HP; +GQG" (7.1.19) ae Finally, passing to the limit as Ar ~+ 0 and dropping the subscripts and super minus lead to the matrix differential equation P =P + PF’ PIPR“HP + Gog” (7.4.20) RO) "Next, consider the state estimation equation. Recall the discrete equation is R= & + Kye, — 5) 412 ‘We now note that 85 Rey Thus, Eg. (7.1.21) can be writen as & eoifier + By — Habe iB) 122) ‘Again, we approximate ¢ as I + FA. Then, neglecting higher-order terms in ‘At and noting that K, = KA/ lead to FR, Ar + KAM, — HS.) 1.23) Finally, dividing by A, passing to the limit, and dropping the subscripts yield the differential equation 72 SOLUTION OF THE MATRIX ICCA EQUATION 283 rare neew seat | Palys} ener! Figure 73 Onine back diagram for he cine Karan er fame KG - HD a2 Equations (7.1.16), (7.1.20), and (7.1.24) comprise the continuous Kalman fiter equations and these are summarized in Fig. 7.1. If the filter were to be implemented on-line, note that certain equations would have to be solved in real time as indicated in Fig. 7.1. Theoretically, the differential equation for P could be solved offline, and the gain profile could be stored for later use on-line. However, the main & equation must be solved on-line, because 2), that is, the noisy measurement, i the input to the differential equation ‘The continuous filter equations as summarized in Fig. 7.1 are innocent looking because they are writen in matrix form. They should be treated with respect, though. It does not take much imagination to see the degree of com- plexity that results when they are written out in scala form. I the dimensionality is high, an analog implementation is completely unwieldy. "Note thatthe error covariance equation must be solved in order to find the ‘gin, just asin the diserete case. In the continuous case, though, a differential rather than difference equation must be solved. Furthermore, the differential equation is nonlinear because of the PA"R™'HP term, which complicates mat ters. This will be explored further in the next section, SOLUTION OF THE MATRIX RICCATI EQUATION ‘The error covariance equation B= FP + Pr’ — PH'R“HP + GQG" aay PO) = Fy294 GUDTERT ‘THE CONTINUOUS KALMAN FILTER isa special form of nonlinear differential equation known as the matrix Ri ‘equation. This equation has been studied extensively, and an analytical sot ‘exits forthe constant-parameter case. The general procedure is to transform the ‘ingle nonlinear equation into a system of two simultaneous linear equations; of ‘course, analytical solutions exist for linear diferential equations with constant coefficients. 
Toward this end we assume that P can be written in product forma P=XZ"', YO=1 722) pe =x : 723) Diflerentiatng both sides of Ea, (72.3) leads to Peer k 24) ‘est, we substitute P from Bq, (7.2.1) into Eq. (7.24) and obtain (PF + PRT PHRHP + GOG)Z+PE=X (725) Rearranging terms and noting that PZ = X lead to POPE - HWRHX + Z) + FX +GQGZ-H=0 (7.26) [Note tat if both terms in parentheses in Eq (7.2.6) are set equal to zero, equality is satisfied, Thus, we have the pair of linear differential equations X= EX + GOQG™ an %= WRX - Fz 728) ith initia conditions X00) 20) 729) ‘These can now be solved by a variety of methods, including Laplace transforms. Once P is found, the gain K is obtained as PH'R™', and the filter parameters ae determined. An example illustrates the procedure, DONE) We now retum to the continuous filter problem considered previously in Ex- ample 43, The problem was solved there using Wiener methods, and we now ‘wish. apply Kalman filtering methods. Let the signal and noise be statistically independent with autocorrelation functions 172 SOLLITON OF THE MATRIX ICCA EQUATION 295 ee |e ho a) (orS,= 1) a2) Since this is a one-state system, x isa scalar, Let x equal the signal. The additive ‘measurement noise is white and thus no augmentation of the stale vector is required. The process and measurement models are then nity white noise 72.12) nity white noise 7213) ‘Ths, the system parameters are Pe-1, G=V3, O=1. H= ‘The differential equations for X and Z. are then Xe X42, XO) = Py Z=X+Z, 2O)=1 (72.14) Equations (7.2.14) may be solved readily using Laplace-transform techniques, ‘The result is @-P) XC = Py cosh VE 0+ sinh Ve +n (0) = cost + ‘sinh V3 ¢ 2.15) 240 = cosh V3 + OO 2.3) ‘The solution for P may now be formed as P = XZ" Py cosh V31-+ OF 9 sinh V3 1 72.16) SCs conn V5 0+ PSP sion V5 « Once P is found, the gain K is given by K = PER" and the filter yielding 2 is determined, Obviously, there should be a correspondence between this solution and the ‘one obtained by Wiener methods. The general connection between the two meth- ‘ods will be discussed in more detail in Section 7-7; here we simply consider a limiting case check as t=, This should yield the same steady-state (stationery)296 GHAPTERT THE CONTINUOUS KALMAN FLTER 73 CORRELATED MEASUREMENT AND PROCESS NOISE 297 schon obtain previa with Wiener methods Leting¢~ in B,(7.216) oo {nd noting tat Py = I yield Bena = O31 - 2) as pr) oa) ew aod eax oe tle Hlucov"eo) = Cott 2) Gather than zero, as before) (73.4) v3 (Our general approach isto form a new process model whose input noise has ‘ero eroseconlation with. We fist aot that 2-~ Hx vis zero, and this D(z ~ Hx ~ v) may be added to the right side of Eq. (7.3.1). Thus, we have ‘The Kalman filter block diagram for this example is then as shown in Fig. 72 ‘This can be systematically reduced to yield the following overall transfer func- tion relating £ to z: : = Fx +Gu+De@-Hx-y) 735) Laplace transform of ¢_ V3 — 1 6) = Taplace transfor of 2" 5 + V3 (72.18) ‘where the constant D will be chosen presently. However, fist we rearrange the terms of Eq, (7.35) to obtain a new process model “This is the same result obtained using Wiener methods, a New process model: (F-DE)x + Dx +(Gu-DY 736) “The random process xis the same as before, but Eq, (7.3.6) shows that we can 73 think of x as a superposition of two responses. One partis due to Dz(), which CORRELATED MEASUREMENT AND ‘we will ueat as if it were a known explicit input. 
(It is observed and available PROCESS NOISE directly as a measurement.) The other part ofthe response is due to (Gu ~ Dv), ‘Thus far we have considered the process noise w and the messurement noise v to have zero crosscorrelation. This is often a reasonable assumption in physical problems, but not always. We will now sce how the filter equations can be modified to account for erossconelation between the process and measurement noises. We will consider the continuous filter problem here and defer the cor responding solution for the discrete filter until Chapter 9. ‘We first define the process and measurement models. They are k= Fx + Gu Bx ty | oat eso where Figure 7.2 Stations Kaman thro Expo and this is a white-noise input that is not observed (directly, at least). For lack ‘of better names, we will refer to these two component responses as the explicit, and random pars, and they will be denoted as x, and x,. The total response is then oan Now, in the process model, we wish to choose D such that (Gu ~ Dy) and vy have zero erosscortelation, that is E{[Gu(y — Dv(o]v(ay") = 0 73.8) ELGuQvCH"] = ELD 739) Next, substituting Eqs, (7.3.3) and (7.3.4) into Eq, (7.3.9) leads to GCE ~ 2) = DRA - (73.10) We now see that if we choose D to be D=GcR* 73.1y the desired decorrelation is effected. The new process model now satisfies all1298 CHAPTER 7 THE CONTINUOUS KALMAN FLTER 74 COLORED MEASUREMENT NOISE 299 tne necessary conditions imposed in Section 7.1, so the errar covariance ex pression may be writen as (F ~ DENg, + K'@ ~ HR, ~ HR) «7320, Now, adding Eqs. (7.3.17) and (7.3.20), and noting that 8+ 8, yield =F ~ DID + POE = DET PER 'HP + Q’ (73.12) b= DIP + PE k= @ ~ DIR + (D+ Ke KBE = FE + (D+ KYB 732) where Q' is defined by 1 can be seen from the form of Eq. (73.21) that (D + K°) plays the role of i =pv@l}=@ar-9 73:13) “gain in the estimation equation for the toll x quantity. The eor associated E{{Gut) ~ DuO]{Guc ~ DVT} = QaE— with the &, component is zero, so the P equation, which was derived for the : : omponen, also applies tothe total &. From Eqs” (7.11) and (7.3.16), we see Expanding Bq, (7.3.13) and using Eqs. (7.3.2), (73-4, and (7.3.11) lead that the gain D + I i just (PH? + GOR". Thus he final esmation equations Y = GQ — CR'C)GT (73.14) for & may be summarized as 1. Estimation equation: “The equation for P may now be rewritten as : B+KG-HH, {=m (7322) Bo ~ DUP + P(E ~ DEY ~ PATRTHP + GQ = CRICIGT ' : 3.5) 2 Gain equation: f the new model i the K = PH + GOR" 323) “The expression for gain inthe new model is then 3. Brror covariance equation: Ki) = PH 7319, a : B= ~ DEPP + PO - DY’ — PRP ‘and the motive for using the prime will be made clear shortly. ‘We now Took a the equation for the estimate, Just as in the ease of the + GQ — CRICDGT ca process itself, we wish to thin of the estate asa superposition of an explicit noe vet, whichis an estimate of x,, and another past, which isan estimate of the Fandom component x,. Thus, we have for the two estimates a 4, = - DINE, + Daw) 37 | . 3, = © - DEN, + KG, ~ HR) 3.18) _We note that if C is zero, the above estimation equations reduce to the previous equation whee we had zo roxsoeaion been av, This “The error associated with &, is, of course, zero because 2) is known, tht i is a8 expected. Also, note that in the process of deriving the filter equations for inca te it isthe total heasurement and its known exactly. 
The measurement the correlated case, it was necessary to consider a superposition of explicit and Urs has been denoted as z,, and this requires some explanation. From the basic random responses. Since tis can always be done in linear system, che addition ‘meagurement relationship, We have of an explicit or deterministic driving Function to the process equation presents no particular problem. We simply treat it separately and note that it contributes p= Hix ty = Hi, + Hx, ty ‘ero error to the estimate (see Problem 7.8). An example illustrating the use of | the equations for correlated u and ¥ will be presented in Section 7.4. «a ~ Hx) = Hix, + 73.19) 14 cao COLORED MEASUREMENT NOISE fx — Hx.) must be considered a8 the noisy measurement of x. This is leads to | appears in both the gain and error covariance expressions. Ifthe measurement900 CHAPTER 7 THE CONTINUOUS KALMAN FLTER noise is colored, rather than white, it must be modeled with additional state variables, This leaves zero for the white component, and thus R* will not exis. ‘This leads to obvious difficulty. OF course, one remedy would be to add a small, white component artificially, and then proceed on with the fit design using the usual equations. However, this is begging the issue. There are legitimate situations where one simply does not want to model any part ofthe measurement ‘error as white noise, ‘A general solution to the colored measurement noise problem was frst presented by Bryson and Johansen (3). Our approach here will be slightly dif ferent and, we hope, more intuitively satisfying. If R~' does not exist, we have 4 situation where certain linear combinations of the slate variables are being ‘measured perfectly. Obviously, if @ particular state variable is known perfect, itcan be removed from the estimation problem, There is certainly no need for filtering something that is already free of corrupting noise. Therefore, our general approach will be to remove those quantities that are known gerfectly from those that are not, and then solve the remaining estimation problem. This usually necessitates Finear transformation of the original state vector in order to de- Couple the knowin states from the others. This adds to the algebra, but i certainly presents no theoretical problem because the state estimate transforms with ex: actly the same transformation as the state itself. EXAMPLE 7.2 Consider the problem of separating an additive combination of two independent ‘Markov processes with identical spectral functions. We know the solution calls for a trivial filter with a gain function of $. Let the autocortlation functions of the signal and noise be Royse, any ‘The measurement zis the additive combination stn (742) [Now the obvious way to model the continuous Kalman fiker is to let s be x and n be x, and then we have (]-Pe ae [? ale] 1 af] * (0) 74a) Where 1, and u are independent, unity white-noise driving functions. Notice that u is 2er0 and thus R' does not exist. Since we have a perfect measurement of the linear combination x, + x, let us transform to new states, y, and ys, according to 7A COLORED MEASUREMENT NOISE 904 by-( ae] vas ‘Making this transformation leads to the new state and measurement models, G1-[3 aIb]+E4 alle] ee ta=00 uf] +00) oan [Note that zis now a perfect estimate of y alone, so tt we need to worry only about estimating y,. Thus, the order ofthe problem is reduced from 2 to 1 ‘Consider next the y, state equation aon + VE (Ag) ‘This is in the correct form as itis. 
(Usually, y, would also appear, and if so, it would be treated es a known driving function because it is equal to z0),) Next, we need # measurement relationship ofthe proper form, that i, there rust be nontrivial additive white noise and some linear connection to y, (and _y; alone). In this case, since z has zero additive white noise, we can consider its “erivative ¢as also being available as a measurement. £2 yoy t VO uy + VE (149) Observe that 2 contains additive white noise as required. However, it also in- volves y,, In order to eliminate y,, we note t= 74.10) Now, add z and 2 to eliminate y,. This leads to 2+2=0-y + V2, +m) aay ‘We now have a linear connection between 2 + z and), with additive white noise, which is in the correct form. In the new problem of estimating y,, we then have902 CHAPTER. THE CONTINUOUS KALMAN FILTER 74 COLORED MEASUREMENT NOISE 303 | 1. Decouple the quantities that are known perfectly from those that are not Use @ linear ansformation and, as a matter of convenience, define the vi new state variables such that the perfecly known variables are at the , “bottom” of the new slate vector. The botiom elements should then have i 1 one-to-one correspondence to the perfect measurements ' 2. Consider a reduced state estimation problem where the stale vector con- Hao tains only the “top” elements of the new stale vector, These are not known perfecly from the measurement information, so a nontrivial es- v= V2 (u, +m) (74.12) ‘timation problem exists relative to these variables. Note that the perfectly : known botiom elements that appear on the rght side of the state equation and ¢ + 2 plays the role ofthe “iheasurement.” (After al, if is known, so is jay Wal opecat Gan sacs oes a grouped withthe driving functions because they are known quantities, [Note that w and w are corelted, so we must use the correlated form of STs cates elias at a Gasae Kalman fiter equations given in Section 7.3. Using the notation of Section 7.3, wwe then have the additional parameters i, = FyX; + (known inputs) + (white-noise inputs) (7.4.18) cavE where the subscript Tis used to indicate top elements ofthe transformed Re4 state vector. ae 3, Rearrange the measurement equation to achieve the desired form D= Ger 4.13) y= Hox + (white noise) 04.19) ‘Therefore, the optimal filter is given by the differential equation [Note that @ nontrivial white-noise component must be associated with . ‘ach measurement element. Als, the number of elements in must be Hm HAHE+ 9-0-9 0419) the same as for the original vector in order that no measurement in- formation is ignored in estimating x,. The white-noise components are ‘This will be recognized as the equivalent transfer funeton relationship brought in by repested differentiation of the perfect elements until a white-noise term appears. The new 2, vector is then formed from ap- ge) e+ _1 ey propriate linear combinations of the elements of x and thei derivatives. a) @FD 2 a Sometimes, considerable algebraic manipulation is needed to twist the jquations into the form demanded by Eq. (7.4.19), but this can always ‘This checks with Wiener filter theory, that i, be done. 4, Solve the reduced state estimation problem. Usually te filter equations 7416) for comelated w and ¥, as given in Section 7.3, will have to be used, The resolt will be a differential equation for the reduced siste vector. ‘We now have optimal estimates of y, andy. 
All ve have to dois transform This equation will nomally contain derivatives of some of the measure- back to the x domain to get optimal estimates of x, and x. This yields ‘ments, as wel as the measurement, as driving functions. Te derivatives tsually present no problems and may be left there as driving functions, Ry 1 oy fy, 1 olf]. fe : If, however, one wishes to remove them, there are standard techniques: a-L Bei Ey LE} eo in linear system theory for doing so [eg., see Ogata (4) or Chen (5) ‘The reduced state equation is usualy solved (conceptually, a leas) for us, the optimal estimates ofboth x, and x, check withthe results ofthe Wiener cither the transient or steady-state solution, whichever is desired. The jee eee ee a nd result i the best eaimate ofthe edaced state veto (ey heim perfectly known elements) inthe trensformed domain. procedure for dealing with colored measurement noise may now be 5. Finally, append the perfectly known elements to the bottom of those summaized fallow : estima mep 4, an enon he toast imate ck we‘904 GHAPTER7 THE CONTINUOUS KALMAN FILTER original state space. This gives the optimal estimzte for the original problem. 75 SUBOPTIMAL ERROR ANALYSIS Just as in the diserete case, an error covariance equation can be derived that is ‘applicable to suboptimal as well as optimal gain. The derivation is similar to that given in Section 7.1, and the relationships between Q and Q, and Rand IR, afe given there. We begin with the projection equation forthe discrete filter vy = PAT + Qe 7s.) ‘We now replace P, with the general P-update expression given by Eq, (5.4.11). This yields i dE ~ KYHOPEO ~ KT + KIRKE OT + Q, (752) Next, the following substitutions are made in Eq, (7.5.2) os04 Far Kar Goorar R = 753) R=5, (753) ‘Then, after we neglect higher-order terms in Af, the expression for Pin. becomes Pp, = Py — KA,Py At — PSHIKT Oe + KRK Ar} RP; Ar + PFFT Ar + GQG" Ar (754) Finally, forming the dference P;,, ~ Pr, dividing by 4% and passing to the limit lead to P = FP + PF’ KBP ~ PEK + KRK™+GQG™ (7.55) ‘This equation may now be used with any gain K and is useful in suboptimat cerror analysis, A word of caution is in order, though. Note that “gain” means the K coefficient in the differential equation Fi + Ki ~ Hi) 56 78 FLTER STABWITY IN STEADYSTATE CONDITON 305 and the equations describing the suboptimal filter under consideration must be pt into the proper format before Eq. (75.5) can be applied. Also note that there ‘must be a correspondence of certain parameters in the suboptimal and truth ‘models in order for P to be meaningful asthe error covariance for the suboptimal filter (see Section 6.7) FILTER STABILITY IN STEADY-STATE CONDITION In many applications the Kalman filter will each a steady-state condition (or at least quasi-steady-stat) within a reasonable time after initialization. When this happens, the gain becomes constant. In the continuous case the filter then looks ‘much the same as any other analog fixed-parameter filter except, perhaps, it may be of the multiple input-output variety rather than just a single input-output filer Concepmally, the fiter is simply a linesr operitor that processes a set of inputs (the measurements) and transforms them to a corresponding set of outputs (the state estimates). We know from classical filter theory thatthe characteristic poles ofthe filter (roots of the transfer function denominator tell us much about the stability of the filter, so we will now extend this idea to the Kalman filter (in the steady state, of course). 
The stability analysis here is similar to that presented in Section 6.9 for the discrete filter, except that Laplace transforms are used rather than z-transforms. We will go directly to the differential equation relating the input z(t) to the output x̂(t). Equation (7.1.24) is repeated here for convenience:

dx̂/dt = F x̂ + K(z − Hx̂)     (7.6.1)

Now assume that F, K, and H are constant and take the Laplace transform of both sides of Eq. (7.6.1). (We will tacitly assume that the initial conditions are zero.) After we rearrange terms, this leads to

[sI − (F − KH)] X̂(s) = K Z(s)     (7.6.2)

It can be seen that the bracketed term on the left side of Eq. (7.6.2) is the characteristic matrix of the system, and its determinant will be the characteristic polynomial of the system, that is,

Characteristic polynomial = |sI − (F − KH)|     (7.6.3)

The roots of this polynomial are then the characteristic poles of the system. [Or, if you prefer, the eigenvalues of (F − KH) are the characteristic eigenvalues of the system.]

A simple example will illustrate the use of Eq. (7.6.3). Refer to Example 7.1 and note that the steady-state gain is

K = P H^T R^{-1} = (√3 − 1)(1)(1) = √3 − 1

Also, in this example F and H are

F = −1,  H = 1

Therefore, the characteristic polynomial is

s − [−1 − (√3 − 1)(1)] = s + √3

The characteristic pole is seen to be at −√3 in the s-plane, indicating that the filter is stable. This is, of course, the same result that was arrived at by block diagram reduction in Example 7.1. However, it is important to note that we obtained it here without any such block diagram manipulation. Remember that state-space block diagrams like the one shown in Fig. 7.1 are vector time-domain diagrams, and extracting the transfer function in the s-domain can be very complicated (if not virtually impossible) in higher-order systems. On the other hand, it is routine to find the eigenvalues of a square matrix via computer methods, even in high-order systems. [Note that MATLAB has a built-in function eig(A) that returns the eigenvalues of a square matrix A.]

7.7 RELATIONSHIP BETWEEN WIENER AND KALMAN FILTERS

It is appropriate here to pause and reflect on the connection between Wiener and Kalman filter theories. Both the Wiener and Kalman filters are minimum mean-square-error estimators, both require the same a priori information about the processes being estimated, and both yield identical estimates. We saw in Chapter 4 that the Wiener approach leads to an integral equation with the filter weighting function as the unknown. After solution of the integral equation, the weighting function then describes the relationship between input and output. On the other hand, the end result of continuous Kalman filter theory is a differential equation relating input and output. There must be equivalence, and books on linear system theory [e.g., Chen (5)] may be consulted for the details of how to get back and forth between the two descriptions. There are, however, certain subtleties about this particular problem that warrant some further amplification.

First, in the Wiener theory we related the filter response to the input via the superposition (convolution) integral, which was written in the form

x̂(t) = ∫₀ᵗ g(ξ, t) z(t − ξ) dξ     (7.7.1)

This form was convenient because the first argument of g has the significance of the "age" variable, and g tells us how the past values of the input are weighted to yield the present value of the output (6). The second argument of g (i.e., t) simply appears as a parameter that may be considered fixed in the optimization process.
Frequently, though, books on linear system theory [e.g., see Chen (5)] write the superposition integral in the form

x̂(t) = ∫₀ᵗ h(t, τ) z(τ) dτ     (7.7.2)

(This was mentioned before in Chapter 4.) When x̂ is written in the form of Eq. (7.7.2), h(t, τ) has a physical interpretation as the system response at time t to a unit impulse applied at time τ. By making a simple change of dummy variables in Eq. (7.7.1), we can obtain the relationship between g and h:

g(ξ, t) = h(t, t − ξ)     (7.7.3)

This is not a point of confusion in constant-parameter systems. It is, however, in time-variable systems, and we have to be watchful of this little detail in converting from a weighting-function description to a state-realization (differential equation) description.

In addition, the Wiener filter is always a single-output estimator. That is, we may have multiple inputs, but we always choose a single scalar process (usually called "signal") as the quantity being estimated. For example, consider the problem where we have one Markov process that we call signal s, and the additive noise consists of another Markov process n₁ plus a white component n₂. In Wiener theory, n₁ gets lumped with n₂ as the corrupting noise, and the estimator yields just an estimate of s. On the other hand, with Kalman filtering, one models both s and n₁ as state variables, and the filter estimates both ŝ and n̂₁ simultaneously (even though n̂₁ may not be of interest). In effect, the Kalman filter does the work of two Wiener filters. Furthermore, the Kalman filter automatically provides information about the quality of the estimates (i.e., their mean-square errors) while doing the estimation. The Wiener filter does not provide this information, and one has to go to considerable extra effort to get mean-square error information. Thus, we have in the Kalman filter a group of estimators, all packaged into a single matrix algorithm. The price, of course, is extra computational effort. We often find in Kalman filtering that we are forced to carry along considerable excess baggage in order to obtain estimates of a few select quantities of interest.

Another subtlety that appears when comparing Wiener and Kalman methods has to do with the initial conditions. It may appear at first glance that we are allowed to choose x̂₀ and P₀ as we wish in a Kalman filter, whereas no such explicit choice exists in the corresponding Wiener formulation. This apparent discrepancy exists because the Wiener filter output was written as a superposition integral, which tacitly implies zero initial conditions for the filter. This is justified, provided the input processes have zero mean and we have no prior information about the processes other than what was already assumed in modeling the various spectral functions. To get correspondence in the Kalman filter, we must choose x̂₀ to be zero, and after we have done this, P₀ must be the covariance of the process itself. Thus, we really do not have as much legitimate choice in the initial x̂₀ and P₀ as it seems at first glance; at least this is true if we want the result to be optimal.
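The steady-state figures quoted in Section 7.6 make this correspondence easy to check numerically. The sketch below assumes that Example 7.1 uses the unity-variance, unit-time-constant Gauss-Markov signal in unity white measurement noise (so F = −1, H = 1, R = 1, GQG^T = 2, consistent with the steady-state gain √3 − 1 quoted earlier). It integrates the scalar variance equation from P(0) = 1, the covariance of the process itself, and it is intended only as an illustration, not as a statement of the book's own solution.

    % Numerical check of the Kalman/Wiener correspondence for Example 7.1
    % (model assumed here: F = -1, H = 1, R = 1, GQG' = 2).
    F = -1;  H = 1;  R = 1;  GQG = 2;

    % Error-variance (Riccati) equation:  dP/dt = 2*F*P + GQG - (H^2/R)*P^2
    riccati = @(t, P) 2*F*P + GQG - (H^2/R)*P.^2;

    % Integrate from P(0) = 1, the covariance of the process itself:
    [t, P] = ode45(riccati, [0 5], 1);

    Pss  = P(end);            % settles near sqrt(3) - 1 = 0.7321
    Kss  = Pss*H/R;           % steady-state gain, also sqrt(3) - 1
    pole = eig(F - Kss*H);    % characteristic pole, near -sqrt(3)

The computed variance settles near 0.732 and the single characteristic pole falls near −1.732, in agreement with the steady-state results of Section 7.6.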
The correspondence between the Kalman and Wiener solutions, properly initialized, is easily verified for simple cases, and Problems 7.3 and 7.4 are intended to demonstrate this, Tn summary, it is not proper to say that a Kalman fiter is either better or worse than a Wiener filter, because both yield identical results once they are realized, However, inthe realization, either for analysis purposes or on-line, the ‘Kalman approach elearly has two distinct advantages: 1. With one common matrix formulation, we can ocommodate a large class of estimation problems with relatively complicated process and ‘measurement relationships. 2. The recursive feature of the Kalman filter makes it readily adaptable to ‘computer solutions, This is certainly of considerble practical impor- tance. Solutions of relatively complex estimation problems are often quite feasible using Kalman filtering methods, whereas the same prob- Jems may be completly intractable using Wiener methods. PROBLEMS 7A Consider a scalar process x() that may be thought of as the result of integrating Gaussian white noise with a spectral amplituce of 16 units. Let us say the integrator is “zeroed” at r = 0, and thus we know x(0) = 0. (This is @ ‘Wiener process.) Let us further say that we have a continuous noisy measure- iment of x where the measurement noise is white, Gaussian, independent of =, ‘and has a specteal amplitude of 4 units () Find the optimal Kalman filter for this situation. (Your solution may be left in the form of @ differential equation, but ll parameters of the differential equation are to be written out explicily.) (b) Find the optimal iter transfer function and the ccrresponding rms error forthe steady-state condition 7.2. The optimal estimator for a Markov signal plus adcitive white noise was given in Example 7.1. The gain worked out to be tim: variable, but it did fpproach a constant value of (V3 — 1) as 1+ Investigate the effect of using this constant gain forall ¢ = 0 relative to the filter rms erro. Sketch plots of both the optimal and suboptimal rms errors for purposes of comparison. 73. Consider a stationary Gaussian Markov signal x(2) whose autocorrelation function is Ria) = de “We make two noisy measurements of the signal, one at 1= 0 and the other at f= 1, and we denote these as zp and 2,. Each discrete measurement is known to have an error variance of unity, and the errors are statistically independent. ‘We have no prior knowledge of the signal other than its antocorrelaton function as stated above. (a) Using the methods of Section 4.7 (je. the weight factor approach), ‘write an explicit expression for the optimal estate of x(1) in tems of zoand 2 Prosues 300 (b) Repeat part (a) using discrete Kalman filtering and compare the result with that obtained in part (2). (ote: The Wiener and Kalman estimates should be identical.) 74 The same estimation problem was used in both Examples 4.5 and 7.1 ‘Show thatthe two estimators yield identical results forall ¢ > O as well as for the steady-state condition, (Hine: In the Wiener solution, £() can be waitten in terms of a convolution integral where the weighting function is known explicily. 
First, convert the weighting function to impulsive-response form and then substitute the integral ‘expression for s() into the differential equation describing the Kalman estimator ‘and show that it satisfies the differential equation } 71S A certain noisy measurement is known to be of the form AO =a tn), 120 where ay is an unknown random constant and (0) is white Gaussian noise with 1 spectral amplitude of A. Initially, at ¢ = 0, all that is known about ay is that i has a zero-mean normal distribution with @ variance o%. We wish to estimate ‘4 0n & “running time” basis beginning att 0. Find the appropriate continuous ‘Kalman filter for estimating a, and sketch a block diagram for the fle. (Wote: This isthe same estimation problem as that of Problem 4.11. However, ‘when using Kelman filtering, we do not need to consider a random constant as 4 limiting case of a Markov process.) 7.6 The resulting optioal filter for Problem 7.5 is described by a differential equation relating the input 2() to the output £0. When the same estimation problem is solved with Wiener methods, the result is @ weighting function re- lating 2(9 and $(0). Show thatthe two results are equivalent. 7.7 Consider the two-state Gaussian random process shown in the block dia- ‘gram below. Cleary, the measurement is of state variable 1, and it may be assumed thatthe measurement begins at r = 0. The initial condition om the first integrator (state variable 2) is zero, whereas the second integrator has an initial condition that is a zero-mean normal random variable with o? = 1. Write the differential equation for the P matrix for the continuous Kalman filter for this eae ae Process310 CHAPTER7 THE CONTINUOUS KALMAN FLTER situation. Be sure to specify the appropriate initial conditions for P, and also be sure to specify all elements of the matrix parameters (such as F, R, and Q) of the differential equation. You do not need to solve the differential equation, 78 Modification of the discrete Kalman filter equations to include a determin- istic input was discussed in Section 6.10. Show that the corresponding modifi ‘cation in the continuous case is accomplished with the addition of Buy 28 2 function in the & differential equation. Specifically, assume the process equation is of the form k= Fx + Gu + Bu, (78.1) ‘where Bu, is the deterministic input. Then show that the corresponding differ- ‘ential equation for the estimate is & = FO + Ki - HR) + Bu, (e782) 7.9 The accompanying figure shows a continuous random process model con- sisting of two integrators in cascade diven by white noise. The noisy observable is the output of the second integrator corrupted by additive white noise. system is observable, so one would expect the Kalman filter error covariance equation to reach a steady-state condition ater a suitable settling time. (a) Find the characteristic poles for this Kalman filter in the steady-state conation “Seria oogget aotaa® ww—=fz +f FI 4 _ Probie 79. Lttne: The dynamics for this process are relatively simple, so it is feasible to ‘obtain an explicit solution for P. All one has to do is let P = 0 in the Riccati ‘equation, Eq (7.2.1), and then solve for the components of P algebraically. Ths, in tur, makes it possible to find the steady-state gain from Eq. (71.16).] {(b) Consider a similar discrete Kalman filter problem (but not exactly the same) where the two-state process is the same, the sampling interval is “5 see, and R, = 200 units (obtained from Eq. 7.1.13). 
Find the char- acteristic poles for the discrete Kalman filter in the steady-state con- dition, (See Section 6.9 for help on this part.) You should find @ rough correspondence (but not exact) between the results of this part and those. of part (2), 7.0 The matrix Riccati equation, Eq, (7.2.1), describes the estimation error in 2 continuous Kalman filter. Note that seting R~* = 0 corresponds to no mex ‘surement, In this ease the best estimate of the state i zero, and P(O) is just the ‘covariance of the x process beginning at = 0. If we le the intial value of P() be zero, we then oblain the differential equation for the Qy matrix that was discussed in Chapter 5. That is, with the step size At as the argument, we have =F. + QT + GEG, QO (P7.10.1) where GQGF is the power spectral density associated with the vector white- REFERENCES 311 noise input Gu, For the scalar Markov process discussed in Example 5.8, Q, ‘was found to be (1 — e-%). Show that the solution of the above differential ‘equation, Eq. (P7.10.1), yields the same result REFERENCES CITED IN CHAPTER 1, RB, Kalman and R. 8, Bucy, “New Results ia Linear Filtering and Prediction,” Trans. ASHE, J Basie Eng, 83: 95-108 (1961). 2, Av. Jeawinski, Stochastic Processes and Ftering Theory, New York: Academic Pres, 1970, 3, AVE. Brjton, J and D. E, Johansen, “Linear Filtering for Time-Varying Systems Using Measurements Containing Colored Noise," IEEE Trans. Auomatie Control, AC-I0 (1): 4-10 Gan. 1965). 4. K. Opata, State Space Analysis of Control Systems, Englewood Cliffs, NI: Preatice- Hal, 1967. 5. C. Chen, Linear System Theory and Design, New York: Holt, Rinchar, and Win sion, 1984 6, RG, Brown and J. W. Nision, Introduction 10 Linear Systems Analysis, New York: Wiley, 1962 Additional References on Continuous Kalman Filtering 17. A. Gelb (ed), Applied Optimal Estimation, Cambridge, MA: MIT Press, 1974 5. A.B Sage and J. L, Mela, Estimation Theory with Applications to Communications ‘and Control, New York: McGraw-Hill, 1971 9.1.8, Meditch, Stochastic: Optimal Linear Estimation and Control, New York MeGraw-Hil, 1969 10, B.S. Maybock, Stochastic Model, Estimation and Control (Vol. 1), New York Academic Press, 1979. 11, S.M, Bozi, Digital and Kalman Fitering, London: E. Arold, Publisher, 1979, 12. M.S, Grewal and A. P Andrews, Kalman Filtering Theory and Practice, Englewood (Clif, NJ- Prentice-Hall, 1993. 13, G. Minkler and J. Minker, Theory and Application of Kalman Filtering, Palm Bay, FL: Magellan Book Co, 1993,at Smoothing ‘The diserete Kalman filtering algorithm was presented in Cxapter 5. Prediction ‘was then discussed in Chapter 6. We will now consider the ecursive smoothing problem. Recall from Chapter 4 thatthe smoothing problen is where we wish {o form the optimal estimate at some point back in the past, relative tothe current ‘measurement, We will fist classify the different types of smoothing problems and then proceed to the recursive solutions for each of the -ypes. CLASSIFICATION OF SMOOTHING PROBLEMS sna ‘Smoothing seems to be inherently more difficult than either filtering or predic- tion, In the Wiener theory of Chapter 4, a is negative in Eq, (4.3.22) and this causes a cusp to appear in the positve-time part of the shifted time function (At lest this is rue for rational spectral functions—see Protlem 8.1.) This eusp, in tum, leads to a considerably more complicated expression for the Fourier transform than occurs in the corresponding filter or prediction problems. 
Simi- lazly, we will see that the recursive algorithms for smoothing are considerably ‘more complicated than those for filtering and prediction. ‘Meditch (1) classifies smoothing into three categories: 1. Fixed-interval smoothing. Here the time interval of measurements (i... the data span) is fixed, and we seek optimal estimates at some, or per- haps all, interior points. This isthe typical problem encountered when processing noisy measurement data off-line 2, Fixed-point smoothing. In this case, we seek an sstimate at a single fixed point in time, and we think of the measurements continuing. on indefinitely ahead of the point estimation, An example of this would be the estimation of initial conditions based on noisy trejectory observations 182 DISCRETE FEDINTERVAL SMOOTHING 913 after ¢ = 0. In fixed-point smoothing there is no loss of generality in letting the fixed point be at the beginning of the data stream, because all prior measurements can be processed wih the filter algorithm. 3. Fixed-lag smoothing. In this problem, we again envision the measure~ ‘ment information proceeding on indefinitely with the running time vari- able f, and we seek an optimal estimate of the process ata fixed length of time back in the past. Clearly, the Wiener problem with a negative is fixed-lag smoothing. It is of interest to note that the Wiener formu- Tation will not accommodate either fixed-interval or fixed-point smooth ing without using multiple sweeps through the same data with different values of a, This would be a most awkward way to process measurement ata ‘Obviously, the dee smoothing problems are related, and it is not especially difficult to devise correct, but clumsy, solutions. Thus, the central problem is, ‘one of finding reasonably efficent recursive algorithms for each of the types of smoothing. This has been studied extensively since the early 1960s, and we will present only those solutions that are generally considered to be best from a ‘computational viewpoint. We will not atempt to derive all the algorithms. The derivations are adequately documented in the references cited. (Also, see Prob- Jem 8.2.) We now proceed to look at algorithms forthe tree specified categories of smoothing, 82 DISCRETE FIXED-INTERVAL SMOOTHING ‘The algorithm to be presented here is due to Rauch, Tung, and Striebel (2, 3), and its derivation is given in Meditch (1) as well a the referenced papers. Also, ‘anew simplified derivation is presented in Problem 8.2. In the interest of brevity, the algorithm will be subsequently referred to as the RTS algorithm. Consider a fixed-length interval containing NY + 1 measurements. These will be indexed in ascending order 2. 2, Zy. The assumptions relative to the process and measurement models are the same as for the fter problem. The computational procedure for the RTS algorithm consists ofa forward recursive sweep followed by a backward seep. This is illustrated in Fig, 8.1. We enter the algorithm as tasual at & = 0 with the initial conditions 5 and Pj. We then sweep forward using the conventional filter algorithm. With each step ofthe forward sweep, we ‘must save the computed a priori and & posteriori estimates and their associated P matrices. These are needed for the backward sweep. After completing the forward sweep, we begin the backward sweep with “initial” conditions &(N{N) and (NIN) objaitied af the final computation in the forward sweep.* With each + The notion wed here he same os hat used lathe pression problem. 
See Chapter 6, Section 6hi 314 CHAPTER sMOOTHANG Nanda Sea i= facia eae) See Figure 81. Procedure or uesintenal sreating step of the backward sweep, the old filter estimate is updated to yield an im- proved smoothed estimate, which is based on all the measurement data. The recursive equations for the backward sweep are SIN) = RC) + AGOLRE + TIN) ~ 80+ 10] 2 where the smoothing gain A() is given by AG) = PROSE +1, DPE + 1H, 822) and k -AN- ‘The eror covariance matrix forthe smoothed estimates is given by the recursive equation PEN) = POE) + AGIPEE + IN) - PE + IIIA — 2.3) It is of interest to note thatthe smoothing errr covariance matrix is not needed for the computation of the estimates in the backward sweep. This isin contrast to the situation in the filter (forward) sweep where the P-matrix sequence is needed for the gain and associated estimate computations. An example illus- trating the use of the RTS algorithm is now in order. Cee [Let us consider the same Gauss-Markov process used previously in Example 568, Section 5.6, Recall that the process hasan autocorcelaton function Ra) =e land we have a sequence of noisy measurements ofthis process taken every 02 see. The measurement sequence begins at r= 0 and ends at ¢ = 1 sec giving a 2 DISCRETE FNEDANTERVAL SMOOTHING 815: total of 51 discrete measurements. The measurement errors are white and have tunity variance. A sample of this situation was simulated using Gaussian random ‘numbers, and the fltering results were described previously, in Section 5.6. We now continue this same example with a backward sweep and obtain smoothed estimates, ‘A partial isting of the results from the filtering solution is repeated in Table 8.1 for reference purposes. It can be seen from the error covariances thatthe filter has reached 2 steady-state condition by the end of the forward sweep. ‘We enter the backward sweep atthe end point where k = 50. Here we have 50|50) = 539 62.4) P (50/50) = .1653 25) Since the filter solution at this point is conditioned on al! the measurement data, it is also the smoothed estimate at k = 50, We are now ready to compute the smoothed estimate one step back at & = 49. From Eqs. (8.2.1) and (8.2.2) we have 5(49)50) = 3149/49) + ACA9)[SCS0)50) - 51504)] 6.2.6) where A(49) = PEASIA9) 6P"S0449) 27 Using the fltring results from Table 8.1 and noting that the transition matrix ise" = 98020 lead to (G9) = 8183 +5(49)50) = ~.525 ‘This may now be repeated at k = 48, 47, ..., etc. with the results ‘Table 81 tering Rests x a PH PR 46 41s 1653 1980 -ail a eas 1653, 1980 st 4 sn 1653 382 1980 1310 0-408 1659, ~501 1980 ~064 50-539 1653, 40 1980 =138316 cHAPTER® SMOOTHING 34850) = =.531 5(47)50) = —506 3(46)50) = -.536 ete ‘The smoothing results forthe entire 50 steps are summarized in Fig, 8.2 along. with the tue« for comparison. The smoothed estimate plot can also be compared withthe corresponding filter estimates shown in Fig. 5.11. Itis evident from the plots that the smoothed plot is, in fact, smoother than the fier plot. This is as expected, because the smoother uses both past and future dats in the makeup of its estimate. The smoothed estimate is not conspicuously batter than the fiter estimate for this simulation. However, we should not expect to be able to draw firm conclusions from such a small sample, Is also of interes in this example to compare the error variances of the ‘ter and smoothed estimates. These are shown in Fig. 83. Both the filter and smoother converge to a steady-state condition after about 15 steps. The steady. 
state values are Pree * 1653 (Cue = 406) wea: 099 (Omer = 315) ‘Note that when the comparison is made on the basis of ems value, there is not a dramatic difference in the errors. In addition, the smoothed error curve is ‘minimum in the middle and then gets larger as either end point is approached, Do zmemrenent {Faue 82 Soothes estrale ae ive pts er Emp 8. 83. 89 DISCRETEFINED-PONT SMOOTHING 317 Figure 83. Stee of ener watancs fr rae monte et trates of Baro 8 This indicates tha the best situation for estimation occurs in the interior region where there is an abundance of measurement data in either direction from the point of estimation. This is exactly what we would expect intuitively. Before we leave this example, the steady-state behavior in the middle of the variance plot of Fig. 8.3 should be noted. This frequently happens when the data span is large and the process is stationary and observable. In the steady- sage region, the filter and smoother gains and error covariance matrices are constant, and considerable computational simplification occur, both in terms of | the number of arithmetic operations and the amount of storage required for the algorithm. Thus, sometimes a problem that appears to be quite formidable at first glance works out to be feasible because a large amount of the measurement data can be processed in the steady-state region For future reference purposes, a partial listing ofthe fixed-nterval smooth- ing solution is given in Table 8.2 = DISCRETE FIXED-POINT SMOOTHING ‘The fixed-point smoothing algorithm to be presented here is taken directly from Meditch (1). Alternative algorithms that, under certain circumstances, may be better computationally are also given by Meditch. In order to keep the discussion here as simple as possible, we present just one algorithm, which was chosen because of its simplicity and similarity to the fixed-inerval algorithm. The al- gorithm is 80H) = 8017 — 1) + BEDLRGLD = 8 ~ DD ea) where is fixed (usually & = 0) Jak 2 ete and318 CHAPTERS SMOOTHING Table 82. Fred-nterval ‘Smoothing Soliton 5) PASO) 2s =1552 1683 -1559 ase -1536 187 -1556 1189 ss2 M1 =150 1078 -1 0580 -ss 1079 5961123 5061189 -su 1287 2536 591683 By = [140 (8.3.2) AG) = PEASE + IDE 1) 33) (Note that the equation for A is the same as for the fixed-interval algorithm. ‘The error covariance associated with the smoothed estimate is given by PLD = Pik — D+ BOYIPULD ~ PUI - DIB) 83.4) [Note thatthe filter estimates and their eror covariances are needed in this algorthin, just as in the fixed-interval case. The computational procedure is summarized in Fig, 84. We enter the algorithm at the beginning of the data stream with the usual a priori $5 and Pj. (We have let k = 0.) These inital parameters may come from prior knowledge of the process in the event there fare no prior measurements, or &5 and P may come from processing previous data via the filter algorithm. In either event, the first step is to update the a priori 5 and P; with the measurement 2» and obtain %(0)0) and P(O)0). This is done with the usual filter equations and, of course, the result can be interpreted as either a filter estimate or the 2eco-stage smoothed estimate—they are one and the same, We next let = 1, and we are ready to solve the one-stage smoothing 89 DISCRETE FDED-POWNT SMOOTHING 319 Stor ana RS) Figure 84. Proce or fe pat raahing problem using Eqs. (83.1) to (8.3.4). This may then be continued indefinitely, theoretically, atleast. 
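The recursion of Eqs. (8.3.1) through (8.3.4) is compact enough to state directly in MATLAB. The sketch below carries the fixed-point estimate at k = 0 along as each new measurement is filtered; the scalar Gauss-Markov parameters follow Example 8.1, but the simulated data, loop organization, and variable names are illustrative assumptions rather than material from the text.

    % Sketch of fixed-point smoothing at k = 0 using Eqs. (8.3.1)-(8.3.4),
    % written for the scalar Gauss-Markov model of Example 8.1.
    phi = exp(-.02);                 % transition (scalar), dt = .02 sec
    Q   = 1 - exp(-.04);             % process noise variance over one step
    R   = 1;                         % measurement noise variance
    N   = 50;                        % last measurement index

    x = zeros(1,N+1);  x(1) = randn; % simulate the unit-variance Markov process
    for k = 2:N+1
        x(k) = phi*x(k-1) + sqrt(Q)*randn;
    end
    z = x + randn(1,N+1);            % noisy measurements

    xpri = 0;  Ppri = 1;             % a priori estimate and variance at k = 0
    for j = 0:N
        % Usual filter update with measurement z_j:
        K     = Ppri/(Ppri + R);
        xpost = xpri + K*(z(j+1) - xpri);
        Ppost = (1 - K)*Ppri;

        if j == 0
            xsm = xpost;  Psm = Ppost;   % x-hat(0|0) and P(0|0)
            B   = 1;                     % running product of A(i), Eq. (8.3.2)
        else
            % A is A(j-1), saved from the previous pass (Eq. 8.3.3):
            B   = B*A;                           % Eq. (8.3.2)
            xsm = xsm + B*(xpost - xpri);        % Eq. (8.3.1)
            Psm = Psm + B*(Ppost - Ppri)*B;      % Eq. (8.3.4)
        end

        % Save the smoothing gain and project ahead for the next pass:
        A    = Ppost*phi/(phi*Ppost*phi + Q);    % Eq. (8.3.3), scalar form
        xpri = phi*xpost;
        Ppri = phi*Ppost*phi + Q;
    end
    % xsm and Psm now hold x-hat(0|N) and P(0|N).

Note that B(j) is simply accumulated as a running product of the smoothing gains A(i), so nothing beyond the current filter quantities needs to be stored.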
OF course, as with any recursive algorithm, roundoff error ‘may eventually lead to difficulties, We now look at an example of fixed-point smooth Dec ‘Toillustrate the application of the fixed-point algorithm, we use the same process ‘and simulated data used in Example 8.1. We let the fixed point be at k = 0 and the data for the first 5 steps are given in Table 8.3. Since the filtering results are needed for the solution of the smoothing problem, they are also listed in the table. "The fest step is to compute £(0)0) and P(O)0). This was done in the previous filter solution with the result 400) = -1.610 POO) = 5 We are now ready to find s(0[1) by leting k = 0 and j = 1 in Eqs. (6.3.1), (83.2), and (8.3.3). The results are ‘Table 89 Data for xsc-Polat Exam a) -D A_BD On PUD PuU-D POD =1510 9432 =1610 5 2 o 1 1317-1578 9114 9432-1364 341951963419 2 1350-1291 8860 8596-1414 268936772089 31406-1323 a661 7616-1477 2093 2975.29 41454-1378 8513 6597-1528 2060 25952060 5 BIS 1425 galt 5616-1787 19172321917920. CHAPTER SMOOTHING BL) = A = PODSP™'Cl0) 9432 H10]1) = 301) + BULRCAND ~ 80]0)] = -1364 We now have the needed information to go on to the next step and compute 4#{0)2), and so on. Note that itis not necessary to compute the smoothed error Covariance in order to find the smoothed estimates. It was computed for checking porposes, though, and is listed in Table 83. In this example the fixed-point fstimate converges nicely in 50 steps, and for j = 30 the solution is the same {that obtained from the fixed-interval algorithm. (See the k = 0 entry in Table 82) a FIXED-LAG SMOOTHING “The fixed-lag smoothing problem was originally solved by Wiener in the 1940s. However, he considered only the stationary case, that is the smoother was a5- ‘sumed to have the entire past history of the input available for weighting in its Alelermination ofthe estimate. OF the three smoothing categories mentioned in Seetion 8.1, the fixed-lag problem's the most complicated. Ore reason for this js the start-up problem, For example, let us say we wish to estimate our process fa a point three steps back from the current point of measurement. If we enter the problem with the first measurement at r= 0, we have only the one mea- surement on which to base the estimate, and there is a gap in the measurement Soquence between the current time and point of estimation. This gap becomes filled a time progresses, bat we have to be watchful of this litle detail for the first few stops. “Meditch (1) gives an algorithm fr fxed-lag smoothing that is considerably more complicated than the algorithms for the other two smoothing categories. We take 2 somewhat simpler approach here. Knowing the RTS algorithm for fixed-interval smoothing, we can always do fixed-lg smoothing by first filtering up to the current measurement and then sweeping back a fixed number of steps ‘with the RTS algorithm, If the number of backward steps ‘s small, this is @ Simple and effective way of doing fixed-lag smoothing. If however, the number Of backward steps is large, this method becomes cumbersome and one should Took for more efficient algorithms. To illustrate the sweeping back on a running time basis and the start-up problem, we return to the Gauss-Markov example previously considered in Examples 8.1 and 8.2. EXAMPLE 8.3 ‘The needed filtering data for our Gauss-Markov example is given in Example 82 Let us say that we wish to estimate the process three stops back from. the current point of measurement, and that the fist measurement occurs at k = 0. 
24 FIXED-UG SOOTHING 321 Table 8.4 Forward and Backward Swesps (Currant Measurements at k = 0) Forward Sweep Backward Sweep Pak-) #0) Pa 20) 3 1 ° 1 0) = | =1610 2 1 0 1-1 9802-1578 “1 1 ° 1-2 9302-1587 ° ' -160 -3 9802-1516 Index & is the discrete running time variable. Our first estimate is then to be 4(—3)0), Anticipating that we will use the Rauch backward sweep from k = 0 to k= —3, we see that we will need all the filter estimates for this interval, Now, even though we have no measurements prior to k = 0, we can stil form a trivial forward sweep beginning at k= ~3. Recall that filter updating without the benefit of a measurement simply amounts to setting the a posterion and P ‘equal to the a priori and P~. The resul ofthis Wivial sweep is showa on the Teft in Table 8.4, We now have all the filtering data needed for the backward sweep. The desired estimate #(~3)0) is obtained from Eqs. (8.2.1) and (8.2.2), fand the results are given on the right side of Table 84, Presumably, the only estimate of interest here is the last entry inthe table, #(~3)0) = ~1.516, 'Next, suppose we get a measurement at k = 1, and we wish the smoothed estimate three steps back at k = ~2. The forward sweep that began at k= ~3 ‘may now be extended one step, and the results are shown at the left in Table 8.5. Note that the only new entries are in the bottom row. The results of the backward sweep are also given in Table 8.5, and all the entries here are-new; that is, the backward sweep must be completely redone with cach new mea- ‘surement. Note that each time we increment the filter forward one step, the oldest, estimate and its error covariance may be discarded, because we need only the four most recent filter results for the backward sweep. Thus, we have sort of a ‘sliding window” of data that we must store as time increments forward ‘The procedure just described may now be repeated indefinitely. For refer- lence purposes, the solutions for the next two steps are given in Tables 8.6 and Table 85 Forward and Backward Swesps (Current Measurements at k= 1) Forward Sweep Backward Sweep ktak-) PRD) kA) EID 2 0 1 oo ian, io 1 1 0 9482-1364 oo 1 5 19802-1337 1-187 5196 39-2 9802-1310922. cHAPTER® swOOTANG ‘Table 86 Forward end Backward Sweeps (Curent Meosurement st k = 2) Forward Sweep Backward Sweep Suk PWD HPA) HD) “1 ° 1 oo 20 = 1350 ° ° 1 oS 1 gue -137t 1-15 5196-1317 3419 gas -1415 2-120 367) 13502689 soz -1387 8.7. The procedure has been demonstrated with tables at each step for the pur- pose of clarity, but it should be recognized thet much of the data in the tables {s either repetitious or not necessary for succeeding steps. Thus, the actual pro- ‘gramming ofthe running forward-plus-beckward algorithm is not as complicated ‘as would appear from the tabular listings. With each new measurement, it simply involves moving one step forward and M steps back, where M is the desired fixed lag a a5 FORWARD-BACKWARD FILTER APPROACH TO SMOOTHING In 1969, D.C, Fraser and J. E, Potter presented a novel solution to the smoothing problem (i), With reference to the fixed-interval problem, their approach was to {iter the measurement data from both ends to the interior point of interest, and then the two filter estimates wece combined to obtain the optimal smoothed estimate. This is illustrated in Fig. 8.5. The equations forthe continuous problem ‘will be developed first following the derivation given in Gell (5). We will then
7. We speculate that PL) = Or — IPA) + WIG —) S27. will satisfy Bq. (8.5.23). Note thatthe discrete Q, is writen as a function of because it varies as the interval increases. Differentiation of the assumed solu tion, Eq, (8.5.27), leads to ay _ dey aren + gas str +) Be = (rsd + QO + ot TH" +o HPsad + (85.28) ‘where the parenthetical dependence has been omitted in places to save writing. ‘We fest note from linear system theory that ap) (8.5.29) ee = Fe “Thus, the frst and third terms of Eq. (85.28) combine to yield (FP, ~ PAF”), ‘which is identical tothe fist two terms on the right side of Eg (8.5.23). Looking next at the ”"(dO,/de}-" term in Eq, (85.28), we see that we need to form the derivative of Q, Equation (5.3.6) is useful here. Let 4 = T~ toy = 7 ~ ‘oy and note that Efu(Siu"(n)] = Q 8 ~ m). After performing the inner inte- aration, we have =f er- mw n6Qcer-madn — 45:30 Differentiating this with respect to + leads to = Mr ~ GOGO" ~ 7) es31) ar ‘This may now be pre- and postmultiplied by "(7 ~ 1) and "(r= 7), as indicated in Eq, (8.5.28), and the result is obviously GQG", which equals the third term on the right side of Eq, (8.5.23). Thus, the discrete covariance pro- jection given by Eq. (85.26) is seen to satisfy the continuous differential equa-tion for B, with set equal to zero. It is important to note thatthe d and Q, in the projection equations are computed from the forward process equations. ‘That is, they are the same parameters that would have been wsed in the forward filter forthe interval, had the forward recursive steps been continued up through that interval of time. ‘The boundary conditions for the backward filter are awkward, but not possible. One way of implementing the infinite intial P matrx in offline proc- essing is 10 make all terms along its major diagonal very large, say, about 10 forders of magnitude larger than their inital counterparts in the forward filter. ‘The off-diagonal terms may then be set equal to zero. For all practical purposes, this completely de-weights the prior information about the arocess relative to the weight that it is given in the forward filter. This is not 2 very elegant ap- proach, but itis an effective, practical solution in many applications. In simple ‘models where the order of the state vector is small, the toundary-condition problem may be handled analytically by ising the alternative Kalman filter al gorithm for the fist step in the backward filter. This is illustrated in the one- State example presented atthe end ofthis section, and itis alsa discussed further in Problem 8.6. IF all else fails, we can resort to propagating P inverse as sug- ‘gested inthe Fiaser-Potles paper cited eatliee ‘We assume now thatthe forward filter has been stopped ahead recursively at the estimation point, say, k, and the end result is an a posteriori estimate x, and an associated P,. The backward filter steps backward from the end point N, land it stops at k + 1 where it assimilates the 2°" measurement. Tc then projects this estimate one more step to oblain an a priori estimate x, and its associated Px. Ie does not assimilate the 2 measurement, because this has already been used in the forward filter Finally, the forward and backward estimates are blended together in accordance with the equation uMnLP, CN) +P (85.32) whece PUIN) y+ (8533) ‘An example will now illustrate the procedure, EXAMPLE 8.4 ‘Again, we consider the same Gauss-Markov process used for the previous ex- amples ofthis chapter. 
Let us say we want to find the smoothed estimate at k = 48, that is, 2(48}50). We first forward-filter up through zy. This has already been done in previous examples and the results are given in Table 8.1. The pertinent forward-flter results are sn 165 fy Pe ‘We now need to generate a similar estimate at k = 48 with the backward 185 FORWARO-BACKWARO FILTER APPROACH TO SMOOTHING 328 filter. We begin at the end point where k = $0. The initial conditions for the backward filter are Si Pry (8534) where the asterisk indicates “a prior” in the backward filter, and the subscript indicates the backward index i, rather than k. We update the initial estimate with the alternative Kalman filter algorithm in order to avoid the P-equal-infinity problem. Thus, we have at k = 50 (i = 0) f + HERG Hy +1 (6535) 1 6536) “The galn Is then Kyy = PioHSRa! = 1 6537 and the updated estimate is Seq = in + Kilt ~ Haig] = 29 = ~1.138 (Grom k = 50 entry of Table 8.1) (8538) [Note that we could have taken @ more heuristic approach here and simply said: “Initially, we knew nothing about the backward process at the end point, and hence we must accept the measurement at k = 50 at face value. The resulting estimation error is then just the measurement erro.” (This philosophy can also be extended to the vector case, as shown in Problem 8.6.) ‘Continuing the backward filter, we next project x, and P, w k= 49 (i = 1) sing Eqs. (8.5.25) and (8.5.26). The results are eq = e(—1.138) = ~1.161 85.39) Pi = Pig + Ode = (e+ 0392] 10816 (85.40) Next, we compute the gain and update the a priori estimate using the regular Kalman filter algorithm: Pr eae Ft = 5196 san By) = + Kae, — 8) = — 59 (85.42)390 CHPTER® SMOOTHING (Note from Table 8.1 that the measurement at k = 49 is ~.064). The a posteriori fervor covariance matrix is then Py =~ KydPiy = 5196 (85.43) Now we project 2, and Py, to k= 48 (¢ = 2), a = eM, = —603 6544) Pin = e(Py, + Qdet = 5816 (e545) “The backward filter is stopped at this point, because the measurement at k = 48, has alresdy been assimilated in the forvard filter. We are now ready to blend together the forward and backward estimates in accordance with Eqs. (8.5.32) and (8.5.33). The results are P(AR|50) = [(-1653)-* + (5816"J* = 1287 and 5(48)50) = PCAB'SOPalg + PES'Ka] = ~-531 Notice that these results are the same as those given in Table 82, which were obtained using the RTS algorithm. 8 PROBLEMS 8.1 Consider the same signal and noise sition used in Example 4.3 We found there that the causal Wiener filter was a simple first-order low-pass filter. Using Eq, (43.22), find the stationary, fixed-lag smoothing solution for « = ~1. Recall that the signal and noise are independent processes with autocorrelation functions RQ) = ett RA) = ‘The solution may be given as a transfer function, that is, find G(s) rater than (9. Note that the smoothing solution is considerably more complicated than the corresponding filter solution 82 (a) Consider the one-step back, fxed-interval smoothing problem. Its so- lution may be obtained from the usual Kalman filter equations by batch- ing together the last two measurements %., and 2 and considering them as one measurement occurring at N'~ 1. Note that 2y ean be Tinearly related to xy. 
a8 follows: HiXy + Ye Hl) * Wes) + Yy Hb DMuan + HyMys + Y) PRoaLems 831 ‘The batched measurement relationship at N’— 1 is then [2 -Le + fee] [Note thatthe upper and lower components of the batched measurement ccan be processed sequentially because their errors are uncorrelated, ‘Assume now that we have an a priori estimate &(N ~ [NV ~ 2) and its ‘error covariance matrix P(N ~ IIN ~ 2). Proceed to update the estimate by assimilating the two components of the measurement separately in two steps, and show the final result isthe same as that obtained using the RIS algorithm. (Hint: The frst sequential step yields $(V ~ IV ~ 1). Next, show that the gain for the second sequential step is PON ~ TIN ~ 1)%_,POMIN' ~ 1)-"K,y, where Ky is the usual Kalman filter gsin for the Mth stage. Finally, replace XMM) in the RTS formula with [8(MN — 1) + Klay ~ Hy8(NIN — 1} and show the equivalence.) (©) The exercise of part (a) can be generalized to justify the RTS algorithm for any interior point within the fixed interval from 0 to N. To do this, Tet the interior point be denoted & and batch together all subsequent measurements 2, Zar ---+Zy- Call the batched measurement Y,,, that is, ‘We ean now form a linear connection between X,., and Yyey a5 Yee ‘The batched measurement y,,, now plays the same role as 2y in part (@). (We do not actually have to write out Mg. and vj., explicitly. We simply need to know that such a relationship exists.) We can now con- sider the interior-point smoothing problem in terms of an equivalent filter problem that terminates at k + 1. This isthe same as the problem ‘considered in (a) except for notation, Proceed through the steps of ex fercise (a) again with appropriate changes in notation and show that the generalization is vali 83 Consider a stationary, Gaust-Markov process with an. autocorrelation function Reyer ‘Assume that we have two noisy measurements of this process that were made fate O and r= 1. Call these z and z,. The measurement errors associated with392 cHAPTER@ SMOOTHING ze and 2, are white and have a variance of unity, We wish to find dhe optimsl estimate of x at = O, given z and 2, that is, we desire £(O{1). Write an expression for (0|1) explicitly in terms of the measurements 2, and 2, using (@) The RTS algorithm. (@) The Fraser-Potter forward-backward filter method. (©) The weight factor method (see Section 47). (@) The fixed-point algorithm of Section 8.3. ‘84 Show thatthe continuous version of the RTS algorithm is as follows S41) = FAGID) + GOGTP-'ClofsCE) ~ SCH] Buln = [F + GOGTP-Kén}PeEN, + PUDLF + GGA)" — GOST Boundary conditions: (7) and PCT} obtained from fier solution, (Hin: Begin with the diseete RTS algorithm and then let the step size approach zero, just as was done in deriving the filter equations in Chapter 7. Recall that Q, in the diserste mode! approaches GQG" ar for small a. Also nce that [1 AQ)] is ofthe order of Ar im the smoothing algorithm: ths, the gain A(F) approaches Tin the limit as Ar — 0) 85 Consider a Wiener process and measurement situation a follows: seu, x(0)=0 +o) where u(t) and o() are independent Gaussian white-noise processes with spectral amplitudes q and , respectively (@) Assume & fixed interval T that is sufficiently large to allow the filter Solution to reach a steady-state condition on the forward sweep. This then becoms the boundary condition for the backward sweep. 
Using the continuous RTS algorithm given in Problem 84, show that P(q7) for the terminal region of th (0, T] interval is approximated by Pun) = $1 + eam where 0 = Veg and 8 = Va7e (o) Now thatthe Sluton of yt (2) approaches a/2 as #0. This Cbriouly not comple withthe Keown a prior toendary condition fora Wiener proves that, PO) = 0 Explain tis ieepaey 86. Consiter the two-state system [}-F3 allele) amplitude spectral functions. We have a sequence of scalar measurements ofthis Drocess ZZ» 2y that are related t0 the process by the equation Prosicus 398 aa U tte, k= OT. ‘where v, has unity variance, That is, we are allowed to observe only the sum of the state variables at each step and not their individual values. Suppose we want fan estimate of the process at some interior point, and we wish to get it using the Fraser-Poter forward-backward-filter method. In this case, the single me surement z does not provide enough information to yield a finite error covariance estimate of x at ¢ = N, Thus, we cannot start the backward filter ‘uite a5 easly as was done in Example 8.4. In this problem, let the Av interval bbe unity and show that the backward filter may be started at ¢ = N ~ 1 by batching together z and zy. into an equivalent vector measurement at ¢ = NN ~ 1. The same technigue used in Problem 8.2 will be helpful here, In effect, you need to show that the error covariance after assimilating zy and zy is finite and nonsingular, and that the estimate is the same as would be obtained by deterministic methods. (The extension of this technique to higher-order sys- tems is fairly obvious, provided the system is observable. We simply batch together an appropriate number of measurements at the end of the interval and then solve forthe system stat, just as ifthe measurements were noise-free. This then becomes the inital state estimate for the backward filter, and we start the backward filtering an appropriate number of steps back from the end point) 87 Table 8.3 sumarizes the results of the fixed-point smoothing simulation of| Example 82. Notice that the filter error variances listed under the column PGlA) ate identical to the smoothing error variances P(O|)). Give an inwitive explanation of this coincidence. 8.8 A rough sketch ofthe error variances forthe forward and backward sweeps for Example 8.1 is shown in Fig. 8.3. Using MATLAB (or other suitable soft- ‘ware), calculate and plot the 51 variances for each of the forward and backward sweeps, Note the symmetry in the backward-sweep variances as seen from the ‘midpoint of the measurement stream. Give an intuitive explanation for the symmetry. 89 A two-state model for the GPS selective availability process was given in Example 6.1, Chapter 6 The model is repeated here for convenience. f}-[& Aa]E}- helo act ofef ee = O12 madisee (0043987 m?/(rad/sec)? unity white noise where Mo) In this model, x, and x, are phase variables and both have finite variances. Consider the off-line fixed-interval smoothing problem where we have 101 noisy samples of this process, The sampling interval is 20 sec, and the rms noise334 GHAPTER® SMOOTHNG 7 associated with each sample is 20 m. Calculate the rms errors associated with teach of the smoothed estimates for ths situation. Note (@) the improvement in the state estimates inthe midrange of the data span as compared with the est- sate at the end point (.e, the best filtered estimate), and (b) the symmetry in the rms error sequence relative to the midpoint of the measurement span. REFERENCES CITED IN CHAPTER & 1. S. 
Modich, Stochastic Optimal Linear Estimation and Control, New York: MeGraw- Hil, 1969. 2. HE. Rauch, "Solutions tothe Linear Smoothing Problem," IEEE Trans. Auto. Con- sol, AC: 371 (1963), 3. HLE. Rauch, F Tung, and C. T. Strebel, “Maximum Likelihood Estihates of Linear Dynami Systems." ATAA J, 3: 1445 (1965). 4, D.C. Fraser and J. E, Poe, "The Optimum Linear Smoother as 2 Combination of ‘Two Optirmim Linear Filter," JEEE Trans. uto. Control, AC-14(4: 387 (Aug. 1969). 5. A. Gelb (ed), Applied Opal Estimation, Cambridge, MA: MIT Press, 1974 Additional References on Smoothing 6. ALP. Sage and J. L. Mesa, Estimation Dicory wit Applications to Communications ‘and Control, New York: McGraw Hil, 1971. 17. B.D. 0. Anderson nd J. B. Moor, Opn Filtering, Englewood Clif, NI: Pretice Hall, 1979. 8. M.S, Grewal and A. P. Andrews, Kalman Fltering Theory and Practice, Englewood Clits, NJ: Prentice, 1993 9. G. Miakler and J. Minkler, Theory and Application of Kalman Fitering, Palm Bay, FL: Magellan Book Co, 193 Linearization and Additional Intermediate-Level Topics on Applied Kalman Filtering Kalman’s papers of the early 1960s (1, 2) were recognized almost immediately ‘a8 new and important contribations to least-squares filtering. As a result, there ‘was a renewal of research interest in this area, and a flury of papers expanding ‘on Kalman's original work followed during the next decade or 50. Kailath (3) sives an especially comprehensive bibliography of papers for this petiod. Re- Search woek inthis area still continues (although perhaps ata somewhat reduced rate), and new applications and extensions continue to appear regularly in the technical literature, A few of the more significant extensions and related topics have been selected for comment here. The list is by no means comprehensive, However, there is a hierarchy of importance, and itis the authors’ reeommen- dation thatthe reader begin with the first topic on linearization. This is, by fa, the most important section in this chapter. The other sections may be studied in any desired order as time permit. LINEARIZATION Some of the most successful applications of Kalman filtering have been in sit uations with nonlinear dynamics and/or nonlinear measurement relationships. ‘We now examine two basic ways of linearizing the problem. One isto linearize about some nominal trajectory in state space that does not depend on the mea suement data, The resulting filter is usually referred to as simply a linearized Kalman fiter, The other method isto linearize about a trajectory that is contin~ telly updated with the state estimates resulting from the measurements. When this is done, the filter is called an extended Kalman fier. A brief discussion of cach will now be presented. 335336 CHAPTERG. UNEARIZATION AND ADDITIONAL IVTERMEDIATE-LEVEL TOPICS Linearized Kalman Filter ‘We begin by assuming the process to be estimated and the asiociated measure- ‘ment relationship may be writen in the form fox, ay 1) + 00) oy hx, 0) + 0) 12) Where F and hare known functions, u, is # deterministic forcing function, and Wand v are white-noise processes with Zero crosscorrelation a: before. Note that nonlinearity may enter into the problem either in the dynamics of the process fr in the measurement relationship; Also, note thatthe forms ef Eqs. (9.1.1) and (9.1.2) are somewhat restctive in that w, andy are assumed to be separate additive terms and are not included with the f and hi terms. 
However, to do ‘otherwise complicates the problem considerably, and thus we will tay with these restrictive forms. [Let us now assume that an approximate trajectory x*() may be determined by some means. This will be eferred to as the nominal or relerence trajectory, and it is illustrated along with the actual trajectory in Fig, 9.1. The actual 2° jectory x(@) may then be written as x = x0 + ax, 0.13) Equations (2.1.1) and (9.1.2) then become at fixt +x, uy) + 0) 1a) hat + ax.) + O15) ‘We now assume Ax is small and approximate the f and h functions with Taylor's series expansions retaining only first-order terms. The result is eal rjeceny tt Fre 21 Nerina act racers rs Ka 24 UNEARBATION 337 AP A= FO [2] ax tu 016) z= hoe, oF [2] ox W0 oan where af, Af, mh ay an a, 2%, Oe, at _Jofah..|, am] ahah ae lanag fa La ay Mae) I's customary to choose the nominal trajectory x*() to satisfy the deterministic differential equation Pa ft, a ers) Substituting this into (9.1.6) then leads to the Tiearized modet ai-[E] a +00 tie oumiad 049 te-mern)=[] sar 19 sn seme pi sa 111 [Note thatthe “measurement” in the linear mode! isthe actual measurement less ‘that predicted by the nominal trajectory in the absence of noise. Also the equiv- alent F and H matrices are obtained by evaluating the partial derivative matrices (Eqs. 9.1.8) along the nominal trajectory. We will now look at two examples that illustrate the Tinearization procedure. In the first example the nonlinearity ‘appears only in the measurement relationship, so it is relatively simple. In the second, noalinearty occurs in both the measurement and process dynamics, so itis somewhat more involved than the frst. EXAMPLE 9.4 In many electronic navigation systems the basic observable is a noisy measure- ‘ment of range (distance) from the vehicle to a known location. One such system that has enjoyed wide use in aviation is distance-measuring equipment (DME) (4), We do not need to go into detail here as to how the equipment works, It suffices to say that the airborne equipment transmits a pulse that is retumed by the ground station, and then the aircraft equipment interprets the transit time in398 CHAPTER 9 UNEARZATION AND ADDITIONAL ITERMEDIATE-LEVEL TOPICS terms of distance. In our example we will simplify the geometsic situation by assuming thatthe aircraft and the two DME stations are all in a horizontal plane as shown in Fig. 9.2 (slant range ~ horizontal range). The coordinates of the {wo DME stations are assumed to be known, and the airraft coordinates are tunknovin and co be estimated ‘We will look at th airraft dynamics frst. This, in turn, will determine the process state model. To keep things as simple as possible, we will assume a ‘nominal stright-and-level fight condition with constant velocity. The true ta- {ectory will be assumed to be the nominal one plus small perturbations due to random horizontal accelerations, which will be assumed to be white, This leads to random walk in velocity and integrated random walk in position. This is probably unrealistic for long time spans because of the control applied by the pilot (or autopilot), However, this would be a reasonable model for short inter Vals of time. The basic differential equations of motion inthe x and y directions are then yo 0+ wy 1.12) Figure 92. 
Geomaty for OVE erie, 81 LNEARZATION 338 ‘The dynamical equations are seen to be linear inthis case, so the differential ‘equations for the incremental quantities are the same as forthe total x and y, that is, aru, ay=u, e113) We now define filter state variables in terms of the incremental positions and velocities: stan He eV fEe Sel [sp ou “The state variables are driven by the white-noise processes u, and u, 80 We are assured thatthe corresponding discrete equations will bein the appropriate form for a Kalman filter ‘We now tum to the measurement relationships. We will assume that we have two simultaneous range measurements, one to DME, and the other to DME,. The two measurement equations in terms of the total x and y are then 42 Vem AP FH b) toy n= Ve~ ay FOO th 11.16) ‘where v, and v, are additive white measurement noises. We see immediately that the connection between the observables (z, and z,) and the quantities to be estimated (x and y) is nonlinear. Thus, linearization about the nominal trajectory is in order. We assume that an approximate nominel position is known at the time of the measurement, and that the locations of the two DME stations are known exactly, We now need to form the dh/ax matrix as specified by Eq, 8.1.8), [We note a small notational problem here. The variables xy %, and 1% are used in Eg. (9.1.8) to indicate total state variables, and then the same symbols are used again to indicate incremental state variables as defined by Eqs (0.1.14), However, the meanings of the symbols are never mixed in any one set ‘of equations, so this should not lead to confusion] We now note that the x and 'y position variables are the first and third elements of the state vector. Thus, ‘evaluation of the partial derivatives indicated in Eg, (9.1.8) leads to‘340 CHAPTER 9° LINEARIZATION AND ADDITIONAL INTERMEDIATE EVEL TOPICS ee an ey — aF + Gy bY Ve, — a? + Oy - bY x ee | iG Mex, = aF + Gy = by? Vix = a. + Oy = bY oan a ena en) Finally, we note that Bq. (9.1.18) can be generalized even further, since the sine and cosine terms are actually direction cosines between the x and y axes and the respective lines of sight to the two DME stations. Therefore, we will write the linearized Ht matrix init final form as ah 4 1119) cos 8, 0 cos 0 jue” [£08 8, 0 C08 Gy 0. where the subscripts on @ indicate the respective axes and lines of sight to the DME stations. Note that His evaluated at @ point on the nominal trajectory. (The tue tajectory isnot known to the filer) The nominal aircraft position will change with cach step of the recuisive process, so the terms of Hare time- ‘arable and must be recomputed with each recursive step. Also, recall from Eq (©.111) that the measurement presented to the linearized fier isthe total tninus the predicted 2 based on the nominal position x". ‘Switly speaking, the linearized filter is always estimating incremental ‘quantities, and then the total quantity i reconstructed by adding the incremental éstimate to the nominal part. However, we will see later that when it comes to the actual mechanics of handling the aitimetic onthe computer, we can avoid working with incremental quantities if we choose to do so. This is discussed further inthe section on the extended Kalman filter. 
We will now proceed t0 @ second linearization example, where the process dynamics as well a the mea- surement relationship has to be linearized a EXAMPLE 9.2 Zalman Fler al 8 Boe ‘This example is taken from Sorenson ‘classic example bf liniear of a nonlinear problem, Consider a near-arth space vehicle in «neatly circular cit Is desied to estimate the vehicle's postion and velocity on the bas of 2 sequence of angular measurements made witha horizon sensor With reference to Fig. 93, the horizon sensor is capable of measuring: 1. The angle » between the earth's horizon and the local vertical 2. The angle a between the local vertical and a known reference line (say, to a celestial object). a ving Fehler 9.1 UNEARZATION 341 Figure 13. Coeetes or space whi exe, Ti dhe interest of simplicity, we assume all motion and measurements 1 be within a plane as shown in Fig, 93. Thus, the motion of the vehicle can be described withthe usual polar coordinates r and @. ‘The equations of motion forthe space vehicle may be obtained from either Newtonian or Lagrangian mechanics. They are (see Section 2-10, ref. 6): wnt > poe Eentg 20 ro-+ 2F0 = use) 121) ‘where K is a constant proportional to the universal gravitational constant, and 1, and u, are small random forcing functions in the r and @ directions (due minly (0 gravitational anomalies unaccounted for in the K/?® term). It ean be seen thatthe constant K must be equal to gR? if the gravitational forcing function js to match the earth's gravity constant g at the surface, The random forcing functions w, and u, will be assumed to be white. We will look atthe linearized process dynamics first and then consider the nonlinear measurement situation later, “The equations of motion, Eqs. (9.1.20) and (9.1.21), ae clearly nonlinear, so we must linearize the dynamics if we are to apply Kelman filter methods, We have assumed that random forcing functions u, and uy are small, so the corresponding perturbations from a circular orbit will also be small. By direct substitution into Eqs. (9.1.20) and (9.1.21), it can be verified that, y= Ry (@ constant radius) (1.22) wow (a= (B) a VR) Pash PF petit eed i‘342 GHAPTER9_LINEARIZATION AND ADDITIONAL INTERIEDIATE-LEVEL TOPICS. 91 UNEARWATION 343, ‘vil satisfy the differential equations. Thus, this will be the reference trajectory that we linearize about. We note that we have two second-order differential equations describing, the dynamics. Therefore, we must have four state variables in our state model \ | ar We choose the usual phase variables as state variables as follows: | z 127) | mm fo 0 0 1 ner =2ay 0 2% o 0 Re a (0.1.24) Equation (9.1.27) then defines the F matrix that characterizes the linearized ‘dynamics. Note that inthe linear equations, Ar, Ar, A¢, and A@ become the four state variables, ‘We now fur to the measurement model. The idealized (no noise) rlation- ships are given by : @) a] fy) [swt (% : : (0.128) = al le] | w-6 ‘We nest replace with and with xy, and then perform the partial derivatives mK CL indicated by Eq. (9.8). Te result is ‘The noal We must ow form the at/ox matrix indicated in Eg, (9.1.8) to get the linearized on 0 00 i [2]- om ' 0 010 cig 0 Finally, we evaluate ah/3x along the reference tajcto (a+) 0 0 mm . 7 sd a F CL an i=? 9° ao ee 0.1.20 lee o 0-10 ‘This then becomes the linearized H matrix of the Kalman fiter. The linearized ‘model is now complete with the determination ofthe F and H{ matrices. 
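The linearized dynamics of Example 9.2 are easy to verify numerically. Assuming the phase-variable ordering [Δr, Δṙ, Δθ, Δθ̇] and a hypothetical 500-km circular orbit, the MATLAB fragment below builds the F matrix of Eq. (9.1.27) and checks its characteristic roots, which work out to 0, 0, ±jω₀ from the characteristic polynomial λ²(λ² + ω₀²).

% Linearized dynamics for the near-circular orbit example (Eq. 9.1.27).
% R0 (reference orbit radius) is a hypothetical value for illustration.
g  = 9.807;                 % m/s^2, gravity at the earth's surface
Re = 6.378e6;               % m, earth radius
R0 = Re + 500e3;            % m, assumed circular orbit radius
K  = g*Re^2;                % gravitational constant in Eq. (9.1.20)
w0 = sqrt(K/R0^3);          % reference orbital angular rate (rad/s)

% State ordering: x = [dr; dr_dot; dtheta; dtheta_dot]
F = [ 0          1          0   0
      3*w0^2     0          0   2*R0*w0
      0          0          0   1
      0         -2*w0/R0    0   0 ];

eig(F)    % characteristic roots: 0, 0, +j*w0, -j*w0 (to within roundoff)

The purely imaginary pair at the orbital rate and the two zero roots are consistent with the well-known neutral stability of small in-plane perturbations about a circular orbit.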
Before we leave this example, though, it is worth noting that the forcing function u_θ(t) must be scaled by 1/R₀ in the linear model because of the 1/r factor in Eqs. (9.1.25) and (9.1.26).

The Extended Kalman Filter

The extended Kalman filter is similar to a linearized Kalman filter except that the linearization takes place about the filter's estimated trajectory, as shown in Fig. 9.4, rather than a precomputed nominal trajectory. That is, the partial derivatives of Eq. (9.1.8) are evaluated along a trajectory that has been updated with the filter's estimates; these, in turn, depend on the measurements, so the filter gain sequence will depend on the sample measurement sequence realized on a particular run of the experiment. Thus, the gain sequence is not predetermined by the process model assumptions as in the usual Kalman filter.

A general analysis of the extended Kalman filter is difficult because of the feedback of the measurement sequence into the process model. However, qualitatively it would seem to make sense to update the trajectory that is used for the linearization; after all, why use the old trajectory when a better one is available? The flaw in this argument is this: the "better" trajectory is only better in a statistical sense. There is a chance (and maybe a good one) that the updated trajectory will be poorer than the nominal one. In that event, the estimates may be poorer; this, in turn, leads to further error in the trajectory, which causes further errors in the estimates, and so forth, leading to eventual divergence of the filter. The net result is that the extended Kalman filter is a somewhat riskier filter than the regular linearized filter, especially in situations where the initial uncertainty and measurement errors are large. It may be better on the average than the linearized filter, but it is also more likely to diverge in unusual situations.

Both the regular linearized Kalman filter and the extended Kalman filter have been used in a variety of applications. Each has its advantages and disadvantages, and no general statement can be made as to which is best, because it depends on the particular situation at hand. Aided inertial navigation systems serve as good examples of both methods of linearization, and this is discussed further in Chapter 10.

Figure 9.4 Reference and actual trajectories for an extended Kalman filter.

Keeping Track of Total Estimates in an Extended Kalman Filter

It should be remembered that the basic state variables in a linearized Kalman filter are incremental quantities, and not the total quantities such as position, velocity, and so forth. However, in an extended Kalman filter it is usually more convenient to keep track of the total estimates rather than the incremental ones, so we will now proceed to show how this is done and why it is valid to do so.

We begin with the basic linearized measurement equation, Eq. (9.1.11):

z − h(x*) = H Δx + v        (9.1.31)

Note that when working with incremental state variables, the measurement presented to the Kalman filter is [z − h(x*)] rather than the total measurement z. Next, consider the incremental estimate update equation at time t_k:

Δx̂_k = Δx̂_k⁻ + K_k[z_k − h(x*_k) − H_k Δx̂_k⁻]        (9.1.32)

Now, in forming the measurement residual in Eq. (9.1.32), suppose we associate the h(x*_k) term with H_k Δx̂_k⁻ rather than z_k. This measurement residual can then be written as

(z_k − ẑ_k⁻)        (9.1.33)

because the predictive estimate of the measurement ẑ_k⁻ is just the sum of h(x*_k) and H_k Δx̂_k⁻. Note that the measurement residual as given by Eq. (9.1.33) is formed exactly as it would be done in an extended Kalman filter; that is, it is the noisy measurement minus the predictive measurement based on the corrected trajectory rather than the nominal one.

We now return to the update equation, Eq. (9.1.32), and add x*_k to both sides of the equation:

x*_k + Δx̂_k = x*_k + Δx̂_k⁻ + K_k(z_k − ẑ_k⁻)        (9.1.34)

or

x̂_k = x̂_k⁻ + K_k(z_k − ẑ_k⁻)        (9.1.35)

Equation (9.1.35) is, of course, the familiar linear estimate update equation written in terms of total rather than incremental quantities. It simply says that we correct the a priori estimate by adding the measurement residual appropriately weighted by the Kalman gain K_k.

Note that after the update is made in the extended Kalman filter, the incremental Δx̂_k is reduced to zero. Its projection to the next step is then trivial. The only nontrivial projection is to project x̂_k (which has become the reference x* at t_k) to x̂_{k+1}⁻. This must be done through the nonlinear dynamics as dictated by Eq. (9.1.1). That is,

x̂_{k+1}⁻ = [solution of the nonlinear differential equation ẋ = f(x, u_d, t), subject to the initial condition x = x̂_k at t = t_k]        (9.1.36)

Note that the additive white-noise forcing function u(t) is zero in the projection step, but the deterministic u_d is included in the f function. Once x̂_{k+1}⁻ is determined, the predictive measurement ẑ_{k+1}⁻ can be formed as h(x̂_{k+1}⁻), and the measurement residual at t_{k+1} is formed as the difference (z_{k+1} − ẑ_{k+1}⁻). The filter is then ready to go through another recursive loop.

For completeness, we repeat the familiar error covariance update and projection equations:

P_k = (I − K_k H_k) P_k⁻        (9.1.37)

P_{k+1}⁻ = φ_k P_k φ_k^T + Q_k        (9.1.38)

where φ_k, H_k, and Q_k come from the linearized model. Equations (9.1.37) and (9.1.38) and the gain equation (which is the same as in the linear Kalman filter) should serve as a reminder that the extended Kalman filter is still working in the world of linear dynamics, even though it keeps track of total estimates rather than incremental ones.

Getting the Extended Kalman Filter Started

It was mentioned previously that the extended Kalman filter can diverge if the reference about which the linearization takes place is poor. The most common situation of this type occurs at the initial starting point of the recursive process. Frequently, the a priori information about the true state of the system is poor. This causes a large error in x̂_0⁻ and forces P_0⁻ to be large. Thus, two problems can arise in getting the extended filter started:

1. A very large P_0⁻ combined with low-noise measurements at the first step will cause the P matrix to "jump" from a very large value to a small value in one step. In principle, this is permissible. However, this can lead to numerical problems due to roundoff. A non-positive-definite P matrix at any point in the recursive process usually leads to divergence.

2. The initial x̂_0⁻ is presumably the best estimate of x prior to receiving any measurement information, and thus it is used as the reference for linearization. If the error in x̂_0⁻ is large, the first-order approximation used in the linearization will be poor, and divergence may occur, even with perfect arithmetic.

With respect to problem 1, the filter designer should be especially careful to use all the usual numerical precautions to preserve the symmetry and positive definiteness of the P matrix on the first step.
In some cases, simply using the symmetric form of the P-update equation is sufficient to ward off divergence. This form, Eq. (5.5.11), is repeated here for convenience (sometimes called the Joseph form; see ref. 7):

P_k = (I − K_k H_k) P_k⁻ (I − K_k H_k)^T + K_k R_k K_k^T        (9.1.39)

Another way of mitigating the numerical problem is to let P_0⁻ be considerably smaller than would normally be dictated by the true a priori uncertainty in x̂_0⁻. This will cause suboptimal operation for the first few steps, but this is better than divergence! A similar result can be accomplished by letting R_k be abnormally large for the first few steps. There is no one single cure for all numerical problems. Each case must be considered on its own merits.

Problem 2 is more subtle than problem 1. Even with perfect arithmetic, poor linearization can cause a poor x̂_0⁻ to be updated into an even poorer a posteriori estimate, which in turn gets projected on ahead, and so forth. Various "fixes" have been suggested for the poor-linearization problem, and it is difficult to generalize about them (7, 8, 9, 10). All are ad hoc procedures. This should come as no surprise, because the extended Kalman filter is, itself, an ad hoc procedure. One remedy that works quite well when the information contained in z_0 is sufficient to determine x algebraically is to use z_0 to solve for x, just as if there were no measurement error. This is usually done with some tried-and-true numerical algorithm such as the Newton-Raphson method of solving algebraic equations (11). It is hoped this will yield a better estimate of x than the original coarse x̂_0⁻. The filter can then be linearized about the new estimate (and a smaller P_0⁻ than the original one can be used), and the filter is then run as usual beginning with z_0 and with proper accounting for the measurement noise. Another ad hoc procedure that has been used is to let the filter itself iterate on the estimate at the first step. The procedure is fairly obvious. The linearized filter parameters that depend on the reference x are simply relinearized with each iteration until convergence is reached within some predetermined tolerance. P_0⁻ may be held fixed during the iteration, but this need not be the case. Also, if x is not observable on the basis of just one measurement, iteration may also have to be carried out at a few subsequent steps in order to converge on good estimates of all the elements of x. There is no guarantee that iteration will work in all cases, but it is worth trying.

Before leaving the subject of getting the filter started, it should be noted that neither the algebraic solution nor the iteration remedies just mentioned play any role in the basic "filtering" process. Their sole purpose is simply to provide a good reference for linearization, so that the linearized Kalman filter can do its job of optimal estimation.
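To pull the pieces of this section together, the following is a minimal MATLAB sketch of one complete extended Kalman filter cycle carried out in terms of total estimates, as described above. The example system, a simple pendulum observed through a nonlinear displacement measurement, is hypothetical and is chosen only to keep the sketch short; the nonlinear projection uses a crude Euler step, and the P update uses the Joseph form of Eq. (9.1.39).

% Extended Kalman filter mechanization sketch (hypothetical pendulum example).
% State: x = [theta; theta_dot].  Dynamics: theta_ddot = -(g/L)*sin(theta) + noise
% Measurement: z = L*sin(theta) + v   (horizontal displacement of the bob)
g = 9.807; L = 1.0; dt = 0.05; N = 400;
Q = [0 0; 0 1e-4];          % process noise at the rate level (assumed value)
R = 0.01;                   % measurement noise variance (assumed value)

xtrue = [0.5; 0];           % true initial state
xhat  = [0.0; 0];           % poor a priori estimate
P     = diag([0.5 0.1]);    % a priori error covariance

for k = 1:N
  % --- simulate the true system and a noisy measurement ---
  xtrue = [xtrue(1) + dt*xtrue(2);
           xtrue(2) - dt*(g/L)*sin(xtrue(1)) + sqrt(Q(2,2))*randn];
  z = L*sin(xtrue(1)) + sqrt(R)*randn;

  % --- project the total estimate through the nonlinear dynamics (Eq. 9.1.36) ---
  xbar = [xhat(1) + dt*xhat(2);
          xhat(2) - dt*(g/L)*sin(xhat(1))];
  phi  = [1, dt; -dt*(g/L)*cos(xhat(1)), 1];   % linearized transition matrix
  Pbar = phi*P*phi' + Q;                       % Eq. (9.1.38)

  % --- measurement update in terms of total quantities (Eq. 9.1.35) ---
  H    = [L*cos(xbar(1)), 0];                  % partial of h evaluated at xbar
  K    = Pbar*H'/(H*Pbar*H' + R);              % Kalman gain
  xhat = xbar + K*(z - L*sin(xbar(1)));        % residual uses h(xbar), not H*xbar
  P    = (eye(2) - K*H)*Pbar*(eye(2) - K*H)' + K*R*K';   % Joseph form, Eq. (9.1.39)
end

Note that the incremental Δx̂ never appears explicitly; after each update the total estimate itself becomes the reference for the next linearization.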
9.2 CORRELATED PROCESS AND MEASUREMENT NOISE FOR THE DISCRETE FILTER: DELAYED-STATE FILTER EXAMPLE

It was shown in Section 7.3 that the continuous Kalman filter equations could be modified to accommodate correlated process and measurement noise. We will now consider the corresponding problem for the discrete filter.

Discrete Filter—Correlated Process and Measurement Noise

Just as for the continuous filter, we first define the process and measurement models. They are as follows:

x_{k+1} = φ_k x_k + w_k        (9.2.1)

z_k = H_k x_k + v_k        (9.2.2)

where

E[w_k w_i^T] = Q_k for i = k, and 0 for i ≠ k (as given in Chapter 5)        (9.2.3)

E[v_k v_i^T] = R_k for i = k, and 0 for i ≠ k (as given in Chapter 5)        (9.2.4)

and

E[w_{k−1} v_k^T] = C_k (new)        (9.2.5)

Before we proceed, an explanation is in order as to why we are concerned with the crosscorrelation of v_k with w_{k−1}, rather than with w_k, which one might expect from just a casual look at the problem. Rewriting Eq. (9.2.1) with k retarded one step should help in this regard:

x_k = φ_{k−1} x_{k−1} + w_{k−1}        (9.2.6)

Note that it is w_{k−1} (and not w_k) that represents the cumulative effect of the white forcing function in the continuous model in the interval (t_{k−1}, t_k). Similarly, v_k represents the cumulative effect of the white measurement noise in the continuous model when averaged over the same interval (t_{k−1}, t_k). (See Eq. 7.1.10.) Therefore, if we wish to have a correspondence between the continuous and discrete models for small Δt, it is the crosscorrelation between v_k and w_{k−1} that we need to include in the discrete model. This is, of course, largely a matter of notation, but an important one.

Now, we could go backward from the continuous equations given in Section 7.3 to get the corresponding discrete equations. However, it is somewhat easier in this case to start over and rederive the discrete recursive equations with proper accounting for the crosscorrelation in the process. We begin with the general update equation

x̂_k = x̂_k⁻ + K_k(z_k − H_k x̂_k⁻)        (9.2.7)

Next, we form the expression for the estimation error:

e_k = x_k − x̂_k = x_k − x̂_k⁻ − K_k[H_k x_k + v_k − H_k x̂_k⁻] = (I − K_k H_k) e_k⁻ − K_k v_k        (9.2.8)

We anticipate now that e_k⁻ and v_k will be correlated, so as a side problem we will work this out:

E[e_k⁻ v_k^T] = E[(x_k − x̂_k⁻) v_k^T] = E[(φ_{k−1} x_{k−1} + w_{k−1} − x̂_k⁻) v_k^T]        (9.2.9)

Note that v_k will not be correlated with either x_{k−1} or x̂_k⁻ because of its whiteness. Therefore, Eq. (9.2.9) reduces to

E[e_k⁻ v_k^T] = E[w_{k−1} v_k^T] = C_k        (9.2.10)

We now return to the main derivation. By using Eq. (9.2.8), we form the expression for the P_k matrix:

P_k = E[e_k e_k^T] = E{[(I − K_k H_k) e_k⁻ − K_k v_k][(I − K_k H_k) e_k⁻ − K_k v_k]^T}        (9.2.11)

Now, expanding Eq. (9.2.11) and taking advantage of Eq. (9.2.10) lead to

P_k = (I − K_k H_k) P_k⁻ (I − K_k H_k)^T + K_k R_k K_k^T − (I − K_k H_k) C_k K_k^T − K_k C_k^T (I − K_k H_k)^T        (9.2.12)

This is a perfectly general expression for the error covariance, and it is valid for any gain K_k. The last two terms in Eq. (9.2.12) are "new" and involve the crosscorrelation parameter C_k.

We now follow the same procedure used in Section 5.5 to find the optimal gain. We differentiate trace P_k with respect to K_k and set the result equal to zero.
measurement poise ease. We could go one step further at this point and show that the discrete recursive equations will in fac, go over into the continuous-model differential ‘equations as At — . This is faaly routine, though, so itis left as an exercise (Gee Problem 9.8). Delayed-State Measurement Problem “There are numerous dynamical applications where position and velocity are cho- sen as state variables, It is also common to have integrated velocity over some. ‘Ar interval as one of the measurements In some applications the integration is ic part ofthe measurement mechanism, and an associated accumulative “count” is the actual measurement that is available to the Kalman filter (2. integrated doppler in a digital GPS receiver—see Chapter 11; also see refs. 12 and 13), Other times, integration may be performed on the velocity measurement te presmooth the high-frequency noise. In either case, these measurement si ations are described by (in words): {92 CORRELATED PROGESS AND MEASUREMENT NOISE FOR THE DISCRETE FILTER 951 (Discrete measurement observed a time 4) = [F cootciynar + (iserete noise = (position at 1) ~ (postion at 1.) + (Gisceete noise) 0.2.18) (Or, in general mathematical terms, the measurement equation is of the form a Ht I HY 9.2.19) ‘This, of course, does not fit the required format for the usual Kalman filter because ofthe x, term, In practice, various approximations have been used to accommodate the delayed-state term: some good, some not so good. (One of the poorest approximations is simply to consider the integral of velocity divided by Af to be & measure of the instantaneous velocity atthe end point of the A interval) The correct way to handle the delayed-state measurement problem, though, is to modify the recursive equations so as to accommodate the x, erm exacily (14), This can be done with only a modest increase in complexity, 35, will be seen presently. : ‘We begin by noting thet the recursive equation for x, ean be shift ne sep this. “ ee Me Me Me 0220) Equation (9.2.20) can now be rewritten as Me = OER be 221) td is cn be mbatted into the menue equation, By, (21), tat Mit) (9.222) ‘Equation (9.2.22) now has the proper form for a Kalman filter, but the new’ ¥, term is obyiously correlated with the process w,., term. We etn now take ad- ‘vantage of the correlated measurement-process noise equations that were derived in the first part ofthis section. Before doing so, though, we need to work out the covariance expression for the new ¥, term and also evaluate C, for this application. ‘We will emporarily let the covariance associsied with new v, be denoted as “New R,”, and its New R= ACH0eh m=, + WIE roi + VT (9.2.23) ‘We note now that v, and w,_, are uncorrelated. Therefore,1952 CHAPTER 9. UNEARIZATION AND ADOTIONAL INTERMEDIATEAEVEL TOPICS Now Ry = JbrQ,. itt + Re 224, ‘Also, with reference to Eq. (9.2.5), we can write C, as C= Alm Hbemes + WT = QT 0225) lowing replacements in Eqs. (9.2.7), In this application, we can now make the following replacements (6213), (92.14), 9.216), and (9.2.17) HH, + 5oe, 9226, “RAR LOO (227) 6, -Q..6000 (9.228) ° rents are made in vere — means “is place by” Afr the indsted replacement Feet ang ne ree relate compiaed ot of equations treacle among other things, the inverse of. Thi isa computation is not required in the usual recursive equations, and it can be eliminated with a eae gebraic substitutions. The Key sep i to elimina Q. by tng id QP oh 9229) and thatthe inverse of the transpose is the transpose of the inverse. 
The final fesulting recursive equations for the delayed-state measurement situation can. then be writen in the form: Estimate upda = 85 Ky ~ (9.230) where Hg + IRC 0231) Gain: SUBPFHE +R, + LPO HE . +H Rod * Ped! 0252) [PAH + oy Error covariance update: rR 9233) Pp KGKT (93 ADAPTIVE KALMAN FILTER MULTIPLE MODEL ADAPTIVE ESTMATOR) 353. 1, = HP +R, + Py Hi + Huby ST + LPP 02.38) Projection: oh 9235) Pi = POT + Q, 9236) Equations (9.2.30) through (9.2.36) comprise the complete set of recursive equations that must be implemented for the exact (e., optimal) solution for the elayed-state measurement problem.* Note that the general form of the equa- tions is the same as for the usual Kalman filer equations. It is just shat there are a few additional terms that have to be calculated in the gain and P-update expressions, Thus, the extea effort in programming the exact equations is quite ‘modest. (See Problem 9.5 for a demonstration that these recursive equations really do what they purport to do.) 93 ADAPTIVE KALMAN FILTER (MULTIPLE MODEL ADAPTIVE ESTIMATOR) {In the usual Kalman fiter we assume that all the process parameters, that is, dy, Hi, R,, and Q,, are known. They may vary with time (index k) but, i's, the nature of the Variation is assumed to be known, In physical problems this is ‘often a rash assumption. There may be large uncertainty in some parameters ‘because of inadequate prior test data about the process. Or some parameter might be expected to change slowly with time, but the exact nature of the change is not predictable. In such cases, itis highly desirable to design the filter to be self-learning, so that it can adapt itself to the situation at hand, whatever that ‘might be. This problem has received considerable atention since Kalman’s orig inal papers of the early 1960s. However, itis not an easy problem with one simple solution. This is evidenced by the fact that 35 years later we still see many papers of the subject in current covtem-eyate journals. (Reference 21 ives a good list of recent papers on adaptive control.) ‘We will concentrate our attention here on an adaptive filter scheme that was first presented by D. T. Magill (16). Also a more intuitive scheme is dis- ‘cussed in Problem 9.3. We will see presently that Magill’s adaptive filter is not 5st one filter bot, instead, i a whole bank of Kalman filters running in parallel. ‘At the time that this scheme was fist suggested in 1965, it was considered to be impractical for implementation on-line. However, the spectacular advances in ‘computer technology over the past few decades have made Magil’s parallel- filter scheme quite feasible in a number of applications (17, 18, 1, 20, 21, 2) "1155 of inet to amt that Bas, (@2.30 rough (2.236 can alo be derived by a completely ‘ierem mend Se Seton 9.4 (15,‘364 CHAPTER. LINEARIZATION AND ADDITIONAL INTERMEDIATEAEVEL TOPICS ‘Because of the parallel bank of filters, this scheme is usually referred to as the ‘multiple model adaptive estimator (MMAE), In the interest of simplicity, we Will confine our atention here to Magill’s original MMAEE scheme in its pri itive form. It is worth mentioning that there have been many extensions andl ‘variations on the original scheme since 1965, including recent papers by Caputi (3) and Blair and Bar Shalom (22). (These are interesting papers, both for their technical content and the references contained therein.) 
We will now proceed to the derivation that leads to the bank of parallel filters, ‘We begin with the simple statement that the desired estimator is to be the conditional mean given by a [vere oa Where 2f denotes all the measurements up to and including time t, (i ty Ze. 2), and p(s) is the probability density function of x, with the Conditioning shown in parentheses.* The indicated integration is over the entire space. If the x and z processes are Gaussian, we are assured that the estimate tiven by Eq, (03.1) will be optimal by almost any reasonable criterion of op- fimality, least mean-square or otherwise (24). We also wish to assume that some parameter of the process, say, a, is unknown to the observer, and that this pa- ameter is a random variable (aot necessarily Gaussian). Thus, on any particular Sample run it will be an unknown constant, but with a known statistical dstri- bution, Hence, rather than beginning with p(xjzt), we really need to begin with the joint density p(x, zt) and sum out on to get p(xlat). Thus, we will rewrite Bq, (9.3.1) in the form sax [x [mt ob) dat 32 bt hint dens a (032) om be ete lx, ol Substituting Eg, (9.33) into (9.32) and interchanging the order of integration lead to pis, 22)pCala) 033) los) | x pala 2) dx da 034) “The inner integral will be recognized as just the usual Kalman filter estimate for “throughout his secon ne wil se a aoe nottion han hat ase in Cape 1 8 dtp wil Be CEEWPSE Rosai Gey ant scree prota In this way we avoid the mutdoous are ou thors Be egied o ondtoned altar random varies, Hower REE ae cen mas ose se agian and ltt te symbol p propery wiih the contest of er ein ay paras aon. (93 ADAPTIVE KALMAN FRTER (MULTPLE MODEL ADAPTIVE ESTIMATOR) 955 4 given a. This is denoted as 8,(a) where a shown in parentheses is intended 2 reminder that ther is dependence. Equation (2.3.4) may now bereritea 85 | Soptoiat dor 035) Or the discrete random variable equivalent to Ea, (9.35) would be R= Laolalet 03.6) wer ale) i te dace pabsiiy fe , condone onthe mesemet Segoe at We wll onan onthe Gate form om ths point on ou Equation (03.6) simply sys thatthe optimal imal estimate is « weighted sum of Kalman filter eximates with each Kalman fiter operating with separate {soumed value of. Ths is shown in Fig. 9.5. The problem now redces fo one of determining the weight factors plait) pla?) ete. These, of couse, change with each recirive sop as te measremet poses evlves in time sub a mean ore eet sone sate, We ean out the state ofthe process and the unknown parameter (Note that i Coma roy parva of te pots) : fe now tem to the matter of finding the weight factors indicated ia Fi 915. Toward this end we we Bayes" ral oe tole) - eae oa bu warp a ofp Figure 9.5. Weigted sin of Kanan fe estimate,1956. CHAPTERO _UINEARIZATION AND ADDITIONAL INTERMEDIATE-LEVEL TOPICS ) > paatlayra) 938) [Equation (9.3.8) may now be substituted into Eq. (2.3.7) with terest peatety = [pated], i= 1.2...4 039) 2 paatledeta) ve ixribution pla) is presumed to be known, so it remains to determine ‘Te ition 6) Sar hs de il we pi) 8 & prot of ree nal density functions. Temporarily omitting the a, conditioning (ust tO Seve etn), we have Plat) = Pa tan 39) ply ta B00) ple ta ln PCE) = tae Ben += ADAGE Pan Peay + 2a) ~~ POMP) La... @3.10) 3.10) is just ‘We now note that the fist tem in the product string of Eq. (2.3.10) is Pld, and that the remaining product is just pa). Thus, we can rewrite Ea. (6.3.10) in the form plat) = pC plat.) 
oat) \We now make the Gaussian assumption for the x and x processes (bot not vance fea oc la taphy mates we wl aur 2f wb 46 Fe eit ae Egon 3.11) th CES 5. keh2... O31) 1 Barra + RIP [ peer) vamp, atin i tht) win gone be dlerent or eh For exam Bee iy ech rath bank of fiers wl mole ‘Cound ae sale 199. ADAPTIVE KALMAN FILTER MULTIPLE MODEL ADAPTIVE ESTIMATOR) 957 It should be helpful at this point to go through an example step by step (in words, atleast) to see how the parallel bank of filters works. 1. We begin with the prior distribution of a and set the fter weight factors accordingly. Frequently, we have very litle prior knowledge of the un- known parameter a. In this case we would assume a uniform probability distribution and set all the weight factors equal initially. This does not have to be the case in general, though. 2. The initial prior estimates & for each filter are set in accordance with whatever prior information is available. Usually, if x is a zero-mean process, 85 is simply sot equal to zer0 for each filter. We will assume that this is true in this example. 5, Usually, the initial estimate uncertainty does not depend on a, so the initial Ps for each fer will just be the covarianee matrix of the x process, 4, Initially then, before any measurements are received, the prior estimates from each of the filters are weighted by the intial weight facios and summed to give the optimal prior estimate from the bank of fiters. In the present example this is trivia, because the inital estimates for each filter were set to 2er. 5. ACK = 0 the bank of filters receives the fist measurement zy and the unconditional p(z2) must be computed for each permissible a, We note that 2 = Hx + v, so plz) may be writen as 1 fae Bay LCHE + RP [4 GGHE +R), 3.13) Phe) where C, is the covariance matix of x, We note again that one or more of the parameters in Eq, (9.3.13) may have a dependence; thus, in gen- eral, p(z) will be different for each a, 6. Once p(e,) for each ay has been determined, Eq, (9.3.9) may be used t0 find paz). These are the weight factors to be used in summing. the updated estimates, that come out of each of the filters in the bani of filters. This then yields the optimal adaptive estimate, given the mea- surement z, and we are ready to project on to the next step, 17. Bach ofthe individual Kalman filter estimates and their error covariances is projected ahead to & = 1 in the usual manner. The adaptive filter must ‘now compute p(2t) for each «, and it uses the recursive formula, Eq (03:12), in doing s0. Therefore, for p(2*) we have IF gg | 1 HRY | ee * Gay 7OR PT + RY? | 20H + ZI wea) 03.14) peat) [Note that p(af) was computed in the previous step, and the prior 8; and for each a, are obtained from the projection step.1358 CHAPTER 9 UNEARZATION ANO ADDITIONAL INTERMEDITE-LEVEL TOPICS 8, Now, the p(a!) determined in step 7 can be used in Bayes’ formula, Eq (0.39), and the weight factors pala?) for k = 1 are thus determined. 
It should be clear now that this recursive procedure can be carried on ‘ad inginitun ‘We are now in a postion t reflect on the whole adaptive file in perspec- tive, At cach recursive step the adaptive filter does three things: (1) Each filter ihahe bank of filters computes its own estimate, which is hypothesized on its vou model; (2) the system computes thea posteriori probabilities for each of {he hypotheses, and (the scheme forms the adaptive opimal estimate of x a8 a vefdited sum of the estimates produced by each of the individual Kalman fers As the measurements evolve with time, the adaptive scheme leams which St the filters i the comect one, and is weight factor approaches unity while the Sher are going to zero. The bank of filters accomplishes this, in effec, by Tooking at cue ofthe weighted squared micasurement residuals. The filter with the smallest residuals "wins." 50 to speak. ‘Te Magill scheme just described is not without some problems and lie tations (23) Tt is sill important, though, because it is optimum (within the ‘nous assumptions that were made), and it serves a5 point of departure for Uiher Tes rigorous schemes. One of the problems of this technique has to do siith numeral behavior as the number of steps becomes large. Cleary, as the Thi ofthe squared measurement residuals becomes large, there i «possibilty Steomputer underflow. Also, note tat the unknown parameter was assumed fe conant with time, Thus, there is no way this kind of adaptive fiter can adjust if the parameter actually varies slowly with ime. This adaptive scheme, ‘rit purest form, never forges; it tends to give early measurements just a8 hoch weight as Tater ones ins detenaiation ofthe a posterior probabilities. Some od hoc procedure, such as periodic reintiaization, has to be used if the Schime is to adapt o a slowly varying parameter situation. wy as vay mle, Ue Mogi adaptive Hitec wansient scheme, andi converges tothe contest a, in an optimal manner (provided, of course, that the
into its U and D parts. cea eu UT es1 see ned) ‘The Kalman gain is computed in the usual way, either as P-H’(HP-H? + R)-' it PIs used or bythe same forma wth P replaced by UD-U-" if P is tot available in reconstuced form from the previous sop, Afr the gain Come, the ema is upted by wsing the esl equation 8-04 Kes He) 953) Next, we need f0 update F", and this 1 where we use U-D factonzation. We will begin with the P~ update form given by Eq. (5.5.20), which we repeat here for convenience =P PP(HP-T + RHP 059) ‘We will assume that the measurement is scalar. (Note that we can always process the measurements sequentially in scalar form by forming appropriate linear com- binations of the elements of 2, if necessary, and thus diagonalize the R. matrix. See Problem 6.2.) The (HP-H™ + R) term will then be scalar, and we will denote it as a, that is, of HPH +R 95.10) Next, we will rewrite P and P- in Eq, (95.9) in terms of their U-D factors. Let P= U'D'U", Then opus up-ut ~ Lupe D ot <0 [> -tevanowwy] o> eam [Now note that the bracketed term in Eq. (95.11) is symmetric, so it ean be factored into U-D factors. Let [> -L ovr rw] wpe e512) ‘Then, Eq, (95.11) can be rewritten as{a70 CHAPTER 9 UNEAFIZATION AND ADDITIONAL INTERMEDIATESEVEL TOPICS wp = U-UDUUT =u upE wy 05.13) ow note that (U-0) is upper tiangular and Dis diagonel. Therefore, Fg {0.513 in Ue form and tos wut 514 p-5 05:5) [Equations (95.14) and (95.15) then provide a means of updating the err co- attance, but in terms of its U-D factors rather than in terms of Fr Tee et emaine to project the estimate and U* and D* ahead to the next step."The estimate is projected ahead usal (with the subscripts reinserted), 1 ba 0516) “The U* and D* factors can be projected ahead in a number of ways. The easiest ‘way is to write the usual P projection equation 2s Pr = OPA + QW = 02D zU;78F + Q = UBUD + QW e541) is commana I Se wnt sition matrix.) Also, the computation specified by Eq. (9.5.17) pee a wh ane SF eter “Freon eh uc ge feces oie ee Pa serpents at bye ies ie Sa aie as ee moo Sra enon sence oe aD" i aly wn pin creping se same me ieee a aS A co Pe St eb erect as) pan Pa ye 3 a NO CD eo Din “eee lc alo 96 DECENTRALZED KALMAN FILTERING S74 In many applications where the system is completely observable and there is adequate process noise feeding into all the system states, there is no divergence problem. ‘This assumes, of course, that the programmer has taken the usual precaution of symmetrizing the error covariance matrix after both the projection land update steps. In most instances, the usual update equation a- KEP (95.18) is perfectly adequate and is, of course, easy to program. Therefore, do not be afraid of numerical stability in Kalman filtering. In are situations it can be a problem but, even in these cases, there are methods of coping with i 9.6 DECENTRALIZED KALMAN FILTERING* In our previous discussions of Kalman filtering, we always considered all of the ‘measurements being input diectly into a single fies. This mode of operation is now usually referred to as a centralized Kalman filter. Before we look at alte natives, it should be recognized that the centralized filter yields optimal est- mates, We cannot expect to find any other filer that will produce any beter [MMSE (minimum mean-square eror) estimates, subject, of course, to the usual ‘assumptions of linear dynamics and measurement connections, andthe validity ‘of the state models that describe the various random processes. 
Occasionally, though, there are applications where a full-oder centralized filter cannot be implemented, for one reason or another. Even so, the centralized filter is stil important because it serves as a baseline for purposes of comparison. We will ‘now look at some altematives to the centralized filter architecture. Cascaded Filters (With Preexisting Equipment Constraints) [A simplified block diagram for a decentralized filter is shown in Fig. 99. A basic tenet ofthe decentralized approach is for each ofthe local filters to operate ‘autonomously, Each local filter has its ovm suite of measurements, and there is no sharing of measurements (directly, at least). Note that this is inherently a ‘cascaded mode of operation, because the outputs of one of more of the local filters are acting as inputs to the master filter. A problem with cascaded filters that is often encountered in real-life engineering work can be illustrated as fol lows: Suppose a new piece of equipment is to be integrated into an existing suite of avionics equipment, The systems engineer is expected to blend together the outputs of the various equipments in some sensible near-optimal manner. ‘rhe autre ar eel indeed to Dr Lary Levy. he Johns Hopkins sive Appliod Pgs Laboratory, fr mich Of he mata ois sco (9),1372. CHAPTERS. LINEARI2ATION ANO ADOFTIONAL IVTERMEDIATE-LEVEL TOPICS ca 5 tas woefazt—-| | Figure 8.9. Docentaaed Marna fda aster Each component in the suite may have its own local Kalman filter, and each of the boxes ie “cloved," in that the combiner (ie. master filter) only has access to the output state estimates of the Various boxes. In these circumstances itis tempting to simply treat the outputs ofthe local filters a8 messurements feeding the master filter, and then design the integration filter using the usual rules governing a Kalman filter. There are two problems with this simple approach: 1. The estimation errors in the outputs of the local filters (which translate to measurement errors inthe master filter) normally Fave nontrivial time Correlation structure. This rime correlation is dificult o account for, even if the P matrix is made available to the master filter (and it usually is rot in preexisting “closed” equipments). If the corrlation structure of the errs in the inputs to the master filter is not progerly accounted for in the master filter's model, divergence may occu. 2, Ifthe fll-order state vectors not implemented in each loca filter, which is usually the case with specialized equipment, there is the possibilty (of Toss of valuable measurement information that cannot be recovered by the master filter just by looking at local state estimates, ‘There is no really good theoretical solution to the constrained cascaded filter problem as just posed. One ad hoc solution that has been used successfully in special cizcumstances is simply to “thin out” the measurement data feeding the master filter. For example, if prefitered estimates from a GPS receiver (see ‘Chapter 11) are available at a 1-H rate, and these are to be used to update fnerial equipment with large time constants, itis quite possible that sampling every 20th, or pethaps every 50th, measurement will yield satisfactory updating Of the inertial equipment, In this Way, the measurement errors seen by the master filter become less correlated timewise, and the white measurement noise require- iment is satisfied, or at least nearly so. 
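The benefit of thinning is easy to quantify if the local filter's output error is approximated as a first-order Gauss-Markov process. The short MATLAB fragment below (the 20-sec time constant, unit variance, and 1-sec raw spacing are hypothetical) compares the correlation between successive errors passed to the master filter at the raw rate and after keeping only every 20th sample.

% Correlation of successive "measurement" errors seen by the master filter,
% assuming the local filter's output error is Gauss-Markov with time
% constant tau (hypothetical numbers).
tau = 20;  dt = 1;                  % time constant and raw sample spacing (s)
rho1  = exp(-dt/tau)                % lag-1 correlation at the raw 1-s spacing
rho20 = exp(-20*dt/tau)             % correlation after keeping every 20th sample

% Monte Carlo check: simulate the Markov error sequence and thin it
N = 20000;  sig = 1;
e = zeros(1,N);
for k = 2:N
  e(k) = rho1*e(k-1) + sig*sqrt(1 - rho1^2)*randn;
end
ethin = e(1:20:N);
corr_raw  = sum(e(1:N-1).*e(2:N))/(N-1)                          % ~ rho1 (unit-variance process)
corr_thin = sum(ethin(1:end-1).*ethin(2:end))/(length(ethin)-1)  % ~ rho20

With the lag correlation dropping from roughly 0.95 to roughly 0.37 in this hypothetical case, the white-measurement-noise assumption in the master filter is far less strained, at the cost of discarding most of the prefiltered data.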
This solution may seem erude, bat the ‘omission of some of the measurement data may lead to only a small Joss in ‘optimality. This is certainly better than risking divergence. 96 DECENTRALZED KALMAN FILTERING $73 Decentralized Filtering—No Feedback to the Local iors We will now consider various decentralized filter architectures where we relax ‘most ofthe constraints on the local filters. Autonomous operation is stil desired, though. We will first look at en idealistic structure where there is no feedback, and we will then use this as a point of departure for other possible architectures ‘where thee is feedback from the master filter to the local filters. ‘The block diagram of Fig. 9.9 applies to our no-feedback system. The autonomous operation requirement says that there is to be no sharing of mea- surement information among the local filters. Also, the master filter does not have direct access to the raw measurements feeding the local filters. We begin by developing a rigorous theoretical strcture for our feedforward decentralized filter. The alternative form of the Kalman filter equations will be used, and the equations are repeated here for convenience (see Section 6.2 for their derivation): 1. Information matrix update: Py! = py! + BR; 61) 2. Gain computation K, = PAR; (96.2) 3. Estimate update: 8 + KG, — BSD) 063) 4, Project ahead to next step Bia = 5, 064) Pi = POE + Q 065) Recall that P-? i called the information matrix. In terms of information, Eq. (9.6.1) says that the updated information is equal to the prior information plus the additional information obtained from the measurement at time f,. Further- more, if R, is block diagonal, the total “added” information can be divided into separate components, each representing the contribution from the respective ‘measurement blocks, That i, we have (omitting the k subscripts for convenience)Seer [ATA GHWPTER 9 UNGAPIZATION AND ADDITONAL INTERMEDINE-LEVEL TOPICS HR = BORSH, + HIRSH, + HRW, 0.6.6) ‘We also note thatthe estimate update equation at time 1, can be writen in & diferent form as follows: a= - Kee + ke = Pery e+ PAR = Ply e PR) 62 ven write in this fom, is clea at he update exit ier lend Me gtomaton th te se informa. 3 ea rt nh ut wo ea tren our denied cor i we cmt kasi se ing Bath soe and we wl cng floret sate ve, nd ste Eb te a sta ee ar reserve reste and ma 8 SE Oe Ss aml My Foe Cassia process, my dM essed eer cot ocd on cr apne messed! Sens Yl a ee tye arene preset te 1 ad 2 9, barry hve te afl © X 2, = Hxty a) n= Bxty 069) ad sere vs and re eo mean andr ales with cova By td ere vy and 1a 2 Mand sae assed oe mutually wacoelated 3 Dal se ase ow ta sl ies 1 and 20 gt fave gre ch soe! Me stume no al fom he especie er baits Se ing Bae O61) and O67) Local filter 1: jt My! + HERB, 9.6.10) 8, = P(My'm, + HAR'4) eo) Local filter 2: : By! = Mgt + HERS "H, (9.612) 8, = Py(M!m, + HER") 96.13) [Note that the local estimates will be optimal, conditioned on their respective 96 DECENTRALZED KALMAN ALTERING S75: measurement streams, but not with respect to the combined measurements. (Re~ member, the filters are operating autonomously.) "Now consider the master filter. Its looking for an optimal global estimate of x conditioned on both measurement streams 1 and 2. 
Let = optimal exinaof x condoned 0 both mentee Reams opto ut ning = cvmlance mix oct tm ‘Te opal ltl end denial ever covciancs en Rr 0 || ot vanes ["5" 2] [ie] +e = M*' + HYR;"H, + HGR;'H, (9.6.14) Pom + HR; 2, + BERS") 0615 However, the master filter does not have direct access to 2, and 23, so we will rewrite Eqs. (0.6.14) and (9.6.15) in terms of the local filter's computed estimates ‘and covariances. The result is Pot = Bt Mp) + (Re Mp) + Me 0616 8 = PIP; ~ My'm) + (Pp'%, ~ Mytm,) + Mo'm] 0.6.17) 1 can now be seen that the local filters can pass their respective &, Pr! my, M:! (7 1,2) on to the master filter, whic, in turn, can then compute its global estimate, The loca filters can, of course, do their own local projections and then repeat the cycle at step k + 1, Likewise, the master ter can project its global festimate and get a new m and M for the next step. Thus, we see that this “architecture permits complete autonomy of the local filters, and it yields local ‘optimality with respect to the respective measurement streams. The system also achieves global optimality in the master filter ‘We now come to a matter of semantics. Note thatthe parenthetical quan- tities in Eq, (9.6.17) are really just HR; "2, and HER;'z,. Each local iter most compute its respective HR-'z to get its local estimate (see Eqs. 9.6.11 and 196.13), 90 why not pass these quantities on to the master filter rather than the ‘more complicated individual terms that appear in Eq (9.6.17). Certainly, it would be simpler. However, ifthe loal filters do pass their HR“'z quantities on to the master filter, this becomes very close to parallel architecture where the raw measurements are fed not only to the local filters but also to a larger master filter that optimally processes all the measurements. Such parallel systems have been implemented (30). So, it might be said that our feedforward decentralized filter is teally just a parallel architecture in disguise. We will not try to answer this question here—it is simply a matter of semantics376 CHAPTERS LNEARIZATION AND ADDITIONAL INTERMEDIATE-LEVEL TOPIGS Decentralized Filtering with Feedback ‘We now wish to consider a decentralized filter architecture where we permit information feedback from the master filter tothe local filters. Figure 9.10 shows Such an architecture for tvo local filters. We note that by feeding bac the prior fm and M to the local filters, we are allowing indireet measurement sharing between the local filters. This feedback enables the local filers to reset their respective prior estimates more accurately with each step then they would be ble to do otherwise. All we need to do to get the appropriate equations for the feedback configuration is to let (96.18) (96.19) | With these modifications the previously derived equations can be applied to the ecentsalized for with Feedback. The key equations are Local filter 1 (with feedback): §, = Prim + HER;'2) (0.620) Ppt = Met + HIRSH, 9621) Local filter 2 (with feedback): y= PAM 'm + BER;'2) (0.622) = Mot 4 BIR; H, 0623) Global fer: PP; '8, + Pek, — Mm) (9624) pie pes Py Mt (625) tis [Note that there is no direet communication between filters 1 and 2 wil ‘of a gure 0.10 Dever Se wen eocack, 97. STOCHASTIC UNEAR REGULATOR PROBLEM ANO THE SEPARATION THEOREM S77. architecture. 
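Equations (9.6.16) and (9.6.17) are easy to check numerically. The MATLAB sketch below carries two autonomous local filters and the master filter through one measurement update and compares the fused result with a centralized filter that processes both measurements directly; when the local filters and the master share a common prior, the two results agree to within roundoff. All of the numbers are made up for illustration.

% One-step check of the no-feedback decentralized fusion, Eqs. (9.6.16)-(9.6.17).
% Two-state example with hypothetical numbers; both local filters and the
% master share the same prior (m, M) for this single step.
m  = [1; 2];  M = diag([4 9]);          % common a priori estimate and covariance
H1 = [1 0];   R1 = 0.5;  z1 = 1.3;      % measurement stream 1
H2 = [1 1];   R2 = 2.0;  z2 = 3.5;      % measurement stream 2

% Local filters in information form (Eqs. 9.6.10-9.6.13)
P1 = inv(inv(M) + H1'*inv(R1)*H1);   x1 = P1*(inv(M)*m + H1'*inv(R1)*z1);
P2 = inv(inv(M) + H2'*inv(R2)*H2);   x2 = P2*(inv(M)*m + H2'*inv(R2)*z2);

% Master filter fusion (Eqs. 9.6.16 and 9.6.17)
Pm = inv((inv(P1) - inv(M)) + (inv(P2) - inv(M)) + inv(M));
xm = Pm*((inv(P1)*x1 - inv(M)*m) + (inv(P2)*x2 - inv(M)*m) + inv(M)*m);

% Centralized filter processing both measurements directly, for comparison
Pc = inv(inv(M) + H1'*inv(R1)*H1 + H2'*inv(R2)*H2);
xc = Pc*(inv(M)*m + H1'*inv(R1)*z1 + H2'*inv(R2)*z2);

max(abs(xm - xc))    % should be zero to within roundoff

The two (inv(P) - inv(M)) terms are exactly the information added by the respective local measurement streams, which is why the fused result reproduces the centralized estimate.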
There is indirect information sharing, though, in thatthe prior m that is fed back with each step is a linear combination of past measurements in oth filters ‘The master filter maintains full global optimality with the feedback archi- tecture where m, = m, =m. The local filters have better optimality than they ‘would have without feedback, but they still do not have full optimality with respect to all measurements. For example, with the Gaussian astumptiort4ocal estimate &, isthe conditional mean of x, conditioned on all past measurements and the present 2, (but not 2) similar interpretation applies to & Concluding Remarks ‘Two rigorous architectures for filter decentralization have been presented. One involves feeding back information from the master filter to the loca filters; the ‘other does not. In both architectures global optimality is maintained in the master filter. Also, optimality is achieved in the local filters to a limited degree with respect to specific subsets of the measurements. Thus, these configurations are Impostant as baseline atchitecates for purposes of woman. Both the no-feedback and feedback architectures considered here require thatthe fullorder state vector (ie. the global state vector) be implemented in teach of the local filters. In many applications this is an unreasonable require- ‘ment. One of the motivations for decentralization is simplification, so having to implement the full sate vector in all the filters does not help in this regard, ‘Thas, in practice there ate applications where itis necessary 10 consider a suite of Jocal filters with lower-order state vectors. This complicates the theory con- siderably, and it nearly always leads to some degree of suboptimalty relative 10 the two baseline architectures considered hee. One architecture that has received considerable attention as a practical means of decentralization is the federated filter (38, 39, 40). In the federated filter the information that is fed back to the local filters is divided, and portions of the total information ae shared araong the local ites, Various sharing strat- egies are possible (see Problem 9.7 for an example of information sharing). Dividing the total information leads to some degree of suboptimality in the local filters, but there are reports of applications where the penalty is quite modest (G9). The federated filter is relatively new, and itis considerably more compli- cated conceptually than the straightforward parallel configuration where the 2), 2, - ty measurements are input directly to @ master filter, as well as to the respective local filters. The federated filter must compete with parallel systems {as well as other architectures), so the eventual role of the federated filter in integrated systems remains to be seen, 97 STOCHASTIC LINEAR REGULATOR PROBLEM AND THE SEPARATION THEOREM ‘The linear regulator problem is a classical problem of optimal control theory, and it can be posed (and solved) in either continuous or discrete form (24,378 CHAPTER 9 UNEARIZATION AND ADOTIONAL INTERMEDIATE:LEVEL TOPICS 444-38) In the interest of brevity we will confine our rears to the continuous version of the problem. the deterministic linear regulator problem may be stated as follows: Given linear dynamical system of the form Fx + Bu, en) ‘hat wt) will minimize the quadratic performance index rasan + [1 beOWat9 + wHOW,eso] at O72) where §, W, and W, are symmetic positive definite weighing matrices that are ares hi thesitvation at hand? 
The intent of the regulator is to reduce the cao site of the system t0 zero (or nearly s0) quickly and without undue aril effort. The weighting factor S penalizes the system for not reaching 2e°0 the specified terminal time ¢, The weighting factors W, and W. apply pen ae eeetne trajectories ofthe state x() and the control u) over the time span fia] Tecan be seen that if W, is large and W, smal, he pina system Wit te Marge contol effort and force the system toward zero rapidly im onde 19 are Tange penalty due to large W On the other hand, f Ws small and lange the system wil be fuga with ts contol effort, and it wil approsch sees. Clearly, wide variety of situations can be accommodated within ree eet of this formulation, We will nt desve the solution of the linear regulator problem hee. This is adequately covered in the mentioned references Ged many others, as well), We simply state thatthe optimal contol law is @ Tinear feedback law that specifies w,() to be (9 = =KCOx® 73) where the feedback gain K,(0 is computed from the system parameters. (tis wie limction of x) We nced not be concemed with the detailed solution for TCo (except to note parembetclly that there isa close duality between ‘his ee ks optimal estimator problem.) Tt s important to note, though, that itis are ned atthe state vector x is available for feedback purposes as indicate Tip 9.1. However, in many physical situations, we do not have the privilege ae aeeeving all the elements of x directly; quite to the contrary, we are usually Slowed to observe x only through some output relationship, 23, y= Hx er) Now, based on y, we must somehow reconstruct x. Ifthe observation is essen Tally erorfree, we can use observer theory in the reconstruction (38-40). How aly Sine observation of x is comupted with noise, as is often the ease, then ‘87. STOCHASTIC UNEAR REGULATOR PROBLEM AND THE SEPARATION THEOREM 378) the reconstruction of x becomes r an estimation problem. This leads to the sto chaste linear regulator problem,* which will now be formulated, [Let the system dynamics be specified by the linear equation i= Fx + Bu, + Gu (075) where F, B, end u, are as before, and w is an additional Gaussian white-noise forcing function that is characterized by luce" = Qe ~ 0.16) “The system state vector is assumed to be observed via the relationship Hx +y orn where v is Gaussian white noise and is characterized by Evora] att = 9) 0.78) “The two white noises wand v will be assumed to be independent, and, smed to be independent, and, in order to avoid any questions about singular conditions, we assume that the system is Ccontollable with respect to the control input uy and is observable with respect to. The optimization problem is to minimize the following cost function: A{vensxep + ff bow. + w7ow,ns0) a} 073) vate few al te oe das of ein: wl int wheres (24, 9,97, 38), Design ofthe optimal tosh near regulator may be separated into two steps: : ce “This problemi so called the near quadeac Gasian 1.90) problemFigure 9.412. Obtinalstocaste nar egustor PROBLEMS 981 1. First, design the minimum mean-square error estimator for x, treating tu, just as if it were a known deterministic input. The optimal estimator is, of course, a Kalman filter with parameters F, GQG", H, and R. Note that the process has a known input as well as a random input; therefore, the differential equation for & is & = FR + Bu, + Ke — H) (9.7.10) where K is the continuous Kalman gain (see Problem 7.8). 2. 
Next, solve the deterministic linear regulator problem for the optimal feedback gain K,(0) just as if perfect measurement of x were available and u(f) were absent, Then let contol input u, be uy, = “Koa (9.11) and the cost function is m ‘summarized in Fig. 9.12. mized. The resulting optimal system is "ine two-step solution just described is known asthe separation theorem or separation principle. It is a most remarkable result in that we would normally ‘expect that the feedback loop would horibly complicate matters by mixing together the control and estimation problems. However, it does not; the 10 problems separate nicely! The superficial reason for this isthe duality between the optimal control and optimal filter problems. The underlying reason for the ‘duality inthe fist place, though, is not all that obvious, so we will simply say that this isa fortuitous circumstance, ILis not our intent here to texch the subject of optimal control. The subject is well developed, and many fine books have been writen in this area. We simply Want to point out with this limited example that estimation theory (and Kalman firing in particular) plays an important role in optimal control theory, and it behooves the control systems engineet/scientist to understand the rudiments of estimation theory. PROBLEMS 9.1 Consider a simple one-dimensional trajectory determination problem as follows. A small object is launched vertically in the earth's atmosphere, The initial thrust exists fora very short time, and the object "fre falls” ballistically for essentially all ofits straight-up, straight-down trajectory. Let y be measured in the up direction, and assume that the nominal trajectory is governed by the following dynamical equation: my = —mg ~ Dy where m= .05 kg (mass of object) 8 = 9.087 m/sec? (acceleration of gravity) D= 14x 10 n/(m/sec)® (drag coefficient) ‘The drag coefficient will be assumed to be constant for this relatively short1982 CHAPTER 9 LNEARIZATION AND ADOATIONAL INTERMEDIATE-LEVEL TOPICS trajectory, and note that drag force is proportional to (velocity), which makes the differential equation nonlinear. Let the initial conditions for the nominal teajectory be as follows: “The body is tracked and noisy positon measurements are obtained at intervals Of 1 seo, The measurement error variance is 25 m?. The actual trajectory will ‘hffer from the nominal one primarily because of uncertainty in the initial ve- focity, Assume thatthe inital position is known perfectly but that nil velocity Ie best modeled as & normal random variable described by N (85 m/sec, 1 mv? See), Work out the linearized discrete Kalman fiter model for the wp portion of the trajectory. (Hint, Aa snalyica’ solution for the nominal trajectory may be obtained by CGhsidering the differential equation as a first-order equation in velocity. Note fh = 5 during the up portion of the trajectory. Since variables are separable in the velocity equation, it can be integrated. The velocity can then be integrated 1 obtain positon.) 9.2, (a) Atthe Ith step of the usual nonadaptive Kalman filter, the measure- iment residual is (2, — H,&;). Let z be sealar and show that the expectation ofthe equared residual is minimized if 8 isthe optimal f priori estimate of x, that is, the one normally computed in the projection step of the Kelman fiter loop. EHints Use the measurement relationship 2, = Hx, + vy and note {hat v, and the @ priori estimation error have zero erosscorrelation. 
‘also note that the & priori estimate 8, optimal or otherwise, can ‘nly depend on the measurement sequence up through 2, and not 2) (b) Show that the time sequence of residuals (% — HAS) Gis — Hise)... isa white sequence ifthe filter is optimal. As # mater bf terminology) this sequence is known as an innovations sequence. See (G3) for a bref discussion of innovations processes and their role in ‘optimal filter theory 93 In Problem 92 it was stated that the sequence of measurement residuals ja Kalman fiter is a white sequence. This occurs only when the filter gain is fdjusted tothe optimal value. Thus, we can think of the filter gain asa “tuning” putameter that we tone to make the measurementesidual sequence white. This Fhmediately suggests an intuitive sort of adaptive filter in which the filter mon- itors the measurement-residual sequence, and if it detects nonwhiteness, it re- adjusts the gain to force the sequence t0 be white. This wil be demonstrated ‘vith a simple continuous Kalman filter example. Let the process and measurement models be 0, (tis white with spectral amplitude g z= x oO, v(t) is white with spectra amplitude r PROBLEMS 988 ‘As usual, we will assume u() and o(9) to be independent. Their spectral independent. Their spectral ampli- ludes q and r are presumed to be unknown and, furthermore, they might vary slowly with time. The continuous Kalman filter solution for this setting is given bby (see Chapter 7) B= KG-8) “The solution i so shown in back diagram form inthe accompanying ge Tris ao dia imege he hs peermng spect seal on he et Surement res athe summing pit fn much the same manera With ay node cigal spectrum analyze This could be doe repeatedly houghou the Course ofthe iter operation. The rele Would be an experimentally deed poner spect density function (or autocoelton fantn) that would be Tep- IS of he meen en ces oo en Problem 93 ‘Assume that an autocorrelation function has been determined de ined as just de- setibed, and assume that q and r do not change appreciably during the obser- vation span, Find q, rand the filter gain in terms of the parameters that describe the autocorrelation function, Note that once q and r are determined, the filter ‘gain can be readjusted to the optimal condition. {If you have difficulty with this problem, see Gelb (8), pp. 319-320.) : 9.4 This problem is a variation on the DME jon on the DME example given in Section 9.1 (Example 9.1) with a simplification in the model of the aircraft dynamics. Sup- pose that the two DME: stations are located on the x-axis as shown in the ac- ‘companying figure, and further suppose thatthe aircraft follows an approximate pat fo Sut © nth as shown, The craft at a nominal voc of 10 ‘m/sec in a northerly direction, but there is random motion superimposed on this in bom he x andy det. The ight dation (or ou pape) 20 sc And the initial coordinates at ¢ = 0 are properly described as normal random variables as follows: ae . x5 ~ N(O, 2000 mi) Yo ~ M(=10,000 m, 2000 m?) ‘The sizeraft kinematics are described by the following equations:[384 CHAPTERS LINEARIZATION AND ADDITIONAL INTERMEDIATE-LEVEL TOPICS ween hE May ROL 2 200 Yess ye Wag + 100 Ar, where wy, and wy are independent white Gaussian sequences described by yy ~ NO, 400 mi?) vy ~ NO, 400 m1?) The simple random walk in the x-coordi Tinear motion for the y-coordinate "The aireraft obtains simultaneous discrete range measurements at 1 sec inervate Ga both DME stations, and the measurement errors consist of a super positon of Markov and white components. 
Thus, we have (fr typical range measurement) [Sel] |We have two DME stations, s0 the measurement vector is a 2tuple at each Sinple point beginning at € = 0 and continuing until k = 200, The Markoy aaae sce cach ofthe DME stations are independent Gaussian random processes, which are described by the autocorrelation function sapling interval At is 1 sec. The aircraft motion willbe ecogalzed a te. and random walk superimposed on (0) = 9006" "4 a? “The white measurement erors for both stations have a variance of 225-m {Work out the linearized Kalman filter parameters fr this situation, The Tinearantion is to be done about the nominal linear-motion trajectory ‘enacly along the y-axis. That is, the resultant filter is to be an “ordi- fray” linearized Kalman filter and not an extended Kelman filter (o) After the key filter parameters have been worked ovt, run a covariance “nalysis for k=O, 1,2, , 200, and plot the estimation error variances for both the x and y position coordinates. (©) You should see a pronounced peak in the y-coordinate error Curve as the airoraft goes through (or near) the origin, Tis imply reRets a bad feomeisic situation when the two DME stations are 180 degrees apart Reatve to the airraft. One might think that the estimation error vari- ‘thor should go to infinity a exactly k = 100. Explsin qualitatively why this i not true iB PROBLEMS 385 ryt tjcty 0000.01 3 esa (390009) e090) 95 It is well known that the Schuler oscillation in & pure inertial navigation| system (INS) can be damped with velocity information from a noninertial mea- srement. This is discussed in detail in Section 10.3. There, we think of the extemal velocity measurements as being a sequence of instantaneous snapshot velocity measurements, so the measurement equation forthe Kalman filter does not involve a delayed-state term. In this exercise, however, we wish to consider ‘4 measurement situation where the Kalman filter has to work with a sequence ‘of measurements where they represent the integral of velocity over a sequence ‘of contiguous AT intervals, ‘The single-channel INS error model given in Fig. 10.6 is repeated befow for convenience, but the exteral velocity feedback branch has been omitted for four purposes here. Also, gyro white noise has been inserted into the Schuler loop at the appropriate point. lar wie ne pinto) Ae Problem 9 = 2.09 X 10" fg ~ 32.2 Rec’, ay = VaR, ~ 124 X 107 radlsec er a or, a time 41386 CHAPTER LINEARIZATION AND ADOTONAL INTERMEDIATE LEVEL TOPICS ale) = 1) — H-) # VCE) (P955.1) ‘Also if the measurement noise atthe velocity level is white, the corresponding discrete sequence v(,) will be white. Equation (P9.5.1) then fits the format for the delayed-state measurement model that was discussed in Section 92 (a) Firs, refer to Fig. 106 and study the stability properties ofthe contin- tous system as the feedback constant K i varied, Do this by examining the roots of the characteristic polynomial |st ~ F [Answers For K = 0 the three roots are at s = 0 and s = -+/ey Then, fe Kis increased, the complex roots move into the left-helfs-plane and converge on the negative real axis (critical damping). As K is increased farther, one root goes to the left and the other to the right, which cor- responds to the overdamped situation.) 
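A quick numerical check of part (a) is easy to set up in MATLAB. The sketch below assumes that the closed-loop characteristic polynomial of Fig. 10.6 has the form s(s² + Ks + ω_s²); that form is an assumption here, but it is consistent with the root behavior quoted in the answer above.

% Root behavior of the velocity-damped Schuler loop as K is varied
% (assumed characteristic polynomial: s^3 + K*s^2 + ws^2*s)
g  = 32.2;                 % ft/sec^2
Re = 2.09e7;               % ft
ws = sqrt(g/Re);           % Schuler frequency, about 1.24e-3 rad/sec
for K = [0  1e-3  2*ws  1e-2]          % sample feedback constants
    r = roots([1  K  ws^2  0]);        % roots of s*(s^2 + K*s + ws^2)
    fprintf('K = %9.3e   roots: %s\n', K, mat2str(r.', 3));
end
% The complex pair reaches critical damping near K = 2*ws; beyond that the
% two real roots separate, which is the overdamped case noted in the answer.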
(&) Wenow wish to demonstrate that damping can be achieved just as well with integrated velocity measurements as with instantaneous samples Of velocity, Toward this end we will look atthe esror covariance prop- gation in the delayed-state filter as discussed in Section 9.2, This can te done easly using MATLAB (or other soitable software), The spe- tie parameters to be used for tis exercise ae as follows (in addition to the constants shown in the accompanying figure): (j) Integration interval: At = 120 see Gi Perfect alignment at r = 0 (ic. k = 0) Gii) Imegrated velocity measurement-error variance: R, = (300 fy" iv Power seal deny off) = 008 (hy eae Cane ena Off) ~ 2511 oust) Cree is anne poston snr of abut nmin 1 Roti Ge sence of vel damping) ‘te mae sen o be programed 5 a follows 1 Inne the P mais to be the null marx (perfec alignment, 2. Sample pis are a E= 01,2,» 100, comesponding © # = 0 Meera e 2lo se. (Note, were, ha he fst conespua ran eat Gre m= Ivnot k=, ezase ofthe delayed state treasure station) — 4, Fort st 2 ep (= 1,2, 2) the ate 0 eo Feely incanuremens Thi lions te err covariances forall hee eet lap to novia vals, andi demonsats the sytem Seah wthout damping (This esi slated ding hs peiod ip iting R, be excel age, 31.020) 4 The coed velocity measurements begin atk ~ 43 and coninue Tah t 100. Ry is to best 90,00 during his pio. Plot the error covariances associated with each of the state variables (ie. Pur Pay and pn) 88 function of K (or time), Note that when the extemal velocity Pm curciment is “turned on,” the damping effect on the INS velocity error is PRoeLes $87 very rapid. The damping effect is less rapid, though, on the position and platform terrors. Ths is to be expected—they are one integration removed from the ve- locity measurement. Note especially the stabilizing effet on the platform tit {6) In Section 6.9, it was shown thatthe stability characteristics of a con- stant-gain Kalman filter are described by the filter's characteristic roots ‘much the same as with the digital filters that one studies in digital signal theory (41). Inthe scenario considered here in part (b), the filter reaches ‘4 near-constant gain condition at the end of the rus. First, adapt the derivation given in Section 6.9 to the delayed-state filer equations and show thatthe characteristic roots of the delayed-state fer are the i genvalues of ( ~ KHé ~ KJ), where , K, H, and J are steady-state values, Then, using the value of K oblained ‘at & = 100, find the characteristic roots of the filter. [Note that che builtin MATLAB fune- tion eig(A) retums the eigenvalues of a square matrix A.] It should be noted that in the case of the discrete filter we are working with 2- transforms, The unit cile in the z-domain plays the same role as the imaginary axis in the s-domain, and the entire lef-half s-plane maps into the interior of the unit circle inthe z-domain. (Answer: For Ry = 90,000 the characteristic roots are 7915 + f.1593.) (@ Once the computer program for parts (b) and (c) is written, it is easy to rerun the scenario using diffeent values of R. Try two more runs, fone with R, set larger than 90,000, and another with R, less than 90,000. (Suggested values for this part are 160,000 and 40,000.) Are your results reasonable in view of how the characteristic roots vary ‘with a change of gain in the comesponding continuous problem? 9.6 In Example 9.4 the measurement bias thet was to be “considered” (but rot implemented) was modeled as a true constant with time. 
It may be more realistic in some applications to model the “bias” as a quasibias, that is, one that is allowed to vary slowly with time in a random manner. Let us say that 3 ‘Gauss-Markov process is @ suitable way to model the quasibias, and its auto- correlation function is given by 10,2= Rio) = oe, 1 and p= 1 see" Repeat the three-way comparison among the optimal, Sehmidt-Kalman, and R-bumped-up filters using the same parameters as in Example 9.4, except for the measurement bias model. In order to be assured of reaching a near steady- state condition after k = 0, You may wish to let the fers run through 50 recursive stops instead of 20 as was done in Example 9.4. By our choice of =, the inital conditions here ae the same as in Example 9.4; note, in partic- tla, that the intial true x for each Monte Carlo run (forthe R-bumped-up filter) cen be set to be a sample of an M(0, Pa) random variable, and ¥ can be set (0 ‘zero. This corresponds physically to resetting the random walk process to zero, in accordance with the optimal filter's best estimate, for each step prior to & (0. Then at the step just before k= 0, the projection of & to k= 0 is zero, and its uncertainty (as measured by its variance) is just (618034 + 1.0)388 CHAPTER.9 LINEARIZATON AND ADDMONAL INTERMEDIATE-LEVEL TOPICS When the resulting thee eror variance plots are compared, you will note that al three variances are closer together than they were in Example 9.4. Give 1 heuristic explanation of this, 9.7 In some decentralized filters with feedback, informatior from the master filter is divided and shared amoag the local filters (see Section 9.6). We will now look at a simple example of information sharing. (@) Furst, show that the two-fiter equations for the feedback ease (Eqs, 9.6.20 through 9.6.25) can be generalized to account for N local filters. ‘The eesult forthe global filter is - wearin] 20 -w-pM (P9772) ter feeds the following prion quantities (©) Now suppose thatthe master back tothe local filters: ae oases + (2) moi rnin divided by N, and the seme amount is fed back to each local filter) ‘Show that the global fiter equations simplify to the following equa- tions: »[Srs] 913) rt " pa74) (©) With the information divided and fed back as in past (b), is the master (ie. global) filter estimate a tuly optimal global estimate? Also, ae the loeal estimates optimal with respect to their respective measurement streams? Justify your answers. 9.8 The discrete filter equations that account for the correlation between ‘ens and ¥, should, for small step size Ar, go over into the continuous filter ‘equations derived in Section 7.3 by other means. In order to show this, we must fist derive the appropriate connection between C (continuous model) and Cy (Giserete model), We do this by noting that REFERENCES 389 [fe o6com ae we B fl vende may The C, parameter is then (for small 44) C= lw Lp op GL, LE, 6 960 ctocer"Cn) a dn —— : cag - 0) ec ar ‘Therefore, the desired relationship is c=6c This is then added to the other necessary correspondences for small Av Hos Ht = GaGa ‘Now, following the same method used in Section 7.1, show that the discrete filer equations forthe corelated case go over into the continuous filter differ- ential equations derived in Section 73, REFERENCES CITED IN CHAPTER 9 1. RE Kalman, “A New Apron o Line Firing and Pedison Poles Tos ‘Ate, 1 Baste ge! 9889 Mach 1960, 2. RE Kalman and S. Bucy, “New Resin Linear Ftesng and Preis." 
4 Pie ASE fs 0108 itd "View of ee Deedes of Ligr lesng Theor” IEEE Trans Information Theory, IT-20 (2): 146-181 (March 1974). _ 4 Ms s,s Hin Se Non Yo Wk,| | [a ce etanoeeninman | 15. H.W. Sorenson, “Kalan Firing Teebigues” in CT, Leones (ed), Advances i in Cone Sytems, VO. 3, New York: Academic Pres, 1966, p, 219-269. i 6. 8.6 Brown and, W. Nilson, Irodution fo Linear Stems Analyt, New York: j Wiley, 1962 HH) 1, MLS Grewal and AP. Andiews, Kainan Filtering Theory and Practice, Englewood i Cf, Prence Hal, 1993 (ee Section 42 and Chapter 6) i 1, A. Gelb (ed), Applied Optima Eximario, Cambridge, MA: MIT Press, 1974. | 5. C.K. Ci, 0. Chen, and H.C. Ci, "Modified Extended Kelman Filering and a Ht Real-Time Parallel Algorithm fr System Parameter destin." IEEE Tras. Aw tomate Conta, 35) 100-108 Gan. 199), 10, $° Park and J. 6. Lee, "Comments on “Moifed Extended Kalman Ftering and 2 ea Time Parallel Algoritm for System Parameter Ientifeaton” TEEE Trans. | ‘Automatic Corot (91661-1662 Sep. 199) { 11, R Bilis and D. Gulich, Callus with Anayical Geometry, 4tb ea, San Diego: i Harcourt Brace Jovanovic, 1990. | 1. RG, Brown, “Analysis of en Inetted. Inertia Doppler Satellite Navigation Sys- i {co Pat Theory and Mathematical Model,” Teh. et no. ERI 62600, Engoceing Reseach last, lowa State Universi, Ames, 1969 13, RG Brown and LL: Hagerman, “An Optimsm Intl Doppler Satelite Naviga- tin Sytem,” Navgeion, J Int Nevigaion, 163). 260-269 (al 1969. 14, RG, Brow and GL. Hartnn, “Kalman Filter with Delayed Sates as Observa bien" Proceedings ofthe National Electronics Conference, Chicago, TL, 1968. 15, RG. Brown and PYCG. Hwang, Itodution to Rondom Signals and Applied Kelman Filtering, 2a 0, New York Wiley, 1982. 16. DT Magi, “Opimal Adaptive Estimation of Sampled Stochastic Processes" IEEE Tran, Automatic Con, AC-10) 434-439 (Oct. 1965). 17. 6. Mealy and W. Tang, "Applian of Maliple Mode Estimation toa Recursive Tenin Height Carlton System" IEEE Tran. Automatic Contra, AC-28: 323-331 (arch 1983). 18. AA Gigs and R.G. Brown, “Adaptive Kalman Filering in Computer Relaying Fault Castficaton Using Volage Models,” IEEE Trans, Power Apparat Sys. PAS-104(). 1168-1177 May 198). 19, RG Brown, “A Now Look a the Magil Adaptive Filter a a Practical Means of Mlle Hypothesis Testing” IEEE Trans. Cres Spt, CAS-30: 165-768 (Oc 198. 20, RG, Brown and PY, C. Hoang, “A Kalan Fiter Approsch to Precision Geodesy’ [Navigation J Tost Navigation, 303) 338-349 (Winter 1983-4) 21, HE Rauch, “Autonomous Conte Recotguation,” IEEE Control Systems, 156) 37-48 (De. 1995) 22, WD. Bl snd ¥, BarShalom, “Tracking Manceerng Targets with Multiple Sexe ors: Does More Dea Alwaye Mean Beter Estimates?" IEEE Trans. Aerospace lecrronie Ste, 320: 430-456 Can, 1996. 23, MJ. Capa "A Necessary Condition for Effective Peformance ofthe Multiple Model. Adaptive. Estimator” IEEE Trane. Aerospace Electronic Syst, 318): 1132-1138 Gly 1995), 241. 8. Meith, Stochastic Optimal Linear Estimation and Conrl, New York Metra Hil, 1969. 25, 8. F Seid, "Applicaton of Stte-Space Methods to Navigation Problems” ia ©. Leones (ed), Advances in Control Systems, Vol 3, New York: Academic Press, 1966 26, 6-1 Biman, Factorization Methods or Discrete Sequetal Estimation, New York: ‘Academic ress, 197 n 28 31 22 33. 35. 36 3”. 38 38, 40, EFERENCES 991 PS, Magtsk,Schae Model, Estimation nd Con 1, Now Yar: Ae demic Press, 1979. ee TCH Bai, AnonaalGulonce, New Ys: MeGe- Hl 196, p. 
338-340, I Lev, apple Raman Pring Unpblaed Nato Coun 30, Nevsh emt, Ang, VA 55 BG Hitman,“ Tad GPS/RS Dein Appr Novgtion 9911-154 rng 1988) NTA Gato “fede Sar Rot ie fr Decent Parle Processes” Te Ts hers ond Bc Sys) 17-95 Oy 190) NA, Caton tnd. PB “Fesued Rann te Sila Res Nongtion Ist Navigation 410) 297-21 al 199, Mn ands ey Teo) and Aptian of alan Ferg, Pal By, FL Maplin Rok Co 198 ML Alu and 1 Flb Optimal Cont New Yoke Mein, 196 DE Ki opis Contra theory, Engen Chis, Pen aly 170. CE Digi and Ho, ppl Opinal Cou 2 Nw Yo Hated Pres Bb af ake Wy San 178 AP sige and CC Whe, Opin Sons Conroe, Eagkwond Ci, NE: Prentice-Hall, 1977, ae = {ler ranin and. Powel, Dig Cont of Dram Stems Retin ‘Addison-Wesley, 1980, shite m ae EE im ier item Tey nd Desi Ne Ys Ha, Rina iv 5 Ralah incr Syems Englewood Cif, Nt Prec al, 1980 Navigation, J. Inst.101 COMPLEMENTARY ALTER METHODOLOGY 383, Beneficial Features Since the early 1960s, the complementary filter, introduced earlier in Chapter 4, has become the basis for this form of integration (1), and there are a number of ‘good reasons for this choice; a block diagram of the general methodology using 2 complementary filter for such function is shown in Fig. 10.1. First ofall, the method has a degree of generality that allows for a wide varity of mixes of aiding measurement information. This is important because the combination of aiding sensors may vary during an individual mission, as well as in the broader sense over various suites of equipment. The Kalman filter readily accepts ‘various mixes of aiding sources; tis is all handled in the system software. Note that all the aiding measurements are processed in a single Kalman filter. Thus, More on Modeling: Integration of Noninertial Measurements tail Sree laatie ces : Justo do withthe resttions placed on the Kalman Ser model. Recall tht into INS the process dynamics and measurement relationship mist both be liner. Fre uct, the alse variables do not say this requirement. For example, Slecuonically ule diane menses ate proportnal to the Square 10 ofthe sum ofthe squares of Cartesian components, and these are cern not near relationships. In navigation systems, measurement relationships iolving rulipe spall dimension ae seldom linear. Therefore the problem must be OO teas often been sad that modeling isthe “hardest” part of Kalman filtering, ‘This is especially ue when there are nonlinearities in the pysieal equations that must be linearized. Developing a good Kalman filter mocel is pat art and part science. As a general rule, we look for models that are simple enovg fe implementable, bat yet, atthe same ime, stil epreseat the pysca itvation With ¢ reasonable degree of accuracy. This chapter is devotec to modeling ex- Lmples from one of the most successful applications of Kalman filtering over the past tee decades, namely, integrated inertial navigation systems. 10.1 COMPLEMENTARY FILTER METHODOLOGY ase of Kalman fering hs rotbly beefed 9 ther apiston sea ra ame on ayes Te car vga ers problem ht ae net yal msg te negrton 2 etal ao eam cn NS ih nv ata ote sess Sc a roar ee tects den Snes elds a0 ce aa ee ent gue. Howes, eet crap dsr rae cre croc so aed gly Te a ngraton ma oe len he walle da eget om opin ign soln, In tes eps ei at an ae ene efor te Kaan ter sian ih problem il a npr oie i aigaton ss vor linearized about some reference trajectory in order to fit the format required by the Kalman filter. Ths reference trajectory ean be a single point in vector space but is, in general, a time-varying trajectory. 
The details of the linearization procedure are described in Chapter 9. It should be apparent that the system integration scheme shown in Fig. 10.1 is, in fact, a regular linearized Kalman filter in every sense of the word. The process dynamics in the inertial system and the measurement relationships may be nonlinear. So be it. The nonlinearities are washed out in the differencing operation [z − h(x*)], and the filter subproblem becomes linear, provided, of course, that the deviation of the reference trajectory from truth remains small throughout the time span of interest.
A third reason for choosing the complementary filter form of integration has to do with maintaining high dynamic response in the position, velocity, and attitude state variables. The usual price associated with filtering is time delay or sluggish response. For example, if a low-pass filter is energized with a step input, the response is exponential. The leading edge of the step input is rounded in the output, and the degree of "rounding" is directly related to the amount of smoothing that is designed into the filter. This lag is undesirable in most real-time navigation applications. Normally, we want the navigation system to follow the dynamics of the aircraft faithfully, no matter how rapid the changes may be. The complementary filter philosophy accomplishes this and, at the same time, provides filtering of the measurement noise. At first glance, this may seem to be a contradiction, but it is made possible by intelligent management of the measurement redundancy. Note in Fig. 10.1 that the filter only operates on the combination of inertial system errors and the aiding source errors. The filter does not operate on the total dynamic quantities of interest; they pass through the system without any distortion or delay whatsoever. For this reason, this type of filtering is sometimes called distortionless filtering or dynamically exact system integration. Note also that the total dynamical quantities of interest (i.e., position, velocity, and attitude) do not have to be modeled as random processes. The filter only operates on the system errors, so these are the quantities that appear in the Kalman filter model.
Figure 10.1 Integrated navigation system (feedforward configuration).
Many real-life navigation systems have depended largely on the INS for providing the reference trajectory when using this type of integration philosophy, because the INS is self-contained, continuous, and provides all the basic navigation quantities normally of interest: position, velocity, and attitude. None of the other sensors, when considered individually, can provide the same complete set of information in quite the same way. Thus, the INS is the logical choice for the reference, even though, by itself, its accuracy may be poorer than some of the aiding sources.
Additional Details on System Methodology
Some additional comments on Fig. 10.1 are in order before proceeding. First, it is tacitly assumed that in forming the difference [z − h(x*)], we difference like quantities. For example, we do not try to compare inertially derived position directly with velocity as seen by the Doppler radar. Rather, we compare inertial velocity with Doppler radar velocity; furthermore, we compare the two in the same frame of reference.
This often means that inetially denved quantities mast be converted to another coordinate system before the appropriate comparison eer ce made, I should also be noted that each of the aiding sources does not fave to measure all ofthe same dynamical quantities that are normally computed in the inertial system. If the “aiding” sourees are really to ai, there needs to be at least one connection to the inertial quantities, but that i all. Usually, no fone of the aiding sources (regardless of how accurate) provides all of these (Quantities and, therefore, itis convenient conceptually to Tump al of the aiding Sources together in the aiding” eatezory. Tris important to note that the INS measurements are not used purely as “measurements” but rather to form the reference trajectory against which the ing data are compared. The reference trajectory consists of unfiltered ata; onion and velocity ae extracted from measured inertial acceleration and te 10.1 COMPLEMENTARY FLTER METHODOLOGY 395 sinude is frmed fom measured site rte. Hi the aiding dat ess the ference trajectory that frm te very “mensrement fet the Kal ite Of the ers ta he Kalan fer estimates, the neta system eor re of bine nets ne we ater mel rng thane Inst be lina wit white-noise process a the diving fron, Various mode ave bc wet te mello be wad any Fa ppt il Gepend onthe pe of intl system at hand atthe degre of complexity tht the syste engner i willing to lve with nthe design, Tis contates the ingen living con hs hap en © msi Ine sytem enor on must ao angen be stem ste veto Wit ny nonwhite ang source errs. Brea though ty ay note oF pity ines inte navigation problem, ey mis neverilest, be cared along a theirs on ea ope id Kal The tne soutce eorestinas are no of Course, sed in the fetforard conection 0 thea sy, bt ey sc prope acauted for en Torin he mer Strement esta where te ene Hae vectors volved Feedforward vs. Feedback Configuration ‘The block diagram shown in Fig 10. is a configuration called is a configuration called the fesdfornard ar ope loop configuration because he corrections othe INS output ae lized exteraly but ae not etured to madly the INS item, Although conep- sly wl cniguan ape dees bet he ee nce and aca ajectrcs to as fo cause Titer asumpions and random process eror model to radially deteriorate. To avid this, a Toedbeck cons tration soch as that shown in Fig. 102 ian acceptable altermave. This cong ston Kas oa xen Kaan er deo wpa I cots a ordinary ineniaed Kalan fer thats ssocisted tthe feedforward case One sould be earful ooo he to literal in he inerpetation ofthe lock diagras of Figs. 1 and 102. They ae inendel wo be concept, 0 era ores the rc so ack 0 th INS i 10.2 we orrextons tht mst he made in «computer someere inthe system. some systems this might be a ental ight management compte tht so does tiny er commons lg py,» sre elond Solution In other systems, the omiputtonl efor may be dsb among Ce ee Figure 102 negated raigatonajton—toadock contusion396 CHAPTER 10 MORE ON MODELING various “boxes” in the system. Never mind which box does the computation; Tegurdless, the key item is h(x"). If b(x*) is computed before the corrections fare made fo the inertial ouiputs, the filter is an ordinary linearized Kelman filter: if hx") is computed after the corrections are made, the filter is an extended Kalman filter, Both modes of operation have been used successfully, and each has its place. 
Both modes of operation have the same performance, within the inearty assumption, Clearly, though, the extended Kalman fiter is to be pre= ferred in applications where the mission time is long, as would be true of a ship at sea for many weeks or months, Otherwise, he reference trajectory would verge fom truth beyond acceptable limits. On the other hend, in «ballistic Space vehicle Iaunch, the mission time is short and, there is every reason to azsume the actual trajectory will match the preprogrammed one closely. tn this tase, an ordinary linearized Kalman filter might well be prefered. INS ERROR MODELS [An INS is made up of gyroscopes (gyros, for short) and accelerometers for basic fonsors. A gyro senses rotational rate (angular velocity) that mathematically Fnegrates t give overall change in attitude over time, Similarly, an acceler- ‘meter senses linear acceleration that integrates to give velocity change, or dou- by integrates to give position change over time. An INS sustains attitude, position, and velocity accuracy by accurately maintaining chenges in those par- meters om thee initial conditions. However, due to the integration process, terrors in the attitude, postion, and velocity data are inherenty Unstable but the frowth characteristics of these errors depend on the type of sensors used. The evel of complexity needed forthe error modeling depends on the mix of sensors in the integration and the performance expected of it Single-Axis Inertial Error Model ‘We shall begin by looking ata simple model that contains the physical relation ship between the gyro and the accelerometer in one axis. The following notation will be used: {Ax = position error Ax = velocity exror Ax = acceleration exror 4 ¢ = gravitational acceleration platform (or attitude) error relative to level ‘earth radius «a = accelerometer noise = gy70 noise in terms of angular rate 102 ISERROR MODELS 997 Je-axs mol i isrctive for the elationsiptdeseribes between the scoters a the go sc ath ene ira quien oe near acclration andthe tet angular velocity. The diferent extn ta desc the scccrometer andthe gyro cor we gen @8folo¥o. ats an ed ao2.) 1 4 agte «an2.2) Jn Eq, (1021, he erin aoseation is fundamenlly due to combination of selon sn i ad «congo gn ewe tenses as a result of platform eo. The Platform enter rat, described by E (1020) rst tom gyro senor nie alot eor hay when projected Along te eats surface curvature, gts asad into an angular entero An accelerometer ero thet inept ino a veloty cor ives tse to mis: alignment in the perceived gravity vector due to the earth's curved surface. This Inbalgnen resus in nonaona commponent ta fortonly works agus the effec ofthe inal acceleomte ener The resuing coilaton known as the Schule sciltion provides some sui tothe rzotal ero, Note that fis assumed to be the usual car's gravitational constant. Thu, ts simple tol i vested to low dynamics. Three-Axes Inertial Error Model In progesing fm a single-axisto level platform theeaxes INS, atonal Complexe trie rom intracion ang the thee senor pis (2,3), A sensor pair aligned tn the nonh-soud diction is shown mn Fig 109 29 9 tant fancon block diagram denoted asthe noth chan. Avery sls model eis for the east channel as shown in Fig. 10.4. ust as with the single-axis error i he eae mo eed i dais. 
se Mon 102 "he deren equations that accompany the transfer functions of Fi 103 and 104 ae ges balou, Inte following notion he plato ron ent op Mt i}+ms site te Fhe fF] tie Figure 10 Net henner mal ey th, 2 un, (7 Bator ang rata seul y a8998 CIUPTER 10 MORE ON MODELING 102 INSERROR MODELS 308 Figure 108 tic charlene ace a iste tie te oat See an, _ ‘Abate Sate dynamic model can be wed aa foundation fran aided ey | 1s Kalman iter mel. For ore bere the nie variables ine ste vector | Mil be olred as flows / fines 104 tan cor er ot ce wy wer 8 a = eas postin enor (n) (Se pit eng te abou 98) x, = east velocity error (m/sec) 1x; = platform tile about y axis ad) rate o is an angular velocity. Bear in mind, however, that « is not the same as the platform tilt rate error d, which is an angular velocity error x, = north positon error (m) ‘x, = north velocity error (@m/see) i — = platform kt about (x) ais (ad) iad ab) (102.3) x, = vertical position error (m) aes 10.24) xy = vertical velocity rrr (m/sec) * y= platform azimuth error (ca) 1029) East channel: | Based on Eqs. (10.2.3) through (10.2.8), we can write out the nine-dimensional vector fstorde differential equation: ats a,~ 8d, (0025) dart a, 102.6) «] fo 1 0 0 0 0 00 offs] fo R w| foo 00 0 00 olm! Ju Vertical channel: i ee occ el le 27 : ae Coal x | [oo 0 0 1 0 00 offs} fo Platform azimuth: t.{-[o 0 0 00 -,00 olfyl+ an, 102.8) x] foo oot 0 oo ala! Ju “The north and cast channel models take into account the previously described «| Joo 000 001 0ffs,| Jo phenomenon that is due to the earth evrvature. The vertical channel does not benef fom the Schuler phenomenon and is governed by a simpler model as w| |o 0 0 00 0 00 Olly H shown in Fig, 105) ee + tcan be shown tar he charters ples fr he vert chanrl do note ea the fin — ees They a tly snes ose el us ae iy we of Pe i a cer oll sae tbh ples be ease he (102.10)400 CHAPTER 10 MORE ON MODELING 102 INSERROR MODELS 401 From the parameters of Bq, (102.10) the discrete-time ine-imensonal vector ee From the rates ation forthe process model can be derived, Closed- or process shay = 6+ Me 02.1 Sesto ere eco ose covariance marx €or te ta wasiton face srameters are general ined wit cence eee rameter = ¢-%"" veins liscrete-time al a A a al ah any coed eer ereeepe ere ae pee sonm tare tomate eerie ns ovens this process (ve Example 5.2) The vance of te process nos yi Ekman rao nna fey ef fo oo ‘Var(w,) = (1 — G7)Var(x,) 01 mia 0 0:00 0 i 2 hen tes atonal err sates are mugmented ote base Sse mote, tbe 0200) oo oem process model canbe ten inthe following partioned ay Tu 0 1 5 0 0 6 Xo oo mn |0 0 0 01-0 0 0 = a at bo = oO an oo 0 of 1 0 0 wr ms ° +| me oo 0 00 0 14 0 | me : Me coo 0 0 oF oo has te L Lee 7 0 021 ‘The INS process models rine states ntl comprsng a ysition a veloc ee ea ee te ah cach of tice Gimensins. This misinal system sapere eae paneer sa aan econ for platform misorientation, but only allows for very litle eom- INS dynamic equations Fo through 2 tat crnted wih the accelerometer and gyro. 
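Whatever numerical values end up in the F matrix of Eq. (10.2.10), the corresponding discrete-time parameters are usually best evaluated numerically. The sketch below uses the matrix-exponential construction generally attributed to van Loan; it is offered only as an illustration (it may or may not coincide with the numerical method of Section 5.3, which is not reproduced here), and it assumes that F, G, and the white-noise spectral density matrix W have already been formed from the error model and the sensor noise levels.

% Numerical evaluation of Phi and Qk for xdot = F*x + G*w(t), where w(t) is
% white with spectral density matrix W.  F, G, and W are assumed to have been
% formed already from Eq. (10.2.10) and the sensor noise model.
n  = size(F,1);
dt = 1.0;                                % step size, sec (example value)
A  = [ -F         G*W*G' ;
        zeros(n)  F'     ] * dt;
B  = expm(A);
Phi = B(n+1:2*n, n+1:2*n)';              % state transition matrix
Qk  = Phi * B(1:n, n+1:2*n);             % discrete process noise covariance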
ober wor, ples in errs asoeironly simplified ond imdeled ss white-noise forcing Aa rene INS enor dyaamics Also, platform angular eer are oa tari re and pla toring rats are assumed Yo change ety o 0.0 Opa as hey contested constant Sal aceleraton (lative 0 1070700) 0 Fo umel inte sate model gies by Ea, (102-10) Even hough ooo oo 1) a aes simple is noetless workable ad canbe found in oo /0 00 00 the in vome rea lie sstems ote eG 300000 501000 ‘Sensor Noise Models ooooo01 ‘te aecteration and gyro enor ems ar spl white-noise nats in Figs 102 oo Fo ace add ore fey tothe mal, adons err tems with 8 ee ee eee though 105.7 aiscre can aso be, iacloded. Genel, fistomler seg ei nrg rg enerio rina Le co a aulfiien for each ofthese pes er. To dS, 20 tended draton of tine (yp ous). By compase,medhn-qualy 3 a rs eetjuee fr aceclertion and thee for go ets, ae nosed temo rogue’ ental adig to atin the performance afte by high-qaly for augmenting the state vector. They are all of the form: systems. Otherwise, medium-quality systems are capable of stand-alone opera402 CHAPTER 10. MORE ON MOCELING ‘Table 1041. Comparaon of Ditrent etal Systoms (8) Sensor Parameters High Low Gyro bias =orrs 17h Gyro white noise 3x 10 "see! Viz >0.001"/see) VEE Accelerometer bias 10-50 ne 200-500 sg 1000 1 ‘Accelerometer white noise 3-10 yg/ VA 50 el 50 pg ‘FH sie qaliy pb ics Ave crepe oe Toe quliy owsgiven by (Te lo gay gn Tae rein el ae the nhc anne pro ales ae Sed 8 terms of Ce quae roto per special dry. 10.3 tion over shorter durations. Low-quality systems require extemal aiding to pro- vVde useful performance and can only offer brief stand-alone operation. ‘Rrother level of sophistication that may be added to the accelerometer and syto errr models i that of accountng for scale factor em. The mathematica) eMionship thet tanslates the physical quantity a sensor measures tothe desired Tale representing that quantity generally involves a scale factor that may vary svc caeuumentation uncertainties. This error may be estimated when the in- tegrated system isin a highly observable state. Whether it is worthwbile to cece this exor depends on how significant the errr is and its impact on the ‘overall performance (see Problem 103). DAMPING THE SCHULER OSCILLATION WITH EXTERNAL VELOCITY REFERENCE INFORMATION tis well known that pure INS exhibit undamped oscillatory exror charactrities tvith a period of 84 minutes. Tis is known as the Schuler oscillation. When Undamped oscillatory systems are excited by random noise, the output grows Sunistcally without bound (see Problem 3.12). This is, of course, undesirable fer missions that last for even a few cycles ofthe natural oscillation. One way CTinitigating the problem is to add viscous damping in much the same manner 5 might be done with a mechanical pendulum, To do this in an INS, an inde- pendent, noainerial source of vehicle velocity information must be available eer, Doppler radar. The block digram given in Fig, 10.6 shows the raditlonal sekog way of damping the Schuler oscillation, The key portion of the diggram ihe part sthere the INS velocity is compared with the reference velocity, and then the diference is fed back through a scale factor tothe accelerometer input We now ‘wish to look at how we might accomplish a similar result using the Complementary filter methodology that was discussed in Section 101. In ordes To Keep the modeling as simple as possible, we will only Took at one horizontal Channel of the INS, WE-Will then expand the model later. 
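Before taking up the single-channel damping example, it may help to note that the first-order Gauss-Markov error states used to augment the basic nine-state model in Section 10.2 (and used again later for aiding-source errors) all discretize in the same simple way. A minimal numerical illustration, with placeholder values for sigma and the time constant, is given below.

% Discretization of a first-order Gauss-Markov error state:
%   xdot = -beta*x + w(t),  autocorrelation R(tau) = sigma^2*exp(-beta*|tau|)
% (sigma and beta are placeholder values, not parameters taken from the text)
sigma = 0.01;                            % rms value of the error state
beta  = 1/300;                           % inverse time constant, 1/sec
dt    = 1.0;                             % sampling interval, sec
phi   = exp(-beta*dt);                   % scalar transition parameter
qk    = sigma^2*(1 - exp(-2*beta*dt));   % Var(wk); keeps Var(xk) = sigma^2
% Each such (phi, qk) pair occupies one diagonal position in the augmented
% portion of the partitioned model of Eq. (10.2.12).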
102 DAMPING THE SCHULER OSCLLATION 403 eee te etl om EE he, Figure 10.8 Campa of Seiler oxsaion ug adam velotyreerence We fist not from Fig, 101 thi the base ole of the Kalan lt so esinats he intl ye eos Ths hese Guar st be eel at fandom process insutespace oom The beck liga of ig 106 wl serve ts well ot opr, proved that we ignore the slo reiene feedback ora he dagen (We only wiht motel te unaided INS enor propagstion Eras point) in is ints of simply, we wil arsme the street rer tobe wie Wit a Knows power special eniy, Suitable state equations ty be cisnd by choosing te ee mega ons ute vats, They will be denoted as x,, x, and x, and they are defined as follows: ~ 15, = INS positon enor (m) x, INS velocity ertor (m/sec) xy = platform tlt (a8) ‘The diferent equations are now obtained del fom th J obisined diel fom te block diagram we ignore the velocity feedback part): : oe a= ex +410, ld is white noise with a spectral amplitude A ee R, (103.1) (Or, in the usual matrix form, % x] fo a= ale | |fA@ (103.2) ay x} Lo404 CHAPTER to MORE ON MODELING We now need the diseete form of Eq, (10.3.2) for a step size of At, which is the interval between samples of the external velocity reference measurements Reference velocity is asstmed to be the only aiding source ie this example. If [Aris small relative to the Schuler period, we can approximate the solution of Bg, (10.3.2) with just first-order tems in Ae. The result is x rar 0 ]fs] Po * x] + fe 103.3) ‘The transition matrix 4, is obvious from’ Eq, (10.3.3). Derivation of the Oy Covariance matrix associated with the vector forcing function [w, ws _ws]” in fq, (10.33) is left as an exercise atthe end ofthe chapter (se Problem 10.5). Hiving determined the key process model parameters , and Q,, we will look next atthe measurement mode! ‘The measurement presented to the Kalman filter isthe cifference between INS velocity and the aiding refereace velocity. Note that the siding source is only related directly t0 2, in this example, and not to the other two state vari= ables. This stil fits the format of Eq. (103.3), though, and we will account for this with the H, parameter in the measurement model. First, however, it should te noted that we are dealing with'the discrete Kalman filter, so we must assume that reference velocity measurements are available on a sampled basis with a Sampling interval Ar. We aso need to make the assumption that, after sampling, the sequence of measurement errors is a white (uncorrelatec) sequence with @ Known variance R,- Putting the known measurement relationship in this ease into mathematical form yields the Kalman filter measurement 2: Kalman filter measurement “(uve velocity ~ INS velocity error) + (true velocity “+ reference velocity error) = (INS velocity error) + (reference velocity error) in terms of 2 and x, we have p10 (:] + 034) “The H, parameter is obvious from Eq, (10.34) and R, is the variance associated with of, the velocity reference measurement error. Presumably, Ry is known (or it least can be estimated) from the nature of the equipment being used for the roninertial velocity reference, "The initial conditions for the Kalman filter are chosen fo correspond to the physica! 
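As a quick sanity check on Eq. (10.3.2), the unforced error dynamics should exhibit the 84-minute Schuler oscillation. The few lines below assume the sign convention ẋ₂ = −g x₃ and ẋ₃ = x₂/R_e; with that convention the eigenvalues of F come out at 0 and ±jω_s, as expected.

% Sanity check on the unforced dynamics of Eq. (10.3.2)
% (assumed sign convention: xdot2 = -g*x3, xdot3 = x2/Re)
g  = 9.81;  Re = 6.378e6;                % m/sec^2 and m
F  = [0 1 0; 0 0 -g; 0 1/Re 0];
lam = eig(F);
disp(lam.')                              % expect 0 and +/- j*sqrt(g/Re)
ws = sqrt(g/Re);
fprintf('Schuler period = %.1f min\n', 2*pi/ws/60)   % about 84 minutes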
situation at hand when the velocity reference is "turned on.” (inthis 103 DAMPING THE SCHULER OSCILLATION 405 Holey reence win oceans obit elle Figure 107 ror growth stabilaaion ung an sera oy refoenoe simple example, the nonin vlc settee the ony sing source, 0 there sno Kelman filter operation uni he velocity reference is operative) Bear iting that he te varies ae inertia uate so woul orally be set esa to aero Gn eaten of bees infrmaon) The nal ae cote nly cnn Saginaw via ne ins crane) Wing teresa ofthe fal corny he ine oo Tighe 107 is iene to demons te efetvres of he te efetvres of he Katman fer insights hone re ape of en. Te per ane sho he meas ftom err oe apn oe Sele ee wit isting tla Seflrneenormnton, Cen. Xi caneagef an are fener Te owe ce shows he sane mea ona pte or he icy race nomaion meted vn Kt he Tet alized int case with te el siuaton of ef igen, and te le fe wee hse be pe fer ange el option scat tpl that te ost being dared ad that he atm eer Yam aproaching site pe bend EXAMPLE 10.1 INS/Dopper Radar. Measurement Model Here, we will generalize te Schlringng prem oe ll St IN6 od Dope a ve lociy-measring system based onthe Dols shit nan eletromagnetic wave tht Feet back fom the goin (or va) ater being transited by an Aiea Although thee mea varity of Doppler radar equipments we call Simply conser the Doppler rar sytem to ease and cup the sat Aorta velocity in ters of two orthogonal components: Vote eon Of the srr Heng, and Vp that pependiclaro Vy Gee Fig 108) Th the Doppler radar supplies a 2-tuple velocity measurement that may be used 10 dan INS. We wil imi the dscsson hereto the development of the nerzedE 408 CHAPTER 19 MORE ON MODELING i va i Figure 108 Arcane Dest convraan ed inearrele 10 1H, matrix forthe case where the measurement nose is assumed to Be white wr Mihe INS errors are modeled as a Oeste system (slates described in Eq. 10.2.9. Tellowing the same system integration philosophy as before, we see hat the INS muent resolve its indicated horizontal velocity components V, and V, into tre erat body frame of reference, which we wil call heading and drift. The cre Ainate directions are shown in Fig. 10.8. Note thatthe azienuth angle A is Srevsugative of the usual heading angle with respect to true north, We wil ae acathat the INS platform azirouth angle Bis used to resolve V, and V, into Vy and Vor This resolution yields the INS “predicted values ofthe two Doppler eerteiments, ‘The idealized noiseless relationships are as follows Vy = ~V, sin B + V, cos 6 Vp = V,c0s B+ V, sin B 103.5) Clearly, the measurement relationships are nonlinear in , so they must be line Saas ean be done by forming the appropriate matrix of paral derivatives St vg and Vo with respect to V,, Vj, and (See Section 9.1) The resulting linearized H, matrix is then o snp 0 0 cmp 0.0 0 (Yom ev. sinB ae 5 E88 Cemer ens aia “The two Doppler measurements have been ordered with the heading component aT aelocty fest and its drift component second. Also, itis important to note tat he tens fn the Hy, matrix are time-varying in general, and they must be econ faced with each recursive step ofthe Kalman filter. This computation is made, Pricourse, using the most current V,, V,, and outputs of the INS. 
10.4 104 BAROWAIDED INS VERTICAL CHANNEL MODE 407 canrpnn force eects) 1 ws Tons carats — eT : | I ' owe vets | vena { we ob, TESA Figure 102 Bock degra ofan NS veri charnl wih bawling BARO-AIDED INS VERTICAL CHANNEL MODEL ‘Figure 10.9 shows a simplified block diagram of the vertical channel of an INS with aiding from a barometsc altimeter. Note that this implementation follows the methodology discussed previously in Section 10.1 with the INS providing the reference trajectory. The INS in this ease can be thought of as a strapdown system with a triad of accelerometers sensing total acceleration plus the gravity vector. In Fig. 10.9 we are looking only atthe vertical component of the sensed acceleration vector. It is assumed, of course, that the dynamic reaction forces {due to aircraft motion are properly accounted for in the gravity correction. The Kalman fiter only operates on the system errors, so we now need to look a the ‘ay in which errors propagate in this system, Figure 10.10 shows the error models that will be used for the vertical ‘channel of the INS and the barometric altimeter. These models are much sim plified, but they will serve our purpose here. In defense of the models, they do ho eee 7 2 ~ x fa, ie > [YEE = eb Figure 1010. Erormacele usd for te verca chan. fad shou in. 109.408 CHAPTER 10° MORE ON MODELING sow fr none or propagation in bth he INS and bro ainetr. Clearly, SHCTNS ee chal hn steer propagation, The de ple the inne spare mans ta the met gate reponse ow rls il ee ar bend On thee han, he Maso adel for Ge bare ata Saye at eve we Sod (osha aes) and the mode Toe Es Coder Nextt nh jst of te of ard parameters Stead thet whl NS cox somal Mac dogors choce ofthe spect ie of the whe ae vin acon ine INS eanel-Thn cen thoge he tne tel gue inl eee este Seger of Roxy in aaping fo # vee of ohyseal tise We wl nw Tak athe proces and messremen dl fo the man teri this station etre lt vel ae necessary forthe maematial eseripin of he mode nig 1O10 end thy ae ined 9 follows x, = INS vertical position error (mm) xy = INS vertical velocity error _(m/cee) xx, = baro error (m) “The approprete differential equations are obtained directly from the block dia- grams, and they are 4 =f oa.) y= Bx + VIRB AO Equations (104.1) canbe solved for a step size Ar, and the result in matrix frm s var o]fn] fw n ot o [ln] +] 10.42) Bh. too eb bel. “The state transition matrix ¢ is obvious from Bq. (10.4.2), The Q, matrix is the covariance essociated with the forcing term in Eg. (10.4.2), ‘tis not obvious, ‘at it ean be calculated withthe help of random-process theory (see Chapter 3). ‘The result is 104 BARO-ADED INS VERTICAL CHANNEL MODEL 409, ap ae ase 4a o ar an|4te aur 0 (043) 0 0 ox ey Where A is the power spectral density of f,(0), and o# and f are the Markov parameters for the bato error process, ‘Tae random-process part of the Kelman filter model is now complete, We will look atthe measurement model nest The desired measurement relationship is obtained directly from the block diagram shown in Fig. 10.9. In words, the relationship is (Measurement presented to the Kalman fitet) = (que altitude + bao evox) — (uve alliade ~ ENS aldiude error) Now we denote the discrete measurement as 2, and note that x, is the INS Position error and x, is the Markov part of the berovalttade errr. We will also ‘ay thatthe baro-altitude error has @ white component in addition to the Markow ‘component. The final measurement model in matrix form is then asfo [i]. nay and the H, matrix is obvious from Eq, (10.44). 
The R, parameter is the variance associated with the white sequence v, term in Eq, (10.4.4) and its numerical value will depend on the physical situation under consideration, We will not attempt to assign numerical values to the various parameters in our INS baro-altimeter model (see Problem 10-1). Itsufices here to say tat, with reasonable values, the system effectively slaves the corrected inertial output to the baro-attude measurement in the steady-state condition, However, wien sharp transients oceur, the system follows the INS-derived altitae, Said another ‘way, the baro-atitude provides the low-frequency response, and the INS provides the high-frequency response. Thus, the Kalman filter mechanization is dynami- cally exact and accomplishes a result similar to the analog mechanization dis- cussed in Problem 4.13. The Kalman filter has the advantage, though, of its sigital precision and optimality (subject, of course, to the accuracy ofthe model). Also, with the uncorrected INS error growing without bound, it should be ob- vious that correcting the INS on an open-loop basis could eventually lead to ‘numerical problems. Thus, closing the loop by using the feedback structure of Fig. 10.2 is highly desirable in this case.440 CHAPTER 10. MORE ON MODELING In summary, the beauty of the INS/baro-atituée mechanization just pre sented is that it achieves the “best of both worlds"—the fast dynamic response bf the INS coupled with the bounded error characteristics of the baro-alimeter 10.5 INTEGRATING POSITIONING MEASUREMENTS We shall consider here an integrated navigation system that is updated with positioning measurements. There are many types of positioning systems used Whit integrated navigation systems offering different levels of postion accuracy nd error characteristics. The distance-measuring equipment (DME) system Jnentioned in Chapter 9 is a (v0-dimensional (horizontal) ranging system of Signals traveling to and from ground-based transmitters (6). OMEGA and LORAN are also examples of two-dimensional ranging systems (6). GPS, the premier positioning system available today (see Chapter 11), is a thee Timsensional ranging system. Avery high-frequency omnidirectional ranging (VOR) system is positioning system based on angular measurements of the Girection of signals eriving from ground-based transmitters (6). ‘All positioning systems based on ranging principles are susceptible, in varying degrees, to ranging ecors due to timing resolution and unpredictable [mosphere propagation effects. Positioning systems based on angular measure- nents are prone fo errors in directional resolution and multipath. These errors wre sully correlated between time samples, unless the sampling period is un- toually Tong, because changes in the signal propagation path or the atmospheric Characteristics affecting it evelve slowly. These errors can be modeled indepen- deny with first-order Gauss-Markov states augmented to the state vector in much the same way that accelerometer and gyro errors are accounted for in E. (102.12) INS/DME Example “The linearization of a DME measurement was discussed in Section 9.1 (Example O11), There, the direct slant range from the airraft to the DME ground station vias considered to be the same as horizontal range. This approximation that can tein ervor by a few percent for short to medium ranges is usually ebsorbed into the measurement noise component of the model. 
Once the horizontal range is Dbiained, i is then compared with the predicted horizontal range based on the INS position output, and the difference becomes the measurement input fo the Kalman filter, The difference quantity bas a linear connection to Ar and Ay as iscused in Example 9.1. Based on the same ordering ofthe state vector as in Ea, (10.25), the rows of the Hl, matrix corresponding to the two DME stations would then be 105 INTEGRATING POSITIONNG MEASUREMENTS. 411 sin ay 0 0 cosa 0 0 0 0 0 wo [RE 88 E888 8 Bees cas ‘where we have written the direction cosines in terms of the bearing angle to the Station rather than 0 2s used in Eq. (9.1.18) (see Fig, 9.2). Note that the bearing angle is the usual clockwise angle with respect to true north, and the x-axis is assumed be east. Iti assumed, of course, thatthe coordinates of the DME station being interrogated are known and that the aircrafts position is known Approximately from the INS postion output. Thus, sin a and cos a are com- pPulable on-line to a fistorder approximation. Range measurements from more than one DME station could be made either sequentially or simultaneously. The Kalman fiter can easily accommodate to either situation by setting up appro- priste rows in the H, matrix corresponding to whatever measurements happen to be available EXAMPLE 10.2 Simulation of INS/DME_ For the following simulation exercise, we shall look atthe performance of thee different systems: (2) an integrated INS/DME sys- tem, (b) an INS-only system with initial alignment, () a DME-only system. The nominal aircraft motion and the DME station locations are as shown in the figure accompanying Problem 9.4 in Chapter 9. For the INS/DME. system, we shall tse the basic 9-sate process model described in Eq, (10:2.10) and choose the following parameters for it: Ar step size = 1 see Accelerometer white-noise spectral density = 0.0036 (m/sec")'/(rad/sec) Gyro white-noise spectral density = 2.35 (10°) (rad/sec)*/(rad/sec) R, = 6;380,000 m Y-axis angular velocity «, = 020000727 rad/see (earth rate at the equator) 100 mise Xaxis angular velocity @, =0,0000157 rad/see Initial position vaviance = 10m? Initial velocity variance = (0.001 m/s)* Initial atitude variance = (0.001 rad)? DME white measurement error = (15m)? ‘The INS-only system uses the same inertial sensor parameters and the seme inital alignment conditions. The main difference between the INS-only system and the integrated INS/DME. system is that the DME measurements are never processed, whereas the INS errors are allowed to propagate according to the natural dyamies modeled.412 CHAPTER 10. MORE ON MODELING For the DME-only sytem th sirraft mation is mele a random walk (in both x- and y-positions) superimposed on constant-velocity motion in the ‘y-direction, The filter in this case is a simple 2-stae Kalman filter linearized bout the nominal trajectory, The following parameters are use Ar step size = 1 sec Process noise variance (in each position axis) = 40) m* Initial position variance = 10 m# [DME white measurement error = (15 m)* In the 200-sec run, the aircraft starts out at the location (0, ~10,000) and flies north at 100 m/see and nominally ends up atthe location (0, 10,000). 
Figure 10.11 shows a comparison of the standard deviations of the fosition esror for om eee eo ae He Te es ad i = Tine ee) Figure 10.11 Comparson of vatous comélatins between NS {nd Due eoors teming aor ie presser eas ‘onponent on na compart 10.6 OTHER 105 OTHER INTEGRATION CONSIDERATIONS 413 all three systems described above, for the east component (a) and the north ‘component (b). Although the INS position error grows without bound, the po- sition errors are stable and smaller for the integrated INS/DME system. The ‘resting of the north position error near the 100-sec time mark is due tothe poor DME observabilty in the north direction when the sircraft crosses the x-axis (ee Problem 9.4). The position esror for the DME-only system bas character istics similar but slightly larger in magnitude by comparison to those of the integrated INS/DME postion error, a INTEGRATION CONSIDERATIONS ‘The integration philosophy discussed here has found wide use in navigation systems over the past three decades. It is clearly a philosophy that is centered around the use of an INS primarily because this type of sensor better than any ‘other, is capable of providing a reference uajeclony representing position, ve locity, and attitude with a high degree of continuity and dynamical fidelity. tis Jogical to then ask: If an INS is not available, can this integration philosophy still be useful? In general, any sensor that is capable of providing a suitable reference trajectory with a high level of continuity and dynamical fidelity can bbe used in place of the INS. In some applications, the reference trajectory need only consist of subsets of position, velocity, and attitude, An example of a ref- erence sensor for a land vehicle is a wheel tachometer that senses speed, inte rated with a compass that senses direction, to produce a velocity measurement and position through integration; attitude is not available, nor perhaps, necessary, In an aircraft, a suitable reference sensor might be derived from a combination of true air speed data with a magnetic compass for some applications. In an integration exercise, it is always useful to keep the error stability characteristics of each sensor in mind while the detils of modeling are being. hhashed out. In the two examples discussed inthis section, INS/DME and INS/ Doppler radar, itis important to note thet in each case the aiding source serves the INS ina very different way. As was pointed out in Section 10.2 the position, velocity, and attitude errors produced by an INS are unstable over time. As @ Positioning system, the horizontal positon errors produced in DME measure- ‘ments are stable. Thus, an INS/DME system has stable horizontal position and velocity ertors. Since the DME provides no vertical position information, the vertical position error and consequently the attitude errors are unstable overtime. So if an INS/DME system is required to produce a stable three dimensional Position, velocity, and atitude data output, i will require more tion on vertical In the other integration example, the Doppler radar provides a stable source ‘of velocity data, but when this is integrated into position, the errors become unstable over time. 
Thus, an INS/Doppler radar system has stable horizontal velocity errors but its horizontal postion errors are unstable, The stability of its vertical position and velocity components must also be addressed in the same ‘manner as for the case of the INS/DME,i 414° CHAPTER 10 MORE ON MODELING PROBLEMS 415 ‘To close, one final comment is in order. It should be apparent that the are rewrite here with the inlusion ofthe Intra acceleration components system integration philosophy presented in this chapter is not the only way of repo nponents Ay ‘and A, tie-in to the azimuth error ia the horizontal acceleration error equations integrating inerdal measurements with noninetal data, One only has to peruse (ediditfonal terms are indicated with a double underscore) tuvigation conference proceedings over the years to find countless integration i ). Schemes, each with litle diferent twist because of some special circumstance ‘East channel: Sr choice. The scheme presented here is optimal though (within the constraint ‘Of dynamic exactness in the use of inertial measurements), and it represents the pest way of managing the system integration theoretically. However, the systems fnginee often does not have the luxury of designing a truly optimal system, find must at mes setle for something less because of equipment constrain, ‘Cont and go forth. Even s0, the optimal methodology presented here is stil ‘Teluable for analysis purposes, because it provides a lower bound.on system Crrors for a given mix of sensors. That is, the optimal system serves as @ “yard- Stick” that can be used for comparison purposes in evaluating the degree of Suboptimality of various other candidate integration schemes 10.» North channel: (102) Ay + ad te, PROBLEMS Jo. A Kalman fiter model for integrating vertical acceleration and baro~ altitude measurements was given in Section 10.4. Consider the following set of | ‘numerical values for this Kalman filter implementation: Using the parameters for Example 10.2 in the integrated INS/DME navi- sation system, perform a covariance analysis to determine the time profile for fe varane fe sinuh eer forte flowing dani setae The nominal y-axis acceleration and velocity profiles areas shown inthe accompa ying figure. "The y(0) profile for linearization ofthe matrix may be approximated as constant-veloity (100 ma/see) forthe firs 95 sec; then a reduced constant ve- locity of 80 m/sec dusing the 10-ee deceleration period; and, finally, a constant velocity of 60 m/see forthe remaining 95 sec of the profile. (Note that we are not assuming this to be the atual fight path. This is simply the approximate reference trajectory to be used for the linearization.) “The parumeter values given in Example 10.2 are to be used here, except that the gyro white noise power spectral density iso be increased t0 2.35 (10) At step size = 1 sec ‘Accelerometer white-noise specteal density = 0.13889 (m/sec*)/(rad/sec) Markov baro error variance and inverse time constant: = (100 m)* fr! = 300 sec ‘White component of baro error = (10 m)* ‘We will assume that the system is “turned on” at ¢= 0 with perfect knowledge i i ot ala (rad/sec)*/(rad/sec) and the initial azimuth erro : Hon the above assumplions, derive the mumescal values forthe ‘Fai ited to simalate ale sable INS ay compared with he ove cond ay er a Tis mend ita erste NS compe wih he on coi (oy Aker seting up te above Kaan fit model, excl true 1090 ‘nBxample 102, he cath rats L ee SF Sovmance analysis forthe given Biter parameters. 
10.2 The approximate INS error equations of (10.2.3) through (10.2.8) are for a slow-moving vehicle. In this model, the observability of the azimuth error is poor because it can only depend on earth motion (gyrocompassing). Hence, for an INS with poor gyro stability, its steady-state azimuth error can be quite large. For a faster-moving vehicle that occasionally encounters horizontal accelerations, the improved observability of the azimuth error under such conditions actually provides a substantial reduction in the error, thus momentarily stabilizing it. The INS error equations for the east and north channels (Eqs. 10.2.3 through 10.2.6) are rewritten here with the inclusion of the lateral acceleration components. The acceleration components A_x and A_y tie in to the azimuth error in the horizontal acceleration error equations (the additional terms are indicated with a double underscore):

    East channel:  [Eqs. (10.2.3) and (10.2.4), rewritten with the added acceleration-azimuth coupling term]
    North channel: [Eqs. (10.2.5) and (10.2.6), rewritten with the added acceleration-azimuth coupling term]

Using the parameters for Example 10.2 in the integrated INS/DME navigation system, perform a covariance analysis to determine the time profile of the variance of the azimuth error for the following dynamic scenario. The nominal y-axis acceleration and velocity profiles are as shown in the accompanying figure. The y(t) profile for linearization of the matrix may be approximated as constant velocity (100 m/sec) for the first 95 sec; then a reduced constant velocity of 80 m/sec during the 10-sec deceleration period; and, finally, a constant velocity of 60 m/sec for the remaining 95 sec of the profile. (Note that we are not assuming this to be the actual flight path. This is simply the approximate reference trajectory to be used for the linearization.) The parameter values given in Example 10.2 are to be used here, except that the gyro white noise power spectral density is to be increased to 2.35 x 10^... (rad/sec)^2/(rad/sec) so as to simulate a less stable INS than the one considered in Example 10.2. The earth rate and the gravity constant (9.8 m/sec^2) may be assumed to be the same as in Example 10.2.

10.3 Instrument errors are often found to be simple quasi-biases that wander over time. These can simply be modeled with single-state Gauss-Markov processes, as was pointed out in Section 10.2. Some instrument errors, however, are related in a direct way to the magnitude of the measured variable, the most common type being known as scale factor error. We shall look at the nature of scale factor error in combination with a bias error in a simple problem that involves barometrically derived altitude. Suppose that the relationship between the internally sensed barometric reading and the reported altitude is given by the following equation:

    h' = \gamma' b'

where
    h' = reported altitude
    b' = barometric reading
    \gamma' = barometric altitude scale factor

Consider that the barometric reading b' is made up of the correct value b plus a bias error b_e: b' = b + b_e. Consider also that the scale factor \gamma' is made up of the correct value \gamma plus an error \gamma_e: \gamma' = \gamma + \gamma_e.

(a) Show that we can use the 2-state measurement model shown below to account for the bias and scale factor errors (neglect second-order effects):

    z_k = \begin{bmatrix} 1 & H' \end{bmatrix} \begin{bmatrix} b_e \\ \gamma_e \end{bmatrix}_k + v_k

(b) Suppose that the two error states are modeled as random constants:

    \begin{bmatrix} b_e \\ \gamma_e \end{bmatrix}_{k+1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} b_e \\ \gamma_e \end{bmatrix}_k

Let H' = 50. Do the variances of b_e and \gamma_e go to zero in the limit? Under what condition will the variances of b_e and \gamma_e go to zero in the limit?

(c) Suppose that the two error states are modeled individually as single-state Gauss-Markov processes:

    \begin{bmatrix} b_e \\ \gamma_e \end{bmatrix}_{k+1} = \begin{bmatrix} e^{-\beta_1 \Delta t} & 0 \\ 0 & e^{-\beta_2 \Delta t} \end{bmatrix} \begin{bmatrix} b_e \\ \gamma_e \end{bmatrix}_k + \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}_k

where w_1 and w_2 are the usual independent white driving sequences for the two Gauss-Markov processes. Let H' = 50. Do the variances of b_e and \gamma_e go to zero in the limit? Why?

10.4 Suppose that the integrated INS/DME situation given in Example 10.2 involves a locomotive instead of an aircraft. The locomotive is constrained to railroad tracks aligned in the north-south direction. In place of an INS, the integrated navigation system uses wheel tachometer data to provide the reference trajectory in the complementary filter arrangement. A tachometer monitors wheel revolutions to determine the relative distance traveled. In words,

    Relative distance = number of wheel revolutions x wheel circumference

The model for the position error in the relative distance takes on the same form as that of a scale factor error (see Problem 10.3). The locomotive kinematics are described by the following equations:

    x_k = 0
    y_{k+1} = y_k + 10\,\Delta t + w_k, \qquad k = 0, 1, 2, \ldots, 2000

where w_k is an independent Gaussian sequence described by w_k ~ N(0, 1 m^2). The sampling interval Delta t is 1 sec.

(a) Formulate a process model that includes bias and scale factor error states, using random constant models for each. Also formulate a linearized measurement model using the scenario setup from Problem 9.4; use only DME station No. 2.
For simplicity, a linearized Kalman filter should be used, not an extended Kalman filter. Let the initial estimation error variances for the bias and scale factor states be (100 m)^2 and (0.02 per unit)^2, respectively.
(b) Run a covariance analysis using the filter parameters worked out in (a) for k = 0, 1, 2, ..., 2000, and plot the rms estimation errors for each of the error states. Also plot the rms position error.
(c) Make another run of (b) except that, between k = 1000 and k = ..., DME measurement updates are not available.

10.5 In the Schuler damping example of Section 10.3, the differential equation describing the dynamic process is given by Eq. (10.3.2). The state transition matrix computed for the discrete-time process difference equation in Eq. (10.3.3) is simply approximated by phi = I + F Delta t. This same first-order approximation for phi can also be used in the integral expression for Q_k given by Eq. (5.3.6). When F is constant and phi is first order in the step size, it is feasible to evaluate the integral analytically and obtain an expression for Q_k in closed form. (Each of the terms in the resulting Q_k is a function of Delta t.)
(a) Work out the closed-form expression for Q_k using a first-order approximation for phi in Eq. (5.3.6). Call this Q1.
(b) Next, evaluate Q_k with MATLAB using the numerical method described in Section 5.3 (Eqs. 5.3.23-5.3.26). Do this for Delta t = 5 sec, 50 sec, and 500 sec. These will be referred to as Q2 (different, of course, for each Delta t).
(c) Compare the respective diagonal terms of Q1 with those of Q2 for Delta t = 5, 50, and 500 sec.
This exercise is intended to demonstrate that one should be wary of using first-order approximations in the step size when it is an appreciable fraction of the natural period or time constant of the system.

11
THE GLOBAL POSITIONING SYSTEM: A CASE STUDY

The Global Positioning System (GPS) is a satellite-based system that has demonstrated the provision of unprecedented levels of positioning accuracy, leading to its extensive use in both military and civil arenas (1, 2). It became fully operational in 1994 and provides worldwide coverage that benefits all nations of the world. At its conception, GPS was, by far, the most ambitious navigation project ever undertaken by the United States, or by any nation for that matter. Now, in a mature state, the applications it has spawned go beyond the usual positioning of aircraft and ships. Other applications include precise surveying, accurate land vehicle tracking, near-earth space navigation, and precise time dissemination on a worldwide basis. The central problem for the GPS receiver is the precise estimation of position, velocity, and time based on noisy observations of the satellite signals. It should come as no surprise then that this is an ideal setting for Kalman filtering. In fact, Kalman filtering has become a household word in the GPS business.
Our discussion of the subject here is intended to be tutorial and must be brief. Thus, we will confine our attention to receiver applications only, and we will leave all of the other interesting facets of Kalman filtering applied to GPS as extracurricular reading.

11.1 DESCRIPTION OF GPS

GPS is a satellite-based navigation system that allows a user with the proper equipment access to useful and accurate positioning and timing information anywhere on the globe. Position and time determination is accomplished by the reception of GPS signals to obtain ranging information as well as messages transmitted by the satellites. The system of satellites that makes up the space segment of GPS consists of 24 satellites in six 12-hour orbits. This ensures a user located anywhere on the globe a visibility of four satellites or more at any time. From an observer's viewpoint, the satellite geometry in the visible sky is always changing because the satellites are not in geosynchronous orbits. The maintenance of updated information embedded in the transmitted message is performed by ground monitoring stations collectively known as the control segment. The control segment periodically updates the information that is disseminated by all the satellites. This includes satellite ephemerides and health status, as well as a constellation almanac.

GPS signals are transmitted on two coherent carrier frequencies, L1 (1575.42 MHz) and L2 (1227.60 MHz), which are modulated by various spread-spectrum signals. The major carrier, L1, is biphase-modulated by two types of pseudorandom noise codes (see Section 2.14 of Chapter 2): one at 1.023 MHz called the C/A-code, and the other at 10.23 MHz called the Y-code. The Y-code is intended only for authorized access because its one-chip wavelength of 30 m provides the most accurate positioning possible. The C/A-code, with its 300-m one-chip wavelength, is used in all cases for initial acquisition and code-signal alignment purposes. All users have access to this less accurate C/A-code for positioning. The second carrier signal L2 contains only Y-code modulation, and is intended to give authorized users the additional capability of actually measuring the ionospheric delays using the two frequencies, the delays being frequency-dependent. In official parlance, Y-code access is reserved for what is called the Precise Positioning Service (PPS) mode of operation, whereas everything else is classified as the Standard Positioning Service (SPS).

A 50 bits/sec navigation message is also combined with the pseudorandom noise codes. This navigation message, 1500 bits in length and repeated every 30 sec, carries many kinds of information with varying degrees of functional and operational importance to the user. Foremost in significance is the satellite ephemerides, a collection of data making up 60 percent of the message that, when decoded, uniquely describes the position and trajectory of the satellite that transmitted it. The remaining 40 percent of space allotted to the message is common to all satellites and carries general almanac information. This information runs the gamut from providing approximate satellite positions for visibility checks and signal acquisition purposes to satellite health status and current operational modes.

Position Determination

An observer equipped to receive and decode GPS signals must then solve the problem of position determination. In free space, there are three dimensions of position that need to be solved.
Also, an autonomous user is not expected to be precisely synchronized to the satellite system time initially. In all, the standard GPS positioning problem poses four variables that can be solved from the following system of equations, representing measurements from four different satellites:

    \rho_i = \sqrt{(X_i - x)^2 + (Y_i - y)^2 + (Z_i - z)^2} + c\,\Delta t, \qquad i = 1, 2, 3, 4        (11.1.1)

where
    \rho_i = noiseless pseudorange
    [X_i, Y_i, Z_i]^T = Cartesian position coordinates of satellite i
    [x, y, z]^T = Cartesian position coordinates of observer
    \Delta t = receiver clock offset from the satellite system time
    c = speed of light

The observer position [x, y, z]^T is "slaved" to the coordinate frame of reference used by the satellite system. In the case of GPS, this reference is a geodetic datum called WGS-84 (for World Geodetic System of 1984) that is earth-centered, earth-fixed (3). The datum also defines the ellipsoid that crudely approximates the surface of the earth (see Fig. 11.1). Although the satellite positions are reported in WGS-84 coordinates, it is sometimes useful to deal with a locally level frame of reference, where the x'-y' plane is tangential to the surface of the earth ellipsoid. As depicted in Fig. 11.1, we shall define such a locally level reference frame by having the x'-axis pointing east, the y'-axis north, and the z'-axis normal to the plane and, equivalently, to the surface of the earth at the observer's location. It suffices here to say that the coordinate transformations to convert between the WGS-84 coordinates and any other derived reference frame, including the locally level one given here, are usually quite straightforward.

Figure 11.1 The WGS-84 coordinate reference frame used by GPS and a derived locally level reference frame.

Measurement Linearization

The measurement situation for GPS is clearly nonlinear from Eq. (11.1.1). Linearization of a measurement of this form has already been covered in Section 9.1 and will not be reiterated here. We will simply evaluate the partial derivatives necessary to obtain the linearized equations about an approximate observer location x_0 = [x_0, y_0, z_0]^T. This nominal point of linearization x_0 is sometimes based on an estimate of the true observer location x although, in general, its choice may be arbitrary.

    \frac{\partial h_i}{\partial x} = \frac{-(X_i - x_0)}{\sqrt{(X_i - x_0)^2 + (Y_i - y_0)^2 + (Z_i - z_0)^2}}
    \frac{\partial h_i}{\partial y} = \frac{-(Y_i - y_0)}{\sqrt{(X_i - x_0)^2 + (Y_i - y_0)^2 + (Z_i - z_0)^2}}        (11.1.2)
    \frac{\partial h_i}{\partial z} = \frac{-(Z_i - z_0)}{\sqrt{(X_i - x_0)^2 + (Y_i - y_0)^2 + (Z_i - z_0)^2}}
    for i = 1, ..., 4

From a geometrical perspective, the partial derivative vector for each satellite i,

    \begin{bmatrix} \dfrac{\partial h_i}{\partial x} & \dfrac{\partial h_i}{\partial y} & \dfrac{\partial h_i}{\partial z} \end{bmatrix}

as given in Eq. (11.1.2), is actually the unit direction vector pointing from the satellite to the observer, the direction being specified by the negative sign in the equation. In classical navigation geometry, the components of this unit vector are often called direction cosines. The resulting measurement vector equation with pseudorange as the observable is then given by (without noise)

    \begin{bmatrix} \rho_1 - h_1(\mathbf{x}_0) \\ \rho_2 - h_2(\mathbf{x}_0) \\ \rho_3 - h_3(\mathbf{x}_0) \\ \rho_4 - h_4(\mathbf{x}_0) \end{bmatrix} =
    \begin{bmatrix}
    \partial h_1/\partial x & \partial h_1/\partial y & \partial h_1/\partial z & 1 \\
    \partial h_2/\partial x & \partial h_2/\partial y & \partial h_2/\partial z & 1 \\
    \partial h_3/\partial x & \partial h_3/\partial y & \partial h_3/\partial z & 1 \\
    \partial h_4/\partial x & \partial h_4/\partial y & \partial h_4/\partial z & 1
    \end{bmatrix}
    \begin{bmatrix} \Delta x \\ \Delta y \\ \Delta z \\ c\,\Delta t \end{bmatrix}        (11.1.3)

where
    \rho_i = noiseless pseudorange
    \mathbf{x}_0 = nominal point of linearization based on [x_0, y_0, z_0]^T and the predicted receiver time
    h_i(\mathbf{x}_0) = predicted pseudorange based on \mathbf{x}_0
    [\Delta x, \Delta y, \Delta z]^T = difference vector between the true location x and x_0
    c\,\Delta t = range equivalent of the receiver timing error

11.2 THE OBSERVABLES

Useful information can be derived from measurements made on the pseudorandom code and the carrier signal. The block diagram of a generic signal-tracking scheme for a GPS receiver is shown in Fig. 11.2.
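Before looking at the individual observables, a small numerical illustration of the linearized model of Eqs. (11.1.2) and (11.1.3) may be helpful. The MATLAB fragment below is only a sketch: the satellite coordinates and pseudoranges are made-up numbers chosen for illustration, not real GPS data. It forms the direction-cosine rows of the geometry matrix about a nominal point and iterates the linearized solution for the observer position and clock offset.

    % Sketch only: linearized pseudorange solution with assumed (fictitious) data.
    satPos = 1.0e7 * [ 1.5  0.8  2.0;          % [X Y Z] of four satellites (m), assumed
                      -0.9  1.7  1.9;
                       2.1 -0.5  1.4;
                       0.3  2.2  1.2 ];
    z      = [2.10e7; 2.23e7; 2.05e7; 2.31e7]; % measured pseudoranges (m), assumed

    x0   = [0; 0; 6.37e6];                     % nominal observer position (m)
    cdt0 = 0;                                  % nominal range-equivalent clock error (m)

    for iter = 1:5                             % iterate the linearized solution
        dvec = satPos - repmat(x0', 4, 1);     % vectors from observer to each satellite
        rho0 = sqrt(sum(dvec.^2, 2));          % predicted ranges h(x0)
        H    = [-dvec ./ repmat(rho0, 1, 3), ones(4,1)];  % direction cosines, clock column
        dz   = z - (rho0 + cdt0);              % measurement residuals
        dx   = H \ dz;                         % correction [dx; dy; dz; c*dt]
        x0   = x0 + dx(1:3);
        cdt0 = cdt0 + dx(4);
    end
    disp([x0; cdt0])                           % converged position and clock estimate

With exactly four satellites the correction is an exact solve of the linearized system each pass; with more satellites the same backslash operation returns the least-squares correction.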
In the tracking scheme of Fig. 11.2, there are separate tracking loops for the code and the carrier, with the latter cross-feeding to dynamically aid the code tracking (4). The loop filters are generally simple low-pass filters, the bandwidths of which determine the noisiness of the measurement data that are synthesized.

Figure 11.2 Generic signal-tracking scheme for a GPS receiver.

The observable known as pseudorange is a timing measurement of the propagation delay that is due to the geometric range from the transmitting satellite to the receiver and also the receiver clock offset from satellite time; hence the term pseudorange and not just range. At the point of reception, the measurement is made in the receiver by determining the amount of shift in the pseudorandom code position since the time of transmission. By using some coherent form of signal tracking, the binary pseudorandom code can be monitored by aligning the received signal with a replica of the known code generated by the receiver. Hence, the precision of the crosscorrelation process in determining the pseudorandom code position establishes the accuracy of the pseudorange measurement, which by state-of-the-art standards is considered to be about 1 m under nominal signal reception strengths (4). In the context of Kalman filtering, these numbers represent the standard deviation of the measurement noise white sequence. The pseudorange measurement can be represented by the following equation:

    \rho = r + \beta_\rho + v_\rho        (11.2.1)

where
    r = noiseless pseudorange consisting of geometric range and range error due to receiver timing error
    \beta_\rho = time-correlated errors associated with the pseudorange
    v_\rho = pseudorange measurement noise

The term \beta_\rho represents other significant error sources, which may range from estimation errors in the reported satellite position and signal delays due to ionospheric and tropospheric refraction, to the intentional errors invoked under an accuracy-degradation scheme called selective availability. Sometimes collectively called unmodeled biases, these errors are generally difficult to estimate due to their poor observability characteristics, and discussion of their modeling is deferred to Section 11.3. However, because they are dependent on the spatial relationship between the observer and the satellite, two observers in proximity to one another and sighting the same satellites will encounter these errors with a high degree of correlation between them. To take advantage of this, if one of these observers is at a known reference location, these so-called unmodeled biases can be measured as a lumped error and the information, when shared, can be used by the other observer to correct for the error. This mode of operation is known as differential positioning, and such corrections are very effective in enhancing position accuracy for all participants of such a positioning network (see Section 11.7). Today, differential GPS plays an important role in the development of high-accuracy positioning systems (1, 5, 6).

In addition to code tracking, most GPS navigation receivers possess the capability of tracking the carrier signal as well. The wavelength of the L1 carrier is little more than 19 cm, thus allowing very precise measurements of the phase of the carrier to be made. This type of measurement is made after the code modulation has been stripped off.
The amount of noise in the carrier phase data depends largely on the parameters of the tracking loops used by the receiver. It may be less than 1 percent of the wavelength for a stationary receiver but, to maintain carrier tracking in high dynamics, the noise in the data for most navigation-type receivers may be as high as 2 percent (4, 7).

In most conventional GPS receivers, the rate of change in the carrier phase over a brief interval is used to represent the measured Doppler frequency. As an approximation to the true Doppler frequency, this observable, called delta range, is generally used to provide accurate velocity information. Although the Doppler frequency due to satellite motion as seen by a stationary observer is constantly changing, it is nevertheless accurately predictable. Hence, when the measured Doppler is compared against the predicted value, the difference reflects the velocity of the observer as well as the frequency error of the receiver clock.

The use of the delta range measurement to approximate the Doppler, of course, makes the assumption that the velocity is constant throughout the integration interval used to form the delta range. This assumption holds for the most part even when the velocity is not constant, provided that the integration interval remains short. Most receivers use an integration interval that is a fraction of a second. The accuracy of the measurement, in terms of rms noise under nominal signal reception, is about 0.008 m/sec (4).

More recently, applications have surfaced whereby the Doppler data are exploited for information related to relative position rather than velocity (1, 5). The difference between the integrated Doppler (sometimes also called continuous carrier phase) and the delta range is graphically compared in Fig. 11.3. The applications that utilize integrated Doppler measurements (discussed later in Sections 11.6 and 11.7) involve long integration intervals that demand uninterrupted tracking of the carrier. Under real operating conditions, such demands are seldom easily satisfied, but the great benefits that can be reaped have motivated efforts to reconcile the carrier-phase data with the measured situation. The carrier-phase measurement can be represented by the equation:

    \Phi = r + N + \beta_\Phi + v_\Phi        (11.2.2)

where
    r = noiseless pseudorange consisting of geometric range and range error due to receiver timing error
    N = range uncertainty, sometimes called the integer cycle ambiguity (see Section 11.9)
    \beta_\Phi = time-correlated errors associated with the carrier phase
    v_\Phi = carrier-phase measurement noise

Figure 11.3 Measurements obtained from carrier-phase data.

When pseudorange and delta range (or integrated Doppler) data are used in a combined setting, it may be presumed that the measurement noises for both types of data can be regarded as independent of each other. In the receiver, different processes are involved in obtaining the two types of measurements (see Fig. 11.2). The cross-feed from the carrier to the code-tracking loops can largely be ignored because the carrier data are virtually noiseless when compared to the pseudorange measurements made in the code loop.

11.3 GPS ERROR MODELS

The accuracy of a GPS position solution is dictated by errors in the observables described. The error components are listed in Table 11.1 with their approximate statistical characteristics.

Selective Availability

The intentional timing distortion applied to the GPS signal for civil users to reduce its ranging accuracy is known as selective availability, or SA for short. This distortion appears as a random process to SPS receivers that do not have full access to removing it.
Many studies have been made on the statistical time-correlation structure of this random process, which can be adequately approximated by a second-order Gauss-Markov process (13). Figure 11.4 shows SA variations over a 100-min interval for four satellites. The process for the different satellites appears to be uncorrelated.

Examples of the modeling of SA were given earlier in Chapters 5 and 6 (see Problem 5.4 and Example 6.1). If the second-order Gauss-Markov model requires two states for each satellite, then for a minimal navigation function, at least eight states are needed just to account for SA. More than 20 states are needed if all visible satellites are to be included!

Table 11.1 Correlated Errors Affecting Pseudorange: Approximate Statistical Parameters for a Stationary Observer (10, 11, 12)

    Error Component                       Standard Deviation   Time Constant   Factors Causing Unpredictability
    Satellite broadcast parameters        3.30 m               ...             GPS Control Segment
    Selective availability                0/30 m*              ~2 min          GPS Control Segment
    Iono refraction (with correction)     1.5 m                ...             Sunspot cycle; scintillation activity
    Tropo refraction (with correction)    2.2 m                ...             Local atmospheric conditions
    Code multipath                        1.5 m                5 to 10 min     Local scattering conditions

    * The two values given are for the PPS and SPS operational modes, respectively.

Figure 11.4 Examples of SA errors for four different satellites over a 100-min interval.

Satellite and Atmospheric Errors

During receiver operation, several error components are minimized using parameters that are broadcast in the 50 bits per sec (bps) navigation message. These error components include satellite position, satellite clock, and ionospheric refraction errors. The compensating parameters, being established from estimates made by the ground-tracking network of the Control Segment, are imperfect and contain errors at levels that are approximately represented in Table 11.1. Tropospheric refraction error, being dependent on local atmospheric conditions, is not compensated by any satellite broadcast parameter. Rather, the tropospheric errors are compensated by user-defined models that typically depend on inputs of altitude and satellite elevation angle and, for more complex models, temperature and humidity.

The time-correlation characteristics of all of the aforementioned components are difficult to determine with precision but, in general, they change rather slowly. To choose an appropriate random process to model these errors, the first key characteristic to note is that these errors are all bounded over time. Hence, we shall consider the Gauss-Markov process to be a suitable candidate. Certainly, the second-order Gauss-Markov process used for the SA dither described previously can also be used here, after modification to reduce the steady-state variance and lengthen the correlation time constant. However, because of the smaller steady-state variance and the longer correlation time, the second-order rate-type state is practically negligible. Therefore, a first-order Gauss-Markov process for each component should suffice.
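To make the last remark concrete, a first-order Gauss-Markov error state is easy to carry in a discrete-time filter. The MATLAB fragment below is only a sketch with assumed numbers: the 1.5-m steady-state value is taken loosely from the ionospheric entry of Table 11.1, while the 30-min correlation time is simply an assumed figure for illustration, not one from the table.

    % Sketch only: discrete-time first-order Gauss-Markov model for a slowly
    % varying, bounded range error. All parameter values are assumptions.
    sigma = 1.5;            % assumed steady-state rms error (m)
    tau   = 30*60;          % assumed correlation time constant (sec)
    dt    = 1.0;            % filter update interval (sec)

    phi = exp(-dt/tau);                     % scalar state transition
    q   = sigma^2 * (1 - exp(-2*dt/tau));   % process noise variance that preserves sigma^2

    p = 0;                                  % start with no error at turn-on
    for k = 1:5000
        p = phi^2 * p + q;                  % rises toward sigma^2 and stays bounded there
    end
    disp(sqrt(p))                           % approaches sigma (about 1.5 m)

The propagated variance of such a state remains bounded at its prescribed steady-state value, which is precisely the behavior desired for these slowly varying error components.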