Command to Line-of-Sight Guidance: A Stochastic Optimal Control Problem
Downloaded by Universitaets- und Landesbibliothek Dusseldorf on December 15, 2013 | http://arc.aiaa.org | DOI: 10.2514/3.57220
A command to line-of-sight (CLOS) guidance design approach using modern stochastic optimal control
theory is discussed. CLOS guidance requires a wide guidance bandwidth in order to follow a threat maneuver.
Yet the LOS noise (beam jitter) inherent in any LOS tracking scheme must be attenuated in order to prevent
excessive control surface saturation. The stochastic describing function (CADET) is used to model the
aerodynamic control surface saturation nonlinearity allowing the "linear" stochastic optimal control theory to
be applied. Results from a sample airframe indicate near-optimal performance using a realizable nonlinear
guidance compensation against a randomly maneuvering threat.
seems an excellent candidate for the stochastic optimal control methods. References 1 and 2 give a comparison of several such intercept policies for missiles with line-of-sight rate measurements.

Two major drawbacks to the practical application of such guidance policies to future missiles have been observed. First, these optimal strategies may be highly "tuned" to the system parameters assumed in the optimal control design. Sensitivity to off-nominal parameter variations and modeling errors may be unacceptable. Second, optimal intercept strategies usually require knowledge of time-to-go. Time-to-go implies knowledge of range-to-go, which in most cases implies radar ranging. An effective "passive" guidance policy that does not require threat range is more desirable.

Because of the nature of the ship/threat line-of-sight measurement, the command to line-of-sight (CLOS) guidance problem can be formulated as an optimal stochastic regulator problem. This allows the determination of control gains based only on time-into-flight. In addition, the known well-behaved closed-loop response of the optimal regulator ensures a degree of parameter insensitivity.

An aerodynamic control surface limit or acceleration command limit must be included in the model for any highly maneuverable interceptor. Knowledge that a well-designed guidance policy likely will accelerate "to the limit" to reduce miss distance makes the application of the linear stochastic optimization theory questionable. Reference 2 derives the nonlinear optimal control law for the case where the control is restricted only in its upper bound for the sampled-data control problem. The control law for this case can be summarized as follows. First, solve for the constant control level required over the next sample period to null the predicted miss at the known final time. Prediction is accomplished via the Kalman filter and predictor. Then clip this computed acceleration level at the limit value.

Such a control policy produces considerable control fluctuation when compared with more traditional guidance laws and high control gains throughout the intercept. As indicated in Ref. 1, considerable parameter sensitivity results. Nevertheless, this approach may be useful for comparative purposes.

The approach for dealing with the nonlinearity presented herein allows the application of both the hard limit and the soft limit imposed by including a weighted integral of the control within the cost function. This method can be used to penalize the control "chattering," which can result when no integral control penalty is imposed.

The covariance analysis describing function technique (CADET) described in Refs. 3 and 4 is used to "linearize" the saturation nonlinearity statistically so that the linear optimal control theory can be applied.

The purpose of this paper is twofold. First, we wish to present an efficient design tool for a realistic missile-guidance problem. Second, we wish to demonstrate how optimal stochastic control theory in conjunction with the stochastic describing function can be used to generate a nonlinear feedback control law.

Modeling the Intercept Problem

The command-to-LOS guidance loop can be approximated as shown in Fig. 2 using nomenclature defined in Fig. 1. The threat and interceptor displacements off the line-of-sight, Y_T and Y_I, are divided by the threat and interceptor ranges (R_T and R_I), respectively, to obtain the inertial LOS angles θ_T and θ_I. The measured pointing error Δθ_m is the input to the guidance compensator from which interceptor commands a_c are generated. The measured pointing error is corrupted by noise η_LOS, which for this study will be assumed wide in bandwidth with respect to the system characteristic frequencies.
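The geometry just described can be sketched numerically. The following is a minimal illustration, not the authors' implementation. It assumes the small-angle relation θ = Y/R stated above, takes the pointing error as θ_T − θ_I (the sign convention of Fig. 2 is not recoverable here, so this is an assumption), and models η_LOS as an independent zero-mean Gaussian sample. The function name and the sample values are hypothetical.

```python
import random


def measured_pointing_error(y_t, r_t, y_i, r_i, sigma_los=1.0e-4, rng=None):
    """Return a noisy LOS pointing error in radians (illustrative sketch).

    y_t, y_i  : threat / interceptor displacement off the LOS (same length unit)
    r_t, r_i  : threat / interceptor range (same length unit, must be > 0)
    sigma_los : rms LOS noise in rad; 0 disables the noise term
    rng       : optional random.Random instance for repeatable draws
    """
    if r_t <= 0.0 or r_i <= 0.0:
        raise ValueError("ranges must be positive")
    theta_t = y_t / r_t  # small-angle inertial LOS angle of the threat
    theta_i = y_i / r_i  # small-angle inertial LOS angle of the interceptor
    noise = 0.0
    if sigma_los > 0.0:
        rng = rng or random.Random()
        noise = rng.gauss(0.0, sigma_los)  # assumed Gaussian LOS noise sample
    return theta_t - theta_i + noise  # assumed sign convention


if __name__ == "__main__":
    # Hypothetical numbers: both vehicles subtend the same LOS angle.
    print(measured_pointing_error(10.0, 5000.0, 2.0, 1000.0, sigma_los=0.0))
```

With equal ranges the noise-free error reduces to (Y_T − Y_I)/R, so the angular error is then directly proportional to the off-LOS separation.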
Two types of measurement schemes for generating Δθ_m are used. For the "beamrider" guidance version, the threat is designated with a beam that is followed by the interceptor. The beam must be coded so as to provide the interceptor with off-LOS errors in two planes. In this case, the guidance compensator would be onboard the interceptor. Only a single tracker is required in this case, although the beam coding may be quite sophisticated. The second method tracks both the interceptor and threat and uplinks the appropriate guidance commands to the interceptor. The actual tracking in either case may be with infrared, rf, video, laser, etc., or various combinations. Both types of guidance schemes can be represented by Fig. 2. The noise magnitude will reflect the pointing accuracy for the particular system in use. A noise level of 0.0001 rad (0.1 mrad) will be assumed throughout this study.

An additional guidance input available in some systems is the threat range. Typically, range is measured by interpreting signals reflected from the threat. Interference signals can be generated by the threat and directed to the ship, making range measurements relatively vulnerable to electronic countermeasures. Note that threat directionality is more difficult to disguise. It will be assumed in the following discussion that threat range is unavailable for interceptor guidance purposes.

Fig. 3 Single-plane rigid-body damped airframe.

Fig. 4 Interceptor airframe characteristics. (Axis: time into interceptor flight, sec; curves include natural frequency.)

The single-plane, rigid-body interceptor airframe for this discussion will be modeled as in Fig. 3. The coefficients a, b, c, d, and e are the aerodynamic stability derivatives, V_I is the interceptor velocity, and K_c is the pitch-rate feedback gain. The aerodynamic control surface deflection δ must be limited to prevent control surface stall. The short-range point defense interceptor studied here attains maximum velocity and lateral acceleration capability about 5 sec after launch at booster burnout. Lateral acceleration capability is reduced to about one-third of maximum at about 10 sec. Considerable airframe coefficient variation occurs during this time. The normalized characteristics of this airframe with K_c fixed are shown on
440 J. E. KAIN AND D. J. YOST J. SPACECRAFT
Fig. 4. These characteristics include the natural frequency ω_N, damping ζ, and maximum steady-state acceleration a_max.

The threat will be assumed to maneuver about the line-of-sight in a random fashion defined by the rms threat acceleration and the threat maneuver bandwidth. A stochastic representation of this threat will be generated by passing a white noise (y_T) through a third-order Butterworth filter (Fig. 5).

The threat displacement off the line-of-sight (Y_T) from this model is a stationary process, as is the threat acceleration (Ÿ_T). The rms threat acceleration level can be used to specify the strength (spectral density) of the white noise input. The rms threat acceleration will be fixed at one-third the peak interceptor acceleration, whereas the threat maneuver bandwidth will be fixed at 1 rad/sec. A typical threat off-LOS trajectory generated by this process is shown on Fig. 6.

Guidance Compensation Design

The line-of-sight guidance loop of Fig. 2 typically is compensated using classical control theory. A multiplication by interceptor range (R_I) is assumed within the compensator (cancelling the 1/R_I in the feedback), the LOS noise is neglected, and a representative flight condition is selected. Also, the effects of control surface deflection limiting are not considered. The guidance loop becomes a linear time-invariant system and can be designed with the classical methods such as Bode analysis and root locus. A wide guidance loop bandwidth with acceptable phase and gain margin is the design objective.

In many cases, the LOS noise or beam jitter will dominate the interceptor performance. In addition to contributing directly to miss distance, noise (coupled with a wide guidance loop bandwidth) can cause performance deterioration because of saturation of the control surfaces. When control saturation is encountered, the closed-loop stability characteristics of the system will change from the linear theory predictions.

A guidance loop design procedure that directly addresses the problems of beam jitter and control surface limiting is required. Modern stochastic optimal control theory allows noise to be included in the design procedure, whereas the stochastic describing function (CADET) of Ref. 3 has been used successfully to model the saturation nonlinearity. By using the describing function with the linear optimal control theory, a useful design tool for the nonlinear command to line-of-sight guidance loop can be constructed.

Cost Function

A critical aspect of any optimal control problem is the cost function. For the intercept problem, interceptor-threat range at some final time typically is minimized. In addition, the allowable control also must be penalized or limited to prevent unrealistic control motions. A cost function given by Eq. (1) frequently is used for intercept problems:

$$ J = (Y_T - Y_I)^2 \tag{1} $$

The miss distance is assumed to be the relative interceptor/threat separation off the line-of-sight (Y_T − Y_I) at the intercept time (t_f). The intercept time is assumed to be the time when interceptor and threat separation along the line-of-sight is zero, i.e., the time when R_I = R_T. The minimization of this cost function yields a tractable optimal control problem, although the resulting gains will be a function not only of time into flight (t) but also time-to-go (t_f − t). Knowledge of time-to-go in this case implies knowledge of the threat range, which, as indicated earlier, is not a reliable measurement for guidance purposes.

We shall redefine the cost so as to obtain a "time-invariant" regulator solution based on a particular flight condition. The regulator solution will be obtained at successive discrete times along a nominal interceptor trajectory, yielding a smooth curve of optimal gains vs only time into flight. We shall proceed as in the classical analysis by assuming that the guidance compensator contains a multiplication by interceptor range. Thus the guidance loop of Fig. 2 can be redrawn as in Fig. 7.

Note in Fig. 7 that the interceptor off-beam displacement (Y_I) is controlled by the threat off-beam displacement (Y_T) only at intercept (R_I = R_T). Prior to the intercept time, the interceptor displacement is commanded proportionately less than the threat displacement. The stochastic optimal regulator will be designed based on the conditions at intercept. The guidance loop near intercept will be as shown in Fig. 7, with R_I = R_T.

In effect, our guidance compensator gains always will be adjusted as though minimum distance off the LOS is desired at every point along the interceptor trajectory. Yet the input to the control loop will be attenuated by R_I/R_T, so that control effort is reduced when interceptor to threat range is large. This low level of control effort at large time-to-go is characteristic of the optimal control obtained by minimizing the cost of Eq. (1).

The stochastic regulator solution minimizes the cost function given by Eq. (2):

$$ J = \lim \ldots \tag{2} $$

If the response time of the resulting guidance loop is "fast" with respect to the variation of the airframe parameters, the resulting actual rms miss distance performance should be close to that predicted by the solution of the regulator problem. The selection of the control weighting μ requires some discussion. Through the use of this parameter, the optimization can be constrained in some manner. The weighting μ will, to some degree, control the bandwidth of the resulting guidance loop; i.e., a large μ will act to penalize control motions. If μ is set to zero, a bang-bang control will result (Ref. 5, p. 110), although the regulator solution becomes undefined and no convenient feedback control law can be computed. For the data presented here, a rather arbitrary value of μ was selected which produced a reasonable amount of control limiting as predicted via CADET. This value was held constant for all regulator solutions discussed here. A more realistic approach is to select μ based on a specified amount of control limiting. It is likely that a sufficient degree of limiting will be encountered such that additional limiting produces negligible performance improvement.

Other constraints might be imposed by the appropriate selection of μ. For some interceptors, the net control energy available over the trajectory is limited because of battery lifetime, gas storage, etc. Rapid control surface motions
between limits may be costly and can be controlled by μ. Another penalty paid by control motions is control-induced drag. The mean control-induced drag (D) can be approximated as D = a E(α²), where α is the angle of attack in radians and a is normal force per angle of attack. This expression can be computed from the regulator solution and can be constrained using the weighting μ.

Optimal Stochastic Control Solution

Given the state description, measurement description, and cost function, the optimal control of a linear system is straightforward and well documented. The statistical linearization of a nonlinear system is derived in Ref. 6 and related to the describing function in Ref. 3, where it is called CADET. This statistical linearization allows the propagation of the system covariance matrix under the assumption that the joint state probability density remains Gaussian. For a scalar input/scalar output nonlinearity imbedded in an otherwise linear system, CADET allows the replacement of the nonlinearity with a gain (describing function) computed from the mean and standard deviation of the nonlinearity input. Since the mean and covariance of the optimally controlled states are computed easily, the describing function linearization can be implemented within the stochastic optimal control solution. Further discussion of the statistical describing function use with the stochastic regulator can be found in Ref. 7.

Fig. 7 CLOS guidance loop with range multiplication.

Fig. 8 legend: σ_x = rms nonlinearity input; LIM = limit magnitude.

Consider the following stochastic optimal control problem:

State Equation
(3)

Measurement Equation
$$ y = Hx + \nu \tag{4} $$

Cost
(5)

The spectral density of the process noise is given by the matrix Q, whereas the spectral density of the measurement noise is given as R. We shall assume that the resulting control and state will be zero mean. The following statistical linearization will be considered (Refs. 1 and 6):

$$ \dot{x} = Fx + Gu + \eta \tag{6} $$

where

$$ F = \frac{\partial}{\partial x} E[f(x,u)] \tag{7} $$

$$ G = \frac{\partial}{\partial u} E[f(x,u)] \tag{8} $$

If x and u are assumed to be zero-mean, Gaussian random vectors, the expectations indicated in Eqs. (7) and (8) require only the covariance matrices of x and u.

The solutions of the stochastic regulator gains for the "linearized" problem defined by Eqs. (4-6) are given by the steady-state solution of the following Riccati equations:

$$ \dot{P} = FP + PF^T + Q - PH^T R^{-1} HP \tag{9} $$

(10)

$$ K = PH^T R^{-1} \quad \text{(optimal estimator gains)} \tag{11} $$

$$ C = B^{-1} G^T S \quad \text{(optimal control gains)} \tag{12} $$

The following equations will be used to approximate the optimal estimator-controller:

$$ \dot{\hat{x}} = f(\hat{x}, u) + K(y - H\hat{x}) \tag{13} $$

(14)

This approach is somewhat different from that used in Ref. 7. Reference 7 suggests the use of the linearized state equation with the estimator, Eq. (13), rather than the nonlinear form. It is common practice to utilize the exact nonlinear dynamics to propagate the state estimate and yet design the estimator gains based on the linearized dynamics, as with the extended Kalman filter.

The following linear matrix equation is used frequently to compute the performance of the optimally controlled system (Ref. 5):

$$ \dot{Z} = (F - GC)Z + Z(F - GC)^T + KRK^T \tag{15} $$

where

$$ Z = E[\hat{x}\hat{x}^T] \tag{16} $$

$$ P = E[(x - \hat{x})(x - \hat{x})^T] \tag{17} $$

$$ Z + P = E[xx^T] \tag{18} $$
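To make the procedure concrete, the following minimal sketch applies it to a scalar toy problem. It is not the seven-state interceptor model of this paper. Everything specific in it is an assumption made for illustration: the plant ẋ = f x + g·sat(u) + η with measurement y = h x + ν, the cost weights a (state) and b (control), all numerical values, and all function names. The limiter is replaced by its random-input describing-function gain, which for a zero-mean Gaussian input of rms σ and a unit-slope limiter with limit LIM is the standard result N = erf(LIM / (√2 σ)). The steady-state scalar forms of the estimator and regulator Riccati equations, and of the covariance equation for the state estimate, Eq. (15), with the standard driving term K R K^T, are then iterated until N is self-consistent.

```python
import math


def limiter_gain(sigma, limit):
    """Describing-function gain of a unit-slope limiter for a zero-mean
    Gaussian input with rms value sigma (standard result)."""
    if sigma <= 0.0:
        return 1.0  # no input spread: the limiter is never reached
    return math.erf(limit / (math.sqrt(2.0) * sigma))


def riccati_root(f, g, q, r):
    """Positive root p of the scalar steady-state Riccati equation
    0 = 2*f*p + q - (g*g/r)*p*p."""
    return (r / (g * g)) * (f + math.sqrt(f * f + q * g * g / r))


def design(f=-1.0, g=1.0, h=1.0, q=1.0, r=0.01, a=1.0, b=0.01,
           limit=1.0, iterations=200):
    """Iterate regulator design and limiter linearization (toy, scalar).

    All default values are hypothetical and chosen only for illustration.
    """
    # Estimator: the limiter acts on the control input only in this toy,
    # so the error variance p and gain k do not depend on the limiter gain.
    p = riccati_root(f, h, q, r)
    k = p * h / r
    n = 1.0  # start from the unsaturated (linear) design
    for _ in range(iterations):
        g_eff = g * n                      # statistically linearized input gain
        s = riccati_root(f, g_eff, a, b)   # regulator Riccati solution
        c = g_eff * s / b                  # control gain, u = -c * x_hat
        # Steady state of z' = 2*(f - g_eff*c)*z + k*r*k (scalar Eq. 15 form)
        z = k * k * r / (2.0 * (g_eff * c - f))
        sigma_u = c * math.sqrt(z)         # rms input to the limiter
        n = limiter_gain(sigma_u, limit)   # updated describing-function gain
    return {"K": k, "C": c, "P": p, "Z": z, "N": n, "sigma_u": sigma_u}


if __name__ == "__main__":
    print(design())
```

In this toy the limiter gain settles well below unity for the default weights, which corresponds to a design that operates with substantial limiting. Increasing the control weighting b reduces the rms command and moves N back toward unity.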
Fig. 10 Optimal estimator gains. (Axis: interceptor flight time, sec.)

$$ F' = \frac{\partial}{\partial \hat{x}} E[f(\hat{x}, u)] \tag{19} $$

$$ G' = \frac{\partial}{\partial u} E[f(\hat{x}, u)] \tag{20} $$

nonlinearity input is a combination of the control u and interceptor pitch rate state. The pitch rate estimate is not improved by the LOS measurement, and thus the variance of the pitch rate estimate is identical to the variance of the actual pitch rate. Therefore, the matrices F and G are identical to F' and G' for this problem. If this had not been the case, we would be required to utilize the Kalman filter sensitivity equations, rather than Eq. (15), to evaluate the performance.

The steady-state solutions of Eqs. (9) and (10) were determined by simultaneously integrating Eqs. (9), (10), and (15), using Z and P to compute the matrices F and G. For the airframe dynamics depicted by Fig. 3, the describing function gain is used to replace the saturation nonlinearity in determining the linear system matrix F. The input to the limiter is the sum of the acceleration command (a_c) and the pitch rate feedback term. The acceleration command represents the control (u) in the preceding equations. When the rms limiter input is small with respect to the limit, the gain is near unity. The describing function gain vs rms input for the limiter is shown on Fig. 8 and derived in Ref. 8.

Fig. 11 Closed-loop roots vs time. (Axis: real axis, rad/sec; markers indicate time, sec.)

Guidance Compensator Solutions

The optimal stochastic regulator approximation just discussed was computed for the interceptor airframe and measurements discussed earlier. The state equations are

(21)

(22)
JULY 1977 COMMAND TO LINE-OF-SIGHT GUIDANCE 443
(23)

The measurement equation is

(24)

and the cost is given by Eq. (2). These equations represent a seven-state, scalar control, scalar measurement nonlinear stochastic control problem.

The white measurement noise spectral density was selected to approximate a discrete uncorrelated process with sample time of 0.02 sec (Ref. 5, p. 342), i.e.,

$$ R = \sigma_{LOS}^2 R_I^2(t)\,(0.02) \tag{25} $$

The regulator problem was solved at 1-sec intervals along a nominal trajectory using a value of σ_LOS of 0.1 mrad. Gain histories from the regulator solutions are shown in Figs. 9 and 10. Figure 9 shows the seven optimal control gains, whereas Fig. 10 shows the estimator gains. All estimator gains are zero except those multiplying Y_T, Ẏ_T, and Ÿ_T. Recall that, for the optimal estimator solution, the control input is assumed effectively to be zero. Thus there are no stochastic inputs driving the interceptor states, resulting in zero steady-state gains for the interceptor state estimates. Solutions were obtained for a case where a correlated noise input was injected to the angle of attack (representative of a gust noise), although for realistic gust magnitudes the compensator and performance were unchanged.

If the nonlinearity is replaced by the describing function gain from the stochastic regulator design, a "linearized" system results, and concepts from linear system theory may be applied. Figure 11 shows the closed-loop system poles, whereas Fig. 12 shows the compensator poles and zeros. The pole travel is shown at 1-sec intervals into flight, beginning at 1 sec and ending at 10 sec. The resulting describing function gain history is shown on Fig. 13.

Fig. 12 Compensation pole-zero representation. (Legend: × compensation pole; ○ compensation zero; circled numbers indicate time, sec. Axis: real axis, rad/sec.)

Fig. 13 Describing function gains from stochastic regulator solution. (0.1 mil beam jitter. Axis: intercept time from launch, sec.)

For early flight times, the performance is dominated by the slow interceptor airframe response time, which results in a low guidance loop bandwidth. Late in flight, the system is dominated by the effects of LOS noise, and the resulting attenuation provided by the optimal system again produces a low guidance bandwidth. A peak bandwidth occurs at about 5 sec into flight and is about 8 rad/sec.

The guidance compensation indicated by Fig. 12 contains a double differentiation, which cancels the double integration from the plant. The resulting system is thus "type 0" and has nonzero error to a bias input. A bias error on the LOS measurement easily could be included in the model and likely would lead to another compensation philosophy. In addition, a nonstationary target model (containing free integrators), which tends to turn from the LOS, would lead to a system with free integrations in the forward loop.

A considerable degree of control saturation is indicated by the describing function gains shown on Fig. 13. The value for the describing function gain gives an indication of the probability of the control surface not being limited for intercepts at the indicated time into flight.

Optimal System Performance

The previous section gave performance predictions based on two assumptions:
1) The airframe parameter time variations are slow with respect to the guidance loop bandwidth, allowing a "quasi-time-invariant" system design via the optimal regulator solution.
2) The describing function approximation is a valid characterization of the saturation nonlinearity.

Both assumptions are subject to question, since the airframe parameters vary rapidly, and a high degree of control limiting is predicted. The performance of the overall guidance system and the validity of the underlying assumptions were investigated using Monte Carlo methods. The airframe was simulated using a thorough aerodynamic description, and the saturation nonlinearity was modeled exactly in both the airframe and the guidance compensation. Twenty-five intercepts were flown against random threats generated from a digital representation of the Butterworth stochastic threat. Figure 14 shows the Monte Carlo results, together with the regulator solution rms miss distance prediction. All missile and target parameters are identical to those used in the guidance design procedure. The correlation between the Monte Carlo and the regulator prediction from Fig. 14 is excellent.

Also on Fig. 14 we show the performance of a proportional navigation guidance law using the previously discussed airframe. Time-varying gains were implemented to realize a navigation ratio of 3. A noiseless seeker was assumed, although a 0.1-sec lag was inserted to represent guidance filtering and/or seeker lags. Figure 14 shows excellent per-