7 Linear Quadratic Control

7.1 The Problem
You have seen that the design of a controller can be broken down into the following two parts:
1. Designing a state feedback regulator $u = -Kx$; and
2. Building a state observer.
You can design controllers where the closed-loop poles are placed at any desired location. At this point, you might want to ask the following question: is there some $K$ that is better than others? This question leads you into the realm of optimal control theory, that is, designing controllers which optimize some desirable characteristic. The first, and also best known, problem that has been considered is the linear quadratic regulator problem. Its rigorous derivation is somewhat tricky and best left to a graduate course in the area. We can certainly do much better than the presentation in the book, however, so these notes should help you get started.
Consider the linear system
\[
\dot x(t) = A x(t) + B u(t), \qquad y(t) = C x(t),
\]
with initial condition $x(0) = x_0$. We will assume that the system is controllable and observable.
Our goal is to minimize a combination of the output and input values:
\[
J(u) = \int_0^\infty y^2(t) + \rho\, u^2(t)\, dt.
\]
Note that, since the integral is improper, the cost is not always well defined. We will assume that any control input we look for is such that $u(t) \to 0$ and $y(t) \to 0$ sufficiently fast so as to make the integral finite. The parameter $\rho > 0$ is used to weigh the two different goals of the integrand. Large $\rho$ penalizes large input signals (this serves as a means of preventing saturations, for example). Small $\rho$ makes the output smaller.
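To get a feel for this tradeoff, here is a minimal numerical sketch. It assumes a scalar plant $\dot x = a x + u$, $y = x$ (purely illustrative data) and uses scipy's solver for the Riccati equation that we will derive below (Eq. 7.4) to compute the optimal gain for several values of $\rho$:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Scalar plant xdot = a*x + u, y = x (illustrative choice).
a = 1.0
A, B, C = np.array([[a]]), np.array([[1.0]]), np.array([[1.0]])

for rho in [0.01, 1.0, 100.0]:
    # Solve A'P + PA - PB(1/rho)B'P + C'C = 0 for P >= 0.
    P = solve_continuous_are(A, B, C.T @ C, np.array([[rho]]))
    K = (B.T @ P) / rho              # optimal gain K = rho^{-1} B'P
    pole = a - K[0, 0]               # closed-loop pole of A - BK
    print(f"rho={rho:7.2f}  K={K[0,0]:8.3f}  pole={pole:8.3f}")
```

Small $\rho$ (cheap control) yields a large gain and a fast closed-loop pole; large $\rho$ (expensive control) yields a timid controller.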
At this point, we do not know much, but let us assume that the state-feedback input $u(t) = -Kx(t)$ is used. Note that there is no a priori reason to expect that the optimal input would be a state feedback.

If $u(t) = -Kx(t)$, then the integrand can be written as:
\[
y^2(t) + \rho\, u^2(t) = x^T(t)\,[C^T C + \rho K^T K]\, x(t).
\]
Also, since $x(t) = e^{(A-BK)t} x_0$,
\[
J(-Kx) = x_0^T \left[ \int_0^\infty e^{(A-BK)^T t}\, [C^T C + \rho K^T K]\, e^{(A-BK)t}\, dt \right] x_0 = x_0^T X x_0,
\]
where we have defined the integral to be the symmetric matrix $X$. It follows that the cost function is quadratic in the initial condition $x_0$.
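As a sanity check, $X$ can be computed numerically. The sketch below (with assumed illustrative data $A$, $B$, $C$, $K$) uses the fact, established next, that $X$ solves a Lyapunov equation, and compares it against a direct quadrature of the defining integral:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, expm

# Illustrative data (assumed): a 2-state plant and a stabilizing K.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[1.0, 1.0]])
rho = 1.0

AK = A - B @ K
M = C.T @ C + rho * K.T @ K

# X solves the Lyapunov equation AK' X + X AK = -M (Eq. 7.3 below).
X = solve_continuous_lyapunov(AK.T, -M)

# Cross-check against the defining integral by midpoint quadrature.
dt = 0.005
ts = np.arange(0.0, 25.0, dt)
X_num = sum(expm(AK.T * (t + dt / 2)) @ M @ expm(AK * (t + dt / 2)) * dt
            for t in ts)
print(np.allclose(X, X_num, atol=1e-3))   # True: J = x0' X x0
```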
Let us perform some matrix manipulations on $X$. These are slightly unmotivated at this point, but you will later see how they arise.
\[
(A-BK)^T X + X (A-BK) = \int_0^\infty \frac{d}{dt}\left[ e^{(A-BK)^T t}\, [C^T C + \rho K^T K]\, e^{(A-BK)t} \right] dt \tag{7.1}
\]
\[
= \lim_{t\to\infty} e^{(A-BK)^T t}\, [C^T C + \rho K^T K]\, e^{(A-BK)t} - [C^T C + \rho K^T K] = -[C^T C + \rho K^T K], \tag{7.2}
\]
where we first used the fundamental theorem of calculus in Eq. 7.1, and then the (assumption) that $e^{(A-BK)t} \to 0$ in Eq. 7.2.
It follows that
\[
(A-BK)^T X + X (A-BK) + C^T C + \rho K^T K = 0. \tag{7.3}
\]
This is an equation that relates the feedback matrix $K$ to the cost function $J(-Kx) = x_0^T X x_0$. No notion of optimality has been used so far. Note, however, that $X$ must be a positive semidefinite matrix, because the integrand defining it is always positive semidefinite.
At this point, we choose a particular pair $(K, X)$ related by this equation. Completing the square in $K$, Eq. 7.3 can be rewritten as
\[
A^T X + X A + C^T C - \rho^{-1} X B B^T X + \rho\,[K - \rho^{-1} B^T X]^T [K - \rho^{-1} B^T X] = 0.
\]
In particular, we are going to make the last quadratic term 0; that is, $K = \rho^{-1} B^T P$ and
\[
A^T P + P A - \rho^{-1} P B B^T P + C^T C = 0. \tag{7.4}
\]
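In practice Eq. 7.4 is solved numerically. A hedged sketch: scipy.linalg.solve_continuous_are solves exactly an ARE of this form (with $Q = C^T C$ and $R = \rho$); the plant data is again purely illustrative.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Same illustrative plant as before (assumed data).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
rho = 1.0

# scipy solves A'P + PA - PB R^{-1} B'P + Q = 0; with Q = C'C and
# R = rho this is exactly Eq. 7.4.
P = solve_continuous_are(A, B, C.T @ C, rho * np.eye(1))
K = (B.T @ P) / rho                     # K = rho^{-1} B'P
print("K =", K)
print("closed-loop poles:", np.linalg.eigvals(A - B @ K))  # all Re < 0
```

The last line numerically illustrates the stability result proved in Lemma 2 below.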
We've used a different letter ($P$ instead of $X$) to remind you that this is the cost of a particular choice of $K$. We will also use the following characterization of this equation:
\[
[A - \rho^{-1} B B^T P]^T P + P\,[A - \rho^{-1} B B^T P] + \rho^{-1} P B B^T P + C^T C = 0.
\]
Lemma 2. With $P$ the positive semi-definite solution to Eq. 7.4, the matrix $A_K := A - BK = A - \rho^{-1} B B^T P$ is stable.
Proof. We prove this by contradiction. Suppose that $A_K$ is not stable; that is, there exists an eigenvalue $\lambda$ and eigenvector $v$ such that
\[
A_K v = \lambda v \tag{7.5}
\]
with $\mathrm{Re}\,\lambda \ge 0$. Note that, in general, eigenvalues and eigenvectors are complex numbers. Consider the Hermitian transpose of Eq. 7.5. A Hermitian transpose, denoted by $X^H$, is just the regular transpose with complex conjugates. That is:
\[
X^H = \overline{[X^T]} = (\overline{X})^T.
\]
Note that, if the matrix or vector $X$ is real, then $X^H = X^T$. It follows that the Hermitian transpose of Eq. 7.5 is
\[
v^H A_K^T = \bar\lambda\, v^H. \tag{7.6}
\]
Now multiply the characterization of Eq. 7.4 above by $v^H$ on the left and $v$ on the right, and use Eqs. 7.5 and 7.6:
\[
(\lambda + \bar\lambda)\, v^H P v + \rho^{-1}\, v^H P B B^T P v + v^H C^T C v = 0.
\]
Consider the individual elements of this equation. First of all, since $P$ is a positive semidefinite matrix, $v^H P v \ge 0$ for any vector $v$. Also, $(\lambda + \bar\lambda) = 2\,\mathrm{Re}\,\lambda \ge 0$.
Finally, note that $v^H C^T C v = \|Cv\|^2 \ge 0$ and $v^H P B B^T P v = \|B^T P v\|^2 \ge 0$. It follows that all the elements are non-negative. Hence, the only way they can sum to zero is if they are all zero. But then $Cv = 0$ and
\[
A v = A_K v + \rho^{-1} B B^T P v = \lambda v,
\]
so that $\lambda$ is an eigenvalue of $A$ whose eigenvector $v$ satisfies $Cv = 0$. This contradicts the assumed observability of the system, and the proof is complete. $\square$
We now show that this choice of $K$ is in fact optimal. For any input $u$, note that
\[
\int_0^\infty \frac{d}{dt}\left[ x^T(t)\, P\, x(t) \right] dt = -x_0^T P x_0,
\]
where all we have assumed is that the control $u$ causes the state $x$ to go to zero. Expanding the derivative using $\dot x = Ax + Bu$, this becomes:
\[
\int_0^\infty \begin{bmatrix} x^T(t) & u^T(t) \end{bmatrix} \begin{bmatrix} A^T P + P A & P B \\ B^T P & 0 \end{bmatrix} \begin{bmatrix} x(t) \\ u(t) \end{bmatrix} dt + x_0^T P x_0 = 0. \tag{7.7}
\]
Now rewrite the cost:
\[
J(u) = \int_0^\infty x^T C^T C x + \rho\, u^T u\, dt
\]
\[
= \int_0^\infty \begin{bmatrix} x^T & u^T \end{bmatrix} \begin{bmatrix} A^T P + P A + C^T C & P B \\ B^T P & \rho I \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix} dt + x_0^T P x_0 \tag{7.8}
\]
\[
= \int_0^\infty \begin{bmatrix} x^T & u^T \end{bmatrix} \begin{bmatrix} \rho^{-1} P B B^T P & P B \\ B^T P & \rho I \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix} dt + x_0^T P x_0 \tag{7.9}
\]
\[
= \int_0^\infty \rho\, [\rho^{-1} B^T P x + u]^T [\rho^{-1} B^T P x + u]\, dt + x_0^T P x_0 \tag{7.10--7.11}
\]
where: in line Eq. 7.8 we have added Eq. 7.7; in line Eq. 7.9 we used Eq. 7.4 in the (1,1) term of the block matrix; in lines Eq. 7.10 and Eq. 7.11 we transferred the terms $\rho^{-1} P B$ to the $x^T$ term and $\rho^{-1} B^T P$ to the $x$ term, completing the square.
Note that the cost function $J(u)$ divides into two parts. The first, the integral, is always non-negative. The second term is independent of $u$. It follows that the best you can do is to set the first term equal to zero; but this is accomplished by the choice
\[
\rho^{-1} B^T P x(t) + u(t) = 0 \quad \Longleftrightarrow \quad u(t) = -K x(t),
\]
as we claimed.
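The following sketch (same assumed illustrative plant) checks this conclusion numerically: the cost of the Riccati gain equals $x_0^T P x_0$, and a few perturbed gains all do worse.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
rho, x0 = 1.0, np.array([[1.0], [1.0]])

def cost(K):
    """J(-Kx) = x0' X x0, with X from the Lyapunov equation (Eq. 7.3)."""
    AK = A - B @ K
    X = solve_continuous_lyapunov(AK.T, -(C.T @ C + rho * K.T @ K))
    return float(x0.T @ X @ x0)

P = solve_continuous_are(A, B, C.T @ C, rho * np.eye(1))
K_opt = (B.T @ P) / rho

# The optimal cost equals x0' P x0, and no perturbed gain does better.
print(cost(K_opt), float(x0.T @ P @ x0))
for dK in [0.3, -0.2, 1.0]:            # small stabilizing perturbations
    print(cost(K_opt + dK) >= cost(K_opt))   # True for each
```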
At this point, we should really show that a solution to Eq. 7.4 does exist. I'm going to skip this part of the theory, as I think that it takes us a bit far afield. Instead, it is more instructive to look at an example.
Example 11. Consider the first order system
\[
\dot x = a x + u, \qquad y = x,
\]
with $\rho = 1$. Eq. 7.4 becomes the scalar quadratic equation
\[
2 a p - p^2 + 1 = 0.
\]
Clearly, there are two solutions. We want to take the one that has $p \ge 0$, and this is
\[
p = a + \sqrt{a^2 + 1}.
\]
The control input is then $u = -kx$, where
\[
k = p = a + \sqrt{a^2 + 1},
\]
placing the closed-loop pole at $a - k = -\sqrt{a^2 + 1}$. $\square$
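A quick numerical check of the example, assuming scipy is available: the ARE solution should match $p = a + \sqrt{a^2 + 1}$ for any $a$, stable or not.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

for a in [-2.0, 0.0, 2.0]:   # stable, marginal, and unstable plants
    P = solve_continuous_are(np.array([[a]]), np.array([[1.0]]),
                             np.array([[1.0]]), np.array([[1.0]]))
    p_closed_form = a + np.sqrt(a**2 + 1)
    print(np.isclose(P[0, 0], p_closed_form))    # True
    print(a - P[0, 0])                           # pole = -sqrt(a^2+1)
```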
7.2 Optimal Observers
The problem of finding an observer gain $L$ such that $A - LC$ is stable is equivalent to finding a gain $K = L^T$ such that $A^T - C^T K$ is stable. We might be tempted to define an optimal observer gain by considering the optimal control gain for the problem with $A^T$ and $C^T$. This leads to another Algebraic Riccati Equation:
\[
A Q + Q A^T - \rho^{-1} Q C^T C Q + B B^T = 0, \tag{7.12}
\]
\[
L = \rho^{-1} Q C^T. \tag{7.13}
\]
Note that we have just taken Eq. 7.4, and replaced $A$ with $A^T$, $B$ with $C^T$, and $C$ with $B^T$. It can be shown that this is indeed an optimal observation problem for a suitably defined problem.
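In code, this dual construction amounts to feeding scipy's ARE solver the transposed data. A sketch, with the same assumed illustrative plant as before:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
rho = 1.0

# Eq. 7.12 is the control ARE for the dual data (A', C', B'):
Q = solve_continuous_are(A.T, C.T, B @ B.T, rho * np.eye(1))
L = (Q @ C.T) / rho                      # Eq. 7.13
print("observer poles:", np.linalg.eigvals(A - L @ C))  # all Re < 0
```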
In particular, consider the system
\[
\dot x(t) = A x(t) + B w(t), \qquad y(t) = C x(t) + v(t).
\]
The two external signals, $w(t)$ and $v(t)$, are stochastic signals; that is, these are signals whose value at any point in time is a random variable.

We make the following assumptions about the probability distributions. First, both signals are zero mean random processes. That is, for any $t$:
\[
E[w(t)] = 0, \qquad E[v(t)] = 0,
\]
where $E$ denotes mathematical expectation. Because both signals are zero mean, and the system is linear, the state and output will both be zero mean processes. Furthermore, we assume that
\[
E[w(t) w^T(\tau)] = \delta(t - \tau)\, I, \qquad E[v(t) v^T(\tau)] = \rho\, \delta(t - \tau)\, I, \qquad E[w(t) v^T(\tau)] = 0,
\]
for any $t$ and $\tau$. The first two are equivalent to assuming that the two disturbances are white noise processes. The third assumes that there is no correlation between them.
Now, we build an observer for this system:
\[
\dot{\hat x}(t) = A \hat x(t) + L\,[y(t) - C \hat x(t)].
\]
Notice that the observer does not include components from the external disturbances, as these are assumed to be unmeasurable. The estimation error is $e(t) = x(t) - \hat x(t)$, and it obeys the differential equation:
\[
\dot e(t) = (A - LC)\, e(t) + B w(t) - L v(t) =: A_L\, e(t) + B w(t) - L v(t).
\]
Thus:
\[
e(t) = e^{A_L t} e_0 + \int_0^t e^{A_L (t - \tau)}\, [B w(\tau) - L v(\tau)]\, d\tau. \tag{7.14}
\]
We take as our measure of performance the steady-state variance of the estimation error,
\[
\lim_{t \to \infty} E \|e(t)\|^2.
\]
Note that, because we are looking at the variance of the estimate as $t \to \infty$, the first term in Eq. 7.14 will go to zero, provided that $A_L$ is stable. Thus, we may work with
\[
e(t) = \int_0^t e^{A_L (t - \tau)}\, [B w(\tau) - L v(\tau)]\, d\tau.
\]
Now we evaluate this, but we do so indirectly. Recall from linear algebra that
\[
\|e(t)\|^2 = [e(t)]^T e(t) = \operatorname{trace}\, e(t)\, [e(t)]^T.
\]
Moreover,
\[
e(t)[e(t)]^T = \int_0^t \int_0^t e^{A_L [t - \tau_1]}\, [B w(\tau_1) - L v(\tau_1)]\, [B w(\tau_2) - L v(\tau_2)]^T\, e^{A_L^T [t - \tau_2]}\, d\tau_1\, d\tau_2.
\]
The term in the middle of the integrand expands to
\[
B w(\tau_1) w^T(\tau_2) B^T - B w(\tau_1) v^T(\tau_2) L^T - L v(\tau_1) w^T(\tau_2) B^T + L v(\tau_1) v^T(\tau_2) L^T. \tag{7.15}
\]
When we take the mathematical expectation, because both the trace operation and multiplication by the matrix exponential are linear, we get
\[
E\|e(t)\|^2 = E \operatorname{trace}\, e(t)[e(t)]^T = \operatorname{trace} \int_0^t \int_0^t e^{A_L [t - \tau_1]}\, E[\,\cdot\,]\, e^{A_L^T [t - \tau_2]}\, d\tau_1\, d\tau_2,
\]
where $E[\,\cdot\,]$ denotes the expectation of the expression in Eq. 7.15.
By the white noise assumptions, this expectation is $[B B^T + \rho L L^T]\, \delta(\tau_1 - \tau_2)$. Replacing this into the cost function, and integrating with respect to $\tau_2$, leads to:
\[
E\|e(t)\|^2 = \operatorname{trace} \int_0^t \int_0^t e^{A_L [t - \tau_1]}\, [B B^T + \rho L L^T]\, \delta(\tau_1 - \tau_2)\, e^{A_L^T [t - \tau_2]}\, d\tau_1\, d\tau_2
\]
\[
= \operatorname{trace} \int_0^t e^{A_L [t - \tau_1]}\, [B B^T + \rho L L^T]\, e^{A_L^T [t - \tau_1]}\, d\tau_1.
\]
Now, define $\sigma = t - \tau_1$:
\[
E\|e(t)\|^2 = \operatorname{trace} \int_0^t e^{A_L \sigma}\, [B B^T + \rho L L^T]\, e^{A_L^T \sigma}\, d\sigma.
\]
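As $t \to \infty$ this integral converges (for stable $A_L$) to the solution of a Lyapunov equation, which gives a convenient way to evaluate the steady-state variance numerically. A sketch with the same assumed plant; for the optimal gain of Eqs. 7.12-7.13, the Lyapunov solution turns out to equal $Q$ itself (substitute Eq. 7.12 to verify), which the last line confirms.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
rho = 1.0

Q = solve_continuous_are(A.T, C.T, B @ B.T, rho * np.eye(1))
L = (Q @ C.T) / rho
AL = A - L @ C

# As t -> infinity, the integral above converges to the solution S of
#   AL S + S AL' = -(BB' + rho LL'),
# so the steady-state error variance is trace(S).
S = solve_continuous_lyapunov(AL, -(B @ B.T + rho * L @ L.T))
print(np.trace(S), np.trace(Q))   # equal: S = Q for the optimal L
```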
If we now define the dual data
\[
\bar A = A^T, \qquad \bar B = C^T, \qquad \bar C = B^T, \qquad \bar K = L^T,
\]
then $\bar A_K := \bar A - \bar B \bar K = (A - LC)^T = A_L^T$, and
\[
\lim_{t \to \infty} E\|e(t)\|^2 = \operatorname{trace} \int_0^\infty e^{\bar A_K^T t}\, [\bar C^T \bar C + \rho\, \bar K^T \bar K]\, e^{\bar A_K t}\, dt.
\]
This is the integral that appears in the LQR problem of Section 7.1 under the assumption that state-feedback control $u = -Kx$ is used:
\[
J(-Kx) = x_0^T \left[ \int_0^\infty e^{A_K^T t}\, [C^T C + \rho K^T K]\, e^{A_K t}\, dt \right] x_0,
\]
except that the trace of the integral appears in place of the quadratic form in $x_0$.
It follows that the gain $\bar K$ minimizing the trace criterion, and hence $\lim_{t \to \infty} E\|e(t)\|^2$, is given by the solution of the dual LQR problem:
\[
\bar K = \rho^{-1} \bar B^T \bar P, \qquad 0 = \bar A^T \bar P + \bar P \bar A + \bar C^T \bar C - \rho^{-1} \bar P \bar B \bar B^T \bar P.
\]
In terms of the original data, this is simply Eq. 7.12 and Eq. 7.13 with $Q = \bar P$.
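To close the loop on the stochastic interpretation, here is a crude Euler-Maruyama simulation (assumed illustrative plant, unit-intensity $w$, intensity-$\rho$ $v$) whose empirical error variance should roughly match $\operatorname{trace} Q$; the agreement is only approximate, due to the time discretization and the finite averaging window.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
rho = 1.0

Q = solve_continuous_are(A.T, C.T, B @ B.T, rho * np.eye(1))
L = (Q @ C.T) / rho

dt, T = 1e-3, 200.0
x = np.zeros((2, 1)); xh = np.zeros((2, 1))
acc, count = 0.0, 0
for k in range(int(T / dt)):
    # Discrete samples of white noise have variance (intensity)/dt.
    w = rng.normal(scale=np.sqrt(1.0 / dt), size=(1, 1))
    v = rng.normal(scale=np.sqrt(rho / dt), size=(1, 1))
    y = C @ x + v
    x = x + dt * (A @ x + B @ w)                 # plant, Euler step
    xh = xh + dt * (A @ xh + L @ (y - C @ xh))   # observer, Euler step
    if k * dt > 20.0:                            # discard the transient
        acc += float((x - xh).T @ (x - xh)); count += 1
print(acc / count, np.trace(Q))                  # empirical vs. predicted
```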
Finally, let us put the state feedback and the observer together. The controller is
\[
\dot{\hat x} = A \hat x + B u + L(y - C \hat x), \qquad u = F \hat x,
\]
where $F$ is a state-feedback gain ($F = -K$ in the earlier notation). Define the signal
\[
v = u - F x,
\]
and consider the new system, obtained by replacing the input $u$ with the new input $v$:
\[
\dot x = (A + BF) x + B v, \qquad y = C x.
\]
The solution of this system is
\[
x(t) = e^{(A+BF)t} x_0 + \int_0^t g_v(t - \tau)\, v(\tau)\, d\tau,
\]
where $g_v(t) = e^{(A+BF)t} B$ is the impulse response of the system which transfers the input $v$ to the state $x$. This system has transfer function
\[
G_v(s) = (sI - A - BF)^{-1} B.
\]
I can also write down the first term in the solution for $x(t)$ as a convolution of $g_x(t) = e^{(A+BF)t}$ with the signal $x_0 \delta(t)$. Note that
\[
G_x(s) = (sI - A - BF)^{-1}.
\]
It follows that
\[
X(s) = G_x(s)\, x_0 + G_v(s)\, V(s),
\]
and, by Parseval's theorem (taking $\rho = 1$ in the cost for simplicity),
\[
J(u) = \int_0^\infty y^2(t) + u^2(t)\, dt = \frac{1}{2\pi} \int_{-\infty}^{\infty} |Y(j\omega)|^2 + |U(j\omega)|^2\, d\omega.
\]
Note that:
\[
|Y(j\omega)|^2 + |U(j\omega)|^2 = \left\| \begin{bmatrix} Y(j\omega) \\ U(j\omega) \end{bmatrix} \right\|^2.
\]
Also,
\[
\begin{bmatrix} Y(s) \\ U(s) \end{bmatrix} = \begin{bmatrix} C \\ F \end{bmatrix} X(s) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} V(s)
= \begin{bmatrix} C \\ F \end{bmatrix} G_x(s)\, x_0 + \begin{bmatrix} C G_v(s) \\ 1 + F G_v(s) \end{bmatrix} V(s)
=: H_x(s)\, x_0 + H_v(s)\, V(s).
\]
Hence
\[
|Y(j\omega)|^2 + |U(j\omega)|^2 = x_0^T H_x(j\omega)^H H_x(j\omega)\, x_0 + |V(j\omega)|^2\, H_v(j\omega)^H H_v(j\omega) + 2\,\mathrm{Re}\left\{ x_0^T H_x(j\omega)^H H_v(j\omega)\, V(j\omega) \right\}.
\]
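A small numpy check of this decomposition, at a few frequencies and with assumed illustrative data ($A$, $B$, $C$, $F$, $x_0$): the direct computation of $|Y(j\omega)|^2 + |U(j\omega)|^2$ agrees with the quadratic-form expansion.

```python
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
F = np.array([[-1.0, -1.0]])            # some stabilizing feedback
x0 = np.array([[1.0], [0.5]])

def decompose(w, V):
    s = 1j * w
    Gx = np.linalg.inv(s * np.eye(2) - A - B @ F)   # (sI - A - BF)^{-1}
    Gv = Gx @ B
    Hx = np.vstack([C, F]) @ Gx                     # multiplies x0
    Hv = np.vstack([C @ Gv, 1 + F @ Gv])            # multiplies V(s)
    yu = Hx @ x0 + Hv * V                           # [Y; U] at s = jw
    direct = abs(yu[0, 0])**2 + abs(yu[1, 0])**2
    quad = (np.real(x0.T @ Hx.conj().T @ Hx @ x0)[0, 0]
            + abs(V)**2 * np.real(Hv.conj().T @ Hv)[0, 0]
            + 2 * np.real(x0.T @ Hx.conj().T @ Hv * V)[0, 0])
    return direct, quad

for w in [0.1, 1.0, 10.0]:
    print(decompose(w, V=0.7 + 0.3j))   # the two numbers agree
```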