From Variational Calculus to Pontryagin's Minimum Principle
In real-life problems the control variables are usually subject to constraints on their magnitudes, typically of the form $|u_i(t)| \le k_i$. This implies that the set of final states which can be achieved is restricted. Our aim here is to derive the necessary conditions for optimality corresponding to the necessary condition in variational calculus for the unbounded case. An admissible control is one which satisfies the constraints, and we consider variations $\Delta u$ such that $u^* + \Delta u$ is admissible and $\|\Delta u\|$ is sufficiently small, so that the sign of

$\Delta J = J(u^* + \Delta u) - J(u^*),$

where $J$ is defined by

$J(u) = Q[x(t_1), t_1] + \int_{t_0}^{t_1} q(x, u, t)\,dt,$   (1)

is determined by the first variation $\delta J$ in the expansion

$\Delta J = \delta J(u^*, \Delta u) + \text{higher-order terms}.$   (2)

Because of the restriction on $\Delta u$, the fundamental theorem of variational calculus no longer applies, and instead a necessary condition for $u^*$ to minimize $J$ is

$\delta J(u^*, \Delta u) \ge 0.$   (3)
The development then proceeds as earlier: Lagrange multipliers are introduced to define $J_a$ in

$J_a = Q[x(t_1), t_1] + \int_{t_0}^{t_1} \left[ q(x, u, t) + p^T(f - \dot{x}) \right] dt.$   (4)

With the Hamiltonian $H = q + p^T f$, the multipliers are chosen to satisfy

$\dot{p}_i = -\frac{\partial H}{\partial x_i}, \quad i = 1, 2, \ldots, n,$   (5)

and

$p_i(t_1) = \left[ \frac{\partial Q}{\partial x_i} \right]_{t = t_1}.$   (6)

Expanding $J_a$ about the optimal control then gives, to first order,

$\Delta J_a(u^*, \Delta u) = \int_{t_0}^{t_1} \left[ H(x^*, u^* + \Delta u, p, t) - H(x^*, u^*, p, t) \right] dt,$

and for this to be nonnegative for every admissible variation we must have

$H(x^*, u^*, p, t) \le H(x^*, u, p, t)$   (9)
for all admissible u and all $t \in [t_0, t_1]$; for if (9) did not hold on some interval $t_2 \le t \le t_3$, say, with $t_3 - t_2$ arbitrarily small, then by choosing $\Delta u = 0$ for t outside this interval, $\Delta J_a(u^*, \Delta u)$ would be made negative. Eqn (9) states that $u^*$ minimizes H, so we have established:
Theorem 1 (Pontryagin's minimum principle). Necessary conditions for $u^*$ to minimize (1) are (5), (6), and (9).
With a slightly different definition of H the principle becomes one of maximizing H, and is then referred to in the literature as the maximum principle.
Note that $u^*$ is now allowed to be piecewise continuous. We omit the rigorous proof here.
Our derivation assumed that $t_1$ was fixed and $x(t_1)$ free; the boundary conditions for other situations are precisely the same as those given in the preceding section. It can also be shown that when H does not explicitly depend on t, then H is constant and $H \equiv 0$ on the optimal trajectory for the respective cases when the final time $t_1$ is fixed or free.
Example 1. Consider again the soft-landing problem described earlier, where the performance index is

$J = \int_0^{t_f} (|u| + k)\,dt$   (10)

and the state equations are

$\dot{x}_1 = x_2, \quad \dot{x}_2 = u.$   (11)
The Hamiltonian is

$H = |u| + k + p_1 x_2 + p_2 u.$   (12)

Minimizing H with respect to u, subject to $|u| \le 1$, gives

$u^*(t) = \begin{cases} -1 & \text{if } p_2(t) > 1 \\ 0 & \text{if } -1 < p_2(t) < 1 \\ +1 & \text{if } p_2(t) < -1. \end{cases}$   (13)
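As a small illustration (an addition, not from the original), the three-threshold law (13) translates directly into code; the thresholds at $\pm 1$ come from weighing the fuel term $|u|$ against the $p_2 u$ term in (12):

```python
def u_star(p2: float) -> float:
    """Bang-zero-bang law of eqn (13): minimizes |u| + p2*u over |u| <= 1."""
    if p2 > 1.0:
        return -1.0   # reverse thrust: the p2*u saving outweighs the fuel cost |u|
    if p2 < -1.0:
        return 1.0    # forward thrust, by the symmetric argument
    return 0.0        # coast: for |p2| < 1 any nonzero u increases H

assert u_star(1.5) == -1.0 and u_star(0.0) == 0.0 and u_star(-2.0) == 1.0
```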
Such a control is referred to in the literature by the graphic term bang-zero-bang, since only maximum thrust is applied in a forward or reverse direction; no intermediate nonzero values are used. If there is no period in which $u^*$ is zero, the control is called bang-bang. For example, a racing-car driver approximates to bang-bang operation, since he tends to use either full throttle or maximum braking when attempting to circuit a track as quickly as possible.
In (13), $u^*(t)$ switches in value according to the value of $p_2(t)$, which is therefore termed (in this example) the switching function. The adjoint equations are

$\dot{p}_1 = -\frac{\partial H}{\partial x_1} = 0, \quad \dot{p}_2 = -\frac{\partial H}{\partial x_2} = -p_1,$

and integrating these gives

$p_1 = c_1, \quad p_2 = c_2 - c_1 t,$   (14)

where $c_1$ and $c_2$ are constants. Since $p_2$ is linear in t, only a few control sequences are compatible with (13); one possibility for a landing is $u^* = 0$ followed by $u^* = +1$, and an alternative is

$u^* = -1, \text{ then } u^* = 0, \text{ then } u^* = +1.$   (15)
Consider the first possibility and suppose that $u^*$ switches from 0 to +1 at time $t_1$. By virtue of (13) this sequence of control is possible if $p_2$ decreases with time. It is easy to verify that the solution of (11) subject to the initial conditions

$x_1(0) = h, \quad x_2(0) = -\nu$   (16)

is

$x_1 = h - \nu t, \quad x_2 = -\nu, \qquad 0 \le t \le t_1$
$x_1 = h - \nu t + \tfrac{1}{2}(t - t_1)^2, \quad x_2 = -\nu + (t - t_1), \qquad t_1 \le t \le t_f.$   (17)
Substituting the soft-landing requirements

$x_1(t_f) = 0, \quad x_2(t_f) = 0$   (18)

into (17) gives

$t_1 = \frac{h}{\nu} - \frac{\nu}{2}, \quad t_f = \frac{h}{\nu} + \frac{\nu}{2}.$   (19)

Since $t_f$ is free, $H \equiv 0$ on the optimal trajectory; evaluating (12) at $t = t_f$, where $u^* = 1$ and $x_2 = 0$, gives $p_2(t_f) = -(1 + k)$, and combining this with (14) yields

$p_2(0) = -1 + \frac{k}{\nu}\left(\frac{h}{\nu} - \frac{\nu}{2}\right),$   (20)

using the assumption that $p_2(t_1) = -1$. Thus the assumed optimal control will be valid if $t_1 > 0$ and $p_2(0) < 1$ (the latter condition being necessary since $u^*(0) = 0$), and using (19) and (20) these conditions imply

$h > \tfrac{1}{2}\nu^2, \quad k < \frac{2\nu^2}{h - \tfrac{1}{2}\nu^2}.$   (21)
If these inequalities do not hold then some different control strategy, such as (15), becomes optimal. For example, if k is increased so that the second inequality in (21) is violated, this means that more emphasis is placed on the time to landing in the performance index (10). It is therefore reasonable to expect this time would be reduced by first accelerating downwards with $u^* = -1$ before coasting with $u^* = 0$, as in (15). It is interesting to note that provided (21) holds, the total time $t_f$ to landing in (19) is independent of k.
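The closed-form results (19)-(21) are easy to check numerically. Below is a minimal Python sketch (with illustrative values of h, nu, and k, all assumptions) that integrates (11) under the coast-then-thrust control and confirms a soft landing:

```python
import numpy as np

h, nu, k = 10.0, 2.0, 0.5            # illustrative height, speed, fuel weighting

# Eqn (19): switching and landing times for the sequence u* = 0, then u* = +1.
t1 = h / nu - nu / 2.0               # 4.0
tf = h / nu + nu / 2.0               # 6.0 (independent of k, as noted above)
assert h > 0.5 * nu**2 and k < 2 * nu**2 / (h - 0.5 * nu**2)   # conditions (21)

# Euler-integrate x1' = x2, x2' = u from x1(0) = h, x2(0) = -nu.
dt, x1, x2 = 1e-4, h, -nu
for t in np.arange(0.0, tf, dt):
    u = 0.0 if t < t1 else 1.0
    x1, x2 = x1 + dt * x2, x2 + dt * u

print(f"x1(tf) = {x1:.4f}, x2(tf) = {x2:.4f}")   # both close to 0
```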
Example 2. Suppose that in the preceding example it is now required to determine a control which achieves a soft landing in the least possible time, starting with an arbitrary given initial state $x(0) = x_0$. The performance index is just

$J = t_f - t_0 = \int_{t_0}^{t_f} dt.$   (22)

The Hamiltonian is now $H = 1 + p_1 x_2 + p_2 u$, which is minimized with respect to u, subject to $|u| \le 1$, by

$u^*(t) = -\mathrm{sgn}(p_2(t)),$   (23)

where sgn stands for the sign function. The optimal control thus has bang-bang form, and we must determine the switching function $p_2(t)$. We obtain the adjoint equations
$\dot{p}_1 = 0, \quad \dot{p}_2 = -p_1,$
so
p1 = c1 , p2 = c1 t + c2
where $c_1$ and $c_2$ are constants. Since $p_2$ is a linear function of t it can change sign at most once in $[0, t_f]$, so the optimal control (23) must take one of the following forms:

$u^*(t) = \begin{cases} +1, & 0 \le t \le t_f \\ -1, & 0 \le t \le t_f \\ +1, \; 0 \le t < t_1; & -1, \; t_1 \le t \le t_f \\ -1, \; 0 \le t < t_2; & +1, \; t_2 \le t \le t_f. \end{cases}$   (24)
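That only these four forms can occur follows from the linearity of $p_2$; the brief Python check below (an illustration, not from the text) samples random constants $c_1$, $c_2$ and confirms that $u^* = -\mathrm{sgn}(p_2)$ switches at most once on $[0, t_f]$:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 1001)     # illustrative horizon [0, tf] with tf = 5

for _ in range(1000):
    c1, c2 = rng.normal(size=2)
    p2 = c2 - c1 * t                 # the switching function is affine in t
    u = -np.sign(p2)                 # bang-bang law (23)
    switches = np.count_nonzero(u[1:] != u[:-1])
    assert switches <= 1             # only the four forms in (24) can occur
```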
Integrating (11) with $u = +1$ gives $x_1 = \tfrac{1}{2}x_2^2 + c_5$, and with $u = -1$ gives $x_1 = -\tfrac{1}{2}x_2^2 + c_6$, so the trajectories are parabolas in the $(x_1, x_2)$-plane, and the four forms in (24) can be considered in turn.

(i) $u^* = +1$, $0 \le t \le t_f$. The initial state $x_0$ must lie on the lower part of the curve PO corresponding to $c_5 = 0$ in Figure 1(a).

(ii) $u^* = -1$, $0 \le t \le t_f$. The initial state $x_0$ must lie on the upper part of the curve QO corresponding to $c_6 = 0$ in Figure 1(b).

Figure 1

(iii) For the third case in (24) the control switches from $u^* = +1$ to $u^* = -1$, the switching taking place on QO; this case applies to initial states lying to the left of the curves PO and QO, a typical optimal trajectory being shown in Figure 2.

Figure 2
(iv) A similar argument shows that the last case in (24) applies for any
initial state lying to the right of PO and QO, a typical optimal
trajectory being shown in Figure 3. The switching now takes place
on PO, so the complete switching curve is QOP, shown in Figure 4.
Figure 3
To summarize, if $x_0$ lies on the switching curve then $u^* = +1$ or $u^* = -1$ according as $x_1(0)$ is positive or negative. If $x_0$ does not lie on the switching curve then $u^*$ must initially be chosen so as to move $x^*(t)$ towards the switching curve.

Figure 4
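Putting the summary into code: the complete switching curve QOP is $x_1 = -\tfrac{1}{2} x_2 |x_2|$ (PO for $x_2 < 0$, QO for $x_2 > 0$), which yields a minimum-time feedback law. The Python sketch below is an addition under these stated assumptions, not the original's construction; its crude Euler discretization chatters slightly about the curve but still steers an arbitrary initial state to near the origin.

```python
def switching_curve(x2: float) -> float:
    """QOP curve of Figure 4: x1 = -x2*|x2|/2 (PO for x2 < 0, QO for x2 > 0)."""
    return -0.5 * x2 * abs(x2)

def u_feedback(x1: float, x2: float) -> float:
    """Minimum-time bang-bang feedback implied by the summary above."""
    s = x1 - switching_curve(x2)
    if s > 0.0:
        return -1.0                    # right of QOP: case (iv) starts with u = -1
    if s < 0.0:
        return 1.0                     # left of QOP: case (iii) starts with u = +1
    return 1.0 if x2 < 0 else -1.0     # on the curve: ride it into the origin

# Simulate x1' = x2, x2' = u from an illustrative initial state.
dt, x1, x2, t = 1e-4, 3.0, 1.0, 0.0
while x1 * x1 + x2 * x2 > 1e-4 and t < 20.0:
    u = u_feedback(x1, x2)
    x1, x2, t = x1 + dt * x2, x2 + dt * u, t + dt

print(f"reached ({x1:.3f}, {x2:.3f}) at t = {t:.2f}")   # near the origin
```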