Hamilton-Jacobi-Bellman Equations and Optimal Control
1. Introduction
The aim of this paper is to offer a quick overview of some applications of the theory of viscosity
solutions of Hamilton-Jacobi-Bellman equations connected to nonlinear optimal control problems.
The central role played by value functions and Hamilton-Jacobi equations in the
Calculus of Variations was recognized as early as C. Carathéodory's work; see [1] for a survey.
Similar ideas reappeared under the name of Dynamic Programming in the work of R. Bellman and
his school and became a standard tool for the synthesis of feedback controls for discrete time systems.
However, the lack of smoothness of value functions, even in simple problems, was recognized as a
severe restriction to the range of applicability of Hamilton-Jacobi-Bellman theory (in short, HJB from
now on) to continuous time processes.
The main reasons for this limitation are twofold:
(i) the basic difficulty of giving an appropriate global meaning to the HJB equation (a fully nonlinear partial differential equation), which the value function satisfies at all points of differentiability;
(ii) the difficulty of identifying the value function as the unique solution of that equation; a related important issue is the stability of value functions, especially in connection with the approximation procedures required for computational purposes.
Several non classical notions of solution have therefore been proposed to overcome these difficulties.
Let us mention, in this respect, the Kruzkov theory which applies, in the case of sufficiently smooth
Hamiltonians, to semiconvex functions satisfying the HJB equation almost everywhere (see [2, 3] and
also [4, 5] for recent results on semiconcavity of value functions).
Only in the 1980s, however, did a decisive impulse toward a satisfactory mathematical framework for Dynamic Programming come from the introduction by Crandall and Lions [6] of the notion of viscosity solution of Hamilton-Jacobi equations.
The presentation here, which is mainly based on material contained in the forthcoming book [7], to which we refer for detailed proofs, focuses on optimization problems for controlled ordinary differential equations and discrete time systems.
Let us consider the control system, whose solution will be denoted by $y_x^\alpha$,
\[
\dot y(t) = f(y(t), \alpha(t)) \quad (t > 0), \qquad y(0) = x. \tag{2.1}
\]
In (2.1), $f : \mathbb{R}^N \times A \to \mathbb{R}^N$, $A$ is a topological space and the controls $\alpha$ range in the set $\mathcal{M}(A)$ of measurable functions from $[0, +\infty)$ into $A$.
Here we list a few classical examples of optimal control problems and associated HJB equations
(complemented in some cases by initial or boundary conditions) for system (2.1).
and
\[
J(x, \alpha) = \int_0^{+\infty} l(y_x^\alpha(t), \alpha(t))\, e^{-\lambda t}\, dt, \quad \text{otherwise.}
\]
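As a concrete illustration of the cost functional above, the sketch below integrates system (2.1) with an explicit Euler scheme and approximates the discounted integral by a truncated Riemann sum. The one-dimensional data $f(y,a) = a$, $l(y,a) = |y|$, $A = [-1,1]$, $\lambda = 1$ are purely hypothetical choices made for illustration (they are not one of the examples of this section) and will be reused in the sketches of Section 5.

```python
import math

LAM = 1.0                       # discount rate (illustrative choice)

def f(y, a):                    # hypothetical dynamics: controlled velocity
    return a

def l(y, a):                    # hypothetical running cost
    return abs(y)

def cost_J(x, alpha, h=1e-3, horizon=20.0):
    """Approximate J(x, alpha) = int_0^inf l(y(t), alpha(t)) e^{-LAM t} dt
    by explicit Euler steps for (2.1) and a Riemann sum truncated at
    `horizon`; the neglected tail is O(e^{-LAM*horizon}) whenever l stays
    bounded along the trajectory."""
    y, J, t = x, 0.0, 0.0
    while t < horizon:
        a = alpha(t)
        J += l(y, a) * math.exp(-LAM * t) * h
        y += h * f(y, a)        # Euler step for y'(t) = f(y(t), a)
        t += h
    return J

# Steer from x = 2 toward the origin at maximal speed, then rest there;
# the cost can be computed by hand:
# (LAM*|x| - 1 + e^{-LAM*|x|})/LAM^2 = 1 + e^{-2} ~ 1.1353.
print(cost_J(2.0, lambda t: -1.0 if t < 2.0 else 0.0))
```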
The corresponding HJB equation is now the Dirichlet problem:
\[
\begin{cases}
\lambda v(x) + \sup_{a \in A}\,\bigl[-f(x,a) \cdot Dv(x) - l(x,a)\bigr] = 0 & \text{in } \mathbb{R}^N \setminus \mathcal{T},\\[2pt]
v(x) = g(x) & \text{if } x \in \partial\mathcal{T}.
\end{cases}
\]
The set Ω plays here the role of a constraint on the states of system (2.1).
For the present problem the HJB equation (2.3) is complemented with a quite unusual boundary condition, namely
Here we set:
\[
\mathcal{M}_a^m(A) = \{\alpha \in \mathcal{M}(A) : \alpha \text{ nondecreasing},\ \alpha(0) \ge a\}.
\]
The HJB equation for this problem takes the form of the evolutionary variational inequality:
\[
\max\Bigl[\lambda v(x,a) - f(x,a) \cdot D_x v(x,a) - l(x,a)\,;\ -\frac{\partial v}{\partial a}\Bigr] = 0 \quad \text{in } \mathbb{R}^N \times [0,1),
\]
\[
v(x,1) = \int_0^{+\infty} l(y_x^1(t), 1)\, e^{-\lambda t}\, dt \quad \text{in } \mathbb{R}^N.
\]
It is well-known that the value functions of the various optimal control problems described above
satisfy the corresponding HJB equations at all points of differentiability. This fact can be proved by
means of the Dynamic Programming Principle, a functional equation relating the value of v at the
initial point x to its value at some point reached later by a trajectory of system (2.1).
Let us just indicate that, in the simplest case of the infinite horizon discounted regulator problem, the Dynamic Programming Principle is expressed by the identity
\[
v(x) = \inf_{\alpha \in \mathcal{M}(A)} \left\{ \int_0^T l(y_x^\alpha(t), \alpha(t))\, e^{-\lambda t}\, dt + e^{-\lambda T}\, v(y_x^\alpha(T)) \right\}, \tag{3.4}
\]
valid for all $T > 0$.
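To see, at least formally, how (3.4) leads to the HJB equation of the infinite horizon problem, one can argue as follows (a purely heuristic computation, which assumes $v$ differentiable at $x$). Choosing the constant control $\alpha(t) \equiv a$ in (3.4) gives
\[
v(x) \le \int_0^T l(y_x^a(t), a)\, e^{-\lambda t}\, dt + e^{-\lambda T}\, v(y_x^a(T));
\]
subtracting $v(x)$ from both sides, dividing by $T$ and letting $T \to 0^+$ yields
\[
0 \le l(x,a) - \lambda v(x) + f(x,a) \cdot Dv(x),
\]
that is, $\lambda v(x) - f(x,a) \cdot Dv(x) - l(x,a) \le 0$ for every $a \in A$. Taking the supremum over $a \in A$ gives one inequality in the HJB equation; the reverse one follows by choosing almost optimal controls in (3.4).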
Let us just sketch the proof of the inequality $u \le w$, the argument for the reverse one being completely similar. Assume, by contradiction, the existence of some $x_0$ and $\delta > 0$ such that:
\[
u(x_0) - w(x_0) \ge \delta.
\]
Define then
\[
\Phi(x,y) = u(x) - w(y) - \frac{|x - y|^2}{2\varepsilon} - \beta\,\bigl(g(x) + g(y)\bigr),
\]
where $\varepsilon > 0$ and
\[
g(x) = \frac{1}{2}\log\bigl(1 + |x|^2\bigr).
\]
The parameter $\beta$ is chosen so as to satisfy
\[
\beta \le \frac{\delta}{4\,g(x_0)}, \qquad \omega(2\beta) \le \frac{\delta}{6}.
\]
The above choices yield:
\[
\sup_{\mathbb{R}^N \times \mathbb{R}^N} \Phi \ \ge\ \Phi(x_0, x_0) \ \ge\ \frac{\delta}{2}. \tag{3.6}
\]
By the assumptions made on $u, w$ it is not hard to prove the existence of $(x_\varepsilon, y_\varepsilon)$ such that
\[
\Phi(x_\varepsilon, y_\varepsilon) = \sup_{\mathbb{R}^N \times \mathbb{R}^N} \Phi,
\]
and that $(x_\varepsilon, y_\varepsilon)$ remain uniformly bounded with respect to $\varepsilon$. The key point is to observe that
\[
p_\varepsilon := \frac{x_\varepsilon - y_\varepsilon}{\varepsilon} + \beta\, Dg(x_\varepsilon) \in D^+ u(x_\varepsilon)
\]
and
\[
q_\varepsilon := \frac{x_\varepsilon - y_\varepsilon}{\varepsilon} - \beta\, Dg(y_\varepsilon) \in D^- w(y_\varepsilon).
\]
By the definition of viscosity sub- and supersolution one then obtains inequality (3.7) and, consequently,
\[
\frac{|x_\varepsilon - y_\varepsilon|^2}{\varepsilon} \to 0 \quad \text{as } \varepsilon \to 0.
\]
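In the model case where the equation has the form $u + H(x, Du(x)) = 0$, as in (3.5), with $H$ satisfying standard continuity assumptions, an estimate of this kind can be obtained as follows (a sketch under those assumptions): the sub- and supersolution properties give
\[
u(x_\varepsilon) + H(x_\varepsilon, p_\varepsilon) \le 0 \le w(y_\varepsilon) + H(y_\varepsilon, q_\varepsilon),
\]
whence
\[
u(x_\varepsilon) - w(y_\varepsilon) \le H(y_\varepsilon, q_\varepsilon) - H(x_\varepsilon, p_\varepsilon),
\]
and the right-hand side is bounded through the moduli of continuity of $H$ in terms of $|x_\varepsilon - y_\varepsilon|$, $|x_\varepsilon - y_\varepsilon|^2/\varepsilon$ and $\beta$; this is where the smallness conditions imposed on $\beta$ enter.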
At this point it is easy to realize that inequality (3.7) contradicts (3.6).
This concludes the proof that u ≤ w.
As for stability we have:
Then,
\[
u(x) + H(x, Du(x)) = 0 \quad \text{in } \Omega,
\]
in the viscosity sense.
The uniform convergence of $u_n$ to $u$ guarantees that for any $x \in \Omega$ and $p \in D^+ u(x)$ there exist $x_n \in \Omega$ and $p_n \in D^+ u_n(x_n)$ such that $x_n \to x$, $p_n \to p$.
From this fact it follows easily that u is a subsolution of the limit equation.
A completely similar argument, with D+ replaced by D− , shows that u is a supersolution as well.
The theory outlined up to now does not depend on the convexity of the map $p \mapsto H(x, p)$.
When this property holds (as is typical of the Hamiltonians occurring in optimal control problems), some special results are valid. Let us just mention the non obvious fact that in this case a function $u$ is a viscosity solution of equation (3.5) if and only if
The value functions in the examples 2.1, 2.2, 2.5 of Section 2 are continuous under the assumptions $(A_0)$, $(A_1)$ plus some uniform continuity conditions on the costs $l$, $g$. Continuity of $v$ in problems 2.3 and 2.4 is guaranteed under an additional restriction involving the behaviour of the dynamics $f$ on the boundary of $\Omega$ or of $\mathcal{T}$. For problem 2.4 this condition is
Theorem 3 Assume $(A_0)$, $(A_1)$; assume also $l$ continuous and bounded on $\mathbb{R}^N \times A$ and
Then, the value function $v$ of the infinite horizon discounted regulator problem is a bounded, continuous viscosity solution of (2.3).
Moreover, $v$ is the unique viscosity solution of (2.3) in the class of bounded, continuous functions on $\mathbb{R}^N$.
For the proof, note that the second statement follows from the first and the uniqueness Theorem 1.
Let us just indicate how to prove that $v$ is a viscosity solution of (2.3). Observe that (4.8) with $\alpha(t) \equiv a \in A$ yields
\[
\frac{v(y_x^a(T)) - v(x)}{T} \ \ge\ -\frac{1}{T}\left[\int_0^T l(y_x^a(t), a)\, e^{-\lambda t}\, dt + \bigl(e^{-\lambda T} - 1\bigr)\, v(y_x^a(T))\right], \tag{4.9}
\]
from which the subsolution inequality follows by letting $T \to 0^+$.
Let us also mention the following sufficient condition for optimality: if $\alpha^*$ and $x$ satisfy
\[
p \cdot f(y_x^{\alpha^*}(t), \alpha^*(t)) + l(y_x^{\alpha^*}(t), \alpha^*(t)) = \lambda\, v(y_x^{\alpha^*}(t)) \quad \text{a.e. } t > 0
\]
for all $p \in D^+ v(y_x^{\alpha^*}(t)) \cap D^- v(y_x^{\alpha^*}(t))$, then $\alpha^*$ is optimal for the initial position $x$.
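As a concrete check of this condition, consider the hypothetical one-dimensional example used in the code sketch of Section 2 ($f(y,a) = a$, $A = [-1,1]$, $l(y,a) = |y|$), whose value function can be computed by hand: $v(x) = (\lambda|x| - 1 + e^{-\lambda|x|})/\lambda^2$. This $v$ is differentiable everywhere, with $Dv(y) = \operatorname{sign}(y)\,(1 - e^{-\lambda|y|})/\lambda$, so that $D^+ v(y) \cap D^- v(y) = \{Dv(y)\}$. For $x > 0$, the control $\alpha^*(t) \equiv -1$ (switched to $0$ once the origin is reached) satisfies, along $y = y_x^{\alpha^*}(t) > 0$,
\[
Dv(y) \cdot f(y, -1) + l(y, -1) = -\frac{1 - e^{-\lambda y}}{\lambda} + y = \lambda\, v(y),
\]
so the condition above holds and $\alpha^*$ is indeed optimal.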
Note that the condition $D^+ v(x) \cap D^- v(x) \neq \emptyset$ is rather restrictive; it is fulfilled, for example, when $v$ is semiconcave, i.e.
\[
v(x+z) - 2v(x) + v(x-z) \le C|z|^2
\]
for some $C$ and all $x, z$ in $\mathbb{R}^N$ (this holds, for instance, for every concave function and for every function of class $C^2$).
5. Approximate synthesis
Consider again the infinite horizon problem 2.1 and the following discrete approximation of the associated HJB equation (2.3):
\[
u_h(x) + \sup_{a \in A}\,\bigl[-(1 - \lambda h)\, u_h(x + h f(x,a)) - h\, l(x,a)\bigr] = 0 \quad \text{in } \mathbb{R}^N. \tag{5.10}
\]
Under the assumptions of Theorem 3, the Contraction Mapping Principle applies to show that for each $h \in (0, \frac{1}{\lambda})$ there exists a unique bounded, continuous solution $u_h$ of the above functional equation.
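Numerically, this fixed point can be approximated by iterating the contraction on a spatial grid, interpolating $u_h$ at the foot points $x + h f(x,a)$; this is a semi-Lagrangian scheme in the spirit of the references quoted at the end of this section. The sketch below solves the hypothetical one-dimensional example introduced in Section 2 ($f(y,a) = a$, $A = [-1,1]$, $l(y,a) = |y|$, $\lambda = 1$), for which $v(x) = (\lambda|x| - 1 + e^{-\lambda|x|})/\lambda^2$ is known in closed form; the grid bounds, the discretization of $A$ and the tolerances are illustrative choices only.

```python
import numpy as np

LAM = 1.0                             # discount rate (illustrative)
X = np.linspace(-3.0, 3.0, 601)       # spatial grid (illustrative bounds)
A = np.linspace(-1.0, 1.0, 21)        # discretized control set A = [-1, 1]

def f(y, a): return a                 # hypothetical dynamics
def l(y, a): return np.abs(y)         # hypothetical running cost

def solve_uh(h, tol=1e-10, max_iter=10000):
    """Value iteration u <- T u for (5.10), where
    (T u)(x) = min_a [ (1 - LAM*h) u(x + h f(x,a)) + h l(x,a) ];
    T is a contraction of factor (1 - LAM*h) on bounded functions.
    np.interp evaluates u at the foot points x + h f(x,a), holding u
    constant outside the grid (a crude but harmless boundary choice here,
    since the minimizing controls point toward the interior)."""
    u = np.zeros_like(X)
    for _ in range(max_iter):
        vals = [(1 - LAM * h) * np.interp(X + h * f(X, a), X, u) + h * l(X, a)
                for a in A]
        u_new = np.min(vals, axis=0)
        if np.max(np.abs(u_new - u)) < tol:
            return u_new
        u = u_new
    return u

u = solve_uh(0.05)
v_exact = (LAM * np.abs(X) - 1 + np.exp(-LAM * np.abs(X))) / LAM**2
print("max |u_h - v| on the grid:", np.max(np.abs(u - v_exact)))
```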
The functions $u_h$ can be interpreted as value functions of a discrete time version of the infinite horizon problem. Let us define for this purpose
where $k = 0, 1, \ldots$.
Define then a feedback law $a_h^* : \mathbb{R}^N \to A$ by selecting
where $u_h$ is the solution of equation (5.10). Consider now the control $\alpha_h^* \in \mathcal{M}_h(A)$ given by
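Continuing the previous sketch, one natural realization of such a feedback (an illustrative construction, not the precise definition alluded to above) picks at each point a control achieving the minimum in (5.10), and the discrete dynamics $y_{k+1} = y_k + h f(y_k, a_h^*(y_k))$ is then driven by it:

```python
H = 0.05                              # time step, matching solve_uh(0.05)

def feedback(x, u):
    """A minimizing control a_h^*(x): argmin over the discretized control
    set of (1 - LAM*H) u(x + H f(x,a)) + H l(x,a)."""
    vals = [(1 - LAM * H) * float(np.interp(x + H * f(x, a), X, u))
            + H * l(x, a) for a in A]
    return A[int(np.argmin(vals))]

# Closed-loop trajectory from x = 2: the synthesized control steers the
# state toward the origin, where the running cost vanishes.
y = 2.0
for k in range(100):
    y += H * f(y, feedback(y, u))
print("state after 100 steps:", y)    # expected: within O(H) of the origin
```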
The next result states that $u_h$ converges to the value function $v$ of the infinite horizon problem as the time step $h \to 0^+$, in the sense that
\[
\sup_K |u_h - v| \to 0 \quad \text{as } h \to 0^+
\]
for all $K \subset\subset \mathbb{R}^N$. Under the further conditions $\lambda > 2L_f$, $f$ smooth and $l$ semiconcave, the estimate
\[
\sup_K |u_h - v| \le C_K\, h
\]
holds for a suitable constant $C_K > 0$.
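The convergence can be observed empirically with the sketch above (an illustrative experiment only; note that the running cost $|y|$ of our toy example is not semiconcave at the origin, so the hypotheses of the linear rate are not fully met there):

```python
# Empirical behaviour of max|u_h - v| as the time step h is halved.
for h in [0.2, 0.1, 0.05, 0.025]:
    err = np.max(np.abs(solve_uh(h) - v_exact))
    print(f"h = {h:5.3f}   max|u_h - v| = {err:.4e}")
```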
As a consequence of this convergence result it follows that any optimal pair $(\alpha_h^*, y_h^*)$ for the above described discrete time problem converges weakly to an optimal relaxed pair $(\mu^*, y^*)$ for the original problem (2.2); see [11, 12]. Theorem 6 is also the starting point for a numerical approach to the computation of value functions and optimal feedbacks. We refer for example to [18, 19, 21, 22, 23].
6. Final remarks
In this paper we restricted our attention to the role played by viscosity solutions in optimal control
problems for systems governed by ordinary differential equations. Only a few examples have been
briefly described, but many more can be approached in a similar way; see [7] for impulse and switching control problems, the minimum time problem and $H_\infty$ control.
Some important topics we did not touch upon are discontinuous viscosity solutions and their applications to control and game problems with discontinuous value functions (e.g. the classical Zermelo navigation problem). Discontinuous viscosity solutions and the closely related weak limits technique are also relevant in the analysis of some asymptotic problems occurring, for example, in connection with ergodic systems (see [24]), large deviations (see [25]) or the control of singularly perturbed systems (see [26]).
Let us mention, finally, that the viscosity solutions approach is flexible enough to be applicable to control problems for stochastic and distributed parameter systems, as well as to differential games (we refer for this purpose to [8, 9, 10, 13]).
REFERENCES
[1] PESCH H.J., BULIRSCH R. The Maximum Principle, Bellman's equation, and Carathéodory's work. J. Optim. Theory Appl. 80 (1994).
[2] KRUZKOV S.N. The Cauchy problem in the large for certain nonlinear first order differential equations. Sov. Math. Dokl. 1 (1960).
[3] KRUZKOV S.N. Generalized solutions of the Hamilton-Jacobi equations of eikonal type I. Math. USSR Sbornik 27 (1975).
[4] CANNARSA P., SINESTRARI C. Convexity properties of the minimum time function. Calc.
Var. 3 (1995).
[5] SINESTRARI C. Semiconcavity of solutions of stationary Hamilton-Jacobi equations. Nonlinear
Analysis 24 (1995).
[6] CRANDALL M.G., LIONS P.L. Viscosity solutions of Hamilton-Jacobi equations. Trans. Amer. Math. Soc. 277 (1983).
[7] BARDI M., CAPUZZO DOLCETTA I. Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. To appear, Birkhäuser (1997).
[8] LIONS P.L. Optimal control of diffusion processes and Hamilton-Jacobi equations. Comm. Partial Differential Equations 8 (1983).
[9] FLEMING W.H., SONER M.H. Controlled Markov processes and viscosity solutions. Springer Verlag (1993).
[10] LI X., YONG J. Optimal control theory for infinite dimensional systems. Birkhäuser (1995).
[11] CAPUZZO DOLCETTA I. On a discrete approximation of the Hamilton-Jacobi equation of dynamic programming. Appl. Math. Optim. 10 (1983).
[12] CAPUZZO DOLCETTA I., ISHII H. Approximate solutions of the Bellman equation of deterministic control theory. Appl. Math. Optim. 11 (1984).
[13] BARDI M. Some applications of viscosity solutions to optimal control and differential games. In Viscosity Solutions and Applications, I. Capuzzo Dolcetta, P.L. Lions (eds.). To appear in Lecture Notes in Mathematics, Springer (1997).
[14] BARRON E.N., JENSEN R. Semicontinuous viscosity solutions of Hamilton-Jacobi equations with convex Hamiltonians. Comm. Partial Differential Equations 15 (1990).
[15] BARLES G. Discontinuous viscosity solutions of first order Hamilton-Jacobi equations: a guided visit. Nonlinear Analysis 20 (1993).
[16] SONER M.H. Optimal control problems with state-space constraints I-II. SIAM J. Control Optim.
24 (1986).
[17] CAPUZZO DOLCETTA I., LIONS P.L. Hamilton-Jacobi equations with state constraints. Trans.
Amer. Math. Soc. 318 (1990).
[18] GONZALEZ R., ROFMAN E. On deterministic control problems: an approximation procedure
for the optimal cost. SIAM J. Control Optim. 23 (1985).
[19] FALCONE M. A numerical approach to the infinite horizon problem of deterministic control
theory. Appl. Math. Optim. 15 (1987).
[20] LORETI P., TESSITORE M.E. Approximation and regularity results on constrained viscosity solutions of Hamilton-Jacobi-Bellman equations. J. Math. Systems Estimation Control 4 (1994).
[21] ROUY A. Numerical approximation of viscosity solutions of Hamilton-Jacobi equations with
Neumann type boundary conditions. Math. Models Methods Appl. Sci. 2 (1992).
[22] FALCONE M., FERRETTI R. Discrete high-order schemes for viscosity solutions of Hamilton-
Jacobi equations. Numer. Math. 67 (1994).
[23] BARDI M., BOTTACIN S., FALCONE M. Convergence of discrete schemes for discontinuous value functions of pursuit-evasion games. In G.J. OLSDER, editor, New Trends in Dynamic Games and Applications. Birkhäuser (1995).
[24] ARISAWA M. Ergodic problem for the Hamilton-Jacobi-Bellman equation I and II. Cahiers du CEREMADE (1995).
[25] BARLES G. Solutions de Viscosité des Équations de Hamilton-Jacobi, Vol. 17, Mathématiques et Applications. Springer (1994).
[26] BARDI M., BAGAGIOLO F., CAPUZZO DOLCETTA I. A viscosity solutions approach to some asymptotic problems in optimal control. In J.P. ZOLESIO, editor, Proceedings of the Conference “PDE's Methods in Control, Shape Optimization and Stochastic Modelling”. M. Dekker (1996).