Optimal Control Theory
Applications to Management Science and Economics
Third Edition

Suresh P. Sethi
Jindal School of Management, SM30
University of Texas at Dallas
Richardson, TX, USA
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
This book is dedicated to the memory of
my parents
Manak Bai and Gulab Chand Sethi
Preface to Third Edition
The third edition of this book will not see my co-author Gerald L.
Thompson, who very sadly passed away on November 9, 2009. Gerry
and I wrote the first edition of the 1981 book sitting practically side by
side, and I learned a great deal about book writing in the process. He
was also my PhD supervisor and mentor and he is greatly missed.
After many years of using the second edition of the book in the classroom,
I have prepared this third edition with new material and many
improvements. Examples and exercises related to the interpretation of
the adjoint variables and Lagrange multipliers are inserted in Chaps. 2–
4. The direct maximum principle is now discussed in detail in Chap. 4,
along with the indirect maximum principle retained from the second edition.
Chattering or relaxed controls leading to pulsing advertising policies are
introduced in Chap. 7. An application to information systems involving
chattering controls is added as an exercise.
The objective function in Sect. 11.1.3 is changed to the more popular
objective of maximizing society's total discounted utility of consumption.
Further discussion, leading to a saddle-point path on the phase diagram
that converges to the long-run stationary equilibrium, is provided
in Sect. 11.2. For this purpose, a global saddle-point theorem is stated
in Appendix D.7. Also inserted in Appendix D.8 is a discussion of the
Sethi-Skiba points which lead to nonunique stable equilibria. Finally,
a new Sect. 11.4 contains an adverse selection model with a continuum of
agent types in a principal-agent framework, which requires an application
of the maximum principle.
Chapter 12 of the second edition is removed except for the material
on differential games and the distributed parameter maximum principle.
The differential game material joins new topics of stochastic Nash differ-
ential games and Stackelberg differential games via their applications to
marketing to form a new Chap. 13 titled Differential Games. As a result,
Chap. 13 of the second edition becomes Chap. 12. The material on the
distributed parameter maximum principle is now Appendix D.9.
The exposition is revised in some places for better reading. New
exercises are added and the list of references is updated. Needless to say,
the errors in the second edition are corrected, and the notation is made
consistent.
Preface to Second Edition
Finally, while we regret that lack of time and pressure of other du-
ties prevented us from bringing out a second edition soon after the first
edition went out of print, we sincerely hope that the wait has been worth-
while. In spite of the numerous applications of optimal control theory
which already have been made to areas of management science and eco-
nomics, we continue to believe there is much more that remains to be
done. We hope the present revision will rekindle interest in furthering
such applications, and will enhance the continued development in the
field.
List of Figures
1.1 The Brachistochrone problem
1.2 Illustration of left and right limits
1.3 A concave function
1.4 An illustration of a saddle point
10.1 Optimal policy for the sole owner fishery model
10.2 Singular usable timber volume x̄(t)
10.3 Optimal thinning u∗(t) and timber volume x∗(t) for the forest thinning model when x0 < x̄(t0)
10.4 Optimal thinning u∗(t) and timber volume x∗(t) for the chain of forests model when T > t̂
10.5 Optimal thinning and timber volume x∗(t) for the chain of forests model when T ≤ t̂
10.6 The demand function
10.7 The profit function
10.8 Optimal price trajectory for T ≥ T̄
10.9 Optimal price trajectory for T < T̄
Chapter 1
wish at first to cover only the simpler models in each area to get an idea
of what could be accomplished with optimal control theory. Later, the
reader may wish to go into more depth in one or more of the applied
areas.
Examples are worked out in most of the chapters to facilitate the
exposition. At the end of each chapter, we have listed exercises that the
reader should solve for deeper understanding of the material presented
in the chapter. Hints are supplied with some of the exercises. Answers
to selected exercises are given in Appendix E.
x(T ) ∈ X, (1.6)
where X is called the reachable set of the state variable at time T. Note
that X depends on the initial value x0 . Here X is the set of possible
terminal values that can be reached when x(t) and u(t) obey imposed
constraints.
Although the above description of the control problem may seem ab-
stract, you will find that in each specific application, the variables and
parameters will have specific meanings that make them easy to under-
stand and remember. The examples that follow will illustrate this point.
with G(0) = G0 > 0 specifying the initial goodwill for the product.
results from the goodwill level G(t) at time t less the cost of advertising
assumed to be proportional to u(t) (proportionality factor = 1); thus
π(G(t)) − u(t) is the net profit rate at time t. Also [π(G(t)) − u(t)]e−ρt is
the net profit rate at time t discounted to time 0, i.e., the present value
of the time t profit rate. Hence, J can be interpreted as the total value of
discounted future profits, and is the quantity we are trying to maximize.
There are control constraints 0 ≤ u(t) ≤ Q, where Q is the upper
bound on the advertising rate. However, there is no state constraint. It
can be seen from the state equation and the control constraints that the
goodwill G(t) in fact never becomes negative.
You will find it instructive to compare this model with the previous
one and note the similarities and differences between the two.
Example 1.3 A Consumption Model. Rich Rentier plans to retire at
age 65 with a lump sum pension of W0 dollars. Rich estimates his re-
maining life span to be T years. He wants to consume his wealth during
these T retirement years, beginning at the age of 65, and leave a bequest
to his heirs in a way that will maximize his total utility of consumption
and bequest.
Since he does not want to take investment risks, Rich plans to put
his money into a savings account that pays interest at a continuously
compounded rate of r. In order to formulate Rich’s optimization problem,
let t = 0 denote the time when he turns 65 so that his retirement period
can be denoted by the interval [0, T ]. If we let the state variable W (t)
denote Rich’s wealth and the control variable C(t) ≥ 0 denote his rate of
consumption at time t ∈ [0, T ], it is easy to see that the state equation is
Ẇ (t) = rW (t) − C(t),
with the initial condition W (0) = W0 > 0. It is reasonable to require that
W (t) ≥ 0 and C(t) ≥ 0, t ∈ [0, T ]. Letting U (C) be the utility function
of consumption C and B(W ) be the bequest function of leaving a bequest
of amount W at time T, we see that the problem can be stated as an
optimal control problem with the variables, equations, and constraints
shown in Table 1.3.
Note that the objective function has two parts: first the integral of
the discounted utility of consumption from time 0 to time T with ρ as
the discount rate; and second the bequest function e−ρT B(W ), which
measures Rich’s discounted utility of leaving an estate W to his heirs
at time T. If he has no heirs and does not care about charity, then
B(W ) = 0. However, if he has heirs or a favorite charity to whom he
wishes to leave money, then B(W ) measures the strength of his desire
to leave an estate of amount W. The nonnegativity constraints on state
and control variables are obviously natural requirements that must be
imposed.
You will be asked to solve this problem in Exercise 2.1 after you
have learned the maximum principle in the next chapter. Moreover, a
stochastic extension of the consumption problem, known as a consump-
tion/investment problem, will be discussed in Sect. 12.4.
Connors and Teichroew (1967), Arrow and Kurz (1970), Hadley and Kemp (1971), Bensoussan et al. (1974), Stöppler (1975), Clark (1976), Sethi (1977a, 1978a), Tapiero (1977, 1988), Wickwire (1977), Bookbinder and Sethi (1980), Lesourne and Leban (1982), Tu (1984), Feichtinger and Hartl (1986), Carlson and Haurie (1987b), Seierstad and Sydsæter (1987), Erickson (2003), Léonard and Long (1992), Kamien and Schwartz (1992), Van Hilten et al. (1993), Feichtinger et al. (1994a), Maimon et al. (1998), Dockner et al. (2000), Caputo (2005), Grass et al. (2008), and Bensoussan (2011). Nevertheless, we have included in our bibliography many works of interest.
\[
\dot{y} = \frac{dy}{dt} = (\dot{y}_1, \ldots, \dot{y}_n)^T \quad \text{and} \quad \dot{z} = \frac{dz}{dt} = (\dot{z}_1, \ldots, \dot{z}_m),
\]
where \(\dot{y}_i\) and \(\dot{z}_j\) denote the time derivatives \(dy_i/dt\) and \(dz_j/dt\), respectively.
When n = m, we can define the inner product
\[
zy = \sum_{i=1}^{n} z_i y_i. \tag{1.7}
\]
More generally, if
\[
A = \{a_{ij}\} = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1k} \\
a_{21} & a_{22} & \cdots & a_{2k} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mk}
\end{bmatrix}
\]
We will also use the notation \(f = (f_1, f_2, \ldots, f_k)\) and \(\dot{f}(t)\) in place of \(f_t\).
If \(f\) is a column vector, then
\[
\frac{df}{dt} = f_t = \begin{bmatrix} f_{1t} \\ f_{2t} \\ \vdots \\ f_{kt} \end{bmatrix} = (f_{1t}, f_{2t}, \ldots, f_{kt})^T, \quad \text{a column vector.}
\]
Once again, \(\dot{f}(t)\) may also be written as \(\dot{f}\) or \(f_t\).
A similar rule applies if a matrix function is differentiated with re-
spect to a scalar.
Example 1.4 Let
\[
f(t) = \begin{bmatrix} t^2 & 2t + 3 \\ e^{3t} & 1/t \end{bmatrix}.
\]
Find \(f_t\).

Solution
\[
f_t = \begin{bmatrix} 2t & 2 \\ 3e^{3t} & -1/t^2 \end{bmatrix}.
\]
and
Fz = (Fz1 , · · · , Fzm ), a row vector, (1.10)
where Fyi and Fzj denote the partial derivatives with respect to the
subscripted variables.
Thus, we always define the gradient with respect to a row or column
vector as a row vector. Alternatively, Fy and Fz are also denoted as ∇y F
and ∇z F, respectively. In this notation, if F is a function of y only or
z only, then the subscript can be dropped and the gradient of F can be
written simply as ∇F.
\[
f_z = (f^T)_z = f_{z^T} = (f^T)_{z^T}.
\]
Solution.
\[
f_z = \begin{bmatrix}
3y_2/z_1 & y_1^2 y_3 \\
y_3 z_2^2 & 2 z_1 z_2 y_3 \\
y_1 & y_2
\end{bmatrix},
\qquad
f_y = \begin{bmatrix}
2 y_1 y_3 z_2 & 3 \ln z_1 & y_1^2 z_2 \\
0 & 0 & z_1 z_2^2 \\
z_1 & z_2 & 0
\end{bmatrix}.
\]
Example 1.7 Obtain Fyz and Fzy for F (y, z) specified in Example 1.5.
Since the given F (y, z) is twice continuously differentiable, check also
that Fzy = (Fyz )T .
1.4.5 Miscellany
The norm of an m-component row or column vector z is defined to be
\[
\|z\| = \sqrt{z_1^2 + \cdots + z_m^2}. \tag{1.17}
\]
The norm of a vector is commonly used to define a neighborhood Nz0 of
a point, e.g.,
\[
N_{z^0} = \{ z \mid \|z - z^0\| < \varepsilon \}, \tag{1.18}
\]
where ε > 0 is a small positive real number.
The most common use of this notation will be to collect higher order
terms in a series expansion.
In the continuous-time models discussed in this book, we generally
will use x(t) to denote the state (column) vector, u(t) to denote the
control (column) vector, and λ(t) to denote the adjoint (row) vector.
Whenever there is no possibility of confusion, we will suppress the time
indicator (t) from these vectors and write them as x, u, and λ, respec-
tively. When talking about optimal state and control vectors, we put an
asterisk “∗ ” as a superscript, i.e., as x∗ and u∗ , respectively, whereas u
will refer to an admissible control with x as the corresponding state. No
asterisk, however, needs to be put on the adjoint vector λ as it is only
defined along an optimal path.
Thus, the values of the control, state and adjoint variables at time t
along an optimal path will be written as u∗ (t), x∗ (t), and λ(t). When the
control is expressed in terms of the state, it is called a feedback control.
With an abuse of notation, we will express it as u(x), or u(x, t) if an
explicit time dependence is required. Likewise, the optimal feedback
control will be denoted as u∗ (x) or u∗ (x, t).
We also use the simplified notation \(x^T(t)\) to mean \((x(t))^T\), the transpose
of \(x(t)\). Likewise, for a matrix \(A(t)\), we use \(A^T(t)\) to mean \((A(t))^T\),
the transpose of \(A(t)\), and \(A^{-1}(t)\) to mean \((A(t))^{-1}\), the inverse of
\(A(t)\), when the inverse exists.
The norm of an m-dimensional row or column vector function z(t),
t ∈ [0, T], is defined to be
\[
\|z\| = \left[ \int_0^T \sum_{j=1}^{m} z_j^2(\tau)\, d\tau \right]^{1/2}. \tag{1.19}
\]
\[
x(T^-) = \lim_{\tau \uparrow T} x(\tau) = \lim_{\varepsilon \downarrow 0} x(T - \varepsilon)
\quad \text{and} \quad
x(T^+) = \lim_{\tau \downarrow T} x(\tau) = \lim_{\varepsilon \downarrow 0} x(T + \varepsilon). \tag{1.20}
\]
These limits are illustrated for a function x(t) graphed in Fig. 1.2. Here,
x(0) = 1, x(0+ ) = 2,
x(3− ) = 2, x(3) = 3.
Figure 1.2: Illustration of left and right limits
The word “sat” is short for the word “saturation.” The latter name
comes from an electrical engineering application to saturated amplifiers.
In several applications to be discussed, we will need the concept of
impulse control, which is sometimes needed in cases when an unbounded
control can be applied for a very short time. An example is the adver-
tising model in Table 1.2 when Q = ∞. We apply unbounded control for
a short time in order to cause a jump discontinuity in the state variable.
For the example in Table 1.2, this might mean an intense advertising
campaign (a media blitz) in order to increase advertising goodwill by a
finite amount in a very short time. The impulse function defined be-
low is required to evaluate the integral in the objective function, which
measures the cost of the intense advertising campaign.
Suppose we want to apply an impulse control at time t to change the
state variable from x(t) = x1 to the value x2 “immediately” after t, i.e.,
x(t+ ) = x2 . To compute its contribution to the objective function (1.2),
we use the following procedure: given ε > 0 and a constant control u(ε),
integrate (1.1) from t to t + ε with x(t) = x1 and choose u(ε) so that
x(t + ε) = x2 ; this gives the trajectory x(τ ; ε, u(ε)) for τ ∈ [t, t + ε]. We
can now compute
\[
\operatorname{imp}(x_1, x_2; t) = \lim_{\varepsilon \to 0} \int_t^{t+\varepsilon} F(x, u, \tau)\, d\tau. \tag{1.23}
\]
If there are several instants at which impulses are applied, then this
procedure is easily extended. Examples of the use of (1.24) occur in
Chaps. 5 and 6. We frequently omit t in (1.23) when the impulse function
is independent of t.
\[
\sum_{i=1}^{l} p_i = 1 \quad \text{and} \quad y = \sum_{i=1}^{l} p_i x_i.
\]
Note that a saddle point may not exist, and even if it exists, it may not
be unique. Note also that
(a) Set P (t) = 1000 for 0 ≤ t ≤ 10. Determine whether this control
is feasible; if it is feasible, compute the value J of the objective
function.
(b) If P (t) = 800, show that the terminal constraint is violated and
hence the control is infeasible.
(c) If P (t) = Pmin for 0 ≤ t ≤ 6 and P (t) = Pmax for 6 < t ≤ 10,
show that the control is infeasible because the state constraint is
violated.
E 1.5 Rich Rentier in Example 1.3 has initial wealth W0 = $1, 000, 000.
Assume B = 0, ρ = 0.1, r = 0.15, and assume that Rich expects to live
for exactly 20 years.
(a) What is the maximum constant consumption level that Rich can
afford during his remaining life?
(b) If Rich’s utility function is U (C) = ln C, what is the present value
of the total utility in part (a)?
(c) Suppose Rich sets aside $100,000 to start the Rentier Foundation.
What is the maximum constant grant level that the foundation can
support if it is to last forever?
(b) If Rich (no longer a rentier) consumes at the constant rate found in
Exercise 1.5(a), find his terminal wealth and his new total utility.
E 1.7 Consider the following educational policy question. Let S(t) de-
note the total number of scientists at time t, and let δ be the retirement
rate of scientists. Let E(t) be the number of teaching scientists and R(t)
be the number of research scientists, so that S(t) = E(t) + R(t). Assume
γE(t) is the number of newly graduated scientists at time t, of which
the policy allocates uγE(t) to the pool of teachers, where 0 ≤ u ≤ 1.
The remaining graduates are added to the pool of researchers. The gov-
ernment has a target of maximizing the function αE(T ) + βR(T ) at a
given future time T, where α and β are positive constants. Formulate
the optimal control problem for the government.
E 1.8 For F (x, y) defined in Example 1.5, obtain the matrices Fxx and
Fyy .
Hint: Set the gradient Fx = g, a row vector, and then use Exer-
cise 1.9 to derive the first equality. Note in connection with the second
equality that the function F being twice continuously differentiable
implies that Fxx = (Fxx )T .
E 1.12 Use the bang function defined in (1.21) to sketch the optimal
control
u∗ (t) = bang[−1, 1; W (t)] for 0 ≤ t ≤ 5,
when
(a) W (t) = t − 2
(b) W (t) = t2 − 4t + 3
E 1.13 Use the sat function defined in (1.22) to sketch the optimal con-
trol
u∗ (t) = sat[2, 3; W (t)] for 0 ≤ t ≤ 5,
when
(a) W (t) = 4 − t
(b) W (t) = 2 + t2
2.1.2 Constraints
In this chapter, we are concerned with problems of types (1.4) and (1.5)
that do not have state constraints. Such constraints are considered in
Chaps. 3 and 4, as indicated in Sect. 1.1. We do impose constraints of
type (1.3) on the control variables. We define an admissible control to
be a control trajectory u(t), t ∈ [0, T ], which is piecewise continuous and
satisfies, in addition,
u(t) ∈ Ω(t) ⊂ E m , t ∈ [0, T ]. (2.2)
\[
\begin{cases}
\max\limits_{u(t) \in \Omega(t)} \left\{ J = \int_0^T F(x, u, t)\, dt + S[x(T), T] \right\} \\[4pt]
\text{subject to} \\[2pt]
\dot{x} = f(x, u, t), \quad x(0) = x_0.
\end{cases} \tag{2.4}
\]
over u(t) ∈ Ω(t), subject to (2.7). Of course, the price paid for going
from Bolza to linear Mayer form is an additional state variable and its
associated differential equation (2.6). Also, for the function fn+1 to be
continuously differentiable, in keeping with the assumptions made in
Sect. 2.1.1, we need to assume that the salvage value function S(x, t) is
twice continuously differentiable.
Exercise 2.5 presents the task of showing in a similar way that the
Lagrange and Mayer forms can also be reduced to the linear Mayer
form.
subject to
ẋ = u, x(0) = x0 .
Thus, the linear Mayer form version with the two-dimensional state y =
(x, y2 ) can be stated as
max {J = y2 (T )}
subject to
ẋ = u, x(0) = x0 ,
\[
\dot{y}_2 = x - \frac{u^2}{2} + \frac{1}{2} xu, \quad y_2(0) = \frac{1}{4} x_0^2.
\]
In Sect. 2.2, we derive necessary conditions for optimal control in the
form of the maximum principle, and in Sect. 2.4 we derive sufficient con-
ditions. In these derivations, we shall assume the existence of an optimal
control, while providing references where needed, as the topic of existence
is beyond the scope of this book. In any particular application, however,
the existence of a solution will be demonstrated by actually finding a
solution that satisfies both the necessary and the sufficient conditions
for optimality. We thus avoid the necessity of having to prove general
existence theorems, which require advanced and difficult mathematics.
Nevertheless, interested readers can consult Hartl et al. (1995) and Seier-
stad and Sydsæter (1987) for brief discussions of existence results and
references therein including Cesari (1983).
where for \(s \geq t\),
\[
\frac{dx(s)}{ds} = f(x(s), u(s), s), \quad x(t) = x.
\]
We initially assume that the value function V (x, t) exists for all x and t
in the relevant ranges. Later we will make additional assumptions about
the function V (x, t).
Bellman (1957) in his book on dynamic programming states the prin-
ciple of optimality as follows:
Figure 2.1: An optimal path and the value function V(x, t)
V [x(t + δt), t + δt] = V (x, t) + [Vx (x, t)ẋ + Vt (x, t)]δt + o(δt), (2.12)
Substituting for ẋ from (2.1) in the above equation and then using it
in (2.11), we obtain
\[
0 = \max_{u \in \Omega(t)} \left\{ F(x, u, t) + V_x(x, t) f(x, u, t) + V_t(x, t) \right\} + \frac{o(\delta t)}{\delta t}. \tag{2.14}
\]
This boundary condition follows from the fact that the value function at
t = T is simply the salvage value function.
The components of the vector Vx (x, t) can be interpreted as the
marginal contributions of the state variables x to the value function
or the maximized objective function (2.9). We denote the marginal re-
turn vector (along the optimal path x∗ (t)) by the adjoint (row) vector
λ(t) ∈ E^n, i.e.,
\[
\lambda(t) = V_x(x^*(t), t). \tag{2.17}
\]
From the preceding remark, we can interpret λ(t) as the per unit change
in the objective function value for a small change in x∗ (t) at time t. In
other words, λ(t) is the highest hypothetical unit price which a rational
decision maker would be willing to pay for an infinitesimal addition to
x∗ (t). See Sect. 2.2.4 for further discussion.
Next we introduce a function H : E n × E m × E n × E 1 → E 1 called
the Hamiltonian
Remark 2.1 We use u∗ and x∗ for optimal control and state to distin-
guish them from an admissible control u and the corresponding state x,
respectively. However, since the adjoint variable λ is defined only along
the optimal path, there is no need for such a distinction, and therefore
we do not use the superscript ∗ on λ.
\[
\frac{dV_x}{dt} = (V_{xx} f)^T + V_{tx}. \tag{2.26}
\]
and
\[
V_{xx}\dot{x} = \begin{pmatrix}
V_{x_1 x_1} & V_{x_1 x_2} & \cdots & V_{x_1 x_n} \\
V_{x_2 x_1} & V_{x_2 x_2} & \cdots & V_{x_2 x_n} \\
\vdots & \vdots & & \vdots \\
V_{x_n x_1} & V_{x_n x_2} & \cdots & V_{x_n x_n}
\end{pmatrix}
\begin{pmatrix}
\dot{x}_1 \\ \dot{x}_2 \\ \vdots \\ \dot{x}_n
\end{pmatrix}. \tag{2.27}
\]
Since the terms on the right-hand side of (2.26) are the same as the
last two terms in (2.25), we see that (2.26) becomes
\[
\frac{dV_x}{dt} = -F_x - V_x f_x. \tag{2.28}
\]
Because λ was defined in (2.17) to be Vx , we can rewrite (2.28) as
λ̇ = −Fx − λfx .
To see that the right-hand side of this equation can be written simply as
−Hx , we need to go back to the definition of H in (2.18) and recognize
that when taking the partial derivative of H with respect to x, the adjoint
variables λ are considered to be independent of x. We note further that
along the optimal path, λ is a function of t only. Thus,
λ̇ = −Hx . (2.29)
\[
\lambda(T) = \frac{\partial S(x, T)}{\partial x}\bigg|_{x = x^*(T)} = S_x(x^*(T), T). \tag{2.30}
\]
The adjoint equation (2.29) together with its boundary condition (2.30)
determine the adjoint variables.
This completes our derivation of the maximum principle using dy-
namic programming. We can now summarize the main results in the
following section.
with respect to the state variables. In such cases, when Vx (x∗ (t), t) does
not exist, then (2.17) has no meaning. See Bettiol and Vinter (2010),
Yong and Zhou (1999), and Cernea and Frankowska (2005) for interpre-
tations of the adjoint variables or extensions of (2.17) in such cases.
The first term F (x, u, t)dt represents the direct contribution to J in dol-
lars from time t to t + dt, if the firm is in state x (i.e., it has a capital
stock of x), and we apply control u in the interval [t, t + dt]. The differ-
ential dx = f (x, u, t)dt represents the change in capital stock from time t
to t + dt, when the firm is in state x and control u is applied. Therefore,
the second term λdx represents the value in dollars of the incremental
capital stock dx, and hence can be considered as the indirect contribution
to J in dollars. Thus, Hdt can be interpreted as the total contribution
to J from time t to t + dt when x(t) = x and u(t) = u in the interval
[t, t + dt].
With this interpretation, it is easy to see why the Hamiltonian must
be maximized at each instant of time t. If we were just to maximize
F at each instant t, we would not be maximizing J, because we would
ignore the effect of the control in changing the capital stock, which gives
rise to indirect contributions to J. The maximum principle derives the
adjoint variable λ(t), the price of capital at time t, in such a way that
λ(t)dx is the correct valuation of the indirect contribution to J from
time t to t + dt. As a consequence, the Hamiltonian maximizing problem
can be treated as a static problem at each instant t. In other words, the
maximum principle decouples the dynamic maximization problem (2.4)
in the interval [0, T ] into a set of static maximization problems associated
with instants t in [0, T ]. Thus, the Hamiltonian can be interpreted as a
surrogate profit rate to be maximized at each instant of time t.
The value of λ to be used in the maximum principle is given by (2.29)
and (2.30), i.e.,
\[
\dot{\lambda} = -\frac{\partial H}{\partial x} = -\frac{\partial F}{\partial x} - \lambda \frac{\partial f}{\partial x}, \quad \lambda(T) = S_x(x(T), T).
\]
Rewriting the first equation as \(-d\lambda = H_x\, dt\),
we can observe that along the optimal path, −dλ, the negative of the
increase or, in other words, the decrease in the price of capital from t
to t + dt, which can be considered as the marginal cost of holding that
capital, equals the marginal revenue Hx dt of investing the capital. In turn
the marginal revenue Hx dt consists of the sum of the direct marginal
contribution Fx dt and the indirect marginal contribution λfx dt. Thus,
the adjoint equation becomes the equilibrium relation—marginal cost
equals marginal revenue, which is a familiar concept in the economics
literature; see, e.g., Cohen and Cyert (1965, p. 189) or Takayama (1974,
p. 712).
Further insight can be obtained by integrating the above adjoint
equation from t to T as follows:
\[
\lambda(t) = \lambda(T) + \int_t^T H_x(x(\tau), u(\tau), \lambda(\tau), \tau)\, d\tau
= S_x(x(T), T) + \int_t^T H_x\, d\tau.
\]
Note that the price λ(T ) of a unit of capital at time T is its marginal
salvage value Sx (x(T ), T ). In the special case when S ≡ 0, we have
λ(T ) = 0, as clearly no value can be derived or lost from an infinitesimal
increase in x(T ). The price λ(t) of a unit of capital at time t is the sum of
its terminal price λ(T ) plus the integral of the marginal surrogate profit
rate Hx from t to T.
The above interpretations show that the adjoint variables behave
in much the same way as the dual variables in linear (and nonlinear)
programming, with the differences being that here the adjoint variables
are time dependent and satisfy derived differential equations. These
connections will become clearer in Chap. 8, which addresses the discrete
maximum principle.
whose solution is
x∗ (t) = 1 − t for t ∈ [0, 1]. (2.42)
The graphs of the optimal state and adjoint trajectories appear in
Fig. 2.2. Note that the optimal value of the objective function is
J ∗ = −1/2.
Figure 2.2: Optimal state and adjoint trajectories for Example 2.2
In Sect. 2.2.4, we stated that the adjoint variable λ(t) gives the
marginal value per unit increment in the state variable x(t) at time t.
Let us illustrate this claim at time t = 0 with the help of Example 2.2.
Note from (2.40) that λ(0) = −1. Thus, if we increase the initial value
x(0) from 1, by a small amount ε, to a new value 1 + ε, where ε may be
positive or negative, then we expect the optimal value of the objective
function to change from J ∗ = −1/2 to
\[
J^*_{(1+\varepsilon)} = -1/2 + \lambda(0)\varepsilon + o(\varepsilon) = -1/2 - \varepsilon + o(\varepsilon).
\]
Example 2.3 Let us solve the same problem as in Example 2.2 over the
interval [0, 2] so that the objective is:
\[
\max\left\{ J = \int_0^2 -x\, dt \right\}. \tag{2.43}
\]
The dynamics and constraints are (2.34) and (2.35), respectively, as be-
fore. Here we want to minimize the signed area between the horizontal
axis and the trajectory of x(t) for 0 ≤ t ≤ 2.
λ̇ = 1, λ(2) = 0 (2.44)
Figure 2.3: Optimal state and adjoint trajectories for Example 2.3
Example 2.5 Let us rework Example 2.4 with T = 2, i.e., with the objective function:
\[
\max\left\{ J = \int_0^2 -\frac{1}{2} x^2\, dt \right\} \tag{2.53}
\]
subject to the constraints (2.47).
Solution The Hamiltonian is still as in (2.48) and the form of the optimal
policy remains as in (2.49). The adjoint equation is
λ̇ = x, λ(2) = 0,
which is the same as (2.50) except T = 2 instead of T = 1. Let us try to
extend the solution of the previous example from T = 1 to T = 2. Thus,
we keep λ(t) as in (2.52) for t ∈ [0, 1] with λ(1) = 0. If we recall from
the definition of the bang function that bang [−1, 1; 0] is not defined, it
allows us to choose u in (2.49) arbitrarily when λ = 0. This is an instance
of singular control, so let us see if we can maintain the singular control
by choosing u appropriately. To do this we choose u = 0 when λ = 0.
Since λ(1) = 0 we set u(1) = 0 so that from (2.47), we have ẋ(1) = 0.
Now note that if we set u(t) = 0 for t > 1, then by integrating equations
(2.47) and (2.50) forward from t = 1 to t = 2, we see that x(t) = 0
and λ(t) = 0 for 1 < t ≤ 2; in other words, u(t) = 0 maintains singular
control in the interval. Intuitively, this is the correct answer since once
we get x = 0, we should keep it at 0 in order to maximize the objective
and
\[
x^*_{(1+\varepsilon)}(t) = \begin{cases} 1 + \varepsilon - t, & t \in [0, 1+\varepsilon], \\ 0, & t \in (1+\varepsilon, 2], \end{cases}
\]
where we use the subscript to show the dependence of the control and
state trajectories of a problem beginning at time t with the state x(t) =
x. Thus,
\[
V(x, t) = \int_t^{t+x} -\frac{1}{2}\left[x^*_{(x,t)}(s)\right]^2 ds
= -\frac{1}{2} \int_t^{t+x} (x - s + t)^2\, ds.
\]
Furthermore, since
\[
x^*(t) = \begin{cases} 1 - t, & t \in [0, 1], \\ 0, & t \in (1, 2], \end{cases}
\]
we obtain
\[
V_x(x^*(t), t) = \begin{cases} -\dfrac{1}{2} \displaystyle\int_t^1 2(x - s + t)\, ds = -\dfrac{1}{2}t^2 + t - \dfrac{1}{2}, & t \in [0, 1], \\[6pt] 0, & t \in (1, 2], \end{cases}
\]
which equals λ(t) obtained as the adjoint variable in Example 2.5. Note
that for t ∈ [0, 1], λ(t) in Example 2.5 is the same as that in Example 2.4
obtained in (2.52).
Example 2.6 This example is slightly more complicated and the opti-
mal control is not bang-bang. The problem is:
\[
\max\left\{ J = \int_0^2 (2x - 3u - u^2)\, dt \right\} \tag{2.54}
\]
subject to
ẋ = x + u, x(0) = 5 (2.55)
and the control constraint
\[
u \in \Omega = [0, 2]. \tag{2.56}
\]
H = (2x − 3u − u2 ) + λ(x + u)
= (2 + λ)x − (u2 + 3u − λu). (2.57)
Let us find the optimal control policy by differentiating (2.57) with re-
spect to u. Thus,
\[
\frac{\partial H}{\partial u} = -2u - 3 + \lambda = 0,
\]
\[
u^*(t) = \frac{\lambda(t) - 3}{2}, \tag{2.58}
\]
provided this expression stays within the interval Ω = [0, 2]. Note that
the second derivative of H with respect to u is ∂ 2 H/∂u2 = −2 < 0, so
that (2.58) satisfies the second-order condition for the maximum of a
function.
We next derive the adjoint equation as
\[
\dot{\lambda} = -\frac{\partial H}{\partial x} = -2 - \lambda, \quad \lambda(2) = 0. \tag{2.59}
\]
Referring to Appendix A.1, we can use the integrating factor \(e^t\) to obtain
\[
\lambda(t) = 2(e^{2-t} - 1). \tag{2.60}
\]
The graph of u∗(t) appears in Fig. 2.5. In the figure, t₁ satisfies e^{2−t₁} − 2.5 = 2, i.e., t₁ = 2 − ln 4.5 ≈ 0.496, while t₂ satisfies e^{2−t₂} − 2.5 = 0, which gives t₂ = 2 − ln 2.5 ≈ 1.08.
In Exercise 2.2 you will be asked to compute the optimal state tra-
jectory x∗ (t) corresponding to u∗ (t) shown in Fig. 2.5 by piecing together
the solutions of three separate differential equations obtained from (2.55)
and (2.60).
Figure 2.5: Optimal control u∗(t) for Example 2.6
For our proof of the sufficiency of the maximum principle, we also need the derivative \(H^0_x(x, \lambda, t)\), which by use of the Envelope Theorem can be given as
\[
H^0_x(x, \lambda, t) = H_x(x, u^*, \lambda, t) + H_u(x, u^*, \lambda, t)\, \frac{\partial u^*}{\partial x}. \tag{2.64}
\]
To obtain (2.63) from (2.64), we need to show that the second term on the right-hand side of (2.64) vanishes, i.e.,
\[
H_u(x, u^*, \lambda, t)\, \frac{\partial u^*}{\partial x} = 0 \tag{2.65}
\]
for each x. There are two cases to consider. If u∗ is in the interior of
Ω(t), then it satisfies the first-order condition Hu (x, u∗ , λ, t) = 0, thereby
implying (2.65). Otherwise, u∗ is on the boundary of Ω(t). Then, for each
i, j, either Hui = 0 or ∂u∗i /∂xj = 0 or both. Once again, (2.65) holds.
Exercise 2.25 gives a specific instance of this case.
\[
H^0[x(t), \lambda(t), t] \leq H^0[x^*(t), \lambda(t), t] + H^0_x[x^*(t), \lambda(t), t]\,[x(t) - x^*(t)]. \tag{2.67}
\]
Using (2.66), (2.62), and (2.63) in (2.67), and then integrating the resulting inequality over [0, T], we obtain
\[
J(u^*) - J(u) \geq [\lambda(T) - S_x(x^*(T), T)][x(T) - x^*(T)] - \lambda(0)[x(0) - x^*(0)], \tag{2.73}
\]
where J(u) is the value of the objective function associated with a control
u. Since x∗ (0) = x(0) = x0 , the initial condition, and since λ(T ) =
Sx (x∗ (T ), T ) from the terminal adjoint condition in (2.31), we have
J(u∗ ) ≥ J(u). (2.74)
Thus, u∗ is an optimal control. This completes the proof. □
Because λ(t) is not known a priori, it is usual to verify a stronger
assumption, namely, the concavity of the function H⁰(x, λ, t) in x
for every λ and t. Sometimes the stronger condition given in Exercise 2.27
can be used.
Mangasarian (1966) gives a sufficient condition in which the concav-
ity of H 0 (x, λ(t), t) in Theorem 2.1 is replaced by a stronger condition
requiring the Hamiltonian H(x, u, λ(t), t) to be jointly concave in (x, u).
Example 2.7 Let us show that the problems in Examples 2.2 and 2.3
satisfy the sufficient conditions. We have from (2.36) and (2.61),
H 0 = −x + λu∗ ,
where u∗ is given by (2.37). Since u∗ is a function of λ only, H 0 (x, λ, t) is
certainly concave in x for any t and λ (and in particular for λ(t) supplied
by the maximum principle). Since S(x, T ) = 0, the sufficient conditions
hold.
Finally, it is important to mention that thus far in this chapter, we
have considered problems in which the terminal values of the state vari-
ables are not constrained. Such problems are called free-end-point prob-
lems. The problems at the other extreme, where the terminal values of
the state variables are completely specified, are termed fixed-end-point
problems. Then, there are problems in between these two extremes.
While a detailed discussion of terminal conditions on state variables ap-
pears in Sect. 3.4 of the next chapter, it is instructive here to briefly
indicate how the maximum principle needs to be modified in the case
of fixed-end-point problems. Suppose x(T ) is completely specified, i.e.,
Thus, the TPBVP is given by the system of equations (2.77) and (2.78).
A simple method to solve the TPBVP uses what is known as the
shooting method, explained in the flowchart in Fig. 2.6.
Figure 2.6: Flowchart for the shooting method
Here we have entered the right-hand side of the difference equation (2.80)
for t = 0 in cell A2 and the right-hand side of the difference equation
(2.79) for t = 0 in cell B2. Note that λ(0) = −0.2 shown as the entry
−0.2 in cell A1 is merely a guess. The correct value will be determined
by the use of the GOAL SEEK function.
Next highlight cells A2 and B2 and drag the combination down to
row 101 of the spreadsheet. Using EDIT in the menu bar, select FILL
DOWN. Thus, Excel will solve Eqs. (2.80) and (2.79) from t = 0 to t = 1
in steps of Δt = 0.01, and that solution will appear as entries in columns
A and B of the spreadsheet, respectively. In other words, the guessed
solution for λ(t) will appear in cells A1 to A101 and the corresponding
solution for x(t) will appear in cells B1 to B101. To find the correct value
for λ(0), use the GOAL SEEK function under TOOLS in the menu bar
and make the following entries: Set cell: A101, To value: 0, By changing cell: A1.
It finds the correct initial value for the adjoint variable as λ(0) =
−0.10437, which should appear in cell A1, and the correct ending value
of the state variable as x(1) = 0.62395, which should appear in cell B101.
You will notice that the entry in cell A101 may not be exactly zero as
instructed, although it will be very close to it. In our example, it is
−0.0007. By using the CHART function, the graphs of x∗ (t) and λ(t)
can be printed out by Excel as shown in Fig. 2.7.
E 2.2 Complete Example 2.6 by writing the optimal x∗ (t) in the form
of integrals over the three intervals (0, t1 ), (t1 , t2 ), and (t2 , 2) shown in
Fig. 2.5.
E 2.3 Find the optimal solution for Example 2.1 with x0 = 0 and T = 1.
E 2.5 Show that both the Lagrange and Mayer forms of the optimal
control problem can be reduced to the linear Mayer form (2.5).
E 2.6 Show that the optimal control obtained from the application of
the maximum principle satisfies the principle of optimality: if u∗(t) is an
optimal control and x∗(t) is the corresponding optimal path for 0 ≤ t ≤ T
with x(0) = x0, then u∗(t) for τ ≤ t ≤ T satisfies the maximum principle
for the problem beginning at time τ with the initial condition x(τ) = x∗(τ).
E 2.8 In Example 2.4, show that in view of (2.47) any λ(t), t ∈ [0, 1],
that satisfies (2.50) must be nonnegative.
E 2.11 In Example 2.6, verify by direct calculation that with a new initial
x(0) = 5 + ε with ε small, the objective function value will change by
λ(0)ε + o(ε) = 2(e² − 1)ε + o(ε).
E 2.12 Obtain the value function V (x, t) explicitly in Example 2.4 and
verify the relation Vx (x∗ (t), t) = λ(t) for the example by showing that
Vx (1 − t, t) = −(1/2)t2 + t − 1/2.
E 2.13 Obtain the value function V (x, t) explicitly in Example 2.5 for
every x ∈ E 1 and t ∈ [0, 2].
Hint: You need to deal with the following cases for t ∈ [0, 2]:
(i) 0 ≤ x ≤ 2 − t,
(ii) x > 2 − t,
(iii) t − 2 ≤ x < 0, and
(iv) x < t − 2.
E 2.14 Obtain V(x, t) in Example 2.6 for small positive and negative x
for t ∈ [t₂, 2]. Then, show that Vx(x, t) = 2(e^{2−t} − 1), t ∈ [t₂, 2], is the
same as λ(t), t ∈ [t₂, 2], obtained in Example 2.6.
subject to
ẋ = u, x(0) = x0 ,
u ∈ [0, 1],
for optimal control and optimal state trajectory. Verify that your solu-
tion is optimal by using the maximum principle sufficiency condition.
ẋ = 1 − u2 , x(0) = 1;
E 2.17 Use the maximum principle to solve the following problem given
in the Mayer form:
max[8x1 (18) + 4x2 (18)]
subject to
ẋ1 = x1 + x2 + u, x1 (0) = 15,
ẋ2 = 2x1 − u, x2 (0) = 20,
and the control constraint
0 ≤ u ≤ 1.
Hint: Use the method in Appendix A to solve the simultaneous differ-
ential equations.
E 2.18 In Fig. 2.8, a water reservoir being used for the purpose of fire-
fighting is leaking, and its water height x(t) is governed by
J = 5x(100),
subject to
ẋ = f (x) + b(x)u, x(0) = x0 , x(T ) = 0,
where functions g ≥ 0, f, and b are assumed to be continuously differen-
tiable. Derive the two-point boundary value problem (TPBVP) satisfied
by the optimal state and control trajectories.
where π > 0 with πx representing the profit rate when the machine state
is x, u2 /2 is the cost of maintaining the machine at rate u, ρ > 0 is the
discount rate, T is the time horizon, and S > 0 is the salvage value of
the machine for each unit of the machine state at time T. Furthermore,
show that the optimal maintenance rate decreases, increases, or remains
constant over time depending on whether the difference S − π/(ρ + δ) is
negative, positive, or zero, respectively.
subject to
K̇(t) = u(t)K(t), K(0) = K0 , K(T ) free, 0 ≤ u(t) ≤ 1, 0 ≤ t ≤ T.
Assume T > 1 and 0 < K0 < 1 − e^{1−T}. Obtain explicitly the optimal
investment allocation u∗(t), optimal capital K∗(t), and the adjoint variable
λ(t), 0 ≤ t ≤ T.
E 2.24 The rate at which a new product can be sold at any time t
is f (p(t))g(Q(t)) where p is the price and Q is cumulative sales. We
assume f′(p) < 0; sales vary inversely with price. Also g′(Q) ≷ 0 for
Q ≶ Q1, respectively, where Q1 > 0 is a constant known as the saturation
level. For a given price, current sales grow with past sales in the early
stages as people learn about the good from past purchasers. But as
cumulative sales increase, there is a decline in the number of people who
have not yet purchased the good. Eventually the sales rate for any given
price falls, as the market becomes saturated. The unit production cost c
may be constant or may decline with cumulative sales if the firm learns
how to produce less expensively with experience: c = c(Q), c′(Q) ≤ 0.
Formulate and solve the optimal control problem in order to characterize
the price policy p(t), 0 ≤ t ≤ T, that maximizes profits from this new
“fad” over a fixed horizon T. Specifically, show that in marketing a new
product, its optimal price rises while the market expands to its saturation
level and falls as the market matures beyond the saturation level.
(b) Find the rate of change of optimal consumption over time and
conclude that consumption remains constant when r = ρ, increases
when r > ρ, and decreases when r < ρ.
(a) Formulate the TPBVP (2.32) and its discrete version for the prob-
lem in Example 2.8, but with a new initial condition x(0) = 1.
(b) Solve the discrete version of the TPBVP by using Excel.
subject to
ẋ(t) = u(t), x(0) = 1, x(2) = 0,
−a ≤ u(t) ≤ b, a > 1/2, b > 0.
Obtain optimal x∗ (t), u∗ (t), and all required multipliers.
Chapter 3
a(x(T ), T ) ≥ 0, (3.4)
b(x(T ), T ) = 0, (3.5)
where a : E n × E 1 → E la and b : E n × E 1 → E lb are continuously
differentiable in all their arguments. Clearly, a and b are not functions
of T, if T is a given fixed number. In the specific cases when T is
given, the terminal state constraints will be written as a(x(T )) ≥ 0 and
b(x(T)) = 0. An important special case of (3.4) is x(T) ≥ k.
We can now define a control u(t), t ∈ [0, T], or simply u, to be admissible
if it is piecewise continuous and if, together with its corresponding
state trajectory x(t), t ∈ [0, T ], it satisfies the constraints (3.3), (3.4),
and (3.5).
At times we may find terminal inequality constraints given as
where Y (T ) is a convex set and X(T ) is the set of all feasible terminal
states, also called the reachable set from the initial state x0 , i.e.,
Remark 3.1 The feasible set defined by (3.4) and (3.5) need not be
convex. Thus, if the convex set Y (T ) can be expressed by a finite number
of inequalities a(x(T ), T ) ≥ 0 and equalities b(x(T ), T ) = 0, then (3.6)
becomes a special case of (3.4) and (3.5). In general, (3.6) is not a special
case of (3.4) and (3.5), since it may not be possible to define a given Y (T )
by a finite number of inequalities and equalities.
In this book, we will only deal with problems in which the following
full-rank conditions hold. That is,
rank[∂g/∂u, diag(g)] = q
holds for all arguments x(t), u(t), t, that could arise along an optimal
solution, and
\[
\operatorname{rank}\begin{bmatrix} \partial a/\partial x & \operatorname{diag}(a) \\ \partial b/\partial x & 0 \end{bmatrix} = l_a + l_b
\]
hold for all possible values of x(T ) and T. The first of these condi-
tions means that the gradients with respect to u of all active constraints
in (3.3) must be linearly independent. Similarly, the second condition
means that the gradients with respect to x of the equality constraints
(3.5) and of the active inequality constraints in (3.4) must be linearly
independent. These conditions are also referred to as the constraint qual-
ifications. In cases when these do not hold, see Seierstad and Sydsæter
(1987) for details on weaker constraint qualifications.
Before proceeding further, let us recapitulate the optimal control
problem under consideration in this chapter:
\[
\begin{cases}
\max\left\{ J = \int_0^T F(x, u, t)\, dt + S[x(T), T] \right\}, \\[4pt]
\text{subject to} \\[2pt]
\dot{x} = f(x, u, t), \quad x(0) = x_0, \\[2pt]
g(x, u, t) \geq 0, \\[2pt]
a(x(T), T) \geq 0, \\[2pt]
b(x(T), T) = 0.
\end{cases} \tag{3.7}
\]
λ̇ = −Lx (x∗ , u∗ , λ, μ, t)
α ≥ 0, αa(x∗ (T ), T ) = 0,
g[x∗ (t), u, t] ≥ 0,
In the case of the terminal constraint (3.6), note that the terminal
conditions on the state and the adjoint variables in (3.12) will be re-
placed, respectively, by
x∗ (T ) ∈ Y (T ) ⊂ X(T ) (3.13)
and
In Exercise 3.5, you are asked to derive (3.14) from (3.12) in the one
dimensional case when Y (T ) = Y = [x, x̄] for each T > 0, where x and
x̄ are two constants such that x̄ > x.
In the case when the terminal time T ≥ 0 in the problem (3.10) is
also a decision variable, there is an additional necessary transversality
condition for T ∗ to be optimal, namely,
subject to
ẋ = u, x(0) = 1, (3.16)
u ≥ 0, x − u ≥ 0. (3.17)
Note that constraints (3.17) are of the mixed type (3.3). They can also
be rewritten as 0 ≤ u ≤ x.
H = u + λu = (1 + λ)u,
To get the adjoint equation and the multipliers associated with con-
straints (3.17), we form the Lagrangian:
L = H + μ1 u + μ2 (x − u) = μ2 x + (1 + λ + μ1 − μ2 )u.
μ1 ≥ 0, μ1 u = 0, (3.21)
μ2 ≥ 0, μ2 (x − u) = 0. (3.22)
μ2 = 1 + λ.
Remark 3.4 It should be pointed out that if the set Y in (3.6) consists
of a single point Y = {k}, making the problem a fixed-end-point problem,
then the transversality condition reduces to simply requiring λ(T) to equal
some constant β, to be determined by solving the resulting boundary value problem.
Remark 3.7 In the case when the terminal constraint (3.4) or (3.5) is
binding, the transversality condition λ(T ) in (3.12) should be viewed as
the left-hand limit, limt↑T λ(t), sometimes written as λ(T − ), and then
we would express λ(T ) = Sx (x∗ (T ), T ). However, the standard practice
for problems treated in Chaps. 2 and 3 is to use the notation that we
have used. Nevertheless, care should be exercised in distinguishing the
marginal value of the state at time T given by Sx (x∗ (T ), T ) and the
shadow prices for the terminal constraints (3.4) and (3.5) given by α and
β, respectively. See Sect. 3.4 and Example 3.4 for further elaboration.
Remark 3.9 In the case when the problem (3.7) is changed by inter-
changing x(T ) and x(0) so that the initial condition x(0) = x0 is re-
placed by x(T ) = xT , and S(x(T ), T ), a(x(T ), T ) and b(x(T ), T ) are
where we assume the discount rate ρ > 0. We should also mention that
if F (x, u, t) = φ(x, u, t)e−ρt and S(x, T ) = ψ(x, T )e−ρT , then there is no
advantage of developing a current-value version of the maximum princi-
ple, and it is recommended that the present-value formulation be used
in this case.
Now, the objective in problem (3.7) can be written as:
\[
\max\left\{ J = \int_0^T \phi(x, u)\, e^{-\rho t}\, dt + \psi[x(T)]\, e^{-\rho T} \right\}. \tag{3.28}
\]
make the distinction explicitly since we will either be using the present-
value definitions or the current-value definitions of these functions. The
reader will always be able to tell what is meant from the context.
We now define the current-value Hamiltonian
The first term on the right-hand side of (3.39) is simply ρλ using the
definition in (3.37). To simplify the second term we use the differential
equation (3.31) for \(\lambda^{pv}\) and the fact that \(L_x = e^{\rho t} L_x^{pv}\) from (3.38). Thus,
\[
\dot{\lambda} = \rho\lambda - L_x,
\]
μ ≥ 0, μg = 0, α ≥ 0, and αa = 0
a(x∗ (T ), T ) ≥ 0, b(x∗ (T ), T ) = 0,
α ≥ 0, αa(x∗ (T ), T ) = 0,
g[x∗ (t), u, t] ≥ 0,
\[
\frac{\partial L}{\partial u}\bigg|_{u = u^*(t)} = 0,
\]
and the complementary slackness conditions hold.
You are asked in Exercise 3.8 to show that (3.44) is the current-value
version of (3.15) under the relation (3.27). Furthermore, show how (3.44)
should be modified if S(x, T ) = ψ(x, T )e−ρT in (3.27).
As for the sufficiency conditions for the current-value formulation,
one can simply use Theorem 3.1 as if it were stated for the current-value
formulation.
Ẇ = rW − C, W (0) = W0 , W (T ) = 0,
\[
\dot{\lambda} = \rho\lambda - \frac{\partial H}{\partial W} = (\rho - r)\lambda, \quad \lambda(T) = \beta, \tag{3.46}
\]
where β is some constant to be determined. The solution of (3.46) is
\[
\lambda(t) = \beta e^{(\rho - r)(t - T)}. \tag{3.47}
\]
\[
\frac{\partial H}{\partial C} = \frac{1}{C} - \lambda = 0,
\]
which implies
\[
C^*(t) = \frac{1}{\lambda(t)} = \frac{1}{\beta}\, e^{(\rho - r)(T - t)}. \tag{3.48}
\]
Using this consumption level in the wealth dynamics gives
\[
\dot{W}(t) = rW(t) - \frac{1}{\beta}\, e^{(\rho - r)(T - t)}, \quad W(0) = W_0,
\]
This gives the marginal value per unit increase in wealth at time t in
time-t dollars. In Exercise 2.29(a), the standard adjoint variable was
\(\lambda^{pv}(t) = e^{-rt}(1 - e^{-\rho T})/\rho W_0\), which can be written as \(\lambda^{pv}(t) = e^{-\rho t}\lambda(t)\).
Thus, it is clear that \(\lambda^{pv}(t)\) expresses the same marginal value in time-zero dollars. In particular,
\[
dJ^*/dW_0 = (1 - e^{-\rho T})/\rho W_0 = \lambda(0) = \lambda^{pv}(0)
\]
gives the marginal value per unit increase in the initial wealth W₀.
In Exercise 3.11, you are asked to formulate and solve a consumption
problem of an economy. The problem is a linear version of the famous
Ramsey model; see Ramsey (1928) and Feichtinger and Hartl (1986, p.
201).
Before concluding this section on the current-value formulation, let
us also provide the current-value version of the HJB equation (2.15)
or (2.19) along with the terminal condition (2.16). As in (2.9), we now
define the value function for the problem (3.7), with its objective function
replaced by (3.28), as follows:
\[
V(x, t) = \max_{\{u \mid g(x,u,t) \geq 0\}} \left\{ \int_t^T e^{-\rho(s-t)}\, \phi(x(s), u(s))\, ds + e^{-\rho(T-t)}\, \psi(x(T)) \right\},
\]
and the current-value HJB equation takes the form
\[
\rho V(x, t) = \max_{\{u \mid g(x,u,t) \geq 0\}} \left\{ H(x, u, V_x, t) + V_t \right\}, \tag{3.55}
\]
where H is defined as in (3.35).
Finally, we can write the terminal condition as
\[
V(x, T) = \begin{cases} \psi(x), & \text{if } a(x, T) \geq 0 \text{ and } b(x, T) = 0, \\ -\infty, & \text{otherwise.} \end{cases} \tag{3.56}
\]
x(T ) ∈ X(T ).
Case 2: Fixed-end point. In this case, which is the other extreme from
the free-end-point case, the terminal constraint is
b(x(T ), T ) = x(T ) − k = 0,
and the terminal conditions in (3.42) do not provide any information for
λ(T ). However, as mentioned in Remark 3.4 and recalled subsequently
in connection with (3.42), λ(T ) will be some constant β, which will be
determined by solving the boundary value problem, where the system
of differential equations consists of the state equations with both initial
and terminal conditions and the adjoint equations with no boundary
conditions. This condition is repeated in Table 3.1, Row 2. Example 3.2
solved in the previous section illustrates this case.
Case 3: Lower bound. Here we restrict the ending value of the state
variable to be bounded from below, namely,
a(x(T ), T ) = x(T ) − k ≥ 0,
and
\[
\{\lambda(T) - \psi_x[x^*(T)]\}\{x^*(T) - k\} = 0, \tag{3.59}
\]
with the recognition that the shadow price of the inequality constraint (3.4) is
\[
\alpha = \lambda(T) - \psi_x[x^*(T)] \geq 0. \tag{3.60}
\]
For ψ(x) ≡ 0, these terminal conditions can be written as
\[
\lambda(T) \geq 0, \quad \lambda(T)[x^*(T) - k] = 0. \tag{3.61}
\]
Case 4: Upper bound. Similarly, when the ending value of the state
variable is bounded from above, i.e., when the terminal constraint is
k − x(T) ≥ 0,
the terminal conditions become
\[
\lambda(T) \leq \psi_x[x^*(T)] \tag{3.62}
\]
and (3.59). These are repeated in Table 3.1, Row 4. Furthermore, (3.62)
can be related to the condition on λ(T ) in (3.42) by setting
x(T ) ∈ Y (T ) ⊂ X(T ),
denotes his utility of leaving wealth W (T ) to his heirs upon death. Then,
the problem is:
\[
\max_{C(t) \geq 0} \left\{ J = \int_0^T e^{-\rho t} \ln C(t)\, dt + e^{-\rho T} B\, W(T) \right\} \tag{3.66}
\]
Table 3.1: Summary of the transversality conditions
Note 1. In Table 3.1, x(T ) denotes the (column) vector of n state variables and λ(T )
denotes the (row) vector of n adjoint variables at the terminal time T ; X(T ) ⊂ E n
denotes the reachable set of terminal states obtained by using all possible admissible
controls; and ψ : E n → E 1 denotes the salvage value function
Note 2. Table 3.1 will provide transversality conditions for the standard Hamiltonian
formulation if we replace ψ with S, and reinterpret λ as being the standard adjoint
variable everywhere in the table. Also (3.15) is the standard form of (3.44)
In case (i), the solution of the problem is the same as that of Exam-
ple 3.2, because by setting λ(T ) = β and recalling that W ∗ (T ) = 0 in
that example, it follows that (3.68) holds.
In case (ii), we set λ(T ) = B. Then, by using B in place of β in
(3.47)–(3.49), we get \(\lambda(t) = Be^{(\rho-r)(t-T)}\), \(C^*(t) = (1/B)e^{(\rho-r)(T-t)}\), and
\[
W^*(t) = e^{rt}\left[ W_0 - \frac{e^{(\rho - r)T}(1 - e^{-\rho t})}{\rho B} \right]. \tag{3.69}
\]
Since β < B, we can see from (3.49) and (3.69) that the wealth level
in case (ii) is larger than that in case (i) at t ∈ (0, T ]. Furthermore, the
amount of bequest is
\[
W^*(T) = W_0 e^{rT} - \frac{e^{\rho T} - 1}{\rho B} > 0.
\]
Note that (3.68) holds for case (ii). Also, if we had used (3.42) instead
of Table 3.1, Row 3, we would have λ(T ) = B + α, α ≥ 0, αW ∗ (T ) = 0,
equivalently, in place of (3.68). It is easy to see that α = β − B in case
(i) and α = 0 in case (ii).
subject to
ẋ = u, x(0) = 1, x(2) ≥ 0, (3.70)
− 1 ≤ u ≤ 1. (3.71)
H = −x + λu.
Here, we do not need to introduce the Lagrange multipliers for the con-
trol constraints (3.71), since we can easily deduce that the Hamiltonian
maximizing control has the form
\[
u^*(t) = \text{bang}[-1, 1; \lambda(t)]. \tag{3.72}
\]
The adjoint equation is \(\dot{\lambda} = -H_x = 1\), with the terminal conditions
\[
\lambda(2) \geq 0, \quad \lambda(2)\, x^*(2) = 0, \tag{3.74}
\]
obtained from (3.61) or from Table 3.1, Row 3. Since λ(t) is monotoni-
cally increasing, the control (3.72) can switch at most once, and it can
only switch from u∗ = −1 to u∗ = 1. Let the switching time be t∗ ≤ 2.
Then the optimal control is
\[
u^*(t) = \begin{cases} -1 & \text{for } 0 \leq t \leq t^*, \\ +1 & \text{for } t^* < t \leq 2. \end{cases} \tag{3.75}
\]
λ(t) = t − t∗ .
There are two cases: (i) t∗ < 2 and (ii) t∗ = 2. We analyze case (i) first.
Here λ(2) = 2 − t∗ > 0; therefore from (3.74), x(2) = 0. Solving for x(t)
with u∗ (t) given in (3.75), we obtain
\[
x(t) = \begin{cases} 1 - t & \text{for } 0 \leq t \leq t^*, \\ (t - t^*) + x(t^*) = t + 1 - 2t^* & \text{for } t^* < t \leq 2. \end{cases}
\]
x(2) = 3 − 2t∗ = 0,
which makes t∗ = 3/2. Since this satisfies t∗ < 2, we do not have to deal
with case (ii), and we have
\[
x^*(t) = \begin{cases} 1 - t & \text{for } 0 \leq t \leq 3/2, \\ t - 2 & \text{for } 3/2 < t \leq 2, \end{cases}
\qquad \lambda(t) = t - \frac{3}{2}.
\]
Figure 3.1 shows the optimal state and adjoint trajectories. Using the
optimal state trajectory in the objective function, we can obtain its op-
timal value J ∗ = −1/4.
In Exercise 3.15, you are asked to consider case (ii) by setting t∗ = 2,
and show that the maximum principle will not be satisfied in this case.
and
\[
x^*_{(x,t)}(s) = \begin{cases} x + t - s, & s \in \left[t, \tfrac{1}{2}(x + t) + 1\right), \\ s - 2, & s \in \left[\tfrac{1}{2}(x + t) + 1, 2\right]. \end{cases}
\]
Then for x ≥ t − 2,
\[
V(x, t) = \int_t^2 -x^*_{(x,t)}(s)\, ds
= -\int_t^{\frac{1}{2}(x+t)+1} (x + t - s)\, ds - \int_{\frac{1}{2}(x+t)+1}^{2} (s - 2)\, ds.
\]
The value function V(x, t) is not differentiable at x∗(t), and since Vx(x∗(t), t) does
not exist for t ∈ (3/2, 2], (2.17) has no meaning; see Remark 2.2.
It is possible, however, to provide an economic meaning for λ(2). In
Exercise 3.17, you are asked to rework Example 3.4 with the terminal
condition x(2) ≥ 0 replaced by x(2) ≥ ε, where ε is small. Furthermore,
Figure 3.1: Optimal state x∗(t) and adjoint λ(t) trajectories for Example 3.4
Let us begin with a special case of the condition (3.15) for the simple
problem (2.4) when T ≥ 0 is a decision variable. When compared with
the problem (3.7), the simple problem is without the mixed constraints
and constraints at the terminal time T. Thus the transversality condition
(3.15) reduces to
H[x∗ (T ∗ ), u∗ (T ∗ ), λ(T ∗ ), T ∗ ] + ST [x∗ (T ∗ ), T ∗ ] = 0. (3.77)
This condition along with the Maximum Principle (2.31) with T replaced
by T ∗ give us the necessary conditions for the optimality of T ∗ and
u∗ (t), t ∈ [0, T ∗ ] for the simple problem (2.4) when T ≥ 0 is also a
decision variable.
An intuitively appealing way to check if the optimal T ∗ ∈ (0, ∞)
must satisfy (3.77) is to solve the problem (2.4) with the terminal time
T ∗ with u∗ (t), t ∈ [0, T ∗ ] as the optimal control trajectory, and then show
that the first-order condition for T ∗ to maximize the objective function
in a neighborhood (T ∗ − δ, T ∗ + δ) of T ∗ with δ > 0 leads to (3.77).
For this, let us set u∗ (t) = u∗ (T ∗ ), t ∈ [T ∗ , T ∗ + δ), so that we have a
control u∗ (t) that is feasible for (2.4) for any T ∈ (T ∗ − δ, T ∗ + δ), as
well as continuous at T ∗ . Let x∗ (t), t ∈ [0, T ∗ + δ] be the corresponding
state trajectory. With these we can obtain the corresponding objective
function value
\[
J(T) = \int_0^T F(x^*(t), u^*(t), t)\, dt + S(x^*(T), T), \quad T \in (T^* - \delta, T^* + \delta), \tag{3.78}
\]
which, in particular, represents the optimal value of the objective func-
tion for the problem (2.4) when T = T ∗ . Furthermore, since u∗ (t) is
continuous at T ∗ , x∗ (t) is continuously differentiable there, and so is
J(T ). In this case, since T ∗ is optimal, it must satisfy
\[
J'(T^*) := \frac{dJ(T)}{dT}\bigg|_{T = T^*} = 0. \tag{3.79}
\]
Otherwise, we would have either J′(T∗) > 0 or J′(T∗) < 0. The former
situation would allow us to find a T ∈ (T∗, T∗ + δ) for which J(T) >
J(T∗), and T∗ could not be optimal since the choice of an optimal control
for (2.4) defined on the interval [0, T] would only improve the value of
the objective function. Likewise, the latter situation would allow us to
find a T ∈ (T∗ − δ, T∗) for which J(T) > J(T∗). By taking the derivative
of (3.78), we can write (3.79) as
F (x∗ (T ∗ ), u∗ (T ∗ ), T ∗ ) + Sx [x∗ (T ∗ ), T ∗ ]ẋ∗ (T ∗ ) + ST [x∗ (T ∗ ), T ∗ ] = 0.
(3.80)
subject to
ẋ = −2 + 0.5u, x(0) = 17.5, (3.83)
u ∈ [0, 1], T ≥ 0.
H = x − u + λ(−2 + 0.5u),
λ(t) = 1 + (T − t).
x∗ (T ∗ ) − 2 = 17 − 1.5T ∗ − 2 = 0,
which gives T ∗ = 10. Thus, the optimal solution of the problem is given
by T ∗ = 10 and
u∗ (t) = bang[0, 1; 0.5(9 − t)].
Note that if we had restricted T to be in the interval [T1 , T2 ] = [2, 8],
we would have T ∗ = 8, u∗ (t) = bang[0, 1; 0.5(7 − t)], and x∗ (8) − 2 =
5 − 2 = 3 ≥ 0, which would satisfy (3.81) at T ∗ = T2 = 8. On the other
hand, if T were restricted in the interval [T1 , T2 ] = [11, 15], then T ∗ =
11, u∗ (t) = bang[0, 1; 0.5(10 − t)], and x∗ (11) − 2 = 0.5 − 2 = −1.5 ≤ 0
would satisfy (3.81) at T ∗ = T1 = 11.
Next, we will apply the maximum principle to solve a well-known
time-optimal control problem. It is one of the problems used by Pontrya-
gin et al. (1962) to illustrate applications of the maximum principle.
m d²x(t)/dt² = mẍ(t) = u(t),
where u(t) denotes the external force applied to the train at time t
and ẍ(t) represents the acceleration in miles per minute per minute,
or miles/minute². This equation, along with x(0) = x0 and ẋ(0) = y0, interpreted
respectively as the initial position of the train and its initial velocity in
miles per minute, characterizes its motion completely.
For convenience in further exposition, we may assume m = 1 so that
the equation of motion can be written as
ẍ = u. (3.87)
λ1 = β 1 and λ2 = β 2 + β 1 (T − t),
H + ST |T =T ∗ = λ2 (T ∗ )u∗ (T ∗ ) − 1 = β 2 u∗ (T ∗ ) − 1 = 0,
which together with the bang-bang control policy (3.91) implies either
λ2 (T ∗ ) = β 2 = −1 and u∗ (T ∗ ) = −1,
or
λ2 (T ∗ ) = β 2 = +1 and u∗ (T ∗ ) = +1.
Since the switching function β 2 + β 1 (T ∗ − t) is a linear function of
the time remaining, it can change sign at most once. Therefore, we have
two cases: (i) u∗ (τ ) = −1 in the interval t ≤ τ ≤ T ∗ for some t ≥ 0; (ii)
u∗ (τ ) = +1 in the interval t ≤ τ ≤ T ∗ for some t ≥ 0. We can integrate
(3.88) in each of these cases as shown in Table 3.2. Also in the table we
have the curves Γ− and Γ+ , which are obtained by eliminating t from
the expressions for x and y in each case. The parabolic curves Γ− and
Γ+ are called switching curves and are shown in Fig. 3.2.
It should be noted parenthetically that Fig. 3.2 is different from the
figures we have seen thus far, where the abscissa represented the time
Case (i), u∗ = −1: y(t) = T∗ − t; Γ−: x = −y²/2 for y ≥ 0.
Case (ii), u∗ = +1: y(t) = t − T∗; Γ+: x = y²/2 for y ≤ 0.
dimension. In Fig. 3.2, the abscissa represents the train’s location and
the ordinate represents the train’s velocity. Thus, the point (x0 , y0 )
represents the vector of the train’s initial position and initial velocity.
A trajectory of the train over time can be represented by a curve in
this figure. For example, the bold-faced trajectory beginning at (x0 , y0 )
represents a train that is moving in the positive direction and it is slowing
down. It passes through the main station located at the origin and comes
to a momentary rest at the point that is (y0² + 2x0)/2 miles to the right
of the main station. At this location, the train reverses its direction and
speeds up to reach the location x∗ and attain the velocity of y∗ . At this
point, it slows down gradually until it comes to rest at the main station.
In the ensuing discussion we will show that this trajectory is in fact
the minimal time trajectory beginning at the location x0 at a velocity
of y0 . We will furthermore obtain the control representing the optimal
acceleration and deceleration along the way. Finally, we will obtain the
various instants of interest, which are implicit in the depiction of the
trajectory in Fig. 3.2.
We can put Γ+ and Γ− into a single switching curve Γ as
y = Γ(x) = { Γ+(x) = −√(2x) for x ≥ 0;  Γ−(x) = +√(−2x) for x < 0 }. (3.92)
If the initial state (x0, y0) ≠ (0, 0) lies on the switching curve, then we have
u∗ = +1 (resp., u∗ = −1) if x0 > 0 (resp., x0 < 0); i.e., if (x0, y0) lies on
Γ+ (resp., Γ−). In the common parlance, this means that we apply the
brakes to bring the train to a full stop at the main station. If the initial
state (x0, y0) is not on the switching curve, then we choose, between
u∗ = 1 and u∗ = −1, that which moves the system toward the switching curve.
where the minus sign in the expression for y∗ in (3.94) was chosen since
the intersection occurs when y∗ is negative. The time t∗ that it takes to
reach the switching curve, called the switching time, given that we start
above it, is
t∗ = y0 − y∗ = y0 + √((y0² + 2x0)/2). (3.95)
To find the minimum total time to go from the starting point (x0 , y0 )
to the origin (0,0), we substitute t∗ into the equation for Γ+ in Column
(ii) of Table 3.2; this gives
T∗ = t∗ − y∗ = y0 + √(2(y0² + 2x0)). (3.96)
Here t∗ is the time to get to the switching curve and −y∗ is the time
spent along the switching curve.
Note that the parabola (3.93) intersects the y-axis at the point
(0, +√(2x0 + y0²)) and the x-axis at the point (x0 + y0²/2, 0). This means
that for the initial position (x0, y0) depicted in Fig. 3.2, the train first
passes the main station at the velocity of +√(2x0 + y0²) and comes to a
momentary stop at the distance of (x0 + y0²/2) to the right of the main
station. There it reverses its direction, comes to within the distance of
x∗ from the main station, then switches to u∗ = +1, which slows it to a
complete stop at the main station at time T∗ given by (3.96).
As a numerical example, start at the point (x0, y0) = (1, 1). Then, the
equation of the parabola (3.93) is
2x = 3 − y².
The switching point given by (3.94) is (3/4, −√(3/2)). Finally, from (3.95),
the switching time is t∗ = 1 + √(3/2) min. Substituting into (3.96), we
find the minimum time to stop is T∗ = 1 + √6 min.
To complete the solution of this example let us evaluate β1 and β2,
which are needed to obtain λ1 and λ2. Since (1, 1) is above the switching
curve, the approach to the main station is on the curve Γ+, and therefore,
u∗(T∗) = 1 and β2 = 1. To compute β1, we observe that λ2(t∗) =
β2 + β1(T∗ − t∗) = 0, so that β1 = −β2/(T∗ − t∗) = −1/√(3/2) = −√(2/3).
Finally, we obtain x∗ = 3/4 and y∗ = −√(3/2) from (3.94).
Let us now describe the optimal solution from (1, 1) in the common
parlance. The position (1, 1) means the train is 1 mile to the right of the
main station, moving away from it at the speed of 1 mile per minute.
The control u∗ = −1 means that the brakes are applied to slow the train
down. This action brings the train to a momentary stop at a distance
of 3/2 miles to the right of the main station. Moreover, the continuation
of control u∗ = −1 means the train reverses its direction at that point
and starts speeding toward the station. When it comes to within 3/4
miles to the right of the main station at time t∗ = 1 + √(3/2), its velocity
of −√(3/2), or the speed of √(3/2) miles per minute toward the station, is
too fast to come to a rest at the main station without application of the
brakes. So the control is switched to u∗ = +1 at time t∗, which means
the brakes are applied at that time. This action brings the train to a
complete stop at the main station at the time of T∗ = 1 + √6 min after
the train left its initial position (1, 1).
In Exercises 3.19–3.22, you are asked to work other examples with
different starting points above, below, and on the switching curve. Note
that t∗ = 0 by definition, if the starting point is on the switching
curve.
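The synthesis above is easy to confirm by simulation. The sketch below (a minimal forward-Euler check, not part of the text) applies the feedback rule u = −1 above the switching curve Γ and u = +1 on or below it, starting from (x0, y0) = (1, 1), and recovers t∗ = 1 + √(3/2) and T∗ = 1 + √6 up to discretization error.

    import numpy as np

    def gamma(x):
        # Switching curve (3.92): y = -sqrt(2x) for x >= 0, +sqrt(-2x) for x < 0.
        return -np.sqrt(2.0 * x) if x >= 0 else np.sqrt(-2.0 * x)

    x, y, t, dt, switch_time = 1.0, 1.0, 0.0, 1e-5, None
    while t < 10.0 and not (abs(x) < 1e-3 and abs(y) < 1e-3):
        u = -1.0 if y > gamma(x) else 1.0  # decelerate above Gamma, else accelerate
        if u > 0 and switch_time is None:
            switch_time = t                # first instant the control flips
        x, y, t = x + y * dt, y + u * dt, t + dt

    print(switch_time, 1 + np.sqrt(1.5))   # both ~ 2.2247 = t*
    print(t, 1 + np.sqrt(6.0))             # both ~ 3.4495 = T*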
for Theorem 2.1, and therefore Theorem 3.1, to hold. See Seierstad and
Sydsæter (1987) and Feichtinger and Hartl (1986) for further details.
imply (3.98). Note that these are also analogous to Table 3.1, Row 3.
We leave it as Exercise 3.38 for you to show that the limiting versions
of the conditions in the rightmost column of Rows 2, 3, and 4 in Table 3.1
imply (3.98). This would mean that Theorem 3.1 provides sufficient
optimality conditions for the problem (3.97), except in the free-end-point
case, i.e., when the terminal constraints a(x(T )) ≥ 0 and b(x(T )) = 0
are not present. Moreover, in the free-end-point case, we can use (3.98),
or even (3.99) with some qualifications, as discussed earlier.
Example 3.7 Let us return to Example 3.3 and now assume that we
have a perpetual charitable trust with initial fund W0 , which wants to
maximize its total discounted utility of charities C(t) over time, subject
to the terminal condition
lim_{T→∞} W(T) ≥ 0. (3.102)
subject to
Ẇ = rW − C, W (0) = W0 > 0, (3.103)
and (3.102).
Since λ(t) ≥ 0 and λ(t)W ∗ (t) = 1/ρ, it is clear that (3.104) holds. Thus,
(3.105) gives the optimal solution. Using this solution in the objective
function, we obtain
J∗ = (1/ρ) ln(ρW0) + (r − ρ)/ρ², (3.106)
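A quick numerical check of (3.106) is sketched below with illustrative values r = 0.15, ρ = 0.1, W0 = 1000. It assumes the familiar log-utility solution C∗(t) = ρW∗(t), so that W∗(t) = W0 e^{(r−ρ)t} — an assumption stated here, consistent with the closed form (3.106).

    import numpy as np
    from scipy.integrate import quad

    rho, r, W0 = 0.1, 0.15, 1000.0          # illustrative values with r > rho

    # Assumed optimal solution: C*(t) = rho * W*(t), W*(t) = W0 * e^{(r-rho)t},
    # so the discounted utility integrand is e^{-rho t} ln C*(t).
    f = lambda t: np.exp(-rho * t) * np.log(rho * W0 * np.exp((r - rho) * t))

    J_num, _ = quad(f, 0.0, np.inf)
    J_closed = np.log(rho * W0) / rho + (r - rho) / rho**2
    print(J_num, J_closed)                  # both ~ 51.05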
f(x̄, ū) = 0,
μ̄ ≥ 0, μ̄ g(x̄, ū) = 0, and (3.107)
g(x̄, ū) ≥ 0.
subject to
Ẇ = ln W − C, W (0) = W0 > 0, (3.111)
and one task is to find the long-run stationary equilibrium for it. Note
that since the horizon is infinite, it is usual to assume no salvage value
and no terminal conditions on the state.
ln W̄ − C̄ = 0, ρ = 1/W̄ , 1/C̄ − λ̄ = 0,
which gives the equilibrium {W̄, C̄, λ̄} = {1/ρ, −ln ρ, −1/ln ρ}. Since
0 < ρ < 1, we have C̄ > 0, which satisfies the requirement that the
consumption be nonnegative. Also, the equilibrium wealth W̄ > 0.
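The same equilibrium can be reproduced with a computer algebra system; a minimal sketch (sympy, with ρ treated as a parameter satisfying 0 < ρ < 1):

    import sympy as sp

    W, C, lam = sp.symbols('W C lam')
    rho = sp.symbols('rho', positive=True)

    # Stationary equilibrium conditions for this example:
    # ln W - C = 0 (state equation), rho = 1/W (adjoint), 1/C - lam = 0 (H_C = 0).
    sol = sp.solve([sp.log(W) - C, rho - 1 / W, 1 / C - lam],
                   [W, C, lam], dict=True)
    print(sol)   # [{W: 1/rho, C: -log(rho), lam: -1/log(rho)}]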
It is important to note that the optimal long-run stationary equilib-
rium (which is also called the turnpike) is not the same as the optimal
steady-state among the set of all possible steady-states. The latter con-
cept is termed the Golden Rule or Golden Path in economics, and a
procedure to obtain it is described below. However, the two concepts
are identical if the discount rate ρ = 0; see Exercise 3.43.
The Golden Path is obtained by setting ẋ = f (x, u) = 0, which
provides the feedback control u(x) that would keep x(t) = x over
time. Then, substitute u(x) in the integrand φ(x, u) of (3.28) to obtain
φ(x, u(x)). The value of x that maximizes φ(x, u(x)) yields the Golden
Path. Of course, all of the constraints imposed on the problem have to
be respected when obtaining the Golden Path.
In some cases, there may be more than one equilibrium defined by
(3.107). If so, the equilibrium that is attained may depend on the
starting point. Moreover, from some special starting points, the system
may have an option to go to two or more different equilibria. Such points
are called the Sethi-Skiba points; see Appendix D.8.
For multidimensional systems consisting of two or more states, op-
timal trajectories may exhibit more complex behaviors. Of particular
importance is the concept of limit cycles. If the optimal trajectory of
a dynamical system tends to spiral in toward a closed loop in the state
space, then that closed loop is called a limit cycle. For more on this
topic, refer to Vidyasagar (2002) and Grass et al. (2008).
the constraints on the control and state variables, we can get a completely
specified optimal control model by selecting one of the model types in
Table 3.3 together with one of the terminal conditions given in Table 3.1.
The reader will see numerous examples of the uses of Tables 3.1
and 3.3 when we construct optimal control models of various applied
situations in later chapters. To help in understanding these, we will give
a brief mathematical discussion of the six model types in Table 3.3, with
an indication of where each model type will be used later in the book.
In Model Type (a) of Table 3.3 we see that both φ and f are linear
functions of their arguments. Hence it is called the linear-linear case.
The Hamiltonian is
H = Cx + Du + λ(Ax + Bu + d)
= Cx + λAx + λd + (D + λB)u. (3.112)
From (3.112) it is obvious that the optimal policy is bang-bang with the
switching function (D + λB). Since the adjoint equation is independent
of both control and state variables, it can be solved completely without
resorting to two-point boundary value methods. Examples of (a) oc-
cur in the cash balance problem of Sect. 5.1.1 and the maintenance and
replacement model of Sect. 9.1.1.
Model Type (b) of Table 3.3 is the same as Model Type (a) except
that the function C(x) is nonlinear. Thus, the term Cx appears in the
adjoint equation, and two-point boundary value methods are needed to
solve the problem. Here, there is a possibility of singular control, and a
specific example is the Nerlove-Arrow model in Sect. 7.1.1.
Model Type (c) of Table 3.3 has linear functions in the state equa-
tion and quadratic functions in the objective function. Therefore, it is
sometimes called the linear-quadratic case. In this case, the optimal
control can be expressed in a form in which the state variables enter
linearly. Such a form is known as the linear decision rule; see (D.36) in
Appendix D. A specific example of this case occurs in the production-
inventory example of Sect. 6.1.1.
Model Type (d) is a more general version of Model Type (b) in which
the state equation is nonlinear in x. Here again, there is a possibility of
singular control. The wheat trading model of Sect. 6.2.1 illustrates this
model type. The solution of a special case of the model in Sect. 6.2.3
exhibits the occurrence of a singular control.
Table 3.3: Objective, state, and adjoint equations for various model
types
Objective integrand φ =  |  State ẋ = f =  |  Current-value λ̇ =  |  Form of optimal control
(e) c(x) + q(u)  |  (ax + d)b(u) + e(x)  |  λ(ρ − ab(u) − ex) − cx  |  Interior or boundary
(f) c(x)q(u)  |  (ax + d)b(u) + e(x)  |  λ(ρ − ab(u) − ex) − cx q(u)  |  Interior or boundary
Note. The current-value Hamiltonian is often used when ρ > 0 is the discount rate;
the standard formulation is identical to the current-value formulation when ρ = 0. In
Table 3.3, capital letters indicate vector functions and small letters indicate scalar
functions or vectors. A function followed by an argument in parentheses indicates
a nonlinear function; when it is followed by an argument without parentheses, it
indicates a linear function. Thus, A(x) and e(x) are nonlinear vector and scalar
functions, while Ax and ax are linear. The function d is always to be interpreted as
an exogenous function of time only
In Model Types (e) and (f), the functions are scalar functions, and
there is only one state equation, so λ is also a scalar function. In these
cases, the Hamiltonian function is nonlinear in u. If it is concave in u,
then the optimal control is usually obtained by setting Hu = 0. If it is
convex, then the optimal control is the same as in Model Type (b).
Several examples of Model Type (e) occur in this book: the opti-
mal financing model in Sect. 5.2.1, the Vidale-Wolfe advertising model in
Sect. 7.2.1, the nonlinear extension of the maintenance and replacement
model in Sect. 9.1.4, the forestry model in Sect. 10.2.1, the exhaustible
resource model in Sect. 10.3.1, and all of the models in Chap. 11. Model
Type (f) examples are: The Kamien-Schwartz model in Sect. 9.2.1 and
the sole-owner fishery resource model in Sect. 10.1.
Although the general forms of the model are specified in Tables 3.1
and 3.3, there are a number of additional modeling tricks that are useful,
which will be employed later. We collect these as a series of remarks
below.
Remark 3.12 In the simple cash balance model of Sect. 5.1, u < 0 represents buying
and u > 0 represents selling; in either case there is a transaction cost,
which can be represented as c|u|. In order to handle this, we define new
control variables u1 and u2 satisfying the following relations:
u := u1 − u2 , u1 ≥ 0, u2 ≥ 0, (3.113)
u1 u2 = 0. (3.114)
Thus, we represent u as the difference of two nonnegative variables, u1
and u2 , together with the quadratic constraint (3.114). We can then
write
|u| = u1 + u2 , (3.115)
which expresses the nonlinear function |u| as a linear function with the
constraint (3.114).
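To see why the quadratic constraint (3.114) takes care of itself whenever |u| carries a positive cost, note that if both u1 > 0 and u2 > 0, reducing both by the same amount leaves u = u1 − u2 unchanged while lowering u1 + u2. A minimal sketch (Python with scipy; the numbers are made up) illustrates this:

    from scipy.optimize import linprog

    # Achieve a net trade u = u1 - u2 = -3 (a sale of 3 units) at minimum
    # transaction cost c(u1 + u2) = c|u|, with an illustrative c = 0.5.
    res = linprog(c=[0.5, 0.5],            # minimize c*(u1 + u2)
                  A_eq=[[1.0, -1.0]],      # u1 - u2 = -3
                  b_eq=[-3.0],
                  bounds=[(0, None), (0, None)])
    u1, u2 = res.x
    print(u1, u2, u1 * u2)                 # 0.0 3.0 0.0: (3.114) holds for free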
Remark 3.13 Tables 3.1 and 3.3 are constructed for continuous-time
models. Exactly the same kinds of models can be developed in the
discrete-time case; see Chap. 8.
Remark 3.14 Consider Model Types (a) and (b) when the control vari-
able constraints are defined by linear inequalities of the form
g(u, t) = g(t)u ≥ 0. (3.116)
Then, the problem of maximizing the Hamiltonian function becomes:
max (D + λB)u
subject to        (3.117)
g(t)u ≥ 0.
Further, in Model Type (a), the adjoint equation does not contain
terms in x and u, so we can solve it for λ(t), and hence the objective
function of (3.117) varies parametrically with λ(t). In this case we can
use parametric linear programming techniques to solve the problem over
time. Since the optimal solution to the linear program always occurs at
an extreme point of the convex set defined by g(t)u ≥ 0, it follows that
as λ(t) changes, the optimal solution to (3.117) will “bang” from one
extreme point of the feasible set to another. This is called a generalized
bang-bang optimal policy. Such a policy occurs, e.g., in the optimal
financing model treated in Sect. 5.2; see Table 5.1, Row 5.
In Model Type (b), the adjoint equation contains terms in x, so we
cannot solve for the trajectory of λ(t) without knowing the trajectory
of x(t). It is still true that (3.117) is a linear program for any given t,
but the parametric linear programming techniques will not usually work.
Instead, some type of iterative procedure is needed in general; see Bryson
and Ho (1975).
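The "bang" between extreme points is easy to visualize with a toy parametric linear program. In the sketch below the data D, B, and the constraint row g are hypothetical two-dimensional choices (with a box added to keep the feasible set bounded); as λ sweeps over values, the maximizer of (D + λB)u jumps between vertices of the feasible set.

    import numpy as np
    from scipy.optimize import linprog

    D, B = np.array([1.0, -1.0]), np.array([0.0, 2.0])
    g = np.array([[1.0, 1.0]])             # feasible set: g u >= 0, plus a box

    for lam in (-2.0, 0.0, 2.0):           # lambda(t) sweeping over time
        res = linprog(c=-(D + lam * B),    # linprog minimizes, so negate
                      A_ub=-g, b_ub=[0.0],
                      bounds=[(-1, 1), (-1, 1)])
        print(lam, res.x)                  # optimum bangs from (1, -1) to (1, 1)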
Remark 3.15 The salvage value part S[x(T ), T ] of the objective func-
tion is relevant in the optimization context in the following two cases:
Case (i) T is free and part of the problem is to determine the optimal
terminal time; see, e.g., Sect. 9.1.
Case (ii) T is fixed and the problem is that of maximizing the objec-
tive function involving the salvage value of the ending state x(T ), which
in this case can be written simply as S[x(T )].
For the fixed-end-point problem and for the infinite horizon problem,
it does not usually make much sense to define a salvage value function.
Remark 3.16 One important model type that we did not include in
Table 3.3 is the impulse control model of Bensoussan and Lions (1975).
In this model, an infinite control is instantaneously exerted on a state
variable in order to cause a finite jump in its value. This model is
particularly appropriate for the instantaneous reordering of inventory
as required in lot-size models; see Bensoussan et al. (1974). Further
discussion of impulse control is given in Sect. D.9.
E 3.2 Find the reachable set X, defined in Sect. 3.1, if x and u satisfy
ẋ = u − 1, x0 = 5, −1 ≤ u ≤ 1,
and T = 3.
E 3.4 Use the Lagrangian form of the maximum principle to obtain the
optimal control for the following problem:
max{J = x1 (2)}
subject to
ẋ1 (t) = u1 − u2 , x1 (0) = 2,
E 3.6 Obtain the optimal value J∗(T) of the objective function for Example 3.5
for a given terminal time T, and then maximize it with respect
to T by using the condition dJ∗(T)/dT = 0. Show that you get the same
optimal T∗ as the one obtained for Example 3.5 by using (3.77).
E 3.7 Check that the solution of Example 3.1 satisfies the sufficiency
conditions in Theorem 3.1.
E 3.8 Starting from (3.15), obtain the current-value version (3.44) for
the problem defined by (3.27) and (3.28). Show further that if we
were to require the function ψ to also depend on T, i.e., if S(x, T) =
ψ(x, T)e^{−ρT}, then the left-hand side of condition (3.44) would be modified
to H[x∗(T∗), u∗(T∗), λ(T∗), T∗] + ψ_T[x∗(T∗), T∗] − ρψ[x∗(T∗), T∗].
E 3.10 Begin with (3.54) and perform the steps leading to (3.55).
E 3.19 In Example 3.6, determine the optimal control and the corre-
sponding state trajectory starting at the point (-4,6), which lies above
the switching curve.
E 3.20 Carry out the synthesis of the optimal control for Example 3.6
when the starting point (x0 , y0 ) lies below the switching curve.
E 3.21 Use the results of Exercise 3.20 to find the optimal control and
the corresponding trajectory starting at the point (−1, −1).
E 3.22 Find the optimal control, the minimum time, and the corre-
sponding trajectory for Example 3.6 starting at the point (−2, 2), which
lies on the switching curve.
E 3.25 Solve the following minimum weighted energy and time problem:
max_{u,T} J = ∫_0^T −(1/2)(u² + 1) dt
subject to
ẋ = u, x(0) = 5, x(T ) = 0,
and the control constraint
|u| ≤ 2.
Hint. Use (3.77) to determine T ∗ , the optimal value of T.
subject to
ẋ = u, x(0) = x0 , x(T ) = 0,
|u| ≤ q, where q > 0,
is an optimal control.
subject to
ẋ = u, x(0) = x0 , x(T ) = 0,
−1 ≤ u ≤ +1.
Hint: Use (3.113)–(3.115) to deal with |u|. Show that for x0 > 0, say
x0 = 5, every feasible control is optimal.
ẋ = −ax + u,
where a > 0. Show that no optimal control exists for the problem.
Also, provide the values of the state variable, the adjoint variable, and
the Lagrange multipliers along the optimal path.
subject to
ẋ = u, x(0) = 0, x(T ) ≥ 1,
u ∈ [0, 1],
T ∈ [1, 8].
Hint: First, show that u∗ = bang[0, 1; λ − x] and that control can switch
at most once from 1 to 0. Then, let t∗ (T ) denote that switching time, if
any, for a given T ∈ [1, 8]. Consider three cases: (i) T = 1, (ii) 1 < T < 8,
and (iii) T = 8. Note that λ(t∗ (T )) − x(t∗ (T )) = 0. Use (3.15) in case
(ii). Find the optimal solution in each of the three cases. The best of
these solutions will be the solution of the problem.
subject to
ẋ = u, x(0) = 0, x(T ) ≥ 1,
u ∈ [0, 1],
T ∈ [1, 4 + 2√2].
The problem has two different optimal solutions with different values for
optimal T ∗ . Find both of these solutions.
(a) Find the optimal consumption rate C∗(t), t ∈ [0, T], in the problem:
max J = ∫_0^T e^{−ρt} ln C(t) dt
subject to
Ẇ (t) = −C(t), W (0) = W0 ,
where T is given and ρ > 0.
(b) Assume that T is not given in (a), and is to be chosen optimally.
Show for this free terminal time version that the optimal T ∗
decreases as the discount rate ρ increases.
lim_{t→∞} λ(t) = 0
such that
ẋ = (1 − x)u, x(0) = 0,
0 ≤ u ≤ 1.
Show this by finding an optimal control.
(c) Solve for x∗ (t) and λ(t) and show that limt→∞ x∗ (t) = 0 and that
the limiting condition (3.99), i.e., limt→∞ λ(t) = 0, holds for this
problem.
E 3.40 Show that for the problem (3.97) without the constraint
g(x, u) ≥ 0, the optimal value of the objective function is
J∗ = H(x0, u∗(0), λ(0))/ρ.
See Grass et al. (2008).
E 3.41 Apply (3.108), along with the requirement λ̄ ≥ 0 and λ̄W̄ = 0 in
view of the constraint (3.102), to Example 3.7 to verify that the long-run
stationary equilibrium is as shown in (3.110).
E 3.42 For a stationary system as defined in Sect. 3.6, show that
dH/dt = ρλ f(x∗(t), u∗(t))
and
dH^{pv}/dt = −ρ e^{−ρt} φ(x∗(t), u∗(t))
dt
along the optimal path. Also, contrast these results with that of Exer-
cise 2.9.
subject to
I˙ = P − S, I(0) = I0 ,
where I denotes inventory level, P denotes production rate, and S de-
notes a given constant demand rate.
(a) Find the optimal long-run stationary equilibrium, i.e., the turnpike
defined in (3.107).
(b) Find the Golden Rule by setting I˙ = 0 in the state equation, solve
for P, and substitute it into the integrand of the objective function.
Then, maximize the integrand with respect to I.
(c) Verify that the Golden Rule inventory level obtained in (b) is the
same as the turnpike inventory level found in (a) when ρ = 0.
Chapter 4
The Maximum Principle: Pure State and Mixed Constraints
Example 4.1 Consider the problem:
max { J = ∫_0^3 −u dt } (4.1)
subject to
ẋ = u, x(0) = 0, (4.2)
0 ≤ u ≤ 3, (4.3)
x − 1 + (t − 2)2 ≥ 0. (4.4)
Solution From the objective function (4.1), one can see that it is good
to have low values of u. If we use u = 0 to begin with, we see that
x(t) = 0 as long as u(t) = 0. At t = 1, x(1) = 0 and the constraint (4.4)
is satisfied with an equality. But continuing with u(t) = 0 beyond t = 1
is not feasible since x(t) = 0 would not satisfy the constraint (4.4) just
after t = 1.
In Fig. 4.1, we see that the lowest possible feasible state trajectory
from t = 1 to t = 2 satisfies the state constraint (4.4) with an equality.
In order not to violate the constraint (4.4), its first time derivative u(t)+
2(t − 2) must be nonnegative. This gives us u(t) = 2(2 − t) to be the
lowest feasible value for the control. This value will make the state x(t)
ride along the constraint boundary until t = 2, at which point u(2) = 0;
see Fig. 4.1. Continuing with u(t) = 2(2 − t) beyond t = 2 will make u(t)
negative, and violate the lower bound in (4.3). It is easy to see, however,
that u(t) = 0, t ≥ 2, is the lowest feasible value, which can be followed
all the way to the terminal time t = 3.
It can be seen from Fig. 4.1 that the bold trajectory is the lowest pos-
sible feasible state trajectory on the entire time interval [0,3]. Moreover,
it is obvious that the lowest possible feasible control is used at any given
t ∈ [0, 3], and therefore, the solution we have found is optimal. We can
now restate the values of the state and control variables that we have
obtained:
x∗(t) = { 0 for t ∈ [0, 1);  1 − (t − 2)² for t ∈ [1, 2];  1 for t ∈ (2, 3] },
u∗(t) = { 0 for t ∈ [0, 1);  2(2 − t) for t ∈ [1, 2];  0 for t ∈ (2, 3] }. (4.5)
Next we find the value function V (x, t) for this problem. It is obvious
that the feedback control u∗ (x, t) = 0 is optimal at any point (x, t) when
x ≥ 1 or when (x, t) is on the right-hand side of the parabola in Fig. 4.1.
Thus, V (x, t) = 0 on such points.
On the other hand, when x ∈ [0, 1] and it is on the left-hand side
of the parabola, the optimal trajectory is very similar to the one shown
in Fig. 4.1. Specifically, the control is zero until it hits the trajectory at
time τ = 2 − √(1 − x). Then, the control switches to 2(2 − s) for s ∈ (τ, 2)
to climb along the left-hand side of the parabola to reach its peak, and
then switches back to zero on the time interval [2,3]. Thus, in this case,
V(x, t) = −∫_t^τ 0 ds − ∫_τ^2 2(2 − s) ds − ∫_2^3 0 ds
        = [s² − 4s]_{2−√(1−x)}^{2}
        = x − 1.
This gives us the marginal valuation along the optimal path x∗ (t)
given in (4.5) as
Vx(x∗(t), t) = { 1, t ∈ [0, 2);  0, t ∈ [2, 3] }. (4.6)
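A short simulation (a sketch; recall the objective in (4.1) is J = ∫_0^3 −u dt) confirms both that (4.5) rides the constraint boundary feasibly and that the value from the initial point (0, 0) is V(0, 0) = 0 − 1 = −1:

    import numpy as np

    dt = 1e-4
    t = np.arange(0.0, 3.0, dt)
    # Optimal control (4.5): 0, then 2(2 - t) along the boundary, then 0.
    u = np.where((t >= 1.0) & (t <= 2.0), 2.0 * (2.0 - t), 0.0)
    x = np.cumsum(u) * dt                   # xdot = u, x(0) = 0

    h = x - 1.0 + (t - 2.0) ** 2            # pure state constraint (4.4)
    print(h.min() > -1e-3)                  # True, up to discretization error
    print(np.sum(-u) * dt)                  # ~ -1.0 = V(0, 0)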
h(x, t) ≥ 0, (4.7)
where for t ∈ [τ 1 , τ 2 ],
hi (x∗ (t), t) = 0, i = 1, 2, . . . , p̂ ≤ p
and
hi (x∗ (t), t) > 0, i = p̂ + 1, . . . , p.
Note that this full-rank condition on the constraints (4.7) is written
when the order of each of the constraints in (4.7) is one. For the general
case of higher-order constraints, see Hartl et al. (1995).
Let us recapitulate the optimal control problem for which we will
state a direct maximum principle in the next section. The problem is
max { J = ∫_0^T F(x, u, t) dt + S[x(T), T] }
subject to
ẋ = f(x, u, t), x(0) = x0,
g(x, u, t) ≥ 0,        (4.11)
h(x, t) ≥ 0,
a(x(T), T) ≥ 0,
b(x(T), T) = 0.
For the problem (4.11), we will now state the direct maximum principle
which includes the discussion above and the required jump conditions.
For details, see Dubovitskii and Milyutin (1965), Feichtinger and Hartl
(1986), Hartl et al. (1995), Boccia et al. (2016), and references therein.
We will use superscript d on various multipliers that arise in the direct
method, to distinguish them from the corresponding multipliers (which
are not superscripted) that arise in the indirect method, to be discussed
in Sect. 4.5. Naturally, it will not be necessary to superscript the multi-
pliers that are known to remain the same in both methods.
To formulate the maximum principle for the problem (4.11), we define
the Hamiltonian function H d : E n × E m × E 1 → E 1 as
H^d = F(x, u, t) + λ^d f(x, u, t)
+ γ^d h_x(x∗(T), T), and
α ≥ 0, α a(x∗(T), T) = 0, γ^d ≥ 0, γ^d h(x∗(T), T) = 0;
H^d[x∗(τ), u∗(τ−), λ^d(τ−), τ] = H^d[x∗(τ), u∗(τ+), λ^d(τ+), τ] − ζ^d(τ) h_t(x∗(τ), τ);
μ(t) ≥ 0, μ(t) g(x∗, u∗, t) = 0,
ζ^d(τ) ≥ 0, ζ^d(τ) h(x∗(τ), τ) = 0 hold.
λd (T ) = Sx (x∗ (T ), T ). (4.14)
Example 4.2 Apply the direct maximum principle (4.13) to solve the
problem in Example 4.1.
H^d = −u + λ^d u, (4.16)
L^d = H^d + μ1 u + μ2(3 − u) + η^d[x − 1 + (t − 2)²], (4.17)
L^d_u = −1 + λ^d + μ1 − μ2 = 0, (4.18)
λ̇^d = −L^d_x = −η^d, λ^d(3−) = γ^d, (4.19)
μ1 ≥ 0, μ1 u∗ = 0, μ2 ≥ 0, μ2(3 − u∗) = 0, (4.21)
λ^d(τ−) = λ^d(τ+) + ζ^d(τ), ζ^d(τ) ≥ 0, (4.23)
See Feichtinger and Hartl (1986) and Seierstad and Sydsæter (1987) for
details.
α ≥ 0, αa(x∗ (T ), T ) = 0, γ ≥ 0, γh(x∗ (T ), T ) = 0;
μ(t) ≥ 0, μ(t)g(x∗ , u∗ , t) = 0,
ζ^d(τ1) = ζ(τ1) − η(τ1+), ζ^d(τ2) = η(τ2−), ζ^d(τ) = ζ(τ). (4.32)
γ = γ d − η(T − ).
Finally, as we had mentioned earlier, the multipliers μ, α, and β are
the same in both methods.
η̇(t) ≤ 0 (4.34)
and
ζ(τ1) ≥ η(τ1+) at each entry time τ1, (4.35)
which are useful to know about. Hartl et al. (1995) and Feichtinger and
Hartl (1986) also add these conditions to the indirect maximum principle
necessary conditions (4.29).
subject to
ẋ = u, x(0) = 1, (4.36)
u + 1 ≥ 0, 1 − u ≥ 0, (4.37)
x ≥ 0. (4.38)
Note that this problem is the same as Example 2.3, except for the
nonnegativity constraint (4.38).
H = −x + λu,
L = H + μ1 (u + 1) + μ2 (1 − u) + ηu,
μ1 ≥ 0, μ1 (u + 1) = 0, (4.41)
μ2 ≥ 0, μ2 (1 − u) = 0, (4.42)
η ≥ 0, ηx = 0. (4.43)
This gives
x∗(t) = { 1 − t, t ∈ [0, 1);  0, t ∈ [1, 2] }.
To obtain λ(t), let us first try λ(2− ) = γ = 0. Then, since x∗ (t) enters
the boundary zero at t = 1, there are no jumps in the interval (1, 2], and
the solution for λ(t) is
Since λ(t) ≤ 0 and x∗(t) = 1 − t is positive on [0, 1), we can use (4.39)
to obtain u∗(t) = −1 for 0 ≤ t < 1, which is as stipulated in (4.46). In
the time interval [0, 1), by (4.42), μ2 = 0 since u∗ < 1, and by (4.43),
η = 0 because x > 0. Therefore, μ1(t) = −λ(t) = 1 − t > 0 for 0 ≤ t < 1,
and this with u∗ = −1 satisfies (4.41).
To complete the solution, we calculate the Lagrange multipliers in the
interval [1,2]. Since u∗ (t) = 0 on t ∈ [1, 2], we have μ1 (t) = μ2 (t) = 0.
Then, from (4.44) we obtain η(t) = −λ(t) = 2 − t ≥ 0 which, with
Remark 4.5 Example 4.3 is a problem instance in which the state con-
straint is active at the terminal time. In instances where the initial state
or the final state or both are on the constraint boundary, the maximum
principle may degenerate in the sense that there is no nontrivial solution
of the necessary conditions, i.e., λ(t) ≡ 0, t ∈ [0, T ], where T is the termi-
nal time. See Arutyunov and Aseev (1997) or Ferreira and Vinter (1994)
for conditions that guarantee a nontrivial solution for the multipliers.
Remark 4.6 It can easily be seen that Example 4.3 is a problem instance
in which the multipliers λ and μ1 would not be unique if the jump
condition on the Hamiltonian in (4.29) were not imposed. For references
dealing with the issue of non-uniqueness of the multipliers and conditions
under which the multipliers are unique, see Kurcyusz and Zowe (1979),
Maurer (1977, 1979), Maurer and Wiegand (1992), and Shapiro (1997).
Example 4.4 The purpose here is to show that the solution obtained
in Example 4.3 satisfies the sufficiency conditions of Theorem 4.1. For
this we first obtain the direct adjoint variable
λ^d(t) = λ(t) + η(t) h_x(x∗(t), t) = { t − 1, t ∈ [0, 1);  0, t ∈ [1, 2) }.
and
h(x) = x
are linear and hence quasiconcave in (x, u) and x, respectively. Functions
S ≡ 0, a ≡ 0 and b ≡ 0 satisfy the conditions of Theorem 4.1 trivially.
Thus, the solution obtained for Example 4.3 satisfies all conditions of
Theorem 4.1, and is therefore optimal.
In Exercise 4.14, you are asked to use Theorem 4.1 to verify that the
given solution there is optimal.
Example 4.5 Consider Example 4.3 with T = 3 and the terminal state
constraint
x(3) = 1.
Solution Clearly, the optimal control u∗ will be the one that keeps x as
small as possible, subject to the state constraint (4.38) and the boundary
condition x(0) = x(3) = 1. Thus,
u∗(t) = { −1, t ∈ [0, 1);  0, t ∈ [1, 2];  1, t ∈ (2, 3] },
x∗(t) = { 1 − t, t ∈ [0, 1);  0, t ∈ [1, 2];  t − 2, t ∈ (2, 3] }.
For brevity, we will not provide the same level of detailed explanation as
we did in Example 4.3. Rather, we will only compute the adjoint function
and the multipliers that satisfy the optimality conditions. These are
λ(t) = { t − 1, t ∈ [0, 1];  t − 2, t ∈ (1, 3) }, (4.48)
Solution It is obvious that the optimal solution will remain the same as
(4.5), shown also in Fig. 4.1.
With u∗ and x∗ as in (4.5), we must obtain λ, μ1 , μ2 , η, γ, and ζ so
that the necessary optimality conditions (4.29) hold, i.e.,
Lu = −e−ρt + λ + μ1 − μ2 + η = 0, (4.55)
μ1 ≥ 0, μ1 u = 0, μ2 ≥ 0, μ2 (3 − u) = 0, (4.58)
−e^{−ρ} u∗(1−) + λ(1−)u∗(1−) = −e^{−ρ} u∗(1+) + λ(1+)u∗(1+) − ζ(1)(−2). (4.61)
From (4.60), we obtain λ(1− ) = e−ρ . This with (4.56) gives
λ(t) = { e^{−ρ}, 0 ≤ t < 1;  0, 1 ≤ t ≤ 3 },
and
η(t) = { 0, 0 ≤ t < 1;  e^{−ρt}, 1 ≤ t ≤ 2;  0, 2 < t ≤ 3 },
max J = ∫_0^T φ(x, u)e^{−ρt} dt + ψ[x(T)]e^{−ρT}.
λ̇ = ρλ − Lx [x∗ , u∗ , λ, μ, η, t]
α ≥ 0, αa(x∗ (T ), T ) = 0, γ ≥ 0, γh(x∗ (T ), T ) = 0;
μ(t) ≥ 0, μ(t)g(x∗ , u∗ , t) = 0,
The infinite horizon problem with pure and mixed constraints can be
stated as (3.97) with an additional constraint (4.7). As in Sect. 3.6, the
conditions in (4.62) except the transversality condition on the adjoint
variable are still necessary for optimality. As for the sufficiency condi-
tions, an analogue of Theorem 4.1 holds, subject to the discussion on
infinite horizon transversality conditions in Sect. 3.6.
We conclude this chapter with the following cautionary remark.
E 4.1 Rework Example 4.3 by guessing that γ > 0, and show that it
leads to a contradiction with a condition of the maximum principle.
subject to
ẋ = −u − 1, x(0) = 1,
x(t) ≥ 0, 0 ≤ u(t) ≤ 1.
Show that
ẋ = u, x(0) = 1, x(4) = 1,
u + 1 ≥ 0, 1 − u ≥ 0,
x ≥ 0.
Find the optimal trajectories of the control variable, the state variable,
and other multipliers. Also, graph these trajectories.
E 4.7 Transform the problem (4.11) with the pure constraint of type
(4.7) to a problem with the nonnegativity constraint of type (4.9).
Hint: Guess the optimal solution and verify it by using the La-
grangian form of the maximum principle.
subject to
İ = P − S, I(0) = I0 > S²/(2h),
and the control and the pure state inequality constraints
P ≥ 0 and I ≥ 0,
ẋ = u − x, x(0) = 1,
Use the sufficiency conditions in Theorem 4.1 to verify that the optimal
control for the problem is
u∗(t) = { 0, 0 ≤ t ≤ θ;  0.5 − 0.2t, θ < t ≤ 2.5;  0, 2.5 < t ≤ 5 },
where θ ≈ 0.51626. Sketch the optimal state trajectory x∗ (t) for the
problem.
E 4.15 In Example 4.6, let t±(x) = 2 ± √(1 − x). Show that the value
function
V(x, t) = { −[2e^{−2ρ} + 2(ρ√(1−x) − 1)e^{−ρ(2−√(1−x))}]/ρ², for x < 1, 0 ≤ t ≤ t−(x);
            0, for x ≥ 1 or t+(x) ≤ t ≤ 3 }.
Note that V (x, t) is not defined for x < 1, t− (x) < t ≤ 3. Show further-
more that for the given initial condition x(0) = 0, the marginal valuation
is
Vx(x∗(t), t) = λ^d(t) = λ(t) + η(t) = { e^{−ρ}, for t ∈ [0, 1);  e^{−ρt}, for t ∈ [1, 2];  0, for t ∈ (2, 3] }.
In this case, it is interesting to note that the marginal valuation is dis-
continuous at the constraint exit time t = 2.
subject to
İ(t) = P(t) − S, I(0) = 1,
P(t) ≥ 0 and I(t) ≥ 0, t ∈ [0, T],
where P(t) denotes the production rate and I(t) is the inventory level at
time t, and where c, h and S are positive constants and the given terminal
time T > √(2S).
subject to u ∈ [0, ū] and x ≤ x̄, where ū > δ x̄, a > δ, and x̄ > x0 .
Hint: Solve first the problem without the state constraint x ≤ x̄. You will
need to treat two cases: δT ≤ ln a − ln (a − δ) and δT > ln a − ln (a − δ).
E 4.20 Maximize
J = ∫_0^3 (u − x) dt
subject to
ẋ = 1 − u, x(0) = 2,
0 ≤ u ≤ 3, x + u ≤ 4, x ≥ 0.
E 4.21 Maximize
J = ∫_0^2 (1 − x) dt
subject to
ẋ = u, x(0) = 1,
−1 ≤ u ≤ 1, x ≥ 0.
E 4.22 Maximize
J = ∫_0^3 (4 − t)u dt
subject to
ẋ = u, x(0) = 0, x(3) = 3,
0 ≤ u ≤ 2, 1 + t − x ≥ 0.
E 4.23 Maximize
J = −∫_0^4 e^{−t}(u − 1)² dt
subject to
ẋ = u, x(0) = 0,
x ≤ 2 + e−3 .
ẋ = −u, x(0) = e,
−3 ≤ u ≤ 3, x − u ≥ 0, x ≥ t.
ẋ1 = x2 , x1 (0) = 2,
ẋ2 = u, x2 (0) = 0,
x1 ≥ 0.
E 4.26 Re-solve Example 4.6 with the control constraint (4.3) replaced
by 0 ≤ u ≤ 1.
subject to
ẋ(t) = u(t), x(0) = 1,
−a ≤ u(t) ≤ b, a > 1/2, b > 0,
x(t) ≥ t − 2.
Obtain x∗ (t), u∗ (t) and all the required multipliers.
E 4.28 Minimize
∫_0^T (1/2)(x² + c²u²) dt
subject to
ẋ = u, x(0) = x0 > 0, x(T ) = 0,
h1 (x, t) = x − a1 + b1 t ≥ 0,
h2 (x, t) = a2 − b2 t − x ≥ 0,
where ai, bi > 0, a2 > x0 > a1, and a2/b2 > a1/b1; see Fig. 4.5. The
optimal path must begin at x0 on the x-axis, stay in the shaded area,
and end on the t-axis.
(a) First, assume that the problem parameters are such that the op-
timal solution x∗ (t) satisfies h1 (x∗ (t), t) > 0 for t ∈ [0, T ]. Show
that
x∗ (t) = k1 et/c + k2 e−t/c ,
subject to
ẋ = −u, x(0) = x0 > 0 given,
K̇ = suK, K(0) = K0 ,
Ẇ = uK − δW, W (0) = W0 ,
where a fraction s of the production output uK is invested, with u de-
noting the capacity utilization rate. The control constraints are
0 ≤ s ≤ 1, 0 ≤ u ≤ 1,
Chapter 5
Applications to Finance
λ̇1 = −∂H/∂x = −λ1 r1, λ1(T) = 1, (5.6)
λ̇2 = −∂H/∂y = −λ2 r2, λ2(T) = 1. (5.7)
It is easy to solve these, respectively, as
λ1(t) = e^{∫_t^T r1(τ) dτ} (5.8)
and
λ2(t) = e^{∫_t^T r2(τ) dτ}. (5.9)
The interpretations of these solutions are also clear. Namely, λ1 (t) is
the future value (at time T ) of one dollar held in the cash account from
time t to T and, likewise, λ2 (t) is the future value of one dollar invested
in securities from time t to T. Thus, the adjoint variables have natural
interpretations as the actuarial evaluations of competitive investments
at each point of time.
Let us now derive the optimal policy by choosing the control vari-
able u to maximize the Hamiltonian in (5.5). In order to deal with the
absolute value function we write the control variable u as the difference
of two nonnegative variables, i.e.,
u = u1 − u2 , u1 ≥ 0, u2 ≥ 0. (5.10)
Recall that this method was suggested in Remark 3.12 in Sect. 3.7. In
order to make u = u1 when u1 is strictly positive, and u = −u2 when u2
is strictly positive, we also impose the quadratic constraint
u1 u2 = 0, (5.11)
|u| = u1 + u2 . (5.12)
0 ≤ u1 ≤ U1 and 0 ≤ u2 ≤ U2 . (5.13)
We can now substitute (5.10) and (5.12) into the Hamiltonian (5.5)
and reproduce the part that depends on control variables u1 and u2 , and
denote it by W. Thus,
then
(1 − α)λ1 (t) < λ2 (t),
so that if u2 (t) > 0, then u1 (t) = 0. Hence, with the optimal policy, the
relation (5.11) is always satisfied.
Figure 5.1 illustrates the optimal policy at time t. The first quadrant
is divided into three areas which represent different actions (including
no action) to be taken. The dotted lines represent the singular control
manifolds. A possible path of the vector (λ1 (t), λ2 (t)) of the adjoint
variables is shown in Fig. 5.1 also. Note that on this path, there is one
period of selling, two periods of buying, and three periods of inactivity.
Note also that the final point on the path is (1, 1), since the terminal
values λ1 (T ) = λ2 (T ) = 1, and therefore, the last interval is always
characterized by inactivity.
Another way to represent the optimal path is in the (t, λ2 /λ1 ) space.
The path of (λ1 (t), λ2 (t)) shown in Fig. 5.1 corresponds to the path of
λ2 (t)/λ1 (t) over time shown in Fig. 5.2.
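The adjoint formulas (5.8)–(5.9) are easy to tabulate. The sketch below uses illustrative rate paths and a hypothetical commission band 1 ± α around λ2/λ1 = 1 (standing in for the broker-fee thresholds behind Figs. 5.1 and 5.2) to label each instant buy, sell, or hold; note that the ratio returns to 1 at t = T, the interval of terminal inactivity.

    import numpy as np

    T, n, alpha = 10.0, 1000, 0.02
    t, dt = np.linspace(0.0, T, n), T / n
    r1 = 0.05 + 0.0 * t                     # cash account rate (illustrative)
    r2 = 0.05 + 0.2 * np.sin(t)             # securities rate (illustrative)

    def lam(r):
        # lambda_i(t) = exp(integral_t^T r_i(tau) dtau), cf. (5.8)-(5.9).
        tail = np.cumsum(r[::-1])[::-1] * dt
        return np.exp(tail)

    ratio = lam(r2) / lam(r1)               # the quantity plotted in Fig. 5.2
    action = np.where(ratio > 1 + alpha, 'buy',
             np.where(ratio < 1 - alpha, 'sell', 'hold'))
    print(round(ratio[-1], 2), action[::150])   # ratio(T) ~ 1: inactivity at the end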
For reasons of simplicity and ease of its solution, the model analyzed
here does not permit debt as a source of financing, but does permit
retained earnings and external equity to be used in any proportions;
see Miller and Modigliani (1961) and Sethi (1996) for further discus-
sion. Note that in the case of a finite horizon, a more realistic objective
function would include a salvage value or bequest term S[x(T )]. This is
not very difficult to incorporate. See Exercise 5.12 where the bequest
function is linear. We will also solve the infinite horizon problem (i.e.,
T = ∞) after we have solved the finite horizon problem.
λ̇ = ρλ − (1 − v − u) − λr(cu + v) (5.23)
λ(T ) = 0. (5.24)
where
W1 = crλ − 1, (5.26)
W2 = rλ − 1. (5.27)
Note first that the state variable x factors out so that the optimal
controls are independent of the state variable. Second, since the Hamil-
tonian is linear in the two control variables, the optimal policy is a com-
bination of generalized bang-bang and singular controls. Of course, the
characterization of these optimal controls in terms of the adjoint variable
λ will require solving a parametric linear programming problem at each
Table 5.1 (fragment): the columns list the signs of W1 and W2, the corresponding
subcases for g ≤ r and for g > r, and the form of the optimal control
(bang-bang in most rows); e.g., Row (2): W2 = 0, Subcases A2 and B2, u∗ = 0.
adjacent to the darkened lines in Figs. 5.3 and 5.4, respectively. In ad-
dition to W1 < cW2 and W1 = 0 implying W2 > 0, we see that W2 ≤ 0
implies W1 < 0. In view of these, we can simply characterize Subcases
A1 and B1 by W2 < 0, A2 and B2 by W2 = 0, A3 by W2 > 0, B4 by
W1 = 0, and B5 by W1 > 0, and use these simpler characterizations
in our subsequent discussion. Keep in mind that Subcase B3 remains
characterized as W1 < 0, W2 > 0.
In Table 5.1, we list the feasible cases, shown along the darkened
lines in Figs. 5.3 and 5.4 and provide the form of the optimal control
in each of these cases. The catalog of possible optimal control regimes
shown in Table 5.1 gives the potential time-paths for the firm. What
must be done to obtain the optimal path (given an initial condition) is
to synthesize these subcases into an optimal sequence. This is carried
out in the following section.
Figures 5.3 and 5.4: the subcase regions A1–A6 (Case A) and B1–B9 (Case B).
τ = T − t,
so that
ẙ = dy/dτ = (dy/dt)(dt/dτ) = −ẏ.
As a consequence, ẙ = −ẏ, and the reverse-time versions of the state
and adjoint equations (5.18) and (5.23), respectively, can be obtained by
simply replacing ẏ by ẙ and changing the signs of the right-hand sides.
The transversality condition on the adjoint variable
Case A: g ≤ r.
x̊ = 0 and λ̊ = 1 − ρλ. (5.33)
With the initial conditions given in (5.29), the solutions for x and λ are
x(τ ) = αA and λ(τ ) = (1/ρ)[1 − e−ρτ ]. (5.34)
It is easy to see that, because of the assumption 0 ≤ c < 1, if
W2 = rλ − 1 < 0, then W1 = crλ − 1 < 0. Therefore, to remain in
this subcase, W2(τ) must stay negative as τ increases. From (5.34),
however, λ(τ) increases asymptotically toward the value 1/ρ, and
therefore W2(τ) increases asymptotically toward the value r/ρ − 1.
Since we have assumed r > ρ, there exists a τ1 such that
W2(τ1) = (1 − e^{−ρτ1})r/ρ − 1 = 0. It is easy to compute
τ 1 = (1/ρ) ln[r/(r − ρ)]. (5.35)
From this expression, it is clear that the firm leaves Subcase A1 provided
τ 1 < T. Moreover, this observation also makes precise the notion of a
sufficiently large T in Case A by having T > τ 1 .
Remark 5.2 When T is not sufficiently large, i.e., when T ≤ τ 1 in
Case A, the firm stays in Subcase A1. The optimal solution in this case
is u∗ = 0 and v ∗ = 0, i.e., a policy of no investment.
Remark 5.3 Note that if we had assumed r < ρ, the firm would never
have exited from Subcase A1 regardless of the value of T. Obviously,
there is no use investing if the rate of return is less than the discount
rate.
At reverse time τ 1 , we have W2 = 0 and W1 < 0 and the firm,
therefore, is in Subcase A2. Also, λ(τ 1 ) = 1/r since W2 (τ 1 ) = 0.
Subcase A2: W2 = rλ − 1 = 0.
The optimal controls in this subcase from Row (2) of Table 5.1 are
u∗ = 0, v ∗ = g/r. (5.39)
The state and the adjoint equations are
x̊ = −gx, x(τ1) = αA, (5.40)
λ̊ = (1 − g/r) − λ(ρ − g), λ(τ1) = 1/r, (5.41)
with values at τ = τ1 deduced from (5.34) and (5.35).
Since λ̊(τ1) > 0, λ is increasing at τ1 from its value of 1/r. A further
examination of the behavior of λ(τ) as τ increases will be carried out
under two different possible conditions: (i) ρ > g and (ii) ρ ≤ g.
(i) ρ > g: Under this condition, as λ increases, λ̊ decreases and
becomes zero at a value obtained by equating the right-hand side of
(5.41) to zero, i.e., at
λ̄ = (1 − g/r)/(ρ − g). (5.42)
This value λ̄ is, therefore, an asymptote to the solution of (5.41) starting
at λ(τ1) = 1/r. Since r > ρ > g in this case,
W2 = rλ̄ − 1 = r(1 − g/r)/(ρ − g) − 1 = (r − ρ)/(ρ − g) > 0, (5.43)
which implies that the firm continues to stay in Subcase A3.
(ii) ρ ≤ g: Under this condition, as λ(τ) increases, λ̊(τ) increases.
So W2(τ) = rλ(τ) − 1 continues to be greater than zero and the firm
continues to remain in Subcase A3.
Case B: g > r.
Subcase B2: W2 = rλ − 1 = 0.
u∗ = 0, 0 ≤ v ∗ ≤ 1 (5.50)
from Row (3) of Table 5.1 are singular with respect to v. As before
in Subcase A2, the singular case cannot be sustained for a finite time
because of our assumption r > ρ. As in Subcase A2, W2 is increasing at
τ1 from zero and becomes positive after τ1. Thus, at τ1+, the firm finds
itself in Subcase B3.
u∗ = 0, v ∗ = 1, (5.51)
as shown in Row (5) of Table 5.1. The state and the adjoint equations
are
x̊ = −rx, x(τ1) = αB, (5.52)
with αB a parameter to be determined, and
λ̊ = λ(r − ρ), λ(τ1) = 1/r. (5.53)
Solving (5.53), we have
λ(τ) = (1/r)e^{(r−ρ)(τ−τ1)} for τ ≥ τ1. (5.54)
As λ increases, W1 increases and becomes zero at a time τ 2 defined by
At τ2+, the firm switches to Subcase B4.
Before proceeding to Subcase B4, let us observe that in Case B, we
can now define T to be sufficiently large when T > τ 2 . See Remark 5.4
when T ≤ τ 2 .
0 ≤ u∗ ≤ (g − r)/rc, v ∗ = 1. (5.57)
From Row (6) in Table 5.1, these controls are singular with respect to
u. To maintain this singular control over a finite time period, we must
keep W1 = 0 in the interval. This means we must have W̊1(τ2) = 0,
which, in turn, implies λ̊(τ2) = 0. To compute λ̊, we substitute (5.57)
into (5.32) and obtain
λ̊ = −u∗ − λ{ρ − r(cu∗ + 1)}. (5.58)
The optimal controls in this subcase from Row (4) of Table 5.1 are
u∗ = (g − r)/(rc), v∗ = 1. (5.60)
Then from (5.31) and (5.32), the reverse-time state and the adjoint equations are
x̊ = −gx, (5.61)
λ̊ = −(g − r)/(rc) + λ(g − ρ). (5.62)
Since λ̊(τ2) > 0 from (5.59), λ(τ) is increasing at τ2 from its value
λ(τ2) = 1/(rc) > 0. Furthermore, we have g > r in Case B, which together
with r > ρ, assumed throughout Sect. 5.2, makes g > ρ. This implies that
the second term on the right-hand side of (5.62) is increasing. Moreover,
the second term dominates the first term for τ > τ2, since λ(τ2) =
1/(rc) > 0, and r > ρ and g > r imply g − ρ > g − r > 0. Thus,
λ̊(τ) > 0 for τ > τ2, and λ(τ) increases with τ. Therefore, the firm
continues to stay in Subcase B5.
Remark 5.6 When T is not sufficiently large, i.e., when T < τ2 in Case
B, the optimal solution is the same as in Remark 5.2 when T ≤ τ1. If
τ1 < T ≤ τ2, then the optimal solution is u∗ = 0 and v∗ = 1 until
t = T − τ1. For t > T − τ1, the optimal solution is u∗ = 0 and v∗ = 0.
Having completely solved the finite horizon case, we now turn to the
infinite horizon case.
Case A: g ≤ r.
Let us first consider the case ρ > g and examine the solution in
forward time obtained in (5.44)–(5.48) as T goes to infinity. Clearly
(5.45) and (5.46) disappear, and (5.44) and (5.48) can be written as
λ(t) = (1 − g/r)/(ρ − g) = λ̄, t ≥ 0. (5.67)
Clearly λ(t) satisfies (5.65). Furthermore,
W2(t) = rλ − 1 = (r − ρ)/(ρ − g) > 0, t ≥ 0,
Since there are many policies which give an infinite value to the
objective function, the choice among them may be decided on subjective
grounds. We will briefly discuss only the constant (over time) optimal
policies. If g < r, then the rate of growth q may be chosen in the
closed interval [ρ, g]; if g = r, then q may be chosen in the half-open
interval [ρ, r). In either case, the choice of a low rate of growth (i.e., a
high proportional dividend payout) would mean a higher dividend rate
(in dollars per unit time) early in time, but a lower dividend rate later
in time because of the slower growth rate. Similarly the choice of high
growth rate means the opposite in terms of dividend payments in dollars
per unit time.
To conclude, we note that for ρ ≤ g in Case A, the limiting solution
of the finite case is an optimal solution for the infinite horizon problem
in the sense that the objective function becomes infinite. However, this
will not be the situation in Case B; see also Remark 5.7.
Case B: g > r.
Remark 5.7 Let (u∗_T, v∗_T) denote the optimal control for the finite
horizon problem in Case B. Let (u∗_∞, v∗_∞) denote any optimal control
for the infinite horizon problem in Case B. We already know that
J(u∗_∞, v∗_∞) = ∞. Define an infinite horizon control (u∞, v∞) by
extending (u∗_T, v∗_T) as follows:
(u∞, v∞) = lim_{T→∞} (u∗_T, v∗_T).
τ1 = (1/ρ) ln[r/(r − ρ)] = 10 ln 3 ≈ 11 months.
u∗ = 0, v ∗ = 0, t ∈ [49, 60],
Note that the infinite horizon problem is well defined in this case, since
g < ρ and g < r. The optimal controls are
u∗ = 0, v ∗ = g/r = 1/3,
and
and
J = ∫_0^∞ e^{−0.1t}(2/3)(1000)e^{0.05t} dt = 2000/0.15 = 13,333 1/3.
In Exercise 5.14, you are asked to extend the optimal financing model
to allow for debt financing. Exercise 5.15 requires you to reformulate the
optimal financing model (5.21) with decisions expressed in dollars per
unit of time rather than in terms relative to x. Exercise 5.16 extends the
model to allow the rate of return on the assets to decrease as the assets
grow.
E 5.5 For the solution found in Exercise 5.4, show by using the maxi-
mum principle (4.29) that the adjoint trajectories are:
λ1(t) = { λ1(0) = e^{1.5}, 0 ≤ t ≤ 5;  λ1(5)e^{−0.3(t−5)} = e^{3−0.3t}, 5 ≤ t ≤ 10 },
and
λ2(t) = { λ2(0)e^{−0.1t} = e^{1.5+0.1(t∗−t)}, 0 ≤ t ≤ f(t∗) ≈ 6.52;  (2/3) + (1/3)e^{3−0.3t}, f(t∗) < t ≤ 10 },
E 5.7 Discuss the optimal equity financing model of Sect. 5.2.1 when
c = 1. Show that only one control variable is needed. Then solve the
problem.
E 5.8 What happens in the optimal equity financing model when r < ρ?
Guess the optimal solution (without actually solving it).
E 5.10 Let g = 0.12 in Example 5.1. Re-solve the finite horizon problem
with this new value of g. Also, for the infinite horizon problem, state a
policy which yields an infinite value for the objective function.
e−ρT Bx(T ),
ż = wx, y(0) = y0 .
How would you modify the objective function, the state equation for x,
and the growth constraint (5.19)? Assume i to be the constant interest
rate on debt, and i < r.
E 5.17 Find the form of the optimal policy for the following model due
to Davis and Elzinga (1971):
max_{u,v} { J = ∫_0^T e^{−ρt}(1 − v)Er dt + P(T)e^{−ρT} }
subject to
Ṗ = k[rE(1 − v) − ρP ], P (0) = P0 ,
u ≥ 0, v ≥ 0, cu + v ≤ g/r.
Here P denotes the price of a stock, E denotes equity per stock and
k > 0 is a constant. Also, assume r > ρ > g and 1/c < r/ρ < 1/c + (ck +
1)g/(ρck). This example requires the use of the generalized Legendre-
Clebsch condition (D.69) in Appendix D.8.
Chapter 6
Applications to Production
and Inventory
∂H/∂P = λ − c(P − P̂) = 0. (6.4)
From this we obtain the optimal production rate
P∗ = P̂ + λ/c. (6.5)
Substituting (6.5) into the state and adjoint equations and eliminating λ yields
Ï = ρ(İ − P̂ + S) + (h/c)(I − Î) − Ṡ.
We rewrite this as the second-order equation (6.9) in I, whose auxiliary equation is
m² − ρm − α² = 0, where α² = h/c,
with roots m1 = [ρ − √(ρ² + 4α²)]/2 and m2 = [ρ + √(ρ² + 4α²)]/2;
note that m1 < 0 and m2 > 0. We can therefore write the general solution
to (6.9) as
I(t) = a1 em1 t + a2 em2 t + Q(t), I(0) = I0 , (6.12)
where Q(t) is a particular integral of (6.9).
We will say that Q(t) is a special particular integral of (6.9) if it has
no additive terms involving em1 t and em2 t . From now on we will always
assume that Q(t) is a special particular integral.
Although (6.12) has two arbitrary constants a1 and a2 , it has only
one boundary condition. To get the other boundary condition we dif-
ferentiate (6.12), substitute the result into (6.7), and solve for λ. We
obtain
b1 = I0 − Q(0), (6.14)
b2 = P̂ − Q̇(T ) − S(T ). (6.15)
We now impose the boundary conditions in (6.12) and (6.13) and solve
for a1 and a2 as follows:
a1 ≈ b1, (6.18)
a2 ≈ (b2/m2) e^{−m2 T}. (6.19)
Note that for a large T, e−m2 T is close to zero and, therefore, a2 is close
to zero. However, the reason for retaining the exponential term in (6.19)
is that a2 is multiplied by em2 t in (6.13), which, while small when t is
small, becomes large and important when t is close to T.
With these values of a1 and a2 and with (6.5), (6.12), and (6.13),
we now write the expressions for I ∗ , P ∗ , and λ. We will break each
expression into three parts: the first part labeled Starting Correction
is important only when t is small; the second part labeled Turnpike
Expression is significant for all values of t; and the third part labeled
Ending Correction is important only when t is close to T.
remains finite. One can then show that the limit of the finite horizon
solution as T → ∞ also solves the infinite horizon problem. Note that as
T → ∞, the ending correction terms in (6.20)–(6.22) disappear because
e−m2 T goes to 0. We now have
Q(t) = Î + B/α² + Σ_{k=1}^{K} [πCk Dk/(α² + (πDk)²)] cos(πDk t + Ek). (6.31)
Since the two-point boundary value problem given by (6.7) and (6.8)
is a linear system of differential equations, it is known via its fundamental
solution matrix that λ can be expressed in terms of I in a linear way as
follows:
λ(t) = ψ(t) − s(t)I(t), (6.32)
where ψ(t) and s(t) are continuously differentiable in t. Differentiating
(6.32) with respect to t and substituting for I˙ and λ̇ from (6.7) and (6.8)
with ρ = 0, respectively, we obtain
Since the above relation must hold for any value of the initial inventory
I0 , we must have
Also from λ(T ) = 0 in (6.8) and (6.32), we have 0 = ψ(T ) − s(T )I(T ),
a relation that must hold regardless of the value of I(T ). Thus, we can
conclude that
s(T ) = 0 and ψ(T ) = 0. (6.34)
Clearly, the solution of the differential equation given by (6.33) and
(6.34) will give us the optimal control (6.5) in terms of S(t) and ψ(t). In
particular, the differential equation
ṡ = s2 /c − h, s(T ) = 0 (6.35)
This says that the optimal production rate equals the production goal
level P̂ plus two adjustment terms. The first term implies ceteris paribus
that the higher the current inventory level, the lower the production rate
is. Furthermore, this dependence is linear with the linear effect decreas-
ing as t increases, reaching zero at t = T. The second term depends on
all the model parameters including the demand rate from time t to T.
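Equation (6.35) is a scalar Riccati equation that can be integrated backward from s(T) = 0; one can verify by differentiation that s(t) = √(hc) tanh(√(h/c)(T − t)) solves it. A minimal numerical cross-check (illustrative h and c):

    import numpy as np
    from scipy.integrate import solve_ivp

    h, c, T = 2.0, 0.5, 5.0

    # In reverse time tau = T - t, (6.35) reads ds/dtau = h - s^2/c, s(0) = 0.
    sol = solve_ivp(lambda tau, s: h - s[0]**2 / c, (0.0, T), [0.0],
                    dense_output=True, rtol=1e-8)

    tau = np.linspace(0.0, T, 6)
    closed = np.sqrt(h * c) * np.tanh(np.sqrt(h / c) * tau)
    print(np.abs(sol.sol(tau)[0] - closed).max())   # tiny: the two forms agree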
Moreover, from the state in (6.25), we can obtain the corresponding I ∗ (t)
as
I ∗ (t) = (I0 − Q)em1 t + Q. (6.41)
It is easy to see that I ∗ (t) increases monotonically to Q as t → ∞, as
shown in Fig. 6.4.
If Q < I0 ≤ Q − S/m1 , we can easily see from (6.40) that P ∗ (0) ≥ 0.
Furthermore, Ṗ ∗ (t) ≥ 0, and therefore the optimal production rate is
once again given by (6.40). We also have I ∗ (t) as in (6.41) and conclude
that I ∗ (t) → Q monotonically as t → ∞, as shown in Fig. 6.4.
Finally, if I0 > Q − S/m1 , (6.40) would have a negative value for the
initial production which is infeasible. By (6.6), P ∗ (0) = 0. We can now
depict this situation in Fig. 6.4. The time t̂ shown in the figure is the
time at which P ∗ (t̂) = P̂ + λ(t̂)/c = 0. We already know from (6.40) that
in the case when I0 = Q − S/m1 , P ∗ (0) = 0. This suggests that
I∗(t̂) = Q − S/m1. (6.42)
As for the adjoint equation (6.7), we now need the boundary condition
at t̂. For this, we can use (6.4) to obtain λ(t̂) = −cP̂ . Thus, the adjoint
equation in the interval [0, t̂ ] is
λ̇ = ρλ + h(I − Î), λ(t̂) = −cP̂. (6.44)
We can substitute I0 − St for I in Eq. (6.44) and solve for λ. Note that
we can easily obtain t̂ as
I0 − St̂ = Q − S/m1 ⟹ t̂ = (I0 − Q)/S + 1/m1. (6.45)
We can now specify the complete solution in the case when I0 >
Q − S/m1 . With t̂ specified in (6.45), the solution is as follows.
Figure 6.4: Optimal production rate and inventory level with different
initial inventories
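The three cases are easy to tabulate (a sketch with illustrative parameter values; differentiating (6.41) gives İ = m1(I0 − Q)e^{m1 t}, so the implied production rate is P∗(t) = S + m1(I0 − Q)e^{m1 t} while it stays nonnegative):

    import numpy as np

    rho, h, c = 0.0, 1.0, 1.0               # illustrative parameters
    alpha2 = h / c
    m1 = (rho - np.sqrt(rho**2 + 4.0 * alpha2)) / 2.0   # negative root
    Q, S = 50.0, 10.0
    threshold = Q - S / m1                   # = 60 here; cf. (6.42)

    for I0 in (40.0, 55.0, 80.0):            # one starting point per case
        if I0 <= threshold:
            print(I0, 'P*(0) =', S + m1 * (I0 - Q), '>= 0: produce at once')
        else:
            # (6.40) would be negative, so P* = 0 until t_hat from (6.45).
            print(I0, 'P*(0) = 0; production starts at t_hat =',
                  (I0 - Q) / S + 1.0 / m1)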
the total value of its assets at the horizon time T. The problem here is
similar to the simple cash balance model of Sect. 5.1 except that there
are nonlinear holding costs associated with storing wheat. An extension
of this model to one having two control variables appears in Ijiri and
Thompson (1972).
− V2 ≤ v(t) ≤ V1 , (6.48)
and (6.52) as
λ2(t) = p(T) − ∫_t^T h′(y(τ))e^{r(T−τ)} dτ. (6.54)
In Exercise 6.8 you are asked to provide the interpretation of this optimal
policy.
Equations (6.46), (6.47), (6.54), and (6.55) determine the two-point
boundary value problem which usually requires a numerical solution pro-
cedure. In the next section we assume a special form for the storage
function h(y) to be able to obtain a closed-form solution.
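As a rough illustration of such a numerical procedure, here is a shooting sketch in Python. It specializes to r = 0, so that (6.54) reduces to the costate dynamics λ̇₂ = h′(y) with λ₂(T) = p(T); the price path p(t), the storage cost h(y), and the initial inventory y(0) = 0 below are hypothetical stand-ins, not the model's data.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

T = 6.0
p = lambda t: 3.0 + 0.1 * t          # hypothetical increasing price path
hprime = lambda y: y                  # hypothetical h(y) = y^2/2, so h'(y) = y

def rhs(t, z):
    y, lam2 = z
    v = 1.0 if lam2 > p(t) else -1.0  # buy when lambda_2 > p, else sell
    return [v, hprime(y)]             # state and costate dynamics (r = 0)

def terminal_miss(lam2_0):
    sol = solve_ivp(rhs, [0.0, T], [0.0, lam2_0], max_step=0.01)
    return sol.y[1, -1] - p(T)        # enforce lambda_2(T) = p(T)

lam2_0 = brentq(terminal_miss, 0.0, 10.0)   # shoot on the initial costate
print("lambda_2(0) =", lam2_0)
```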
The graph of the price function is shown in Fig. 6.5. Since p(t) is
increasing, short-selling is never optimal. Since the storage cost is 1/2
per unit per unit time and the wheat price jumps by 1 unit at t = 3, it
never pays to store wheat for more than 2 time units. Because y(0) = 0,
we have v ∗ (t) = 0 for 0 ≤ t ≤ 1. This obviously must be a singular
control. Suppose we start buying wheat at t∗ > 1. From (6.60) the rate
of buying is 1; clearly buying will continue at this rate until t = 3, and
not longer. In order to not lose money on the storage of wheat, it must be
sold within 2 time units of its purchase. Clearly we should start selling
at t = 3+ at the maximum rate of 1, and continue until a last sale time
t∗∗ . In order to sell exactly all of the wheat purchased, we must have
3 − t∗ = t∗∗ − 3. (6.61)
Thus, v ∗ (t) = 0 in the interval [t∗∗ , 6], which is also a singular control.
With this policy, y(t) > 0 for all t ∈ (t∗ , t∗∗ ). From (6.59), λ̇2 = 1/2 in
the interval (t∗ , t∗∗ ). In order to have a singular control in the interval
[t∗∗ , 6], we must have λ2 (t) = 4 in that interval. Also, in order to have a
singular control in [0, t∗ ], we must have λ2 (t) = 3 in that interval. Thus,
λ2 (t∗∗ ) − λ2 (t∗ ) = 1, which with λ̇2 = 1/2 allows us to conclude that
t∗∗ − t∗ = 2,    (6.62)
which together with (6.61) gives t∗ = 2 and t∗∗ = 4.
We can now sketch graphs for λ2 (t), v ∗ (t), and y ∗ (t) as shown in
Fig. 6.6. In Exercise 6.13 you are asked to show that these trajectories are
optimal by verifying that the maximum principle necessary conditions
hold and that they are also sufficient.
Figure 6.6: Adjoint variable, optimal policy and inventory in the wheat
trading model
L = H + μ1 (v + 1) + μ2 (1 − v) + ηv, (6.69)
μ1 ≥ 0, μ1 (v + 1) = 0, (6.70)
μ2 ≥ 0, μ2 (1 − v) = 0, (6.71)
η ≥ 0, ηy = 0. (6.72)
and the optimal control from (6.67) or (6.68) is v ∗ = 1, i.e., buy wheat
at the maximum rate of 1, so long as λ2 (t) > p(t). Also, this will give
y(3) > 0, so that (6.75) holds. Let us next find the time t̂ of the last
jump before the terminal time. Clearly, this value will not be larger than
the time at which λ2 (t) = p(t). Thus,
λ2 − p + μ1 − μ2 + η = λ2 − p + η = 0,
and consequently
We can now use the jump condition in (4.29) on the adjoint variables
to obtain
It is important to note that in the interval [1, 1.8], the optimal control
condition (6.68) holds, justifying our supposition that v ∗ = 0 in this
interval. Furthermore, using (6.80) and (6.74),
and the optimal control condition (6.67) holds, justifying our supposition
that v ∗ = −1 in this interval. Also, we can conclude that our guess γ = 0
Figure 6.7: Adjoint trajectory and optimal policy for the wheat trading
model
is correct. The graphs of λ2 (t), p(t), and v ∗ (t) are displayed in Fig. 6.7.
To complete the solution of the problem, you are asked to determine the
values of μ1 , μ2 , and η in these various intervals.
Figure 6.8: Decision horizon and optimal policy for the wheat trading
model
where
γ₁ ≥ 0, γ₁y(4) = 0, γ₂ ≥ 0, γ₂(1 − y(4)) = 0.    (6.91)
Let us first try γ₁ = γ₂ = 0. Let t̂ be the time of the last jump of the
adjoint function λ₂(t) before the terminal time T = 4. Then,
Figure 6.9: Optimal policy and horizons for the wheat trading model
with no short-selling and a warehouse constraint
To find the actual value of t̂ we must insert a line of slope 1/2 above
the minimum price at t = 2 in such a way that its two intersection points
with the price trajectory are exactly one time unit (the time required to
fill up the warehouse) apart. Thus using (6.83), t̂ must satisfy
−2(t̂ − 1) + 7 + (1/2)(1) = t̂ + 1,
which gives t̂ = 17/6.
v ∗ = 0, y ∗ = 0.
v ∗ = 1, 0 < y ∗ < 1.
v ∗ = 0, y ∗ = 1.
In Exercise 6.17 you are asked to solve another variant of this problem.
For the example in Fig. 6.9 we have labeled t = 1 as a decision horizon
and t̂ = 17/6 as a strong forecast horizon. By this we mean that the
which is sketched in Fig. 6.10. Note that the price trajectory up to time
17/6 is the same as before, and the price after time 17/6 goes above the
extension of the price shield in Fig. 6.9.
Solution The new λ2 trajectory is shown in Fig. 6.10, which is the same
as before for t < 17/6, and after that it is λ2 (t) = t/2+6 for t ∈ [17/6, 4].
The optimal policy is as shown in Fig. 6.10, and as previously asserted,
the optimal policy in [0,1) remains unchanged. In Exercise 6.17 you are
asked to verify the maximum principle for the solution of Fig. 6.10.
Solution Again the price trajectory is the same up to time 17/6, but
the price after time 17/6 is declining. This changes the optimal policy
in the time interval [1, 17/6), but the optimal policy will still be to sell
in [0, 1).
As in the beginning of the section, we solve (6.90) to obtain λ2 (t) =
t/2+5/4 for t̂1 ≤ t ≤ 4, where t̂1 ≥ 1 is the time of the last jump which is
to be determined. It is intuitively clear that some profit can be made by
buying and selling to take advantage of the price rise between t = 2 and
t = 17/6. For this, the λ2 (t) trajectory must cross the price trajectory
between times 2 and 17/6 as shown in Fig. 6.11, and the inventory y
must go to 0 between times 17/6 and 4 so that λ2 can jump downward
to satisfy the ending condition λ2 (4− ) = p(4) = 13/4. Since we must
buy and sell equal amounts, the point of intersection of the λ2 trajectory
with the rising price segment, i.e., t̂1 − α, must be exactly in the middle
of the two other intersection points, t̂1 and t̂1 − 2α, of λ2 with the two
declining price trajectories. Thus, t̂1 and α must satisfy:
−2(t̂1 − 2α) + 7 + α/2 = (t̂1 − α) + 1,
(t̂1 − α) + 1 + α/2 = −t̂1 /2 + 21/4.
These can be solved to yield t̂1 = 163/54 and α = 5/9. The times
t̂1 , t̂1 − α, and t̂1 − 2α are shown in Fig. 6.11. The λ2 trajectory is given
by

λ₂(t) = t/2 + 9/2 for t ∈ [0, 1),
λ₂(t) = t/2 + 241/108 for t ∈ [1, 163/54),
λ₂(t) = t/2 + 5/4 for t ∈ [163/54, 4].
Evaluation of the Lagrange multipliers and verification of the maximum
principle is similar to that for the case in Fig. 6.9.
In Sect. 6.3 we have given several examples of decision horizons and
weak and strong forecast horizons. In Sect. 6.3.1 we found a decision
horizon which was also a weak forecast horizon, and it occurred exactly
when y(t) = 0. We also introduced the idea of a price shield in that
section. In Sect. 6.3.2 we imposed a warehousing constraint and obtained
the same decision horizon and a strong forecast horizon, which occurred
when y(t) = 1.
Note that if we had solved the problem with T = 1, then y ∗ (1) = 0;
and if we had solved the problem with T = 17/6, then y ∗ (1) = 0 and
y ∗ (17/6) = 1. The latter problem has the smallest T such that both
y ∗ = 0 and y ∗ = 1 occur for t > 0, given the price trajectory. This is
one of the ways that time t = 17/6 can be found to be a forecast horizon
along with the decision horizon at time t = 1. There are other ways to
find strong forecast horizons. For a survey of the literature, see Chand
et al. (2002).
E 6.1 Verify the expressions for a1 and a2 given in (6.16) and (6.17).
E 6.6 For the model of Sect. 6.1.6, derive the turnpike triple by using
the conditions in (6.39).
E 6.7 Solve the production-inventory model of Sect. 6.1.6 for the pa-
rameter values listed on Fig. 6.4, and draw the figure using MATLAB or
another suitable software.
E 6.11 Set up the two-point boundary value problem for Exercise 6.9
with c = 0.05, h(y) = (1/2)y 2 , and the remaining values of parameters
as in the model of Sect. 6.2.3.
E 6.13 Show that the solution obtained for the problem in Sect. 6.2.3
satisfies the necessary conditions of the maximum principle. Conclude
the optimality of the solution by showing that the maximum principle
conditions are also sufficient.
E 6.15 Compute the optimal trajectories for μ1 , μ2 , and η for the model
in Sect. 6.2.4.
E 6.16 Solve the model in Sect. 6.2.4 with each of the following condi-
tions:
(a) y(0) = 2.
E 6.17 Verify that the solutions shown in Figs. 6.10 and 6.11 satisfy the
maximum principle.
E 6.18 Re-solve the model of Sect. 6.3.2 with y(0) = 1/2 and with the
warehousing constraint y ≤ 1/2 in place of (6.82).
E 6.20 Re-solve Exercise 6.19 with the state equation İ(t) = P(t) − S(t),
where I(0) = I0 ≥ 0 and I(T ) is not fixed. Assume the demand S(t) to
be continuous in t and non-negative. Keep the state constraint I ≥ 0, but
drop the production constraint P ≥ 0 for simplicity. For specificity, you
may assume S = − sin πt + C with the constant C ≥ 1 and T = 4. (Note
that negative production can and will occur when initial inventory I0 is
too large. Specifically, how large is too large depends on the parameters
of the problem.)
E 6.21 Re-solve Exercise 6.19 with the state equation İ(t) = P(t) − S,
where S > 0 and h > 0 are constants, I(0) = I₀ > cS²/2h, and I(T) is
not fixed. Assume that T is sufficiently large. Also, graph the optimal
P ∗ (t) and I ∗ (t), t ∈ [0, T ].
Chapter 7
Applications to Marketing
Assuming the rate of total production cost is c(S), we can write the total
revenue net of production cost as
R(p, G; Z) = pS(p, G; Z) − c(S(p, G; Z)). (7.3)
The revenue net of advertising expenditure is therefore R(p, G; Z) − u.
We assume that the firm wants to maximize the present value of net
revenue streams discounted at a fixed rate ρ, i.e.,
max_{u≥0, p≥0} J = ∫_0^∞ e^{−ρt} [R(p, G; Z) − u] dt    (7.4)
subject to (7.1).
Since the only place that p occurs is in the integrand, we can max-
imize J by first maximizing R with respect to price p while holding G
fixed, and then maximize the result with respect to u. Thus,
∂R/∂p = S + p ∂S/∂p − c′(S) ∂S/∂p = 0,    (7.5)
which implicitly gives the optimal price p∗ (t) = p(G(t); Z(t)). Defining
η = −(p/S)(∂S/∂p) as the elasticity of demand with respect to price,
we can rewrite condition (7.5) as
p∗ = ηc′(S)/(η − 1),    (7.6)
which is the usual price formula for a monopolist, known sometimes as
the Amoroso-Robinson relation. You are asked to derive this relation
in Exercise 7.2. In words, the relation means that the marginal revenue
(η − 1)p/η must equal the marginal cost c′(S). See, e.g., Cohen and Cyert
(1965, p. 189).
Defining Π(G; Z) = R(p∗ , G; Z), the objective function in (7.4) can
be rewritten as
max_{u≥0} J = ∫_0^∞ e^{−ρt} [Π(G; Z) − u] dt.
For convenience, we assume Z to be a given constant. Thus, we can
define π(G) = Π(G; Z) and restate the optimal control problem which
we have just formulated:
max_{u≥0} J = ∫_0^∞ e^{−ρt} [π(G) − u] dt
subject to    (7.7)
Ġ = u − δG, G(0) = G₀.
Recall from Sect. 3.6 that this limit condition is only a sufficient condi-
tion.
The adjoint variable λ(t) is the shadow price associated with the
goodwill at time t. Thus, the Hamiltonian in (7.8) can be interpreted as
the dynamic profit rate which consists of two terms: (1) the current net
profit rate (π(G) − u) and (2) the value λĠ = λ[u − δG] of the goodwill
rate Ġ created by advertising at rate u. Also, Eq. (7.9) corresponds to
the usual equilibrium relation for investment in capital goods; see Arrow
and Kurz (1970) and Jacquemin (1973). It states that the marginal
opportunity cost λ(ρ + δ)dt of investment in goodwill, by spending on
advertising, should equal the sum of the marginal profit π′(G)dt from the
increased goodwill due to that investment and the capital gain dλ := λ̇dt
on the unit price of goodwill.
We use (3.108) to obtain the optimal long-run stationary equilibrium
or turnpike {Ḡ, ū, λ̄}. That is, we obtain λ = λ̄ = 1 from (7.8) by using
∂H/∂u = 0. We then set λ = λ̄ = 1 and λ̇ = 0 in (7.9) to obtain
π′(Ḡ) = ρ + δ.    (7.11)
In order to obtain a strictly positive equilibrium goodwill level Ḡ, we
may assume π′(0) > ρ + δ and π′(∞) < ρ + δ.
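As a small worked illustration of (7.11), suppose the profit function were π(G) = α ln(1 + G), a hypothetical concave choice (not from the text); then π′(Ḡ) = ρ + δ can be solved in closed form:

```python
# Solving pi'(G) = rho + delta in (7.11) for the turnpike level G-bar,
# with a hypothetical concave profit function pi(G) = alpha*ln(1 + G).
# Then pi'(G) = alpha/(1 + G), so G-bar = alpha/(rho + delta) - 1, and
# the advertising that maintains the turnpike is u-bar = delta*G-bar.
alpha, rho, delta = 2.0, 0.1, 0.15   # note pi'(0) = 2 > rho + delta
G_bar = alpha / (rho + delta) - 1.0
u_bar = delta * G_bar
print(G_bar, u_bar)                  # 7.0 and 1.05
```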
Before proceeding further to obtain the optimal advertising policy, let
us relate (7.11) to the equilibrium condition for Ḡ obtained by Jacquemin
(1973). For this we define β = (G/S)(∂S/∂G) as the elasticity of demand
with respect to goodwill. We can now use (7.3), (7.5), (7.6), and (7.9)
with λ̇ = 0 and λ̄ = 1 to derive, as you will in Exercise 7.3,
Ḡ/(pS) = β/[η(ρ + δ)].    (7.12)
The interpretation of (7.12) is that in the equilibrium, the ratio of good-
will to sales revenue pS is directly proportional to the goodwill elasticity,
inversely proportional to the price elasticity, and inversely proportional
to the cost of maintaining goodwill given by the marginal opportunity
cost λ(ρ + δ) of investment in goodwill.
The property of Ḡ is that the optimal policy is to go to Ḡ as fast
as possible. If G0 < Ḡ, it is optimal to jump instantaneously to Ḡ by
applying an appropriate impulse at t = 0 and then set u∗ (t) = ū = δ Ḡ
for t > 0. If G0 > Ḡ, the optimal control u∗ (t) = 0 until the stock of
goodwill depreciates to the level Ḡ, at which time the control switches
to u∗ (t) = δ Ḡ and stays at this level to maintain the level Ḡ of goodwill.
This optimal policy is graphed in Fig. 7.1 for these two different initial
conditions.
Of course, if we had imposed an upper bound M > 0 on the control
so that 0 ≤ u ≤ M, then for G0 < Ḡ, we would use u∗ (t) = M until
the goodwill stock reaches Ḡ and switch to u∗ (t) = ū thereafter. This is
shown as the dotted curve in Fig. 7.1.
Problem (7.7) is formulated with the assumption that a dollar spent
on current advertising increases goodwill by one unit. Suppose, instead,
that we need to spend m dollars on current advertising to increase good-
will by one unit. We can then define u as advertising effort costing the
firm mu dollars, and reformulate problem (7.7) by replacing [π(G) − u]
in its integrand by [π(G) − mu]. In Exercise 7.4, you are asked to solve
problem (7.7) with its objective function and the control constraint
replaced by
max_{0≤u≤M} J = ∫_0^∞ e^{−ρt} [π(G) − mu] dt,    (7.13)
and show that the equilibrium goodwill level formula (7.11) changes to
π′(Ḡ) = m(ρ + δ).    (7.14)
With Ḡ thus defined, the optimal solution is as shown in Fig. 7.1 with
the dotted curve representing the solution in Case 2: G0 < Ḡ.
For a time-dependent Z, however, Ḡ(t) = G(Z(t)) will be a func-
tion of time. To maintain this level of Ḡ(t), the required control is
ū(t) = δḠ(t) + dḠ(t)/dt. If Ḡ(t) is decreasing sufficiently fast, then ū(t) may
become negative and thus infeasible. If ū(t) ≥ 0 for all t, then the opti-
mal policy is as before. However, suppose ū(t) is infeasible in the interval
[t1 , t2 ] shown in Fig. 7.2. In such a case, it is feasible to set u(t) = ū(t)
for t ≤ t1 ; at t = t1 (which is point A in Fig. 7.2) we can no longer stay
on the turnpike and must set u(t) = 0 until we hit the turnpike again (at
point B in Fig. 7.2). However, such a policy is not necessarily optimal.
For instance, suppose we leave the turnpike at point C anticipating the
infeasibility at point A. The new path CDEB may be better than the
old path CAB. Roughly the reason this may happen is that path CDEB
is “nearer” to the turnpike than CAB. The picture in Fig. 7.2 illustrates
such a case. The optimal policy is the one that is “nearest” to the turn-
pike. This discussion will become clearer in Sect. 7.2.2, when a similar
situation arises in connection with the Vidale-Wolfe model. For further
details, see Sethi (1977b) and Breakwell (1968).
The Nerlove-Arrow model is an example involving bang-bang and
impulse controls followed by a singular control, which arises in a class of
optimal control problems of Model Type (b) in Table 3.3 that are linear
in control.
Nonlinear extensions of the Nerlove-Arrow model have been offered
in the literature. These amount to making the objective function non-
linear in advertising. Gould (1970) extended the model by assuming a
nonlinear advertising cost c(u), so that the problem becomes

max_{0≤u≤M} J₁ = ∫_0^T e^{−ρt} [π(G) − c(u)] dt
subject to    (7.15)
Ġ = u − δG, G(0) = G₀.
Note that with concave c(u), the profit rate π(G) − c(u) is convex
in u. Thus, its maximum over u would occur at the boundary 0 or M
of the set [0, M]. It should be clear that if we replace c(u) by the linear
function mu with m = c(M)/M, then
c(u) ≥ mu for u ∈ [0, M], with equality at u = 0 and u = M.    (7.16)
This means that if problem (7.15) with mu in place of c(u), i.e., the
problem
max_{0≤u≤M} J₂ = ∫_0^T e^{−ρt} [π(G) − mu] dt
subject to    (7.17)
Ġ = u − δG, G(0) = G₀
has only the bang-bang solution, then the solution of problem (7.17)
would also be the solution of the convex problem (7.15). Given the
similarity of problem (7.17) to problem (7.7), we can see that for a suf-
ficiently small value of T, the solution of (7.17) will be bang-bang only,
and therefore, it will also solve (7.15). However, if T is large or infinity,
then the solution of (7.17) will have a singular portion, and it will not
solve (7.15).
In particular, let us consider problems (7.15) and (7.17) when T = ∞
and G0 < Ḡ. Note that problem (7.17) is the same as the problem in
Exercise 7.4, and its optimal solution is as shown in Fig. 7.1 with Ḡ given
by (7.14) and the optimal trajectory given by the dotted line followed by
the solid horizontal line representing the singular part of the solution.
Let u∗₂ denote the optimal control of problem (7.17). Since the singular
control is in the open interval (0, M), in view of (7.16) we have J₁(u∗₂) < J₂(u∗₂).
Thus, for sufficiently small ε1 > 0 and ε2 > 0, we can “chatter” between
G1 = (Ḡ + ε1 ) and G2 = (Ḡ − ε2 ) by using controls M and 0 alternately,
as shown in Fig. 7.3, to obtain a near-optimal control of problem (7.15).
Clearly, in the limit as ε1 and ε2 go to 0, the objective function of problem
(7.15) will converge to J2 (u∗2 ).
subject to    (7.19)
Ġ = vM − δG, G(0) = G₀.
x = S/M.    (7.23)
Thus, x represents the market share (or more precisely, the rate of sales
expressed as a fraction of the saturation level M ). Furthermore, we
define
r = a/M, δ = b + Ṁ/M.    (7.24)
subject to
ẋ = ru(1 − x) − δx, x(τ ) = A, x(θ) = B, (7.28)
0 ≤ u ≤ Q. (7.29)
To change the objective function in (7.27) into a line integral along any
feasible arc Γ1 from (τ , A) to (θ, B) in (t, x)-space as shown in Fig. 7.4,
we multiply (7.28) by dt and obtain the formal relation
u dt = (dx + δx dt)/[r(1 − x)],
Proof If I(x) ≥ 0 for all (x, t) ∈ R, then JΓ ≥ 0 from (7.31) and (7.32).
Hence from (7.30), JΓ1 ≥ JΓ2 . The proof of the other statement is similar.
□
To make use of this lemma to find the optimal control for the problem
stated in (7.26), we need to find regions where I(x) is positive and where
it is negative. For this, note first that I(x) is an increasing function of
x in [0, 1]. Solving I(x) = 0 will give that value of x, above which I(x)
is positive and below which I(x) is negative. Since I(x) is quadratic in
1/(1 − x), we can use the quadratic formula (see Exercise 7.16) to get
x = 1 − 2δ/(−ρ ± √(ρ² + 4πrδ)).    (7.33)
To keep x in the interval [0, 1], we must choose the positive sign before
the radical. The optimal x must be nonnegative so we have
xˢ = max{1 − 2δ/(−ρ + √(ρ² + 4πrδ)), 0},    (7.34)
where the superscript s is used because this will turn out to be a singular
trajectory. Since xˢ is nonnegative, the control
uˢ = δxˢ/[r(1 − xˢ)]    (7.35)
is also nonnegative.
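For concreteness, here is a one-line computation of (7.34) and (7.35); the parameter values are illustrative, not from the text.

```python
import numpy as np

# Singular level (7.34) and singular control (7.35) of the Vidale-Wolfe
# model; pi, r, delta, rho below are illustrative values.
pi_, r, delta, rho = 2.0, 1.0, 0.3, 0.1
xs = max(1.0 - 2.0*delta/(-rho + np.sqrt(rho**2 + 4.0*pi_*r*delta)), 0.0)
us = delta*xs/(r*(1.0 - xs))   # from (7.35); requires xs < 1
print(xs, us)                  # roughly 0.587 and 0.426
```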
Theorem 7.1 Let T be large and let xT be reachable from x0 . For the
Cases 1–4 of inequalities relating x0 and xT to xs , the optimal trajectories
are given in Figures 7.5, 7.6, 7.7, and 7.8, respectively.
Proof We give details for Case 1 only. The proofs for the other cases
are similar. Figure 7.9 shows the optimal trajectory for Fig. 7.5 together
with an arbitrarily chosen feasible trajectory, shown dotted. It should
be clear that the dotted trajectory cannot cross the arc x0 to C, since
u = Q on that arc. Similarly the dotted trajectory cannot cross the arc
G to xT , because u = 0 on that arc.
We subdivide the interval [0, T ] into subintervals over which the dot-
ted arc is either above, below, or identical to the solid arc. In Fig. 7.9
these subintervals are [0, d], [d, e], [e, f ], and [f, T ]. Because I(x) is pos-
itive for x > xs and I(x) is negative for x < xs , the regions enclosed
by the two trajectories have been marked with a + or − sign depending
on whether I(x) is positive or negative on the regions, respectively. By
Lemma 7.1, the solid arc is better than the dotted arc in the subintervals
[0, d], [d, e], and [f, T ]; in interval [e, f ], they have identical values. Hence
the dotted trajectory is inferior to the solid trajectory. This proof can
be extended to any (countable) number of crossings of the trajectories;
see Sethi (1977b). □
Figures 7.5, 7.6, 7.7, and 7.8 are drawn for the situation when T >
t1 + t2 . In Exercise 7.25, you are asked to consider the case when T =
t1 + t2 . The following theorem deals with the case when T < t1 + t2 .
Theorem 7.2 Let T be small, i.e., T < t₁ + t₂, and let x_T be reachable
from x₀. For the two possible Cases 1 and 2 of inequalities relating x₀
and x_T to xˢ, the optimal trajectories are given in Figs. 7.10 and 7.11,
respectively.
Proof The requirement of feasibility when T is small rules out cases
where x0 and xT are on opposite sides of or equal to xs . The proofs of
optimality of the trajectories shown in Figs. 7.10 and 7.11 are similar to
the proofs of the parts of Theorem 7.1, and are left as Exercise 7.25. In
Figs. 7.10 and 7.11, it is possible to have either t1 ≥ T or t2 ≥ T. Try
sketching some of these special cases. □
All of the previous discussion has assumed that Q was finite and
sufficiently large, but we can easily extend this to the case when Q = ∞.
This possibility makes the arcs in Figs. 7.5, 7.6, 7.7, 7.8, 7.9, and 7.10,
corresponding to u∗ = Q, become vertical line segments corresponding to
impulse controls. For example, Fig. 7.6 becomes Fig. 7.12 when Q = ∞
and we apply the impulse control imp(x0 , xs ; 0) when x0 < xs .
Next we compute the cost of imp(x0 , xs ; 0) by assessing its effect
on the objective function of (7.26). For this, we integrate the state
equation in (7.26) from 0 to ε with the initial condition x0 and u treated
[Figure 7.12: Optimal trajectory when Q = ∞, with impulse controls at t = 0 and t = T, regimes u∗ = 0 and u∗ = Q, the level xˢ, and the times t₁ and T − t₂ marked]
Therefore,
imp(x₀, xˢ; 0) = −(1/r) ln[(1 − x₀)/(1 − xˢ)].    (7.36)
We remark that this formula holds for any time t, as well as t = 0. Hence
it can also be used at t = T to compute the impulse at the end of the
period; see Fig. 7.12 and Exercise 7.28.
H = πx − u + λ[ru(1 − x) − δx]
= πx − δλx + u[−1 + rλ(1 − x)], (7.37)
λ̇ = ρλ − ∂L/∂x = ρλ + λ(ru + δ) − π,    (7.39)
μ ≥ 0, μ(Q − u) = 0. (7.40)
u∗ = bang[0, Q; W],    (7.41)
where
W = −1 + rλ(1 − x)    (7.42)
is the coefficient of u in the Hamiltonian (7.37).
We divide the analysis of this problem into the same two cases defined
as before, namely, “Q is large” and “Q is small”.
When Q is large, the results of Theorem 7.1 suggest the solution
when T is infinite. Because of the discount factor, the ending parts of
the solutions shown in Figs. 7.5, 7.6, 7.7, and 7.8 can be shown to be
irrelevant (i.e., the discounted profit accumulated during the interval
(T − t2 , T ) goes to 0 as T goes to ∞). Therefore, we only have two
cases: (a) x0 ≤ xs , and (b) x0 ≥ xs . The optimal control in Case (a) is
to use u∗ = Q in the interval [0, t1 ) and u∗ = us for t ≥ t1 . Similarly, the
optimal control in Case (b) is to use u∗ = 0 in the interval [0, t1 ) and
u∗ = us for t ≥ t1 .
An alternate way to see that the above solutions give u∗ = us for
t ≥ t1 is to check that they satisfy the turnpike conditions (3.107). To do
this we need to find the values of the state, control, and adjoint variables
and the Lagrange multiplier along the turnpike. It can be easily shown
that x = xs , u = us , λs = π/(ρ + δ + rus ), and μs = 0 satisfy the
turnpike conditions (3.107).
When Q is small, i.e., Q < us , it is not possible to follow the turnpike
x = xs , because that would require u = us , which is not a feasible control.
Intuitively, it seems clear that the “nearest” stationary path to xs that
we can follow is the path obtained by setting ẋ = 0 and u = Q, the
largest possible control, in the state equation of (7.43). This gives
x̄ = rQ/(rQ + δ),    (7.44)
and correspondingly we obtain
λ̄ = π/(ρ + δ + rQ),    (7.45)
and
x̂ = 1 − 1/(rλ̄),    (7.46)
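A small numerical sketch of the turnpike quantities (7.44)–(7.46), with illustrative parameters chosen so that Q is small (Q < uˢ for these values); it also checks the inequality x̄ ≤ x̂ of Exercise 7.31.

```python
# Turnpike quantities when Q is small; parameter values illustrative.
pi_, r, delta, rho, Q = 2.0, 1.0, 0.3, 0.1, 0.2   # here Q < us ~ 0.426
x_bar = r*Q/(r*Q + delta)                 # (7.44)
lam_bar = pi_/(rho + delta + r*Q)         # (7.45)
x_hat = 1.0 - 1.0/(r*lam_bar)             # (7.46)
print(x_bar, lam_bar, x_hat, x_bar <= x_hat)   # 0.4, 3.33, 0.7, True
```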
Theorem 7.3 When Q is small, the quadruple {x̄, Q, λ̄, μ̄} forms a
turnpike.
Proof We show that the turnpike conditions (3.107) hold for the quadru-
ple. The first two conditions of (3.107) are (7.44) and (7.45). By Ex-
ercise 7.31 we know x̄ ≤ x̂, which, from definitions (7.46) and (7.47),
implies μ̄ ≥ 0. Furthermore ū = Q, so (7.40) holds and the third con-
dition of (3.107) also holds. Finally because W = μ̄ from (7.42) and
(7.47), it follows that W ≥ 0, so the Hamiltonian maximizing condition
of (3.107) holds with ū = Q. □
Proof (a) We set λ(t) = λ̄ for all t ≥ τ and note that λ satisfies the
adjoint equation (7.39) and the transversality condition (3.99).
By Exercise 7.31 and the assumption that x(τ ) ≤ x̂, we know that
x(t) ≤ x̂ for all t. The proof that (7.40) and (7.41) hold for all t ≥ τ
relies on the fact that x(t) ≤ x̂ and on an argument similar to the proof
of the previous theorem.
Figure 7.13 shows the optimal trajectories when x0 < x̂ for two dif-
ferent starting values of x0 , one above and the other below x̄. Note that
in this figure we are always in Case (a) since x(τ ) ≤ x̂ for all τ ≥ 0.
(b) Assume x0 > x̂. In this case we will show that the optimal trajec-
tory is as shown in Fig. 7.14, which is obtained by applying u = 0 until
x = x̂ and u = Q thereafter. Using this policy we can find the time t1
at which x(t1 ) = x̂, by solving the state equation in (7.43) with u = 0.
This gives
t₁ = (1/δ) ln(x₀/x̂).    (7.48)
Clearly for t ≥ t1 , the policy u = Q is optimal because Case (a)
applies. We now consider the interval [0, t1 ], where we set u = 0. Let τ
be any time in this interval as shown in Fig. 7.14, and let x(τ ) be the
corresponding value of the state variable. Then x(τ ) = x0 e−δτ . With
u = 0 in (7.39), the adjoint equation on [0, t1 ] becomes
λ̇ = (ρ + δ)λ − π.
We also know that x(t1 ) = x̂. Thus, Case (a) applies at time t1 , and
we would like to have λ(t1 ) = λ̄. So, we solve the adjoint equation with
λ(t1 ) = λ̄ and obtain
λ(τ) = π/(ρ + δ) + [λ̄ − π/(ρ + δ)] e^{(ρ+δ)(τ−t₁)}, τ ∈ [0, t₁].    (7.49)
Now, with the values of x(τ ) and λ(τ ) in hand, we can use (7.42)
to obtain the switching function value W (τ ). In Exercise 7.34, you are
[Figure 7.14: Optimal trajectory when x₀ > x̂: u∗ = 0 on [0, t₁) (Case (b)), followed by u∗ = Q (Case (a)), with x(τ) marked on the u∗ = 0 arc and the level xˢ shown]
E 7.2 Derive the optimal monopoly price formula in (7.6) from (7.5).
E 7.4 Re-solve problem (7.7) with its objective function and the control
constraint replaced by (7.13), and show that the only possible singular
E 7.8 Verify that G1 and G2 , which are shown in Fig. 7.3 for the pulsing
policy derived from solving problem (7.19) as a near-optimal solution of
problem (7.17) with T = ∞, are given by
G₁ = (M/δ) · (1 − e^{−δτv̄})/(1 − e^{−δτ}), G₂ = (M/δ) · (e^{−δτ(1−v̄)} − e^{−δτ})/(1 − e^{−δτ}).
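A quick simulation check of these formulas, with illustrative parameter values (M, δ, τ, and v̄ below are not from the text): iterating the on-off cycles of the pulsing policy drives G into a steady oscillation whose peak and trough match G₁ and G₂.

```python
import numpy as np

# Pulsing policy of E 7.8: advertise at rate M for a fraction v of each
# cycle of length tau, and at rate 0 for the rest, with G' = u - delta*G.
M, delta, tau, v = 4.0, 0.5, 1.0, 0.6
G = 0.0
for _ in range(200):                                   # reach a steady cycle
    G = M/delta + (G - M/delta)*np.exp(-delta*v*tau)   # on-phase, u = M
    G_high = G
    G = G * np.exp(-delta*(1.0 - v)*tau)               # off-phase, u = 0
den = 1.0 - np.exp(-delta*tau)
G1 = (M/delta)*(1.0 - np.exp(-delta*tau*v))/den
G2 = (M/delta)*(np.exp(-delta*tau*(1.0 - v)) - np.exp(-delta*tau))/den
print(G_high, G1)   # peak of the steady cycle vs. closed form
print(G, G2)        # trough of the steady cycle vs. closed form
```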
and B(t) ≥ 0 for all t. Solve only the infinite horizon model. See Sethi
and Lee (1981).
(b) Find the value of T for which the minimum advertising is optimal
throughout, i.e., u∗ (t) = 1, 0 ≤ t ≤ T.
(c) Let T = ∞. Obtain the long-run stationary equilibrium (x̄, ū, λ̄).
E 7.14 Let
E 7.15 For problem (7.26), find the reachable set for a given initial x0
and horizon time T.
E 7.17 Show that both xs in (7.34) and us in (7.35) are 0 if, and only
if, πr ≤ δ + ρ.
E 7.19 Let xs denote the solution of I(x) = 0 and let A < xs < B in
Fig. 7.4. Assume that I(x) > 0 for x > xs and I(x) < 0 for x < xs .
Construct a path Γ3 such that JΓ3 ≥ JΓ1 and JΓ3 ≥ JΓ2 .
E 7.20 For the problem in (7.26), suppose x0 and xT are given and
define xs as in (7.34). Let t1 be the shortest time to go from x0 to xs ,
and t2 be the shortest time to go from xs to xT .
(b) Using the form of the answers in (a), find t1 and t2 when x0 > xs
and xs < xT < x̄.
E 7.21 For Exercise 7.20(a), write the condition that T is large, i.e.,
T ≥ t1 + t2 , in terms of all the parameters of problem (7.26).
(b) Redo (a) when xT = 0.7. Show that both T = 13 and T = 8 are
large.
E 7.26 Sketch one or two other possible curves for the case when T is
small.
(b) Show that T > 0 is large for Exercise 7.22(b) when Q = ∞. Find
the optimal value of the objective function when T = 8.
πrδ / [(δ + ρ + rQ)(δ + rQ)] > 1.
E 7.36 Write the equation satisfied by the turnpike level x̄ for the model
max_{u≥0} J = ∫_0^∞ e^{−ρt} (πx − u²) dt
subject to
ẋ = ru(1 − x) − δx, x(0) = x₀.
E 7.37 Obtain the optimal long-run stationary equilibrium for the fol-
lowing modification of the model (7.26), due to Sethi (1983b):
max_{u≥0} J = ∫_0^∞ e^{−ρt} (πx − u²) dt
subject to    (7.50)
ẋ = ru√(1 − x) − δx, x(0) = x₀ ∈ [0, 1].
In particular, show that the turnpike triple (x̄, λ̄, ū) is given by
x̄ = (r²λ̄/2)/(r²λ̄/2 + δ), ū = rλ̄√(1 − x̄)/2,    (7.51)
and
λ̄ = [√((ρ + δ)² + r²π) − (ρ + δ)] / (r²/2).    (7.52)
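As a numerical consistency check of (7.51)–(7.52) (illustrative parameter values only), note that on the turnpike the state equation ẋ = rū√(1 − x̄) − δx̄ must vanish:

```python
import numpy as np

# Evaluate the turnpike triple (7.51)-(7.52); parameters illustrative.
pi_, r, delta, rho = 2.0, 1.0, 0.3, 0.1
lam_bar = (np.sqrt((rho + delta)**2 + r**2*pi_) - (rho + delta))/(r**2/2)
x_bar = (r**2*lam_bar/2)/(r**2*lam_bar/2 + delta)
u_bar = r*lam_bar*np.sqrt(1.0 - x_bar)/2.0
# On the turnpike, x' = r*u*sqrt(1-x) - delta*x should be zero.
print(r*u_bar*np.sqrt(1.0 - x_bar) - delta*x_bar)   # ~0
```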
E 7.39 The Ozga Model (Ozga 1960; Gould 1970). Suppose the informa-
tion spreads by word of mouth rather than by an impersonal advertising
medium, i.e., individuals who are already aware of the product inform
individuals who are not, at a certain rate, influenced by advertising ex-
penditure. What we have now is the Ozga model
subject to the Ozga model. Assume that π(x) is concave and w(u) is
convex. See Sethi (1979c) for a Green’s theorem application to this
problem.
Chapter 8

The Maximum Principle: Discrete Time
f (x) = a, (8.2)
g(x) ≥ b. (8.3)
Lx = −2x + 2λ = 0,
Ly = −2y + λ = 0,
Lλ = 2x + y − 10 = 0.
From the first two equations we get λ = x = 2y. Solving this with the
last equation yields the quantities
x∗ = 4, y ∗ = 2, λ = 4, h∗ = −20,
Lx = 8 − 2x + μ = 0, (8.14)
x − 2 ≥ 0, (8.15)
μ ≥ 0, μ(x − 2) = 0. (8.16)
Case 2: x = 2. Here from (8.14) we get μ = −4, which does not satisfy
the inequality μ ≥ 0 in (8.16).
Lx = 8 − 2x + μ = 0, (8.17)
x − 6 ≥ 0, (8.18)
μ ≥ 0, μ(x − 6) = 0. (8.19)
x∗ = 6, h∗ = h(x∗ ) = 12,
The examples above involve only one variable, and are relatively
obvious. The next example, which is two-dimensional, will reveal more
of the power and the difficulties of applying the Kuhn-Tucker conditions.
Example 8.4 Find the shortest distance between the point (2,2) and
the upper half of the semicircle of radius one with its center at the origin,
shown as the curve in Fig. 8.1. In order to simplify the calculation, we
minimize h, the square of the distance. Hence, the problem can be stated
These three points are shown in Fig. 8.1. Of the three points found that
satisfy the necessary conditions, clearly the point (1/√2, 1/√2) found in
(a) is the nearest point and solves the closest-point problem. The point
(−1, 0) in (c) is in fact the farthest point; and the point (1, 0) in (b)
is neither the closest nor the farthest point. The associated multiplier
values can be easily computed, and these are: (a) λ = 1 − 2√2, μ = 0;
(b) λ = −1, μ = 4; and (c) λ = 3, μ = 4.
[Figure 8.1: The upper semicircle, showing the closest point (1/√2, 1/√2), the farthest point (−1, 0), and the point (1, 0)]
The fact that there are three points satisfying the necessary condi-
tions, and only one of them actually solves the problem at hand, empha-
sizes that the conditions are only necessary and not sufficient. In every
case it is important to check the solutions to the necessary conditions to
see which of the solutions provides the optimum.
Next we work two examples that show some technical difficulties that
can arise in the application of the Kuhn-Tucker conditions.
max{h(x, y) = y} (8.26)
subject to
(1 − y)³ − x² ≥ 0,    (8.27)
x ≥ 0,    (8.28)
y ≥ 0.    (8.29)
The set of points satisfying the constraints is shown shaded in Fig. 8.2.
From the figure it is obvious that the solution point (0,1) maximizes the
value of y.
Hence, the optimum solution is (x∗ , y ∗ ) = (0, 1) and h∗ = 1. Let us
see if we can find it using the above procedure. The Lagrangian is
at point (x, y) = (0, 1). It has a null vector in the first row, and therefore
its rows are not linearly independent; see Sect. 1.4.10. Thus, it does
not have a full rank of three, and the condition (8.36) does not hold.
Alternatively, note that the inequality constraints (8.27) and (8.28) are
active at point (x, y) = (0, 1), and their respective gradients
(−2x, −3(1 − y)²) = (0, 0) and (1, 0) at that point are clearly not linearly independent.
where λ and μ are row vectors of multipliers associated with the con-
straints (8.2) and (8.3), respectively. We now state two theorems whose
proofs can be found in Mangasarian (1969).
[Figure: The discrete-time system, with states x⁰, x¹, . . . , x^T at stages 0, 1, . . . , T and controls u⁰, u¹, . . . , u^{T−1} applied between consecutive stages]
g(uk , k) ≥ bk , k = 0, . . . , T − 1. (8.44)
∂H^k/∂u^k = −μ^k ∂g/∂u^k,    (8.50)
and
μk ≥ 0, μk [g(uk , k) − bk ] = 0. (8.51)
We note that, provided H k is concave in uk , g(uk , k) is concave in uk , and
the constraint qualification holds, then conditions (8.50) and (8.51) are
precisely the necessary and sufficient conditions for solving the following
Hamiltonian maximization problem:
⎧
⎪
⎪
⎪ max
⎪ Hk
⎪
⎨ u k
subject to (8.52)
⎪
⎪
⎪
⎪
⎪
⎩ g(uk , k) ≥ bk .
We have thus derived the following restricted form of the discrete maxi-
mum principle.
Theorem 8.3 If for every k, H k in (8.46) and g(uk , k) are concave in
uk , and the constraint qualification holds, then the necessary conditions
for uk∗ , k = 0, 1, . . . , T − 1, to be an optimal control for the problem
(8.42)–(8.44), with the corresponding state xk∗ , k = 0, 1, . . . , T, are
Δx^{k∗} = f(x^{k∗}, u^{k∗}, k), x⁰ given,
Δλ^k = −(∂H^k/∂x^k)[x^{k∗}, u^{k∗}, λ^{k+1}, k], λ^T = ∂S(x^{T∗}, T)/∂x^T,
H^k(x^{k∗}, u^{k∗}, λ^{k+1}, k) ≥ H^k(x^{k∗}, u^k, λ^{k+1}, k)
for all u^k such that g(u^k, k) ≥ b^k, k = 0, 1, . . . , T − 1.
(8.53)
Section 8.2.3 gives examples of the application of this maximum prin-
ciple (8.53). In Sect. 8.3 we state a more general discrete maximum prin-
ciple.
8.2.3 Examples
Our first example will be similar to Example 2.4 and it will be solved
completely. The reader will note that the solutions of the continuous
and discrete problems are very similar. The second example is a discrete
version of the production-inventory problem of Sect. 6.1.
subject to
Δx^k = u^k, x⁰ = 5,    (8.55)
u^k ∈ Ω = [−1, 1].    (8.56)
Δλ^k = −(∂H^k/∂x^k)|_{x^{k∗}} = x^{k∗}, λ^T = 0.    (8.60)
Let us assume T = 6. Substitute (8.59) into (8.60) to obtain
Δλ^k = −k + 5, λ⁶ = 0,
so that
λ^k = −(1/2)k² + (11/2)k − 15.    (8.61)
A sketch of the values for λ^k and x^k appears in Fig. 8.4. Note that
λ⁵ = 0, so that the control u⁴ is singular. However, since x⁴ = 1 we
choose u⁴ = −1 in order to bring x⁵ down to 0.
The solution of the problem for T ≥ 7 is carried out in the same
way that we solved Example 2.4. Namely, observe that x5∗ = 0 and
λ5 = λ6 = 0, so that the control is singular. We simply make λk = 0 for
k ≥ 7 so that uk∗ = 0 for all k ≥ 7. It is clear without a formal proof
that this maximizes (8.54).
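A brute-force verification sketch for this example. Since the statement of (8.54) is not reproduced above, the code assumes, consistently with the adjoint equation (8.60), that the objective is to maximize Σₖ −(1/2)(x^k)²; it compares the text's policy against all controls on the grid {−1, 0, 1}.

```python
import itertools

# Discrete example with T = 6, x^0 = 5, Delta x^k = u^k from (8.55).
# Assumed objective (see lead-in): maximize sum over k of -(1/2)(x^k)^2.
T, x0 = 6, 5

def payoff(controls):
    x, J = x0, 0.0
    for u in controls:
        J += -0.5 * x**2      # running reward -(1/2)(x^k)^2
        x += u                # state equation Delta x^k = u^k
    return J

text_policy = (-1, -1, -1, -1, -1, 0)   # policy derived in the text
grid_best = max(payoff(c) for c in itertools.product((-1, 0, 1), repeat=T))
print(payoff(text_policy), grid_best)   # both equal -27.5
```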
subject to
ΔI^k = P^k − S^k, k = 0, 1, . . . , T − 1, I⁰ given.    (8.63)
Δλ^k = −∂H^k/∂I^k = h(I^k − Î), λ^T = 0.    (8.65)
Figure 8.4: Optimal state x^{k∗} and adjoint λ^k
∂H^k/∂P^k = −c(P^k − P̂) + λ^{k+1} = 0.
Since production must be nonnegative, we obtain the optimal production
as
P k∗ = max[0, P̂ + λk+1 /c]. (8.66)
Expressions (8.63), (8.65), and (8.66) determine a two-point bound-
ary value problem. For a given set of data, it can be solved numerically
by using spreadsheet software like Excel; see Sect. 2.5 and Exercise 8.21.
If the constraint P k ≥ 0 is dropped it can be solved analytically by the
method of Sect. 6.1, with difference equations replacing the differential
equations used there.
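A minimal shooting sketch for this discrete two-point boundary value problem, in Python rather than a spreadsheet; all data below (T, c, h, the targets, the initial inventory, and the demand sequence) are illustrative stand-ins, not from the text.

```python
import numpy as np
from scipy.optimize import brentq

# Shooting on lambda^0 for the system (8.63), (8.65), (8.66):
# run the coupled forward recursions and enforce lambda^T = 0.
T, c, h = 8, 1.0, 0.2
I_hat, P_hat, I0 = 10.0, 3.0, 12.0
S = 3.0 + np.sin(np.arange(T))            # hypothetical demand sequence

def sweep(lam0):
    I, lam, P = I0, lam0, []
    for k in range(T):
        lam = lam + h * (I - I_hat)       # adjoint recursion (8.65)
        P.append(max(0.0, P_hat + lam / c))   # optimal production (8.66)
        I = I + P[-1] - S[k]              # state recursion (8.63)
    return lam, P                         # lam is lambda^T

lam0 = brentq(lambda z: sweep(z)[0], -100.0, 100.0)
print(np.round(sweep(lam0)[1], 3))        # optimal production plan
```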
subject to
(ii) The sets {−F(x, Ω^k, k), f(x, Ω^k, k)} are b-directionally convex for
every x and k, where b = (−1, 0, . . . , 0). That is, given v and w in
Ω^k and 0 ≤ λ ≤ 1, there exists u(λ) ∈ Ω^k such that
F(x, u(λ), k) ≥ λF(x, v, k) + (1 − λ)F(x, w, k)
and
f(x, u(λ), k) = λf(x, v, k) + (1 − λ)f(x, w, k)
for every x and k. It should be noted that convexity implies b-directional convexity, but not the converse.
H1 < 0, |H2 | > 0, |H3 | < 0, . . . , (−1)n |Hn | = (−1)n |H| > 0
E 8.3 Find the optimal speed in cases (a) and (b) below:
(a) During times of an energy crisis, it is important to economize on
fuel consumption. Assume that when traveling x mile/hour in high
gear, a truck burns fuel at the rate of
(1/500)(2500/x + x) gallons/mile.
If fuel costs 50 cents per gallon, find the speed that will minimize
the cost of fuel for a 1000 mile trip. Check the second-order con-
dition.
(b) When the government imposed this optimal speed in 1974, truck
drivers became so angry that they staged blockades on several free-
ways around the country. To explain the reason for these blockades,
we found that a crucial figure was the hourly wage of the truckers,
estimated at $3.90 per hour at that time. Recompute a speed that
will minimize the total cost of fuel and the driver’s wages for the
same trip. You do not need to check for the second-order condition.
E 8.5 Verify Eq. (8.8) in Example 8.1 by determining h∗(a) and expanding
the function h∗(10 + ε) in a Taylor series around the value 10.
(b) x ≤ 20.
E 8.7 Rework Example 8.4 by replacing (2, 2) with each of the following
points:
for (a) h(x, y) = x + y, (b) h(x, y) = x + 2y, and (c) h(x, y) = x + 3y.
Comment on the solution in each of the cases (a), (b), and (c).
E 8.13 Rewrite the maximum principle (8.53) for the special case of the
linear Mayer form problem obtained when F ≡ 0 and S(xT , T ) = cxT ,
where c is an n-component row vector of constants.
where A is a given matrix. Obtain the expression for the adjoint variable
and the form of the optimal control.
E 8.20 Convert the problem defined by (8.42) and (8.68) to its La-
grange form. Then, obtain the assumptions on the salvage value function
S(xT , T ) so that the results of Sect. 8.3 apply. Under these assumptions,
state the maximum principle for the Bolza form problem defined by
(8.42) and (8.68).
Δx^k = −δx^k + r Σ_{l=0}^{k} f_{kl}(x^l, u^l), x⁰ given,
where, as in Sect. 7.2.1, π denotes per unit sales revenue, ρ denotes the
discount rate, and the inequalities 0 ≤ uk ≤ Qk represent the restric-
tions on the advertising amount uk . For the continuous-time version of
problems with lags, see Hartl and Sethi (1984b).
Chapter 9
Maintenance and
Replacement
The first term represents the present value of one dollar of additional
salvage value at T brought about by one dollar of additional resale value
at the current time t. The second term represents the present value of
incremental production from t to T brought about by the extra produc-
tivity of the machine due to the additional one dollar of resale value at
time t.
Since the Hamiltonian is linear in the control variable u, the optimal
control for a problem with any fixed T is bang-bang as in Model Type
(a) in Table 3.3. Thus,
u∗(t) = bang[0, U; {e^{−ρT} + (π/ρ)(e^{−ρt} − e^{−ρT})}g(t) − e^{−ρt}].    (9.9)
is the present value of the marginal return from increasing the preventive
maintenance by one dollar at time t. The last term e−ρt in the argument
of the bang function is the present value of that one dollar spent for pre-
ventive maintenance at time t. Thus, in words, the optimal policy means
the following: if the marginal return of one dollar of additional preven-
tive maintenance is more than one dollar, then perform the maximum
possible preventive maintenance, otherwise do not perform any at all.
To find how the optimal control switches, we need to examine the
switching function in (9.9). Rewriting it as
e^{−ρt}[πg(t)/ρ − (π/ρ − 1)e^{ρ(t−T)}g(t) − 1],    (9.10)
ρ ρ
Note that all of the above calculations were made on the assumption
that T was fixed, i.e., without imposing condition (9.7). On an optimal
path, this condition, which uses (9.5), (9.7), and (9.8), can be restated
as
−ρe^{−ρT∗} x∗(T∗) = −{πx∗(T∗) − u∗(T∗)} e^{−ρT∗} − e^{−ρT∗} {−d(T∗) + g(T∗)u∗(T∗)}.    (9.12)
In doing so, we have assumed that the solution of (9.16) lies in the open
interval (0, T ). As we will indicate later, special care needs to be exercised
if this is not the case.
Substituting the data in (9.16) we have
0.1 − 0.05e^{−0.05(T−tˢ)} = 0.025(1 + tˢ)^{1/2},
which simplifies to
(1 + tˢ)^{1/2} = 4 − 2e^{−0.05(T−tˢ)}.    (9.17)
Then, integrating (9.15), we find
x(t) = −2t + 4(1 + t)^{1/2} + 96, if t ≤ tˢ,
and hence
x(t) = −2tˢ + 4(1 + tˢ)^{1/2} + 96 − 2(t − tˢ) = 4(1 + tˢ)^{1/2} + 96 − 2t, if t > tˢ.
Since we have assumed 0 < tˢ < T, we substitute x(T) into (9.13), and obtain
4(1 + tˢ)^{1/2} + 96 − 2T = 2/0.05 = 40,
which simplifies to
T = 2(1 + tˢ)^{1/2} + 28.    (9.18)
We must solve (9.17) and (9.18) simultaneously. Substituting (9.18) into
(9.17), we find that tˢ must be a zero of the function
F(tˢ) = (1 + tˢ)^{1/2} − 4 + 2e^{−0.05(2(1+tˢ)^{1/2} + 28 − tˢ)}.
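A root-finding sketch for this zero, using the example's data with (9.18) substituted into (9.17):

```python
import numpy as np
from scipy.optimize import brentq

# Zero of F(ts) = (1+ts)^(1/2) - 4 + 2*exp(-0.05*(T(ts) - ts)),
# where T(ts) = 2*(1+ts)^(1/2) + 28 from (9.18).
def F(ts):
    T = 2.0*np.sqrt(1.0 + ts) + 28.0
    return np.sqrt(1.0 + ts) - 4.0 + 2.0*np.exp(-0.05*(T - ts))

ts = brentq(F, 0.0, 30.0)                 # F changes sign on [0, 30]
T = 2.0*np.sqrt(1.0 + ts) + 28.0
print(ts, T)                              # roughly ts = 10.6, T = 34.8
```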
9.1.4 An Extension
The pure bang-bang result in the model developed above is a result of the
linearity in the problem. The result can be enriched as in Sethi (1973b)
by generalizing the resale value equation (9.3) as follows:
ẋ(t) = −d(t) + g(u(t), t), (9.20)
where g is nondecreasing and concave in u. For this section, we will
assume the sale date T to be fixed for simplicity and g to be strictly
concave in u, i.e., gu ≥ 0 and guu < 0 for all t. Also, gt ≤ 0, gut ≤ 0, and
g(0, t) = 0; see Exercise 9.7 for an example of the function g(u, t).
The standard Hamiltonian is
H = (πx − u)e−ρt + λ[−d + g(u, t)], (9.21)
where λ is given in (9.8). To maximize the Hamiltonian, we differentiate
it with respect to u and equate the result to zero. Thus,
Hu = −e−ρt + λgu = 0. (9.22)
If we let u0 (t) denote the solution of (9.22), then u0 (t) maximizes the
Hamiltonian (9.21) because of the concavity of g in u. Thus, for a fixed
T, the optimal control is
u∗ (t) = sat[0, U ; u0 (t)]. (9.23)
g_u = e^{−ρt}/λ(t) = 1/[π/ρ − (π/ρ − 1)e^{ρ(t−T)}].    (9.24)
Differentiating (9.24) with respect to t yields
g_{ut} + g_{uu}u̇⁰ = ρ²(π − ρ)e^{ρ(t−T)} / [π − (π − ρ)e^{ρ(t−T)}]² > 0.
Since g_{ut} ≤ 0 and g_{uu} < 0, it is therefore obvious that u̇⁰(t) < 0. In order
now to sketch the optimal control u∗ (t) specified in (9.23), let us define
0 ≤ t1 ≤ t2 ≤ T such that u0 (t) ≥ U for t ≤ t1 and u0 (t) ≤ 0 for t ≥ t2 .
Then, we can rewrite the sat function in (9.23) as
u∗(t) = U for t ∈ [0, t₁],
u∗(t) = u⁰(t) for t ∈ (t₁, t₂),    (9.25)
u∗(t) = 0 for t ∈ [t₂, T].
the failure rate of the machine rather than arrest the deterioration in
the resale value as before. Furthermore, their model also allows for sale
of the machine at any time, provided it is still in running condition, and
for its disposal as junk if it breaks down for good. The optimal control
problem is therefore to find an optimal maintenance policy for the period
of ownership and an optimal sale date at which the machine should be
sold, provided that it has not yet failed. Other references to related
models are Alam et al. (1976), Alam and Sarma (1974, 1977), Sarma
and Alam (1975), Gaimon and Thompson (1984a, 1989), Dogramaci and
Fraiman (2004), Dogramaci (2005), Bensoussan and Sethi (2007), and
Bensoussan et al. (2015a).
Thus, the cost of reducing the failure rate increases more than pro-
portionately as the fractional reduction increases. But the cost of a
given fractional reduction increases linearly with the natural failure rate.
Hence, these conditions imply that a given absolute reduction becomes
increasingly more costly as the machine gets older.
To derive the state equation for F (t), we note that Ḟ /(1−F ) denotes
the conditional probability density for the failure of the machine at time
t, given that it has survived to time t. This is assumed to depend on
two things, namely (i) the natural failure rate that governs the machine
in the absence of preventive maintenance, and (ii) the current rate of
preventive maintenance.
Thus,
Ḟ(t)/[1 − F(t)] = h(t)[1 − u(t)],    (9.29)
In Exercise 9.8, you are asked to derive this condition by using (9.31)–
(9.33) in (3.77).
While we know from (3.79) that (9.34) has a standard economic in-
terpretation of having zero marginal profit of changing T ∗ , it is still
illuminating to flesh out a more detailed interpretation of each term in
what looks like a fairly complex expression. A good way to accomplish
that is to total up what we get if we decide to sell the machine at time T∗ + δ
in comparison to selling it at T ∗ . We will do this only for a small δ > 0,
and leave it as Exercise 9.9 for a small δ < 0.
First we note that in solving Exercise 9.8 to obtain (9.34) from
(3.77), a simplification involved canceling the common factor
e^{−ρT∗}(1 − F(T∗)) > 0. Removing e^{−ρT∗} brings the revenue and cost terms from
present-value dollars to dollars at time T∗. The presence of the probabil-
ity term 1 − F (T ∗ ) means that the machine will be replaced at T ∗ if it
has not failed by time T ∗ with that probability. Its removal means that
(9.34) can be interpreted as if we are at T ∗ and we find the machine to
be working, which is tantamount to interpreting (9.34) with F (T ∗ ) = 0.
Now consider keeping the machine to T ∗ + δ. Clearly we lose its
selling price B(T ∗ ) in doing so. But then we gain the following amounts
discounted to time T ∗ :
{R − C(u∗ (T ∗ ))h(T ∗ )}δe−ρδ = {R − C(u∗ (T ∗ ))h(T ∗ )}δ + o(δ), (9.35)
L(1 − u∗ (T ∗ ))h(T ∗ )δe−ρδ = L(1 − u∗ (T ∗ ))h(T ∗ )δ + o(δ), (9.36)
−ρB(T∗)δ + B_T(T∗)δ + o(δ).    (9.37)
In the trivial cases in which the natural failure rate h(t) is zero or when
the machine fails with certainty by time t (i.e., F (t) = 1), then u∗ (t) = 0.
Assume therefore h > 0 and F < 1. Under these conditions, we can infer
from (9.28) and (9.38) that
(i) C_u(0) + L + λe^{ρt} > 0 ⇒ u∗(t) = 0,
(ii) C_u(1) + L + λe^{ρt} < 0 ⇒ u∗(t) = 1,    (9.39)
(iii) otherwise, C_u + L + λe^{ρt} = 0 determines u∗(t).
Using the terminal condition λ(T ) = −e−ρT B(T ) from (9.33), we can
derive u∗ (T ) satisfying (9.39):
(i) C_u(0) > B(T) − L ⇒ u∗(T) = 0,
(ii) C_u(1) < B(T) − L ⇒ u∗(T) = 1,    (9.40)
(iii) otherwise, C_u = B(T) − L determines u∗(T).
for all j ≥ 1; see Sethi (1973b) as well as Exercise 9.16. In this case the
number of machines in the chain is infinite as well. The second relaxes
the assumption (9.42) of identical machine lives, but then, it can only
solve a finite horizon problem involving a finite chain of machines; see Sethi and Morton (1972)
and Tapiero (1973). For a decision horizon formulation of this problem,
see Sethi and Chand (1979), Chand and Sethi (1982), and Bylka et al.
(1992).
In this section, we will deal with the latter problem as analyzed by
Sethi and Morton (1972). The problem is solved by a mixed optimization
technique. The subproblems dealing with the maintenance policy are
solved by appealing to the discrete maximum principle. These subprob-
lem solutions are then incorporated into a Wagner and Whitin (1958)
model formulation for solution of the full problem. The procedure is
illustrated by a numerical example.
net earnings associated with the machine. To calculate Jst we need the
following notation:
It is required that
We must also have functions that will provide us with the ways in
which states change due to the age of the machine and the amount
of preventive maintenance. Also, assuming that at time s, the only
machines available are those that are up-to-date with respect to the
technology prevailing at s, we can subscript these functions by s to reflect
the effect of the machine’s technology on its state at a later time k. Let
Ψs (uk , k) and Φs (uk , k) be such concave functions so that we can write
the following state equations:
ΔR_s^k = Ψ_s(u^k, k), R_s^s given,    (9.45)
Δx_s^k = Φ_s(u^k, k), x_s^s = (1 − δ)C_s,    (9.46)
where δ is the fractional depreciation immediately after the purchase of
the machine at time s.
A_s^k = Σ_{i=s}^{k−1} R_s^i (1 + ρ)^{−i},    (9.47)
B_s^k = Σ_{i=s}^{k−1} u^i (1 + ρ)^{−i}.    (9.48)
Using Eqs. (9.47) and (9.48), we can write the optimal control prob-
lem as follows:
subject to
λk1 = 1, (9.57)
Note that λk1 , λk2 , and λk4 are constants for a fixed machine salvage time
t. To apply the maximum principle, we substitute (9.57)–(9.60) into the
Hamiltonian (9.52), collect terms containing the control variable uk , and
rearrange and decompose H as
H = H1 + H2 (uk ), (9.61)
H₂(u^k) = −u^k(1 + ρ)^{−k} + Σ_{i=k+1}^{t−1} (1 + ρ)^{−i} Ψ_s + (1 + ρ)^{−t} Φ_s.    (9.62)
H_{u^k} = [H₂]_{u^k} = −(1 + ρ)^{−k} + (Ψ_s)_{u^k} Σ_{i=k+1}^{t−1} (1 + ρ)^{−i} + (Φ_s)_{u^k} (1 + ρ)^{−t} = 0.    (9.63)
Equation (9.63) is an equation in uk
with the exception of the particular
case when Ψs and Φs are linear in uk (which will be treated later in this
section). In general, (9.63) may or may not have a unique solution. For
our case we will assume Ψs and Φs to be of the form such that they
give a unique solution for uk . One such case occurs when Ψs and Φs are
quadratic in uk . In this case, (9.63) is linear in uk and can be solved
explicitly for a unique solution for uk . Whenever a unique solution does
exist, let this be
u^k = U_{st}^k.    (9.64)
W_s(k, t) = −(1 + ρ)^{−k} + ψ_s^k Σ_{i=k+1}^{t−1} (1 + ρ)^{−i} + φ_s^t (1 + ρ)^{−t},    (9.68)
respectively.
Let Rss be the net return (net of necessary maintenance) of a machine
purchased at the beginning of period s and operated during period s. We
assume
R_0^0 = $600, R_1^1 = $1,000, and R_2^2 = $1,100.
In a period k subsequent to the period s of machine purchase, the
returns Rsk , k > s, depend on the preventive maintenance performed on
the machine in the periods prior to period k. The incremental return
function is given by Ψ_s(u, k), which we assume to be linear in u, say
ΔR_s^k = a_s u^k. The salvage value changes according to
Δx_s^k = −ε_s C_s + b_s u^k,
where
ε_s = 0.1 when s = 0, 1, and ε_s = 0.2 when s = 2,
and
b_s = 0.5 − 0.05s.
That is, the decrease in salvage value is a constant percentage of the pur-
chase price if there is no preventive maintenance. With preventive main-
tenance, the salvage value can be enhanced by a proportional amount.
Let J_{st}∗ be the optimal value of the objective function associated with
a machine purchased at s and sold at t ≥ s + 1. We will now solve for
J_{st}∗, s = 0, 1, 2, and s < t ≤ 3, where t is an integer.
Before we proceed, we will as in (9.68) denote by Ws (k, t), the coef-
ficient of uk in the Hamiltonian H, i.e.,
W_s(k, t) = −(1 + ρ)^{−k} + a_s Σ_{i=k+1}^{t−1} (1 + ρ)^{−i} + b_s (1 + ρ)^{−t}.    (9.72)
Ws (k + 1, t) − Ws (k, t) = (1 + ρ)−(k+1) (ρ − as ),
so that
sgn[Ws (k + 1, t) − Ws (k, t)] = sgn[ρ − as ]. (9.73)
This implies that
u^{(k+1)∗} − u^{k∗} ≥ 0 if ρ − a_s > 0,
u^{(k+1)∗} − u^{k∗} = 0 if ρ − a_s = 0,    (9.74)
u^{(k+1)∗} − u^{k∗} ≤ 0 if ρ − a_s < 0.
Subproblem: s = 0, t = 1.
Now,
R_0^0 = 600,
R_0^1 = 600 − 200 = 400,
x_0^0 = 0.75 × 1,000 = 750,
x_0^1 = 750 − 0.1 × 1,000 = 650,
J_{01}∗ = 600 − 1,000 + 650 × (1.06)^{−1} = $213.2.
Subproblem: s = 0, t = 2.
u^{0∗} = 0, u^{1∗} = 0,
J_{02}∗ = $466.9.
Subproblem: s = 0, t = 3.
Subproblem: s = 1, t = 2.
W₁(1, 2) < 0,
u^{1∗} = 0,
J_{12}∗ = $559.9.
Subproblem: s = 1, t = 3.
Subproblem: s = 2, t = 3.
W₂(2, 3) < 0,
u^{2∗} = 0,
J_{23}∗ = $80.
g₃ = 0,
g₂ = J_{23}∗ = $80,
g₁ = max[J_{13}∗, J_{12}∗ + g₂] = max[1024.2, 559.9 + 80] = $1024.2,
g₀ = max[J_{03}∗, J_{01}∗ + g₁, J_{02}∗ + g₂] = max[639.0, 213.2 + 1024.2, 466.9 + 80] = $1237.4.
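The backward recursion gₛ = maxₜ [Jₛₜ∗ + gₜ] above is easy to mechanize; the sketch below reproduces these values from the subproblem results already computed.

```python
# Wagner-Whitin-style backward recursion over purchase/sale times,
# using the subproblem values J*_{st} obtained above.
J = {(0, 1): 213.2, (0, 2): 466.9, (0, 3): 639.0,
     (1, 2): 559.9, (1, 3): 1024.2, (2, 3): 80.0}
N = 3
g, choice = {N: 0.0}, {}
for s in range(N - 1, -1, -1):        # g_s = max over t of J*_{st} + g_t
    best_t = max(range(s + 1, N + 1), key=lambda t: J[(s, t)] + g[t])
    g[s] = J[(s, best_t)] + g[best_t]
    choice[s] = best_t
print(g[0])       # 1237.4: buy at 0, replace at 1, keep until 3
print(choice)     # {2: 3, 1: 3, 0: 1}
```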
E 9.2 Change the values of U and d(t) in Sect. 9.1.3 to the new values
U = 1/2 and d(t) = 3 and re-solve the problem.
E 9.3 Show for the model in Sect. 9.1.1 that if it is optimal to have
the maximum maintenance throughout the life of the machine, then its
optimal life T must satisfy g(T ) − 1 ≥ 0. In particular, for the example
in Sect. 9.1.3, show T ≤ 3.
Hint: The salvage value function required in (3.77) for the problem here
is S(F(T), T) = e^{−ρT}B(T)(1 − F(T)) as given in (9.31). Its partial
derivative with respect to T is [−ρe^{−ρT}B(T) + e^{−ρT}B_T(T)](1 − F(T)).
Show that u̇0 (t) ≤ 0. Furthermore, show that u∗ (t) is nonincreasing over
time.
E 9.11 For the model of Sect. 9.2, prove that the derived Hamiltonian H
is concave in F for each given λ and t, so that the Sufficiency Theorem 2.1
holds.
where
h(0) = 0, h′(p) > 0, h″(p) ≥ 0.
Discounting future profits at rate ρ, the firm seeks a price policy p(t) to
max ∫_0^∞ e^{−ρt} {R₁(p(t))[1 − F(t)] + R₂F(t)} dt
subject to
Ḟ(t) = h(p(t))[1 − F(t)], F(0) = 0.
The integrand represents the expected profits at t, composed of R1 if no
rival has entered by t, and otherwise R2 .
(a) Show that the maximum principle necessary conditions are satisfied
by p(t) = p∗ , where p∗ is a constant. Obtain the equation satisfied
by p∗ and show that it has a unique solution.
(b) Let pm denote the monopoly price (in the absence of any rival), i.e.,
R1 (pm ) = maxp R1 (p). Show that p∗ < pm and R1 (pm ) > R1 (p∗ ) >
R2 . Provide an intuitive explanation of the result.
(c) Verify the sufficiency condition for optimality by showing that the
maximized Hamiltonian is concave.
where P0 (t) is the probability that the machine is in the state 0 at time t.
Let P1 (t) = 1−P0 (t), which is the probability that the machine is in state
1 at time t. This equation along with (9.3) gives us two state equations.
In view of the equation for Ṗ0 , we modify the objective function (9.2) to
J = ∫_0^T [πx(t)P₀(t) − u(t) − kP₁(t)] e^{−ρt} dt + x(T)e^{−ρT},
E 9.15 Extend the Thompson model in Sect. 9.1 to allow for process
discontinuities. An example of this type of machine is an airplane as-
signed to passenger transportation which may, after some deterioration
or obsolescence, be assigned to freight transportation before its eventual
retirement. Formulate and analyze the problem. See Tapiero (1971).
E 9.16 Extend the Thompson model in Sect. 9.1 to allow for a chain
of machines with identical lives. See Sethi (1973b) for an analysis of a
similar model.
Chapter 10

Applications to Natural Resources
given horizon under the assumption that when its price reaches a given
high threshold, a substitute will be used instead. Therefore, the analysis
of this section can also be viewed as a problem of optimally phasing in
an expensive substitute.
From (10.1) and (10.2), it follows that x will stay in the closed interval
0 ≤ x ≤ X provided x0 is in the same interval.
An open access fishery is one in which exploitation is completely
uncontrolled. Gordon (1954) analyzed this model, also known as the
Gordon-Schaefer model, and showed that the fishing effort tends to reach
an equilibrium, called a bionomic equilibrium, at the level where total
revenue equals total cost. In other words, the so-called economic rent is
completely dissipated. From (10.3) and (10.2), this level is simply
x_b = c/(pq) and u_b = g(x_b)p/c.    (10.4)
u = [g(x) − ẋ]/(qx),    (10.6)
Rewriting, we have
J = ∫_0^∞ e^{−ρt} [M(x) + N(x)ẋ] dt,    (10.8)
where
N(x) = −p + c/(qx) and M(x) = [p − c/(qx)]g(x).    (10.9)
We note that we can write ẋdt = dx so that (10.8) becomes the following
line integral
J_B = ∫_B [e^{−ρt}M(x) dt + e^{−ρt}N(x) dx],    (10.10)
where B is a state trajectory in (x, t) space, t ∈ [0, ∞).
In this section we are only interested in the infinite horizon solution.
The Green’s theorem method achieves such a solution by first solving a
finite horizon problem as in Sect. 7.2.2, and then determining the infinite
horizon solution for which you are asked to verify that the maximum
principle holds in Exercise 10.1. See also Sethi (1977b).
In order to apply Green’s Theorem to (10.10), let Γ denote a simple
closed curve in the (x, t) space surrounding a region R in the space.
Then,
J_Γ = ∮_Γ [e^{−ρt}M(x) dt + e^{−ρt}N(x) dx]
    = ∬_R {∂[e^{−ρt}N(x)]/∂t − ∂[e^{−ρt}M(x)]/∂x} dt dx
    = ∬_R −e^{−ρt}[ρN(x) + M′(x)] dt dx.    (10.11)
If we let I(x) = −[ρN(x) + M′(x)], we can now conclude, as we did in
Sects. 7.2.2 and 7.2.4, that the turnpike level x̄ is given by setting the
integrand of (10.11) to zero. That is,
−I(x) = [g′(x) − ρ][p − c/(qx)] + cg(x)/(qx²) = 0.    (10.12)
In addition, a second-order condition must be satisfied for the solution x̄
of (10.12) to be a turnpike solution; see Lemma 7.1 and the subsequent
discussion there. The required second-order condition can be stated as
I(x) < 0 for x < x̄ and I(x) > 0 for x > x̄.
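A numerical sketch solving (10.12) for x̄, assuming the logistic growth function g(x) = rx(1 − x/X); both this choice of g and the parameter values below are illustrative, not from the text.

```python
from scipy.optimize import brentq

# Solve [g'(x) - rho]*(p - c/(q x)) + c*g(x)/(q x^2) = 0 from (10.12)
# with logistic growth g(x) = r*x*(1 - x/X); parameters illustrative.
p, q, c, rho = 2.0, 1.0, 1.0, 0.1
r, X = 0.5, 10.0
g  = lambda x: r*x*(1.0 - x/X)
gp = lambda x: r*(1.0 - 2.0*x/X)
eq = lambda x: (gp(x) - rho)*(p - c/(q*x)) + c*g(x)/(q*x**2)
x_b = c/(p*q)                       # bionomic equilibrium (10.4)
x_bar = brentq(eq, x_b + 1e-9, X)   # x-bar lies between x_b and X
print(x_b, x_bar)
```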
Figure 10.1: Optimal policy for the sole owner fishery model
Suppose the fishery is at the equilibrium level x̄ given by (10.12), and suppose we reduce this level to x̄ − ε by removing ε amount of fish instantaneously from the fishery, which can be accomplished by an impulse fishing effort of ε/(qx̄). The immediate marginal revenue MR from this action is
MR = (pq\bar{x} - c)\frac{\varepsilon}{q\bar{x}}.
However, this causes a decrease in the sustainable economic rent, which equals π'(x̄)ε. Over the infinite future, the present value of this stream is
\int_0^\infty e^{-\rho t}\pi'(\bar{x})\varepsilon\,dt = \frac{\pi'(\bar{x})\varepsilon}{\rho}.
Adding to this the cost cε/(qx̄) of the additional fishing effort ε/(qx̄), we get the marginal cost
MC = \frac{\pi'(\bar{x})\varepsilon}{\rho} + \frac{c\varepsilon}{q\bar{x}}.
Equating MR and MC, we obtain (10.14), which is also (10.12).
When the discount rate ρ = 0, Eq. (10.14) reduces to
\pi'(x) = 0,
whereas in the limit as ρ → ∞, it yields
\bar{x}\,|_{\rho=\infty} = x_b = c/(pq).
The latter is the bionomic equilibrium attained in the open access fishery solution; see (10.4). Finally, by denoting the x̄ obtained from (10.12) for any given ρ > 0 as x̄|ρ, you are asked in Exercise 10.3 to show that x̄|ρ decreases monotonically from x̄|ρ=0 to x_b as ρ increases from 0 to ∞.
\bar{\lambda} = p - c. (10.22)
Figure 10.3: Optimal thinning u∗ (t) and timber volume x∗ (t) for the
forest thinning model when x0 < x̄(t0 )
For x0 > x̄(t0 ), the optimal control at t0 will be the impulse cutting
to bring the level from x0 to x̄(t0 ) instantaneously. To complete the
infinite horizon solution, set u∗ (t) = 0 for t ≥ T̂ . In Exercise 10.12 you
are asked to obtain λ(t) for t ∈ [0, ∞).
J(T) = \sum_{k=1}^{\infty} e^{-(k-1)\rho T}\int_0^T e^{-\rho t}(p - c)u\,dt = \frac{1}{1 - e^{-\rho T}}\int_0^T e^{-\rho t}(p - c)u\,dt. (10.26)
From the solution of the model in the previous section, and the
assumption that the forest is profitable, it is obvious that 0 ≤ T ≤ T̂
as shown in Fig. 10.4. We have two cases to consider, depending on
whether T > t̂ or T ≤ t̂.
Case 1: T > t̂. From the preceding section it is easy to conclude that
the optimal trajectory is as shown in Fig. 10.4. Using the turnpike ter-
minology of Chap. 7, the trajectory from 0 to A is the entry ramp to
the turnpike, the trajectory from A to B is on the turnpike, and the
trajectory from B to T is the exit ramp. Since u∗ (t) = 0 on the entry
ramp, no timber is collected from time 0 to time t̂. Timber is, however,
collected by thinning from time t̂ to T⁻, and by clearcutting at time T. Note from Fig. 10.4 that x̄(T) is the amount of timber collected from the impulse clearcutting u*(T) = imp[x̄(T), 0; T] at time T. Thus, we can write the objective function as
Figure 10.4: Optimal thinning u∗ (t) and timber volume x∗ (t) for the
chain of forests model when T > t̂
J(T) = \frac{1}{1 - e^{-\rho T}}\left[\int_{\hat{t}}^{T} e^{-\rho t}(p - c)u^*\,dt + e^{-\rho T}(p - c)\bar{x}(T)\right]. (10.27)
If the solution T lies in (t̂, T̂ ], keep it; otherwise set T = T̂ . Note that
(10.29) can also be derived by using the transversality condition (3.15);
see Exercise 3.6.
Figure 10.5: Optimal thinning and timber volume x∗ (t) for the chain of
forests model when T ≤ t̂
In the present context, the resource under consideration could be crude oil, and its expensive substitute could be coal and/or tar sands; see, e.g., Fuller and Vickson (1987).
We introduce the demand function f(p), with f(p) > 0 for p < p̄ and f(p̄) = 0, and the total production cost g(p) = G(f(p)), for which it is obvious that g(p) > 0 for p < p̄ and g(p) = 0 for p ≥ p̄.
Let
\pi(p) = pf(p) - g(p)
denote the profit function of the producers, i.e., the producers' surplus. Let p be the smallest price at which π(p) is nonnegative. Assume further that π(p) is a concave function in the range [p, p̄], as shown in Fig. 10.7. In the figure, the point p_m indicates the price which maximizes π(p).
We also define
\psi(p) = \int_p^{\bar{p}} f(y)\,dy (10.34)
as the consumers’ surplus, i.e., the area shown shaded in Fig. 10.6. This
quantity represents the total excess amount that consumers would be
willing to pay. In other words, consumers actually pay pf (p), while they
would be willing to pay
\int_{\bar{p}}^{p} y f'(y)\,dy = pf(p) + \psi(p).
The optimal control problem is to choose the price path p(t) to maximize
J = \int_0^T e^{-\rho t}[\psi(p) + \pi(p)]\,dt
subject to
Q̇ = −f (p), Q(0) = Q0 , (10.37)
Q(T ) ≥ 0, (10.38)
and p ∈ Ω = [p, p̄]. Recall that the sum ψ(p) + π(p) is concave in p.
which implies
\lambda(t) = \begin{cases} 0 & \text{if } Q(T) \ge 0 \text{ is not binding}, \\ \lambda(T)e^{\rho(t-T)} & \text{if } Q(T) \ge 0 \text{ is binding}. \end{cases} (10.41)
\frac{\partial H}{\partial p} = \psi' + \pi' - \lambda f' = (p - \lambda)f' - g' = 0. (10.42)
To show that the solution s(λ) for p of (10.42) actually maximizes the
Hamiltonian, it is enough to show that the second derivative of the
Hamiltonian is negative at s(λ). Differentiating (10.42) gives
\frac{\partial^2 H}{\partial p^2} = f' - g'' + (p - \lambda)f''.
Substituting p - \lambda = g'/f' from (10.42), we obtain
\frac{\partial^2 H}{\partial p^2} = f' - g'' + \frac{g'}{f'}f''. (10.43)
Writing the production cost in terms of the quantity q = f(p) as g(p) = G(f(p)), so that
G'' = \frac{f'g'' - g'f''}{f'^3},
we can express (10.43) as
\frac{\partial^2 H}{\partial p^2} = f' - G''f'^2. (10.44)
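As a concrete check of (10.42)–(10.44), the following sketch assumes an illustrative linear demand f(p) = p̄ − p and the quadratic cost G(q) = q², and confirms symbolically that the second derivative of the Hamiltonian at the solution equals f' − G''f'²:

```python
import sympy as sp

p, lam, pbar = sp.symbols('p lambda pbar', positive=True)
f = pbar - p                     # illustrative linear demand
G = lambda qty: qty**2           # quadratic production cost
g = G(f)                         # g(p) = G(f(p))

Hp  = (p - lam)*sp.diff(f, p) - sp.diff(g, p)   # (10.42)
Hpp = sp.diff(Hp, p)

qty = sp.symbols('q')
Gpp = sp.diff(G(qty), qty, 2)    # G'' = 2 here
print(sp.simplify(Hpp - (sp.diff(f, p) - Gpp*sp.diff(f, p)**2)))  # 0
```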
p∗ = p̂. (10.45)
With this value, the total consumption of the resource is T f (p̂), which
must be ≤ Q0 so that the constraint Q(T ) ≥ 0 is not binding. Hence,
T f (p̂) ≤ Q0 (10.46)
where
t^* = \min\left\{T,\; T + \frac{1}{\rho}\ln\frac{\bar{p} - G'(0)}{\lambda(T)}\right\}. (10.48)
The time t^*, if it is less than T, is the time at which s[\lambda(T)e^{\rho(t^*-T)}] = \bar{p}. From Exercise 10.17,
\lambda(T)e^{\rho(t^*-T)} = \bar{p} - G'(0), (10.49)
E 10.4 Obtain the turnpike level x̄ of (10.12) for the special case g(x) =
x(1 − x), p = 2, c = q = 1, and ρ = 0.1.
(b) Allen (1973) and Clark (1976) estimated the parameters of the
Schaefer model for the Antarctic fin-whale population as follows:
r = 0.08, X = 400, 000 whales, and xb = 40, 000. Solve for x̄ for
ρ = 0, 0.10, and ∞.
E 10.6 Obtain \pi'(x) from (10.13) and use it in (10.12) to derive (10.14).
E 10.8 Show that extinction is optimal if \infty > p \ge c(0) and \rho > 2g'(0) in Exercise 10.7.
where V(x) with V'(x) > 0 is the conservation value function, which
measures the value to society of having a large fish stock. By deriving
the analogue to (10.12), show that the new x̄ is larger than the x̄ in
Exercise 10.7.
E 10.12 Find λ(t), t ∈ [0, ∞), for the infinite horizon model of
Sect. 10.2.2.
E 10.13 Derive the second term inside the brackets of (10.27) by com-
puting e−ρT (p − c) imp[x̄(T ), 0; T ].
where p is the price of a unit of timber and c is the unit cost of fertiliza-
tion.
(a) Show that the optimal control v ∗ (t) is given by solving the
equation
\frac{\partial f}{\partial v} = \frac{c}{p}\,e^{-(\rho+r)(t-T)}.
Check that the second-order condition for a maximum holds for this v^*(t).
(b) If f (v) = (1 + t) ln(1 + v), then find explicitly the optimal control
v ∗ (t) under the assumption that p/c > e(ρ+r)T . Show further that
v ∗ (t) is increasing and convex in t ∈ [0, T ].
G(q) = q^2.
for T ≥ T̄ , and
for T > T̄ .
Chapter 11
Applications to Economics
K(T ) = KT , (11.3)
\dot{\lambda} = \rho\lambda - \frac{\partial H}{\partial K} = (\rho + \delta)\lambda - \lambda\frac{\partial F}{\partial K}, \quad \lambda(T) = \alpha, (11.5)
where α is a constant to be determined.
The optimal control is given by
\frac{\partial H}{\partial C} = U'(C) - \lambda = 0. (11.6)
Since U'(0) = \infty, the solution of this condition always gives C(t) > 0.
An intuitive argument for this result is that a slight increase from a zero consumption rate brings an infinitely large marginal utility, and therefore optimal consumption will remain strictly positive. Moreover, the capital stock will not be allowed to fall to zero along an optimal path, since that would force the consumption rate to zero. See
Karatzas et al. (1986) for a rigorous demonstration of this result in a
related context.
Note that the sufficiency of optimality is easily established here by
obtaining the derived Hamiltonian H 0 (K, λ) by substituting for C from
(11.6) in (11.4), and showing that H 0 (K, λ) is concave in K. This follows
easily from the facts that F(K) is concave and λ > 0 from (11.6) on account of the assumption that U'(C) > 0.
The economic interpretation of the Hamiltonian is straightforward.
It consists of two terms: the first one gives the utility of current con-
sumption and the second one gives the net investment evaluated by price
λ, which, from (11.6), reflects the marginal utility of consumption.
For the economic system to be run optimally, the solution must sat-
isfy the following three conditions:
(a) The static efficiency condition (11.6) which maximizes the value of
the Hamiltonian at each instant of time myopically, provided that
λ(t) is known.
(b) The dynamic efficiency condition (11.5) which forces the price λ of
capital to change over time in such a way that the capital stock
always yields a net rate of return, which is equal to the social
discount rate ρ. That is,
d\lambda + \frac{\partial H}{\partial K}\,dt = \rho\lambda\,dt.
f(k) = \frac{F(K, L)}{L} = F\left(\frac{K}{L}, 1\right) = F(k, 1). (11.8)
It is clear from the assumptions on F that f'(k) > 0 and f''(k) < 0 for k ≥ 0.
To derive the state equation for k, we note that
where γ = g + δ.
Let u(c) be the utility of per capita consumption c, where u is as-
sumed to satisfy
u'(c) > 0 \text{ and } u''(c) < 0 \text{ for } c \ge 0, \text{ and } u'(0) = \infty. (11.10)
As in Sect. 11.1.2, the last condition in (11.10) rules out zero consump-
tion.
According to the position known as total utilitarianism, the society's discounted total utility is \int_0^\infty e^{-\rho t}L(t)u(c(t))\,dt, which we aim to maximize. In view of (11.7), this is equivalent to maximizing
J = \int_0^\infty e^{-rt}u(c)\,dt, (11.11)
\dot{\lambda} = r\lambda - \frac{\partial H}{\partial k} = (r + \gamma)\lambda - f'(k)\lambda = (\rho + \delta)\lambda - f'(k)\lambda. (11.13)
To obtain the optimal control, we differentiate (11.12) with respect to c, set it to zero, and solve
To obtain the optimal control, we differentiate (11.12) with respect to c,
set it to zero, and solve
u'(c) = \lambda. (11.14)
Let c = h(λ) = (u')^{-1}(λ) denote the solution of (11.14). In Exercise 11.3, you are asked to show that h'(λ) < 0. This can be easily shown by
inverting the graph of u'(c) vs. c. Alternatively, you can rewrite (11.14) as u'(h(λ)) = λ and then take its derivative with respect to λ.
To show that the maximum principle is sufficient for optimality, it is enough to show that the derived Hamiltonian is concave in k. The phase diagram in Fig. 11.1 is based on the system
\dot{k} = f(k) - \gamma k - h(\lambda), (11.16)
\dot{\lambda} = (r + \gamma)\lambda - f'(k)\lambda, (11.17)
obtained from (11.9), (11.13), and (11.14). In Exercise 11.2 you are asked to show that the graphs of k̇ = 0 and λ̇ = 0 are like the dotted curves in Fig. 11.1. Given the nature of these graphs, known as isoclines, it is clear that they have a unique point of intersection, denoted (k̄, λ̄).
The two isoclines divide the plane into four regions, I, II, III, and IV,
as marked in Fig. 11.1. To the left of the vertical line λ̇ = 0, we have k < k̄ and therefore r + γ < f'(k) in view of f''(k) < 0. Thus, λ̇ < 0 from
(11.13). Therefore, λ is decreasing, which is indicated by the downward
pointing arrows in Regions I and IV. On the other hand, to the right of
the vertical line, in Regions II and III, the arrows are pointed upward
because λ is increasing. In Exercise 11.3, you are asked to show that
the horizontal arrows, which indicate the direction of change in k, point
to the right above the k̇ = 0 isocline, i.e., in Regions I and II, and they
point to the left in Regions III and IV which are below the k̇ = 0 isocline.
The point (k̄, λ̄) represents the optimal long-run stationary equilib-
rium. The values of k̄ and λ̄ are obtained in Exercise 11.2. The next
important thing is to show that there is a unique path starting from
any initial capital stock k0 , which satisfies the maximum principle and
converges to the steady state (k̄, λ̄). Clearly such a path cannot start in
Regions II and IV, because the directions of the arrows in these areas
point away from (k̄, λ̄). For k0 < k̄, the value of λ0 (if any) must be
selected so that (k0 , λ0 ) is in Region I. For k0 > k̄, on the other hand,
the point (k0 , λ0 ) must be chosen to be in Region III. We analyze the
case k0 < k̄ only, and show that there exists a unique λ0 associated with
the given k0 , and that the optimal path, shown as the solid curve in Re-
gion I of Fig. 11.1, starts from (k0 , λ0 ) and converges to (k̄, λ̄). It should
be obvious that this path also represents the locus of such (k0 , λ0 ) for
k0 ∈ [0, k̄]. The analysis of the case k0 > k̄ is left as Exercise 11.4.
In Region I, k̇(t) > 0 and k(t) is an increasing function of t as indi-
cated by the horizontal right-directed arrow in Fig. 11.1. Therefore, we
can replace the independent variable t by k, and then use (11.16) and
(11.17) to obtain
\lambda'(k) = \frac{d\lambda}{dk} = \frac{d\lambda/dt}{dk/dt} = \frac{[f'(k) - (r + \gamma)]\lambda}{h(\lambda) + \gamma k - f(k)}. (11.19)
Thus, our task of showing that there exists an optimal path starting from
any initial k0 < k̄ is equivalent to showing that there exists a solution
of the differential equation (11.19) on the interval [0, k̄], beginning with
the boundary condition λ(k̄) = λ̄. For this, we must obtain the value
λ (k̄). Since both the numerator and the denominator in (11.19) vanish
at k = k̄, an application of L'Hôpital's rule gives
\lambda'(\bar{k}) = \frac{-f''(\bar{k})\bar{\lambda}}{f'(\bar{k}) - \gamma - h'(\bar{\lambda})\lambda'(\bar{k})} = \frac{-f''(\bar{k})\bar{\lambda}}{f'(\bar{k}) - \gamma - \lambda'(\bar{k})/u''(h(\bar{\lambda}))}, (11.20)
or
-\frac{(\lambda'(\bar{k}))^2}{u''(h(\bar{\lambda}))} + \lambda'(\bar{k})[f'(\bar{k}) - \gamma] + \bar{\lambda}f''(\bar{k}) = 0. (11.21)
Note that the second equality in (11.20) uses the relation h'(\bar{\lambda}) = 1/u''(h(\bar{\lambda})), obtained by differentiating u'(c) = u'(h(λ)) = λ of (11.14) with respect to λ at λ = λ̄.
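The construction just described is easy to carry out numerically: take the negative root of (11.21) as the slope at k̄ and integrate (11.19) in the direction of decreasing k. The sketch below assumes the illustrative primitives f(k) = k^0.3 and u(c) = ln c, so that h(λ) = 1/λ and u''(c) = −1/c²; all numbers are demonstration choices:

```python
import numpy as np
from scipy.integrate import solve_ivp

A, a = 1.0, 0.3            # f(k) = A k^a (illustrative)
r, gamma = 0.05, 0.10      # discount rate and gamma = g + delta

f   = lambda k: A*k**a
fp  = lambda k: a*A*k**(a - 1)
fpp = lambda k: a*(a - 1)*A*k**(a - 2)
h   = lambda lam: 1.0/lam  # c = h(lambda) for u(c) = ln c

kbar = (a*A/(r + gamma))**(1/(1 - a))    # f'(kbar) = r + gamma
cbar = f(kbar) - gamma*kbar
lbar = 1.0/cbar                           # lambda_bar = u'(cbar)

# (11.21) as a quadratic in lambda'(kbar); u''(h(lbar)) = -cbar**(-2)
upp = -cbar**(-2)
lp_neg = min(np.roots([-1.0/upp, fp(kbar) - gamma, lbar*fpp(kbar)]))

def rhs(k, lam):                          # (11.19)
    return (fp(k) - (r + gamma))*lam[0]/(h(lam[0]) + gamma*k - f(k))

eps = 1e-6                                # step off the steady state
sol = solve_ivp(rhs, [kbar - eps, 0.2*kbar], [lbar - lp_neg*eps],
                dense_output=True, rtol=1e-8)
print(f"kbar = {kbar:.4f}, lbar = {lbar:.4f}, slope = {lp_neg:.4f}")
print(f"lambda(0.5*kbar) = {sol.sol(0.5*kbar)[0]:.4f}")  # point on the path
```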
It is easy to see that (11.21) has one positive solution and one negative solution. We take the negative solution for λ'(k̄) because of the following consideration. With the negative solution, we can prove that the differential equation (11.19) has a smooth solution such that λ'(k) < 0. For this, let
\pi(k) = f(k) - k\gamma - h(\lambda(k)).
Since k < k̄, we have r + γ − f'(k) < 0. Then from (11.19), since λ'(k̄) < 0, we have λ(k̄ − ε) > λ(k̄). Also, since λ̄ > 0 and f''(k̄) < 0, Eq. (11.20) with λ'(k̄) < 0 implies
\pi'(\bar{k}) = f'(\bar{k}) - \gamma - \frac{\lambda'(\bar{k})}{u''(h(\bar{\lambda}))} < 0,
and thus,
\pi'(k) = f'(k) - \gamma - \frac{\lambda'(k)}{u''(h(\lambda(k)))} < 0. (11.22)
This implies that f(k) − kγ − h(λ) > 0, and also, since r + γ − f'(k) remains negative for k < k̄, we have λ'(k) < 0.
Suppose now that there is a point k̃ < k̄ with π(k̃) = 0. Then, since π(k̃ + ε) > 0, we have π'(k̃) ≥ 0. But at k̃, π(k̃) = 0 in (11.19) implies λ'(k̃) = −∞, and then from (11.22), we have π'(k̃) = −∞, which contradicts π'(k̃) ≥ 0. Thus, we can proceed on the whole interval
[0, k̄]. This indicates that the path λ(k) (shown as the solid line in Region
I of Fig. 11.1) remains above the curve
k̇ = f (k) − kγ − h(λ) = 0,
shown as the dotted line in Fig. 11.1 when k < k̄. Thus, we can set
λ0 = λ(k0 ) for 0 ≤ k0 ≤ k̄ and have the optimal path starting from
(k0 , λ0 ) and converging to (k̄, λ̄).
Similar arguments hold when the initial capital stock k0 > k̄, in order
to show that the optimal path (shown as the solid line in Region III of
Fig. 11.1) exists in this case. You have already been asked to carry out
this analysis in Exercise 11.4.
We should mention that the conclusions derived in this subsection
could have been reached by invoking the Global Saddle Point Theorem
stated in Appendix D.7, but we have chosen instead to carry out a de-
tailed analysis for illustrating the use of the phase diagram method. The
next time we use the phase diagram method will be in Sect. 11.3.3, and
there we shall rely on the Global Saddle Point Theorem.
When infected people are cured, they become susceptible again. The
state equation governing the dynamics of the epidemic spread in the
population is
ẋ = βx(N − x) − vx, x(0) = x0 , (11.23)
where β is a positive constant termed infectivity of the disease, and v
is a control variable reflecting the level of medical program effort. Note
that x(t) is in [0, N ] for all t > 0 if x0 is in that interval.
The objective of the control problem is to minimize the present value
of the cost stream up to a horizon time T, which marks the end of the
season for that disease. Let h denote the unit social cost per infective,
let m denote the cost of control per unit level of program effort, and let
Q denote the capability of the health care delivery system providing an
upper bound on v. The optimal control problem is:
\max\left\{J = \int_0^T -(hx + mv)e^{-\rho t}\,dt\right\} (11.24)
subject to (11.23) and
x(T) = x_T, (11.25)
0 \le v \le Q.
The optimal control shown in Figs. 11.2 and 11.3 assumes 0 < xs <
N. It also assumes that T is large so that the trajectory will spend some
time on the turnpike and Q is large so that xs ≥ N − Q/β. The graphs
are drawn for x0 > xs and xs < N/2; for all other cases see Sethi (1974c).
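The turnpike structure is easy to see in a quick deterministic sketch of (11.23): apply full effort v = Q until x reaches the turnpike level x_s, and then hold x at x_s with the singular effort v = β(N − x_s). All numbers below are illustrative assumptions; in the model, x_s comes out of the analysis of this section.

```python
import numpy as np

beta, N, Q = 0.002, 1000.0, 3.0     # infectivity, population, effort bound
xs, x0, T = 300.0, 800.0, 20.0      # turnpike level and initial infectives
n = 20000
dt = T/n

x = x0
for _ in range(n):
    v = Q if x > xs else beta*(N - xs)      # singular effort holds x at xs
    x += (beta*x*(N - x) - v*x)*dt          # state equation (11.23)
print(f"x(T) = {x:.1f}  (turnpike level xs = {xs})")
```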
subject to
\dot{P} = a(v) - \delta P, \quad P(0) = P_0, (11.33)
0 \le v \le L. (11.34)
From Fig. 11.4, it is obvious that v is at most V, since the production
of DDT beyond that level decreases food production and increases DDT
pollution. Hence, (11.34) can be reduced to simply
v ≥ 0. (11.35)
\dot{\lambda} = (\rho + \delta)\lambda + h'(P), (11.37)
and
\mu \ge 0 \text{ and } \mu v = 0. (11.38)
\frac{\partial L}{\partial v} = u'[C(v)]C'(v) + \lambda a'(v) + \mu = 0. (11.39)
Since the derived Hamiltonian is concave, conditions (11.36)–(11.39) to-
gether with
\lim_{t\to\infty}\lambda(t) = \bar{\lambda} = \text{constant} (11.40)
are sufficient for optimality; see Theorem 2.1 and Sect. 2.4. The phase
diagram analysis presented below gives λ(t) satisfying (11.40).
\bar{P} = \frac{a(\bar{v})}{\delta}, (11.41)
\bar{\lambda} = -\frac{h'(\bar{P})}{\rho + \delta} = -\frac{u'[C(\bar{v})]C'(\bar{v})}{a'(\bar{v})}. (11.42)
From (11.42) and the assumptions on the derivatives of g, C and a, we
know that λ̄ < 0. From this and (11.37), we conclude that λ(t) is always
negative. The economic interpretation of λ is that −λ is the imputed
cost of pollution. Let v = Φ(λ) denote the solution of (11.39) with μ = 0.
On account of (11.35), define
\frac{dP}{d\lambda}\Big|_{\dot{P}=0} = \frac{a'(v)}{\delta}\frac{dv}{d\lambda} > 0, (11.46)
\frac{dP}{d\lambda}\Big|_{\dot{\lambda}=0} = -\frac{\rho + \delta}{h''(P)} < 0, (11.47)
where q is the number of units purchased and φ is the total amount paid
to the seller. We assume a(0) = 0, a' > 0, and a'' < 0.
The seller knows only the distribution F(t), having the density f(t), t ∈ [t₁, t₂]. The seller's unit production cost is c > 0, so that his
profit from selling q units against a sum of money φ is given by
π = φ − cq. (11.49)
Thanks to the revelation principle, the answer is that the seller can offer
a menu of contracts {φ(t), q(t)} which comes from solving the following
maximization problem:
\max_{q(\cdot),\,\phi(\cdot)} \int_{t_1}^{t_2} [\phi(t) - cq(t)]f(t)\,dt (11.50)
subject to
q̇(t) ≥ 0. (11.58)
subject to
q̇(t) = u(t), (11.60)
\dot{\phi}(t) = t\,a'(q(t))u(t), (11.61)
t1 a(q(t1 )) − φ(t1 ) = 0, (11.62)
u(t) ≥ 0. (11.63)
Here, q(t) and φ(t) are state variables and u(t) is a control variable
satisfying the control constraint u(t) ≥ 0. The objective function (11.59)
is the expected value of the seller’s profit with respect to the density f (t).
Equation (11.60) and constraint (11.63) come from the monotonicity
condition (11.58). Equation (11.61) with u(t) from (11.60) gives the
local incentive compatibility condition (11.56). Finally, (11.62) specifies
the IR constraint (11.53) in view of the fact that it will be binding for the
lowest agent type t1 at the optimum.
We can now use the sense of the maximum principle (3.12) to write
the necessary conditions for optimality. Note that (3.12) is written
for problem (3.7) that has specified initial states and some constraints
on the terminal state vector x(T ) that include the equality constraint
b(x(T ), T ) = 0. Our problem, on the other hand, has this type of equal-
ity constraint, namely (11.62), on the initial states q(t1 ) and φ(t1 ) and
no specified terminal states q(t2 ) and φ(t2 ). However, since initial time
conditions and terminal time conditions can be treated in a symmetric
fashion, we can apply the sense of (3.12), as shown in Remark 3.9, to
obtain the necessary optimality conditions to problem (11.59)–(11.63).
In Exercise 11.13, you are asked to obtain (11.67) and (11.68) by fol-
lowing Remark 3.9 to account for the presence of the equality constraint
(11.62) on the initial state variables rather than on the terminal state as
in problem (3.7).
To specify the necessary optimality condition, we first define the
Hamiltonian.
the singular path. An impulse control would occur if the initial q(t1 )
were above the singular path. Since in our problem, initial states are not
exactly specified, we shall not encounter an impulse control here.
The third remark concerns a numerical way of solving the problem.
For this, let us rewrite the boundary conditions in (11.67) and (11.68)
and the condition (11.66) as below:
t_1a(q^*(t_1)) - \phi^*(t_1) = 0, \quad \lambda(t_1) = -\mu(t_1)t_1a'(q^*(t_1)), (11.70)
\lambda(t_2) = \mu(t_2) = 0. (11.71)
With (11.71) and a guess of q(t₂) and φ(t₂), we can solve the differential equations (11.65), (11.67), and (11.68), with u*(t) in (11.69), backward in time. These will give us the values of λ(t₁), μ(t₁), q(t₁), and φ(t₁). We
can check if these satisfy the two equations in (11.70). If yes, we have
arrived at a solution. If not, we change our guess for q(t2 ) and φ(t2 ) and
start again. As you may have noticed, the procedure is very similar to
solving a two-point boundary value problem.
Next we provide an alternative procedure to solve the seller’s prob-
lem, a procedure used in the theory of mechanism design. This procedure
first ignores the nonnegativity constraint (11.60) and solves the relaxed
problem given by (11.59)–(11.62). In view of (11.52), let us define
u^0(\hat{t}) = \hat{t}\,a(q(\hat{t})) - \phi(\hat{t}) = \max_t\,[t\,a(q(t)) - \phi(t)]. (11.72)
subject to
q̇(t) = u(t), u(t) ≥ 0. (11.80)
[Figure: the relaxed solution q̂(t) and the constant level q*(θ), plotted against t ∈ [t₁, t₂] with the interval [t, t̄] marked]
Using the transversality conditions in the case when neither the initial nor the terminal state is specified for the state equation (11.80), we obtain
0 = \lambda(t_1) = \lambda(t_2) = -\int_{t_1}^{t_2}\left\{\left[z - \frac{1}{h(z)}\right]a'(q(z)) - c\right\}f(z)\,dz.
[Figure: the optimal q*(t), constant over the bunching interval [θ₁, θ₂] containing [t, t̄]; the regions with μ = 0, λ = 0, and λ < 0 are indicated]
from the continuity of q*(·). Thus, we have two equations, (11.84) and (11.85), in the two unknowns θ₁ and θ₂, allowing us to obtain their values.
An interval [θ1 , θ2 ] over which q ∗ (t) is constant is known as a bunching
interval.
Here, we have given a procedure for the case when q̂(·) has only one interval [t, t̄] over which it is strictly decreasing. If there are more such intervals, the procedure can be applied to each of them in turn.
ogy, Dockner and Jørgensen (1988) and Jedidi et al. (1989) for optimal
pricing and/or advertising for monopolistic diffusion models, Hartl and
Jørgensen (1985) for manpower planning, Ringbeck (1985) for optimal
quality and advertising under asymmetric information, Hartl and Krauth
(1989) for optimal production mix, Gaimon (1997) for planning for in-
formation technology, Hartl and Kort (2005) for advertising directed to
existing and new customers, and Shani et al. (2005) for dynamic irriga-
tion policies.
Finally, we conclude this section by citing a series of rather un-
usual but humorous applications of optimal control theory that began
with the Sethi (1979b) paper on optimal pilfering policies for dynamic
continuous thieves. These are: Hartl and Mehlmann (1982, 1983) and
Hartl et al. (1992a) on optimal blood consumption by vampires, Hartl
and Mehlmann (1986) on remuneration patterns for medical services,
Hartl and Jørgensen (1988, 1990) on optimal slidemanship at confer-
ences, Jørgensen (1992) on the dynamics of extramarital affairs, and
Feichtinger et al. (1999) on Petrarch’s Canzoniere: rational addiction
and amorous cycles. See also the monograph by Mehlmann (1997) on
unusual and humorous applications of differential games.
E 11.3 Use (11.14) to show that h'(λ) < 0. Then, conclude that the
directions of the horizontal arrows above and below the k̇ = 0 curve are
as drawn in Fig. 11.1.
E 11.4 Show that for any k0 > k̄, there exists a unique optimal path,
such as that shown by the solid curve in Region III of Fig. 11.1.
E 11.6 Use the phase diagram method to solve the advertising model
of (7.7) with its objective function replaced by
\max_{u \ge 0}\left\{J = \int_0^\infty e^{-\rho t}[\pi(G) - c(u)]\,dt\right\},
subject to
\dot{k} = f(k) - c - \gamma k, \quad k(0) = k_0,
where
B = \sup_{c \ge 0} u(c) > 0.
(a) Show that the optimal capital stock trajectory satisfies the differential equation
u'(f(k) - \gamma k - \dot{k})\,\dot{k} = B - u(f(k) - \gamma k - \dot{k}).
where Xt is the state at time t and Ut is the control at time t, and together
they are required to satisfy the Itô stochastic differential equation
dXt = f (Xt , Ut , t)dt + G(Xt , Ut , t)dZt , X0 = x0 , (12.2)
where Zt , t ∈ [0, T ] is a standard Wiener process.
For convenience in exposition, we assume F : E¹ × E¹ × E¹ → E¹, S : E¹ × E¹ → E¹, the drift coefficient function f : E¹ × E¹ × E¹ → E¹, and the diffusion coefficient function G : E¹ × E¹ × E¹ → E¹, so that
(12.2) is a scalar equation. We also assume that the functions F and S are
continuous in their arguments and the functions f and G are continuously
differentiable in their arguments. For multidimensional extensions of this
problem, see Fleming and Rishel (1975).
Since (12.2) is a scalar equation, the subscript t here represents only
time t. Thus, writing Xt , Ut , and Zt in place of writing X(t), U (t), and
Z(t), respectively, will not cause any confusion and, at the same time,
will eliminate the need for writing many parentheses.
To solve the problem defined by (12.1) and (12.2), let V (x, t), known
as the value function, be the expected value of the objective function
(12.1) from t to T, when an optimal policy is followed from t to T, given
Xt = x. Then, by the principle of optimality,
V(x, t) = \max_U E[F(x, U, t)dt + V(x + dX_t, t + dt)]. (12.3)
Expanding V(x + dX_t, t + dt) around (x, t) by Taylor series, we get
V(x + dX_t, t + dt) = V(x, t) + V_t\,dt + V_x\,dX_t + \frac{1}{2}V_{xx}(dX_t)^2 + \text{higher-order terms}. (12.4)
Substitute (12.4) into (12.3) and use (12.5), (12.6), (12.7), and the prop-
erty that E[dZt ] = 0 to obtain
V = \max_U\left[F\,dt + V + V_t\,dt + V_xf\,dt + \frac{1}{2}V_{xx}G^2\,dt + o(dt)\right]. (12.8)
Then we have
V_t = \tilde{V}_te^{-\rho t} - \rho\tilde{V}e^{-\rho t}, \quad V_x = \tilde{V}_xe^{-\rho t}, \quad V_{xx} = \tilde{V}_{xx}e^{-\rho t}. (12.14)
0 = \max_U\left[\phi e^{-\rho t} + \tilde{V}_te^{-\rho t} - \rho\tilde{V}e^{-\rho t} + \tilde{V}_xfe^{-\rho t} + \frac{1}{2}\tilde{V}_{xx}G^2e^{-\rho t}\right].
Canceling the common factor e^{-\rho t} gives
\rho\tilde{V} = \max_U\left[\phi + \tilde{V}_t + \tilde{V}_xf + \frac{1}{2}\tilde{V}_{xx}G^2\right]. (12.15)
Moreover, from (12.12), (12.13), and (12.10), we can get the boundary
condition
Ṽ (x, T ) = ψ(x). (12.16)
Thus, we have obtained (12.15) and (12.16) as the current-value HJB
equation.
To obtain its infinite-horizon version, we generally remove the explicit dependence on t from the functions f and G in (12.2), and also assume that ψ ≡ 0. With that, the dynamics (12.2) and
the objective function (12.11) change, respectively, to
It should then be obvious that Ṽₜ = 0, and we can obtain the infinite-horizon version of (12.15) as
\rho\tilde{V} = \max_U\left[\phi + \tilde{V}_xf + \frac{1}{2}\tilde{V}_{xx}G^2\right], (12.19)
where the maximization is with respect to U. For further details, and for extensions when the value function is not smooth enough and thus not a classical solution of the HJB equation, see Fleming and Rishel (1975), Yong and Zhou (1999), and Fleming and Soner (1992).
In the next three sections, we will apply this procedure to solve prob-
lems in production, marketing and finance.
Then,
x^2[\dot{Q} + Q^2 - 1] + x[\dot{R} + RQ - 2SQ] + \dot{M} + \frac{R^2}{4} - RS + \sigma^2Q = 0. (12.31)
Since (12.31) must hold for any value of x, we must have
\dot{Q} = 1 - Q^2, \quad Q(T) = 0, (12.32)
\dot{R} = 2SQ - RQ, \quad R(T) = B, (12.33)
\dot{M} = RS - \frac{R^2}{4} - \sigma^2Q, \quad M(T) = 0, (12.34)
where the boundary conditions for the system of simultaneous differential
equations (12.32), (12.33), and (12.34) are obtained by comparing (12.27)
with the boundary condition V (x, T ) = Bx of (12.23).
To solve (12.32), we expand \dot{Q}/(1 - Q^2) by partial fractions to obtain
\frac{\dot{Q}}{2}\left[\frac{1}{1 - Q} + \frac{1}{1 + Q}\right] = 1,
which can be easily integrated. The answer is
Q = \frac{y - 1}{y + 1}, (12.35)
where
y = e^{2(t-T)}. (12.36)
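A quick symbolic check confirms that (12.35)–(12.36) solve (12.32):

```python
import sympy as sp

t, T = sp.symbols('t T')
y = sp.exp(2*(t - T))
Q = (y - 1)/(y + 1)
print(sp.simplify(sp.diff(Q, t) - (1 - Q**2)))  # 0: Qdot = 1 - Q^2 holds
print(Q.subs(t, T))                             # 0: terminal condition Q(T) = 0
```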
The optimal control is defined by (12.24), and the use of (12.35) and (12.37) yields
P^*(x, t) = \frac{V_x}{2} = Qx + \frac{R}{2} = S + \frac{(y - 1)x + (B - 2S)\sqrt{y}}{y + 1}. (12.39)
This means that the optimal production rate at time t ∈ [0, T] is
P_t^* = P^*(I_t^*, t) = S + \frac{(e^{2(t-T)} - 1)I_t^* + (B - 2S)e^{(t-T)}}{e^{2(t-T)} + 1}, (12.40)
where It∗ , t ∈ [0, T ], is the inventory level observed at time t when using
the optimal production rate Pt∗ , t ∈ [0, T ], according to (12.40).
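A sample path of the controlled inventory is easy to generate by the Euler–Maruyama method, assuming the inventory dynamics dI = (P − S)dt + σ dZ of this section; the numbers below follow the figure later in this section, with its s read as the diffusion coefficient σ:

```python
import numpy as np

rng = np.random.default_rng(0)
T, S, B, sigma, x0 = 12.0, 5.0, 20.0, 2.0, 2.0
n = 4000
dt = T/n
t = np.linspace(0.0, T, n + 1)

def P_star(x, s):                        # the feedback rule (12.40)
    y = np.exp(2*(s - T))
    return S + ((y - 1)*x + (B - 2*S)*np.sqrt(y))/(y + 1)

I = np.empty(n + 1); I[0] = x0
for k in range(n):
    dZ = rng.normal(0.0, np.sqrt(dt))
    I[k + 1] = I[k] + (P_star(I[k], t[k]) - S)*dt + sigma*dZ
print(f"I(T) = {I[-1]:.3f}")
```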
Remark 12.2 The optimal production rate in (12.39) equals the de-
mand rate plus a correction term which depends on the level of inven-
tory and the distance from the horizon time T. Since y − 1 < 0 for t < T, it is clear that for lower values of x, the optimal production rate is likely to be positive. However, if x is very high, the correction term will become smaller than −S, and the optimal control will be negative. In other words, if the inventory level is too high, the factory can save money by disposing of a part of the inventory, resulting in lower holding costs.
As T → ∞,
P^*(x, t) \to S - x, (12.41)
but the undiscounted objective function value (12.21) in this case be-
comes −∞. Clearly, any other policy will render the objective function
value to be −∞. In a sense, the optimal control problem becomes ill-
posed. One way to get out of this difficulty is to impose a nonzero
discount rate. You are asked to carry this out in Exercise 12.2.
[Figure: a sample path of X_t, drawn for x₀ = 2, T = 12, B = 20, S = 5, s = 2]
as derived in Exercise 7.37. In Exercise 12.3, you are asked to verify that (12.48) and (12.49) solve the HJB equation (12.47).
We can now obtain the explicit formula for the optimal feedback
control as
U^*(x) = \frac{r\bar{\lambda}\sqrt{1 - x}}{2}. (12.50)
Note that U*(x) satisfies the conditions in (12.44).
As in Exercise 7.37, it is easy to characterize (12.50) as
U_t^* = U^*(X_t) \begin{cases} > \bar{U} & \text{if } X_t < \bar{X}, \\ = \bar{U} & \text{if } X_t = \bar{X}, \\ < \bar{U} & \text{if } X_t > \bar{X}, \end{cases} (12.51)
where
\bar{X} = \frac{r^2\bar{\lambda}/2}{r^2\bar{\lambda}/2 + \delta} (12.52)
and
\bar{U} = \frac{r\bar{\lambda}\sqrt{1 - \bar{x}}}{2}, (12.53)
as given in (7.51).
The market share trajectory X_t is no longer monotone because of the random variations caused by the diffusion term σ(X_t)dZ_t in the Itô equation (12.42). Eventually, however, the market share process hovers around the equilibrium level x̄. In this sense, as in the previous section, we have a turnpike result in a stochastic environment.
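The hovering behavior shows up clearly in simulation. The sketch below integrates the market share dynamics under the feedback rule (12.50) by the Euler–Maruyama method; the parameter values and the choice σ(x) = 0.25√(x(1 − x)) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
r, delta, lam_bar = 2.0, 0.5, 1.0       # illustrative parameters
x0, T, n = 0.1, 10.0, 10000
dt = T/n

xbar = (r**2*lam_bar/2)/(r**2*lam_bar/2 + delta)   # turnpike level (12.52)

X = x0
for _ in range(n):
    U = r*lam_bar*np.sqrt(max(1.0 - X, 0.0))/2      # feedback rule (12.50)
    drift = r*U*np.sqrt(max(1.0 - X, 0.0)) - delta*X
    sig = 0.25*np.sqrt(max(X*(1.0 - X), 0.0))       # assumed sigma(x)
    X += drift*dt + sig*rng.normal(0.0, np.sqrt(dt))
    X = min(max(X, 0.0), 1.0)                       # keep X in [0, 1]
print(f"xbar = {xbar:.3f}, X(T) = {X:.3f}")
```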
the risky stock over time and consume over time so as to maximize his
total utility of consumption. We will assume an infinite horizon problem
in lieu of the bequest, for convenience in exposition. One could, however,
argue that Rich’s bequest would be optimally invested and consumed by
his heir, who in turn would leave a bequest that would be optimally
invested and consumed by a succeeding heir and so on. Thus, if Rich
considers the utility accrued to all his heirs as his own, then he can justify
solving an infinite horizon problem without a bequest.
In order to formulate the stochastic optimal control problem of Rich
Investor, we must first model his investments. The savings account is
easy to model. If S0 is the initial deposit in the savings account earning
an interest at the rate r > 0, then we can write the accumulated amount
St at time t as
St = S0 ert .
This can be expressed as a differential equation, dSt /dt = rSt , which we
will rewrite as
dSt = rSt dt, S0 ≥ 0. (12.54)
Modeling the stock is much more complicated. Merton (1971) and
Black and Scholes (1973) have proposed that the stock price Pt can be
modeled by an Itô equation, namely,
\frac{dP_t}{P_t} = \alpha\,dt + \sigma\,dZ_t, \quad P_0 > 0, (12.55)
or simply,
dP_t = \alpha P_t\,dt + \sigma P_t\,dZ_t, \quad P_0 > 0, (12.56)
where P0 > 0 is the given initial stock price, α is the average rate of
return on stock, σ is the standard deviation associated with the return,
and Zt is a standard Wiener process.
Remark 12.6 The LHS in (12.55) can also be written as d ln P_t. Another
name for the process Zt is Brownian Motion. Because of these, the price
process Pt given by (12.55) is often referred to as a logarithmic Brownian
Motion. It is important to note from (12.56) that Pt remains nonnegative
at any t > 0 on account of the fact that the price process has almost
surely continuous sample paths (see Sect. D.2). This property nicely
captures the limited liability that is incurred in owning a share of stock.
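The positivity is easiest to see from the explicit solution P_t = P_0 exp((α − σ²/2)t + σZ_t) of (12.56), which can be sampled exactly; the parameter values below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, sigma, P0, T, n = 0.08, 0.2, 100.0, 1.0, 252
dt = T/n
t = np.linspace(0.0, T, n + 1)
Z = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))
P = P0*np.exp((alpha - sigma**2/2)*t + sigma*Z)   # exact solution of (12.56)
print(P.min() > 0.0)   # True: the price stays positive (limited liability)
```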
In order to complete the formulation of Rich’s stochastic optimal
control problem, we need the following additional notation:
Wt = the wealth at time t,
where h > 0, c > 0, Iˆ ≥ 0 and P̂ ≥ 0; see the objective function (6.2) for
the interpretation of these parameters.
Chapter 13
Differential Games
where we may assume all variables to be scalar for the time being. Ex-
tension to the vector case simply requires appropriate reinterpretations
of each of the variables and the equations. In this equation, we let u
and v denote the controls applied by players 1 and 2, respectively. We
assume that
u(t) ∈ U, v(t) ∈ V, t ∈ [0, T ],
where U and V are convex sets in E¹. Consider further the objective function
J(u, v) = S[x(T)] + \int_0^T F(x, u, v, t)\,dt, (13.2)
0
which player 1 wants to maximize and player 2 wants to minimize. Since
the gain of player 1 represents a loss to player 2, such games are appro-
priately termed zero-sum games. Clearly, we are looking for admissible
control trajectories u∗ and v ∗ such that
H(x^*(t), u^*(t), v^*(t), \lambda(t), t) = \min_{v\in V}\max_{u\in U} H(x^*(t), u, v, \lambda(t), t), (13.6)
Hu = 0 and Hv = 0, (13.8)
Let J^i, defined by
J^i = S^i[x(T)] + \int_0^T F^i(x, u^1, u^2, \ldots, u^N, t)\,dt, (13.11)
denote the objective function which the ith player wants to maximize. In
this case, a Nash solution is defined by a set of N admissible trajectories
J^i(u^{1*}, u^{2*}, \ldots, u^{N*}) = \max_{u^i\in U^i} J^i(u^{1*}, \ldots, u^{(i-1)*}, u^i, u^{(i+1)*}, \ldots, u^{N*}) (13.13)
for i = 1, 2, \ldots, N.
To obtain the necessary conditions for a Nash solution for nonzero-
sum differential games, we must make a distinction between open-loop
and closed-loop controls.
H i (x, u1 , u2 , . . . , uN , λi ) = F i + λi f (13.14)
The Nash control ui∗ for the ith player is obtained by maximizing the
ith Hamiltonian H i with respect to ui , i.e., ui∗ must satisfy
\dot{\lambda}^i = -H_x^i - \sum_{j=1}^N H_{u^j}^i\phi_x^j = -H_x^i - \sum_{j=1,\,j\ne i}^N H_{u^j}^i\phi_x^j. (13.18)
To find the feedback Nash solution for this model, we let x̄i denote the
turnpike (or optimal biomass) level given by (10.12) on the assumption
that the ith producer is the sole-owner of the fishery. Let the bionomic
equilibrium xib and the corresponding control uib associated with producer
i be defined by (10.4), i.e.,
x_b^i = \frac{c^i}{p^iq^i} \quad\text{and}\quad u_b^i = \frac{g(x_b^i)p^i}{c^i}. (13.23)
u^{1*}(x) = \begin{cases} U^1 & \text{if } x > x_b^2, \\[4pt] \dfrac{g(x_b^2)}{q^1x_b^2} & \text{if } x = x_b^2, \\[4pt] 0 & \text{if } x < x_b^2, \end{cases} \quad \text{if } \bar{x}^1 \ge x_b^2. (13.27)
and
J 2 (u1∗ , u2∗ ) ≥ J 2 (u1∗ , u2 ). (13.29)
The direct verification involves defining a modified growth function
g^1(x) = \begin{cases} g(x) - q^2U^2x & \text{if } x > x_b^2, \\ g(x) & \text{if } x \le x_b^2, \end{cases}
and using the Green’s theorem results of Sect. 10.1.2. Since U 2 ≥ u2b
by assumption, we have g 1 (x) ≤ 0 for x > x2b . From (10.12) with g
replaced by g 1 , it can be shown that the new turnpike level for producer
1 is min(x̄1 , x2b ), which defines the optimal policy (13.26)–(13.27) for
producer 1. The optimality of (13.25) for producer 2 follows easily.
To interpret the results of the model, suppose that producer 1 orig-
inally has sole possession of the fishery, but anticipates a rival entry.
Producer 1 will switch from his own optimal sustained yield ū1 to a
more intensive exploitation policy prior to the anticipated entry.
We can now guess the results in situations involving N producers.
The fishery will see the progressive elimination of inefficient producers
as the stock of fish decreases. Only the most efficient producers will
survive. If, ultimately, two or more maximally efficient producers exist,
the fishery will converge to a classical bionomic equilibrium, with zero
sustained economic rent.
We have now seen that a feedback Nash solution involving N ≥ 2
competing producers results in the long-run erosion of economic rents.
This conclusion depends on the assumption that producers face an in-
finitely elastic supply of all factors of production going into the fishing
\max_{U_2\ge 0}\left\{V^2(x_0) = E\int_0^\infty e^{-\rho_2t}\left[\pi_2(1 - X_t) - c_2U_{2t}^2\right]dt\right\}, (13.32)
\rho_1V^1 = \max_{U_1\ge 0}\Big\{\pi_1x - c_1U_1^2 + V_x^1\big[r_1U_1\sqrt{1-x} - r_2U_2\sqrt{x} - \delta(2x-1)\big] + (\sigma(x))^2V_{xx}^1/2\Big\}, (13.33)
\rho_2V^2 = \max_{U_2\ge 0}\Big\{\pi_2(1-x) - c_2U_2^2 + V_x^2\big[r_1U_1\sqrt{1-x} - r_2U_2\sqrt{x} - \delta(2x-1)\big] + (\sigma(x))^2V_{xx}^2/2\Big\}. (13.34)
at time t, and a menu U (x, W ) for the follower representing his decision
when he observes the leader’s decision to be W in addition to the state x
at time t. For this, let us first define a feedback Stackelberg equilibrium,
and then develop a procedure to obtain it.
We begin with specifying the admissible strategy spaces for the man-
ufacturer and the retailer, respectively:
\mathcal{W} = \{W \mid W : [0, 1] \to [0, 1] \text{ and } W(x) \text{ is Lipschitz continuous in } x\},
\mathcal{U} = \{U \mid U : [0, 1]\times[0, 1] \to [0, \infty) \text{ and } U(x, W) \text{ is Lipschitz continuous in } (x, W)\},
where we should stress that W (·), U (·, W (·)) evaluated at any state ζ are
W (ζ), U (ζ, W (ζ)). We can now define our equilibrium concept.
A pair of strategies (W ∗ , U ∗ ) ∈ W ×U is called a feedback Stackelberg
equilibrium if
J_M^{t,x}(W^*(\cdot), U^*(\cdot, W^*(\cdot))) \ge J_M^{t,x}(W(\cdot), U^*(\cdot, W(\cdot))), \quad W \in \mathcal{W},\; x \in [0, 1],\; t \ge 0, (13.56)
and
J_R^{t,x}(W^*(\cdot), U^*(\cdot, W^*(\cdot))) \ge J_R^{t,x}(W^*(\cdot), U(\cdot, W^*(\cdot))), \quad U \in \mathcal{U},\; x \in [0, 1],\; t \ge 0. (13.57)
as the optimal response of the follower for any decision W by the leader.
We then substitute this for U in H M to obtain
H^M(x, W, U^*(x, W), \lambda^M) = \pi^Mx - \frac{W(\lambda^Rr)^2(1 - x)}{4(1 - W)^2} + \lambda^M\left[\frac{\lambda^Rr^2(1 - x)}{2(1 - W)} - \delta x\right]. (13.61)
W(x) = \frac{2\lambda^M - \lambda^R}{2\lambda^M + \lambda^R}. (13.62)
Clearly W (x) ≥ 1 makes no intuitive sense because it would induce the
retailer to spend an infinite amount on advertising, and that would not be
optimal for the leader. Moreover, λM and λR , the marginal valuations of
the market share of the leader and the follower, respectively, are expected
to be positive, and therefore it follows from (13.62) that W (x) < 1. Thus,
we set
W^*(x) = \max\left\{0,\; \frac{2\lambda^M - \lambda^R}{2\lambda^M + \lambda^R}\right\}. (13.63)
We can now write the HJB equations as
The solution of these equations will yield the value functions V M (x) and
V R (x). With these in hand, we can give the equilibrium menu of actions
to the manufacturer and the retailer to guide their decisions at each t.
These menus are
W^*(x) = \max\left\{0,\; \frac{2V_x^M - V_x^R}{2V_x^M + V_x^R}\right\} \quad\text{and}\quad U^*(x, W) = \frac{V_x^Rr\sqrt{1 - x}}{2(1 - W)}. (13.66)
To solve for the value function, we next investigate the two cases
where the subsidy rate is (a) zero and (b) positive, and determine the
condition required for no subsidy to be optimal.
\beta = \frac{2\pi}{\sqrt{(\rho + \delta)^2 + r^2\pi} + (\rho + \delta)}, \quad \beta^M = \frac{2\pi^M}{2(\rho + \delta) + \beta r^2}, (13.70)
\alpha = \frac{\beta^2r^2}{4\rho}, \quad \alpha^M = \frac{\beta\beta^Mr^2}{2\rho}. (13.71)
Using (13.71) in (13.67), we can write U^*(x) = \sqrt{\rho\alpha(1 - x)}. Finally, we can derive the required condition from the right-hand side of W^*(x) in (13.66), which is 2V_x^M \le V_x^R, for no co-op advertising (W^* = 0) in equilibrium. This is given by 2\beta^M \le \beta, or
\frac{4\pi^M}{2(\rho + \delta) + \dfrac{2\pi r^2}{\sqrt{(\rho + \delta)^2 + r^2\pi} + (\rho + \delta)}} \le \frac{2\pi}{\sqrt{(\rho + \delta)^2 + r^2\pi} + (\rho + \delta)}. (13.72)
Case (b): Co-op Advertising (W* > 0). Then, W*(x) in (13.66) reduces to
W^*(x) = \frac{2V_x^M - V_x^R}{2V_x^M + V_x^R}. (13.74)
Inserting this for W*(x) into (13.65) and (13.64), we have
\rho V^M = \pi^Mx - \frac{r^2(1 - x)[4(V_x^M)^2 - (V_x^R)^2]}{16} + \frac{V_x^Mr^2(1 - x)[2V_x^M + V_x^R]}{4} - V_x^M\delta x + \frac{(\sigma(x))^2V_{xx}^M}{2}, (13.75)
\rho V^R = \pi x + \frac{(V_x^R)^2r^2(1 - x)}{4}\cdot\frac{2V_x^M + V_x^R}{2V_x^R} - V_x^R\delta x + \frac{(\sigma(x))^2V_{xx}^R}{2}. (13.76)
(\rho + \delta)\beta^M = \pi^M - \frac{(\beta + 2\beta^M)^2r^2}{16}. (13.80)
Using (13.66), (13.74), and (13.79), we can write U*(x, W*(x)), with a slight abuse of notation, as
U^*(x) = \frac{r(V_x^R + 2V_x^M)\sqrt{1 - x}}{4} = \sqrt{\rho\alpha^M(1 - x)}. (13.81)
The four equations (13.77)–(13.80) determine the solutions for the
four unknowns, α, β, αM , and β M . From (13.78) and (13.80), we can
obtain
\beta^3 + \frac{2\pi^M}{\rho + \delta}\beta^2 + \frac{8\pi}{r^2}\beta - \frac{8\pi^2}{(\rho + \delta)r^2} = 0. (13.82)
If we denote
a_1 = \frac{2\pi^M}{\rho + \delta}, \quad a_2 = \frac{8\pi}{r^2}, \quad a_3 = \frac{-8\pi^2}{(\rho + \delta)r^2},
then a₁ > 0, a₂ > 0, and a₃ < 0. By Descartes' rule of signs, there
exists a unique, positive real root. The two remaining roots may be
both imaginary or both real and negative. Since this is a cubic equation,
a complete solution can be obtained. Using Mathematica or following
Spiegel et al. (2008), we can write down the three roots as
\beta_{(1)} = S + T - \frac{1}{3}a_1,
\beta_{(2)} = -\frac{1}{2}(S + T) - \frac{1}{3}a_1 + \frac{\sqrt{3}}{2}i(S - T),
\beta_{(3)} = -\frac{1}{2}(S + T) - \frac{1}{3}a_1 - \frac{\sqrt{3}}{2}i(S - T),
with
S = \sqrt[3]{R + \sqrt{Q^3 + R^2}}, \quad T = \sqrt[3]{R - \sqrt{Q^3 + R^2}}, \quad i = \sqrt{-1},
where
Q = \frac{3a_2 - a_1^2}{9}, \quad R = \frac{9a_1a_2 - 27a_3 - 2a_1^3}{54}.
Next, we identify the positive root in each of the following three cases:
                                  No co-op advertising                         Co-op advertising
Manufacturer's subsidy rate       W*(x) = 0                                    W*(x) = (2β^M − β)/(2β^M + β) = 1 − α/α^M
Retailer's advertising effort     U*(x) = rβ√(1−x)/2 = √(ρα(1−x))              U*(x) = r(β + 2β^M)√(1−x)/4 = √(ρα^M(1−x))
Case 2 (Q < 0 and Q³ + R² > 0): There are three real roots with one positive root, which is β = S + T − (1/3)a₁.
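In practice one can bypass the Cardano formulas and extract the unique positive root of (13.82) numerically; the parameter values here are illustrative:

```python
import numpy as np

rho, delta, r, pi, piM = 0.05, 0.5, 2.0, 0.4, 0.4   # assumed data
a1 = 2*piM/(rho + delta)
a2 = 8*pi/r**2
a3 = -8*pi**2/((rho + delta)*r**2)

roots = np.roots([1.0, a1, a2, a3])                 # cubic (13.82)
beta = next(z.real for z in roots if abs(z.imag) < 1e-12 and z.real > 0)
print(f"beta = {beta:.6f}")
```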
Figure 13.2: Optimal subsidy rate vs. (a) Retailer’s margin and (b)
Manufacturer’s margin
x1 (0) = x10 ,
x2 (0) = x20 ,
E 13.2 Let x(t) denote the stock of pollution at time t ∈ [0, T ] that
affects the welfare of two countries, one of which is the leader and the
other the follower. The state dynamics is
ẋ = u + v, x(0) = x0 ,
where u and v are emission rates of the leader and the follower, respec-
tively. Let their instantaneous utility functions be
u - (u^2 + x^2)/2 \quad\text{and}\quad v - (v^2 + x^2)/2,
respectively. Obtain the open-loop Stackelberg solution. By re-solving
this problem at time τ , 0 < τ < T, show that the first solution obtained
is time inconsistent.
f'(v^{i*}) = \frac{c^i}{(p^i - \lambda^i)x}.
which means that a firm can stimulate its sales through advertising (but
subject to decreasing returns) and that demand learning effects (imita-
tion) are industry-wide. (If these effects were firm-specific we would have
Si instead of S in the brackets on the right-hand side of the dynamics.)
Payoffs are given by
J_i = \int_0^T [(p_i - c_i)\dot{S}_i(t) - A_i(t)]\,dt,
in which prices and unit costs are constant. Since Ṡᵢ(t) in the expression for Jᵢ is stated in terms of the state variable S(t) and the control variables Aᵢ(t), i ∈ {1, 2, . . . , N}, formulate the differential game problem with S(t) as the state variable. In the open-loop Nash equilibrium, show that the advertising rates are monotonically decreasing over time.
E 13.5 Solve (13.43) to obtain the solution for α and β given in (13.44)
and (13.45).
E 13.8 Suppose the manufacturer in Exercise 13.7 does not behave op-
timally and decides instead to offer no cooperative advertising. Obtain
the value functions of the manufacturer and the retailer. Compare the
manufacturer's value function in this case with V^M(x) in Exercise 13.7. Furthermore, when x₀ = 0.5, obtain the manufacturer's loss in expected profit when compared to the optimal expected profit V^M(x₀) in Exercise 13.7.
E 13.9 Suppose that the manufacturer and the retailer in the prob-
lem of Sect. 13.4 are integrated into a single firm. Then, formulate the
stochastic optimal control problem of the integrated firm. Also, using
the data in Exercise 13.7, obtain the value function V^I(x) = α^I + β^Ix of the integrated firm, and compare it to V^M(x) + V^R(x) obtained in Exercise 13.7.
E 13.10 Let σ(x) = 0.25\sqrt{x(1 - x)} and the initial market share x₀ = 0.1. Use the optimal feedback advertising effort U*(x) in (13.50) to determine the optimal market share X_t* over time. You may use MATLAB or another suitable software to graph a sample path of X_t*, t ≥ 0.
Appendix A
Solutions of Linear Differential Equations
(4) Replace a1 by a1 + 2r
Replace a by a + a1 r + r2
(5) Replace a1 by a1 + 2r
Replace a by a + a1 r + r2
e^{tA} = I + tA + \frac{t^2A^2}{2!} + \cdots = \sum_{k=0}^\infty \frac{(tA)^k}{k!}. (A.8)
P −1 AP = Λ. (A.11)
where
e^{t\Lambda} = \begin{bmatrix} e^{t\lambda_1} & 0 & \cdots & 0 \\ 0 & e^{t\lambda_2} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & e^{t\lambda_n} \end{bmatrix}. (A.14)
The solution of this system will be of the form (A.15), which can be restated as
\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} = \begin{bmatrix} Q_{11}(t) & Q_{12}(t) \\ Q_{21}(t) & Q_{22}(t) \end{bmatrix}\begin{bmatrix} x(0) \\ \lambda(0) \end{bmatrix} + \begin{bmatrix} R_1(t) \\ R_2(t) \end{bmatrix}, (A.18)
property that
Δg = f (k). (A.23)
One can easily show that
k^4 - 3k + 4 = k^{(4)} + 6k^{(3)} + 7k^{(2)} - 2k^{(1)} + 4.
\Delta\lambda_k = -k + 5, \quad \lambda_6 = 0.
k^{(1)} = k,
k^{(2)} = -k + k^2,
k^{(3)} = 2k - 3k^2 + k^3,
k^{(4)} = -6k + 11k^2 - 6k^3 + k^4,
k^{(5)} = 24k - 50k^2 + 35k^3 - 10k^4 + k^5.
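The conversion between ordinary and factorial powers can be checked mechanically; in sympy, ff(k, n) denotes the factorial power k^(n) = k(k − 1)···(k − n + 1):

```python
import sympy as sp

k = sp.symbols('k')
expr = sp.ff(k, 4) + 6*sp.ff(k, 3) + 7*sp.ff(k, 2) - 2*sp.ff(k, 1) + 4
print(sp.expand(sp.expand_func(expr)))   # k**4 - 3*k + 4
```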
E A.3 If A = \begin{bmatrix} 3 & 3 \\ 2 & 4 \end{bmatrix}, show that \Lambda = \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix} and P = \begin{bmatrix} 1 & 3 \\ 1 & -2 \end{bmatrix}. Use (A.15) to solve (A.7) for this data, given that z(0) = \begin{bmatrix} 0 \\ 5 \end{bmatrix}.
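Exercise A.3 can also be checked numerically: diagonalize A and compare P e^{tΛ} P^{-1} with a direct matrix exponential. This is a sketch; the eigenvector scaling returned by the library will generally differ from the P asked for in the exercise, but e^{tA} is the same:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[3.0, 3.0], [2.0, 4.0]])
lam, P = np.linalg.eig(A)                # eigenvalues 6 and 1
t = 0.7
E = P @ np.diag(np.exp(t*lam)) @ np.linalg.inv(P)
print(np.allclose(E, expm(t*A)))         # True: e^{tA} via (A.14)

z0 = np.array([0.0, 5.0])
print(expm(t*A) @ z0)                    # z(t) solving zdot = Az, z(0) = z0
```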
E A.4 After you have read Sect. 6.1, re-solve the production-inventory example stated in Eqs. (6.1) and (6.2), ignoring the control constraint P ≥ 0, by the method of Sect. A.4. The linear two-point boundary value problem is stated in Eqs. (6.6) and (6.7).
Appendix B
Calculus of Variations and Optimal Control Theory
has a relative maximum. We will assume that all first and second partial
derivatives of the function F : E 1 × E 1 × E 1 → E 1 are continuous.
it is possible to make the integral \int_0^T h(t)\eta(t)\,dt \ne 0. Thus, by contradiction, h(t) must be identically zero over the entire interval [0, T].
By using the fundamental lemma, we have the necessary condition
F_x - \frac{d}{dt}F_{\dot{x}} = 0, (B.5)
known as the Euler-Lagrange equation, or simply the Euler equation,
which must be satisfied by a maximal solution x∗ . In other words, the
solution x∗ (t) must satisfy
F_x(x^*, \dot{x}^*, t) - \frac{d}{dt}F_{\dot{x}}(x^*, \dot{x}^*, t) = 0. (B.6)
subject to
x(0) = x_0 \quad\text{and}\quad x(T) = x_T.
Here t refers to distance rather than time. Since F = -\sqrt{1 + \dot{x}^2} does not depend explicitly on x, we are in the second special case, and the first integral (B.9) of the Euler equation is F_{\dot{x}} = -\dot{x}/\sqrt{1 + \dot{x}^2} = \text{constant}. This implies that \dot{x} is constant, so that
x^*(t) = C_1t + C_2,
a straight line.
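sympy can generate the Euler equation (B.5) directly; applied to the functional above, it reduces to ẍ = 0, confirming that extremals are straight lines:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols('t')
x = sp.Function('x')
F = -sp.sqrt(1 + sp.diff(x(t), t)**2)
print(euler_equations(F, x(t), t))   # equivalent to x''(t) = 0
```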
The time τ_AB required for the bead to slide from point A to point B along a wire formed in the shape of a curve x(t) is given as
\tau_{AB} = \int_0^{s_T}\frac{ds}{v},
where ds = \sqrt{1 + \dot{x}^2}\,dt is the element of arc length and \dot{x} = dx/dt (note that t does not denote time here). From elementary physics, it is known that if v(t = 0) = 0 and a denotes the acceleration due to gravity, then
v(t) = \sqrt{2ax(t)}, \quad t \in [0, T].
Then,
\tau_{AB} = \int_0^T\sqrt{\frac{1 + \dot{x}^2}{2ax}}\,dt. (B.13)
The purpose of the Brachistochrone problem is to find x(t), t ∈ [0, T], so as to minimize the time τ_AB. This is a variational problem, which, in view of a being a constant, can be stated as follows:
\min\left\{J(x) = \int_0^T F(x, \dot{x}, t)\,dt = \int_0^T\sqrt{\frac{1 + \dot{x}^2}{x}}\,dt\right\}. (B.14)
As we can see, the integral F in the above problem does not depend
explicitly on t, and the problem (B.14) belongs to the third special case.
Using the first integral (B.11) of the Euler equation for this case, we have
\sqrt{\frac{1 + \dot{x}^2}{x}} - \dot{x}^2\sqrt{\frac{1}{x(1 + \dot{x}^2)}} = \frac{1}{\sqrt{k}}\ (\text{a constant}).
\tau_{AB}^* = \frac{J(x^*)}{\sqrt{2a}} = \frac{1}{\sqrt{2a}}\int_0^T\sqrt{\frac{1 + (\dot{x}^*(t))^2}{x^*(t)}}\,dt. (B.22)
In Exercise B.1, you are asked to obtain φ1 for T = b = 1 m, and
then obtain the minimum time τ ∗AB .
The greatest lower bound for J(x) for smooth x = x(t) satisfying the
boundary conditions is obviously zero. Yet there is no x ∈ C 1 [−1, 1]
with x(−1) = 0 and x(1) = 1, which achieves this value of J(x). In fact,
the minimum is achieved for the curve
x^*(t) = \begin{cases} 0, & -1 \le t \le 0, \\ t, & 0 < t \le 1, \end{cases}
It is clear that on each of the intervals [0, τ ) and (τ , T ], the Euler equation
must hold.
To compute variations δJ1 and δJ2 , we must recognize that the two
‘pieces’ of x are not fixed-end-point problems. We must require that the
two pieces of x join continuously at t = τ ; the point t = τ can, however,
move freely as shown in Fig. B.3.
This will require a slightly modified version of formula (B.4) for
writing out the variations; see pp. 55–56 in Gelfand and Fomin (1963).
δJ = δJ1 + δJ2 = 0
Fẋẋ ≤ 0. (B.25)
Integrating the middle term by parts and using (B.3), we can transform
(B.26) into a more convenient form
T
(Qη 2 + P η̇ 2 )dt ≤ 0, (B.27)
0
where
Q = Q(t) = F_{xx} - \frac{d}{dt}F_{x\dot{x}} \quad\text{and}\quad P = P(t) = F_{\dot{x}\dot{x}}.
While it is possible to rigorously obtain (B.25) from (B.27), we will
only provide a qualitative argument for this. If we consider the quadratic
functional (B.27) for functions η(t) satisfying η(0) = 0, then η(t) will be
small in [0, T ] if η̇(t) is small in [0, T ]. The converse is not true, however,
since it is easy to construct η(t) which is small but has a large derivative
η̇(t) in [0, T ]. Thus, P η̇ 2 plays the dominant role in (B.27); i.e., P η̇ 2
can be much larger than Qη 2 but it cannot be much smaller (provided
P = 0). Therefore, it might be expected that the sign of the functional
in (B.8) is determined by the sign of the coefficient P (t), i.e., (B.27)
implies (B.25). For a rigorous proof, see Gelfand and Fomin (1963).
We note that the strengthened Legendre condition (i.e., with a strict
inequality in (B.25)), the Euler equation, and one other condition called
strengthened Jacobi condition are sufficient for a maximum. The reader
can consult Chapter 5 of Gelfand and Fomin (1963) for details.
E(x, ẋ, t, u) = F (x, u, t) − F (x, ẋ, t) − Fẋ (x, ẋ, t)(u − ẋ). (B.29)
Note that this condition is always met if F (x, ẋ, t) is concave in ẋ.
The proof of (B.28) is by contradiction. Suppose there exists a τ ∈
[0, T ] and a vector q such that
The Hamiltonian is
H = F(x, u, t) + \lambda u,
and the Hamiltonian maximizing condition gives
H_u = F_{\dot{x}} + \lambda = 0, (B.33)
F_{\dot{x}\dot{x}} \le 0,
and the Hamiltonian maximizing condition implies
F(x^*, \dot{x}^*, t) - F_{\dot{x}}(x^*, \dot{x}^*, t)\dot{x}^* \ge F(x^*, u, t) - F_{\dot{x}}(x^*, \dot{x}^*, t)u,
which is precisely
E(x^*, \dot{x}^*, t, u) = F(x^*, u, t) - F(x^*, \dot{x}^*, t) - F_{\dot{x}}(x^*, \dot{x}^*, t)(u - \dot{x}^*) \le 0.
\lambda(\tau^-) = \lambda(\tau^+),
H(x^*(\tau), u^*(\tau^-), \lambda(\tau^-), \tau) = H(x^*(\tau), u^*(\tau^+), \lambda(\tau^+), \tau).
However,
\lambda = -F_{\dot{x}} \quad\text{and}\quad H = F - F_{\dot{x}}\dot{x},
which means that the right-hand sides must be continuous with respect
to time, i.e., even across corners. These are precisely the Weierstrass-
Erdmann corner conditions.
Appendix C
An Alternative Derivation of the Maximum Principle
Let τ denote any time in the open interval (0, T ). We select a suffi-
ciently small ε to insure that τ − ε > 0 and concentrate our attention on
this small interval (τ − ε, τ ]. We vary the control on this interval while
keeping the control on the remaining intervals [0, τ − ε] and (τ , T ] fixed.
Specifically, the modified control is
u(t) = \begin{cases} v \in \Omega, & t \in (\tau - \varepsilon, \tau], \\ u^*(t), & \text{otherwise}. \end{cases} (C.2)
Since the initial difference δx(τ) is small, and since the control does not change for t > τ, we may conclude that δx(t) will be small for all t > τ.
Being small, the law of variation of δx(t) can be found from linear equa-
tions for small changes in the state variables. These are called variational
equations. From the state equation in (C.1), we have
\frac{d(x^* + \delta x)}{dt} = f(x^* + \delta x, u^*, t) (C.6)
or,
\frac{dx^*}{dt} + \frac{d(\delta x)}{dt} \approx f(x^*, u^*, t) + f_x\delta x, (C.7)
or, using (C.1),
\frac{d}{dt}(\delta x) \approx f_x(x^*, u^*, t)\delta x \quad\text{for } t \ge \tau, (C.8)
\frac{d}{dt}\Phi(t, \tau) = f_x[x^*(t), u^*(t), t]\Phi(t, \tau), \quad \Phi(\tau, \tau) = I, (C.12)
where I is an n × n identity matrix; see Appendix A.
Substituting for δx(t) from (C.11) into (C.10), we have
\delta J = c\,\delta x(\tau) + \int_\tau^T cf_x[x^*(t), u^*(t), t]\Phi(t, \tau)\delta x(\tau)\,dt \le 0. (C.13)
\delta J = \lambda^*(\tau)\delta x(\tau) \le 0. (C.15)
But δx(τ) is supplied in (C.5). Noting that ε > 0, we can rewrite (C.15) as
\lambda^*(\tau)f[x^*(\tau), v, \tau] - \lambda^*(\tau)f[x^*(\tau), u^*(\tau), \tau] \le 0. (C.16)
Defining the Hamiltonian for the Mayer form as
H[x, u, \lambda, t] = \lambda f(x, u, t), (C.17)
Since this can be done for almost every τ , we have the required Hamil-
tonian maximizing condition.
The differential equation form of the adjoint equation (C.14) can be
obtained by taking its derivative with respect to τ . Thus,
\frac{d\lambda(\tau)}{d\tau} = \int_\tau^T cf_x[x^*(t), u^*(t), t]\frac{d\Phi(t, \tau)}{d\tau}\,dt - cf_x[x^*(\tau), u^*(\tau), \tau]. (C.19)
\frac{d\lambda(\tau)}{d\tau} = -\lambda(\tau)f_x[x^*(\tau), u^*(\tau), \tau],
This completes the derivation of the maximum principle along with the
adjoint equation using the direct method.
δJ = cδx(T ) ≤ 0. (C.22)
First, we define
λ(T ) = c, (C.23)
which makes it possible to write (C.22) as
\delta J = \frac{\partial S[x(T)]}{\partial x(T)}\,\delta x(T) = \lambda(T)\delta x(T).
or in other words,
It turns out that the differential equation which λ(t) must satisfy can
be easily found. From (C.27),
\frac{d}{dt}[\lambda(t)\delta x(t)] = \lambda\frac{d\,\delta x}{dt} + \dot{\lambda}\,\delta x = 0, (C.28)
which, after substituting for d\delta x/dt from (C.8), becomes
y t = Ht xt + v t , t = 0, 1, ..., N, (D.7)
or simply
dX = f dt + GdZ, X0 given. (D.20)
Now let the one-dimensional process Yt = ψ(Xt , t), t ∈ [0, T ], where
the function ψ(x, t) is continuously differentiable in t and twice continu-
ously differentiable in x. Then, it possesses the stochastic differential
dY_t = \psi_t(X_t, t)\,dt + \psi_x(X_t, t)\,dX_t + \frac{1}{2}\psi_{xx}(X_t, t)G^2(t)\,dt
= \left[\psi_t(X_t, t) + \psi_x(X_t, t)f(t) + \frac{1}{2}\psi_{xx}(X_t, t)G^2(t)\right]dt + \psi_x(X_t, t)G(t)\,dZ_t, \quad Y_0 = \psi(X_0, 0). (D.21)
Y(x_t, t) = Y(x_0, 0) + \int_0^t\left[\psi_s(x_s, s) + \psi_x(x_s, s)f(s) + \frac{1}{2}\psi_{xx}(x_s, s)G^2(s)\right]ds + \int_0^t \psi_x(x_s, s)G(s)\,dZ_s, \quad \text{w.p. 1}. (D.22)
where H'(t) denotes the transpose (H(t))' and R^{-1}(t) means the inverse (R(t))^{-1}, as per the notational convention defined in Chap. 1. The interpre-
tations of P (t) and K(t) are the same as in the previous section.
The filter (D.27)–(D.29) is the Kalman-Bucy filter (Kalman and Bucy
1961) for linear systems in continuous time. Equation (D.29) is called the
matrix Riccati equation. Besides engineering applications, the Kalman
filter and its extensions are very useful in econometric and financial mod-
eling; see Buchanan and Norton (1971), Chow (1975), Aoki (1976), Naik
et al. (1998), and Bhar (2010).
subject to
ẋ = Ax + Bu, x(0) = x0 . (D.31)
Here x ∈ E n , u ∈ E m , and the appropriate dimensional matrices
C, D, A, and B, when time-dependent, are assumed to be continuous in
Since this equation holds for all x, we have the matrix differential equation
\dot{S} = -SA - A'S + SBD^{-1}B'S - C, (D.38)
called a matrix Riccati equation, with the terminal condition
S(T) = S_T, (D.39)
which is precisely the first-order condition for the maximum of the right-
hand side of (D.32). Moreover, the first-order condition yields a global
maximum of the Hamiltonian, which is concave since the matrix D is
positive definite.
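The terminal-value problem (D.38)–(D.39) is routinely integrated backward in time. A sketch with illustrative data follows; the optimal feedback is then u* = −D^{-1}B'Sx:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, -0.5]])   # illustrative system
B = np.array([[0.0], [1.0]])
C = np.eye(2)                             # state weighting
D = np.array([[1.0]])                     # control weighting
ST = np.zeros((2, 2))
T = 5.0

def riccati(t, s):                        # right-hand side of (D.38)
    S = s.reshape(2, 2)
    dS = -S@A - A.T@S + S@B@np.linalg.inv(D)@B.T@S - C
    return dS.ravel()

sol = solve_ivp(riccati, [T, 0.0], ST.ravel(), rtol=1e-8)
S0 = sol.y[:, -1].reshape(2, 2)
print(np.round(S0, 4))                    # S(0); gain K = D^{-1} B' S(0)
```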
A generalization of (D.30) to include a cross-product term to allow
for interactions between the state x and control u, which would be useful
in the next section on the second variation, is to set
J = -x'(T)S_Tx(T) - \int_0^T(x', u')\begin{bmatrix} C & N \\ N' & D \end{bmatrix}\begin{bmatrix} x \\ u \end{bmatrix}dt, (D.40)
the transformed problem and then use the definition of ũ to write the
feedback control of the generalized problem as
where
with
S(T ) = ST . (D.43)
where S(t) is given by (D.42) and (D.43), and X̂t is given by the Kalman-
Bucy filter:
The above procedure has received two different names in the liter-
ature. In economics it is called the certainty equivalence principle; see
Simon (1956). In engineering and mathematics literature it is called the
separation principle; see Fleming and Rishel (1975). When we call it
the certainty equivalence principle, we are emphasizing the fact that X̂t
can be used for the purposes of optimal feedback control as if it were
the certain value of the state variable X_t. The term separation principle, on the other hand, emphasizes the fact that the process of determining the optimal control can be broken down into two steps: first, estimate X_t by using the optimal filter; second, use that estimate in the optimal feedback control formula for the deterministic problem.
subject to
ẋ = f (x, u, t), x(0) = x0 . (D.45)
From Chap. 2, we know that the first-order necessary conditions for
this problem are given by
Hu = 0, (D.47)
H = F + λf. (D.48)
\delta^2\bar{J} = \frac{1}{2}\,\delta x'(T)\Phi_{xx}\,\delta x(T) + \frac{1}{2}\int_0^T(\delta x', \delta u')\begin{bmatrix} H_{xx} & H_{xu} \\ H_{ux} & H_{uu} \end{bmatrix}\begin{bmatrix} \delta x \\ \delta u \end{bmatrix}dt (D.53)
subject to
\frac{d\,\delta x}{dt} = f_x\,\delta x + f_u\,\delta u, \quad \delta x(0) \text{ specified}. (D.54)
Since we are interested in a neighboring extremal path, we must deter-
mine δu(t) so as to maximize δ 2 J¯ subject to (D.54). This problem is
x1 (T ) = x2 (T ) = 0. (D.63)
The optimal control is bang-bang plus singular. Singular arcs must sat-
isfy
H_u = \lambda_1 - \lambda_2 = 0 (D.66)
for a finite time interval. The optimal control can, therefore, be obtained by
\frac{dH_u}{dt} = \dot{\lambda}_1 - \dot{\lambda}_2 = x_1 + \lambda_1 = 0. (D.67)
Differentiating once more with respect to time t, we obtain
\frac{d^2H_u}{dt^2} = \dot{x}_1 + \dot{\lambda}_1 = x_2 + u + x_1 = 0,
which implies
u = -(x_1 + x_2) (D.68)
along the singular arc. We now verify, for this example, the generalized Legendre-Clebsch condition (D.59) for k = 1:
-\frac{\partial}{\partial u}\frac{d^2H_u}{dt^2} = -1 \le 0. (D.69)
λ̇ = ρλ − φx − λfx (D.72)
The important issue for this problem is the existence and uniqueness
of an optimal path that steers the system from an initial value x0 to the
steady state x̄. This is equivalent to finding a value λ0 so that the system
(D.73) starting from (x0 , λ0 ) moves asymptotically to (x̄, λ̄). A sufficient
condition for this to happen is given in the following theorem.
The proof of this theorem, based on Theorem 1.2 and Corollaries 1.1
and 1.2 from Hartman (1982), can be found in Feichtinger and Hartl
(1986).
subject to
ẋ(t) = f (x(t), u(t)), x(0) given,
subject to
ẋ(t) = −x(t) + u(t), x(0) = x0 , (D.75)
u(t) ∈ [−1, +1], t ≥ 0.
Let us first solve this problem for x₀ < 0. We form the Hamiltonian
H = xu + \lambda(-x + u), (D.76)
with
\dot{\lambda}(t) = (1 + \rho)\lambda(t) - u(t). (D.77)
Since H is linear in u, the optimal policy is
For x0 < 0, the state equation reveals that u∗ (t) = −1 will give the
largest decrease of x(t) and keep x(t) < 0, t ≥ 0. Thus, it will maximize
the product x(t)u(t) for each t > 0. We also note that the long-run
stationary equilibrium in this case is (x̄, ū, λ̄) = (−1, −1, −1/(1 + ρ)).
It is also easy to verify that the solution u∗ (t) = −1, x∗ (t) = −1 +
e−t (x0 + 1), and λ(t) = −1/(1 + ρ), t ≥ 0, satisfies (D.75), (D.77) along
with the sufficiency transversality condition (3.99), and maximizes the
Hamiltonian in (D.76).
Similarly, we can argue that for x₀ > 0, the optimal solution is u*(t) = +1, x*(t) = 1 + e^{-t}(x₀ − 1) > 0, and λ(t) = 1/(1 + ρ), t ≥ 0. The long-run stationary equilibrium in this case is (x̄, ū, λ̄) = (1, 1, 1/(1 + ρ)).
Then by symmetry, we can conclude that if x0 = 0, both u∗ (t) = −1
and u∗ (t) = +1, t ≥ 0, yield the same objective function, and hence
both are optimal. Thus, x0 = 0 is a Sethi-Skiba point for this example.
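Because both candidate policies and their state trajectories are available in closed form, the indifference at x0 = 0 is easy to check numerically. The sketch below assumes the discounted objective introduced above; the value ρ = 0.1 is illustrative.

```python
import numpy as np

# Numerical check that x0 = 0 is the Sethi-Skiba (indifference) point
# for max ∫ e^{-ρt} x u dt with xdot = -x + u, u in [-1, +1].
rho = 0.1

def payoff(x0, u, T=200.0, dt=1e-3):
    """Discounted payoff of the constant policy u from initial state x0."""
    t = np.arange(0.0, T, dt)
    x = u + np.exp(-t) * (x0 - u)          # solves xdot = -x + u
    return np.sum(np.exp(-rho * t) * x * u) * dt

for x0 in (-0.5, 0.0, 0.5):
    jm, jp = payoff(x0, -1.0), payoff(x0, +1.0)
    print(f"x0={x0:+.1f}:  J(u=-1)={jm:.4f}  J(u=+1)={jp:.4f}")
# At x0 = 0 the two payoffs coincide (both equal 1/rho - 1/(rho+1));
# below 0 the policy u = -1 wins, above 0 the policy u = +1 wins.
```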
At a Sethi-Skiba point, the decision maker is thus indifferent and may
select either control from the set of possible optimal controls. This may have im-
portant implications. In a model of controlling illicit drugs, Grass et al.
(2008) derive a Sethi-Skiba point, signifying a critical number of ad-
dicts, such that if there are fewer addicts than the critical number, it
is optimal to use an eradication strategy with massive treatment
spending that drives the number of addicts down to zero. On the other
hand, if there are more than the critical number of addicts, then it is
optimal to use an accommodation strategy that uses a moderate level
of treatment spending that balances the social cost of drug use and the
cost of treatment.
This is a case of a classic Sethi-Skiba point acting as a “tipping point”
between the two strikingly different equilibria, one of which may be more
socially or politically favored than the other, and the social planner can
use an optimal control to move to the more favored equilibrium.
We conclude this subsection by mentioning that the Sethi-Skiba
points are exhibited in the production management context by Fe-
ichtinger and Steindl (2006) and Moser et al. (2014), in the open-source
software context by Caulkins et al. (2013a), and in other contexts by
Caulkins et al. (2011, 2013b, 2015a).
and let x(t, y) be a one-dimensional state variable. Let u(t, y) denote a
control at (t, y) and let the state equation be
∂x/∂t = g(t, y, x, ∂x/∂y, u) (D.79)
for t ∈ [0, T ] and y ∈ [0, h]. We denote the region [0, T ] × [0, h] by D,
and we let its boundary ∂D be split into two parts Γ1 and Γ2 as shown
in Fig. D.2. The initial conditions will be stated on the part Γ1 of the
boundary ∂D as
x(0, y) = x0 (y) (D.80)
and
x(t, 0) = v(t). (D.81)
In Fig. D.2, (D.80) is the initial condition on the vertical portion of Γ1 ,
whereas (D.81) is that on the horizontal portion of Γ1 . More specifically,
in (D.80) the function x0 (y) gives the starting distribution of x with
respect to the spatial coordinate y. The function v(t) in (D.81) is an
exogenous breeding function of x at time t when y = 0, which, in the
cattle ranching model mentioned above, measures the number of newly
born calves at time t. To be consistent we make the obvious assumption
that
x(0, 0) = x0 (0) = v(0). (D.82)
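For readers who want to experiment, the state dynamics (D.79) with the two-sided initial data (D.80)-(D.82) can be integrated by finite differences. In the sketch below the growth law g, the initial distribution x0(y), and the breeding function v(t) are illustrative assumptions, not the book's cattle ranching specification; the transport term −∂x/∂y makes cohorts age along y.

```python
import numpy as np

# First-order upwind discretization of dx/dt = g(t, y, x, dx/dy, u)
# on D = [0, T] x [0, h], with assumed g = -x_y + 0.1*x + u.
T, h, nt, ny = 1.0, 1.0, 200, 50
dt, dy = T / nt, h / ny                       # dt/dy = 0.25 satisfies CFL
x = np.zeros((nt + 1, ny + 1))
x[0, :] = 1.0 + 0.5 * np.linspace(0.0, h, ny + 1)   # x(0, y) = x0(y), (D.80)
v = lambda t: 1.0                                    # x(t, 0) = v(t), (D.81)
# Note x0(0) = 1 = v(0), so the consistency requirement (D.82) holds.
u = np.zeros((nt, ny + 1))                           # a fixed control surface

for k in range(nt):
    t = k * dt
    x[k + 1, 0] = v(t + dt)                          # boundary data on Gamma_1
    x_y = (x[k, 1:] - x[k, :-1]) / dy                # upwind dx/dy
    g = -x_y + 0.1 * x[k, 1:] + u[k, 1:]
    x[k + 1, 1:] = x[k, 1:] + dt * g
print(x[-1, -1])   # x(T, h): the cohort reaching the terminal corner
```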
Let F (t, y, x, u) denote the profit rate when x(t, y) = x and u(t, y) =
u at a point (t, y) in D. Let Q(t) be the price of one unit of x(t, h) at
time t and let S(y) be the salvage value of one unit of x(T, y) at time T.
Then the objective function is:
max_{u(t,y)∈Ω} J = ∫_0^T ∫_0^h F(t, y, x(t, y), u(t, y)) dy dt + ∫_0^T Q(t)x(t, h) dt + ∫_0^h S(y)x(T, y) dy, (D.83)
and
λ(T, y) = S(y). (D.87)
Once again we need a consistency requirement similar to (D.82). It is
S(h) = Q(T),
which expresses the requirement that the price and the salvage value of
a unit of x(T, h) must agree.
We let u*(t, y) denote the optimal control at (t, y). Then the dis-
tributed parameter maximum principle requires that u*(t, y) maximize
the corresponding Hamiltonian at each point (t, y) of D.
E[v_t v_τ] = r δ_{tτ},
x̂_{t+1} − x̂_t = a x̂_t + (P_{t+1} h/r)(y_{t+1} − h(a + 1)x̂_t), x̂_0 = μ,
and
P_{t+1} = r[(a + 1)² P_t + q]/(r + h²[(a + 1)² P_t + q]), P_0 = rΣ_0/(r + Σ_0 h²).
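These recursions can be implemented directly. The sketch below assumes the underlying model x_{t+1} = (a + 1)x_t + w_t with Var(w_t) = q and y_t = h x_t + v_t with Var(v_t) = r, which is consistent with the stated filter; the parameter values are illustrative.

```python
import numpy as np

# Discrete Kalman filter implementing the recursions above.
rng = np.random.default_rng(1)
a, h, q, r, mu, Sigma0, n = -0.2, 1.0, 0.04, 0.25, 0.0, 1.0, 50

x = rng.normal(mu, np.sqrt(Sigma0))                 # true initial state
x_hat = mu                                          # x̂_0 = mu
P = r * Sigma0 / (r + Sigma0 * h**2)                # P_0 as stated
for t in range(n):
    x = (a + 1) * x + rng.normal(0.0, np.sqrt(q))   # state transition
    y = h * x + rng.normal(0.0, np.sqrt(r))         # noisy observation
    M = (a + 1)**2 * P + q                          # one-step predicted variance
    P = r * M / (r + h**2 * M)                      # variance recursion for P_{t+1}
    # estimate update with gain P_{t+1} h / r, exactly as displayed:
    x_hat += a * x_hat + (P * h / r) * (y - h * (a + 1) * x_hat)
print(f"true state {x:+.3f}, filter estimate {x_hat:+.3f}, P = {P:.4f}")
```

The gain P_{t+1}h/r equals the textbook Kalman gain Mh/(h²M + r), which is why the single update line reproduces the predict-then-correct cycle.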
where Z and ξ are standard Brownian motions, q and σ are positive con-
stants, and X0 is a Gaussian random variable with mean 0 and variance
Σ0 . Show that the Kalman-Bucy filter is given by
dX̂_t = (P(t)/r)(dY_t − X̂_t dt), X̂_0 = 0,
and
P(t) = √(rq) · (1 + b e^{−2αt})/(1 − b e^{−2αt}),
where
α = √(q/r) and b = (Σ_0 − √(rq))/(Σ_0 + √(rq)).
Hint: In solving the Riccati equation for P(t), you will need the formula
∫ du/(u² − a²) = (1/(2a)) ln|(u − a)/(u + a)|.
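As a check on the closed form, one can integrate the filter Riccati equation numerically. The sketch below assumes that P(t) satisfies Ṗ = q − P²/r with P(0) = Σ_0, an equation consistent with the stated α, b, and the steady state P(∞) = √(rq); the parameter values are illustrative.

```python
import numpy as np

# Euler integration of Pdot = q - P**2 / r versus the closed form.
q, r, Sigma0, dt, T = 0.4, 0.9, 2.0, 1e-4, 5.0
alpha = np.sqrt(q / r)
b = (Sigma0 - np.sqrt(r * q)) / (Sigma0 + np.sqrt(r * q))

P = Sigma0
for _ in range(int(T / dt)):
    P += (q - P ** 2 / r) * dt

E = b * np.exp(-2 * alpha * T)
P_exact = np.sqrt(r * q) * (1 + E) / (1 - E)
print(P, P_exact)   # the two agree up to discretization error
```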
E D.3 Let w(u) = u in Exercise 7.39. Analyze the various cases that
may arise in this problem from the viewpoint of obtaining the Sethi-Skiba
points.
Answers to Selected
Exercises
Chapter 3
3.1 x − u1 ≥ 0, u1 − u2 ≥ 0, u1 ≥ 0, 1 + u2 ≥ 0.
λ̇ = −(α̇/α)λ − ∂L/∂x, μ ≥ 0, μg = 0.
3.13 (a) λ(t) = 10[1 − e^{0.1(t−100)}],
μ = 0 if K = 300, and μ = −10[1 − e^{0.1(K/3−100)}] if K < 300.
3.17 λ(t) = t − 1.
3.23 11.87 min.
3.25 u∗ = −1, T ∗ = 5.
3.26 u∗ = −2, T ∗ = 5/2.
3.43 (a) {Ī, P̄, λ̄} = {I1 − ρ(S − P1), S, 2(S − P1)}.
(b) I = I1 .
Chapter 4
4.2 u∗ (t) = −1, μ1 = −λ = 1/2 − t, μ2 = η = 0.
4.3 One solution appears in Fig. 3.1. Another solution is u(t) = 1/2
for t ∈ [0, 2]. There are many others.
4.5 (a) u∗ = 0.
(c) u* = 1 for 0 ≤ t ≤ 1 − T, and u* = 0 for 1 − T < t ≤ T.
Chapter 5
5.4 (b) f(t*) = t* − 10 ln(1 − 0.3e^{0.1t*}).
(c) t∗ = 1.969327, J(t∗ ) = 19.037.
J ∗ = 34,420.
Chapter 6
6.12 J ∗ = 10.56653.
6.14 u*(t) = 0 for 0 ≤ t ≤ 7/3; u*(t) = 2 for 7/3 < t < 3; u*(t) = −1 for 3 ≤ t < 13/3; u*(t) = 0 for 13/3 ≤ t ≤ 6.
6.15 μ1 = −(5/2)t + 5/2 for t ∈ [0, 1], and μ1 = 0 for t ∈ (1, 3].
μ2 = 0 for t ∈ [0, 1.8), and μ2 = −(1/2)t + 3/2 for t ∈ [1.8, 3].
η = 0 for t ∈ [0, 1) ∪ (1.8, 3], and η = −(5/2)t + 5/2 for t ∈ [1, 1.8).
6.16 (a) v*(t) = −1 for t ∈ [0, 1.8), and v*(t) = 1 for t ∈ (1.8, 3].
(b) v ∗ (t) = 1 for t ∈ [0, 10].
6.18 v*(t) = −1 for t ∈ [0, 1/2]; 0 for t ∈ (1/2, 23/12]; +1 for t ∈ (23/12, 29/12]; 0 for t ∈ (29/12, 4].
6.19 u*(t) = 0 for 0 ≤ t ≤ t1, and u*(t) = h(t − t1)/c for t1 < t ≤ T, where t1 = T − √(2BC/h).
Chapter 7
7.20 (b)
t1 = (1/(rQ + δ)) ln(x0/x^s), t2 = (1/(rQ + δ)) ln((x̄ − x^s)/(x̄ − x^T)).
7.21
T ≥ (1/(rQ + δ)) ln[(rQ(1 − x0) − δx0)/(rQ(1 − x^s) − δx^s)] + (1/δ) ln(x^s/x^T).
7.28 imp(A, B; t) = −(1/r) ln[(1 − B)/(1 − A)].
Chapter 8
8.1 (a) y = 1, z = 3.
(b) y = 2, z = 10.
Chapter 9
9.4 T = ts = 2.47.
9.5 ts = 0, T = 30.
9.7 u*(t) = sat[0, 1; u0(t)], where u0(t) = [2 − e^{0.05(t−34.8)}]²/(1 + t),
t1 ≈ 3; t2 − T = 34.8.
Chapter 10
10.4 x̄ = 0.734.
10.5 (a)
x̄ = (X/4)[1 − ρ/r + c/(Xp) + √((1 − ρ/r + c/(Xp))² + 8cρ/(prX))].
Chapter 11
Chapter 12
12.5 q*(x) = (α − r)/((1 − β)σ²), c*(x) = [1/(1 − β)][ρ − rβ − γβ/(1 − β)]x, and
V(x) = [(1 − β)/(ρ − rβ − γβ/(1 − β))]^{1−β} x^β, x ≥ 0.
Chapter 13
(ẋ, λ̇^F, λ̇^L, μ̇)^T = A(x, λ^F, λ^L, μ)^T + (2, 0, 0, 0)^T, where A = [0, 1, 1, 0; 1, 0, 0, 0; 1, 0, 0, −1; 0, 0, −1, 0].
Alam M, Sarma VVS (1974) Optimal maintenance policy for equipment subject
to deterioration and random failure. IEEE Trans Syst Man Cybern SMC-
4:172–175
Alam M, Lynn JW, Sarma VVS (1976) Optimal maintenance policy for equip-
ment subject to random deterioration and random failure. Int J Syst Sci
7:1071–1080
Arora SR, Lele PT (1970) A note on optimal maintenance policy and sale date
of a machine. Manag Sci 17:170–173
Arrow KJ, Chang S (1980) Optimal pricing, use, and exploration of uncertain
natural resource stocks. In: Liu PT (ed) Dynamic optimization in mathe-
matical economics. Plenum Press, New York
Arrow KJ, Kurz M (1970) Public investment, the rate of return, and optimal
fiscal policy. The Johns Hopkins Press, Baltimore
Arrow KJ, Bensoussan A, Feng Q, Sethi SP (2007) Optimal savings and the
value of population. Proc Natl Acad Sci 104(47):18421–18426
Arrow KJ, Bensoussan A, Feng Q, Sethi SP (2010) The genuine savings cri-
terion and the value of population in an economy with endogenous fertility
rate. In: Boucekkine R, Hritonenko N, Yatsenko Y (eds) Optimal control
of age-structured population in economy, demography, and the environment.
Routledge explorations in environmental economics. Routledge, New York,
pp 20–44
Arthur WB, McNicoll G (1977) Optimal time paths with age dependence: a
theory of population policy. Rev Econ Stud 44:111–123
Aubin J-P, Cellina A (1984) Differential inclusions: set-valued maps and via-
bility theory. Springer, Berlin
Basar T (1986) A tutorial on dynamic and differential games. In: Basar T (ed)
Dynamic games and applications in economics. Springer, Berlin, pp 1–25
Bass FM (1969) A new product growth model for consumer durables. Manag
Sci 15(5):215–227
Bean JC, Smith RL (1984) Conditions for the existence of planning horizons.
Math Oper Res 9(3):391–401
Bell DJ, Jacobson DH (1975) Singular optimal control. Academic Press, New
York
Bensoussan A, Sethi SP (2007) The machine maintenance and sale age model
of Kamien and Schwartz revisited. Manag Sci 53(12):1964–1976
Bhaskaran S, Sethi SP (1981) Planning horizons for the wheat trading model.
In: Proceedings of AMS 81 conference, 5: life, men, and societies, pp 197–201
Bhaskaran S, Sethi SP (1988) The dynamic lot size model with stochastic de-
mands: a planning horizon study. Inf Syst Oper Res 26(3):213–224
Blaquière A (1985) Impulsive optimal control with finite or infinite time horizon.
J Optim Theory Appl 46:431–439
Brito DL, Oakland WH (1977) Some properties of the optimal income tax. Int
Econ Rev 18:407–423
Bryant GF, Mayne DQ (1974) The maximum principle. Int J Control 20:1021–
1054
Bultez AV, Naert PA (1979) Does lag structure really matter in optimizing
advertising spending. Manag Sci 25(5):454–465
Bultez AV, Naert PA (1988) When does lag structure really matter...indeed?
Manag Sci 34(7):909–916
Burdet CA, Sethi SP (1976) On the maximum principle for a class of discrete
dynamical system with lags. J Optim Theory Appl 19:445–454
Canon MD, Cullum CD, Polak E (1970) Theory of optimal control and math-
ematical programming. McGraw-Hill, New York
Carlson DA (1986b) The existence of finitely optimal solutions for infinite hori-
zon optimal control problems. J Optim Theory Appl 51(1):41–62
Carlson DA (1993) Nonconvex and relaxed infinite horizon optimal control prob-
lems. J Optim Theory Appl 78(3):465–491
Carlson DA, Haurie A (1987a) Infinite horizon optimal control theory and ap-
plications. Lecture Notes in Economics and Mathematical Systems, vol 290.
Springer, New York
Carlson DA, Haurie A (1987b) Optimization with unbounded time interval for
a class of non linear systems. Springer, Berlin
Carlson DA, Haurie A (1995) A turnpike theory for infinite horizon open-loop
differential games with decoupled dynamics, In: New trends in dynamic
games and applications, annals of the international society of dynamic games,
vol 3. Birkhäuser, Boston, pp 353–376
Carlson DA, Haurie A (1996) A turnpike theory for infinite horizon competitive
processes. SIAM J Control Optim 34(4):1405–1419
Carraro C, Filar J (eds) (1995) Control and game theoretic models of the envi-
ronment. Birkhäuser, Boston
Case JH (1979) Economics and the competitive process. New York University
Press, New York
Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Seidl A (2011) Op-
timal pricing of a conspicuous product in a recession that freezes capital
markets. J Econ Dyn Control 35(1):163–174
Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Seidl A (2013a) When
to make proprietary software open source. J Econ Dyn Control 37(6):1182–
1194
Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Novak AJ, Seidl A
(2013b) Leading bureaucracies to the tipping point: an alternative model of
multiple stable equilibrium levels of corruption. Eur J Oper Res 225(3):541–
546
Caulkins JP, Feichtinger G, Hartl RF, Kort PM, Novak AJ, Seidl A (2013c)
Multiple equilibria and indifference-threshold points in a rational addiction
model. Central Eur J Oper Res 21(3):507–522
Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Novak AJ, Seidl A,
Wirl F (2014) A dynamic analysis of Schelling’s binary corruption model: a
competitive equilibrium approach. J Optim Theory Appl 161(2):608–625
Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Seidl A (2015a) Skiba
points in free end-time problems. J Econ Dyn Control 51:404–419
Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Seidl A (2017) In-
teraction of pricing, advertising and experience quality: a dynamic analysis.
Eur J Oper Res 256(3):877–885
Chand S, Sethi SP (1983) Finite production rate inventory models with first
and second shift setups. Naval Res Logist Quart 30:401–414
Chand S, Sethi SP (1990) A dynamic lot size model with learning in setups.
Oper Res 38(4):644–655
Chand S, Sethi SP, Sorger G (1992) Forecast horizons in the discounted dynamic
lot size model. Manag Sci 38(7):1034–1048
Chand S, Hsu VN, Sethi SP (2002) Forecast, solution and rolling horizons in
operations management problems: a classified bibliography. Manufact Service
Oper Manag 4(1):25–43
Clarke FH, Darrough MN, Heineke JM (1982) Optimal pricing policy in the
presence of experience effects. J Bus 55:517–530
Cohen KJ, Cyert RM (1965) Theory of the firm: resource allocation in a market
economy. Prentice-Hall, Englewood Cliffs
Conrad K (1985) Quality, advertising and the formation of goodwill under dy-
namic conditions. In: Feichtinger G (ed) Optimal control theory and eco-
nomic analysis, vol 2. North-Holland, Amsterdam, pp 215–234
Dantzig GB, Sethi SP (1981) Linear optimal control problems and generalized
linear programs. J Oper Res Soc 32:467–476
Davis BE (1970) Investment and rate of return for the regulated firm. Bell J
Econ Manag Sci 1:245–270
Davis MHA (1993) Markov models and optimization. Chapman & Hall, New
York
Deger S, Sen SK (1984) Optimal control and differential game models of military
expenditure in less developed countries. J Econ Dyn Control 7:153–169
Derzko NA, Sethi SP, Thompson GL (1980) Distributed parameter systems ap-
proach to the optimal cattle ranching problem. Optim Control Appl Methods
1:3–10
Derzko NA, Sethi SP, Thompson GL (1984) Necessary and sufficient conditions
for optimal control of quasilinear partial differential systems. J Optim Theory
Appl 43:89–101
Dockner EJ, Sorger G (1996) Existence and properties of equilibria for a dy-
namic game on productive assets. J Econ Theory 71:209–227
Dockner EJ, Long NV, Sorger G (1996) Analysis of Nash equilibria in a class
of capital accumulation games. J Econ Dyn Control 20:1209–1235
Dolan RJ, Jeuland AP (1981) Experience curves and dynamic demand models:
implications of optimal pricing strategies. J Market 45:52–73
Dolan RJ, Muller E (1986) Models of new product diffusion: extension to com-
petition against existing and potential firms over time. In: Mahajan V, Wind
Y (eds) Innovation diffusion models of new product acceptance. Ballinger,
Cambridge, pp 117–150
Elliott RJ, Aggoun L, Moore JB (1995) Hidden Markov models: estimation and
control. Springer, New York
Feichtinger G (ed) (1982e) Optimal control theory and economic analysis. In:
First Viennese workshop on economic applications of control theory, Vienna,
28–30 October 1981. North-Holland, Amsterdam
Feichtinger G (ed) (1988) Optimal control theory and economic analysis, vol 3.
North-Holland, Amsterdam
Feichtinger G, Hartl RF, Kort PM, Novak AJ (2001) Terrorism control in the
tourism industry. J Optim Theory Appl 108:283–296
Ferreira MMA, Vinter RB (1994) When is the maximum principle for state
constrained problems nondegenerate? J Math Anal Appl 187:438–467
Ferreyra G (1990) The optimal control problem for the Vidale-Wolfe advertising
model revisited. Optim Control Appl Methods 11:363–368
Fleming WH, Soner HM (1992) Controlled Markov processes and viscosity so-
lutions. Springer, New York
Frankena JF (1975) Optimal control problems with delay, the maximum prin-
ciple and necessary conditions. J Eng Math 9:53–64
Friedman A (1964) Optimal control for hereditary processes. Arch Ration Mech
Anal 15:396–416
Fuller D, Vickson RG (1987) The optimal construction of new plants for oil
from the Alberta tar sands. Oper Res 35(5):704–715
Gaimon C (1986c) The optimal acquisition of new technology and its impact on
dynamic pricing policies. In: Lev B (ed) Production management: methods
and studies. Studies in management science and systems, vol 13. Elsevier,
Amsterdam, pp 187–206
Gaimon C, Burgess R (2003) Analysis of lead time and learning for capacity
expansions. Prod Oper Manag 12(1):128–140
Gaimon C, Singhal V (1992) Flexibility and the choice of facilities under short
product life cycles. Eur J Oper Res 60(2):211–223
Grimm W, Well KH, Oberle HJ (1986) Periodic control for Minimum-fuel air-
craft trajectories. J Guid 9:169–174
Halkin H (1966) A maximum principle of the Pontryagin type for systems de-
scribed by nonlinear difference equations. SIAM J Control 4:90–111
Hanssens DM, Parsons LJ, Schultz RL (1990) Market response models: econo-
metric and time series analysis. Kluwer Academic Publishers, Boston
Harris FW (1913) How many parts to make at once. Factory Mag Manag
10:135–136, 152
Harris H (1976) Optimal planning under transaction costs: the demand for
money and other assets. J Econ Theory 12:298–314
Hartl RF (1989b) Most rapid approach paths in dynamic economic problems. In:
Kleinschmidt P et al (eds) Methods of operations research 58 (Proceedings
of SOR 12, 1987). Athenäum, pp 397–410
Hartl RF, Feichtinger G (1987) A new sufficient condition for most rapid ap-
proach paths. J Optim Theory Appl 54(2):403–411
Hartl RF, Kort PM (1997) Optimal input substitution of a firm facing an en-
vironmental constraint. Eur J Oper Res 99:336–352
Hartl RF, Kort PM (2005) Advertising directed towards existing and new cus-
tomers. In: Deissenberg C, Hartl RF (eds) Optimal control and dynamic
games. Springer, Dordrecht, pp 3–18
Hartl RF, Krauth J (1989) Optimal production mix. J Optim Theory Appl
66:255–273
Hartl RF, Mehlmann A (1984) Optimal seducing policies for dynamic continu-
ous lovers under risk of being killed by a rival. Cybern Syst Int J 15:119–126
Hartl RF, Sethi SP (1983) A note on the free terminal time transversality
condition. Z Oper Res Ser Theory 27(5):203–208
Hartl RF, Sethi SP (1984a) Optimal control problems with differential inclu-
sions: sufficiency conditions and an application to a production-inventory
model. Optimal Control Appl Methods 5(4):289–307
Hartl RF, Sethi SP (1984b) Optimal control of a class of systems with contin-
uous lags: dynamic programming approach and economic interpretations. J
Optim Theory Appl 43(1):73–88
Hartl RF, Sethi SP (1985a) Solution of generalized linear optimal control prob-
lems using a simplex-like method in continuous-time I: theory. In: Feichtinger
G (ed) Optimal control theory and economic analysis, vol 2. North-Holland,
Amsterdam, pp 45–62
Hartl RF, Sethi SP (1985b) Solution of generalized linear optimal control prob-
lems using a simplex-like method in continuous-time II: examples. In Fe-
ichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-
Holland, Amsterdam, pp 63–87
Hartl RF, Sethi SP, Vickson RG (1995) A survey of the maximum principles for
optimal control problems with state constraints. SIAM Rev 37(2):181–218
Hartl RF, Kort PM, Novak AJ (1999) Optimal investment facing possible ac-
cidents. Ann Oper Res 88:99–117
Hartl RF, Kort PM, Feichtinger G (2003a) Offense control taking into account
heterogeneity of age. J Optim Theory Appl 116:591–620
Hartl RF, Novak AJ, Rao AG, Sethi SP (2003b) Optimal pricing of a product
diffusing in rich and poor populations. J Optim Theory Appl 117(2):349–375
Hartl RF, Kort PM, Feichtinger G, Wirl F (2004) Multiple equilibria and thresh-
olds due to relative investment costs. J Optim Theory Appl 123:49–82
Hartl RF, Novak AJ, Rao AG, Sethi SP (2005) Dynamic pricing of a status
symbol, invited talks from the fourth world congress of nonlinear analysts
(WCNA 2004) Orlando, FL, June 30–July 07, 2004. Nonlinear Anal Theory
Methods Appl 63:e2301–e2314
Hartman R (1982) Ordinary differential equations. Birkhäuser, Boston
Haruvy E, Prasad A, Sethi SP (2003) Harvesting altruism in open source soft-
ware development. J Optim Theory Appl 118(2):381–416
Haruvy E, Prasad A, Sethi SP, Zhang R (2005) Optimal firm contributions to
open source software. In: Deissenberg C, Hartl RF (eds) Optimal control and
dynamic games, applications in finance, management science and economics.
Springer, Dordrecht, pp 197–212
Haruvy E, Prasad A, Sethi SP, Zhang R (2008a) Competition with open source
as a public good. J Ind Manag Optim 4(1):199–211
Haruvy E, Sethi SP, Zhou J (2008b) Open source development with a commer-
cial complementary product or service. Prod Oper Manag (Special Issue on
Management of Technology) 17(1):29–43
Harvey AC (1994) Forecasting, structural time series models and the Kalman
filter. Cambridge University Press, New York
Haunschmied JL, Kort PM, Hartl RF, Feichtinger G (2003) A DNS-curve in
a two-state capital accumulation model: a numerical analysis. J Econ Dyn
Control 27:701–716
Haurie A (1976) Optimal control on an infinite time horizon: the turnpike
approach. J Math Econ 3:81–102
Haurie A, Hung NM (1977) Turnpike properties for the optimal use of a natural
resource. Rev Econ Stud 44:329–336
Haurie A, Leitmann G (1984) On the global asymptotic stability of equilibrium
solutions for open-loop differential games. Large Scale Syst 6:107–122
Haurie A, Sethi SP (1984) Decision and forecast horizons, agreeable plans, and
the maximum principle for infinite horizon control problems. Oper Res Lett
3(5):261–265
Heal GM (1976) The relationship between price and extraction cost for a re-
source with a backstop technology. Bell J Econ 7:371–378
Heal GM (1993) The optimal use of exhaustible resources. In: Kneese AV,
Sweeney JL (eds) Handbook of natural resource and energy economics, vol 3,
chap 18. Elsevier, London, pp 855–880
Heaps T (1984) The forestry maximum principle. J Econ Dyn Control 7:131–151
Holt CC, Modigliani F, Muth JF, Simon HA (1960) Planning production, in-
ventories and workforce. Prentice-Hall, Englewood Cliffs
Hwang CL, Fan LT, Erickson LE (1967) Optimal production planning by the
maximum principle. Manag Sci 13:750–755
Intriligator MD, Smith BLR (1966) Some aspects of the allocation of scientific
effort between teaching and research. Am Econ Rev 61:494–507
Isaacs R (1969) Differential games: their scope, nature, and future. J Optim
Theory Appl 3:283–295
Jacobson DH, Lele MM, Speyer JL (1971) New necessary conditions of opti-
mality for control problems with state-variable inequality constraints. J Math
Anal Appl 35:255–284
Jacquemin AP, Thisse J (1972) Strategy of the firm and market structure: an
application of optimal control theory. In: Cowling K (ed) Market structure
and corporate behavior. Gray-Mills, London, pp 61–84
Jørgensen S, Kort PM, Zaccour G (2009) Optimal pricing and advertising poli-
cies for an entertainment event. J Econ Dyn Control 33(3):583–596
Kalish S (1983) Monopolist pricing with dynamic demand and production cost.
Market Sci 2(2):135–159
Kalish S (1985) A new product adoption model with price, advertising, and
uncertainty. Manag Sci 31(12):1569–1585
Kalish S, Lilien GL (1983) Optimal price subsidy policy for accelerating the
diffusion of innovation. Market Sci 2(4):407–420
Kalish S, Sen SK (1986) Diffusion models and the marketing mix for single
products. In: Mahajan V, Wind Y (eds) Series in econometrics and manage-
ment science: innovation diffusion models of new products acceptance, vol
V. Ballinger, Cambridge, pp 87–116
Kalman RE, Bucy R (1961) New results in linear filtering and prediction theory.
Trans ASME Ser D J Basic Eng 83:95–108
Kamien MI, Schwartz NL (1971a) Optimal maintenance and sale age for a
machine subject to failure. Manag Sci 17:427–449
Kamien MI, Schwartz NL (1971b) Limit pricing and uncertain entry. Econo-
metrica 39:441–454
Kamien MI, Schwartz NL (1982b) The role of common property resources in op-
timal planning models with exhaustible resources. In: Smith VK, Krutilla JV
(eds) Explorations in natural resource economics. Johns Hopkins University
Press, Baltimore, pp 47–71
Kemp MC, Long NV (eds) (1980) Exhaustible resources, optimality, and trade.
North-Holland, Amsterdam
Kemp MC, Long NV (1977) Optimal control problems with integrands discon-
tinuous with respect to time. Econ Rec 53:405–420
Kirby BJ (ed) (1974) Optimal control theory and its applications. Lecture notes
in economics and mathematical systems, part I & II, vols 105–106. Springer,
Berlin
Kleindorfer PR, Lieber Z (1979) Algorithms and planning horizon results for
production planning problems with separable costs. Oper Res 27:874–887
Kort PM, Caulkins JP, Hartl RF, Feichtinger G (2006) Brand image and brand
dilution in the fashion industry. Automatica 42:1363–1370
Krouse CG (1972) Optimal financing and capital structure programs for the
firm. J Financ 27:1057–1071
Kumar S, Sethi SP (2009) Dynamic pricing and advertising for web content
providers. Eur J Oper Res 197:924–944
Kurcyusz S, Zowe J (1979) Regularity and stability for the mathematical pro-
gramming problem in Banach spaces. Appl Math Optim 5:49–62
Kydland FE, Prescott EC (1977) Rules rather than discretion: the inconsistency
of optimal plans. J Polit Econ 85:473–493
Lasdon LS, Mitter SK, Warren AD (1967) The conjugate gradient method for
optimal control problems. IEEE Trans Autom Control AC-12:132–138
Lee EB, Markus L (1968) Foundations of optimal control theory. Wiley, New
York
Lehoczky JP, Sethi SP, Soner HM, Taksar MI (1991) An asymptotic analysis of
hierarchical control of manufacturing systems under uncertainty. Math Oper
Res 16(3):596–608
Leitmann G (1981) The calculus of variations and optimal control. In: Miele A
(ed) Series mathematical concepts and methods in science and engineering.
Plenum Press, New York
Leland HE (1972) The dynamics of a revenue maximizing firm. Int Econ Rev
13:376–385
Lesourne J, Leban R (1982) Control theory and the dynamics of the firm: a
survey. OR-Spektr 4:1–14
Levine J, Thépot J (1982) Open loop and closed loop equilibria in a dynamic
duopoly. In: Feichtinger G (ed) Optimal control theory and economic analy-
sis. North-Holland, Amsterdam, pp 143–156
Lewis TR, Schmalensee R (1982) Optimal use of renewable resources with non-
convexities in production. In: Mirman LJ, Spulber PF (eds) Essays in the
economics of renewable resources. North-Holland, Amsterdam, pp 95–111
Lieber Z, Barnea A (1977) Dynamic optimal pricing to deter entry under con-
strained supply. Oper Res 25:696–705
Lintner J (1963) The cost of capital and optimal financing of corporate growth.
J Financ 23:292–310
Little JDC (1979) Aggregate advertising models: the state of the art. Oper Res
27(4):629–667
Liu PT, Roxin EO (eds) (1979) Differential games and control theory III. Marcel
Dekker, New York
Long NV, Sorger G (2006) Insecure property rights and growth: the role of
appropriation costs, wealth effects, and heterogeneity. Econ Theory 28:513–
529
Long NV, Vousden N (1977) Optimal control theorems. In: Pitchford JD,
Turnovsky SJ (eds) Applications of control theory in economic analysis.
North-Holland, Amsterdam, pp 11–34
Lundin RA, Morton TE (1975) Planning horizons for the dynamic lot size
model: protective procedures and computational results. Oper Res 23:711–
734
Luptacik M (1982) Optimal price and advertising policy under atomistic com-
petition. J Econ Dyn Control 4:57–71
Magat WA, McCann JM, Morey RC (1986) When does lag structure really
matter in optimizing advertising expenditures? Manag Sci 32(2):182–193
Magat WA, McCann JM, Morey RC (1988) Reply to when does lag structure
really matter ... Indeed? Manag Sci 34(7):917–918
Maurer H (1981) First and second order sufficient optimality conditions in math-
ematical programming and optimal control. Math. Program Stud 14:163–177
Mayne DQ, Polak E (1987) An exact penalty function algorithm for control
problems with state and control constraints. IEEE Trans Autom Control
32:380–387
Mehra RK, Davis RE (1972) Generalized gradient method for optimal control
problems with inequality constraint and singular arcs. IEEE Trans Autom
Control AC-17:69–79
Mond B, Hanson M (1968) Duality for control problems. SIAM J Control 6:114–
120
Nahorski Z, Ravn HF, Vidal RVV (1984) The discrete-time maximum principle:
a survey and some new results. Int J Control 40:533–554
Naik PA, Mantrala MK, Sawyer A (1998) Planning pulsing media schedules in
the presence of dynamic advertising quality. Market Sci 17(3):214–235
Näslund B (1979) Consumer behavior and optimal advertising. J Oper Res Soc
20:237–243
Neck R (1984) Stochastic control theory and operational research. Eur J Oper
Res 17:283–301
Oberle HJ, Grimm W (1989) BNDSCO: a program for the numerical solution of
optimal control problems. Report 515, Institute for flight systems dynamics,
German aerospace research establishment DLR, Oberpfaffenhofen, Germany
Parlar M (1983) Optimal forest fire control with limited reinforcements. Optimal
Control Appl Methods 4:185–191
Pesch HJ, Bulirsch R (1994) The maximum principle, Bellman’s equation, and
Canathéodory’s Work. J Optim Theory Appl 80(1):203–229
Pesch HJ, Plail M (2009) The maximum principle of optimal control: a story of
ingenious ideas and missed opportunities. Control Cybern 38 (4A):973–995
Pindyck RS (1982) Adjustment costs, uncertainty, and the behavior of the firm.
Am Econ Rev 72(3):415–427
Polyanin AD, Zaitsev VF (2003) Handbook of exact solutions for ordinary dif-
ferential equations, 2nd edn. Chapman & Hall/CRC, Boca Raton
Prasad A, Sethi SP, Naik P (2012) Understanding the impact of churn in dy-
namic oligopoly markets. Automatica 48:2882–2887
Rao RC (1985) A note on optimal and near optimal price and advertising strate-
gies. Manag Sci 31(3):376–377
Raviv A (1979) The design of an optimal insurance policy. Am Econ Rev 69:84–
96
Ravn HF (1999) Discrete time optimal control. PhD thesis, Technical University
of Denmark
Rempala R (1986) Horizon for the dynamic family of wheat trading problems.
In: Proceedings fourth international symposium on inventories, Budapest
Richard SF (1979) A generalized capital asset pricing model. TIMS Stud Manag
Sci 11:215–232
Russak B (1976) Relations among the multipliers for problems with bounded
state constraints. SIAM J Control Optim 14:1151–1155
Sarma VVS, Alam M (1975) Optimal maintenance policies for machines subject
to deterioration and intermittent breakdowns. IEEE Trans Syst Man Cybern
SMC-5:396–398
Sethi SP (1974a) Sufficient conditions for the optimal control of a class of sys-
tems with continuous lags. J Optim Theory Appl 13:542–552
Sethi SP (1977d) A linear bang-bang model of firm behavior and water quality.
IEEE Trans Autom Control AC-22:706–714
Sethi SP (1978b) Optimal equity financing model of Krouse and Lee: corrections
and extensions. J Financ Quant Anal 13(3):487–505
Sethi SP (1979c) Optimal advertising policy with the contagion model. J Optim
Theory Appl 29(4):615–627
Sethi SP (1996) When does the share price equal the present value of future
dividends? - a modified dividend approach. Econ Theory 8:307–319
Sethi SP (1997b) Some insights into near-optimal plans for stochastic man-
ufacturing systems. In: Yin G, Zhang Q (eds) Mathematics of stochastic
manufacturing systems. Lectures in applied mathematics, vol 33. American
Mathematical Society, Providence, pp 287–315
Sethi SP, Bass FM (2003) Optimal pricing in a hazard rate model of demand.
Optimal Control Appl Methods 24:183–196
Sethi SP, Chand S (1981) Multiple finite production rate dynamic lot size in-
ventory models. Oper Res 29(5):931–944
Sethi SP, Lee SC (1981) Optimal advertising for the Nerlove-Arrow Model under
a replenishable budget. Optimal Control Appl Methods 2(2):165–173
Sethi SP, McGuire TW (1977) Optimal skill mix: an application of the max-
imum principle for systems with retarded controls. J Optim Theory Appl
23:245–275
Sethi SP, Morton TE (1972) A mixed optimization technique for the generalized
machine replacement problem. Naval Res Logist Quart 19:471–481
Sethi SP, Sorger G (1991) A theory of rolling horizon decision making. Ann
Oper Res 29:387–416
Sethi SP, Staats PW (1978) Optimal control of some simple deterministic epi-
demic models. J Oper Res Soc 29(2):129–136
Sethi SP, Taksar MI (1988) Deterministic and stochastic control problems with
identical optimal cost functions. In: Bensoussan A, Lions JL (eds) Analy-
sis and optimization of systems. Lecture notes in control and information
sciences. Springer, New York, pp 641–645
Sethi SP, Thompson GL (1981b) A tutorial on optimal control theory. Inf Syst
Oper Res 19(4):279–291
Sethi SP, Zhang H (1999a) Average-cost optimal policies for an unreliable flex-
ible multiproduct machine. Int J Flex Manuf Syst 11:147–157
Sethi SP, Zhang Q (2004) Problem 4.3 feedback control in flowshops. In: Blon-
del VD, Megretski A (eds) Unsolved problems in mathematical systems and
control theory. Princeton University Press, Princeton, pp 140–143
Sethi SP, Suo W, Taksar MI, Zhang Q (1997a) Optimal production planning
in a stochastic manufacturing system with long-run average cost. J Optim
Theory Appl 92(1):161–188
Sethi SP, Suo W, Taksar MI, Yan H (1998a) Optimal production planning in a
multiproduct stochastic manufacturing systems with long-run average cost.
Discrete Event Dyn Syst 8(1):37–54
Sethi SP, Yan H, Zhang H, Zhang Q (2002) Optimal and hierarchical controls
in dynamic stochastic manufacturing systems: a survey. Manuf Serv Oper
Manag 4(2):133–170
Sethi SP, Yeh DHM, Zhang R, Jardine A (2008b) Optimal maintenance and
replacement of extraction machinery. J Syst Sci Syst Eng 17(4):416–431
Shell K (ed) (1967) Essays on the theory of optimal economic growth. The MIT
Press, Cambridge
Silva GN, Vinter RB (1997) Necessary conditions for optimal impulsive control
problems. SIAM J Control Optim 35(6):1829–1846
Singhal K, Singhal J (1986) A solution to the Holt et al. model for aggregate
production planning. OMEGA 14:502–505
Solow RM, Wan FY (1976) Extraction costs in the theory of exhaustible re-
sources. Bell J Econ 7:359–370
Spence M (1981) The learning curve and competition. Bell J Econ 12:49–70
Stiglitz JE, Dasgupta P (1982) Market structure and the resource depletion: a
contribution to the theory of intertemporal monopolistic competition. J Econ
Theory 28:128–164
Sweeney DJ, Abad PL, Dornoff RJ (1974) Finding an optimal dynamic adver-
tising policy. Int J Syst Sci 5(10):987–994
Sydsæter K (1978) Optimal control theory and economics: some critical remarks
on the literature. Scand J Econ 80:113–117
Tan KC, Bennett RJ (1984) Optimal control of spatial systems. George Allen
& Unwin, London
Tapiero CS (1981) Optimum product quality and advertising. Inf Syst Oper
Res 19(4):311–318
Tapiero CS, Farley JU (1975) Optimal control of sales force effort in time.
Manag Sci 21(9):976–985
Tapiero CS, Farley JU (1981) Using an uncertainty model to assess sales re-
sponse to advertising. Decis Sci 12:441–455
Tapiero CS, Venezia I (1979) A mean variance approach to the optimal machine
maintenance and replacement problem. J Oper Res Soci 30:457–466
Tapiero CS, Eliashberg J, Wind Y (1987) Risk behaviour and optimum ad-
vertising with a stochastic dynamic sales response. Optimal Control Appl
Methods 8(3):299–304
Teng JT, Thompson GL (1983) Oligopoly models for optimal advertising when
production costs obey a learning curve. Manag Sci 29(9):1087–1101
Teng JT, Thompson GL, Sethi SP (1984) Strong decision and forecast horizons
in a convex production planning problem. Optimal Control Appl Methods
5(4):319–330
Teo KL, Moore EJ (1977) Necessary conditions for optimality for control prob-
lems with time delays appearing in both state and control variables. J Optim
Theory Appl 23:413–427
Teo KL, Goh CJ, Wong KH (1991) A unified computational approach to optimal
control problems. Longman Scientific & Technical, Essex
Thompson GL, Teng JT (1984) Optimal pricing and advertising policies for
new product oligopoly models. Market Sci 3(2):148–168
Thompson GL, Sethi SP, Teng JT (1984) Strong planning and forecast horizons
for a model with simultaneous price and production decisions. Eur J Oper
Res 16:378–388
Treadway AB (1970) Adjustment costs and variable inputs in the theory of the
competitive firm. J Econ Theory 2:329–347
Uhler RS (1979) The rate of petroleum exploration and extraction. In: Pindyck
RS (ed) Advances in the economics of energy and resources, vol 2. JAI Press,
Greenwich, pp 93–118
Van Hilten O, Kort PM, Van Loon PJJM (1993) Dynamic policies of the firm:
an optimal control approach. Springer, New York
van Loon PJJM (1983) A dynamic theory of the firm: production, finance and
investment. Lecture notes in economics and mathematical systems, vol 218.
Springer, Berlin
Verheyen PA (1985) A dynamic theory of the firm and the reaction on govern-
mental policy. In: Feichtinger G (ed) Optimal control theory and economic
analysis, vol 2. North-Holland, Amsterdam, pp 313–329
Verheyen P (1992) The jump in models with irreversible investments. In: Fe-
ichtinger G (ed) Dynamic economic models and optimal control, vol 4. Else-
vier Science, Amsterdam, pp 75–89
Wagener FOO (2003) Skiba points and heteroclinic bifurcations, with applica-
tions to the shallow lake system. J Econ Dyn Control 27(9):1533–1561
Wagener FOO (2005) Structural analysis of optimal investment for firms with
non-concave production. J Econ Behav Org 57(4):474–489
Wagener FOO (2006) Skiba points for small discount rates. J Optim Theory
Appl 128(2):261–277
Wagner HM, Whitin TM (1958) Dynamic version of the economic lot size model.
Manag Sci 5:89–96
Welam UP (1982) Optimal and near optimal price and advertising strategies
for finite and infinite horizons. Manag Sci 28(11):1313–1327
Wickwire K (1977) Mathematical models for the control of pests and infectious
diseases: a survey. Theor Popul Biol 11:182–238
Wirl F (1984) Sensitivity analysis of OPEC pricing policies. OPEC Rev 8:321–
331
Wirl F (1985) Stable and volatile prices: an explanation by dynamic demand.
In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2.
North-Holland, Amsterdam, pp 263–277
Yang J, Yan H, Sethi SP (1999) Optimal production planning in pull flow lines
with multiple products. Eur J Oper Res 119(3):26–48
Young LC (1969) Calculus of variations and optimal control theory. W.B. Saun-
ders, Philadelphia
Zoltners AA (ed) (1982) Marketing planning models. TIMS studies in the man-
agement sciences, vol 18. North-Holland, Amsterdam
Gordon’s formula, 181 Hartl, R.F., x, 11, 32, 39, 70, 79,
Gould, J.P., 231, 257, 362, 499 85, 95, 104,
Grass, D., 11, 70, 109, 131, 132, 135–137,
122, 360, 458, 140, 141, 149,
460, 484, 485, 207, 213, 225,
491, 500, 545 281, 285, 351,
Green’s theorem, 225, 237, 360, 361, 458,
239, 245, 254, 460, 463, 464,
257, 314, 344, 346, 363, 391 484, 485, 490, 491,
Grienauer, W., 495 493, 495, 497,
Grimm, W., 500, 523 499–505, 512,
Gross, M., 500 516, 519, 535, 537, 545
Gruber, M., 159, 492 Hartman, R., 458, 504
Gruver, W.A., 512 Haruvy, E., 235, 253, 504, 545
Gutierrez, G.J., 404, 505 Harvey, A.C., 504
Haunschmied, J.L., 360, 484,
H 495, 504
Hämäläinen, R.P., 392, 500 Haurie, A., 11, 106, 213,
Hadley, G., 11, 70, 335, 500 385, 392, 463,
Hahn, M., 500 479, 483, 500, 504, 505
Halkin, H., 32, 276, 500 Haussmann, U.G., 505
Hämäläinen, R.P., 392 He, X., 396, 404, 505, 535
Hamilton, 10
Heal, G.M., 324, 487, 505
Hamiltonian, 35, 41, 73,
Heaps, T., 283, 505
271, 337, 437
Heckman, J., 505
Hamiltonian maximizing
Heineke, J.M., 487
condition, 36, 39, 74, 97,
Hestenes, M.R., 10, 70, 505
99, 118, 437
Hamilton-Jacobi-Bellman (HJB) HJB equation, 36, 376
equation, 32, 36, 366, 368, HMMS model, 191
371, 393 Ho, Y.-C., 39, 113, 141,
Hamilton-Jacobi equation, 385, 388, 442,
371, 394 450, 453, 481, 505, 523, 537
Han, M., 500 Hochman, E., 383, 527
Hanson, M., 521 Hofbauer, J., 505, 506
Hanssens, D.M., 500 Hoffmann, K.H., 506
Harris, F.W., 191, 500 Hohn, F., 213, 520
Harris, H., 360, 500 Holly, S., 506
Harrison, J.M., 379, 501 Holt, C.C., 191, 200, 202, 506
Hartberger, R.J., 32, 436, 491, 501 Holtzman, J.M., 276, 506
Mitchell, A., 521
Mitra, T., 517
Mitter, S.K., 477, 523
Mittnik, S., 490
Mixed constraints, 69, 71, 79
Mixed inequality constraints, 3, 69, 70
Mixed optimization technique, 302
Modeling tricks, 111
Modigliani, F., 166, 191, 200, 202, 213, 506, 520
Moiseev, N.N., 521
Monahan, G.E., 521
Mond, B., 521
Mookerjee, V.S., 508
Moore, E.J., 540
Moore, J.B., 383, 442
Morey, R.C., 517
Mortimort, D., 352, 513
Morton, A., 498
Morton, T.E., xiv, 213, 297, 302, 309, 516, 521, 532
Moser, E., 351, 460, 521
Moskowitz, H., 485
Motta, M., 521
Muller, E., 490, 515, 517, 521
Mulvey, J.M., 532
Munro, G.R., 311, 521
Murata, Y., 521
Murray, D.M., 277, 521
Muth, J.F., 191, 200, 202, 506
Muzicant, J., 521

N
Naert, P.A., 404, 477, 481
Nahorski, Z., 521
Naik, P.A., 404, 448, 521, 522, 526
Nash differential games, 387
Nash solutions, 385
Näslund, B., 11, 70, 113, 283, 323, 331, 477, 522
Natural resources, 311, 383
Necessary condition, 37, 39, 269
Neck, R., 522
Needle-shaped variation, 434, 435
Neighborhood, 16
Nelson, R.T., 360, 522
Nepomiastchy, P., 360, 522
Nerlove, M., 226, 228, 522
Nerlove-Arrow model, 110
Nerlove-Love advertising model, 226
Neuman, C.P., 226, 541
Neustadt, L.W., 10, 518, 522
Newton, 9
Nguyen, D., 522
Nishimura, K., 458, 488
Nissen, G., 477
Nonlinear programming, 259, 260, 268
Norm, 16, 17
Norström, C.J., xiv, 191, 208, 522
Norton, F.E., 448, 481
Notation, 11
Novak, A.J., 361, 460, 484, 485, 494, 495, 503, 504, 522

O
Oakland, W.H., 360, 480
Oberle, H.J., 500, 522, 523
Objective function, 2, 29, 438
Oettli, W., 481
Oğuztöreli, M.N., 523
Øksendal, B.K., 383
Øksendal, B.K., 480, 523
Okuguchi, K., 486
Olsder, G.J., 385, 396, 405, 475, 523
Nash solutions, 385 405, 475, 523
Strictly concave function, 21, 79
Strictly convex function, 79
Strong forecast horizon, 213, 216, 219
Strong maximum, 430
Subsidy rate, 397
Sufficiency conditions, 53, 54, 79, 136, 269
Sulem, A., 538
Summary of transversality conditions, 89
Suo, W., x, 526, 534, 535
Surveys of applications, 10
Sutinen, J.G., 488, 516
Swan, G.W., 343, 538
Sweeney, D.J., 473, 538
Sweeney, J.L., 480, 505, 521
Switching curves, 99
Switching point, 171, 175, 178
Switching time, 102
Sydsæter, K., 11, 32, 53, 70, 73, 77–79, 104, 136, 149, 335, 530, 538
Synthesis of optimal controls, 97, 170
System noise, 442
Szego, G.P., 535

T
Taboubi, S., 404, 476, 518
Takayama, A., 42, 335, 491, 538
Taksar, M.I., 512, 514, 533–535
Tan, K.C., 538
Tapiero, C.S., 11, 297, 309, 360, 383, 477, 491, 533, 538, 539
Taraysev, A., 525
Taylor, J.G., 360, 539
Teichroew, D., 11, 487
Teng, J.-T., 539, 540
Teo, K.L., 135, 462, 473, 507, 540
Terborgh, G., 283, 540
Terminal conditions, 38, 74, 86
Terminal inequality constraints, 72
Terminal time, 4, 29, 71, 75, 86, 89, 98, 113
Thépot, J., 360, 515, 540
Thisse, J., 507
Thompson, G.L., 54, 159, 191, 205, 213, 253, 274, 283, 291, 302, 309, 360, 371, 388, 404, 460, 462, 463, 474, 488, 489, 498, 506, 512, 533, 534, 539, 540
Tidball, M., 540
Tihomirov, V.M., 507
Time-optimal control problem, 96, 97
Tintner, G., 540
Titli, A., 536
Tolwinski, B., 385, 505, 540
Total contribution, 41
Tou, J.T., 452
Toussaint, S., 541
TPBVP, 39, 40, 57, 58, 60, 64, 67, 221, 338
Tracz, G.S., 541
Tragler, G., 11, 70, 109, 122, 360, 458, 460, 475, 484, 495, 499, 500, 541, 545
Transition matrix, 436
Transversality conditions, 38, 75, 77, 86, 88, 89, 91, 99, 104, 105, 116, 121
Transversality conditions: special cases, 86
Treadway, A.B., 360, 541