
Suresh P. Sethi

Optimal Control Theory
Applications to Management Science and Economics

Third Edition
Suresh P. Sethi
Jindal School of Management, SM30
University of Texas at Dallas
Richardson, TX, USA

ISBN 978-3-319-98236-6 ISBN 978-3-319-98237-3 (eBook)


https://doi.org/10.1007/978-3-319-98237-3

Library of Congress Control Number: 2018955904

2nd edition: © Springer-Verlag US 2000


© Springer Nature Switzerland AG 2019
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, express or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
This book is dedicated to the memory of
my parents
Manak Bai and Gulab Chand Sethi
Preface to Third Edition
The third edition of this book will not see my co-author Gerald L.
Thompson, who very sadly passed away on November 9, 2009. Gerry
and I wrote the first edition of the 1981 book sitting practically side by
side, and I learned a great deal about book writing in the process. He
was also my PhD supervisor and mentor and he is greatly missed.
After many years of using the second edition of the book in the classroom,
I have prepared this third edition with new material and many
improvements. Examples and exercises related to the interpretation of
the adjoint variables and Lagrange multipliers are inserted in Chaps. 2–
4. The direct maximum principle is now discussed in detail in Chap. 4,
alongside the indirect maximum principle retained from the second edition.
Chattering or relaxed controls leading to pulsing advertising policies are
introduced in Chap. 7. An application to information systems involving
chattering controls is added as an exercise.
The objective function in Sect. 11.1.3 is changed to the more popular
objective of maximizing society's total discounted utility of consumption.
Further discussion of obtaining a saddle-point path on the phase diagram,
leading to the long-run stationary equilibrium, is provided in Sect. 11.2.
For this purpose, a global saddle-point theorem is stated
in Appendix D.7. Also inserted in Appendix D.8 is a discussion of the
Sethi-Skiba points which lead to nonunique stable equilibria. Finally,
a new Sect. 11.4 contains an adverse selection model with a continuum of
agent types in a principal-agent framework, which requires an appli-
cation of the maximum principle.
Chapter 12 of the second edition is removed except for the material
on differential games and the distributed parameter maximum principle.
The differential game material joins new topics of stochastic Nash differ-
ential games and Stackelberg differential games via their applications to
marketing to form a new Chap. 13 titled Differential Games. As a result,
Chap. 13 of the second edition becomes Chap. 12. The material on the
distributed parameter maximum principle is now Appendix D.9.
The exposition is revised in some places for better reading. New
exercises are added and the list of references is updated. Needless to say,
the errors in the second edition are corrected, and the notation is made
consistent.

Thanks are due to Huseyin Cavusoglu, Andrei Dmitruk, Gustav Feichtinger, Richard Hartl, Yonghua Ji, Subodha Kumar, Sirong Lao, Hel-
mut Maurer, Ernst Presman, Anyan Qi, Andrea Seidl, Atle Seierstad,
Xi Shan, Lingling Shi, Xiahong Yue, and the students in my Optimal
Control Theory and Applications course over the years for their sug-
gestions for improvement. Special thanks go to Qi (Annabelle) Feng
for her dedication in updating and correcting the forthcoming solution
manual that went with the first edition. I cannot thank Barbara Gordon
and Lindsay Wilson enough for their assistance in the preparation of
the text, solution manual, and presentation materials. In addition, the
meticulous copy editing of the entire book by Lindsay Wilson is much
appreciated. Anshuman Chutani, Pooja Kamble, and Shivani Thakkar
are also thanked for their assistance in drawing some of the figures in
the book.

Richardson, TX, USA Suresh P. Sethi


June 2018
Preface to Second Edition
The first edition of this book, which provided an introduction to op-
timal control theory and its applications to management science to many
students in management, industrial engineering, operations research and
economics, went out of print a number of years ago. Over the years we
have received feedback concerning its contents from a number of instruc-
tors who taught it, and students who studied from it. We have also kept
up with new results in the area as they were published in the literature.
For this reason we felt that now was a good time to come out with a
new edition. While some of the basic material remains, we have made
several big changes and many small changes which we feel will make the
use of the book easier.
The most visible change is that the book is written in LaTeX and the
figures are drawn in CorelDRAW, in contrast to the typewritten text
and hand-drawn figures of the first edition. We have also included some
problems along with their numerical solutions obtained using Excel.
The most important change is the division of the material in the
old Chap. 3 into Chaps. 3 and 4 in the new edition. Chapter 3 now
contains models having mixed (control and state) constraints, current
value formulations, terminal conditions and model types, while Chap. 4
covers the more difficult topic of pure state constraints, together with
mixed constraints. Each of these chapters contains new results that were
not available when the first edition was published.
The second most important change is the expansion of the material in
the old Sect. 12.4 on stochastic optimal control theory and its becoming
the new Chap. 13. The new Chap. 12 now contains the following ad-
vanced topics on optimal control theory: differential games, distributed
parameter systems, and impulse control. The new Chap. 13 provides a
brief introduction to stochastic optimal control problems. It contains
formulations of simple stochastic models in production, marketing and
finance, and their solutions. We deleted the old Chap. 11 of the first
edition on computational methods, since there are a number of excellent
references now available on this topic. Some of these references are listed
in Sect. 4.2 of Chap. 4 and Sect. 8.3 of Chap. 8.


The emphasis of this book is not on mathematical rigor, but rather
on developing models of realistic situations faced in business and man-
agement. For that reason we have given, in Chaps. 2 and 8, proofs of the
continuous and discrete maximum principles by using dynamic program-
ming and Kuhn-Tucker theory, respectively. More general maximum
principles are stated without proofs in Chaps. 3, 4 and 12.
One of the fascinating features of optimal control theory is its ex-
traordinarily wide range of possible applications. We have covered some
of these as follows: Chap. 5 covers finance; Chap. 6 considers production
and inventory problems; Chap. 7 covers marketing problems; Chap. 9
treats machine maintenance and replacement; Chap. 10 deals with prob-
lems of optimal consumption of natural resources (renewable or ex-
haustible); and Chap. 11 discusses a number of applications of control
theory to economics. The contents of Chaps. 12 and 13 have been de-
scribed earlier.
Finally, four appendices cover either elementary material, such as
the theory of differential equations, or very advanced material, whose
inclusion in the main text would interrupt its continuity. At the end
of the book is an extensive but not exhaustive bibliography of relevant
material on optimal control theory including surveys of material devoted
to specific applications.
We are deeply indebted to many people for their part in making this
edition possible. Onur Arugaslan, Gustav Feichtinger, Neil Geismar,
Richard Hartl, Steffen Jørgensen, Subodha Kumar, Helmut Maurer, Ger-
hard Sorger, and Denny Yeh made helpful comments and suggestions
about the first edition or preliminary chapters of this revision. Many
students who used the first edition, or preliminary chapters of this revi-
sion, also made suggestions for improvements. We would like to express
our gratitude to all of them for their help. In addition we express our
appreciation to Eleanor Balocik, Frank (Youhua) Chen, Feng Cheng,
Howard Chow, Barbara Gordon, Jiong Jiang, Kuntal Kotecha, Ming
Tam, and Srinivasa Yarrakonda for their typing of the various drafts of
the manuscript. They were advised by Dirk Beyer, Feng Cheng, Sub-
odha Kumar, Young Ryu, Chelliah Sriskandarajah, Wulin Suo, Houmin
Yan, Hanqin Zhang, and Qing Zhang on the technical problems of using
LaTeX.
We also thank our wives and children—Andrea, Chantal, Anjuli,
Dorothea, Allison, Emily, and Abigail—for their encouragement and un-
derstanding during the time-consuming task of preparing this revision.

Finally, while we regret that lack of time and pressure of other du-
ties prevented us from bringing out a second edition soon after the first
edition went out of print, we sincerely hope that the wait has been worth-
while. In spite of the numerous applications of optimal control theory
which already have been made to areas of management science and eco-
nomics, we continue to believe there is much more that remains to be
done. We hope the present revision will rekindle interest in furthering
such applications, and will enhance the continued development in the
field.

Richardson, TX, USA Suresh P. Sethi


Pittsburgh, PA, USA Gerald L. Thompson
January 2000
Preface to First Edition
The purpose of this book is to exposit, as simply as possible, some
recent results obtained by a number of researchers in the application of
optimal control theory to management science. We believe that these re-
sults are very important and deserve to be widely known by management
scientists, mathematicians, engineers, economists, and others. Because
the mathematical background required to use this book is two or three
semesters of calculus plus some differential equations and linear algebra,
the book can easily be used to teach a course in the junior or senior
undergraduate years or in the early years of graduate work. For this
purpose, we have included numerous worked-out examples in the text,
as well as a fairly large number of exercises at the end of each chapter.
Answers to selected exercises are included in the back of the book. A
solutions manual containing completely worked-out solutions to all of
the 205 exercises is also available to instructors.
The emphasis of the book is not on mathematical rigor, but on mod-
eling realistic situations faced in business and management. For that
reason, we have given in Chaps. 2 and 7 only heuristic proofs of the con-
tinuous and discrete maximum principles, respectively. In Chap. 3 we
have summarized, as succinctly as we can, the most important model
types and terminal conditions that have been used to model manage-
ment problems. We found it convenient to put a summary of almost all
the important management science models on two pages: see Tables 3.1
and 3.3.
One of the fascinating features of optimal control theory is the ex-
traordinarily wide range of its possible applications. We have tried to
cover a wide variety of applications as follows: Chap. 4 covers finance;
Chap. 5 considers production and inventory; Chap. 6 covers marketing;
Chap. 8 treats machine maintenance and replacement; Chap. 9 deals with
problems of optimal consumption of natural resources (renewable or ex-
haustible); and Chap. 10 discusses several economic applications.
In Chap. 11 we treat some computational algorithms for solving op-
timal control problems. This is a very large and important area that
needs more development.


Chapter 12 treats several more advanced topics of optimal con-
trol: differential games, distributed parameter systems, optimal filtering,
stochastic optimal control, and impulsive control. We believe that some
of these models are capable of wider applications and further theoretical
development.
Finally, four appendixes cover either elementary material, such as
differential equations, or advanced material, whose inclusion in the main
text would spoil its continuity. Also at the end of the book is a bibliogra-
phy of works actually cited in the text. While it is extensive, it is by no
means an exhaustive bibliography of management science applications
of optimal control theory. Several surveys of such applications, which
contain many other important references, are cited.
We have benefited greatly during the writing of this book by hav-
ing discussions with and obtaining suggestions from various colleagues
and students. Our special thanks go to Gustav Feichtinger for his care-
ful reading and suggestions for improvement of the entire book. Carl
Norström contributed two examples to Chaps. 4 and 5 and made many
suggestions for improvement. Jim Bookbinder used the manuscript for
a course at the University of Toronto, and Tom Morton suggested some
improvements for Chap. 5. The book has also benefited greatly from var-
ious coauthors with whom we have done research over the years. Both of
us also have received numerous suggestions for improvements from the
students in our applied control theory courses taught during the past
several years. We would like to express our gratitude to all these people
for their help.
The book has gone through several drafts, and we are greatly in-
debted to Eleanor Balocik and Rosilita Jones for their patience and
careful typing.
Although the applications of optimal control theory to management
science are recent and many fascinating applications have already been
made, we believe that much remains to be done. We hope that this book
will contribute to the popularity of the area and will enhance future
developments.

Toronto, ON, Canada Suresh P. Sethi


Pittsburgh, PA, USA Gerald L. Thompson
August 1981
Contents
1 What Is Optimal Control Theory? 1
1.1 Basic Concepts and Definitions . . . . . . . . . . . . . . 2
1.2 Formulation of Simple Control Models . . . . . . . . . . 4
1.3 History of Optimal Control Theory . . . . . . . . . . . 9
1.4 Notation and Concepts Used . . . . . . . . . . . . . . . 11
1.4.1 Differentiating Vectors and Matrices with Respect
To Scalars . . . . . . . . . . . . . . . . . . . . . . 12
1.4.2 Differentiating Scalars with Respect to Vectors . 13
1.4.3 Differentiating Vectors with Respect to Vectors . 14
1.4.4 Product Rule for Differentiation . . . . . . . . . 16
1.4.5 Miscellany . . . . . . . . . . . . . . . . . . . . . . 16
1.4.6 Convex Set and Convex Hull . . . . . . . . . . . 20
1.4.7 Concave and Convex Functions . . . . . . . . . . 20
1.4.8 Affine Function and Homogeneous Function of
Degree k . . . . . . . . . . . . . . . . . . . . . . . 22
1.4.9 Saddle Point . . . . . . . . . . . . . . . . . . . . 22
1.4.10 Linear Independence and Rank of a Matrix . . . 23
1.5 Plan of the Book . . . . . . . . . . . . . . . . . . . . . . 23

2 The Maximum Principle: Continuous Time 27


2.1 Statement of the Problem . . . . . . . . . . . . . . . . . 27
2.1.1 The Mathematical Model . . . . . . . . . . . . . 28
2.1.2 Constraints . . . . . . . . . . . . . . . . . . . . . 28
2.1.3 The Objective Function . . . . . . . . . . . . . . 29
2.1.4 The Optimal Control Problem . . . . . . . . . . 29
2.2 Dynamic Programming and the Maximum Principle . . 32
2.2.1 The Hamilton-Jacobi-Bellman Equation . . . . . 32
2.2.2 Derivation of the Adjoint Equation . . . . . . . . 36


2.2.3 The Maximum Principle . . . . . . . . . . . . . . 39


2.2.4 Economic Interpretations of the Maximum
Principle . . . . . . . . . . . . . . . . . . . . . . 40
2.3 Simple Examples . . . . . . . . . . . . . . . . . . . . . . 42
2.4 Sufficiency Conditions . . . . . . . . . . . . . . . . . . . 53
2.5 Solving a TPBVP by Using Excel . . . . . . . . . . . . . 57

3 The Maximum Principle: Mixed Inequality Constraints 69
3.1 A Maximum Principle for Problems with Mixed Inequality
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.2 Sufficiency Conditions . . . . . . . . . . . . . . . . . . . 79
3.3 Current-Value Formulation . . . . . . . . . . . . . . . . 80
3.4 Transversality Conditions: Special Cases . . . . . . . . . 86
3.5 Free Terminal Time Problems . . . . . . . . . . . . . . . 93
3.6 Infinite Horizon and Stationarity . . . . . . . . . . . . . 103
3.7 Model Types . . . . . . . . . . . . . . . . . . . . . . . . 109

4 The Maximum Principle: Pure State and Mixed Inequality Constraints 125
4.1 Jumps in Marginal Valuations . . . . . . . . . . . . . . . 127
4.2 The Optimal Control Problem with Pure and Mixed
Constraints . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.3 The Maximum Principle: Direct Method . . . . . . . . . 132
4.4 Sufficiency Conditions: Direct Method . . . . . . . . . . 136
4.5 The Maximum Principle: Indirect Method . . . . . . . . 137
4.6 Current-Value Maximum Principle:
Indirect Method . . . . . . . . . . . . . . . . . . . . . . 147

5 Applications to Finance 159


5.1 The Simple Cash Balance Problem . . . . . . . . . . . . 160
5.1.1 The Model . . . . . . . . . . . . . . . . . . . . . 160
5.1.2 Solution by the Maximum Principle . . . . . . . 161
5.2 Optimal Financing Model . . . . . . . . . . . . . . . . . 164
5.2.1 The Model . . . . . . . . . . . . . . . . . . . . . 165
5.2.2 Application of the Maximum Principle . . . . . . 167
5.2.3 Synthesis of Optimal Control Paths . . . . . . . 170
5.2.4 Solution for the Infinite Horizon Problem . . . . 180

6 Applications to Production and Inventory 191


6.1 Production-Inventory Systems . . . . . . . . . . . . . . . 192
6.1.1 The Production-Inventory Model . . . . . . . . . 192
6.1.2 Solution by the Maximum Principle . . . . . . . 193
6.1.3 The Infinite Horizon Solution . . . . . . . . . . . 196
6.1.4 Special Cases of Time Varying Demands . . . . . 197
6.1.5 Optimality of a Linear Decision Rule . . . . . . . 200
6.1.6 Analysis with a Nonnegative Production
Constraint . . . . . . . . . . . . . . . . . . . . . . 202
6.2 The Wheat Trading Model . . . . . . . . . . . . . . . . 204
6.2.1 The Model . . . . . . . . . . . . . . . . . . . . . 205
6.2.2 Solution by the Maximum Principle . . . . . . . 206
6.2.3 Solution of a Special Case . . . . . . . . . . . . . 206
6.2.4 The Wheat Trading Model with No Short-Selling 208
6.3 Decision Horizons and Forecast Horizons . . . . . . . . . 213
6.3.1 Horizons for the Wheat Trading Model with
No Short-Selling . . . . . . . . . . . . . . . . . . 214
6.3.2 Horizons for the Wheat Trading Model with No
Short-Selling and a Warehousing Constraint . . . 214

7 Applications to Marketing 225


7.1 The Nerlove-Arrow Advertising Model . . . . . . . . . . 226
7.1.1 The Model . . . . . . . . . . . . . . . . . . . . . 226
7.1.2 Solution by the Maximum Principle . . . . . . . 228
7.1.3 Convex Advertising Cost and Relaxed Controls . 232
7.2 The Vidale-Wolfe Advertising Model . . . . . . . . . . . 235
7.2.1 Optimal Control Formulation for the
Vidale-Wolfe Model . . . . . . . . . . . . . . . . 236
7.2.2 Solution Using Green’s Theorem When
Q Is Large . . . . . . . . . . . . . . . . . . . . . 237
7.2.3 Solution When Q Is Small . . . . . . . . . . . . . 245
7.2.4 Solution When T Is Infinite . . . . . . . . . . . . 247

8 The Maximum Principle: Discrete Time 259


8.1 Nonlinear Programming Problems . . . . . . . . . . . . 259
8.1.1 Lagrange Multipliers . . . . . . . . . . . . . . . . 260
8.1.2 Equality and Inequality Constraints . . . . . . . 262
8.1.3 Constraint Qualification . . . . . . . . . . . . . . 267
8.1.4 Theorems from Nonlinear Programming . . . . . 268

8.2 A Discrete Maximum Principle . . . . . . . . . . . . . . 269


8.2.1 A Discrete-Time Optimal Control Problem . . . 269
8.2.2 A Discrete Maximum Principle . . . . . . . . . . 270
8.2.3 Examples . . . . . . . . . . . . . . . . . . . . . . 272
8.3 A General Discrete Maximum Principle . . . . . . . . . 276

9 Maintenance and Replacement 283


9.1 A Simple Maintenance and Replacement Model . . . . . 284
9.1.1 The Model . . . . . . . . . . . . . . . . . . . . . 284
9.1.2 Solution by the Maximum Principle . . . . . . . 285
9.1.3 A Numerical Example . . . . . . . . . . . . . . . 287
9.1.4 An Extension . . . . . . . . . . . . . . . . . . . . 289
9.2 Maintenance and Replacement for
a Machine Subject to Failure . . . . . . . . . . . . . . . 290
9.2.1 The Model . . . . . . . . . . . . . . . . . . . . . 291
9.2.2 Optimal Policy . . . . . . . . . . . . . . . . . . . 293
9.2.3 Determination of the Sale Date . . . . . . . . . . 296
9.3 Chain of Machines . . . . . . . . . . . . . . . . . . . . . 297
9.3.1 The Model . . . . . . . . . . . . . . . . . . . . . 297
9.3.2 Solution by the Discrete Maximum Principle . . 299
9.3.3 Special Case of Bang-Bang Control . . . . . . . . 301
9.3.4 Incorporation into the Wagner-Whitin
Framework for a Complete Solution . . . . . . . 301
9.3.5 A Numerical Example . . . . . . . . . . . . . . . 302

10 Applications to Natural Resources 311


10.1 The Sole-Owner Fishery Resource Model . . . . . . . . . 312
10.1.1 The Dynamics of Fishery Models . . . . . . . . . 312
10.1.2 The Sole Owner Model . . . . . . . . . . . . . . 313
10.1.3 Solution by Green’s Theorem . . . . . . . . . . . 314
10.2 An Optimal Forest Thinning Model . . . . . . . . . . . 317
10.2.1 The Forestry Model . . . . . . . . . . . . . . . . 317
10.2.2 Determination of Optimal Thinning . . . . . . . 318
10.2.3 A Chain of Forests Model . . . . . . . . . . . . . 321
10.3 An Exhaustible Resource Model . . . . . . . . . . . . . 324
10.3.1 Formulation of the Model . . . . . . . . . . . . . 324
10.3.2 Solution by the Maximum Principle . . . . . . . 327

11 Applications to Economics 335


11.1 Models of Optimal Economic Growth . . . . . . . . . . . 335
11.1.1 An Optimal Capital Accumulation Model . . . . 336
11.1.2 Solution by the Maximum Principle . . . . . . . 336
11.1.3 Introduction of a Growing Labor Force . . . . . . 338
11.1.4 Solution by the Maximum Principle . . . . . . . 339
11.2 A Model of Optimal Epidemic Control . . . . . . . . . . 343
11.2.1 Formulation of the Model . . . . . . . . . . . . . 343
11.2.2 Solution by Green’s Theorem . . . . . . . . . . . 344
11.3 A Pollution Control Model . . . . . . . . . . . . . . . . 346
11.3.1 Model Formulation . . . . . . . . . . . . . . . . . 347
11.3.2 Solution by the Maximum Principle . . . . . . . 348
11.3.3 Phase Diagram Analysis . . . . . . . . . . . . . . 349
11.4 An Adverse Selection Model . . . . . . . . . . . . . . . . 352
11.4.1 Model Formulation . . . . . . . . . . . . . . . . . 352
11.4.2 The Implementation Problem . . . . . . . . . . . 353
11.4.3 The Optimization Problem . . . . . . . . . . . . 354
11.5 Miscellaneous Applications . . . . . . . . . . . . . . . . 360

12 Stochastic Optimal Control 365


12.1 Stochastic Optimal Control . . . . . . . . . . . . . . . . 366
12.2 A Stochastic Production Inventory Model . . . . . . . . 370
12.2.1 Solution for the Production Planning Problem . 372
12.3 The Sethi Advertising Model . . . . . . . . . . . . . . . 375
12.4 An Optimal Consumption-Investment Problem . . . . . 377
12.5 Concluding Remarks . . . . . . . . . . . . . . . . . . . . 383

13 Differential Games 385


13.1 Two-Person Zero-Sum Differential Games . . . . . . . . 386
13.2 Nash Differential Games . . . . . . . . . . . . . . . . . . 387
13.2.1 Open-Loop Nash Solution . . . . . . . . . . . . . 388
13.2.2 Feedback Nash Solution . . . . . . . . . . . . . . 388
13.2.3 An Application to Common-Property Fishery
Resources . . . . . . . . . . . . . . . . . . . . . . 389
13.3 A Feedback Nash Stochastic Differential
Game in Advertising . . . . . . . . . . . . . . . . . . . . 392
13.4 A Feedback Stackelberg Stochastic Differential Game of
Cooperative Advertising . . . . . . . . . . . . . . . . . . 395

A Solutions of Linear Differential Equations 409


A.1 First-Order Linear Equations . . . . . . . . . . . . . . . 409
A.2 Second-Order Linear Equations with
Constant Coefficients . . . . . . . . . . . . . . . . . . . . 410
A.3 System of First-Order Linear Equations . . . . . . . . . 410
A.4 Solution of Linear Two-Point Boundary Value Problems 413
A.5 Solutions of Finite Difference Equations . . . . . . . . . 414
A.5.1 Changing Polynomials in Powers of k into
Factorial Powers of k . . . . . . . . . . . . . . . . 415
A.5.2 Changing Factorial Powers of k into Ordinary
Powers of k . . . . . . . . . . . . . . . . . . . . . 416

B Calculus of Variations and Optimal Control Theory 419


B.1 The Simplest Variational Problem . . . . . . . . . . . . 420
B.2 The Euler-Lagrange Equation . . . . . . . . . . . . . . . 421
B.3 The Shortest Distance Between Two Points on the Plane 424
B.4 The Brachistochrone Problem . . . . . . . . . . . . . . . 424
B.5 The Weierstrass-Erdmann Corner
Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 427
B.6 Legendre’s Conditions: The Second Variation . . . . . . 428
B.7 Necessary Condition for a Strong
Maximum . . . . . . . . . . . . . . . . . . . . . . . . . . 429
B.8 Relation to Optimal Control Theory . . . . . . . . . . . 430

C An Alternative Derivation of the Maximum Principle 433


C.1 Needle-Shaped Variation . . . . . . . . . . . . . . . . . . 434
C.2 Derivation of the Adjoint Equation and the Maximum
Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . 436

D Special Topics in Optimal Control 441


D.1 The Kalman Filter . . . . . . . . . . . . . . . . . . . . . 441
D.2 Wiener Process and Stochastic Calculus . . . . . . . . . 444
D.3 The Kalman-Bucy Filter . . . . . . . . . . . . . . . . . . 447
D.4 Linear-Quadratic Problems . . . . . . . . . . . . . . . . 448
D.4.1 Certainty Equivalence or Separation Principle . . 451
D.5 Second-Order Variations . . . . . . . . . . . . . . . . . . 452
D.6 Singular Control . . . . . . . . . . . . . . . . . . . . . . 454

D.7 Global Saddle Point Theorem . . . . . . . . . . . . . . . 456


D.8 The Sethi-Skiba Points . . . . . . . . . . . . . . . . . . . 458
D.9 Distributed Parameter Systems . . . . . . . . . . . . . . 460

E Answers to Selected Exercises 465

Bibliography 473

Index 547
List of Figures
1.1 The Brachistochrone problem . . . . . . . . . . . . . . . 9
1.2 Illustration of left and right limits . . . . . . . . . . . . 18
1.3 A concave function . . . . . . . . . . . . . . . . . . . . . 21
1.4 An illustration of a saddle point . . . . . . . . . . . . . . 23

2.1 An optimal path in the state-time space . . . . . . . . . 34


2.2 Optimal state and adjoint trajectories for Example 2.2 . 44
2.3 Optimal state and adjoint trajectories for Example 2.3 . 46
2.4 Optimal trajectories for Examples 2.4 and 2.5 . . . . . . 48
2.5 Optimal control for Example 2.6 . . . . . . . . . . . . . 53
2.6 The flowchart for Example 2.8 . . . . . . . . . . . . . . 58
2.7 Solution of TPBVP by Excel . . . . . . . . . . . . . . 60
2.8 Water reservoir of Exercise 2.18 . . . . . . . . . . . . . . 63

3.1 State and adjoint trajectories in Example 3.4 . . . . . . 93


3.2 Minimum time optimal response for Example 3.6 . . . . 101

4.1 Feasible state space and optimal state trajectory for Examples 4.1 and 4.4 . . . . . . . . . . . . 128
4.2 State and adjoint trajectories in Example 4.3 . . . . . . 143
4.3 Adjoint trajectory for Example 4.4 . . . . . . . . . . . . 147
4.4 Two-reservoir system of Exercise 4.8 . . . . . . . . . . . 151
4.5 Feasible space for Exercise 4.28 . . . . . . . . . . . . . . 157

5.1 Optimal policy shown in (λ1 , λ2 ) space . . . . . . . . . . 163


5.2 Optimal policy shown in (t, λ2 /λ1 ) space . . . . . . . . . 164
5.3 Case A: g ≤ r . . . . . . . . . . . . . . . . . . . . . . . . 169
5.4 Case B: g > r . . . . . . . . . . . . . . . . . . . . . . . . 170
5.5 Optimal path for case A: g ≤ r . . . . . . . . . . . . . . 174
5.6 Optimal path for case B: g > r . . . . . . . . . . . . . . 179


5.7 Solution for Exercise 5.4 . . . . . . . . . . . . . . . . . . 186


5.8 Adjoint trajectories for Exercise 5.5 . . . . . . . . . . . 187

6.1 Solution of Example 6.1 with I0 = 10 . . . . . . . . . . . 199


6.2 Solution of Example 6.1 with I0 = 50 . . . . . . . . . . . 199
6.3 Solution of Example 6.1 with I0 = 30 . . . . . . . . . . . 200
6.4 Optimal production rate and inventory level with different
initial inventories . . . . . . . . . . . . . . . . . . . . . . 204
6.5 The price trajectory (6.56) . . . . . . . . . . . . . . . . . 207
6.6 Adjoint variable, optimal policy and inventory in the
wheat trading model . . . . . . . . . . . . . . . . . . . . 209
6.7 Adjoint trajectory and optimal policy for the wheat trad-
ing model . . . . . . . . . . . . . . . . . . . . . . . . . . 212
6.8 Decision horizon and optimal policy for the wheat trading
model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
6.9 Optimal policy and horizons for the wheat trading model
with no short-selling and a warehouse constraint . . . . 216
6.10 Optimal policy and horizons for Example 6.3 . . . . . . 218
6.11 Optimal policy and horizons for Example 6.4 . . . . . . 219

7.1 Optimal policies in the Nerlove-Arrow model . . . . . . 230


7.2 A case of a time-dependent turnpike and the nature of
optimal control . . . . . . . . . . . . . . . . . . . . . . . 231
7.3 A near-optimal control of problem (7.15) . . . . . . . . . 233
7.4 Feasible arcs in (t, x)-space . . . . . . . . . . . . . . . . 238
7.5 Optimal trajectory for Case 1: x0 ≤ xs and xT ≤ xs . . 240
7.6 Optimal trajectory for Case 2: x0 < xs and xT > xs . . 241
7.7 Optimal trajectory for Case 3: x0 > xs and xT < xs . . 241
7.8 Optimal trajectory for Case 4: x0 > xs and xT > xs . . 242
7.9 Optimal trajectory (solid lines) . . . . . . . . . . . . . . 243
7.10 Optimal trajectory when T is small in Case 1: x0 < xs
and xT > xs . . . . . . . . . . . . . . . . . . . . . . . . . 243
7.11 Optimal trajectory when T is small in Case 2: x0 > xs
and xT > xs . . . . . . . . . . . . . . . . . . . . . . . . . 244
7.12 Optimal trajectory for Case 2 of Theorem 7.1 for Q = ∞ 244
7.13 Optimal trajectories for x(0) < x̂ . . . . . . . . . . . . . 249
7.14 Optimal trajectory for x(0) > x̂ . . . . . . . . . . . . . . 250

8.1 Shortest distance from point (2,2) to the semicircle . . . 266


8.2 Graph of Example 8.5 . . . . . . . . . . . . . . . . . . . 267

8.3 Discrete-time conventions . . . . . . . . . . . . . . . . . 270



8.4 Optimal state xk and adjoint λk . . . . . . . . . . . . . 275

9.1 Optimal maintenance and machine resale value . . . . . 289


9.2 Sat function optimal control . . . . . . . . . . . . . . . . 291

10.1 Optimal policy for the sole owner fishery model . . . . . 316
10.2 Singular usable timber volume x̄(t) . . . . . . . . . . . . 320
10.3 Optimal thinning u∗ (t) and timber volume x∗ (t) for the
forest thinning model when x0 < x̄(t0 ) . . . . . . . . . . 320
10.4 Optimal thinning u∗ (t) and timber volume x∗ (t) for the
chain of forests model when T > t̂ . . . . . . . . . . . . 322
10.5 Optimal thinning and timber volume x∗ (t) for the chain
of forests model when T ≤ t̂ . . . . . . . . . . . . . . . . 323
10.6 The demand function . . . . . . . . . . . . . . . . . . . . 324
10.7 The profit function . . . . . . . . . . . . . . . . . . . . . 326
10.8 Optimal price trajectory for T ≥ T̄ . . . . . . . . . . . . 329
10.9 Optimal price trajectory for T < T̄ . . . . . . . . . . . . 330

11.1 Phase diagram for the optimal growth model . . . . . . 340


11.2 Optimal trajectory when xT > xs . . . . . . . . . . . . . 346
11.3 Optimal trajectory when xT < xs . . . . . . . . . . . . 347
11.4 Food output function . . . . . . . . . . . . . . . . . . . . 348
11.5 Phase diagram for the pollution control model . . . . . . 351
11.6 Violation of the monotonicity constraint . . . . . . . . . 358
11.7 Bunching and ironing . . . . . . . . . . . . . . . . . . . 359

12.1 A sample path of optimal production rate It∗ with I0 = x0 > 0 and B > 0 . . . . . . . . . . . 374

13.1 A sample path of optimal market share trajectories . . . 396


13.2 Optimal subsidy rate vs. (a) Retailer’s margin and (b)
Manufacturer’s margin . . . . . . . . . . . . . . . . . . . 404

B.1 Examples of admissible functions for the problem . . . . 420


B.2 Variation about the solution function . . . . . . . . . . . 421
B.3 A broken extremal with corner at τ . . . . . . . . . . . . 428

C.1 Needle-shaped variation . . . . . . . . . . . . . . . . . . 434


C.2 Trajectories x∗ (t) and x(t) in a one-dimensional case . . 434

D.1 Phase diagram for system (D.73) . . . . . . . . . . . . . 457


D.2 Region D with boundaries Γ1 and Γ2 . . . . . . . . . . . 461
List of Tables
1.1 The production-inventory model of Example 1.1 . . . . 4
1.2 The advertising model of Example 1.2 . . . . . . . . . . 6
1.3 The consumption model of Example 1.3 . . . . . . . . . 8

3.1 Summary of the transversality conditions . . . . . . . . 89


3.2 State trajectories and switching curves . . . . . . . . . . 100
3.3 Objective, state, and adjoint equations for various model
types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

5.1 Characterization of optimal controls with c < 1 . . . . . 168

13.1 Optimal feedback Stackelberg solution . . . . . . . . . . 403

A.1 Homogeneous solution forms for Eq. (A.5) . . . . . . . . 411


A.2 Particular solutions for Eq. (A.5) . . . . . . . . . . . . . 411

Chapter 1

What Is Optimal Control Theory?

Many management science applications involve the control of dynamic
systems, i.e., systems that evolve over time. They are called continuous-
time systems or discrete-time systems depending on whether time varies
continuously or discretely. We will deal with both kinds of systems in this
book, although the main emphasis will be on continuous-time systems.
Optimal control theory is a branch of mathematics developed to find
optimal ways to control a dynamic system. The purpose of this book is
to give an elementary introduction to the mathematical theory, and then
apply it to a wide variety of different situations arising in management
science. We have deliberately kept the level of mathematics as simple as
possible in order to make the book accessible to a large audience. The
only mathematical requirements for this book are elementary calculus,
including partial differentiation, some knowledge of vectors and matri-
ces, and elementary ordinary and partial differential equations. The last
topic is briefly covered in Appendix A. Chapter 12 on stochastic opti-
mal control also requires some concepts in stochastic calculus, which are
introduced at the beginning of that chapter.
The principal management science applications discussed in this book
come from the following areas: finance, economics, production and in-
ventory, marketing, maintenance and replacement, and the consumption
of natural resources. In each major area we have formulated one or more
simple models followed by a more complicated model. The reader may
wish at first to cover only the simpler models in each area to get an idea
of what could be accomplished with optimal control theory. Later, the
reader may wish to go into more depth in one or more of the applied
areas.
Examples are worked out in most of the chapters to facilitate the
exposition. At the end of each chapter, we have listed exercises that the
reader should solve for deeper understanding of the material presented
in the chapter. Hints are supplied with some of the exercises. Answers
to selected exercises are given in Appendix E.

1.1 Basic Concepts and Definitions


We will use the word system as a primitive term in this book. The only
property that we require of a system is that it is capable of existing in
various states. Let the (real) variable x(t) be the state variable of the
system at time t ∈ [0, T ], where T > 0 is a specified time horizon for
the system under consideration. For example, x(t) could measure the
inventory level at time t, the amount of advertising goodwill at time t,
or the amount of unconsumed wealth or natural resources at time t.
We assume that there is a way of controlling the state of the system.
Let the (real) variable u(t) be the control variable of the system at time t.
For example, u(t) could be the production rate at time t, the advertising
rate at time t, etc.
Given the values of the state variable x(t) and the control variable
u(t) at time t, the state equation, a differential equation,

ẋ(t) = f (x(t), u(t), t), x(0) = x0 , (1.1)

specifies the instantaneous rate of change in the state variable, where
ẋ(t) is a commonly used notation for dx(t)/dt, f is a given function of
x, u, and t, and x0 is the initial value of the state variable. If we know
the initial value x0 and the control trajectory, i.e., the values of u(t) over
the whole time interval 0 ≤ t ≤ T, then we can integrate (1.1) to get
the state trajectory, i.e., the values of x(t) over the same time interval.
We want to choose the control trajectory so that the state and control
trajectories maximize the objective functional, or simply the objective
function,

$$J = \int_0^T F(x(t), u(t), t)\,dt + S[x(T), T]. \qquad (1.2)$$
In (1.2), F is a given function of x, u, and t, which could measure
the benefit minus the cost of advertising, the utility of consumption, the
negative of the cost of inventory and production, etc. Also in (1.2), the
function S gives the salvage value of the ending state x(T ) at time T.
The salvage value is needed so that the solution will make “good sense”
at the end of the horizon.
Usually the control variable u(t) will be constrained. We indicate
this as
u(t) ∈ Ω(t), t ∈ [0, T ], (1.3)
where Ω(t) is the set of feasible values for the control variable at time t.
Optimal control problems involving (1.1), (1.2), and (1.3) will be
treated in Chap. 2.
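Before moving to constrained variants, it may help to see how a candidate control trajectory determines the state trajectory and the objective value numerically. The following is a minimal sketch, not from the book, using a forward-Euler discretization of (1.1) and a Riemann sum for (1.2); the function names and the toy problem are illustrative assumptions:

```python
import numpy as np

def evaluate_control(f, F, S, u, x0, T, n=1000):
    """Integrate the state equation (1.1) by forward Euler and
    accumulate the objective function (1.2) along the way."""
    dt = T / n
    x, J = x0, 0.0
    for k in range(n):
        t = k * dt
        J += F(x, u(t), t) * dt        # running term of (1.2)
        x += f(x, u(t), t) * dt        # state equation (1.1)
    return x, J + S(x, T)              # add the salvage value S[x(T), T]

# Toy problem: x' = u, running reward F = -u^2, salvage S = 2x(T)
xT, J = evaluate_control(f=lambda x, u, t: u,
                         F=lambda x, u, t: -u**2,
                         S=lambda x, T: 2 * x,
                         u=lambda t: 0.5, x0=0.0, T=1.0)
print(xT, J)   # approximately 0.5 and 0.75
```

Comparing J across different choices of u(·) is exactly the search that the maximum principle of Chap. 2 organizes analytically.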
In Chap. 3, we will replace (1.3) by inequality constraints involving
control variables. In addition, we will allow these constraints to depend
on state variables. These are called mixed inequality constraints and
written as
g(x(t), u(t), t) ≥ 0, t ∈ [0, T ] , (1.4)
where g is a given function of u, t, and possibly x.
In addition, there may be constraints involving only state variables,
but not control variables. These are written as

h(x(t), t) ≥ 0, t ∈ [0, T ], (1.5)

where h is a given function of x and t. Such constraints are the most
difficult to deal with, and are known as pure state inequality constraints.
Problems involving (1.1), (1.2), (1.4), and (1.5) will be treated in Chap. 4.
Finally, we note that all of the imposed constraints limit the values
that the terminal state x(T ) may take. We denote this by saying

x(T ) ∈ X, (1.6)

where X is called the reachable set of the state variable at time T. Note
that X depends on the initial value x0 . Here X is the set of possible
terminal values that can be reached when x(t) and u(t) obey imposed
constraints.
Although the above description of the control problem may seem ab-
stract, you will find that in each specific application, the variables and
parameters will have specific meanings that make them easy to under-
stand and remember. The examples that follow will illustrate this point.

1.2 Formulation of Simple Control Models


We now formulate three simple models chosen from the areas of produc-
tion, advertising, and economics. Our only objective here is to identify
and interpret in these models each of the variables and functions de-
scribed in the previous section. The solutions for each of these models
will be given in detail in later chapters.

Example 1.1 A Production-Inventory Model. The various quantities that define this model are summarized in Table 1.1 for easy comparison
with the other models that follow.
Table 1.1: The production-inventory model of Example 1.1

State variable: I(t) = Inventory level
Control variable: P(t) = Production rate
State equation: İ(t) = P(t) − S(t), I(0) = I0
Objective function: Maximize $J = \int_0^T -[h(I(t)) + c(P(t))]\,dt$
State constraint: I(t) ≥ 0
Control constraints: 0 ≤ Pmin ≤ P(t) ≤ Pmax
Terminal condition: I(T) ≥ Imin
Exogenous functions: S(t) = Demand rate; h(I) = Inventory holding cost; c(P) = Production cost
Parameters: T = Terminal time; Imin = Minimum ending inventory; Pmin = Minimum possible production rate; Pmax = Maximum possible production rate; I0 = Initial inventory level


We consider the production and inventory storage of a given good, such as steel, in order to meet an exogenous demand. The state variable I(t) measures the number of tons of steel that we have on hand at time t ∈ [0, T]. There is an exogenous demand rate S(t) tons of steel per day at time t ∈ [0, T], and we must choose the production rate P(t) tons of steel per day at time t ∈ [0, T]. Given the initial inventory of I0 tons of steel on hand at t = 0, the state equation

İ(t) = P(t) − S(t)
describes how the steel inventory changes over time. Since h(I) is the
cost of holding inventory I in dollars per day, and c(P ) is the cost of
producing steel at rate P, also in dollars per day, the objective function
is to maximize the negative of the sum of the total holding and produc-
tion costs over the period of T days. Of course, maximizing the negative
sum is the same as minimizing the sum of holding and production costs.
The state variable constraint, I(t) ≥ 0, is imposed so that the demand
is satisfied for all t. In other words, backlogging of demand is not per-
mitted. (An alternative formulation is to make h(I) become very large
when I becomes negative, i.e., to impose a stockout penalty cost.) The
control constraints keep the production rate P (t) between a specified
lower bound Pmin and a specified upper bound Pmax . Finally, the termi-
nal constraint I(T ) ≥ Imin is imposed so that the terminal inventory is
at least Imin .
The statement of the problem is lengthy because of the number of
variables, functions, and parameters which are involved. However, with
the production and inventory interpretations as given, it is not difficult
to see the reasons for each condition. In Chap. 6, various versions of this
model will be solved in detail. In Sect. 12.2, we will deal with a stochastic
version of this model.
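For readers who like to experiment, the model in Table 1.1 is easy to simulate for a candidate production plan before any theory is applied. The sketch below is an illustration only; the demand and cost functions are invented for the example, not taken from the book:

```python
def inventory_objective(P, S, h, c, I0, T, Imin, Pmin, Pmax, n=3000):
    """Integrate I'(t) = P(t) - S(t) and return (I(T), J), where
    J = -integral of [h(I(t)) + c(P(t))] dt as in Table 1.1."""
    dt = T / n
    I, J = I0, 0.0
    for k in range(n):
        t = k * dt
        assert Pmin <= P(t) <= Pmax, "control constraint violated"
        J -= (h(I) + c(P(t))) * dt     # accumulate -(holding + production) cost
        I += (P(t) - S(t)) * dt        # inventory dynamics
        assert I >= 0, "state constraint I(t) >= 0 violated"
    assert I >= Imin, "terminal condition I(T) >= Imin violated"
    return I, J

# Illustrative data: produce exactly to demand, linear holding cost,
# quadratic production cost
IT, J = inventory_objective(P=lambda t: 10.0, S=lambda t: 10.0,
                            h=lambda I: 0.5 * I, c=lambda P: 0.1 * P**2,
                            I0=5.0, T=30.0, Imin=5.0, Pmin=0.0, Pmax=20.0)
```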
Example 1.2 An Advertising Model. The various quantities that define
this model are summarized in Table 1.2.
We consider a special case of the Nerlove-Arrow advertising model
which will be discussed in detail in Chap. 7. The problem is to determine
the rate at which to advertise a product at each time t. Here the state
variable is advertising goodwill, G(t), which measures how well the prod-
uct is known at time t. We assume that there is a forgetting coefficient δ,
which measures the rate at which customers tend to forget the product.

To counteract forgetting, advertising is carried out at a rate measured
by the control variable u(t). Hence, the state equation is

Ġ(t) = u(t) − δG(t),

with G(0) = G0 > 0 specifying the initial goodwill for the product.

Table 1.2: The advertising model of Example 1.2

State variable: G(t) = Advertising goodwill
Control variable: u(t) = Advertising rate
State equation: Ġ(t) = u(t) − δG(t), G(0) = G0
Objective function: Maximize $J = \int_0^\infty [\pi(G(t)) - u(t)]\, e^{-\rho t}\,dt$
State constraint: none
Control constraints: 0 ≤ u(t) ≤ Q
Terminal condition: none
Exogenous function: π(G) = Gross profit rate
Parameters: δ = Goodwill decay constant; ρ = Discount rate; Q = Upper bound on advertising rate; G0 = Initial goodwill level

The objective function J requires special discussion. Note that the
integral defining J is from time t = 0 to time t = ∞; we will later
call a problem having an upper time limit of ∞, an infinite horizon
problem. Because of this upper limit, the integrand of the objective
function includes the discount factor e−ρt , where ρ > 0 is the (constant)
discount rate. Without this discount factor, the integral would (in most
cases) diverge to infinity. Hence, we will see that such a discount factor
is an essential part of infinite horizon models. The rest of the integrand
in the objective function consists of the gross profit rate π(G(t)), which
results from the goodwill level G(t) at time t less the cost of advertising
assumed to be proportional to u(t) (proportionality factor = 1); thus
π(G(t)) − u(t) is the net profit rate at time t. Also [π(G(t)) − u(t)]e−ρt is
the net profit rate at time t discounted to time 0, i.e., the present value
of the time t profit rate. Hence, J can be interpreted as the total value of
discounted future profits, and is the quantity we are trying to maximize.
There are control constraints 0 ≤ u(t) ≤ Q, where Q is the upper
bound on the advertising rate. However, there is no state constraint. It
can be seen from the state equation and the control constraints that the
goodwill G(t) in fact never becomes negative.
You will find it instructive to compare this model with the previous
one and note the similarities and differences between the two.
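As a rough numerical illustration of the role of the discount factor, the infinite horizon can be truncated at a large T, since the neglected tail is of order e^{−ρT}. The sketch below assumes an invented concave gross profit rate and a constant advertising rate, purely for demonstration:

```python
import numpy as np

def discounted_profit(u, pi, delta, rho, G0, Q, T=200.0, n=40000):
    """Approximate J from Table 1.2 by truncating the infinite
    horizon at time T; e^(-rho*T) makes the ignored tail small."""
    dt = T / n
    G, J = G0, 0.0
    for k in range(n):
        t = k * dt
        assert 0.0 <= u(t) <= Q, "control constraint violated"
        J += (pi(G) - u(t)) * np.exp(-rho * t) * dt   # discounted net profit
        G += (u(t) - delta * G) * dt                  # goodwill dynamics
    return J

# Illustrative: pi(G) = 2*sqrt(G) and steady advertising at rate 1
J = discounted_profit(u=lambda t: 1.0, pi=lambda G: 2.0 * np.sqrt(G),
                      delta=0.1, rho=0.05, G0=1.0, Q=5.0)
```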
Example 1.3 A Consumption Model. Rich Rentier plans to retire at
age 65 with a lump sum pension of W0 dollars. Rich estimates his re-
maining life span to be T years. He wants to consume his wealth during
these T retirement years, beginning at the age of 65, and leave a bequest
to his heirs in a way that will maximize his total utility of consumption
and bequest.
Since he does not want to take investment risks, Rich plans to put
his money into a savings account that pays interest at a continuously
compounded rate of r. In order to formulate Rich’s optimization problem,
let t = 0 denote the time when he turns 65 so that his retirement period
can be denoted by the interval [0, T ]. If we let the state variable W (t)
denote Rich’s wealth and the control variable C(t) ≥ 0 denote his rate of
consumption at time t ∈ [0, T ], it is easy to see that the state equation is
Ẇ (t) = rW (t) − C(t),
with the initial condition W (0) = W0 > 0. It is reasonable to require that
W (t) ≥ 0 and C(t) ≥ 0, t ∈ [0, T ]. Letting U (C) be the utility function
of consumption C and B(W ) be the bequest function of leaving a bequest
of amount W at time T, we see that the problem can be stated as an
optimal control problem with the variables, equations, and constraints
shown in Table 1.3.
Note that the objective function has two parts: first the integral of
the discounted utility of consumption from time 0 to time T with ρ as
the discount rate; and second the bequest function e−ρT B(W ), which
measures Rich’s discounted utility of leaving an estate W to his heirs
at time T. If he has no heirs and does not care about charity, then
B(W ) = 0. However, if he has heirs or a favorite charity to whom he
wishes to leave money, then B(W ) measures the strength of his desire
to leave an estate of amount W. The nonnegativity constraints on state
and control variables are obviously natural requirements that must be
imposed.
You will be asked to solve this problem in Exercise 2.1 after you
have learned the maximum principle in the next chapter. Moreover, a
stochastic extension of the consumption problem, known as a consump-
tion/investment problem, will be discussed in Sect. 12.4.

Table 1.3: The consumption model of Example 1.3

State variable: W(t) = Wealth
Control variable: C(t) = Consumption rate
State equation: Ẇ(t) = rW(t) − C(t), W(0) = W0
Objective function: Max $J = \int_0^T U(C(t))\, e^{-\rho t}\,dt + B(W(T))\, e^{-\rho T}$
State constraint: W(t) ≥ 0
Control constraint: C(t) ≥ 0
Terminal condition: none
Exogenous functions: U(C) = Utility of consumption; B(W) = Bequest function
Parameters: T = Terminal time; W0 = Initial wealth; ρ = Discount rate; r = Interest rate
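While Exercise 2.1 asks for the optimal consumption plan, any candidate plan C(·) can already be scored numerically against Table 1.3. A minimal sketch, with log utility and a log bequest function chosen purely for illustration:

```python
import numpy as np

def lifetime_utility(C, U, B, r, rho, W0, T, n=10000):
    """Integrate W'(t) = r W(t) - C(t) and the objective of Table 1.3."""
    dt = T / n
    W, J = W0, 0.0
    for k in range(n):
        t = k * dt
        assert C(t) >= 0 and W >= 0, "nonnegativity constraints violated"
        J += U(C(t)) * np.exp(-rho * t) * dt   # discounted utility of consumption
        W += (r * W - C(t)) * dt               # wealth dynamics
    return J + np.exp(-rho * T) * B(W)         # discounted bequest utility

# Illustrative: Rich consumes a constant 60 per year out of W0 = 1000
J = lifetime_utility(C=lambda t: 60.0, U=np.log, B=np.log,
                     r=0.05, rho=0.03, W0=1000.0, T=20.0)
```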

1.3 History of Optimal Control Theory


Optimal control theory is an extension of the calculus of variations (see
Appendix B), so we discuss the history of the latter first.
The creation of the calculus of variations occurred almost immedi-
ately after the formalization of calculus by Newton and Leibniz in the
seventeenth century. An important problem in calculus is to find an
argument of a function at which the function takes on its maximum or
minimum. The extension of this problem posed in the calculus of vari-
ations is to find a function which maximizes or minimizes the value of
an integral or functional of that function. As might be expected, the
extremum problem in the calculus of variations is much harder than the
extremum problem in differential calculus. Euler and Lagrange are gen-
erally considered to be the founders of the calculus of variations. Newton,
Legendre, and the Bernoulli brothers also contributed much to the early
development of the field.

Figure 1.1: The Brachistochrone problem

A celebrated problem first solved using the calculus of variations was
the path of least time or the Brachistochrone problem. The problem is
illustrated in Fig. 1.1. It involves finding the shape of a curve Γ con-
necting the two points A and B in the vertical plane with the property
that a bead sliding along the curve under the influence of gravity will
move from A to B in the shortest possible time. The problem was posed
by Johann Bernoulli in 1696, and it played an important part in the
development of calculus of variations. It was solved by Johann Bernoulli,
Jakob Bernoulli, Newton, Leibniz, and L'Hôpital. In Sect. B.4, we pro-
vide a solution to the Brachistochrone problem by using what is known
as the Euler-Lagrange equation, stated in Sect. B.2, and show that the
shape of the solution curve is represented by a cycloid.
In the nineteenth and early twentieth centuries, many mathemati-
cians contributed to the calculus of variations; these include Hamilton,
Jacobi, Bolza, Weierstrass, Carathéodory, and Bliss.
Converting calculus of variations problems into control theory prob-
lems requires one more conceptual step—the addition of control variables
to the state equations. Isaacs (1965) made such an extension in two-
person pursuit-evasion games in the period 1948–1955. Bellman (1957)
made a similar extension with the idea of dynamic programming.
Modern control theory began with the publication (in Russian in
1961 and English in 1962) of the book, The Mathematical Theory of
Optimal Processes, by Pontryagin et al. (1962). Well-known American
mathematicians associated with the maximum principle include Valen-
tine, McShane, Hestenes, Berkovitz, and Neustadt. The importance of
the book by Pontryagin et al. lies not only in a rigorous formulation of
a calculus of variations problem with constrained control variables, but
also in the proof of the maximum principle for optimal control problems.
See Pesch and Bulirsch (1994) and Pesch and Plail (2009) for historical
perspectives on the topics of the calculus of variations, dynamic pro-
gramming, and optimal control.
The maximum principle permits the decoupling of the dynamic prob-
lem over time, using what are known as adjoint variables or shadow
prices, into a series of problems, each of which holds at a single instant
of time. The optimal solution of the instantaneous problems can be
shown to give the optimal solution to the overall problem.
In this book we will be concerned principally with the application of
the maximum principle in its various forms to find the solutions of a wide
variety of applied problems in management science and economics. It is
hoped that the reader, after reading some of these problems and their
solutions, will appreciate, as we do, the importance of the maximum
principle.
Some important books and surveys of the applications of the
maximum principle to management science and economics are Connors and Teichroew (1967), Arrow and Kurz (1970), Hadley and
Kemp (1971), Bensoussan et al. (1974), Stöppler (1975), Clark (1976),
Sethi (1977a, 1978a), Tapiero (1977, 1988), Wickwire (1977), Book-
binder and Sethi (1980), Lesourne and Leban (1982), Tu (1984), Fe-
ichtinger and Hartl (1986), Carlson and Haurie (1987b), Seierstad
and Sydsæter (1987), Erickson (2003), Léonard and Long (1992),
Kamien and Schwartz (1992), Van Hilten et al. (1993), Feichtinger
et al. (1994a), Maimon et al. (1998), Dockner et al. (2000), Ca-
puto (2005), Grass et al. (2008), and Bensoussan (2011). Nev-
ertheless, we have included in our bibliography many works of
interest.

1.4 Notation and Concepts Used


In order to make the book readable, we will adopt the following notation
which will hold throughout the book. In addition, we will define some
important concepts that are required, including those of concave, convex
and affine functions, and saddle points.
We use the symbol “=” to mean “is equal to” or “is defined to be
equal to” or “is identically equal to” depending on the context. The
symbol “:=” means “is defined to be equal to,” the symbol “≡” means
“is identically equal to,” and the symbol “≈” means “is approximately
equal to.” The double arrow “⇒” means “implies,” “∀” means “for all,”
and “∈” means “is a member of.” The symbol □ indicates the end of a
proof.
Let y be an n-component column vector and z be an m-component row vector, i.e.,

$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = (y_1, \ldots, y_n)^T \quad \text{and} \quad z = (z_1, \ldots, z_m),$$

where the superscript T on a vector (or, a matrix) denotes the transpose of the vector (or, the matrix). At times, when convenient and not confusing, we will use the superscript ′ for the transpose operation. If y and
z are functions of time t, a scalar, then the time derivatives ẏ := dy/dt and ż := dz/dt are defined as

$$\dot{y} = \frac{dy}{dt} = (\dot{y}_1, \cdots, \dot{y}_n)^T \quad \text{and} \quad \dot{z} = \frac{dz}{dt} = (\dot{z}_1, \ldots, \dot{z}_m),$$
where ẏi and żj denote the time derivatives dyi /dt and dzj /dt, respec-
tively.
When n = m, we can define the inner product

$$zy = \sum_{i=1}^{n} z_i y_i. \qquad (1.7)$$

More generally, if
$$A = \{a_{ij}\} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mk} \end{bmatrix}$$

is an m × k matrix and B = {bij} is a k × n matrix, we define the matrix product C = {cij} = AB, which is an m × n matrix with components

$$c_{ij} = \sum_{r=1}^{k} a_{ir} b_{rj}. \qquad (1.8)$$

Let Eᵏ denote the k-dimensional Euclidean space. Its elements are k-component vectors, which may be either row or column vectors, depending on the context. Thus in (1.7), y ∈ Eⁿ is a column vector and z ∈ Eᵐ is a row vector.
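These conventions map directly onto array operations. As a small check of (1.7) and (1.8) (using NumPy, a tooling choice made here for illustration only):

```python
import numpy as np

z = np.array([1.0, 2.0, 3.0])      # row vector z in E^3
y = np.array([4.0, 5.0, 6.0])      # column vector y in E^3
print(z @ y)                        # inner product (1.7): 1*4 + 2*5 + 3*6 = 32.0

A = np.arange(6.0).reshape(2, 3)    # an m x k matrix with m = 2, k = 3
B = np.arange(12.0).reshape(3, 4)   # a  k x n matrix with n = 4
C = A @ B                           # matrix product (1.8); C has shape (2, 4)
print(C[0, 0])                      # c_11 = sum over r of a_1r b_r1 = 20.0
```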
Next, in Sects. 1.4.1–1.4.4, we provide the notation for multivariate
differentiation. Needless to say, the functions introduced are assumed to
be appropriately differentiable for their derivatives to be defined.

1.4.1 Differentiating Vectors and Matrices with Respect to Scalars
Let f : E 1 → E k be a k-dimensional function of a scalar variable t. If f
is a row vector, then we define
$$\frac{df}{dt} = f_t = (f_{1t}, f_{2t}, \cdots, f_{kt}), \text{ a row vector.}$$
We will also use the notation f′ = (f′1, f′2, · · · , f′k) and f′(t) in place of ft.
If f is a column vector, then
$$\frac{df}{dt} = f_t = \begin{bmatrix} f_{1t} \\ f_{2t} \\ \vdots \\ f_{kt} \end{bmatrix} = (f_{1t}, f_{2t}, \cdots, f_{kt})^T, \text{ a column vector.}$$
Once again, ft may also be written as f′ or f′(t).
A similar rule applies if a matrix function is differentiated with re-
spect to a scalar.
Example 1.4 Let
$$f(t) = \begin{bmatrix} t^2 & 2t + 3 \\ e^{3t} & 1/t \end{bmatrix}.$$
Find $f_t$.

Solution
$$f_t = \begin{bmatrix} 2t & 2 \\ 3e^{3t} & -1/t^2 \end{bmatrix}.$$
1.4.2 Differentiating Scalars with Respect to Vectors


If F (y, z) is a scalar function defined on E n ×E m with y an n-dimensional
column vector and z an m-dimensional row vector, then the gradients
Fy and Fz are defined, respectively, as

Fy = (Fy1 , · · · , Fyn ), a row vector, (1.9)

and
Fz = (Fz1 , · · · , Fzm ), a row vector, (1.10)
where Fyi and Fzj denote the partial derivatives with respect to the
subscripted variables.
Thus, we always define the gradient with respect to a row or column
vector as a row vector. Alternatively, Fy and Fz are also denoted as ∇y F
and ∇z F, respectively. In this notation, if F is a function of y only or
z only, then the subscript can be dropped and the gradient of F can be
written simply as ∇F.
Example 1.5 Let $F(y, z) = y_1^2 y_3 z_2 + 3y_2 \ln z_1 + y_1 y_2$, where $y = (y_1, y_2, y_3)^T$ and $z = (z_1, z_2)$. Obtain $F_y$ and $F_z$.

Solution $F_y = (F_{y_1}, F_{y_2}, F_{y_3}) = (2y_1 y_3 z_2 + y_2,\ 3\ln z_1 + y_1,\ y_1^2 z_2)$ and $F_z = (F_{z_1}, F_{z_2}) = (3y_2/z_1,\ y_1^2 y_3)$.

1.4.3 Differentiating Vectors with Respect to Vectors


If f : E n × E m → E k is a k-dimensional vector function, f either row or
column, i.e.,
f = (f1 , · · · , fk ) or f = (f1 , · · · , fk )T ,
where each component fi = fi (y, z) depends on the column vector y ∈ E n
and the row vector z ∈ E m , then fz will denote the k × m matrix
$$f_z = \begin{bmatrix} \partial f_1/\partial z_1 & \partial f_1/\partial z_2 & \cdots & \partial f_1/\partial z_m \\ \partial f_2/\partial z_1 & \partial f_2/\partial z_2 & \cdots & \partial f_2/\partial z_m \\ \vdots & \vdots & & \vdots \\ \partial f_k/\partial z_1 & \partial f_k/\partial z_2 & \cdots & \partial f_k/\partial z_m \end{bmatrix} = \{\partial f_i/\partial z_j\}, \qquad (1.11)$$

and fy will denote the k × n matrix


$$f_y = \begin{bmatrix} \partial f_1/\partial y_1 & \partial f_1/\partial y_2 & \cdots & \partial f_1/\partial y_n \\ \partial f_2/\partial y_1 & \partial f_2/\partial y_2 & \cdots & \partial f_2/\partial y_n \\ \vdots & \vdots & & \vdots \\ \partial f_k/\partial y_1 & \partial f_k/\partial y_2 & \cdots & \partial f_k/\partial y_n \end{bmatrix} = \{\partial f_i/\partial y_j\}. \qquad (1.12)$$

Matrices fz and fy are known as Jacobian matrices. It should be


emphasized that the rule of defining a Jacobian does not depend on the
row or column nature of the function or its arguments. Thus,

$$f_z = (f^T)_z = f_{z^T} = (f^T)_{z^T}.$$

Example 1.6 Let $f: E^3 \times E^2 \to E^3$ be defined by $f(y, z) = (y_1^2 y_3 z_2 + 3y_2 \ln z_1,\ z_1 z_2^2 y_3,\ z_1 y_1 + z_2 y_2)^T$ with $y = (y_1, y_2, y_3)^T$ and $z = (z_1, z_2)$.
Obtain fz and fy .
Solution.
$$f_z = \begin{bmatrix} 3y_2/z_1 & y_1^2 y_3 \\ z_2^2 y_3 & 2z_1 z_2 y_3 \\ y_1 & y_2 \end{bmatrix}, \qquad f_y = \begin{bmatrix} 2y_1 y_3 z_2 & 3\ln z_1 & y_1^2 z_2 \\ 0 & 0 & z_1 z_2^2 \\ z_1 & z_2 & 0 \end{bmatrix}.$$

Applying the rule (1.11) to Fy in (1.9), we obtain Fyz = (Fy )z to be


the n × m matrix
$$F_{yz} = \begin{bmatrix} F_{y_1 z_1} & F_{y_1 z_2} & \cdots & F_{y_1 z_m} \\ F_{y_2 z_1} & F_{y_2 z_2} & \cdots & F_{y_2 z_m} \\ \vdots & \vdots & & \vdots \\ F_{y_n z_1} & F_{y_n z_2} & \cdots & F_{y_n z_m} \end{bmatrix} = \left\{ \frac{\partial^2 F}{\partial y_i \partial z_j} \right\}. \qquad (1.13)$$

Applying the rule (1.12) to Fz in (1.10), we obtain Fzy = (Fz )y to be


the m × n matrix
$$F_{zy} = \begin{bmatrix} F_{z_1 y_1} & F_{z_1 y_2} & \cdots & F_{z_1 y_n} \\ F_{z_2 y_1} & F_{z_2 y_2} & \cdots & F_{z_2 y_n} \\ \vdots & \vdots & & \vdots \\ F_{z_m y_1} & F_{z_m y_2} & \cdots & F_{z_m y_n} \end{bmatrix} = \left\{ \frac{\partial^2 F}{\partial z_i \partial y_j} \right\}. \qquad (1.14)$$

Note that if F (y, z) is twice continuously differentiable, then we also have


Fzy = (Fyz )T .

Example 1.7 Obtain Fyz and Fzy for F (y, z) specified in Example 1.5.
Since the given F (y, z) is twice continuously differentiable, check also
that Fzy = (Fyz )T .
Solution. Applying rule (1.11) to Fy obtained in Example 1.5 and rule


(1.12) to Fz obtained in Example 1.5, we have, respectively,
$$F_{yz} = \begin{bmatrix} 0 & 2y_1 y_3 \\ 3/z_1 & 0 \\ 0 & y_1^2 \end{bmatrix} \quad \text{and} \quad F_{zy} = \begin{bmatrix} 0 & 3/z_1 & 0 \\ 2y_1 y_3 & 0 & y_1^2 \end{bmatrix}.$$

Also, it is easily seen from these matrices that Fzy = (Fyz )T .
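Such computations are easy to mechanize. The following Python sketch (an illustration not in the text, using the SymPy library) reproduces the gradients of Example 1.5 and the mixed partials of Example 1.7, and confirms the relation Fzy = (Fyz)T:

```python
import sympy as sp

y1, y2, y3, z1, z2 = sp.symbols('y1 y2 y3 z1 z2')
F = y1**2*y3*z2 + 3*y2*sp.log(z1) + y1*y2

# Gradients with respect to y and z, defined as row vectors as in (1.9)-(1.10)
Fy = sp.Matrix([[sp.diff(F, v) for v in (y1, y2, y3)]])
Fz = sp.Matrix([[sp.diff(F, v) for v in (z1, z2)]])

# Mixed second-order partials, as in (1.13)-(1.14)
Fyz = Fy.jacobian([z1, z2])
Fzy = Fz.jacobian([y1, y2, y3])

print(Fy)                        # Matrix([[2*y1*y3*z2 + y2, y1 + 3*log(z1), y1**2*z2]])
print(Fz)                        # Matrix([[3*y2/z1, y1**2*y3]])
print(sp.simplify(Fzy - Fyz.T))  # zero matrix, confirming Fzy = (Fyz)^T
```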

1.4.4 Product Rule for Differentiation


Let g be an n-component row vector function and f be an n-component
column vector function of an n-component vector x. Then in Exercise 1.9,
you are asked to show that
(gf )x = gfx + f T gx = gfx + f T (g T )x . (1.15)
In Exercise 1.10, you are asked to show further that with g = Fx , where
x ∈ E n and the function F : E n → E 1 is twice continuously differentiable
so that Fxx = (Fxx )T , called the Hessian, then
(gf )x =(Fx f )x =Fx fx + f T Fxx = Fx fx + (Fxx f )T . (1.16)
The latter result will be used in Chap. 2 for the derivation of (2.25).
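As a quick symbolic check of (1.16) (a sketch with hypothetical choices of F and f, not an example from the text), one can verify the identity directly:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
xv = [x1, x2]
F = x1**2*x2 + sp.sin(x2)               # a hypothetical scalar function of x
f = sp.Matrix([x1*x2, x1 + x2**2])      # a hypothetical column vector function of x

Fx = sp.Matrix([[sp.diff(F, v) for v in xv]])           # gradient F_x, a row vector
lhs = sp.Matrix([[sp.diff((Fx*f)[0], v) for v in xv]])  # left-hand side (F_x f)_x
rhs = Fx*f.jacobian(xv) + f.T*sp.hessian(F, xv)         # F_x f_x + f^T F_xx
print(sp.simplify(lhs - rhs))           # Matrix([[0, 0]]), verifying (1.16)
```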
Many mathematical expressions in this book will be vector equations
or inequalities involving vectors and vector functions. Since scalars are
a special case of vectors, these expressions hold just as well for scalar
equations or inequalities involving scalars and scalar functions. In fact,
it may be a good idea to read them as scalar expressions on the first
reading. Then in the second and further readings, the extension to vector
form will be easier.

1.4.5 Miscellany
The norm of an m-component row or column vector z is defined to be

$$\|z\| = \sqrt{z_1^2 + \cdots + z_m^2}. \qquad (1.17)$$
The norm of a vector is commonly used to define a neighborhood Nz0 of
a point, e.g.,
$$N_{z_0} = \{ z \mid \|z - z_0\| < \varepsilon \}, \qquad (1.18)$$
where ε > 0 is a small positive real number.
We will occasionally make use of the so-called “little-o” notation o(z).


A function $F(z): E^m \to E^1$ is said to be of the order $o(z)$, if
$$\lim_{z \to 0} \frac{F(z)}{\|z\|} = 0.$$

The most common use of this notation will be to collect higher order
terms in a series expansion.
In the continuous-time models discussed in this book, we generally
will use x(t) to denote the state (column) vector, u(t) to denote the
control (column) vector, and λ(t) to denote the adjoint (row) vector.
Whenever there is no possibility of confusion, we will suppress the time
indicator (t) from these vectors and write them as x, u, and λ, respec-
tively. When talking about optimal state and control vectors, we put an
asterisk “∗ ” as a superscript, i.e., as x∗ and u∗ , respectively, whereas u
will refer to an admissible control with x as the corresponding state. No
asterisk, however, needs to be put on the adjoint vector λ as it is only
defined along an optimal path.
Thus, the values of the control, state and adjoint variables at time t
along an optimal path will be written as u∗ (t), x∗ (t), and λ(t). When the
control is expressed in terms of the state, it is called a feedback control.
With an abuse of notation, we will express it as u(x), or u(x, t) if an
explicit time dependence is required. Likewise, the optimal feedback
control will be denoted as u∗ (x) or u∗ (x, t).
We also use the simplified notation x′(t) to mean (x(t))′, the transpose
of x(t). Likewise, for a matrix A(t), we use A′(t) to mean (A(t))′
or the transpose of A(t), and A−1(t) to mean (A(t))−1 or the inverse of
A(t), when the inverse exists.
The norm of an m-dimensional row or column vector function z(t),
t ∈ [0, T], is defined to be
$$\|z\| = \left[ \int_0^T \sum_{j=1}^{m} z_j^2(\tau)\, d\tau \right]^{1/2}. \qquad (1.19)$$

In Chap. 4 and some other chapters, we will encounter functions of


time with jumps. For such functions, it is useful to have the concepts of
left and right limits. With ε > 0, these are defined, respectively, for a
function x(t) as

$$x(T^-) = \lim_{\tau \uparrow T} x(\tau) = \lim_{\varepsilon \to 0} x(T - \varepsilon) \quad \text{and} \quad x(T^+) = \lim_{\tau \downarrow T} x(\tau) = \lim_{\varepsilon \to 0} x(T + \varepsilon). \qquad (1.20)$$
These limits are illustrated for a function x(t) graphed in Fig. 1.2. Here,

x(0) = 1, x(0+ ) = 2,

x(1− ) = 3, x(1+ ) = x(1) = 4,

x(2− ) = 3, x(2) = 2, x(2+ ) = 1,

x(3− ) = 2, x(3) = 3.

Figure 1.2: Illustration of left and right limits

In the discrete-time models introduced in Chap. 8 and applied in


Chap. 9, we use xk , uk , and λk to denote state, control, and adjoint
vectors, respectively, at time k, k = 0, 1, 2, . . . , T. We also denote the
difference operator by
Δxk := xk+1 − xk .

As in the continuous-time case, the optimal values of the state variable


xk and the control variable uk will have an asterisk as a superscript;
thus, xk∗ and uk∗ denote the corresponding quantities along an optimal
path. Once again, the adjoint variable λk along an optimal path will not
have an asterisk.
In order to specify the optimal control for linear control problems,


we will introduce a special notation, called the bang function, as




⎪ b1 if W < 0,


bang[b1 , b2 ; W ] = arbitrary if W = 0, (1.21)





⎩ b2 if W > 0.

In order to specify the optimal control for linear-quadratic problems,


we define another special function, called the sat function, as




⎪ y1 if W < y1 ,


sat[y1 , y2 ; W ] =
⎪ W if y1 ≤ W ≤ y2 , (1.22)




⎩ y2 if W > y2 .

The word “sat” is short for the word “saturation.” The latter name
comes from an electrical engineering application to saturated amplifiers.
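For readers who wish to experiment numerically, here is a minimal Python sketch of the two functions; the value returned at W = 0 in the bang function is a hypothetical tie-breaking choice, since the control there is arbitrary:

```python
def bang(b1, b2, W):
    """The bang function of (1.21)."""
    if W < 0:
        return b1
    if W > 0:
        return b2
    return (b1 + b2) / 2  # arbitrary when W = 0; midpoint chosen here

def sat(y1, y2, W):
    """The sat function of (1.22): W clipped to the interval [y1, y2]."""
    if W < y1:
        return y1
    if W > y2:
        return y2
    return W

print(bang(-1, 1, 0.3))  # 1
print(sat(2, 3, 4.5))    # 3
```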
In several applications to be discussed, we will need the concept of
impulse control, which is sometimes needed in cases when an unbounded
control can be applied for a very short time. An example is the adver-
tising model in Table 1.2 when Q = ∞. We apply unbounded control for
a short time in order to cause a jump discontinuity in the state variable.
For the example in Table 1.2, this might mean an intense advertising
campaign (a media blitz) in order to increase advertising goodwill by a
finite amount in a very short time. The impulse function defined be-
low is required to evaluate the integral in the objective function, which
measures the cost of the intense advertising campaign.
Suppose we want to apply an impulse control at time t to change the
state variable from x(t) = x1 to the value x2 “immediately” after t, i.e.,
x(t+ ) = x2 . To compute its contribution to the objective function (1.2),
we use the following procedure: given ε > 0 and a constant control u(ε),
integrate (1.1) from t to t + ε with x(t) = x1 and choose u(ε) so that
x(t + ε) = x2 ; this gives the trajectory x(τ ; ε, u(ε)) for τ ∈ [t, t + ε]. We
can now compute
$$\text{imp}(x_1, x_2; t) = \lim_{\varepsilon \to 0} \int_t^{t+\varepsilon} F(x, u, \tau)\, d\tau. \qquad (1.23)$$
If the impulse is applied only at time t, then we can calculate (1.2) as
$$J = \int_0^t F(x, u, \tau)\, d\tau + \text{imp}(x_1, x_2; t) + \int_t^T F(x, u, \tau)\, d\tau + S[x(T), T]. \qquad (1.24)$$

If there are several instants at which impulses are applied, then this
procedure is easily extended. Examples of the use of (1.24) occur in
Chaps. 5 and 6. We frequently omit t in (1.23) when the impulse function
is independent of t.

1.4.6 Convex Set and Convex Hull


A set D ⊂ E n is a convex set if for each pair of points y, z ∈ D, the
entire line segment joining these two points is also in D, i.e.,

py + (1 − p)z ∈ D, for each p ∈ [0, 1].

Given $x^i \in E^n$, i = 1, 2, . . . , l, we define $y \in E^n$ to be a convex
combination of the $x^i \in E^n$, if there exist $p_i \ge 0$ such that
$$\sum_{i=1}^{l} p_i = 1 \quad \text{and} \quad y = \sum_{i=1}^{l} p_i x^i.$$

The convex hull of a set $D \subset E^n$ is
$$\text{co}\,D := \left\{ \sum_{i=1}^{l} p_i x^i : \sum_{i=1}^{l} p_i = 1,\ p_i \ge 0,\ x^i \in D,\ i = 1, 2, \ldots, l \right\}.$$

In other words, coD is the set of all convex combinations of points in D.

1.4.7 Concave and Convex Functions


A real-valued function ψ defined on a convex set D ⊂ E n , i.e., ψ : D →
E 1 , is concave, if for each pair of points y, z ∈ D and for all p ∈ [0, 1],

ψ(py + (1 − p)z) ≥ pψ(y) + (1 − p)ψ(z).



If the inequalities in the above definition are strict for all y, z ∈ D
with y ≠ z, and 0 < p < 1, then ψ is called a strictly concave
function.
In the single dimensional case of n = 1, there is an enlightening
geometrical interpretation. Namely, ψ(x) defined on an interval D =
[a, b] is concave if, for each pair of points on the graph of ψ(x), the line
segment joining these two points lies entirely below or on the graph of
ψ(x); see Fig. 1.3.
Reverting back to the n-dimensional case, if ψ is a differentiable
function on a convex set D ⊂ E n , then it is concave, if for each pair of
points y, z ∈ D,

ψ(z) ≤ ψ(y) + ψ x (y)(z − y),


where we understand y and z to be column vectors. Furthermore, if the
function ψ is twice differentiable, then it is concave, if at each point in
D, the n × n symmetric matrix ψ xx is negative semidefinite, i.e., all of
its eigenvalues are non-positive.
Finally, if ψ is a concave function, then the negative of the function
ψ, i.e., −ψ : D → E 1 , is a convex function.

Figure 1.3: A concave function
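The second-order test is also easy to check numerically. The sketch below (an illustration not in the text) verifies concavity of the hypothetical function ψ(x₁, x₂) = −x₁² − x₂² + x₁x₂, whose Hessian happens to be constant, by confirming that all eigenvalues of the Hessian are non-positive:

```python
import numpy as np

# Hessian of psi(x1, x2) = -x1**2 - x2**2 + x1*x2 (constant everywhere)
H = np.array([[-2.0, 1.0],
              [1.0, -2.0]])
eig = np.linalg.eigvalsh(H)
print(eig)               # [-3. -1.]
print(np.all(eig <= 0))  # True: negative semidefinite, so psi is concave
```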



1.4.8 Affine Function and Homogeneous Function of Degree k
A function $\psi: E^n \to E^1$ is said to be affine, if the function $\psi(x) - \psi(0)$ is
linear. Thus, ψ can be represented as $\psi(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} a_i x_i + b$,
where $a_i$, i = 1, 2, . . . , n, and b are scalar constants.
A function ψ : E n → E 1 is said to be homogeneous of degree k, if
ψ(bx) = bk ψ(x), where b > 0 is a scalar constant.
In economics, we often assume that a firm's production function is homogeneous
of degree 1, i.e., if all inputs are multiplied by b, then output is
multiplied by b. Such a production function is said to exhibit the property
of constant returns to scale. A linear function $\psi(x) = ax = \sum_{i=1}^{n} a_i x_i$ is a
simple example of a homogeneous function of degree 1. Other examples
are $\psi(x) = \min\{x_i,\ i = 1, 2, \ldots, n\}$ and $\psi(x) = a\left(\prod_{i=1}^{n} x_i^{\alpha_i}\right)^{1/\sum_{i=1}^{n} \alpha_i}$
with a > 0 and $\alpha_i > 0$, i = 1, 2, . . . , n. An important special case of
the last example, known as the Cobb-Douglas production function, is
$\psi(K, L) = aK^{\alpha_1} L^{\alpha_2}$ with $\alpha_1 + \alpha_2 = 1$, where K and L are factors of
production called capital and labor, respectively, and a denotes the total
factor productivity.
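A quick numerical sanity check of constant returns to scale (a sketch using hypothetical parameter values a = 2, α₁ = 0.3, α₂ = 0.7):

```python
a, alpha1, alpha2 = 2.0, 0.3, 0.7  # hypothetical values with alpha1 + alpha2 = 1

def psi(K, L):
    # Cobb-Douglas production function psi(K, L) = a * K**alpha1 * L**alpha2
    return a * K**alpha1 * L**alpha2

K, L, b = 4.0, 9.0, 2.5
print(psi(b*K, b*L))   # scaling both inputs by b...
print(b * psi(K, L))   # ...scales output by b (degree-1 homogeneity)
```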

1.4.9 Saddle Point


An important concept in two-person zero-sum games is that of a saddle
point. Let ψ(x, y), a real-valued function defined on the space E n × E m ,
i.e., ψ : E n × E m → E 1 , be the payoff of player 1 and −ψ(x, y) be the
payoff of player 2, when they make decisions x and y, respectively, in a
zero-sum game. A point (x̂, ŷ) ∈ E n × E m is called a saddle point of
ψ(x, y) or of the game, if

ψ(x̂, y) ≥ ψ(x̂, ŷ) ≥ ψ(x, ŷ) for all x ∈ E n and y ∈ E m .

Note that a saddle point may not exist, and even if it exists, it may not
be unique. Note also that

$$\psi(\hat{x}, \hat{y}) = \max_{x} \psi(x, \hat{y}) = \min_{y} \psi(\hat{x}, y).$$

Intuitively, this could produce a picture like a horse saddle as shown


in Fig. 1.4, hence the name saddle point for a point like (x̂, ŷ). This
concept will be used in Sect. 13.1.
Figure 1.4: An illustration of a saddle point

1.4.10 Linear Independence and Rank of a Matrix


A set of vectors a1 , a2 , . . . , am in E n , m ≤ n, is said to be linearly de-
pendent if there exist scalars pi not all zero such that

$$\sum_{i=1}^{m} p_i a_i = 0. \qquad (1.25)$$

If (1.25) holds only when p1 = p2 = · · · = pm = 0, then the vectors are


said to be linearly independent. In particular, if one of the vectors in the
set {a1 , a2 , . . . , am } is a null vector, then the set is linearly dependent.
The rank of an m × n matrix A, written rank(A), is the maximum
number of linearly independent rows or, equivalently, the maximum num-
ber of linearly independent columns of A. An m × n matrix is of full
rank if
rank(A) = min{m, n}.

1.5 Plan of the Book


The book has thirteen chapters and five appendices: A, B, C, D, and E,
covering a variety of topics which are listed in the table of contents and
explained in the prefaces.
In any given chapter, say Chap. 7, sections are numbered consec-
utively as 7.1, 7.2, 7.3, etc. Subsections are numbered consecutively
within each section, i.e., 7.2.1, 7.2.2, 7.2.3, etc. Mathematical expres-
sions are numbered consecutively by chapter as (7.1), (7.2), (7.3), etc.
Theorems are also numbered consecutively by chapter as Theorem 7.1,
Theorem 7.2, Theorem 7.3, etc. Similarly, definitions, examples, exer-
cises, figures, propositions, remarks, and tables are numbered consec-
utively by chapter. These elements will be referenced throughout the
book by use of their designated numbers. The same scheme is used in


the appendices, thus, sections in Appendix B, for example, are numbered
as B.1, B.2, B.3, etc.

Exercises for Chapter 1

E 1.1 In Example 1.1, let the functions and parameters of the
production-inventory model be given by:

h(I) = 10I, c(P) = 20P, T = 10, I0 = 1,000,

Pmin = 600, Pmax = 1200, Imin = 800, S(t) = 900 + 10t.

(a) Set P (t) = 1000 for 0 ≤ t ≤ 10. Determine whether this control
is feasible; if it is feasible, compute the value J of the objective
function.
(b) If P (t) = 800, show that the terminal constraint is violated and
hence the control is infeasible.
(c) If P (t) = Pmin for 0 ≤ t ≤ 6 and P (t) = Pmax for 6 < t ≤ 10,
show that the control is infeasible because the state constraint is
violated.

E 1.2 In Example 1.1, suppose there is a cost associated with changing


the rate of production. One way to formulate this problem is to let
the control variable u(t) denote the rate of change of the production
rate P (t), having a cost cu2 associated with such changes, where c > 0.
Formulate the new problem.

Hint: Let P (t) be an additional state variable.



E 1.3 For the advertising model in Example 1.2, let π(G) = 2√G, δ =
0.05, ρ = 0.2, Q = 2, and G0 = 16. Set u(t) = 0.8 for t ≥ 0, and show
that G(t) is constant for all t. Compute the value J of the objective
function.

E 1.4 In Example 1.2, suppose G measures the number of people who


know about the product. Hence, if A is the total population, then A − G
is the number of people who do not know about the product. If u(t)
measures the advertising rate at time t, assume that u(A − G) is the
corresponding rate of increase of G due to this advertising. Formulate
the new model.
E 1.5 Rich Rentier in Example 1.3 has initial wealth W0 = $1,000,000.
Assume B = 0, ρ = 0.1, r = 0.15, and assume that Rich expects to live
for exactly 20 years.

(a) What is the maximum constant consumption level that Rich can
afford during his remaining life?
(b) If Rich’s utility function is U (C) = ln C, what is the present value
of the total utility in part (a)?
(c) Suppose Rich sets aside $100,000 to start the Rentier Foundation.
What is the maximum constant grant level that the foundation can
support if it is to last forever?

E 1.6 Suppose Rich in Exercise 1.5 takes on a part-time job, which


yields an income of y(t) at time t. Assume y(t) = 10,000e−0.05t and that
he has a bequest function B(W ) = 0.5 ln W.

(a) Reformulate this new optimal control problem.

(b) If Rich (no longer a rentier) consumes at the constant rate found in
Exercise 1.5(a), find his terminal wealth and his new total utility.

E 1.7 Consider the following educational policy question. Let S(t) de-
note the total number of scientists at time t, and let δ be the retirement
rate of scientists. Let E(t) be the number of teaching scientists and R(t)
be the number of research scientists, so that S(t) = E(t) + R(t). Assume
γE(t) is the number of newly graduated scientists at time t, of which
the policy allocates uγE(t) to the pool of teachers, where 0 ≤ u ≤ 1.
The remaining graduates are added to the pool of researchers. The gov-
ernment has a target of maximizing the function αE(T ) + βR(T ) at a
given future time T, where α and β are positive constants. Formulate
the optimal control problem for the government.

E 1.8 For F (x, y) defined in Example 1.5, obtain the matrices Fxx and
Fyy .

E 1.9 Let x ∈ E m , g be an n-component row vector function of x, and


f be an n-component column vector function of x. Use the ordinary
product rule of calculus for functions of scalars to derive the formula

(gf )x = gfx + f T (g T )x = gfx + f T gx .


E 1.10 Let F be a scalar function of x ∈ E n and f as defined in


Exercise 1.9. Assume F to be twice continuously differentiable. Show
that

(Fx f )x =Fx fx + f T Fxx = Fx fx + f T (Fxx )T = Fx fx + (Fxx f )T .

Hint: Set the gradient Fx = g, a row vector, and then use Exer-
cise 1.9 to derive the first equality. Note in connection with the second
equality that the function F being twice continuously differentiable
implies that Fxx = (Fxx )T .

E 1.11 For Fy obtained in Example 1.5 and f defined in Example 1.6,


obtain (Fy f )y and verify the relation shown in Exercise 1.10.

E 1.12 Use the bang function defined in (1.21) to sketch the optimal
control
u∗ (t) = bang[−1, 1; W (t)] for 0 ≤ t ≤ 5,
when

(a) W (t) = t − 2

(b) W (t) = t2 − 4t + 3

(c) W (t) = sin πt.

E 1.13 Use the sat function defined in (1.22) to sketch the optimal con-
trol
u∗ (t) = sat[2, 3; W (t)] for 0 ≤ t ≤ 5,
when

(a) W (t) = 4 − t

(b) W (t) = 2 + t2

(c) W (t) = 4 − 4e−t .

E 1.14 Evaluate the function imp(G1 , G2 ; t) for the advertising model


of Table 1.2 when G2 > G1 , Q = ∞, and π(G) = pG, where p is a
constant.
Chapter 2

The Maximum Principle:


Continuous Time

The main purpose of this chapter is to introduce the maximum principle


as a necessary condition that must be satisfied by any optimal control
for the basic problem specified in Sect. 2.1. Although vector notation is
used, the reader can consider the problem as one with only a single state
variable and a single control variable on the first reading. In Sect. 2.2,
the method of dynamic programming is used to derive the maximum
principle. We use this method because of the simplicity and familiarity
of the dynamic programming concept. The derivation also yields signifi-
cant economic interpretations. In Appendix C, the maximum principle is
also derived by using a more general method similar to that of Pontrya-
gin et al. (1962), but with certain simplifications. In Sect. 2.3, we apply
the maximum principle to solve a number of simple, but illustrative, ex-
amples. In Sect. 2.4, the maximum principle is shown to be sufficient for
optimal control under an appropriate concavity condition, which holds in
many management science applications. Finally, Sect. 2.5 illustrates the
use of Excel spreadsheet software to solve an optimal control problem.

2.1 Statement of the Problem


Optimal control theory deals with the problem of optimizing dynamic
systems. The problem must be well posed before any solution can be
attempted. This requires a clear mathematical description of the system

to be optimized, the constraints imposed on the system, and the objective


function to be maximized (or minimized).

2.1.1 The Mathematical Model


An important part of any control problem is the process of modeling
the dynamic system under consideration, be it physical, business, or
otherwise. The aim is to arrive at a mathematical description which is
simple enough to deal with, and realistic enough to be able to predict
the response of the system to any given input. Our model is restricted
to systems that can be characterized by a set of ordinary differential
equations (or, ordinary difference equations in the discrete-time case
treated in Chap. 8). Thus, given the initial state x0 of the system and
control history u(t), t ∈ [0, T ], of the process, the evolution of the system
may be described by the first-order differential equation, known also as
the state equation,
ẋ(t) = f (x(t), u(t), t), x(0) = x0 , (2.1)
where the vector of state variables, x(t) ∈ E n , the vector of control vari-
ables, u(t) ∈ E m , and f : E n × E m × E 1 → E n . Furthermore, the
function f is assumed to be continuously differentiable. Here we assume
x to be a column vector and f to be a column vector of functions. The
path x(t), t ∈ [0, T ], is called a state trajectory and u(t), t ∈ [0, T ], is
called a control trajectory or simply, a control. The terms vector of state
variables, state vector, and state will be used interchangeably; similarly
for the terms vector of control variables, control vector, and control. As
mentioned earlier, when no confusion arises, we will usually suppress the
time notation (t); thus, e.g., x(t) will be written simply as x. Further-
more, it should be inferred from the context whether x denotes the state
at time t or the entire state trajectory. A similar statement holds for u.
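As an aside (not part of the text's development), given a control history u(t), the corresponding state trajectory can be computed numerically by integrating (2.1). The following Python sketch uses SciPy with a hypothetical one-dimensional model f = u, the constant control u(t) ≡ −1, and x0 = 1:

```python
import numpy as np
from scipy.integrate import solve_ivp

T, x0 = 1.0, 1.0
u = lambda t: -1.0          # an admissible (here constant) control history

def f(t, x):
    # state equation (2.1): x_dot = f(x, u, t); for this toy model, f = u
    return [u(t)]

sol = solve_ivp(f, (0.0, T), [x0], max_step=0.01)
print(sol.y[0, -1])         # x(T) = x0 - T = 0.0
```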

2.1.2 Constraints
In this chapter, we are concerned with problems of types (1.4) and (1.5)
that do not have state constraints. Such constraints are considered in
Chaps. 3 and 4, as indicated in Sect. 1.1. We do impose constraints of
type (1.3) on the control variables. We define an admissible control to
be a control trajectory u(t), t ∈ [0, T ], which is piecewise continuous and
satisfies, in addition,
u(t) ∈ Ω(t) ⊂ E m , t ∈ [0, T ]. (2.2)
Usually the set Ω(t) is determined by physical or economic constraints


on the values of the control variables at time t.

2.1.3 The Objective Function


An objective function is a quantitative measure of the performance of
the system over time. An optimal control is defined to be an admissible
control which maximizes the objective function. In business or economic
problems, a typical objective function gives some appropriate measure
of quantities such as profit or sales. If the aim is to minimize cost,
then the objective function to be maximized is the negative of cost.
Mathematically, we let
$$J = \int_0^T F(x(t), u(t), t)\, dt + S(x(T), T) \qquad (2.3)$$
denote the objective function, where the functions F : E n × E m × E 1 →
E 1 and S : E n × E 1 → E 1 are assumed for our purposes to be contin-
uously differentiable. In a typical business application, F (x, u, t) could
be the instantaneous profit rate and S(x, T ) could be the salvage value
of having x as the system state at the terminal time T.

2.1.4 The Optimal Control Problem


Given the preceding definitions we can state the optimal control problem,
which we will be concerned with in this chapter. The problem is to find
an admissible control u∗ , which maximizes the objective function (2.3)
subject to the state equation (2.1) and the control constraints (2.2). We
now restate the optimal control problem as:
$$\begin{cases} \displaystyle \max_{u(t) \in \Omega(t)} \left\{ J = \int_0^T F(x, u, t)\, dt + S(x(T), T) \right\} \\[8pt] \text{subject to} \\[4pt] \dot{x} = f(x, u, t), \quad x(0) = x_0. \end{cases} \qquad (2.4)$$

The control u∗ is called an optimal control and x∗ , determined by means


of the state equation with u = u∗ , is called the optimal trajectory or an
optimal path. The optimal value J(u∗ ) of the objective function will be
denoted as J∗, and occasionally as J∗(x0) when we need to emphasize its
dependence on the initial state x0.
The optimal control problem (2.4) is said to be in Bolza form because
of the form of the objective function in (2.3). It is said to be in Lagrange
form when S ≡ 0. We say the problem is in Mayer form when F ≡ 0.
Furthermore, it is in linear Mayer form when F ≡ 0 and S is linear, i.e.,


$$\begin{cases} \displaystyle \max_{u(t) \in \Omega(t)} \{ J = cx(T) \} \\[8pt] \text{subject to} \\[4pt] \dot{x} = f(x, u, t), \quad x(0) = x_0, \end{cases} \qquad (2.5)$$

where c = (c1 , c2 , · · · , cn ) is an n-dimensional row vector of given con-


stants. In the next paragraph and in Exercise 2.5, it will be demonstrated
that all of these forms can be converted into the linear Mayer form.
To show that the Bolza form can be reduced to the linear Mayer
form, we define a new state vector y = (y1 , y2 , . . . , yn+1 ), having n + 1
components defined as follows: yi = xi for i = 1, . . . , n and yn+1 defined
by the solution of the equation
$$\dot{y}_{n+1} = F(x, u, t) + \frac{\partial S(x, t)}{\partial x} f(x, u, t) + \frac{\partial S(x, t)}{\partial t}, \qquad (2.6)$$
with yn+1 (0) = S(x0 , 0). By writing f (x, u, t) as f (y, u, t), with a slight
abuse of notation, and by denoting the right-hand side of (2.6) as
fn+1 (y, u, t), we can write the new state equation in the vector form
as
$$\dot{y} = \begin{pmatrix} \dot{x} \\ \dot{y}_{n+1} \end{pmatrix} = \begin{pmatrix} f(y, u, t) \\ f_{n+1}(y, u, t) \end{pmatrix}, \quad y(0) = \begin{pmatrix} x_0 \\ S(x_0, 0) \end{pmatrix}. \qquad (2.7)$$
We also put c = (0, · · · , 0, 1), where c has n + 1 components with the
first n terms all 0. If we integrate (2.6) from 0 to T, we see that
$$y_{n+1}(T) - y_{n+1}(0) = \int_0^T F(x, u, t)\, dt + S(x(T), T) - S(x_0, 0).$$
In view of setting the initial condition as yn+1 (0) = S(x0 , 0), the
problem in (2.4) can be expressed as that of maximizing
$$J = \int_0^T F(x, u, t)\, dt + S(x(T), T) = y_{n+1}(T) = cy(T) \qquad (2.8)$$
over u(t) ∈ Ω(t), subject to (2.7). Of course, the price paid for going
from Bolza to linear Mayer form is an additional state variable and its
associated differential equation (2.6). Also, for the function fn+1 to be
continuously differentiable, in keeping with the assumptions made in
Sect. 2.1.1, we need to assume that the salvage value function S(x, t) is
twice continuously differentiable.
Exercise 2.5 presents the task of showing in a similar way that the
Lagrange and Mayer forms can also be reduced to the linear Mayer
form.

Example 2.1 Convert the following single-state problem in Bolza form


to its linear Mayer form:
$$\max \left\{ J = \int_0^T \left( x - \frac{u^2}{2} \right) dt + \frac{1}{4}[x(T)]^2 \right\}$$

subject to
ẋ = u, x(0) = x0 .

Solution. We use (2.6) to introduce the additional state variable y2 as


follows:
$$\dot{y}_2 = x - \frac{u^2}{2} + \frac{1}{2}xu, \quad y_2(0) = \frac{1}{4}x_0^2.$$
Then,
$$\begin{aligned}
y_2(T) &= y_2(0) + \int_0^T \left( x - \frac{u^2}{2} + \frac{1}{2}xu \right) dt \\
&= \int_0^T \left( x - \frac{u^2}{2} \right) dt + \int_0^T \frac{1}{2}x\dot{x}\, dt + y_2(0) \\
&= \int_0^T \left( x - \frac{u^2}{2} \right) dt + \int_0^T d\!\left( \frac{1}{4}x^2 \right) + y_2(0) \\
&= \int_0^T \left( x - \frac{u^2}{2} \right) dt + \frac{1}{4}[x(T)]^2 - \frac{1}{4}x_0^2 + y_2(0) \\
&= \int_0^T \left( x - \frac{u^2}{2} \right) dt + \frac{1}{4}[x(T)]^2 \\
&= J.
\end{aligned}$$
Thus, the linear Mayer form version with the two-dimensional state y =
(x, y2) can be stated as
$$\max \{ J = y_2(T) \}$$
subject to
$$\dot{x} = u, \quad x(0) = x_0,$$
$$\dot{y}_2 = x - \frac{u^2}{2} + \frac{1}{2}xu, \quad y_2(0) = \frac{1}{4}x_0^2.$$
In Sect. 2.2, we derive necessary conditions for optimal control in the
form of the maximum principle, and in Sect. 2.4 we derive sufficient con-
ditions. In these derivations, we shall assume the existence of an optimal
control, while providing references where needed, as the topic of existence
is beyond the scope of this book. In any particular application, however,
the existence of a solution will be demonstrated by actually finding a
solution that satisfies both the necessary and the sufficient conditions
for optimality. We thus avoid the necessity of having to prove general
existence theorems, which require advanced and difficult mathematics.
Nevertheless, interested readers can consult Hartl et al. (1995) and Seier-
stad and Sydsæter (1987) for brief discussions of existence results and
references therein including Cesari (1983).

2.2 Dynamic Programming and the Maximum Principle
We will now derive the maximum principle by using a dynamic pro-
gramming approach. The proof is intuitive in nature and is not intended
to be mathematically rigorous. For more rigorous derivations, we refer
the reader to Appendix C, Berkovitz (1961), Pontryagin et al. (1962),
Halkin (1967), Boltyanskii (1971), Hartberger (1973), Bryant and Mayne
(1974), Leitmann (1981), and Seierstad and Sydsæter (1987). Additional
references can be found in the survey by Hartl et al. (1995). For discus-
sions of maximum principles for more general optimal control problems,
including those with nondifferentiable functions, see Clarke (1983, 1989).

2.2.1 The Hamilton-Jacobi-Bellman Equation


Suppose V (x, t) : E n × E 1 → E 1 is a function whose value is the maxi-
mum value of the objective function of the control problem for the sys-
tem, given that we start at time t in state x. That is,
$$V(x, t) = \max_{u(s) \in \Omega(s)} \left\{ \int_t^T F(x(s), u(s), s)\, ds + S(x(T), T) \right\}, \qquad (2.9)$$

where for s ≥ t,

$$\frac{dx(s)}{ds} = f(x(s), u(s), s), \quad x(t) = x.$$
We initially assume that the value function V (x, t) exists for all x and t
in the relevant ranges. Later we will make additional assumptions about
the function V (x, t).
Bellman (1957) in his book on dynamic programming states the prin-
ciple of optimality as follows:

An optimal policy has the property that, whatever the


initial state and initial decision are, the remaining decision
must constitute an optimal policy with regard to the outcome
resulting from the initial decision.

Intuitively this principle is obvious, for if we were to start in state x


at time t and did not follow an optimal path from then on, there would
then exist (by assumption) a better path from t to T, hence, we could
improve the proposed solution by following this better path. We will
use the principle of optimality to derive conditions on the value function
V (x, t).
Figure 2.1 is a schematic picture of the optimal path x∗ (t) in the
state-time space, and two nearby points (x, t) and (x + δx, t + δt), where
δt is a small increment of time and x + δx = x(t + δt). The value function
changes from V (x, t) to V (x + δx, t + δt) between these two points. By
the principle of optimality, the change in the objective function is made
up of two parts: first, the incremental change in J from t to t + δt, which
is given by the integral of F (x, u, t) from t to t + δt; second, the value
function V (x + δx, t + δt) at time t + δt. The control actions u(τ ) should
be chosen to lie in Ω(τ ), τ ∈ [t, t + δt], and to maximize the sum of these
two terms. In equation form this is
$$V(x, t) = \max_{\substack{u(\tau) \in \Omega(\tau) \\ \tau \in [t, t+\delta t]}} \left\{ \int_t^{t+\delta t} F[x(\tau), u(\tau), \tau]\, d\tau + V[x(t + \delta t), t + \delta t] \right\}, \qquad (2.10)$$
Figure 2.1: An optimal path in the state-time space

where δt represents a small increment in t. It is instructive to compare


this equation to definition (2.9).
Since F is a continuous function, the integral in (2.10) is approxi-
mately F (x, u, t)δt so we can rewrite (2.10) as

$$V(x, t) = \max_{u \in \Omega(t)} \{ F(x, u, t)\delta t + V[x(t + \delta t), t + \delta t] \} + o(\delta t), \qquad (2.11)$$

where o(δt) denotes a collection of higher-order terms in δt. (By the
definition given in Sect. 1.4.4, o(δt) is a function such that
$\lim_{\delta t \to 0} o(\delta t)/\delta t = 0$.)
We now make an assumption that we will return to again later. We
assume that the value function V is a continuously differentiable function
of its arguments. This allows us to use the Taylor series expansion of V
with respect to δt and obtain

V [x(t + δt), t + δt] = V (x, t) + [Vx (x, t)ẋ + Vt (x, t)]δt + o(δt), (2.12)

where Vx and Vt are partial derivatives of V (x, t) with respect to x and


t, respectively.
Substituting for ẋ from (2.1) in the above equation and then using it
in (2.11), we obtain

$$V(x, t) = \max_{u \in \Omega(t)} \{ F(x, u, t)\delta t + V(x, t) + V_x(x, t) f(x, u, t)\delta t + V_t(x, t)\delta t \} + o(\delta t). \qquad (2.13)$$

Canceling V (x, t) on both sides and then dividing by δt we get

$$0 = \max_{u \in \Omega(t)} \{ F(x, u, t) + V_x(x, t) f(x, u, t) + V_t(x, t) \} + \frac{o(\delta t)}{\delta t}. \qquad (2.14)$$

Now we let δt → 0 and obtain the following equation

$$0 = \max_{u \in \Omega(t)} \{ F(x, u, t) + V_x(x, t) f(x, u, t) + V_t(x, t) \}, \qquad (2.15)$$

for which the boundary condition is

V (x, T ) = S(x, T ). (2.16)

This boundary condition follows from the fact that the value function at
t = T is simply the salvage value function.
The components of the vector Vx (x, t) can be interpreted as the
marginal contributions of the state variables x to the value function
or the maximized objective function (2.9). We denote the marginal re-
turn vector (along the optimal path x∗ (t)) by the adjoint (row) vector
λ(t) ∈ E n , i.e.,

λ(t) = Vx (x∗ (t), t) := Vx (x, t) |x=x∗ (t) . (2.17)

From the preceding remark, we can interpret λ(t) as the per unit change
in the objective function value for a small change in x∗ (t) at time t. In
other words, λ(t) is the highest hypothetical unit price which a rational
decision maker would be willing to pay for an infinitesimal addition to
x∗ (t). See Sect. 2.2.4 for further discussion.
Next we introduce a function H : E n × E m × E n × E 1 → E 1 called
the Hamiltonian

H(x, u, λ, t) = F (x, u, t) + λf (x, u, t). (2.18)

We can then rewrite Eq. (2.15) as the equation

$$\max_{u \in \Omega(t)} \left[ H(x, u, V_x, t) + V_t \right] = 0, \qquad (2.19)$$
called the Hamilton-Jacobi-Bellman equation or, simply, the HJB equa-


tion to be satisfied along an optimal path. Note that it is possible
to take Vt out of the maximizing operation since it does not depend
on u.
The Hamiltonian maximizing condition of the maximum principle
can be obtained from (2.19) and (2.17) by observing that, if x∗ (t) and
u∗ (t) are optimal values of the state and control variables and λ(t) is the
corresponding value of the adjoint variable at time t, then the optimal
control u∗ (t) must satisfy (2.19), i.e., for all u ∈ Ω(t),

$$H[x^*(t), u^*(t), \lambda(t), t] + V_t(x^*(t), t) \ge H[x^*(t), u, \lambda(t), t] + V_t(x^*(t), t). \qquad (2.20)$$

Canceling the term Vt on both sides, we obtain the Hamiltonian maxi-


mizing condition

H[x∗ (t), u∗ (t), λ(t), t] ≥ H[x∗ (t), u, λ(t), t] (2.21)

for all u ∈ Ω(t).


In order to complete the statement of the maximum principle, we
must still obtain the adjoint equation.

Remark 2.1 We use u∗ and x∗ for optimal control and state to distin-
guish them from an admissible control u and the corresponding state x,
respectively. However, since the adjoint variable λ is defined only along
the optimal path, there is no need for such a distinction, and therefore
we do not use the superscript ∗ on λ.

2.2.2 Derivation of the Adjoint Equation


The derivation of the adjoint equation proceeds from the HJB equation
(2.19), and is similar to those in Fel’dbaum (1965) and Kirk (1970). Note
that, given the optimal path x∗ , the optimal control u∗ maximizes the
left-hand side of (2.19), and its maximum value is zero. We now consider
small perturbations of the values of the state variables in a neighborhood
of the optimal path x∗ . Thus, let

x(t) = x∗ (t) + δx(t), (2.22)

where ‖δx(t)‖ < ε for a small positive ε.



We now consider a ‘fixed’ time instant t. We can then write (2.19) as


0 = H[x∗ (t), u∗ (t), Vx (x∗ (t), t), t] + Vt (x∗ (t), t)
≥ H[x(t), u∗ (t), Vx (x(t), t), t] + Vt (x(t), t). (2.23)
To explain, we note from (2.19) that the left-hand side of ≥ in (2.23)
equals zero. The right-hand side can attain the value zero only if u∗ (t)
is also an optimal control for x(t). In general, for x(t) = x∗ (t), this will
not be so. From this observation, it follows that the expression on the
right-hand side of (2.23) attains its maximum (of zero) at x(t) = x∗ (t).
Furthermore, x(t) is not explicitly constrained. In other words, x∗ (t) is
an unconstrained local maximum of the right-hand side of (2.23), so that
the derivative of this expression with respect to x must vanish at x∗ (t),
i.e.,
Hx [x∗ (t), u∗ (t), Vx (x∗ (t), t), t] + Vtx (x∗ (t), t) = 0, (2.24)
provided the derivative exists, and for which, we must further assume
that V is a twice continuously differentiable function of its arguments.
With H = F + Vx f from (2.17) and (2.18), we obtain
Hx = Fx + Vx fx + f T Vxx = Fx + Vx fx + (Vxx f )T
by using g = Vx in the identity (1.15). Substituting this in (2.24) and
recognizing the fact that Vxx = (Vxx )T , we obtain
Fx + Vx fx + f T Vxx + Vtx = Fx + Vx fx + (Vxx f )T + Vtx = 0, (2.25)
where the superscript T denotes the transpose operation. See (1.16) or
Exercise 1.10 for further explanation.
The derivation of the necessary condition (2.25) is the crux of the
reasoning in the derivation of the adjoint equation. It is easy to obtain
the so-called adjoint equation from it. We begin by taking the time
derivative of $V_x(x, t)$. Thus,
$$\begin{aligned}
\frac{dV_x}{dt} &= \left( \frac{dV_{x_1}}{dt}, \frac{dV_{x_2}}{dt}, \cdots, \frac{dV_{x_n}}{dt} \right) \\
&= (V_{x_1 x}\dot{x} + V_{x_1 t},\ V_{x_2 x}\dot{x} + V_{x_2 t},\ \cdots,\ V_{x_n x}\dot{x} + V_{x_n t}) \\
&= \left( \sum_{i=1}^{n} V_{x_1 x_i}\dot{x}_i,\ \sum_{i=1}^{n} V_{x_2 x_i}\dot{x}_i,\ \cdots,\ \sum_{i=1}^{n} V_{x_n x_i}\dot{x}_i \right) + (V_x)_t \\
&= (V_{xx}\dot{x})^T + V_{xt} \\
&= (V_{xx} f)^T + V_{tx}.
\end{aligned} \qquad (2.26)$$
Note in the above that

$$V_{x_i x} = (V_{x_i x_1}, V_{x_i x_2}, \cdots, V_{x_i x_n})$$
and
$$V_{xx}\dot{x} = \begin{bmatrix} V_{x_1 x_1} & V_{x_1 x_2} & \cdots & V_{x_1 x_n} \\ V_{x_2 x_1} & V_{x_2 x_2} & \cdots & V_{x_2 x_n} \\ \vdots & \vdots & & \vdots \\ V_{x_n x_1} & V_{x_n x_2} & \cdots & V_{x_n x_n} \end{bmatrix} \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \vdots \\ \dot{x}_n \end{bmatrix}. \qquad (2.27)$$

Since the terms on the right-hand side of (2.26) are the same as the
last two terms in (2.25), we see that (2.26) becomes

$$\frac{dV_x}{dt} = -F_x - V_x f_x. \qquad (2.28)$$
Because λ was defined in (2.17) to be Vx , we can rewrite (2.28) as

λ̇ = −Fx − λfx .

To see that the right-hand side of this equation can be written simply as
−Hx , we need to go back to the definition of H in (2.18) and recognize
that when taking the partial derivative of H with respect to x, the adjoint
variables λ are considered to be independent of x. We note further that
along the optimal path, λ is a function of t only. Thus,

λ̇ = −Hx . (2.29)

Also, from the definition of λ in (2.17) and the boundary condition


(2.16), we have the terminal boundary condition, which is also called
the transversality condition:

$$\lambda(T) = \frac{\partial S(x, T)}{\partial x}\bigg|_{x = x^*(T)} = S_x(x^*(T), T). \qquad (2.30)$$
The adjoint equation (2.29) together with its boundary condition (2.30)
determine the adjoint variables.
This completes our derivation of the maximum principle using dy-
namic programming. We can now summarize the main results in the
following section.
2.2.3 The Maximum Principle


The necessary conditions for u∗ (t), t ∈ [0, T ], to be an optimal control
are:




$$\begin{cases}
\dot{x}^* = f(x^*, u^*, t), \quad x^*(0) = x_0, \\[4pt]
\dot{\lambda} = -H_x[x^*, u^*, \lambda, t], \quad \lambda(T) = S_x(x^*(T), T), \\[4pt]
H[x^*, u^*, \lambda, t] \ge H[x^*, u, \lambda, t], \quad \forall\, u \in \Omega(t),\ t \in [0, T].
\end{cases} \qquad (2.31)$$
It should be emphasized that the state and the adjoint arguments


of the Hamiltonian are x∗ (t) and λ(t) on both sides of the Hamiltonian
maximizing condition in (2.31), respectively. Furthermore, u∗ (t) must
provide a global maximum of the Hamiltonian H[x∗ (t), u, λ(t), t] over
u ∈ Ω(t). For this reason the necessary conditions in (2.31) are called
the maximum principle.
Note that in order to apply the maximum principle, we must simulta-
neously solve two sets of differential equations with u∗ obtained from the
Hamiltonian maximizing condition in (2.31). With the control variable
u∗ so obtained, the state equation for x∗ is given with the initial value
x0 , and the adjoint equation for λ is specified with a condition on the
terminal value λ(T ). Such a system of equations, where initial values of
some variables and final values of other variables are specified, is called
a two-point boundary value problem (TPBVP). The general solution of
such problems can be very difficult; see Bryson and Ho (1975), Roberts
and Shipman (1972), and Feichtinger and Hartl (1986). However, there
are certain special cases which are easy. One such is the case in which the
adjoint equation is independent of the state and the control variables;
here we can solve the adjoint equation first, then get the optimal control
u∗ , and then solve for x∗ .
Note also that if we can solve the Hamiltonian maximizing condition
for an optimal control function in closed form u∗ (x, λ, t) so that
u∗ (t) = u∗ [x∗ (t), λ(t), t],
then we can substitute this into the state and adjoint equations to get
the TPBVP just in terms of a set of differential equations, i.e.,


$$\begin{cases}
\dot{x}^* = f(x^*, u^*(x^*, \lambda, t), t), \quad x^*(0) = x_0, \\[4pt]
\dot{\lambda} = -H_x(x^*, u^*(x^*, \lambda, t), \lambda, t), \quad \lambda(T) = S_x(x^*(T), T).
\end{cases} \qquad (2.32)$$
We should note that we are making a slight abuse of notation here by


using u∗ (x, λ, t) to denote the optimal control function and u∗ (t) as the
optimal control at time t. Thus, depending on the context, when we
use u∗ without any argument, it may mean the optimal control function
u∗ (x, λ, t), or the optimal control at time t, or the entire optimal control
trajectory {u∗ (t), t ∈ [0, T ]}.
In Sect. 2.5, we derive the TPBVP for a specific example, and solve
its discrete version by using Excel. In subsequent chapters we will solve
many TPBVPs of varying degrees of difficulty.
One final remark should be made. Because an integral is unaffected
by values of the integrand at a finite set of points, some of the arguments
made in this chapter may not hold at a finite set of points. This does
not affect the validity of the results.
In the next section, we give economic interpretations of the maximum
principle, and in Sect. 2.3, we solve five simple examples by using the
maximum principle.

2.2.4 Economic Interpretations of the Maximum Principle
Recall from Sect. 2.1.3 that the objective function (2.3) is
$$J = \int_0^T F(x, u, t)\, dt + S(x(T), T),$$

where F is considered to be the instantaneous profit rate measured in


dollars per unit of time, and S(x, T ) is the salvage value, in dollars, of the
system at time T when the terminal state is x. For purposes of discussion
it will be convenient to consider the system as a firm and the state x(t)
as the stock of capital at time t.
In (2.17), we interpreted λ(t) to be the per unit change in the value
function V (x, t) for small changes in capital stock x. In other words, λ(t)
is the marginal value per unit of capital at time t, and it is also referred
to as the price or shadow price of a unit of capital at time t. In particular,
the value of λ(0) is the marginal rate of change of the maximum value
of J (the objective function) with respect to the change in the initial
capital stock, x0 .

Remark 2.2 As mentioned in Appendix C, where we prove a maximum


principle without any smoothness assumption on the value function,
there arise cases in which the value function may not be differentiable
with respect to the state variables. In such cases, when Vx (x∗ (t), t) does
not exist, then (2.17) has no meaning. See Bettiol and Vinter (2010),
Yong and Zhou (1999), and Cernea and Frankowska (2005) for interpre-
tations of the adjoint variables or extensions of (2.17) in such cases.

Next we interpret the Hamiltonian function in (2.18). Multiplying


(2.18) formally by dt and using the state equation (2.1) gives

Hdt = F dt + λf dt = F dt + λẋdt = F dt + λdx.

The first term F (x, u, t)dt represents the direct contribution to J in dol-
lars from time t to t + dt, if the firm is in state x (i.e., it has a capital
stock of x), and we apply control u in the interval [t, t + dt]. The differ-
ential dx = f (x, u, t)dt represents the change in capital stock from time t
to t + dt, when the firm is in state x and control u is applied. Therefore,
the second term λdx represents the value in dollars of the incremental
capital stock dx, and hence can be considered as the indirect contribution
to J in dollars. Thus, Hdt can be interpreted as the total contribution
to J from time t to t + dt when x(t) = x and u(t) = u in the interval
[t, t + dt].
With this interpretation, it is easy to see why the Hamiltonian must
be maximized at each instant of time t. If we were just to maximize
F at each instant t, we would not be maximizing J, because we would
ignore the effect of the control in changing the capital stock, which gives
rise to indirect contributions to J. The maximum principle derives the
adjoint variable λ(t), the price of capital at time t, in such a way that
λ(t)dx is the correct valuation of the indirect contribution to J from
time t to t + dt. As a consequence, the Hamiltonian maximizing problem
can be treated as a static problem at each instant t. In other words, the
maximum principle decouples the dynamic maximization problem (2.4)
in the interval [0, T ] into a set of static maximization problems associated
with instants t in [0, T ]. Thus, the Hamiltonian can be interpreted as a
surrogate profit rate to be maximized at each instant of time t.
The value of λ to be used in the maximum principle is given by (2.29)
and (2.30), i.e.,
$$\dot{\lambda} = -\frac{\partial H}{\partial x} = -\frac{\partial F}{\partial x} - \lambda \frac{\partial f}{\partial x}, \quad \lambda(T) = S_x(x(T), T).$$
Rewriting the first equation as

−dλ = Hx dt = Fx dt + λfx dt,



we can observe that along the optimal path, −dλ, the negative of the
increase or, in other words, the decrease in the price of capital from t
to t + dt, which can be considered as the marginal cost of holding that
capital, equals the marginal revenue Hx dt of investing the capital. In turn
the marginal revenue Hx dt consists of the sum of the direct marginal
contribution Fx dt and the indirect marginal contribution λfx dt. Thus,
the adjoint equation becomes the equilibrium relation—marginal cost
equals marginal revenue, which is a familiar concept in the economics
literature; see, e.g., Cohen and Cyert (1965, p. 189) or Takayama (1974,
p. 712).
Further insight can be obtained by integrating the above adjoint
equation from t to T as follows:
$$\lambda(t) = \lambda(T) + \int_t^T H_x(x(\tau), u(\tau), \lambda(\tau), \tau)\, d\tau = S_x(x(T), T) + \int_t^T H_x\, d\tau.$$

Note that the price λ(T ) of a unit of capital at time T is its marginal
salvage value Sx (x(T ), T ). In the special case when S ≡ 0, we have
λ(T ) = 0, as clearly no value can be derived or lost from an infinitesimal
increase in x(T ). The price λ(t) of a unit of capital at time t is the sum of
its terminal price λ(T ) plus the integral of the marginal surrogate profit
rate Hx from t to T.
The above interpretations show that the adjoint variables behave
in much the same way as the dual variables in linear (and nonlinear)
programming, with the differences being that here the adjoint variables
are time dependent and satisfy derived differential equations. These
connections will become clearer in Chap. 8, which addresses the discrete
maximum principle.

2.3 Simple Examples


In order to absorb the maximum principle, the reader should study very
carefully the examples in this section, all of which are problems having
only one state and one control variable. Some or all of the exercises at
the end of the chapter should also be worked.
In the following examples and others in this book, we will at times
omit the superscript ∗ on the optimal values of the state variables as
long as no confusion arises from doing so.
Example 2.2 Consider the problem:
$$\max \left\{ J = \int_0^1 -x\, dt \right\} \qquad (2.33)$$

subject to the state equation


ẋ = u, x(0) = 1 (2.34)
and the control constraint
u ∈ Ω = [−1, 1]. (2.35)
Note that T = 1, F = −x, S = 0, and f = u. Because F = −x, we can
interpret the problem as one of minimizing the (signed) area under the
curve x(t) for 0 ≤ t ≤ 1.

Solution First, we form the Hamiltonian


H = −x + λu (2.36)
and note that, because the Hamiltonian is linear in u, the form of the
optimal control, i.e., the one that would maximize the Hamiltonian, is




$$u^*(t) = \begin{cases} 1 & \text{if } \lambda(t) > 0, \\ \text{arbitrary} & \text{if } \lambda(t) = 0, \\ -1 & \text{if } \lambda(t) < 0, \end{cases} \qquad (2.37)$$

or referring to the notation in Sect. 1.4,


u∗ (t) = bang[−1, 1; λ(t)]. (2.38)
To find λ, we write the adjoint equation
λ̇ = −Hx = 1, λ(1) = Sx (x(T ), T ) = 0. (2.39)
Because this equation does not involve x and u, we can easily solve it as
λ(t) = t − 1. (2.40)
It follows that λ(t) = t − 1 < 0 for t ∈ [0, 1), and so u∗(t) = −1 for t ∈ [0, 1).
Since λ(1) = 0, for simplicity we can also set u∗(1) = −1 at the single
point t = 1. We can then specify the optimal control to be
u∗(t) = −1 for all t ∈ [0, 1].
Substituting this into the state equation (2.34) we have

ẋ = −1, x(0) = 1, (2.41)

whose solution is
x∗ (t) = 1 − t for t ∈ [0, 1]. (2.42)
The graphs of the optimal state and adjoint trajectories appear in
Fig. 2.2. Note that the optimal value of the objective function is
J ∗ = −1/2.

Figure 2.2: Optimal state and adjoint trajectories for Example 2.2

In Sect. 2.2.4, we stated that the adjoint variable λ(t) gives the
marginal value per unit increment in the state variable x(t) at time t.
Let us illustrate this claim at time t = 0 with the help of Example 2.2.
Note from (2.40) that λ(0) = −1. Thus, if we increase the initial value
x(0) from 1, by a small amount ε, to a new value 1 + ε, where ε may be
positive or negative, then we expect the optimal value of the objective
function to change from J ∗ = −1/2 to

J(1+ε) = −1/2 + λ(0)ε + o(ε) = −1/2 − ε + o(ε),
where we use the subscript (1 + ε) to distinguish the new value from


J ∗ as well as to emphasize its dependence on the new initial condition
x(0) = 1 + ε. To verify this, we first observe that u∗ (t) = −1, t ∈ [0, 1],
remains optimal in this example for the new initial condition. Then from
(2.41) with x(0) = 1 + ε, we can obtain the new optimal state trajectory,
shown by the dotted line in Fig. 2.2 as

x∗(1+ε) (t) = 1 + ε − t, t ∈ [0, 1],


where the notation x∗(y) (t) indicates the dependence of the optimal tra-
jectory on the initial value x(0) = y. Substituting this for x in (2.33)
and integrating, we get the new objective function value to be −1/2 − ε.
Since the remainder here is 0, which is certainly of the order o(ε), our claim has been illustrated.
We should note that in general it may be necessary to perform sep-
arate calculations for positive and negative ε. It is easy to see, however,
that this is not the case in this example.
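The perturbation argument above is also easy to confirm numerically. The following sketch (an illustration not in the text) recomputes J for x(0) = 1 + ε under u*(t) = −1 and compares it with J* + λ(0)ε = −1/2 − ε:

```python
import numpy as np

def J(eps, n=1000):
    # Midpoint-rule approximation of J = integral over [0, 1] of -x dt,
    # with x(t) = 1 + eps - t (the state under u*(t) = -1, x(0) = 1 + eps)
    t = (np.arange(n) + 0.5) / n
    return float(np.mean(-(1.0 + eps - t)))

for eps in (0.0, 0.1, -0.1):
    print(eps, J(eps), -0.5 - eps)  # numerical J matches -1/2 - eps
```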

Example 2.3 Let us solve the same problem as in Example 2.2 over the
interval [0, 2] so that the objective is:
$$\max \left\{ J = \int_0^2 -x\, dt \right\}. \qquad (2.43)$$

The dynamics and constraints are (2.34) and (2.35), respectively, as be-
fore. Here we want to minimize the signed area between the horizontal
axis and the trajectory of x(t) for 0 ≤ t ≤ 2.

Solution As before, the Hamiltonian is defined by (2.36) and the optimal


control is as in (2.38). The adjoint equation

λ̇ = 1, λ(2) = 0 (2.44)

is the same as (2.39) except that now T = 2 instead of T = 1. The


solution of (2.44) is easily found to be

λ(t) = t − 2, t ∈ [0, 2]. (2.45)

The graph of λ(t) is shown in Fig. 2.3.


With λ(t) as in (2.45), we can determine u∗ (t) = −1 throughout.
Thus, the state equation is the same as (2.41). Its solution is given by
(2.42) for t ∈ [0, 2]. The optimal value of the objective function is J ∗ = 0.
The graph of x∗ (t) is also sketched in Fig. 2.3.
Figure 2.3: Optimal state and adjoint trajectories for Example 2.3

Example 2.4 The next example is:
$$\max \left\{ J = \int_0^1 -\frac{1}{2}x^2\, dt \right\} \qquad (2.46)$$
subject to the same constraints as in Example 2.2, namely,

ẋ = u, x(0) = 1, u ∈ Ω = [−1, 1]. (2.47)

Here F = −(1/2)x2 so that the interpretation of the objective function


(2.46) is that we are trying to find the trajectory x(t) in order that the
area under the curve (1/2)x2 is minimized.

Solution The Hamiltonian is


$$H = -\frac{1}{2}x^2 + \lambda u. \qquad (2.48)$$
The control function u∗ (x, λ) that maximizes the Hamiltonian in this


case depends only on λ, and it has the form
u∗ (x, λ) = bang[−1, 1; λ]. (2.49)
Then, the optimal control at time t can be expressed as u∗ (t) =
bang[−1, 1, λ(t)].
The adjoint equation is
λ̇ = −Hx = x, λ(1) = 0. (2.50)
Here the adjoint equation involves x, so we cannot solve it directly. Be-
cause the state equation (2.47) involves u, which depends on λ, we also
cannot integrate it independently without knowing λ.
A way out of this dilemma is to use some intuition. Since we want to
minimize the area under (1/2)x2 and since x(0) = 1, it is clear that we
want x to decrease as quickly as possible. Let us therefore temporarily
assume that λ is nonpositive in the interval [0, 1] so that from (2.49) we
have u = −1 throughout the interval. (In Exercise 2.8, you will be asked
to show that this assumption is correct.) With this assumption, we can
solve (2.47) as
x(t) = 1 − t. (2.51)
Substituting this into (2.50) gives
λ̇ = 1 − t.
Integrating both sides of this equation from t to 1 gives
$$\int_t^1 \dot{\lambda}(\tau)\, d\tau = \int_t^1 (1 - \tau)\, d\tau,$$
or
$$\lambda(1) - \lambda(t) = \left( \tau - \frac{1}{2}\tau^2 \right)\Big|_t^1,$$
which, using λ(1) = 0, yields
$$\lambda(t) = -\frac{1}{2}t^2 + t - \frac{1}{2}. \qquad (2.52)$$
The reader may now verify that λ(t) is nonpositive in the interval [0, 1],
verifying our original assumption. Hence, (2.51) and (2.52) satisfy the
necessary conditions. In Exercise 2.26, you will be asked to show that
they satisfy sufficient conditions derived in Sect. 2.4 as well, so that they
are indeed optimal. Thus, x∗ (t) = 1 − t, and using this in (2.46), we can
get J ∗ = −1/6. Figure 2.4 shows the graphs of the optimal state and
adjoint trajectories.
Figure 2.4: Optimal trajectories for Examples 2.4 and 2.5

Example 2.5 Let us rework Example 2.4 with T = 2, i.e., with the
objective function:
$$\max \left\{ J = \int_0^2 -\frac{1}{2}x^2\, dt \right\} \qquad (2.53)$$
subject to the constraints (2.47).
Solution The Hamiltonian is still as in (2.48) and the form of the optimal
policy remains as in (2.49). The adjoint equation is
λ̇ = x, λ(2) = 0,
which is the same as (2.50) except T = 2 instead of T = 1. Let us try to
extend the solution of the previous example from T = 1 to T = 2. Thus,
we keep λ(t) as in (2.52) for t ∈ [0, 1] with λ(1) = 0. If we recall from
the definition of the bang function that bang[−1, 1; 0] is not defined, we
may choose u in (2.49) arbitrarily when λ = 0. This is an instance
of singular control, so let us see if we can maintain the singular control
by choosing u appropriately. To do this we choose u = 0 when λ = 0.
Since λ(1) = 0 we set u(1) = 0 so that from (2.47), we have ẋ(1) = 0.
Now note that if we set u(t) = 0 for t > 1, then by integrating equations
(2.47) and (2.50) forward from t = 1 to t = 2, we see that x(t) = 0
and λ(t) = 0 for 1 < t ≤ 2; in other words, u(t) = 0 maintains singular
control in the interval. Intuitively, this is the correct answer since once
we get x = 0, we should keep it at 0 in order to maximize the objective

function J in (2.53). We will later give further discussion of singular


control and state an additional necessary condition in Sect. D.6 for such
cases; see also Bell and Jacobson (1975). In Fig. 2.4, we can get the
singular solution by extending the graphs shown to the right (as shown
by thick dotted line), making x∗ (t) = 0, λ(t) = 0, and u∗ (t) = 0 for
1 < t ≤ 2.
With the trajectory x∗ (t), 0 ≤ t ≤ 2, thus obtained, we can use
(2.53) to compute the optimal value of the objective function as
J∗ = ∫₀¹ −(1/2)(1 − t)² dt + ∫₁² −(1/2)(0) dt = −1/6.

Now suppose that the initial x(0) is perturbed by a small amount


ε to x(0) = 1 + ε, where ε may be positive or negative. According to
the marginal value interpretation of λ(0), whose value is −1/2 in this
example, we can estimate the change in the objective function to be
λ(0)ε + o(ε) = −ε/2 + o(ε).
Next we calculate directly the impact of the perturbation in the initial
value. For this we must obtain new control and state trajectories. These
are clearly


u(1+ε)(t) = ⎧ −1,  t ∈ [0, 1 + ε],
            ⎩ 0,   t ∈ (1 + ε, 2],

and

x∗(1+ε)(t) = ⎧ 1 + ε − t,  t ∈ [0, 1 + ε],
             ⎩ 0,          t ∈ (1 + ε, 2],

where we have used the subscript (1 + ε) to distinguish these from the


original trajectories as well as to indicate their dependence on the initial
value x(0) = 1 + ε. We can then obtain the corresponding optimal value
of the objective function as
J∗(1+ε) = ∫₀^(1+ε) −(1/2)(1 + ε − t)² dt = −1/6 − ε/2 − ε²/2 − ε³/6
        = −1/6 + λ(0)ε + o(ε),

where o(ε) = −ε²/2 − ε³/6.



In this example and Example 2.2, we have, by direct calculation,


demonstrated the significance of λ(0) as the marginal value of the change
in the initial state. This could have also been accomplished by obtaining
the value function V (x, t) for x(t) = x, t ∈ [0, 2], and then showing that
λ(0) = Vx (1, 0). This, of course, is the relationship (2.17) at x(0) = x = 1
and t = 0.
Keep in mind, however, that deriving V (x, t) is more than just find-
ing the solution of the problem, which we have already found by using
the maximum principle. V (x, t) also yields additional insights into the
problem. In order to completely specify V (x, t) for all x ∈ E 1 and all
t ∈ [0, 2], we need to deal with a number of cases. Here, we will carry
out the details only in the case of any t ∈ [0, 2] and 0 ≤ x ≤ 2 − t,
and leave the listing of the other cases and the required calculations as
Exercise 2.13.
We know from (2.9) that we need to solve the optimal control problem
for any given t ∈ [0, 2] with 0 ≤ x ≤ 2 − t. However, from our earlier
analysis of this example, it is clear that the optimal control


u(x,t)(s) = ⎧ −1,  s ∈ [t, t + x],
            ⎩ 0,   s ∈ (t + x, 2],

and the corresponding




x∗(x,t)(s) = ⎧ x − (s − t),  s ∈ [t, t + x],
             ⎩ 0,            s ∈ (t + x, 2],

where we use the subscript to show the dependence of the control and
state trajectories of a problem beginning at time t with the state x(t) =
x. Thus,
V (x, t) = −∫ₜ^(t+x) (1/2)[x∗(x,t)(s)]² ds = −(1/2)∫ₜ^(t+x) (x − s + t)² ds.

While this expression can be easily integrated to obtain an explicit so-


lution for V (x, t), we do not need to do this for our immediate purpose
at hand, which is to obtain Vx (x, t). Differentiating the right-hand side
with respect to x, we obtain

Vx(x, t) = −(1/2)∫ₜ^(x+t) 2(x − s + t) ds.

Furthermore, since


x∗(t) = ⎧ 1 − t,  t ∈ [0, 1],
        ⎩ 0,      t ∈ (1, 2],

we obtain

Vx(x∗(t), t) = ⎧ −(1/2)∫ₜ¹ 2(1 − s) ds = −(1/2)t² + t − 1/2,  t ∈ [0, 1],
               ⎩ 0,                                           t ∈ (1, 2],

which equals λ(t) obtained as the adjoint variable in Example 2.5. Note
that for t ∈ [0, 1], λ(t) in Example 2.5 is the same as that in Example 2.4
obtained in (2.52).
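The differentiation above can also be delegated to a computer algebra
system. The following sketch (ours, assuming the SymPy library is
available) constructs V (x, t) for 0 ≤ x ≤ 2 − t, differentiates in x, and
evaluates along x∗(t) = 1 − t to recover (2.52):

    import sympy as sp

    x, t, s = sp.symbols('x t s', real=True)

    # V(x, t) = -(1/2) * integral of (x - s + t)^2 over s in [t, t + x]
    V = -sp.Rational(1, 2) * sp.integrate((x - s + t)**2, (s, t, t + x))

    Vx = sp.diff(V, x)                    # marginal value of the state
    lam = sp.expand(Vx.subs(x, 1 - t))    # evaluate along x*(t) = 1 - t

    print(sp.simplify(V))   # -> -x**3/6
    print(lam)              # -> -t**2/2 + t - 1/2, which is (2.52)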

Example 2.6 This example is slightly more complicated and the opti-
mal control is not bang-bang. The problem is:
max J = ∫₀² (2x − 3u − u²) dt    (2.54)

subject to
ẋ = x + u, x(0) = 5 (2.55)
and the control constraint

u ∈ Ω = [0, 2]. (2.56)

Solution Here T = 2, F = 2x − 3u − u², S = 0, and f = x + u. The


Hamiltonian is

H = (2x − 3u − u²) + λ(x + u)
  = (2 + λ)x − (u² + 3u − λu).    (2.57)

Let us find the optimal control policy by differentiating (2.57) with re-
spect to u. Thus,
∂H/∂u = −2u − 3 + λ = 0,

so that the form of the optimal control is

u∗(t) = (λ(t) − 3)/2,    (2.58)
provided this expression stays within the interval Ω = [0, 2]. Note that
the second derivative of H with respect to u is ∂²H/∂u² = −2 < 0, so
that (2.58) satisfies the second-order condition for the maximum of a
function.
We next derive the adjoint equation as
λ̇ = −∂H/∂x = −2 − λ, λ(2) = 0.    (2.59)
Referring to Appendix A.1, we can use the integrating factor eᵗ to obtain

eᵗ(dλ + λ dt) = d(eᵗλ) = −2eᵗ dt.

We then integrate it on both sides from t to 2 and use the terminal


condition λ(2) = 0 to obtain the solution of the adjoint equation (2.59)
as
λ(t) = 2(e^(2−t) − 1).
If we substitute this into (2.58) and impose the control constraint
(2.56), we see that the optimal control is




u∗(t) = ⎧ 2,              if e^(2−t) − 2.5 > 2,
        ⎨ e^(2−t) − 2.5,  if 0 ≤ e^(2−t) − 2.5 ≤ 2,    (2.60)
        ⎩ 0,              if e^(2−t) − 2.5 < 0,

or referring to the notation defined in (1.22),

u∗(t) = sat[0, 2; e^(2−t) − 2.5].

The graph of u∗(t) appears in Fig. 2.5. In the figure, t₁ satisfies e^(2−t₁) −
2.5 = 2, i.e., t₁ = 2 − ln 4.5 ≈ 0.496, while t₂ satisfies e^(2−t₂) − 2.5 = 0,
which gives t₂ = 2 − ln 2.5 ≈ 1.08.
In Exercise 2.2 you will be asked to compute the optimal state tra-
jectory x∗ (t) corresponding to u∗ (t) shown in Fig. 2.5 by piecing together
the solutions of three separate differential equations obtained from (2.55)
and (2.60).
Figure 2.5: Optimal control for Example 2.6
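For readers who want to see the structure of Fig. 2.5 numerically, the
sketch below (ours) computes the switching times t₁ and t₂, applies the sat
policy (2.60), and integrates the state equation (2.55) by Euler's method;
it is also a convenient check on the piecewise solution requested in
Exercise 2.2.

    import math

    t1 = 2 - math.log(4.5)   # solves e^(2 - t1) - 2.5 = 2,  t1 ≈ 0.496
    t2 = 2 - math.log(2.5)   # solves e^(2 - t2) - 2.5 = 0,  t2 ≈ 1.084

    def u_star(t):
        """Optimal control (2.60): sat[0, 2; e^(2 - t) - 2.5]."""
        return max(0.0, min(2.0, math.exp(2 - t) - 2.5))

    # Euler integration of the state equation (2.55): x' = x + u, x(0) = 5
    n, dt, x = 2000, 2.0 / 2000, 5.0
    for i in range(n):
        x += (x + u_star(i * dt)) * dt

    print(round(t1, 3), round(t2, 3), x)   # x is approximately x*(2)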

2.4 Sufficiency Conditions


So far, we have shown the necessity of the maximum principle condi-
tions for optimality. Next we prove a theorem that gives qualifications
under which the maximum principle conditions are also sufficient for op-
timality. This theorem is important from our point of view since the
models derived from many management science applications will satisfy
conditions required for the sufficiency result. As remarked earlier, our
technique for proving existence will be to display, for any given model, a
solution that satisfies both necessary and sufficient conditions. A good
reference for sufficiency conditions is Seierstad and Sydsæter (1987).
We first define a function H⁰ : Eⁿ × Eⁿ × E¹ → E¹ called the derived
Hamiltonian as follows:

H⁰(x, λ, t) = max_{u∈Ω(t)} H(x, u, λ, t).    (2.61)

We assume that by this equation a function u∗ (x, λ, t) is implicitly and


uniquely defined. Given these assumptions we have by definition,

H⁰(x, λ, t) = H(x, u∗, λ, t).    (2.62)

For our proof of the sufficiency of the maximum principle, we also need
the derivative H⁰ₓ(x, λ, t), which by use of the Envelope Theorem can be
given as

H⁰ₓ(x, λ, t) = Hx(x, u∗, λ, t) := Hx(x, u, λ, t)|_{u=u∗}.    (2.63)

To see this in the case when u∗ (x, λ, t) is differentiable in x, let us


differentiate (2.62) with respect to x:

H⁰ₓ(x, λ, t) = Hx(x, u∗, λ, t) + Hu(x, u∗, λ, t) ∂u∗/∂x.    (2.64)
To obtain (2.63) from (2.64), we need to show that the second term on
the right-hand side of (2.64) vanishes, i.e.,

Hu(x, u∗, λ, t) ∂u∗/∂x = 0    (2.65)
for each x. There are two cases to consider. If u∗ is in the interior of
Ω(t), then it satisfies the first-order condition Hu (x, u∗ , λ, t) = 0, thereby
implying (2.65). Otherwise, u∗ is on the boundary of Ω(t). Then, for each
i, j, either Hui = 0 or ∂u∗i /∂xj = 0 or both. Once again, (2.65) holds.
Exercise 2.25 gives a specific instance of this case.

Remark 2.3 We have shown the result in (2.63) in cases when u∗ is


a differentiable function of x. The result holds more generally, provided
that Ω(t) is appropriately qualified; see Derzko et al. (1984). Such results
are known as Envelope Theorems, and are used often in economics.

Theorem 2.1 (Sufficiency Conditions). Let u∗ (t), and the corre-


sponding x∗ (t) and λ(t) satisfy the maximum principle necessary con-
dition (2.31) for all t ∈ [0, T ]. Then, u∗ is an optimal control if
H⁰(x, λ(t), t) is concave in x for each t and S(x, T ) is concave in x.

Proof. The proof is a minor extension of the arguments in Arrow and


Kurz (1970). By definition

H[x(t), u(t), λ(t), t] ≤ H⁰[x(t), λ(t), t].    (2.66)



Since H⁰ is differentiable and concave, we can use the applicable defini-
tion of concavity given in Sect. 1.4 to obtain

H⁰[x(t), λ(t), t] ≤ H⁰[x∗(t), λ(t), t] + H⁰ₓ[x∗(t), λ(t), t][x(t) − x∗(t)].
                                                              (2.67)
Using (2.66), (2.62), and (2.63) in (2.67), we obtain

H[x(t), u(t), λ(t), t] ≤ H[x∗ (t), u∗ (t), λ(t), t]


+Hx [x∗ (t), u∗ (t), λ(t), t][x(t) − x∗ (t)]. (2.68)

By definition of H in (2.18) and the adjoint equation of (2.31)

F [x(t), u(t), t] + λ(t)f [x(t), u(t), t] ≤ F [x∗ (t), u∗ (t), t]


+λ(t)f [x∗ (t), u∗ (t), t] − λ̇(t)[x(t) − x∗ (t)]. (2.69)

Using the state equation in (2.31), transposing, and regrouping,

F [x∗ (t), u∗ (t), t] − F [x(t), u(t), t] ≥ λ̇(t)[x(t) − x∗ (t)]


+λ(t)[ẋ(t) − ẋ∗ (t)]. (2.70)

Furthermore, since S(x, T ) is a differentiable and concave function in its


first argument, we have

S(x(T ), T ) ≤ S(x∗ (T ), T ) + Sx (x∗ (T ), T )[x(T ) − x∗ (T )] (2.71)

or,

S(x∗(T ), T ) − S(x(T ), T ) ≥ −Sx(x∗(T ), T )[x(T ) − x∗(T )].    (2.72)

Integrating both sides of (2.70) from 0 to T and adding (2.72), we have


∫₀ᵀ F (x∗(t), u∗(t), t) dt + S(x∗(T ), T )
     − [∫₀ᵀ F (x(t), u(t), t) dt + S(x(T ), T )]
≥ [λ(T ) − Sx(x∗(T ), T )][x(T ) − x∗(T )] − λ(0)[x(0) − x∗(0)]

or,

J(u∗) − J(u) ≥ [λ(T ) − Sx(x∗(T ), T )][x(T ) − x∗(T )] − λ(0)[x(0) − x∗(0)],    (2.73)
where J(u) is the value of the objective function associated with a control
u. Since x∗ (0) = x(0) = x0 , the initial condition, and since λ(T ) =
Sx (x∗ (T ), T ) from the terminal adjoint condition in (2.31), we have
J(u∗ ) ≥ J(u). (2.74)
Thus, u∗ is an optimal control. This completes the proof. □
Because λ(t) is not known a priori, it is usual to test H⁰ for a stronger
assumption, i.e., to check for the concavity of the function H⁰(x, λ, t) in x
for any λ and t. Sometimes the stronger condition given in Exercise 2.27
can be used.
Mangasarian (1966) gives a sufficient condition in which the concav-
ity of H⁰(x, λ(t), t) in Theorem 2.1 is replaced by a stronger condition
requiring the Hamiltonian H(x, u, λ(t), t) to be jointly concave in (x, u).
Example 2.7 Let us show that the problems in Examples 2.2 and 2.3
satisfy the sufficient conditions. We have from (2.36) and (2.61),
H⁰ = −x + λu∗,
where u∗ is given by (2.37). Since u∗ is a function of λ only, H⁰(x, λ, t) is
certainly concave in x for any t and λ (and in particular for λ(t) supplied
by the maximum principle). Since S(x, T ) = 0, the sufficient conditions
hold.
Finally, it is important to mention that thus far in this chapter, we
have considered problems in which the terminal values of the state vari-
ables are not constrained. Such problems are called free-end-point prob-
lems. The problems at the other extreme, where the terminal values of
the state variables are completely specified, are termed fixed-end-point
problems. Then, there are problems in between these two extremes.
While a detailed discussion of terminal conditions on state variables ap-
pears in Sect. 3.4 of the next chapter, it is instructive here to briefly
indicate how the maximum principle needs to be modified in the case
of fixed-end-point problems. Suppose x(T ) is completely specified, i.e.,

x(T ) = k ∈ E n , where k is a vector of constants. Observe then that


the first term on the right-hand side of inequality (2.73) vanishes regard-
less of the value of λ(T ), since x(T ) − x∗ (T ) = k − k = 0 in this case.
This means that the sufficiency result would go through for any value of
λ(T ). Not surprisingly, therefore, the transversality condition (2.30) in
the fixed-end-point case changes to
λ(T ) = β, (2.75)
where β ∈ Eⁿ is a vector of constants to be determined.
Indeed, one can show that (2.75) is also the necessary transversality
condition for fixed-end-point problems. With this observation, the maximum
principle for fixed-end-point problems can be obtained by modifying
(2.31) as follows: adding x(T ) = k and removing λ(T ) = Sx (x∗ (T ), T ).
Likewise, the resulting TPBVP (2.32) can be modified correspondingly;
it will have initial and final values on the state variables, whereas both
initial and terminal values for the adjoint variables are unspecified, i.e.,
λ(0) and λ(T ) are constants to be determined.
In Exercises 2.28 and 2.19, you are asked to solve the fixed-end-point
problems given there.

2.5 Solving a TPBVP by Using Excel


A number of examples and exercises found throughout this book involve
finding a numerical solution to a two-point boundary value problem (TP-
BVP). In this section we will show how the GOAL SEEK function in
Excel can be used for this purpose. We will solve the following example.
Example 2.8 Consider the problem:
max J = −∫₀¹ (1/2)(x² + u²) dt
subject to
ẋ = −x3 + u, x(0) = 5. (2.76)
Solution We form the Hamiltonian
H = −(1/2)(x² + u²) + λ(−x³ + u),

where the adjoint variable λ satisfies the equation

λ̇ = x + 3x²λ, λ(1) = 0.    (2.77)

Since u is unconstrained, we set Hu = 0 to obtain u∗ = λ. With this, the


state equation (2.76) becomes

ẋ = −x³ + λ, x(0) = 5.    (2.78)

Thus, the TPBVP is given by the system of equations (2.77) and (2.78).
A simple method to solve the TPBVP uses what is known as the
shooting method, explained in the flowchart in Fig. 2.6.

Figure 2.6: The flowchart for Example 2.8

We will use Excel functions to implement the shooting method. For


this we discretize (2.77) and (2.78) by replacing dx/dt and dλ/dt by

Δx/Δt = [x(t + Δt) − x(t)]/Δt   and   Δλ/Δt = [λ(t + Δt) − λ(t)]/Δt,

respectively. Substitution of Δx/Δt for ẋ in (2.78) and Δλ/Δt for λ̇ in
(2.77) gives the discrete version of the TPBVP:

x(t + Δt) = x(t) + [−x(t)³ + λ(t)]Δt, x(0) = 5,    (2.79)

λ(t + Δt) = λ(t) + [x(t) + 3x(t)²λ(t)]Δt, λ(1) = 0.    (2.80)



In order to solve these equations, open an empty spreadsheet, choose


the unit of time to be Δt = 0.01, make a guess for the initial value λ(0)
to be, say −0.2, and make the entries in the cells of the spreadsheet as
specified below:

Enter -0.2 in cell A1.


Enter 5 in cell B1.
Enter =A1 + (B1 + 3*(B1^2)*A1)*0.01 in cell A2.
Enter =B1 + (-B1^3 + A1)*0.01 in cell B2.

Here we have entered the right-hand side of the difference equation (2.80)
for t = 0 in cell A2 and the right-hand side of the difference equation
(2.79) for t = 0 in cell B2. Note that λ(0) = −0.2 shown as the entry
−0.2 in cell A1 is merely a guess. The correct value will be determined
by the use of the GOAL SEEK function.
Next highlight cells A2 and B2 and drag the combination down to
row 101 of the spreadsheet. Using EDIT in the menu bar, select FILL
DOWN. Thus, Excel will solve Eqs. (2.80) and (2.79) from t = 0 to t = 1
in steps of Δt = 0.01, and that solution will appear as entries in columns
A and B of the spreadsheet, respectively. In other words, the guessed
solution for λ(t) will appear in cells A1 to A101 and the corresponding
solution for x(t) will appear in cells B1 to B101. To find the correct value
for λ(0), use the GOAL SEEK function under TOOLS in the menu bar
and make the following entries:

Set cell: A101.


To value: 0.
By changing cell: A1.

It finds the correct initial value for the adjoint variable as λ(0) =
−0.10437, which should appear in cell A1, and the correct ending value
of the state variable as x(1) = 0.62395, which should appear in cell B101.
You will notice that the entry in cell A101 may not be exactly zero as
instructed, although it will be very close to it. In our example, it is
−0.0007. By using the CHART function, the graphs of x∗ (t) and λ(t)
can be printed out by Excel as shown in Fig. 2.7.

Figure 2.7: Solution of TPBVP by Excel
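The same shooting scheme can, of course, be coded directly. The sketch
below (ours, in Python; the bisection bracket was found by trial, and any
root finder could replace the bisection loop) implements the Euler
recursions (2.79)–(2.80) and searches for the λ(0) that drives λ(1) to
zero, reproducing λ(0) ≈ −0.104 and x(1) ≈ 0.624 obtained with GOAL SEEK.

    def shoot(lam0, x0=5.0, dt=0.01, steps=100):
        """Euler recursions (2.79)-(2.80); returns (x(1), lambda(1))."""
        x, lam = x0, lam0
        for _ in range(steps):
            x, lam = (x + (-x**3 + lam) * dt,
                      lam + (x + 3 * x**2 * lam) * dt)
            if abs(lam) > 1e6:          # guard against divergent guesses
                break
        return x, lam

    # Shooting: bisect on lambda(0) until lambda(1) = 0
    lo, hi = -0.2, 0.0                  # bracket found by trial
    for _ in range(50):
        mid = (lo + hi) / 2
        if shoot(lo)[1] * shoot(mid)[1] <= 0:
            hi = mid
        else:
            lo = mid

    lam0 = (lo + hi) / 2
    print(lam0, shoot(lam0)[0])         # approximately -0.104 and 0.624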

As we discuss more complex problems involving control and state


inequality constraints in Chaps. 3 and 4, we will realize that the shooting
method is no longer adequate to solve such problems. However, there is a
large amount of literature devoted to computational methods for solving
optimal control problems. While a detailed treatment of this topic is
beyond the scope of this book, we suggest some references as well as a
software in Chap. 4, Sect. 4.3.

Exercises for Chapter 2

E 2.1 Perform the following:

(a) In Example 2.2, show J ∗ = −1/2.


(b) In Example 2.3, show J ∗ = 0.
(c) In Example 2.4, show J ∗ = −1/6.
(d) In Example 2.5, show J ∗ = −1/6.

E 2.2 Complete Example 2.6 by writing the optimal x∗ (t) in the form
of integrals over the three intervals (0, t1 ), (t1 , t2 ), and (t2 , 2) shown in
Fig. 2.5.

Hint: It is not necessary to actually carry out the numerical evaluation


of these integrals unless you are ambitious.

E 2.3 Find the optimal solution for Example 2.1 with x0 = 0 and T = 1.

E 2.4 Rework Example 2.6 with F = 2x − 3u.

E 2.5 Show that both the Lagrange and Mayer forms of the optimal
control problem can be reduced to the linear Mayer form (2.5).

E 2.6 Show that the optimal control obtained from the application of
the maximum principle satisfies the principle of optimality: if u∗ (t) is an
optimal control and x∗ (t) is the corresponding optimal path for 0 ≤ t ≤ T
with x(0) = x0 , then verify the above proposition by showing that u∗ (t)
for τ ≤ t ≤ T satisfies the maximum principle for the problem beginning
at time τ with the initial condition x(τ ) = x∗ (τ ).

E 2.7 Provide an alternative derivation of the adjoint equation in


Sect. 2.2.2 by starting with a restatement of Eq. (2.19) as −Vt = H⁰
and differentiating it with respect to x.

E 2.8 In Example 2.4, show that in view of (2.47) any λ(t), t ∈ [0, 1],
that satisfies (2.50) must be nonpositive.

E 2.9 The system defined in (2.4) is termed autonomous if F, f, S and


Ω are not explicit functions of time t. In this case, show that the Hamil-
tonian is constant along the optimal path, i.e., show that dH/dt = 0.
Furthermore, verify this result in Example 2.4 by a direct substitution
for x and λ from (2.51) and (2.52), respectively, into H given in (2.48).

E 2.10 In Example 2.4, verify by direct calculation that with a new


initial value x(0) = 1+ε with ε small, the new optimal objective function
value will be

J(1+ε) = −1/6 + λ(0)ε + o(ε) = −1/6 − ε/2 − ε²/2.

E 2.11 In Example 2.6, verify by direct calculation that with a new ini-
tial x(0) = 5 + ε with ε small, the objective function value will change by

λ(0)ε + o(ε) = 2(e² − 1)ε + o(ε).



E 2.12 Obtain the value function V (x, t) explicitly in Example 2.4 and
verify the relation Vx (x∗ (t), t) = λ(t) for the example by showing that
Vx(1 − t, t) = −(1/2)t² + t − 1/2.

E 2.13 Obtain the value function V (x, t) explicitly in Example 2.5 for
every x ∈ E 1 and t ∈ [0, 2].

Hint: You need to deal with the following cases for t ∈ [0, 2]:

(i) 0 ≤ x ≤ 2 − t,
(ii) x > 2 − t,
(iii) t − 2 ≤ x < 0, and
(iv) x < t − 2.

E 2.14 Obtain V (x, t) in Example 2.6 for small positive and negative x
for t ∈ [t₂, 2]. Then, show that Vx(x, t) = 2(e^(2−t) − 1), t ∈ [t₂, 2], is the
same as λ(t), t ∈ [t₂, 2] obtained in Example 2.6.

E 2.15 Solve the problem:


  
max J = ∫₀ᵀ (x − u²/2) dt

subject to
ẋ = u, x(0) = x0 ,

u ∈ [0, 1],

for optimal control and optimal state trajectory. Verify that your solu-
tion is optimal by using the maximum principle sufficiency condition.

E 2.16 Solve completely the problem:


max ∫₀¹ (x + u) dt

ẋ = 1 − u², x(0) = 1;

that is, find x∗ (t), u∗ (t) and λ(t), 0 ≤ t ≤ 1.



E 2.17 Use the maximum principle to solve the following problem given
in the Mayer form:
max[8x1 (18) + 4x2 (18)]
subject to
ẋ1 = x1 + x2 + u, x1 (0) = 15,
ẋ2 = 2x1 − u, x2 (0) = 20,
and the control constraint
0 ≤ u ≤ 1.
Hint: Use the method in Appendix A to solve the simultaneous differ-
ential equations.

E 2.18 In Fig. 2.8, a water reservoir being used for the purpose of fire-
fighting is leaking, and its water height x(t) is governed by

ẋ = −0.1x + u, x(0) = 10,

where u(t) denotes the net inflow at time t and 0 ≤ u ≤ 3.


Note that x(t) also represents the water pressure in appropriate units.
Since high water pressure is useful for fire-fighting, the objective function
in (a) below involves keeping the average pressure high, while that in (b)
involves building up a high pressure at T = 100. Furthermore, we do not
need to impose the state constraints 0 ≤ x(t) ≤ 50, as these will always
be satisfied for every feasible control u(t), 0 ≤ t ≤ 100.

Figure 2.8: Water reservoir of Exercise 2.18

(a) Find the optimal control which maximizes


J = ∫₀¹⁰⁰ x dt.

Find the maximum level reached.



(b) Replace the objective function in (a) by

J = 5x(100),

and re-solve the problem.


(c) Redo the problem with J = ∫₀¹⁰⁰ (x − 5u) dt.

E 2.19 Consider the following fixed-end-point problem:


max_u J = −∫₀ᵀ (g(x) + cu²) dt

subject to
ẋ = f (x) + b(x)u, x(0) = x0 , x(T ) = 0,
where functions g ≥ 0, f, and b are assumed to be continuously differen-
tiable. Derive the two-point boundary value problem (TPBVP) satisfied
by the optimal state and control trajectories.

E 2.20 A Machine Maintenance Problem. Consider the machine state


dynamics
ẋ = −δx + u, x(0) = x0 > 0,
where δ > 0 is the rate of deterioration of the machine state and u is the
rate of machine maintenance. Find the optimal maintenance rate:
max J = ∫₀ᵀ e^(−ρt)(πx − u²/2) dt + e^(−ρT) S x(T ),

where π > 0 with πx representing the profit rate when the machine state
is x, u²/2 is the cost of maintaining the machine at rate u, ρ > 0 is the
discount rate, T is the time horizon, and S > 0 is the salvage value of
the machine for each unit of the machine state at time T. Furthermore,
show that the optimal maintenance rate decreases, increases, or remains
constant over time depending on whether the difference S − π/(ρ + δ) is
negative, positive, or zero, respectively.

E 2.21 Transform the machine maintenance problem of Exercise 2.20


into Mayer Form. Then solve it to obtain the optimal maintenance rate.

E 2.22 Regional Allocation of Investment. Let Ki , i = 1, 2, denote the


capital stock in Region i. Let bi be the productivity of capital and si be

the marginal propensity to save in Region i. Since the investment funds


for the two regions come from the savings in the whole economy, we have
K̇1 + K̇2 = b1 s1 K1 + b2 s2 K2 = g1 K1 + g2 K2 ,
where gi = bi si . Let u denote the control variable representing the frac-
tion of investment allocated to Region 1 with the remainder going to
Region 2. Clearly,
0 ≤ u ≤ 1, (2.81)
and
K̇1 = u(g1 K1 + g2 K2 ), K1 (0) = a1 > 0, (2.82)

K̇2 = (1 − u)(g1 K1 + g2 K2 ), K2 (0) = a2 > 0. (2.83)


The optimal control problem is to maximize the productivity of the whole
economy at time T. Thus, the objective is:
max{J = b1 K1 (T ) + b2 K2 (T )}
subject to (2.81), (2.82), and (2.83).
(a) Use the maximum principle to derive the form of the optimal policy.
(b) Assume b2 > b1 . Show that u∗ (t) = 0 for t ∈ [t̂, T ], where t̂ is a
switching point and 0 ≤ t̂ < T.
(c) If you are ambitious, find the t̂ of part (b).

E 2.23 Investment Allocation. Let K denote the capital stock and λK


its output rate with λ > 0. For simplicity in notation, we set the pro-
ductivity factor λ = 1. Let u denote the invested fraction of the output.
Then, uK is the investment rate and (1 − u)K is the consumption rate.
Let us assume an exponential utility 1 − e−C of consumption C. Solve
the resulting optimal control problem:
max J = ∫₀ᵀ [1 − e^(−(1−u(t))K(t))] dt

subject to
K̇(t) = u(t)K(t), K(0) = K0 , K(T ) free, 0 ≤ u(t) ≤ 1, 0 ≤ t ≤ T.
Assume T > 1 and 0 < K₀ < 1 − e^(1−T). Obtain explicitly the optimal in-
vestment allocation u∗ (t), optimal capital K ∗ (t), and the adjoint variable
λ(t), 0 ≤ t ≤ T.

E 2.24 The rate at which a new product can be sold at any time t
is f (p(t))g(Q(t)) where p is the price and Q is cumulative sales. We
assume f′(p) < 0; sales vary inversely with price. Also g′(Q) ≷ 0 for
Q ≶ Q1 , respectively, where Q1 > 0 is a constant known as the saturation
level. For a given price, current sales grow with past sales in the early
stages as people learn about the good from past purchasers. But as
cumulative sales increase, there is a decline in the number of people who
have not yet purchased the good. Eventually the sales rate for any given
price falls, as the market becomes saturated. The unit production cost c
may be constant or may decline with cumulative sales if the firm learns
how to produce less expensively with experience: c = c(Q), c′(Q) ≤ 0.
Formulate and solve the optimal control problem in order to characterize
the price policy p(t), 0 ≤ t ≤ T, that maximizes profits from this new
“fad” over a fixed horizon T. Specifically, show that in marketing a new
product, its optimal price rises while the market expands to its saturation
level and falls as the market matures beyond the saturation level.

E 2.25 Suppose H(x, u, λ, t) = λux − (1/2)u² and Ω(t) = [0, 1] for all t.


(a) Show that the form of the optimal control is given by the function

u∗(x, λ) = sat[0, 1; λx] = ⎧ λx,  if 0 ≤ λx ≤ 1,
                           ⎨ 1,   if λx > 1,
                           ⎩ 0,   if λx < 0.

(b) Verify that (2.63) holds for all values of x and λ.

E 2.26 Show that the derived Hamiltonians H⁰ found in Examples 2.4


and 2.6 satisfy the concavity condition required for the sufficiency result
in Sect. 2.4.

E 2.27 If F and f are concave in x and u and if λ(t) ≥ 0, then show
that the derived Hamiltonian H⁰ is concave in x. Note that the concavity
of F and f are easier to check than the concavity of H⁰ as required in
Theorem 2.1 on sufficiency conditions.

E 2.28 A simple controlled dynamical system is modeled by the scalar


equation
ẋ = x + u.

The fixed-end-point optimal control problem consists in steering x(t)


from an initial state x(0) = x0 to the target x(1) = 0, such that
J(u) = (1/4)∫₀¹ u⁴ dt

is minimized. Use the maximum principle to show that the optimal


control is given by
u∗(t) = (4x₀/3)(e^(−4/3) − 1)⁻¹ e^(−t/3).
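A quick numerical sanity check of this closed form (our addition, with the
illustrative choice x₀ = 1) is to integrate ẋ = x + u∗(t) forward and
confirm that x(1) ≈ 0:

    import math

    x0 = 1.0
    c = (4 * x0 / 3) / (math.exp(-4 / 3) - 1)   # coefficient of e^(-t/3) in u*

    n = 100_000
    dt = 1.0 / n
    x = x0
    for i in range(n):
        u = c * math.exp(-(i * dt) / 3)          # the stated optimal control
        x += (x + u) * dt                        # state equation x' = x + u

    print(x)   # approximately 0, so u* indeed steers x(0) = 1 to x(1) = 0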
E 2.29 Perform the following:

(a) Solve the optimal consumption problem of Example 1.3 with


U (C) = ln C and B = 0.

Hint: Since C(t) ≥ 0, we can replace the state constraint W (t) ≥ 0, t ∈


[0, T ], by the terminal condition W (T ) = 0, and then use the transver-
sality condition given in (2.75).

(b) Find the rate of change of optimal consumption over time and
conclude that consumption remains constant when r = ρ, increases
when r > ρ, and decreases when r < ρ.

E 2.30 Perform the following:

(a) Formulate the TPBVP (2.32) and its discrete version for the prob-
lem in Example 2.8, but with a new initial condition x(0) = 1.
(b) Solve the discrete version of the TPBVP by using Excel.

E 2.31 Solve explicitly


max J = −∫₀² x(t) dt

subject to
ẋ(t) = u(t), x(0) = 1, x(2) = 0,
−a ≤ u(t) ≤ b, a > 1/2, b > 0.
Obtain optimal x∗ (t), u∗ (t), and all required multipliers.
Chapter 3

The Maximum Principle:


Mixed Inequality
Constraints

The problems to which the maximum principle derived in the previous


chapter was applicable had constraints involving only the control vari-
ables. We will see that in many applied models it is necessary to impose
constraints involving both control and state variables. Inequality con-
straints involving control and possibly state variables are called mixed
inequality constraints.
In the solution spaces of problems with mixed constraints, there may
be regions in which one or more of the constraints is tight. When this
happens, the system must be controlled in such a way that the tight
constraints are not violated. As a result, the maximum principle of
Chap. 2 must be revised so that the Hamiltonian is maximized subject
to the constraints. This is done by appending the Hamiltonian with
the mixed constraints and the associated Lagrange multipliers to form a
Lagrangian, and then setting the derivatives of the resulting Lagrangian
with respect to the control variables to zero.


In Sect. 3.1, a Lagrangian form of the maximum principle is discussed


for models in which there are some constraints that involve only control
variables, and others that involve both state and control variables simul-
taneously. Problems having pure state variable inequality constraints,
i.e., those involving state variables but no control variables, are more
difficult and will be dealt with in Chap. 4.
In Sect. 3.2, we state conditions under which the Lagrangian maxi-
mum principle is also sufficient for optimality.
Economists frequently analyze optimal control problems involving a
discount rate. By combining the discount factor with the adjoint vari-
ables and the Lagrange multipliers and making suitable changes in the
definitions of the Hamiltonian and Lagrangian functions, it is possible
to derive the current-value formulation of the maximum principle as de-
scribed in Sect. 3.3.
It is often the case in finite horizon problems that some restrictions
are imposed on the state variables at the end of the horizon. In Sect. 3.4,
we discuss the transversality conditions to be satisfied by the adjoint
variable in special cases of interest. Section 3.5 is devoted to the study
of free terminal time problems where the terminal time itself is a deci-
sion variable to be determined. Models with infinite horizons and their
stationary equilibrium solutions are covered in Sect. 3.6.
Section 3.7 presents a classification of a number of the most important
and commonly used kinds of optimal control models, together with a brief
description of the forms of their optimal solutions. The reader may wish
to refer to this section from time to time while working through later
chapters in the book.

3.1 A Maximum Principle for Problems with


Mixed Inequality Constraints
We will state the maximum principle for optimal control problems with
mixed inequality constraints without proof. For further details see Pon-
tryagin et al. (1962), Hestenes (1966), Arrow and Kurz (1970), Hadley
and Kemp (1971), Bensoussan et al. (1974), Feichtinger and Hartl (1986),
Seierstad and Sydsæter (1987), and Grass et al. (2008).
Let the system under consideration be described by the following
vector differential equation

ẋ = f (x, u, t), x(0) = x0 (3.1)



given the initial conditions x0 and a control trajectory u(t), t ∈


[0, T ], T > 0, where T can be the terminal time to be optimally deter-
mined or given as a fixed positive number. Note that in the above equa-
tion, x(t) ∈ E n and u(t) ∈ E m , and the function f : E n ×E m ×E 1 → E n
is assumed to be continuously differentiable.
Let us consider the following objective:
max J = ∫₀ᵀ F (x, u, t) dt + S[x(T ), T ],    (3.2)

where F : E n × E m × E 1 → E 1 and S : E n × E 1 → E 1 are continuously


differentiable functions and where T denotes the terminal time. Depend-
ing on the situation being modeled, the terminal time T may be given or
to be determined. In the case when T is given, the function S(x(T ), T )
should be viewed as merely a function of the terminal state, and can be
revised as S(x(T )).
Next we impose constraints on state and control variables. Specifi-
cally, for each t ∈ [0, T ], x(t) and u(t) must satisfy

g(x, u, t) ≥ 0, t ∈ [0, T ], (3.3)

where g: E n × E m × E 1 → E q is continuously differentiable in all its


arguments and must contain terms in u. An important special case is that
of controls having an upper bound that depends on the current state,
i.e., u(t) ≤ M (x(t)), t ∈ [0, T ], which can be written as M (x) − u ≥ 0.
Inequality constraints without terms in u will be introduced later in
Chap. 4.
It is important to note that the mixed constraints (3.3) allow for
inequality constraints of the type g(u, t) ≥ 0 as special cases. Thus, the
control constraints of the form u(t) ∈ Ω(t) treated in Chap. 2 can be
subsumed in (3.3), provided that they can be expressed in terms of a
finite number of inequality constraints of the form g(u, t) ≥ 0. In most
problems that are of interest to us, this will indeed be the case. Thus,
from here on, we will formulate control constraints either directly as
inequality constraints and include them as parts of (3.3), or as u(t) ∈
Ω(t), which can be easily converted into a set of inequality constraints
to be included as parts of (3.3).

Finally, the terminal state is constrained by the following inequality


and equality constraints:

a(x(T ), T ) ≥ 0, (3.4)

b(x(T ), T ) = 0, (3.5)
where a : E n × E 1 → E la and b : E n × E 1 → E lb are continuously
differentiable in all their arguments. Clearly, a and b are not functions
of T, if T is a given fixed number. In the specific cases when T is
given, the terminal state constraints will be written as a(x(T )) ≥ 0 and
b(x(T )) = 0. Important special cases of (3.4) are x(T ) ≥ k.
We can now define a control u(t), t ∈ [0, T ], or simply u, to be admis-
sible if it is piecewise continuous and if, together with its corresponding
state trajectory x(t), t ∈ [0, T ], it satisfies the constraints (3.3), (3.4),
and (3.5).
At times we may find terminal inequality constraints given as

x(T ) ∈ Y (T ) ⊂ X(T ), (3.6)

where Y (T ) is a convex set and X(T ) is the set of all feasible terminal
states, also called the reachable set from the initial state x0 , i.e.,

X(T ) = {x(T ) | x(T ) obtained by an admissible control u and (3.1)}.

Remark 3.1 The feasible set defined by (3.4) and (3.5) need not be
convex. Thus, if the convex set Y (T ) can be expressed by a finite number
of inequalities a(x(T ), T ) ≥ 0 and equalities b(x(T ), T ) = 0, then (3.6)
becomes a special case of (3.4) and (3.5). In general, (3.6) is not a special
case of (3.4) and (3.5), since it may not be possible to define a given Y (T )
by a finite number of inequalities and equalities.

In this book, we will only deal with problems in which the following
full-rank conditions hold. That is,

rank[∂g/∂u, diag(g)] = q

holds for all arguments x(t), u(t), t, that could arise along an optimal
solution, and
rank ⎡ ∂a/∂x   diag(a) ⎤ = la + lb
     ⎣ ∂b/∂x      0    ⎦

holds for all possible values of x(T ) and T. The first of these condi-
tions means that the gradients with respect to u of all active constraints
in (3.3) must be linearly independent. Similarly, the second condition
means that the gradients with respect to x of the equality constraints
(3.5) and of the active inequality constraints in (3.4) must be linearly
independent. These conditions are also referred to as the constraint qual-
ifications. In cases when these do not hold, see Seierstad and Sydsæter
(1987) for details on weaker constraint qualifications.
Before proceeding further, let us recapitulate the optimal control
problem under consideration in this chapter:
⎧ max J = ∫₀ᵀ F (x, u, t) dt + S[x(T ), T ],
⎪ subject to
⎪ ẋ = f (x, u, t), x(0) = x₀,
⎨                                                        (3.7)
⎪ g(x, u, t) ≥ 0,
⎪ a(x(T ), T ) ≥ 0,
⎩ b(x(T ), T ) = 0.

To state the maximum principle we define the Hamiltonian function


H : E n × E m × E n × E 1 → E 1 as
H(x, u, λ, t) := F (x, u, t) + λf (x, u, t), (3.8)
where λ ∈ Eⁿ (a row vector). We also define the Lagrangian function
L : Eⁿ × Eᵐ × Eⁿ × E^q × E¹ → E¹ as

L(x, u, λ, μ, t) := H(x, u, λ, t) + μg(x, u, t), (3.9)


where μ ∈ E q is a row vector, whose components are called Lagrange
multipliers. These Lagrange multipliers satisfy the complementary slack-
ness conditions
μ ≥ 0, μg(x, u, t) = 0,
which, in view of (3.3), can be expressed equivalently as
μi ≥ 0, μi gi (x, u, t) = 0, i = 1, 2, . . . , q.
The adjoint vector satisfies the differential equation
λ̇ = −Lx (x, u, λ, μ, t) (3.10)

with the terminal condition

⎧ λ(T ) = Sx(x(T ), T ) + αax(x(T ), T ) + βbx(x(T ), T ),
⎨                                                             (3.11)
⎩ α ≥ 0, αa(x(T ), T ) = 0,

where α ∈ E^(la) and β ∈ E^(lb) are constant vectors.


The maximum principle states that the necessary conditions for u∗ ,
with the corresponding state trajectory x∗ , to be an optimal control are
that there should exist continuous and piecewise continuously differen-
tiable functions λ, piecewise continuous functions μ, and constants α and
β such that (3.12) holds, i.e.,

ẋ∗ = f (x∗ , u∗ , t), x∗ (0) = x0 ,

satisfying the terminal constraints

a(x∗ (T ), T ) ≥ 0 and b(x∗ (T ), T ) = 0,

λ̇ = −Lx (x∗ , u∗ , λ, μ, t)

with the terminal conditions

λ(T ) = Sx (x∗ (T ), T ) + αax (x∗ (T ), T ) + βbx (x∗ (T ), T ),

α ≥ 0, αa(x∗ (T ), T ) = 0,

the Hamiltonian maximizing condition (3.12)

H[x∗ (t), u∗ (t), λ(t), t] ≥ H[x∗ (t), u, λ(t), t]

at each t ∈ [0, T ] for all u satisfying

g[x∗ (t), u, t] ≥ 0,

and the Lagrange multipliers μ(t) are such that


 
(∂L/∂u)|_{u=u∗(t)} = (∂H/∂u + μ ∂g/∂u)|_{u=u∗(t)} = 0
and the complementary slackness conditions

μ(t) ≥ 0, μ(t)g(x∗ , u∗ , t) = 0 hold.



In the case of the terminal constraint (3.6), note that the terminal
conditions on the state and the adjoint variables in (3.12) will be re-
placed, respectively, by

x∗ (T ) ∈ Y (T ) ⊂ X(T ) (3.13)

and

[λ(T ) − Sx (x∗ (T ), T )][y − x∗ (T )] ≥ 0, ∀y ∈ Y (T ). (3.14)

In Exercise 3.5, you are asked to derive (3.14) from (3.12) in the one
dimensional case when Y (T ) = Y = [x, x̄] for each T > 0, where x and
x̄ are two constants such that x̄ > x.
In the case when the terminal time T ≥ 0 in the problem (3.7) is
also a decision variable, there is an additional necessary transversality
condition for T ∗ to be optimal, namely,

H[x∗(T ∗), u∗(T ∗), λ(T ∗), T ∗] + ST [x∗(T ∗), T ∗]

     + αaT [x∗(T ∗), T ∗] + βbT [x∗(T ∗), T ∗] = 0,    (3.15)

provided T ∗ is an interior solution, i.e., T ∗ ∈ (0, ∞). In other words,


optimal T ∗ and x∗ (t), u∗ (t), t ∈ [0, T ∗ ], must satisfy (3.12) with T
replaced by T ∗ and (3.15). This condition will be further discussed and
illustrated with examples in Sect. 3.5. The discussion will also include
the case when T is restricted to lie in the interval [T1 , T2 ], T2 > T1 ≥ 0.
We will now illustrate the use of the maximum principle (3.12) by
solving a simple example.

Example 3.1 Consider the problem:


max J = ∫₀¹ u dt

subject to

ẋ = u, x(0) = 1, (3.16)
u ≥ 0, x − u ≥ 0. (3.17)

Note that constraints (3.17) are of the mixed type (3.3). They can also
be rewritten as 0 ≤ u ≤ x.

Solution The Hamiltonian is

H = u + λu = (1 + λ)u,

so that the optimal control has the form

u∗ (x, λ) = bang[0, x; 1 + λ]. (3.18)

To get the adjoint equation and the multipliers associated with con-
straints (3.17), we form the Lagrangian:

L = H + μ1 u + μ2 (x − u) = μ2 x + (1 + λ + μ1 − μ2 )u.

From this we get the adjoint equation


λ̇ = −∂L/∂x = −μ₂, λ(1) = 0.    (3.19)
Also note that the optimal control must satisfy
∂L/∂u = 1 + λ + μ₁ − μ₂ = 0,    (3.20)
and μ1 and μ2 must satisfy the complementary slackness conditions

μ1 ≥ 0, μ1 u = 0, (3.21)
μ2 ≥ 0, μ2 (x − u) = 0. (3.22)

It is reasonable in this simple problem to guess that u∗ (t) = x(t) is an


optimal control for all t ∈ [0, 1]. We now show that this control satisfies
all the conditions of the Lagrangian form of the maximum principle.
Since x(0) = 1, the control u∗ = x gives x = eᵗ as the solution of
(3.16). Because x = eᵗ > 0, it follows that u∗ = x > 0. Thus, μ₁ = 0
from (3.21).
From (3.20) we then have

μ2 = 1 + λ.

Substituting this into (3.19) and solving gives

1 + λ(t) = e^(1−t).    (3.23)

Since the right-hand side of (3.23) is always positive, u∗ = x satisfies


(3.18). Notice that μ₂ = e^(1−t) ≥ 0 and x − u∗ = 0, so (3.22) holds.

Using u∗ = x in (3.16), we can obtain the optimal state trajectory


x∗(t) = eᵗ. Thus, the optimal value of the objective function is

J∗ = ∫₀¹ eᵗ dt = e − 1.

Let us now examine the consequence of changing the constraint x −


u ≥ 0 on control u to x − u ≥ −ε, which gives u ≤ x + ε for a small ε. In
this case, it is clear that the optimal control u∗ = x + ε, which we can
use in (3.16) to obtain x∗(t) = eᵗ(1 + ε) − ε. The optimal value of the
objective function changes to
∫₀¹ u(t) dt = ∫₀¹ eᵗ(1 + ε) dt = (e − 1)(1 + ε).

This means that J∗ increases by (e − 1)ε, which in this case equals
ε∫₀¹ μ₂(t) dt = ε∫₀¹ e^(1−t) dt, as stipulated in Remark 3.8.
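A two-line numerical check (ours) confirms the factor, since
∫₀¹ e^(1−t) dt = e − 1 ≈ 1.718:

    import math

    # Midpoint rule for the integral of mu2(t) = e^(1 - t) over [0, 1]
    n = 10_000
    dt = 1.0 / n
    integral = sum(math.exp(1 - (i + 0.5) * dt) for i in range(n)) * dt

    print(integral, math.e - 1)   # both approximately 1.71828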

We conclude Sect. 3.1 with the following remarks.

Remark 3.2 Strictly speaking, we should have H = λ0 F + λf in (3.8)


with (λ₀, λ(t)) ≠ (0, 0) for all t ∈ [0, T ]. However, when λ₀ = 0, the
conditions in the maximum principle do not change if we replace F by any
other function. Therefore, the problems where the maximum principle
holds only with λ0 = 0 are termed abnormal. Such problems may arise
when there are terminal state constraints such as (3.4) and (3.5) or pure
state constraints treated in Chap. 4. In this book, as is standard in
the economics literature dealing with optimal control theory, we will set
λ0 = 1. This is because the problems that are of interest to us will be
normal. For examples of abnormal problems and further discussion on
this issue, see Seierstad and Sydsæter (1987).

Remark 3.3 The function defined in (3.9) is not a Lagrangian function


in the sense of the continuous-time counterpart of the Lagrangian func-
tion defined in (8.45) in Chap. 8. However, it can be viewed, roughly
speaking, as a Lagrangian function associated with the problem of max-
imizing the Hamiltonian (3.8) subject to the constraints (3.3) along the
optimal path. As in this book, some people refer to (3.9) as a Lagrangian
function, while others call it an extended Pontryagin function.

Remark 3.4 It should be pointed out that if the set Y in (3.6) consists
of a single point Y = {k}, making the problem a fixed-end-point prob-
lem, then the transversality condition reduces simply to λ(T ) being equal to

a constant to be determined, since x∗ (T ) = k. In this case the salvage


function S becomes a constant, and can therefore be disregarded. When
Y = X, the terminal condition in (3.12) reduces to (2.30). Further dis-
cussion of the terminal conditions can be found in Sect. 3.4 along with a
summary in Table 3.1.

Remark 3.5 As in Chap. 2, it can be shown that λi (t), i = 1, 2, ..., n,


is interpreted as the marginal value of an increment in the state variable
xi at time t. Specifically, the relation (2.17) holds so long as the value
function V (x, t), defined in (2.10), is continuously differentiable in xi ;
see Seierstad and Sydsæter (1987).

Remark 3.6 The Lagrange multiplier αi, i = 1, 2, . . . , la, represents the


shadow price associated with the terminal state constraint ai (x(T ), T ) ≥
0. Thus, if we change this constraint to ai (x(T ), T ) ≥ ε for a small ε, then
the change in the objective function will be −εαi + o(ε). A similar inter-
pretation holds for the multiplier β; see Sect. 3.4 for further discussion.
This will be illustrated in Example 3.4 and Exercise 3.17.

Remark 3.7 In the case when the terminal constraint (3.4) or (3.5) is
binding, the transversality condition λ(T ) in (3.12) should be viewed as
the left-hand limit, limt↑T λ(t), sometimes written as λ(T − ), and then
we would express λ(T ) = Sx (x∗ (T ), T ). However, the standard practice
for problems treated in Chaps. 2 and 3 is to use the notation that we
have used. Nevertheless, care should be exercised in distinguishing the
marginal value of the state at time T given by Sx (x∗ (T ), T ) and the
shadow prices for the terminal constraints (3.4) and (3.5) given by α and
β, respectively. See Sect. 3.4 and Example 3.4 for further elaboration.

Remark 3.8 It is also possible to provide marginal value interpretations


to Lagrange multipliers μi, i = 1, 2, . . . , q. If we change the constraint
gi(x, u, t) ≥ 0 to gi(x, u, t) ≥ ε for a small ε, then we expect the change
in the optimal value of the objective function to be −ε∫₀ᵀ μi(t) dt + o(ε);
see Peterson (1973, 1974) or Malanowski (1984). If ε < 0, then the
constraint is being relaxed, and ∫₀ᵀ μi(t) dt ≥ 0 provides the marginal
value of relaxing the constraint. We will illustrate this concept with the
help of Example 3.1.

Remark 3.9 In the case when the problem (3.7) is changed by inter-
changing x(T ) and x(0) so that the initial condition x(0) = x0 is re-
placed by x(T ) = xT , and S(x(T ), T ), a(x(T ), T ) and b(x(T ), T ) are

replaced by S(x(0)), a(x(0)) and b(x(0)), respectively, then in the maxi-


mum principle (3.12), we need to replace initial condition x∗ (0) = x0 by
x∗ (T ) = xT and the terminal condition on the adjoint variable λ by the
initial condition λ(0) = Sx (x∗ (0)) + αax (x∗ (0)) + βbx (x∗ (0)) with α ≥ 0
and αa(x∗ (0)) = 0.

3.2 Sufficiency Conditions


In this section we will state, without proof, a number of sufficiency re-
sults. These results require the concepts of concave and quasiconcave
functions.
Recall from Sect. 1.4 that with D ⊂ E n , a convex set, a function
ψ : D → E 1 is concave, if for all y, z ∈ D and for all p ∈ [0, 1],

ψ(py + (1 − p)z) ≥ pψ(y) + (1 − p)ψ(z). (3.24)

The function ψ is quasiconcave if (3.24) is relaxed to

ψ(py + (1 − p)z) ≥ min{ψ(y), ψ(z)}, (3.25)

and ψ is strictly concave if, for y ≠ z and p ∈ (0, 1), (3.24) holds with
a strict inequality. Furthermore, ψ is convex, quasiconvex, or strictly

a strict inequality. Furthermore, ψ is convex, quasiconvex, or strictly
convex if −ψ is concave, quasiconcave, or strictly concave, respectively.
Note that linearity implies both concavity and convexity, and concavity
implies quasiconcavity. For further details on the properties of such
functions, see Mangasarian (1969).
We can now state a sufficiency result concerning the problem with
mixed constraints stated in (3.7). For this purpose, let us define the
maximized Hamiltonian

H⁰(x, λ, t) = max_{u|g(x,u,t)≥0} H(x, u, λ, t).    (3.26)

Theorem 3.1 Let (x∗ , u∗ , λ, μ, α, β) satisfy the necessary conditions in


(3.12). If H⁰(x, λ(t), t) is concave in x at each t ∈ [0, T ], S in (3.2) is
concave in x, g in (3.3) is quasiconcave in (x, u), a in (3.4) is quasicon-
cave in x, and b in (3.5) is linear in x, then (x∗ , u∗ ) is optimal.

The result is a straightforward extension of Theorem 2.1. See, e.g.,


Seierstad and Sydsæter (1977, 1987) and Feichtinger and Hartl (1986).
In Exercise 3.7 you are asked to check these sufficiency conditions for
Example 3.1.

3.3 Current-Value Formulation


In most management science and economics problems, the objective func-
tion is usually formulated in terms of money or utility. These quantities
have time value, and therefore the future streams of money or utility are
discounted. The discounted objective function can be written as a spe-
cial case of (3.2) by assuming that the time dependence of the relevant
functions comes only through the discount factor. Thus,

F (x, u, t) = φ(x, u)e^(−ρt) and S(x, T ) = ψ(x)e^(−ρT),    (3.27)

where we assume the discount rate ρ > 0. We should also mention that
if F (x, u, t) = φ(x, u, t)e^(−ρt) and S(x, T ) = ψ(x, T )e^(−ρT), then there is no
advantage of developing a current-value version of the maximum princi-
ple, and it is recommended that the present-value formulation be used
in this case.
Now, the objective in problem (3.7) can be written as:
max J = ∫₀ᵀ φ(x, u)e^(−ρt) dt + ψ[x(T )]e^(−ρT).    (3.28)

For this problem, the Hamiltonian, which we shall now refer to as


the present-value Hamiltonian, H^pv, is

H^pv := e^(−ρt)φ(x, u) + λ^pv f (x, u, t)    (3.29)

and the present-value Lagrangian is

L^pv := H^pv + μ^pv g(x, u, t)    (3.30)

with the present-value adjoint variables λ^pv and present-value multipliers
α^pv and β^pv satisfying

λ̇^pv = −L^pv_x,    (3.31)

λ^pv(T ) = Sx[x(T ), T ] + α^pv ax(x(T ), T ) + β^pv bx(x(T ), T )
         = e^(−ρT)ψx[x(T )] + α^pv ax(x(T ), T ) + β^pv bx(x(T ), T ),    (3.32)

α^pv ≥ 0, α^pv a(x(T ), T ) = 0,    (3.33)

and μ^pv satisfying

μ^pv ≥ 0, μ^pv g = 0.    (3.34)
We use superscript pv in this section to distinguish these from the
current-value functions defined as follows. Elsewhere, we do not need to

make the distinction explicitly since we will either be using the present-
value definitions or the current-value definitions of these functions. The
reader will always be able to tell what is meant from the context.
We now define the current-value Hamiltonian

H[x, u, λ, t] := φ(x, u) + λf (x, u, t) (3.35)

and the current-value Lagrangian

L[x, u, λ, μ, t] := H + μg(x, u, t). (3.36)

To see why we can do this, we note that if we define

λ := e^(ρt)λ^pv and μ := e^(ρt)μ^pv,    (3.37)

we can rewrite (3.29) and (3.30) as

H = e^(ρt)H^pv and L = e^(ρt)L^pv.    (3.38)

Since e^(ρt) > 0, maximizing H^pv with respect to u at time t is equivalent to


maximizing the current-value Hamiltonian H with respect to u at time
t. Furthermore, from (3.37),
λ̇ = ρe^(ρt)λ^pv + e^(ρt)λ̇^pv.    (3.39)

The first term on the right-hand side of (3.39) is simply ρλ using the
definition in (3.37). To simplify the second term we use the differential
equation (3.31) for λ^pv and the fact that Lx = e^(ρt)L^pv_x from (3.38). Thus,

λ̇ = ρλ − Lx ,

λ(T ) = ψx[x(T )] + αax(x(T ), T ) + βbx(x(T ), T ),    (3.40)


where the terminal condition for λ(T ) follows immediately from the ter-
minal condition for λpv (T ) in (3.32), the definition (3.38),

α = e^(ρT)α^pv and β = e^(ρT)β^pv.    (3.41)

The complementary slackness conditions satisfied by the current-


value Lagrange multipliers μ and α are

μ ≥ 0, μg = 0, α ≥ 0, and αa = 0

on account of (3.33), (3.34), (3.37), and (3.41).



We will now state the maximum principle in terms of the current-


value functions. It states that the necessary conditions for u∗ , with the
corresponding state trajectory x∗ , to be an optimal control are that there
exist λ and μ such that the conditions (3.42) hold, i.e.,

ẋ∗ = f (x∗ , u∗ , t),

a(x∗ (T ), T ) ≥ 0, b(x∗ (T ), T ) = 0,

λ̇ = ρλ − Lx [x∗ , u∗ , λ, μ, t], with the terminal conditions

λ(T ) = ψx(x∗(T )) + αax(x∗(T ), T ) + βbx(x∗(T ), T ),

α ≥ 0, αa(x∗ (T ), T ) = 0,

and the Hamiltonian maximizing condition


(3.42)
H[x∗ (t), u∗ (t), λ(t), t] ≥ H[x∗ (t), u, λ(t), t]

at each t ∈ [0, T ] for all u satisfying

g[x∗ (t), u, t] ≥ 0,

and the Lagrange multipliers μ(t) are such that

(∂L/∂u)|_{u=u∗(t)} = 0, and the complementary slackness

conditions μ(t) ≥ 0 and μ(t)g(x∗ , u∗ , t) = 0 hold.

As in Sect. 3.1, when the terminal constraint is given by (3.6) instead


of (3.4) and (3.5), we need to replace the terminal condition on the state
and the adjoint variables, respectively, by (3.13) and

[λ(T ) − ψx(x∗(T ))][y − x∗(T )] ≥ 0, ∀y ∈ Y (T ).    (3.43)

See also Remark 3.4, which applies here as well.


If T ≥ 0 is also a decision variable and if T ∗ is the optimal terminal
time, then the optimal solution x∗ , u∗ , and T ∗ must satisfy (3.42) with
T replaced by T ∗ along with

H[x∗(T ∗), u∗(T ∗), λ(T ∗), T ∗] − ρψ[x∗(T ∗)]

     + αaT [x∗(T ∗), T ∗] + βbT [x∗(T ∗), T ∗] = 0.    (3.44)

You are asked in Exercise 3.8 to show that (3.44) is the current-value
version of (3.15) under the relation (3.27). Furthermore, show how (3.44)
should be modified if S(x, T ) = ψ(x, T )e^(−ρT) in (3.27).
As for the sufficiency conditions for the current-value formulation,
one can simply use Theorem 3.1 as if it were stated for the current-value
formulation.

Example 3.2 We illustrate an application of the current-value maxi-


mum principle by solving the consumption problem of Example 1.3 with
U (C) = ln C and W (T ) = 0. Thus, we solve
max_{C(t)≥0} J = ∫₀ᵀ e^(−ρt) ln C(t) dt + B(0)e^(−ρT)

subject to the wealth dynamics

Ẇ = rW − C, W (0) = W0 , W (T ) = 0,

where W0 > 0. As hinted in Exercise 2.29(a), we do not need to impose


the pure state constraint W (t) ≥ 0, t ∈ [0, T ], in view of C(t) ≥ 0, t ∈
[0, T ], and W (T ) = 0. Also, the salvage function reduces to B(0), which
is a constant; see Remark 3.4.

Solution In Exercise 2.29(a) we used the standard Hamiltonian for-


mulation to solve the problem. We now demonstrate the use of the
current-value Hamiltonian formulation:

H = ln C + λ(rW − C), (3.45)

with the adjoint equation

λ̇ = ρλ − ∂H/∂W = (ρ − r)λ, λ(T ) = β,    (3.46)
where β is some constant to be determined. The solution of (3.46) is

λ(t) = βe^((ρ−r)(t−T)).    (3.47)

To find the optimal control, we maximize H by differentiating (3.45)


with respect to C and setting the result to zero:

∂H/∂C = 1/C − λ = 0,

which implies
C∗(t) = 1/λ(t) = (1/β)e^((ρ−r)(T−t)).    (3.48)
Using this consumption level in the wealth dynamics gives

Ẇ (t) = rW (t) − (1/β)e^((ρ−r)(T−t)), W (0) = W₀,

which can be solved as



W ∗(t) = e^(rt)[W₀ − e^((ρ−r)T)(1 − e^(−ρt))/(ρβ)].    (3.49)

Setting W ∗(T ) = 0 gives β = e^((ρ−r)T)(1 − e^(−ρT))/ρW₀. Therefore, the


optimal consumption rate and wealth at time t are
C∗(t) = ρW₀e^((r−ρ)t)/(1 − e^(−ρT)),   W ∗(t) = e^(rt)W₀[e^(−ρt) − e^(−ρT)]/(1 − e^(−ρT)).    (3.50)

The optimal value of the objective function is


 
J∗ = [(1 − e^(−ρT))/ρ] ln[ρW₀/(1 − e^(−ρT))]
     + [(r − ρ)/ρ][1/ρ − e^(−ρT)(T + 1/ρ)] + B(0)e^(−ρT).    (3.51)
The interpretation of the current-value functions are that these func-
tions reflect the values at time t in terms of the current (or, time-t)
dollars. The standard functions, on the other hand, reflect the values at
time t in terms of time-zero dollars. For example, the standard adjoint
variable λpv (t) can be interpreted as the marginal value per unit increase
in the state at time t, in the same units as that of the objective function
(3.28), i.e., in terms of time-zero dollars; see Sect. 2.2.4. On the other
hand, λ(t) = e^(ρt)λ^pv(t) is obviously the same value expressed in terms of
current (or, time-t) dollars.
For the consumption problem of Example 3.2, note that the current-
value adjoint function

λ(t) = e^((ρ−r)t)(1 − e^(−ρT))/ρW₀.    (3.52)

This gives the marginal value per unit increase in wealth at time t in
time-t dollars. In Exercise 2.29(a), the standard adjoint variable was
λ^pv(t) = e^(−rt)(1 − e^(−ρT))/ρW₀, which can be written as λ^pv(t) = e^(−ρt)λ(t).

Thus, it is clear that λpv (t) expresses the same marginal value in time-
zero dollars. In particular,
dJ∗/dW₀ = (1 − e^(−ρT))/ρW₀ = λ(0) = λ^pv(0)

gives the marginal value per unit increase in the initial wealth W₀.
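The relation dJ∗/dW₀ = λ(0) is also easy to confirm numerically. In the
sketch below (ours; the values ρ = 0.1, r = 0.05, T = 10, W₀ = 100, and
B(0) = 0 are illustrative only), a central difference of the closed-form
value (3.51) in W₀ matches λ(0) from (3.52):

    import math

    rho, r, T, B0 = 0.1, 0.05, 10.0, 0.0

    def J_star(W0):
        """Optimal value (3.51), with B(0) = B0 treated as a constant."""
        a = (1 - math.exp(-rho * T)) / rho
        return (a * math.log(rho * W0 / (1 - math.exp(-rho * T)))
                + ((r - rho) / rho) * (1 / rho - math.exp(-rho * T) * (T + 1 / rho))
                + B0 * math.exp(-rho * T))

    W0, h = 100.0, 1e-6
    dJ = (J_star(W0 + h) - J_star(W0 - h)) / (2 * h)   # central difference
    lam0 = (1 - math.exp(-rho * T)) / (rho * W0)       # lambda(0) from (3.52)

    print(dJ, lam0)   # both approximately 0.06321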
In Exercise 3.11, you are asked to formulate and solve a consumption
problem of an economy. The problem is a linear version of the famous
Ramsey model; see Ramsey (1928) and Feichtinger and Hartl (1986, p.
201).
Before concluding this section on the current-value formulation, let
us also provide the current-value version of the HJB equation (2.15)
or (2.19) along with the terminal condition (2.16). As in (2.9), we now
define the value function for the problem (3.7), with its objective function
replaced by (3.28), as follows:
    V(x, t) = max_{u|g(x,u,t)≥0} { ∫_t^T e^{−ρ(s−t)} φ(x(s), u(s), s) ds + e^{−ρ(T−t)} ψ(x(T)) }
              if x(T) satisfies a(x(T), T) ≥ 0 and b(x(T), T) = 0,
    and V(x, t) = −∞, otherwise.                                         (3.53)
Then proceeding as in Sect. 2.1.1, we have
    V(x, t) = max_{u(τ)|g(x(τ),u(τ),τ)≥0, τ∈[t,t+δt]} { ∫_t^{t+δt} e^{−ρ(τ−t)} φ[x(τ), u(τ), τ] dτ
              + e^{−ρδt} V[x(t + δt), t + δt] }.                         (3.54)
Noting that e^{−ρδt} = 1 − ρδt + o(δt) and continuing on as in Sect. 2.1.1,
we can obtain the current-value version of (2.15) and (2.19) as

    ρV(x, t) = max_{u|g(x,u,t)≥0} {φ(x, u, t) + Vx(x, t)f(x, u, t) + Vt(x, t)}
             = max_{u|g(x,u,t)≥0} {H(x, u, Vx, t) + Vt},                 (3.55)
where H is defined as in (3.35).
Finally, we can write the terminal condition as


    V(x, T) = ψ(x)   if a(x, T) ≥ 0 and b(x, T) = 0,
              −∞     otherwise.                                          (3.56)

3.4 Transversality Conditions: Special Cases


Terminal conditions on the adjoint variables, also known as transversality
conditions, are extremely important in optimal control theory. Because
the salvage value function ψ(x) is known, we know the marginal value
per unit change in the state at terminal time T. Since λ(T ) must be equal
to this marginal value, it provides us with the boundary conditions for
the differential equations for the adjoint variables. We will now derive
the terminal or transversality conditions for the current-value adjoint
variables for some important special cases of the general problem treated
in Sect. 3.3. We also summarize these conditions in Table 3.1.

Case 1: Free-end point. In this case, we do not put any constraints on


the terminal state x(T ). Thus,

x(T ) ∈ X(T ).

From the terminal conditions in (3.42), it is obvious that for the


free-end-point problem, i.e., when Y (T ) = X(T ),

λ(T ) = ψ x [x∗ (T )]. (3.57)

This includes the condition λ(T ) = 0 in the special case of ψ(x) ≡ 0;


see Example 3.1, specifically (3.19). These conditions are repeated in
Table 3.1, Row 1.
The economic interpretation of λ(T ) is that it equals the marginal
value of a unit increment in the terminal state evaluated at its optimal
value x∗ (T ).

Case 2: Fixed-end point. In this case, which is the other extreme from
the free-end-point case, the terminal constraint is

b(x(T ), T ) = x(T ) − k = 0,

and the terminal conditions in (3.42) do not provide any information for
λ(T ). However, as mentioned in Remark 3.4 and recalled subsequently
in connection with (3.42), λ(T ) will be some constant β, which will be
determined by solving the boundary value problem, where the system
of differential equations consists of the state equations with both initial
and terminal conditions and the adjoint equations with no boundary
conditions. This condition is repeated in Table 3.1, Row 2. Example 3.2
solved in the previous section illustrates this case.

The economic interpretation of λ(T ) = β is as follows. The constant


β times ε, i.e., βε, provides the value that could be lost if the fixed-end
point were specified to be k + ε instead of k; see Exercise 3.12.

Case 3: Lower bound. Here we restrict the ending value of the state
variable to be bounded from below, namely,

a(x(T ), T ) = x(T ) − k ≥ 0,

where k ∈ X. In this case, the terminal conditions in (3.42) reduce to

λ(T ) ≥ ψ x [x∗ (T )] (3.58)

and
{λ(T ) − ψ x [x∗ (T )]}{x∗ (T ) − k} = 0, (3.59)
with the recognition that the shadow price of the inequality constraint
(3.4) is
α = λ(T ) − ψ x [x∗ (T )] ≥ 0. (3.60)
For ψ(x) ≡ 0, these terminal conditions can be written as

λ(T ) ≥ 0 and λ(T )[x∗ (T ) − k] = 0. (3.61)

These conditions are repeated in Table 3.1, Row 3.

Case 4: Upper bound. Similarly, when the ending value of the state
variable is bounded from above, i.e., when the terminal constraint is

k − x(T ) ≥ 0,

the conditions for this opposite case are

λ(T ) ≤ ψ x [x∗ (T )] (3.62)

and (3.59). These are repeated in Table 3.1, Row 4. Furthermore, (3.62)
can be related to the condition on λ(T ) in (3.42) by setting

α = ψ x [x∗ (T )] − λ(T ) ≥ 0. (3.63)

Case 5: A general case. A general ending condition is

x(T ) ∈ Y (T ) ⊂ X(T ),

which is already stated in (3.6). The transversality conditions are spec-


ified in (3.43) and repeated in Table 3.1, Row 5.

An important situation which gives rise to a one-sided constraint


occurs when there is an isoperimetric or budget constraint of the form
                ∫_0^T l(x, u, t) dt ≤ K,                                 (3.64)

where l : E^n × E^m × E^1 → E^1 is assumed to be nonnegative, bounded,
and continuously differentiable, and K is a positive constant representing
the amount of a budgeted resource. To see how this constraint can be
converted into a lower bound constraint, we define an additional state
variable xn+1 by the state equation
    ẋ_{n+1} = −l(x, u, t),   x_{n+1}(0) = K,   x_{n+1}(T) ≥ 0.          (3.65)
We employ the index n + 1 simply because we already have n state vari-
ables x = (x1 , x2 , . . . , xn ). Also Eq. (3.65) becomes an additional equa-
tion which is added to the original system.
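
In a numerical implementation this augmentation is mechanical: the remaining
budget is simply carried along as an extra state. Here is a minimal Python
sketch, in which the running expenditure l(x, u, t) = u² and the budget K are
hypothetical choices made only for illustration:

    import numpy as np

    def augmented_rhs(t, state, u):
        x, budget = state            # budget plays the role of x_{n+1}
        l = u ** 2                   # hypothetical expenditure rate l(x, u, t)
        return np.array([u, -l])     # x_dot = u here; budget_dot = -l as in (3.65)

    K, T, n = 5.0, 2.0, 2000
    state = np.array([1.0, K])       # x(0) = 1, x_{n+1}(0) = K
    dt = T / n
    for k in range(n):               # Euler integration under some feasible control
        state = state + dt * augmented_rhs(k * dt, state, u=-1.0)
    print("x_{n+1}(T) =", state[1], " (must be >= 0 for feasibility)")
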
In Exercise 3.13 you will be asked to rework the leaky reservoir prob-
lem of Exercise 2.18 with an additional isoperimetric constraint on the
total amount of water available. Later in Chap. 7, you will be asked to
solve Exercises 7.10–7.12 involving budgets for advertising expenditures.
In Table 3.1, we have summarized all the terminal or transversality
conditions discussed previously. In Sect. 3.7 we discuss model types.
We will see that, given the initial state x0 , we can completely specify a
control model by selecting a model type and a transversality condition.
In what follows, we solve two examples with lower bounds on the terminal
state illustrating the use of the transversality conditions (3.61), also
stated in Table 3.1, Row 3. Example 3.3 is a variation of the consumption
problem in Example 3.2.

Table 3.1: Summary of the transversality conditions

Row 1. Free-end point: x(T) ∈ Y(T) = X(T).
       λ(T) = ψ_x[x∗(T)];   when ψ ≡ 0: λ(T) = 0.

Row 2. Fixed-end point: x(T) = k ∈ X(T), i.e., Y(T) = {k}.
       λ(T) = β, a constant to be determined (the same when ψ ≡ 0).

Row 3. Lower bound: x(T) ∈ X(T) ∩ [k, ∞), i.e., Y(T) = {x|x ≥ k}, x(T) ≥ k.
       λ(T) ≥ ψ_x[x∗(T)] and {λ(T) − ψ_x[x∗(T)]}{x∗(T) − k} = 0;
       when ψ ≡ 0: λ(T) ≥ 0 and λ(T)[x∗(T) − k] = 0.

Row 4. Upper bound: x(T) ∈ X(T) ∩ (−∞, k], i.e., Y(T) = {x|x ≤ k}, x(T) ≤ k.
       λ(T) ≤ ψ_x[x∗(T)] and {λ(T) − ψ_x[x∗(T)]}{k − x∗(T)} = 0;
       when ψ ≡ 0: λ(T) ≤ 0 and λ(T)[k − x∗(T)] = 0.

Row 5. General constraints: x(T) ∈ Y(T) ⊂ X(T).
       {λ(T) − ψ_x[x∗(T)]}{y − x∗(T)} ≥ 0 ∀y ∈ Y(T);
       when ψ ≡ 0: λ(T)[y − x∗(T)] ≥ 0 ∀y ∈ Y(T).

Note 1. In Table 3.1, x(T) denotes the (column) vector of n state variables
and λ(T) denotes the (row) vector of n adjoint variables at the terminal
time T; X(T) ⊂ E^n denotes the reachable set of terminal states obtained by
using all possible admissible controls; and ψ : E^n → E^1 denotes the salvage
value function.
Note 2. Table 3.1 will provide transversality conditions for the standard
Hamiltonian formulation if we replace ψ with S, and reinterpret λ as being
the standard adjoint variable everywhere in the table. Also (3.15) is the
standard form of (3.44).
Example 3.3 Let us modify the objective function of the consumption
problem (Example 3.2) to take into account the salvage (bequest) value of
terminal wealth. This is the utility to the individual of leaving an estate
to his heirs upon death. Let us now assume that T denotes the time
of the individual’s death and BW (T ), where B is a positive constant,

denotes his utility of leaving wealth W (T ) to his heirs upon death. Then,
the problem is:
                max_{C(t)≥0} { J = ∫_0^T e^{−ρt} ln C(t) dt + e^{−ρT} BW(T) }   (3.66)


subject to the wealth equation


Ẇ = rW − C, W (0) = W0 , W (T ) ≥ 0. (3.67)
Solution The Hamiltonian for the problem is given in (3.45), and the ad-
joint equation is given in (3.46) except that the transversality conditions
are from Table 3.1, Row 3:
λ(T ) ≥ B, [λ(T ) − B]W ∗ (T ) = 0. (3.68)
In Example 3.2, the value of β, the terminal value of the adjoint variable,
was
                β = e^{(ρ−r)T}(1 − e^{−ρT})/ρW0.
We now have two cases: (i) β ≥ B and (ii) β < B.

In case (i), the solution of the problem is the same as that of Exam-
ple 3.2, because by setting λ(T ) = β and recalling that W ∗ (T ) = 0 in
that example, it follows that (3.68) holds.
In case (ii), we set λ(T ) = B. Then, by using B in place of β in
(3.47)–(3.49), we get λ(t) = Be^{(ρ−r)(t−T)}, C∗(t) = (1/B)e^{(ρ−r)(T−t)}, and

                W∗(t) = e^{rt} [W0 − e^{(ρ−r)T}(1 − e^{−ρt})/ρB].        (3.69)

Since β < B, we can see from (3.49) and (3.69) that the wealth level
in case (ii) is larger than that in case (i) at t ∈ (0, T ]. Furthermore, the
amount of bequest is

                W∗(T) = W0 e^{rT} − (e^{ρT} − 1)/ρB > 0.

Note that (3.68) holds for case (ii). Also, if we had used (3.42) instead
of Table 3.1, Row 3, we would have the equivalent conditions λ(T) = B + α,
α ≥ 0, αW∗(T) = 0, in place of (3.68). It is easy to see that α = β − B in case
(i) and α = 0 in case (ii).

Example 3.4 Consider the problem:


                max { J = ∫_0^2 −x dt }

subject to
ẋ = u, x(0) = 1, x(2) ≥ 0, (3.70)
− 1 ≤ u ≤ 1. (3.71)

Solution The Hamiltonian is

H = −x + λu.

Here, we do not need to introduce the Lagrange multipliers for the con-
trol constraints (3.71), since we can easily deduce that the Hamiltonian
maximizing control has the form

u∗ = bang[−1, 1; λ]. (3.72)

The adjoint equation is


λ̇ = 1 (3.73)

with the transversality conditions

λ(2) ≥ 0 and λ(2)x(2) = 0, (3.74)

obtained from (3.61) or from Table 3.1, Row 3. Since λ(t) is monotoni-
cally increasing, the control (3.72) can switch at most once, and it can
only switch from u∗ = −1 to u∗ = 1. Let the switching time be t∗ ≤ 2.
Then the optimal control is


    u∗(t) = −1 for 0 ≤ t ≤ t∗,
            +1 for t∗ < t ≤ 2.                                           (3.75)

Since the control switches at t∗ , λ(t∗ ) must be 0. Solving (3.73) gives

λ(t) = t − t∗ .

There are two cases: (i) t∗ < 2 and (ii) t∗ = 2. We analyze case (i) first.
Here λ(2) = 2 − t∗ > 0; therefore from (3.74), x(2) = 0. Solving for x(t)
with u∗ (t) given in (3.75), we obtain


    x(t) = 1 − t                            for 0 ≤ t ≤ t∗,
           (t − t∗) + x(t∗) = t + 1 − 2t∗   for t∗ < t ≤ 2.

Therefore, setting x(2) = 0 gives

x(2) = 3 − 2t∗ = 0,

which makes t∗ = 3/2. Since this satisfies t∗ < 2, we do not have to deal
with case (ii), and we have


    x∗(t) = 1 − t for 0 ≤ t ≤ 3/2,        and   λ(t) = t − 3/2.
            t − 2 for 3/2 < t ≤ 2,

Figure 3.1 shows the optimal state and adjoint trajectories. Using the
optimal state trajectory in the objective function, we can obtain its op-
timal value J ∗ = −1/4.
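
The synthesized solution can be confirmed with a few lines of Python; the
sketch below simply evaluates the closed-form trajectory and the objective
by quadrature:

    import numpy as np

    t = np.linspace(0.0, 2.0, 200001)
    x = np.where(t <= 1.5, 1.0 - t, t - 2.0)   # optimal state, switch at t* = 3/2
    print("x(2) =", x[-1])                     # = 0, so (3.74) holds with lambda(2) = 1/2
    print("J =", np.trapz(-x, t))              # ~ -0.25 = J*
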
In Exercise 3.15, you are asked to consider case (ii) by setting t∗ = 2,
and show that the maximum principle will not be satisfied in this case.

Finally, we can verify the marginal value interpretation of the adjoint


variable as indicated in Remark 3.5. For this, we first note that the
feasible region for the problem is given by x ≥ t − 2, t ∈ [0, 2]. To obtain
the value function V (x, t), we can easily obtain the optimal solution in
the interval [t, 2] for the problem beginning with x(t) = x. We use the
notation introduced in Example 2.5 to specify the optimal solution as


    u∗_{(x,t)}(s) = −1,  s ∈ [t, (x + t)/2 + 1),
                     1,  s ∈ [(x + t)/2 + 1, 2],

and

    x∗_{(x,t)}(s) = x + t − s,  s ∈ [t, (x + t)/2 + 1),
                    s − 2,      s ∈ [(x + t)/2 + 1, 2].
Then for x ≥ t − 2,

    V(x, t) = ∫_t^2 −x∗_{(x,t)}(s) ds
            = −∫_t^{(x+t)/2+1} (x + t − s) ds − ∫_{(x+t)/2+1}^2 (s − 2) ds
            = (1/4)t² − (1/4)x² + (1/2)t(x − 2) − (x − 1).               (3.76)
For x < t − 2, there is no feasible solution, and we therefore set V (x, t) =
−∞.
We can now verify that for 0 ≤ t ≤ 3/2, the value function V (x, t) is
continuously differentiable at x = x∗ (t) = 1 − t, and
Vx (x∗ (t), t) = −(1/2)x∗ (t) + (1/2)t − 1
= −(1/2)(1 − t) + (1/2)t − 1
= t − 3/2
= λ(t).
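
This verification can also be delegated to a computer algebra system. The
following sympy sketch reproduces (3.76) from the integrals above and checks
Vx(x∗(t), t) = λ(t) on [0, 3/2]:

    import sympy as sp

    x, t, s = sp.symbols('x t s')
    m = (x + t) / 2 + 1                          # the switching instant in [t, 2]
    V = -sp.integrate(x + t - s, (s, t, m)) - sp.integrate(s - 2, (s, m, 2))

    # Compare with (3.76): the difference should simplify to zero.
    V76 = sp.Rational(1, 4) * t**2 - sp.Rational(1, 4) * x**2 \
        + sp.Rational(1, 2) * t * (x - 2) - (x - 1)
    print(sp.simplify(V - V76))                  # 0

    # Along x*(t) = 1 - t, V_x should equal lambda(t) = t - 3/2.
    print(sp.simplify(sp.diff(V, x).subs(x, 1 - t)))   # t - 3/2
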
What happens when t ∈ (3/2, 2]? Clearly, for x ≥ x∗ (t) = t − 2, we
may still use (3.76) to obtain the right-hand derivative Vx+ (x∗ (t), t) =
−(1/2)x∗ (t) + (1/2)t − 1 = −(1/2)(t − 2) + (1/2)t − 1 = 0. However,
for x < x∗ (t), we have x < t − 2 for which there is no feasible solution,
and we set the left-hand derivative Vx− (x∗ (t), t) = −∞. Thus, the value

function V (x, t) is not differentiable at x∗ (t), and since Vx (x∗ (t), t) does
not exist for t ∈ (3/2, 2], (2.17) has no meaning; see Remark 2.2.
It is possible, however, to provide an economic meaning for λ(2). In
Exercise 3.17, you are asked to rework Example 3.4 with the terminal
condition x(2) ≥ 0 replaced by x(2) ≥ ε, where ε is small. Furthermore,
the solution will illustrate that α = λ(2) − 0 = 1/2, obtained by using
(3.60), represents the shadow price of the constraint as indicated in
Remark 3.7.

Figure 3.1: State and adjoint trajectories in Example 3.4

3.5 Free Terminal Time Problems


In some cases, the terminal time is not given but needs to be determined
as an additional decision. Here, the necessary conditions for a terminal
time to be optimal in the present-value and current-value formulations
are given in (3.15) and (3.44), respectively. In this section, we elabo-
rate further on these conditions as well as solve two free terminal time
examples: Examples 3.5 and 3.6.

Let us begin with a special case of the condition (3.15) for the simple
problem (2.4) when T ≥ 0 is a decision variable. When compared with
the problem (3.7), the simple problem is without the mixed constraints
and constraints at the terminal time T. Thus the transversality condition
(3.15) reduces to
H[x∗ (T ∗ ), u∗ (T ∗ ), λ(T ∗ ), T ∗ ] + ST [x∗ (T ∗ ), T ∗ ] = 0. (3.77)
This condition along with the Maximum Principle (2.31) with T replaced
by T ∗ give us the necessary conditions for the optimality of T ∗ and
u∗ (t), t ∈ [0, T ∗ ] for the simple problem (2.4) when T ≥ 0 is also a
decision variable.
An intuitively appealing way to check if the optimal T ∗ ∈ (0, ∞)
must satisfy (3.77) is to solve the problem (2.4) with the terminal time
T ∗ with u∗ (t), t ∈ [0, T ∗ ] as the optimal control trajectory, and then show
that the first-order condition for T ∗ to maximize the objective function
in a neighborhood (T ∗ − δ, T ∗ + δ) of T ∗ with δ > 0 leads to (3.77).
For this, let us set u∗ (t) = u∗ (T ∗ ), t ∈ [T ∗ , T ∗ + δ), so that we have a
control u∗ (t) that is feasible for (2.4) for any T ∈ (T ∗ − δ, T ∗ + δ), as
well as continuous at T ∗ . Let x∗ (t), t ∈ [0, T ∗ + δ] be the corresponding
state trajectory. With these we can obtain the corresponding objective
function value
    J(T) = ∫_0^T F(x∗(t), u∗(t), t) dt + S(x∗(T), T),   T ∈ (T∗ − δ, T∗ + δ),
(3.78)
which, in particular, represents the optimal value of the objective func-
tion for the problem (2.4) when T = T ∗ . Furthermore, since u∗ (t) is
continuous at T ∗ , x∗ (t) is continuously differentiable there, and so is
J(T ). In this case, since T ∗ is optimal, it must satisfy
dJ(T )
J  (T ∗ ) := |T =T ∗ = 0. (3.79)
dT
Otherwise, we would have either J  (T ∗ ) > 0 or J  (T ∗ ) < 0. The former
situation would allow us to find a T ∈ (T ∗ , T ∗ + δ) for which J(T ) >
J(T ∗ ), and T ∗ could not be optimal since the choice of an optimal control
for (2.4) defined on the interval [0, T ] would only improve the value of
the objective function. Likewise, the latter situation would allow us to
find a T ∈ (T ∗ − δ, T ∗ ) for which J(T ) > J(T ∗ ). By taking the derivative
of (3.78), we can write (3.79) as
F (x∗ (T ∗ ), u∗ (T ∗ ), T ∗ ) + Sx [x∗ (T ∗ ), T ∗ ]ẋ∗ (T ∗ ) + ST [x∗ (T ∗ ), T ∗ ] = 0.
(3.80)

Furthermore, using the definition of the Hamiltonian in (2.18) and the


state equation and the transversality condition in (2.31), we can easily
see that (3.80) can be written as (3.77).

Remark 3.10 An intuitive way to obtain optimal T ∗ is to first solve


the problem (2.4) with a given terminal time T and obtain the optimal
value of the objective function J ∗ (T ), and then maximize J ∗ (T ) over
T. Hartl and Sethi (1983) show that the first-order condition for max-
imizing J ∗ (T ), namely, dJ ∗ (T )/dT = 0 can also be used to derive the
transversality condition (3.77).

If T is restricted to lie in the interval [T1 , T2 ], where T2 > T1 ≥ 0, then


(3.77) is still valid provided T ∗ ∈ (T1 , T2 ). As is standard, if T ∗ = T1 ,
then the = sign in (3.77) is replaced by ≤, and if T ∗ = T2 , then the = sign
in (3.77) is replaced by ≥ . In other words, if we must have T ∗ ∈ [T1 , T2 ],
then we can replace (3.77) by




    H[x∗(T∗), u∗(T∗), λ(T∗), T∗] + ST[x∗(T∗), T∗]   ≤ 0 if T∗ = T1,
                                                    = 0 if T∗ ∈ (T1, T2),
                                                    ≥ 0 if T∗ = T2.      (3.81)
Similarly, we can also obtain the corresponding versions of (3.15) and
(3.44) for the problem (3.7) and its current value version (specified in
Sect. 3.3), respectively.
We shall now illustrate (3.77) and (3.81) by solving Examples 3.5
and 3.6. To illustrate the idea in Remark 3.10, you are asked in Ex-
ercise 3.6 to solve Example 3.5 by using dJ∗(T)/dT = 0 to obtain the
optimal T ∗ .

Example 3.5 Consider the problem:


                max_{u,T} { J = ∫_0^T (x − u) dt + x(T) }                (3.82)

subject to
ẋ = −2 + 0.5u, x(0) = 17.5, (3.83)
u ∈ [0, 1], T ≥ 0.

Solution The Hamiltonian is

H = x − u + λ(−2 + 0.5u),

where λ̇ = −1, λ(T ) = 1, which gives

λ(t) = 1 + (T − t).

Then, the optimal control is given by

u∗ (t) = bang[0, 1; 0.5(T − 1 − t)]. (3.84)

In other words, u∗ (t) = 1 for 0 ≤ t ≤ T − 1 and u∗ (t) = 0 for T − 1 <


t ≤ T.
Since we must also determine the optimal terminal time T ∗ , it must
satisfy (3.77), which, in view of the fact that u∗ (T ∗ ) = 0 from (3.84),
reduces to
x∗ (T ∗ ) − 2 = 0. (3.85)
By substituting u∗ (t) in (3.83) and integrating, we obtain


    x∗(t) = 17.5 − 1.5t,      0 ≤ t ≤ T − 1,
            17 + 0.5T − 2t,   T − 1 < t ≤ T.                             (3.86)

We can now apply (3.85) to obtain

x∗ (T ∗ ) − 2 = 17 − 1.5T ∗ − 2 = 0,

which gives T ∗ = 10. Thus, the optimal solution of the problem is given
by T ∗ = 10 and
u∗ (t) = bang[0, 1; 0.5(9 − t)].
Note that if we had restricted T to be in the interval [T1 , T2 ] = [2, 8],
we would have T ∗ = 8, u∗ (t) = bang[0, 1; 0.5(7 − t)], and x∗ (8) − 2 =
5 − 2 = 3 ≥ 0, which would satisfy (3.81) at T ∗ = T2 = 8. On the other
hand, if T were restricted in the interval [T1 , T2 ] = [11, 15], then T ∗ =
11, u∗ (t) = bang[0, 1; 0.5(10 − t)], and x∗ (11) − 2 = 0.5 − 2 = −1.5 ≤ 0
would satisfy (3.81) at T ∗ = T1 = 11.
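
In the spirit of Remark 3.10, T∗ can also be located by computing J∗(T) for
each fixed T and maximizing over T. The following Python sketch does this by
grid search (the grid bounds and resolution are arbitrary choices):

    import numpy as np

    def J_star(T):                 # value of (3.82) under the control (3.84)
        t = np.linspace(0.0, T, 20001)
        u = np.where(t <= T - 1.0, 1.0, 0.0)
        x = np.where(t <= T - 1.0, 17.5 - 1.5 * t, 17.0 + 0.5 * T - 2.0 * t)
        return np.trapz(x - u, t) + x[-1]

    Ts = np.linspace(5.0, 15.0, 1001)
    vals = [J_star(T) for T in Ts]
    print("T* ~", Ts[int(np.argmax(vals))])    # ~10, agreeing with (3.85)
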
Next, we will apply the maximum principle to solve a well-known
time-optimal control problem. It is one of the problems used by Pontrya-
gin et al. (1962) to illustrate the applications of the maximum principle.

The problem also elucidates a specific instance of the synthesis of optimal


controls.
By the synthesis of optimal controls, we mean the procedure of
“patching” together various forms of the optimal controls obtained from
the Hamiltonian maximizing condition. A simple example of the syn-
thesis occurs in Example 2.5, where u∗ = 1 when λ > 0, u∗ = −1 when
λ < 0, and the control is singular when λ = 0. An optimal trajectory
starting at the given initial state variables is synthesized from these. In
Example 2.5, this synthesized solution is u∗ = −1 for 0 ≤ t < 1 and
u∗ = 0 for 1 ≤ t ≤ 2. Our next example requires a synthesis proce-
dure which is more complex. In Chap. 5, both the cash management and
equity financing models require such synthesis procedures.

Example 3.6 A Time-Optimal Control Problem. Consider a subway


train of mass m moving horizontally along a smooth linear track with
negligible friction. Let x(t) denote the position of the train, measured in
miles from the origin called the main station, along the track at time t,
measured in minutes. Then the equation of the train’s motion is governed
by Newton’s Second Law of Motion, which states that force equals mass
times acceleration. In mathematical terms, the equation of the motion
is the second-order differential equation

                m d²x(t)/dt² = mẍ(t) = u(t),
where u(t) denotes the external force applied to the train at time t
and ẍ(t) represents the acceleration in miles per minute per minute,
or miles/minute2 . This equation, along with

x(0) = x0 and ẋ(0) = y0 ,

respectively, as the initial position of the train and its initial velocity in
miles per minute, characterizes its motion completely.
For convenience in further exposition, we may assume m = 1 so that
the equation of motion can be written as

ẍ = u. (3.87)

Then, the force u can be expressed simply as acceleration or decelera-


tion (i.e., negative acceleration) depending on whether u is positive or
negative, respectively.

In order to develop the time-optimal control problem under consid-


eration, we transform (3.87) into a system of two first-order differential
equations (see Appendix A)


    ẋ = y,   x(0) = x0,
    ẏ = u,   y(0) = y0,                                                  (3.88)

where y(t) denotes the velocity of the train in miles/minute at time t.


Assume further that, for the comfort of the passengers, the maximum
acceleration and deceleration are required to be at most 1 mile/minute2 .
Thus, the control variable constraint is
u ∈ Ω = [−1, 1]. (3.89)
The problem is to find a control satisfying (3.89) such that the train
stops at the main station located at x = 0 in a minimum possible time
T. Of course, for the train to come to rest at x = 0 at time T, we
must have x(T ) = 0 and y(T ) = 0. We have thus defined the following
fixed-end-point optimal control problem:
    max { J = ∫_0^T −1 dt }

    subject to

    ẋ = y,   x(0) = x0,   x(T) = 0,
    ẏ = u,   y(0) = y0,   y(T) = 0,

    and the control constraint

    u ∈ Ω = [−1, 1].                                                     (3.90)

Note that (3.90) is a fixed-end-point problem with unspecified ter-


minal time. For this problem to be nontrivial, we must not have
x0 = y0 = 0, i.e., we must have x0 ≠ 0 or y0 ≠ 0 or both nonzero.

Solution Here we have only control constraints of the type treated in


Chap. 2, and so we can use the maximum principle (2.31). The standard
Hamiltonian function is
H = −1 + λ1 y + λ2 u,

where the adjoint variables λ1 and λ2 satisfy

    λ̇1 = 0, λ1(T) = β1   and   λ̇2 = −λ1, λ2(T) = β2,

and β 1 and β 2 are constants to be determined in the case of a fixed-end-


point problem; see Table 3.1, Row 2. We can integrate these equations
and write the solution in the form

λ1 = β 1 and λ2 = β 2 + β 1 (T − t),

where β 1 and β 2 are constants to be determined from the maximum


principle (2.31), condition (3.15), and the specified initial and terminal
values of the state variables. The Hamiltonian maximizing condition
yields the form of the optimal control to be

u∗ (t) = bang{−1, 1; β 2 + β 1 (T − t)}. (3.91)

As for the minimum time T ∗ , it is clearly zero if the train is initially


at rest at the main station, i.e., (x0, y0) = (0, 0). In this case, the problem
is trivial, u∗ (0) = 0, and there is nothing further to solve. Otherwise,
at least one of x0 or y0 is not zero, in which case the minimum time
T ∗ > 0 and the transversality condition (3.15) applies. Since y(T ) = 0
and S ≡ 0, we have

H + ST |T =T ∗ = λ2 (T ∗ )u∗ (T ∗ ) − 1 = β 2 u∗ (T ∗ ) − 1 = 0,

which together with the bang-bang control policy (3.91) implies either

λ2 (T ∗ ) = β 2 = −1 and u∗ (T ∗ ) = −1,

or
λ2 (T ∗ ) = β 2 = +1 and u∗ (T ∗ ) = +1.
Since the switching function β 2 + β 1 (T ∗ − t) is a linear function of
the time remaining, it can change sign at most once. Therefore, we have
two cases: (i) u∗ (τ ) = −1 in the interval t ≤ τ ≤ T ∗ for some t ≥ 0; (ii)
u∗ (τ ) = +1 in the interval t ≤ τ ≤ T ∗ for some t ≥ 0. We can integrate
(3.88) in each of these cases as shown in Table 3.2. Also in the table we
have the curves Γ− and Γ+ , which are obtained by eliminating t from
the expressions for x and y in each case. The parabolic curves Γ− and
Γ+ are called switching curves and are shown in Fig. 3.2.
It should be noted parenthetically that Fig. 3.2 is different from the
figures we have seen thus far, where the abscissa represented the time

Table 3.2: State trajectories and switching curves

    (i) u∗(τ) = −1 for t ≤ τ ≤ T∗:           (ii) u∗(τ) = +1 for t ≤ τ ≤ T∗:
        y(t) = T∗ − t                             y(t) = t − T∗
        x(t) = −(T∗ − t)²/2                       x(t) = (t − T∗)²/2
        Γ−: x = −y²/2 for y ≥ 0                   Γ+: x = y²/2 for y ≤ 0

dimension. In Fig. 3.2, the abscissa represents the train’s location and
the ordinate represents the train’s velocity. Thus, the point (x0 , y0 )
represents the vector of the train’s initial position and initial velocity.
A trajectory of the train over time can be represented by a curve in
this figure. For example, the bold-faced trajectory beginning at (x0 , y0 )
represents a train that is moving in the positive direction and it is slowing
down. It passes through the main station located at the origin and comes
to a momentary rest at the point that is x0 + y0²/2 miles to the right
of the main station. At this location, the train reverses its direction and
speeds up to reach the location x∗ and attain the velocity of y∗ . At this
point, it slows down gradually until it comes to rest at the main station.
In the ensuing discussion we will show that this trajectory is in fact
the minimal time trajectory beginning at the location x0 at a velocity
of y0 . We will furthermore obtain the control representing the optimal
acceleration and deceleration along the way. Finally, we will obtain the
various instants of interest, which are implicit in the depiction of the
trajectory in Fig. 3.2.
We can put Γ+ and Γ− into a single switching curve Γ as

    y = Γ(x) = Γ+(x) = −√(2x),    x ≥ 0,
               Γ−(x) = +√(−2x),   x < 0.                                 (3.92)

If the initial state (x0, y0) ≠ (0, 0) lies on the switching curve, then we have
u∗ = +1 (resp., u∗ = −1) if x0 > 0 (resp., x0 < 0); i.e., if (x0, y0) lies on
Γ+ (resp., Γ− ). In the common parlance, this means that we apply the
brakes to bring the train to a full stop at the main station. If the initial
state (x0 , y0 ) is not on the switching curve, then we choose, between
u∗ = 1 and u∗ = −1, that which moves the system toward the switching

Figure 3.2: Minimum time optimal response for Example 3.6

curve. By inspection, it is obvious that above the switching curve we


must choose u∗ = −1 and below we must choose u∗ = +1.
The other curves in Fig. 3.2 are solutions of the differential equations
starting from initial points (x0 , y0 ). If (x0 , y0 ) lies above the switching
curve Γ as shown in Fig. 3.2, we use u∗ = −1 to compute the curve as
follows:
ẋ = y, x(0) = x0 ,
ẏ = −1, y(0) = y0 .
Integrating these equations gives
    y = −t + y0,
    x = −t²/2 + y0 t + x0.
Elimination of t between these two gives
    x = (y0² − y²)/2 + x0.                                               (3.93)
This is the equation of the parabola in Fig. 3.2 through (x0 , y0 ). The
point of intersection of the curve (3.93) with the switching curve Γ+ is
obtained by solving (3.93) and the equation for Γ+ , namely 2x = y 2 ,
simultaneously, which gives

    x∗ = (y0² + 2x0)/4,   y∗ = −√((y0² + 2x0)/2),                        (3.94)

where the minus sign in the expression for y∗ in (3.94) was chosen since
the intersection occurs when y∗ is negative. The time t∗ that it takes to
reach the switching curve, called the switching time, given that we start
above it, is
    t∗ = y0 − y∗ = y0 + √((y0² + 2x0)/2).                                (3.95)
To find the minimum total time to go from the starting point (x0 , y0 )
to the origin (0,0), we substitute t∗ into the equation for Γ+ in Column
(ii) of Table 3.2; this gives


    T∗ = t∗ − y∗ = y0 + √(2(y0² + 2x0)).                                 (3.96)
Here t∗ is the time to get to the switching curve and −y∗ is the time
spent along the switching curve.
Note that the parabola (3.93) intersects the y-axis at the point
(0, +√(2x0 + y0²)) and the x-axis at the point (x0 + y0²/2, 0). This means
that for the initial position (x0, y0) depicted in Fig. 3.2, the train first
passes the main station at the velocity of +√(2x0 + y0²) and comes to a
momentary stop at the distance of (x0 + y0²/2) to the right of the main
station. There it reverses its direction, comes to within the distance of
x∗ from the main station, switches then to u∗ = +1, which slows it to a
complete stop at the main station at time T∗ given by (3.96).
As a numerical example, start at the point (x0, y0) = (1, 1). Then, the
equation of the parabola (3.93) is

    2x = 3 − y².

The switching point given by (3.94) is (3/4, −√(3/2)). Finally from (3.95),
the switching time is t∗ = 1 + √(3/2) min. Substituting into (3.96), we
find the minimum time to stop is T∗ = 1 + √6 min.
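
These numbers are easy to reproduce and to confirm by simulation. A Python
sketch using (3.94)–(3.96) together with a forward-Euler integration of
(3.88) (the step count is an arbitrary choice):

    import numpy as np

    x0, y0 = 1.0, 1.0
    x_s = (y0**2 + 2 * x0) / 4.0                   # switching point, (3.94)
    y_s = -np.sqrt((y0**2 + 2 * x0) / 2.0)
    t_s = y0 + np.sqrt((y0**2 + 2 * x0) / 2.0)     # switching time, (3.95)
    T = y0 + np.sqrt(2.0 * (y0**2 + 2 * x0))       # minimum time, (3.96)
    print(x_s, y_s, t_s, T)    # 0.75, -1.2247, 2.2247, 3.4495

    # Forward-Euler run of (3.88) under u* = -1, then u* = +1 after t_s.
    n = 200000
    dt = T / n
    x, y = x0, y0
    for k in range(n):
        u = -1.0 if k * dt < t_s else 1.0
        x, y = x + dt * y, y + dt * u
    print("(x(T), y(T)) =", (x, y))                # ~(0, 0): at rest at the station
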
To complete the solution of this example let us evaluate β 1 and β 2 ,
which are needed to obtain λ1 and λ2 . Since (1,1) is above the switching
curve, the approach to the main station is on the curve Γ+ , and therefore,
u∗(T∗) = 1 and β2 = 1. To compute β1, we observe that λ2(t∗) =
β2 + β1(T∗ − t∗) = 0 so that β1 = −β2/(T∗ − t∗) = −1/√(3/2) = −√(2/3).
Finally, we obtain x∗ = 3/4 and y∗ = −√(3/2) from (3.94).
Let us now describe the optimal solution from (1, 1) in the common
parlance. The position (1, 1) means the train is 1 mile to the right of the
main station, moving away from it at the speed of 1 mile per minute.
The control u∗ = −1 means that the brakes are applied to slow the train

down. This action brings the train to a momentary stop at a distance
of 3/2 miles to the right of the main station. Moreover, the continuation
of control u∗ = −1 means the train reverses its direction at that point
and starts speeding toward the station. When it comes to within 3/4
miles to the right of the main station at time t∗ = 1 + √(3/2), its velocity
of −√(3/2) or the speed of √(3/2) miles per minute toward the station is
too fast to come to a rest at the main station without application of the
brakes. So the control is switched to u∗ = +1 at time t∗ , which means
the brakes are applied at that time. This action brings the √ train to a

complete stop at the main station at the time of T = 1 + 6 min after
the train left its initial position (1, 1).
In Exercises 3.19–3.22, you are asked to work other examples with
different starting points above, below, and on the switching curve. Note
that t∗ = 0 by definition, if the starting point is on the switching
curve.

3.6 Infinite Horizon and Stationarity


Thus far, we have studied problems whose horizon is finite or whose
horizon length is a decision variable to be determined. In this section,
we briefly discuss the case of T = ∞ in the problem (3.7), called the in-
finite horizon case. This case is especially important in many economics
and management science problems. Our treatment of this case is largely
heuristic, since a general theory of the necessary optimality conditions
is not available. Nevertheless, we can rely upon an infinite-horizon ex-
tension of the sufficiency optimality conditions stated in Theorem 3.1.
When we put T = ∞ in (3.7) along with ρ > 0, we will generally
get a nonstationary infinite horizon problem in the sense that the var-
ious functions involved depend explicitly on the time variable t. Such
problems are extremely hard to solve. So, in this section we will devote
our attention to only stationary infinite horizon problems, which do not
depend explicitly on time t. Furthermore, it is reasonable in most cases
to assume ψ(x) ≡ 0 in infinite horizon problems. Moreover, in most eco-
nomics and management science problems, the terminal constraints, if

any, require the state variables to be nonnegative. Thus, to begin with,


we consider the problem:
    max { J = ∫_0^∞ φ(x, u)e^{−ρt} dt },

    subject to

    ẋ = f(x, u),   x(0) = x0,

    g(x, u) ≥ 0.                                                         (3.97)

This stationarity assumption means that the state equations, the


current-value adjoint equations, and the current-value Hamiltonian in
(3.35) are all explicitly independent of time t.

Remark 3.11 The concept of stationarity introduced here is different


from the concept of autonomous systems introduced in Exercise 2.9. This
is because, in the presence of discounting in (3.28), the stationarity as-
sumption (3.97) does not give us an autonomous system as defined there.
See Exercise 3.42 for further comparison between the two concepts.

When it comes to the transversality conditions in the infinite horizon


case, the situation is somewhat more complicated. Even the economic
argument for the finite horizon case fails to extend here because we do
not have a meaningful analogue of the salvage value function. Moreover,
in the free-end-point case with no salvage value, the standard maximum
principle (2.31) gives λpv (T ) = 0, which can no longer be necessary in
general for T = ∞, as confirmed by a simple counter-example in Exer-
cise 3.37. As a matter of fact, we have no general results giving condi-
tions under which the limit of the finite horizon transversality conditions
are necessary. What is true is that the maximum principle (3.42) holds
except for the transversality condition on λ(T ).
When it comes to the sufficiency of the limiting transversality condi-
tions obtained by letting T → ∞ in Theorem 3.1, the situation is much
better. As a matter of fact, we can see from the inequality (2.73) with
S(x) ≡ 0 that all we need is

    lim_{T→∞} λpv(T)[x(T) − x∗(T)] = lim_{T→∞} e^{−ρT} λ(T)[x(T) − x∗(T)] ≥ 0   (3.98)

for Theorem 2.1, and therefore Theorem 3.1, to hold. See Seierstad and
Sydsæter (1987) and Feichtinger and Hartl (1986) for further details.

In the important free-end-point case (3.97), since x(T ) is arbitrary,


(3.98) will imply

    lim_{T→∞} λpv(T) = lim_{T→∞} e^{−ρT} λ(T) = 0.                       (3.99)

While not a necessary condition as indicated earlier, it is interesting to


note that (3.99) is the limiting version of the condition in Table 3.1,
Row 1.
Another important case is that of nonnegativity constraints

    lim_{T→∞} x(T) ≥ 0.                                                  (3.100)

Then, it is clear that the transversality conditions

    lim_{T→∞} e^{−ρT} λ(T) ≥ 0   and   lim_{T→∞} e^{−ρT} λ(T)x∗(T) = 0,  (3.101)

imply (3.98). Note that these are also analogous to Table 3.1, Row 3.
We leave it as Exercise 3.38 for you to show that the limiting version
of the condition in the rightmost column of Rows 2, 3, and 4 in Table 3.1
imply (3.98). This would mean that Theorem 3.1 provides sufficient
optimality conditions for the problem (3.97), except in the free-end-point
case, i.e., when the terminal constraints a(x(T )) ≥ 0 and b(x(T )) = 0
are not present. Moreover, in the free-end-point case, we can use (3.98),
or even (3.99) with some qualifications, as discussed earlier.

Example 3.7 Let us return to Example 3.3 and now assume that we
have a perpetual charitable trust with initial fund W0 , which wants to
maximize its total discounted utility of charities C(t) over time, subject
to the terminal condition

    lim_{T→∞} W(T) ≥ 0.                                                  (3.102)

For convenience we restate the problem:


    max_{C(t)≥0} { J = ∫_0^∞ e^{−ρt} ln C(t) dt }

subject to
Ẇ = rW − C, W (0) = W0 > 0, (3.103)
and (3.102).

Solution We already know from Example 3.3 with B = 0 that we are


in case (i), and the optimal solution is given by (3.50) in Example 3.2.
It seems reasonable to explore whether or not we can obtain an optimal
solution for our infinite horizon problem by letting T → ∞ in (3.50).
Furthermore, since the limiting version of the maximum principle (3.42)
is sufficient for optimality in this case, all we need to do is to check if
the limiting solution satisfies the condition

    lim_{T→∞} e^{−ρT} λ(T) ≥ 0   and   lim_{T→∞} e^{−ρT} λ(T)W∗(T) = 0.  (3.104)

With T → ∞ in (3.50) and (3.52), we have

    W∗(t) = e^{(r−ρ)t} W0,   C∗(t) = ρW∗(t),   λ(t) = 1/ρW∗(t).          (3.105)

Since λ(t) ≥ 0 and λ(t)W ∗ (t) = 1/ρ, it is clear that (3.104) holds. Thus,
(3.105) gives the optimal solution. Using this solution in the objective
function, we obtain

    J∗ = (1/ρ) ln ρW0 + (r − ρ)/ρ²,                                      (3.106)

which we can verify to be the same as (3.51) as T → ∞.
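
The convergence can also be seen numerically; the following sketch uses
illustrative parameter values to compare the finite-horizon value (3.51)
with the limit (3.106):

    import numpy as np

    W0, r, rho = 10.0, 0.05, 0.10    # illustrative values

    def J_finite(T):                 # the finite-horizon value (3.51), B(0) = 0
        d = 1.0 - np.exp(-rho * T)
        return (d / rho) * np.log(rho * W0 / d) \
            + ((r - rho) / rho) * (1.0 / rho - np.exp(-rho * T) * (T + 1.0 / rho))

    J_inf = np.log(rho * W0) / rho + (r - rho) / rho**2   # (3.106)
    for T in (10.0, 50.0, 200.0):
        print(T, J_finite(T), J_inf)                      # J_finite(T) -> J_inf
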


It is interesting to observe from (3.105) that the optimal consumption
is increasing, constant, or decreasing if r is greater than, equal to, or less
than ρ, respectively. Moreover, if ρ = r, then W ∗ (t) = W0 , C ∗ (t) = rW0 ,
and λ(t) = 1/rW0 , which means that it is optimal to consume just the
interest earned on the invested wealth—no more, no less—and, therefore,
none of the initial wealth is ever consumed!
In the case of stationary systems, considerable attention is focused on
equilibrium where all motion ceases, i.e., the values of x and λ for which
ẋ = 0 and λ̇ = 0. The notion is that of optimal long-run stationary
equilibrium; see Arrow and Kurz (1970, Chapter 2) and Carlson and
Haurie (1987a, 1996). If an equilibrium exists, then it is defined by the
quadruple {x̄, ū, λ̄, μ̄} satisfying

f (x̄, ū) = 0,

ρλ̄ = Lx [x̄, ū, λ̄, μ̄],

μ̄ ≥ 0, μ̄g(x̄, ū) = 0,

and (3.107)

H(x̄, ū, λ̄) ≥ H(x̄, u, λ̄)

for all u satisfying

g(x̄, u) ≥ 0.

Clearly, if the initial condition x0 = x̄, the optimal control is u∗ (t) = ū


for all t. If x0 ≠ x̄, the optimal solution will have a transient phase.
Moreover, depending on the problem, the equilibrium may be attained
in a finite time or an approach to it may be asymptotic.
If the nonnegativity constraint (3.100) is added to problem (3.97),
then we may include the requirement λ̄ ≥ 0 and λ̄x̄ = 0 in (3.107).
If the constraint involving g is not imposed in (3.97), μ̄ may be
dropped from the quadruple. In this case, the long-run stationary equi-
librium is defined by the triple {x̄, ū, λ̄} satisfying
f (x̄, ū) = 0, ρλ̄ = Hx (x̄, ū, λ̄), and Hu (x̄, ū, λ̄) = 0. (3.108)
Also known in this case is that the optimal value of the objective function
can be expressed as
J ∗ = H(x0 , u∗ (0), λ(0))/ρ. (3.109)
You are asked to prove this relation in Exercise 3.40. That it holds in
Example 3.7 is quite clear when we use (3.105) in (3.109) and see that
we get (3.106).
Also, we see from Example 3.7 that when we let t → ∞ in (3.105),
we formally obtain




    (W̄, C̄, λ̄) = (0, 0, ∞)             if ρ > r,
                  (W0, ρW0, 1/ρW0)      if ρ = r,                        (3.110)
                  (∞, ∞, 0)             if ρ < r.

This is precisely the long-run stationary equilibrium that we will obtain


if we apply (3.108) along with λ̄ ≥ 0 and λ̄W̄ = 0 directly to the optimal
control problem in Example 3.7. This verification is left as Exercise 3.41.

Example 3.8 For another application of (3.108), let us return to Ex-


ample 3.7 and now assume that the wealth W is invested in a productive
activity resulting in an output rate ln W, and that the horizon T = ∞.
Since ln W is only defined for W > 0, we do not need to impose the
terminal constraint (3.102) here.
Thus, the problem is
    max_{C(t)≥0} { J = ∫_0^∞ e^{−ρt} ln C(t) dt }

subject to
Ẇ = ln W − C, W (0) = W0 > 0, (3.111)
and one task is to find the long-run stationary equilibrium for it. Note
that since the horizon is infinite, it is usual to assume no salvage value
and no terminal conditions on the state.

Solution By (3.108) we set

ln W̄ − C̄ = 0, ρ = 1/W̄ , 1/C̄ − λ̄ = 0,

which gives the equilibrium {W̄, C̄, λ̄} = {1/ρ, −ln ρ, −1/ln ρ}. Since
0 < ρ < 1, we have C̄ > 0, which satisfies the requirement that the
consumption be nonnegative. Also, the equilibrium wealth W̄ > 0.
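
The same equilibrium can be obtained numerically by handing the three
equations above to a root finder; a sketch with an illustrative value of ρ:

    import numpy as np
    from scipy.optimize import fsolve

    rho = 0.1                                  # illustrative discount rate

    def equilibrium(z):
        W, C, lam = z
        return [np.log(W) - C,                 # ln W - C = 0
                rho - 1.0 / W,                 # from rho*lambda = H_W = lambda/W
                1.0 / C - lam]                 # H_C = 1/C - lambda = 0

    print(fsolve(equilibrium, [5.0, 1.0, 1.0]))
    print([1 / rho, -np.log(rho), -1 / np.log(rho)])   # closed form: same values
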
It is important to note that the optimal long-run stationary equilib-
rium (which is also called the turnpike) is not the same as the optimal
steady-state among the set of all possible steady-states. The latter con-
cept is termed the Golden Rule or Golden Path in economics, and a
procedure to obtain it is described below. However, the two concepts
are identical if the discount rate ρ = 0; see Exercise 3.43.
The Golden Path is obtained by setting ẋ = f (x, u) = 0, which
provides the feedback control u(x) that would keep x(t) = x over
time. Then, substitute u(x) in the integrand φ(x, u) of (3.28) to obtain
φ(x, u(x)). The value of x that maximizes φ(x, u(x)) yields the Golden
Path. Of course, all of the constraints imposed on the problem have to
be respected when obtaining the Golden Path.
In some cases, there may be more than one equilibrium defined by
(3.107). If so, the equilibrium that is attained may depend on the initial

starting point. Moreover, from some special starting points, the system
may have an option to go to two or more different equilibria. Such points
are called the Sethi-Skiba points; see Appendix D.8.
For multidimensional systems consisting of two or more states, op-
timal trajectories may exhibit more complex behaviors. Of particular
importance is the concept of limit cycles. If the optimal trajectory of
a dynamical system tends to spiral in toward a closed loop in the state
space, then that closed loop is called a limit cycle. For more on this
topic, refer to Vidyasagar (2002) and Grass et al. (2008).

3.7 Model Types


Optimal control theory has been used to solve problems occurring in en-
gineering, economics, management science, and other fields. In each field
of application, certain general kinds of models which we will call model
types are likely to occur, and each such model requires a specialized
form of the maximum principle. In Chap. 2 we derived, in considerable
detail, a simple form of the continuous-time maximum principle. How-
ever, to continue to provide such details for each different version of the
maximum principle needed in later chapters of this book would be both
repetitive and lengthy.
The purpose of this section is to avoid the latter by listing most
of the different management science model types that we will use in
later chapters. For each model type, we will give a brief description of
the corresponding objective function, state equations, control and state
inequality constraints, terminal conditions, adjoint equations, and the
form of the optimal control policy. We will also indicate where each of
these model types is applied in later chapters.
The reader may wish to skim this section on first reading to get an
idea of what it contains, work a few of the exercises, and go on to the
various functional areas discussed in later chapters. Then, when specific
model types are encountered, the reader may return to read the relevant
parts of this section in more detail.
We are now able to state the general forms of all the models (with one
or two exceptions) that we will use to analyze the applications discussed
in the rest of the book. Some other model types will be explained in
later chapters.
In Table 3.3 we have listed six different combinations of φ and f
functions. If we specify the initial value x0 of the state variable x and

the constraints on the control and state variables, we can get a completely
specified optimal control model by selecting one of the model types in
Table 3.3 together with one of the terminal conditions given in Table 3.1.
The reader will see numerous examples of the uses of Tables 3.1
and 3.3 when we construct optimal control models of various applied
situations in later chapters. To help in understanding these, we will give
a brief mathematical discussion of the six model types in Table 3.3, with
an indication of where each model type will be used later in the book.
In Model Type (a) of Table 3.3 we see that both φ and f are linear
functions of their arguments. Hence it is called the linear-linear case.
The Hamiltonian is

H = Cx + Du + λ(Ax + Bu + d)
= Cx + λAx + λd + (D + λB)u. (3.112)

From (3.112) it is obvious that the optimal policy is bang-bang with the
switching function (D + λB). Since the adjoint equation is independent
of both control and state variables, it can be solved completely without
resorting to two-point boundary value methods. Examples of (a) oc-
cur in the cash balance problem of Sect. 5.1.1 and the maintenance and
replacement model of Sect. 9.1.1.
Model Type (b) of Table 3.3 is the same as Model Type (a) except
that the function C(x) is nonlinear. Thus, the term Cx appears in the
adjoint equation, and two-point boundary value methods are needed to
solve the problem. Here, there is a possibility of singular control, and a
specific example is the Nerlove-Arrow model in Sect. 7.1.1.
Model Type (c) of Table 3.3 has linear functions in the state equa-
tion and quadratic functions in the objective function. Therefore, it is
sometimes called the linear-quadratic case. In this case, the optimal
control can be expressed in a form in which the state variables enter
linearly. Such a form is known as the linear decision rule; see (D.36) in
Appendix D. A specific example of this case occurs in the production-
inventory example of Sect. 6.1.1.
Model Type (d) is a more general version of Model Type (b) in which
the state equation is nonlinear in x. Here again, there is a possibility of
singular control. The wheat trading model of Sect. 6.2.1 illustrates this
model type. The solution of a special case of the model in Sect. 6.2.3
exhibits the occurrence of a singular control.

Table 3.3: Objective, state, and adjoint equations for various model types

      Objective function    State equation        Current-value                  Form of optimal
      integrand φ =         ẋ = f =              adjoint equation λ̇ =          control policy
(a)   Cx + Du               Ax + Bu + d           λ(ρ − A) − C                   Bang-bang
(b)   C(x) + Du             Ax + Bu + d           λ(ρ − A) − C_x                 Bang-bang + singular
(c)   x^T Cx + u^T Du       Ax + Bu + d           λ(ρ − A) − 2x^T C              Linear decision rule
(d)   C(x) + Du             A(x) + Bu + d         λ(ρ − A_x) − C_x               Bang-bang + singular
(e)   c(x) + q(u)           (ax + d)b(u) + e(x)   λ(ρ − ab(u) − e_x) − c_x       Interior or boundary
(f)   c(x)q(u)              (ax + d)b(u) + e(x)   λ(ρ − ab(u) − e_x) − c_x q(u)  Interior or boundary

Note. The current-value Hamiltonian is often used when ρ > 0 is the discount
rate; the standard formulation is identical to the current-value formulation
when ρ = 0. In Table 3.3, capital letters indicate vector functions and small
letters indicate scalar functions or vectors. A function followed by an
argument in parentheses indicates a nonlinear function; when it is followed
by an argument without parentheses, it indicates a linear function. Thus,
A(x) and e(x) are nonlinear vector and scalar functions, while Ax and ax are
linear. The function d is always to be interpreted as an exogenous function
of time only.

In Model Types (e) and (f), the functions are scalar functions, and
there is only one state equation, so λ is also a scalar function. In these
cases, the Hamiltonian function is nonlinear in u. If it is concave in u,
then the optimal control is usually obtained by setting Hu = 0. If it is
convex, then the optimal control is the same as in Model Type (b).
Several examples of Model Type (e) occur in this book: the opti-
mal financing model in Sect. 5.2.1, the Vidale-Wolfe advertising model in
Sect. 7.2.1, the nonlinear extension of the maintenance and replacement
model in Sect. 9.1.4, the forestry model in Sect. 10.2.1, the exhaustible
resource model in Sect. 10.3.1, and all of the models in Chap. 11. Model
Type (f) examples are: The Kamien-Schwartz model in Sect. 9.2.1 and
the sole-owner fishery resource model in Sect. 10.1.
Although the general forms of the model are specified in Tables 3.1
and 3.3, there are a number of additional modeling tricks that are useful,
which will be employed later. We collect these as a series of remarks
below.

Remark 3.12 We sometimes need to use the absolute value function


|u| of a control variable u in forming the functions φ or f. For example,

in the simple cash balance model of Sect. 5.1, u < 0 represents buying
and u > 0 represents selling; in either case there is a transaction cost
which can be represented as c|u|. In order to handle this, we define new
control variables u1 and u2 satisfying the following relations:
u := u1 − u2 , u1 ≥ 0, u2 ≥ 0, (3.113)
u1 u2 = 0. (3.114)
Thus, we represent u as the difference of two nonnegative variables, u1
and u2 , together with the quadratic constraint (3.114). We can then
write
|u| = u1 + u2 , (3.115)
which expresses the nonlinear function |u| as a linear function with the
constraint (3.114).

We now observe that we need not impose (3.114) explicitly, provided


there are costs associated with the controls u1 and u2 , since in the pres-
ence of these costs no optimal policy would ever choose to make both of
them simultaneously positive. This is indeed the case in the cash balance
problem of Sect. 5.1, where the associated transaction costs prevent us
from simultaneously buying and selling the same security.
Thus, by doubling the number of variables and adding inequality
constraints, we are able to represent |u| as a linear function in the model.
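
The device is easy to see in a small linear program. In the sketch below
(all numbers are illustrative), a transaction cost c(u1 + u2) stands in for
c|u|, and the optimizer never makes both u1 and u2 positive:

    from scipy.optimize import linprog

    c = 2.0                                    # illustrative transaction cost rate
    # Decision vector (u1, u2) >= 0; minimize c*(u1 + u2) = c*|u|
    # subject to u1 - u2 = u for a hypothetical target u = -3.
    res = linprog(c=[c, c], A_eq=[[1.0, -1.0]], b_eq=[-3.0],
                  bounds=[(0, None), (0, None)])
    u1, u2 = res.x
    print(u1, u2, "u =", u1 - u2, "|u| =", u1 + u2)   # u1*u2 = 0 at the optimum
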

Remark 3.13 Tables 3.1 and 3.3 are constructed for continuous-time
models. Exactly the same kinds of models can be developed in the
discrete-time case; see Chap. 8.

Remark 3.14 Consider Model Types (a) and (b) when the control vari-
able constraints are defined by linear inequalities of the form
g(u, t) = g(t)u ≥ 0. (3.116)
Then, the problem of maximizing the Hamiltonian function becomes:




    max (D + λB)u

    subject to                                                           (3.117)

    g(t)u ≥ 0.

This is clearly a linear programming problem for each given instant of


time t, since the Hamiltonian function is linear in u.

Further in Model Type (a), the adjoint equation does not contain
terms in x and u, so we can solve it for λ(t), and hence the objective
function of (3.117) varies parametrically with λ(t). In this case we can
use parametric linear programming techniques to solve the problem over
time. Since the optimal solution to the linear program always occurs at
an extreme point of the convex set defined by g(t)u ≥ 0, it follows that
as λ(t) changes, the optimal solution to (3.117) will “bang” from one
extreme point of the feasible set to another. This is called a generalized
bang-bang optimal policy. Such a policy occurs, e.g., in the optimal
financing model treated in Sect. 5.2; see Table 5.1, Row 5.
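
A toy parametric linear program illustrates the jumping behavior; the
feasible set and the vectors D and B below are illustrative only, not taken
from any model in the book:

    import numpy as np
    from scipy.optimize import linprog

    D, B = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # illustrative data
    A_ub, b_ub = [[1.0, 1.0]], [1.0]                    # u1 + u2 <= 1, u >= 0

    for lam in (0.0, 0.5, 1.5, 3.0):
        res = linprog(c=-(D + lam * B), A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None), (0, None)])
        print(lam, res.x)   # optimum jumps from (1, 0) to (0, 1) as lam passes 1
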
In Model Type (b), the adjoint equation contains terms in x, so we
cannot solve for the trajectory of λ(t) without knowing the trajectory
of x(t). It is still true that (3.117) is a linear program for any given t,
but the parametric linear programming techniques will not usually work.
Instead, some type of iterative procedure is needed in general; see Bryson
and Ho (1975).
Remark 3.15 The salvage value part S[x(T ), T ] of the objective func-
tion is relevant in the optimization context in the following two cases:
Case (i) T is free and part of the problem is to determine the optimal
terminal time; see, e.g., Sect. 9.1.
Case (ii) T is fixed and the problem is that of maximizing the objec-
tive function involving the salvage value of the ending state x(T ), which
in this case can be written simply as S[x(T )].
For the fixed-end-point problem and for the infinite horizon problem,
it does not usually make much sense to define a salvage value function.
Remark 3.16 One important model type that we did not include in
Table 3.3 is the impulse control model of Bensoussan and Lions (1975).
In this model, an infinite control is instantaneously exerted on a state
variable in order to cause a finite jump in its value. This model is
particularly appropriate for the instantaneous reordering of inventory
as required in lot-size models; see Bensoussan et al. (1974). Further
discussion of impulse control is given in Sect. D.9.

Exercises for Chapter 3


E 3.1 Consider the constraint set
Ω = {(u1 , u2 )|0 ≤ u1 ≤ x, −1 ≤ u2 ≤ u1 }.
Write these in the form shown in (3.3).

E 3.2 Find the reachable set X, defined in Sect. 3.1, if x and u satisfy

ẋ = u − 1, x0 = 5, −1 ≤ u ≤ 1,

and T = 3.

E 3.3 Assume the constraint (3.3) to be of the form g(u, t) ≥ 0, i.e.,


g does not contain x explicitly, and assume x(T ) is free. Apply the
Lagrangian form of the maximum principle and derive the Hamiltonian
form (2.31) with
Ω(t) = {u|g(u, t) ≥ 0}.
Assume g(u, t) to be of the form α ≤ u ≤ β.

E 3.4 Use the Lagrangian form of the maximum principle to obtain the
optimal control for the following problem:

max{J = x1 (2)}

subject to
ẋ1 (t) = u1 − u2 , x1 (0) = 2,

ẋ2 (t) = u2 , x2 (0) = 1,


and the constraints

u1 (t) ≥ u2 (t), 0 ≤ u1 (t) ≤ x2 (t), 0 ≤ u2 (t) ≤ 2, 0 ≤ t ≤ 2.

An interpretation of this problem is that x1 (t) is the stock of steel at time


t and x2 (t) is the total capacity of the steel mill at time t. Production
of steel at rate u1 , which is bounded by the current steel mill capacity,
can be split into u2 and u1 − u2 , where u2 goes into increasing the steel
mill capacity and u1 − u2 adds to the stock of steel. The objective is to
build as large a stockpile of steel as possible by time T = 2. With this
interpretation, we clearly need to have x1 (t) ≥ 0 and x2 (t) ≥ 0. However,
it is easily seen that these constraints are automatically satisfied for every
feasible solution of the problem. You may find it interesting to show
why this is true. (It is possible to make the problem more interesting by
assuming an exogenous demand d for steel so that ẋ1 = u1 − u2 − d.)

E 3.5 Specialize the terminal condition (3.13) in the one-dimensional


case (i.e., n = 1) with Y (T ) = Y = [x, x̄] for each T > 0, where x and x̄
are two constants satisfying x̄ > x. Use (3.12) to derive (3.14).

E 3.6 Obtain the optimal value J ∗ (T ) of the objective function for Ex-
ample 3.5 for a given terminal time T, and then maximize it with respect
to T by using the conditions dJ ∗ (T )/dT = 0. Show that you get the same
optimal T ∗ as the one obtained for Example 3.5 by using (3.77).

E 3.7 Check that the solution of Example 3.1 satisfies the sufficiency
conditions in Theorem 3.1.

E 3.8 Starting from (3.15), obtain the current-value version (3.44) for
the problem defined by (3.27) and (3.28). Show further that if we
were to require the function ψ to also depend on T, i.e. if S(x, T ) =
ψ(x, T )e−ρT then the left-hand side of condition (3.44) would be modi-
fied to H[x∗ (T ∗ ), u∗ (T ∗ ), λ(T ∗ ), T ∗ ] + ψ T [x∗ (T ∗ ), T ∗ ] − ρψ[x∗ (T ∗ ), T ∗ ].

E 3.9 Develop the current-value formulation of Sect. 3.3 for a time-


varying nonnegative discount rate ρ(t), by replacing the factors e−ρt and
e−ρT in (3.28), respectively, by
    α(t) = e^{−∫_0^t ρ(s)ds}   and   α(T) = e^{−∫_0^T ρ(s)ds}.

E 3.10 Begin with (3.54) and perform the steps leading to (3.55).

E 3.11 Optimal Consumption of An Initial Investment Over a Finite


Horizon. Begin with an initial investment of x0 . Assets x(t) at time t
earn at the rate of r per dollar per unit time. A portion of the earnings
is consumed at a rate of c(t) per unit time at time t, while the remainder
is invested. Neither a negative consumption rate nor a consumption rate
exceeding the earnings is allowed. Assets depreciate at the constant rate
δ. Assume r > δ+ρ, where ρ is the discount rate applied on consumption.
Find the optimal consumption rate over a finite horizon T such that
the present value of the consumption stream over the finite horizon is
maximized. Assume that T is sufficiently large. Let us note that the
optimal capital accumulation model treated in Sect. 11.1.1 represents a
generalization of this problem.

E 3.12 Show that if we require W (T ) = ε > 0, ε small, instead of


W (T ) = 0 in Example 3.2, then the optimal value of the objective func-
tion will decrease by an amount βε = εe^{(ρ−r)T}(1 − e^{−ρT})/ρW0 + o(ε).

E 3.13 Recall Exercise 2.18 of the leaky reservoir in Chap. 2. In this


problem there was no explicit constraint on the total amount of water

available. Suppose we impose the following isoperimetric constraint on


that problem:

    ∫_0^100 u dt = K,
where K > 0 is the total amount of water which must be used. Assume
also that the reservoir has infinite capacity. Re-solve this problem for
various values of K and the objective functions in parts (a) and (b) of
Exercise 2.18.
E 3.14 From the transversality conditions for the general terminal con-
straints in Row 5 of Table 3.1, derive the transversality conditions in Row
1 for the free-end-point case, in Row 2 for the fixed-end-point case, and
in Rows 3 and 4 for the one-sided constraint cases. Assume ψ(x) = 0,
i.e., there is no salvage value and X = E 1 for simplicity.
E 3.15 For solving Example 3.3, consider case (ii) by starting with t∗ =
2, and show that the maximum principle will not be satisfied in this case.
E 3.16 Rework Example 3.4 with T = 4 and the following different
terminal conditions:
(a) x(4) unconstrained,
(b) x(4) = 1,
(c) x(4) ≤ 1,
(d) x(4) ≥ 1.
E 3.17 Rework Example 3.4 with the terminal condition (3.70) replaced
by x(2) ≥ ε, where ε is small. Verify that the change in the optimal
value of the objective function is −ε/2 ≈ −αε + o(ε), as stipulated in
Remark 3.6.
E 3.18 Introduce a terminal value in Example 3.4 as follows:
max { J = ∫_0^2 (−x) dt + Bx(2) }
subject to
ẋ = u, x(0) = 1,
x(2) ≥ 0, i.e., Y = [0, ∞) in Table 3.1, Row 3,
−1 ≤ u ≤ 1.
Note that for B = 0, the problem is the same as Example 3.4. Solve this
problem for B = 1/2, 1, 3/2, 2, 3. Conclude that for B ≥ 2, the solution
for the state variable does not change.

E 3.19 In Example 3.6, determine the optimal control and the corre-
sponding state trajectory starting at the point (−4, 6), which lies above
the switching curve.

E 3.20 Carry out the synthesis of the optimal control for Example 3.6
when the starting point (x0 , y0 ) lies below the switching curve.

E 3.21 Use the results of Exercise 3.20 to find the optimal control and
the corresponding trajectory starting at the point (−1, −1).

E 3.22 Find the optimal control, the minimum time, and the corre-
sponding trajectory for Example 3.6 starting at the point (−2, 2), which
lies on the switching curve.

E 3.23 What is the shortest time in which a passenger can be trans-


ported in a ballistic missile from Los Angeles to New York? Assume that
a missile with the ultimate mechanical and thermodynamical properties
is available, but that the passenger imposes the restraint that the max-
imum acceleration or deceleration is 100 ft/s2 . The missile starts from
rest in Los Angeles and stops in New York. Assume that the path is a
straight line of length 2400 miles and ignore the rotation and curvature
of the earth.

E 3.24 In the time-optimal control problem (3.90), replace the state


equations by
ẋ = ay, x(0) = x0 ≥ 0, x(T ) = x̄ > x0 ,
ẏ = u, y(0) = y0 ≥ 0, y(T ) = 0,
and the control constraint by
u ∈ Ω = [Umin , Umax ].
Assume a > 0 and Umax > 0 > Umin . Observe here that x(t) could be
interpreted as the cumulative value of gold mined by a gold-producing
country and y(t) could be interpreted as the total value of gold-mining
machinery employed by the country at time t ≥ 0. The required ma-
chinery is to be imported. Because of some inertia in the world market
for the machinery, the country cannot control y(t) directly, but is able
to control its rate of change ẏ(t). Thus u(t) represents at time t, the
import rate of the machinery when positive and the export rate when
negative. The terminal value x̄ represents the required amount of gold to
be produced in a minimum possible time. Obtain the optimal solution.

E 3.25 Solve the following minimum weighted energy and time problem:
max_{u,T} { J = ∫_0^T −(1/2)(u² + 1) dt }
subject to
ẋ = u, x(0) = 5, x(T ) = 0,
and the control constraint
|u| ≤ 2.
Hint: Use (3.77) to determine T∗, the optimal value of T.

E 3.26 Rework Exercise 3.25 with the new integrand F = −(1/2)(u² + 16) in the objective function.

Hint: Note that use of (3.77) gives an infeasible u. This means


that we should look for a boundary solution for u. To obtain this,
calculate J ∗ (T ) as defined in Exercise 3.6, and then choose T to
maximize it. In doing so, take care to see that x(T ) = 0, and the control
constraint is satisfied.

E 3.27 Exercise 3.26 becomes a minimum energy problem if we set


F = −u²/2. Show that the Hamiltonian maximizing condition of the
maximum principle implies u∗ = k, where k is a constant. Note that
the application of (3.77) implies that k = 0, which gives x(t) = 5 for all
t ≥ 0 so that the terminal condition x(T ) = 0 cannot be satisfied.
To see that there exists no optimal control in this situation, let k < 0
and compute J ∗ . It is now possible to see that limk→0 J ∗ = 0. This
means that we can make the objective function value as close to zero
as we wish, but not equal to zero. Note that in this case there are no
feasible solutions satisfying the necessary conditions so we cannot check
the sufficiency conditions; see the last paragraph of Sect. 2.1.4.

E 3.28 Show that every feasible control of the problem


max_{T,u} { J = ∫_0^T −u dt }

subject to
ẋ = u, x(0) = x0 , x(T ) = 0,
|u| ≤ q, where q > 0,
is an optimal control.

E 3.29 Let x0 > 0 be the initial velocity of a rocket. Let u be the


amount of acceleration (or deceleration) caused by applying a force which
consumes fuel at the rate |u|. We want to bring the rocket to rest using
minimum total amount of fuel. Hence, we have the following optimal
control problem:
max_{T,u} { J = ∫_0^T −|u| dt }

subject to
ẋ = u, x(0) = x0 , x(T ) = 0,

−1 ≤ u ≤ +1.

Hint: Use (3.113)–(3.115) to deal with |u|. Show that for x0 > 0, say
x0 = 5, every feasible control is optimal.

E 3.30 Analyze Exercise 3.29 with the state equation

ẋ = −ax + u,

where a > 0. Show that no optimal control exists for the problem.

E 3.31 By using the maximum principle, show that the problem


max ∫_0^1 x dt

subject to

ẋ = x + u, x(0) = 0,

1 − u ≥ 0, 1 + u ≥ 0, 2 − x − u ≥ 0,

has the optimal control




u∗(t) = { 1, t ∈ [0, ln 2];  1 + 2 ln 2 − 2t, t ∈ (ln 2, 1] }.

Also, provide the values of the state variable, the adjoint variable, and
the Lagrange multipliers along the optimal path.
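Although the exercise asks for an analytical verification, the stated control can also be checked numerically. The following is a minimal sketch (our illustration, not from the text, assuming Python with NumPy): it integrates the state equation under u∗ and confirms that the mixed constraint 2 − x − u ≥ 0 holds, with equality on the boundary interval (ln 2, 1].

```python
import numpy as np

# Hypothetical sanity check for the control stated above (not from the
# book): integrate x' = x + u forward under u* and test feasibility.
t = np.linspace(0.0, 1.0, 100_001)
dt = t[1] - t[0]
u = np.where(t <= np.log(2.0), 1.0, 1.0 + 2.0 * np.log(2.0) - 2.0 * t)

# Forward-Euler integration of the state equation x' = x + u, x(0) = 0.
x = np.zeros_like(t)
for k in range(len(t) - 1):
    x[k + 1] = x[k] + dt * (x[k] + u[k])

# On [0, ln 2] the closed form is x = e^t - 1; on the boundary arc
# (ln 2, 1] the constraint 2 - x - u = 0 gives x = 1 - 2 ln 2 + 2t.
mask = t <= np.log(2.0)
assert np.allclose(x[mask], np.exp(t[mask]) - 1.0, atol=1e-3)
assert np.min(2.0 - x - u) > -1e-3   # mixed constraint never violated
print("J =", float(np.sum(x) * dt))  # objective value under u*
```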

E 3.32 If, in Exercise 3.31, we perturb the constraint 2 − x − u ≥ 0 by


2 − x − u ≥ ε, where ε is small, then show that the change in value of
the objective function equals
ε ∫_0^1 μ₃ dt + o(ε),

where μ3 is the Lagrange multiplier associated with the constraint 2−x−


u ≥ 0 in Exercise 3.31. Moreover, if ε < 0, implying that we are relaxing
the constraint, then verify that the change in the objective function is
positive.

E 3.33 Obtain the value function V (x, t) explicitly in Exercise 3.31


for every x ∈ E 1 and t ∈ [0, 1]. Furthermore, verify that λ(t) =
Vx (x∗ (t), t), t ∈ [0, 1], where λ(t) is the adjoint variable obtained in
the solution of Exercise 3.31.

E 3.34 Solve the problem:


max_{u,T} { J = ∫_0^T [−2 + (1 − u(t))x(t)] dt }

subject to
ẋ = u, x(0) = 0, x(T ) ≥ 1,
u ∈ [0, 1],
T ∈ [1, 8].
Hint: First, show that u∗ = bang[0, 1; λ − x] and that control can switch
at most once from 1 to 0. Then, let t∗ (T ) denote that switching time, if
any, for a given T ∈ [1, 8]. Consider three cases: (i) T = 1, (ii) 1 < T < 8,
and (iii) T = 8. Note that λ(t∗ (T )) − x(t∗ (T )) = 0. Use (3.15) in case
(ii). Find the optimal solution in each of the three cases. The best of
these solutions will be the solution of the problem.

E 3.35 Consider the problem:


max_{u,T} { J = ∫_0^T [−3 − u(t) + x(t)] dt }

subject to
ẋ = u, x(0) = 0, x(T ) ≥ 1,

u ∈ [0, 1],

T ∈ [1, 4 + 2√2].
The problem has two different optimal solutions with different values for
optimal T ∗ . Find both of these solutions.

E 3.36 Perform the following:

(a) Find the optimal consumption rate C∗(t), t ∈ [0, T], in the problem:

max { J = ∫_0^T e^{−ρt} ln C(t) dt }
subject to
Ẇ (t) = −C(t), W (0) = W0 ,
where T is given and ρ > 0.
(b) Assume that T is not given in (a), and is to be chosen optimally.
Show for this free terminal time version that the optimal T ∗
decreases as the discount rate ρ increases.

Hint: It is possible to obtain dT ∗ /dρ by implicit differentia-


tion.

E 3.37 An example, which illustrates that

lim_{t→∞} λ(t) = 0

is not a necessary transversality condition in general, is:


max { J = ∫_0^∞ (1 − x)u dt }

such that
ẋ = (1 − x)u, x(0) = 0,
0 ≤ u ≤ 1.
Show this by finding an optimal control.

E 3.38 Show that the limiting conditions in the rightmost column of


Rows 2, 3, and 4 in Table 3.1 imply (3.98) when T → ∞.

E 3.39 Consider the regulator problem defined by the scalar equation


ẋ = u, x(0) = x0 ,
with the objective function
J = −∫_0^∞ ( x⁴/4 + u²/2 ) dt.
(a) Show that the long-term stationary equilibrium is (x̄, ū, λ̄) = (0, 0, 0),
and conclude that in feedback form u∗ (x) = ū = 0 when x = x̄ = 0.

(b) By using the maximum principle and the relation u̇∗ = (du∗(x)/dx)ẋ, derive a differential equation for the optimal feedback control u∗(x) and solve it with the boundary condition u∗(0) = 0 to obtain

u∗(x) = { −x²/√2, x > 0;  0, x = 0;  +x²/√2, x < 0 }.

(A symbolic check of this law is sketched after this exercise.)

(c) Solve for x∗(t) and λ(t) and show that lim_{t→∞} x∗(t) = 0 and that the limiting condition (3.99), i.e., lim_{t→∞} λ(t) = 0, holds for this problem.
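A quick symbolic check of part (b) can be sketched as follows (our addition, assuming SymPy). With the Hamiltonian H = −x⁴/4 − u²/2 + λu, the maximizing condition gives u = λ, and λ̇ = −H_x = x³ combined with u̇∗ = (du∗/dx)ẋ yields the ODE u∗′(x)u∗(x) = x³, which the stated feedback law satisfies:

```python
import sympy as sp

# Hypothetical verification for E 3.39(b): the branch u*(x) = -x**2/sqrt(2)
# for x > 0 should satisfy (du*/dx) * u*(x) = x**3.
x = sp.symbols('x', positive=True)
u_star = -x**2 / sp.sqrt(2)
residual = sp.diff(u_star, x) * u_star - x**3
print(sp.simplify(residual))  # prints 0, so the ODE is satisfied
```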
E 3.40 Show that for the problem (3.97) without the constraint g(x, u) ≥ 0, the optimal value of the objective function is

J∗ = H(x0, u∗(0), λ(0))/ρ.
See Grass et al. (2008).
E 3.41 Apply (3.108), along with the requirement λ̄ ≥ 0 and λ̄W̄ = 0 in
view of the constraint (3.102), to Example 3.7 to verify that the long-run
stationary equilibrium is as shown in (3.110).
E 3.42 For a stationary system as defined in Sect. 3.6, show that
dH/dt = ρλf(x∗(t), u∗(t))

and

dH^{pv}/dt = −ρe^{−ρt} φ(x∗(t), u∗(t))
along the optimal path. Also, contrast these results with that of Exer-
cise 2.9.

E 3.43 Consider the inventory problem:


max { J = ∫_0^∞ −e^{−ρt}[(I − I₁)² + (P − P₁)²] dt }

subject to
İ = P − S, I(0) = I₀,
where I denotes inventory level, P denotes production rate, and S de-
notes a given constant demand rate.

(a) Find the optimal long-run stationary equilibrium, i.e., the turnpike
defined in (3.107).
(b) Find the Golden Rule by setting I˙ = 0 in the state equation, solve
for P, and substitute it into the integrand of the objective function.
Then, maximize the integrand with respect to I.
(c) Verify that the Golden Rule inventory level obtained in (b) is the
same as the turnpike inventory level found in (a) when ρ = 0.
Chapter 4

The Maximum Principle:


Pure State and Mixed
Inequality Constraints

In Chap. 2 we addressed optimal control problems having constraints


only on control variables. We extended the discussion in Chap. 3 to
treat mixed constraints that may involve state variables in addition to
control variables.
Often in management science and economics problems there are non-
negativity constraints on state variables, such as inventory levels or
wealth. These constraints do not include control variables. Also, there
may be more general inequality constraints only on state variables, which
include constraints that require certain state variables to remain non-
negative. Such constraints are known as pure state variable inequality
constraints or, simply, pure state constraints.
Pure state constraints are more difficult to deal with than mixed
constraints. We can intuitively appreciate this fact by keeping in mind
that only control variables are under the direct influence of the decision
maker. This enables the decision maker, when a mixed constraint be-
comes tight, to choose from the controls that would keep it tight for as
long as required for optimality. With pure state constraints, however, the situation is different and more complicated. That is, when a constraint
becomes tight, it does not provide any direct information to the decision
maker on how to choose values for the control variables so as not to


violate the constraint. Hence, considerable changes in the controls may


be required to keep the constraint tight if needed for optimality.
This chapter considers pure state constraints together with mixed
constraints. In the literature there are two ways of handling pure state
constraints: direct and indirect. The direct method associates a multi-
plier with each constraint for appending it to the Hamiltonian to form
the Lagrangian, and then proceeds in much the same way as in Chap. 3
dealing with mixed constraints. In the indirect method, the choice of
controls, when a pure constraint is active, must be further limited by
constraining appropriately the value of the derivative of the active state constraint with respect to time. This derivative will involve time derivatives of the state variables, which can be written in terms of the control and state variables through the use of the state equations. Thus, the restrictions on the time derivatives of the pure state constraints are transformed into mixed constraints, and these will be appended
instead to the Hamiltonian to form the Lagrangian. Because the pure
state constraints are adjoined in this indirect fashion, the corresponding
Lagrange multipliers must satisfy some complementary slackness condi-
tions in addition to those mentioned in Chap. 3.
With the formulation of the Lagrangian in each approach, we will
write the respective maximum principle, where the choice of control will
come from maximizing the Hamiltonian subject to both pure state con-
straints and mixed constraints. We will find, however, in contrast to
Chap. 3, that in both approaches, the adjoint functions may be required
to have jumps at those times where the pure state constraints become
tight.
We begin with a simple example in Sect. 4.1 to motivate the neces-
sity of possible jumps in the adjoint functions. Section 4.2 formulates the
problem with pure state constraints along with the required assumptions.
In Sect. 4.3, we use the direct method for stating the maximum principle
necessary conditions for solving such problems. Sufficiency conditions
are stated in Sect. 4.4. Section 4.5 is devoted to developing the maxi-
mum principle for the indirect method, which involves adjoining the first
derivative of the pure state constraints to form the Lagrangian function
and imposing some additional constraints on the Lagrange multipliers of
the resulting formulation. Also, the adjoint variables and the Lagrange
multipliers arising in this method will be related to those arising in the
direct method. Finally, the current-value form of the maximum principle
for the indirect method is described in Sect. 4.6.

4.1 Jumps in Marginal Valuations


In this section, we formulate an optimal control problem with a pure
constraint, which can be solved merely by inspection and which exhibits a
discontinuous marginal valuation of the state variable. Since the adjoint
variables in Chaps. 2 and 3 provide these marginal valuations and since
we would like this feature to continue, we must allow the adjoint variables
to have jumps if the marginal valuations can be discontinuous. This will
enable us to formulate a maximum principle in the next section, which
is similar to (3.10) with the exception that the adjoint variables, and
therefore also the Hamiltonian, may have possible jumps satisfying some
jump conditions.

Example 4.1 Consider the problem with a pure state constraint:


max { J = ∫_0^3 (−u) dt }    (4.1)

subject to

ẋ = u, x(0) = 0,    (4.2)
0 ≤ u ≤ 3,    (4.3)
x − 1 + (t − 2)² ≥ 0.    (4.4)

Solution From the objective function (4.1), one can see that it is good
to have low values of u. If we use u = 0 to begin with, we see that
x(t) = 0 as long as u(t) = 0. At t = 1, x(1) = 0 and the constraint (4.4)
is satisfied with an equality. But continuing with u(t) = 0 beyond t = 1
is not feasible since x(t) = 0 would not satisfy the constraint (4.4) just
after t = 1.
In Fig. 4.1, we see that the lowest possible feasible state trajectory
from t = 1 to t = 2 satisfies the state constraint (4.4) with an equality.
In order not to violate the constraint (4.4), its first time derivative u(t)+
2(t − 2) must be nonnegative. This gives us u(t) = 2(2 − t) to be the
lowest feasible value for the control. This value will make the state x(t)
ride along the constraint boundary until t = 2, at which point u(2) = 0;
see Fig. 4.1. Continuing with u(t) = 2(2 − t) beyond t = 2 will make u(t)
negative, and violate the lower bound in (4.3). It is easy to see, however,
that u(t) = 0, t ≥ 2, is the lowest feasible value, which can be followed
all the way to the terminal time t = 3.

Figure 4.1: Feasible state space and optimal state trajectory


for Examples 4.1 and 4.4

It can be seen from Fig. 4.1 that the bold trajectory is the lowest pos-
sible feasible state trajectory on the entire time interval [0,3]. Moreover,
it is obvious that the lowest possible feasible control is used at any given
t ∈ [0, 3], and therefore, the solution we have found is optimal. We can
now restate the values of the state and control variables that we have
obtained:
x∗(t) = { 0, t ∈ [0, 1);  1 − (t − 2)², t ∈ [1, 2];  1, t ∈ (2, 3] },
u∗(t) = { 0, t ∈ [0, 1);  2(2 − t), t ∈ [1, 2];  0, t ∈ (2, 3] }.    (4.5)
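The solution (4.5) can also be checked numerically. Below is a minimal sketch (our addition, assuming Python with NumPy; the names are ours): it confirms the control bounds (4.3), the pure state constraint (4.4), the state equation, and evaluates the objective, which here equals J∗ = −1.

```python
import numpy as np

# Hypothetical numerical companion to Example 4.1 (not from the text).
t = np.linspace(0.0, 3.0, 300_001)
dt = t[1] - t[0]
u = np.where(t < 1.0, 0.0, np.where(t <= 2.0, 2.0 * (2.0 - t), 0.0))
x = np.where(t < 1.0, 0.0, np.where(t <= 2.0, 1.0 - (t - 2.0) ** 2, 1.0))

assert np.all((0.0 <= u) & (u <= 3.0))            # control bounds (4.3)
assert np.min(x - 1.0 + (t - 2.0) ** 2) > -1e-12  # pure constraint (4.4)
# x should solve x' = u, x(0) = 0: compare a running integral of u with x.
assert np.allclose(np.cumsum(u) * dt, x, atol=1e-3)
print("J =", float(-np.sum(u) * dt))              # approximately -1
```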
Next we find the value function V (x, t) for this problem. It is obvious
that the feedback control u∗ (x, t) = 0 is optimal at any point (x, t) when
x ≥ 1 or when (x, t) is on the right-hand side of the parabola in Fig. 4.1.
Thus, V (x, t) = 0 on such points.
On the other hand, when x ∈ [0, 1] and it is on the left-hand side
of the parabola, the optimal trajectory is very similar to the one shown
in Fig. 4.1. Specifically, the control is zero until it hits the trajectory at time τ = 2 − √(1 − x). Then, the control switches to 2(2 − s) for s ∈ (τ, 2)

to climb along the left-hand side of the parabola to reach its peak, and
then switches back to zero on the time interval [2,3]. Thus, in this case,
V(x, t) = −∫_t^τ 0 ds − ∫_τ^2 2(2 − s) ds − ∫_2^3 0 ds
        = [s² − 4s]_{s=2−√(1−x)}^{s=2}
        = x − 1.

Thus, we have the value function






V(x, t) = { 0, x ≥ 1, t ∈ [0, 3];
            x − 1, 1 − (t − 2)² ≤ x < 1, t ∈ [0, 2);
            0, 1 − (t − 2)² ≤ x ≤ 1, t ∈ [2, 3] }.

This gives us the marginal valuation along the optimal path x∗ (t)
given in (4.5) as


Vx(x∗(t), t) = { 1, t ∈ [0, 2);  0, t ∈ [2, 3] }.    (4.6)

We can now see that this marginal valuation is discontinuous at t = 2,


and it has a downward jump of size 1 at that time.
The maximum principle that we will state in Sect. 4.3 will have cer-
tain jump conditions in order to accommodate problems like Exam-
ple 4.1. Indeed in Example 4.2, we will apply the maximum principle of
Sect. 4.3 to the problem in Example 4.1, and see that the adjoint variable
that represents the marginal valuation along the optimal path will have
a jump consistent with (4.6).
In the next section, we state the general optimal control problem that
is the subject of this chapter.

4.2 The Optimal Control Problem with Pure


and Mixed Constraints
We will append to the problem (3.7) considered in Chap. 3, the pure
state variable inequality constraint of type

h(x, t) ≥ 0, (4.7)

where we assume function h : E n × E 1 → E p to be continuously dif-


ferentiable in all its arguments. By the definition of function h, (4.7)
represents a set of p constraints hᵢ(x, t) ≥ 0, i = 1, 2, . . . , p. It is noted that the constraint hᵢ ≥ 0 is called a constraint of rth order if the rth time derivative of hᵢ is the first in which a term in the control u appears, where f(x, u, t) is substituted for ẋ after each differentiation. It
is through this expression that the control acts to satisfy the constraint
hi ≥ 0. The value of r is referred to as the order of the constraint. Of
course, if the constraint hi is of order r, then we would require hi to be
r times continuously differentiable.
Except for Exercise 4.12, in this book we will consider only first-order
constraints, i.e., r = 1. For such constraints, the first time derivative of h has terms in u. Thus, we can define h¹(x, u, t) as follows:

h¹ = dh/dt = (∂h/∂x)f + ∂h/∂t.    (4.8)
In the important special case of the nonnegativity constraint

x(t) ≥ 0, t ∈ [0, T ], (4.9)

(4.8) is simply h¹ = f. For an upper bound constraint x(t) ≤ M, written as

M − x(t) ≥ 0, t ∈ [0, T],    (4.10)

(4.8) gives h¹ = −f. These will be of order one because the function f(x, u, t) usually contains terms in u.
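To make the computation of h¹ in (4.8) concrete, the following small SymPy sketch (our addition; the variable names are ours) carries it out for the pure constraint of Example 4.1, h = x − 1 + (t − 2)², with f = u. Since u already appears in h¹, the constraint is of order one, in agreement with the discussion above.

```python
import sympy as sp

# Illustrative computation of h^1 = h_x f + h_t from (4.8).
t, x, u = sp.symbols('t x u')
h = x - 1 + (t - 2) ** 2
f = u                                   # state equation: x' = u
h1 = sp.diff(h, x) * f + sp.diff(h, t)  # h^1 = h_x f + h_t
print(sp.expand(h1))                    # u + 2*t - 4, i.e., u + 2(t - 2)
assert sp.diff(h1, u) != 0              # u appears: a first-order constraint
```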
As in Chap. 3, the constraints (4.7) need also to satisfy a full-rank
type constraint qualification before a maximum principle can be derived.
With respect to the ith constraint hi (x, t) ≥ 0, an interval (θ1 , θ2 ) ⊂
[0, T ] with θ1 < θ2 is called an interior interval if hi (x(t), t) > 0 for all
t ∈ (θ1 , θ2 ). If the optimal trajectory “hits the boundary,” i.e., satisfies
hi (x(t), t) = 0 for τ 1 ≤ t ≤ τ 2 for some i, then [τ 1 , τ 2 ] is called a
boundary interval. An instant τ 1 is called an entry time if there is an
interior interval ending at t = τ 1 and a boundary interval starting at
τ 1 . Correspondingly, τ 2 is called an exit time if a boundary interval ends
and an interior interval starts at τ 2 . If the trajectory just touches the
boundary at time τ , i.e., h(x(τ ), τ ) = 0 and if the trajectory is in the
interior just before and just after τ , then τ is called a contact time. Taken
together, entry, exit, and contact times are called junction times.
In this book we shall not consider problems that require optimal state
trajectories to have countably many junction times. In other words, we

shall state the maximum principle necessary optimality conditions for


state trajectories having only finitely many junction times. Also, all of
the applications studied in this book exhibit optimal state trajectories
containing finitely many junction times or no junction times.
Throughout the book, we will assume that the constraint qualifica-
tion introduced in Sect. 3.1 as well as the following full-rank condition
on any boundary interval [τ 1 , τ 2 ] hold:
rank ⎡ ∂h¹₁/∂u ⎤
     ⎢ ∂h¹₂/∂u ⎥
     ⎢    ⋮    ⎥ = p̂,
     ⎣ ∂h¹_p̂/∂u ⎦

where, for t ∈ [τ₁, τ₂],

hᵢ(x∗(t), t) = 0, i = 1, 2, . . . , p̂ ≤ p

and

hᵢ(x∗(t), t) > 0, i = p̂ + 1, . . . , p.
Note that this full-rank condition on the constraints (4.7) is written
when the order of each of the constraints in (4.7) is one. For the general
case of higher-order constraints, see Hartl et al. (1995).
Let us recapitulate the optimal control problem for which we will
state a direct maximum principle in the next section. The problem is
max { J = ∫_0^T F(x, u, t) dt + S[x(T), T] },

subject to

ẋ = f(x, u, t), x(0) = x₀,
g(x, u, t) ≥ 0,                                        (4.11)
h(x, t) ≥ 0,
a(x(T), T) ≥ 0,
b(x(T), T) = 0.

Important special cases of the mixed constraint g(x, u, t) ≥ 0 are uᵢ ∈ [0, M] for M > 0 and uᵢ(t) ∈ [0, xᵢ(t)], and those of the terminal constraints a(x(T), T) ≥ 0 and b(x(T), T) = 0 are xᵢ(T) ≥ k and xᵢ(T) = k, respectively, where k is a constant. Likewise, the special cases of the pure constraints h(x, t) ≥ 0 are xᵢ ≥ 0 and xᵢ ≤ M, for which h_{xᵢ} = +1 and h_{xᵢ} = −1, respectively, and h_t = 0.

4.3 The Maximum Principle: Direct Method

For the problem (4.11), we will now state the direct maximum principle
which includes the discussion above and the required jump conditions.
For details, see Dubovitskii and Milyutin (1965), Feichtinger and Hartl
(1986), Hartl et al. (1995), Boccia et al. (2016), and references therein.
We will use superscript d on various multipliers that arise in the direct
method, to distinguish them from the corresponding multipliers (which
are not superscripted) that arise in the indirect method, to be discussed
in Sect. 4.5. Naturally, it will not be necessary to superscript the multi-
pliers that are known to remain the same in both methods.
To formulate the maximum principle for the problem (4.11), we define
the Hamiltonian function H^d : E^n × E^m × E^n × E^1 → E^1 as

H^d = F(x, u, t) + λ^d f(x, u, t)

and the Lagrangian function L^d : E^n × E^m × E^n × E^q × E^p × E^1 → E^1 as

L^d(x, u, λ^d, μ, η^d, t) = H^d(x, u, λ^d, t) + μg(x, u, t) + η^d h(x, t).    (4.12)

The maximum principle states the necessary conditions for u∗ (with


the corresponding state trajectory x∗ ) to be optimal. The conditions
are that there exist an adjoint function λd , which may be discontinuous
at a time in a boundary interval or a contact time, multiplier functions
μ, α, β, γ d , η d , and a jump parameter ζ d (τ ), at each time τ , where λd is
discontinuous, such that the following (4.13) holds:

ẋ∗ = f(x∗, u∗, t), x∗(0) = x₀, satisfying constraints
g(x∗, u∗, t) ≥ 0, h(x∗, t) ≥ 0, and the terminal constraints
a(x∗(T), T) ≥ 0 and b(x∗(T), T) = 0;

λ̇^d = −L^d_x[x∗, u∗, λ^d, μ, η^d, t]
with the transversality conditions
λ^d(T⁻) = S_x(x∗(T), T) + αa_x(x∗(T), T) + βb_x(x∗(T), T) + γ^d h_x(x∗(T), T), and
α ≥ 0, αa(x∗(T), T) = 0, γ^d ≥ 0, γ^d h(x∗(T), T) = 0;

the Hamiltonian maximizing condition
H^d[x∗(t), u∗(t), λ^d(t), t] ≥ H^d[x∗(t), u, λ^d(t), t]
at each t ∈ [0, T] for all u satisfying
g[x∗(t), u, t] ≥ 0;    (4.13)

the jump conditions at any time τ, where λ^d is discontinuous, are
λ^d(τ⁻) = λ^d(τ⁺) + ζ^d(τ)h_x(x∗(τ), τ) and
H^d[x∗(τ), u∗(τ⁻), λ^d(τ⁻), τ] = H^d[x∗(τ), u∗(τ⁺), λ^d(τ⁺), τ] − ζ^d(τ)h_t(x∗(τ), τ);

the Lagrange multipliers μ(t) are such that
∂L^d/∂u|_{u=u∗(t)} = 0, dH^d/dt = dL^d/dt = ∂L^d/∂t,
and the complementary slackness conditions
μ(t) ≥ 0, μ(t)g(x∗, u∗, t) = 0,
η^d(t) ≥ 0, η^d(t)h(x∗(t), t) = 0, and
ζ^d(τ) ≥ 0, ζ^d(τ)h(x∗(τ), τ) = 0 hold.

As in the previous chapters, λd (t) has the marginal value interpreta-


tion. Therefore, while it is not needed for the application of the maxi-
mum principle (4.13), we can trivially set

λd (T ) = Sx (x∗ (T ), T ). (4.14)

If T is also a decision variable constrained to lie in the interval


[T1 , T2 ], 0 ≤ T1 < T2 < ∞, then in addition to (4.13), if T ∗ is the
optimal terminal time, it must satisfy a condition similar to (3.15) and
(3.81), i.e.,

H^d[x∗(T∗), u∗(T∗⁻), λ^d(T∗⁻), T∗] + S_T[x∗(T∗), T∗] + αa_T[x∗(T∗), T∗]
+ βb_T[x∗(T∗), T∗] + γ^d h_T[x∗(T∗), T∗]
  ≤ 0 if T∗ = T₁,
  = 0 if T∗ ∈ (T₁, T₂),    (4.15)
  ≥ 0 if T∗ = T₂.

Remark 4.1 In most practical examples, λd and H d will only jump at


junction times. However, in some cases a discontinuity may occur at a
time in the interior of a boundary interval, e.g., when a mixed constraint
becomes active at that time.

Remark 4.2 It is known that the adjoint function λ^d is continuous at a junction time τ, i.e., ζ^d(τ) = 0, if (i) the entry or exit at time τ is non-tangential, i.e., if h¹(x∗(τ), u∗(τ), τ) ≠ 0, or (ii) if the control u∗ is continuous at τ and

rank ⎡ ∂g/∂u    diag(g)   0       ⎤ = m + p,
     ⎣ ∂h¹/∂u   0         diag(h) ⎦

when evaluated at x∗(τ) and u∗(τ).

We will see that the jump conditions on the adjoint variables in


(4.13) will give us precisely the jump in Example 4.2, where we will
apply the direct maximum principle to the problem in Example 4.1. The
jump condition on H d in (4.13) requires that the Hamiltonian should be
continuous at τ if ht (x∗ (τ ), τ ) = 0. The continuity of the Hamiltonian
(in case ht = 0) makes intuitive sense when considered in the light of its
interpretation given in Sect. 2.2.4.

This brief discussion of the jump conditions, limited here only to


first-order pure state constraints, is far from complete, and a detailed
discussion is beyond the scope of this book. An interested reader should
consult the comprehensive survey by Hartl et al. (1995). For an example
with a second-order state constraint, see Maurer (1977).
Needless to say, computational methods are required to solve prob-
lems with general inequality constraints in all but the simplest of the
cases. The reader should consult the excellent book by Teo et al. (1991)
and references therein for computational procedures and software; see
also Polak et al. (1993), Bulirsch and Kraft (1994), Bryson (1998), and
Pytlak and Vinter (1993, 1999). A MATLAB based software, used
for solving finite and infinite horizon optimal control problems with
pure state and mixed inequality constraints, is available at http://orcos.tuwien.ac.at/research/ocmat_software/.

Example 4.2 Apply the direct maximum principle (4.13) to solve the
problem in Example 4.1.

Solution Since we already have optimal u∗ and x∗ as obtained in (4.5), we can use these in (4.13) to obtain λ^d, μ₁, μ₂, γ^d, η^d, and ζ^d. Thus,

H^d = −u + λ^d u,    (4.16)

L^d = H^d + μ₁u + μ₂(3 − u) + η^d[x − 1 + (t − 2)²],    (4.17)

L^d_u = −1 + λ^d + μ₁ − μ₂ = 0,    (4.18)

λ̇^d = −L^d_x = −η^d, λ^d(3⁻) = γ^d,    (4.19)

γ^d[x∗(3) − 1 + (3 − 2)²] = 0,    (4.20)

μ₁ ≥ 0, μ₁u∗ = 0, μ₂ ≥ 0, μ₂(3 − u∗) = 0,    (4.21)

η^d ≥ 0, η^d[x∗(t) − 1 + (t − 2)²] = 0,    (4.22)

and if λ^d is discontinuous for some τ ∈ [1, 2], the boundary interval as seen from Fig. 4.1, then

λ^d(τ⁻) = λ^d(τ⁺) + ζ^d(τ), ζ^d(τ) ≥ 0,    (4.23)

−u∗(τ⁻) + λ^d(τ⁻)u∗(τ⁻) = −u∗(τ⁺) + λ^d(τ⁺)u∗(τ⁺) − ζ^d(τ)·2(τ − 2).    (4.24)

Since γ^d = 0 from (4.20), we have λ^d(3⁻) = 0 from (4.19). Also, we set λ^d(3) = 0 according to (4.14).
Interval (2,3]: We have η^d = 0 from (4.22), and thus λ̇^d = 0 from (4.19), giving λ^d = 0. From (4.18) and (4.21), we have μ₁ = 1 > 0 and μ₂ = 0.
Interval [1,2]: We get μ₁ = μ₂ = 0 from 0 < u∗ < 3 and (4.21). Thus, (4.18) implies λ^d = 1 and (4.19) gives η^d = −λ̇^d = 0. Thus λ^d is discontinuous at the exit time τ = 2, and we use (4.23) to see that the jump parameter ζ^d(2) = λ^d(2⁻) − λ^d(2⁺) = 1. Furthermore, it is easy to check that (4.24) also holds at τ = 2.
Interval [0,1): Clearly μ₂ = 0 from (4.21). Also u∗ = 0 would still be optimal if there were no lower bound constraint on u in this interval. This means that the constraint u ≥ 0 is not binding, giving us μ₁ = 0. Then from (4.18), we have λ^d = 1. Finally, from (4.19), we have η^d = −λ̇^d = 0.
We can now see that the adjoint variable


λ^d(t) = { 1, t ∈ [0, 2);  0, t ∈ [2, 3] },    (4.25)

is precisely the same as the marginal valuation Vx (x∗ (t), t) obtained in


(4.6). We also see that λd is continuous at time t = 1 where the entry
to the constraint is non-tangential as stated in Remark 4.2.
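This agreement can also be checked by brute force. The sketch below (our addition, assuming Python with NumPy; all names are ours) transcribes the closed-form value function of Sect. 4.1 and compares one-sided finite differences of V with λ^d from (4.25) at representative times in each interval:

```python
import numpy as np

# Hypothetical finite-difference check that lambda^d(t) = V_x(x*(t), t).
def V(x, t):
    if x >= 1.0:
        return 0.0
    if t <= 2.0 - np.sqrt(1.0 - x):   # left of the parabola: V = x - 1
        return x - 1.0
    return 0.0                         # right of the parabola

def x_star(t):
    if t < 1.0:
        return 0.0
    return 1.0 - (t - 2.0) ** 2 if t <= 2.0 else 1.0

eps = 1e-6
for t, lam_d in [(0.5, 1.0), (1.5, 1.0), (2.5, 0.0)]:
    vx = (V(x_star(t) + eps, t) - V(x_star(t), t)) / eps
    assert abs(vx - lam_d) < 1e-4
print("lambda^d matches V_x at t = 0.5, 1.5, 2.5")
```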

4.4 Sufficiency Conditions: Direct Method


When first-order pure state constraints are present, sufficiency results
are usually stated in terms of the maximum principle using the direct
method described in Hartl et al. (1995).
We will now state the sufficiency result for the problem specified in
(4.11). For this purpose, let us define the maximized Hamiltonian

H^{0d}(x, λ^d(t), t) = max_{u | g(x,u,t) ≥ 0} H^d(x, u, λ^d, t).    (4.26)

See Feichtinger and Hartl (1986) and Seierstad and Sydsæter (1987) for
details.

Theorem 4.1 Let (x∗ , u∗ , λd , μ, α, β, γ d , η d ) and the jump parameters


ζ d (τ ) at each τ , where λd is discontinuous, satisfy the necessary condi-
tions in (4.13). If H 0d (x, λd (t), t) is concave in x at each t ∈ [0, T ], S

in (3.2) is concave in x, g in (3.3) is quasiconcave in (x, u), h in (4.7)


and a in (3.4) are quasiconcave in x, and b in (3.5) is linear in x, then
(x∗ , u∗ ) is optimal.

We will illustrate an application of this theorem in Example 4.4,


which shows that the solution obtained in Example 4.3 is optimal.
Theorem 4.1 is written for finite horizon problems. For infinite hori-
zon problems, this theorem remains valid if the transversality condition
on the adjoint variable in (4.29) is modified along the lines discussed in
Sect. 3.6.
In concluding this section, we should note that the sufficiency condi-
tions stated in Theorem 4.1 rely on the presence of appropriate concav-
ity conditions. Sufficiency conditions can also be obtained without these
concavity assumptions. These are called second-order conditions for a lo-
cal maximum, which require the second variation on the linearized state
equation to be negative definite. For further details on second-order suf-
ficiency conditions, the reader is referred to Maurer (1981), Malanowski
(1997), and references in Hartl et al. (1995).

4.5 The Maximum Principle: Indirect Method


The main idea underlying the indirect method is that when the pure
state constraint (4.7), assumed to be of order one, becomes active, we
must require its first derivative h1 (x, u, t) in (4.8) to be nonnegative, i.e.,

h1 (x, u, t) ≥ 0, whenever h(x, t) = 0. (4.27)

While this is a mixed constraint, it is different from those treated in


Chap. 3 in the sense that it is imposed only when the constraint (4.7) is
tight.
Since (4.27) is a mixed constraint, it is tempting to use the maximum
principle (3.12) developed in Chap. 3. This can be done provided that
we can find a way to impose (4.27) only when h(x, t) = 0. One way to
accomplish this is to append (4.27) to the Hamiltonian when forming the
Lagrangian, by using a multiplier η ≥ 0, i.e., append ηh1 , and require
that ηh = 0, which is equivalent to imposing η i hi = 0, i = 1, 2, . . . , p.
This means that when hi > 0 for some i, we have η i = 0 and it is then
not a part of the Lagrangian.
Note that when we require ηh = 0, we do not need to impose ηh1 = 0
as required for mixed constraints. This is because when hi > 0 on an

interval, then η i = 0 and so η i h1i = 0 on that interval. On the other


hand, when hi = 0 on an interval, then it is because h1i = 0, and thus,
η i h1i = 0 on that interval. In either case, η i h1i = 0.
With these observations, we are ready to formulate the indirect max-
imum principle for the problem (4.11).
We form the Lagrangian as

L(x, u, λ, μ, η, t) = H(x, u, λ, t) + μg(x, u, t) + ηh1 (x, u, t), (4.28)

where the Hamiltonian H = F (x, u, t) + λf (x, u, t) as defined in (3.8).


We will now state the maximum principle which includes the discussion
above and the required jump conditions.
The maximum principle states the necessary conditions for u∗ (with
the state trajectory x∗ ) to be optimal. These conditions are that there
exist an adjoint function λ, which may be discontinuous at each entry or
contact time, multiplier functions μ, α, β, γ, η, and a jump parameter ζ(τ )
at each τ where λ is discontinuous, such that (4.29) below holds.
Once again, as before, we can set λ(T ) = Sx (x∗ (T ), T ). If T ∈ [T1 , T2 ]
is a decision variable, then (4.15) with λd and γ d replaced by λ and γ,
respectively, must also hold.
In (4.29), we see that there are jump conditions on the adjoint vari-
ables and also the Hamiltonian in the indirect maximum principle. The
remarks on the jump condition made in connection with the direct max-
imum principle (4.13) apply also to the jump conditions in (4.29). In
(4.29), we also see a condition η̇ ≤ 0, in addition to the complementary
conditions on η. The presence of this term will become clear after we
relate this multiplier to those in the direct maximum principle, which we
discuss next.
In various applications that are discussed in subsequent chapters of
this book, we use the indirect maximum principle. Nevertheless, it is
worthwhile to provide relationships between the multipliers of the two
approaches, as these will be useful when checking for the sufficiency
conditions of Theorem 4.1, developed in Sect. 4.4.

ẋ∗ = f (x∗ , u∗ , t), x∗ (0) = x0 , satisfying constraints

g(x∗ , u∗ , t) ≥ 0, h(x∗ , t) ≥ 0, and the terminal constraints

a(x∗ (T ), T ) ≥ 0 and b(x∗ (T ), T ) = 0;

λ̇ = −Lx [x∗ , u∗ , λ, μ, η, t] with the transversality conditions

λ(T − ) = Sx (x∗ (T ), T ) + αax (x∗ (T ), T ) + βbx (x∗ (T ), T )

+γhx (x∗ (T ), T ), and

α ≥ 0, αa(x∗ (T ), T ) = 0, γ ≥ 0, γh(x∗ (T ), T ) = 0;

the Hamiltonian maximizing condition

H[x∗ (t), u∗ (t), λ(t), t] ≥ H[x∗ (t), u, λ(t), t]

at each t ∈ [0, T ] for all u satisfying

g[x∗ (t), u, t] ≥ 0, and

h¹ᵢ(x∗(t), u, t) ≥ 0 whenever hᵢ(x∗(t), t) = 0, i = 1, 2, . . . , p;    (4.29)

the jump conditions at any entry/contact time τ ,

where λ is discontinuous, are

λ(τ − ) = λ(τ + ) + ζ(τ )hx (x∗ (τ ), τ ) and

H[x∗ (τ ), u∗ (τ − ), λ(τ − ), τ ] = H[x∗ (τ ), u∗ (τ + ), λ(τ + ), τ ]

−ζ(τ )ht (x∗ (τ ), τ );

the Lagrange multipliers μ(t) are such that

∂L/∂u|u=u∗ (t) = 0, dH/dt = dL/dt = ∂L/∂t,

and the complementary slackness conditions

μ(t) ≥ 0, μ(t)g(x∗ , u∗ , t) = 0,

η(t) ≥ 0, η(t)h(x∗ (t), t) = 0, and

ζ(τ ) ≥ 0, ζ(τ )h(x∗ (τ ), τ ) = 0 hold.



We now obtain the multipliers of the direct maximum principle from


those in the indirect maximum principle. Since the multipliers coincide
in the interior, we let [τ 1 , τ 2 ] denote a boundary interval and τ a contact
time. It is shown in Hartl et al. (1995) that

η d (t) = −η̇(t), t ∈ (τ 1 , τ 2 ), (4.30)

λd (t) = λ(t) + η(t)hx (x∗ (t), t), t ∈ (τ 1 , τ 2 ), (4.31)


Note that η d (t) ≥ 0 in (4.13). Thus, we have η̇ ≤ 0, which we have
already included in (4.29). The jump parameter at an entry time τ 1 , an
exit time τ 2 , or a contact time τ , respectively, satisfies

ζ^d(τ₁) = ζ(τ₁) − η(τ₁⁺), ζ^d(τ₂) = η(τ₂⁻), ζ^d(τ) = ζ(τ).    (4.32)
1 ), ζ (τ 2 ) = η(τ 2 ), ζ (τ ) = ζ(τ ). (4.32)

By comparing λd (T − ) in (4.13) and λ(T − ) in (4.29) and using (4.31), we


have
γ d = γ + η(T − ). (4.33)
Going the other way, we have
η(t) = ∫_t^{τ₂} η^d(s) ds + ζ^d(τ₂), t ∈ (τ₁, τ₂),

λ(t) = λ^d(t) − η(t)h_x(x∗(t), t), t ∈ (τ₁, τ₂),

ζ(τ₁) = ζ^d(τ₁) + η(τ₁⁺), ζ(τ₂) = 0, ζ(τ) = ζ^d(τ),

γ = γ^d − η(T⁻).
Finally, as we had mentioned earlier, the multipliers μ, α, and β are
the same in both methods.

Remark 4.3 From (4.30), (4.32), and η d (t) ≥ 0 and ζ d (τ 1 ) ≥ 0 in


(4.13), we can obtain the conditions

η̇(t) ≤ 0 (4.34)

and
ζ(τ₁) ≥ η(τ₁⁺) at each entry time τ₁,    (4.35)
1 ) at each entry time τ 1 , (4.35)
which are useful to know about. Hartl et al. (1995) and Feichtinger and
Hartl (1986) also add these conditions to the indirect maximum principle
necessary conditions (4.29).

Remark 4.4 In Exercise 4.12, we discuss the indirect method for


higher-order constraints. For further details, see Pontryagin et al. (1962),
Bryson and Ho (1975) and Hartl et al. (1995).

Example 4.3 Consider the problem:


max { J = ∫_0^2 (−x) dt }

subject to
ẋ = u, x(0) = 1, (4.36)
u + 1 ≥ 0, 1 − u ≥ 0, (4.37)
x ≥ 0. (4.38)
Note that this problem is the same as Example 2.3, except for the
nonnegativity constraint (4.38).

Solution The Hamiltonian is

H = −x + λu,

which implies the optimal control to have the form

u∗ (x, λ) = bang[−1, 1; λ], whenever x > 0. (4.39)

When x = 0, we impose ẋ = u ≥ 0 in order to insure that (4.38) holds.


Therefore, the optimal control on the state constraint boundary is

u∗ (x, λ) = bang[0, 1; λ], whenever x = 0. (4.40)

Now we form the Lagrangian

L = H + μ1 (u + 1) + μ2 (1 − u) + ηu,

where μ1 , μ2 , and η satisfy the complementary slackness conditions

μ1 ≥ 0, μ1 (u + 1) = 0, (4.41)
μ2 ≥ 0, μ2 (1 − u) = 0, (4.42)
η ≥ 0, ηx = 0. (4.43)

Furthermore, the optimal trajectory must satisfy


∂L/∂u = λ + μ₁ − μ₂ + η = 0.    (4.44)

From the Lagrangian we also get


λ̇ = −∂L/∂x = 1, λ(2⁻) = γ ≥ 0, γx(2) = λ(2⁻)x(2) = 0.    (4.45)
It is reasonable to guess that the optimal control u∗ will be the one
that keeps x∗ as small as possible, subject to the state constraint (4.38).
Thus,

u∗(t) = { −1, t ∈ [0, 1);  0, t ∈ [1, 2] }.    (4.46)

This gives

x∗(t) = { 1 − t, t ∈ [0, 1);  0, t ∈ [1, 2] }.

To obtain λ(t), let us first try λ(2− ) = γ = 0. Then, since x∗ (t) enters
the boundary zero at t = 1, there are no jumps in the interval (1, 2], and
the solution for λ(t) is

λ(t) = t − 2, t ∈ (1, 2). (4.47)

Since λ(t) ≤ 0 and x∗ (t) = 0 on (1, 2], we have u∗ (t) = 0 by (4.40),


as stipulated. Now let us see what must happen at t = 1. We know
from (4.47) that λ(1+ ) = −1. To obtain λ(1− ), we see that H(1+ ) =
−x∗ (1+ ) + λ(1+ )u∗ (1+ ) = 0 and H(1− ) = −x∗ (1− ) + λ(1− )u∗ (1− ) =
−λ(1− ). By equating H(1− ) to H(1+ ) as required in (4.29), we obtain
λ(1− ) = 0. Using now the jump condition on λ(t) in (4.29), we get the
value of the jump ζ(1) = λ(1− ) − λ(1+ ) = 1 ≥ 0.
With λ(1− ) = 0, we can solve (4.45) to obtain

λ(t) = t − 1, t ∈ [0, 1].

Since λ(t) ≤ 0 and x∗(t) = 1 − t is positive on [0, 1), we can use (4.39)
to obtain u∗ (t) = −1 for 0 ≤ t < 1, which is as stipulated in (4.46). In
the time interval [0,1) by (4.42), μ2 = 0 since u∗ < 1, and by (4.43),
η = 0 because x > 0. Therefore, μ1 (t) = −λ(t) = 1 − t > 0 for 0 ≤ t < 1,
and this with u∗ = −1 satisfies (4.41).
To complete the solution, we calculate the Lagrange multipliers in the
interval [1,2]. Since u∗ (t) = 0 on t ∈ [1, 2], we have μ1 (t) = μ2 (t) = 0.
Then, from (4.44) we obtain η(t) = −λ(t) = 2 − t ≥ 0 which, with

x∗ (t) = 0 satisfies (4.43). Thus, our guess γ = 0 is correct, and we do


not need to examine the possibility of γ > 0. The graphs of x∗ (t) and λ(t)
are shown in Fig. 4.2. In Exercise 4.1, you are asked to redo Example 4.3
by guessing that γ > 0 and see that it leads to a contradiction with a
condition of the maximum principle.

Figure 4.2: State and adjoint trajectories in Example 4.3
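As an informal cross-check of this solution, one can compare the objective value of (4.46) against other feasible policies. The sketch below (our addition, assuming Python with NumPy; not from the text) enforces the boundary rule (4.40) by projecting the control to be nonnegative whenever x reaches zero:

```python
import numpy as np

# Hypothetical policy comparison for Example 4.3.
t = np.linspace(0.0, 2.0, 20_001)
dt = t[1] - t[0]

def J(u_of_t):
    x = np.empty_like(t)
    x[0] = 1.0
    for k in range(len(t) - 1):
        u = u_of_t(t[k])
        if x[k] <= 0.0:
            u = max(u, 0.0)   # on the boundary x = 0, impose u >= 0 as in (4.40)
        x[k + 1] = x[k] + dt * u
    return float(-np.sum(x) * dt)

u_star = lambda s: -1.0 if s < 1.0 else 0.0
for name, u in [("u*", u_star), ("u = -1/2", lambda s: -0.5), ("u = 0", lambda s: 0.0)]:
    print(name, J(u))  # u* gives the largest value, about -1/2
```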

It should be obvious that if the terminal time were T = 1.5, the


optimal control would be u∗ (t) = −1, t ∈ [0, 1) and u∗ (t) = 0, t ∈
[1, 1.5]. You are asked in Exercise 4.10 to redo the above calculations in
this case and show that one now needs to have γ = 1/2. In Exercise 4.3,
you are asked to solve a similar problem with F = −u.

Remark 4.5 Example 4.3 is a problem instance in which the state con-
straint is active at the terminal time. In instances where the initial state
or the final state or both are on the constraint boundary, the maximum
principle may degenerate in the sense that there is no nontrivial solution
of the necessary conditions, i.e., λ(t) ≡ 0, t ∈ [0, T ], where T is the termi-
nal time. See Arutyunov and Aseev (1997) or Ferreira and Vinter (1994)
for conditions that guarantee a nontrivial solution for the multipliers.

Remark 4.6 It can easily be seen that Example 4.3 is a problem in-
stance in which multipliers λ and μ1 would not be unique if the jump
condition on the Hamiltonian in (4.29) were not imposed. For references
dealing with the issue of non-uniqueness of the multipliers and conditions
under which the multipliers are unique, see Kurcyusz and Zowe (1979),
Maurer (1977, 1979), Maurer and Wiegand (1992), and Shapiro (1997).

Example 4.4 The purpose here is to show that the solution obtained
in Example 4.3 satisfies the sufficiency conditions of Theorem 4.1. For
this we first obtain the direct adjoint variable


λ^d(t) = λ(t) + η(t)h_x(x∗(t), t) = { t − 1, t ∈ [0, 1);  0, t ∈ [1, 2) }.

It is easy to see that




H(x, u, λ^d(t), t) = { −x + (t − 1)u, t ∈ [0, 1);  −x, t ∈ [1, 2] },

is linear and hence concave in (x, u) at each t ∈ [0, 2]. Functions

g(x, u, t) = ( u + 1, 1 − u )′

and
h(x) = x
are linear and hence quasiconcave in (x, u) and x, respectively. Functions
S ≡ 0, a ≡ 0 and b ≡ 0 satisfy the conditions of Theorem 4.1 trivially.
Thus, the solution obtained for Example 4.3 satisfies all conditions of
Theorem 4.1, and is therefore optimal.

In Exercise 4.14, you are asked to use Theorem 4.1 to verify that the
given solution there is optimal.

Example 4.5 Consider Example 4.3 with T = 3 and the terminal state
constraint
x(3) = 1.

Solution Clearly, the optimal control u∗ will be the one that keeps x as
small as possible, subject to the state constraint (4.38) and the boundary
condition x(0) = x(3) = 1. Thus,
u∗(t) = { −1, t ∈ [0, 1);  0, t ∈ [1, 2];  1, t ∈ (2, 3] },
x∗(t) = { 1 − t, t ∈ [0, 1);  0, t ∈ [1, 2];  t − 2, t ∈ (2, 3] }.

For brevity, we will not provide the same level of detailed explanation as
we did in Example 4.3. Rather, we will only compute the adjoint function
and the multipliers that satisfy the optimality conditions. These are


λ(t) = { t − 1, t ∈ [0, 1];  t − 2, t ∈ (1, 3) },    (4.48)

μ₁(t) = μ₂(t) = 0, η(t) = −λ(t), t ∈ [1, 2],    (4.49)

γ = 0, β = λ(3⁻) = 1,    (4.50)

and the jump ζ(1) = 1 ≥ 0 so that

λ(1⁻) = λ(1⁺) + ζ(1) and H(1⁻) = H(1⁺).    (4.51)

Example 4.6 Introduce a discount rate ρ > 0 in Example 4.1 so that


the objective function becomes
max { J = ∫_0^3 −e^{−ρt} u dt }    (4.52)

and re-solve using the indirect maximum principle (4.29).

Solution It is obvious that the optimal solution will remain the same as
(4.5), shown also in Fig. 4.1.
With u∗ and x∗ as in (4.5), we must obtain λ, μ1 , μ2 , η, γ, and ζ so
that the necessary optimality conditions (4.29) hold, i.e.,

H = −e^{−ρt}u + λu,    (4.53)

L = H + μ₁u + μ₂(3 − u) + η[u + 2(t − 2)],    (4.54)



Lu = −e^{−ρt} + λ + μ₁ − μ₂ + η = 0,    (4.55)

λ̇ = −Lx = 0, λ(3⁻) = 0,    (4.56)

γ[x∗(3) − 1 + (3 − 2)²] = 0,    (4.57)

μ₁ ≥ 0, μ₁u = 0, μ₂ ≥ 0, μ₂(3 − u) = 0,    (4.58)

η ≥ 0, η[x∗(t) − 1 + (t − 2)²] = 0,    (4.59)

and if λ is discontinuous at the entry time τ = 1, then

λ(1⁻) = λ(1⁺) + ζ(1), ζ(1) ≥ 0,    (4.60)

−e^{−ρ}u∗(1⁻) + λ(1⁻)u∗(1⁻) = −e^{−ρ}u∗(1⁺) + λ(1⁺)u∗(1⁺) − ζ(1)(−2).    (4.61)
From (4.61) and (4.60), we obtain ζ(1) = e^{−ρ} and hence λ(1⁻) = e^{−ρ}. This with (4.56) gives

λ(t) = { e^{−ρ}, 0 ≤ t < 1;  0, 1 ≤ t ≤ 3 },

as shown in Fig. 4.3,






μ₁(t) = { e^{−ρt} − e^{−ρ}, 0 ≤ t < 1;  0, 1 ≤ t ≤ 2;  e^{−ρt}, 2 < t ≤ 3 },   μ₂(t) = 0, 0 ≤ t ≤ 3,

and

η(t) = { 0, 0 ≤ t < 1;  e^{−ρt}, 1 ≤ t ≤ 2;  0, 2 < t ≤ 3 },

which, along with u∗ and x∗ , satisfy (4.29).


Note, furthermore, that λ is continuous at the exit time t = 2. At the
entry time τ 1 = 1, ζ(1) = e−ρ ≥ η(1+ ) = e−ρ , so that (4.35) also holds.
Finally, γ = η(3− ) = 0.
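The multipliers just derived can be verified symbolically. The following SymPy sketch (our addition; the names are ours) checks condition (4.55), Lu = −e^{−ρt} + λ + μ₁ − μ₂ + η = 0, on each of the three intervals:

```python
import sympy as sp

# Hypothetical interval-by-interval check of (4.55) for Example 4.6.
t, rho = sp.symbols('t rho', positive=True)
pieces = [  # (lambda, mu1, mu2, eta) on [0,1), [1,2], (2,3]
    (sp.exp(-rho), sp.exp(-rho * t) - sp.exp(-rho), 0, 0),
    (0, 0, 0, sp.exp(-rho * t)),
    (0, sp.exp(-rho * t), 0, 0),
]
for lam, mu1, mu2, eta in pieces:
    Lu = -sp.exp(-rho * t) + lam + mu1 - mu2 + eta
    assert sp.simplify(Lu) == 0
print("(4.55) holds on all three intervals")
```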

Figure 4.3: Adjoint trajectory for Example 4.4

4.6 Current-Value Maximum Principle:


Indirect Method

Just as the necessary condition (3.42) represents the current-value for-


mulation corresponding to (3.12), we can, when first-order pure state
constraints are present, also state the current-value formulation of the
necessary conditions (4.29). As in Sect. 3.3, with F (x, u, t) = φ(x, u)e−ρt ,
S(x, T ) = ψ(x)e−ρT , and ρ > 0, the objective function in the problem
(4.11) is replaced by

max { J = ∫_0^T φ(x, u)e^{−ρt} dt + ψ[x(T)]e^{−ρT} }.

With the Hamiltonian H as defined in (3.35), we can write the La-


grangian as

L[x, u, λ, μ, η] := H + μg + ηh1 = φ + λf + μg + ηh1 .

We can now state the current-value form of the maximum principle,


giving the necessary conditions for u∗ (with the state trajectory x∗ ) to
be optimal. These conditions are that there exist an adjoint function
λ, which may be discontinuous at each entry or contact time, multiplier
functions μ, α, β, γ, η, and a jump parameter ζ(τ) at each τ where λ is
discontinuous, such that the following (4.62) holds:

ẋ∗ = f (x∗ , u∗ , t), x∗ (0) = x0 , satisfying constraints

g(x∗ , u∗ , t) ≥ 0, h(x∗ (t), t) ≥ 0,and the terminal constraints

a(x∗ (T ), T ) ≥ 0 and b(x∗ (T ), T ) = 0;

λ̇ = ρλ − Lx [x∗ , u∗ , λ, μ, η, t]

with the transversality conditions

λ(T − ) = ψ x (x∗ (T ), T ) + αax (x∗ (T ), T ) + βbx (x∗ (T ), T )

+γhx (x∗ (T ), T ), and

α ≥ 0, αa(x∗ (T ), T ) = 0, γ ≥ 0, γh(x∗ (T ), T ) = 0;

the Hamiltonian maximizing condition

H[x∗ (t), u∗ (t), λ(t), t] ≥ H[x∗ (t), u, λ(t), t]

at each t ∈ [0, T ] for all u satisfying

g[x∗ (t), u, t] ≥ 0, and


h¹ᵢ(x∗(t), u, t) ≥ 0 whenever hᵢ(x∗(t), t) = 0, i = 1, 2, . . . , p;    (4.62)

the jump conditions at any entry/contact time τ ,

where λ is discontinuous, are

λ(τ − ) = λ(τ + ) + ζ(τ )hx (x∗ (τ ), τ ) and

H[x∗ (τ ), u∗ (τ − ), λ(τ − ), τ ] = H[x∗ (τ ), u∗ (τ + ), λ(τ + ), τ ]

−ζ(τ )ht (x∗ (τ ), τ );

the Lagrange multipliers μ(t) are such that

∂L/∂u|u=u∗ (t) = 0, dH/dt = dL/dt = ∂L/∂t + ρλf,

and the complementary slackness conditions

μ(t) ≥ 0, μ(t)g(x∗ , u∗ , t) = 0,

η(t) ≥ 0, η(t)h(x∗ (t), t) = 0, and

ζ(τ ) ≥ 0, ζ(τ )h(x∗ (τ ), τ ) = 0 hold.



If T ∈ [T₁, T₂], 0 ≤ T₁ < T₂ < ∞, is also a decision variable and T∗ is the optimal terminal time, then the optimal solution x∗, u∗, T∗ must satisfy (4.62) with T replaced by T∗ and the condition

H[x∗(T∗), u∗(T∗⁻), λ(T∗⁻), T∗] − ρψ[x∗(T∗), T∗] + αa_T[x∗(T∗), T∗]
+ βb_T[x∗(T∗), T∗] + γh_T[x∗(T∗), T∗]
  ≤ 0 if T∗ = T₁,
  = 0 if T∗ ∈ (T₁, T₂),    (4.63)
  ≥ 0 if T∗ = T₂.

Derivation of (4.63) starting from (4.15) is similar to that of (3.44) from


(3.15).

Remark 4.7 The current-value version of (4.34) in Remark 4.3 is η̇(t) ≤


ρη(t) and (4.35).

The infinite horizon problem with pure and mixed constraints can be
stated as (3.97) with an additional constraint (4.7). As in Sect. 3.6, the
conditions in (4.62) except the transversality condition on the adjoint
variable are still necessary for optimality. As for the sufficiency condi-
tions, an analogue of Theorem 4.1 holds, subject to the discussion on
infinite horizon transversality conditions in Sect. 3.6.
We conclude this chapter with the following cautionary remark.

Remark 4.8 While various subsets of conditions specified in the max-


imum principles (4.13), (4.29), or (4.62) have been proved in the litera-
ture, proofs of the entire results are still not available. For this reason,
Hartl et al. (1995) call (4.13), (4.29), or (4.62) informal theorems. Seier-
stad and Sydsæter (1987) call them almost necessary conditions since,
very rarely, problems arise where the optimal solution requires more
complicated multipliers and adjoint variables than those specified in this
chapter.

Exercises for Chapter 4

E 4.1 Rework Example 4.3 by guessing that γ > 0, and show that it
leads to a contradiction with a condition of the maximum principle.

E 4.2 Rework Example 4.3 with terminal time T = 1/2.



E 4.3 Change the objective function of Example 4.3 as follows:


max { J = ∫_0^2 (−u) dt }.

Re-solve and show that the solution is not unique.

E 4.4 Specialize the maximum principle (4.29) for the nonnegativity


state constraint of the form

x(t) ≥ 0 for all t satisfying 0 ≤ t ≤ T,

in place of h(x, t) ≥ 0 in (4.7).

E 4.5 Consider the problem:


max { J = ∫_0^T (−x) dt }

subject to
ẋ = −u − 1, x(0) = 1,
x(t) ≥ 0, 0 ≤ u(t) ≤ 1.
Show that

(a) If T = 1, there is exactly one feasible and optimal solution.


(b) If T > 1, then there is no feasible solution.
(c) If 0 < T < 1, then there is a unique optimal solution.
(d) If the control constraint is 0 ≤ u(t) ≤ K, there is a unique optimal
solution for every K ≥ 1 and T = 1/2.
(e) The value of the objective in (d) increases as K increases.
(f) If the control constraint in (d) is u(t) ≥ 0, then the optimal control
is an impulse control defined by the limit of the solution in (e).

E 4.6 Impose the constraint x ≥ 0 on Exercise 3.16(b) to obtain the


problem:

max { J = ∫_0^4 (−x) dt }
subject to

ẋ = u, x(0) = 1, x(4) = 1,
u + 1 ≥ 0, 1 − u ≥ 0,
x ≥ 0.
Find the optimal trajectories of the control variable, the state variable,
and other multipliers. Also, graph these trajectories.

E 4.7 Transform the problem (4.11) with the pure constraint of type
(4.7) to a problem with the nonnegativity constraint of type (4.9).

Hint: Define y = h(x, t) as an additional state variable. Recall


that we have assumed (4.7) to be a first-order constraint.

E 4.8 Consider a two-reservoir system such as that shown in Fig. 4.4,


where xi (t) is the volume of water in reservoir i and ui (t) is the rate of
discharge from reservoir i at time t. Thus,

ẋ1 (t) = −u1 (t), x1 (0) = 4,

ẋ2 (t) = u1 (t) − u2 (t), x2 (0) = 4.

Figure 4.4: Two-reservoir system of Exercise 4.8

Solve the problem of maximizing


J = ∫_0^{10} [(10 − t)u₁(t) + tu₂(t)] dt

subject to the above state equations and the constraints

0 ≤ ui (t) ≤ 1, xi (t) ≥ 0 for all t ∈ [0, 10].



Also compute the optimal value of the objective function.

Hint: Guess the optimal solution and verify it by using the La-
grangian form of the maximum principle.

E 4.9 An Inventory Control Problem. Solve


max_P { −∫_0^T ( hI + P²/2 ) dt }

subject to

İ = P − S, I(0) = I₀ > S²/(2h),
and the control and the pure state inequality constraints

P ≥ 0 and I ≥ 0,

respectively. Assume that S > 0 and h > 0 are constants and T


is sufficiently large. Note that I represents inventory, P represents
production rate, and S represents demand. The constraints on P and
I mean that production must be nonnegative and backlogs are not
allowed, respectively.

Hint: By T being sufficiently large, we mean T > I0 /S + S/(2h).

E 4.10 Redo Example 4.3 with T = 1.5.

E 4.11 Redo Example 4.6 using the current-value maximum principle


(4.62) in Sect. 4.6.

E 4.12 For this exercise only, assume that h(x, t) ≥ 0 in (4.7) is a


second-order constraint, i.e., r = 2. Transform the problem to one with
nonnegativity constraints. Use the result in Exercise 4.4 to derive a
maximum principle for problems with second-order constraints.

Hint: As in Exercise 4.7, define y = h. In addition, define yet an-


other state variable z = ẏ = dh/dt. Note further that this procedure
can be generalized to handle problems with rth-order constraints for
any positive integer r.

E 4.13 Re-solve Example 4.6 when ρ < 0.



E 4.14 Consider the following problem:


min { J = ∫_0^5 u dt }

subject to the state equation

ẋ = u − x, x(0) = 1,

and the control and state constraints

0 ≤ u ≤ 1, x(t) ≥ 0.7 − 0.2t.

Use the sufficiency conditions in Theorem 4.1 to verify that the optimal
control for the problem is




u∗(t) = { 0, 0 ≤ t ≤ θ;  0.5 − 0.2t, θ < t ≤ 2.5;  0, 2.5 < t ≤ 5 },

where θ ≈ 0.51626. Sketch the optimal state trajectory x∗ (t) for the
problem.
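Before invoking Theorem 4.1, it can help to confirm feasibility of the stated control numerically. The sketch below (our addition, assuming Python with NumPy; not from the text) integrates ẋ = u − x and checks the pure state constraint; on its boundary arc, x = 0.7 − 0.2t forces u = 0.5 − 0.2t, matching the middle branch above:

```python
import numpy as np

# Hypothetical feasibility check for the control stated in E 4.14.
theta = 0.51626
t = np.linspace(0.0, 5.0, 500_001)
dt = t[1] - t[0]

def u_star(s):
    return 0.5 - 0.2 * s if theta < s <= 2.5 else 0.0

x = np.empty_like(t)
x[0] = 1.0
for k in range(len(t) - 1):
    x[k + 1] = x[k] + dt * (u_star(t[k]) - x[k])

slack = x - (0.7 - 0.2 * t)
assert np.min(slack) > -1e-3                       # constraint respected
print("entry time ~", t[np.argmax(slack < 1e-4)])  # close to theta
```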

E 4.15 In Example 4.6, let t±(x) = 2 ± √(1 − x). Show that the value function

V(x, t) = { −[2e^{−2ρ} + 2(ρ√(1 − x) − 1)e^{−ρ(2−√(1−x))}]/ρ², for x < 1, 0 ≤ t ≤ t₋(x);
            0, for x ≥ 1 or t₊(x) ≤ t ≤ 3 }.

Note that V(x, t) is not defined for x < 1, t₋(x) < t < t₊(x). Show furthermore that for the given initial condition x(0) = 0, the marginal valuation is

Vx(x∗(t), t) = λ^d(t) = λ(t) + η(t) = { e^{−ρ}, for t ∈ [0, 1);  e^{−ρt}, for t ∈ [1, 2];  0, for t ∈ (2, 3] }.
In this case, it is interesting to note that the marginal valuation is dis-
continuous at the constraint exit time t = 2.

E 4.16 Show in Example 4.3 that the value function




V(x, t) = { −x²/2, for x ≤ 2 − t, 0 ≤ t ≤ 2;  −2x + 2 − 2t + xt + t²/2, for x > 2 − t, 0 ≤ t ≤ 2 }.

Then verify that for the given initial condition x(0) = 1,




Vx(x∗(t), t) = λ^d(t) = λ(t) + η(t) = { t − 1, for t ∈ [0, 1);  0, for t ∈ [1, 2] }.

E 4.17 Rework Example 4.5 by using the direct maximum principle


(4.13).

E 4.18 Solve the linear inventory control problem of minimizing

∫_0^T (cP(t) + hI(t)) dt

subject to

İ(t) = P(t) − S, I(0) = 1,
P(t) ≥ 0 and I(t) ≥ 0, t ∈ [0, T],

where P(t) denotes the production rate and I(t) is the inventory level at time t, and where c, h and S are positive constants and the given terminal time T > √(2S).

E 4.19 A machine with quality x(t) ≥ 0 produces goods worth ax(t) dollars per unit time at time t. The quality deteriorates at the rate δ,
but the decay can be slowed by a preventive maintenance u(t) as follows:

ẋ = u − δx, x(0) = x0 > 0.

Obtain the optimal maintenance rate u(t), 0 ≤ t ≤ T, so as to maximize


∫_0^T (ax − u) dt

subject to u ∈ [0, ū] and x ≤ x̄, where ū > δ x̄, a > δ, and x̄ > x0 .

Hint: Solve first the problem without the state constraint x ≤ x̄. You will
need to treat two cases: δT ≤ ln a − ln (a − δ) and δT > ln a − ln (a − δ).

E 4.20 Maximize
J = ∫_0^3 (u − x) dt

subject to
ẋ = 1 − u, x(0) = 2,

0 ≤ u ≤ 3, x + u ≤ 4, x ≥ 0.

E 4.21 Maximize
J = ∫_0^2 (1 − x) dt

subject to
ẋ = u, x(0) = 1,

−1 ≤ u ≤ 1, x ≥ 0.

E 4.22 Maximize

    J = ∫₀³ (4 − t)u dt

subject to
ẋ = u, x(0) = 0, x(3) = 3,

0 ≤ u ≤ 2, 1 + t − x ≥ 0.

E 4.23 Maximize

    J = −∫₀⁴ e^{−t}(u − 1)² dt

subject to
ẋ = u, x(0) = 0,

    x ≤ 2 + e^{−3}.

E 4.24 Solve the following problem:

    max{J = ∫₀² (2u − x) dt}

subject to

    ẋ = −u, x(0) = e,
    −3 ≤ u ≤ 3, x − u ≥ 0, x ≥ t.

E 4.25 Solve the following problem:

    max{J = ∫₀³ −2x1 dt}

subject to

    ẋ1 = x2, x1(0) = 2,
    ẋ2 = u, x2(0) = 0,
    x1 ≥ 0.

E 4.26 Re-solve Example 4.6 with the control constraint (4.3) replaced
by 0 ≤ u ≤ 1.

E 4.27 Solve explicitly the following problem:

    max{J = −∫₀² x(t) dt}

subject to

    ẋ(t) = u(t), x(0) = 1,
    −a ≤ u(t) ≤ b, a > 1/2, b > 0,
    x(t) ≥ t − 2.
Obtain x∗ (t), u∗ (t) and all the required multipliers.

E 4.28 Minimize

    ∫₀ᵀ (1/2)(x² + c²u²) dt

subject to

    ẋ = u, x(0) = x0 > 0, x(T) = 0,
    h1(x, t) = x − a1 + b1t ≥ 0,
    h2(x, t) = a2 − b2t − x ≥ 0,

where ai, bi > 0, a2 > x0 > a1, and a2/b2 > a1/b1; see Fig. 4.5. The
optimal path must begin at x0 on the x-axis, stay in the shaded area,
and end on the t-axis.

Figure 4.5: Feasible space for Exercise 4.28

(a) First, assume that the problem parameters are such that the op-
timal solution x∗ (t) satisfies h1 (x∗ (t), t) > 0 for t ∈ [0, T ]. Show
that
    x∗(t) = k1e^{t/c} + k2e^{−t/c},

where k1 and k2 are the constants to be determined. Write down


the two conditions that would determine the constants. Also, il-
lustrate graphically the optimal state trajectory.
(b) How would your solution change if the problem parameters do not
satisfy the condition in (a)? Characterize and graphically illustrate
the optimal state trajectory.
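For part (a), the two conditions are the boundary conditions x(0) = x0 and x(T) = 0. A small symbolic sketch (using sympy) of solving them for k1 and k2, offered as an illustration rather than a full solution of the exercise:

```python
import sympy as sp

t, T, c, x0 = sp.symbols('t T c x_0', positive=True)
k1, k2 = sp.symbols('k_1 k_2')
x = k1*sp.exp(t/c) + k2*sp.exp(-t/c)   # the unconstrained extremal

# Impose x(0) = x0 and x(T) = 0 and solve the 2x2 linear system.
sol = sp.solve([x.subs(t, 0) - x0, x.subs(t, T)], [k1, k2], dict=True)[0]
print(sp.simplify(sol[k1]))   # equivalent to -x0/(e^{2T/c} - 1)
print(sp.simplify(sol[k2]))   # equivalent to x0*e^{2T/c}/(e^{2T/c} - 1)
```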

E 4.29 With a > 0, b > 0, and γ̇(t)/γ(t) = −ρ(t) < 0,

    max_{u,T} {J = ∫₀ᵀ (a/b)(1 − e^{−bu(t)})γ(t) dt}

subject to

    ẋ = −u, x(0) = x0 > 0 given,

and the constraint

    x(t) ≥ 0.

Obtain the expressions satisfied by the optimal terminal time T ∗ , the


optimal control u∗ (t), 0 ≤ t ≤ T ∗ , and the optimal state trajectory
x∗ (t), 0 ≤ t ≤ T ∗ . Furthermore, obtain them explicitly in the special
case when ρ(t) = ρ > 0, a constant positive discount rate.

E 4.30 Set ρ = 0 in the solution of Example 4.6 and obtain λ, γ, η, ζ(1)


for the undiscounted problem. Then use the transformation formulas
(4.30)–(4.33) on these and the fact that ζ(2) = 0 to obtain λd , γ d , η d ,
and ζ d (1) and ζ d (2), and show that they are the same as those obtained
in Example 4.2 along with ζ d (1) = 0, which holds trivially.

E 4.31 Consider a finite-time economy in which production can be used


for consumption as well as investment, but production also pollutes. The
state equations for the capital stock K and stock of pollution W are

K̇ = suK, K(0) = K0 ,

Ẇ = uK − δW, W (0) = W0 ,
where a fraction s of the production output uK is invested, with u de-
noting the capacity utilization rate. The control constraints are

0 ≤ s ≤ 1, 0 ≤ u ≤ 1,

and the state constraint


W ≤ W̄
implies that the pollution stock cannot exceed the upper bound W̄ .
The aim of the economy is to choose s and u so as to maximize the
consumption utility

    ∫₀ᵀ (1 − s)uK dt.

Assume that W0 < W̄, T > 1, and (W0 − K0/δ)e^{−δT} + K0/δ < W̄, which
means that with s(t) ≡ 0, the pollution stock never reaches W̄ even
with u(t) ≡ 1.
Chapter 5

Applications to Finance

An important area of finance involves making decisions regarding


investment and dividend policies over time and ways to finance them.
Among the ways of financing such policies are: issuing equity, retaining
earnings, borrowing money, etc. It is possible to model such situations
as optimal control problems; see, for example, Davis and Elzinga (1971),
Elton and Gruber (1975), and Sethi (1978b). Some of these models are
simple to analyze and they yield useful insights.

In this chapter we deal with two different problems relating to a


firm. The cash balance problem, in its simplest form, is a problem
of controlling the level of a firm’s cash balances to meet its demand
for cash at minimum total cost. The problem of the optimal equity
financing of a corporate firm, a central problem in finance, is that of
determining the optimal dividend path along with new equity issued
over time in order to maximize the value of the firm. Although we
only deal with deterministic problems in this chapter, some of the
more important problems in finance involve uncertainty. Thus, their
optimization requires the use of stochastic optimal control theory or
stochastic programming. A brief introduction to stochastic optimal
control theory will be provided in Chap. 12, together with an application
to a stochastic consumption-investment problem and references.

In the next section, we introduce a simple cash balance problem as


a tutorial. This model is based on Sethi and Thompson (1970) and
Sethi (1973d, 1978c). We will be especially interested in the financial


interpretations for the various functions such as the Hamiltonian and


the adjoint functions that arise in the course of the analysis.

5.1 The Simple Cash Balance Problem


Consider a firm which has a known demand for cash over time. To satisfy
this cash demand, the firm must keep some cash on hand, assumed to be
held in a checking account at a bank. If the firm keeps too much cash,
it loses money in terms of opportunity cost, in that it can earn higher
returns by buying securities such as bonds. On the other hand, if the
cash balance is too small, the firm has to sell securities to meet the cash
demand and thus incur a broker’s commission. The problem then is to
find the tradeoff between the cash and security balances.

5.1.1 The Model


To formulate the optimal control problem we introduce the following
notation:
T = the time horizon,
x(t) = the cash balance in dollars at time t,
y(t) = the security balance in dollars at time t,
d(t) = the instantaneous rate of demand for cash; d(t) can be
positive or negative,
u(t) = the rate of sale of securities in dollars; a negative sales
rate means a rate of purchase,
r1 (t) = the interest rate earned on the cash balance,
r2 (t) = the interest rate earned on the security balance,
α = the broker’s commission in dollars per dollar’s worth
of securities bought or sold; 0 < α < 1.
The state equations are
ẋ = r1 x − d + u − α|u|, x(0) = x0 , (5.1)
ẏ = r2 y − u, y(0) = y0 , (5.2)
and the control constraints are
− U2 ≤ u(t) ≤ U1 , (5.3)
where U1 and U2 are nonnegative constants. The objective function is:
max{J = x(T ) + y(T )} (5.4)
subject to (5.1)–(5.3). Note that the problem is in the linear Mayer form.

5.1.2 Solution by the Maximum Principle


Introduce the adjoint variables λ1 and λ2 and define the Hamiltonian
function
H = λ1 (r1 x − d + u − α|u|) + λ2 (r2 y − u). (5.5)
The adjoint variables satisfy the differential equations

    λ̇1 = −∂H/∂x = −λ1r1, λ1(T) = 1,                   (5.6)
    λ̇2 = −∂H/∂y = −λ2r2, λ2(T) = 1.                   (5.7)
It is easy to solve these, respectively, as
    λ1(t) = e^{∫_t^T r1(τ)dτ}                          (5.8)

and

    λ2(t) = e^{∫_t^T r2(τ)dτ}.                         (5.9)
The interpretations of these solutions are also clear. Namely, λ1 (t) is
the future value (at time T ) of one dollar held in the cash account from
time t to T and, likewise, λ2 (t) is the future value of one dollar invested
in securities from time t to T. Thus, the adjoint variables have natural
interpretations as the actuarial evaluations of competitive investments
at each point of time.
Let us now derive the optimal policy by choosing the control vari-
able u to maximize the Hamiltonian in (5.5). In order to deal with the
absolute value function we write the control variable u as the difference
of two nonnegative variables, i.e.,

u = u1 − u2 , u1 ≥ 0, u2 ≥ 0. (5.10)

Recall that this method was suggested in Remark 3.12 in Sect. 3.7. In
order to make u = u1 when u1 is strictly positive, and u = −u2 when u2
is strictly positive, we also impose the quadratic constraint

u1 u2 = 0, (5.11)

so that at most one of u1 and u2 can be nonzero. However, the optimal


properties of the solution will automatically cause this constraint to be
satisfied. The reason is that the broker’s commission must be paid on

every transaction, which makes it not optimal to simultaneously buy and


sell securities. Given (5.10) and (5.11) we can write

|u| = u1 + u2 . (5.12)

Also, since u ∈ [−U2, U1] from (5.3), we must have u1 ≤ U1 and u2 ≤ U2.
In view of (5.10), the control constraints on the variables u1 and u2 are

    0 ≤ u1 ≤ U1 and 0 ≤ u2 ≤ U2.                      (5.13)

We can now substitute (5.10) and (5.12) into the Hamiltonian (5.5)
and reproduce the part that depends on control variables u1 and u2 , and
denote it by W. Thus,

W = u1 [(1 − α)λ1 − λ2 ] − u2 [(1 + α)λ1 − λ2 ]. (5.14)

Maximizing the Hamiltonian (5.5) with respect to u ∈ [−U2, U1] is the
same as maximizing W with respect to u1 ∈ [0, U1] and u2 ∈ [0, U2]. But
W is linear in u1 and u2 so that the optimal strategy is bang-bang and
is as follows:
u∗ = u∗1 − u∗2 , (5.15)
where
u∗1 = bang[0, U1 ; (1 − α)λ1 − λ2 ], (5.16)
u∗2 = bang[0, U2 ; −(1 + α)λ1 + λ2 ]. (5.17)
Since u1 (t) represents the rate of sale of securities, (5.16) says that the
optimal policy is: sell at the maximum allowable rate if the future value
of a dollar less the broker’s commission (i.e., the future value of (1 − α)
dollars) is greater than the future value of a dollar’s worth of securities;
and do not sell if these future values are in reverse order. In case the
future value of a dollar less the commission is exactly equal to the fu-
ture value of a dollar’s worth of securities, then the optimal policy is
undetermined. In fact, we are indifferent as to the action taken, and
this is called singular control. Similarly, u2 (t) represents the purchase of
securities. Here we buy, do not buy, or are indifferent, if the future value
of a dollar plus the commission is less than, greater than, or equal to the
future value of a dollar’s worth of securities, respectively.
Note that if
(1 − α)λ1 (t) ≥ λ2 (t),
then
(1 + α)λ1 (t) > λ2 (t),

Figure 5.1: Optimal policy shown in (λ1 , λ2 ) space

so that if u1 (t) > 0, then u2 (t) = 0. Similarly, if

(1 + α)λ1 (t) ≤ λ2 (t),

then
(1 − α)λ1 (t) < λ2 (t),

so that if u2 (t) > 0, then u1 (t) = 0. Hence, with the optimal policy, the
relation (5.11) is always satisfied.
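These rules are easy to evaluate numerically. The sketch below computes λ1 and λ2 from (5.8) and (5.9) by quadrature and applies (5.16)–(5.17); it uses the data of Exercise 5.1(b) at the end of this chapter:

```python
import numpy as np
from scipy.integrate import quad

T, alpha, U1, U2 = 1.0, 0.01, 5.0, 5.0
r1 = lambda t: t / 2        # interest rate on cash
r2 = lambda t: 1.0 / 3.0    # interest rate on securities

def lam(r, t):
    # (5.8)-(5.9): future value at T of one dollar held from t to T
    return np.exp(quad(r, t, T)[0])

def u_star(t):
    l1, l2 = lam(r1, t), lam(r2, t)
    if (1 - alpha) * l1 > l2:   # (5.16): sell securities at full rate
        return U1
    if (1 + alpha) * l1 < l2:   # (5.17): buy securities at full rate
        return -U2
    return 0.0                  # inactivity (or singular control)

for t in np.linspace(0.0, 1.0, 5):
    print(round(t, 2), u_star(t))
```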
Figure 5.1 illustrates the optimal policy at time t. The first quadrant
is divided into three areas which represent different actions (including
no action) to be taken. The dotted lines represent the singular control
manifolds. A possible path of the vector (λ1 (t), λ2 (t)) of the adjoint
variables is shown in Fig. 5.1 also. Note that on this path, there is one
period of selling, two periods of buying, and three periods of inactivity.
Note also that the final point on the path is (1, 1), since the terminal
values λ1 (T ) = λ2 (T ) = 1, and therefore, the last interval is always
characterized by inactivity.
Another way to represent the optimal path is in the (t, λ2 /λ1 ) space.
The path of (λ1 (t), λ2 (t)) shown in Fig. 5.1 corresponds to the path of
λ2 (t)/λ1 (t) over time shown in Fig. 5.2.

Figure 5.2: Optimal policy shown in (t, λ2 /λ1 ) space

Perhaps a more realistic version of the cash balance problem is to


disallow overdraft on the bank account. This means imposing the pure
state constraint x(t) ≥ 0. In addition, if short selling of securities is not
permitted, then we must also have y(t) ≥ 0. These extensions give rise to
pure state constraints treated in Chap. 4. In Exercise 5.2 you are asked
to formulate such an extension and write the indirect maximum principle
(4.29) for it. Exercises 5.3 and 5.4 present instances where it is easy to
guess the optimal solutions. In Exercise 5.5, you are asked to show if the
guessed solution in Exercise 5.4 satisfies the maximum principle (4.29).
It is in Chap. 6 that we discuss in detail an application of the indirect
maximum principle (4.29) for solving a problem called the wheat trading
model.

5.2 Optimal Financing Model


In the present section, we discuss a model of a corporate firm which must
finance its investments by an optimal combination of retained earnings
and external equity. The model to be discussed is due to Krouse and
Lee (1973), with corrections and extensions due to Sethi (1978b). The
problem of the optimal financing of the firm can be formulated as an op-
timal control problem. The formulations, such as those of Davis (1970),
Krouse (1972), and Krouse and Lee (1973), permit the firm to finance its
investments by retained earnings, debt, and/or external equity in various
proportions which may vary over time. Note that earnings not retained
are paid out as dividends to the firm’s stockholders.

For reasons of simplicity and ease of its solution, the model analyzed
here does not permit debt as a source of financing, but does permit
retained earnings and external equity to be used in any proportions.

5.2.1 The Model


In order to formulate the model, we use the following notation:
y(t) = the value of the firm’s assets or invested capital at time
t,
x(t) = the current earnings rate in dollars per unit time at
time t,
u(t) = the external or new equity financing expressed as a
multiple of current earnings; u ≥ 0,
v(t) = the fraction of current earnings retained, i.e., 1 − v(t)
represents the rate of dividend payout; 0 ≤ v(t) ≤ 1,
1 − c = the proportional floatation (i.e., transaction) cost for
external equity; c a constant, 0 ≤ c < 1,
ρ = the continuous discount rate (assumed constant);
known commonly as the stockholder’s required rate
of return, or the cost of capital,
r = the actual rate of return (assumed constant) on the
firm’s invested capital; r > ρ,
g = the upper bound on the growth rate of the firm’s as-
sets,
T = the planning horizon; T < ∞ (T = ∞ in Sect. 5.2.4) .
Given these definitions, the current earnings rate is x = ry. The rate
of change in the current earnings rate is given by
ẋ = rẏ = r(cu + v)x, x(0) = x0 . (5.18)
Furthermore, the upper bound on the rate of growth of the assets implies
the following constraint on the control variables:
ẏ/y = (cu + v)x/(x/r) = r(cu + v) ≤ g. (5.19)
Finally, the objective of the firm is to maximize its value, which is
taken to be the present value of the future dividend stream accruing to
the shares outstanding at time zero. To derive this expression, note that
    ∫₀ᵀ (1 − v)xe^{−ρt} dt

represents the present value of total dividends issued by the firm. A


portion of these dividends go to the new equity, which under the as-
sumption of an efficient market will get a rate of return exactly equal to
the discount rate ρ. This should therefore be equal to the present value
    ∫₀ᵀ uxe^{−ρt} dt

of the external equity raised over time.


Thus, the net present value of the total future dividends that accrue
to the initial shares is the difference of the previous two expressions, i.e.,
    J = ∫₀ᵀ e^{−ρt}(1 − v − u)x dt;                    (5.20)

see Miller and Modigliani (1961) and Sethi (1996) for further discus-
sion. Note that in the case of a finite horizon, a more realistic objective
function would include a salvage value or bequest term S[x(T )]. This is
not very difficult to incorporate. See Exercise 5.12 where the bequest
function is linear. We will also solve the infinite horizon problem (i.e.,
T = ∞) after we have solved the finite horizon problem.

Remark 5.1 An intuitive interpretation of (5.20) is that the value J


of the firm is the present value of the cash flows (dividends) going out
from the firm to the society less the present value of the cash flows (new
equity) coming from the society into the firm.

The optimal control problem is to choose u and v over time so as to


maximize J in (5.20) subject to (5.18), the constraints (5.19), u ≥ 0, and
0 ≤ v ≤ 1. For convenience, we restate this problem as
    max_{u,v} {J = ∫₀ᵀ e^{−ρt}(1 − v − u)x dt}

subject to

    ẋ = r(cu + v)x, x(0) = x0,                         (5.21)

and the control constraints

    cu + v ≤ g/r, u ≥ 0, 0 ≤ v ≤ 1.

5.2.2 Application of the Maximum Principle


This is a bilinear problem with two control variables which is a special
case of Row (f) in Table 3.3. The current-value Hamiltonian is

H = (1 − v − u)x + λr(cu + v)x


= [(crλ − 1)u + (rλ − 1)v + 1]x, (5.22)

where the current-value adjoint variable λ satisfies

λ̇ = ρλ − (1 − v − u) − λr(cu + v) (5.23)

with the transversality condition

λ(T ) = 0. (5.24)

The first term in the Hamiltonian in (5.22) is the dividend payout


rate to stockholders of record at time t. According to Sect. 2.2.1, λ is the
marginal value (in time t dollars) of a unit change in the earnings rate
at time t. Thus, λr(cu + v)x is the value at time t of the incremental
earnings rate due to the investment of retained earnings vx and the net
amount of external financing cux. This explains why we should maximize
H with respect to u and v at each t. To interpret (5.23) as in Sect. 2.2.4,
consider an earnings rate of one dollar at time t. It is worth λ, on which
the stockholders expect a return of ρλdt at time dt. In equilibrium this
must be equal to the “capital gain” dλ, plus the immediate dividend
(1 − v)dt less udt, the “claims” of the new stockholders, plus the value
λr(cu + v)dt of the incremental earnings rate r(cu + v)dt at time t + dt.
To specify the form of optimal policy, we rewrite the Hamiltonian as

H = [W1 u + W2 v + 1]x, (5.25)

where
W1 = crλ − 1, (5.26)
W2 = rλ − 1. (5.27)
Note first that the state variable x factors out so that the optimal
controls are independent of the state variable. Second, since the Hamil-
tonian is linear in the two control variables, the optimal policy is a com-
bination of generalized bang-bang and singular controls. Of course, the
characterization of these optimal controls in terms of the adjoint variable
λ will require solving a parametric linear programming problem at each

Table 5.1: Characterization of optimal controls with c < 1

    Conditions on        Case A: g ≤ r   Case B: g > r
    W1, W2               (subcase)       (subcase)       Optimal controls               Characterization
    (1) W2 < 0           A1              B1              u∗ = 0, v∗ = 0                 Generalized bang-bang
    (2) W2 > 0           A3              –               u∗ = 0, v∗ = g/r               Generalized bang-bang
    (3) W2 = 0           A2              B2              u∗ = 0, 0 ≤ v∗ ≤ min[1, g/r]   Singular
    (4) W1 > 0           –               B5              u∗ = (g − r)/rc, v∗ = 1        Generalized bang-bang
    (5) W1 < 0, W2 > 0   –               B3              u∗ = 0, v∗ = 1                 Generalized bang-bang
    (6) W1 = 0           –               B4              0 ≤ u∗ ≤ (g − r)/rc, v∗ = 1    Singular

instant of time t. The Hamiltonian maximization problem can be stated
as follows:

    max_{u,v} {W1u + W2v}

subject to                                             (5.28)

    u ≥ 0, 0 ≤ v ≤ 1, cu + v ≤ g/r.

Obviously, the constraint v ≤ 1 becomes redundant if g/r < 1. Therefore,


we have two cases:

Case A: g ≤ r and Case B: g > r,


under each of which, we can solve the linear programming problem (5.28)
graphically in a closed form. This is done in Figs. 5.3 and 5.4.
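Alternatively, for any given λ, the problem (5.28) can be handed to a linear-programming solver and the resulting (u∗, v∗) compared with Table 5.1. A minimal sketch, with parameter values that are illustrative assumptions rather than data from the text:

```python
from scipy.optimize import linprog

def optimal_controls(lam, r=0.15, g=0.05, cf=0.98):
    # Solve the LP (5.28): max W1*u + W2*v subject to u >= 0,
    # 0 <= v <= 1, cf*u + v <= g/r.  Here cf plays the role of c.
    W1, W2 = cf * r * lam - 1, r * lam - 1
    res = linprog(c=[-W1, -W2],                   # linprog minimizes
                  A_ub=[[cf, 1.0]], b_ub=[g / r],
                  bounds=[(0, None), (0, 1)])
    return res.x

print(optimal_controls(lam=0.0))    # W1, W2 < 0: u* = v* = 0, as in Row (1)
print(optimal_controls(lam=10.0))   # W2 > 0 with g <= r: u* = 0, v* = g/r, Row (2)
```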
There are seven subcases shown in Fig. 5.3 and nine subcases in
Fig. 5.4, but some of these subcases cannot occur. To see this, we note
from our assumption c < 1 that

W1 = crλ − 1 < crλ − c = cW2 ,

which also gives W2 > 0 if W1 = 0. Thus, subcases A4–A7 and B6–B9


are ruled out. The remaining Subcases A1–A3 and B1–B5 are shown

adjacent to the darkened lines in Figs. 5.3 and 5.4, respectively. In ad-
dition to W1 < cW2 and W1 = 0 implying W2 > 0, we see that W2 ≤ 0
implies W1 < 0. In view of these, we can simply characterize Subcases
A1 and B1 by W2 < 0, A2 and B2 by W2 = 0, A3 by W2 > 0, B4 by
W1 = 0, and B5 by W1 > 0, and use these simpler characterizations
in our subsequent discussion. Keep in mind that Subcase B3 remains
characterized as W1 < 0, W2 > 0.
In Table 5.1, we list the feasible cases, shown along the darkened
lines in Figs. 5.3 and 5.4 and provide the form of the optimal control
in each of these cases. The catalog of possible optimal control regimes
shown in Table 5.1 gives the potential time-paths for the firm. What
must be done to obtain the optimal path (given an initial condition) is
to synthesize these subcases into an optimal sequence. This is carried
out in the following section.

Figure 5.3: Case A: g ≤ r



Figure 5.4: Case B: g > r

5.2.3 Synthesis of Optimal Control Paths


To obtain an optimal path, we must synthesize an optimal sequence
of subcases. The usual procedure employed is that of the reverse-time
construction, first developed by Isaacs (1965). Reverse time can only be
defined for finite horizon problems. However, the infinite horizon solution
can usually be inferred from the finite horizon solution if sufficient care
is exercised. This will be done in Sect. 5.2.4.
Our analysis of the finite horizon problem (5.21) proceeds under the
assumption that the terminal time T is sufficiently large. We will make
this assumption precise during our analysis. Moreover, we will discuss
the solution when T is not sufficiently large in Remarks 5.2 and 5.6.
Define the reverse-time variable τ as

τ = T − t,

so that

    ẙ = dy/dτ = (dy/dt)(dt/dτ) = −ẏ.

As a consequence, ẙ = −ẏ, and the reverse-time versions of the state
and adjoint equations (5.18) and (5.23), respectively, can be obtained by
simply replacing ẏ with ẙ and changing the signs of the right-hand sides.
The transversality condition on the adjoint variable

λ(t = T ) = λ(τ = 0) = 0 (5.29)

becomes the initial condition in the reverse-time sense. Furthermore, let


us parameterize the terminal state by assuming that

x(t = T ) = x(τ = 0) = αA , (5.30)

where αA is a parameter to be determined.


From now on in this section, everything is expressed in the reverse-
time sense unless otherwise specified. Using the definitions of x̊ and λ̊
and the conditions (5.30) and (5.29), we can write reverse-time versions
of (5.18) and (5.23) as follows:

    x̊ = −r(cu + v)x, x(0) = αA,                       (5.31)
    λ̊ = (1 − u − v) − λ{ρ − r(cu + v)}, λ(0) = 0.      (5.32)
This is the starting point for our switching point synthesis. First, we
consider Case A.

Case A: g ≤ r.

Note that the constraint v ≤ 1 is superfluous in this case and the


only feasible subcases are A1, A2, and A3. Since λ(0) = 0, we have
W1 (0) = W2 (0) = −1 and, therefore, Subcase A1.

Subcase A1: W2 = rλ − 1 < 0.

From Row (1) of Table 5.1, we have u∗ = v ∗ = 0, which gives the


state equation (5.31) and the adjoint equation (5.32) as

    x̊ = 0 and λ̊ = 1 − ρλ.                             (5.33)

With the initial conditions given in (5.29) and (5.30), the solutions for x and λ are

    x(τ) = αA and λ(τ) = (1/ρ)[1 − e^{−ρτ}].          (5.34)
It is easy to see that because of the assumption 0 ≤ c < 1, it follows that
if W2 = rλ − 1 < 0, then W1 = crλ − 1 < 0. Therefore, the firm remains in
this subcase as long as W2(τ) stays negative. From (5.34), however, λ(τ)
is increasing asymptotically toward the value 1/ρ and, therefore, W2(τ)
is increasing asymptotically toward the value r/ρ − 1. Since we have
assumed r > ρ, there exists a τ1 such that W2(τ1) = (1 − e^{−ρτ1})r/ρ − 1 = 0.
It is easy to compute

    τ1 = (1/ρ) ln[r/(r − ρ)].                          (5.35)
From this expression, it is clear that the firm leaves Subcase A1 provided
τ 1 < T. Moreover, this observation also makes precise the notion of a
sufficiently large T in Case A by having T > τ 1 .
Remark 5.2 When T is not sufficiently large, i.e., when T ≤ τ 1 in
Case A, the firm stays in Subcase A1. The optimal solution in this case
is u∗ = 0 and v ∗ = 0, i.e., a policy of no investment.
Remark 5.3 Note that if we had assumed r < ρ, the firm would never
have exited from Subcase A1 regardless of the value of T. Obviously,
there is no use investing if the rate of return is less than the discount
rate.
At reverse time τ 1 , we have W2 = 0 and W1 < 0 and the firm,
therefore, is in Subcase A2. Also, λ(τ 1 ) = 1/r since W2 (τ 1 ) = 0.

Subcase A2: W2 = rλ − 1 = 0.

In this subcase, the optimal controls


u∗ = 0, 0 ≤ v ∗ ≤ g/r (5.36)
from Row (3) of Table 5.1 are singular with respect to v. This case is
termed singular because the Hamiltonian maximizing condition does not
yield a unique value for the control v. In such cases, the optimal controls
are obtained by conditions required to sustain W2 = 0 for a finite time
interval. This means we must have W̊2 = 0, which in turn implies λ̊ = 0.
To compute λ̊, we substitute (5.36) into (5.32) and obtain

    λ̊ = (1 − v∗) − λ[ρ − rv∗].                        (5.37)

Substituting λ = 1/r, its value at τ 1 , in (5.37) and equating the right-


hand side to zero we obtain
    r = ρ                                              (5.38)
as a necessary condition required to maintain singularity over a finite
time interval following τ 1 . Condition (5.38) is fortuitous and will not
generally hold. In fact we have assumed r > ρ. Thus, the firm will not
stay in Subcase A2 for a nonzero time interval. Furthermore, since r > ρ,

we have λ̊(τ1) = 1 − ρ/r > 0. Therefore, W2 is increasing from zero
and becomes positive after τ1. Thus, at τ1+ the firm switches to Subcase
A3.

Subcase A3: W2 = rλ − 1 > 0.

The optimal controls in this subcase from Row (2) of Table 5.1 are
u∗ = 0, v ∗ = g/r. (5.39)
The state and the adjoint equations are

    x̊ = −gx, x(τ1) = αA,                              (5.40)
    λ̊ = (1 − g/r) − λ(ρ − g), λ(τ1) = 1/r,             (5.41)
with values at τ = τ 1 deduced from (5.34) and (5.35).

Since λ̊(τ1) > 0, λ is increasing at τ1 from its value of 1/r. A further
examination of the behavior of λ(τ ) as τ increases will be carried out
under two different possible conditions: (i) ρ > g and (ii) ρ ≤ g.

(i) ρ > g: Under this condition, as λ increases, λ̊ decreases and
becomes zero at a value obtained by equating the right-hand side of
(5.41) to zero, i.e., at
    λ̄ = (1 − g/r)/(ρ − g).                            (5.42)
This value λ̄ is, therefore, an asymptote to the solution of (5.41) starting
at λ(τ 1 ) = 1/r. Since r > ρ > g in this case,
    W2 = rλ̄ − 1 = r(1 − g/r)/(ρ − g) − 1 = (r − ρ)/(ρ − g) > 0,    (5.43)
which implies that the firm continues to stay in Subcase A3.

(ii) ρ ≤ g: Under this condition, as λ(τ ) increases, λ (τ ) increases.
So W2 (τ ) = rλ(τ ) − 1 continues to be greater than zero and the firm
continues to remain in Subcase A3.

Remark 5.4 With ρ ≤ g, note that λ(τ ) increases to infinity as τ in-


creases to infinity. This has important implications later when we deal
with the solution of the infinite horizon problem.

Since the optimal decisions for τ ≥ τ 1 have been found to be inde-


pendent of αA for T sufficiently large, we can sketch the solution for Case
A in Fig. 5.5 starting with x0 . This also gives the value of

    αA = x0e^{g(T−τ1)} = x0e^{gT}[1 − ρ/r]^{g/ρ},

as shown in Fig. 5.5.


Figure 5.5: Optimal path for case A: g ≤ r

Mathematically, we can now express the optimal controls and the


optimal state, now in forward time, as

    u∗(t) = 0, v∗(t) = g/r, x∗(t) = x0e^{gt}, t ∈ [0, T − τ1],    (5.44)



    u∗(t) = 0, v∗(t) = 0, x∗(t) = x0e^{g(T−τ1)}, t ∈ (T − τ1, T].    (5.45)


As for λ(t), from (5.34) we have
    λ(t) = (1/ρ)[1 − e^{−ρ(T−t)}], t ∈ (T − τ1, T].    (5.46)
For t ∈ [0, T − τ 1 ], we have from (5.41),

λ̇(t) = λ(ρ − g) − (1 − g/r), λ(T − τ 1 ) = 1/r. (5.47)

Following Sect. A.1, we can solve this equation as

    λ(t) = (1/r)e^{−(ρ−g)(T−τ1−t)} + [(1 − g/r)/(ρ − g)][1 − e^{−(ρ−g)(T−τ1−t)}],
    t ∈ [0, T − τ1].                                   (5.48)
In this solution for Case A, there is only one switching point provided
that T is sufficiently large (i.e., T > τ 1 in this case). The switching time
t = T −τ 1 has an interesting economic interpretation. Namely, it requires
at least τ 1 units of time to retain a dollar of earnings to be worthwhile
for investment. That means, it pays to invest as much of the earnings
as feasible before T − τ 1 , and it does not pay to invest any earnings
after T − τ 1 . Thus, T − τ 1 is the point of indifference between retaining
earnings or paying dividends out of earnings. To see this directly, let us
suppose the firm retains one dollar of earnings at T − τ 1 . Since this is
the last time that any of the earnings invested will be worthwhile, it is
obvious (because all earnings are paid out) that the dollar just invested
at T − τ 1 yields dividends at the rate r from T − τ 1 to T. The value of
this dividend stream in terms of (T − τ 1 )-dollars is
    ∫₀^{τ1} re^{−ρs} ds = (r/ρ)[1 − e^{−ρτ1}],        (5.49)
which must be equated to one dollar to find the indifference point. Equat-
ing (5.49) to 1 yields precisely the value of τ 1 given in (5.35).
With this interpretation of τ 1 , we conclude that enough earnings
must be retained so as to make the firm grow exponentially at the max-
imum rate of g until t = T − τ 1 . After this time, all of the earnings are
paid out and the firm stops growing. Since g ≤ r (assumed for Case A),
the growth in the first part of the solution can be financed entirely from
retained earnings. Thus, there is no need to resort to more expensive
external equity financing. The latter will not be true, however, in Case
B when g > r, which we now discuss.

Case B: g > r.

Since g/r > 1, the constraint v ≤ 1 in Case B is relevant. The feasible


subcases are B1, B2, B3, B4, and B5 shown adjacent to the darkened
lines in Fig. 5.4. As in Case A, it is obvious that the firm starts (in
the reverse-time sense) in Subcase B1. Recall that T is assumed to be
sufficiently large here as well. This statement in Case B will be made
precise in the course of our analysis. Furthermore, the solution when T
is not sufficiently large in Case B will be discussed in Remark 5.6.

Subcase B1: W2 = rλ − 1 < 0.

The analysis of this subcase is the same as Subcase A1. As in that


subcase, the firm switches out at time τ = τ 1 to Subcase B2.

Subcase B2: W2 = rλ − 1 = 0.

In this subcase, the optimal controls

u∗ = 0, 0 ≤ v ∗ ≤ 1 (5.50)

from Row (3) of Table 5.1 are singular with respect to v. As before
in Subcase A2, the singular case cannot be sustained for a finite time
because of our assumption r > ρ. As in Subcase A2, W2 is increasing at
τ1 from zero and becomes positive after τ1. Thus, at τ1+, the firm finds
itself in Subcase B3.

Subcase B3: W1 = crλ − 1 < 0, W2 = rλ − 1 > 0.

The optimal controls in this subcase are

u∗ = 0, v ∗ = 1, (5.51)

as shown in Row (5) of Table 5.1. The state and the adjoint equations
are

    x̊ = −rx, x(τ1) = αB                               (5.52)

with αB, a parameter to be determined, and

    λ̊ = λ(r − ρ), λ(τ1) = 1/r.                        (5.53)

Obviously, earnings are growing exponentially at rate r and λ(τ ) is in-


creasing at rate (r − ρ) as τ increases from τ 1 . Since λ(τ 1 ) = 1/r,

we have
    λ(τ) = (1/r)e^{(r−ρ)(τ−τ1)} for τ ≥ τ1.            (5.54)
As λ increases, W1 increases and becomes zero at a time τ 2 defined by

    W1(τ2) = crλ(τ2) − 1 = ce^{(r−ρ)(τ2−τ1)} − 1 = 0,  (5.55)

which, in turn, gives


 
    τ2 = τ1 + (1/(r − ρ)) ln(1/c).                     (5.56)

At τ2+, the firm switches to Subcase B4.
Before proceeding to Subcase B4, let us observe that in Case B, we
can now define T to be sufficiently large when T > τ2. See Remark 5.6
when T ≤ τ2.
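For concreteness, τ1 and τ2 can be computed for sample data; a short sketch with illustrative values of r, ρ, and c (assumptions, not data from the text):

```python
import numpy as np

r, rho, c = 0.15, 0.10, 0.90   # illustrative values with r > rho, 0 < c < 1

tau1 = (1 / rho) * np.log(r / (r - rho))       # (5.35): retained earnings
tau2 = tau1 + (1 / (r - rho)) * np.log(1 / c)  # (5.56): external equity
print(tau1, tau2)   # ~10.99 and ~13.09: equity needs tau2 - tau1 extra time
```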

Subcase B4: W1 = crλ − 1 = 0.

In Subcase B4, the optimal controls are

0 ≤ u∗ ≤ (g − r)/rc, v ∗ = 1. (5.57)

From Row (6) in Table 5.1, these controls are singular with respect to
u. To maintain this singular control over a finite time period, we must

keep W1 = 0 in the interval. This means we must have W̊1(τ2) = 0,
which, in turn, implies λ̊(τ2) = 0. To compute λ̊, we substitute (5.57)
into (5.32) and obtain

    λ̊ = −u∗ − λ{ρ − r(cu∗ + 1)}.                      (5.58)

At τ2, W1(τ2) = 0 gives λ(τ2) = 1/rc. With this in (5.58), its right-hand
side equals zero only when r = ρ. But we have assumed r > ρ throughout
Sect. 5.2, and therefore a singular path cannot be sustained for τ > τ2,
and the firm will not stay in Subcase B4 for a finite amount of time.
Furthermore, from (5.58), we have
    λ̊(τ2) = (r − ρ)/(rc) > 0,                         (5.59)
which implies that λ is increasing and, therefore, W1 is increasing. Thus
at τ2+, the firm switches to Subcase B5.

Subcase B5: W1 = crλ − 1 > 0.

The optimal controls in this subcase from Row (4) of Table 5.1 are
    u∗ = (g − r)/rc, v∗ = 1.                           (5.60)
Then from (5.31) and (5.32), the reverse-time state and the adjoint equa-
tions are

    x̊ = −gx,                                           (5.61)
    λ̊ = −(g − r)/(rc) + λ(g − ρ).                      (5.62)

Since λ̊(τ2) > 0 from (5.59), λ(τ) is increasing at τ2 from its value
λ(τ 2 ) = 1/rc > 0. Furthermore, we have g > r in Case B, which together
with r > ρ, assumed throughout Sect. 5.2, makes g > ρ. This implies that
the second term on the right-hand side of (5.62) is increasing. Moreover,
the second term dominates the first term for τ > τ 2 , since λ(τ 2 ) =
1/(rc) > 0, and r > ρ and g > r imply g − ρ > g − r > 0. Thus,

λ̊(τ) > 0 for τ > τ2, and λ(τ) increases with τ. Therefore, the firm
continues to stay in Subcase B5.

Remark 5.5 Note that λ(τ ) in Case B increases without bound as τ


becomes large. This will have important implications when dealing with
the infinite horizon problem in Sect. 5.2.4.

As in Case A, we can obtain this optimal solution explicitly in forward


time, and we ask you to do this in Exercise 5.9. We now can sketch
the complete solution for Case B in Fig. 5.6. In this solution, there are
two switching points instead of just one as in Case A. The reason for two
switching points becomes quite clear when we interpret the significance of
τ 1 and τ 2 . It is obvious that τ 1 has the same meaning as before. Namely,
if τ 1 is the remaining time to the horizon, the firm is indifferent between
retaining a dollar of earnings or paying it out as dividends. Intuitively, it
seems that since external equity is more expensive than retained earnings
as a source of financing, investment financed by external equity requires
more time to be worthwhile. That is,
 
    τ2 − τ1 = (1/(r − ρ)) ln(1/c) > 0                  (5.63)
as obtained in (5.56), should be the time required to compensate for the
floatation cost of external equity. Let us see why.

Figure 5.6: Optimal path for case B: g > r

When the firm issues a dollar’s worth of stock at time t = T − τ 2 ,


it incurs a future dividend obligation in the amount of one (T − τ 2 )-
dollar, even though the capital acquired is only c dollars because of the
floatation cost (1 − c). Since we are attempting to find the breakeven
time for external equity, it is obvious that retaining all of the earnings
for investment is still profitable. Thus, there is no dividend from (T −τ 2 )
to (T − τ 1 ), and the firm grows at the rate r. Therefore, this investment
of c dollars at time (T −τ 2 ) grows into cer(τ 2 −τ 1 ) dollars at time (T −τ 1 ).
From the point of view of a buyer of the stock at time (T − τ 2 ), since no
dividend is paid until time (T − τ 1 ) and since the stockholder’s required
rate of return is ρ, the firm’s future dividend obligation at time (T − τ 1 )
is eρ(τ 2 −τ 1 ) in terms of (T − τ 1 )-dollars. But then we must have
    e^{ρ(τ2−τ1)} = ce^{r(τ2−τ1)},                      (5.64)

which can be rewritten precisely as (5.63). Moreover, the firm is


marginally indifferent between investing any costless retained earnings
at time (T − τ 1 ) or paying it all out as dividends. This also means that
the firm will be indifferent between having the new available capital of
cer(τ 2 −τ 1 ) dollars at time (T − τ 1 ) as a result of issuing a dollar’s worth
of stock at time (T − τ 2 ), or not having it. Thus, we can conclude that
the firm is indifferent between issuing a dollar’s worth of stock at time
(T − τ 2 ) or not issuing it. This means that before time (T − τ 2 ), it pays
to issue stocks at as large a rate as feasible, and after time (T − τ 2 ), it
does not pay to issue any external equity at all.
We have now provided an intuitive justification of (5.63) and con-
cluded that all earnings must be retained from time (T − τ 2 ) to (T − τ 1 ).
Because r > ρ, it follows that the excess return on the proceeds c from
the new stock issue is ce^{r(τ2−τ1)} − ce^{ρ(τ2−τ1)} at time (T − τ1). When
this amount is discounted back to time (T − τ2), we can use (5.63) or (5.64)
to see that

    [ce^{r(τ2−τ1)} − ce^{ρ(τ2−τ1)}]e^{−ρ(τ2−τ1)} = ce^{ln(1/c)} − c = 1 − c.

Thus, the excess return from time (T − τ 2 ) to (T − τ 1 ) recovers precisely


the floatation cost.
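This accounting can be verified numerically with illustrative parameters (r > ρ, 0 < c < 1):

```python
import numpy as np

r, rho, c = 0.15, 0.10, 0.90
dtau = np.log(1 / c) / (r - rho)   # tau2 - tau1 from (5.63)
excess = (c*np.exp(r*dtau) - c*np.exp(rho*dtau)) * np.exp(-rho*dtau)
print(excess, 1 - c)   # both equal 0.10: the floatation cost is recovered
```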

Remark 5.6 When T is not sufficiently large, i.e., when T < τ2 in Case
B, the optimal solution is the same as in Remark 5.2 when T ≤ τ1. If
τ1 < T ≤ τ2, then the optimal solution is u∗ = 0 and v∗ = 1 until
t = T − τ1. For t > T − τ1, the optimal solution is u∗ = 0 and v∗ = 0.

Having completely solved the finite horizon case, we now turn to the
infinite horizon case.

5.2.4 Solution for the Infinite Horizon Problem


As indicated in Sect. 3.6 for the infinite horizon case, the transversality
condition must be changed to

    lim_{t→∞} e^{−ρt}λ(t) = 0.                         (5.65)

Furthermore, this condition may no longer be a necessary condition; see


Sect. 3.6. It is a sufficient condition for optimality however, in conjunc-
tion with the other sufficiency conditions stated in Theorem 2.1.

As demonstrated in Example 3.7, a common method of solving an


infinite horizon problem is to take the limit as T → ∞ of the finite
horizon solution and then prove that the limiting solution obtained solves
the infinite horizon problem. The proof is important because the limit of
the solution may or may not solve the infinite horizon problem. The proof
is usually based on the sufficiency conditions of Theorem 2.1, modified
slightly as indicated above for the infinite horizon case.
We now analyze the infinite horizon case following the above proce-
dure. We start with Case A.

Case A: g ≤ r.

Let us first consider the case ρ > g and examine the solution in
forward time obtained in (5.44)–(5.48) as T goes to infinity. Clearly
(5.45) and (5.46) disappear, and (5.44) and (5.48) can be written as

    u∗(t) = 0, v∗(t) = g/r, x∗(t) = x0e^{gt}, t ≥ 0,   (5.66)

    λ(t) = (1 − g/r)/(ρ − g) = λ̄, t ≥ 0.              (5.67)

Clearly λ(t) satisfies (5.65). Furthermore,

    W2(t) = rλ − 1 = (r − ρ)/(ρ − g) > 0, t ≥ 0,

which implies that the firm is in Subcase A3 for t ≥ 0. The maximum


principle holds, and (5.66) and (5.67) represent an optimal solution for
the infinite horizon problem. Note that the assumption ρ > g together
with our overall assumption that ρ < r gives g < r so that 1 − v ∗ > 0,
which means a constant fraction of earnings is being paid as dividends.
Note that the value of the adjoint variable λ̄ in this case is a con-
stant and its form is reminiscent of Gordon’s classic formula; see Gordon
(1962). In the control theory framework, the value of λ̄ represents the
marginal worth per additional unit of earnings. Obviously, a unit in-
crease in earnings will mean an increase of 1 − v ∗ or 1 − g/r units in
dividends. This, of course, should be capitalized at a rate equal to the
discount rate less the growth rate (i.e., ρ−g), which is precisely Gordon’s
formula.
For ρ ≤ g, it is clear from (5.48) that λ(t) does not satisfy (5.65). A
moment’s reflection shows that for ρ ≤ g, the objective function can be
made infinite. For example, any control policy with earnings growing at

rate q, ρ ≤ q ≤ g, coupled with a partial dividend payout, i.e., a constant


v such that 0 < v < 1, gives an infinite value for the objective function.
That is, with u∗ = 0, v ∗ = q/r < 1, we have
    J = ∫₀^∞ e^{−ρt}(1 − u∗ − v∗)x∗ dt = ∫₀^∞ e^{−ρt}(1 − q/r)x0e^{qt} dt = ∞.

Since there are many policies which give an infinite value to the
objective function, the choice among them may be decided on subjective
grounds. We will briefly discuss only the constant (over time) optimal
policies. If g < r, then the rate of growth q may be chosen in the
closed interval [ρ, g]; if g = r, then q may be chosen in the half-open
interval [ρ, r). In either case, the choice of a low rate of growth (i.e., a
high proportional dividend payout) would mean a higher dividend rate
(in dollars per unit time) early in time, but a lower dividend rate later
in time because of the slower growth rate. Similarly the choice of high
growth rate means the opposite in terms of dividend payments in dollars
per unit time.
To conclude, we note that for ρ ≤ g in Case A, the limiting solution
of the finite case is an optimal solution for the infinite horizon problem
in the sense that the objective function becomes infinite. However, this
will not be the situation in Case B; see also Remark 5.7.

Case B: g > r.

The limit of the finite horizon optimal solution is to grow at the


maximum allowable growth rate with
    u = (g − r)/rc and v = 1
all the way. Since τ 1 disappears in the limit, the stockholders will never
collect dividends. The firm has become an infinite sink for investment.
In fact, the limiting solution is a pessimal solution because the value
of the objective function associated with it is zero. From the point of
view of optimal control theory, this can be explained as before in Case
A when ρ ≤ g. In Case B, we have g > r so that (since r > ρ throughout
the chapter) we have ρ < g. For this, as noted in Remark 5.5, λ(τ )
increases without bound as τ increases and, therefore, the transversality
condition (5.65) cannot be satisfied.
As in Case A with ρ < g, any control policy with earnings growing
at rate q ∈ [ρ, g] coupled with a constant v, 0 < v < 1, has an infinite
value for the objective function.

In summary, we note that the only nondegenerate case in the infinite


horizon problem is when ρ > g. In this case, which occurs only in Case
A, the policy of maximum allowable growth is optimal. On the other
hand, when ρ ≤ g, whether in Case A or B, the infinite horizon problem
has nonunique policies with infinite values for the objective function.
Before solving a numerical example, we will make an interesting re-
mark concerning Case B.

Remark 5.7 Let (u∗T, v∗T) denote the optimal control for the finite
horizon problem in Case B. Let (u∗∞, v∗∞) denote any optimal con-
trol for the infinite horizon problem in Case B. We already know that
J(u∗∞, v∗∞) = ∞. Define an infinite horizon control (u∞, v∞) by extend-
ing (u∗T, v∗T) as follows:

    (u∞, v∞) = lim_{T→∞} (u∗T, v∗T).

We now note that for our model in Case B, we have

    lim_{T→∞} J(u∗T, v∗T) = ∞ and J(lim_{T→∞} (u∗T, v∗T)) = J(u∞, v∞) = 0.    (5.68)

Obviously (u∞ , v∞ ) is not an optimal control for the infinite horizon


problem. Since the two terms in (5.68) are not equal, we can say in tech-
nical terms that J(u, v), regarded as a mapping, is not a closed mapping.
However, if we introduce a salvage value Bx(T ), B > 0, for the finite
horizon problem, then the new objective function,

    JB(u, v) = { ∫₀ᵀ e^{−ρt}(1 − u − v)x dt + Bx(T)e^{−ρT}, if T < ∞,
               { ∫₀^∞ e^{−ρt}(1 − u − v)x dt + lim_{T→∞} {Bx(T)e^{−ρT}}, if T = ∞,

is a closed mapping in the sense that

    lim_{T→∞} JB(u∗T, v∗T) = ∞ and JB(lim_{T→∞} (u∗T, v∗T)) = JB(u∞, v∞) = ∞

for the modified model.

Example 5.1 We will now assign numbers to the various parameters in


the optimal financing problem in order to compute the optimal solution.
Let
x0 = 1000/month, T = 60 months,

r = 0.15, ρ = 0.10, g = 0.05, c = 0.98.



Solution Since g ≤ r, the problem belongs to Case A. We compute

    τ1 = (1/ρ) ln[r/(r − ρ)] = 10 ln 3 ≈ 11 months.

The optimal controls for the problem are

u∗ = 0, v ∗ = g/r = 1/3, t ∈ [0, 49),

u∗ = 0, v ∗ = 0, t ∈ [49, 60],

and the optimal state trajectory is




    x(t) = { 1000e^{0.05t}, t ∈ [0, 49),
           { 1000e^{2.45}, t ∈ [49, 60].

The value of the objective function is


    J∗ = ∫₀⁴⁹ e^{−0.1t}(1 − 1/3)(1000)e^{0.05t} dt + ∫₄₉⁶⁰ 1000e^{2.45} · e^{−0.1t} dt
       ≈ 12,758.44.

Note that the infinite horizon problem is well defined in this case, since
g < ρ and g < r. The optimal controls are

u∗ = 0, v ∗ = g/r = 1/3,

and
    J = ∫₀^∞ e^{−0.1t}(2/3)(1000)e^{0.05t} dt = 2000/0.15 ≈ 13,333.33.
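Both values can be confirmed by numerical quadrature; a minimal sketch using the rounded switching time t = 60 − 11 = 49:

```python
import numpy as np
from scipy.integrate import quad

rho, g, x0, T, ts = 0.10, 0.05, 1000.0, 60.0, 49.0

# Dividends are (2/3)x(t) on [0, 49) and the full earnings x(49) after.
J1 = quad(lambda t: np.exp(-rho*t) * (2/3) * x0 * np.exp(g*t), 0, ts)[0]
J2 = quad(lambda t: np.exp(-rho*t) * x0 * np.exp(g*ts), ts, T)[0]
print(J1 + J2)        # ~12758.4
print(2000 / 0.15)    # infinite-horizon value ~13333.3
```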

In Exercise 5.14, you are asked to extend the optimal financing model
to allow for debt financing. Exercise 5.15 requires you to reformulate the
optimal financing model (5.21) with decisions expressed in dollars per
unit of time rather than in terms relative to x. Exercise 5.16 extends the
model to allow the rate of return on the assets to decrease as the assets
grow.

Exercises for Chapter 5


E 5.1 Find the optimal policies for the simple cash balance model
(Sects. 5.1.1 and 5.1.2) with x0 = 2, y0 = 2, U1 = U2 = 5, T = 1,
α = 0.01, and the following specifications for the interest rates:
(a) r1 (t) = 1/2, r2 (t) = 1/3.
(b) r1 (t) = t/2, r2 (t) = 1/3.
(c) Sketch the optimal policy in (b) in the (t, λ2 /λ1 ) space, like in
Fig. 5.2.
E 5.2 Formulate the extension of the model in Sect. 5.1.1 when over-
draft and short selling are disallowed in the following two cases: (a)
α = 0 and (b) α > 0. State the maximum principle (4.29) as it applies
to these cases.

Hint: Adjoin the control constraints to the Hamiltonian in form-


ing the Lagrangian. For (b), write u = u1 − u2 as in (5.10).
E 5.3 It is possible to guess the optimal solution for Exercise 5.2 when
α = 0, T = 10, x0 = 0, y0 = 3,


    r1(t) = { 0 for 0 ≤ t < 5,
            { 0.3 for 5 ≤ t ≤ 10,

    r2(t) = 0.1 for 0 ≤ t ≤ 10,


and U1 = U2 = ∞ (allowing for impulse controls). Show that the
optimum policy remains the same for each α ∈ [0, 1 − 1/e].

Hint: Use an elementary compound interest argument.


E 5.4 Do the following for Exercise 5.3 with U1 = U2 = 1, so that the
control constraints are −1 ≤ u ≤ 1.
(a) Give reasons why the solution shown in Fig. 5.7 is optimal.
(b) Compute f (t∗ ) in terms of t∗ .
(c) Compute J in terms of t∗ .
(d) Find t∗ that maximizes J by setting dJ/dt∗ = 0.
Hint: Because this is a long and tedious calculus problem, you may wish
to use Mathematica or MAPLE to solve this problem.

Figure 5.7: Solution for Exercise 5.4

E 5.5 For the solution found in Exercise 5.4, show by using the maxi-
mum principle (4.29) that the adjoint trajectories are:


    λ1(t) = { λ1(0) = e^{1.5}, 0 ≤ t ≤ 5,
            { λ1(5)e^{−0.3(t−5)} = e^{3−0.3t}, 5 ≤ t ≤ 10,

and

    λ2(t) = { λ2(0)e^{−0.1t} = e^{1.5+0.1(t∗−t)}, 0 ≤ t ≤ f(t∗) ≈ 6.52,
            { 2/3 + (1/3)e^{3−0.3t}, f(t∗) < t ≤ 10,

where t∗ ≈ 1.97. Sketches of these functions are shown in Fig. 5.8.

E 5.6 Argue that as the lower and upper bounds on u go to −∞ and


+∞ in Exercise 5.4, respectively, t∗ goes to 0 and f (t∗ ) goes to 5. Show
that this solution is consistent with the guess in Exercise 5.3. Finally,
find the corresponding impulse solution and show that it satisfies the
maximum principle as applied in Exercise 5.2.

E 5.7 Discuss the optimal equity financing model of Sect. 5.2.1 when
c = 1. Show that only one control variable is needed. Then solve the
problem.

Figure 5.8: Adjoint trajectories for Exercise 5.5

E 5.8 What happens in the optimal equity financing model when r < ρ?
Guess the optimal solution (without actually solving it).

E 5.9 In Sect. 5.2.3, we obtained the optimal solution in Case B. Express


the corresponding control, state, and adjoint trajectories in forward time.

E 5.10 Let g = 0.12 in Example 5.1. Re-solve the finite horizon problem
with this new value of g. Also, for the infinite horizon problem, state a
policy which yields an infinite value for the objective function.

E 5.11 Reformulate and solve the simple cash balance problem of


Sects. 5.1.1 and 5.1.2, if the earnings on bonds are paid in cash.

E 5.12 Add a salvage value function

e−ρT Bx(T ),

where B ≥ 0, to the objective function in the problem (5.21) and ana-


lyze the modified problem due to Sethi (1978b). Show how the solution
changes as B varies from 0 to 1/rc.

E 5.13 Suppose we extend the model of Exercise 5.12 to include debt.


For this let z denote the total debt at time t and w ≥ 0 denote the

amount of debt issued expressed as a proportion of current earnings.


Then the state equation for z is

    ż = wx, z(0) = z0.

How would you modify the objective function, the state equation for x,
and the growth constraint (5.19)? Assume i to be the constant interest
rate on debt, and i < r.

E 5.14 Remove the assumption of an arbitrary upper bound g on the


growth rate in the financing model of Sect. 5.2.1 by introducing a convex
cost associated with the growth rate. With r re-interpreted now as the
gross rate of return, obtain the net increase in rate of earnings by the rate
of increase in gross earnings less the cost associated with the growth rate.
Also assume c = 1 as in Exercise 5.7. Formulate the resulting model and
apply the maximum principle to find the form of the optimal policy. You
may assume the cost function to be quadratic in the growth rate to get
an explicit form for the solution.

E 5.15 Reformulate the optimal financing model (5.21) with y(t) as


the state variable, U (t) as the new equity financing rate in dollars per
unit of time, and V (t) as the retained earnings in dollars per unit of time.

Hint: This formulation has mixed constraints requiring the La-


grangian formulation of the maximum principle (3.42) introduced in
Chap. 3. Note further that it can be converted into the form (5.21) by
setting U = ux, V = vx, and x = ry.

E 5.16 In Exercise 5.15, we assume a constant rate of return r on the


assets so that the total earnings rate at time t is ry(t) dollars per unit of
time. Extend this formulation to allow for a decreasing marginal rate of
return as the assets grow. More specifically, replace ry by an increasing,
strictly concave function R(y) > 0 with R′(0) = r and R′(ȳ) = ρ for some
ȳ > y0 > 0. Obtain the optimal solution in the case when r > g > ρ,
0 < c < 1, T sufficiently large, and y0 < y1 < ȳ, where y1 is defined by
the relation R(y1 )/y1 = g. See Perrakis (1976).

E 5.17 Find the form of the optimal policy for the following model due
to Davis and Elzinga (1971):
    max_{u,v} {J = ∫₀ᵀ e^{−ρt}(1 − v)Er dt + P(T)e^{−ρT}}

subject to

    Ṗ = k[rE(1 − v) − ρP], P(0) = P0,
    Ė = rE[v + u(c − E/P)], E(0) = E0,

and the control constraints

    u ≥ 0, v ≥ 0, cu + v ≤ g/r.

Here P denotes the price of a stock, E denotes equity per stock and
k > 0 is a constant. Also, assume r > ρ > g and 1/c < r/ρ < 1/c + (ck +
1)g/(ρck). This example requires the use of the generalized Legendre-
Clebsch condition (D.69) in Appendix D.8.
Chapter 6

Applications to Production
and Inventory

Applications of optimization methods to production and inventory prob-


lems date back at least to the classical EOQ (Economic Order Quantity)
model or the lot size formula of Harris (1913). The EOQ is essentially
a static model in the sense that the demand is constant and only a sta-
tionary solution is sought. A dynamic version of the lot size model was
analyzed by Wagner and Whitin (1958). The solution methodology used
there was dynamic programming.
An important dynamic production planning model was developed by
Holt et al. (1960). In their model, referred to as the HMMS model, they
considered both production costs and inventory holding costs over time.
They used calculus of variations techniques to solve the continuous-time
version of their model. In Sect. 6.1, a model of Thompson and Sethi
(1980), similar to the HMMS model, is formulated and completely solved
using optimal control theory. The turnpike solution is also obtained when
the horizon is infinite.
In Sect. 6.2, we introduce the wheat trading model of Ijiri and
Thompson (1970), in which a wheat speculator must buy and sell wheat
in an optimal way in order to take advantage of changes in the price of
wheat over time. In Sects. 6.2.1–6.2.3, we solve the model when the short-
selling of wheat is allowed. In Sect. 6.2.4, we follow Norström (1978) to
solve a simple example that disallows short-selling.


In Sect. 6.3, we introduce a warehousing constraint, i.e., an upper


bound on the amount of wheat that can be stored, in the wheat trading
model. In addition to being realistic, the introduction of the warehousing
constraint helps us to illustrate the concepts of decision and forecast
horizons by means of examples. This section is expository in nature, but
theoretical developments of these ideas are available in the literature.

6.1 Production-Inventory Systems


Many manufacturing enterprises use a production-inventory system to
manage fluctuations in consumer demand for their products. Such a
system consists of a manufacturing plant and a finished goods ware-
house to store products which are manufactured but not immediately
sold. Once a product is made and put into inventory, it incurs inventory
holding costs of two kinds: (1) costs of physically storing the product,
insuring it, etc.; and (2) opportunity cost of having the firm’s money
invested or tied up in the unsold inventory. The advantages of having
products in inventory are: first, that they are immediately available to
meet demand; second, that excess production during low demand peri-
ods can be stored in the warehouse so it will be available for sale during
high demand periods. This usually permits the use of a smaller manu-
facturing plant than would otherwise be necessary, and also reduces the
difficulties of managing the system.
The optimization problem is to balance the benefits of production
smoothing versus the costs of holding inventory. Works that apply con-
trol theory to production and inventory problems have been reviewed in
Sethi (1978a, 1984).

6.1.1 The Production-Inventory Model


We consider a factory producing a single homogeneous good and having
a finished goods warehouse. To state the model we define the following
quantities:
I(t) = the inventory level at time t (state variable),
P (t) = the production rate at time t (control variable),
S(t) = the exogenously given sales rate at time t;
assumed to be bounded and differentiable for t ≥ 0,
T = the length of the planning period,
Î = the inventory goal level,

I0 = the initial inventory level,


P̂ = the production goal level,
h = the inventory holding cost coefficient; h > 0,
c = the production cost coefficient; c ≥ 0,
ρ = the constant nonnegative discount rate; ρ ≥ 0.
The interpretation of the inventory goal level Î is that it is a safety
stock that the company wants to keep on hand. For example, Î could be
2 months of average sales or Î could be 100 units of the finished goods.
Similarly, the production goal level P̂ can be interpreted as the most
efficient level at which it is desired to run the factory.
With this notation, the state equation is given by the stock-flow
differential equation
    İ(t) = P(t) − S(t), I(0) = I0,                     (6.1)
which says that the inventory at time t is increased by the production
rate and decreased by the sales rate. The objective function of the model
is:

    min{J = ∫₀ᵀ e^{−ρt}[(h/2)(I − Î)² + (c/2)(P − P̂)²] dt}.    (6.2)
The interpretation of the objective function is that we want to keep the
inventory as close as possible to its goal level Î, and also to keep the
production rate P as close as possible to its goal level P̂. The quadratic
terms (h/2)(I − Î)² and (c/2)(P − P̂)² impose “penalties” for having
either I or P not being close to its corresponding goal level.
Next we apply the maximum principle to solve the optimal control
problem specified by (6.1) and (6.2). A stochastic extension of this prob-
lem will be carried out in Sect. 12.2.

6.1.2 Solution by the Maximum Principle


We now associate an adjoint function λ with Eq. (6.1) and can write the
current-value Hamiltonian function as
    H = λ(P − S) − (h/2)(I − Î)² − (c/2)(P − P̂)².     (6.3)
In (6.3), we have used the negative of the (undiscounted) integrand in
(6.2), since the minimization of J in (6.2) is equivalent to the maximiza-
tion of −J.
To apply the Pontryagin maximum principle, we differentiate (6.3)
and set the resulting expression equal to 0, which gives

    ∂H/∂P = λ − c(P − P̂) = 0.                         (6.4)
From this we obtain the optimal production rate

P ∗ (t) = P̂ + λ(t)/c. (6.5)

We should mention that in writing (6.5), we are allowing negative
production (or disposal). Of course, the situation of a disposal will not
arise if we assume a sufficiently large P̂ and a sufficiently small I0.

Remark 6.1 If P is constrained to be nonnegative, then the form of
the optimal control will be

P ∗ (t) = max{P̂ + λ(t)/c, 0}. (6.6)

This case will be treated in Sect. 6.1.6.

By substituting (6.5) into (6.1), we obtain

    İ = P̂ + λ/c − S, I(0) = I0.    (6.7)

The equation for the adjoint variable is easily found to be


    λ̇ = ρλ − ∂H/∂I = ρλ + h(I − Î), λ(T) = 0.    (6.8)
We see that (6.7) has the initial boundary specified and (6.8) has the ter-
minal boundary specified, so together these give a two-point boundary
value problem. We will employ a method to solve these two equations
simultaneously, which works only in some special cases including the
present case. The method is the well-known trick used to solve simulta-
neous differential equations by differentiation and substitution until one
of the variables is eliminated. Specifically, we differentiate (6.7) with
respect to t, which creates an equation with λ̇ in it. We then use (6.8)
to eliminate λ̇ and (6.7) to eliminate λ from the resulting equation as
follows:
    Ï = λ̇/c − Ṡ = ρ(λ/c) + (h/c)(I − Î) − Ṡ
      = ρ(İ − P̂ + S) + (h/c)(I − Î) − Ṡ.

We rewrite this as

    Ï − ρİ − α²I = −α²Î − Ṡ − ρ(P̂ − S),    (6.9)



where the constant α is given by

    α = √(h/c).    (6.10)

We can now solve (6.9) by using the standard method described in
Appendix A. The auxiliary equation for (6.9) is

    m² − ρm − α² = 0,

which has the two real roots

    m1 = (ρ − √(ρ² + 4α²))/2,  m2 = (ρ + √(ρ² + 4α²))/2;    (6.11)

note that m1 < 0 and m2 > 0. We can therefore write the general solution
to (6.9) as
    I(t) = a1 e^{m1 t} + a2 e^{m2 t} + Q(t), I(0) = I0,    (6.12)
where Q(t) is a particular integral of (6.9).
We will say that Q(t) is a special particular integral of (6.9) if it has
no additive terms involving e^{m1 t} and e^{m2 t}. From now on we will always
assume that Q(t) is a special particular integral.
Although (6.12) has two arbitrary constants a1 and a2 , it has only
one boundary condition. To get the other boundary condition we dif-
ferentiate (6.12), substitute the result into (6.7), and solve for λ. We
obtain

    λ(t) = c(m1 a1 e^{m1 t} + m2 a2 e^{m2 t} + Q̇ + S − P̂), λ(T) = 0.    (6.13)

Note that we have imposed the boundary condition on λ so that we can


determine the constants a1 and a2 .
For ease of expressing a1 and a2 , let us define two constants

b1 = I0 − Q(0), (6.14)
b2 = P̂ − Q̇(T ) − S(T ). (6.15)

We now impose the boundary conditions in (6.12) and (6.13) and solve
for a1 and a2 as follows:

    a1 = (b2 e^{m1 T} − m2 b1 e^{(m1+m2)T}) / (m1 e^{2m1 T} − m2 e^{(m1+m2)T}),    (6.16)

    a2 = (b1 m1 e^{2m1 T} − b2 e^{m1 T}) / (m1 e^{2m1 T} − m2 e^{(m1+m2)T}).    (6.17)

If we recall that m1 is negative and m2 is positive, then when T is
sufficiently large so that e^{m1 T} and e^{2m1 T} are negligible, we can write

    a1 ≈ b1,    (6.18)

    a2 ≈ (b2/m2) e^{−m2 T}.    (6.19)

Note that for a large T, e^{−m2 T} is close to zero and, therefore, a2 is close
to zero. However, the reason for retaining the exponential term in (6.19)
is that a2 is multiplied by e^{m2 t} in (6.13), which, while small when t is
small, becomes large and important when t is close to T.
With these values of a1 and a2 and with (6.5), (6.12), and (6.13),
we now write the expressions for I ∗ , P ∗ , and λ. We will break each
expression into three parts: the first part labeled Starting Correction
is important only when t is small; the second part labeled Turnpike
Expression is significant for all values of t; and the third part labeled
Ending Correction is important only when t is close to T.

                Starting Correction    Turnpike Expression    Ending Correction

    I∗ = (b1 e^{m1 t}) + (Q) + ((b2/m2) e^{m2(t−T)})    (6.20)

    P∗ = (m1 b1 e^{m1 t}) + (Q̇ + S) + (b2 e^{m2(t−T)})    (6.21)

    λ = c(m1 b1 e^{m1 t}) + c(Q̇ + S − P̂) + c(b2 e^{m2(t−T)})    (6.22)

Note that if b1 = 0, which by (6.14) means I0 = Q(0), then there is no


starting correction. In other words, I0 = Q(0) is a starting inventory
that causes the solution to be on the turnpike initially. In the same way,
if b2 = 0, then the ending correction vanishes in each of these formulas,
and the solution stays on the turnpike until the end.
Expressions (6.20) and (6.21) represent approximate closed-form so-
lutions for the optimal inventory and production functions I ∗ and P ∗
as long as S is such that the special particular integral Q can be found
explicitly. For such examples of S, see Sect. 6.1.4.

6.1.3 The Infinite Horizon Solution


It is important to show that this solution also makes sense when T → ∞.
In this case it is usual to assume that the discount rate ρ > 0 and the
sales rate S does not grow too fast so that the objective function (6.2)

remains finite. One can then show that the limit of the finite horizon
solution as T → ∞ also solves the infinite horizon problem. Note that as
T → ∞, the ending correction terms in (6.20)–(6.22) disappear because
e−m2 T goes to 0. We now have

    λ(t) = c[m1 b1 e^{m1 t} + Q̇ + S − P̂].    (6.23)

Since we would like

    lim_{t→∞} e^{−ρt} λ(t) = 0,    (6.24)

we would require that S + Q̇ grow asymptotically at a rate slower than
the discount rate ρ. One can easily verify that this condition holds for the
demand terms discussed in Sect. 6.1.4 that follows. Moreover, the con-
dition is easy to check for any given specific demand S(t) for which the
particular integral Q(t) is known.
By the sufficiency of the maximum principle conditions (Sect. 2.4), it
can be verified that the limiting solution

    I∗(t) = b1 e^{m1 t} + Q,  P∗(t) = m1 b1 e^{m1 t} + Q̇ + S    (6.25)

is optimal. If I(0) = Q(0), the solution is always on the turnpike. Note
that the triple {Ī, P̄, λ̄} = {Q, Q̇ + S, c(Q̇ + S − P̂)} represents a non-
stationary turnpike. If I(0) ≠ Q(0), then b1 ≠ 0 and the expressions
(6.25) imply that the paths of inventory and production only approach
the turnpike but never attain it.

6.1.4 Special Cases of Time Varying Demands


In this section, we provide some important cases of time varying demands
including seasonal demands. These involve polynomial or sinusoidal de-
mand functions. We then solve some numerical examples of the model
described in Sect. 6.1.1 for ρ = 0 and T < ∞.
For the first example, we assume that S(t) is a polynomial of degree
2p or 2p−1 so that S^{(2p+1)} = 0, where S^{(k)} denotes the kth time derivative
of S with respect to t. In other words,

    S(t) = C0 t^{2p} + C1 t^{2p−1} + · · · + C2p,    (6.26)

where at least one of C0 and C1 is not zero. Then, from Zwillinger
(2003), a particular integral of (6.9) is

    Q(t) = Î + (1/α²) S^{(1)} + (1/α⁴) S^{(3)} + · · · + (1/α^{2p}) S^{(2p−1)}.    (6.27)

In Exercise 6.2 the reader is asked to verify this by direct substitution.


For the second example, we assume that S(t) is a sinusoidal demand
function of form
S(t) = A sin(πBt + C) + D, (6.28)
where A, B, C, and D are constants. In Exercise 6.3 you are asked to
verify that a particular integral of (6.9) for S in (6.28) is
    Q(t) = Î + (πAB/(α² + π²B²)) cos(πBt + C).    (6.29)
It is well known in the theory of differential equations that demands
that are sums of functions of the form (6.26) and/or (6.28) give rise to
solutions that are sums of functions of form (6.27) and/or (6.29).
Example 6.1 Assume P̂ = 30, Î = 15, T = 8, ρ = 0, and h = c = 1
so that α = 1, m1 = −1, and m2 = 1. Assume

    S(t) = t(t − 4)(t − 8) + 30 = t³ − 12t² + 32t + 30.
Solution It is then easy to show from (6.27) that

    Q(t) = 3t² − 24t + 53 and Q̇(t) = 6t − 24.
Also from (6.14), (6.15), and (6.16), we have a1 ≈ b1 = I0 − 53 and
b2 = −24. Then, from (6.20) and (6.21),
    I∗(t) = (I0 − 53)e^{−t} + Q(t) − 24e^{t−8},
    P∗(t) = −(I0 − 53)e^{−t} + Q̇(t) + S(t) − 24e^{t−8}.
In Fig. 6.1 the graphs of sales, production, and inventory are drawn
with I0 = 10 (a small starting inventory), which makes b1 = −43. In
Fig. 6.2 the same graphs are drawn with I0 = 50 (a large starting inven-
tory), which makes b1 = −3. In Fig. 6.3 the same graphs are drawn with
I0 = 30, which makes b1 = −23. Note that initially during the time from
0 to 4, the three cases are quite different, but during the time from 4 to
8, they are nearly identical. The ending inventory ends up being 29 in
all three cases.
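These closed-form expressions are easy to evaluate numerically. The
following Python sketch (our own illustration, using NumPy, and not part
of the original solution) evaluates (6.20) and (6.21) with the large-T
approximations (6.18) and (6.19) for the three starting inventories of
Figs. 6.1-6.3, confirming the common ending inventory of about 29:

```python
import numpy as np

# Example 6.1: rho = 0, h = c = 1, so alpha = 1, m1 = -1, m2 = 1.
P_hat, I_hat, T = 30.0, 15.0, 8.0
m1, m2 = -1.0, 1.0

S = lambda t: t**3 - 12*t**2 + 32*t + 30      # demand rate
Q = lambda t: 3*t**2 - 24*t + 53              # special particular integral
Qdot = lambda t: 6*t - 24

b2 = P_hat - Qdot(T) - S(T)                   # (6.15); equals -24 here
t = np.linspace(0.0, T, 401)
for I0 in (10.0, 50.0, 30.0):                 # Figs. 6.1, 6.2, 6.3
    b1 = I0 - Q(0)                            # (6.14)
    I_star = b1*np.exp(m1*t) + Q(t) + (b2/m2)*np.exp(m2*(t - T))          # (6.20)
    P_star = m1*b1*np.exp(m1*t) + Qdot(t) + S(t) + b2*np.exp(m2*(t - T))  # (6.21)
    print(I0, round(I_star[-1], 2))           # ending inventory is about 29
```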
Example 6.2 Assume that

    S(t) = A + Bt + ∑_{k=1}^{K} Ck sin(πDk t + Ek),    (6.30)
where the constants A, B, Ck , Dk , and Ek are estimated from future de-
mand data by means of one of the standard forecasting techniques such
as those in Brown (1959, 1963).

Figure 6.1: Solution of Example 6.1 with I0 = 10

Figure 6.2: Solution of Example 6.1 with I0 = 50



Figure 6.3: Solution of Example 6.1 with I0 = 30

Solution By using formulas (6.27) and (6.29), we obtain the particular
integral

    Q(t) = Î + (1/α²) B + ∑_{k=1}^{K} (πCk Dk/(α² + (πDk)²)) cos(πDk t + Ek).    (6.31)

6.1.5 Optimality of a Linear Decision Rule


In Sect. 6.1.2, our emphasis was to explore the turnpike nature of
the solution of the inventory model of Sect. 6.1.1. For this purpose,
we made some asymptotic approximations when solving the state and
adjoint differential equations under the assumption that the horizon
is long. Here our focus is to solve the undiscounted version (i.e.,
ρ = 0) of the model exactly to find its optimal feedback solution, and
show that it is a linear decision rule as reported in the classical work
of Holt et al. (1960).

Since the two-point boundary value problem given by (6.7) and (6.8)
is a linear system of differential equations, it is known via its fundamental
solution matrix that λ can be expressed in terms of I in a linear way as
follows:
λ(t) = ψ(t) − s(t)I(t), (6.32)
where ψ(t) and s(t) are continuously differentiable in t. Differentiating
(6.32) with respect to t and substituting for I˙ and λ̇ from (6.7) and (6.8)
with ρ = 0, respectively, we obtain

    I(h − s²/c + ṡ) + (P̂ + ψ/c − S)s − hÎ − ψ̇ = 0.

Since the above relation must hold for any value of the initial inventory
I0 , we must have

    ṡ = s²/c − h and ψ̇ = (P̂ + ψ/c − S)s − hÎ.    (6.33)

Also from λ(T ) = 0 in (6.8) and (6.32), we have 0 = ψ(T ) − s(T )I(T ),
a relation that must hold regardless of the value of I(T ). Thus, we can
conclude that
s(T ) = 0 and ψ(T ) = 0. (6.34)
Clearly, the solution of the differential equation given by (6.33) and
(6.34) will give us the optimal control (6.5) in terms of S(t) and ψ(t). In
particular, the differential equation

    ṡ = s²/c − h, s(T) = 0    (6.35)

is known as the Riccati equation, whose solution is given by

    s(t) = √(hc) tanh(√(h/c)(T − t)).    (6.36)
Using (6.32) and (6.36) in (6.5), the optimal production rate P ∗ (t) is
    P∗(t) = P̂ − √(h/c) tanh(√(h/c)(T − t)) I∗(t) + ψ(t)/c.    (6.37)

This says that the optimal production rate equals the production goal
level P̂ plus two adjustment terms. The first term implies ceteris paribus
that the higher the current inventory level, the lower the production rate
is. Furthermore, this dependence is linear with the linear effect decreas-
ing as t increases, reaching zero at t = T. The second term depends on
all the model parameters including the demand rate from time t to T.
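As an illustration of how the rule can be implemented, the following
Python sketch integrates ψ backward from ψ(T) = 0 via (6.33) with ρ = 0
and then simulates the inventory forward under the feedback rule (6.37);
the demand forecast S(t) and all parameter values here are assumptions
made only for the illustration.

```python
import numpy as np

# A sketch of the linear decision rule (6.37) with rho = 0; the demand
# forecast S(t) and all parameter values are illustrative assumptions.
h, c, P_hat, I_hat, T, I0 = 1.0, 1.0, 30.0, 15.0, 8.0, 10.0
S = lambda t: 30.0 + 5.0*np.sin(np.pi*t/4)

n = 2000
t, dt = np.linspace(0.0, T, n + 1), T/n
s = np.sqrt(h*c)*np.tanh(np.sqrt(h/c)*(T - t))   # Riccati solution (6.36)

psi = np.zeros(n + 1)                            # integrate (6.33) backward
for k in range(n, 0, -1):
    psidot = (P_hat + psi[k]/c - S(t[k]))*s[k] - h*I_hat
    psi[k-1] = psi[k] - dt*psidot

I = np.zeros(n + 1); I[0] = I0                   # forward pass under (6.37)
for k in range(n):
    P = P_hat - (s[k]/c)*I[k] + psi[k]/c         # linear decision rule
    I[k+1] = I[k] + dt*(P - S(t[k]))             # state equation (6.1)
print(round(I[-1], 3))
```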

Because of the linear dependence of the optimal production rate on


the inventory level in (6.37), this rule is known as a linear decision rule as
reported by Holt et al. (1960). More generally, this rule can be extended
to linear quadratic problems as listed in Table 3.3(c). In Appendix D.4,
we derive this rule for the problems given in Table 3.3(c), but with-
out the forcing function d. Furthermore, the rule can be extended to a
class of stochastic linear-quadratic problems that include the stochastic
production planning problem treated in Sect. 12.2.

6.1.6 Analysis with a Nonnegative Production Constraint


Thus far in this chapter, we have ignored the production constraint P ≥ 0
and used (6.5) and (6.37) as the optimal decision rules. Here we will solve
the production-inventory problem subject to P ≥ 0, and use (6.6) as the
optimal production rule. For simplicity of analysis and exposition, we
will assume also that S is a positive constant, T = ∞, and ρ > 0. These
specifications make Ṡ = 0, so that the right-hand side of (6.9) is the
constant −α²Î − ρ(P̂ − S), with a1 = b1 in (6.16) and a2 = 0 in (6.17).
In view of its constant right-hand side, we can use Row (3) of
Table A.2 to obtain its particular integral as

    Q = (ρ/α²)(P̂ − S) + Î,    (6.38)

which is a constant and thus Q̇ = 0. From (6.14) and (6.15), we now
have

    b1 = I0 − Q = I0 − Î − (ρ/α²)(P̂ − S) and b2 = P̂ − S.

The turnpike is defined by the triple {Ī, P̄, λ̄} = {(ρ/α²)(P̂ − S) +
Î, S, c(S − P̂)} formed from the turnpike expressions in (6.20), (6.21),
and (6.22), respectively. Note that we could have obtained the turnpike
levels directly by applying the conditions (3.108), which in this case are

    dĪ/dt = 0, dλ̄/dt = 0, and P̄ = P̂ + λ̄/c = S.    (6.39)

If I0 = Q, then the optimal solution stays on the turnpike. If I0 ≠ Q,
we must obtain the transient solution. It should be clear that the control
in (6.25) may become negative, especially when the initial inventory is
high. Let us complete the solution of the problem by considering three
cases: I0 ≤ Q, Q < I0 ≤ Q − S/m1 , and I0 > Q − S/m1 .

If I0 ≤ Q, then the control in (6.25) with b1 = I0 − Q is clearly
positive. Thus, the optimal production rate is given by

    P∗(t) = m1 b1 e^{m1 t} + S = m1(I0 − Q)e^{m1 t} + S ≥ 0.    (6.40)

Moreover, from the state in (6.25), we can obtain the corresponding I ∗ (t)
as
    I∗(t) = (I0 − Q)e^{m1 t} + Q.    (6.41)
It is easy to see that I ∗ (t) increases monotonically to Q as t → ∞, as
shown in Fig. 6.4.
If Q < I0 ≤ Q − S/m1 , we can easily see from (6.40) that P ∗ (0) ≥ 0.
Furthermore, Ṗ ∗ (t) ≥ 0, and therefore the optimal production rate is
once again given by (6.40). We also have I ∗ (t) as in (6.41) and conclude
that I ∗ (t) → Q monotonically as t → ∞, as shown in Fig. 6.4.
Finally, if I0 > Q − S/m1 , (6.40) would have a negative value for the
initial production which is infeasible. By (6.6), P ∗ (0) = 0. We can now
depict this situation in Fig. 6.4. The time t̂ shown in the figure is the
time at which P ∗ (t̂) = P̂ + λ(t̂)/c = 0. We already know from (6.40) that
in the case when I0 = Q − S/m1 , P ∗ (0) = 0. This suggests that

S
I ∗ (t̂) = Q − . (6.42)
m1

For t ≤ t̂, we have P ∗ (t) = 0 so that I˙∗ = −S, which gives

I ∗ (t) = I0 − St, t ≤ t̂. (6.43)

As for the adjoint equation (6.8), we now need the boundary condition
at t̂. For this, we can use (6.4) to obtain λ(t̂) = −cP̂. Thus, the adjoint
equation in the interval [0, t̂ ] is

    λ̇ = ρλ + h(I − Î), λ(t̂) = −cP̂.    (6.44)

We can substitute I0 − St for I in Eq. (6.44) and solve for λ. Note that
we can easily obtain t̂ as

    I0 − St̂ = Q − S/m1 ⇒ t̂ = (I0 − Q)/S + 1/m1.    (6.45)

We can now specify the complete solution in the case when I0 >
Q − S/m1 . With t̂ specified in (6.45), the solution is as follows.

For 0 ≤ t ≤ t̂ : P∗(t) = 0, I∗(t) = I0 − St, and λ(t) is the solution of

    λ̇ = ρλ + h(I0 − St − Î), λ(t̂) = −cP̂.

For t > t̂ : we replace I0 by Q − S/m1 and t by t − t̂ on the right-hand
side of (6.40) to obtain P∗(t) = S − S e^{m1(t−t̂)}. The same replacements in
(6.41) give us the corresponding I∗(t) = Q − (S/m1)e^{m1(t−t̂)}. Finally, λ(t)
can be obtained by solving

    λ̇ = ρλ + h(Q − (S/m1)e^{m1(t−t̂)} − Î), λ(t̂) = −cP̂.
We have thus solved the problem in every case of the initial condition
I0. These solutions are sketched in Fig. 6.4 for Î = 8, P̂ = 5, S = 6,
h = 1, c = 4, and ρ = 0.1, for three different values of I0, namely, 25, 15,
and 1. In Exercise 6.7, you are asked to solve the problem for these values
and obtain Fig. 6.4.

Figure 6.4: Optimal production rate and inventory level with different
initial inventories
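The following Python sketch (one way to begin Exercise 6.7; the plotting
itself is omitted) evaluates the three-case solution just derived for the
parameter values of Fig. 6.4:

```python
import numpy as np

# A sketch for Exercise 6.7: evaluate the three-case solution of Sect. 6.1.6
# for the parameter values of Fig. 6.4 (plotting omitted).
I_hat, P_hat, S, h, c, rho = 8.0, 5.0, 6.0, 1.0, 4.0, 0.1
alpha = np.sqrt(h/c)
m1 = (rho - np.sqrt(rho**2 + 4*alpha**2))/2    # negative root in (6.11)
Q = I_hat + (rho/alpha**2)*(P_hat - S)         # turnpike level (6.38); = 7.6
threshold = Q - S/m1                           # above this, start with P* = 0

def solution(I0, t):
    if I0 <= threshold:                        # cases 1 and 2: (6.40)-(6.41)
        P = m1*(I0 - Q)*np.exp(m1*t) + S
        I = (I0 - Q)*np.exp(m1*t) + Q
    else:                                      # case 3: no production until t_hat
        t_hat = (I0 - Q)/S + 1/m1              # (6.45)
        early = t <= t_hat
        P = np.where(early, 0.0, S - S*np.exp(m1*(t - t_hat)))
        I = np.where(early, I0 - S*t, Q - (S/m1)*np.exp(m1*(t - t_hat)))
    return P, I

t = np.linspace(0.0, 10.0, 501)
for I0 in (25.0, 15.0, 1.0):
    P, I = solution(I0, t)
    print(I0, round(P[-1], 2), round(I[-1], 2))   # P -> S = 6 and I -> Q = 7.6
```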

6.2 The Wheat Trading Model


Consider a firm that buys and sells wheat. The firm’s only assets are
cash and wheat, and the price of wheat over time is known with certainty.
The objective of this firm is to buy and sell wheat in order to maximize

the total value of its assets at the horizon time T. The problem here is
similar to the simple cash balance model of Sect. 5.1 except that there
are nonlinear holding costs associated with storing wheat. An extension
of this model to one having two control variables appears in Ijiri and
Thompson (1972).

6.2.1 The Model


We introduce the following notation:

T = the horizon time,


x(t) = the cash balance in dollars at time t,
y(t) = the wheat balance in bushels at time t,
v(t) = the rate of purchase of wheat in bushels per unit time;
a negative purchase means a sale,
p(t) = the price of wheat in dollars per bushel at time t,
r = the constant positive interest rate earned on the cash
balance,
h(y) = the cost of holding y bushels per unit time.

In this section we permit x and y to go negative, meaning that bor-


rowing money and short-selling wheat are both allowed. In the next
section we disallow the short-selling of wheat.
The state equations are:

ẋ = rx − h(y) − pv, x(0) = x0 , (6.46)


ẏ = v, y(0) = y0 , (6.47)

and the control constraints are

    −V2 ≤ v(t) ≤ V1,    (6.48)

where V1 and V2 are nonnegative constants. The objective function is:

max{J = x(T ) + p(T )y(T )} (6.49)

subject to (6.46)–(6.48). Note that the problem is in the linear Mayer


form.

6.2.2 Solution by the Maximum Principle


Introduce the adjoint variables λ1 and λ2 and define the Hamiltonian
function
H = λ1 [rx − h(y) − pv] + λ2 v. (6.50)
The adjoint equations are:

    λ̇1 = −λ1 r, λ1(T) = 1,    (6.51)

    λ̇2 = h′(y)λ1, λ2(T) = p(T).    (6.52)

It is easy to solve (6.51) as

    λ1(t) = e^{r(T−t)}    (6.53)

and (6.52) as

    λ2(t) = p(T) − ∫ₜᵀ h′(y(τ)) e^{r(T−τ)} dτ.    (6.54)

The interpretation of λ1 (t) is that it is the future value (at time T )


of one dollar held as cash from t to T. The interpretation of λ2 (t) is the
price at time T of a bushel of wheat less the total future value (at time
T ) of the stream of storage costs incurred to store that bushel of wheat
from t to T.
From (6.50) the optimal control is

v ∗ (t) = bang[−V2 , V1 ; λ2 (t) − λ1 (t)p(t)]. (6.55)

In Exercise 6.8 you are asked to provide the interpretation of this optimal
policy.
Equations (6.46), (6.47), (6.54), and (6.55) determine the two-point
boundary value problem which usually requires a numerical solution pro-
cedure. In the next section we assume a special form for the storage
function h(y) to be able to obtain a closed-form solution.
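The sketch below illustrates one such numerical procedure, a damped
forward-backward sweep, in Python; the smooth holding cost h(y) = y²/2
(so that h′(y) = y) and the rising price path are assumptions made here
purely for illustration, and convergence of such sweeps is not guaranteed
in general.

```python
import numpy as np

# A sketch of a damped forward-backward sweep for the TPBVP given by
# (6.46), (6.47), (6.54), and (6.55). The holding cost h(y) = y**2/2 and
# the price path are illustrative assumptions, not from the text.
T, r, x0, y0, V1, V2 = 6.0, 0.05, 10.0, 0.0, 1.0, 1.0
p = lambda t: 3.0 + t/6.0                      # assumed price path
n = 600
t, dt = np.linspace(0.0, T, n + 1), T/n
lam1 = np.exp(r*(T - t))                       # (6.53)

v = np.zeros(n + 1)                            # initial control guess
for _ in range(200):
    y = np.zeros(n + 1); y[0] = y0             # forward pass for the state
    for k in range(n):
        y[k+1] = y[k] + dt*v[k]
    lam2 = np.full(n + 1, p(T))                # backward pass for (6.54)
    for k in range(n, 0, -1):
        lam2[k-1] = lam2[k] - dt*y[k]*np.exp(r*(T - t[k]))   # h'(y) = y
    v_new = np.where(lam2 - lam1*p(t) > 0, V1, -V2)   # bang-bang rule (6.55)
    v = 0.5*v + 0.5*v_new                      # damping to aid convergence
print(np.round(v[::100], 2))                   # sampled control path
```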

6.2.3 Solution of a Special Case


For this special case we assume h(y) = (1/2)|y|, r = 0, x(0) = 10, y(0) = 0,
V1 = V2 = 1, T = 6, and

    p(t) = 3  for 0 ≤ t ≤ 3,
         = 4  for 3 < t ≤ 6.    (6.56)

We will apply the maximum principle (2.31) developed in Chap. 2


to this problem even though h(y) is not differentiable at y = 0. The
answer can be obtained rigorously by using the maximum principle for
models involving nondifferentiable functions discussed, e.g., in Clarke
(1989, Chapter 4) and Feichtinger and Hartl (1985b).
For this case with r = 0, we have λ1 (t) = 1 for all t from (6.53) so
that the TPBVP is
    ẋ = −(1/2)|y| − pv, x(0) = 10,    (6.57)

    ẏ = v, y(0) = 0,    (6.58)

    λ̇2(t) = (1/2) sgn(y), λ2(6) = 4.    (6.59)
For this simple problem it is easy to guess a solution. From the fact that
λ1 = 1, the optimal policy (6.55) reduces to

v ∗ (t) = bang[−1, 1; λ2 (t) − p(t)]. (6.60)

Figure 6.5: The price trajectory (6.56)

The graph of the price function is shown in Fig. 6.5. Since p(t) is
increasing, short-selling is never optimal. Since the storage cost is 1/2
per unit per unit time and the wheat price jumps by 1 unit at t = 3, it
never pays to store wheat for more than 2 time units. Because y(0) = 0,
we have v ∗ (t) = 0 for 0 ≤ t ≤ 1. This obviously must be a singular

control. Suppose we start buying wheat at t∗ > 1. From (6.60) the rate
of buying is 1; clearly buying will continue at this rate until t = 3, and
not longer. In order to not lose money on the storage of wheat, it must be
sold within 2 time units of its purchase. Clearly we should start selling
at t = 3+ at the maximum rate of 1, and continue until a last sale time
t∗∗ . In order to sell exactly all of the wheat purchased, we must have

3 − t∗ = t∗∗ − 3. (6.61)

Thus, v ∗ (t) = 0 in the interval [t∗∗ , 6], which is also a singular control.
With this policy, y(t) > 0 for all t ∈ (t∗ , t∗∗ ). From (6.59), λ̇2 = 1/2 in
the interval (t∗ , t∗∗ ). In order to have a singular control in the interval
[t∗∗ , 6], we must have λ2 (t) = 4 in that interval. Also, in order to have a
singular control in [0, t∗ ], we must have λ2 (t) = 3 in that interval. Thus,
λ2 (t∗∗ ) − λ2 (t∗ ) = 1, which with λ̇2 = 1/2 allows us to conclude that

t∗∗ − t∗ = 2, (6.62)

and therefore t∗ = 2 and t∗∗ = 4. Thus from (6.59) and (6.60),






    λ2(t) = 3         for 0 ≤ t ≤ 2,
          = 2 + t/2   for 2 ≤ t ≤ 4,    (6.63)
          = 4         for 4 ≤ t ≤ 6.

We can now sketch graphs for λ2 (t), v ∗ (t), and y ∗ (t) as shown in
Fig. 6.6. In Exercise 6.13 you are asked to show that these trajectories are
optimal by verifying that the maximum principle necessary conditions
hold and that they are also sufficient.
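A quick numerical check of this solution is straightforward; the Python
sketch below simulates (6.57) and (6.58) under the policy just derived
and evaluates the objective J = x(6) + p(6)y(6). The Euler step size is
an arbitrary choice.

```python
import numpy as np

# Simulate the Sect. 6.2.3 policy: do nothing on [0, 2), buy at rate 1
# on [2, 3), sell at rate 1 on [3, 4), do nothing on [4, 6].
n = 6000
t, dt = np.linspace(0.0, 6.0, n + 1), 6.0/n
p = np.where(t <= 3, 3.0, 4.0)                            # price (6.56)
v = np.where((t >= 2) & (t < 3), 1.0,
             np.where((t >= 3) & (t < 4), -1.0, 0.0))     # derived policy

x, y = 10.0, 0.0
for k in range(n):                       # Euler steps of (6.57)-(6.58)
    x += dt*(-0.5*abs(y) - p[k]*v[k])
    y += dt*v[k]
print(round(x + p[-1]*y, 3))             # J = x(6) + p(6)y(6), about 10.5
```

The result is about 10.5: the unit of wheat bought at price 3 and sold
at price 4 earns 1, less 0.5 of storage cost over the two time units it
is held.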

6.2.4 The Wheat Trading Model with No Short-Selling


We next consider the wheat trading problem in the case when short-
selling is not permitted, i.e., we impose the state constraint y ≥ 0. More-
over, for simplicity in exposition we consider the following special case of
Norström (1978). Specifically, we assume h(y) = y/2, r = 0, x(0) = 10,
y(0) = 1, V1 = V2 = 1, T = 3, and


    p(t) = −2t + 7  for 0 ≤ t < 2,
         = t + 1    for 2 ≤ t ≤ 3.    (6.64)

The statement of the problem is:

    max {J = x(3) + p(3)y(3) = x(3) + 4y(3)}

    subject to

    ẋ = −(1/2)y − pv, x(0) = 10,    (6.65)

    ẏ = v, y(0) = 1,

    v + 1 ≥ 0, 1 − v ≥ 0, y ≥ 0.

Figure 6.6: Adjoint variable, optimal policy and inventory in the wheat
trading model

To solve this problem, we use the Lagrangian form of the indirect


maximum principle given in (4.29). The Hamiltonian is

H = λ1 (−y/2 − pv) + λ2 v. (6.66)

The optimal control is

v ∗ (t) = bang[−1, 1; λ2 (t) − λ1 (t)p(t)] when y > 0. (6.67)

Whenever y = 0 we must impose ẏ = v ≥ 0 in order to ensure that no
short-selling occurs. Therefore,

v ∗ (t) = bang[0, 1; λ2 (t) − λ1 (t)p(t)] when y = 0. (6.68)

Next we form the Lagrangian

L = H + μ1 (v + 1) + μ2 (1 − v) + ηv, (6.69)

where μ1 , μ2 , and η satisfy the complementary slackness conditions:

μ1 ≥ 0, μ1 (v + 1) = 0, (6.70)
μ2 ≥ 0, μ2 (1 − v) = 0, (6.71)
η ≥ 0, ηy = 0. (6.72)

Furthermore, the optimal trajectory must satisfy

    ∂L/∂v = λ2 − pλ1 + μ1 − μ2 + η = 0.    (6.73)
With r = 0, we get λ1 = 1 as before, and

    λ̇2 = −∂L/∂y = 1/2, λ2(3⁻) = 4 + γ,    (6.74)
with
γ ≥ 0, γy(3) = 0. (6.75)
Let us first try γ = 0. Then λ2(3⁻) = 4, and if we let t̂ denote the time
of the last jump before the terminal time, then there is no jump in the
interval (t̂, 3). Then, from (6.74) we have

λ2 (t) = t/2 + 5/2 for t̂ ≤ t < 3, (6.76)

and the optimal control from (6.67) or (6.68) is v ∗ = 1, i.e., buy wheat
at the maximum rate of 1, so long as λ2 (t) > p(t). Also, this will give

y(3) > 0, so that (6.75) holds. Let us next find the time t̂ of the last
jump before the terminal time. Clearly, this value will not be larger than
the time at which λ2 (t) = p(t). Thus,

t̂ ≤ {t|t/2 + 5/2 = −2t + 7} = 1.8. (6.77)

Since p(t) is decreasing at the start of the problem, it appears that


selling at the maximum rate of 1, i.e., v ∗ = −1, should be optimal at
the start. Since the beginning inventory is y(0) = 1, selling at the rate
of 1 can continue only until t = 1, at which time the inventory y(1)
becomes 0. Suppose that we do nothing, i.e., v ∗ (t) = 0 in the interval
(1, 1.8]. Then, t = 1 is an entry time (see Sect. 4.2) and t = 1.8 is not
an entry time, and t̂ = 1. Hence, according to the maximum principle
(4.29), λ2 (t) is continuous at t = 1.8, and therefore λ2 (t) is given by
(6.76) in the interval [1, 3), i.e.,

λ2 (t) = t/2 + 5/2 for 1 ≤ t < 3. (6.78)

Using (6.73) with λ1 = 1 in the interval (1, 1.8] and v ∗ = 0 so that


μ1 = μ2 = 0, we have

λ2 − p + μ1 − μ2 + η = λ2 − p + η = 0,

and consequently

η(t) = p(t) − λ2 (t) = −5t/2 + 9/2, t ∈ (1, 1.8]. (6.79)

Since hₜ = 0, the jump condition in (4.29) for the Hamiltonian at
τ = 1 reduces to

H[x∗ (1), u∗ (1− ), λ(1− ), 1] = H[x∗ (1), u∗ (1+ ), λ(1+ ), 1].

From the definition of the Hamiltonian H in (6.66), we can rewrite the


condition as

λ1 (1− )[−y(1)/2 − p(1− )v ∗ (1− )] + λ2 (1− )v ∗ (1− ) =

λ1 (1+ )[−y(1)/2 − p(1+ )v ∗ (1+ )] + λ2 (1+ )v ∗ (1+ ).

Since λ1 (t) = 1 for all t, the above condition reduces to

−p(1− )v ∗ (1− ) + λ2 (1− )v ∗ (1− ) = −p(1+ )v ∗ (1+ ) + λ2 (1+ )v ∗ (1+ ).



Substituting the values of p(1− ) = p(1+ ) = 5 from (6.64), λ2 (1+ ) = 3


from (6.78), and v ∗ (1+ ) = 0 and v ∗ (1− ) = −1 from the above discussion,
we obtain

− 5(−1) + λ2 (1− )(−1) = −5(0) + 3(0) = 0 ⇒ λ2 (1− ) = 5. (6.80)

We can now use the jump condition in (4.29) on the adjoint variables
to obtain

λ2 (1− ) = λ2 (1+ ) + ζ(1) ⇒ ζ(1) = λ2 (1− ) − λ2 (1+ ) = 5 − 3 = 2 ≥ 0.

It is important to note that in the interval [1, 1.8], the optimal control
condition (6.68) holds, justifying our supposition that v ∗ = 0 in this
interval. Furthermore, using (6.80) and (6.74),

λ2 (t) = t/2 + 9/2 for t ∈ [0, 1), (6.81)

and the optimal control condition (6.67) holds, justifying our supposition
that v ∗ = −1 in this interval. Also, we can conclude that our guess γ = 0

Sell Buy

Figure 6.7: Adjoint trajectory and optimal policy for the wheat trading
model

is correct. The graphs of λ2(t), p(t), and v∗(t) are displayed in Fig. 6.7.
To complete the solution of the problem, you are asked in Exercise 6.15
to determine the values of μ1, μ2, and η in these various intervals.

6.3 Decision Horizons and Forecast Horizons


In some dynamic problems it is possible to show that the optimal deci-
sions during an initial positive time interval are either partially or wholly
independent of the data from some future time onwards. In such cases,
a forecast of the future data needs to be made only as far as that time
to make optimal decisions in the initial time interval. The initial time
interval is called the decision horizon and the time up to which data is
required to make the optimal decisions during the decision horizon is
called the forecast horizon; see Bes and Sethi (1988), Bensoussan et al.
(1983), and Haurie and Sethi (1984) for details on these concepts. When-
ever they exist, these horizons naturally decompose the problem into a
series of smaller problems.
If the optimal decisions during the decision horizon are completely
independent of the data beyond the forecast horizon, then the latter
is called a strong forecast horizon. If, on the other hand, some mild
restrictions on the data after the forecast horizon are required in order
to keep the optimal decisions during the decision horizon unaffected,
then it is called a weak forecast horizon.
In this section we demonstrate these concepts in the context of the
wheat trading model of the previous section. In Sect. 6.3.1 we obtain a
decision horizon for the model of Sect. 6.2.4 which is also a weak forecast
horizon. In Sect. 6.3.2 we modify the wheat trading model by adding
a warehousing constraint. For the new problem we obtain a decision
horizon and a strong forecast horizon. See also Sethi and Thompson
(1982), Rempala (1986) and Hartl (1986a, 1988a) for further research in
the context of the wheat trading model.
In what follows we obtain these horizons and verify them for some
examples with different forecast data. For more details and proofs in
other situations including more general ones, see Modigliani and Hohn
(1955), Lieber (1973), Pekelman (1974, 1975, 1979), Kleindorfer and
Lieber (1979), Vanthienen (1975), Morton (1978), Lundin and Morton
(1975), Rempala and Sethi (1988, 1992), Hartl (1989a), and Sethi (1990).

6.3.1 Horizons for the Wheat Trading Model with No Short-Selling
For the model of Sect. 6.2.4, we will demonstrate that t = 1 is a decision
horizon as well as a weak forecast horizon. In Fig. 6.8 we have redrawn
Fig. 6.7 with a new price trajectory in the time interval [1, 3]. Also in the
figure, we have extended the initial λ2 trajectory and labeled it the price
shield. Its significance is that, as long as the new price trajectory in the
interval [1, 3] stays below the price shield, the optimal solution in the
interval [0, 1], which is the decision horizon, remains unchanged. That
is, it is optimal to sell throughout the interval. The restriction that p(t)
must stay below the price shield in [1, 3] is the reason that t = 1 is a
weak forecast horizon. The optimality of the control shown in Fig. 6.8
can be concluded by obtaining the adjoint trajectory in the interval [1, 3]
as a straight line with slope 1/2 and the terminal value λ2 (3− ) = p(3).
This way of drawing the adjoint trajectory is correct as long as the
corresponding policy does not violate the inventory constraint y(t) ≥ 0
in the interval [1, 3]. For example, this will be the case if the buy interval
in Fig. 6.8 is shorter than the sell interval at the end. On the other hand,
if the inventory constraint is violated, then the λ2 (t) trajectory may
jump in the interval [1, 3), and it will be more complicated to obtain it.
Nevertheless, the decision horizon and weak forecast horizon still occur
at t = 1. Moreover, if we let T > 1 be any finite horizon and assume that
p(t) in the interval [1, T ] is always below the price shield line of Fig. 6.8
extended to T, then the policy of selling at the maximum rate in the
interval [0, 1] remains optimal.

6.3.2 Horizons for the Wheat Trading Model with No Short-Selling and a Warehousing Constraint
In order to give an example in which a strong forecast horizon occurs, we
modify the example of Sect. 6.2.4 by adding the warehousing constraint
y ≤ 1 or
1 − y ≥ 0, (6.82)

changing the terminal time to T = 4, and defining the price trajectory
to be

    p(t) = −2t + 7  for t ∈ [0, 2),
         = t + 1    for t ∈ [2, 4].    (6.83)

Figure 6.8: Decision horizon and optimal policy for the wheat trading
model

The Hamiltonian of the new problem is unchanged and is given in
(6.66). Furthermore, λ1 = 1. The optimal control is defined in three
parts as:

    v∗(t) = bang[−1, 1; λ2(t) − p(t)] when 0 < y < 1,    (6.84)

    v∗(t) = bang[0, 1; λ2(t) − p(t)] when y = 0,    (6.85)

    v∗(t) = bang[−1, 0; λ2(t) − p(t)] when y = 1.    (6.86)
Defining a Lagrange multiplier η¹ for the derivative of (6.82), i.e., for
−ẏ = −v ≥ 0, we form the Lagrangian

    L = H + μ1(v + 1) + μ2(1 − v) + ηv + η¹(−v),    (6.87)

where μ1, μ2, and η satisfy (6.70)–(6.72) and η¹ satisfies

    η¹ ≥ 0, η¹(1 − y) = 0, η̇¹ ≤ 0.    (6.88)
Furthermore, the optimal trajectory must satisfy
    ∂L/∂v = λ2 − p + μ1 − μ2 + η − η¹ = 0.    (6.89)

As before, λ1 = 1 and λ2 satisfies

    λ̇2 = 1/2, λ2(4⁻) = p(4) + γ¹ − γ² = 5 + γ¹ − γ²,    (6.90)

where

    γ¹ ≥ 0, γ¹y(4) = 0, γ² ≥ 0, γ²(1 − y(4)) = 0.    (6.91)
Let us first try γ¹ = γ² = 0. Let t̂ be the time of the last jump of the
adjoint function λ2(t) before the terminal time T = 4. Then,

    λ2(t) = t/2 + 3 for t̂ ≤ t < 4.    (6.92)

The graph of (6.92) intersects the price trajectory at t = 8/5 as shown


in Fig. 6.9. It also stays above the price trajectory in the interval [8/5, 4]
so that, if there were no warehousing constraint (6.82), the optimal de-
cision in this interval would be to buy at the maximum rate. However,
with the constraint (6.82), this is not possible. Thus t̂ > 8/5, since λ2
will have a jump in the interval [8/5, 4].

Figure 6.9: Optimal policy and horizons for the wheat trading model
with no short-selling and a warehouse constraint

To find the actual value of t̂ we must insert a line of slope 1/2 above
the minimum price at t = 2 in such a way that its two intersection points
with the price trajectory are exactly one time unit (the time required to
fill up the warehouse) apart. Thus using (6.83), t̂ must satisfy

−2(t̂ − 1) + 7 + (1/2)(1) = t̂ + 1,

which yields t̂ = 17/6.


The rest of the analysis for determining λ2 including the jump con-
ditions is similar to that given in Sect. 6.2.4. Thus,




    λ2(t) = t/2 + 9/2    for t ∈ [0, 1),
          = t/2 + 29/12  for t ∈ [1, 17/6),    (6.93)
          = t/2 + 3      for t ∈ [17/6, 4].

This makes γ¹ = γ² = 0 the correct guess.


Given (6.93), the optimal policy is given by (6.84)–(6.86) and is
shown in Fig. 6.9. To complete the maximum principle we must derive
expressions for the Lagrange multipliers in the four intervals shown in
Fig. 6.9.

Interval [0, 1): μ2 = η = η¹ = 0, μ1 = p − λ2 > 0; v∗ = −1, 0 < y∗ < 1.

Interval [1, 11/6): μ1 = μ2 = η¹ = 0, η = p − λ2 > 0, η̇ ≤ 0; v∗ = 0, y∗ = 0.

Interval [11/6, 17/6): μ1 = η = η¹ = 0, μ2 = λ2 − p > 0; v∗ = 1, 0 < y∗ < 1.

Interval [17/6, 4]: μ1 = μ2 = η = 0, η¹ = λ2 − p > 0, η̇¹ ≤ 0, γ¹ = γ² = 0;
v∗ = 0, y∗ = 1.

In Exercise 6.18 you are asked to solve another variant of this problem.
For the example in Fig. 6.9 we have labeled t = 1 as a decision horizon
and t̂ = 17/6 as a strong forecast horizon. By this we mean that the

optimal decision in [0, 1] continues to be to sell at the maximum rate


regardless of the price trajectory p(t) for t > 17/6. Because t̂ = 17/6 is
a strong forecast horizon, we can terminate the price shield at that time
as shown in the figure.
In order to illustrate the statements in the previous paragraph, we
consider two examples of price changes after t̂ = 17/6.

Example 6.3 Assume the price trajectory to be






    p(t) = −2t + 7        for t ∈ [0, 2),
         = t + 1          for t ∈ [2, 17/6),
         = 25t/7 − 44/7   for t ∈ [17/6, 4],

which is sketched in Fig. 6.10. Note that the price trajectory up to time
17/6 is the same as before, and the price after time 17/6 goes above the
extension of the price shield in Fig. 6.9.

Figure 6.10: Optimal policy and horizons for Example 6.3



Solution The new λ2 trajectory is shown in Fig. 6.10, which is the same
as before for t < 17/6, and after that it is λ2 (t) = t/2+6 for t ∈ [17/6, 4].
The optimal policy is as shown in Fig. 6.10, and as previously asserted,
the optimal policy in [0,1) remains unchanged. In Exercise 6.17 you are
asked to verify the maximum principle for the solution of Fig. 6.10.

Example 6.4 Assume the price trajectory to be






    p(t) = −2t + 7        for t ∈ [0, 2),
         = t + 1          for t ∈ [2, 17/6),
         = −t/2 + 21/4    for t ∈ [17/6, 4],

which is sketched in Fig. 6.11.

Figure 6.11: Optimal policy and horizons for Example 6.4

Solution Again the price trajectory is the same up to time 17/6, but
the price after time 17/6 is declining. This changes the optimal policy

in the time interval [1, 17/6), but the optimal policy will still be to sell
in [0, 1).
As in the beginning of the section, we solve (6.90) to obtain λ2 (t) =
t/2+5/4 for t̂1 ≤ t ≤ 4, where t̂1 ≥ 1 is the time of the last jump which is
to be determined. It is intuitively clear that some profit can be made by
buying and selling to take advantage of the price rise between t = 2 and
t = 17/6. For this, the λ2 (t) trajectory must cross the price trajectory
between times 2 and 17/6 as shown in Fig. 6.11, and the inventory y
must go to 0 between times 17/6 and 4 so that λ2 can jump downward
to satisfy the ending condition λ2 (4− ) = p(4) = 13/4. Since we must
buy and sell equal amounts, the point of intersection of the λ2 trajectory
with the rising price segment, i.e., t̂1 − α, must be exactly in the middle
of the two other intersection points, t̂1 and t̂1 − 2α, of λ2 with the two
declining price trajectories. Thus, t̂1 and α must satisfy:
−2(t̂1 − 2α) + 7 + α/2 = (t̂1 − α) + 1,
(t̂1 − α) + 1 + α/2 = −t̂1 /2 + 21/4.
These can be solved to yield t̂1 = 163/54 and α = 5/9. The times
t̂1 , t̂1 − α, and t̂1 − 2α are shown in Fig. 6.11. The λ2 trajectory is given
by

    λ2(t) = t/2 + 9/2      for t ∈ [0, 1),
          = t/2 + 241/108  for t ∈ [1, 163/54),
          = t/2 + 5/4      for t ∈ [163/54, 4].
Evaluation of the Lagrange multipliers and verification of the maximum
principle is similar to that for the case in Fig. 6.9.
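The two intersection conditions for t̂1 and α can also be solved
symbolically; here is a minimal sketch using SymPy (our choice of tool,
not the text's) that confirms t̂1 = 163/54 and α = 5/9.

```python
import sympy as sp

t1, a = sp.symbols('t1 alpha', positive=True)
eq1 = sp.Eq(-2*(t1 - 2*a) + 7 + a/2, (t1 - a) + 1)           # buy-side intersection
eq2 = sp.Eq((t1 - a) + 1 + a/2, -t1/2 + sp.Rational(21, 4))  # sell-side intersection
print(sp.solve([eq1, eq2], [t1, a]))    # {t1: 163/54, alpha: 5/9}
```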
In Sect. 6.3 we have given several examples of decision horizons and
weak and strong forecast horizons. In Sect. 6.3.1 we found a decision
horizon which was also a weak forecast horizon, and it occurred exactly
when y(t) = 0. We also introduced the idea of a price shield in that
section. In Sect. 6.3.2 we imposed a warehousing constraint and obtained
the same decision horizon and a strong forecast horizon, which occurred
when y(t) = 1.
Note that if we had solved the problem with T = 1, then y ∗ (1) = 0;
and if we had solved the problem with T = 17/6, then y ∗ (1) = 0 and
y ∗ (17/6) = 1. The latter problem has the smallest T such that both
y ∗ = 0 and y ∗ = 1 occur for t > 0, given the price trajectory. This is
one of the ways that time t = 17/6 can be found to be a forecast horizon

along with the decision horizon at time t = 1. There are other ways to
find strong forecast horizons. For a survey of the literature, see Chand
et al. (2002).

Exercises for Chapter 6

E 6.1 Verify the expressions for a1 and a2 given in (6.16) and (6.17).

E 6.2 Verify (6.27). Note that ρ = 0 is assumed in Sect. 6.1.4.

E 6.3 Verify (6.29). Again assume ρ = 0.

E 6.4 Given the demand function

S = t(t − 4)(t − 8)(t − 12)(t − 16) + 30,

ρ = 0, Î = 15, T = 16, and α = 1, obtain Q(t) from (6.27).

E 6.5 Complete the solution of Example 6.2 in Sect. 6.1.4.

E 6.6 For the model of Sect. 6.1.6, derive the turnpike triple by using
the conditions in (6.39).

E 6.7 Solve the production-inventory model of Sect. 6.1.6 for the pa-
rameter values listed on Fig. 6.4, and draw the figure using MATLAB or
another suitable software.

E 6.8 Give an intuitive interpretation of (6.55).

E 6.9 Assume that there is a transaction cost cv² when v units of wheat
are bought or sold in the model of Sect. 6.2.1. Derive the form of the
optimal policy.

E 6.10 In Exercise 6.9, assume T = 10, x(0) = 10, y(0) = 0, c = 1/18,
h(y) = (1/2)y², V1 = V2 = ∞, r = 0, and p(t) = 10 + t. Solve the
resulting TPBVP to obtain the optimal control in closed form.

E 6.11 Set up the two-point boundary value problem for Exercise 6.9
with c = 0.05, h(y) = (1/2)y², and the remaining values of parameters
as in the model of Sect. 6.2.3.

E 6.12 Use Excel, as illustrated in Sect. 2.5, to solve the TPBVP of


Exercise 6.11.

E 6.13 Show that the solution obtained for the problem in Sect. 6.2.3
satisfies the necessary conditions of the maximum principle. Conclude
the optimality of the solution by showing that the maximum principle
conditions are also sufficient.

E 6.14 Re-solve the problem of Sect. 6.2.3 with V1 = 2 and V2 = 1.

E 6.15 Compute the optimal trajectories for μ1 , μ2 , and η for the model
in Sect. 6.2.4.

E 6.16 Solve the model in Sect. 6.2.4 with each of the following condi-
tions:
(a) y(0) = 2.

(b) T = 10 and p(t) = 2t − 2 for 3 ≤ t ≤ 10.

E 6.17 Verify that the solutions shown in Figs. 6.10 and 6.11 satisfy the
maximum principle.

E 6.18 Re-solve the model of Sect. 6.3.2 with y(0) = 1/2 and with the
warehousing constraint y ≤ 1/2 in place of (6.82).

E 6.19 Solve and interpret the following production planning problem
with linear inventory holding costs:

    max J = ∫₀ᵀ −[hI + (c/2)P²] dt

    subject to    (6.94)

    İ = P, I(0) = 0, I(T) = B; 0 < B < hT²/2c,

    P ≥ 0 and I ≥ 0.

E 6.20 Re-solve Exercise 6.19 with the state equation İ(t) = P(t) − S(t),
where I(0) = I0 ≥ 0 and I(T ) is not fixed. Assume the demand S(t) to
be continuous in t and non-negative. Keep the state constraint I ≥ 0, but
drop the production constraint P ≥ 0 for simplicity. For specificity, you
may assume S = − sin πt + C with the constant C ≥ 1 and T = 4. (Note
that negative production can and will occur when initial inventory I0 is
too large. Specifically, how large is too large depends on the parameters
of the problem.)

E 6.21 Re-solve Exercise 6.19 with the state equation İ(t) = P(t) − S,
where S > 0 and h > 0 are constants, I(0) = I0 > cS 2 /2h, and I(T ) is
not fixed. Assume that T is sufficiently large. Also, graph the optimal
P ∗ (t) and I ∗ (t), t ∈ [0, T ].
Chapter 7

Applications to Marketing

Over the years, a number of applications of optimal control theory have


been made to the field of marketing. Many of these applications deal with
the problem of finding or characterizing the optimal advertising rate over
time. Others deal with the problem of determining the optimal price and
quality over time, in addition to or without advertising. The reader is
referred to Sethi (1977a) and Feichtinger et al. (1994a) for comprehensive
reviews on dynamic optimal control problems in advertising and related
problems. In this chapter we discuss optimal advertising policies for
two of the well-known models called the Nerlove-Arrow model and the
Vidale-Wolfe model.
To describe the specific problems under consideration, let us assume
that a firm has some way of knowing or estimating the dynamics of sales
and advertising. Such knowledge is expressed in terms of a differential
equation with either goodwill or the rate of sales as the state variable and
the rate of advertising expenditures as the control variable. We assume
that the firm wishes to maximize an objective function (the criterion
function) which reflects its profit motives expressed in terms of sales and
advertising rates. The optimal control problem is to find an advertising
policy which maximizes the firm’s objective function.
The plan of this chapter is as follows. Section 7.1 will cover the
Nerlove-Arrow model as well as a nonlinear extension of it. Section 7.2
deals with the Vidale-Wolfe advertising model and its detailed analy-
sis using Green’s theorem in conjunction with the maximum principle.
The switching-point analysis for this problem is a good example of the
reverse-time construction technique used earlier in Chaps. 4 and 5. Ex-


tensions of these models to multi-state problems are treated in Turner


and Neuman (1976) and Srinivasan (1976).

7.1 The Nerlove-Arrow Advertising Model


The belief that advertising expenditures by a firm affect its present and
future sales, and hence its present and future net revenues, has led a
number of economists including Nerlove and Arrow (1962) to treat ad-
vertising as an investment in building up some sort of advertising capital,
usually called goodwill. Furthermore, the stock of goodwill depreciates
over time. Vidale and Wolfe (1957), Palda (1964), and others present
empirical evidence that the effects of advertising linger but diminish over
time.
Goodwill may be created by adding new customers or by altering
the tastes and preferences of consumers and thus changing the de-
mand function for the firm’s product. Goodwill depreciates over time
because consumers “drift” to other brands as a result of advertising
by competing firms and the introduction of new products and/or new
brands, etc.

7.1.1 The Model


Let G(t) ≥ 0 denote the stock of goodwill at time t. The price of (or cost
of producing) one unit of goodwill is one dollar so that a dollar spent on
current advertising increases goodwill by one unit. It is assumed that
the stock of goodwill depreciates over time at a constant proportional
rate δ, so that
Ġ = u − δG, G(0) = G0 , (7.1)
where u = u(t) ≥ 0 is the advertising effort at time t measured in
dollars per unit time. In economic terms, Eq. (7.1) states that the net
investment in goodwill is the difference between gross investment u(t)
and depreciation δG(t).
To formulate the optimal control problem for a monopolistic firm,
assume that the rate of sales S(t) depends on the stock of goodwill
G(t), the price p(t), and other exogenous factors Z(t), such as consumer
income, population size, etc. Thus,
S = S(p, G; Z). (7.2)

Assuming the rate of total production cost is c(S), we can write the total
revenue net of production cost as
R(p, G; Z) = pS(p, G; Z) − c(S(p, G; Z)). (7.3)
The revenue net of advertising expenditure is therefore R(p, G; Z) − u.
We assume that the firm wants to maximize the present value of net
revenue streams discounted at a fixed rate ρ, i.e.,
    max_{u≥0, p≥0} { J = ∫₀^∞ e^{−ρt} [R(p, G; Z) − u] dt }    (7.4)
subject to (7.1).
Since the only place that p occurs is in the integrand, we can max-
imize J by first maximizing R with respect to price p while holding G
fixed, and then maximize the result with respect to u. Thus,
    ∂R/∂p = S + p ∂S/∂p − c′(S) ∂S/∂p = 0,    (7.5)
which implicitly gives the optimal price p∗ (t) = p(G(t); Z(t)). Defining
η = −(p/S)(∂S/∂p) as the elasticity of demand with respect to price,
we can rewrite condition (7.5) as
    p∗ = ηc′(S)/(η − 1),    (7.6)
which is the usual price formula for a monopolist, known sometimes as
the Amoroso-Robinson relation. You are asked to derive this relation
in Exercise 7.2. In words, the relation means that the marginal revenue
(η − 1)p/η must equal the marginal cost c′(S). See, e.g., Cohen and Cyert
(1965, p. 189).
Defining Π(G; Z) = R(p∗ , G; Z), the objective function in (7.4) can
be rewritten as
    max_{u≥0} { J = ∫₀^∞ e^{−ρt} [Π(G; Z) − u] dt }.
For convenience, we assume Z to be a given constant. Thus, we can
define π(G) = Π(G; Z) and restate the optimal control problem which
we have just formulated:
    max_{u≥0} { J = ∫₀^∞ e^{−ρt} [π(G) − u] dt }

    subject to    (7.7)

    Ġ = u − δG, G(0) = G0.

Furthermore, it is reasonable to assume the functions introduced in (7.2)
and (7.3) to satisfy conditions so that π(G) is increasing and concave in
goodwill G. More specifically, we assume that π′(G) ≥ 0 and π″(G) < 0.

7.1.2 Solution by the Maximum Principle


While Nerlove and Arrow (1962) used calculus of variations, we use
Pontryagin’s maximum principle to derive their results. We form the
current-value Hamiltonian

H = π(G) − u + λ[u − δG] (7.8)

with the current-value adjoint variable λ satisfying the differential
equation

    λ̇ = ρλ − ∂H/∂G = (ρ + δ)λ − dπ/dG    (7.9)

and the condition that

    lim_{t→+∞} e^{−ρt} λ(t) = 0.    (7.10)

Recall from Sect. 3.6 that this limit condition is only a sufficient condi-
tion.
The adjoint variable λ(t) is the shadow price associated with the
goodwill at time t. Thus, the Hamiltonian in (7.8) can be interpreted as
the dynamic profit rate which consists of two terms: (1) the current net
profit rate (π(G) − u) and (2) the value λĠ = λ[u − δG] of the goodwill
rate Ġ created by advertising at rate u. Also, Eq. (7.9) corresponds to
the usual equilibrium relation for investment in capital goods; see Arrow
and Kurz (1970) and Jacquemin (1973). It states that the marginal
opportunity cost λ(ρ + δ)dt of investment in goodwill, by spending on
advertising, should equal the sum of the marginal profit π′(G)dt from the
increased goodwill due to that investment and the capital gain dλ := λ̇dt
on the unit price of goodwill.
We use (3.108) to obtain the optimal long-run stationary equilibrium
or turnpike {Ḡ, ū, λ̄}. That is, we obtain λ = λ̄ = 1 from (7.8) by using
∂H/∂u = 0. We then set λ = λ̄ = 1 and λ̇ = 0 in (7.9) to obtain

    π′(Ḡ) = ρ + δ.    (7.11)

In order to obtain a strictly positive equilibrium goodwill level Ḡ, we
may assume π′(0) > ρ + δ and π′(∞) < ρ + δ.
Before proceeding further to obtain the optimal advertising policy, let
us relate (7.11) to the equilibrium condition for Ḡ obtained by Jacquemin
(1973). For this we define β = (G/S)(∂S/∂G) as the elasticity of demand
with respect to goodwill. We can now use (7.3), (7.5), (7.6), and (7.9)
with λ̇ = 0 and λ̄ = 1 to derive, as you will in Exercise 7.3,
    Ḡ/(pS) = β/(η(ρ + δ)).    (7.12)
The interpretation of (7.12) is that in the equilibrium, the ratio of good-
will to sales revenue pS is directly proportional to the goodwill elasticity,
inversely proportional to the price elasticity, and inversely proportional
to the cost of maintaining goodwill given by the marginal opportunity
cost λ(ρ + δ) of investment in goodwill.
The property of Ḡ is that the optimal policy is to go to Ḡ as fast
as possible. If G0 < Ḡ, it is optimal to jump instantaneously to Ḡ by
applying an appropriate impulse at t = 0 and then set u∗ (t) = ū = δ Ḡ
for t > 0. If G0 > Ḡ, the optimal control u∗ (t) = 0 until the stock of
goodwill depreciates to the level Ḡ, at which time the control switches
to u∗ (t) = δ Ḡ and stays at this level to maintain the level Ḡ of goodwill.
This optimal policy is graphed in Fig. 7.1 for these two different initial
conditions.
Of course, if we had imposed an upper bound M > 0 on the control
so that 0 ≤ u ≤ M, then for G0 < Ḡ, we would use u∗ (t) = M until
the goodwill stock reaches Ḡ and switch to u∗ (t) = ū thereafter. This is
shown as the dotted curve in Fig. 7.1.
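To make the policy concrete, the following Python sketch simulates (7.1)
under the bounded-control version of the policy for an assumed profit
function π(G) = 2√G, for which (7.11) gives Ḡ = (ρ + δ)⁻²; all numerical
values are illustrative assumptions, not taken from the text.

```python
# A sketch of the Fig. 7.1 policy with the bound 0 <= u <= M, under the
# assumed profit function pi(G) = 2*sqrt(G); all values are illustrative.
rho, delta, M = 0.1, 0.15, 3.0
G_bar = (rho + delta)**-2          # pi'(G_bar) = rho + delta gives G_bar = 16
u_bar = delta*G_bar                # maintenance advertising on the turnpike

def simulate(G0, T=40.0, n=4000):
    G, dt = G0, T/n
    for _ in range(n):
        # full effort below G_bar, nothing above it, u_bar on the turnpike
        u = M if G < G_bar else (0.0 if G > G_bar else u_bar)
        G += dt*(u - delta*G)      # goodwill dynamics (7.1)
    return G

print(round(simulate(30.0), 2), round(simulate(2.0), 2))   # both end near 16
```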
Problem (7.7) is formulated with the assumption that a dollar spent
on current advertising increases goodwill by one unit. Suppose, instead,
that we need to spend m dollars on current advertising to increase good-
will by one unit. We can then define u as advertising effort costing the
firm mu dollars, and reformulate problem (7.7) by replacing [π(G) − u]

Figure 7.1: Optimal policies in the Nerlove-Arrow model

in its integrand by [π(G) − mu]. In Exercise 7.4, you are asked to solve
problem (7.7) with its objective function and the control constraint
replaced by

    max_{0≤u≤M} { J = ∫₀^∞ e^{−ρt} [π(G) − mu] dt },    (7.13)

and show that the equilibrium goodwill level formula (7.11) changes to

    π′(Ḡ) = (ρ + δ)m.    (7.14)

With Ḡ thus defined, the optimal solution is as shown in Fig. 7.1 with
the dotted curve representing the solution in Case 2: G0 < Ḡ.
For a time-dependent Z, however, Ḡ(t) = Ḡ(Z(t)) will be a function
of time. To maintain this level of Ḡ(t), the required control is
ū(t) = δḠ(t) + dḠ(t)/dt. If Ḡ(t) is decreasing sufficiently fast, then ū(t) may
become negative and thus infeasible. If ū(t) ≥ 0 for all t, then the opti-
mal policy is as before. However, suppose ū(t) is infeasible in the interval
[t1 , t2 ] shown in Fig. 7.2. In such a case, it is feasible to set u(t) = ū(t)
for t ≤ t1 ; at t = t1 (which is point A in Fig. 7.2) we can no longer stay
on the turnpike and must set u(t) = 0 until we hit the turnpike again (at
point B in Fig. 7.2). However, such a policy is not necessarily optimal.
For instance, suppose we leave the turnpike at point C anticipating the
infeasibility at point A. The new path CDEB may be better than the
old path CAB. Roughly the reason this may happen is that path CDEB
is “nearer” to the turnpike than CAB. The picture in Fig. 7.2 illustrates

such a case. The optimal policy is the one that is “nearest” to the turn-
pike. This discussion will become clearer in Sect. 7.2.2, when a similar
situation arises in connection with the Vidale-Wolfe model. For further
details; see Sethi (1977b) and Breakwell (1968).
The Nerlove-Arrow model is an example involving bang-bang and
impulse controls followed by a singular control, which arises in a class of
optimal control problems of Model Type (b) in Table 3.3 that are linear
in control.
Nonlinear extensions of the Nerlove-Arrow model have been offered
in the literature. These amount to making the objective function non-
linear in advertising. Gould (1970) extended the model by assuming a

Figure 7.2: A case of a time-dependent turnpike and the nature of optimal control

convex cost of advertising effort, which implies a marginally diminishing


effect of advertising expenditures. Jacquemin (1973) assumed that the
current demand function S in (7.2) also depends explicitly on the current
advertising effort u. In Exercise 11.6, you are asked to analyze Gould’s
extension via the phase diagram analysis introduced in Chap. 11. The
analysis of Jacquemin’s extension is similar.

7.1.3 Convex Advertising Cost and Relaxed Controls


Another nonlinear extension of the Nerlove-Arrow model would involve
a concave advertising cost resulting from quantity discounts that may be
available in the purchase of advertising. Such an extension results in an
optimal control problem with a profit rate that is convex in advertising,
and this has a possibility of rendering the problem without an optimal
solution within the class of admissible controls discussed thus far. What
is then required is an enlargement of the class to include what are known
as relaxed controls. To introduce such controls, we formulate and solve
a convex optimal control problem involving the Nerlove-Arrow model.
Let c(u) be a strictly concave advertising cost function with c(0) = 0,
c′(u) > 0 and c″(u) < 0 for 0 ≤ u ≤ M, where M > 0 denotes an
upper bound on the advertising rate.
the fixed terminal time. Then, our problem is the following modification
of problem (7.7):
    max_{0≤u≤M} { J1 = ∫₀ᵀ e^{−ρt} [π(G) − c(u)] dt }

    subject to    (7.15)

    Ġ = u − δG, G(0) = G0.

Note that with concave c(u), the profit rate π(G) − c(u) is convex
in u. Thus, its maximum over u would occur at the boundary 0 or M
of the set [0, M ]. It should be clear that if we replace c(u) by the linear
function mu with m = c(M )/M, then

π(G) − c(u) < π(G) − mu, u ∈ (0, M ). (7.16)

This means that if problem (7.15) with mu in place of c(u), i.e., the
problem
    max_{0≤u≤M} { J2 = ∫₀ᵀ e^{−ρt} [π(G) − mu] dt }

    subject to    (7.17)

    Ġ = u − δG, G(0) = G0

has only the bang-bang solution, then the solution of problem (7.17)
would also be the solution of the convex problem (7.15). Given the

similarity of problem (7.17) to problem (7.7), we can see that for a
sufficiently small value of T, the solution of (7.17) will be bang-bang only,
and therefore, it will also solve (7.15). However, if T is large or infinite,
then the solution of (7.17) will have a singular portion, and it will not
solve (7.15).
In particular, let us consider problems (7.15) and (7.17) when T = ∞
and G0 < Ḡ. Note that problem (7.17) is the same as the problem in
Exercise 7.4, and its optimal solution is as shown in Fig. 7.1 with Ḡ given
by (7.14) and the optimal trajectory given by the dotted line followed by
the solid horizontal line representing the singular part of the solution.
Let u∗2 denote the optimal control of problem (7.17). Since the sin-
gular control is in the open interval (0, M ), then in view of (7.16),

J1 (u∗2 ) < J2 (u∗2 ). (7.18)

Thus, for sufficiently small ε1 > 0 and ε2 > 0, we can “chatter” between
G1 = (Ḡ + ε1 ) and G2 = (Ḡ − ε2 ) by using controls M and 0 alternately,
as shown in Fig. 7.3, to obtain a near-optimal control of problem (7.15).
Clearly, in the limit as ε1 and ε2 go to 0, the objective function of problem
(7.15) will converge to J2 (u∗2 ).

Figure 7.3: A near-optimal control of problem (7.15)



This is an intuitive explanation that there does not exist an opti-


mal control of problem (7.15) in the class of controls discussed thus far.
However, when the class of controls is enlarged to include relaxed or gen-
eralized controls, which are the limits of the approximating controls like
the ones constructed above, we can recover the existence of an optimal
solution; see Gamkrelidze (1978) and Lou (2007) for details.
The manner in which the theory of relaxed controls manifests itself
for our problem is to provide a probability measure on the boundary
values {0, M }. Thus, let v be the probability that control M is used, so
that the probability of using control 0 is (1 − v). With this, we transform
problem (7.15) with T = ∞ as follows:
    max_{v ∈ [0,1]} { J_3 = ∫_0^∞ e^{−ρt} [π(G) − vc(M)] dt }

    subject to                                              (7.19)

    Ġ = vM − δG,  G(0) = G_0.

We can now use the maximum principle to solve problem (7.19). Thus,
the Hamiltonian is

H = π(G) − vc(M) + λ(vM − δG),

with the adjoint equation as defined by (7.9) and (7.10). The optimal
control is given by

v∗ = bang[0, 1; λM − c(M)].  (7.20)

The singular control is given by

λ̄ = m,  π′(Ḡ) = (ρ + δ)m,  v̄ = δḠ/M.  (7.21)
The way we interpret this control is by use of a biased coin with the
probability of heads being v̄. We flip this coin infinitely fast, and use the
maximum control M when heads comes up and the minimum control
0 when tails comes up. Because the control will chatter infinitely fast
according to the outcome of the coin tosses, such a control is also referred
to as a chattering control.
While such a chattering control cannot be implemented, it can be
arbitrarily approximated by using alternately u∗ = M for τ v̄ periods and
u∗ = 0 for τ (1 − v̄) periods for a small τ > 0. With reference to Fig. 7.3
and with G1 and G2 to be determined for the given τ , this approximate

policy of rapidly switching the control between M and 0 can be said to


begin at time t1 , when the goodwill reaches G2 . After that goodwill goes
up to G1 and then back down to G2 , and so on. The values of G1 and
G2 , corresponding to the given τ , are specified in Exercise 7.8, and you
are asked to derive them.
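
This construction is easy to check numerically. The following Python
sketch, with illustrative parameter values (δ, M, Ḡ, and τ are assumed
here, not taken from the text), integrates Ġ = u − δG exactly on each
phase of the policy and verifies that goodwill cycles between the values
G1 and G2 of Exercise 7.8.

```python
import math

# Assumed illustrative parameters
delta, M = 0.1, 1.0
G_bar = 4.0                       # singular goodwill level Ḡ
v_bar = delta * G_bar / M         # singular measure from (7.21): v̄ = δḠ/M
tau = 0.5                         # cycle length of the approximating policy

# Cycle bounds from Exercise 7.8
den = 1.0 - math.exp(-delta * tau)
G1 = (M / delta) * (1.0 - math.exp(-delta * tau * v_bar)) / den
G2 = (M / delta) * (math.exp(-delta * tau * (1.0 - v_bar)) - math.exp(-delta * tau)) / den

def phase(G, u, s):
    """Exact solution of Ġ = u − δG after a span s with constant control u."""
    return (G - u / delta) * math.exp(-delta * s) + u / delta

G_up = phase(G2, M, tau * v_bar)                # advertise at rate M: G2 -> G1
G_down = phase(G_up, 0.0, tau * (1.0 - v_bar))  # no advertising: G1 -> G2

print(f"G1 = {G1:.6f}, reached  {G_up:.6f}")
print(f"G2 = {G2:.6f}, returned {G_down:.6f}")
```

Shrinking τ tightens the band [G2, G1] around Ḡ, which is the sense in
which the pulsing policy approximates the chattering control.
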
In marketing parlance, advertising rates that alternate between max-
imum and zero are known as a pulsing policy. While there are other rea-
sons for pulsing that are known in the advertising literature, the convex
cost of advertising is one of them; see Feinberg (1992, 2001) for details.
Another example of relaxed control appears in Haruvy et al. (2003)
in connection with open-source software development. This is given as
Exercise 7.14.

7.2 The Vidale-Wolfe Advertising Model


We now present the analysis of the Vidale-Wolfe advertising model
which, in contrast to the Nerlove-Arrow model, does not make use of
the idea of advertising goodwill; see Vidale and Wolfe (1957) and Sethi
(1973a, 1974b). Instead the model exploits the closely related notion
that the effect of advertising tends to persist, but diminishes over subse-
quent time periods. This carryover effect is modeled explicitly by means
of a differential equation that gives the relationship between sales and
advertising.
Vidale and Wolfe argued that changes in the rate of sales of a prod-
uct depend on two effects: the action of advertising (via the response
constant a) on the unsold portion of the market and the loss of sales (via
the decay constant b) from the sold portion of the market. Let M (t),
known as the saturation level or market potential, denote the maximum
potential rate of sales at time t. Let S(t) be the actual rate of sales at
time t. Then, the Vidale-Wolfe model for a monopolistic firm can be
stated as
Ṡ = au(1 − S/M) − bS.  (7.22)
The important feature of this equation, which distinguishes it from
the Nerlove-Arrow equation (7.1), is the idea of the finite saturation level
M. The Vidale-Wolfe model exhibits diminishing returns to the level of
advertising as a direct consequence of this saturation phenomenon. Note
that when M is infinitely large, the saturation phenomenon disappears,
reducing (7.22) to the equation (with constant returns to advertising)
similar to the Nerlove-Arrow equation (7.1). Nerlove and Arrow, on the

other hand, include the idea of diminishing returns to advertising in their


model by making the sales S in (7.2) a concave function of goodwill.
Vidale and Wolfe based their model on the results of several experi-
mental studies of advertising effectiveness, which are described in Vidale
and Wolfe (1957).
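
The saturation phenomenon can be made concrete by looking at steady
states. Setting Ṡ = 0 in (7.22) gives S̄(u) = auM/(au + bM), which is
increasing and concave in u and bounded above by M. A minimal Python
sketch, with assumed illustrative values for a, b, and M (not from the
text):

```python
# Assumed illustrative constants
a, b, M = 0.5, 0.1, 100.0

def steady_sales(u):
    """Steady state of (7.22): solve au(1 − S/M) − bS = 0 for S."""
    return a * u * M / (a * u + b * M)

prev = 0.0
for u in (0, 5, 10, 15, 20):
    s = steady_sales(u)
    print(f"u = {u:2d}: steady-state sales = {s:6.2f}  (gain {s - prev:5.2f})")
    prev = s
# Equal increments of advertising buy ever-smaller sales gains, and no
# level of advertising pushes sales past the saturation level M.
```
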

7.2.1 Optimal Control Formulation for the Vidale-Wolfe


Model
Whereas Vidale and Wolfe offered their model primarily as a description
of actual market phenomena represented by cases which they had ob-
served, we obtain the optimal advertising expenditures for the model in
order to maximize a certain objective function over the horizon T, while
also attaining a terminal sales target; see Sethi (1973a). For this, it is
convenient to transform (7.22) by making the change of variable

x = S/M.  (7.23)

Thus, x represents the market share (or more precisely, the rate of sales
expressed as a fraction of the saturation level M ). Furthermore, we
define
r = a/M,  δ = b + Ṁ/M.  (7.24)

Now we can rewrite (7.22) as

ẋ = ru(1 − x) − δx,  x(0) = x_0.  (7.25)

From now on we assume M, and hence δ and r, to be positive con-


stants. It would not be difficult to extend the analysis when M depends
on t, but we do not carry it out here. In Exercise 7.35 you are asked to
partially analyze the time-dependent case.
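
The change of variable can also be verified symbolically. The following
SymPy sketch confirms that substituting x = S/M together with the
definitions (7.24) turns (7.22) into (7.25), even when M varies with t
(the case of Exercise 7.35):

```python
import sympy as sp

t = sp.Symbol('t')
a, b, u = sp.symbols('a b u', positive=True)
M = sp.Function('M', positive=True)(t)
S = sp.Function('S')(t)

# Vidale-Wolfe dynamics (7.22): Ṡ = au(1 − S/M) − bS
S_dot = a*u*(1 - S/M) - b*S

# Market share x = S/M; differentiate and replace Ṡ by (7.22)
x = S/M
x_dot = sp.diff(x, t).subs(sp.Derivative(S, t), S_dot)

# Claimed form (7.25), with r = a/M and δ = b + Ṁ/M as in (7.24)
r = a/M
delta = b + sp.Derivative(M, t)/M
claim = r*u*(1 - x) - delta*x

print(sp.simplify(x_dot - claim))   # prints 0
```
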
To define the optimal control problem arising from the Vidale-Wolfe
model, we let π denote the maximum sales revenue corresponding to
x = 1, with πx denoting the revenue function for x ∈ [0, 1]. Also let Q be
the maximum allowable rate of advertising expenditure and let ρ denote
the continuous discount rate. With these definitions the optimal control

problem can be stated as follows:

    max { J = ∫_0^T e^{−ρt} (πx − u) dt }

    subject to

    ẋ = ru(1 − x) − δx,  x(0) = x_0,

    the terminal state constraint                           (7.26)

    x(T) = x_T,

    and the control constraint

    0 ≤ u ≤ Q.

Here Q can be finite or infinite and the target market share xT is in


[0, 1]. Note that the problem is a fixed-end-point problem. It is obvious
that the requirement 0 ≤ x ≤ 1 holds without being imposed, where
x0 ∈ [0, 1] is the initial market share.
It is possible to solve this problem by an application of the maximum
principle; see Exercise 7.18. However, we will use instead a method based
on Green’s theorem which does not make use of the maximum principle.
This method provides a convenient procedure for solving fixed-end-point
problems having one state variable and one control variable, and where
the control variable appears linearly in both the state equation and the
objective function; see Miele (1962) and Sethi (1977b). Problem (7.26)
has these properties, and therefore it is also a good example with which
to illustrate the method. For the application of Green’s theorem we
require that Q be large. In particular we can let Q = ∞.

7.2.2 Solution Using Green’s Theorem When Q Is Large


In this section we will solve the fixed-end-point problem starting with x0
and ending with xT , under the assumption that Q is either unbounded
or very large. The places where these assumptions are needed will be
indicated.
To make use of Green’s theorem, it is convenient to consider times τ
and θ, where 0 ≤ τ < θ ≤ T, and the problem:
max J(τ, θ) = ∫_τ^θ e^{−ρt} (πx − u) dt  (7.27)

subject to
ẋ = ru(1 − x) − δx, x(τ ) = A, x(θ) = B, (7.28)

0 ≤ u ≤ Q. (7.29)
To change the objective function in (7.27) into a line integral along any
feasible arc Γ1 from (τ , A) to (θ, B) in (t, x)-space as shown in Fig. 7.4,
we multiply (7.28) by dt and obtain the formal relation

u dt = (dx + δx dt) / (r(1 − x)),

which we substitute into the objective function (7.27). Thus,

J_{Γ_1} = ∫_{Γ_1} { [πx − δx/(r(1 − x))] e^{−ρt} dt − [1/(r(1 − x))] e^{−ρt} dx }.

Figure 7.4: Feasible arcs in (t, x)-space

Consider another feasible arc Γ2 from (τ , A) to (θ, B) lying above Γ1


as shown in Fig. 7.4. Let Γ = Γ1 − Γ2 , where Γ is a simple closed curve
traversed in the counter-clockwise direction. That is, Γ goes along Γ1 in
the direction of its arrow and along Γ2 in the direction opposite to its
arrow. We now have

JΓ = JΓ1 −Γ2 = JΓ1 − JΓ2 . (7.30)



Since Γ is a simple closed curve, we can use Green’s theorem to


express JΓ as an area integral over the region R enclosed by Γ. Thus,
treating x and t as independent variables, we can write
J_Γ = ∮_Γ { [πx − δx/(r(1 − x))] e^{−ρt} dt − [1/(r(1 − x))] e^{−ρt} dx }

    = ∫∫_R { ∂/∂t [ −e^{−ρt}/(r(1 − x)) ] − ∂/∂x [ (πx − δx/(r(1 − x))) e^{−ρt} ] } dt dx

    = ∫∫_R [ δ/(1 − x)² + ρ/(1 − x) − πr ] (e^{−ρt}/r) dt dx.  (7.31)

Denote the term in brackets of the integrand of (7.31) by

I(x) = δ/(1 − x)² + ρ/(1 − x) − πr.  (7.32)
Note that the sign of the integrand is the same as the sign of I(x).

Lemma 7.1 (Comparison Lemma) Let Γ1 and Γ2 be the lower and


upper feasible arcs as shown in Fig. 7.4. If I(x) ≥ 0 for all (x, t) ∈
R, then the lower arc Γ1 is at least as profitable as the upper arc Γ2 .
Analogously, if I(x) ≤ 0 for all (x, t) ∈ R, then Γ2 is at least as profitable
as Γ1 .

Proof If I(x) ≥ 0 for all (x, t) ∈ R, then JΓ ≥ 0 from (7.31) and (7.32).
Hence from (7.30), JΓ1 ≥ JΓ2 . The proof of the other statement is similar.
□
To make use of this lemma to find the optimal control for the problem
stated in (7.26), we need to find regions where I(x) is positive and where
it is negative. For this, note first that I(x) is an increasing function of
x in [0, 1]. Solving I(x) = 0 will give that value of x, above which I(x)
is positive and below which I(x) is negative. Since I(x) is quadratic in
1/(1 − x), we can use the quadratic formula (see Exercise 7.16) to get

x = 1 − 2δ/(−ρ ± √(ρ² + 4πrδ)).  (7.33)

To keep x in the interval [0, 1], we must choose the positive sign before
the radical. The optimal x must be nonnegative so we have
 

x^s = max{ 1 − 2δ/(−ρ + √(ρ² + 4πrδ)), 0 },  (7.34)

where the superscript s is used because this will turn out to be a singular
trajectory. Since xs is nonnegative, the control

u^s = δx^s / (r(1 − x^s))  (7.35)

corresponding to (7.34) will always be nonnegative. Also since Q is as-


sumed to be large, us will always be feasible. Moreover, in Exercise 7.17,
you will be asked to show that xs = 0 and us = 0 if, and only if,
πr ≤ δ + ρ.
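
For concreteness, the singular levels (7.34) and (7.35) are easy to evaluate
numerically. The sketch below uses the parameter values given later in
Exercise 7.22(a):

```python
import math

# Parameter values from Exercise 7.22(a)
r, delta, rho, pi_ = 0.2, 0.05, 0.1, 2.0

root = math.sqrt(rho**2 + 4.0 * pi_ * r * delta)
x_s = max(1.0 - 2.0 * delta / (-rho + root), 0.0)   # (7.34)
u_s = delta * x_s / (r * (1.0 - x_s))               # (7.35)
print(f"x^s = {x_s:.4f}, u^s = {u_s:.4f}")          # 0.5000 and 0.2500

# x^s > 0 here because πr = 0.4 > δ + ρ = 0.15 (Exercise 7.17);
# and indeed I(x^s) = 0 for I(x) defined in (7.32):
I = delta / (1.0 - x_s)**2 + rho / (1.0 - x_s) - pi_ * r
print(f"I(x^s) = {I:.2e}")
```
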
We now have enough machinery to obtain the optimal solution for
(7.26) when Q is assumed to be sufficiently large, i.e., Q ≥ us , where us is
given in (7.35). We state these in the form of two theorems: Theorem 7.1
refers to the case in which T is large; Theorem 7.2 refers to the case in
which T is small. To define these concepts, let t1 be the shortest time
to go from x0 to xs and similarly let t2 be the shortest time to go from
xs to xT. Then, we say T is large if T ≥ t1 + t2; otherwise T is small.

Figure 7.5: Optimal trajectory for Case 1: x0 ≤ xs and xT ≤ xs


Figures 7.5, 7.6, 7.7, and 7.8 show cases for which T is large, while
Figs. 7.10 and 7.11 show cases for which T is small. In Exercise 7.21 you
are asked to determine whether T is large or small in specific cases. In
the statements of the theorems we will assume that x0 and xT are such
that xT is reachable from x0 . In Exercise 7.15 you are asked to find the
reachable set for any given initial condition x0 .
In Figures 7.5, 7.6, 7.7, and 7.8, the quantities t1 and t2 are case
dependent and not necessarily the same; see Exercise 7.20.

Theorem 7.1 Let T be large and let xT be reachable from x0 . For the
Cases 1–4 of inequalities relating x0 and xT to xs , the optimal trajectories
are given in Figures 7.5, 7.6, 7.7, and 7.8, respectively.

Proof We give details for Case 1 only. The proofs for the other cases
are similar. Figure 7.9 shows the optimal trajectory for Fig. 7.5 together
with an arbitrarily chosen feasible trajectory, shown dotted. It should
be clear that the dotted trajectory cannot cross the arc x0 to C, since
u = Q on that arc. Similarly the dotted trajectory cannot cross the arc
G to xT , because u = 0 on that arc.
We subdivide the interval [0, T ] into subintervals over which the dot-
ted arc is either above, below, or identical to the solid arc. In Fig. 7.9

Figure 7.6: Optimal trajectory for Case 2: x0 < xs and xT > xs

Figure 7.7: Optimal trajectory for Case 3: x0 > xs and xT < xs



Figure 7.8: Optimal trajectory for Case 4: x0 > xs and xT > xs

these subintervals are [0, d], [d, e], [e, f ], and [f, T ]. Because I(x) is pos-
itive for x > xs and I(x) is negative for x < xs , the regions enclosed
by the two trajectories have been marked with a + or − sign depending
on whether I(x) is positive or negative on the regions, respectively. By
Lemma 7.1, the solid arc is better than the dotted arc in the subintervals
[0, d], [d, e], and [f, T ]; in interval [e, f ], they have identical values. Hence
the dotted trajectory is inferior to the solid trajectory. This proof can
be extended to any (countable) number of crossings of the trajectories;
see Sethi (1977b). □
Figures 7.5, 7.6, 7.7, and 7.8 are drawn for the situation when T >
t1 + t2 . In Exercise 7.25, you are asked to consider the case when T =
t1 + t2 . The following theorem deals with the case when T < t1 + t2 .
Theorem 7.2 Let T be small, i.e., T < t1 + t2 , and let xT be reachable
from x0 . For the two possible Cases 1 and 2 of inequalities relating x0
and xT to xs , the optimal trajectories are given in Figs. 7.10 and 7.11,
respectively.
Proof The requirement of feasibility when T is small rules out cases
where x0 and xT are on opposite sides of or equal to xs . The proofs of
optimality of the trajectories shown in Figs. 7.10 and 7.11 are similar to
the proofs of the parts of Theorem 7.1, and are left as Exercise 7.25. In
Figs. 7.10 and 7.11, it is possible to have either t1 ≥ T or t2 ≥ T. Try
sketching some of these special cases. □
All of the previous discussion has assumed that Q was finite and
sufficiently large, but we can easily extend this to the case when Q = ∞.

Figure 7.9: Optimal trajectory (solid lines)

Figure 7.10: Optimal trajectory when T is small in Case 1: x0 < xs and xT > xs

This possibility makes the arcs in Figs. 7.5, 7.6, 7.7, 7.8, 7.9, and 7.10,
corresponding to u∗ = Q, become vertical line segments corresponding to
impulse controls. For example, Fig. 7.6 becomes Fig. 7.12 when Q = ∞
and we apply the impulse control imp(x0 , xs ; 0) when x0 < xs .
Next we compute the cost of imp(x0 , xs ; 0) by assessing its effect
on the objective function of (7.26). For this, we integrate the state
equation in (7.26) from 0 to ε with the initial condition x0 and u treated


Figure 7.11: Optimal trajectory when T is small in Case 2: x0 > xs and xT > xs


Figure 7.12: Optimal trajectory for Case 2 of Theorem 7.1 for Q = ∞



as constant. By using (A.7), we can write the solution as

x(ε) = x_0 e^{−(δ+ru)ε} + ∫_0^ε e^{(δ+ru)(τ−ε)} ru dτ

     = ( x_0 − ru/(δ + ru) ) e^{−(δ+ru)ε} + ru/(δ + ru).
According to the procedure given in Sect. 1.4, we must choose u = u(ε)
so that x(ε) = x^s. It should be clear that u(ε) → ∞ as ε → 0. With
F(x, u, τ) = πx(τ) − u(τ) and t = 0 in (1.23), we have the impulse

I = imp(x_0, x^s; 0) = lim_{ε→0} [−u(ε)ε].

It is possible to solve for I by letting ε → 0, −u(ε)ε → I, u(ε) → ∞, and
x(ε) = x^s in the expression for x(ε) obtained above. This gives

x(0+) = e^{rI}(x_0 − 1) + 1.

Therefore,

imp(x_0, x^s; 0) = −(1/r) ln[ (1 − x_0)/(1 − x^s) ].  (7.36)
We remark that this formula holds for any time t, as well as t = 0. Hence
it can also be used at t = T to compute the impulse at the end of the
period; see Fig. 7.12 and Exercise 7.28.
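
The limit defining the impulse can be checked numerically. The sketch
below takes r, δ, and x0 as in Exercise 7.22(a) and x^s = 0.5 from (7.34);
for each small ε it bisects for the constant control u(ε) that lifts x0 to
x^s by time ε, and compares −u(ε)ε with formula (7.36):

```python
import math

r, delta = 0.2, 0.05
x0, xs = 0.2, 0.5

def x_eps(u, eps):
    """State at time eps under constant control u, from the closed form above."""
    k = r * u / (delta + r * u)
    return (x0 - k) * math.exp(-(delta + r * u) * eps) + k

def u_needed(eps, lo=0.0, hi=1e9):
    """Bisect for the constant control lifting x0 to xs by time eps."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if x_eps(mid, eps) < xs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

I_formula = -(1.0 / r) * math.log((1 - x0) / (1 - xs))     # (7.36)
for eps in (1e-2, 1e-3, 1e-4):
    u = u_needed(eps)
    print(f"eps = {eps:.0e}: -u(eps)*eps = {-u * eps:.6f}")
print(f"formula (7.36):        {I_formula:.6f}")
```
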

7.2.3 Solution When Q Is Small


When Q is small, it is not possible to go along the turnpike xs , so the
arguments based on Green’s theorem become difficult to apply. We there-
fore return to the maximum principle approach to analyze the problem.
By “Q is small” we mean Q < us , where us is defined in (7.35). An-
other characterization of the phrase “Q is small” in terms of the problem
parameters is given in Exercise 7.30.
We now apply the current-value maximum principle (3.42) to the
fixed-end-point problem given in (7.26). We form the current-value
Hamiltonian as

H = πx − u + λ[ru(1 − x) − δx]
= πx − δλx + u[−1 + rλ(1 − x)], (7.37)

and the Lagrangian function as

L = H + μ(Q − u). (7.38)



The adjoint variable λ satisfies

λ̇ = ρλ − ∂L/∂x = ρλ + λ(ru + δ) − π,  (7.39)

where λ(T ) is a constant, as in Row 2 of Table 3.1, that must be deter-


mined. Furthermore, the Lagrange multiplier μ in (7.38) must satisfy

μ ≥ 0, μ(Q − u) = 0. (7.40)

From (7.37) we notice that the Hamiltonian is linear in the control. So


the optimal control is

u∗ (t) = bang[0, Q; W (t)], (7.41)

where

W (t) = W (x(t), λ(t)) = rλ(t)(1 − x(t)) − 1. (7.42)

We remark that the sufficiency conditions of Sect. 2.4, which require


concavity of the derived Hamiltonian H^0, do not apply here; see Exer-
cise 7.33. However, the sufficiency of the maximum principle for this
kind of problem has been established in the literature; see, for example,
Lansdowne (1970).
When W = rλ(1 − x) − 1 = 0, we have the possibility of a singular
control, provided we can maintain this equality over a finite time interval.
For the case when Q is large, we showed in the previous section that the
optimal trajectory contains a segment on which x = xs and u∗ = us ,
where 0 ≤ us ≤ Q. (See Exercise 7.30 for the condition that Q is small.)
This can obviously be a singular control. Further discussion of singular
control is given in Sect. D.6.
A complete solution of problem (7.26) when Q is small requires a
lengthy switching point analysis. The details are too voluminous to give
here, but an interested reader can find the details in Sethi (1973a).

7.2.4 Solution When T Is Infinite


In Sects. 7.2.1 and 7.2.2, we assumed that T was finite. We now formulate
the infinite horizon version of (7.26):
    max { J = ∫_0^∞ e^{−ρt} (πx − u) dt }

    subject to                                              (7.43)

    ẋ = ru(1 − x) − δx,  x(0) = x_0,

    0 ≤ u ≤ Q.

We divide the analysis of this problem into the same two cases defined
as before, namely, “Q is large” and “Q is small”.
When Q is large, the results of Theorem 7.1 suggest the solution
when T is infinite. Because of the discount factor, the ending parts of
the solutions shown in Figs. 7.5, 7.6, 7.7, and 7.8 can be shown to be
irrelevant (i.e., the discounted profit accumulated during the interval
(T − t2 , T ) goes to 0 as T goes to ∞). Therefore, we only have two
cases: (a) x0 ≤ xs , and (b) x0 ≥ xs . The optimal control in Case (a) is
to use u∗ = Q in the interval [0, t1 ) and u∗ = us for t ≥ t1 . Similarly, the
optimal control in Case (b) is to use u∗ = 0 in the interval [0, t1 ) and
u∗ = us for t ≥ t1 .
An alternate way to see that the above solutions give u∗ = us for
t ≥ t1 is to check that they satisfy the turnpike conditions (3.107). To do
this we need to find the values of the state, control, and adjoint variables
and the Lagrange multiplier along the turnpike. It can be easily shown
that x = xs , u = us , λs = π/(ρ + δ + rus ), and μs = 0 satisfy the
turnpike conditions (3.107).
When Q is small, i.e., Q < us , it is not possible to follow the turnpike
x = xs , because that would require u = us , which is not a feasible control.
Intuitively, it seems clear that the “nearest” stationary path to xs that
we can follow is the path obtained by setting ẋ = 0 and u = Q, the
largest possible control, in the state equation of (7.43). This gives
x̄ = rQ/(rQ + δ),  (7.44)
and correspondingly we obtain
λ̄ = π/(ρ + δ + rQ)  (7.45)

by setting u = Q and λ̇ = 0 in (7.39) and solving for λ.


To find an optimal solution from any given initial x0 , the approach we
take is to find a feasible path that is “nearest” to xs ; see Sethi (1977b) for
further discussion. As we shall see, for x0 < xs , such a path is obtained
by using the maximum possible control Q all the way. For x0 > xs , the
situation is more difficult. Nevertheless, the following two theorems give
the turnpike as well as the optimal path starting from any given initial
x0 . Let us define x̂ and μ̄ such that W (x̂, λ̄) = rλ̄(1 − x̂) − 1 = 0 and
Lu (x̄, ū, λ̄, μ̄) = W (x̄, λ̄) − μ̄ = 0. Thus,

x̂ = 1 − 1/(rλ̄),  (7.46)

μ̄ = rλ̄(1 − x̄) − 1. (7.47)
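
Numerically, these quantities are easy to evaluate. The sketch below uses
the parameters of Exercise 7.22(a) with an assumed bound Q = 0.1
(which the test of Exercise 7.30 certifies as small), evaluates (7.44)–(7.47),
and checks the ordering x̄ < x^s < x̂ of Exercise 7.31(a):

```python
# Parameters of Exercise 7.22(a) with an assumed small bound Q = 0.1
r, delta, rho, pi_, Q = 0.2, 0.05, 0.1, 2.0, 0.1

# "Q is small" test of Exercise 7.30
is_small = pi_ * r * delta / ((delta + rho + r * Q) * (delta + r * Q)) > 1
x_bar = r * Q / (r * Q + delta)                 # (7.44)
lam_bar = pi_ / (rho + delta + r * Q)           # (7.45)
x_hat = 1.0 - 1.0 / (r * lam_bar)               # (7.46)
mu_bar = r * lam_bar * (1.0 - x_bar) - 1.0      # (7.47)
x_s = 0.5                                       # singular level from (7.34)

print("Q small?", is_small)                                   # True
print(f"x_bar = {x_bar:.4f} < x^s = {x_s} < x_hat = {x_hat:.4f}")
print(f"mu_bar = {mu_bar:.4f} >= 0")
```
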

Theorem 7.3 When Q is small, the quadruple {x̄, Q, λ̄, μ̄} forms a
turnpike.

Proof We show that the turnpike conditions (3.107) hold for the quadru-
ple. The first two conditions of (3.107) are (7.44) and (7.45). By Ex-
ercise 7.31 we know x̄ ≤ x̂, which, from definitions (7.46) and (7.47),
implies μ̄ ≥ 0. Furthermore ū = Q, so (7.40) holds and the third con-
dition of (3.107) also holds. Finally because W = μ̄ from (7.42) and
(7.47), it follows that W ≥ 0, so the Hamiltonian maximizing condition
of (3.107) holds with ū = Q. □

Theorem 7.4 When Q is small, the optimal control at any time τ ≥ 0


is given by:

(a) If x(τ ) ≤ x̂, then u∗ (τ ) = Q.

(b) If x(τ ) > x̂, then u∗ (τ ) = 0.

Proof (a) We set λ(t) = λ̄ for all t ≥ τ and note that λ satisfies the
adjoint equation (7.39) and the transversality condition (3.99).
By Exercise 7.31 and the assumption that x(τ ) ≤ x̂, we know that
x(t) ≤ x̂ for all t. The proof that (7.40) and (7.41) hold for all t ≥ τ
relies on the fact that x(t) ≤ x̂ and on an argument similar to the proof
of the previous theorem.
Figure 7.13 shows the optimal trajectories when x0 < x̂ for two dif-
ferent starting values of x0 , one above and the other below x̄. Note that
in this figure we are always in Case (a) since x(τ ) ≤ x̂ for all τ ≥ 0.

Figure 7.13: Optimal trajectories for x(0) < x̂

(b) Assume x0 > x̂. In this case we will show that the optimal trajec-
tory is as shown in Fig. 7.14, which is obtained by applying u = 0 until
x = x̂ and u = Q thereafter. Using this policy we can find the time t1
at which x(t1 ) = x̂, by solving the state equation in (7.43) with u = 0.
This gives

t_1 = (1/δ) ln(x_0/x̂).  (7.48)
Clearly for t ≥ t1 , the policy u = Q is optimal because Case (a)
applies. We now consider the interval [0, t1 ], where we set u = 0. Let τ
be any time in this interval as shown in Fig. 7.14, and let x(τ ) be the
corresponding value of the state variable. Then x(τ ) = x0 e−δτ . With
u = 0 in (7.39), the adjoint equation on [0, t1 ] becomes

λ̇ = (ρ + δ)λ − π.

We also know that x(t1 ) = x̂. Thus, Case (a) applies at time t1 , and
we would like to have λ(t1 ) = λ̄. So, we solve the adjoint equation with
λ(t1 ) = λ̄ and obtain
 
λ(τ) = π/(ρ + δ) + [ λ̄ − π/(ρ + δ) ] e^{(ρ+δ)(τ−t_1)},  τ ∈ [0, t_1].  (7.49)

Now, with the values of x(τ ) and λ(τ ) in hand, we can use (7.42)
to obtain the switching function value W (τ ). In Exercise 7.34, you are


Figure 7.14: Optimal trajectory for x(0) > x̂

asked to show that the switching function W (τ ) is negative for each τ in


the interval [0, t1 ) and W (t1 ) = 0. Therefore by (7.41), the policy u = 0
used in deriving (7.48) and (7.49) satisfies the maximum principle. This
policy “joins” the optimal policy after t1 because λ(t1 ) = λ̄.
In this book the sufficiency of the transversality condition (3.99) is
stated under the hypothesis that the derived Hamiltonian is concave;
see Theorem 2.1. In the present example, this hypothesis does not hold.
However, as mentioned in Sect. 7.2.3, for this simple bilinear problem
it can be shown that (3.99) is sufficient for optimality. Because of the
technical nature of this issue we omit the details. □
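
The two-phase policy of Theorem 7.4(b) is straightforward to simulate.
The following sketch, under the same assumed parameters as above,
starts at an x0 above x̂, coasts with u = 0 for the time t1 given by
(7.48), confirms arrival at x̂, and then applies u = Q so that the state
settles at the turnpike x̄ of (7.44):

```python
import math

# Same assumed parameters as above, with Q small
r, delta, rho, pi_, Q = 0.2, 0.05, 0.1, 2.0, 0.1
lam_bar = pi_ / (rho + delta + r * Q)          # (7.45)
x_hat = 1.0 - 1.0 / (r * lam_bar)              # (7.46)
x_bar = r * Q / (r * Q + delta)                # (7.44)

x0 = 0.9                                       # Case (b): x0 > x_hat
t1 = math.log(x0 / x_hat) / delta              # (7.48)

def x_phase(x_start, u, s):
    """Closed-form solution of ẋ = ru(1 − x) − δx over a span of length s."""
    k = r * u / (delta + r * u) if u > 0 else 0.0
    return (x_start - k) * math.exp(-(delta + r * u) * s) + k

x_t1 = x_phase(x0, 0.0, t1)                    # coast with u* = 0
print(f"t1 = {t1:.3f}, x(t1) = {x_t1:.4f}, x_hat = {x_hat:.4f}")

x = x_t1
for _ in range(5):                             # then u* = Q (Case (a))
    x = x_phase(x, Q, 20.0)
print(f"long-run x = {x:.4f}, turnpike x_bar = {x_bar:.4f}")
```
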

Exercises for Chapter 7

E 7.1 In Eqs. (7.2) and (7.3), assume S(p, G) = 1000 − 5p + 2G and


c(S) = 5S. Substitute into (7.5) and solve for the optimal price p∗ in
terms of G.

E 7.2 Derive the optimal monopoly price formula in (7.6) from (7.5).

E 7.3 Derive the equilibrium goodwill level formula (7.12).

E 7.4 Re-solve problem (7.7) with its objective function and the control
constraint replaced by (7.13), and show that the only possible singular

level of goodwill (which can be maintained over a finite time interval) is


the value Ḡ obtained in (7.14).

E 7.5 Show that the total cost of advertising required to go from


G0 < Ḡ to Ḡ instantaneously (by an impulse) is Ḡ − G0 .

Hint: Integrate Ġ = u − δG, G(0) = G0 , from 0 to ε and equate


Ḡ = lim G(ε), where the limit is taken as ε → 0 and u → ∞ in such a
way that uε → cost = −imp(G0 , Ḡ; 0). See also the derivation of (7.36).

E 7.6 Assume the effect of the exogenous variable Z(t) is seasonal so
that the goodwill Ḡ(t) = 2 + sin t. Assume δ = 0.1. Sketch the graph
of ū(t) = δḠ + dḠ/dt, similar to Fig. 7.2, and identify intervals in which
maintaining the singular level Ḡ(t) is infeasible.

E 7.7 In the Nerlove-Arrow Model of Sects. 7.1.1 and 7.1.2, assume


S(p, G, Z) = αp^{−η} G^β Z^γ and c(S) = cS. Show that the optimal sta-
tionary policy gives ū/pS = constant, i.e., that the optimal advertising
level is a constant fraction of sales regardless of the value of Z. (Such
policies are followed by many industries.)

E 7.8 Verify that G1 and G2 , which are shown in Fig. 7.3 for the pulsing
policy derived from solving problem (7.19) as a near-optimal solution of
problem (7.17) with T = ∞, are given by

G_1 = (M/δ) · (1 − e^{−δτv̄})/(1 − e^{−δτ}),   G_2 = (M/δ) · (e^{−δτ(1−v̄)} − e^{−δτ})/(1 − e^{−δτ}).

E 7.9 Extend the Nerlove-Arrow Model and its results by introducing


the additional capital stock variable
K̇ = v − γK, K(0) = K0 ,
where v is the research expenditure. Assume the new cost function to
be C(S, K). Note that this model allows the firm to manipulate its cost
function. See Dhrymes (1962).

E 7.10 Analyze an extension of a finite horizon Nerlove-Arrow Model


subject to a budget constraint. That is, introduce the following isoperimetric constraint:

∫_0^T u e^{−ρt} dt = B.

Also assume π(G) = α√G, where α > 0 is a constant. See Sethi (1977c).

E 7.11 Introduce a budget constraint in a different way into the Nerlove-


Arrow model as follows. Let B(t) be the budget at time t, and let γ > 0
be a constant. Assume B satisfies

Ḃ = e−ρt (−u + γG), B(0) = B0

and B(t) ≥ 0 for all t. Solve only the infinite horizon model. See Sethi
and Lee (1981).

E 7.12 Maximize the present value of total sales in the Nerlove-Arrow


model, i.e.,

max_{u≥0} { J = ∫_0^∞ e^{−ρt} pS(p, G) dt }

subject to (7.1) and the isoperimetric profit constraint

∫_0^∞ e^{−ρt} [pS(p, G) − C(S) − u] dt = π̂.

See Tsurumi and Tsurumi (1971).

E 7.13 A Logarithmic Advertising Model (Sethi 1975).

(a) With πr > ρ + δ, solve


    max { J = ∫_0^T e^{−ρt} (πx − u) dt }

    subject to

    ẋ = r log u − δx,  x(0) = x_0,

    and the control constraint

    u ≥ 1.

(b) Find the value of T for which the minimum advertising is optimal
throughout, i.e., u∗ (t) = 1, 0 ≤ t ≤ T.

(c) Let T = ∞. Obtain the long-run stationary equilibrium (x̄, ū, λ̄).

E 7.14 Let

Q(t) = the quality of the software at time t; Q(0) ≥ 0,



P (t) = the price of the software at time t; P (t) ≥ 0,


D(P, Q) = the demand; D(P, Q) ≥ 0, DQ ≥ 0, DP ≤ 0,
g(x) = a decreasing function; g(x) ≥ 0, g′(x) ≤ 0, g″(x) ≥ 0,
and g(x) → 0 as x → ∞,
ρ = the discount rate; ρ > 0,
δ = the obsolescence rate for software quality; δ > 0.
Assume that

lim_{P→0} P D(P, Q) = 0, for each Q.
Furthermore, we assume that there is a price that maximizes the revenue
(in the case when there is more than one global maximum, we will choose
the largest of these) and denote it as P m (Q).
We assume that 0 < P m (Q) < ∞ and define
R(Q) = P m (Q)D(P m (Q), Q).
By the envelope theorem (see Derzko et al. 1984), we have
RQ (Q) = P m (Q)DQ (P m (Q), Q) ≥ 0.
In an open-source approach to software development, the improve-
ment in software quality is proportional to the number of volunteer pro-
grammers participating at any point in time. The volunteer program-
mers’ willingness to contribute to software quality is driven by fairness
considerations.
To capture the loss of motivation that results from the profit making
of the firm, we formulate the motivations of the programmers based on
the current or projected future profit of the firm. Then, let g(P D) be
the quality improvement affected by the volunteer programmers. The
optimal dynamic price and quality paths can be obtained by solving the
following problem due to Haruvy et al. (2003):
max_{P(t)≥0} { J = ∫_0^∞ e^{−rt} P D dt },

s.t. dQ/dt = g(P D) − δQ,  Q(0) = Q_0.
Because of the convexity of function g in this case, argue that the problem
would require the inclusion of chattering controls. Then reformulate the
problem as
max_{0≤v≤1} { J = ∫_0^∞ e^{−rt} v R(Q) dt },

s.t. dQ/dt = (1 − v)g(0) + v g(R(Q)) − δQ,  Q(0) = Q_0.

Apply the Green’s theorem approach to solve this problem.

E 7.15 For problem (7.26), find the reachable set for a given initial x0
and horizon time T.

E 7.16 Solve the quadratic equation I(x) = 0, where I(x) is defined in


(7.32), to obtain its solution as shown in (7.33).

E 7.17 Show that both xs in (7.34) and us in (7.35) are 0 if, and only
if, πr ≤ δ + ρ.

E 7.18 For problem (7.26) with πr > δ + ρ and Q sufficiently large,


derive the turnpike {x̄, ū, λ̄} by using the maximum principle. Check
to see that x̄ and ū correspond, respectively, to xs and us derived by
Green’s theorem. Show that when ρ = 0, x̄ reduces to the golden path
rule.

E 7.19 Let xs denote the solution of I(x) = 0 and let A < xs < B in
Fig. 7.4. Assume that I(x) > 0 for x > xs and I(x) < 0 for x < xs .
Construct a path Γ3 such that JΓ3 ≥ JΓ1 and JΓ3 ≥ JΓ2 .

Hint: Use Lemma 7.1.

E 7.20 For the problem in (7.26), suppose x0 and xT are given and
define xs as in (7.34). Let t1 be the shortest time to go from x0 to xs ,
and t2 be the shortest time to go from xs to xT .

(a) If x0 < xs and xs > xT , show that

t_1 = (1/(rQ + δ)) ln[ (x̄ − x_0)/(x̄ − x^s) ],   t_2 = (1/δ) ln(x^s/x_T),

where x̄ = rQ/(rQ + δ); assume x̄ > xs .

(b) Using the form of the answers in (a), find t1 and t2 when x0 > xs
and xs < xT < x̄.

E 7.21 For Exercise 7.20(a), write the condition that T is large, i.e.,
T ≥ t1 + t2 , in terms of all the parameters of problem (7.26).

E 7.22 Perform the following:



(a) For problem (7.26), assume r = 0.2, δ = 0.05, ρ = 0.1, Q =


5, π = 2, x0 = 0.2 and xT = 0.3. Use Exercises 7.20(a) and 7.21 to
show that T = 13 is large and T = 8 is small. Sketch the optimal
trajectories for T = 13 and T = 8.

(b) Redo (a) when xT = 0.7. Show that both T = 13 and T = 8 are
large.

E 7.23 Prove Theorem 7.1 for Case 3.

E 7.24 Draw four figures for the case T = t1 + t2 corresponding to


Figs. 7.5, 7.6, 7.7, and 7.8.

E 7.25 Prove Theorem 7.2.

E 7.26 Sketch one or two other possible curves for the case when T is
small.

E 7.27 An intermediate step in the derivation of (7.36) is to establish that

lim_{ε→0} ∫_0^ε e^{−ρt} [πx(t) − u(t)] dt = lim_{ε→0} [−u(ε)ε].

Show how to accomplish this by using the Mean Value Theorem.

E 7.28 Obtain the impulse function, imp(xs , xT ; T ), required to take the


state from xs up to xT instantaneously at time T as shown in Fig. 7.12
for the Vidale-Wolfe model in Sect. 7.2.2.

E 7.29 Perform the following:

(a) Re-solve Exercise 7.22(a) with Q = ∞. Show T = 10.5 is no longer


small.

(b) Show that T > 0 is large for Exercise 7.22(b) when Q = ∞. Find
the optimal value of the objective function when T = 8.

E 7.30 Show that Q is small if, and only if,

πrδ / [ (δ + ρ + rQ)(δ + rQ) ] > 1.

E 7.31 Perform the following:



(a) Show that x̄ < xs < x̂ when Q is small, where x̂ is


defined in (7.46).

(b) Show that x̄ > xs when Q is large.

E 7.32 Derive (7.48).

E 7.33 Show that the derived Hamiltonian H^0 corresponding to (7.37) and


(7.41) is not concave in x for any given λ > 0.

E 7.34 Show that the switching function defined in (7.42) is concave in


t, and then verify that the policy in Fig. 7.14 satisfies (7.41).

E 7.35 In (7.25), assume r and δ are positive, differentiable functions


of time. Derive expressions similar to (7.31)–(7.35) in order to get the
new turnpike values.

E 7.36 Write the equation satisfied by the turnpike level x̄ for the model
    max_{u≥0} { J = ∫_0^∞ e^{−ρt} (πx − u²) dt }

    subject to

    ẋ = ru(1 − x) − δx,  x(0) = x_0.

Show that the turnpike reduces to the golden path when ρ = 0.

E 7.37 Obtain the optimal long-run stationary equilibrium for the fol-
lowing modification of the model (7.26), due to Sethi (1983b):
    max { ∫_0^∞ e^{−ρt} (πx − u²) dt }

    subject to                                              (7.50)

    ẋ = ru√(1 − x) − δx,  x_0 ∈ [0, 1],

    u ≥ 0.

In particular, show that the turnpike triple (x̄, λ̄, ū) is given by

x̄ = (r²λ̄/2)/(r²λ̄/2 + δ),   ū = rλ̄√(1 − x̄)/2,  (7.51)

and

λ̄ = [ √((ρ + δ)² + r²π) − (ρ + δ) ] / (r²/2).  (7.52)

Show that the optimal value of the objective function is

J^∗(x_0) = λ̄x_0 + r²λ̄²/(4ρ).  (7.53)

E 7.38 Consider (7.43) with the state equation replaced by

ẋ = ru(1 − x) + μx(1 − x) − δx, x(0) = x0 ,

where the constant μ > 0 reflects word-of-mouth communication between


buyers (represented by x) and non-buyers (represented by (1 − x)) of the
product. Assume Q is infinite for convenience. Obtain the turnpike for
this problem. See Sethi (1974b).

E 7.39 The Ozga Model (Ozga 1960; Gould 1970). Suppose the informa-
tion spreads by word of mouth rather than by an impersonal advertising
medium, i.e., individuals who are already aware of the product inform
individuals who are not, at a certain rate, influenced by advertising ex-
penditure. What we have now is the Ozga model

ẋ = ux(1 − x) − δx, x(0) = x0 .

The optimal control problem is to maximize


 ∞
J= e−ρt [π(x) − w(u)]dt
0

subject to the Ozga model. Assume that π(x) is concave and w(u) is
convex. See Sethi (1979c) for a Green’s theorem application to this
problem.
Chapter 8

The Maximum Principle: Discrete Time

For many purposes it is convenient to assume that time is represented by


a discrete variable, k = 0, 1, 2, . . . , T, rather than by a continuous variable
t ∈ [0, T ]. This is particularly true when we wish to solve a large control
theory problem by means of a computer. It is also desirable, even when
solving small problems which have state or adjoint differential equations
whose solutions cannot be expressed in closed form, to formulate them as
discrete problems and let the computer solve them in a stepwise manner.
We will see that the maximum principle, which is to be derived in
this chapter, is not valid for the discrete-time problem in as wide a
sense as for the continuous-time problem. In fact, we will reduce it to
a nonlinear programming problem and state necessary conditions for its
solution by using the well-known Kuhn-Tucker theorem. In order to
follow this procedure, we have to make some simplifying assumptions
and hence will obtain only a restricted form of the discrete maximum
principle. In Sect. 8.3, we state without proof a more general form of the
discrete maximum principle.

8.1 Nonlinear Programming Problems


We begin by stating a general form of a nonlinear programming prob-
lem. Let x be an n-component column vector, a an r-component col-
umn vector, and b an s-component column vector. Let the functions


h : E n → E 1 , f : E n → E r , and g : E n → E s be continuously dif-


ferentiable. We assume functions f and g to be column vectors with r
and s components, respectively. We consider the nonlinear programming
problem:
max h(x) (8.1)
subject to r equality constraints and s inequality constraints given, re-
spectively, by

f (x) = a, (8.2)
g(x) ≥ b. (8.3)

Next we develop necessary conditions, called the Kuhn-Tucker condi-


tions, which a solution x∗ to this problem must satisfy. We start with
simpler problems and work up to the statement of these conditions for
the general problem in a heuristic fashion. References are given for rig-
orous developments of these results.
In this chapter, whenever we take derivatives of functions, we assume
that those derivatives exist and are continuous. It would be also helpful
to recall the notation developed in Sect. 1.4.

8.1.1 Lagrange Multipliers


Suppose we want to solve (8.1) without imposing constraints (8.2) or
(8.3). The problem is now the classical unconstrained maximization
problem of calculus, and the first-order necessary conditions for its solu-
tion are
hx = 0. (8.4)
The points satisfying (8.4) are called critical points. Critical points which
are maxima, minima, or saddle points are of interest in this book. Ad-
ditional higher-order conditions required to determine whether a critical
point is a maximum or a minimum are stated in Exercise 8.2. In an
important case when the function h is concave, condition (8.4) is also
sufficient for a global maximum of h.
Suppose we want to solve (8.1) while imposing just the equality con-
straints (8.2). The method of Lagrange multipliers permits us to obtain
the necessary conditions that a solution to the constrained maximization
problem (8.1) and (8.2) must satisfy. We define the Lagrangian function

L(x, λ) = h(x) + λ[f (x) − a], (8.5)



where λ is an r-component row vector. The necessary condition for x∗


to be a (maximum) solution to (8.1) and (8.2) is that there exists an
r-component row vector λ such that (x∗ , λ) satisfy the equations
Lx = hx + λfx = 0, (8.6)
Lλ = f (x) − a = 0. (8.7)
Note that (8.7) states simply that x∗ is feasible according to (8.2).
The system of n + r Eqs. (8.6) and (8.7) has n + r unknowns. Since
some or all of the equations are nonlinear, the solution method will, in
general, involve nonlinear programming techniques, and may be difficult.
In other cases, e.g., when h is linear and f is quadratic, it may only
involve the solution of linear equations. Once a solution (x∗ , λ) is found
satisfying the necessary conditions (8.6) and (8.7), the solution must still
be checked to see whether it satisfies sufficient conditions for a global
maximum. Such sufficient conditions will be stated in Sect. 8.1.4.
Suppose (x∗ , λ) is in fact a solution to equations (8.6) and (8.7).
Note that x∗ depends on a and we can show this dependence by writing
x∗ = x∗ (a). Now h∗ = h∗ (a) = h(x∗ (a)) is the optimum value of the
objective function. By differentiating h∗ (a) with respect to a and using
(8.6), we obtain

h^∗_a = h_x (dx^∗/da) = −λ f_x (dx^∗/da).

But by differentiating (8.7) with respect to a at x = x^∗(a), we get

f_x (dx^∗/da) = 1,
and therefore we have
h∗a = −λ. (8.8)
We can see that the Lagrange multipliers have an important managerial
interpretation, namely, λi is the negative of the imputed value or shadow
price of having one unit more of the resource ai . In Exercise 8.4 you are
asked to provide a proof of (8.8).

Example 8.1 Consider the two-dimensional problem:






    max{ h(x, y) = −x² − y² }

    subject to

    2x + y = 10.

Solution We form the Lagrangian

L(x, y, λ) = (−x2 − y 2 ) + λ(2x + y − 10).

The necessary conditions for an optimal solution (x∗ , y ∗ ) are that


(x∗ , y ∗ , λ) satisfy the equations

Lx = −2x + 2λ = 0,
Ly = −2y + λ = 0,
Lλ = 2x + y − 10 = 0.

From the first two equations we get λ = x = 2y. Solving this with the
last equation yields the quantities

x∗ = 4, y ∗ = 2, λ = 4, h∗ = −20,

which can be seen to give a maximum value to h, since h is concave


and the constraint set is convex. The interpretation of the Lagrange
multiplier λ = 4 can be obtained, to verify (8.8), by replacing the constant
10 by 10 + ε and expanding the objective function in a Taylor series; see
Exercise 8.5.
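
For readers who wish to experiment, the necessary conditions of this
example can be solved symbolically. The following SymPy sketch does so
and also verifies the shadow-price relation (8.8) by perturbing the
constraint to 2x + y = 10 + ε:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam')
L = (-x**2 - y**2) + lam * (2*x + y - 10)

sol = sp.solve([sp.diff(L, x), sp.diff(L, y), sp.diff(L, lam)],
               [x, y, lam], dict=True)[0]
print(sol)                                    # {x: 4, y: 2, lam: 4}
print((-x**2 - y**2).subs(sol))               # h* = -20

# Shadow-price check of (8.8): perturb the constraint to 2x + y = 10 + eps
eps = sp.Symbol('eps')
L_eps = (-x**2 - y**2) + lam * (2*x + y - 10 - eps)
sol_eps = sp.solve([sp.diff(L_eps, x), sp.diff(L_eps, y), sp.diff(L_eps, lam)],
                   [x, y, lam], dict=True)[0]
h_eps = (-x**2 - y**2).subs(sol_eps)
print(sp.diff(h_eps, eps).subs(eps, 0))       # -4 = -lam, as in (8.8)
```
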

8.1.2 Equality and Inequality Constraints


Now suppose we want to solve the problem defined by (8.1)–(8.3). As
before, we define the Lagrangian

L(x, λ, μ) = h(x) + λ[f (x) − a] + μ[g(x) − b]. (8.9)

The Kuhn-Tucker necessary conditions for this problem cannot be as


easily derived as for the equality-constrained problem in the preceding
section. We will write them first, and then give interpretations to make
them plausible. The necessary conditions for x∗ to be a solution of (8.1)–
(8.3) are that there exist an r-dimensional vector λ and an s-dimensional
row vector μ such that

Lx = hx + λfx + μgx = 0, (8.10)


f = a, (8.11)
g ≥ b, (8.12)
μ ≥ 0, μ(g − b) = 0. (8.13)

Note that g is appended in (8.10) in the same way f is appended in


(8.6). Also (8.12) repeats the inequality constraint (8.3) in the same way
that (8.11) repeats the equality constraint (8.2). However, the conditions
in (8.13) are new and particular to the inequality-constrained problem.
We will see that they include some of the boundary points of the feasible
set of points as well as unconstrained maximum solution points, as can-
didates for the solution to the maximum problem. This is best brought
out by examples.

Example 8.2 Solve the problem:






    max{ h(x) = 8x − x² }

    subject to

    x ≥ 2.

Solution We form the Lagrangian

L(x, μ) = 8x − x2 + μ(x − 2).

The necessary conditions (8.10)–(8.13) become

Lx = 8 − 2x + μ = 0, (8.14)
x − 2 ≥ 0, (8.15)
μ ≥ 0, μ(x − 2) = 0. (8.16)

Observe that the constraint μ(x − 2) = 0 in (8.16) can be phrased as:


either μ = 0 or x = 2. We treat these two cases separately.

Case 1: μ = 0. From (8.14) we get x = 4, which also satisfies (8.15).


Hence, this solution, which makes h(4) = 16, is a possible candidate for
the maximum solution.

Case 2: x = 2. Here from (8.14) we get μ = −4, which does not satisfy
the inequality μ ≥ 0 in (8.16).

From these two cases we conclude that the optimum solution is x∗ = 4


and h∗ = h(x∗ ) = 16.

Example 8.3 Solve the problem:






    max{ h(x) = 8x − x² }

    subject to

    x ≥ 6.

Solution The Lagrangian is

L(x, μ) = 8x − x2 + μ(x − 6).

The necessary conditions are

Lx = 8 − 2x + μ = 0, (8.17)
x − 6 ≥ 0, (8.18)
μ ≥ 0, μ(x − 6) = 0. (8.19)

Again, the condition μ(x − 6) = 0 is an either-or relation which gives


two cases.

Case 1: μ = 0. From (8.17) we obtain x = 4, which does not satisfy


(8.18), so this case is infeasible.

Case 2: x = 6. Obviously (8.18) holds. From (8.17) we get μ = 4, so


(8.19) holds as well. The optimal solution is then

x∗ = 6, h∗ = h(x∗ ) = 12,

since it is the only solution satisfying the necessary conditions.

The examples above involve only one variable, and are relatively
obvious. The next example, which is two-dimensional, will reveal more
of the power and the difficulties of applying the Kuhn-Tucker conditions.

Example 8.4 Find the shortest distance between the point (2,2) and
the upper half of the semicircle of radius one with its center at the origin,
shown as the curve in Fig. 8.1. In order to simplify the calculation, we
minimize h, the square of the distance. Hence, the problem can be stated

as the following nonlinear programming problem:




    max{ −h(x, y) = −(x − 2)² − (y − 2)² }

    subject to

    x² + y² = 1,

    y ≥ 0.

The Lagrangian function for this problem is

L = −(x − 2)² − (y − 2)² + λ(x² + y² − 1) + μy.  (8.20)

The necessary conditions are

−2(x − 2) + 2λx = 0,  (8.21)
−2(y − 2) + 2λy + μ = 0,  (8.22)
x² + y² − 1 = 0,  (8.23)
y ≥ 0,  (8.24)
μ ≥ 0, μy = 0.  (8.25)

First, we conclude that λ ≠ 0, since otherwise λ = 0 would imply
x = 2 from (8.21), which would contradict (8.23). Next, from (8.25) we
conclude that either μ = 0 or y = 0. If μ = 0, then from (8.21) and (8.22),
we get x = y. Solving the equation x = y together with x² + y² = 1 gives:

(a) (1/√2, 1/√2) and h = −(9 − 4√2).

If y = 0, then solving with x² + y² = 1 gives:

(b) (1, 0) and h = −5,

(c) (−1, 0) and h = −13.

These three points are shown in Fig. 8.1. Of the three points found that
satisfy the necessary conditions, clearly the point (1/√2, 1/√2) found in
(a) is the nearest point and solves the closest-point problem. The point
(−1, 0) in (c) is in fact the farthest point; and the point (1, 0) in (b)
is neither the closest nor the farthest point. The associated multiplier
values can be easily computed, and these are: (a) λ = 1 − 2√2, μ = 0;
(b) λ = −1, μ = 4; and (c) λ = 3, μ = 4.


Figure 8.1: Shortest distance from point (2,2) to the semicircle

The fact that there are three points satisfying the necessary condi-
tions, and only one of them actually solves the problem at hand, empha-
sizes that the conditions are only necessary and not sufficient. In every
case it is important to check the solutions to the necessary conditions to
see which of the solutions provides the optimum.
Next we work two examples that show some technical difficulties that
can arise in the application of the Kuhn-Tucker conditions.

Example 8.5 Consider the problem:

max{h(x, y) = y} (8.26)

subject to

(1 − y)³ − x² ≥ 0,  (8.27)
x ≥ 0,  (8.28)
y ≥ 0.  (8.29)

The set of points satisfying the constraints is shown shaded in Fig. 8.2.
From the figure it is obvious that the solution point (0,1) maximizes the
value of y.
Hence, the optimum solution is (x∗ , y ∗ ) = (0, 1) and h∗ = 1. Let us
see if we can find it using the above procedure. The Lagrangian is

L = y + λ[(1 − y)³ − x²] + μx + νy,  (8.30)



Figure 8.2: Graph of Example 8.5

so that the necessary conditions are

L_x = −2xλ + μ = 0,  (8.31)
L_y = 1 − 3λ(1 − y)² + ν = 0,  (8.32)
λ ≥ 0, λ[(1 − y)³ − x²] = 0,  (8.33)
μ ≥ 0, μx = 0,  (8.34)
ν ≥ 0, νy = 0,  (8.35)
together with (8.27)–(8.29). Let us check if these conditions hold at the
point (0,1). At y = 1, the constraint y ≥ 0 is not active, and we have
ν = 0. With ν = 0 and y = 1, (8.32) cannot be satisfied.
The reason for failure of the method in Example 8.5 is that the con-
straints do not satisfy what is called the constraint qualification. A com-
plete study of the topic is beyond the scope of this book, but we state
in the next section a constraint qualification sufficient for our purposes.
For further information, see Mangasarian (1969).

8.1.3 Constraint Qualification


Example 8.5 shows the need for imposing some kind of condition to rule
out features such as the cusp at (0, 1) in Fig. 8.2 on the boundary of the
constraint set. One way to accomplish this is to assume that the gradi-
ents of the equality constraints and of the active inequality constraints at
the candidate point under consideration are linearly independent. Equiv-
alently, we say that the constraints (8.2) and (8.3) satisfy the constraint
qualification at x if the following full-rank condition holds at x, that is,
       ⎡ ∂g/∂x   diag(g) ⎤
rank   ⎢                 ⎥ = min(s + r, s + n),  (8.36)
       ⎣ ∂f/∂x     0     ⎦

where ∂g/∂x and ∂f/∂x are s × n and r × n gradient matrices, respectively,
as defined in Sect. 1.4.3; the notation diag(g) refers to the diagonal
s × s matrix

diag(g) = diag(g_1, g_2, . . . , g_s).

The matrix in (8.36) is therefore an (s + r) × (s + n) matrix.


Let us now return to Example 8.5 and examine whether the con-
straints (8.27)–(8.29) satisfy the constraint qualification at point (0,1).
In this example, s = 3, r = 0 and n = 2, and the matrix in (8.36) is

⎡ −2x   −3(1 − y)²   (1 − y)³ − x²   0   0 ⎤   ⎡ 0 0 0 0 0 ⎤
⎢  1         0              0        x   0 ⎥ = ⎢ 1 0 0 0 0 ⎥
⎣  0         1              0        0   y ⎦   ⎣ 0 1 0 0 1 ⎦

at point (x, y) = (0, 1). It has a null vector in the first row, and therefore
its rows are not linearly independent; see Sect. 1.4.10. Thus, it does
not have a full rank of three, and the condition (8.36) does not hold.
Alternatively, note that the inequality constraints (8.27) and (8.28) are
active at point (x, y) = (0, 1), and their respective gradients
(−2x, −3(1 − y)²) = (0, 0) and (1, 0) at that point are clearly not linearly independent.
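
This rank test is easy to mechanize. The following NumPy sketch builds
the matrix of (8.36) for Example 8.5 at the point (0, 1) and confirms
that the full-rank condition fails there:

```python
import numpy as np

x, y = 0.0, 1.0
g = np.array([(1 - y)**3 - x**2, x, y])        # g1, g2, g3 at (0, 1)
dg = np.array([[-2*x, -3*(1 - y)**2],          # gradients of (8.27)-(8.29)
               [1.0, 0.0],
               [0.0, 1.0]])

# The matrix of (8.36); there are no equality constraints here (r = 0)
Mq = np.hstack([dg, np.diag(g)])
print(Mq)
print("rank =", np.linalg.matrix_rank(Mq), " required =", min(3 + 0, 3 + 2))
```
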

8.1.4 Theorems from Nonlinear Programming


In order to derive our version of the discrete maximum principle, we
use two well-known results from nonlinear programming. These provide
sufficient and necessary conditions for the problem given by (8.1)–(8.3).
The Lagrangian function for this problem is

L(x, λ, μ) = h + λ(f (x) − a) + μ(g(x) − b), (8.37)

where λ and μ are row vectors of multipliers associated with the con-
straints (8.2) and (8.3), respectively. We now state two theorems whose
proofs can be found in Mangasarian (1969).

Theorem 8.1 (Necessary Conditions) If h, f, and g are differen-


tiable, x∗ solves (8.1)–(8.3), and the constraint qualification (8.36) holds
at x∗ , then there exist multipliers λ and μ such that (x∗ , λ, μ) satisfy the
Kuhn-Tucker conditions

L_x(x^∗, λ, μ) = h_x(x^∗) + λf_x(x^∗) + μg_x(x^∗) = 0,  (8.38)
f(x^∗) = a,  (8.39)
g(x^∗) ≥ b,  (8.40)
μ ≥ 0, μ(g(x^∗) − b) = 0.  (8.41)

Theorem 8.2 (Sufficient Conditions) If h, f, and g are differen-


tiable, f is affine, g is concave, and (x∗ , λ, μ) satisfy the conditions
(8.38)–(8.41), then x∗ is a solution to the maximization problem (8.1)–
(8.3).

8.2 A Discrete Maximum Principle


We will now use the nonlinear programming results of the previous sec-
tion to derive a special form of the discrete maximum principle. Some
references in this connection are Luenberger (1972), Mangasarian and
Fromovitz (1967), and Ravn (1999). A more general discrete maximum
principle will be stated in Sect. 8.3.

8.2.1 A Discrete-Time Optimal Control Problem


In order to state a discrete-time optimal control problem over the periods
0, 1, 2, . . . , T, we define the following:

Θ = the set {0, 1, 2, . . . , T − 1},
x^k = an n-component column state vector, k = 0, 1, . . . , T,
u^k = an m-component column control vector, k = 0, 1, . . . , T − 1,
b^k = an s-component column vector of constants, k = 0, 1, . . . , T − 1.

Here, the state xk is assumed to be measured at the beginning of


period k and control uk is implemented during period k. This convention
is depicted in Fig. 8.3.


Figure 8.3: Discrete-time conventions

We also define continuously differentiable functions f : E^n × E^m × Θ → E^n,
F : E^n × E^m × Θ → E^1, g : E^m × Θ → E^s, and S : E^n × (Θ ∪ {T}) → E^1.
Then, a discrete-time optimal control problem in the Bolza form (see
Sect. 2.1.4) is:
max J = Σ_{k=0}^{T−1} F(x^k, u^k, k) + S(x^T, T)  (8.42)

subject to the difference equation

Δx^k = x^{k+1} − x^k = f(x^k, u^k, k), k = 0, . . . , T − 1, x^0 given,  (8.43)

and the constraints

g(uk , k) ≥ bk , k = 0, . . . , T − 1. (8.44)

In (8.43) the term Δx^k = x^{k+1} − x^k is known as the difference operator.


This problem is clearly a special case of the nonlinear programming
problem (8.1)–(8.3) with x = (x1 , x2 , . . . , xT , u0 , u1 , . . . , uT −1 ) as the (n+
m)T vector of variables, nT equality constraints (8.43), and sT inequality
constraints (8.44).

8.2.2 A Discrete Maximum Principle


We now apply the nonlinear programming theory of Sect. 8.1 to find
necessary conditions for the solution to the Mayer form of the control
problem of Sect. 8.2.1.
We let λk+1 be an n-component row vector of Lagrange multipliers,
which we rename adjoint variables and associate with Eq. (8.43). Sim-
ilarly, we let μk be an s-component row vector of Lagrange multipliers
associated with constraint (8.44). These multipliers are defined for each
time k = 0, . . . , T − 1.

The Lagrangian function of the problem is

L = Σ_{k=0}^{T−1} F(x^k, u^k, k) + S(x^T, T) + Σ_{k=0}^{T−1} λ^{k+1} [f(x^k, u^k, k) − x^{k+1} + x^k]

      + Σ_{k=0}^{T−1} μ^k [g(u^k, k) − b^k].  (8.45)

We now define the Hamiltonian function H^k to be

H^k = H(x^k, u^k, λ^{k+1}, k) = F(x^k, u^k, k) + λ^{k+1} f(x^k, u^k, k).  (8.46)

Using (8.46) we can rewrite (8.45) as

L = S(x^T, T) + Σ_{k=0}^{T−1} [H^k − λ^{k+1}(x^{k+1} − x^k)] + Σ_{k=0}^{T−1} μ^k [g(u^k, k) − b^k].  (8.47)

We can now apply the Kuhn-Tucker conditions (8.38)–(8.41). Conditions
(8.39) and (8.40) in this case give (8.43) and (8.44), respectively.
Application of (8.38) results in (8.48)–(8.50) below and application of
(8.41) gives the complementary slackness conditions (8.51) below.
By differentiating (8.47) with respect to x^k for k = 1, 2, . . . , T − 1,
we obtain

∂L/∂x^k = ∂H^k/∂x^k − λ^k + λ^{k+1} = 0,

which upon rearranging terms becomes

Δλ^k = λ^{k+1} − λ^k = −∂H^k/∂x^k,  k = 0, 1, . . . , T − 1.  (8.48)

By differentiating (8.47) with respect to x^T, we get

∂L/∂x^T = ∂S/∂x^T − λ^T = 0, or λ^T = ∂S/∂x^T.  (8.49)
The difference equations (8.48) with terminal boundary conditions (8.49)
are called the adjoint equations.
By differentiating L with respect to u^k and stating the corresponding
Kuhn-Tucker conditions for the multiplier μ^k and constraint (8.44), we have

∂L/∂u^k = ∂H^k/∂u^k + μ^k ∂g/∂u^k = 0,

or

∂H^k/∂u^k = −μ^k ∂g/∂u^k,  (8.50)

and

μ^k ≥ 0,  μ^k [g(u^k, k) − b^k] = 0.  (8.51)
We note that, provided H k is concave in uk , g(uk , k) is concave in uk , and
the constraint qualification holds, then conditions (8.50) and (8.51) are
precisely the necessary and sufficient conditions for solving the following
Hamiltonian maximization problem:



    max_{u^k} H^k

    subject to                                              (8.52)

    g(u^k, k) ≥ b^k.

We have thus derived the following restricted form of the discrete maxi-
mum principle.
Theorem 8.3 If for every k, H k in (8.46) and g(uk , k) are concave in
uk , and the constraint qualification holds, then the necessary conditions
for uk∗ , k = 0, 1, . . . , T − 1, to be an optimal control for the problem
(8.42)–(8.44), with the corresponding state xk∗ , k = 0, 1, . . . , T, are




    Δx^{k∗} = f(x^{k∗}, u^{k∗}, k),  x^0 given,

    Δλ^k = −∂H^k/∂x^k [x^{k∗}, u^{k∗}, λ^{k+1}, k],  λ^T = ∂S(x^{T∗}, T)/∂x^T,

    H^k(x^{k∗}, u^{k∗}, λ^{k+1}, k) ≥ H^k(x^{k∗}, u^k, λ^{k+1}, k)

    for all u^k such that g(u^k, k) ≥ b^k,  k = 0, 1, . . . , T − 1.
                                                            (8.53)
Section 8.2.3 gives examples of the application of this maximum prin-
ciple (8.53). In Sect. 8.3 we state a more general discrete maximum prin-
ciple.

8.2.3 Examples
Our first example will be similar to Example 2.4 and it will be solved
completely. The reader will note that the solutions of the continuous
and discrete problems are very similar. The second example is a discrete
version of the production-inventory problem of Sect. 6.1.

Example 8.6 Consider the discrete-time optimal control problem:


max J = −Σ_{k=1}^{T−1} (1/2)(x^k)²  (8.54)

subject to

Δx^k = u^k,  x^0 = 5,  (8.55)

u^k ∈ Ω = [−1, 1].  (8.56)

We will solve this problem for T = 6 and T ≥ 7.

Solution The Hamiltonian is

H^k = −(1/2)(x^k)² + λ^{k+1} u^k,  (8.57)

from which it is obvious that the optimal policy is bang-bang. Its form is

u^{k∗} = bang[−1, 1; λ^{k+1}] = { 1 if λ^{k+1} > 0;  singular if λ^{k+1} = 0;  −1 if λ^{k+1} < 0 }.  (8.58)

Let us assume, as we did in Example 2.4, that λ^k < 0 as long as x^k
is positive so that u^k = −1. Given this assumption, (8.55) becomes
Δx^k = −1, whose solution is

x^{k∗} = −k + 5 for k = 1, 2, . . . , T − 1.  (8.59)

By differentiating (8.57), we obtain the adjoint equation

Δλ^k = −∂H^k/∂x^k |_{x^{k∗}} = x^{k∗},  λ^T = 0.  (8.60)

Let us assume T = 6. Substitute (8.59) into (8.60) to obtain

Δλ^k = −k + 5,  λ^6 = 0.

From Sect. A.5, we find the solution to be

λ^k = −(1/2)k² + (11/2)k + c,

where c is a constant. Since λ^6 = 0, we can obtain the value of c by
setting k = 6 in the above equation. Thus,

λ^6 = −(1/2)(36) + (11/2)(6) + c = 0 ⇒ c = −15,

so that

λ^k = −(1/2)k² + (11/2)k − 15.  (8.61)

A sketch of the values for λ^k and x^k appears in Fig. 8.4. Note that
λ^5 = 0, so that the control u^4 is singular. However, since x^4 = 1 we
choose u^4 = −1 in order to bring x^5 down to 0.
The solution of the problem for T ≥ 7 is carried out in the same
way that we solved Example 2.4. Namely, observe that x5∗ = 0 and
λ5 = λ6 = 0, so that the control is singular. We simply make λk = 0 for
k ≥ 7 so that uk∗ = 0 for all k ≥ 7. It is clear without a formal proof
that this maximizes (8.54).
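
The solution just obtained can be verified in a few lines of code. The
sketch below rolls the state forward under the proposed controls, runs the
adjoint recursion (8.60) backward from λ^6 = 0, and checks consistency
with the bang-bang rule (8.58):

```python
T = 6
# Proposed solution: u^k = -1 for k = 0,...,4 (u^4 is singular, chosen as -1
# to bring x^5 to 0), and u^5 = 0.
u = [-1, -1, -1, -1, -1, 0]

x = [5]
for k in range(T):
    x.append(x[k] + u[k])                 # state equation: Δx^k = u^k
print("x^k:", x)                          # [5, 4, 3, 2, 1, 0, 0]

lam = [0.0] * (T + 1)                     # terminal condition λ^6 = 0
for k in range(T - 1, -1, -1):
    lam[k] = lam[k + 1] - x[k]            # adjoint (8.60): Δλ^k = x^k
print("λ^k:", lam)                        # matches (8.61): -k²/2 + 11k/2 - 15

# Bang-bang consistency with (8.58): u^k = -1 wherever λ^{k+1} < 0
for k in range(T):
    assert (lam[k + 1] < 0 and u[k] == -1) or lam[k + 1] == 0
```
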

Example 8.7 Let us consider a discrete version of the production-


inventory example of Sect. 6.1; see Kleindorfer et al. (1975). Let I^k,
P^k, and S^k be the inventory, production, and demand at time k, respectively.
Let I^0 be the initial inventory, let Î and P̂ be the goal levels of
inventory and production, and let h and c be inventory and production
cost coefficients. The problem is:

max_{P^k ≥ 0} J = −Σ_{k=0}^{T−1} (1/2)[h(I^k − Î)² + c(P^k − P̂)²]  (8.62)

subject to

ΔI^k = P^k − S^k,  k = 0, 1, . . . , T − 1,  I^0 given.  (8.63)

Form the Hamiltonian

H^k = −(1/2)[h(I^k − Î)^2 + c(P^k − P̂)^2] + λ^{k+1}(P^k − S^k),   (8.64)
where the adjoint variable satisfies

Δλ^k = −∂H^k/∂I^k = h(I^k − Î),   λ^T = 0.                    (8.65)

Figure 8.4: Optimal state x^k and adjoint λ^k

To maximize the Hamiltonian, let us differentiate (8.64) to obtain

∂H^k/∂P^k = −c(P^k − P̂) + λ^{k+1} = 0.

Since production must be nonnegative, we obtain the optimal production
as
P^{k∗} = max[0, P̂ + λ^{k+1}/c].                               (8.66)
Expressions (8.63), (8.65), and (8.66) determine a two-point bound-
ary value problem. For a given set of data, it can be solved numerically
by using spreadsheet software such as Excel; see Sect. 2.5 and Exercise 8.21.
If the constraint P^k ≥ 0 is dropped, it can be solved analytically by the
method of Sect. 6.1, with difference equations replacing the differential
equations used there.
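As an alternative to a spreadsheet, the boundary value problem can be solved by shooting on the unknown initial adjoint value λ^0: simulate (8.63), (8.65), and (8.66) forward and adjust λ^0 until λ^T = 0. A minimal Python sketch under the data of Exercise 8.21 (the code is illustrative, not the text's procedure; λ^T is increasing in λ^0 for this linear-quadratic system, so bisection converges):

def simulate(lam0, I0=1.0, Ihat=15.0, Phat=30.0, h=1.0, c=1.0, T=8):
    # Forward pass: adjoint step (8.65), production rule (8.66), and
    # inventory step (8.63), with the demand data of Exercise 8.21.
    S = [k**3 - 12*k**2 + 32*k + 30 for k in range(T)]
    I, lam, P = [I0], [lam0], []
    for k in range(T):
        lam.append(lam[k] + h * (I[k] - Ihat))      # Delta lambda^k = h(I^k - Ihat)
        P.append(max(0.0, Phat + lam[k + 1] / c))   # P^{k*} = max[0, Phat + lambda^{k+1}/c]
        I.append(I[k] + P[k] - S[k])                # Delta I^k = P^k - S^k
    return I, P, lam

# Bisection on lambda^0 so that the terminal condition lambda^T = 0 holds.
lo, hi = -200.0, 200.0
while hi - lo > 1e-9:
    mid = 0.5 * (lo + hi)
    if simulate(mid)[2][-1] < 0.0:
        lo = mid
    else:
        hi = mid

I, P, lam = simulate(0.5 * (lo + hi))
print([round(p, 1) for p in P])   # the optimal production plan P^{k*}; compare with Fig. 6.1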

8.3 A General Discrete Maximum Principle


For the maximum principle (8.53) we assumed that H k and g were con-
cave in uk so that the set of admissible controls was convex. These are
fairly strong assumptions which will now be relaxed and a general max-
imum principle stated. The proof can be found in Canon et al. (1970).
Other references on discrete maximum principles are Halkin (1966) and
Holtzman (1966). The problem to be solved is:
max  J = Σ_{k=0}^{T−1} F(x^k, u^k, k)                         (8.67)

subject to

Δx^k = f(x^k, u^k, k),   x^0 given,

u^k ∈ Ω^k,   k = 0, 1, . . . , T − 1.                         (8.68)

Assumptions required are:

(i) F (xk , uk , k) and f (xk , uk , k) are continuously differentiable in xk


for every uk and k.

(ii) The sets {−F(x, Ω^k, k), f(x, Ω^k, k)} are b-directionally convex for
every x and k, where b = (−1, 0, . . . , 0). That is, given v and w in
Ω^k and 0 ≤ λ ≤ 1, there exists u(λ) ∈ Ω^k such that

F (x, u(λ), k) ≥ λF (x, v, k) + (1 − λ)F (x, w, k)

and
f (x, u(λ), k) = λf (x, v, k) + (1 − λ)f (x, w, k)
for every x and k. It should be noted that convexity implies b-
directional convexity, but not the converse.

(iii) Ωk satisfies the Kuhn-Tucker constraint qualification.

With these assumptions replacing the assumptions of Theorem 8.3,
and since there is no salvage value term in (8.67), meaning that
S(x^T, T) ≡ 0, the maximum principle (8.53) with λ^T = 0 holds with the
control constraint set g(u^k, k) ≥ b^k replaced by u^k ∈ Ω^k. When the sal-
vage function S(x^T, T) is not identically zero, the objective function in

(8.67) is replaced by the Bolza form objective function (8.42). In Ex-


ercise 8.20, you are asked to convert the problem defined by (8.42) and
(8.68) to its Lagrange form, and then obtain the corresponding assump-
tions on the salvage value function S(xT , T ) for the results of this section
to apply. For a fixed-end-point problem, i.e., when xT is also given in
(8.68), the more general maximum principle holds with λT a constant to
be determined. Exercise 8.17 is an example of a fixed-end-point problem.
Finally, when there are lags in the system dynamics, i.e., when the state
of the system in a period depends not only on the state and the control
in the previous period, but also on the values of these variables in prior
periods, it is easy to adapt the discrete maximum principle to deal with
such systems; see Burdet and Sethi (1976). Exercise 8.22 presents an
advertising model containing lags in its sales-advertising dynamics.
Some concluding remarks on the applications of discrete-time optimal
control problems are appropriate. Real-life examples that can be mod-
eled as such problems include the following: payments of principal and
interest on loans; harvesting of crops; production planning for monthly
demands; etc. Such problems would require efficient computational pro-
cedures for their solution. Some references dealing with computational
methods for discrete optimal control problems are Murray and Yakowitz
(1984), Dunn and Bertsekas (1989), Pantoja and Mayne (1991), Wright
(1993), and Dohrmann and Robinett (1999). Discrete-time optimal
control theory is also important because computers are increasingly
being used in the control of dynamic systems.
Finally, Pepyne and Cassandras (1999) have explored an optimal con-
trol approach to treat discrete event dynamic systems (DEDS). They also
apply the approach to a transportation problem, modeled as a polling
system.

Exercises for Chapter 8

E 8.1 Determine the critical points of the following functions:

(a) h(y, z) = −5y 2 − z 2 + 10y + 6z + 27,

(b) h(y, z) = 5y 2 − yz + z 2 − 10y − 18z + 17.

E 8.2 Let h be twice differentiable with its Hessian matrix defined to
be H = h_{xx}. Let x̄ be a critical point, i.e., a solution of h_x = 0. Let H_j
be the j × j submatrix found in the first j rows and the first j columns
of H, and let |H_j| be its determinant, the jth leading principal minor.
Then, x̄ is a local maximum of h if

|H_1| < 0, |H_2| > 0, |H_3| < 0, . . . , (−1)^n |H_n| = (−1)^n |H| > 0

evaluated at x̄, and x̄ is a local minimum of h if

|H_1| > 0, |H_2| > 0, |H_3| > 0, . . . , |H_n| = |H| > 0

evaluated at x̄. Apply these conditions to Exercise 8.1 to identify local
minima and maxima of the functions in (a) and (b).
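For readers who wish to automate this test, the following small Python sketch (ours, for illustration; it uses NumPy, and the tolerance is arbitrary) evaluates the leading principal minors of a given Hessian:

import numpy as np

def classify(H, tol=1e-12):
    # Leading principal minors |H_1|, ..., |H_n| of the Hessian H.
    n = H.shape[0]
    minors = [np.linalg.det(H[:j, :j]) for j in range(1, n + 1)]
    if all(m > tol for m in minors):
        return "local minimum"
    if all((-1)**j * m > tol for j, m in enumerate(minors, start=1)):
        return "local maximum"
    return "test inconclusive"

print(classify(np.array([[2.0, 1.0], [1.0, 3.0]])))   # local minimum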

E 8.3 Find the optimal speed in cases (a) and (b) below:
(a) During times of an energy crisis, it is important to economize on
fuel consumption. Assume that when traveling x mile/hour in high
gear, a truck burns fuel at the rate of

(1/500)(2500/x + x) gallons/mile.
If fuel costs 50 cents per gallon, find the speed that will minimize
the cost of fuel for a 1000 mile trip. Check the second-order con-
dition.

(b) When the government imposed this optimal speed in 1974, truck
drivers became so angry that they staged blockades on several free-
ways around the country. To explain the reason for these blockades,
we found that a crucial figure was the hourly wage of the truckers,
estimated at $3.90 per hour at that time. Recompute a speed that
will minimize the total cost of fuel and the driver’s wages for the
same trip. You do not need to check for the second-order condition.

E 8.4 Use (8.5)–(8.7) to derive Eq. (8.8).

E 8.5 Verify Eq. (8.8) in Example 8.1 by determining h∗ (a) and expand-
ing the function h∗ (10 + ) in a Taylor series around the value 10.

E 8.6 Maximize h(x) = (1/3)x3 − 6x2 + 32x + 5 subject to each of the


following constraints:
(a) x ≤ 6

(b) x ≤ 20.

E 8.7 Rework Example 8.4 by replacing (2, 2) with each of the following
points:

(a) (0, −1)

(b) (1/2, 1/2).

E 8.8 Add the equality constraint 2x = y to the problem in Example 8.4


and solve it.

E 8.9 Solve the problem:






max h(x, y)

subject to

x^2 ≤ (2 − y)^3,   y ≥ 0,

for (a) h(x, y) = x + y, (b) h(x, y) = x + 2y, and (c) h(x, y) = x + 3y.
Comment on the solution in each of the cases (a), (b), and (c).

E 8.10 Constraint Qualification. Show that the feasible region in two


dimensions, determined by the constraints (1 − x)3 − y ≥ 0, x ≥ 0, and
y ≥ 0, does not satisfy the constraint qualification (8.36) at the boundary
point (1,0). Also sketch the feasible region to see the presence of a cusp
at point (1, 0).

E 8.11 Constraint Qualification. Show that the feasible region in two


dimensions, determined by the constraints x2 +y 2 ≤ 1, x ≥ 0, and y ≥ 0,
satisfies the constraint qualification (8.36) at the boundary point (1,0).
Also sketch the feasible region to contrast it with that in Exercise 8.10.

E 8.12 Solve graphically the problem of minimizing x subject to the


constraints
1 − x ≥ 0, y ≥ 0, x3 − y ≥ 0.
Show that the constraints do not satisfy the constraint qualification
(8.36) at the optimal point.

E 8.13 Rewrite the maximum principle (8.53) for the special case of the
linear Mayer form problem obtained when F ≡ 0 and S(xT , T ) = cxT ,
where c is an n-component row vector of constants.

E 8.14 Show that the necessary conditions for uk to be an optimal so-


lution for (8.52) are given by (8.50) and (8.51).

E 8.15 Prove Theorem 8.3.

E 8.16 Formulate and solve a discrete-time version of the cash balance


model of Sect. 5.1.1.

E 8.17 Minimum Fuel Problem. Consider the problem:


min  J = Σ_{k=0}^{T−1} |u^k|

subject to

Δx^k = Ax^k + bu^k,   x^0 and x^T given,

u^k ∈ [−1, 1],   k = 0, 1, . . . , T − 1,

where A is a given matrix. Obtain the expression for the adjoint variable
and the form of the optimal control.

E 8.18 Current-Value Formulation. Obtain the current-value formula-


tion of the discrete maximum principle. Assume that r is the discount
rate, i.e., 1/(1 + r) is the discount factor.

E 8.19 Convert the Bolza form problem (8.42)–(8.44) to the equiva-


lent linear Mayer form; see Sect. 2.1.4 for a similar conversion in the
continuous-time case.

E 8.20 Convert the problem defined by (8.42) and (8.68) to its La-
grange form. Then, obtain the assumptions on the salvage value function
S(xT , T ) so that the results of Sect. 8.3 apply. Under these assumptions,
state the maximum principle for the Bolza form problem defined by
(8.42) and (8.68).

E 8.21 Use Excel to solve the production planning problem given by


(8.62) and (8.63) with I^0 = 1, P̂ = 30, Î = 15, h = c = 1, T = 8, and
S^k = k^3 − 12k^2 + 32k + 30, k = 0, 1, 2, . . . , T − 1. This is a discrete-
time version of Example 6.1, so that you can compare your solution with
Fig. 6.1.

E 8.22 An Advertising Model (Burdet and Sethi 1976). Let x^k denote
the sales and u^k, k = 1, 2, . . . , T − 1, denote the amount of advertising in
period k. Formulate the sales-advertising dynamics as

Δx^k = −δx^k + r Σ_{l=0}^{k} f_{kl}(x^l, u^l),   x^0 given,

where δ and r are decay and response constants, respectively, and
f_{kl}(x^l, u^l) is a nonnegative function that decreases with x^l and increases
with u^l. In the special case when

f_{kl}(x^l, u^l) = γ_{lk} u^l,   γ_{lk} > 0,

obtain optimal advertising amounts to maximize the total discounted
profit given by

Σ_{k=1}^{T−1} (πx^k − u^k)(1 + ρ)^{−k},

where, as in Sect. 7.2.1, π denotes per unit sales revenue, ρ denotes the
discount rate, and the inequalities 0 ≤ u^k ≤ Q^k represent the restric-
tions on the advertising amount u^k. For the continuous-time version of
problems with lags, see Hartl and Sethi (1984b).
Chapter 9

Maintenance and
Replacement

The problem of simultaneously determining the lifetime of an asset or an


activity along with its management during that lifetime is an important
problem in practice. The most typical example is the problem of opti-
mal maintenance and replacement of a machine; see Rapp (1974) and
Pierskalla and Voelker (1976). Other examples occur in forest manage-
ment, such as in Näslund (1969), Clark (1976), and Heaps (1984), and
in advertising copy management, such as in Pekelman and Sethi (1978).
The first major work dealing with machine replacement problems ap-
peared in 1949 as a MAPI (Machinery and Applied Products Institute)
study by Terborgh (1949). For the most part, this study was confined to
those problems where the optimization was carried out only with respect
to the replacement lives of the machines under consideration. Boiteux
(1955) and Massé (1962) extended the single machine replacement prob-
lem to include the optimal timing of a partial replacement of the machine
before its actual retirement. Näslund (1966) was the first to solve a gen-
eralized version of the Boiteux problem by using the maximum principle.
He considered optimal preventive maintenance applied continuously over
the entire period instead of a single optimal partial replacement before
the machine is retired. Thompson (1968) presented a modification of
Näslund’s model which is described in the following section.


9.1 A Simple Maintenance and Replacement


Model
Consider a single machine whose resale value gradually declines over
time. Its output is assumed to be proportional to its resale value. By
applying preventive maintenance, it is possible to slow down the rate of
decline of the resale value. The control problem consists of simultane-
ously determining the optimal rate of preventive maintenance and the
sale date of the machine. Clearly this is an optimal control problem with
unspecified terminal time; see Sect. 3.1 and Example 3.6.

9.1.1 The Model


In order to define Thompson’s model, we use the following notation:
T = the sale date of the machine to be determined,
ρ = the constant discount rate,
x(t) = the resale value of the machine in dollars at time t; let
x(0) = x0 ,
u(t) = the preventive maintenance rate at time t (mainte-
nance here means money spent over and above the
minimum required for necessary repairs),
g(t) = the maintenance effectiveness function at time t (mea-
sured in dollars added to the resale value per dollar
spent on preventive maintenance),
d(t) = the obsolescence function at time t (measured in terms
of dollars subtracted from x at time t),
π = the constant production rate in dollars per unit time
per unit resale value; assume π > ρ or else it does not
pay to produce.
It is assumed that g(t) is a nonincreasing function of time and d(t)
is a nondecreasing function of time, and that for all t
u(t) ∈ Ω = [0, U ], (9.1)
where U is a positive constant.
The present value of the machine is the sum of two terms, the dis-
counted income (production minus maintenance) stream during its life
plus the discounted resale value at T :
J = ∫_0^T [πx(t) − u(t)]e^{−ρt} dt + x(T)e^{−ρT}.             (9.2)

The state variable x is affected by the obsolescence factor, the amount


of preventive maintenance, and the maintenance effectiveness function.
Thus,
ẋ(t) = −d(t) + g(t)u(t), x(0) = x0 . (9.3)
In the interests of realism we assume that

− d(t) + g(t)U ≤ 0, t ≥ 0. (9.4)

The assumption implies that preventive maintenance is not so effective


as to enhance the resale value of the machine over its previous values;
rather, it can at most slow down the decline of the resale value, even
when preventive maintenance is performed at the maximum rate U. A
modification of (9.3) is given in Arora and Lele (1970). See also Hartl
(1983b).
The optimal control problem is to maximize (9.2) subject to (9.1)
and (9.3).

9.1.2 Solution by the Maximum Principle


This problem is similar to Model Type (a) of Table 3.3 with the free-
end-point condition as in Row 1 of Table 3.1. Therefore, we follow the
steps for solution by the maximum principle stated in Chap. 3.
The standard Hamiltonian as formulated in Sect. 2.2 is

H = (πx − u)e−ρt + λ(−d + gu), (9.5)

where the adjoint variable λ satisfies

λ̇ = −πe−ρt , λ(T ) = e−ρT . (9.6)

Since T is unspecified, the required additional terminal condition (3.15)


for this problem is
− ρe−ρT x(T ) = −H, (9.7)
which must hold on the optimal path at time T.
The adjoint variable λ can be easily obtained by integrating (9.6),
i.e.,
λ(t) = e^{−ρT} + ∫_t^T πe^{−ρτ} dτ = e^{−ρT} + (π/ρ)[e^{−ρt} − e^{−ρT}].   (9.8)
The interpretation of λ(t) is as follows. It gives, in present value
terms, the marginal profit per dollar of gain in resale value at time t.

The first term represents the present value of one dollar of additional
salvage value at T brought about by one dollar of additional resale value
at the current time t. The second term represents the present value of
incremental production from t to T brought about by the extra produc-
tivity of the machine due to the additional one dollar of resale value at
time t.
Since the Hamiltonian is linear in the control variable u, the optimal
control for a problem with any fixed T is bang-bang as in Model Type
(a) in Table 3.3. Thus,

u^∗(t) = bang[0, U; {e^{−ρT} + (π/ρ)(e^{−ρt} − e^{−ρT})}g(t) − e^{−ρt}].   (9.9)

To interpret this optimal policy, we see that the term


{e^{−ρT} + (π/ρ)(e^{−ρt} − e^{−ρT})}g(t)

is the present value of the marginal return from increasing the preventive
maintenance by one dollar at time t. The last term e−ρt in the argument
of the bang function is the present value of that one dollar spent for pre-
ventive maintenance at time t. Thus, in words, the optimal policy means
the following: if the marginal return of one dollar of additional preven-
tive maintenance is more than one dollar, then perform the maximum
possible preventive maintenance, otherwise do not perform any at all.
To find how the optimal control switches, we need to examine the
switching function in (9.9). Rewriting it as

e^{−ρt}[(π/ρ)g(t) − (π/ρ − 1)e^{ρ(t−T)}g(t) − 1]              (9.10)

and taking the derivative of the bracketed terms with respect to t, we


can conclude that the expression inside the square brackets in (9.10) is
monotonically decreasing with time t on account of the assumptions that
π/ρ > 1 and that g(t) is nonincreasing with t (see Exercise 9.1). It follows
that there will not be a singular control for any finite interval of time.
Furthermore, since e−ρt > 0 for all t, we can conclude that the switching
function can only go from positive to negative and not vice versa. Thus,
the optimal control will be either U, or zero, or U followed by zero. The
switching time ts is obtained as follows: equate (9.10) to zero and solve
for t. If the solution is negative, let ts = 0, and if the solution is greater

than T, let t^s = T; otherwise set t^s equal to the solution. It is clear that
the optimal control in (9.9) can now be rewritten as

u^∗(t) =   U   if t ≤ t^s,
           0   if t > t^s.                                    (9.11)

Note that all of the above calculations were made on the assumption
that T was fixed, i.e., without imposing condition (9.7). On an optimal
path, this condition, which uses (9.5), (9.7), and (9.8), can be restated
as
−ρe^{−ρT∗} x^∗(T^∗) = −{πx^∗(T^∗) − u^∗(T^∗)}e^{−ρT∗}
                      − e^{−ρT∗}{−d(T^∗) + g(T^∗)u^∗(T^∗)}.   (9.12)

This means that when u^∗(T^∗) = 0 (i.e., t^s < T^∗), we have

x^∗(T^∗) = d(T^∗)/(π − ρ),                                    (9.13)

and when u^∗(T^∗) = U (i.e., t^s = T^∗), we have

x^∗(T^∗) = {d(T^∗) − [g(T^∗) − 1]U}/(π − ρ).                  (9.14)
Since d(t) is nondecreasing, g(t) is nonincreasing, and x(t) is non-
increasing, Eq. (9.13) or Eq. (9.14), whichever the case may be, has a
solution for T ∗ .

9.1.3 A Numerical Example


It is instructive to work an example of this model in which specific values
are assumed for the various functions. Examples that illustrate other
kinds of qualitatively different behavior are left as Exercises 9.3–9.5.
Suppose U = 1, x(0) = 100, d(t) = 2, π = 0.1, ρ = 0.05, and
g(t) = 2/(1 + t)^{1/2}. Then (9.3) specializes to

ẋ(t) = −2 + 2u(t)/(1 + t)^{1/2},   x(0) = 100.                (9.15)
First, we write the condition on t^s by equating (9.10) to 0, which
gives

π − (π − ρ)e^{−ρ(T−t^s)} = ρ/g(t^s).                          (9.16)

In doing so, we have assumed that the solution of (9.16) lies in the open
interval (0, T). As we will indicate later, special care needs to be exercised
if this is not the case.
Substituting the data in (9.16) we have

0.1 − 0.05e^{−0.05(T−t^s)} = 0.025(1 + t^s)^{1/2},

which simplifies to

(1 + t^s)^{1/2} = 4 − 2e^{−0.05(T−t^s)}.                      (9.17)
Then, integrating (9.15), we find

x(t) = −2t + 4(1 + t)^{1/2} + 96, if t ≤ t^s,

and hence

x(t) = −2t^s + 4(1 + t^s)^{1/2} + 96 − 2(t − t^s)
     = 4(1 + t^s)^{1/2} + 96 − 2t, if t > t^s.
Since we have assumed 0 < t^s < T, we substitute x(T) into (9.13), and
obtain
4(1 + t^s)^{1/2} + 96 − 2T = 2/0.05 = 40,
which simplifies to
T = 2(1 + t^s)^{1/2} + 28.                                    (9.18)
We must solve (9.17) and (9.18) simultaneously. Substituting (9.18) into
(9.17), we find that t^s must be a zero of the function

h(t^s) = (1 + t^s)^{1/2} − 4 + 2e^{−[2(1+t^s)^{1/2} − t^s + 28]/20}.   (9.19)

A simple binary search program was written to solve this equation, which
obtained the value t^s = 10.6. Substitution of this into (9.18) yields T =
34.8. Since this satisfies our supposition that 0 < t^s < T, we can conclude
our computations. Thus, if we let the unit of time be 1 month, then the
optimal solution is to perform preventive maintenance at the maximum
rate during the first 10.6 months, and thereafter not at all. The sale date
is at 34.8 months after purchase. Figure 9.1 gives the functions x(t) and
u(t) for this optimal maintenance and sale date policy.
If, on the other hand, the solution of (9.17) and (9.18) did not satisfy
our supposition, we would need to follow the procedure outlined earlier
in the section. This would result in t^s = 0 or t^s = T. If t^s = 0, we would
obtain T from (9.18), and conclude u^∗(t) = 0, 0 ≤ t ≤ T. Alternatively,
if t^s = T, we would need to substitute x(T) into (9.14) to obtain T. In
this case the optimal control would be u^∗(t) = U, 0 ≤ t ≤ T.
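The binary search mentioned above is easy to reproduce. A minimal Python sketch (ours; the bracket [0, 20] is an assumption, justified because h(0) < 0 < h(20)):

import math

def h(ts):
    # The function (9.19) whose zero is the switching time t^s.
    r = math.sqrt(1.0 + ts)
    return r - 4.0 + 2.0 * math.exp(-(2.0 * r - ts + 28.0) / 20.0)

lo, hi = 0.0, 20.0           # bracket chosen so that h(lo) < 0 < h(hi)
while hi - lo > 1e-8:
    mid = 0.5 * (lo + hi)
    if h(mid) < 0.0:
        lo = mid
    else:
        hi = mid

ts = 0.5 * (lo + hi)
T = 2.0 * math.sqrt(1.0 + ts) + 28.0     # the sale date from (9.18)
print(round(ts, 1), round(T, 1))         # 10.6 34.8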

Figure 9.1: Optimal maintenance and machine resale value

9.1.4 An Extension
The pure bang-bang result in the model developed above is a result of the
linearity in the problem. The result can be enriched as in Sethi (1973b)
by generalizing the resale value equation (9.3) as follows:
ẋ(t) = −d(t) + g(u(t), t), (9.20)
where g is nondecreasing and concave in u. For this section, we will
assume the sale date T to be fixed for simplicity and g to be strictly
concave in u, i.e., gu ≥ 0 and guu < 0 for all t. Also, gt ≤ 0, gut ≤ 0, and
g(0, t) = 0; see Exercise 9.7 for an example of the function g(u, t).
The standard Hamiltonian is
H = (πx − u)e−ρt + λ[−d + g(u, t)], (9.21)
where λ is given in (9.8). To maximize the Hamiltonian, we differentiate
it with respect to u and equate the result to zero. Thus,
Hu = −e−ρt + λgu = 0. (9.22)
If we let u0 (t) denote the solution of (9.22), then u0 (t) maximizes the
Hamiltonian (9.21) because of the concavity of g in u. Thus, for a fixed
T, the optimal control is
u∗ (t) = sat[0, U ; u0 (t)]. (9.23)

To determine the direction of change in u∗ (t), we obtain u̇0 (t). For


this, we use (9.22) and the value λ(t) from (9.8) to obtain

g_u = e^{−ρt}/λ(t) = 1/[(π/ρ) − (π/ρ − 1)e^{ρ(t−T)}].         (9.24)

Since π > ρ, the denominator on the right-hand side of (9.24) is mono-


tonically decreasing with time. Therefore, the right-hand side of (9.24)
is increasing with time. Taking the time derivative of (9.24), we have

ρ2 (π − ρ)eρ(t−T )
gut + guu u̇0 = > 0.
[π − (π − ρ)eρ(t−T ) ]2

But g_{ut} ≤ 0 and g_{uu} < 0; it is therefore obvious that u̇^0(t) < 0. In order
now to sketch the optimal control u^∗(t) specified in (9.23), let us define
0 ≤ t_1 ≤ t_2 ≤ T such that u^0(t) ≥ U for t ≤ t_1 and u^0(t) ≤ 0 for t ≥ t_2.
Then, we can rewrite the sat function in (9.23) as

u^∗(t) =   U        for t ∈ [0, t_1],
           u^0(t)   for t ∈ (t_1, t_2),                       (9.25)
           0        for t ∈ [t_2, T].

In (9.25), it is possible to have t1 = 0 and/or t2 = T. In Fig. 9.2 we have


sketched a case when t1 > 0 and t2 < T.
Note that while u0 (t) in Fig. 9.2 is decreasing over time, the way
it will decrease will depend on the nature of the function g. Indeed,
the shape of u0 (t), while always decreasing, can be quite general. In
particular, you will see in Exercise 9.7 that the shape of u0 (t) is concave
and, furthermore, u0 (t) > 0, t ≥ 0, so t2 = T in that case.
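To make (9.25) concrete, consider an assumed maintenance effectiveness function (ours, purely for illustration, not from the text) g(u, t) = e^{−αt} ln(1 + u) with α = 0.05, which satisfies all the conditions above. Then g_u = e^{−αt}/(1 + u), and (9.24) yields u^0(t) = m(t)e^{−αt} − 1, where m(t) = π/ρ − (π/ρ − 1)e^{ρ(t−T)}. The Python sketch below tabulates the resulting sat policy for the data of Sect. 9.1.3 with T fixed at 34.8; unlike the g of Exercise 9.7, this g produces t_1 = 0 and an interior switch to zero with t_2 < T:

import math

pi_, rho, T, U, alpha = 0.1, 0.05, 34.8, 1.0, 0.05

def u0(t):
    # For the assumed g, g_u = exp(-alpha*t)/(1 + u), so (9.24) gives
    # u0(t) = m(t)*exp(-alpha*t) - 1, with m(t) the bracketed term in (9.24).
    m = pi_ / rho - (pi_ / rho - 1.0) * math.exp(rho * (t - T))
    return m * math.exp(-alpha * t) - 1.0

for t in range(0, 35, 5):
    u_star = max(0.0, min(U, u0(t)))   # the sat policy (9.23)
    print(t, round(u0(t), 3), round(u_star, 3))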

9.2 Maintenance and Replacement for


a Machine Subject to Failure
In Kamien and Schwartz (1971a), a related model is developed which has
somewhat different assumptions. They assume that the production rate
of the machine is independent of its age, while its probability of failure
increases with its age. Consistent with this assumption, the purpose of
preventive maintenance in the Kamien-Schwartz model is to influence

Figure 9.2: Sat function optimal control

the failure rate of the machine rather than arrest the deterioration in
the resale value as before. Furthermore, their model also allows for sale
of the machine at any time, provided it is still in running condition, and
for its disposal as junk if it breaks down for good. The optimal control
problem is therefore to find an optimal maintenance policy for the period
of ownership and an optimal sale date at which the machine should be
sold, provided that it has not yet failed. Other references to related
models are Alam et al. (1976), Alam and Sarma (1974, 1977), Sarma
and Alam (1975), Gaimon and Thompson (1984a, 1989), Dogramaci and
Fraiman (2004), Dogramaci (2005), Bensoussan and Sethi (2007), and
Bensoussan et al. (2015a).

9.2.1 The Model


In order to define the Kamien-Schwartz model, we use the following
notation:
T = the sale date of a machine to be determined,
u(t) = the preventive maintenance rate at time t;
0 ≤ u(t) ≤ 1,
R = the constant positive rate of revenue produced by a
functioning machine independent of its age at any
time, net of all costs except preventive maintenance,

ρ = the constant discount rate,


L = the constant positive junk value of the failed machine
independent of its age at failure,
B(t) = the (exogenously specified) resale value of the machine
at time t, if it is still functioning; Ḃ(t) ≤ 0,
h(t) = the natural failure rate (also termed the natural hazard
rate in the reliability theory); h(t) ≥ 0, ḣ(t) ≥ 0,
F (t) = the cumulative probability that the machine has failed
by time t,
C(u; h) = the cost function depending on the preventive mainte-
nance u when the natural failure rate is h.
To make economic sense, an operable machine must be worth at least
as much as an inoperable machine and its resale value should not exceed
the present value of the potential revenue generated by the machine if it
were to function forever. Thus,

0 ≤ L ≤ B(t) ≤ R/ρ, t ≥ 0. (9.26)

Also for all t > 0,


u(t) ∈ Ω = [0, 1]. (9.27)
Finally, when the natural failure rate is h and a controlled failure rate of
h(1 − u) is sought, the action of achieving this reduction will cost C(u; h)
dollars. For simplicity, we assume that C(u; h) = C(u)h with

C(0) = 0, Cu > 0, Cuu > 0, for u ∈ [0, 1]. (9.28)

Thus, the cost of reducing the failure rate increases more than pro-
portionately as the fractional reduction increases. But the cost of a
given fractional reduction increases linearly with the natural failure rate.
Hence, these conditions imply that a given absolute reduction becomes
increasingly more costly as the machine gets older.
To derive the state equation for F (t), we note that Ḟ /(1−F ) denotes
the conditional probability density for the failure of the machine at time
t, given that it has survived to time t. This is assumed to depend on
two things, namely (i) the natural failure rate that governs the machine
in the absence of preventive maintenance, and (ii) the current rate of
preventive maintenance.
Thus,
Ḟ (t)
= h(t)[1 − u(t)], (9.29)
1 − F (t)

which gives the state equation


Ḟ = h(1 − u)(1 − F ), F (0) = 0. (9.30)
Thus, the controlled failure rate at time t is h(t)(1 − u(t)). If u = 0,
the failure rate assumes its natural value h. As u increases, the failure
rate decreases and drops to zero when u = 1.
The expected present value of the machine is the sum of the expected
present values of (i) the total revenue it produces less the total cost of
maintenance, (ii) its junk value if it should fail before it is sold, and (iii)
the salvage value if it does not fail and is sold. That is,
J = ∫_0^T e^{−ρt}{[R − C(u)h](1 − F) + LḞ} dt + e^{−ρT}B(T)[1 − F(T)].

Using (9.30), we can rewrite J as follows:


J = ∫_0^T e^{−ρt}[R − C(u)h + L(1 − u)h](1 − F) dt + e^{−ρT}B(T)[1 − F(T)].
                                                              (9.31)

The optimal control problem is to maximize J in (9.31) subject to (9.30)


and (9.27).

Remark 9.1 In the absence of discounting, the expected junk value
term ∫_0^T LḞ(t) dt reduces to LF(T), i.e., the junk value times the prob-
ability that the machine fails by time T.

Remark 9.2 While the maintenance and replacement problem of


Kamien and Schwartz is stochastic, they formulate and solve it as a
deterministic optimal control problem. Bensoussan and Sethi (2007) for-
mulate the underlying stochastic problem as a stochastic optimal control
problem, and show how their solution relates to that of the Kamien-
Schwartz model. They also provide a sufficient condition for an optimal
maintenance and replacement policy.

9.2.2 Optimal Policy


The problem is similar to Model Type (f) in Table 3.3 subject to the
free-end-point condition as in Row 1 of Table 3.1. Therefore, we follow
the steps for solution by the maximum principle stated in Chap. 3. The
standard Hamiltonian is
H = e−ρt [R − C(u)h + L(1 − u)h](1 − F ) + λ(1 − u)h(1 − F ), (9.32)

and the adjoint variable satisfies

λ̇ = e^{−ρt}[R − C(u)h + L(1 − u)h] + λh(1 − u),   λ(T) = −e^{−ρT}B(T).
                                                              (9.33)

Since T ≥ 0 is also to be decided, we require the additional transversality
condition (3.77) for an optimal T^∗ to satisfy:

R − C[u^∗(T^∗)]h(T^∗) + L[1 − u^∗(T^∗)]h(T^∗)
   − [1 − u^∗(T^∗)]h(T^∗)B(T^∗) − ρB(T^∗) + B_T(T^∗) = 0.     (9.34)

In Exercise 9.8, you are asked to derive this condition by using (9.31)–
(9.33) in (3.77).
While we know from (3.79) that (9.34) has a standard economic in-
terpretation of having zero marginal profit of changing T^∗, it is still
illuminating to flesh out a more detailed interpretation of each term in
what looks like a fairly complex expression. A good way to accomplish
that is to tally what we get if we decide to sell the machine at time T^∗ + δ
in comparison to selling it at T^∗. We will do this only for a small δ > 0,
and leave it as Exercise 9.9 for a small δ < 0.
First we note that in solving Exercise 9.8 to obtain (9.34) from
(3.77), a simplification involved canceling the common factor e^{−ρT∗}(1 −
F(T^∗)) > 0. Removing e^{−ρT∗} brings the revenue and cost terms from
present-value dollars to dollars at time T^∗. The presence of the probabil-
ity term 1 − F(T^∗) means that the machine will be replaced at T^∗ if it
has not failed by time T^∗ with that probability. Its removal means that
(9.34) can be interpreted as if we are at T^∗ and we find the machine to
be working, which is tantamount to interpreting (9.34) with F(T^∗) = 0.
Now consider keeping the machine to T ∗ + δ. Clearly we lose its
selling price B(T ∗ ) in doing so. But then we gain the following amounts
discounted to time T ∗ :
{R − C(u^∗(T^∗))h(T^∗)}δe^{−ρδ} = {R − C(u^∗(T^∗))h(T^∗)}δ + o(δ),   (9.35)

L(1 − u^∗(T^∗))h(T^∗)δe^{−ρδ} = L(1 − u^∗(T^∗))h(T^∗)δ + o(δ),       (9.36)

B(T^∗ + δ)(1 − F(T^∗ + δ))e^{−ρδ} = B(T^∗) − B(T^∗)(1 − u^∗(T^∗))h(T^∗)δ
                                     − ρB(T^∗)δ + B_T(T^∗)δ + o(δ).  (9.37)

The RHS of these equations can be obtained by noting that e^{−ρδ} =
1 − ρδ + o(δ), B(T^∗ + δ) = B(T^∗) + B_T(T^∗)δ + o(δ), and F(T^∗ + δ) =
F(T^∗) + Ḟ(T^∗)δ + o(δ) = F(T^∗) + (1 − u^∗(T^∗))h(T^∗)(1 − F(T^∗))δ + o(δ) =
(1 − u^∗(T^∗))h(T^∗)δ + o(δ), since we had set F(T^∗) = 0 for interpreting (9.34)
upon arrival at T^∗ and finding the machine to be working. The net
gain is the sum of (9.35), (9.36) and (9.37) less B(T^∗), where (9.35)
gives the net cash flow (revenue less the cost of preventive maintenance)
from T^∗ to T^∗ + δ, (9.36) represents the junk value L multiplied by the
probability [1 − u^∗(T^∗)]h(T^∗)δ that the machine fails during the short
time δ when the machine is found to be working at T^∗, and (9.37) less
B(T ∗ ) has three terms −ρB(T ∗ )δ +BT (T ∗ )δ −B(T ∗ )(1−u∗ (T ∗ ))h(T ∗ )δ:
the first of which is the loss of interest ρB(T ∗ )δ on the resale value B(T ∗ )
not obtained when deciding to keep the machine to T ∗ + δ, the second
term BT (T ∗ ) < 0 is the decrease in the resale value from T ∗ to T ∗ + δ,
and the third term represents the loss of the entire resale value if the
machine fails with the probability (1 − u∗ (T ∗ ))h(T ∗ )δ given that the
machine was found to be working at time T ∗ . Moreover, if we divide
the net gain by δ and then let δ → 0, we obtain the marginal profit of
keeping the machine from time T ∗ to T ∗ + δ, and setting it equal to zero
gives precisely the transversality condition (9.34). If we separate the
revenue and cost terms in the resulting expression of the marginal profit,
then (9.34) determining the optimal sale date T ∗ is the usual economic
condition equating marginal revenue to marginal cost.
Next, we analyze the problem to obtain the optimal maintenance
policy for a fixed T. If the optimal solution is in the interior, i.e., u∗ ∈
(0, 1), then the Hamiltonian maximizing condition gives

Hu = −e−ρt h(1 − F )[Cu + L + eρt λ] = 0. (9.38)

In the trivial cases in which the natural failure rate h(t) is zero or when
the machine fails with certainty by time t (i.e., F (t) = 1), then u∗ (t) = 0.
Assume therefore h > 0 and F < 1. Under these conditions, we can infer
from (9.28) and (9.38) that

(i)   C_u(0) + L + λe^{ρt} > 0 ⇒ u^∗(t) = 0,

(ii)  C_u(1) + L + λe^{ρt} < 0 ⇒ u^∗(t) = 1,                  (9.39)

(iii) otherwise, C_u + L + λe^{ρt} = 0 determines u^∗(t).

Using the terminal condition λ(T) = −e^{−ρT}B(T) from (9.33), we can
derive u^∗(T) satisfying (9.39):

(i)   C_u(0) > B(T) − L ⇒ u^∗(T) = 0,

(ii)  C_u(1) < B(T) − L ⇒ u^∗(T) = 1,                         (9.40)

(iii) otherwise, C_u = B(T) − L determines u^∗(T).

Next we determine how u^∗(t) changes over time. Kamien and
Schwartz (1971a, 1992) have shown that u^∗(t) is nonincreasing; see Ex-
ercise 9.10. Thus, there exists T ≥ t_2 ≥ t_1 ≥ 0 such that

u^∗(t) =   1        for t ∈ [0, t_1],
           u^0(t)   for t ∈ (t_1, t_2),                       (9.41)
           0        for t ∈ (t_2, T],

where u0 (t) is the solution of (9.39)(iii). Clearly, it must also be shown


that u̇0 (t) ≤ 0 as part of Exercise 9.10. Of course, u∗ (T ) is immediately
known from (9.40). If u∗ (T ) ∈ (0, 1), it implies t2 = T ; and if u∗ (T ) = 1,
it implies t1 = t2 = T.
For this model, the sufficiency of the maximum principle follows from
Theorem 2.1; see Exercise 9.11.

9.2.3 Determination of the Sale Date


For a fixed T, we know that the terminal optimal control u∗ (T ) is deter-
mined by (9.40). If this u∗ (T ) also satisfies (9.34), we have determined
an optimal trajectory as well as the optimal life of the machine. This,
of course, is subject to the second-order condition since (9.34) is only a
necessary condition for an optimal T ∗ to satisfy. It is clear that the deter-
mination of T ∗ , in most cases, will require numerical computations. The
algorithm needs only to be a simple search method because it requires
consideration of the single variable T.
Before we go to the next section, we remark that a business is usually
a continuing entity and does not end at the sale date of one machine.
Normally, an existing machine will be replaced by another, which in
turn will be replaced by another, and so on. The technology of the newer
machines will generally be different from that of the existing machine. In

what follows, we address these issues. We will choose the discrete-time


setting and illustrate the use of the discrete-time maximum principle
developed in Chap. 8.

9.3 Chain of Machines


We now extend the problem of maintenance and replacement to a chain of
machines. By this we mean that given the time periods 0, 1, 2, . . . , T − 1,
we begin with a machine purchase at the beginning of period zero. Then,
we find an optimal number of machines, say ℓ, and optimal times 0 <
t_1 < t_2 < · · · < t_{ℓ−1} < t_ℓ < T of their replacements such that the existing
machine will be replaced by a new machine at time t_j, j = 1, 2, . . . , ℓ.
At the end of the horizon defined by the beginning of period T, the last
machine purchased will be salvaged. Moreover, the optimal maintenance
policy for each of the machines in the chain must be found.
Two approaches to this problem have been developed in the litera-
ture. The first attempts to solve for an infinite horizon (T = ∞) with a
simplifying assumption of identical machine lives, i.e.,

tj − tj−1 = tj+1 − tj (9.42)

for all j ≥ 1; see Sethi (1973b) as well as Exercise 9.16. In this case ℓ = ∞
as well. The second relaxes the assumption (9.42) of identical machine
lives, but then, it can only solve a finite horizon problem involving a
finite chain of machines, i.e., ℓ is finite; see Sethi and Morton (1972)
and Tapiero (1973). For a decision horizon formulation of this problem,
see Sethi and Chand (1979), Chand and Sethi (1982), and Bylka et al.
(1992).
In this section, we will deal with the latter problem as analyzed by
Sethi and Morton (1972). The problem is solved by a mixed optimization
technique. The subproblems dealing with the maintenance policy are
solved by appealing to the discrete maximum principle. These subprob-
lem solutions are then incorporated into a Wagner and Whitin (1958)
model formulation for solution of the full problem. The procedure is
illustrated by a numerical example.

9.3.1 The Model


Consider buying a machine at the beginning of period s and salvaging it
at the beginning of period t > s. Let Jst denote the present value of all

net earnings associated with the machine. To calculate Jst we need the
following notation:

xks = the resale value of the machine at the beginning of


period k, k = s, s + 1, . . . , t,
k
Ps = the production quantity (in dollar value) during period
k, k = s, s + 1, . . . , t − 1,
k
Es = the necessary expense of the ordinary maintenance (in
dollars) during period k,
Rsk = Psk − Esk , k = s, s + 1, . . . , t − 1,
uk = the rate of preventive maintenance (in dollars) during
period k, k = s, s + 1, . . . , t − 1,
Cs = the cost of purchasing the machine at the beginning of
period s,
ρ = the periodic discount rate.

It is required that

0 ≤ u^k ≤ U_s^k,   k ∈ [s, t − 1].                            (9.43)

We can calculate Jst in terms of the variables and functions defined


above:

J_{st} = Σ_{k=s}^{t−1} R_s^k (1+ρ)^{−k} − Σ_{k=s}^{t−1} u^k (1+ρ)^{−k} − C_s(1+ρ)^{−s} + x_s^t(1+ρ)^{−t}.   (9.44)

We must also have functions that will provide us with the ways in
which states change due to the age of the machine and the amount
of preventive maintenance. Also, assuming that at time s, the only
machines available are those that are up-to-date with respect to the
technology prevailing at s, we can subscript these functions by s to reflect
the effect of the machine’s technology on its state at a later time k. Let
Ψs (uk , k) and Φs (uk , k) be such concave functions so that we can write
the following state equations:

ΔRsk = Rsk+1 − Rsk = Ψs (uk , k), Rss (9.45)

given,
Δxks = Φs (uk , k), xss = (1 − δ)Cs , (9.46)
where δ is the fractional depreciation immediately after the purchase of
the machine at time s.

To convert the problem into the Mayer form, define

A_s^k = Σ_{i=s}^{k−1} R_s^i (1 + ρ)^{−i},                     (9.47)

B_s^k = Σ_{i=s}^{k−1} u^i (1 + ρ)^{−i}.                       (9.48)
Using Eqs. (9.47) and (9.48), we can write the optimal control prob-
lem as follows:

max_{u^k} [ J_{st} = A_s^t − B_s^t − C_s(1 + ρ)^{−s} + x_s^t(1 + ρ)^{−t} ]   (9.49)

subject to

ΔA_s^k = R_s^k (1 + ρ)^{−k},   A_s^s = 0,                     (9.50)

ΔB_s^k = u^k (1 + ρ)^{−k},   B_s^s = 0,                       (9.51)

and the constraints (9.45), (9.46), and (9.43).

9.3.2 Solution by the Discrete Maximum Principle


We associate the adjoint variables λ_1^{k+1}, λ_2^{k+1}, λ_3^{k+1}, and λ_4^{k+1}, respec-
tively, with the state equations (9.50), (9.51), (9.45), and (9.46). There-
fore, the Hamiltonian becomes

H = λ_1^{k+1} R_s^k (1+ρ)^{−k} + λ_2^{k+1} u^k (1+ρ)^{−k} + λ_3^{k+1} Ψ_s + λ_4^{k+1} Φ_s,   (9.52)

where the adjoint variables λ_1, λ_2, λ_3, and λ_4 satisfy the following differ-
ence equations and terminal boundary conditions:

Δλ_1^k = −∂H/∂A_s^k = 0,   λ_1^t = 1,                         (9.53)

Δλ_2^k = −∂H/∂B_s^k = 0,   λ_2^t = −1,                        (9.54)

Δλ_3^k = −∂H/∂R_s^k = −λ_1^{k+1}(1 + ρ)^{−k},   λ_3^t = 0,    (9.55)

Δλ_4^k = −∂H/∂x_s^k = 0,   λ_4^t = (1 + ρ)^{−t}.              (9.56)
The solutions of these equations are

λ_1^k = 1,                                                    (9.57)

λ_2^k = −1,                                                   (9.58)

λ_3^k = Σ_{i=k}^{t−1} (1 + ρ)^{−i},                           (9.59)

λ_4^k = (1 + ρ)^{−t}.                                         (9.60)

Note that λk1 , λk2 , and λk4 are constants for a fixed machine salvage time
t. To apply the maximum principle, we substitute (9.57)–(9.60) into the
Hamiltonian (9.52), collect terms containing the control variable uk , and
rearrange and decompose H as

H = H1 + H2 (uk ), (9.61)

where H_1 is that part of H which is independent of u^k and

H_2(u^k) = −u^k(1 + ρ)^{−k} + Ψ_s Σ_{i=k+1}^{t−1} (1 + ρ)^{−i} + Φ_s (1 + ρ)^{−t}.   (9.62)

Next we apply the maximum principle to obtain the necessary con-


dition for the optimal schedule of preventive maintenance expenditures
in dollars. The condition of optimality is that H should be a maximum
along the optimal path. If uk were unconstrained, this condition, given
the concavity of Ψs and Φs , would be equivalent to setting the partial
derivative of H with respect to u equal to zero, i.e.,


H_{u^k} = [H_2]_{u^k} = −(1+ρ)^{−k} + (Ψ_s)_{u^k} Σ_{i=k+1}^{t−1} (1+ρ)^{−i} + (Φ_s)_{u^k} (1+ρ)^{−t} = 0.
                                                              (9.63)
Equation (9.63) is an equation in uk
with the exception of the particular
case when Ψs and Φs are linear in uk (which will be treated later in this
section). In general, (9.63) may or may not have a unique solution. For
our case we will assume Ψs and Φs to be of the form such that they
give a unique solution for uk . One such case occurs when Ψs and Φs are
quadratic in uk . In this case, (9.63) is linear in uk and can be solved
explicitly for a unique solution for uk . Whenever a unique solution does
exist, let this be

u^k = U_{st}^k.                                               (9.64)

The optimal control u^{k∗} is given as

u^{k∗} =   0           if U_{st}^k ≤ 0,
           U_{st}^k    if 0 ≤ U_{st}^k ≤ U_s^k,               (9.65)
           U_s^k       if U_{st}^k ≥ U_s^k.

9.3.3 Special Case of Bang-Bang Control


We now treat the special case in which the problem, and therefore H, is
linear in the control variable uk . In this case, H can be maximized simply
by having the control at its maximum when the coefficient of uk in H is
positive, and minimum when it is negative, i.e., the optimal control is of
bang-bang type.
In our problem, we obtain the special case if Ψs and Φs assume the
form
Ψ_s(u^k, k) = u^k ψ_s^k                                       (9.66)
and
Φ_s(u^k, k) = u^k φ_s^k,                                      (9.67)
respectively, where ψ_s^k and φ_s^k are given constants. Then, the coefficient
of u^k in H, denoted by W_s(k, t), is

W_s(k, t) = −(1 + ρ)^{−k} + ψ_s^k Σ_{i=k+1}^{t−1} (1 + ρ)^{−i} + φ_s^k (1 + ρ)^{−t},   (9.68)

and the optimal control u^{k∗} is given by

u^{k∗} = bang[0, U_s^k; W_s(k, t)],   k = s, s + 1, . . . , t − 1.   (9.69)

9.3.4 Incorporation into the Wagner-Whitin


Framework for a Complete Solution
Once u^{k∗} has been obtained as in (9.65) or (9.69), we can substitute it
into (9.45) and (9.46) to obtain R_s^{k∗} and x_s^{k∗}, which in turn can be used
in (9.44) to obtain the optimal value of the objective function, denoted
by J_{st}^∗. This can be done for each pair of machine purchase time s and
sale time t > s.

Let gs denote the present value of the profit (discounted to period 0)


of an optimal replacement and preventive maintenance policy for periods
s, s + 1, . . . , T − 1. Then,

g_s = max_{t=s+1,...,T} [J_{st}^∗ + g_t],   0 ≤ s ≤ T − 1,    (9.70)

with the boundary condition

g_T = 0.                                                      (9.71)
The value of g0 will give the required maximum.
The mixed optimization technique presented here avoids many of
the shortcomings of either pure dynamic programming or pure control
theory formulations. Since the solution technique used to optimize a
given machine represents a submodule of the overall method, the pure
dynamic programming approach may be recognized as a special case. It
should be advantageous, however, to be able to use a methodology for the
submodule that is most efficient for a given particular problem. Previous
control theory formulations do not seem to be easily adaptable to the
situation of an existing initial machine; see Sethi and Morton (1972) for
other similar asymmetries.
The mixed technique can also be adapted to the case of probabilistic
technological breakthroughs (Exercise 9.17). Here the path of technolog-
ical growth is assumed to be a tree with probabilities associated with its
branches. The subproblems can be solved by using the maximum prin-
ciple for stochastic networks given in Sethi and Thompson (1977). How-
ever, the number of subproblems that must be solved increases rapidly
with the number of branches, thus putting computational limitations on
the general usefulness of this extension.
Another application of the mixed technique has been used by Pekel-
man and Sethi (1978) to obtain the optimal durations of advertising
copies, and the optimal level of advertising expenditures for each copy.

9.3.5 A Numerical Example


To illustrate the procedure, a simple three-period example will be pre-
sented and solved for the case where there is no existing machine at time
zero.
Machines may be bought at times 0, 1, and 2. The cost of a machine
bought at time s is assumed to be
Cs = 1, 000 + 500s2 .

The discount rate, the fractional instantaneous depreciation at purchase,


and the maximum preventive maintenance per period are assumed to be

ρ = 0.06, δ = 0.25, and U = $100,

respectively.
Let Rss be the net return (net of necessary maintenance) of a machine
purchased at the beginning of period s and operated during period s. We
assume
R00 = $600, R11 = $1, 000, and R22 = $1, 100.
In a period k subsequent to the period s of machine purchase, the
returns Rsk , k > s, depend on the preventive maintenance performed on
the machine in the periods prior to period k. The incremental return
function is given by Ψs (u, k), which we assume to be linear. Specifically,

ΔRsk = Ψs (uk , k) = −ds + as uk ,

where

d0 = 200, d1 = 50, d2 = 100, and as = 0.5 + 0.1s3 .

This means that, in the absence of any preventive maintenance, the


return in period k on a machine purchased in period s goes down by
an amount ds every period from s to k, including s, in which there is
no preventive maintenance. This decrease can be offset by an amount
proportional to the amount of preventive maintenance.
Note that the function Ψs is assumed to be stationary over time in
order to simplify the example.
Let xks be the salvage value at time k of a machine purchased at s.
We assume
xss = (1 − δ)Cs = 0.75[1, 000 + 500s2 ].
The incremental salvage value function is given by

Δx_s^k = −ε_s C_s + b_s u^k,

where

ε_s =   0.1   when s = 0, 1,
        0.2   when s = 2,

and
b_s = (0.5 − 0.05s).

That is, the decrease in salvage value is a constant percentage of the pur-
chase price if there is no preventive maintenance. With preventive main-
tenance, the salvage value can be enhanced by a proportional amount.
Let J_{st}^∗ be the optimal value of the objective function associated with
a machine purchased at s and sold at t ≥ s + 1. We will now solve for
J_{st}^∗, s = 0, 1, 2, and s < t ≤ 3, where t is an integer.
Before we proceed, we will, as in (9.68), denote by W_s(k, t) the coef-
ficient of u^k in the Hamiltonian H, i.e.,

W_s(k, t) = −(1 + ρ)^{−k} + a_s Σ_{i=k+1}^{t−1} (1 + ρ)^{−i} + b_s (1 + ρ)^{−t}.   (9.72)

The optimal control is given by (9.69).


It is noted in passing that

W_s(k + 1, t) − W_s(k, t) = (1 + ρ)^{−(k+1)} (ρ − a_s),

so that
sgn[W_s(k + 1, t) − W_s(k, t)] = sgn[ρ − a_s].                (9.73)
This implies that

u^{(k+1)∗} − u^{k∗}   ≥ 0   if (ρ − a_s) > 0,
                      = 0   if (ρ − a_s) = 0,                 (9.74)
                      ≤ 0   if (ρ − a_s) < 0.

In this example ρ − as < 0, which means that if there is a switching in


the preventive maintenance trajectory of a machine, the switch must be
from $100 to $0.

Solution of Subproblems We now solve the subproblems for various


values of s and t(s < t) by using the discrete maximum principle.

Subproblem: s = 0, t = 1.

W_0(0, 1) = −1 + 0.5(1.06)^{−1} < 0.

From (9.69) we have

u^{0∗} = 0.

Now,

R_0^0 = 600,
R_0^1 = 600 − 200 = 400,
x_0^0 = 0.75 × 1,000 = 750,
x_0^1 = 750 − 0.1 × 1,000 = 650,
J_{01}^∗ = 600 − 1,000 + 650 × (1.06)^{−1} = $213.2.

Similar calculations can be carried out for other subproblems. We will


list these results.

Subproblem: s = 0, t = 2.

W_0(0, 2) < 0,   W_0(1, 2) < 0,

u^{0∗} = 0,   u^{1∗} = 0,
J_{02}^∗ = 466.9.

Subproblem: s = 0, t = 3.

W_0(0, 3) > 0,   W_0(1, 3) < 0,   W_0(2, 3) < 0,

u^{0∗} = 100,   u^{1∗} = 100,   u^{2∗} = 0,
J_{03}^∗ = 639.

Subproblem: s = 1, t = 2.

W_1(1, 2) < 0,
u^{1∗} = 0,
J_{12}^∗ = 559.9.

Subproblem: s = 1, t = 3.

W_1(1, 3) > 0,   W_1(2, 3) < 0,

u^{1∗} = 100,   u^{2∗} = 0,
J_{13}^∗ = 1024.2.

Subproblem: s = 2, t = 3.

W_2(2, 3) < 0,
u^{2∗} = 0,
J_{23}^∗ = 80.

Wagner-Whitin Solution of the Entire Problem With reference to
the dynamic programming equations (9.70) and (9.71), we have

g_3 = 0,
g_2 = J_{23}^∗ = $80,
g_1 = max[J_{13}^∗, J_{12}^∗ + g_2] = max[1024.2, 559.9 + 80] = $1024.2,
g_0 = max[J_{03}^∗, J_{01}^∗ + g_1, J_{02}^∗ + g_2]
    = max[639.0, 213.2 + 1024.2, 466.9 + 80] = $1237.4.

Now we can summarize the optimal solution. The optimal number of


machines is 2, and their optimal purchase times, maintenance rates, and
sell times are as follows:
First Machine Optimal Policy: Purchase at s = 0 and sell at
t = 1. The optimal preventive maintenance policy is u0∗ = 0.
Second Machine Optimal Policy: Purchase at s = 1 and sell at
t = 3. The optimal preventive maintenance policy is u1∗ = 100, u2∗ = 0.
The associated value of the objective function is J ∗ = $1237.4.
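The backward recursion (9.70)–(9.71) is also easy to script. A minimal Python sketch (ours) that hard-codes the six subproblem values J_{st}^∗ computed above and recovers the optimal chain:

J = {(0, 1): 213.2, (0, 2): 466.9, (0, 3): 639.0,
     (1, 2): 559.9, (1, 3): 1024.2, (2, 3): 80.0}
T = 3
g, plan = {T: 0.0}, {}
for s in range(T - 1, -1, -1):             # backward recursion (9.70)-(9.71)
    t_best = max(range(s + 1, T + 1), key=lambda t: J[(s, t)] + g[t])
    g[s], plan[s] = J[(s, t_best)] + g[t_best], t_best

s, chain = 0, []
while s < T:                               # recover the optimal chain of machines
    chain.append((s, plan[s]))
    s = plan[s]
print(round(g[0], 1), chain)               # 1237.4 [(0, 1), (1, 3)]

Its output, $1237.4 with chain [(0, 1), (1, 3)], matches the solution just summarized.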

Exercises for Chapter 9

E 9.1 Show that the bracketed expression in (9.10) is monotonically


decreasing in t.

E 9.2 Change the values of U and d(t) in Sect. 9.1.3 to the new values
U = 1/2 and d(t) = 3 and re-solve the problem.

E 9.3 Show for the model in Sect. 9.1.1 that if it is optimal to have
the maximum maintenance throughout the life of the machine, then its
optimal life T must satisfy g(T ) − 1 ≥ 0. In particular, for the example
in Sect. 9.1.3, show T ≤ 3.

E 9.4 Re-solve the example in Sect. 9.1.3 with x(0) = 40.

E 9.5 Replace the maintenance effectiveness function in Sect. 9.1.3 by

g(t) = 2/(16 + t)1/2

and solve the resulting problem.

E 9.6 Re-solve Exercise 2.20 when T is unspecified and it denotes the


sale date of the machine to be determined.

E 9.7 Let the maintenance effectiveness function in the model of


Sect. 9.1.4 be

g(u, t) = 2u^{1/2}/(1 + t)^{1/2}.
Derive the formula for u0 (t) for this case. Furthermore, solve the problem
with T = 34.8, U = 1, x(0) = 100, d(t) = 2, π = 0.1 and ρ = 0.05, and
compare its solution to that of the numerical example in Sect. 9.1.3. Note
that the sale date T is assumed to be fixed in Sect. 9.1.4 for simplicity
in exposition.

E 9.8 Derive the formula in (9.34) by using (3.77).

E 9.9 Redo the analysis providing the detailed economic interpretation


of (9.34) when selling the machine at time T ∗ + δ, which is earlier than
time T ∗ when the small δ < 0.

Hint: The salvage value function required in (3.77) for the problem here
is S(F (T ), T ) = e−ρT B(T )(1 − F (T )) as given in (9.31). Its partial
derivative with respect to T is [−ρe^{−ρT}B(T) + e^{−ρT}B_T(T)](1 − F(T)).

E 9.10 To show that the singular control in the third alternative in


(9.39) can be sustained, we set dHu /dt = 0 for all t for which a singular
control obtains. That is, u0 (t) satisfies

Cuu u̇0 = Cu [ρ + (1 − u0 )h] + ρL − R + C(u0 )h. (9.75)

Show that u̇0 (t) ≤ 0. Furthermore, show that u∗ (t) is nonincreasing over
time.

E 9.11 For the model of Sect. 9.2, prove that the derived Hamiltonian H
is concave in F for each given λ and t, so that the Sufficiency Theorem 2.1
holds.

E 9.12 A firm wants to price its product to maximize the stream of


discounted profits. If it maximizes current profits, the high price and
profits may attract the entry of rivals, which in turn will reduce future
profit possibilities. Let the current profit rate R1 (p) be a strictly concave

function of price p with R_1''(p) < 0. The profit rate that the firm believes
will be available to it after rival entry is R2 < maxp R1 (p) (independent
of current price and lower than current monopoly profits). Whether, or
when, a rival will enter is not known, but let F (t) denote the probability
that entry will occur by time t, with F (0) = 0. The conditional prob-
ability density of entry at time t, given its nonoccurrence prior to t, is
Ḟ (t)/[1 − F (t)]. We assume that this conditional entry probability den-
sity is a strictly increasing, convex function h(p) of product price p. This
specification reflects the supposition that as price rises, the profitability
of potential entrants of a given size increases and so does their likelihood
of entry. Thus, we assume

Ḟ (t)/[1 − F (t)] = h(p(t))

where
h(0) = 0,   h′(p) > 0,   h″(p) ≥ 0.
Discounting future profits at rate ρ, the firm seeks a price policy p(t) to
max ∫_0^∞ e^{−ρt} {R_1(p(t))[1 − F(t)] + R_2 F(t)} dt

subject to
Ḟ (t) = h(p(t))[1 − F (t)], F (0) = 0.
The integrand represents the expected profits at t, composed of R1 if no
rival has entered by t, and otherwise R2 .

(a) Show that the maximum principle necessary conditions are satisfied
by p(t) = p∗ , where p∗ is a constant. Obtain the equation satisfied
by p∗ and show that it has a unique solution.

(b) Let pm denote the monopoly price (in the absence of any rival), i.e.,
R1 (pm ) = maxp R1 (p). Show that p∗ < pm and R1 (pm ) > R1 (p∗ ) >
R2 . Provide an intuitive explanation of the result.

(c) Verify the sufficiency condition for optimality by showing that the
maximized Hamiltonian is concave.

E 9.13 Let us define the state of a machine to be ‘0’ if it is working


and ‘1’ if it is being repaired. Let λ be the breakdown rate and μ be the
service rate as in waiting-line theory, so that we have

Ṗ0 = −λP0 + μ(1 − P0 ), P0 (0) = 1,

where P0 (t) is the probability that the machine is in the state 0 at time t.
Let P1 (t) = 1−P0 (t), which is the probability that the machine is in state
1 at time t. This equation along with (9.3) gives us two state equations.
In view of the equation for Ṗ0 , we modify the objective function (9.2) to
J = ∫_0^T [πx(t)P_0(t) − u(t) − kP_1(t)]e^{−ρt} dt + x(T)e^{−ρT},

where k characterizes the additional expenditure rate while the machine


is being repaired. Solve this model to obtain the optimal control. See
Alam and Sarma (1974).

E 9.14 Starting from Ws (k, t) in (9.72), derive the result in (9.74).

E 9.15 Extend the Thompson model in Sect. 9.1 to allow for process
discontinuities. An example of this type of machine is an airplane as-
signed to passenger transportation which may, after some deterioration
or obsolescence, be assigned to freight transportation before its eventual
retirement. Formulate and analyze the problem. See Tapiero (1971).

E 9.16 Extend the Thompson model in Sect. 9.1 to allow for a chain
of machines with identical lives. See Sethi (1973b) for an analysis of a
similar model.

E 9.17 Extend the formulation of the Sethi-Morton model in Sect. 9.3


to allow for probabilistic technological breakthroughs. See Sethi and
Morton (1972) and Sethi and Thompson (1977).
Chapter 10

Applications to Natural
Resources

The increase in world population is causing a corresponding increase in


the demand for consumption of natural resources. As a consequence, the
optimal management and utilization of natural resources is becoming
increasingly important. There are two main kinds of natural resource
models: those involving renewable resources such as fish, food, timber,
etc., and those involving nonrenewable or exhaustible resources such as
petroleum, minerals, etc.
In Sect. 10.1 we deal with a fishery resource model, the sole owner of
which is considered to be a regulatory agency. The management prob-
lem of the agency is to control the rate of fishing over time so that
an appropriate objective function is maximized over an infinite horizon.
A differential game extension known as the common property fishery
resource model is discussed in Sect. 13.2.3. For other applications of
optimal control theory to renewable resource models including those in-
volving predator-prey relationships, see Clark (1976), Goh et al. (1974),
Jørgensen and Kort (1997), and Munro and Scott (1985).
Section 10.2 deals with an optimal forest thinning model, where thin-
ning is the process of removing some trees from a forest to improve its
growth rate and quality. An extension to a chain of forests model is
presented in Sect. 10.2.3.
The final model presented in Sect. 10.3 deals with an exhaustible
resource such as petroleum, which must be utilized optimally over a


given horizon under the assumption that when its price reaches a given
high threshold, a substitute will be used instead. Therefore, the analysis
of this section can also be viewed as a problem of optimally phasing in
an expensive substitute.

10.1 The Sole-Owner Fishery Resource Model


With the establishment of 200-mile territorial zones in the ocean for
most countries having coastlines, the control of fishing in these zones
has become highly regulated by these countries. In this sense, fishing
in territorial waters can be considered as a sole owner fishery problem.
On the other hand, if the citizens and commercial fishermen of a given
country are permitted to fish freely in their territorial waters, the prob-
lem becomes that of an open access fishery. The solutions of these two
extreme problems are quite different, as will be shown in this section.

10.1.1 The Dynamics of Fishery Models


We introduce the following notation and terminology which is due to
Clark (1976):

ρ = the discount rate,


x(t) = the biomass of fish population at time t,
g(x) = the natural growth function,
u(t) = the rate of fishing effort at time t; 0 ≤ u ≤ U,
q = the catchability coefficient,
p = the unit price of landed fish,
c = the unit cost of effort.

Assume that the growth function g is differentiable and concave, and


it satisfies

g(0) = 0, g(X) = 0, g(x) > 0 for 0 < x < X, (10.1)

where X denotes the carrying capacity, i.e., the maximum sustainable


fish biomass.
The state equation due to Gordon (1954) and Schaefer (1957) is

ẋ = g(x) − qux, x(0) = x0 , (10.2)



where qux is the catch rate assumed to be proportional to the biomass


as well as the rate of fishing effort. The instantaneous profit rate is

π(x, u) = pqux − cu = (pqx − c)u. (10.3)

From (10.1) and (10.2), it follows that x will stay in the closed interval
0 ≤ x ≤ X provided x0 is in the same interval.
An open access fishery is one in which exploitation is completely
uncontrolled. Gordon (1954) analyzed this model, also known as the
Gordon-Schaefer model, and showed that the fishing effort tends to reach
an equilibrium, called a bionomic equilibrium, at the level where total
revenue equals total cost. In other words, the so-called economic rent is
completely dissipated. From (10.3) and (10.2), this level is simply

xb = c/(pq) and ub = g(xb)p/c. (10.4)

Let U > g(c/pq)p/c so that ub is in the interior of [0, U ]. The economic


basis for (10.4) is as follows: If the fishing effort u > ub is made, then
total costs exceed total revenues so that at least some fishermen will lose
money, and eventually some will drop out, thus reducing the level of the
fishing effort. On the other hand, if the fishing effort u < ub is made, then
total revenues exceed total costs, thereby attracting additional fishermen,
and increasing the fishing effort.
The Gordon-Schaefer model does not maximize the present value of
the total profits that can be obtained from the fish resources. This is
done next.

10.1.2 The Sole Owner Model


The bionomic equilibrium solution obtained from the open access fishery
model usually implies severe biological overfishing. Suppose a fishing
regulatory agency is established to improve the operation of the fishing
industry. In determining the objective of the agency, it is convenient
to think of it as a sole owner who has complete rights to exploit the
fishing resource. It is reasonable to assume that the agency attempts to
maximize  ∞
J= e−ρt (pqx − c)udt (10.5)
0

subject to (10.2). This is the optimal control problem to be solved.



10.1.3 Solution by Green’s Theorem


The solution method presented in this section generalizes the one based
on Green’s theorem used in Sect. 7.2.2. Solving (10.2) for u we obtain

u = (g(x) − ẋ)/(qx), (10.6)

which we substitute into (10.3), giving


J = ∫_0^∞ e^{−ρt}(pqx − c)(g(x) − ẋ)/(qx) dt. (10.7)

Rewriting, we have
J = ∫_0^∞ e^{−ρt}[M(x) + N(x)ẋ] dt, (10.8)

where
N(x) = −p + c/(qx) and M(x) = (p − c/(qx))g(x). (10.9)
We note that we can write ẋdt = dx so that (10.8) becomes the following
line integral

JB = ∫_B [e^{−ρt}M(x) dt + e^{−ρt}N(x) dx], (10.10)
where B is a state trajectory in (x, t) space, t ∈ [0, ∞).
In this section we are only interested in the infinite horizon solution.
The Green’s theorem method achieves such a solution by first solving a
finite horizon problem as in Sect. 7.2.2, and then determining the infinite
horizon solution for which you are asked to verify that the maximum
principle holds in Exercise 10.1. See also Sethi (1977b).
In order to apply Green’s Theorem to (10.10), let Γ denote a simple
closed curve in the (x, t) space surrounding a region R in the space.
Then,
JΓ = ∮_Γ [e^{−ρt}M(x) dt + e^{−ρt}N(x) dx]
   = ∫∫_R {∂[e^{−ρt}N(x)]/∂t − ∂[e^{−ρt}M(x)]/∂x} dt dx
   = ∫∫_R −e^{−ρt}[ρN(x) + M′(x)] dt dx. (10.11)

If we let

I(x) = −[ρN(x) + M′(x)] = (ρ − g′(x))(p − c/(qx)) − cg(x)/(qx²),
we can rewrite (10.11) as

JΓ = ∫∫_R e^{−ρt}I(x) dt dx.

We can now conclude, as we did in Sects. 7.2.2 and 7.2.4, that the turn-
pike level x̄ is given by setting the integrand of (10.11) to zero. That is,

−I(x) = [g′(x) − ρ](p − c/(qx)) + cg(x)/(qx²) = 0. (10.12)
In addition, a second-order condition must be satisfied for the solution x̄
of (10.12) to be a turnpike solution; see Lemma 7.1 and the subsequent
discussion there. The required second-order condition can be stated as

I(x) < 0 for x < x̄ and I(x) > 0 for x > x̄.

Let x̄ be the unique solution to (10.12) satisfying the second-order condi-


tion. The procedure can be extended to the case of nonunique solutions
as in Sethi (1977b); see Appendix D.8 on the Sethi-Skiba points.
The corresponding value ū of the control which would maintain the
fish stock level at x̄ is g(x̄)/q x̄. In Exercise 10.2 you are asked to show
that x̄ ∈ (xb , X) and also that ū < U. In Fig. 10.1 optimal trajectories
are shown for two different initial values: x0 < x̄ and x0 > x̄.
Let
π(x) = g(x)(pqx − c)/(qx). (10.13)
With π′(x) obtained from (10.13), condition (10.12) can be rewritten as

dπ(x)/dx = ρ(pqx − c)/(qx), (10.14)
which facilitates the following economic interpretations.
The interpretation of π(x) is that it is the sustainable economic rent
at fish stock level x. This can be seen by substituting into (10.3) the
fishing effort u = g(x)/(qx) which, from (10.2), is required to maintain
the fish stock at level x. Suppose we have attained

Figure 10.1: Optimal policy for the sole owner fishery model

the equilibrium level x̄ given by (10.12), and suppose we reduce this level
to x̄ − ε by removing ε amount of fish instantaneously from the fishery,
which can be accomplished by an impulse fishing effort of ε/q x̄. The
immediate marginal revenue MR from this action is
MR = (pqx̄ − c)ε/(qx̄).
However, this causes a decrease in the sustainable economic rent which
equals
π′(x̄)ε.
Over the infinite future, the present value of this stream is
∫_0^∞ e^{−ρt}π′(x̄)ε dt = π′(x̄)ε/ρ.
Adding to this the cost cε/q x̄ of the additional fishing effort ε/q x̄, we
get the marginal cost
MC = π′(x̄)ε/ρ + cε/(qx̄).
Equating MR and MC, we obtain (10.14), which is also (10.12).
When the discount rate ρ = 0, Eq. (10.14) reduces to

π′(x) = 0,

so that it gives the equilibrium fish stock level x̄ |ρ=0 . On account of


this level satisfying the above first-order condition, one can show that it
maximizes the instantaneous profit rate π(x). In economics, such a level
is called the golden rule level. On the other hand, when ρ = ∞, we can
conclude from (10.12) that pqx − c = 0. This gives

x̄ |ρ=∞ = xb = c/pq.

The latter is the bionomic equilibrium attained in the open access fishery
solution; see (10.4). Finally, by denoting x̄ obtained from (10.12) for any
given ρ > 0 as x̄ |ρ , you are asked in Exercise 10.3 to show that

x̄ |ρ=0 > x̄ |ρ>0 > x̄ |ρ=∞ = xb . (10.15)

The sole owner solution x̄ satisfies x̄ > xb = c/pq. If we regard a


government regulatory agency as the sole owner responsible for operat-
ing the fishery at level x̄, then it can impose restrictions, such as gear
regulations, catch limitations, etc. that will increase the fishing cost c.
If c is increased to the level pq x̄, then the fishery can be turned into an
open access fishery subject to those regulations, and it will attain the
bionomic equilibrium at level x̄.
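As a numerical illustration of the turnpike condition, the following sketch solves (10.12) for the Schaefer growth function g(x) = rx(1 − x/X), using the Antarctic fin-whale estimates quoted in Exercise 10.5; the price and cost values p = 1 and c = 40,000 are hypothetical, chosen only so that xb = c/pq = 40,000.

```python
import numpy as np
from scipy.optimize import brentq

# Schaefer-model data from Exercise 10.5 (r, X, x_b); p and c are
# hypothetical, chosen so that x_b = c/(p*q) = 40,000
r, X, q = 0.08, 400_000.0, 1.0
p, c = 1.0, 40_000.0

g  = lambda x: r * x * (1 - x / X)      # growth function g(x)
gp = lambda x: r * (1 - 2 * x / X)      # its derivative g'(x)

def neg_I(x, rho):
    """-I(x) of (10.12); its root in (x_b, X) is the turnpike level."""
    return (gp(x) - rho) * (p - c / (q * x)) + c * g(x) / (q * x ** 2)

for rho in [0.0, 0.10]:
    x_bar = brentq(neg_I, c / (p * q) + 1.0, X - 1.0, args=(rho,))
    u_bar = g(x_bar) / (q * x_bar)      # effort that holds the stock at x_bar
    print(f"rho = {rho:4.2f}: x_bar = {x_bar:9.0f}, u_bar = {u_bar:.4f}")
```

For ρ = 0 the root reproduces the golden rule level (X + c/pq)/2 = 220,000 whales, and as ρ grows the root moves down toward the bionomic level xb, in line with (10.15).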

10.2 An Optimal Forest Thinning Model


Forests are another important kind of renewable natural resource, and
their optimal management is becoming a significant current problem. In
Kilkki and Vaisanen (1969), a model is developed for forest growth and
thinning in connection with Scotch Pine forests in Finland. Thinning is
the process of removing some but not all of the trees prior to clearcutting
the forest. Besides yielding a harvest of wood, the thinning process also
improves the growth rate and quality of the forest. The solution method
employed by Kilkki and Vaisanen was based on dynamic programming.
We will use the maximum principle approach to solve the model. For
related literature, see Clark (1976) and Bowes and Krutilla (1985).

10.2.1 The Forestry Model


We introduce the following notation:

t0 = the initial age of the forest,


ρ = the discount rate,
x(t) = the volume of usable timber in the forest at time t,

u(t) = the rate of thinning at time t,


p = the constant price per unit volume of timber,
c = the constant cost per unit volume of thinning,
f (x) = the growth function, which is positive, concave, and
has a unique maximum at xm ; we assume f (0) = 0,
g(t) = the growth coefficient which is a positive, decreasing
function of time.
The specific functional form for the forest growth used in Kilkki and
Vaisanen (1969) is as follows:

f(x) = xe^{−αx}, 0 ≤ x ≤ 1/α,

where α is a positive constant. Note that f is increasing and concave in
the relevant range, and it attains its maximum at 1/α. They use a growth
coefficient of the form

g(t) = at^{−b},
where a and b are positive constants.
The forest growth equation is

ẋ = g(t)f (x) − u(t), x(t0 ) = x0 . (10.16)

The objective is to maximize the discounted profit


J = ∫_{t0}^∞ e^{−ρt}(p − c)u dt (10.17)

subject to (10.16) and the state and control constraints

x(t) ≥ 0 and u(t) ≥ 0. (10.18)

The control constraint in (10.18) implies that there is no replanting in the


forest. In Sect. 10.2.3 we extend this model to incorporate the successive
replantings of the forest each time it is clearcut.

10.2.2 Determination of Optimal Thinning


We solve the forest thinning model by using the maximum principle.
The Hamiltonian is

H = (p − c)u + λ[gf (x) − u] (10.19)



with the adjoint equation

λ̇ = λ[ρ − gf  (x)]. (10.20)

The optimal control is

u∗ = bang[0, ∞; p − c − λ]. (10.21)

The appearance of ∞ as an upper bound in (10.21) simply means that


impulse control is permitted.
We do not use the Lagrangian form of the maximum principle to
include constraints (10.18) because, as we will see, the forestry problem
has a natural ending at a time T for which x(T ) = 0.
To get the singular control solution triple {x̄, λ̄, ū}, we must observe
that due to the time dependence of g(t), x̄ and ū will be functions of
time. From (10.21), we have

λ̄ = p − c, (10.22)

which is a constant so that λ̇ = 0. From (10.20),


f′(x̄(t)) = ρ/g(t) or x̄(t) = (f′)^{−1}(ρ/g(t)). (10.23)
Then, from (10.16),

ū(t) = g(t)f(x̄(t)) − dx̄(t)/dt (10.24)

gives the singular control.


The solution of (10.23) can be illustrated as in Fig. 10.2. Since g(t)
is a decreasing function of time, it is clear from Fig. 10.2 that x̄(t) is a
decreasing function of time, and then by (10.24), ū(t) ≥ 0. It is also clear
from (10.23) that x̄(T̂ ) = 0 at time T̂ , where T̂ is given by
ρ/g(T̂) = f′(0),
which, in view of f′(0) = 1, gives

T̂ = e^{−(1/b) ln(ρ/a)}. (10.25)

In Fig. 10.3 we plot x̄(t) as a function of time t. The figure also


contains an optimal control trajectory for the case in which x0 < x̄(t0 ).
To determine the switching time t̂, we first solve (10.16) with u = 0. Let
x(t) be the solution. Then, t̂ is the time at which the x(t) trajectory
intersects the x̄(t) curve; see Fig. 10.3.

Figure 10.2: Singular usable timber volume x̄(t)

Figure 10.3: Optimal thinning u∗ (t) and timber volume x∗ (t) for the
forest thinning model when x0 < x̄(t0 )

For x0 > x̄(t0 ), the optimal control at t0 will be the impulse cutting
to bring the level from x0 to x̄(t0 ) instantaneously. To complete the
infinite horizon solution, set u∗ (t) = 0 for t ≥ T̂ . In Exercise 10.12 you
are asked to obtain λ(t) for t ∈ [0, ∞).
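To make the singular path concrete, here is a small numerical sketch that inverts f′(x̄(t)) = ρ/g(t) of (10.23) for the Kilkki-Vaisanen forms f(x) = xe^{−αx} and g(t) = at^{−b}; the parameter values are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical parameters for f(x) = x*exp(-alpha*x), g(t) = a*t**(-b)
alpha, a, b, rho = 0.01, 0.1, 0.5, 0.05

g  = lambda t: a * t ** (-b)
fp = lambda x: (1 - alpha * x) * np.exp(-alpha * x)   # f'(x)

T_hat = (rho / a) ** (-1 / b)   # equals exp(-(1/b)*ln(rho/a)) of (10.25)

def x_bar(t):
    """Singular timber volume from (10.23): solve f'(x) = rho/g(t)."""
    target = rho / g(t)
    if target >= 1.0:           # f'(0) = 1, so the singular level is zero
        return 0.0
    return brentq(lambda x: fp(x) - target, 0.0, 1 / alpha)

for t in [1.0, 2.0, 3.0, T_hat]:
    print(f"t = {t:5.2f}: x_bar = {x_bar(t):7.2f}")
```

Since g(t) decreases, ρ/g(t) rises toward f′(0) = 1 and x̄(t) falls to zero exactly at T̂, which equals 4 for these numbers, matching the shape of Fig. 10.2.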

10.2.3 A Chain of Forests Model


We now extend the model of Sect. 10.2.1 to incorporate successive re-
plantings of the forest each time it is clearcut. This extension is similar
in spirit to the chain of machines model of Sect. 9.3, but with some im-
portant differences. We will assume that successive plantings, sometimes
called forest rotations, take place at equal intervals. This is similar to
what was assumed in the machine replacement problem treated in Sethi
(1973b).
Let T be the rotation period, i.e., the time from planting to clear-
cutting which is to be determined. During the nth rotation, the dynamics
of the forest is given by (10.16) with t ∈ [(n−1)T, nT] and x[(n−1)T] = 0.
The discounted profit to be maximized is given by


J(T) = Σ_{k=1}^{∞} e^{−(k−1)ρT} ∫_0^T e^{−ρt}(p − c)u dt
     = [1/(1 − e^{−ρT})] ∫_0^T e^{−ρt}(p − c)u dt. (10.26)

From the solution of the model in the previous section, and the
assumption that the forest is profitable, it is obvious that 0 ≤ T ≤ T̂
as shown in Fig. 10.4. We have two cases to consider, depending on
whether T > t̂ or T ≤ t̂.

Case 1: T > t̂. From the preceding section it is easy to conclude that
the optimal trajectory is as shown in Fig. 10.4. Using the turnpike ter-
minology of Chap. 7, the trajectory from 0 to A is the entry ramp to
the turnpike, the trajectory from A to B is on the turnpike, and the
trajectory from B to T is the exit ramp. Since u∗ (t) = 0 on the entry
ramp, no timber is collected from time 0 to time t̂. Timber is, however,
collected by thinning from time t̂ to T − and clearcutting at time T. Note
from Fig. 10.4 that x̄(T ) is the amount of timber collected from impulse
clearcutting u∗ (T ) = imp[x̄(T ), 0; T ] at time T. Thus, we can write the

Figure 10.4: Optimal thinning u∗ (t) and timber volume x∗ (t) for the
chain of forests model when T > t̂

discounted profit J ∗ (T ) of (10.26) for a given T as


J∗(T) = [1/(1 − e^{−ρT})] [∫_{t̂}^{T−} e^{−ρt}(p − c)ū(t) dt + e^{−ρT}(p − c)x̄(T)]. (10.27)

Formally, the second term inside the brackets above represents


∫_{T−}^{T} e^{−ρt}(p − c) imp[x̄(t), 0; t] dt, (10.28)

the value of clearcutting at time T. In Exercise 10.13, you are asked to


show that this value is precisely the second term.
For finding the optimal value of T in this case, we differentiate (10.27)
with respect to T, equate the result to zero, and simplify to obtain (see
Exercise 10.14)
(1 − e^{−ρT})g(T)f[x̄(T)] − ρx̄(T) − ρ ∫_{t̂}^{T−} e^{−ρt}ū(t) dt = 0. (10.29)


If the solution T lies in (t̂, T̂ ], keep it; otherwise set T = T̂ . Note that
(10.29) can also be derived by using the transversality condition (3.15);
see Exercise 3.6.

Case 2: T ≤ t̂. The optimal trajectory in this case is as shown in


Fig. 10.5. In the Vidale-Wolfe advertising model of Chap. 7, a similar
case occurs when T is small; see Fig. 7.10 and compare it with Fig. 10.5.
The solution for x(T) is obtained by integrating (10.16) with u = 0 and
x0 = 0. Let this solution be denoted as x∗(t). Here (10.26) becomes

J∗(T) = [e^{−ρT}/(1 − e^{−ρT})](p − c)x∗(T). (10.30)
To find the optimal value of T for this case, we differentiate (10.30)
with respect to T and equate dJ ∗ (T )/dT to zero. We obtain (see Exer-
cise 10.14)
(1 − e^{−ρT})g(T)f[x∗(T)] − ρx∗(T) = 0. (10.31)
If the solution lies in the interval [0, t̂] keep it; otherwise set T = t̂.

Figure 10.5: Optimal thinning and timber volume x∗ (t) for the chain of
forests model when T ≤ t̂

The optimal value T ∗ can be obtained by computing J ∗ (T ) from both


cases and selecting whichever is larger; see also Näslund (1969) and Sethi
(1973c).
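For rotations of the Case 2 type, a direct way to locate the optimal T numerically is to integrate the free-growth dynamics (10.16) with u = 0 and evaluate (10.30) on a grid of rotation lengths. The sketch below does this under hypothetical parameters; because f(0) = 0 makes x ≡ 0 an equilibrium of (10.16), the integration starts from a small positive seedling volume rather than exactly zero.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical parameters; x0 > 0 stands in for a small seedling volume,
# since starting (10.16) exactly at x = 0 would give no growth at all
alpha, a, b, rho, p, c = 0.01, 1.0, 0.5, 0.05, 1.0, 0.2
x0, t0 = 1e-3, 0.01

f = lambda x: x * np.exp(-alpha * x)
g = lambda t: a * t ** (-b)

def x_free(T):
    """x*(T): integrate x' = g(t)f(x) (no thinning) from t0 to T."""
    sol = solve_ivp(lambda t, x: g(t) * f(x), (t0, T), [x0], rtol=1e-8)
    return sol.y[0, -1]

def J_star(T):
    """Discounted profit (10.30) from clearcutting every T years."""
    return np.exp(-rho * T) / (1 - np.exp(-rho * T)) * (p - c) * x_free(T)

Ts = np.linspace(2.0, 60.0, 117)
T_opt = Ts[np.argmax([J_star(T) for T in Ts])]
print(f"approximate optimal rotation: T = {T_opt:.1f}")
```

A full treatment would also evaluate the Case 1 value (10.27) for grid points exceeding t̂ and keep the larger of the two, as the text prescribes.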

10.3 An Exhaustible Resource Model


In the previous two sections we discussed two renewable resource mod-
els. However, many natural resources are nonrenewable or exhaustible.
Examples are petroleum, mineral deposits, coal, etc. Given the growing
energy shortage, the optimal production and use of these resources is
of immense importance to the world. The earliest important work in
this area is Hotelling (1931). Since then, a number of studies have been
published such as Dasgupta and Heal (1974a), Solow (1974), Weinstein
and Zeckhauser (1975), Pindyck (1978a,b), Derzko and Sethi (1981a,b),
Amit (1986) and Heal (1993).
In this section, we discuss a simple model taken from a paper by Sethi
(1979a). The paper obtains the optimal depletion rate of an exhaustible
resource that maximizes a social welfare function involving consumers’
surplus and producers’ surplus with various weights. Here we treat the
special case when these weights are equal.

Figure 10.6: The demand function

10.3.1 Formulation of the Model


The model will be developed under the assumption that at a high enough
price, say p̄, a substitute, preferably renewable, will become available.
For example, if the price of fossil fuel becomes sufficiently high, solar
energy may become an economic substitute. In the North American

context, the resource under consideration could be crude oil and its ex-
pensive substitute could be coal and/or tar sands; see, e.g., Fuller and
Vickson (1987).
We introduce the following notation:

p(t) = the price of the resource at time t,


q = f (p) is the demand function, i.e., the quantity de-
manded at price p; f′ ≤ 0, f(p) > 0 for p < p̄, and
f (p) = 0 for p ≥ p̄, where p̄ is the price at which the
substitute completely replaces the resource. A typical
graph of the demand function is shown in Fig. 10.6,
c = G(q) is the cost function; G(0) = 0, G(q) > 0 for q >
0, G′ > 0 and G″ ≥ 0 for q ≥ 0, and G′(0) < p̄. The
latter assumption makes it possible for the producers
to make a positive profit at a price p below p̄,
Q(t) = the available stock or reserve of the resource at time t;
Q(0) = Q0 > 0,
ρ = the social discount rate; ρ > 0,
T = the horizon time, which is the latest time at which the
substitute will become available regardless of the price
of the natural resource; T > 0.

Before stating the optimal control problem, we need the following


additional definitions and assumptions. Let

c = G[f (p)] = g(p), (10.32)

for which it is obvious that g(p) > 0 for p < p̄ and g(p) = 0 for p ≥ p̄.
Let

π(p) = pf (p) − g(p) (10.33)

denote the profit function of the producers, i.e., the producers’ surplus.
Let p be the smallest price at which π(p) is nonnegative. Assume further
that π(p) is a concave function in the range [p, p̄] as shown in Fig. 10.7.
In the figure the point pm indicates the price which maximizes π(p).

Figure 10.7: The profit function

We also define

ψ(p) = ∫_p^{p̄} f(y) dy (10.34)
as the consumers’ surplus, i.e., the area shown shaded in Fig. 10.6. This
quantity represents the total excess amount that consumers would be
willing to pay. In other words, consumers actually pay pf (p), while they
would be willing to pay
−∫_p^{p̄} yf′(y) dy = pf(p) + ψ(p).

The instantaneous rate of consumers’ surplus and producers’ surplus is


the sum ψ(p) + π(p). Let p̂ denote the price that maximizes this sum, i.e., p̂ solves

ψ′(p̂) + π′(p̂) = p̂f′(p̂) − g′(p̂) = 0. (10.35)
In Exercise 10.16 you will be asked to show that p̂ < pm , as marked in
Fig. 10.7. Later we will show that the correct second-order conditions
hold at p̂.
The optimal control problem is:
max J = ∫_0^T [ψ(p) + π(p)]e^{−ρt} dt (10.36)

subject to
Q̇ = −f (p), Q(0) = Q0 , (10.37)
Q(T ) ≥ 0, (10.38)
and p ∈ Ω = [p, p̄]. Recall that the sum ψ(p) + π(p) is concave in p.

10.3.2 Solution by the Maximum Principle


Form the current-value Hamiltonian

H(Q, p, λ) = ψ(p) + π(p) + λ[−f (p)], (10.39)

where λ satisfies the relation

λ̇ = ρλ, λ(T ) ≥ 0, λ(T )Q(T ) = 0, (10.40)

which implies


λ(t) = 0 if the constraint Q(T) ≥ 0 is not binding, and
λ(t) = λ(T)e^{ρ(t−T)} if it is binding. (10.41)

To obtain the optimal control, the Hamiltonian maximizing condition,


which is both necessary and sufficient in this case (see Theorem 2.1), is

∂H/∂p = ψ′ + π′ − λf′ = (p − λ)f′ − g′ = 0. (10.42)

To show that the solution s(λ) for p of (10.42) actually maximizes the
Hamiltonian, it is enough to show that the second derivative of the
Hamiltonian is negative at s(λ). Differentiating (10.42) gives

∂²H/∂p² = f′ − g″ + (p − λ)f″.

Using (10.42) we have

∂²H/∂p² = f′ − g″ + (g′/f′)f″. (10.43)

From the definition of G in (10.32), we can obtain

G″ = (f′g″ − g′f″)/(f′)³,

which, when substituted into (10.43), gives

∂²H/∂p² = f′ − G″(f′)². (10.44)

The right-hand side of (10.44) is strictly negative because f′ < 0, and
G″ ≥ 0 by assumption. We remark that p̂ = s(0) using (10.35) and
(10.42), and hence the second-order condition for p̂ of (10.35) to give
the maximum of H is verified. In Exercise 10.17 you are asked to show
that s(λ) increases from p̂ as λ increases from 0, and that s(λ) = p̄ when
λ = p̄ − G′(0).

Case 1: The constraint Q(T ) ≥ 0 is not binding. From (10.41), λ(t) ≡ 0


so that from (10.42) and (10.35),

p∗ = p̂. (10.45)

With this value, the total consumption of the resource is T f (p̂), which
must be ≤ Q0 so that the constraint Q(T ) ≥ 0 is not binding. Hence,

T f (p̂) ≤ Q0 (10.46)

characterizes Case 1 and its solution is given in (10.45).

Case 2: T f (p̂) > Q0 so that the constraint Q(T ) ≥ 0 is binding. Ob-


taining the solution requires finding a value of λ(T ) such that
∫_0^{t∗} f(s[λ(T)e^{ρ(t−T)}]) dt = Q0, (10.47)

where

t∗ = min{T, T + (1/ρ) ln[(p̄ − G′(0))/λ(T)]}. (10.48)
The time t∗, if it is less than T, is the time at which s[λ(T)e^{ρ(t∗−T)}] = p̄.
From Exercise 10.17,
λ(T)e^{ρ(t∗−T)} = p̄ − G′(0), (10.49)

which, when solved for t∗ , gives the second argument of (10.48).


One method to obtain the optimal solution is to define T̄ as the
longest time horizon during which the resource can be optimally used.
Such a T̄ must satisfy
λ(T̄) = p̄ − G′(0),
and therefore,
∫_0^{T̄} f(s[{p̄ − G′(0)}e^{ρ(t−T̄)}]) dt = Q0, (10.50)

which is a transcendental equation for T̄ . We now have two subcases.

Subcase 2a: T ≥ T̄ . The optimal control is


p∗(t) = s[{p̄ − G′(0)}e^{ρ(t−T̄)}] for t ≤ T̄, and p∗(t) = p̄ for t > T̄. (10.51)

Clearly in this subcase, t∗ = T̄ and


λ(T) = [p̄ − G′(0)]e^{−ρ(T̄−T)}.
A sketch of (10.51) is shown in Fig. 10.8.

Figure 10.8: Optimal price trajectory for T ≥ T̄

Subcase 2b: T < T̄ . Here the optimal price trajectory is


p∗(t) = s[λ(T)e^{ρ(t−T)}], (10.52)

where λ(T ) is to be obtained from the transcendental equation


∫_0^T f(s[λ(T)e^{ρ(t−T)}]) dt = Q0. (10.53)

A sketch of (10.52) is shown in Fig. 10.9.


In Exercise 10.18 you are given specific functions for the exhaustible
resource model and asked to work out explicit optimal price trajectories
for the model.

Figure 10.9: Optimal price trajectory for T < T̄
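As a numerical check on this construction, the following sketch specializes to the functions of Exercise 10.18, for which (10.42) gives s(λ) = (λ + 2p̄)/3 and G′(0) = 0, and solves the transcendental equation (10.50) for T̄; the values of p̄, Q0, and ρ are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq

p_bar, Q0, rho = 3.0, 10.0, 0.05     # hypothetical data

# With f(p) = p_bar - p and G(q) = q**2, (10.42) yields
# s(lam) = (lam + 2*p_bar)/3, and G'(0) = 0 in (10.49).
s = lambda lam: (lam + 2 * p_bar) / 3

def reserve_gap(T_bar):
    """Left side of (10.50) minus Q0, integrated in closed form:
    f(s(p_bar*e^{rho(t-T_bar)})) = (p_bar/3)(1 - e^{rho(t-T_bar)})."""
    used = (p_bar / 3) * (T_bar - (1 - np.exp(-rho * T_bar)) / rho)
    return used - Q0

T_bar = brentq(reserve_gap, 1e-6, 500.0)
print(f"T_bar = {T_bar:.3f}")

# price path (10.51) over [0, T_bar] for a horizon T >= T_bar
t = np.linspace(0.0, T_bar, 5)
print(np.round(s(p_bar * np.exp(rho * (t - T_bar))), 3))
```

The root satisfies T̄ + e^{−ρT̄}/ρ = 1/ρ + 3Q0/p̄, the relation asserted in part (b) of Exercise 10.18.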

Exercises for Chapter 10

E 10.1 As an alternate derivation for the turnpike level x̄ of (10.12),


use the maximum principle to obtain the optimal long-run stationary
equilibrium triple {x̄, ū, λ̄}.

E 10.2 Prove that x̄ ∈ (xb , X) and ū < U, where x̄ is the solution of


(10.12) and xb is given in (10.4).

E 10.3 Show that x̄ obtained from (10.12) decreases as ρ increases. Fur-


thermore, derive the relation (10.15).

E 10.4 Obtain the turnpike level x̄ of (10.12) for the special case g(x) =
x(1 − x), p = 2, c = q = 1, and ρ = 0.1.

E 10.5 Perform the following:


(a) For the Schaefer model with g(x) = rx(1 − x/X) and q = 1, derive
the formula for the turnpike level x̄ of (10.12).

(b) Allen (1973) and Clark (1976) estimated the parameters of the
Schaefer model for the Antarctic fin-whale population as follows:
r = 0.08, X = 400, 000 whales, and xb = 40, 000. Solve for x̄ for
ρ = 0, 0.10, and ∞.

E 10.6 Obtain π′(x) from (10.13) and use it in (10.12) to derive (10.14).

E 10.7 Let π(x, u) = [p − c(x)](qux) in (10.3), where c(x) is a differ-


entiable, decreasing, and convex function. Derive an expression for x̄
satisfying an equation corresponding to (10.12).

E 10.8 Show that extinction is optimal if ∞ > p ≥ c(0) and ρ > 2g′(0)
in Exercise 10.7.

Hint: Use the generalized mean value theorem.

E 10.9 Let the constant price p in Exercise 10.7 be replaced by a time


dependent price p(t) which is differentiable with respect to t. Derive the
equation for x̄ corresponding to (10.12) for this nonautonomous problem.
Furthermore, find the turnpike level x̄(t) satisfying the derived equation.

E 10.10 Let π(x, u) of Exercise 10.7 be

π(x, u) = [p − c(x)](qux) + V (x),

where V(x) with V′(x) > 0 is the conservation value function, which
measures the value to society of having a large fish stock. By deriving
the analogue to (10.12), show that the new x̄ is larger than the x̄ in
Exercise 10.7.

E 10.11 When c(x) = 0 in Exercise 10.9, show that the analogue to


(10.12) reduces to

g′(x) = ρ − ṗ/p.
Give an economic interpretation of this equation.

E 10.12 Find λ(t), t ∈ [0, ∞), for the infinite horizon model of
Sect. 10.2.2.

E 10.13 Derive the second term inside the brackets of (10.27) by com-
puting e−ρT (p − c) imp[x̄(T ), 0; T ].

E 10.14 Derive (10.29) by using the first-order condition for maximizing


J ∗ (T ) of (10.27) with respect to T. Similarly, derive (10.31).

E 10.15 Forest Fertilization Model (Näslund 1969). Consider a forestry


model in which thinning is not allowed, and the forest is to be clearcut

at a fixed time T. Suppose v(t) ≥ 0 is the rate of fertilization at time t,


so that the growth equation is

ẋ = r(X − x) + f (v, t), x(0) = x0 ,

where x is the volume of timber, r and X are positive constants, and f


is an increasing, differentiable, concave function of v. The objective is to
maximize

J = −c ∫_0^T e^{−ρt}v(t) dt + e^{−ρT}px(T),

where p is the price of a unit of timber and c is the unit cost of fertiliza-
tion.

(a) Show that the optimal control v ∗ (t) is given by solving the
equation
∂f/∂v = (c/p)e^{−(ρ+r)(t−T)}.
Check that the second order condition for a maximum holds for
this v ∗ (t).

(b) If f (v) = (1 + t) ln(1 + v), then find explicitly the optimal control
v∗(t) under the assumption that p/c > e^{(ρ+r)T}. Show further that
v ∗ (t) is increasing and convex in t ∈ [0, T ].

E 10.16 Show that p̂ defined in (10.35) satisfies p ≤ p̂ ≤ pm .

E 10.17 Show that s(λ), the solution of (10.42), increases from p̂ as λ
increases from 0. Also show that s(λ) = p̄ when λ = p̄ − G′(0).

E 10.18 For the model of Sect. 10.3, assume




f(p) = p̄ − p for p ≤ p̄, and f(p) = 0 for p > p̄,

G(q) = q².

(a) Show that p∗ = 2p̄/3 if T ≤ 3Q0 /p̄.



(b) Show that T̄ satisfies T̄ + e−ρT̄ /ρ = 1/ρ + 3Q0 /p̄. Moreover,


p∗(t) = p̄[e^{ρ(t−T̄)} + 2]/3 if t ≤ T̄, and p∗(t) = p̄ if t > T̄,

for T ≥ T̄, and

p∗(t) = 2p̄/3 + ρ[p̄T − 3Q0]/[3e^{−ρt}(e^{ρT} − 1)]

for 3Q0/p̄ < T < T̄.
Chapter 11

Applications to Economics

Optimal control theory has been extensively applied to the solution of


economic problems since the early papers that appeared in Shell (1967)
and the works of Arrow (1968) and Shell (1969). The field is too vast
to be surveyed in detail here, however. Several books in the area are:
Arrow and Kurz (1970), Hadley and Kemp (1971), Takayama (1974),
Lesourne and Leban (1982), Seierstad and Sydsæter (1987), Feichtinger
(1988), Léonard and Long (1992), Van Hilten et al. (1993), Kamien and
Schwartz (1992), Dockner et al. (2000), and Weber (2011). We
content ourselves with the discussion of three simple kinds of models.
In Sect. 11.1, two capital accumulation or economic growth models
are presented. In Sect. 11.2, we formulate and solve an epidemic control
model. Finally, in Sect. 11.3 we discuss a pollution control model.

11.1 Models of Optimal Economic Growth


In this section we develop two simple models of economic growth or
capital accumulation. The earliest such model was developed by Ramsey
(1928) for an economy having a stationary population; see Exercise 11.7
for one of his models.
The first model treated in Sect. 11.1.1 is a finite horizon fixed-end-
point model with a stationary population. The problem is to maximize
the present value of the utility of consumption for the society, as well as
to accumulate a specified capital stock by the end of the horizon.
The second model incorporates an exogenously and exponentially


growing population in the infinite horizon setting. A technique known


as the method of phase diagrams is used to analyze the model.
For related discussion and extensions of these models, see Arrow
and Kurz (1970), Burmeister and Dobell (1970), Intriligator (1971), and
Arrow et al. (2007, 2010).

11.1.1 An Optimal Capital Accumulation Model


Consider a one-sector economy in which the stock of capital, denoted by
K(t), is the only factor of production. Let F (K) be the output rate of
the economy when K is the capital stock. Assume F (0) = 0, F (K) >
0, F′(K) > 0, and F″(K) < 0, for K > 0. These conditions imply
the diminishing marginal productivity of capital as well as the strict
concavity of F (K) in K. A part of this output is consumed and the
remainder is reinvested for further accumulation of capital stock. Let
C(t) be the amount of output allocated to consumption, and let I(t) =
F [K(t)] − C(t) be the amount invested. Let δ be the constant rate of
depreciation of capital. Then, the capital stock equation is

K̇ = F (K) − C − δK, K(0) = K0 . (11.1)

Let U (C) be the society’s utility of consumption, where we assume


U′(0) = ∞, U′(C) > 0, and U″(C) < 0, for C ≥ 0. These conditions
ensure that U (C) is strictly concave in C. Let ρ denote the social discount
rate and T denote the finite horizon. Then, a government which is elected
for a term of T years could consider the following problem:
max J = ∫_0^T e^{−ρt}U[C(t)] dt (11.2)

subject to (11.1) and the fixed-end-point condition

K(T ) = KT , (11.3)

where KT is a given positive constant. It may be noted that replacing


(11.3) by K(T ) ≥ KT would give the same solution.

11.1.2 Solution by the Maximum Principle


Form the current-value Hamiltonian as

H = U (C) + λ[F (K) − C − δK]. (11.4)



The adjoint equation is

λ̇ = ρλ − ∂H/∂K = (ρ + δ)λ − λ∂F/∂K, λ(T) = α, (11.5)
where α is a constant to be determined.
The optimal control is given by

∂H/∂C = U′(C) − λ = 0. (11.6)
Since U′(0) = ∞, the solution of this condition always gives C(t) > 0.
An intuitive argument for this result is that a slight increase from a zero
consumption rate brings an infinitely large marginal utility, and
therefore optimal consumption will remain strictly positive. Moreover,
the capital stock will not be allowed to fall to zero along an optimal
path, in order to keep the consumption rate from falling to zero. See
Karatzas et al. (1986) for a rigorous demonstration of this result in a
related context.
Note that the sufficiency of optimality is easily established here by
obtaining the derived Hamiltonian H 0 (K, λ) by substituting for C from
(11.6) in (11.4), and showing that H 0 (K, λ) is concave in K. This follows
easily from the facts that F (K) is concave and λ > 0 from (11.6) on
account of the assumption that U′(C) > 0.
The economic interpretation of the Hamiltonian is straightforward.
It consists of two terms: the first one gives the utility of current con-
sumption and the second one gives the net investment evaluated by price
λ, which, from (11.6), reflects the marginal utility of consumption.
For the economic system to be run optimally, the solution must sat-
isfy the following three conditions:

(a) The static efficiency condition (11.6) which maximizes the value of
the Hamiltonian at each instant of time myopically, provided that
λ(t) is known.

(b) The dynamic efficiency condition (11.5) which forces the price λ of
capital to change over time in such a way that the capital stock
always yields a net rate of return, which is equal to the social
discount rate ρ. That is,

dλ + (∂H/∂K) dt = ρλ dt.

(c) The long-run foresight condition, which establishes the terminal


price λ(T ) of capital in such a way that exactly the terminal capital
stock KT is obtained at T.

Equations (11.1), (11.3), (11.5), and (11.6) form a two-point bound-


ary value problem which can be solved numerically. In Exercise 11.1, you
are asked to solve a simple version of the model in which the TPBVP
can be solved analytically.
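To illustrate, the following sketch solves the TPBVP by shooting on the unknown initial price λ(0), under the purely hypothetical specification F(K) = √K and U(C) = ln C, so that (11.6) gives C = 1/λ.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

# Hypothetical specification: F(K) = sqrt(K), U(C) = ln(C)
rho, delta, T, K0, KT = 0.05, 0.1, 20.0, 10.0, 10.0

def rhs(t, y):
    K, lam = y
    K = max(K, 1e-6)                  # guards against numerical undershoot
    lam = max(lam, 1e-9)
    C = 1.0 / lam                     # static efficiency (11.6): U'(C) = lam
    dK = np.sqrt(K) - C - delta * K   # state equation (11.1)
    dlam = (rho + delta) * lam - lam * 0.5 / np.sqrt(K)   # adjoint (11.5)
    return [dK, dlam]

def terminal_gap(lam0):
    sol = solve_ivp(rhs, (0.0, T), [K0, lam0], rtol=1e-8)
    return sol.y[0, -1] - KT          # long-run foresight: hit K(T) = K_T

lam0 = brentq(terminal_gap, 0.35, 0.7)    # shoot on lambda(0)
print(f"lambda(0) = {lam0:.4f}")
```

Too low a guess for λ(0) makes consumption so high that the capital stock collapses before T, while too high a guess makes K(T) overshoot KT; the bisection on the terminal gap implements the long-run foresight condition (c).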

11.1.3 Introduction of a Growing Labor Force


In the preceding sections of this chapter we studied the simplest capital
accumulation model in which the population was assumed to be fixed.
We now introduce labor as a new factor (treated the same as population,
for simplicity), which grows exponentially at a fixed rate g, 0 < g < ρ. It
is now possible to recast the new model in terms of per capita variables
so that it is formally similar to the previous model. The introduction
of the per capita variables makes it possible to treat the infinite horizon
version of the new model.
Let L(t) denote the amount of labor at time t. Since it is growing
exponentially at rate g, we have

L(t) = L(0)egt . (11.7)

Let F (K, L) be the production function which is assumed to be strictly


increasing and concave in both factors of production so that FK >
0, FL > 0, FKK < 0, and FLL < 0 for K ≥ 0, L ≥ 0. Furthermore, it is
homogeneous of degree one so that F (mK, mL) = mF (K, L) for m ≥ 0.
We define k = K/L and the per capita production function f (k) as

f(k) = F(K, L)/L = F(K/L, 1) = F(k, 1). (11.8)
It is clear from the assumptions on F that f′(k) > 0 and f″(k) < 0 for
k ≥ 0.
To derive the state equation for k, we note that

K̇ = k̇L + k L̇ = k̇L + kgL.

Substituting for K̇ from (11.1) and defining per capita consumption c =


C/L, we get
k̇ = f (k) − c − γk, k(0) = k0 , (11.9)

where γ = g + δ.
Let u(c) be the utility of per capita consumption c, where u is as-
sumed to satisfy

u′(c) > 0 and u″(c) < 0 for c ≥ 0, and u′(0) = ∞. (11.10)

As in Sect. 11.1.2, the last condition in (11.10) rules out zero consump-
tion.
According to the position known as total utilitarianism, the society's
discounted total utility is ∫_0^∞ e^{−ρt}L(t)u(c(t)) dt, which we aim to
maximize. In view of (11.7), this is equivalent to maximizing

J = ∫_0^∞ e^{−rt}u(c) dt, (11.11)

where r = ρ − g > 0. Note also that r + γ = ρ + δ.

Remark 11.1 It is interesting to note that the problem is an infinite


version of that in Sect. 11.1.1, if we consider r to be the adjusted dis-
count rate and γ to be the adjusted depreciation rate. This reduction of
a model with two factors of production to a one-sector model does not
work if we jettison the assumption of an exponentially growing popula-
tion. Then, the analysis becomes much more complicated. The reader
is referred to Arrow et al. (2007, 2010) for economic growth models with
non-exponentially and endogenously growing populations.

11.1.4 Solution by the Maximum Principle


The current-value Hamiltonian is

H = u(c) + λ[f (k) − c − γk]. (11.12)

The adjoint equation is

λ̇ = rλ − ∂H/∂k = (r + γ)λ − f′(k)λ = (ρ + δ)λ − f′(k)λ. (11.13)
To obtain the optimal control, we differentiate (11.12) with respect to c,
set it to zero, and solve
u′(c) = λ. (11.14)
Let c = h(λ) = (u′)^{−1}(λ) denote the solution of (11.14). In Exercise 11.3,
you are asked to show that h′(λ) < 0. This can be easily shown by

inverting the graph of u′(c) vs. c. Alternatively you can rewrite (11.14)
as u′(h(λ)) = λ and then take its derivative with respect to λ.
To show that the maximum principle is sufficient for optimality, it is
enough to show that the derived Hamiltonian

H 0 (k, λ) = u(h(λ)) + λ[f (k) − h(λ) − γk] (11.15)

is concave in k for any λ satisfying (11.14). The concavity follows im-


mediately from the facts that λ is positive from (11.10) and (11.14) and
f (k) is concave because of the assumptions on F (K, L).
Equations (11.9), (11.13), and (11.14) now constitute a complete au-
tonomous system, since time does not enter explicitly in these equations.
Such systems can be analyzed by the phase diagram method, which is
used next.
In Fig. 11.1 we have drawn a phase diagram for the two equations

k̇ = f (k) − h(λ) − γk = 0, (11.16)



λ̇ = (r + γ)λ − f′(k)λ = 0, (11.17)

obtained from (11.9), (11.13), and (11.14). In Exercise 11.2 you are asked
to show that the graphs of k̇ = 0 and λ̇ = 0 are like the dotted curves
in Fig. 11.1. Given the nature of these graphs, known as isoclines, it is
clear that they have a unique point of intersection denoted as (k̄, λ̄). In

Figure 11.1: Phase diagram for the optimal growth model



other words, (k̄, λ̄) is the unique solution of the equations

f(k̄) − h(λ̄) − γk̄ = 0 and (r + γ) − f′(k̄) = 0. (11.18)

The two isoclines divide the plane into four regions, I, II, III, and IV,
as marked in Fig. 11.1. To the left of the vertical line λ̇ = 0, we have
k < k̄ and therefore r + γ < f′(k) in view of f″(k) < 0. Thus, λ̇ < 0 from
(11.13). Therefore, λ is decreasing, which is indicated by the downward
pointing arrows in Regions I and IV. On the other hand, to the right of
the vertical line, in Regions II and III, the arrows are pointed upward
because λ is increasing. In Exercise 11.3, you are asked to show that
the horizontal arrows, which indicate the direction of change in k, point
to the right above the k̇ = 0 isocline, i.e., in Regions I and II, and they
point to the left in Regions III and IV which are below the k̇ = 0 isocline.
The point (k̄, λ̄) represents the optimal long-run stationary equilib-
rium. The values of k̄ and λ̄ are obtained in Exercise 11.2. The next
important thing is to show that there is a unique path starting from
any initial capital stock k0 , which satisfies the maximum principle and
converges to the steady state (k̄, λ̄). Clearly such a path cannot start in
Regions II and IV, because the directions of the arrows in these areas
point away from (k̄, λ̄). For k0 < k̄, the value of λ0 (if any) must be
selected so that (k0 , λ0 ) is in Region I. For k0 > k̄, on the other hand,
the point (k0 , λ0 ) must be chosen to be in Region III. We analyze the
case k0 < k̄ only, and show that there exists a unique λ0 associated with
the given k0 , and that the optimal path, shown as the solid curve in Re-
gion I of Fig. 11.1, starts from (k0 , λ0 ) and converges to (k̄, λ̄). It should
be obvious that this path also represents the locus of such (k0 , λ0 ) for
k0 ∈ [0, k̄]. The analysis of the case k0 > k̄ is left as Exercise 11.4.
In Region I, k̇(t) > 0 and k(t) is an increasing function of t as indi-
cated by the horizontal right-directed arrow in Fig. 11.1. Therefore, we
can replace the independent variable t by k, and then use (11.16) and
(11.17) to obtain
λ′(k) = dλ/dk = (dλ/dt)/(dk/dt) = [f′(k) − (r + γ)]λ/[h(λ) + γk − f(k)]. (11.19)

Thus, our task of showing that there exists an optimal path starting from
any initial k0 < k̄ is equivalent to showing that there exists a solution
of the differential equation (11.19) on the interval [0, k̄], beginning with
the boundary condition λ(k̄) = λ̄. For this, we must obtain the value
λ′(k̄). Since both the numerator and the denominator in (11.19) vanish

at k = k̄, we need to derive λ′(k̄) by a perturbation argument. To do so,


we use (11.19) and (11.18) to obtain

λ′(k) = [r + γ − f′(k)]λ/[f(k) − γk − h(λ)]
      = [f′(k̄) − f′(k)]λ/[f(k) − f(k̄) − γk + γk̄ − h(λ) + h(λ̄)].

We use L’Hôpital’s rule to take the limit as k → k̄ and obtain

λ′(k̄) = −f″(k̄)λ̄/[f′(k̄) − γ − h′(λ̄)] = −f″(k̄)λ̄/[f′(k̄) − γ − λ′(k̄)/u″(h(λ̄))], (11.20)
or
−(λ′(k̄))²/u″(h(λ̄)) + λ′(k̄)[f′(k̄) − γ] + λ̄f″(k̄) = 0. (11.21)
Note that the second equality in (11.20) uses the relation h′(λ̄) =
1/u″(h(λ̄)) obtained by differentiating u′(c) = u′(h(λ)) = λ of (11.14)
with respect to λ at λ = λ̄.
It is easy to see that (11.21) has one positive solution and one negative
solution. We take the negative solution for λ′(k̄) because of the following
consideration. With the negative solution, we can prove that the differ-
ential equation (11.19) has a smooth solution, such that λ′(k) < 0. For
this, let
π(k) = f (k) − kγ − h(λ(k)).
Since k < k̄, we have r + γ − f′(k) < 0. Then from (11.19), since λ′(k̄) < 0,
we have λ(k̄ − ε) > λ(k̄). Also since λ̄ > 0 and f″(k̄) < 0, Eq. (11.20)
with λ′(k̄) < 0 implies

π′(k̄) = f′(k̄) − γ − λ′(k̄)/u″(h(λ̄)) < 0,

and thus,

π(k̄ − ε) = f (k̄ − ε) − γ(k̄ − ε) − h(λ(k̄ − ε)) > 0.

Therefore, the derivative at k̄ − ε is well defined and λ′(k̄ − ε) < 0. We


can proceed as long as

π′(k) = f′(k) − γ − λ′(k)/u″(h(λ(k))) < 0. (11.22)

This implies that f(k) − kγ − h(λ) > 0, and also since r + γ − f′(k)
remains negative for k < k̄, we have λ′(k) < 0.

Suppose now that there is a point k̃ < k̄ with π(k̃) = 0. Then, since
π(k̃ + ε) > 0, we have π′(k̃) ≥ 0. But at k̃, π(k̃) = 0 in (11.19) implies
λ′(k̃) = −∞, and then from (11.22), we have π′(k̃) = −∞, which is a
contradiction with π′(k̃) ≥ 0. Thus, we can proceed on the whole interval
[0, k̄]. This indicates that the path λ(k) (shown as the solid line in Region
I of Fig. 11.1) remains above the curve

k̇ = f (k) − kγ − h(λ) = 0,

shown as the dotted line in Fig. 11.1 when k < k̄. Thus, we can set
λ0 = λ(k0 ) for 0 ≤ k0 ≤ k̄ and have the optimal path starting from
(k0 , λ0 ) and converging to (k̄, λ̄).
Similar arguments hold when the initial capital stock k0 > k̄, in order
to show that the optimal path (shown as the solid line in Region III of
Fig. 11.1) exists in this case. You have already been asked to carry out
this analysis in Exercise 11.4.
We should mention that the conclusions derived in this subsection
could have been reached by invoking the Global Saddle Point Theorem
stated in Appendix D.7, but we have chosen instead to carry out a de-
tailed analysis for illustrating the use of the phase diagram method. The
next time we use the phase diagram method will be in Sect. 11.3.3, and
there we shall rely on the Global Saddle Point Theorem.
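The steady state and the slope of the saddle path can be computed explicitly once functional forms are fixed. The sketch below uses the hypothetical specification f(k) = k^a (Cobb-Douglas) and u(c) = ln c, for which h(λ) = 1/λ and u″(h(λ̄)) = −λ̄².

```python
import numpy as np

# Hypothetical specification: f(k) = k**a, u(c) = ln(c)
a, r, gamma = 0.3, 0.03, 0.12          # recall r + gamma = rho + delta

k_bar = (a / (r + gamma)) ** (1 / (1 - a))   # f'(k_bar) = r + gamma, (11.18)
c_bar = k_bar ** a - gamma * k_bar           # from the k_dot = 0 isocline
lam_bar = 1 / c_bar                          # u'(c_bar) = lam_bar, (11.14)

# Quadratic (11.21) for lam'(k_bar); with u = ln c, u''(h(lam)) = -lam**2
fp  = a * k_bar ** (a - 1)
fpp = a * (a - 1) * k_bar ** (a - 2)
roots = np.real(np.roots([1 / lam_bar ** 2, fp - gamma, lam_bar * fpp]))
slope = roots.min()                          # keep the negative root

print(f"k_bar = {k_bar:.4f}, lam_bar = {lam_bar:.4f}, slope = {slope:.4f}")
```

The product of the two roots of (11.21) is λ̄³f″(k̄) < 0, so one root is positive and one negative, and the code simply keeps the negative one, as the text prescribes.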

11.2 A Model of Optimal Epidemic Control


Certain infectious epidemic diseases are seasonal in nature. Examples
are the common cold, the flu, and certain children’s diseases. When it
is beneficial to do so, control measures are taken to alleviate the effects
of these diseases. Here we discuss a simple control model due to Sethi
(1974c) for analyzing an epidemic problem. Related problems have been
treated by Sethi and Staats (1978), Sethi (1978c), and Francis (1997).
See Wickwire (1977) for a good survey of optimal control theory applied
to the control of pest infestations and epidemics, and Swan (1984) for
applications to biomedicine.

11.2.1 Formulation of the Model


Let N be the total fixed population. Let x(t) be the number of infectives
at time t so that the remaining N − x(t) is the number of susceptibles.
To keep the model simple, assume that no immunity is acquired so that

when infected people are cured, they become susceptible again. The
state equation governing the dynamics of the epidemic spread in the
population is
ẋ = βx(N − x) − vx, x(0) = x0 , (11.23)
where β is a positive constant termed infectivity of the disease, and v
is a control variable reflecting the level of medical program effort. Note
that x(t) is in [0, N ] for all t > 0 if x0 is in that interval.
The objective of the control problem is to minimize the present value
of the cost stream up to a horizon time T, which marks the end of the
season for that disease. Let h denote the unit social cost per infective,
let m denote the cost of control per unit level of program effort, and let
Q denote the capability of the health care delivery system providing an
upper bound on v. The optimal control problem is:
max J = ∫_0^T −(hx + mv)e^{−ρt} dt (11.24)

subject to (11.23), the terminal constraint that

x(T ) = xT , (11.25)

and the control constraint

0 ≤ v ≤ Q.

11.2.2 Solution by Green’s Theorem


Rewriting (11.23) as

vdt = [βx(N − x)dt − dx]/x

and substituting into (11.24) yields the line integral


JΓ = −∫_Γ {[hx + mβ(N − x)]e^{−ρt} dt − (m/x)e^{−ρt} dx}, (11.26)

where Γ is a path from x0 to xT in the (t, x)-space. Let Γ1 and Γ2 be


two such paths from x0 to xT , and let R be the region enclosed by Γ1
and Γ2 . By Green’s theorem, we can write
JΓ1−Γ2 = JΓ1 − JΓ2 = −∫∫_R (mρ/x − h + mβ)e^{−ρt} dt dx. (11.27)

To obtain the singular control we set the integrand of (11.27) equal to


zero, as we did in Sect. 7.2.2. This yields
x = ρ/(h/m − β) = ρ/θ, (11.28)

where θ = h/m − β. Define the singular state xs as follows:




xs = ρ/θ if 0 < ρ/θ < N, and xs = N otherwise. (11.29)

The corresponding singular control level




vs = β(N − xs) = β(N − ρ/θ) if 0 < ρ/θ < N, and vs = 0 otherwise. (11.30)

We will show that xs is the turnpike level of infectives. It is instructive to


interpret (11.29) and (11.30) for the various cases. If ρ/θ > 0, then θ > 0
so that h/m > β. Here the smaller the ratio h/m, the larger the turnpike
level xs , and therefore, the smaller the medical program effort should be.
In other words, the smaller the social cost per infective and/or the larger
the treatment cost per infective, the smaller the medical program effort
should be.
When ρ/θ < 0, you are asked to show in Exercise 11.9 that xs = N
in the case h/m < β, which means the ratio of the social cost to the
treatment cost is smaller than the infectivity coefficient. Therefore, in
this case when there is no terminal constraint, the optimal trajectory
involves no treatment effort. An example of this case is the common
cold where the social cost is low and treatment cost is high.
The optimal control for the fortuitous case when xT = xs is




v∗(x(t)) = Q if x(t) > xs, vs if x(t) = xs, and 0 if x(t) < xs. (11.31)

When xT ≠ xs, there are two cases to consider. For simplicity of expo-


sition we assume x0 > xs and T and Q to be large.

Case 1: xT > xs . The optimal trajectory is shown in Fig. 11.2. In


Exercise 11.8 you are asked to show its optimality by using Green’s
theorem.

Case 2: xT < xs . The optimal trajectory is shown in Fig. 11.3. It can


be shown that x goes asymptotically to N − Q/β if v = Q. The level is
marked in Fig. 11.3.

The optimal control shown in Figs. 11.2 and 11.3 assumes 0 < xs <
N. It also assumes that T is large so that the trajectory will spend some
time on the turnpike and Q is large so that xs ≥ N − Q/β. The graphs
are drawn for x0 > xs and xs < N/2; for all other cases see Sethi (1974c).

Figure 11.2: Optimal trajectory when xT > xs
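A quick simulation makes the turnpike policy (11.31) visible. The sketch below uses hypothetical values of N, β, h/m, ρ, and Q chosen so that 0 < xs < N and xs ≥ N − Q/β, and runs a forward-Euler approximation of (11.23) from x0 > xs.

```python
# Hypothetical parameters with 0 < x_s < N and x_s >= N - Q/beta
N, beta, rho, Q = 1000.0, 0.002, 0.05, 3.0
h_over_m = 0.01                      # ratio of social to treatment cost

theta = h_over_m - beta              # as in (11.28)
xs = rho / theta if 0 < rho / theta < N else N   # singular state (11.29)
vs = beta * (N - xs)                 # singular control (11.30)

x, dt = 900.0, 0.001                 # x0 > xs; forward-Euler step
for _ in range(int(5.0 / dt)):
    v = Q if x > xs else vs          # turnpike policy (11.31)
    x += dt * (beta * x * (N - x) - v * x)   # state equation (11.23)
    x = max(x, xs)                   # hold the singular arc once reached
print(f"x_s = {xs:.2f}, v_s = {vs:.4f}, x(5) = {x:.2f}")
```

With these numbers the state falls from x0 = 900 to the turnpike xs = 6.25 in roughly two time units and then stays there under the singular effort vs, mirroring the turnpike portion of Fig. 11.2.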

11.3 A Pollution Control Model


In this section we will describe a simple pollution control model due to
Keeler et al. (1971). We will describe this model in terms of an economic
system in which labor is the only primary factor of production, which is
allocated between food production and DDT production. It is assumed
that all of the food produced is used for consumption. On the other hand,
all of the DDT produced is used as a secondary factor of production
which, along with labor, determines the food output. However, when
used, DDT causes pollution, which can only be reduced by natural decay.
The objective of the society is to maximize the total present value of the
utility of food less the disutility of pollution due to the use of DDT.

Figure 11.3: Optimal trajectory when xT < xs

11.3.1 Model Formulation

We introduce the following notation:

L = the total labor force, assumed to be constant for simplicity,


v = the amount of labor used for DDT production,
L − v = the amount of labor used for food production,
P = the stock of DDT pollution at time t,
a(v) = the rate of DDT output; a(0) = 0, a′ > 0, a″ < 0, for v ≥ 0,
δ = the natural exponential decay rate of DDT pollution,
C(v) = f [L − v, a(v)] = the rate of food output to be consumed;
C(v) is concave, C(0) > 0, C(L) = 0; C(v) attains a
unique maximum at v = V > 0; see Fig. 11.4.
Note that a sufficient condition for C(v) to be strictly
concave is f12 ≥ 0 along with the usual concavity and
monotonicity conditions on f (see Exercise 11.10),
u(C) = the utility function of consuming the food output C ≥ 0;
u′(0) = ∞, u′(C) > 0, u″(C) < 0,
h(P ) = the disutility function of pollution stock P ≥ 0;
h′(0) = 0, h′(P) > 0, h″(P) > 0.

Figure 11.4: Food output function

The optimal control problem is:


max J = ∫_0^∞ e^{−ρt}[u(C(v)) − h(P)] dt (11.32)

subject to
Ṗ = a(v) − δP, P (0) = P0 , (11.33)
0 ≤ v ≤ L. (11.34)
From Fig. 11.4, it is obvious that v is at most V, since the production
of DDT beyond that level decreases food production and increases DDT
pollution. Hence, (11.34) can be reduced to simply

v ≥ 0. (11.35)

11.3.2 Solution by the Maximum Principle


Form the current-value Lagrangian

L(P, v, λ, μ) = u[C(v)] − h(P ) + λ[a(v) − δP ] + μv (11.36)

using (11.32), (11.33) and (11.35), where

λ̇ = (ρ + δ)λ + h′(P), (11.37)

and
μ ≥ 0 and μv = 0. (11.38)

The optimal solution is given by

∂L/∂v = u′[C(v)]C′(v) + λa′(v) + μ = 0. (11.39)
Since the derived Hamiltonian is concave, conditions (11.36)–(11.39) to-
gether with
lim λ(t) = λ̄ = constant (11.40)
t→∞

are sufficient for optimality; see Theorem 2.1 and Sect. 2.4. The phase
diagram analysis presented below gives λ(t) satisfying (11.40).

11.3.3 Phase Diagram Analysis


From the assumptions on C(v) or from Fig. 11.4, we see that C′(0) >
0. This means that du/dv = u′(C(v))C′(v)|_{v=0} > 0. This along with
h′(0) = 0 implies that v̄ > 0, meaning that it pays to produce some
positive amount of DDT in equilibrium. Therefore, the equilibrium value
of the Lagrange multiplier is zero, i.e., μ̄ = 0. From (11.33), (11.37) and
(11.39), we get the equilibrium values P̄ , λ̄, and v̄ as follows:

P̄ = a(v̄)/δ, (11.41)

λ̄ = −h′(P̄)/(ρ + δ) = −u′[C(v̄)]C′(v̄)/a′(v̄). (11.42)
From (11.42) and the assumptions on the derivatives of h, u, C, and a, we
know that λ̄ < 0. From this and (11.37), we conclude that λ(t) is always
negative. The economic interpretation of λ is that −λ is the imputed
cost of pollution. Let v = Φ(λ) denote the solution of (11.39) with μ = 0.
On account of (11.35), define

v ∗ = max[0, Φ(λ)]. (11.43)

We know from the interpretation of λ that when λ increases, the imputed


cost of pollution decreases, which can justify an increase in the DDT
production to ensure an increased food output. Thus, it is reasonable to
assume that

Φ′(λ) > 0,

and we will make this assumption. It follows that there exists a unique
λc such that Φ(λc ) = 0, Φ(λ) < 0 for λ < λc and Φ(λ) > 0 for λ > λc .

To construct the phase diagram, we must plot the isoclines Ṗ = 0


and λ̇ = 0. These are, respectively,

P = a(v∗)/δ = a[max{0, Φ(λ)}]/δ, (11.44)

h′(P) = −(ρ + δ)λ. (11.45)


Observe that the assumption h (0) = 0 implies that the graph of (11.45)
passes through the origin. Differentiating these equations with respect
to λ and using (11.43), we obtain

dP/dλ |_{Ṗ=0} = (a′(v)/δ)(dv/dλ) > 0 (11.46)

as the slope of the Ṗ = 0 isocline, and

dP/dλ |_{λ̇=0} = −(ρ + δ)/h″(P) < 0 (11.47)

as the slope of the λ̇ = 0 isocline.


Using (11.41), (11.42), (11.46), and (11.47), we can draw (11.44) and
(11.45) in the (λ, P )-space as shown in Fig. 11.5. As in Sect. 11.1.4, these
isoclines divide the (λ, P)-space into four regions. At any point in each
of these regions, we have depicted the direction of the movement of the
trajectory with v ∗ in (11.33) and (11.37). It is easy to conclude that we
have Ṗ < 0 (Ṗ > 0) above (below) the Ṗ = 0 isocline and λ̇ > 0 (λ̇ < 0)
to the right (left) of the λ̇ = 0 isocline.
The intersection point (λ̄, P̄ ) of these isoclines denotes the equilib-
rium levels for the adjoint variable and the pollution stock, respectively.
That there exists an optimal path (shown as the solid line in Fig. 11.5)
converging to the equilibrium (λ̄, P̄ ) follows directly from the Global
Saddle Point Theorem stated in Appendix D.7.
Given λc as the intersection of the Ṗ = 0 curve and the horizontal
axis, the corresponding ordinate P c on the optimal trajectory is the
related pollution stock level. The significance of P c is that if the existing
pollution stock P is larger than P c , then the optimal control is v ∗ = 0,
meaning no DDT is produced.
Given an initial level of pollution P0 , the optimal trajectory curve in
Fig. 11.5 provides the initial value λ0 of the adjoint variable. With these
initial values, the optimal trajectory is determined by (11.33), (11.37),
and (11.43). If P0 > P c , as shown in Fig. 11.5, then v ∗ = 0 until such

Figure 11.5: Phase diagram for the pollution control model

time that the natural decay of pollution stock has reduced it to P c . At


that time, the adjoint variable has increased to the value λc . The optimal
control is v∗ = Φ(λ) from this time on, and the path converges to (λ̄, P̄).
At equilibrium, v̄ = Φ(λ̄) > 0, which implies that it is optimal to
produce some DDT forever in the long run. The only time when its
production is not optimal is at the beginning when the pollution stock
is higher than P c .
It is important to examine the effects of changes in the parameters on
the optimal path. In particular, you are asked in Exercise 11.11 to show
that an increase in the natural rate of decay of pollution, δ, will increase
P c . That is, when pollution decays at a faster rate, we can increase the
threshold level of pollution stock at which to ban the production of the
pollutant. For DDT in reality, δ is small so that its complete ban, which
has actually occurred, may not be far from the optimal policy.
Here we have presented a very simple model of pollution in which
the problem was to choose an optimal production process. Models in
which the control variable to determine is the optimal amount to spend
in reducing the pollution output of an existing dirty process have also
been formulated; see Wright (1974) and Sethi (1977d). For other related
models, see Luptacik and Schubert (1982), Hartl and Luptacik (1992),
and Hartl and Kort (1996a,b,c, 1997), Xepapadeas and de Zeeuw (1999),
and Moser et al. (2014).
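As with the growth model, the equilibrium (λ̄, P̄) is easy to compute once functional forms are chosen. The sketch below uses the hypothetical specification u(C) = ln C, a(v) = √v, h(P) = κP²/2, and C(v) = 20 + 8v − v², which satisfies the stated assumptions with L = 10 and V = 4.

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical specification satisfying the model's assumptions
rho, delta, kappa = 0.05, 0.1, 0.001

C  = lambda v: 20 + 8 * v - v ** 2    # food output; max at V = 4, C(10) = 0
Cp = lambda v: 8 - 2 * v
a  = lambda v: np.sqrt(v)             # DDT output
ap = lambda v: 0.5 / np.sqrt(v)

def foc(v):
    """(11.39) with mu = 0, after eliminating lam via (11.41)-(11.42)."""
    lam = -kappa * (a(v) / delta) / (rho + delta)   # candidate lam_bar
    return Cp(v) / C(v) + lam * ap(v)               # u'(C)C' + lam*a'

v_bar = brentq(foc, 1e-6, 4.0)        # search on (0, V)
P_bar = a(v_bar) / delta              # (11.41)
lam_bar = -kappa * P_bar / (rho + delta)
print(f"v_bar = {v_bar:.3f}, P_bar = {P_bar:.3f}, lam_bar = {lam_bar:.5f}")
```

Raising κ, i.e., weighting the disutility of pollution more heavily, moves the root of the first-order condition to the left and so lowers both v̄ and P̄, as the phase diagram suggests.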

11.4 An Adverse Selection Model


In modern contract theory, the term adverse selection is used to describe
principal-agent models in which an agent has private information before
a contract is written. For example, a seller does not know perfectly how
much a buyer is willing to pay for a good. A related concept is that
of moral hazard, which arises when the agent takes a hidden action not
directly observed by the principal.
In such game situations, clearly the principal would like to know the
agent’s private information which he cannot learn simply by asking the
agent, because it is in the agent’s interest to distort the truth. Fortu-
nately, according to the theory of mechanism design, the principal can
design a game whose rules can influence the agent to act the way he
would like. Thanks particularly to the revelation principle, the princi-
pal needs only consider games in which the agent truthfully reports her
private information.
There is a large literature on contract theory, and we refer the reader
to books by Laffont and Martimort (2001), Bolton and Dewatripont
(2005) and Cvitanic and Zhang (2013). For our purposes, we shall next
consider a game between a seller and a buyer, where the buyer has private
information about her willingness-to-pay for the seller’s goods; see Bolton
and Dewatripont (2005).

11.4.1 Model Formulation


Consider a transaction between a seller (the principal) and a buyer (the
agent) whose type t ∈ [t1, t2], 0 ≤ t1 ≤ t2, represents her willingness-to-pay
for the seller's goods. We assume in particular that the buyer's preferences are
represented by the utility function

U (q, φ, t) = ta(q) − φ, (11.48)

where q is the number of units purchased and φ is the total amount paid
to the seller. We assume a(0) = 0, a′ > 0, and a′′ < 0.
The seller knows only the distribution F (t), having the density
f (t), t ∈ [t1 , t2 ]. The seller’s unit production cost is c > 0, so that his
profit from selling q units against a sum of money φ is given by

π = φ − cq. (11.49)

The question of interest here is to obtain a profit-maximizing pair


{φ, q} that the seller will be able to induce the buyer of type t̂ to choose.

Thanks to the revelation principle, the answer is that the seller can offer
a menu of contracts {φ(t), q(t)} which comes from solving the following
maximization problem:
max_{q(·),φ(·)} ∫_{t1}^{t2} [φ(t) − cq(t)]f(t)dt    (11.50)

subject to

(IR) t̂a(q(t̂)) − φ(t̂) ≥ 0, t̂ ∈ [t1 , t2 ] (11.51)

(IC) t̂a(q(t̂)) − φ(t̂) ≥ t̂a(q(t)) − φ(t), t, t̂ ∈ [t1, t2], t ≠ t̂.    (11.52)

The constraints (11.51), called individual rationality constraints (IR),


say that the agent of type t̂ will participate in the contract. Clearly,
given (11.52), we can replace these constraints by a single constraint

t1 a(q(t1 )) − φ(t1 ) ≥ 0. (11.53)

The left-hand side of the constraints (11.52), called incentive compatibil-


ity constraints (IC), is the utility of agent t̂ if she chooses the contract
intended for her, whereas the right-hand side represents the utility of
agent t̂ if she chooses the contract intended for type t ≠ t̂. The IC
constraints, therefore, imply that type t̂ agent is better off choosing the
contract intended for her than any other contract in the menu.
Clearly, the seller’s problem is mathematically difficult as it involves
maximizing the seller’s profit over a class of functions. So, a way to deal
with this problem is to decompose it into an implementation problem
(which functions q(·) are incentive compatible?) and an optimization
problem (which one is the best implementation function for the seller?).

11.4.2 The Implementation Problem


Given a menu {q(·), φ(·)} that satisfies the seller’s problem (11.50)–
(11.52), it must be the case in equilibrium that the buyer t̂ will choose
the contract {q(t̂), φ(t̂)}. In other words, her utility t̂a(q(t)) − φ(t) of
choosing a contract {q(t), φ(t)} will be maximized at t = t̂. Assuming
that q(·) and φ(·) are twice differentiable functions, the first-order and
second-order conditions are

t̂a′(q(t))q̇(t) − φ̇(t)|_{t=t̂} = t̂a′(q(t̂))q̇(t̂) − φ̇(t̂) = 0,    (11.54)



t̂a′′(q(t))(q̇(t))² + t̂a′(q(t))q̈(t) − φ̈(t)|_{t=t̂} ≤ 0.    (11.55)


From (11.54), it follows from replacing t̂ by t that

ta′(q(t))q̇(t) − φ̇(t) = 0, t ∈ [t1, t2],    (11.56)

called the local incentive compatibility condition, must hold. Differenti-


ating (11.56) gives,

ta′′(q(t))(q̇(t))² + a′(q(t))q̇(t) + ta′(q(t))q̈(t) − φ̈(t) = 0.    (11.57)

It follows from (11.55), (11.57), and a′ > 0 that

q̇(t) ≥ 0. (11.58)

This is called the monotonicity condition. In Exercise 11.12, you are


asked to show that (11.56) and (11.58) are sufficient for (11.52) to hold.
Since these conditions are already necessary, we can say that local incen-
tive compatibility (11.56) and monotonicity (11.58) together are equiva-
lent to the IC condition (11.52).
We are now ready to formulate the seller's optimization problem.

11.4.3 The Optimization Problem


The seller’s problem can be written as the following optimal control
problem:

max_{u(·)} ∫_{t1}^{t2} [φ(t) − cq(t)]f(t)dt    (11.59)

subject to
q̇(t) = u(t), (11.60)
φ̇(t) = ta (q(t))u(t), (11.61)
t1 a(q(t1 )) − φ(t1 ) = 0, (11.62)
u(t) ≥ 0. (11.63)
Here, q(t) and φ(t) are state variables and u(t) is a control variable
satisfying the control constraint u(t) ≥ 0. The objective function (11.59)
is the expected value of the seller’s profit with respect to the density f (t).
Equation (11.60) and constraint (11.63) come from the monotonicity
condition (11.58). Equation (11.61) with u(t) from (11.60) gives the
local incentive compatibility condition (11.56). Finally, (11.62) specifies

the IR constraint (11.53) in view of the fact that it will be binding for the
lowest agent type t1 at the optimum.
We can now use the sense of the maximum principle (3.12) to write
the necessary conditions for optimality. Note that (3.12) is written
for problem (3.7) that has specified initial states and some constraints
on the terminal state vector x(T ) that include the equality constraint
b(x(T ), T ) = 0. Our problem, on the other hand, has this type of equal-
ity constraint, namely (11.62), on the initial states q(t1 ) and φ(t1 ) and
no specified terminal states q(t2 ) and φ(t2 ). However, since initial time
conditions and terminal time conditions can be treated in a symmetric
fashion, we can apply the sense of (3.12), as shown in Remark 3.9, to
obtain the necessary optimality conditions to problem (11.59)–(11.63).
In Exercise 11.13, you are asked to obtain (11.67) and (11.68) by fol-
lowing Remark 3.9 to account for the presence of the equality constraint
(11.62) on the initial state variables rather than on the terminal state as
in problem (3.7).
To specify the necessary optimality conditions, we first define the
Hamiltonian:

H(q, φ, u, λ, μ, t) = [φ(t) − cq(t)]f(t) + λ(t)u(t) + μ(t)ta′(q(t))u(t)
                    = [φ(t) − cq(t)]f(t) + [λ(t) + μ(t)ta′(q(t))]u(t).
(11.64)

Then for u∗ with the corresponding state trajectories q ∗ and φ∗ to be


optimal, we must have adjoints λ and μ, and a constant β, such that

q̇∗ = u∗, φ̇∗ = ta′(q∗)u∗,    (11.65)

t1 a(q ∗ (t1 )) − φ∗ (t1 ) = 0, (11.66)


λ̇ = cf − μta′′(q∗)u∗, λ(t1) = βt1a′(q∗(t1)), λ(t2) = 0,    (11.67)
μ̇ = −f, μ(t1 ) = −β, μ(t2 ) = 0, (11.68)
u∗(t) = bang[0, ∞; λ(t) + μ(t)ta′(q∗(t))].    (11.69)
Several remarks are in order at this point. First we see that we have
a bang-bang control in (11.69). This means that u∗(t) can be 0,
or greater than 0, or an impulse control. Moreover, in the region where
u∗(t) = 0, which will occur when λ(t) + μ(t)ta′(q∗(t)) < 0, we will have
a constant q∗(t), and we will have a singular control u∗(t) > 0 if we can
keep λ(t) + μ(t)ta′(q∗(t)) = 0 by an appropriate choice of u∗(t) along

the singular path. An impulse control would occur if the initial q(t1 )
were above the singular path. Since in our problem, initial states are not
exactly specified, we shall not encounter an impulse control here.
The third remark concerns a numerical way of solving the problem.
For this, let us rewrite the boundary conditions in (11.67) and (11.68)
and the condition (11.66) as below:
t1a(q∗(t1)) − φ∗(t1) = 0, λ(t1) = −μ(t1)t1a′(q∗(t1)),    (11.70)
λ(t2 ) = μ(t2 ) = 0. (11.71)
With (11.71) and a guess of q(t2) and φ(t2), we can solve the differential
equations (11.65), (11.67), and (11.68), with u∗(t) in (11.69), backward in
time. These will give us the values of λ(t1 ), μ(t1 ), q(t1 ) and φ(t1 ). We
can check if these satisfy the two equations in (11.70). If yes, we have
arrived at a solution. If not, we change our guess for q(t2 ) and φ(t2 ) and
start again. As you may have noticed, the procedure is very similar to
solving a two-point boundary value problem.
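The backward-shooting loop just described is easy to prototype. The sketch below (Python with numpy/scipy) is a minimal illustration only: the specification a(q) = 2√q, the unit cost c = 1, the uniform type distribution on [1, 2], and the finite cap u_max standing in for the unbounded control set [0, ∞) in (11.69) are all assumptions made for this example, not part of the text.

import numpy as np
from scipy.optimize import fsolve

t1, t2, c = 1.0, 2.0, 1.0
f = lambda t: 1.0 / (t2 - t1)          # assumed uniform type density
a = lambda q: 2.0 * np.sqrt(q)         # assumed a(q), with a' > 0, a'' < 0
ap = lambda q: 1.0 / np.sqrt(q)        # a'(q)
app = lambda q: -0.5 * q ** (-1.5)     # a''(q)
u_max = 10.0                           # finite stand-in for the set [0, inf)

def shoot(q2, phi2, n=20000):
    # integrate (11.65), (11.67), (11.68) backward from t2, starting from
    # the guessed terminal states and lambda(t2) = mu(t2) = 0 as in (11.71)
    dt = (t2 - t1) / n
    q, phi, lam, mu, t = q2, phi2, 0.0, 0.0, t2
    for _ in range(n):
        s = lam + mu * t * ap(q)       # switching function in (11.69)
        u = u_max if s > 0 else 0.0    # bang-bang approximation of (11.69)
        q = max(q - dt * u, 1e-6)      # guard: keep q in the domain of sqrt
        phi -= dt * t * ap(q) * u
        lam -= dt * (c * f(t) - mu * t * app(q) * u)
        mu += dt * f(t)                # backward step for mu' = -f
        t -= dt
    return q, phi, lam, mu

def residuals(z):
    # the two initial-time conditions (11.70) a correct guess must satisfy
    q1, phi1, lam1, mu1 = shoot(*z)
    return [t1 * a(q1) - phi1, lam1 + mu1 * t1 * ap(q1)]

guess = fsolve(residuals, [1.0, 2.0])  # search over (q(t2), phi(t2))
print("candidate terminal guess (q(t2), phi(t2)):", guess)

Here fsolve automates the "change our guess and start again" step; as with any shooting method, convergence depends on the assumed primitives and the starting guess.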
Next we provide an alternative procedure to solve the seller’s prob-
lem, a procedure used in the theory of mechanism design. This procedure
first ignores the nonnegativity constraint (11.63) and solves the relaxed
problem given by (11.59)–(11.62). In view of (11.52), let us define
u0(t̂) = t̂a(q(t̂)) − φ(t̂) = max_t [ta(q(t)) − φ(t)].    (11.72)

By the envelope theorem, we have


du0(t̂)/dt̂ = ∂u0(t̂)/∂t̂ = a(q(t̂)),    (11.73)
which we can integrate to obtain
u0(t) = ∫_{t1}^{t} a(q(x))dx + u0(t1) = ∫_{t1}^{t} a(q(x))dx,    (11.74)

since u0(t1) = 0 at the optimum. Also, since φ(t) = ta(q(t)) − u0(t), we


can write the seller’s profit as
∫_{t1}^{t2} [ ta(q(t)) − ∫_{t1}^{t} a(q(x))dx − cq(t) ] f(t)dt.    (11.75)

Then, integrating by parts, we have


∫_{t1}^{t2} [{ta(q(t)) − cq(t)}f(t) − a(q(t))(1 − F(t))] dt
    = ∫_{t1}^{t2} [ta(q(t)) − cq(t) − a(q(t))/h(t)] f(t)dt,    (11.76)

where h(t) = f (t)/[1 − F (t)] is known as the hazard rate. Since we


are interested in maximizing the seller’s profit with respect to the out-
put schedule q(·), we can maximize the expression under the integral
pointwise for each t. The first-order condition for that is

[ t − (1 − F(t))/f(t) ] a′(q(t)) = [ t − 1/h(t) ] a′(q(t)) = c,    (11.77)

which gives us the optimal solution of the relaxed problem as


  
q̂(t) = a′^{−1}( c [t − 1/h(t)]^{−1} ).    (11.78)

In obtaining (11.78), we had omitted the nonnegativity constraint


(11.63) introduced to ensure that q(t) is increasing. Thus, it remains to
check if dq̂(t)/dt ≥ 0. It is straightforward to verify that if the hazard
rate h(t) is increasing in t, then q̂(t) is increasing in t. To show this, we
differentiate (11.78) to obtain

dq̂(t)/dt = − g′(t)a′(q̂(t)) / (a′′(q̂(t))g(t)),

where g(t) = t − 1/h(t). Clearly, if h(t) is increasing, then g(t) is in-
creasing, and since a′ > 0 and a′′ < 0, it follows that dq̂(t)/dt ≥ 0.
In this case, q̂(t) and the corresponding φ̂(t) obtained from solving
the differential equation given by (11.61) and the boundary condition
(11.62) give us the optimal menu {φ̂(t), q̂(t)}.
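For a concrete instance, the sketch below (Python with numpy) evaluates (11.78) under assumed specifications that are not from the text: a(q) = 2√q, so that a′(q) = q^{−1/2} and a′^{−1}(y) = y^{−2}, a unit cost c = 1, and types uniform on [1, 1.5], for which h(t) = 1/(1.5 − t) is increasing and no ironing is needed. It also recovers φ̂(t) from (11.74), with the IR constraint binding at t1.

import numpy as np

t1, t2, c = 1.0, 1.5, 1.0
t = np.linspace(t1, t2, 501)
g = 2.0 * t - t2                    # g(t) = t - 1/h(t); here 1/h(t) = t2 - t
q_hat = (g / c) ** 2                # (11.78) with a'^{-1}(y) = y**(-2)

assert np.all(np.diff(q_hat) >= 0)  # the monotonicity condition (11.58) holds

a_vals = 2.0 * np.sqrt(q_hat)       # a(q_hat(t))
# u0(t) = integral of a(q_hat(x)) from t1 to t, by the trapezoidal rule
u0 = np.concatenate(([0.0],
                     np.cumsum(0.5 * (a_vals[1:] + a_vals[:-1]) * np.diff(t))))
phi_hat = t * a_vals - u0           # phi(t) = t a(q(t)) - u0(t)

print("q_hat ranges from %.3f to %.3f" % (q_hat[0], q_hat[-1]))
print("phi_hat ranges from %.3f to %.3f" % (phi_hat[0], phi_hat[-1]))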
What if h(t) is not increasing? In that case, there is a procedure
called bunching and ironing given by the solution of an optimal control
problem to be formulated next. This is because q̂(t) in (11.78) is obtained
by solving the relaxed problem that ignores the nonnegativity constraint
(11.63), and so it may be that dq̂/dt is strictly negative for some t ∈
[t, t̄] ⊂ [t1 , t2 ] as shown in Fig. 11.6.
Then the seller must choose the optimal q ∗ (t) to maximize the fol-
lowing constrained optimal control problem:
max_{q(·)} ∫_{t1}^{t2} [ ta(q(t)) − cq(t) − a(q(t))/h(t) ] f(t)dt    (11.79)

subject to
q̇(t) = u(t), u(t) ≥ 0. (11.80)
Figure 11.6: Violation of the monotonicity constraint

Now the necessary optimality conditions, with the Hamiltonian defined


as
H(q, u, λ, t) = (ta(q) − cq − a(q)/h)f + λu,    (11.81)
are
λ̇ = − [(t − 1/h)a′(q) − c] f, λ(t1) = λ(t2) = 0,    (11.82)
and
u∗ = bang[0, ∞; λ].    (11.83)
We may also note that these conditions are sufficient as well, since H in
(11.81) is concave in q.
Integrating (11.82), we have
λ(t) = − ∫_{t1}^{t} [ (z − 1/h(z)) a′(q(z)) − c ] f(z)dz.

Using the transversality conditions in the case when neither the initial
nor the terminal state is specified for the state equation (11.80), we
obtain
0 = λ(t1) = λ(t2) = − ∫_{t1}^{t2} [ (z − 1/h(z)) a′(q(z)) − c ] f(z)dz.

Then for u∗(t) = 0 on an interval t ∈ [θ1, θ2] ⊂ [t1, t2], we must have
λ(t) < 0 in the interior of [θ1, θ2]. Moreover, when u∗(t) > 0, it must be a singular
control for which λ(t) = 0.
But maintaining λ(t) = 0 on an interval requires λ̇(t) = 0, which is the
same as condition (11.77). This means that wherever q∗(t) is strictly
increasing, it must coincide with q̂(t) in (11.78).
It, therefore, only remains to determine the intervals over which q ∗ (t) is
constant. Consider Fig. 11.7.
Figure 11.7: Bunching and ironing

By continuity, we must have λ(θ1 ) = λ(θ2 ) = 0, so that


∫_{θ1}^{θ2} [ (z − 1/h(z)) a′(q∗(z)) − c ] f(z)dz = 0.    (11.84)
In addition, we must have

q ∗ (θ1 ) = q ∗ (θ2 ) (11.85)

from the continuity of q ∗ (·). Thus, we have two equations (11.84) and
(11.85) and two unknowns, allowing us to obtain the values of θ1 and θ2 .
An interval [θ1 , θ2 ] over which q ∗ (t) is constant is known as a bunching
interval.
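Numerically, the two conditions are convenient to solve because q∗ is constant on [θ1, θ2], so the constant factor a′(q∗(z)) = c/g(θ1) can be divided out of (11.84), leaving the level-matching condition g(θ2) = g(θ1) together with a zero-integral condition in g alone. The sketch below (Python with scipy) does this for an assumed piecewise-uniform type density on [2, 3], chosen together with a(q) = 2√q and c = 1 purely so that the hazard rate, and hence g, is non-monotone; none of these choices come from the text.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def f(z):   # assumed density with a steep drop at z = 2.25
    return 2.0 if z < 2.25 else (0.4 if z < 2.75 else 1.2)

def F(z):   # its cdf, piecewise
    if z < 2.25:
        return 2.0 * (z - 2.0)
    if z < 2.75:
        return 0.5 + 0.4 * (z - 2.25)
    return 0.7 + 1.2 * (z - 2.75)

def g(z):   # g(z) = z - (1 - F(z))/f(z); jumps down at z = 2.25
    return z - (1.0 - F(z)) / f(z)

def theta2_of(theta1):
    # condition (11.85): match the g-levels across the dip
    return brentq(lambda z: g(z) - g(theta1), 2.2501, 2.75)

def residual(theta1):
    # condition (11.84), with the constant factor a'(q*) divided out
    th2 = theta2_of(theta1)
    return quad(lambda z: (g(z) - g(theta1)) * f(z),
                theta1, th2, points=[2.25])[0]

theta1 = brentq(residual, 2.01, 2.24)
theta2 = theta2_of(theta1)
q_star = (g(theta1) / 1.0) ** 2      # bunched quantity, from a'(q) = c/g
print("bunching interval: [%.4f, %.4f], q* = %.4f" % (theta1, theta2, q_star))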
Here, we have given a procedure when q̂(·) has only one interval [t, t̄]
over which it is strictly decreasing. If there are more such intervals,

this procedure of ironing and bunching can be extended in an obvious


manner.

11.5 Miscellaneous Applications


The number of papers which apply control theory to problems in eco-
nomics and management science is now so large that it is impossible to
cover them in detail within the confines of a single book. We satisfy
ourselves by listing selected references with a brief indication of their
contents.
For control theory applications to economics, see: Tu (1969) and
Southwick and Zionts (1974) for optimal educational investments,
Kamien and Schwartz (1971b) for limit pricing and uncertain entry,
Treadway (1970) for adjustment costs in the theory of competitive firms,
Vousden (1974) for international trade, Harris (1976) for money demand
with transaction costs, Raviv (1979) for the design of an optimal insur-
ance policy, Sethi and McGuire (1977) for optimal training and heteroge-
neous labor, Arthur and McNicoll (1977) for population policy, Brito and
Oakland (1977) for optimal income tax, Thompson (1982a,b) for contin-
uous expanding economies, Thépot (1983) for investment and marketing
policies in a duopoly, Verheyen (1985) for a theory of firm under govern-
ment regulations, Hartl and Mehlmann (1986) for remuneration patterns
for medical services, Schijndel (1986) for dynamic shareholder behavior
under personal taxation, Hartl and Kort (1997) for optimal input substi-
tution in response to environmental constraints, Feichtinger et al. (1998),
Behrens et al. (2000, 2002), Tragler et al. (2001), Grass et al. (2008), and
Seidl et al. (2016) for optimal control of crime such as illicit drugs and
terrorism.
For control theory applications to management science and opera-
tions research, see: Nelson (1960) for labor assignments, Fan and Wang
(1964), Charnes and Kortanek (1966), Tapiero and Soliman (1972) and
Bookbinder and Sethi (1980) for distribution and transportation applica-
tions, Nepomiastchy (1970) and Zimin and Ivanilov (1971) for scheduling
and network planning problems, Lucas (1971) for research and develop-
ment, Legey et al. (1973) for city congestion problems, Taylor (1974) for
warfare models, Mehra (1975) for national settlement planning, Kalish
(1983) for pricing with dynamic demand and production costs, Kalish
and Lilien (1983) for optimal price subsidy for accelerating diffusion
of innovation, Gaimon (1986c) for optimal acquisition of new technol-

ogy, Dockner and Jørgensen (1988) and Jedidi et al. (1989) for optimal
pricing and/or advertising for monopolistic diffusion models, Hartl and
Jørgensen (1985) for manpower planning, Ringbeck (1985) for optimal
quality and advertising under asymmetric information, Hartl and Krauth
(1989) for optimal production mix, Gaimon (1997) for planning for in-
formation technology, Hartl and Kort (2005) for advertising directed to
existing and new customers, and Shani et al. (2005) for dynamic irriga-
tion policies.
Finally, we conclude this section by citing a series of rather un-
usual but humorous applications of optimal control theory that began
with the Sethi (1979b) paper on optimal pilfering policies for dynamic
continuous thieves. These are: Hartl and Mehlmann (1982, 1983) and
Hartl et al. (1992a) on optimal blood consumption by vampires, Hartl
and Mehlmann (1986) on remuneration patterns for medical services,
Hartl and Jørgensen (1988, 1990) on optimal slidemanship at confer-
ences, Jørgensen (1992) on the dynamics of extramarital affairs, and
Feichtinger et al. (1999) on Petrarch’s Canzoniere: rational addiction
and amorous cycles. See also the monograph by Mehlmann (1997) on
unusual and humorous applications of differential games.

Exercises for Chapter 11

E 11.1 For the model formulated in Sect. 11.1.1, assume F (K) = γK


and U(C) = (C − C̄)^{1−θ}/(1 − θ), where 0 < θ < 1, C̄ > 0 a constant, and
γ − δ > 0 a constant satisfying (γ − δ)(1 − θ) < ρ < γ − δ. Let β = γ − δ and
assume θ = 1/2 for simplicity. Also assume that K0e^{βT} + C̄(1 − e^{βT})/β >
KT for the problem to be well-posed (note that the left-hand side of this
inequality is the amount of capital at T associated with the consumption
rate C̄). Solve this problem to obtain explicit expressions for the optimal
consumption rate and the associated capital and the adjoint trajectories.

E 11.2 Perform the following:

(a) Obtain the value of k̄ in Fig. 11.1 from Eq. (11.17).

(b) Show that the graph of k̇ = 0 starts from +∞ when k = 0, de-


creases to a minimum of λ̂ at k̂, and then increases. Also obtain
the expression for λ̄.

(c) Show that k̄ < k̂.



E 11.3 Use (11.14) to show that h′(λ) < 0. Then, conclude that the
directions of the horizontal arrows above and below the k̇ = 0 curve are
as drawn in Fig. 11.1.

E 11.4 Show that for any k0 > k̄, there exists a unique optimal path,
such as that shown by the solid curve in Region III of Fig. 11.1.

E 11.5 In the formulation of the objective function for the economic


growth model in Sect. 11.1.3, we took the position of total utilitarianism.
Reformulate and solve the problem if our task is to maximize the present
value of the utility of per capita consumption over time.

E 11.6 Use the phase diagram method to solve the advertising model
of (7.7) with its objective function replaced by
max_{u≥0} J = ∫_0^∞ e^{−ρt} [π(G) − c(u)] dt,

where c(u) represents an increasing convex advertising cost function with


c(u) ≥ 0, c′(u) ≥ 0, and c′′(u) > 0 for u ≥ 0. This is the model of Gould
(1970).

E 11.7 A variation of the optimal capital accumulation model with sta-


tionary population, known as Ramsey’s model, is:
max J = ∫_0^∞ [u(c) − B] dt

subject to
k̇ = f (k) − c − γk, k(0) = k0 ,
where
B = sup_{c≥0} u(c) > 0

is the so-called Bliss point,


lim_{t→∞} u[c(t)] = B

so that the integral in the objective function converges, and


lim_{t→∞} u′[c(t)] = 0; see Ramsey (1928).

(a) Show that the optimal capital stock trajectory satisfies the differ-
ential equation
u′(f(k) − γk − k̇)k̇ = B − u(f(k) − γk − k̇).

(b) From part (a), derive Ramsey’s rule


d[u′(c(t))]/dt = u′(c(t))[γ − f′(k(t))].
(c) Assume u(c) = 2c − c2 /B and f (k) = αk, where α − γ := β > 0
and β < B/k0 < 2β. Show that the optimal feedback consumption
rule is
c∗ (k) = 2βk − B
and the optimal capital trajectory k ∗ is given by
k∗(t) = (1/β)[B − (B − βk0)e^{−βt}].
E 11.8 Show that the trajectory x0 BLxT shown in Fig. 11.2 is optimal
for the epidemic model under the stated assumptions. Assume 0 < xs <
N.
E 11.9 In (11.29), show by using Green’s theorem that xs = N if
ρ/θ < 0.
E 11.10 Show that C(v) defined in Sect. 11.3.1 satisfies C′′(v) < 0 if
f12 ≥ 0.

Hint: Note that the usual concavity and monotonicity conditions


on the production function f are f1 > 0, f2 > 0, f11 < 0 and f22 < 0.
E 11.11 Show that the P c of Fig. 11.5 increases as δ in Eq. (11.33) in-
creases.
E 11.12 Show that (11.56) and (11.58) imply the (global) IC condition
(11.52).

Hint: The proof is by contradiction. First, begin by supposing


that (11.52) is violated for some t > t̂. Then do the same with t < t̂.
E 11.13 In problem (3.7), the terminal equality constraint b(x(T ), T ) =
0 results in the term βbx (x(T ), T ) in the terminal condition (3.11) on
the adjoint variable. In problem (11.59)–(11.63), we have the equality
constraint (11.62) on the initial states q(t1 ) and φ(t1 ) instead, which
we can write as b((q(t1 ), φ(t1 )), t1 ) = t1 a(q(t1 )) − φ(t1 ) = 0. Now apply
(3.11) in a symmetric fashion to obtain the initial conditions (11.67) and
(11.68) on the adjoint variables.
Chapter 12

Stochastic Optimal Control

In previous chapters we assumed that the state variables of the system


are known with certainty. When the variables are outcomes of a random
phenomenon, the state of the system is modeled as a stochastic process.
Specifically, we now face a stochastic optimal control problem where the
state of the system is represented by a controlled stochastic process. We
shall only consider the case when the state equation is perturbed by
a Wiener process, which gives rise to the state as a Markov diffusion
process. In Appendix D.2 we have defined the Wiener process, also
known as Brownian motion. In Sect. 12.1, we will formulate a stochastic
optimal control problem governed by stochastic differential equations
involving a Wiener process, known as Itô equations. Our goal will be to
synthesize optimal feedback controls for systems subject to Itô equations
in a way that maximizes the expected value of a given objective function.
In this chapter, we also assume that the state is (fully) observed.
On the other hand, when the system is subject to noisy measurements,
we face partially observed optimal control problems. In some important
special cases, it is possible to separate the problem into two problems:
optimal estimation and optimal control. We discuss one such case in
Appendix D.4.1. In general, these problems are very difficult and are
beyond the scope of this book. Interested readers can consult some
references listed in Sect. 12.5.
In Sect. 12.2, we will extend the production planning model of Chap. 6
to allow for some uncertain disturbances. We will obtain an optimal
production policy for the stochastic production planning problem thus
formulated. In Sect. 12.3, we will solve an optimal stochastic advertising


problem explicitly. The problem is a modification as well as a stochastic


extension of the optimal control problem of the Vidale-Wolfe advertising
model treated in Sect. 7.2.4. In Sect. 12.4, we will introduce investment
decisions in the consumption model of Example 1.3. We will consider
both risk-free and risky investments. Our goal will be to find optimal
consumption and investment policies in order to maximize the discounted
value of the utility of consumption over time.
In Sect. 12.5, we will conclude the chapter by mentioning other types
of stochastic optimal control problems that arise in practice.

12.1 Stochastic Optimal Control


In Appendix D.1 on the Kalman filter, we obtain the optimal state
estimation for linear systems with noise and noisy measurements. In
Sect. D.4.1, we see that for stochastic linear-quadratic optimal control
problems, the separation principle allows us to solve the problem in two
steps: to obtain the optimal estimate of the state and to use it in the
optimal feedback control formula for deterministic linear-quadratic prob-
lems.
In this section we will introduce the possibility of controlling a sys-
tem governed by Itô stochastic differential equations. In other words,
we will introduce control variables into Eq. (D.20). This produces the
formulation of a stochastic optimal control problem.
It should be noted that for such problems, the separation principle
does not hold in general. Therefore, to simplify the treatment, it is often
assumed that the state variables are observable, in the sense that they
can be directly measured. Furthermore, most of the literature on these
problems uses dynamic programming or the Hamilton-Jacobi-Bellman
framework rather than a stochastic maximum principle. In what fol-
lows, therefore, we will formulate the stochastic optimal control problem
under consideration, and provide a brief, informal development of the
Hamilton-Jacobi-Bellman equation for the problem. A detailed analysis
of the problem is available in Fleming and Rishel (1975). For problems
involving jump disturbances, see Davis (1993) for the methodology of op-
timal control of piecewise deterministic processes. For stochastic optimal
control in discrete time, see Bertsekas and Shreve (1996).
Let us consider the problem of maximizing
E [ ∫_0^T F(Xt, Ut, t)dt + S(XT, T) ],    (12.1)

where Xt is the state at time t and Ut is the control at time t, and together
they are required to satisfy the Itô stochastic differential equation
dXt = f (Xt , Ut , t)dt + G(Xt , Ut , t)dZt , X0 = x0 , (12.2)
where Zt , t ∈ [0, T ] is a standard Wiener process.
For convenience in exposition, we assume F : E^1 × E^1 × E^1 → E^1,
S : E^1 × E^1 → E^1, the drift coefficient function f : E^1 × E^1 × E^1 → E^1,
and the diffusion coefficient function G : E^1 × E^1 × E^1 → E^1, so that
(12.2) is a scalar equation. We also assume that the functions F and S are
continuous in their arguments and the functions f and G are continuously
differentiable in their arguments. For multidimensional extensions of this
problem, see Fleming and Rishel (1975).
Since (12.2) is a scalar equation, the subscript t here represents only
time t. Thus, writing Xt , Ut , and Zt in place of writing X(t), U (t), and
Z(t), respectively, will not cause any confusion and, at the same time,
will eliminate the need for writing many parentheses.
To solve the problem defined by (12.1) and (12.2), let V (x, t), known
as the value function, be the expected value of the objective function
(12.1) from t to T, when an optimal policy is followed from t to T, given
Xt = x. Then, by the principle of optimality,
V(x, t) = max_U E[F(x, U, t)dt + V(x + dXt, t + dt)].    (12.3)

By Taylor’s expansion, we have

V(x + dXt, t + dt) = V(x, t) + Vt dt + Vx dXt + (1/2)Vxx(dXt)²
    + (1/2)Vtt(dt)² + Vxt dXt dt + higher-order terms.    (12.4)

From (12.2), we can formally write


(dXt)² = f²(dt)² + G²(dZt)² + 2fG dZt dt,    (12.5)
dXt dt = f(dt)² + G dZt dt.    (12.6)
The exact meaning of these expressions comes from the theory of
stochastic calculus; see Arnold (1974, Chapter 5), Durrett (1996) or
Karatzas and Shreve (1997). For our purposes, it is sufficient to know
the multiplication rules of stochastic calculus:
(dZt)² = dt, dZt dt = 0, (dt)² = 0.    (12.7)
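A one-off Monte Carlo experiment (Python with numpy; purely illustrative) makes these rules plausible: over a step of length dt, the increment dZt is normally distributed with mean 0 and variance dt, so (dZt)² averages to dt, while dZt dt is of order dt^{3/2} and (dt)² is of order dt², both negligible relative to dt.

import numpy as np

rng = np.random.default_rng(0)
dt = 1e-4
dZ = rng.normal(0.0, np.sqrt(dt), size=10**6)   # Wiener increments over [t, t+dt]

print("mean of (dZ)^2:", np.mean(dZ**2), " vs dt =", dt)
print("mean of dZ*dt :", np.mean(dZ) * dt, " (order dt**1.5)")
print("dt^2          :", dt**2, " (order dt**2)")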

Substitute (12.4) into (12.3) and use (12.5), (12.6), (12.7), and the prop-
erty that E[dZt ] = 0 to obtain

V = max_U [ F dt + V + Vt dt + Vx f dt + (1/2)Vxx G² dt + o(dt) ].    (12.8)

Note that we have suppressed the arguments of the functions involved


in (12.8).
Canceling the term V on both sides of (12.8), dividing the remainder
by dt, and letting dt → 0, we obtain the Hamilton-Jacobi-Bellman (HJB)
equation
0 = max_U [ F + Vt + Vx f + (1/2)Vxx G² ]    (12.9)
for the value function V(x, t) with the boundary condition

V (x, T ) = S(x, T ). (12.10)

Just as we had introduced a current-value formulation of the max-


imum principle in Sect. 3.3, let us derive a current-value version of the
HJB equation here. For this, in a way similar to (3.29), we write the
objective function to be maximized as
E [ ∫_0^T φ(Xt, Ut)e^{−ρt}dt + ψ(XT)e^{−ρT} ].    (12.11)

We can relate this to (12.1) by setting

F (Xt , Ut , t) = φ(Xt , Ut )e−ρt and S(XT , T ) = ψ(XT )e−ρT . (12.12)

It is important to mention that the explicit dependence on time t in


(12.11) is only via the discounting term. If it were not the case, there
would be no advantage in formulating the current-value version of the
HJB equation.
Rather than develop the current-value HJB equation in a manner of
developing (12.9), we will derive it from (12.9) itself. For this we define
the current-valued value function

Ṽ (x, t) = V (x, t)eρt . (12.13)

Then we have

Vt = Ṽt e−ρt − ρṼ e−ρt , Vx = Ṽx e−ρt and Vxx = Ṽxx e−ρt . (12.14)

By using these and (12.12) in (12.9), we obtain

0 = max_U [ φe^{−ρt} + Ṽt e^{−ρt} − ρṼ e^{−ρt} + Ṽx f e^{−ρt} + (1/2)Ṽxx G² e^{−ρt} ].

Multiplying by eρt and rearranging terms, we get

ρṼ = max_U [ φ + Ṽt + Ṽx f + (1/2)Ṽxx G² ].    (12.15)

Moreover, from (12.12), (12.13), and (12.10), we can get the boundary
condition
Ṽ (x, T ) = ψ(x). (12.16)
Thus, we have obtained (12.15) and (12.16) as the current-value HJB
equation.
To obtain its infinite-horizon version, it is generally the case that
we remove the explicit dependence on t from the functions f and G in
(12.2), and also assume that ψ ≡ 0. With that, the dynamics (12.2) and
the objective function (12.11) change, respectively, to

dXt = f (Xt , Ut )dt + G(Xt , Ut )dZt , X0 = x0 , (12.17)


E ∫_0^∞ φ(Xt, Ut)e^{−ρt}dt.    (12.18)

It should then be obvious that Ṽt = 0, and we can obtain the infinite-
horizon version of (12.15) as

ρṼ = max_U [ φ + Ṽx f + (1/2)Ṽxx G² ].    (12.19)

As for its boundary condition, (12.16) is replaced by a growth condition


that is the same, in general, as the growth of the function φ(x, U ) in
x. For example, if φ(x, U ) is quadratic in x, we would look for a value
function Ṽ (x) to be of quadratic growth. See Beyer et al. (2010), Chapter
3, for a related discussion of a polynomial growth case in the discrete
time setting.
If we can find a solution of the HJB equation with the given bound-
ary condition (or an appropriate growth condition in the infinite horizon
case), then a result called a verification theorem suggests that we can
construct an optimal feedback control U ∗ (x, t) (or U ∗ (x) in the infinite
horizon case) by maximizing the right-hand side of the HJB equation

with respect to U. For further details and extension when the value func-
tion is not smooth enough and thus not a classical solution of the HJB
equation, see Fleming and Rishel (1975), Yong and Zhou (1999), and
Fleming and Soner (1992).
In the next three sections, we will apply this procedure to solve prob-
lems in production, marketing and finance.

12.2 A Stochastic Production Inventory Model


In Sect. 6.1.1, we formulated a deterministic production-inventory model.
In this section, we extend a simplified version of that model by including
a random process. Let us define the following quantities:

It = the inventory level at time t (state variable),


Pt = the production rate at time t (control variable),
S = the constant demand rate at time t; S > 0,
T = the length of planning period,
I0 = the initial inventory level,
B = the salvage value per unit of inventory at time T,
Zt = the standard Wiener process,
σ = the constant diffusion coefficient.

The inventory process evolves according to the stock-flow equation


stated as the Itô stochastic differential equation

dIt = (Pt − S)dt + σdZt , I0 given, (12.20)

where I0 denotes the initial inventory level. As mentioned in Appendix


Sect. D.2, the process dZt can be formally expressed as w(t)dt, where
w(t) is considered to be a white noise process; see Arnold (1974). It can
be interpreted as “sales returns,” “inventory spoilage,” etc., which are
random in nature.
The objective function is:
max E { BI_T − ∫_0^T (Pt² + It²) dt }.    (12.21)

It can be interpreted as maximization of the terminal salvage value less


the cost of production and inventory assumed to be quadratic. In Ex-
ercise 12.1, you will be asked to solve the problem with the objective

function given by the expected value of the undiscounted version of the


integral in (6.2).
As in Sect. 6.1.1 we do not restrict the production rate to be nonneg-
ative. In other words, we permit disposal (i.e., Pt < 0). While this is
done for mathematical expedience, we will state conditions under which
a disposal is not required. Note further that the inventory level is allowed
to be negative, i.e., we permit backlogging of demand.
The solution of the above model due to Sethi and Thompson (1981a)
will be carried out via the previous development of the HJB equation
satisfied by a certain value function.
Let V (x, t) denote the expected value of the objective function from
time t to the horizon T with It = x and using the optimal policy from
t to T. The function V (x, t) is referred to as the value function, and it
satisfies the HJB equation
0 = max_P [ −(P² + x²) + Vt + Vx(P − S) + (1/2)σ²Vxx ]    (12.22)
with the boundary condition
V (x, T ) = Bx. (12.23)
Note that these are applications of (12.9) and (12.10) to the production
planning problem.
It is now possible to maximize the expression inside the bracket of
(12.22) with respect to P by taking its derivative with respect to P and
setting it to zero. This procedure yields
P∗(x, t) = Vx(x, t)/2.    (12.24)
Substituting (12.24) into (12.22) yields the equation
0 = Vx²/4 − x² + Vt − SVx + (1/2)σ²Vxx,    (12.25)
which, after the max operation has been performed, is known as the
Hamilton-Jacobi equation. This is a partial differential equation which
must be satisfied by the value function V (x, t) with the boundary con-
dition (12.23). The solution of (12.25) is considered in the next section.
Remark 12.1 It is important to remark that if the production rate were
restricted to be nonnegative, then, as in Remark 6.1, (12.24) would be
changed to
P∗(x, t) = max { 0, Vx(x, t)/2 }.    (12.26)

Substituting (12.26) into (12.22) would give us a partial differential equa-


tion which must be solved numerically. We will not consider (12.26)
further in this chapter.

12.2.1 Solution for the Production Planning Problem


To solve Eq. (12.25) with the boundary condition (12.23) we let

V (x, t) = Q(t)x2 + R(t)x + M (t). (12.27)

Then,

Vt = Q̇x2 + Ṙx + Ṁ , (12.28)


Vx = 2Qx + R, (12.29)
Vxx = 2Q, (12.30)

where Ẏ denotes dY /dt. Substituting (12.28)–(12.30) in (12.25) and col-


lecting terms gives

x²[Q̇ + Q² − 1] + x[Ṙ + RQ − 2SQ] + Ṁ + R²/4 − RS + σ²Q = 0.    (12.31)
Since (12.31) must hold for any value of x, we must have

Q̇ = 1 − Q², Q(T) = 0,    (12.32)
Ṙ = 2SQ − RQ, R(T) = B,    (12.33)
Ṁ = RS − R²/4 − σ²Q, M(T) = 0,    (12.34)
where the boundary conditions for the system of simultaneous differential
equations (12.32), (12.33), and (12.34) are obtained by comparing (12.27)
with the boundary condition V (x, T ) = Bx of (12.23).
To solve (12.32), we expand Q̇/(1 − Q2 ) by partial fractions to obtain

(Q̇/2) [ 1/(1 − Q) + 1/(1 + Q) ] = 1,
which can be easily integrated. The answer is
Q = (y − 1)/(y + 1),    (12.35)

where

y = e^{2(t−T)}.    (12.36)
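As a quick numerical sanity check (Python with scipy; this is not part of the derivation), one can integrate the Riccati equation (12.32) backward from Q(T) = 0 and compare with the closed form (12.35)–(12.36):

import numpy as np
from scipy.integrate import solve_ivp

T = 1.0
sol = solve_ivp(lambda t, Q: 1.0 - Q**2, (T, 0.0), [0.0],   # backward from Q(T)=0
                dense_output=True, rtol=1e-10, atol=1e-12)

t = np.linspace(0.0, T, 11)
y = np.exp(2.0 * (t - T))
Q_closed = (y - 1.0) / (y + 1.0)                            # (12.35)-(12.36)
print("max deviation:", np.max(np.abs(sol.sol(t)[0] - Q_closed)))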

Since S is assumed to be a constant, we can reduce (12.33) to


Ṙ0 + R0 Q = 0, R0 (T ) = B − 2S
by the change of variable defined by R0 = R − 2S. Clearly the solution
is given by

log R0(T) − log R0(t) = − ∫_t^T Q(τ)dτ,
which can be simplified further to obtain

R = 2S + 2(B − 2S)√y / (y + 1).    (12.37)
Having obtained solutions for R and Q, we can easily express (12.34) as
M(t) = − ∫_t^T [ R(τ)S − (R(τ))²/4 − σ²Q(τ) ] dτ.    (12.38)

The optimal control is defined by (12.24), and the use of (12.35) and
(12.37) yields

P∗(x, t) = Vx/2 = Qx + R/2 = S + [ (y − 1)x + (B − 2S)√y ] / (y + 1).    (12.39)
This means that the optimal production rate for t ∈ [0, T] is
Pt∗ = P∗(It∗, t) = S + [ (e^{2(t−T)} − 1)It∗ + (B − 2S)e^{(t−T)} ] / (e^{2(t−T)} + 1),    (12.40)
e2(t−T ) + 1
where It∗ , t ∈ [0, T ], is the inventory level observed at time t when using
the optimal production rate Pt∗ , t ∈ [0, T ], according to (12.40).
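A sample path of the controlled inventory (see Remark 12.5 and Fig. 12.1 below) can be generated by an Euler–Maruyama discretization of (12.20) under the feedback rule (12.40). The sketch below (Python with numpy) uses the parameter values quoted in Fig. 12.1; the number of time steps is an arbitrary choice.

import numpy as np

x0, T, B, S, sigma = 2.0, 12.0, 20.0, 5.0, 2.0   # values from Fig. 12.1
n = 12000
dt = T / n
rng = np.random.default_rng(1)

def P_star(x, t):
    # the feedback production rule (12.40)
    y = np.exp(2.0 * (t - T))
    return S + ((y - 1.0) * x + (B - 2.0 * S) * np.sqrt(y)) / (y + 1.0)

I = np.empty(n + 1)
I[0] = x0
for k in range(n):
    I[k + 1] = (I[k] + (P_star(I[k], k * dt) - S) * dt
                + sigma * np.sqrt(dt) * rng.standard_normal())

print("I(0) = %.2f, I(T/2) = %.2f, I(T) = %.2f" % (I[0], I[n // 2], I[n]))

The simulated path hovers around zero for most of the horizon and builds up toward the end, as Remark 12.5 describes.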

Remark 12.2 The optimal production rate in (12.39) equals the de-
mand rate plus a correction term which depends on the level of inven-
tory and the distance from the horizon time T. Since (y − 1) < 0 for
t < T, it is clear that for lower values of x, the optimal production rate
is likely to be positive. However, if x is very high, the correction term
will become smaller than −S, and the optimal control will be negative.
In other words, if inventory level is too high, the factory can save money
by disposing of a part of the inventory, resulting in lower holding costs.

Remark 12.3 If the demand rate S were time-dependent, it would have


changed the solution of (12.33). Having computed this new solution
in place of (12.37), we can once again obtain the optimal control as
P ∗ (x, t) = Qx + R/2.

Remark 12.4 Note that when T → ∞, we have y → 0 and

P ∗ (x, t) → S − x, (12.41)

but the undiscounted objective function value (12.21) in this case be-
comes −∞. Clearly, any other policy will render the objective function
value to be −∞. In a sense, the optimal control problem becomes ill-
posed. One way to get out of this difficulty is to impose a nonzero
discount rate. You are asked to carry this out in Exercise 12.2.

Remark 12.5 It would help our intuition if we could draw a picture of


the path of the inventory level over time. Since the inventory level is
a stochastic process, we can only draw a typical sample path. Such a
sample path is shown in Fig. 12.1. If the horizon time T is long enough,
the optimal control will bring the inventory level to the goal level x̄ = 0.
It will then hover around this level until t is sufficiently close to the
horizon T. During the ending phase, the optimal control will try to build
up the inventory level in response to a positive valuation B for ending
inventory.

Figure 12.1: A sample path of the inventory level It∗ under the optimal
production rate, with I0 = x0 > 0 and B > 0 (figure drawn for x0 = 2,
T = 12, B = 20, S = 5, σ = 2)

12.3 The Sethi Advertising Model


In this section, we will discuss a stochastic advertising model due to
Sethi (1983b). The model is:
max E [ ∫_0^∞ e^{−ρt}(πXt − Ut²)dt ]

subject to

dXt = (rUt√(1 − Xt) − δXt)dt + σ(Xt)dZt, X0 = x0,    (12.42)

Ut ≥ 0,

where Xt is the market share and Ut is the rate of advertising at time t,


and where, as specified in Sect. 7.2.1, ρ > 0 is the discount rate, π > 0
is the profit margin on sales, r > 0 is the advertising effectiveness pa-
rameter, and δ > 0 is the sales decay parameter. Furthermore, Zt is the
standard one-dimensional Wiener process and σ(x) is the diffusion coef-
ficient function having some properties to be specified shortly. The term
in the integrand represents the discounted profit rate at time t. Thus,
the integral represents the total value of the discounted profit stream on
a sample path. The objective in (12.42) is, therefore, to maximize the
expected value of the total discounted profit.
The Sethi model is a modification as well as a stochastic extension
of the optimal control formulation of the Vidale-Wolfe advertising model
presented in (7.43). The Itô equation in (12.42) modifies the Vidale-
Wolfe dynamics (7.25) by replacing the term rU(1 − x) by rUt√(1 − Xt)
and adding a diffusion term σ(Xt)dZt on the right-hand side. Further-
more, the linear cost of advertising U in (7.43) is replaced by a quadratic
cost of advertising Ut2 in (12.42). The control constraint 0 ≤ U ≤ Q in
(7.43) is replaced by simply Ut ≥ 0. The addition of the diffusion term
yields a stochastic optimal control problem as expressed in (12.42).
An important consideration in choosing the function σ(x) should be
that the solution Xt to the Itô equation in (12.42) remains inside the
interval [0, 1]. Merely requiring that the initial condition x0 ∈ [0, 1], as in
Sect. 7.2.1, is no longer sufficient in the stochastic case. Additional con-
ditions need to be imposed. It is possible to specify these conditions by
using the theory presented by Gihman and Skorohod (1972) for stochas-
tic differential equations on a finite spatial interval. In our case, the
conditions boil down to the following, in addition to x0 ∈ (0, 1), which

has been assumed already in (12.42):


σ(x) > 0, x ∈ (0, 1) and σ(0) = σ(1) = 0. (12.43)
It is possible to show that for any feedback control U (x) satisfying
U (x) ≥ 0, x ∈ (0, 1], and U (0) > 0, (12.44)
the Itô equation in (12.42) will have a solution Xt such that 0 < Xt < 1,
almost surely (i.e., with probability 1). Since our solution for the optimal
advertising U ∗ (x) would turn out to satisfy (12.44), we will have the
optimal market share Xt∗ lie in the interval (0, 1).
Let V (x) denote the value function for the problem, i.e., V (x) is the
expected value of the discounted profits from time t to infinity, when
Xt = x and an optimal policy Ut∗ is followed from time t onwards. Note
that since T = ∞, the future looks the same from any time t, and
therefore the value function does not depend on t. It is for this reason
that we have defined the value function as V (x), rather than V (x, t) as
in the previous section.
Using now the principle of optimality as in Sect. 12.1, we can write
the HJB equation as
ρV(x) = max_U [ πx − U² + Vx(rU√(1 − x) − δx) + Vxx(σ(x))²/2 ].
(12.45)

Maximization of the RHS of (12.45) can be accomplished by taking its


derivative with respect to U and setting it to zero. This gives

U∗(x) = rVx√(1 − x) / 2.    (12.46)
Substituting (12.46) into (12.45) and simplifying the resulting expression
yields the HJB equation
ρV(x) = πx + Vx²r²(1 − x)/4 − Vxδx + (1/2)σ²(x)Vxx.    (12.47)
As shown in Sethi (1983b), a solution of (12.47) is
V(x) = λ̄x + λ̄²r²/(4ρ),    (12.48)
where

λ̄ = [ √((ρ + δ)² + r²π) − (ρ + δ) ] / (r²/2),    (12.49)

as derived in Exercise 7.37. In Exercise 12.3, you are asked to verify that
(12.48) and (12.49) solve the HJB equation (12.47).
We can now obtain the explicit formula for the optimal feedback
control as

U∗(x) = rλ̄√(1 − x) / 2.    (12.50)
Note that U ∗ (x) satisfies the conditions in (12.44).
As in Exercise 7.37, it is easy to characterize (12.50) as

Ut∗ = U∗(Xt) { > Ū if Xt < X̄;  = Ū if Xt = X̄;  < Ū if Xt > X̄ },    (12.51)

where

X̄ = (r²λ̄/2) / (r²λ̄/2 + δ)    (12.52)

and

Ū = rλ̄√(1 − X̄) / 2,    (12.53)

as given in (7.51).
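For concreteness, the sketch below (Python with numpy) evaluates λ̄, X̄, Ū, and the feedback rule (12.50) for one set of parameter values; the values of r, δ, ρ, and π are illustrative assumptions only.

import numpy as np

r, delta, rho, pi = 1.0, 0.1, 0.1, 0.5   # assumed parameter values

lam_bar = (np.sqrt((rho + delta)**2 + r**2 * pi) - (rho + delta)) / (r**2 / 2.0)  # (12.49)
X_bar = (r**2 * lam_bar / 2.0) / (r**2 * lam_bar / 2.0 + delta)                   # (12.52)
U_bar = r * lam_bar * np.sqrt(1.0 - X_bar) / 2.0                                  # (12.53)
U_star = lambda x: r * lam_bar * np.sqrt(1.0 - x) / 2.0                           # (12.50)

print("lam_bar = %.4f, X_bar = %.4f, U_bar = %.4f" % (lam_bar, X_bar, U_bar))
print("U*(X_bar) = %.4f" % U_star(X_bar))   # equals U_bar, as in (12.51)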
as given in (7.51).
The market share trajectory for Xt is no longer monotone because
of the random variations caused by the diffusion term σ(Xt )dZt in the
Itô equation in (12.42). Eventually, however, the market share process
hovers around the equilibrium level X̄. It is, in this sense and as in the
previous section, also a turnpike result in a stochastic environment.

12.4 An Optimal Consumption-Investment


Problem
In Example 1.3 in Chap. 1, we had formulated a problem faced by Rich
Rentier who wants to consume his wealth in a way that will maximize his
total utility of consumption and bequest. In that example, Rich Rentier
kept his money in a savings plan earning interest at a fixed rate of r > 0.
In this section, we will offer Rich the possibility of investing a part
of his wealth in a risky security or stock that earns an expected rate
of return that equals α > r. Rich, now known as Rich Investor, must
optimally allocate his wealth between the risk-free savings account and

the risky stock over time and consume over time so as to maximize his
total utility of consumption. We will assume an infinite horizon problem
in lieu of the bequest, for convenience in exposition. One could, however,
argue that Rich’s bequest would be optimally invested and consumed by
his heir, who in turn would leave a bequest that would be optimally
invested and consumed by a succeeding heir and so on. Thus, if Rich
considers the utility accrued to all his heirs as his own, then he can justify
solving an infinite horizon problem without a bequest.
In order to formulate the stochastic optimal control problem of Rich
Investor, we must first model his investments. The savings account is
easy to model. If S0 is the initial deposit in the savings account earning
an interest at the rate r > 0, then we can write the accumulated amount
St at time t as
St = S0e^{rt}.
This can be expressed as a differential equation, dSt /dt = rSt , which we
will rewrite as
dSt = rSt dt, S0 ≥ 0. (12.54)
Modeling the stock is much more complicated. Merton (1971) and
Black and Scholes (1973) have proposed that the stock price Pt can be
modeled by an Itô equation, namely,
dPt/Pt = αdt + σdZt, P0 > 0,    (12.55)
or simply,
dPt = αPt dt + σPt dZt , P0 > 0, (12.56)
where P0 > 0 is the given initial stock price, α is the average rate of
return on stock, σ is the standard deviation associated with the return,
and Zt is a standard Wiener process.
Remark 12.6 The LHS in (12.55) can also be written as d ln Pt. Another
name for the process Zt is Brownian Motion. Because of these, the price
process Pt given by (12.55) is often referred to as a logarithmic Brownian
Motion. It is important to note from (12.56) that Pt remains nonnegative
at any t > 0 on account of the fact that the price process has almost
surely continuous sample paths (see Sect. D.2). This property nicely
captures the limited liability that is incurred in owning a share of stock.
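Since (12.56) has the well-known explicit solution Pt = P0 exp((α − σ²/2)t + σZt), a short simulation (Python with numpy; the parameter values are illustrative assumptions) confirms both the positivity of the price and the mean growth E[Pt] = P0e^{αt}:

import numpy as np

alpha, sigma, P0, t = 0.08, 0.3, 100.0, 1.0   # assumed values
rng = np.random.default_rng(2)
Z = rng.normal(0.0, np.sqrt(t), size=10**6)   # Z_t ~ N(0, t)
Pt = P0 * np.exp((alpha - 0.5 * sigma**2) * t + sigma * Z)  # exact GBM solution

print("all simulated prices positive:", bool(np.all(Pt > 0)))
print("sample mean:", Pt.mean(), " vs P0*exp(alpha*t) =", P0 * np.exp(alpha * t))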
In order to complete the formulation of Rich’s stochastic optimal
control problem, we need the following additional notation:
Wt = the wealth at time t,

Ct = the consumption rate at time t,


Qt = the fraction of the wealth invested in stock at time t,
1 − Qt = the fraction of the wealth kept in the savings account
at time t,
U (C) = the utility of consumption when consumption is at the
rate C; the function U (C) is assumed to be increasing
and concave,
ρ = the rate of discount applied to consumption utility,
B = the bankruptcy parameter, to be explained later.
Next we develop the dynamics of the wealth process. Since the in-
vestment decision Q is unconstrained, it means Rich is allowed to buy
stock as well as to sell it short. Moreover, Rich can deposit in, as well
as borrow money from, the savings account at the rate r.
While it is possible to rigorously obtain the equation for the wealth
process involving an intermediate variable, namely, the number Nt of
shares of stock owned at time t, we will not do so. Instead, we will write
the wealth equation informally as
dWt = Qt Wt αdt + Qt Wt σdZt + (1 − Qt )Wt rdt − Ct dt
= (α − r)Qt Wt dt + (rWt − Ct )dt + σQt Wt dZt , W0 given,
(12.57)
and provide an intuitive explanation for it. The term Qt Wt αdt represents
the expected return from the risky investment of Qt Wt dollars during the
period from t to t+dt. The term Qt Wt σdZt represents the risk involved in
investing Qt Wt dollars in stock. The term (1 − Qt )Wt rdt is the amount
of interest earned on the balance of (1 − Qt )Wt dollars in the savings
account. Finally, Ct dt represents the amount of consumption during the
interval from t to t + dt.
In deriving (12.57), we have assumed that Rich can trade contin-
uously in time without incurring any broker’s commission. Thus, the
change in wealth dWt from time t to time t + dt is due to consumption as
well as the change in share price. For a rigorous development of (12.57)
from (12.54) and (12.55), see Harrison and Pliska (1981).
Since Rich can borrow an unlimited amount and invest it in stock,
his wealth could fall to zero at some time T. We will say that Rich goes
bankrupt at time T, when his wealth falls to zero at that time. It is clear
that T is a random variable defined as
T = inf{t ≥ 0|Wt = 0}. (12.58)

This special type of random variable is called a stopping time, since it is


observed exactly at the instant of time when wealth falls to zero.
We can now specify Rich’s objective function. It is:
max J = E [ ∫_0^T e^{−ρt}U(Ct)dt + e^{−ρT}B ],    (12.59)

where we have assumed that Rich experiences a payoff of B, in the units


of utility, at the time of bankruptcy. B can be positive if there is a
social welfare system in place, or B can be negative if there is remorse
associated with bankruptcy. See Sethi (1997a) for a detailed discussion
of the bankruptcy parameter B.
Let us recapitulate the optimal control problem of Rich Investor:
max J = E [ ∫_0^T e^{−ρt}U(Ct)dt + e^{−ρT}B ]

subject to

dWt = (α − r)QtWtdt + (rWt − Ct)dt + σQtWtdZt, W0 given,

Ct ≥ 0.    (12.60)
As in the infinite horizon problem of Sect. 12.3, here also the value
function is stationary with respect to time t. This is because T is a stop-
ping time of bankruptcy, and the future evolution of wealth, investment,
and consumption processes from any starting time t depends only on the
wealth at time t and not on time t itself. Therefore, let V (x) be the
value function associated with an optimal policy beginning with wealth
Wt = x at time t. Using the principle of optimality as in Sect. 12.1, the
HJB equation satisfied by the value function V (x) for problem (12.60)
can be written as


ρV(x) = max_{C≥0, Q} [ (α − r)QxVx + (rx − C)Vx + (1/2)Q²σ²x²Vxx + U(C) ],    (12.61)

V(0) = B.

This problem and a number of its generalizations are solved explicitly


in Sethi (1997a). Here we shall confine ourselves to solving a simpler
problem resulting from the following considerations.

It is shown in Karatzas et al. (1986), reproduced as Chapter 2 in


Sethi (1997a), that when B ≤ U (0)/ρ, no bankruptcy will occur. This
should be intuitively obvious because if Rich goes bankrupt at any time
T > 0, he receives B at that time, whereas by not going bankrupt at
that time he reaps the utility of strictly more than U (0)/ρ on account
of consumption from time T onward. It is shown furthermore that if
U′(0) = ∞, then the optimal consumption rate will be strictly positive.
This is because even an infinitesimally small positive consumption rate
results in a proportionally large amount of utility on account of the
infinite marginal utility at zero consumption level. A popular utility
function used in the literature is
U (C) = lnC, (12.62)
which was also used in Example 1.3. This function gives an infinite
marginal utility at zero consumption, i.e.,
U′(0) = 1/C|_{C=0} = ∞.    (12.63)
We also assume B = U (0)/ρ = −∞. These assumptions imply a strictly
positive consumption level at all times and no bankruptcy.
Since Q is already unconstrained, having no bankruptcy and only
positive (i.e., interior) consumption level allows us to obtain the form of
the optimal consumption and investment policy simply by differentiating
the RHS of (12.61) with respect to Q and C and equating the resulting
expressions to zero. Thus,
(α − r)xVx + Qσ 2 x2 Vxx = 0,
i.e.,
Q∗(x) = − (α − r)Vx / (xσ²Vxx),    (12.64)

and

C∗(x) = 1/Vx.    (12.65)
Substituting (12.64) and (12.65) in (12.61) allows us to remove the
max operator from (12.61), and provides us with the equation
 
ρV(x) = − γ(Vx)²/Vxx + (rx − 1/Vx)Vx − ln Vx,    (12.66)
where
γ = (α − r)²/(2σ²).    (12.67)

This is a nonlinear ordinary differential equation that appears to be


quite difficult to solve. However, Karatzas et al. (1986) used a change
of variable that transforms (12.66) into a second-order, linear, ordinary
differential equation, which has a known solution. For our purposes, we
will simply guess that the value function is of the form
V (x) = A ln x + B, (12.68)
where A and B are constants, and obtain the values of A and B by
substitution in (12.66). Using (12.68) in (12.66), we see that
ρA ln x + ρB = γA + (rx − x/A)(A/x) − ln(A/x)
             = γA + rA − 1 − ln A + ln x.
By comparing the coefficients of ln x and the constants on both sides,
we get A = 1/ρ and B = (r − ρ + γ)/ρ2 + ln ρ/ρ. By substituting these
values in (12.68), we obtain
V(x) = (1/ρ) ln(ρx) + (r − ρ + γ)/ρ², x ≥ 0.    (12.69)
In Exercise 12.4, you are asked by a direct substitution in (12.66)
to verify that (12.69) is indeed a solution of (12.66). Moreover, V (x)
defined in (12.69) is strictly concave, so that our concavity assumption
made earlier is justified.
From (12.69), it is easy to show that (12.64) and (12.65) yield the
following feedback policies:
Q∗(x) = (α − r)/σ²,    (12.70)

C∗(x) = ρx.    (12.71)
The investment policy (12.70) says that the optimal fraction of the wealth
invested in the risky stock is (α − r)/σ 2 , i.e.,
Qt∗ = Q∗(Wt) = (α − r)/σ², t ≥ 0,    (12.72)
which is a constant over time. The optimal consumption policy is to
consume a constant fraction ρ of the current wealth, i.e.,
Ct∗ = C ∗ (Wt ) = ρWt , t ≥ 0. (12.73)
This problem and its many extensions have been studied in great
detail. See, e.g., Sethi (1997a).
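The optimal policies (12.72) and (12.73) are simple enough to simulate directly. The sketch below (Python with numpy, with illustrative parameter values) applies them to an Euler–Maruyama discretization of the wealth dynamics in (12.60):

import numpy as np

alpha, r, sigma, rho = 0.10, 0.04, 0.25, 0.05   # assumed parameter values
W, T, n = 100.0, 30.0, 30000
dt = T / n
Q = (alpha - r) / sigma**2     # constant optimal stock fraction (12.72)
rng = np.random.default_rng(3)

for _ in range(n):
    C = rho * W                # consume the fraction rho of wealth (12.73)
    W += (((alpha - r) * Q * W + r * W - C) * dt
          + sigma * Q * W * np.sqrt(dt) * rng.standard_normal())

print("Q* = %.3f; simulated terminal wealth = %.2f" % (Q, W))

With logarithmic utility the optimal stock fraction is constant, so the simulation requires no value-function machinery at all.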

12.5 Concluding Remarks


In this chapter, we have considered stochastic optimal control problems
subject to Itô differential equations. For impulse stochastic control, see
Bensoussan and Lions (1984). For stochastic control problems with jump
Markov processes or, more generally, martingale problems, see Fleming
and Soner (1992), Davis (1993), and Karatzas and Shreve (1998). For
problems with incomplete information or partial observation, see Ben-
soussan (2004, 2018), Elliott et al. (1995), and Bensoussan et al. (2010).
For applications of stochastic optimal control to manufacturing prob-
lems, see Sethi and Zhang (1994a), Yin and Zhang (1997), Sethi et al.
(2005), Bensoussan (2011), and Bensoussan et al. (2007b,c,d, 2008a,b,
2009a,b,c). For applications to problems in finance, see Sethi (1997a),
Karatzas and Shreve (1998), and Bensoussan et al. (2009d). For ap-
plications in marketing, see Tapiero (1988), Raman (1990), and Sethi
and Zhang (1995b). For applications of stochastic optimal control to
economics including economics of natural resources, see, e.g., Pindyck
(1978a,b), Rausser and Hochman (1979), Arrow and Chang (1980),
Derzko and Sethi (1981a), Bensoussan and Lesourne (1980, 1981),
Malliaris and Brock (1982), and Brekke and Øksendal (1994).

Exercises for Chapter 12

E 12.1 Solve the production-inventory problem with the state equation


(12.20) and the objective function
min J = E { ∫_0^T [ (h/2)(I − Î)² + (c/2)(P − P̂)² ] dt },

where h > 0, c > 0, Î ≥ 0, and P̂ ≥ 0; see the objective function (6.2) for
the interpretation of these parameters.

E 12.2 Formulate and solve the discounted infinite-horizon version of


the stochastic production planning model of Sect. 12.2. Specifically, as-
sume B = 0 and replace the objective function in (12.21) by
max E { − ∫_0^∞ e^{−ρt}(Pt² + It²) dt }.

E 12.3 Verify by direct substitution that the value function defined by


(12.48) and (12.49) solves the HJB equation (12.47).

E 12.4 Verify by direct substitution that the value function in (12.69)


solves the HJB equation (12.66).

E 12.5 Solve the consumption-investment problem (12.60) with the util-


ity function U (C) = C β , 0 < β < 1, and B = 0.

E 12.6 Solve Exercise 12.5 when U (C) = −C β with β < 0 and B =


−∞.

E 12.7 Solve the optimal consumption-investment problem:


V(x) = max J = E [ ∫_0^∞ e^{−ρt} ln(Ct − s) dt ]
subject to
dWt = (α − r)Qt Wt dt + (rWt − Ct )dt + σQt Wt dZt , W0 = x,
Ct ≥ s.
Here s > 0 denotes a minimal subsistence consumption, and we assume
0 < ρ < 1. Note that the value function V (s/r) = −∞. Guess a solution
of the form
V (x) = A ln(x − s/r) + B.
Find the constants A, B, and the optimal feedback consumption and in-
vestment allocation policies C ∗ (x) and Q∗ (x), respectively. Characterize
these policies in words.

E 12.8 Solve the consumption-investment problem:


V(x) = max J = E [ ∫_0^∞ e^{−ρt}(Ct − s)^β dt ]
subject to
dWt = (α − r)Qt Wt dt + (rWt − Ct )dt + σQt Wt dZt , W0 = x,
Ct ≥ s.
Here s > 0 denotes a minimal subsistence consumption and we assume
0 < ρ < 1 and 0 < β < 1. Note that the value function V (s/r) = 0.
Therefore, guess a solution of the form
V (x) = A(x − s/r)β .
Find the constant A and the optimal feedback consumption and invest-
ment allocation policies C ∗ (x) and Q∗ (x), respectively. Characterize
these policies in words.
Chapter 13

Differential Games

In previous chapters, we were mainly concerned with the optimal control


problems formulated by a single objective function (or a single decision
maker). However, there are situations when there may be more than
one decision maker, each having one’s own objective function that each
is trying to maximize, subject to a set of differential equations. This
extension of optimal control theory is referred to as the theory of differ-
ential games.
The study of differential games was initiated by Isaacs (1965). Af-
ter the development of Pontryagin’s maximum principle, it became clear
that there was a connection between differential games and optimal con-
trol theory. In fact, differential game problems represent a generalization
of optimal control problems in cases where there is more than one con-
troller or player. However, differential games are conceptually far more
complex than optimal control problems in the sense that it is no longer
obvious what constitutes a solution; see Starr and Ho (1969), Ho (1970),
Varaiya (1970), Friedman (1971), Leitmann (1974), Case (1979), Selten
(1975), Mehlmann (1988), Berkovitz (1994), Basar and Olsder (1999),
Dockner et al. (2000), and Basar et al. (2010). Indeed, there are a num-
ber of different types of solutions such as minimax, Nash, Stackelberg,
along with possibilities of cooperation and bargaining; see, e.g., Tolwin-
ski (1982) and Haurie et al. (1983). We will discuss minimax solutions for
zero-sum differential games in Sect. 13.1, Nash solutions for nonzero-sum
games in Sect. 13.2, and Stackelberg differential games in Sect. 13.3.


13.1 Two-Person Zero-Sum Differential Games

Consider the state equation

ẋ = f(x, u, v, t), x(0) = x_0, (13.1)

where we may assume all variables to be scalar for the time being. Extension to the vector case simply requires appropriate reinterpretations of each of the variables and the equations. In this equation, we let u and v denote the controls applied by players 1 and 2, respectively. We assume that

u(t) ∈ U, v(t) ∈ V, t ∈ [0, T],

where U and V are convex sets in E^1. Consider further the objective function

J(u, v) = S[x(T)] + ∫_0^T F(x, u, v, t)dt, (13.2)
which player 1 wants to maximize and player 2 wants to minimize. Since the gain of player 1 represents a loss to player 2, such games are appropriately termed zero-sum games. Clearly, we are looking for admissible control trajectories u∗ and v∗ such that

J(u∗, v) ≥ J(u∗, v∗) ≥ J(u, v∗). (13.3)

The solution (u∗, v∗) is known as the minimax solution. Here u∗ and v∗ stand for u∗(t), t ∈ [0, T], and v∗(t), t ∈ [0, T], respectively.
The necessary conditions for u∗ and v∗ to satisfy (13.3) are given by an extension of the maximum principle. To obtain these conditions, we form the Hamiltonian

H = F + λf (13.4)

with the adjoint variable λ satisfying the equation

λ̇ = −H_x, λ(T) = S_x[x(T)]. (13.5)

The necessary condition for trajectories u∗ and v∗ to be a minimax solution is that for t ∈ [0, T],

H(x∗(t), u∗(t), v∗(t), λ(t), t) = min_{v∈V} max_{u∈U} H(x∗(t), u, v, λ(t), t), (13.6)

which can also be stated, with suppression of (t), as

H(x∗, u∗, v, λ, t) ≥ H(x∗, u∗, v∗, λ, t) ≥ H(x∗, u, v∗, λ, t) (13.7)

for u ∈ U and v ∈ V. Note that (u∗, v∗) is a saddle point of the Hamiltonian function H.

Note also that if u and v are unconstrained, i.e., when U = V = E^1, condition (13.6) reduces to the first-order necessary conditions

H_u = 0 and H_v = 0, (13.8)

and the second-order conditions are

H_uu ≤ 0 and H_vv ≥ 0. (13.9)
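As a quick illustration of the unconstrained conditions (13.8) and (13.9), the following Python sketch (SymPy) checks them for a toy Hamiltonian; the choice H = −u² + v² + λ(u + v) is an assumption made purely for illustration and does not come from any model in this book.

import sympy as sp

u, v, lam = sp.symbols('u v lam')

# Toy Hamiltonian (illustrative assumption): concave in u, convex in v
H = -u**2 + v**2 + lam*(u + v)

# First-order conditions (13.8): Hu = 0 and Hv = 0
print(sp.solve([sp.diff(H, u), sp.diff(H, v)], [u, v]))  # {u: lam/2, v: -lam/2}

# Second-order conditions (13.9): Huu <= 0 and Hvv >= 0
print(sp.diff(H, u, 2), sp.diff(H, v, 2))  # -2 2

The saddle-point structure of (13.7) is visible in the opposite signs of the two second derivatives.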
We now turn to the treatment of nonzero-sum differential games.
13.2 Nash Differential Games

In this section, let us assume that we have N players where N ≥ 2. Let u^i ∈ U^i, i = 1, 2, . . . , N, represent the control variable for the ith player, where U^i is the set of controls from which the ith player can choose. Let the state equation be defined as

ẋ = f(x, u^1, u^2, . . . , u^N, t). (13.10)

Let J^i, defined by

J^i = S^i[x(T)] + ∫_0^T F^i(x, u^1, u^2, . . . , u^N, t)dt, (13.11)

denote the objective function which the ith player wants to maximize. In this case, a Nash solution is defined by a set of N admissible trajectories

{u^{1∗}, u^{2∗}, . . . , u^{N∗}}, (13.12)

which have the property that

J^i(u^{1∗}, u^{2∗}, . . . , u^{N∗}) = max_{u^i ∈ U^i} J^i(u^{1∗}, . . . , u^{(i−1)∗}, u^i, u^{(i+1)∗}, . . . , u^{N∗}) (13.13)

for i = 1, 2, . . . , N.

To obtain the necessary conditions for a Nash solution for nonzero-sum differential games, we must make a distinction between open-loop and closed-loop controls.
13.2.1 Open-Loop Nash Solution

The open-loop Nash solution is defined when the set of trajectories in (13.12) is given as functions of time satisfying (13.13). To obtain the maximum principle type conditions for such solutions to be a Nash solution, let us define the Hamiltonian functions

H^i(x, u^1, u^2, . . . , u^N, λ^i) = F^i + λ^i f (13.14)

for i = 1, 2, . . . , N, with λ^i satisfying

λ̇^i = −H^i_x, λ^i(T) = S^i_x[x(T)]. (13.15)

The Nash control u^{i∗} for the ith player is obtained by maximizing the ith Hamiltonian H^i with respect to u^i, i.e., u^{i∗} must satisfy

H^i(x∗, u^{1∗}, . . . , u^{(i−1)∗}, u^{i∗}, u^{(i+1)∗}, . . . , u^{N∗}, λ^i, t) ≥ H^i(x∗, u^{1∗}, . . . , u^{(i−1)∗}, u^i, u^{(i+1)∗}, . . . , u^{N∗}, λ^i, t), t ∈ [0, T], (13.16)

for all u^i ∈ U^i, i = 1, 2, . . . , N.

Deal et al. (1979) formulated and solved an advertising game with two players and obtained the open-loop Nash solution by solving a two-point boundary value problem. In Exercise 13.1, you are asked to obtain their boundary value problem. See also Deal (1979).
13.2.2 Feedback Nash Solution

A feedback Nash solution is obtained when (13.12) is defined in terms of the current state of the system. To avoid confusion, we let

u^{i∗}(x, t) = φ^i(x, t), i = 1, 2, . . . , N. (13.17)

For these controls to represent a feedback Nash strategy, we must recognize the dependence of the other players' actions on the state variable x. Therefore, we need to replace the adjoint equation (13.15) by

λ̇^i = −H^i_x − Σ_{j=1}^N H^i_{u^j} φ^j_x = −H^i_x − Σ_{j=1, j≠i}^N H^i_{u^j} φ^j_x, (13.18)

where the j = i term drops out of the sum because H^i_{u^i} = 0 along the Nash solution.

The presence of the summation term in (13.18) makes the necessary condition for the feedback solution virtually useless for deriving computational algorithms; see Starr and Ho (1969). It is, however, possible to use a dynamic programming approach, which requires the solution of a partial differential equation, for solving extremely simple nonzero-sum games. We will use this approach in Sect. 13.3.
The troublesome summation term in (13.18) is absent in three important cases: (a) in optimal control problems (N = 1), since H_u u_x = 0; (b) in two-person zero-sum games, because H^1 = −H^2, so that H^1_{u^2}u^2_x = −H^2_{u^2}u^2_x = 0 and H^2_{u^1}u^1_x = −H^1_{u^1}u^1_x = 0; and (c) in open-loop nonzero-sum games, because u^j_x = 0. It certainly is to be expected, therefore, that the feedback and open-loop Nash solutions are going to be different, in general. This can be shown explicitly for the linear-quadratic case.
We conclude this section by providing an interpretation of the adjoint variable λ^i. It is the sensitivity of the ith player's profit to a perturbation in the state vector. If the other players are using closed-loop strategies, any perturbation δx in the state vector causes them to revise their controls by the amount φ^j_x δx. If the ith Hamiltonian H^i were maximized with respect to u^j, j ≠ i, this would not affect the ith player's profit; but since ∂H^i/∂u^j ≠ 0 for i ≠ j, the reactions of the other players to the perturbation influence the ith player's profit, and the ith player must account for this effect in considering variations of the trajectory.
13.2.3 An Application to Common-Property Fishery Resources

Consider extending the fishery model of Sect. 10.1 by assuming that there are two producers having unrestricted rights to exploit the fish stock in competition with each other. This gives rise to a nonzero-sum differential game analyzed by Clark (1976).
Equation (10.2) is modified to

ẋ = g(x) − q^1 u^1 x − q^2 u^2 x, x(0) = x_0, (13.19)

where u^i(t) represents the rate of fishing effort and q^i u^i x is the rate of catch for the ith producer, i = 1, 2. The control constraints are

0 ≤ u^i(t) ≤ U^i, i = 1, 2, (13.20)

the state constraint is

x(t) ≥ 0, (13.21)

and the objective function for the ith producer is the total present value of his profits, namely,

J^i = ∫_0^∞ (p^i q^i x − c^i)u^i e^{−ρt} dt, i = 1, 2. (13.22)
To find the feedback Nash solution for this model, we let x̄^i denote the turnpike (or optimal biomass) level given by (10.12) on the assumption that the ith producer is the sole owner of the fishery. Let the bionomic equilibrium x^i_b and the corresponding control u^i_b associated with producer i be defined by (10.4), i.e.,

x^i_b = c^i/(p^i q^i) and u^i_b = g(x^i_b)p^i/c^i. (13.23)

As shown in Exercise 10.2, x^i_b < x̄^i, and we assume U^i to be sufficiently large so that u^i_b ≤ U^i. We also assume that

x^1_b < x^2_b, (13.24)
which means that producer 1 is more efficient than producer 2, i.e., producer 1 can make a positive profit at any stock level in the interval (x^1_b, x^2_b], while producer 2 loses money in the same interval, except at x^2_b, where he breaks even. For x > x^2_b, both producers make positive profits.

Since U^1 ≥ u^1_b by assumption, producer 1 has the capability of driving the fish stock level down to at least x^1_b which, by (13.24), is less than x^2_b. This implies that producer 2 cannot operate at a sustained level above x^2_b; and at a sustained level below x^2_b, he cannot make a profit. Hence, his optimal feedback policy is bang-bang:

u^{2∗}(x) = U^2 if x > x^2_b, and u^{2∗}(x) = 0 if x ≤ x^2_b. (13.25)
As far as producer 1 is concerned, he wants to attain his turnpike level x̄^1 if x̄^1 ≤ x^2_b. If x̄^1 > x^2_b and x_0 ≥ x̄^1, then from (13.25) producer 2 will fish at his maximum rate until the fish stock is driven down to x^2_b. At this level, it is optimal for producer 1 to fish at a rate which maintains the fish stock at level x^2_b in order to keep producer 2 from fishing. Thus, the optimal feedback policy for producer 1 can be stated as follows.

If x̄^1 < x^2_b:

u^{1∗}(x) = U^1 if x > x̄^1; ū^1 = g(x̄^1)/(q^1 x̄^1) if x = x̄^1; 0 if x < x̄^1. (13.26)

If x̄^1 ≥ x^2_b:

u^{1∗}(x) = U^1 if x > x^2_b; g(x^2_b)/(q^1 x^2_b) if x = x^2_b; 0 if x < x^2_b. (13.27)
The formal proof that policies (13.25)–(13.27) give a Nash solution requires direct verification using the result of Sect. 10.1.2. The Nash solution for this case means that for all feasible paths u^1 and u^2,

J^1(u^{1∗}, u^{2∗}) ≥ J^1(u^1, u^{2∗}), (13.28)

and

J^2(u^{1∗}, u^{2∗}) ≥ J^2(u^{1∗}, u^2). (13.29)

The direct verification involves defining a modified growth function

g^1(x) = g(x) − q^2 U^2 x if x > x^2_b, and g^1(x) = g(x) if x ≤ x^2_b,

and using the Green's theorem results of Sect. 10.1.2. Since U^2 ≥ u^2_b by assumption, we have g^1(x) ≤ 0 for x > x^2_b. From (10.12) with g replaced by g^1, it can be shown that the new turnpike level for producer 1 is min(x̄^1, x^2_b), which defines the optimal policy (13.26)–(13.27) for producer 1. The optimality of (13.25) for producer 2 follows easily.
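The policies (13.23) and (13.25)–(13.27) are easily coded. The following minimal Python sketch is offered as an illustration only: the growth function g and the turnpike level x̄^1 are user-supplied inputs, since the turnpike formula (10.12) is not restated here.

def bionomic(c, p, q, g):
    # Bionomic equilibrium (13.23): x_b = c/(p*q) and u_b = g(x_b)*p/c
    xb = c/(p*q)
    return xb, g(xb)*p/c

def u2_star(x, x2b, U2):
    # Producer 2's bang-bang policy (13.25)
    return U2 if x > x2b else 0.0

def u1_star(x, xbar1, x2b, U1, q1, g):
    # Producer 1's policy (13.26)-(13.27); the effective target stock
    # is min(xbar1, x2b), as shown via the modified growth function g1
    target = min(xbar1, x2b)
    if x > target:
        return U1
    if x == target:
        return g(target)/(q1*target)  # effort holding the stock at the target
    return 0.0

For instance, with logistic growth g(x) = x(1 − x), bionomic(0.1, 1.0, 1.0, g) returns x_b = 0.1 and u_b = 0.9.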
To interpret the results of the model, suppose that producer 1 orig-
inally has sole possession of the fishery, but anticipates a rival entry.
Producer 1 will switch from his own optimal sustained yield ū1 to a
more intensive exploitation policy prior to the anticipated entry.
We can now guess the results in situations involving N producers.
The fishery will see the progressive elimination of inefficient producers
as the stock of fish decreases. Only the most efficient producers will
survive. If, ultimately, two or more maximally efficient producers exist,
the fishery will converge to a classical bionomic equilibrium, with zero
sustained economic rent.
We have now seen that a feedback Nash solution involving N ≥ 2
competing producers results in the long-run erosion of economic rents.
This conclusion depends on the assumption that producers face an in-
finitely elastic supply of all factors of production going into the fishing
effort, but typically the methods of licensing entrants to regulated fisheries make some attempt to also control the factors of production, such
as permitting the licensee to operate only a single vessel of specific size.
In order to develop a model for the licensing of fishermen, we let the
control variable v i denote the capital stock of the ith producer and let
the concave function f (v i ), with f (0) = 0, denote the fishing mortality
function for i = 1, 2, . . . , N. This requires the replacement of q i ui in
the previous model by f (v i ). The extended model becomes nonlinear in
control variables. You are asked in Exercise 13.3 to formulate this new
model and develop necessary conditions for a feedback Nash solution for
this game involving N producers. The reader is referred to Clark (1976)
for further details. For other papers on applications of differential games
to fishery management, see Hämäläinen et al. (1984, 1985, 1986, 1990).
13.3 A Feedback Nash Stochastic Differential Game in Advertising

In this section, we will study a competitive extension of the Sethi advertising model discussed in Sect. 12.3. This will give us a stochastic differential game, for which we aim to obtain a feedback Nash equilibrium by using the dynamic programming approach developed in Sect. 12.1. We should note that this approach can also be used to obtain feedback Nash equilibria in deterministic differential games, as an alternative to the maximum principle approach developed in Sect. 13.2.2.
Specifically, we consider a duopoly market in a mature product cate-
gory where total sales are distributed between two firms, labeled as Firm
1 and Firm 2, which compete for market share through advertising ex-
penditures. We let Xt denote the market share of Firm 1 at time t, so
that the market share of Firm 2 is (1 − Xt ). Let U1t and U2t denote the
advertising effort rates of Firms 1 and 2, respectively, at time t. Using
the subscript i ∈ {1, 2} to reference the two firms, let ri > 0 denote
the advertising effectiveness parameter, π i > 0 denote the sales margin,
ρi > 0 denote the discount rate, and ci > 0 denote the cost parame-
ter so that the cost of advertising effort u by Firm i is ci u2 . Further,
let δ > 0 be the churn parameter, Zt be the standard one-dimensional
Wiener process, and σ(x) be the diffusion coefficient function as defined
in Sect. 12.3. Then, in view of the competition between the firms, Prasad
and Sethi (2004) extend the Sethi model dynamics in (12.42) as the Itô
stochastic differential equation

dX_t = [r_1 U_{1t}√(1 − X_t) − δX_t − r_2 U_{2t}√(X_t) + δ(1 − X_t)]dt + σ(X_t)dZ_t, X(0) = x_0 ∈ [0, 1]. (13.30)
We formulate the optimal control problem faced by the two firms as

max_{U_1≥0} V^1(x_0) = E[∫_0^∞ e^{−ρ_1 t}(π_1 X_t − c_1 U_{1t}^2)dt], (13.31)

max_{U_2≥0} V^2(x_0) = E[∫_0^∞ e^{−ρ_2 t}(π_2(1 − X_t) − c_2 U_{2t}^2)dt], (13.32)

subject to (13.30). Thus, each firm seeks to maximize its expected, discounted profit stream subject to the market share dynamics.
To find the feedback Nash equilibrium solution, we form the Hamilton-Jacobi-Bellman (HJB) equations for the value functions V^1(x) and V^2(x):

ρ_1 V^1 = max_{U_1≥0} {H^1(x, U_1, U_2, V^1_x) + (σ(x))^2 V^1_xx/2}
        = max_{U_1≥0} {π_1 x − c_1 U_1^2 + V^1_x[r_1 U_1√(1 − x) − r_2 U_2√x − δ(2x − 1)] + (σ(x))^2 V^1_xx/2}, (13.33)

ρ_2 V^2 = max_{U_2≥0} {H^2(x, U_1, U_2, V^2_x) + (σ(x))^2 V^2_xx/2}
        = max_{U_2≥0} {π_2(1 − x) − c_2 U_2^2 + V^2_x[r_1 U_1√(1 − x) − r_2 U_2√x − δ(2x − 1)] + (σ(x))^2 V^2_xx/2}, (13.34)
where the Hamiltonians are as defined in (13.14). We use the first-order conditions for Hamiltonian maximization to obtain the optimal feedback advertising decisions

U_1^∗(x) = V^1_x(x) r_1√(1 − x)/2c_1 and U_2^∗(x) = −V^2_x(x) r_2√x/2c_2. (13.35)

Since it is reasonable to expect that V^1_x ≥ 0 and V^2_x ≤ 0, these controls will turn out to be nonnegative, as we will see later.
Substituting (13.35) in (13.33) and (13.34), we obtain the Hamilton-Jacobi equations

ρ_1 V^1 = π_1 x + (V^1_x)^2 r_1^2(1 − x)/4c_1 + V^1_x V^2_x r_2^2 x/2c_2 − V^1_x δ(2x − 1) + (σ(x))^2 V^1_xx/2, (13.36)

ρ_2 V^2 = π_2(1 − x) + (V^2_x)^2 r_2^2 x/4c_2 + V^1_x V^2_x r_1^2(1 − x)/2c_1 − V^2_x δ(2x − 1) + (σ(x))^2 V^2_xx/2. (13.37)
As in Sect. 12.3, we look for the following forms for the value functions

V^1 = α_1 + β_1 x and V^2 = α_2 + β_2(1 − x). (13.38)

These are inserted into (13.36) and (13.37) to determine the unknown coefficients α_1, β_1, α_2, and β_2. Equating the coefficients of x and the constants on both sides of (13.36), and the coefficients of (1 − x) and the constants on both sides of (13.37), the following four equations emerge, which can be solved for the unknowns α_1, β_1, α_2, and β_2:

ρ_1 α_1 = β_1^2 r_1^2/4c_1 + β_1 δ, (13.39)

ρ_1 β_1 = π_1 − β_1^2 r_1^2/4c_1 − β_1 β_2 r_2^2/2c_2 − 2β_1 δ, (13.40)

ρ_2 α_2 = β_2^2 r_2^2/4c_2 + β_2 δ, (13.41)

ρ_2 β_2 = π_2 − β_2^2 r_2^2/4c_2 − β_1 β_2 r_1^2/2c_1 − 2β_2 δ. (13.42)
Let us first consider the special case of symmetric firms, i.e., when π = π_1 = π_2, c = c_1 = c_2, r = r_1 = r_2, and ρ = ρ_1 = ρ_2, and therefore α = α_1 = α_2 and β = β_1 = β_2. The four equations (13.39)–(13.42) reduce to the following two:

ρα = β^2 r^2/4c + βδ and ρβ = π − 3β^2 r^2/4c − 2βδ. (13.43)

There are two solutions for β. One is negative, which clearly makes no sense. Thus, the remaining positive solution is the correct one. This also allows us to obtain the corresponding α. The solution is

α = [(ρ − δ)(W − √(W^2 + 12Rπ)) + 6Rπ]/18Rρ, (13.44)

β = (√(W^2 + 12Rπ) − W)/6R, (13.45)

where R = r^2/4c and W = ρ + 2δ. With this, the value functions in (13.38) are defined, and the controls in (13.35) for the case of symmetric firms can be written as

u_1^∗(x) = β_1 r_1√(1 − x)/2c_1 = βr√(1 − x)/2c and u_2^∗(x) = β_2 r_2√x/2c_2 = βr√x/2c,

which are clearly nonnegative as required.
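A quick numerical check of (13.43)–(13.45) takes only a few lines of Python; the parameter values below (π = 1, c = 0.25, r = 1, ρ = 0.05, δ = 0.01) are assumptions chosen only for the check.

import math

pi, c, r, rho, delta = 1.0, 0.25, 1.0, 0.05, 0.01   # illustrative values
R, W = r**2/(4*c), rho + 2*delta

beta = (math.sqrt(W**2 + 12*R*pi) - W)/(6*R)        # (13.45)
alpha = ((rho - delta)*(W - math.sqrt(W**2 + 12*R*pi)) + 6*R*pi)/(18*R*rho)  # (13.44)

# Both equations in (13.43) hold to machine precision
assert abs(rho*alpha - (R*beta**2 + beta*delta)) < 1e-12
assert abs(rho*beta - (pi - 3*R*beta**2 - 2*beta*delta)) < 1e-12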
We return now to the general case of asymmetric firms. For this, we re-express equations (13.39)–(13.42) in terms of a single variable β_1, which is determined by solving the quartic equation

3R_1^2 β_1^4 + 2R_1(W_1 + W_2)β_1^3 + (4R_2 π_2 − 2R_1 π_1 − W_1^2 + 2W_1 W_2)β_1^2 + 2π_1(W_1 − W_2)β_1 − π_1^2 = 0. (13.46)

This equation can be solved explicitly to give four roots. We will find that only one of these is positive, and select it as our value of β_1. With that, the other coefficients can be obtained by solving for α_1 and β_2 and then, in turn, α_2, as follows:

α_1 = β_1(β_1 R_1 + δ)/ρ_1, (13.47)

β_2 = (π_1 − β_1^2 R_1 − β_1 W_1)/2β_1 R_2, (13.48)

α_2 = β_2(β_2 R_2 + δ)/ρ_2, (13.49)

where R_1 = r_1^2/4c_1, R_2 = r_2^2/4c_2, W_1 = ρ_1 + 2δ, and W_2 = ρ_2 + 2δ.
It is worthwhile to mention that firm i's advertising effectiveness parameter r_i and advertising cost parameter c_i manifest themselves only through R_i = r_i^2/4c_i. This suggests that R_i is a measure of firm i's advertising power. This can be seen more clearly in Exercise 13.6, which involves two firms that are identical in all aspects except that R_2 > R_1. Specifically, in that exercise, you are asked to use Mathematica or another suitable software program to solve (13.46) to obtain β_1, and then obtain the coefficients α_1, α_2, and β_2 by using (13.47)–(13.49), when ρ_1 = ρ_2 = 0.05, π_1 = π_2 = 1, δ = 0.01, R_1 = 1, R_2 = 4, x_0 = 0.5, and σ(x) = √(0.5x(1 − x)). Figure 13.1 shows a sample path of the market shares of the two firms with this data.

It is noteworthy that both firms are identical except in their advertising powers R_1 and R_2. With R_2 > R_1, firm 2 is more powerful, and we see that this results in its capturing, on average over time, an increasing share of the market, beginning with exactly one half of the market at time 0. A numerical sketch of this computation follows.
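The following minimal Python sketch (NumPy; an alternative to the Mathematica route suggested above) solves the quartic (13.46) for the Exercise 13.6 data and recovers the remaining coefficients via (13.47)–(13.49).

import numpy as np

rho1 = rho2 = 0.05
pi1 = pi2 = 1.0
delta, R1, R2 = 0.01, 1.0, 4.0
W1, W2 = rho1 + 2*delta, rho2 + 2*delta

# Quartic (13.46) in beta_1, highest-degree coefficient first
coeffs = [3*R1**2,
          2*R1*(W1 + W2),
          4*R2*pi2 - 2*R1*pi1 - W1**2 + 2*W1*W2,
          2*pi1*(W1 - W2),
          -pi1**2]
roots = np.roots(coeffs)
real_roots = roots[np.isreal(roots)].real
beta1 = float(real_roots[real_roots > 0][0])          # approximately 0.264545

alpha1 = beta1*(beta1*R1 + delta)/rho1                # (13.47)
beta2 = (pi1 - beta1**2*R1 - beta1*W1)/(2*beta1*R2)   # (13.48)
alpha2 = beta2*(beta2*R2 + delta)/rho2                # (13.49)
print(beta1, alpha1, beta2, alpha2)

A sample path such as the one in Fig. 13.1 can then be generated by simulating (13.30) with the feedback controls (13.35), e.g., by the Euler-Maruyama method.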
13.4 A Feedback Stackelberg Stochastic Differential Game of Cooperative Advertising

The preceding sections in this chapter dealt with differential games in which all players make their decisions simultaneously. We now discuss
Figure 13.1: A sample path of optimal market share trajectories
a differential game in which two players make their decisions in a hierarchical manner. The player having the right to move first is called the leader, and the other player is called the follower. If there are two or more leaders, they play Nash, and the same goes for the followers.
In terms of solutions of Stackelberg differential games, we have open-
loop and feedback solutions. An open-loop Stackelberg equilibrium spec-
ifies, at the initial time (say, t = 0), the decisions over the entire horizon.
As in Sect. 13.1, there is a maximum principle for open-loop solutions.
Typically, open-loop solutions are not time consistent in the sense that
at any time t > 0, the remaining decision may no longer be optimal; see
Exercise 13.2. A feedback or Markovian Stackelberg equilibrium, on the
other hand, consists of decisions expressed as functions of the current
state and time. Such a solution is time consistent.
In this section, we will not develop the general theory, for which
we refer the reader to Basar and Olsder (1999), Dockner et al. (2000),
and Bensoussan et al. (2014, 2015a, 2018). Instead, we will formulate a
Stackelberg differential game of cooperative advertising between a manu-
facturer as the leader and a retailer as the follower, and obtain a feedback
Stackelberg solution. This formulation is due to He et al. (2009). A veri-
fication theorem that applies to this problem can be found in Bensoussan
et al. (2018).
The manufacturer sells a product to end users through the retailer.


The product is in a mature category where sales, expressed as a fraction of the potential market, are influenced by advertising expenditures.
The manufacturer as the leader decides on an advertising support scheme
via a subsidy rate, i.e., he will contribute a certain percentage of the
advertising expenditure by the retailer. Specifically, the manufacturer
decides on a subsidy rate Wt , 0 ≤ Wt ≤ 1, and the retailer as the
follower decides on the advertising effort level Ut ≥ 0, t ≥ 0.
As in Sect. 12.3, the cost of advertising is quadratic in the advertising effort U_t. Then, with the advertising effort U_t and the subsidy rate W_t, the manufacturer's and the retailer's advertising expenditures are W_t U_t^2 and (1 − W_t)U_t^2, respectively. The market share dynamics is given by the Sethi model

dX_t = (rU_t√(1 − X_t) − δX_t)dt + σ(X_t)dZ_t, X_0 = x_0. (13.50)
The corresponding expected profits of the retailer and the manufacturer are, respectively, as follows:

J_R = E[∫_0^∞ e^{−ρt}(πX_t − (1 − W_t)U_t^2)dt], (13.51)

J_M = E[∫_0^∞ e^{−ρt}(π_M X_t − W_t U_t^2)dt]. (13.52)
A solution of this Stackelberg differential game depends on the avail-
able information structure. We shall assume that at each time t, both
players know the current system state and the follower knows the action
of the leader. The concept of equilibrium that applies in this case is
that of feedback Stackelberg equilibrium. For this and other information
structures and equilibrium concepts, see Bensoussan et al. (2015a).
Next, we define the rules governing the sequence of actions by which this game will be played over time. To be specific, the sequence of
plays at any time t ≥ 0 is as follows. First, the manufacturer observes
the market share Xt at time t and selects the subsidy rate Wt . Then,
the retailer observes this action Wt and, knowing also the market share
Xt at time t, sets the advertising effort rate Ut as his response to Wt .
The system evolves over time as this game is played in continuous time
beginning at time t = 0. One could visualize this game as being played
at times 0, δt, 2δt, . . . , and then let δt → 0.
Next, we will address the question of how players choose their actions
at any given t. Specifically, we are interested in deriving an equilibrium
menu W (x) for the leader representing his decision when the state is x
at time t, and a menu U(x, W) for the follower representing his decision when he observes the leader's decision to be W in addition to the state x at time t. For this, let us first define a feedback Stackelberg equilibrium, and then develop a procedure to obtain it.

We begin by specifying the admissible strategy spaces for the manufacturer and the retailer, respectively:

W = {W | W: [0, 1] → [0, 1] and W(x) is Lipschitz continuous in x},
U = {U | U: [0, 1] × [0, 1] → [0, ∞) and U(x, W) is Lipschitz continuous in (x, W)}.

For a pair of strategies (W, U) ∈ W × U, let Y_s, s ≥ t, denote the solution of the state equation

dY_s = (rU(Y_s, W(Y_s))√(1 − Y_s) − δY_s)ds + σ(Y_s)dZ_s, Y_t = x. (13.53)
We should note that Y_s here stands for Y_s(t, x; W, U), as the solution depends on the specified arguments. Then J^{t,x}_M(W(·), U(·, W(·))) and J^{t,x}_R(W(·), U(·, W(·))), representing the current-value profits of the manufacturer and the retailer at time t, are, respectively,

J^{t,x}_M(W(·), U(·, W(·))) = E[∫_t^∞ e^{−ρ(s−t)}[π_M Y_s − W(Y_s){U(Y_s, W(Y_s))}^2]ds], (13.54)

J^{t,x}_R(W(·), U(·, W(·))) = E[∫_t^∞ e^{−ρ(s−t)}[πY_s − (1 − W(Y_s)){U(Y_s, W(Y_s))}^2]ds], (13.55)

where we should stress that W(·) and U(·, W(·)) evaluated at any state ζ are W(ζ) and U(ζ, W(ζ)). We can now define our equilibrium concept.
A pair of strategies (W^∗, U^∗) ∈ W × U is called a feedback Stackelberg equilibrium if

J^{t,x}_M(W^∗(·), U^∗(·, W^∗(·))) ≥ J^{t,x}_M(W(·), U^∗(·, W(·))), W ∈ W, x ∈ [0, 1], t ≥ 0, (13.56)

and

J^{t,x}_R(W^∗(·), U^∗(·, W^∗(·))) ≥ J^{t,x}_R(W^∗(·), U(·, W^∗(·))), U ∈ U, x ∈ [0, 1], t ≥ 0. (13.57)
It has been shown in Bensoussan et al. (2014) that this equilibrium is obtained by solving a pair of Hamilton-Jacobi-Bellman equations in which a static Stackelberg game is played at the Hamiltonian level at each t, and where

H^M(x, W, U, λ^M) = π_M x − WU^2 + λ^M(rU√(1 − x) − δx) (13.58)

H^R(x, W, U, λ^R) = πx − (1 − W)U^2 + λ^R(rU√(1 − x) − δx) (13.59)

are the Hamiltonians for the manufacturer and the retailer, respectively. To solve this Hamiltonian-level game, we first maximize H^R with respect to U in terms of x and W. The first-order condition gives

U^∗(x, W) = λ^R r√(1 − x)/(2(1 − W)), (13.60)
as the optimal response of the follower to any decision W by the leader. We then substitute this for U in H^M to obtain

H^M(x, W, U^∗(x, W), λ^M) = π_M x − W(λ^R r)^2(1 − x)/(4(1 − W)^2) + λ^M[λ^R r^2(1 − x)/(2(1 − W)) − δx]. (13.61)
The first-order condition for maximizing H^M with respect to W gives us

W(x) = (2λ^M − λ^R)/(2λ^M + λ^R). (13.62)

Clearly, W(x) ≥ 1 makes no intuitive sense because it would induce the retailer to spend an infinite amount on advertising, and that would not be optimal for the leader. Moreover, λ^M and λ^R, the marginal valuations of the market share by the leader and the follower, respectively, are expected to be positive, and therefore it follows from (13.62) that W(x) < 1. Thus, we set

W^∗(x) = max{0, (2λ^M − λ^R)/(2λ^M + λ^R)}. (13.63)
We can now write the HJB equations as

ρV^R = H^R(x, W^∗(x), U^∗(x, W^∗(x)), V^R_x) + (σ(x))^2 V^R_xx/2
     = πx + (V^R_x r)^2(1 − x)/(4(1 − W^∗(x))) − V^R_x δx + (σ(x))^2 V^R_xx/2, (13.64)
ρV^M = H^M(x, W^∗(x), U^∗(x, W^∗(x)), V^M_x) + (σ(x))^2 V^M_xx/2
     = π_M x − (V^R_x r)^2(1 − x)W^∗(x)/(4(1 − W^∗(x))^2) + V^R_x V^M_x r^2(1 − x)/(2(1 − W^∗(x))) − V^M_x δx + (σ(x))^2 V^M_xx/2. (13.65)
The solution of these equations will yield the value functions V^M(x) and V^R(x). With these in hand, we can give the equilibrium menus of actions to the manufacturer and the retailer to guide their decisions at each t. These menus are

W^∗(x) = max{0, (2V^M_x − V^R_x)/(2V^M_x + V^R_x)} and U^∗(x, W) = V^R_x r√(1 − x)/(2(1 − W)). (13.66)
To solve for the value functions, we next investigate the two cases where the subsidy rate is (a) zero and (b) positive, and determine the condition required for no subsidy to be optimal.

Case (a): No Co-op Advertising (W^∗ = 0). Inserting W^∗(x) = 0 into (13.66) gives

U^∗(x, 0) = rV^R_x√(1 − x)/2. (13.67)
Inserting W^∗(x) = 0 into (13.65) and (13.64), we have

ρV^M = π_M x + V^M_x V^R_x r^2(1 − x)/2 − V^M_x δx + (σ(x))^2 V^M_xx/2, (13.68)

ρV^R = πx + (V^R_x)^2 r^2(1 − x)/4 − V^R_x δx + (σ(x))^2 V^R_xx/2. (13.69)
Let V^M(x) = α_M + β_M x and V^R(x) = α + βx. Then, V^M_x = β_M and V^R_x = β. Substituting these into (13.68) and (13.69) and equating like powers of x, we can express all of the unknowns in terms of β, which itself can be explicitly solved for. That is, we obtain

β = 2π/(√((ρ + δ)^2 + r^2 π) + (ρ + δ)), β_M = 2π_M/(2(ρ + δ) + βr^2), (13.70)

α = β^2 r^2/4ρ, α_M = ββ_M r^2/2ρ. (13.71)
Using (13.71) in (13.67), we can write U^∗(x) = √(ρα(1 − x)). Finally, we can derive the required condition from the right-hand side of W^∗(x) in (13.66), which is 2V^M_x ≤ V^R_x, for no co-op advertising (W^∗ = 0) in the equilibrium. This is given by 2β_M ≤ β, or

4π_M/[2(ρ + δ) + 2πr^2/(√((ρ + δ)^2 + r^2 π) + (ρ + δ))] ≤ 2π/(√((ρ + δ)^2 + r^2 π) + (ρ + δ)). (13.72)
After a few steps of algebra, this yields the required condition

θ := π_M/√((ρ + δ)^2 + r^2 π) − π/(√((ρ + δ)^2 + r^2 π) + (ρ + δ)) ≤ 0. (13.73)

Next, we obtain the solution when θ > 0.
Case (b): Co-op Advertising (W^∗ > 0). Then, W^∗(x) in (13.66) reduces to

W^∗(x) = (2V^M_x − V^R_x)/(2V^M_x + V^R_x). (13.74)
Inserting this W^∗(x) into (13.65) and (13.64), we have

ρV^M = π_M x − r^2(1 − x)[4(V^M_x)^2 − (V^R_x)^2]/16 + V^M_x r^2(1 − x)[2V^M_x + V^R_x]/4 − V^M_x δx + (σ(x))^2 V^M_xx/2, (13.75)

ρV^R = πx + [(V^R_x)^2 r^2(1 − x)/4]·[(2V^M_x + V^R_x)/(2V^R_x)] − V^R_x δx + (σ(x))^2 V^R_xx/2. (13.76)
Once again, V^M(x) = α_M + β_M x, V^R = α + βx, V^M_x = β_M, V^R_x = β. Substituting these into (13.75) and (13.76) and equating like powers of x, we have

α = β(β + 2β_M)r^2/8ρ, (13.77)

(ρ + δ)β = π − β(β + 2β_M)r^2/8, (13.78)

α_M = (β + 2β_M)^2 r^2/16ρ, (13.79)

(ρ + δ)β_M = π_M − (β + 2β_M)^2 r^2/16. (13.80)
Using (13.66), (13.74), and (13.79), we can write U^∗(x, W^∗(x)), with a slight abuse of notation, as

U^∗(x) = r(V^R_x + 2V^M_x)√(1 − x)/4 = √(ρα_M(1 − x)). (13.81)
The four equations (13.77)–(13.80) determine the solutions for the four unknowns α, β, α_M, and β_M. From (13.78) and (13.80), we can obtain

β^3 + [2π_M/(ρ + δ)]β^2 + [8π/r^2]β − 8π^2/[(ρ + δ)r^2] = 0. (13.82)

If we denote

a_1 = 2π_M/(ρ + δ), a_2 = 8π/r^2, and a_3 = −8π^2/[(ρ + δ)r^2],
then a_1 > 0, a_2 > 0, and a_3 < 0. From Descartes' rule of signs, there exists a unique positive real root. The two remaining roots may be both imaginary or both real and negative. Since this is a cubic equation, a complete solution can be obtained. Using Mathematica or following Spiegel et al. (2008), we can write down the three roots as

β(1) = S + T − (1/3)a_1,
β(2) = −(1/2)(S + T) − (1/3)a_1 + (√3/2)i(S − T),
β(3) = −(1/2)(S + T) − (1/3)a_1 − (√3/2)i(S − T),

with

S = ∛(R + √(Q^3 + R^2)), T = ∛(R − √(Q^3 + R^2)), i = √−1,

where

Q = (3a_2 − a_1^2)/9, R = (9a_1 a_2 − 27a_3 − 2a_1^3)/54.
Next, we identify the positive root in each of the following three cases:

Case 1 (Q > 0): We have S > 0 > T and Q^3 + R^2 > 0. There is one positive root and two imaginary roots. The positive root is β = S + T − (1/3)a_1.
Table 13.1: Optimal feedback Stackelberg solution

                            (a) if θ ≤ 0:                           (b) if θ > 0:
                            No co-op equilibrium                    Co-op equilibrium

Retailer's profit V^R       V^R(x) = α + βx                         V^R(x) = α + βx

Manufacturer's profit V^M   V^M(x) = α_M + β_M x                    V^M(x) = α_M + β_M x

Coefficients of profit      β = 2π/(√((ρ+δ)^2 + r^2π) + (ρ+δ))      β = π/(ρ+δ) − β(β+2β_M)r^2/(8(ρ+δ))
functions α, β, α_M, β_M    β_M = 2π_M/(2(ρ+δ) + βr^2)              β_M = π_M/(ρ+δ) − (β+2β_M)^2 r^2/(16(ρ+δ))
obtained from:              α = β^2 r^2/4ρ                          α = β(β+2β_M)r^2/8ρ
                            α_M = ββ_M r^2/2ρ                       α_M = (β+2β_M)^2 r^2/16ρ

Subsidy rate                W^∗(x) = 0                              W^∗ = (2β_M − β)/(2β_M + β) = 1 − α/α_M

Advertising effort          U^∗(x) = rβ√(1−x)/2 = √(ρα(1−x))        U^∗(x) = r(β+2β_M)√(1−x)/4 = √(ρα_M(1−x))
Case 2 (Q < 0 and Q^3 + R^2 > 0): There are three real roots with one positive root, which is β = S + T − (1/3)a_1.

Case 3 (Q < 0 and Q^3 + R^2 < 0): S and T are both imaginary. We have three real roots with one positive root. While subcases can be given to identify the positive root, for our purposes it is enough to identify it numerically.
Finally, we can conclude that 2β_M − β > 0, so that W^∗ > 0, since if this were not the case, then W^∗ would be zero, and we would once again be in Case (a).

We can now summarize the optimal feedback Stackelberg equilibrium in Table 13.1. In Exercises 13.7–13.10, you are asked to further explore the model of this section when the parameters are π = 0.25, π_M = 0.5, r = 2, ρ = 0.05, δ = 1, and σ(x) = 0.25√(x(1 − x)). For this case, He et al. (2009) obtain the comparative statics shown in Fig. 13.2; a numerical sketch for computing the Case (b) coefficients with these data follows the figure.
Figure 13.2: Optimal subsidy rate vs. (a) retailer's margin and (b) manufacturer's margin
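The positive root of the cubic (13.82) is likewise easy to obtain numerically. The following minimal Python sketch, a direct transcription of (13.77)–(13.80), (13.82), and the subsidy-rate formula in Table 13.1, uses the exercise data above; per Exercise 13.7, W^∗ should come out to about 0.58.

import numpy as np

pi, piM, r, rho, delta = 0.25, 0.5, 2.0, 0.05, 1.0   # Exercises 13.7-13.10

# Cubic (13.82): beta^3 + a1*beta^2 + a2*beta + a3 = 0
a1 = 2*piM/(rho + delta)
a2 = 8*pi/r**2
a3 = -8*pi**2/((rho + delta)*r**2)
roots = np.roots([1.0, a1, a2, a3])
real_roots = roots[np.isreal(roots)].real
beta = float(real_roots[real_roots > 0][0])          # unique positive root

s = 8*(pi - (rho + delta)*beta)/(beta*r**2)          # s = beta + 2*beta_M, from (13.78)
betaM = (s - beta)/2
alpha = beta*s*r**2/(8*rho)                          # (13.77)
alphaM = s**2*r**2/(16*rho)                          # (13.79)

W_star = 1 - alpha/alphaM                            # subsidy rate (Table 13.1)
print(round(beta, 4), round(betaM, 4), round(W_star, 2))   # W_star is about 0.58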
There have been many applications of differential games in marketing in general and optimal advertising in particular. Some refer-
ences are Bensoussan et al. (1978), Deal et al. (1979), Deal (1979),
Jørgensen (1982a), Rao (1984, 1990), Dockner and Jørgensen (1986,
1992), Chintagunta and Vilcassim (1992), Chintagunta and Jain (1994,
1995), Fruchter (1999a), Jarrar et al. (2004), Martín-Herrán et al. (2005),
Breton et al. (2006), Jørgensen and Zaccour (2007), He and Sethi (2008),
Naik et al. (2008), Zaccour (2008a), Jørgensen et al. (2009), Prasad and
Sethi (2009). The literature on advertising differential games is surveyed
by Jørgensen (1982a) and the literature on management applications
of Stackelberg differential games is reviewed by He et al. (2007). Mono-
graphs are written by Erickson (2003) and Jørgensen and Zaccour (2004).
For applications of differential games to economics and management sci-
ence in general, see the book by Dockner et al. (2000).
Exercises for Chapter 13

E 13.1 A Bilinear Quadratic Advertising Model (Deal et al. 1979). Let x_i be the market share of firm i and u_i be its advertising rate, i = 1, 2.
The state equations are

ẋ_1 = b_1 u_1(1 − x_1 − x_2) + e_1(u_1 − u_2)(x_1 + x_2) − a_1 x_1, x_1(0) = x_10,

ẋ_2 = b_2 u_2(1 − x_1 − x_2) + e_2(u_2 − u_1)(x_1 + x_2) − a_2 x_2, x_2(0) = x_20,

where b_i, e_i, and a_i are given positive constants. Firm i wants to maximize

J_i = w_i e^{−ρT} x_i(T) + ∫_0^T (c_i x_i − u_i^2)e^{−ρt} dt,

where w_i, c_i, and ρ are positive constants. Derive the necessary conditions for the open-loop Nash solution, and formulate the resulting boundary value problem. In a related paper, Deal (1979) provides a numerical solution to this problem with e_1 = e_2 = 0.
E 13.2 Let x(t) denote the stock of pollution at time t ∈ [0, T ] that
affects the welfare of two countries, one of which is the leader and the
other the follower. The state dynamics is
ẋ = u + v, x(0) = x0 ,
where u and v are emission rates of the leader and the follower, respec-
tively. Let their instantaneous utility functions be
u − (u2 + x2 )/2 and v − (v 2 + x2 )/2,
respectively. Obtain the open-loop Stackelberg solution. By re-solving
this problem at time τ , 0 < τ < T, show that the first solution obtained
is time inconsistent.
Hint: Apply the maximum principle first to the follower's problem for any given leader's decision u. Let λ^F denote the adjoint variable associated with the state x; clearly λ^F(T) = 0. Then apply the maximum principle to the leader's problem, treating the follower's adjoint equation as a "state" equation in addition to the state equation for x. Let the adjoint variables associated with x and λ^F be λ^L and μ, respectively. Clearly λ^L(T) = 0. However, the transversality condition for μ will be μ(0) = 0 in view of Remark 3.9. See Basar and Olsder (1999) and Dockner et al. (2000) for further details.
E 13.3 Develop the nonlinear model for the licensing of fishermen described toward the end of Sect. 13.2.3 by rewriting (13.19) and (13.22) for the model. Derive the adjoint equation for λ^i for the ith producer, and show that the feedback Nash policy for producer i is given by

f′(v^{i∗}) = c^i/[(p^i − λ^i)x].
E 13.4 Consider an N-firm oligopoly. Let S_i(t) denote the cumulative sales by time t of firm i ∈ {1, 2, . . . , N} and define S(t) = Σ_{i=1}^N S_i(t). Let A_i(t) denote firm i's advertising rate. With positive constants a, b, and d, assume that the differential game has the diffusion dynamics

Ṡ_i(t) = [a + b log A_i(t) + dS(t)][M − S(t)], S_i(0) = S_{i0} ≥ 0,

which means that a firm can stimulate its sales through advertising (but subject to decreasing returns) and that demand learning effects (imitation) are industry-wide. (If these effects were firm-specific, we would have S_i instead of S in the brackets on the right-hand side of the dynamics.) Payoffs are given by

J_i = ∫_0^T [(p_i − c_i)Ṡ_i(t) − A_i(t)]dt,

in which prices and unit costs are constant. Since Ṡ_i(t) in the expression for J_i is stated in terms of the state variable S(t) and the control variables A_i(t), i ∈ {1, 2, . . . , N}, formulate the differential game problem with S(t) as the state variable. In the open-loop Nash equilibrium, show that the advertising rates are monotonically decreasing over time.

Hint: Assume ∂^2 H^i/∂S^2 ≤ 0 so that H^i is concave in S. Use this condition to prove the monotone property.
E 13.5 Solve (13.43) to obtain the solution for α and β given in (13.44) and (13.45).
E 13.6 Use Mathematica or another suitable software program to solve the quartic equation (13.46). Show that for ρ_1 = ρ_2 = 0.05, π_1 = π_2 = 1, δ = 0.01, R_1 = 1, and R_2 = 4, the only positive solution for β_1 is 0.264545. Figure 13.1 gives a sample path of the optimal market shares of the two firms for this problem. Draw another sample path.
E 13.7 In the Stackelberg differential game of Sect. 13.4, let π = 0.25, π_M = 0.5, r = 2, ρ = 0.05, and δ = 1. Obtain the coefficients α, β, α_M, β_M, and show that W^∗ = 0.58. Graph the value functions V^M(x) = α_M + β_M x, V(x) = α + βx, and their sum V^M(x) + V(x), as functions of the market share x.
E 13.8 Suppose the manufacturer in Exercise 13.7 does not behave optimally and decides instead to offer no cooperative advertising. Obtain the value functions of the manufacturer and the retailer. Compare the manufacturer's value function in this case with V^M(x) in Exercise 13.7. Furthermore, when x_0 = 0.5, obtain the manufacturer's loss in expected profit when compared to the optimal expected profit V^M(x_0) in Exercise 13.7.
E 13.9 Suppose that the manufacturer and the retailer in the problem of Sect. 13.4 are integrated into a single firm. Then, formulate the stochastic optimal control problem of the integrated firm. Also, using the data in Exercise 13.7, obtain the value function V^I(x) = α_I + β_I x of the integrated firm, and compare it to V^M(x) + V(x) obtained in Exercise 13.7.
E 13.10 Let σ(x) = 0.25√(x(1 − x)) and the initial market share x_0 = 0.1. Use the optimal feedback advertising effort U^∗(x) in (13.50) to determine the optimal market share X^∗_t over time. You may use MATLAB or another suitable software to graph a sample path of X^∗_t, t ≥ 0.
Appendix A

Solutions of Linear Differential Equations
A.1 First-Order Linear Equations

Consider the equation

ẋ + ax = b(t), x(0) = x_0, (A.1)

where a is a constant real number and b(t) is a given function of t. If we multiply both sides of this equation by the integrating factor e^{at}, we get

ẋe^{at} + axe^{at} = b(t)e^{at},

which can be written at any time τ as

d(x(τ)e^{aτ}) = b(τ)e^{aτ}dτ.

Integrating from 0 to t and then multiplying throughout by e^{−at}, we get the solution of (A.1) as

x(t) = e^{−at}x_0 + ∫_0^t e^{−a(t−τ)}b(τ)dτ. (A.2)
If we generalize (A.1) by replacing the constant a by a function a(t), we get

ẋ(t) + a(t)x(t) = b(t), x(0) = x_0. (A.3)

We can then use the integrating factor e^{∫_0^t a(s)ds}, and with it you are asked to show in Exercise A.1, by employing a procedure similar to that used for the solution of (A.1), that

x(t) = x_0 e^{−∫_0^t a(s)ds} + ∫_0^t b(τ)e^{−∫_τ^t a(s)ds} dτ. (A.4)
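As a quick numerical check of (A.4), the following minimal Python sketch (SciPy) compares the closed form with a direct numerical integration of (A.3); the choices a(t) = t and b(t) = 1 are illustrative assumptions only.

import numpy as np
from scipy.integrate import quad, solve_ivp

a = lambda t: t           # illustrative a(t)
b = lambda t: 1.0         # illustrative b(t)
x0, T = 2.0, 1.5

A = lambda t: quad(a, 0, t)[0]   # A(t) = integral of a from 0 to t

def x_closed(t):
    # (A.4): x(t) = x0*exp(-A(t)) + integral of b(tau)*exp(-(A(t)-A(tau))) dtau
    integrand = lambda tau: b(tau)*np.exp(-(A(t) - A(tau)))
    return x0*np.exp(-A(t)) + quad(integrand, 0, t)[0]

sol = solve_ivp(lambda t, x: b(t) - a(t)*x, (0, T), [x0], rtol=1e-10, atol=1e-12)
print(x_closed(T), sol.y[0, -1])  # the two values agree to high accuracy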
A.2 Second-Order Linear Equations with Constant Coefficients

Consider the equation

ẍ + a_1 ẋ + ax = b(t), (A.5)

where a and a_1 are constants and b(t) is a function of t. This equation requires two boundary conditions to be completely specified. These, for example, could be the values of x(t) at two points in time, or the values of x(0) and ẋ(0).

A general solution of (A.5) has the form

x(t) = x_n(t) + x_p(t), (A.6)

where x_n(t) is a homogeneous solution, defined to be a solution of (A.5) with b(t) set at 0, and x_p(t) is a particular solution. Clearly ẍ_n + a_1 ẋ_n + ax_n = 0 and ẍ_p + a_1 ẋ_p + ax_p = b(t).

To obtain a homogeneous solution, let m_1 and m_2 be the roots of the auxiliary equation

m^2 + a_1 m + a = 0.

Then there are three cases, shown in Table A.1.

Next we provide the particular solution of Eq. (A.5). Since this solution depends on the function b(t), we provide it in Table A.2.

It is easy to extend Row 3 and Row 5 of Table A.2 to a polynomial P(t) of degree n. See Zwillinger (2003) for details.

For solutions of higher-order linear differential equations with constant coefficients and many other differential equations, the reader is referred to Zwillinger (2003) and Polyanin and Zaitsev (2003).
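Symbolic software reproduces both pieces of (A.6) directly. A minimal SymPy sketch, with the illustrative assumptions a_1 = 3, a = 2, and b(t) = e^t: the auxiliary roots are m = −1, −2 (Row 1 of Table A.1), and Row 1 of Table A.2 gives the particular solution e^t/(1 + 3 + 2) = e^t/6.

import sympy as sp

t = sp.symbols('t')
x = sp.Function('x')

# x'' + 3x' + 2x = exp(t)
sol = sp.dsolve(x(t).diff(t, 2) + 3*x(t).diff(t) + 2*x(t) - sp.exp(t), x(t))
print(sol)  # x(t) = C1*exp(-2*t) + C2*exp(-t) + exp(t)/6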
A.3 System of First-Order Linear Equations

In vector form, a system of first-order linear equations reads

ẋ + Ax = b(t), x(0) = x_0, (A.7)
Table A.1: Homogeneous solution forms for Eq. (A.5)

Roots of the auxiliary equation    General solution form

m_1 ≠ m_2, real                    x(t) = C_1 e^{m_1 t} + C_2 e^{m_2 t}

m_1 = m_2 = m, real                x(t) = (C_1 + C_2 t)e^{mt}

m_1 = p + qi, m_2 = p − qi         x(t) = e^{pt}(C_1 sin qt + C_2 cos qt)
Table A.2: Particular solutions for Eq. (A.5)

b(t)                        Particular solution of (A.5)

(1) e^{rt}                  e^{rt}/(r^2 + a_1 r + a)

(2) sin θt                  [(a − θ^2) sin θt − a_1 θ cos θt]/[(a − θ^2)^2 + (a_1 θ)^2]

(3) P(t) = α + βt + γt^2    (1/a)[P(t) − (a_1/a)P′(t) + ((a_1^2 − a)/a^2)P″(t)]

(4) e^{rt} sin θt           Multiply Row 2 by e^{rt};
                            replace a_1 by a_1 + 2r and a by a + a_1 r + r^2

(5) P(t)e^{rt}              Multiply Row 3 by e^{rt};
                            replace a_1 by a_1 + 2r and a by a + a_1 r + r^2
where x is an n-component column vector, A is an n × n matrix of constants, and b is a function of t. We will present two ways of solving the first-order system (A.7).

The first method involves the matrix exponential function e^{tA} defined
by the power series

e^{tA} = I + tA + t^2 A^2/2! + · · · = Σ_{k=0}^∞ (tA)^k/k!. (A.8)

It can be shown that this series converges (component by component) for all values of t. Also, it is differentiable (component by component) for all values of t and satisfies

d(e^{tA})/dt = Ae^{tA} = (e^{tA})A. (A.9)
By analogy with (A.2), we can write the solution of (A.7) as

x(t) = e^{−tA}x_0 + ∫_0^t e^{−(t−τ)A}b(τ)dτ. (A.10)

Although (A.10) represents a formal expression for the solution of (A.7), it does not provide a computationally convenient way of getting explicit solutions.
For the second method, we assume that the matrix A is diagonalizable, i.e., that there exists a nonsingular square matrix P such that

P^{−1}AP = Λ. (A.11)

Here Λ is the diagonal matrix

Λ = diag(λ_1, λ_2, . . . , λ_n), (A.12)

where the diagonal elements λ_1, . . . , λ_n are the eigenvalues of A. The ith column of P is the column eigenvector associated with the eigenvalue λ_i (to see this, multiply both sides of (A.11) by P on the left). By looking at (A.8), it is easy to see that

P^{−1}e^{tA}P = e^{tΛ} and Pe^{tΛ}P^{−1} = e^{tA}, (A.13)
where

e^{tΛ} = diag(e^{tλ_1}, e^{tλ_2}, . . . , e^{tλ_n}). (A.14)

Using (A.13) in (A.10), we can write the solution to (A.7) as

x(t) = (Pe^{−tΛ}P^{−1})x_0 + ∫_0^t Pe^{−(t−τ)Λ}P^{−1}b(τ)dτ. (A.15)

Since well-known algorithms are available for finding eigenvalues and eigenvectors of a matrix, the solution (A.15) can be computed in a straightforward manner.
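A minimal NumPy sketch of the eigendecomposition method, using the data of Exercise A.2 below with b ≡ 0, and checking (A.15) against the matrix-exponential form (A.10):

import numpy as np
from scipy.linalg import expm

A = np.array([[3.0, 2.0], [2.0, 3.0]])
x0 = np.array([1.0, 2.0])
t = 0.7

# Eigendecomposition A = P*diag(lam)*P^{-1}, as in (A.11)-(A.12);
# the eigenvalues of this A are 5 and 1
lam, P = np.linalg.eig(A)
x_eig = P @ np.diag(np.exp(-t*lam)) @ np.linalg.inv(P) @ x0   # (A.15) with b = 0

x_exp = expm(-t*A) @ x0   # (A.10) with b = 0
print(np.allclose(x_eig, x_exp))  # True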
A.4 Solution of Linear Two-Point Boundary Value Problems

In linear-quadratic control problems with linear salvage values (e.g., the production-inventory problem in Sect. 6.1), we require the solution of linear two-point boundary value problems of the form

⎡ ẋ ⎤   ⎡ A11 A12 ⎤ ⎡ x ⎤   ⎡ b1 ⎤
⎣ λ̇ ⎦ = ⎣ A21 A22 ⎦ ⎣ λ ⎦ + ⎣ b2 ⎦  (A.16)
with boundary conditions

x(0) = x_0 and λ(T) = λ_T. (A.17)

The solution of this system will be of the form (A.15), which can be restated as

⎡ x(t) ⎤   ⎡ Q11(t) Q12(t) ⎤ ⎡ x(0) ⎤   ⎡ R1(t) ⎤
⎣ λ(t) ⎦ = ⎣ Q21(t) Q22(t) ⎦ ⎣ λ(0) ⎦ + ⎣ R2(t) ⎦ ,  (A.18)
where λ(0) is a vector of unknowns. These can be determined by setting

λ_T = Q21(T)x(0) + Q22(T)λ(0) + R2(T), (A.19)

which is a system of linear equations in the variables λ(0).
A.5 Solutions of Finite Difference Equations

In this book we will have uses for finite difference equations only in Chaps. 8 and 9. For that reason we will give only a brief introduction to solution techniques for them. Readers who wish for more details can consult one of several texts on difference equations; see, e.g., Goldberg (1986) or Spiegel (1971).

If f(k) is a real function of time, then the difference operator Δ applied to f is defined as

Δf(k) = f(k + 1) − f(k). (A.20)
The factorial power of k is defined as

k^(n) = k(k − 1)(k − 2) · · · (k − (n − 1)). (A.21)

It is easy to show that

Δk^(n) = nk^(n−1). (A.22)

Because this formula is similar to the corresponding formula for the derivative d(k^n)/dk, the factorial powers of k play the analogous role for finite differences that the ordinary powers of k play for differential calculus.

If f(k) is a real function of time, then the anti-difference operator Δ^{−1} applied to f is defined as another function g = Δ^{−1}f(k) with the property that

Δg = f(k). (A.23)
One can easily show that

Δ^{−1}k^(n) = (1/(n + 1))k^(n+1) + c, (A.24)

where c is an arbitrary constant. Equation (A.24) corresponds to the integration formula for powers of k in calculus.

Note that formulas (A.22) and (A.24) are similar to, respectively, differentiation and integration of the power function k^n in calculus. By
analogy with calculus, therefore, we can solve difference equations involving polynomials in ordinary powers of k by first rewriting them as
polynomials involving factorial powers of k so that (A.22) and (A.24)
can be used. We show next how to do this.
A.5.1 Changing Polynomials in Powers of k into Factorial Powers of k

We first give an abbreviated list of formulas that show how to change powers of k into factorial powers of k:

k^0 = k^(0) = 1 (by definition),
k^1 = k^(1),
k^2 = k^(1) + k^(2),
k^3 = k^(1) + 3k^(2) + k^(3),
k^4 = k^(1) + 7k^(2) + 6k^(3) + k^(4),
k^5 = k^(1) + 15k^(2) + 25k^(3) + 10k^(4) + k^(5).

The coefficients of the factorial powers on the right-hand sides of these equations are called Stirling numbers of the second kind, after the person who first derived them. This list can be extended by using a more complete table of these numbers, which can be found in the books on difference equations cited earlier.
Example A.1 Express k^4 − 3k + 4 in terms of factorial powers.

Solution Using the equations above, we have

k^4 = k^(1) + 7k^(2) + 6k^(3) + k^(4), −3k = −3k^(1), 4 = 4,

so that

k^4 − 3k + 4 = k^(4) + 6k^(3) + 7k^(2) − 2k^(1) + 4.
Example A.2 Solve the following difference equation from Example 8.7:

Δλ_k = −k + 5, λ_6 = 0.

Solution We first change the right-hand side into factorial powers, so that it becomes

Δλ_k = −k^(1) + 5.
Applying (A.24), we obtain

λ_k = −(1/2)k^(2) + 5k^(1) + c,

where c is a constant. Applying the condition λ_6 = 0, we find that c = −15, so that the solution is

λ_k = −(1/2)k^(2) + 5k^(1) − 15. (A.25)

However, we would like the answer to be in ordinary powers of k. The way to do that is discussed in the next section.
A.5.2 Changing Factorial Powers of k into Ordinary Powers of k

In order to change factorial powers of k into ordinary powers of k, we make use of the following formulas:

k^(1) = k,
k^(2) = −k + k^2,
k^(3) = 2k − 3k^2 + k^3,
k^(4) = −6k + 11k^2 − 6k^3 + k^4,
k^(5) = 24k − 50k^2 + 35k^3 − 10k^4 + k^5.

The coefficients of the ordinary powers on the right-hand sides of these equations are called Stirling numbers of the first kind. This list can also be extended by using a more complete table of these numbers, which can be found in books on difference equations.
Solution of Example A.2 Continued By substituting the first two of the above formulas into (A.25), we see that the desired answer is

λ_k = −(1/2)k^2 + (11/2)k − 15, (A.26)

which is the solution needed for Example 8.7.
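A two-line numerical check of (A.26) in Python:

lam = lambda k: -0.5*k**2 + 5.5*k - 15   # candidate solution (A.26)

assert lam(6) == 0                                            # lambda_6 = 0
assert all(lam(k + 1) - lam(k) == -k + 5 for k in range(20))  # delta lambda_k = -k + 5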
Exercises for Appendix A
E A.1 Show that the solution of Eq. (A.3) is given by (A.4).

E A.2 If

A = ⎡ 3 2 ⎤
    ⎣ 2 3 ⎦,

show that

Λ = ⎡ 5 0 ⎤     and     P = ⎡ 1  1 ⎤
    ⎣ 0 1 ⎦                 ⎣ 1 −1 ⎦.

Use (A.15) to solve (A.7) for this data, given that x(0) = (1, 2)ᵀ.
E A.3 If

A = ⎡ 3 3 ⎤
    ⎣ 2 4 ⎦,

show that

Λ = ⎡ 6 0 ⎤     and     P = ⎡ 1  3 ⎤
    ⎣ 0 1 ⎦                 ⎣ 1 −2 ⎦.

Use (A.15) to solve (A.7) for this data, given that x(0) = (0, 5)ᵀ.
E A.4 After you have read Sect. 6.1, re-solve the production-inventory example stated in Eqs. (6.1) and (6.2) (ignoring the control constraint P ≥ 0) by the method of Sect. A.4. The linear two-point boundary value problem is stated in Eqs. (6.6) and (6.7).
Appendix B

Calculus of Variations and Optimal Control Theory
Here we introduce the subject of the calculus of variations by analogy with the classical topic of maximization and minimization in calculus; see Gelfand and Fomin (1963), Young (1969), and Leitmann (1981) for rigorous treatments of the subject. The problem of the calculus of variations is that of determining a function that maximizes a given functional, the objective function. An analogous problem in calculus is that of determining a point at which a specific function, the objective function, is maximum. This, of course, is done by taking the first derivative of the function and equating it to zero. This is what is called the first-order condition for a maximum. A similar procedure will be employed to derive the first-order condition for the variational problem. The analogy with classical optimization extends also to the second-order maximization condition of calculus. Finally, we will show the relationship between the maximum principle of optimal control theory and the necessary conditions of the calculus of variations. It is noted that this relationship is similar to the one between the Kuhn-Tucker conditions in mathematical programming and the first-order conditions in classical optimization.

We start with the "simplest" variational problem in the next section.
B.1 The Simplest Variational Problem

Consider functions x: [0, T] → E^1 belonging to C^1[0, T], the class of functions defined over the interval [0, T] with continuous first derivatives. For simplicity in exposition, we assume x(t) to be a scalar function of t ∈ [0, T]; the extension to a vector function is straightforward. Here t simply denotes the independent variable, which need not be time. Assume further that a function in this class is termed admissible if it satisfies the terminal conditions

x(0) = x_0 and x(T) = x_T. (B.1)

We are thus dealing with a fixed-end-point problem. Examples of admissible functions for the problem are shown in Fig. B.1; see Chapters 2 and 3 of Gelfand and Fomin (1963) for problems other than the simplest problem, i.e., problems with other kinds of conditions for the end points.

Figure B.1: Examples of admissible functions for the problem

The problem under consideration is to obtain an admissible function x∗ for which the functional

J(x) = ∫_0^T F(x, ẋ, t)dt (B.2)

has a relative maximum. We will assume that all first and second partial derivatives of the function F: E^1 × E^1 × E^1 → E^1 are continuous.
B.2 The Euler-Lagrange Equation

The necessary first-order conditions in classical optimization were obtained by considering small changes about the solution point. For the variational problem, we consider small variations about the solution function. Let x(t) be the solution and let

y(t) = x(t) + εη(t),

where η: [0, T] → E^1 is an arbitrary continuously differentiable function satisfying

η(0) = η(T) = 0, (B.3)

and ε ≥ 0 is a small number. A sketch of these functions is shown in Fig. B.2.

Figure B.2: Variation about the solution function
The value of the objective functional associated with y(t) can be considered a function of ε, i.e.,

V(ε) = J(y) = ∫_0^T F(x + εη, ẋ + εη̇, t)dt.

However, x(t) is a solution, and therefore V(ε) must have a maximum at ε = 0. This means

δJ = (dV/dε)|_{ε=0} = 0,
where δJ is known as the variation in J. Differentiating V(ε) with respect to ε and setting ε = 0 yields

δJ = (dV/dε)|_{ε=0} = ∫_0^T (F_x η + F_ẋ η̇)dt = 0,

which, after integrating the second term by parts, gives

δJ = (dV/dε)|_{ε=0} = ∫_0^T F_x η dt + (F_ẋ η)|_0^T − ∫_0^T (d/dt)(F_ẋ)η dt = 0. (B.4)

Because of the end conditions on η, the expression simplifies to

δJ = (dV/dε)|_{ε=0} = ∫_0^T [F_x − (d/dt)F_ẋ]η dt = 0.
We now use the fundamental lemma of the calculus of variations, which states that if h is a continuous function and ∫_0^T h(t)η(t)dt = 0 for every continuous function η(t), then h(t) = 0 for all t ∈ [0, T]. The reason this lemma holds, without going into the details of a rigorous proof, which is available in Gelfand and Fomin (1963), is as follows. Suppose that h(t) ≠ 0 for some t ∈ [0, T]. Since h(t) is continuous, there is, therefore, an interval (t_1, t_2) ⊂ [0, T] over which h is nonzero and has the same sign. Now selecting η(t) such that

η(t) > 0 for t ∈ (t_1, t_2), and η(t) = 0 otherwise,

it is possible to make the integral ∫_0^T h(t)η(t)dt ≠ 0. Thus, by contradiction, h(t) must be identically zero over the entire interval [0, T].
By using the fundamental lemma, we have the necessary condition

F_x − (d/dt)F_ẋ = 0, (B.5)

known as the Euler-Lagrange equation, or simply the Euler equation, which must be satisfied by a maximal solution x∗. In other words, the solution x∗(t) must satisfy

F_x(x∗, ẋ∗, t) − (d/dt)F_ẋ(x∗, ẋ∗, t) = 0. (B.6)
We note that the Euler equation is a second-order ordinary differential equation. This can be seen by taking the total time derivative of F_ẋ in (B.5) to obtain

F_x − F_ẋt − F_ẋx ẋ − F_ẋẋ ẍ = 0. (B.7)

The boundary conditions for this equation are obviously the end-point conditions x(0) = x_0 and x(T) = x_T.
Special Case (i): When F does not depend explicitly on ẋ. In this case, the Euler equation (B.5) reduces to

F_x = 0,

which is nothing but the first-order condition of classical optimization. In this case, the dynamic problem is a succession of static classical optimization problems.

Special Case (ii): When F does not depend explicitly on x. The Euler equation reduces to

(d/dt)F_ẋ = 0, (B.8)

which we can integrate to obtain

F_ẋ = C, (B.9)

where C is a constant.

Special Case (iii): When F does not depend explicitly on t. In this important special case, the Euler equation (B.7) reduces to

F_x − F_ẋx ẋ − F_ẋẋ ẍ = 0. (B.10)

Multiplying the left-hand side of (B.10) by ẋ, and adding and subtracting the term F_ẋ ẍ, transforms (B.10) into

(d/dt)(F − F_ẋ ẋ) = 0. (B.11)

We can solve the above equation as

F − F_ẋ ẋ = C, (B.12)

where C is a constant.
B.3 The Shortest Distance Between Two Points on the Plane

The problem is to show that the straight line passing through two points on a plane gives the shortest distance between the two points. The problem can be stated as follows:

min ∫_0^T √(1 + ẋ^2) dt
subject to
x(0) = x_0 and x(T) = x_T.

Here t refers to distance rather than time. Since F = −√(1 + ẋ^2) does not depend explicitly on x, we are in the second special case, and the first integral (B.9) of the Euler equation is

F_ẋ = −ẋ(1 + ẋ^2)^{−1/2} = C.

This implies that ẋ is a constant, which results in the solution

x∗(t) = C_1 t + C_2,

where C_1 and C_2 are constants. These can be evaluated by imposing the boundary conditions, which give C_1 = (x_T − x_0)/T and C_2 = x_0. Thus,

x∗(t) = [(x_T − x_0)/T] t + x_0,

which is the straight line passing through x_0 and x_T.
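SymPy can reproduce this calculation mechanically through its euler_equations routine; a minimal sketch:

import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols('t')
x = sp.Function('x')

F = -sp.sqrt(1 + x(t).diff(t)**2)   # the integrand of Sect. B.3
eq = euler_equations(F, x(t), t)[0]
print(sp.simplify(eq))  # equivalent to x''(t) = 0, i.e., a straight line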
B.4 The Brachistochrone Problem

The problem arises from the search for the shape of a wire along which a bead will slide, without friction, in the least time from a given point A to another point B, under the influence of gravity; see Fig. 1.1.

Let t denote the horizontal axis and x the vertical axis (measured vertically downward), and let the (t, x) coordinates of A and B be (0, 0) and (T, b), respectively. Thus, x(0) = 0 and x(T) = b. It is reasonable to assume b ≥ 0, so that point B is not higher than point A.
The time τ_AB required for the bead to slide from point A to point B along a wire formed in the shape of a curve x(t) is given as

τ_AB = ∫_0^{s_T} ds/v,

where v represents velocity and s_T is the final displacement measured along the curve. Since ds^2 = dx^2 + dt^2, we can write

ds = √(1 + ẋ^2) dt,

where ẋ = dx/dt (note that t does not denote time here). From elementary physics, it is known that if v(t = 0) = 0 and a denotes the acceleration due to gravity, then

v(t) = √(2ax(t)), t ∈ [0, T].

Then,

τ_AB = ∫_0^T √[(1 + ẋ^2)/(2ax)] dt. (B.13)
The purpose of the Brachistochrone problem is to find x(t), t ∈ [0, T], so as to minimize the time τAB. This is a variational problem which, in view of a being a constant, can be stated as follows:

min { J(x) = ∫₀ᵀ F(x, ẋ, t)dt = ∫₀ᵀ √((1 + ẋ²)/x) dt }. (B.14)

As we can see, the integrand F in the above problem does not depend explicitly on t, and the problem (B.14) belongs to the third special case. Using the first integral (B.11) of the Euler equation for this case, we have

√((1 + ẋ²)/x) − ẋ²/√(x(1 + ẋ²)) = 1/k (a constant).

We can reduce this to

dx/dt = √((k² − x)/x),

which we rewrite as

dx / √((k² − x)/x) = dt. (B.15)

By performing a change of variable according to

x = k² sin²θ = k²(1/2 − (1/2)cos 2θ) (B.16)
and recognizing that x(t = 0) = 0 corresponds to θ = 0, we can integrate (B.15) to obtain

∫₀^θ 2k² sin²θ dθ = k²(θ − (1/2)sin 2θ) = ∫₀ᵗ dt = t. (B.17)
By setting 2θ = φ in (B.16) and (B.17), we can write the solution parametrically as

t = k²(φ − sin φ)/2,
x = k²(1 − cos φ)/2, (B.18)
which are known to be equations representing a cycloid, as depicted in
Fig. 1.1 in Chap. 1. Furthermore, since the initial condition x(0) = 0 is
already incorporated in performing the integration in (B.17), we must
use the terminal condition x(T ) = b for determining the constant k.
Clearly, if we let φ1 be defined by the relation

b/T = (1 − cos φ1)/(φ1 − sin φ1), (B.19)

then we can write

k² = 2b/(1 − cos φ1) = 2T/(φ1 − sin φ1). (B.20)
The value of φ1 can be easily obtained numerically for any given values
of b > 0 and T > 0.
With these, the optimal solution x∗(t) is the cycloid given parametrically as

t = T(φ − sin φ)/(φ1 − sin φ1),
x∗ = b(1 − cos φ)/(1 − cos φ1). (B.21)
Furthermore, the minimum time τ∗AB can be obtained as

τ∗AB = J(x∗)/√(2a) = (1/√(2a)) ∫₀ᵀ √((1 + (ẋ∗(t))²)/x∗(t)) dt. (B.22)
In Exercise B.1, you are asked to obtain φ1 for T = b = 1 m, and
then obtain the minimum time τ ∗AB .
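A numerical sketch of this computation follows, assuming NumPy and SciPy are available. The bracket passed to the root finder is an assumption that the relevant root of (B.19) lies in (0, 2π), which holds for b/T = 1.

```python
# A sketch of the computation asked for in Exercise B.1, assuming
# NumPy/SciPy.  Solve (B.19) for phi_1 with T = b = 1, get k^2 from
# (B.20), and evaluate the minimum time (B.22) along the cycloid (B.21).
import numpy as np
from scipy.optimize import brentq

T = b = 1.0          # metres
a = 9.81             # gravitational acceleration, m/s^2

# (B.19): (1 - cos(phi1))/(phi1 - sin(phi1)) = b/T
f = lambda phi: (1.0 - np.cos(phi)) / (phi - np.sin(phi)) - b / T
phi1 = brentq(f, 0.1, 2.0 * np.pi - 0.1)     # assumed bracket for the root

k2 = 2.0 * b / (1.0 - np.cos(phi1))          # (B.20)

# Along (B.18), sqrt((1 + xdot^2)/x) * (dt/dphi) reduces to the constant
# sqrt(k2), so the integral (B.22) becomes phi1 * sqrt(k2/(2a)).
tau = phi1 * np.sqrt(k2 / (2.0 * a))
print(phi1, tau)     # phi1 approx 2.412, tau approx 0.583 s
```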
B.5 The Weierstrass-Erdmann Corner Conditions
So far we have only considered functionals defined for smooth curves.
This is, however, a restricted class of curves which qualify as solutions,
since it is easy to give examples of variational problems which have no
solution in this class. Consider, for example, the objective functional
  1 
min J(x) = x2 (1 − ẋ)2 dt , x(−1) = 0, x(1) = 1.
−1

The greatest lower bound for J(x) for smooth x = x(t) satisfying the
boundary conditions is obviously zero. Yet there is no x ∈ C 1 [−1, 1]
with x(−1) = 0 and x(1) = 1, which achieves this value of J(x). In fact,
the minimum is achieved for the curve

x∗(t) = 0 for −1 ≤ t ≤ 0, and x∗(t) = t for 0 < t ≤ 1,

which has a corner (i.e., a discontinuous first derivative) at t = 0. Such a piecewise smooth extremal with corners is called a broken extremal.
We now enlarge the class of admissible functions by relaxing the requirement that they be smooth everywhere. The larger class is the class of piecewise continuous functions which are continuously differentiable almost everywhere in [0, T], i.e., except at some points in [0, T].
Let x, defined on the interval [0, T], have a corner at τ ∈ [0, T]. Let us decompose J(x) as

J(x) = ∫₀ᵀ F(x, ẋ, t)dt = ∫₀^τ F(x, ẋ, t)dt + ∫_τ^T F(x, ẋ, t)dt = J1(x) + J2(x).
It is clear that on each of the intervals [0, τ ) and (τ , T ], the Euler equation
must hold.
To compute variations δJ1 and δJ2 , we must recognize that the two
‘pieces’ of x are not fixed-end-point problems. We must require that the
two pieces of x join continuously at t = τ ; the point t = τ can, however,
move freely as shown in Fig. B.3.
This will require a slightly modified version of formula (B.4) for
writing out the variations; see pp. 55–56 in Gelfand and Fomin (1963).
Figure B.3: A broken extremal with corner at τ

Equating the sum of variations

δJ = δJ1 + δJ2 = 0

for x∗ to be an extremal and using the fact that it must be continuous at t = τ implies

Fẋ|τ− = Fẋ|τ+, (B.23)

[F − Fẋ ẋ]τ− = [F − Fẋ ẋ]τ+. (B.24)

These conditions are called Weierstrass-Erdmann corner conditions, which must hold at the point τ where the extremal has a corner.
In each of the intervals [0, τ) and (τ, T], the extremal x must satisfy the Euler equation (B.5). Solving these two equations will provide us with four constants of integration since the Euler equations are second-order differential equations. These constants can be found from the end-point conditions (B.1) and the Weierstrass-Erdmann conditions (B.23) and (B.24).

B.6 Legendre’s Conditions: The Second


Variation
The Euler equation is a necessary conditions analogous to the first-order
condition for a maximum (or minimum) in the classical optimization
B.7. Necessary Condition for a Strong Maximum 429

problems of calculus. The condition analogous to the second-order nec-


essary condition for a maximum x∗ is the Legendre condition

Fẋẋ ≤ 0. (B.25)

To obtain this condition, we use the second-order condition of classical optimization for the function V(ε) to have a maximum at ε = 0, i.e.,

d²V(ε)/dε²|ε=0 = ∫₀ᵀ (Fxxη² + 2Fxẋηη̇ + Fẋẋη̇²)dt ≤ 0. (B.26)

Integrating the middle term by parts and using (B.3), we can transform (B.26) into a more convenient form

∫₀ᵀ (Qη² + Pη̇²)dt ≤ 0, (B.27)

where

Q = Q(t) = Fxx − (d/dt)Fxẋ and P = P(t) = Fẋẋ.
While it is possible to rigorously obtain (B.25) from (B.27), we will only provide a qualitative argument for this. If we consider the quadratic functional (B.27) for functions η(t) satisfying η(0) = 0, then η(t) will be small in [0, T] if η̇(t) is small in [0, T]. The converse is not true, however, since it is easy to construct η(t) which is small but has a large derivative η̇(t) in [0, T]. Thus, Pη̇² plays the dominant role in (B.27); i.e., Pη̇² can be much larger than Qη² but it cannot be much smaller (provided P ≠ 0). Therefore, it might be expected that the sign of the functional in (B.27) is determined by the sign of the coefficient P(t), i.e., (B.27) implies (B.25). For a rigorous proof, see Gelfand and Fomin (1963).
We note that the strengthened Legendre condition (i.e., with a strict
inequality in (B.25)), the Euler equation, and one other condition called
strengthened Jacobi condition are sufficient for a maximum. The reader
can consult Chapter 5 of Gelfand and Fomin (1963) for details.

B.7 Necessary Condition for a Strong Maximum

So far we have discussed necessary conditions for a weak maximum. By weak maximum we mean that the candidate extremals are smooth or piecewise smooth functions. The concept of a strong maximum, on the other hand, requires that the candidate extremals need only be continuous
functions. Without going into details, which are available in Gelfand and
Fomin (1963), we state a necessary condition for a strong maximum. This
is called the Weierstrass necessary condition. The condition is analogous
to the one in the static case that the objective function be concave. It
states that if the functional (B.2) has a strong maximum for the extremal
x∗ satisfying (B.1), then

E(x∗ , ẋ∗ , t, u) ≤ 0 (B.28)

for every finite u, where E is the Weierstrass Excess Function defined


as

E(x, ẋ, t, u) = F (x, u, t) − F (x, ẋ, t) − Fẋ (x, ẋ, t)(u − ẋ). (B.29)

Note that this condition is always met if F (x, ẋ, t) is concave in ẋ.
The proof of (B.28) is by contradiction. Suppose there exists a τ ∈ [0, T] and a vector q such that

E(x∗(τ), ẋ∗(τ), τ, q) > 0.

It is then possible to suitably modify x∗ to y, which is close to x∗ in C[0, T], such that

ΔJ = ∫₀ᵀ F(y, ẏ, t)dt − ∫₀ᵀ F(x∗, ẋ∗, t)dt > 0,

contradicting the hypothesis that J(x) has a strong maximum at x∗.

B.8 Relation to Optimal Control Theory
It is possible to derive the necessary conditions of the calculus of varia-
tions from the maximum principle. This is strongly reminiscent of the
relationship between the first-order conditions of classical optimization
and the Kuhn-Tucker conditions of mathematical programming.
First, we note that the calculus of variations problem can be stated
as an optimal control problem as follows:

max { J = ∫₀ᵀ F(x, u, t)dt }
subject to (B.30)
ẋ = u, x(0) = x0, x(T) = xT,
u ∈ Ω = Eⁿ.
The Hamiltonian is

H(x, u, λ, t) = F (x, u, t) + λu (B.31)

with the adjoint variable λ satisfying

λ̇ = −Hx = −Fx . (B.32)

Maximizing the Hamiltonian with respect to u yields

Hu = Fẋ + λ = 0, (B.33)

from which we obtain

λ = −Fẋ. (B.34)
Differentiating (B.34) with respect to time gives

λ̇ = −(d/dt)Fẋ.

This equation with (B.32) implies the Euler equation

Fx − (d/dt)Fẋ = 0.
From (B.31) and (B.33), the second-order condition Huu ≤ 0 for the maximization of the Hamiltonian leads to

Fẋẋ ≤ 0,

known as the Legendre condition.


By the maximum principle, if u∗ is an optimal control with x∗ de-
noting the corresponding trajectory, then for each t ∈ [0, T ],

H(x∗ , u∗ , λ, t) ≥ H(x∗ , u, λ, t),


where u is any other control. By the definition of the Hamiltonian (B.31), ẋ∗ = u∗ from (B.30), and Eq. (B.34), we have

F(x∗, ẋ∗, t) − Fẋ(x∗, ẋ∗, t)ẋ∗ ≥ F(x∗, u, t) − Fẋ(x∗, ẋ∗, t)u,

which by transposition of the terms yields the Weierstrass necessary condition

E(x∗, ẋ∗, t, u) = F(x∗, u, t) − F(x∗, ẋ∗, t) − Fẋ(x∗, ẋ∗, t)(u − ẋ∗) ≤ 0.

We have just proved the equivalence of the maximum principle and the Weierstrass necessary condition in the case where Ω is open. In cases when Ω is closed and when the optimal control is on the boundary of Ω, the Weierstrass necessary condition no longer holds in general. The maximum principle still applies, however.
Finally, according to the maximum principle, both λ and H are con-
tinuous functions of time. That is,

λ(τ − ) = λ(τ + ),
H(x∗ (τ ), u∗ (τ − ), λ(τ − ), τ ) = H(x∗ (τ ), u∗ (τ + ), λ(τ + ), τ ).

However,
λ = −Fẋ and H = F − Fẋ ẋ,
which means that the right-hand sides must be continuous with respect
to time, i.e., even across corners. These are precisely the Weierstrass-
Erdmann corner conditions.

Exercises for Appendix B

E B.1 Solve (B.19) numerically to obtain φ1 for T = b = 1 m. Then, use the formulas (B.21) and (B.22) to compute the minimum time τ∗AB. Note that the gravitational acceleration is a = 9.81 m/s².
Appendix C

An Alternative Derivation
of the Maximum Principle

Recall that in the derivation of the maximum principle in Chap. 2, we assumed the twice differentiability of the value function V(x, t) with respect to the state variable x.
spect to the state variable x. Looking at (2.31), we can observe that
the smoothness assumptions on the value function do not arise in the
statement of the maximum principle. Also since it is not an exogenously
given function, there is no a priori reason to assume the twice differen-
tiability. Moreover, there arise cases in which the value function V (x, t)
is not even differentiable in x.
In what follows, we will give an alternate derivation. This proof fol-
lows the course pointed out by Pontryagin et al. (1962) but with certain
simplifications. It appears in Fel’dbaum (1965) and, in our opinion, it
is one of the simplest proofs for the maximum principle which is not
related to dynamic programming and thus permits the elimination of
assumptions about the differentiability of the return function V (t, x).
We select the Mayer form of the problem (2.5) for deriving the maximum principle in this section. It will be convenient to reproduce (2.5) here as (C.1):

max_{u(t)∈Ω(t)} { J = cx(T) }
subject to (C.1)
ẋ = f(x, u, t), x(0) = x0.

C.1 Needle-Shaped Variation
Let u∗ (t) be an optimal control with corresponding state trajectory x∗ (t).
We sketch u∗ (t) in Fig. C.1 and x∗ (t) in Fig. C.2 in a scalar case. Note
that the kink in x∗ (t) at t = θ corresponds to the discontinuity in u∗ (t)
at t = θ.

Figure C.1: Needle-shaped variation

Figure C.2: Trajectories x∗ (t) and x(t) in a one-dimensional case

Let τ denote any time in the open interval (0, T). We select a sufficiently small ε to ensure that τ − ε > 0 and concentrate our attention on this small interval (τ − ε, τ]. We vary the control on this interval while keeping the control on the remaining intervals [0, τ − ε] and (τ, T] fixed. Specifically, the modified control is

u(t) = v ∈ Ω for t ∈ (τ − ε, τ], and u(t) = u∗(t) otherwise. (C.2)
This is called a needle-shaped variation as shown in Fig. C.1. It is a jump function and is different from variations in the calculus of variations; see Appendix B. Also the difference v − u∗ is finite and need not be small. However, since the variation is on a small time interval, its influence on the subsequent state trajectory can be proved to be ‘small’. This is done in the following.
Let the subsequent motion be denoted by x(t); in general x(t) ≠ x∗(t) for t > τ − ε. In Fig. C.2, we have sketched x(t) corresponding to u(t).
Let

δx(t) = x(t) − x∗(t), t ≥ τ − ε,

denote the change in the state variables. Obviously δx(τ − ε) = 0. Clearly,

δx(τ) ≈ ε[ẋ(s) − ẋ∗(s)], (C.3)

where s denotes some intermediate time in the interval (τ − ε, τ]. In particular, we can write (C.3) as

δx(τ) = ε[ẋ(τ) − ẋ∗(τ)] + o(ε)
      = ε[f(x(τ), v, τ) − f(x∗(τ), u∗(τ), τ)] + o(ε). (C.4)

But δx(τ ) is small since f is assumed to be bounded. Furthermore, since


f is continuous and the difference δx(τ ) = x(τ ) − x∗ (τ ) is small, we can
rewrite (C.4) as

δx(t) ≈ ε[f (x∗ (τ ), v, τ ) − f (x∗ (τ ), u∗ (τ ), τ )]. (C.5)

Since the initial difference δx(τ) is small and since the control u∗(t) does not change for t > τ, we may conclude that δx(t) will be small for all t > τ. Being small, the law of variation of δx(t) can be found from linear equations for small changes in the state variables. These are called variational equations. From the state equation in (C.1), we have

d(x∗ + δx)/dt = f(x∗ + δx, u∗, t) (C.6)

or,

dx∗/dt + d(δx)/dt ≈ f(x∗, u∗, t) + fxδx (C.7)

or using (C.1),

d(δx)/dt ≈ fx(x∗, u∗, t)δx, for t ≥ τ, (C.8)
with the initial condition δx(τ) given by (C.5).
The basic idea in deriving the maximum principle is that equations
(C.8) are linear variational equations and result in an extraordinary sim-
plification. We next obtain the adjoint equations.

C.2 Derivation of the Adjoint Equation and the Maximum Principle
For this derivation, we employ two methods. The direct method, similar
to that of Hartberger (1973), is the consequence of directly integrating
(C.8). The indirect method avoids this integration by a trick which is
instructive.

Direct Method. Integrating (C.8) we get

δx(T) = δx(τ) + ∫_τ^T fx[x∗(t), u∗(t), t]δx(t)dt, (C.9)

where the initial condition δx(τ) is given in (C.5).
Since δx(T) is the change in the terminal state from the optimal state x∗(T), the change in the objective function δJ must be nonpositive. Thus,

δJ = cδx(T) = cδx(τ) + ∫_τ^T cfx[x∗(t), u∗(t), t]δx(t)dt ≤ 0. (C.10)

Furthermore, since (C.8) is a linear homogeneous differential equation, we can write its general solution as

δx(t) = Φ(t, τ)δx(τ), (C.11)

where the fundamental solution matrix or the transition matrix Φ(t, τ) ∈ E^(n×n) obeys

(d/dt)Φ(t, τ) = fx[x∗(t), u∗(t), t]Φ(t, τ), Φ(τ, τ) = I, (C.12)

where I is an n × n identity matrix; see Appendix A.
Substituting for δx(t) from (C.11) into (C.10), we have

δJ = cδx(τ) + ∫_τ^T cfx[x∗(t), u∗(t), t]Φ(t, τ)δx(τ)dt ≤ 0. (C.13)
This induces the definition

λ(τ) = ∫_τ^T cfx[x∗(t), u∗(t), t]Φ(t, τ)dt + c, (C.14)

which when substituted into (C.13), yields

δJ = λ(τ)δx(τ) ≤ 0. (C.15)

But δx(τ) is supplied in (C.5). Noting that ε > 0, we can rewrite (C.15) as

λ(τ)f[x∗(τ), v, τ] − λ(τ)f[x∗(τ), u∗(τ), τ] ≤ 0. (C.16)
Defining the Hamiltonian for the Mayer form as

H[x, u, λ, t] = λf (x, u, t), (C.17)

we can rewrite (C.16) as

H[x∗ (τ ), u∗ (τ ), λ(τ ), τ ] ≥ H[x∗ (τ ), v, λ(τ ), τ ]. (C.18)

Since this can be done for almost every τ , we have the required Hamil-
tonian maximizing condition.
The differential equation form of the adjoint equation (C.14) can be obtained by taking its derivative with respect to τ. Thus,

dλ(τ)/dτ = ∫_τ^T cfx[x∗(t), u∗(t), t](dΦ(t, τ)/dτ)dt − cfx[x∗(τ), u∗(τ), τ]. (C.19)

It is also known that the transition matrix has the property:

dΦ(t, τ)/dτ = −Φ(t, τ)fx[x∗(τ), u∗(τ), τ],
which can be used in (C.19) to obtain

dλ(τ)/dτ = −∫_τ^T cfx[x∗(t), u∗(t), t]Φ(t, τ)fx[x∗(τ), u∗(τ), τ]dt − cfx[x∗(τ), u∗(τ), τ]. (C.20)

Using the definition (C.14) of λ(τ) in (C.20), we have

dλ(τ)/dτ = −λ(τ)fx[x∗(τ), u∗(τ), τ]

with λ(T) = c, or using (C.17) and noting that τ is arbitrary, we have

λ̇ = −λfx[x∗, u∗, t] = −Hx[x∗, u∗, λ, t], λ(T) = c. (C.21)

This completes the derivation of the maximum principle along with the
adjoint equation using the direct method.

Indirect Method. The indirect method employs a trick which considerably simplifies the derivation. Instead of integrating (C.8) explicitly, we now assume that this integration yields the change δx(T) in the state at the terminal time. As in (C.10), we have

δJ = cδx(T) ≤ 0. (C.22)

First, we define

λ(T ) = c, (C.23)
which makes it possible to write (C.22) as

δJ = cδx(T ) = λ(T )δx(T ) ≤ 0. (C.24)

Note parenthetically that if the objective function J = S(x(T)), we must define λ(T) = ∂S[x(T)]/∂x(T), giving us

δJ = (∂S[x(T)]/∂x(T))δx(T) = λ(T)δx(T).

Now, λ(T)δx(T) is the change in the objective function due to a change δx(T) at the terminal time T. That is, λ(T) is the marginal return or the marginal change in the objective function per unit change in the state at time T. But δx(T) cannot be known without integrating (C.8). We do know, however, the value of the change δx(τ) at time τ which caused the terminal change δx(T) via (C.8).
We would therefore like to pose the problem of obtaining the change
δJ in the objective function in terms of the known value δx(τ ); see
Fel’dbaum (1965). Simply stated, we would like to obtain the marginal
return λ(τ ) per unit change in state at time τ . Thus,

λ(τ )δx(τ ) = δJ = λ(T )δx(T ) ≤ 0. (C.25)

Obviously, knowing λ(τ ) will make it possible to make an inference about


δJ, which is directly related to the needle-shaped variation applied in the
small interval (τ − ε, τ ].
However, since τ is arbitrary, our problem of finding λ(τ) can be translated to one of finding λ(t), t ∈ [0, T], such that

λ(t)δx(t) = λ(T )δx(T ), t ∈ [0, T ], (C.26)

or in other words,

λ(t)δx(t) = constant, λ(T ) = c. (C.27)

It turns out that the differential equation which λ(t) must satisfy can be easily found. From (C.27),

(d/dt)[λ(t)δx(t)] = λ(dδx/dt) + λ̇δx = 0, (C.28)

which after substituting for dδx/dt from (C.8) becomes

λfxδx + λ̇δx = (λfx + λ̇)δx = 0. (C.29)

Since (C.29) is true for arbitrary δx, we have

λ̇ = −λfx = −Hx (C.30)

using the definition (C.17) for the Hamiltonian.
The Hamiltonian maximizing condition can be obtained by substituting for δx(τ) from (C.5) into (C.25). This is the same as what we did in (C.15) through (C.18).
The purpose of the alternative proof was to demonstrate the valid-
ity of the maximum principle for a simple problem without knowledge
of any return function. For more complex problems, one needs compli-
cated mathematical analysis to rigorously prove the maximum principle
without making use of return functions. A part of mathematical rigor is
in proving the existence of an optimal solution without which necessary
conditions are meaningless; see Young (1969).
Appendix D

Special Topics in Optimal Control

In this appendix we will discuss a number of specialized topics in the sections that follow. These are the Kalman and Kalman-Bucy filters, the Wiener process, Itô's lemma, linear-quadratic problems, second-order variations, singular control, and the Sethi-Skiba points. These topics are referred to but not discussed in the main body of the text. While we will
not be able to go into great detail, we will provide an adequate descrip-
tion of these topics for our purposes. For further details, the reader can
consult the references cited in the respective sections dealing with these
topics.

D.1 The Kalman Filter
So far in this book, we have assumed that the values of the state variables
can be measured with certainty. In many cases the assumption that the
value of a state variable can be directly measured and exactly determined
may not be realistic.
There are two types of random disturbances present. The first kind,
termed measurement noise, arises because of imprecise measurement in-
struments, inaccurate recording systems, etc. In many cases the mea-
surement technique involves observations of functions of state variables,
from which the values of some or all of the state variables are inferred;
e.g., measuring the inventory of a natural gas reservoir involves pressure


measurements together with physical laws relating pressure and volume.


The second kind can be termed system noise, in which the system
itself is subjected to random disturbances. For instance, sales may follow
a stochastic process, which affects the system equation (6.1) relating in-
ventory, production, and sales. In the cash balance example, the demand
for cash as well as the interest rates in (5.1) and (5.2) can be represented
by stochastic processes.
In analyzing systems in which one or both of these kinds of noises
are present, it is important to be able to make good estimates of the
values of the state variables. We discuss the Kalman and Kalman-Bucy
filters devoted to optimal estimation of current values of state variables
given past measurements. The Kalman filter will be described in this
section, for which further details can be obtained from references such as
Kalman (1960a,b), Bryson and Ho (1975), Anderson and Moore (1979),
and Kumar and Varaiya (1986). The Kalman-Bucy filter for continuous-
time linear systems will be described briefly in Sect. D.3 and the readers
can refer to Fleming and Rishel (1975) and Arnold (1974) for further
details.
Consider a dynamic stochastic system in discrete time described by the difference equation

xt+1 − xt = At xt + Gt wt, t = 0, 1, ..., N − 1, (D.1)

or

xt+1 = (At + I)xt + Gt wt, t = 0, 1, ..., N − 1, (D.2)

where xt is an n-component (column) state vector, wt is a k-component (column) system noise vector, At is an n × n matrix, and Gt is an n × k matrix. The initial state x0 is assumed to be a Gaussian (normal) random variable with mean and n × n covariance matrix given by

E[x0] = x̄0 and E[(x0 − x̄0)(x0 − x̄0)′] = Σ0. (D.3)

Without loss of generality, we confine ourselves to the case when wt is a standard Gaussian purely random sequence with

E[wt] = 0 and E[wt(wτ)′] = Iδtτ, (D.4)
where for t = 0, 1, ..., N, τ = 0, 1, ..., N,

δtτ = 0 if t ≠ τ, and δtτ = 1 if t = τ. (D.5)
Thus, the random vectors wt and wτ are independent standard normal variables for t ≠ τ. We also assume that the sequence wt is independent of the initial condition x0, i.e., the k × n matrix

E[wt(x0 − x̄0)′] = 0, t = 0, 1, ..., N. (D.6)

The process of measurement of the state variables xt yields an r-dimensional vector yt which is related to xt by the transformation

yt = Ht xt + vt, t = 0, 1, ..., N, (D.7)

where Ht is the state-to-measurement transformation matrix of dimension r × n, and vt is a Gaussian purely random sequence of r-dimensional measurement noise vectors having the following properties:

E[vt] = 0, E[vt(vτ)′] = Rt δtτ, (D.8)

E[wt(vτ)′] = 0, E[(x0 − x̄0)(vt)′] = 0. (D.9)

In (D.8) the matrix Rt is the r × r covariance matrix for the random


variable v t , and it is therefore positive semidefinite, symmetric, and non-
singular. The requirements in (D.9) mean that the additive measurement
noise is independent of the system noise as well as the initial state.
Given a sequence of observations y0, y1, y2, ..., yi up to time i, we would like to obtain the maximum likelihood estimate of the state xi, or equivalently, to find the weighted least squares estimate. In order to derive the estimate x̂i of xi, we require the use of the Bayes theorem and an application of calculus to find the unconstrained minimum of a quadratic form. This derivation is straightforward but lengthy. It yields the following recursive procedure for finding the estimate x̂t, t = 0, 1, ..., i, i ≤ N:

x̂t = x̄t + Kt(yt − Ht x̄t), (D.10)

x̄t+1 = (At + I)x̂t, x̄0 given, (D.11)

Kt = Pt Ht′ Rt⁻¹, (D.12)

Pt = (Σt⁻¹ + Ht′ Rt⁻¹ Ht)⁻¹, (D.13)

Σt+1 = (At + I)Pt(I + At′) + Gt Gt′, Σ0 given. (D.14)

The procedure in expressions (D.10)–(D.14) is known as the Kalman


filter for linear discrete-time processes.
The interpretation of (D.10) is that the estimate x̂t is equal to the mean value x̄t plus a correction term which is proportional to the difference between the actual measurement yt and the predicted measurement Ht x̄t. Also,

Σt = E[(xt − x̄t)(xt − x̄t)′],

the error covariance before the measurement at time t, and

Pt = E[(xt − x̂t)(xt − x̂t)′],

the error covariance matrix after the measurement at time t. In other words, Σt and Pt are measures of uncertainties in the state before and after the measurement at time t, respectively. Thus, the proportionality matrix Kt can be interpreted as the ratio between the uncertainty Pt in the state and the measurement uncertainty Rt. Because of this property of Kt, it is called the Kalman gain in the engineering literature.
It is important to note that the propagation of Pt given by (D.13) and
(D.14) is independent of the measurements. Thus, it can be computed
offline and stored. The computation of updated estimates by (D.10) and
(D.11) involves only the current measurement and error covariance, and
can therefore be done in real time. Finally, prediction of the state beyond
the period up to which measurements are available can be done as

x̂t+1 = x̄t+1 = (At + I)x̂t + Gt w̄t, t ≥ i, i ≤ N, (D.15)

with x̂i obtained from the filter (D.10)–(D.14).
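The following sketch, assuming NumPy is available, runs the recursion (D.10)–(D.14) on a simulated scalar system; the particular values of At, Gt, Ht, Rt and the prior are illustrative assumptions, not values from the text.

```python
# A minimal sketch of the Kalman filter (D.10)-(D.14), assuming NumPy.
# Scalar illustrative system: the values of A, G, H, R are assumptions.
import numpy as np

rng = np.random.default_rng(0)
N = 50
A, G, H = np.array([[-0.1]]), np.array([[0.5]]), np.array([[1.0]])
R = np.array([[0.25]])                       # measurement covariance R_t
I = np.eye(1)
xbar, Sigma = np.array([1.0]), np.array([[1.0]])   # prior mean, covariance

x = rng.multivariate_normal(xbar, Sigma)     # true initial state x^0
errors = []
for t in range(N):
    y = H @ x + rng.multivariate_normal(np.zeros(1), R)                  # (D.7)
    P = np.linalg.inv(np.linalg.inv(Sigma) + H.T @ np.linalg.inv(R) @ H) # (D.13)
    K = P @ H.T @ np.linalg.inv(R)                                       # (D.12)
    xhat = xbar + K @ (y - H @ xbar)                                     # (D.10)
    errors.append(abs((xhat - x).item()))
    xbar = (A + I) @ xhat                                                # (D.11)
    Sigma = (A + I) @ P @ (I + A.T) + G @ G.T                            # (D.14)
    x = (A + I) @ x + G @ rng.standard_normal(1)                         # (D.2)

print("mean absolute estimation error:", np.mean(errors))
```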

D.2 Wiener Process and Stochastic Calculus
A continuous 1-dimensional process Z is a (standard) Wiener process on
an interval [0, T ] if
1. Z has independent increments;

2. The increment Zt − Zτ is Gaussian with mean 0 and variance


|t − τ | for any t, τ , ∈ [0, T ];

3. Z0 is Gaussian with mean 0.

This definition easily generalizes to define a k-dimensional Wiener process.
A Wiener process is also called a Brownian motion, as it models the
motion of a particle in a fluid. It has been shown that a Wiener process is
nowhere differentiable; a Brownian particle does not possess a velocity at


any instant. Furthermore, it is a process with unbounded variation, i.e.,
its length in any finite interval is infinite. The Wiener process is difficult
to draw, although Fig. D.1 is an attempt to sketch a continuous sample
path that, at the same time, conveys the flavor of its “wild” nature.
Nevertheless, the formal time derivative of a Wiener process is termed white noise in the engineering literature. Thus, wt = dZ/dt can be regarded as a stationary process in which the random variables wt and wτ, t ≠ τ, are independent with Ewt = Ewτ = 0 and the covariance E[wt wτ] = δtτ. One can see that wt is a continuous-time analogue of the discrete-time process wt defined in the previous section.
Next we wish to define an integral ∫ₛᵗ G(τ)dZτ for a rather wide class of processes G. Specifically, it will be the class M0 of all real-valued stochastic processes G on [0, T] such that ∫₀ᵀ |G(τ)|²dτ < ∞ with probability 1. Given the wild nature of the Wiener process, the integral cannot be defined in the sense of Riemann-Stieltjes for every function in M0. Therefore, we resort to the concept of a stochastic integral in the Itô sense. For this, let us define the subclass M ⊂ M0 such that any G ∈ M satisfies E∫₀ᵀ |G(τ)|²dτ < ∞. Let Gj ∈ M be a step process on [0, t] in the sense that there is a partition consisting of points τ0, τ1, ..., τm with 0 = τ0 < τ1 < ... < τm = t. For this step process, the integral equals the Riemann-Stieltjes sum

∫₀ᵗ Gj(τ)dZτ = Σ_{k=1}^{m} Gj(τk−1)[Zτk − Zτk−1]. (D.16)

We then define the stochastic integral for any G ∈ M0 by taking a sequence of step processes Gj, j = 1, 2, ..., such that ∫₀ᵗ |Gj(τ) − G(τ)|²dτ converges to zero in probability as j → ∞. Then, the sequence of random variables defined in (D.16) converges, as j → ∞, to a limit in probability, which is defined as ∫₀ᵗ G(τ)dZτ, written simply as ∫₀ᵗ GdZ. It can be shown that the limit does not depend on the approximating sequence Gj with probability 1 for each t.
It is important to note the following properties of Itô's stochastic integral. The integral ∫₀ᵗ GdZ can be defined simultaneously for all t ∈ [0, T], so that it is continuous on [0, T]. Furthermore, for any H, G ∈ M0, we have

E∫₀ᵗ G(τ)dZτ = 0, E∫₀ᵗ H(τ)dZτ = 0,

and

E[(∫₀ᵗ G(τ)dZτ)(∫₀ᵗ H(τ)dZτ)] = E∫₀ᵗ G(τ)H(τ)dτ. (D.17)

Equation (D.17) serves as motivation for the frequently used symbolic


notation
(dZt )2 = dt. (D.18)
Now that we have defined the stochastic integral, it remains to specify the stochastic differential rule. Let f, G, and X be one-dimensional stochastic processes such that E∫₀ᵀ |f|dt < ∞, G ∈ M0, X is continuous, and

Xt − X0 = ∫₀ᵗ f(τ)dτ + ∫₀ᵗ G(τ)dZτ, 0 ≤ t ≤ T. (D.19)

This equation is a stochastic integral equation, for which it is customary to use the suggestive notation

dXt = f(t)dt + G(t)dZt, X0 given,

or simply

dX = fdt + GdZ, X0 given. (D.20)
Now let the one-dimensional process Yt = ψ(Xt, t), t ∈ [0, T], where the function ψ(x, t) is continuously differentiable in t and twice continuously differentiable in x. Then, it possesses the stochastic differential

dYt = ψt(Xt, t)dt + ψx(Xt, t)dXt + (1/2)ψxx(Xt, t)G²(t)dt
    = [ψt(Xt, t) + ψx(Xt, t)f(t) + (1/2)ψxx(Xt, t)G²(t)]dt
      + ψx(Xt, t)G(t)dZt, Y0 = ψ(X0, 0). (D.21)

Equation (D.21) is to be interpreted in the sense that its integral form from 0 to t holds with probability 1, i.e.,

ψ(Xt, t) = ψ(X0, 0) + ∫₀ᵗ [ψs(Xs, s) + ψx(Xs, s)f(s) + (1/2)ψxx(Xs, s)G²(s)]ds
           + ∫₀ᵗ ψx(Xs, s)G(s)dZs, w.p. 1. (D.22)
It is worth pointing out that the term (1/2)ψxxG²dt does not appear in the differential rule of elementary calculus. This is an important difference as seen in Chap. 12, where we discuss stochastic optimal control problems. Also, a multi-dimensional generalization of (D.16)–(D.22) is straightforward.
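A small simulation, assuming NumPy is available, illustrates two facts from this section: the sum of squared increments of Z over [0, T] is close to T, the content of the symbolic rule (D.18); and applying (D.21) to ψ(x, t) = x² with f = 0 and G = 1 gives E[Z_T²] = T. The grid and sample sizes are arbitrary choices.

```python
# A simulation sketch of this section, assuming NumPy.
import numpy as np

rng = np.random.default_rng(1)
T, n, paths = 1.0, 10_000, 2_000
dt = T / n
dZ = np.sqrt(dt) * rng.standard_normal((paths, n))  # increments N(0, dt)
Z = np.cumsum(dZ, axis=1)

print(np.mean(np.sum(dZ**2, axis=1)))   # quadratic variation ~ T, per (D.18)
print(np.mean(Z[:, -1] ** 2))           # E[Z_T^2] = T, per Ito's rule (D.21)
```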

D.3 The Kalman-Bucy Filter
The continuous-time analogue of the Kalman filter is known as the
Kalman-Bucy filter. Here, the difference equation (D.2) is replaced by
the linear stochastic differential equation

dXt = A(t)Xt dt + G(t)dZt , 0 ≤ t ≤ T, (D.23)

which is a special case of the Itô stochastic differential equation (D.20) introduced in Chap. 12. In this equation, Xt is an n-component (column) state vector, Zt is the value at time t of a standard k-component (column) Wiener process Z, and the matrices A(t) and G(t) of dimensions n × n and n × k, respectively, are continuous in t. Furthermore,

E(X0) = X̄0, and E[(X0 − X̄0)(X0 − X̄0)′] = Σ0. (D.24)

The measurement process (D.7) is replaced by

dYt = H(t)Xt dt + σ(t)dξt, Y0 = 0, (D.25)

where ξ is a standard r-dimensional Wiener process and the r × r matrix σ(t) is such that the r × r matrix R(t) := σ(t)σ′(t) is positive definite. Note that the term σ(t)dξt in (D.25) represents the noise term, which corresponds to vt in (D.7). Thus, the term R(t) corresponds to the covariance matrix Rt in Sect. D.1 on the Kalman filter.
The filtering problem is to find the weighted least square estimate
of Xt given the measurements up to time t. It can be shown that the
optimal estimate is the conditional expectation

X̂t = E[Xt |Ys , 0 ≤ s ≤ t]. (D.26)

Furthermore, it can be obtained recursively by the following Kalman-Bucy filter:

dX̂t = A(t)X̂t dt + K(t)[dYt − H(t)X̂t dt], X̂0 = X̄0, (D.27)

K(t) = P(t)H′(t)R⁻¹(t), (D.28)
Ṗ(t) = A(t)P(t) + P(t)A′(t) − K(t)H(t)P(t) + G(t)G′(t), P(0) = Σ0, (D.29)

where H′(t) denotes the transpose (H(t))′ and R⁻¹(t) means the inverse (R(t))⁻¹, per the notational convention defined in Chap. 1. The interpretations of P(t) and K(t) are the same as in the previous section.
The filter (D.27)–(D.29) is the Kalman-Bucy filter (Kalman and Bucy
1961) for linear systems in continuous time. Equation (D.29) is called the
matrix Riccati equation. Besides engineering applications, the Kalman
filter and its extensions are very useful in econometric and financial mod-
eling; see Buchanan and Norton (1971), Chow (1975), Aoki (1976), Naik
et al. (1998), and Bhar (2010).
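As a sketch of how (D.27)–(D.29) can be used in practice, the following code (assuming NumPy; the scalar coefficients A, G, H, σ and the step size are illustrative assumptions) integrates the Riccati equation and the filter with a simple Euler-Maruyama scheme.

```python
# A sketch of the scalar Kalman-Bucy filter (D.27)-(D.29), assuming NumPy.
import numpy as np

rng = np.random.default_rng(2)
T, n = 5.0, 5_000
dt = T / n
A, G, H, sigma, Sigma0 = -0.5, 1.0, 1.0, 0.5, 1.0   # illustrative values
R = sigma**2

X = rng.normal(0.0, np.sqrt(Sigma0))    # true state, with Xbar_0 = 0
Xhat, P = 0.0, Sigma0
for _ in range(n):
    dZ, dxi = np.sqrt(dt) * rng.standard_normal(2)
    dY = H * X * dt + sigma * dxi                       # observation (D.25)
    K = P * H / R                                       # gain (D.28)
    Xhat += A * Xhat * dt + K * (dY - H * Xhat * dt)    # filter (D.27)
    P += (2.0 * A * P - K * H * P + G**2) * dt          # Riccati (D.29)
    X += A * X * dt + G * dZ                            # true dynamics (D.23)

print("true state vs. estimate:", X, Xhat)
```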

D.4 Linear-Quadratic Problems
An important problem in systems theory, especially engineering sciences,
is to synthesize feedback controllers. These controllers provide optimal
control as a function of the state of the system. A usual method of ob-
taining these controllers is to solve the Hamilton-Jacobi-Bellman partial
differential equation (2.19). This equation is nonlinear in general, which
makes it very difficult to solve in closed form. Thus, it is not possible in
most cases to obtain optimal feedback control schemes explicitly.
It is, however, feasible in many cases to obtain perturbation feed-
back control, which refers to control in the vicinity of an optimal path.
These perturbation schemes require the approximation of the problem
by a linear-quadratic problem in the vicinity of an optimal path (see
Sect. D.5), and feedback control for the approximating problem is easy
to obtain.
A linear-quadratic control problem is a problem with linear dynamics and a quadratic objective function. First, we treat a special case called the Regulator Problem:

min_u { x′(T)ST x(T) + ∫₀ᵀ (x′Cx + u′Du)dt } (D.30)

subject to

ẋ = Ax + Bu, x(0) = x0. (D.31)
Here x ∈ E n , u ∈ E m , and the appropriate dimensional matrices
C, D, A, and B, when time-dependent, are assumed to be continuous in
time t. Furthermore, we shall assume the matrices C and ST to be positive semidefinite and, without loss of generality, symmetric, and matrix D to be symmetric and positive definite.
To solve the regulator problem for an explicit feedback controller, we rewrite it as that of maximizing

J = ∫₀ᵀ −(x′Cx + u′Du)dt − x′(T)ST x(T)

subject to (D.31). Clearly, this is a special case of the optimal control problem (2.4) and we can apply (2.15) and (2.16) to obtain the Hamilton-Jacobi-Bellman equation

0 = max_u { −(x′Cx + u′Du) + Vx[Ax + Bu] + Vt } (D.32)

with the terminal boundary condition

V(x, T) = −x′(T)ST x(T). (D.33)

By checking that V(γx, t) = γ²V(x, t) and V(x, t) + V(y, t) = (1/2)[V(x + y, t) + V(x − y, t)], we can establish that the value function V(x, t) is of a quadratic form. Thus, let

V(x(t), t) = −x′(t)S(t)x(t) (D.34)

for some matrix S(t), symmetric without loss of generality. Then Vt = −x′Ṡx and Vx = −2(Sx)′ = −2x′S. Using these relations in (D.32), we get

x′Ṡx = max_u { −x′Cx − u′Du − 2x′SAx − 2x′SBu }
     = −min_u { x′Cx + u′Du + 2x′SAx + 2x′SBu }. (D.35)

To find the minimum of the expression on the right-hand side of (D.35), we observe the following identity obtained by completing the square:

x′Cx + u′Du + 2x′SAx + 2x′SBu = (u + D⁻¹B′Sx)′D(u + D⁻¹B′Sx)
                               + x′(C − SBD⁻¹B′S + SA + A′S)x.

Because matrix D is positive definite, it follows that the minimum is achieved in (D.35) by the control

u∗ = −D⁻¹B′Sx. (D.36)


Then from (D.35) and (D.36), we obtain

x′Ṡx = −x′[C − SBD⁻¹B′S + SA + A′S]x. (D.37)

Since this equation holds for all x, we have the matrix differential equation

Ṡ = −SA − A′S + SBD⁻¹B′S − C, (D.38)

called a matrix Riccati equation, with the terminal condition

S(T) = ST (D.39)
obtained from (D.33), where ST is specified in (D.30).


A solution procedure for Riccati equations appears in Bryson and Ho
(1975) or Anderson and Moore (1990). With the solution S of (D.38)
and (D.39), we have the optimal feedback control as in (D.36).
To see that the optimal control u∗ in (D.36) maximizes the Hamiltonian H = −x′Cx − u′Du + Vx[Ax + Bu], let us use (D.36) to obtain

2(Du∗)′ = 2u∗′D = −2x′SB(D′)⁻¹D = −2x′SB = VxB,

which is precisely the first-order condition for the maximum of the right-hand side of (D.32). Moreover, the first-order condition yields a global maximum of the Hamiltonian, which is concave since the matrix D is positive definite.
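The sketch below, assuming NumPy, integrates the matrix Riccati equation (D.38) backward from (D.39) with a crude Euler scheme and then applies the feedback law (D.36) to the dynamics (D.31); the double-integrator matrices and weights are illustrative assumptions.

```python
# A sketch of the regulator solution (D.36), (D.38), (D.39), assuming NumPy.
import numpy as np

T, n = 2.0, 2_000
dt = T / n
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # illustrative double integrator
B = np.array([[0.0], [1.0]])
C = np.eye(2)                            # state weight (psd)
D = np.array([[1.0]])                    # control weight (pd)
ST = np.eye(2)                           # terminal weight
Dinv = np.linalg.inv(D)

S = [ST]                                 # S[k] approximates S(T - k dt)
for _ in range(n):                       # integrate (D.38) backward in time
    Sk = S[-1]
    Sdot = -Sk @ A - A.T @ Sk + Sk @ B @ Dinv @ B.T @ Sk - C
    S.append(Sk - Sdot * dt)
S.reverse()                              # now S[k] approximates S(k dt)

x = np.array([1.0, 0.0])
for k in range(n):
    u = -Dinv @ B.T @ S[k] @ x           # feedback law (D.36)
    x = x + (A @ x + B @ u) * dt         # state equation (D.31)
print("terminal state:", x)              # regulated toward the origin
```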
A generalization of (D.30) to include a cross-product term to allow for interactions between the state x and control u, which would be useful in the next section on the second variation, is to set

J = −x′(T)ST x(T) − ∫₀ᵀ (x′, u′) [ C N ; N′ D ] [ x ; u ] dt, (D.40)

and the problem is to maximize J subject to the state equation (D.31).
It is easy to see that the integrand in (D.40) can be rewritten as x′Cx + u′Du + 2x′Nu. Furthermore, with the definition ũ = u + D⁻¹N′x, the generalized problem defined by (D.40) and (D.31) can be reduced to the standard regulator problem of maximizing ∫₀ᵀ −[ũ′Dũ + x′(C − ND⁻¹N′)x]dt − x′(T)ST x(T) subject to ẋ = (A − BD⁻¹N′)x + Bũ, provided that the matrix C − ND⁻¹N′ is positive semidefinite. We can then use formulas (D.36), (D.38), and (D.39) to obtain the solution of
the transformed problem and then use the definition of ũ to write the
feedback control of the generalized problem as

u∗ (x) = −D−1 [N  + B  S]x, (D.41)

where

Ṡ = −S(A − BD⁻¹N′) − (A′ − ND⁻¹B′)S + SBD⁻¹B′S + ND⁻¹N′ − C
  = −SA − A′S + (SB + N)D⁻¹(B′S + N′) − C (D.42)

with

S(T) = ST. (D.43)

D.4.1 Certainty Equivalence or Separation Principle
Suppose Eq. (D.31) is changed by the presence of the stochastic term
G(t)dZt as defined in (D.23) so that we have the Itô equation

dXt = (A(t)Xt + B(t)Ut )dt + G(t)dZt ,

and X0 is a normal random variable with

E[X0] = 0, E[X0X0′] = Σ0.

Because of the presence of uncertainty in the system equation, we modify the objective function in (D.40) as follows:

max J = E[ −X′T ST XT − ∫₀ᵀ (X′t, U′t) [ Ct Nt ; N′t Dt ] [ Xt ; Ut ] dt ].

Assume further that Xt cannot be directly measured and the measurement process is given by (D.25), i.e.,

dYt = H(t)Xt dt + σ(t)dξt, Y0 = 0.

The optimal control Ut∗ for this linear-quadratic stochastic optimal control problem can be shown to be given by (D.41) with Xt replaced by its estimate X̂t; see Arnold (1974). Thus,

Ut∗ = −D(t)⁻¹[N′(t) + B′(t)S(t)]X̂t,


where S(t) is given by (D.42) and (D.43), and X̂t is given by the Kalman-Bucy filter:

dX̂t = [A(t)X̂t + B(t)Ut∗]dt + K(t)(dYt − H(t)X̂t dt), X̂(0) = 0,

K(t) = P(t)H′(t)R⁻¹(t),

Ṗ(t) = A(t)P(t) + P(t)A′(t) − K(t)H(t)P(t) + G(t)G′(t), P(0) = Σ0.

The above procedure has received two different names in the literature. In economics it is called the certainty equivalence principle; see Simon (1956). In the engineering and mathematics literature it is called the separation principle; see Fleming and Rishel (1975). When we call it the certainty equivalence principle, we are emphasizing the fact that X̂t can be used for the purposes of optimal feedback control as if it were the certain value of the state variable Xt. The term separation principle, on the other hand, emphasizes the fact that the process of determining the optimal control can be broken down into two steps: first, estimate Xt by using the optimal filter; second, use that estimate in the optimal feedback control formula for the deterministic problem.

D.5 Second-Order Variations
Second-order variations in optimal control theory are analogous to the
second-order conditions in the classical optimization problem of calculus.
To discuss the second-order variational condition is difficult when the
control variable u is constrained to be in the control set Ω. So we make
the simplifying assumption that Ω = Rm , and thus the control u is
unconstrained. As a result, we are now dealing with the problem:
max_u { J = ∫₀ᵀ F(x, u, t)dt + Φ[x(T)] } (D.44)

subject to

ẋ = f(x, u, t), x(0) = x0. (D.45)
From Chap. 2, we know that the first-order necessary conditions for this problem are given by

λ̇ = −Hx, λ(T) = Φx[x(T)], (D.46)

Hu = 0, (D.47)
D.5. Second-Order Variations 453

where the Hamiltonian H is given by

H = F + λf. (D.48)

Since u is unconstrained, these conditions may be easily derived by the method of calculus of variations. To see this, we write the augmented objective functional as

J̄ = Φ[x(T)] + ∫₀ᵀ [H(x, u, λ, t) − λẋ]dt. (D.49)

Consider small perturbations from the extremal path given by (D.45)–(D.48) as a result of small perturbations δx(0) in the initial state. Define the resulting perturbations in state, adjoint, and control variables by δx(t), δλ(t), and δu(t), respectively. These, of course, will be obtained by linearizing (D.45)–(D.47) around the extremal path:

dδx/dt = fxδx + fuδu, δx(0) specified, (D.50)

dδλ/dt = −(Hxxδx)′ − δλfx − (Hxuδu)′, (D.51)

δHu = (Huxδx)′ + δλ(Huλ)′ + (Huuδu)′
    = (Huxδx)′ + δλfu + (Huuδu)′ = 0. (D.52)

Alternatively, we may consider an expansion of the objective function and the state equation to second order since the first-order terms vanish about a trajectory which satisfies (D.44)–(D.47). From Bryson and Ho (1975), this may be accomplished by expanding (D.49) to second order and all the constraints to first order. Thus, we have

δ²J̄ = (1/2)δx′(T)Φxxδx(T) + (1/2)∫₀ᵀ (δx′, δu′) [ Hxx Hxu ; Hux Huu ] [ δx ; δu ] dt (D.53)

subject to

dδx/dt = fxδx + fuδu, δx(0) specified. (D.54)
Since we are interested in a neighboring extremal path, we must deter-
mine δu(t) so as to maximize δ 2 J¯ subject to (D.54). This problem is
a linear-quadratic problem discussed in the previous section. For this problem, the optimal control δu∗(t) is given by the formula (D.41), provided Huu(t) is nonsingular for 0 ≤ t ≤ T. The case when Huu(t) is singular for a finite time interval is treated in Sect. D.6. Thus, recognizing that ST = Φxx, C = Hxx, N = Hxu, D = Huu, A = fx, and B = fu, we have

δu∗(t) = Huu⁻¹[Hux + fu′S(t)]δx(t), (D.55)

where

Ṡ + Sfx + fx′S − (Sfu + Hxu)Huu⁻¹(fu′S + Hux) + Hxx = 0, S(T) = Φxx. (D.56)
While a number of second-order conditions can be obtained by proceeding further in this manner, we will be interested only in the concavity condition (or strengthened Legendre-Clebsch condition). It is possible to show that neighboring stationary paths exist (in a weak sense; i.e., δx and δu are small) if

Huu(t) < 0 for 0 ≤ t ≤ T, (D.57)

or in other words, if Huu(t) is negative definite. First-order conditions, condition (D.57), and the condition that S(t) is finite for 0 ≤ t ≤ T represent sufficient conditions for a trajectory to be a local maximum. We are not being specific here because in this book we rely mostly on the sufficiency conditions developed in Chaps. 2–4, which are based on certain concavity requirements. We are stating (D.57) because of its similarity to the second-order condition for a local maximum in the classical maximization problem.
We must note that

Hu = 0 and Huu ≤ 0 (D.58)

form necessary conditions for a trajectory to be a local maximum.

D.6 Singular Control
In some optimization problems including some problems treated in this
text, extremal arcs satisfying Hu = 0 occur on which the matrix Huu is
singular. Such arcs are called singular arcs. Note that these arcs sat-
isfy (D.58) but not the strengthened condition (D.57). While no general
sufficiency conditions are available for singular arcs, some additional nec-
essary conditions known as the generalized Legendre-Clebsch conditions
have been developed. A good reference on singular control is Bell and Jacobson (1975).
We will only discuss the case in which the Hamiltonian is linear in one
or more of the control variables. For these systems, Hu = 0 implies that
the coefficient of the linear control term in the Hamiltonian vanishes
identically along a singular arc. Thus, the control is not determined
in terms of x and λ by the Hamiltonian maximizing condition Hu = 0.
Instead, the control is determined by the requirement that the coefficient
of these linear terms remain zero on the singular arc. That is, the time
derivatives of Hu must be zero. Having obtained the control by setting
dHu /dt = 0 (or by setting higher time derivatives to equal zero) along the
singular arc, we must check additional necessary conditions analogous to
the second-order condition (D.57). For a maximization problem with a
single control variable, these conditions turn out to be
(−1)^k (∂/∂u)[d^{2k}Hu/dt^{2k}] ≤ 0, k = 0, 1, 2, .... (D.59)

The conditions (D.59) are called the generalized Legendre-Clebsch conditions.
For applications of these conditions to problems in production and finance, see, e.g., Maurer et al. (2005) and Davis and Elzinga (1971).
The Davis-Elzinga model is covered in Exercise 5.17 in Chap. 5. For
numerical solutions of singular control problems, see Maurer (1976).

Example D.1 We present an example treated by Johnson and Gibson (1963):

max { J = −(1/2)∫₀ᵀ x1² dt } (D.60)

subject to

ẋ1 = x2 + u, x1(0) = a, (D.61)

ẋ2 = −u, x2(0) = b, (D.62)

x1(T) = x2(T) = 0. (D.63)

Solution We form the Hamiltonian

H = −(1/2)x1² + λ1(x2 + u) + λ2(−u), (D.64)
where the adjoint equations are

λ̇1 = x1 , λ̇2 = −λ1 . (D.65)

The optimal control is bang-bang plus singular. Singular arcs must sat-
isfy
H u = λ1 − λ2 = 0 (D.66)
for a finite time interval. The optimal control can, therefore, be obtained
by
dHu
= λ̇1 − λ̇2 = x1 + λ1 = 0. (D.67)
dt
Differentiating once more with respect to time t, we obtain

d²Hu/dt² = ẋ1 + λ̇1 = x2 + u + x1 = 0,
which implies
u = −(x1 + x2 ) (D.68)
along the singular arc. We now verify for the example the generalized Legendre-Clebsch condition (D.59) for k = 1:

−(∂/∂u)[d²Hu/dt²] = −1 ≤ 0. (D.69)

D.7 Global Saddle Point Theorem
In this section, we provide an important result for a class of station-
ary infinite-horizon optimal control problems such as those treated in
Chap. 11. In particular, we are concerned here with the one-dimensional
state problem defined in (3.97) without the mixed constraint and the
terminal inequality constraints, i.e.,
max { J = ∫₀^∞ φ(x, u)e^(−ρt)dt }, (D.70)

ẋ = f(x, u), x(0) = x0. (D.71)


An application of the maximum principle results in an adjoint equation

λ̇ = ρλ − φx − λfx (D.72)
D.7. Global Saddle Point Theorem 457

and a Hamiltonian maximizing control u∗ (x, λ). Substituting this for


u in (D.71) and (D.72) gives rise to a canonical system of differential
equations
ẋ = f ∗ (x, λ) and λ̇ = ψ ∗ (x, λ). (D.73)

A saddle point (x̄, λ̄) of the system (D.73) satisfies

f ∗ (x̄, λ̄) = 0 and ψ ∗ (x̄, λ̄) = 0. (D.74)

The important issue for this problem is the existence and uniqueness
of an optimal path that steers the system from an initial value x0 to the
steady state x̄. This is equivalent to finding a value λ0 so that the system
(D.73) starting from (x0 , λ0 ) moves asymptotically to (x̄, λ̄). A sufficient
condition for this to happen is given in the following theorem.

Figure D.1: Phase diagram for system (D.73)

Theorem D.1 (Global Saddle Point Theorem) Let (x̄, λ̄) be a unique saddle point of the canonical system (D.73) of the differential equations and let x0 be a given initial state for which the vertical line x = x0 (see Fig. D.1) intersects both isoclines ẋ = f∗(x, λ) = 0 and λ̇ = ψ∗(x, λ) = 0. Assume further that the region bounded by the isoclines and the line x = x0 has a triangular shape as in Fig. D.1 (i.e., the isoclines themselves do not intersect in the open interval between x0 and x̄). Then, there exists a unique saddle point path starting from x = x0 and leading to the saddle point (x̄, λ̄).
The proof of this theorem, based on Theorem 1.2 and Corollaries 1.1
and 1.2 from Hartman (1982), can be found in Feichtinger and Hartl
(1986).

D.8 The Sethi-Skiba Points
In Exercise 2.9, we defined autonomous optimal control problems. Here,
we limit the discussion to autonomous systems that are discounted
infinite-horizon optimal control problems with one-dimensional state, de-
fined as follows:
max_{u(t)∈Ω} { J = ∫₀^∞ e^(−ρt)φ(x(t), u(t))dt }

subject to

ẋ(t) = f(x(t), u(t)), x(0) given,

with ρ > 0 as the discount rate. In addition to assuming that the functions φ and f are continuously differentiable, we assume that the integral in the objective function J converges for any admissible solution x(t), u(t), t ≥ 0. In such problems, there may arise multiple equilibria depending on the initial condition.
such that the system starting from it exhibits multiple optimal solutions
or equilibria. Thus, at least in the neighborhood of x0 , the system moves
to one equilibrium if x(0) > x0 and to another if x(0) < x0 . In other
words, x0 is an indifference point from which the system could move to
either of two equilibria. Such points were originally identified by Sethi
(1977b, 1979c). Subsequently, Skiba (1978) and Dechert and Nishimura
(1983) explored these indifference points for one-sector optimal economic
growth models with nonconvex production functions, in contrast to con-
cave production functions treated in Sect. 11.1. These points are also
referred to as the DNSS points, where the acronym DNSS stands for
Dechert, Nishimura, Sethi, and Skiba. Before it became known that
Sethi (1977b) had already identified them prior to Skiba (1978), these
points were also called Skiba points.
Below we present a simple example that exhibits a Sethi-Skiba point
at x0 = 0. For further discussion on these points, see Grass et al. (2008),
Zeiler et al. (2010), Kiseleva and Wagener (2010), and Caulkins et al.
(2015a).
Example D.2 Solve the problem:

max { J = ∫₀^∞ e^(−ρt)x(t)u(t)dt }

subject to

ẋ(t) = −x(t) + u(t), x(0) = x0, (D.75)

u(t) ∈ [−1, +1], t ≥ 0.

Let us first solve this problem for x0 < 0. We form the Hamiltonian

H = x(t)u(t) + λ(t)(−x(t) + u(t)) (D.76)

with
λ̇(t) = (1 + ρ)λ(t) − u(t). (D.77)
Since H is linear in u, the optimal policy is

u∗ (t) = bang[−1, 1; x(t) + λ(t)]. (D.78)

For x0 < 0, the state equation reveals that u∗ (t) = −1 will give the
largest decrease of x(t) and keep x(t) < 0, t ≥ 0. Thus, it will maximize
the product x(t)u(t) for each t > 0. We also note that the long-run
stationary equilibrium in this case is (x̄, ū, λ̄) = (−1, −1, −1/(1 + ρ)).
It is also easy to verify that the solution u∗ (t) = −1, x∗ (t) = −1 +
e−t (x0 + 1), and λ(t) = −1/(1 + ρ), t ≥ 0, satisfies (D.75), (D.77) along
with the sufficiency transversality condition (3.99), and maximizes the
Hamiltonian in (D.76).
Similarly, we can argue that for x0 > 0, the optimal solution is u∗(t) = +1, x∗(t) = 1 + e^(−t)(x0 − 1) > 0, and λ(t) = 1/(1 + ρ), t ≥ 0. The long-run stationary equilibrium in this case is (x̄, ū, λ̄) = (1, 1, 1/(1 + ρ)).
Then by symmetry, we can conclude that if x0 = 0, both u∗ (t) = −1
and u∗ (t) = +1, t ≥ 0, yield the same objective function, and hence
both are optimal. Thus, x0 = 0 is a Sethi-Skiba point for this example.

Clearly, at this point, the choice between using u∗(0) = −1 and u∗(0) = +1 will determine the equilibrium the system approaches. Notice that once the system has moved away from x0 = 0, there is no more choice left in choosing the control.
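A direct computation confirms the indifference at x0 = 0. The sketch below, assuming NumPy/SciPy and an illustrative discount rate ρ = 0.1, evaluates the discounted payoff of the two constant controls using the closed-form state trajectory x(t) = u + e^(−t)(x0 − u).

```python
# A numerical illustration of Example D.2, assuming NumPy/SciPy.
import numpy as np
from scipy.integrate import quad

rho = 0.1            # illustrative discount rate

def payoff(u, x0=0.0, horizon=200.0):
    # xdot = -x + u gives x(t) = u + exp(-t) * (x0 - u)
    x = lambda t: u + np.exp(-t) * (x0 - u)
    value, _ = quad(lambda t: np.exp(-rho * t) * x(t) * u, 0.0, horizon)
    return value

print(payoff(+1.0), payoff(-1.0))   # equal payoffs: x0 = 0 is a Sethi-Skiba point
```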
It is possible that at a Sethi-Skiba point, a decision maker can influence the equilibrium that the system would move to, by choosing a control from the set of possible optimal controls. This may have important implications. In a model of controlling illicit drugs, Grass et al. (2008) derive a Sethi-Skiba point, signifying a critical number of addicts, such that if there are fewer addicts than the critical number, it is optimal to use an eradication strategy that uses massive treatment spending to drive the number of addicts down to zero. On the other hand, if there are more than the critical number of addicts, then it is optimal to use an accommodation strategy that uses a moderate level of treatment spending that balances the social cost of drug use and the cost of treatment.
This is a case of a classic Sethi-Skiba point acting as a “tipping point”
between the two strikingly different equilibria, one of which may be more
socially or politically favored than the other, and the social planner can
use an optimal control to move to the more favored equilibrium.
We conclude this subsection by mentioning that the Sethi-Skiba
points are exhibited in the production management context by Fe-
ichtinger and Steindl (2006) and Moser et al. (2014), in the open-source
software context by Caulkins et al. (2013a), and in other contexts by
Caulkins et al. (2011, 2013b, 2015a).

D.9 Distributed Parameter Systems
Thus far, our efforts have been directed to the study of the control of
systems governed by systems of ordinary differential or difference equa-
tions. Such systems are often called lumped parameter systems. It is
possible to generalize these to systems in which the state and control
variables are defined in terms of space as well as time dimensions. These
are called distributed parameter systems and are described by a set of
partial differential or difference equations.
For example, in the lumped parameter advertising models of the type
treated in Chap. 7, we solved for the optimal rate of advertising expen-
diture at each instant of time. However, in the analogous distributed
parameter advertising models, we must obtain the optimal advertising
expenditure rate at every geographic location of interest at each instant
of time; see Seidman et al. (1987) and Marinelli and Savin (2008). In
other economic problems, the spatial coordinates might be income, qual-
ity, age, etc. Derzko et al. (1980), for example, discuss a cattle-ranching
model in which the spatial dimension measures the age of a cow.
Let y denote a one-dimensional spatial coordinate, let t denote time, and let x(t, y) be a one-dimensional state variable. Let u(t, y) denote a control at (t, y) and let the state equation be

∂x/∂t = g(t, y, x, ∂x/∂y, u) (D.79)
for t ∈ [0, T ] and y ∈ [0, h]. We denote the region [0, T ] × [0, h] by D,
and we let its boundary ∂D be split into two parts Γ1 and Γ2 as shown
in Fig. D.2. The initial conditions will be stated on the part Γ1 of the
boundary ∂D as
x(0, y) = x0 (y) (D.80)
and
x(t, 0) = v(t). (D.81)
In Fig. D.2, (D.80) is the initial condition on the vertical portion of Γ1 ,
whereas (D.81) is that on the horizontal portion of Γ1 . More specifically,
in (D.80) the function x0 (y) gives the starting distribution of x with
respect to the spatial coordinate y. The function v(t) in (D.81) is an
exogenous breeding function of x at time t when y = 0, which in the
cattle ranching model mentioned above, measures the number of newly
born calves at time t. To be consistent we make the obvious assumption
that
x(0, 0) = x0 (0) = v(0). (D.82)

Figure D.2: Region D with boundaries Γ1 and Γ2

Let F (t, y, x, u) denote the profit rate when x(t, y) = x and u(t, y) =
u at a point (t, y) in D. Let Q(t) be the price of one unit of x(t, h) at
462 D. Special Topics in Optimal Control

time t and let S(y) be the salvage value of one unit of x(T, y) at time T.
Then the objective function is:

      max    J = ∫_0^T ∫_0^h F(t, y, x(t, y), u(t, y)) dy dt
    u(t,y)∈Ω
                 + ∫_0^T Q(t) x(t, h) dt + ∫_0^h S(y) x(T, y) dy,    (D.83)

where Ω is the set of allowable controls.
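In numerical work, one discretizes D and approximates the three integrals in (D.83) by quadrature. The following minimal sketch (Python with NumPy; not from the text) evaluates J with the trapezoidal rule on a uniform grid, assuming the state and control have already been computed as (nt, ny) arrays and that F, Q, and S are supplied as functions; all names here are illustrative placeholders.

    import numpy as np

    def trapz_1d(vals, step, axis=0):
        # Trapezoidal rule with uniform spacing along the given axis.
        v = np.moveaxis(vals, axis, 0)
        return step * (v.sum(axis=0) - 0.5 * (v[0] + v[-1]))

    def objective(x, u, F, Q, S, T, h):
        # x, u: (nt, ny) arrays holding x(t, y) and u(t, y) on the grid.
        nt, ny = x.shape
        t, y = np.linspace(0.0, T, nt), np.linspace(0.0, h, ny)
        dt, dy = T / (nt - 1), h / (ny - 1)
        tt, yy = np.meshgrid(t, y, indexing="ij")
        running = trapz_1d(trapz_1d(F(tt, yy, x, u), dy, axis=1), dt)
        terminal = trapz_1d(Q(t) * x[:, -1], dt)   # integral of Q(t) x(t,h) dt
        salvage = trapz_1d(S(y) * x[-1, :], dy)    # integral of S(y) x(T,y) dy
        return running + terminal + salvage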


We will formulate, without giving proofs, a procedure for solving the
problem in (D.79)–(D.83) by a distributed parameter maximum princi-
ple, which is analogous to the ordinary one. A more complete treatment
of this topic can be found in Sage (1968), Butkowskiy (1969), Ahmed
and Teo (1981), Tzafestas (1982b), Derzko et al. (1984), Brokate (1985),
and Veliov (2008).
In order to obtain necessary conditions for a maximum, we introduce
the Hamiltonian
H = F + λg,                                                       (D.84)
where the spatial adjoint function λ(t, y) satisfies

    ∂λ/∂t = −∂H/∂x + ∂/∂t (∂H/∂xt) + ∂/∂y (∂H/∂xy),               (D.85)

where xt = ∂x/∂t and xy = ∂x/∂y. The boundary conditions on λ are


stated for the Γ2 part of the boundary of D (see Fig. D.2) as follows:

λ(t, h) = Q(t) (D.86)

and
λ(T, y) = S(y). (D.87)
Once again we need a consistency requirement similar to (D.82). It is

λ(T, h) = Q(T ) = S(h), (D.88)

which gives the consistency requirement in the sense that the price and
the salvage value of a unit x(T, h) must agree.
We let u∗ (t, y) denote the optimal control at (t, y). Then the dis-
tributed parameter maximum principle requires that

H(t, y, x∗ , x∗t , x∗y , u∗ , λ) ≥ H(t, y, x∗ , x∗t , x∗y , u, λ) (D.89)

for all (t, y) ∈ D and all u ∈ Ω.
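Computationally, conditions (D.79)-(D.89) suggest the usual forward-backward sweep: integrate the state equation forward in t from the data on Γ1, integrate the adjoint equation (D.85) backward from the data on Γ2, and maximize the Hamiltonian pointwise over Ω on the grid. The sketch below (Python with NumPy) carries this out for one illustrative instance chosen by us, not taken from the text: g = −∂x/∂y + u (aging dynamics as in the cattle ranching example), F = px − cu², constant Q and S with Q(T) = S(h) as required by (D.88), and Ω = [0, u_max]. For this instance the adjoint does not depend on x or u, so a single backward and forward pass suffices; for a general g the sweep is iterated, updating u from (D.89), until the control converges.

    import numpy as np

    T, h = 1.0, 1.0
    nt, ny = 201, 201
    dt, dy = T / (nt - 1), h / (ny - 1)    # dt <= dy keeps upwinding stable
    p, c, Q0, S0, u_max = 1.0, 0.5, 2.0, 2.0, 1.0   # Q0 = S0 satisfies (D.88)

    # Adjoint (D.85) here reduces to dλ/dt + dλ/dy = -p, with λ(T, y) = S(y)
    # and λ(t, h) = Q(t) on Γ2; integrate backward in t.
    lam = np.empty((nt, ny))
    lam[-1, :] = S0
    for n in range(nt - 1, 0, -1):
        lam_y = (lam[n, 1:] - lam[n, :-1]) / dy
        lam[n - 1, :-1] = lam[n, :-1] + dt * (p + lam_y)
        lam[n - 1, -1] = Q0

    # Pointwise Hamiltonian maximization (D.89): H_u = λ - 2cu = 0 on [0, u_max].
    u_star = np.clip(lam / (2 * c), 0.0, u_max)

    # State (D.79): dx/dt = -dx/dy + u*, x(0, y) = x0(y), x(t, 0) = v(t);
    # integrate forward in t, upwinding from the y = 0 boundary.
    x = np.empty((nt, ny))
    x[0, :] = 1.0                          # x0(y), here constant
    for n in range(nt - 1):
        x_y = (x[n, 1:] - x[n, :-1]) / dy
        x[n + 1, 1:] = x[n, 1:] + dt * (-x_y + u_star[n, 1:])
        x[n + 1, 0] = 1.0                  # v(t), here constant

    print("lambda(0,0) =", lam[0, 0], "  u*(0,0) =", u_star[0, 0])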



We have stated only a simple form of the distributed parameter
maximum principle which is sufficient for most applications in management
science and economics, such as Derzko et al. (1980), Haurie et al. (1984),
Feichtinger et al. (2006a), and Kuhn et al. (2015). More general forms
of the maximum principle are available in the references cited earlier.
Among other things, these general forms allow for the function F in
(D.83) to contain arguments such as ∂x/∂y, ∂ 2 x/∂y 2 , etc. It is also pos-
sible to consider controls on the boundary. In this case v(t) in (D.81)
will become a control variable.

Exercises for Appendix D

E D.1 Consider the discrete-time dynamics

    x_{t+1} − x_t = a x_t + w_t,
                                                                  (D.90)
    y_t = h x_t + v_t,

where w_t and v_t are Gaussian purely random sequences with

    E[w_t] = E[v_t] = 0,   E[w_t w_τ] = q δ_{tτ},   E[v_t v_τ] = r δ_{tτ},

where h, q, and r are constants. The initial condition x0 is a Gaussian


random variable with mean μ and variance Σ0 . Use the Kalman filter
(D.10)–(D.14) to obtain the recursive equations

    x̂_{t+1} − x̂_t = a x̂_t + (P_{t+1} h/r)(y_{t+1} − h(a + 1) x̂_t),   x̂_0 = μ,

and

    P_{t+1} = r[(a + 1)² P_t + q] / (r + h²[(a + 1)² P_t + q]),   P_0 = r Σ_0/(r + Σ_0 h²).
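A quick way to check these recursions is to run them against a simulation of (D.90). A minimal sketch (Python with NumPy), with arbitrary illustrative parameter values:

    import numpy as np

    rng = np.random.default_rng(0)
    a, h, q, r = -0.5, 1.0, 0.04, 0.25     # illustrative values only
    mu, Sigma0 = 0.0, 1.0
    N = 50

    x = mu + np.sqrt(Sigma0) * rng.standard_normal()    # true initial state
    xhat, P = mu, r * Sigma0 / (r + Sigma0 * h**2)      # x̂_0 and P_0
    for t in range(N):
        x_next = x + a * x + np.sqrt(q) * rng.standard_normal()   # x_{t+1}
        y_next = h * x_next + np.sqrt(r) * rng.standard_normal()  # y_{t+1}
        # Variance recursion first, then the estimate update driven by y_{t+1}:
        P = r * ((1 + a)**2 * P + q) / (r + h**2 * ((1 + a)**2 * P + q))
        xhat = xhat + a * xhat + (P * h / r) * (y_next - h * (1 + a) * xhat)
        x = x_next

    print(f"true x = {x:.3f}, estimate = {xhat:.3f}, P = {P:.4f}")

The error variance P_t settles quickly to the fixed point of the second recursion, and the estimate x̂_t tracks the simulated state.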

E D.2 Consider the continuous-time dynamics of the simplest nontrivial
filter

    dX_t = √q dZ_t,             X_0 given,
                                                                  (D.91)
    dY_t = X_t dt + √r dξ_t,    Y_0 = 0,

where Z and ξ are standard Brownian motions, q and r are positive
constants, and X_0 is a Gaussian random variable with mean 0 and variance
Σ0 . Show that the Kalman-Bucy filter is given by

    dX̂_t = (P(t)/r)(dY_t − X̂_t dt),   X̂_0 = 0,

and

    P(t) = √(rq) (1 + b e^{−2αt})/(1 − b e^{−2αt}),

where

    α = √(q/r)   and   b = (Σ_0 − √(rq))/(Σ_0 + √(rq)).

Hint: In solving the Riccati equation for P(t), you will need the formula

    ∫ du/(u² − a²) = (1/(2a)) ln[(u − a)/(u + a)].
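The Riccati equation referred to in the hint works out here to the variance equation Ṗ = q − P²/r with P(0) = Σ_0, and one can confirm numerically that the stated P(t) solves it. A minimal sketch (Python with NumPy; the parameter values are arbitrary):

    import numpy as np

    q, r, Sigma0 = 0.09, 0.25, 1.0
    alpha = np.sqrt(q / r)
    b = (Sigma0 - np.sqrt(r * q)) / (Sigma0 + np.sqrt(r * q))

    def P_closed(t):
        # The closed form stated above.
        e = np.exp(-2 * alpha * t)
        return np.sqrt(r * q) * (1 + b * e) / (1 - b * e)

    # Forward-Euler integration of dP/dt = q - P**2 / r from P(0) = Sigma0.
    dt, T = 1e-4, 5.0
    P = Sigma0
    for _ in range(int(T / dt)):
        P += dt * (q - P**2 / r)

    print("Euler:", P, "  closed form:", P_closed(T))
    # Both approach the steady state sqrt(r*q) as T grows.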

E D.3 Let w(u) = u in Exercise 7.39. Analyze the various cases that
may arise in this problem from the viewpoint of obtaining the Sethi-Skiba
points.

E D.4 The economic growth model of Sect. 11.1.3 exhibits a Sethi-Skiba


point if we assume the production function f (k) to be convex initially
and then concave, i.e., f''(k) > 0 for k < k^s, f''(k) = 0 at k = k^s,
and f''(k) < 0 for k > k^s, for some k^s ∈ (0, ∞). Analyze this problem
with the additional mixed constraints 0 ≤ c ≤ f (k). See Skiba (1978)
and Feichtinger and Hartl (1986).
Appendix E

Answers to Selected
Exercises

Completely worked solutions to all exercises in this book are contained


in a forthcoming Teachers’ Manual, which will be made available to
instructors by the publisher when it is ready.
Chapter 1
1.1 (a) Feasible. J = −333,333.
1.3 J = 36.
1.5 (a) C = $157,861/year.
(b) J = 103.41 utils.
(c) $15,000/year.
1.6 (b) W (20) = 985,648; J = 104.34.
1.14 imp(G1 , G2 ; t) = (G1 − G2 )e−ρt .
Chapter 2
2.4 The optimal control is

    u∗(t) = 2 if 0 ≤ t < 2 − ln 2.5; undefined if t = 2 − ln 2.5;
            0 if t > 2 − ln 2.5.


2.17 u∗ = bang(0, 1; λ1 − λ2 ), where λ(t) = (8e−2(t−18) , 4e−2(t−18) ).

2.18 (a) x(100) = 30 − 20e−10 ≈ 30.


(b) u∗ = 3 for t ∈ [0, 100].


(c) u∗(t) = 3 for t ∈ [0, 100 − 10 ln 2]; 0 otherwise.

2.19 ẋ = f (x) + b(x)u, x(0) = x0 , x(T ) = 0.

u̇ = [b(x)² g′(x) − 2cu{b(x)f′(x) − b′(x)f(x)}]/[2cb(x)].

2.22 (a) u∗ = bang[0, 1; (g1 K1 + g2 K2 )(λ1 − λ2 )].


(c) t̂ = T − (1/g2 ) ln[(g2 b1 − g1 b2 )/(g2 − g1 )b2 ].

2.29 (a) C ∗ (t) = ρW0 e(r−ρ)t /(1 − e−ρT ).


(b) Ċ ∗ (t) = K(r − ρ).

2.30 (a) λ̇ = x + 3λx2 , λ(1) = 0, and ẋ = −x3 + λ, x(0) = 1.

Chapter 3

3.1 x − u1 ≥ 0, u1 − u2 ≥ 0, u1 ≥ 0, 1 + u2 ≥ 0.

3.2 X = [−1, 5].

3.9 L = F (x, u) + λf (x, u, t) + μg(x, u, t),

λ̇ = −(α̇/α)λ − ∂L/∂x, μ ≥ 0, μg = 0.

3.13 (a) λ(t) = 10[1 − e^{0.1(t−100)}],

    μ = 0 if K = 300;   μ = −10[1 − e^{0.1(K/3−100)}] if K < 300,

    u∗(t) = bang[0, 3; λ + μ].

The problem is infeasible for K > 300.



(b) t∗∗ = min[0, 100 − K/3],

    u∗(t) = 0 for t ≤ t∗∗; 3 for t > t∗∗.

3.17 λ(t) = t − 1.
3.23 11.87 min.
3.25 u∗ = −1, T ∗ = 5.
3.26 u∗ = −2, T ∗ = 5/2.
¯ P̄ , λ̄} = {I1 − ρ(S − P1 ), S, 2(S − P1 )}.
3.43 (a) {I,
(b) I = I1 .
Chapter 4
4.2 u∗ (t) = −1, μ1 = −λ = 1/2 − t, μ2 = η = 0.
4.3 One solution appears in Fig. 3.1. Another solution is u(t) = 1/2
for t ∈ [0, 2]. There are many others.
4.5 (a) u∗ = 0.


(c) u∗ = 1 for 0 ≤ t ≤ 1 − T; 0 for 1 − T < t ≤ T.

(e) J = −(1/8 + 1/(8K)).

(f) J = −1/8.
Chapter 5


5.1 (a) u∗(t) = 5 for t ≤ 1 + 6 ln 0.99 ≈ 0.94; 0 for t > 0.94.

(b) λ2(t)/λ1(t) = e^{3(t²−4t+1)/12},

    u∗(t) = −5 for 0 ≤ t ≤ 0.28; 0 for 0.28 < t ≤ 0.4;
            5 for 0.4 < t ≤ 0.93; 0 for 0.93 < t ≤ 1.0.


5.4 (b) f(t∗) = t∗ − 10 ln(1 − 0.3e^{0.1t∗}).
(c) t∗ = 1.969327, J(t∗ ) = 19.037.

5.8 u∗ = v ∗ = 0 for all t.

5.10 u∗ = 0, v ∗ = 4/5 for t ∈ [0, 49],

u∗ = 0, v ∗ = 0 for t ∈ [49, 60],

J ∗ = 34,420.

Chapter 6

6.4 Q(t) = t^4 − 160t^3 + 1740t^2 − 7360t + 9639.

6.9 v ∗ = sat[−V2 , V1 ; (λ2 − λ1 p)2βλ1 ].

6.10 v ∗ (t) ≈ 3e−3t , y ∗ (t) ≈ 1 − 3e−3t .

6.12 J ∗ = 10.56653.




6.14 u∗(t) = 0 for 0 ≤ t ≤ 7/3; 2 for 7/3 < t < 3;
             −1 for 3 ≤ t < 13/3; 0 for 13/3 ≤ t ≤ 6.


6.15 μ1 = −(5/2)t + 5/2 for t ∈ [0, 1]; 0 for t ∈ (1, 3].

     μ2 = 0 for t ∈ [0, 1.8); −(1/2)t + 3/2 for t ∈ [1.8, 3].

     η = 0 for t ∈ [0, 1) ∪ (1.8, 3]; −(5/2)t + 5/2 for t ∈ [1, 1.8).


6.16 (a) v∗(t) = −1 for t ∈ [0, 1.8); 1 for t ∈ (1.8, 3].

(b) v∗(t) = 1 for t ∈ [0, 10].




6.18 v∗(t) = −1 for t ∈ [0, 1/2]; 0 for t ∈ (1/2, 23/12];
             +1 for t ∈ (23/12, 29/12]; 0 for t ∈ (29/12, 4].


6.19 u∗(t) = 0 for 0 ≤ t ≤ t1; h(t − t1)/c for t1 < t ≤ T,

     where t1 = T − √(2BC/h).

Chapter 7

7.1 p∗ = 102.5 + 0.2G.

7.7 (ū)/(pS) = (δβ)/(η(ρ + δ)).

7.15 The reachable set is [x0 e^{−δT}, (x0 − x̄)e^{−(δ+rQ)T} + x̄],
     where x̄ = rQ/(W + rQ).

7.20 (b) t1 = [1/(rQ + δ)] ln(x0/xs),   t2 = [1/(rQ + δ)] ln[(x̄ − xs)/(x̄ − xT)].
7.21 T ≥ [1/(rQ + δ)] ln{[rQ(1 − x0) − δx0]/[rQ(1 − xs) − δxs]} + (1/δ) ln(xs/xT).
7.28 imp(A, B; t) = −(1/r) ln[(1 − B)/(1 − A)].
.

7.29 (b) J = 0.6325.

7.35 The equations corresponding to (6.28) and (6.29) can be obtained


by replacing ρ by ρ + ṙ/r. The form of (6.30) remains unchanged.

Chapter 8

8.1 (a) y = 1, z = 3.
(b) y = 2, z = 10.

8.2 (a) (1,3) is a relative maximum.


(b) (2,10) is a relative maximum.

8.3 x = 50; x = 80.

8.6 (a) x = 4 is a local maximum.


(b) x = 8 is a local maximum and x = 20 is a local and a global
maximum.

8.7 (a) (0, 0) is the nearest point.


(b) (1/2, 1/2) is the nearest point.
8.8 (1/√5, 2/√5) is the closest point.

8.9 (a) (2√2, 0).
(b) (0, 2).
(c) (0, 2).

8.13 λ_i^T = ∂F/∂x_i^T for i = 1, 2, . . . , n; λ_{n+1}^T = 1. Note that here T
denotes the terminal time, and not the transpose operation.




8.17 u^{k∗} = +1 if λ^{k+1}b > 1; −1 if λ^{k+1}b < −1; 0 if |λ^{k+1}b| < 1,
     where λ^k = (I + A)^{T−k} λ^T.

Chapter 9

9.2 ts = 5.25, T = 11.

9.4 T = ts = 2.47.

9.5 ts = 0, T = 30.
9.7 u∗(t) = sat[0, 1; u0(t)], where u0(t) = [2 − e^{0.05(t−34.8)}]²/(1 + t),
    t1 ≈ 3; t2 − T = 34.8.

Chapter 10

10.4 x̄ = 0.734.

10.5 (a) x̄ = (X/4)[ (1 − ρ/r + c/(Xp)) + √{(1 − ρ/r + c/(Xp))² + 8cρ/(prX)} ].

(b) For ρ = 0, x̄ = 220,000. For ρ = 0.1, x̄ = 86,000. For ρ = ∞,


x̄ = 40,000.

10.7 [g′(x) − ρ][p − c(x)] − c′(x)g(x) = 0.

10.9 [g′(x) − ρ][p − c(x)] − c′(x)g(x) + ṗ = 0.

Chapter 11

11.1 λ(t) = λ0 e^{(ρ−β)t}, where

     λ0 = [K0 e^{βT} + C̄(1 − e^{βT})/β − K_T](2ρ − β)/(e^{βT} − e^{2(β−ρ)T}),

     K(t) = K0 e^{βt} + (C̄/β)(1 − e^{βt}) − (λ0/(β − 2ρ))(e^{2(β−ρ)t} − e^{βt}).

Chapter 12
12.5 q∗(x) = (α − r)/[(1 − β)σ²],   c∗(x) = [1/(1 − β)][ρ − rβ − γβ/(1 − β)] x,

     V(x) = [(1 − β)/(ρ − rβ − γβ/(1 − β))]^{1−β} x^β,   x ≥ 0.

Chapter 13

13.2 u∗(t) = 1 + λL(t), v∗(t) = 1 + λF(t), where λL and λF are the
solution of the linear differential equation

    ⎛ ẋ  ⎞   ⎛ 0  1  1  0 ⎞ ⎛ x  ⎞   ⎛ 2 ⎞
    ⎜ λ̇F ⎟   ⎜ 1  0  0  0 ⎟ ⎜ λF ⎟   ⎜ 0 ⎟
    ⎜ λ̇L ⎟ = ⎜ 1  0  0 −1 ⎟ ⎜ λL ⎟ + ⎜ 0 ⎟
    ⎝ μ̇  ⎠   ⎝ 0  0 −1  0 ⎠ ⎝ μ  ⎠   ⎝ 0 ⎠

with the boundary conditions

    x(0) = x0,  λF(T) = 0,  λL(T) = 0,  and  μ(0) = 0.


Bibliography
Abad PL (1982a) An optimal control approach to marketing-production plan-
ning. Optimal Control Appl Methods 3(1):1–13

Abad PL (1982b) Approach to decentralized marketing-production planning.


Int J Syst Sci 13(3):227–235

Abad PL (1987) A hierarchical optimal control model for coordination of func-


tional decisions in a firm. Eur J Oper Res 32:62–75

Abad PL (1989) Multi-product multi-market model for coordination of


marketing-production decisions. Int J Syst Sci 20:2011–2027

Abad PL, Sweeney DJ (1982) Decentralized planning with an interdependent


marketing-production system. Omega 10:353–359

Agnew CE (1976) Dynamic modeling and control of congestion-prone systems.


Oper Res 24:400–419

Ahmed NU, Teo KL (1981) Optimal control of distributed parameter systems.


Elsevier Science LTD, Amsterdam

Alam M, Sarma VVS (1974) Optimal maintenance policy for equipment subject
to deterioration and random failure. IEEE Trans Syst Man Cybern SMC-
4:172–175

Alam M, Sarma VVS (1977) An application of optimal control theory to repair-
man problem with machine interference. IEEE Trans Reliab R-26:121–124

Alam M, Lynn JW, Sarma VVS (1976) Optimal maintenance policy for equip-
ment subject to random deterioration and random failure. Int J Syst Sci
7:1071–1080

Allen KR (1973) Analysis of the stock-recruitment relation in Antarctic fin


whales. In: Parrish B (ed) Fish stock and recruitment, J. du Conseil In-
ternational pour l’Exploration de la Mer, Rapports et Procés-Verbaux de
Réunions, vol 164, pp 132–137


Amit R (1986) Petroleum reservoir exploitation: switching from primary to


secondary recovery. Oper Res 34(4):534–549

Amit R, Ilan Y (1990) The choice of manufacturing technology in the presence


of dynamic demand and experience effects. IIE Trans 22(2):100–111

Anderson RM, May RM (1992) Infectious diseases of humans: dynamics and


control. Oxford University Press, Oxford

Anderson BDO, Moore JB (1979) Optimal filtering. Prentice-Hall, New York

Anderson BDO, Moore JB (1990) Optimal control. Linear quadratic methods.


Prentice-Hall, Englewood Cliffs

Aoki M (1976) Dynamic economics: a system theoretic approach to theory and


control. Elsevier, New York

Arnold L (1974) Stochastic differential equations: theory and applications. Wi-


ley, New York

Aronson JE, Thompson GL (1984) A survey on forward methods in mathemat-


ical programming. J Large Scale Syst 7:1–16

Arora SR, Lele PT (1970) A note on optimal maintenance policy and sale date
of a machine. Manag Sci 17:170–173

Arrow KJ (1968) Applications of control theory to economic growth. In: Dantzig


GB, Veinott AF (eds) Mathematics of the decision sciences. Lectures in ap-
plied mathematics, vol 12. American Mathematical Society, Providence, pp
85–119

Arrow KJ, Chang S (1980) Optimal pricing, use, and exploration of uncertain
natural resource stocks. In: Liu PT (ed) Dynamic optimization in mathe-
matical economics. Plenum Press, New York

Arrow KJ, Kurz M (1970) Public investment, the rate of return, and optimal
fiscal policy. The John Hopkins Press, Baltimore

Arrow KJ, Bensoussan A, Feng Q, Sethi SP (2007) Optimal savings and the
value of population. Proc Natl Acad Sci 104(47):18421–18426

Arrow KJ, Bensoussan A, Feng Q, Sethi SP (2010) The genuine savings cri-
terion and the value of population in an economy with endogenous fertility
rate. In: Boucekkine R, Hritonenko N, Yatsenko Y (eds) Optimal control
of age-structured population in economy, demography, and the environment.
Routledge explorations in environmental economics. Routledge, New York,
pp 20–44

Arthur WB, McNicoll G (1977) Optimal time paths with age dependence: a
theory of population policy. Rev Econ Stud 44:111–123

Arutyunov AV, Aseev SM (1997) Investigation of the degeneracy phenomenon


of the maximum principle for optimal control problems with state constraints.
SIAM J Control Optim 35(3):930–952

Aubin J-P, Cellina A (1984) Differential inclusions: set-valued maps and via-
bility theory. Springer, Berlin

Axsäter S (1985) Control theory concepts in production and inventory control.


Int J Syst Sci 16:161–169

Bagchi A (1984) Stackelberg differential games in economic models. Lecture


notes in control and information sciences, vol 64. Springer, Berlin

Basar T (1986) A tutorial on dynamic and differential games. In: Basar T (ed)
Dynamic games and applications in economics. Springer, Berlin, pp 1–25

Basar T, Olsder GJ (1999) Dynamic noncooperative game theory, 2nd edn.


Society for Industrial and Applied Mathematics, Philadelphia

Basar T, Bensoussan A, Sethi SP (2010) Differential games with mixed leader-


ship: the open-loop solution. Appl Math Comput 217:972–979

Bass FM (1969) A new product growth model for consumer durables. Manag
Sci 15(5):215–227

Bass FM, Bultez AV (1982) A note on optimal strategic pricing of technological
innovations. Market Sci 1:371–378

Bass FM, Krishnamoorthy A, Prasad A, Sethi SP (2005a) Advertising compe-


tition with market expansion for finite horizon firms. J Ind Manag Optim
1(1):1–19

Bass FM, Krishnamoorthy A, Prasad A, Sethi SP (2005b) Generic and brand


advertising strategies in a dynamic duopoly. Market Sci 24(4):556–568

Bean JC, Smith RL (1984) Conditions for the existence of planning horizons.
Math Oper Res 9(3):391–401

Behrens DA, Caulkins JP, Tragler G, Feichtinger G (2000) Optimal control of


drug epidemics: prevent and treat - but not at the same time? Manag Sci
46:333–347

Behrens DA, Caulkins JP, Tragler G, Feichtinger G (2002). Why present-


oriented societies undergo cycles of drug epidemics. J Econ Dyn Control
26:919–936

Bell DJ, Jacobson DH (1975) Singular optimal control. Academic Press, New
York

Bellman RE (1957) Dynamic programming. Princeton University Press, Prince-


ton

Bellman RE, Kalaba RE (1965a) Quasilinearization and boundary value prob-


lems. Elsevier, New York

Bellman RE, Kalaba RE (1965b) Dynamic programming and modern control


theory. Academic Press, New York

Benchekroun H, Martı́n-Herrán G, Taboubi S (2009) Could myopic pricing be


a strategic choice in marketing channels? A game theoretic analysis. J Econ
Dyn Control 33:1699–1718

Benkherouf L, Sethi SP (2010) Optimality of (s,S) policies for a stochastic


inventory model with proportional and lump-sum shortage costs. Oper Res
Lett 38:252–255

Bensoussan A (2004) Stochastic control of partially observable systems. Cam-


bridge University Press, Cambridge

Bensoussan A (2011) Dynamic programming and inventory control. IOS Press,


Amsterdam

Bensoussan A (2018) Estimation and control of dynamical systems. Springer,


New York

Bensoussan A, Lesourne J (1980) Optimal growth of a self-financing firm in an


uncertain environment. In: Bensoussan A et al (eds) Applied stochastic con-
trol in econometrics and management science. North-Holland, Amsterdam,
pp 235–269

Bensoussan A, Lesourne J (1981) Growth of firms: a stochastic control theory


approach. In: Brockhoff K, Krelle W (eds) Unternehmensplanung. Springer,
Berlin, pp 101–116

Bensoussan A, Lions JL (1975) Nouvelles méthodes en controle impulsionnel.


Appl Math Optim 1:289–312

Bensoussan A, Lions JL (1982) Application of variational inequalities in stochas-


tic control. North-Holland, Amsterdam

Bensoussan A, Lions JL (1984) Impulse control and quasi-variational inequali-


ties. Bordas, Paris

Bensoussan A, Sethi SP (2007) The machine maintenance and sale age model
of Kamien and Schwartz revisited. Manag Sci 53(12):1964–1976

Bensoussan A, Hurst EG Jr, Näslund B (1974) Management applications of


modern control theory. Elsevier, New York

Bensoussan A, Nissen G, Tapiero CS (1975) Optimum inventory and product


quality control with deterministic and stochastic deterioration - an applica-
tion of distributed parameter control systems. IEEE Trans Autom Control
AC-20:407–412

Bensoussan A, Bultez AV, Naert PA (1978) Leader’s dynamic marketing be-


havior in oligopoly. In: Bensoussan A et al (eds) TIMS studies in the man-
agement sciences, vol 9. North-Holland, Amsterdam, pp 123–145

Bensoussan A, Crouhy M, Proth J-M (1983) Mathematical theory of production


planning. North-Holland, Amsterdam

Bensoussan A, Sethi SP, Vickson RG, Derzko NA (1984) Stochastic production


planning with production constraints. SIAM J Control Optim 22(6):920–935

Bensoussan A, Liu RH, Sethi SP (2005a) Optimality of an (s, S) policy with


compound Poisson and diffusion demands: a quasi-variational inequalities
approach. SIAM J Control Optim 44(5):1650–1676

Bensoussan A, Çakanyildirim M, Sethi SP (2005b) On the optimal control of


partially observed inventory systems. C R Acad Sci Paris Ser I 341:419–426

Bensoussan A, Da Prato G, Delfour MC, Mitter SK (2007a) Representation and


control of infinite dimensional systems. In: Basar T (ed) Systems & control:
foundations & applications, 2nd edn. Birkhäuser, Boston

Bensoussan A, Çakanyildirim M, Sethi SP (2007b) Optimal ordering policies


for inventory problems with dynamic information delays. Prod Oper Manag
16(2):241–256

Bensoussan A, Çakanyildirim M, Sethi SP (2007c) A multiperiod newsvendor


problem with partially observed demand. Math Oper Res 32(2):322–344

Bensoussan A, Çakanyildirim M, Sethi SP (2007d) Partially observed inventory


systems: the case of zero-balance walk. SIAM J Control Optim 46(1):176–209

Bensoussan A, Çakanyildirim M, Sethi SP (2007e) Economic evaluation of sys-


tems that expedite inventory information. Prod Oper Manag 16(3):360–368

Bensoussan A, Çakanyildirim M, Minjarez-Sosa JA, Royal A, Sethi SP (2008a)


Inventory problems with partially observed demands and lost sales. J Optim
Theory Appl 136(3):321–340

Bensoussan A, Çakanyildirim M, Minjarez-Sosa JA, Sethi SP, Shi R (2008b)


Partially observed inventory systems: the case of rain checks. SIAM J Control
Optim 47(5):2490–2519

Bensoussan A, Çakanyildirim M, Feng Q, Sethi SP (2009a) Optimal ordering


policies for stochastic inventory problems with observed information delays.
Prod Oper Manag 18(5):546–559

Bensoussan A, Çakanyildirim M, Royal A, Sethi SP (2009b) Bayesian and adap-


tive controls for a newsvendor facing exponential demand. Risk Decis Anal
1(4):197–210

Bensoussan A, Çakanyildirim M, Sethi SP (2009c) A note on ‘the censored


newsvendor and the optimal acquisition of information’. Oper Res 57(3):791–
794

Bensoussan A, Keppo J, Sethi SP (2009d) Optimal consumption and portfolio


decisions with partially observable real prices. Math Financ 19(2):215–236

Bensoussan A, Sethi SP, Chutani A (2009e) Optimal cash management under


uncertainty. Oper Res Lett 37:425–429

Bensoussan A, Çakanyildirim M, Sethi SP (2010) Filtering for discrete-time


Markov processes and applications to inventory control with incomplete in-
formation. In: Crisan D, Rozovsky B (eds) Handbook on nonlinear filtering.
Oxford University Press, Oxford, pp 500–525

Bensoussan A, Çakanyildirim M, Sethi SP, Shi R (2011a) Computation of ap-


proximate optimal policies in partially observed inventory model with rain
checks. Automatica 47:1589–1604

Bensoussan A, Çakanyildirim M, Sethi SP, Wang M, Zhang H (2011b) Average


cost optimality in inventory models with dynamic information delays. IEEE
Trans Autom Control 56(12):2869–2882

Bensoussan A, Long H, Perera S, Sethi SP (2012) Impulse control with random


reaction periods: a central bank intervention problem. Oper Res Lett 40:425–
430

Bensoussan A, Chen S, Sethi SP (2014) Feedback Stackelberg solutions of


infinite-horizon stochastic differential games. In: Ouardighi FE, Kogan K
(eds) Models and methods in economics and management science, essays
in honor of Charles S. Tapiero, Series 6161, vol 198. Springer International
Publishing, Cham, pp 3–15

Bensoussan A, Chen S, Sethi SP (2015a) The maximum principle for global


solutions of stochastic Stackelberg differential games. SIAM J Control Optim
53(4):1956–1981

Bensoussan A, Feng Q, Sethi SP (2015b) Integrating equipment investment


strategy with maintenance operations under uncertain failures. Ann Oper
Res 8:1–34

Bensoussan A, Çakanyildirim M, Li M, Sethi SP (2016) Managing inventory


with cash register information: sales recorded but not demands. Prod Oper
Manag 25(1):9–21

Bensoussan A, Chen S, Chutani A, Sethi SP (2018) Feedback Stackelberg-Nash


equilibria in mixed leadership games with an application to cooperative ad-
vertising. Operations research, Working Paper, University of Texas at Dallas

Berkovitz LD (1961) Variational methods in problems of control and program-


ming. J Math Anal Appl 3:145–169

Berkovitz LD (1994) A theory of differential games. In: Basar T, Haurie A (eds)


Advances in dynamic games and applications. Birkhäuser, Boston, pp 3–22

Berkovitz LD, Dreyfus SE (1965) The equivalence of some necessary conditions


for optimal control in problems with bounded state variables. J Math Anal
Appl 10:275–283

Bertsekas DP, Shreve SE (1996) Stochastic optimal control: the discrete-time


case. Athena Scientific, New York

Bes C, Sethi SP (1988) Concepts of forecast and decision horizons: applications


to dynamic stochastic optimization problems. Math Oper Res 13(2):295–310

Bes C, Sethi SP (1989) Solution of a class of stochastic linear-convex control


problems using deterministic equivalents. J Optim Theory Appl 62(1):17–27

Bettiol P, Vinter RB (2010) Sensitivity interpretations of the costate variable


for optimal control problems with state constraints. SIAM J Control Optim
48(5):3297–3317

Beyer D, Sethi SP (1998) A proof of the EOQ formula using quasi-variational


inequalities. Int J Syst Sci 29(11):1295–1299

Beyer D, Cheng F, Sethi SP, Taksar MI (2010) Markovian Demand Inventory


Models. Springer, New York

Bhar R (2010) Stochastic filtering with applications in finance. World Scientific,


Singapore

Bhaskaran S, Sethi SP (1981) Planning horizons for the wheat trading model.
In: Proceedings of AMS 81 conference, 5: life, men, and societies, pp 197–201

Bhaskaran S, Sethi SP (1983) Planning horizon research - the dynamic pro-


gramming/control theory interface. In: Proceedings of international AMSE
winter symposium, Bermuda, pp 155–160

Bhaskaran S, Sethi SP (1987) Decision and forecast horizons in a stochastic


environment: a survey. Optimal Control Appl Methods 8:201–217

Bhaskaran S, Sethi SP (1988) The dynamic lot size model with stochastic de-
mands: a planning horizon study. Inf Syst Oper Res 26(3):213–224

Black F, Scholes M (1973) The pricing of options and corporate liabilities. J


Polit Econ 81:637–659

Blaquière A (1979) Necessary and sufficiency conditions for optimal strategies


in impulsive control. In: Liu P-T, Roxin EO (eds) Differential games and
control theory III, part A. M. Dekker, New York, pp 1–28

Blaquière A (1985) Impulsive optimal control with finite or infinite time horizon.
J Optim Theory Appl 46:431–439

Boccia A, De Pinho MDR, Vinter RB (2016) Optimal control problems with


mixed and pure state constraints. SIAM J Control Optim 54(6):3061–3083

Boiteux M (1955) Réflexions sur la concurrence du rail et de la route,


le déclassement des lignes non rentables et le déficit du chemin de fer.
L’Économie Électrique, vol 2

Bolton P, Dewatripont M (2005) Contract theory. MIT Press, Cambridge

Boltyanskii VG (1971) Mathematical methods of optimal control. Holt, Rine-


hard & Winston, New York

Bookbinder JH, Sethi SP (1980) The dynamic transportation problem: a survey.


Naval Res Logist Quart 27:65–87

Bourguignon F, Sethi SP (1981) Dynamic optimal pricing and (possibly) adver-


tising in the face of various kinds of potential entrants. J Econ Dyn Control
3:119–140

Bowes MD, Krutilla JV (1985) Multiple use management of public forestlands.


In: Kneese AV, Sweeney JL (eds) Handbook of natural resource and energy
economics, vol 2, 1st edn, chap 12. Elsevier, London, pp 531–569

Breakwell JV (1968) Stochastic optimization problems in space guidance. In:


Karreman HF (ed) Stochastic optimization and control. Wiley, New York,
pp 91–100

Brekke KA, Øksendal BK (1994) Optimal switching in an economic activity


under uncertainty. SIAM J Control Optim 32(4):1021–1036

Breton M, Jarrar R, Zaccour G (2006) A note on feedback Stackelberg equilibria


in a Lanchester model with empirical application. Manag Sci 52(5):804–811

Brito DL, Oakland WH (1977) Some properties of the optimal income tax. Int
Econ Rev 18:407–423

Brokate M (1985) Pontryagin’s Principle for control problems in age-dependent


population dynamics. J Math Biol 23(1):75–101

Brown RG (1959) Statistical forecasting for inventory control. McGraw-Hill


Book, New York

Brown RG (1963) Smoothing, forecasting and prediction. Prentice-Hall, Engle-


wood Cliffs

Bryant GF, Mayne DQ (1974) The maximum principle. Int J Control 20:1021–
1054

Bryson AE Jr (1998) Dynamic optimization. Addison-Wesley, Reading

Bryson AE Jr, Ho Y-C (1975) Applied optimal control. Optimization, estima-


tion and control. Taylor & Francis, New York

Buchanan LF, Norton FE (1971) Optimal control applications in economic sys-


tems. In: Leondes CT (ed) Advances in control systems, vol 8. Academic
Press, New York, pp 141–187

Bulirsch R, Kraft D (eds) (1994) Computational optimal control. Birkhäuser-


Verlag, Boston

Bulirsch R, Oettli W, Stoer J (eds) (1975) Optimization and optimal control.


Lecture notes in mathematics, vol 477. Springer, Berlin

Bultez AV, Naert PA (1979) Does lag structure really matter in optimizing
advertising spending. Manag Sci 25(5):454–465

Bultez AV, Naert PA (1988) When does lag structure really matter...indeed?
Manag Sci 34(7):909–916

Burdet CA, Sethi SP (1976) On the maximum principle for a class of discrete
dynamical system with lags. J Optim Theory Appl 19:445–454

Burmeister E, Dobell AR (1970) Mathematical theories of economic growth.


MacMillan, London

Butkowskiy AG (1969) Distributed control systems. Elsevier, New York

Bylka S, Sethi SP (1992) Existence of solution and forecast horizons in dynamic


lot size model with nondecreasing holding costs. Prod Oper Manag 1(2):212–
224

Bylka S, Sethi SP, Sorger G (1992) Minimal forecast horizons in equipment


replacement models with multiple technologies and general switching costs.
Naval Res Logist 39:487–507

Caines P, Sethi SP, Brotherton T (1977) Impulse response identification and


casualty detection for the Lydia Pinkham data. Ann Econ Soc Meas 6(2):147–
163

Canon MD, Cullum CD, Polak E (1970) Theory of optimal control and math-
ematical programming. McGraw-Hill, New York

Caputo MR (2005) Foundations of dynamic economic analysis: optimal control


theory and applications. Cambridge University Press, Cambridge

Carlson DA (1986a) On the existence of catching up optimal solutions for


Lagrange problems defined on unbounded intervals. J Optim Theory Appl
49(2):207–225

Carlson DA (1986b) The existence of finitely optimal solutions for infinite hori-
zon optimal control problems. J Optim Theory Appl 51(1):41–62

Carlson DA (1987a) On the existence of sporadically catching up optimal so-


lutions for infinite horizon optimal control problems. J Optim Theory Appl
53(2):219–235

Carlson DA (1987b) An elementary proof of the maximum principle for optimal


control problems governed by a Volterra integral equation. J Optim Theory
Appl 54(1):43–61

Carlson DA (1988) Sufficient conditions for optimality and supported trajec-


tories for optimal control problems governed by Volterra integral equations.
In: Advances in optimization and control. Lecture notes in economics and
mathematical systems, vol 302. Springer, New York, pp 274–282

Carlson DA (1989) Some concepts of optimality for infinite horizon control


and their interrelationships. In: Roxin EO (ed) Modern optimal control;
a conference in honor of Solomon Lefschetz and Joseph P. LaSalle. Marcel
Dekker, New York, pp 13–22

Carlson DA (1990a) Uniform overtaking and weakly overtaking optimal so-


lutions in optimal control: when optimal solutions are agreeable. J Optim
Theory Appl 64(1):55–69

Carlson DA (1990b) The existence of catching-up optimal solutions for a class


of infinite horizon optimal control problems with time delay. SIAM J Control
Optim 28(2):402–422

Carlson DA (1990c) Infinite horizon optimal controls for problems governed


by a Volterra integral equation with a state and control dependent discount
factor. J Optim Theory Appl 66(2):311–336

Carlson DA (1991) Asymptotic stability for optimal trajectories of infinite hori-


zon optimal control models with state and control dependent discounting. In:
Proceedings of the 4th annual workshop in analysis and its applications, pp
281–299

Carlson DA (1993) Nonconvex and relaxed infinite horizon optimal control prob-
lems. J Optim Theory Appl 78(3):465–491

Carlson DA (1995) An existence theorem for hereditary Lagrange problems on


an unbounded interval. In: Boundary value problems for functional differen-
tial equation. World Scientific Publishing, Singapore, pp 73–83

Carlson DA (1997) Overtaking optimal solutions for convex Lagrange problems


with time delay. J Math Appl 208:31–48

Carlson DA, Haurie A (1987a) Infinite horizon optimal control theory and ap-
plications. Lecture Notes in Economics and Mathematical Systems, vol 290.
Springer, New York

Carlson DA, Haurie A (1987b) Optimization with unbounded time interval for
a class of non linear systems. Springer, Berlin

Carlson DA, Haurie A (1992) Control theoretic models of environment-economy


interactions. In: The proceedings of the 31st IEEE conference on decision and
control, pp 2860–2861

Carlson DA, Haurie A (1995) A turnpike theory for infinite horizon open-loop
differential games with decoupled dynamics, In: New trends in dynamic
games and applications, annals of the international society of dynamic games,
vol 3. Birkhäuser, Boston, pp 353–376

Carlson DA, Haurie A (1996) A turnpike theory for infinite horizon competitive
processes. SIAM J Optim Control 34(4):1405–1419

Carlson DA, Haurie A, Jabrane A (1987) Existence of overtaking solutions to


infinite dimensional control problems on unbounded time intervals. SIAM J
Control Optim 25(6):1517–1541

Carlson DA, Haurie A, Leizarowitz A (1991) Infinite horizon optimal control:


deterministic and stochastic systems, 2nd edn. Springer, New York

Carlson DA, Haurie A, Leizarowitz A (1994) Equilibrium points for linear-


quadratic infinite horizon differential games. In: Basar T, Haurie A (eds)
Advances in dynamic games and applications, annals of the international
society of dynamic games, vol 1. Birkhäuser, Boston, pp 247–268

Carraro C, Filar J (eds) (1995) Control and game theoretic models of the envi-
ronment. Birkhäuser, Boston

Carrillo J, Gaimon C (2000) Improving manufacturing performance through


process change and knowledge creation. Manag Sci 46(2):265–288

Case JH (1979) Economics and the competitive process. New York University
Press, New York

Cass D, Shell K (1976) The Hamiltonian approach to dynamic economics. Aca-


demic Press, New York

Caulkins JP, Feichtinger G, Haunschmied JL, Tragler G (2006) Quality cycles


and the strategic manipulation of value. Oper Res 54:666–677

Caulkins JP, Feichtinger G, Grass D, Tragler G, (2008) Optimizing counter-


terror operations: should one fight with ‘fire’ or ‘water’ ? Comput Oper Res
35:1874–1885

Caulkins JP, Feichtinger G, Grass D, Tragler G (2009) Optimal control of ter-


rorism and global reputation: a case study with novel threshold behaviour.
Oper Res Lett 37(6):387–391

Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Seidl A (2011) Op-
timal pricing of a conspicuous product in a recession that freezes capital
markets. J Econ Dyn Control 35(1):163–174

Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Seidl A (2013a) When
to make proprietary software open source. J Econ Dyn Control 37(6):1182–
1194

Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Novak AJ, Seidl A
(2013b) Leading bureaucracies to the tipping point: an alternative model of
multiple stable equilibrium levels of corruption. Eur J Oper Res 225(3):541–
546

Caulkins JP, Feichtinger G, Hartl RF, Kort PM, Novak AJ, Seidl A (2013c)
Multiple equilibria and indifference-threshold points in a rational addiction
model. Central Eur J Oper Res 21(3):507–522

Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Novak AJ, Seidl A,
Wirl F (2014) A dynamic analysis of Schelling’s binary corruption model: a
competitive equilibrium approach. J Optim Theory Appl 161(2):608–625

Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Seidl A (2015a) Skiba
points in free end-time problems. J Econ Dyn Control 51:404–419

Caulkins JP, Feichtinger G, Seidl A, Grass D, Hartl RF, Kort PM (2015b)


Capital stock management during a recession that freezes credit markets. J.
Econ Behav Organ 116:1–14

Caulkins JP, Feichtinger G, Grass D, Hartl RF, Kort PM, Seidl A (2017) In-
teraction of pricing, advertising and experience quality: a dynamic analysis.
Eur J Oper Res 256(3):877–885

Cernea A, Frankowska H (2005) A connection between the maximum principle


and dynamic programming for constrained control problems. SIAM J Control
Optim 44:673–703

Cesari L (1983) Optimization - theory and applications: problems with ordinary


differential equations. Springer, New York

Chahim M, Hartl RF, Kort PM (2012) A tutorial on the deterministic impulse


control maximum principle: necessary and sufficient optimality conditions.
Eur J Oper Res 219(1):18–26

Chand S, Sethi SP (1982) Planning horizon procedures for machine replace-


ment models with several possible replacement alternatives. Naval Res Logist
Quart 29(3):483–493

Chand S, Sethi SP (1983) Finite production rate inventory models with first
and second shift setups. Naval Res Logist Quart 30:401–414

Chand S, Sethi SP (1990) A dynamic lot size model with learning in setups.
Oper Res 38(4):644–655

Chand S, Sethi SP, Proth JM (1990) Existence of forecast horizons in undis-


counted discrete time lot size models. Oper Res 38(5):884–892

Chand S, Sethi SP, Sorger G (1992) Forecast horizons in the discounted dynamic
lot size model. Manag Sci 38(7):1034–1048

Chand S, Moskowitz H, Novak AJ, Rekhi I, Sorger G (1996) Capacity allocation


for dynamic process improvement with quality and demand considerations.
Oper Res 44:964–975

Chand S, Hsu VN, Sethi SP (2002) Forecast, solution and rolling horizons in
operations management problems: a classified bibliography. Manufact Service
Oper Manag 4(1):25–43

Chao R, Kavadias S, Gaimon C (2009) Revenue driven resource allocation and


effective NPD portfolio management. Manag Sci 55(9):1556–1569

Chappell D, Dury K (1994) On the optimal depletion of a nonrenewable natural


resource under conditions of increasing marginal extraction costs. SIAM Rev
36(1):102–106

Charnes A, Kortanek K (1966) A note on the discrete maximum principle and


distribution problems. J Math Phys 45:121–126

Chen SF, Leitmann G (1980) Labour-management bargaining modelled as a


dynamic game. Optimal Control Appl Methods 1:11–25

Chiarella C, Kemp MC, Long NV, Okuguchi K (1984) On the economics of


international fisheries. Int Econ Rev 25:85–92

Chichilinsky G (1981) Existence and characterization of optimal growth paths


including models with non-convexities in utilities and technologies. Rev Econ
Stud 48:51–61

Chintagunta PK (1993) Investigating the sensitivity of equilibrium profits to


advertising dynamics and competitive effects. Manag Sci 39(9):1146–1162

Chintagunta PK, Jain D (1992) A dynamic model of channel member strategies


for marketing expenditures. Market Sci 11(2):168–188

Chintagunta PK, Jain D (1994) A study of manufacturer-retailer marketing


strategies: a differential game approach. Lecture notes in control and infor-
mation sciences. Springer, New York

Chintagunta PK, Jain D (1995) Dynamic duopoly models of advertising compe-


tition: estimation and a specification tests. J Econ Manag Strateg 4(1):109–
131

Chintagunta PK, Vilcassim NJ (1992) An empirical investigation of advertising


strategies in a dynamic duopoly. Manag Sci 38(9):1230–1244

Chintagunta PK, Vilcassim NJ (1994) Marketing investment decisions in a dy-


namic duopoly: a model and empirical analysis. Int J Res Market 11(3):287–
306

Chow GC (1975) Analysis and control of dynamic economic systems. Wiley,


New York

Chutani A, Sethi SP (2012a) Cooperative advertising in a dynamic retail market


oligopoly. Dyn Games Appl 2(4):347–375

Chutani A, Sethi SP (2012b) Optimal advertising and pricing in a dynamic


durable goods supply chain. J Optim Theory Appl 154(2):615–643

Clark CW (1973) The economics of overexploitation. Science 181:630–634

Clark CW (1976) Mathematical bioeconomics: the optimal management of


renewal resources. Wiley, New York

Clark CW (1979) Mathematical models in the economics of renewable resources.


SIAM Rev 21:81–99

Clark CW (1985) Bioeconomic modelling and fisheries management. Wiley, New


York

Clarke FH (1983) Optimization and nonsmooth analysis. Wiley, New York

Clarke FH (1989) Methods of dynamic and nonsmooth optimization. Society


for Industrial and Applied Mathematics, Philadelphia

Clarke FH, Darrough MN, Heineke JM (1982) Optimal pricing policy in the
presence of experience effects. J Bus 55:517–530

Clemhout S, Wan HY Jr (1985) Dynamic common property resources and en-


vironmental problems. J Optim Theory Appl 46:471–481

Coddington EA, Levinson NL (1955) Theory of ordinary differential equations.


McGraw-Hill, New York

Cohen KJ, Cyert RM (1965) Theory of the firm: resource allocation in a market
economy. Prentice-Hall, Englewood Cliffs

Connors MM, Teichroew D (1967) Optimal control of dynamic operations re-


search models. International Textbook, Scranton

Conrad K (1982) Advertising, quality and informationally consistent prices. Z


Staatswiss 138:680–694

Conrad K (1985) Quality, advertising and the formation of goodwill under dy-
namic conditions. In: Feichtinger G (ed) Optimal control theory and eco-
nomic analysis, vol 2. North-Holland, Amsterdam, pp 215–234

Constantinides GM, Richard SF (1978) Existence of optimal simple policies for


discounted cost inventory and cash management in continuous time. Oper
Res 26:620–636

Cvitanic J, Zhang J (2013) Contract theory in continuous-time models.


Springer, New York

Dantzig GB (1966) Linear control processes and mathematical programming.


SIAM J Control 4:56–60

Dantzig GB, Sethi SP (1981) Linear optimal control problems and generalized
linear programs. J Oper Res Soc 32:467–476

Dasgupta P, Heal GM (1974a) The optimal depletion of exhaustible resources.


Rev Econ Stud 41:3–28

Dasgupta P, Heal GM (eds) (1974b) Symposium on the economics of exhaustible


resources. The review of economic studies, vol 41. Oxford University Press,
Oxford

D’Autume A, Michel P (1985) Future investment constraints reduce present


investment. Econometrica 53:203–206

Davis BE (1970) Investment and rate of return for the regulated firm. Bell J
Econ Manag Sci 1:245–270

Davis MHA (1993) Markov models and optimization. Chapman & Hall, New
York

Davis BE, Elzinga DJ (1971) The solution of an optimal control problem in


financial modeling. Oper Res 19:1419–1433

Dawid H, Feichtinger G (1996a) Optimal allocation of drug control efforts: a


differential game analysis. J Optim Theory Appl 91:279–297

Dawid H, Feichtinger G (1996b) On the persistence of corruption. J Econ


64:177–193

Dawid H, Kopel M, Feichtinger G (1997) Complex solutions of nonconcave


dynamic optimization models. Econ Theory 9:427–439

Dawid H, Feichtinger, G, Goldstein JR, Veliov VM (2009) Keeping a learned


society young. Demogr Res 20(22), 541–558

Deal KR (1979) Optimizing advertising expenditures in a dynamic duopoly.


Oper Res 27(4):682–692

Deal KR, Sethi SP, Thompson GL (1979) A bilinear-quadratic differential game


in advertising. In: Liu PT, Sutinen JG (eds) Control theory in mathematical
economics. Marcel Dekker, New York, pp 91–109

Dechert DW, Nishimura K (1983) A complete characterization of optimal


growth paths in an aggregated model with a non-concave production func-
tion. J Econ Theory 31:332–354

Deger S, Sen SK (1984) Optimal control and differential game models of military
expenditure in less developed countries. J Econ Dyn Control 7:153–169

Deissenberg C (1980) Optimal control of linear econometric models with inter-


mittent control. Econ Plan 16(1):49–56

Deissenberg C (1981) A simple model of closed-loop optimal exploration for oil.


Policy Anal Inf Syst 5(3):167–183

Deissenberg C, Stöppler S (1982) Optimal control of LQG systems with costly


observations. In: Feichtinger G (ed) Economic applications of optimal control
theory. North-Holland, Amsterdam, pp 301–320

Deissenberg C, Stöppler S (1983) Optimal information gathering and planning


policies of the profit-maximizing firm. Int J Policy Inf 7(2):49–76

Deissenberg C, Feichtinger G, Semmler W, Wirl F (2004) Multiple equilib-


ria, history dependence, and global dynamics in intertemporal optimisation
models. In: Barnett WA, Deissenberg C, Feichtinger G (eds) Economic com-
plexity: non-linear dynamics, multi-agents economies, and learning. Elsevier,
Amsterdam, pp 91–122

Derzko NA, Sethi SP (1981a) Optimal exploration and consumption of a natural


resource: stochastic case. Int J Policy Anal 5(3):185–200

Derzko NA, Sethi SP (1981b) Optimal exploration and consumption of a natural


resource: deterministic case. Optim Control Appl Methods 2(1):1–21

Derzko NA, Sethi SP, Thompson GL (1980) Distributed parameter systems ap-
proach to the optimal cattle ranching problem. Optim Control Appl Methods
1:3–10

Derzko NA, Sethi SP, Thompson GL (1984) Necessary and sufficient conditions
for optimal control of quasilinear partial differential systems. J Optim Theory
Appl 43:89–101

Dhrymes PJ (1962) On optimal advertising capital and research expenditures


under dynamic conditions. Economica 39:275–279

Dixit AK, Pindyck RS (1994) Investment under uncertainty. Princeton Univer-


sity Press, Princeton

Dmitruk AV (2009) On the development of Pontryagin’s maximum princi-


ple in the works of A.Ya. Dubovitskii and A.A. Milyutin. Control Cybern
38(4):923–958

Dockner EJ (1984) Optimal pricing of a monopoly against a competitive pro-


ducer. Optimal Control Appl Methods 5:345–351

Dockner EJ, Feichtinger G (1991) On the optimality of limit cycles in dynamic


economic systems. J Econ 53:31–50

Dockner EJ, Feichtinger G (1993) Cyclical consumption patterns and rational


addiction. Am Econ Rev 83:256–263

Dockner EJ, Jørgensen S (1984) Cooperative and non-cooperative differential


game solutions to an investment and pricing problem. J Oper Res Soc 35:731–
739

Dockner EJ, Jørgensen S (1986) Dynamic advertising and pricing in an


oligopoly: a Nash equilibrium approach. In: Basar T (ed) Dynamic games
and applications in economics. Springer, Berlin, pp 238–251

Dockner EJ, Jørgensen S (1988) Optimal advertising policies for diffusion


models of new product innovation in monopolistic situations. Manag Sci
34(1):119–130

Dockner EJ, Jørgensen S (1992) New product advertising in dynamic


oligopolies. Z Oper Res 36(5):459–473

Dockner EJ, Sorger G (1996) Existence and properties of equilibria for a dy-
namic game on productive assets. J Econ Theory 71:209–227

Dockner EJ, Feichtinger G, Jørgensen S (1985) Tractable classes of nonzero-sum


open-loop Nash differential games: theory and examples. J Optim Theory
Appl 45:179–197

Dockner EJ, Feichtinger G, Mehlmann A (1993) Dynamic R & D competition


with memory. J Evol Econ 3:145–152

Dockner EJ, Long NV, Sorger G (1996) Analysis of Nash equilibria in a class
of capital accumulation games. J Econ Dyn Control 20:1209–1235

Dockner EJ, Jørgensen S, Long NV, Sorger G (2000) Differential games in


economics and management science. Cambridge University Press, Cambridge

Dogramaci A (2005) Hibernation durations for chain of machines with mainte-


nance under uncertainty. In: Deissenberg C, Hartl R (eds) Optimal control
and dynamic games: applications in finance, management science and eco-
nomics, chap 14. Springer, New York, pp 231–238

Dogramaci A, Fraiman NM (2004) Replacement decisions with maintenance


under uncertainty: an imbedded optimal control model. Oper Res 52(5):785–
794

Dogramaci A, Sethi SP (2016) Organizational nimbleness and operational poli-


cies: the case of optimal control of maintenance under uncertainty. In: Semm-
ler W, Mittnik S (eds) Dynamic perspectives on managerial decision making,
essays in honor of Richard F. Hartl. Dynamic modeling and econometrics in
economics and finance, vol 22. Springer, Cham, pp 253–277

Dohrmann CR, Robinett RD (1999) Dynamic programming method for con-


strained discrete-time optimal control. J Optim Theory Appl 101(2):259–283

Dolan RJ, Jeuland AP (1981) Experience curves and dynamic demand models:
implications of optimal pricing strategies. J Market 45:52–73

Dolan RJ, Muller E (1986) Models of new product diffusion: extension to com-
petition against existing and potential firms over time. In: Mahajan V, Wind
Y (eds) Innovation diffusion models of new product acceptance. Ballinger,
Cambridge, pp 117–150

Dorfman R (1969) Economic interpretation of optimal control theory. Am Econ


Rev 49:817–831

Dorfman R, Steiner PO (1954) Optimal advertising and optimal quality. Am


Econ Rev 44:826–836

Drews W, Hartberger RJ, Segers R (1974) On continuous mathematical pro-


gramming. In: Cottle RW, Krarup J (eds) Optimization methods for resource
allocation. The English University Press, London

Dubovitskii AJ, Milyutin AA (1965) Extremum problems in the presence of


restrictions. USSR Comput Math Math Phys 5(3):1–80

Dunn JC, Bertsekas DP (1989) Efficient dynamic programming implementa-


tions of Newton’s method for unconstrained optimal control problems. J Op-
tim Theory Appl 63(1):23–38

Durrett R (1996) Stochastic calculus: a practical introduction, 2nd edn. CRC


Press, Boca Raton

El-Hodiri M, Takayama A (1981) Dynamic behavior of the firm with adjustment


costs under regularity constraint. J Econ Dyn Control 3:29–41

Eliashberg J, Steinberg R (1987) Marketing-production decisions in an indus-


trial channel of distribution. Manag Sci 33(8):981–1000

Elliott RJ, Aggoun L, Moore JB (1995) Hidden Markov models: estimation and
control. Springer, New York

El Ouardighi F, Pasin F (2006) Quality improvement and goodwill accumula-


tion in a dynamic duopoly. Eur J Oper Res 175(2):1021–1032

El Ouardighi F, Tapiero CS (1998) Quality and the diffusion of innovations.


Eur J Oper Res 106(1):31–38

El Ouardighi F, Jørgensen S, Pasin F (2008) A dynamic game of operations


and marketing management in a supply chain. In: Petrosjan L, Yeung DK
(eds) International game theory review (special issue dedicated to John F.
Nash), vol 10. World Scientific, Singapore

El Ouardighi F, Feichtinger G, Grass D, Hartl RF, Kort P (2016a) Autonomous


and advertising-dependent ’word of mouth’ under costly dynamic pricing. Eur
J Oper Res 251:860–872

El Ouardighi F, Feichtinger G, Grass D, Hartl RF, Kort P (2016b) Advertising


and quality-dependent word-of-mouth in a contagion sales model. J Optim
Theory Appl 170(1):323–342

El Ouardighi F, Feichtinger G, Fruchter G (2017) Accelerating the diffusion


of innovations under mixed word of mouth through marketing-operations
interaction. Ann Oper Res 264(1–2):435–458

Elton E, Gruber M (1975) Finance as a dynamic process. Prentice-Hall, Engle-


wood Cliffs

Erickson GM (1992) Empirical analysis of closed-loop duopoly advertising


strategies. Manag Sci 38:1732–1749

Erickson GM (2003) Dynamic models of advertising competition. Springer,


Boston

Fan LT, Wang CS (1964) An application of the discrete maximum principle to


a transportation problem. J Math Phys 43:255–260

Fattorini HO (1999) Infinite-dimensional optimization and control theory. En-


cyclopedia of mathematics and its applications, vol 62. Cambridge University
Press, Cambridge

Feenstra TL, Kort PM, de Zeeuw AJ (2001) Environmental policy in an inter-


national duopoly: an analysis of feedback investment strategies. J Econ Dyn
Control 25(10):1665–1687

Feichtinger G (1982a) Optimal pricing in a diffusion model with concave price-


dependent market potential. O R Lett 1(6):236–240

Feichtinger G (1982b) Saddle-point analysis in a price-advertising model. J Econ


Dyn Control 4:319–340

Feichtinger G (1982c) Optimal repair policy for a machine service problem.


Optim. Control Appl Methods 3:15–22

Feichtinger G (1982d) The Nash solution of a maintenance-production differ-


ential game. Eur J Oper Res 10:165–172

Feichtinger G (ed) (1982e) Optimal control theory and economic analysis. In:
First Viennese workshop on economic applications of control theory, Vienna,
28–30 October 1981. North-Holland, Amsterdam

Feichtinger G (1982f) Anwendungen des maximumprinzips im operations re-


search, Teil 1 und 2. OR-Spektr 4:171–190 und 195–212

Feichtinger G (1983a) The Nash solution of an advertising differential game:


generalization of a model by Leitmann and Schmitendorf. IEEE Trans Autom
Control AC-28:1044–1048

Feichtinger G (1983b) A differential games solution to a model of competition


between a thief and the police. Manag Sci 29:686–699

Feichtinger G (1984a) Optimal employment strategies of profit-maximizing and


labour-managed firms. Optim Control Appl Methods 5:235–253

Feichtinger G (1984b) On the synergistic influence of two control variables on


the state of nonlinear optimal control models. J Oper Res Soc 35:907–914

Feichtinger G (ed) (1985a) Optimal control theory and economic analysis 2.


In: Second Viennese workshop on economic applications of control theory,
Vienna, 16–18 May 1984. North-Holland, Amsterdam

Feichtinger G (1985b) Optimal modification of machine reliability by mainte-


nance and production. OR-Spektr 7:43–50

Feichtinger G (1987) Intertemporal optimization of wine consumption at a


party: an unusual optimal control model. In: Gandolfo G, Marzano F (eds)
Keynesian theory planning models and quantitative economics, essays in
memory of Vittorio Marrama, vol II. Giuffre, Milano, pp 777–797

Feichtinger G (ed) (1988) Optimal control theory and economic analysis, vol 3.
North-Holland, Amsterdam

Feichtinger G (1992a) Limit cycles in dynamic economic systems. Ann Oper


Res 37:313–344

Feichtinger G (1992b) Optimal control of economic systems. In: Tzafestas SG


(ed) Automatic control handbook. M. Dekker, New York, pp 1023–1044

Feichtinger G, Dockner EJ (1984) A note to Jørgensen’s logarithmic advertising


differential game. Z Oper Res 28:B133–B153

Feichtinger G, Dockner EJ (1985) Optimal pricing in a duopoly: a noncooper-


ative differential games solution. J Optim Theory Appl 45:199–218

Feichtinger G, Hartl RF (1985a) Optimal pricing and production in an inventory


model. Eur J Oper Res 19:45–56

Feichtinger G, Hartl RF (1985b) On the use of Hamiltonian and maximized


Hamiltonian in non-differentiable control theory. J Optim Theory Appl
46:493–504

Feichtinger G, Hartl RF (1986) Optimale Kontrolle Ökonomischer Prozesse:


Anwendungen des Maximumprinzips in den Wirtschaftswissenschaften. Wal-
ter De Gruyter, Berlin

Feichtinger G, Hartl RF (eds) (1992) Nonlinear methods in economic dynamics


and optimal control. Annals of operations research, vol 37. Baltzer, Basel

Feichtinger G, Jørgensen S (1983) Differential game models in management


science. Eur J Oper Res 14:137–155

Feichtinger G, Mehlmann A (1986) Planning the unusual: applications of con-


trol theory to non-standard problems. Acta Appl Math 7:79–102
Feichtinger G, Novak AJ (1992a) Optimal consumption, training, working time
and leisure over the life cycle. J Optim Theory Appl 75:369–388
Feichtinger G, Novak AJ (1992b) A note on the optimal exploitation of migra-
tory fish stocks. Dyn Control 2:255–263
Feichtinger G, Novak AJ (1994a) Optimal pulsing in an advertising diffusion
model. Optim Control Appl Methods 15:267–276
Feichtinger G, Novak AJ (1994b) Differential game model of the dynastic cy-
cle: 3-D canonical system with a stable limit cycle. J Optim Theory Appl
80(3):407–423
Feichtinger G, Novak AJ (2008) Terror and counterterror operations: differen-
tial game with cyclical Nash solution. J Optim Theory Appl 139:541–556
Feichtinger G, Sorger G (1986) Optimal oscillations in control models: how can
constant demand lead to cyclical production? Oper Res Lett 5:277–281
Feichtinger G, Sorger G (1988) Periodic research and development. In: Fe-
ichtinger G (ed) Optimal control theory and economic analysis. Third Vi-
ennese workshop on optimal control theory and economic analysis, Vienna,
20–22 May 1987, vol 3. North-Holland, Amsterdam, pp 121–141
Feichtinger G, Steindl A (2006) DNS curves in a production/inventory model.
J Optim Theory Appl 128:295–308
Feichtinger G, Veliov VM (2007) On a distributed control problem arising in
dynamic optimization of a fixed-size population. SIAM J Optim 18:980–1003
Feichtinger G, Wirl F (1993) A dynamic variant of the battle of the sexes. Int
J Game Theory 22:359–380
Feichtinger G, Wirl F (1994) On the stability and potential cyclicity of corrup-
tion in governments subject to popularity constraints. Math Soc Sci 28:113–
131
Feichtinger G, Wirl F (2000) Instabilities in concave, dynamic, economic opti-
mization. J Optim Theory Appl 107:277–288
Feichtinger G, Luhmer A, Sorger G (1988) Optimal price and advertising policy
in convenience goods retailing. Market Sci 7:187–201
Feichtinger G, Kaitala VT, Novak AJ (1992) Stable resource-employment limit
cycles in an optimally regulated fishery. In: Feichtinger G (ed) Dynamic
economic models and optimal control. North-Holland, Amsterdam, pp 163–
184

Feichtinger G, Hartl RF, Sethi SP (1994a) Dynamic optimal control models in


advertising: recent developments. Manag Sci 40(2):195–226

Feichtinger G, Novak AJ, Wirl F (1994b) Limit cycles in intertemporal adjust-


ment models - theory and applications. J Econ Dyn Control 18:353–380

Feichtinger G, Hartl RF, Haunschmied JL, Kort PM (1998) Optimal enforce-


ment policies (crackdowns) on a drug market. Optim Control Appl Methods
19:169–184

Feichtinger G, Jørgensen S, Novak AJ (1999) Petrarch’s Canzoniere: rational


addiction and amorous cycles. J Math Sociol 23(3):225–240

Feichtinger G, Hartl RF, Kort PM, Novak AJ (2001) Terrorism control in the
tourism industry. J Optim Theory Appl 108:283–296

Feichtinger G, Grienauer W, Tragler G (2002) Optimal dynamic law enforce-


ment. Eur J Oper Res 141:58–69

Feichtinger G, Tragler G, Veliov VM (2003) Optimality conditions for age-


structured control systems. J Math Anal Appl 288:47–68

Feichtinger G, Prskawetz A, Veliov VM (2004a) Age-structured optimal control


in population economics. Theor Popul Biol 65:373–387

Feichtinger G, Tsachev T, Veliov VM (2004b) Maximum principle for age and


duration structured systems: a tool for optimal prevention and treatment of
HIV. Math Popul Stud 11:3–28

Feichtinger G, Hartl RF, Kort PM, Veliov VM (2005) Environmental policy,


the Porter - hypothesis and the composition of capital: effects of learning
and technological progress. J Environ Econo Manag 50:434–446

Feichtinger G, Hartl RF, Kort PM, Veliov VM (2006a) Capital accumulation


under technological progress and learning: a vintage capital approach. Eur J
Oper Res 172:293–310

Feichtinger G, Hartl RF, Kort PM, Veliov VM (2006b) Anticipation effects of


technological progress on capital accumulation: a vintage capital approach.
J Econ Theory 126:143–164

Feichtinger G, Hartl RF, Kort PM, Veliov VM (2008) Financially constrained


capital investments: the effects of disembodied and embodied technological
progress. J Math Econ 44:459–483

Feinberg FM (1992) Pulsing policies for aggregate advertising models. Market


Sci 11(3):221–234
Feinberg FM (2001) On continuous-time optimal advertising under S-shaped response. Manag Sci 47(11):1476–1487
Fel’dbaum AA (1965) Optimal control systems. Academic Press, New York
Ferreira MMA, Vinter RB (1994) When is the maximum principle for state constrained problems nondegenerate? J Math Anal Appl 187:438–467
Ferreyra G (1990) The optimal control problem for the Vidale-Wolfe advertising model revisited. Optim Control Appl Methods 11:363–368
Filipiak J (1982) Optimal control of store-and-forward networks. Optim Control Appl Methods 3:155–176
Fischer T (1985) Hierarchical optimization methods for the coordination of decentralized management planning. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam
Fleming WH, Rishel RW (1975) Deterministic and stochastic optimal control. Springer, New York
Fleming WH, Soner HM (1992) Controlled Markov processes and viscosity solutions. Springer, New York
Fletcher R, Reeves CM (1964) Function minimization by conjugate gradients. Comput J 7:149–154
Fond S (1979) A dynamic programming approach to the maximum principle of distributed-parameter systems. J Optim Theory Appl 27(4):583–601
Forster BA (1973) Optimal consumption planning in a polluted environment. Econ Rec 49:534–545
Forster BA (1977) On a one state variable optimal control problem: consumption-pollution trade-offs. In: Pitchford JD, Turnovsky SJ (eds) Applications of control theory to economic analysis. North-Holland, Amsterdam
Fourgeaud C, Lenclud B, Michel P (1982) Technological renewal of natural resource stocks. J Econ Dyn Control 4:1–36
Francis PJ (1997) Dynamic epidemiology and the market for vaccinations. J Public Econ 63(3):383–406
Frankena JF (1975) Optimal control problems with delay, the maximum principle and necessary conditions. J Eng Math 9:53–64
Friedman A (1964) Optimal control for hereditary processes. Arch Ration Mech Anal 15:396–416
Friedman A (1971) Differential games. Wiley, New York
Friedman A (1977) Oligopoly and the theory of games. North-Holland, Amsterdam
Friedman A (1986) Game theory with applications to economics. Oxford University Press, New York
Fruchter GE (1999a) The many-player advertising game. Manag Sci 45(11):1609–1611
Fruchter GE (1999b) Oligopoly advertising strategies with market expansion. Optim Control Appl Methods 20:199–211
Fruchter GE (2005) Two-part tariff pricing in a dynamic environment. In: Deissenberg C, Hartl RF (eds) Optimal control and dynamic games. Springer, Dordrecht, pp 141–153
Fruchter GE, Kalish S (1997) Closed-loop advertising strategies in a duopoly. Manag Sci 43:54–63
Fuller D, Vickson RG (1987) The optimal construction of new plants for oil from the Alberta tar sands. Oper Res 35(5):704–715
Funke UH (1976) Mathematical models in marketing: a collection of abstracts. Lecture notes in economics and mathematical systems, vol 132. Springer, Berlin
Fursikov AV (1999) Optimal control of distributed systems. Theory and applications (Translations of mathematical monographs). American Mathematical Society, Providence
Gaimon C (1985a) The acquisition of automation subject to diminishing returns. IIE Trans 17:147–156
Gaimon C (1985b) The optimal acquisition of automation to enhance the productivity of labor. Manag Sci 31:1175–1190
Gaimon C (1986a) An impulsive control approach to deriving the optimal dynamic mix of manual and automatic output. Eur J Oper Res 24:360–368
Gaimon C (1986b) The optimal times and levels of impulsive acquisition of automation. Optim Control Appl Methods 7:259–270
Gaimon C (1986c) The optimal acquisition of new technology and its impact on dynamic pricing policies. In: Lev B (ed) Production management: methods and studies. Studies in management science and systems, vol 13. Elsevier, Amsterdam, pp 187–206
Gaimon C (1988) Simultaneous and dynamic price, production, inventory, and capacity decisions. Eur J Oper Res 35:426–441
Gaimon C (1989) Dynamic game results on the acquisition of new technology. Oper Res 37(3):410–425
Gaimon C (1994) Subcontracting versus capacity expansion and the impact on the pricing of services. Naval Res Logist 41(7):875–892
Gaimon C (1997) Planning information technology - knowledge worker systems. Manag Sci 43(9):1308–1328
Gaimon C, Burgess R (2003) Analysis of lead time and learning for capacity expansions. Prod Oper Manag 12(1):128–140
Gaimon C, Morton A (2005) Investment in changeover flexibility for early entry in high tech markets. Prod Oper Manag (Special Issue on High Tech Production and Operations Management) 14(2):159–174
Gaimon C, Singhal V (1992) Flexibility and the choice of facilities under short product life cycles. Eur J Oper Res 60(2):211–223
Gaimon C, Thompson GL (1984a) Optimal preventive and repair maintenance of a machine subject to failure. Optim Control Appl Methods 5:57–67
Gaimon C, Thompson GL (1984b) A distributed parameter cohort personnel planning model using cross-sectional data. Manag Sci 30:750–764
Gaimon C, Thompson GL (1989) Optimal preventive and repair maintenance of a machine subject to failure and a declining resale value. Optim Control Appl Methods 10:211–228
Gamkrelidze RV (1965) On some extremal problems in the theory of differential equations with applications to the theory of optimal control. SIAM J Control Optim 3:106–128
Gamkrelidze RV (1978) Principles of optimal control theory. Plenum Press, New York
Gandolfo G (1980) Economic dynamics: methods and models. North-Holland, Amsterdam
Gaskins DW Jr (1971) Dynamic limit pricing: optimal pricing under threat of entry. J Econ Theory 3:306–322
Gaugusch J (1984) The non-cooperative solution of a differential game: advertising versus pricing. Optim Control Appl Methods 5(4):353–360
Gavrila C, Caulkins JP, Feichtinger G, Tragler G, Hartl RF (2005) Managing the reputation of an award to motivate performance. Math Methods Oper Res 61:1–22
Gelfand IM, Fomin SV (1963) Calculus of variations. Prentice-Hall, Englewood Cliffs
Gerchak Y, Parlar M (1985) Optimal control analysis of a simple criminal prosecution model. Optim Control Appl Methods 6:305–312
Gfrerer H (1984) Optimization of hydro energy storage plant problems by variational methods. Z Oper Res 28:B87–B101
Gihman II, Skorohod AV (1972) Stochastic differential equations. Springer, New York
Girsanov IV (1972) Lectures on mathematical theory of extremum problems. Springer, Berlin
Glad ST (1979) A combination of penalty function and multiplier methods for solving optimal control problems. J Optim Theory Appl 28:303–329
Goh BS (1980) Management and analysis of biological populations. Elsevier, Amsterdam
Goh BS, Leitmann G, Vincent TL (1974) Optimal control of a prey-predator system. Math Biosci 19:263–286
Goldberg S (1986) Introduction to difference equations. Dover Publications, New York
Goldstine HH (1980) A history of the calculus of variations from the 17th through the 19th century. Springer, New York
Göllmann L, Kern D, Maurer H (2008) Optimal control problems with delays in state and control variables subject to mixed control-state constraints. Optim Control Appl Methods 30(4):341–365
Gopalsamy K (1976) Optimal control of age-dependent populations. Math Biosci 32:155–163
Gordon HS (1954) Economic theory of a common-property resource: the fishery. J Polit Econ 62:124–142
Gordon MJ (1962) The investment, financing and valuation of the corporation. Richard D. Irwin, Homewood
Gould JP (1970) Diffusion processes and optimal advertising policy. In: Phelps ES et al (eds) Microeconomic foundation of employment and inflation theory. Norton, New York, pp 338–368
Grass D, Caulkins JP, Feichtinger G, Tragler G, Behrens DA (2008) Optimal control of nonlinear processes: with applications in drugs, corruption, and terror. Springer, New York
Grimm W, Well KH, Oberle HJ (1986) Periodic control for minimum-fuel aircraft trajectories. J Guid 9:169–174
Gross M, Lieber Z (1984) Competitive monopolistic and efficient utilization of an exhaustible resource in the presence of habit-formation effects and stock dependent costs. Econ Lett 14:383–388
Hadley G, Kemp MC (1971) Variational methods in economics. North-Holland, Amsterdam
Hahn M, Hyun JS (1991) Advertising cost interpretations and the optimality of pulsing. Manag Sci 37(2):157–169
Halkin H (1966) A maximum principle of the Pontryagin type for systems described by nonlinear difference equations. SIAM J Control 4:90–111
Halkin H (1967) On the necessary condition for optimal control of non-linear systems. In: Leitmann G (ed) Topics in optimization. Academic Press, New York
Hämäläinen RP, Haurie A, Kaitala VT (1984) Bargaining on whales: a differential game model with Pareto optimal equilibria. Oper Res Lett 3(1):5–11
Hämäläinen RP, Haurie A, Kaitala VT (1985) Equilibria and threats in a fishery management game. Optim Control Appl Methods 6:315–333
Hämäläinen RP, Ruusunen J, Kaitala VT (1986) Myopic Stackelberg equilibria and social coordination in a share contract fishery. Mar Resour Econ 3(3):209–235
Hämäläinen RP, Ruusunen J, Kaitala VT (1990) Cartels and dynamic contracts in sharefishing. J Environ Econ Manag 19:175–192
Han M, Feichtinger G, Hartl RF (1994) Nonconcavity and proper optimal periodic control. J Econ Dyn Control 18:976–990
Hanssens DM, Parsons LJ, Schultz RL (1990) Market response models: econometric and time series analysis. Kluwer Academic Publishers, Boston
Harris FW (1913) How many parts to make at once. Factory Mag Manag 10:135–136, 152
Harris H (1976) Optimal planning under transaction costs: the demand for money and other assets. J Econ Theory 12:298–314
Harrison JM, Pliska SR (1981) Martingales, stochastic integrals, and continuous trading. Stoch Process Appl 11:215–260
Hartberger RJ (1973) A proof of the Pontryagin maximum principle for initial-value problem. J Optim Theory Appl 11:139–145
Hartl RF (1982a) A mixed linear/nonlinear optimization model of production and maintenance for a machine. In: Feichtinger G (ed) Optimal control theory and economic analysis. North-Holland, Amsterdam, pp 43–58
Hartl RF (1982b) Optimal control of non-linear advertising models with replenishable budget. Optim Control Appl Methods 3(1):53–65
Hartl RF (1982c) Optimal control of concave economic models with two control instruments. In: Feichtinger G, Kall P (eds) Operations research in progress. D. Reidel Publishing Company, Dordrecht, pp 227–245
Hartl RF (1983a) Optimal allocation of resources in the production of human capital. J Oper Res Soc 34:599–606
Hartl RF (1983b) Optimal maintenance and production rates for a machine: a nonlinear economic control problem. J Econ Dyn Control 6:281–306
Hartl RF (1984) Optimal dynamic advertising policies for hereditary processes. J Optim Theory Appl 43(1):51–72
Hartl RF (1986a) A forward algorithm for a generalized wheat trading model. Z Oper Res 30:A135–A144
Hartl RF (1986b) Arrow-type sufficient optimality conditions for nondifferentiable optimal control problems with state constraints. Appl Math Optim 14:229–247
Hartl RF (1987) A simple proof of the monotonicity of the state trajectories in autonomous control problems. J Econ Theory 40:211–215
Hartl RF (1988a) A wheat trading model with demand and spoilage. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 3. North-Holland, Amsterdam, pp 235–244
Hartl RF (1988b) A dynamic activity analysis for a monopolistic firm. Optim Control Appl Methods 9:253–272
Hartl RF (1988c) The control of environmental pollution and optimal investment and employment decisions: a comment. Optim Control Appl Methods 9:337–339
Hartl RF (1989a) On forward algorithms for a generalized wheat trading model. In: Chikan A (ed) Progress in inventory research (proceedings of 4th international symposium on inventories). Hungarian Academy of Science, Budapest
Hartl RF (1989b) Most rapid approach paths in dynamic economic problems. In: Kleinschmidt P et al (eds) Methods of operations research 58 (Proceedings of SOR 12, 1987). Athenäum, pp 397–410
Hartl RF (1992) Optimal acquisition of pollution control equipment under uncertainty. Manag Sci 38:609–622
Hartl RF (1993) On the properness of one-dimensional periodic control problems. Syst Control Lett 20(3):393–395
Hartl RF (1995) Production smoothing under environmental constraints. Prod Oper Manag 4(1):46–56
Hartl RF, Feichtinger G (1987) A new sufficient condition for most rapid approach paths. J Optim Theory Appl 54(2):403–411
Hartl RF, Jørgensen S (1985) Optimal manpower policies in a dynamic staff-maximizing bureau. Optim Control Appl Methods 6(1):57–64
Hartl RF, Jørgensen S (1988) Aspects of optimal slidesmanship. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 3. North-Holland, Amsterdam, pp 335–350
Hartl RF, Jørgensen S (1990) Optimal slidesmanship in conferences with unpredictable chairmen. Optim Control Appl Methods 11:143–155
Hartl RF, Kort PM (1996a) Marketable permits in a stochastic dynamic model of the firm. J Optim Theory Appl 89(1):129–155
Hartl RF, Kort PM (1996b) Capital accumulation of a firm facing an emissions tax. J Econ 63(1):1–24
Hartl RF, Kort PM (1996c) Capital accumulation of a firm facing environmental constraints. Optim Control Appl Methods 17:253–266
Hartl RF, Kort PM (1997) Optimal input substitution of a firm facing an environmental constraint. Eur J Oper Res 99:336–352
Hartl RF, Kort PM (2004) Optimal investments with convex-concave revenue: a focus-node distinction. Optim Control Appl Methods 25(3):147–163
Hartl RF, Kort PM (2005) Advertising directed towards existing and new customers. In: Deissenberg C, Hartl RF (eds) Optimal control and dynamic games. Springer, Dordrecht, pp 3–18
Hartl RF, Krauth J (1989) Optimal production mix. J Optim Theory Appl 66:255–273
Hartl RF, Luptacik M (1992) Environmental constraints and choice of technology. Czechoslov J Oper Res 1(2):107–125
Hartl RF, Mehlmann A (1982) The Transylvanian problem of renewable resources. Revue Française d’Automatique, Informatique et de Recherche Opérationnelle 16:379–390
Hartl RF, Mehlmann A (1983) Convex-concave utility function: optimal blood-consumption for vampires. Appl Math Model 7:83–88
Hartl RF, Mehlmann A (1984) Optimal seducing policies for dynamic continuous lovers under risk of being killed by a rival. Cybern Syst Int J 15:119–126
Hartl RF, Mehlmann A (1986) On remuneration patterns for medical services. Optim Control Appl Methods 7:185–193
Hartl RF, Sethi SP (1983) A note on the free terminal time transversality condition. Z Oper Res Ser Theory 27(5):203–208
Hartl RF, Sethi SP (1984a) Optimal control problems with differential inclusions: sufficiency conditions and an application to a production-inventory model. Optim Control Appl Methods 5(4):289–307
Hartl RF, Sethi SP (1984b) Optimal control of a class of systems with continuous lags: dynamic programming approach and economic interpretations. J Optim Theory Appl 43(1):73–88
Hartl RF, Sethi SP (1985a) Solution of generalized linear optimal control problems using a simplex-like method in continuous-time I: theory. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 45–62
Hartl RF, Sethi SP (1985b) Solution of generalized linear optimal control problems using a simplex-like method in continuous-time II: examples. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 63–87
Hartl RF, Mehlmann A, Novak AJ (1992a) Cycles of fear: optimal periodic blood-sucking rates for vampires. J Optim Theory Appl 75(3):559–568
Hartl RF, Feichtinger G, Kirakossian GT (1992b) Optimal recycling of tailings for the production of building materials. Czechoslov J Oper Res 1(3):181–192
Hartl RF, Sethi SP, Vickson RG (1995) A survey of the maximum principles for optimal control problems with state constraints. SIAM Rev 37(2):181–218
Hartl RF, Kort PM, Novak AJ (1999) Optimal investment facing possible accidents. Ann Oper Res 88:99–117
Hartl RF, Kort PM, Feichtinger G (2003a) Offense control taking into account heterogeneity of age. J Optim Theory Appl 116:591–620
Hartl RF, Novak AJ, Rao AG, Sethi SP (2003b) Optimal pricing of a product diffusing in rich and poor populations. J Optim Theory Appl 117(2):349–375
Hartl RF, Kort PM, Feichtinger G, Wirl F (2004) Multiple equilibria and thresholds due to relative investment costs. J Optim Theory Appl 123:49–82
Hartl RF, Novak AJ, Rao AG, Sethi SP (2005) Dynamic pricing of a status symbol, invited talks from the fourth world congress of nonlinear analysts (WCNA 2004), Orlando, FL, June 30–July 07, 2004. Nonlinear Anal Theory Methods Appl 63:e2301–e2314
Hartman R (1982) Ordinary differential equations. Birkhäuser, Boston
Haruvy E, Prasad A, Sethi SP (2003) Harvesting altruism in open source software development. J Optim Theory Appl 118(2):381–416
Haruvy E, Prasad A, Sethi SP, Zhang R (2005) Optimal firm contributions to open source software. In: Deissenberg C, Hartl RF (eds) Optimal control and dynamic games, applications in finance, management science and economics. Springer, Dordrecht, pp 197–212
Haruvy E, Prasad A, Sethi SP, Zhang R (2008a) Competition with open source as a public good. J Ind Manag Optim 4(1):199–211
Haruvy E, Sethi SP, Zhou J (2008b) Open source development with a commercial complementary product or service. Prod Oper Manag (Special Issue on Management of Technology) 17(1):29–43
Harvey AC (1994) Forecasting, structural time series models and the Kalman filter. Cambridge University Press, New York
Haunschmied JL, Kort PM, Hartl RF, Feichtinger G (2003) A DNS-curve in a two-state capital accumulation model: a numerical analysis. J Econ Dyn Control 27:701–716
Haurie A (1976) Optimal control on an infinite time horizon: the turnpike approach. J Math Econ 3:81–102
Haurie A, Hung NM (1977) Turnpike properties for the optimal use of a natural resource. Rev Econ Stud 44:329–336
Haurie A, Leitmann G (1984) On the global asymptotic stability of equilibrium solutions for open-loop differential games. Large Scale Syst 6:107–122
Haurie A, Sethi SP (1984) Decision and forecast horizons, agreeable plans, and the maximum principle for infinite horizon control problems. Oper Res Lett 3(5):261–265
Haurie A, Tolwinski B, Leitmann G (1983) Cooperative equilibria in differential games. In: Proceedings ACC, San Francisco
Haurie A, Sethi SP, Hartl RF (1984) Optimal control of an age-structured population model with applications to social services planning. J Large Scale Syst 6:133–158
Haussmann UG (1981) Some examples of optimal stochastic controls or: the stochastic maximum principle at work. SIAM Rev 23:292–307
He X, Sethi SP (2008) Dynamic slotting and pricing decisions in a durable product supply chain. J Optim Theory Appl 137(2):363–379
He X, Prasad A, Sethi SP, Gutierrez GJ (2007) A survey of Stackelberg differential game models in supply and marketing channels. J Syst Sci Syst Eng 16(4):385–413. Erratum (2008) 17(2):255
He X, Prasad A, Sethi SP (2009) Cooperative advertising and pricing in a stochastic supply chain: feedback Stackelberg strategies. Prod Oper Manag 18(1):78–94
He X, Krishnamoorthy A, Prasad A, Sethi SP (2011) Retail competition and cooperative advertising. Oper Res Lett 39:11–16
He X, Krishnamoorthy A, Prasad A, Sethi SP (2012) Co-op advertising in dynamic retail oligopolies. Decis Sci 43(1):73–105
Heal GM (1976) The relationship between price and extraction cost for a resource with a backstop technology. Bell J Econ 7:371–378
Heal GM (1993) The optimal use of exhaustible resources. In: Kneese AV, Sweeney JL (eds) Handbook of natural resource and energy economics, vol 3, chap 18. Elsevier, London, pp 855–880
Heaps T (1984) The forestry maximum principle. J Econ Dyn Control 7:131–151
Heckman J (1976) A life cycle model of earnings, learning, and consumption. J Polit Econ 84:511–544
Hestenes MR (1966) Calculus of variations and optimal control theory. Wiley, New York
Ho YC (1970) Differential games, dynamic optimization and generalized control theory. J Optim Theory Appl 6:179–209
Hofbauer J, Sorger G (1999) Perfect foresight and equilibrium selection in symmetric potential games. J Econ Theory 85:1–23
Hofbauer J, Sorger G (2002) A differential game approach to evolutionary equilibrium selection. Int Game Theory Rev 4:17–31
Hoffmann KH, Krabs W (1984) Optimal control of partial differential equations. Birkhäuser, Basel
Holly S, Rüstem B, Zarrop MB (eds) (1979) Optimal control for econometric models. An approach to economic policy formulation. Macmillan, London
Holt CC, Modigliani F, Muth JF, Simon HA (1960) Planning production, inventories and workforce. Prentice-Hall, Englewood Cliffs
Holtzman JM (1966) On the maximum principle for nonlinear discrete-time systems. IEEE Trans Autom Control AC-11:273–274
Horsky D (1977) An empirical analysis of the optimal advertising policy. Manag Sci 23(10):1037–1049
Horsky D, Mate K (1988) Dynamic advertising strategies of competing durable good producers. Market Sci 7(4):356–367
Horsky D, Simon LS (1983) Advertising and the diffusion of new products. Market Sci 2(1):1–17
Hotelling H (1925) A general mathematical theory of depreciation. J Am Stat Assoc 20:340–353
Hotelling H (1931) The economics of exhaustible resources. J Polit Econ 39:137–175
Hwang CL, Fan LT, Erickson LE (1967) Optimal production planning by the maximum principle. Manag Sci 13:750–755
Ijiri Y, Thompson GL (1970) Applications of mathematical control theory to accounting and budgeting (the continuous wheat trading model). Account Rev 45:246–258
Ijiri Y, Thompson GL (1972) Mathematical control theory solution of an interactive accounting flows model. Naval Res Logist Quart 19:411–422
Intriligator MD (1971) Mathematical optimization and economic theory. Prentice-Hall, Englewood Cliffs
Intriligator MD (1980) Applications of control theory to economics. In: Bensoussan A, Lions JL (eds) Analysis and optimization of systems. Lecture notes in control and information sciences, vol 28. Springer, Berlin, pp 607–626
Intriligator MD, Smith BLR (1966) Some aspects of the allocation of scientific effort between teaching and research. Am Econ Rev 61:494–507
Ioffe AD, Tihomirov VM (1979) Theory of extremal problems. North-Holland, Amsterdam
Isaacs R (1965) Differential games. Wiley, New York
Isaacs R (1969) Differential games: their scope, nature, and future. J Optim Theory Appl 3:283–295
Jacobson DH, Lele MM, Speyer JL (1971) New necessary conditions of optimality for control problems with state-variable inequality constraints. J Math Anal Appl 35:255–284
Jacquemin AP (1973) Optimal control and advertising policy. Metroeconomica 25:200–207
Jacquemin AP, Thisse J (1972) Strategy of the firm and market structure: an application of optimal control theory. In: Cowling K (ed) Market structure and corporate behavior. Gray-Mills, London, pp 61–84
Jagpal S (1999) Marketing strategy and uncertainty. Oxford University Press, New York
Jamshidi M (1983) Large-scale systems: modelling and control. North-Holland, New York
Jarrar R, Martín-Herrán G, Zaccour G (2004) Markov perfect equilibrium advertising strategies of Lanchester duopoly model: a technical note. Manag Sci 50(7):995–1000
Jazwinski AH (1970) Stochastic processes and filtering theory. Academic Press, New York
Jedidi K, Eliashberg J, DeSarbo W (1989) Optimal advertising and pricing for a three-stage time-lagged monopolistic diffusion model incorporating income. Optim Control Appl Methods 10:313–331
Jennings LS, Teo KL (1997) Computation of manufacturing systems using an enhancing control. In: Chandra T, Leclair SR, Meech JA, Verma B, Smith M, Balachandran B (eds) IPMM’97: Australasia-Pacific forum on intelligent processing & manufacturing of materials, vol 1. Watson Ferguson, Brisbane
Jennings LS, Sethi SP, Teo KL (1997) Computation of optimal production plans for manufacturing systems. Nonlinear Anal Theory Methods Appl 30(7):4329–4338
Jeuland AP, Dolan RJ (1982) An aspect of new product planning: dynamic pricing. In: Zoltners AA (ed) Marketing planning models. TIMS studies in the management sciences, vol 18. North-Holland, Amsterdam, pp 1–21
Ji Y, Mookerjee VS, Sethi SP (2005) Optimal software development: a control theoretic approach. Inf Syst Res 16(3):292–306
Ji Y, Kumar S, Mookerjee VS, Sethi SP, Yeh D (2011) Optimal enhancement and lifetime of software systems: a control theoretic analysis. Prod Oper Manag 20(6):889–904
Ji Y, Kumar S, Sethi SP (2017) Needle exchange for controlling HIV spread under endogenous infectivity. Inf Syst Oper Res 55(2):93–117
Jiang J, Sethi SP (1991) A state aggregation approach to manufacturing systems having machine states with weak and strong interactions. Oper Res 39(6):970–978
Johar M, Mookerjee V, Sethi SP (2015) Optimal software design reuse policies: a control theoretic approach. Inf Syst Front 17(2):439–453
Johnson CD, Gibson JE (1963) Singular solutions in problems of optimal control. IEEE Trans Autom Control AC-8:4–15
Jones P (1983) Analysis of a dynamic duopoly model of advertising. Math Oper Res 8(1):122–134
Jørgensen S (1982a) A survey of some differential games in advertising. J Econ Dyn Control 4:341–369
Jørgensen S (1982b) A differential games solution to a logarithmic advertising model. J Oper Res Soc 33(5):425–432
Jørgensen S (1983) Optimal control of a diffusion model of new product acceptance with price-dependent total market potential. Optim Control Appl Methods 4(3):269–276
Jørgensen S (1984) A Pareto-optimal solution of a maintenance-production differential game. Eur J Oper Res 18:76–80
Jørgensen S (1985) An exponential differential game which admits a simple Nash solution. J Optim Theory Appl 45:383–396
Jørgensen S (1986a) Optimal production, purchasing and pricing: a differential games approach. Eur J Oper Res 24:64–76
Jørgensen S (1986b) Optimal dynamic pricing in an oligopolistic market: a survey. In: Basar T (ed) Dynamic games and applications in economics. Springer, Berlin, pp 179–237
Jørgensen S (1992) The dynamics of extramarital affairs. In: Feichtinger G (ed) Dynamic economic models and optimal control. North-Holland, Amsterdam, pp 239–266
Jørgensen S, Dockner EJ (1985) Optimal consumption and replenishment policies for a renewable resource. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 647–664
Jørgensen S, Kort PM (1993) Optimal dynamic investment policies under concave-convex adjustment costs. J Econ Dyn Control 17(1–2):153–180
Jørgensen S, Kort PM (1997) Optimal investment and finance in renewable resource harvesting. J Econ Dyn Control 21:603–630
Jørgensen S, Kort PM (2002) Optimal pricing and inventory policies: centralized and decentralized decision making. Eur J Oper Res 138(3):578–600
Jørgensen S, Zaccour G (2001) Time consistent side payments in a dynamic game of downstream pollution. J Econ Dyn Control 25:1973–1987
Jørgensen S, Zaccour G (2004) Differential games in marketing. International series in quantitative marketing. Kluwer Academic Publishers, Boston
Jørgensen S, Zaccour G (2007) Developments in differential game theory and numerical methods: economic and management applications. Comput Manag Sci 4(2):159–182
Jørgensen S, Kort PM, Zaccour G (1999) Production, inventory, and pricing under cost and demand learning effects. Eur J Oper Res 117:382–395
Jørgensen S, Kort PM, Dockner EJ (2006a) Venture capital financed investments in intellectual capital. J Econ Dyn Control 30(11):2339–2361
Jørgensen S, Kort PM, Zaccour G (2006b) Advertising an event. Automatica 42(8):1349–1355
Jørgensen S, Kort PM, Zaccour G (2009) Optimal pricing and advertising policies for an entertainment event. J Econ Dyn Control 33(3):583–596
Kaitala VT (1986) Game theory models of fisheries management - a survey. In: Basar T (ed) Dynamic games and applications in economics. Springer, Berlin, pp 252–266
Kalish S (1983) Monopolist pricing with dynamic demand and production cost. Market Sci 2(2):135–159
Kalish S (1985) A new product adoption model with price, advertising, and uncertainty. Manag Sci 31(12):1569–1585
Kalish S, Lilien GL (1983) Optimal price subsidy policy for accelerating the diffusion of innovation. Market Sci 2(4):407–420
Kalish S, Sen SK (1986) Diffusion models and the marketing mix for single products. In: Mahajan V, Wind Y (eds) Series in econometrics and management science: innovation diffusion models of new products acceptance, vol V. Ballinger, Cambridge, pp 87–116
Kalman RE (1960a) A new approach to linear filtering and prediction problems. Trans ASME Ser D J Basic Eng 82:35–45
Kalman RE (1960b) Contributions to the theory of optimal control. Bol Soc Mat Mexicana 5:102–119
Kalman RE, Bucy R (1961) New results in linear filtering and prediction theory. Trans ASME Ser D J Basic Eng 83:95–108
Kamien MI, Schwartz NL (1971a) Optimal maintenance and sale age for a machine subject to failure. Manag Sci 17:427–449
Kamien MI, Schwartz NL (1971b) Limit pricing and uncertain entry. Econometrica 39:441–454
Kamien MI, Schwartz NL (1978) Optimal exhaustible resource depletion with endogenous technical change. Rev Econ Stud 45:179–196
Kamien MI, Schwartz NL (1982a) Market structure and innovation. Cambridge University Press, Cambridge
Kamien MI, Schwartz NL (1982b) The role of common property resources in optimal planning models with exhaustible resources. In: Smith VK, Krutilla JV (eds) Explorations in natural resource economics. Johns Hopkins University Press, Baltimore, pp 47–71
Kamien MI, Schwartz NL (1992) Dynamic optimization: the calculus of variations and optimal control in economics and management, 2nd edn. North-Holland, New York
Kaplan W (1958) Ordinary differential equations. Addison-Wesley, Reading
Karatzas I, Shreve SE (1997) Brownian motion and stochastic calculus, 2nd edn. Springer, New York
Karatzas I, Shreve SE (1998) Methods of mathematical finance. Springer, New York
Karatzas I, Lehoczky JP, Sethi SP, Shreve SE (1986) Explicit solution of a general consumption/investment problem. Math Oper Res 11(2):261–294
Karray S, Martín-Herrán G (2009) A dynamic model for advertising and pricing competition between national and store brands. Eur J Oper Res 193:451–467
Keeler E, Spence M, Zeckhauser RJ (1971) The optimal control of pollution. J Econ Theory 4:19–34
Keller HB (1968) Numerical methods for two-point boundary value problems. Blaisdell, Waltham
Kemp MC, Long NV (eds) (1980) Exhaustible resources, optimality, and trade. North-Holland, Amsterdam
Kemp MC, Long NV (1977) Optimal control problems with integrands discontinuous with respect to time. Econ Rec 53:405–420
Kendrick DA (1981) Stochastic control for economic models. McGraw-Hill, New York
Khmelnitsky E, Kogan K (1994) Necessary optimality conditions for a generalized problem of production scheduling. Optim Control Appl Methods 15:215–222
Khmelnitsky E, Kogan K (1996) Optimal policies for aggregate production and capacity planning under an arbitrary demand. Int J Prod Res 34(7):1929–1941
Khmelnitsky E, Kogan K, Maimon O (1995) A maximum principle based combined method for scheduling in a flexible manufacturing system. Discrete Event Dyn Syst 5:343–355
Khmelnitsky E, Kogan K, Maimon O (1997) Maximum principle-based methods for scheduling FMS with partially sequence dependent setups. Int J Prod Res 35(10):2701–2712
Khmelnitsky E, Presman E, Sethi SP (2011) Optimal production control of a failure-prone machine. Ann Oper Res 182(1):67–86
Kilkki P, Vaisanen U (1969) Determination of optimal policy for forest stands by means of dynamic programming. Acta Forestalia Fennica 102:100–112
Kirby BJ (ed) (1974) Optimal control theory and its applications. Lecture notes in economics and mathematical systems, part I & II, vols 105–106. Springer, Berlin
Kirk DE (1970) Optimal control theory: an introduction. Prentice-Hall, Englewood Cliffs
Kiseleva T, Wagener FO (2010) Bifurcations of optimal vector fields in the shallow lake model. J Econ Dyn Control 34(5):825–843
Klein CF, Gruver WA (1978) On the optimal control of a single-server queueing system: comment. J Optim Theory Appl 26:457–462
Kleindorfer PR (1978) Stochastic control models in management science: theory and computation. TIMS Stud Manag Sci 9:69–88
Kleindorfer PR, Lieber Z (1979) Algorithms and planning horizon results for production planning problems with separable costs. Oper Res 27:874–887
Kleindorfer PR, Kriebel CH, Thompson GL, Kleindorfer GB (1975) Discrete optimal control of production plans. Manag Sci 22:261–273
Knobloch HW (1981) Higher order necessary conditions in optimal control theory. Lecture notes in control and information sciences, vol 34. Springer, Berlin
Knowles G (1981) An introduction to applied optimal control. Academic Press, New York
Kogan K, Khmelnitsky E (1996) An optimal control model for continuous time production and setup scheduling. Int J Prod Res 34(3):715–725
Kogan K, Khmelnitsky E, Shtub A, Maimon O (1997) Optimal flow control of flexible manufacturing systems: setup localization by an iterative procedure. Int J Prod Econ 51:37–46
Kort PM, Caulkins JP, Hartl RF, Feichtinger G (2006) Brand image and brand dilution in the fashion industry. Automatica 42:1363–1370
Kotowitz Y, Mathewson F (1979a) Informative advertising and welfare. Am Econ Rev 69:284–294
Kotowitz Y, Mathewson F (1979b) Advertising, consumer information, and product quality. Bell J Econ 10:566–588
Kreindler E (1982) Additional necessary conditions for optimal control with state-variable inequality constraints. J Optim Theory Appl 38:241–250
Krelle W (1984) Economic growth with exhaustible resources and environmental protection. Z Staatswiss 140:399–429
Krichagina E, Lou S, Sethi SP, Taksar MI (1993) Production control in a failure-prone manufacturing system: diffusion approximation and asymptotic optimality. Ann Appl Probab 3(2):421–453
Krichagina E, Lou S, Sethi SP, Taksar MI (1995) Diffusion approximation for a controlled stochastic manufacturing system with average cost minimization. Math Oper Res 20(4):895–922
Krishnamoorthy A, Misra S, Prasad A (2005) Scheduling sales force training: theory and evidence. Int J Res Market 22:427–440
Krishnamoorthy A, Prasad A, Sethi SP (2010) Optimal pricing and advertising in a durable-good duopoly. Eur J Oper Res 200(2):486–497
Krouse CG (1972) Optimal financing and capital structure programs for the firm. J Financ 27:1057–1071
Krouse CG, Lee WY (1973) Optimal equity financing of the corporation. J Financ Quant Anal 8:539–563
Kugelmann B, Pesch HJ (1990) New general guidance method in constrained optimal control, part 1: numerical method. J Optim Theory Appl 67(3):421–435
Kuhn M, Wrzaczek S, Prskawetz A, Feichtinger G (2011) Externalities in a life cycle model with endogenous survival. J Math Econ 47(4–5):627–641
Kuhn M, Wrzaczek S, Fürnkranz-Prskawetz A, Feichtinger G (2015) Optimal choice of health and retirement in a life-cycle model. J Econ Theory 158:186–212
Kumar S, Sethi SP (2009) Dynamic pricing and advertising for web content providers. Eur J Oper Res 197:924–944
Kumar PR, Varaiya P (1986) Stochastic systems. Estimation, identification, and adaptive control. Prentice-Hall, Englewood Cliffs
Kurawarwala AA, Matsuo H (1996) Forecasting and inventory management of short life-cycle products. Oper Res 44(1):131–150
Kurcyusz S, Zowe J (1979) Regularity and stability for the mathematical programming problem in Banach spaces. Appl Math Optim 5:49–62
Kushner HJ (1971) Introduction to stochastic control. Holt, Rinehart & Winston, New York
Kushner HJ (1977) Probability methods for approximations in stochastic control and for elliptic equations. Academic Press, New York
Kydland FE, Prescott EC (1977) Rules rather than discretion: the inconsistency of optimal plans. J Polit Econ 85:473–493
Laffont JJ, Martimort D (2001) The theory of incentives: the principal-agent model. Princeton University Press, Princeton
Lagunov VN (1985) Introduction to differential games and control theory. Hildermann, Berlin
Lansdowne ZF (1970) The theory and applications of generalized linear control processes. Technical report no. 10, Department of Operations Research, Stanford University
Lasdon LS, Mitter SK, Warren AD (1967) The conjugate gradient method for optimal control problems. IEEE Trans Autom Control AC-12:132–138
Leban R, Lesourne J (1983) Adaptive strategies of the firm through a business cycle. J Econ Dyn Control 5:201–234
Lee EB, Markus L (1968) Foundations of optimal control theory. Wiley, New York
Legey L, Ripper M, Varaiya PP (1973) Effects of congestion on the shape of the city. J Econ Theory 6:162–179
Lehoczky JP, Sethi SP, Soner HM, Taksar MI (1991) An asymptotic analysis of hierarchical control of manufacturing systems under uncertainty. Math Oper Res 16(3):596–608
Leitmann G (1974) Cooperative and non-cooperative many players differential games. Springer, Wien
Leitmann G (ed) (1976) Multicriteria decision making and differential games. Plenum Press, New York
Leitmann G (1981) The calculus of variations and optimal control. In: Miele A (ed) Series mathematical concepts and methods in science and engineering. Plenum Press, New York
Leitmann G, Liu PT (1974) A differential game model of labor-management negotiation during a strike. J Optim Theory Appl 13:427–444
Leitmann G, Stalford H (1971) A sufficiency theorem for optimal control. J Optim Theory Appl 8:169–174
Leland HE (1972) The dynamics of a revenue maximizing firm. Int Econ Rev 13:376–385
Lele MM, Jacobson DH, McCabe JL (1971) Qualitative application of a result in control theory to problems of economic growth. Int Econ Rev 12:209–226
Léonard D, Long NV (1992) Optimal control theory and static optimization in economics. Cambridge University Press, Cambridge
Lesourne J (1973) Croissance optimale des entreprises. Dunod, Paris
Lesourne J, Leban R (1982) Control theory and the dynamics of the firm: a survey. OR-Spektr 4:1–14
Levine J, Thépot J (1982) Open loop and closed loop equilibria in a dynamic duopoly. In: Feichtinger G (ed) Optimal control theory and economic analysis. North-Holland, Amsterdam, pp 143–156
Lewis TR, Schmalensee R (1977) Non-convexity and optimal exhaustion of renewable resources. Int Econ Rev 18:535–552
Lewis TR, Schmalensee R (1979) Non-convexity and optimal harvesting strategies for renewable resources. Can J Econ 12:677–691
Lewis TR, Schmalensee R (1982) Optimal use of renewable resources with non-convexities in production. In: Mirman LJ, Spulber PF (eds) Essays in the economics of renewable resources. North-Holland, Amsterdam, pp 95–111
Li G, Rajagopalan S (1997) A learning curve model with knowledge depreciation. Eur J Oper Res 105(1):143–154
Li G, Rajagopalan S (1998) Process improvement, quality and learning effects. Manag Sci 44:1517–1532
Li T, Sethi SP (2017) A review of dynamic Stackelberg game models. Discrete Continuous Dyn Syst Ser B 22(1):125–159
Lieber Z (1973) An extension of Modigliani and Hohn’s planning horizon results. Manag Sci 20:319–330
Lieber Z, Barnea A (1977) Dynamic optimal pricing to deter entry under constrained supply. Oper Res 25:696–705
Lignell J, Tuominen MPT (1983) An advertising control model of two state variables. Eur J Oper Res 24(1):77–84
Lintner J (1963) The cost of capital and optimal financing of corporate growth. J Financ 23:292–310
Lions JL (1971) Optimal control of systems governed by partial differential equations. Springer, New York
Little JDC (1979) Aggregate advertising models: the state of the art. Oper Res 27(4):629–667
Little JDC (1986) Comment on “Advertising pulsing policies...” by V. Mahajan and E. Muller. Market Sci 5(2):107–108
Liu PT (1980) Dynamic optimization and mathematical economics. Plenum Press, New York
Liu PT, Roxin EO (eds) (1979) Differential games and control theory III. Marcel Dekker, New York
Liu PT, Sutinen JG (eds) (1979) Control theory in mathematical economics. Marcel Dekker, New York
Long NV, Sorger G (2006) Insecure property rights and growth: the role of appropriation costs, wealth effects, and heterogeneity. Econ Theory 28:513–529
Long NV, Vousden N (1977) Optimal control theorems. In: Pitchford JD, Turnovsky SJ (eds) Applications of control theory in economic analysis. North-Holland, Amsterdam, pp 11–34
Lou H (2007) Existence and non-existence results of an optimal control problem by using relaxed control. SIAM J Control Optim 46:1923–1941
Lou S, Sethi SP, Zhang Q (1994) Optimal feedback production planning in a stochastic two-machine flowshop. Eur J Oper Res 73:331–345
Lucas RE Jr (1971) Optimal management of a research and development project. Manag Sci 17:679–697
Lucas RE Jr (1981) Optimal investment with rational expectations. In: Lucas RE Jr, Sargent TJ (eds) Rational expectations and economic practice. G. Allen & Unwin, London, pp 55–66
Luenberger DG (1969) Optimization by vector space methods. Wiley, New York
Luenberger DG (1972) Mathematical programming and control theory: trends of interplay. In: Geoffrion AM (ed) Perspectives on optimization. Addison-Wesley, Reading
Luenberger DG (1973) Introduction to linear and nonlinear programming. Addison-Wesley, Reading
Luenberger DG (1975) A nonlinear economic control problem with a linear feedback solution. IEEE Trans Autom Control AC-20:184–191
Luenberger DG (1979) Introduction to dynamic systems: theory, models, and applications. Wiley, New York
Luhmer A, Steindl A, Feichtinger G, Hartl RF, Sorger G (1988) ADPULS in continuous time. Eur J Oper Res 34(2):171–177
Lundin RA, Morton TE (1975) Planning horizons for the dynamic lot size model: protective procedures and computational results. Oper Res 23:711–734
Luptacik M (1982) Optimal price and advertising policy under atomistic competition. J Econ Dyn Control 4:57–71
Luptacik M, Schubert U (1982) Optimal investment policy in productive capacity and pollution abatement processes in a growing economy. In: Feichtinger G (ed) Optimal control theory and economic analysis. North-Holland, Amsterdam, pp 231–243
Luus R (1993) Piecewise linear continuous optimal control by iterative dynamic programming. Ind Eng Chem Res 32:859–865
Lykina V, Pickenhain S, Wagner M (2008) Different interpretations of the improper integral objective in an infinite horizon control problem. J Math Anal Appl 340:498–510
Macki J, Strauss A (1982) Introduction to optimal control theory. Springer, New York
Magat WA, McCann JM, Morey RC (1986) When does lag structure really matter in optimizing advertising expenditures? Manag Sci 32(2):182–193
Magat WA, McCann JM, Morey RC (1988) Reply to when does lag structure really matter ... Indeed? Manag Sci 34(7):917–918
Magill MJP (1970) On a general economic theory of motion. Springer, New York
Mahajan V, Muller E (1979) Innovation diffusion and new product growth models in marketing. J Market 43:55–68
Mahajan V, Muller E (1986) Advertising pulsing policies for generating awareness for new products. Market Sci 5(2):89–111
Mahajan V, Peterson RA (1978) Innovation diffusion in a dynamic potential adopter population. Manag Sci 24:1589–1597
Mahajan V, Peterson RA (1985) Models for innovation diffusion. Sage, Beverly Hills
Mahajan V, Wind Y (eds) (1986) Innovation diffusion models of new product acceptance. Ballinger, Cambridge
Maimon O, Khmelnitsky E, Kogan K (1998) Optimal flow control in manufacturing systems. Kluwer Academic Publishers, Boston
Majumdar M, Mitra T (1983) Dynamic optimization with a non-convex technology: the case of a linear objective function. Rev Econ Stud 50:143–151
Malanowski K (1984) On differentiability with respect to parameter of solutions to convex optimal control problems subject to state space constraints. Appl Math Optim 12:231–245
Malanowski K (1997) Sufficient optimality conditions for optimal control subject to state constraints. SIAM J Control Optim 35(1):205–227
Malanowski K, Maurer H, Pickenhain S (2004) Second-order sufficient conditions for state-constrained optimal control problems. J Optim Theory Appl 123:595–617
Malliaris AG, Brock WA (1982) Stochastic methods in economics and finance. North-Holland, New York
Mangasarian OL (1966) Sufficient conditions for the optimal control of nonlinear systems. SIAM J Control 4:139–152
Mangasarian OL (1969) Nonlinear programming. McGraw-Hill, New York
Mangasarian OL, Fromovitz S (1967) A maximum principle in mathematical programming. In: Balakrishnan AV, Neustadt LW (eds) Mathematical theory of control. Academic Press, New York
Manh-Hung N (1974) Essays on the optimal dynamic exploitation of natural resources and the social rate of discount. Ph.D. Dissertation, University of Toronto
Marinelli C, Savin S (2008) Optimal distributed dynamic advertising. J Optim Theory Appl 137:569–591
Martín-Herrán G, Rincón-Zapatero JP (2002) Computation of subgame perfect Nash equilibria without Hamilton-Jacobi-Bellman equations. In: Zaccour G (ed) Optimal control and differential games. Essays in honor of Steffen Jørgensen. Kluwer Academic Publishers, Dordrecht, pp 135–151
Martín-Herrán G, Taboubi S, Zaccour G (2005) A time-consistent open loop Stackelberg equilibrium of shelf-space allocation. Automatica 41:971–982
Martín-Herrán G, Rubel O, Zaccour G (2008) Competing for consumer’s attention. Automatica 44:361–370
Martirena-Mantel AM (1971) Optimal inventory and capital policy under certainty. J Econ Theory 3:241–253
Massé P (1962) Optimal investment decisions. Prentice-Hall, Englewood Cliffs
Maurer H (1976) Numerical solution of singular control problems using multiple shooting techniques. J Optim Theory Appl 18:235–257
Maurer H (1977) On optimal control problems with bounded state variables and control appearing linearly. SIAM J Control Optim 15:345–362
Maurer H (1979) Differential stability in optimal control problems. Appl Math Optim 5:283–295
Maurer H (1981) First and second order sufficient optimality conditions in mathematical programming and optimal control. Math Program Stud 14:163–177
Maurer H, Gillessen W (1975) Application of multiple shooting to the numerical solution of optimal control problems with bounded state variables. Computing 15:105–126
Maurer H, Wiegand M (1992) Numerical solution of a drug displacement problem with bounded state variables. Optim Control Appl Methods 13:43–55
Maurer H, Büskens C, Feichtinger G (1998) Solution techniques for periodic control problems: a case study in production planning. Optim Control Appl Methods 19:185–203
Maurer H, Kim JHR, Vossen G (2005) On a state-constrained control problem in optimal production and maintenance. In: Deissenberg C, Hartl RF (eds) Optimal control and dynamic games. Springer, Dordrecht, pp 289–308
Mayne DQ, Polak E (1987) An exact penalty function algorithm for control problems with state and control constraints. IEEE Trans Autom Control 32:380–387
McIntyre J, Paiewonsky B (1967) On optimal control with bounded state variables. In: Leondes CT (ed) Advances in control systems, vol 5. Academic Press, New York
Mehra RK (1975) An optimal control approach to national settlement system planning, vol RM-75-58. International Institute of Applied Systems Analysis, Laxenburg
Mehra RK, Davis RE (1972) Generalized gradient method for optimal control problems with inequality constraint and singular arcs. IEEE Trans Autom Control AC-17:69–79
Mehlmann A (1985) State transformations and the derivation of Nash closed-loop equilibria for non-zero-sum differential games. Appl Math Model 9:353–357
Mehlmann A (1988) Applied differential games. Plenum, New York
Mehlmann A (1997) Wer gewinnt das Spiel? Spieltheorie in Fabeln und Paradoxa. Vieweg, Braunschweig
Mehrez A (1983) A note on the comparison of two different formulations of a risky R & D model. Oper Res Lett 2:249–251
Merton RC (1969) Lifetime portfolio selection under uncertainty: the continuous-time case. Rev Econ Stat 51:247–257
Merton RC (1971) Optimum consumption and portfolio rules in a continuous-time model. J Econ Theory 3:373–413
Merton RC (1973) An intertemporal capital asset pricing model. Econometrica 41(5):867–888
Merton RC (1982) On the microeconomic theory of investment under uncertainty. In: Arrow KJ, Intriligator MD (eds) Handbook of mathematical economics, vol II. North-Holland, Amsterdam, pp 601–669
Mesak HI (1985) On modeling advertising pulsing decisions. Decis Sci 16:25–42
Mesak HI, Darrat AF (1992) On comparing alternative advertising policies for pulsation. Decis Sci 23:541–564
Michel P (1981) Choice of projects and their starting dates: an extension of Pontryagin’s maximum principle to a case which allows choice among different possible evolution equations. J Econ Dyn Control 3:97–118
Michel P (1982) On the transversality condition in infinite horizon optimal control problems. Econometrica 50:975–985
Michel P (1985) Application of optimal control theory to disequilibrium analysis. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 417–427
Miele A (1962) Extremization of linear integral equations by Green’s theorem. In: Leitmann G (ed) Optimization techniques. Academic Press, New York
Miller RE (1979) Dynamic optimization and economic applications. McGraw-Hill, New York
Miller MH, Modigliani F (1961) Dividend policy, growth, and valuation of shares. J Bus 34:411–433
Mirman LJ, Spulber PF (eds) (1982) Essays in the economics of renewable resources. North-Holland, Amsterdam
Mirrlees J (1967) Optimum growth when technology is changing. Rev Econ Stud 34:95–124
Mirrlees J (1972) The optimum town. Swed J Econ 74:114–135
Modigliani F, Hohn F (1955) Production planning over time. Econometrica 23:46–66
Moiseev NN (1971) Numerical methods in optimal control theory. Nauka, Moscow
Monahan GE (1984) A pure birth model of optimal advertising with word-of-mouth. Market Sci 3(2):169–178
Mond B, Hanson M (1968) Duality for control problems. SIAM J Control 6:114–120
Morton TE (1978) Universal planning horizons for generalized convex production scheduling. Oper Res 26:1046–1057
Morton TE, Mitchell A, Zemel E (1982) A discrete maximum principle approach to a general dynamic market response model. In: TIMS studies in the management sciences, vol 18. North-Holland, Amsterdam, pp 117–140
Moser E, Seidl A, Feichtinger G (2014) History-dependence in production-pollution-trade-off models: a multi-stage approach. Ann Oper Res 222(1):457–481
Motta M, Rampazzo F (1996) Dynamic programming for nonlinear systems driven by ordinary and impulsive controls. SIAM J Control Optim 34(1):199–225
Muller E (1983) Trial/awareness advertising decision: a control problem with phase diagrams with non-stationary boundaries. J Econ Dyn Control 6:333–350
Munro GR, Scott AD (1985) The economics of fisheries management. In: Kneese AV, Sweeney JL (eds) Handbook of natural resource and energy economics, vol 2, 1st edn, chap 14. Elsevier, London, pp 623–676
Murata Y (1982) Optimal control methods for linear discrete-time economic systems. Springer, New York
Murray DM, Yakowitz SJ (1984) Differential dynamic programming and Newton’s methods for discrete optimal control problems. J Optim Theory Appl 43(3):395–414
Muzicant J (1980) Systeme mit verteilten Parametern in der Bioökonomie: Ein Maximumprinzip zur Kontrolle altersstrukturierter Modelle. Dissertation, Institut für Unternehmensforschung, Technische Universität Wien
Nahorski Z, Ravn HF, Vidal RVV (1984) The discrete-time maximum principle: a survey and some new results. Int J Control 40:533–554
Naik PA, Mantrala MK, Sawyer A (1998) Planning pulsing media schedules in the presence of dynamic advertising quality. Market Sci 17(3):214–235
Naik PA, Prasad A, Sethi SP (2008) Building brand awareness in dynamic oligopoly markets. Manag Sci 54(1):129–138
Näslund B (1966) Simultaneous determination of optimal repair policy and service life. Swed J Econ 68:63–73
Näslund B (1969) Optimal rotation and thinning. For Sci 15:446–451
Näslund B (1979) Consumer behavior and optimal advertising. J Oper Res Soc 20:237–243
Neck R (1982) Dynamic systems with several decision-makers. In: Feichtinger G, Kall P (eds) Operations research in progress. D. Reidel Publishing Company, Dordrecht, pp 261–284
Neck R (1984) Stochastic control theory and operational research. Eur J Oper Res 17:283–301
Nelson RT (1960) Labor assignment as a dynamic control problem. Oper Res 14:369–376
Nepomiastchy P (1970) Application of optimal control theory and penalty function techniques to solution of a particular scheduling problem (in Russian). Zhurnal Vychislitel’noi Matematiki i Matematicheskoi Fiziki 10(4)
Nerlove M, Arrow KJ (1962) Optimal advertising policy under dynamic conditions. Economica 29:129–142
Neustadt LW (1976) Optimization: a theory of necessary conditions. Princeton University Press, Princeton
Nguyen D (1997) Marketing decisions under uncertainty. Kluwer Academic Publishers, Boston
Norström CJ (1978) The continuous wheat trading model reconsidered. An application of mathematical control theory with a state constraint, W.P. 58-77-78, GSIA, Carnegie Mellon University, Pittsburgh
Novak AJ, Feichtinger G (1993) Optimal treatment of cancer diseases. Int J Syst Sci 24:1253–1263
Oberle HJ (1979) Numerical computation of singular control problems with application to optimal heating and cooling by solar energy. Appl Math Optim 5:297–314
Oberle HJ (1986) Numerical solution of minimax optimal control problems by multiple shooting technique. J Optim Theory Appl 50:331–358
Oberle HJ, Grimm W (1989) BNDSCO: a program for the numerical solution of optimal control problems. Report 515, Institute for Flight Systems Dynamics, German Aerospace Research Establishment DLR, Oberpfaffenhofen, Germany
Oberle HJ, Sothmann B (1999) Numerical computation of optimal feed rates for a fed-batch fermentation model. J Optim Theory Appl 100(1):1–13
Oğuztöreli MN, Stein RB (1983) Optimal control of antagonistic muscles. Biol Cybern 48:91–99
Øksendal BK (1998) Stochastic differential equations: an introduction with applications. Springer, New York
Olsder GJ (1976) Some thoughts about simple advertising models as differential games and the structure of coalitions. In: Ho YC, Mitter SK (eds) Directions in large-scale systems, many-person optimization and decentralized control. Plenum Press, New York, pp 187–205
Oniki H (1973) Comparative dynamics (sensitivity analysis) in optimal control theory. J Econ Theory 6:265–283
Oren SS, Powell SG (1985) Optimal supply of a depletable resource with a backstop technology. Oper Res 33:277–292
Osayimwese I (1974) Rural-urban migration and control theory. Geogr Anal 4:147–161
Ozga S (1960) Imperfect markets through lack of knowledge. Quart J Econ 74:29–52
Palda KS (1964) The measurement of cumulative advertising effects. Prentice-Hall, Englewood Cliffs
Pantoja JF, Mayne DQ (1991) Sequential quadratic programming algorithm for discrete optimal control problems with control inequality constraints. Int J Control 53(4):823–836
Parlar M (1983) Optimal forest fire control with limited reinforcements. Optim Control Appl Methods 4:185–191
Parlar M (1984) Optimal dynamic service rate control in time dependent M/M/S/N queues. Int J Syst Sci 15:107–118
Parlar M (1986) A problem in jointly optimal production and advertising decisions. Int J Syst Sci 17(9):1373–1380
Parlar M, Vickson RG (1980) An optimal control problem with piecewise quadratic cost functional containing a ’dead zone’. Optim Control Appl Methods 1:361–372

Parlar M, Vickson RG (1982) Optimal forest fire control: an extension of Park's model. For Sci 28:345–355

Pauwels W (1977) Optimal dynamic advertising policies in the presence of continuously distributed time lags. J Optim Theory Appl 22(1):79–89

Pekelman D (1974) Simultaneous price-production decision. Oper Res 22:788–794

Pekelman D (1975) Production smoothing with fluctuating price. Manag Sci 21:576–590

Pekelman D (1979) On optimal utilization of production processes. Oper Res 27:260–278

Pekelman D, Rausser GC (1978) Adaptive control: survey of methods and applications. In: Bensoussan A et al (eds) Applied optimal control, TIMS studies in management sciences, vol 9. North-Holland, Amsterdam, pp 89–120

Pekelman D, Sethi SP (1978) Advertising budgeting, wearout and copy replacement. J Oper Res Soc 29:651–659

Pepyne DL, Cassandras CG (1999) Performance optimization of a class of discrete event dynamic systems using calculus of variations techniques. J Optim Theory Appl 100(3):599–622

Perrakis S (1976) A note on optimal equity financing of the corporation. J Financ Quant Anal 11(1):157–164

Pesch HJ (1989a) Real-time computation of feedback controls for constrained optimal control problems, part 1: neighboring extremals. Optimal Control Appl Methods 10(2):129–145

Pesch HJ (1989b) Real-time computation of feedback controls for constrained optimal control problems, part 2: a correction method based on multiple shooting. Optimal Control Appl Methods 10(2):147–171

Pesch HJ (1994) A practical guide to the solution of real-life optimal control problems. Control Cybern 23(1–2):7–60

Pesch HJ, Bulirsch R (1994) The maximum principle, Bellman's equation, and Carathéodory's work. J Optim Theory Appl 80(1):203–229

Pesch HJ, Plail M (2009) The maximum principle of optimal control: a story of ingenious ideas and missed opportunities. Control Cybern 38(4A):973–995

Peterson DW (1973) The economic significance of auxiliary functions in optimal control. Int Econ Rev 14:234–252

Peterson DW (1974) On sensitivity in optimal control problems. J Optim Theory Appl 13(1):56–73

Peterson FM, Fisher AC (1977) The exploitation of extractive resources: a survey. Econ J 87:681–721

Peterson DW, Zalkin JH (1978) A review of direct sufficient conditions in optimal control theory. Int J Control 28:589–610

Petrov IP (1968) Variational methods in optimum control theory. Academic Press, New York

Pickenhain S (2010) On adequate transversality conditions for infinite horizon optimal control problems - a famous example of Halkin. In: Cuaresma JC, Palokangas T, Taraysev A (eds) Dynamic systems, economic growth, and the environment. Dynamic modeling and econometrics in economics and finance, vol 12. Springer, Berlin, pp 3–22

Pickenhain S, Lykina V (2006) Sufficiency conditions for infinite horizon optimal control problems. In: Seeger A (ed) Recent advances in optimization. Lecture notes in economics and mathematical systems, vol 563. Springer, Berlin, pp 217–232

Pickenhain S, Lykina V, Wagner M (2008) On the lower semicontinuity of functionals involving Lebesgue or improper Riemann integrals in infinite horizon optimal control problems. Control Cybern 37:451–468

Pierskalla WP, Voelker JA (1976) Survey of maintenance models: the control and surveillance of deteriorating systems. Naval Res Logist Quart 23:353–388

Pindyck RS (ed) (1978a) Advances in the economics of energy and resources, vol II. J.A.I. Press, Greenwich

Pindyck RS (1978b) The optimal exploration and production of nonrenewable resources. J Polit Econ 86:841–862

Pindyck RS (1978c) Gains to producers from the cartelization of exhaustible resources. Rev Econ Stat 60:238–251

Pindyck RS (1982) Adjustment costs, uncertainty, and the behavior of the firm. Am Econ Rev 72(3):415–427

Pitchford JD, Turnovsky SJ (eds) (1977) Applications of control theory to economic analysis. North-Holland, Amsterdam

Pohjola M (1984) Threats and bargaining in capitalism: a differential game view. J Econ Dyn Control 8:291–302

Polak E (1971) Computational methods in optimization. Academic Press, New York

Polak E (1973) A historical survey of computational methods in optimal control. SIAM Rev 15:553–584

Polak E, Yang TH, Mayne DQ (1993) A method of centers based on barrier functions for solving optimal control problems with continuous state and control constraints. SIAM J Control Optim 31:159–179

Polyanin AD, Zaitsev VF (2003) Handbook of exact solutions for ordinary differential equations, 2nd edn. Chapman & Hall/CRC, Boca Raton

Pontryagin LS, Boltyanskii VG, Gamkrelidze RV, Mischenko EF (1962) The mathematical theory of optimal processes. Wiley, New York

Prasad A, Sethi SP (2004) Competitive advertising under uncertainty: stochastic differential game approach. J Optim Theory Appl 123(1):163–185

Prasad A, Sethi SP (2009) Integrated marketing communications in markets with uncertainty and competition. Automatica 45(3):601–610

Prasad A, Sethi SP, Naik P (2012) Understanding the impact of churn in dynamic oligopoly markets. Automatica 48:2882–2887

Presman E, Sethi SP (2006) Inventory models with continuous and Poisson demands and discounted and average costs. Prod Oper Manag 15(2):279–293

Presman E, Sethi SP, Zhang Q (1995) Optimal feedback production planning in a stochastic N-machine flowshop. Automatica 31(9):1325–1332

Presman E, Sethi SP, Suo W (1997a) Optimal feedback production planning in stochastic dynamic jobshops. In: Yin G, Zhang Q (eds) Mathematics of stochastic manufacturing systems. Lectures in applied mathematics, vol 33. American Mathematical Society, Providence, pp 235–252

Presman E, Sethi SP, Suo W (1997b) Existence of optimal feedback production plans in stochastic N-machine flowshops with limited buffers. Automatica 33(4):1899–1903

Presman E, Sethi SP, Zhang H, Zhang Q (1998a) Analysis of average cost optimality for an unreliable two-machine flowshop. In: Proceedings of the fourth international conference on optimization techniques and applications, Curtin University of Technology, Perth, Australia, pp 94–112

Presman E, Sethi SP, Zhang H, Zhang Q (1998b) Optimality of zero-inventory policies for an unreliable manufacturing system producing two part types. Dyn Contin Discrete Impuls Syst 4(4):485–496

Prskawetz A, Feichtinger G, Luptacik M (1998) The accomplishment of the Maastricht criteria with respect to initial debt. J Econ 68:93–110

Pytlak R, Vinter RB (1993) PH2SOL solver: an O(N) implementation of an optimization algorithm for a general optimal control problem. Research report C93-36, Centre for Process Systems Engineering, Imperial College, London

Pytlak R, Vinter RB (1999) Feasible direction algorithm for optimal control problems with state and control constraints: implementation. J Optim Theory Appl 101(3):623–649

Raman K (1990) Stochastically optimal advertising policies under dynamic conditions: the ratio rule. Optimal Control Appl Methods 11:283–288

Raman K, Chatterjee R (1995) Optimal monopolist pricing under demand uncertainty in dynamic markets. Manag Sci 41(1):144–162

Ramsey FP (1928) A mathematical theory of savings. Econ J 38:543–559

Rao RC (1970) Quantitative theories in advertising. Wiley, New York

Rao RC (1984) Advertising decisions in oligopoly: an industry equilibrium analysis. Optimal Control Appl Methods 5(4):331–344

Rao RC (1985) A note on optimal and near optimal price and advertising strategies. Manag Sci 31(3):376–377

Rao RC (1986) Estimating continuous time advertising-sales models. Market Sci 5(2):125–142

Rao RC (1990) Impact of competition on strategic marketing decisions. In: Day G, Weitz B, Wensley R (eds) Interface of marketing and strategy. JAI Press, Greenwich

Rapoport A (1980) Mathematische Methoden in den Sozialwissenschaften. Physica-Verlag, Würzburg

Rapp B (1974) Models for optimal investment and maintenance decisions. Almqvist & Wiksell/Wiley, Stockholm/New York

Rausser GC, Hochman E (1979) Dynamic agricultural systems: economic prediction and control. North-Holland, New York

Raviv A (1979) The design of an optimal insurance policy. Am Econ Rev 69:84–96

Ravn HF (1999) Discrete time optimal control. PhD thesis, Technical University of Denmark

Ray A, Blaquière A (1981) Sufficient conditions for optimality of threat strategies in a differential game. J Optim Theory Appl 33:99–109

Reinganum JF (1982) A dynamic game of R & D: patent protection and competitive behavior. Econometrica 50:671–688

Rempala R (1980) On the multicommodity Arrow-Karlin inventory model. In: Proceedings first international symposium on inventories, Hungarian Academy of Science, Budapest

Rempala R (1982) On the multicommodity Arrow-Karlin inventory model, part II: horizon and horizontal solution. In: Proceedings second international symposium on inventories, Hungarian Academy of Science, Budapest

Rempala R (1986) Horizon for the dynamic family of wheat trading problems. In: Proceedings fourth international symposium on inventories, Budapest

Rempala R, Sethi SP (1988) Forecast horizons in single product inventory models. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 3. North-Holland, Amsterdam, pp 225–233

Rempala R, Sethi SP (1992) Decision and forecast horizons for one-dimensional optimal control problems: existence results and applications. Optimal Control Appl Methods 13:179–192

Rempala R, Zabczyk J (1988) On the maximum principle for deterministic impulse control problems. J Optim Theory Appl 59(2):281–288

Richard SF (1979) A generalized capital asset pricing model. TIMS Stud Manag Sci 11:215–232

Ringbeck J (1985) Mixed quality and advertising strategies under asymmetric information. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 197–214

Rishel RW (1965) An extended Pontryagin principle for control systems whose control laws contain measures. J Soc Ind Appl Math Control 3:191–205

Rishel RW (1985) A partially observed advertising model. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 253–262

Roberts SM, Shipman JS (1972) Two-point boundary value problems: shooting methods. Elsevier, New York

Robinson B, Lakhani C (1975) Dynamic price models for new-product planning. Manag Sci 21(10):1113–1122

Robson AJ (1981) Sufficiency of the Pontryagin conditions for optimal control when the time horizon is free. J Econ Theory 24:438–445

Robson AJ (1985) Optimal control of systems governed by partial differential equations: economic applications. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 105–118

Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton

Rockafellar RT (1972) State constraints in convex control problems of Bolza. SIAM J Control 19:691–715

Russak B (1976) Relations among the multipliers for problems with bounded state constraints. SIAM J Control Optim 14:1151–1155

Russell DL (1965) Penalty functions and bounded phase coordinates. SIAM J Control 2:409–422

Sage AP (1968) Optimum systems control. Prentice-Hall, Englewood Cliffs

Salukvadze ME (1979) Vector-valued optimization problems in control theory. Academic Press, New York

Samaratunga C, Sethi SP, Zhou X (1997) Computational evaluation of hierarchical production policies for stochastic manufacturing systems. Oper Res 45(2):258–274

Samuelson PA (1972) The general saddle point property of optimal-control motions. J Econ Theory 5:102–120

Sarma VVS, Alam M (1975) Optimal maintenance policies for machines subject to deterioration and intermittent breakdowns. IEEE Trans Syst Man Cybern SMC-5:396–398

Sasieni M (1989) Optimal advertising strategies. Market Sci 8:358–370

Scalzo RC (1974) N-person linear quadratic differential games with constraints. SIAM J Control 12:419–425

Schaefer MB (1957) Some considerations of population dynamics and economics in relation to the management of marine fisheries. J Fish Res Board Can 14:669–681

Schilling K (1985) On optimization principles in plant ecology. In: Aubin JP et al (eds) Dynamics of macrosystems. Lecture notes in economics and mathematical systems, vol 257. Springer, Berlin, pp 63–71

Seidl A, Kaplan EH, Caulkins JP, Wrzaczek S, Feichtinger G (2016) Optimal control of a terror queue model. Eur J Oper Res 248(1):246–256

Seidman TI, Sethi SP, Derzko NA (1987) Dynamics and optimization of a sales advertising model with population migration. J Optim Theory Appl 52(3):443–462

Seierstad A (1982) Differentiability properties of the optimal value function in control theory. J Econ Dyn Control 4:303–310

Seierstad A (1985) Existence of an optimal control with sparse jumps in the state variable. J Optim Theory Appl 45:265–293

Seierstad A, Sydsæter K (1977) Sufficient conditions in optimal control theory. Int Econ Rev 18:367–391

Seierstad A, Sydsæter K (1983) Sufficient conditions applied to an optimal control problem of resource management. J Econ Theory 31:375–382

Seierstad A, Sydsæter K (1987) Optimal control theory with economic applications. North-Holland, Amsterdam

Selten R (1975) Reexamination of the perfectness concept for equilibrium points in extensive games. Int J Game Theory 4:25–55

Sethi SP (1973a) Optimal control of the Vidale-Wolfe advertising model. Oper Res 21:998–1013

Sethi SP (1973b) Simultaneous optimization of preventive maintenance and replacement policy for machines: a modern control theory approach. AIIE Trans 5(2):156–163

Sethi SP (1973c) An application of optimal control theory in forest management. J Manag Sci Appl Cybern 2:9–16

Sethi SP (1973d) A note on modeling simple dynamic cash balance problems. J Financ Quant Anal 8:685–687; Errata, 13, 585–586

Sethi SP (1974a) Sufficient conditions for the optimal control of a class of systems with continuous lags. J Optim Theory Appl 13:542–552

Sethi SP (1974b) Some explanatory remarks on the optimal control of the Vidale-Wolfe advertising model. Oper Res 22:1119–1120

Sethi SP (1974c) Quantitative guidelines for communicable disease control program: a complete synthesis. Biometrics 30:681–691

Sethi SP (1975) Optimal control of a logarithmic advertising model. Oper Res Quart 26:317–319

Sethi SP (1977a) Dynamic optimal control models in advertising: a survey. SIAM Rev 19(4):685–725

Sethi SP (1977b) Nearest feasible paths in optimal control problems: theory, examples, and counterexamples. J Optim Theory Appl 23(4):563–579

Sethi SP (1977c) Optimal advertising for the Nerlove-Arrow model under a budget constraint. Oper Res Quart 28(3):683–693

Sethi SP (1977d) A linear bang-bang model of firm behavior and water quality. IEEE Trans Autom Control AC-22:706–714

Sethi SP (1978a) A survey of management science applications of the deterministic maximum principle. TIMS Stud Manag Sci 9:33–67

Sethi SP (1978b) Optimal equity financing model of Krouse and Lee: corrections and extensions. J Financ Quant Anal 13(3):487–505

Sethi SP (1978c) Optimal quarantine programs for controlling an epidemic spread. J Oper Res Soc 29(3):265–268

Sethi SP (1979a) Optimal depletion of exhaustible resources. Appl Math Model 3:367–378

Sethi SP (1979b) Optimal pilfering policy for dynamic continuous thieves. Manag Sci 25(6):535–542

Sethi SP (1979c) Optimal advertising policy with the contagion model. J Optim Theory Appl 29(4):615–627

Sethi SP (1979d) A note on the Nerlove-Arrow model under uncertainty. Oper Res 27(4):839–842; Erratum (1980) 28(4):1026–1027

Sethi SP (1983a) Optimal long-run equilibrium advertising level for the Blattberg-Jeuland model. Manag Sci 29(12):1436–1443

Sethi SP (1983b) Deterministic and stochastic optimization of a dynamic advertising model. Optimal Control Appl Methods 4(2):179–184

Sethi SP (1983c) Applications of optimal control to management science problems. In: Proceedings of the 1983 World Conference on Systems, Caracas, Venezuela

Sethi SP (1984) Application of the maximum principle to production and inventory problems. In: Proceedings third international symposium on inventories, Budapest, pp 753–756

Sethi SP (1990) Decision and forecast horizons in dynamic optimization. In: Singh MG (ed) Systems and control encyclopedia supplementary, vol 1. Pergamon Press, Oxford, pp 192–198

Sethi SP (1996) When does the share price equal the present value of future dividends? - a modified dividend approach. Econ Theory 8:307–319

Sethi SP (1997a) Optimal consumption and investment with bankruptcy. Kluwer Academic Publishers, Boston

Sethi SP (1997b) Some insights into near-optimal plans for stochastic manufacturing systems. In: Yin G, Zhang Q (eds) Mathematics of stochastic manufacturing systems. Lectures in applied mathematics, vol 33. American Mathematical Society, Providence, pp 287–315

Sethi SP (1998) Optimal consumption-investment decisions allowing for bankruptcy: a survey. In: Ziemba WT, Mulvey JM (eds) Worldwide asset and liability modeling. Cambridge University Press, Cambridge, pp 387–426

Sethi SP (2010) i3: incomplete information inventory models. Decision Line, October, 16–19. Reprinted in Proceedings of the first international conference on best practices in supply chain management, Bhubaneswar, India, 22–23 November 2012, pp 658–662

Sethi SP, Bass FM (2003) Optimal pricing in a hazard rate model of demand. Optimal Control Appl Methods 24:183–196

Sethi SP, Chand S (1979) Planning horizon procedures in machine replacement models. Manag Sci 25:140–151; Erratum (1980) 26(3):342

Sethi SP, Chand S (1981) Multiple finite production rate dynamic lot size inventory models. Oper Res 29(5):931–944

Sethi SP, Lee SC (1981) Optimal advertising for the Nerlove-Arrow model under a replenishable budget. Optimal Control Appl Methods 2(2):165–173

Sethi SP, Lehoczky JP (1981) A comparison of Ito and Stratonovich formulations of problems in finance. J Econ Dyn Control 3:343–356

Sethi SP, Li T (2017) A review of dynamic Stackelberg game models. Discrete Contin Dyn Syst Ser B 22(1):125–159

Sethi SP, McGuire TW (1977) Optimal skill mix: an application of the maximum principle for systems with retarded controls. J Optim Theory Appl 23:245–275

Sethi SP, Morton TE (1972) A mixed optimization technique for the generalized machine replacement problem. Naval Res Logist Quart 19:471–481

Sethi SP, Sorger G (1989) Concepts of forecast horizons in stochastic dynamic games. In: Proceedings of the 28th conference on decision and control, Tampa, Florida, pp 195–197

Sethi SP, Sorger G (1990) An exercise in modeling of consumption, import, and export of an exhaustible resource. Optimal Control Appl Methods 11:191–196

Sethi SP, Sorger G (1991) A theory of rolling horizon decision making. Ann Oper Res 29:387–416

Sethi SP, Staats PW (1978) Optimal control of some simple deterministic epidemic models. J Oper Res Soc 29(2):129–136

Sethi SP, Taksar MI (1988) Deterministic and stochastic control problems with identical optimal cost functions. In: Bensoussan A, Lions JL (eds) Analysis and optimization of systems. Lecture notes in control and information sciences. Springer, New York, pp 641–645

Sethi SP, Taksar MI (1990) Deterministic equivalent for a continuous-time linear-convex stochastic control problem. J Optim Theory Appl 64(1):169–181

Sethi SP, Taksar MI (2002) Optimal financing of a corporation subject to random returns. Math Financ 12(2):155–172

Sethi SP, Thompson GL (1970) Applications of mathematical control theory to finance. J Financ Quant Anal 5(4–5):381–394

Sethi SP, Thompson GL (1977) Christmas toy manufacturer's problem: an application of the stochastic maximum principle. Opsearch 14:161–173

Sethi SP, Thompson GL (1981a) Simple models in stochastic production planning. In: Bensoussan A, Kleindorfer PR, Tapiero CS (eds) Applied stochastic control in econometrics and management science. North-Holland, Amsterdam, pp 295–304

Sethi SP, Thompson GL (1981b) A tutorial on optimal control theory. Inf Syst Oper Res 19(4):279–291

Sethi SP, Thompson GL (1982) Planning and forecast horizons in a simple wheat trading model. In: Feichtinger G, Kall P (eds) Operations research in progress. D. Reidel Publishing Company, Dordrecht, pp 203–214

Sethi SP, Zhang Q (1994a) Hierarchical decision making in stochastic manufacturing systems. Systems and control: foundations and applications. Birkhäuser, Boston

Sethi SP, Zhang Q (1994b) Hierarchical controls in stochastic manufacturing systems. SIAG/CST Newsl 2(1):1–5

Sethi SP, Zhang Q (1994c) Asymptotic optimal controls in stochastic manufacturing systems with machine failures dependent on production rates. Stochastics Stochastics Rep 48:97–121

Sethi SP, Zhang Q (1995a) Hierarchical production and setup scheduling in stochastic manufacturing systems. IEEE Trans Autom Control 40(5):924–930

Sethi SP, Zhang Q (1995b) Multilevel hierarchical decision making in stochastic marketing-production systems. SIAM J Control Optim 33(2):538–553

Sethi SP, Zhang Q (1998a) Asymptotic optimality of hierarchical controls in stochastic manufacturing systems: a review. In: Aronson JE, Zionts S (eds) Operations research: methods, models, and applications. Quorum Books, Westport, pp 267–294

Sethi SP, Zhang Q (1998b) Optimal control of dynamic systems by decomposition and aggregation. J Optim Theory Appl 99(1):1–22

Sethi SP, Zhang H (1999a) Average-cost optimal policies for an unreliable flexible multiproduct machine. Int J Flex Manuf Syst 11:147–157

Sethi SP, Zhang H (1999b) Hierarchical production controls for a stochastic manufacturing system with long-run average cost: asymptotic optimality. In: McEneaney WM, Yin G, Zhang Q (eds) Stochastic analysis, control, optimization and applications: a volume in honor of W.H. Fleming. Birkhäuser, Boston, pp 621–637

Sethi SP, Zhang Q (2001) Near optimization of stochastic dynamic systems by decomposition and aggregation. Optimal Control Appl Methods 22:333–350

Sethi SP, Zhang Q (2004) Problem 4.3 feedback control in flowshops. In: Blondel VD, Megretski A (eds) Unsolved problems in mathematical systems and control theory. Princeton University Press, Princeton, pp 140–143

Sethi SP, Zhou X (1996a) Optimal feedback controls in deterministic dynamic two-machine flowshops. Oper Res Lett 19(5):225–235

Sethi SP, Zhou X (1996b) Asymptotic optimal feedback controls in stochastic dynamic two-machine flowshops. In: Yin G, Zhang Q (eds) Recent advances in control and optimization of manufacturing systems. Lecture notes in control and information sciences, vol 214. Springer, New York, pp 147–180

Sethi SP, Thompson GL, Udayabhanu V (1985) Profit maximization models for exponential decay processes. Eur J Oper Res 2:101–115

Sethi SP, Suo W, Taksar MI, Zhang Q (1997a) Optimal production planning in a stochastic manufacturing system with long-run average cost. J Optim Theory Appl 92(1):161–188

Sethi SP, Zhang H, Zhang Q (1997b) Hierarchical production control in a stochastic manufacturing system with long-run average cost. J Math Anal Appl 214:151–172

Sethi SP, Zhang Q, Zhou X (1997c) Hierarchical production controls in a stochastic two-machine flowshop with a finite internal buffer. IEEE Trans Robot Autom 13(1):1–13

Sethi SP, Suo W, Taksar MI, Yan H (1998a) Optimal production planning in a multiproduct stochastic manufacturing system with long-run average cost. Discrete Event Dyn Syst 8(1):37–54

Sethi SP, Zhang H, Zhang Q (1998b) Minimum average-cost production planning in stochastic manufacturing systems. Math Models Methods Appl Sci 8(7):1251–1276

Sethi SP, Zhang H, Zhang Q (2000) Optimal production rates in a deterministic two-product manufacturing system. Optimal Control Appl Methods 21:125–135

Sethi SP, Yan H, Zhang H, Zhang Q (2002) Optimal and hierarchical controls in dynamic stochastic manufacturing systems: a survey. Manuf Serv Oper Manag 4(2):133–170

Sethi SP, Zhang H, Zhang Q (2005) Average-cost control of stochastic manufacturing systems. Stochastic modelling and applied probability. Springer, New York

Sethi SP, Prasad A, He X (2008a) Optimal advertising and pricing in a new-product adoption model. J Optim Theory Appl 139(2):351–360

Sethi SP, Yeh DHM, Zhang R, Jardine A (2008b) Optimal maintenance and replacement of extraction machinery. J Syst Sci Syst Eng 17(4):416–431

Shani U, Tsur Y, Zemel A (2005) Characterizing dynamic irrigation policies via Green's theorem. In: Deissenberg C, Hartl RF (eds) Optimal control and dynamic games. Springer, Dordrecht, pp 105–117

Shapiro C (1982) Consumer information, product quality, and seller reputation. Bell J Econ 13:20–25

Shapiro A (1997) On uniqueness of Lagrange multipliers in optimization problems subject to cone constraints. SIAM J Optim 7(2):508–518

Sharomi O, Malik T (2017) Optimal control in epidemiology. Ann Oper Res 251:55–71

Shell K (ed) (1967) Essays on the theory of optimal economic growth. The MIT Press, Cambridge

Shell K (1969) Applications of Pontryagin's maximum principle to economics. In: Kuhn HW, Szego GP (eds) Mathematical systems theory and economics. Springer, Berlin

Siebert H (1985) Economics of the resource-exporting country: intertemporal theory of supply and trade. JAI Press, Greenwich

Silva GN, Vinter RB (1997) Necessary conditions for optimal impulsive control problems. SIAM J Control Optim 35(6):1829–1846

Simaan M, Cruz JB Jr (1975) Formulation of Richardson's model of arms race from a differential game viewpoint. Rev Econ Stud 42:67–77

Simon HA (1956) Dynamic programming under uncertainty with a quadratic criterion function. Econometrica 24:74–81

Simon HA (1982) ADPULS: an advertising model with wearout and pulsation. J Market Res 19:352–363

Singh MG (1980) Dynamical hierarchical control. North-Holland, Amsterdam

Singh MG, Titli A, Malanowski K (1985) Decentralized control design: an overview. Large Scale Syst 9:215–230

Singhal K (1992) A noniterative algorithm for the multiproduct production and work force planning problem. Oper Res 40(3):620–625

Singhal K, Singhal J (1986) A solution to the Holt et al. model for aggregate production planning. OMEGA 14:502–505

Skiba AK (1978) Optimal growth with a convex-concave production function. Econometrica 46:527–539

Smith VL (1972) Dynamics of waste accumulation: disposal versus recycling. Quart J Econ 86:600–616

Snower DJ (1982) Macroeconomic policy and the optimal destruction of vampires. J Polit Econ 90:647–655

Solow RM (1974) Intergenerational equity and exhaustible resources. Rev Econ Stud 41:29–45

Solow RM, Wan FY (1976) Extraction costs in the theory of exhaustible resources. Bell J Econ 7:359–370

Sorger G (1989) Competitive dynamic advertising: a modification of the Case game. J Econ Dyn Control 13:55–80

Sorger G (1995) Discrete-time dynamic game models for advertising competition in a duopoly. Optimal Control Appl Methods 16:175–188

Sorger G (1998) Markov-perfect Nash equilibria in a class of resource games. Econ Theory 11:79–100

Sorger G (2002) Existence and characterization of time-consistent monetary policy rules. In: Zaccour G (ed) Optimal control and differential games. Kluwer Academic Publishers, Boston, pp 87–103

Sorger G (2005) A dynamic common property resource problem with amenity value and extraction costs. Int J Econ Theory 1:3–19

Southwick L, Zionts S (1974) An optimal-control-theory approach to the education-investment decision. Oper Res 22:1156–1174

Spence M (1981) The learning curve and competition. Bell J Econ 12:49–70

Spiegel MR (1971) Schaum's outline series: theory and problems of calculus of finite differences and difference equations. McGraw-Hill, New York

Spiegel MR, Lipschutz S, Liu J (2008) Schaum's outline of mathematical handbook of formulas and tables, 3rd edn (Schaum's outline series). McGraw-Hill, New York

Spremann K (1985) The signaling of quality by reputation. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 235–252

Sprzeuzkouski AY (1967) A problem in optimal stock management. J Optim Theory Appl 1:232–241

Srinivasan V (1976) Decomposition of a multi-period media scheduling model in terms of single period equivalents. Manag Sci 23:349–360

Stalford H, Leitmann G (1973) Sufficiency conditions for Nash equilibria in N-person differential games. In: Blaquière A (ed) Topics in differential games. North-Holland, Amsterdam, pp 345–376

Starr AW, Ho YC (1969) Nonzero-sum differential games. J Optim Theory Appl 3:184–206

Steindl A, Feichtinger G (2004) Bifurcations to periodic solutions in a production/inventory model. J Nonlinear Sci 14:469–503

Steindl A, Feichtinger G, Hartl RF, Sorger G (1986) On the optimality of cyclical employment policies: a numerical investigation. J Econ Dyn Control 10:457–466

Stepan A (1977) An application of the discrete maximum principle to designing ultradeep drills - the casing optimization. Int J Prod Res 15:315–327

Stern LE (1984) Criteria of optimality in the infinite-time optimal control problem. J Optim Theory Appl 44:497–508

Stiglitz JE, Dasgupta P (1982) Market structure and resource depletion: a contribution to the theory of intertemporal monopolistic competition. J Econ Theory 28:128–164

Stoer J, Bulirsch R (1978) Einführung in die Numerische Mathematik II, Heidelberger Taschenbücher, vol 114. Springer, Berlin

Stöppler S (1975) Dynamische Produktionstheorie. Westdeutscher-Verlag, Opladen

Stöppler S (1985) Der Einfluß der Lagerkosten auf die Produktionsanpassung bei zyklischem Absatz - Eine kontrolltheoretische Analyse. OR-Spektr 7:129–142

Sulem A (1986) Solvable one-dimensional model of a diffusion inventory system. Math Oper Res 11(1):125–133

Swan GW (1984) Applications of optimal control theory in biomedicine. M. Dekker, New York

Sweeney DJ, Abad PL, Dornoff RJ (1974) Finding an optimal dynamic advertising policy. Int J Syst Sci 5(10):987–994

Sydsæter K (1978) Optimal control theory and economics: some critical remarks on the literature. Scand J Econ 80:113–117

Sydsæter K (1981) Topics in mathematical analysis for economists. Academic Press, London

Takayama A (1974) Mathematical economics. The Dryden Press, Hinsdale

Tan KC, Bennett RJ (1984) Optimal control of spatial systems. George Allen & Unwin, London

Tapiero CS (1971) Optimal simultaneous replacement and maintenance of a machine with process discontinuities. Revue française d'informatique et recherche opérationnelle 2:79–86

Tapiero CS (1973) Optimal maintenance and replacement of a sequence of machines and technical obsolescence. Opsearch 19:1–13

Tapiero CS (1977) Managerial planning: an optimum and stochastic control approach. Gordon Breach, New York

Tapiero CS (1978) Optimum advertising and goodwill under uncertainty. Oper Res 26(3):450–463

Tapiero CS (1981) Optimum product quality and advertising. Inf Syst Oper Res 19(4):311–318

Tapiero CS (1982a) Optimum control of a stochastic model of advertising. In: Feichtinger G (ed) Optimal control theory and economic analysis. North-Holland, Amsterdam, pp 287–300

Tapiero CS (1982b) A stochastic model of consumer behavior and optimal advertising. Manag Sci 28(9):1054–1064

Tapiero CS (1983) Stochastic diffusion models with advertising and word-of-mouth effects. Eur J Oper Res 12(4):348–356

Tapiero CS (1988) Applied stochastic models and control in management. North-Holland, Amsterdam

Tapiero CS, Farley JU (1975) Optimal control of sales force effort in time. Manag Sci 21(9):976–985

Tapiero CS, Farley JU (1981) Using an uncertainty model to assess sales response to advertising. Decis Sci 12:441–455

Tapiero CS, Soliman MA (1972) Multi-commodities transportation schedules over time. Networks 2:311–327

Tapiero CS, Venezia I (1979) A mean variance approach to the optimal machine maintenance and replacement problem. J Oper Res Soc 30:457–466

Tapiero CS, Zuckermann D (1983) Optimal investment policy of an insurance firm. Insurance Math Econ 2:103–112

Tapiero CS, Eliashberg J, Wind Y (1987) Risk behaviour and optimum advertising with a stochastic dynamic sales response. Optimal Control Appl Methods 8(3):299–304

Taylor JG (1972) Comments on a multiplier condition for problems with state variable inequality constraints. IEEE Trans Autom Control AC-12:743–744

Taylor JG (1974) Lanchester-type models of warfare and optimal control. Naval Res Logist Quart 21:79–106

Teng JT, Thompson GL (1983) Oligopoly models for optimal advertising when production costs obey a learning curve. Manag Sci 29(9):1087–1101

Teng JT, Thompson GL (1985) Optimal strategies for general price-advertising models. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 183–195

Teng JT, Thompson GL, Sethi SP (1984) Strong decision and forecast horizons in a convex production planning problem. Optimal Control Appl Methods 5(4):319–330

Teo KL, Moore EJ (1977) Necessary conditions for optimality for control problems with time delays appearing in both state and control variables. J Optim Theory Appl 23:413–427

Teo KL, Goh CJ, Wong KH (1991) A unified computational approach to optimal control problems. Longman Scientific & Technical, Essex

Terborgh G (1949) Dynamic equipment policy. McGraw-Hill, New York

Thépot J (1983) Marketing and investment policies of duopolists in a growing industry. J Econ Dyn Control 5:387–404

Thompson GL (1968) Optimal maintenance policy and sale date of a machine. Manag Sci 14:543–550

Thompson GL (1981) An optimal control model of advertising pulsation and wearout. In: Keon JW (ed) ORSA/TIMS special interest conference on market measurement and analysis. The Institute for Management Sciences, Providence, pp 34–43

Thompson GL (1982a) Continuous expanding and contracting economies. In: Deistler M, Furst E, Schwodiauer G (eds) Games, economic dynamics, time series analysis. Physica-Verlag, Berlin, pp 145–153

Thompson GL (1982b) Many country continuous expanding open economies with multiple goods. In: Feichtinger G (ed) Optimal control theory and economic analysis. North-Holland, New York, pp 157–168

Thompson GL, Sethi SP (1980) Turnpike horizons for production planning. Manag Sci 26:229–241

Thompson GL, Teng JT (1984) Optimal pricing and advertising policies for new product oligopoly models. Market Sci 3(2):148–168

Thompson GL, Sethi SP, Teng JT (1984) Strong planning and forecast horizons for a model with simultaneous price and production decisions. Eur J Oper Res 16:378–388

Tidball M, Zaccour G (2009) A differential environmental game with coupling constraints. Optimal Control Appl Methods 30:197–207

Tintner G (1937) Monopoly over time. Econometrica 5:160–170

Tintner G, Sengupta JK (1972) Stochastic economics: stochastic processes, control, and programming. Academic Press, New York

Tolwinski B (1982) A concept of cooperative equilibrium for dynamic games. Automatica 18:431–447

Toussaint S (1985) The transversality condition at infinity applied to a problem of optimal resource depletion. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 429–440

Tracz GS (1968) A selected bibliography on the application of optimal control theory to economic and business systems, management science and operations research. Oper Res 16:174–186

Tragler G, Caulkins JP, Feichtinger G (2001) Optimal dynamic allocation of treatment and enforcement in illicit drug control. Oper Res 49(3):352–362

Treadway AB (1970) Adjustment costs and variable inputs in the theory of the competitive firm. J Econ Theory 2:329–347

Troch I (ed) (1978) Simulation of control systems with special emphasis on modelling and redundancy. Proceedings of the IMACS Symposium, North-Holland, Amsterdam

Tsurumi H, Tsurumi Y (1971) Simultaneous determination of market share and advertising expenditure under dynamic conditions: the case of a firm within the Japanese pharmaceutical industry. Econ Stud Quart 22:1–23

Tu PNV (1969) Optimal educational investment program in an economic planning model. Can J Econ 2:52–64

Tu PNV (1984) Introductory optimization dynamics. Springer, Berlin

Turner RE, Neuman CP (1976) Dynamic advertising strategy: a managerial approach. J Bus Admin 7:1–21

Turnovsky SJ (1981) The optimal intertemporal choice of inflation and unemployment. J Econ Dyn Control 3:357–384

Tzafestas SG (1982a) Optimal and modal control of production-inventory systems. In: Optimization and control of dynamic operational research models. North-Holland, Amsterdam, pp 1–71

Tzafestas SG (1982b) Distributed parameter control systems: theory and application. In: International series on systems and control. Pergamon Press, Oxford

Uhler RS (1979) The rate of petroleum exploration and extraction. In: Pindyck RS (ed) Advances in the economics of energy and resources, vol 2. JAI Press, Greenwich, pp 93–118

Valentine FA (1937) The problem of Lagrange with differential inequalities as added side conditions. In: Contributions to the theory of the calculus of variations, vol 1933–1937. University of Chicago Press, Chicago

Van Hilten O, Kort PM, Van Loon PJJM (1993) Dynamic policies of the firm: an optimal control approach. Springer, New York

Vanthienen L (1975) A simultaneous price-reduction decision making with production adjustment costs. In: Proceedings XX international meeting of TIMS held in Tel Aviv, Israel on 24–29 June 1973, vol 1. Jerusalem Academic Press, Jerusalem, pp 249–254

van Loon PJJM (1983) A dynamic theory of the firm: production, finance and investment. Lecture notes in economics and mathematical systems, vol 218. Springer, Berlin

van Schijndel GJCT (1986) Dynamic shareholder behaviour under personal taxation: a note. In: Streitferdt L et al (eds) Operations research proceedings 1985. Springer, Berlin, pp 488–495

Varaiya PP (1970) N-person nonzero-sum differential games with linear dynamics. SIAM J Control 8:441–449

Veliov VM (2008) Optimal control of heterogeneous systems: basic theory. J Math Anal Appl 346:227–242

Verheyen PA (1985) A dynamic theory of the firm and the reaction on governmental policy. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 313–329

Verheyen P (1992) The jump in models with irreversible investments. In: Feichtinger G (ed) Dynamic economic models and optimal control, vol 4. Elsevier Science, Amsterdam, pp 75–89

Vickson RG (1981) Schedule control for randomly drifting production sequences. Inf Syst Oper Res 19:330–346

Vickson RG (1982) Optimal control of production sequences: a continuous parameter analysis. Oper Res 30:659–679

Vickson RG (1985) Optimal conversion to a new production technology under learning. IIE Trans 17:175–181

Vickson RG (1986/1987) A single product cycling problem under Brownian motion demand. Manag Sci 32(10):1223–1370

Vidale ML, Wolfe HB (1957) An operations research study of sales response to advertising. Oper Res 5(3):370–381

Vidyasagar M (2002) Nonlinear systems analysis, 2nd edn. SIAM, Philadelphia

Villas-Boas JM (1993) Predicting advertising policies in an oligopoly: a model and empirical test. Market Sci 12:88–102

Vinokurov VR (1969) Optimal control of processes described by integral equations I. SIAM J Control Optim 7:324–336

Vinter RB (1983) New global optimality conditions in optimal control theory. SIAM J Control Optim 21:235–245

von Weizsäcker CC (1980) Barriers to entry: a theoretical treatment. Lecture notes in economics and mathematical systems, vol 185. Springer, Berlin

Vousden N (1974) International trade and exhaustible resources: a theoretical model. Int Econ Rev 15:149–167

Wagener FOO (2003) Skiba points and heteroclinic bifurcations, with applications to the shallow lake system. J Econ Dyn Control 27(9):1533–1561

Wagener FOO (2005) Structural analysis of optimal investment for firms with non-concave production. J Econ Behav Org 57(4):474–489

Wagener FOO (2006) Skiba points for small discount rates. J Optim Theory Appl 128(2):261–277

Wagener FOO (2009) On the Leitmann equivalent problem approach. J Optim Theory Appl 142(1):229–242

Wagner HM, Whitin TM (1958) Dynamic version of the economic lot size model. Manag Sci 5:89–96

Wang PKC (1964) Control of distributed parameter systems. In: Leondes CT (ed) Advances in control systems, vol 1. Academic Press, New York, pp 75–170

Wang WK, Ewald CO (2010) A stochastic differential fishery game for a two species fish population with ecological interaction. J Econ Dyn Control 34(5):844–857

Warga J (1962) Relaxed variational problems. J Math Anal Appl 4:111–128

Warga J (1972) Optimal control of differential and functional equations. Academic Press, New York

Warschat J (1985a) Optimal control of a production-inventory system with state constraints and a quadratic cost criterion. RAIRO-OR 19:275–292

Warschat J (1985b) Optimal production planning for cascaded production inventory systems. In: Bullinger HJ, Warnecke HJ (eds) Toward the factory of the future. Springer, Berlin, pp 669–674

Warschat J, Wunderlich HJ (1984) Time-optimal control policies for cascaded production-inventory systems with control and state constraints. Int J Syst Sci 15:513–524

Weber TA (2011) Optimal control theory with applications in economics. The MIT Press, Cambridge

Weinstein MC, Zeckhauser RJ (1975) The optimum consumption of depletable natural resources. Quart J Econ 89:371–392

Welam UP (1982) Optimal and near optimal price and advertising strategies for finite and infinite horizons. Manag Sci 28(11):1313–1327

Westphal LC (1974) Toward the synthesis of solutions of dynamic games. In: Leondes CT (ed) Advances in control systems, vol 11. Academic Press, New York, pp 389–489

Whittle P (1982/1983) Optimization over time: dynamic programming and stochastic control, vol I–II. Wiley, Chichester

Wickwire K (1977) Mathematical models for the control of pests and infectious diseases: a survey. Theor Popul Biol 11:182–238

Wiener N (1949) The interpolation and smoothing of stationary time series. MIT Press, Cambridge

Wirl F (1984) Sensitivity analysis of OPEC pricing policies. OPEC Rev 8:321–331

Wirl F (1985) Stable and volatile prices: an explanation by dynamic demand. In: Feichtinger G (ed) Optimal control theory and economic analysis, vol 2. North-Holland, Amsterdam, pp 263–277

Wirl F, Feichtinger G (2005) History dependence in concave economies. J Econ Behav Organ 57:390–407

Wonham WM (1970) Random differential equations in control theory. In: Bharucha-Reid AT (ed) Probabilistic methods in applied mathematics, vol 2. Academic Press, New York

Wright C (1974) Some political aspects of pollution control. J Environ Econ Manag 1:173–186

Wright SJ (1993) Interior-point methods for optimal control of discrete-time systems. J Optim Theory Appl 77(1):161–187

Wrzaczek S, Kaplan EH, Caulkins JP, Seidl A, Feichtinger G (2017) Differential terror queue games. Dyn Games Appl 7(4):578–593

Wrzaczek S, Kuhn M, Fürnkranz-Prskawetz A, Feichtinger G (2010) The reproductive value in distributed optimal control models. Theor Popul Biol 77(3):164–170

Xepapadeas A, de Zeeuw A (1999) Environmental policy and competitiveness: the Porter hypothesis and the composition of capital. J Environ Econ Manag 37(2):165–182

Yang J, Yan H, Sethi SP (1999) Optimal production planning in pull flow lines with multiple products. Eur J Oper Res 119(3):26–48

Yin G, Zhang Q (1997) Continuous-time Markov chains and applications: a singular perturbation approach. Springer, New York

Yong J, Zhou XY (1999) Stochastic controls: Hamiltonian systems and HJB equations. Springer, New York

Young LC (1969) Calculus of variations and optimal control theory. W.B. Saunders, Philadelphia

Zaccour G (2008a) On the coordination of dynamic marketing channels and two-part tariffs. Automatica 44:1233–1239

Zaccour G (2008b) Time consistency in cooperative differential games: a tutorial. INFOR: Inf Syst Oper Res 46(1):81–92

Zeidan V (1984a) Extended Jacobi sufficiency criterion for optimal control. SIAM J Control Optim 22:294–301

Zeidan V (1984b) A modified Hamilton-Jacobi approach in the generalized problem of Bolza. Appl Math Optim 11:97–109

Zeiler I, Caulkins JP, Grass D, Tragler G (2010) Keeping options open: an optimal control model with trajectories that reach a DNSS point in positive time. SIAM J Control Optim 48(6):3698–3707

Zelikin MI, Borisov VF (1994) Theory of chattering control: with applications to astronautics, robotics, economics, and engineering (Systems & control: foundations & applications). Birkhäuser, Boston

Zhang R, Ren Q (2013) Equivalence between Sethi advertising model and a scalar LQ differential game. In: 25th Chinese control and decision conference (CCDC), 25–27 May 2013

Zhang R, Haruvy E, Prasad A, Sethi SP (2005) Optimal firm contributions to open source software. In: Deissenberg C, Hartl RF (eds) Optimal control and dynamic games. Springer, Dordrecht, pp 197–212

Zhou X, Sethi SP (1994) A sufficient condition for near optimal stochastic controls and its application to manufacturing systems. Appl Math Optim 29:67–92

Ziemba WT, Vickson RG (eds) (1975) Stochastic optimization models in finance. Academic Press, New York

Zimin IN, Ivanilov YP (1971) Solution of network planning problems by reducing them to optimal control problems (in Russian). Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki 11(3)

Ziólko M, Kozlowski J (1983) Evolution of body size: an optimal control model. Math Biosci 64:127–143

Zoltners AA (ed) (1982) Marketing planning models. TIMS studies in the management sciences, vol 18. North-Holland, Amsterdam

Zwillinger D (2003) Standard mathematical tables and formulae. Chapman & Hall/CRC, Boca Raton
Index
A Arutyunov, A.V., 143, 475
Abad, P.L., 473, 538 Aseev, S.M., 143, 475
Adjoint equation, 36–38, 42, 271 Aubin, J.-P., 475, 529
Adjoint variables, 10, 38 Autonomous, 61
Adjoint vector, 35 Axsäter, S., 475
Admissible control, 28
B
Advertising model, 5, 6
Backlogging of demand, 5, 371
Affine function, 22
Bagchi, A., 475
Aggoun, L., 383, 491
Balachandran, B., 507
Agnew, C.E., 473
Balakrishman, A.V., 518
Ahmed, N.U., 462, 473
Bang-bang, 51, 99, 110, 111, 113,
Alam, M., 291, 309, 473, 529 162, 167, 168,
Allen, K.R., 330, 473 231, 273, 286,
Amit, R., 324, 474 289, 301, 390, 456
Amoroso-Robinson relation, 227 Bang function, 19
Anderson, B.D.O., 442, 450, 474 Bankruptcy, 380
Anderson, R.M., 474 Barnea, A., 515
Anti-difference operator, 414 Barnett, W.A., 489
Aoki, M., 448, 474 Basar, T., 385, 396, 405,
Applications to biomedicine, 343 475, 477, 479, 483, 489,
Applications to finance, 159 508, 509
Applications to marketing, 225 Bass, F.M., 475, 532
Arnold, L., 367, 370, 442, 451, 474 Bayes theorem, 443
Aronson, J.E., 474, 534 Bean, J.C., 475
Arora, S.R., 285, 474 Behrens, D.A., 11, 70,
Arrow, K.J., 11, 54, 70, 109, 122, 360,
106, 226, 228, 458, 460, 475, 500
335, 336, 339, Bell, D.J., 49, 455, 476
383, 474, 520, 522 Bellman, R.E., 10, 33, 476
Arthur, W.B., 360, 475 Benchekroun, H., 476
Arugaslan, O., x Benkherouf, L., 476

© Springer Nature Switzerland AG 2019 547


S. P. Sethi, Optimal Control Theory,
https://doi.org/10.1007/978-3-319-98237-3
548 Index

Bennett, R.J., 538 Bowes, M.D., 317, 480


Bensoussan, A., 11, 70, Brachistochrone problem, 9, 424
113, 213, 291, Breakwell, J.V., 231, 480
293, 336, 339, 383, 385, Brekke, K.A., 383, 480
396, 397, 399, Breton, M., 404, 480
404, 474–479, Brito, D.L., 360, 480
506, 524, 533 Brock, W.A., 383, 518
Bequest function, 7, 377 Brockhoff, K., 476
Berkovitz, L.D., 10, 32, 385, 479 Brokate, M., 462, 481
Bernoulli, Jacob, 9 Broken extremal, 427
Bernoulli, Jakob, 10 Brotherton, T., 482
Bernoulli, Johann, 9, 10 Brown, R.G., 198, 481
Bertsekas, D.P., 277, 366, 479, 491 Brownian Motion, 378
Bes, C., 213, 479 Bryant, G.F., 32, 481
Bettiol, P., 41, 479 Bryson, A.E. Jr., 113, 135,
Beyer, D., x, 369, 479 141, 442, 450, 453, 481
Bhar, R., 448, 479 Bryson, Jr., A.E., 39
Bharucha-Reid, A.T., 544 Buchanan, L.F., 448, 481
Bhaskaran, S., 479, 480 Bucy, R., 448, 510
Bionomic equilibrium, 313, 391 Budget constraint, 88
Black, F., 378, 480 Bulirsch, R., 10, 135, 481, 524, 538
Blaquière, A., 480, 528, 537 Bullinger, H.J., 543
Bliss,G.A., 10 Bultez, A.V., 404, 475, 477, 481
Bliss point, 362 Bunching and ironing, 357, 359
Blondel, V.D., 534 Burdet, C.A., 277, 281, 481
Boccia, A., 132, 480 Burgess, R., 498
Boiteux, M., 283, 480 Burmeister, E., 336, 481
Boiteux problem, 283 Büskens, C., 519
Bolton, P., 352, 480 Butkowskiy, A.G., 462, 481
Boltyanskii, V.G., 10, 27, Bylka, S., 297, 481
32, 70, 96, 141, 433, 480,
526 C
Bolza, 10 Caines, P., 482
Bolza form, 30, 270, 277, 280 Çakanyildirim, M., 383, 477–479
Bookbinder, J.H., xiv, 11, 360, 480 Calculus of variations, 9, 419
Borisov, V.F., 545 Canon, M.D., 276, 482
Boucekkine, R., 474 Canonical system, 457
Boundary conditions, 86, 372 Capital accumulation model, 336
Boundary interval, 130 Caputo, M.R., 11, 482
Bourguignon, F., 480 Carathéodory, C., 10
Index 549

Carlson, D.A., 11, 106, 482, 483 Clemhout, S., 487


Carraro, C., 483 Coddington, E.A., 487
Carrillo, J., 484 Cohen, K.J., 42, 227, 487
Case, J.H., 385, 484 Common-property fishery
Cass, D., 484 resources, 389
Cassandras, C.G., 277, 524 Comparison lemma, 239
Caulkins, J.P., 11, 70, Complementary slackness
109, 122, 360, conditions, 73, 74,
458, 460, 475, 76, 81, 82
484, 485, 499, 500, Computational methods, ix, 135,
512, 530, 541, 544, 545 277
Cellina, A., 475 Concave function, 20, 79
Cernea, A., 41, 485 Connors, M.M., 11, 487
Certainty equivalence, 451, 452 Conrad, K., 487
Cesari, L., 32, 485 Constantinides, G.M., 487
Chahim, M., 485 Constraint of rth order, 130
Chain of forests model, 321, 323 Constraint qualifications, 73,
Chain of machines, 297 130, 131, 267
Chand, S., 221, 297, 485, 532 Constraints, 28
Chandra, T., 507 Consumption-investment problem,
Chang, S., 383, 474 377
Chao, R., 485 Consumption model, 7, 8
Chappell, D., 485 Contact time, 130
Charnes, A., 360, 485 Continuous wheat trading model,
Chattering, 545 204
Chattering controls, 234, 253 Control of pest infestations, 343
Chatterjee, R., 527 Control trajectory, 2, 28
Chen, S., 396, 397, 478, 479 Control variable, 2, 28
Chen, S.F., 486 Control vector, 28
Cheng, F., x Convex combination, 20
Chiarella, C., 486 Convex function, 20, 79
Chichilinsky, G., 486 Convex hull, 20
Chikan, A., 501 Convex set, 20
Chintagunta, P.K., 404, 486 CorelDRAW, ix
Chow, G.C., 448, 486 Cottle, R.W., 491
Chutani, A., 478, 479, 486 Cowling, K., 507
Clark, C.W., 11, 283, Crisan, D., 478
311, 312, 317, Critical points, 260
330, 389, 392, 486, 487 Crouhy, M., 213, 477
Clarke, F.H., 32, 207, 487 Cruz, J.B., Jr., 536
550 Index

Cuaresma, J.C., 525 De Pinho, M.D.R., 132, 480


Cullum, C.D., 276, 482 Derived Hamiltonian, 53
Current-value adjoint variables, 86 Derzko, N.A., 54, 253,
Current-value formulation, 70, 80, 324, 383, 460,
111, 115, 280 462, 463, 477, 489, 530
Current-value functions, 80, 82, 84 DeSarbo, W., 361, 507
Current-value Hamiltonian, 81, 83, Descarte’s Rule of Signs, 402
104, 111 Dewatripont, M., 352, 480
Current-value Lagrange de Zeeuw, A.J., 351, 492, 545
multipliers, 81 Dhrymes, P.J., 251, 489
Current-value Lagrangian, 81 Difference equation, 270, 442
Current-value maximum principle, Difference operator, 414
83, 147 Differential games, 385
Cvitanic, J., 352, 487 Differentiation with scalars, 12
Cycloid, 10 Differentiation with vectors, 13, 14
Cyert, R.M., 42, 227, 487 Direct contribution, 41
Discount factor, 6
D Discount rate, 6
Dantzig, G.B., 474, 487 Discrete maximum principle, 259,
Da Prato, G., 477 269, 270
Darrat, A.F., 520 Discrete-time optimal control
Darrough, M.N., 487 problem, 269, 270
Dasgupta, P., 324, 487, 538 Distributed parameter systems,
D’Autume, A., 488 460
Davis, B.E., 159, 164, Dixit, A.K., 489
188, 383, 455, 488 Dmitruk, A.V., 489
Davis, M.H.A., 366, 488 Dobell, A.R., 336, 481
Davis, R.E., 519 Dockner, E.J., 11, 335,
Dawid, H., 488 361, 385, 396,
Day, G., 527 404, 405, 489, 490,
DDT, 346, 348, 351 493, 509
Deal, K.R., 388, 404, 405, 488 Dogramaci, A., 291, 490
Dechert, D.W., 458, 488 Dohrmann, C.R., 277, 490
Decision horizon, 213, 215, 216, Dolan, R.J., 490, 508
219 Dorfman, R., 491
Deger, S., 488 Dornoff, R.J., 538
Deissenberg, C., 488–490, 497, Drews, W., 491
502, 504, 519, 535, 545 Dreyfus, S.E., 479
Deistler, M., 540 Dual variables, 42
Delfour, M.C., 477 Dubovitskii, A.J., 132, 491
Index 551

Dunn, J.C., 277, 491 Farley, J.U., 539


Durrett, R., 367, 491 Fattorini, H.O., 492
Dury, K., 485 Feedback Nash equilibrium,
Dynamic efficiency condition, 337 388, 393
Dynamic programming, 32, 366, Feedback Nash stochastic
433 differential game, 392
Feedback Stackelberg equilibrium,
E 403
Economic applications, 335, 383 Feedback Stackelberg stochastic
Economic interpretation, 40, 84, differential game, 395
175, 337 Feenstra, T.L., 492
Educational policy, 25 Feichtinger, G., x, xiv, 11, 39, 70,
Eigenvalues, 412, 413 79, 85, 104, 109, 122, 132,
Eigenvectors, 412, 413 136, 140, 207,
El-Hodiri, M., 491 225, 335, 351,
Eliashberg, J., 361, 491, 507, 539 360, 361, 458,
Elliott, R.J., 383, 491 460, 463, 464,
El Ouardighi, F., 491, 492 475, 484, 485,
Elton, E., 159, 492 487–496, 499–504,
Elzinga, D.J., 159, 188, 455, 488 509, 512, 513,
Ending correction, 196 515–517, 519–522,
Entry time, 130 527–530, 533,
Envelope theorem, 54, 253, 356 537, 539–542, 544
EOQ, 191 Feinberg, F.M., 235, 495, 496
Epidemic control, 343 Fel’dbaum, A.A., 36, 433, 438, 496
Equilibrium relation, 42 Feng, Q., 291, 336, 339,
Erickson, G.M., 492 383, 474, 478
Erickson, L.E., 11, 404, 506 Ferreira, M.M.A., 143, 496
Euler, 9 Ferreyra, G., 496
Euler equation, 421, 422, 427, 428 Filar, J., 483
Euler-Lagrange equation, 422 Filipiak, J., 496
Ewald, C.-O., 543 Finite difference equations, 414
Excel, ix, 57–60 First-order linear equations, 409
Exhaustible resource model, First-order pure state constraints,
111, 324 135
Exit time, 130 Fischer, T., 496
Fisher, A.C., 525
F Fishery management, 392
Factorial power, 414 Fishery model, 312, 389
Fan, L.T., 360, 492, 506 Fishing mortality function, 392
552 Index

Fixed-end-point problem, 77, 86, 98, 99, 113
Fleming, W.H., 366, 367, 383, 442, 496
Fletcher, R., 496
Fomin, S.V., 419, 420, 422, 427, 429, 430, 499
Fond, S., 496
Forecast horizons, 213
Forest fertilization model, 331
Forestry model, 111
Forest thinning model, 317, 321
Forgetting coefficient, 5
Forster, B.A., 496
Fourgeaud, C., 496
Fraiman, N.M., 291, 490
Francis, P.J., 343, 496
Frankena, J.F., 496
Frankowska, H., 41, 485
Free-end-point problem, 86
Free terminal time problems, 93
Friedman, A., 385, 496, 497
Fromovitz, S., 269, 518
Fruchter, G.E., 404, 492, 497
Fuller, D., 325, 497
Full-rank condition, 23, 72, 130, 131
Fundamental lemma, 422
Funke, U.H., 497
Fürnkranz-Prskawetz, A., 463, 513, 544
Fursikov, A.V., 497
Furst, E., 540

G
Gaimon, C., 291, 360, 361, 484, 485, 497, 498
Gamkrelidze, R.V., 10, 27, 32, 70, 96, 141, 234, 433, 498, 526
Gandolfo, G., 493, 498
Gaskins, D.W., Jr., 498
Gaugusch, J., 498
Gaussian, 442, 443
Gavrila, C., 499
Geismar, N., x
Gelfand, I.M., 419, 420, 422, 427, 429, 430, 499
General discrete maximum principle, 276
Generalized bang-bang, 113, 167
Generalized Legendre-Clebsch condition, 189, 454–456
Geoffrion, A.M., 516
Gerchak, Y., 499
Gfrerer, H., 499
Gibson, J.E., 455, 508
Gihman, I.I., 375, 499
Gillessen, W., 519
Girsanov, I.V., 499
Glad, S.T., 499
Global Saddle Point Theorem, 343, 350, 456, 457
Goal Seek, 57, 59
Goh, B.-S., 135, 311, 499
Goh, C.J., 540
Goldberg, S., 414, 499
Golden Path, 108
Golden Rule, 108, 123
Goldstein, J.R., 488
Goldstine, H.H., 499
Göllmann, L., 499
Goodwill, 5, 226
Goodwill elasticity of demand, 229
Gopalsamy, K., 499
Gordon, H.S., 312, 313, 499
Gordon, M.J., 181, 499
Gordon's formula, 181
Gould, J.P., 231, 257, 362, 499
Grass, D., 11, 70, 109, 122, 360, 458, 460, 484, 485, 491, 500, 545
Green's theorem, 225, 237, 239, 245, 254, 257, 314, 344, 346, 363, 391
Grienauer, W., 495
Grimm, W., 500, 523
Gross, M., 500
Gruber, M., 159, 492
Gruver, W.A., 512
Gutierrez, G.J., 404, 505

H
Hadley, G., 11, 70, 335, 500
Hahn, M., 500
Halkin, H., 32, 276, 500
Hämäläinen, R.P., 392, 500
Hamilton, 10
Hamiltonian, 35, 41, 73, 271, 337, 437
Hamiltonian maximizing condition, 36, 39, 74, 97, 99, 118, 437
Hamilton-Jacobi-Bellman (HJB) equation, 32, 36, 366, 368, 371, 393
Hamilton-Jacobi equation, 371, 394
Han, M., 500
Hanson, M., 521
Hanssens, D.M., 500
Harris, F.W., 191, 500
Harris, H., 360, 500
Harrison, J.M., 379, 501
Hartberger, R.J., 32, 436, 491, 501
Hartl, R.F., x, 11, 32, 39, 70, 79, 85, 95, 104, 131, 132, 135–137, 140, 141, 149, 207, 213, 225, 281, 285, 351, 360, 361, 458, 460, 463, 464, 484, 485, 490, 491, 493, 495, 497, 499–505, 512, 516, 519, 535, 537, 545
Hartman, R., 458, 504
Haruvy, E., 235, 253, 504, 545
Harvey, A.C., 504
Haunschmied, J.L., 360, 484, 495, 504
Haurie, A., 11, 106, 213, 385, 392, 463, 479, 483, 500, 504, 505
Haussmann, U.G., 505
He, X., 396, 404, 505, 535
Heal, G.M., 324, 487, 505
Heaps, T., 283, 505
Heckman, J., 505
Heineke, J.M., 487
Hestenes, M.R., 10, 70, 505
HJB equation, 36, 376
HMMS model, 191
Ho, Y.-C., 39, 113, 141, 385, 388, 442, 450, 453, 481, 505, 523, 537
Hochman, E., 383, 527
Hofbauer, J., 505, 506
Hoffmann, K.H., 506
Hohn, F., 213, 520
Holly, S., 506
Holt, C.C., 191, 200, 202, 506
Holtzman, J.M., 276, 506
Homogeneous function of degree k, 22
Horsky, D., 506
Hotelling, H., 324, 506
Hritonenko, N., 474
Hsu, V.N., 221, 485
Hung, N.M., 504
Hurst, Jr., E.G., 11, 70, 113, 477
Hwang, C.L., 506
Hyun, J.S., 500

I
Ijiri, Y., 191, 205, 506
Ilan, Y., 474
Illustration of left and right limits, 18
Imp, 19
Impulse control, 19, 113, 242, 243
Impulse control model, 113
Impulse stochastic control, 383
Imputed value, 261
Incentive compatibility, 353, 354
Indirect adjoining method, 137, 147
Indirect contribution, 41
Individual rationality, 353
Infinite horizon, 6, 103
Instantaneous profit rate, 29
Interior interval, 130
Intriligator, M.D., 336, 506, 507, 520
Inventory control problem, 152
Investment allocation, 65
Ioffe, A.D., 507
Isaacs, R., 10, 170, 385, 507
Isoperimetric constraint, 88, 252
Itô stochastic differential equation, 366, 393
Ivanilov, Y.P., 360, 546

J
Jabrane, A., 483
Jacobi, 10
Jacobson, D.H., 49, 455, 476, 507, 514
Jacquemin, A.P., 228, 229, 231, 507
Jagpal, S., 507
Jain, D., 404, 486
Jamshidi, M., 507
Jardine, A., 535
Jarrar, R., 404, 480, 507
Jazwinski, A.H., 507
Jedidi, K., 361, 507
Jennings, L.S., 507
Jeuland, A.P., 490, 508
Ji, Y., 508
Jiang, J., 508
Johar, M., 508
Johnson, C.D., 455, 508
Jones, P., 508
Jørgensen, S., x, 11, 311, 335, 361, 385, 396, 404, 405, 489–491, 493, 495, 502, 508, 509
Joseph, P.D., 452
Jump conditions, 135
Jump Markov processes, 383
Junction times, 130

K
Kaitala, V.T., 392, 494, 500, 509
Kalaba, R.E., 476
Kalish, S., 360, 497, 509, 510
Kall, P., 501, 522, 533
Kalman, R.E., 442, 448, 510
Kalman-Bucy filter, 442, 447, 448
Kalman filter, 366, 441–443, 447
Kamien, M.I., 11, 290, 293, 296, 335, 360, 510
Kamien-Schwartz model, 111, 291
Kaplan, E.H., 360, 530, 544
Kaplan, W., 510
Karatzas, I., 337, 367, 381–383, 510
Karray, S., 511
Karreman, H.F., 480
Kavadias, S., 485
Keeler, E., 511
Keller, H.B., 346, 511
Kemp, M.C., 11, 70, 335, 486, 500, 511
Kendrick, D.A., 511
Keon, J.W., 540
Keppo, J., 383, 478
Kern, D., 499
Khmelnitsky, E., 11, 511, 512, 517
Kilkki, P., 317, 318, 511
Kim, J.-H.R., 455, 519
Kirakossian, G.T., 503
Kirby, B.J., 511
Kirk, D.E., 36, 511
Kiseleva, T., 458, 511
Klein, C.F., 512
Kleindorfer, G.B., 274, 512
Kleindorfer, P.R., 213, 274, 512, 533
Kleinschmidt, P., 502
Kneese, A.V., 480, 505, 521
Knobloch, H.W., 512
Knowles, G., 512
Kogan, K., 11, 478, 511, 512, 517
Kopel, M., 488
Kort, P.M., 11, 311, 335, 351, 360, 361, 404, 458, 460, 463, 484, 485, 491, 492, 495, 502–504, 509, 512, 542
Kortanek, K., 360, 485
Kotowitz, Y., 512
Kozlowski, J., 546
Krabs, W., 506
Kraft, D., 135, 481
Krarup, J., 491
Krauth, J., 361, 502
Kreindler, E., 512
Krelle, W., 476, 512
Krichagina, E., 512
Kriebel, C.H., 274, 512
Krishnamoorthy, A., 475, 505, 513
Krouse, C.G., 164, 513
Krutilla, J.V., 317, 480, 510
Kugelmann, B., 513
Kuhn, H.W., 535
Kuhn, M., 463, 513, 544
Kuhn-Tucker conditions, 260, 262, 269, 271
Kumar, P.R., 513
Kumar, S., x, 442, 508, 513
Kurawarwala, A.A., 513
Kurcyusz, S., 144, 513
Kurz, M., 11, 54, 70, 106, 228, 335, 336, 474
Kushner, H.J., 513
Kydland, F.E., 513

L
Laffont, J.J., 352, 513
Lagrange, 9
Lagrange form, 30
Lagrange multipliers, 69, 70, 73, 74, 82, 260
Lagrangian, 73
Lagrangian form, 70, 76, 114
Lagrangian maximum principle, 70
Lagunov, V.N., 513
Lakhani, C., 528
Lansdowne, Z.F., 246, 514
Lasdon, L.S., 514
Leban, R., 11, 335, 514
Leclair, S.R., 507
Lee, E.B., 164, 514
Lee, S.C., 252, 532
Lee, W.Y., 513
Left and right limits, 17
Legendre, 9
Legendre's conditions, 428
Lehoczky, J.P., 337, 381, 382, 510, 514, 532
Leibniz, 9
Leitmann, G., 32, 311, 385, 419, 486, 499, 500, 504, 505, 514, 520, 537
Leizarowitz, A., 483
Leland, H.E., 514
Lele, M.M., 507, 514
Lele, P.T., 285, 474
Lenclud, B., 496
Léonard, D., 11, 335, 514
Leondes, C.T., 481, 519, 543, 544
Lesourne, J., 11, 335, 383, 476, 514
Lev, B., 497
Levine, J., 515
Levinson, N.L., 487
Lewis, T.R., 515
L'Hôpital's rule, 342
Li, G., 515
Li, M., 479
Li, T., 515, 532
Lieber, Z., 213, 500, 512, 515
Lignell, J., 515
Lilien, G.L., 360, 510
Linear independence, 23
Linearly independent, 23
Linear Mayer form, 30, 280
Linear programming, 112, 113, 167, 168
Linear-quadratic case, 110
Linear-quadratic problems, 448
Line integral, 238
Lintner, J., 515
Lions, J.L., 113, 383, 476, 506, 515, 533
Lipschutz, S., 402, 537
Little, J.D.C., 515
Little-o notation, 17
Liu, J., 402, 537
Liu, P.-T., 474, 480, 488, 514–516
Liu, R.H., 477
Logarithmic Brownian Motion, 378
Long, H., 478
Long, N.V., 11, 335, 385, 396, 404, 405, 486, 490, 511, 514, 516
Long-run stationary equilibrium, 106
Loon, P.J.J.M., van, 542
Lou, H., 234, 516
Lou, S., 512, 516
Lucas, Jr., R.E., 360, 516
Luenberger, D.G., 269, 516
Luhmer, A., 494, 516
Lumped parameter systems, 460
Lundin, R.A., 213, 516
Luptacik, M., 351, 503, 516, 517, 527
Luus, R., 517
Lykina, V., 517, 525
Lynn, J.W., 291, 473

M
Macki, J., 517
Magat, W.A., 517
Magill, M.J.P., 517
Mahajan, V., 490, 510, 515, 517
Maimon, O., 11, 511, 512, 517
Maintenance and replacement model, 111, 283, 284, 290
Majumdar, M., 517
Malanowski, K., 78, 137, 517, 518, 536
Malik, T., 535
Malliaris, A.G., 383, 518
Mangasarian, O.L., 56, 79, 267–269, 518
Manh-Hung, N., 518
Mantrala, M.K., 448, 521
MAPI (Machinery and Applied Products Institute), 283
MAPLE, 185
Marginal cost, 42, 227
Marginal cost equals marginal revenue, 42
Marginal return, 438
Marginal revenue, 42
Marinelli, C., 460, 518
Markovian Stackelberg equilibrium, 396
Markus, L., 514
Martimort, D., 352, 513
Martingale problems, 383
Martín-Herrán, G., 404, 476, 507, 511, 518
Martirena-Mantel, A.M., 518
Marzano, F., 493
Massé, P., 283, 518
Mate, K., 506
Mathematica, 185, 395, 402, 406
Mathematical requirements, 1
Mathewson, F., 512
Matrix Riccati equation, 448, 450
Matsuo, H., 513
Maurer, H., x, 135, 137, 144, 455, 499, 518, 519
Maximum, 429
Maximum likelihood estimate, 443
Maximum principle, 27, 39, 40, 50, 69, 70, 73, 74, 76, 82, 96, 99, 109, 114, 118, 119, 136, 137, 259, 433
May, R.M., 474
Mayer form, 30, 433, 437
Mayne, D.Q., 32, 135, 277, 481, 519, 523, 526
McCabe, J.L., 514
McCann, J.M., 517
McEneaney, W.M., 534
McGuire, T.W., 360, 532
McIntyre, J., 519
McNicoll, G., 360, 475
McShane, E.J., 10
Measurement noise, 441
Mechanism design, 352, 356
Meech, J.A., 507
Megretski, A., 534
Mehlmann, A., 360, 361, 385, 490, 494, 503, 519
Mehra, R.K., 360, 519
Mehrez, A., 519
Merton, R.C., 378, 520
Mesak, H.I., 520
Michel, P., 488, 496, 520
Miele, A., 237, 514, 520
Miller, M.H., 166, 520
Miller, R.E., 520
Milyutin, A.A., 132, 491
Minimax solution, 385, 386
Minimum fuel problem, 280
Minjarez-Sosa, J.A., 383, 477
Mirman, L.J., 515, 520
Mirrlees, J., 520
Miscellaneous applications, 360
Miscellany, 16
Mischenko, E.F., 10, 27, 32, 70, 96, 141, 433, 526
Misra, S., 513
Mitchell, A., 521
Mitra, T., 517
Mitter, S.K., 477, 523
Mittnik, S., 490
Mixed constraints, 69, 71, 79
Mixed inequality constraints, 3, 69, 70
Mixed optimization technique, 302
Modeling tricks, 111
Modigliani, F., 166, 191, 200, 202, 213, 506, 520
Moiseev, N.N., 521
Monahan, G.E., 521
Mond, B., 521
Mookerjee, V.S., 508
Moore, E.J., 540
Moore, J.B., 383, 442, 450, 474, 491
Morey, R.C., 517
Morton, A., 498
Morton, T.E., xiv, 213, 297, 302, 309, 516, 521, 532
Moser, E., 351, 460, 521
Moskowitz, H., 485
Motta, M., 521
Muller, E., 490, 515, 517, 521
Mulvey, J.M., 532
Munro, G.R., 311, 521
Murata, Y., 521
Murray, D.M., 277, 521
Muth, J.F., 191, 200, 202, 506
Muzicant, J., 521

N
Naert, P.A., 404, 477, 481
Nahorski, Z., 521
Naik, P.A., 404, 448, 521, 522, 526
Nash differential games, 387
Nash solutions, 385
Näslund, B., 11, 70, 113, 283, 323, 331, 477, 522
Natural resources, 311, 383
Necessary condition, 37, 39, 269
Neck, R., 522
Needle-shaped variation, 434, 435
Neighborhood, 16
Nelson, R.T., 360, 522
Nepomiastchy, P., 360, 522
Nerlove, M., 226, 228, 522
Nerlove-Arrow advertising model, 226
Nerlove-Arrow model, 110
Neuman, C.P., 226, 541
Neustadt, L.W., 10, 518, 522
Newton, 9
Nguyen, D., 522
Nishimura, K., 458, 488
Nissen, G., 477
Nonlinear programming, 259, 260, 268
Norm, 16, 17
Norström, C.J., xiv, 191, 208, 522
Norton, F.E., 448, 481
Notation, 11
Novak, A.J., 361, 460, 484, 485, 494, 495, 503, 504, 522

O
Oakland, W.H., 360, 480
Oberle, H.J., 500, 522, 523
Objective function, 2, 29, 438
Oettli, W., 481
Oğuztöreli, M.N., 523
Øksendal, B.K., 383, 480, 523
Okuguchi, K., 486
Olsder, G.J., 385, 396, 405, 475, 523
One-sector model, 338
Oniki, H., 523
Open access fishery, 313
Open-loop Nash solution, 388, 389, 405, 406
Open-loop Stackelberg solution, 396, 405
Optimal consumption of an initial investment, 115
Optimal control problem, 29
Optimal control theory, 1, 419
Optimal economic growth models, 335, 340
Optimal financing model, 111, 164, 186, 187
Optimal long-run stationary equilibrium, 108, 228
Optimal path, 29
Optimal thinning, 318
Optimal trajectory, 29
Order of the constraint, 130
Oren, S.S., 523
Osayimwese, I., 523
Ouardighi, F.E., 478
Ozga, S., 257, 523
Ozga model, 257

P
Paiewonsky, B., 519
Palda, K.S., 226, 523
Palokangas, T., 525
Pantoja, J.F., 277, 523
Parametric linear programming, 113
Parlar, M., 499, 523, 524
Parrish, B., 473
Parsons, L.J., 500
Partial fractions, 372
Pasin, F., 491
Path of least time, 9
Pauwels, W., 524
Pekelman, D., 213, 283, 302, 524
Pepyne, D.L., 277, 524
Perera, S., 478
Perrakis, S., 188, 524
Pesch, H.J., 10, 513, 524
Pessimal solution, 182
Peterson, D.W., 78, 524, 525
Peterson, F.M., 525
Peterson, R.A., 517
Petrosjan, L., 491
Petrov, Iu.P., 525
Phase diagram, 340, 349
Phelps, E.S., 499
Pickenhain, S., 517, 518, 525
Pierskalla, W.P., 283, 525
Pindyck, R.S., 324, 383, 489, 525, 541
Pitchford, J.D., 496, 516, 525
Plail, M., 10, 524
Pliska, S.R., 379, 501
Pohjola, M., 525
Polak, E., 135, 276, 482, 519, 526
Pollution control model, 346, 351
Polyanin, A.D., 410, 526
Pontryagin, L.S., 10, 27, 32, 70, 96, 141, 433, 526
Powell, S.G., 523
Prasad, A., 235, 253, 392, 396, 404, 475, 504, 505, 513, 522, 526, 535, 545
Predator-prey relationships, 311
Prescott, E.C., 513
Presman, E., 511, 526
Price elasticity of demand, 227
Price shield, 214–216
Principal-agent framework, vii
Principle of optimality, 33, 367, 380
Production function, 338
Production-inventory model, 4, 191, 192, 274
Production planning model, 191, 365
Production smoothing, 192
Product rule for differentiation, 16
Proth, J.-M., 213, 477, 485
Prskawetz, A., 495, 513, 527
Pulsing, 494, 495, 500, 515, 517, 520, 521
Pulsing policy, 235, 251
Pure constraints, 151
Pure state variable inequality constraints, 3, 125, 129
Pytlak, R., 135, 527

Q
Quasiconcave function, 79
Quasiconvex function, 79

R
Rajagopalan, S., 515
Raman, K., 383, 527
Rampazzo, F., 521
Ramsey, F.P., 85, 335, 362, 527
Rank of a matrix, 23
Rao, A.G., 504
Rao, R.C., 404, 527
Rapoport, A., 527
Rapp, B., 283, 527
Rausser, G.C., 383, 524, 527
Raviv, A., 360, 527
Ravn, H.F., 269, 521, 527
Ray, A., 528
Reachable set, 3, 72, 89, 114
Reeves, C.M., 496
Regional allocation of investment, 64
Reinganum, J.F., 528
Rekhi, I., 485
Rempala, R., 213, 528
Ren, Q., 545
Richard, S.F., 487, 528
Rincón-Zapatero, J.P., 518
Ringbeck, J., 361, 528
Ripper, M., 360, 514
Rishel, R.W., 366, 367, 442, 496, 528
Roberts, S.M., 39, 528
Robinett, R.D., 277, 490
Robinson, B., 528
Robson, A.J., 529
Rockafellar, R.T., 529
Roxin, E.O., 480, 482, 515
Royal, A., 383, 477, 478
Rozovsky, B., 478
Rubel, O., 518
Russak, B., 529
Russell, D.L., 529
Rüstem, B., 506
Ruusunen, J., 392, 500
Ryu, Y., x

S
Saddle point, 11, 22, 23, 260, 387, 457, 492, 529
Sage, A.P., 462, 529
Salukvadze, M.E., 529
Salvage value, 3, 29, 113
Samaratunga, C., 529
Samuelson, P.A., 529
Sargent, T.J., 516
Sarma, V.V.S., 291, 309, 473, 529
Sasieni, M., 529
Sat function, 19
Savin, S., 460, 518
Sawyer, A., 448, 521
Scalzo, R.C., 529
Schaefer, M.B., 312, 529
Schijndel, G.-J.C.Th., van, 360, 542
Schilling, K., 529
Schmalensee, R., 515
Scholes, M., 378, 480
Schubert, U., 351, 517
Schultz, R.L., 500
Schwartz, N.L., 11, 290, 293, 296, 335, 360, 510
Schwodiauer, G., 540
Scott, A.D., 311, 521
Second-order differential equations, 382
Second-order linear equations with constant coefficients, 410
Second-order variations, 428, 452
Seeger, A., 525
Segers, R., 491
Seidl, A., 351, 360, 458, 460, 484, 485, 521, 530, 544
Seidman, T.I., 460, 530
Seierstad, A., 11, 32, 53, 70, 73, 77–79, 104, 136, 149, 335, 530
Selten, R., 385, 530
Semmler, W., 489, 490
Sen, S.K., 488, 510
Sengupta, J.K., 540
Separation principle, 451, 452
Sethi, S.P., 11, 32, 54, 95, 131, 132, 135–137, 140, 141, 149, 159, 164, 166, 187, 191, 192, 213, 221, 225, 231, 235–237, 242, 246, 248, 251–253, 256, 257, 277, 281, 283, 289, 291, 293, 297, 302, 309, 314, 315, 321, 323, 324, 336, 337, 339, 343, 346, 351, 360, 361, 371, 375, 376, 380–383, 385, 388, 392, 396, 397, 404, 458, 460, 462, 463, 474–482, 485–490, 495, 503–505, 507, 508, 510–516, 522, 524, 526, 528–535, 539, 540, 545
Sethi-Morton model, 297, 309
Sethi-Skiba Points, vii, 109, 315, 441, 458–460, 464
Shadow price, 10, 40, 261
Shani, U., 361, 535
Shapiro, A., 535
Shapiro, C., 144, 535
Sharomi, O., 535
Shell, K., 335, 484, 535
Shi, R., 383, 477, 478
Shipman, J.S., 39, 528
Shreve, S.E., 337, 366, 367, 381–383, 479, 510
Shtub, A., 512
Siebert, H., 536
Silva, G.N., 536
Simaan, M., 536
Simon, H.A., 191, 200, 202, 452, 506, 536
Simon, L.S., 506
Simple cash balance problem, 159, 160, 187
Simplest variational problem, 420
Singh, M.G., 531, 536
Singhal, J., 536
Singhal, K., 536
Singhal, V., 498
Singular arcs, 454
Singular control, 48, 49, 110, 162, 163, 167, 176, 177, 345, 454, 455
Skiba, A.K., 458, 464, 536
Skorohod, A.V., 375, 499
Smith, B.L.R., 507
Smith, M., 507
Smith, R.L., 475
Smith, V.K., 510
Smith, V.L., 536
Snower, D.J., 536
Sole-owner fishery resource model, 111, 312
Soliman, M.A., 360, 539
Solow, R.M., 324, 536
Soner, H.M., 383, 496, 514
Sorger, G., x, 11, 297, 335, 385, 396, 404, 405, 481, 485, 490, 494, 505, 506, 516, 532, 533, 536, 537
Sothmann, B., 523
Southwick, L., 360, 537
Special topics, 441
Spence, M., 346, 511, 537
Speyer, J.L., 507
Spiegel, M.R., 402, 414, 537
Spremann, K., 537
Sprzeuzkouski, A.Y., 537
Spulber, P.F., 515, 520
Srinivasan, V., 226, 537
Sriskandarajah, C., x
Staats, P.W., 343, 533
Stackelberg differential games, 385, 396, 404, 475, 478
Stalford, H., 514, 537
Standard adjoint variables, 80
Standard Hamiltonian, 80
Standard Lagrangian, 80
Standard multipliers, 80
Starr, A.W., 385, 388, 537
Starting correction, 196
State equation, 28
State trajectory, 2, 28
State variable, 2, 28
State vector, 28
Static efficiency condition, 337
Stein, R.B., 523
Steinberg, R., 491
Steindl, A., 460, 494, 516, 537
Steiner, P.O., 491
Stepan, A., 537
Stern, L.E., 537
Stiglitz, J.E., 538
Stirling numbers of the first kind, 416
Stirling numbers of the second kind, 415
Stochastic advertising problem, 375
Stochastic calculus, 367, 444
Stochastic manufacturing problems, 383
Stochastic optimal control, 365, 366
Stochastic production inventory model, 370
Stockout cost, 5
Stoer, J., 481, 538
Stopping time, 380
Stoppler, S., 11, 488, 489, 538
Strauss, A., 517
Streitferdt, L., 542
Strengthened Jacobi condition, 429
Strengthened Legendre-Clebsch condition, 454
Strengthened Legendre condition, 429
Strictly concave function, 21, 79
Strictly convex function, 79
Strong forecast horizon, 213, 216, 219
Strong maximum, 430
Subsidy rate, 397
Sufficiency conditions, 53, 54, 79, 136, 269
Sulem, A., 538
Summary of transversality conditions, 89
Suo, W., x, 526, 534, 535
Surveys of applications, 10
Sutinen, J.G., 488, 516
Swan, G.W., 343, 538
Sweeney, D.J., 473, 538
Sweeney, J.L., 480, 505, 521
Switching curves, 99
Switching point, 171, 175, 178
Switching time, 102
Sydsæter, K., 11, 32, 53, 70, 73, 77–79, 104, 136, 149, 335, 530, 538
Synthesis of optimal controls, 97, 170
System noise, 442
Szego, G.P., 535

T
Taboubi, S., 404, 476, 518
Takayama, A., 42, 335, 491, 538
Taksar, M.I., 512, 514, 533–535
Tan, K.C., 538
Tapiero, C.S., 11, 297, 309, 360, 383, 477, 491, 533, 538, 539
Taraysev, A., 525
Taylor, J.G., 360, 539
Teichroew, D., 11, 487
Teng, J.-T., 539, 540
Teo, K.L., 135, 462, 473, 507, 540
Terborgh, G., 283, 540
Terminal conditions, 38, 74, 86
Terminal inequality constraints, 72
Terminal time, 4, 29, 71, 75, 86, 89, 98, 113
Thépot, J., 360, 515, 540
Thisse, J., 507
Thompson, G.L., 54, 159, 191, 205, 213, 253, 274, 283, 291, 302, 309, 360, 371, 388, 404, 460, 462, 463, 474, 488, 489, 498, 506, 512, 533, 534, 539, 540
Tidball, M., 540
Tihomirov, V.M., 507
Time-optimal control problem, 96, 97
Tintner, G., 540
Titli, A., 536
Tolwinski, B., 385, 505, 540
Total contribution, 41
Tou, J.T., 452
Toussaint, S., 541
TPBVP, 39, 40, 57, 58, 60, 64, 67, 221, 338
Tracz, G.S., 541
Tragler, G., 11, 70, 109, 122, 360, 458, 460, 475, 484, 495, 499, 500, 541, 545
Transition matrix, 436
Transversality conditions, 38, 75, 77, 86, 88, 89, 91, 99, 104, 105, 116, 121
Transversality conditions: special cases, 86
Treadway, A.B., 360, 541
Troch, I., 541
Tsachev, T., 495
Tsur, Y., 361, 535
Tsurumi, H., 252, 541
Tsurumi, Y., 252, 541
Tu, P.N.V., 11, 360, 541
Tuominen, M.P.T., 515
Turner, R.E., 226, 541
Turnovsky, S.J., 496, 516, 525, 541
Turnpike, 108, 196, 228, 248
Two-person zero-sum games, 386
Two-point boundary value problem, 39, 40, 57, 64, 413
Two-reservoir system, 151
Tzafestas, S.G., 462, 493, 541

U
Udayabhanu, V., 534
Uhler, R.S., 541
Utility of consumption, 7, 335, 336, 377

V
Vaisanen, U., 317, 318, 511
Valentine, F.A., 10, 541
Value function, 33, 367
Van Hilten, O., 11, 335, 542
Van Loon, P.J.J.M., 11, 335, 542
Vanthienen, L., 213, 542
Varaiya, P.P., 360, 385, 442, 513, 514, 542
Variational equations, 435
Veinott, A.F., 474
Veliov, V.M., 462, 463, 488, 494, 495, 542
Venezia, I., 539
Verheyen, P.A., 360, 542
Verification theorem, 369, 396
Verma, B., 507
Vickson, R.G., 32, 131, 132, 135–137, 140, 141, 149, 325, 477, 497, 503, 523, 524, 542, 546
Vidal, R.V.V., 521
Vidale, M.L., 226, 235, 236, 542
Vidale-Wolfe advertising model, 111, 235, 375
Vidyasagar, M., 109, 542
Vilcassim, N.J., 404, 486
Villas-Boas, J.M., 542
Vincent, T.L., 311, 499
Vinokurov, V.R., 543
Vinter, R.B., 41, 132, 135, 143, 479, 480, 496, 527, 536, 543
Voelker, J.A., 283, 525
Vossen, G., 455, 519
Vousden, N., 360, 516, 543

W
Wagener, F.O.O., 458, 511, 543
Wagner, H.M., 191, 297, 543
Wagner, M., 517, 525
Wagner-Whitin framework, 301
Wagner-Whitin solution, 306
Wan, F.Y., 536
Wan, Jr., H.Y., 487
Wang, C.-S., 360, 492
Wang, M., 478
Wang, P.K.C., 543
Wang, W.-K., 543
Warehousing constraint, 214
Warga, J., 543
Warnecke, H.J., 543
Warschat, J., 543
Weak forecast horizon, 213, 215
Weak maximum, 429
Weber, T.A., 335, 544
Weierstrass, 10
Weierstrass-Erdmann corner conditions, 428
Weierstrass necessary condition, 430, 432
Weinstein, M.C., 324, 544
Weitz, B., 527
Weizsäcker, C.C. von, 543
Welam, U.P., 544
Well, K.H., 500
Wensley, R., 527
Westphal, L.C., 544
Wheat trading model with no short-selling, 208
Whitin, T.M., 191, 297, 543
Whittle, P., 544
Wickwire, K., 11, 343, 544
Wiegand, M., 144, 519
Wiener, N., 544
Wiener process, 367, 378
Wind, Y., 490, 510, 517, 539
Wirl, F., 484, 489, 494, 495, 504, 544
Wolfe, H.B., 226, 235, 236, 542
Wong, K.H., 135, 540
Wonham, W.M., 544
Wright, C., 351, 544
Wright, S.J., 277, 544
Wrzaczek, S., 360, 463, 513, 530, 544
Wunderlich, H.J., 543

X
Xepapadeas, A., 351, 545

Y
Yakowitz, S.J., 277, 521
Yan, H., x, 535, 545
Yang, J., 545
Yang, T.H., 135, 526
Yatsenko, Y., 474
Yeh, D.H.M., 508, 535
Yeung, D.K., 491
Yin, G., 383, 526, 532, 534, 545
Young, L.C., 419, 439, 545

Z
Zabczyk, J., 528
Zaccour, G., 404, 480, 507, 509, 518, 537, 540, 545
Zaitsev, V.F., 410, 526
Zalkin, J.H., 525
Zarrop, M.B., 506
Zeckhauser, R.J., 324, 346, 511, 544
Zeidan, V., 545
Zeiler, I., 458, 545
Zelikin, M.I., 545
Zemel, A., 361, 535
Zemel, E., 521
Zhang, H., x, 383, 478, 526, 534, 535
Zhang, J., 352, 487
Zhang, Q., x, 383, 516, 526, 532–535, 545
Zhang, R., 504, 535, 545
Zhou, J., 504
Zhou, X., 529, 534, 535, 545
Ziemba, W.T., 532, 546
Zimin, I.N., 360, 546
Ziólko, M., 546
Zionts, S., 360, 534, 537
Zoltners, A.A., 508, 546
Zowe, J., 144, 513
Zuckermann, D., 539
Zwillinger, D., 197, 410, 546